Architecture - Sub Tree Locks

Summary
A subtree lock (STL) is a lock on a directory that protects the entire namespace, or a part of it, rooted at that directory. Subtree locks are intended to be optimal for workloads in which clients work in isolated directories, and not to make things worse for highly contended workloads, where they fall back to the current client-server locking protocol.

Definitions

 * STL : subtree lock
 * strong STL : an STL which invalidates all conflicting locks inside the subtree when it is acquired. This is of no practical use because of high acquisition and cancellation latencies.
 * weak STL : an STL which delays lock conflict resolution until the STL holder actually accesses (fetches into its cache) a conflicting object.
 * extra weak STL, EW STL : an optimization of the weak STL mode in which the STL holder may respond to a BAST by dropping the object from its cache.
 * path revalidation : scanning the file name components up to the root for possible conflicts with STLs.

Requirements

 * Performance: reduce lock RPC traffic for STL-locked objects.
 * Scalability: reduce the load on the DLM server and the memory consumption on servers and clients.
 * Correctness: provide a correct interaction between STLs and ordinary DLM locks.
 * Usability: be usable by other components (WBC, disconnected operations).

Strong vs Weak STL
Strong STL has two disadvantages. First, it is too strong: acquiring it immediately affects all locks beneath it and might force large caches to flush. Second, the strong STL approach requires the ability to find all conflicting locks beneath an STL. Even in the non-CMD case this looks like a resource-hungry task. That makes weak STL the primary candidate for implementation. Below in this document, "STL" or "subtree lock" means weak STL.

What does a subtree lock protect
A subtree lock on a directory explicitly protects the directory itself (both its attributes and its body). All other objects in the namespace beneath it are protected unless they are open files, hardlinked files, mount points, or objects locked by other clients.

What does a subtree lock not protect
Open files, hardlinked files, mount points and objects locked by other clients are not protected by a subtree lock. In all of these cases except mount points, the subtree lock owner has to obtain an ordinary lock on the object.
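
The two lists above can be read as a simple predicate. The following is a minimal sketch of that reading, not an implementation: the `Obj` record and its field names are hypothetical, and treating "hardlinked" as "a non-directory with a link count above one" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    """Hypothetical per-object state; field names are illustrative only."""
    is_dir: bool = False
    is_open: bool = False
    nlink: int = 1                 # link count
    is_mount_point: bool = False
    locked_by_other: bool = False  # another client holds an ordinary lock


def needs_ordinary_lock(obj):
    """True if the STL owner must take an ordinary lock on obj."""
    # Assumption: only non-directories count as hardlinked.
    hardlinked = (not obj.is_dir) and obj.nlink > 1
    return obj.is_open or hardlinked or obj.locked_by_other


def covered_by_stl(obj):
    """True if an object under the STL root is protected by the STL alone."""
    # Mount points are not covered, but they do not need an ordinary lock.
    return not (needs_ordinary_lock(obj) or obj.is_mount_point)
```

Mount points are the one case that is neither covered by the STL nor handled by taking an ordinary lock, which is why the two functions are kept separate.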

Subtree lock acquiring policy
Both the server and the client contribute to the policy.

STL locking rules

 * any lock (STL or non-STL) can be acquired only after a lookup from the fs root or after a successful path revalidation procedure.
 * when an STL holder accesses hardlinked files or objects under conflicting ordinary locks, the thread falls back to ordinary lock mode (non-STL).
 * when an STL holder's lookup operation crosses a directory held under an ordinary lock, the STL stops working under that directory and the thread should continue with ordinary locks.
 * taking a lock on a parent directory when starting from an ordinary lock, or when leaving an STL-protected area, requires revoking all conflicting STLs above. The revalidation may stop when an ordinary directory lock is met.
 * a holder of a non-STL directory lock can do lookups and take more ordinary locks under the directory.
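
The fallback rules above can be illustrated with a small sketch. It assumes a simplified model in which a lookup visits a list of path components below the STL root, and `conflicting` is a hypothetical set naming the components that are hardlinked or held under a conflicting ordinary lock. Neither name comes from the design itself.

```python
def lock_modes(path, conflicting):
    """Return (component, mode) pairs for a lookup below an STL root.

    The walk starts in "stl" mode. On the first conflicting component it
    switches to "ordinary" and stays there for everything beneath it.
    """
    modes = []
    mode = "stl"
    for name in path:
        if name in conflicting:
            mode = "ordinary"  # fallback is sticky for the rest of the walk
        modes.append((name, mode))
    return modes
```

For example, `lock_modes(["a", "b", "c"], {"b"})` keeps `a` under the STL and uses ordinary locks for `b` and `c`.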

Path revalidation
A procedure which recovers an object's full path and guarantees, at its completion, that there are no STLs above the object.
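
A minimal sketch of such an upward walk follows. It is an illustration under stated assumptions rather than the actual procedure: the maps `parent` and `stl_holder` and the set `ordinary_locked` are hypothetical stand-ins for server state, and the early stop at an ordinary directory lock follows the locking rule that revalidation may stop there. The sketch only reports the directories whose STLs would have to be revoked; it does not revoke anything.

```python
def revalidate_path(obj, parent, stl_holder, ordinary_locked, client):
    """Walk from obj up towards the root and collect conflicting STLs.

    parent:          maps each object to its parent directory (root is absent)
    stl_holder:      maps a directory to the client holding an STL on it
    ordinary_locked: set of directories held under an ordinary lock
    client:          the client on whose behalf the path is revalidated
    """
    conflicts = []
    node = parent.get(obj)
    while node is not None:
        holder = stl_holder.get(node)
        if holder is not None and holder != client:
            conflicts.append(node)  # this STL would need to be revoked
        if node in ordinary_locked:
            break  # assumption: nothing above an ordinary lock needs checking
        node = parent.get(node)
    return conflicts
```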

CMD

 * interaction between subtree locks and CMD: what happens when a subtree lock is granted on a directory whose subdirectories live on other servers.

Policy
This is about when _not_ to grant a subtree lock.

Persistent subtree lock
A persistent subtree lock is granted after commit.

Implementation details
1. An inode protected with a subtree lock (during lookup) protects all objects beneath it.
 * if you take a subtree lock on the MDT, everything underneath is now unreachable.
 * there may already be existing locks under the subtree; they cannot be expanded upwards.
 * if you have not done a getattr on an element of the subtree, there may already be a lock on it.

2. If caching under an STL hits an open file, open directory, hardlink or mount point, an ordinary lock is granted.
 * this is because once a file is open, the client has FID access and no longer needs to traverse the path, so it will not see that the file is protected by a subtree lock.

3. Any use of ".." on a client requires path revalidation. This could be a new fs method on the client, or it could be done on the server (harder on the server with CMD). We do not traverse through the STL: the client knows the FID, so we do a stat by FID, which bypasses name traversal, and so we do not see the conflict with the STL. Path revalidation (on server?) is needed.
 * this arises in a subdirectory under an STL held by a different client when doing, for example, stat(..).

4. When storage management operates by FID on directories, all subtree locks are revoked.
 * an object may be cached on a client without the server knowing it
 * or maybe migration is fine, and we just mark it dirty after we flush the subtree lock
 * layout lock bit must be protected? The client must lock the layout before using it during migration; it must update it on the MDS anyhow

???5. during migration STL cached data is "layout" invalidated (everything with a new layout must be flushed) - and data,
 * on all clients (broadcast!) (degraded performance during migration)

6. Every lookup based on an STL includes the FID of the STL root??

7. If stl1 is called back
 * flush the update cache
 * take stl(i)'s on the children of stl1: on a callback on stl1 the client requests N stl(i)'s for the children, with N < ...
 * release stl1
 * (client policy)
 * do this so that, e.g., ls -l on the parent can finish without having to flush a big proxy cache

8. Collect access statistics on the server in order to avoid subtree locks on highly contended resources.
 * If stl(i) sees cb's > x msec then no more stl(i)'s (server)

9. persistent STL is granted after commit
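
The client-side callback handling in item 7 might be sketched as below. This is only one possible reading of those notes. The function name, the action tuples and the `max_child_locks` parameter are all hypothetical; the notes leave the bound on N unspecified, so it is passed in rather than chosen here. What happens when there are too many children is also not stated; this sketch simply skips the child locks in that case.

```python
def handle_stl_callback(root, child_dirs, max_child_locks):
    """Return the ordered actions a client takes when its STL is called back.

    root:            directory on which the STL (stl1) is held
    child_dirs:      child directories of root that hold cached state
    max_child_locks: hypothetical bound on the number of child STLs requested
    """
    actions = [("flush_updates", root)]
    if len(child_dirs) < max_child_locks:
        # Move protection down one level so the cache below can be kept.
        actions += [("acquire_stl", child) for child in child_dirs]
    actions.append(("release_stl", root))
    return actions
```

The point of pushing the locks down one level is the one given in item 7: an operation on the parent, such as ls -l, can then proceed once stl1 is released, without the client flushing everything it has cached underneath.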