Architecture - Simplified Interoperation

Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.

Summary
Controlled server shutdown allows Lustre clients to minimize the state that must be recovered on reconnection. This completely eliminates some classes of inter-version recovery interactions and simplifies interoperation during server upgrades.

Requirements

 * 1) Nodes can be upgraded independently or in groups, provided any ordering required between servers and clients is followed.
 * 2) Upgrade using failover server pairs must be supported.
 * 3) Peers must be able to interoperate at different versions indefinitely.
 * 4) On startup, servers check the previous server instance's version number and whether it shut down cleanly. The server starts only if its version has not changed, or if the previous shutdown was clean, or if the administrator forces it to start. Previously connected clients may be evicted if the start is forced.
 * 5) Client operations must not be disrupted during a clean server upgrade - open files must remain open and operations that have completed on the client may not be lost.
 * 6) Clients must clean and evict all cached server state prior to the upgrade.
 * 7) Clients that cannot recover from the server upgrade (e.g. they were disconnected when the old server did a controlled shutdown, or the server did an uncontrolled shutdown) are evicted.
 * 8) During the server upgrade, all client operations other than those required to participate in clean server shutdown are blocked before any associated RPC requests are formatted.
 * 9) User processes blocked while a server upgrade is in progress may abort on signal delivery.
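
The startup rule in requirement 4 can be sketched as a small decision function. This is an illustrative sketch only, not Lustre code: the names PreviousInstance and may_start, and the idea of passing versions as strings, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PreviousInstance:
    """Hypothetical record of what the previous server instance left behind."""
    version: str
    clean_shutdown: bool


def may_start(current_version, previous, force=False):
    """Return (start_allowed, forced_start) following requirement 4.

    The server starts if its version is unchanged, or the previous
    shutdown was clean, or the administrator forces the start. A forced
    start is reported separately because previously connected clients
    may be evicted in that case.
    """
    if previous.version == current_version or previous.clean_shutdown:
        return True, False
    if force:
        return True, True
    return False, False
```

For example, a server at a new version that finds an unclean shutdown record refuses to start unless may_start is called with force=True.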

Description
At the start of a controlled shutdown, the server notifies all its clients that shutdown has started. Clients block all new request processing, other than requests required to participate in clean server shutdown. Clients clean and evict their caches and cancel all associated locks. Clients notify the server when they have processed all outstanding RPCs and are ready to reconnect. The server shuts down when all outstanding updates including dirty caches have been committed to disk.

When the server restarts or the failover server takes over, clients reconnect and perform normal recovery. This only requires open replay since new requests are blocked (i.e. no resends) and the server has committed all outstanding updates (i.e. no replays). New request processing is now unblocked and normal operation resumes.
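
The client side of this sequence can be modelled roughly as follows. This is a simplified sketch under stated assumptions, not the Lustre client: the class ShutdownAwareClient, its fields, and its method names are hypothetical, cache flushing is reduced to resetting a counter, and open replay is not modelled.

```python
class ShutdownAwareClient:
    """Toy model of a client taking part in a controlled server shutdown."""

    def __init__(self):
        self.blocked = False        # True while the server is shutting down
        self.dirty_pages = 0        # stand-in for cached dirty state
        self.locks = set()          # stand-in for locks covering the cache
        self.outstanding_rpcs = 0   # RPCs sent but not yet completed
        self.queued = []            # requests held before any RPC is formatted

    def submit(self, request, shutdown_related=False):
        """Send a request, or hold it if new processing is blocked."""
        if self.blocked and not shutdown_related:
            self.queued.append(request)
            return "queued"
        self.outstanding_rpcs += 1
        return "sent"

    def complete_rpc(self):
        self.outstanding_rpcs -= 1

    def on_shutdown_notice(self):
        """Block new requests, then clean the cache and cancel its locks."""
        self.blocked = True
        self.dirty_pages = 0
        self.locks.clear()

    def ready_to_reconnect(self):
        """True once all outstanding RPCs have been processed."""
        return self.blocked and self.outstanding_rpcs == 0

    def on_recovered(self):
        """Unblock after recovery and return the requests that were held."""
        self.blocked = False
        held, self.queued = self.queued, []
        return held
```

Because blocked requests are held before they become RPCs, nothing needs to be resent after the server returns; the held requests are simply issued for the first time.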

Implementation
Servers have a new DLM resource, the "active server lock", which a client must hold in PR (protected read) mode before it is allowed to send RPC requests. When the server starts to shut down, it enqueues an EX (exclusive) lock, which (a) delivers a BAST (blocking AST) to all existing clients and (b) prevents new clients from acquiring the lock until the server restarts or fails over.

When the client receives the BAST, it blocks all new request processing and starts to clean and evict its caches as described above. When all outstanding RPCs have completed, the client cancels its lock.

When the server has acquired its own active server lock, it cleans its caches and waits until all outstanding operations have committed. It then marks the backend filesystem(s) clean and shuts down.
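
The interaction between the clients' PR locks and the server's EX lock can be illustrated with a toy model. This is a sketch only and does not use the real Lustre DLM interfaces: the class ActiveServerLock and its methods are hypothetical, and the BAST is represented simply by returning the list of clients that would be notified.

```python
class ActiveServerLock:
    """Toy model of the active server lock (PR for clients, EX for the server)."""

    def __init__(self):
        self.pr_holders = set()
        self.ex_pending = False
        self.ex_granted = False

    def enqueue_pr(self, client):
        """A client requests PR; refused once the server has enqueued EX."""
        if self.ex_pending or self.ex_granted:
            return False
        self.pr_holders.add(client)
        return True

    def enqueue_ex(self):
        """The server starts shutdown; return the clients that receive a BAST."""
        self.ex_pending = True
        notified = sorted(self.pr_holders)
        self._try_grant()
        return notified

    def cancel_pr(self, client):
        """A client cancels its lock after its outstanding RPCs complete."""
        self.pr_holders.discard(client)
        self._try_grant()

    def _try_grant(self):
        # EX is granted only when no client still holds PR.
        if self.ex_pending and not self.pr_holders:
            self.ex_pending = False
            self.ex_granted = True
```

In this model the server knows every client has quiesced at the moment ex_granted becomes true, which is the point at which it can clean its own caches and wait for commits.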

When clients reconnect, they perform normal recovery irrespective of whether they just participated in clean server shutdown or the server has changed version. If the server has changed version, the client is evicted if it has any resends outstanding, or anything to replay other than opens.
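
The eviction rule on reconnection can be written as a single predicate. This is a minimal sketch under assumed representations: the function name evict_on_reconnect is hypothetical, and pending resends and replays are modelled as lists of operation-type strings, with "open" standing for an open replay.

```python
def evict_on_reconnect(server_version_changed, resends, replays):
    """Decide whether a reconnecting client must be evicted.

    resends: operation types the client would have to resend.
    replays: operation types the client would have to replay.
    A client is evicted only if the server changed version and the client
    has any resend outstanding, or any replay other than an open.
    """
    if not server_version_changed:
        return False
    non_open_replays = [op for op in replays if op != "open"]
    return bool(resends) or bool(non_open_replays)
```

A client that took part in the clean shutdown has nothing to resend and only opens to replay, so it passes this check even across a version change.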