Architecture - Commit on Share

Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.

Summary
Commit-on-Share is intended to improve recoverability in environments where clients miss the reconnect window.

Definitions

 * Dependent transactions: Two transactions are dependent if the second one cannot be executed until the first one is executed.
 * Isolation: defines the level at which we consider transactions dependent:
   * per-object -- all changes to the same object are considered dependent,
   * fine-grained -- some changes to the same object are considered independent.
 * Uncommitted object: an object whose changes are cached but not yet committed to disk.
 * Dependency resolution: removing a request from the replay queue by committing it to persistent storage.
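
The two isolation levels defined above can be illustrated with a minimal sketch. It is not Lustre code: `struct txn`, the `fields` bitmask, and `txn_dependent()` are hypothetical names introduced only for illustration. It also assumes that "fine-grained" means two changes to the same object are independent when they touch disjoint parts of it, which this page does not specify.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical isolation levels matching the definitions above. */
enum isolation {
	ISO_PER_OBJECT,		/* any two changes to one object are dependent */
	ISO_FINE_GRAINED	/* only overlapping changes are dependent */
};

/* Hypothetical transaction descriptor; not a Lustre structure. */
struct txn {
	uint64_t object_id;	/* object modified by the transaction */
	uint32_t fields;	/* assumed bitmask of the object parts it touches */
};

/*
 * Return true if 'second' must be treated as dependent on 'first'
 * under the given isolation level.
 */
static bool txn_dependent(const struct txn *first, const struct txn *second,
			  enum isolation iso)
{
	/* Transactions on different objects are never dependent here. */
	if (first->object_id != second->object_id)
		return false;

	/* Per-object isolation: sharing the object is enough. */
	if (iso == ISO_PER_OBJECT)
		return true;

	/* Fine-grained (assumed): dependent only if the touched parts overlap. */
	return (first->fields & second->fields) != 0;
}
```

Under per-object isolation the check is cheaper but treats more transactions as dependent, so more of them would have to be committed before being shared. The fine-grained variant reduces that at the cost of tracking what each change touches.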

Requirements

 * 1) Provide a mechanism to avoid non-recoverable requests.
 * 2) The mechanism should be optional (runtime???) so that users can choose between performance and reliability.
 * 3) No changes to the wire protocol are allowed.
 * 4) Provide compatibility with old clients.

CoS disable

 * QAS template

Memos for HLD

 * 1) Is it possible to cancel the client's lock and do the sync in parallel?

Questions

 * runtime control?