Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.
Summary
Commit-on-Share (CoS) is intended to improve recoverability in environments where clients miss the reconnect window.
Definitions
- Dependent transactions
  - Two transactions are dependent if the second one cannot be executed until the first one has been executed.
- Isolation
  - Defines the level at which transactions are considered dependent:
    - per-object -- all changes to the same object are considered dependent,
    - fine-grained -- some changes to the same object are considered independent.
- Uncommitted object
  - An object with cached changes that have not yet been committed to disk.
- Dependency resolution
  - Removing a request from the replay queue by committing it to persistent storage.
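The two isolation levels defined above can be illustrated with a minimal sketch. It is not taken from the Lustre code base: the `Change` record, the `part` field (a hypothetical sub-object key such as a directory entry name) and the `dependent` function are assumptions made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Isolation(Enum):
    PER_OBJECT = "per-object"
    FINE_GRAINED = "fine-grained"


@dataclass(frozen=True)
class Change:
    """One uncommitted change (hypothetical model).

    `part` is an assumed sub-object key, used only by fine-grained isolation.
    """
    object_id: int
    part: str


def dependent(first: Change, second: Change, isolation: Isolation) -> bool:
    """Return True if `second` is treated as dependent on `first`."""
    if first.object_id != second.object_id:
        return False  # changes to different objects never conflict in this model
    if isolation is Isolation.PER_OBJECT:
        return True   # every change to the same object is dependent
    # Fine-grained: only changes to the same part of the object conflict.
    return first.part == second.part
```

Under per-object isolation, two changes to object 1 are always dependent. Under the assumed fine-grained rule they are dependent only when they touch the same `part`.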
Requirements
- Provide a mechanism to avoid non-recoverable requests.
- The mechanism should be optional (possibly switchable at runtime?) so that users can choose between performance and reliability.
- No changes to the wire protocol are allowed.
- Provide compatibility for old clients.
Use Cases
ID | Quality Attribute | Summary
dependent request from same client | performance | performance shouldn't suffer with CoS enabled
independent request from different client | performance | performance shouldn't suffer with CoS enabled
dependent request from different client | availability | no dependency is allowed after request execution; it must be resolved before
set of independent requests | availability, performance | with fine-grained isolation, a request can be independent of all other requests except a particular one
commit | availability, performance | commit event
CoS enable | usability | when and how CoS can be enabled
CoS disable | usability | when and how CoS can be disabled
Quality Attribute Scenarios
Dependent request from same client
Scenario: | Dependent request from same client
Business Goals: | application performance does not drop
Relevant QA's: | performance
details |
Stimulus source: | application
Stimulus: | request modifying the file system
Environment: | the object has an uncommitted modification and the new request depends on it
Artifact: | a record for the dependency tracker
Response: | immediate execution; no dependency resolution is required
Response measure: | roughly the same performance as with CoS disabled
Questions: |
Independent request from different client
Scenario: | Independent request from different client
Business Goals: | application performance does not drop
Relevant QA's: | performance
details |
Stimulus source: | application
Stimulus: | request modifying the file system
Environment: | the object has no modifications that the new request depends on
Artifact: | a record for the dependency tracker
Response: | immediate execution; no dependency resolution is required
Response measure: | roughly the same performance as with CoS disabled
Questions: |
Dependent request from different client
Scenario: | Dependent request from different client
Business Goals: | prevent recovery failure if a client doesn't reconnect in time
Relevant QA's: | availability
details |
Stimulus source: | application
Stimulus: | request modifying the file system
Environment: | the object has an uncommitted modification and the new request depends on it
Artifact: | old dependency records are released and a new one is created
Response: | the server resolves the dependency by flushing uncommitted changes and suspends the current operation until the commit event
Response measure: | performance degrades compared to operation without CoS
Questions: |
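The difference between the same-client and different-client scenarios can be sketched as follows. This is a simplified model under stated assumptions, not the Lustre implementation: the `DependencyTracker` class, its `before_execute` method and the blocking `commit` callback are hypothetical names, and per-object isolation is assumed.

```python
class DependencyTracker:
    """Hypothetical per-object tracker of uncommitted changes."""

    def __init__(self, commit):
        self._commit = commit      # assumed callback; returns after the commit event
        self._last_writer = {}     # object_id -> client holding an uncommitted change

    def before_execute(self, client, object_id):
        """Decide how to handle a modifying request and update the records."""
        owner = self._last_writer.get(object_id)
        action = "execute"
        if owner is not None and owner != client:
            # Dependency on another client's uncommitted change:
            # resolve it before the request is executed.
            self._commit()
            self._last_writer.clear()   # a commit releases all old records
            action = "commit-then-execute"
        self._last_writer[object_id] = client   # new dependency record
        return action
```

In this model a request from the client that already owns the uncommitted change runs immediately, while a request from any other client pays for a commit first, which matches the performance cost noted in the response measure.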
Set of independent requests
Scenario: | Set of independent requests
Business Goals: | allow a few independent requests against the same object to coexist
Relevant QA's: | performance, availability
details |
Stimulus source: | application
Stimulus: | a few requests modifying the file system
Environment: | a few clients issue requests against the same object
Artifact: | a few records for the dependency tracker
Response: | immediate execution; no dependency resolution is required, but each request must be checked to confirm that it does not depend on any request in the set
Response measure: | performance doesn't degrade significantly
Questions: | is this really a requirement for the current CoS?
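The check against a set of uncommitted requests could look roughly like this. The tuple layout `(client, object_id, part)` and the notion of a `part` within an object are assumptions introduced for the example; the document does not define what fine-grained isolation keys on.

```python
def conflicts(pending, client, object_id, part):
    """Return the uncommitted requests that the new request depends on.

    `pending` is an iterable of (client, object_id, part) tuples describing
    uncommitted requests. Requests from the same client never count as
    conflicts, mirroring the same-client scenario.
    """
    return [
        p for p in pending
        if p[0] != client and p[1] == object_id and p[2] == part
    ]
```

An empty result means the request can be executed immediately. A non-empty result means the listed dependencies would have to be resolved first.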
Commit
Scenario: | Commit
Business Goals: | block dependent operations for as short a time as possible
Relevant QA's: | availability, performance
details |
Stimulus source: | underlying disk file system
Stimulus: | all previous changes are committed to storage
Environment: |
Artifact: | all dependencies become resolved
Response: | the dependency tracker gets a new "committed" border and resumes suspended operations
Response measure: | dependency tables don't grow indefinitely
Questions: |
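A minimal sketch of the commit handling described in this scenario follows. It assumes that changes are identified by monotonically increasing transaction numbers and that the commit event reports the highest committed number as the new border; `CommitTracker` and its methods are hypothetical names.

```python
class CommitTracker:
    """Hypothetical tracker that advances a "committed" border."""

    def __init__(self):
        self.committed = 0    # highest transaction number known to be on disk
        self.records = {}     # transno -> object_id for uncommitted changes
        self.suspended = []   # (waits_for_transno, resume_callback) pairs

    def record(self, transno, object_id):
        self.records[transno] = object_id

    def suspend(self, waits_for, resume):
        self.suspended.append((waits_for, resume))

    def on_commit(self, border):
        """Commit event: everything up to and including `border` is on disk."""
        self.committed = max(self.committed, border)
        # Drop resolved records so the table cannot grow indefinitely.
        self.records = {t: o for t, o in self.records.items()
                        if t > self.committed}
        ready = [r for w, r in self.suspended if w <= self.committed]
        self.suspended = [(w, r) for w, r in self.suspended
                          if w > self.committed]
        for resume in ready:
            resume()          # continue operations that waited for the commit
```

Pruning the records on every commit event is what keeps the dependency table bounded by the number of changes made since the last commit.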
CoS enable
Scenario: | CoS enable
Business Goals: | allow customers to control CoS
Relevant QA's: | usability
details |
Stimulus source: | administrator
Stimulus: | request through the control utility and/or procfs
Environment: | CoS is disabled
Artifact: | per-server flag enabling CoS
Response: | from now on, the dependency tracker checks whether each incoming operation depends on any uncommitted one
Response measure: | dependent operations are slow, but recovery during the server's reconnect window is guaranteed to succeed
Questions: | do we need this at run time? if so, we need to take care of possible races here
CoS disable
Scenario: | CoS disable
Business Goals: | allow customers to control CoS
Relevant QA's: | usability
details |
Stimulus source: | administrator
Stimulus: | request through the control utility and/or procfs
Environment: | CoS is enabled
Artifact: | per-server flag disabling CoS
Response: | from now on, the dependency tracker considers all operations independent
Response measure: | dependent operations are fast, but recovery may fail if a client does not reconnect
Questions: | do we need this at run time?
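If the flag is switchable at run time, the enable and disable scenarios reduce to a per-server switch consulted by the dependency tracker. The sketch below is one possible shape, with a lock guarding the flag; `CosSwitch` and its methods are hypothetical, and the actual control interface (utility or procfs entry) is left unspecified here, as it is in the scenarios.

```python
import threading


class CosSwitch:
    """Hypothetical per-server CoS flag that is safe to flip at run time."""

    def __init__(self, enabled=False):
        self._lock = threading.Lock()
        self._enabled = enabled

    def set(self, enabled):
        # Called on behalf of the administrator.
        with self._lock:
            self._enabled = enabled

    def must_resolve(self, has_uncommitted_dependency):
        """Return True if the request has to wait for dependency resolution."""
        with self._lock:
            # With CoS disabled, every operation is treated as independent.
            return self._enabled and has_uncommitted_dependency
```

The lock only makes reading and writing the flag atomic. It does not by itself address changes that were made while CoS was disabled and are still uncommitted when it is enabled, which is the kind of race the enable scenario asks about.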
- QAS template
Scenario: |
Business Goals: |
Relevant QA's: |
details |
Stimulus source: |
Stimulus: |
Environment: |
Artifact: |
Response: |
Response measure: |
Questions: |
Memos for HLD
- is it possible to cancel the client's lock and perform the sync in parallel?
Questions