WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.
Architecture - Migration (2)
Summary
Migration is the process of moving data and metadata within one cluster (one namespace), as well as to and from external non-Lustre storage servers. HSM, free-space balancing, and file restriping are examples of migration.
Definitions
- Agent
- actually copies objects
- Feed
- object enumeration
- Feed Generator
- produces object enumeration for migration agent
- Coordinator
- manages migrations and the running migration agents
- Initiator
- initiates a migration, issues a migration request
Use cases
ID | Quality Attribute | Summary |
---|---|---|
simple migration | usability, performance | a simple migration within one namespace |
duplicate requests are merged | performance | duplicate migration requests are processed as one request |
conflicting requests abort in-progress migration | performance | when an object is in the process of migrating to HSM, any access to the object aborts the migration |
coherent access to moving objects | usability | moving object continues to be accessible to clients |
propagate punch and trunc to source, as well as llog on sink | performance | don't copy truncated data |
recovery | availability | restore moving object state after a server crash |
single namespace at all times | usability | moving and migrated objects are in the same namespace |
scalability | scalability | more servers - faster migration |
reconnect with dirty cache after server migration has completed (FLDB) | usability | dirty cache reintegration when the objects are moved already |
support partial file reclamations | performance, usability | file can't fit fully in cache |
cache full / master full management (grants?) | usability | we don't want destination server to run out of space in the middle of migration |
IO optimization | performance | agents will somehow create large IOs |
Quality Attribute Scenarios
Simple migration
Scenario: | simple data and md migration within one namespace | |
Business Goals: | advanced control over Lustre object placement | |
Relevant QA's: | usability, performance | |
details | Stimulus: | a migration request |
Stimulus source: | administrator, lustre control utilities, client | |
Environment: | a Lustre cluster | |
Artifact: | migration coordinator | |
Response: | An administrator issues a migration request to a coordinator. The coordinator starts or wakes up one or more migration agents and asks them to move the Lustre objects with the given IDs. The migration agents perform the actual migration. | |
Response measure: | successful migration, achieving good performance by moving objects in parallel | |
Questions: | ||
Issues: |
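The response above can be sketched as follows. This is a minimal illustration, not Lustre code: the `Agent` and `Coordinator` classes, the dictionary-backed "stores", and the round-robin assignment are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


class Agent:
    """Hypothetical migration agent: moves one object between two stores."""

    def __init__(self, source, target):
        self.source = source  # dict: object ID -> data (stands in for the source server)
        self.target = target  # dict: object ID -> data (stands in for the target server)

    def move(self, object_id):
        # Copy to the target first, then drop the source copy.
        self.target[object_id] = self.source[object_id]
        del self.source[object_id]
        return object_id


class Coordinator:
    """Hypothetical coordinator: hands object IDs to agents and runs them in parallel."""

    def __init__(self, agents):
        self.agents = agents

    def migrate(self, object_ids):
        # Round-robin assignment is an assumption; the source does not say
        # how objects are distributed among agents.
        with ThreadPoolExecutor(max_workers=len(self.agents)) as pool:
            futures = [
                pool.submit(self.agents[i % len(self.agents)].move, oid)
                for i, oid in enumerate(object_ids)
            ]
            return sorted(f.result() for f in futures)


source = {1: b"a", 2: b"b", 3: b"c"}
target = {}
coordinator = Coordinator([Agent(source, target) for _ in range(2)])
moved = coordinator.migrate([1, 2, 3])
```

After the call, every requested ID is present in `target` and absent from `source`, which matches the "successful migration" response measure. Parallelism here comes only from running several agents at once.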
duplicate requests are merged
Scenario: | a second migration request is issued while the object is being migrated | |
Business Goals: | handle duplicated requests efficiently | |
Relevant QA's: | performance | |
details | Stimulus: | a migration request |
Stimulus source: | client, administrator, a control utility | |
Environment: | an object being migrated | |
Artifact: | coordinator | |
Response: | the coordinator detects the duplicate requests and executes them as one | |
Response measure: | successful execution of both requests, with no duplicate requests disturbing each other | |
Questions: | ||
Issues: |
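One way a coordinator could merge duplicates is to keep a table of in-progress migrations keyed by object ID. The sketch below is illustrative only; `MergingCoordinator` and its methods are hypothetical names, and it assumes two requests are duplicates when they name the same object.

```python
class MergingCoordinator:
    """Hypothetical coordinator that merges requests for an object already in flight."""

    def __init__(self):
        self.in_progress = {}  # object ID -> list of requesters waiting on it
        self.started = []      # object IDs for which a migration was actually started

    def request(self, object_id, requester):
        """Return True if a new migration was started, False if the request was merged."""
        waiters = self.in_progress.get(object_id)
        if waiters is not None:
            # Duplicate: attach the requester to the running migration.
            waiters.append(requester)
            return False
        self.in_progress[object_id] = [requester]
        self.started.append(object_id)
        return True

    def complete(self, object_id):
        """Finish the migration and return everyone who should be notified."""
        return self.in_progress.pop(object_id)


coordinator = MergingCoordinator()
coordinator.request(7, "administrator")   # starts a migration
coordinator.request(7, "client")          # merged into the running one
notified = coordinator.complete(7)        # both requesters are satisfied at once
```

Both requesters are reported as complete from a single migration, so the object is copied only once.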
conflicting requests abort in-progress migration
Scenario: | archiving to tape is aborted if a conflicting request is issued | |
Business Goals: | eliminate useless archive operations | |
Relevant QA's: | performance | |
details | Stimulus: | client's file access (for the HSM case) |
Stimulus source: | client application | |
Environment: | HSM, a file is being archived to a tape, someone wants to write to the file | |
Artifact: | coordinator | |
Response: | coordinator aborts the archiving operation | |
Response measure: | archiving op is aborted | |
Questions: | ||
Issues: |
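The abort path can be sketched with a cancellation flag that the archiving job checks between chunks. All names here (`ArchiveJob`, `HsmCoordinator`, `on_write`) are hypothetical, and the chunk-at-a-time check is an assumption about where an abort could take effect.

```python
import threading


class ArchiveJob:
    """Hypothetical archive job: copies chunks to 'tape' until cancelled."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.cancelled = threading.Event()
        self.archived = []  # stands in for the tape

    def run(self):
        """Return True if the whole file was archived, False if aborted."""
        for chunk in self.chunks:
            if self.cancelled.is_set():
                return False
            self.archived.append(chunk)
        return True


class HsmCoordinator:
    """Hypothetical coordinator: aborts archiving when a client writes to the file."""

    def __init__(self):
        self.jobs = {}  # file ID -> running ArchiveJob

    def start_archive(self, file_id, chunks):
        job = ArchiveJob(chunks)
        self.jobs[file_id] = job
        return job

    def on_write(self, file_id):
        """Called on a conflicting access; returns True if an archive was aborted."""
        job = self.jobs.pop(file_id, None)
        if job is None:
            return False
        job.cancelled.set()
        return True


coordinator = HsmCoordinator()
job = coordinator.start_archive("file-a", [b"chunk0", b"chunk1"])
coordinator.on_write("file-a")  # the client write arrives before archiving runs
finished = job.run()            # the job sees the flag and stops
```

A write that arrives when no archive is running is simply ignored by the coordinator in this sketch.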
coherent access to moving objects
Scenario: | transparent access to objects being migrated | |
Business Goals: | concurrent access to the moving objects | |
Relevant QA's: | performance, scalability | |
details | Stimulus: | file access |
Stimulus source: | client application | |
Environment: | an object being moved, a client application accesses the object | |
Artifact: | moving object | |
Response: | The migration agents move the objects in chunks, protecting the chunks currently being moved with exclusive locks and redirecting application requests to the appropriate data location (source or target) | |
Response measure: | the client is able to access an object being moved; client access isn't blocked for the duration of the object migration. | |
Questions: | always redirect to target? | |
Issues: |
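The chunked move with per-chunk locking and redirection can be sketched as below. This is an assumption-laden model, not the Lustre locking protocol: `MovingObject` is a hypothetical class, each chunk gets an ordinary mutex in place of an exclusive extent lock, and reads are redirected to whichever side currently holds the chunk (the open question above, whether to always redirect to the target, is not settled by this sketch).

```python
import threading


class MovingObject:
    """Hypothetical object that stays readable while it is migrated chunk by chunk."""

    def __init__(self, data, chunk_size):
        self.chunk_size = chunk_size
        self.source = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
        self.target = [None] * len(self.source)
        # One lock per chunk: only the chunk being moved is unavailable.
        self.locks = [threading.Lock() for _ in self.source]

    def migrate_chunk(self, index):
        # Hold the chunk's lock exclusively while it changes location.
        with self.locks[index]:
            self.target[index] = self.source[index]
            self.source[index] = None

    def read_byte(self, offset):
        index = offset // self.chunk_size
        with self.locks[index]:
            # Redirect to the target if the chunk has moved, else use the source.
            chunk = self.target[index]
            if chunk is None:
                chunk = self.source[index]
            position = offset % self.chunk_size
            return chunk[position:position + 1]


obj = MovingObject(b"abcdefgh", chunk_size=4)
before = obj.read_byte(5)  # served from the source
obj.migrate_chunk(1)
after = obj.read_byte(5)   # served from the target, same data
```

A reader waits only for the chunk that is mid-copy; chunks that have not moved yet, or have already moved, are served immediately.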
Implementation details
- IMP1 all IO involving Lustre servers uses client API (exploit existing locking and sync LLITE infrastructure)
- avoid reimplementing client
- IMP2 separate problem into coordinators and agents
- agents have datamover plugins
- IMP3: pull model if target is Lustre OSD (run agent on sink)
- IMP4: if target is Lustre OSD, send all requests to the target (block client requests until they can be filled on the target). The source OSD must redirect
- let the MDT maintain the redirection, in line with flash cache
- call back the layout (stripe descriptor): initiating a migration sends blocking ASTs to all clients holding any locks on the file
- 3 phase: old layout, dual layout, final layout
- IMP5: need a lock bit for layouts
- means we need to drop the client lock when we flush inode
- IMP6: creation of target object results in llog entry with old and new EA (SOM-style recovery)
- IMP7: record (llog) and execute trunc/punch on tgt, propagate to source
- IMP8: a bit (or extent log) on the MDT (master copy) indicating whether the copy or the tape is current ("can I reclaim this space?"); one llog of extents indicating which files are on tape (for fast space reclamation). Similar to WBC on clients
- IMP9: Use commit CB on EA with tape FID to tell the tape "not an orphan" (i.e. "we're counting on this tape fid")
- IMP10: MDS inode objects never change fids
- IMP11: locks a migrator might take automatically interact with locks other clients may take
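The three layout phases named under IMP4 (old layout, dual layout, final layout) can be modeled as a small state machine. This is a sketch under assumptions: `FileLayout`, `Phase`, and the method names are hypothetical, the list of revoked clients merely stands in for sending blocking ASTs, and routing all I/O to the new layout during the dual phase follows the "send all requests to target" note.

```python
from enum import Enum


class Phase(Enum):
    OLD = "old layout"
    DUAL = "dual layout"
    FINAL = "final layout"


class FileLayout:
    """Hypothetical per-file layout record, as the MDT might track it."""

    def __init__(self, old_layout):
        self.phase = Phase.OLD
        self.old = old_layout
        self.new = None
        self.revoked = []  # clients whose locks were called back

    def begin_migration(self, new_layout, clients_with_locks):
        if self.phase is not Phase.OLD:
            raise RuntimeError("migration already started")
        # Stand-in for blocking ASTs: every lock holder must refetch the layout.
        self.revoked = list(clients_with_locks)
        self.new = new_layout
        self.phase = Phase.DUAL

    def finish_migration(self):
        if self.phase is not Phase.DUAL:
            raise RuntimeError("no migration in progress")
        self.old = None
        self.phase = Phase.FINAL

    def io_target(self):
        # Before migration, I/O goes to the old layout; afterwards, to the new one.
        return self.old if self.phase is Phase.OLD else self.new


layout = FileLayout(old_layout="ost0")
layout.begin_migration(new_layout="ost1", clients_with_locks=["client-a", "client-b"])
layout.finish_migration()
```

Keeping both layouts during the dual phase is what lets a source redirect requests to the target until the old layout is dropped.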