Architecture - Migration (2)

Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.

Summary

Migration is the process of moving data and metadata within one cluster (one namespace), as well as to and from external non-Lustre storage servers. HSM, free-space balancing, and file restriping are examples of migration.

Definitions

Agent
does the actual copying of objects
Feed
an enumeration of objects to migrate
Feed Generator
produces the object enumeration for the migration agents
Coordinator
manages migrations and the running migration agents
Initiator
initiates a migration by issuing a migration request
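
The definitions above suggest a small set of cooperating components. Below is a minimal C sketch of how these roles could fit together; every name in it (mig_feed, mig_agent, mig_coordinator, coordinator_run) is hypothetical and is not part of the Lustre code.

#include <stdio.h>

/* Feed: an enumeration of the objects to migrate, produced by a feed
 * generator and consumed by the coordinator and its agents. */
struct mig_feed {
        const unsigned long long *fids;
        int                       nr;
};

/* Agent: the component that actually copies objects. */
struct mig_agent {
        int (*copy_object)(unsigned long long fid);
};

/* Coordinator: manages a migration and the agents running it. */
struct mig_coordinator {
        struct mig_agent *agents;
        int               nr_agents;
};

/* The coordinator walks the feed and hands each object to an agent. */
static int coordinator_run(struct mig_coordinator *c, const struct mig_feed *f)
{
        int i, rc;

        for (i = 0; i < f->nr; i++) {
                rc = c->agents[i % c->nr_agents].copy_object(f->fids[i]);
                if (rc != 0)
                        return rc;
        }
        return 0;
}

/* Trivial agent implementation used only for this example. */
static int dummy_copy(unsigned long long fid)
{
        printf("copying object %llu\n", fid);
        return 0;
}

int main(void)
{
        unsigned long long fids[] = { 1, 2, 3 };
        struct mig_feed feed = { .fids = fids, .nr = 3 };
        struct mig_agent agent = { .copy_object = dummy_copy };
        struct mig_coordinator coord = { .agents = &agent, .nr_agents = 1 };

        /* Initiator: issue the migration request to the coordinator. */
        return coordinator_run(&coord, &feed);
}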

Use cases

ID | Quality Attribute | Summary
simple migration | usability, performance | a simple migration within one namespace
duplicate requests are merged | performance | duplicate migration requests are processed as one request
conflicting requests abort in-progress migration | performance | when an object is in the process of migrating to HSM, any access to the object aborts the migration
coherent access to moving objects | usability | a moving object continues to be accessible to clients
propagate punch and trunc to source, as well as llog on sink | performance | don't copy truncated data
recovery | availability | restore the state of a moving object after a server crash
single namespace at all times | usability | moving and migrated objects are in the same namespace
scalability | scalability | more servers - faster migration
reconnect with dirty cache after server migration has completed (FLDB) | usability | dirty cache reintegration when the objects have already been moved
support partial file reclamations | performance, usability | a file can't fit fully in cache
cache full / master full management (grants?) | usability | we don't want the destination server to run out of space in the middle of a migration
IO optimization | performance | agents will somehow create large IOs

Quality Attribute Scenarios

Simple migration

Scenario: simple data and metadata migration within one namespace
Business Goals: advanced control over Lustre object placement
Relevant QA's: usability, performance
Stimulus: a migration request
Stimulus source: administrator, lustre control utilities, client
Environment: a Lustre cluster
Artifact: migration coordinator
Response: an administrator issues a migration request to a coordinator. The coordinator starts or wakes up one or more migration agents and asks them to move the Lustre objects with the given IDs. The migration agents do the actual migration (see the sketch after this scenario).

Response measure: successful migration, achieving good performance by moving objects in parallel
Questions:
Issues:
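
A minimal sketch of the response above, assuming a coordinator that simply splits the requested object IDs across agents and runs each agent in its own thread. The names and the pthread-based model are illustrative only.

#include <pthread.h>
#include <stdio.h>

struct agent_work {
        int                       id;    /* agent number, for the example */
        const unsigned long long *fids;  /* slice of objects for this agent */
        int                       nr;
};

/* Agent body: migrate every object it was handed. */
static void *agent_run(void *arg)
{
        struct agent_work *w = arg;
        int i;

        for (i = 0; i < w->nr; i++)
                printf("agent %d migrating object %llu\n", w->id, w->fids[i]);
        return NULL;
}

int main(void)
{
        unsigned long long fids[] = { 101, 102, 103, 104 };
        struct agent_work work[2] = {
                { .id = 0, .fids = &fids[0], .nr = 2 },
                { .id = 1, .fids = &fids[2], .nr = 2 },
        };
        pthread_t tid[2];
        int i;

        /* Coordinator: wake the agents, one thread per agent ... */
        for (i = 0; i < 2; i++)
                pthread_create(&tid[i], NULL, agent_run, &work[i]);
        /* ... and wait for all of them, so objects move in parallel. */
        for (i = 0; i < 2; i++)
                pthread_join(tid[i], NULL);
        return 0;
}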

duplicate requests are merged

Scenario: a second migration request is issued while the object is being migrated
Business Goals: handle duplicated requests efficiently
Relevant QA's: performance
Stimulus: a migration request
Stimulus source: client, administrator, a control utility
Environment: an object being migrated
Artifact: coordinator
Response: the coordinator detects that the requests duplicate each other and executes them as a single migration (see the sketch after this scenario).

Response measure: successful execution of both requests; the duplicated requests do not disturb each other
Questions:
Issues:
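
One way the coordinator could merge duplicates is to keep an in-flight table keyed by FID and attach a second request for the same object to the migration that is already running. The table and function names below are assumptions made for illustration.

#include <stdbool.h>
#include <stdio.h>

#define MAX_INFLIGHT 16

struct inflight {
        unsigned long long fid;         /* object being migrated */
        int                waiters;     /* requests merged into this one */
};

static struct inflight table[MAX_INFLIGHT];
static int nr_inflight;

/* Returns true if the request was merged with one already in flight. */
static bool coordinator_submit(unsigned long long fid)
{
        int i;

        for (i = 0; i < nr_inflight; i++) {
                if (table[i].fid == fid) {
                        table[i].waiters++;     /* duplicate: merge */
                        return true;
                }
        }
        if (nr_inflight >= MAX_INFLIGHT)
                return false;                   /* table full: not tracked in this sketch */
        table[nr_inflight].fid = fid;           /* new migration */
        table[nr_inflight].waiters = 1;
        nr_inflight++;
        return false;
}

int main(void)
{
        printf("merged: %d\n", coordinator_submit(42)); /* 0: new migration */
        printf("merged: %d\n", coordinator_submit(42)); /* 1: merged        */
        return 0;
}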

conflicting requests abort in-progress migration

Scenario: archiving to tape is aborted if a conflicting migration request is issued
Business Goals: eliminate useless archive operations
Relevant QA's: performance
Stimulus: a client's file access (for the HSM case)
Stimulus source: client application
Environment: HSM, a file is being archived to a tape, someone wants to write to the file
Artifact: coordinator
Response: the coordinator aborts the archiving operation (see the sketch after this scenario)
Response measure: archiving op is aborted
Questions:
Issues:
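
A sketch of the abort path, assuming the agent archives to tape chunk by chunk and checks an abort flag that the coordinator sets when a conflicting client access arrives. The flag-based mechanism and all names are illustrative, not the actual HSM code.

#include <signal.h>     /* sig_atomic_t used as a simple abort flag */
#include <stdio.h>

struct archive_op {
        unsigned long long    fid;      /* object being copied to tape */
        volatile sig_atomic_t aborted;  /* set by the coordinator */
};

/* Called by the coordinator when a conflicting client access arrives. */
static void coordinator_abort(struct archive_op *op)
{
        op->aborted = 1;
}

/* Agent loop: copy chunk by chunk, checking the abort flag in between. */
static int agent_archive(struct archive_op *op, int nr_chunks)
{
        int i;

        for (i = 0; i < nr_chunks; i++) {
                if (op->aborted) {
                        printf("archive of %llu aborted at chunk %d\n",
                               op->fid, i);
                        return -1;      /* the useless tape copy stops here */
                }
                /* ... copy chunk i to tape ... */
        }
        return 0;
}

int main(void)
{
        struct archive_op op = { .fid = 7, .aborted = 0 };

        coordinator_abort(&op);                 /* simulate a client write */
        return agent_archive(&op, 4) == 0 ? 0 : 1;
}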

coherent access to moving objects

Scenario: transparent access to objects being migrated
Business Goals: concurrent access to the moving objects
Relevant QA's: performance, scalability
Stimulus: file access
Stimulus source: client application
Environment: an object being moved, a client application accesses the object
Artifact: moving object
Response: the migration agent(s) move the object in chunks, protecting the chunk currently being moved with an exclusive lock and redirecting application requests to the appropriate data location (source or target); see the sketch after this scenario.
Response measure: the client is able to access an object being moved; client access isn't blocked for the duration of the object migration.

Questions: always redirect to target?
Issues:
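
A sketch of chunked migration with redirection, using a mutex as a stand-in for the exclusive lock on the chunk in flight. The locking scheme and names are simplified assumptions, not the Lustre DLM.

#include <pthread.h>
#include <stdio.h>

#define CHUNK (1 << 20)                 /* 1 MiB migration unit */

struct moving_object {
        pthread_mutex_t lock;           /* stand-in for the exclusive lock
                                         * on the chunk in flight */
        long long       moved_up_to;    /* bytes already on the target */
};

/* Agent: move one chunk under the exclusive lock. */
static void agent_move_chunk(struct moving_object *o)
{
        pthread_mutex_lock(&o->lock);
        /* ... copy [moved_up_to, moved_up_to + CHUNK) to the target ... */
        o->moved_up_to += CHUNK;
        pthread_mutex_unlock(&o->lock);
}

/* Client path: decide whether an access at 'offset' is served from the
 * source or the target, serialising with the chunk currently moving. */
static const char *redirect(struct moving_object *o, long long offset)
{
        const char *where;

        pthread_mutex_lock(&o->lock);
        where = offset < o->moved_up_to ? "target" : "source";
        pthread_mutex_unlock(&o->lock);
        return where;
}

int main(void)
{
        struct moving_object o = {
                .lock = PTHREAD_MUTEX_INITIALIZER,
                .moved_up_to = 0,
        };

        agent_move_chunk(&o);
        printf("access at 0 -> %s\n", redirect(&o, 0));               /* target */
        printf("access at 2 MiB -> %s\n", redirect(&o, 2LL * CHUNK)); /* source */
        return 0;
}

In this shape a client access is only ever blocked for the duration of one chunk copy, not for the whole object migration.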



Implementation details

  • IMP1 all IO involving Lustre servers uses client API (exploit existing locking and sync LLITE infrastructure)
    • avoid reimplementing client
  • IMP2 separate problem into coordinators and agents
    • agents have datamover plugins
  • IMP3: pull model if target is Lustre OSD (run agent on sink)
  • IMP4: if target is Lustre OSD, send all requests to target (block client requests until they can be filled on target); the source OSD must redirect
    • let the MDT maintain the redirection, in line with flash cache
    • callback layout (stripe descriptor) when we initiate migration: send blocking ASTs to all clients with any locks on the file
    • 3 phases: old layout, dual layout, final layout (see the sketch after this list)
  • IMP5: need a lock bit for layouts
    • means we need to drop the client lock when we flush inode
  • IMP6: creation of target object results in llog entry with old and new EA (SOM-style recovery)
  • IMP7: record (llog) and execute trunc/punch on tgt, propagate to source
  • IMP8: a bit (or extent log) on the MDT (master copy) indicating whether the copy or the tape is current ("can I reclaim this space?"); an llog of extents indicating which files are on tape (for fast space reclamation). Similar to WBC on clients
  • IMP9: Use commit CB on EA with tape FID to tell the tape "not an orphan" (i.e. "we're counting on this tape fid")
  • IMP10: MDS inode objects never change fids
  • IMP11: locks that a migrator might take automatically interact with locks that other clients may take
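
The three layout phases from IMP4 can be summarised as a small state machine; the enum and transition function below are illustrative only, not the actual layout code.

#include <stdio.h>

enum layout_phase {
        LAYOUT_OLD,     /* all data still on the source objects          */
        LAYOUT_DUAL,    /* migration running: requests can be redirected
                         * from source to target                         */
        LAYOUT_FINAL,   /* all data on the target; source objects freed  */
};

/* Entering each new phase would revoke the layout lock so that clients
 * holding the old stripe descriptor have to refetch it (cf. IMP5). */
static enum layout_phase layout_advance(enum layout_phase p)
{
        switch (p) {
        case LAYOUT_OLD:
                return LAYOUT_DUAL;
        case LAYOUT_DUAL:
        default:
                return LAYOUT_FINAL;
        }
}

int main(void)
{
        enum layout_phase p = LAYOUT_OLD;

        p = layout_advance(p);  /* migration starts: dual layout   */
        p = layout_advance(p);  /* migration commits: final layout */
        printf("final phase = %d\n", (int)p);
        return 0;
}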

References

bug 14698