WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Architecture - Migration (1)

Migration of objects from one object device to another is a tricky issue. The complications seem to arise from multiple separate issues:

  1. the location of the object changes
  2. while the migration operation is in progress, other systems may try to access the object
  3. while the migration operation is in progress, systems may crash and recovery needs to take place.

Migration can apply to single objects, but frequently files will be migrated from one pool to another, involving re-striping. This page contains considerations for migration architecture.

Simple Space Balancing Migration

  1. A subset of the full migration architecture, Simple Space Balancing Migration, will be implemented as a stepping stone toward full object migration. With Simple Migration, objects cannot be modified while they are undergoing migration. If a file is opened for write during migration, the migration is aborted and the partially migrated objects are destroyed.
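
The abort-on-write rule can be illustrated with a minimal sketch. This is a toy model, not Lustre code: the class `SimpleMigration`, its methods, and the chunked in-memory copy are all hypothetical names chosen for illustration. It only shows the behavior described above, where a write open during migration cancels the migration and discards the partial copy, while read opens are assumed to leave it undisturbed.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleMigration:
    """Toy model of one object being migrated chunk by chunk (hypothetical)."""
    source: bytes
    chunk_size: int = 4
    target: bytearray = field(default_factory=bytearray)
    aborted: bool = False

    @property
    def done(self) -> bool:
        """True once the whole source has been copied and nothing aborted."""
        return not self.aborted and len(self.target) == len(self.source)

    def copy_next_chunk(self) -> bool:
        """Copy one chunk; return True while more work remains."""
        if self.aborted or self.done:
            return False
        start = len(self.target)
        self.target += self.source[start:start + self.chunk_size]
        return not self.done

    def on_open(self, for_write: bool) -> None:
        """A write open during migration aborts it and drops the partial copy."""
        if for_write and not self.done:
            self.aborted = True
            self.target.clear()  # destroy the partially migrated object


m = SimpleMigration(source=b"0123456789")
m.copy_next_chunk()          # first chunk copied
m.on_open(for_write=False)   # assumption: readers do not disturb the migration
m.on_open(for_write=True)    # a writer appears: abort and clean up
```

Once the migration has completed, a later write open has nothing to abort, which is why `on_open` checks `done` first.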

HSM Migration

A second important use case is migration for HSM; see HSM Migration.

Considerations

  1. sometimes all stripe objects should move simultaneously, e.g. when you are re-striping a file
  2. when a file is large, a facility is needed to do I/O while the file is moving (for metadata, we can probably live without that).
    1. I think that migration should start with the migration coordinator taking an exclusive lock on the whole file, to flush caches and prevent I/O.
    2. Then it should inform all OSTs that the objects are moving. Giving them referral information, as you propose, is a way to avoid a new synchronization mechanism for stripe EAs. This is of course failure-prone in an awkward way: if we execute half of this transaction and the power goes off, we cannot release the lock on the file.
    3. While the file is moving, each object should maintain a cursor of what has migrated and what has not.
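
The per-object cursor could be sketched as follows. This is a toy model under stated assumptions, not a description of any Lustre implementation: the class `MigratingObject`, the OST names, and in particular the rule that I/O below the cursor is referred to the new OST while I/O above it stays on the old one are illustrative guesses at how the cursor and the referral information might be combined.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MigratingObject:
    """Toy model of one stripe object with a migration cursor (hypothetical)."""
    size: int          # object length in bytes
    old_ost: str       # where the object lives before migration
    new_ost: str       # referral target handed to the old OST
    cursor: int = 0    # bytes [0, cursor) have already been migrated

    def advance(self, nbytes: int) -> None:
        """Record that the next nbytes have been copied to the new OST."""
        self.cursor = min(self.size, self.cursor + nbytes)

    def locate(self, offset: int, length: int) -> list[tuple[str, int, int]]:
        """Split an I/O range into (ost, offset, length) pieces around the cursor."""
        end = min(self.size, offset + length)
        pieces = []
        if offset < min(end, self.cursor):
            # Assumption: already-migrated data is served by the new OST.
            pieces.append((self.new_ost, offset, min(end, self.cursor) - offset))
        if end > max(offset, self.cursor):
            # Not yet migrated: still served by the old OST.
            start = max(offset, self.cursor)
            pieces.append((self.old_ost, start, end - start))
        return pieces


obj = MigratingObject(size=100, old_ost="OST0000", new_ost="OST0001")
obj.advance(16)
split = obj.locate(10, 20)   # a range that straddles the cursor
```

A range that straddles the cursor is split into two pieces, one per OST. The sketch ignores locking and crash recovery, which the considerations above identify as the hard part.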