WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Architecture - Wide Striping

Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.

Lustre needs to stripe a file over an exceptionally large number of objects in at least two use cases:

  1. Major HPC installations may have many hundreds or thousands of OSTs, and we need to be able to stripe files over all of them.
  2. Server Network Striping (SNS) will use parity declustering, which results in a very large number of objects making up the striped file.

Therefore, wide striping will be a commonly encountered case. The goal is to encode the striping information in a very compact way.

Definitions (see fid-hld)

pool
an unordered set of OSTs, used to describe the striping in a manageable way
fid seq number
part of a fully specified FID; identifies the sequence in which the object was created
fid number
part of a fully specified FID; the object id within its sequence
object version
part of a fully specified FID; the object version number
FID
the fully specified object identification structure: FID = {f-sequence, f-number, f-version}
FLDB
the FID Location DataBase, which provides the mapping from fid sequence to server (OST, MDS)

APIs required

  1. Get a consecutive set of fid sequence numbers from the FLDB
  2. Define an on-disk EA that contains a pool name and other RAID striping parameters, for use as a default directory EA
  3. Define an on-disk EA that contains:
    • a RAID type and RAID parameters
    • a starting fid sequence number
    • a count of objects over which the file may be striped
    • a sequence skip count
    • a single fid number used by this file in all specified sequences
    • the object version
    • possibly the pool from which this object was allocated (for future reference)
  4. Offsets within the file are {lov_offset, stripe_idx} = fn(file_offset, raid_type, raid_parameters)
  5. individual objects OBJ{0, ..., num_obj - 1} in the file can be located:
    • OST(stripe_idx) = FLDB(seq_start + stripe_idx*seq_skip)
    • OBJ(stripe_idx) = FID{seq_start + stripe_idx*seq_skip, fid_number, obj_version}
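
The lookups in items 4 and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lustre code: the names `FID`, `WideStripeLayout`, `ToyFLDB`, `alloc_sequences` and `lookup` are hypothetical, the FLDB is modelled as an in-memory dictionary, and the offset function is shown only for plain round-robin (RAID0) striping with an assumed `stripe_size` parameter, because the page leaves `fn` unspecified.

```python
from dataclasses import dataclass
from typing import Dict, List, NamedTuple, Tuple


class FID(NamedTuple):
    """Fully specified FID = {f-sequence, f-number, f-version}."""
    seq: int
    num: int
    ver: int


class ToyFLDB:
    """Hypothetical stand-in for the FLDB: maps a fid sequence to a server name."""

    def __init__(self) -> None:
        self._seq_to_server: Dict[int, str] = {}
        self._next_seq = 1

    def alloc_sequences(self, servers: List[str]) -> int:
        """API 1: hand out consecutive sequences, one per server; return the first."""
        start = self._next_seq
        for i, server in enumerate(servers):
            self._seq_to_server[start + i] = server
        self._next_seq += len(servers)
        return start

    def lookup(self, seq: int) -> str:
        return self._seq_to_server[seq]


@dataclass(frozen=True)
class WideStripeLayout:
    """Hypothetical in-memory form of the EA from item 3.

    The size is constant no matter how many objects the file is striped over.
    """
    raid_type: str    # only "raid0" is handled in this sketch
    stripe_size: int  # assumed RAID parameter: bytes per stripe chunk
    seq_start: int
    num_obj: int
    seq_skip: int
    fid_number: int
    obj_version: int

    def obj(self, stripe_idx: int) -> FID:
        """OBJ(stripe_idx) = FID{seq_start + stripe_idx*seq_skip, fid_number, obj_version}."""
        if not 0 <= stripe_idx < self.num_obj:
            raise IndexError(stripe_idx)
        seq = self.seq_start + stripe_idx * self.seq_skip
        return FID(seq, self.fid_number, self.obj_version)

    def ost(self, fldb: ToyFLDB, stripe_idx: int) -> str:
        """OST(stripe_idx) = FLDB(seq_start + stripe_idx*seq_skip)."""
        return fldb.lookup(self.obj(stripe_idx).seq)

    def map_offset(self, file_offset: int) -> Tuple[int, int]:
        """Return (lov_offset, stripe_idx) for round-robin striping (an assumption)."""
        if self.raid_type != "raid0":
            raise NotImplementedError(self.raid_type)
        chunk, within = divmod(file_offset, self.stripe_size)
        row, stripe_idx = divmod(chunk, self.num_obj)
        return row * self.stripe_size + within, stripe_idx
```

For example, after `start = fldb.alloc_sequences(["OST0000", "OST0001", "OST0002", "OST0003"])`, a layout built with `seq_start=start`, `num_obj=4` and `seq_skip=1` resolves any stripe index to its object FID and server from just those few integers, which is the compact encoding the design aims for.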