Managing Free Space

In Lustre™ 1.6 and later, the MDT assigns file stripes to OSTs based on location (which OSS hosts the OST) and size (how much free space the OST has) to optimize file system performance. Emptier OSTs are preferentially selected for stripes, and stripes are preferentially spread across OSSs to increase network bandwidth utilization. The relative weighting of these two optimizations can be adjusted by the user.

Two stripe allocation methods are provided: round-robin and weighted. By default, the allocation method is determined by the amount of free-space imbalance on the OSTs. The weighted allocator is used when any two OSTs are imbalanced by more than 20%. Otherwise, the faster round-robin allocator is used. (The round-robin order maximizes network balancing.)
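The selection rule above can be sketched in Python. This is a minimal illustration, not Lustre code: the function name choose_allocator is hypothetical, and it assumes that imbalance is measured as the gap between the most-free and least-free OST relative to the most-free OST, which the text does not specify.

```python
def choose_allocator(free_bytes, threshold=0.20):
    """Pick an allocator name from a list of per-OST free space values.

    Assumption (not from the manual): imbalance is (max - min) / max.
    """
    most = max(free_bytes)
    least = min(free_bytes)
    if most == 0:
        # Nothing is free anywhere, so there is no imbalance to correct.
        return "round-robin"
    imbalance = (most - least) / most
    # More than 20% imbalance switches to the weighted allocator.
    return "weighted" if imbalance > threshold else "round-robin"


print(choose_allocator([100, 90, 85]))  # round-robin
print(choose_allocator([100, 70]))      # weighted
```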

Round-Robin Allocator
When OSTs have approximately the same amount of free space (within 20%), an efficient round-robin allocator is used. The round-robin allocator alternates stripes between OSTs on different OSSs. Shown below are several sample round-robin stripe orders (each letter represents one OST, and the letter identifies the OSS that hosts it):
 * 3: AAA (one 3-OST OSS)
 * 3x3: ABABAB (two 3-OST OSSs)
 * 3x4: BBABABA (one 3-OST OSS (A) and one 4-OST OSS (B))
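 * 3x5: BBABBABA (one 3-OST OSS (A) and one 5-OST OSS (B))
 * 3x5x1: BBABABABC (OSSs with 3 (A), 5 (B) and 1 (C) OSTs)
 * 3x5x2: BABABCBABC (OSSs with 3 (A), 5 (B) and 2 (C) OSTs)
 * 4x6x2: BABABCBABABC (OSSs with 4 (A), 6 (B) and 2 (C) OSTs)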
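
The sample orders above are consistent with a simple placement scheme: process the OSSs from largest to smallest, give each OSS evenly spaced target slots in an array with one slot per OST, and move forward to the next empty slot when a target is already taken. The Python sketch below illustrates that scheme. It is an assumption made for illustration, not a statement of how the allocator is implemented; the function name round_robin_order and the dictionary input are hypothetical.

```python
def round_robin_order(oss_ost_counts):
    """Return a stripe order string from a mapping of OSS letter -> OST count."""
    total = sum(oss_ost_counts.values())
    slots = [None] * total
    # Assumption: larger OSSs are placed first, ties broken by letter.
    ordered = sorted(oss_ost_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, count in ordered:
        for j in range(count):
            # Evenly spaced target slot for the j-th OST of this OSS.
            pos = j * total // count
            # If the slot is taken, probe forward (wrapping) for a free one.
            while slots[pos] is not None:
                pos = (pos + 1) % total
            slots[pos] = name
    return "".join(slots)


print(round_robin_order({"A": 3, "B": 3}))          # ABABAB
print(round_robin_order({"A": 3, "B": 4}))          # BBABABA
print(round_robin_order({"A": 4, "B": 6, "C": 2}))  # BABABCBABABC
```

Under these assumptions the sketch reproduces all seven sample orders listed above.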

Weighted Allocator
When the free space difference between the OSTs is significant (more than 20%), a weighting algorithm is used to influence OST ordering based on size and location. Note that these are weightings for a random algorithm, so the OST with the most free space is not necessarily chosen every time. On average, the weighted allocator fills the emptier OSTs faster.
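
The effect of a weighted random choice can be sketched in Python. This illustration weights only by free space and ignores location; the function name pick_ost and the free space figures are hypothetical, and the real weighting calculation is not described in this section.

```python
import random


def pick_ost(free_space, rng=random):
    """Randomly pick one OST name, weighted by its free space."""
    names = list(free_space)
    weights = [free_space[name] for name in names]
    # random.choices draws with probability proportional to each weight.
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical free space values, in arbitrary units.
free_space = {"OST1": 100, "OST2": 300}
rng = random.Random(0)  # fixed seed so repeated runs give the same draws
counts = {"OST1": 0, "OST2": 0}
for _ in range(10000):
    counts[pick_ost(free_space, rng)] += 1
print(counts)
```

Over many draws the emptier OST is picked more often, but the fuller OST is still picked some of the time.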

Adjusting the Weighting Between Free Space and Location
The weighting priority can be adjusted in the proc file /proc/fs/lustre/lov/<fsname>-mdtlov/qos_prio_free, where <fsname> is the name of the file system. The default value is 90%.

Use the following command on the MGS to change this weighting, replacing <fsname> with the name of the file system:

lctl conf_param <fsname>-MDT0000.lov.qos_prio_free=90

Increasing the value puts more weight on free space. When the free space priority is set to 100%, location is no longer used in stripe-ordering calculations and weighting is based entirely on free space.

Note: Setting the priority to 100% means that OSS distribution does not count in the weighting, but the stripe assignment is still done via a weighting. For example, if OST2 has twice as much free space as OST1, then OST2 is twice as likely to be used, but it is not guaranteed to be used.
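
The proportions in the note can be worked out exactly. The Python sketch below assumes the 100% case, where the chance of selecting an OST is simply its share of the total free space; the function name selection_probabilities and the free space figures are hypothetical.

```python
from fractions import Fraction


def selection_probabilities(free_space):
    """Return each OST's selection probability as its share of total free space."""
    total = sum(free_space.values())
    return {name: Fraction(free, total) for name, free in free_space.items()}


# OST2 has twice as much free space as OST1 (arbitrary units).
probs = selection_probabilities({"OST1": 100, "OST2": 200})
print(probs["OST1"], probs["OST2"])  # 1/3 2/3
```

OST2 is selected two times out of three on average, so OST1 is still used for one stripe in three.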

(Updated 10/09)