Lustre 2.0 Features

(Updated: Nov 2009)

The Lustre™ 2.0 release will introduce several significant new features and improved system functionality. This page describes these features and lists the benefits of upgrading to the Lustre 2.0 release family.

=Lustre 2.0.0=

The initial Lustre 2.0 release (known as 2.0.0) will offer these features:

==Changelogs==
Changelogs record events that change the filesystem namespace or file metadata. Events such as file creation, deletion, renaming, attribute changes, etc. are recorded with the target and parent file identifiers (FIDs), the name of the target, and a timestamp. These records can be used for a variety of purposes:


 * Record recent changes to feed into an archiving system.
 * Use changelog entries to exactly replicate changes in a filesystem mirror.
 * Set up "watch scripts" that take action on certain events or directories.
 * Maintain a rough audit trail (file/directory changes with timestamps, but no user information).

Changelog records are persistent (on disk) until explicitly cleared by the user. They are guaranteed to accurately reflect on-disk changes in the event of a server failure.
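A "watch script" of the kind mentioned above can be pictured as a small dispatcher that fires callbacks for selected record types, optionally restricted to one parent directory. The Python sketch below is only an illustration: the `Watcher` class, its method names, and the idea of passing the parent FID as a plain string are assumptions made for this example and are not part of Lustre.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# A handler receives the record type, the parent FID (as text), and the target name.
Handler = Callable[[str, str, str], None]


class Watcher:
    """Hypothetical dispatcher: routes changelog events to registered handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Tuple[Optional[str], Handler]]] = defaultdict(list)

    def on(self, record_type: str, handler: Handler,
           parent_fid: Optional[str] = None) -> None:
        """Register a handler for a record type, optionally for one parent FID only."""
        self._handlers[record_type].append((parent_fid, handler))

    def dispatch(self, record_type: str, parent_fid: str, name: str) -> int:
        """Call every matching handler and return how many were called."""
        fired = 0
        for wanted_parent, handler in self._handlers.get(record_type, []):
            if wanted_parent is None or wanted_parent == parent_fid:
                handler(record_type, parent_fid, name)
                fired += 1
        return fired
```

A script built this way would read changelog records in order, call `dispatch` for each one, and clear the records only after the handlers have completed.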

These are sample changelog entries:

 2 02MKDIR 4298396676 0x0 t=[0x200000405:0x15f9:0x0] p=[0x13:0x15e5a7a3:0x0] pics
 3 01CREAT 4298402264 0x0 t=[0x200000405:0x15fa:0x0] p=[0x200000405:0x15f9:0x0] chloe.jpg
 4 06UNLNK 4298404466 0x0 t=[0x200000405:0x15fa:0x0] p=[0x200000405:0x15f9:0x0] chloe.jpg
 5 07RMDIR 4298405394 0x0 t=[0x200000405:0x15f9:0x0] p=[0x13:0x15e5a7a3:0x0] pics

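The record types that appear in the sample are:

 * 01CREAT: file creation
 * 02MKDIR: directory creation
 * 06UNLNK: file deletion (unlink)
 * 07RMDIR: directory removal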

FID-to-full-pathname and pathname-to-FID functions are also included to map target and parent FIDs into the filesystem namespace.
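The sample entries follow a regular layout, so a short parser can split them into fields. The Python sketch below is illustrative only. The field names (`index`, `flags`, and so on) are assumptions inferred from the sample lines rather than names taken from Lustre, the fourth column is treated as an opaque hexadecimal flags value, and each FID is kept as a tuple of three integers without interpreting its components.

```python
import re
from typing import NamedTuple, Tuple

Fid = Tuple[int, int, int]


class ChangelogRecord(NamedTuple):
    index: int        # record number (first column)
    type_code: int    # numeric part of the type, e.g. 2 for "02MKDIR"
    type_name: str    # textual part of the type, e.g. "MKDIR"
    timestamp: int    # third column, kept as a raw integer
    flags: int        # fourth column (assumed to be a flags field)
    target_fid: Fid   # t=[...]
    parent_fid: Fid   # p=[...]
    name: str         # name of the target


_RECORD_RE = re.compile(
    r"^(?P<index>\d+)\s+"
    r"(?P<code>\d{2})(?P<type>[A-Z]+)\s+"
    r"(?P<ts>\d+)\s+"
    r"(?P<flags>0x[0-9a-fA-F]+)\s+"
    r"t=\[(?P<target>[^\]]+)\]\s+"
    r"p=\[(?P<parent>[^\]]+)\]\s+"
    r"(?P<name>.+)$"
)


def _parse_fid(text: str) -> Fid:
    """Convert '0x200000405:0x15f9:0x0' into a tuple of three integers."""
    parts = tuple(int(part, 16) for part in text.split(":"))
    if len(parts) != 3:
        raise ValueError(f"expected three FID components, got {text!r}")
    return parts  # type: ignore[return-value]


def parse_record(line: str) -> ChangelogRecord:
    """Parse one changelog line in the layout shown in the sample entries."""
    match = _RECORD_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized changelog line: {line!r}")
    return ChangelogRecord(
        index=int(match.group("index")),
        type_code=int(match.group("code")),
        type_name=match.group("type"),
        timestamp=int(match.group("ts")),
        flags=int(match.group("flags"), 16),
        target_fid=_parse_fid(match.group("target")),
        parent_fid=_parse_fid(match.group("parent")),
        name=match.group("name"),
    )
```

Mapping the parsed FIDs back to pathnames is left to the FID-to-pathname functions mentioned above; this sketch does not attempt it.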

Why should I upgrade to Lustre 2.0.0 to get it?

Changelogs offer these benefits:


 * File/directory change notification
 * Event notification
 * Filesystem replication
 * File backup policy decisions
 * Audit trail

Additional Resources

For more information about changelogs, see:


 * [[Media:Changelog-hld.pdf|Changelogs HLD]]

==Commit on Share==
The Commit on Share (COS) feature prevents missing clients from causing cascading evictions of other clients. If some clients miss the recovery window, remaining clients are not evicted.

When an MDS starts up and enters recovery mode after a failover or service restart, clients begin to reconnect and replay their uncommitted transactions. If one or more clients miss the recovery window, other clients may have to abort their transactions or may be evicted. The transactions of evicted clients cannot be applied and are aborted. This starts a cascade: transactions that depend on the aborted ones fail, and then transactions that depend on those fail in turn. COS addresses this problem by eliminating dependent transactions. With no dependent, uncommitted transactions to apply, the clients replay their requests independently without the risk of being evicted.
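The cascade described above can be made concrete with a toy model. The Python sketch below is not Lustre code: the `Txn` type and the `lost_transactions` function are hypothetical names, and each transaction is reduced to the client that issued it plus the set of uncommitted transactions it depends on. The function computes which transactions cannot be replayed when some clients miss the recovery window.

```python
from typing import Dict, FrozenSet, NamedTuple, Set


class Txn(NamedTuple):
    client: str                  # client that issued the transaction
    depends_on: FrozenSet[int]   # ids of uncommitted transactions it depends on


def lost_transactions(txns: Dict[int, Txn], missing_clients: Set[str]) -> Set[int]:
    """Return ids of transactions that cannot be replayed.

    A transaction is lost if its client missed recovery, or if it depends
    (directly or transitively) on a lost transaction.
    """
    lost = {tid for tid, txn in txns.items() if txn.client in missing_clients}
    changed = True
    while changed:  # repeat until no further transaction becomes lost
        changed = False
        for tid, txn in txns.items():
            if tid not in lost and txn.depends_on & lost:
                lost.add(tid)
                changed = True
    return lost


# Without COS: B's transaction depends on A's, and C's depends on B's.
without_cos = {
    1: Txn("A", frozenset()),
    2: Txn("B", frozenset({1})),
    3: Txn("C", frozenset({2})),
}

# With COS (modelled here by removing every uncommitted dependency).
with_cos = {tid: Txn(txn.client, frozenset()) for tid, txn in without_cos.items()}
```

In this model, if client A misses recovery, `lost_transactions(without_cos, {"A"})` contains all three transactions, while `lost_transactions(with_cos, {"A"})` contains only A's own transaction, so B and C replay independently.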

Why should I upgrade to Lustre 2.0.0 to get it?

COS offers these benefits:


 * Allows clients to recover regardless of whether other clients have failed.
 * Reduces recovery problems when multiple node failures occur.

Additional Resources

For more information on COS, see:


 * [[Media:COS_HLD.pdf|COS HLD]]
 * [[Media:COS_TestPlan.pdf|COS Test Plan]]

==lustre_rsync==
The lustre_rsync feature provides namespace and data replication to an external (remote) backup system without having to scan the file system for inode changes and modification times. Lustre metadata changelogs are used to record file system changes and determine which directory and file operations to execute on the replicated system. The lustre_rsync feature differs from existing backup/replication/synchronization systems because it avoids full file system scans, which can be unreasonably time-consuming for very large file systems. Also, an interrupted lustre_rsync process can be resumed from where it left off, so the replicated file system is fully synchronized when the operation completes. Replication with lustre_rsync may be bi-directional for distinct directories.

The replicated system may be another Lustre file system or any other file system. The replica is an exact copy of the namespace of the original file system at a given point in time. However, the replicated file system is not a snapshot of the source file system in that its contents may differ from the original file system's contents. On the replicated file system, a file's contents will be the data in the file at the time the file transfer occurred.
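The general idea of changelog-driven, resumable replication can be sketched in a few lines of Python. This is a simplified illustration and not the lustre_rsync implementation: the `replay` function is hypothetical, it handles only the four record types shown in the changelog sample (MKDIR, CREAT, UNLNK, RMDIR), it replicates the namespace but not file data, and it assumes each record has already been resolved to a path relative to the replica root.

```python
import os
from typing import Iterable, Tuple

# (record index, record type, path relative to the replica root)
Record = Tuple[int, str, str]


def replay(records: Iterable[Record], replica_root: str, last_applied: int = 0) -> int:
    """Apply namespace changes to the replica, skipping records already applied.

    Returns the index of the last record applied, which a caller can store
    and pass back in to resume after an interruption.
    """
    for index, rec_type, rel_path in records:
        if index <= last_applied:
            continue  # already replicated in an earlier run
        path = os.path.join(replica_root, rel_path)
        if rec_type == "MKDIR":
            os.makedirs(path, exist_ok=True)
        elif rec_type == "CREAT":
            parent = os.path.dirname(path)
            if parent:
                os.makedirs(parent, exist_ok=True)
            open(path, "a").close()  # create an empty file if it is missing
        elif rec_type == "UNLNK":
            if os.path.lexists(path):
                os.remove(path)
        elif rec_type == "RMDIR":
            if os.path.isdir(path):
                os.rmdir(path)
        else:
            raise ValueError(f"unsupported record type: {rec_type}")
        last_applied = index
    return last_applied
```

Because every operation tolerates a target that is already in the desired state, and records at or below `last_applied` are skipped, running the function again after an aborted attempt does not corrupt the replica.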

Why should I upgrade to Lustre 2.0.0 to get it?

Lustre_rsync offers these benefits:


 * Namespace-coherent duplication of large file systems without scanning the complete file system
 * Functionality is safe when run repeatedly or run after an aborted attempt
 * Synchronization facility to switch the role of source and target file systems
 * In the case of recovery, the feature provides for reverse replication

Additional Resources

For more information on replication, see:


 * Lustre_rsync topic for the Lustre 2.0 manual
 * Architecture Page - Replication
 * [[Media:Lrepl.txt|lreplicate man page]]