WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Lustre 2.0 Features

From Obsolete Lustre Wiki
(Difference between revisions)
m (Security GSS: move to 2.x section)
 
(31 intermediate revisions by 3 users not shown)
Line 1: Line 1:
Lustre 2.0 and 2.x releases will introduce several significant new features and improved system functionality. This page provides descriptions of these features and lists the benefits offered by upgrading to the Lustre 2.0 release family. For the latest information on when Lustre 2.0 is expected to be released, see the [http://wiki.lustre.org/index.php?title=Lustre_Roadmap Lustre Roadmap].
+
<small>''(Updated: Nov 2009)''</small>
 +
__TOC__
 +
[[Lustre_2.0|Lustre 2.0]] release introduced several significant new features and improved system functionality. This page provides descriptions of these features and lists the benefits offered by upgrading to the Lustre 2.0 release family.
  
 
=Lustre 2.0.0=
 
=Lustre 2.0.0=
  
The initial Lustre 2.0 release (known as 2.0.0) will offer these features:
+
The initial [[Lustre 2.0]] release (known as 2.0.0) offers these features:
  
 
===Changelogs===
 
===Changelogs===
Line 46: Line 48:
 
|-
 
|-
 
|<small><strong>RNMTO</strong></small>|||<small>rename, final</small>
 
|<small><strong>RNMTO</strong></small>|||<small>rename, final</small>
|-
 
|<small><strong>OPEN</strong></small>|||<small>file opened for write</small>
 
|-
 
|<small><strong>CLOSE</strong></small>|||<small>file closed for write</small>
 
 
|-
 
|-
 
|<small><strong>IOCTL</strong></small>|||<small>ioctl on file or directory</small>
 
|<small><strong>IOCTL</strong></small>|||<small>ioctl on file or directory</small>
Line 78: Line 76:
 
For more information about changelogs, see:
 
For more information about changelogs, see:
  
*
+
* [http://wiki.lustre.org/manual/LustreManual20_HTML/LustreMonitoring.html#50438273_pgfId-1296751 Section 12.1: Changelogs - Lustre 2.0 manual]
*
+
 
+
 
+
 
+
  
 
===Commit on Share===
 
===Commit on Share===
  
The Commit on Share (COS) feature detects conflicts by checking for uncommitted transactions from a different client before updating an object. The transaction commitment occurs first, then the update. Uncommitted transactions have no dependencies.
+
The Commit on Share (COS) feature prevents missing clients from causing cascading evictions of other clients. If some clients miss the recovery window, remaining clients are not evicted.
 
+
What this means is that if one client is doing some operation in memory (say creating a file '''dir/b'''), it can be sure that all of the state required for '''dir''' to exist has already been committed to disk, if '''dir''' was created by another client. Dependent operations done by a single client can be asynchronous at the server (e.g. untarring a file from one client), and independent operations done by different clients (e.g. clients creating files in separate directories) can also be asynchronous, but if there are dependencies between operations from different clients then the dependent operations are synced to disk.
+
  
In conjunction with Version Based Recovery (introduced in 1.8) this allows clients to always be able to recover, regardless of whether other clients have failed.
+
When an MDS starts up and enters recovery mode after a failover or service restart, clients begin to reconnect and replay their uncommitted transactions. If one or more clients miss the recovery window, this may cause other clients to abort their transactions or be evicted. The transactions of evicted clients cannot be applied and are aborted. This causes a cascade effect as transactions dependent on the aborted ones fail and so on. COS addresses this problem by eliminating dependent transactions. With no dependent, uncommitted transactions to apply, the clients replay their requests independently without the risk of being evicted.
  
 
<big>Why should I upgrade to Lustre 2.0.0 to get it?</big>
 
<big>Why should I upgrade to Lustre 2.0.0 to get it?</big>
Line 96: Line 88:
 
COS offers these benefits:
 
COS offers these benefits:
  
* Better recovery with multiple node failures
+
* Allows clients to always be able to recover, regardless of whether other clients have failed.
* Doesn't force fully synchronous operations
+
* Reduces recovery problems when multiple node failures occur
 
+
  
 
<big>Additional Resources</big>
 
<big>Additional Resources</big>
Line 104: Line 95:
 
For more information on COS, see:
 
For more information on COS, see:
  
*
+
* [http://wiki.lustre.org/manual/LustreManual20_HTML/LustreRecovery.html#50438268_pgfId-1292073 Section 30.5: Commit on Share - Lustre 2.0 manual]
*
+
* [[Media:COS_TestPlan.pdf|COS Test Plan]]
  
===Replication===
+
===lustre_rsync===
  
The replication feature makes a (qualified) replica of a Lustre filesystem on another filesystem target. The target may be another Lustre filesystem or any other filesystem. This feature differs from existing backup/replication/synchronization systems primarily in that it is designed to avoid walking the namespace tree, which for very large filesystems becomes unreasonably time-consuming. Replication is based on server changelogs, and uses the information in those logs to determine which directory and file operations to execute on the replicated system. The replicated filesystem is an exact copy of the namespace of the original system as of a given point in time. However, the replicated filesystem is '''not''' a snapshot of the original filesystem in that its contents may differ from the contents of the original filesystem. File contents of the replica will be the contents of the file at the time the data transfer of that file took place.
+
The lustre_rsync feature provides namespace and data replication to an external (remote) backup system without having to scan the file system for inode changes and modification times. Lustre metadata changelogs are used to record file system changes and determine which directory and file operations to execute on the replicated system. The lustre_rsync feature differs from existing backup/replication/synchronization systems because it avoids full file system scans, which can be unreasonably time-consuming for very large file systems. Also, the lustre_rsync process can be resumed from where it left off, so the replicated file system is fully synchronized when operation completes. Lustre_rsync may be bi-directional for distinct directories.
 +
 
 +
The replicated system may be another Lustre file system or any other file system. The replica is an exact copy of the namespace of the original file system at a given point in time. However, the replicated file system is '''not''' a snapshot of the source file system in that its contents may differ from the original file system's contents. On the replicated file system, a file's contents will be the data in the file at the time the file transfer occurred.  
  
 
<big>Why should I upgrade to Lustre 2.0.0 to get it?</big>
 
<big>Why should I upgrade to Lustre 2.0.0 to get it?</big>
  
Replication offers this benefit:
+
Lustre_rsync offers these benefits:
  
* Namespace-coherent duplication of large filesystems without walking the filesystem.
+
* Namespace-coherent duplication of large file systems without scanning the complete file system
 +
* Functionality is safe when run repeatedly or run after an aborted attempt
 +
* Synchronization facility to switch the role of source and target file systems
 +
* In the case of recovery, the feature provides for reverse replication
  
 
<big>Additional Resources</big>
 
<big>Additional Resources</big>
Line 121: Line 117:
 
For more information on replication, see:
 
For more information on replication, see:
  
*
+
* [http://wiki.lustre.org/manual/LustreManual20_HTML/SystemConfigurationUtilities_HTML.html#50438219_pgfId-1317225 Section 36.13: Lustre_rsync - Lustre 2.0 manual]
*
+
*
+
 
+
 
+
 
+
 
+
=Lustre 2.x=
+
 
+
Lustre 2.x releases will offer these features:
+
 
+
===HSM===
+
 
+
The HSM feature provides several mechanisms to interface with an external HSM system. External components include the policy engine, and file storage, retrieval, and removal methods. The external components are expected to run in userspace. Internal components include Lustre metadata extensions, and a distributed coordinator/agent architecture to call the file storage methods. Policy engine input and feedback is expected to occur primarily through the changelog. In its initial implementation, the HSM feature uses HPSS for the external components.
+
 
+
<big>Why should I upgrade to Lustre 2.x to get it?</big>
+
 
+
HSM offers these benefits:
+
 
+
* Cost-effective filesystem expansion
+
* Potential for backup policies in the policy engine
+
 
+
<big>Additional Resources</big>
+
 
+
For more information on HSM, see:
+
 
+
*
+
*
+

Latest revision as of 18:23, 20 January 2011

(Updated: Nov 2009)


The Lustre 2.0 release introduced several significant new features and improved system functionality. This page provides descriptions of these features and lists the benefits offered by upgrading to the Lustre 2.0 release family.

Lustre 2.0.0

The initial Lustre 2.0 release (known as 2.0.0) offers these features:

Changelogs

Changelogs record events that change the filesystem namespace or file metadata. Events such as file creation, deletion, renaming, attribute changes, etc. are recorded with the target and parent file identifiers (FIDs), the name of the target, and a timestamp. These records can be used for a variety of purposes:

  • Record recent changes to feed into an archiving system.
  • Use changelog entries to exactly replicate changes in a filesystem mirror.
  • Set up "watch scripts" that take action on certain events or directories.
  • Maintain a rough audit trail (file/directory changes with timestamps, but no user information).

Changelog records are persistent (on disk) until explicitly cleared by the user. They are guaranteed to accurately reflect on-disk changes in the event of a server failure.

These are sample changelog entries:

2 02MKDIR 4298396676 0x0 t=[0x200000405:0x15f9:0x0] p=[0x13:0x15e5a7a3:0x0] pics
3 01CREAT 4298402264 0x0 t=[0x200000405:0x15fa:0x0] p=[0x200000405:0x15f9:0x0] chloe.jpg
4 06UNLNK 4298404466 0x0 t=[0x200000405:0x15fa:0x0] p=[0x200000405:0x15f9:0x0] chloe.jpg
5 07RMDIR 4298405394 0x0 t=[0x200000405:0x15f9:0x0] p=[0x13:0x15e5a7a3:0x0] pics 

The record types are:

Record Type   Description
MARK          internal recordkeeping
CREAT         regular file creation
MKDIR         directory creation
HLINK         hardlink
SLINK         softlink
MKNOD         other file creation
UNLNK         regular file removal
RMDIR         directory removal
RNMFM         rename, original
RNMTO         rename, final
IOCTL         ioctl on file or directory
TRUNC         regular file truncated
SATTR         attribute change
XATTR         extended attribute change
UNKNW         unknown op
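
To show how a watch script or archiver might consume these records, here is a minimal Python sketch that splits one changelog line into fields. The field layout (record number, two-digit code fused with the record type, timestamp, flags, target FID, parent FID, name) is inferred only from the four sample entries above, not from an official grammar, and the names `ChangelogRecord` and `parse_record` are hypothetical.

```python
import re
from dataclasses import dataclass

# Pattern inferred from the sample entries on this page (an assumption,
# not an official specification of the changelog text format).
RECORD_RE = re.compile(
    r"^(?P<recno>\d+)\s+"
    r"(?P<code>\d{2})(?P<rectype>[A-Z]+)\s+"
    r"(?P<timestamp>\d+)\s+"
    r"(?P<flags>0x[0-9a-fA-F]+)\s+"
    r"t=\[(?P<target>[^\]]+)\]\s+"
    r"p=\[(?P<parent>[^\]]+)\]\s+"
    r"(?P<name>.+?)\s*$"
)


@dataclass(frozen=True)
class ChangelogRecord:
    """One parsed changelog line (hypothetical helper type)."""
    recno: int        # record number
    code: int         # numeric prefix of the record type, e.g. 1 for 01CREAT
    rectype: str      # e.g. "CREAT", "MKDIR"
    timestamp: int
    flags: int
    target_fid: str   # text between "t=[" and "]"
    parent_fid: str   # text between "p=[" and "]"
    name: str         # name of the target within its parent


def parse_record(line: str) -> ChangelogRecord:
    """Parse one changelog line; raise ValueError if it does not match."""
    m = RECORD_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized changelog line: {line!r}")
    return ChangelogRecord(
        recno=int(m["recno"]),
        code=int(m["code"]),
        rectype=m["rectype"],
        timestamp=int(m["timestamp"]),
        flags=int(m["flags"], 16),
        target_fid=m["target"],
        parent_fid=m["parent"],
        name=m["name"],
    )
```

A watch script could call `parse_record` on each line and dispatch on `rectype`, for example reacting only to `CREAT` and `UNLNK` records under a particular parent FID.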

FID-to-full-pathname and pathname-to-FID functions are also included to map target and parent FIDs into the filesystem namespace.

Why should I upgrade to Lustre 2.0.0 to get it?

Changelogs offer these benefits:

  • File/directory change notification
  • Event notification
  • Filesystem replication
  • File backup policy decisions
  • Audit trail

Additional Resources

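For more information about changelogs, see:

  • Section 12.1: Changelogs - Lustre 2.0 manual (http://wiki.lustre.org/manual/LustreManual20_HTML/LustreMonitoring.html#50438273_pgfId-1296751)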

Commit on Share

The Commit on Share (COS) feature prevents missing clients from causing cascading evictions of other clients. If some clients miss the recovery window, remaining clients are not evicted.

When an MDS starts up and enters recovery mode after a failover or service restart, clients begin to reconnect and replay their uncommitted transactions. If one or more clients miss the recovery window, this may cause other clients to abort their transactions or be evicted. The transactions of evicted clients cannot be applied and are aborted. This causes a cascade effect as transactions dependent on the aborted ones fail and so on. COS addresses this problem by eliminating dependent transactions. With no dependent, uncommitted transactions to apply, the clients replay their requests independently without the risk of being evicted.
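
The idea of eliminating dependent, uncommitted transactions can be illustrated with a toy model. The sketch below is not Lustre's implementation; it assumes a server that remembers which client last changed each object in memory and forces a commit before a different client touches that object. The class name `ToyServer` and its methods are hypothetical, and committing everything at once is a simplification.

```python
class ToyServer:
    """Toy model of Commit on Share; an illustration, not Lustre code."""

    def __init__(self):
        # object id -> client whose change to it is not yet on disk
        self.uncommitted = {}
        # number of commits forced by cross-client dependencies
        self.forced_commits = 0

    def commit_all(self):
        """Pretend to flush every in-memory change to disk."""
        self.uncommitted.clear()

    def update(self, client, obj):
        """Apply an update from `client` to `obj`."""
        owner = self.uncommitted.get(obj)
        if owner is not None and owner != client:
            # A different client has an uncommitted change to this object.
            # Commit first, so the new update depends on nothing that
            # could be lost if that other client misses recovery.
            self.commit_all()
            self.forced_commits += 1
        # The update itself stays asynchronous (in memory) for now.
        self.uncommitted[obj] = client
```

In this model, a single client updating the same objects repeatedly never forces a commit, and neither do different clients working on different objects. Only a second client touching an object with another client's uncommitted change triggers one.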

Why should I upgrade to Lustre 2.0.0 to get it?

COS offers these benefits:

  • Allows clients to recover regardless of whether other clients have failed
  • Reduces recovery problems when multiple node failures occur

Additional Resources

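For more information on COS, see:

  • Section 30.5: Commit on Share - Lustre 2.0 manual (http://wiki.lustre.org/manual/LustreManual20_HTML/LustreRecovery.html#50438268_pgfId-1292073)
  • COS Test Plan (Media:COS_TestPlan.pdf)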

lustre_rsync

The lustre_rsync feature provides namespace and data replication to an external (remote) backup system without having to scan the file system for inode changes and modification times. Lustre metadata changelogs record file system changes, and lustre_rsync uses them to determine which directory and file operations to execute on the replicated system. The feature differs from existing backup/replication/synchronization systems because it avoids full file system scans, which can be unreasonably time-consuming for very large file systems. Also, the lustre_rsync process can be resumed from where it left off, so the replicated file system is fully synchronized when the operation completes. Lustre_rsync may be bi-directional for distinct directories.

The replicated system may be another Lustre file system or any other file system. The replica is an exact copy of the namespace of the original file system at a given point in time. However, the replicated file system is not a snapshot of the source file system in that its contents may differ from the original file system's contents. On the replicated file system, a file's contents will be the data in the file at the time the file transfer occurred.
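
To make the changelog-driven approach concrete, here is a minimal Python sketch that replays namespace records onto a target directory and can resume from the last record applied. It is not how the lustre_rsync tool is implemented or invoked. It assumes records arrive as `(recno, rectype, target_fid, parent_fid, name)` tuples, handles only four record types, creates empty files instead of copying data, and tracks FID-to-path mappings in a plain dictionary. The function name `replay` and all parameters are hypothetical.

```python
import os


def replay(records, target_root, root_fid, last_applied=0, paths=None):
    """Apply namespace records newer than `last_applied` to `target_root`.

    records: iterable of (recno, rectype, target_fid, parent_fid, name).
    paths:   dict mapping directory FIDs to paths relative to target_root;
             pass the same dict again when resuming.
    Returns the highest record number applied, to use as the next
    resume point.
    """
    if paths is None:
        paths = {root_fid: ""}
    for recno, rectype, target_fid, parent_fid, name in records:
        if recno <= last_applied:
            continue  # already replayed in an earlier run
        rel = os.path.join(paths[parent_fid], name)
        full = os.path.join(target_root, rel)
        if rectype == "MKDIR":
            os.mkdir(full)
            paths[target_fid] = rel
        elif rectype == "CREAT":
            # Data transfer is not modeled; create an empty file.
            open(full, "x").close()
        elif rectype == "UNLNK":
            os.remove(full)
        elif rectype == "RMDIR":
            os.rmdir(full)
            paths.pop(target_fid, None)
        else:
            raise ValueError(f"record type not handled in this sketch: {rectype}")
        last_applied = recno
    return last_applied
```

Because only records with a number above `last_applied` are executed, calling `replay` again with the returned value (and the same `paths` dictionary) skips work that was already done. No scan of the target or source tree is needed to find what changed.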

Why should I upgrade to Lustre 2.0.0 to get it?

Lustre_rsync offers these benefits:

  • Namespace-coherent duplication of large file systems without scanning the complete file system
  • Functionality is safe when run repeatedly or run after an aborted attempt
  • Synchronization facility to switch the role of source and target file systems
  • In the case of recovery, the feature provides for reverse replication

Additional Resources

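For more information on lustre_rsync, see:

  • Section 36.13: Lustre_rsync - Lustre 2.0 manual (http://wiki.lustre.org/manual/LustreManual20_HTML/SystemConfigurationUtilities_HTML.html#50438219_pgfId-1317225)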
