WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Architecture - Fileset


Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.



A user application (or a Lustre internal feature) may want to perform an action on a very large set of files. Such actions might include migration to slower storage, purging of old files, or replication to a proxy server. A fileset is an efficient representation of these files' identifiers (fids).

The definition of any particular fileset is left to an external agent; no search features will be included in Lustre itself (excluding Least Recently Used files, which can probably only be tracked efficiently within Lustre). Typically, searches for files with particular metadata characteristics will be done by a database that mirrors the Lustre file tree via a ChangeLog. The files matching these criteria will be added to a fileset via a Lustre fileset API.

Filesets will generally come in two flavors: arbitrary collections of files, or a full file tree. See Enumeration below.


Definitions

Fileset 
an arbitrary subset of files from within a single filesystem's namespace.
Consumer 
an entity acting on the contents of a fileset
Internal consumer 
a Lustre internal feature using a fileset (e.g. fileset client mount, maybe replicator, migrator)
External consumer 
an entity external to Lustre using a fileset. This may be limited to a user of a fileset client mount, in which case no access to any other representation of a fileset is needed. See Client Access below.
Type 1, Type 2 
fileset type, see Enumeration below


Description      Quality                 Semantics
coherence        usability               file modifications are reflected in the fileset (e.g. unlink, rename)
permanence       usability, scalability  when filesets are discarded.
synchronization  usability               the list of files in the set may change.
physiology       scalability             internal representation must be used efficiently
hashing          scalability             actions on a fileset may need to be distributed across multiple servers for scalability
modification     usability               the contents of a fileset may be modified over time to add or remove items

Use Cases

id          quality attribute       summary
compliance  usability, scalability  delete all files modified in 2002
workset     availability            the files in the fileset are available on a remote proxy server
backup      scalability             filesystem must be subdivided into manageable chunks for backup / replication

Scenario: delete all files modified in 2002
  Business Goals: Provide an API to facilitate filesystem operations based on database search output
  Relevant QA's: Usability, scalability
  Details:
    Stimulus: The fileset and requested operation are fed to the API
    Stimulus source: Compliance policy dictates removal of old files
    Environment: Database has recent FS information (from watching a ChangeLog)
    Artifact: Fileset, type 1
    Response: Lustre performs the requested operation on each of the files in the fileset
    Response measure: fileset is created, operation is completed on all elements of the fileset
  Questions: Are all operations executed from userspace on a client (external), or some directly on Lustre via an API?

Scenario: all files with the words "bunny rabbit" are replicated at a dozen remote analysis clusters
  Business Goals: Provide current access to dynamic set of files on a proxy server
  Relevant QA's: Availability
  Details:
    Stimulus: Search results are fed to the API
    Stimulus source: External search or project directory
    Environment: Database has recent FS information (e.g. from watching a ChangeLog)
    Artifact: Fileset, type 1 or type 2
    Response: Lustre creates an internal representation of the fileset and makes it available for export.
    Response measure: Fileset is created
  Questions: Is a small time lag acceptable, or must proxies / filesets be absolutely synchronous

Scenario: filesystem must be subdivided into manageable chunks for backup / replication
  Business Goals: User requires particular backup policies on particular sets of files
  Relevant QA's: Feature, Scalability
  Details:
    Stimulus: External app reads all files in a fileset
    Stimulus source: External HSM or backup application
    Environment: Client access to a limited, defined list of files
    Artifact: Fileset, type 2
    Response: All files in fileset are backed up
    Response measure: Backup time, minor filesystem load during backup
  Questions: Subdivision of migration work seems like it should be handled by migration architecture; doesn't seem to really have anything to do with filesets



Search results may be returned slowly, or new files that meet the search criteria may be added to the filesystem. In those cases, it should be possible to add items to (or remove items from) an existing fileset. The fileset should in turn notify its consumers of the change. Alternatively, some filesets may be defined to be static.


The workset case implies a fileset must be persistent across server / client reboots.


It may be desirable for a remote site to specify a fileset that should be locally proxied (i.e. pull instead of push). A fileset name is probably useful for this: for example, a client requests that fileset 'bunnyrabbit' be mirrored on local proxy servers.


Files referenced in the fileset must be coherent with the original file. E.g. if a file referenced by a fileset is moved, the fileset should reflect the new file location. If a file in a fileset is deleted, the file should disappear from the fileset. Maybe this can be achieved by having the fileset take appropriate locks on the original files.

Coherence requirements:

 - unlink
 - rename
 - move to a new directory
 - file metadata (access time, perms, owner, etc.)

Note that if changing any of the above would cause a file to no longer meet the original search criteria that generated the fileset, it is up to the search generator to (eventually) remove it from the fileset. There are two exceptions to this rule, where the file should be removed from the fileset automatically:

1. unlink
2. move of a file included by virtue of its location in a file tree to a location outside of that tree (see Enumeration below)
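
The coherence rules and their two exceptions can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `TrackedFileset`, its event methods, and the use of strings for fids and paths are all hypothetical and do not correspond to any Lustre structure or API.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedFileset:
    """Hypothetical in-memory fileset keyed by fid (not a Lustre structure)."""
    tree_roots: set = field(default_factory=set)   # directories included as whole trees
    members: dict = field(default_factory=dict)    # fid -> (current path, included_by_tree)

    def add(self, fid, path, by_tree=False):
        self.members[fid] = (path, by_tree)

    def _inside_a_tree(self, path):
        # True if the path is one of the tree roots or lies below one.
        return any(path == root or path.startswith(root.rstrip("/") + "/")
                   for root in self.tree_roots)

    def on_unlink(self, fid):
        # Exception 1: an unlinked file leaves the fileset automatically.
        self.members.pop(fid, None)

    def on_rename(self, fid, new_path):
        if fid not in self.members:
            return
        _, by_tree = self.members[fid]
        if by_tree and not self._inside_a_tree(new_path):
            # Exception 2: the file moved out of the tree that included it.
            del self.members[fid]
        else:
            # Otherwise the fileset follows the file to its new location.
            self.members[fid] = (new_path, by_tree)

    def on_setattr(self, fid):
        # Metadata changes never remove a member in this sketch; re-checking
        # the search criteria is left to the external search generator.
        pass
```

In this model a rename inside the tree only updates the recorded path, while a rename out of the tree or an unlink drops the entry. Files listed explicitly stay in the fileset wherever they move.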

Fileset as Object

Depending on the intended use, some filesets may be represented more efficiently than others, or may require different descriptors or methods. Implementing filesets as objects with variable attributes and methods may provide broad but efficient coverage of the range of uses. For example, one common type of fileset may be "a user's home directory", which could be efficiently represented as a single directory fid.
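
As a rough illustration of the object idea, the sketch below gives two representations a common interface. All names (`Fileset`, `ExplicitFileset`, `TreeFileset`, `contains`, `descriptor_size`) are hypothetical, fids are modelled as plain strings, and the caller is assumed to supply the fids of a file's ancestor directories.

```python
from abc import ABC, abstractmethod


class Fileset(ABC):
    """Hypothetical common interface; each subclass picks its own representation."""

    @abstractmethod
    def contains(self, fid, ancestor_fids):
        """Return True if the file belongs to this fileset."""

    @abstractmethod
    def descriptor_size(self):
        """Number of fids stored to describe the fileset."""


class ExplicitFileset(Fileset):
    """An arbitrary collection: every member fid is stored."""

    def __init__(self, fids):
        self.fids = frozenset(fids)

    def contains(self, fid, ancestor_fids):
        return fid in self.fids

    def descriptor_size(self):
        return len(self.fids)


class TreeFileset(Fileset):
    """A whole file tree, e.g. a home directory, stored as one directory fid."""

    def __init__(self, root_fid):
        self.root_fid = root_fid

    def contains(self, fid, ancestor_fids):
        return fid == self.root_fid or self.root_fid in ancestor_fids

    def descriptor_size(self):
        return 1
```

A consumer can call `contains` without knowing which representation it holds, while the tree variant stays a single fid no matter how many files the directory contains.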


When performing an action on large filesets or large numbers of filesets, we must be able to distribute load across multiple servers to ensure good performance. This is true for internal consumers, but perhaps this function should be offloaded to a distributed application for external consumers.

For example, 10,000 filesets are to be replicated independently. A changelog per fileset may not scale well, and instead we may need a scalable algorithm to find the results for each fileset from a global changelog.
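
One possible shape for such an algorithm is a single pass over the global changelog that fans each record out through an inverted index from fid to fileset names. The sketch below assumes explicit (enumerated) filesets and a simplified record format, a dict with a `fid` key; the function names and the record layout are hypothetical and are not the Lustre changelog format.

```python
from collections import defaultdict


def build_index(filesets):
    """Invert {fileset name: set of fids} into {fid: set of fileset names}."""
    index = defaultdict(set)
    for name, fids in filesets.items():
        for fid in fids:
            index[fid].add(name)
    return index


def route_changelog(records, index):
    """Read the global changelog once and collect the records for each fileset."""
    per_fileset = defaultdict(list)
    for record in records:
        # A record for a file in several filesets is delivered to each of them;
        # records for files in no fileset are skipped.
        for name in index.get(record["fid"], ()):
            per_fileset[name].append(record)
    return dict(per_fileset)
```

The cost of the pass depends on the length of the changelog and the number of memberships per file, not on the total number of filesets, which is the property the 10,000-fileset example calls for.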


It may be useful to have a per-fileset changelog maintained for audit or replication purposes. A fileset-specific changelog could be used to provide migration/replication-related events specific to the fileset to migration agents. The agents would then use this information e.g. to abort / commence copying a file.

However, maintaining a per-fileset changelog may not scale. At some point, it may make more sense to process a common global changelog.

Multiple Membership

A file may be part of multiple filesets. A type 2 fileset may implicitly include other type 2 filesets. Operations on a file should affect all filesets it belongs to, and vice-versa.
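
To make the implicit nesting concrete, the following sketch lists every type 2 fileset that covers a given path. It assumes filesets are identified by name and described by a single root path; `covering_filesets` and the example names are hypothetical.

```python
from pathlib import PurePosixPath


def covering_filesets(path, tree_roots):
    """Return the names of all tree filesets whose root is `path` or one of its ancestors."""
    p = PurePosixPath(path)
    lineage = {str(ancestor) for ancestor in (p, *p.parents)}
    return sorted(name for name, root in tree_roots.items()
                  if str(PurePosixPath(root)) in lineage)


# Hypothetical type 2 filesets: "project" implicitly includes "project-data".
tree_roots = {
    "project": "/proj",
    "project-data": "/proj/data",
    "home": "/home/alice",
}
affected = covering_filesets("/proj/data/run1.dat", tree_roots)
```

Here `affected` names both `project` and `project-data`, so an operation on that file would have to be reflected in both filesets.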

Fileset API

The user API for filesets should include the following functionality:

  • Start a new fileset
  • Add items to a fileset
  • Remove items from a fileset
  • Delete a fileset
  • Initiate activity of an internal consumer (e.g. migrate fileset bunny from poolA to poolB)
  • Provide client access to a fileset (see Client Access below)
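
The list above could map onto an interface along the following lines. This is an in-memory sketch only: `FilesetRegistry` and its method names are hypothetical, members are modelled as fid strings, and an internal consumer is stood in for by an ordinary callable.

```python
class FilesetRegistry:
    """Hypothetical in-memory model of the user-facing fileset operations."""

    def __init__(self):
        self._sets = {}

    def create(self, name):
        # Start a new, empty fileset.
        if name in self._sets:
            raise ValueError(f"fileset {name!r} already exists")
        self._sets[name] = set()

    def add(self, name, *fids):
        # Add items to an existing fileset.
        self._sets[name].update(fids)

    def remove(self, name, *fids):
        # Remove items; fids that are not members are ignored.
        self._sets[name].difference_update(fids)

    def delete(self, name):
        # Delete the fileset definition (the files themselves are untouched).
        del self._sets[name]

    def run_consumer(self, name, action):
        # Stand-in for an internal consumer: apply `action` to every member.
        return [action(fid) for fid in sorted(self._sets[name])]

    def members(self, name):
        # Read-only view, standing in for client access to the fileset.
        return frozenset(self._sets[name])
```

A call such as `registry.run_consumer("bunny", migrate)` would correspond to "migrate fileset bunny from poolA to poolB", with `migrate` supplied by the caller.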

Implementation Notes


Enumeration

Fileset enumeration should be handled in two ways:

  • Type 1. An explicit enumeration of files or directories. Files within directories are not included in the fileset unless explicitly listed as well.
  • Type 2. Inclusive file trees. All files / subdirectories below enumerated directories are included in the fileset.

We should make provision for both types of fileset. In fact, with some per-entry flags, we can define "mixed" filesets that include both of the above: each entry in a fileset may be type 1 (flat, i.e. a single file) or type 2 (tree). A third type, "not_included", might also be useful, to specifically exclude a particular subdirectory from a type 2 fileset.
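
A membership test for such a mixed fileset might look like the sketch below. It assumes entries are keyed by path and that the most specific (deepest) entry covering a path decides the outcome; that precedence rule, the `EntryKind` names, and `is_member` are assumptions made for illustration, not part of the design above.

```python
from enum import Enum
from pathlib import PurePosixPath


class EntryKind(Enum):
    FLAT = 1           # type 1: exactly this file or directory
    TREE = 2           # type 2: this directory and everything below it
    NOT_INCLUDED = 3   # excludes this subtree from an enclosing TREE entry


def is_member(path, entries):
    """Decide membership using the deepest entry that covers `path`."""
    p = PurePosixPath(path)
    # Walk from the path itself up towards the root.
    for candidate in (p, *p.parents):
        kind = entries.get(str(candidate))
        if kind is None:
            continue
        if kind is EntryKind.FLAT:
            if candidate == p:
                return True
            continue  # a flat entry on an ancestor does not cover its contents
        return kind is EntryKind.TREE
    return False


entries = {
    "/proj": EntryKind.TREE,
    "/proj/scratch": EntryKind.NOT_INCLUDED,
    "/etc": EntryKind.FLAT,
    "/etc/hosts": EntryKind.FLAT,
}
```

With these entries, files under `/proj` are members except those below `/proj/scratch`, and `/etc` contributes only itself and the explicitly listed `/etc/hosts`.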


Permanent fileset definitions would probably be stored on the MDT (as opposed to the MGS) for scalability and namespace-related locking.



The UI for maintaining filesets might reasonably be run through lfs, similar to pools:

  1. lfs fileset_new <fileset name>
     Define a new fileset.
  2. lfs fileset_add <fileset name> <options> <filename1> <filename2> ...
     Add the named files to the fileset; the options define whether each entry is type 1 or type 2.
  3. lfs fileset_remove <fileset name> <filename>
     Remove the named file from the fileset.
  4. lfs fileset_destroy <fileset name>
     Remove the definition of the fileset.

Client Access

For arbitrary user access to the files in a fileset, a mechanism like mount(8) seems like it would provide a clear, simple way to retrieve a fileset. (Command format might be "mount -t lustre mgs://fsname/fileset mntpt")

For type 1 filesets, a hierarchical namespace defined by the files and directories in the fileset would be constructed locally. Directories would all be read/execute-only; a client cannot add new entries into the fileset by creating files in the fileset hierarchy. Regular files would keep their normal access permissions.
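
The locally constructed namespace could be modelled as a tree built from the member paths. The sketch below assumes the fileset lists only regular files, given as a mapping from absolute path to permission bits; `build_namespace` and the node layout are hypothetical.

```python
import stat

DIR_MODE = stat.S_IFDIR | 0o555  # read/execute only: no new entries can be created


def build_namespace(files):
    """Build a nested dict of directories and files from {path: permission bits}."""
    root = {"mode": DIR_MODE, "children": {}}
    for path, perms in files.items():
        parts = [part for part in path.split("/") if part]
        node = root
        # Create (or reuse) each intermediate directory as read/execute-only.
        for name in parts[:-1]:
            node = node["children"].setdefault(
                name, {"mode": DIR_MODE, "children": {}})
        # Regular files keep their normal access permissions.
        node["children"][parts[-1]] = {"mode": stat.S_IFREG | perms}
    return root


tree = build_namespace({
    "/proj/data/run1.dat": 0o644,
    "/proj/data/run2.dat": 0o600,
    "/home/alice/notes.txt": 0o640,
})
```

Only directories that lie on the path to a member file appear in the tree, so siblings that are not in the fileset are invisible through the mount.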

For type 2 filesets, the mount point would act exactly like a subtree of the full Lustre filesystem.


bug 14168
server changelogs