WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.
Architecture Descriptions
Latest revision as of 11:09, 20 January 2011
(Updated: Jan 2010)
The architecture descriptions listed below provide information about Lustre architecture and design and are intended to help users better understand the conceptual framework of the Lustre file system.
Note: These documents reflect the state of design of a Lustre feature at a particular point in time. They may contain information that is incomplete or obsolete and may not reflect the current architecture, features, and functionality of Lustre.
Adaptive Timeouts - Use Cases (Network RPC timeouts based on server and network loading)
Backup (File system backup)
Caching OSS (Caching on object storage servers)
Changelogs (Per-server logs of data or metadata changes)
Changelogs 1.6 (Used to facilitate efficient replication of large Lustre 1.6 filesystems)
Client Cleanup (Use cases, business drivers, models to consider, implementation constraints)
Clustered Metadata (Clustered metadata server capability)
Commit on Share (Better recoverability in an environment where clients miss the reconnect window)
CROW (Create On Write optimizes create performance by deferring OSS object creation)
CTDB with Lustre (Cluster implementation of the TDB database with Lustre provides a solution for Windows pCIFS)
Cuts (Technique for recovering file system metadata stored on file server clusters)
DMU OSD (An implementation of the Object Storage Device API for a Data Management Unit)
End-to-end Checksumming (Lustre network checksumming)
Epochs (Used to merge distributed data and metadata updates in a redundant cluster configuration)
External File Locking (File range lock and whole-file lock capabilities)
FIDs on OST (File identifiers used to identify objects on an object storage target)
Fileset (An efficient representation of a group of file identifiers (FIDs))
Flash Cache (Very fast read-only flash storage)
Free Space Management (Managing free space for stripe allocation)
GNS (Global namespace for a distributed file system)
HSM (Hierarchical storage management)
HSM and Cache (Reuse of components by Lustre features that involve migration of file system objects)
HSM Migration (Use cases and high-level architecture for migrating files between Lustre and an HSM system)
Interoperability fids zfs (Client, server, network, storage interoperability during migration to clusters based on file identifiers and the ZFS file system)
Interoperability 1.6 / 1.8 / 2.0 (Interoperability definitions and QAS summary)
IO system (Client I/O and server I/O request handling)
Libcfs (Portable runtime environment for process management and debugging support)
Llog over OSD (Re-implement llog API to use OSD device as backend device)
LRE Images (Provide development and training Lustre software environments based on supported environments for Lustre)
Lustre Logging API (Requirements and detailed description)
MDS striping format (Striping extended attributes, striping formats, striping APIs)
MDS-on-DMU (Metadata server on the ZFS Data Management Unit - use cases, features and functional behavior)
Metadata API (A set of methods used by the Lustre file system driver to access and manipulate metadata)
Migration (1) (Overview of development path for migration capabilities)
Migration (2) (Use cases, quality attribute scenarios, and implementation details)
MPI IO and NetCDF (Message Passing Interface I/O and Network Common Data Form libraries - Lustre ADIO driver improvements and internal optimization)
MPI LND (Link to paper Lustre Networking over MPI)
Multiple Interfaces For LNET (Use cases and configuration management for Lustre networking)
Network Request Scheduler (Requirements for network request scheduler to manage incoming RPC requests on a server)
New Metadata API (Proposal and use cases)
Open by fid (Returns a file descriptor based on a file ID - implementation choices, design description, use cases)
OSS-on-DMU (Object storage server on the ZFS Data Management Unit)
PAG (Process Authentication Groups - Use of Linux keyring, setuid in Lustre, Kerberos credential, use cases)
Pools of targets (Use cases, command line definitions of pools of OSTs, implementation constraints)
Profiling Tools for IO (Profiling system based on Ganglia)
Proxy Cache (Caching and aggregation used to reduce load on backend server and provide better throughput and latency to clients)
Punch and Extent Migration (Prototypes for punch and migrate functionality)
Punch and Extent Migration Requirements (Punch functionality use cases)
Recovery Failures (Recovery terminology, architectures, and use cases)
Request Redirection (Allows target OST to redirect client requests to other servers)
Scalable Pinger (Provides peer health information to Lustre clients and servers)
Security (Detailed description of the security architecture for Lustre)
Server Network Striping (Description of Lustre-level striping of file data over multiple object servers with redundancy)
Simple Space Balance Migration (A subset of full data migration limited to migrating files that are not currently in use)
Simplified Interoperation (Controlled server shutdown simplifies inter-operation on server upgrades)
Space Manager (Manages file system free space)
Sub Tree Locks (A lock on a directory that protects a namespace (or part of a namespace) rooted at that directory)
User Level Access (LNET userspace API driver that exports the LNET API to userspace)
User Level OSS (Functionality related to Lustre client or OSS failure)
Userspace Servers (Requirements for capability to run a Lustre server in user space)
Version Based Recovery (A recovery mechanism allowing clients to recover outside of a strict order or later in time - requirements, use cases, quality attribute scenarios, implementation details)
Wide Striping (Mechanism to encode striping information compactly to efficiently support striping of files across many devices)
Wire Level Protocol (Wire formats used by Lustre - messages, wire and record packet structures, recovery protocol)
Write Back Cache (Allows client metadata operations to be delayed and batched)
ZFS for Lustre (Architecture and requirements related to Lustre servers using the ZFS Data Management Unit)
ZFS large dnodes (Increased dnode size to allow more data in the inode)
ZFS TinyZAP (A compact ZFS Attribute Processor format that allows arbitrary values to be stored)