
Architecture - Userspace Servers


Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.

Summary

A userspace server is a Lustre server (OSS, MDS, MGS, ...) that runs in user space rather than in kernel space.

Definitions

DMU
the core of ZFS, capable of running in userspace
control request
a request from Lustre utilities to start, stop, or configure services
profile
a file listing the actions that set up the services associated with a given storage device

Requirements

  1. run Lustre services in userspace
  2. make most of the Lustre code platform-independent
  3. put all platform-dependent code into a few components with well-defined APIs (to improve portability)
  4. keep the same recovery model: atomic updates, executed-once semantics, clients retain uncommitted requests (sketched below)
  5. achieve performance comparable to in-kernel Lustre
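
As a rough illustration of requirement 4, the sketch below shows the client-side bookkeeping this model implies. All names are hypothetical, not actual Lustre code: each request stays on a replay list until a server reply reports a last-committed transaction number that covers it, which is what lets clients replay lost updates and still get executed-once semantics.

  /* Minimal sketch of the client side of the recovery model in
   * requirement 4; all names are hypothetical, not actual Lustre API.
   * A request stays on the replay list until the server reports a
   * last-committed transaction number that covers it. */
  #include <stdint.h>
  #include <stdlib.h>

  struct replay_req {
          struct replay_req *next;
          uint64_t           transno;  /* transno assigned by the server */
          void              *msg;      /* request kept for possible replay */
  };

  struct client_state {
          struct replay_req *replay_list;    /* uncommitted, oldest first */
          uint64_t           last_committed; /* highest committed transno */
  };

  /* Called on every reply: drop requests the server has made durable. */
  static void update_committed(struct client_state *cl, uint64_t committed)
  {
          cl->last_committed = committed;
          while (cl->replay_list != NULL &&
                 cl->replay_list->transno <= committed) {
                  struct replay_req *req = cl->replay_list;

                  cl->replay_list = req->next;
                  free(req->msg);
                  free(req);
          }
  }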

Details

The core idea is to provide an environment similar to the kernel one:

  1. a single address space
  2. an ioctl-like interface (control)
  3. an API to control threads, memory, timers, etc.

We break all components into two categories:

  1. platform-dependent: control, libcfs, OSD, lnet, the build system (?)
  2. platform-independent: everything else, including MDT, MDD, CMM, obdfilter, ldlm, llog, ptlrpc, obdclass, the utilities, etc.


[Figure: Uoss-arch2.png]

Now that the platform-dependent components are defined, we describe each of them in detail.

Decomposition

Control

We introduce a special interface that allows the utilities to communicate with the other components. Together with libcfs, this component forms the "kernel" from a Lustre service's point of view.

The kernel is started by the administrator, or by scripts, before any call to the Lustre utilities.

The kernel contains a set of threads to handle control requests.
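
For illustration, the control path might look like the following sketch, assuming a Unix-domain socket as the transport and hypothetical command names (the actual transport and command set are not specified here):

  /* Hypothetical control path of the userspace "kernel": utilities
   * connect over a Unix-domain socket and send fixed-size command
   * records, mimicking an ioctl-style interface. */
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <sys/un.h>
  #include <unistd.h>

  enum ctl_cmd { CTL_MOUNT, CTL_UMOUNT, CTL_STOP, CTL_STATS };

  struct ctl_request {
          uint32_t cmd;                 /* one of enum ctl_cmd */
          char     device[64];          /* device the command applies to */
  };

  static int ctl_listen(const char *path)
  {
          struct sockaddr_un sun = { .sun_family = AF_UNIX };
          int fd = socket(AF_UNIX, SOCK_STREAM, 0);

          if (fd < 0)
                  return -1;
          strncpy(sun.sun_path, path, sizeof(sun.sun_path) - 1);
          unlink(path);                 /* remove a stale socket, if any */
          if (bind(fd, (struct sockaddr *)&sun, sizeof(sun)) != 0 ||
              listen(fd, 8) != 0) {
                  close(fd);
                  return -1;
          }
          return fd;
  }

  /* Body of one control thread: accept a connection, read one request,
   * dispatch it to the matching service operation. */
  static void ctl_serve(int listen_fd)
  {
          for (;;) {
                  struct ctl_request req;
                  int fd = accept(listen_fd, NULL, NULL);

                  if (fd < 0)
                          continue;
                  if (read(fd, &req, sizeof(req)) == sizeof(req)) {
                          switch (req.cmd) {
                          case CTL_MOUNT:
                                  printf("mount %s\n", req.device);
                                  break;
                          case CTL_UMOUNT:
                                  printf("umount %s\n", req.device);
                                  break;
                          default:
                                  break;
                          }
                  }
                  close(fd);
          }
  }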

Use Cases

ID              | Quality Attribute | Summary
----------------|-------------------|--------
kernel start    | usability         | start the kernel component
kernel stop     | usability         | stop all running services and the kernel component
mount           | usability         | start all services associated with a given storage device
umount          | usability         | stop all services associated with a given storage device
forced umount   | availability      | stop all services associated with a given storage device, forcefully disconnecting all clients
control request | usability         | handle control requests from the utilities
stats           | usability         | access to server and storage statistics

Quality Attribute Scenarios

kernel start
Scenario: kernel start
Business Goals: allow the customer to run a Lustre server in userspace
Relevant QAs: usability
Stimulus source: administrator
Stimulus: lustre.kernel start command
Environment: no kernel is started yet
Artifact: kernel is running, control interface is set up
Response:
Response measure: a Lustre utility can talk to the control interface
Questions:
kernel stop
Scenario: kernel stop
Business Goals: allow the customer to run Lustre servers in userspace
Relevant QAs: usability
Stimulus source: administrator
Stimulus: lustre.kernel stop command
Environment: kernel is running
Artifact: no kernel is running
Response:
Response measure: no Lustre service can be running
Questions:
mount
Scenario: mount
Business Goals: allow the customer to start services on a given storage device
Relevant QAs: usability
Stimulus source: administrator
Stimulus: lustre.mount [device] command
Environment: kernel is running, device is not yet in use
Artifact: services ready to handle requests
Response: the OSD starts on the given storage device; the mountconf component reads the profile and starts all services associated with the device (see the sketch after these scenarios)
Response measure: clients can talk to the new services
Questions:
umount
Scenario: umount
Business Goals: allow the customer to stop services on a given storage device
Relevant QAs: usability
Stimulus source: administrator
Stimulus: lustre.umount [device] command
Environment: kernel is running, services are running
Artifact: services and device are no longer accessible
Response: mountconf stops all services associated with the device; the OSD stops on the device
Response measure: clients can no longer talk to these services
Questions:
forced umount
Scenario: forced umount
Business Goals:
Relevant QAs:
Stimulus source:
Stimulus:
Environment:
Artifact:
Response:
Response measure:
Questions:
control request
Scenario: control request
Business Goals:
Relevant QAs:
Stimulus source:
Stimulus:
Environment:
Artifact:
Response:
Response measure:
Questions:
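
The response in the mount scenario above could map onto code along the following lines. This is only a sketch: osd_start(), profile_read(), profile_next_service(), and service_start() are hypothetical placeholders standing in for the real OSD and mountconf components, not actual Lustre functions.

  /* Hypothetical flow behind "lustre.mount [device]": start the OSD on
   * the device, read the setup profile, and start each service listed.
   * All names are illustrative placeholders, not real Lustre API. */
  struct osd;                           /* handle for the storage backend */
  struct profile;                       /* parsed list of services to set up */

  extern struct osd     *osd_start(const char *device);
  extern struct profile *profile_read(struct osd *osd);
  extern int             profile_next_service(struct profile *p,
                                              char *name, int len);
  extern int             service_start(struct osd *osd, const char *name);

  static int lustre_mount(const char *device)
  {
          struct osd *osd = osd_start(device);
          struct profile *prof;
          char name[64];

          if (osd == NULL)
                  return -1;
          prof = profile_read(osd);       /* mountconf: read the profile */
          if (prof == NULL)
                  return -1;
          while (profile_next_service(prof, name, sizeof(name)) == 0)
                  if (service_start(osd, name))   /* e.g. MDT, OST, MGS */
                          return -1;
          return 0;                       /* clients can now connect */
  }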

libcfs

libcfs provides the other components with a platform-independent API, including functions to control threads, memory, and so on. See Libcfs for details.
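
As one illustration of the kind of API libcfs offers, a locking primitive could compile to a real spinlock in a kernel build and fall back to POSIX primitives in a userspace build (compare the spinlock use case below). The names mirror the cfs_ prefix but this is a sketch, not the actual libcfs definitions:

  /* Sketch of a platform abstraction for spinlocks: the kernel build
   * uses real spinlocks, a userspace build falls back to pthreads.
   * Illustrative only; not the actual libcfs definitions. */
  #ifdef __KERNEL__
  # include <linux/spinlock.h>
  typedef spinlock_t cfs_spinlock_t;
  # define cfs_spin_lock_init(l)  spin_lock_init(l)
  # define cfs_spin_lock(l)       spin_lock(l)
  # define cfs_spin_unlock(l)     spin_unlock(l)
  #else /* userspace */
  # include <pthread.h>
  typedef pthread_mutex_t cfs_spinlock_t;
  # define cfs_spin_lock_init(l)  pthread_mutex_init(l, NULL)
  # define cfs_spin_lock(l)       pthread_mutex_lock(l)
  # define cfs_spin_unlock(l)     pthread_mutex_unlock(l)
  #endif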

Use Cases

ID       | Quality Attribute | Summary
---------|-------------------|--------
spinlock | performance       | some platforms allow real spinlocks in userspace
swapping | performance       | protect all allocated memory from being swapped out

Quality Attribute Scenarios

OSD

The OSD provides access to persistent storage through a well-defined API. For userspace we plan to use an OSD built on top of the DMU. We consider local caching (blocks, inodes) an internal component of the OSD. See DMU OSD for details.
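
To give a flavor of the DMU API such an OSD would sit on, a transactional object write looks roughly like this. This is a sketch against the public DMU interfaces in sys/dmu.h; exact signatures vary between ZFS versions, and objset/object setup is omitted:

  /* Sketch of a transactional write as a DMU-based OSD might issue it,
   * built against the ZFS userspace library (libzpool).  Error handling
   * and objset/object management are abbreviated. */
  #include <sys/dmu.h>

  static int osd_dmu_write(objset_t *os, uint64_t object,
                           uint64_t off, uint64_t len, const void *buf)
  {
          dmu_tx_t *tx = dmu_tx_create(os);
          int rc;

          dmu_tx_hold_write(tx, object, off, (int)len); /* declare the change */
          rc = dmu_tx_assign(tx, TXG_WAIT);             /* join a txg */
          if (rc != 0) {
                  dmu_tx_abort(tx);
                  return rc;
          }
          dmu_write(os, object, off, len, buf, tx);
          dmu_tx_commit(tx);                   /* durable once the txg syncs */
          return 0;
  }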

Use Cases

ID        | Quality Attribute | Summary
----------|-------------------|--------
async IO  | performance       | use asynchronous IO where possible
0-copy IO | performance       | use zero-copy IO where possible
swapping  | performance       | local structures and the cache should be locked in memory to prevent swapping (see the sketch below)
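
For the swapping item, one POSIX mechanism is to pin the whole address space; a minimal sketch:

  /* One way to address the "swapping" item above: pin the server's
   * address space so caches and local structures never hit swap. */
  #include <stdio.h>
  #include <sys/mman.h>

  static int osd_lock_memory(void)
  {
          /* Lock everything mapped now and everything mapped later. */
          if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
                  perror("mlockall");
                  return -1;
          }
          return 0;
  }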

Quality Attribute Scenarios

lnet

Build system

Implementation details

  1. POSIX offers poor control over IO (AIO, elevator, request merging); see the sketch below
  2. POSIX offers poor control over memory management (no way for the kernel to communicate memory pressure to the process)
  3. synchronization primitives (on the majority of platforms we can't use real spinlocks in userspace)
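
To illustrate item 1: POSIX AIO lets a userspace OSD overlap IO with computation, but gives it no handle on elevator ordering or request merging. A minimal sketch of submitting one asynchronous write:

  /* Minimal POSIX AIO write: submission returns immediately; completion
   * must be polled with aio_error()/aio_return().  Note there is no way
   * to influence how the kernel orders or merges such requests. */
  #include <aio.h>
  #include <string.h>

  static int submit_async_write(int fd, void *buf, size_t len, off_t off,
                                struct aiocb *cb)
  {
          memset(cb, 0, sizeof(*cb));
          cb->aio_fildes = fd;
          cb->aio_buf    = buf;
          cb->aio_nbytes = len;
          cb->aio_offset = off;
          return aio_write(cb);  /* poll aio_error(cb) until != EINPROGRESS */
  }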
