WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Architecture - Proxy Cache

Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.

Definitions

server
By default, the entire distributed entity exporting a Lustre filesystem, both data and metadata (i.e. not the individual server processes of which it is composed).
proxy
An intermediate server that aggregates its clients' filesystem operations onto the backend server it exports to them.
caching proxy
A proxy that caches data and/or metadata.
NV caching proxy
Non-volatile caching proxy - i.e. a proxy that extends its cache onto local non-volatile memory (e.g. disk, flash).
disconnected operation
How the proxy operates when one or more of its component servers can no longer communicate with one or more components of its backend servers.

Summary

A proxy uses caching and aggregation to reduce the load on its backend server and to provide better throughput and latency to its clients. Volatile (in-memory) and non-volatile (flash, disk) caches may be applied in any combination (including none at all), as determined by caching policies. This gives the proxy many uses, from simple connection aggregation that limits backend server fan-out to filesystem replication over a wide-area network.
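
The effect of caching and aggregation on backend load can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Backend` and `CachingProxy` classes, their method names, and the key/value model are hypothetical and are not part of the Lustre design. The proxy keeps a volatile read cache and a write-back buffer, and sends buffered writes to the backend as a single batched request.

```python
class Backend:
    """Hypothetical stand-in for the backend server; counts requests received."""

    def __init__(self):
        self.store = {}
        self.requests = 0

    def read(self, key):
        self.requests += 1
        return self.store.get(key)

    def write_batch(self, updates):
        # One request carries many updates: this is the aggregation step.
        self.requests += 1
        self.store.update(updates)


class CachingProxy:
    """Hypothetical proxy with a volatile read cache and a write-back buffer."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}  # clean data already fetched from the backend
        self.dirty = {}  # client writes not yet sent to the backend

    def read(self, key):
        if key in self.dirty:
            return self.dirty[key]
        if key not in self.cache:
            # Cache miss: the only case in which a read reaches the backend.
            self.cache[key] = self.backend.read(key)
        return self.cache[key]

    def write(self, key, value):
        # Write-back: the backend is allowed to lag until flush() is called.
        self.dirty[key] = value

    def flush(self):
        if self.dirty:
            self.backend.write_batch(self.dirty)
            self.cache.update(self.dirty)
            self.dirty = {}


backend = Backend()
backend.store["a"] = b"old"
proxy = CachingProxy(backend)

for _ in range(100):          # 100 client reads of the same object
    proxy.read("a")
for i in range(10):           # 10 client writes
    proxy.write(f"f{i}", b"data")
proxy.flush()

print(backend.requests)       # prints 2: one read miss plus one batched write
```

In this sketch 110 client operations reach the backend as 2 requests. A real proxy would also need cache invalidation and locking to stay coherent with other clients of the backend, which the sketch deliberately omits.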

A proxy translates credentials between the client side and the backend server side. This translation implements a security policy on the proxy's clients.
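
As a rough sketch of credential translation acting as a security policy, the example below maps client-side UIDs to backend UIDs. The `CredentialMap` class, its fields, and the choice of 65534 as a "nobody" UID are assumptions for illustration only; the design notes do not specify how the translation is represented.

```python
from dataclasses import dataclass, field

NOBODY_UID = 65534  # assumed unprivileged UID (a common convention, not from the design)


@dataclass
class CredentialMap:
    """Hypothetical table translating client-side UIDs to backend UIDs."""

    uid_map: dict = field(default_factory=dict)
    squash_unknown: bool = True  # policy: map unknown UIDs to NOBODY_UID

    def translate(self, client_uid):
        if client_uid in self.uid_map:
            return self.uid_map[client_uid]
        if self.squash_unknown:
            return NOBODY_UID
        raise PermissionError(f"client UID {client_uid} has no backend mapping")


creds = CredentialMap(uid_map={1000: 52001, 1001: 52002})
print(creds.translate(1000))  # prints 52001
print(creds.translate(0))     # prints 65534: client root is not mapped, so it is squashed
```

Because client UID 0 has no entry in the table, a client that is root on its own machine gains no special rights on the backend. Setting `squash_unknown=False` turns the same table into a stricter policy that rejects unmapped clients outright.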

Requirements

Cache control

Manage cache contention between different backends.

Cache policies

 - Prefetch (what to cache aggressively)
 - Writeback (how far the backend may lag)
 - Granularity
   - whole F/S
   - single file
   - filesets?
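
One way to picture the policy dimensions listed above is as a small record per policy, with the most specific matching granularity winning. This is a sketch under assumptions: the `CachePolicy` fields, the `select_policy` function, the example paths, and the "most specific wins" rule are all hypothetical, and fileset granularity is included only tentatively because the notes list it with a question mark.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Granularity(Enum):
    FILESYSTEM = 1
    FILESET = 2  # tentative: listed with a question mark in the notes
    FILE = 3


@dataclass(frozen=True)
class CachePolicy:
    granularity: Granularity
    target: str                 # "/" for the filesystem, a directory for a fileset, a path for a file
    prefetch: bool              # whether to cache aggressively
    max_writeback_lag_s: float  # how far the backend may lag, in seconds


def select_policy(policies, path) -> Optional[CachePolicy]:
    """Return the most specific policy that applies to path, or None."""

    def matches(policy):
        if policy.granularity is Granularity.FILESYSTEM:
            return True
        if policy.granularity is Granularity.FILE:
            return path == policy.target
        # Fileset is modelled here as everything below a directory prefix.
        return path.startswith(policy.target.rstrip("/") + "/")

    candidates = [p for p in policies if matches(p)]
    return max(candidates, key=lambda p: p.granularity.value, default=None)


policies = [
    CachePolicy(Granularity.FILESYSTEM, "/", prefetch=False, max_writeback_lag_s=30.0),
    CachePolicy(Granularity.FILESET, "/scratch", prefetch=True, max_writeback_lag_s=300.0),
    CachePolicy(Granularity.FILE, "/scratch/journal", prefetch=False, max_writeback_lag_s=0.0),
]

print(select_policy(policies, "/scratch/run1/out.dat").granularity.name)  # prints FILESET
print(select_policy(policies, "/scratch/journal").granularity.name)       # prints FILE
print(select_policy(policies, "/home/user/notes").granularity.name)       # prints FILESYSTEM
```

A write-back lag of 0 for a single file stands in for write-through behaviour on that file while the rest of the fileset is allowed to lag. If two policies of the same granularity match, this sketch simply returns the first one in the list.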


F/S snapshot - "cut" over servers! Persistent lock pre-empts reduced coherence debate? Google uses persistent locks?

Performance

(partial) Disconnected operation

Strong coherence

Local stable storage

Clustered - Restriping - Data and Metadata

Use Cases

Connection offload (Security! SSL etc)

UID translation