
Architecture - Flash Cache


Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.

Summary

Flash cache is a write-only cache. When clients write, servers may redirect those writes to a flash cache server. When clients read data that were written to the flash cache, the data must first be flushed from the flash cache to the data server.

Details

Flash servers

A filesystem may have a set of flash cache servers, which are typically very fast flash storage with smaller capacity than an OST. Whenever a client wants to flush dirty data to storage, it sends the data to a flash server. Data are never read from flash cache servers.

Layouts

When a client opens a file, it gets two file data layouts from the MDS. The client keeps those layouts in a redirection layer; the appropriate layout is chosen depending on whether a read or a write is in progress.
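
Below is a minimal C sketch of how a client might hold both layouts and pick one per operation. All type and function names here are hypothetical illustrations, not the actual Lustre client API.

    #include <stdio.h>

    enum io_type { IO_READ, IO_WRITE };

    /* One layout describes striping over the master OSTs ("ostM"),
     * the other over the flash cache servers ("ostC"). */
    struct file_layout {
        const char *target;
    };

    /* Both layouts are returned by the MDS at open time and kept
     * by the client's redirection layer. */
    struct redir_layouts {
        struct file_layout master;   /* used for reads  */
        struct file_layout cache;    /* used for writes */
    };

    static const struct file_layout *
    redir_choose_layout(const struct redir_layouts *l, enum io_type t)
    {
        /* Writes go to the flash cache; reads go to the master OSTs. */
        return t == IO_WRITE ? &l->cache : &l->master;
    }

    int main(void)
    {
        struct redir_layouts l = { { "ostM" }, { "ostC" } };
        printf("write -> %s\n", redir_choose_layout(&l, IO_WRITE)->target);
        printf("read  -> %s\n", redir_choose_layout(&l, IO_READ)->target);
        return 0;
    }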

Locking

The flash cache server runs LDLM, and clients send write lock requests to it. The flash cache server in turn takes a lock on the master OST and then grants the lock to the client. In the case of a read, clients send read lock requests to the master OST, which forces the flash servers to flush the data if necessary.
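
The C sketch below traces the two lock paths just described. The message flow is illustrative only and the helper names are invented, though LCK_PR and LCK_PW are the usual Lustre names for the protected-read and protected-write lock modes.

    #include <stdio.h>

    enum lock_mode { LCK_PR, LCK_PW };

    static void client_lock(enum lock_mode mode)
    {
        if (mode == LCK_PW) {
            /* Write path: the flash cache server (FCS) takes a PW lock
             * on the master OST before granting the client's lock. */
            printf("client -> FCS : PW lock request\n");
            printf("FCS    -> ostM: PW lock taken\n");
            printf("FCS    -> client: PW lock granted\n");
        } else {
            /* Read path: the master OST sees a conflict with the FCS's
             * lock and forces the FCS to flush before granting. */
            printf("client -> ostM: PR lock request\n");
            printf("ostM   -> FCS : conflict, flush dirty data\n");
            printf("ostM   -> client: PR lock granted\n");
        }
    }

    int main(void)
    {
        client_lock(LCK_PW);
        client_lock(LCK_PR);
        return 0;
    }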

Use cases

| ID | Quality attribute | Summary |
|----|-------------------|---------|
| remapping extents | performance | the object has different layouts on OSTs and FCSs |
| how to take locks | coherency | client uses DLM extent locking during read/write |
| how to keep cache and master consistent | correctness, usability | clients see consistent file data in the face of concurrent read-write accesses to the master and proxy servers |
| cache miss in cobd | ?? | ?? |
| local cache coherency | correctness | for a file accessed by a given client only, a write followed by a read of the same data returns the last written data |
| how to acquire EA2 | performance | the flash cache layout is obtained from the MDS |
| power loss with cached writes in flash | availability | data cached on a flash cache server are not lost if the flash cache server fails |
| file size recovery/consistency | consistency | all clients see the correct file size; the file size is recovered if a flash cache server fails |
| mmap | ?? | |
| cache is full | usability | flash cache server free space management (grants?) |
| lose ostC | fault tolerance | the filesystem survives the death of a flash cache server |

Quality Attribute Scenarios

Each of the following scenarios is to be elaborated using the standard quality attribute scenario template (Scenario, Business Goals, Relevant QAs, Stimulus, Stimulus source, Environment, Artifact, Response, Response measure, Questions, Issues):

  • remapping extents
  • how to take locks
  • how to keep cache and master consistent
  • cache miss in cobd
  • local cache coherency
  • how to acquire EA2
  • power loss with cached writes in flash
  • file size recovery/consistency
  • mmap
  • cache is full

Implementation details

1. Add a new layer, "redir", between llite and lov in order to redirect write requests to the flash cache and let read requests go to the OSTs.

2. Flash cache is a feature of the filesystem: the lov descriptor contains a flash descriptor.

  • revoke the config lock for dynamic retrieval of EA2
  • write: PW lock on ostC; ostC's lov takes a PW lock on ostM
  • lockless IO with a write lock on the MDT (close to WBC locks, per-directory write locks); the MDT locks everything with a single extent lock bit (good for file-per-process workloads)

3. Use an extent-lock bit on the MDT to protect the whole file (data as well).

4. Hierarchical locks.

5. (retracted) A client holding a lovC lock implies the data is still on ostC (no consequence for ostM).

  • the ostM lock is held for a long time
  • a client doing a read might have to wait longer, since the flash cache may hold a lot of data
  • a client doing a read gets an ostM lock; a client doing a write gets an ostC lock

6. ostC locks are automatically non-overlapping. Don't hand out optimistic ostM extent locks that violate this invariant (see the sketch below).
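
As an illustration of that invariant, the hypothetical check below refuses an optimistic ostM extent lock that would overlap any extent already locked on ostC. The types and helpers are invented for the sketch.

    #include <stdbool.h>
    #include <stdio.h>

    struct extent { unsigned long start, end; };

    static bool extents_overlap(const struct extent *a, const struct extent *b)
    {
        return a->start <= b->end && b->start <= a->end;
    }

    /* 'held' are the extents already locked on ostC; they are
     * non-overlapping by construction. */
    static bool ostM_may_grant_optimistic(const struct extent *req,
                                          const struct extent *held, int n)
    {
        for (int i = 0; i < n; i++)
            if (extents_overlap(req, &held[i]))
                return false;   /* would violate the ostC invariant */
        return true;
    }

    int main(void)
    {
        struct extent held[] = { { 0, 4095 } };
        struct extent req = { 2048, 8191 };
        printf("grant: %s\n",
               ostM_may_grant_optimistic(&req, held, 1) ? "yes" : "no");
        return 0;
    }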

7. After data are flushed, remove them from the cache, so that clean data are not recovered. This applies to the flash cache only (sketched below).
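
A hypothetical flush path might look like the following: once an extent has been written back to the master OST, it is removed from the flash cache so that recovery never replays clean data. Names and types are illustrative only.

    #include <stdio.h>

    struct extent { unsigned long start, end; int dirty; };

    /* Stand-in for writing the extent back to the master object. */
    static int ostM_write(const struct extent *e)
    {
        printf("ostM: wrote [%lu, %lu]\n", e->start, e->end);
        return 0;
    }

    /* Stand-in for punching the flushed range out of the cache object. */
    static void ostC_remove(struct extent *e)
    {
        e->dirty = 0;
        printf("ostC: removed [%lu, %lu]\n", e->start, e->end);
    }

    static int fcs_flush(struct extent *e)
    {
        int rc = ostM_write(e);   /* copy to the master OST */
        if (rc == 0)
            ostC_remove(e);       /* the cache keeps dirty data only, so
                                   * recovery never replays clean data */
        return rc;
    }

    int main(void)
    {
        struct extent e = { 0, 4095, 1 };
        return fcs_flush(&e);
    }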

8. Map lovC to lovM without aliasing.

9. lovC obtains the maximum grant from ostM (see the sketch below).

  • a special grant RPC
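
The sketch below shows the intent of the grant flow: lovC asks ostM for its maximum grant via a dedicated RPC, so cached writes never exceed what the master OST has promised to accept. All names here are hypothetical; the real Lustre grant protocol is more involved.

    #include <stdio.h>

    struct grant_state {
        unsigned long granted;    /* bytes ostM promised to lovC */
        unsigned long consumed;   /* dirty bytes held on ostC    */
    };

    /* Stand-in for the special grant RPC to ostM. */
    static unsigned long ostM_max_grant(void)
    {
        return 64UL << 20;   /* say ostM offers 64 MB */
    }

    /* A cached write is accepted only if ostM has guaranteed space
     * for its eventual flush. */
    static int lovC_cache_write(struct grant_state *g, unsigned long bytes)
    {
        if (g->consumed + bytes > g->granted)
            return -1;    /* out of grant: flush to ostM first */
        g->consumed += bytes;
        return 0;
    }

    int main(void)
    {
        struct grant_state g = { ostM_max_grant(), 0 };
        printf("1 MB write: %d\n", lovC_cache_write(&g, 1UL << 20));
        return 0;
    }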

10. All updates go through the cache.

References

bug 14699