Architecture - Request Redirection

Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.

Definitions

Target OST: the OST to which file data were initially written. It receives data access requests from clients, manages locks, maintains information about data cached by clients and dedicated servers, and maintains persistent and in-core redirect information.
Client: an application which accesses data on a target OST
Migration: moving file data from one set of OSTs to another
Collaborative cache: a read-only cache distributed over clients or dedicated cache servers. Target OSTs maintain in-core information about clients and cache servers and the data they are caching, and redirect read accesses to an appropriate cache server.
Flash cache: a write-only cache. There are dedicated flash cache servers. Target OSTs maintain (persistent?) information about data cached by flash cache servers and redirect write accesses to them.

Summary

Request Redirection is a mechanism which allows a target OST to redirect client requests to other servers. A target OST may decide to redirect a request in the hope of improving system throughput, or may have to redirect when it no longer stores the requested data.

Requirements

Universality: clients are redirected to instances of data through the same request redirection mechanism, regardless of how those instances were created
Flexibility: the request redirection mechanism allows clients to specify preferences (for example, "do not redirect me", or "let me choose myself")
Extensibility: adding new ways to create instances of data should require no or minimal changes to the request redirection mechanism
Availability: requested data are accessible either at the servers to which a client is redirected or at the target OST
Centralization: the target OST keeps all the information about possible redirections. All lock requests are sent to it; it does the locking and can optionally redirect a client to another server, where data access happens without further locking
Modes: redirection information can be stored persistently when it has to survive reboots (for example, in the case of migration), or it can be maintained in memory only (for example, in the case of collaborative cache)
API: the request redirection mechanism provides means for clients to send redirection information to and receive it from a target OST, and means for the target OST to store that information
Multiplexing: there may be several instances of the same data, and the request redirection mechanism should be able to deal with that. For example, in the case of collaborative cache the target OST has to be able to find the clients which are caching the requested data and to choose where to redirect.
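
The Modes and Multiplexing requirements can be illustrated with a minimal sketch of a redirect table. Everything here is hypothetical: the names `Mode`, `Redirect` and `RedirectTable`, the inclusive byte-range extents, and the `restart` method that models a reboot are assumptions made for illustration, not part of the design on this page.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    PERSISTENT = "persistent"  # has to survive reboots (e.g. migration)
    IN_CORE = "in-core"        # memory only (e.g. collaborative cache)


@dataclass(frozen=True)
class Redirect:
    start: int    # first byte of the extent
    end: int      # last byte of the extent, inclusive
    server: str   # node holding this instance of the data
    mode: Mode


@dataclass
class RedirectTable:
    records: list = field(default_factory=list)

    def add(self, start, end, server, mode):
        self.records.append(Redirect(start, end, server, mode))

    def lookup(self, start, end):
        """Return every record whose extent overlaps [start, end].

        Several records may match: the same data can have several instances.
        """
        return [r for r in self.records if r.start <= end and start <= r.end]

    def restart(self):
        """Model a reboot: only persistent records survive."""
        self.records = [r for r in self.records if r.mode is Mode.PERSISTENT]
```

A lookup that returns more than one record is the multiplexing case: the target OST then has to choose among the candidates or hand the list to the client.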

Use Cases

id	quality attribute	summary
collaborative cache populating	performance,scalability	client holds a read lock on data extent, sends read request to target OST
collaborative cache redirection	performance,scalability	client sends read lock request to target OST
flash cache redirection	performance	client sends write lock request, there is flash server in the filesystem
client reads or writes migrated data	availability	client sends lock request to target OST for data extent, data are not on target OST due to migration
data server crash	availability	data server crashes and comes back up
update persistent redirect information	availability	migration is in progress: data extent

collaborative cache populating

Scenario:		a client sends read request to target OST, there are no other caches for the data, the client is running OST service
Business Goals:		populate collaborative cache
Relevant QA's:		performance, scalability
details	Stimulus:	client read request
	Stimulus source:	client
	Environment:	the requested data are not cached by anybody
	Artifact:	requested data
	Response:	target OST reads the requested data and sends them to the client. If the client node runs an OST service, the target OST makes an in-core record that the requested data extent is cached by this client node
	Response measure:	target OST knows that certain data are cached by certain client node
Questions:
Issues:		A client node may not wish to participate in the collaborative cache; this can be controlled with preferences
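
The populating step above can be sketched as follows. The function name `serve_read`, the `opted_in` flag standing in for client preferences, and the dictionary used as the in-core record are all assumptions for illustration.

```python
def serve_read(cached_by, extent, client, runs_ost_service, opted_in=True):
    """Serve a read for `extent` and note who now caches it.

    cached_by: dict mapping extent -> set of client nodes (in-core only).
    extent:    (first_byte, last_byte), inclusive.
    """
    data = b"\0" * (extent[1] - extent[0] + 1)  # stand-in for a disk read
    # Only clients that run an OST service and have not opted out
    # become part of the collaborative cache.
    if runs_ost_service and opted_in:
        cached_by.setdefault(extent, set()).add(client)
    return data
```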

collaborative cache redirection

Scenario:		a client sends read lock request to target OST, the requested data extent is cached by another client
Business Goals:		offload the target OST by redirecting the read request to a hopefully less loaded node
Relevant QA's:		performance, scalability
details	Stimulus:	client read lock request
	Stimulus source:	client
	Environment:	the requested data extent is cached by another client
	Artifact:	requested data
	Response:	target OST grants the lock, checks its records and sees that the data are available via collaborative cache. The request is serviced in accordance with the client's preferences: a list of nodes caching the data range can be returned, or a client node in the local network can be chosen, etc.
	Response measure:	lock is granted, the client knows where the data can be fetched
Questions:
Issues:
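
A minimal sketch of how preferences might shape the response, assuming the lock has already been granted. The `Preference` values mirror the examples given under Requirements; the function name, the `(node, network)` pairs and the fallback to the first caching node are hypothetical.

```python
from enum import Enum


class Preference(Enum):
    NO_REDIRECT = 1      # "do not redirect me"
    CLIENT_CHOOSES = 2   # "let me choose myself"
    SERVER_CHOOSES = 3   # target OST picks one caching node


def redirect_read(cachers, client_net, pref, target="target-ost"):
    """Return the nodes the client may read from.

    cachers: list of (node, network) pairs caching the requested extent.
    """
    if pref is Preference.NO_REDIRECT or not cachers:
        return [target]
    if pref is Preference.CLIENT_CHOOSES:
        # Hand back every caching node and let the client decide.
        return [node for node, _ in cachers]
    # SERVER_CHOOSES: prefer a node in the client's local network.
    local = [node for node, net in cachers if net == client_net]
    return [local[0] if local else cachers[0][0]]
```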

flash cache redirection

Scenario:		a client sends write lock request to target OST, there is a flash cache in the filesystem
Business Goals:
Relevant QA's:		performance
details	Stimulus:	client write lock request
	Stimulus source:	client
	Environment:	flash cache is capable of performing this write
	Artifact:	write lock
	Response:	target OST grants the lock (all other locks are revoked, caches are released), checks its records to see whether the data were already written to a flash cache server, selects a flash cache server which is able to do this write, and redirects the client to the selected flash cache server
	Response measure:	the client has the write lock and knows where to send the write request
Questions:		the client has to send a notification to the target OST when the flash server completes the write, so that the target OST can make the appropriate redirect record. Only after that does the client release the write lock.
Issues:		How is it guaranteed that the flash server will complete the write?
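
The ordering described above (lock, write to flash server, notify target OST, unlock) can be sketched as a small state model. The class and method names are hypothetical, and "able to do this write" is modelled as having enough free space, which is an assumption rather than something this page specifies.

```python
class TargetOST:
    def __init__(self, flash_servers):
        self.flash_servers = flash_servers  # name -> free bytes (assumed criterion)
        self.redirects = {}                 # extent -> flash server holding it
        self.write_locks = {}               # extent -> client holding the lock

    def write_lock(self, client, extent):
        """Grant the write lock and return the flash server to write to.

        Returns None when no flash server can take the write.
        """
        if extent in self.write_locks:
            raise RuntimeError("extent is already write-locked")
        self.write_locks[extent] = client
        # Reuse the server that already holds this extent, if any.
        server = self.redirects.get(extent)
        if server is None:
            size = extent[1] - extent[0] + 1
            server = next(
                (s for s, free in self.flash_servers.items() if free >= size),
                None,
            )
        return server

    def write_done(self, client, extent, server):
        """Client reports that the flash server completed the write."""
        if self.write_locks.get(extent) != client:
            raise RuntimeError("notification from a client without the lock")
        self.redirects[extent] = server

    def unlock(self, client, extent):
        if self.write_locks.get(extent) != client:
            raise RuntimeError("unlock from a client without the lock")
        del self.write_locks[extent]
```

The redirect record is made in `write_done`, before `unlock`, so no other client can obtain a lock on the extent while the record is still missing.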

client accesses a file which migrates

Scenario:		client accesses a file which is migrating to another OST, and the accessed extent of data is not available on the target OST
Business Goals:		allow migration and client access to work simultaneously
Relevant QA's:		Availability
details	Stimulus:	Client needs to access data to do its job
	Stimulus source:	Client
	Environment:	Data requested by the client were migrated to other data server or to several servers
	Artifact:	lock request
	Response:	target OST grants the lock, checks its redirect records and sees that the data are already on another server, redirects the client to that server
	Response measure:	lock is granted to the client, client knows where it can fetch data from, client does not have to wait until migration completes, migration continues
Questions:		What does the target OST do if the requested data extent has migrated to several servers? It can return either an array of redirections or a redirection for the first part of the extent only
Issues:
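
The two options raised in the question above can be compared with a short sketch. The function `resolve`, the `first_only` switch and the tuple layout are hypothetical; the sketch also assumes that redirect records do not overlap and that any part of the extent without a record is still on the target OST.

```python
def resolve(records, start, end, first_only=False, local="target-ost"):
    """Split [start, end] into pieces, each tagged with the server holding it.

    records: non-overlapping (rec_start, rec_end, server) tuples, inclusive.
    Returns a list of (piece_start, piece_end, server) tuples.
    """
    pieces = []
    pos = start
    for r_start, r_end, server in sorted(records):
        if r_end < pos or r_start > end:
            continue  # record lies outside the part still to be resolved
        if r_start > pos:
            # Gap before this record: not migrated, still local.
            pieces.append((pos, r_start - 1, local))
        piece_end = min(end, r_end)
        pieces.append((max(pos, r_start), piece_end, server))
        pos = piece_end + 1
        if pos > end:
            break
    if pos <= end:
        pieces.append((pos, end, local))
    # Option 2 from the question: redirect for the first part only.
    return pieces[:1] if first_only else pieces
```

With `first_only=True` the client has to come back for the rest of the extent; with the full array it can contact all servers at once.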

data server crash

Scenario:		a data server crashes while a migration agent copies a file hosted by the data server
Business Goals:		incorrect redirection is not allowed
Relevant QA's:		availability
details	Stimulus:	OST crash
	Stimulus source:	power failure
	Environment:	at time of crash a migration agent worked with data hosted on the crashed data server
	Artifact:	data server
	Response:	when the data server is up again, none of RID update requests from the agent are lost
	Response measure:
Questions:		is there anything to do about recovering? I guess no, if agents follow simple rules interacting with a data server.
Issues:		Migration agent is responsible for resending RID update requests which the data server did not complete before crash
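
One "simple rule" an agent could follow is to resend each RID update until the data server confirms it. The sketch below assumes that RID refers to the data server's persistent redirect information (the term is not defined on this page), that an update is idempotent, and that a crash shows up to the agent as a failed request; the names `DataServer`, `crash_on` and `send_until_done` are hypothetical.

```python
class DataServer:
    def __init__(self, crash_on=()):
        self.rid = {}                  # extent -> server now holding the data
        self.crash_on = set(crash_on)  # request numbers at which to "crash"
        self.seen = 0

    def update_rid(self, extent, new_server):
        self.seen += 1
        if self.seen in self.crash_on:
            # The crash happens before the update is completed.
            raise ConnectionError("data server crashed")
        self.rid[extent] = new_server  # idempotent: repeating is harmless
        return "done"


def send_until_done(server, extent, new_server, max_tries=5):
    """Resend the RID update until the server completes it.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_tries + 1):
        try:
            server.update_rid(extent, new_server)
            return attempt
        except ConnectionError:
            continue  # server is assumed to be back up; resend
    raise RuntimeError("RID update was never completed")
```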

agent sends a RID update request to data server

Scenario:		agent sends to data server a request to update the RID
Business Goals:		keep data server aware of real data location
Relevant QA's:		availability
details	Stimulus:	Agent copied data somewhere
	Stimulus source:	Agent
	Environment:	Agent copied data to target data server, so source data server's RID has to be updated
	Artifact:	RID of data server
	Response:	data server adds a new record to its RID and sends a completion notification to the agent
	Response measure:
Questions:
Issues:		Migration agent has to send the RID update request after the copied data have been written to disk on the target data server
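
The ordering constraint in the issue above can be sketched with an ordinary file standing in for the target data server's storage. The function name and the dictionary standing in for the source server's RID are hypothetical; `os.fsync` is used only to show that the data reach disk before the RID changes.

```python
import os


def copy_then_update(data, target_path, rid, extent, target_name):
    """Write `data` durably to the target, then record the new location."""
    with open(target_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # data are on disk before the RID is touched
    # Stand-in for sending the RID update request to the source data server.
    rid[extent] = target_name
    return "done"
```

If the steps were reversed, a crash between them could leave the RID pointing at data that never reached disk on the target.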

Questions

1. Is there a need for the request redirection mechanism to be involved in filesystem replication? Hopefully not, because of the significant overhead of RID maintenance.

2. In the case of replication it may happen that a data server can either serve a request locally or redirect the request somewhere else. Who is to make the choice?

References

bug 14174

Simple Space Balance Migration
