Architecture - Version Based Recovery

Summary
Version Based Recovery (VBR) is a recovery mechanism that lets clients recover in a less strict order, and even lets a client replay requests long after the main recovery has completed. Independent changes should be recovered even if some clients are missing.

Definitions

 * transno: transaction number, unique per disk filesystem throughout its whole life cycle
 * version: a unique version of an object; every change to the object changes its version
 * pre-version: the version of an object before a change
 * post-version: the version of an object after a change
 * transno-based recovery: the existing recovery, where all dependencies are tracked using transno and replay is done in transno order
 * version-based recovery: the new recovery, where every object has a version and every change is applied only if its pre-version matches the current object version; thus dependencies are tracked per object
 * applicable request: a request whose pre-version(s) match the current version(s) of the object(s) it changes
 * orphan: a file or directory that was open on one or more clients and has since been unlinked
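
The definitions above can be illustrated with a minimal sketch of the applicability check. This is not the actual server code: the names `ReplayRequest`, `Target` and `recover` are hypothetical, and the choice of using the transno as the post-version is an assumption made only to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class ReplayRequest:
    """A replayed change (hypothetical structure for illustration)."""
    transno: int
    pre_versions: dict  # object id -> version the client saw before the change


class Target:
    """Toy server target holding the current version of every object."""

    def __init__(self, versions):
        self.versions = dict(versions)

    def is_applicable(self, req):
        # Applicable request: every pre-version matches the current version.
        return all(self.versions.get(obj) == ver
                   for obj, ver in req.pre_versions.items())

    def replay(self, req):
        if not self.is_applicable(req):
            return False
        for obj in req.pre_versions:
            # Assumption: the post-version is the transno of the change.
            self.versions[obj] = req.transno
        return True


def recover(target, requests):
    """Replay requests in transno order, skipping non-applicable ones."""
    applied, skipped = [], []
    for req in sorted(requests, key=lambda r: r.transno):
        (applied if target.replay(req) else skipped).append(req.transno)
    return applied, skipped


# Objects A and B are at versions 10 and 20. A missing client had changed A
# in transno 21; the surviving client changed B (22) and then A again (23).
target = Target({"A": 10, "B": 20})
surviving = [ReplayRequest(23, {"A": 21}), ReplayRequest(22, {"B": 20})]
applied, skipped = recover(target, surviving)
print(applied, skipped)  # [22] [23]
```

The change to B is independent of the missing client and is recovered, while the change to A depends on a version that was never replayed and is therefore not applicable. With transno-based recovery the gap at transno 21 would affect both requests.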

Requirements

 * 1) better recoverability when a client is missing
 * 2) allow late clients to continue their work with no application-visible errors
 * 3) no performance penalty (in other words, no additional seeks to access or update versions)
 * 4) compatibility, or clearly stated incompatibility, through a connect flag?

Implementation details

 * 1) 1.6/1.8 rely on the ability of the underlying disk filesystem to recreate an inode with a given ino (the wantedi patch in ldiskfs). The ino space is very limited and the disk filesystem reuses inos in an uncontrollable manner, so a late replay can find its ino already in use. Currently this is fatal for the server. We can either reject such a replay (and the efficiency of VBR for 1.6/1.8 suffers) or try to update all of the client's state (inode in icache, locks, etc.). FIDs (appearing in 2.0?) aren't reused, so the problem disappears with them.
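
The ino reuse problem and the "reject such replay" option can be sketched as follows. This is a toy model only: `InoTable`, `replay_create` and `FidAllocator` are hypothetical names, and the allocation policy shown here is an assumption that merely stands in for the uncontrollable reuse done by the real disk filesystem.

```python
import itertools


class InoTable:
    """Toy inode table with a small ino space; freed inos are reused."""

    def __init__(self, size):
        self.free = list(range(1, size + 1))
        self.used = {}  # ino -> name

    def create(self, name, wanted_ino=None):
        if wanted_ino is None:
            ino = self.free.pop(0)  # take the lowest free ino
        elif wanted_ino in self.used:
            raise FileExistsError(f"ino {wanted_ino} is already in use")
        else:
            self.free.remove(wanted_ino)
            ino = wanted_ino
        self.used[ino] = name
        return ino

    def unlink(self, ino):
        del self.used[ino]
        self.free.insert(0, ino)  # freed ino becomes reusable at once


def replay_create(table, name, wanted_ino):
    """Replay a create with a given ino; reject it if the ino was reused."""
    if wanted_ino in table.used:
        return False  # rejecting instead of treating this as fatal
    table.create(name, wanted_ino)
    return True


class FidAllocator:
    """Toy FID-like allocator: identifiers only grow and are never reused."""

    def __init__(self):
        self._next = itertools.count(1)

    def alloc(self):
        return next(self._next)


# After a restart the uncommitted create of "a" (ino 1) is lost, and a new
# file "b" takes ino 1 before the late client replays its create.
table = InoTable(4)
table.create("b")
print(replay_create(table, "a", 1))  # False: ino 1 is already in use
print(replay_create(table, "c", 3))  # True: ino 3 is still free
```

With identifiers that are never handed out twice, as in `FidAllocator`, a late replay cannot collide with an object created in the meantime, which is why the problem disappears once FIDs are used.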