WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Lustre Project List

From Obsolete Lustre Wiki
Revision as of 17:02, 5 September 2010 by Adilger (talk | contribs) (→‎List of Lustre Features and Projects: reference lustre-devel)
Jump to navigationJump to search

List of Lustre Features and Projects

Below is a list of Lustre features and projects that are just waiting for someone to start working on them. They are listed roughly in order of increasing complexity, but this is highly dependent upon the coding skills of the developer and their familiarity with the Lustre code base.

After you have chosen a project, or if you are having trouble deciding what to work on, please contact the lustre-devel mailing list to discuss your project with the Lustre developers. That will ensure that the work you are doing is in line with other plans/projects for Lustre and also to ensure that nobody else is working on the same thing.

Feature Complexity Required skills Tracking Bug Brief Description
ioctl() number cleanups 1 kernel 20731 Clean up Linux IOC numbering to properly use "size" field so that mixed 32- and 64-bit kernel/userspace ioctls work correctly. Attention needs to be paid to maintaining userspace compatibility for a number of releases, so the old ioctl() numbers cannot simply be removed.
Improve testing Efficiency 3 shell, test Improve the performance, efficiency, and coverage of the acceptance-small.sh test scripts. More advanced work includes improved test scheduling. Dynamic cluster configuration. Virtual machines for functional tests.
Config save/edit/restore 3 MGS, llog, config 17094 Need to be able to backup/edit/restore the client/MDS/OSS config llog files after a writeconf. One reason is for config recovery if the config llog becomes corrupted. Another reason is that all of the filesystem tunable parameters (including all of the OST pool definitions) are stored in the config llog and are lost if a writeconf is done.
mdd-survey tools for performance analysis 3 obdfilter-survey, mdd, benchmarking 21658 Add a low-level metadata unit test to allow measuring performance of the metadata stack without having connected clients, similar and/or integrated to the obdfilter survey (echo client, echo server)
Readdir with large requests 4 MDS, RPC 17833 Read directory pages in large chunks instead of the current page-at-a-time reads from the client. This will improve readdir performance somewhat, and reduce load on the MDS. It is expected to be significant over WAN high-latency links.
32TB ldiskfs filesystems 5 ldiskfs, obdfilter 20063 Single OST sizes larger than 16TB. This is largely supported in newer ext4 filesystems (e.g. RHEL5.4, RHEL6), but thorough testing and some bug fixing work may be needed in obdfilter (1.8, 2.0) or OFD (2.x), and other work may be needed in client (all versions).

All RPCs pass a lock handle 5 DLM, RPC 22849 For protocol correctness, and improved performance, it would be desirable for all RPCs that are done with a client lock held to send the lock handle along with the request. For OST requests this means all read, write, truncate operations (unless "lockless") should include a lock handle. This allows the OST to validate the request is being done by a client that holds the correct locks, and allows lockh->lock->object lookups to avoid OI lookups in most cases.
OST Space Management (Basic) 6 HSM, layout 13107 Simple migration capability - transparently migrate objects/files between OSTs (blocking application writes, or aborting migration during contention); evacuate OSTs and move file data to other OSTs; add new OST and balance data on it. The OST doesn't really need to understand this, only the MDS (for LOV EA rewrite) and client (LOV EA rewrite); HSM project implements layout lock support and policy engine for automatic space management.
Imperative recovery 6 recovery, RPC 18767 Reduce recovery time by having the server notify clients after recovery has completed instead of waiting for the client to timeout the RPC before it begins recovery.