Documenting Code

In addition to the architecture and design documentation, a certain amount of documentation has to be maintained in the Lustre source code. The main reason for this is that it is very difficult to keep separate documentation constantly up to date with the changing code.

The best way to document the code is to make it so simple and clear that no additional documentation is necessary. As R. Pike, Esq. put it: "Basically, avoid comments." When this ideal cannot be reached, code has to be commented. There are two broad categories of comments:


 * how: describes how this particular piece of code achieves its function;
 * what: describes the purpose of this function, data-type, or module, and how it fits into the larger picture. These are interface comments.

This page deals only with the latter type. Lustre uses (or will use) doxygen to automatically generate cross-linked interface descriptions from source code. As a result, interface comments have to follow a certain template, which has advantages of its own.

Below are a few examples:

commenting a function
    /**
     * Owns a page by IO.
     *
     * Waits until \a pg is in cl_page_state::CPS_CACHED state, and then switch it
     * into cl_page_state::CPS_OWNED state.
     *
     * \pre  !cl_page_is_owned(pg, io)
     * \post result == 0 iff cl_page_is_owned(pg, io)
     *
     * \retval 0   success
     *
     * \retval -ve failure, e.g., page was destroyed (and landed in
     *             cl_page_state::CPS_FREEING instead of cl_page_state::CPS_CACHED).
     *
     * \see cl_page_disown
     * \see cl_page_operations::cpo_own
     */
    int cl_page_own(const struct lu_env *env, struct cl_io *io, struct cl_page *pg)

Note that:
 * doxygen comment starts with /** (like in javadoc)
 * it opens with a brief description of what this function is doing. Brief description runs up to the first full-stop mark (.)
 * brief description is followed by the detailed description.
 * descriptions are written in active voice with indicative mood verbs in third person singular: "[This function] does this and that", "[This data-type represents] such and such concepts".
 * to refer to a function argument use \a argname syntax.
 * to refer to another function use funcname syntax; it will produce a cross-reference.
 * to refer to a field or an enum value use SCOPE::NAME syntax.
 * if possible, specify a (weakest) precondition and a (strongest) postcondition for the function. If the conditions cannot be expressed as a C language expression, provide an informal description. Use result to refer to the function return value. Mention all concurrency control restrictions here (such as locks that the function expects to be held, or holds on exit).
 * describe possible return values with \retval.
 * enumerate related functions and data-types in the \see section. Note that doxygen automatically cross-references all places where a given function is called (but not calls through a function pointer) and all functions that it calls, so there is no need to enumerate these.
 * optionally use \author tag, so that the world knows whom to praise.
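
The rules above can be tried out on a small, self-contained function. The following is only a sketch: the demo_ring type and the demo_ring_put and demo_ring_count functions are hypothetical names invented for illustration and are not part of Lustre.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/** Hypothetical fixed-size ring, used only to illustrate the comment style. */
struct demo_ring {
        /** Number of valid entries in demo_ring::dr_slot. */
        size_t dr_count;
        /** Storage for entries. */
        int    dr_slot[4];
};

/**
 * Appends a value to a ring.
 *
 * Stores \a val in the first free slot of \a ring. Never overwrites existing
 * entries.
 *
 * \pre  ring != NULL
 * \post result == 0 iff ring->dr_count has grown by one
 *
 * \retval 0       success
 * \retval -ENOSPC failure, \a ring is full and was left unchanged
 *
 * \see demo_ring_count
 */
int demo_ring_put(struct demo_ring *ring, int val)
{
        assert(ring != NULL);
        if (ring->dr_count == sizeof(ring->dr_slot) / sizeof(ring->dr_slot[0]))
                return -ENOSPC;
        ring->dr_slot[ring->dr_count++] = val;
        return 0;
}

/**
 * Returns the number of entries stored in a ring.
 *
 * \pre ring != NULL
 *
 * \see demo_ring_put
 */
size_t demo_ring_count(const struct demo_ring *ring)
{
        assert(ring != NULL);
        return ring->dr_count;
}
```

In this sketch the brief description ends at the first full stop, arguments are referenced with \a, and the formal precondition is also checked at run time with assert(), which keeps the comment and the code from drifting apart.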

data-type
    /**
     * "Compound" object, consisting of multiple layers.
     *
     * Compound object with given fid is unique with given lu_site.
     *
     * Note, that object does *not* necessary correspond to the real object in the
     * persistent storage: object is an anchor for locking and method calling, so
     * it is created for things like not-yet-existing child created by mkdir or
     * create calls. lu_object_operations::loo_exists can be used to check
     * whether object is backed by persistent storage entity.
     */
    struct lu_object_header {
            /**
             * Object flags from enum lu_object_header_flags. Set and checked
             * atomically.
             */
            unsigned long     loh_flags;
            /**
             * Object reference count. Protected by lu_site::ls_guard.
             */
            atomic_t          loh_ref;
            /**
             * Fid, uniquely identifying this object.
             */
            struct lu_fid     loh_fid;
            /**
             * Common object attributes, cached for efficiency. From enum
             * lu_object_header_attr.
             */
            __u32             loh_attr;
            /**
             * Linkage into per-site hash table. Protected by lu_site::ls_guard.
             */
            struct hlist_node loh_hash;
            /**
             * Linkage into per-site LRU list. Protected by lu_site::ls_guard.
             */
            struct list_head  loh_lru;
            /**
             * Linkage into list of layers. Never modified once set (except lately
             * during object destruction). No locking is necessary.
             */
            struct list_head  loh_layers;
    };

 * describe data-type invariants (again, preferably formally):

    /**
     * Fields are protected by the lock on cfs_page_t, except for atomics and
     * immutables.
     *
     * \invariant Data type invariants are in cl_page_invariant. Basically:
     * cl_page::cp_parent and cl_page::cp_child are a well-formed double-linked
     * list, consistent with the parent/child pointers in the cl_page::cp_obj and
     * cl_page::cp_owner (when set).
     */
    struct cl_page {
            /** Reference counter. */
            atomic_t           cp_ref;

 * describe concurrency control mechanisms for structure fields:

            /** An object this page is a part of. Immutable after creation. */
            struct cl_object  *cp_obj;
            /** Logical page index within the object. Immutable after creation. */
            pgoff_t            cp_index;
            /** List of slices. Immutable after creation. */
            struct list_head   cp_layers;
            ...
    };

 * specify when fields are valid:

            /**
             * Owning IO in cl_page_state::CPS_OWNED state. Sub-page can be owned
             * by sub-io.
             */
            struct cl_io      *cp_owner;
            /**
             * Owning IO request in cl_page_state::CPS_PAGEOUT and
             * cl_page_state::CPS_PAGEIN states. This field is maintained only in
             * the top-level pages.
             */
            struct cl_req     *cp_req;

 * a sub-set of fields or enum values can be grouped together with a @{...@} block:

    struct cl_object_header {
            /** Standard lu_object_header. cl_object::co_lu::lo_header points
             * here. */
            struct lu_object_header  coh_lu;
            /** \name locks
             * \todo XXX move locks below to the separate cache-lines, they are
             * mostly useless otherwise.
             */
            /** @{ */
            /** Lock protecting page tree. */
            spinlock_t               coh_page_guard;
            /** Lock protecting lock list. */
            spinlock_t               coh_lock_guard;
            /** @} locks */

 * by default a documenting comment goes immediately before the entity being commented. Sometimes, to streamline comments in the header file, it is necessary to place the comment separately. Use the following syntax for this:

    /** \struct cl_page
     * Layered client page.
     *
     * cl_page: represents a portion of a file, cached in the memory. All pages
     *   of the given file are of the same size, and are kept in the radix tree
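
The data-type guidelines above (invariants, concurrency control, field validity, member groups) can be combined in one small structure. This is a minimal sketch under stated assumptions: demo_pool, its dp_ fields, and the demo_pool_invariant and demo_pool_init helpers are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/** Capacity of demo_pool, chosen arbitrarily for the example. */
enum { DEMO_POOL_SIZE = 8 };

/**
 * Pool of integer slots handed out in order.
 *
 * Fields are protected by a caller-provided lock, except for immutables.
 *
 * \invariant demo_pool::dp_used <= demo_pool::dp_limit and
 * demo_pool::dp_limit <= DEMO_POOL_SIZE. Checked by demo_pool_invariant.
 */
struct demo_pool {
        /** Maximal number of usable slots. Immutable after creation. */
        size_t dp_limit;
        /** \name usage
         * Fields describing current usage. Valid only after demo_pool_init.
         */
        /** @{ */
        /** Number of slots currently in use. */
        size_t dp_used;
        /** Slot contents. Only the first demo_pool::dp_used entries are valid. */
        int    dp_slot[DEMO_POOL_SIZE];
        /** @} usage */
};

/**
 * Checks the data-type invariant.
 *
 * \retval true  \a pool is consistent
 * \retval false \a pool is corrupted
 */
bool demo_pool_invariant(const struct demo_pool *pool)
{
        return pool->dp_used <= pool->dp_limit &&
               pool->dp_limit <= DEMO_POOL_SIZE;
}

/**
 * Initializes an empty pool.
 *
 * \pre  limit <= DEMO_POOL_SIZE
 * \post demo_pool_invariant(pool)
 */
void demo_pool_init(struct demo_pool *pool, size_t limit)
{
        assert(limit <= DEMO_POOL_SIZE);
        pool->dp_limit = limit;
        pool->dp_used  = 0;
}
```

Keeping the invariant in a callable function, as the cl_page comment does with cl_page_invariant, lets the \invariant text stay short and lets the same check be reused in assertions.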

files and modules
 * document functions in the .c files, rather than in headers.
 * to document a software component, add the following to the header file with the definitions of the key data-types for this module:

    /** \defgroup component_name component_name
     *
     * overall module documentation
     * ...
     *
     * @{
     */
    type definitions...
    exported functions...
    /** @} component_name */

 * to separate a logical part of a larger component, add the following somewhere within the component's \defgroup:

    /**
     * \name subcomponent_name subcomponent_name
     *
     * Description of a sub-component
     */
    /** @{ */
    type definitions...
    exported functions...
    /** @} subcomponent_name */

 * if an exported function prototype in a header is located within some group, the corresponding function definition in a .c file is automatically assigned to the same group.
 * a set of comments which is not lexically a part of a group can be included into it with the \addtogroup command:

    /** \addtogroup cl_object cl_object
     * @{ */
    /**
     * "Data attributes" of cl_object. Data attributes can be updated
     * independently for a sub-object, and top-object's attributes are calculated
     * from sub-objects' ones.
     */
    struct cl_attr {
            /** Object size, in bytes */
            loff_t cat_size;
            ...
    };
    ...
    /** @} cl_object */
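
To show how the grouping commands nest, here is a compact sketch with a \defgroup enclosing a \name sub-section. The demo_counter component and all of its members are hypothetical. For the sake of a self-contained example the type and the function live in one file, whereas in a real tree the type would sit in a header and the documented function body in a .c file.

```c
#include <assert.h>
#include <stddef.h>

/** \defgroup demo_counter demo_counter
 *
 * Hypothetical component that counts events. Exists only to illustrate
 * how group markers are placed.
 *
 * @{
 */

/** Event counter. */
struct demo_counter {
        /** Number of events seen so far. */
        unsigned long dc_events;
};

/**
 * \name demo_counter_update demo_counter_update
 *
 * Functions that modify a counter.
 */
/** @{ */

/**
 * Records one event.
 *
 * \pre  cnt != NULL
 * \post cnt->dc_events has grown by one
 */
void demo_counter_hit(struct demo_counter *cnt)
{
        assert(cnt != NULL);
        cnt->dc_events++;
}

/** @} demo_counter_update */

/** @} demo_counter */
```

Repeating the group name after each closing @} marker, as the templates above do, makes it easier to see which block ends where when groups are nested.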

running doxygen
Doxygen uses a template file to control the documentation build. Lustre comes with two templates:
 * build/doxyfile.ref: produces a short form of documentation set, suitable as a reference. Output is placed into apidoc.ref/ directory.
 * build/doxyfile.api: produces the full documentation set, more suitable for learning the code structure. In addition to the contents of the apidoc.ref/ version, this includes call-graphs, source code excerpts, and non-html forms of documentation (rtf, latex, and troff). Output is placed into apidoc.api/ directory.

To build the documentation, run doxygen build/$TEMPLATE in the top-level lustre directory, where $TEMPLATE is one of the two template files listed above.

publishing
An effort is currently underway to establish a way to publish the interface documentation on the web.