WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Documenting Code

In addition to the architecture and design documentation, documentation is required for the interface to each Lustre subsystem to provide reference information on how to use the subsystem correctly. This documentation is embedded in the source code as stylised comments using doxygen to ensure it stays up to date as the source is developed and maintained.

The minimum requirement is to document the subsystem API, including each datatype, procedure and global designed to be used externally, as follows:

  • Procedure
    • What it does
    • How to use it / how not to abuse it
    • What it returns
    • Description of parameters and their valid values
  • Global
    • What it is for
    • How to use it / how not to abuse it
  • Datatype (structs, typedefs, enums)
    • What it is for
    • How to use it / how not to abuse it
    • Description of each struct member

Information about "How to use it / how not to abuse it" requires particular attention. The documentation should include all usage constraints, such as concurrency controls, reference counting, and permitted caller context. In some cases, a description of the entire object life-cycle from creation through destruction is required to ensure safe usage.
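As an illustration of these usage constraints, here is a minimal sketch of a documented reference-counted object. The type demo_buf and its functions are hypothetical and are not part of Lustre; the sketch only shows where life-cycle, reference counting, concurrency and caller-context notes might go.

```c
#include <assert.h>
#include <stdlib.h>

/**
 * Reference-counted buffer (hypothetical type, for illustration only).
 *
 * Life-cycle: created by demo_buf_alloc() with one reference held by the
 * caller. Further references are taken with demo_buf_get() and dropped with
 * demo_buf_put(). The buffer is freed when the last reference is dropped and
 * must not be accessed afterwards.
 *
 * Concurrency: this sketch has no locking. Callers must serialise all calls
 * on a given buffer.
 */
struct demo_buf {
        /** Number of references held. Always > 0 while the buffer is live. */
        int    db_ref;
        /** Size of demo_buf::db_data, in bytes. Immutable after creation. */
        size_t db_size;
        /** Zero-filled payload. Pointer is immutable after creation. */
        char  *db_data;
};

/**
 * Allocates a buffer.
 *
 * Caller context: allocates memory, so in kernel code an equivalent function
 * would be restricted to contexts that may sleep.
 *
 * \param size payload size in bytes; must be greater than 0
 *
 * \pre  size > 0
 * \post result == NULL || result->db_ref == 1
 *
 * \retval NULL     out of memory
 * \retval non-NULL new buffer holding one reference
 */
struct demo_buf *demo_buf_alloc(size_t size)
{
        struct demo_buf *buf;

        assert(size > 0);
        buf = malloc(sizeof(*buf));
        if (buf == NULL)
                return NULL;
        buf->db_data = calloc(1, size);
        if (buf->db_data == NULL) {
                free(buf);
                return NULL;
        }
        buf->db_ref = 1;
        buf->db_size = size;
        return buf;
}

/**
 * Takes an additional reference on \a buf.
 *
 * \param buf live buffer; the caller must already hold a reference
 *
 * \pre buf->db_ref > 0
 *
 * \see demo_buf_put()
 */
void demo_buf_get(struct demo_buf *buf)
{
        assert(buf->db_ref > 0);
        buf->db_ref++;
}

/**
 * Drops a reference on \a buf.
 *
 * Frees the buffer when the last reference is dropped.
 *
 * \param buf live buffer; must not be used by the caller after this call
 *
 * \pre buf->db_ref > 0
 *
 * \retval 0 other references remain
 * \retval 1 the last reference was dropped and \a buf was freed
 */
int demo_buf_put(struct demo_buf *buf)
{
        assert(buf->db_ref > 0);
        if (--buf->db_ref > 0)
                return 0;
        free(buf->db_data);
        free(buf);
        return 1;
}
```

The comment on the struct carries the life-cycle description, while each function comment states only the constraints specific to that call.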

Additional overview documentation for the subsystem is encouraged but is not a requirement.

Examples

Doxygen comments start with /**, as in Javadoc.

Procedures and Globals

Document procedures and globals in the .c files, rather than in headers.

/**
 * Owns a page by IO.
 *
 * Waits until \a pg is in cl_page_state::CPS_CACHED state, and then switches
 * it into cl_page_state::CPS_OWNED state.
 *
 * \param io IO context which wants to own the page
 * \param pg page to be owned
 *
 * \pre  !cl_page_is_owned(pg, io)
 * \post result == 0 iff cl_page_is_owned(pg, io)
 *
 * \retval 0   success
 *
 * \retval -ve failure, e.g., page was destroyed (and landed in
 *             cl_page_state::CPS_FREEING instead of cl_page_state::CPS_CACHED).
 *
 * \see cl_page_disown()
 * \see cl_page_operations::cpo_own()
 */
int cl_page_own(const struct lu_env *env, struct cl_io *io, struct cl_page *pg)

Notes:

  • Start with a brief description, which continues to the first '.' (period or full stop).
  • Follow the brief description with a detailed description.
  • Descriptions are written in the third person singular, e.g. "<this function> does this and that", "<this datatype> represents such and such a concept".
  • To refer to a function argument, use the \a argname syntax.
  • To refer to another function, use the funcname() syntax. This will produce a cross-reference.
  • To refer to a field or an enum value use the SCOPE::NAME syntax.
  • Describe possible return values with \retval.
  • Mention all concurrency control restrictions here (such as locks that the function expects to be held, or holds on exit).
  • If possible, specify a (weakest) pre-condition and (strongest) post-condition for the function. If conditions cannot be expressed as a C language expression, provide an informal description.
  • Enumerate related functions and datatypes in the \see section. Doxygen automatically cross-references every place where a given function is called (except calls through a function pointer) and every function that it calls, so there is no need to list these by hand.
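The same notes apply to globals. The following is a minimal sketch of a documented global in a .c file together with its accessors; demo_cache_limit, DEMO_CACHE_LIMIT_MAX and both functions are hypothetical names chosen for illustration and do not exist in Lustre.

```c
#include <errno.h>

/**
 * Upper bound on the number of pages cached per object (hypothetical tunable).
 *
 * Read it with demo_cache_limit_get() and change it with
 * demo_cache_limit_set(). Do not assign to it directly, because only the
 * setter enforces the valid range. This sketch has no locking, so callers
 * must serialise updates.
 *
 * \see demo_cache_limit_set()
 */
static unsigned int demo_cache_limit = 256;

/** Largest value accepted by demo_cache_limit_set(). */
#define DEMO_CACHE_LIMIT_MAX 65536u

/**
 * Returns the current cache limit.
 *
 * \return the value of demo_cache_limit, in pages
 */
unsigned int demo_cache_limit_get(void)
{
        return demo_cache_limit;
}

/**
 * Sets the cache limit.
 *
 * Updates demo_cache_limit to \a limit if \a limit is within the valid range.
 *
 * \param limit new limit, in pages; valid values are 1..DEMO_CACHE_LIMIT_MAX
 *
 * \post result != 0 || demo_cache_limit == limit
 *
 * \retval 0       success
 * \retval -EINVAL \a limit is out of range; demo_cache_limit is unchanged
 */
int demo_cache_limit_set(unsigned int limit)
{
        if (limit == 0 || limit > DEMO_CACHE_LIMIT_MAX)
                return -EINVAL;
        demo_cache_limit = limit;
        return 0;
}
```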

Datatype

Document datatypes where they are declared.

/**
 * "Compound" object, consisting of multiple layers.
 *
 * A compound object with a given fid is unique within a given lu_site.
 *
 * Note that an object does *not* necessarily correspond to a real object in
 * persistent storage: the object is an anchor for locking and method calling,
 * so it is created for things like a not-yet-existing child created by mkdir
 * or create calls. lu_object_operations::loo_exists() can be used to check
 * whether the object is backed by a persistent storage entity.
 */
struct lu_object_header {
        /**
         * Object flags from enum lu_object_header_flags. Set and checked
         * atomically.
         */
        unsigned long     loh_flags;
        /**
         * Object reference count. Protected by lu_site::ls_guard.
         */
        atomic_t          loh_ref;
        /**
         * Fid, uniquely identifying this object.
         */
        struct lu_fid     loh_fid;
        /**
         * Common object attributes, cached for efficiency. From enum
         * lu_object_header_attr.
         */
        __u32             loh_attr;
        /**
         * Linkage into per-site hash table. Protected by lu_site::ls_guard.
         */
        struct hlist_node loh_hash;
        /**
         * Linkage into per-site LRU list. Protected by lu_site::ls_guard.
         */
        struct list_head  loh_lru;
        /**
         * Linkage into list of layers. Never modified once set (except late
         * during object destruction). No locking is necessary.
         */
        struct list_head  loh_layers;
};

Describe datatype invariants (preferably formally).

/**
 * Fields are protected by the lock on cfs_page_t, except for atomics and
 * immutables.
 *
 * \invariant Datatype invariants are in cl_page_invariant(). Basically:
 * cl_page::cp_parent and cl_page::cp_child are a well-formed double-linked
 * list, consistent with the parent/child pointers in the cl_page::cp_obj and
 * cl_page::cp_owner (when set).
 */
struct cl_page {
        /** Reference counter. */
        atomic_t           cp_ref;

Describe concurrency control mechanisms for structure fields.

        /** An object this page is a part of. Immutable after creation. */
        struct cl_object  *cp_obj;
        /** Logical page index within the object. Immutable after creation. */
        pgoff_t            cp_index;
        /** List of slices. Immutable after creation. */
        struct list_head   cp_layers;
        ...
};

Specify when fields are valid.

        /**
         * Owning IO in cl_page_state::CPS_OWNED state. Sub-page can be owned
         * by sub-io.
         */
        struct cl_io      *cp_owner;
        /**
         * Owning IO request in cl_page_state::CPS_PAGEOUT and
         * cl_page_state::CPS_PAGEIN states. This field is maintained only in
         * the top-level pages.
         */
        struct cl_req     *cp_req;
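A formally stated invariant can be backed by a predicate function that the \invariant section refers to, in the way the cl_page comment refers to cl_page_invariant(). The following is a minimal sketch using a hypothetical demo_node type; it is not the Lustre implementation.

```c
#include <assert.h>
#include <stddef.h>

/**
 * Node of a parent/child chain (hypothetical type, for illustration only).
 *
 * \invariant demo_node_invariant() holds: demo_node::dn_parent and
 * demo_node::dn_child form a well-formed double-linked list.
 */
struct demo_node {
        /** Node above this one, or NULL for the top node. */
        struct demo_node *dn_parent;
        /** Node below this one, or NULL for the bottom node. */
        struct demo_node *dn_child;
};

/**
 * Checks the datatype invariant of \a node.
 *
 * Verifies that each neighbour of \a node points back at \a node.
 *
 * \param node node to check; must not be NULL
 *
 * \pre node != NULL
 *
 * \retval 1 the links around \a node are consistent
 * \retval 0 a neighbour does not point back at \a node
 */
int demo_node_invariant(const struct demo_node *node)
{
        assert(node != NULL);
        if (node->dn_parent != NULL && node->dn_parent->dn_child != node)
                return 0;
        if (node->dn_child != NULL && node->dn_child->dn_parent != node)
                return 0;
        return 1;
}
```

Such a predicate can then be used in assertions on entry to and exit from functions that modify the structure.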

You can use the @{...@} syntax to group together a subset of fields or enum values.

struct cl_object_header {
        /** Standard lu_object_header. cl_object::co_lu::lo_header points
         * here. */
        struct lu_object_header  coh_lu;
        /** \name locks
         * \todo XXX move locks below to the separate cache-lines, they are
         * mostly useless otherwise.
         */
        /** @{ */
        /** Lock protecting page tree. */
        spinlock_t               coh_page_guard;
        /** Lock protecting lock list. */
        spinlock_t               coh_lock_guard;
        /** @} locks */
        ...
};
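The same markers can be applied to enum values. The following is a minimal sketch, assuming doxygen handles \name and @{...@} inside an enum in the same way as inside a struct; the enum demo_page_state and the helper demo_page_in_transfer() are hypothetical.

```c
/** States of a cached page (hypothetical enum, for illustration only). */
enum demo_page_state {
        /** \name resident
         * States in which no transfer is in flight.
         */
        /** @{ */
        /** Page is cached and not owned by any IO. */
        DPS_CACHED,
        /** Page is owned by an IO. */
        DPS_OWNED,
        /** @} resident */

        /** \name transfer
         * States in which the page is being transferred.
         */
        /** @{ */
        /** Page is being written out. */
        DPS_PAGEOUT,
        /** Page is being read in. */
        DPS_PAGEIN,
        /** @} transfer */

        /** Number of states. Not a valid state itself. */
        DPS_NR
};

/**
 * Tells whether \a state belongs to the transfer group.
 *
 * \param state state to test; any value of enum demo_page_state
 *
 * \retval 1 \a state is demo_page_state::DPS_PAGEOUT or
 *           demo_page_state::DPS_PAGEIN
 * \retval 0 otherwise
 */
int demo_page_in_transfer(enum demo_page_state state)
{
        return state == DPS_PAGEOUT || state == DPS_PAGEIN;
}
```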

By default, a documenting comment goes immediately before the entity being commented. If it is necessary to place this comment separately (e.g. to streamline comments in the header file), use the following syntax.

/** \struct cl_page
 * Layered client page.
 *
 * cl_page: represents a portion of a file, cached in the memory. All pages
 *    of the given file are of the same size, and are kept in the radix tree

Subsystem Overview

To document a subsystem, add the following comment to the header file which contains the definitions of its key datatypes.

/** \defgroup component_name component_name
 *
 * overall module documentation
 * ...
 *
 * @{ 
 */
datatype definitions...
exported functions...
/** @} component_name */

To separate a logical part of a larger component, add the following somewhere within the component's \defgroup:

/**
 * \name subcomponent_name subcomponent_name
 *
 * Description of a sub-component
 */
/** @{ */
datatype definitions...
exported functions...
/** @} subcomponent_name */

If an exported function prototype in a header is located within some group, the corresponding function definition in a .c file is automatically assigned to the same group.
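A minimal sketch of this arrangement follows, with the header and the .c file shown in one listing for brevity. The group demo_counter, its struct and its function are hypothetical; the prototype sits inside the group in the header, and the detailed comment sits on the definition.

```c
#include <assert.h>
#include <stddef.h>

/* ---- demo_counter.h (hypothetical header) ---- */

/** \defgroup demo_counter demo_counter
 *
 * Saturating event counter.
 *
 * @{
 */
/** Counter that stops at demo_counter::dc_max. */
struct demo_counter {
        /** Current value. Never exceeds demo_counter::dc_max. */
        unsigned int dc_value;
        /** Saturation limit. Immutable after initialisation. */
        unsigned int dc_max;
};

unsigned int demo_counter_inc(struct demo_counter *ctr);
/** @} demo_counter */

/* ---- demo_counter.c (hypothetical source file) ---- */

/**
 * Increments a counter.
 *
 * Adds one to \a ctr unless it has already reached demo_counter::dc_max.
 * No locking is done here, so callers must serialise access to \a ctr.
 *
 * \param ctr counter to increment; must not be NULL
 *
 * \pre  ctr != NULL
 * \post ctr->dc_value <= ctr->dc_max
 *
 * \return the value of the counter after the call
 */
unsigned int demo_counter_inc(struct demo_counter *ctr)
{
        assert(ctr != NULL);
        if (ctr->dc_value < ctr->dc_max)
                ctr->dc_value++;
        return ctr->dc_value;
}
```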

A set of comments that is not lexically part of a group can be added to it with the \addtogroup command:

/** \addtogroup cl_object cl_object
 * @{ */
/**
 * "Data attributes" of cl_object. Data attributes can be updated
 * independently for a sub-object, and top-object's attributes are calculated
 * from sub-objects' ones.
 */
struct cl_attr {
        /** Object size, in bytes */
        loff_t cat_size;
        ...
};
...
/** @} cl_object */

Running Doxygen

Doxygen uses a template file to control the documentation build. Lustre comes with two templates:

  • build/doxyfile.ref: produces a short form of the documentation set, suitable as a reference. Output is placed into the apidoc.ref/ directory.
  • build/doxyfile.api: produces a full documentation set, better suited to learning the code structure. In addition to everything in the apidoc.ref/ version, this set includes call graphs, source code excerpts, and non-HTML forms of documentation (RTF, LaTeX, and troff). Output is placed into the apidoc.api/ directory.

To build documentation, in the top-level lustre directory run:

doxygen build/$TEMPLATE

Publishing

The build/apidoc.publish script publishes a local version of the documentation to http://wiki.lustre.org/apidoc:

build/apidoc.publish [-b branchname] [-l additional-label] [-d] [-u user]

The build/apidoc.publish script tries to guess the branch name by looking in CVS/Tag. The -d option instructs the script to use the current date as a label. Documentation is uploaded to

user@shell.lustre.sun.com:/home/www/apidoc/$branch$label

where $label is a concatenation of all labels given on the command line in order.

Doxygen References

Doxygen Home

Doxygen Manual

Doxygen Special Commands