WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Documenting Code: Difference between revisions

From Obsolete Lustre Wiki
Revision as of 06:51, 12 November 2008

Introduction

In addition to the architecture and design documentation, interface documentation is required for each Lustre subsystem to provide reference information on how to use each subsystem correctly. This documentation is embedded in the source code as stylised comments using doxygen to ensure it stays up to date as the source is developed and maintained.

The minimum requirement is to document the subsystem API, including every datatype, procedure and global designed to be used externally.

  • Procedures
    • What does it do
    • How to use it / how not to abuse it
    • What does it return
    • Description of parameters and their valid values
  • Globals
    • What it is for
    • How to use it / how not to abuse it
  • Datatypes (structs, typedefs, enums)
    • What it is for
    • How to use it / how not to abuse it
    • Description of each struct member

Additional overview documentation for the subsystem is encouraged but is not a requirement.

Examples

Doxygen comments start with /** (as in Javadoc).

Procedures and Globals

Document procedures and globals in the .c files, rather than headers.

/**
 * Owns a page by IO.
 *
 * Waits until \a pg is in cl_page_state::CPS_CACHED state, and then switches it
 * into cl_page_state::CPS_OWNED state.
 *
 * \param io IO context which wants to own the page
 * \param pg page to be owned
 *
 * \pre  !cl_page_is_owned(pg, io)
 * \post result == 0 iff cl_page_is_owned(pg, io)
 *
 * \retval 0   success
 *
 * \retval -ve failure, e.g., page was destroyed (and landed in
 *             cl_page_state::CPS_FREEING instead of cl_page_state::CPS_CACHED).
 *
 * \see cl_page_disown()
 * \see cl_page_operations::cpo_own()
 */
int cl_page_own(const struct lu_env *env, struct cl_io *io, struct cl_page *pg)

Note that:

  • It opens with a brief description which runs up to the first '.' (full stop).
  • The brief description is followed by the detailed description.
  • Descriptions are written in the third person singular, e.g. "<this function> does this and that", "<this datatype> represents such and such a concept".
  • To refer to a function argument use the \a argname syntax.
  • To refer to another function use the funcname() syntax. This will produce a cross-reference.
  • To refer to a field or an enum value use the SCOPE::NAME syntax.
  • Describe possible return values with \retval.
  • Mention all concurrency control restrictions (such as locks that the function expects to be held on entry, or still holds on exit) in the function comment.
  • If possible, specify a (weakest) precondition and (strongest) postcondition for the function. If the conditions cannot be expressed as C language expressions, provide an informal description.
  • Enumerate related functions and datatypes in the \see section. Note that doxygen automatically cross-references all places where a given function is called (but not calls through a function pointer) and all functions that it calls, so there is no need to enumerate these.
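The points above can be sketched on a self-contained example. Everything named demo_* below is hypothetical and exists only for illustration; it is not part of Lustre. The sketch assumes the usual doxygen autolink form ::name for referring to a global.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/**
 * Upper bound on the value of a demo counter.
 *
 * Hypothetical global, used only by demo_counter_add(). It is read-only
 * after initialisation, so no locking is needed to read it.
 */
const int demo_counter_max = 100;

/**
 * Adds a delta to a counter.
 *
 * Adds \a delta to the value pointed to by \a counter. The value is left
 * unchanged if the result would fall outside [0, ::demo_counter_max].
 *
 * The caller must serialise access to the counter; this function takes no
 * locks.
 *
 * \param counter counter to update, must not be NULL
 * \param delta   amount to add, may be negative
 *
 * \pre  counter != NULL
 * \post result != 0 implies that the counter is unchanged
 *
 * \retval 0       success
 * \retval -ERANGE the result would be out of range
 *
 * \see demo_counter_max
 */
int demo_counter_add(int *counter, int delta)
{
        long long value;

        assert(counter != NULL);
        /* Widen before adding so that the sum cannot overflow an int. */
        value = (long long)*counter + delta;
        if (value < 0 || value > demo_counter_max)
                return -ERANGE;
        *counter = (int)value;
        return 0;
}
```

The brief description ends at the first full stop, so "Adds a delta to a counter." is what would appear in summary listings.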

Datatype

Document datatypes where they are declared.

/**
 * "Compound" object, consisting of multiple layers.
 *
 * Compound object with given fid is unique with given lu_site.
 *
 * Note that an object does *not* necessarily correspond to the real object in the
 * persistent storage: object is an anchor for locking and method calling, so
 * it is created for things like not-yet-existing child created by mkdir or
 * create calls. lu_object_operations::loo_exists() can be used to check
 * whether object is backed by persistent storage entity.
 */
struct lu_object_header {
        /**
         * Object flags from enum lu_object_header_flags. Set and checked
         * atomically.
         */
        unsigned long     loh_flags;
        /**
         * Object reference count. Protected by lu_site::ls_guard.
         */
        atomic_t          loh_ref;
        /**
         * Fid, uniquely identifying this object.
         */
        struct lu_fid     loh_fid;
        /**
         * Common object attributes, cached for efficiency. From enum
         * lu_object_header_attr.
         */
        __u32             loh_attr;
        /**
         * Linkage into per-site hash table. Protected by lu_site::ls_guard.
         */
        struct hlist_node loh_hash;
        /**
         * Linkage into per-site LRU list. Protected by lu_site::ls_guard.
         */
        struct list_head  loh_lru;
        /**
         * Linkage into list of layers. Never modified once set (except lately
         * during object destruction). No locking is necessary.
         */
        struct list_head  loh_layers;
};

Describe datatype invariants (preferably formally).

/**
 * Fields are protected by the lock on cfs_page_t, except for atomics and
 * immutables.
 *
 * \invariant Datatype invariants are in cl_page_invariant(). Basically:
 * cl_page::cp_parent and cl_page::cp_child are a well-formed double-linked
 * list, consistent with the parent/child pointers in the cl_page::cp_obj and
 * cl_page::cp_owner (when set).
 */
struct cl_page {
        /** Reference counter. */
        atomic_t           cp_ref;

Describe concurrency control mechanisms for structure fields.

        /** An object this page is a part of. Immutable after creation. */
        struct cl_object  *cp_obj;
        /** Logical page index within the object. Immutable after creation. */
        pgoff_t            cp_index;
        /** List of slices. Immutable after creation. */
        struct list_head   cp_layers;
        ...
};

Specify when fields are valid.

        /**
         * Owning IO in cl_page_state::CPS_OWNED state. Sub-page can be owned
         * by sub-io.
         */
        struct cl_io      *cp_owner;
        /**
         * Owning IO request in cl_page_state::CPS_PAGEOUT and
         * cl_page_state::CPS_PAGEIN states. This field is maintained only in
         * the top-level pages.
         */
        struct cl_req     *cp_req;
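The three recommendations above (invariants, concurrency control, field validity) can be combined in one datatype comment. The following is a minimal sketch on a hypothetical demo_queue type, not Lustre code; the invariant is stated informally in the comment and formally in a checking function, in the same spirit as cl_page_invariant().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/**
 * Fixed-capacity byte queue (hypothetical example type).
 *
 * Fields are protected by a lock owned by the caller, except for immutables.
 *
 * \invariant Checked by demo_queue_invariant(). Basically:
 * demo_queue::dq_used <= demo_queue::dq_capacity and
 * demo_queue::dq_head < demo_queue::dq_capacity.
 */
struct demo_queue {
        /** Number of slots in demo_queue::dq_buf. Immutable after creation. */
        size_t         dq_capacity;
        /** Number of occupied slots. Protected by the caller's lock. */
        size_t         dq_used;
        /**
         * Index of the oldest occupied slot. Meaningful only while
         * demo_queue::dq_used > 0.
         */
        size_t         dq_head;
        /** Backing storage, owned by the caller. Immutable after creation. */
        unsigned char *dq_buf;
};

/**
 * Checks the demo_queue invariant.
 *
 * \param q queue to check, must not be NULL
 *
 * \retval true  the invariant holds
 * \retval false the invariant is violated
 */
bool demo_queue_invariant(const struct demo_queue *q)
{
        return q->dq_capacity > 0 && q->dq_buf != NULL &&
               q->dq_used <= q->dq_capacity && q->dq_head < q->dq_capacity;
}

/**
 * Returns the number of free slots.
 *
 * \param q queue to inspect, must not be NULL
 *
 * \pre demo_queue_invariant(q)
 *
 * \return number of slots that can still be filled
 */
size_t demo_queue_free(const struct demo_queue *q)
{
        assert(demo_queue_invariant(q));
        return q->dq_capacity - q->dq_used;
}
```

Keeping the formal invariant in a function lets other comments refer to it in \pre and \post clauses instead of restating it.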

You can use @{...@} syntax to define a subset of fields or enum values which should be grouped together.

struct cl_object_header {
        /** Standard lu_object_header. cl_object::co_lu::lo_header points
         * here. */
        struct lu_object_header  coh_lu;
        /** \name locks
         * \todo XXX move locks below to the separate cache-lines, they are
         * mostly useless otherwise.
         */
        /** @{ */
        /** Lock protecting page tree. */
        spinlock_t               coh_page_guard;
        /** Lock protecting lock list. */
        spinlock_t               coh_lock_guard;
        /** @} locks */
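Since the excerpt above stops before the end of the structure, here is a complete, compilable sketch of the same member-group markup. The demo_stats type and its function are hypothetical and serve only to show where the \name comment and the @{ and @} markers go.

```c
#include <assert.h>
#include <stddef.h>

/** Per-device counters (hypothetical example type). */
struct demo_stats {
        /** Device index. Immutable after creation. */
        int           ds_index;
        /** \name errors
         * Error counters, reset together by demo_stats_clear_errors().
         */
        /** @{ */
        /** Number of failed reads. */
        unsigned long ds_read_errors;
        /** Number of failed writes. */
        unsigned long ds_write_errors;
        /** @} errors */
};

/**
 * Resets the error counters.
 *
 * Zeroes every field in the "errors" member group of \a st and leaves
 * demo_stats::ds_index untouched.
 *
 * \param st counters to reset, must not be NULL
 */
void demo_stats_clear_errors(struct demo_stats *st)
{
        assert(st != NULL);
        st->ds_read_errors = 0;
        st->ds_write_errors = 0;
}
```

Repeating the group name after the closing marker, as in /** @} errors */, is only a convention for the human reader; it makes long groups easier to follow.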

By default a documenting comment goes immediately before the entity being commented. If it is necessary to place this comment separately (e.g. to streamline comments in the header file), use the following syntax.

/** \struct cl_page
 * Layered client page.
 *
 * cl_page: represents a portion of a file, cached in the memory. All pages
 *    of the given file are of the same size, and are kept in the radix tree
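A complete, minimal sketch of a detached comment follows. The demo_extent type is hypothetical; the point is only that the \struct command names the entity being documented, so the comment does not have to sit directly in front of the definition.

```c
#include <assert.h>
#include <stddef.h>

/** \struct demo_extent
 * Byte range within a file (hypothetical example type).
 *
 * The range is half-open: it covers offsets from demo_extent::de_start up to,
 * but not including, demo_extent::de_end.
 */

/* Unrelated declarations may appear between the comment and the struct. */

struct demo_extent {
        /** First byte of the range. */
        size_t de_start;
        /** First byte past the end of the range. */
        size_t de_end;
};

/**
 * Returns the length of an extent.
 *
 * \param ext extent to measure, must not be NULL
 *
 * \pre ext->de_start <= ext->de_end
 *
 * \return length of \a ext in bytes
 */
size_t demo_extent_len(const struct demo_extent *ext)
{
        assert(ext != NULL && ext->de_start <= ext->de_end);
        return ext->de_end - ext->de_start;
}
```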

Subsystem Overview

To document a subsystem, add the following comment to the header file which contains the definitions of its key datatypes.

/** \defgroup component_name component_name
 *
 * overall module documentation
 * ...
 *
 * @{ 
 */
datatype definitions...
exported functions...
/** @} component_name */

To separate a logical part of a larger component, add the following somewhere within the component's \defgroup:

/**
 * \name subcomponent_name subcomponent_name
 *
 * Description of a sub-component
 */
/** @{ */
datatype definitions...
exported functions...
/** @} subcomponent_name */
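The two templates above can be combined as in the following sketch. The demo_stack component is hypothetical and deliberately tiny; in a real subsystem the group would live in a header and the function bodies in a .c file.

```c
#include <assert.h>
#include <stddef.h>

/** \defgroup demo_stack demo_stack
 *
 * Hypothetical fixed-size stack of integers, used only to illustrate group
 * markup.
 *
 * @{
 */

/** Maximum number of elements in a demo_stack. */
#define DEMO_STACK_MAX 8

/** Stack state. Not thread-safe, so the caller serialises access. */
struct demo_stack {
        /** Number of valid entries in demo_stack::st_items. */
        size_t st_depth;
        /** Storage. Entries at or above demo_stack::st_depth are undefined. */
        int    st_items[DEMO_STACK_MAX];
};

/**
 * \name modifiers
 *
 * Functions that change the stack.
 */
/** @{ */
/**
 * Pushes a value.
 *
 * \param s     stack to modify, must not be NULL
 * \param value value to push
 *
 * \retval 0  success
 * \retval -1 the stack is full
 */
int demo_stack_push(struct demo_stack *s, int value)
{
        assert(s != NULL);
        if (s->st_depth == DEMO_STACK_MAX)
                return -1;
        s->st_items[s->st_depth++] = value;
        return 0;
}

/**
 * Pops the most recently pushed value.
 *
 * \param s     stack to modify, must not be NULL
 * \param value where to store the popped value, must not be NULL
 *
 * \retval 0  success, \a value has been filled in
 * \retval -1 the stack is empty, \a value is untouched
 */
int demo_stack_pop(struct demo_stack *s, int *value)
{
        assert(s != NULL && value != NULL);
        if (s->st_depth == 0)
                return -1;
        *value = s->st_items[--s->st_depth];
        return 0;
}
/** @} modifiers */

/** @} demo_stack */
```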

If an exported function prototype in a header is located within some group, the appropriate function definition in a .c file is automatically assigned to the same group.

A set of comments which is not lexically part of a group can be included in it with the \addtogroup command:

/** \addtogroup cl_object cl_object
 * @{ */
/**
 * "Data attributes" of cl_object. Data attributes can be updated
 * independently for a sub-object, and top-object's attributes are calculated
 * from sub-objects' ones.
 */
struct cl_attr {
        /** Object size, in bytes */
        loff_t cat_size;
        ...
};
...
/** @} cl_object */
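As a self-contained illustration, the sketch below defines a group once and then adds to it from a second, lexically separate block. The demo_clock names are hypothetical, and both blocks are shown in one listing only so that the example compiles on its own; normally they would be in different files.

```c
#include <assert.h>
#include <stddef.h>

/** \defgroup demo_clock demo_clock
 *
 * Hypothetical tick counter, used only to illustrate \\addtogroup.
 *
 * @{
 */
/** Tick counter state. */
struct demo_clock {
        /** Ticks elapsed since creation. */
        unsigned long dc_ticks;
};
/** @} demo_clock */

/* ... declarations that do not belong to the group ... */

/** \addtogroup demo_clock
 * @{ */
/**
 * Advances the clock by one tick.
 *
 * \param clk clock to advance, must not be NULL
 *
 * \return the new value of demo_clock::dc_ticks
 */
unsigned long demo_clock_tick(struct demo_clock *clk)
{
        assert(clk != NULL);
        return ++clk->dc_ticks;
}
/** @} demo_clock */
```

In the generated output, demo_clock_tick() should then be listed on the same group page as struct demo_clock.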

Running Doxygen

Doxygen uses a template file to control the documentation build. Lustre comes with two templates:

  • build/doxyfile.ref: produces a short form of the documentation set, suitable as a reference. Output is placed into the apidoc.ref/ directory.
  • build/doxyfile.api: produces the full documentation set, more suitable for learning the code structure. In addition to the apidoc.ref/ version this includes call-graphs, source code excerpts, and non-html forms of documentation (rtf, latex, and troff). Output is placed into the apidoc.api/ directory.

To build documentation, run

doxygen build/$TEMPLATE

in the top-level lustre directory, where $TEMPLATE is one of the two template files listed above (doxyfile.ref or doxyfile.api).

Publishing

The build/apidoc.publish script publishes your local version of the documentation to http://wiki.lustre.org/apidoc:

build/apidoc.publish [-b branchname] [-l additional-label] [-d] [-u user]

build/apidoc.publish tries to guess branchname by looking into CVS/Tag. -d instructs the script to use the current date as a label. The documentation will be uploaded to

user@shell.lustre.sun.com:/home/www/apidoc/$branch$label

where $label is a concatenation of all labels given on the command line in order.

Doxygen References

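Doxygen Home (http://www.doxygen.org/)

Manual (http://www.doxygen.org/manual.html)

Special Comment Tags (http://www.doxygen.org/commands.html)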