Lustre Debugging for Developers

Intro...

Adding Debugging to the Source Code
The debug infrastructure provides a number of macros that can be used in Lustre™ source code to aid in debugging or reporting serious errors.

To use these macros, set the DEBUG_SUBSYSTEM variable at the top of the file so that debug messages from that file are attributed to the correct subsystem, as shown below:


 #define DEBUG_SUBSYSTEM S_PORTALS

A list of available macros with descriptions is provided in Section 23.2.8: Adding Debugging to the Lustre Source Code in the Lustre Operations Manual.
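The following is a minimal, self-contained sketch of how a per-file DEBUG_SUBSYSTEM definition can tag every message emitted from that file. It is not the Lustre implementation: the numeric values of S_PORTALS, D_INFO and D_ERROR, the demo_* names, and the stand-in definitions of CDEBUG and CERROR are assumptions made for illustration. Unlike the real macros, the stand-ins return a value so the effect of the mask is easy to observe.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-in values for this sketch only; the real masks come from the Lustre headers. */
#define S_PORTALS 0x00000001u
#define D_INFO    0x00000040u
#define D_ERROR   0x00020000u

/* Set once at the top of the file, before any debug macro is used. */
#define DEBUG_SUBSYSTEM S_PORTALS

static unsigned int demo_debug_mask = D_ERROR; /* message types currently enabled */
static unsigned int demo_last_subsys;          /* subsystem tag of the last emitted message */
static char demo_last_msg[256];                /* text of the last emitted message */

/* Hypothetical helper: formats and emits one message if its type is enabled.
 * Returns 1 if the message was emitted, 0 if it was filtered out. */
static int demo_debug_msg(unsigned int subsys, unsigned int mask,
                          const char *file, int line, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (!(mask & demo_debug_mask))
        return 0; /* this message type is disabled */

    /* Prefix: subsystem, message type, source location. */
    n = snprintf(demo_last_msg, sizeof(demo_last_msg), "%08x:%08x:%s:%d: ",
                 subsys, mask, file, line);
    if (n < 0)
        n = 0;
    if ((size_t)n >= sizeof(demo_last_msg))
        n = (int)sizeof(demo_last_msg) - 1;

    va_start(ap, fmt);
    vsnprintf(demo_last_msg + n, sizeof(demo_last_msg) - (size_t)n, fmt, ap);
    va_end(ap);

    demo_last_subsys = subsys;
    fprintf(stderr, "%s\n", demo_last_msg);
    return 1;
}

/* Stand-ins that pick up DEBUG_SUBSYSTEM from the enclosing file. */
#define CDEBUG(mask, ...) \
    demo_debug_msg(DEBUG_SUBSYSTEM, (mask), __FILE__, __LINE__, __VA_ARGS__)
#define CERROR(...) CDEBUG(D_ERROR, __VA_ARGS__)
```

In this sketch, a call such as CERROR("bad value %d", 7) is emitted with the S_PORTALS tag, while CDEBUG(D_INFO, ...) is dropped until D_INFO is added to demo_debug_mask.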

Ptlrpc Request History
Each service maintains a request history, which can be useful for first-occurrence troubleshooting.

Is ptlrpc an acronym?

PTLRPC: An RPC protocol layered on LNET. This protocol deals with stateful servers, has exactly-once semantics, and has built-in support for recovery.

ptlrpc is listed as a subsystem in the Lustre Debug Messages section.

For more information about how to use ptlrpc, see Section 23.5: Ptlrpc Request History in the Lustre Operations Manual.

Using lightweight tracing (LWT) for debugging
Lustre offers a lightweight tracing facility called LWT that can be useful for debugging difficult problems. It prints fixed-size requests into a buffer and is much faster than LDEBUG.

LWT trace-based records that are dumped contain:
 * Current CPU
 * Process counter
 * Pointer to file
 * Pointer to line in the file
 * Four void * pointers

An lctl command dumps the logs to files.
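To make the record layout above concrete, here is a rough sketch of a fixed-size trace record written into a ring buffer and later dumped oldest first. It is not the LWT implementation: all demo_* names and the buffer capacity are hypothetical. Two parts are modeled on assumptions: the counter is a simple sequence number, and the line is stored as an integer.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_LWT_EVENTS 8 /* ring capacity; hypothetical */

/* One fixed-size record, mirroring the fields listed above. */
struct demo_lwt_event {
    int           cpu;     /* CPU the event was recorded on */
    unsigned long counter; /* modeled as a sequence number (assumption) */
    const char   *file;    /* pointer to the source file name */
    int           line;    /* line in that file */
    void         *p[4];    /* four caller-supplied pointers */
};

struct demo_lwt_ring {
    struct demo_lwt_event ev[DEMO_LWT_EVENTS];
    unsigned long         next; /* total number of events recorded so far */
};

/* Recording only copies a few words into a preallocated slot,
 * which is why this style of tracing is cheap. */
static void demo_lwt_record(struct demo_lwt_ring *r, int cpu,
                            const char *file, int line,
                            void *p1, void *p2, void *p3, void *p4)
{
    struct demo_lwt_event *e = &r->ev[r->next % DEMO_LWT_EVENTS];

    e->cpu = cpu;
    e->counter = r->next;
    e->file = file;
    e->line = line;
    e->p[0] = p1;
    e->p[1] = p2;
    e->p[2] = p3;
    e->p[3] = p4;
    r->next++;
}

#define DEMO_LWT_EVENT(r, cpu, p1, p2, p3, p4) \
    demo_lwt_record((r), (cpu), __FILE__, __LINE__, (p1), (p2), (p3), (p4))

/* Dump the surviving records, oldest first; returns how many were written. */
static size_t demo_lwt_dump(const struct demo_lwt_ring *r, FILE *out)
{
    unsigned long start = r->next > DEMO_LWT_EVENTS ? r->next - DEMO_LWT_EVENTS : 0;
    unsigned long i;
    size_t n = 0;

    for (i = start; i < r->next; i++) {
        const struct demo_lwt_event *e = &r->ev[i % DEMO_LWT_EVENTS];

        fprintf(out, "cpu=%d seq=%lu %s:%d %p %p %p %p\n",
                e->cpu, e->counter, e->file, e->line,
                e->p[0], e->p[1], e->p[2], e->p[3]);
        n++;
    }
    return n;
}
```

Because the buffer has a fixed capacity, older records are overwritten once it fills, so a dump in this sketch shows only the most recent DEMO_LWT_EVENTS events.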

How is this facility used?

Finding memory leaks using leak_finder.pl

Memory leaks can occur in code when memory has been allocated and then not freed once it is no longer required. The leak_finder.pl program provides a way to find memory leaks.

For details, see Section 23.2.4: Finding Memory Leaks in the Lustre Operations Manual.
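The general idea behind this kind of leak detection is to pair every allocation with a later free and report what is left over. The sketch below illustrates that pairing in C. It does not reproduce how leak_finder.pl works internally: the demo_* names and the fixed-size table are hypothetical and chosen only to keep the example short.

```c
#include <stddef.h>

#define DEMO_MAX_ALLOCS 64 /* table capacity; hypothetical */

struct demo_alloc {
    const void *addr; /* address returned by the allocator */
    size_t      size; /* number of bytes allocated */
    int         live; /* 1 while allocated and not yet freed */
};

struct demo_leak_tracker {
    struct demo_alloc slots[DEMO_MAX_ALLOCS];
};

/* Record an allocation. Returns 0 on success, -1 if the table is full. */
static int demo_track_alloc(struct demo_leak_tracker *t, const void *addr, size_t size)
{
    size_t i;

    for (i = 0; i < DEMO_MAX_ALLOCS; i++) {
        if (!t->slots[i].live) {
            t->slots[i].addr = addr;
            t->slots[i].size = size;
            t->slots[i].live = 1;
            return 0;
        }
    }
    return -1;
}

/* Record a free. Returns 0 if it matched a live allocation,
 * -1 if the address was never allocated or was already freed. */
static int demo_track_free(struct demo_leak_tracker *t, const void *addr)
{
    size_t i;

    for (i = 0; i < DEMO_MAX_ALLOCS; i++) {
        if (t->slots[i].live && t->slots[i].addr == addr) {
            t->slots[i].live = 0;
            return 0;
        }
    }
    return -1;
}

/* Sum the bytes still live; stores the number of leaked allocations in *count. */
static size_t demo_leaked_bytes(const struct demo_leak_tracker *t, size_t *count)
{
    size_t i, bytes = 0;

    *count = 0;
    for (i = 0; i < DEMO_MAX_ALLOCS; i++) {
        if (t->slots[i].live) {
            bytes += t->slots[i].size;
            (*count)++;
        }
    }
    return bytes;
}
```

Any entry still marked live after all events have been processed is a candidate leak. A free that matches no live allocation is reported separately, since it may indicate a double free or an incomplete log.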