WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Difference between revisions of "Diagnostic and Debugging Tools"


Revision as of 17:17, 10 January 2010

Tools Available for Debugging & Analysis

There are several diagnostic tools available to debug Lustre; some are provided by the operating system, while others were developed and made available by the Lustre project.

Debugging tools provided as part of Lustre

In-kernel debug mechanisms

  • debug logs: When a kernel module is first inserted, a circular debug buffer is allocated to hold a substantial amount of debugging information (megabytes or more). When the buffer fills up, it wraps around and discards the oldest information. We have added debug messages specifically for Lustre; they can be written out to this kernel log.
  • debug daemon: The debug daemon provides a facility for logging CDEBUG messages to a file, so that logging is not limited by the size of the debug buffer.
  • /proc/sys/lnet/debug: Contains a mask that limits which debugging information is written to the kernel debug logs.
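
The combined effect of the circular buffer and the debug mask can be illustrated with a small model. This is only a sketch in Python, not Lustre code: the flag names and bit values below are hypothetical and chosen for illustration, and the real buffer lives in the kernel.

```python
from collections import deque

# Hypothetical flag bits for illustration only; these are not Lustre's real values.
D_TRACE = 0x1
D_INODE = 0x2
D_ERROR = 0x4


class DebugLog:
    """Toy model of a masked, fixed-size circular debug buffer."""

    def __init__(self, capacity, mask):
        # deque(maxlen=...) drops the oldest entry once the buffer is full.
        self.buffer = deque(maxlen=capacity)
        self.mask = mask

    def log(self, flag, message):
        # Record the message only if its flag bit is enabled in the mask.
        if flag & self.mask:
            self.buffer.append(message)

    def dump(self):
        # Return the buffered messages, oldest first.
        return list(self.buffer)


log = DebugLog(capacity=3, mask=D_TRACE | D_ERROR)
log.log(D_TRACE, "enter open")
log.log(D_INODE, "inode lookup")   # filtered out: D_INODE is not in the mask
log.log(D_ERROR, "open failed")
log.log(D_TRACE, "leave open")
log.log(D_TRACE, "enter close")    # buffer is full, so "enter open" is discarded
print(log.dump())
```

A narrow mask keeps uninteresting messages from pushing older, relevant ones out of the buffer, which is the main reason to set the mask before reproducing a problem.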

Other internal tools

  • lctl: This tool is made available by Lustre. It is useful for filtering the kernel debug log and extracting relevant information.
  • Lustre subsystem asserts: When an assertion fails, a log is written out to /tmp/lustre_log.<timestamp>.
  • lfs: A Lustre utility that can be used to read a Lustre file's extended attributes (among other things).
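
After an assertion failure it can be handy to locate the most recent log dumps. The following Python sketch lists files matching the /tmp/lustre_log.<timestamp> naming mentioned above, newest first. The helper name find_assert_logs is hypothetical, and sorting by modification time is an assumption made here rather than anything Lustre prescribes.

```python
import glob
import os


def find_assert_logs(directory="/tmp"):
    """Return lustre_log.* files in `directory`, newest first."""
    pattern = os.path.join(directory, "lustre_log.*")
    # Sort by modification time so the latest dump comes first.
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)


for path in find_assert_logs():
    print(path)
```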

External debugging tools for administrators and developers

  • strace: Allows Lustre users to trace the system calls made by a process.
  • /var/log/messages: The file to which syslogd writes fatal or serious messages.
  • Crash dumps: On some kernels, sysrq "c" is enabled and produces a crash dump. Lustre enhances this crash dump with a log dump (the last 64K of log) to the console.
  • debugfs: An interactive Ext2 filesystem debugger.
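
Whether sysrq "c" is available depends on the kernel's SysRq setting. The Python sketch below interprets the value in /proc/sys/kernel/sysrq under the assumption, taken from current mainline kernels, that 0 disables SysRq, 1 enables every key, and any other value is a bitmask in which bit 0x8 covers the dump-related keys. Older or vendor kernels may behave differently, so treat the result as a hint only.

```python
import os

SYSRQ_PATH = "/proc/sys/kernel/sysrq"
SYSRQ_ENABLE_DUMP = 0x8  # assumed bit for dump-related SysRq keys


def sysrq_crash_allowed(value):
    """Interpret a sysrq setting: 0 = off, 1 = all keys, otherwise a bitmask."""
    if value == 1:
        return True
    return bool(value & SYSRQ_ENABLE_DUMP)


# Only report when the file exists (that is, on a Linux system).
if os.path.exists(SYSRQ_PATH):
    with open(SYSRQ_PATH) as f:
        print("sysrq 'c' allowed:", sysrq_crash_allowed(int(f.read().strip())))
```

The sketch only reads the setting. It deliberately does not trigger a crash.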

Logging and data collection tools

These logging and data collection tools can be used to collect information for debugging Lustre kernel issues.

kdump

kdump is a Linux kernel crash dump utility useful for debugging a system running Red Hat Enterprise Linux. For more information about kdump, see the Red Hat knowledge base article "How do I configure kexec/kdump on Red Hat Enterprise Linux 5?" (http://kbase.redhat.com/faq/docs/DOC-6039). To download kdump, go to the Fedora Project Download site (http://fedoraproject.org/wiki/SystemConfig/kdump#Download).

netdump

netdump is a crash dump utility from Red Hat that allows memory images to be dumped over a network to a central server for analysis. It is now obsolete and has been replaced by kdump.

netconsole

netconsole supports kernel-level network logging over UDP. A system request (SysRq) allows users to collect relevant data through netconsole. For more information, see Netconsole.
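
Because netconsole sends plain UDP datagrams, the receiving side can be very simple. The Python sketch below is a minimal collector, not a replacement for a proper syslog server. The function names are hypothetical, and port 6666 is assumed here as the commonly used netconsole target port; use whatever port the sending host is configured with.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket on which netconsole datagrams can be received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, count, timeout=5.0):
    """Collect up to `count` datagrams, decoded as text."""
    sock.settimeout(timeout)
    messages = []
    for _ in range(count):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            # Stop early if nothing arrives within the timeout.
            break
        messages.append(data.decode("utf-8", errors="replace"))
    return messages
```

A caller would typically run open_listener() once on the central server and then call receive_messages() in a loop, appending the results to a file.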

lctl

When run with the debug_kernel option, lctl dumps the Lustre debug log.
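
When the dump needs to be collected from a script, the lctl call can be wrapped as in the Python sketch below. The helper names and the output path are hypothetical. The sketch assumes that lctl is on the PATH and that debug_kernel accepts an output file name as its argument; it returns False instead of failing when lctl is not installed.

```python
import shutil
import subprocess


def debug_kernel_command(output_path):
    """Build the lctl command line that dumps the debug log to `output_path`."""
    return ["lctl", "debug_kernel", output_path]


def dump_debug_log(output_path):
    """Run lctl if it is installed; return True when the dump succeeded."""
    if shutil.which("lctl") is None:
        return False
    result = subprocess.run(debug_kernel_command(output_path))
    return result.returncode == 0
```

For example, dump_debug_log("/tmp/lustre_debug.txt") would write the log to a file of that (illustrative) name on a node where Lustre is installed.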

Additional external debugging and analysis tools for developers

  • leak_finder.pl: An extremely useful program that helps locate memory leaks in the code.

The tools described below may be useful for debugging Lustre™ in a development environment.

Virtual Machines

A virtual machine is often used to create an isolated development and test environment.

VirtualBox

VirtualBox Open Source Edition provides enterprise-class virtualization capability for all major platforms and is available free of charge from Sun Microsystems at Get Sun Virtual Box (http://www.sun.com/software/products/virtualbox/get.jsp?intcmp=2945).

VMware Server

The VMware Server virtualization platform is available as free introductory software at Download VMware Server (http://downloads.vmware.com/d/info/datacenter_downloads/vmware_server/2_0).

Xen

Xen is a para-virtualized environment with virtualization capabilities similar to VMware Server and VirtualBox. However, Xen allows the use of modified kernels to provide near-native performance and the ability to emulate shared storage. For more information, see Using Xen with Lustre and xen.org.

Debuggers and Analysis Tools

kgdb

kgdb is a source-level kernel debugger that allows remote debugging using conman.

kgdb provides a set of hooks in the Linux kernel that allow gdb to attach from another machine over a serial console. We provide kgdb patches for some kernels, such as rhel4, along with the Lustre patches (they are not applied by default).

For more information, see KGDB and Using kgdb with UDP.

Also see "Chapter 6. Running Programs Under gdb" (http://www.linuxtopia.org/online_books/redhat_linux_debugging_with_gdb/running.html) in the Red Hat Linux 4 Debugging with GDB guide.

lcrash

lcrash is a generic Linux crash dump analyzer.

crash

crash is used to analyze saved crash dump data.

Enter:

crash vmlinux crash_dump

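For more information about using crash to analyze crash dump output, see:

  • Red Hat Magazine article A quick overview of Linux kernel crash dump analysis (http://magazine.redhat.com/2007/08/15/a-quick-overview-of-linux-kernel-crash-dump-analysis/).
  • Crash Usage: A Case Study (http://people.redhat.com/anderson/crash_whitepaper/#EXAMPLES) from the white paper Red Hat Crash Utility by David Anderson.
  • Kernel Trap forum entry Linux: Kernel Crash Dumps (http://kerneltrap.org/node/5758).
  • White paper A Quick Overview of Linux Kernel Crash Dump Analysis (http://www.kernel.sg/papers/crash-dump-analysis.pdf).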