<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://wiki.old.lustre.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Wangdi</id>
	<title>Obsolete Lustre Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="http://wiki.old.lustre.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Wangdi"/>
	<link rel="alternate" type="text/html" href="http://wiki.old.lustre.org/index.php?title=Special:Contributions/Wangdi"/>
	<updated>2026-04-15T02:10:23Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.7</generator>
	<entry>
		<id>http://wiki.old.lustre.org/index.php?title=Architecture_-_Profiling_Tools_for_IO&amp;diff=9747</id>
		<title>Architecture - Profiling Tools for IO</title>
		<link rel="alternate" type="text/html" href="http://wiki.old.lustre.org/index.php?title=Architecture_-_Profiling_Tools_for_IO&amp;diff=9747"/>
		<updated>2007-09-24T20:47:44Z</updated>

		<summary type="html">&lt;p&gt;Wangdi: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Definitions ==&lt;br /&gt;
Ganglia - a distributed monitoring system (http://ganglia.info/)&lt;br /&gt;
&lt;br /&gt;
== Background ==&lt;br /&gt;
The profiling tool should be part of LRE, and it will also be used at ORNL to profile the I/O status of their XT4/XT3 cluster. We decided to implement the whole profiling system based on Ganglia.&lt;br /&gt;
&lt;br /&gt;
== Use cases ==&lt;br /&gt;
{|border=1  cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|Collect profile information|| Performance &amp;amp; Usability || Collect I/O and other stats from the servers or all nodes.&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|Analyse profile information || Usability || Generate graphs from the collected profiling information.&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|Output profile information || Usability || Present these graphs to the end user.&lt;br /&gt;
|}&lt;br /&gt;
=== Collect profile information ===&lt;br /&gt;
Stats collection&lt;br /&gt;
{|border=1  cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|colspan=2|&#039;&#039;&#039;Scenario:&#039;&#039;&#039; || Collecting stats from the servers and clients. &lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|colspan=2|&#039;&#039;&#039;goals&#039;&#039;&#039; || Overhead &amp;amp; Usability&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|rowspan=&amp;quot;7&amp;quot; writing-mode=&amp;quot;vertical&amp;quot;|&#039;&#039;&#039;details&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;OST_Req_Handle_Info&#039;&#039;&#039; || req_qdepth, req_active, req_waittime /proc/fs/lustre/ost/OSS/ost_io/stats (server load)&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;OST_Read/Write_Info&#039;&#039;&#039; || ost read/write count from each client /proc/fs/lustre/ost/OSS/ost_io/req_history&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Read/Write_Req_Info&#039;&#039;&#039; || req detail information (percentage of 1 MB RPCs) /proc/fs/lustre/obdfilter/lustre-OST0001/brw_stats&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Client_Cache_Avaiblity&#039;&#039;&#039; || client cache stats information/proc/fs/lustre/obdfilter/lustre-OSTXXXX/exports/NID@nettype/UUID/cur_grant(dirty)_bytes&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Client_RPC_Frequency&#039;&#039;&#039; || Client RPC stats /proc/fs/lustre/osc(mdc)/stats&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;MDS_OPS_Stats&#039;&#039;&#039; || MDS stats ops /proc/fs/lustre/mds/stats&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Ldlm_Stats&#039;&#039;&#039; || /proc/fs/lustre/ldlm/services/ldlm_cbd/stats&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|colspan=2|&#039;&#039;&#039;Implementation constraints&#039;&#039;&#039; || Stats are collected by the Ganglia monitor daemon&lt;br /&gt;
|}&lt;br /&gt;
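The stats files listed above are plain-text counters. A minimal sketch of the parsing step a Ganglia metric collector might perform, assuming the usual "name count samples [unit] min max sum" line layout (the exact fields vary by Lustre version, so treat this as illustrative, not as the implementation):

```python
# Minimal sketch of a stats parser a Ganglia metric module might use.
# SAMPLE_STATS mimics /proc/fs/lustre/ost/OSS/ost_io/stats; the exact
# columns vary by Lustre version, so this layout is an assumption.

SAMPLE_STATS = """\
snapshot_time             1190666864.223566 secs.usecs
req_waittime              1285 samples [usec] 31 4255 117330
req_qdepth                1285 samples [reqs] 0 17 842
req_active                1285 samples [reqs] 1 9 3610
"""

def parse_stats(text):
    """Return {metric: {"count", "min", "max", "sum"}} for counter lines,
    skipping the snapshot_time header line."""
    metrics = {}
    for line in text.splitlines():
        fields = line.split()
        # Counter lines have exactly: name count "samples" [unit] min max sum
        if len(fields) == 7 and fields[2] == "samples":
            metrics[fields[0]] = {
                "count": int(fields[1]),
                "min": int(fields[4]),
                "max": int(fields[5]),
                "sum": int(fields[6]),
            }
    return metrics

stats = parse_stats(SAMPLE_STATS)
# Average wait time in usec, as a gmond-style gauge value.
print(stats["req_waittime"]["sum"] // stats["req_waittime"]["count"])
```

The daemon would read each /proc file on a timer, run a parser like this, and push the derived values as Ganglia metrics.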
&lt;br /&gt;
Trace logs collection job&lt;br /&gt;
{|border=1  cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|colspan=2|&#039;&#039;&#039;Scenario:&#039;&#039;&#039; || Generating the trace logs on each node.&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|colspan=2|&#039;&#039;&#039;goals&#039;&#039;&#039; || Usability (getting individual call trace information)&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|rowspan=&amp;quot;2&amp;quot; writing-mode=&amp;quot;vertical&amp;quot;|&#039;&#039;&#039;details&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;VFS trace call logs&#039;&#039;&#039; || Get VFS trace logs by enabling D_VFSTRACE on each client.&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Server RPC trace log&#039;&#039;&#039; || Get OST RPC handler trace logs by enabling D_RPCTRACE on each OST.&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|colspan=2|&#039;&#039;&#039;Implementation constraints&#039;&#039;&#039; || Trace logging is enabled/disabled by the Ganglia monitor daemon&lt;br /&gt;
|}&lt;br /&gt;
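The monitor daemon can toggle these debug flags by rewriting the symbolic kernel debug mask. The sketch below only computes the new mask string; the file it would be written back to (/proc/sys/lnet/debug on Lustre 1.4-era systems) and the exact flag spellings are assumptions:

```python
# Hypothetical helper for the Ganglia monitor daemon: compute the new
# symbolic debug mask after enabling or disabling one flag. Writing the
# result back to the mask file is left to the caller.

DEBUG_MASK_PATH = "/proc/sys/lnet/debug"  # assumed location, Lustre 1.4 era

def toggle_debug(current_mask, flag, enable):
    """current_mask is the space-separated flag list read from the mask
    file, e.g. "ioctl neterror"; flag is e.g. "vfstrace" or "rpctrace"."""
    flags = set(current_mask.split())
    if enable:
        flags.add(flag)
    else:
        flags.discard(flag)
    return " ".join(sorted(flags))
```

For example, toggle_debug("ioctl neterror", "vfstrace", True) yields a mask with vfstrace added, and passing enable=False removes it again without disturbing the other flags.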
&lt;br /&gt;
===Analyse profile information ===&lt;br /&gt;
&#039;&#039;&#039;items&#039;&#039;&#039;&lt;br /&gt;
{|border=1  cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;ID&#039;&#039;&#039; || Description&lt;br /&gt;
|-&lt;br /&gt;
|OST_Load || Represent the OST load over time&lt;br /&gt;
|-&lt;br /&gt;
|Client_IO_Efficiency || Represent client I/O RPC efficiency (1 MB RPC percentage)&lt;br /&gt;
|-&lt;br /&gt;
|Client_Cache_Stats || Represent whether the client cache (grant) is used efficiently&lt;br /&gt;
|-&lt;br /&gt;
|Client_RPC || Represent client RPC frequency&lt;br /&gt;
|-&lt;br /&gt;
|VFS_trace || Individual VFS trace call execution time (different calls have different colors)&lt;br /&gt;
|-&lt;br /&gt;
|Server RPC trace || Individual OST RPC handler time&lt;br /&gt;
|-&lt;br /&gt;
|Ldlm Stats || Represent lock (enqueue) conflict status over time&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;graphs&#039;&#039;&#039;&lt;br /&gt;
{|border=1  cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-align=&amp;quot;left&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;graphs&#039;&#039;&#039; || input || x axis || y axis&lt;br /&gt;
|-&lt;br /&gt;
|OST_Load || OST_Req_Handle_Info || time || req_qdepth + req_active, req_waittime&lt;br /&gt;
|-&lt;br /&gt;
|Client_OST_IO_Efficiency || Read/Write_Req_Info || time || each size req percent&lt;br /&gt;
|-&lt;br /&gt;
|Client_Cache_Stats || Client_Cache_Avaiblity || time || Client cache (grant) availability.&lt;br /&gt;
|-&lt;br /&gt;
|Client_RPC || Client_RPC_Frequency || time || read_req_read_count&lt;br /&gt;
|-&lt;br /&gt;
|ldlm stats || ldlm_stats || time || lock blocking_ast handler count on server.&lt;br /&gt;
|-&lt;br /&gt;
|VFS trace info || VFS trace logs || time || Individual VFS trace call execution time (different calls have different colors)&lt;br /&gt;
|-&lt;br /&gt;
|Server RPC trace || OST RPC trace logs || time || Individual OST RPC handler time&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&#039;&#039;&#039;Note&#039;&#039;&#039;: A trace log analysis tool will be needed to retrieve call execution times from the trace logs.&lt;br /&gt;
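The core of that analysis tool is pairing entry/exit markers per process and function to recover each call's duration. A sketch under simplified assumptions - real debug-log lines carry more fields than the 4-tuple events used here, and timestamps are shown as plain microsecond integers:

```python
# Sketch of the trace-log analysis step: match "entered"/"leaving"
# markers per (pid, function) to recover per-call execution time.
# The event format is a simplified assumption, not the real log layout.

SAMPLE_EVENTS = [
    (100, 1234, "ll_file_write", "entered"),
    (120, 1235, "ll_file_read", "entered"),
    (250, 1234, "ll_file_write", "leaving"),
    (400, 1235, "ll_file_read", "leaving"),
]

def call_durations(events):
    """Return a list of (function, usecs) for every matched entry/exit pair,
    in completion order; unmatched entries are silently dropped."""
    open_calls = {}  # maps (pid, function) to its entry timestamp
    durations = []
    for ts, pid, func, kind in events:
        key = (pid, func)
        if kind == "entered":
            open_calls[key] = ts
        elif kind == "leaving" and key in open_calls:
            durations.append((func, ts - open_calls.pop(key)))
    return durations
```

The per-call (function, duration) pairs produced this way are exactly what the VFS trace and server RPC trace graphs above would plot against time.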
&lt;br /&gt;
=== Output profile information ===&lt;br /&gt;
&lt;br /&gt;
Present the graphs via the Ganglia PHP web frontend.&lt;br /&gt;
&lt;br /&gt;
== Implementation constraint ==&lt;br /&gt;
# Use current utilities and architecture as much as possible, and be available to use as soon as possible.&lt;br /&gt;
# Implement the whole profiling system based on Ganglia&lt;br /&gt;
# It should also work with Lustre 1.4 (ORNL may remain on that version for a long time)&lt;br /&gt;
# Easily extensible - realize that we may want to add or remove some stats in the future.&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
[[Category:Architecture|Profiling Tools for IO]]&lt;/div&gt;</summary>
		<author><name>Wangdi</name></author>
	</entry>
</feed>