WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Difference between revisions of "Change Log 1.6"

From Obsolete Lustre Wiki
(fix formatting)
Line 1: Line 1:
 
=Changes from v1.6.4.3 to v1.6.5=
 
=Changes from v1.6.4.3 to v1.6.5=
'''Support for networks: socklnd - any kernel supported by Lustre, qswlnd - Qsnet kernel modules 5.20 and later, openiblnd - IbGold 1.8.2, o2iblnd - OFED 1.1, 1.2.0, 1.2.5, and 1.3 viblnd - Voltaire ibhost 3.4.5 and later, ciblnd - Topspin 3.2.0, iiblnd - Infiniserv 3.3 + PathBits patch, gmlnd - GM 2.1.22 and later, mxlnd - MX 1.2.1 or later, ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x'''
+
'''Support for networks:'''<br>
 +
''' socklnd - any kernel supported by Lustre'''<br>
 +
''' qswlnd - Qsnet kernel modules 5.20 and later'''<br>
 +
''' openiblnd - IbGold 1.8.2'''<br>
 +
''' o2iblnd - OFED 1.1, 1.2.0, 1.2.5, and 1.3'''<br>
 +
''' viblnd - Voltaire ibhost 3.4.5 and later'''<br>
 +
''' ciblnd - Topspin 3.2.0'''<br>
 +
''' iiblnd - Infiniserv 3.3 + PathBits patch'''<br>
 +
''' gmlnd - GM 2.1.22 and later'''<br>
 +
''' mxlnd - MX 1.2.1 or later'''<br>
 +
''' ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x'''<br>
  
'''Support for kernels: 2.6.5-7.311 (SLES 9), 2.6.9-67.0.7.EL (RHEL 4), 2.6.16.54-0.2.5 (SLES 10), 2.6.18-53.1.14.el5 (RHEL 5), 2.6.22.14 vanilla (kernel.org)'''
+
'''Support for kernels:'''<br>
 +
''' 2.6.5-7.311 (SLES 9)'''<br>
 +
''' 2.6.9-67.0.7.EL (RHEL 4)'''<br>
 +
''' 2.6.16.54-0.2.5 (SLES 10)'''<br>
 +
''' 2.6.18-53.1.14.el5 (RHEL 5)'''<br>
 +
''' 2.6.22.14 vanilla (kernel.org)'''<br>
  
'''Client support for unpatched kernels: (see http://wiki.lustre.org/index.php?title=Patchless_Client) 2.6.16 - 2.6.22 vanilla (kernel.org)'''
+
'''Client support for unpatched kernels: (see [[Patchless_Client]])'''<br>
 +
''' 2.6.16 - 2.6.22 vanilla (kernel.org)'''<br>
  
'''Due to problems with nested symlinks and FMODE_EXEC (bug 12652), we do not recommend using patchless RHEL4 clients with kernels prior to 2.6.9-55EL (RHEL4U5).'''
+
'''Due to problems with nested symlinks and FMODE_EXEC [[https://bugzilla.lustre.org/show_bug.cgi?id=12652| 12652]], we do not recommend using patchless RHEL4 clients with kernels prior to 2.6.9-55EL (RHEL4U5).'''
  
 
'''Recommended e2fsprogs version: 1.40.7-sun1'''
 
'''Recommended e2fsprogs version: 1.40.7-sun1'''
Line 12: Line 28:
 
'''Note that reiserfs quotas are disabled on SLES 10 in this kernel.'''
 
'''Note that reiserfs quotas are disabled on SLES 10 in this kernel.'''
  
'''RHEL 4 and RHEL 5/SLES 10 clients behaves differently on 'cd' to a removed cwd "./" (refer to Bugzilla 14399).'''
+
'''RHEL 4 and RHEL 5/SLES 10 clients behave differently on 'cd ./' to a removed current working directory (refer to bugzilla [[https://bugzilla.lustre.org/show_bug.cgi?id=14399| 14399]]).'''
  
 
'''A new quota file format has been introduced in 1.6.5.
 
'''A new quota file format has been introduced in 1.6.5.
 
The format conversion from prior releases is handled transparently, but releases older than 1.4.12/1.6.5 will not understand this new format.  The automatic format conversion can be avoided by running the following command on the MDS before upgrading:
 
The format conversion from prior releases is handled transparently, but releases older than 1.4.12/1.6.5 will not understand this new format.  The automatic format conversion can be avoided by running the following command on the MDS before upgrading:
 
     'tunefs.lustre --param="mdt.quota_type=ug1" $MDTDEV'.
 
     'tunefs.lustre --param="mdt.quota_type=ug1" $MDTDEV'.
For more information, please refer to bugzilla 13904.'''
+
For more information, please refer to bugzilla [[https://bugzilla.lustre.org/show_bug.cgi?id=13904| 13904]]'''
  
  
Line 24: Line 40:
 
Frequency: very rare, if additional xattrs are used on kernels >= 2.6.12
 
Frequency: very rare, if additional xattrs are used on kernels >= 2.6.12
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15777 15777}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15777| 15777]]
  
 
Description: MDS may lose file striping (and hence file data) in some cases
 
Description: MDS may lose file striping (and hence file data) in some cases
Line 32: Line 48:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=12191 12191}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=12191| 12191]]
  
 
Description: add message levels for liblustreapi
 
Description: add message levels for liblustreapi
Line 40: Line 56:
 
Frequency: rare, only if {mds,oss}_num_threads is specified
 
Frequency: rare, only if {mds,oss}_num_threads is specified
  
Bugzilla : {https://bugzilla.lustre.org/show_bug.cgi?id=15759 15759}
+
Bugzilla : [[https://bugzilla.lustre.org/show_bug.cgi?id=15759| 15759]]
  
 
Description: MDS or OSS service threads fail startup with -24 (-EMFILE)
 
Description: MDS or OSS service threads fail startup with -24 (-EMFILE)
Line 50: Line 66:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla  : {https://bugzilla.lustre.org/show_bug.cgi?id=13380 13380}
+
Bugzilla  : [[https://bugzilla.lustre.org/show_bug.cgi?id=13380| 13380]]
  
 
Description: MDT cannot be unmounted, reporting "Mount still busy"
 
Description: MDT cannot be unmounted, reporting "Mount still busy"
  
Details: Mountpoint references were being leaked during open reply
+
Details: Mountpoint references were being leaked during open reply reconstruction after an MDS restart.  Drop mountpoint reference in reconstruct_open() and free dentry reference also.
            reconstruction after an MDS restart.  Drop mountpoint reference
 
            in reconstruct_open() and free dentry reference also.
 
  
 
*Severity: minor
 
*Severity: minor
Line 62: Line 76:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13380 13380}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13380| 13380]]
  
 
Description: fix for occasional failure case of -ENOSPC in recovery-small tests
 
Description: fix for occasional failure case of -ENOSPC in recovery-small tests
  
Details: Move the 'good_osts' check before the 'total_bavail' check.  Thiswill result in an -EAGAIN and in the exit call path we callalloc_rr() which will with increasing aggressiveness attempt toaquire precreated objects on the minimum number of required OSCs.
+
Details: Move the 'good_osts' check before the 'total_bavail' check.  This will result in an -EAGAIN, and in the exit call path we call alloc_rr(), which will with increasing aggressiveness attempt to acquire precreated objects on the minimum number of required OSCs.
  
 
*Severity: major
 
*Severity: major
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14326 14326}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14326| 14326]]
  
 
Description: Use old size assignment to avoid deadlock
 
Description: Use old size assignment to avoid deadlock
Line 79: Line 93:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14655 14655}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14655| 14655]]
  
 
Description: Use __u64 instead of int for valid bits
 
Description: Use __u64 instead of int for valid bits
Line 85: Line 99:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14746 14746}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14746| 14746]]
  
 
Description: resolve "_IOWR redefined" build error on SLES10
 
Description: resolve "_IOWR redefined" build error on SLES10
Line 91: Line 105:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14763 14763}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14763| 14763]]
  
 
Description: dump the memory debugging after all modules are unloaded to suppress false negative in conf_sanity test 39
 
Description: dump the memory debugging after all modules are unloaded to suppress false negative in conf_sanity test 39
Line 97: Line 111:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14872 14872}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14872| 14872]]
  
 
Description: the recovery timer never expires
 
Description: the recovery timer never expires
Line 105: Line 119:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15521 15521}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15521| 15521]]
  
 
Description: the min numbers of lproc stats are wrong
 
Description: the min numbers of lproc stats are wrong
Line 115: Line 129:
 
Frequency: always with interactive lfs
 
Frequency: always with interactive lfs
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15212 15212}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15212| 15212]]
  
 
Description: Reinitialize optind to 0 so that interactive lfs works in all cases
 
Description: Reinitialize optind to 0 so that interactive lfs works in all cases
Line 123: Line 137:
 
Frequency: with multiple concurrent readdir processes in same directory
 
Frequency: with multiple concurrent readdir processes in same directory
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15406 15406}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15406| 15406]] [[https://bugzilla.lustre.org/show_bug.cgi?id=15169| 15169]] [[https://bugzilla.lustre.org/show_bug.cgi?id=15175| 15175]]
          {https://bugzilla.lustre.org/show_bug.cgi?id=15169 15169}
 
          {https://bugzilla.lustre.org/show_bug.cgi?id=15175 15175}
 
  
 
Description: misc fixes for directory readahead.
 
Description: misc fixes for directory readahead.
Line 133: Line 145:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15316 15316}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15316| 15316]]
  
 
Description: build kernel-ib packages for OFED 1.3 in our release cycle
 
Description: build kernel-ib packages for OFED 1.3 in our release cycle
Line 139: Line 151:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15036 15036}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15036| 15036]]
  
 
Description: incore types cleaning in quota code (with respect to 64-bit limits)
 
Description: incore types cleaning in quota code (with respect to 64-bit limits)
Line 149: Line 161:
 
Frequency: always
 
Frequency: always
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13969 13969}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13969| 13969]]
  
 
Description: fix SLES kernel versioning
 
Description: fix SLES kernel versioning
Line 159: Line 171:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14803 14803}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14803| 14803]]
  
 
Description: Don't update lov_desc members until making sure they are valid
 
Description: Don't update lov_desc members until making sure they are valid
Line 169: Line 181:
 
Frequency: very rare
 
Frequency: very rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15069 15069}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15069| 15069]]
  
 
Description: don't put request into delay list while invalidate in flight.
 
Description: don't put request into delay list while invalidate in flight.
Line 177: Line 189:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15416 15416}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15416| 15416]]
  
 
Description: Update kernel to SLES9 2.6.5-7.311.
 
Description: Update kernel to SLES9 2.6.5-7.311.
Line 183: Line 195:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15240 15240}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15240| 15240]]
  
 
Description: Update kernel to RHEL4 2.6.9-67.0.7.
 
Description: Update kernel to RHEL4 2.6.9-67.0.7.
Line 191: Line 203:
 
Frequency: always
 
Frequency: always
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14856 14856}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14856| 14856]]
  
 
Frequency: on PPC only
 
Frequency: on PPC only
Line 197: Line 209:
 
Description: do not convert OST objects for a directory, because they do not exist.
 
Description: do not convert OST objects for a directory, because they do not exist.
  
Details: ll_dir_getstripe assume dirrectory has ost objects but this wrong.
+
Details: ll_dir_getstripe assumes a directory has OST objects, but this is wrong.
  
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15517 15517}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15517| 15517]]
  
 
Description: Fix warnings when compiling liblustre on SLES10/RHEL5, which define __u64 as an unsigned long long type.
 
Description: Fix warnings when compiling liblustre on SLES10/RHEL5, which define __u64 as an unsigned long long type.
Line 209: Line 221:
 
Frequency: rare, on shutdown
 
Frequency: rare, on shutdown
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15210 15210}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15210| 15210]]
  
Description: race process ast vs remove callback
+
Description: race process AST vs remove callback
  
 
Details: removing callback before disconnect import open race with processing callback.
 
Details: removing callback before disconnect import open race with processing callback.
Line 217: Line 229:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15416 15416}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15416| 15416]]
  
 
Description: Update kernel to SLES9 2.6.5-7.311.
 
Description: Update kernel to SLES9 2.6.5-7.311.
Line 223: Line 235:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=12652 12652}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=12652| 12652]]
  
 
Description: Files open for execute are not marked busy on SLES10
 
Description: Files open for execute are not marked busy on SLES10
Line 231: Line 243:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13397 13397}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13397| 13397]]
  
 
Description: Add server support for vanilla-2.6.22.14.
 
Description: Add server support for vanilla-2.6.22.14.
Line 239: Line 251:
 
Frequency: occasional
 
Frequency: occasional
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13375 13375}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13375| 13375]]
  
 
Description: Avoid lov_create() getting stuck in obd_statfs_rqset()
 
Description: Avoid lov_create() getting stuck in obd_statfs_rqset()
Line 247: Line 259:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=3055 3055}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=3055| 3055]]
  
 
Description: Disable adaptive timeouts by default
 
Description: Disable adaptive timeouts by default
Line 255: Line 267:
 
Frequency: on network error
 
Frequency: on network error
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15027 15027}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15027| 15027]]
  
 
Description: panic with double free request if network error
 
Description: panic with double free request if network error
Line 265: Line 277:
 
Frequency: rare, on recovery
 
Frequency: rare, on recovery
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14533 14533}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14533| 14533]]
  
 
Description: read procfs can produce deadlock in some situation
 
Description: read procfs can produce deadlock in some situation
Line 273: Line 285:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15152 15152}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15152| 15152]]
  
 
Description: Update kernel to RHEL5 2.6.18-53.1.14.el5.
 
Description: Update kernel to RHEL5 2.6.18-53.1.14.el5.
Line 281: Line 293:
 
Frequency: frequent on X2 node
 
Frequency: frequent on X2 node
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15010 15010}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15010| 15010]]
  
 
Description: mdc_set_open_replay_data LBUG
 
Description: mdc_set_open_replay_data LBUG
Line 291: Line 303:
 
Frequency: common
 
Frequency: common
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14321 14321}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14321| 14321]]
  
 
Description: lustre_mgs: operation 101 on unconnected MGS
 
Description: lustre_mgs: operation 101 on unconnected MGS
Line 301: Line 313:
 
Frequency: rare, depends on device drivers and load
 
Frequency: rare, depends on device drivers and load
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14529 14529}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14529| 14529]]
  
 
Description: MDS or OSS nodes crash due to stack overflow
 
Description: MDS or OSS nodes crash due to stack overflow
Line 309: Line 321:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14876 14876}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14876| 14876]]
  
 
Description: Update to RHEL5 latest kernel-2.6.18-53.1.13.el5.
 
Description: Update to RHEL5 latest kernel-2.6.18-53.1.13.el5.
Line 315: Line 327:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14858 14858}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14858| 14858]]
  
 
Description: Update to SLES10 SP1 latest kernel-2.6.16.54-0.2.5.
 
Description: Update to SLES10 SP1 latest kernel-2.6.16.54-0.2.5.
Line 321: Line 333:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14720 14720}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14720| 14720]]
  
 
Description: Update to RHEL5 latest kernel-2.6.18-53.1.6.el5.
 
Description: Update to RHEL5 latest kernel-2.6.18-53.1.6.el5.
Line 327: Line 339:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14793 14793}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14793| 14793]]
  
 
Description: Update RHEL4 kernel to 2.6.9-67.0.4.
 
Description: Update RHEL4 kernel to 2.6.9-67.0.4.
Line 335: Line 347:
 
Frequency: rare on shutdown OST
 
Frequency: rare on shutdown OST
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13196 13196}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13196| 13196]]
  
 
Description: Don't allow skipping OSTs if index has been specified.
 
Description: Don't allow skipping OSTs if index has been specified.
Line 345: Line 357:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14421 14421}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14421| 14421]]
  
 
Description: ASSERTION(!PageDirty(page)) failed
 
Description: ASSERTION(!PageDirty(page)) failed
Line 355: Line 367:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=12228 12228}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=12228| 12228]]
  
 
Description: LBUG in ptlrpc_check_set() bad phase ebc0de00
 
Description: LBUG in ptlrpc_check_set() bad phase ebc0de00
Line 365: Line 377:
 
Frequency: always
 
Frequency: always
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13647 13647}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13647| 13647]]
  
Description: Lustre make rpms failed.
+
Description: Lustre 'make rpms' failed.
  
 
Details: Remove the ldiskfs spec file to avoid confusing rpmbuild when building Lustre RPMs from the tarball.
 
Details: Remove the ldiskfs spec file to avoid confusing rpmbuild when building Lustre RPMs from the tarball.
Line 373: Line 385:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14498 14498}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14498| 14498]]
  
 
Description: Update to SLES9 SP4 kernel-2.6.5-7.308.
 
Description: Update to SLES9 SP4 kernel-2.6.5-7.308.
Line 381: Line 393:
 
Frequency: rare on shutdown OST
 
Frequency: rare on shutdown OST
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14608 14608}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14608| 14608]]
  
Description: If llog cancel was not send before clean_exports phase, this can
+
Description: If llog cancel was not sent before the clean_exports phase, this can produce a deadlock in llog code.
            produce deadlock in llog code.
 
  
 
Details: If llog thread has last reference to obd and call class_import_put this produce deadlock because llog_cleanup_commit_master wait when last llog_commit_thread exited, but this never success because was called from llog_commit_thread.
 
Details: If llog thread has last reference to obd and call class_import_put this produce deadlock because llog_cleanup_commit_master wait when last llog_commit_thread exited, but this never success because was called from llog_commit_thread.
Line 392: Line 403:
 
Frequency: only if OST index is skipped
 
Frequency: only if OST index is skipped
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14607 14607}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14607| 14607]]
  
 
Description: NULL lov_tgts causing MDS oops
 
Description: NULL lov_tgts causing MDS oops
Line 400: Line 411:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14531 14531}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14531| 14531]]
  
 
Description: Update to RHEL4 latest kernel-2.6.9-67.0.1.EL.
 
Description: Update to RHEL4 latest kernel-2.6.9-67.0.1.EL.
Line 406: Line 417:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14368 14368}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14368| 14368]]
  
 
Description: Update to RHEL5 latest kernel-2.6.18-53.1.4.el5.
 
Description: Update to RHEL5 latest kernel-2.6.18-53.1.4.el5.
Line 414: Line 425:
 
Frequency: always
 
Frequency: always
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14136 14136}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14136| 14136]]
  
 
Description: make mgs_setparam() handle fsname containing dash
 
Description: make mgs_setparam() handle fsname containing dash
Line 422: Line 433:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14288 14288}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14288| 14288]]
  
 
Description: Update to RHEL4 Update-6 kernel-2.6.9-67.EL.
 
Description: Update to RHEL4 Update-6 kernel-2.6.9-67.EL.
Line 430: Line 441:
 
Frequency: rare, in recovery and (or) destroy lovobjid file.
 
Frequency: rare, in recovery and (or) destroy lovobjid file.
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=12702 12702}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=12702| 12702]]
  
Description: rewrite lov objid code.
+
Description: rewrite lov_objid code.
  
Details: Cleanup for lov objid code, remove scability problems and wrong locking. Fix sending last_id into ost.
+
Details: Cleanup for lov_objid code, remove scalability problems and wrong locking. Fix sending last_id into OST.
  
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14388 14388}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14388| 14388]]
  
 
Description: Update to SLES10 SP1 latest kernel-2.6.16.54-0.2.3.
 
Description: Update to SLES10 SP1 latest kernel-2.6.16.54-0.2.3.
Line 444: Line 455:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14289 14289}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14289| 14289]]
  
 
Description: Update to RHEL5 Update-1 kernel 2.6.18-53.el5.
 
Description: Update to RHEL5 Update-1 kernel 2.6.18-53.el5.
Line 454: Line 465:
 
Frequency: rare, at shutdown
 
Frequency: rare, at shutdown
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14260 14260}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14260| 14260]]
  
 
Description: access already free / zero obd_namespace.
 
Description: access already free / zero obd_namespace.
Line 464: Line 475:
 
Frequency: only at startup
 
Frequency: only at startup
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14418 14418}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14418| 14418]]
  
 
Description: not alloc memory with spinlock held.
 
Description: not alloc memory with spinlock held.
Line 474: Line 485:
 
Frequency: always
 
Frequency: always
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14270 14270}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14270| 14270]]
  
 
Description: lfs find does not continue on file error
 
Description: lfs find does not continue on file error
Line 482: Line 493:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=11791 11791}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=11791| 11791]]
  
 
Description: Inconsistent usage of lustre_pack_reply()
 
Description: Inconsistent usage of lustre_pack_reply()
  
Details: Standardize the usage of lustre_pack_reply() such that it
+
Details: Standardize the usage of lustre_pack_reply() such that it always generates a CERROR on failure.
            always generate a CERROR on failure.
 
  
 
*Severity: normal
 
*Severity: normal
Line 493: Line 503:
 
Frequency: very rare
 
Frequency: very rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=3462 3462}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=3462| 3462]]
  
 
Description: Fix replay if there is an un-replied request and open
 
Description: Fix replay if there is an un-replied request and open
Line 501: Line 511:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13969 13969}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13969| 13969]]
  
 
Description: Update to RHEL5 kernel 2.6.18-8.1.15.el5.
 
Description: Update to RHEL5 kernel 2.6.18-8.1.15.el5.
Line 507: Line 517:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13874 13874}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13874| 13874]]
  
 
Description: Update to SLES10 SP1 kernel 2.6.16.53-0.16
 
Description: Update to SLES10 SP1 kernel 2.6.16.53-0.16
Line 513: Line 523:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13889 13889}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13889| 13889]]
  
 
Description: Update to SLES9 kernel-2.6.5-7.287.3.
 
Description: Update to SLES9 kernel-2.6.5-7.287.3.
Line 519: Line 529:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14041 14041}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14041| 14041]]
  
 
Description: Update to RHEL4 kernel-2.6.9-55.0.12.EL.
 
Description: Update to RHEL4 kernel-2.6.9-55.0.12.EL.
Line 525: Line 535:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13690 13690}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13690| 13690]]
  
 
Description: Build SLES10 patchless client fails
 
Description: Build SLES10 patchless client fails
Line 533: Line 543:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=11622 11622}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=11622| 11622]]
  
 
Description: Lustre Page Accounting
 
Description: Lustre Page Accounting
Line 543: Line 553:
 
Frequency: only if debugging is disabled
 
Frequency: only if debugging is disabled
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13497 13497}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13497| 13497]]
  
 
Description: LASSERT_{REQ,REP}SWAB macros are buggy
 
Description: LASSERT_{REQ,REP}SWAB macros are buggy
Line 553: Line 563:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13888 13888}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13888| 13888]]
  
 
Description: interrupting oig_wait produces a panic on resend.
 
Description: interrupting oig_wait produces a panic on resend.
Line 561: Line 571:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=11089 11089}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=11089| 11089]]
  
 
Description: organize the server-side client stats on per-nid basis
 
Description: organize the server-side client stats on per-nid basis
Line 579: Line 589:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=12266 12266}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=12266| 12266]]
  
 
Description: Processes looping in ll_readdir() on Lustre clients finally causing a full node pseudo-hang
 
Description: Processes looping in ll_readdir() on Lustre clients finally causing a full node pseudo-hang
Line 589: Line 599:
 
Frequency: always
 
Frequency: always
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13976 13976}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13976| 13976]]
  
 
Description: touch file failed when fs is not full
 
Description: touch file failed when fs is not full
Line 599: Line 609:
 
Frequency: only for Cray XT3
 
Frequency: only for Cray XT3
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=12829 12829}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=12829| 12829]] [[https://bugzilla.lustre.org/show_bug.cgi?id=13455| 13455]]
          {https://bugzilla.lustre.org/show_bug.cgi?id=13455 13455}
 
  
Description: Changing primary group doesn't change the group lustre assigns to
+
Description: Changing primary group doesn't change the group lustre assigns to a file
            a file
 
  
Details: When CRAY_XT3 is defined, the fsgid supplied by the client is
+
Details: When CRAY_XT3 is defined, the fsgid supplied by the client is overridden with the primary group provided by the group upcall, whereas the supplied fsgid can be trusted if it is in the list of supplementary groups returned by the group upcall.
            overridden with the primary group provided by the group upcall,
 
            whereas the supplied fsgid can be trusted if it is in the list of
 
            supplementary groups returned by the group upcall.
 
  
 
*Severity: enhancement
 
*Severity: enhancement
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=12749 12749}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=12749| 12749]]
  
 
Description: Root Squash Functionality
 
Description: Root Squash Functionality
Line 619: Line 624:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=10718 10718}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=10718| 10718]]
  
 
Description: Slow truncate/writes to huge files at high offsets.
 
Description: Slow truncate/writes to huge files at high offsets.
Line 629: Line 634:
 
Frequency: common
 
Frequency: common
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14379 14379}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14379| 14379]]
  
 
Description: Too many locks accumulating on client during NFS usage
 
Description: Too many locks accumulating on client during NFS usage
Line 637: Line 642:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14477 14477}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14477| 14477]]
  
Description: Hit ASSERTION(obd->obd_stopping == 1) failed in some setup failed
+
Description: Hit ASSERTION(obd->obd_stopping == 1) failed in some setup failed situation.
            situation.
 
  
 
Details: In obd setup failure handler, obd_stopping will not necessarily to be 1, and obd_set_up should also be checked to make sure whether obd is completely setup.
 
Details: In obd setup failure handler, obd_stopping will not necessarily to be 1, and obd_set_up should also be checked to make sure whether obd is completely setup.
Line 646: Line 650:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14398 14398}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14398| 14398]]
  
 
Description: Allow masking D_WARNING, D_ERROR messages from console
 
Description: Allow masking D_WARNING, D_ERROR messages from console
Line 656: Line 660:
 
Frequency: always
 
Frequency: always
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14614 14614}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14614| 14614]]
  
 
Description: User code with malformed file open parameter crashes client node
 
Description: User code with malformed file open parameter crashes client node
Line 666: Line 670:
 
Frequency: always
 
Frequency: always
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=10600 10600}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=10600| 10600]]
  
 
Description: shrink/enlarge qunit size when needed; fix the problem of coarse grain of quota doing harm to quota's accuracy
 
Description: shrink/enlarge qunit size when needed; fix the problem of coarse grain of quota doing harm to quota's accuracy
Line 674: Line 678:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14225 14225}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14225| 14225]]
  
 
Description: LDLM_ENQUEUE races with LDLM_CP_CALLBACK
 
Description: LDLM_ENQUEUE races with LDLM_CP_CALLBACK
Line 682: Line 686:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14360 14360}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14360| 14360]]
  
 
Description: Heavy nfs access might result in deadlocks
 
Description: Heavy nfs access might result in deadlocks
Line 690: Line 694:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14443 14443}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14443| 14443]]
  
 
Description: 35% write performance drop with ldiskfs2 when quotas are on
 
Description: 35% write performance drop with ldiskfs2 when quotas are on
Line 698: Line 702:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13843 13843}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13843| 13843]]
  
 
Description: Client eviction while running blogbench
 
Description: Client eviction while running blogbench
Line 708: Line 712:
 
Frequency: RHEL4 only
 
Frequency: RHEL4 only
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14618 14618}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14618| 14618]]
  
 
Description: mkfs is very slow on IA64/RHEL4
 
Description: mkfs is very slow on IA64/RHEL4
Line 718: Line 722:
 
Frequency: PPC/PPC64 only
 
Frequency: PPC/PPC64 only
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14845 14845}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14845| 14845]]
  
 
Description: conflicts between asm-ppc64/types.h and lustre_types.h
 
Description: conflicts between asm-ppc64/types.h and lustre_types.h
Line 728: Line 732:
 
Frequency: PPC/PPC64 only
 
Frequency: PPC/PPC64 only
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14844 14844}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14844| 14844]]
  
 
Description: asm-ppc/segment.h does not exist
 
Description: asm-ppc/segment.h does not exist
Line 736: Line 740:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13805 13805}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13805| 13805]]
  
 
Description: data checksumming impacts single node performance
 
Description: data checksumming impacts single node performance
Line 744: Line 748:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14648 14648}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14648| 14648]]
  
 
Description: use adler32 for page checksums
 
Description: use adler32 for page checksums
  
Details: when available, use the Adler-32 algorithm instead of CRC32 for
+
Details: when available, use the Adler-32 algorithm instead of CRC32 for page checksums.
            page checksums.
 
  
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14864 14864}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14864| 14864]]
  
 
Description: better handle error messages in extents code
 
Description: better handle error messages in extents code
Line 759: Line 762:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14729 14729}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14729| 14729]]
  
 
Description: SNMP support enhancement
 
Description: SNMP support enhancement
  
Details: Adding total number of sampled request for an MDS node in snmp
+
Details: Adding total number of sampled request for an MDS node in snmp support.
            support.
 
  
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14748 14748}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14748| 14748]]
  
 
Description: Optimize ldlm waiting list processing for PR extent locks
 
Description: Optimize ldlm waiting list processing for PR extent locks
Line 776: Line 778:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14774 14774}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14774| 14774]]
  
 
Description: Time out and refuse to reconnect
 
Description: Time out and refuse to reconnect
Line 784: Line 786:
 
*Severity: major
 
*Severity: major
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14775 14775}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14775| 14775]]
  
 
Description: Client not clear own cache if answer to reconnect is lost.
 
Description: Client not clear own cache if answer to reconnect is lost.
Line 792: Line 794:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14483 14483}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14483| 14483]]
  
 
Description: Detect stride IO mode in read-ahead
 
Description: Detect stride IO mode in read-ahead
Line 800: Line 802:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15033 15033}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15033| 15033]]
  
Description: build for x2 fails
+
Description: build for X2 fails
  
 
Details: fix compile issue on Cray systems.
 
Details: fix compile issue on Cray systems.
Line 808: Line 810:
 
*Severity: enhancement
 
*Severity: enhancement
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13371 13371}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13371| 13371]]
  
 
Description: implement readv/writev APIs(aio_read/aio_writes in newer kernels)
 
Description: implement readv/writev APIs(aio_read/aio_writes in newer kernels)
Line 818: Line 820:
 
Frequency: only on PPC/SLES10
 
Frequency: only on PPC/SLES10
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14855 14855}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14855| 14855]]
  
 
Description: "BITS_PER_LONG is not 32 or 64" in linux/idr.h
 
Description: "BITS_PER_LONG is not 32 or 64" in linux/idr.h
Line 826: Line 828:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14257 14257}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14257| 14257]]
  
 
Description: LASSERT on MDS when client holding flock lock dies
 
Description: LASSERT on MDS when client holding flock lock dies
Line 834: Line 836:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15188 15188}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15188| 15188]]
  
 
Description: MDS deadlock with many ll_sync_lov threads and I/O stalled
 
Description: MDS deadlock with many ll_sync_lov threads and I/O stalled
Line 842: Line 844:
 
*Severity: minor
 
*Severity: minor
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15566 15566}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15566| 15566]]
  
 
Description: Update an obsolete wirecheck.c generator
 
Description: Update an obsolete wirecheck.c generator
Line 850: Line 852:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14712 14712}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14712| 14712]]
  
 
Description: Client can panic on open sometimes
 
Description: Client can panic on open sometimes
Line 858: Line 860:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14410 14410}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14410| 14410]]
  
 
Description: performance in 1.6.3
 
Description: performance in 1.6.3
Line 866: Line 868:
 
*Severity: normal
 
*Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15198 15198}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15198| 15198]]
  
 
Description: LDLM soft lockups - improvement
 
Description: LDLM soft lockups - improvement
Line 876: Line 878:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14036 14036}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14036| 14036]]
  
 
Description: lfs quota fails with deactivated OSTS
 
Description: lfs quota fails with deactivated OSTS
Line 889: Line 891:
 
Frequency: rare
 
Frequency: rare
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15776 15776}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15776| 15776]]
  
 
Description: Extent locks not granted with no conflicts sometimes.
 
Description: Extent locks not granted with no conflicts sometimes.
Line 897: Line 899:
 
Severity: major
 
Severity: major
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=15712 15712}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=15712| 15712]]
  
 
Description: ksocknal_create_conn() hit ASSERTION during connection race
 
Description: ksocknal_create_conn() hit ASSERTION during connection race
Line 905: Line 907:
 
Severity: major
 
Severity: major
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=13983 13983}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=13983| 13983]]
  
 
Description: ksocknal_send_hello() hit ASSERTION while connecting race
 
Description: ksocknal_send_hello() hit ASSERTION while connecting race
Line 913: Line 915:
 
Severity: major
 
Severity: major
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14425 14425}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14425| 14425]]
  
 
Description: o2iblnd/ptllnd credit deadlock in a routed config.
 
Description: o2iblnd/ptllnd credit deadlock in a routed config.
Line 921: Line 923:
 
Severity: normal
 
Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14956 14956}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14956| 14956]]
  
 
Description: High load after starting lnet
 
Description: High load after starting lnet
Line 929: Line 931:
 
Severity: normal
 
Severity: normal
  
Bugzilla: {https://bugzilla.lustre.org/show_bug.cgi?id=14838 14838}
+
Bugzilla: [[https://bugzilla.lustre.org/show_bug.cgi?id=14838| 14838]]
  
 
Description: ksocklnd fails to establish connection if accept_port is high
 
Description: ksocklnd fails to establish connection if accept_port is high
  
 
Details: PID remapping must not be done for active (outgoing) connections
 
Details: PID remapping must not be done for active (outgoing) connections
 
  
 
=Changes from v1.6.4.2 to v1.6.4.3=
 
=Changes from v1.6.4.2 to v1.6.4.3=

Revision as of 20:38, 12 June 2008

Changes from v1.6.4.3 to v1.6.5

Support for networks:
socklnd - any kernel supported by Lustre
qswlnd - Qsnet kernel modules 5.20 and later
openiblnd - IbGold 1.8.2
o2iblnd - OFED 1.1, 1.2.0, 1.2.5, and 1.3
viblnd - Voltaire ibhost 3.4.5 and later
ciblnd - Topspin 3.2.0
iiblnd - Infiniserv 3.3 + PathBits patch
gmlnd - GM 2.1.22 and later
mxlnd - MX 1.2.1 or later
ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x

Support for kernels:
2.6.5-7.311 (SLES 9)
2.6.9-67.0.7.EL (RHEL 4)
2.6.16.54-0.2.5 (SLES 10)
2.6.18-53.1.14.el5 (RHEL 5)
2.6.22.14 vanilla (kernel.org)

Client support for unpatched kernels: (see Patchless_Client)
2.6.16 - 2.6.22 vanilla (kernel.org)

Due to problems with nested symlinks and FMODE_EXEC [12652], we do not recommend using patchless RHEL4 clients with kernels prior to 2.6.9-55EL (RHEL4U5).

Recommended e2fsprogs version: 1.40.7-sun1

Note that reiserfs quotas are disabled on SLES 10 in this kernel.

RHEL 4 and RHEL 5/SLES 10 clients behave differently on 'cd ./' to a removed current working directory (refer to bugzilla [14399]).

A new quota file format has been introduced in 1.6.5. The format conversion from prior releases is handled transparently, but releases older than 1.4.12/1.6.5 will not understand this new format. The automatic format conversion can be avoided by running the following command on the MDS before upgrading:

    tunefs.lustre --param="mdt.quota_type=ug1" $MDTDEV

For more information, please refer to bugzilla [13904]


  • Severity: critical

Frequency: very rare, if additional xattrs are used on kernels >= 2.6.12

Bugzilla: [15777]

Description: MDS may lose file striping (and hence file data) in some cases

Details: If there are additional extended attributes stored on the MDS, in particular ACLs, SELinux, or user attributes (if user_xattr is specified for the client mount options) then there is a risk of attribute loss. Additionally, the Lustre file striping needs to be larger than default (e.g. striped over all OSTs), and an additional attribute must be stored initially in the inode and then increase in size enough to be moved to the external attribute block (e.g. ACL growing in size) for file data to be lost.

  • Severity: enhancement

Bugzilla: [12191]

Description: add message levels for liblustreapi

  • Severity: minor

Frequency: rare, only if {mds,oss}_num_threads is specified

Bugzilla: [15759]

Description: MDS or OSS service threads fail startup with -24 (-EMFILE)

Details: During startup under recovery, it is possible for service thread startup to fail in ptlrpc_start_threads() if one of the threads begins processing a request and then starts an additional thread. This causes ptlrpc_start_threads() to try to start one thread too many and get an error.

  • Severity: normal

Frequency: rare

Bugzilla: [13380]

Description: MDT cannot be unmounted, reporting "Mount still busy"

Details: Mountpoint references were being leaked during open reply reconstruction after an MDS restart. Drop mountpoint reference in reconstruct_open() and free dentry reference also.

  • Severity: minor

Frequency: rare

Bugzilla: [13380]

Description: fix for occasional failure case of -ENOSPC in recovery-small tests

Details: Move the 'good_osts' check before the 'total_bavail' check. This results in an -EAGAIN, and in the exit call path we call alloc_rr(), which attempts with increasing aggressiveness to acquire precreated objects on the minimum number of required OSCs.

  • Severity: major

Bugzilla: [14326]

Description: Use old size assignment to avoid deadlock

Details: This reverts the changes in bugs 2369 and 14138 that introduced scheduling while holding a spinlock. We do not need locking for size in ll_update_inode() because size is only updated from the MDS for directories or files without objects, so there is no other place to do the update, and concurrent access to such inodes is protected by the inode lock.

  • Severity: normal

Bugzilla: [14655]

Description: Use __u64 instead of int for valid bits

  • Severity: normal

Bugzilla: [14746]

Description: resolve "_IOWR redefined" build error on SLES10

  • Severity: normal

Bugzilla: [14763]

Description: dump the memory debugging after all modules are unloaded to suppress false negative in conf_sanity test 39

  • Severity: normal

Bugzilla: [14872]

Description: the recovery timer never expires

Details: For a connect request from a new client, the recovery timer should not be reset; otherwise the recovery timer will never expire if the old client never returns. Only a connect from an old client, or the first connection request, should trigger a recovery timer reset.

  • Severity: normal

Bugzilla: [15521]

Description: the min numbers of lproc stats are wrong

Details: Add a new constant LC_MIN_INIT and use it to initialize lc_min.
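
A minimal sketch of why the initial value matters, written in Python rather than the C of the Lustre source. `LC_MIN_INIT` and `lc_min` are the names used in this entry; the sentinel value, the `Counter` class, and the other field names are assumptions made for illustration only.

```python
# Hypothetical sentinel: the value the real LC_MIN_INIT uses is not
# stated in this changelog, so a large positive number stands in for it.
LC_MIN_INIT = (1 << 63) - 1


class Counter:
    """Toy stats counter tracking count, sum, min and max of samples."""

    def __init__(self):
        self.lc_count = 0
        self.lc_sum = 0
        # Start the minimum at the sentinel so the first sample replaces it.
        self.lc_min = LC_MIN_INIT
        self.lc_max = 0

    def add(self, amount):
        self.lc_count += 1
        self.lc_sum += amount
        if amount < self.lc_min:
            self.lc_min = amount
        if amount > self.lc_max:
            self.lc_max = amount
```

Had `lc_min` started at 0, no non-negative sample could ever lower it, so the reported minimum would stay at 0.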

  • Severity: normal

Frequency: always with interactive lfs

Bugzilla: [15212]

Description: Reinitialize optind to 0 so that interactive lfs works in all cases

  • Severity: normal

Frequency: with multiple concurrent readdir processes in same directory

Bugzilla: [15406] [15169] [15175]

Description: misc fixes for directory readahead.

Details: Prevent a previous statahead async RPC callback from processing the current "statahead_info"; fix a race condition between the async RPC callback adding a dentry into the dentry hash table and the "ls" thread revalidating that dentry; add statahead hit/miss control for hidden items; and so on.

  • Severity: enhancement

Bugzilla: [15316]

Description: build kernel-ib packages for OFED 1.3 in our release cycle

  • Severity: normal

Bugzilla: [15036]

Description: incore types cleaning in quota code (with respect to 64-bit limits)

Details: Several u32 variable declarations are replaced with u64 declarations.

  • Severity: minor

Frequency: always

Bugzilla: [13969]

Description: fix SLES kernel versioning

Details: The kernel version for our SLES 10 kernel did not include a "-" before the "smp" at the end. While this was not a problem in general, it did mean that software using the kernel version to detect a vendor-specific kernel would fail. This was most evident in the OFED build scripts.

  • Severity: normal

Frequency: rare

Bugzilla: [14803]

Description: Don't update lov_desc members until making sure they are valid

Details: When updating lov_desc members via proc fs, their values must be made valid before doing the real update.

  • Severity: normal

Frequency: very rare

Bugzilla: [15069]

Description: Don't put a request into the delay list while an invalidate is in flight.

Details: ptlrpc_delay_request sometimes put a request in the delay list while an import invalidate was in flight. This produces a timeout for the invalidate and can sometimes cause stale data.

  • Severity: enhancement

Bugzilla: [15416]

Description: Update kernel to SLES9 2.6.5-7.311.

  • Severity: enhancement

Bugzilla: [15240]

Description: Update kernel to RHEL4 2.6.9-67.0.7.

  • Severity: normal

Frequency: always, on PPC only

Bugzilla: [14856]

Description: Do not convert OST objects for a directory, because they do not exist.

Details: ll_dir_getstripe assumed that a directory has OST objects, which is wrong.

  • Severity: enhancement

Bugzilla: [15517]

Description: Fix warnings when compiling liblustre on SLES10/RHEL5, which have __u64 as an unsigned long long type.

  • Severity: minor

Frequency: rare, on shutdown

Bugzilla: [15210]

Description: Race between AST processing and callback removal

Details: Removing the callback before disconnecting the import opens a race with callback processing.

  • Severity: enhancement

Bugzilla: [12652]

Description: Files open for execute are not marked busy on SLES10

Details: Add FMODE_EXEC to SLES10 SP1 server kernel series.

  • Severity: enhancement

Bugzilla: [13397]

Description: Add server support for vanilla-2.6.22.14.

  • Severity: normal

Frequency: occasional

Bugzilla: [13375]

Description: Avoid lov_create() getting stuck in obd_statfs_rqset()

Details: If an OST is down the MDS will hang indefinitely in obd_statfs_rqset() waiting for the statfs data. For MDS QOS usage of statfs, however, it should not get stuck waiting.

  • Severity: enhancement

Bugzilla: [3055]

Description: Disable adaptive timeouts by default

  • Severity: major

Frequency: on network error

Bugzilla: [15027]

Description: panic with double free request if network error

Details: mdc_finish_enqueue finishes the request if any network error occurs, but this is correct only for synchronous enqueue. For async enqueue (via ptlrpcd) it is incorrect, because ptlrpcd wants to finish the request itself.

  • Severity: normal

Frequency: rare, on recovery

Bugzilla: [14533]

Description: Reading procfs can produce a deadlock in some situations

Details: Holding the lprocfs lock while sending an RPC can block the destruction of obd objects, which also blocks reconnect with -EALREADY. This does not fix all lprocfs bugs, but makes them rare.

  • Severity: enhancement

Bugzilla: [15152]

Description: Update kernel to RHEL5 2.6.18-53.1.14.el5.

  • Severity: major

Frequency: frequent on X2 node

Bugzilla: [15010]

Description: mdc_set_open_replay_data LBUG

Details: Set replay data for requests that are eligible for replay.

  • Severity: normal

Frequency: common

Bugzilla: [14321]

Description: lustre_mgs: operation 101 on unconnected MGS

Details: When the MGC is disconnected from the MGS for long enough, the MGS will evict the MGC. Later on, the MGC cannot successfully connect to the MGS, and many error messages complain that the MGS is not connected.

  • Severity: major

Frequency: rare, depends on device drivers and load

Bugzilla: [14529]

Description: MDS or OSS nodes crash due to stack overflow

Details: Code changes in 1.6.4 increased the stack usage of some functions. In some cases, in conjunction with device drivers that use a lot of stack, the MDS (or possibly OSS) service threads could overflow the stack. One change which was identified to consume additional stack has been reworked to avoid the extra stack usage.

  • Severity: enhancement

Bugzilla: [14876]

Description: Update to RHEL5 latest kernel-2.6.18-53.1.13.el5.

  • Severity: enhancement

Bugzilla: [14858]

Description: Update to SLES10 SP1 latest kernel-2.6.16.54-0.2.5.

  • Severity: enhancement

Bugzilla: [14720]

Description: Update to RHEL5 latest kernel-2.6.18-53.1.6.el5.

  • Severity: enhancement

Bugzilla: [14793]

Description: Update RHEL4 kernel to 2.6.9-67.0.4.

  • Severity: minor

Frequency: rare on shutdown OST

Bugzilla: [13196]

Description: Don't allow skipping OSTs if index has been specified.

Details: Don't allow skipping OSTs if index has been specified, and make locking in internal create much better.

  • Severity: normal

Frequency: rare

Bugzilla: [14421]

Description: ASSERTION(!PageDirty(page)) failed

Details: Wrong check could lead to an assertion failure under specific load patterns.

  • Severity: normal

Frequency: rare

Bugzilla: [12228]

Description: LBUG in ptlrpc_check_set() bad phase ebc0de00

Details: Access to a bitfield in a structure is always rounded to a long, which makes changing any single bit non-atomic.

  • Severity: normal

Frequency: always

Bugzilla: [13647]

Description: Lustre 'make rpms' failed.

Details: Remove the ldiskfs spec file to avoid confusing rpmbuild when building Lustre RPMs from a tarball.

  • Severity: enhancement

Bugzilla: [14498]

Description: Update to SLES9 SP4 kernel-2.6.5-7.308.

  • Severity: normal

Frequency: rare on shutdown OST

Bugzilla: [14608]

Description: If llog cancel was not sent before the clean_exports phase, this can produce a deadlock in llog code.

Details: If an llog thread holds the last reference to the obd and calls class_import_put, this produces a deadlock, because llog_cleanup_commit_master waits for the last llog_commit_thread to exit, but this never succeeds because it was called from llog_commit_thread.

  • Severity: normal

Frequency: only if OST index is skipped

Bugzilla: [14607]

Description: NULL lov_tgts causing MDS oops

Details: Add safer checks for NULL lov_tgts to avoid an oops.

  • Severity: enhancement

Bugzilla: [14531]

Description: Update to RHEL4 latest kernel-2.6.9-67.0.1.EL.

  • Severity: enhancement

Bugzilla: [14368]

Description: Update to RHEL5 latest kernel-2.6.18-53.1.4.el5.

  • Severity: normal

Frequency: always

Bugzilla: [14136]

Description: make mgs_setparam() handle fsname containing dash

Details: fsname containing a dash does not work with lctl conf_param
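
One way a bug of this kind can arise, assuming target names of the form `fsname-MDT0000`, is splitting the name at the first dash instead of the last. Whether mgs_setparam() failed for exactly this reason is an assumption, and `split_target` is a hypothetical helper written in Python for illustration, not a Lustre function.

```python
def split_target(target):
    """Split 'fsname-SUFFIX' at the LAST dash so the fsname may contain dashes."""
    fsname, sep, suffix = target.rpartition("-")
    if not sep or not fsname or not suffix:
        raise ValueError("expected 'fsname-SUFFIX', got %r" % (target,))
    return fsname, suffix
```

With this approach `split_target("my-fs-MDT0000")` yields `("my-fs", "MDT0000")`, whereas a split at the first dash would have produced the file system name `my`.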

  • Severity: enhancement

Bugzilla: [14288]

Description: Update to RHEL4 Update-6 kernel-2.6.9-67.EL.

  • Severity: normal

Frequency: rare, in recovery and/or on destroy of the lovobjid file.

Bugzilla: [12702]

Description: rewrite lov_objid code.

Details: Clean up the lov_objid code, removing scalability problems and wrong locking. Fix sending last_id to the OST.

  • Severity: enhancement

Bugzilla: [14388]

Description: Update to SLES10 SP1 latest kernel-2.6.16.54-0.2.3.

  • Severity: enhancement

Bugzilla: [14289]

Description: Update to RHEL5 Update-1 kernel 2.6.18-53.el5.

Details: Use d_move_locked instead of __d_move.

  • Severity: major

Frequency: rare, at shutdown

Bugzilla: [14260]

Description: Access to already freed or zeroed obd_namespace.

Details: If client_disconnect_export was called without the force flag set while a connect request was in flight, this can produce an access to a NULL pointer (or an already freed pointer) when connect_interpret stores ocd flags in obd_namespace.

  • Severity: minor

Frequency: only at startup

Bugzilla: [14418]

Description: Do not allocate memory with a spinlock held.

Details: Allocating memory with GFP_KERNEL can sleep, which can produce a deadlock if any spinlock is held.

  • Severity: normal

Frequency: always

Bugzilla: [14270]

Description: lfs find does not continue on file error

Details: Continue processing other files when a file or directory is absent.
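
The behaviour described here can be sketched as a loop that records an error and moves on instead of aborting. This is illustrative Python, not the `lfs find` implementation; `find_existing` is a hypothetical helper.

```python
import os


def find_existing(paths):
    """Stat each path; collect failures instead of stopping at the first one."""
    found, errors = [], []
    for path in paths:
        try:
            os.lstat(path)
        except OSError as exc:
            # Remember the failure and keep going with the remaining paths.
            errors.append((path, exc.errno))
            continue
        found.append(path)
    return found, errors
```

Returning the error list alongside the results lets the caller report every missing entry once the whole walk has finished.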

  • Severity: normal

Bugzilla: [11791]

Description: Inconsistent usage of lustre_pack_reply()

Details: Standardize the usage of lustre_pack_reply() such that it always generates a CERROR on failure.

  • Severity: normal

Frequency: very rare

Bugzilla: [3462]

Description: Fix replay if there is an un-replied request and open

Details: In some cases, an older replay request will revert mcd->mcd_last_xid on the MDS, which is used to record the client's latest sent request.

  • Severity: enhancement

Bugzilla: [13969]

Description: Update to RHEL5 kernel 2.6.18-8.1.15.el5.

  • Severity: enhancement

Bugzilla: [13874]

Description: Update to SLES10 SP1 kernel 2.6.16.53-0.16

  • Severity: enhancement

Bugzilla: [13889]

Description: Update to SLES9 kernel-2.6.5-7.287.3.

  • Severity: enhancement

Bugzilla: [14041]

Description: Update to RHEL4 kernel-2.6.9-55.0.12.EL.

  • Severity: enhancement

Bugzilla: [13690]

Description: Build SLES10 patchless client fails

Details: Configure broke when ./configure was run with the --with-linux-obj=.... argument for a patchless client. When configure uses --with-linux-obj, LINUXINCLUDE= -Iinclude cannot find headers properly. Use an absolute path such as -I($LINUX)/include instead.

  • Severity: enhancement

Bugzilla: [11622]

Description: Lustre Page Accounting

Details: New macros for page alloc and free which enable accounting of page allocation of Lustre. Use percpu counters to store memory and page statistics.

  • Severity: normal

Frequency: only if debugging is disabled

Bugzilla: [13497]

Description: LASSERT_{REQ,REP}SWAB macros are buggy

Details: If SWAB_PARANOIA is disabled, the LASSERT_REQSWAB and LASSERT_REPSWAB macros become no-ops, which is incorrect. Drop these macros and replace them with their definitions instead.

  • Severity: normal

Frequency: rare

Bugzilla: [13888]

Description: Interrupting oig_wait produces a panic on resend.

Details: brw_redo_request can be used to resend requests from both ptlrpcd and a private set. This produces a situation where rq_ptlrpcd_data is not copied to the newly allocated request, triggering an LBUG on the assertion req->rq_ptlrpcd_data != NULL. However, this member is only used to wake up the ptlrpcd set if the request is changed, and can safely be changed to use rq_set directly.

  • Severity: enhancement

Bugzilla: [11089]

Description: organize the server-side client stats on per-nid basis

Details: Change the structure of stats under obdfilter and mds to

        New structure:
           +- exports
                   +- nid#1
                   |   + stats
                   |   + uuids
                   +- nid#2...
                   +- clear

The "uuids" file lists the uuids of _active_ exports, and the "clear" entry clears all stats and stale nids.
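
The layout can be pictured as a per-NID table. The sketch below is illustrative Python, not the procfs code: the class and method names are hypothetical, and treating a "stale" nid as one with no active export uuids left is an assumption.

```python
class ExportStats:
    """Toy model of the exports/<nid>/{stats,uuids} layout plus 'clear'."""

    def __init__(self):
        self.nids = {}  # nid -> {"stats": {op: count}, "uuids": set()}

    def record(self, nid, uuid, op):
        entry = self.nids.setdefault(nid, {"stats": {}, "uuids": set()})
        entry["uuids"].add(uuid)
        entry["stats"][op] = entry["stats"].get(op, 0) + 1

    def disconnect(self, nid, uuid):
        # The export is no longer active, so its uuid leaves the listing.
        self.nids[nid]["uuids"].discard(uuid)

    def clear(self):
        # Drop nids with no active exports; zero the stats of the rest.
        for nid in list(self.nids):
            if not self.nids[nid]["uuids"]:
                del self.nids[nid]
            else:
                self.nids[nid]["stats"].clear()
```

Keying by NID rather than by export keeps statistics for a client together across reconnects, which is the point of the reorganisation described in this entry.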

  • Severity: normal

Frequency: rare

Bugzilla: [12266]

Description: Processes looping in ll_readdir() on Lustre clients finally causing a full node pseudo-hang

Details: Concurrent access to the same directory from multiple clients with intensive file creation/removal can cause a client node to spin in ll_readdir(). i_version must be increased every time the lock is cancelled to ensure a revalidate is done.

  • Severity: normal

Frequency: always

Bugzilla: [13976]

Description: touch file failed when fs is not full

Details: An OST in recovery should not be discarded by the MDS in alloc_qos(), otherwise we can get ENOSPC while the fs is not full.

  • Severity: normal

Frequency: only for Cray XT3

Bugzilla: [12829] [13455]

Description: Changing primary group doesn't change the group lustre assigns to a file

Details: When CRAY_XT3 is defined, the fsgid supplied by the client is overridden with the primary group provided by the group upcall, whereas the supplied fsgid can be trusted if it is in the list of supplementary groups returned by the group upcall.

  • Severity: enhancement

Bugzilla: [12749]

Description: Root Squash Functionality

Details: Implementation of NFS-like root squash capability. Specifically, don't allow someone with root access on a client node to be able to manipulate files owned by root on a server node.
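
In the NFS model this refers to, requests arriving as root are remapped to an unprivileged identity before permission checks. A minimal Python sketch of that mapping follows; the function name, parameter names, and the default ids of 65534 are assumptions for illustration, not Lustre's interface or defaults.

```python
def squash_credentials(uid, gid, squash_uid=65534, squash_gid=65534):
    """Remap root (uid 0) to an unprivileged identity; pass others through."""
    if uid == 0:
        return squash_uid, squash_gid
    return uid, gid
```

For example, `squash_credentials(0, 0)` returns `(65534, 65534)`, while `squash_credentials(1000, 100)` is returned unchanged.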

  • Severity: enhancement

Bugzilla: [10718]

Description: Slow truncate/writes to huge files at high offsets.

Details: Directly associate cached pages with the lock that protects those pages. This allows us to quickly find which pages to write and remove once a lock callback is received.

  • Severity: normal

Frequency: common

Bugzilla: [14379]

Description: Too many locks accumulating on client during NFS usage

Details: mds_open improperly used accmode to find out the access mode of a file. Also, the mdc_intent_lock logic for finding out whether we already have a lock similar to the one just received has been flawed since the introduction of skiplists, because locks are now added to the front of the granted queue.

  • Severity: normal

Bugzilla: [14477]

Description: ASSERTION(obd->obd_stopping == 1) failed in some setup failure situations.

Details: In the obd setup failure handler, obd_stopping will not necessarily be 1, and obd_set_up should also be checked to determine whether the obd is completely set up.

  • Severity: enhancement

Bugzilla: [14398]

Description: Allow masking D_WARNING, D_ERROR messages from console

Details: Console messages can now be disabled via lnet.printk.

  • Severity: normal

Frequency: always

Bugzilla: [14614]

Description: User code with malformed file open parameter crashes client node

Details: Before packing a join_file req, all related references should be checked carefully in case malformed flags cause a fake join_file req on the client.

  • Severity: normal

Frequency: always

Bugzilla: [10600]

Description: shrink/enlarge qunit size when needed; fix the problem of coarse grain of quota doing harm to quota's accuracy

Details: qunit size will be changed when the quota limit is too low/high; record the pending quota writes in order to get more accurate quota; delete the patch for bug 12588, which is unnecessary once this patch has landed. This bug also contains fixes for bugs 14526, 14299, 14601 and 13794.

  • Severity: normal

Bugzilla: [14225]

Description: LDLM_ENQUEUE races with LDLM_CP_CALLBACK

Details: ldlm_completion_ast() assumes that a lock is granted when the req mode is equal to the granted mode. However, it should also check that LDLM_FL_CP_REQD is not set.
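
The corrected condition can be sketched in Python as below. This is an illustration only: the flag's bit value is a placeholder, not the real Lustre constant, and the function name is hypothetical.

```python
LDLM_FL_CP_REQD = 0x1  # placeholder bit for this sketch, not the real value


def lock_is_granted(req_mode, granted_mode, flags):
    """Treat a lock as granted only if no completion callback is pending."""
    modes_match = req_mode == granted_mode
    callback_pending = bool(flags & LDLM_FL_CP_REQD)
    return modes_match and not callback_pending
```

With matching modes and the flag set, the old check (modes only) would report the lock as granted, while this one does not.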

  • Severity: normal

Bugzilla: [14360]

Description: Heavy nfs access might result in deadlocks

Details: After the ELC code landed, it is now improper to enqueue any mds locks under och_sem, because enqueue might decide to cancel open locks for the same inode we are holding och_sem for.

  • Severity: normal

Bugzilla: [14443]

Description: 35% write performance drop with ldiskfs2 when quotas are on

Details: Enable ext3 journalled quota by default to improve performance when quotas are turned on.

  • Severity: normal

Bugzilla: [13843]

Description: Client eviction while running blogbench

Details: A lot of unlink operations with concurrent I/O can lead to a deadlock causing evictions. To address the problem, the number of outstanding OST_DESTROY requests is now throttled to max_rpcs_in_flight per OSC, and LDLM_FL_DISCARD_DATA blocking callbacks are processed with priority.
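
The throttling part can be pictured with a small single-threaded model in Python. This is a sketch under stated assumptions: the class and method names are hypothetical, and the real code deals with RPCs and sleeping threads rather than a simple queue.

```python
from collections import deque


class DestroyThrottle:
    """Cap the number of destroy requests in flight (illustrative model)."""

    def __init__(self, max_rpcs_in_flight):
        self.limit = max_rpcs_in_flight
        self.in_flight = 0
        self.waiting = deque()

    def submit(self, request):
        """Send now if under the limit, else queue. Returns True if sent."""
        if self.in_flight < self.limit:
            self.in_flight += 1
            return True
        self.waiting.append(request)
        return False

    def complete(self):
        """One request finished; promote the oldest waiter, if any."""
        self.in_flight -= 1
        if self.waiting:
            self.in_flight += 1
            return self.waiting.popleft()
        return None
```

With a limit of 2, a third request is queued until one of the first two completes, so a burst of unlinks cannot occupy every request slot.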

  • Severity: normal

Frequency: RHEL4 only

Bugzilla: [14618]

Description: mkfs is very slow on IA64/RHEL4

Details: A performance regression has been discovered in the MPT Fusion driver between versions 3.02.73rh and 3.02.99.00rh. As a consequence, we have downgraded the MPT Fusion driver in the RHEL4 kernel from 3.02.99.00 to 3.02.73 until this problem is fixed.

  • Severity: normal

Frequency: PPC/PPC64 only

Bugzilla: [14845]

Description: conflicts between asm-ppc64/types.h and lustre_types.h

Details: fix duplicated definitions between asm-ppc64/types.h and lustre_types.h on PPC.

  • Severity: normal

Frequency: PPC/PPC64 only

Bugzilla: [14844]

Description: asm-ppc/segment.h does not exist

Details: fix compile issue on PPC.

  • Severity: normal

Bugzilla: [13805]

Description: data checksumming impacts single node performance

Details: add support for several checksum algorithms. Currently, CRC32 and Adler-32 are supported. The checksum type can be changed on the fly through /proc/fs/lustre/osc/*/checksum_type.
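
To give a feel for the two algorithms, the sketch below computes both checksums over a buffer with Python's standard zlib module. It is not the Lustre implementation: the function name and the "adler32"/"crc32" labels are choices made for this example and are not necessarily the strings accepted by the checksum_type file, and the seed Lustre uses may differ from zlib's default.

```python
import zlib


def page_checksum(data, algorithm="adler32"):
    """Return a 32-bit checksum of `data` using the named algorithm."""
    if algorithm == "adler32":
        return zlib.adler32(data) & 0xFFFFFFFF
    if algorithm == "crc32":
        return zlib.crc32(data) & 0xFFFFFFFF
    raise ValueError("unknown checksum algorithm: %r" % algorithm)
```

Adler-32 is built from two running sums, which makes it cheaper to compute than a table-driven CRC32; that cost difference is the motivation for offering a choice.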

  • Severity: normal

Bugzilla: [14648]

Description: use adler32 for page checksums

Details: when available, use the Adler-32 algorithm instead of CRC32 for page checksums.

  • Severity: normal

Bugzilla: [14864]

Description: better handle error messages in extents code

  • Severity: enhancement

Bugzilla: [14729]

Description: SNMP support enhancement

Details: Add the total number of sampled requests for an MDS node to the SNMP support.

  • Severity: enhancement

Bugzilla: [14748]

Description: Optimize ldlm waiting list processing for PR extent locks

Details: When processing the waiting list for a read extent lock, if we meet an uncontended read lock that is the same as or wider than it, skip processing the rest of the list and immediately return the current conflict status, since we are guaranteed there are no conflicting locks in the rest of the list.

  • Severity: normal

Bugzilla: [14774]

Description: Time out and refuse to reconnect

Details: When the failover node is the primary node, it is possible to have two identical connections in imp_conn_list. We must compare not conn's pointers but NIDs, otherwise we can defeat connection throttling.
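
The difference between comparing pointers and comparing NIDs can be shown with a short Python sketch. The Conn class and find_conn helper are hypothetical stand-ins for the entries of imp_conn_list, not Lustre structures.

```python
class Conn:
    """Stand-in for one connection entry (illustrative only)."""

    def __init__(self, nid):
        self.nid = nid


def find_conn(conn_list, candidate):
    """Look up an existing connection by NID rather than by identity."""
    for conn in conn_list:
        # `conn is candidate` would miss a second entry for the same peer;
        # comparing the NIDs treats both entries as the same connection.
        if conn.nid == candidate.nid:
            return conn
    return None
```

Two distinct objects carrying the same NID are matched by this lookup, which is what keeps a duplicated entry from bypassing the throttling.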

  • Severity: major

Bugzilla: [14775]

Description: Client does not clear its own cache if the answer to reconnect is lost.

Details: Client gets evicted from server. Now client also thinks it is disconnected (or gets ENOTCONN on its operation) and decides to reconnect. Server receives reconnect message, but cannot find export. New export is created that is fully valid (new cookie!), but reply is lost and not reported to client. Client reconnects again and gets back a just-created connection, but it is not new so client thinks it was not evicted and does not do recovery.

  • Severity: normal

Bugzilla: [14483]

Description: Detect stride IO mode in read-ahead

Details: When a client does stride read, read-ahead should detect that and read-ahead pages according to the detected stride pattern.
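
A much simplified sketch of stride detection, in Python. It assumes single-page reads at a constant gap, whereas the real read-ahead code tracks chunks of pages; the function names, the `min_repeats` threshold and the `count` parameter are all hypothetical.

```python
def detect_stride(offsets, min_repeats=3):
    """Return the constant gap between recent page offsets, or None."""
    if len(offsets) < min_repeats + 1:
        return None
    gaps = [b - a for a, b in zip(offsets, offsets[1:])]
    recent = gaps[-min_repeats:]
    stride = recent[0]
    # Only a positive gap repeated min_repeats times counts as a stride.
    if stride > 0 and all(gap == stride for gap in recent):
        return stride
    return None


def readahead_targets(offsets, count=4):
    """Predict the next `count` page offsets along the detected stride."""
    stride = detect_stride(offsets)
    if stride is None:
        return []
    last = offsets[-1]
    return [last + stride * i for i in range(1, count + 1)]
```

Given reads at pages 0, 8, 16 and 24, the sketch detects a stride of 8 and would prefetch pages 32, 40 and so on, instead of the pages immediately following page 24.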

  • Severity: normal

Bugzilla: [15033]

Description: build for X2 fails

Details: fix compile issue on Cray systems.

  • Severity: enhancement

Bugzilla: [13371]

Description: implement readv/writev APIs (aio_read/aio_write in newer kernels)

Details: This greatly improves speed of NFS writes on 2.6 kernels.

  • Severity: normal

Frequency: only on PPC/SLES10

Bugzilla: [14855]

Description: "BITS_PER_LONG is not 32 or 64" in linux/idr.h

Details: On SLES10/PPC, fs.h includes idr.h which requires BITS_PER_LONG to be defined. Add a hack in mkfs_lustre.c to work around this compile issue.

  • Severity: normal

Bugzilla: [14257]

Description: LASSERT on MDS when client holding flock lock dies

Details: The ldlm pool logic depends on the number of granted locks being equal to the number of released locks, which is not true for flock locks, so exclude such locks from consideration.

  • Severity: normal

Bugzilla: [15188]

Description: MDS deadlock with many ll_sync_lov threads and I/O stalled

Details: Use fsfilt_sync() for both the whole filesystem sync and individual file sync to eliminate dangerous inode locking with I_LOCK that can lead to a deadlock.

  • Severity: minor

Bugzilla: [15566]

Description: Update an obsolete wirecheck.c generator

Details: Update wirecheck.c/wirehdr.c and regenerate wiretest.c

  • Severity: normal

Bugzilla: [14712]

Description: Client can panic on open sometimes

Details: It is possible that we try to free an already freed request in ll_file_open in some error cases when we send a request from ll_file_open.

  • Severity: normal

Bugzilla: [14410]

Description: performance in 1.6.3

Details: Force q->max_phys_segments to MAX_PHYS_SEGMENTS on SLES10 to be sure that 1MB requests are not fragmented by the block layer.

  • Severity: normal

Bugzilla: [15198]

Description: LDLM soft lockups - improvement

Details: It is possible to send the lock handle along with each read or write request: the client is already doing a lock match itself, so there is no reason for the OST to redo that search.

  • Severity: normal

Frequency: rare

Bugzilla: [14036]

Description: lfs quota fails with deactivated OSTs

Details: With this patch, three improvements are included:

    1. detete the softlimit in mds and osts when use "lfs quota".
    2. display the inaccurate data in the output of "lfs quota".
    3. try to get quota info when "lfs quota" is executed.
  • Severity: normal

Frequency: rare

Bugzilla: [15776]

Description: Extent locks not granted with no conflicts sometimes.

Details: When a race occurs in the glimpse handler and nothing is returned, we do not reprocess the queue after lock cancel, and that leads to a stall until the next activity on the resource.

  • Severity: major

Bugzilla: [15712]

Description: ksocknal_create_conn() hit ASSERTION during connection race

Details: ksocknal_create_conn() hit ASSERTION during connection race

  • Severity: major

Bugzilla: [13983]

Description: ksocknal_send_hello() hit ASSERTION while connecting race

Details: ksocknal_send_hello() hit ASSERTION while connecting race

  • Severity: major

Bugzilla: [14425]

Description: o2iblnd/ptllnd credit deadlock in a routed config.

Details: o2iblnd/ptllnd credit deadlock in a routed config.

  • Severity: normal

Bugzilla: [14956]

Description: High load after starting lnet

Details: gmlnd should sleep in the rx thread in an interruptible way. Otherwise, the uptime utility reports a high load, which is confusing.

  • Severity: normal

Bugzilla: [14838]

Description: ksocklnd fails to establish connection if accept_port is high

Details: PID remapping must not be done for active (outgoing) connections.

Changes from v1.6.4.2 to v1.6.4.3

Support for kernels: 2.6.5-7.286 (SLES 9), 2.6.9-67.0.4.EL (RHEL 4), 2.6.16.54-0.2.5 (SLES 10), 2.6.18-53.1.13.el5 (RHEL 5), 2.6.18.8 vanilla (kernel.org)

Client support for unpatched kernels: (see http://wiki.lustre.org/index.php?title=Patchless_Client) 2.6.16 - 2.6.22 vanilla (kernel.org)

Due to problems with nested symlinks and FMODE_EXEC (bug 12652), we do not recommend using patchless RHEL4 clients with kernels prior to 2.6.9-55EL (RHEL4U5).

Recommended e2fsprogs version: 1.40.4-cfs1

Note that reiserfs quotas are disabled on SLES 10 in this kernel.

RHEL 4 (patched) and RHEL 5/SLES 10 (patchless) clients behave differently on 'cd' to a removed cwd "./" (refer to Bugzilla 14399).

  • Severity: critical

Bugzilla: 14793

Description: Update to the latest RHEL4 kernel (i.e. 2.6.9-67.0.4.EL) to fix the vulnerabilities described in CVE-2008-0001, CVE-2007-5500 and CVE-2007-4130.


  • Severity: critical

Bugzilla: 14858

Description: Update to the latest SLES10 kernel (i.e. 2.6.16.54-0.2.5) to fix the security problems described in CVE-2008-0007, CVE-2008-0001, CVE-2007-5966 and CVE-2007-6417.


  • Severity: critical

Bugzilla: 14876

Description: Update to the latest RHEL5 kernel (i.e. 2.6.18-53.1.13.el5) to fix the vulnerability described in CVE-2008-0600. This problem could allow local user to gain root privileges.


  • Severity: normal

Frequency: RHEL4 only

Bugzilla: 14618

Description: mkfs is very slow on IA64/RHEL4

Details: A performance regression has been discovered in the MPT Fusion driver between versions 3.02.73rh and 3.02.99.00rh. As a consequence, we have downgraded the MPT Fusion driver in the RHEL4 kernel from 3.02.99.00 to 3.02.73 until this problem is fixed.


  • Severity: major

Bugzilla: 14775

Description: Client does not clear its own cache if the answer to reconnect is lost.

Details: Client gets evicted from server. Now client also thinks it is disconnected (or gets ENOTCONN on its operation) and decides to reconnect. Server receives reconnect message, but cannot find export. New export is created that is fully valid (new cookie!), but client gets a reply that the export is new, and so no recovery should be performed.


Changes from v1.6.4.1 to v1.6.4.2

Support for kernels: 2.6.5-7.286 (SLES 9), 2.6.9-55.0.9.EL (RHEL 4), 2.6.16.53-0.8 (SLES 10), 2.6.18-8.1.14.el5 (RHEL 5), 2.6.18.8 vanilla (kernel.org)

Client support for unpatched kernels: (see http://wiki.lustre.org/index.php?title=Patchless_Client) 2.6.16 - 2.6.22 vanilla (kernel.org)

Due to problems with nested symlinks and FMODE_EXEC (bug 12652), we do not recommend using patchless RHEL4 clients with kernels prior to 2.6.9-55EL (RHEL4U5).

Recommended e2fsprogs version: 1.40.4-cfs1

Note that reiserfs quotas are disabled on SLES 10 in this kernel.

RHEL 4 (patched) and RHEL 5/SLES 10 (patchless) clients behave differently on 'cd' to a removed cwd "./" (refer to Bugzilla 14399).

  • Severity: critical

Frequency: only for relatively new filesystems, when OSTs are in recovery

Bugzilla: 14631

Description: OST objects below id 20000 are deleted, causing data loss

Details: For relatively newly formatted OST filesystem(s), where fewer than 20000 objects have been created on an OST, a bug in MDS->OST orphan recovery could cause those objects to be deleted if the OST was in recovery but the MDS was not. Safety checks in the orphan recovery prevent this if more than 20000 objects were ever created on an OST. If the MDS was also in recovery, the problem was not hit. Only in 1.6.4.1.


  • Severity: major

Frequency: rare, depends on device drivers and load

Bugzilla: 14529

Description: MDS or OSS nodes crash due to stack overflow

Details: Code changes in 1.6.4 increased the stack usage of some functions. In some cases, in conjunction with device drivers that use a lot of stack the MDS (or possibly OSS) service threads could overflow the stack. One change which was identified to consume additional stack has been reworked to avoid the extra stack usage.


Changes from v1.6.4 to v1.6.4.1

Support for networks: socklnd - any kernel supported by Lustre, qswlnd - Qsnet kernel modules 5.20 and later, openiblnd - IbGold 1.8.2, o2iblnd - OFED 1.1 and 1.2, viblnd - Voltaire ibhost 3.4.5 and later, ciblnd - Topspin 3.2.0, iiblnd - Infiniserv 3.3 + PathBits patch, gmlnd - GM 2.1.22 and later, mxlnd - MX 1.2.1 or later, ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x

Support for kernels: 2.6.5-7.286 (SLES 9), 2.6.9-55.0.9.EL (RHEL 4), 2.6.16.53-0.8 (SLES 10), 2.6.18-8.1.14.el5 (RHEL 5), 2.6.18.8 vanilla (kernel.org)

Client support for unpatched kernels: (see http://wiki.lustre.org/index.php?title=Patchless_Client) 2.6.16 - 2.6.22 vanilla (kernel.org)

Due to recently discovered recovery problems, we do not recommend using patchless RHEL 4 clients with this or any earlier release.

Recommended e2fsprogs version: 1.40.2-cfs1

Note that reiserfs quotas are disabled on SLES 10 in this kernel.

  • Severity: major

Bugzilla: 14433

Description: Oops on connection from 1.6.3 client

Frequency: always, on connection from 1.6.3 client

Details: Enable and accept the OBD_CONNECT_LRU_RESIZE flag only if LRU resizing is enabled at configure time. This fixes an oops caused by incorrectly accepting the LRU_RESIZE feature even if --enable-lru-resize is not specified.


Changes from v1.6.3 to v1.6.4

Support for networks: socklnd - any kernel supported by Lustre, qswlnd - Qsnet kernel modules 5.20 and later, openiblnd - IbGold 1.8.2, o2iblnd - OFED 1.1 and 1.2, viblnd - Voltaire ibhost 3.4.5 and later, ciblnd - Topspin 3.2.0, iiblnd - Infiniserv 3.3 + PathBits patch, gmlnd - GM 2.1.22 and later, mxlnd - MX 1.2.1 or later, ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x

Support for kernels: 2.6.5-7.286 (SLES 9), 2.6.9-55.0.9.EL (RHEL 4), 2.6.16.53-0.8 (SLES 10), 2.6.18-8.1.14.el5 (RHEL 5), 2.6.18.8 vanilla (kernel.org)

Client support for unpatched kernels: (see http://wiki.lustre.org/index.php?title=Patchless_Client) 2.6.16 - 2.6.22 vanilla (kernel.org)

Due to recently discovered recovery problems, we do not recommend using patchless RHEL 4 clients with this or any earlier release.

Recommended e2fsprogs version: 1.40.2-cfs1

Note that reiserfs quotas are disabled on SLES 10 in this kernel.

  • Severity: enhancement

Bugzilla: 11686

Description: Console message flood

Details: Make cdls ratelimiting more tunable by adding several tunables in procfs: /proc/sys/lnet/console_{min,max}_delay_centisecs and /proc/sys/lnet/console_backoff.
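
One plausible reading of how these three tunables interact is sketched below in Python. This is an assumption for illustration, not the cdls code: the delay is taken to start at the minimum, grow by the backoff factor each time a message repeats, and be capped at the maximum. The function name is hypothetical.

```python
def next_delay(current, min_delay, max_delay, backoff):
    """Compute the next console delay in centiseconds (illustrative)."""
    if current <= 0:
        # First suppression: start from the minimum delay.
        return min_delay
    # Grow geometrically, but stay within [min_delay, max_delay].
    return min(max(current * backoff, min_delay), max_delay)
```

Under this reading, a minimum of 50, a maximum of 60000 and a backoff of 2 give delays of 50, 100, 200 and so on until the cap is reached.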


  • Severity: enhancement

Bugzilla: 13521

Description: Update kernel patches for SLES10 2.6.16.53-0.8.

Details: Update which_patch & target file for SLES10 latest kernel.


  • Severity: enhancement

Bugzilla: 13128

Description: add --type and --size parameters to lfs find

Details: Enhance lfs find by adding filetype and filesize parameters. Also multiple OBDs can now be specified for the --obd option.


  • Severity: enhancement

Bugzilla: 11270

Description: eliminate client locks in face of contention

Details: file contention detection and lockless i/o implementation for contended files.


  • Severity: enhancement

Bugzilla: 12411

Description: Remove client patches from SLES 10 kernel.

Details: This causes SLES 10 clients to behave as patchless clients even on a Lustre-patched (server) kernel.


  • Severity: enhancement

Bugzilla: 2369

Description: use i_size_read and i_size_write in 2.6 port

Details: replace inode->i_size access with i_size_read/write()


  • Severity: enhancement

Bugzilla: 13454

Description: Add jbd statistics patch for RHEL5 and 2.6.18-vanilla.


  • Severity: enhancement

Bugzilla: 13518

Description: Kernel patches update for RHEL4 2.6.9-55.0.6.

Details: Modify vm-tunables-rhel4.patch.


  • Severity: enhancement

Bugzilla: 13452

Description: Kernel config for 2.6.18-vanilla.

Details: Modify targets/2.6-vanilla.target.in. Add config file kernel-2.6.18-2.6-vanilla-i686.config. Add config file kernel-2.6.18-2.6-vanilla-i686-smp.config. Add config file kernel-2.6.18-2.6-vanilla-x86_64.config. Add config file kernel-2.6.18-2.6-vanilla-x86_64-smp.config.


  • Severity: enhancement

Bugzilla: 13207

Description: adapt the lustre_config script to support the upgrade case

Details: Add "-u" option for lustre_config script to support upgrading 1.4 server targets to 1.6 in parallel.


  • Severity: critical

Frequency: always

Bugzilla: 13751

Description: Kernel patches update for RHEL5 2.6.18-8.1.14.el5.

Details: Modify target file & which_patch. A flaw was found in the IA32 system call emulation provided on AMD64 and Intel 64 platforms. An improperly validated 64-bit value could be stored in the %RAX register, which could trigger an out-of-bounds system call table access. An untrusted local user could exploit this flaw to run code in the kernel (i.e. a root privilege escalation). (CVE-2007-4573)


  • Severity: critical

Frequency: always

Bugzilla: 13748

Description: Update RHEL 4 kernel to fix local root privilege escalation.

Details: Update to the latest RHEL 4 kernel to fix the vulnerability described in CVE-2007-4573. This problem could allow untrusted local users to gain root access.


  • Severity: major

Frequency: occasional

Bugzilla: 14353

Description: excessive CPU consumption on client reduces IO performance

Details: in some cases the ldlm_poold thread is spending too much time trying to cancel locks, and is cancelling them too aggressively and this can severely impact IO performance. Disable the dynamic LRU resize code at build time. It can be re-enabled with configure --enable-lru-resize at build time.


  • Severity: major

Frequency: occasional

Bugzilla: 13917

Description: MDS hang or stay in waiting lock

Details: If a client receives a lock with the CBPENDING flag, ldlm needs to send the lock cancel as a separate RPC, to avoid the situation where the cancel request cannot be processed because all I/O threads are waiting on the lock.


  • Severity: major

Frequency: occasional

Bugzilla: 11710

Description: improve handling of recoverable errors

Details: If a request is processed with an error that is recoverable on the server, the request should be resent; otherwise the page is released from the cache and marked as in error.


  • Severity: normal

Bugzilla: 12302

Description: new userspace socklnd

Details: The old userspace tcpnal that resided in lnet/ulnds/socklnd has been replaced with a new one, usocklnd.


  • Severity: normal

Frequency: occasional

Bugzilla: 13730

Description: Do not fail import if osc_interpret_create gets -EAGAIN

Details: If osc_interpret_create gets -EAGAIN, it immediately exits and wakes up oscc_waitq. After the wakeup, oscc_wait_for_objects calls oscc_has_objects, sees that the OSC has no objects, and calls oscc_internal_create to resend the create request.


  • Severity: normal

Frequency: when removing large files

Bugzilla: 13181

Description: scheduling issue during removal of large Lustre files

Details: Don't take the BKL in fsfilt_ext3_setattr() for 2.6 kernels. It causes scheduling issues when removing large files (17TB in the present case).


  • Severity: normal

Frequency: always

Bugzilla: 13358

Description: 1.4.11 Can't handle directories with stripe set and extended ACLs

Details: Impossible (EPROTO is returned) to access a directory that has a non-default striping and ACLs.


  • Severity: normal

Frequency: only on ppc

Bugzilla: 12234

Description: /proc/fs/lustre/devices broken on ppc

Details: The patch as applied to 1.6.2 doesn't look correct for all arches. We should make sure the type of 'index' is loff_t and then cast explicitly as needed below. Do not assign an explicitly cast loff_t to an int.


  • Severity: normal

Frequency: only for rhel5

Bugzilla: 13616

Description: Kernel patches update for RHEL5 2.6.18-8.1.10.el5.

Details: Modify the target file & which_kernel.


  • Severity: normal

Frequency: if the uninit_groups feature is enabled on ldiskfs

Bugzilla: 13706

Description: e2fsck reports "invalid unused inodes count"

Details: If a new ldiskfs filesystem is created with the "uninit_groups" feature and only a single inode is created in a group then the "bg_unused_inodes" count is incorrectly updated. Creating a second inode in that group would update it correctly.


  • Severity: normal

Frequency: only if filesystem is inconsistent

Bugzilla: 11673

Description: handle "serious error: objid * already exists" more gracefully

Details: If LAST_ID value on disk is smaller than the objects existing in the O/0/d* directories, it indicates disk corruption and causes an LBUG(). If the object is 0-length, then we should use the existing object. This will help to avoid a full fsck in most cases.


  • Severity: normal

Frequency: rarely

Bugzilla: 13570

Description: Avoid grant space > available space when the disk is almost full. Without this patch you might see the error "grant XXXX > available" or some LBUG about grant when the disk is almost full.

Details: In filter_check_grant, for a non-grant cache write, we should check the remaining space with if (*left > ungranted + bytes) instead of (*left > ungranted), because the ungranted space should be increased only when we are sure the remaining space is enough for another "bytes". On the client, we should update cl_avail_grant only if OBD_MD_FLGRANT is set in the reply.
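
The two conditions can be compared side by side in a small Python sketch. The function names are hypothetical; only the comparison itself is taken from the text above.

```python
def old_check(left, ungranted, nbytes):
    """Original test: ignores the size of the write being accounted."""
    return left > ungranted


def new_check(left, ungranted, nbytes):
    """Fixed test: the remaining space must also cover another nbytes."""
    return left > ungranted + nbytes
```

With 100 units left, 90 already ungranted and a 20-unit write, the old test passes even though the write does not fit, while the new test rejects it.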


  • Severity: normal

Frequency: when using O_DIRECT and quotas

Bugzilla: 13930

Description: Incorrect file ownership on O_DIRECT output files

Details: block usage reported by 'lfs quota' does not take into account files that have been written with O_DIRECT.


  • Severity: normal

Frequency: always

Bugzilla: 13976

Description: touch file failed when fs is not full

Details: An OST in recovery should not be discarded by the MDS in alloc_qos(); otherwise we can get ENOSPC while the fs is not full.


  • Severity: normal

Frequency: always

Bugzilla: 13805

Description: data checksumming impacts single node performance

Details: disable checksums by default since it impacts single node performance. It is still possible to enable checksums by default via "configure --enable-checksum", or at runtime via procfs.


  • Severity: minor

Frequency: when lov objid is destroyed

Bugzilla: 14222

Description: mds can't recreate lov objid file.

Details: If the lov objid file is destroyed and the OST with the highest index connects first, the MDS does not get the last objid number from the OST. Also, if the MDS gets the last id from the OST, it does not tell the OSC about it, which produces a warning about a wrong del orphan request.


  • Severity: minor

Frequency: rarely

Bugzilla: 12948

Description: buffer overruns could theoretically occur

Details: llapi_semantic_traverse() modifies the "path" argument by appending values to the end of the original string, and a buffer overrun may occur. Add a buffer overrun check in liblustreapi.


  • Severity: minor

Bugzilla: 13732

Description: change order of libsysio includes

Details: '#include sysio.h' should always come before '#include xtio.h'


Changes from v1.6.2 to v1.6.3

Support for networks: socklnd - any kernel supported by Lustre, qswlnd - Qsnet kernel modules 5.20 and later, openiblnd - IbGold 1.8.2, o2iblnd - OFED 1.1 and 1.2, viblnd - Voltaire ibhost 3.4.5 and later, ciblnd - Topspin 3.2.0, iiblnd - Infiniserv 3.3 + PathBits patch, gmlnd - GM 2.1.22 and later, mxlnd - MX 1.2.1 or later, ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x

Support for kernels: 2.6.5-7.286 (SLES 9), 2.6.9-55.0.2.EL (RHEL 4), 2.6.16.46-0.14 (SLES 10), 2.6.18-8.1.8.el5 (RHEL 5), 2.6.18.8 vanilla (kernel.org)

Client support for unpatched kernels: (see http://wiki.lustre.org/index.php?title=Patchless_Client) 2.6.16 - 2.6.22 vanilla (kernel.org)

Due to recently discovered recovery problems, we do not recommend using patchless RHEL 4 clients with this or any earlier release.

Recommended e2fsprogs version: 1.40.2-cfs1

Note that reiserfs quotas are disabled on SLES 10 in this kernel.

  • Severity: enhancement

Bugzilla: 12192

Description: llapi_file_create() does not allow some changes

Details: add llapi_file_open() that allows specifying the file creation mode and open flags, and also returns an open file handle.


  • Severity: enhancement

Bugzilla: 12743

Description: df doesn't work properly if diskfs blocksize != 4K

Details: Choose biggest blocksize of OST's as the LOV's blocksize.


  • Severity: enhancement

Bugzilla: 11248

Description: merge and cleanup kernel patches.

Details: Remove mnt_lustre_list in vfs_intent-2.6-rhel4.patch.


  • Severity: enhancement

Bugzilla: 13039

Description: RedHat Update kernel for RHEL5

Details: Kernel config file for RHEL5.


  • Severity: enhancement

Bugzilla: 12446

Description: OSS needs multiple precreate threads

Details: Add ability to start more than one create thread per OSS.


  • Severity: enhancement

Bugzilla: 13039

Description: RedHat Update kernel for RHEL5

Details: Modify the kernel config file to be closer to RHEL5.


  • Severity: enhancement

Bugzilla: 13360

Description: Build failure against Centos5 (RHEL5)

Details: Define PAGE_SIZE when it isn't present.


  • Severity: enhancement

Bugzilla: 11401

Description: client-side metadata stat-ahead during readdir (directory readahead)

Details: perform client-side metadata stat-ahead when the client detects readdir and sequential stat of dir entries therein


  • Severity: enhancement

Bugzilla: 11230

Description: Tune the kernel for good SCSI performance.

Details: Set the value of /sys/block/{dev}/queue/max_sectors_kb to the value of /sys/block/{dev}/queue/max_hw_sectors_kb in mount_lustre.
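
The same adjustment can be made by hand. The Python sketch below copies the hardware limit into max_sectors_kb for one device; the function name and the `sys_block` parameter are hypothetical, and writing under /sys/block normally requires root.

```python
from pathlib import Path


def raise_max_sectors(dev, sys_block="/sys/block"):
    """Set max_sectors_kb to max_hw_sectors_kb for one block device."""
    queue = Path(sys_block) / dev / "queue"
    # Read the largest request size the hardware reports it can handle.
    hw_limit = (queue / "max_hw_sectors_kb").read_text().strip()
    # Let the block layer issue requests up to that size.
    (queue / "max_sectors_kb").write_text(hw_limit + "\n")
    return int(hw_limit)
```

Calling `raise_max_sectors("sda")` would apply the tuning to a device named sda; the device name is only an example.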


  • Severity: critical

Frequency: Always for filesystems larger than 2TB on 32-bit systems.

Bugzilla: 13547 , 13627

Description: Data corruption for OSTs that are formatted larger than 2TB on 32-bit servers.

Details: When generating the bio request for lustre file writes the sector number would overflow a temporary variable before being used for the IO. The data reads correctly from Lustre (which will overflow in a similar manner) but other file data or filesystem metadata may be corrupted in some cases.
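
The arithmetic behind the 2TB boundary can be sketched in Python. This assumes 512-byte sectors and models the temporary variable as an unsigned 32-bit integer; the source does not state the exact type, so treat both as assumptions, and the function names as hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector, assumed for this sketch


def sector_64bit(byte_offset):
    """Sector number computed without truncation."""
    return byte_offset // SECTOR_SIZE


def sector_32bit(byte_offset):
    """Sector number after passing through a 32-bit temporary."""
    return (byte_offset // SECTOR_SIZE) & 0xFFFFFFFF
```

2TB is 2^41 bytes, which is exactly 2^32 sectors, so the first sector past 2TB wraps to sector 0 in the 32-bit case. A write aimed at offset 2^41 + 4096 lands on sector 8 instead of sector 2^32 + 8, overwriting unrelated data, and a read of the same offset wraps the same way and therefore appears correct.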


  • Severity: major

Bugzilla: 13236

Description: TOE Kernel panic by ksocklnd

Details: offloaded sockets provide their own implementation of sendpage, can't call tcp_sendpage() directly


  • Severity: major

Bugzilla: 13482

Description: build error

Details: fix typos in gmlnd, ptllnd and viblnd


  • Severity: major

Bugzilla: 12932

Description: obd_health_check_timeout too short

Details: set obd_health_check_timeout as 1.5x of obd_timeout


  • Severity: major

Frequency: only with quota on the root user

Bugzilla: 12223

Description: mds_obd_create error creating tmp object

Details: When the user sets quota on root, llog will be affected and can't create files and write files.


  • Severity: normal

Bugzilla: 12782

Description: /proc/sys/lnet has non-sysctl entries

Details: Updating dump_kernel/daemon_file/debug_mb to use sysctl variables


  • Severity: normal

Bugzilla: 10778

Description: kibnal_shutdown() doesn't finish; lconf --cleanup hangs

Details: races between lnd_shutdown and peer creation prevent lnd_shutdown from finishing.


  • Severity: normal

Bugzilla: 13279

Description: open files rlimit 1024 reached while liblustre testing

Details: ulnds/socklnd must close open socket after unsuccessful 'say hello' attempt.


  • Severity: normal

Frequency: always on directories with default striping set

Bugzilla: 12836

Description: lfs find on -1 stripe looping in lsm_lmm_verify_common()

Details: Avoid lov_verify_lmm_common() on directory with -1 stripe count.


  • Severity: normal

Frequency: Always on ia64 patchless client, and possibly others.

Bugzilla: 12826

Description: Add EXPORT_SYMBOL check for node_to_cpumask symbol.

Details: This allows the patchless client to be loaded on architectures without this export.


  • Severity: normal

Frequency: rare

Bugzilla: 13142

Description: wrong ordering of journal start and llog_add causes deadlock.

Details: In llog_origin_connect, journal start should happen before llog_add, keeping the same order as other functions to avoid the deadlock.


  • Severity: normal

Frequency: occasionally when using NFS

Bugzilla: 13030

Description: "ll_intent_file_open()) lock enqueue: err: -13" with nfs

Details: With NFS, the anon dentry's parent was set to itself in d_alloc_anon(), so in the MDS we use rec->ur_fid1 to find the corresponding dentry rather than using rec->ur_name.


  • Severity: normal

Frequency: Occasionally with failover

Bugzilla: 12459

Description: Client eviction due to failover config

Details: after a connection loss, the lustre client should attempt to reconnect to the last active server first before trying the other potential connections.
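
The ordering rule can be sketched in a few lines of Python. The function name and the way servers are represented (plain strings here) are hypothetical.

```python
def reconnect_order(connections, last_active):
    """Return the order in which to try servers after a connection loss."""
    if last_active in connections:
        # Try the server we were last talking to before any alternative.
        others = [conn for conn in connections if conn != last_active]
        return [last_active] + others
    # No known last-active server: keep the configured order.
    return list(connections)
```

Trying the last active server first avoids jumping to a failover partner that is not currently serving the target, which is what led to the evictions.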


  • Severity: normal

Frequency: only with liblustre clients on XT3

Bugzilla: 12418

Description: evictions taking too long

Details: allow llrd to evict clients directly on OSTs


  • Severity: normal

Bugzilla: 13125

Description: osts not allocated evenly to files

Details: change the condition to increase offset_idx


  • Severity: normal

Bugzilla: 13436

Description: Only disconnect errors should be returned by rq_status.

Details: In the open/enqueue process, errors that cause the client to be disconnected should be returned by rq_status, while other errors should still be returned by the intent, so that mdc or llite will detect them.


  • Severity: normal

Bugzilla: 13600

Description: "lfs find -obd UUID" prints directories

Details: "lfs find -obd UUID" will return all directory names instead of just file names. It is incorrect because the directories do not reside on the OSTs.


  • Severity: normal

Bugzilla: 13596

Description: MDS hang after unclean shutdown of lots of clients

Details: Never resend AST requests.