WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Difference between revisions of "Change Log 1.8"

<small>''(Updated: July 2013)''</small>
 +
=Changes from v1.8.7 to v1.8.8-x1=
 +
Support for networks:
 +
* socklnd  - any kernel supported by Lustre,
 +
* qswlnd    - Qsnet kernel modules 5.20 and later,
 +
* openiblnd - IbGold 1.8.2,
 +
* o2iblnd  - OFED 1.3, 1.4.1, 1.4.2, 1.5.1 and 1.5.2
 +
* viblnd    - Voltaire ibhost 3.4.5 and later,
 +
* ciblnd    - Topspin 3.2.0,
 +
* iiblnd    - Infiniserv 3.3 + PathBits patch,
 +
* gmlnd    - GM 2.1.22 and later,
 +
* mxlnd    - MX 1.2.10 or later,
 +
* ptllnd    - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
 +
 
 +
 
 +
Support for kernels:
 +
* 2.6.32-279.2.1.el6 (RHEL 6)
 +
* 2.6.32-279.2.1.el6 (OEL 6)
 +
Client support for unpatched kernels:
 +
  (see http://wiki.lustre.org/index.php?title=Patchless_Client)
 +
* 2.6.32-279.2.1.el6 (RHEL 6)
 +
* 2.6.32-279.2.1.el6 (OEL 6)
 +
 
 +
Recommended e2fsprogs version:
 +
* 1.42.6.x1-mrp.107-8
 +
 
 +
The async journal commit feature (bug 19128) is off by default
 +
 
 +
Severity  : minor<br>
 +
Bugzilla  : MRP-1086 debug CWARN removed<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : MRP-1053 use mutex for cl_loi_list_lock instead of spinlock
 +
Description: Async page operations are not guaranteed not to block, so a spinlock is not appropriate for protecting structures they access. This patch replaces the spinlock with a mutex.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : MRP-1033 rpc.sh defect: LUSTRE is not set properly<br>
 +
Description: make do_nodes(), do_node() and rpc.sh set LUSTRE more accurately<br>
 +
 
 +
Severity  : minor<br>
 +
Bugzilla  : MRP-1057 check lustre.conf for modprobe<br>
 +
Description: Add a check for /etc/modprobe.d/lustre.conf to get lnet module parameters during testing
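As an illustration of the kind of check described above, here is a minimal sketch that extracts module parameters from modprobe-style text. The helper name `lnet_options` and the sample file content are assumptions made for this example; the actual test-framework change is not reproduced here.

```python
import shlex

def lnet_options(conf_text, module="lnet"):
    """Collect 'options <module> key=value ...' settings from modprobe config text."""
    params = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        fields = shlex.split(line)  # splits on whitespace and strips quotes
        if len(fields) >= 3 and fields[0] == "options" and fields[1] == module:
            for item in fields[2:]:
                key, _, value = item.partition("=")
                params[key] = value
    return params

# Hypothetical /etc/modprobe.d/lustre.conf content, for illustration only.
sample = """
# lnet configuration
options lnet networks=tcp0(eth0)
options ksocklnd credits=256
"""
print(lnet_options(sample))
```

Passing a different `module` argument (for example `"ksocklnd"`) selects the options lines of that module instead.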
 +
 
 +
Severity  : minor<br>
 +
Bugzilla  : MRP-1008 make lustre-iokit rpmbuildable<br>
 +
 
 +
Severity  : minor<br>
 +
Bugzilla  : MRP-1007 update config files for rhel6<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24670 allow building with a wider range of OFED versions<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24668 fix broken sles10 build<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24668 fix for semaphore mess in ext4_ext_walk_space<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24554 noatime fix<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24554 noatime,nodiratime fix<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 20128 Allow objects larger than 2TB in size<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24606 Misc changes<br>
 +
Description:<br>
- Remove unneeded patch file: ext4-store-tree-generation-at-find.patch<br>
- Remove the hack for fsfilt_ext3_statfs()<br>
- Use the correct spec file for rpmbuild<br>
- Update the ChangeLog<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24606 Stop hacking around i_data_sem<br>
 +
Description:<br>
- Let ext4_ext_walk_space() itself handle the semaphore.<br>
- Remove macro WALK_SPACE_HAS_DATA_SEM.<br>
- Redefine macro fsfilt_up_truncate_sem().<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24606 ldiskfs changes for the new kernel<br>
 +
Description: Ldiskfs related changes for kernel 2.6.18-308.24.1.el5:<br>
- Update related patches.<br>
- Add Force over 24TB option.<br>
- Add upstream patch to avoid loading bitmaps from full groups.<br>
- Update the series file.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24606 Update RHEL5 and OEL5 kernel patches<br>
 +
Description: The kernel is updated to 2.6.18-308.24.1.el5.<br>
Details    : Kernel related changes:<br>
- Update some kernel patches to adapt to the new kernel.<br>
- Remove unneeded kernel patch: md-avoid-corrupted-ldiskfs-after-rebuild.patch.<br>
- Add a new upstream patch (soft RAID6 bug): make-bi_phys_segments-uint.patch.<br>
- Update kernel configs, series, targets, etc.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 add OEL6 server support<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 quota fix<br>
 +
Description: specify QFMT_VFS_V1 if available<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 define ext4_mb_discard_inode_preallocations for rhel5<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 disable dump_trace for rhel6<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 use inode version in rhel6 server<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 update ldiskfs patches<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 ldiskfs for 2.6.32-279<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 update to 2.6.32-279<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 long long s_mount_opt for rhel6<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 deadlock fix<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 minor conflict resolving<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 RHEL6 server support<br>
 +
Description: Add RHEL6 server (kernel version is 2.6.32-279.2.1.el6) support. This introduces many changes and new features of ldiskfs (ext4) such as mmp, large EA, fs data in dirent, open file by inode number, etc.
 +
   
 +
NOTE: This patch is only sufficient for mounting; further tuning is needed for other file operations and will be dealt with in later patches.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 19526 conf-sanity test_46a fix<br>
 +
Description: LU-743 conf-sanity: test_46a failure<br>
 +
Details    : The failure occurs because the client had not yet seen the newly added OSTs, so decoding the lsm failed: the number of OSTs exceeded the target count on the client side.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24645 build kernel debuginfo rpm for sles11sp1<br>
 +
Description: In order to build a debuginfo rpm for SLES11 SP1, we need to modify the SLES11 kernel spec file in the following way:<br>
- explicitly declare __debug_package as true(1).<br>
- use debugfiles.list as the %files content instead of the default file in spec.<br>
- change the file attributes.<br>
- ignore some missing/unpackaged files while doing rpmbuild.<br>
Also, we need to increase the BUILD_GEN in order to avoid future RPM reuse of the testing builds.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24596 skip metabench for rhel 6.2 nfs client<br>
 +
Description: rhel 6.2 nfs client bug<br>
 +
Details    : https://bugzilla.redhat.com/show_bug.cgi?id=790729<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24515 test_7 activate osc failed<br>
 +
Description: take into account the possible race between activation from lctl and activation from pinger thread
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 RHEL6 support in b1_8 branch<br>
 +
Description: RHEL6.2 support along with build code refactor.<br>
 +
Details    : This patch is largely based on the patches in the following bugs:<br>
 +
22375 RHEL6 patchless client support.<br>
 +
24089 Avoid reuse cache storage collisions.<br>
 +
24090 Distro and target autodetection.<br>
 +
24091 Find_linux_rpms utility.<br>
 +
24092 Build src.rpm for lustre if requested.<br>
 +
24300 Don't run autogen.sh in the spl and zfs repos.<br>
 +
LU-62  Adds support to build RHEL6 patchless client.<br>
 +
LU-73  Re-org of rhel* build code to max code reuse.<br>
 +
LU-402  Check if dump_trace wants address argument<br>
 +
LU-1116 Update RHEL6.2 kernel to 2.6.32-220.7.1.el6.<br>
 +
For more information, please refer to the individual bug.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 22065 ko2iblnd failover deadlock fix<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 20288 IB bonding & fix kiblnd_check_conns deadlock<br>
 +
Bugzilla  : 20153 IB bonding & fix kiblnd_check_conns deadlock<br>
 +
Description: Combined patch for IB bonding issues of Bug 20288 (att 25001) and Bug 20153 (att 26145) from Atul.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : LU-278 build: Only warn for tag/version mismatch<br>
 +
Description: The configure process should NOT abort just because the most recent tag is not of the form that upstream uses to tag Lustre. Downstream developers may use their own tags, or just add extensions to upstream's version tags.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24458 files sometimes show up as zero size or missing<br>
 +
Description: LU-274 Update LVB from disk when glimpse callback return error<br>
 +
Details    : Client ll_glimpse_callback() can fail to get the inode if the inode has already been cleared, in which case the glimpse callback fails with -ELDLM_NO_LOCK_DATA. The server should therefore update the LVB from disk (in filter_intent_policy()) when it receives such an error from the client.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 22281 This patch combines patches from bug 22281<br>
 +
Description: This patch combines all the patches from bug 22281.<br>
 +
Details    : It mainly deals with the build subsystem:<br>
 +
- add config opts like --downstream-release, --enable-dist, etc.<br>
- add BUILDID support.<br>
- build lustre with an external ldiskfs package.<br>
 +
Check bug 22281 for details.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24450 new test: check bast timeout serialization <br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24646 fix a bug for raid6 driver from upstream<br>
 +
Description: For more info, refer to this link: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=581392
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 20997 skip peer health check for not router<br>
 +
Description: this is patch from LU-630<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24636 compile fix for sles11 when jbd debug is turned on<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24376 do not shrink busy pages<br>
 +
Description: llap_shrink_cache_internal() used to avoid shrinking dirty pages and pages being written. This patch makes it also avoid shrinking pages that are in use.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 23206 osc_precreate, osc_create: check OSCC_FLAG_NOSPC after checking for preallocated objects<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24531 replace generic_write_sync with ll_write_sync<br>
 +
Description: generic_write_sync() takes inode mutex which leads to deadlock because the mutex is taken now in ll_file_aio_write/ll_file_writev.<br>
 +
Details    : replace generic_write_sync() with ll_write_sync() which skips taking of i_mutex<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24419 ldlm_pools_shrink algorithm change<br>
 +
Description:<br>
- shrink namespaces in batches of 64; each batch is implemented as a list<br>
- stop shrinking once the required number of elements has been freed<br>
- make ldlm_pools_recalc operate on namespaces the same way ldlm_pools_shrink does<br>
- use global counters of unused locks on clients and granted locks on servers to avoid iterating over namespaces<br>
- port b=21519 and LU-499, a race between shrink or recalc and namespace_free<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24531 vfs locking simplification and lockless i/o for direct i/o<br>
 +
Description: ll_file_write used to take locks in the following order: "lli_write_sem; ldlm extent lock; inode mutex (taken in generic_file_write)". Direct I/O read, on the other hand, used the opposite order: "inode mutex; ldlm extent lock on server". That led to a deadlock.<br>
Another drawback of the old order was the need to drop the inode mutex on truncate before taking the ldlm extent lock.<br>
This patch fixes the problem by simplifying the locking: it uses a version of the generic_file_write routine that does not take the inode mutex, so the order becomes "inode mutex; ldlm extent lock". That makes lli_write_sem in write and the mutex re-lock in truncate unnecessary.<br>
DIO read takes the inode mutex as before.<br>
One more fix makes sure that fast lock matching is avoided for DIO reads. That fixes yet another deadlock between direct I/O reads: those that got a fast lock locked in the order "ldlm lock; inode mutex", while those that ran lockless reads locked in the opposite order, "inode mutex; ldlm lock on server".<br>
 +
Details    : The below summarizes read, write, truncate locking rules:<br>
 +
read:            trunc sem, ldlm<br>
 +
write:          mutex, ldlm<br>
 +
read direct:    mutex, server ldlm<br>
 +
write direct:    mutex, server ldlm<br>
 +
truncate:        mutex, trunc sem, ldlm<br>
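The locking rules above can be checked mechanically. The following is a minimal sketch, not part of the patch, that turns each acquisition sequence into "taken before" edges and looks for a cycle, which is the condition for a lock-order deadlock. Treating "server ldlm" as the same lock class as "ldlm" is a simplifying assumption, and the function names are hypothetical.

```python
def order_edges(paths):
    """Turn each acquisition sequence into 'taken before' edges."""
    edges = set()
    for seq in paths.values():
        for i, first in enumerate(seq):
            for later in seq[i + 1:]:
                edges.add((first, later))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the lock-order graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True   # reached a node already on the current path
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(n) for n in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(n) for n in list(graph))

# Orders described for the code before the patch.
old_rules = {
    "write": ["lli_write_sem", "ldlm", "mutex"],
    "read direct": ["mutex", "ldlm"],
}
# Orders from the summary table ("server ldlm" folded into "ldlm").
new_rules = {
    "read": ["trunc_sem", "ldlm"],
    "write": ["mutex", "ldlm"],
    "read direct": ["mutex", "ldlm"],
    "write direct": ["mutex", "ldlm"],
    "truncate": ["mutex", "trunc_sem", "ldlm"],
}
print(has_cycle(order_edges(old_rules)), has_cycle(order_edges(new_rules)))
```

The old rules contain both "ldlm before mutex" and "mutex before ldlm", so the check reports a cycle; the new rules form a consistent order (mutex, then trunc sem, then ldlm) and report none.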
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24592 ENOSUPP migratepage<br>
 +
Description: The rhel6 kernel has a "memory compaction" feature which seems to be slightly inaccurate: it fails to set page->private to 0 for pages allocated for migration.<br>
 +
Details    : Detect kernel with that feature and add ENOSUPP migration address space operation as a workaround for the problem<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 23206 handle_async_create(): do not return ENOSPC if there are preallocated objects<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24628 OEL6 support in 1.8 branch<br>
 +
Description: Add OEL6 support in b1_8 branch. Kernel version is 2.6.32-279.2.1.el6.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 24580 RHEL6 support in b1_8 branch<br>
 +
Description: Update RHEL6 patchless client kernel to 2.6.32-279.2.1.el6.<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 23206 return 0 if precreation succeeded even partially<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 20569 count bad lines correctly<br>
 +
Description:<br>
- make parse_buffer() count lines with bogus headers correctly<br>
- simplify end-of-line detection in parse_buffer()<br>
 +
 
 +
Severity  : normal<br>
 +
Bugzilla  : 20569 test_170 fix<br>
 +
Description: use perl instead of sed to process binary files properly; verify that bad and good files differ; minor cleanup<br>
 +
 
 +
 
 +
=Changes from v1.8.6 to v1.8.7=
 +
Support for networks:
 +
* socklnd  - any kernel supported by Lustre,
 +
* qswlnd    - Qsnet kernel modules 5.20 and later,
 +
* openiblnd - IbGold 1.8.2,
 +
* o2iblnd  - OFED 1.3, 1.4.1, 1.4.2, 1.5.1 and 1.5.2
 +
* viblnd    - Voltaire ibhost 3.4.5 and later,
 +
* ciblnd    - Topspin 3.2.0,
 +
* iiblnd    - Infiniserv 3.3 + PathBits patch,
 +
* gmlnd    - GM 2.1.22 and later,
 +
* mxlnd    - MX 1.2.10 or later,
 +
* ptllnd    - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
 +
 
 +
 
 +
 
 +
Server support for kernels:
 +
* 2.6.16.60-0.69.1 (SLES 10),
 +
* 2.6.32.19-0.2.1 (SLES11),
 +
* 2.6.18-194.17.1.el5 (RHEL 5)
 +
* 2.6.18-194.17.1.0.1.el5 (OEL 5)
 +
 
 +
 
 +
Client support for unpatched kernels: see [http://wiki.lustre.org/index.php?title=Patchless_Client "Patchless Client"]
 +
        2.6.16 - 2.6.32 vanilla (kernel.org)
 +
 
 +
 
 +
Recommended e2fsprogs version:
 +
* 1.41.12.2-ora1
 +
 
 +
 
 +
The async journal commit feature (bug 19128) and the cancel lock before replay feature (bug 16774) are disabled by default.
 +
 
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24548 24548]'''
 +
Severity: normal<br>
 +
Description: regression test: make sure that data written concurrently do not get discarded on file close<br>
 +
Details: write_disjoint.c modifications:<br>
- several new options<br>
- minor cleanup (rank=0: open file once; close file at the end; add usage())<br>
New parallel-scale write_disjoint2() regression test. New mpi_run() --quiet option to skip lfs df.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24450 24450]'''
 +
Severity: normal<br>
 +
Description: comment on top of ptlrpc_check_set() update<br>
 +
Details: ptlrpc_check_set() returns result of set_condition hook if it is defined<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24450 24450]'''
 +
Severity: normal<br>
 +
Description: ldlm_run_bl_ast_work: use ptlrpc_set_wait() with condition<br>
 +
Details: ldlm_run_bl_ast_work() sends ASTs in sets of PARALLEL_AST_LIMIT requests: it waits for a whole set to complete, then sends another set and waits again. If at least one request per set times out, the timeouts are serialized.<br>
This patch changes ldlm_run_bl_ast_work() so that, having sent a request set, it waits for any of the sent requests to complete and refills the running set with requests that are yet to be sent. When the number of timing-out requests is smaller than PARALLEL_AST_LIMIT, this is supposed to eliminate the possibility of timeout serialization.<br>
This patch uses the possibility of specifying a wait condition for ptlrpc_set_wait() (proposed in https://bugzilla.lustre.org/attachment.cgi?id=33099).<br>
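To make the difference concrete, here is a small timing simulation of the two strategies described in this entry. It is only a sketch: `LIMIT` stands in for PARALLEL_AST_LIMIT (whose real value is not given here), and the durations and function names are invented for illustration.

```python
import heapq

def batch_time(durations, limit):
    """Send `limit` requests, wait for the whole set, then send the next set."""
    total = 0
    for i in range(0, len(durations), limit):
        total += max(durations[i:i + limit])  # a set finishes with its slowest request
    return total

def refill_time(durations, limit):
    """Keep up to `limit` requests in flight; refill as soon as one completes."""
    in_flight = []  # min-heap of completion times
    now = 0
    for d in durations:
        if len(in_flight) == limit:
            now = heapq.heappop(in_flight)  # wait for the earliest completion
        heapq.heappush(in_flight, now + d)
    return max(in_flight) if in_flight else 0

LIMIT = 4                 # hypothetical stand-in for PARALLEL_AST_LIMIT
TIMEOUT, FAST = 100, 1    # arbitrary time units
# Three sets of four requests, with one timing-out request in each set.
durations = [TIMEOUT, FAST, FAST, FAST] * 3
print(batch_time(durations, LIMIT), refill_time(durations, LIMIT))
```

With whole-set waits the three timeouts add up, while with refilling they overlap, so the total is close to a single timeout as long as fewer than `LIMIT` requests time out.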
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24450 24450]'''
 +
Severity: normal<br>
 +
Description: ptlrpc_set_wait flexibility<br>
 +
Details: ptlrpc_set_wait() waits until all requests in a set complete. This patch makes it possible to specify a condition on which ptlrpc_set_wait() waits, instead of the default condition "no remaining requests". With that it will be possible to add requests to a set as sent ones complete, without waiting for all requests to finish.<br>
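The idea of a wait with a caller-supplied condition can be sketched with a toy model. The `RequestSet` class below is hypothetical and uses simulated completion times; it does not reflect the actual ptlrpc data structures or signatures.

```python
class RequestSet:
    """Toy request set: each entry is the simulated time at which a request completes."""

    def __init__(self, completion_times):
        self.pending = sorted(completion_times)
        self.now = 0

    def add(self, completion_time):
        """Add one more in-flight request to the set."""
        self.pending.append(completion_time)
        self.pending.sort()

    def wait(self, condition=None):
        """Advance simulated time until `condition(self)` holds.

        With no condition, fall back to the default "no remaining requests".
        """
        if condition is None:
            condition = lambda s: not s.pending
        while self.pending and not condition(self):
            self.now = self.pending.pop(0)  # the next request completes
        return self.now

rs = RequestSet([5, 1, 100])
first = rs.wait(lambda s: len(s.pending) < 3)  # returns once any one request completes
rs.add(rs.now + 2)                             # refill the set with a new request
last = rs.wait()                               # default: wait for everything
print(first, last)
```

A caller that wants to keep the set full passes a condition such as "fewer than N requests pending" and adds a new request each time `wait` returns.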
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22936 22936]'''
 +
Severity: normal<br>
 +
Description: remove wrong assertion<br>
 +
Details: The assertion underestimates exp_refcount of obd_export. The exp_refcount is incremented whenever a lock is added to the export's hash table, and with enough RAM there can be millions of locks in memory. Similar problems are reported in 23265, 17924, 24376<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22221 22221]'''
 +
Severity: normal<br>
 +
Description: use read-write semaphore for lov_lock<br>
 +
Details: After obd_getref() was added to lov_prep_async_page(), read performance degraded. lov_getref() uses mutex_down(), so it looks like concurrent reads got stuck on that mutex. This fix replaces the mutex with an r/w semaphore so that reads do not block on it, which restored the performance.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23978 23978]'''
 +
Severity: normal<br>
 +
Description: avoid unnecessary dentry rehashing (v2)<br>
 +
Details: In the patchless case, the sequence __d_drop(); d_rehash_cond() creates a race window in which the dentry incorrectly looks unhashed when it is not. If the dentry is not unhashed, it seems that rehashing can be avoided.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=17764 17764]'''
 +
Severity: normal<br>
 +
Description: accessing files via nfs test<br>
 +
Details: -- add nfsserver MOUNT2 cleanup<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22060 22060]'''
 +
Severity: normal<br>
 +
Description: use interval tree to calculate kms<br>
 +
Details: with an interval tree of locked extents, iteration over the granted list can be avoided, which is supposed to save CPU when granted lock lists are long<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=17764 17764]'''
 +
Severity: normal<br>
 +
Description: correct assertion<br>
 +
Details: an orphan inode can be reached in mds_open when opening by fid, which happens when files are accessed via nfs. Correct the assertion accordingly.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=17764 17764]'''
 +
Severity: normal<br>
 +
Description: accessing files via nfs test<br>
 +
Details: -- new nfsread_orphan_file test    -- rmultiop_start(), rmultiop_stop() modification: add possibility to run several multiop_bg on remote node<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21937 21937]'''
 +
Severity: normal<br>
 +
Description: never resend glimpse ASTs<br>
 +
Details: when a connection to a client fails, the glimpse AST is resent endlessly because the request does not have the rq_noresend flag. Set the flag to avoid resends.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21812 21812]'''
 +
Severity: normal<br>
 +
Description: generate warnings in case of discarding dirty pages<br>
 +
Details: When a client is evicted, dirty pages may get silently discarded. The caller of a successful write(2) will not know that the data written was discarded due to eviction before it could be flushed to the OSS. With this patch the system administrator is warned about dirty page discard.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23858 23858]'''
 +
Severity: normal<br>
 +
Description: do not compare unsigned < 0<br>
 +
Details: this is also supposed to catch overflow of lqs_bwrite_pending<br>
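The reason such a comparison is useless can be shown with a wrapped subtraction. This is a sketch only: the 32-bit width and the helper names are assumptions for illustration, not the actual type of lqs_bwrite_pending or the code in the patch.

```python
import ctypes

def sub_u32(a, b):
    """Subtract the way a 32-bit unsigned counter would, wrapping on underflow."""
    return ctypes.c_uint32(a - b).value

pending = 5
after = sub_u32(pending, 7)   # underflow wraps to a huge positive value
print(after, after < 0)       # the "< 0" test can never fire for an unsigned value

def safe_sub(pending, delta):
    """Check before subtracting, which is one way an underflow can be caught."""
    if delta > pending:
        raise ValueError("counter would underflow")
    return pending - delta
```

Comparing the operands before the subtraction detects the condition that the "< 0" test was meant to catch.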
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24423 24423]'''
 +
Severity: normal<br>
 +
Description: ext3_dx_find_entry: check directory entry consistency before ext3_match<br>
 +
Details: to avoid getting into infinite loop when directory block contains wrong data<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24141 24141]'''
 +
Severity: normal<br>
 +
Description: llite: -EIO instead of LBUG for multi-referenced object<br>
 +
Details: Whenever an inode is used with a DLM lock, the client checks that no other inodes are referencing the same OST object, since this is a sign of filesystem corruption on the MDS (or of some other code bug that behaves this way). If the client detects that the same OST object is referenced from multiple inodes at the same time, it will LASSERT() and print a message to this effect rather than continue to corrupt the data files:<br>
osc_set_data_with_check() ASSERTION(old_inode->i_state & I_FREEING) failed: Found existing inode ffff880587d15d10/222311317/67781718 state 0 in lock: setting data to ffff88046b7f8d50/223489633/67781099<br>
Instead of LASSERTing on this condition, return EIO for this file. This allows the problem to be analyzed and fixed without the need to reboot the client node.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24264 24264]'''
 +
Severity: normal<br>
 +
Description: Avoid corrupt ldiskfs after MD rebuild on RHEL5/CentOS5.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24546 24546]'''
 +
Severity: normal<br>
 +
Description: limit bio size to BIO_MAX_PAGES<br>
 +
Details: this is needed because bio_alloc_bioset()->bvec_alloc_bs() refuses to allocate bigger bios<br>
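The effect of such a limit is that a large I/O has to be split across several bios. The following is a minimal sketch of that chunking; the helper name is hypothetical and the value 256 is assumed here as the BIO_MAX_PAGES constant of kernels of this era.

```python
BIO_MAX_PAGES = 256  # assumed value of the kernel constant

def split_into_bios(npages, max_pages=BIO_MAX_PAGES):
    """Return the page count of each bio needed to cover `npages` pages."""
    sizes = []
    remaining = npages
    while remaining > 0:
        chunk = min(remaining, max_pages)  # never ask for more than the limit
        sizes.append(chunk)
        remaining -= chunk
    return sizes

print(split_into_bios(600))
```

Every chunk except possibly the last is exactly `max_pages` long, so no single allocation exceeds what the allocator accepts.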
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=19944 19944]'''
 +
Severity: normal<br>
 +
Description: set $PTLDEBUG, $SUBSYSTEM and $DEBUG_SIZE values on every node (LU-196)<br>
 +
Details: The current set_default_debug_nodes() could not pass the values of $PTLDEBUG, $SUBSYSTEM and $DEBUG_SIZE to the remote nodes when they were specified on the command line on the local node. This patch fixes the issue.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24437 24437]'''
 +
Severity: normal<br>
 +
Description: fix deadlock caused by original fix b=24525 (LU-146)<br>
 +
Details: Get open lock inside mds_get_parent_child_locked() to avoid deadlock.  Never get open lock if child is newly created to avoid deadlock.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24548 24548]'''
 +
Severity: normal<br>
 +
Description: fix v1<br>
 +
Details: a lock being canceled may cover data that is being sent to OSTs. Change the find_cbdata iterator to take that into account<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24303 24303]'''
 +
Severity: normal<br>
 +
Description: kernel BUG at fs/inode.c:323!<br>
 +
Details: workaround patch to avoid the race at truncate_inode_pages_range()<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24508 24508]'''
 +
Severity: normal<br>
 +
Description: racer: general protection fault (LU-286)<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23485 23485]'''
 +
Severity: normal<br>
 +
Description: fsync for directories<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23884 23884]'''
 +
Severity: normal<br>
 +
Description: allow lnet to talk to gnilnd<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24490 24490]'''
 +
Severity: normal<br>
 +
Description: obdfilter-survey cleanup<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24050 24050]'''
 +
Severity: normal<br>
 +
Description: add an -s option to set an alternative order of services start<br>
 +
Details: -s start services in the order MGS->OST(s)->MDT(s).  The default order is MGS->MDT(s)->OST(s).<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22638 22638]'''
 +
Severity: normal<br>
 +
Description: add lst stat --count<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21103 21103]'''
 +
Severity: normal<br>
 +
Description: ORNL LCE Router features\fixes<br>
 +
Details: Only squawk when md->start is NULL on non-zero length v2<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24512 24512]'''
 +
Severity: normal<br>
 +
Description: lfs find -s doesn't seem to work quite with >2GB args<br>
 +
Details: fix the wrong size type in find_value_cmp()<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22221 22221]'''
 +
Severity: normal<br>
 +
Description: client nodes crash on fs with inactive OST<br>
 +
Details: take lov reference in lov_prep_async_page()<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=20831 20831]'''
 +
Severity: normal<br>
 +
Description: replay-dual: ldlm_lock.c:1622:ldlm_lock_cancel()) LBUG type: PLN<br>
 +
Details: fix a race between do_requeue and client_disconnect_export<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24032 24032]'''
 +
Severity: normal<br>
 +
Description: add lctl push<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=18750 18750]'''
 +
Severity: normal<br>
 +
Description: remove OBD_CHECK_FAIL_CHECK_ONCE<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24464 24464]'''
 +
Severity: normal<br>
 +
Description: Load Lustre modules before mounting targets to avoid race conditions.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24498 24498]'''
 +
Severity: normal<br>
 +
Description: wait_osc_import_state () fixes<br>
 +
Details: -- increase maxtime to wait the timeout of 1st request; take into account at_min value;    -- cleanup wait_osc_import_state () to use _wait_import_state ();    -- ost-pools test_1 fix: use local var instead of global NAME<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24504 24504]'''
 +
Severity: normal<br>
 +
Description: sanity test_133* and check_stats() fix<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24487 24487]'''
 +
Severity: normal<br>
 +
Description: canonicalize the devices names<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21047 21047]'''
 +
Severity: normal<br>
 +
Description: ->commit should always be called after successful ->prep on b1_8<br>
 +
 
 +
 
 +
=Changes from v1.8.5 to v1.8.6=
 +
Support for networks:
 +
* socklnd  - any kernel supported by Lustre,
 +
* qswlnd    - Qsnet kernel modules 5.20 and later,
 +
* openiblnd - IbGold 1.8.2,
 +
* o2iblnd  - OFED 1.3, 1.4.1, 1.4.2, 1.5.1 and 1.5.2
 +
* viblnd    - Voltaire ibhost 3.4.5 and later,
 +
* ciblnd    - Topspin 3.2.0,
 +
* iiblnd    - Infiniserv 3.3 + PathBits patch,
 +
* gmlnd    - GM 2.1.22 and later,
 +
* mxlnd    - MX 1.2.10 or later,
 +
* ptllnd    - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x
 +
 
 +
 
 +
 
 +
Server support for kernels:
 +
* 2.6.16.60-0.42.8 (SLES 10),
 +
* 2.6.27.39-0.3.1 (SLES11),
 +
* 2.6.18-194.3.1.el5 (RHEL 5)
 +
* 2.6.18-194.3.1.0.1.el5 (OEL 5)
 +
 
 +
 
 +
Client support for unpatched kernels: see [http://wiki.lustre.org/index.php?title=Patchless_Client "Patchless Client"]
 +
        2.6.16 - 2.6.30 vanilla (kernel.org)
 +
 
 +
 
 +
Recommended e2fsprogs version:
 +
* 1.41.12.2-ora1
 +
 
 +
 
 +
The async journal commit feature (bug 19128) and the cancel lock before replay feature (bug 16774) are disabled by default.
 +
 
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=19064 19064]'''
 +
Severity: normal<br>
 +
Description: Allow OSTs to be created with no primary node (LU-57)<br>
 +
Details: Add a --servicenode parameter for mkfs.lustre to treat all service nodes equally.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23935 23935]'''
 +
Severity: normal<br>
 +
Description: append truncate race<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21847 21847]'''
 +
Severity: normal<br>
 +
Description: obdfilter-survey: Syntax error in some locales<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21501 21501]'''
 +
Severity: normal<br>
 +
Description: Properly clean up flock lock on disconnect<br>
 +
Details: Properly wake up flock waiters on eviction.  A destroyed lock for the flock completion AST is not an error; return success to avoid a double lock decref.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24437 24437]'''
 +
Severity: normal<br>
 +
Description: revoke open lock for executable files if needed<br>
 +
Details: When a normal Lustre client opens a file for write/exec, the open lock on that file needs to be revoked in case an NFSD Lustre client still holds it.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22729 22729]'''
 +
Severity: normal<br>
 +
Description: Remove LPSZ & LPSSZ<br>
 +
Details: Code cleanup patch for 1.8 which removes the use of LPSZ/LPSSZ to improve the build portability.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24418 24418]'''
 +
Severity: normal<br>
 +
Description: run autogen if a Makefile.am is patched (LU-53)<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21137 21137]'''
 +
Severity: normal<br>
 +
Description: SLES11 with 1.8 is slower than SLES10 with 1.6 for O_DIRECT single file IOR writes<br>
 +
Details: Fix ptlrpc_main() condition to start service threads correctly.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23049 23049]'''
 +
Severity: normal<br>
 +
Description: t-f do_node() VERBOSE fix<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24479 24479]'''
 +
Severity: normal<br>
 +
Description: files and dirs missing in dist tarball (LU-92)<br>
 +
Details: Some files and dirs are missing in the "dist" tarball.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=19494 19494]'''
 +
Severity: normal<br>
 +
Description: "lfs find" hangs when searching for an OST index<br>
 +
Details: - new test_88 "lfs find identifies the missing striped file segments"    - exit_status () egrep pattern fix<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24194 24194]'''
 +
Severity: normal<br>
 +
Description: increase reseed count to mitigate inconsistency in OST allocation<br>
 +
Details: In alloc_rr, "LOV_CREATE_RESEED_MULT" and "LOV_CREATE_RESEED_MIN" are increased to mitigate the inconsistency in OST allocation.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24451 24451]'''
 +
Severity: normal<br>
 +
Description: racer test cleanup<br>
 +
Details: - modify racer/racer.sh to wait for the processes to be killed, and exit 1 if any of them still exist;    - remove runracer;<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=19649 19649]'''
 +
Severity: normal<br>
 +
Description: sanity test_77j fix<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24426 24426]'''
 +
Severity: normal<br>
 +
Description: add ERRLOG suffix to avoid overwriting the lustre logs<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24420 24420]'''
 +
Severity: normal<br>
 +
Description: avoid an LASSERT on recovery<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24375 24375]'''
 +
Severity: normal<br>
 +
Description: Fix a race between completion and enqueue<br>
 +
Details: ldlm_enqueue_tail does not take the proper locks when checking the lock mode to see whether the lock is granted. This leaves a window in which ldlm_handle_completion_ast has updated the lvb with correct data but has not yet updated the lock mode; ldlm_enqueue_tail then checks the lock mode, sees that the lock is not yet granted, and overwrites the correct lvb with the stale value from enqueue time.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24050 24050]'''
 +
Severity: normal<br>
 +
Description: fix lustre_start to start server targets in the order of MGS->MDT->OST(s)<br>
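
As a rough illustration of the ordering described above (this is not the lustre_start code itself), targets can be sorted by type before they are started. The target names and the `start_order` helper are hypothetical.

```python
# Hypothetical sketch: order server targets as MGS -> MDT -> OST(s).
TYPE_PRIORITY = {"mgs": 0, "mdt": 1, "ost": 2}


def start_order(targets):
    """Return (name, type) pairs sorted so the MGS is first, then MDTs, then OSTs.

    sorted() is stable, so targets of the same type keep their input order.
    """
    return sorted(targets, key=lambda target: TYPE_PRIORITY[target[1]])
```

Calling `start_order()` on a mixed list of targets yields the MGS first and the OSTs last, which is the order the fix is said to enforce.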
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24426 24426]'''
 +
Severity: normal<br>
 +
Description: run_one(): run error() once<br>
 +
Details: There is no reason to run error() (and thereby lctl dk) more than once.  The second lctl dk overwrites the most important logs obtained by the first lctl dk.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23787 23787]'''
 +
Severity: normal<br>
 +
Description: Modified struct lprocfs_percpu to be C99 compliant.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24432 24432]'''
 +
Severity: normal<br>
 +
Description: mount_lustre.c/parse_options() fix to differentiate between 'force*' and 'force'<br>
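
One plausible way such confusion arises is a prefix comparison where an exact match is intended. The sketch below is only an illustration, not the parse_options() code; the option name `forceexample` is a made-up placeholder for any 'force*' option.

```python
# Hypothetical sketch: prefix matching treats every 'force*' option as 'force'.
def is_force_prefix(opt):
    """Mimic a strncmp(opt, "force", 5)-style check: true for any 'force*' option."""
    return opt[:5] == "force"


def is_force_exact(opt):
    """True only for the bare 'force' option."""
    return opt == "force"
```

With the exact comparison, `forceexample` is no longer mistaken for `force`.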
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22168 22168]'''
 +
Severity: normal<br>
 +
Description: write-append-truncate: retry write when it receives EINTR.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22984 22984]'''
 +
Severity: normal<br>
 +
Description: change all references to tune.ldiskfs in lustre to tunefs.ldiskfs<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21135 21135]'''
 +
Severity: normal<br>
 +
Description: calculate Use% for "lfs df" the same way as standard "df"<br>
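
A minimal sketch of the calculation, assuming the GNU coreutils convention for "df": the percentage is used / (used + available), rounded up to a whole percent. The function name `use_percent` is hypothetical and this is not the "lfs df" source.

```python
def use_percent(used, available):
    """Return Use% as used / (used + available), rounded up to a whole percent.

    Returns None when used + available is zero, since the ratio is undefined.
    """
    total = used + available
    if total == 0:
        return None
    # Integer ceiling division avoids floating-point rounding surprises.
    return (used * 100 + total - 1) // total
```

Rounding up means a filesystem that holds any data at all never reports 0%.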
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=19944 19944]'''
 +
Severity: normal<br>
 +
Description: adjust debug size to be -gt num_possible_cpus()<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23670 23670]'''
 +
Severity: normal<br>
 +
Description: exit_status () fix<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23430 23430]'''
 +
Severity: normal<br>
 +
Description: fix sanity-quota test 14a to write file in O_DIRECT mode<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24374 24374]'''
 +
Severity: normal<br>
 +
Description: lov_dump_user_lmm_header () fix<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23064 23064]'''
 +
Severity: normal<br>
 +
Description: create proper macro check for bdi interface<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=14846 14846]'''
 +
Severity: normal<br>
 +
Description: dynamically grow/shrink connd threads pool<br>
 +
Details: If multiple nodes are down, all socklnd connds can be blocked for a long time. Increasing the default nconnds works around this, but leaves an unnecessarily large number of threads running at all times. This patch grows and shrinks the connd thread pool dynamically: it creates a new thread when there are pending active connection attempts, and kills some threads when too many connds are idle.<br>
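
The grow/shrink policy can be sketched as a single decision step. This is a simplified model under assumed thresholds, not the socklnd implementation; the function name, parameters and default values are all hypothetical.

```python
def adjust_connds(nthreads, pending, idle, min_threads=2, max_threads=8, max_idle=2):
    """Return the connd thread count after one grow/shrink decision.

    pending: connection attempts waiting for a connd (assumed metric)
    idle:    connds currently doing nothing (assumed metric)
    """
    if pending > idle and nthreads < max_threads:
        return nthreads + 1  # grow: more waiting connects than idle threads
    if idle > max_idle and nthreads > min_threads:
        return nthreads - 1  # shrink: too many idle connds
    return nthreads
```

Bounding the pool on both sides keeps a burst of dead peers from spawning threads without limit, while still letting the pool fall back to its minimum once the backlog clears.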
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24218 24218]'''
 +
Severity: normal<br>
 +
Description: fix contention on ksock_tx_t<br>
 +
Details: If a connection is closed before ksocknal_transmit() returns to ksocknal_process_transmit(), then nobody holds a refcount on conn::ksnc_sock and all pending ZC requests are finalized by ksocknal_connsock_decref->ksocknal_finalize_zcreq, which marks each not-yet-acked ZC request as failed by setting tx::tx_resid = -1.  This is a race because ksocknal_process_transmit() checks tx::tx_resid right after calling ksocknal_transmit(), so it can see tx->tx_resid != 0 with rc == 0 and then hit the later LASSERT(rc < 0).<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23983 23983]'''
 +
Severity: normal<br>
 +
Description: mmp test_10 fix<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23499 23499]'''
 +
Severity: normal<br>
 +
Description: ASSERTION(atomic_read(&client_stat->nid_exp_ref_count) == 0)<br>
 +
Details: In lprocfs_exp_setup(), we need release old stats in all cases.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23729 23729]'''
 +
Severity: normal<br>
 +
Description: cancel_lru_locks not working because some locks are still in cache from mmap files<br>
 +
Details: Fix sanity-benchmark.sh to remove files after fsx otherwise client keeps locks acquired for mmap files in cache.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21581 21581]'''
 +
Severity: normal<br>
 +
Description: change wrong URL<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21581 21581]'''
 +
Severity: normal<br>
 +
Description: Fix a typo.  Add Fedora for the yum cases per Andreas.  (LU-47)<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24427 24427]'''
 +
Severity: normal<br>
 +
Description: hopefully the last libcfs_memory_pressure_* fix for liblustre<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24427 24427]'''
 +
Severity: normal<br>
 +
Description: another userspace fix for libcfs_memory_pressure_restore()<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24427 24427]'''
 +
Severity: normal<br>
 +
Description: define libcfs_memory_pressure_get for userspace<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21581 21581]'''
 +
Severity: normal<br>
 +
Description: too long file / path names for old tar<br>
 +
Details: Instruct automake to use tar's ustar format to prevent errors when pathnames are longer than 99 characters.    - this requires automake >= 1.9, so adjust accordingly      - including dealing with multiple versions of automake installed<br>
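
To see which files would trip the old tar format, the distribution paths can be checked against the 99-character limit mentioned above. This is a sketch only; the helper name and sample paths are hypothetical. The build-system fix itself corresponds to automake's `tar-ustar` option.

```python
V7_NAME_LIMIT = 99  # longest pathname the entry above treats as safe for old tar


def paths_needing_ustar(paths, limit=V7_NAME_LIMIT):
    """Return the paths whose encoded length exceeds the old tar name limit."""
    return [path for path in paths if len(path.encode("utf-8")) > limit]
```

Any path this returns would need the ustar format (or a shorter name) to be archived without error.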
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24410 24410]'''
 +
Severity: normal<br>
 +
Description: exit with error if NFSCLIENT is set, but no nfs export found<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24388 24388]'''
 +
Severity: normal<br>
 +
Description: remove files inadvertently added by previous commit<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24388 24388]'''
 +
Severity: normal<br>
 +
Description: sgpdd-survey fix: use node_var_name () for variables<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21776 21776]'''
 +
Severity: normal<br>
 +
Description: Set PF_MEMALLOC on outgoing path to prevent deadlock on memory allocation under pressure<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22980 22980]'''
 +
Severity: normal<br>
 +
Description: init_logging does not exist in 1.8<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24417 24417]'''
 +
Severity: normal<br>
 +
Description: Update Build-Depends<br>
 +
Details: - remove texlive-latex-recommended as a build requirement    - add missing "| automake1.7 | automake1.8 | automake1.9" to debian/control.main<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24416 24416]'''
 +
Severity: normal<br>
 +
Description: debian packaging fixes<br>
 +
Details: - don't make a patch out of anything in /debian    - exclude noise files from the debian built source tarball    - fake debian/patche{s,d} for make dist    - a few more reasons to run autogen.sh    - figure out if dist tarball needs autogen.sh and include it if so    - look for and run autogen.sh in the build subdir    - make debdiff as part of make dist    - add a debian/source/format file    - mv the orig tarball and the debdiff to the debs dir    - don't try to dist /debian for non-dpkg-using build targets<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24413 24413]'''
 +
Severity: normal<br>
 +
Description: fix for automake > 1.9.6<br>
 +
Details: We seem to be using a Makefile variable that does not exist in more recent versions of automake.  This fixes that problem.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22980 22980]'''
 +
Severity: normal<br>
 +
Description: Support unlocked_ioctl<br>
 +
Details: Adding 'unlocked_ioctl' for performance sensitive ioctls, such as OBD_IOC_BRW_READ/WRITE<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24320 24320]'''
 +
Severity: normal<br>
 +
Description: do not fork a new thread in mem pressure<br>
 +
Details: we already check for PF_MEMALLOC in ldlm shrinker and pass this flag to the blocking thread, but a new thread start was still done with no check for this flag.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24245 24245]'''
 +
Severity: normal<br>
 +
Description: fix SA perf test to support SA disabled by default<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=17275 17275]'''
 +
Severity: normal<br>
 +
Description: make lustre client less verbose at startup time for Cray<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24360 24360]'''
 +
Severity: normal<br>
 +
Description: fix NULL pointer deref in mds_verify_child() when ll_lookup_one_len() fails<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=20563 20563]'''
 +
Severity: normal<br>
 +
Description: Fix fid_flatten() after 1 trillion SEQ numbers<br>
 +
Details: Fix the fid_flatten() function to properly handle FID mapping to 64-bit inode numbers, after the first 1 trillion SEQ numbers have been granted out.  Even with CMD this would only happen after 1024 MDTs have each had 1B client mounts, so there is little risk of introducing collisions as a result of this change, and at worst this is a client-local phenomenon that is not persistent.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=20563 20563]'''
 +
Severity: normal<br>
 +
Description: Fix fid_flatten32() to not lose OID bits<br>
 +
Details: The original implementation of fid_flatten32() was broken due to an error in the shift calculation (note to self - "0x00" is 8 bits, not 16 bits).  This could negatively impact 32-bit clients that were creating more than 64k files in the same directory.  This 32-bit inode number is visible only within a single client mount, is not used in any persistent storage, and only if a 2.x server is in use (which is basically none today) by a 32-bit client, so there is no issue to change it at this time.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22660 22660]'''
 +
Severity: normal<br>
 +
Description: Return kernel's locking return code when lustre reports success<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23352 23352]'''
 +
Severity: normal<br>
 +
Description: modified value of at_min is not taken into account<br>
 +
Details: xxx<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22378 22378]'''
 +
Severity: normal<br>
 +
Description: Correct MDS client stats<br>
 +
Details: sanity test_133b fails with "The getattr counter on mds is wrong" message.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=15962 15962]'''
 +
Severity: normal<br>
 +
Description: disable statahead by default due to important races found in the code<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22882 22882]'''
 +
Severity: normal<br>
 +
Description: MMP might sleep negative time<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21456 21456]'''
 +
Severity: normal<br>
 +
Description: Patch to support lnet v1 pings in 'lctl ping'<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23988 23988]'''
 +
Severity: normal<br>
 +
Description: Remove sd iostats patch from sles11 patch series<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24039 24039]'''
 +
Severity: normal<br>
 +
Description: actually add exit_traps.sh to EXTRA_DIST<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23122 23122]'''
 +
Severity: normal<br>
 +
Description: make exit_traps.sh executable<br>
 +
Details: While bug 24093 added exit_traps.sh to the make dist list, it is not an executable file to start with.  Fix this in the git repo.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24093 24093]'''
 +
Severity: normal<br>
 +
Description: not all build files/scripts being distributed<br>
 +
Details: Some files that need to be distributed are not included in the tarball when make dist is run.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24087 24087]'''
 +
Severity: normal<br>
 +
Description: reverse order of $LINUX{,_OBJ}/include<br>
 +
Details: It is important that /usr/src/linux-...-obj/include is searched for includes before /usr/src/linux-.../include, so that the inclusion of "include/linux/autoconf.h" picks up the one for the kernel we are trying to build against, and not the one for the currently running kernel, which is what the copy in /usr/src/linux-.../ is.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24294 24294]'''
 +
Severity: normal<br>
 +
Description: test_pios: take the OST sizes into account; remove the obsolete bug 19657 workaround<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23793 23793]'''
 +
Severity: normal<br>
 +
Description: MOUNTOPT "-o" cleanup<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23051 23051]'''
 +
Severity: normal<br>
 +
Description: improve summary of acc-sm to include test times<br>
 +
Details: acceptance-small test suites name cleanup:    - rename sanityN -> sanityn, lfscktest -> lfsck    - add racer.sh, liblustre.sh scripts    - remove fsx,bonnie,dbench,iozone.lfsck parts<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23051 23051]'''
 +
Severity: normal<br>
 +
Description: improve summary of acc-sm to include test times<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23081 23081]'''
 +
Severity: normal<br>
 +
Description: Move llap page to tail instead of head.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24226 24226]'''
 +
Severity: normal<br>
 +
Description: typo fix for sanity test 72<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=20394 20394]'''
 +
Severity: normal<br>
 +
Description: correct check for transno value in filter_finish_transno<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24048 24048]'''
 +
Severity: normal<br>
 +
Description: Set body->eadatasize in mdc_getattr_pack()<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=18717 18717]'''
 +
Severity: normal<br>
 +
Description: make "lfs check" output consistent on stdout<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23049 23049]'''
 +
Severity: normal<br>
 +
Description: canonicalize disk names<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23049 23049]'''
 +
Severity: normal<br>
 +
Description: various t-f.sh patches<br>
 +
Details: rundbench is a bash script;    obdfilter-survey is a bash script;    don't su if MPI_USER == "";<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23049 23049]'''
 +
Severity: normal<br>
 +
Description: set path to truncate<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22544 22544]'''
 +
Severity: normal<br>
 +
Description: delete module_setup.sh<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24039 24039]'''
 +
Severity: normal<br>
 +
Description: lfs setstripe --pool broken<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24239 24239]'''
 +
Severity: normal<br>
 +
Description: use SAMPLE_FILE instead of termcap<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24266 24266]'''
 +
Severity: normal<br>
 +
Description: increase replay-single test_70d dbench duration for HARD failure mode<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24226 24226]'''
 +
Severity: normal<br>
 +
Description: Only force the mode change if we're changing the size as well<br>
 +
Details: The offending code was added by commit 77ba4b2141d04180211efa8a75c11ab0abf7fafb to remove setgid/setuid bits when do_truncate() is called on the file. We should only force the change when that occurs, similarly to ll_setattr() in lustre/llite/llite_lib.c<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=19808 19808]'''
 +
Severity: normal<br>
 +
Description: fix d_obtain_alias() misuse due to compat macro<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24055 24055]'''
 +
Severity: normal<br>
 +
Description: a patch to detect if quota is turned on properly<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22546 22546]'''
 +
Severity: normal<br>
 +
Description: fix errors in test_18c<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24245 24245]'''
 +
Severity: normal<br>
 +
Description: skip sanity test 123 under 1.8 <-> 2.x interoperability mode<br>
 +
Details: statahead is disabled automatically under 1.8 <-> 2.x interoperability mode<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23821 23821]'''
 +
Severity: normal<br>
 +
Description: Limit bio_alloc() to BIO_MAX_PAGES iovecs.<br>
 +
Details: Fix logic error when patch was originally landed from b=9945.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23786 23786]'''
 +
Severity: normal<br>
 +
Description: make lh_exit code C99 compliant<br>
 +
Details: Based on the patch from Kenneth D. Matney, Sr.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23157 23157]'''
 +
Severity: normal<br>
 +
Description: do not crash on wrong network message in filter_connect_internal<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24270 24270]'''
 +
Severity: normal<br>
 +
Description: need to mkdir mntpt before mount<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=16605 16605]'''
 +
Severity: normal<br>
 +
Description: don't LASSERT on unverified client data in filter_parent<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=13698 13698]'''
 +
Severity: normal<br>
 +
Description: llapi_get_version<br>
 +
Details: this uses OBD_GET_VERSION ioctl to obtain lustre version<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23961 23961]'''
 +
Severity: normal<br>
 +
Description: fix for setup with several network interfaces<br>
 +
Details: - metadata-updates fix for setups where several interfaces are UP on a host; the hostname could be assigned to an IP different from the lnet network in use, so the hostnames of NODES_TO_USE are now stored in HOSTS    - new SHUTDOWN_ATTEMPTS: the tunable number of attempts to shut down a node    - shutdown_node_hard () fix: do not call "power off" each time; wait until the node is no longer pingable before the next "power off" attempt    - unused check_port() is removed<br>
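
The shutdown_node_hard () behaviour described above can be modelled as a bounded retry loop that polls for the node to stop answering pings before powering it off again. The real helper is a test-framework shell function; the Python sketch below only illustrates the control flow, and its callables, parameter names and defaults are hypothetical.

```python
import time


def shutdown_node_hard(power_off, is_pingable, attempts=3, polls=5, wait_s=0.0):
    """Issue power_off() at most `attempts` times.

    After each power-off, poll is_pingable() up to `polls` times and return
    True as soon as the node stops answering. Return False if it is still
    pingable after the last attempt.
    """
    for _ in range(attempts):
        power_off()
        for _ in range(polls):
            if not is_pingable():
                return True
            time.sleep(wait_s)
    return False
```

Here `attempts` plays the role of the SHUTDOWN_ATTEMPTS tunable; the polling loop is what keeps "power off" from being issued again while the node is still going down.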
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=4424 4424]'''
 +
Severity: normal<br>
 +
Description: Reserve obd_connect_data.ocd_max_easize field<br>
 +
Details: To avoid potential incompatible changes between b1_8 and master, reserve the ocd_max_easize field.  The corresponding connect flag OBD_CONNECT_MAX_EASIZE has been reserved for some time already.  Add several other OBD_CONNECT_ flags that have already been defined to the wirecheck/wiretest tools.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22376 22376]'''
 +
Severity: normal<br>
 +
Description: sanity test for non-root exec-only file execution<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23766 23766]'''
 +
Severity: normal<br>
 +
Description: interop bits for sanity/203<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24118 24118]'''
 +
Severity: normal<br>
 +
Description: test_70b rundbench load failed<br>
 +
Details: - give rundbench a chance to start before the dbench load check    - new check_for_process () and killall_process () to check/kill any defined progs instead of "dbench" only    - fix 70a, 70b to mount the clients on MOUNT instead of DIR<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24228 24228]'''
 +
Severity: normal<br>
 +
Description: fix test duration check to be more accurate<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23535 23535]'''
 +
Severity: normal<br>
 +
Description: sgpdd-survey.sh should check for sg_map<br>
 +
Details: check that iokit sgpdd-survey and sg_map are installed<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=22157 22157]'''
 +
Severity: normal<br>
 +
Description: combined mgs/mds fix for single node setup<br>
 +
Details: For a combined mgs/mds configuration on a single-node setup, we do not need to unload the modules because conf-sanity keeps the mgs mounted during all tests.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23402 23402]'''
 +
Severity: normal<br>
 +
Description: mmp_fini () multiple oss fix<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23575 23575]'''
 +
Severity: normal<br>
 +
Description: O2iblnd credit deadlock regression<br>
 +
Details: This fixed a regression of bug 14425.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23868 23868]'''
 +
Severity: normal<br>
 +
Description: fix "sanity-quota test_18c: @@@@@@ FAIL: quotaon failed!"<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23954 23954]'''
 +
Severity: normal<br>
 +
Description: MGS device has stopped when we try to start the second mgs<br>
 +
Details: add test_24b to the ALWAYS_EXCEPT list for configurations where mgs/mds are not combined<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23869 23869]'''
 +
Severity: normal<br>
 +
Description: HARD failure mode fixes<br>
 +
Details: facet_failover() has to restart only those affected facets which were UP before the node failure.  replay-single tests which use shutdown_facet() && reboot_facet() instead of facet_failover() have to take care of the affected facets themselves.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23956 23956]'''
 +
Severity: normal<br>
 +
Description: change conf-sanity test_37 to be functional on remote setup<br>
 +
Details: fix test_37 to not be skipped on remote setup; use the existing mds device instead of creating a new one<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24020 24020]'''
 +
Severity: normal<br>
 +
Description: lustre doesn't start with ext4 based ldiskfs.<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=24201 24201]'''
 +
Severity: normal<br>
 +
Description: add procfs tunable to enable/disable lockless direct I/O<br>
 +
Details: llite.lustre-*.lockless_direct_io=0 will disable default semantics of direct I/O that forces it to be lockless. lockless_direct_io value, however, will be ignored if per-file LL_FILE_LOCKED_DIRECTIO bit is set.<br>
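
The interaction between the tunable and the per-file bit can be summarised as a small decision function. This is a sketch of the semantics as described above, assuming the per-file bit forces locked I/O as its name suggests; the function name is hypothetical and the flag value is a placeholder for illustration only.

```python
LL_FILE_LOCKED_DIRECTIO = 0x8  # placeholder bit value, for illustration only


def direct_io_is_lockless(lockless_direct_io, file_flags):
    """Decide whether a direct I/O request is performed without locks.

    lockless_direct_io: value of the llite tunable (0 disables lockless I/O)
    file_flags:         per-file flag bits
    """
    if file_flags & LL_FILE_LOCKED_DIRECTIO:
        return False  # per-file bit wins; the tunable is ignored
    return lockless_direct_io != 0
```

The per-file bit takes precedence, so setting the tunable to 1 does not make I/O lockless on a file that has the bit set.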
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21804 21804]'''
 +
Severity: normal<br>
 +
Description: make sure the request is protected by rq_refcount while<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=21760 21760]'''
 +
Severity: normal<br>
 +
Description: start bulk unregistering at the same time as reply unlink<br>
 +
 
 +
*'''Bugzilla: [https://bugzilla.lustre.org/show_bug.cgi?id=23820 23820]'''
 +
Severity: normal<br>
 +
Description: ptlrpc_check_set()) ASSERTION(req->rq_phase == RQ_PHASE_BULK) failed<br>
 +
Details: Handle unsent requests with rq_net_err in ptlrpc_check_set().<br>
  
 
=Changes from v1.8.4 to v1.8.5=
 
 
Severity: normal<br>
 
 
Description: fix LBUG when obdfilter-survey is interrupted.<br>
 
 
  
 
=Changes from v1.8.3 to v1.8.4=
 

Latest revision as of 04:09, 24 July 2013

(Updated: July 2013)

Changes from v1.8.7 to v1.8.8-x1

Support for networks:

  • socklnd - any kernel supported by Lustre,
  • qswlnd - Qsnet kernel modules 5.20 and later,
  • openiblnd - IbGold 1.8.2,
  • o2iblnd - OFED 1.3, 1.4.1, 1.4.2, 1.5.1 and 1.5.2
  • viblnd - Voltaire ibhost 3.4.5 and later,
  • ciblnd - Topspin 3.2.0,
  • iiblnd - Infiniserv 3.3 + PathBits patch,
  • gmlnd - GM 2.1.22 and later,
  • mxlnd - MX 1.2.10 or later,
  • ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x


Support for kernels:

  • 2.6.32-279.2.1.el6 (RHEL 6)
  • 2.6.32-279.2.1.el6 (OEL 6)

Client support for unpatched kernels: (see http://wiki.lustre.org/index.php?title=Patchless_Client)

  • 2.6.32-279.2.1.el6 (RHEL 6)
  • 2.6.32-279.2.1.el6 (OEL 6)

Recommended e2fsprogs version:

  • 1.42.6.x1-mrp.107-8

The async journal commit feature (bug 19128) is off by default

Severity : minor
Bugzilla : MRP-1086 debug CWARN removed

Severity : normal
Bugzilla : MRP-1053 use mutex for cl_loi_list_lock instead of spinlock
Description: Async page operations are not guaranteed not to block, so a spinlock is not appropriate for protecting the structures they access. This patch replaces the spinlock with a mutex.

Severity : normal
Bugzilla : MRP-1033 rpc.sh defect: LUSTRE is not set properly
Description: Make do_nodes(), do_node() and rpc.sh set LUSTRE more accurately

Severity : minor
Bugzilla : MRP-1057 check lustre.conf for modprobe
Description: Add a check for /etc/modprobe.d/lustre.conf to get lnet module parameters during testing

Severity : minor
Bugzilla : MRP-1008 make lustre-iokit rpmbuildable

Severity : minor
Bugzilla : MRP-1007 update config files for rhel6

Severity : normal
Bugzilla : 24670 allow building against a wider range of OFED versions

Severity : normal
Bugzilla : 24668 fix broken sles10 build

Severity : normal
Bugzilla : 24668 fix for semaphore mess in ext4_ext_walk_space

Severity : normal
Bugzilla : 24554 noatime fix

Severity : normal
Bugzilla : 24554 noatime,nodiratime fix

Severity : normal
Bugzilla : 20128 Allow objects larger than 2TB in size

Severity : normal
Bugzilla : 24606 Misc changes
Description:
  • Remove unneeded patch file: ext4-store-tree-generation-at-find.patch
  • Remove the hack for fsfilt_ext3_statfs()
  • Use the correct spec file for rpmbuild
  • Update the ChangeLog

Severity : normal
Bugzilla : 24606 Stop hacking around i_data_sem
Description:
  • Let ext4_ext_walk_space() itself handle the semaphore.
  • Remove macro WALK_SPACE_HAS_DATA_SEM.
  • Redefine macro fsfilt_up_truncate_sem().

Severity : normal
Bugzilla : 24606 ldiskfs changes for the new kernel
Description: Ldiskfs related changes for kernel 2.6.18-308.24.1.el5:
  • Update related patches.
  • Add Force over 24TB option.
  • Add upstream patch to avoid loading bitmaps from full groups.
  • Update the series file.

Severity : normal
Bugzilla : 24606 Update RHEL5 and OEL5 kernel patches
Description: The kernel is updated to 2.6.18-308.24.1.el5.
Details : Kernel related changes:
  • Update some kernel patches to adapt to the new kernel.
  • Remove unneeded kernel patch: md-avoid-corrupted-ldiskfs-after-rebuild.patch.
  • Add a new upstream patch (soft RAID6 bug): make-bi_phys_segments-uint.patch.
  • Update kernel configs, series, and targets, etc.

Severity : normal
Bugzilla : 24580 add OEL6 server support

Severity : normal
Bugzilla : 24580 quota fix
Description: specify QFMT_VFS_V1 if available

Severity : normal
Bugzilla : 24580 define ext4_mb_discard_inode_preallocations for rhel5

Severity : normal
Bugzilla : 24580 disable dump_trace for rhel6

Severity : normal
Bugzilla : 24580 use inode version in rhel6 server

Severity : normal
Bugzilla : 24580 update ldiskfs patches

Severity : normal
Bugzilla : 24580 ldiskfs for 2.6.32-279

Severity : normal
Bugzilla : 24580 update to 2.6.32-279

Severity : normal
Bugzilla : 24580 long long s_mount_opt for rhel6

Severity : normal
Bugzilla : 24580 deadlock fix

Severity : normal
Bugzilla : 24580 minor conflict resolving

Severity : normal
Bugzilla : 24580 RHEL6 server support
Description: Add RHEL6 server (kernel version is 2.6.32-279.2.1.el6) support. This introduces many changes and new features of ldiskfs (ext4) such as mmp, large EA, fs data in dirent, open file by inode number, etc.

NOTE: This patch only suffices for mounting; further tuning is needed for other file operations, which will be dealt with in later patches.

Severity : normal
Bugzilla : 19526 conf-sanity test_46a fix
Description: LU-743 conf-sanity: test_46a failure
Details : The failure occurred because the client had not yet seen the newly added OSTs, so decoding the lsm failed: the number of OSTs exceeded the target count on the client side.

Severity : normal
Bugzilla : 24645 build kernel debuginfo rpm for sles11sp1
Description: In order to build the debuginfo rpm for SLES11 SP1, we need to modify the SLES11 kernel spec file in the following way:
- explicitly declare __debug_package as true (1).
- use debugfiles.list as the %files content instead of the default file in spec.
- change the file attributes.
- ignore some missing/unpackaged files while doing rpmbuild.
Also, we need to increase the BUILD_GEN in order to avoid future RPM reuse of the testing builds.

Severity : normal
Bugzilla : 24596 skip metabench for rhel 6.2 nfs client
Description: rhel 6.2 nfs client bug
Details : https://bugzilla.redhat.com/show_bug.cgi?id=790729

Severity : normal
Bugzilla : 24515 test_7 activate osc failed
Description: take into account the possible race between activation from lctl and activation from pinger thread

Severity : normal
Bugzilla : 24580 RHEL6 support in b1_8 branch
Description: RHEL6.2 support along with build code refactor.
Details : This patch is largely based on the patches in the following bugs:
22375 RHEL6 patchless client support.
24089 Avoid reuse cache storage collisions.
24090 Distro and target autodetection.
24091 Find_linux_rpms utility.
24092 Build src.rpm for lustre if requested.
24300 Don't run autogen.sh in the spl and zfs repos.
LU-62 Adds support to build RHEL6 patchless client.
LU-73 Re-org of rhel* build code to max code reuse.
LU-402 Check if dump_trace wants address argument
LU-1116 Update RHEL6.2 kernel to 2.6.32-220.7.1.el6.
For more information, please refer to the individual bug.

Severity : normal
Bugzilla : 22065 ko2iblnd failover deadlock fix

Severity : normal
Bugzilla : 20288 IB bonding & fix kiblnd_check_conns deadlock
Bugzilla : 20153 IB bonding & fix kiblnd_check_conns deadlock
Description: Combined patch for IB bonding issues of Bug 20288 (att 25001) and Bug 20153 (att 26145) from Atul.

Severity : normal
Bugzilla : LU-278 build: Only warn for tag/version mismatch
Description: The configure process should NOT abort just because the most recent tag is not of the form that upstream uses to tag Lustre. Downstream developers may use their own tags, or just add extensions to upstream's version tags.

Severity : normal
Bugzilla : 24458 files sometimes show up as zero size or missing
Description: LU-274 Update LVB from disk when glimpse callback returns error
Details : Client ll_glimpse_callback() could fail to get the inode if the inode has already been cleared. In that case the glimpse callback fails with -ELDLM_NO_LOCK_DATA, so the server should update the LVB from disk (in filter_intent_policy()) when it receives such an error from the client.

Severity : normal
Bugzilla : 22281 This patch combines patches from bug 22281
Description: This patch combines all the patches from bug 22281.
Details : It mainly deals with the build subsystem:
- add config opts like --downstream-release, --enable-dist, etc.
- add BUILDID support.
- build lustre with an external ldiskfs package.
Check bug 22281 for details.

Severity : normal
Bugzilla : 24450 new test: check bast timeout serialization

Severity : normal
Bugzilla : 24646 fix a bug for raid6 driver from upstream
Description: For more info, refer to this link: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=581392

Severity : normal
Bugzilla : 20997 skip peer health check for not router
Description: this is patch from LU-630

Severity : normal
Bugzilla : 24636 compile fix for sles11 when jbd debug is turned on

Severity : normal
Bugzilla : 24376 do not shrink busy pages
Description: llap_shrink_cache_internal() used to avoid shrinking dirty pages and pages being written. This patch makes it also avoid shrinking pages which are in use.

Severity : normal
Bugzilla : 23206 osc_precreate, osc_create: check OSCC_FLAG_NOSPC after checking for preallocated objects

Severity : normal
Bugzilla : 24531 replace generic_write_sync with ll_write_sync
Description: generic_write_sync() takes inode mutex which leads to deadlock because the mutex is taken now in ll_file_aio_write/ll_file_writev.
Details : replace generic_write_sync() with ll_write_sync() which skips taking of i_mutex

Severity : normal
Bugzilla : 24419 ldlm_pools_shrink algorithm change
Description: -shrink namespaces by batches of 64 namespaces, the batch is implemented as list
-stop shrinking once required number of elements is freed
-have ldlm_pools_recalc to operate with namespaces similar to ldlm_pools_shrink
-use global counters of unused locks on clients and granted locks on servers to avoid iterating over namespaces
-port b=21519&LU-499, a race between shrink or recalc and namespace_free
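
The batching and early-stop behaviour listed above can be sketched roughly as follows. This is a simplified Python illustration, not the Lustre code: the function name shrink_in_batches, the dictionary-based namespace model and the greedy per-namespace freeing are assumptions made for the example. Only the batch size of 64 and the "stop once enough is freed" rule come from the description.

```python
def shrink_in_batches(namespaces, nr_to_free, batch_size=64):
    """Free up to nr_to_free unused locks, walking namespaces batch by batch.

    namespaces: list of dicts with an "unused" lock count (hypothetical model).
    Returns the number of locks actually freed.
    """
    freed = 0
    for start in range(0, len(namespaces), batch_size):
        batch = namespaces[start:start + batch_size]
        for ns in batch:
            # Stop as soon as the required number of elements is freed.
            if freed >= nr_to_free:
                return freed
            take = min(ns["unused"], nr_to_free - freed)
            ns["unused"] -= take
            freed += take
    return freed


namespaces = [{"unused": 10} for _ in range(200)]
freed = shrink_in_batches(namespaces, nr_to_free=25)
untouched = sum(1 for ns in namespaces if ns["unused"] == 10)
```

In this toy run only the first three namespaces are touched; the remaining ones are never visited because the target is reached inside the first batch.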

Severity : normal
Bugzilla : 24531 vfs locking simplification and lockless i/o for direct i/o
Description: ll_file_write used to lock in the following order:
"lli_write_sem; ldlm extent lock; inode mutex (taken in generic_file_write)". OTOH, direct I/O read used opposite order: "inode mutex; ldlm extent lock on server". That led to a deadlock.

Another drawback of that order is the need to drop the inode mutex on truncate before taking the ldlm extent lock.

This patch fixes the problem by simplifying the locking: it uses a version of the generic_file_write routine which does not take the inode mutex, so the write path locks in the order "inode mutex; ldlm extent lock". That makes lli_write_sem in write and the mutex re-lock in truncate unnecessary.

DIO read takes the inode mutex as before.

One more fix is to make sure that fast lock matching is avoided in case of DIO read. That fixes yet another deadlock between direct i/o reads: those which got a fast lock locked in the order "ldlm lock; inode mutex", while those which ran lockless reads locked in the opposite order: "inode mutex; ldlm lock on server".
Details : The below summarizes read, write, truncate locking rules:
read: trunc sem, ldlm
write: mutex, ldlm
read direct: mutex, server ldlm
write direct: mutex, server ldlm
truncate: mutex, trunc sem, ldlm
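
One way to see why the old orders could deadlock and the rules above cannot is to treat each rule as a chain of "taken before" edges and look for a cycle. The Python sketch below does that. It is only an illustration: the function name and lock labels are made up for the example, and the old direct I/O read's "ldlm extent lock on server" is modelled as the same lock as the client-side ldlm extent lock, which is an assumption.

```python
def has_lock_order_cycle(paths):
    """Return True if the lock orders in `paths` contain a cycle (possible deadlock)."""
    edges = {}
    for order in paths.values():
        for first, second in zip(order, order[1:]):
            edges.setdefault(first, set()).add(second)
            edges.setdefault(second, set())

    WHITE, GREY, BLACK = 0, 1, 2
    colour = {lock: WHITE for lock in edges}

    def visit(lock):
        # Depth-first search; reaching a GREY node again means a cycle.
        colour[lock] = GREY
        for nxt in edges[lock]:
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[lock] = BLACK
        return False

    return any(colour[lock] == WHITE and visit(lock) for lock in edges)


# Orders described for the code before the patch (simplified labels).
old_rules = {
    "write": ["lli_write_sem", "ldlm", "mutex"],
    "read direct": ["mutex", "ldlm"],
}

# Orders from the summary table above.
new_rules = {
    "read": ["trunc sem", "ldlm"],
    "write": ["mutex", "ldlm"],
    "read direct": ["mutex", "server ldlm"],
    "write direct": ["mutex", "server ldlm"],
    "truncate": ["mutex", "trunc sem", "ldlm"],
}

old_can_deadlock = has_lock_order_cycle(old_rules)
new_can_deadlock = has_lock_order_cycle(new_rules)
```

With these inputs the old orders form the cycle ldlm -> mutex -> ldlm, while the new table is acyclic.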

Severity : normal
Bugzilla : 24592 ENOSUPP migratepage
Description: rhel6 kernel has "memory compaction" feature which seems to be slightly inaccurate: it misses setting page->private to 0 for pages allocated for migration.
Details : Detect kernel with that feature and add ENOSUPP migration address space operation as a workaround for the problem

Severity : normal
Bugzilla : 23206 handle_async_create(): do not return ENOSPC if there are preallocated objects

Severity : normal
Bugzilla : 24628 OEL6 support in 1.8 branch
Description: Add OEL6 support in b1_8 branch. Kernel version is 2.6.32-279.2.1.el6.

Severity : normal
Bugzilla : 24580 RHEL6 support in b1_8 branch
Description: Update RHEL6 patchless client kernel to 2.6.32-279.2.1.el6.

Severity : normal
Bugzilla : 23206 return 0 if precreation succeeded even partially

Severity : normal
Bugzilla : 20569 count bad lines correctly
Description: -have parse_buffer() to count lines with bogus headers correctly
-simplification of end of line detection in parse_buffer()

Severity : normal
Bugzilla : 20569 test_170 fix
Description: use perl instead of sed to process binary files properly; verify that bad and good files differ; minor cleanup


Changes from v1.8.6 to v1.8.7

Support for networks:

  • socklnd - any kernel supported by Lustre,
  • qswlnd - Qsnet kernel modules 5.20 and later,
  • openiblnd - IbGold 1.8.2,
  • o2iblnd - OFED 1.3, 1.4.1, 1.4.2, 1.5.1 and 1.5.2
  • viblnd - Voltaire ibhost 3.4.5 and later,
  • ciblnd - Topspin 3.2.0,
  • iiblnd - Infiniserv 3.3 + PathBits patch,
  • gmlnd - GM 2.1.22 and later,
  • mxlnd - MX 1.2.10 or later,
  • ptllnd - Portals 3.3 / UNICOS/lc 1.5.x, 2.0.x


Server support for kernels:

  • 2.6.16.60-0.69.1 (SLES 10),
  • 2.6.32.19-0.2.1 (SLES11),
  • 2.6.18-194.17.1.el5 (RHEL 5)
  • 2.6.18-194.17.1.0.1.el5 (OEL 5)


Client support for unpatched kernels: see "Patchless Client"

        2.6.16 - 2.6.32 vanilla (kernel.org)


Recommended e2fsprogs version:

  • 1.41.12.2-ora1


The async journal commit feature (bug 19128) and the cancel lock before replay feature (bug 16774) are disabled by default.


Severity: normal
Description: regression test: make sure that data written concurrently do not get discarded on file close
Details: write_disjoint.c modification : -- several new options -- minor cleanup (rank=0: open file once; close file at the end; add usage ()); new parallel-scale write_disjoint2 () regression test; new mpi_run() --quiet option to skip lfs df

Severity: normal
Description: comment on top of ptlrpc_check_set() update
Details: ptlrpc_check_set() returns result of set_condition hook if it is defined

Severity: normal
Description: ldlm_run_bl_ast_work: use ptlrpc_set_wait() with condition
Details: ldlm_run_bl_ast_work() sends ASTs in sets of PARALLEL_AST_LIMIT requests, waits for the whole set to complete, then sends another set of requests and waits again. If at least one request in each set times out, the timeouts are serialized. This patch changes ldlm_run_bl_ast_work() so that, having sent a request set, it waits for any of the sent requests to complete and refills the running request set with requests which are yet to be sent. When the number of requests that time out is smaller than PARALLEL_AST_LIMIT, this is supposed to eliminate the possibility of timeout serialization. This patch uses the possibility to specify a wait condition for ptlrpc_set_wait() (proposed in https://bugzilla.lustre.org/attachment.cgi?id=33099)
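
The effect of refilling the set instead of waiting for whole sets can be shown with a small simulation. The Python sketch below is an illustration only, not the Lustre code: the function names, the limit of 4 (standing in for PARALLEL_AST_LIMIT, whose real value is not given here) and the request durations are all assumptions.

```python
import heapq


def batch_time(durations, limit):
    """Send requests in fixed sets of `limit`; each set waits for its slowest member."""
    total = 0
    for i in range(0, len(durations), limit):
        total += max(durations[i:i + limit])
    return total


def refill_time(durations, limit):
    """Keep up to `limit` requests in flight; send the next one when any completes."""
    in_flight = []  # min-heap of completion times
    now = 0
    for d in durations:
        if len(in_flight) == limit:
            # Wait only for the earliest completion, then reuse its slot.
            now = heapq.heappop(in_flight)
        heapq.heappush(in_flight, now + d)
    return max(in_flight) if in_flight else 0


# Twelve requests; one in every four takes 100 time units (a "timeout"),
# the others complete in 1 unit.
durations = [100, 1, 1, 1] * 3

serialized = batch_time(durations, limit=4)
overlapped = refill_time(durations, limit=4)
```

In this run the set-by-set approach pays for three timeouts back to back, while the refill approach lets the three slow requests overlap because there are fewer of them than the limit.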

Severity: normal
Description: ptlrpc_set_wait flexibility
Details: ptlrpc_set_wait() waits until all requests in a set complete. This patch makes it possible to specify a condition on which ptlrpc_set_wait() will wait instead of the default condition "no remaining requests". With that it will be possible to add requests to a set as sent ones complete, without waiting for all requests to finish.

Severity: normal
Description: remove wrong assertion
Details: The assertion underestimates exp_refcount of obd_export. The exp_refcount is incremented on adding a lock into export's hash table. For decent RAM there can be millions of locks in memory. Similar problem is reported in 23265, 17924, 24376

Severity: normal
Description: use read-write semaphore for lov_lock
Details: After adding obd_getref() into lov_prep_async_page(), read performance degraded. lov_getref() uses mutex_down(), so it looks like concurrent reads got stuck on that mutex. This fix replaces the mutex with a r/w semaphore, so that reads do not get blocked on it. That restored the performance.
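
The difference between a mutex and a read-write lock can be sketched in user space. The Python class below is a minimal illustration of the general technique (many readers or one writer), not the kernel's rw_semaphore and not the Lustre code; the class and attribute names are made up for the example, and no writer preference is implemented.

```python
import threading


class ReadWriteLock:
    """Minimal reader-writer lock: many concurrent readers or a single writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self.readers = 0
        self.writer = False

    def acquire_read(self):
        with self._cond:
            # Readers only wait for a writer, never for each other.
            while self.writer:
                self._cond.wait()
            self.readers += 1

    def release_read(self):
        with self._cond:
            self.readers -= 1
            if self.readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            # A writer needs exclusive access.
            while self.writer or self.readers:
                self._cond.wait()
            self.writer = True

    def release_write(self):
        with self._cond:
            self.writer = False
            self._cond.notify_all()


lock = ReadWriteLock()
lock.acquire_read()
lock.acquire_read()  # a second reader is admitted without blocking
concurrent_readers = lock.readers
lock.release_read()
lock.release_read()
```

With a plain mutex the second acquisition would have had to wait for the first holder to release, which matches the kind of read-side contention the entry describes.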

Severity: normal
Description: avoid unnecessary dentry rehashing (v2)
Details: In patchless case the sequence __d_drop(); d_rehash_cond() creates race window where dentry incorrectly looks like unhashed when it is not. If dentry is not unhashed, it seems that rehashing can be avoided.

Severity: normal
Description: accessing files via nfs test
Details: -- add nfsserver MOUNT2 cleanup

Severity: normal
Description: use interval tree to calculate kms
Details: with interval tree of locked extents granted list iteration can be avoided which is supposed to save CPU in case of long granted lock lists

Severity: normal
Description: correct assertion
Details: an orphan inode can be reached in mds_open when opening by fid, which takes place when accessing files via nfs. Correct the assertion correspondingly.

Severity: normal
Description: accessing files via nfs test
Details: -- new nfsread_orphan_file test -- rmultiop_start(), rmultiop_stop() modification: add possibility to run several multiop_bg on remote node

Severity: normal
Description: never resend glimpse ASTs
Details: when a connection to a client fails, the glimpse AST gets resent endlessly because the request does not have the rq_noresend flag. Set the flag to avoid resends.

Severity: normal
Description: generate warnings in case of discarding dirty pages
Details: When a client is evicted, dirty pages may get silently discarded. The caller of successful write(2) will not know that the data he wrote have been discarded due to eviction before they can be flushed to the OSS. With this patch system administrator gets warned about dirty page discard.

Severity: normal
Description: do not compare unsigned < 0
Details: this is also supposed to catch overflow of lqs_bwrite_pending
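
The general pitfall behind this kind of fix is that an unsigned value can never be negative, so a "< 0" test after a subtraction never fires and an underflow shows up as a huge number instead. The Python sketch below emulates C unsigned 64-bit arithmetic with ctypes. The variable names and the 64-bit width are assumptions for illustration; the entry does not state the actual type of lqs_bwrite_pending.

```python
import ctypes


def sub_u64(a, b):
    """Subtract as a C unsigned 64-bit integer would: the result wraps around."""
    return ctypes.c_uint64(a - b).value


pending = 5
released = 10

wrapped = sub_u64(pending, released)

# A check written as "result < 0" can never be true for an unsigned value.
broken_check_fires = wrapped < 0

# Comparing the operands before subtracting does detect the problem.
safe_check_fires = released > pending
```

Here the wrapped result is a very large positive number rather than -5, which is why the comparison has to be done on the operands.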

Severity: normal
Description: ext3_dx_find_entry: check directory entry consistency before ext3_match
Details: to avoid getting into infinite loop when directory block contains wrong data

Severity: normal
Description: llite: -EIO instead of LBUG for multi-referenced object
Details: Whenever an inode is used with a DLM lock, the client checks that no other inodes are referencing the same OST object, since this is a sign of filesystem corruption on the MDS (or some other code bug that behaves in this way). Previously, if the client detected that the same OST object was referenced from multiple inodes at the same time, it would LASSERT() and print a message to this effect, rather than continue to corrupt the data files:
osc_set_data_with_check() ASSERTION(old_inode->i_state & I_FREEING) failed: Found existing inode ffff880587d15d10/222311317/67781718 state 0 in lock: setting data to ffff88046b7f8d50/223489633/67781099
Instead of LASSERTing on this condition, return EIO for this file. This allows the problem to be analyzed and fixed without the need to reboot the client node.
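
The change in behaviour, returning an error for the affected file instead of asserting, can be sketched as follows. This Python fragment is a loose illustration only: the function name set_lock_data and the dictionary-based lock and inode objects are hypothetical and do not reflect the real osc_set_data_with_check() code.

```python
import errno


def set_lock_data(lock, inode):
    """Attach `inode` to `lock`; return 0 on success or -EIO on a conflict."""
    old = lock.get("inode")
    if old is not None and old is not inode and not old["freeing"]:
        # Another live inode already references this object: report an
        # I/O error for this file instead of crashing the whole node.
        return -errno.EIO
    lock["inode"] = inode
    return 0


lock = {}
first = {"ino": 1, "freeing": False}
second = {"ino": 2, "freeing": False}

rc_first = set_lock_data(lock, first)
rc_conflict = set_lock_data(lock, second)
```

The caller sees a failure on one file and the rest of the client keeps running, which is the point of the entry.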

Severity: normal
Description: Avoid corrupted ldiskfs after MD rebuild on RHEL5/CentOS5.

Severity: normal
Description: limit bio size to BIO_MAX_PAGES
Details: this is needed because bio_alloc_bioset()->bvec_alloc_bs() refuses to allocate bigger bios
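
Capping the size of each bio amounts to splitting a run of pages into chunks no larger than the limit. The Python sketch below shows that idea in isolation. It is not the kernel or Lustre code: the helper name is made up, and the value 256 for BIO_MAX_PAGES is an assumption that should be checked against the target kernel's headers.

```python
BIO_MAX_PAGES = 256  # assumed value for illustration only


def split_into_bios(pages, max_pages=BIO_MAX_PAGES):
    """Split a list of pages into chunks of at most max_pages each."""
    return [pages[i:i + max_pages] for i in range(0, len(pages), max_pages)]


bios = split_into_bios(list(range(600)))
sizes = [len(b) for b in bios]
```

For 600 pages this yields two full chunks and one partial chunk, so no single allocation exceeds the limit.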

Severity: normal
Description: set $PTLDEBUG, $SUBSYSTEM and $DEBUG_SIZE values on every node (LU-196)
Details: The current set_default_debug_nodes() could not pass the values of $PTLDEBUG, $SUBSYSTEM and $DEBUG_SIZE to the remote nodes while they are specified from the command line on the local node. This patch is to fix this issue.

Severity: normal
Description: fix deadlock caused by original fix b=24525 (LU-146)
Details: Get open lock inside mds_get_parent_child_locked() to avoid deadlock. Never get open lock if child is newly created to avoid deadlock.

Severity: normal
Description: fix v1
Details: canceling lock may contain data being sent to OSTs. Change find_cbdata iterator to take that into account

Severity: normal
Description: kernel BUG at fs/inode.c:323!
Details: workaround patch to avoid the race at truncate_inode_pages_range()

Severity: normal
Description: racer: general protection fault (LU-286)

Severity: normal
Description: fsync for directories

Severity: normal
Description: allow lnet to talk to gnilnd

Severity: normal
Description: obdfilter-survey cleanup

Severity: normal
Description: add an -s option to set an alternative order of services start
Details: -s start services in the order MGS->OST(s)->MDT(s). The default order is MGS->MDT(s)->OST(s).

Severity: normal
Description: add lst stat --count

Severity: normal
Description: ORNL LCE Router features/fixes
Details: Only squawk when md->start is NULL on non-zero length v2

Severity: normal
Description: lfs find -s doesn't quite work with >2GB args
Details: fix the wrong size type in find_value_cmp()
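
A size argument above 2GB misbehaves when it passes through a type that is too narrow. The Python sketch below emulates the truncation with ctypes. The entry does not say which type find_value_cmp() used, so the signed 32-bit integer here is an assumption chosen to illustrate the class of bug.

```python
import ctypes


def as_int32(value):
    """Reinterpret a value as a C signed 32-bit integer (wraps on overflow)."""
    return ctypes.c_int32(value).value


one_gb = 1 * 2**30
three_gb = 3 * 2**30

truncated = as_int32(three_gb)

# Is a 1GB file larger than a 3GB threshold?
correct_match = one_gb > three_gb
wrong_match = as_int32(one_gb) > as_int32(three_gb)
```

With the narrow type the 3GB threshold turns negative, so a 1GB file appears to be larger than it; using a 64-bit type for the comparison avoids this.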

Severity: normal
Description: client nodes crash on fs with inactive OST
Details: take lov reference in lov_prep_async_page()

Severity: normal
Description: replay-dual: ldlm_lock.c:1622:ldlm_lock_cancel()) LBUG type: PLN
Details: fix a race between do_requeue and client_disconnect_export

Severity: normal
Description: add lctl push

Severity: normal
Descrip