WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.

Upgrading to a New Version of Lustre

From Obsolete Lustre Wiki
Revision as of 10:40, 24 September 2009

Use the procedures in this chapter to upgrade Lustre 1.6.x to Lustre 1.8.0.

Note: In Lustre version 1.6 and later, the file system name (--fsname parameter) is limited to 8 characters so that it fits on the disk label.
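For example, when formatting a new file system the name must fit this limit. The command below is a minimal sketch; the file system name, device path, and target roles are illustrative, not taken from the manual:

```shell
# "scratch01" is 9 characters and would exceed the limit;
# "scratch1" is 8 characters and fits on the disk label.
# Combined MGS/MDT on an example device:
mkfs.lustre --fsname=scratch1 --mgs --mdt /dev/sda1
```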

Supported Upgrade and Downgrade Paths

For Lustre 1.8.x, the following upgrades are supported:

  • Lustre 1.6.x to Lustre 1.8.0.
  • One minor version to the next (for example, 1.8.0 > 1.8.x).

For Lustre 1.8.x, downgrades within the same ranges are supported:

  • If you upgrade from Lustre 1.6.x > 1.8.0, you can downgrade to version 1.6.x.
  • If you upgrade from one minor version to the next (for example, Lustre 1.8.0 > 1.8.1), you can downgrade to the earlier minor version (for example, back to 1.8.0).

Caution: A fresh installation of Lustre 1.8.x is not guaranteed to be downgradable to an earlier Lustre version.

For supported upgrade paths for earlier releases, see Upgrading from 1.4.6 and later to 1.6.

Prerequisites to Upgrading Lustre

Are these procedures that must be completed before the upgrade is started? Do they apply to any upgrade path?

Remember the following points before upgrading Lustre.

The MDT must be upgraded before the OSTs are upgraded:

  1. Shut down the server (failover shutdown).
  2. Install the new modules.
  3. Run tunefs.lustre on the target.
  4. Mount the target to start the service.
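A sketch of these steps as shell commands, assuming an RPM-based installation; the mount point, device path, and package names below are illustrative, not taken from the manual:

```shell
# 1. Shut down the MDT service (failover shutdown).
umount /mnt/mdt

# 2. Install the new Lustre modules (example package names).
rpm -Uvh lustre-modules-1.8.0*.rpm lustre-1.8.0*.rpm

# 3. Run tunefs.lustre on the target device (example device path).
tunefs.lustre /dev/sdb1

# 4. Mount the target to start the upgraded service.
mount -t lustre /dev/sdb1 /mnt/mdt
```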

A Lustre upgrade can be done across a failover pair:

  1. On the backup server, install the new modules.
  2. On the primary server, shut down the service (failover shutdown).
  3. On the backup server, run tunefs.lustre on the target.
  4. On the backup server, mount the target to take over the service.
  5. On the primary server, install the new modules.
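The same sequence across a failover pair might look like the following; the host names, mount point, device path, and package names are illustrative:

```shell
# 1. On the backup server: install the new modules first.
ssh backup 'rpm -Uvh lustre-modules-1.8.0*.rpm lustre-1.8.0*.rpm'

# 2. On the primary server: shut down the service (failover shutdown).
ssh primary 'umount /mnt/mdt'

# 3. On the backup server: run tunefs.lustre on the shared target.
ssh backup 'tunefs.lustre /dev/sdb1'

# 4. On the backup server: mount the target to take over the service.
ssh backup 'mount -t lustre /dev/sdb1 /mnt/mdt'

# 5. On the primary server: install the new modules.
ssh primary 'rpm -Uvh lustre-modules-1.8.0*.rpm lustre-1.8.0*.rpm'
```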

Upgrade procedures

Which should be included here and which referred to in the manual (Chapter 13)?

Upgrading from Lustre 1.6.x to Lustre 1.8.0

  • Prerequisites to Upgrading Lustre
  • Starting Clients
  • Upgrading a Single File System
  • Upgrading Multiple File Systems with a Shared MGS

Upgrading Lustre 1.8.0 to the Next Minor Version

Downgrading from Lustre 1.8.0 to Lustre 1.6.x

  • Downgrade Requirements
  • Downgrading a File System




Lustre Failover and Rolling Upgrades in OM (OM 1-17)

Include a page with this content or refer to OM?

Lustre offers a robust, application-transparent failover mechanism that delivers call completion. This failover mechanism, in conjunction with software that offers interoperability between versions, is used to support rolling upgrades of file system software on active clusters.

The Lustre recovery feature allows servers to be upgraded without taking down the system. The server is simply taken offline, upgraded and restarted (or failed over to a standby server with the new software). All active jobs continue to run without failures; they merely experience a delay.

Lustre MDSs are configured as an active/passive pair, while OSSs are typically deployed in an active/active configuration that provides redundancy without extra overhead, as shown in FIGURE 1-8. Often the standby MDS is the active MDS for another Lustre file system, so no nodes are idle in the cluster.

FIGURE 1-8 Lustre failover configurations for OSSs and MDSs

Although a file system checking tool (lfsck) is provided for disaster recovery, journaling and sophisticated protocols re-synchronize the cluster within seconds, without the need for a lengthy fsck. Lustre version interoperability between successive minor versions is guaranteed. As a result, the Lustre failover capability is used regularly to upgrade the software without cluster downtime.

Note – Lustre does not provide redundancy for data; it depends exclusively on redundancy of backing storage devices. The backing OST storage should be RAID 5 or, preferably, RAID 6 storage. MDT storage should be RAID 1 or RAID 0+1.