
Using Red Hat Cluster Manager with Lustre


(Updated: Dec 2010)

DISCLAIMER - EXTERNAL CONTRIBUTOR CONTENT

This content was submitted by an external contributor. We provide this information as a resource for the Lustre™ open-source community, but we make no representation as to the accuracy, completeness or reliability of this information.


This page describes how to configure and use Red Hat Cluster Manager with Lustre failover. Sven Trautmann has contributed this content.

For more about Lustre failover, see Configuring Lustre for Failover.


Preliminary Notes

This document is based on RedHat Cluster version 2.0, which is part of RedHat Enterprise Linux 5.5. For other versions or RHEL-based distributions, the syntax or methods used to set up and run RedHat Cluster may differ.

Compared with other HA solutions, RedHat Cluster as shipped in RHEL 5.5 is fairly dated. If possible, it is recommended to use a more modern HA solution such as Pacemaker.

It is assumed that two Lustre server nodes share a number of Lustre targets. Each node provides a subset of the targets; if one node fails, the surviving node takes over the targets of the failed node and makes them available to the Lustre clients.

Furthermore, to make sure each Lustre target is mounted on only one of the server nodes at a time, STONITH fencing is implemented. This requires a reliable way to shut down a failed node. The examples in this article assume that the Lustre server nodes are equipped with a service processor that allows a failed node to be powered off using IPMI.

Setting Up RedHat Cluster

Setting Up the openais Communication Stack

The openais package is distributed with RHEL and can be installed using

rpm -i /path/to/RHEL-DVD/Server/openais-0.80.6-16.el5.x86_64.rpm

or

yum install openais

if yum is configured to access the RHEL repository.

Once installed, the software looks for its configuration in the file /etc/ais/openais.conf.

Complete the following steps to set up the openais communication stack:

1. Edit the totem section of the openais.conf configuration file to designate the IP address and netmask of the interface(s) to be used. The totem section of the configuration file describes the way openais communicates between nodes.

totem {
        version: 2
        secauth: off
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: 10.0.0.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}

openais uses the bindnetaddr option to determine which interface is used for cluster communication. The example above assumes that one of the node's interfaces is configured on the network 10.0.0.0. The value of the option is the bitwise AND of the interface's IP address and network mask (IP & MASK), so the host bits of the address are cleared. The configuration file is therefore independent of any particular node and can be copied to all nodes unchanged.
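
For example, a node with the (hypothetical) address 10.0.0.12 and netmask 255.0.0.0 lies in the network 10.0.0.0, since 10.0.0.12 & 255.0.0.0 = 10.0.0.0. On RHEL the network address can be computed with the ipcalc utility from the initscripts package:

# ipcalc -n 10.0.0.12 255.0.0.0
NETWORK=10.0.0.0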

2. Create an AIS key

# /usr/sbin/ais-keygen
OpenAIS Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Writing openais key to /etc/ais/authkey.
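
Note that with secauth: off, as in the totem example above, this key is not actually used. If secauth is enabled, both nodes must share the same key, so it has to be copied to the second node, for example (using the node name lustre2 from the examples below):

# scp -p /etc/ais/authkey lustre2:/etc/ais/authkey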

Installing RedHat Cluster

The minimum installation of RedHat Cluster consists of the Cluster Manager package cman and the Resource Group Manager package rgmanager. The cman package can be found in the RHEL repository. The rgmanager package is part of the Cluster repository; it can be found on the RHEL DVD in the Cluster sub-directory and may need to be added to the yum configuration manually. With yum configured accordingly, RedHat Cluster can be installed using:

yum install cman rgmanager

Installing the Lustre Resource Script

The rgmanager package includes a number of resource scripts (in /usr/share/cluster) which are used to integrate resources such as network interfaces or file systems with rgmanager. Unfortunately, no resource script for Lustre is included.

Luckily, Giacomo Montagner posted a resource script to the lustre-discuss mailing list:

http://lists.lustre.org/pipermail/lustre-discuss/attachments/20090623/7799de37/attachment-0001.bin

After downloading this file, copy it to /usr/share/cluster/lustrefs.sh and make sure the script is executable.
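
One way to do this (assuming the URL above is still reachable):

# wget -O /usr/share/cluster/lustrefs.sh http://lists.lustre.org/pipermail/lustre-discuss/attachments/20090623/7799de37/attachment-0001.bin
# chmod 755 /usr/share/cluster/lustrefs.sh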

Configure RedHat Cluster

RedHat Cluster uses /etc/cluster/cluster.conf as its central configuration file. This file is in XML format. The complete schema of the XML file can be found at http://sources.redhat.com/cluster/doc/cluster_schema_rhel5.html.

The basic structure of a cluster.conf file may look like this:

<?xml version="1.0" ?>
<cluster config_version="1" name="Lustre">

...
</cluster>

In this example the name of the cluster is set to Lustre and the configuration version is initialized to 1. Whenever the cluster configuration is updated, the config_version attribute must be increased on all nodes in the cluster. RedHat Cluster is usually used with more than two nodes providing resources. To make it work with only two nodes, the following cman attributes need to be set:

  <cman expected_votes="1" two_node="1"/>

This tells cman that there are only two nodes in the cluster and that one vote is enough to declare a node failed.

Nodes

Next, the nodes that form the cluster need to be specified. Each cluster node is specified separately, wrapped in a surrounding clusternodes tag.

  <clusternodes>
    <clusternode name="lustre1" nodeid="1">
      <fence>
        <method name="single">
          <device lanplus="1" name="lustre1-sp"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="lustre2" nodeid="2">
      <fence>
        <method name="single">
          <device lanplus="1" name="lustre2-sp"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>

Each cluster node is given a name, which must be its hostname or IP address. Additionally, a unique node ID needs to be specified. The fence tag assigned to each node specifies the fence device used to shut down that cluster node. The fence devices themselves are defined elsewhere in cluster.conf (see below for details).

Fencing

The fence devices referenced by the clusternode entries above are defined in a fencedevices section. In this example the fence_ipmilan agent is used, which shuts down a failed node through its service processor using IPMI; the ipaddr, login and passwd attributes describe the service processor of the respective node. The fence_daemon tag controls how long fenced waits, in seconds, before fencing a node after a failure (post_fail_delay) and after a node joins the fence domain (post_join_delay).

  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <fencedevices>
    <fencedevice name="lustre1-sp" agent="fence_ipmilan" auth="password" ipaddr="10.0.1.1" login="root" passwd="supersecretpassword"/>
    <fencedevice name="lustre2-sp" agent="fence_ipmilan" auth="password" ipaddr="10.0.1.2" login="root" passwd="supersecretpassword"/>
  </fencedevices>
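
It is a good idea to verify the fencing setup by invoking the fence agent manually before relying on it. A minimal check of the first node's service processor, using the address and credentials from the example above (the -P switch corresponds to the lanplus="1" setting):

# fence_ipmilan -P -a 10.0.1.1 -l root -p supersecretpassword -o status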

Resource Manager

The remaining configuration (failover domains, resources and services) is evaluated by rgmanager and is placed inside a surrounding rm tag in cluster.conf.

Failover Domains

A failover domain defines on which nodes a service may run and in which order the nodes are preferred. With ordered="1" the node with the lower priority value is preferred; with restricted="1" the service may run only on the listed nodes. Two mirrored domains are defined so that each Lustre server is the preferred owner of its own targets and the partner node takes over only in case of a failure.

    <failoverdomains>
      <failoverdomain name="second_first" ordered="1" restricted="1">
        <failoverdomainnode name="lustre2" priority="1"/>
        <failoverdomainnode name="lustre1" priority="2"/>
      </failoverdomain>
      <failoverdomain name="first_first" ordered="1" restricted="1">
        <failoverdomainnode name="lustre1" priority="1"/>
        <failoverdomainnode name="lustre2" priority="2"/>
      </failoverdomain>
    </failoverdomains>

Resources

Each Lustre target is defined as a lustrefs resource, which is handled by the lustrefs.sh script installed above. The mountpoint and device attributes specify where and what to mount; the device paths shown here are placeholders and must be adapted to the actual setup. Judging from the attribute names, which mirror those of the standard fs.sh resource script, self_fence="1" causes a node to fence (reboot) itself if a target cannot be unmounted.

    <resources>
      <lustrefs name="target1" mountpoint="/mnt/ost1" device="/path/to/ost1/device" force_fsck="0" force_unmount="0" self_fence="1"/>
      <lustrefs name="target2" mountpoint="/mnt/ost2" device="/path/to/ost2/device" force_fsck="0" force_unmount="0" self_fence="1"/>
      <lustrefs name="target3" mountpoint="/mnt/ost3" device="/path/to/ost3/device" force_fsck="0" force_unmount="0" self_fence="1"/>
      <lustrefs name="target4" mountpoint="/mnt/ost4" device="/path/to/ost4/device" force_fsck="0" force_unmount="0" self_fence="1"/>
    </resources>

Services

Finally, the resources are grouped into services, each of which is tied to one of the failover domains defined above. In normal operation, target1 and target2 therefore run on lustre2, while target3 and target4 run on lustre1; recovery="relocate" moves a failed service to the other node of its domain.

    <service autostart="1" exclusive="0" recovery="relocate" domain="second_first" name="lustre_2">
      <lustrefs ref="target1"/>
      <lustrefs ref="target2"/>
    </service>

    <service autostart="1" exclusive="0" recovery="relocate" domain="first_first" name="lustre_1">
      <lustrefs ref="target3"/>
      <lustrefs ref="target4"/>
    </service>
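
Once the configuration is complete and identical on both nodes, the cluster daemons can be started. On RHEL 5 this is done with the standard init scripts:

# service cman start
# service rgmanager start

To start them automatically at boot:

# chkconfig cman on
# chkconfig rgmanager on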

Tools to use with RedHat Cluster

cman_tool: manages cluster membership; it can be used to join or leave the cluster and to inspect the cluster state.

ccs_tool: performs online updates of the cluster configuration; it propagates a changed cluster.conf to all cluster nodes.

system-config-cluster: a graphical front end for creating and editing cluster.conf.
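
A few example invocations (clustat and clusvcadm are part of the rgmanager package; service and node names are taken from the configuration above). Show cluster quorum and membership information:

# cman_tool status
# cman_tool nodes

Propagate an edited cluster.conf (with an increased config_version) to all nodes:

# ccs_tool update /etc/cluster/cluster.conf

Show the state of the rgmanager services and manually relocate a service to the other node:

# clustat
# clusvcadm -r lustre_1 -m lustre2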