GSS / Kerberos

Note: Only the HEAD branch supports GSS/Kerberos functionality. It is subject to change at any time, and backward compatibility is NOT guaranteed.

= Kerberos Lustre Setup =

Security Flavor
A security flavor is a string that describes what kind of authentication and data transformation is performed on a PTLRPC connection. It covers both RPC messages and bulk data.

The supported flavors are described in the following table:

[*] In Lustre 1.4 and 1.6 it is possible to enable bulk data checksumming to provide integrity checking using CRC32. In 1.6.5 this is expected to be the default behaviour, using the Adler32 mechanism by default (lower CPU overhead than CRC32).

In the future, we may want to support customized flavors to some extent, for example by allowing different flavors to be set for RPC messages and bulk data.

Distribution

 * Only MIT Kerberos 5 is supported, versions 1.3.x through the latest 1.6.x.
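
A quick check of a version string against this range could look like the sketch below. The function name krb5_version_supported is hypothetical, and the sketch assumes the version is given as major.minor[.patch]; how you obtain that string on a node is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an MIT Kerberos version string such as
# "1.4.3" falls in the supported 1.3.x .. 1.6.x range.
krb5_version_supported() {
    ver=$1
    major=${ver%%.*}      # text before the first dot
    rest=${ver#*.}        # text after the first dot
    minor=${rest%%.*}     # second version component
    case "$major.$minor" in
        1.3|1.4|1.5|1.6) return 0 ;;
        *)               return 1 ;;
    esac
}
```

For example, krb5_version_supported 1.4.3 succeeds, while krb5_version_supported 1.2.8 fails.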

Configuration
1. Configure client nodes:
 * For each client node, create a lustre_root principal and generate a keytab:
 kadmin> addprinc -randkey lustre_root/client_host.domain@REALM
 kadmin> ktadd -e aes128-cts:normal lustre_root/client_host.domain@REALM
 * Install the keytab on the client node.

2. Configure MDS nodes:
 * For each MDS node, create a lustre_mds principal and generate a keytab:
 kadmin> addprinc -randkey lustre_mds/mds_host.domain@REALM
 kadmin> ktadd -e aes128-cts:normal lustre_mds/mds_host.domain@REALM
 * Install the keytab on the MDS node.

3. Configure OSS nodes:
 * For each OSS node, create a lustre_oss principal and generate a keytab:
 kadmin> addprinc -randkey lustre_oss/oss_host.domain@REALM
 kadmin> ktadd -e aes128-cts:normal lustre_oss/oss_host.domain@REALM
 * Install the keytab on the OSS node.
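
Because the three steps differ only in the principal name, the kadmin commands can be generated per node. The following is a minimal sketch, not part of Lustre: the function name principal_cmds and the host and realm values in the usage note are placeholders chosen for illustration.

```shell
#!/bin/sh
# Print the two kadmin commands for one service principal.
# role:  lustre_root, lustre_mds, or lustre_oss (from the steps above)
# host:  FQDN of the node
# realm: Kerberos realm name
principal_cmds() {
    role=$1 host=$2 realm=$3
    princ="$role/$host@$realm"
    printf 'addprinc -randkey %s\n' "$princ"
    printf 'ktadd -e aes128-cts:normal %s\n' "$princ"
}
```

Calling principal_cmds lustre_oss oss1.example.com EXAMPLE.COM prints the two lines to enter at the kadmin> prompt for that OSS node.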

NOTES:
 * The host.domain part should be the node's FQDN in your network; otherwise the server might not recognize any GSS request.

 * As an alternative to per-client keytabs, if you want to avoid assigning a unique keytab to each client node, you can create a general lustre_root principal and its keytab, and install the same keytab on as many client nodes as you want. Be aware that with this approach one compromised client makes all clients insecure.
 kadmin> addprinc -randkey lustre_root@REALM
 kadmin> ktadd -e aes128-cts:normal lustre_root@REALM

 * To merge keytab files, use the ktutil tool; for details, refer to the ktutil manual.


 * Lustre supports the following enctypes with MIT Kerberos 5 version 1.4 or higher:
   * des-cbc-md5
   * des3-hmac-sha1
   * aes128-cts
   * aes256-cts

 * For MIT Kerberos 1.3.x, only des-cbc-md5 works because of a known issue between libgssapi and the Kerberos library.
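
For the keytab merge mentioned in the notes above, the sketch below prints a ktutil command script that reads several keytabs and writes them to one file. It relies on the ktutil commands rkt (read_kt), wkt (write_kt), and quit; the function name and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Print a ktutil command script that reads every input keytab and
# writes the combined entries to one output keytab.
# usage: ktutil_merge_script OUTPUT INPUT...
ktutil_merge_script() {
    merged=$1
    shift
    for kt in "$@"; do
        printf 'rkt %s\n' "$kt"     # load entries into the key list
    done
    printf 'wkt %s\n' "$merged"     # write the key list out
    printf 'quit\n'
}
```

The output is meant to be fed to ktutil on standard input, for example ktutil_merge_script merged.keytab mds.keytab oss.keytab | ktutil. Check the resulting file before installing it.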

Required packages
Every node should have the following packages installed:
 * libgssapi version 0.10 or higher. Some newer Linux distributions already come with it. If not, build & install from source: http://www.citi.umich.edu/projects/nfsv4/linux/libgssapi/libgssapi-0.11.tar.gz
 * keyutils

Kernel & Environment
 * System-wide configuration:
On each node (MDT, OST, client), add the following line to /etc/fstab so that it is mounted automatically:
 nfsd        /proc/fs/nfsd            nfsd            defaults   0 0
On each MDT and client node, add the following line to /etc/request-key.conf:
 create lgssc * * /usr/sbin/lgss_keyring %o %k %t %d %c %u %g %T %P %S
Note that you might need to replace /usr/sbin/lgss_keyring in the line above with the actual path to the lgss_keyring binary on your system.
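
When the /etc/fstab and /etc/request-key.conf lines above are added from a setup script, it helps to avoid appending duplicates on repeated runs. The following is a minimal sketch with a hypothetical helper name, ensure_line; it compares whole lines exactly, so a line that differs only in whitespace counts as different.

```shell
#!/bin/sh
# Append LINE to FILE unless an identical line is already present.
ensure_line() {
    file=$1 line=$2
    # -x: match whole lines, -F: fixed string (no regex), -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

A setup script could then call, for example, ensure_line /etc/request-key.conf 'create lgssc * * /usr/sbin/lgss_keyring %o %k %t %d %c %u %g %T %P %S', adjusting the lgss_keyring path as noted above.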

 * Networking:
If you are using a network other than TCP or InfiniBand (e.g. Quadrics Elan or Myrinet), you need to provide /etc/lustre/nid2hostname on each server node (MDT and OST). This is a simple script that translates a NID into a hostname. The following is a sample for an Elan cluster:

 #!/bin/bash
 set -x
 exec 2>/tmp/$(basename $0).debug
 
 # convert a NID for a LND to a hostname, for GSS for example
 # called with three arguments: lnd netid nid
 #   $lnd will be a string such as "QSWLND", "GMLND", etc.
 #   $netid will be a number in hex string format, like "0x16", etc.
 #   $nid has the same format as $netid
 # output the corresponding hostname, or an error message prefixed with '@' for error logging.
 
 lnd=$1
 netid=$2
 nid=$3
 
 # uppercase the hex
 nid=$(echo $nid | tr '[abcdef]' '[ABCDEF]')
 # and convert to decimal
 nid=$(echo -e "ibase=16\n${nid/#0x}" | bc)
 
 case $lnd in
     QSWLND)   # simply stick "mtn" on the front
               echo "mtn$nid"
               ;;
     *)        echo "@unknown LND: $lnd"
               ;;
 esac
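
To make the mapping in the sample easier to test on its own, the same logic can be written as a function. This is a sketch rather than a replacement for the script: it swaps the tr and bc pipeline for shell arithmetic, which accepts the 0x prefix directly, and the function name nid_to_hostname is hypothetical.

```shell
#!/bin/sh
# Sketch of the NID-to-hostname mapping from the sample script, using
# shell arithmetic instead of tr and bc to convert the hex NID.
nid_to_hostname() {
    lnd=$1 nid=$3     # $2 (netid) is not used, as in the sample
    case $lnd in
        QSWLND) echo "mtn$(($nid))" ;;        # 0x1a -> 26 -> mtn26
        *)      echo "@unknown LND: $lnd" ;;
    esac
}
```

With this mapping, nid_to_hostname QSWLND 0x16 0x1a prints mtn26, and any LND other than QSWLND yields a line starting with '@'.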

Build Lustre
Enable GSS during configuration:

./configure --enable-gss --other-options

GSS Daemons
Make sure to start the lsvcgssd daemon on each OST and MDT node before starting Lustre. The command syntax is:
 lsvcgssd [-f] [-v]
 * -f: run in the foreground instead of as a daemon, so that error and warning messages go to the console instead of the system log.
 * -v: increase verbosity by 1. The default is 0, the maximum is 4.

Setting Security Flavors
Note: If nothing is specified, all RPC connections use null by default.

The MGS holds a persistent sptlrpc rule database. By specifying a set of rules, you can change the security flavors used between nodes. A rule has the form:
 <spec>=<flavor>
Rules are manipulated on the MGS node. To add a rule:
 mgs> lctl conf_param <spec>=<flavor>
If a rule with the same <spec> part already exists, it is overwritten.

To delete a rule:
 mgs> lctl conf_param -d <spec>

The current rule set can be obtained with:
 mgs> cat /proc/fs/lustre/mgs//live/ | grep "srpc.flavor"

Note:
 * Rules are stored persistently on the MGS, so they remain in effect across remounts.
 * The order in which you add rules does not matter; Lustre keeps the rules sorted in a fixed order of priority.
 * After you change a rule, the system usually needs less than 1 minute to apply the new rule to all nodes, depending on system load.
 * Before you change a rule, make sure the affected nodes are ready for the new security flavor. For example, if you change the flavor from null to krb5p but the GSS/Kerberos environment is not properly configured on the affected nodes, those nodes might be evicted because they can no longer communicate with the others.
 * You can also specify rules through on-disk device parameters, using mkfs.lustre or tunefs.lustre. The syntax is the same, and the rule applies only to connections to that specific target (MDT/OST).
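
The readiness check recommended in the notes above can be partly scripted. The sketch below is only an illustration under stated assumptions: both function names are hypothetical, the keytab path defaults to /etc/krb5.keytab (the usual MIT Kerberos location, which may differ on your nodes), and a readable keytab is a necessary condition, not proof of a working GSS/Kerberos setup.

```shell
#!/bin/sh
# Succeed if the flavor name is one of the Kerberos flavors (krb5...),
# which need a working GSS/Kerberos setup on the affected nodes.
flavor_needs_gss() {
    case $1 in
        krb5*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical pre-flight check to run on a node before a rule change.
# Assumption: the keytab lives at /etc/krb5.keytab unless a path is given.
ready_for_flavor() {
    flavor=$1 keytab=${2:-/etc/krb5.keytab}
    if flavor_needs_gss "$flavor"; then
        [ -r "$keytab" ]
    fi
}
```

Running ready_for_flavor krb5p on each affected node before changing a rule from null to krb5p catches the simplest misconfiguration, a missing or unreadable keytab.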

Rules Syntax & Examples
The general syntax is:
 <target>.srpc.flavor.<network>[.<direction>]=flavor

 * <target>: either a filesystem name or a specific MDT/OST device name. For example, lustre, lustre-MDT0000, lustre-OST0001.
 * <network>: the LNET network name of the RPC initiator. For example, tcp0, elan1, o2ib0.
 * <direction>: one of cli2mdt, cli2ost, mdt2mdt, mdt2ost. In most cases you do not need to specify the <direction> part.
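
The syntax can be captured in a small helper that assembles a rule string and rejects an unknown direction. This is a sketch only: the function name build_rule and the parameter names target, network, and direction are labels chosen here for the three parts of the rule.

```shell
#!/bin/sh
# Compose a rule string from its parts. The direction may be empty.
build_rule() {
    target=$1 network=$2 direction=$3 flavor=$4
    case $direction in
        ''|cli2mdt|cli2ost|mdt2mdt|mdt2ost) ;;
        *) echo "invalid direction: $direction" >&2; return 1 ;;
    esac
    # ${direction:+.$direction} adds ".direction" only when it is set
    printf '%s.srpc.flavor.%s%s=%s\n' \
        "$target" "$network" "${direction:+.$direction}" "$flavor"
}
```

On the MGS node the result can be passed to lctl, for example lctl conf_param "$(build_rule lustre tcp0 '' krb5p)".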

Examples:
 * Apply krb5i on ALL connections:
 mgs> lctl conf_param lustre.srpc.flavor.default=krb5i

 * Nodes in network tcp0 use krb5p; all other nodes use null:
 mgs> lctl conf_param lustre.srpc.flavor.tcp0=krb5p
 mgs> lctl conf_param lustre.srpc.flavor.default=null

 * Nodes in network tcp0 use krb5p; nodes in elan1 use plain; among the other nodes, clients use krb5i to MDTs/OSTs, MDTs use null to other MDTs, and MDTs use plain to OSTs:
 mgs> lctl conf_param lustre.srpc.flavor.tcp0=krb5p
 mgs> lctl conf_param lustre.srpc.flavor.elan1=plain
 mgs> lctl conf_param lustre.srpc.flavor.default.cli2mdt=krb5i
 mgs> lctl conf_param lustre.srpc.flavor.default.cli2ost=krb5i
 mgs> lctl conf_param lustre.srpc.flavor.default.mdt2mdt=null
 mgs> lctl conf_param lustre.srpc.flavor.default.mdt2ost=plain

Authenticate Normal Users
On client nodes, a non-root user needs to run kinit before accessing Lustre, just as with other Kerberized applications.
 * As required by Kerberos, the user's principal (username@REALM) must be added to the KDC.
 * Client and MDT nodes should have the same user database, i.e. the same user names and uid/gid mapping.

A user can destroy the established security contexts before logging out with "lfs flushctx":

 lfs flushctx [-k]

Here "-k" means to also destroy the on-disk Kerberos credential cache, which is equivalent to "kdestroy". Without it, only the established contexts in Lustre kernel memory are destroyed.

Secure MGC - MGS connection
Each node can specify which flavor to use for its connection to the MGS with the option mgssec=flavor when mounting a target device or client. By default, null is chosen. Once a flavor is chosen, it cannot be changed until unmount.

Because each node has only one connection to the MGS, if there is more than one target device or client on a single node, all the "mgssec=" specifications must be the same. Omitting the "mgssec=" option means "use the currently chosen flavor".

By default, the MGS accepts RPCs with any flavor, but a system administrator can configure the MGS to accept only certain flavors from certain networks. The syntax is similar, with the special target "_mgs":
 mgs> lctl conf_param _mgs.srpc.flavor.<network>=flavor
NOTE: Applying an inappropriate flavor may leave a node unable to communicate with the MGS until it is restarted, so use this carefully.
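
Since every mount on a node must agree on the mgssec flavor, a mount script can verify this before mounting anything. The following is a minimal sketch; the function name mgssec_consistent is hypothetical, and it only compares the flavors you pass in, one per planned mount that specifies mgssec=.

```shell
#!/bin/sh
# Succeed if every mgssec flavor given on the command line is the same.
mgssec_consistent() {
    first=$1
    for f in "$@"; do
        [ "$f" = "$first" ] || return 1
    done
    return 0
}
```

For example, mgssec_consistent krb5p krb5p succeeds, while mgssec_consistent krb5p null fails and signals that the planned mounts on that node conflict.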

Cross-Realms Authentication
Because idmap functionality is missing, cross-realm authentication is not currently supported.