Handling Full OSTs


Sometimes the file system becomes unbalanced, often due to changed stripe settings. If an OST is full and an attempt is made to write more information to the file system, an error occurs.

The example below shows an unbalanced file system:

<pre>
[root@LustreClient01 ~]# lfs df -h
UUID                 bytes   Used  Available Use%  Mounted on
lustre-MDT0000_UUID  4.4G   214.5M   3.9G     4%   /mnt/lustre[MDT:0]
lustre-OST0000_UUID  2.0G   751.3M   1.1G    37%   /mnt/lustre[OST:0]
lustre-OST0001_UUID  2.0G   755.3M   1.1G    37%   /mnt/lustre[OST:1]
lustre-OST0002_UUID  2.0G     1.7G 155.1M    86%   /mnt/lustre[OST:2] <-
lustre-OST0003_UUID  2.0G   751.3M   1.1G    37%   /mnt/lustre[OST:3]
lustre-OST0004_UUID  2.0G   747.3M   1.1G    37%   /mnt/lustre[OST:4]
lustre-OST0005_UUID  2.0G   743.3M   1.1G    36%   /mnt/lustre[OST:5]

filesystem summary: 11.8G     5.4G    5.8G    45%  /mnt/lustre
</pre>
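On a file system with many OSTs, it can be easier to sort the lfs df output by the Use% column than to scan for the fullest OST by eye. A minimal sketch, assuming the column layout shown above (Use% is the fifth field):

<pre>
# List OST usage, fullest first (Use% is field 5 in lfs df output)
[root@LustreClient01 ~]# lfs df -h /mnt/lustre | grep "\[OST" | sort -rnk5
</pre>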

In this case, OST:2 is almost full, and when one tries to write additional data to the file system, the write fails even with uniform striping over all the OSTs. In the example below, the positional arguments to lfs setstripe set a 4 MB stripe size, a starting OST index of 0, and a stripe count of -1 (stripe over all OSTs):

<pre>
[root@LustreClient01 ~]# lfs setstripe /mnt/lustre 4M 0 -1
[root@LustreClient01 ~]# dd if=/dev/zero of=/mnt/lustre/test_3 bs=10M count=100
dd: writing `/mnt/lustre/test_3': No space left on device
98+0 records in
97+0 records out
1017192448 bytes (1.0 GB) copied, 23.2411 seconds, 43.8 MB/s
</pre>
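Before taking any action, it can be worth confirming that the failed file really has a stripe on the full OST. A minimal sketch using lfs getstripe (test_3 is the file from the dd command above); the obdidx column in its output identifies the OSTs holding the file's objects:

<pre>
[root@LustreClient01 ~]# lfs getstripe /mnt/lustre/test_3
</pre>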

The writes succeed until dd tries to fill a stripe on the full OST, at which point the operation fails with "No space left on device" even though the other OSTs still have free space. To enable continued use of the file system, the full OST has to be taken offline or, more precisely, excluded from new object allocations. This can be accomplished using the lctl command.

Note: This action has to be done on the MDS, since the MDS is the server that allocates space for writes.

1. Log in to the MDS server:

<pre>
[root@LustreClient01 ~]# ssh root@192.168.0.10
root@192.168.0.10's password:
Last login: Wed Nov 26 13:35:12 2008 from 192.168.0.6
</pre>

2. Use the lctl dl command to show the status of all file system components:

<pre>
[root@mds ~]# lctl dl
  0 UP mgs MGS MGS 9
  1 UP mgc MGC192.168.0.10@tcp e384bb0e-680b-ce25-7bc9-81655dd1e813 5
  2 UP mdt MDS MDS_uuid 3
  3 UP lov lustre-mdtlov lustre-mdtlov_UUID 4
  4 UP mds lustre-MDT0000 lustre-MDT0000_UUID 5
  5 UP osc lustre-OST0000-osc lustre-mdtlov_UUID 5
  6 UP osc lustre-OST0001-osc lustre-mdtlov_UUID 5
  7 UP osc lustre-OST0002-osc lustre-mdtlov_UUID 5
  8 UP osc lustre-OST0003-osc lustre-mdtlov_UUID 5
  9 UP osc lustre-OST0004-osc lustre-mdtlov_UUID 5
 10 UP osc lustre-OST0005-osc lustre-mdtlov_UUID 5
</pre>

3. Use the lctl deactivate command to take the full OST offline. In the device list above, device 7 is the MDS-side osc connection to lustre-OST0002, so that is the device to deactivate:

<pre>
[root@mds ~]# lctl --device 7 deactivate
</pre>
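The matching activate command returns the OST to service once space has been freed. A minimal sketch, using the same device number as above; note that a deactivation performed with lctl does not normally persist across an MDS restart:

<pre>
[root@mds ~]# lctl --device 7 activate
</pre>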

4. Again, display the status of the file system components:

<pre>
[root@mds ~]# lctl dl
  0 UP mgs MGS MGS 9
  1 UP mgc MGC192.168.0.10@tcp e384bb0e-680b-ce25-7bc9-81655dd1e813 5
  2 UP mdt MDS MDS_uuid 3
  3 UP lov lustre-mdtlov lustre-mdtlov_UUID 4
  4 UP mds lustre-MDT0000 lustre-MDT0000_UUID 5
  5 UP osc lustre-OST0000-osc lustre-mdtlov_UUID 5
  6 UP osc lustre-OST0001-osc lustre-mdtlov_UUID 5
  7 IN osc lustre-OST0002-osc lustre-mdtlov_UUID 5
  8 UP osc lustre-OST0003-osc lustre-mdtlov_UUID 5
  9 UP osc lustre-OST0004-osc lustre-mdtlov_UUID 5
 10 UP osc lustre-OST0005-osc lustre-mdtlov_UUID 5
</pre>


As can be seen from the device list, OST:2 is now inactive (its state has changed from UP to IN). If a new file is now written to the file system, the write succeeds because its stripes are allocated across the remaining active OSTs.
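Deactivating the OST only stops new objects from being allocated on it; the data already there still has to be moved to restore balance. A minimal sketch, assuming the full OST is lustre-OST0002 and using test_2 as a stand-in file name: lfs find locates files with objects on that OST, and copying a file and renaming it back re-creates it with stripes on the active OSTs:

<pre>
# Locate files with objects on the full OST
[root@LustreClient01 ~]# lfs find --obd lustre-OST0002_UUID /mnt/lustre
# Re-create one such file so its stripes land on the active OSTs
[root@LustreClient01 ~]# cp -a /mnt/lustre/test_2 /mnt/lustre/test_2.tmp
[root@LustreClient01 ~]# mv /mnt/lustre/test_2.tmp /mnt/lustre/test_2
</pre>

Once enough space has been recovered, the OST can be reactivated with lctl --device 7 activate, as shown above.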