WARNING: This is the _old_ Lustre wiki, and it is in the process of being retired. The information found here is all likely to be out of date. Please search the new wiki for more up to date information.
POSIX Compliance Testing
This page provides instructions for installing and running the POSIX compliance suite of file system tests.
Quick Start
Several versions of the POSIX compliance suite are provided on our FTP site (ftp://ftp.lustre.org:/pub/benchmarks/posix/). Each version is specific to a gcc version and an architecture. Determine which version of gcc you are running locally ({{{gcc -v}}}) and then download the appropriate tarball.
If a package isn't available for your particular combination of gcc and architecture, follow the Detailed Instructions below.
The following quick start versions are provided:
- one-step-gcc2.96-i686.tgz
- one-step-gcc2.96-ia64.tgz
- one-step-gcc3.04-i686.tgz
- one-step-gcc3.2-i686.tgz
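As a rough sketch of how the tarball name is put together from the gcc version and architecture, the helper functions below (hypothetical names, not part of the suite) build the expected file name and check it against the list above. The version string is an assumption: the published names use strings such as 3.04 that may not match what your compiler reports, so compare by eye if the check fails.

```shell
#!/bin/sh
# Build the expected quick start tarball name from a gcc version and an
# architecture string. Both arguments are supplied by the caller.
quickstart_tarball_name() {
    gcc_version=$1
    arch=$2
    printf 'one-step-gcc%s-%s.tgz\n' "$gcc_version" "$arch"
}

# Return success only if the name is one of the four quick start
# packages listed on this page.
is_published_tarball() {
    case $1 in
        one-step-gcc2.96-i686.tgz|one-step-gcc2.96-ia64.tgz) return 0 ;;
        one-step-gcc3.04-i686.tgz|one-step-gcc3.2-i686.tgz) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (not run here): derive the inputs from the local system.
#   name=$(quickstart_tarball_name "$(gcc -dumpversion)" "$(uname -m)")
#   is_published_tarball "$name" || echo "no package: build from scratch"
```

If the check fails, the Detailed Instructions are the fallback.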
Installing Using the Quick Start Script
1. Download the test suite and quick start script into /usr/src/posix:
- one-step-gcc<gcc version>-<arch>.tgz
- one-step-setup.sh
Both are available from ftp://ftp.lustre.org:/pub/benchmarks/posix/
2. Run the setup script by entering:
cd /usr/src/posix
sh one-step-setup.sh
4. Edit the configuration file /mnt/lustre/TESTROOT/tetexec.cfg to enter appropriate values for your system.
5. Save the TESTROOT for running lustre tests by entering:
cd /mnt/lustre
tar zcvf /usr/src/posix/TESTROOT.tgz TESTROOT
Note: The quick start procedure only works with the paths /home/tet and /mnt/lustre. If you want to change the paths, follow the Building from Scratch instructions below and create a new tarball.
Running the Test Suite
To run the test suite, enter:
su - vsx0
. ../profile
tcc -e -a /mnt/lustre/TESTROOT -s scen.exec -p
Detailed Instructions
Building from Scratch
Basic instructions on how to install POSIX for Lustre testing:
1. Download all the POSIX files in ftp://ftp.lustre.org:/pub/benchmarks/posix/
- tet_vsxgen_2.0.tgz
- lts_vsx-pcts-1.0.1.2.tgz
- install.sh
- myscen.bld
- myscen.exec
2. DO NOT configure or mount a Lustre filesystem yet.
3. Run the {{{install.sh}}} script and select /home/tet for the root directory for the test suite installation. Say 'y' to installing the users and groups. Accept the defaults for the packages to install.
4. To avoid a bug in the installation scripts where the test directory is not created properly, create a temporary directory for holding the POSIX tests while they are being built:
- mkdir -p /mnt/lustre/TESTROOT; chown vsx0:vsxg0 /mnt/lustre/TESTROOT
5. Log in as the test user: su - vsx0
6. Run ../setup.sh to build the test suite. Most of the default answers are correct, except the root directory from which to run the testsets. For this you should specify /mnt/lustre/TESTROOT. For "Install pseudolanguages?", answer 'n'.
7. When it asks you "Install scripts into TESTROOT/BIN..?", do not answer right away. Using another terminal (because stopping the script doesn't work), replace the files /home/tet/test_sets/scen.exec and /home/tet/test_sets/scen.bld with the myscen.exec and myscen.bld files you downloaded, i.e.:
- cp .../myscen.bld /home/tet/test_sets/scen.bld
- cp .../myscen.exec /home/tet/test_sets/scen.exec
This will limit the tests that are run to those relevant for filesystems, and avoid hours of other tests on sockets, math, stdio, libc, shell, etc.
8. Continue with the installation at this point, and answer 'y' to the "Build testsets" question. It will proceed to build and install all of the filesystem tests, and will then run them all as well. Even though it is running them on a local filesystem, this is a valuable baseline for comparison with the behaviour of Lustre. It should put the results into /home/tet/test_sets/results/0002e/journal, and I suggest renaming or symlinking this directory to /home/tet/test_sets/results/ext3/journal (or whatever the name of the local filesystem is that the test was run on).
9. Running the full test should only take 5 minutes or so. Answer 'n' to re-running just the failed tests, and you are done. The results (in a very lengthy table) are in /home/tet/test_sets/results/report.
10. At this point you need to save the test suite for running further tests on a Lustre filesystem. Tar up the tests, so we don't have to rebuild each time:
- tar cvzf TESTROOT.tgz -C /mnt/lustre TESTROOT
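Before relying on the saved tarball, it may be worth confirming that it really contains the TESTROOT tree. The following is a minimal sketch; the function name is hypothetical, and it only inspects the archive listing without unpacking anything.

```shell
#!/bin/sh
# Return success if the gzip-compressed tarball "$1" lists at least one
# entry under the top-level directory "$2" (TESTROOT by default).
tarball_has_dir() {
    tarball=$1
    top=${2:-TESTROOT}
    # A missing or unreadable tarball yields no listing, so grep fails.
    tar tzf "$tarball" 2>/dev/null | grep -Eq "^(\./)?$top/"
}

# Example (not run here):
#   tarball_has_dir TESTROOT.tgz && echo "tarball looks usable"
```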
You will probably also want to remove the installed tests at this time, to save a bit of space, and more importantly to avoid confusion if you forget to mount your Lustre filesystem before running the test.
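One way to guard against running the tests on an unmounted directory is to check the mount table first. This is a sketch under the assumption of a Linux system with /proc/mounts; the function name is hypothetical, and it only checks that something is mounted at the path, not that it is a Lustre filesystem.

```shell
#!/bin/sh
# Return success if "$1" appears as a mount point in the mounts table.
# The table defaults to /proc/mounts (Linux); a different file can be
# passed as "$2", which is mainly useful for testing.
is_mounted() {
    target=$1
    table=${2:-/proc/mounts}
    # Field 2 of each line in the mounts table is the mount point.
    awk -v p="$target" '$2 == p { found = 1 } END { exit !found }' "$table"
}

# Example (not run here):
#   is_mounted /mnt/lustre || echo "/mnt/lustre is not mounted" >&2
```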
Running the Suite against Lustre, and Checking your Results
1. As root, set up your Lustre filesystem, mounted on /mnt/lustre (e.g. sh llmount.sh), and untar the POSIX tests back to their home:
- tar --same-owner -xzpvf /path/to/tarball/TESTROOT.tgz -C /mnt/lustre
2. As the vsx0 user, you can re-run the tests as many times as you want. If you are newly su'd or logged in as the vsx0 user, you will need to source the environment with '. profile' so that your path and other environment is set up correctly. You run the tests with:
- . /home/tet/profile
- tcc -e -s scen.exec -a /mnt/lustre/TESTROOT -p
3. Each new result is put in a new directory under /home/tet/test_sets/results and is given a directory name similar to 0004e (an increasing number and ending with e for test execution, or b for building the tests).
To look at a formatted report:
- vrpt results/0004e/journal | less
Some tests are "Unsupported", "Untested", or "Not In Use", which does not necessarily indicate a problem.
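Since each run creates a new numbered directory, a small helper can pick out the most recent execution run. This sketch assumes directory names of four digits followed by e, as in the examples on this page (0002e, 0004e); the function name is hypothetical.

```shell
#!/bin/sh
# Print the path of the highest-numbered execution result directory
# (NNNNe) under the results directory "$1". Fails if there is none.
latest_exec_result() {
    results=$1
    latest=
    # Glob results are sorted, so the last match has the highest number.
    for d in "$results"/[0-9][0-9][0-9][0-9]e; do
        if [ -d "$d" ]; then
            latest=$d
        fi
    done
    [ -n "$latest" ] || return 1
    printf '%s\n' "$latest"
}

# Example (not run here):
#   vrpt "$(latest_exec_result /home/tet/test_sets/results)/journal" | less
```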
To compare two test results, use the command:
- vrptm results/ext3/journal results/0004e/journal | less
This is more interesting than looking at the result of a single test since it helps to find test failures that are specific to the filesystem instead of the Linux VFS or kernel. Up to 6 test results can be compared at once. It is often useful to rename the results directory to have more interesting names so that they have meaning in the future (e.g. before_unlink_fix, or similar).
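Giving a run a meaningful name can be done with a symlink, which keeps the original numbered directory in place. The helper below is a sketch with a hypothetical name; it refuses to overwrite an existing label.

```shell
#!/bin/sh
# Create a symlink "$3" inside the results directory "$1" that points at
# the run directory "$2" (for example 0004e).
label_result() {
    results=$1
    run=$2
    label=$3
    [ -d "$results/$run" ] || return 1      # the run must exist
    [ ! -e "$results/$label" ] || return 1  # never clobber a label
    ln -s "$run" "$results/$label"
}

# Example (not run here):
#   label_result /home/tet/test_sets/results 0004e before_unlink_fix
#   vrptm results/ext3/journal results/before_unlink_fix/journal | less
```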
Isolating and Debugging Failures
1. When failures happen, you'll need to gather some more information about what is happening at runtime. For example, some tests may cause kernel panics depending on your configuration. The POSIX suite doesn't have debugging enabled by default, so it's useful to turn on the debugging options of VSX. There are two debug options of note in the config file tetexec.cfg, under your TESTROOT directory:
- i. VSX_DBUG_FILE=output_file - If you're running the test under UML with hostfs support, you should use a file on the hostfs as the debug output file. In the case of a crash, the debug output will then be safely written to the debug file. NOTE: the default value for this option puts the debug log under your test directory in /mnt/lustre/TESTROOT, which is not very useful if you do experience a kernel panic and Lustre (or your machine) crashes.
- ii. VSX_DBUG_FLAGS=xxxxx - For detailed info about debug flags, please refer to the documentation included with the POSIX suite. The following example will make VSX output all debug messages:
VSX_DBUG_FLAGS=t:d:n:f:F:L:l,2:p:P
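The two options can be edited by hand, but a small helper makes the change repeatable. This is a sketch that assumes tetexec.cfg holds plain KEY=value lines, as the example above suggests; the function name is hypothetical, and the option is moved to the end of the file when it is rewritten.

```shell
#!/bin/sh
# Set KEY=value in the config file "$1", replacing any existing
# uncommented line for that key. The file must already exist.
set_cfg_option() {
    cfg=$1
    key=$2
    value=$3
    [ -f "$cfg" ] || return 1
    tmp=$cfg.tmp.$$
    # Keep every line except the old setting (grep exits 1 if none remain).
    grep -v "^$key=" "$cfg" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back in place so ownership and permissions are preserved.
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Example (not run here; the debug file path is a made-up placeholder):
#   cfg=/mnt/lustre/TESTROOT/tetexec.cfg
#   set_cfg_option "$cfg" VSX_DBUG_FILE /tmp/vsx_debug.log
#   set_cfg_option "$cfg" VSX_DBUG_FLAGS 't:d:n:f:F:L:l,2:p:P'
```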
2. VSX is based on the TET framework which provides common libraries for VSX. You can also have TET print out verbose debug messages by inserting the option -T when running the tests, e.g.
tcc -Tall5 -e -s scen.exec -a /mnt/lustre/TESTROOT -p 2>&1 | tee /tmp/POSIX-command-line-output.log
3. VSX prints detailed messages in the report for failed tests, including the test strategy, the operations the test performed, and what went wrong. Each subtest (e.g. 'access', 'create') usually contains many single tests; the report shows exactly which single test failed. You can then find more information directly in the VSX source code. For example, if the 5th single test of subtest chmod failed, you could look at the source:
- /home/tet/test_sets/tset/POSIX.os/files/chmod/chmod.c
which contains a single test array:
public struct tet_testlist tet_testlist[] = {
    test1, 1, test2, 2, test3, 3, test4, 4, test5, 5, test6, 6,
    test7, 7, test8, 8, test9, 9, test10, 10, test11, 11, test12, 12,
    test13, 13, test14, 14, test15, 15, test16, 16, test17, 17, test18, 18,
    test19, 19, test20, 20, test21, 21, test22, 22, test23, 23,
    NULL, 0
};
If this single test is causing real problems, as in the case of a kernel panic, or if you are trying to isolate a single failure, it may be useful to edit the tet_testlist array down to the single test in question and then recompile the test suite. You can then create a new tarball of the resulting TESTROOT directory, named appropriately (e.g. TESTROOT-chmod-5-only.tgz) and re-run the POSIX suite using the steps above. It may also be helpful to edit the scen.exec file to only run the test set in question:
all "total tests in POSIX.os 1" /tset/POSIX.os/files/chmod/T.chmod
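Rewriting scen.exec for a single test set can be scripted so the original file is not lost. The sketch below assumes the usual TET scenario layout, where the scenario name starts in the first column and its elements follow on indented lines; the function name and the .orig backup suffix are hypothetical.

```shell
#!/bin/sh
# Replace the scenario file "$1" with a scenario named "all" that runs
# only the test set "$2". The first call keeps a backup in "$1.orig".
write_single_scenario() {
    scen=$1
    testpath=$2
    if [ -f "$scen" ] && [ ! -e "$scen.orig" ]; then
        cp -p "$scen" "$scen.orig"
    fi
    # Scenario name in column one, elements on tab-indented lines.
    printf 'all\n\t"total tests in POSIX.os 1"\n\t%s\n' "$testpath" > "$scen"
}

# Example (not run here):
#   write_single_scenario /home/tet/test_sets/scen.exec \
#       /tset/POSIX.os/files/chmod/T.chmod
```

Copying scen.exec.orig back over scen.exec restores the full filesystem scenario.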
NOTE: rebuilding the individual POSIX tests is not exactly straightforward due to the reliance on tcc. I usually end up substituting my edited source files into the source tree while following the manual installation procedure outlined above and letting the existing POSIX install scripts do the work for me. The installation scripts (specifically /home/tet/test_sets/run_testsets.sh) do contain relevant commands for building the test suite (something akin to tcc -p -b -s $HOME/scen.bld $*), but I've never had it work outside of the script I found it in. Please update this documentation if you manage to get better mileage rebuilding these tests.