Automatic running test with large-scale systems

In some cases, one may want to measure machine performance with more time-consuming calculations. For this purpose, an automatic running test with relatively large-scale systems can be performed as follows:

For the MPI parallel running

     % mpirun -np 132 openmx -runtestL
  
For the MPI/OpenMP parallel running

     % mpirun -np 132 openmx -runtestL -nt 2
  
OpenMX then runs 16 test files and compares the calculated results with the reference results stored in 'work/large_example'. The comparison (absolute differences in the total energy and force) is written to the file 'runtestL.result' in the directory 'work'. The reference results were calculated using 16 MPI processes on a 2.6 GHz Xeon cluster machine. If the differences appear only within the last seven digits, we may consider the installation successful. As an example, 'runtestL.result' generated by the automatic running test is shown below:

1 large_example/5_5_13COb2.dat Elapsed time(s)= 29.90 diff Utot= 0.000000000066 diff Force= 0.000000000045
2 large_example/B2C62_Band.dat Elapsed time(s)= 337.18 diff Utot= 0.000000000030 diff Force= 0.000000016106
3 large_example/CG15c-Kry.dat Elapsed time(s)= 40.14 diff Utot= 0.000000011260 diff Force= 0.000000415862
4 large_example/DIA512-1.dat Elapsed time(s)= 25.85 diff Utot= 0.000000000030 diff Force= 0.000000006092
5 large_example/FeBCC.dat Elapsed time(s)= 49.46 diff Utot= 0.000000000094 diff Force= 0.000000000010
6 large_example/GEL.dat Elapsed time(s)= 33.36 diff Utot= 0.000000000028 diff Force= 0.000000000001
7 large_example/GFRAG.dat Elapsed time(s)= 17.98 diff Utot= 0.000000000315 diff Force= 0.000000000030
8 large_example/GGFF.dat Elapsed time(s)= 528.97 diff Utot= 0.000000000068 diff Force= 0.000000000349
9 large_example/MCCN.dat Elapsed time(s)= 45.48 diff Utot= 0.000000000062 diff Force= 0.000000000001
10 large_example/Mn12_148_F.dat Elapsed time(s)= 51.59 diff Utot= 0.000000000093 diff Force= 0.000000000076
11 large_example/N1C999.dat Elapsed time(s)= 85.00 diff Utot= 0.000000000389 diff Force= 0.000000000096
12 large_example/Ni63-O64.dat Elapsed time(s)= 42.77 diff Utot= 0.000000000111 diff Force= 0.000000000085
13 large_example/Pt63.dat Elapsed time(s)= 37.97 diff Utot= 0.000000000246 diff Force= 0.000000000139
14 large_example/SialicAcid.dat Elapsed time(s)= 45.34 diff Utot= 0.000000000004 diff Force= 0.000000000005
15 large_example/ZrB2_2x2.dat Elapsed time(s)= 92.80 diff Utot= 0.000000000086 diff Force= 0.000000000002
16 large_example/nsV4Bz5.dat Elapsed time(s)= 82.71 diff Utot= 0.000000005296 diff Force= 0.000000000023
Total elapsed time (s) 1546.50


The comparison was made using 132 MPI processes with 2 OpenMP threads each (264 cores in total) on a CRAY-XC30. Since the automatic running test requires a large amount of memory, you may encounter a segmentation fault if only a small number of cores is used. The above example also shows that the total elapsed time is about 26 minutes even with 264 cores. See also the Section 'Large-scale calculation' for another large-scale benchmark calculation.
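
The check of 'runtestL.result' described above can also be scripted. The following Python sketch is not part of OpenMX. It assumes that each result line has the layout shown in the listing above, and it reads 'within the last seven digits' as a difference below 1e-5 for values printed with 12 decimal places, which is an interpretation rather than an official threshold. The names parse_results, failing_tests, total_elapsed and report are hypothetical helpers introduced only for illustration.

```python
import re

# One result line is assumed to look like the listing above, e.g.
# 1 large_example/5_5_13COb2.dat Elapsed time(s)= 29.90 diff Utot= 0.000000000066 diff Force= 0.000000000045
LINE_RE = re.compile(
    r"^\s*\d+\s+(\S+)\s+Elapsed time\(s\)=\s*(\S+)"
    r"\s+diff Utot=\s*(\S+)\s+diff Force=\s*(\S+)"
)

# Assumed tolerance (not an official OpenMX value): with 12 printed decimal
# places, a difference confined to the last seven digits is below 1e-5.
ASSUMED_TOL = 1e-5


def parse_results(text):
    """Return a list of (name, elapsed_s, diff_utot, diff_force) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # blank lines and the "Total elapsed time" line do not match
            name, elapsed, d_utot, d_force = m.groups()
            rows.append((name, float(elapsed), float(d_utot), float(d_force)))
    return rows


def failing_tests(rows, tol=ASSUMED_TOL):
    """Return the names of tests whose energy or force difference reaches tol."""
    return [name for name, _, d_utot, d_force in rows
            if d_utot >= tol or d_force >= tol]


def total_elapsed(rows):
    """Sum the per-test elapsed times in seconds."""
    return sum(elapsed for _, elapsed, _, _ in rows)


def report(path="runtestL.result", tol=ASSUMED_TOL):
    """Print a short summary of a result file; return True if nothing is flagged."""
    with open(path) as f:
        rows = parse_results(f.read())
    bad = failing_tests(rows, tol)
    for name in bad:
        print("difference too large:", name)
    total = total_elapsed(rows)
    print(f"{len(rows)} tests, {len(bad)} flagged, "
          f"total elapsed time {total:.2f} s ({total / 60:.1f} min)")
    return not bad
```

Calling report("runtestL.result") from the directory 'work' prints any flagged tests and the summed elapsed time. For the listing above, the 16 per-test times add up to the 1546.50 s given in its last line, which is roughly 26 minutes. A stricter or looser tolerance can be passed through the tol argument if a different reading of the criterion is preferred.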

2016-04-03