 
 
 
 
 
 
 
 
 
 
In some cases, you may want to measure machine performance for more time-consuming calculations. For this purpose, an automatic running test with relatively large-scale systems can be performed as follows.

  For the MPI parallel running

     % mpirun -np 128 openmx -runtestL
  
  For the OpenMP/MPI parallel running
  
     % mpirun -np 128 openmx -runtestL -nt 2
  
Then, OpenMX will run with 16 test files and compare the calculated
  results with the reference results, which are stored in 'work/large_example'.
  The comparison (absolute differences in the total energy and force) is
  stored in the file 'runtestL.result' in the directory 'work'.
  The reference results were calculated using 16 MPI processes on
  a 2.6 GHz Xeon cluster machine. If the differences are within the last
  seven digits, the installation may be considered successful.
  As an example, 'runtestL.result' generated by the automatic running
  test is shown below:
| 1 | large_example/5_5_13COb2.dat | Elapsed time(s)= 39.43 | diff Utot= 0.000000000013 | diff Force= 0.000000000046 | 
| 2 | large_example/B2C62_Band.dat | Elapsed time(s)= 572.22 | diff Utot= 0.000000000025 | diff Force= 0.000000013928 | 
| 3 | large_example/CG15c-Kry.dat | Elapsed time(s)= 40.71 | diff Utot= 0.000000002112 | diff Force= 0.000000001090 | 
| 4 | large_example/DIA512-1.dat | Elapsed time(s)= 37.93 | diff Utot= 0.000000169524 | diff Force= 0.000000033761 | 
| 5 | large_example/FeBCC.dat | Elapsed time(s)= 81.55 | diff Utot= 0.000000000649 | diff Force= 0.000000001349 | 
| 6 | large_example/GEL.dat | Elapsed time(s)= 47.05 | diff Utot= 0.000000000066 | diff Force= 0.000000000002 | 
| 7 | large_example/GFRAG.dat | Elapsed time(s)= 24.05 | diff Utot= 0.000000000122 | diff Force= 0.000000000015 | 
| 8 | large_example/GGFF.dat | Elapsed time(s)= 639.31 | diff Utot= 0.000000000051 | diff Force= 0.000000000243 | 
| 9 | large_example/MCCN.dat | Elapsed time(s)= 53.72 | diff Utot= 0.000000009994 | diff Force= 0.000000016474 | 
| 10 | large_example/Mn12_148_F.dat | Elapsed time(s)= 76.58 | diff Utot= 0.000000000096 | diff Force= 0.000000000090 | 
| 11 | large_example/N1C999.dat | Elapsed time(s)= 97.56 | diff Utot= 0.000000006902 | diff Force= 0.000000007356 | 
| 12 | large_example/Ni63-O64.dat | Elapsed time(s)= 78.00 | diff Utot= 0.000000000782 | diff Force= 0.000000000047 | 
| 13 | large_example/Pt63.dat | Elapsed time(s)= 60.40 | diff Utot= 0.000000002147 | diff Force= 0.000000000059 | 
| 14 | large_example/SialicAcid.dat | Elapsed time(s)= 47.80 | diff Utot= 0.000000000005 | diff Force= 0.000000000003 | 
| 15 | large_example/ZrB2_2x2.dat | Elapsed time(s)= 143.16 | diff Utot= 0.000000000030 | diff Force= 0.000000000003 | 
| 16 | large_example/nsV4Bz5.dat | Elapsed time(s)= 104.20 | diff Utot= 0.000000010770 | diff Force= 0.000000000605 | 
The comparison was made using 128 MPI processes and 2 OpenMP threads
  (256 cores in total) on a CRAY-XC30.
  Since the automatic running test requires a large amount of memory, you may
  encounter a segmentation fault if only a small number of cores is used.
  Also, the above example implies that the total elapsed time is about 36 minutes
  even when using 256 cores. See also the Section 'Large-scale calculation' for another large-scale
  benchmark calculation.
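
  For longer result files it can be convenient to check 'runtestL.result' with a short script rather than by eye. The following is only a minimal sketch in Python and is not part of OpenMX. It assumes that each result line contains the input file name followed by the labelled fields 'Elapsed time(s)=', 'diff Utot=' and 'diff Force=' as in the listing above. The function names `parse_result` and `summarize` are hypothetical, and the tolerance of 1.0e-5 is one possible reading of "within the last seven digits" of a 12-decimal value, so it should be adjusted to your own criterion.

```python
import re

# Assumed line layout: "<name>.dat ... Elapsed time(s)= T ... diff Utot= U ... diff Force= F"
# (taken from the listing in this section; the real file may use different spacing).
LINE_RE = re.compile(
    r"(?P<name>\S+\.dat).*?"
    r"Elapsed time\(s\)=\s*(?P<time>[0-9.]+).*?"
    r"diff Utot=\s*(?P<utot>[0-9.Ee+-]+).*?"
    r"diff Force=\s*(?P<force>[0-9.Ee+-]+)"
)


def parse_result(text):
    """Return a list of (name, elapsed_s, diff_utot, diff_force) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:  # lines that do not match the assumed layout are skipped
            rows.append((
                m.group("name"),
                float(m.group("time")),
                float(m.group("utot")),
                float(m.group("force")),
            ))
    return rows


def summarize(rows, tol=1.0e-5):
    """Return (total elapsed seconds, names whose differences exceed tol).

    tol is an assumed threshold, not a value defined by OpenMX.
    """
    total = sum(r[1] for r in rows)
    failed = [r[0] for r in rows if r[2] > tol or r[3] > tol]
    return total, failed


# Two lines copied from the example listing, used here as sample input.
SAMPLE = """\
| 1 | large_example/5_5_13COb2.dat | Elapsed time(s)= 39.43 | diff Utot= 0.000000000013 | diff Force= 0.000000000046 |
| 2 | large_example/B2C62_Band.dat | Elapsed time(s)= 572.22 | diff Utot= 0.000000000025 | diff Force= 0.000000013928 |
"""

rows = parse_result(SAMPLE)
total, failed = summarize(rows)
print("entries:", len(rows))
print("total elapsed time (min): %.1f" % (total / 60.0))
print("entries above tolerance:", failed)
```

  To apply the sketch to an actual run, the contents of 'work/runtestL.result' can be read with `open(...).read()` and passed to `parse_result` in place of `SAMPLE`. Summing the elapsed times in this way is also how the total run time quoted above can be reproduced.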
 
 
 
 
 
 
 
 
