
Cluster calculation

In the cluster calculation, two loops are parallelized: the loop over the spin multiplicity and the loop over the eigenstates. Here the spin multiplicity is one for spin-unpolarized, two for spin-polarized, and one for non-collinear calculations. Parallelization is applied first to the spin multiplicity and then to the eigenstates.

In the eigenvalue solver, the Householder transformation, which tridiagonalizes a Hermitian matrix, the back transformation, and other matrix operations are parallelized. Only the eigenvalues and eigenvectors of the tridiagonalized matrix are evaluated using LAPACK routines. This step accounts for a minor part of the computational time of the diagonalization, provided that only the eigenvectors of the occupied and low-lying excited states are evaluated. To avoid calculating eigenstates in the high-energy region, it is highly recommended to use 'dstevx', which is specified by the following keyword:

    scf.lapack.dste    dstevx    # dstegr|dstedc|dstevx, default=dstevx

Since 'dstevx' is the default, you do not need to specify the keyword if you want to use 'dstevx'. With 'dstevx', the eigenstates to be calculated are determined automatically from the number of electrons. With the other schemes, 'dstegr' and 'dstedc', eigenstates in the higher energy region are also calculated.

Figure 18 (b) shows the speed-up ratio in the elapsed time as a function of the number of processors for a spin-polarized calculation of a single molecular magnet consisting of 148 atoms. The input file 'Mn12.dat' is found in the directory 'work'. The speed-up ratio is 19 and 27 using 32 and 64 processors, respectively.
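To illustrate why restricting the tridiagonal step to the low-energy states saves work, the following is a minimal Python sketch of a selected-eigenvalue solver for a real symmetric tridiagonal matrix. It uses bisection on Sturm-sequence counts and computes only the k lowest eigenvalues. It is not the LAPACK 'dstevx' implementation and not OpenMX code; the function names `count_below` and `lowest_eigenvalues`, the parameter `k`, and the tolerance are assumptions made for the illustration. In particular, how OpenMX derives the number of states from the number of electrons is not reproduced here.

```python
def count_below(d, e, x):
    """Count the eigenvalues smaller than x for the symmetric tridiagonal
    matrix with diagonal d and off-diagonal e (Sturm-sequence count)."""
    tiny = 1e-300  # stand-in for an exact zero pivot
    count = 0
    q = 1.0
    for i in range(len(d)):
        off = e[i - 1] if i > 0 else 0.0
        if q == 0.0:
            q = tiny
        q = d[i] - x - off * off / q
        if q < 0.0:
            count += 1
    return count


def lowest_eigenvalues(d, e, k, tol=1e-12):
    """Return the k lowest eigenvalues in ascending order (hypothetical helper)."""
    n = len(d)
    # Gershgorin bounds enclose the whole spectrum.
    radius = [
        (abs(e[i - 1]) if i > 0 else 0.0) + (abs(e[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    lo_bound = min(d[i] - radius[i] for i in range(n))
    hi_bound = max(d[i] + radius[i] for i in range(n))

    values = []
    for j in range(k):  # j-th eigenvalue, counted from the bottom
        lo, hi = lo_bound, hi_bound
        while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
            mid = 0.5 * (lo + hi)
            if count_below(d, e, mid) > j:
                hi = mid  # at least j+1 eigenvalues lie below mid
            else:
                lo = mid
        values.append(0.5 * (lo + hi))
    return values


if __name__ == "__main__":
    # 4x4 test matrix: 2 on the diagonal, -1 on the off-diagonal.
    print(lowest_eigenvalues([2.0] * 4, [-1.0] * 3, 2))
```

For the 4x4 test matrix the exact eigenvalues are 2 - 2cos(j*pi/5) for j = 1, ..., 4, so the two lowest are approximately 0.381966 and 1.381966. The cost of the sketch grows with k rather than with the matrix dimension alone, which is the reason a selected-eigenvalue driver is attractive when only occupied and low-lying excited states are needed.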

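The two-level parallelization described above, with the spin multiplicity taking priority over the eigenstates, can be pictured with the following Python sketch. It only shows one plausible way to split the work among processes and is not taken from the OpenMX source; the function names `split_range` and `assign_work` and the contiguous block distribution are assumptions.

```python
def split_range(n, parts):
    """Split range(n) into `parts` contiguous blocks of nearly equal size."""
    base, extra = divmod(n, parts)
    blocks = []
    start = 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


def assign_work(num_procs, spin_mult, num_states):
    """Map each rank to a list of (spin, first_state, end_state) tasks.

    Spin is distributed first; the eigenstates of each spin channel are
    then distributed among the ranks assigned to that channel.
    """
    work = {rank: [] for rank in range(num_procs)}
    if num_procs < spin_mult:
        # Too few processes to separate the spin channels:
        # each rank handles whole spin channels in turn.
        for spin in range(spin_mult):
            work[spin % num_procs].append((spin, 0, num_states))
        return work
    for spin, (r0, r1) in enumerate(split_range(num_procs, spin_mult)):
        state_blocks = split_range(num_states, r1 - r0)
        for rank, (s0, s1) in zip(range(r0, r1), state_blocks):
            work[rank].append((spin, s0, s1))
    return work


if __name__ == "__main__":
    # Spin-polarized case (multiplicity 2), 4 processes, 10 eigenstates.
    for rank, tasks in assign_work(4, 2, 10).items():
        print(rank, tasks)
```

In this sketch a spin-polarized run on 4 processes gives ranks 0 and 1 the first spin channel and ranks 2 and 3 the second, each rank holding half of the eigenstates of its channel. For a spin-unpolarized or non-collinear run the multiplicity is one, so all processes share the eigenstate loop.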

2009-08-28