openmx runtime questions

Date: 2005/12/08 00:13
Name: peter

Dear Prof. Ozaki,

As you may have seen from my previous posts, I am interested in applying OpenMX to small-molecule modelling. I would like to run similar QM calculations on a reasonably large set (~100) of molecules. Since a single OpenMX run takes a fairly long time (what else can one expect from QM, though), I did some timing measurements.

Here is what I see on my Linux x86 cluster. OpenMX spends a very substantial part of the run time in "Fourier transform of PAO and projectors of VNL". Do I understand correctly that these quantities can be pre-calculated for my set of molecules, provided that I know the atom types in the set and all the molecules are calculated on an FFT grid of the same size (all of my molecules are roughly the same size)? Are there other "constant" quantities you could recommend pre-calculating and storing?

My MPI environment is sensitive to memory access errors. There are a few small "problems" in your code. Below I am sending you two modified functions. The comments in the code mark the problem: it is related to the treatment of the array edges, and in both functions there is an attempt to access unallocated memory. This is not a big deal, but it may cause trouble in a massive calculation, at least on our cluster. I hope this is helpful.

OpenMX spends more than 30% of the run time in Bessel function evaluations. What is the reason for that specific implementation of bessjp (from NR.COM, I guess)? I would think that an implementation of Bessel functions specifically optimized for fast and accurate calculations with real arguments and integer indices might work faster. There are other versions of besseljp (netlib, NIST, and so on). Do you think it is worth looking into this more deeply?

Regards
Peter


Re: openmx runtime questions ( No.1 )

Date: 2005/12/08 00:15
Name: peter

Here is the follow-up with the code fixes:
---------------------------------
double VNAF(int Gensi, double R)
{
  /*
  ..............
  ORIGINAL CODE
  ..............
  */

  /****************************************************
                Spline like interpolation
  ****************************************************/

  if (po==0){

    // h1 = Spe_VPS_RV[Gensi][m-1] - Spe_VPS_RV[Gensi][m-2]; ERROR - ACCESS TO UNALLOCATED MEMORY
    h2 = Spe_VPS_RV[Gensi][m] - Spe_VPS_RV[Gensi][m-1];
    // h3 = Spe_VPS_RV[Gensi][m+1] - Spe_VPS_RV[Gensi][m]; ERROR - ACCESS TO UNALLOCATED MEMORY

    // f1 = Spe_Vna[Gensi][m-2]; ERROR - ACCESS TO UNALLOCATED MEMORY
    f2 = Spe_Vna[Gensi][m-1];
    f3 = Spe_Vna[Gensi][m];
    // f4 = Spe_Vna[Gensi][m+1]; ERROR - ACCESS TO UNALLOCATED MEMORY

    /****************************************************
                  Treatment of edge points
    ****************************************************/

    if (m==1)
    {
      h3 = Spe_VPS_RV[Gensi][m+1] - Spe_VPS_RV[Gensi][m];
      f4 = Spe_Vna[Gensi][m+1];

      h1 = -(h2+h3);
      f1 = f4;
    }
    else if (m==(Spe_Num_Mesh_VPS[Gensi]-1))
    {
      h1 = Spe_VPS_RV[Gensi][m-1] - Spe_VPS_RV[Gensi][m-2];
      f1 = Spe_Vna[Gensi][m-2];

      h3 = -(h1+h2);
      f4 = f1;
    }
    else
    {
      h1 = Spe_VPS_RV[Gensi][m-1] - Spe_VPS_RV[Gensi][m-2];
      h3 = Spe_VPS_RV[Gensi][m+1] - Spe_VPS_RV[Gensi][m];

      f1 = Spe_Vna[Gensi][m-2];
      f4 = Spe_Vna[Gensi][m+1];
    }
  /*
  ..............
  ORIGINAL CODE
  ..............
  */
}

double PhiF(double R, double *phi0, double *MRV, int Grid_Num)
{
  /*
  ..............
  ORIGINAL CODE
  ..............
  */

  /****************************************************
                Spline like interpolation
  ****************************************************/

  if (po==0){

    // h1 = MRV[m-1] - MRV[m-2]; ERROR - ACCESS TO UNALLOCATED MEMORY
    h2 = MRV[m] - MRV[m-1];
    // h3 = MRV[m+1] - MRV[m]; ERROR - ACCESS TO UNALLOCATED MEMORY

    // f1 = phi0[m-2]; ERROR - ACCESS TO UNALLOCATED MEMORY
    f2 = phi0[m-1];
    f3 = phi0[m];
    // f4 = phi0[m+1]; ERROR - ACCESS TO UNALLOCATED MEMORY

    /****************************************************
                  Treatment of edge points
    ****************************************************/

    if (m==1)
    {
      h3 = MRV[m+1] - MRV[m];
      f4 = phi0[m+1];

      h1 = -(h2+h3);
      f1 = f4;
    }
    else if (m==(Grid_Num-1))
    {
      h1 = MRV[m-1] - MRV[m-2];
      f1 = phi0[m-2];

      h3 = -(h1+h2);
      f4 = f1;
    }
    else
    {
      h1 = MRV[m-1] - MRV[m-2];
      h3 = MRV[m+1] - MRV[m];

      f1 = phi0[m-2];
      f4 = phi0[m+1];
    }
  /*
  ..............
  ORIGINAL CODE
  ..............
  */
}
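
To check the edge treatment in isolation, the index handling shared by the two functions above can be sketched as a small standalone helper. This is only an illustration under stated assumptions: the names Stencil and gather_stencil are hypothetical and not part of OpenMX, the mesh is assumed to have at least 3 points, and the interpolation formula itself (elided above as ORIGINAL CODE) is not reproduced.

```c
#include <assert.h>

/* Hypothetical container for the seven quantities gathered above. */
typedef struct {
  double h1, h2, h3;      /* mesh spacings around the interval [m-1, m] */
  double f1, f2, f3, f4;  /* function values at m-2, m-1, m, m+1        */
} Stencil;

/* Gather the stencil for interval m (1 <= m <= n-1) of an n-point mesh,
   reading only indices 0..n-1. Assumes n >= 3, so that each edge branch
   still has one real neighbour on its inner side. */
Stencil gather_stencil(const double *x, const double *f, int n, int m)
{
  Stencil s;
  assert(n >= 3 && m >= 1 && m <= n - 1);

  /* These three reads are valid for every allowed m. */
  s.h2 = x[m] - x[m-1];
  s.f2 = f[m-1];
  s.f3 = f[m];

  if (m == 1) {
    /* Left edge: index m-2 does not exist, so reuse the right neighbour. */
    s.h3 = x[m+1] - x[m];
    s.f4 = f[m+1];
    s.h1 = -(s.h2 + s.h3);
    s.f1 = s.f4;
  } else if (m == n - 1) {
    /* Right edge: index m+1 does not exist, so reuse the left neighbour. */
    s.h1 = x[m-1] - x[m-2];
    s.f1 = f[m-2];
    s.h3 = -(s.h1 + s.h2);
    s.f4 = s.f1;
  } else {
    /* Interior: all four points exist. */
    s.h1 = x[m-1] - x[m-2];
    s.h3 = x[m+1] - x[m];
    s.f1 = f[m-2];
    s.f4 = f[m+1];
  }
  return s;
}
```

For a 4-point mesh, gather_stencil(x, f, 4, 1) reads only indices 0..2 and gather_stencil(x, f, 4, 3) reads only indices 1..3, so a memory checker should stay silent at both edges.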

Re: openmx runtime questions ( No.2 )

Date: 2005/12/08 19:54
Name: T.Ozaki

Hi,

> A very substantial time openmx does "Fourier transform of PAO
> and projectors of VNL".

Its share of the computational time depends on the system size.
Please let me know the typical size of your ~100 molecules.

> Do I understand right that these quantities
> can be pre-calculated for my molecules set provided that I know the
> atoms types in the set and all the molecules are calculated within
> the same size fft grid (all of my molecules are roughly of the same size)?
> Are there other "constant" quantities you could recommend for pre-calculation
> and storing?

The Fourier-transformed quantities generated in "FT_*" can be pre-calculated
and re-used. If your systems consist of fewer than 30 atoms, such a
scheme may be effective in reducing the computational time.
However, I am not planning to implement this capability soon, although your
contribution is welcome. If you are interested, please e-mail me directly, and
I will send you the latest version of OpenMX.
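
As a rough illustration of what such pre-calculation and re-use could look like, here is a minimal sketch of storing a table on disk and reloading it only when its parameters match. Everything in it is an assumption made for illustration: TableKey, save_table, load_table, the choice of key fields, and the file layout are hypothetical and are not taken from OpenMX or its "FT_*" routines.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical key: everything the stored table is assumed to depend on.
   Zero the struct (memset) before filling it so padding bytes are defined. */
typedef struct {
  char   species[32];  /* species label, illustrative only */
  int    num_points;   /* number of tabulated values       */
  double k_max;        /* upper limit of the radial mesh   */
} TableKey;

/* Field-by-field comparison; memcmp would also compare padding bytes. */
static int same_key(const TableKey *a, const TableKey *b)
{
  return strncmp(a->species, b->species, sizeof a->species) == 0 &&
         a->num_points == b->num_points &&
         a->k_max == b->k_max;
}

/* Write key followed by the table values. Returns 0 on success, -1 on error. */
int save_table(const char *path, const TableKey *key, const double *table)
{
  assert(key->num_points > 0);
  size_t n = (size_t)key->num_points;
  FILE *fp = fopen(path, "wb");
  if (fp == NULL) return -1;
  int ok = fwrite(key, sizeof *key, 1, fp) == 1 &&
           fwrite(table, sizeof *table, n, fp) == n;
  if (fclose(fp) != 0) ok = 0;
  return ok ? 0 : -1;
}

/* Return a malloc'd copy of the stored table if the file exists and its key
   matches `want`; otherwise return NULL so the caller recomputes the table.
   The caller owns the returned array and must free() it. */
double *load_table(const char *path, const TableKey *want)
{
  FILE *fp = fopen(path, "rb");
  if (fp == NULL) return NULL;

  TableKey got;
  double *table = NULL;
  if (fread(&got, sizeof got, 1, fp) == 1 && same_key(&got, want)) {
    size_t n = (size_t)want->num_points;
    table = malloc(n * sizeof *table);
    if (table != NULL && fread(table, sizeof *table, n, fp) != n) {
      free(table);     /* truncated file: treat as a cache miss */
      table = NULL;
    }
  }
  fclose(fp);
  return table;
}
```

The intended pattern is to call load_table first and fall back to the normal calculation followed by save_table on a miss. Storing the key in the file header means a table computed with different parameters is rejected and never silently reused. The raw binary layout is only portable between machines with the same struct layout and byte order.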

> My MPI environment is sensitive to memory misallocations. There are
> few small "problems" in your code. Below I am sending you the two modified
> functions. There are comments in the code (the problem is related to treatment
> of the edges of arrays, both times there is an attempt to access unallocated
> memory). This isn't a great deal, though may cause trouble in a massive
> calculation, at least on our cluster. Hope it is helpful.

Following your suggestions, I have fixed the bug, which was present in about
a dozen routines. The fix will be included in the next release. Thank you for
pointing it out.

> More than 30% of time openmx spends in Bessel functions evaluations.
> What actually is the reason for that specific realization of bessjp
> (from NR.COM I guess?). I would think that a realization of Bessel
> functions specifically optimized for fast and accurate calculations
> for real arguments and integer indices might work faster. There are other
> versions of besseljp (netlib, nist and so on). Do you think it is worth
> to go deep into?

I had also noticed that the evaluation of Bessel functions is time-consuming,
and have recently replaced it by implementing a much faster scheme.
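
The faster scheme is not described in this thread. Purely for reference, one textbook approach for integer orders and real arguments is Miller's backward recurrence, sketched below. The sketch assumes ordinary Bessel functions of the first kind J_n(x), which the thread does not confirm. The function name bessel_j_all and the limits on nmax and x are hypothetical choices made for this illustration, and it is not the OpenMX implementation.

```c
#include <assert.h>

/* Fill out[0..nmax] with J_0(x) .. J_nmax(x) using Miller's backward
   recurrence J_{k-1} = (2k/x) J_k - J_{k+1}, normalised with the identity
   J_0 + 2*(J_2 + J_4 + ...) = 1.
   Limits chosen for this sketch only: 0 <= nmax <= 100, 0 <= x <= 50. */
void bessel_j_all(int nmax, double x, double *out)
{
  assert(nmax >= 0 && nmax <= 100);
  assert(x >= 0.0 && x <= 50.0);

  if (x < 1e-100) {
    /* J_0(0) = 1 and J_n(0) = 0 for n > 0; arguments this small are
       treated as zero (absolute error below 1e-100). */
    out[0] = 1.0;
    for (int n = 1; n <= nmax; n++) out[n] = 0.0;
    return;
  }

  /* Start at an even order well above both nmax and x. */
  int top = (nmax > (int)x ? nmax : (int)x) + 60;
  if (top % 2 != 0) top++;

  double jp  = 0.0;    /* unnormalised J_{k+1}                        */
  double jk  = 1e-30;  /* unnormalised J_k, arbitrary small seed      */
  double sum = 0.0;    /* running 2*(J_2 + J_4 + ...), unnormalised   */

  for (int k = top; k >= 1; k--) {
    if (k <= nmax) out[k] = jk;
    if (k % 2 == 0) sum += 2.0 * jk;

    double jm = (2.0 * k / x) * jk - jp;  /* unnormalised J_{k-1} */
    jp = jk;
    jk = jm;

    /* Rescale everything together if the values grow too large. */
    while (jk > 1e100 || jk < -1e100) {
      jk  *= 1e-100;
      jp  *= 1e-100;
      sum *= 1e-100;
      for (int n = k; n <= nmax; n++) out[n] *= 1e-100;
    }
  }

  /* After the loop jk holds the unnormalised J_0. */
  out[0] = jk;
  sum += jk;
  for (int n = 0; n <= nmax; n++) out[n] /= sum;
}
```

One downward sweep yields all orders from 0 to nmax at the same argument, so the cost per value drops when several orders are needed at each x. Whether that matches the access pattern in OpenMX is not stated in this thread.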

Regards,

T.Ozaki
