parallel questions
Date: 2006/05/01 06:02
Name: Jeason

Dear Prof. Ozaki,

I think I have successfully compiled the code in parallel.
The environment is: icc9.0, acml3.1, lam7.0.6, Opteron 2.0 GHz


But when testing, I find two problems:

1. When running in parallel, I need the same or more memory per node than in the serial run.

2. When I run the auto-test, I get this result:

1) Running on a single CPU, serial run:
1 input_example/Benzene.dat Elapsed time(s)= 29.09 diff Utot= 0.000000000000 diff Force= 0.000000000001
2 input_example/C60.dat Elapsed time(s)= 364.59 diff Utot= 0.000000000001 diff Force= 0.000000000001
3 input_example/Cdia.dat Elapsed time(s)= 24.06 diff Utot= 0.000000000001 diff Force= 0.000000002656
4 input_example/CO.dat Elapsed time(s)= 130.29 diff Utot= 0.000000000000 diff Force= 0.000000000033
5 input_example/Cr2.dat Elapsed time(s)= 51.81 diff Utot= 0.000000000000 diff Force= 0.000000000000
6 input_example/Crys-MnO.dat Elapsed time(s)= 523.67 diff Utot= 0.000000000032 diff Force= 0.000000000060
7 input_example/GaAs.dat Elapsed time(s)= 337.79 diff Utot= 0.000000000074 diff Force= 0.000000001227
8 input_example/Glycine.dat Elapsed time(s)= 50.35 diff Utot= 0.000000000001 diff Force= 0.000000000001
9 input_example/Graphite4.dat Elapsed time(s)= 22.74 diff Utot= 0.000000000064 diff Force= 0.000000000213
10 input_example/H2O.dat Elapsed time(s)= 24.65 diff Utot= 0.000000000000 diff Force= 0.000000000001
11 input_example/H2O-EF.dat Elapsed time(s)= 37.07 diff Utot= 0.000000000000 diff Force= 0.000000000197
12 input_example/HYb.dat Elapsed time(s)= 100.95 diff Utot= 0.000000000002 diff Force= 0.000000000000
13 input_example/Methane.dat Elapsed time(s)= 18.26 diff Utot= 0.000000000045 diff Force= 0.000000004540
14 input_example/Mol_MnO.dat Elapsed time(s)= 161.71 diff Utot= 0.000000000026 diff Force= 0.000000000046

2) On 6 CPUs, parallel run:
1 input_example/Benzene.dat Elapsed time(s)= 68.58 diff Utot= 0.000000000000 diff Force= 0.000000000001
2 input_example/C60.dat Elapsed time(s)= 381.21 diff Utot= 0.000000000010 diff Force= 0.000000000005
3 input_example/Cdia.dat Elapsed time(s)= 19.10 diff Utot= 0.000000000000 diff Force= 0.000000002025
4 input_example/CO.dat Elapsed time(s)= 113.78 diff Utot= 0.000000000000 diff Force= 0.000000000035
5 input_example/Cr2.dat Elapsed time(s)= 41.63 diff Utot= 0.000000000001 diff Force= 0.000000000000
6 input_example/Crys-MnO.dat Elapsed time(s)= 268.52 diff Utot= 0.000000000007 diff Force= 0.000000000219
7 input_example/GaAs.dat Elapsed time(s)= 404.82 diff Utot= 0.000000000123 diff Force= 0.000000000756
8 input_example/Glycine.dat Elapsed time(s)= 81.26 diff Utot= 0.000000000001 diff Force= 0.000000000001
9 input_example/Graphite4.dat Elapsed time(s)= 18.24 diff Utot= 0.000000000012 diff Force= 0.000000001777
10 input_example/H2O.dat Elapsed time(s)= 19.13 diff Utot= 0.000000000000 diff Force= 0.000000000001
11 input_example/H2O-EF.dat Elapsed time(s)= 26.73 diff Utot= 0.000000000002 diff Force= 0.000000000140
12 input_example/HYb.dat Elapsed time(s)= 121.02 diff Utot= 0.000000000001 diff Force= 0.000000000001
13 input_example/Methane.dat Elapsed time(s)= 23.37 diff Utot= 0.000000000001 diff Force= 0.000000000095
14 input_example/Mol_MnO.dat Elapsed time(s)= 169.46 diff Utot= 0.000000000013 diff Force= 0.000000000009

The two runs have almost the same efficiency. I want to know why.

Could you help me?

Thanks







Re: parallel questions ( No.1 )
Date: 2006/05/12 10:44
Name: T.Ozaki

Hi,

> 1. When running in parallel, I need the same or
> more memory per node than in the serial run.

For smaller systems consisting of fewer than 100 atoms
with conventional diagonalization, the required memory size
is not reduced significantly, since each spatially
decomposed region carries a spatial buffer region so that
the matrix elements on each node can be constructed with
minimum MPI communication.

But for O(N) methods, or for larger systems consisting of more
than 100 atoms, you can confirm that the required memory size
is largely reduced.
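
The memory argument above can be illustrated with a toy count. This is only a sketch under loud assumptions: atoms sit on a 1D line with unit spacing, each rank owns a contiguous block, and the buffer region is modelled as `halo` extra atoms on each side. The function name `atoms_held_per_rank` and all numbers are hypothetical and are not taken from OpenMX.

```python
def atoms_held_per_rank(n_atoms, n_ranks, halo):
    """Count atoms each rank must hold: its own block plus a buffer.

    Toy model (assumption): atoms are indexed 0..n_atoms-1 on a line,
    and the buffer is `halo` atoms on each side of the owned block,
    clipped at the ends of the system.
    """
    counts = []
    for rank in range(n_ranks):
        start = rank * n_atoms // n_ranks        # first owned atom
        stop = (rank + 1) * n_atoms // n_ranks   # one past last owned atom
        lo = max(0, start - halo)                # buffer on the left
        hi = min(n_atoms, stop + halo)           # buffer on the right
        counts.append(hi - lo)
    return counts


# Small system: the buffer is as large as the system itself.
small = atoms_held_per_rank(n_atoms=60, n_ranks=6, halo=30)
# Larger system with the same buffer width.
large = atoms_held_per_rank(n_atoms=600, n_ranks=6, halo=30)

print("small system, max atoms per rank:", max(small), "of 60")
print("large system, max atoms per rank:", max(large), "of 600")
```

In this toy model the busiest rank of the small system still holds every atom, while in the larger system it holds only a fraction, which is the qualitative behaviour described above.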

> 2. When I run the auto-test, I get this result:

I shall show you my runtest results using 1 and 6 CPUs
of an Opteron cluster. The 6-CPU run is faster than the
1-CPU run in all cases, and the acceleration is especially
large for C60.dat with the DC method. I wonder whether
your parallel environment is well optimized.

Regards,

TO

********** 1 CPU **************

1 input_example/Benzene.dat Elapsed time(s)= 38.12 diff Utot= 0.000000000000 diff Force= 0.000000000002
2 input_example/C60.dat Elapsed time(s)= 456.74 diff Utot= 0.000000000001 diff Force= 0.000000000001
3 input_example/Cdia.dat Elapsed time(s)= 27.91 diff Utot= 0.000000000000 diff Force= 0.000000000764
4 input_example/CO.dat Elapsed time(s)= 299.52 diff Utot= 0.000000000000 diff Force= 0.000000000264
5 input_example/Cr2.dat Elapsed time(s)= 70.40 diff Utot= 0.000000000000 diff Force= 0.000000000000
6 input_example/Crys-MnO.dat Elapsed time(s)= 480.24 diff Utot= 0.000000000246 diff Force= 0.000000000193
7 input_example/GaAs.dat Elapsed time(s)= 452.82 diff Utot= 0.000000000234 diff Force= 0.000000000500
8 input_example/Glycine.dat Elapsed time(s)= 95.19 diff Utot= 0.000000000001 diff Force= 0.000000000000
9 input_example/Graphite4.dat Elapsed time(s)= 33.74 diff Utot= 0.000000000032 diff Force= 0.000000001536
10 input_example/H2O.dat Elapsed time(s)= 41.02 diff Utot= 0.000000000000 diff Force= 0.000000000001
11 input_example/H2O-EF.dat Elapsed time(s)= 99.10 diff Utot= 0.000000000005 diff Force= 0.000000000213
12 input_example/HYb.dat Elapsed time(s)= 169.89 diff Utot= 0.000000000001 diff Force= 0.000000000000
13 input_example/Methane.dat Elapsed time(s)= 30.41 diff Utot= 0.000000000006 diff Force= 0.000000001846
14 input_example/Mol_MnO.dat Elapsed time(s)= 444.23 diff Utot= 0.000000000023 diff Force= 0.000000000032

Total elapased time (s) 2739.32800



********** 6 CPU **************

1 input_example/Benzene.dat Elapsed time(s)= 18.26 diff Utot= 0.000000000000 diff Force= 0.000000000000
2 input_example/C60.dat Elapsed time(s)= 91.06 diff Utot= 0.000000000009 diff Force= 0.000000000004
3 input_example/Cdia.dat Elapsed time(s)= 18.96 diff Utot= 0.000000000000 diff Force= 0.000000004416
4 input_example/CO.dat Elapsed time(s)= 176.96 diff Utot= 0.000000000000 diff Force= 0.000000000038
5 input_example/Cr2.dat Elapsed time(s)= 56.44 diff Utot= 0.000000000001 diff Force= 0.000000000001
6 input_example/Crys-MnO.dat Elapsed time(s)= 138.47 diff Utot= 0.000000015518 diff Force= 0.000000008365
7 input_example/GaAs.dat Elapsed time(s)= 214.62 diff Utot= 0.000000000653 diff Force= 0.000000001369
8 input_example/Glycine.dat Elapsed time(s)= 37.54 diff Utot= 0.000000000001 diff Force= 0.000000000000
9 input_example/Graphite4.dat Elapsed time(s)= 17.76 diff Utot= 0.000000000006 diff Force= 0.000000000071
10 input_example/H2O.dat Elapsed time(s)= 22.84 diff Utot= 0.000000000000 diff Force= 0.000000000001
11 input_example/H2O-EF.dat Elapsed time(s)= 34.30 diff Utot= 0.000000000001 diff Force= 0.000000000003
12 input_example/HYb.dat Elapsed time(s)= 135.32 diff Utot= 0.000000000002 diff Force= 0.000000000000
13 input_example/Methane.dat Elapsed time(s)= 15.36 diff Utot= 0.000000000000 diff Force= 0.000000000000
14 input_example/Mol_MnO.dat Elapsed time(s)= 267.92 diff Utot= 0.000000000065 diff Force= 0.000000000080

Total elapased time (s) 1245.79800
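
The speedup implied by the 1 CPU and 6 CPU listings above can be checked with a few lines of Python. This is a rough sketch: the elapsed times are copied from the two listings, while the function name `speedup_table` and the use of speedup divided by the CPU count as a parallel efficiency are assumptions made for illustration and are not part of the runtest output.

```python
# Elapsed times in seconds, copied from the 1 CPU and 6 CPU listings.
T1 = {"Benzene": 38.12, "C60": 456.74, "Cdia": 27.91, "CO": 299.52,
      "Cr2": 70.40, "Crys-MnO": 480.24, "GaAs": 452.82, "Glycine": 95.19,
      "Graphite4": 33.74, "H2O": 41.02, "H2O-EF": 99.10, "HYb": 169.89,
      "Methane": 30.41, "Mol_MnO": 444.23}
T6 = {"Benzene": 18.26, "C60": 91.06, "Cdia": 18.96, "CO": 176.96,
      "Cr2": 56.44, "Crys-MnO": 138.47, "GaAs": 214.62, "Glycine": 37.54,
      "Graphite4": 17.76, "H2O": 22.84, "H2O-EF": 34.30, "HYb": 135.32,
      "Methane": 15.36, "Mol_MnO": 267.92}


def speedup_table(serial, parallel, n_cpus):
    """Return {case: (speedup, efficiency)} for cases present in both runs."""
    table = {}
    for case, t_serial in serial.items():
        s = t_serial / parallel[case]     # speedup = T(1 CPU) / T(n CPUs)
        table[case] = (s, s / n_cpus)     # efficiency = speedup / n CPUs
    return table


result = speedup_table(T1, T6, n_cpus=6)
total_speedup = sum(T1.values()) / sum(T6.values())

for case, (s, eff) in sorted(result.items(), key=lambda kv: -kv[1][0]):
    print(f"{case:10s} speedup = {s:5.2f}  efficiency = {eff:4.2f}")
print(f"total      speedup = {total_speedup:5.2f}")
```

With these numbers C60.dat comes out at a speedup of about 5.0 on 6 CPUs, and the whole test suite at about 2.2. Running the same function on another pair of runtest listings gives a quick comparison against these reference values.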

