QuantumATK Forum
QuantumATK => General Questions and Answers => Topic started by: postnikov on June 30, 2010, 05:44
-
I am performing calculations with the ATK code. In the original run, the charge in the scattering region varied smoothly from step to step, as follows:
# sc 0 : q = 430.00000 e
# sc 1 : q = 415.65637 e dRho = 9.8695E-01
# sc 2 : q = 620.18219 e dRho = 5.9300E+01
# sc 3 : q = 442.87604 e dRho = 5.8095E+01
# sc 4 : q = 439.45299 e dRho = 9.4812E-01
# sc 5 : q = 437.16236 e dRho = 3.9194E-01
.......
# sc 36 : q = 432.10083 e dRho = 5.4732E-01
# sc 37 : q = 431.76483 e dRho = 6.5820E-02
# sc 38 : q = 431.26704 e dRho = 1.7698E-01
# sc 39 : q = 430.68879 e dRho = 3.4002E-01
# sc 40 : q = 431.47270 e dRho = 3.1151E-01
# sc 41 : q = 431.60416 e dRho = 1.8738E-02
# sc 42 : q = 431.88605 e dRho = 4.7449E-02
Unfortunately, my calculation was killed by a computer shutdown. When I restarted it, the charge in the scattering region looked very strange:
# sc 0 : q = 432.55992 e
# sc 0 : q = 432.55992 e
# sc 0 : q = 432.55992 e
# sc 1 : q = -4.00223 e dRho = 3.6860E+01
# sc 1 : q = -4.00223 e dRho = 3.6860E+01
# sc 1 : q = -4.00223 e dRho = 3.6860E+01
-
Apart from the negative charge, which could still be recovered in step 2, you are somehow not running the second calculation properly in parallel: every SCF line is printed multiple times, so it seems all nodes think they are master nodes... How did you restart (script and command line)?
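A quick way to verify this (a minimal sketch, assuming mpi4py is installed on your cluster; it is not part of ATK) is to run a trivial MPI script with the exact same launch command you use for ATK:

# mpi_check.py: in a healthy 8-process run this prints "rank 0 of 8" through
# "rank 7 of 8". If every process prints "rank 0 of 1", each one believes it
# is the master, which matches the repeated "# sc" lines above.
from mpi4py import MPI

comm = MPI.COMM_WORLD
print("rank %d of %d" % (comm.Get_rank(), comm.Get_size()))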
-
This is the relevant part of my initial script:
...
runtime_parameters = runtimeParameters(
    verbosity_level = 10,
    checkpoint_filename = 'twoprobe.nc'
)
# Perform self-consistent field calculation
scf = executeSelfConsistentCalculation(
    twoprobe_configuration,
    two_probe_method,
    runtime_parameters = runtime_parameters,
    initial_calculation = scf
)
...
My input file for the restart calculation looks like this:
...
runtime_parameters = runtimeParameters(
    verbosity_level = 10,
    checkpoint_filename = 'newtwoprobe.nc'
)
# Restore the state of the interrupted run from its checkpoint file
scf = restoreSelfConsistentCalculation("twoprobe.nc")
# Perform self-consistent field calculation, starting from the restored state
scf = executeSelfConsistentCalculation(
    twoprobe_configuration,
    two_probe_method,
    runtime_parameters = runtime_parameters,
    initial_calculation = scf
)
...
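For reference, a slightly more defensive version of this restart (just a sketch, reusing the same function names as above and assuming that initial_calculation accepts None to mean "start from scratch") would guard against a missing or renamed checkpoint file:

import os

checkpoint = "twoprobe.nc"
if os.path.exists(checkpoint):
    # Resume from the interrupted run's checkpoint.
    initial = restoreSelfConsistentCalculation(checkpoint)
else:
    initial = None  # assumed fallback: begin a fresh SCF cycle

scf = executeSelfConsistentCalculation(
    twoprobe_configuration,
    two_probe_method,
    runtime_parameters = runtime_parameters,
    initial_calculation = initial
)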
-
And exactly the same mpiexec command in both cases?
-
Yes, in both cases I submit the job with the following command:
mpirun -np 8 /home/postnikov/atk-2008-10/bin/atk test.py </dev/null | tee out&
and "which mpirun" shows:
/opt/intel/mpich2-1.0.7rc1/bin/mpirun
-
It seems you are not running with the proper MPICH2 libraries. You must use that MPI; ATK does not support Intel MPI, even if it is supposed to be "compatible". The effect you see, where all nodes think they are masters, is a typical symptom. It also means you are not actually getting any proper parallel performance.
To run ATK in parallel, you should use "mpiexec" from MPICH2 (the one from Argonne!).
Unless your mpirun is some kind of alias for that...?
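If you want to check that from Python (a hedged sketch; the launcher path is taken from your output and may differ on other systems), you can see whether mpirun is just a symlink:

import os

launcher = "/opt/intel/mpich2-1.0.7rc1/bin/mpirun"
print("is a symlink: %s" % os.path.islink(launcher))
print("resolves to: %s" % os.path.realpath(launcher))

If the second line prints a path ending in bin/mpiexec, then mpirun and mpiexec are the same program on your machine.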
-
Thanks!
I don't think my mpirun is Intel MPI.
The MPICH2 I use was simply built with the ifort compiler, not the PGI compiler.
In my directory /opt/intel/mpich2-1.0.7rc1/bin/ there is an mpiexec file.
Do you mean I must use mpiexec rather than mpirun, even though they both belong to the same MPICH2?
By the way, can OpenMPI be used with the ATK code?
-
No, OpenMPI cannot be used; it is a completely different MPI implementation from MPICH(2).
In MPICH1 the command was "mpirun"; in MPICH2 it was changed to "mpiexec", but usually there is a symbolic link mpirun -> mpiexec for compatibility, so that probably makes no real difference.
In any case, ATK only officially supports MPICH2 from Argonne. 1.0.7rc1 is also a very old version; the current release is 1.2.1, and I suggest you install that.