Author Topic: Parallel only works on 3 CPUs.  (Read 5536 times)

Offline frsy

  • Heavy QuantumATK user
  • ***
  • Posts: 33
  • Reputation: 0
Parallel only works on 3 CPUs.
« on: April 6, 2009, 07:35 »
Dear All,
    I run ATK with the command:
      mpirun -np 8 atk job.py &> outfile&
    Then I ran "top" to watch the system and found eight atk_exec processes running. But after ATK printed "# sc  0 : Fermi Energy =    0.00000 Ry" to outfile, only three atk_exec processes kept running until the end of the job. I changed the number of processes from 8 to 4, but nothing changed: only three atk_exec processes were running in the SCF loop.
    Is this normal? My job is an SCF calculation using TwoProbeMethod.
    Regards,

Frsy
« Last Edit: April 6, 2009, 07:41 by frsy »

Offline Nordland

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 812
  • Reputation: 18
Re: Parallel only works on 3 CPUs.
« Reply #1 on: April 6, 2009, 07:51 »
I have never seen this before, but I have two things you can check:

1) Do you have enough licenses for running 8 jobs?

2) Do you use a machine file, or the system default machine file? Are there more than 3 host names listed in this file?

Offline frsy

Re: Parallel only works on 3 CPUs.
« Reply #2 on: April 6, 2009, 08:00 »
1) Do you have enough licenses for running 8 jobs?
Yes.

2) Do you use a machine file, or the system default machine file? Are there more than 3 host names listed in this file?
I do not use any machine file since I use "mpirun" instead of "mpiexec". Intel MPI outputs
"WARNING: Can't read mpd.hosts for list of hosts, start only on current"
at the beginning of outfile. This message is not an error. I also run VASP this way and it always works. I will try "mpiexec" and check whether this still happens.

Offline Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5411
  • Country: dk
  • Reputation: 89
Re: Parallel only works on 3 CPUs.
« Reply #3 on: April 6, 2009, 08:53 »
First a check: which version of ATK are you using? The use of "mpirun" rather than "mpiexec" first led me to believe that you have an older, outdated version (i.e. older than 2008.02). But, then you write "Intel MPI", which could be the simple explanation: ATK only works with MPICH2 (ver 1.0.8 or similar).

Offline frsy

Re: Parallel only works on 3 CPUs.
« Reply #4 on: April 7, 2009, 04:40 »
First a check: which version of ATK are you using?
ATK 2008.10.0 Linux-x86_64

But, then you write "Intel MPI", which could be the simple explanation: ATK only works with MPICH2 (ver 1.0.8 or similar).
Now I have tried the MPICH2 package included in Fedora 10; mpich2version outputs:
MPICH2 Version:         1.0.8
MPICH2 Release date:    Unknown, built on Tue Mar 10 00:21:11 EDT 2009
MPICH2 Device:          ch3:nemesis

This time I wrote a machine file, but the same thing happened in my SCF calculation with TwoProbeMethod: only 3 CPUs (sometimes 2) were working once atk_exec entered the SCF loop.
So this should not be an error caused by MPICH2 or Intel MPI. I think it may be related to TwoProbeMethod: if I run the bulk job (KSMethod), all 8 CPUs work the whole time.

Can you give me more hints? Are there other parameters in the input file that control the parallelization?

BTW: The manual section "Launching a parallel job using MPICH2" says:
If you want to run on a specific set of machines you can construct a machine file. To run 2 jobs on the specified machines:
mpiexec -n 2 -machinefile mymachinefile $ATK_BIN_DIR/atk [args...]

I believe this is not correct (at least on my machine). Since -machinefile is a global argument, it must appear before the local argument -n:
mpiexec -machinefile mymachinefile -n 2 $ATK_BIN_DIR/atk [args...]
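
For anyone scripting job launches, the ordering rule above can be captured in a small Python helper. This is only a sketch: build_mpiexec_command is a hypothetical name, and the global-before-local ordering is the behaviour reported in this post for MPICH2 1.0.8, not something checked against other MPI launchers.

```python
def build_mpiexec_command(n_procs, machinefile, program, args=()):
    """Return an mpiexec argument list with -machinefile before -n.

    Hypothetical helper: the ordering follows the observation in this
    post (MPICH2 1.0.8); other launchers may accept either order.
    """
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    command = ["mpiexec"]
    if machinefile is not None:
        # Global option first, as required on the poster's machine.
        command += ["-machinefile", machinefile]
    # Local options (those tied to one executable) follow the global ones.
    command += ["-n", str(n_procs), program]
    command += list(args)
    return command


if __name__ == "__main__":
    # Prints the command line without running it.
    print(" ".join(build_mpiexec_command(2, "mymachinefile", "atk", ["job.py"])))
```

The returned list can be passed to subprocess.run as-is; passing machinefile=None simply leaves the option out.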
« Last Edit: April 7, 2009, 04:49 by frsy »

Offline Anders Blom

Re: Parallel only works on 3 CPUs.
« Reply #5 on: April 7, 2009, 09:49 »
The degree of parallelization also depends to some extent on the system and the parameters used. For instance, if you have 1x1x100 k-point sampling, then you will see good parallelization in the beginning, for the electrodes, but once you get into the two-probe calculation there isn't much for ATK to parallelize over.
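
To make the k-point argument concrete, here is a rough sketch in Python. It assumes a deliberately simplified model in which whole k-points are the only unit of parallel work, and in which the two-probe part samples only the two transverse directions. Neither assumption is taken from the ATK documentation, the function names are made up for illustration, and symmetry reduction of the k-point set is ignored.

```python
def kpoint_counts(na, nb, nc):
    """Return (electrode, two_probe) k-point counts for an na x nb x nc grid.

    Assumption: the electrode uses the full grid, while the two-probe
    part uses only the transverse na x nb grid (no sampling along the
    transport direction).
    """
    electrode = na * nb * nc
    two_probe = na * nb
    return electrode, two_probe


def busy_processes(n_procs, n_kpoints):
    """Upper bound on MPI processes that have work to do.

    Simplified model, not ATK's actual scheduler: a process is busy
    only if it is assigned at least one whole k-point.
    """
    return min(n_procs, n_kpoints)


if __name__ == "__main__":
    n_procs = 8
    electrode, two_probe = kpoint_counts(1, 1, 100)
    print("electrode:", busy_processes(n_procs, electrode), "of", n_procs)
    print("two-probe:", busy_processes(n_procs, two_probe), "of", n_procs)
```

In this model a 1x1x100 sampling keeps all 8 processes busy for the electrodes but only one for the two-probe part. The thread reports 2-3 busy processes, so the real behaviour is less extreme than the model; it only illustrates why the load drops after the electrode step.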

By the way, are you parallelizing this calculation on a single machine (the use of "top" leads me to believe so)? In that case, it's probably not a good idea to use more than 2-3 parallel processes anyway, because of competition for RAM and cache. You're probably better off using threading, if you have a multi-core CPU, and perhaps 2 MPI processes.

Offline frsy

Re: Parallel only works on 3 CPUs.
« Reply #6 on: April 10, 2009, 04:21 »
Thank you. I think you are right. For the electrode calculations the parallelization is very good, but only a few CPUs are used during the two-probe calculation.