Author Topic: Problem with a parallel version on a cluster  (Read 10770 times)

Offline CVD

  • New QuantumATK user
  • *
  • Posts: 3
  • Reputation: 0
Problem with a parallel version on a cluster
« on: February 24, 2009, 12:31 »
Hi,

I run some calculations on a cluster that I do not manage. I use four processors, but I don't think it works correctly: the script is supposed to write some text during the calculation, and each message is printed four times! So it looks as if there are four serial processes running instead of one parallel job.

Could somebody confirm that there is a problem? If there is, what should I tell the manager of the cluster? Is it an installation problem?

Thanks !

CVD

Offline CVD

Re: Problem with a parallel version on a cluster
« Reply #1 on: February 24, 2009, 13:29 »
I forgot to say that I use ATK version 2008.10.

Offline Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5405
  • Country: dk
  • Reputation: 89
Re: Problem with a parallel version on a cluster
« Reply #2 on: February 24, 2009, 13:43 »
Yes, there is a problem, and quite a typical one. ATK is running 4 master processes instead of one calculation parallelized over 4 nodes.
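
For anyone who wants to check a job log for this symptom, here is a rough Python sketch. It is not part of ATK; the function name is made up for illustration, and it only tests whether every distinct non-empty line of output appears a multiple of N times, where N is the number of processes requested. A script that deliberately prints from every process would also trigger it, so treat a positive result as a hint only.

```python
from collections import Counter


def looks_like_independent_copies(log_lines, n_processes):
    """Heuristic: True if every distinct non-empty line occurs a multiple
    of n_processes times, as expected when each process prints everything."""
    if n_processes < 2:
        return False  # nothing to detect for a serial run
    counts = Counter(line.strip() for line in log_lines if line.strip())
    if not counts:
        return False  # empty log, no evidence either way
    return all(count % n_processes == 0 for count in counts.values())


if __name__ == "__main__":
    # Made-up log text, for illustration only.
    duplicated = ["step 1 done"] * 4 + ["step 2 done"] * 4
    print(looks_like_independent_copies(duplicated, 4))
```

If the result is true for a script that should only print once, the job is probably running as separate serial copies.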

Most likely you are using the wrong MPI library, so the first thing to check is which version of MPICH you are using. It should be MPICH2 1.0.x (anything from 1.0.5 to 1.0.8 works fine), whereas MPICH1 (1.2.7 etc.) will not work.
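
If you have to check several machines, the version range above can be encoded in a small Python helper. This is only a sketch: the function names are hypothetical, the version string has to come from your cluster manager or from the MPI installation itself, and the check only compares numbers (it cannot tell whether the library really is MPICH2). Versions outside 1.0.5 to 1.0.8 are reported as "not in the confirmed range", which is not the same as "known to fail".

```python
import re

# Range quoted above: MPICH2 1.0.5 through 1.0.8.
CONFIRMED_MIN = (1, 0, 5)
CONFIRMED_MAX = (1, 0, 8)


def parse_version(text):
    """Return the first x.y.z number found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def in_confirmed_range(version_text):
    """True if the version number lies within 1.0.5 to 1.0.8 inclusive."""
    version = parse_version(version_text)
    if version is None:
        return False
    return CONFIRMED_MIN <= version <= CONFIRMED_MAX


if __name__ == "__main__":
    for candidate in ("1.0.5", "1.0.8", "1.2.7"):
        print(candidate, in_confirmed_range(candidate))
```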

Also, what command do you use to start the parallel calculation? If you use "mpirun", then you have MPICH1 (which is wrong); if you use "mpiexec", then you may actually have MPICH2 (which is right), and the problem may be a different one.
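
The rule of thumb above can be written down as a short Python sketch. The function name and the example command line are hypothetical, and the mapping is just the heuristic from this post: the launcher name is a hint, not proof of which MPI library is installed. The script also lists which launchers are visible on the PATH, using only the standard library.

```python
import os
import shutil


def guess_mpi_flavor(launch_command):
    """Map a launch command line to the MPICH generation it hints at."""
    words = launch_command.split()
    if not words:
        return "unknown"
    launcher = os.path.basename(words[0])  # tolerate a full path to the launcher
    if launcher == "mpirun":
        return "MPICH1 (will not work)"
    if launcher == "mpiexec":
        return "possibly MPICH2 (look for another cause)"
    return "unknown"


if __name__ == "__main__":
    # Show which launchers are on the PATH; presence alone proves little.
    for name in ("mpirun", "mpiexec"):
        print(name, "->", shutil.which(name) or "not found")
    # Hypothetical command line, for illustration only.
    print(guess_mpi_flavor("mpirun -np 4 atk script.py"))
```

Paste in the exact command from your job script, and pass the result on to the cluster manager together with the MPICH version.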

There is a very simple test to check if the parallel environment is set up properly for ATK; see this post.