QuantumATK Forum

QuantumATK => Installation and License Questions => Topic started by: dhurba on July 20, 2012, 11:46

Title: password required when runned in mpiexec
Post by: dhurba on July 20, 2012, 11:46
When I try to run a script using

Code
mpiexec -n 2 -machinefile mymachinefile $ATK_BIN_DIR/atkpython script.py > script.log

it asks for a password every time, for each of the three other machines.

Is there any way to disable the password prompt? I am using a Red Hat cluster.
Title: Re: password required when runned in mpiexec
Post by: Nordland on July 20, 2012, 12:55
The best way is to generate a passphraseless SSH key and use it to log into your machines.

It works really well for me.
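
A minimal sketch of that setup, assuming OpenSSH with ssh-keygen and ssh-copy-id installed, an RSA key at the default path, and a machine file whose first column is the host name. The function names make_key and push_key are only illustrative, not part of any tool:

```shell
# Sketch: passphraseless SSH login to every host in a machine file.
# KEY and the helper names are assumptions; adjust them to your setup.
KEY="${KEY:-$HOME/.ssh/id_rsa}"

# Create a key pair with an empty passphrase (-N "") unless one already exists.
make_key() {
    [ -f "$KEY" ] || ssh-keygen -q -t rsa -N "" -f "$KEY"
}

# Install the public key on each host listed in the machine file ($1).
# Only the first column is used, and a trailing ":N" count is stripped.
# Expect to type the password one last time per host.
push_key() {
    for host in $(awk 'NF && $1 !~ /^#/ {sub(/:.*/, "", $1); print $1}' "$1"); do
        ssh-copy-id -i "$KEY.pub" "$host"
    done
}
```

Running "make_key && push_key mymachinefile" once from the machine that launches mpiexec should be enough; afterwards "ssh <host> hostname" should return without a prompt.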
Title: Re: password required when runned in mpiexec
Post by: dhurba on July 23, 2012, 14:06
Yes, I have managed to connect all the computers without an SSH password.
But when I run any script, I observe that only one server is in use: the one listed first in the machine file.

For example, if my machine file is

quantum.xxx.xx
vlsi1.xxx.xx
vlsi2.xxx.xx

quantum takes all 3 slave licenses as well as the master license.

The same thing happens if I place 'vlsi1' at the top.

Please tell me if I am making a mistake.


Title: Re: password required when runned in mpiexec
Post by: Nordland on July 23, 2012, 19:49
Then your MPI setup is not working correctly.
Title: Re: password required when runned in mpiexec
Post by: Anders Blom on July 23, 2012, 21:43
Try "-npernode 1" as an argument to mpiexec, although not all versions support it. It places one process per node instead of stacking them on the same node. Otherwise, you may need to specify in the machinefile how many processes may be placed on each node; if each node has several cores, mpiexec may detect that and try to place as many processes on a node as it has cores (or sockets).
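
As a rough illustration of the machinefile route, the sketch below writes the three hosts from this thread with a limit of one process each. The per-host count syntax depends on the MPI implementation, so both common styles are shown; the file names are made up, and which style your mpiexec expects is something to check in its documentation:

```shell
# Machine file limited to one process per host.
# Host names are the placeholders used earlier in this thread.

# Open MPI style: "host slots=N"
cat > machinefile.openmpi <<'EOF'
quantum.xxx.xx slots=1
vlsi1.xxx.xx slots=1
vlsi2.xxx.xx slots=1
EOF

# MPICH style: "host:N"
cat > machinefile.mpich <<'EOF'
quantum.xxx.xx:1
vlsi1.xxx.xx:1
vlsi2.xxx.xx:1
EOF
```

With the matching file passed to -machinefile and -n 3, each host should receive one process.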

When experimenting, you can use something simpler than ATK (to make it faster, by avoiding the license check), for instance

Code
mpiexec -n 32 -machinefile machinefile hostname

Then you can also test how high -n needs to be before it starts using other nodes. If this is 4 or 8 or another number matching the core count, it would confirm my suspicion above, and you would need to edit the machinefile a bit.
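
To make that test easier to read, the output can be tallied per node. A small sketch, assuming each MPI process prints its host name on its own line (as the standard hostname command does); count_per_host is just an illustrative helper name:

```shell
# Summarise how many processes landed on each node.
# Reads one host name per line on stdin and prints "count host" pairs.
count_per_host() {
    sort | uniq -c
}

# Usage (assumes the machine file is named "machinefile"):
#   mpiexec -n 8 -machinefile machinefile hostname | count_per_host
```

If a single host shows up with the full count until -n exceeds the number of cores, the processes are being stacked on the first node.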