This point has been discussed about a million times in various contexts...
"Their" assumption is that you want to run an MPI process on all the allocated cores. That's a reasonable assumption - if you don't know anything about how ATK parallelizes for performance.
To cut it short (you can read more at http://quantumwise.com/documents/tutorials/latest/ParallelGuide/index.html/), if you put more than 1 MPI process per socket, the competition among the MPI processes for RAM/cache/bus access means the calculation goes slower. But having only one MPI process per socket doesn't mean that ATK leaves the other cores idle - it uses them for threading, but that is not a parameter you specify to "mpiexec".
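As a rough sketch of what that means on a single node with 2 quad-core sockets (the MKL_NUM_THREADS line is just an example of such an environment variable - check the Parallel Guide for what applies to your ATK version):

# One MPI process per socket: 2 processes on this node - recommended
mpiexec -n 2 script.py > run.out
# One MPI process per core: 8 processes fighting for RAM/cache/bus - slower
mpiexec -n 8 script.py > run.out
# The threading that fills the remaining cores is steered by environment
# variables (for example MKL_NUM_THREADS), not by an mpiexec argument
export MKL_NUM_THREADS=4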
So, again - a run with mpiexec/mpirun -n/-np N uses 1 master and N-1 slaves, i.e. it launches N MPI processes in total. How those processes are distributed among your allocated nodes is up to the scheduler/queue system - on many clusters you may need additional arguments like -npernode 2 or some other directive to PBS, which should be documented for your cluster.
Like the -nodes argument you show, but that documentation is for MPICH1, not MPICH2, which is what you should use for ATK. But there should be a similar option.
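One common way to force exactly 2 processes per node on a PBS cluster is to build a machinefile from the node list PBS gives you - treat this as a sketch only, since the right mpiexec flag (-machinefile, -f, -npernode, ...) depends on the MPI version installed on your cluster:

# List each allocated node twice, so mpiexec puts 2 processes (1 per socket) on it
sort -u $PBS_NODEFILE | awk '{print; print}' > machines
mpiexec -machinefile machines -n 8 script.py > run.out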
So in your example case (final post), where I'm assuming the cluster consists of X machines with dual quad-core chips (i.e. 2 sockets/node, 4 cores/socket), you would do
#!/bin/bash -l
#PBS -l walltime=01:00:00,pmem=500mb,nodes=4:ppn=8
#PBS -m abe
cd $HOME
module load intel
# 8 MPI processes over 4 nodes = 2 per node = 1 per socket
mpirun -np 8 -nodes 4 script.py > run.out
NOTE: "module load ompi/intel" is a big no-no! As mentioned above, ATK should be run with MPICH2, not OpenMPI.
In this case ATK will have 1 master and 7 slaves, i.e. 8 MPI processes distributed over the 4 nodes (2 MPI processes/node, one per socket), and at least for parts of the calculation (not a whole lot for devices, I admit) threading will utilize all 8 cores on all 4 nodes.
And - very importantly: since you have requested "full nodes" - all cores on each node - no other process will run on those nodes, and you get the best possible performance because you have each node to yourself.
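By the way, if you want to verify what PBS actually handed you, a one-liner like this in the job script prints each allocated node together with the number of cores you got on it (with nodes=4:ppn=8 you should see 4 node names, each with a count of 8):

sort $PBS_NODEFILE | uniq -c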