Author Topic: supercomputers  (Read 10668 times)

Offline esp

  • Supreme QuantumATK Wizard
  • *****
  • Posts: 318
  • Country: us
  • Reputation: 3
    • View Profile
    • University of Minnesota
supercomputers
« on: March 21, 2012, 02:17 »
I just got access to some machines with hundreds of cores and tons of memory. I want to make use of them but am having some difficulty installing ... do you know what this means?


 /home/it1/patakye/ATK/atkpython/bin//atkpython: line 3: 26274 Killed                  PSEUDOPOTENTIALS_PATH=$EXEC_DIR/../share/pseudopotentials GPAW_SETUP_PATH=$EXEC_DIR/../share/gpaw-setups/ PYTHONHOME=$EXEC_DIR/.. PYTHONPATH= LD_LIBRARY_PATH=$EXEC_DIR/../lib $EXEC_DIR/atkpython_exec $*


Offline esp

Re: supercomputers
« Reply #1 on: March 22, 2012, 08:50 »
OK, I think I figured out what the issue was with the error I posted before: I had to submit "jobs" in a different way on these supercomputers. But now I have another question and would appreciate your help. The link below has a few very specific examples of how to run ATK on these huge, powerful machines. Can you tell me, if I am running things like LDOS or transmission calcs, how I should set them up per these examples? For example, I can specify how many nodes (up to 1000), processors (out of 8744), and memory per core. What would be best?

I set up a job for a transmission calc that normally takes about 10-24 hours on my machines. I set it up on their machine with 8 nodes, 48 cores, and 1.5 GB memory per core. I want to make full use of these machines to make it run like the wind. How can I best do that with ATK?

https://www.msi.umn.edu/hardware/itasca/quickstart.html

specifics of this machine:

1,086 compute nodes
2 interactive nodes
5 server nodes
8,744 total cores
26.184 TB total main memory
Suitable for: large MPI jobs
Each node:

Processors: Two quad-core 2.8 GHz Intel Xeon X5560 "Nehalem EP"-class processors
Memory: 24 GB main memory
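
For reference, a quick back-of-the-envelope check of the memory available per core, using only the per-node figures listed above. The function name is made up for illustration, and the sketch assumes a node's 24 GB is shared evenly among the processes placed on it, ignoring whatever the operating system itself uses.

```python
# Figures taken from the per-node description above.
MEMORY_PER_NODE_GB = 24
CORES_PER_NODE = 2 * 4  # two quad-core processors


def memory_per_core_gb(cores_used_per_node):
    """Memory per process if a node's 24 GB is split evenly (assumption)."""
    return MEMORY_PER_NODE_GB / float(cores_used_per_node)


# All 8 cores of a node in use.
print(memory_per_core_gb(CORES_PER_NODE))
# 48 cores spread over 8 nodes, i.e. 6 cores per node.
print(memory_per_core_gb(48 // 8))
```

By this estimate, a request of 1.5 GB per core stays well below what a node can provide, even with all 8 cores in use.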

Offline Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5411
  • Country: dk
  • Reputation: 89
    • View Profile
    • QuantumATK at Synopsys
Re: supercomputers
« Reply #2 on: March 22, 2012, 11:31 »
So that I don't waste time answering the wrong question: you are asking for recommendations on allocation (number of nodes, cores, etc.), and not (or not only) how to actually request them in your PBS script?

Offline esp

Re: supercomputers
« Reply #3 on: March 22, 2012, 21:52 »
Yes, just the number of nodes, etc. I am not so familiar with this type of setup, so I don't know how it best applies to ATK. Also, I do not need scripts, but there are multiple methods of running parallel jobs, as you can see on the link I posted, and I do not know which is best: MPI, OpenMP, others ... the page has multiple short examples.

Offline Anders Blom

Re: supercomputers
« Reply #4 on: March 22, 2012, 22:49 »
ATK can take advantage of both MPI and (to a lesser extent) OpenMP, but for your calculations I think all the benefit will lie in MPI. As a rule of thumb, for the self-consistent part at zero bias the code scales well up to roughly the number of k-points, NA x NB / 2, whereas for finite bias you benefit up to 30-50 nodes due to the integration in the complex plane. The speed-up is not linear, however, and you have to account for the risk of waiting a very long time in the queue if you request too many nodes. For analysis, like computing the LDOS or T(E), the scaling can easily be linear up to 100 nodes (the number of energy points in T(E), for instance).

I would recommend running over 16 MPI nodes; try 32 for some of the analysis.
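
The rule of thumb above can be turned into a small helper, sketched here in plain Python. This is only an illustration: the function name and its parameters are made up, "nodes" is read as MPI processes, the finite-bias value uses the lower end of the 30-50 range, and the analysis value is capped at 100. All of those are assumptions, not ATK settings.

```python
def suggest_mpi_processes(na, nb, finite_bias=False, energy_points=None):
    """Rough upper bounds on useful MPI process counts (hypothetical helper)."""
    if finite_bias:
        # Finite bias: benefit reported up to 30-50; take the conservative end.
        scf = 30
    else:
        # Zero bias: scaling holds up to roughly the number of k-points, NA*NB/2.
        scf = max(1, (na * nb) // 2)
    result = {"scf": scf}
    if energy_points is not None:
        # Analysis such as T(E): bounded by the number of energy points,
        # capped at 100 here as an assumed safe limit.
        result["analysis"] = max(1, min(energy_points, 100))
    return result


print(suggest_mpi_processes(4, 4))
print(suggest_mpi_processes(1, 1, finite_bias=True, energy_points=201))
```

These numbers are ceilings rather than targets; queue waiting time may well favor requesting fewer processes than the ceiling suggests.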

Offline esp

Re: supercomputers
« Reply #5 on: March 22, 2012, 23:26 »
thank you

Offline esp

Re: supercomputers
« Reply #6 on: March 22, 2012, 23:49 »
I had a few jobs run this way sort of get stuck, and it seems like different nodes are trying to create the same files, and maybe the job died? So I have a question. I always use nlsave and nlprint, but I am seeing multiple printing and multiple messages about trying to create the same file from within one of my scripts. This never happened before I went to the MPI system. Shouldn't nlsave protect different nodes from saving the same file?

Offline Anders Blom

Re: supercomputers
« Reply #7 on: March 23, 2012, 00:15 »
This is a well-known problem. It means your system uses OpenMPI (or similar) rather than MPICH2, which is required for ATK.

It is possible they already have MPICH2 installed (or a similar MPICH-compatible MPI library). If not, they will have to install it for you to be able to run ATK.

I can see that Intel MPI is available on your system, and that will work. So you need to load that module.

Offline esp

Re: supercomputers
« Reply #8 on: March 23, 2012, 22:42 »
ahhh thank you very much i will try again ... :))

Offline esp

Re: supercomputers
« Reply #9 on: March 23, 2012, 22:48 »
There is Intel ompi and Intel pmpi ... can I use either? Actually, last time I did use

module load pmpi/intel

and got the same error: the file it says does not exist does exist.


Oh, they also have "impi":
module load impi/intel


« Last Edit: March 23, 2012, 22:50 by esp »

Offline Anders Blom

Re: supercomputers
« Reply #10 on: March 24, 2012, 12:56 »
There is a simple test to check whether the parallelization is set up correctly. Put the following into a script:
Code: python
import socket

# processIsMaster() is available in atkpython scripts without an import.
if processIsMaster():
    role = 'Master node:'
else:
    role = 'Slave node:'
# A single print call keeps each process's output on one line.
print('%s %s' % (role, socket.gethostname()))
and execute it in parallel, making sure to capture the output. It should print "Master" once and "Slave" N-1 times, where N is the value you pass as -n N to mpiexec. That is the signal that you have a proper MPI setup. With OpenMPI and its relatives, however, it prints "Master" N times. That tells you that every process thinks it is the master, so they all try to write to the NetCDF file, which of course causes problems. For more information about running ATK in parallel, see the Parallel Tutorial.
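
If the captured output is long, the counting can be automated. The following is a minimal sketch in plain Python (not an ATK function); the name check_mpi_roles is made up for illustration, and it assumes each process's line starts with "Master node:" or "Slave node:" as in the test script above.

```python
def check_mpi_roles(output, n_processes):
    """Count master/slave lines in the captured output of the test script.

    Returns (ok, masters, slaves); ok is True only for exactly one master
    and n_processes - 1 slaves.
    """
    masters = 0
    slaves = 0
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Master node:"):
            masters += 1
        elif stripped.startswith("Slave node:"):
            slaves += 1
    ok = masters == 1 and slaves == n_processes - 1
    return ok, masters, slaves


# Example with made-up host names: a correct 3-process run.
sample = "Master node: n001\nSlave node: n002\nSlave node: n003\n"
print(check_mpi_roles(sample, 3))
```

A result with as many masters as processes points to the MPI mismatch described above.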

Offline esp

Re: supercomputers
« Reply #11 on: March 24, 2012, 22:27 »
Ok it is all working now .. thank you very much

Offline esp

Re: supercomputers
« Reply #12 on: March 26, 2012, 08:07 »
Hey, I finally ran a device and got some good results! I just had to post. This shows the on/off ratio and subthreshold slope for a graphene TFET, just one I picked randomly from a paper I read, but at least I got it to work now. These supercomputers sure are a luxury too, I must say. I am running 32 nodes with 8 processors each ... awesome.


Offline esp

Re: supercomputers
« Reply #13 on: March 26, 2012, 08:10 »
and thank you guys for all the help!

Offline Anders Blom

Re: supercomputers
« Reply #14 on: March 26, 2012, 09:44 »
Great! Parallel does help a lot, and once you get used to it you are hooked - you don't want to go back to serial ;)