Author Topic: How to restart a phonon calculation, and about parallelization strategies  (Read 2870 times)


Offline yqxie

  • Regular QuantumATK user
  • **
  • Posts: 15
  • Country: cn
  • Reputation: 0
Dear QuantumWise staff,
      I have run into some problems and need your help. Thanks.
      I am calculating the phonon band structure of a nanoribbon of 42 atoms, running on a cluster whose nodes have 16 cores each. I assigned the job to 3 nodes with 2 MPI processes per node (since there are two sockets per node). I found this to be much faster than assigning 16 MPI processes per node on the same 3 nodes while ATK is doing the dynamical matrix calculation. I attribute this to OpenMP multithreading, which is very efficient for the matrix operations and the FFT solver.
After all the dynamical matrix calculations were done (taking 2 days and 5 hours), ATK continued with the phonon band structure. However, this step also seems to take a very long time. For this job (42 atoms, repeat=(1,1,3)), the calculation had not finished after more than 2 days. I thought that in the band structure calculation ATK parallelizes over q-points, so OpenMP no longer takes effect (I found the CPU usage is always around 100%, indicating no multithreading). Thus, I should use more MPI processes for the band structure calculation; however, since only 3*2=6 MPI processes were assigned, it will take a long time to finish. Am I right?
If this is true, I have to stop the job once the dynamical matrix calculation is finished (indicated by lines like “Phonon: Calculating forces for displacement 126 / 126”), and then use the checkpoint file to resume the band structure calculation with more MPI processes. So here is the second question: how do I restart a phonon calculation? I tried revising the script by commenting out the line “bulk_configuration.update()” and adding two new lines, “bulk_configuration = nlread("./checkpoint2.nc")[0]” and “bulk_configuration.update(force_restart=True)”. In the checkpoint file there are now 42*3=126 atoms in total, since the cell was repeated 3 times; I verified the number of atoms in VNL.
But I found that ATK did not calculate the band structure; instead it repeated the cell further (now a total of 126*3=378 atoms) and started calculating the dynamical matrix once more, as the output file shows lines like “Phonon: Calculating forces for displacement 1 / 378”.
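For clarity, the end of my modified script now looks roughly like this (only the restart-related part is shown; the rest of the script is unchanged and is run with atkpython, where nlread is available without extra imports):

# Original line, now commented out:
#bulk_configuration.update()

# Added for the attempted restart: read the configuration back from the
# checkpoint file and update it, forcing a restart of the calculation.
bulk_configuration = nlread("./checkpoint2.nc")[0]
bulk_configuration.update(force_restart=True)

# ... the phonon band structure part of the original script follows here ...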
So, firstly, do I have to use two different parallelization strategies for phonon calculations on a large system with many atoms? I mean using OpenMP (fewer MPI processes per node) for the force constant (dynamical) matrix calculation, while using more MPI processes for the band structure calculation?
Secondly, how do I restart a phonon calculation?
Thanks very much.

Offline Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5565
  • Country: dk
  • Reputation: 93
No, this kind of restart will not work, because the checkpoint file is only designed for a single calculation.

In ATK 2015 you will be able to save the dynamical matrix, which will make it easier to use different parallelization strategies, etc. In general, computing the dynamical matrix parallelizes linearly up to 3N MPI processes, where N is the number of atoms (in the original system, not the repeated one), whereas the phonon band structure calculation is parallel over q-points (also MPI). So in principle yes, you can tune the strategy based on this, but in 2014 you cannot save the dynamical matrix and so it will not work. But since you probably have more than enough q-points to parallelize over, just run it on as many nodes as you want.
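To sketch the idea (class and parameter names here, such as DynamicalMatrix, PhononBandstructure and dynamical_matrix, are from memory and should be checked against the ATK 2015 manual once it is out, so treat this only as an illustration), the split workflow could look something like:

# Job 1 - run with many MPI processes (up to 3N): compute and save the
# dynamical matrix of the nanoribbon.
dynamical_matrix = DynamicalMatrix(
    bulk_configuration,
    filename='dynamical_matrix.nc',
    object_id='dynamical_matrix',
    repetitions=(1, 1, 3),
)
dynamical_matrix.update()

# Job 2 - run with as many MPI processes as you have q-points: read the
# saved dynamical matrix back and compute the phonon band structure from it.
dynamical_matrix = nlread('dynamical_matrix.nc', DynamicalMatrix)[0]
phonon_bandstructure = PhononBandstructure(
    configuration=bulk_configuration,
    dynamical_matrix=dynamical_matrix,
    route=['G', 'Z'],
    points_per_segment=200,
)
nlsave('phonon_bandstructure.nc', phonon_bandstructure)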

If you are using DFT, you will not really have any benefit from OpenMP, so just use MPI - and make sure to turn threading off with OMP_NUM_THREADS=1! For ATKClassical there is some benefit from OpenMP, but you will only see this for very large structures (many thousands of atoms). Since the MPI scaling for these calculations is basically linear, just stick to MPI.
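If it helps, a tiny plain-Python guard like the following at the top of the script (not an ATK feature, just an illustration) will warn you if threading was accidentally left on:

# Hypothetical check: warn if OpenMP threading has not been disabled,
# since MPI-only parallelization is recommended for these DFT runs.
import os
if os.environ.get("OMP_NUM_THREADS") != "1":
    print("Warning: set OMP_NUM_THREADS=1 before launching the MPI job.")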
« Last Edit: February 22, 2015, 22:24 by Anders Blom »

Offline yqxie

  • Regular QuantumATK user
  • **
  • Posts: 15
  • Country: cn
  • Reputation: 0
 Thanks very much! Looking forward to the release of ATK2015.