Dear Quantumwise staff,
I have run into some problems and need your help. Thanks.
I have a nanoribbon of 42 atoms for which I want to calculate the phonon band structure, and I run the job on a cluster whose nodes have 16 cores each. I assigned the job to 3 nodes with 2 MPI processes per node (since there are two sockets per node). I found this to be much faster than assigning 16 MPI processes per node on the same 3 nodes while ATK is doing calculations such as the dynamical matrix. I attribute this to OpenMP multithreading, which is very efficient in matrix operations and the FFT solver.
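To make the two layouts concrete, here is the core arithmetic as a small Python sketch. It assumes the cores of a node are shared evenly among its MPI processes (so 8 OpenMP threads per process in the hybrid case); that split and the variable names are only my illustration, not something taken from the ATK output.

```python
# Cluster layout from my job: 3 nodes, 16 cores per node, 2 sockets per node.
nodes = 3
cores_per_node = 16
sockets_per_node = 2


def layout(mpi_per_node):
    """Return (total MPI processes, OpenMP threads available per process)."""
    total_mpi = nodes * mpi_per_node
    # Assumption: the cores of a node are shared evenly among its MPI processes.
    threads_per_process = cores_per_node // mpi_per_node
    return total_mpi, threads_per_process


hybrid = layout(sockets_per_node)  # one MPI process per socket
pure_mpi = layout(cores_per_node)  # one MPI process per core

print("hybrid   (MPI processes, threads each):", hybrid)
print("pure MPI (MPI processes, threads each):", pure_mpi)
```

Both layouts occupy the same 48 cores; they differ only in how the cores are split between MPI processes and threads.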
After all the calculations for the dynamical matrix were done (which took 2 days and 5 hours), ATK went on to calculate the phonon band structure. However, this step seems to take a very long time. For this job (42 atoms, repeat=(1,1,3)), the band structure calculation had still not finished after more than 2 days. My guess is that the band structure calculation in ATK is parallelized over k-points, so OpenMP no longer has any effect in this step (I see the CPU usage staying at around 100% per process, which indicates no multithreading). If so, I should use more MPI processes for the band structure calculation, and with only 3*2=6 MPI processes assigned it will take a long time to finish. Am I right?
If this is true, I have to stop the job once the dynamical matrix calculations have finished (indicated by lines like “ Phonon: Calculating forces for displacement 126 / 126”), and then use the checkpoint file to resume the band structure calculation with more MPI processes. This leads to my second question: how do I restart a phonon calculation? I tried revising the script by commenting out the line “#bulk_configuration.update()” and adding two new lines, “bulk_configuration = nlread("./checkpoint2.nc")[0]” and “bulk_configuration.update(force_restart=True)”. The checkpoint file now contains 42*3=126 atoms in total, since the cell was repeated 3 times; I verified the number of atoms in VNL.
However, ATK did not calculate the band structure. Instead it repeated the cell again (it now has 126*3=378 atoms in total) and started calculating the dynamical matrix once more, as shown by lines in the output file such as “ Phonon: Calculating forces for displacement 1 / 378”.
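The numbers I see are consistent with the following bookkeeping, written as a short Python sketch. It assumes one displacement per atom of the input cell per Cartesian direction, with no symmetry reduction; that is my reading of the log lines, not something I have confirmed in the manual.

```python
REPEAT = 3      # from repeat=(1,1,3)
DIRECTIONS = 3  # x, y, z


def counts(atoms_in_input_cell):
    """Return (atoms in the repeated cell, number of displacements)."""
    supercell_atoms = atoms_in_input_cell * REPEAT
    # Assumption: one displacement per input-cell atom per Cartesian direction.
    displacements = atoms_in_input_cell * DIRECTIONS
    return supercell_atoms, displacements


first_run = counts(42)           # original 42-atom ribbon
restart = counts(first_run[0])   # checkpoint cell (126 atoms) used as input

print("first run:", first_run)
print("restart  :", restart)
```

If this reading is right, the restart treated the already repeated 126-atom cell from the checkpoint as the new unit cell, which would explain why the displacement count tripled.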
So, firstly, do I have to use two different parallelization strategies for phonon calculations on a large system with many atoms? I mean using OpenMP (fewer MPI processes per node) for the force constant matrix (dynamical matrix) calculation, and more MPI processes for the band structure calculation?
Secondly, how do I restart a phonon calculation?
Thanks very much.