QuantumATK Forum
QuantumATK => General Questions and Answers => Topic started by: kumars12 on November 6, 2021, 20:09
-
Hello,
I have been trying to optimize the device for my simulation, which is an NEGF calculation for a Cu metal slab. As you can see in the attached files, my calculation does not progress past the first optimization step and stays stuck there indefinitely. I am running the calculations on a SLURM cluster which has 12 CPUs, each with 32 cores. For this calculation I have chosen 6 MPI processes (1 process per node), since there are 6 irreducible k-points. ATK seems to choose 16 threads for each process automatically. I have attached my log files here. Has anyone encountered this problem before? Any help would be much appreciated.
Thanks!
-
Update: I also tried running with 6 MPI processes and 1 thread each (using export OMP_NUM_THREADS=1 and export MKL_NUM_THREADS=1), but it gets stuck at the same point. It has now been running for over 7 hours, and nothing has been written to the log since the last line shown in the attached log file.
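For anyone reproducing the single-thread test from a Python launcher rather than with shell exports, here is a minimal sketch. It only builds the environment; the helper name single_thread_env is made up for illustration and is not part of QuantumATK.

```python
import os


def single_thread_env(base_env=None):
    """Return a copy of the environment with OpenMP and MKL pinned to one thread.

    Hypothetical helper: mirrors `export OMP_NUM_THREADS=1` and
    `export MKL_NUM_THREADS=1` without touching the current process.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"
    env["MKL_NUM_THREADS"] = "1"
    return env


# Example: override a 16-thread setting in a toy environment.
env = single_thread_env({"PATH": "/usr/bin", "OMP_NUM_THREADS": "16"})
print(env["OMP_NUM_THREADS"], env["MKL_NUM_THREADS"])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting the job, so the setting applies only to that run.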
-
The issue is most likely a recently found bug in the parallel conjugate gradient (PCG) Poisson solver and related solver types. The fix will be available in Service Pack 2 (SP2) in about a month's time. For your system (I see no continuum metal or dielectric regions in the script you posted), you could use the FFT2D Poisson solver instead of PCG; FFT2D is not affected by this bug.
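As a rough sketch of what the FFT2D setting looks like in a script: the class names below (FastFourier2DSolver, PeriodicBoundaryCondition, DirichletBoundaryCondition) are written as I recall them from device scripts generated by QuantumATK, and the explicit import from the QuantumATK module is an assumption (in some versions the names are simply available inside atkpython). The wrapper function name is made up for illustration. Please check the manual for your version before relying on it.

```python
def build_fft2d_poisson_solver():
    """Build an FFT2D Poisson solver for a device that is periodic in A and B.

    Sketch only: must be called inside atkpython. The import is done lazily
    and is an assumption about how the QuantumATK names are exposed.
    """
    from QuantumATK import (
        DirichletBoundaryCondition,
        FastFourier2DSolver,
        PeriodicBoundaryCondition,
    )

    return FastFourier2DSolver(
        boundary_conditions=[
            # A direction: periodic on both faces
            [PeriodicBoundaryCondition(), PeriodicBoundaryCondition()],
            # B direction: periodic on both faces
            [PeriodicBoundaryCondition(), PeriodicBoundaryCondition()],
            # C (transport) direction: Dirichlet at the two electrodes
            [DirichletBoundaryCondition(), DirichletBoundaryCondition()],
        ]
    )
```

The returned object would then be passed as the poisson_solver argument of the device calculator in place of the PCG solver, with the rest of the script unchanged.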
-
Thanks a lot! That solved the issue.