Topic: Problem running QuantumATK X-2025.06 in parallel with SLURM

pshinyeong

Hi,

I just updated to QuantumATK X-2025.06.
On the main node, atkpython runs fine.
But when I try a parallel job with SLURM, it stops immediately and no log file is created.
With the previous version of QuantumATK, the same job script worked for parallel calculation.

I have attached my job script and the SLURM output file.
Any ideas what changed in this version or how to fix MPI parallel runs?

Thank you

AsifShah

« Reply #1 on: August 31, 2025, 18:09 »
Hi Pshinyeong,

From what I can see, you have commented out the following lines:

#mpirun ~/QuantumATK23/quantumatk/V-2023.09/bin/atkpython_system-mpi $PYTHON_SCRIPT > $LOG_FILE
#mpirun /home/edrl_05/QuantumATK/QuantumATK-U-2022.12-SP1/bin/atkpython $PYTHON_SCRIPT > $LOG_FILE

Also, instead of mpirun, I would recommend using the mpiexec.hydra launcher bundled with QuantumATK for parallelization and atkpython for execution. You also need to update the paths, so your SLURM script will look something like this:
Code
#!/bin/bash

#SBATCH --job-name=QuantumATK
#SBATCH --ntasks=60
#SBATCH --ntasks-per-node=60
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
#SBATCH --partition=normal
#SBATCH --mem=210GB
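# Note: --nodes=1 above appears inconsistent with a three-node --nodelist; list a single node here or adjust --nodes.
#SBATCH --nodelist=n16,n15,n14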

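# Run from the directory the job was submitted from.
cd "$SLURM_SUBMIT_DIR"

# QuantumATK executable and its bundled MPI launchers (update to your install path).
export ATK=/home/edrl_05/QuantumATK/quantumatk/X-2025.06/bin/atkpython
export MPI=/home/edrl_05/QuantumATK/quantumatk/X-2025.06/mpi/bin/mpiexec.hydra
export MPIE=/home/edrl_05/QuantumATK/quantumatk/X-2025.06/mpi/bin/mpiexec

# Threading: one OpenMP/MKL thread per MPI rank.
export MKL_DYNAMIC=TRUE
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1

# Replace "path" with your own license setting.
export LM_LICENSE_FILE="path"
export SNPSLMD_LICENSE_FILE="path"

# Launch atkpython through the bundled launcher; output goes to out.log.
"${MPI}" "${ATK}" in.py > out.log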
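
Since the original job stopped immediately without creating a log file, it may help to verify the paths before the launch line. This is only a minimal sketch, assuming the `ATK` and `MPI` variables from the script above; `require_executable` is a hypothetical helper name and not part of QuantumATK or SLURM.

```shell
# Hypothetical helper: fail early with a clear message if the given path
# is not an executable file. The message goes to stderr, so it ends up in
# the SLURM --error file even when the job's own log is never created.
require_executable() {
    if [ ! -x "$1" ] || [ -d "$1" ]; then
        echo "ERROR: '$1' is missing or not executable" >&2
        return 1
    fi
}

# Possible usage in the job script, placed before the launch line:
#   require_executable "$ATK" || exit 1
#   require_executable "$MPI" || exit 1
```

If either check fails, the job exits with a readable message in the `.err` file instead of failing silently.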

filipr

  • QuantumATK Staff
« Reply #2 on: September 1, 2025, 15:48 »
Hi Pshinyeong,

QuantumATK ships its own version of Intel MPI. The version shipped with QuantumATK X-2025.06 is Intel MPI 2021.15, which comes with Intel oneAPI 2024.2 (I believe) and is newer than the one you load in your job script. Newer versions of Intel MPI are not necessarily compatible with older versions, so that could be why it fails.

I suggest that you do not use a custom Intel MPI version unless you really, REALLY know what you're doing. So instead of loading oneAPI or OpenMPI as you might normally do for other academic software, you simply don't. QuantumATK is a self-contained plug-and-play solution that works as-is without any installed libraries or tools. So I recommend that you remove the module loads and simply use this line to launch your job:
Code
srun /path/to/atkpython $PYTHON_SCRIPT > $LOG_FILE
If in doubt, use the built-in job manager GUI to set up submission scripts for Slurm.
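
To see whether a job environment still picks up an MPI launcher from outside the QuantumATK installation (for example one left behind by a module load), a check along these lines could be run at the top of the job script. This is a rough sketch, assuming bash; `warn_foreign_mpi` is a hypothetical helper name, and the installation directory passed to it is whatever applies on your cluster. It only inspects `PATH`, so it would not catch libraries pulled in through `LD_LIBRARY_PATH`.

```shell
# Hypothetical check: warn about MPI launchers on PATH that live outside
# the QuantumATK installation directory given as the first argument.
# Returns 0 if none are found, 1 otherwise.
warn_foreign_mpi() {
    local qatk_root=$1
    local status=0
    local launcher found
    for launcher in mpirun mpiexec mpiexec.hydra; do
        # Skip launchers that are not on PATH at all.
        found=$(command -v "$launcher" 2>/dev/null) || continue
        case "$found" in
            "$qatk_root"/*) ;;  # inside the QuantumATK install, fine
            *)
                echo "WARNING: $launcher resolves to $found" >&2
                status=1
                ;;
        esac
    done
    return "$status"
}

# Possible usage in a job script:
#   warn_foreign_mpi /path/to/quantumatk/X-2025.06 || echo "check your module loads" >&2
```

A warning here would suggest that a module or login script is still putting another MPI first on `PATH`.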