Topic: MPI error

Offline AsifShah

  • QuantumATK Guru
MPI error
« on: May 15, 2025, 11:04 »
Dear admin,

I got this error when running a simulation on a cluster on 3 nodes:

Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(176)........:
MPID_Init(1525)..............:
MPIDI_OFI_mpi_init_hook(1597):
MPIDU_bc_table_create(320)...: Missing hostname or invalid host/port description in business card


SLURM Script:

#!/bin/bash

#SBATCH --job-name=QuantumATK
#SBATCH --ntasks=120
#SBATCH --ntasks-per-node=60
#SBATCH --nodes=2
#SBATCH --cpus-per-task=1
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
#SBATCH --partition=normal
#SBATCH --mem=210GB


cd $SLURM_SUBMIT_DIR
export ATK=/home/user/QATK/QATK_W_SP2/Install/quantumatk/W-2024.09-SP2/bin/atkpython
export MPI=/home/user/QATK/QATK_W_SP2/Install/quantumatk/W-2024.09-SP2/mpi/bin/mpiexec.hydra
export MPIE=/home/user/QATK/QATK_W_SP2/Install/quantumatk/W-2024.09-SP2/mpi/bin/mpiexec
export MKL_DYNAMIC=TRUE
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
export LM_LICENSE_FILE="####@*********":"###@***********"
export SNPSLMD_LICENSE_FILE="####@**********":"####@*******"

${MPI} -n 120 -ppn 60 ${ATK} in.py > out.log

Offline filipr

  • QuantumATK Staff
  • QuantumATK Guru
  • QuantumATK developer
Re: MPI error
« Reply #1 on: May 15, 2025, 14:05 »
This appears to be a problem with either the cluster configuration or Intel MPI. Contact your cluster admin and show them this error, and/or submit a support question on the Intel oneAPI support forum: https://community.intel.com/t5/Intel-MPI-Library/bd-p/oneapi-hpc-toolkit

In both cases, it helps to set the environment variable I_MPI_DEBUG=5 in your submission script, rerun the job, and include the resulting Intel MPI debug output when you ask for help.
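
As a minimal sketch of that suggestion, the lines below could go into the submission script just before the launch line. The echo is only an illustrative sanity check and not something Intel MPI requires; the launch line is left commented out here and is assumed to stay exactly as in the original script.

```shell
# Ask Intel MPI for more verbose start-up diagnostics on the next run.
export I_MPI_DEBUG=5

# Illustrative check (an assumption, not required): record the setting
# in the job output so it is clear which run produced the debug lines.
echo "I_MPI_DEBUG=${I_MPI_DEBUG}"

# The existing launch line follows unchanged, for example:
# ${MPI} -n 120 -ppn 60 ${ATK} in.py > out.log
```

After rerunning, look for the extra Intel MPI lines in out.log and in the SLURM %x-%j.out and %x-%j.err files, and copy them into the support request together with the original error stack.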

Offline AsifShah

  • QuantumATK Guru
Re: MPI error
« Reply #2 on: May 15, 2025, 18:49 »
Thanks. I will do the same.