QuantumATK Forum
QuantumATK => General Questions and Answers => Topic started by: naash on April 4, 2011, 06:23
-
Numbers of Processors: 10
---------------------------
started mpd Number: 5
---------------------------
/share/apps/ATK/atkpython/bin/atkpython: line 3: 7097 Killed PSEUDOPOTENTIALS_PATH=$EXEC_DIR/../share/pseudopotentials GPAW_SETUP_PATH=$EXEC_DIR/../share/gpaw-setups/ PYTHONHOME=$EXEC_DIR/.. PYTHONPATH= LD_LIBRARY_PATH=$EXEC_DIR/../lib $EXEC_DIR/atkpython_exec $*
rank 8 in job 1 node9_42312 caused collective abort of all ranks
exit status of rank 8: return code 137
What does the above message mean?
-
Usually it means you have run out of memory. Return code 137 is 128 + 9, i.e. the process was killed with signal 9 (SIGKILL), which is typically what the Linux kernel does to a process when the node runs out of RAM. Your first lines seem to indicate that you are running 10 MPI processes on 5 nodes. That means each node runs two processes, which doubles the memory requirement per node (each process uses about the same amount as a serial process). The rule of thumb is one MPI process per physical node, unless the calculation is small (compared to the available RAM).
Also, if this is ATK 10.8, consider upgrading to 11.2, which uses less memory for two-probe calculations.
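To make the reasoning above concrete, here is a minimal Python sketch of the two pieces of arithmetic: decoding the return code and estimating the memory needed per node. It is not part of ATK; the function names are hypothetical, and the 6 GB serial footprint is an assumed figure for illustration only (measure your own with a serial run). Note also that SIGKILL does not prove an out-of-memory condition, since a queue system or an administrator can send the same signal.

```python
import signal


def describe_exit_status(return_code):
    """Interpret a shell-style return code.

    By shell convention, a code above 128 means the process was
    terminated by signal number (return_code - 128).
    """
    if return_code > 128:
        signum = return_code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (signum, name)
    return "exited with status %d" % return_code


def memory_per_node_gb(n_processes, n_nodes, serial_memory_gb):
    """Rough memory needed on the most heavily loaded node.

    Assumes each MPI process uses about as much memory as a serial run,
    as described in the reply above.
    """
    # Ceiling division: the busiest node gets the extra process, if any.
    processes_per_node = -(-n_processes // n_nodes)
    return processes_per_node * serial_memory_gb


# Values from the log: return code 137, 10 processes on 5 nodes.
print(describe_exit_status(137))
# 6.0 GB is an assumed serial footprint, not a value from the log.
print(memory_per_node_gb(10, 5, 6.0))
```

With these assumed numbers each node would need about 12 GB, while 5 processes on the same 5 nodes would need about 6 GB each. If the first figure exceeds the RAM on a node, dropping to one process per node is the first thing to try.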