Author Topic: logfile meaning  (Read 1829 times)

naash
logfile meaning
« on: April 4, 2011, 06:23 »
Numbers of Processors:  10
---------------------------
started mpd Number: 5
---------------------------
/share/apps/ATK/atkpython/bin/atkpython: line 3:  7097 Killed                  PSEUDOPOTENTIALS_PATH=$EXEC_DIR/../share/pseudopotentials GPAW_SETUP_PATH=$EXEC_DIR/../share/gpaw-setups/ PYTHONHOME=$EXEC_DIR/.. PYTHONPATH= LD_LIBRARY_PATH=$EXEC_DIR/../lib $EXEC_DIR/atkpython_exec $*


rank 8 in job 1  node9_42312   caused collective abort of all ranks
  exit status of rank 8: return code 137

What does the above message mean?

Anders Blom (QuantumATK Staff)
Re: logfile meaning
« Reply #1 on: April 4, 2011, 10:37 »
Usually it means you have run out of memory. Your first lines seem to indicate that you are running 10 MPI processes on 5 nodes. That means each node runs two processes, which doubles the memory requirement per node (each process uses about the same amount as a serial run). The rule of thumb is one MPI process per physical node, unless the calculation is small compared to the available RAM.
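
To make the arithmetic concrete, here is a minimal Python sketch (not part of ATK). It decodes the return code from the log using the common shell convention that a process killed by signal N exits with 128 + N, and it estimates the per-node memory demand, assuming the processes are spread evenly over the nodes. The function names and the 8 GB serial footprint are hypothetical, chosen only for illustration.

```python
import signal


def decode_return_code(code):
    """Return the signal name if the code follows the 128 + N convention, else None."""
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            return None
    return None


def processes_per_node(n_processes, n_nodes):
    """Processes on the busiest node, assuming an even spread (ceiling division)."""
    return -(-n_processes // n_nodes)


def memory_per_node_gb(n_processes, n_nodes, serial_memory_gb):
    """Rough per-node demand: each MPI process needs about as much as a serial run."""
    return processes_per_node(n_processes, n_nodes) * serial_memory_gb


# Values from the log above: return code 137, 10 processes, 5 nodes.
print(decode_return_code(137))   # SIGKILL on Linux (137 = 128 + 9)
print(processes_per_node(10, 5))  # 2

# Hypothetical serial footprint of 8 GB, for illustration only.
print(memory_per_node_gb(10, 5, serial_memory_gb=8.0))  # 16.0
```

A SIGKILL matches the "Killed" line in the log and is what the Linux out-of-memory killer sends, which fits the out-of-memory explanation, although a kill signal alone does not prove it.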

Also, if this is ATK 10.8, consider upgrading to 11.2, which uses less memory for two-probe calculations.