QuantumATK Forum

QuantumATK => General Questions and Answers => Topic started by: zhangxuebiao on June 10, 2010, 14:07

Title: An error occurred when running a script in parallel
Post by: zhangxuebiao on June 10, 2010, 14:07
/home/atk/atk-2008.10.0/bin/atk: line 3: 30180 Killed                  LD_LIBRARY_PATH=$EXEC_DIR/../lib $EXEC_DIR/atk_exec $*
rank 0 in job 8  node3_53803   caused collective abort of all ranks


How can I solve this? Can anyone help me?
Title: Re: An error occurred when running a script in parallel
Post by: jdgayles16 on June 11, 2010, 09:54
I had a similar problem. It may be a memory issue, though I'm just guessing.
Title: Re: An error occurred when running a script in parallel
Post by: Anders Blom on June 11, 2010, 10:21
That's the most likely reason. The error is actually thrown by MPICH2, not ATK, and basically means that the process on one of the nodes stopped running. Look in the log files (the .o or .e files if you run via qsub) for additional error messages that tell you why ATK was shut down on that node.
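
If there are many log files, a small script can help with the search. The sketch below is only an illustration, not part of ATK or MPICH2: it assumes the qsub output files follow the common <jobname>.o<jobid> / <jobname>.e<jobid> naming, and the keyword list and the function name scan_job_logs are made up for this example and should be adjusted to your cluster.

```python
import re
from pathlib import Path

# Assumption: qsub writes stdout/stderr to <jobname>.o<jobid> and <jobname>.e<jobid>.
LOG_NAME = re.compile(r"\.[oe]\d+$")

# Assumption: illustrative keywords only, not an official list of ATK or MPICH2 messages.
KEYWORDS = ("killed", "out of memory", "cannot allocate", "abort", "error")


def scan_job_logs(directory="."):
    """Return (file name, line number, text) for every suspicious log line."""
    hits = []
    for path in sorted(Path(directory).iterdir()):
        # Skip directories and anything that does not look like a job log.
        if not path.is_file() or not LOG_NAME.search(path.name):
            continue
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(word in line.lower() for word in KEYWORDS):
                    hits.append((path.name, number, line.rstrip()))
    return hits
```

Calling scan_job_logs("/path/to/job/directory") returns a list that you can print line by line. Messages mentioning memory shortly before the "Killed" line would support the guess that the node ran out of memory.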