QuantumATK Forum

QuantumATK => General Questions and Answers => Topic started by: huangshenjie on March 5, 2013, 09:22

Title: A problem occurred when calculating
Post by: huangshenjie on March 5, 2013, 09:22
Dear all, I wrote a .py script and ran it with ATK 10.8. However, an error occurred, and the log file shows the following:
"Density Matrix Calculation : ==============rank 0 in job 9  localhost.localdomain_33142   caused collective abort of all ranks
  exit status of rank 0: return code 137"

I don't know what is wrong with my calculation.
Title: Re: A problem occurred when calculating
Post by: Anders Blom on March 5, 2013, 10:15
It may be some local problem on the cluster, or possibly you ran out of memory. Try running again; if the problem appears in the same place, make sure you are running only one MPI process per node. You can roughly estimate the amount of memory a calculation needs as described in http://quantumwise.com/documents/tutorials/ATK-12.8/MemoryUsage/
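As a side note on the "return code 137" in the log: by the common shell convention, an exit status above 128 means the process was killed by a signal whose number is the status minus 128. 137 - 128 = 9, which is SIGKILL on Linux, the signal the kernel sends when it kills a process for using too much memory. Other things can send SIGKILL too (a queue system enforcing limits, for instance), so this is consistent with an out-of-memory problem but does not prove it. A minimal sketch in standalone Python 3, not part of ATK; the function name describe_exit_status is made up for illustration:

```python
import signal

def describe_exit_status(code):
    """Interpret a shell-style exit status (hypothetical helper, not an ATK function)."""
    # Convention: status > 128 means "terminated by signal number (status - 128)".
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (signum, name)
    return "exited normally with status %d" % code

print(describe_exit_status(137))
```

If this reports signal 9, checking the system log on the compute node (for example with dmesg) for out-of-memory messages is a reasonable next step.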
Title: Re: A problem occurred when calculating
Post by: huangshenjie on March 5, 2013, 11:23
Thank you, sir. By the way, how can I check whether I am running only one MPI process? I'm a beginner at Linux.

Also, after I reduced the k-points from 9*9 to 3*3, the problem no longer appeared. Does that mean the problem was running out of memory?
Title: Re: A problem occurred when calculating
Post by: Anders Blom on March 5, 2013, 13:47
Probably.

Controlling how many MPI processes you use and where they run is essential for performance. This is determined by a combination of the "mpiexec -n" option and the queue resource allocation (if a queue system is present). You should study those options carefully to make sure you know what they mean. Then also have a look at http://quantumwise.com/documents/tutorials/latest/ParallelGuide/.
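One simple way to see how many processes a given "mpiexec -n" command actually starts, and on which nodes, is to launch a tiny script that only prints the host name and process ID. mpiexec starts one copy of the program per process, so each copy prints one line. This is a rough sketch using only the Python standard library; the file name whereami.py and both function names are made up for illustration, and it does not use any ATK or MPI API:

```python
import os
import socket
from collections import Counter

def where_am_i():
    """Return 'hostname pid' for the current process (hypothetical helper)."""
    return "%s %d" % (socket.gethostname(), os.getpid())

def processes_per_host(lines):
    """Count how many reported processes ran on each host.

    `lines` is the collected output of where_am_i() from all processes.
    """
    return Counter(line.split()[0] for line in lines if line.strip())

if __name__ == "__main__":
    # Each process started by mpiexec prints exactly one line.
    print(where_am_i())
```

Saved as whereami.py and started with, say, "mpiexec -n 4 python whereami.py", this should print four lines. If the same host name appears more than once, several processes share that node and therefore its memory. Passing the collected lines to processes_per_host gives the count per node.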