QuantumATK Forum
QuantumATK => General Questions and Answers => Topic started by: alex_z on June 16, 2011, 03:23
-
Dear support,
For some reason my program stops working, and I do not know whether the problem is with my computer or with ATK 11.2.3.
I'm running I-V loop calculations for my system. Everything starts smoothly, but after a couple of days the program terminates with this message in the error file:
terminate called after throwing an instance of 'std::bad_alloc'
what(): St9bad_alloc
The last message in the output file is the following:
rank 3 in job 1 tmg-dl585g5-00_55054 caused collective abort of all ranks
exit status of rank 3: killed by signal 9
It has happened twice. I'm using Fedora 14 (x86_64).
Please help me understand whether this is a hardware problem or an ATK problem.
Also, how can I resume my interrupted simulation from the 'checkpoint.nc' file?
Thanks.
-
I think the calculation is running out of memory. You can test this by adjusting the parameters to reduce the memory requirement and checking whether the error goes away.
-
Indeed, the error message means "out of memory". Make sure you assign only 1 MPI process per physical machine when running in parallel. If you already do and still get the error, you may have to adjust the parameters, for example by lowering the number of k-points or otherwise reducing the memory requirement of the calculation.
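To make the one-process-per-machine advice concrete, here is a minimal Python sketch of the arithmetic. It is not part of ATK: the function name `max_processes_per_node`, the `reserve_gb` margin, and the numbers in the example are all hypothetical, and you would need to measure the real per-process memory of your own calculation (for example with `top` while it runs).

```python
def max_processes_per_node(total_ram_gb, per_process_gb, reserve_gb=2.0):
    """Estimate how many MPI processes fit into one machine's RAM.

    Hypothetical helper, not an ATK function.
    total_ram_gb   -- physical memory of the machine
    per_process_gb -- measured peak memory of ONE MPI process
    reserve_gb     -- assumed margin left for the OS and other programs
    """
    if per_process_gb <= 0:
        raise ValueError("per_process_gb must be positive")
    usable_gb = total_ram_gb - reserve_gb
    if usable_gb < per_process_gb:
        return 0  # even a single process does not fit
    return int(usable_gb // per_process_gb)


# Made-up numbers: a 64 GB machine, each process peaking at about 25 GB.
print(max_processes_per_node(64, 25))  # prints 2
```

If the result is 0, even a single process does not fit, and the memory requirement itself has to come down (fewer k-points, for instance). If it is smaller than the number of processes you currently start on that machine, reduce the process count accordingly.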