QuantumATK Forum

QuantumATK => General Questions and Answers => Topic started by: Jenny on May 27, 2014, 18:11

Title: parallel calculation error
Post by: Jenny on May 27, 2014, 18:11
Dear All,

I have been running parallel calculations for a while without any problems, but recently I ran into an error. The output looks like this:

|  96  Cu   [   3.866 ,  12.144 ,   6.326 ]   10.99981  -0.00019               |
|  97  Cu   [   6.422 ,  12.144 ,   6.326 ]   10.99981  -0.00019               |
|  98  Cu   [   8.978 ,  12.144 ,   6.326 ]   10.99981  -0.00019               |
|  99  Cu   [  11.535 ,  12.144 ,   6.326 ]   10.99981  -0.00019               |
+------------------------------------------------------------------------------+
|  12 E =   -241.9 dE =  6.952814e-04 dH =  1.960330e-05                       |
+------------------------------------------------------------------------------+

                            |--------------------------------------------------|
Calculating Eigenvalues    : ==================================================
Calculating Density Matrix : ====
job aborted:
rank: node: exit code[: error message]
0: MEMS361-4-PC: 123
1: MEMS361-2-PC: -1073741819: process 1 exited without calling finalize
2: MEMS361-PC: 123
3: MEMSLab-PC: 123

After this, I ran the same parallel calculation again. The error appeared again, as follows:

|  96  Cu   [   2.588 ,  10.866 ,   6.326 ]   11.00019   0.00019               |
|  97  Cu   [   5.144 ,  10.866 ,   6.326 ]   11.00019   0.00019               |
|  98  Cu   [   7.700 ,  10.866 ,   6.326 ]   11.00019   0.00019               |
|  99  Cu   [  10.256 ,  10.866 ,   6.326 ]   11.00019   0.00019               |
+------------------------------------------------------------------------------+
|  17 E = -241.902 dE =  1.842085e-04 dH =  1.172817e-05                       |
+------------------------------------------------------------------------------+

                            |--------------------------------------------------|
Calculating Eigenvalues    : ==================================================

job aborted:
rank: node: exit code[: error message]
0: MEMS361-4-PC: 123
1: MEMS361-2-PC: -1073741819: process 1 exited without calling finalize
2: MEMS361-PC: 123
3: MEMSLab-PC: 123


Does anyone know what the problem is with my computers?

Thank you very much.

Jenny
Title: Re: parallel calculation error
Post by: Anders Blom on May 28, 2014, 00:32
I'd also guess that it's an issue with the machines rather than with the calculation. Possibly one of the nodes has run out of memory. Try a slightly smaller calculation (for instance, reduce the number of history steps) and see if it completes.
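
To narrow down which machine is at fault, it can help to decode the exit codes in the "job aborted" summary. The sketch below is plain Python, not part of QuantumATK; the function name `parse_abort_lines` and the small lookup table are made up for illustration. It assumes the summary lines follow the `rank: node: exit code[: error message]` layout shown above. The one fact it relies on is that -1073741819, read as an unsigned 32-bit value, is 0xC0000005, the Windows status code for an access violation. That is consistent with the process on MEMS361-2-PC crashing in both runs, although it does not by itself prove that the node ran out of memory.

```python
import re

# Minimal lookup of Windows NTSTATUS values (illustrative, not exhaustive).
NTSTATUS_NAMES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}

# Matches lines such as "1: MEMS361-2-PC: -1073741819: some message".
# The layout is assumed from the log excerpt in the question.
LINE_RE = re.compile(r"^(\d+):\s*([^:\s]+):\s*(-?\d+)(?::\s*(.*))?$")


def parse_abort_lines(text):
    """Return one dict per 'rank: node: exit code' line found in text."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip tables, progress bars, and the header line
        rank, node, code, message = match.groups()
        code = int(code)
        unsigned = code & 0xFFFFFFFF  # reinterpret as unsigned 32-bit
        rows.append({
            "rank": int(rank),
            "node": node,
            "code": code,
            "hex": f"0x{unsigned:08X}",
            "name": NTSTATUS_NAMES.get(unsigned, "unknown"),
            "message": message or "",
        })
    return rows


if __name__ == "__main__":
    sample = """job aborted:
rank: node: exit code[: error message]
0: MEMS361-4-PC: 123
1: MEMS361-2-PC: -1073741819: process 1 exited without calling finalize
2: MEMS361-PC: 123
3: MEMSLab-PC: 123
"""
    for row in parse_abort_lines(sample):
        print(row["rank"], row["node"], row["hex"], row["name"], row["message"])
```

Since rank 1 on MEMS361-2-PC is the only entry with an error message in both runs, that machine is a reasonable first place to check memory usage while the job runs, or to leave out of the host list for a test run.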