QuantumATK Forum

QuantumATK => General Questions and Answers => Topic started by: Choongman Moon on February 17, 2017, 05:28

Title: Error message: "unknown software exception" when using larger # of processors
Post by: Choongman Moon on February 17, 2017, 05:28
Dear All,

My calculation optimizes the structure of graphene, and the number of k-points is 85.
I'm trying to compare the calculation time when using 5 and 17 processors (I have 28 processors in total).

The calculation completed when I used 5 processors,
but I got the error message ".... unknown software exception (0xc000417)......" when I used 17 processors.
(Please see the attached image file. The error message is partly in Korean, but I hope you can still make sense of it.)

I do not get this error message when the number of processors is below 8,
but with more than 8 processors, say N,
N-8 of the atkpython.exe processes are forcibly closed about 1 minute after the calculation starts.

Please help me solve this problem.
Thank you in advance.
Title: Re: Error message: "unknown software exception" when using larger # of processors
Post by: Anders Blom on February 17, 2017, 13:22
You may run out of memory... You really have 28 cores on that Windows machine? How much RAM?
Title: Re: Error message: "unknown software exception" when using larger # of processors
Post by: Choongman Moon on February 20, 2017, 07:10
My Windows machine has 56 cores (in 2 sockets), and my ATK license covers 28 cores (to stay within budget).
As for RAM, I have 128 GB in total.
While the calculation is running, memory usage is only about 10~15 GB.
Title: Re: Error message: "unknown software exception" when using larger # of processors
Post by: Choongman Moon on February 20, 2017, 07:20
One more thing: I do not get this error message when I run other Python scripts.
So the attached .py file might have a problem.
Title: Re: Error message: "unknown software exception" when using larger # of processors
Post by: Jess Wellendorff on February 21, 2017, 08:51
The script you attached works if executed in serial and in parallel. A few comments:
- The chosen convergence threshold for the unit cell stress is 0.001 GPa, which is extremely low (below the level of numerical noise). This causes the geometry optimization algorithm to take many more steps than needed. I suggest increasing that tolerance to 0.01 GPa, which is still very low, so that only 4 BFGS steps are needed. The full calculation then finishes in 10 seconds if executed in serial!
- Your graphene unit cell is small, and the chosen ATK-DFT calculator settings are not "heavy". There is therefore no need for advanced MPI parallelization with odd numbers of processes. As mentioned above, the calculation should not take much more than 10 seconds if executed in serial on a modern laptop.
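To put the two points above in numbers, here is a minimal plain-Python sketch (it does not use ATK and is not taken from the attached script). It converts the two stress tolerances to eV/Ang^3 and shows how 85 k-points would divide over a few process counts. The helper names are hypothetical, and the even split of k-points over processes is only an assumption made for illustration, not a statement about how ATK actually distributes the work.

```python
import math

# 1 eV/Ang^3 = 1.602176634e-19 J / 1e-30 m^3 = 160.2176634 GPa
GPA_PER_EV_PER_ANG3 = 160.2176634


def gpa_to_ev_per_ang3(stress_gpa):
    """Convert a stress value from GPa to eV/Ang^3."""
    return stress_gpa / GPA_PER_EV_PER_ANG3


def kpoints_per_process(n_kpoints, n_processes):
    """Largest k-point count on any one process.

    Assumes the k-points are split as evenly as possible over the
    processes (an illustrative model, not confirmed ATK behavior).
    """
    return math.ceil(n_kpoints / n_processes)


# The original tolerance and the suggested, looser one.
for tolerance_gpa in (0.001, 0.01):
    converted = gpa_to_ev_per_ang3(tolerance_gpa)
    print(f"{tolerance_gpa} GPa = {converted:.2e} eV/Ang^3")

# 85 k-points over serial, 5, 17, and the full 28 licensed processes.
for n_processes in (1, 5, 17, 28):
    share = kpoints_per_process(85, n_processes)
    print(f"{n_processes} processes: at most {share} k-points each")
```

Since 85 = 5 x 17, both 5 and 17 processes divide the k-points without remainder under this assumption, which is presumably why those odd process counts were chosen. For a calculation that takes about 10 seconds in serial, the gain from such a split is negligible either way.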
Title: Re: Error message: "unknown software exception" when using larger # of processors
Post by: Choongman Moon on February 23, 2017, 03:41
Oh, it seems my unit cell stress tolerance was too strict. The calculation now finishes in a few seconds. Thanks a lot!