Topic: Error message: "unknown software exception" when using a larger number of processors

Offline Choongman Moon

  • Regular QuantumATK user
  • **
  • Posts: 7
  • Country: kr
  • Reputation: 0
    • View Profile
Dear All,

My calculation optimizes the structure of graphene, and the number of k-points is 85.
I'm trying to compare the calculation time when using 5 and 17 processors (I have 28 processors in total).

The calculation finished normally with 5 processors,
but with 17 processors I got the error message ".... unknown software exception (0xc000417)......".
(Please see the attached image file. The error message is partially in Korean, but I hope you can work out what it says.)

I do not get this error message when the number of processors is below 8,
but when the number of processors is more than 8, say N processors,
N-8 of the atkpython.exe processes are forcibly closed about 1 minute after the calculation starts.

Please help me solve this problem.
Thank you in advance.
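
The choice of 5 and 17 processors can be checked with a little arithmetic. The sketch below assumes that the work is split over k-points, one group of k-points per process; the thread does not confirm how the parallelization is actually done, and the helper name `kpoints_per_process` is made up for illustration.

```python
def kpoints_per_process(n_kpoints, n_processes):
    """Split n_kpoints as evenly as possible over n_processes.

    Hypothetical helper: assumes a simple block distribution of
    k-points over processes, which may differ from what the software does.
    """
    base, extra = divmod(n_kpoints, n_processes)
    # The first `extra` processes receive one additional k-point.
    return [base + 1 if rank < extra else base for rank in range(n_processes)]


if __name__ == "__main__":
    for n_proc in (5, 8, 17):
        counts = kpoints_per_process(85, n_proc)
        print(n_proc, "processes:", min(counts), "to", max(counts), "k-points each")
```

Under that assumption, 85 k-points divide evenly over both 5 and 17 processes, while a count such as 8 leaves some processes with one k-point more than others.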

Offline Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5576
  • Country: dk
  • Reputation: 96
    • View Profile
    • QuantumATK at Synopsys
You may run out of memory... You really have 28 cores on that Windows machine? How much RAM?

Offline Choongman Moon

My Windows machine has 56 cores (in 2 sockets), and my ATK licence is for 28 cores (to meet a budget).
As for RAM, I have 128 GB in total.
When running the calculation, memory usage is only about 10 to 15 GB.
« Last Edit: February 20, 2017, 07:14 by Choongman Moon »

Offline Choongman Moon

One more thing: I do not get this error message when I run other Python scripts.
So the attached .py file itself might have a problem.

Offline Jess Wellendorff

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 933
  • Country: dk
  • Reputation: 29
    • View Profile
The script you attached works if executed in serial and in parallel. A few comments:
- The chosen convergence threshold for the unit cell stress is 0.001 GPa, which is extremely low (below the level of numerical noise). This causes the geometry optimization algorithm to take many more steps than needed. I suggest increasing that tolerance to 0.01 GPa, which is still very low, so that only 4 BFGS steps are needed. The full calculation then finishes in 10 seconds if executed in serial!
- Your graphene unit cell is small, and the chosen ATK-DFT calculator settings are not "heavy". There is therefore no need for advanced MPI parallelization with odd numbers of processes. As mentioned above, the calculation should not take much more than 10 seconds if executed in serial on a modern laptop.
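
To illustrate the first point, here is a toy model of why a tolerance below the numerical noise level costs so many steps. It is only a sketch and not the ATK optimizer: the function name, the noise floor of 0.002 GPa, the starting stress and the reduction factor are all hypothetical numbers chosen for illustration.

```python
def steps_to_converge(tolerance_gpa, noise_floor_gpa=0.002,
                      initial_stress_gpa=1.0, reduction_per_step=0.1,
                      max_steps=200):
    """Toy model: count steps until the observed stress is below the tolerance.

    The 'true' stress shrinks by a constant factor per step, but the value
    the optimizer sees never drops below a fixed noise floor.
    All default values are hypothetical.
    """
    true_stress = initial_stress_gpa
    for step in range(1, max_steps + 1):
        true_stress *= reduction_per_step
        # Numerical noise limits how small the measured stress can get.
        observed = max(true_stress, noise_floor_gpa)
        if observed < tolerance_gpa:
            return step
    return None  # tolerance never reached within max_steps


if __name__ == "__main__":
    for tol in (0.01, 0.001):
        print("tolerance", tol, "GPa ->", steps_to_converge(tol))
```

In this toy model a tolerance above the noise floor is reached after a handful of steps, while a tolerance below it is never reached and the loop runs until `max_steps`.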

Offline Choongman Moon

Oh, it seems my unit cell stress setting was too strict. The calculation now finishes in a few seconds. Thanks a lot!