Author Topic: GGA-PAW Memory error  (Read 1974 times)


Offline sadegh

  • Regular QuantumATK user
  • **
  • Posts: 6
  • Country: us
  • Reputation: 0
GGA-PAW Memory error
« on: October 16, 2020, 03:31 »
Hi dear ATK

I am trying to use GGA-PAW-DFT-D3 to optimize an MXene structure (Python file attached). I am aware that running PAW reduces performance and increases accuracy. However, in my case the job stopped after almost 2 days with "MemoryError: Unable to allocate the required storage. This is probably caused by insufficient available memory." (please see the end of the attached log file for the rest of the error message).

In the log file (attached) the "Memory estimate for plane-wave quantities per process" is 3.70 GB!
 
The server that I use is equipped with 2 Intel Xeon Gold 6148 CPUs and 384 GB RAM. In the submission settings I used Multiprocess parallel and chose 26 for the number of processes (please see the attached log file).

I was wondering if you could help me find a way to complete the job faster and more efficiently, or to reduce the memory usage.

Offline filipr

  • QuantumATK Staff
  • Heavy QuantumATK user
  • *****
  • Posts: 73
  • Country: dk
  • Reputation: 6
  • QuantumATK developer
Re: GGA-PAW Memory error
« Reply #1 on: October 19, 2020, 09:29 »
Dear Sadegh

I've looked through your input script and your calculation output and, as you have also observed, it should have enough memory to run a DFT calculation. It is also able to do a few optimization steps before it crashes. This does sound a bit weird, as if something is leaking memory or using more memory than we would expect. We will try to run your script and see if we can pinpoint the issue. Unfortunately, this may take a while as we have a lot of other things to do, so please be patient.

Until then I have some suggestions for reducing the memory footprint:

I looked up the CPU you are running on, and it has 20 physical cores, and you have two of those, so a total of 40 available cores. Let's put them all to use :)

I see you are using the newest version (2020.09), which is quite well parallelized using threads. So I suggest that you run using 4 threads, which means you should run with 40/4 = 10 processes (you can select 4 threads and 10 processes in the job manager). You have 13 k-points and ~70,000 plane waves to parallelize over. To reduce the number of wave functions each process has to store in memory, you can distribute them over processes by choosing processes per k-point = 5. This should give a reasonable load balance with minimal memory consumption. So to summarize:

number of threads: 4
number of processes: 10
processes per k-point: 5 (this one you set in the calculator settings in the script generator)
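
For reference, the arithmetic behind these numbers can be sketched in a few lines of plain Python. This is only an illustration and does not use any QuantumATK API. All variable names are made up for the example, and the k-point grouping is an assumption: it supposes that the 10 processes are split into equal groups of 5, with each group working on one k-point at a time.

```python
import math

# Numbers quoted in this thread
cpu_count = 2                  # two Xeon Gold 6148 CPUs
physical_cores_per_cpu = 20    # physical cores per CPU
kpoints = 13                   # k-points in the calculation
original_processes = 26        # process count used in the failed run
estimate_per_process_gb = 3.70 # "Memory estimate for plane-wave quantities per process"
total_ram_gb = 384             # RAM installed in the server

# Suggested settings
threads_per_process = 4
processes_per_kpoint = 5

# One thread per physical core: 40 cores / 4 threads = 10 processes
total_cores = cpu_count * physical_cores_per_cpu
processes = total_cores // threads_per_process

# Assumption: processes form equal groups, one k-point per group at a time
kpoint_groups = processes // processes_per_kpoint
kpoint_rounds = math.ceil(kpoints / kpoint_groups)

# Rough total of the logged estimate for the original 26-process run
original_estimate_gb = original_processes * estimate_per_process_gb

print(f"cores: {total_cores}, processes: {processes}")
print(f"k-point groups: {kpoint_groups}, rounds over k-points: {kpoint_rounds}")
print(f"logged estimate, original run: {original_estimate_gb:.1f} GB of {total_ram_gb} GB")
```

The last line only multiplies the estimate printed in the log by the original process count, which is why the job looks like it should fit in memory. It says nothing about the actual peak usage, and the per-process estimate will be different with the new settings.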

I hope this makes the calculation run through.