Dear Sadegh,
I've looked through your input script and your calculation output and, as you also observed, there should be enough memory to run a DFT calculation. It is also able to complete a few optimization steps before it crashes. This does sound a bit odd, as if something is leaking memory or using more memory than we would expect. We will try to run your script and see if we can pinpoint the issue. Unfortunately, this may take a while as we have a lot of other things to do, so please be patient.
Until then I have some suggestions for reducing the memory footprint:
I looked up the CPU you are running on: it has 20 physical cores, and you have two of those, so a total of 40 cores are available. Let's put them all to use.
I see you are using the newest version (2020.09), which is quite well parallelized using threads. I therefore suggest that you run using 4 threads per process, which means that you should run with 40/4 = 10 processes (you can select 4 threads and 10 processes in the job manager). You have 13 k-points and ~70,000 plane waves to parallelize over. To reduce the number of wave functions that each process has to store in memory, you can distribute them over processes by choosing processes per k-point = 5. This should give a reasonable load balance with minimal memory consumption. To summarize:
number of threads: 4
number of processes: 10
processes per k-point: 5 (this one you set in the calculator settings in the script generator)
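If you want to double-check these numbers, or try other combinations, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name parallel_layout is made up, and I am assuming that the processes are split into equally sized groups with the k-points shared as evenly as possible between the groups. The actual distribution inside the program may differ in detail.

```python
from math import ceil


def parallel_layout(total_cores, threads_per_process, processes_per_kpoint, n_kpoints):
    """Return the process layout implied by the chosen settings.

    Hypothetical helper for illustration only; it assumes k-points are
    shared as evenly as possible between equally sized process groups.
    """
    if total_cores % threads_per_process != 0:
        raise ValueError("threads per process must divide the number of cores")
    n_processes = total_cores // threads_per_process

    if n_processes % processes_per_kpoint != 0:
        raise ValueError("processes per k-point must divide the number of processes")
    n_groups = n_processes // processes_per_kpoint

    return {
        "processes": n_processes,
        "kpoint_groups": n_groups,
        # Largest number of k-points any single group has to handle.
        "max_kpoints_per_group": ceil(n_kpoints / n_groups),
    }


# Values from this thread: 2 x 20 cores, 4 threads, 5 processes per k-point, 13 k-points.
layout = parallel_layout(total_cores=40, threads_per_process=4,
                         processes_per_kpoint=5, n_kpoints=13)
print(layout)
```

With the settings above this gives 10 processes in 2 groups of 5, so under the stated assumption one group handles 7 k-points and the other 6.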
I hope this makes the calculation run through.