Topic: memory error?

Offline esp

  • Supreme QuantumATK Wizard
  • *****
  • Posts: 318
  • Country: us
  • Reputation: 3
    • View Profile
    • University of Minnesota
memory error?
« on: November 6, 2012, 05:19 »
This looks like a memory error, but I reduced memory steps to 12, and I have never seen one like this. Any ideas? I was running an electron density calculation on a 16 nm long GNR device.


patakye@node1083:~/ATKSIMS/shellscripts> head -n 80 msi_runDevTest4.pbs.e439725
/soft/intel/x86_64/12.1/8.273/composer_xe_2011_sp1.8.273/mpirt/bin/intel64/mpirun: line 86: /soft/intel/x86_64/12.1/8.273/composer_xe_2011_sp1.8.273/mpirt/bin/intel64/mpivars.sh: No such file or directory
Traceback (most recent call last):
  File "./zipdir/NL/Calculators/DeviceCalculatorInterface.py", line 291, in _update
  File "./zipdir/NL/Calculators/LCAOCalculator/DeviceLCAOCalculator.py", line 1688, in scfLoopDevice
  File "./zipdir/NL/Calculators/LCAOCalculator/DeviceLCAOCalculator.py", line 623, in scfLoopDeviceHamiltonian
  File "./zipdir/NL/Calculators/CommonBuilder/DeviceBuilder.py", line 461, in createElectrostaticCalculator
  File "./zipdir/NL/CommonConcepts/PoissonSolvers/MultigridSolver.py", line 55, in calculateFunctionOnGrid
MemoryError
Traceback (most recent call last):
  File "./zipdir/NL/Calculators/DeviceCalculatorInterface.py", line 291, in _update
  File "./zipdir/NL/Calculators/LCAOCalculator/DeviceLCAOCalculator.py", line 1688, in scfLoopDevice
  File "./zipdir/NL/Calculators/LCAOCalculator/DeviceLCAOCalculator.py", line 623, in scfLoopDeviceHamiltonian
  File "./zipdir/NL/Calculators/CommonBuilder/DeviceBuilder.py", line 461, in createElectrostaticCalculator
  File "./zipdir/NL/CommonConcepts/PoissonSolvers/MultigridSolver.py", line 55, in calculateFunctionOnGrid
MemoryError
Traceback (most recent call last):
  File "./zipdir/NL/Calculators/DeviceCalculatorInterface.py", line 291, in _update
  File "./zipdir/NL/Calculators/LCAOCalculator/DeviceLCAOCalculator.py", line 1688, in scfLoopDevice
  File "./zipdir/NL/Calculators/LCAOCalculator/DeviceLCAOCalculator.py", line 623, in scfLoopDeviceHamiltonian
  File "./zipdir/NL/Calculators/CommonBuilder/DeviceBuilder.py", line 461, in createElectrostaticCalculator
  File "./zipdir/NL/CommonConcepts/PoissonSolvers/MultigridSolver.py", line 55, in calculateFunctionOnGrid
MemoryError

Traceback (most recent call last):
  File "./zipdir/NL/Calculators/DeviceCalculatorInterface.py", line 291, in _update
  File "./zipdir/NL/Calculators/LCAOCalculator/DeviceLCAOCalculator.py", line 1688, in scfLoopDevice
  File "./zipdir/NL/Calculators/LCAOCalculator/DeviceLCAOCalculator.py", line 623, in scfLoopDeviceHamiltonian
  File "./zipdir/NL/Calculators/CommonBuilder/DeviceBuilder.py", line 461, in createElectrostaticCalculator
  File "./zipdir/NL/CommonConcepts/PoissonSolvers/MultigridSolver.py", line 55, in calculateFunctionOnGrid
MemoryError
*** glibc detected *** /home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/atkpython_exec: double free or corruption (fasttop): 0x00007f05e0002430 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x75916)[0x7f06269d8916]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/lib/python2.7/_NLEngine.so(+0xe3fa7e)[0x7f06256fca7e]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/lib/python2.7/_NLEngine.so(+0xf5a0b2)[0x7f06258170b2]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/lib/python2.7/_NLEngine.so(+0xf5a1e5)[0x7f06258171e5]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/lib/python2.7/_NLEngine.so(+0xd8994e)[0x7f062564694e]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/lib/python2.7/_NLEngine.so(LMX_Checkin+0x120)[0x7f0625695500]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/lib/python2.7/_NLEngine.so(+0x31b81b)[0x7f0624bd881b]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x560d)[0x7f062768c98d]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x775)[0x7f062768ec05]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5c56)[0x7f062768cfd6]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x775)[0x7f062768ec05]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5c56)[0x7f062768cfd6]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x775)[0x7f062768ec05]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5c56)[0x7f062768cfd6]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x775)[0x7f062768ec05]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5c56)[0x7f062768cfd6]
/home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6b25)[0x7f062768dea5]
======= Memory map: ========
00400000-00401000 r-xp 00000000 00:17 6374573                            /home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/atkpython_exec
00500000-00501000 rw-p 00000000 00:17 6374573                            /home/it1/patakye/QuantumWise/atk-12.2.2/atkpython/bin/atkpython_exec
00f81000-0be90000 rw-p 00000000 00:00 0                                  [heap]
7f0543769000-7f05dc000000 rw-p 00000000 00:00 0
7f05dc000000-7f05dc2f2000 rw-p 00000000 00:00 0
7f05dc2f2000-7f05e0000000 ---p 00000000 00:00 0
7f05e0000000-7f05e0021000 rw-p 00000000 00:00 0
7f05e0021000-7f05e4000000 ---p 00000000 00:00 0
7f05e4000000-7f05e42f0000 rw-p 00000000 00:00 0
7f05e42f0000-7f05e8000000 ---p 00000000 00:00 0
7f05e8000000-7f05e82eb000 rw-p 00000000 00:00 0
7f05e82eb000-7f05ec000000 ---p 00000000 00:00 0
7f05ec000000-7f05ec270000 rw-p 00000000 00:00 0
7f05ec270000-7f05f0000000 ---p 00000000 00:00 0
7f05f0000000-7f05f02f0000 rw-p 00000000 00:00 0
7f05f02f0000-7f05f4000000 ---p 00000000 00:00 0
7f05f4000000-7f05f42f0000 rw-p 00000000 00:00 0
7f05f42f0000-7f05f8000000 ---p 00000000 00:00 0
7f05f8000000-7f05f8270000 rw-p 00000000 00:00 0
7f05f8270000-7f05fc000000 ---p 00000000 00:00 0
7f05fc000000-7f05fc021000 rw-p 00000000 00:00 0
7f05fc021000-7f0600000000 ---p 00000000 00:00 0
7f0604c5d000-7f0604c5e000 ---p 00000000 00:00 0
7f0604c5e000-7f060505e000 rwxp 00000000 00:00 0
7f060505e000-7f060505f000 ---p 00000000 00:00 0
7f060505f000-7f060545f000 rwxp 00000000 00:00 0
7f060545f000-7f0605460000 ---p 00000000 00:00 0
7f0605460000-7f0605860000 rwxp 00000000 00:00 0
7f0605860000-7f0605861000 ---p 00000000 00:00 0

Offline esp

Re: memory error?
« Reply #1 on: November 6, 2012, 07:14 »
I am having the same issue with the electron density and LDDOS calculations.
I have 2.5 GB per core, 8 cores per node, and 8 nodes. Is that not enough memory, or is it a system issue? Note that the transmission calculation for the same device ran fine.

I believe Dr. Blom said the amount of memory required increases with the number of nodes. I am retrying with 2 nodes.
« Last Edit: November 6, 2012, 08:10 by esp »
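
For reference, the memory figures quoted in this post work out as in the sketch below. It assumes one process per core, which is not stated in the thread, and the function name `memory_budget` is only illustrative.

```python
# Figures taken from the post above.
GB_PER_CORE = 2.5
CORES_PER_NODE = 8


def memory_budget(nodes, processes_per_node=CORES_PER_NODE):
    """Return (GB per node, GB in total, GB per process) for a job.

    Assumption: one process per core unless processes_per_node says otherwise.
    """
    per_node = GB_PER_CORE * CORES_PER_NODE
    total = per_node * nodes
    per_process = per_node / processes_per_node
    return per_node, total, per_process


for nodes in (8, 2):
    per_node, total, per_process = memory_budget(nodes)
    print("%d nodes: %.1f GB per node, %.1f GB total, %.1f GB per process"
          % (nodes, per_node, total, per_process))
```

Under that assumption the budget per process is the same with 8 nodes or 2 nodes. Fewer nodes would then only help if each process needs less memory when fewer processes take part, which is what the remark about memory growing with node count would imply.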

Offline Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5423
  • Country: dk
  • Reputation: 89
    • View Profile
    • QuantumATK at Synopsys
Re: memory error?
« Reply #2 on: November 6, 2012, 09:07 »
We have discovered a bug, related to memory usage, in how NC files are read and rewritten. If this happens when you are trying to put a lot of data into one big NC file, try splitting it up into several small ones instead; that might solve the memory problem.
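
One way to follow this suggestion is to give each analysis its own NC file name. The sketch below is plain Python and only builds the names; the helper `per_analysis_filename`, the base name `gnr_device.nc`, and the labels are hypothetical and not taken from the original script.

```python
import os


def per_analysis_filename(base_filename, label):
    """Build a separate file name for one analysis.

    Example: 'gnr_device.nc' with label 'lddos' gives 'gnr_device_lddos.nc'.
    """
    root, ext = os.path.splitext(base_filename)
    # Fall back to '.nc' if the base name was given without an extension.
    return "%s_%s%s" % (root, label, ext or ".nc")


# Hypothetical labels for the calculations mentioned in this thread.
labels = ["electron_density", "lddos", "transmission"]
filenames = [per_analysis_filename("gnr_device.nc", label) for label in labels]

for name in filenames:
    # In an ATK script, each analysis object would be saved to its own
    # file here (assumed usage: nlsave(name, analysis_object)).
    print(name)
```

Each file then stays small, so no single save has to handle everything written so far.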

Offline esp

Re: memory error?
« Reply #3 on: November 6, 2012, 17:06 »
I suppose it is possible, but I started with a fresh new file, and I am only running the electron density calculation. What I did seems to have worked for now: the job has not quit since I went to two nodes. Maybe that helped, or was it a coincidence?

Offline Anders Blom

Re: memory error?
« Reply #4 on: November 6, 2012, 17:15 »
Maybe...

Offline esp

Re: memory error?
« Reply #5 on: November 6, 2012, 22:58 »
Is it a problem if I save LCAO calculations to an NC file that previously held Huckel calculations?

Offline Anders Blom

Re: memory error?
« Reply #6 on: November 6, 2012, 23:16 »
No, but every time you save to the NC file, it has to read in the whole file and write it out again. So if you are adding a lot to one file, each save can take time and cost memory.
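
To see why this adds up, here is a small model of the behavior described above, where every save rewrites the whole file. It is only a sketch of the cost, not ATK code, and the object sizes are made-up numbers.

```python
def total_written_single_file(object_sizes):
    """Total data written when each save rewrites the whole file."""
    file_size = 0
    total_written = 0
    for size in object_sizes:
        file_size += size           # the file grows by the new object
        total_written += file_size  # and is written out again in full
    return total_written


def total_written_separate_files(object_sizes):
    """Total data written when each object goes to its own file."""
    return sum(object_sizes)


# Ten hypothetical objects of 100 MB each.
sizes_mb = [100] * 10
print(total_written_single_file(sizes_mb))     # 5500
print(total_written_separate_files(sizes_mb))  # 1000
```

In this model the work for one big file grows roughly with the square of the number of saves, while separate files grow linearly.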