Topics - Vit

Pages: [1]
1
Hello,

I'm looking for some information regarding the checkpoint file ATK uses in version 11.2. I manually set the location of the checkpoint file using:

Code
checkpoint_handler = CheckpointHandler('/home/nanotubes/vitesh/CNT/8.8-K24I24-opt11.2-cp.nc', 30*Minute)

calculator = DeviceLCAOCalculator(
    checkpoint_handler=checkpoint_handler,
    # ... remaining calculator parameters omitted
)

When I try to optimise the geometry of a two-probe system, the job crashes, apparently from running out of memory, at about the point where the checkpoint file is created (I think). The output file ends with what looks like a standard out-of-memory kill:

Code
Calculating Eigenvalues    : ==================================================
rank 0 in job 1  r1i1n11_47968   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9

Judging by the timing of the crash (just as an SCF cycle is about to finish), I suspect the memory crash and the checkpoint file creation are linked.

The other error message I get is:

Code
Traceback (most recent call last):
  File "./zipdir/NL/Calculators/BulkCalculatorInterface.py", line 183, in _update
  File "./zipdir/NL/Calculators/LCAOCalculator/LCAOCalculator.py", line 1019, in scfLoop
  File "./zipdir/NL/Calculators/LCAOCalculator/LCAOCalculator.py", line 733, in scfLoopHamiltonian
  File "./zipdir/NL/Calculators/GenericParameters/CheckpointHandler.py", line 117, in _storeIfNecessary
  File "./zipdir/NL/IO/IOUtilityFunctions.py", line 566, in createNetCDFFile
OSError: [Errno 2] No such file or directory: '/home/nanotubes/vitesh/CNT/8.8-K24I24-opt11.2-cp.nc.tmp'
Fatal error in MPI_Allreduce: Message truncated, error stack:
MPI_Allreduce(773).......: MPI_Allreduce(sbuf=0x2aaaad19a350, rbuf=0x2aaaad19d4f0, count=2, MPI_INT, MPI_SUM, MPI_COMM_WORLD) failed
MPIR_Reduce(764).........:
MPIR_Reduce_binomial(172):
do_cts(490)..............: Message truncated; 8632624 bytes received but buffer size is 8

This seems to point to the checkpoint file as the cause of the error. According to the 11.2 manual, the checkpoint file is a .nc file; however, the file ATK is trying to write is a .nc.tmp file.
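Errno 2 on file creation generally means that a directory in the path does not exist as seen by the process doing the write, not that the file itself is missing. A minimal pre-flight check, in plain Python and independent of ATK, could be run on a compute node before the job starts. The helper name `check_checkpoint_dir` is hypothetical; the path is the one from the script above.

```python
import os
import tempfile


def check_checkpoint_dir(checkpoint_path):
    """Return (ok, message) saying whether a file can be created
    next to checkpoint_path. Hypothetical helper, not part of ATK."""
    directory = os.path.dirname(os.path.abspath(checkpoint_path))
    if not os.path.isdir(directory):
        return False, "directory does not exist: %s" % directory
    try:
        # Create and immediately delete a scratch file in the same directory.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError as error:
        return False, "cannot create files in %s: %s" % (directory, error)
    return True, "ok: %s is writable" % directory


if __name__ == "__main__":
    ok, message = check_checkpoint_dir(
        "/home/nanotubes/vitesh/CNT/8.8-K24I24-opt11.2-cp.nc")
    print(message)
```

If this reports a missing directory on the compute node but not on the login node, the home directory may be mounted differently on the two, which would be worth checking with the cluster administrators.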

So my questions are:

1) Is there any difference between the .nc I directed it to write and the .tmp it is trying to write?

2) How large is a checkpoint file going to get during the course of a calculation? Does it get overwritten after every SCF step which takes place after the specified time interval, or is the new data appended? Will the checkpoint file ever be bigger than the final .nc file?

3) Does saving a checkpoint file require a significant amount of memory during the checkpoint creation procedure?

4) Should I create the (empty) checkpoint file beforehand so the program has a file to write to?
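On questions 1 and 4: a `.tmp` name next to the requested `.nc` path is consistent with the common write-then-rename pattern, in which data goes to a temporary file that is renamed over the target only once the write has finished. Whether ATK's CheckpointHandler really works this way is an assumption based only on the file name in the traceback. A sketch of the pattern in plain Python 3, with the hypothetical helper `write_checkpoint`:

```python
import os


def write_checkpoint(path, payload):
    """Write payload (bytes) to path via a temporary sibling file.

    Hypothetical illustration of write-then-rename; not ATK code.
    """
    tmp_path = path + ".tmp"
    # Opening the .tmp file fails with Errno 2 if the directory is missing,
    # regardless of whether `path` itself already exists.
    with open(tmp_path, "wb") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())
    # Replace the previous checkpoint in one step (overwrite, not append).
    os.replace(tmp_path, path)
```

If the checkpoint is written like this, pre-creating an empty .nc file would make no difference, because the file that fails to open is the .tmp sibling, and each save would replace the previous checkpoint instead of appending to it.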

I'd also like to ask a question about geometry optimisation. When you use VNL to create a configuration and add a calculator and a two-probe optimisation, is it valid to go to the editor and delete the device_configuration.update() line if you are not interested in the electronic structure of the initial guess geometry? Or will deleting this line mean that no SCF is run during the geometry optimisation?

2
General Questions and Answers / 'Physics Exception'
« on: February 18, 2011, 14:24 »
Hello,

I'm looking for help with changing the temperature of a two-probe calculation. I've carried out a calculation at a higher temperature, and now want to bring the temperature down, using the electronic structure generated at the higher temperature as a starting point. The script I've made (for ATK 10.8.2) is:

Code: python
# Load config and define: calc, NAP
config = nlread('Mo2Cl9-S3-0.6V-hiT.nc', object_id='gID000')[0]
calculator = config.calculator()
NAP = NumericalAccuracyParameters(electron_temperature = 300.0*Kelvin)

# Change temp
config.setCalculator(calculator(numerical_accuracy_parameters = NAP))

# Run SCF and save the updated configuration
config.update()
nlsave('Mo2Cl9-S3-0.6V-loT.nc', config)

However, I encounter this error:

Code
terminate called after throwing an instance of 'PhysicsException'
terminate called after throwing an instance of 'PhysicsException'
terminate called after throwing an instance of 'PhysicsException'
Center = 107, Left = 97


** Back Engine Exception 29 : central and left electrode has different grid size in i direction
** Location : electrodeutils.cpp:126

Center = 107, Left = 97


** Back Engine Exception 29 : central and left electrode has different grid size in i direction
** Location : electrodeutils.cpp:126

Center = 107, Left = 97


** Back Engine Exception 29 : central and left electrode has different grid size in i direction
** Location : electrodeutils.cpp:126

Center = 107, Left = 97


** Back Engine Exception 29 : central and left electrode has different grid size in i direction
** Location : electrodeutils.cpp:126

rank 5 in job 146  spartan_34202   caused collective abort of all ranks
  exit status of rank 5: killed by signal 9
Center = 107, Left = 97


** Back Engine Exception 29 : central and left electrode has different grid size in i direction
** Location : electrodeutils.cpp:126

Both the left and right electrode calculations converge, but the error above is thrown as soon as the "Device Density Matrix Calculation" starts. I'd appreciate any help, either with what I'm trying to accomplish or with the error message.

Thanks, Vit
