Topic: Question on restarting calculation.

Offline frsy

  • Heavy QuantumATK user
  • ***
  • Posts: 33
  • Reputation: 0
    • View Profile
Question on restarting calculation.
« on: April 14, 2009, 05:11 »
Dear all,
    I ran a two-probe job that was interrupted by a power failure. I still have the NetCDF file, and the electrode part of the calculation had completed. To restart the job I modified the script as follows:

scf = restoreSelfConsistentCalculation(
    filename = 'crash.nc'
)

scf = executeSelfConsistentCalculation(
    self_consistent_calculation=scf,
)

   Then I submitted the job, and the "top" command showed that it was running. But for a long time (24+ hours) there was no further output after:
# -----------------------------------------------------------------------------
# TwoProbe Algorithm Parameters
# -----------------------------------------------------------------------------
Electrode Constraint = ElectrodeConstraints.Off
Initial Density Type = InitialDensityType.EquivalentBulk

   The "top" command shows that it is still running. Did I do something wrong?

Regards,


Offline Nordland

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 812
  • Reputation: 18
    • View Profile
Re: Question on restarting calculation.
« Reply #1 on: April 14, 2009, 07:28 »
Strange!

The way you have done it is exactly how I would have done it...

Offline Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5576
  • Country: dk
  • Reputation: 96
    • View Profile
    • QuantumATK at Synopsys
Re: Question on restarting calculation.
« Reply #2 on: April 14, 2009, 09:27 »
I would have to double-check, but I think that when you restart this way, ATK recomputes the electrodes, because some properties of the electrodes are not stored in the NetCDF file. However, this is done without any output to the log file. So, if your original electrode calculation was very time-consuming, the electrode calculation in restart mode will take the same time; you just will not see any output.

Offline frsy

Re: Question on restarting calculation.
« Reply #3 on: April 14, 2009, 10:41 »
In my case the electrode calculation should finish within 4 hours of the start, so it is really strange.
Could it be related to the parallel run? Or did something go wrong when the electrodes were recalculated?

Offline Anders Blom

Re: Question on restarting calculation.
« Reply #4 on: April 14, 2009, 13:01 »
I think it's very simple. In your new calculation, you use the default runtime parameters, which means the verbosity level is 0, i.e. ATK will not print anything to the screen. So, I'm quite sure the calculation is actually just running! However, there is a bigger problem: the default runtime parameters also mean that no new NetCDF file is written! So, once your calculation finishes, you will not have any NetCDF file with the final, converged result. I would therefore interrupt the calculation and make sure you also include runtime parameters by adding the following 3 lines before the executeSelfConsistentCalculation() statement:
Code
import ATK
ATK.setCheckpointFilename('new_calculation.nc')
ATK.setVerbosityLevel(1)
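
For reference, here is a minimal sketch of how the pieces of the restart script discussed so far might be ordered. The four ATK functions are the ones already used in this thread; they are passed in as arguments only so that the sketch can be read and tested without an ATK installation. The helper name restart_with_runtime_parameters and its parameter names are made up for illustration and are not part of ATK.

```python
def restart_with_runtime_parameters(restore, execute, set_checkpoint,
                                    set_verbosity, old_file, new_file,
                                    verbosity=1):
    """Restart an unconverged run with logging and checkpointing enabled.

    restore, execute, set_checkpoint and set_verbosity stand for
    restoreSelfConsistentCalculation, executeSelfConsistentCalculation,
    ATK.setCheckpointFilename and ATK.setVerbosityLevel.
    Assumption: they are injected as arguments so that this sketch does
    not need ATK to be importable.
    """
    # Set the runtime parameters first, so the restarted run prints
    # output and writes a new NetCDF checkpoint file.
    set_checkpoint(new_file)
    set_verbosity(verbosity)
    # Read the interrupted calculation from the old NetCDF file.
    scf = restore(filename=old_file)
    # Continue it using the parameters stored in that file.
    return execute(self_consistent_calculation=scf)
```

Inside an ATK script the call would presumably pass the real functions, e.g. restart_with_runtime_parameters(restoreSelfConsistentCalculation, executeSelfConsistentCalculation, ATK.setCheckpointFilename, ATK.setVerbosityLevel, 'crash.nc', 'new_calculation.nc').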

Offline frsy

Re: Question on restarting calculation.
« Reply #5 on: April 16, 2009, 04:38 »
Tried it. You are right! Thank you! But I have another related question.
The electrode calculation has converged and is saved to the checkpoint file, but the two-probe calculation has not converged. If I kill the job and change the central region parameters in the input script, will the restarted job read these changed parameters or just ignore them? In other words, are the parameters read from the checkpoint file or from the input script when restarting the calculation?
My test showed that the parameters are read from the checkpoint file. Am I correct? If that is true, one has to recalculate the converged electrodes whenever the parameters of the central region are changed.
« Last Edit: April 16, 2009, 07:32 by frsy »

Offline Anders Blom

Re: Question on restarting calculation.
« Reply #6 on: April 16, 2009, 08:46 »
It depends a bit on how you "restart" the calculation.

If you use the method of passing an scf object under the keyword "self_consistent_calculation" to the function executeSelfConsistentCalculation(), then ATK assumes that you do so because the calculation did not converge (perhaps because of a crash, or because convergence was not reached within the maximum number of steps allowed). In that case, the only thing that makes sense is to take the parameters from the checkpoint file.

If you want to change the parameters, then you should instead pass the self-consistent object (restored from the NetCDF file) under the keyword "initial_calculation". In this case, all parameters are read from the new method, and you can also change the positions of the atoms if you wish. You cannot, however, change the basis set or the number (in fact, even the order) of the atoms, since there needs to be a 1-to-1 mapping between the density matrix in the old and new systems. This is quite useful for getting a head start in cases where you have a converged calculation and just wish to increase the mesh cut-off, k-point sampling, or something else (or, as mentioned already, move the atoms a bit), since the path to convergence is most likely shorter from the converged calculation than from scratch. (This is not always true, but in most cases it is.)
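
The two rules above can be written down as a small sketch. This is plain Python and does not call ATK; the function names, the (element, basis set) tuples, and the basis set labels are all hypothetical and only illustrate the logic: which keyword the restored scf object goes under, and when the old density matrix can still be mapped onto the new system.

```python
def choose_restart_keyword(change_parameters):
    """Return the keyword under which the restored scf object is passed."""
    if change_parameters:
        # New parameters are taken from the new method in the script.
        return "initial_calculation"
    # Parameters are taken from the checkpoint file.
    return "self_consistent_calculation"


def density_matrix_is_reusable(old_atoms, new_atoms):
    """Check the 1-to-1 mapping between the old and new systems.

    Each atom is described by an (element, basis_set) tuple
    (an assumption made for this sketch). Coordinates are deliberately
    left out, because the atoms may be moved between the two runs.
    """
    # Same number of atoms, same order, same basis set on every atom.
    return list(old_atoms) == list(new_atoms)
```

For example, reordering the atom list or switching one atom from a "single-zeta" to a "double-zeta" label makes density_matrix_is_reusable return False, while moving the atoms does not affect the result.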

In either case, the electrodes are recalculated. In 99% of all systems, the electrode calculation takes only a fraction of the total time, so in the big picture it doesn't really matter, although I agree it can be a bit frustrating when you are trying things out :)

You can try to use "initial_calculation" even if your calculation has not converged, but the state saved in the NetCDF file might be a worse starting point than starting from scratch (or not...). It depends on how close to convergence you came in the first run.

Offline frsy

Re: Question on restarting calculation.
« Reply #7 on: April 16, 2009, 10:50 »
Thank you Dr. Blom! I have tried initial_calculation in a bulk calculation and it succeeded. But in this two-probe calculation:

scf = restoreSelfConsistentCalculation(
    filename = 'dump.nc'
)

import ATK
ATK.setCheckpointFilename('check.nc')
ATK.setVerbosityLevel(6)

# Using initial density from self consistent calculation
scf = executeSelfConsistentCalculation(
    ini_two_probe_conf,
    two_probe_method,
    initial_calculation = scf,
)

It failed with "NLPolicyError: A restart for a two-probe requires an initial calculation from a two-probe." The dump.nc file should already include the required data, since "self_consistent_calculation=scf" worked normally.
« Last Edit: April 16, 2009, 10:53 by frsy »

Offline Anders Blom

Re: Question on restarting calculation.
« Reply #8 on: April 16, 2009, 11:00 »
Tricky to say, but my best guess is that the calculation crashed during the equivalent bulk calculation and never reached the two-probe part. Or the NetCDF file is somehow corrupt (perhaps the crash happened right while the file was being saved...).
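
One rough way to rule out the most obvious kind of corruption is to look at the first bytes of the file. The sketch below is plain Python, not an ATK feature, and the function name looks_like_netcdf is made up here. It only checks for the classic NetCDF signature or the HDF5 signature used by NetCDF-4 at the start of the file; a file that was cut off in the middle of a write can still pass, and the check says nothing about whether two-probe data is stored in the file.

```python
def looks_like_netcdf(path):
    """Return True if the file starts with a NetCDF or HDF5 signature."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    # Classic NetCDF: b"CDF" followed by a version byte (1, 2 or 5).
    if header[:3] == b"CDF" and header[3:4] in (b"\x01", b"\x02", b"\x05"):
        return True
    # NetCDF-4 files are HDF5 files; this assumes the HDF5 signature
    # sits at offset 0, which is the usual case.
    return header == b"\x89HDF\r\n\x1a\n"
```

Calling looks_like_netcdf('dump.nc') before the restart would at least catch an empty file or one whose header was never written.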