General Questions and Answers / Checkpoint handler and remote jobs
« on: July 12, 2018, 15:50 »
Hello, I am working with VNL on a local Windows machine and running the calculations on a remote Linux cluster.
I haven't found a clear explanation of how to use checkpoint files in a remote configuration.
Here is what I have tried so far:
- Create a script file specifying a local path for the checkpoint file (i.e. u'C:\User\[...]\Bi_nw\checkpointraw.hdf5')
- Go into the script editor and change it to an absolute path on the server (i.e. u'/W/sb255620/VNL/checkpointraw.hdf5')
- Create a blank “checkpoint.hdf5” file at the specified location on the server and make it writable for all users (otherwise ATK throws an error when I start the calculation)
When I do this, the calculations finish, but once I download the result file to my local machine and try to read the band structure, I get the error shown in the picture below.
Apparently the results in the final hdf5 file depend on the checkpoint file.
What should I do instead? Should I use the same file for results and checkpoint? Or is there something to set in the I/O window of the job manager when I submit the job?
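For reference, the path change in the second step above amounts to roughly the following. This is only a sketch in plain Python, not anything from ATK: `remote_checkpoint_path` is a hypothetical helper, and the Windows path in the usage line is a made-up stand-in, since the real one is abbreviated in the list.

```python
from pathlib import PurePosixPath, PureWindowsPath


def remote_checkpoint_path(local_path: str, remote_dir: str) -> str:
    """Keep the file name of a local Windows path and place it in remote_dir.

    Hypothetical helper for illustration only; it is not part of ATK/VNL.
    Pure paths are used so this works the same on Windows and Linux.
    """
    # Extract just the file name from the Windows-style path.
    file_name = PureWindowsPath(local_path).name
    # Join it onto the POSIX directory on the cluster.
    return str(PurePosixPath(remote_dir) / file_name)


# Made-up local path; the server directory is the one from the post.
print(remote_checkpoint_path(r"C:\work\Bi_nw\checkpointraw.hdf5", "/W/sb255620/VNL"))
```

The string it returns is what I then paste into the script in place of the local path.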