Hello,
I'd like to run ATK in parallel with MPI on 4 nodes, each with 8 cores (32 cores in total).
I wrote a Python input file for ATK and submitted it through the queue system in the same way I run VASP.
My questions are as follows.
1. Despite running in parallel, the calculation is slow.
The queue status shows that the job was assigned to 4 nodes; however, when I log in to those nodes, no ATK process appears in the output of "top" or "ps".
Only one node shows a single running "atkexec" process.
What is the problem, and how should I fix it?
Should I specify the number of nodes or cores in the Python input file for it to run correctly?
Or is it running correctly even though the processes are not shown?
Please answer in as much detail as possible.
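For question 1, one check I could put at the top of the input file is sketched below. It only uses the Python standard library and prints the host name and process ID of each process that starts. The rank and size environment variables are an assumption on my side: they are the ones exported by Open MPI and MPICH-style launchers, and I do not know which MPI our cluster uses, so they may simply print as None. The helper names (first_set, describe_process) are my own and not part of ATK.

```python
import os
import socket

# Rank/size variables exported by common MPI launchers (assumption: the
# cluster uses Open MPI or an MPICH-style launcher; others may differ).
RANK_VARS = ("OMPI_COMM_WORLD_RANK", "PMI_RANK")
SIZE_VARS = ("OMPI_COMM_WORLD_SIZE", "PMI_SIZE")


def first_set(names, environ):
    """Return the value of the first variable in names that is set, else None."""
    for name in names:
        if name in environ:
            return environ[name]
    return None


def describe_process(environ=None):
    """Collect host name, PID, and (if available) MPI rank/size for this process."""
    environ = os.environ if environ is None else environ
    return {
        "host": socket.gethostname(),
        "pid": os.getpid(),
        "rank": first_set(RANK_VARS, environ),
        "size": first_set(SIZE_VARS, environ),
    }


if __name__ == "__main__":
    info = describe_process()
    # One line per started process; 32 lines over 4 hosts would be expected
    # if the job really runs on all cores.
    print("host=%(host)s pid=%(pid)d rank=%(rank)s size=%(size)s" % info)
```

If only one line is printed, from a single host, I would take that as a sign that the job is started as one serial process and not through the MPI launcher.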
2. Also, the job stopped after 4 days without completing.
Can I restart it? If so, how?
Best regards,
Young