No, this really has nothing to do with ATK; it appears your network is not very reliable. Since you are running this calculation in parallel, it's hard to offer any other solution, because parallel runs need a network license. For serial operation we could arrange a node-locked license if you run all calculations on the same computer, but then you lose the parallel performance advantage. One solution might be to relocate the license server to a computer closer to the computational machines, making it less sensitive to network outages. If you want to run the license server on a different machine, please contact us by email and include your customer ID etc.
Thank you for the detailed replies. In fact, we run the calculation on a single computer, and the license server is started on that same computer, so I don't really think the network is the problem.
We tried again with a good, stable network, and it terminated like this:
rank 0 in job 5 bogon_45266 caused collective abort of all ranks
exit status of rank 0: return code 137
There are no failed connections now. What caused the problem?
I use "mpiexec -n 4"; the CPU has 4 cores (8 logical processors) and the machine has 8 GB of memory. Could the problem be caused by an improper number of processes? Or should I recompile a newer MPI? Earlier topics here seem to suggest that.
P.S. We are still using 10.8, if it matters.
Sorry for my poor English. This is the first time I have dealt with parallel calculations ^_^
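A note on the "return code 137" in the output above: by the usual shell convention, an exit code above 128 means the process was terminated by a signal, with the signal number equal to the code minus 128. The sketch below decodes such a code on a POSIX system; the function name describe_exit_code is made up for illustration and is not part of ATK or MPI. For 137 it gives signal 9 (SIGKILL). On Linux, one common source of SIGKILL is the kernel's out-of-memory killer, so with 4 processes sharing 8 GB it may be worth checking memory use, but that is only a guess and is not confirmed by the log shown here.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit code (hypothetical helper).

    Assumes the POSIX shell convention that codes above 128
    mean "killed by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited normally with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

If memory does turn out to be the cause, running the same job with fewer processes (for example "mpiexec -n 2") while watching memory usage would be one way to check.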