Topic: script stucks on nlsave

Offline wachr

  • Regular QuantumATK user
  • **
  • Posts: 8
  • Country: de
  • Reputation: 2
script stucks on nlsave
« on: August 17, 2011, 12:18 »
Dear QuantumWise team, I have a question about a curious problem. As a first step towards my final simulation, I have written a script to evaluate the number of k-points needed for calculations on a nanotube. When I go through the script step by step in the console (also tested as an interactive, parallelized MVAPICH job on an SMP machine), everything works fine. But when I start the job via the queuing system, ATK seems to hang while saving the .nc file: all 16 processors are fully occupied, but I don't see any progress at all. The resulting .nc file can be accessed without any difficulty and doesn't show any error (it contains all the calculated Hamiltonians and my saved structure). The only difference between the script and the manual calculation is the for-loop around the instructions. Here is the (minimal) script and the corresponding output:
Code
# reading input and so on...

master = processIsMaster()

for k in arange(kstart,kend,kstep): # (1,52,5)
	# Define nanotube
	NT = NanoTube(n,m,element1,element2,bond_length*Units.Ang,vacuum*Units.Ang) # given parameters
	NT = NT.center()
	if master:
		print "k-number: %d" % (k)
	
	# Basis Set and basic calculator config
	basis_set = LDABasis.DoubleZetaPolarized
	numerical_accuracy_parameters = NumericalAccuracyParameters(
		k_point_sampling = (1, 1, int(k)),
		grid_mesh_cutoff = 40.0*Rydberg,
		)
	calculator = LCAOCalculator(
		basis_set=basis_set,
		numerical_accuracy_parameters=numerical_accuracy_parameters,
		)
	NT.setCalculator(calculator)
	
	t1 = time()
	force = Forces(NT)
	t2 = time()
	
	if master:
		print "time for DFT + Forces: %fs" % (t2-t1)
		desc = "Nanotube-%d-%d" % (n,m)
		nlsave(filename + ".nc", NT, desc)
		print "structure \"%s\" successfully written to %s.nc" % (desc,filename)
		# program stucks here! why?
		print >> "%s epsilon %f" % ('%',epsilon)
		f = open(filename + "_forces_NT.txt", 'w')
		print >> f,"%s epsilon %f" % ('%',epsilon)
		force.nlprint(f)
		print >> f,""
		f.close()
output:
Code
[...]
+------------------------------------------------------------------------------+
| Calculation Converged in 13 steps                                            |
|                                                                              |
| Fermi Level  = -0.131172 Ha                                                  |
+------------------------------------------------------------------------------+
+------------------------------------------------------------------------------+
|                                                                              |
| DFT Calculation  [Finished Wed Aug 17 11:50:15 2011]                         |
|                                                                              |
+------------------------------------------------------------------------------+

                            |--------------------------------------------------|
Calculating Eigenvalues    : ==================================================
Calculating Density Matrix : ==================================================

time for DFT + Forces: 92.478966s


Offline wachr

Re: script stucks on nlsave
« Reply #1 on: August 17, 2011, 12:22 »
I just forgot to mention: I am using ATK 10.8.2.

Offline Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5576
  • Country: dk
  • Reputation: 96
    • QuantumATK at Synopsys
Re: script stucks on nlsave
« Reply #2 on: August 17, 2011, 17:00 »
It's not possible to see what "filename" is, from the script, but are you sure this location is writable when you run via PBS?

When you run via PBS, such errors are not always caught, and you have to kill the job manually. After you kill it, check whether there are any error messages in the .o file; that should give you an indication.

Not that it matters for this problem, but why 10.8.2? Looking at your email address, I would think you have access to 11.2. Lots and lots of things change, not least performance, plus some bugs.

Offline wachr

Re: script stucks on nlsave
« Reply #3 on: August 17, 2011, 19:36 »
Dear Mr. Blom,

Thank you very much for the quick answer. I found the error: nlsave must be executed by all processes, not just by the master process. So yes, the PBS script can write to this location; filename is just a "simple" filename, not a path. There was also no error message in the .o file. (There is another error in the line after the error comment in my posted example, but that was not the reason.)

Concerning the version: I just "moved" to 11.2 today.
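
The fix described in this post can be sketched as follows. This is only a minimal sketch, not the poster's final script: `processIsMaster` and `nlsave` are replaced by stand-ins so the example runs outside ATK, it uses Python 3 syntax rather than the Python 2 of the original script, and `save_results` and `forces_text` are hypothetical names introduced purely for illustration.

```python
import os
import tempfile

# Stand-ins so the sketch runs outside ATK. In a real ATK script these
# names are provided by ATK and must not be redefined.
def processIsMaster():
    return True  # pretend this process is the master

saved = []

def nlsave(filename, configuration, object_id):
    # Stand-in: records the call instead of writing an .nc file.
    saved.append((filename, object_id))

def save_results(filename, configuration, desc, forces_text, epsilon):
    # Native ATK call: executed by every process, outside any master check.
    nlsave(filename + ".nc", configuration, desc)

    # Hand-written file output: restricted to the master process.
    if processIsMaster():
        with open(filename + "_forces_NT.txt", "w") as f:
            f.write("%% epsilon %f\n" % epsilon)
            f.write(forces_text + "\n")

# Usage with placeholder values (hypothetical file name and data).
base = os.path.join(tempfile.mkdtemp(), "nanotube")
save_results(base, configuration=None, desc="Nanotube-4-4",
             forces_text="(forces table)", epsilon=0.01)
```

The point of the layout is that the master check wraps only the hand-written file output, while the nlsave call sits at the same level as the rest of the loop body.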

Offline Anders Blom

Re: script stucks on nlsave
« Reply #4 on: August 17, 2011, 20:45 »
This is actually discussed here: http://quantumwise.com/documents/tutorials/latest/ParallelGuide/index.html/chap.appendix.html#sect1.parallel.io, although it's a bit hard to find it there, so it's good we got some attention to it here on the Forum.

So if you don't use the if-statement, it is true that nlsave executes on all nodes, but it is parallel-safe and will only write the file in the master process. That means there is an explicit "wait for slaves to finish" in nlsave, so the script can synchronize across nodes. In fact ALL native ATK statements are parallel-safe, which means you don't have to modify scripts in order to run safely in parallel - just avoid putting any native ATK statements inside "if processIsMaster():" statements.

The only thing you should put into such if-blocks is I/O that you write yourself.
« Last Edit: August 17, 2011, 20:47 by Anders Blom »
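
One way to picture why a master-only call to a parallel-safe statement hangs is a barrier that every process must reach. The sketch below is an analogy only: it uses Python threads and `threading.Barrier` with a timeout in place of ATK's parallel processes, and `parallel_safe_save`, `run` and `N_PROCS` are hypothetical names, not ATK's actual implementation of nlsave.

```python
import threading

N_PROCS = 4  # hypothetical number of parallel processes

def run(master_only):
    """Simulate N_PROCS processes; return the ranks whose save call timed out."""
    barrier = threading.Barrier(N_PROCS)
    stuck = []
    lock = threading.Lock()

    def parallel_safe_save(rank):
        # Stand-in for a parallel-safe call: every process is expected to
        # arrive here, and only the master would actually write the file.
        try:
            barrier.wait(timeout=1.0)
        except threading.BrokenBarrierError:
            with lock:
                stuck.append(rank)

    def process(rank):
        is_master = (rank == 0)
        if master_only:
            if is_master:  # the pattern from the original script
                parallel_safe_save(rank)
        else:
            parallel_safe_save(rank)  # the call is made by all processes

    threads = [threading.Thread(target=process, args=(r,)) for r in range(N_PROCS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(stuck)

print(run(master_only=True))   # [0]: the master waits for processes that never arrive
print(run(master_only=False))  # []: everyone reaches the synchronization point
```

In a real parallel job there is no timeout, so the master would simply keep waiting, which matches the hang reported at the start of this thread.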

Offline wachr

Re: script stucks on nlsave
« Reply #5 on: August 18, 2011, 10:59 »
Thank you very much for the explanation. (It's about what I thought after having found the error.) Yes, hopefully, this topic is useful to others.