Show Posts



Messages - filipr

46
Quote
Total memory required per k-point: 4.56 GB

By default most algorithms parallelize over k-points, so if you use N processes per node this will require at minimum N x 4.56 GB per node. On top of that there are other quantities that are not distributed across processes. Unfortunately the log report is not fully up to date with all the quantities that use large amounts of memory, so it cannot be used to give an accurate estimate of the total memory requirements.

But here are two suggestions to reduce the memory consumption drastically.

Use OpenMP threads for shared-memory parallelization. If you have nodes with 40 cores you can, for example, use 10 MPI processes with 4 OpenMP threads each. The more threads, the lower the memory usage, but depending on the system size there is an upper limit to the parallel efficiency.

Parallelize the diagonalization at each k-point: set "processes_per_kpoint" under Parameters for "Eigenvalue solver" in the Calculator Settings. You want N_local_kpoints x N_processes_per_kpoint x N_openmp_threads to equal the number of cores on each node, so it takes a little planning to get the best load balancing (see the sketch below).
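
As a rough sketch of what this can look like in a script (illustration only: the ParallelParameters class and the parallel_parameters keyword are assumptions based on recent versions and may differ in yours; the GUI route described above is the safe way to set it):

Code
# Hypothetical sketch for a 40-core node: 10 MPI processes x 4 OpenMP threads,
# launched with something like: OMP_NUM_THREADS=4 mpiexec -n 10 atkpython script.py
# With 10 processes and processes_per_kpoint=2, 5 k-points are diagonalized in parallel.
parallel_parameters = ParallelParameters(processes_per_kpoint=2)  # class/keyword assumed, see above

calculator = PlaneWaveCalculator(
    # ... exchange-correlation, cutoffs, k-point sampling, etc. ...
    parallel_parameters=parallel_parameters,
)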

Also be sure to use a recent version of QuantumATK (>= 2020.12, possibly even >= 2021.06 - I can't remember exactly), as we have improved the OpenMP parallelization, memory usage and memory distribution in recent versions, and also introduced MPI shared memory for certain quantities.

47
In theory the electron wave function is always a two-component spinor, and it is necessary to represent it as such in the noncollinear and spin-orbit cases (note that spin-orbit has nothing to do with the wave function representation; it is just noncollinear spin plus an extra term in the Hamiltonian). In the collinear and unpolarized cases we use the fact that the Hamiltonian has no spin off-diagonal components to use an effective non-spinor representation of the wave functions: in the polarized case we have two states Ψ↑ = (ψ↑, 0) and Ψ↓ = (0, ψ↓), which are solutions to the up Hamiltonian H↑ and the down Hamiltonian H↓, respectively, while in the unpolarized case the up and down Hamiltonians are identical, so we have Ψ↑ = (ψ, 0) and Ψ↓ = (0, ψ), where ψ is the same and Ψ↑ and Ψ↓ have the same eigenvalue - they are spin degenerate. In these cases we use only the scalar ψ part for computational reasons: it reduces the memory and CPU needed (no need to calculate a bunch of zeros).

There is in general no meaningful (x, y, z) projection for wave functions. You can, however, calculate the expectation value of the spin operator S = (ħ/2)σ, where σ = (σ_x, σ_y, σ_z) are the Pauli matrices - maybe that is what you mean? Then you get e.g.:

<Ψ|σ_x|Ψ> = (ψ↑*, ψ↓*)([0, 1], [1, 0])(ψ↑, ψ↓) = ψ↑*ψ↓ + ψ↓*ψ↑

I don't think we have ready-made functionality to calculate this, but you can do it with the Python API (see the sketch below).
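
As a rough sketch of how one could do it (my own illustration, not a built-in feature: the file name and BlochState arguments are placeholders, and it relies on the spinor indexing and grid shape described in the BlochState answer below):

Code
# Hypothetical sketch: estimate <sigma_x> for a Bloch state by summing
# conj(up)*down + conj(down)*up over the real-space grid. Any units on the
# spinor values cancel in the ratio, so no volume element is needed.
import numpy

configuration = nlread('my_groundstate_calc.hdf5', BulkConfiguration)[-1]
bloch_state = BlochState(configuration, ...)  # fill in band index, k-point, etc.

ni, nj, nk = bloch_state.shape
numerator = 0.0
norm = 0.0
for i in range(ni):
    for j in range(nj):
        for k in range(nk):
            up, down = bloch_state[i, j, k]
            numerator += 2.0 * numpy.real(numpy.conj(up) * down)  # conj(up)*down + conj(down)*up
            norm += abs(up)**2 + abs(down)**2

sigma_x_expectation = numerator / norm
print('<sigma_x> =', sigma_x_expectation)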

Again, the use of the word "projection" for spin-density quantities like the electron density is maybe a bit misleading in QuantumATK, but it is historical: it has been called that for 15 years... When calculating e.g. the density we actually calculate the scalar density n(r) and the magnetization m(r) and combine them into a 2x2 matrix: the spin density. The x-projection is then just the x-component of the magnetization: m_x(r).

48
Yes, BlochState does support non-collinear spin. However, the documentation is a bit confusing on this point: a BlochState is a kind of GridValues object, and GridValues was mostly designed for representing spin densities, which are a combination of a scalar density n(r) and a vector magnetization m(r). A wave function cannot be split into two such fields: it is a two-component complex spinor field. You can directly access the complex spinor values by using the index operator:

Code
bloch_state = BlochState(...)

spinor_value = bloch_state[i, j, k]  # spinor_value is a PhysicalQuantity array of length 2: [up, down]

You can get the dimensions of the grid with:
Code
dimensions = bloch_state.shape

Also, if you are going to do a lot of Bloch state analysis, I suggest first doing the DFT SCF ground state calculation and saving it to a file, then doing the analysis in a separate script like so:

Code
configuration = nlread('my_groundstate_calc.hdf5', BulkConfiguration)[-1]

bloch_state = BlochState(configuration, ...)

Then, if there are errors or you need to do multiple Bloch state calculations, you don't need to redo the expensive SCF calculation.

49
If you read the documentation for MolecularDynamics (https://docs.quantumatk.com/manual/Types/MolecularDynamics/MolecularDynamics.html#moleculardynamics-f) you'll see that you can pass a previously calculated trajectory as input to the 'configuration' parameter, in which case it will continue the calculation. So you can write a script like this:

Code
old_trajectory = nlread('my_trajectory_file.hdf5', MDTrajectory)[-1]
new_trajectory = MolecularDynamics(old_trajectory, <other arguments>)

and it should continue the MD calculation.

50
Alright, if you open the script in a text editor you can clearly see the lattice vectors are not completely perpendicular:

Code
vector_a = [15.0, -0.00900547, 0.00431798]*Angstrom
vector_b = [-0.0186507, 25.0, -0.0119156]*Angstrom
vector_c = [0.0222371, -0.0296296, 65.0]*Angstrom

I don't know how you generated the structure, but if you built it from a structure obtained after a relaxation, got it from someone else, or made it "by hand", it is easy to introduce tiny numerical errors. You can either start from a "pure", mathematically defined structure, or use the lattice/unit cell tools in the Builder to ensure that the lattice vectors are exactly perpendicular.

For now we can fix this by changing the lattice vectors in the script to:

Code
vector_a = [15.0, 0.0, 0.0]*Angstrom
vector_b = [0.0, 25.0, 0.0]*Angstrom
vector_c = [0.0, 0.0, 65.0]*Angstrom

51
General Questions and Answers / Re: Running error
« on: May 5, 2022, 15:54 »
Can you share the calculation script you are trying to run?

52
Before 2020.09 you could only do BaderCharges analysis with an all-electron density calculated with an external program like VASP or FHI-aims. From 2020.09 on, QuantumATK can itself calculate all-electron densities with the PW-PAW calculator.

53
General Questions and Answers / Re: Barder charge analysis
« on: April 27, 2022, 08:52 »
The online documentation always refers only to the newest version. KPointDensity and AllElectronDensity were both first introduced in the R-2020.09 version, so you need to upgrade to at least that version.

55
General Questions and Answers / Re: Curie temperature
« on: April 22, 2022, 15:49 »
The 'curieTemperature()' function was first introduced in version 2021.06, so I highly recommend upgrading.

You can also try to implement it yourself given the formula as documented here.

56
It's hard to tell how much memory the calculation will require, as it depends on the number of plane waves, the PAW data sets (and thus the elements), and the real-space grid size. The log output should show some information about these, and from it you can get a rough estimate of the order of magnitude of the memory.

Even if you think the maximum memory consumption is less than the available 48 GB, the calculation can still fail due to memory duplication caused by parallelization. To duplicate as little memory as possible, run with only a few MPI processes and more OpenMP threads. For 40 cores I suggest running with 4 MPI processes and 10 OpenMP threads per process - and set
Code
processes_per_kpoint=4
(or equal to the number of MPI processes), which can also be set in the Calculator Settings window in NanoLab. If that still doesn't work, try 2 MPI processes + 20 OpenMP threads, or 1 MPI process + 40 threads. If none of that works you have to reduce the numerical accuracy, e.g. by reducing the plane-wave wave function cutoff.

57
I don't think this is currently possible for a user to do.

The reason is that both during the SCF loop and when using HartreePotential to calculate the Hartree potential, only the full electron density is used. This is both to ensure that no unnecessary computations are done (solving the Poisson equation is not always a cheap operation) and to ensure that the correct boundary conditions are satisfied. In fact, the Hartree potential is by definition a scalar quantity - it is an electrostatic potential.

You could in principle calculate the electrostatic potential from the spin-up and spin-down parts of the electron density, but that would require you to extract those two densities and solve the Poisson equation separately for each one, and I don't think that is possible for a user to do.

58
Hi Kevin,

In general you can't know in advance how long a calculation will take. First of all, it depends on what kind of calculation you are doing: Molecular Dynamics with Force Fields? A transmission spectrum calculation using a tight-binding calculator? Or a band structure calculation using DFT? Many calculations involve multiple steps, e.g. a DFT band structure first requires you to determine the self-consistent ground state density, after which you can calculate the band structure.

In some cases you can estimate the order of magnitude, but let's consider an example to give you an idea of the complexity involved:

Consider a DFT calculation. It is an iterative approach: given an effective potential you calculate the density, then update the potential, calculate a new density, and so on until the change in the density between subsequent steps is smaller than some threshold. How many steps will this take? There is no way of knowing, as it depends on the system, pseudopotentials, numerical settings etc., but it is typically between 10 and 100 steps. In each step you calculate the Hamiltonian and find its eigenvalues. The Hamiltonian has several contributions: for instance, calculating the XC potential scales with the number of grid points, i.e. with the volume, while solving for the electrostatic potential in general scales as the number of grid points squared. Finding the eigenvalues and eigenvectors of the Hamiltonian scales cubically in the basis set size, which itself is proportional to the volume, or equivalently the number of atoms. Due to the prefactors of the different terms, different parts will limit the calculation for different systems: a small to medium system with a high grid point density may be limited by grid terms like the XC and electrostatic potentials, whereas for large systems the cubic scaling with basis set size will surely dominate. Estimating the time for each contribution to the total calculation is extremely hard and depends on settings, parallelization and the computer specs.

The best you can do is to make different-sized versions of the system you want to study, e.g. a smallish version, a medium version and a large version, run them, and then extrapolate the timings assuming the N³ scaling behavior for large systems (see the sketch below). Note that this assumes each version uses the same number of SCF steps... You may want to consider only the time per SCF step.
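
As an illustration of that extrapolation (plain numpy, nothing QuantumATK-specific; the timings below are made-up numbers):

Code
# Hypothetical sketch: fit t(N) = a + b*N**3 to measured SCF-step timings and
# extrapolate to a larger system. The numbers are invented for illustration.
import numpy

n_atoms   = numpy.array([32.0, 64.0, 128.0])   # system sizes that were actually run
t_per_scf = numpy.array([12.0, 55.0, 390.0])   # measured seconds per SCF step

# Linear least-squares fit in the variable N^3: t = a + b*N^3
A = numpy.vstack([numpy.ones_like(n_atoms), n_atoms**3]).T
a, b = numpy.linalg.lstsq(A, t_per_scf, rcond=None)[0]

n_target = 512.0
t_estimate = a + b * n_target**3
print('Estimated time per SCF step for %d atoms: %.0f s' % (n_target, t_estimate))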

Now, this was a single DFT calculation. What if you also want to do a geometry optimization/relaxation? Such a calculation is also an iterative algorithm, which may take anywhere from one to arbitrarily many steps, each of which requires a DFT calculation.

All of this timing analysis has to be repeated for every type of calculation, set of parameters, parallelization setup, and so on.

So in practice you don't estimate the time up front. You can make a couple of "sounding" calculations, i.e. smaller and faster calculations that consider a smaller part of the full system you want to describe, to get an idea of convergence, precision and timing. From those you can guesstimate or extrapolate the approximate time scale needed for the full-scale model. Once you have done a couple (or many!) calculations you start to get a feel for the time it takes to do similar calculations.

59
Exit Code 9 from an MPI program means that the program was killed by the host system.

It could be that the job scheduling system on your cluster killed the job because it used too much memory or was taking longer than the requested wall time allocation. See if you got a mail from the queuing system, or ask your system administrator.

60
Actually, the TotalEnergy analysis object in QuantumATK by default gives the total free energy at the electronic temperature/broadening specified when doing the ground state calculation. The extrapolated total energy at T = 0 K has to be obtained either from the text output in the log or by using:
Code
energy_at_zero_kelvin = total_energy.alternativeEnergies()['Zero-Broadening-Energy']
If you want the free energy at a different broadening you have to repeat the calculation, changing the broadening under the "Numerical Accuracy" settings.

If you for some reason want to extrapolate to a broadening different from the one you used in the calculation, you can use the fact that F(σ) = E(0) + ½γσ² + O(σ⁴), i.e. it is approximately a parabola. You have two points on this parabola: E(0) and F(σ) at the σ used in the calculation - from those you can extrapolate to any other σ in a close neighborhood (see the sketch below).
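
For example, a minimal sketch with made-up numbers ('Zero-Broadening-Energy' comes from alternativeEnergies() as shown above; everything else here is illustrative):

Code
# Hypothetical sketch: extrapolate the free energy to a nearby broadening sigma1
# from the two known points E(0) and F(sigma0), using F(sigma) ~ E(0) + 0.5*gamma*sigma**2.
E0       = -1234.567   # eV, 'Zero-Broadening-Energy' from alternativeEnergies()
F_sigma0 = -1234.601   # eV, free energy at the broadening used in the calculation
sigma0   = 0.1         # eV, broadening used in the calculation
sigma1   = 0.05        # eV, broadening you want the free energy at

gamma = 2.0 * (F_sigma0 - E0) / sigma0**2
F_sigma1 = E0 + 0.5 * gamma * sigma1**2
print('Estimated free energy at sigma = %.3f eV: %.3f eV' % (sigma1, F_sigma1))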

See also:
https://docs.quantumatk.com/manual/Types/TotalEnergy/TotalEnergy.html
and
https://docs.quantumatk.com/manual/technicalnotes/occupation_methods/occupation_methods.html
