I have tested a PDOS calculation using different numbers of cores.
It seems that assigning more cores does not reduce the run time.
I have tested 2, 4 and 40 cores.
The PDOS calculation starts from a converged potential calculation for a device.
The device is periodic in the width direction (the direction perpendicular to the transport direction).
So I use several transverse modes in the calculation.
The number of transverse modes is 29.
I thought the PDOS calculation might be parallelized over the transverse modes, but it seems not, because I don't see much speedup with more cores.
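To make my expectation concrete, here is a rough Python sketch of the scaling I would expect if the work were split evenly over the transverse modes, one mode per core at a time. This is only my assumption about the scheme, not how the code actually does it, and the function names are made up for illustration.

```python
import math

def ideal_rounds(n_modes: int, n_cores: int) -> int:
    """Number of sequential rounds if each core handles one mode per round.

    Hypothetical model: assumes every mode costs the same and that
    modes are the only level of parallelism.
    """
    return math.ceil(n_modes / n_cores)

def ideal_speedup(n_modes: int, n_cores: int) -> float:
    """Speedup relative to a single core under the same assumption."""
    return n_modes / ideal_rounds(n_modes, n_cores)

N_MODES = 29  # number of transverse modes in my calculation

for cores in (2, 4, 40):
    rounds = ideal_rounds(N_MODES, cores)
    speedup = ideal_speedup(N_MODES, cores)
    print(f"{cores:>2} cores: {rounds:>2} rounds, ideal speedup {speedup:.2f}x")
```

Under this assumption the 29 modes would take 15, 8, and 1 rounds on 2, 4, and 40 cores, so 40 cores should be much faster than 2 (with 11 of the 40 cores idle). That is not what I observe.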
Can anyone tell me how the PDOS calculation is parallelized?
If I understood this, I think I could assign cores more wisely.