Without seeing the exact current setup, it's hard to advise... Perhaps you can post the script?
Some more questions:
(1) Does the size of the system along the X and Y axes affect the computational time? If I increase them, does the calculation require much more CPU time? The CNT systems I have considered, i.e., both the pristine CNT and CNT+molecule, are already too big: the calculation of the zero-bias transmittance alone required more than three days.
A little bit, plus it increases the memory usage.
Increasing the electrode length will not add too much time. And, after all, if the results are wrong, one doesn't have much of a choice...
(2) That makes me hesitate to use larger supercell sizes along the X and Y axes, or more atoms in the scattering region, to guarantee perfect screening. Can you suggest any possible solution for this kind of timing problem? How many Xeon CPU cores would you recommend for the calculation of the zero-bias transmittance?
This is a bigger question... If you have a parallel license, but only a single computer, it's not always the best idea to use MPI parallelization, especially if that single machine only has one socket. Generally speaking, the top performance of ATK is achieved by MPI parallelization over several nodes (physically separate machines with individual RAM), while at the same time letting the code thread over the cores within each node.
The problem if you run multiple MPI processes on a single node is that the processes will fight both for RAM and CPU. So unless the system is small, it might actually end up running slower than in serial...
Threading is controlled by the environment variables MKL_NUM_THREADS (set to the number of cores) and MKL_DYNAMIC (set to false). On Linux you will probably need to set these variables by hand, while on Windows the correct number of cores is typically automatically detected.
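As a minimal sketch of the settings described above, the following Python helper builds an environment with the two variables set. The helper name `mkl_thread_env`, its parameters, and the idea of dividing the cores among several MPI processes on one node are illustrative assumptions, not part of ATK; only the variable names MKL_NUM_THREADS and MKL_DYNAMIC come from the text above.

```python
import os


def mkl_thread_env(cores_per_node, mpi_procs_per_node=1, base_env=None):
    """Return a copy of the environment with the MKL threading variables set.

    Hypothetical helper for illustration; not an ATK function.
    """
    if cores_per_node < 1 or mpi_procs_per_node < 1:
        raise ValueError("core and process counts must be positive")
    # Assumption: if several MPI processes share one node, split the cores
    # among them so that threads and processes do not compete for the CPU.
    threads = max(1, cores_per_node // mpi_procs_per_node)
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_NUM_THREADS"] = str(threads)  # number of cores to thread over
    env["MKL_DYNAMIC"] = "FALSE"           # keep the thread count fixed
    return env


if __name__ == "__main__":
    # One process per machine, threading over all detected cores.
    env = mkl_thread_env(cores_per_node=os.cpu_count() or 1)
    print("MKL_NUM_THREADS =", env["MKL_NUM_THREADS"])
    print("MKL_DYNAMIC =", env["MKL_DYNAMIC"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the calculation; the exact launch command depends on your installation and is not shown here. On Linux, the same effect is obtained by exporting the two variables in the shell before starting the job.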
Also, as a general rule, Linux is faster than Windows and 64-bit is much faster than 32-bit.
(3) Since dips at E=0 appear for both the pristine (5,5) CNT [which uses 4 primitive cells for the scattering region] and CNT+molecule [which uses a much longer tube for the scattering region], I doubt that the problem originates from a too-short Lz of the scattering region. This is puzzling to me, unless it is due to the supercell-supercell interaction along the X and Y axes.
Agreed, it's not necessarily the first thing to try. First increase the electrode length (unless the XY cell is way too small, but again, since I don't see the setup I cannot know), then the XY cell size (after checking with a simple band structure, using the same cell), and finally Lz of the central region.