Dear Yasheng,
The report from DiagonalizationSolver indicates that you are running on 24 cpus and that k-point parallelization is switched off, so each of the 50 k-points is solved in turn by all 24 cpus.
You can turn on k-point parallelization by adding the flag "processes_per_kpoint=N_processes_per_kpoint" to the DiagonalizationSolver entry in your Python input file. For optimal performance, as the warning indicates, the number of cpus (N_cpus) should be a multiple of the number of processes per k-point (N_processes_per_kpoint), and the number of k-points (N_kpoints) should be a multiple of the resulting number of process groups (N_cpus / N_processes_per_kpoint), so that no group is left idle.
Example 1:
N_cpus = 12
N_kpoints = 6
N_processes_per_kpoint = 2
In this case, each of the 6 k-points will be solved by 2 processes, and all k-points will be solved simultaneously.
Example 2:
N_cpus = 12
N_kpoints = 12
N_processes_per_kpoint = 2
In this case, the 12 cpus form 6 groups of 2 processes each, so the first half of the k-points (6 k-points) will be solved simultaneously, each by 2 processes, and then the second half (6 k-points) will be solved in the same way.
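If it helps, the arithmetic behind the two examples can be checked with a few lines of plain Python. This is only a rough sketch: it assumes the cpus are split into groups of N_processes_per_kpoint and that each group solves one k-point per round. The function name kpoint_schedule is made up for illustration; it is not part of QuantumATK, and the real scheduler may distribute the work differently.

```python
import math


def kpoint_schedule(n_cpus, n_kpoints, n_processes_per_kpoint):
    """Estimate how k-points are spread over groups of processes.

    Hypothetical helper, not a QuantumATK function. Assumes one k-point
    per group per round. Returns (groups, rounds, idle_slots).
    """
    if n_cpus % n_processes_per_kpoint != 0:
        raise ValueError("N_cpus must be a multiple of N_processes_per_kpoint")
    # Number of groups that can each work on one k-point at the same time.
    groups = n_cpus // n_processes_per_kpoint
    # Number of passes needed to get through all k-points.
    rounds = math.ceil(n_kpoints / groups)
    # Group slots left without a k-point in the last round.
    idle_slots = rounds * groups - n_kpoints
    return groups, rounds, idle_slots


print(kpoint_schedule(12, 6, 2))    # Example 1: (6, 1, 0)
print(kpoint_schedule(12, 12, 2))   # Example 2: (6, 2, 0)
print(kpoint_schedule(24, 50, 24))  # your current run: (1, 50, 0)
print(kpoint_schedule(24, 50, 12))  # one possible alternative: (2, 25, 0)
```

A nonzero third value means some groups would sit idle in the last round. With 24 cpus and 50 k-points, for instance, 4 processes per k-point gives 6 groups, 9 rounds and 4 idle slots under this simple model.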
Regards,
Daniele.