As you say, in ATK 2008 the device region is L+C+R and the bias is applied over the device region, which means there is a voltage drop over the L and R electrodes. Yet 2008 can treat L and R as bulk, so that they do not participate in the self-consistent NEGF iteration. My question is: since the L and R electrodes are treated as bulk, why is there a voltage drop? I would expect the bias to be applied over the C region only.
As far as I know, in ATK 2008 one can choose to let L and R participate in the self-consistent NEGF iteration, if I am not mistaken (is this different from ATK 11.2?). So if this improves convergence, one can choose that option oneself. In ATK 11.2, as you say, this option is forced.
The bias voltage drop is indeed applied over L+C+R (in 2008.10 notation).
There is no option in 2008.10 to let L/R be fully self-consistent. The various constraints (Off, DensityMatrix, and RealSpaceDensity) all contain some approximation which makes L/R not fully bulk-like. In all cases the effective potential in L and R is self-consistent, and that is why you can have a voltage drop over L and R. On the other hand, since the Hamiltonian and/or density (depending on the constraint used) is bulk-like, this means you have a mismatch in this region between the solution of the Poisson equation and that of the Schrödinger equation. This was the method used from the beginning in TranSIESTA (also in the current "new" version of it, I believe) and in older versions of ATK, and it can sometimes cause serious convergence problems.
In "new ATK" we have eliminated this mismatch with a more advanced implementation, which treats L+C+R fully self-consistently. Technically, we did this by making the new C equal to the old L+C+R; however, the computational burden is exactly the same for the "new C" and the "old device", as they contain equally many atoms. We also believe that this, together with a more complete treatment of matrix elements near the boundaries of the system, has contributed to the much better convergence in ATK 11.2 (already in 10.8, actually).
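To see why the burden is unchanged, a quick sanity check on the atom (layer) bookkeeping may help. This is a plain-Python illustration, not ATK API code, and the layer counts are hypothetical examples:

```python
# Hypothetical layer counts per region (not ATK API, just bookkeeping).
L_layers, C_layers, R_layers = 2, 6, 2

# 2008.10: the device region spans L + C + R, with L and R constrained
# toward bulk values during the NEGF iteration.
old_device = L_layers + C_layers + R_layers

# 11.2: the new central region is defined as the old L + C + R, and all
# of it is treated fully self-consistently.
new_C = L_layers + C_layers + R_layers

# Same number of atoms either way -> same computational burden.
assert new_C == old_device
print(old_device, new_C)  # -> 10 10
```

The point is simply that redefining which atoms belong to "C" does not change how many atoms enter the calculation.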
I think that if there are enough surface layers in the C region in ATK 2008, L and R can safely be set to bulk values and need not take part in the self-consistent NEGF process. So in ATK 11.2, since the surface layers are still needed, won't making the copies of L and R inside the central region (as defined in 11.2) self-consistent enlarge the computational burden?
Indeed, in the limit of infinitely many surface layers both methods are equivalent. But the real difference is that in the 2008.10 method, the fewer surface layers the stronger the mismatch between the Hamiltonian and the potential becomes, while in new ATK all regions are always fully self-consistent. Thus it's still an approximation in new ATK to have too few layers, but you always have L and R inside C to provide some screening.
But since L and R provide screening in 11.2, you can in fact use fewer surface layers in 11.2 than in 2008.10, so the computational burden can actually be smaller. Imagine you decide you need 8 surface layers in a 2008.10 calculation, and the electrode has 2 layers; your calculation will then have 10 layers (on each side). In 11.2, we can let 2 of the screening layers be part of the electrode copy and just add 6 "free" surface layers, so we only need 8 layers in total in the central region.
This arithmetic only works, as I think you have understood, if the number of surface layers is quite large. If you have, say, fcc (111) with 3 layers as the electrode and 2 screening layers in 2008.10, I would recommend also having 2 free screening layers plus the 3 layers from the electrode copy in 11.2. The good news is, however, that for the same computational effort you will have better screening in 11.2.
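The layer arithmetic in the two examples above can be written out explicitly. Again this is plain illustrative Python, not ATK input; the helper function and all counts are only a restatement of the examples in the text:

```python
def total_central_layers(free_surface_layers, electrode_copy_layers):
    """Layers per side in the 11.2 central region: the free surface layers
    plus the electrode copy that is included inside C."""
    return free_surface_layers + electrode_copy_layers

# Example 1: 8 surface layers needed in 2008.10, with a 2-layer electrode.
layers_2008 = 8 + 2                        # 10 layers per side in 2008.10
layers_112 = total_central_layers(6, 2)    # 6 free + 2 in the electrode copy
print(layers_2008, layers_112)             # -> 10 8

# Example 2: fcc(111), 3-layer electrode, only 2 screening layers.
# The electrode copy is not enough screening on its own, so keep 2 free layers:
layers_fcc_2008 = 2 + 3                        # 5 layers per side in 2008.10
layers_fcc_112 = total_central_layers(2, 3)    # 5 per side, but better screening
print(layers_fcc_2008, layers_fcc_112)         # -> 5 5
```

So the saving appears only when the required number of surface layers is large compared to the electrode thickness; for thin screening regions the layer count is the same, and the gain in 11.2 is the improved screening.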