This is a very important point!
There is actually no relation at all between the k-point sampling for the SCF loop and that for the transmission. (Except in the case of a 1D or 2D system, of course, where you have 1x1 or 1xN by pure symmetry.)
In the self-consistent loop, the k-point sampling is used to replace the Brillouin-zone integral that gives the real-space density with a discrete sum. The integrand often doesn't vary very rapidly with k, so anything from 3x3 up to 9x9 for "difficult" systems is usually sufficient (although this needs to be checked from case to case).
For the transmission, on the other hand, the original integral that you discretize is over self-energies and Green's functions, and these can have a very complicated k-point dependence, reflecting the nature of the tunneling mechanism. It may be that you have simple Gamma-point dominated "standard" direct tunneling through a quantum barrier, in which case perhaps 3x3 or 5x5 is enough. But it may also be that you see resonant tunneling, as in the case of the FeMgO magnetic tunnel junction, where the main contributions to the total tunneling probability come from isolated and extremely narrow peaks far out in the Brillouin zone. As it turns out, you need 200x200 or 400x400 k-points in such a system to get an accurate current. In other cases you have a mix of these behaviors, and you will find convergence at 21x21 or 50x50 k-points.
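To make the difference concrete, here is a minimal toy sketch (not an actual transport calculation). It averages two made-up k-resolved transmission functions over a uniform n x n grid: a broad bump centered at Gamma, and a single very narrow peak far from Gamma. The function names, the peak position and both widths are assumptions chosen purely for illustration.

```python
import math

SIGMA_BROAD = 0.15    # assumed width of the Gamma-centered bump (fractional k units)
SIGMA_NARROW = 0.002  # assumed width of the resonance peak
PEAK = (0.31, 0.17)   # assumed peak position, far from Gamma


def broad(kx, ky):
    """Toy Gamma-dominated direct tunneling: a wide Gaussian bump at k = (0, 0)."""
    return math.exp(-(kx * kx + ky * ky) / (2 * SIGMA_BROAD ** 2))


def resonant(kx, ky):
    """Toy resonant tunneling: one very narrow Gaussian peak away from Gamma."""
    d2 = (kx - PEAK[0]) ** 2 + (ky - PEAK[1]) ** 2
    return math.exp(-d2 / (2 * SIGMA_NARROW ** 2))


def k_average(t_of_k, n):
    """Average t_of_k over a uniform n x n grid in fractional coordinates [-0.5, 0.5)."""
    total = 0.0
    for i in range(n):
        kx = (i + 0.5) / n - 0.5
        for j in range(n):
            ky = (j + 0.5) / n - 0.5
            total += t_of_k(kx, ky)
    return total / (n * n)


for n in (3, 5, 9, 21, 51, 101, 201, 401):
    print(f"{n:>3} x {n:<3}  broad = {k_average(broad, n):.6f}  "
          f"resonant = {k_average(resonant, n):.3e}")
```

With these assumed widths, the broad case settles at a small grid, while coarse grids simply miss the narrow peak because no grid point lands on it. The grid spacing has to become comparable to the peak width before the average is reliable.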
The point is that there is no immediate way to say beforehand how many points are sufficient until you have analyzed the tunneling mechanism. So for each system it is crucial to do a convergence study, and to keep increasing the number of k-points until the result (the current, or the transmission spectrum itself) no longer changes significantly.
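Such a convergence study can be sketched as a small driver loop. This is only a sketch under stated assumptions: `converge` and `toy_transmission` are hypothetical names, and `toy_transmission` is a stand-in that you would replace with a call to your own transmission or current calculation on an n x n grid.

```python
import math


def converge(compute, grid_sizes, rel_tol=0.01, n_agree=2):
    """Increase the grid until n_agree consecutive steps change the result
    by less than rel_tol (relative). Returns (n, value, history); n is None
    if the criterion was never met."""
    history = []
    streak = 0
    for n in grid_sizes:
        value = compute(n)
        if history:
            prev = history[-1][1]
            scale = max(abs(value), abs(prev))
            if scale == 0.0 or abs(value - prev) <= rel_tol * scale:
                streak += 1
            else:
                streak = 0
        history.append((n, value))
        if streak >= n_agree:
            return n, value, history
    last = history[-1][1] if history else None
    return None, last, history


def toy_transmission(n):
    """Hypothetical stand-in for a real calculation: k-average of a broad
    background plus a narrow off-Gamma peak on a uniform n x n grid."""
    total = 0.0
    for i in range(n):
        kx = (i + 0.5) / n - 0.5
        for j in range(n):
            ky = (j + 0.5) / n - 0.5
            background = 0.01 * math.exp(-(kx * kx + ky * ky) / (2 * 0.15 ** 2))
            d2 = (kx - 0.31) ** 2 + (ky - 0.17) ** 2
            peak = math.exp(-d2 / (2 * 0.01 ** 2))
            total += background + peak
    return total / (n * n)


n_conv, value, history = converge(
    toy_transmission, [3, 5, 9, 15, 21, 31, 51, 75, 101, 151, 201])
for n, v in history:
    print(f"{n:>3} x {n:<3}  T = {v:.6e}")
print("converged at:", n_conv)
```

Asking for agreement over more than one consecutive step (`n_agree`) is a deliberate choice. Two coarse grids can agree with each other by accident if both miss a narrow peak, so it is worth looking at the whole history rather than trusting the first match.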
The tricky part here is that what is sufficient at zero bias might not be enough at finite bias, so it's a good idea to also check the convergence at, say, 0.5 V bias.
The good thing about the calculation of the transmission coefficients is that it scales pretty much linearly with the number of parallel MPI nodes, even up to hundreds of nodes. So you can easily gain a factor of 10 or 20 if you have a parallel cluster to run the calculations on. In fact, even if you have a single machine (at least if it has 2 or more sockets), you may want to parallelize precisely the transmission part, since it doesn't use much memory.
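A rough picture of why this kind of calculation parallelizes well, assuming (as is typical) that each k-point can be evaluated independently: split the k-point list across ranks, let each rank sum its own share, and add the partial sums at the end. The sketch below only simulates the ranks in a plain loop. In a real run each rank would be a separate MPI process and the final sum would be a reduce operation. All names here are hypothetical, and `transmission_at_k` is a toy stand-in for the expensive per-k-point work.

```python
import math


def transmission_at_k(kx, ky):
    """Toy stand-in for the expensive per-k-point transmission calculation."""
    d2 = (kx - 0.31) ** 2 + (ky - 0.17) ** 2
    return math.exp(-d2 / (2 * 0.05 ** 2))


def k_grid(n):
    """Uniform n x n grid in fractional coordinates [-0.5, 0.5)."""
    return [((i + 0.5) / n - 0.5, (j + 0.5) / n - 0.5)
            for i in range(n) for j in range(n)]


def work_per_rank(n_kpoints, n_ranks):
    """Number of k-points each rank gets under round-robin distribution."""
    return [len(range(r, n_kpoints, n_ranks)) for r in range(n_ranks)]


def partial_sum(kpoints, rank, n_ranks):
    """Work done by one rank: every n_ranks-th k-point, starting at index rank."""
    return sum(transmission_at_k(kx, ky) for kx, ky in kpoints[rank::n_ranks])


def distributed_average(n, n_ranks):
    """Simulate the ranks one after another and combine their partial sums."""
    kpoints = k_grid(n)
    partials = [partial_sum(kpoints, r, n_ranks) for r in range(n_ranks)]
    return sum(partials) / len(kpoints)


serial = distributed_average(21, 1)
split = distributed_average(21, 8)
print("serial:", serial)
print("8 ranks:", split)
print("k-points per rank:", work_per_rank(21 * 21, 8))
```

Since no rank needs data from another until the final sum, the work divides almost evenly, which fits the near-linear scaling described above.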