This has less to do with the system being infinite than with it being periodic, i.e. with the length of the period. To some extent it also comes down to numerical practicalities.
In principle you should have infinitely many k-points in any direction that is not confined (for a nanotube or wire along Z you don't need more than 1 point in X/Y). That is obviously not possible, so we have to use a finite number. Now, as a first consideration, the k-point sampling in X/Y is used not only in the electrode calculation but also in the device calculation. It thus becomes very costly, both in time and memory, to have many k-points, and so you want to use as few as possible. One needs to strike a balance between accuracy and efficiency. Fortunately, in most cases we use, say, 3x3 repetitions of the unit cell in X/Y, and it turns out that 3x3 or 5x5 k-points then provide an excellent description of the electron density. In some materials, like Fe, you may need 9x9, and for special cases like 2D graphene perhaps even a bit more (and a multiple of 3, at that).
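One common rationale for the "multiple of 3" remark about graphene is that its Dirac point K sits at fractional reciprocal coordinates that are multiples of 1/3, so only certain grids sample it exactly. The sketch below checks this for a plain Gamma-centered N x N grid. Both the grid construction and the choice K = (1/3, 1/3) are assumptions made for illustration (the exact coordinates depend on the lattice-vector convention, and the grid your code actually generates may be shifted), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def gamma_centered_grid(n):
    """Fractional coordinates of an n x n Gamma-centered k-point grid (assumed layout)."""
    axis = [Fraction(i, n) for i in range(n)]
    return set(product(axis, axis))

# Assumed fractional position of the graphene K point; convention-dependent.
K_POINT = (Fraction(1, 3), Fraction(1, 3))

def samples_k_point(n):
    """True if the n x n grid contains the K point exactly."""
    return K_POINT in gamma_centered_grid(n)

# Grid sizes from 1 to 12 that hit the K point exactly.
hits = [n for n in range(1, 13) if samples_k_point(n)]
print(hits)
```

Exact fractions are used instead of floats so that the membership test is not affected by rounding.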
Now, in the Z direction we can afford more k-points because they only enter into the electrode calculation, which contains rather few atoms and thus isn't that time-consuming or RAM-hungry. That's why, to be on the safe side, we suggest 100 or so. In many cases 25 is perhaps perfectly sufficient, but going up to 100 has only a very small influence on the total calculation time anyway.
Finally, the electrodes are indeed semi-infinite. The point is that in the device calculation we assume no periodicity in the Z direction. We can nevertheless treat the semi-infinite system exactly, under certain assumptions, namely that the electrodes only have non-zero interactions with their nearest-neighbor cells (this is why a single period of the electrode in Z is not always enough, such as in the case of a graphene ribbon). In that treatment you are really solving the infinite (ok, semi-infinite) or aperiodic problem, and thus the Fermi level that comes out of that calculation is based, essentially, on an infinite k-point sampling.
One of the things that can cause convergence problems is a mismatch between the Fermi levels of the electrodes, which are determined from the electrode calculations, and the Fermi level of the device calculation, which is used to populate the states in the central region in the non-equilibrium condition (and even under zero bias). Therefore, the more k-points in the Z direction the better, to avoid a mismatch. And we can afford it, as described above, so there's no point in skimping there.
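To illustrate how a finite k-point sampling shifts the Fermi level away from the value of the infinite system, here is a minimal toy sketch. It is not the actual electrode calculation. It assumes a 1D nearest-neighbor tight-binding chain with dispersion E(k) = -2t cos(k), a fixed band filling, and a small Fermi-Dirac smearing. All parameter values and the function name are hypothetical, and a very dense grid stands in for the infinite-sampling limit.

```python
import math

def fermi_level(n_k, filling=0.3, t=1.0, kT=0.05):
    """Chemical potential of a toy 1D tight-binding chain sampled with n_k k-points."""
    # Uniform k grid over the Brillouin zone [-pi, pi), shifted off the zone edge.
    ks = [-math.pi + 2.0 * math.pi * (i + 0.5) / n_k for i in range(n_k)]
    energies = [-2.0 * t * math.cos(k) for k in ks]

    # Bisect on mu until the mean Fermi-Dirac occupation matches the filling.
    lo, hi = -2.0 * t - 1.0, 2.0 * t + 1.0
    for _ in range(60):
        mu = 0.5 * (lo + hi)
        occupation = sum(1.0 / (1.0 + math.exp((e - mu) / kT)) for e in energies) / n_k
        if occupation < filling:
            lo = mu
        else:
            hi = mu
    return 0.5 * (lo + hi)

# Dense grid used as a stand-in for the "infinite" sampling limit.
reference = fermi_level(2000)

# Deviation of the Fermi level from the reference for a few grid sizes.
errors = {n: abs(fermi_level(n) - reference) for n in (3, 5, 25, 100)}
for n, err in errors.items():
    print(f"n_k = {n:3d}   |E_F - E_F(ref)| = {err:.4f}")
```

In this toy model a very coarse grid places the Fermi level far from the dense-grid value, while a dense grid reproduces it closely. That is the kind of electrode/device mismatch that generous sampling along Z is meant to avoid.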
In X/Y, however, we always use the same k-point sampling for both the electrode and the device, and thus no mismatch can occur. The number of X/Y k-points therefore becomes a pure accuracy parameter: the more you have, the better the accuracy, but the higher the cost in time and memory. The X/Y k-points shouldn't really hamper convergence the way the Z k-points can, unless of course you have way too few and the Fermi level ends up somewhere it really doesn't belong.