The first setting, for the self-consistent calculation (SCF), is related to the band structure of the underlying electrode bulk crystal. To determine the Fermi level accurately and to sample the band structure properly, a certain number of k-points is needed. For a small unit cell (small in the XY directions, that is), you need more k-points, perhaps up to 10x10 for good accuracy, while for a large unit cell you can perhaps even get by with a single point (1x1).
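As a rough illustration of "small cell, many k-points; large cell, few k-points", here is a minimal sketch of a common rule of thumb: keep the product of k-points and cell length roughly constant. The function name and the target length of 30 Angstrom are assumptions made for this example only; they are not ATK settings or defaults, and the result is only a starting guess for a convergence test.

```python
import math

def suggest_scf_sampling(a_length, b_length, target_length=30.0):
    """Suggest (nA, nB) so that n * cell_length is at least target_length.

    a_length, b_length: transverse (XY) cell lengths in Angstrom.
    target_length: illustrative value in Angstrom, NOT an ATK default.
    """
    n_a = max(1, math.ceil(target_length / a_length))
    n_b = max(1, math.ceil(target_length / b_length))
    return n_a, n_b

# A small cell gets a dense grid, a large cell falls back to a single point.
print(suggest_scf_sampling(3.0, 3.0))
print(suggest_scf_sampling(40.0, 40.0))
```

Whatever such a rule suggests, the result still has to be checked by actually increasing the sampling.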
The second setting, for the "analysis", determines the sampling of the 2D Brillouin zone when you calculate the transmission spectrum. Electrons incident from the electrode at different k-points have different probabilities of tunneling through the central region, and we must sum all contributions to obtain the total transmission at a given energy. In this case it can be very hard to know up front how many points are needed, and one should really test by increasing the number to see if the transmission spectrum changes.
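The sum over k-points described above can be sketched in a few lines. This is not ATK code: the k-resolved transmission is a made-up smooth toy function, and the function names are hypothetical. The grid follows the standard Monkhorst-Pack construction, with equal weights for all points.

```python
import numpy as np

def monkhorst_pack_1d(n):
    """Fractional coordinates of an n-point Monkhorst-Pack grid along one axis."""
    r = np.arange(1, n + 1)
    return (2 * r - n - 1) / (2.0 * n)

def k_averaged_transmission(t_of_k, n_a, n_b):
    """Equal-weight average of t_of_k(kA, kB) over an n_a x n_b grid."""
    k_a, k_b = np.meshgrid(monkhorst_pack_1d(n_a), monkhorst_pack_1d(n_b),
                           indexing="ij")
    return float(np.mean(t_of_k(k_a, k_b)))

def toy_transmission(k_a, k_b):
    """Hypothetical k-resolved transmission at one fixed energy (toy model)."""
    return 0.5 * (1.0 + np.cos(2 * np.pi * k_a) * np.cos(2 * np.pi * k_b))

# The Gamma point alone sees only the maximum of the toy function,
# while a 4x4 grid averages over the whole zone.
print(k_averaged_transmission(toy_transmission, 1, 1))
print(k_averaged_transmission(toy_transmission, 4, 4))
```

For this particular toy function the 1x1 result overestimates the zone average by a factor of two, which is the kind of error that only shows up once the sampling is increased.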
The two numbers of k-points do not have to be the same. Take the well-known example of an FeMgO magnetic tunnel junction: an SCF sampling of 8x8 is sufficient for good results, but the minority transmission spectrum for the parallel spin configuration requires upwards of 100x100 to reach a good value for the transmission. This is because there is a large number of very narrow peaks in the transmission coefficients, which are related to the exact matching of the Fe band structure to the MgO energy levels.
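To see why narrow peaks demand a dense grid, consider the following toy model. It is only a sketch: a single Gaussian peak with a made-up width and position stands in for the many sharp features of a real junction, and none of the numbers are taken from an actual FeMgO calculation.

```python
import numpy as np

SIGMA = 0.01        # hypothetical peak width, in fractional k units
PEAK = (0.2, 0.1)   # hypothetical peak position in the 2D Brillouin zone

def toy_peak(k_a, k_b):
    """Toy k-resolved transmission: one narrow Gaussian peak."""
    d2 = (k_a - PEAK[0]) ** 2 + (k_b - PEAK[1]) ** 2
    return np.exp(-d2 / (2.0 * SIGMA ** 2))

def grid_average(n):
    """Average toy_peak over a uniform n x n grid on [-0.5, 0.5) x [-0.5, 0.5)."""
    k = (np.arange(n) + 0.5) / n - 0.5
    k_a, k_b = np.meshgrid(k, k, indexing="ij")
    return float(toy_peak(k_a, k_b).mean())

# Analytic zone average of the Gaussian (the peak lies well inside the zone).
exact = 2.0 * np.pi * SIGMA ** 2
coarse = grid_average(8)
dense = grid_average(100)

print(f"exact   : {exact:.3e}")
print(f"8x8     : {coarse:.3e}")
print(f"100x100 : {dense:.3e}")
```

In this toy case the 8x8 grid has no point close to the peak and misses nearly all of its weight, while the 100x100 grid, whose spacing is comparable to the peak width, recovers it.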
Therefore, as this example shows, the analysis k-points are related to the transmission properties of the whole composite two-probe system, while the SCF k-points are only related to the electrode.
The only way to really find out how many k-points are needed to obtain converged results is to run a sequence of calculations with different values. This goes for both the SCF and analysis k-points, although the transmission spectrum typically (but not always!) is less sensitive to the SCF k-points.
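Such a sequence of calculations can be organized as a simple loop. The sketch below is generic Python, not an ATK script: compute stands for whatever calculation you run at a given NxN sampling (a hypothetical callable), and the toy model at the end only mimics a quantity that settles as N grows.

```python
def find_converged_sampling(compute, samplings, rel_tol):
    """Return (n, history): n is the first sampling whose result changes by
    less than rel_tol (relative) when moving to the next, denser sampling.
    n is None if no pair of consecutive results agrees within rel_tol."""
    history = []
    prev_n, prev_value = None, None
    for n in samplings:
        value = compute(n)
        history.append((n, value))
        if prev_value is not None and abs(value - prev_value) <= rel_tol * abs(value):
            return prev_n, history
        prev_n, prev_value = n, value
    return None, history

def toy_result(n):
    """Hypothetical stand-in for a real calculation at n x n sampling."""
    return 1.0 + 1.0 / n ** 2

n_converged, history = find_converged_sampling(toy_result, [1, 2, 4, 8, 16, 32, 64], 1e-2)
for n, value in history:
    print(f"{n:3d}x{n:<3d} {value:.6f}")
print("converged at:", n_converged)
```

Agreement between two consecutive samplings is a necessary check rather than a proof of convergence; for spectra with very narrow features it is worth going one or two steps further.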
Increasing the number of k-points in the SCF does, however, improve the convergence, and using far too few k-points (e.g. only the Gamma point when 5x5 is needed) will shift the band structure and thus influence the results.
ATK is specifically parallelized over k-points, so when running on a cluster, it is not very expensive to increase the k-point sampling, if you run on many nodes. Memory usage does, however, increase with the number of k-points.
Of course, for one-dimensional systems like nanotubes, you only ever need 1x1 k-points for both SCF and analysis. For graphene, which is periodic in two directions, you need N points in one direction and 1 point in the other (perpendicular to the sheet).
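The rule above amounts to using more than one k-point only along transverse directions in which the system is actually periodic. A minimal sketch, with a hypothetical helper name and plain booleans standing in for the geometry:

```python
def transverse_sampling(periodic_a, periodic_b, n):
    """Return (nA, nB): n points along each periodic transverse direction,
    a single point along each non-periodic (vacuum) direction."""
    return (n if periodic_a else 1, n if periodic_b else 1)

# Nanotube: no transverse periodicity, so a single k-point suffices.
print(transverse_sampling(False, False, 9))
# Graphene sheet: periodic in-plane, vacuum perpendicular to the sheet.
print(transverse_sampling(True, False, 9))
```

The value of N itself still has to be found by a convergence test.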