Interesting. I observed the same issue with the k-points and am investigating it further. The k_point_sampling object itself correctly reports 512 points, but for some reason the number is still reduced (or at least reported as reduced) in the calculation.
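One possible explanation, which I have not confirmed for this case, is symmetry reduction: many codes keep only one of each k / -k pair (time-reversal symmetry) and report that smaller count. The sketch below is plain Python, not the calculator's own code. It assumes the 512 points come from an 8x8x8 grid, and the helper names (monkhorst_pack, gamma_centered, reduce_by_time_reversal) are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def monkhorst_pack(n):
    """Fractional coordinates of an n x n x n Monkhorst-Pack grid (assumed grid type)."""
    axis = [Fraction(2 * r - n - 1, 2 * n) for r in range(1, n + 1)]
    return list(product(axis, repeat=3))


def gamma_centered(n):
    """Fractional coordinates of an n x n x n grid that includes the Gamma point."""
    axis = [Fraction(i, n) for i in range(n)]
    return list(product(axis, repeat=3))


def wrap(k):
    """Map each coordinate into [0, 1) so points that differ by a
    reciprocal lattice vector compare equal."""
    return tuple(c % 1 for c in k)


def reduce_by_time_reversal(kpoints):
    """Keep one representative of every {k, -k} pair."""
    seen = set()
    irreducible = []
    for k in kpoints:
        k_wrapped = wrap(k)
        minus_k = wrap(tuple(-c for c in k))
        if k_wrapped in seen or minus_k in seen:
            continue  # this point or its time-reversed partner is already kept
        seen.add(k_wrapped)
        irreducible.append(k)
    return irreducible


full = monkhorst_pack(8)
print(len(full))                                      # 512
print(len(reduce_by_time_reversal(full)))             # 256
print(len(reduce_by_time_reversal(gamma_centered(8))))  # 260
```

If the count reported in the calculation is about half of 512, this kind of reduction would be a natural suspect. Spatial symmetries of the crystal can reduce the number further, which this sketch does not attempt.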
It does speed up convergence if you initialize the spins closer to the expected converged polarization, rather than setting them to a fully polarized state as you do now. In this way you do partly initialize the density matrix, but there is no simple way to initialize the density matrix itself beyond that. One could run a separate calculation with looser convergence settings, and maybe fewer k-points, to at least provide a starting guess, but note that the basis set size must be the same in both calculations.
Since ADMM is an approximation, it is expected to give slightly different results. A 10% difference sounds a bit high, though it will of course depend on the size of the smaller basis set.