2-3 days sounds a bit much, but it's hard to say anything definite without details on your system. For comparison, Si bulk (2 atoms) takes about 5 hours on 6 nodes, which corresponds to roughly a day (about 30 hours) in serial.
One thing to keep in mind is that you rarely need any k-point sampling for the phonon calculation itself, since the supercell takes care of that, so you should usually go with 1x1x1. Using more k-points than that may be part of the explanation for your long runtime. However, the system used for a phonon calculation should also always be relaxed first (optimized positions and cell size), and that of course needs to be done with k-points.
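To illustrate why the supercell takes care of the k-point sampling, here is a small plain-Python sketch (not ATK code). It uses the common rule of thumb that repeating the cell n times along an axis lets you divide the k-point count along that axis by about n. The function name supercell_kpoints and the example grids are made up for illustration, so treat the numbers as a rough guide rather than a converged setting.

```python
import math

def supercell_kpoints(primitive_kpoints, repeats):
    """Rule-of-thumb k-point grid for a supercell.

    Divides the primitive-cell k-point count along each axis by the number
    of repeats along that axis, rounding up, and never goes below 1.
    """
    return tuple(
        max(1, math.ceil(k / n))
        for k, n in zip(primitive_kpoints, repeats)
    )

# A 5x5x5 grid for the primitive cell, repeated 5x5x5, reduces to 1x1x1.
print(supercell_kpoints((5, 5, 5), (5, 5, 5)))  # (1, 1, 1)

# A denser 9x9x9 grid with only 3x3x3 repeats still needs 3x3x3.
print(supercell_kpoints((9, 9, 9), (3, 3, 3)))  # (3, 3, 3)
```

So if the k-point grid from the relaxation of the small cell is carried over unchanged to the big supercell, the calculation does far more work than it needs to.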
ATK is probably one of the most efficient codes available for doing these calculations - you wouldn't even attempt to run this in serial with VASP unless you make only a very small supercell, in which case the accuracy is questionable. And as mentioned, you can lower the accuracy in ATK too - here's how:
dmp = DynamicalMatrixParameters(repeats=(5,5,5), atomic_displacement=0.01*Angstrom)
calculator = LCAOCalculator(...all usual stuff...,
    dynamical_matrix_parameters=dmp,
)
As always, it's necessary to have the right tools for the job. You wouldn't build a house out of concrete using a hand-powered drill, and if you are serious about computational physics, a slightly better computer - or a second one, for parallelization - will be a sensible investment. This doesn't mean supercomputers. Just like with VASP and other codes, you can get by nicely with ATK on a high-power workstation or a small office cluster.
But I think in your case you probably need to revise the k-point sampling first. That may explain everything.
Finally, you mention that your calculation "doesn't converge". Do you really mean that, or do you just mean that it hasn't finished yet? For a phonon calculation you need to do at least 3N+1 separate selfconsistent calculations, and that takes time. "Not converging", on the other hand, means that one of those calculations doesn't reach the tolerance (or only does so after very many steps), and that's a whole other story.
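To get a feeling for the 3N+1 count, here is a back-of-the-envelope sketch in plain Python (again not ATK code). It assumes N is the number of atoms that get displaced, that every selfconsistent run takes about the same time, and that the parallel scaling is ideal. The helper names are made up for illustration, and the timing is the Si bulk example from above.

```python
def phonon_scf_runs(n_atoms):
    """Minimum number of selfconsistent runs: one for the undisplaced
    system plus one per atom and Cartesian direction (3N + 1)."""
    return 3 * n_atoms + 1

def hours_per_run(wall_hours, nodes, n_atoms):
    """Average serial time per selfconsistent run, assuming ideal scaling."""
    serial_hours = wall_hours * nodes
    return serial_hours / phonon_scf_runs(n_atoms)

# Si bulk: 2 atoms, about 5 hours on 6 nodes.
print(phonon_scf_runs(2))                 # 7
print(round(hours_per_run(5, 6, 2), 1))   # 4.3
```

So even when every single run converges without trouble, the job as a whole can take a long time to finish simply because there are so many runs.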