The first thing to check is whether the 3rd machine gets an atkpython process at all. If it does, but the process uses very little CPU, it may be that there are not enough independent parts of the calculation to keep all 6 MPI processes busy. This would be the case, for instance, for a bulk system with only a few k-points.
More likely, however, your 3rd node doesn't participate in the run at all (again, check whether an atkpython process is running on it). If there isn't one, you may need to pass a "-machinefile" argument to mpiexec (see the MPICH2 documentation). It could also be that mpd is not running on the 3rd machine, or that it is running but the machine from which you launch the job doesn't know about it. Make sure your mpd ring is properly configured.
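As a rough sketch of what that might look like with MPICH2 (the hostnames and file names below are placeholders, so adapt them to your cluster):

```shell
# Hypothetical hostnames; replace with your own machines.
cat > machines.txt <<'EOF'
node1
node2
node3
EOF

# Check which hosts are currently in the mpd ring:
mpdtrace

# If a host is missing, (re)start the ring from the head node,
# reading the host list from the file above:
mpdboot -n 3 -f machines.txt

# Launch the job across all hosts listed in the machinefile,
# 6 MPI processes in total:
mpiexec -machinefile machines.txt -n 6 atkpython script.py
```

If mpdtrace lists only one or two hosts, the missing machine will never receive a process, no matter what you pass to mpiexec.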
When testing this, you don't need to run a large ATK script and check each machine. It's enough to run a script containing
import socket
if processIsMaster():
    print 'Master node:',
else:
    print 'Slave node:',
print socket.gethostname()
Each process prints the hostname of the machine it runs on, so the output immediately shows whether all machines participate in the run.