Try passing "-npernode 1" to mpiexec. It asks for one process per node, instead of stacking them on the same node. Not all versions support it, however. Otherwise, you may need to specify in the machinefile how many processes mpiexec is allowed to put on each node. If each node has several cores, mpiexec may detect that and place as many processes on a node as it has cores (or sockets) before moving on to the next one.
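As a rough sketch of what that per-node limit looks like in the machinefile: the syntax depends on which MPI implementation your mpiexec belongs to, so check which one you have. The host names node01 and node02 are made up, and you would replace 1 with the number of processes you want on each node. With an MPICH-style mpiexec the file would look something like

```text
node01:1
node02:1
```

whereas an Open MPI-style hostfile would be

```text
node01 slots=1
node02 slots=1
```

These are two alternative files, not one; use only the form that matches your installation.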
When experimenting, you can use something simpler than ATK (to make it faster, by avoiding the license check), for instance:
mpiexec -n 32 -machinefile machinefile hostname
Then you can also test how high -n needs to be before it starts using other nodes. If that number is 4 or 8, or another number matching the core count, it would confirm my suspicion above, and you would need to edit the machinefile a bit.
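If you want to automate that experiment, something along these lines might help. It is only a sketch: it assumes mpiexec is on your PATH, that the file is called "machinefile" in the current directory, and that the "hostname" command exists on every node. The function names count_per_host and scan_process_counts are just made up for the illustration.

```shell
#!/bin/sh

# Summarise where the processes ran.
# Reads one host name per line on stdin and prints "host count" for each node.
count_per_host() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Run "hostname" under mpiexec for increasing process counts, so you can
# see at which value of -n a second node starts to be used.
# Assumption: mpiexec accepts -n and -machinefile as in the command above.
scan_process_counts() {
    for n in 1 2 4 8 16 32; do
        echo "== -n $n =="
        mpiexec -n "$n" -machinefile machinefile hostname | count_per_host
    done
}
```

After loading these definitions into your shell, run scan_process_counts. The first value of -n at which a second host name shows up tells you how many processes mpiexec stacks on one node.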