You're mixing apples and pears a little bit.
OpenMP threading over cores has nothing to do with "mpiexec".
mpiexec -n 4 tells ATK to parallelize the calculation over 4 MPI processes. These 4 processes can run on separate machines, sockets or cores depending on your hardware resources (and allocation scheme). If we, for the sake of argument, assume you have 4 sockets in this machine, each one being a six-core, then each MPI process will (probably) run on a separate socket, and each one can thread using OpenMP over the 6 cores on the socket.
So, if you only have a single machine, then it's a matter of finding the balance between memory (each MPI process uses roughly the same amount of RAM as a serial process), sockets, and cores.
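To make that balance concrete, here is a rough sketch in Python. It is only a heuristic under stated assumptions, not something ATK does internally: the function name plan_layout and its parameters are made up for illustration, and it assumes you know roughly how much RAM a serial run of your calculation needs.

```python
def plan_layout(sockets, cores_per_socket, total_ram_gb, serial_ram_gb):
    """Suggest (MPI processes, OpenMP threads per process) for one machine.

    Hypothetical helper: starts from one MPI process per socket and
    reduces the process count if memory cannot hold that many.
    """
    total_cores = sockets * cores_per_socket
    # Each MPI process uses roughly the RAM of a serial run,
    # so available memory caps the number of processes.
    max_by_memory = int(total_ram_gb // serial_ram_gb)
    if max_by_memory < 1:
        raise ValueError("not enough RAM for even one process")
    n_procs = min(sockets, max_by_memory)
    # Spread the cores over the processes as OpenMP threads.
    threads = total_cores // n_procs
    return n_procs, threads


# The 4-socket, six-core example: 64 GB of RAM, 10 GB per serial run.
print(plan_layout(4, 6, 64, 10))
```

With the numbers from the example above (4 sockets, 6 cores each) and enough memory, this suggests 4 MPI processes with 6 threads each, i.e. mpiexec -n 4. If the calculation needed 20 GB per process instead, only 3 processes would fit in 64 GB, and each one would get more threads.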
However, what we mean by "automatic" is that in most cases you never have to control the OpenMP threading using the OpenMP environment variables; MKL is good at figuring out the best way to run by itself. (An exception is if you use hyperthreading.)
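If you do end up in one of the cases where you want to set the thread count by hand (for instance the hyperthreading exception mentioned above), a minimal sketch could look like this. OMP_NUM_THREADS and MKL_NUM_THREADS are the standard OpenMP and MKL environment variables; the helper name threaded_env is made up for illustration, and whether the variables reach every MPI process depends on your MPI implementation.

```python
import os


def threaded_env(threads_per_process, base_env=None):
    """Return a copy of the environment with the thread count pinned.

    Hypothetical helper: it only builds the environment, it does not
    launch anything. The input environment is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Standard variables read by the OpenMP runtime and by MKL.
    env["OMP_NUM_THREADS"] = str(threads_per_process)
    env["MKL_NUM_THREADS"] = str(threads_per_process)
    return env


env = threaded_env(6)
print(env["OMP_NUM_THREADS"], env["MKL_NUM_THREADS"])
```

You would then pass env to whatever launches the job, e.g. subprocess.run(["mpiexec", "-n", "4", ...], env=env) with your usual ATK command in place of the dots. Exporting the same variables in the shell before calling mpiexec does the same thing.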