Dear everyone,
I have been running the parallelization for a few days. I have three computers: one has 12 GB of memory, and the other two have 8 GB each. They are all quad-core computers, and I guess they are also single-socket machines.
Now my job takes about 3.6 GB on one local machine, so I decided to allocate two processes per machine, following the parallel strategy. This allocation has proved not to cause problems. However, I noticed a phenomenon: the third computer never seems to do any work under this setting. Usually the master node has 100% CPU usage, the second one has 50% CPU usage, and the third one has only 1% CPU usage. Sometimes the third machine is not even connected to the network, but the job still runs smoothly with the same command.
I am not sure what the problem is. I can see that the three machines are connected under mpich, but the third one doesn't seem to do any work. My command line for a job looks like this:
mpiexec -n 6 myjob.py > myjob.log
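To check where the six processes actually end up, I suppose a small test script like the one below could be launched the same way as myjob.py. This is only a sketch: it uses nothing but the Python standard library, it does not touch MPI itself, and the function name describe_process is made up for illustration.

```python
import os
import socket


def describe_process():
    """Return a one-line description of where this process is running."""
    # The host name shows which machine the process was started on.
    host = socket.gethostname()
    # The PID tells apart several processes on the same machine.
    pid = os.getpid()
    return "host=%s pid=%d" % (host, pid)


if __name__ == "__main__":
    # Each launched copy prints one line; compare the host names.
    print(describe_process())
```

If all six lines show the same host name, then the processes are not being spread over the three machines.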
Can anybody point out the problem for me? I really want to make the most of these computers. Thanks very much!
Have a nice day.