Topic: Cluster Network Setup

Offline Anirban Basak

  • Heavy QuantumATK user
  • ***
  • Posts: 25
  • Country: in
  • Reputation: 0
    • View Profile
Cluster Network Setup
« on: November 24, 2011, 11:14 »
Hi,

We have a Sun server on our campus that we can access by remote login. The server is described below:
system: Sun Grid Engine
processors: 20 quad-cores
proc-arch: x86_64
memory: 3.4 GB per core
OS: CentOS 4.5
Kernel: 2.6.9-55.0.2.ELsmp

Currently I am working on a Red Hat Linux workstation with four quad-core processors and 8 GB of RAM.
Red Hat Enterprise Linux Client release 5.3 (Tikanga)
Linux version 2.6.18-128.el5 (mockbuild@hs20-bc1-7.build.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Wed Dec 17 11:41:38 EST 2008

I would like to set up a cluster with my workstation as master and the server nodes as slaves. Please tell me which software I need to set up this cluster over the LAN, and how to do it with that software.


Thank You.

Online Anders Blom

  • QuantumATK Staff
  • Supreme QuantumATK Wizard
  • *****
  • Posts: 5575
  • Country: dk
  • Reputation: 96
    • View Profile
    • QuantumATK at Synopsys
Re: Cluster Network Setup
« Reply #1 on: November 25, 2011, 16:31 »
First of all, we should perhaps clarify the terminology. "Master" in our language is not a control node; it's just the first compute node, to which you add "slaves" for additional performance. It makes no sense in your situation to use this setup, with those labels on the machines.

Probably what you want is to run VNL on your desktop workstation, and submit and run the calculations on the campus cluster. This is possible, but you have to transfer the calculations manually: save the script on the workstation, copy it over to the cluster, run it there, and then copy the results back.
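
As a rough illustration of the copy, run, and copy-back cycle described above, here is a minimal Python sketch that only builds the three commands and prints them; it does not execute anything. The user name, host name, directory, and file names are hypothetical placeholders, and running the script with `atkpython` directly over ssh is an assumption: on a cluster managed by a queue system, the run step would more likely go through a job submission instead.

```python
import shlex


def build_transfer_commands(script, user, host, remote_dir, result_file):
    """Return the three commands for the manual copy / run / copy-back cycle.

    All arguments are placeholders chosen by the caller; nothing is executed.
    """
    remote = f"{user}@{host}"
    # Assumed run command; replace with a queue submission if the cluster needs one.
    run_line = f"cd {shlex.quote(remote_dir)} && atkpython {shlex.quote(script)}"
    return [
        # 1. Copy the script saved on the workstation to the cluster.
        ["scp", script, f"{remote}:{remote_dir}/"],
        # 2. Run the script on the cluster.
        ["ssh", remote, run_line],
        # 3. Copy the result file back into the current directory.
        ["scp", f"{remote}:{remote_dir}/{result_file}", "."],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    commands = build_transfer_commands(
        "calc.py", "user", "cluster.example.org", "/home/user/atk", "calc.nc"
    )
    for cmd in commands:
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands first makes it easy to check the paths by hand before running them in a terminal.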

What you need is to install ATK on the cluster, and set it up so that it uses the same license server as your workstation does.

Anirban Basak

Re: Cluster Network Setup
« Reply #2 on: December 1, 2011, 11:07 »
Thanks for the reply, sir.

We are currently having problems accessing the grid machine. In the meantime, we would like to configure MPICH2 over the network of three workstations in our lab. I tried to install MPICH2 on Linux, following the installation guide provided. I could configure and install MPICH2, but not without some errors, which prevent it from functioning. I've attached the files generated by configure, make, and make install, along with the terminal commands and output. Please help me find the problem. I have given it some thought, but I can't figure out what went wrong.

Thank you.

Anders Blom

Re: Cluster Network Setup
« Reply #3 on: December 1, 2011, 12:36 »
Those are all just warning messages. As far as I can see, the build and installation completed successfully.

Anirban Basak

Re: Cluster Network Setup
« Reply #4 on: December 9, 2011, 11:50 »
Thank you, sir. MPICH2 was indeed installed successfully.

* When I compiled and ran a simple MPI program, I got the following output:

[root@vlsi112 ~]# mpiexec -n 4 /root/hello_world_mpi
Hello World from Node 0
Hello World from Node 1
Hello World from Node 2
Hello World from Node 3

* But when I tried to use two nodes (root@192.168.111.112 and root@192.168.111.116), it gave the following errors:

[root@vlsi112 ~]# mpiexec -n 8 -machinefile /root/machines_mpi /root/hello_world_mpi
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(392)..............:
MPID_Init(139).....................: channel initialization failed
MPIDI_CH3_Init(38).................:
MPID_nem_init(234).................:
MPID_nem_tcp_init(108).............:
MPID_nem_tcp_get_business_card(346):
MPID_nem_tcp_init(305).............: gethostbyname failed, root@192.168.111.116 (errno 1)

* I have set up passwordless ssh so that each node can log in to the other.
I have also updated /etc/hosts on both nodes to include the hostnames and domain names of the two nodes.
I have opened ports 41000 to 41023 in the firewall and configured MPI to use them.
The file /root/machines_mpi contains only this line: root@192.168.111.116

Anders Blom

Re: Cluster Network Setup
« Reply #5 on: December 9, 2011, 15:09 »
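Remove root@ from the machine file; it should contain only the hostnames, no usernames. The error trace shows the whole entry, root@192.168.111.116, being passed to gethostbyname, which cannot resolve it.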
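
For anyone hitting the same error, the rule can be sketched as a small Python check. This is a hypothetical helper, not part of MPICH2: it strips any `user@` prefix from machine file entries and tests whether each remaining host can be resolved with the standard `socket.gethostbyname` call. It assumes one plain host entry per line.

```python
import socket


def strip_user_prefix(entry):
    """Drop a leading 'user@' from a machine file entry, keeping only the host."""
    return entry.rsplit("@", 1)[-1].strip()


def clean_machine_file(lines):
    """Return host entries with user names removed, skipping blanks and comments."""
    cleaned = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        cleaned.append(strip_user_prefix(entry))
    return cleaned


def resolves(host):
    """Return True if the host string can be resolved to an IPv4 address."""
    try:
        socket.gethostbyname(host)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # The single line from /root/machines_mpi quoted in this thread.
    original = ["root@192.168.111.116"]
    for host in clean_machine_file(original):
        print(host, "resolves" if resolves(host) else "does not resolve")
```

Passing the original entry, with `root@` still attached, to `resolves` should return False, which mirrors the gethostbyname failure in the error trace.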

Anirban Basak

Re: Cluster Network Setup
« Reply #6 on: December 10, 2011, 11:35 »
Thank you very much. It solved the problem.  ;D