[Rocks-Discuss]Newbie question: alternative method for installing compute nodes with two NICs?

Marcelo Matus mmatus at dinha.acms.arizona.edu
Mon Jan 24 14:29:12 PST 2005


You don't need to add the gateway unless you really have one,
but you probably also need to add a secondary NIC to the frontend
on the same network segment, i.e., with an address such as 192.168.1.100.

If your frontend doesn't have two NICs, the compute nodes will still be
able to talk to each other over the secondary network, but the frontend
will not be able to reach them on it.

If your frontend doesn't have two NICs and you still want to talk
to the compute nodes through their secondary NICs from the frontend,
then you will have to set up a gateway and add a route on the frontend.
You can use one of the compute nodes as the gateway for that purpose.
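
As a rough sketch of that last option (not something I have tested on
a Rocks cluster): the helper below only prints the route command for the
frontend so it can be reviewed before running it as root. The gateway
address 10.1.255.254 is a hypothetical eth0 address for the compute node
acting as gateway, and the function name route_cmd is made up for
illustration; the 192.168.1.0/255.255.255.0 network comes from the
add-extra-nic line quoted below.

```shell
#!/bin/sh
# Sketch only: print the frontend route for the secondary (eth1) network.
# ASSUMPTIONS: 10.1.255.254 is the eth0 address of the compute node used
# as gateway; route_cmd is a hypothetical helper name.

MPI_NET=192.168.1.0
MPI_MASK=255.255.255.0
GATEWAY_NODE=10.1.255.254

# Build the net-tools 'route' command: network, netmask, gateway.
route_cmd() {
    echo "route add -net $1 netmask $2 gw $3"
}

# On the gateway compute node, IP forwarding must be enabled first (as root):
#   echo 1 > /proc/sys/net/ipv4/ip_forward
# Then, on the frontend, review and run the printed command as root:
route_cmd "$MPI_NET" "$MPI_MASK" "$GATEWAY_NODE"
```

The command is printed rather than executed so nothing changes on the
frontend until you have checked the addresses.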

Marcelo


Wheeler, Dr M.D. wrote:

>Dear Rocks users
> 
>OK, I have installed an extra NIC on one of my compute nodes with the line
> 
>add-extra-nic --if=eth1 --ip=192.168.1.1 --netmask=255.255.255.0 --gateway=192.168.1.254 --name=mpi-0-0 compute-0-0
> 
>and I have done a shoot-node on compute-0-0
> 
>When I then try to log in to mpi-0-0 using ssh mpi-0-0, I get nothing; likewise with ping, and likewise if I use the IP address.
> 
>I think (though I could be wrong, as is often the case) that the problem lies in the definition of the gateway. Does my frontend know how to connect to this subnet?
> 
>Any suggestions most welcome....
> 
>Thanks in advance,
> 
>Martyn
> 
>
>________________________________
>
>From: npaci-rocks-discussion-admin at sdsc.edu on behalf of Marcelo Matus
>Sent: Fri 21/01/2005 23:38
>To: Wheeler, Dr M.D.
>Cc: mmatus at dinha.acms.arizona.edu; Huiqun Zhou; npaci-rocks-discussion at sdsc.edu
>Subject: Re: [Rocks-Discuss]Newbie question: alternative method for installing compute nodes with two NICs?
>
>
>
>NetPIPE is here:
>
>   http://www.scl.ameslab.gov/netpipe/
>
>A nice experiment you can do, and then report to the list, is to run
>two NetPIPE instances between two nodes, one over eth0 and one over
>eth1. Since you have Tyan Opteron motherboards, the combined bandwidth
>should be around double.
>
>I guess that if you add the extra NICs using add-extra-nic and then
>run shoot-node, it should work. That seems to be how the other Rocks
>user was doing it, i.e., installing each node twice.
>
>As for other tweaks: if you name your nodes
>
>
>  compute-0-0 (eth0)  -> compute-mpi-0-0 (eth1)
>
>then you must make sure that, when running MPI, the hostnames used on
>the command line are 'compute-mpi-x-x'. You could change or add SGE
>queues to use the right names, or write a script that translates the
>names properly.
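
The name translation mentioned above could be sketched as a small
filter over a machine file. This is only an illustration: the function
name to_mpi_names is made up, and it assumes one hostname per line,
starting at the beginning of the line, following the
compute-x-x / compute-mpi-x-x naming shown above.

```shell
#!/bin/sh
# Sketch only: rewrite eth0 hostnames (compute-x-x) to their eth1
# counterparts (compute-mpi-x-x). to_mpi_names is a hypothetical name.
# Reads a machine file on stdin and writes the translated file to stdout.
to_mpi_names() {
    # Require a digit after "compute-" so that names already in the
    # compute-mpi-x-x form pass through unchanged.
    sed 's/^compute-\([0-9]\)/compute-mpi-\1/'
}

# Example use (file names are placeholders):
#   to_mpi_names < machines.eth0 > machines.mpi
```

The translated file would then be the one handed to the MPI launcher,
so that the job traffic goes over the secondary NICs.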
>
>
>
>
>
>
>
>Wheeler, Dr M.D. wrote:
>
>  
>
>>I do have Tyan Opteron machines with dual onboard NICs, so this might be worth a go.
>>
>>So all I have to do is run add-extra-nic and reinstall each node (can I do a shoot-node after add-extra-nic, or do I need to do a full install)?
>>
>>Is there any other tweaking that needs to be done for:
>>
>>>you can use the primary NICs in
>>>a network to support all the normal cluster services like PVFS, NFS,
>>>etc, while leaving
>>>the communications (MPI) to the secondary NICs.
>>>
>>Cheers Martyn
>>
>>BTW where do I get NetPipe?
>>
>>________________________________
>>
>>From: npaci-rocks-discussion-admin at sdsc.edu on behalf of Marcelo Matus
>>Sent: Fri 21/01/2005 20:48
>>To: Wheeler, Dr M.D.
>>Cc: mmatus at dinha.acms.arizona.edu; Huiqun Zhou; npaci-rocks-discussion at sdsc.edu
>>Subject: Re: [Rocks-Discuss]Newbie question: alternative method for installing compute nodes with two NICs?
>>
>>
>>
>>If your machines have the NICs connected to the right bus, i.e., PCI-X
>>(Tyan Opteron) or PCI-Express (don't know which one), you can use the
>>primary NICs for a network that supports all the normal cluster
>>services like PVFS, NFS, etc., while leaving the MPI communications
>>to the secondary NICs.
>>
>>Then you can run MPI and access files without the two competing for
>>network access, i.e., you get more total bandwidth.
>>
>>Now, if your NICs are connected to the plain PCI bus, there is not
>>much gain, since the PCI bus is already saturated by a single Gigabit
>>NIC.
>>
>>How do you know where your NICs are connected? Run NetPIPE between two
>>nodes, and if the maximum bandwidth you get is
>>
>> 200-300 Mbps   ->  PCI 32 bits
>> 500-600 Mbps   ->  PCI 64 bits
>> 800-900 Mbps   ->  PCI-X or better
>>
>>then, in the last case, it makes a lot of sense to enable the secondary
>>NICs, since two Gigabit connections don't saturate the PCI-X bus, i.e.,
>>in theory you can double the maximum bandwidth of your nodes. How you
>>actually use that bandwidth is up to you, but splitting services
>>between the NICs (file access versus MPI communications) is a modest
>>way to start.
>>
>>
>>Marcelo
>>
>>
>>Wheeler, Dr M.D. wrote:
>>
>>>Dear All,
>>>I was just wondering if someone could explain the advantage of using both NICs on a node. Can it speed up parallel jobs? At present I have only one NIC set up on each of my compute nodes, and it would be great if adding the other helped speed things up.
>>>
>>>Martyn
>>>
>>>________________________________
>>>
>>>From: npaci-rocks-discussion-admin at sdsc.edu on behalf of Marcelo Matus
>>>Sent: Fri 21/01/2005 19:18
>>>Cc: Huiqun Zhou; npaci-rocks-discussion at sdsc.edu
>>>Subject: Re: [Rocks-Discuss]Newbie question: alternative method for installing compute nodes with two NICs?
>>>
>>>
>>>
>>>Here is how we installed our cluster nodes with two network interfaces,
>>>where one of the interfaces is InfiniBand. We add something like this
>>>to our extend-compute.xml file:
>>>
>>><!-- Configure ipoib0 -->
>>><post>
>>>
>>>#
>>># IPoIB configuration
>>>#
>>>ibbase=192.168.100
>>>ibnetmask=255.255.255.0
>>>ibbroadcast=192.168.100.255
>>>ibmtu=1500
>>>
>>># Get the eth0 IP address and extract its last octet
>>>ipeth0=<var name="Node_Address"/>
>>>ipnumber=`echo $ipeth0 | tr '\.' ' ' | awk '{print $4}'`
>>>
>>># Mirror the last octet: 254 -> 1, 253 -> 2, ...
>>>ibnumber=`expr 255 - $ipnumber`
>>>
>>># Assign the derived address to the IPoIB interface
>>>ib-setup --config --ip $ibbase.$ibnumber --netmask $ibnetmask \
>>>    --broadcast $ibbroadcast --mtu $ibmtu
>>>
>>></post>
>>>
>>>
>>>As you can see, we retrieve the $ipeth0 address and fabricate a new
>>>address for ipoib0, then set the new address using ib-setup.
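
To make the address arithmetic above concrete, here is a minimal
standalone sketch of the same derivation. The function name ib_address
and the sample eth0 addresses are made up for illustration; the
192.168.100 base and the "255 minus last octet" rule are taken from the
fragment above.

```shell
#!/bin/sh
# Sketch only: derive the IPoIB address from an eth0 address by
# mirroring the last octet, as in the <post> fragment above.
# ib_address is a hypothetical helper name.
ib_address() {
    last=${1##*.}                        # last octet of the eth0 address
    echo "192.168.100.$((255 - last))"   # 254 -> 1, 253 -> 2, ...
}

ib_address 10.1.255.254   # prints 192.168.100.1
ib_address 10.1.255.253   # prints 192.168.100.2
```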
>>>
>>>In your case, you can do the same: retrieve the $ipeth0 address,
>>>fabricate an address for eth1, assign it using ifconfig, and also
>>>generate the file
>>>
>>>/etc/sysconfig/network-scripts/ifcfg-eth1
>>>
>>>
>>>Then, on the frontend, you can run a small script to capture all the
>>>eth1 addresses and add them to the cluster database using add-extra-nic:
>>>
>>>cluster-fork ifconfig eth1 | add-all-eth1-nics
>>>
>>>where you need to write the 'add-all-eth1-nics' script to parse the
>>>cluster-fork/ifconfig output and call add-extra-nic properly.
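
A possible starting point for such an 'add-all-eth1-nics' script is
sketched below. It is written under assumptions that should be checked
against real output first: that cluster-fork prints a "compute-x-y:"
header line before each node's output, and that ifconfig prints the
classic "inet addr:... Mask:..." line. The mpi-x-y names follow the
--name=mpi-0-0 convention used elsewhere in this thread. The sketch
only prints the add-extra-nic commands instead of running them.

```shell
#!/bin/sh
# Sketch only: turn 'cluster-fork ifconfig eth1' output into
# add-extra-nic commands, printed for review.
# ASSUMED input format (verify on your cluster):
#   compute-0-0:
#   eth1      Link encap:Ethernet  HWaddr ...
#             inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
add_all_eth1_nics() {
    awk '
    # Header line from cluster-fork: remember the current node name.
    /^compute-[0-9]+-[0-9]+:/ { host = $1; sub(/:$/, "", host); next }

    # IPv4 line from ifconfig: pull out the address and the netmask.
    /inet addr:/ {
        ip = ""; mask = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^addr:/) ip = substr($i, 6)
            if ($i ~ /^Mask:/) mask = substr($i, 6)
        }
        if (host != "" && ip != "") {
            name = host
            sub(/^compute-/, "mpi-", name)   # compute-0-0 -> mpi-0-0
            printf "add-extra-nic --if=eth1 --ip=%s --netmask=%s --name=%s %s\n", ip, mask, name, host
        }
    }'
}

# Example use on the frontend:
#   cluster-fork ifconfig eth1 | add_all_eth1_nics        # review first
#   cluster-fork ifconfig eth1 | add_all_eth1_nics | sh   # then apply
```

Printing the commands first makes it easy to spot a node whose eth1
has no address before anything is written to the cluster database.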
>>>
>>>If not, you can add the eth1 addresses manually with 'add-extra-nic', as
>>>we are doing for now, since we have very few nodes in our testing stage.
>>>
>>>Either way, whether or not you run add-extra-nic manually, you don't
>>>need to install the compute nodes twice.
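
For the ifcfg-eth1 file mentioned above, a minimal sketch of a generator
is shown below. The function name write_ifcfg_eth1 and its argument
order are made up for illustration, and the static-address layout is
the usual Red Hat style one rather than anything Rocks-specific; adjust
it to whatever your distribution expects.

```shell
#!/bin/sh
# Sketch only: write a static ifcfg file for eth1.
# write_ifcfg_eth1 is a hypothetical helper.
# Arguments: <eth1 ip> <netmask> <output file>
write_ifcfg_eth1() {
    cat > "$3" <<EOF
DEVICE=eth1
BOOTPROTO=static
ONBOOT=yes
IPADDR=$1
NETMASK=$2
EOF
}

# Example use inside a <post> section, after deriving the address:
#   write_ifcfg_eth1 192.168.1.1 255.255.255.0 \
#       /etc/sysconfig/network-scripts/ifcfg-eth1
#   ifconfig eth1 192.168.1.1 netmask 255.255.255.0 up
```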
>>>
>>>Marcelo
>>>
>>>
>>>Mason J. Katz wrote:
>>>
>>>>This is correct, and we feel your pain.  But for the current release
>>>>the only way to enable the second NIC is to install once, run
>>>>add-extra-nic and install again.  This will hopefully change in the
>>>>not so distant future (but not the next release).
>>>>
>>>>  -mjk
>>>>
>>>>On Jan 21, 2005, at 2:13 AM, Huiqun Zhou wrote:
>>>>
>>>>>Hi,
>>>>>
>>>>>Is there a more convenient method for installing compute nodes with
>>>>>two NICs? Currently I have to install the compute nodes twice to get
>>>>>the second NIC configured. add-extra-nic cannot add information
>>>>>about the second NIC to the database because there are no records
>>>>>for the compute nodes right after the frontend is installed.
>>>>>Although I have only 8 compute nodes, installation took me more
>>>>>than an hour.
>>>>>
>>>>>I don't know if I missed some information in the manual; please give
>>>>>me an idea.
>>>>>
>>>>>Regards,
>>>>>
>>>>>
>>>>>Huiqun Zhou



