ROCKS Clustering - A Review

This is "NOT" a HowTo for setting up a ROCKS cluster, but I will show you some of my try-outs and their aftermath.

If you are new to ROCKS, please refer to the well-written ROCKS User's Guide, or you might get lost.

I used version 4.1 [Rocks v4.2 Beta, for the i386 and x86_64 CPU architectures, is available now]
and my cluster details are registered here.

Frontend (a.k.a. head node) installation is just a breeze, but only if you refer to the manual.

A word about frontend installation: figure out what your requirements are, and that tells you which rolls you need to select.

BASE DISK
0. Area51 Roll :- For added security features like Tripwire and chkrootkit. Opt out if you are really not bothered about high-funda security.
1. Viz Roll :- Visualization; you don't require it unless you have a big, tiled display wall.
2. hpc :- Yes, I am in the HPC lane.
3. Ganglia :- To show off my cluster set-up and, obviously, for monitoring the cluster's health.
4. Web-server :- Yes.
5. Kernel Roll :- Yes.

OS DISK

Disks 1 and 2 are sufficient; disks 3 and 4 are optional.

…and next, I bound the frontend to our local NTP server.
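
If you want to sanity-check that binding after the install, something like this is enough (ntp.mydomain.local is only a placeholder for whatever your local NTP host is):

# grep ^server /etc/ntp.conf [it should list ntp.mydomain.local, i.e. your local NTP host]
# ntpq -p [the local server should show up as a peer and, after a while, carry the * sync marker]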

DISK PARTITIONING -> Disk Druid for my 147 GB SCSI disk

/boot : 128 MB
/ : 15 GB
/usr/local : 20 GB (for manual installation of Globus and the Torque scheduler)
/var : 25 GB (I expect a bit more logging than usual)
swap : 2 GB
/myspace : 10 GB (for the non-cluster/local users' home directories)
/export : fill the available space

Now the installation is over; the system booted up, and no color (no GUI) :-)

# system-config-display

To tell the truth, I had an issue here, and I did not want to see smoke coming out of the back of my flat BenQ. What I did was simply copy
the /etc/X11/xorg.conf file from another system with the "same" hardware, loaded with RedHat-AS-4.
I repeat… Linux, it's a large file ! ;-)
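
Something along these lines does the trick (rhel4-box is only a placeholder for the donor machine with the identical hardware):

# cp /etc/X11/xorg.conf /etc/X11/xorg.conf.orig [keep the generated one, just in case]
# scp root@rhel4-box:/etc/X11/xorg.conf /etc/X11/ [pull the working config from the identical box]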

Oh, yeah, the monitor: it's a single BenQ flat panel, shared across the systems with an ATEN KVM switch.

# startx
…hooray! I got the color. (When you log in, the only difference I noticed is that there isn't any Red Hat logo but a CentOS one, and the GRUB screen is different… so Luke… it's our shadow-man !)
…then I stopped the smartd service.
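
In case you want to do the same, stopping it now and keeping it off after a reboot should be just:

# service smartd stop [stop it for the current session]
# chkconfig smartd off [don't bring it back on the next boot]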

Compute Node Installation

I wanted control over the compute node installation, at least over the partitioning.

# cd /home/install/site-profiles/4.1/nodes/
Copy skeleton.xml to extend-auto-partition.xml and edit extend-auto-partition.xml.
++ refer to the manual ^

I tried editing the manual partitioning option in the XML, but it behaved in a strange and weird way, so I went with extend-auto-partition.xml.
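
For reference, an extend-auto-partition.xml for a layout like mine looks roughly like the snippet below; sizes are in MB, sda assumes SCSI disks on the compute nodes too, and /state/partition1 is just an example mount point, so treat it as a sketch against the 4.1 manual rather than gospel:

<?xml version="1.0" standalone="no"?>
<kickstart>
  <main>
    <part> / --size 8000 --ondisk sda </part>
    <part> swap --size 2000 --ondisk sda </part>
    <part> /state/partition1 --size 1 --grow --ondisk sda </part>
  </main>
</kickstart>

The --grow flag tells the installer to give that last partition whatever space is left on the disk.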

# cd /home/install; rocks-dist dist [to apply this configuration to the distribution]
# insert-ethers
If your frontend and compute nodes are connected via a managed ethernet switch, you'll want to select 'Ethernet Switches' from the list above. This is because the default behavior of many managed ethernet switches is to issue DHCP requests in order to receive an IP address that clients can use to configure and monitor the switch.

When insert-ethers captures the DHCP request for the managed switch, it will configure it as an ethernet switch and store that information in the MySQL database on the frontend.

As a side note, you may have to wait several minutes before the ethernet switch broadcasts its DHCP request. After 10 minutes (or once insert-ethers has correctly detected and configured the ethernet switch), you should quit insert-ethers by hitting the F10 key.

Now, restart insert-ethers and continue reading the user guide for a procedure on how to configure your compute nodes.

# insert-ethers
and choose Compute, then wait [ really, I felt I needed patience throughout the set-up ]. After putting the Base CD into your compute node, restart it and boot from the CD.
That's it (do remember you have got a PXE boot option, in case your CD drive is out of order :) )

It's fast… pretty fast: I finished installing my 2 compute nodes simultaneously in 3 minutes.

You can monitor the installation of the compute nodes by using ssh on port 2200.

# ssh compute-0-0 -p 2200

Once the installation is over,
login: root
password: { frontend's root password }

# df -h; free
Good, all the partitions and the swap space are correct.

NO… IT'S NOT CORRECT
…really… go to the frontend:
0. Check the XML file (my problem was that I put a backslash instead of a forward slash before "part" in the closing tag); what's yours…?
1. # cd /home/install; rocks-dist dist [to apply this configuration to the distribution]
2. # rocks-partition --list --delete --nodename {compute node's hostname}
3. Use the nukeit.sh script to remove the .rocks-release file from the first partition of each disk on the compute nodes (a sketch of nukeit.sh is given a few lines below).
4. # ssh {compute node's hostname} 'sh /home/install/sbin/nukeit.sh'
5. # ssh {compute node's hostname} '/boot/kickstart/cluster-kickstart'

After step 5 the compute node restarts; check that the default GRUB option is the re-install one and go ahead with ENTER.
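
I don't have the original script handy here, but a minimal sketch of what such a nukeit.sh can look like (walk every mounted filesystem on the compute node and delete the .rocks-release marker if it is there) is:

#!/bin/sh
# walk the mounted filesystems and drop the .rocks-release marker wherever it exists
for fs in `df -P | awk 'NR>1 {print $6}'`
do
    if [ -f "$fs/.rocks-release" ]
    then
        rm -f "$fs/.rocks-release"
    fi
done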

Hiccup Session
0. How do I run my Linpack HPL.dat?
Luke… refer to the manual.
1. How do I change the frontend's public IP address?

Don't type the {} literally.

# echo 'update app_globals set value="{newip}" where value="{oldip}"' | mysql -u apache cluster
# echo 'update networks set IP="{newIP}" where IP="{oldIP}"' | mysql -u apache cluster
# insert-ethers --update
2. My Ganglia status shows that all/some of my compute nodes are dead, but actually they are running.
If you tried the following…

[root@rocks mongoose]# cluster-fork /bin/date ; date
compute-0-0:
Sat Jul 8 04:30:39 IST 2006
compute-0-1:
Sat Jul 8 04:30:39 IST 2006

Sat Jul 8 04:30:39 IST 2006

[root@rocks mongoose]# cluster-fork service gmond restart
compute-0-0:
Shutting down GANGLIA gmond: [ OK ]
Starting GANGLIA gmond: [ OK ]
compute-0-1:
Shutting down GANGLIA gmond: [ OK ]
Starting GANGLIA gmond: [ OK ]

[root@rocks mongoose]# service gmond restart
Shutting down GANGLIA gmond: [ OK ]
Starting GANGLIA gmond: [ OK ]

[root@rocks mongoose]# service gmetad restart
Shutting down GANGLIA gmetad: [ OK ]
Starting GANGLIA gmetad: [ OK ]

I refreshed the Ganglia web page
…then it showed Hosts Up = 1 (the frontend), and after a while the 1 changed to 2…
then after some time it showed
Hosts Up: 2 and Hosts Down: 1
and now the case is back to Hosts Up = 1 and Hosts Down = 2.
Check that multicast is enabled on your switch; blocking it on the networking device may cause this problem.
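
One quick way to see whether the gmond multicast traffic is actually reaching the frontend is to sniff for it on the cluster-side interface; 239.2.11.71 is Ganglia's usual default multicast channel and eth0 is an assumption, so adjust both to your gmond.conf and your NIC layout:

# tcpdump -n -i eth0 host 239.2.11.71 [you should see periodic UDP packets from every compute node]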

3. How do I manually broadcast a 411 update instead of waiting for the hourly update?

# make -C /var/411 force
[You may have to use this right after creating a cluster user on the frontend, to get it propagated across the nodes]
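
For example, creating a new cluster user and pushing it out right away looks like this (jdoe is just a placeholder username):

# useradd jdoe [create the user on the frontend]
# passwd jdoe [set a password]
# make -C /var/411 force [push the updated login files to the compute nodes through 411]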

Disclaimer

All of the above material was tested in a real environment, though Your Mileage May Vary (YMMV).
