Install Ganglia

Ganglia (http://ganglia.info/) is a monitoring application.

Gmond Configuration on the Ganglia wiki

1.  Build and install

SourceForge has RPMs for Ganglia 3.1.1, but only for i386, while some of the required libraries are only available as x86_64 RPMs. We therefore rebuild Ganglia 3.1.2 from source.

These were helpful for this install:

1.1  prepare Apache Portable Runtime 1.2.8

Ganglia needs a newer APR than the one available on SL4.

[root@gridvm ganglia]# pwd
/nfs/data/config/ganglia
[root@gridvm ganglia]# wget http://www.apache.org/dist/apr/binaries/rpm/SRPMS/apr-1.2.8-1.src.rpm
[root@gridvm ganglia]# rpmbuild --rebuild apr-1.2.8-1.src.rpm 
[root@gridvm ganglia]# rpm -ivh /usr/src/redhat/RPMS/x86_64/apr-1.2.8-1.x86_64.rpm /usr/src/redhat/RPMS/x86_64/apr-devel-1.2.8-1.x86_64.rpm

1.2  prepare RRD

RPMs for RRD seem to be available only for x86_64.

[root@gridvm ganglia]# yum install --enablerepo dag rrdtool rrdtool-devel

1.3  prepare libConfuse

libConfuse
[root@gridvm ganglia]# wget http://bzero.se/confuse/confuse-2.6.tar.gz
[root@gridvm ganglia]# yum install xmlto
Installed: 
 xmlto.x86_64 0:0.0.18-4
Dependency Installed: 
 docbook-style-xsl.noarch 0:1.65.1-2 
 netpbm-progs.x86_64 0:10.25-2.1.el4_7.4 
 passivetex.noarch 0:1.25-3 
 tetex.x86_64 0:2.0.2-22.0.1.EL4.10  
 tetex-dvips.x86_64 0:2.0.2-22.0.1.EL4.10 
 tetex-fonts.x86_64 0:2.0.2-22.0.1.EL4.10 
 tetex-latex.x86_64 0:2.0.2-22.0.1.EL4.10 
 xmltex.noarch 0:20020625-3
[root@gridvm ganglia]#  rpmbuild -ta confuse-2.6.tar.gz 

The spec file shipped in the tarball does not work. It is better to start from the Fedora SRPM, which only needs the BuildRequires line commented out in its spec file:

[root@gridvm ganglia]# wget ftp://download.fedora.redhat.com/pub/fedora/linux/development/source/SRPMS/libconfuse-2.6-2.fc11.src.rpm
[root@gridvm ganglia]# rpm --nomd5 -ivh libconfuse-2.6-2.fc11.src.rpm
[root@gridvm ganglia]# grep BuildRequires /usr/src/redhat/SPECS/libconfuse.spec
#BuildRequires:  check-devel, pkgconfig
[root@gridvm ganglia]# rpmbuild -ba /usr/src/redhat/SPECS/libconfuse.spec 
[root@gridvm ganglia]# rpm -ivh /usr/src/redhat/RPMS/x86_64/libconfuse-devel-2.6-2.x86_64.rpm \
 /usr/src/redhat/RPMS/x86_64/libconfuse-2.6-2.x86_64.rpm

Now we can uninstall the packages that were only needed for building:

[root@gridvm ganglia]# yum remove xmlto tetex netpbm-progs docbook-style-xsl tetex-dvips  tetex-fonts

1.4  prepare libart

[root@gridvm ganglia]# yum install libart_lgpl-devel
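Before building Ganglia itself, it can help to confirm that the prerequisites from steps 1.1 to 1.4 are really installed. The following is a minimal sketch, not part of the original procedure: `missing_pkgs` is a hypothetical helper name, and the package list in the usage comment is taken from the install commands above.

```shell
# Hypothetical helper (not part of Ganglia or RPM).
# Reads a list of installed package names on stdin, one per line, and
# prints every package named in the arguments that is not in that list.
missing_pkgs() {
    installed=$(cat)
    for pkg in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        printf '%s\n' "$installed" | grep -qxF "$pkg" || echo "$pkg"
    done
}

# Usage on gridvm (prints nothing when everything is installed):
#   rpm -qa --qf '%{NAME}\n' | missing_pkgs apr apr-devel rrdtool \
#       rrdtool-devel libconfuse libconfuse-devel libart_lgpl-devel
```

Any name it prints is a package that still has to be installed before the rpmbuild in 1.5.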

1.5  build Ganglia RPMs

 
[root@gridvm ganglia]# wget http://garr.dl.sourceforge.net/sourceforge/ganglia/ganglia-3.1.2.tar.gz
[root@gridvm ganglia]# rpmbuild -ta ganglia-3.1.2.tar.gz 
[root@gridvm x86_64]# yum localinstall ganglia-gmond-3.1.2-1.x86_64.rpm \
 libganglia-3_1_0-3.1.2-1.x86_64.rpm ganglia-gmetad-3.1.2-1.x86_64.rpm 
[root@gridvm ganglia]# yum localinstall ganglia-web-3.1.1-1.noarch.rpm    

2.  Configure on gridvm

We configure one Ganglia Grid, "UJRC", with two Clusters, "UJRC head" and "UJRC WNs". Both are collected by the gmetad on gridvm.

/etc/ganglia/gmetad.conf

data_source "UJRC head" localhost
data_source "UJRC WNs"  wn001 wn004 wn007
gridname "UJRC"

It's important to keep the gmond on gridvm from talking to the gmonds on the WNs. If they do, the clusters add together in unfunny ways, and the grid ends up with twice the real number of CPUs. To keep them separate, use two different multicast IPs, and bind the multicast IP on gridvm to the interface outside of the cluster (eth0 below).

/etc/ganglia/gmond.conf

cluster {
  name = "UJRC head"
  owner = "University of Johannesburg"
  latlong = "S26.18 E27.99"
  url = "http://physics.uj.ac.za/cluster"
}
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}
[root@gridvm sysconfig]# /sbin/route add -host 239.2.11.72 dev eth0
[root@gridvm sysconfig]# service gmond restart

To make the route permanent, add it to /etc/sysconfig/network-scripts/route-eth0:

239.2.11.72 dev eth0
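To confirm that the multicast route is in place (now, and again after a reboot), the routing table can be checked for the host route. This is a small sketch; `has_mcast_route` is a hypothetical helper name, and it assumes the usual `route -n` layout with the destination in the first column and the interface in the last.

```shell
# Hypothetical helper: succeeds if the routing table read from stdin
# contains a line whose first field is ADDR and whose last field is DEV.
has_mcast_route() {
    awk -v addr="$1" -v dev="$2" \
        '$1 == addr && $NF == dev { found = 1 } END { exit !found }'
}

# Usage on gridvm:
#   /sbin/route -n | has_mcast_route 239.2.11.72 eth0 && echo "route present"
```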

Open the firewall on gridvm for Ganglia:

iptables -I DMZ 5 -p udp --dport 8649 -j ACCEPT
[root@gridvm ganglia]# chkconfig --add gmond
[root@gridvm ganglia]# service gmond restart
Shutting down GANGLIA gmond:                               [FAILED]
Starting GANGLIA gmond:                                    [  OK  ]
[root@gridvm ganglia]# chkconfig --add gmetad
[root@gridvm ganglia]# service gmetad restart
Shutting down GANGLIA gmetad:                              [FAILED]
Starting GANGLIA gmetad:                                   [  OK  ]
[root@gridvm ganglia]# service httpd restart
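Once the daemons are running, both gmond and gmetad can be queried for their XML state, which gives a quick check independent of the web frontend. The sketch below makes several assumptions that are not in the configuration shown above: that gmond.conf still has its default tcp_accept_channel on port 8649, that gmetad uses its default xml_port 8651, and that `nc` (netcat) is installed. `count_tags` is a hypothetical helper name.

```shell
# Hypothetical helper: counts opening tags of the given element name
# (e.g. HOST or CLUSTER) in the XML read from stdin.
count_tags() {
    # grep -o prints each match on its own line; wc -l counts them.
    grep -o "<$1 " | wc -l | tr -d ' '
}

# Usage on gridvm (ports are the assumed defaults, see above):
#   nc localhost 8649 | count_tags HOST      # hosts seen by the local gmond
#   nc localhost 8651 | count_tags CLUSTER   # clusters collected by gmetad
```

With the setup described here, the second command would be expected to report the two clusters once gmetad has polled both data sources.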

3.  Install and configure on WNs

[ADM@gridvm WN]$ ./WNsh yum -y localinstall \
/nfs/data/PKG/libganglia-3_1_0-3.1.2-1.x86_64.rpm \
/nfs/data/PKG/ganglia-gmond-3.1.2-1.x86_64.rpm \
/nfs/data/PKG/libconfuse-2.6-2.x86_64.rpm \
/nfs/data/PKG/apr-1.2.8-1.x86_64.rpm

Configure /etc/ganglia/gmond.conf for the WNs, starting from a copy of the one on gridvm:

[ADM@gridvm WN]$ mkdir etc/ganglia
[ADM@gridvm WN]$ cp /etc/ganglia/gmond.conf etc/ganglia

WN/etc/ganglia/gmond.conf

cluster {
  name = "UJRC WNs"
  owner = "University of Johannesburg"
  latlong = "S26.18 E27.99"
  url = "http://physics.uj.ac.za/cluster"
}
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
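Because the WN file starts as a copy of the head node's gmond.conf, it is easy to leave the multicast address unchanged, which is exactly the mix-up section 2 warns about. A minimal sketch of a check to run before distributing the file follows; `mcast_addrs` and `mcast_disjoint` are hypothetical helper names, and the parsing assumes one `mcast_join = a.b.c.d` setting per line, as in the files above.

```shell
# Hypothetical helper: prints the distinct mcast_join addresses in a gmond.conf.
mcast_addrs() {
    sed -n 's/^[[:space:]]*mcast_join[[:space:]]*=[[:space:]]*\([0-9.]*\).*/\1/p' "$1" | sort -u
}

# Hypothetical helper: succeeds only if the two files share no multicast address.
mcast_disjoint() {
    a=$(mcast_addrs "$1")
    b=$(mcast_addrs "$2")
    for addr in $a; do
        for other in $b; do
            [ "$addr" = "$other" ] && return 1
        done
    done
    return 0
}

# Usage from the WN directory on gridvm:
#   mcast_disjoint /etc/ganglia/gmond.conf etc/ganglia/gmond.conf \
#       && echo "clusters use separate multicast groups"
```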

Copy the gmond.conf to all the WNs, and start gmond:

[ADM@gridvm WN]$ ./configsync 
[ADM@gridvm WN]$ ./WNsh chkconfig gmond on
[ADM@gridvm WN]$ ./WNsh service gmond start

4.  Be patient

Ganglia takes a bit of time to collect data. If there is no cluster, or if your nodes do not appear, wait a bit. Don't try to debug a non-problem.
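Rather than reloading the web page, the waiting can be scripted. The sketch below polls a command that prints Ganglia XML until enough hosts show up. `wait_for_hosts` is a hypothetical helper name. The port 8651 (gmetad's default xml_port), the use of `nc`, and the host count of 4 in the usage comment are assumptions to adapt to the real setup.

```shell
# Hypothetical helper: wait_for_hosts CMD MIN TRIES DELAY
# Runs CMD, which should print gmond or gmetad XML, up to TRIES times,
# sleeping DELAY seconds between attempts, until at least MIN <HOST>
# entries are reported.
wait_for_hosts() {
    cmd=$1; min=$2; tries=$3; delay=$4
    i=0
    n=0
    while [ "$i" -lt "$tries" ]; do
        n=$($cmd 2>/dev/null | grep -o '<HOST ' | wc -l | tr -d ' ')
        if [ "$n" -ge "$min" ]; then
            echo "$n hosts reporting"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "still only $n hosts after $tries tries" >&2
    return 1
}

# Usage on gridvm: poll every 15 seconds, for up to 5 minutes
#   wait_for_hosts "nc localhost 8651" 4 20 15
```

Only if this still fails after several minutes is it worth looking at routes, firewall rules and multicast addresses.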

Page last modified on April 27, 2009, at 10:55 PM