Install Ganglia

Ganglia (http://ganglia.info/) is a monitoring application.

These were helpful for this install:
- Gmond Configuration on the Ganglia wiki

Build and install

On SourceForge there are RPMs for Ganglia 3.1.1, but they are only for i386, and some needed libraries are only available as x86_64 RPMs, so we rebuild Ganglia 3.1.2.
Prepare Apache Portable Runtime 1.2.8

Ganglia needs a newer APR than the one available on SL4.

[root@gridvm ganglia]# pwd
/nfs/data/config/ganglia
[root@gridvm ganglia]# wget http://www.apache.org/dist/apr/binaries/rpm/SRPMS/apr-1.2.8-1.src.rpm
[root@gridvm ganglia]# rpmbuild --rebuild apr-1.2.8-1.src.rpm
[root@gridvm ganglia]# rpm -ivh /usr/src/redhat/RPMS/x86_64/apr-1.2.8-1.x86_64.rpm /usr/src/redhat/RPMS/x86_64/apr-devel-1.2.8-1.x86_64.rpm

Prepare RRD

RPMs for RRD only seem to be available as x86_64.

[root@gridvm ganglia]# yum install --enablerepo dag rrdtool rrdtool-devel

Prepare libConfuse

[root@gridvm ganglia]# wget http://bzero.se/confuse/confuse-2.6.tar.gz
[root@gridvm ganglia]# yum install xmlto
Installed: xmlto.x86_64 0:0.0.18-4
Dependency Installed: docbook-style-xsl.noarch 0:1.65.1-2 netpbm-progs.x86_64 0:10.25-2.1.el4_7.4 passivetex.noarch 0:1.25-3 tetex.x86_64 0:2.0.2-22.0.1.EL4.10 tetex-dvips.x86_64 0:2.0.2-22.0.1.EL4.10 tetex-fonts.x86_64 0:2.0.2-22.0.1.EL4.10 tetex-latex.x86_64 0:2.0.2-22.0.1.EL4.10 xmltex.noarch 0:20020625-3
[root@gridvm ganglia]# rpmbuild -ta confuse-2.6.tar.gz

The spec file from the distribution does not work.
It is better to start from the Fedora SRPM, which only needs the BuildRequires line commented out in the spec file:

[root@gridvm ganglia]# wget ftp://download.fedora.redhat.com/pub/fedora/linux/development/source/SRPMS/libconfuse-2.6-2.fc11.src.rpm
[root@gridvm ganglia]# rpm --nomd5 -ivh libconfuse-2.6-2.fc11.src.rpm
[root@gridvm ganglia]# grep BuildRequires /usr/src/redhat/SPECS/libconfuse.spec
#BuildRequires: check-devel, pkgconfig
[root@gridvm ganglia]# rpmbuild -ba /usr/src/redhat/SPECS/libconfuse.spec
[root@gridvm ganglia]# rpm -ivh /usr/src/redhat/RPMS/x86_64/libconfuse-devel-2.6-2.x86_64.rpm \
    /usr/src/redhat/RPMS/x86_64/libconfuse-2.6-2.x86_64.rpm

Now we can uninstall the packages that were only needed for building:

[root@gridvm ganglia]# yum remove xmlto tetex netpbm-progs docbook-style-xsl tetex-dvips tetex-fonts

Prepare libart

[root@gridvm ganglia]# yum install libart_lgpl-devel

Build Ganglia RPMs

[root@gridvm ganglia]# wget http://garr.dl.sourceforge.net/sourceforge/ganglia/ganglia-3.1.2.tar.gz
[root@gridvm ganglia]# rpmbuild -ta ganglia-3.1.2.tar.gz
[root@gridvm x86_64]# yum localinstall ganglia-gmond-3.1.2-1.x86_64.rpm \
    libganglia-3_1_0-3.1.2-1.x86_64.rpm ganglia-gmetad-3.1.2-1.x86_64.rpm
[root@gridvm ganglia]# yum localinstall ganglia-web-3.1.1-1.noarch.rpm

Configure on gridvm

We configure one Ganglia grid, "UJRC", with two clusters, "UJRC head" and "UJRC WNs". Both are collected by the gmetad on gridvm.
data_source "UJRC head" localhost
data_source "UJRC WNs" wn001 wn004 wn007
gridname "UJRC"

It is important to prevent the gmond on gridvm from talking to the gmond on the WNs. If this happens, the clusters add together in unfunny ways, and the grid ends up having twice the real number of CPUs. To keep them separate, use two different multicast IPs, and bind the multicast IP to the interface outside of the cluster.
cluster {
name = "UJRC head"
owner = "University of Johannesburg"
latlong = "S26.18 E27.99"
url = "http://physics.uj.ac.za/cluster"
}
udp_send_channel {
mcast_join = 239.2.11.72
port = 8649
ttl = 1
}
udp_recv_channel {
mcast_join = 239.2.11.72
port = 8649
bind = 239.2.11.72
}
[root@gridvm sysconfig]# /sbin/route add -host 239.2.11.72 dev eth0
[root@gridvm sysconfig]# service gmond restart

Permanent setting:

239.2.11.72 dev eth0

Open the firewall on gridvm for Ganglia:

iptables -I DMZ 5 -p udp --dport 8649 -j ACCEPT

[root@gridvm ganglia]# chkconfig --add gmond
[root@gridvm ganglia]# service gmond restart
Shutting down GANGLIA gmond: [FAILED]
Starting GANGLIA gmond: [ OK ]
[root@gridvm ganglia]# chkconfig --add gmetad
[root@gridvm ganglia]# service gmetad restart
Shutting down GANGLIA gmetad: [FAILED]
Starting GANGLIA gmetad: [ OK ]
[root@gridvm ganglia]# service httpd restart

Install and configure on WNs

[ADM@gridvm WN]$ ./WNsh yum -y localinstall \
    /nfs/data/PKG/libganglia-3_1_0-3.1.2-1.x86_64.rpm \
    /nfs/data/PKG/ganglia-gmond-3.1.2-1.x86_64.rpm \
    /nfs/data/PKG/libconfuse-2.6-2.x86_64.rpm \
    /nfs/data/PKG/apr-1.2.8-1.x86_64.rpm

Configure gmond for the WNs, starting from a copy of the gridvm configuration:

[ADM@gridvm WN]$ mkdir etc/ganglia
[ADM@gridvm WN]$ cp /etc/ganglia/gmond.conf etc/ganglia
cluster {
name = "UJRC WNs"
owner = "University of Johannesburg"
latlong = "S26.18 E27.99"
url = "http://physics.uj.ac.za/cluster"
}
udp_send_channel {
mcast_join = 239.2.11.71
port = 8649
ttl = 1
}
udp_recv_channel {
mcast_join = 239.2.11.71
port = 8649
bind = 239.2.11.71
}
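With both gmond configurations written, it can be worth checking that the two clusters really use different multicast groups before starting the daemons. The following is only a minimal sketch: mcast_groups and clusters_separated are hypothetical helper names, not part of Ganglia, and the parsing assumes unquoted "mcast_join = a.b.c.d" lines as in the fragments above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Ganglia.

# Print the distinct mcast_join addresses found in a gmond.conf-style file.
# Assumes unquoted "mcast_join = a.b.c.d" lines; commented-out lines are skipped.
mcast_groups() {
    sed -n 's/^[[:space:]]*mcast_join[[:space:]]*=[[:space:]]*\([0-9.]*\).*/\1/p' "$1" | sort -u
}

# Succeed only if each file uses exactly one multicast group
# (send and receive channels agree) and the two files use different groups.
clusters_separated() {
    [ "$(mcast_groups "$1" | awk 'END { print NR }')" -eq 1 ] || return 1
    [ "$(mcast_groups "$2" | awk 'END { print NR }')" -eq 1 ] || return 1
    [ "$(mcast_groups "$1")" != "$(mcast_groups "$2")" ]
}

# Example, using the paths from this page (run from the WN config directory):
#   clusters_separated /etc/ganglia/gmond.conf etc/ganglia/gmond.conf \
#       && echo "head and WNs use separate multicast groups"
```

The check only looks at the multicast addresses. It does not verify the host route or the bind setting.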
Copy the configuration to the WNs and start gmond:

[ADM@gridvm WN]$ ./configsync
[ADM@gridvm WN]$ ./WNsh chkconfig gmond on
[ADM@gridvm WN]$ ./WNsh service gmond start

Be patient

Ganglia takes a bit of time to collect data. Be patient: if there is no cluster, or if your nodes do not appear, wait a bit. Don't try to debug a non-problem.
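If nodes still do not appear after waiting a while, one way to see what a gmond currently knows is to read its XML dump. This is a minimal sketch under assumptions: it assumes gmond still has its default tcp_accept_channel on port 8649 (not shown in the configuration fragments on this page) and that nc is installed. list_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ganglia.

# Read gmond XML on stdin and print "cluster<TAB>host" for every HOST element.
# Relies on gmond writing each CLUSTER and HOST opening tag on its own line,
# with NAME as the first attribute.
list_hosts() {
    awk '
        /<CLUSTER / {
            if (match($0, /NAME="[^"]*"/))
                cluster = substr($0, RSTART + 6, RLENGTH - 7)
        }
        /<HOST / {
            if (match($0, /NAME="[^"]*"/))
                printf "%s\t%s\n", cluster, substr($0, RSTART + 6, RLENGTH - 7)
        }
    '
}

# Example (assumes the default TCP accept channel on port 8649):
#   nc localhost 8649 | list_hosts
```

Run on gridvm this should list only the head node, and run on a WN it should list only the WNs. Hosts from both clusters in one dump would suggest the two gmonds are talking to each other.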