Install NTP (time synchronisation)

Introduction

NTP, the Network Time Protocol, keeps the clocks synchronised across different systems. This is required whenever two or more machines need to operate "together", for example when sharing a file system.

Basic NTP configuration on gridvm and CEs

This configuration suits hosts that have full network access, such as gridvm, the CEs and the UIs. Since we use gridvm as a backup, it is fine for it to have a local clock source as a fallback in case the external NTP server is unavailable. Note that we do not try to establish a network of NTP peers here, because the other hosts are VMs and therefore not very reliable for timing.
driftfile /var/lib/ntp/drift
restrict default nomodify notrap noquery
restrict 127.0.0.1
server apk-gridvm-01.uj.ac.za
server ntp.is.co.za
restrict ntp.is.co.za mask 255.255.255.255 nomodify notrap noquery
server 127.127.1.0
fudge 127.127.1.0 stratum 10
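A quick way to sanity-check a file like the one above is to list its time sources and confirm that the local clock (127.127.1.0), if present, also has a fudge line setting its stratum, so that it only acts as a fallback. The sketch below is an illustration only: the helper name ntp_conf_check is hypothetical, and the path of the file to inspect is passed as its argument.

```shell
# Hypothetical helper: list the "server" entries of an NTP configuration file
# and warn if the local clock driver is used without a "fudge ... stratum" line.
ntp_conf_check() {
    awk '
        # Drop comments so that commented-out directives are ignored.
        { sub(/#.*/, "") }
        $1 == "server" {
            print "source: " $2
            sources++
            if ($2 == "127.127.1.0") local_clock = 1
        }
        # A fudge line for the local clock must carry "stratum <value>".
        $1 == "fudge" && $2 == "127.127.1.0" {
            for (i = 3; i < NF; i++) if ($i == "stratum") fudged = 1
        }
        END {
            if (sources == 0) { print "warning: no server lines found"; exit 1 }
            if (local_clock && !fudged) { print "warning: local clock has no fudge stratum"; exit 1 }
        }
    ' "$1"
}
```

The function prints one "source:" line per server and returns a non-zero status when it emits a warning, so it can be used in a shell condition.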

# see http://www.vmware.com/pdf/vmware_timekeeping.pdf
# see http://kb.vmware.com/kb/1006427
tinker panic 0
restrict default nomodify notrap noquery
restrict 127.0.0.1
server apk-gridvm-01.uj.ac.za
server ntp.is.co.za
restrict ntp.is.co.za mask 255.255.255.255 nomodify notrap noquery

Configure NTP with broadcasts for WNs

For the Worker Nodes we do not want client-server communication, because it puts more packets on the network and more load on the server than plain broadcast. See http://docsrv.sco.com/NET_tcpip/ntpC.complete_scenarios.html

Remember to disable [...]

Add to the broadcast server configuration:

broadcastdelay 0.008
broadcast 10.0.0.255 key 1
restrict 10.0.0.0 mask 255.255.255.0 nomodify notrap
trustkey 1 65534 65534
requestkey 65534
controlkey 65535

Add to the broadcast client (WN) configuration:

broadcastclient
trustkey 1 65534 65534
requestkey 65534
controlkey 65535

10.0.0.254 127.127.1.0

Don't forget to make the shm "yum install -y ntp;chkconfig ntpd on"

NTP and VMware

NTP conflicts with the time synchronisation provided by VMware Guest Tools, which unfortunately seems to be enabled by default. Also, VMs tend to have larger-than-normal time drift and jitter, so the kernel must be told to use a specific clock source.

See Timekeeping in VMware Virtual Machines (PDF): http://www.vmware.com/pdf/vmware_timekeeping.pdf

Add the appropriate kernel clock option (clock=pmtmr here) to the kernel line of the boot entry:

title Scientific Linux CERN (2.6.9-67.EL.cernsmp)
root (hd0,0)
kernel /boot/vmlinuz-2.6.9-67.EL.cernsmp ro root=LABEL=/ clock=pmtmr
initrd /boot/initrd-2.6.9-67.EL.cernsmp.img

Shut down the VM and turn off VMware time sync (from the web console, Configure VM/Power/Advanced/Synchronize guest time with host, or by setting [...]).

Checking time synchronisation

Use ntpq to list the peers on each host:

[clusteradm@gridvm CE]$ ./Ash ntpq -pn
root@glite-ce:      remote           refid      st t when poll reach   delay   offset  jitter
root@glite-ce: ==============================================================================
root@glite-ce: +152.106.18.254  196.4.160.4      3 u   10   64  377    0.946  -39.703 117.412
root@glite-ce: *196.4.160.4     146.64.58.41     2 u    9   64  377    4.194  -25.400  76.069
root@osg-ce:      remote           refid      st t when poll reach   delay   offset  jitter
root@osg-ce: ==============================================================================
root@osg-ce: +152.106.18.254  196.4.160.4      3 u   62   64  377    0.136   -1.606   0.253
root@osg-ce: *196.4.160.4     146.64.58.41     2 u   60   64  377    4.099    1.330   0.663
root@osg-ui:      remote           refid      st t when poll reach   delay   offset  jitter
root@osg-ui: ==============================================================================
root@osg-ui: +152.106.18.254  196.4.160.4      3 u   17  256  377    0.139   -0.785   0.762
root@osg-ui: *196.4.160.4     146.64.58.41     2 u  148  256  377    4.509    1.657   0.593
root@glite-ui:      remote           refid      st t when poll reach   delay   offset  jitter
root@glite-ui: ==============================================================================
root@glite-ui: +152.106.18.254  196.4.160.4      3 u  104 1024  377    0.201   -3.363   0.901
root@glite-ui: *196.4.160.4     146.64.58.41     2 u  571 1024  377    4.587    0.307   0.628

A * marks the server that has been chosen as the reference, a + marks the backup candidates (NTP prefers servers at a lower stratum when they are available). The delay, offset and jitter are in milliseconds.
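
To check offsets across many hosts without reading every line, the peer listing can be filtered with a small shell function. This is only a sketch: the function name ntp_offset_report and the 10 ms default limit are assumptions, not part of the original setup. It reads ntpq -pn output on standard input, strips a leading "host: " prefix like the one ./Ash prints, and flags peers whose absolute offset exceeds the limit.

```shell
# Hypothetical helper: read "ntpq -pn" output on stdin and flag peers whose
# absolute offset (in milliseconds) is above the limit given as $1 (default 10).
ntp_offset_report() {
    awk -v limit="${1:-10}" '
        # Strip a "host: " prefix such as the one added by a cluster wrapper.
        { sub(/^[^ ]+: +/, "") }
        # Skip the column titles and the ===== rule.
        $1 == "remote" || $1 ~ /^=+$/ { next }
        NF >= 10 {
            peer = $1
            tally = " "
            # The tally code (*, +, -, # or .) is glued to the peer address.
            if (peer ~ /^[*+#.-]/) { tally = substr(peer, 1, 1); peer = substr(peer, 2) }
            offset = $9 + 0
            if (offset < 0) offset = -offset
            status = (offset > limit + 0) ? "HIGH" : "ok"
            if (status == "HIGH") bad = 1
            printf "%s %s offset=%s ms %s\n", tally, peer, $9, status
        }
        # Non-zero exit status when at least one peer is above the limit.
        END { exit bad + 0 }
    '
}
```

On a single host this could be run as "ntpq -pn | ntp_offset_report 10"; the non-zero exit status on a high offset makes it usable from a cron job or a monitoring check.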

sshd login timestamps in syslog

On the SLC nodes, ssh was logging some login entries in the UTC time zone:

Apr 6 09:37:39 glite-ce sshd[31026]: Accepted publickey for sergio from XX.33.130.98 port 55185 ssh2
Apr 6 07:37:39 glite-ce sshd[31027]: Accepted publickey for sergio from XX.33.130.98 port 55185 ssh2
Apr 6 09:37:39 glite-ce sshd(pam_unix)[31056]: session opened for user sergio by (uid=0)

The issue is that the child sshd process does not know the time zone. There is a permanent fix indicated in this ticket, but SLC has not applied it yet, so we use the other fix, copying /etc/localtime into /var/empty/sshd/etc:

[ADM@gridvm CE]$ ./Ash mkdir /var/empty/sshd/etc
[ADM@gridvm CE]$ ./Ash cp /etc/localtime /var/empty/sshd/etc

Luckily, the standard SL uses a different OpenSSH which does not log as much as the SLC one, so this problem only appeared on CEs and UIs.