
LHCG Tiers and cluster sizing

Some notes about the LHC Tier levels and the sizing of the cluster - or, how big does a cluster have to be to be useful for LHC work?

LHCG and other documents

Requirements for CERN LHC Tier2 clusters, from www.gridpp.ac.uk/tier2/Experiment_Tier-2s_v1.0.doc:

Experiment | Number of T1s | Number of T2s | Total T2 CPU (KSI2K) | Total T2 Disk (TB) | Average T2 CPU (KSI2K) | Average T2 Disk (TB) | Network In (Gb/s) | Network Out (Gb/s)
ALICE      | 6             | 21            | 13700                | 2600               | 652                    | 124                  | 0.010             | 0.600
ATLAS      | 10            | 30            | 16200                | 6900               | 540                    | 230                  | 0.140             | 0.034
CMS        | 6 to 10       | 25            | 20725                | 5450               | 829                    | 218                  | 1.000             | 0.100
LHCb       | 6             | 14            | 7600                 | 23                 | 543                    | 2                    | 0.008             | 0.008
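
The "Average T2" columns are simply the totals divided by the number of Tier2 sites. A minimal Python sketch (the numbers are copied from the table above; nothing else is assumed) reproduces them and converts the network requirements to Mbps:

    # Per-experiment Tier2 requirements from the GridPP table above:
    # (number of T2s, total CPU [KSI2K], total disk [TB], network in [Gb/s])
    requirements = {
        "ALICE": (21, 13700, 2600, 0.010),
        "ATLAS": (30, 16200, 6900, 0.140),
        "CMS":   (25, 20725, 5450, 1.000),
        "LHCb":  (14, 7600, 23, 0.008),
    }

    for exp, (n_t2, cpu, disk, net_in) in requirements.items():
        # average per site = total / number of T2s; 1 Gb/s = 1000 Mbps
        print(f"{exp:5s}: avg CPU {cpu / n_t2:4.0f} KSI2K, "
              f"avg disk {disk / n_t2:3.0f} TB, "
              f"network in {net_in * 1000:4.0f} Mbps")

For ATLAS this gives the 140 Mbps figure used in the sizing notes below.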

Sizing

NB: These notes are from July 2006. The landscape has changed in the meantime.

A full-scale Tier2 node would be quite expensive, and would require substantial manpower and expertise that is not locally available. It would also require a 140 Mbps connection (assuming ATLAS) to an overseas Tier1, which I suspect is either not available in SA or incredibly expensive.
The sizing of an average Tier2 node is probably excessive for the limited size of the HEP/LHC community in SA, and might only be justified on a regional scale (southern or whole Africa? northern African countries probably have better network connections to Europe or Israel than to SA).
I would suggest going for a 1/10-scale node (<= 50 CPU equivalents), which would be sufficient for all non-LHC computing requirements and a great testbed to build up local expertise in view of increasing involvement in LHC.
From an LHC perspective, this cluster could play the role of either a very small Tier2 or a pretty good Tier3.
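
As a rough illustration of what "1/10 scale" means against the ATLAS "average Tier2" column above (the 1/10 factor is the proposal made here, not a number from the GridPP document):

    # ATLAS "average Tier2" figures from the GridPP table above
    atlas_avg_t2 = {"cpu_ksi2k": 540, "disk_tb": 230, "net_in_gbps": 0.140}

    scale = 1.0 / 10.0   # proposed 1/10-scale node
    scaled = {key: value * scale for key, value in atlas_avg_t2.items()}

    print(f"CPU:     {scaled['cpu_ksi2k']:.0f} KSI2K")
    print(f"Disk:    {scaled['disk_tb']:.0f} TB")
    print(f"Network: {scaled['net_in_gbps'] * 1000:.0f} Mbps inbound")

That works out to roughly 54 KSI2K, 23 TB of disk and a 14 Mbps link - a far more modest target than the full average-Tier2 figures.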

  • 10 to 25 computing nodes (rough capacity estimate in the sketch after this list)
    • prefer an established supplier (like SUN, IBM, HP, Dell...)
    • single CPU, dual core AMD Opteron 175
    • 2GB ECC RAM
    • ~100GB SATA HD
    • Gigabit Ethernet
    • case: 1 rack-unit or blade (choose mostly on price)
  • 1 service node (front-end, management and file server)
    • single CPU, dual core Opteron or Xeon
    • 2GB ECC RAM
    • at least 500GB of mirrored or RAID-5 HD, but with space to grow
  • Gigabit Ethernet switch >10 ports, high performance
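
A back-of-the-envelope capacity estimate for the node counts in the list above is sketched here; the ~1.4 KSI2K per Opteron core figure is an assumed ballpark rating, not a benchmarked or quoted number:

    # Rough aggregate capacity of the proposed cluster
    KSI2K_PER_CORE = 1.4    # ASSUMED ballpark rating for a ~2.2 GHz Opteron core
    CORES_PER_NODE = 2      # single-socket, dual-core Opteron 175
    DISK_PER_NODE_TB = 0.1  # ~100 GB SATA disk per compute node

    for nodes in (10, 25):
        cpu_ksi2k = nodes * CORES_PER_NODE * KSI2K_PER_CORE
        disk_tb = nodes * DISK_PER_NODE_TB
        print(f"{nodes:2d} nodes: ~{cpu_ksi2k:3.0f} KSI2K, ~{disk_tb:.1f} TB local scratch disk")

With these assumptions, 10 to 25 nodes land in the few-tens-of-KSI2K range, i.e. the same ballpark as the 1/10-scale figure estimated above.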

Motivations:

  • 10 to 20 CPUs
    AMD Opteron dual core, single chip
    • AMD CPUs have better FP performance than Intel x86-class CPUs.
    • Intel Itanium CPUs have very good FP, but are not really i386-compatible.
    • Apple/IBM PowerPC CPUs have very good SIMD FP, but almost all non-theoretical HEP or Nuclear Physics computations are not vectorisable.
    • a dual-core CPU also benchmarks slightly better than a pair of single-core CPUs
    • we get 2 CPU cores using single-socket (1-CPU) motherboards - certainly cheaper
  • Gigabit Ethernet
  • established supplier (like SUN, IBM, HP, Dell)
    • Do-It-Yourself from "white boxes" is expensive in terms of qualified manpower and support
    • prefer a large (US?) supplier with a strong presence in SA
      stress the fact that this will also improve the skills of the supplier's SA workforce
    • SGI has fancy but expensive offerings
    • LinuxNetworx and other specialized Linux cluster suppliers are small and in the US
  • physical form factor
    • rack-mount is absolutely necessary to allow for growth
    • for fewer than 20 nodes a "blade" system is probably not interesting - saving rack space is not so important. But Blade Centers might have integrated management, lower power consumption and better cooling
    • normal 1U PCs might also be re-deployed to other tasks at End-Of-Life
  • Linux Distribution
    • Scientific Linux 4 ??
    • Most use RHEL 3 or SL 3, with kernel 2.4
  • Data Storage and File Systems?