Introducing: The Grid
Odds are, somewhere near you right now, computers are whirring all day and night frantically processing the latest ATLAS data. They’re part of a system called the Worldwide LHC Computing Grid, or just “the Grid” for short, and without them we’d drown in the data spat out by our detector. So what is the Grid, exactly, and how do we keep it so busy?
We always try to have a detailed simulation of the physics we’re trying to understand. This year, that means simulating about two billion proton collisions in ATLAS, which, if you were to start that on your laptop right now, would take somewhere around 15,000 years to finish. We don’t have that kind of time! Enter the Grid. The Grid ties together something like 100,000 laptop-equivalents sitting at universities and labs all around the world. We can ship chunks of data from our detector over to one of those computers, set it to work, and tell it to send us a message once it’s ready for the next chunk. It’s a kind of massively parallel processing that lets us churn through our data at a much higher rate than would ever be possible if we only used the computing at CERN.
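As a rough sanity check on those numbers, here is a back-of-the-envelope sketch in Python. The three inputs come straight from the paragraph above; everything else is an assumption made for illustration: the work is taken to split perfectly across machines, and the whole Grid is imagined doing nothing but simulation, which is not the case in practice since it also processes real data. The function names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the text.
LAPTOP_YEARS = 15_000        # time for one laptop to run the year's simulation
GRID_LAPTOPS = 100_000       # laptop-equivalents tied together by the Grid
COLLISIONS = 2_000_000_000   # simulated proton collisions this year

DAYS_PER_YEAR = 365.25
SECONDS_PER_YEAR = DAYS_PER_YEAR * 24 * 3600


def seconds_per_collision(laptop_years, collisions):
    """Average laptop time spent simulating a single collision."""
    return laptop_years * SECONDS_PER_YEAR / collisions


def ideal_grid_days(laptop_years, n_machines):
    """Wall-clock days if the work splits perfectly across n_machines.

    Assumption: no scheduling, transfer, or failure overhead.
    """
    return laptop_years * DAYS_PER_YEAR / n_machines


per_event = seconds_per_collision(LAPTOP_YEARS, COLLISIONS)
grid_days = ideal_grid_days(LAPTOP_YEARS, GRID_LAPTOPS)

print(f"One collision takes roughly {per_event / 60:.1f} laptop-minutes")
print(f"Ideal time on the Grid: roughly {grid_days:.0f} days")
```

Under those idealized assumptions, each simulated collision costs a few minutes of laptop time, and 15,000 laptop-years shrinks to a couple of months of wall-clock time. The real figure will be longer, because the machines are shared with other work and nothing parallelizes perfectly.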
We’ve come a long way in the last 30 years. For the discovery of the W boson at CERN in 1983, a picture of every event was sent to the “Megatek” facility in Switzerland, where one-by-one a person analyzed them by eye (really!). The Tevatron experiments need significant computing resources, but most of them can be hosted on-site at Fermilab near Chicago. Today, the LHC experiments are really the first physics experiments (that I know of!) that desperately need the Grid to stay operational. If we put the entire city of Geneva to work, each person would have to process three collision events an hour, 24 hours a day, 365 days a year, to keep up with all our recorded and simulated events! The Grid makes that task much more manageable – but not easy, by any stretch of the imagination.
There are people around the world constantly working to take care of the Grid and keep it running smoothly for us. One year’s worth of operation costs around 15M euros, give or take a bit, and we want to make sure we (and the taxpayers) get our money’s worth. About half of that goes straight to keeping the computers cool – without a fancy data center on the scale that Google and Apple build, it can be expensive to keep that many machines from overheating! No computer is retired if it can be upgraded or repaired cost-effectively. And we constantly work on improving our software, knowing that if we can make it 10 percent faster we can save over 1M euros. Well, that’s only half-true, of course; if our software runs faster, we’ll use the spare time to write even more papers!
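For the curious, the "over 1M euros" figure can be checked with a few lines of Python. This is only a sketch: it assumes the yearly cost scales linearly with the CPU time we use, which is a simplification, and it tries two readings of "10 percent faster", since the text does not say which one is meant.

```python
# Rough check of the savings figure quoted in the text.
ANNUAL_COST_EUR = 15_000_000   # approximate yearly operating cost (from the text)
COOLING_FRACTION = 0.5         # "about half" goes to cooling (from the text)
SPEEDUP = 0.10                 # "10 percent faster" (from the text)

# Share of the yearly budget spent on cooling.
cooling_cost = ANNUAL_COST_EUR * COOLING_FRACTION

# Reading 1: each job needs 10% less CPU time.
saving_less_time = ANNUAL_COST_EUR * SPEEDUP

# Reading 2: throughput rises by 10%, so the same work needs 1/1.1 of the time.
saving_more_throughput = ANNUAL_COST_EUR * (1 - 1 / (1 + SPEEDUP))

print(f"Cooling: about {cooling_cost / 1e6:.1f}M euros a year")
print(f"Saving if jobs take 10% less time: about {saving_less_time / 1e6:.2f}M euros")
print(f"Saving if throughput is 10% higher: about {saving_more_throughput / 1e6:.2f}M euros")
```

Either reading lands comfortably above 1M euros a year, so the claim holds under this simple linear-cost assumption.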
Despite all this fancy computing, we aren’t anywhere close to developing SkyNet – don’t worry. Apple’s and Google’s new data centers, which reportedly cost about $1B each, have 10 to 100 times the computing resources of our entire beloved Grid. Of course, we don’t have that kind of money! But sometimes, late at night, waiting for a computer hundreds or even thousands of miles away to phone home and tell me it’s done with its little job for the day, I dream of all we could do with just one of those buildings…
Zach Marshall
Zach Marshall is a research fellow at CERN. Over the last five years he has alternated between developing software for ATLAS and abusing that software for the good of physics analyses.






