Dave Charlton and his team have a mammoth job on their hands; Charlton has been tasked with coordinating the Full Dress Rehearsal (FDR) of the computing and data analysis processes of the ATLAS experiment, a run-through which he describes as “essential, almost as much as ensuring the detector itself actually works”.
A huge challenge facing all ATLAS contributors, the Dress Rehearsal will involve feeding samples of simulated data, designed to look just like the real data, into the system as if they originated from the detector itself.
Once ATLAS is up and running, real data will be fed out to the Grid and spread all over the world for permanent storage and analysis. The Grid is a global network of computers which, in the same way that the Internet is used to share information, will be used to share computing power and data storage capacity. ATLAS needs to use the Grid because of the sheer volume of data that will be recorded, and the immense amount of computing power that will be required to process it.
Initially, the CERN computing centre, known as Tier 0, will farm out data to ten scientific institutes and laboratories across the globe, known as Tier 1 centres. These will subsequently distribute it amongst local ‘clouds’ of Tier 2 centres — mainly academic institutions — associated with them. Between them, these three Tiers will reconstruct the data to build up a picture of the trajectories and energies of individual particles recorded by the detector, and analyse them to try to gain an understanding of what happened during the proton collision.
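The tiered fan-out described above can be pictured as a simple replication scheme: Tier 0 copies each dataset to every Tier 1 centre, and each Tier 1 passes it on to the Tier 2 sites in its local cloud. The sketch below is purely illustrative; the site names and counts are hypothetical, not the real ATLAS site list, and the actual Grid middleware is far more elaborate.

```python
# Hypothetical sketch of the three-tier distribution model: Tier 0 (CERN)
# ships each dataset to the Tier 1 centres, and each Tier 1 forwards it
# to the Tier 2 sites in its associated "cloud". Site names are made up.

TIER1_CLOUDS = {
    "Tier1-A": ["Tier2-A1", "Tier2-A2"],
    "Tier1-B": ["Tier2-B1"],
}

def distribute(dataset, clouds):
    """Return the list of (source, destination, dataset) transfers
    needed to replicate one dataset through all three tiers."""
    transfers = []
    for tier1, tier2_sites in clouds.items():
        transfers.append(("Tier0", tier1, dataset))        # CERN -> Tier 1
        for tier2 in tier2_sites:
            transfers.append((tier1, tier2, dataset))      # Tier 1 -> its Tier 2s
    return transfers

for src, dst, ds in distribute("run00042", TIER1_CLOUDS):
    print(f"{src} -> {dst}: {ds}")
```

With two Tier 1 clouds holding two and one Tier 2 sites respectively, the dataset generates five transfers in total; in reality ten Tier 1 centres and their clouds multiply this fan-out considerably.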
Until now, reconstructions of data have not been time-critical. The FDR will, for the first time, allow ATLAS collaborators to see whether or not the system can deliver these reconstructions within the short timescales required.
“When the LHC starts we want to analyse the data within a few days. We want a very quick feedback to see what’s good, what the problems were, if there’s any new physics, if there’s anything really exciting in there,” said Charlton. “To do that we must make sure this whole system works; that we can get the data out to people where they can look at it very quickly.”
A team of around ten people is needed to prepare the huge simulated data samples, a process so time-consuming that full FDRs can only be scheduled once every three months. Two one-week runs, in February and May, are planned between now and the switch-on of the Large Hadron Collider.
In addition to the dedicated team, there is a whole network of computing specialists ensuring that the systems work around the world. The FDR exercise is intended to pull together the work of the various groups, a particularly tricky endeavour, according to Charlton:
“It’s difficult to coordinate people because the different disciplines are used to working in different ways. There’s more hierarchy in computing, but less in physics analysis — if people see something interesting they have to have the freedom to be able to go off and investigate those things, because occasionally they can turn out to be very important.”
Charlton hopes that the FDR process will show that any problems with the detector can be highlighted quickly, and shifts in its performance can be tracked, before real data-taking begins.
“Although we know how to do that already, by taking data away and working on it for months, we don’t know how to do it in a day,” he said. “This is one of the things that, by doing the FDR, we should improve a lot. Either that or find that the model is broken. But if that is the case — better that we know about it!”