RESEARCH PRODUCT

Distributed Data Collection for the ATLAS EventIndex

S. González De La Hoz, J. Sánchez, A. Fernandez Casani

subject

History; Data collection; Database; Computer science; Snippet; computer.software_genre; Grid; Computer Science Applications; Education; Metadata; Pointer (computer programming); Data_FILES; computer; Particle Physics - Experiment

description

The ATLAS EventIndex contains records of all events processed by ATLAS, in all processing stages. Each record includes a reference to the file containing the event (the GUID of the file) and the internal “pointer” to the event within that file. This information is collected by all jobs that run at Tier-0 or on the Grid and process ATLAS events. Each job produces a snippet of information for each permanent output file. The snippets are packed and transferred to a central broker at CERN using an ActiveMQ messaging system; they are then unpacked, sorted and reformatted so that they can be stored and catalogued on a central Hadoop server. This contribution describes in detail the Producer/Consumer architecture that conveys this information from the running jobs through the messaging system to the Hadoop server.
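
The Producer/Consumer flow described above can be illustrated with a minimal Python sketch. It is not the ATLAS implementation: an in-memory `queue.Queue` stands in for the ActiveMQ broker, JSON plus zlib stands in for whatever packing format is actually used, and tab-separated rows stand in for the format stored in Hadoop. The names `EventRecord`, `run_number`, `event_number`, `pack_snippet`, `unpack_snippet`, `produce` and `consume` are hypothetical, and the choice of sort key is an assumption; only the file GUID and the per-event pointer come from the description.

```python
import json
import queue
import zlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EventRecord:
    """One event entry; run/event numbers are assumed fields for illustration."""
    run_number: int
    event_number: int
    file_guid: str   # GUID of the file containing the event
    pointer: str     # internal pointer to the event inside that file


def pack_snippet(records):
    """Pack the snippet for one output file into a compressed message body."""
    payload = json.dumps([asdict(r) for r in records]).encode("utf-8")
    return zlib.compress(payload)


def unpack_snippet(message):
    """Reverse of pack_snippet: recover the list of EventRecord objects."""
    decoded = json.loads(zlib.decompress(message).decode("utf-8"))
    return [EventRecord(**fields) for fields in decoded]


def produce(broker, records):
    """Producer side: a job sends one packed snippet per output file."""
    broker.put(pack_snippet(records))


def consume(broker):
    """Consumer side: drain the broker, then unpack, sort and reformat."""
    records = []
    while True:
        try:
            message = broker.get_nowait()
        except queue.Empty:
            break
        records.extend(unpack_snippet(message))
    # Sort key is an assumption made for this sketch.
    records.sort(key=lambda r: (r.run_number, r.event_number))
    return [
        f"{r.run_number}\t{r.event_number}\t{r.file_guid}\t{r.pointer}"
        for r in records
    ]


# Stand-in for the central broker; the real system uses ActiveMQ.
broker = queue.Queue()
produce(broker, [EventRecord(2, 7, "GUID-B", "entry-0")])
produce(broker, [EventRecord(1, 9, "GUID-A", "entry-1"),
                 EventRecord(1, 3, "GUID-A", "entry-0")])
rows = consume(broker)
```

Here two simulated jobs each send one snippet, and the consumer returns the three records as sorted rows ready to be written to storage. Decoupling the two sides through a broker means producers only need to reach the messaging system, not the storage server itself.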

http://cds.cern.ch/record/2004425