INAF - Osservatorio Astronomico di Trieste, Trieste

Miscellaneous Information

Abstract Reference: 30782
Identifier: P3.5
Presentation: Poster presentation
Key Theme: 3 New Trends in HPC and Distributed Computing

The Euclid Science Ground Segment distributed infrastructure: system integration and challenges

Frailis Marco, Belikov Andrey, Benson Kevin, Bonchi Andrea, Dabin Christophe, Fumana Marco, Grenet Catherine, Holliman Mark, Maggio Gianmarco, Maino Davide, McCracken Henry J., Melchior Martin, Piemonte Antonello, Polenta Gianluca, Poncet Maurice, Scala Paolo Luigi

The Science Ground Segment (SGS) of the Euclid mission provides distributed and redundant data storage and processing, federating 9 Science Data Centres (SDCs) and a Science Operations Centre. The SGS reference architecture is based on loosely coupled systems and services, broadly organized into a common infrastructure of transverse software components and the scientific data Processing Functions (PFs). The SGS common infrastructure includes: 1) the Euclid Archive System (EAS), a central metadata repository which inventories, indexes and localizes the large volume of distributed data; 2) a Distributed Storage System (DSS), providing a unified view of the SDCs' storage systems and supporting several transfer protocols; 3) an Infrastructure Abstraction Layer (IAL), isolating the scientific data processing software from the underlying IT infrastructure and providing a common, lightweight workflow management system; 4) a COmmon ORchestration System (COORS), performing a balanced distribution of data and processing among the SDCs. The Euclid scientific data processing levels are decomposed into 11 Processing Functions, which are the highest-level breakdown of the complete processing. They are developed by distributed teams, with the constraint that each PF pipeline must be able to run in any SDC. Virtualization is another key element of the SGS infrastructure. The EuclidVM is a lightweight virtual machine, deployed on any SDC processing node, with a reference OS, selected stable software libraries and "dynamic" installation of the Euclid PFs based on the CernVM-FS file system. We present the status of the Euclid SGS software infrastructure, the prototypes developed and the continuous system integration and testing performed through the Euclid "SGS Challenges".
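
The idea of a balanced distribution of processing among the SDCs can be illustrated with a minimal sketch. This is not the COORS algorithm, which the abstract does not describe; it is a generic greedy least-loaded assignment, and the names SDC, capacity, distribute and the job costs are all hypothetical, introduced only for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class SDC:
    """Hypothetical Science Data Centre (illustrative, not an SGS class)."""
    name: str
    capacity: float  # assumed relative share of processing power
    assigned: list = field(default_factory=list)
    load: float = 0.0  # total cost of the jobs assigned so far


def distribute(jobs, sdcs):
    """Assign each (job_name, cost) pair to the SDC whose
    capacity-normalised load is currently lowest, largest jobs first."""
    # Heap entries are (normalised load, tie-breaker index, SDC);
    # the unique index keeps SDC objects from ever being compared.
    heap = [(0.0, i, sdc) for i, sdc in enumerate(sdcs)]
    heapq.heapify(heap)
    for job_name, cost in sorted(jobs, key=lambda j: j[1], reverse=True):
        _, i, sdc = heapq.heappop(heap)
        sdc.assigned.append(job_name)
        sdc.load += cost
        heapq.heappush(heap, (sdc.load / sdc.capacity, i, sdc))
    return {sdc.name: sdc.assigned for sdc in sdcs}
```

For example, with two hypothetical centres SDC("SDC-A", 2.0) and SDC("SDC-B", 1.0) and jobs of cost 4, 3, 2 and 1, the centre with twice the capacity ends up with a total load of 7 against 3 for the smaller one. A real orchestrator would also have to account for where the input data already resides, which this sketch ignores.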