The Euclid Netherlands Science Data Centre
Euclid: the mission
Euclid is an ESA mission to map the geometry of the Universe and better understand the mysterious dark matter and dark energy, which make up most of the energy budget of the cosmos. The mission will investigate the distance-redshift relationship and the evolution of cosmic structures by measuring shapes and redshifts of galaxies and clusters of galaxies out to redshifts ~2, or equivalently to a look-back time of 10 billion years. In this way, Euclid will cover the entire period over which dark energy played a significant role in accelerating the expansion of the Universe.
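The correspondence between a redshift of ~2 and a look-back time of about 10 billion years depends on the assumed cosmological model. As a rough illustration, the sketch below integrates the standard flat ΛCDM look-back time expression numerically. The parameter values (a Hubble constant of 70 km/s/Mpc and a matter density of 0.3) are round numbers assumed for this example only; they are not Euclid results, and the function names are hypothetical.

```python
import math

# Assumed illustrative parameters (not taken from the text above).
H0_KM_S_MPC = 70.0        # Hubble constant in km/s/Mpc
OMEGA_M = 0.3             # matter density parameter
OMEGA_L = 1.0 - OMEGA_M   # flat universe: dark energy fills the remainder

KM_PER_MPC = 3.0856775814913673e19
SECONDS_PER_GYR = 3.15576e16  # 1e9 Julian years


def hubble_time_gyr(h0=H0_KM_S_MPC):
    """Return 1/H0 expressed in billions of years."""
    return KM_PER_MPC / h0 / SECONDS_PER_GYR


def lookback_time_gyr(z, steps=10_000):
    """Look-back time to redshift z, using midpoint-rule integration of
    dz / ((1 + z) * E(z)) with E(z) = sqrt(Om * (1 + z)^3 + OL)."""
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zm = (i + 0.5) * dz  # midpoint of the i-th redshift slice
        e = math.sqrt(OMEGA_M * (1 + zm) ** 3 + OMEGA_L)
        total += dz / ((1 + zm) * e)
    return hubble_time_gyr() * total


if __name__ == "__main__":
    print(f"Look-back time to z = 2: {lookback_time_gyr(2.0):.1f} Gyr")
```

With these assumed parameters the result is a little over 10 billion years, consistent with the figure quoted above.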
Euclid Science Data Centres
The scientific processing of the data, down to the production of the scientific measurements, is the responsibility of the Science Ground Segment managed by the Euclid Consortium. The processing activity will occur in national Science Data Centres (SDCs, see the figure below).
In addition to processing Euclid data, the Netherlands Science Data Centre (SDC-NL) facilitates three other key Dutch roles within the Euclid Consortium:
• Development of the Euclid Archive System (EAS) - the key system which supports the processing of all Euclid data and which stores and distributes data files throughout the Science Ground Segment.
• Joint leadership of the organisational unit (OU-EXT) responsible for the enormous amount of ground-based data that must be combined with the space-based data to fulfil the Euclid scientific goals.
• Deputy lead of the organisational unit (OU-NIR) responsible for the development of the Near Infrared Imaging Pipeline.
Euclid Netherlands Science Data Centre: Institutes and Facilities
The Netherlands Euclid Science Data Centre is hosted by two University of Groningen institutes, with the participation of the Leiden Observatory.
The Euclid Netherlands Science Data Centre makes use of existing facilities at the University of Groningen’s Centre for Information Technology (CIT), which is significantly more cost-effective than supporting dedicated systems. At the start of 2023, the Science Data Centre had the following resources:
• 3 PByte of storage in the CIT Data Handling System for bulk file storage.
• 80 TByte of SSD storage and two Oracle database servers in the CIT Database Infrastructure to store Euclid metadata.
• 8 servers in the Data Handling System to support the Euclid Archive System services.
• 600 dedicated cores in the CIT Habrok High Performance Cluster for compute.
Euclid Archive System
The Netherlands Science Data Centre and the European Space Astronomy Centre (ESAC) developed the Euclid Archive System (EAS), which is at the core of the Euclid Science Ground Segment and represents a new generation of data-centric scientific information systems. It will manage up to one hundred petabytes of mission data in a heterogeneous storage environment and will allow intensive access to both the data and the metadata produced during the mission. It has three components:
• The Euclid Archive Data Processing System (EAS-DPS) - developed by SDC-NL. Responsible for the storage of all metadata defined by the Euclid Common Data Model. It supports and tracks the data processing in the Science Ground Segment in a centralised metadata repository (DPS-MDR).
• The Euclid Archive Distributed Data System (EAS-DSS) - developed by SDC-NL. Responsible for the storage of data files. It supports various Science Ground Segment services (cut-out service and visualization service). It is distributed across the various National Science Data Centres, with at least one storage node with file servers (DSS-SVR) at each.
• The Euclid Archive Science Archive System (EAS-SAS) - developed by ESAC. Responsible for the storage of scientific metadata. The data are thus a subset of the metadata stored in the EAS Data Processing System, with a format optimized for scientific use. It provides access to the scientific metadata to consortium and public users.
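The division of labour described above, with metadata held in one central repository and the data files spread over storage nodes at the Science Data Centres, can be pictured with a toy model. The sketch below is purely illustrative: every class, method, attribute, file name, and node name is hypothetical and does not reflect the actual EAS interfaces or the Euclid Common Data Model.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical metadata entry describing one data product."""
    product_id: str
    filename: str
    attributes: dict = field(default_factory=dict)


class MetadataRepository:
    """Toy stand-in for a centralised metadata repository."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.product_id] = record

    def find(self, **criteria):
        """Return records whose attributes match all given criteria."""
        return [
            r for r in self._records.values()
            if all(r.attributes.get(k) == v for k, v in criteria.items())
        ]


class StorageNode:
    """Toy stand-in for a file storage node at one data centre."""

    def __init__(self, name):
        self.name = name
        self._files = {}

    def store(self, filename, payload):
        self._files[filename] = payload

    def has(self, filename):
        return filename in self._files

    def retrieve(self, filename):
        return self._files[filename]


def locate(filename, nodes):
    """Return the names of all nodes that hold a copy of the file."""
    return [node.name for node in nodes if node.has(filename)]


if __name__ == "__main__":
    repo = MetadataRepository()
    node_a, node_b = StorageNode("node-a"), StorageNode("node-b")

    # The file goes to a storage node; only its description goes to the repository.
    node_a.store("image_001.fits", b"...pixels...")
    repo.register(MetadataRecord("p1", "image_001.fits", {"band": "H"}))

    for record in repo.find(band="H"):
        print(record.filename, "held by", locate(record.filename, [node_a, node_b]))
```

The point of the separation is that queries run against the small, structured metadata records, while the bulky files are fetched from a storage node only once a product has been identified.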
Euclid External Data
External optical ground-based surveys must be combined with the Near Infrared space-based survey to determine the galaxy photometric redshifts and stellar Spectral Energy Distributions to the accuracy required for cosmology. Many of the other science cases also rely on the combination of Euclid images with ground-based optical colours to build full Spectral Energy Distributions of large samples of objects. All these external datasets need to be (re-)processed to a consistent reference system, ready for combination with the Euclid data. This homogenization process is dubbed "Euclidization" of external data and aims at consistency within the heterogeneous external datasets, as well as between external and Euclid data, in:
• Photometry (consistent inferred spectral energy distributions and their absolute scaling),
• Inferred intrinsic light distribution (correction for optical point spread function (PSF) and other image distortions),
• Astrometry (consistent reference system and calibration approach),
• Data format and content (consistent characterization and correction of instrumental fingerprints, error propagation, absolute and relative unit systems, etc.).
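To give a flavour of what consistent absolute scaling in photometry involves, the sketch below converts instrumental counts from two surveys with different photometric zero points to a common flux unit. It relies only on the standard AB magnitude definition, in which a magnitude of 23.9 corresponds to a flux density of 1 microjansky. The function name, zero points, and count values are invented for illustration, and the real homogenization is far more involved than this single step.

```python
import math

AB_MAG_OF_ONE_MICROJANSKY = 23.9  # standard AB system: 1 microjansky is mag 23.9


def counts_to_microjansky(counts, zero_point_ab):
    """Convert instrumental counts to microjansky.

    Uses m_AB = zero_point_ab - 2.5 * log10(counts), so the flux in
    microjansky is counts * 10^(-0.4 * (zero_point_ab - 23.9)).
    """
    return counts * 10 ** (-0.4 * (zero_point_ab - AB_MAG_OF_ONE_MICROJANSKY))


if __name__ == "__main__":
    # Two hypothetical surveys observe the same source with different zero points.
    flux_a = counts_to_microjansky(1000.0, zero_point_ab=30.0)
    flux_b = counts_to_microjansky(100.0, zero_point_ab=27.5)
    print(f"survey A: {flux_a:.3f} uJy, survey B: {flux_b:.3f} uJy")
    print("consistent:", math.isclose(flux_a, flux_b, rel_tol=1e-9))
```

Once both measurements are expressed in the same physical unit they can be compared or combined directly, whichever instrument recorded them.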
Euclid Near Infrared Imaging Pipeline
The Near Infrared (NIR) reduction pipeline accepts the raw imaging data produced by Euclid and processes them in several stages until the images are science grade. Given the unprecedented volume of data, this process has to be completely automated, requiring the development of well-designed and tested algorithms. This is particularly important because much of the cosmology science relies on photometric redshifts, which demand reliable and very accurate photometry. The Netherlands is deputy lead of OU-NIR, the unit responsible for defining, designing, and testing the NIR reduction pipeline and for leading the development of algorithms and pipeline modules for many essential stages of processing.
07 February 2023 09.33 a.m.