LOFAR Long-term archive
Scroll to the bottom of the page to read about the latest project developments and current Target activities related to the LOFAR Long-term archive.
The LOFAR (Low-Frequency Array) radio telescope was developed by a consortium of world-class institutions, led by the Netherlands Institute for Radio Astronomy (ASTRON). Currently, it is the largest and most sensitive radio telescope in the world, as well as an important scientific and technological pathfinder for the next generation of radio telescopes like the Square Kilometre Array (SKA). LOFAR is based on the principles of aperture synthesis: signals from a large number of small antennas are combined and manipulated in order to achieve extremely narrow beams and high sensitivity. The telescope consists of 40 phased-array stations spread over an area with a diameter of 100 km in the northern part of the Netherlands, as well as eight international stations maintained by partner institutions in Germany, France, Sweden and the United Kingdom.
The LOFAR telescope belongs to a new class of telescopes in which the complexity and challenges of mechanically engineering large, precise single dishes have been shifted to the processing and storage capabilities required for aperture synthesis based on wide-field static antennas. As a result, much effort has been put into the development of a reliable data management system that can handle the massive data streams from LOFAR. Because LOFAR is a computationally intensive and data-intensive project, the entry of ASTRON into the Target project was a very natural step.
The flow of LOFAR data streams is modeled in a three-tier architecture. Tier 0 refers to the LOFAR Central Processing Facilities (CEP), located in the Donald Smits Center for Information Technologies (CIT) at the University of Groningen. The CEP hosts the IBM Blue Gene/P supercomputer used for processing the data streams from the individual LOFAR stations, disk-based storage of 2 petabytes, and an HPC cluster for the initial data reduction. Tier 1 refers to the long-term archive (LTA), with storage and re-processing capabilities at the CIT in Groningen, the SARA data center and the National Institute for Subatomic Physics (NIKHEF) in Amsterdam, as well as the Forschungszentrum Jülich in Germany. These geographically distributed facilities are interconnected through a Grid-enabled environment. Tier 2 refers to the distributed user community, whose members operate their own infrastructure at their respective institutions.
Target partners have played an important role in setting up the architectural design and user-enabling facilities for the LTA. The archive utilizes the WISE technology and a large fraction of its distributed multi-petabyte file storage is hosted on the Target testbed. The Target consortium is involved in (i) setting up the connection between the central processing facilities (CEP) and the LTA, and (ii) building the platform for further data dissemination and analysis.
LOFAR uses the WISE technology to achieve efficient data processing and management. The multi-server Oracle RAC database at Tier 1 stores data products, metadata and data lineage allowing users to trace the full life cycle of the data in the archive. Users can access the LTA to (i) request data for scientific research and analysis locally (Tier 2), (ii) carry out complex processing using LOFAR pipelines, or (iii) send queries to the database about existing data products. The WISE technology also provides the LTA with tools for quality control, data access restriction and publication of data. The DPUs (Distributed Processing Units) developed as part of the WISE technology provide the LTA with the capability to run integrated re-processing jobs for the scientific community on Big Grid and EGI facilities.
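The idea of tracing data lineage can be illustrated with a minimal sketch. It is not the WISE or LTA implementation: the `DataProduct` record, the `trace_lineage` function and the product identifiers are all hypothetical, and an in-memory dictionary stands in for the database. Each product records the products it was derived from, and the trace walks those links back to the raw data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical archive record: an id, a type, and its direct inputs."""
    product_id: str
    product_type: str
    parents: tuple = ()  # ids of the products this one was derived from


def trace_lineage(product_id, catalog):
    """Return all ancestors of a product, nearest first (breadth-first walk)."""
    seen = []
    queue = list(catalog[product_id].parents)
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # a product can be reached through several paths
        seen.append(current)
        queue.extend(catalog[current].parents)
    return seen


# Illustrative catalog: a raw observation, its calibrated form, and an image.
catalog = {
    "raw-001": DataProduct("raw-001", "raw visibilities"),
    "cal-001": DataProduct("cal-001", "calibrated visibilities", ("raw-001",)),
    "img-001": DataProduct("img-001", "sky image", ("cal-001",)),
}

print(trace_lineage("img-001", catalog))  # ['cal-001', 'raw-001']
```

In a real archive the same walk would be expressed as queries against the lineage tables rather than a dictionary lookup.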
The LTA is expected to serve as long-term storage for the data generated during the lifetime of LOFAR. To optimize the performance and sustainability of the archive, data products stored on the Target infrastructure will be placed on the type of storage (fast storage, disks, tapes) that best matches their data access patterns. This optimization is made possible by the GPFS file system, which is used for storage in the LTA. The Information Lifecycle Management (ILM) functionality of GPFS, provided by the Target partner IBM, is responsible for automatically moving data files between the different storage types.
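The principle of matching storage type to access pattern can be sketched in a few lines. This is an illustration only: it is not GPFS policy syntax, and the `choose_tier` function, the tier names and all thresholds are assumptions made for the example rather than values used by the LTA.

```python
from datetime import datetime, timedelta


def choose_tier(last_access, accesses_last_30_days, now):
    """Pick a storage tier from recent access behaviour (hypothetical thresholds)."""
    age = now - last_access
    # Frequently and recently read files stay on the fastest storage.
    if accesses_last_30_days >= 10 and age <= timedelta(days=7):
        return "fast"
    # Files touched within the last three months stay on ordinary disk.
    if age <= timedelta(days=90):
        return "disk"
    # Everything else is a candidate for migration to tape.
    return "tape"


now = datetime(2015, 1, 1)
print(choose_tier(now - timedelta(days=1), 20, now))   # fast
print(choose_tier(now - timedelta(days=30), 0, now))   # disk
print(choose_tier(now - timedelta(days=365), 0, now))  # tape
```

An ILM system evaluates rules of this kind periodically and moves files whose assigned tier has changed, so users see a single namespace regardless of where a file physically resides.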
Latest Project Developments
LOFAR started its science operations in December 2012. This served as a natural framework around which the activities related to the design and development of the Long-term Archive (LTA) evolved. The functionality of the LTA has improved significantly in recent months to meet the needs of the LOFAR scientific community. Special attention is being paid to the storage and archive facilities and to the LTA services provided to LOFAR data users. Target and ASTRON have worked together to increase the flexibility of the ingest of data products. Ingest of raw observational data has been operational for a while, and it is now possible to ingest products generated by the imaging pipelines as well as beamformed data products. Several modifications to the LTA now allow data products with incomplete metadata, or newly specified types of data products, to be ingested into the archive as well, a crucial functionality for a large scientific instrument that generates terabytes of data daily. Access to the LTA is now available via an improved web interface, and the continual integration of processing pipelines in the LTA now allows for user-initiated processing via Astro-WISE web services.
Current activities are focused on fully integrating the HSM (Hierarchical Storage Management) capability into the LTA and ensuring the required high I/O bandwidth between the various components of the archive.
Last modified: 02 October 2015 10.52 p.m.