Publication

Automatic optimized discovery, creation and processing of astronomical catalogs

Buddelmeijer, H., Boxhoorn, D. & Valentijn, E. A., Jan-2013, In: Experimental Astronomy. 35, 1-2, p. 203-225, 23 p.

Research output: Contribution to journal, Article, Academic, peer-review

We present the design of a novel way of handling astronomical catalogs in Astro-WISE in order to achieve the scalability required for the data produced by large scale surveys. A high level of automation and abstraction is achieved in order to facilitate interoperation with visualization software for interactive exploration. At the same time flexibility in processing is enhanced and data is shared implicitly between scientists. This is accomplished by using a data model that primarily stores how catalogs are derived; the contents of the catalogs are only created when necessary and stored only when beneficial for performance. Discovery of existing catalogs and creation of new catalogs is done through the same process by directly requesting the final set of sources (astronomical objects) and attributes (physical properties) that is required, for example from within visualization software. New catalogs are automatically created to provide attributes of sources for which no suitable existing catalogs can be found. These catalogs are defined to contain the new attributes on the largest set of sources the calculation of the attributes is applicable to, facilitating reuse for future data requests. Subsequently, only those parts of the catalogs that are required for the requested end product are actually processed, ensuring scalability. The presented mechanisms primarily determine which catalogs are created and what data has to be processed and stored: the actual processing and storage itself is left to existing functionality of the underlying information system.
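The request-driven mechanism described in the abstract can be illustrated with a minimal sketch. This is not the Astro-WISE API: the names `Catalog`, `Registry`, `request`, and the `double_flux` attribute are hypothetical, invented here only to show the idea. It assumes a catalog records how an attribute is derived, is defined on the largest applicable set of sources, and computes values only for the sources actually requested.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, Iterable, List


@dataclass
class Catalog:
    """Hypothetical catalog: stores its derivation, not its contents."""
    attribute: str
    sources: FrozenSet[int]          # largest set the derivation applies to
    derive: Callable[[int], float]   # how the attribute is computed per source
    cache: Dict[int, float] = field(default_factory=dict)  # filled on demand

    def values(self, wanted: Iterable[int]) -> Dict[int, float]:
        # Process only the requested sources; reuse anything already computed.
        wanted = list(wanted)
        for source in wanted:
            if source not in self.cache:
                self.cache[source] = self.derive(source)
        return {source: self.cache[source] for source in wanted}


class Registry:
    """Hypothetical entry point: discovery and creation share one request."""

    def __init__(self, all_sources: Iterable[int],
                 recipes: Dict[str, Callable[[int], float]]):
        self.all_sources = frozenset(all_sources)
        self.recipes = recipes
        self.catalogs: List[Catalog] = []

    def request(self, sources: Iterable[int], attribute: str) -> Dict[int, float]:
        wanted = frozenset(sources)
        # Discovery: reuse an existing catalog that covers the request.
        for catalog in self.catalogs:
            if catalog.attribute == attribute and wanted <= catalog.sources:
                return catalog.values(wanted)
        # Creation: define the new catalog on all applicable sources,
        # but compute values only for the requested subset.
        catalog = Catalog(attribute, self.all_sources, self.recipes[attribute])
        self.catalogs.append(catalog)
        return catalog.values(wanted)


# Toy data: source identifiers mapped to a made-up flux value.
fluxes = {1: 10.0, 2: 20.0, 3: 40.0}
registry = Registry(fluxes, {"double_flux": lambda s: 2 * fluxes[s]})

first = registry.request({1, 2}, "double_flux")   # creates the catalog
second = registry.request({3}, "double_flux")     # reuses the same catalog
```

In this sketch the second request finds the catalog created by the first, so only one catalog exists afterwards and only source 3 needs new processing. Persistent storage and the actual processing are left out, in line with the abstract's statement that these are handled by the underlying information system.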

Original language: English
Pages (from-to): 203-225
Number of pages: 23
Journal: Experimental Astronomy
Volume: 35
Issue number: 1-2
Publication status: Published - Jan-2013

Keywords: Data mining, Data lineage

ID: 5784608