Extreme data lineage in ad-hoc astronomical data processing

30 November 2012

PhD ceremony: Mr. J. Mwebaze, 4:15 p.m., Academiegebouw, Broerstraat 5, Groningen

Dissertation: Extreme data lineage in ad-hoc astronomical data processing

Promotor(s): prof. E.A. Valentijn

Faculty: Mathematics and Natural Sciences

This research addresses the problems encountered in tracing and using the lineage of data, pixels and code in ad hoc astronomical data processing. We describe a data model for ad hoc data processing and provide methods for tracing, querying and viewing its lineage. We show that data, pixel and code lineage can be used to make processing more efficient through incremental re-computation, semi-automatic modification of processing steps, and sub-image processing.

We describe an implementation of these techniques in an all-in-one system, Astro-WISE, that allows a scientist to archive raw data, calibrate data, perform post-calibration scientific analysis and archive all results in one environment. To support multiple surveys, Astro-WISE combines the use of data, pixel and code lineage with a novel way to process wide-field astronomical data in a distributed environment.

We present an example to demonstrate Astro-WISE's efficiency in searching for rare astronomical objects: we discover several of the most distant quasars in the Universe.

