ASTRO 3D telescopes are collecting large volumes of multi-dimensional data, while the Genesis Theoretical Simulations are producing prodigious amounts of theoretical data. These petabyte-scale datasets require sophisticated data management and access mechanisms, as well as new algorithms and visualisation tools to efficiently extract scientific information.

The Data Intensive Astronomy (DIA) Program, led by CIs Richard McDermid and Lister Staveley-Smith, facilitates better access to tools, technology, infrastructure and training for ASTRO 3D researchers working with large data sets and in High-Performance Computing (HPC) environments.

The program does this by working with national infrastructure providers and by sharing expertise among ASTRO 3D researchers. As much of the ASTRO 3D science involves world-leading surveys and large data sets, our ability to process our data in a timely and efficient manner is critical to our success.

How it fits together

We are implementing a layered “Data Fabric” plan based on the recommendation of the 2016-2025 Australian Astronomy Decadal Plan. With the three layers of this fabric, we aim to seamlessly join all ASTRO 3D survey and Genesis simulation data.

Layer 1 connects the high-performance computing facilities: the National Computing Infrastructure Facility, the GPU Supercomputer for Theoretical Astrophysics, and the Pawsey Centre. We are using the computing and storage infrastructure within these facilities, and connecting the facilities to one another, to implement a seamless cross-facility data fabric.

Layer 2 is a data-intensive research middleware that joins database systems, high-performance storage and high-performance computing with advanced scientific data management into a service-oriented architecture. We work with leading data-intensive astronomy institutes and industry partners to ensure that our projects rely on the latest middleware technologies. We will employ skilled middleware specialists to implement and maintain services at this critical level, and to provide training to astrophysicists in data-intensive middleware.

Layer 3 incorporates a new set of tightly connected databases to tag and structure the data, as well as high-level Virtual Observatory tools and interfaces for accessing and manipulating observational and theoretical data. We are linking the All-Sky Virtual Observatory (ASVO), the CSIRO ASKAP Science Data Archive, and the Theoretical Astrophysical Observatory (TAO), which hosts theory data, providing direct and vital connectivity among our projects. The TAO is being expanded to incorporate hydrodynamical data and radio data, with new analysis modules for interactively exploring the simulations and creating theoretical mock data cubes for Centre surveys. We are extending ASVO functionality, facilitating access nationwide and providing interfaces compliant with International Virtual Observatory Alliance standards for the international astronomical community.

Projects

Data Intensive Astronomy leadership

Richard McDermid, Chief Investigator
Lister Staveley-Smith, Chief Investigator