The “pipeline” of data flowing towards eRSA’s new data Node is filling fast. We are working with the owners of around 70 datasets to get them stored on the Node, representing a large (but not yet accurately measured) quantity of data. We currently have 13 of those collections approved for storage, representing over 2.5 PB of data (almost 2.8M gigabytes).
The pipeline combines administrative and technical steps and involves a great deal of information gathering. We gather technical information about each dataset to scope the work required to get it onto the storage. We also gather contextual information so that the Node Merit Allocation Committee can consider the long-term implications of storing the data.
Some of the data approved for storage is already held on eRSA’s pilot Node, which has facilitated national and international research collaboration. These datasets will be migrated to the production Node when it comes online around the end of the month, where they will be joined by other datasets currently in the pipeline.
Research is increasingly moving towards modelling and exploration of massive datasets, and eRSA’s large-scale data storage will facilitate this work for SA researchers. It also lightens the data management burden on researchers, allowing them to focus more on their research. eRSA storage gives the South Australian universities and state government a data storage option outside their own networks. This is a key ingredient in attracting collaboration with researchers and research institutions across borders and in generating a competitive advantage for South Australian research.