White Papers – Long-term Storage

On this page you will find PRACE White Papers related to long-term storage.

Title: Storage and Long Term Preservation Strategies in PRACE Tier-1 Datacentres

Authors: Huub Stoffers, Mark van de Sanden, Johan Raber, Tom Langborg, George Tsouloupas, Olivier Rouchon, Florent Marceteau

Abstract: At SARA, SNIC-NSC, CaSToRC and CINES, different strategies are in place to store and preserve academic data produced as part of research projects. Although they are fairly similar, the policies and technologies deployed to manage those data differ in a few respects, which will be detailed. The way they address the increasing need for long-term archival storage, in combination with the ever-increasing size and rate of data produced, will also be described, as well as data sharing problems in joint research efforts where large data sets need to be shared.

Download paper: PDF

Title: Best Practices on Standards, Policies and Quality Assurance in Digital Repositories for Long Term Preservation

Authors: Olivier Rouchon, Philippe Prat, Mathieu Cloirec

Abstract: During the past twenty years, the long-term preservation of digital information has been a concern for only a few scientific or heritage institutions. These have played a key role in the understanding of the associated risks and the definition of standards in this domain. The best practices address four technological risks which are now commonly agreed upon: loss of knowledge of the content, file format obsolescence, aging media causing data loss, and sudden software or technology changes. They have been put in place in institutions dealing with text, images, sounds or video, where quality assurance procedures have been developed to guarantee the integrity and accessibility of the data. How this translates to raw, primary data produced by Tier-0 systems will be evaluated as part of this white paper.

Download paper: PDF

Title: Media and Technology Appraisal for Long Term Preservation

Authors: Florent Marceteau, Olivier Rouchon, Johan Raber, George Tsouloupas

Abstract: Reliability, performance, costs and return on investment are key factors in the long term preservation of digital data. They differ from one technology to another. The different media and technologies used for storage and transfer will be compared, with a particular focus on disks and tapes.

Download paper: PDF

Title: The Jonker Case After Care: Handling the ENTRAIN Dataset after its Production on Jugene

Author: Huub Stoffers

Abstract: “Huygens”, the IBM P6 system in Amsterdam and current incarnation of the Dutch National supercomputer for the academic community, was one of the DEISA systems and is now a PRACE Tier-1 system in the PRACE 2IP project. In the PRACE preparatory phase it was also a prototype system. An informal “PRACE_HOME” storage space on Huygens provides a “next stop” for PRACE Tier-0 produced data that have to be preserved for a longer time. The experience reported here is tied to the project of a Dutch investigator, Harm Jonker. The simulation produces a large amount of output data to be preserved, and some issues were encountered with the data preservation.

Download paper: PDF

Title: The Vagn-Ekman Case Study at SNIC-NSC

Authors: Johan Raber, Per Lundqvist, Bengt Persson

Abstract: Vagn-Ekman is a dual-cluster setup in which each of the two parts has a specialized function. Ekman is a large compute cluster located in Stockholm, and Vagn is a storage and post-processing cluster located at NSC in Linköping. To an extent, the Vagn and Ekman clusters resemble future PRACE operations, in the sense that a large data production facility will need to transfer the produced data conveniently to researchers, potentially scattered across Europe, in a manner that safeguards data integrity. The experiences and tools developed during the Vagn-Ekman project can serve as an example of how this data flow can be carried out.

Download paper: PDF