Design and Implementation of the first Generic Archive Storage Service for Research Data in Germany release_7d4bwb3hpzhtzeadbdwpf6vx4y

by Felix Bach, Björn Schembera, Jos Van Wezel

Published by Digital Curation Centre.

2020  

Abstract

Research data as the true valuable good in science must be saved and subsequently kept findable, accessible and reusable for reasons of proper scientific conduct for a time span of several years. However, managing long-term storage of research data is a burden for institutes and researchers. Because of the sheer size and the required retention time apt storage providers are hard to find. Aiming to solve this puzzle, the bwDataArchive project started development of a long-term research data archive that is reliable, cost effective and able store multiple petabytes of data. The hardware consists of data storage on magnetic tape, interfaced with disk caches and nodes for data movement and access. On the software side, the High Performance Storage System (HPSS) was chosen for its proven ability to reliably store huge amounts of data. However, the implementation of bwDataArchive is not dependant on HPSS. For authentication the bwDataArchive is integrated into the federated identity management for educational institutions in the State of Baden-Württemberg in Germany. The archive features data protection by means of a dual copy at two distinct locations on different tape technologies, data accessibility by common storage protocols, data retention assurance for more than ten years, data preservation with checksums, and data management capabilities supported by a flexible directory structure allowing sharing and publication. As of September 2019, the bwDataArchive holds over 9 PB and 90 million files and sees a constant increase in usage and users from many communities.
In text/plain format

Archived Files and Locations

application/pdf  509.5 kB
file_tusd2z2ttzhppab7t2amyinlc4
publikationen.bibliothek.kit.edu (publisher)
web.archive.org (webarchive)
Read Archived PDF
Preserved and Accessible
Type  article-journal
Stage   published
Year   2020
Language   en ?
Work Entity
access all versions, variants, and formats of this works (eg, pre-prints)
Catalog Record
Revision: 50132119-45d5-4f52-99d4-16c3853b95c1
API URL: JSON