Leveraging blockchain for immutable logging and querying across multiple sites release_h2tez6fedrg7joxwjugptxb6mm

by Mustafa Safa Ozdayi, Murat Kantarcioglu, Bradley Malin

Published in BMC Medical Genomics by Springer Science and Business Media LLC.

2020   Volume 13, Issue S7, p82

Abstract

Blockchain has emerged as a decentralized and distributed framework that enables tamper-resilience and, thus, practical immutability for stored data. This immutability property is important in scenarios where auditability is desired, such as in maintaining access logs for sensitive healthcare and biomedical data. However, the underlying data structure of blockchain, by default, does not provide capabilities to efficiently query the stored data. In this investigation, we show that it is possible to efficiently run complex audit queries over the access log data stored on blockchains by using additional key-value stores. This paper specifically reports on the approach we designed for the blockchain track of iDASH Privacy & Security Workshop 2018 competition. In this track, participants were asked to devise an efficient way to run conjunctive equality and range queries on a genomic dataset access log trail after storing it in a permissioned blockchain network consisting of 4 identical nodes, each representing a different site, created with the Multichain platform. Multichain duplicates and indexes blockchain data locally at each node in a key-value store to support retrieval requests at a later point in time. To efficiently leverage the key-value storage mechanism, we applied various techniques and optimizations, such as bucketization, simple data duplication and batch loading by accounting for the required query types of the competition and the interface provided by Multichain. Particularly, we implemented our solution and compared its loading and query-response performance with SQLite, a commonly used relational database, using the data provided by the iDASH 2018 organizers. Depending on the query type and the data size, the run time difference between blockchain based query-response and SQLite based query-response ranged from 0.2 seconds to 6 seconds. A deeper inspection revealed that range queries were the bottleneck of our solution which, nevertheless, scales up linearly. This investigation demonstrates that blockchain-based systems can provide reasonable query-response times to complex queries even if they only use simple key-value stores to manage their data. Consequently, we show that blockchains may be useful for maintaining data with auditability and immutability requirements across multiple sites.
In text/plain format

Archived Files and Locations

application/pdf  1.5 MB
file_2tioez36zzbw5ammwkndvpzasy
bmcmedgenomics.biomedcentral.com (publisher)
web.archive.org (webarchive)
Read Archived PDF
Preserved and Accessible
Type  article-journal
Stage   published
Date   2020-07-21
Language   en ?
Container Metadata
Open Access Publication
In DOAJ
In ISSN ROAD
In Keepers Registry
ISSN-L:  1755-8794
Work Entity
access all versions, variants, and formats of this works (eg, pre-prints)
Catalog Record
Revision: 39255b5d-bcf7-4085-b547-b26003e26d46
API URL: JSON