Fingerprint-Based Data Deduplication Using a Mathematical Bounded Linear Hash Function

by Ahmed Sardar Al-karadaghi, Loay E. George

Published in Symmetry by MDPI AG.

2021   Volume 13, Issue 11, p1978

Abstract

Due to the rapid growth of digital data, especially from mobile usage and social media, data deduplication has become a vital and cost-effective approach for removing redundant data segments, relieving the pressure imposed by the enormous volumes of data that must be stored. As part of the deduplication process, fingerprints are used to represent and identify identical data blocks. However, as the amount of data increases, the number of fingerprints grows as well, and because memory is limited, deduplication speed suffers dramatically. Many deduplication solutions are bottlenecked by matching lookups and chunk fingerprint calculations, which are paid for in the storage and processing resources needed to handle the hashes. Using a fast hash algorithm to improve fingerprint lookup performance is therefore an appealing challenge. This study focuses on enhancing the deduplication system by proposing a novel and effective mathematical bounded linear hashing algorithm that reduces hashing time by more than a factor of two compared to MD5 and SHA-1 and reduces the size of the hash index table by 50%. Because the number of chunk hash values is enormous, looking up and comparing hash values takes longer for large datasets; this work therefore offers a hierarchical fingerprint lookup strategy that reduces the hash comparison time by up to 78%. The proposed system reduces the high latency imposed by deduplication procedures, primarily in the hashing and matching phases. The symmetry of our work lies in the balance between the performance of the proposed hashing algorithm and its effect on system efficiency, as well as in evaluating the approximate symmetries of the hashing and lookup phases compared to other deduplication systems.
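
As a rough illustration of the fingerprint-based deduplication the abstract describes, the sketch below splits data into chunks, computes a fingerprint per chunk, and stores each distinct chunk only once. It is a minimal sketch under stated assumptions, not the authors' method: it uses SHA-1 (one of the baselines named in the abstract) rather than the proposed bounded linear hash, fixed-size 4096-byte chunking, and an in-memory dictionary as the hash index. The names `chunk_fixed`, `deduplicate`, and `restore` are hypothetical.

```python
import hashlib


def chunk_fixed(data: bytes, size: int = 4096):
    """Split data into fixed-size chunks (the last one may be shorter).

    Fixed-size chunking is an assumption made for brevity.
    """
    return [data[i:i + size] for i in range(0, len(data), size)]


def deduplicate(data: bytes, size: int = 4096):
    """Return (store, recipe).

    store  maps fingerprint -> chunk bytes, one entry per distinct chunk.
    recipe is the ordered list of fingerprints needed to rebuild the data.
    """
    store = {}
    recipe = []
    for chunk in chunk_fixed(data, size):
        # SHA-1 stands in for the fingerprint function here; it is a
        # baseline from the abstract, not the proposed hash.
        fingerprint = hashlib.sha1(chunk).digest()
        if fingerprint not in store:  # index lookup: is this chunk new?
            store[fingerprint] = chunk
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original byte string from the chunk store and recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)
```

With three identical chunks followed by one different chunk, the store holds two entries while the recipe keeps four fingerprints, so the original data can still be rebuilt exactly. The index lookup in `deduplicate` is the step whose cost grows with the number of fingerprints, which is the bottleneck the abstract targets.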

https://www.mdpi.com/2073-8994/13/11/1978/htm
Type  article-journal
Stage   published
Date   2021-10-20
Language   en
Container Metadata
Open Access Publication
In DOAJ
In ISSN ROAD
In Keepers Registry
ISSN-L:  2073-8994