Biological Sequence Indexing Using Persistent Java

by Elzbieta Pustulka-Hunt

Published by Zenodo.

2001  

Abstract

This thesis makes three contributions in the area of computing science. Our first contribution is the recognition that new data types produced by large-scale biological research techniques lead to a flood of data which creates new challenges in the areas of data indexing, integration, manipulation and visualisation. The second contribution is a new research methodology which combines orthogonal persistence with an empirical evaluation of disk-resident suffix indexes. This methodology allowed us to develop a practical algorithm for the construction of suffix trees on disk up to any size supported by the available file and addressing space, which has hitherto not been possible. The third contribution is a new experimental methodology for examining the usefulness of suffix indexes, and the use of this methodology in an empirical investigation of the indexing gain achieved by combining an approximate matching algorithm with a large suffix index. These results are presented against the background of the changing technological landscape affecting life sciences and bioinformatics research and the resulting need for new computing solutions.

Archived Files and Locations

application/pdf  1.4 MB
file_mmhsiu4jlnbifgov7c5dhdvsie
zenodo.org (repository)
web.archive.org (webarchive)
Type  article-journal
Stage   published
Date   2001-12-20
Language   en
Catalog Record
Revision: 9ba07ea6-a570-4d34-86a4-4bb7e04874ce