Designing optimal- and fast-on-average pattern matching algorithms

by Gilles Didier, Laurent Tichit

Released as an article.

2016  

Abstract

Given a pattern w and a text t, the speed of a pattern matching algorithm over t with regard to w is the ratio of the length of t to the number of text accesses performed to search for w in t. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to w, over iid texts. Next, we show how to determine the greatest speed that can be achieved among a large class of algorithms, together with an algorithm that achieves this speed. Since the complexity of this determination makes it impossible to deal with patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with 9 pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e. both in terms of limit expected speed on iid texts and in terms of observed average speed on real data. In all cases, the pre-existing algorithms are outperformed.
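The definition of speed in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`naive_search`, `horspool_search`, `speed`) are hypothetical, and the two matchers (a naive left-to-right scan and a textbook Horspool scan) are chosen here only as familiar examples. The sketch assumes that a "text access" is one read of a character of t, and that re-using a character already read in the current window does not count as a further access.

```python
def naive_search(w, t):
    """Naive left-to-right matcher; returns (match positions, text accesses)."""
    if not w:
        raise ValueError("pattern must be non-empty")
    m, n = len(w), len(t)
    matches, accesses = [], 0
    for i in range(n - m + 1):
        j = 0
        while j < m:
            accesses += 1              # one read of t[i + j]
            if t[i + j] != w[j]:
                break
            j += 1
        if j == m:
            matches.append(i)
    return matches, accesses


def horspool_search(w, t):
    """Textbook Horspool matcher; returns (match positions, text accesses)."""
    if not w:
        raise ValueError("pattern must be non-empty")
    m, n = len(w), len(t)
    # Shift for a character c: distance from its rightmost occurrence
    # in w[:-1] to the end of the pattern (m if c does not occur there).
    shift = {c: m - 1 - k for k, c in enumerate(w[:-1])}
    matches, accesses = [], 0
    i = 0
    while i + m <= n:
        j = m - 1
        while j >= 0:                  # compare right to left
            accesses += 1              # one read of t[i + j]
            if t[i + j] != w[j]:
                break
            j -= 1
        if j < 0:
            matches.append(i)
        # Assumption: t[i + m - 1] was already read above, so using it
        # again for the shift is not counted as a new access.
        i += shift.get(t[i + m - 1], m)
    return matches, accesses


def speed(t, accesses):
    """Speed as defined in the abstract: text length / number of text accesses."""
    return len(t) / accesses


if __name__ == "__main__":
    w, t = "abc", "xxxxxxxxx"
    for search in (naive_search, horspool_search):
        found, acc = search(w, t)
        print(search.__name__, found, acc, speed(t, acc))
```

Under these counting assumptions, a speed above 1 means the matcher skipped some characters of the text entirely, which a matcher that reads every position at least once can never do.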

Archived Files and Locations

application/pdf  676.6 kB
file_rrpcqpydxvhc3lcrq7wvhng3lu
arxiv.org (repository)
web.archive.org (webarchive)
Type: article
Stage: submitted
Date: 2016-04-28
Version: v1
Language: en
arXiv: 1604.08860v1
Catalog Record
Revision: cedcedf2-cf39-46f2-b278-ee838a674794