Designing optimal- and fast-on-average pattern matching algorithms
release_vipm3aks3bebdpvxv6t36timi4
by
Gilles Didier, Laurent Tichit
2016
Abstract
Given a pattern w and a text t, the speed of a pattern matching algorithm
over t with regard to w is the ratio of the length of t to the number of
text accesses performed to search for w in t. We first propose a general
method for computing the limit of the expected speed of pattern matching
algorithms, with regard to w, over iid texts. Next, we show how to determine
the greatest speed which can be achieved among a large class of algorithms,
together with an algorithm running at this speed. Since the complexity of this
determination makes it impossible to deal with patterns of length greater than
4, we propose a polynomial heuristic. Finally, our approaches are compared with
9 pre-existing pattern matching algorithms from both a theoretical and a
practical point of view, i.e. both in terms of limiting expected speed on iid
texts, and in terms of observed average speed on real data. In all cases, the
pre-existing algorithms are outperformed.
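To make the definition of speed concrete, the following minimal sketch counts text accesses for two textbook searches and divides the text length by that count. It is an illustration only, not the method of the paper: the function names (naive_accesses, horspool_accesses, speed) are hypothetical, Horspool's algorithm is used simply as a familiar example, and the sketch assumes a non-empty pattern and that a character already read for a comparison can be reused for the shift without a second access.

```python
def naive_accesses(w, t):
    """Left-to-right naive search; returns (match positions, text accesses)."""
    m, n = len(w), len(t)
    matches, accesses = [], 0
    for i in range(n - m + 1):
        j = 0
        while j < m:
            accesses += 1            # one read of t[i + j]
            if t[i + j] != w[j]:
                break
            j += 1
        if j == m:
            matches.append(i)
    return matches, accesses


def horspool_accesses(w, t):
    """Horspool search; returns (match positions, text accesses)."""
    m, n = len(w), len(t)
    # Shift table built from all pattern characters except the last one;
    # later occurrences overwrite earlier ones, giving the smallest safe shift.
    shift = {c: m - 1 - k for k, c in enumerate(w[:-1])}
    matches, accesses = [], 0
    i = 0
    while i + m <= n:
        j = m - 1
        while j >= 0:                # compare the window right to left
            accesses += 1            # one read of t[i + j]
            if t[i + j] != w[j]:
                break
            j -= 1
        if j < 0:
            matches.append(i)
        # Assumption: t[i + m - 1] was read by the first comparison above,
        # so reusing it for the shift is not counted as a new access.
        i += shift.get(t[i + m - 1], m)
    return matches, accesses


def speed(t, accesses):
    """Speed as defined in the abstract: text length over text accesses."""
    return len(t) / accesses


if __name__ == "__main__":
    w, t = "abc", "xxxxxxxxx"
    for search in (naive_accesses, horspool_accesses):
        matches, accesses = search(w, t)
        print(search.__name__, matches, accesses, speed(t, accesses))
```

On this toy input the naive search reads 7 characters of a text of length 9, while the Horspool variant reads only 3, so its speed is 3.0. A speed above 1 means that some text positions are never accessed at all.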
Archived Files and Locations
application/pdf 676.6 kB
file_rrpcqpydxvhc3lcrq7wvhng3lu
arxiv.org (repository), web.archive.org (webarchive)
1604.08860v1