Two algorithms for approximate string matching in static texts

Guided search

Click a term to initiate a search.

Keyword search

Two algorithms for approximate string matching in static texts

Mon, 10/09/2006 - 12:55 — thor

Authors:

Jokinen, P.; Ukkonen, E.

Author:

Jokinen, P

Ukkonen, E

Year:

1991

Venue:

Proceedings Mathematical Foundations of Computer Science 1991,

Citations:

Citations range:

n/a

Algorithms with optimal expected running time are presented for searching the occurrences of a two-dimensional m X m pattern P in a two-dimensional n X n text T over an alphabet of size c. The algorithms are based on placing in the text a static grid of test points, determined only by n, m, and c (not dynamically by earlier test results). Using test strings read from the test points the algorithms eliminate as many potential occurrences of P as possible. The remaining potential occurrences are separately checked for actual occurrences. A suitable choice of the test point set leads to algorithms with expected running time O(n2logc m2/m2) using the uniform Bernoulli model of randomness. This is shown to be optimal by a generalization of a one-dimensional lower bound result by Yao. Experimental results show that the algorithms are efficient in practice, too. The method is also generalized for the k mismatches problem. The resulting algorithm has expected running time O(kn2logc m2/m2), provided that $k\leq(m\lfloor m/\lceil\log_c m^2\rceil\rfloor-1)/2$.\ All algorithms need preprocessing of P which takes time and space O(m2). The text processing can be done on-line, using a rather small window. The algorithms easily generalize to d-dimensional matching for any d.

websearch

Data Cleaning publication categorizer

Guided search

Data Cleaning

Data sets

Data type

Paper type

Venue type

Author

Year

mailpart

Citations range

Keyword search

Two algorithms for approximate string matching in static texts

Related categories

User login