Pattern Matching Algorithms [2nd ed.] 0195113675, 9780195113679, 9781423759706

This book provides an overview of the current state of pattern matching as seen by specialists who have devoted years of

English Pages 394 Year 1997

Table of contents :
Contents......Page 10
1.1 Searching for strings in texts......Page 18
1.2.1 String-Matching Automaton......Page 22
1.2.2 Forward Prefix scan......Page 24
1.2.3 Forward Subword scan......Page 36
1.3.1 Reverse-Suffix scan......Page 41
1.3.2 Reverse-Prefix scan......Page 49
1.4 Space-economical methods......Page 52
1.4.1 Constant-space searching algorithm......Page 53
1.4.2 Computing a critical factorization......Page 57
1.4.3 Computing the period of the pattern......Page 59
1.5 Heuristics for practical implementations......Page 61
1.6 Exercises......Page 63
1.7 Bibliographic notes......Page 64
2.1 Preliminaries......Page 72
2.2 Application of sequential techniques......Page 74
2.3 Periodicities and witnesses......Page 76
2.3.1 Witnesses......Page 78
2.3.3 A Lower Bound for the CRCW-PRAM......Page 81
2.4 Deterministic Samples......Page 84
2.4.1 Faster text search using DS......Page 87
2.5 The fastest known CRCW-PRAM algorithm......Page 88
2.5.1 Computing deterministic samples in constant time......Page 89
2.5.3 A new method to compute witnesses : Pseudo-periods......Page 91
2.6 Fast optimal algorithms on weaker models......Page 95
2.6.1 Optimal algorithm on the EREW-PRAM......Page 97
2.7 Two dimensional pattern matching......Page 99
2.8 Conclusion and open questions......Page 101
2.10 Bibliographic notes......Page 102
3.1 Subword trees......Page 106
3.2 McCreight's algorithm......Page 110
3.3 Storing suffix trees......Page 113
3.4 Building suffix trees in parallel......Page 114
3.4.1 Preprocessing......Page 115
3.4.2 Structuring D_x......Page 116
3.4.3 Refining D_x......Page 117
3.4.4 Reversing the edges......Page 122
3.5 Parallel on-line search......Page 123
3.6 Exhaustive on-line searches......Page 125
3.6.1 Lexicographic lists......Page 128
3.6.2 Building lexicographic lists......Page 130
3.6.3 Standard representations for constant-time queries......Page 134
3.7 Exercises......Page 135
3.8 Bibliographic notes......Page 136
4.1 Levenshtein distance and the LCS problem......Page 140
4.2 Classical algorithm......Page 141
4.3.1 Linear space algorithm......Page 142
4.3.2 Subquadratic algorithm......Page 143
4.3.3 Lower bounds......Page 146
4.4.1 pn algorithm......Page 147
4.4.2 Hunt-Szymanski algorithm......Page 149
4.4.4 Myers algorithm......Page 151
4.5 Exercises......Page 154
4.6 Bibliographic notes......Page 155
5 Parallel Computations of Levenshtein Distances......Page 160
5.1 Edit distances and shortest paths......Page 161
5.2 A monotonicity property and its use......Page 163
5.3 A sequential algorithm for tube minima......Page 167
5.4 Optimal EREW-PRAM algorithm for tube minima......Page 168
5.4.1 EREW-PRAM computation of row minima......Page 169
5.4.2 The tube minima bound......Page 185
5.5 Optimal CRCW-PRAM algorithm for tube minima......Page 187
5.5.1 A preliminary algorithm......Page 188
5.5.2 Decreasing the work done......Page 194
5.5.3 Further remarks......Page 196
5.6 Exercises......Page 197
5.7 Bibliographic notes......Page 198
6 Approximate String Searching......Page 202
6.1.1 The dynamic programming algorithm......Page 204
6.1.2 An alternative dynamic programming computation......Page 205
6.1.3 The efficient algorithm......Page 207
6.2 The parallel algorithm......Page 208
6.3 An algorithm for the LCA problem......Page 210
6.3.2 A simple sequential LCA algorithm......Page 211
6.5 Bibliographic notes......Page 213
7.1 Preliminaries......Page 218
7.2.1 Monge conditions......Page 220
7.2.2 Matrix searching......Page 221
7.3 The one-dimensional case (1D/1D)......Page 223
7.3.1 A stack or a queue......Page 224
7.3.2 The effects of SMAWK......Page 225
7.4 The two-dimensional case......Page 228
7.4.1 A coarse divide-and-conquer......Page 229
7.4.2 A fine divide-and-conquer......Page 230
7.5 Sparsity......Page 232
7.5.1 The sparse 2D/0D problem......Page 233
7.5.2 The sparse 2D/2D problem......Page 237
7.6.1 Sequence alignment......Page 239
7.6.2 RNA secondary structure......Page 243
7.7 Exercises......Page 248
7.8 Bibliographic notes......Page 249
8 Shortest Common Superstrings......Page 254
8.1 Early results: NP-hardness and some special cases......Page 255
8.1.2 A special case: strings of length 2......Page 256
8.2 An O(n log n) approximation algorithm......Page 257
8.3.1 Some definitions and simple facts......Page 260
8.3.2 A simple variant of Greedy achieves 4n......Page 262
8.3.3 Improving to 3n......Page 265
8.3.4 A 4n upper bound for Greedy......Page 266
8.4 A polynomial-time approximation scheme is unlikely......Page 267
8.5.1 A linear approximation algorithm......Page 269
8.5.2 A non-trivial bound for a greedy solution......Page 272
8.6 DNA Sequencing and learning of strings......Page 273
8.6.1 Modelling DNA sequencing via learning......Page 274
8.6.2 Learning a string efficiently......Page 276
8.7 Superstrings with flipping......Page 277
8.8 Exercises......Page 279
8.9 Bibliographic notes......Page 280
9.1.1 Linear reductions......Page 284
9.1.2 Periodicity analysis......Page 286
9.2 Scaled matching......Page 293
9.2.1 String scaled matching......Page 294
9.2.2 Two dimensional scaled matching......Page 296
9.2.3 Partitioning a column into k-blocks......Page 299
9.3 Compressed matching......Page 301
9.3.1 Problem definition and algorithm overview......Page 302
9.3.2 The compressed matching algorithm......Page 303
9.4 Exercises......Page 307
9.5 Bibliographic notes......Page 308
10.1 Suffix trees for square matrices......Page 310
10.2 Suffix tree construction for square matrices......Page 320
10.3 Suffix trees for rectangular matrices......Page 329
10.4 Linear-space construction......Page 336
10.4.1 NOD-processing......Page 338
10.4.2 NOD-queries......Page 340
10.5 Linear expected-time construction......Page 342
10.5.1 CT's construction......Page 344
10.5.2 Sorting M' to obtain R......Page 346
10.6 Exercises......Page 349
10.7 Bibliographic notes......Page 354
11 Tree Pattern Matching......Page 358
11.1.2 A brief review of algorithmic results in exact tree matching......Page 359
11.1.3 Edit operations and edit distance......Page 361
11.1.4 Early history of approximate tree matching algorithms......Page 363
11.2.2 Basic tree edit distance computation......Page 366
11.2.3 Pattern trees with variable length don't cares......Page 370
11.2.4 Fast parallel algorithms for small differences......Page 375
11.2.5 Tree alignment problem......Page 376
11.2.6 Tree pattern discovery problem......Page 380
11.3.1 Hardness results......Page 381
11.3.2 Algorithm for tree alignment and a heuristic algorithm for tree edit......Page 383
11.4 Conclusion......Page 384
11.5 Exercises......Page 385
11.6 Bibliographic notes......Page 386
D......Page 390
L......Page 391
P......Page 392
T......Page 393
Z......Page 394
