6533b81ffe1ef96bd1277aed

RESEARCH PRODUCT

Multi-Dimensional Pattern Matching with Dimensional Wildcards: Data Structures and Optimal On-Line Search Algorithms

Raffaele GiancarloRoberto Grossi

subject

Control and OptimizationSuffix treeBlock matrixWildcard characterString searching algorithmcomputer.file_formatData structurelaw.inventionCombinatoricsComputational MathematicsMatrix (mathematics)Computational Theory and MathematicsSearch algorithmlawPattern matchingcomputerMathematics

description

We introduce a new multidimensional pattern matching problem that is a natural generalization of string matching, a well studied problem1. The motivation for its algorithmic study is mainly theoretical. LetA1:n1,?,1:nd be a text matrix withN=n1?ndentries andB1:m1,?,1:mr be a pattern matrix withM=m1?mrentries, whered?r?1 (the matrix entries are taken from an ordered alphabet ?). We study the problem of checking whether somer-dimensional submatrix ofAis equal toB(i.e., adecisionquery).Acan be preprocessed andBis given on-line. We define a new data structure for preprocessingAand propose CRCW-PRAM algorithms that build it inO(logN) time withN2/nmaxprocessors, wherenmax=max(n1,?,nd), such that the decision query forBtakesO(M) work andO(logM) time. By using known techniques, we would get the same preprocessing bounds but anO((dr)M) work bound for the decision query. The latter bound is undesirable since it can depend exponentially ond; our bound, in contrast, is independent ofdand optimal. We can also answer, in optimal work, two further types of queries: (a) anenumerationquery retrieving all ther-dimensional submatrices ofAthat are equal toBand (b) anoccurrencequery retrieving only the distinct positions inAthat correspond to all of these submatrices. As a byproduct, we also derive the first efficient sequential algorithms for the new problem.

https://doi.org/10.1006/jagm.1996.0844