6533b7d8fe1ef96bd126afbe

RESEARCH PRODUCT

Parallel Construction and Query of Index Data Structures for Pattern Matching on Square Matrices

Raffaele GiancarloRoberto Grossi

subject

Statistics and ProbabilityNumerical AnalysisControl and OptimizationAlgebra and Number TheoryApplied MathematicsGeneral MathematicsSuffix treeParallel algorithmData structureSquare matrixSquare (algebra)law.inventionTree (data structure)lawPattern matchingAlgorithmMathematicsData compression

description

AbstractWe describe fast parallel algorithms for building index data structures that can be used to gather various statistics on square matrices. The main data structure is the Lsuffix tree, which is a generalization of the classical suffix tree for strings. Given ann×ntext matrixA, we build our data structures inO(logn) time withn2processors on a CRCW PRAM, so that we can quickly processAin parallel as follows: (i) report some statistical information aboutA, e.g., find the largest repeated square submatrices that appear at least twice inAor determine, for each position inA, the smallest submatrix that occurs only there; (ii) given, on-line, anm×mpattern matrixPAT, check whether it occurs inA. We refer to the above two kinds of operations as queries and point out that they have applications to visual databases and two-dimensional data compression. Query (i) takesO(logn) time withn2/lognprocessors and query (ii) takesO(logm) time withm2/logmprocessors. The query algorithms are work optimal while the construction algorithm is work optimal only for arbitrary and large alphabets.

10.1006/jcom.1998.0496http://dx.doi.org/10.1006/jcom.1998.0496