
AUTHOR

Roberto Grossi

showing 7 related works from this author

Multi-dimensional pattern matching with dimensional wildcards

1995

We introduce a new multi-dimensional pattern matching problem, which is a natural generalization of the on-line search in string matching. We are given a text matrix A[1:n_1, ..., 1:n_d] of size N = n_1 × n_2 × ... × n_d, which we may preprocess. Then we are given, on-line, an r-dimensional pattern matrix B[1:m_1, ..., 1:m_r] of size M = m_1 × m_2 × ... × m_r, with 1 ≤ r ≤ d. We would like to know whether B* = B*[*, 1:m_1, *, ..., 1:m_r, *] occurs in A, where * is a dimensional wildcard such that B* is any d-dimensional matrix having size 1 × ... × m_1 × ... × 1 × m_r × ... × 1 and containing the same elements as B. Notice that there might be (d choose r) ≤ 2^d occurrences of B* for each position of A. We give CRCW-PRAM algorithms for pr…
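The decision problem in this abstract can be illustrated with a brute-force check. The sketch below is not the paper's CRCW-PRAM algorithm and does no preprocessing; it simply tries every increasing choice of r text axes (the (d choose r) wildcard placements) at every start position. The function names and the dict-based matrix representation are assumptions made for the illustration.

```python
from itertools import combinations, product


def occurs_with_wildcards(text, text_shape, pattern, pattern_shape):
    """Brute-force decision query: does B* occur anywhere in A?

    text and pattern are dicts mapping 0-based index tuples to symbols.
    """
    d, r = len(text_shape), len(pattern_shape)
    # Each increasing choice of r text axes is one placement of the wildcards.
    for axes in combinations(range(d), r):
        # Extent of B* along every text axis: m_j on chosen axes, 1 elsewhere.
        extent = [1] * d
        for j, axis in enumerate(axes):
            extent[axis] = pattern_shape[j]
        if any(e > n for e, n in zip(extent, text_shape)):
            continue  # B* does not fit along some axis
        ranges = [range(n - e + 1) for n, e in zip(text_shape, extent)]
        for start in product(*ranges):
            if _matches_at(text, pattern, axes, start):
                return True
    return False


def _matches_at(text, pattern, axes, start):
    """Compare every pattern entry with the text entry it is laid over."""
    for index, symbol in pattern.items():
        position = list(start)
        for j, axis in enumerate(axes):
            position[axis] += index[j]
        if text[tuple(position)] != symbol:
            return False
    return True


# A 2 x 3 text matrix (d = 2) and 1-dimensional patterns (r = 1).
rows = ["abc", "def"]
A = {(i, j): ch for i, row in enumerate(rows) for j, ch in enumerate(row)}
shape_A = (2, 3)


def vector(s):
    """Hypothetical helper: a string as a 1-dimensional pattern and its shape."""
    return {(i,): ch for i, ch in enumerate(s)}, (len(s),)


print(occurs_with_wildcards(A, shape_A, *vector("bc")))  # run along a row
print(occurs_with_wildcards(A, shape_A, *vector("be")))  # run along a column
print(occurs_with_wildcards(A, shape_A, *vector("ae")))  # diagonal only
```

With d = 2 and r = 1 there are two placements per position, along a row or along a column, so a pattern that only appears diagonally is not reported.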

Generalization; Commentz-Walter algorithm; Pattern recognition; Wildcard character; String searching algorithm; Approximate string matching; Binary logarithm; Combinatorics; Matrix (mathematics); Artificial intelligence; Pattern matching; Mathematics
researchProduct

Parallel Construction and Query of Index Data Structures for Pattern Matching on Square Matrices

1999

We describe fast parallel algorithms for building index data structures that can be used to gather various statistics on square matrices. The main data structure is the Lsuffix tree, which is a generalization of the classical suffix tree for strings. Given an n × n text matrix A, we build our data structures in O(log n) time with n^2 processors on a CRCW PRAM, so that we can quickly process A in parallel as follows: (i) report some statistical information about A, e.g., find the largest repeated square submatrices that appear at least twice in A or determine, for each position in A, the smallest submatrix that occurs only there; (ii) given, on-line, an m × m pattern matrix PAT, check whether it occurs i…

Statistics and Probability; Numerical Analysis; Control and Optimization; Algebra and Number Theory; Applied Mathematics; General Mathematics; Suffix tree; Parallel algorithm; Data structure; Square matrix; Square (algebra); Tree (data structure); Pattern matching; Algorithm; Mathematics; Data compression; Journal of Complexity
researchProduct

A trie-based approach for compacting automata

2004

We describe a new technique for reducing the number of nodes and symbols in automata based on tries. The technique stems from some results on anti-dictionaries for data compression and, unlike other methods based on compact automata, does not need to retain the input string. The net effect is a lighter automaton than the directed acyclic word graph (DAWG) of Blumer et al.: it uses fewer nodes while its arcs are still labeled by single characters.

automata; Computer science; Suffix tree; [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]; 0102 computer and information sciences; 02 engineering and technology; ω-automaton; 01 natural sciences; index text compression; factor and suffix; Trie; 0202 electrical engineering, electronic engineering, information engineering; Automata and formal languages; Pattern matching; Directed acyclic word graph; String (computer science); Directed graph; Directed acyclic graph; Mobile automaton; Automaton; 010201 computation theory & mathematics; 020201 artificial intelligence & image processing; Algorithm; Computer Science::Formal Languages and Automata Theory
researchProduct

Multi-Dimensional Pattern Matching with Dimensional Wildcards: Data Structures and Optimal On-Line Search Algorithms

1997

We introduce a new multidimensional pattern matching problem that is a natural generalization of string matching, a well-studied problem. The motivation for its algorithmic study is mainly theoretical. Let A[1:n_1, ..., 1:n_d] be a text matrix with N = n_1 × ... × n_d entries and B[1:m_1, ..., 1:m_r] be a pattern matrix with M = m_1 × ... × m_r entries, where d ≥ r ≥ 1 (the matrix entries are taken from an ordered alphabet Σ). We study the problem of checking whether some r-dimensional submatrix of A is equal to B (i.e., a decision query). A can be preprocessed and B is given on-line. We define a new data structure for preprocessing A and propose CRCW-PRAM algorithms that build it in O(log N) time with N^2/n_max processors, where n_max = max(n_1, ..., n_d), such that …

Control and Optimization; Suffix tree; Block matrix; Wildcard character; String searching algorithm; Data structure; Combinatorics; Computational Mathematics; Matrix (mathematics); Computational Theory and Mathematics; Search algorithm; Pattern matching; Mathematics; Journal of Algorithms
researchProduct

Linear-size suffix tries

2016

Suffix trees are highly regarded data structures for text indexing and string algorithms [McCreight 76, Weiner 73]. For any given string w of length n = |w|, a suffix tree for w takes O(n) nodes and links. It is often presented as a compacted version of a suffix trie for w, where the latter is the trie (or digital search tree) built on the suffixes of w. Here the compaction process replaces each maximal chain of unary nodes with a single arc. For this, the suffix tree requires that the labels of its arcs are substrings encoded as pointers to w (or equivalent information). On the contrary, the arcs of the suffix trie are labeled by single symbols but there can be Θ(n^2) nodes and lin…
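To make the size gap between the two structures concrete, here is a minimal sketch that builds a naive suffix trie with nested dicts and counts how many nodes would remain if every unary chain were collapsed into a single arc. It is not the linear-size construction from the paper; the function names and the `$` end marker are assumptions made for the illustration.

```python
def suffix_trie(w):
    """Naive suffix trie of w as nested dicts (one dict per node)."""
    root = {}
    for i in range(len(w)):
        node = root
        for ch in w[i:]:
            node = node.setdefault(ch, {})
    return root


def count_nodes(node):
    """Number of nodes in the trie rooted at node."""
    return 1 + sum(count_nodes(child) for child in node.values())


def count_compacted_nodes(node, is_root=True):
    """Nodes left after collapsing unary chains: root, leaves, branching nodes."""
    keep = is_root or len(node) != 1
    return int(keep) + sum(
        count_compacted_nodes(child, False) for child in node.values()
    )


# Strings of the form a^k b^k $ make the trie grow quadratically,
# while the compacted node count stays linear in the string length.
for k in (2, 4, 8, 16):
    w = "a" * k + "b" * k + "$"
    trie = suffix_trie(w)
    print(len(w), count_nodes(trie), count_compacted_nodes(trie))
```

The sketch only counts nodes. An actual suffix tree would also label each collapsed arc with a pair of pointers into w, which is the dependence on the input string that the abstract refers to.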

Compressed suffix array; General Computer Science; Suffix tree; [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]; Generalized suffix tree; 0102 computer and information sciences; 02 engineering and technology; Data_CODINGANDINFORMATIONTHEORY; Text indexing; 01 natural sciences; Y-fast trie; Longest common substring problem; Theoretical Computer Science; Combinatorics; Factor and suffix automata; 0202 electrical engineering, electronic engineering, information engineering; Data_FILES; Arithmetic; Pattern matching; Mathematics; Settore INF/01 - Informatica; X-fast trie; Computer Science (all); LCP array; 010201 computation theory & mathematics; 020201 artificial intelligence & image processing; FM-index
researchProduct

On the Construction of Classes of Suffix Trees for Square Matrices: Algorithms and Applications

1996

We provide a uniform framework for the study of index data structures for a two-dimensional matrix TEXT[1:n, 1:n] whose entries are drawn from an ordered alphabet Σ. An index for TEXT can be informally seen as the two-dimensional analog of the suffix tree for a string. It allows on-line searches and statistics to be performed on TEXT by representing compactly the Θ(n^3) square submatrices of TEXT in optimal O(n^2) space. We identify 4^(n−1) families of indices for TEXT, each containing ∏_{i=1}^{n} (2i−1)! isomorphic data structures. We also develop techniques leading to a single algorithm that efficiently builds any index in any family in O(n^2 log n) time and O(n^2) space. Such an algorithm improves in various …
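The cubic count of square submatrices mentioned in this abstract follows from a short calculation, given here as standard arithmetic rather than as a derivation taken from the paper: a k × k submatrix of an n × n matrix can start at (n − k + 1)^2 positions, so summing over all sizes k gives

```latex
\[
\sum_{k=1}^{n} (n-k+1)^2 \;=\; \sum_{j=1}^{n} j^2 \;=\; \frac{n(n+1)(2n+1)}{6} \;=\; \Theta(n^3).
\]
```

An index that represents all of these in O(n^2) space therefore uses space linear in the number of matrix entries.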

Discrete mathematics; Suffix tree; String (computer science); Generalized suffix tree; Block matrix; Data structure; Square matrix; Computer Science Applications; Theoretical Computer Science; Combinatorics; Computational Theory and Mathematics; Tree (set theory); Suffix; Information Systems; Mathematics
researchProduct

On the construction of classes of suffix trees for square matrices: Algorithms and applications

1995

Given an n × n TEXT matrix with entries defined over an ordered alphabet Σ, we introduce 4^(n−1) classes of index data structures for TEXT. Those indices are informally the two-dimensional analog of the suffix tree of a string [15], allowing on-line searches and statistics to be performed on TEXT. We provide one simple algorithm that efficiently builds any chosen index in those classes in O(n^2 log n) worst-case time using O(n^2) space. The algorithm can be modified to require optimal O(n^2) expected time for bounded Σ.

Combinatorics; Compressed suffix array; Suffix tree; String (computer science); Generalized suffix tree; Suffix array; Suffix; Algorithm; FM-index; Mathematics; Longest common substring problem
researchProduct