Search results for "pattern matching"

Showing 4 items of 44 documents

Normal, Abby Normal, Prefix Normal

2014

A prefix normal word is a binary word with the property that no substring has more 1s than the prefix of the same length. This class of words is important in the context of binary jumbled pattern matching. In this paper we present results about the number $\textit{pnw}(n)$ of prefix normal words of length $n$, showing that $\textit{pnw}(n) = \Omega\left(2^{n - c\sqrt{n\ln n}}\right)$ for some $c$ and $\textit{pnw}(n) = O\left(\frac{2^n (\ln n)^2}{n}\right)$. We introduce efficient algorithms for testing the prefix normal property and a “mechanical algorithm” for computing prefix normal forms. We also include games which can be played with prefix normal words. In these games Alice wishes t…
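The definition in the abstract above can be checked directly. The following is a minimal brute-force sketch, not the efficient testing algorithm the paper refers to; the function name `is_prefix_normal` and the representation of words as strings of `0` and `1` are assumptions made for illustration.

```python
def is_prefix_normal(word: str) -> bool:
    """Return True if no substring of `word` has more 1s than the prefix
    of the same length (quadratic-time check, for illustration only)."""
    n = len(word)
    # ones[i] holds the number of 1s in word[:i].
    ones = [0] * (n + 1)
    for i, ch in enumerate(word):
        ones[i + 1] = ones[i] + (ch == "1")
    for k in range(1, n + 1):
        # Largest number of 1s found in any substring of length k.
        max_ones = max(ones[i + k] - ones[i] for i in range(n - k + 1))
        if max_ones > ones[k]:
            return False
    return True


print(is_prefix_normal("110101"))  # True
print(is_prefix_normal("1011"))    # False: "11" has two 1s, the prefix "10" has one
```

Counting the words of a small fixed length with this check gives a quick sanity test against the definition; for length 4 it accepts 8 of the 16 binary words.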

Keywords: binary jumbled pattern matching; Efficient algorithm; membership test; Binary number; Context (language use); Prefix Normal Word Algorithm; Data_CODINGANDINFORMATIONTHEORY; prefix normal words; Omega; Substring; enumeration; Combinatorics; Prefix; normal forms; binary languages; Word (group theory); Mathematics

researchProduct

Multi-dimensional pattern matching with dimensional wildcards

1995

We introduce a new multi-dimensional pattern matching problem, which is a natural generalization of the on-line search in string matching. We are given a text matrix $A[1:n_1, \ldots, 1:n_d]$ of size $N = n_1 \times n_2 \times \cdots \times n_d$, which we may preprocess. Then, we are given, online, an $r$-dimensional pattern matrix $B[1:m_1, \ldots, 1:m_r]$ of size $M = m_1 \times m_2 \times \cdots \times m_r$, with $1 \le r \le d$. We would like to know whether $B^* = B^*[*, 1:m_1, *, \ldots, 1:m_r, *]$ occurs in $A$, where $*$ is a dimensional wildcard such that $B^*$ is any $d$-dimensional matrix having size $1 \times \cdots \times m_1 \times \cdots \times 1 \times m_r \times \cdots \times 1$ and containing the same elements as $B$. Notice that there might be $\binom{d}{r} \le 2^d$ occurrences of $B^*$ for each position of $A$. We give CRCW-PRAM algorithms for pr…
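To make the dimensional wildcard concrete, here is a small sequential sketch of the simplest case, $d = 2$ and $r = 1$, where a one-dimensional pattern may lie along either axis of a two-dimensional text (the $\binom{2}{1} = 2$ orientations). It is a brute-force illustration only and is not the CRCW-PRAM algorithm of the paper; the function name and the list-of-lists representation are assumptions.

```python
def occurs_with_dimensional_wildcard(A, B):
    """Return True if the 1-D pattern B occurs in the 2-D text A
    along a row or along a column (brute force, d = 2 and r = 1)."""
    B = list(B)
    m = len(B)
    rows = len(A)
    cols = len(A[0]) if rows else 0
    # Orientation B*[*, 1:m]: pattern laid out along a row (size 1 x m).
    for i in range(rows):
        for j in range(cols - m + 1):
            if list(A[i][j:j + m]) == B:
                return True
    # Orientation B*[1:m, *]: pattern laid out along a column (size m x 1).
    for j in range(cols):
        for i in range(rows - m + 1):
            if [A[i + t][j] for t in range(m)] == B:
                return True
    return False


text = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
print(occurs_with_dimensional_wildcard(text, [2, 5, 8]))  # True (column)
print(occurs_with_dimensional_wildcard(text, [1, 5, 9]))  # False (diagonal)
```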

Keywords: business.industry; Generalization; Commentz-Walter algorithm; Pattern recognition; Wildcard character; String searching algorithm; computer.file_format; Approximate string matching; Binary logarithm; Combinatorics; Matrix (mathematics); Artificial intelligence; Pattern matching; business; computer; Mathematics

researchProduct

Perfect Hashing Structures for Parallel Similarity Searches

2015

Seed-based heuristics have proved to be efficient for studying similarity between genetic databases with billions of base pairs. This paper focuses on algorithms and data structures for the filtering phase in seed-based heuristics, with an emphasis on efficient parallel GPU/manycore implementation. We propose a 2-stage index structure which is based on neighborhood indexing and perfect hashing techniques. This structure performs a filtering phase over the neighborhood regions around the seeds in constant time and avoids random memory accesses and branch divergences as much as possible. Moreover, it fits particularly well on parallel SIMD processors, because it requ…

researchProduct

On Prefix Normal Words

2011

We present a new class of binary words: the prefix normal words. They are defined by the property that for any given length $k$, no factor of length $k$ has more $a$'s than the prefix of the same length. These words arise in the context of indexing for jumbled pattern matching (a.k.a. permutation matching or Parikh vector matching), where the aim is to decide whether a string has a factor with a given multiplicity of characters, i.e., with a given Parikh vector. Using prefix normal words, we give the first non-trivial characterization of binary words having the same set of Parikh vectors of factors. We prove that the language of prefix normal words is not context-free and is strictly contai…
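The jumbled pattern matching question described above (does a string have a factor with a given Parikh vector?) can be answered without any index by sliding a window over the text. The sketch below is such a baseline, assuming the Parikh vector is passed as a mapping from characters to multiplicities; it does not use the prefix normal word index from the paper, and the function name is hypothetical.

```python
from collections import Counter
from typing import Mapping


def has_factor_with_parikh_vector(text: str, target: Mapping[str, int]) -> bool:
    """Return True if some factor (contiguous substring) of `text` contains
    each character exactly as often as `target` specifies."""
    k = sum(target.values())  # length of any matching factor
    if k > len(text):
        return False
    want = Counter({c: m for c, m in target.items() if m > 0})
    window = Counter(text[:k])
    if window == want:
        return True
    for i in range(k, len(text)):
        # Slide the window one position to the right.
        window[text[i]] += 1
        out = text[i - k]
        window[out] -= 1
        if window[out] == 0:
            del window[out]  # keep zero counts out so equality is exact
        if window == want:
            return True
    return False


print(has_factor_with_parikh_vector("abaab", {"a": 2, "b": 1}))  # True ("aba")
print(has_factor_with_parikh_vector("ababab", {"a": 3}))         # False
```

Each query costs time linear in the text length, which is the cost an index for jumbled pattern matching aims to avoid.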

Keywords: permutation matching; context-free languages; Search engine indexing; pre-necklaces; Binary number; Parikh vectors; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Lyndon words; non-standard pattern matching; Combinatorics; Prefix; jumbled pattern matching; Pattern matching; Computer Science::Formal Languages and Automata Theory; Mathematics; indexing

researchProduct