RESEARCH PRODUCT

Query automata

Frank Neven, Thomas Schwentick

subject

TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES; Theoretical computer science; Computer science; Computer Science::Logic in Computer Science; Computer Science::Databases; Computer Science::Formal Languages and Automata Theory; Automaton

description

A main task in document transformation and information retrieval is locating subtrees satisfying some pattern. Therefore, unary queries, i.e., queries that map a tree to a set of its nodes, play an important role in the context of structured document databases. We want to understand how the natural and well-studied computation model of tree automata can be used to compute such queries. We define a query automaton (QA) as a deterministic two-way finite automaton over trees that has the ability to select nodes depending on the state and the label at those nodes. We study QAs over ranked as well as over unranked trees. Unranked trees differ from ranked ones in that there is no bound on the number of children of nodes. We characterize the expressiveness of the different formalisms as the unary queries definable in monadic second-order logic (MSO). Surprisingly, in contrast to the ranked case, special stay transitions had to be added to QAs over unranked trees to capture MSO. We establish the complexity of their non-emptiness, containment, and equivalence problems to be complete for EXPTIME.
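To make the notion of a unary query concrete, the following Python sketch evaluates one such query over an unranked tree: it selects every node labelled "a" whose subtree contains a node labelled "b". This is only an illustration under simplifying assumptions, not the two-way query automaton defined in the paper: it makes a single bottom-up pass, and the names `Node`, `transition`, `selects`, and `run_query` are hypothetical. What it shares with a QA is that selection depends only on the state and the label at a node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Node of an unranked tree: any number of children is allowed."""
    label: str
    children: list = field(default_factory=list)


def transition(label, child_states):
    # Assumed two-state automaton: the state is True iff the subtree
    # rooted at this node contains a node labelled "b".
    return label == "b" or any(child_states)


def selects(state, label):
    # Selection depends only on the state and the label at the node.
    return state and label == "a"


def run_query(root):
    """Return the selected nodes as paths (tuples of child indices)."""
    selected = []

    def visit(node, path):
        # Compute the states of all children first (bottom-up pass).
        child_states = [
            visit(child, path + (i,))
            for i, child in enumerate(node.children)
        ]
        state = transition(node.label, child_states)
        if selects(state, node.label):
            selected.append(path)
        return state

    visit(root, ())
    return sorted(selected)


# Example tree: a( a(b), c, a )
tree = Node("a", [Node("a", [Node("b")]), Node("c"), Node("a")])
result = run_query(tree)
```

Here `result` contains the root and its first child; the third child is labelled "a" but has no "b" below it, so it is not selected.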

https://doi.org/10.1145/303976.303997