AUTHOR

Thomas Schwentick

On the Power of Tree-Walking Automata

Tree-walking automata (TWAs) have recently received new attention in the fields of formal languages and databases. Towards a better understanding of their expressiveness, we characterize them in terms of transitive closure logic formulas in normal form. It is conjectured by Engelfriet and Hoogeboom that TWAs cannot define all regular tree languages, or equivalently, all of monadic second-order logic. We prove this conjecture for a restricted, but powerful, class of TWAs. In particular, we show that 1-bounded TWAs, that is, TWAs that are only allowed to traverse every edge of the input tree at most once in every direction, cannot define all regular languages. We then extend this result to a class …

research product

Query automata

A main task in document transformation and information retrieval is locating subtrees satisfying some pattern. Therefore, unary queries, i.e., queries that map a tree to a set of its nodes, play an important role in the context of structured document databases. We want to understand how the natural and well-studied computation model of tree automata can be used to compute such queries. We define a query automaton (QA) as a deterministic two-way finite automaton over trees that has the ability to select nodes depending on the state and the label at those nodes. We study QAs over ranked as well as over unranked trees. Unranked trees differ from ranked ones in that there is no bound on the num…

research product

Algebraic and logical characterizations of deterministic linear time classes

In this paper an algebraic characterization of the class DLIN of functions that can be computed in linear time by a deterministic RAM using only numbers of linear size is given. This class was introduced by Grandjean, who showed that it is robust and contains most computational problems that are usually considered to be solvable in deterministic linear time.

research product

Local Normal Forms for First-Order Logic with Applications to Games and Automata

Building on work of Gaifman [Gai82] it is shown that every first-order formula is logically equivalent to a formula of the form ∃x_1,...,x_l ∀y φ, where φ is r-local around y, i.e., quantification in φ is restricted to elements of the universe of distance at most r from y. From this and related normal forms, variants of the Ehrenfeucht game for first-order and existential monadic second-order logic are developed that restrict the possible strategies for the spoiler, one of the two players. This makes proofs of the existence of a winning strategy for the duplicator, the other player, easier and can thus simplify inexpressibility proofs. As another application, automata mode…

research product

Locality of order-invariant first-order formulas

A query is local if the decision of whether a tuple in a structure satisfies this query only depends on a small neighborhood of the tuple. We prove that all queries expressible by order-invariant first-order formulas are local.

research product

The Many Faces of a Translation

First-order translations have recently been characterized as the maps computed by aperiodic single-valued nondeterministic finite transducers (NFTs). It is shown here that this characterization lifts to "V-translations" and "V-single-valued-NFTs", where V is an arbitrary monoid pseudovariety. More strikingly, 2-way V-machines are introduced, and the following three models are shown exactly equivalent to Eilenberg's classical notion of a bimachine when V is a group variety or when V is the variety of aperiodic monoids: V-translations, V-single-valued-NFTs and 2-way V-transducers.

research product

On Diving in Trees

The paper is concerned with queries on tree-structured data. It defines fragments of first-order logic (FO) and FO extended by regular expressions along paths. These fragments have the same expressive power as the full logics themselves. On the other hand, they can be evaluated reasonably efficiently, even if the formula which represents the query is considered as part of the input.

research product

Graph Connectivity, Monadic NP and built-in relations of moderate degree

It has been conjectured [FSV93] that an existential second-order formula, in which the second-order quantification is restricted to unary relations (i.e., a Monadic NP formula), cannot express Graph Connectivity even in the presence of arbitrary built-in relations.

research product

On bijections vs. unary functions

A set of finite structures is in Binary NP if it can be characterized by existential second-order formulas in which second-order quantification is over relations of arity 2. In [DLS95] subclasses of Binary NP were considered, in which the second-order quantifiers range only over certain classes of relations. It was shown that many of these subclasses coincide and that all of them can be ordered in a three-level linear hierarchy, the levels of which are represented by bijections, successor relations and unary functions, respectively.

research product

The Descriptive Complexity Approach to LOGCFL

Building upon the known generalized-quantifier-based first-order characterization of LOGCFL, we lay the groundwork for a deeper investigation. Specifically, we examine subclasses of LOGCFL arising from varying the arity and nesting of groupoidal quantifiers. Our work extends the elaborate theory relating monoidal quantifiers to NC1 and its subclasses. In the absence of the BIT predicate, we resolve the main issues: we show in particular that no single outermost unary groupoidal quantifier with FO can capture all the context-free languages, and we obtain the surprising result that a variant of Greibach's "hardest context-free language" is LOGCFL-complete under quantifier-free BIT-free interpre…

research product

Machine-Independent Characterizations and Complete Problems for Deterministic Linear Time

This article presents two algebraic characterizations and two related complete problems for the complexity class DLIN that was introduced in [E. Grandjean, Ann. Math. Artif. Intell., 16 (1996), pp. 183--236]. DLIN is essentially the class of all functions that can be computed in linear time on a Random Access Machine which uses only numbers of linear value during its computations. The algebraic characterizations are in terms of recursion schemes that define unary functions. One of these schemes defines several functions simultaneously, while the other one defines only one function. From the algebraic characterizations, we derive two complete problems for DLIN under new, very strict, and mac…

research product

A Logical Characterisation of Linear Time on Nondeterministic Turing Machines

The paper gives a logical characterisation of the class NTIME(n) of problems that can be solved on a nondeterministic Turing machine in linear time. It is shown that a set L of strings is in this class if and only if there is a formula of the form ∃f_1...∃f_k ∃R_1...∃R_m ∀x φ that is true exactly for all strings in L. In this formula the f_i are unary function symbols, the R_i are unary relation symbols and φ is a quantifier-free formula. Furthermore, the quantification of functions is restricted to non-crossing, decreasing functions and in φ no equations in which different functions occur are allowed. There are a number of variations of this statement, e.g., it holds also for k = 3. From these r…

research product

Expressive and efficient pattern languages for tree-structured data (extended abstract)

It would be desirable to have a query language for tree-structured data that is (1) as easily usable as SQL, (2) as expressive as monadic second-order logic (MSO), and (3) efficiently evaluable. The paper develops some ideas in this direction. Towards (1) the specification of sets of vertices of a tree by combining conditions on their induced subtree with conditions on their path to the root is proposed. Existing query languages allow regular expressions (hence MSO logic) in path conditions but are limited in expressing subtree conditions. It is shown that such query languages fall short of capturing all MSO queries. On the other hand, allowing a certain guarded fragment of MSO-logic in the…

research product

The Descriptive Complexity Approach to LOGCFL

Building upon the known generalized-quantifier-based first-order characterization of LOGCFL, we lay the groundwork for a deeper investigation. Specifically, we examine subclasses of LOGCFL arising from varying the arity and nesting of groupoidal quantifiers. Our work extends the elaborate theory relating monoidal quantifiers to NC1 and its subclasses. In the absence of the BIT predicate, we resolve the main issues: we show in particular that no single outermost unary groupoidal quantifier with FO can capture all the context-free languages, and we obtain the surprising result that a variant of Greibach's "hardest context-free language" is LOGCFL-complete under quantifier-free BIT-free proj…

research product

Logics for context-free languages

We define matchings, and show that they capture the essence of context-freeness. More precisely, we show that the class of context-free languages coincides with the class of those sets of strings which can be defined by sentences of the form ∃b ϕ, where ϕ is first-order, b is a binary predicate symbol, and the range of the second-order quantifier is restricted to the class of matchings. Several variations and extensions are discussed.

research product

Descriptive Complexity, Lower Bounds and Linear Time

This paper surveys two related lines of research: Logical characterizations of (non-deterministic) linear time complexity classes, and non-expressibility results concerning sublogics of existential second-order logic. Starting from Fagin’s fundamental work there has been steady progress in both fields with the effect that the weakest logics that are used in characterizations of linear time complexity classes are closely related to the strongest logics for which inexpressibility proofs for concrete problems have been obtained. The paper sketches these developments and highlights their connections as well as the obstacles that prevent us from closing the remaining gap between both kinds of lo…

research product

Padding and the expressive power of existential second-order logics

Padding techniques are well-known from Computational Complexity Theory. Here, an analogous concept is considered in the context of existential second-order logics. Informally, a graph H is a padded version of a graph G, if H consists of an isomorphic copy of G and some isolated vertices. A set A of graphs is called weakly expressible by a formula ϕ in the presence of padding, if ϕ is able to distinguish between (sufficiently) padded versions of graphs from A and padded versions of graphs that are not in A.

research product