Author: Ernst Althaus

0000000000004037

AUTHOR

Ernst Althaus

Graph Rewriting Based Search for Molecular Structures: Definitions, Algorithms, Hardness

We define a graph rewriting system that is easily understandable by humans, but rich enough to allow very general queries to molecule databases. It is based on the substitution of a single node in a node- and edge-labeled graph by an arbitrary graph, explicitly assigning new endpoints to the edges incident to the replaced node. For these graph rewriting systems, we are interested in the subgraph-matching problem. We show that the problem is NP-complete, even on graphs that are stars. As a positive result, we give an algorithm which is polynomial if both rules and query graph have bounded degree and bounded cut size. We demonstrate that molecular graphs of practically relevant molecules in d…

research product

Integration of an LP Solver into Interval Constraint Propagation

This paper describes the integration of an LP solver into iSAT, a Satisfiability Modulo Theories solver that can solve Boolean combinations of linear and nonlinear constraints. iSAT is a tight integration of the well-known DPLL algorithm and interval constraint propagation allowing it to reason about linear and nonlinear constraints. As interval arithmetic is known to be less efficient on solving linear programs, we will demonstrate how the integration of an LP solver can improve the overall solving performance of iSAT.

research product

Improving Interpolants for Linear Arithmetic

Craig interpolation for satisfiability modulo theory formulas have come more into focus for applications of formal verification. In this paper we, introduce a method to reduce the size of linear constraints used in the description of already computed interpolant in the theory of linear arithmetic with respect to the number of linear constraints. We successfully improve interpolants by combining satisfiability modulo theory and linear programming in a local search heuristic. Our experimental results suggest a lower running time and a larger reduction compared to other methods from the literature.

research product

Learning Molecular Classes from Small Numbers of Positive Examples Using Graph Grammars

We consider the following problem: A researcher identified a small number of molecules with a certain property of interest and now wants to find further molecules sharing this property in a database. This can be described as learning molecular classes from small numbers of positive examples. In this work, we propose a method that is based on learning a graph grammar for the molecular class. We consider the type of graph grammars proposed by Althaus et al. [2], as it can be easily interpreted and allows relatively efficient queries. We identify rules that are frequently encountered in the positive examples and use these to construct a graph grammar. We then classify a molecule as being conta…

research product

A Greedy Algorithm for Hierarchical Complete Linkage Clustering

We are interested in the greedy method to compute an hierarchical complete linkage clustering. There are two known methods for this problem, one having a running time of \({\mathcal O}(n^3)\) with a space requirement of \({\mathcal O}(n)\) and one having a running time of \({\mathcal O}(n^2 \log n)\) with a space requirement of Θ(n 2), where n is the number of points to be clustered. Both methods are not capable to handle large point sets. In this paper, we give an algorithm with a space requirement of \({\mathcal O}(n)\) which is able to cluster one million points in a day on current commodity hardware.

research product

Verification of linear hybrid systems with large discrete state spaces using counterexample-guided abstraction refinement

Abstract We present a counterexample-guided abstraction refinement ( CEGAR) approach for the verification of safety properties of linear hybrid automata with large discrete state spaces, such as naturally arising when incorporating health state monitoring and degradation levels into the controller design. Such models can – in contrast to purely functional controller models – not be analyzed with hybrid verification engines relying on explicit representations of modes, but require fully symbolic representations for both the continuous and discrete part of the state space. The presented abstraction methods directly work on a symbolic representation of arbitrary non-convex combinations of line…

research product

Scheduling of Real-Time Networks with a Column Generation Approach

We present an algorithm based on column generation for the real-time scheduling problem of allocating periodic tasks to electronic control units in multiple subsystems connected by a global bus. The allocation has to ensure that tasks can be scheduled, and messages between tasks in different subsystems can be transmitted over the global bus and meet their deadlines. Also tasks and messages occurring in a task chain must be scheduled in a way such that the sequence of execution meets their end-to-end deadline. We show that our approach computes the optimal allocation in our model and due to the column generation approach early provides lower bounds on the optimal value.

research product

Trunk Packing Revisited

For trunk packing problems only few approximation schemes are known, mostly designed for the European standard DIN 70020 [6] with equally sized boxes [8, 9, 11, 12]. In this paper two discretized approaches for the US standard SAE J1100 [10] are presented, which make use of different box sizes. An exact branch-and-bound algorithm for weighted independent sets on graphs is given, using the special structure of the SAE standard. Another branch-and-bound packing algorithm using linear programs is presented. With these algorithms axis-oriented packings of different box sizes in an arbitrary trunk geometry can be computed efficiently.

research product

Integer linear programming in computational biology

Computational molecular biology (bioinformatics) is a young research field that is rich in NP-hard optimization problems. The problem instances encountered are often huge and comprise thousands of variables. Since their introduction into the field of bioinformatics in 1997, integer linear programming (ILP) techniques have been successfully applied to many optimization problems. These approaches have added much momentum to development and progress in related areas. In particular, ILP-based approaches have become a standard optimization technique in bioinformatics. In this review, we present applications of ILP-based techniques developed by members and former members of Kurt Mehlhorn's group.…

research product

On the Low-Dimensional Steiner Minimum Tree Problem in Hamming Metric

It is known that the d-dimensional Steiner Minimum Tree Problem in Hamming metric is NP-complete if d is considered to be a part of the input. On the other hand, it was an open question whether the problem is also NP-complete in fixed dimensions. In this paper we answer this question by showing that the problem is NP-complete for any dimension strictly greater than 2. We also show that the Steiner ratio is 2 - 2/d for d ≥ 2. Using this result, we tailor the analysis of the so-called k-LCA approximation algorithm and show improved approximation guarantees for the special cases d = 3 and d = 4.

research product

Precise and efficient parametric path analysis

Hard real-time systems require tasks to finish in time. To guarantee the timeliness of such a system, static timing analyses derive upper bounds on the worst-case execution time (WCET) of tasks. There are two types of timing analyses: numeric and parametric. A numeric analysis derives a numeric timing bound and, to this end, assumes all information such as loop bounds to be given a priori. If these bounds are unknown during analysis time, a parametric analysis can compute a timing formula parametric in these variables. A performance bottleneck of timing analyses, numeric and especially parametric, is the so-called path analysis, which determines the path in the analyzed task with the longes…

research product

Fully Automatic Trunk Packing with Free Placements

We present a new algorithm to compute the volume of a trunk according to the SAE J1100 standard. Our new algorithm uses state-of-the-art methods from computational geometry and from combinatorial optimization. It finds better solutions than previous approaches for small trunks.

research product

Most Diverse Near-Shortest Paths

Computing the shortest path in a road network is a fundamental problem that has attracted lots of attention. However, in many real-world scenarios, determining solely the shortest path is not enough as users want to have additional, alternative ways of reaching their destination. In this paper, we investigate a novel variant of alternative routing, termed the k-Most Diverse Near-Shortest Paths (kMDNSP). In contrast to previous work, kMDNSP aims at maximizing the diversity of the recommended paths, while bounding their length based on a user-defined constraint. Our theoretical analysis proves the NP-hardness of the problem at hand. To compute an exact solution to kMDNSP, we present an algori…

research product

Algorithms for the Maximum Weight Connected $$k$$-Induced Subgraph Problem

Finding differentially regulated subgraphs in a biochemical network is an important problem in bioinformatics. We present a new model for finding such subgraphs which takes the polarity of the edges (activating or inhibiting) into account, leading to the problem of finding a connected subgraph induced by \(k\) vertices with maximum weight. We present several algorithms for this problem, including dynamic programming on tree decompositions and integer linear programming. We compare the strength of our integer linear program to previous formulations of the \(k\)-cardinality tree problem. Finally, we compare the performance of the algorithms and the quality of the results to a previous approac…

research product

A Column Generation Approach to Scheduling of Periodic Tasks

We present an algorithm based on column generation for a real time scheduling problem, in which all tasks appear regularly after a given period. Furthermore, the tasks exchange messages, which have to be transferred over a bus, if the tasks involved are executed on different ECUs. Experiments show that for large instances our preliminary implementation is faster than the previous approach based on an integer linear programming formulation using a state-of-the-art solver.

research product

On the low-dimensional Steiner minimum tree problem in Hamming metric

While it is known that the d-dimensional Steiner minimum tree problem in Hamming metric is NP-complete if d is part of the input, it is an open question whether this also holds for fixed dimensions. In this paper, this question is answered by showing that the Steiner minimum tree problem in Hamming metric is already NP-complete in 3 dimensions. Furthermore, we show that, the minimum spanning tree gives a 2-2d approximation on the Steiner minimum tree for d>=2. Using this result, we analyse the so-called k-LCA and A"k approximation algorithms and show improved approximation guarantees for low dimensions.

research product

Certifying feasibility and objective value of linear programs

Abstract We present an algorithm that certifies the feasibility of a linear program and computes a safe bound on its objective value while using rational arithmetic as little as possible. Our approach relies on computing a feasible solution that is as far as possible from satisfying an inequality at equality. To this end, we have to detect the set of inequalities that can only be satisfied at equality. Compared to previous approaches, our algorithm has a much higher success rate.

research product

Efficient computation of root mean square deviations under rigid transformations

The computation of root mean square deviations (RMSD) is an important step in many bioinformatics applications. If approached naively, each RMSD computation takes time linear in the number of atoms. In addition, a careful implementation is required to achieve numerical stability, which further increases runtimes. In practice, the structural variations under consideration are often induced by rigid transformations of the protein, or are at least dominated by a rigid component. In this work, we show how RMSD values resulting from rigid transformations can be computed in constant time from the protein's covariance matrix, which can be precomputed in linear time. As a typical application scenar…

research product

Fast and Accurate Bounds on Linear Programs

We present an algorithm that certifies the feasibility of a linear program while using rational arithmetic as little as possible. Our approach relies on computing a feasible solution of the linear program that is as far as possible from satisfying an inequality at equality. To realize such an approach, we have to detect the set of inequalities that can only be satisfied at equality. Compared to previous approaches for this problem our algorithm has a much higher rate of success.

research product

Computing Euclidean Steiner trees over segments

In the classical Euclidean Steiner minimum tree (SMT) problem, we are given a set of points in the Euclidean plane and we are supposed to find the minimum length tree that connects all these points, allowing the addition of arbitrary additional points. We investigate the variant of the problem where the input is a set of line segments. We allow these segments to have length 0, i.e., they are points and hence we generalize the classical problem. Furthermore, they are allowed to intersect such that we can model polygonal input. As in the GeoSteiner approach of Juhl et al. (Math Program Comput 10(2):487–532, 2018) for the classical case, we use a two-phase approach where we construct a superse…

research product

Symbolic Worst Case Execution Times

In immediate or hard real-time systems the correctness of an operation depends not only upon its logical correctness, but also on the time in which it is computed. In such systems, it is imperative that operations are performed within a given deadline because missing this deadline constitutes the failure of the complete system. Such systems include medical systems, flight control systems and other systems whose failure in responding punctually results in a high economical loss or even in the loss of human lives. These systems are usually analyzed in a sequence of steps in which first, a socalled control flow graph (CFG) is constructed that represents possible program flows. Furthermore, bou…

research product

Optimal Tree Decompositions Revisited: A Simpler Linear-Time FPT Algorithm

In 1996, Bodlaender showed the celebrated result that an optimal tree decomposition of a graph of bounded treewidth can be found in linear time. The algorithm is based on an algorithm of Bodlaender and Kloks that computes an optimal tree decomposition given a non-optimal tree decomposition of bounded width. Both algorithms, in particular the second, are hardly accessible. We present the second algorithm in a much simpler way in this paper and refer to an extended version for the first. In our description of the second algorithm, we start by explaining how all tree decompositions of subtrees defined by the nodes of the given tree decomposition can be enumerated. We group tree decompositions …

research product