Author: Kārlis Čerāns

0000-0002-0154-5294

UML Style Graphical Notation and Editor for OWL 2

OWL is becoming the most widely used knowledge representation language. It has several textual notations but no standard graphical notation apart from the verbose ODM UML. We propose a heavyweight extension to UML class diagrams that allows a compact OWL visualization. The compactness is achieved through the native power of UML class diagrams extended with optional Manchester encoding for class expressions, thus largely eliminating the need for explicit anonymous class visualization. To use UML class diagram notation we had to modify its semantics to support the Open World Assumption that is central to OWL. We have implemented the proposed compact visualization for OWL 2 in a UML style …

research product

Noise-tolerant efficient inductive synthesis of regular expressions from good examples

We present an almost linear time method of inductive synthesis restoring simple regular expressions from one representative (good) example. In particular, we consider synthesis of expressions of star-height one, where we allow one union operation under each iteration, and synthesis of expressions without union operations from examples that may contain mistakes. In both cases we provide sufficient conditions defining precisely the class of target expressions and the notion of good examples under which the synthesis algorithm works correctly, and present the proof of correctness. In the case of expressions with unions the proof is based on novel results in the combinatorics of words. A genera…

research product

Application of Graph Clustering and Visualisation Methods to Analysis of Biomolecular Data

In this paper we present an approach based on integrated use of graph clustering and visualisation methods for semi-supervised discovery of biologically significant features in biomolecular data sets. We describe several clustering algorithms that have been custom designed for analysis of biomolecular data and feature an iterated two step approach involving initial computation of thresholds and other parameters used in clustering algorithms, which is followed by identification of connected graph components, and, if needed, by adjustment of clustering parameters for processing of individual subgraphs.

research product

Characteristic Topological Features of Promoter Capture Hi-C Interaction Networks

Current Hi-C technologies for chromosome conformation capture make it possible to understand a broad spectrum of functional interactions between genome elements. Although significant progress has been made in the analysis of Hi-C data to identify biologically significant features, many questions remain open. In this paper we describe analysis methods for Hi-C (specifically PCHi-C) interaction networks that are strictly focused on topological properties of these networks. The main questions we are trying to answer are: (1) can topological properties of interaction networks for different cell types alone be sufficient to distinguish between these types, and what the most important of such propert…

research product

Extensible Visualizations of Ontologies in OWLGrEd

OWLGrEd is a visual editor for OWL 2.0 ontologies that combines UML class diagram notation and textual OWL Manchester syntax for expressions. We review the basic OWLGrEd options for customizing ontology presentation and consider the framework of OWLGrEd extensions that enables introducing rich use-case-specific functionality to the editor. A number of available OWLGrEd extensions offering rich ontology management features to their end users are also described.

research product

Towards Graphical Query Notation for Semantic Databases

We describe a notation and a tool for schema-enabled visual/diagrammatic creation of SPARQL queries over RDF databases. The notation and the tool support both the standard basic query pattern comprising a main query class and possibly linked condition classes and means for aggregate query definition and placing conditions over aggregates including also aggregation of aggregate results. We discuss the applicability of the tool for ad-hoc query formulation in practical use cases.

research product

Graph-based network analysis of transcriptional regulation pattern divergence in duplicated yeast gene pairs

The genome and interactome of Saccharomyces cerevisiae have been characterized extensively over the course of the past few decades. However, despite many insights gained over the years, both functional studies and evolutionary analyses continue to reveal many complexities and confounding factors in the construction of reliable transcriptional regulatory network models. We present here a graph-based technique for comparing transcriptional regulatory networks based on network motif similarity for gene pairs. We construct interaction graphs for duplicated transcription factor pairs traceable to the ancestral whole-genome duplication as well as other paralogues in Saccharomyces cerevisiae. We c…

research product

RDB2OWL

RDB2OWL is a simple approach to mapping relational databases into independently developed OWL ontologies. The approach is based on creating a mapping RDB schema and filling it with mapping information, from which SQL scripts are generated that perform the instance-level transformation. We describe the RDB2OWL mapping schema and report on successful application of the technology to the migration of Latvian medical registry data.

research product

ViziQuer: A Web-Based Tool for Visual Diagrammatic Queries Over RDF Data

We demonstrate the open source ViziQuer tool for web-based creation and execution of visual diagrammatic queries over RDF/SPARQL data. The tool supports data instance-level and statistics queries, providing visual counterparts for most SPARQL 1.1 select query constructs, including aggregation and subqueries. A query environment can be created over a user-supplied SPARQL endpoint with a known data schema (a data schema exploration service is also available). There are pre-defined demonstration query environments for a mini-university data set, a fragment of a synthetic but realistic hospital data set, and a variant of the Linked Movie Database RDF data set.

research product

Algorithmic Analysis of Programs with Well Quasi-ordered Domains

Over the past few years increasing research effort has been directed towards the automatic verification of infinite-state systems. This paper is concerned with identifying general mathematical structures which can serve as sufficient conditions for achieving decidability. We present decidability results for a class of systems (called well-structured systems) which consist of a finite control part operating on an infinite data domain. The results assume that the data domain is equipped with a preorder which is a well quasi-ordering, such that the transition relation is “monotonic” (a simulation) with respect to the preorder. We show that the following properties are decidable for wel…

research product

Deciding reachability for planar multi-polynomial systems

In this paper we investigate the decidability of the reachability problem for planar non-linear hybrid systems. A planar hybrid system has the property that its state space corresponds to the standard Euclidean plane, which is partitioned into a finite number of (polyhedral) regions. To each of these regions is assigned some vector field which governs the dynamical behaviour of the system within this region. We prove the decidability of point to point and region to region reachability problems for planar hybrid systems for the case when trajectories within the regions can be described by polynomials of arbitrary degree.

research product

Using Deep Learning to Extrapolate Protein Expression Measurements

Mass spectrometry (MS)-based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed using deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. This method is tested on four datasets, in…

research product

Schema-Backed Visual Queries over Europeana and Other Linked Data Resources

We describe and demonstrate the process of extracting a data-driven schema of the Europeana cultural heritage Linked Data resource (with actual data classes, properties and their connections, and cardinalities) and the application of the extracted schema to create a visual query environment over Europeana. The extracted schema information allows generating SHACL data shapes describing the actual data endpoint structure. The schema extraction process can also be applied to other data endpoints with a moderate data schema size and a potentially large data triple count, such as the British National Bibliography Linked Data resource.

research product

Schema-Based Visual Queries over Linked Data Endpoints

We present the option to use the schema-based visual query tool ViziQuer over realistic Linked Data endpoints. We describe the tool's meta-schema structure and the means for retrieving the endpoint schema both from an OWL ontology and from a SPARQL endpoint. We report on a store of the endpoint-specific schemas and the options to support the schema presentation to the end user both as a class tree within the environment and as an external visual diagram.

research product

Towards Self-explanatory Ontology Visualization with Contextual Verbalization

Ontologies are one of the core foundations of the Semantic Web. To participate in Semantic Web projects, domain experts need to be able to understand the ontologies involved. Visual notations can provide an overview of the ontology and help users to understand the connections among entities. However, the users first need to learn the visual notation before they can interpret it correctly. A controlled natural language representation would be readable right away and might be preferred in the case of complex axioms; however, the structure of the ontology would remain less apparent. We propose to combine ontology visualizations with contextual ontology verbalizations of selected ontology (diagram) e…

research product

Advanced RDB-to-RDF/OWL Mapping Facilities in RDB2OWL

We present advanced features of the RDB2OWL mapping specification language, which allows expressing RDB-to-RDF/OWL mappings in a concise and human-comprehensible way. The RDB2OWL mappings can be regarded as documentation of the database-to-ontology relation. The RDB2OWL language uses the OWL ontology structure as a backbone for mapping specification by placing the database link information into the annotations for ontology classes and properties. Its features include reuse of database table key information, user-defined scalar and aggregate functions, table-based functions and multiclass conceptualization that is essential to keep mappings compact when large tables are mapped onto several…

research product

Database to Ontology Mapping Patterns in RDB2OWL Lite

We describe the RDB2OWL Lite language for relational database to RDF/OWL mapping specification and discuss the architectural and content specification patterns arising in mapping definition. RDB2OWL Lite is a simplification of the original RDB2OWL with aggregation possibilities and order-based filters removed, while providing in-mapping SQL view definition possibilities. The mapping constructs and their usage patterns are illustrated on mapping examples from the medical domain: medicine registries and a hospital information system. The RDB2OWL Lite mapping implementation is offered via translation into both D2RQ and standard R2RML mapping notations.

research product

ViziQuer: A Visual Notation for RDF Data Analysis Queries

Visual SPARQL query notations aim at easing the RDF data querying task. At the current state of the art there is still no generally accepted visual graph-based notation suitable for describing RDF data analysis queries that involve aggregation and subqueries. In this paper we present a visual diagram-centered notation for SPARQL select query formulation, capable of handling aggregate/statistics queries and hierarchic queries with subquery structure. The notation is supported by a web-based prototype tool. We present examples of the notation, describe its syntax and semantics, and describe studies with possible end users, involving both IT and medicine students.

research product

Self-learning inductive inference machines

Self-knowledge is a concept that is present in several philosophies. In this article, we consider the issue of whether or not a learning algorithm can in some sense possess self-knowledge. The question is answered affirmatively. Self-learning inductive inference algorithms are taken to be those that learn programs for their own algorithms, in addition to other functions.

research product

Topological structure analysis of chromatin interaction networks

Background: Current Hi-C technologies for chromosome conformation capture make it possible to understand a broad spectrum of functional interactions between genome elements. Although significant progress has been made in the analysis of Hi-C data to identify biologically significant features, many questions remain open, in particular regarding the potential biological significance of various topological features that are characteristic of chromatin interaction networks. Results: It has been previously observed that promoter capture Hi-C (PCHi-C) interaction networks tend to separate easily into well-defined connected components that can be related to certain biological functionality, however, …

research product

Network motif-based analysis of regulatory patterns in paralogous gene pairs

Current high-throughput experimental techniques make it feasible to infer gene regulatory interactions at the whole-genome level with reasonably good accuracy. Such experimentally inferred regulatory networks have become available for a number of simpler model organisms, including S. cerevisiae. The availability of such networks provides an opportunity to compare gene regulatory processes at the whole-genome level, and in particular, to assess similarity of regulatory interactions for homologous gene pairs either from the same or from different species. We present here a new technique for analyzing the regulatory interaction neighborhoods of paralogous gene pairs. Our central focu…

research product