AUTHOR
Edgars Celms
Pini Language and PiniTree Ontology Editor: Annotation and Verbalisation for Atomised Journalism
We present a new ontology language, Pini, and the PiniTree ontology editor that supports it. Although the Pini language bears many similarities to RDF, UML class diagrams, Property Graphs and their frontends such as Google Knowledge Graph and Protege, it is a more expressive language that enables FrameNet-style natural language annotation for the atomised journalism use case.
Metamodel specialization based DSL for DL lifecycle data management
A new Domain Specific Language (DSL) based approach to Deep Learning (DL) lifecycle data management (LDM) is presented: a very simple but universal DL LDM tool that is still usable in practice (called the Core tool), and an advanced extension mechanism that converts the Core tool into a DSL tool building framework for DL LDM tasks. The method is based on the metamodel specialisation approach for DSL modeling tools introduced by the authors.
Application of Graph Clustering and Visualisation Methods to Analysis of Biomolecular Data
In this paper we present an approach based on the integrated use of graph clustering and visualisation methods for semi-supervised discovery of biologically significant features in biomolecular data sets. We describe several clustering algorithms that have been custom designed for the analysis of biomolecular data. They feature an iterated two-step approach: initial computation of thresholds and other parameters used in the clustering algorithms, followed by identification of connected graph components and, if needed, adjustment of clustering parameters for processing of individual subgraphs.
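The iterated two-step idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the edge-weight threshold, the `max_size` limit and the fixed `step` increment are all assumptions made for illustration, and edge weights are assumed to lie in [0, 1].

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components (as sets) of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:  # iterative depth-first search
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def cluster(nodes, weighted_edges, threshold, max_size, step=0.1):
    """Hypothetical two-step clustering.

    Step 1: keep only edges with weight >= threshold.
    Step 2: take connected components; any component larger than
    max_size is re-processed on its own with a stricter threshold.
    """
    kept = [(u, v) for u, v, w in weighted_edges if w >= threshold]
    clusters = []
    for comp in connected_components(nodes, kept):
        # Stop refining once the component is small enough or the
        # threshold can no longer be raised (weights assumed in [0, 1]).
        if len(comp) <= max_size or threshold + step > 1.0:
            clusters.append(comp)
        else:
            sub_edges = [(u, v, w) for u, v, w in weighted_edges
                         if u in comp and v in comp]
            clusters.extend(
                cluster(sorted(comp), sub_edges, threshold + step, max_size, step)
            )
    return clusters
```

Calling `cluster(nodes, edges, threshold=0.3, max_size=3)` on a small weighted graph first splits it at the initial threshold and then raises the threshold only inside components that are still too large, which mirrors the per-subgraph parameter adjustment mentioned above.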
Variation in genomic landscape of clear cell renal cell carcinoma across Europe
The incidence of renal cell carcinoma (RCC) is increasing worldwide, and its prevalence is particularly high in some parts of Central Europe. Here we undertake whole-genome and transcriptome sequencing of clear cell RCC (ccRCC), the most common form of the disease, in patients from four different European countries with contrasting disease incidence to explore the underlying genomic architecture of RCC. Our findings support previous reports on frequent aberrations in the epigenetic machinery and PI3K/mTOR signalling, and uncover novel pathways and genes affected by recurrent mutations and abnormal transcriptome patterns including focal adhesion, components of extracellular matrix (ECM) and …
Characteristic Topological Features of Promoter Capture Hi-C Interaction Networks
Current Hi-C technologies for chromosome conformation capture make it possible to understand a broad spectrum of functional interactions between genome elements. Although significant progress has been made in the analysis of Hi-C data to identify biologically significant features, many questions remain open. In this paper we describe analysis methods for Hi-C (specifically PCHi-C) interaction networks that focus strictly on the topological properties of these networks. The main questions we are trying to answer are: (1) can topological properties of interaction networks for different cell types alone be sufficient to distinguish between these types, and what the most important of such propert…
PASSIM – an open source software system for managing information in biomedical studies
Background: One of the crucial aspects of day-to-day laboratory information management is the collection, storage and retrieval of information about research subjects and biomedical samples. An efficient link between sample data and experiment results is absolutely imperative for a successful outcome of a biomedical study. Currently available software solutions are largely limited to large-scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such a LIMS can indeed bring laboratory information management to a higher level, but often implies a significant investment of time, effort and funds, which are not always available. There is a clear need for lightweig…
Graphical Template Language for Transformation Synthesis
Higher-Order Transformations (HOT) have become an important support for the development of model transformations in various transformation languages. Most frequently HOTs are used to synthesize transformations from different kinds of models, for example, mapping models. This means that model driven development (MDD) is being successfully applied to transformations themselves too. The standard HOT solution is to create the transformation as a model using the abstract syntax. However, for graphical transformation languages a significantly more efficient solution would be to create the transformation using its graphical (concrete) syntax. An analogy could be the textual template languages such…
Domain-Driven Reuse of Software Design Models
This chapter presents an approach to software development where model driven development and software reuse facilities are combined in a natural way. The basis for all of this is a semiformal requirements language RSL. The requirements in RSL consist of use cases refined by scenarios in a simple controlled natural language and the domain vocabulary containing the domain concepts. The chapter shows how model transformations building a platform independent model (PIM) can be applied directly to the requirements specified in RSL by domain experts. Further development of the software case (PSM, code) is also supported by transformations, which in addition ensure a rich traceability within the s…
Saying Hello World with MOLA - A Solution to the TTC 2011 Instructive Case
This paper describes a solution to the Hello World transformations in the MOLA transformation language. The transformations implementing the task are relatively straightforward and easily inferable from the task specification. The required additional steps related to model import and export are also described.
Model transformation language MOLA
The paper describes a new graphical model transformation language, MOLA. The basic idea of MOLA is to merge traditional structured programming as a control structure with pattern-based transformation rules. The key language element is a graphical loop concept. The main goal of MOLA is to describe model transformations in a natural and easily readable way.
Graph-based network analysis of transcriptional regulation pattern divergence in duplicated yeast gene pairs
The genome and interactome of Saccharomyces cerevisiae have been characterized extensively over the course of the past few decades. However, despite many insights gained over the years, both functional studies and evolutionary analyses continue to reveal many complexities and confounding factors in the construction of reliable transcriptional regulatory network models. We present here a graph-based technique for comparing transcriptional regulatory networks based on network motif similarity for gene pairs. We construct interaction graphs for duplicated transcription factor pairs traceable to the ancestral whole-genome duplication as well as other paralogues in Saccharomyces cerevisiae. We c…
Using Deep Learning to Extrapolate Protein Expression Measurements
Mass spectrometry (MS)-based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed using deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. This method is tested on four datasets, in…
Editor Definition Language and Its Implementation
A universal graphical editor definition language, based on a logical metamodel extended by presentation classes, is proposed. The implementation principles of this language, based on the Graphical Diagramming Engine, are described.
Behaviour modelling notation for information system design
The paper discusses problems related to behaviour modelling within the platform independent model (PIM) during model driven design. The emphasis is on design problems for information systems, especially on building a behaviour draft. First, issues in the traditional approach using sequence diagrams are discussed. Then a new approach based on activity diagrams is proposed. An extension of the activity diagram notation, specifically oriented towards comprehensive and readable behaviour design description, is presented.
Towards DSL for DL Lifecycle Data Management
A new method based on the Domain Specific Language (DSL) approach to tool support for Deep Learning (DL) lifecycle data management is presented: a very simple DL lifecycle data management tool that is nevertheless usable in practice (called the Core tool), and an advanced extension mechanism that in effect converts the Core tool into a framework for building domain specific tools (DSL tools) for DL lifecycle data management tasks. The extension mechanism will be based on the metamodel specialization approach to DSL modeling tools introduced by the authors. The main idea of metamodel specialization is that we first define the Universal Metamodel (UMM) for a domain and then for each use case define a …
Towards Self-explanatory Ontology Visualization with Contextual Verbalization
Ontologies are one of the core foundations of the Semantic Web. To participate in Semantic Web projects, domain experts need to be able to understand the ontologies involved. Visual notations can provide an overview of the ontology and help users to understand the connections among entities. However, users first need to learn the visual notation before they can interpret it correctly. A controlled natural language representation would be readable right away and might be preferred in the case of complex axioms; however, the structure of the ontology would remain less apparent. We propose to combine ontology visualizations with contextual ontology verbalizations of selected ontology (diagram) e…
Programming languages for data-Intensive HPC applications: A systematic mapping study
This work is a result of activities from COST Action IC1406 High-Performance Modelling and Simulation for Big Data Applications (cHiPSet), funded by the European Cooperation in Science and Technology. FCT, Portugal, for grants: NOVA LINCS Research Laboratory (Ref. UID/CEC/04516/2019); INESC-ID (Ref. UID/CEC/50021/2019); BioISI (Ref. UID/MULTI/04046/2103); LASIGE Research Unit (Ref. UID/CEC/00408/2019). A major challenge in modelling and simulation is the need to combine expertise in both software technologies and a given scientific domain. When High-Performance Computing (HPC) is required to solve a scientific problem, software development becomes a problematic issue. Considering the complexity…
Developing Software with Domain-Driven Model Reuse
This chapter presents an approach to software development where model-driven development and software reuse facilities are combined in a natural way. It shows how model transformations building a Platform Independent Model (PIM) can be applied directly to the requirements specified in RSL by domain experts. Further development of the software case (PSM, code) is also supported by transformations, which in addition ensure a rich traceability within the software case. Alternatively, the PSM model and code can also be generated directly from requirements in RSL, thus providing fast development of the final code of at least a system prototype in many situations. The reuse support relies on a si…
RDF* Graph Database as Interlingua for the TextWorld Challenge
This paper briefly describes the top-scoring submission to the First TextWorld Problems: A Reinforcement and Language Learning Challenge. To alleviate the partial observability problem characteristic of the TextWorld games, we split the Agent into two independent components, Observer and Actor, communicating only via the Interlingua of the RDF* graph database. The RDF* graph database serves as the “world model” memory, incrementally updated by the Observer via FrameNet-informed Natural Language Understanding techniques, and is used by the Actor for efficient exploration and planning of the game Action sequences. We find that the deep-learning approach works best for the Observer componen…
Tree Based Domain-Specific Mapping Languages
Model transformation languages have been mainly used by researchers; the software engineering industry has not yet widely accepted model driven software development (MDSD). One of the main reasons is the complexity of the metamodelling principles that developers are required to know to actually use model transformations in the way the OMG has stated. We offer the basic principles of how to create domain-specific model transformation languages which can be used by developers relying only on familiar modelling concepts. We propose to use simple graphical mappings to specify the correspondence between source and target models which are represented using trees based on the concrete syntax of und…
Graph-based characterisations of cell types and functionally related modules in promoter capture Hi-C Data
From Requirements to Code in a Model Driven Way
Though there is a lot of support for model driven development, the support for a complete model driven path from requirements to code is limited. The approach proposed in this paper offers such a path, which is fully supported by model transformations. The starting point is semiformal requirements containing a behaviour description in a controlled natural language. A chain of models is proposed, including analysis, platform independent and platform specific models. A particular architecture style is chosen by selecting a set of appropriate design patterns for these models. It is shown how to define informally and then implement in the model transformation language MOLA the required transforma…
Topological structure analysis of chromatin interaction networks
Background: Current Hi-C technologies for chromosome conformation capture make it possible to understand a broad spectrum of functional interactions between genome elements. Although significant progress has been made in the analysis of Hi-C data to identify biologically significant features, many questions remain open, in particular regarding the potential biological significance of various topological features that are characteristic of chromatin interaction networks. Results: It has been previously observed that promoter capture Hi-C (PCHi-C) interaction networks tend to separate easily into well-defined connected components that can be related to certain biological functionality, however, …
Network motif-based analysis of regulatory patterns in paralogous gene pairs
Current high-throughput experimental techniques make it feasible to infer gene regulatory interactions at the whole-genome level with reasonably good accuracy. Such experimentally inferred regulatory networks have become available for a number of simpler model organisms such as S. cerevisiae, and others. The availability of such networks provides an opportunity to compare gene regulatory processes at the whole genome level, and in particular, to assess similarity of regulatory interactions for homologous gene pairs either from the same or from different species. We present here a new technique for analyzing the regulatory interaction neighborhoods of paralogous gene pairs. Our central focu…
Tool support for MOLA
The paper describes the MOLA Tool, which supports the model transformation language MOLA. The MOLA Tool consists of two parts: the MOLA definition environment and the MOLA execution environment. The MOLA definition environment is based on the GMF (Generic Modeling Framework) and contains graphical editors for metamodels and MOLA diagrams, as well as the MOLA compiler. The main component of the MOLA execution environment is a MOLA virtual machine, which performs model transformations using an SQL database as a repository. The execution environment may be used as a plug-in for Eclipse based modeling tools (e.g., IBM Rational RSA). The current status of the tool is truly academic.