AUTHOR
Juris Viksna
Pattern Matching and Pattern Discovery Algorithms for Protein Topologies
We describe algorithms for pattern matching and pattern learning in TOPS diagrams (formal descriptions of protein topologies). These problems can be reduced to checking for subgraph isomorphism and finding maximal common subgraphs in a restricted class of ordered graphs. We have developed a subgraph isomorphism algorithm for ordered graphs, which performs well on the given set of data. The maximal common subgraph problem is then solved by repeated subgraph extension and checking for isomorphisms. Despite its apparent inefficiency, this approach yields an algorithm with time complexity proportional to the number of graphs in the input set and is still practical on the given set of data. As a…
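To make the reduction to ordered graphs concrete, the following is a minimal Python sketch of order-preserving subgraph matching by backtracking. It is an illustration under stated assumptions, not the algorithm from the paper: the function name `ordered_subgraph_match`, the use of plain string labels for vertices, and the representation of edges as pairs `(u, v)` with `u < v` are all hypothetical choices. The pattern is matched as a (not necessarily induced) subgraph, so the target may carry extra edges.

```python
from typing import List, Optional, Set, Tuple

Edge = Tuple[int, int]  # assumed convention: (u, v) with u < v


def ordered_subgraph_match(
    p_labels: List[str],
    p_edges: Set[Edge],
    t_labels: List[str],
    t_edges: Set[Edge],
) -> Optional[List[int]]:
    """Return an order-preserving embedding of the pattern in the target.

    The result maps pattern vertex i to target vertex result[i], with
    result strictly increasing; None is returned if no embedding exists.
    """
    n, m = len(p_labels), len(t_labels)
    mapping: List[int] = []

    def consistent(i: int, j: int) -> bool:
        # Labels must agree, and every pattern edge (k, i) to an already
        # placed vertex k must also be present between the images.
        if p_labels[i] != t_labels[j]:
            return False
        return all(
            (mapping[k], j) in t_edges for k in range(i) if (k, i) in p_edges
        )

    def extend(i: int, start: int) -> bool:
        if i == n:
            return True
        # Stop early enough to leave room for the remaining pattern vertices.
        for j in range(start, m - (n - i) + 1):
            if consistent(i, j):
                mapping.append(j)
                if extend(i + 1, j + 1):
                    return True
                mapping.pop()
        return False

    return list(mapping) if extend(0, 0) else None
```

Because the vertex order must be preserved, each pattern vertex can only be mapped to the right of the previous one, which prunes the search considerably compared with general subgraph isomorphism. For example, a two-vertex pattern `["E", "E"]` with edge `(0, 1)` embeds into a target `["E", "H", "E", "E"]` with edge `(0, 3)` at positions 0 and 3.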
Variation in genomic landscape of clear cell renal cell carcinoma across Europe
The incidence of renal cell carcinoma (RCC) is increasing worldwide, and its prevalence is particularly high in some parts of Central Europe. Here we undertake whole-genome and transcriptome sequencing of clear cell RCC (ccRCC), the most common form of the disease, in patients from four different European countries with contrasting disease incidence to explore the underlying genomic architecture of RCC. Our findings support previous reports on frequent aberrations in the epigenetic machinery and PI3K/mTOR signalling, and uncover novel pathways and genes affected by recurrent mutations and abnormal transcriptome patterns including focal adhesion, components of extracellular matrix (ECM) and …
Characteristic Topological Features of Promoter Capture Hi-C Interaction Networks
Current Hi-C technologies for chromosome conformation capture make it possible to study a broad spectrum of functional interactions between genome elements. Although significant progress has been made in analysing Hi-C data to identify biologically significant features, many questions still remain open. In this paper we describe methods for analysing Hi-C (specifically PCHi-C) interaction networks that focus strictly on the topological properties of these networks. The main questions we are trying to answer are: (1) can topological properties of interaction networks for different cell types alone be sufficient to distinguish between these types, and what the most important of such propert…
PASSIM – an open source software system for managing information in biomedical studies
Background: One of the crucial aspects of day-to-day laboratory information management is the collection, storage and retrieval of information about research subjects and biomedical samples. An efficient link between sample data and experiment results is imperative for a successful outcome of a biomedical study. Currently available software solutions are largely limited to large-scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such a LIMS can indeed bring laboratory information management to a higher level, but often implies a significant investment of time, effort and funds, which are not always available. There is a clear need for lightweig…
Exploration of Evolutionary Relations between Protein Structures
We describe a new method for the exploration of evolutionary relations between protein structures.
Probabilistic inference of approximations
We consider probabilistic inductive inference of Gödel numbers of total recursive functions when the set of possible errors is allowed to be infinite, but with bounded density. We have obtained hierarchies of classes of functions identifiable with different probabilities up to sets with fixed density. The obtained hierarchies turn out to be different from those which we have in the case of exact identification.
Graph-based network analysis of transcriptional regulation pattern divergence in duplicated yeast gene pairs
The genome and interactome of Saccharomyces cerevisiae have been characterized extensively over the course of the past few decades. However, despite many insights gained over the years, both functional studies and evolutionary analyses continue to reveal many complexities and confounding factors in the construction of reliable transcriptional regulatory network models. We present here a graph-based technique for comparing transcriptional regulatory networks based on network motif similarity for gene pairs. We construct interaction graphs for duplicated transcription factor pairs traceable to the ancestral whole-genome duplication as well as other paralogues in Saccharomyces cerevisiae. We c…
On WQO Property for Different Quasi Orderings of the Set of Permutations
The property of certain sets being well quasi ordered (WQO) has several useful applications in computer science – it can be used to prove the existence of efficient algorithms and also in certain cases to prove that a specific algorithm terminates.
Gene Duplication Models and Reconstruction of Gene Regulatory Network Evolution from Network Structure
The work was supported by Latvian Council of Science grant 258/2012 and Latvian State Research programme project NexIT (2014-2017).
Dynamics of gene regulatory networks and their dependence on network topology and quantitative parameters – the case of phage λ
Background: Gene regulatory networks can be modelled in various ways depending on the level of detail required and biological questions addressed. One of the earliest formalisms used for modeling is a Boolean network, although these models cannot describe most temporal aspects of a biological system. Differential equation models have also been used to model gene regulatory networks, but these frameworks tend to be too detailed for large models and many quantitative parameters might not be deducible in practice. Hybrid models bridge the gap between these two model classes – these are useful when concentration changes are important while the information about precise concentrations and binding…
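As a small illustration of the Boolean network formalism mentioned above, the following Python sketch enumerates the attractors of a synchronous Boolean network by following every start state until it revisits a state. This is not the hybrid-model method or the phage λ model of the paper: the function `find_attractors` and the two-gene `toggle_switch` (each gene represses the other) are hypothetical toy examples chosen only to show the idea.

```python
from itertools import product
from typing import Callable, Dict, List, Set, Tuple, FrozenSet

State = Tuple[bool, ...]


def find_attractors(
    update: Callable[[State], State], n_genes: int
) -> List[List[State]]:
    """Enumerate attractors (fixed points and cycles) of a synchronous
    Boolean network by exhaustive simulation from all 2**n_genes states."""
    attractors: List[List[State]] = []
    seen_cycles: Set[FrozenSet[State]] = set()
    for start in product([False, True], repeat=n_genes):
        path: List[State] = []
        index: Dict[State, int] = {}
        state: State = start
        # Follow the trajectory until a state repeats.
        while state not in index:
            index[state] = len(path)
            path.append(state)
            state = update(state)
        # The repeated state marks the beginning of the cycle.
        cycle = path[index[state]:]
        key = frozenset(cycle)
        if key not in seen_cycles:
            seen_cycles.add(key)
            attractors.append(cycle)
    return attractors


def toggle_switch(state: State) -> State:
    """Toy network (an assumption for illustration): two genes that
    repress each other, updated synchronously."""
    a, b = state
    return (not b, not a)
```

For the toy switch, the two states in which exactly one gene is active are fixed points, while the states with both genes on or both off alternate in a cycle of length two. Exhaustive enumeration is only feasible for small networks, since the state space grows as 2 to the power of the number of genes.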
Using Deep Learning to Extrapolate Protein Expression Measurements
Mass spectrometry (MS)-based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed using deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. This method is tested on four datasets, in…
A computer system to perform structure comparison using TOPS representations of protein structure
We describe the design and implementation of a fast topology-based method for protein structure comparison. The approach uses the TOPS topological representation of protein structure, aligning two structures using a common discovered pattern and generating a measure of distance derived from an insert score. Heavy use is made of a constraint-based pattern-matching algorithm for TOPS diagrams that we have designed and described elsewhere (Bioinformatics 15(4) (1999) 317). The comparison system is maintained at the European Bioinformatics Institute and is available over the Web at tops.ebi.ac.uk/tops. Users submit a structure description in Protein Data Bank (PDB) format and can compare it with …
Probabilistic limit identification up to “small” sets
In this paper we study limit identification of total recursive functions in the case when “small” sets of errors are allowed. Here we formalize the notion of “small” sets in a very general way: we define a notion of measure for subsets of natural numbers, and we consider as small those sets that are subsets of sets with zero measure.
Graph-based characterisations of cell types and functionally related modules in promoter capture Hi-C Data
Topological structure analysis of chromatin interaction networks.
Background: Current Hi-C technologies for chromosome conformation capture make it possible to study a broad spectrum of functional interactions between genome elements. Although significant progress has been made in analysing Hi-C data to identify biologically significant features, many questions still remain open, in particular regarding the potential biological significance of various topological features that are characteristic of chromatin interaction networks. Results: It has been previously observed that promoter capture Hi-C (PCHi-C) interaction networks tend to separate easily into well-defined connected components that can be related to certain biological functionality, however, …
Network motif-based analysis of regulatory patterns in paralogous gene pairs
Current high-throughput experimental techniques make it feasible to infer gene regulatory interactions at the whole-genome level with reasonably good accuracy. Such experimentally inferred regulatory networks have become available for a number of simpler model organisms, such as S. cerevisiae. The availability of such networks provides an opportunity to compare gene regulatory processes at the whole-genome level and, in particular, to assess the similarity of regulatory interactions for homologous gene pairs either from the same or from different species. We present here a new technique for analyzing the regulatory interaction neighborhoods of paralogous gene pairs. Our central focu…
Additional file 1 of Dynamics of gene regulatory networks and their dependence on network topology and quantitative parameters – the case of phage λ
Software package implementing our proposed method of attractor analysis. It contains source files, a user manual and the phage λ model described in this manuscript. The following subsections describe the files in the package. ModelDescription.txt: Definition of the phage λ model that is analysed within this paper. ModelConstraints.txt: File that specifies partial constraints for the orderings of binding site affinities. Here, the constraints are applicable to our phage λ model. HSM_graph_analysis.cpp: The main component of the software that identifies all feasible states of a system. HSM_graph_analysis.h: The second component of the software for graph analysis. It is a C++ header file which contain…