Search results for "QM"
showing 10 items of 284 documents
Statistical analysis of yeast genomic downstream sequences reveals putative polyadenylation signals
2000
The study of a few genes has permitted the identification of three elements that constitute a yeast polyadenylation signal: the efficiency element (EE), the positioning element and the actual site for cleavage and polyadenylation. In this paper we perform an analysis of oligonucleotide composition on the sequences located downstream of the stop codon of all yeast genes. Several oligonucleotide families appear over-represented with a high significance (referred to herein as "words"). The family with the highest over-representation includes the oligonucleotides shown experimentally to play a role as EEs. The word with the highest score is TATATA, followed, among others, by a series of singl…
Finding optimal finite biological sequences over finite alphabets: the OptiFin toolbox
2017
In this paper, we present a toolbox for a specific optimization problem that frequently arises in bioinformatics or genomics. In this specific optimization problem, the state space is a set of words of specified length over a finite alphabet. To each word is associated a score. The overall objective is to find the words which have the lowest possible score. This type of general optimization problem is encountered in, e.g., 3D conformation optimization for protein structure prediction, or largest core genes subset discovery based on best supported phylogenetic tree for a set of species. In order to solve this problem, we propose a toolbox that can be easily launched usin…
Selectivity in Probabilistic Causality: Drawing Arrows from Inputs to Stochastic Outputs
2011
Given a set of several inputs into a system (e.g., independent variables characterizing stimuli) and a set of several stochastically non-independent outputs (e.g., random variables describing different aspects of responses), how can one determine, for each of the outputs, which of the inputs it is influenced by? The problem has applications ranging from modeling pairwise comparisons to reconstructing mental processing architectures to conjoint testing. A necessary and sufficient condition for a given pattern of selective influences is provided by the Joint Distribution Criterion, according to which the problem of "what influences what" is equivalent to that of the existence of a joint distr…
Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases
2019
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotatio…
Gap Filling of Biophysical Parameter Time Series with Multi-Output Gaussian Processes
2018
In this work we evaluate multi-output (MO) Gaussian Process (GP) models based on the linear model of coregionalization (LMC) for estimation of biophysical parameter variables under a gap filling setup. In particular, we focus on LAI and fAPAR over rice areas. We show how this problem cannot be solved with standard single-output (SO) GP models, and how the proposed MO-GP models are able to successfully predict these variables even in high missing data regimes, by implicitly performing an across-domain information transfer.
Retrieval of aboveground crop nitrogen content with a hybrid machine learning method
2020
Hyperspectral acquisitions have proven to be the most informative Earth observation data source for the estimation of nitrogen (N) content, which is the main limiting nutrient for plant growth and thus agricultural production. In the past, empirical algorithms have been widely employed to retrieve information on this biochemical plant component from canopy reflectance. However, these approaches do not seek a cause-effect relationship based on physical laws. Moreover, most studies solely relied on the correlation of chlorophyll content with nitrogen, and thus neglected the fact that most N is bound in proteins. Our study presents a hybrid retrieval method using a physically-base…
Human experts vs. machines in taxa recognition
2020
The step of expert taxa recognition currently slows down the response time of many bioassessments. Shifting to quicker and cheaper state-of-the-art machine learning approaches is still met with expert scepticism towards the ability and logic of machines. In our study, we investigate both the differences in accuracy and in the identification logic of taxonomic experts and machines. We propose a systematic approach utilizing deep Convolutional Neural Nets with the transfer learning paradigm and extensively evaluate it over a multi-pose taxonomic dataset with hierarchical labels specifically created for this comparison. We also study the prediction accuracy on different ranks of taxonomic hier…
Machinery Failure Approach and Spectral Analysis to study the Reaction Time Dynamics over Consecutive Visual Stimuli
2020
The reaction times of individuals over consecutive visual stimuli have been studied using spectral analysis and a machinery failure approach. The tools used include the fast Fourier transform and a spectral entropy analysis. The results indicate that the reaction times produced by the independently responding individuals to visual stimuli appear to be correlated. The spectral analysis and the entropy of the spectrum show that there are features of similarity in the response times of each participant and among them. Furthermore, the analysis of the mistakes made by the participants during the reaction time experiments concluded that they follow a behavior which is consistent with the MTBF (…
Local Granger causality
2021
Granger causality is a statistical notion of causal influence based on prediction via vector autoregression. For Gaussian variables it is equivalent to transfer entropy, an information-theoretic measure of time-directed information transfer between jointly dependent processes. We exploit such equivalence and calculate exactly the 'local Granger causality', i.e. the profile of the information transfer at each discrete time point in Gaussian processes; in this frame Granger causality is the average of its local version. Our approach offers a robust and computationally fast method to follow the information transfer along the time history of linear stochastic processes, as well as of nonlinear …
Order-distance and other metric-like functions on jointly distributed random variables
2013
We construct a class of real-valued nonnegative binary functions on a set of jointly distributed random variables, which satisfy the triangle inequality and vanish at identical arguments (pseudo-quasi-metrics). These functions are useful in dealing with the problem of selective probabilistic causality encountered in behavioral sciences and in quantum physics. The problem reduces to that of ascertaining the existence of a joint distribution for a set of variables with known distributions of certain subsets of this set. Any violation of the triangle inequality or its consequences by one of our functions when applied to such a set rules out the existence of this joint distribution. We focus on…