Search results for " selection"
showing 10 items of 1271 documents
Variable Ranking Feature Selection for the Identification of Nucleosome Related Sequences
2018
Several recent works have shown that the k-mer representation of a DNA sequence can be used for classification or identification of nucleosome positioning related sequences. This representation becomes computationally expensive as k grows, because the dimension of the feature space is exponential in k. This issue significantly affects the classification task performed by a general machine learning algorithm used for sequence classification. In this paper, we investigate the advantage offered by the so-called Variable Ranking Feature Selection method to select the most informative k-mers associated with a set of DNA sequences, for the final purpose of nucleosome/linker classifi…
2018
Sex differences in lifespan are ubiquitous, but the underlying causal factors remain poorly understood. Inter- and intrasexual social interactions are well known to influence lifespan in many taxa, but it has proved challenging to separate the role of sex-specific behaviours from wider physiological differences between the sexes. To address this problem, we genetically manipulated the sexual identity of the nervous system (and hence sexual behaviour) in Drosophila melanogaster, and measured lifespan under varying social conditions. Consistent with previous studies, masculinization of the nervous system in females induced male-specific courtship behaviour and aggression, while nervous system…
Two-Stage Bayesian Approach for GWAS With Known Genealogy
2019
Genome-wide association studies (GWAS) aim to assess relationships between single nucleotide polymorphisms (SNPs) and diseases. They are one of the most popular problems in genetics, and have some peculiarities given the large number of SNPs compared to the number of subjects in the study. Individuals might not be independent, especially in animal breeding studies or genetic diseases in isolated populations with highly inbred individuals. We propose a family-based GWAS model in a two-stage approach comprising a dimension reduction and a subsequent model selection. The first stage, in which the genetic relatedness between the subjects is taken into account, selects the promising SNPs. The se…
Stagewise pseudo-value regression for time-varying effects on the cumulative incidence
2015
In a competing risks setting, the cumulative incidence of an event of interest describes the absolute risk for this event as a function of time. For regression analysis, one can either choose to model all competing events by separate cause-specific hazard models or directly model the association between covariates and the cumulative incidence of one of the events. With a suitable link function, direct regression models allow for a straightforward interpretation of covariate effects on the cumulative incidence. In practice, where data can be right-censored, these regression models are implemented using a pseudo-value approach. For a grid of time points, the possibly unobserved binary event s…
Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks.
2016
Factorial Gaussian graphical models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order – some entries of the precision matrix are a priori zeros – or equal dependency strengths across time lags – some entries of the precision matrix are assumed to be equal. The precision matrix is then estimated by ℓ1-penalized maximum likelihood, imposing a further constraint on the absolute value…
A graphical model selection tool for mixed models
2017
Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one among a potentially very large set of candidate statistical models. We propose a graphical representation that extends to the class of mixed models the deviance plot proposed in the literature for classical and generalized linear models. Once a reduced number of models has been selected, this graphical representation makes it possible to identify important covariates, focusing only on the fixed effects component and assuming the random part is properly specified. Nevertheless, we also suggest a standalone figure representing th…
Evolutionary distances corrected for purifying selection and ancestral polymorphisms.
2019
Evolutionary distance formulas that take into account effects due to ancestral polymorphisms and purifying selection are obtained on the basis of the full solution of the Jukes–Cantor and Kimura DNA substitution models. In the case of purifying selection, two different methods are developed. It is shown that avoiding the dimensional reduction implicitly carried out in the conventional solution of the models is instrumental in incorporating these effects into the formalism. The problem of estimating the numerical values of the model parameters, as well as those of the correction terms, is not addressed.
Pitfalls of hypothesis tests and model selection on bootstrap samples: Causes and consequences in biometrical applications
2015
The bootstrap method has become a widely used tool applied in diverse areas where results based on asymptotic theory are scarce. It can be applied, for example, for assessing the variance of a statistic, a quantile of interest or for significance testing by resampling from the null hypothesis. Recently, some approaches have been proposed in the biometrical field where hypothesis testing or model selection is performed on a bootstrap sample as if it were the original sample. P-values computed from bootstrap samples have been used, for example, in the statistics and bioinformatics literature for ranking genes with respect to their differential expression, for estimating the variability of p-v…
Full-automatic computer aided system for stem cell clustering using content-based microscopic image analysis
2017
Stem cells are very original cells that can differentiate into other cells, tissues and organs, and they play a very important role in biomedical treatments. Because of the importance of stem cells, in this paper we propose a full-automatic computer aided clustering system to assist scientists in exploring potential co-occurrence relations between cell differentiation and the cells' morphological information in phenotype. In this proposed system, a multi-stage Content-based Microscopic Image Analysis (CBMIA) framework is applied, including image segmentation, feature extraction, feature selection, feature fusion and clustering techniques. First, an Improved Supervised Normalized Cuts (IS…
A decision analysis for periapical surgery: retrospective study
2018
Background: Periapical surgery is now a reliable therapeutic procedure for the treatment of teeth with periapical lesions, particularly when orthograde retreatment is problematic. However, little information is available regarding treatment planning of cases referred for periapical surgery. Therefore, this study was conducted to analyze and evaluate the factors that affect the decision-making process for periapical surgery. Material and Methods: This study retrospectively assessed clinical and radiographic data from patients undergoing periapical surgery. The factors involved in deciding to perform periapical surgery were classified into technical, biological, and combined factors. Results: Ou…