Search results for "algorithm"
showing 10 items of 4887 documents
Overlap and diversity in antimicrobial peptide databases: Compiling a non-redundant set of sequences
2015
Abstract Motivation: The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced. Results: A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are inc…
kmcEx: memory-frugal and retrieval-efficient encoding of counted k-mers.
2018
Abstract Motivation: K-mers along with their frequency have served as an elementary building block for error correction, repeat detection, multiple sequence alignment, genome assembly, etc., attracting intensive studies in k-mer counting. However, the output of k-mer counters itself is large; very often, it is too large to fit into main memory, which severely limits its usability. Results: We introduce a novel idea of encoding k-mers as well as their frequency, achieving good memory saving and retrieval efficiency. Specifically, we propose a Bloom filter-like data structure to encode counted k-mers by coupled-bit arrays, one for k-mer representation and the other for frequency encoding. Exper…
Adaptive Modifications of Hypotheses After an Interim Analysis
2001
It is investigated how one can modify hypotheses in a trial after an interim analysis such that the type I error rate is controlled. If only a global statement is desired, a solution was given by Bauer (1989). For a general multiple testing problem, Kieser, Bauer and Lehmacher (1999) and Bauer and Kieser (1999) gave solutions by means of which the initial set of hypotheses can be reduced after the interim analysis. The same techniques can be applied to obtain more flexible strategies, such as changing the weights of hypotheses, changing an a priori order, or even including new hypotheses. It is emphasized that the application of these methods requires very careful planning of a trial as well as a c…
The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression.
2020
This paper focuses on hypothesis testing in lasso regression, when one is interested in judging the statistical significance of the regression coefficients in a regression equation involving many covariates. To get reliable p-values, we propose a new lasso-type estimator relying on the idea of induced smoothing, which allows an appropriate covariance matrix and Wald statistic to be obtained relatively easily. Simulation experiments reveal that our approach performs well when compared with recent inferential tools in the lasso framework. Two real data analyses are presented to illustrate the proposed framework in practice.
Selecting the tuning parameter in penalized Gaussian graphical models
2019
Penalized inference of Gaussian graphical models is a way to assess the conditional independence structure in multivariate problems. In this setting, the conditional independence structure, corresponding to a graph, is related to the choice of the tuning parameter, which determines the model complexity or degrees of freedom. There has been little research on the degrees of freedom for penalized Gaussian graphical models. In this paper, we propose an estimator of the degrees of freedom in $\ell_1$-penalized Gaussian graphical models. Specifically, we derive an estimator inspired by the generalized information criterion and propose to use this estimator as the bias term for two informatio…
Design-based estimation for geometric quantiles with application to outlier detection
2010
Geometric quantiles are investigated using data collected from a complex survey. Geometric quantiles are an extension of univariate quantiles in a multivariate set-up that uses the geometry of multivariate data clouds. A very important application of geometric quantiles is the detection of outliers in multivariate data by means of quantile contours. A design-based estimator of geometric quantiles is constructed and used to compute quantile contours in order to detect outliers in both multivariate data and survey sampling set-ups. An algorithm for computing geometric quantile estimates is also developed. Under broad assumptions, the asymptotic variance of the quantile estimator is derived an…
On the stability and ergodicity of adaptive scaling Metropolis algorithms
2011
The stability and ergodicity properties of two adaptive random walk Metropolis algorithms are considered. Both algorithms adjust the scaling of the proposal distribution continuously based on the observed acceptance probability. Unlike the previously proposed forms of the algorithms, the adapted scaling parameter is not constrained within a predefined compact interval. The first algorithm is based on scale adaptation only, while the second one also incorporates covariance adaptation. A strong law of large numbers is shown to hold assuming that the target density is smooth enough and has either compact support or super-exponentially decaying tails.
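As a rough illustration of the scale-only variant described in this abstract, the following is a minimal sketch of a one-dimensional random walk Metropolis sampler whose proposal scale is adapted from the observed acceptance probability. The function name, the target acceptance rate of 0.44, the step-size sequence n^(-0.66), and the Robbins-Monro form of the update are assumptions chosen for illustration; they are not taken from the listing and may differ from the paper's exact algorithm. The log-scale is left unconstrained, in line with the abstract's remark that the scaling parameter is not restricted to a compact interval.

```python
import math
import random


def adaptive_scaling_metropolis(log_target, x0, n_iter,
                                target_accept=0.44, seed=0):
    """1-D random walk Metropolis with an adaptively tuned proposal scale.

    Hypothetical helper for illustration only; the tuning constants are
    assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    log_scale = 0.0  # log of the proposal scale, not clipped to any interval
    samples = []
    for n in range(1, n_iter + 1):
        # Gaussian random walk proposal with the current scale.
        proposal = x + math.exp(log_scale) * rng.gauss(0.0, 1.0)
        log_p_prop = log_target(proposal)
        # Metropolis acceptance probability min(1, pi(y) / pi(x)).
        accept_prob = math.exp(min(0.0, log_p_prop - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_prop
        # Stochastic-approximation update: grow the scale when acceptance
        # is above the target, shrink it when acceptance is below.
        log_scale += n ** -0.66 * (accept_prob - target_accept)
        samples.append(x)
    return samples, math.exp(log_scale)


def standard_normal_log_density(x):
    # Log-density of N(0, 1) up to an additive constant.
    return -0.5 * x * x


if __name__ == "__main__":
    draws, scale = adaptive_scaling_metropolis(
        standard_normal_log_density, x0=0.0, n_iter=5000)
    print("final proposal scale:", scale)
    print("sample mean:", sum(draws) / len(draws))
```

Adapting the logarithm of the scale keeps the scale positive without any explicit bound, which is one simple way to mirror the unconstrained setting the abstract refers to.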
Test and power considerations for multiple endpoint analyses using sequentially rejective graphical procedures
2009
A variety of powerful test procedures are available for the analysis of clinical trials addressing multiple objectives, such as comparing several treatments with a control, assessing the benefit of a new drug for more than one endpoint, etc. However, some of these procedures have reached a level of complexity that makes it difficult to communicate the underlying test strategies to clinical teams. Graphical approaches have been proposed instead that facilitate the derivation and communication of Bonferroni-based closed test procedures. In this paper we give a coherent description of the methodology and illustrate it with a real clinical trial example. We further discuss suitable power measur…
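The abstract above is truncated and does not spell out the procedure, so the following is only a sketch of one common formulation of a sequentially rejective, Bonferroni-based graphical procedure. It assumes each hypothesis carries a weight (its share of the overall level alpha) and that a transition matrix says how a rejected hypothesis passes its weight on to the others. The function name, argument names, and update rules are assumptions for illustration and may differ in detail from the paper.

```python
def graphical_procedure(p_values, weights, transitions, alpha=0.05):
    """Sequentially rejective weighted Bonferroni test on a graph.

    Hypothetical helper: weights[j] is the fraction of alpha initially
    assigned to hypothesis j, and transitions[j][l] is the fraction of
    that weight passed to hypothesis l once j is rejected.
    Returns the indices of rejected hypotheses in order of rejection.
    """
    m = len(p_values)
    w = list(weights)
    g = [list(row) for row in transitions]
    active = set(range(m))
    rejected = []
    while True:
        # Find an active hypothesis whose p-value meets its local level.
        candidates = [j for j in sorted(active) if p_values[j] <= w[j] * alpha]
        if not candidates:
            return rejected
        j = candidates[0]
        rejected.append(j)
        active.remove(j)
        new_w = [0.0] * m
        new_g = [[0.0] * m for _ in range(m)]
        for l in active:
            # Pass the weight of the rejected hypothesis along its edges.
            new_w[l] = w[l] + w[j] * g[j][l]
            for k in active:
                if k == l:
                    continue
                # Rewire edges that used to run through the removed node.
                denom = 1.0 - g[l][j] * g[j][l]
                if denom > 0.0:
                    new_g[l][k] = (g[l][k] + g[l][j] * g[j][k]) / denom
        w, g = new_w, new_g


if __name__ == "__main__":
    # Two hypotheses with equal weights that pass everything to each other,
    # which corresponds to the Holm procedure for two hypotheses.
    print(graphical_procedure([0.02, 0.04], [0.5, 0.5], [[0, 1], [1, 0]]))
```

In the two-hypothesis example, the first hypothesis is tested at level 0.025; once it is rejected, the second inherits the full level 0.05.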
Bayesian Design of “Successful” Replications
2002
Replication of experiments is common in applied research. However, systematic studies of the goals and motivations of a “replication” are rare. As a consequence, there does not seem to be a precise notion of what “success” means when replicating. This article discusses some of the possible goals for replication; this leads to different (but precise) notions of “success” when replicating. Bayesian hierarchical models allow for a flexible and explicit incorporation of the assumed relationship among the experiments. Bayesian predictive distributions are a natural tool to compute the probability of the replication being successful, and hence to design the replication so that the probability of…
Basic networks: Definition and applications
2009
7 pages, 4 figures, 1 table. PMID: 19490867 [PubMed]