Search results for "Computational Mathematic"
showing 10 items of 987 documents
CROSSMAPPER: estimating cross-mapping rates and optimizing experimental design in multi-species sequencing studies
2020
Motivation: Numerous sequencing studies, including transcriptomics of host-pathogen systems, sequencing of hybrid genomes, xenografts, mixed species systems, metagenomics and meta-transcriptomics, involve samples containing genetic material from divergent organisms. A crucial step in these studies is identifying from which organism each sequencing read originated, and the experimental design should be directed to minimize biases caused by cross-mapping of reads to incorrect source genomes. Additionally, pooling of sufficiently different genetic material into a single sequencing library could significantly reduce experimental costs but requires careful planning and assessment of the impact of…
Quantitative characterization of antigens using monoclonal antibody reactivities
1993
A multipurpose program that empirically relates antigenic reactivities with monoclonal antibodies (MAbs) to genetic distances is presented. The program uses a set of known genetic pairwise distances to weigh each MAb depending on its capacity to define groups of taxonomically related antigens. This allows highly accurate identification and classification of unknown antigens. Also, the weights obtained constitute a quantitative measure of epitope conservation and can be used for improved vaccine design. © 1993 Oxford University Press.
Derived variables calculated from similar joint responses: some characteristics and examples
1995
A technique (Cox and Wermuth, 1992) is reviewed for finding linear combinations of a set of response variables having special relations of linear conditional independence with a set of explanatory variables. A theorem in linear algebra is used both to examine conditions in which the derived variables take an especially simple form and to reduce computations. Examples are discussed of medical and psychological investigations in which the method has aided interpretation.
Independent component analysis based on symmetrised scatter matrices
2007
A new method for separating the mixtures of independent sources has been proposed recently in [Oja et al. (2006). Scatter matrices and independent component analysis. Austrian J. Statist., to appear]. This method is based on two scatter matrices with the so-called independence property. The corresponding method is now further examined. Simple simulation studies are used to compare the performance of so-called symmetrised scatter matrices in solving the independent component analysis problem. The results are also compared with the classical FastICA method. Finally, the theory is illustrated by some examples.
Comparison of the Andersen–Gill model with Poisson and negative binomial regression on recurrent event data
2008
Many generalizations of the Cox proportional hazards method have been developed to analyse recurrent event data. The Andersen-Gill model was proposed to handle event data following Poisson processes. This method is compared with non-survival approaches, such as Poisson and negative binomial regression. The comparison is performed on data simulated according to various event-generating processes and differing in subject heterogeneity. When robust standard error estimates are applied, for Poisson processes the Andersen-Gill approach is comparable to a negative binomial regression, whereas Poisson regression has comparable coverage probabilities of confidence intervals, but increased type …
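The role of subject heterogeneity in this comparison can be illustrated with a small simulation. The following is a minimal sketch, not the simulation design of the paper: it assumes a constant event rate per subject and, optionally, a gamma-distributed frailty with mean 1 (the function names and parameters are hypothetical). Without frailty the counts are Poisson and the variance-to-mean ratio is close to 1; with frailty the counts are overdispersed, which is the situation negative binomial regression is meant to handle.

```python
import random
from statistics import mean, variance


def simulate_counts(n_subjects, rate, follow_up, frailty_shape=None, seed=0):
    """Simulate recurrent-event counts per subject over a fixed follow-up.

    Events arrive with exponential gaps (a Poisson process). If
    `frailty_shape` is given, each subject's rate is multiplied by a
    gamma frailty with mean 1 (assumed form of heterogeneity).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_subjects):
        z = 1.0
        if frailty_shape is not None:
            # gammavariate(shape, scale): mean = shape * scale = 1
            z = rng.gammavariate(frailty_shape, 1.0 / frailty_shape)
        lam = rate * z
        if lam <= 0.0:
            counts.append(0)  # degenerate frailty: no events
            continue
        n_events = 0
        t = rng.expovariate(lam)
        while t <= follow_up:
            n_events += 1
            t += rng.expovariate(lam)
        counts.append(n_events)
    return counts


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 for Poisson counts)."""
    return variance(counts) / mean(counts)
```

Comparing `dispersion_index(simulate_counts(4000, 2.0, 1.0))` with the same call using `frailty_shape=1.0` shows how heterogeneity inflates the variance relative to the mean.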
Cluster-Localized Sparse Logistic Regression for SNP Data
2012
The task of analyzing high-dimensional single nucleotide polymorphism (SNP) data in a case-control design using multivariable techniques has only recently been tackled. While many available approaches investigate only main effects in a high-dimensional setting, we propose a more flexible technique, cluster-localized regression (CLR), based on localized logistic regression models, that allows different SNPs to have an effect for different groups of individuals. Separate multivariable regression models are fitted for the different groups of individuals by incorporating weights into componentwise boosting, which provides simultaneous variable selection, hence sparse fits. For model fitting, th…
Multiple testing in candidate gene situations: a comparison of classical, discrete, and resampling-based procedures.
2011
In candidate gene association studies, usually several elementary hypotheses are tested simultaneously using one particular set of data. The data normally consist of partly correlated SNP information. Every SNP can be tested for association with the disease, e.g., using the Cochran-Armitage test for trend. To account for the multiplicity of the test situation, different types of multiple testing procedures have been proposed. The question arises whether procedures taking into account the discreteness of the situation show a benefit especially in case of correlated data. We empirically evaluate several different multiple testing procedures via simulation studies using simulated correlated SN…
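As an illustration of the ingredients named in this abstract, the sketch below computes the asymptotic Cochran-Armitage trend test for one SNP and applies a Holm step-down adjustment across SNPs. Holm is chosen here only as one example of a classical procedure; the abstract does not state which procedures were evaluated, and the additive genotype scores (0, 1, 2) and function names are assumptions.

```python
import math


def cochran_armitage(cases, totals, scores=(0, 1, 2)):
    """Asymptotic Cochran-Armitage trend test for a 2 x k table.

    cases[i]  : number of cases in genotype group i
    totals[i] : number of subjects (cases + controls) in group i
    Returns (z, two_sided_p).
    """
    n_all = sum(totals)
    p_case = sum(cases) / n_all
    # Score-weighted deviation of observed from expected case counts.
    t = sum(s * (r - n * p_case) for s, r, n in zip(scores, cases, totals))
    s1 = sum(s * n for s, n in zip(scores, totals))
    s2 = sum(s * s * n for s, n in zip(scores, totals))
    var = p_case * (1.0 - p_case) * (s2 - s1 * s1 / n_all)
    if var <= 0.0:
        return 0.0, 1.0  # no variation in scores or outcome
    z = t / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2.0))


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted
```

Holm controls the family-wise error rate without assumptions on the dependence between tests, but it uses neither the correlation between SNPs nor the discreteness of the test statistics, which is the gap the compared discrete and resampling-based procedures aim to address.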
TiFoSi: an efficient tool for mechanobiology simulations of epithelia
2020
Motivation: Emerging phenomena in developmental biology and tissue engineering are the result of feedback between gene expression and cell biomechanics. In that context, in silico experiments are a powerful tool to understand fundamental mechanisms and to formulate and test hypotheses.
A fast and recursive algorithm for clustering large datasets with k-medians
2012
Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…
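The kind of recursive update described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm as specified in the paper: the step-size schedule (`step / count ** decay`), the Euclidean nearest-center assignment, and the plain running average of iterates are all choices made for the example, and the function name is hypothetical.

```python
import math


def online_k_medians(stream, init_centers, step=1.0, decay=0.75):
    """Single pass of a stochastic-gradient k-medians update.

    Each incoming point moves its nearest center by a step of length
    gamma along the unit vector pointing at the point (the negative
    gradient of the Euclidean distance). Returns the final centers and
    a running average of each center's iterates.
    """
    centers = [list(c) for c in init_centers]
    averaged = [list(c) for c in init_centers]
    counts = [0] * len(centers)
    for x in stream:
        # Assign the new observation to the nearest current center.
        j = min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
        d = math.dist(x, centers[j])
        counts[j] += 1
        if d > 0.0:
            gamma = step / counts[j] ** decay  # assumed step-size schedule
            centers[j] = [c + gamma * (xi - c) / d
                          for c, xi in zip(centers[j], x)]
        # Running mean of the iterates of center j (averaged version).
        averaged[j] = [a + (c - a) / counts[j]
                       for a, c in zip(averaged[j], centers[j])]
    return centers, averaged
```

Because each point is used once and only one center is touched per observation, memory use is independent of the sample size, which is what makes this style of update suitable for data that arrive sequentially.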
Robust estimation and regression with parametric quantile functions
2022
A new, broad family of quantile-based estimators is described, and theoretical and empirical evidence is provided for their robustness to outliers in the response. The proposed method can be used to estimate all types of parameters, including location, scale, rate and shape parameters, extremes, regression coefficients and hazard ratios, and can be extended to censored and truncated data. The described estimator can be utilized to construct robust versions of common parametric and semiparametric methods, such as linear (Normal) regression, generalized linear models, and proportional hazards models. A variety of significant results and applications is presented to show the flexibility of the…