Search results for "Computational Mathematic"
showing 10 items of 987 documents
Lattices and dual lattices in optimal experimental design for Fourier models
1998
Number-theoretic lattices, used in integration theory, are studied from the viewpoint of the design and analysis of experiments. For certain Fourier regression models, lattices are optimal as experimental designs because they produce orthogonal information matrices. When the Fourier model is restricted, that is, when a special subset of the full factorial (cross-spectral) model is used, there is a difficult inversion problem to find generators for an optimal design for the given model. Asymptotic results are derived for certain models as the dimension of the space goes to infinity. These can be thought of as a complexity theory connecting designs and models or as a special type of Nyquist sampling t…
A non-linear optimization procedure to estimate distances and instantaneous substitution rate matrices under the GTR model.
2006
Motivation: The general-time-reversible (GTR) model is one of the most popular models of nucleotide substitution because it constitutes a good trade-off between mathematical tractability and biological reality. However, when it is applied to infer evolutionary distances and/or instantaneous rate matrices, the GTR model seems more prone to inapplicability than more restrictive time-reversible models. Although it has been previously noted that this inapplicability is caused by the impossibility of computing the logarithm of a matrix characterised by negative eigenvalues, the issue has not been investigated further. Results: Here, we formally characterize the mathematic…
Derivations of the (n, 2, 1)-nilpotent Lie Algebra
2016
In this paper, we study derivations of the (n, 2, 1)-nilpotent Lie Algebra.
Modelling residuals dependence in dynamic life tables: A geostatistical approach
2008
The problem of modelling dynamic mortality tables is considered. In this context, the influence of age on data graduation needs to be properly assessed through a dynamic model, as mortality progresses over the years. After detrending the raw data, the residuals dependence structure is analysed, by considering them as a realisation of a homogeneous Gaussian random field defined on R × R. This setting allows for the implementation of geostatistical techniques for the estimation of the dependence and further interpolation in the domain of interest. In particular, a complex form of interaction between age and time is considered, by taking into account a zonally anisotropic component embedded in…
Iterative Cluster Analysis of Protein Interaction Data
2004
Motivation: Generation of fast tools for hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are consid…
On the usage of joint diagonalization in multivariate statistics
2022
Scatter matrices generalize the covariance matrix and are useful in many multivariate data analysis methods, including the well-known principal component analysis (PCA), which is based on the diagonalization of the covariance matrix. The simultaneous diagonalization of two or more scatter matrices goes beyond PCA and is increasingly used. In this paper, we offer an overview of many methods that are based on joint diagonalization. These methods range from the unsupervised context with invariant coordinate selection and blind source separation, which includes independent component analysis, to the supervised context with discriminant analysis and sliced inverse regression. They also enco…
A web application for the unspecific detection of differentially expressed DNA regions in strand-specific expression data
2015
Genomic technologies allow laboratories to produce large-scale data sets, either through the use of next-generation sequencing or microarray platforms. To explore these data sets and obtain maximum value from the data, researchers view their results alongside all the known features of a given reference genome. To study transcriptional changes that occur under a given condition, researchers search for regions of the genome that are differentially expressed between different experimental conditions. To identify these regions, several algorithms have been developed over the years, along with some bioinformatic platforms that enable their use. However, currently available appli…
Multiple sequence editing by spreadsheet.
1990
Spreadsheets have several functions and facilities that make them good candidates to be used as multiple sequence editors. They can be easily programmed (even by non-programmers) with macros that allow them to fit the needs of the user, free of the restrictions that programs written by other people have. Here I present a sheet containing a set of macros written for Lotus 1-2-3
The Power of Word-Frequency Based Alignment-Free Functions: a Comprehensive Large-Scale Experimental Analysis
2021
Motivation: Alignment-free (AF) distance/similarity functions are a key tool for sequence analysis. Experimental studies on real datasets abound and, to some extent, there are also studies regarding their control of false positive rate (Type I error). However, assessment of their power, i.e. their ability to identify true similarity, has been limited to some members of the D2 family. The corresponding experimental studies have concentrated on short sequences, a scenario no longer adequate for current applications, where sequence lengths may vary considerably. Such a state of the art is methodologically problematic, since information regarding a key feature such as power is either mi…
Long read alignment based on maximal exact match seeds
2012
Motivation: The explosive growth of next-generation sequencing datasets poses a challenge to the mapping of reads to reference genomes in terms of alignment quality and execution speed. With the continuing progress of high-throughput sequencing technologies, read length is constantly increasing and many existing aligners are becoming inefficient as generated reads grow larger. Results: We present CUSHAW2, a parallelized, accurate, and memory-efficient long read aligner. Our aligner is based on the seed-and-extend approach and uses maximal exact matches as seeds to find gapped alignments. We have evaluated and compared CUSHAW2 to the three other long read aligners BWA-SW, Bowtie2 an…