Search results for "cluster analysis."
showing 10 items of 805 documents
MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study
2019
Motivation: Metagenomes offer a glimpse into the total genomic diversity contained within a sample. Currently, however, there is no straightforward way to obtain a non-redundant list of all putative homologs of a set of reference sequences present in a metagenome. Results: To address this problem, we developed a novel clustering approach called ‘metagenomic clustering by reference library’ (MCRL), where a reference library containing a set of reference genes is clustered with respect to an assembled metagenome. According to our proposed approach, reference genes homologous to similar sets of metagenomic sequences, termed ‘signatures’, are iteratively clustered in a greedy fashion, re…
Ranking Scientific Journals Via Latent Class Models for Polytomous Item Response Data
2015
Summary: We propose a model-based strategy for ranking scientific journals starting from a set of observed bibliometric indicators that represent imperfect measures of the unobserved ‘value’ of a journal. After discretizing the available indicators, we estimate an extended latent class model for polytomous item response data and use the estimated model to cluster journals. We illustrate our approach by using the data from the Italian research evaluation exercise that was carried out for the period 2004–2010, focusing on the set of journals that are considered relevant for the subarea statistics and financial mathematics. Using four bibliometric indicators (IF, IF5, AIS and the h-index), some…
Clustering of spatial point patterns
2006
Spatial point patterns arise as the natural sampling information in many problems. An ophthalmologic application motivated the problem of detecting clusters of point patterns. A set of human corneal endothelium images is given. Each image is described by a point pattern, the cell centroids. The main problem is to find groups of images corresponding to groups of spatial point patterns. This is interesting from a descriptive point of view and for clinical purposes. A new image can be compared with prototypes of each group and finally evaluated by the physician. Usual descriptors of spatial point patterns such as the empty-space function, the nearest distribution function or Ripley's K-…
Sparse kernel methods for high-dimensional survival data
2008
Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques, however, are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be ‘kernelized’. The kernelized proportional hazards model, however, yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, dependin…
Immune networks: Multi-tasking capabilities at medium load
2013
Associative network models featuring multi-tasking properties have been introduced recently and studied in the low load regime, where the number $P$ of simultaneously retrievable patterns scales with the number $N$ of nodes as $P\sim \log N$. In addition to their relevance in artificial intelligence, these models are increasingly important in immunology, where stored patterns represent strategies to fight pathogens and nodes represent lymphocyte clones. They allow us to understand the crucial ability of the immune system to respond simultaneously to multiple distinct antigen invasions. Here we develop further the statistical mechanical analysis of such systems, by studying the medium load r…
Degree stability of a minimum spanning tree of price return and volatility
2002
We investigate the time series of the degree of minimum spanning trees obtained by using a correlation-based clustering procedure starting from (i) asset return and (ii) volatility time series. The minimum spanning tree is obtained at different times by computing correlations among time series over a time window of fixed length $T$. We find that the minimum spanning tree of asset returns is characterized by stock degree values that are more stable in time than those obtained from a minimum spanning tree computed from volatility time series. Our analysis also shows that the degree of stocks has very slow dynamics, with a time-scale of several years in both cases.
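The windowed procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `mst_degrees` and `rolling_mst_degrees` are hypothetical, and the distance $d_{ij} = \sqrt{2(1 - \rho_{ij})}$ is an assumption (a common choice for correlation-based trees) that the abstract does not state. The tree is built with a plain Prim's algorithm on NumPy arrays.

```python
import numpy as np

def mst_degrees(series):
    """Degree of each column of `series` (shape: time x assets) in the
    minimum spanning tree of its correlation-based distance matrix."""
    corr = np.corrcoef(series, rowvar=False)
    # Assumed distance, not given in the abstract: d = sqrt(2 * (1 - rho)).
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True                  # grow the tree from node 0 (Prim)
    best = dist[0].copy()              # cheapest link from the tree to each node
    parent = np.zeros(n, dtype=int)    # tree node providing that cheapest link
    degree = np.zeros(n, dtype=int)
    for _ in range(n - 1):
        # Pick the closest node that is not yet in the tree.
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        in_tree[j] = True
        degree[j] += 1
        degree[parent[j]] += 1
        # Nodes outside the tree may now be closer via j.
        update = (dist[j] < best) & ~in_tree
        best[update] = dist[j][update]
        parent[update] = j
    return degree

def rolling_mst_degrees(series, window):
    """Degree vectors for every window of `window` consecutive rows."""
    starts = range(series.shape[0] - window + 1)
    return np.array([mst_degrees(series[s:s + window]) for s in starts])
```

Calling `rolling_mst_degrees(returns, T)` on a hypothetical `returns` array of shape (time, assets) gives one degree vector per window, so the stability of each asset's degree over time can be inspected column by column.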
Iterative Cluster Analysis of Protein Interaction Data
2004
Motivation: Generation of fast tools of hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present in this work the program UVCLUSTER, that iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are consid…
Antibacterial Activity of Flavonoids Against Methicillin-resistant Staphylococcus aureus strains
2000
An experimental and theoretical study was performed on the anti-staphylococcal activity of 18 natural and synthetic flavonoids against methicillin-resistant Staphylococcus aureus strains. The analysed flavonoids belong to three well-differentiated structural patterns: chalcones, flavanones and flavones. The quantitative analysis of the anti-staphylococcal activity of the compounds was carried out by determining their percent inhibition degree. The hierarchical cluster analysis method was used to analyse the anti-MRSA activity of the compounds. With this methodology, the flavonoids were classified into four groups according to their anti-staphylococcal activity (high, sufficient, intermediat…
Identification of clusters of companies in stock indices via Potts super-paramagnetic transitions
2000
The clustering of companies within a specific stock market index is studied by means of super-paramagnetic transitions of an appropriate q-state Potts model where the spins correspond to companies and the interactions are functions of the correlation coefficients determined from the time dependence of the companies' individual stock prices. The method is a generalization of the clustering algorithm by Domany et al. to the case of anti-ferromagnetic interactions corresponding to anti-correlations. For the Dow Jones Industrial Average, where no anti-correlations were observed in the investigated time period, the previous results obtained by different tools were well reproduced. For the Standa…
Clusters of effects curves in quantile regression models
2018
In this paper, we propose a new method for finding similarity of effects based on quantile regression models. Clustering of effects curves (CEC) techniques are applied to quantile regression coefficients, which are one-to-one functions of the order of the quantile. We adopt the quantile regression coefficients modeling (QRCM) framework to describe the functional form of the coefficient functions by means of parametric models. The proposed method can be utilized to cluster the effect of covariates with a univariate response variable, or to cluster a multivariate outcome. We report simulation results, comparing our approach with the existing techniques. The idea of combining CEC with QRCM per…