Search results for "cluster analysis."

Showing 10 items of 805 documents

MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study

2019

Motivation: Metagenomes offer a glimpse into the total genomic diversity contained within a sample. Currently, however, there is no straightforward way to obtain a non-redundant list of all putative homologs of a set of reference sequences present in a metagenome. Results: To address this problem, we developed a novel clustering approach called ‘metagenomic clustering by reference library’ (MCRL), where a reference library containing a set of reference genes is clustered with respect to an assembled metagenome. According to our proposed approach, reference genes homologous to similar sets of metagenomic sequences, termed ‘signatures’, are iteratively clustered in a greedy fashion, re…

Statistics and Probability; Contig; Computer science; Robustness (evolution); Computational biology; Original Papers; Biochemistry; Computer Science Applications; Set (abstract data type); Computational Mathematics; Computational Theory and Mathematics; Metagenomics; Reference genes; Gene family; Human virome; Cluster analysis; Molecular Biology; Bioinformatics

Ranking Scientific Journals Via Latent Class Models for Polytomous Item Response Data

2015

We propose a model-based strategy for ranking scientific journals starting from a set of observed bibliometric indicators that represent imperfect measures of the unobserved ‘value’ of a journal. After discretizing the available indicators, we estimate an extended latent class model for polytomous item response data and use the estimated model to cluster journals. We illustrate our approach by using the data from the Italian research evaluation exercise that was carried out for the period 2004–2010, focusing on the set of journals that are considered relevant for the subarea statistics and financial mathematics. Using four bibliometric indicators (IF, IF5, AIS and the h-index), some…

Statistics and Probability; Economics and Econometrics; Class (set theory); Research evaluation; Clustering; Set (abstract data type); Valutazione della Qualità delle Ricerca; Covariate; Statistics; Econometrics; Finite mixture models; Cluster analysis; Finite mixture model; Mathematics; Graded response model; Mathematical finance; Item response theory models; Item response theory model; Probability and statistics; Latent class model; Ranking; Statistics, Probability and Uncertainty; Settore SECS-S/01 - Statistica; Social Sciences (miscellaneous); Journal of the Royal Statistical Society Series A: Statistics in Society

Clustering of spatial point patterns

2006

Spatial point patterns arise as the natural sampling information in many problems. An ophthalmologic problem gave rise to the problem of detecting clusters of point patterns. A set of human corneal endothelium images is given. Each image is described by using a point pattern, the cell centroids. The main problem is to find groups of images corresponding with groups of spatial point patterns. This is interesting from a descriptive point of view and for clinical purposes. A new image can be compared with prototypes of each group and finally evaluated by the physician. Usual descriptors of spatial point patterns such as the empty-space function, the nearest distribution function or Ripley's K-…

Statistics and Probability; K-function; Applied Mathematics; Centroid; Pattern recognition; Function (mathematics); Point process; Computational Mathematics; Computational Theory and Mathematics; Survival function; Statistics; Point (geometry); Artificial intelligence; Point estimation; Cluster analysis; Mathematics; Computational Statistics & Data Analysis

Sparse kernel methods for high-dimensional survival data

2008

Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques however are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be ‘kernelized’. The kernelized proportional hazards model however yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, dependin…

Statistics and Probability; Lung Neoplasms; Lymphoma; Computer science; Computing Methodologies; Biochemistry; Pattern Recognition, Automated; Artificial Intelligence; Margin (machine learning); Covariate; Cluster Analysis; Humans; Computer Simulation; Fraction (mathematics); Molecular Biology; Proportional Hazards Models; Models, Statistical; Training set; Proportional hazards model; Gene Expression Profiling; Computational Biology; Computer Science Applications; Support vector machine; Computational Mathematics; Kernel method; Computational Theory and Mathematics; Regression Analysis; Data mining; Algorithms; Software; Bioinformatics

Immune networks: Multi-tasking capabilities at medium load

2013

Associative network models featuring multi-tasking properties have been introduced recently and studied in the low load regime, where the number $P$ of simultaneously retrievable patterns scales with the number $N$ of nodes as $P\sim \log N$. In addition to their relevance in artificial intelligence, these models are increasingly important in immunology, where stored patterns represent strategies to fight pathogens and nodes represent lymphocyte clones. They allow us to understand the crucial ability of the immune system to respond simultaneously to multiple distinct antigen invasions. Here we develop further the statistical mechanical analysis of such systems, by studying the medium load r…

Statistics and Probability; Modularity (networks); Theoretical computer science; Degree (graph theory); Associative network; Computer science; General Physics and Astronomy; FOS: Physical sciences; Statistical and Nonlinear Physics; Disordered Systems and Neural Networks (cond-mat.dis-nn); Condensed Matter - Disordered Systems and Neural Networks; Modeling and Simulation; FOS: Biological sciences; Cell Behavior (q-bio.CB); Human multitasking; Quantitative Biology - Cell Behavior; Relevance (information retrieval); Cluster analysis; Immune Network Statistical Mechanics Hopfield model Parallel Retrieval; Mathematical Physics

Degree stability of a minimum spanning tree of price return and volatility

2002

We investigate the time series of the degree of minimum spanning trees obtained by using a correlation-based clustering procedure starting from (i) asset return and (ii) volatility time series. The minimum spanning tree is obtained at different times by computing the correlation among time series over a time window of fixed length $T$. We find that the minimum spanning tree of asset return is characterized by stock degree values which are more stable in time than the ones obtained by analyzing a minimum spanning tree computed from volatility time series. Our analysis also shows that the degree of stocks has very slow dynamics, with a time scale of several years in both cases.

Statistics and Probability; Physics - Physics and Society; FOS: Physical sciences; Physics and Society (physics.soc-ph); Minimum spanning tree; FOS: Economics and business; Time windows; Statistics; Mathematical Physics; Cluster analysis; Stock (geology); Condensed Matter - Statistical Mechanics; Mathematics; Spanning tree; Statistical Finance (q-fin.ST); Statistical Mechanics (cond-mat.stat-mech); Econophysics; Quantitative Finance - Statistical Finance; Statistical and Nonlinear Physics; Asset return; Condensed Matter Physics; Settore FIS/07 - Fisica Applicata (Beni Culturali Ambientali Biol.e Medicin); Volatility; Correlation-based clustering; Price return; Volatility (finance)
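
The procedure summarized in the abstract above (correlations over a window of length $T$, a minimum spanning tree per window, then the degree of each stock) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: the distance $d = \sqrt{2(1-\rho)}$, the use of Kruskal's algorithm, and the names `pearson`, `mst_degrees` and `degree_over_time` are all assumptions made here, and the sample numbers are invented.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mst_degrees(series):
    """Degree of every node in the minimum spanning tree of one time window.

    `series` maps a label (hypothetical ticker) to its values in the window.
    """
    labels = sorted(series)
    # Assumed distance: d = sqrt(2 * (1 - rho)); highly correlated pairs are close.
    edges = sorted(
        (math.sqrt(max(0.0, 2.0 * (1.0 - pearson(series[a], series[b])))), a, b)
        for a, b in combinations(labels, 2)
    )
    parent = {label: label for label in labels}

    def find(node):
        # Union-find root lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    degree = {label: 0 for label in labels}
    for _, a, b in edges:
        # Kruskal: accept the shortest edge that joins two separate components.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            degree[a] += 1
            degree[b] += 1
    return degree


def degree_over_time(series, T):
    """Slide a window of length T over the series and collect MST degrees."""
    length = len(next(iter(series.values())))
    return [
        mst_degrees({k: v[t:t + T] for k, v in series.items()})
        for t in range(length - T + 1)
    ]


# Invented sample data: three short return series, window length 4.
sample = {
    "A": [0.1, -0.2, 0.3, 0.0, 0.2, -0.1],
    "B": [0.2, -0.1, 0.2, 0.1, 0.1, -0.2],
    "C": [-0.1, 0.2, -0.3, 0.1, -0.2, 0.3],
}
for window_degrees in degree_over_time(sample, 4):
    print(window_degrees)
```

Tracking how each label's degree changes from one window to the next gives the kind of degree time series the abstract refers to; the same functions could be applied to volatility series instead of returns.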

Iterative Cluster Analysis of Protein Interaction Data

2004

Motivation: Generation of fast tools of hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present in this work the program UVCLUSTER, that iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are consid…

Statistics and Probability; Saccharomyces cerevisiae Proteins; Computer science; Biochemistry; Interactome; Pattern Recognition, Automated; Set (abstract data type); Protein Interaction Mapping; Cluster (physics); Cluster Analysis; Molecular Biology; Cytoskeleton; Measure (data warehouse); Gene Expression Profiling; Proteins; Actins; Computer Science Applications; Hierarchical clustering; Computational Mathematics; Computational Theory and Mathematics; Pattern recognition (psychology); Benchmark (computing); Data mining; Algorithms; Software; Signal Transduction; Bioinformatics
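
The "primary distance" defined in the abstract above, the minimum number of interactions needed to connect two proteins, is a shortest-path length in an unweighted graph, so a breadth-first search is one plausible way to compute it. The sketch below is not UVCLUSTER and does not cover its secondary distances; the function name `primary_distances` and the protein labels are hypothetical.

```python
from collections import deque


def primary_distances(interactions, source):
    """Minimum number of interaction steps from `source` to each reachable protein."""
    # Build an undirected adjacency map from the list of pairwise interactions.
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:  # first visit is the shortest path in BFS
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Hypothetical interaction list, for illustration only.
pairs = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P5")]
print(primary_distances(pairs, "P1"))
```

Because the distances are small integers, many protein pairs end up sharing the same value, which is the source of the frequent distance ties mentioned in the abstract.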

Antibacterial Activity of Flavonoids Against Methicillin-resistant Staphylococcus aureus strains

2000

An experimental and theoretical study was performed on the anti-staphylococcal activity of 18 natural and synthetic flavonoids against methicillin-resistant Staphylococcus aureus strains. The analysed flavonoids belong to three well-differentiated structural patterns: chalcones, flavanones and flavones. The quantitative analysis of the anti-staphylococcal activity of the compounds was carried out by determining their percent inhibition degree. The hierarchical cluster analysis method was used to analyse the anti-MRSA activity of the compounds. With this methodology, the flavonoids were classified into four groups according to their anti-staphylococcal activity (high, sufficient, intermediat…

Statistics and Probability; Staphylococcus aureus; Chalcone; Stereochemistry; Flavonoid; Microbial Sensitivity Tests; Flavones; General Biochemistry, Genetics and Molecular Biology; Structure-Activity Relationship; Animals; Cluster Analysis; Humans; Structure–activity relationship; Flavonoids; General Immunology and Microbiology; Applied Mathematics; General Medicine; Staphylococcal Infections; Methicillin-resistant Staphylococcus aureus; Anti-Bacterial Agents; Biochemistry; Modeling and Simulation; Methicillin Resistance; General Agricultural and Biological Sciences; Antibacterial activity; Quantitative analysis (chemistry); Journal of Theoretical Biology

Identification of clusters of companies in stock indices via Potts super-paramagnetic transitions

2000

The clustering of companies within a specific stock market index is studied by means of super-paramagnetic transitions of an appropriate q-state Potts model where the spins correspond to companies and the interactions are functions of the correlation coefficients determined from the time dependence of the companies' individual stock prices. The method is a generalization of the clustering algorithm by Domany et al. to the case of anti-ferromagnetic interactions corresponding to anti-correlations. For the Dow Jones Industrial Average where no anti-correlations were observed in the investigated time period, the previous results obtained by different tools were well reproduced. For the Standa…

Statistics and Probability; Statistical Mechanics (cond-mat.stat-mech); Spins; FOS: Physical sciences; Condensed Matter Physics; Stock market index; Paramagnetism; Cluster (physics); Statistical physics; Cluster analysis; Stock (geology); Condensed Matter - Statistical Mechanics; Potts model; Mathematics

Clusters of effects curves in quantile regression models

2018

In this paper, we propose a new method for finding similarity of effects based on quantile regression models. Clustering of effects curves (CEC) techniques are applied to quantile regression coefficients, which are one-to-one functions of the order of the quantile. We adopt the quantile regression coefficients modeling (QRCM) framework to describe the functional form of the coefficient functions by means of parametric models. The proposed method can be utilized to cluster the effect of covariates with a univariate response variable, or to cluster a multivariate outcome. We report simulation results, comparing our approach with the existing techniques. The idea of combining CEC with QRCM per…

Statistics and Probability; Statistics::Theory; Multivariate statistics; 05 social sciences; Univariate; Functional data analysis; 01 natural sciences; Quantile regression; Quantile regression coefficients modeling Multivariate analysis Functional data analysis Curves clustering Variable selection; 010104 statistics & probability; Computational Mathematics; 0502 economics and business; Parametric model; Covariate; Statistics::Methodology; Applied mathematics; 0101 mathematics; Statistics, Probability and Uncertainty; Cluster analysis; Settore SECS-S/01 - Statistica; 050205 econometrics; Mathematics; Quantile