Search results for "Hierarchical Clustering"

showing 10 items of 56 documents

Kullback-Leibler distance as a measure of the information filtered from multivariate data

2007

We show that the Kullback-Leibler distance is a good measure of the statistical uncertainty of correlation matrices estimated by using a finite set of data. For correlation matrices of multivariate Gaussian variables we analytically determine the expected values of the Kullback-Leibler distance of a sample correlation matrix from a reference model and we show that the expected values are known also when the specific model is unknown. We propose to make use of the Kullback-Leibler distance to estimate the information extracted from a correlation matrix by correlation filtering procedures. We also show how to use this distance to measure the stability of filtering procedures with respect to s…

Physics - Physics and Society; Kullback–Leibler divergence; Statistical Finance (q-fin.ST); Covariance matrix; EXPRESSION DATA; FOS: Physical sciences; Quantitative Finance - Statistical Finance; Multivariate normal distribution; Physics and Society (physics.soc-ph); Measure (mathematics); Stability (probability); Hierarchical clustering; Distance correlation; FOS: Economics and business; Physics - Data Analysis Statistics and Probability; Statistics; Time series; Algorithm; Data Analysis Statistics and Probability (physics.data-an); MATRICES; Mathematics
researchProduct
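
For reference, the Kullback-Leibler distance named in this result has a well-known closed form for Gaussian variables. The expression below is the standard textbook form for two zero-mean, n-dimensional multivariate normal distributions with covariance (or correlation) matrices Σ1 and Σ2. The symbols are assumptions introduced here for illustration; the expected values derived in the paper are not reproduced.

```latex
K(\Sigma_1, \Sigma_2) = \frac{1}{2}\left[ \ln\frac{\det\Sigma_2}{\det\Sigma_1} + \operatorname{tr}\!\left(\Sigma_2^{-1}\Sigma_1\right) - n \right]
```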

Economic Sector Identification in a Set of Stocks Traded at the New York Stock Exchange: A Comparative Analysis

2006

We review some methods recently used in the literature to detect the existence of a certain degree of common behavior of stock returns belonging to the same economic sector. Specifically, we discuss methods based on random matrix theory and hierarchical clustering techniques. We apply these methods to a set of stocks traded at the New York Stock Exchange. The investigated time series are recorded at a daily time horizon. All the considered methods are able to detect economic information and the presence of clusters characterized by the economic sector of stocks. However, different methodologies provide different information about the considered set. Our comparative analysis suggests that th…

Physics - Physics and Society; Statistical Finance (q-fin.ST); Correlation coefficient; Economic sector; Econophysics; FOS: Physical sciences; Quantitative Finance - Statistical Finance; Time horizon; Physics and Society (physics.soc-ph); minimum spanning tree; Settore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin); Hierarchical clustering; FOS: Economics and business; Economic information; Stock exchange; Physics - Data Analysis Statistics and Probability; Economics; Econometrics; financial market; Random matrix; Data Analysis Statistics and Probability (physics.data-an); Stock (geology)
researchProduct

“Anti-Bayesian” flat and hierarchical clustering using symmetric quantiloids

2017

A myriad of works has been published for achieving data clustering based on the Bayesian paradigm, where the clustering sometimes resorts to Naive-Bayes decisions. Within the domain of clustering, the Bayesian principle corresponds to assigning the unlabelled samples to the cluster whose mean (or centroid) is the closest. Recently, Oommen and his co-authors have proposed a novel, counter-intuitive and pioneering PR scheme that is radically opposed to the Bayesian principle. The rationale for this paradigm, referred to as the “Anti-Bayesian” (AB) paradigm, involves classification based on the non-central quantiles of the distributions. The first-reported work to achieve clustering using the A…

Scheme (programming language); Information Systems and Management; Theoretical computer science; Computer science; Bayesian principle; Bayesian probability; VDP::Matematikk og Naturvitenskap: 400::Matematikk: 410::Statistikk: 412; Multivariate normal distribution; 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; Domain (mathematical analysis); Clustering; Theoretical Computer Science; Artificial Intelligence; 0103 physical sciences; Cluster (physics); 0202 electrical engineering electronic engineering information engineering; 010306 general physics; Cluster analysis; computer.programming_language; Centroid; Computer Science Applications; Hierarchical clustering; 010201 computation theory & mathematics; Control and Systems Engineering; Anti-Bayesian classification; 020201 artificial intelligence & image processing; computer; Software; Quantiloids; Quantile
researchProduct

A hierarchical clustering strategy and its application to proteomic interaction data

2003

We describe a novel strategy of hierarchical clustering analysis, particularly useful to analyze proteomic interaction data. The logic behind this method is to use the information for all interactions among the elements of a set to evaluate the strength of the interaction of each pair of elements. Our procedure allows the characterization of protein complexes starting with partial data and the detection of "promiscuous" proteins that bias the results, generating false positive data. We demonstrate the usefulness of our strategy by analyzing a real case that involves 137 Saccharomyces cerevisiae proteins. Because most functional studies require the evaluation of similar data sets, our method…

Set (abstract data type); Data set; Range (mathematics); Computer science; Benchmark (computing); Data mining; computer.software_genre; computer; Hierarchical clustering
researchProduct

Correlation based hierarchical clustering in financial time series

2005

We review a correlation-based clustering procedure applied to a portfolio of assets synchronously traded in a financial market. The portfolio considered consists of the set of 500 highly capitalized stocks traded at the New York Stock Exchange during the time period 1987-1998. We show that meaningful economic information can be extracted from correlation matrices.

Set (abstract data type); Finance; Correlation; Economic information; Series (mathematics); Stock exchange; business.industry; Portfolio; business; Cluster analysis; econophysic; hierarchical clustering; Hierarchical clustering
researchProduct
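
A correlation-based clustering procedure of the kind this abstract reviews can be sketched as follows. The abstract does not state the distance or the linkage rule, so two assumptions are made here: the common correlation distance d = sqrt(2(1 - rho)) and single linkage. The tickers and return series are hypothetical, and the naive merge loop is for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length series (non-constant input assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def single_linkage(returns):
    """Agglomerate series by single linkage; return the merges in order.

    Assumption: distance d = sqrt(2 * (1 - rho)), not taken from the abstract.
    """
    names = list(returns)
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = pearson(returns[a], returns[b])
            # max() guards against tiny negative values from rounding
            dist[frozenset((a, b))] = sqrt(max(0.0, 2.0 * (1.0 - rho)))

    clusters = [frozenset([name]) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: closest pair of members across the two clusters
                d = min(dist[frozenset((a, b))]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((set(clusters[i]), set(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical daily returns: X and Y move together, Z moves against them.
returns = {
    "X": [0.01, -0.02, 0.03, -0.01],
    "Y": [0.02, -0.04, 0.06, -0.02],
    "Z": [-0.01, 0.02, -0.03, 0.01],
}
merges = single_linkage(returns)
```

With this toy input, X and Y merge first at a distance close to 0, and Z joins last at a distance close to 2.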

Dissimilarity Measures for the Identification of Earthquake Focal Mechanisms

2013

This work presents a study of dissimilarity measures for seismic signals, and their relation to clustering in the particular problem of the identification of earthquake focal mechanisms, i.e. the physical phenomena which have generated an earthquake. Starting from the assumption that waveform similarity implies similarity in the focal parameters, important details about them can be determined by studying waveforms related to the wave field produced by earthquakes and recorded by a seismic network. Focal mechanism identification is currently investigated by clustering of seismic events, using mainly cross-correlation dissimilarity in conjunction with a hierarchical clustering algorithm. By…

Settore INF/01 - Informatica; Relation (database); Cross-correlation; Computer science; business.industry; Pattern recognition; Field (computer science); Physics::Geophysics; Hierarchical clustering; Identification (information); Similarity (network science); Waveform; Artificial intelligence; Cluster analysis; business; metrics clustering seismic signals waveforms
researchProduct

Automatic classification of acoustically detected krill aggregations: A case study from Southern Ocean

2022

Acoustic surveys represent the standard methodology to assess the spatial distribution and abundance of pelagic organisms characterized by aggregative behaviour. The species identification of acoustically observed aggregations is usually performed by taking into account the biological sampling and according to expert-based knowledge. The precision of survey estimates, such as total abundance and spatial distribution, strongly depends on the efficiency of acoustic and biological sampling as well as on the species identification. In this context, the automatic identification of specific groups based on energetic and morphological features could improve the species identification process, allo…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Environmental Engineering; Ross Sea; Settore INF/01 - Informatica; Ecological Modeling; k-means; Acoustic; Krill; Internal validation indices; Software; Hierarchical clustering
researchProduct

Cruise passengers' trajectories at destination. A Dynamic Time Warping approach.

2015

The present work proposes an analysis of cruise passengers’ trajectories at the destination through the Dynamic Time Warping algorithm. Data collected through GPS devices on cruise passengers’ behavior in the port of Palermo are analyzed in order to show similarities and differences among their spatial trajectories at the destination. A cluster analysis is performed in order to identify cruise passengers’ segments based on trajectories’ similarity. Results are of interest both from a methodological perspective related to the analysis of GPS data, and for the management and planning of cruise tourism destinations.

Settore SECS-S/05 - Statistica Sociale; GPS tracking data; Hierarchical Clustering; Consumer behavior; Settore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin); Dynamic Time Warping
researchProduct
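
The Dynamic Time Warping comparison named in this abstract can be illustrated with a minimal sketch. It uses the classic dynamic-programming recurrence with Euclidean point distance and no window constraint, which may differ from the variant the authors used. The trajectories are hypothetical (x, y) points, not data from the study.

```python
from math import hypot, inf

def dtw_distance(a, b):
    """Classic DTW cost between two sequences of (x, y) points.

    Assumptions: Euclidean point distance, no window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Hypothetical trajectories: walk_b is walk_a with a pause at the start,
# walk_c follows the same route shifted one unit to the side.
walk_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
walk_b = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
walk_c = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]

same_route = dtw_distance(walk_a, walk_b)   # pause is absorbed by warping
shifted = dtw_distance(walk_a, walk_c)
```

A matrix of such pairwise distances is what a subsequent cluster analysis would take as input.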

An efficient prototype merging strategy for the condensed 1-NN rule through class-conditional hierarchical clustering

2002

A generalized prototype-based classification scheme founded on hierarchical clustering is proposed. The basic idea is to obtain a condensed 1-NN classification rule by merging the two same-class nearest clusters, provided that the set of cluster representatives correctly classifies all the original points. Apart from the quality of the obtained sets and its flexibility, which comes from the fact that different intercluster measures and criteria can be used, the proposed scheme includes a very efficient four-stage procedure which conveniently exploits geometric cluster properties to decide about each possible merge. Empirical results demonstrate the merits of the proposed algorithm t…

Single-linkage clustering; computer.software_genre; Complete-linkage clustering; Hierarchical clustering; k-nearest neighbors algorithm; Artificial Intelligence; Nearest-neighbor chain algorithm; Classification rule; Signal Processing; Cluster (physics); Computer Vision and Pattern Recognition; Data mining; Merge (version control); computer; Software; Mathematics; Pattern Recognition
researchProduct

Iterative Cluster Analysis of Protein Interaction Data

2004

Motivation: Generation of fast tools for hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present in this work the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are consid…

Statistics and Probability; Saccharomyces cerevisiae Proteins; Computer science; computer.software_genre; Biochemistry; Interactome; Pattern Recognition Automated; Set (abstract data type); Protein Interaction Mapping; Cluster (physics); Cluster Analysis; Cluster analysis; Molecular Biology; Cytoskeleton; Measure (data warehouse); Gene Expression Profiling; Proteins; Actins; Computer Science Applications; Hierarchical clustering; Gene expression profiling; Computational Mathematics; Computational Theory and Mathematics; Pattern recognition (psychology); Benchmark (computing); Data mining; computer; Algorithms; Software; Signal Transduction; Bioinformatics
researchProduct
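
The primary distance defined in this abstract (the minimum number of interactions required to connect two proteins) is a shortest-path length in an unweighted graph. The sketch below computes it with breadth-first search. It is not UVCLUSTER code: the function name and the protein labels are hypothetical, and the conversion to secondary distances is not attempted because the abstract is truncated before describing it.

```python
from collections import deque

def primary_distances(interactions, source):
    """Minimum number of interaction steps from `source` to each reachable protein.

    `interactions` is an iterable of undirected (protein, protein) pairs.
    """
    graph = {}
    for u, v in interactions:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)

    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                # first visit in BFS order is along a shortest path
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

# Hypothetical interaction pairs
pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
from_a = primary_distances(pairs, "A")
```

Because every step counts as 1, many protein pairs end up at the same distance, which is the source of the frequent distance ties the abstract mentions.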