Search results for "cluster analysis."

Showing 10 of 805 documents

Discovering the Senses of an Ambiguous Word by Clustering its Local Contexts

2005

As has been shown recently, it is possible to automatically discover the senses of an ambiguous word by statistically analyzing its contextual behavior in a large text corpus. However, this kind of research is still at an early stage. The results need to be improved and there is considerable disagreement on methodological issues. For example, although most researchers use clustering approaches for word sense induction, it is not clear what statistical features the clustering should be based on. Whereas so far most researchers cluster global co-occurrence vectors that reflect the overall behavior of a word in a corpus, in this paper we argue that it is more appropriate to use local context v…

Text corpus; business.industry; Computer science; Context (language use); computer.software_genre; Word sense; Word-sense induction; Artificial intelligence; business; Cluster analysis; computer; Natural language processing; Word (computer architecture); Strengths and weaknesses

researchProduct

Aspects Concerning SVM Method’s Scalability

2008

In recent years the quantity of text documents has increased continually, and automatic document classification has become an important challenge. In text document classification, the training step is essential for obtaining a good classifier. The quality of learning depends on the dimension of the training data. When working with huge training sets, the training time increases exponentially. In this paper we present a method that allows working with huge data sets in the training step without increasing the training time exponentially and without significantly decreasing the classification accuracy.

Text document classification; Structured support vector machine; business.industry; Computer science; Document classification; computer.software_genre; Support vector machine; Text mining; Scalability; Data mining; business; Cluster analysis; computer; Classifier (UML)

Graph Clustering with Local Density-Cut

2018

In this paper, we introduce a new graph clustering algorithm, called Dcut. The basic idea is to envision the graph clustering as a local density-cut problem. To identify meaningful communities in a graph, a density-connected tree is first constructed in a local fashion. Building upon the local intuitive density-connected tree, Dcut allows partitioning a graph into multiple densely tight-knit clusters effectively and efficiently. We have demonstrated that our method has several attractive benefits: (a) Dcut provides an intuitive criterion to evaluate the goodness of a graph clustering in a more precise way; (b) Building upon the density-connected tree, Dcut allows identifying high-quality cl…

Theoretical computer science; Computer science; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Graph (abstract data type); 020201 artificial intelligence & image processing; 02 engineering and technology; Cluster analysis; Clustering coefficient

Robust Synchronization-Based Graph Clustering

2013

Complex graph data now arises in various fields like social networks, protein-protein interaction networks, ecosystems, etc. To reveal the underlying patterns in graphs, an important task is to partition them into several meaningful clusters. The question is: how can we find the natural partitions of a complex graph which truly reflect the intrinsic patterns? In this paper, we propose RSGC, a novel approach to graph clustering. The key philosophy of RSGC is to consider graph clustering as a dynamic process towards synchronization. For each vertex, it is viewed as an oscillator and interacts with other vertices according to the graph connection information. During the process towards synchro…

Theoretical computer science; Computer science; CURE data clustering algorithm; Kuramoto model; Correlation clustering; Cluster analysis; Partition (database); Synchronization; MathematicsofComputing_DISCRETEMATHEMATICS; Clustering coefficient; Vertex (geometry)

Projector operators in clustering

2016

In a recent paper, the notion of a quantum perceptron was introduced in connection with projection operators. Here, we extend this idea, using this kind of operator to produce a clustering machine, that is, a framework that generates different clusters from a set of input data. We also consider what happens when the orthonormal bases first used in the definition of the projectors are replaced by frames, and how these can be useful when trying to connect a noisy signal to a given cluster. Copyright © 2016 John Wiley & Sons, Ltd.

Theoretical computer science; General Mathematics; General Engineering; 020206 networking & telecommunications; 02 engineering and technology; Perceptron; law.invention; Connection (mathematics); Set (abstract data type); Projector; law; Pattern recognition (psychology); 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Orthonormal basis; Projection (set theory); Cluster analysis; Mathematics; Mathematical Methods in the Applied Sciences

The Burrows-Wheeler Transform between Data Compression and Combinatorics on Words

2013

The Burrows-Wheeler Transform (BWT) is a tool of fundamental importance in Data Compression and has recently found many applications well beyond its original purpose. The main goal of this paper is to highlight the mathematical and combinatorial properties on which the outstanding versatility of the BWT is based, i.e. its reversibility and the clustering effect on the output. Such properties have aroused curiosity and fervent interest in the scientific world, both for theoretical aspects and for practical effects. In particular, in this paper we are interested both in surveying the theoretical research issues which, by taking their cue from Data Compression, have been developed in the conte…

Theoretical computer science; Settore INF/01 - Informatica; Burrows–Wheeler transform; media_common.quotation_subject; Theoretical research; Context (language use); Data_CODINGANDINFORMATIONTHEORY; Burrows Wheeler transform; Clustering effect; Combinatorial properties; Combinatorics on words; BWT balancing optimal partitioning text-compression; Curiosity; Arithmetic; Cluster analysis; Focus (optics); media_common; Data compression; Mathematics
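
The two properties named in the abstract above, reversibility and the clustering effect, can be seen on a toy input. The following is a minimal sketch, not the paper's construction: it uses the textbook sorted-rotations definition with an end marker `$` (an assumption; the paper may define the transform without a sentinel), and the helper names `bwt`, `inverse_bwt`, and `count_runs` are hypothetical.

```python
def bwt(text: str, end: str = "$") -> str:
    """Burrows-Wheeler Transform via sorted rotations (naive, for illustration)."""
    if end in text:
        raise ValueError("end marker must not occur in the input")
    s = text + end
    # Sort all cyclic rotations and read off the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, end: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the rotation that ends with the end marker.
    original = next(row for row in table if row.endswith(end))
    return original[:-1]


def count_runs(s: str) -> int:
    """Number of maximal runs of equal characters; fewer runs means more clustering."""
    return sum(1 for i, ch in enumerate(s) if i == 0 or ch != s[i - 1])


transformed = bwt("banana")
print(transformed)               # annb$aa
print(inverse_bwt(transformed))  # banana
print(count_runs("banana$"), count_runs(transformed))  # 7 5
```

On this input the transformed string has fewer runs of equal characters than the original, which is the clustering effect that run-length-based compressors exploit. The quadratic-time inversion is chosen for readability only.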

Game of Thieves and WERW-Kpath: Two Novel Measures of Node and Edge Centrality for Mafia Networks

2021

Real-world complex systems can be modeled as homogeneous or heterogeneous graphs composed of nodes connected by edges. The importance of nodes and edges is formally described by a set of measures called centralities, which are typically studied for graphs of small size. The proliferation of digital data collection has led to huge graphs with billions of nodes and edges. For this reason, we focus on two new algorithms, Game of Thieves and WERW-Kpath, which are computationally light alternatives to canonical centrality measures such as degree, node and edge betweenness, closeness, and clustering. We explore the correlation among these measures using Spearman's correlation coefficient …

Theoretical computer science; Settore INF/01 - Informatica; Degree (graph theory); Computer science; Closeness; Complex networks; Mafia networks; Complex network; Correlation; Computational complexity; Betweenness centrality; Node (computer science); Centrality; Rank (graph theory); Cluster analysis
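
The abstract above compares centrality measures through Spearman's correlation coefficient. As a rough illustration of that comparison step only, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks. The function names and the two score lists are made up for the example; they are not taken from the paper and do not implement Game of Thieves or WERW-Kpath.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-node scores from two centrality measures (illustrative only).
degree_scores = [5, 3, 3, 1, 4]
other_scores = [0.9, 0.4, 0.5, 0.1, 0.7]
print(spearman(degree_scores, other_scores))
```

Because the coefficient depends only on ranks, it indicates whether two measures order the nodes similarly even when their raw scales differ.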

Online Induction of Probabilistic Real Time Automata

2012

Probabilistic real time automata (PRTAs) are a representation of dynamic processes arising in the sciences and industry. Currently, the induction of automata is divided into two steps: the creation of the prefix tree acceptor (PTA) and the merge procedure based on clustering of the states. These two steps can be very time intensive when a PRTA is to be induced for massive or even unbounded data sets. The latter one can be efficiently processed, as there exist scalable online clustering algorithms. However, the creation of the PTA still can be very time consuming. To overcome this problem, we propose a genuine online PRTA induction approach that incorporates new instances by first collapsing…

Theoretical computer science; business.industry; Computer science; Probabilistic logic; computer.software_genre; Automaton; Data set; Trie; Automata theory; The Internet; Data mining; business; Cluster analysis; computer; 2012 IEEE 12th International Conference on Data Mining
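
The first of the two steps named in the abstract above, building a prefix tree acceptor (PTA), can be sketched as a counting trie over symbol sequences. This is a simplified illustration under stated assumptions: timing information and the state-merging step are left out, and the names `PtaNode`, `build_pta`, and `count_states` are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class PtaNode:
    visits: int = 0   # how many sequences passed through this state
    endings: int = 0  # how many sequences ended in this state
    children: dict = field(default_factory=dict)  # symbol -> PtaNode


def build_pta(sequences):
    """Insert every sequence into a trie, sharing states for common prefixes."""
    root = PtaNode()
    for sequence in sequences:
        node = root
        node.visits += 1
        for symbol in sequence:
            if symbol not in node.children:
                node.children[symbol] = PtaNode()
            node = node.children[symbol]
            node.visits += 1
        node.endings += 1
    return root


def count_states(node):
    """Total number of states in the tree rooted at node."""
    return 1 + sum(count_states(child) for child in node.children.values())


pta = build_pta(["ab", "ac", "ab"])
print(count_states(pta))             # 4
print(pta.children["a"].visits)      # 3
```

The visit counts are what a later step could normalize into transition probabilities. Since the tree gains states whenever a new prefix appears, it can grow large on massive inputs, which is consistent with the cost the abstract attributes to PTA creation.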

An ontological-based knowledge organization for bioinformatics workflow management system

2012

Motivation and Objectives: In the field of Computer Science, ontologies represent formal structures to define and organize knowledge of a specific application domain (Chandrasekaran et al., 1999). An ontology is composed of entities, called classes, and relationships among them. Classes are characterized by features, called attributes, and they can be arranged into a hierarchical organization. Ontologies are a fundamental instrument in Artificial Intelligence for the development of Knowledge-Based Systems (KBS). With its formal and well-defined structure, an ontology provides a machine-understandable language that allows automatic reasoning for problem resolution. Typical KBS are E…

Theoretical computer science; workflow management system; business.industry; Computer science; Intelligent decision support system; Bioinformatics workflow management system; bioinformatics; Ontology (information science); Solver; computer.software_genre; Expert system; Workflow; Artificial intelligence; ontology; business; Cluster analysis; computer; Workflow management system

Functional Brain Segmentation Using Inter-Subject Correlation in fMRI

2016

The human brain continuously processes massive amounts of rich sensory information. To better understand such highly complex brain processes, modern neuroimaging studies are increasingly utilizing experimental setups that better mimic daily‐life situations. A new exploratory data‐analysis approach, functional segmentation inter‐subject correlation analysis (FuSeISC), was proposed to facilitate the analysis of functional magnetic resonance (fMRI) data sets collected in these experiments. The method provides a new type of functional segmentation of brain areas, not only characterizing areas that display similar processing across subjects but also areas in which processing across subjects is h…

Time Factors; Computer science; 0302 clinical medicine; toiminnallinen magneettikuvaus; Image Processing Computer-Assisted; Cluster Analysis; Segmentation; Research Articles; inter-subject variability; Brain Mapping; shared nearest-neighbor graph; medicine.diagnostic_test; 05 social sciences; Brain; Human brain; Middle Aged; Magnetic Resonance Imaging; medicine.anatomical_structure; functional segmentation; Gaussian mixture model; Graph (abstract data type); /dk/atira/pure/sustainabledevelopmentgoals/good_health_and_well_being; inter-subject correlation; Algorithms; Adult; Models Neurological; Sensory system; 050105 experimental psychology; 03 medical and health sciences; Young Adult; Neuroimaging; SDG 3 - Good Health and Well-being; medicine; Humans; 0501 psychology and cognitive sciences; Computer Simulation; Cluster analysis; human brain; Communication; business.industry; Magnetic resonance imaging; Pattern recognition; functional magnetic resonance imaging; Oxygen; Affinity propagation; naturalistic stimulation; Artificial intelligence; business; 030217 neurology & neurosurgery