Search results for "Clustering"

showing 10 items of 446 documents

Probabilistic quantum clustering

2020

Quantum Clustering is a powerful method to detect clusters with complex shapes. However, it is very sensitive to a length parameter that controls the shape of the Gaussian kernel associated with a wave function, which plays the role of a density estimator in the Schrödinger equation. In addition, linking data points into clusters requires local estimates of covariance, which in turn require further parameters. This paper proposes a Bayesian framework that provides an objective measure of goodness-of-fit to the data, to optimise the adjustable parameters. This also quantifies the probabilities of cluster membership, thus partitioning the data into a specific number of clusters, w…

Information Systems and Management; Jaccard index; Computer science; Probabilistic logic; Estimator; Probability density function; 02 engineering and technology; Function (mathematics); Covariance; Measure (mathematics); Management Information Systems; Artificial Intelligence; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Gaussian function; Cluster (physics); 020201 artificial intelligence & image processing; Statistical physics; QA; Software; Quantum clustering; Knowledge-Based Systems

researchProduct

ExtMiner : Combining multiple ranking and clustering algorithms for structured document retrieval

2006

This paper introduces ExtMiner, a platform and potential tool for information management in small and medium-sized enterprises (SMEs) or in organizational workgroups. ExtMiner supports interactive and iterative clustering of documents. It provides users with a visual cluster view and a list view at the same time, supporting an iterative search process. ExtMiner may also be applied as a platform for research on retrieval fusion, since it combines search, clustering and visualization algorithms. ExtMiner was evaluated with three document collections. Although the findings were encouraging, the user interface and the performance with large document repositories need further development.

Information management; document retrieval methods; clustering; Document retrieval; Information retrieval; Computer science; Document clustering; XML; Ranking (information retrieval); Ranking; Human–computer information retrieval; Relevance (information retrieval); Data mining; User interface; Cluster analysis

researchProduct

Rings for privacy: An architecture for privacy-preserving user profiling

2014

Information privacy; social networking (online) data privacy human factors Internet; Settore ING-INF/03 - Telecomunicazioni; Computer science; Privacy software; Internet privacy; Computer security; Privacy preserving; privacy-preserving users profiling FCM clustering distributed unstable scenario; Profiling (information science); The Internet; Architecture

researchProduct

Document Word Clouds: Visualising Web Documents as Tag Clouds to Aid Users in Relevance Decisions

2009

Contains the full text. Information retrieval systems spend great effort on determining the significant terms in a document. A user looking at a document, however, cannot benefit from that information and has to read the text to understand which words are important. In this paper we look at the idea of enhancing the perception of web documents with visualisation techniques borrowed from the tag clouds of Web 2.0. Highlighting the important words in a document with a larger font size allows the reader to get a quick impression of the relevant concepts in a text. As this process does not depend on a user query, it can also be used for explorative search. A user study showed, th…
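
The idea of showing important words in a larger font can be sketched in a few lines. This is only an illustration, not the method used in the paper: it assumes raw term frequency as the importance weight (the paper's actual weighting is not given in this excerpt), and the function name `term_font_sizes` and the point-size range are hypothetical.

```python
import re
from collections import Counter


def term_font_sizes(text, min_pt=10.0, max_pt=24.0):
    """Map each word to a font size, scaled linearly by its frequency.

    Assumption: importance = raw term frequency. A real system might
    use tf-idf or another weighting instead.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for word, count in counts.items():
        if span == 0:
            # All words are equally frequent: use the smallest size.
            sizes[word] = min_pt
        else:
            sizes[word] = min_pt + (count - lowest) / span * (max_pt - min_pt)
    return sizes


sizes = term_font_sizes("cluster cluster cluster data data word")
```

Because the scaling depends only on the document itself, no user query is needed, which matches the query-independent use described in the abstract.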

Information retrieval; Process (engineering); Computer science; Document clustering; User requirements document; World Wide Web; Perception; Relevance (information retrieval); Tag cloud; tf–idf; Technical services in libraries archives and museums; Word (computer architecture)

researchProduct

Data mining-based statistical analysis of biological data uncovers hidden significance: clustering Hashimoto’s thyroiditis patients based on the resp…

2014

The pathogenesis of Hashimoto's thyroiditis includes autoimmunity involving thyroid antigens, autoantibodies, and possibly cytokines. It is unclear what role Hsp60 plays, but our recent data indicate that it may contribute to pathogenesis as an autoantigen. Its role in the induction of cytokine production, pro- or anti-inflammatory, was not elucidated, except that we found that peripheral blood mononucleated cells (PBMC) from patients or from healthy controls did not respond with cytokine production upon stimulation by Hsp60 in vitro with patterns that would differentiate patients from controls with statistical significance. This "negative" outcome appeared when the data were pooled and ana…

Interleukin 2; Hashimoto’s thyroiditis; Short Communication; Stimulation; Hashimoto Disease; Biochemistry; Clustering; Thyroiditis; Autoimmunity; Interferon-gamma; Cluster Analysis; Data Mining; Humans; Medicine; Delta value; IFN-γ; Cells Cultured; Settore BIO/16 - Anatomia Umana; IL-2; Thyroid; Chaperonin 60; Cell Biology; Hsp60; Cytokine; Immunology; Leukocytes Mononuclear; Interleukin-2; Biomarker (medicine); Algorithms

researchProduct

PGAC: A Parallel Genetic Algorithm for Data Clustering

2005

Cluster analysis is a valuable tool for exploratory pattern analysis, especially when very little a priori knowledge about the data is available. Distributed systems, based on high-speed intranet connections, provide new tools for designing new and faster clustering algorithms. Here, a parallel genetic algorithm for clustering called PGAC is described. The parallelization strategy used is the island model paradigm, in which different populations of chromosomes (called demes) evolve locally on each processor and, from time to time, some individuals are moved from one deme to another. Experiments have been performed for testing the benefits of the parallelisation paradigm in terms of comput…

Intranet; Correctness; Theoretical computer science; Parallel processing (DSP implementation); Artificial neural network; Data Clustering Evolutionary Algorithms Parallel processing; Settore INF/01 - Informatica; Computer science; Parallel algorithm; A priori and a posteriori; Algorithm design; Parallel computing; Cluster analysis

researchProduct

Gamma Knife treatment planning: MR brain tumor segmentation and volume measurement based on unsupervised Fuzzy C-Means clustering

2015

Nowadays, radiation treatment is beginning to use MRI intensively thanks to its greater ability to discriminate between healthy and diseased soft tissues. Leksell Gamma Knife® is a radio-surgical device used to treat different brain lesions, such as benign or malignant tumors, which are often inaccessible for conventional surgery. Currently, the target to be treated with radiation therapy is contoured with slice-by-slice manual segmentation on MR datasets. This approach makes the segmentation procedure time-consuming and operator-dependent. The repeatability of the tumor boundary delineation may be ensured only by using automatic or semiautomatic methods, supporting clinicians in the treatment pla…

researchProduct

Knowledge Discovery from the Programme for International Student Assessment

2017

The Programme for International Student Assessment (PISA) is a worldwide study that assesses the proficiencies of 15-year-old students in reading, mathematics, and science every three years. Despite the high quality and open availability of the PISA data sets, which call for big data learning analytics, academic research using this rich and carefully collected data is surprisingly sparse. Our research contributes to reducing this deficit by discovering novel knowledge from the PISA through the development and use of appropriate methods. Since Finland has been the country of most international interest in the PISA assessment, a relevant review of the Finnish educational system is provided. T…

Knowledge management; knowledge discovery; Big data; Learning analytics; 02 engineering and technology; Knowledge extraction; 020204 information systems; Reading (process); Political science; 0202 electrical engineering electronic engineering information engineering; Mathematics education; Quality (business); Cluster analysis; Statistical hypothesis testing; 05 social sciences; PISA; 050301 education; Test (assessment); hierarchical clustering; 0503 education

researchProduct

Comparison of genomic sequences clustering using Normalized Compression Distance and Evolutionary Distance

2008

Genomic sequences are usually compared using evolutionary distance, a procedure that implies the alignment of the sequences. Aligning long sequences is a lengthy procedure, and the resulting dissimilarity is not a metric. Recently the normalized compression distance was introduced as a method to calculate the distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper the clustering and the mapping obtained using a SOM with the traditional evolutionary distance and with the compression distance are compared, in order to understand whether the two sets of distances are similar. The first results indicate that the two distances catch differen…
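
For orientation, the normalized compression distance is commonly defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size under some compressor. The sketch below is a minimal illustration of that definition; the choice of `zlib` as the compressor and the helper names are assumptions, not details taken from the paper.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy sequences: a highly repetitive string and a pseudo-random one.
repetitive = b"ACGT" * 200
rng = random.Random(0)
scrambled = bytes(rng.choice(b"ACGT") for _ in range(800))

same = ncd(repetitive, repetitive)      # expected to be small
different = ncd(repetitive, scrambled)  # expected to be larger
```

No alignment is involved: the distance depends only on how well the concatenation compresses relative to the parts, which is why it applies to generic digital objects.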

Kolmogorov complexity; universal similarity metric; Computer science; DNA sequence; Pattern recognition; Genomic Sequence Clustering; Compression (functional analysis); Normalized compression distance; Artificial intelligence; Cluster analysis; Distance matrices in phylogeny; clustering

researchProduct

Feature Ranking of Large, Robust, and Weighted Clustering Result

2017

A clustering result needs to be interpreted and evaluated for knowledge discovery. When clustered data represents a sample from a population with known sample-to-population alignment weights, both the clustering and the evaluation techniques need to take this into account. The purpose of this article is to advance the automatic knowledge discovery from a robust clustering result on the population level. For this purpose, we derive a novel ranking method by generalizing the computation of the Kruskal-Wallis H test statistic from sample to population level with two different approaches. Application of these enlargements to both the input variables used in clustering and to metadata provides a…
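
As background for the ranking method, the standard sample-level Kruskal-Wallis statistic is H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), where R_i is the rank sum of group i, n_i its size and N the total sample size. The sketch below implements only this unweighted textbook form; it is not the population-level generalization the article derives. The function name is hypothetical, groups are assumed non-empty, ties get average ranks, and the tie correction factor is omitted.

```python
def kruskal_wallis_h(groups):
    """Unweighted Kruskal-Wallis H for a list of non-empty numeric groups.

    Ties receive average ranks; no tie correction is applied (assumption).
    """
    # Pool all observations, remembering which group each came from.
    pooled = sorted(
        (value, index) for index, group in enumerate(groups) for value in group
    )
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)

    i = 0
    while i < n_total:
        # Find the run of tied values starting at position i.
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; their average is (i + 1 + j) / 2.
        average_rank = (i + 1 + j) / 2
        for k in range(i, j):
            rank_sums[pooled[k][1]] += average_rank
        i = j

    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


h_separated = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
h_identical = kruskal_wallis_h([[1, 2], [1, 2]])
```

Computing H once per variable, with the clusters as groups, gives a score by which variables can be ordered; larger values indicate variables whose ranks differ more between clusters.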

Kruskal-Wallis test; Computer science; Correlation clustering; Population; 02 engineering and technology; Machine learning; 01 natural sciences; Ranking (information retrieval); 010104 statistics & probability; Knowledge extraction; CURE data clustering algorithm; population analysis; Ranking SVM; 0202 electrical engineering electronic engineering information engineering; Test statistic; 0101 mathematics; educational knowledge discovery; Cluster analysis; Ranking; 020201 artificial intelligence & image processing; Data mining; Artificial intelligence; robust clustering

researchProduct