Search results for "clustering"

showing 10 items of 446 documents

A Coclustering Approach for Mining Large Protein-Protein Interaction Networks

2012

Several approaches have been presented in the literature to cluster Protein-Protein Interaction (PPI) networks. They can be grouped into two main categories: those allowing a protein to participate in different clusters and those generating only nonoverlapping clusters. In both cases, a challenging task is to find a suitable compromise between the biological relevance of the results and a comprehensive coverage of the analyzed networks. Indeed, methods returning highly accurate results are often able to cover only small parts of the input PPI network, especially when poorly characterized networks are considered. We present a coclustering-based technique able to generate both overlapping and nonove…

Biology; computer.software_genre; Bioinformatics network analysis co-clustering; Task (project management); Set (abstract data type); Protein Interaction Mapping; Genetics; Cluster (physics); Cluster Analysis; Humans; Relevance (information retrieval); Protein Interaction Maps; Cluster analysis; Structure (mathematical logic); Applied Mathematics; Proteins; protein-protein interaction networks; biological networks; ComputingMethodologies_PATTERNRECOGNITION; Cover (topology); Co-clustering; Data mining; computer; Algorithms; Biological network; Biotechnology
IEEE/ACM Transactions on Computational Biology and Bioinformatics
researchProduct

Evolution of Cooperation Patterns in Psoriasis Research: Co-Authorship Network Analysis of Papers in Medline (1942–2013)

2015

Background: Although researchers have worked in collaboration since the origins of modern science and the publication of the first scientific journals in the eighteenth century, this phenomenon has acquired exceptional importance in the last several decades. Since the mid-twentieth century, new knowledge has been generated from within an ever-growing network of investigators, working cooperatively in research groups across countries and institutions. Cooperation is a crucial determinant of academic success. Objective: The aim of the present paper is to analyze the evolution of scientific collaboration at the micro level, with regard to the scientific production generated on psoriasis research. Me…

Biomedical Research; MEDLINE; Science; Closeness; Information Dissemination; Biology; Bibliometrics; Bioinformatics; Giant component; Social Networking; Betweenness centrality; Regional science; Humans; Psoriasis; Cooperative Behavior; Clustering coefficient; Multidisciplinary; Social network; business.industry; QR; Authorship; Research Personnel; Workforce; Medicine; Periodicals as Topic; business; Research Article
PLOS ONE
researchProduct

Improving clustering of Web bot and human sessions by applying Principal Component Analysis

2019

The paper addresses the problem of modeling Web sessions of bots and legitimate users (humans) as feature vectors for their use at the input of classification models. So far many different features to discriminate bots’ and humans’ navigational patterns have been considered in session models but very few studies were devoted to feature selection and dimensionality reduction in the context of bot detection. We propose applying Principal Component Analysis (PCA) to develop improved session models based on predictor variables being efficient discriminants of Web bots. The proposed models are used in session clustering, whose performance is evaluated in terms of the purity …

Bot detection; Principal Component Analysis; PCA; Log analysis; Computer science; k-means; Internet robot; computer.software_genre; Classification; Web bot; Dimensionality reduction; Clustering; Web server; Principal component analysis; Feature selection; Data mining; Cluster analysis; computer
Communications of the ECMS
researchProduct

Fast dendrogram-based OTU clustering using sequence embedding

2014

Biodiversity assessment is an important step in a metagenomic processing pipeline. The biodiversity of a microbial metagenome is often estimated by grouping its 16S rRNA reads into operational taxonomic units or OTUs. These metagenomic datasets are typically large and hence require effective yet accurate computational methods for processing. In this paper, we introduce a new hierarchical clustering method called CRiSPy-Embed which aims to produce high-quality clustering results at a low computational cost. We tackle two computational issues of the current OTU hierarchical clustering approach: (1) the compute-intensive sequence alignment operation for building the distance matrix and (2) the …

Brown clustering; CURE data clustering algorithm; Single-linkage clustering; Correlation clustering; Canopy clustering algorithm; Data mining; Biology; Hierarchical clustering of networks; Cluster analysis; computer.software_genre; computer; Hierarchical clustering
Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
researchProduct

A Fuzzy Logic C-Means Clustering Algorithm to Enhance Microcalcifications Clusters in Digital Mammograms

2011

The detection of microcalcifications is a hard task, since they are quite small and often poorly contrasted against the image background. Computer Aided Detection (CAD) systems could be very useful for breast cancer control. In this paper, we report a method to enhance microcalcification clusters in digital mammograms. A Fuzzy Logic clustering algorithm with a set of features is used for clustering microcalcifications. The method was tested on simulated clusters of microcalcifications, so that the location of the cluster within the breast and the exact number of microcalcifications are known.

C-mean; COMPUTER-AIDED DETECTION; Computer science; CAD; Fuzzy logic; Set (abstract data type); Cluster (physics); medicine; Mammography; cancer; Computer vision; CLASSIFICATION; Cluster analysis; breast; medicine.diagnostic_test; business.industry; Pattern recognition; Image enhancement; Computer aided detection; Settore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin); microcalcification; ComputingMethodologies_PATTERNRECOGNITION; microcalcifications; clustering; fuzzy logic; C-means; Artificial intelligence; business
researchProduct

Parallelized Clustering of Protein Structures on CUDA-Enabled GPUs

2014

Estimation of the pose in which two given molecules might bind together to form a potential complex is a crucial task in structural biology. To solve this so-called "docking problem", most algorithms initially generate large numbers of candidate poses (or decoys) which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates ranges from thousands to millions, performing the clustering on standard CPUs is highly time consuming. In this paper we analyze and evaluate different approaches to parallelize the nearest neighbor chain algorithm to perform hierarchical Ward clustering of protein structures usin…

CUDA; Speedup; Computer science; Nearest-neighbor chain algorithm; Parallel computing; Cluster analysis; Root-mean-square deviation; Pose; Ward's method; Hierarchical clustering
2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
researchProduct

Cellular automata and urban development simulation : a transition rules creation process based on statistical analysis

2015

National audience; The study of land use evolution has become a major concern in urban planning. The main focus is to understand how land use evolves over time and which processes take place. This understanding would make it possible to plan urban developments based on knowledge that is as complete as possible and covers as many fields as possible (e.g. urban planning, politics, sociology). Simulation tools can be used to merge and display different points of view and stakes from different stakeholders (Parrott & Meyer, 2012).

Cellular automata; spatial analysis; principal component analysis; [SHS.GEO] Humanities and Social Sciences/Geography; decision tree; hierarchical clustering
researchProduct

Efficient unsupervised clustering for spatial bird population analysis along the Loire river

2015

International audience; This paper focuses on the application and comparison of Non Linear Dimensionality Reduction (NLDR) methods on a natural high-dimensional dataset of bird communities along the Loire River (France). In this context, biologists usually use the well-known PCA to explain the upstream-downstream gradient. Unfortunately, this method was unsuccessful on this kind of nonlinear dataset. The goal of this paper is to compare recent NLDR methods coupled with different data transformations in order to find the best approach. Results show that Multiscale Jensen-Shannon Embedding (Ms JSE) outperforms all other methods in this context.

Clustering Algorithms; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; Nonlinear dimension reduction; Multiscale Jensen-Shannon Embedding; Dimension Reduction; Loire River
researchProduct

SMART: Unique splitting-while-merging framework for gene clustering

2014

© 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named "splitting merging awareness tactics" (SMART), which does not require any a priori knowledge of either the number …

Clustering algorithms; Microarrays; lcsh:Medicine; Gene Expression; Bioinformatics; computer.software_genre; Cell Signaling; Data Mining; Cluster Analysis; lcsh:Science; Finite mixture model; Oligonucleotide Array Sequence Analysis; Physics; Multidisciplinary; SMART framework; Constrained clustering; Competitive learning model; Bioassays and Physiological Analysis; Multigene Family; Canopy clustering algorithm; Engineering and Technology; Data mining; Information Technology; Genomic Signal Processing; Algorithms; Research Article; Signal Transduction; Computer and Information Sciences; Fuzzy clustering; Correlation clustering; Research and Analysis Methods; Clustering; Molecular Genetics; CURE data clustering algorithm; Genetics; Gene Regulation; Cluster analysis; ta113; Gene Expression Profiling; lcsh:R; Biology and Life Sciences; Computational Biology; Cell Biology; Determining the number of clusters in a data set; ComputingMethodologies_PATTERNRECOGNITION; Splitting-merging awareness tactics (SMART); Signal Processing; Affinity propagation; lcsh:Q; Gene expression; Clustering framework; computer
researchProduct

The on-line curvilinear component analysis (onCCA) for real-time data reduction

2015

Real-time pattern recognition applications often deal with high-dimensional data, which require a data reduction step that is performed only offline. However, this loses the possibility of adaptation to a changing environment. The same holds for applications other than pattern recognition, such as data visualization for input inspection. Only linear projections, like principal component analysis, can work in real time by using iterative algorithms, while all known nonlinear techniques cannot be implemented in such a way and actually always work on the whole database at each epoch. Among these nonlinear tools, the Curvilinear Component Analysis (CCA), which is a non-convex techni…

Clustering high-dimensional data; Bregman divergence; Computer science; neural network; projection; Novelty detection; Synthetic data; Data visualization; Artificial Intelligence; branch and bound; Computer vision; unfolding; curvilinear component analysis; Curvilinear coordinates; Artificial neural network; business.industry; Vector quantization; Pattern recognition; online algorithm; bearing fault; vector quantization; Pattern recognition (psychology); Principal component analysis; data reduction; Artificial intelligence; business; novelty detection; Software
researchProduct