Search results for "cluster analysis"
showing 10 items of 848 documents
Efficient and Accurate OTU Clustering with GPU-Based Sequence Alignment and Dynamic Dendrogram Cutting.
2015
De novo clustering is a popular technique to perform taxonomic profiling of a microbial community by grouping 16S rRNA amplicon reads into operational taxonomic units (OTUs). In this work, we introduce a new dendrogram-based OTU clustering pipeline called CRiSPy. The key idea used in CRiSPy to improve clustering accuracy is the application of an anomaly detection technique to obtain a dynamic distance cutoff instead of using the de facto value of 97 percent sequence similarity as in most existing OTU clustering pipelines. This technique works by detecting an abrupt change in the merging heights of a dendrogram. To produce the output dendrograms, CRiSPy employs the OTU hierarchical clusterin…
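The idea of replacing a fixed cutoff with one derived from an abrupt change in merging heights can be sketched roughly as follows. This is not CRiSPy's actual anomaly detection technique, which the abstract does not specify. It is a minimal stand-in that assumes the "abrupt change" is the largest gap between consecutive merge heights; the function name `dynamic_cutoff` and the sample heights are hypothetical.

```python
def dynamic_cutoff(merge_heights):
    """Return a cut height placed inside the largest gap between merge heights.

    Assumption: the largest jump between consecutive merge heights marks the
    boundary between within-cluster and between-cluster merges.
    """
    heights = sorted(merge_heights)
    if len(heights) < 2:
        raise ValueError("need at least two merge heights")
    # Gap between each merge and the next one up the dendrogram.
    gaps = [upper - lower for lower, upper in zip(heights, heights[1:])]
    # Index of the widest gap; the cut goes at its midpoint.
    widest = max(range(len(gaps)), key=gaps.__getitem__)
    return (heights[widest] + heights[widest + 1]) / 2


# Hypothetical merge heights (distances) read off a dendrogram.
example_heights = [0.01, 0.02, 0.03, 0.04, 0.20, 0.25]
example_cutoff = dynamic_cutoff(example_heights)
merges_kept = sum(h < example_cutoff for h in example_heights)
```

Merges below the returned height are kept and those above it are discarded, so the cutoff adapts to each dendrogram instead of being fixed at a 3 percent distance.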
Projection Clustering Unfolding: A New Algorithm for Clustering Individuals or Items in a Preference Matrix
2020
In the framework of preference rankings, the interest can lie in clustering individuals or items in order to reduce the complexity of the preference space for an easier interpretation of collected data. Recent years have seen a remarkable flowering of works about the use of decision trees for clustering preference vectors. As a matter of fact, decision trees are useful and intuitive, but they are very unstable: small perturbations bring big changes. This is the reason why it could be necessary to use more stable procedures in order to cluster ranking data. In this work, a Projection Clustering Unfolding (PCU) algorithm for preference data is proposed in order to extract useful info…
Exudates as Landmarks Identified through FCM Clustering in Retinal Images
2020
The aim of this work was to develop a method for the automatic identification of exudates, using an unsupervised clustering approach. The ability to classify each pixel as belonging to an eventual exudate, as a warning of disease, allows for the tracking of a patient’s…
A Clustering Approach for Improving Network Performance in Heterogeneous Systems
2000
A lot of research has focused on solving the problem of computation-aware task scheduling on heterogeneous systems. In this paper, we propose a clustering algorithm that, given a network topology, provides a network partition adapted to the communication requirements of the applications running on the machine. Also, we propose a criterion to measure the quality of each one of the possible mappings of processes to processors based on that network partition. Evaluation results show that these proposals can greatly improve network performance, providing a basis of a communication-aware scheduling technique.
Least-squares community extraction in feature-rich networks using similarity data
2021
We explore a doubly-greedy approach to the issue of community detection in feature-rich networks. According to this approach, both the network and feature data are straightforwardly recovered from the underlying unknown non-overlapping communities, supplied with a center in the feature space and intensity weight(s) over the network each. Our least-squares additive criterion allows us to search for communities one-by-one and to find each community by adding entities one by one. A focus of this paper is that the feature-space data part is converted into a similarity matrix format. The similarity/link values can be used in either of two modes: (a) as measured in the same scale so that one may …
MetNet: A two-level approach to reconstructing and comparing metabolic networks
2021
Metabolic pathway comparison and interaction between different species can detect important information for drug engineering and medical science. In the literature, proposals for reconstructing and comparing metabolic networks present two main problems: network reconstruction usually requires human intervention to integrate information from different sources and, in metabolic comparison, the size of the networks leads to a challenging computational problem. We propose to automatically reconstruct a metabolic network on the basis of KEGG database information. Our proposal relies on a two-level representation of the huge metabolic network: the first level is graph-based and depicts pathways a…
Detection, tracking and event localization of jet stream features in 4-D atmospheric data
2012
We introduce a novel algorithm for the efficient detection and tracking of features in spatiotemporal atmospheric data, as well as for the precise localization of the occurring genesis, lysis, merging and splitting events. The algorithm works on data given on a four-dimensional structured grid. Feature selection and clustering are based on adjustable local and global criteria; feature tracking is predominantly based on spatial overlaps of the features' full volumes. The resulting 3-D features and the identified correspondences between features of consecutive time steps are represented as the nodes and edges of a directed acyclic graph, the event graph. Merging and splitting events appear in…
Nonnegative Tensor Train Decompositions for Multi-domain Feature Extraction and Clustering
2016
Tensor train (TT) is one of the modern tensor decomposition models for low-rank approximation of high-order tensors. For nonnegative multiway array data analysis, we propose a nonnegative TT (NTT) decomposition algorithm for the NTT model and a hybrid model called the NTT-Tucker model. By employing the hierarchical alternating least squares approach, each fiber vector of core tensors is optimized efficiently at each iteration. We compared the performances of the proposed method with a standard nonnegative Tucker decomposition (NTD) algorithm by using benchmark data sets including event-related potential data and facial image data in multi-domain feature extraction and clustering tasks. It i…
Semi-automatic Brain Lesion Segmentation in Gamma Knife Treatments Using an Unsupervised Fuzzy C-Means Clustering Technique
2016
MR imaging is being increasingly used in radiation treatment planning as well as for staging and assessing tumor response. Leksell Gamma Knife® is a device for stereotactic neuro-radiosurgery used to treat lesions that are inaccessible to, or insufficiently treated by, traditional surgery or radiotherapy. The target to be treated with radiation beams is currently contoured through slice-by-slice manual segmentation on MR images. This procedure is time-consuming and operator-dependent. Repeatability of the segmentation result may be ensured only by using automatic/semi-automatic methods with the clinicians supporting the planning phase. In this paper a semi-automatic segmentation method, based on an unsuperv…
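For readers unfamiliar with the fuzzy C-means technique named in the title, the standard update rules can be sketched on one-dimensional intensities as below. This is a generic textbook formulation, not the segmentation method of the paper, whose details are cut off in the abstract. The function names, the fuzzifier `m = 2`, the fixed iteration count, and the sample values are all assumptions made for illustration.

```python
def fcm_memberships(values, centers, m=2.0, eps=1e-12):
    """Membership of each value in each cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        # eps avoids division by zero when a value coincides with a center.
        dists = [abs(x - c) + eps for c in centers]
        rows.append(
            [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
        )
    return rows


def fuzzy_c_means(values, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = fcm_memberships(values, centers, m)
        # Each center is the mean of the values weighted by membership**m.
        centers = [
            sum(row[i] ** m * x for row, x in zip(u, values))
            / sum(row[i] ** m for row in u)
            for i in range(len(centers))
        ]
    return centers, fcm_memberships(values, centers, m)


# Hypothetical normalized pixel intensities: a dark group and a bright group.
intensities = [0.10, 0.12, 0.11, 0.90, 0.92, 0.88]
fcm_centers, fcm_u = fuzzy_c_means(intensities, initial_centers=[0.0, 1.0])
```

Unlike k-means, every value keeps a graded membership in every cluster, which is why the technique suits pixels on an uncertain boundary.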
Bag-of-word based brand recognition using Markov Clustering Algorithm for codebook generation
2015
In order to address the issue of counterfeiting online, it is necessary to use automatic tools that analyze the large amount of information available over the Internet. Analysis methods that extract information about the content of the images are very promising for this purpose. In this paper, a method that automatically extracts the brand of objects in images is proposed. The method does not explicitly search for text or logos. This information is implicitly included in the Bag-of-Words representation. In the Bag-of-Words paradigm, visual features are clustered to create the visual words. Despite its shortcomings, k-means is the most widely used algorithm. With k-mea…