Search results for "Hierarchical Clustering"
showing 10 items of 56 documents
Parallelized Clustering of Protein Structures on CUDA-Enabled GPUs
2014
Estimation of the pose in which two given molecules might bind together to form a potential complex is a crucial task in structural biology. To solve this so-called "docking problem", most algorithms initially generate large numbers of candidate poses (or decoys) which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates ranges from thousands to millions, performing the clustering on standard CPUs is highly time consuming. In this paper we analyze and evaluate different approaches to parallelize the nearest neighbor chain algorithm to perform hierarchical Ward clustering of protein structures usin…
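For orientation, the nearest neighbor chain algorithm named in this abstract can be sketched sequentially as below. This is a plain CPU illustration under stated assumptions, not the GPU parallelization the paper evaluates: clusters are summarized by size and centroid, inputs are generic numeric vectors rather than protein structures, and the function name `ward_nn_chain` is hypothetical.

```python
def ward_nn_chain(points):
    """Sequential nearest-neighbor-chain sketch for Ward clustering.

    Returns a list of merges (id_a, id_b, ward_cost, new_size).
    Merges are emitted in chain order, not sorted by cost.
    """
    # Active clusters: id -> (size, centroid). Assumed representation.
    clusters = {i: (1, tuple(float(x) for x in p)) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []
    chain = []

    def ward(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        (na, ca), (nb, cb) = clusters[a], clusters[b]
        sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return na * nb / (na + nb) * sq

    while len(clusters) > 1:
        if not chain:
            chain.append(next(iter(clusters)))
        top = chain[-1]
        prev = chain[-2] if len(chain) > 1 else None

        # Find the nearest neighbor of the chain top; on ties prefer the
        # previous chain element so the chain cannot cycle.
        best, best_d = None, float("inf")
        for c in clusters:
            if c == top:
                continue
            d = ward(top, c)
            if d < best_d or (d == best_d and c == prev):
                best, best_d = c, d

        if best == prev:
            # Reciprocal nearest neighbors: merge them.
            chain.pop()
            chain.pop()
            na, ca = clusters.pop(top)
            nb, cb = clusters.pop(prev)
            n = na + nb
            centroid = tuple((na * x + nb * y) / n for x, y in zip(ca, cb))
            clusters[next_id] = (n, centroid)
            merges.append((prev, top, best_d, n))
            next_id += 1
        else:
            chain.append(best)
    return merges


merges = ward_nn_chain([(0,), (1,), (10,), (11,)])
```

Because the chain finds reciprocal nearest neighbors rather than the globally closest pair, the merges would normally be sorted by cost afterwards to obtain a conventional dendrogram.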
Cellular automata and urban development simulation : a transition rules creation process based on statistical analysis
2015
The study of land use evolution has become a major stake in urban planning. The main focus is to understand how land use evolves over time and which processes take place. This understanding would allow urban developments to be planned on the basis of knowledge that is as complete as possible and covers as many fields as possible (e.g., urban planning, politics, sociology). Simulation tools can be used to merge and display the points of view and stakes of different stakeholders (Parrott & Meyer, 2012).
Structural clustering of millions of molecular graphs
2014
We propose an algorithm for clustering very large molecular graph databases according to scaffolds (i.e., large structural overlaps) that are common between cluster members. Our approach first partitions the original dataset into several smaller datasets using a greedy clustering approach named APreClus based on dynamic seed clustering. APreClus is an online and instance incremental clustering algorithm delaying the final cluster assignment of an instance until one of the so-called pending clusters the instance belongs to has reached significant size and is converted to a fixed cluster. Once a cluster is fixed, APreClus recalculates the cluster centers, which are used as representatives for…
A Greedy Algorithm for Hierarchical Complete Linkage Clustering
2014
We are interested in the greedy method to compute a hierarchical complete linkage clustering. There are two known methods for this problem, one having a running time of \({\mathcal O}(n^3)\) with a space requirement of \({\mathcal O}(n)\) and one having a running time of \({\mathcal O}(n^2 \log n)\) with a space requirement of \(\Theta(n^2)\), where \(n\) is the number of points to be clustered. Neither method is capable of handling large point sets. In this paper, we give an algorithm with a space requirement of \({\mathcal O}(n)\) which is able to cluster one million points in a day on current commodity hardware.
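As a point of reference, the greedy complete linkage procedure itself can be sketched naively as follows. This is a baseline illustration only, not the algorithm proposed in the paper: it recomputes linkage distances on the fly, so it needs little extra memory but roughly cubic time. The function name `complete_linkage` and the stopping parameter `k` are assumptions made for the example.

```python
import math


def complete_linkage(points, k):
    """Naive greedy agglomerative clustering with complete linkage.

    Repeatedly merges the two clusters whose farthest pair of points is
    closest, until k clusters remain. Returns (clusters, merges), where
    clusters hold point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    merges = []
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((list(clusters[x]), list(clusters[y]), d))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters, merges


points = [(0, 0), (0, 1), (10, 0), (10, 1), (4, 5)]
clusters, merges = complete_linkage(points, 2)
```

Recomputing every linkage in each round is what makes this version impractical for large point sets, which is the limitation the abstract refers to.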
V11. Functional hierarchy within an overall network for visual motion processing and ocular-motor control at rest
2015
Introduction: Visual motion processing on one hand and ocular motor functions on the other are rarely studied together in vivo in humans. The interrelation of these functional networks is rather unclear, even though their functional dependence seems obvious. In several fMRI studies the essential nodes of both networks could be localized using voluntary optokinetic ('look') nystagmus (OKN) in the horizontal plane incorporating visual motion tracking (Dieterich et al., 2009). Here, functional connectivity (FC) between these nodes representing both networks was studied using resting-state FC. Methods: Resting-state fMRI data of 200 healthy adults (age 44.1±17.9; 79 male) were included in the cro…
Cluster-based RF fingerprint positioning using LTE and WLAN signal strengths
2017
Wireless Local Area Network (WLAN) positioning has become a popular localization system due to its low-cost installation and the widespread availability of WLAN access points. Traditional grid-based radio frequency (RF) fingerprinting (GRFF) suffers from two drawbacks. First, it requires a costly and inefficient data collection and updating procedure; second, the method goes through time-consuming data pre-processing before it outputs the user position. This paper proposes Cluster-based RF Fingerprinting (CRFF) to overcome these limitations by using modified Minimization of Drive Tests data which can be autonomously collected by cellular operators from their subscribers. The effect of environmental…
Efficient and Accurate OTU Clustering with GPU-Based Sequence Alignment and Dynamic Dendrogram Cutting
2015
De novo clustering is a popular technique to perform taxonomic profiling of a microbial community by grouping 16S rRNA amplicon reads into operational taxonomic units (OTUs). In this work, we introduce a new dendrogram-based OTU clustering pipeline called CRiSPy. The key idea used in CRiSPy to improve clustering accuracy is the application of an anomaly detection technique to obtain a dynamic distance cutoff instead of using the de facto value of 97 percent sequence similarity as in most existing OTU clustering pipelines. This technique works by detecting an abrupt change in the merging heights of a dendrogram. To produce the output dendrograms, CRiSPy employs the OTU hierarchical clusterin…
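The idea of deriving a cutoff from an abrupt change in dendrogram merging heights can be illustrated with a deliberately simple stand-in: pick the largest gap between consecutive sorted heights and cut in the middle of it. This is not CRiSPy's anomaly detection technique, whose details are not given in the abstract; the function name `dynamic_cutoff` and the sample heights are hypothetical.

```python
def dynamic_cutoff(merge_heights):
    """Return a cut height placed inside the largest jump in merge heights.

    Simplified largest-gap heuristic, used here only as an illustration
    of cutting a dendrogram where merging heights change abruptly.
    """
    heights = sorted(merge_heights)
    if len(heights) < 2:
        raise ValueError("need at least two merge heights")
    # Gap between each pair of consecutive merge heights.
    gaps = [b - a for a, b in zip(heights, heights[1:])]
    # Index of the largest gap (first one wins on ties).
    i = max(range(len(gaps)), key=gaps.__getitem__)
    # Cut halfway between the two heights that bound the jump.
    return (heights[i] + heights[i + 1]) / 2


# Hypothetical merge heights: tight merges, then a clear jump.
cut = dynamic_cutoff([0.01, 0.02, 0.025, 0.03, 0.2, 0.25])
```

With these sample heights the cut falls between 0.03 and 0.2, so all merges below the jump are kept as within-cluster merges.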
Sectors on sectors (SonS): A new hierarchical clustering visualization tool
2011
Clustering techniques have been widely applied to extract information from high-dimensional data structures in the last few years. Graphs are especially relevant for clustering, but many graphs associated with hierarchical clustering do not give any information about the values of the centroids' attributes and the relationships among them. In this paper, we propose a new visualization approach for hierarchical cluster analysis in which the above-mentioned information is available. The method is based on pie charts. The pie charts are divided into several pie segments or sectors corresponding to each cluster. The radius of each pie segment is proportional to the number of patterns included i…
A methodology to assess the intrinsic discriminative ability of a distance function and its interplay with clustering algorithms for microarray data …
2013
Background: Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from statistics to computer science. Following Handl et al., it can be summarized as a three step process: (1) choice of a distance function; (2) choice of a clustering algorithm; (3) choice of a validation method. Although such a purist approach to clustering is hardly seen in many areas of science, genomic data require that level of attention, if inferences made from cluster analysis have to be of some relevance to biomedical research. Results: A procedure is proposed for the assessment of the discriminative ability of a distance functi…
A New Approach to Investigate Students’ Behavior by Using Cluster Analysis as an Unsupervised Methodology in the Field of Education
2016
The problem of taking a set of data and separating it into subgroups where the elements of each subgroup are more similar to each other than they are to elements not in the subgroup has been extensively studied through the statistical method of cluster analysis. In this paper we want to discuss the application of this method to the field of education: particularly, we want to present the use of cluster analysis to separate students into groups that can be recognized and characterized by common traits in their answers to a questionnaire, without any prior knowledge of what form those groups would take (unsupervised classification). We start from a detailed study of the data processing need…