Search results for "cluster analysis."
showing 10 items of 805 documents
A knowledge-based decision support system in bioinformatics: An application to protein complex extraction
2013
Abstract Background We introduce a Knowledge-based Decision Support System (KDSS) in order to face the Protein Complex Extraction issue. Using a Knowledge Base (KB) coding the expertise about the proposed scenario, our KDSS is able to suggest both strategies and tools, according to the features of input dataset. Our system provides a navigable workflow for the current experiment and furthermore it offers support in the configuration and running of every processing component of that workflow. This last feature makes our system a crossover between classical DSS and Workflow Management Systems. Results We briefly present the KDSS' architecture and basic concepts used in the design of the knowl…
Euro Area Structural Convergence? A Multi-Criterion Cluster Analysis
2015
Abstract This paper proposes a classification of the old member countries of the euro area in a structural data rich environment and runs a convergence analysis using the same framework. First, we use a clustering approach and identify two structurally distinct clusters of countries that are not modified between 1999 and 2012: the South Countries Group (SCG) – composed of Greece, Italy, Portugal and Spain – and the Other Countries Group (OCG). Second, we propose a convergence metric and reach three key findings: (i) increase over time of the between-clusters' dispersion; (ii) diverging demographics and innovation performance into the OCG, and (iii) an unfortunate convergence towards high la…
Multispectral imaging and its use for face recognition : sensory data enhancement
2015
In this thesis, we focus on multispectral image for face recognition. With such application, the quality of the image is an important factor that affects the accuracy of the recognition. However, the sensory data are in general corrupted by noise. Thus, we propose several denoising algorithms that are able to ensure a good tradeoff between noise removal and details preservation. Furthermore, characterizing regions and details of the face can improve recognition. We focus also in this thesis on multispectral image segmentation particularly clustering techniques and cluster analysis. The effectiveness of the proposed algorithms is illustrated by comparing them with state-of-the-art methods using both…
Space-Time FPCA Clustering of Multidimensional Curves.
2018
In this paper we focus on finding clusters of multidimensional curves with spatio-temporal structure, applying a variant of a k-means algorithm based on the principal component rotation of data. The main advantage of this approach is to combine the clustering functional analysis of the multidimensional data, with smoothing methods based on generalized additive models, that cope with both the spatial and the temporal variability, and with functional principal components that takes into account the dependency between the curves.
An Analysis of Earthquakes Clustering Based on a Second-Order Diagnostic Approach
2009
A diagnostic method for space–time point process is here introduced and applied to seismic data of a fixed area of Japan. Nonparametric methods are used to estimate the intensity function of a particular space–time point process and on the basis of the proposed diagnostic method, second-order features of data are analyzed: this approach seems to be useful to interpret space–time variations of the observed seismic activity and to focus on its clustering features.
Atom- and Bond-Based 2D TOMOCOMD-CARDD Approach and Ligand-Based Virtual Screening for the Drug Discovery of New Tyrosinase Inhibitors
2008
Two-dimensional atom- and bond-based TOMOCOMD-CARDD descriptors and linear discriminant analysis (LDA) are used in this report to perform a quantitative structure-activity relationship (QSAR) study of tyrosinase-inhibitory activity. A database of inhibitors of the enzyme is collected for this study, within 246 highly dissimilar molecules presenting antityrosinase activity. In total, 7 discriminant functions are obtained by using the whole set of atom- and bond-based 2D indices. All the LDA-based QSAR models show accuracies above 90% in the training set and values of the Matthews correlation coefficient (C) varying from 0.85 to 0.90. The external validation set shows globally good classifica…
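As a reading aid for the Matthews correlation coefficient mentioned in this abstract, here is a minimal sketch of the standard formula computed from a binary confusion matrix. The function name and the counts in the usage line are illustrative assumptions, not figures from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; by convention 0.0 is returned when any
    marginal total is zero and the coefficient is otherwise undefined.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, chosen only to illustrate the call.
example_mcc = matthews_corrcoef(tp=45, tn=48, fp=2, fn=5)
```

Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy for classifiers such as the LDA models described here.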
Application of clustering techniques to electron-diffraction data: determination of unit-cell parameters.
2012
A new approach to determining the unit-cell vectors from single-crystal diffraction data based on clustering analysis is proposed. The method uses the density-based clustering algorithm DBSCAN. Unit-cell determination through the clustering procedure is particularly useful for limited tilt sequences and noisy data, and therefore is optimal for single-crystal electron-diffraction automated diffraction tomography (ADT) data. The unit-cell determination of various materials from ADT data as well as single-crystal X-ray data is demonstrated.
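The abstract names DBSCAN as its clustering algorithm. The sketch below is a generic, brute-force DBSCAN on toy 2D points, meant only to show how density-based clustering separates dense groups from isolated noise. It is not the paper's unit-cell pipeline, and the parameter names `eps` and `min_pts` and the sample points are assumptions made for illustration.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise.

    Brute-force neighbour search, O(n^2); fine for small illustrative inputs.
    """
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels


# Toy data (hypothetical): two tight groups and one isolated point.
toy_points = [(0, 0), (0, 0.1), (0.1, 0), (5, 5), (5, 5.1), (5.1, 5), (10, 10)]
toy_labels = dbscan(toy_points, eps=0.3, min_pts=3)
```

Because noise points are simply left unassigned rather than forced into a cluster, this family of methods suits noisy measurements, which matches the abstract's remark about noisy data.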
An Heuristic Approach for the Training Dataset Selection in Fingerprint Classification Tasks
2015
Fingerprint classification is a key issue in automatic fingerprint identification systems. It aims to reduce the item search time within the fingerprint database without affecting the accuracy rate. In this paper an heuristic approach using only the directional image information for the training dataset selection in fingerprint classification tasks is described. The method combines a Fuzzy C-Means clustering method and a Naive Bayes Classifier and it is composed of three modules: the first module builds the working datasets, the second module extracts the training images dataset and, finally, the third module classifies fingerprint images in four classes. Unlike literature approaches using …
Balancing and clustering of words in the Burrows–Wheeler transform
2011
Abstract Compression algorithms based on Burrows–Wheeler transform (BWT) take advantage of the fact that the word output of BWT shows a local similarity and then turns out to be highly compressible. The aim of the present paper is to study such “clustering effect” by using notions and methods from Combinatorics on Words. The notion of balance of a word plays a central role in our investigation. Empirical observations suggest that balance is actually the combinatorial property of input word that ensures optimal BWT compression. Moreover, it is reasonable to assume that the more balanced the input word is, the more local similarity we have after BWT (and therefore the better the compression is).…
Burrows-Wheeler transform and Run-Length Enconding
2017
In this paper we study the clustering effect of the Burrows-Wheeler Transform (BWT) from a combinatorial viewpoint. In particular, given a word w we define the BWT-clustering ratio of w as the ratio between the number of clusters produced by BWT and the number of the clusters of w. The number of clusters of a word is measured by its Run-Length Encoding. We show that the BWT-clustering ratio ranges in ]0, 2]. Moreover, given a rational number \(r\,\in \,]0,2]\), it is possible to find infinitely many words having BWT-clustering ratio equal to r. Finally, we show how the words can be classified according to their BWT-clustering ratio. The behavior of such a parameter is studied for very well-…
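The ratio defined in this abstract can be sketched in a few lines. The code below assumes two conventions the truncated abstract does not spell out: the BWT is taken as the last column of the sorted cyclic rotations (no end-of-string marker), and the number of clusters of a word is the number of maximal runs of equal letters in its run-length encoding. All function names are illustrative.

```python
from fractions import Fraction


def bwt(word):
    """Last column of the lexicographically sorted cyclic rotations of `word`."""
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    return "".join(rotation[-1] for rotation in rotations)


def run_count(word):
    """Number of maximal runs of equal letters (length of the run-length encoding)."""
    return sum(1 for i, c in enumerate(word) if i == 0 or c != word[i - 1])


def bwt_clustering_ratio(word):
    """Runs in bwt(word) divided by runs in word, as an exact fraction."""
    if not word:
        raise ValueError("the ratio is only defined for non-empty words")
    return Fraction(run_count(bwt(word)), run_count(word))


# "banana" has 6 runs; its transform "nnbaaa" has 3, so the ratio is 1/2.
banana_ratio = bwt_clustering_ratio("banana")
```

A ratio below 1 means the transform produced fewer runs than the input, which is the case that favours run-length compression.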