Search results for "Clustering"
showing 10 items of 446 documents
Soft Topographic Map for Clustering and Classification of Bacteria
2007
In this work a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA…
Cruise passengers' trajectories at destination. A Dynamic Time Warping approach.
2015
The present work proposes an analysis of cruise passengers’ trajectories at the destination using the Dynamic Time Warping algorithm. Data collected through GPS devices on cruise passengers’ behavior in the port of Palermo are analyzed in order to show similarities and differences among their spatial trajectories at the destination. A cluster analysis is performed in order to identify cruise passengers’ segments based on trajectories’ similarity. Results are of interest both from a methodological perspective related to the analysis of GPS data and for the management and planning of cruise tourism destinations.
Screen media and non-screen media habits among preschool children in Singapore, South Korea, Japan, and Finland: Insights from an unsupervised cluste…
2021
The main purpose of the research was to describe the daily screen media habits and non-screen media habits like indoor and outdoor play, and sleep of preschool children aged 2 to 6 years from Singapore, South Korea, Japan, and Finland using a content-validated online questionnaire (SMALLQ®) and unsupervised cluster analysis. Unsupervised cluster analysis on 5809 parent-reported weekday and weekend screen and non-screen media habits of preschool children from the four countries resulted in seven emergent clusters. Cluster 2 (n = 1288) or the Early-screen media, screen media-lite and moderate-to-vigorous physical activity-lite family made up 22.2% and Cluster 1 (n = 261) or the High-all-ro…
An efficient prototype merging strategy for the condensed 1-NN rule through class-conditional hierarchical clustering
2002
A generalized prototype-based classification scheme founded on hierarchical clustering is proposed. The basic idea is to obtain a condensed 1-NN classification rule by merging the two same-class nearest clusters, provided that the set of cluster representatives correctly classifies all the original points. Apart from the quality of the obtained sets and its flexibility which comes from the fact that different intercluster measures and criteria can be used, the proposed scheme includes a very efficient four-stage procedure which conveniently exploits geometric cluster properties to decide about each possible merge. Empirical results demonstrate the merits of the proposed algorithm t…
Identification of clusters of investors from their real trading activity in a financial market
2012
We use statistically validated networks, a recently introduced method to validate links in a bipartite system, to identify clusters of investors trading in a financial market. Specifically, we investigate a special database that allows us to track the trading activity of individual investors in the Nokia stock. We find that many statistically detected clusters of investors show a very high degree of synchronization in the time when they decide to trade and in the trading action taken. We investigate the composition of these clusters and we find that several of them show an over-expression of specific categories of investors.
Correlations among Game of Thieves and other centrality measures in complex networks
2021
Social Network Analysis (SNA) is used to study the exchange of resources among individuals, groups, or organizations. The role of individuals or connections in a network is described by a set of centrality metrics which represent one of the most important results of SNA. Degree, closeness, betweenness and clustering coefficient are the most widely used centrality measures. Their use is, however, severely hampered by their computational cost. This issue can be overcome by an algorithm called Game of Thieves (GoT). Thanks to this new algorithm, we can compute the importance of all elements in a network (i.e. vertices and edges), compared to the total number of vertices. This calculation is done not in…
Do Country Stereotypes Exist in PISA? A Clustering Approach for Large, Sparse, and Weighted Data
2015
Certain stereotypes can be associated with people from different countries. For example, the Italians are expected to be emotional, the Germans functional, and the Chinese hard-working. In this study, we cluster all 15-year-old students representing the 68 different nations and territories that participated in the latest Programme for International Student Assessment (PISA 2012). The hypothesis is that the students will start to form their own country groups when clustered according to the scale indices that summarize many of the students’ characteristics. In order to meet PISA data analysis requirements, we use a novel combination of our previously published algorithmic components to reali…
Analysing Student Performance using Sparse Data of Core Bachelor Courses
2015
Curricula for Computer Science (CS) degrees are characterized by the strong occupational orientation of the discipline. In the BSc degree structure, with clearly separate CS core studies, the learning skills for these and other required courses may vary a lot, which is shown in students' overall performance. To analyze this situation, we apply nonstandard educational data mining techniques on a preprocessed log file of the passed courses. The joint variation in the course grades is studied through correlation analysis while intrinsic groups of students are created and analyzed using a robust clustering technique. Since not all students attended all courses, there is a nonstructured sparsity…
Spatio-temporal classification in point patterns under the presence of clutter
Quantum Machine Learning: A tutorial
2021
This tutorial provides an overview of Quantum Machine Learning (QML), a relatively novel discipline that brings together concepts from Machine Learning (ML), Quantum Computing (QC) and Quantum Information (QI). The great development experienced by QC, partly due to the involvement of large technology companies, as well as the popularity and success of ML have been responsible for making QML one of the main streams for researchers working on fuzzy borders between Physics, Mathematics and Computer Science. A possible, although arguably coarse, classification of QML methods may be based on those approaches that make use of ML in a quantum experimentation environment and those others that take…