Search results for "Data mining"
showing 10 items of 907 documents
Making nonlinear manifold learning models interpretable: The manifold grand tour
2015
Smooth nonlinear topographic maps of the data distribution to guide a Grand Tour visualisation. Prioritisation of data linear views that are most consistent with data structure in the maps. Useful visualisations that cannot be obtained by other more classical approaches. Dimensionality reduction is required to produce visualisations of high dimensional data. In this framework, one of the most straightforward approaches to visualising high dimensional data is based on reducing complexity and applying linear projections while tumbling the projection axes in a defined sequence which generates a Grand Tour of the data. We propose using smooth nonlinear topographic maps of the data distribution to…
The Three Steps of Clustering In The Post-Genomic Era
2013
This chapter describes the basic algorithmic components that are involved in clustering, with particular attention to classification of microarray data.
A Feature Set Decomposition Method for the Construction of Multi-classifier Systems Trained with High-Dimensional Data
2013
Data mining for the discovery of novel, useful patterns encounters obstacles when dealing with high-dimensional datasets, which have been documented as the "curse" of dimensionality. A strategy to deal with this issue is the decomposition of the input feature set to build a multi-classifier system. Standalone decomposition methods are rare and generally based on random selection. We propose a decomposition method which uses information theory tools to arrange input features into uncorrelated and relevant subsets. Experimental results show how this approach significantly outperforms three baseline decomposition methods in terms of classification accuracy.
Incrementally Assessing Cluster Tendencies with a Maximum Variance Cluster Algorithm
2003
A straightforward and efficient way to discover clustering tendencies in data using a recently proposed Maximum Variance Clustering algorithm is proposed. The approach shares the benefits of the plain clustering algorithm with regard to other approaches for clustering. Experiments using both synthetic and real data have been performed in order to evaluate the differences between the proposed methodology and the plain use of the Maximum Variance algorithm. According to the results obtained, the proposal constitutes an efficient and accurate alternative.
Bayesian versus data driven model selection for microarray data
2014
Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is a particular instance of the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In what follows, for ease of reference, we refer to that instance still as model selection. It is an important part of any statistical analysis. The techniques used for solving it are mainly either Bayesian or data-driven, and are both based on internal knowledge. That is, they use information obtained by processing the input data. A…
Neural networks with non-uniform embedding and explicit validation phase to assess Granger causality
2015
A challenging problem when studying a dynamical system is to find the interdependencies among its individual components. Several algorithms have been proposed to detect directed dynamical influences between time series. Two of the most used approaches are a model-free one (transfer entropy) and a model-based one (Granger causality). Several pitfalls are related to the presence or absence of assumptions in modeling the relevant features of the data. We tried to overcome those pitfalls using a neural network approach in which a model is built without any a priori assumptions. In this sense this method can be seen as a bridge between model-free and model-based approaches. The experiments perfo…
Panel Summary: Knowledge Model Representations
1997
Following the usual classifications of cognitive psychologists, we can say that the problem of representation spans three domains: the environment, the brain, and cognitive processes, which are usually studied by different scientists: the physicists, the neurobiologists and the psychologists. With the development of computer science and artificial intelligence new approaches have been introduced, which make possible simulation and implementation of cognitive processes through neural networks and symbolic systems. But the contribution of new methods is not limited to simulation, because they try to provide new models which consider cognitive process as information processing, not as reaction…
A framework to identify primitives that represent usability within Model-Driven Development methods
2014
Context: Nowadays, there are sound methods and tools which implement the Model-Driven Development approach (MDD) satisfactorily. However, MDD approaches focus on representing and generating code that represents functionality, behaviour and persistence, putting the interaction, and more specifically the usability, in second place. If we aim to include usability features in a system developed with an MDD tool, we need to manually extend the generated code. Objective: This paper tackles how to include functional usability features (usability recommendations strongly related to system functionality) in MDD through conceptual primitives. Method: The approach consists of studying usability guide…
A Proposal for Modelling Usability in a Holistic MDD Method
2014
Holistic methods for Model-Driven Development (MDD) aim to model all the system features in a conceptual model. This conceptual model is the input for a model compiler that can generate software systems by means of automatic transformations. However, in general, MDD methods focus on modelling the structure and functionality of systems, relegating the interaction and usability features to manual implementations at the last steps of the software development process. Some usability features are strongly related to the functionality of the system and their inclusion is not so easy. In order to facilitate the inclusion of functional usability features from the first steps of the development proc…
Compaction of Open-Graded HMAs Evaluated by a Fuzzy Clustering Technique
2015
The aim of this paper is the proposal of an expeditious procedure to be used during the execution of an asphalt layer for improving the compaction task. This procedure, based on a fuzzy clustering technique, starts from the knowledge of some information recorded by ordinary measuring instruments and provides an aid to the decision-maker on the number of roller passes needed to achieve a specific density at a certain temperature. This result can be deduced with great rapidity during the paving operations on site without waiting for the time spent in the core extraction and in the subsequent laboratory analysis. In this way it is possible to identify more precisely which aspects of the execut…