Search results for "Data mining"

Showing 10 of 907 documents

Making nonlinear manifold learning models interpretable: The manifold grand tour

2015

Highlights: Smooth nonlinear topographic maps of the data distribution guide a Grand Tour visualisation. Linear views of the data that are most consistent with the structure in the maps are prioritised. This yields useful visualisations that cannot be obtained by other, more classical approaches.

Dimensionality reduction is required to produce visualisations of high-dimensional data. In this framework, one of the most straightforward approaches to visualising high-dimensional data is based on reducing complexity and applying linear projections while tumbling the projection axes in a defined sequence, which generates a Grand Tour of the data. We propose using smooth nonlinear topographic maps of the data distribution to…
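The tumbling linear projections that make up a Grand Tour can be sketched in a few lines. The following is an illustrative NumPy toy (frames interpolated by blending and re-orthonormalising with QR; it is not the exact torus-walk path of the classical Grand Tour, nor the paper's map-guided prioritisation):

```python
import numpy as np

def random_frame(d, rng):
    """Random orthonormal 2-D projection frame in d dimensions (QR of a Gaussian matrix)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, 2)))
    return q  # shape (d, 2), columns orthonormal

def grand_tour_projections(X, n_steps=5, seed=0):
    """Yield 2-D linear views of X by tumbling between two random frames.

    Blended frames are re-orthonormalised with QR; this is a toy
    interpolation, not a geodesic path on the space of planes.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    a, b = random_frame(d, rng), random_frame(d, rng)
    for t in np.linspace(0.0, 1.0, n_steps):
        f, _ = np.linalg.qr((1 - t) * a + t * b)  # re-orthonormalise the blend
        yield X @ f  # (n_samples, 2) linear view

X = np.random.default_rng(1).standard_normal((100, 6))
views = list(grand_tour_projections(X))
```

Each yielded array is one frame of the tour; plotting them in sequence gives the tumbling animation the abstract describes.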

Keywords: Clustering high-dimensional data; QA75; Nonlinear dimensionality reduction; Discriminative clustering; Computer science; Information visualisation; Data visualization; Projection (mathematics); Artificial Intelligence; Informàtica::Infografia [Àrees temàtiques de la UPC]; Dimensionality reduction; Grand tour; General Engineering; Topographic map; Data structure; Computer Science Applications; Visualization; Manifold learning; Data mining; Generative topographic mapping; Linear projections

The Three Steps of Clustering In The Post-Genomic Era

2013

This chapter describes the basic algorithmic components involved in clustering, with particular attention to the classification of microarray data.

Keywords: Clustering high-dimensional data; Settore INF/01 - Informatica; Correlation clustering; Pattern recognition; Biclustering; CURE data clustering algorithm; Clustering; Classification; Biological Data Mining; Consensus clustering; Artificial intelligence; Data mining; Cluster analysis; Mathematics

A Feature Set Decomposition Method for the Construction of Multi-classifier Systems Trained with High-Dimensional Data

2013

Data mining for the discovery of novel, useful patterns encounters obstacles when dealing with high-dimensional datasets, an issue documented as the "curse of dimensionality". One strategy for dealing with it is to decompose the input feature set in order to build a multi-classifier system. Standalone decomposition methods are rare and generally based on random selection. We propose a decomposition method that uses information-theoretic tools to arrange input features into uncorrelated and relevant subsets. Experimental results show that this approach significantly outperforms three baseline decomposition methods in terms of classification accuracy.
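The decomposition idea can be illustrated with a toy greedy assignment. This sketch uses plain feature correlations rather than the paper's information-theoretic tools, and the function name and strategy are illustrative assumptions, not the authors' algorithm:

```python
import numpy as np

def decompose_features(X, n_subsets=3):
    """Greedy feature-set decomposition (illustrative only): assign each
    feature to the subset where its maximum absolute correlation with the
    members already assigned is lowest, so each subset stays as
    uncorrelated as possible. Ties go to the smaller subset."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    subsets = [[] for _ in range(n_subsets)]
    for j in range(X.shape[1]):
        def max_corr(s):
            return max((corr[j, k] for k in s), default=0.0)
        best = min(range(n_subsets),
                   key=lambda i: (max_corr(subsets[i]), len(subsets[i])))
        subsets[best].append(j)
    return subsets

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
X[:, 1] = X[:, 0] + 0.01 * rng.standard_normal(200)  # features 0 and 1 nearly identical
parts = decompose_features(X, n_subsets=2)
```

With the near-duplicate pair above, the greedy rule places features 0 and 1 in different subsets; each subset would then train its own base classifier of the multi-classifier system.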

Keywords: Clustering high-dimensional data; Computer science; Pattern recognition; Information theory; Uncorrelated; Decomposition method (queueing theory); Data mining; Artificial intelligence; Feature set; Classifier (UML); Curse of dimensionality

Incrementally Assessing Cluster Tendencies with a Maximum Variance Cluster Algorithm

2003

We present a straightforward and efficient way to discover clustering tendencies in data using the recently proposed Maximum Variance Clustering algorithm. The approach shares the benefits of the plain clustering algorithm with respect to other clustering approaches. Experiments on both synthetic and real data were performed to evaluate the differences between the proposed methodology and the plain use of the Maximum Variance algorithm. According to the results obtained, the proposal constitutes an efficient and accurate alternative.

Keywords: Clustering high-dimensional data; k-medoids; Computer science; CURE data clustering algorithm; Single-linkage clustering; Canopy clustering algorithm; Variance (accounting); Data mining; Cluster analysis; k-medians clustering

Bayesian versus data driven model selection for microarray data

2014

Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is a particular instance of the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In what follows, for ease of reference, we refer to that instance still as model selection. It is an important part of any statistical analysis. The techniques used for solving it are mainly either Bayesian or data-driven, and are both based on internal knowledge. That is, they use information obtained by processing the input data. A…
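As a toy illustration of data-driven model selection for the number of clusters, the sketch below scores candidate k values with a spherical-Gaussian BIC computed on top of a small k-means. Both the likelihood approximation and the penalty term are textbook simplifications, not the procedure evaluated in the paper:

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal Lloyd's k-means; empty clusters keep their previous center."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def bic(X, centers, labels):
    """BIC under one shared spherical Gaussian variance per clustering
    (a common approximation; lower is better)."""
    n, d = X.shape
    k = len(centers)
    sse = ((X - centers[labels]) ** 2).sum()
    var = max(sse / max(n - k, 1) / d, 1e-12)
    ll = -0.5 * n * d * np.log(2 * np.pi * var) - 0.5 * sse / var
    n_params = k * d + 1  # cluster means plus the shared variance
    return n_params * np.log(n) - 2 * ll

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
scores = {k: bic(X, *kmeans(X, k)) for k in (1, 2, 3, 4)}
best_k = min(scores, key=scores.get)
```

On two well-separated blobs, the single-cluster fit is heavily penalised by its inflated variance; note that this hard-assignment BIC is known to be more permissive toward extra clusters than a full mixture likelihood.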

Keywords: Clustering; Model selection; Bayesian information criterion; Akaike information criterion; Minimum message length; Bioinformatics; Settore INF/01 - Informatica; Computer science; Bayesian probability; Machine learning; Computer Science Applications; Data-driven; Determining the number of clusters in a data set; Identification (information); Data mining; Artificial intelligence; Cluster analysis

Neural networks with non-uniform embedding and explicit validation phase to assess Granger causality

2015

A challenging problem when studying a dynamical system is to find the interdependencies among its individual components. Several algorithms have been proposed to detect directed dynamical influences between time series. Two of the most used approaches are a model-free one (transfer entropy) and a model-based one (Granger causality). Several pitfalls are related to the presence or absence of assumptions in modeling the relevant features of the data. We tried to overcome those pitfalls using a neural network approach in which a model is built without any a priori assumptions. In this sense this method can be seen as a bridge between model-free and model-based approaches. The experiments perfo…
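The model-based baseline the abstract mentions, linear Granger causality, can be sketched as a comparison of residual sums of squares between an autoregressive model of y with and without lagged x terms. This illustrates the classical linear test only, not the paper's neural-network method; the function name is an assumption for the example:

```python
import numpy as np

def granger_improvement(x, y, lag=2):
    """Return (RSS of y on its own lags, RSS of y on its own lags plus
    lagged x). A large drop suggests x Granger-causes y in the linear sense."""
    n = len(y)
    Y = y[lag:]
    own = np.column_stack([y[lag - k:n - k] for k in range(1, lag + 1)])
    full = np.column_stack([own] + [x[lag - k:n - k] for k in range(1, lag + 1)])
    def rss(A):
        A1 = np.column_stack([np.ones(len(Y)), A])  # intercept column
        beta, *_ = np.linalg.lstsq(A1, Y, rcond=None)
        r = Y - A1 @ beta
        return float(r @ r)
    return rss(own), rss(full)

# Synthetic pair where x drives y with one step of delay.
rng = np.random.default_rng(0)
n = 500
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.standard_normal()
rss_restricted, rss_full = granger_improvement(x, y)
```

In practice the two RSS values feed an F-test; transfer entropy, the model-free counterpart discussed in the abstract, replaces the linear model with conditional entropies.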

Keywords: Cognitive Neuroscience; Entropy; FOS: Physical sciences; Overfitting; Machine learning; Granger causality; Artificial Intelligence; Medicine and Health Sciences; Entropy (information theory); Non-uniform embedding; Computer Simulation; Mathematics; Artificial neural network; Probability and statistics; Models, Theoretical; Neural Networks (Computer); Classification; Neural network; Algorithm; Causality; Settore ING-INF/06 - Bioingegneria Elettronica E Informatica; Embedding; A priori and a posteriori; Transfer entropy; Artificial intelligence; Data mining; Algorithms; Neural networks; Physics - Data Analysis, Statistics and Probability (physics.data-an)

Panel Summary: Knowledge Model Representations

1997

Following the usual classifications of cognitive psychologists, we can say that the problem of representation spans three domains: the environment, the brain, and cognitive processes, which are usually studied by different scientists: the physicists, the neurobiologists and the psychologists. With the development of computer science and artificial intelligence new approaches have been introduced, which make possible simulation and implementation of cognitive processes through neural networks and symbolic systems. But the contribution of new methods is not limited to simulation, because they try to provide new models which consider cognitive process as information processing, not as reaction…

Keywords: Cognitive science; Artificial neural network; Artificial vision; Computer science; Information processing; Representation (systemics); Conceptual space; Cognition; Data mining; Symbolic Systems

A framework to identify primitives that represent usability within Model-Driven Development methods

2014

Context: Nowadays, there are sound methods and tools which implement the Model-Driven Development (MDD) approach satisfactorily. However, MDD approaches focus on representing and generating code for functionality, behaviour, and persistence, putting interaction, and more specifically usability, in second place. If we aim to include usability features in a system developed with an MDD tool, we need to manually extend the generated code. Objective: This paper tackles how to include functional usability features (usability recommendations strongly related to system functionality) in MDD through conceptual primitives. Method: The approach consists of studying usability guide…

Keywords: Cognitive walkthrough; Pluralistic walkthrough; Computer science; Usability; Usability inspection; BIBLIOTECONOMIA Y DOCUMENTACION; 02 engineering and technology; Human–computer interaction; Software_SOFTWAREENGINEERING; 020204 information systems; Heuristic evaluation; Usability engineering; 0202 electrical engineering, electronic engineering, information engineering; Web usability; Informática; Model-Driven Development; 020207 software engineering; Computer Science Applications; Usability goals; Conceptual model; Data mining; LENGUAJES Y SISTEMAS INFORMATICOS; Software; Information Systems

A Proposal for Modelling Usability in a Holistic MDD Method

2014

Holistic methods for Model-Driven Development (MDD) aim to model all the system features in a conceptual model. This conceptual model is the input for a model compiler that can generate software systems by means of automatic transformations. However, in general, MDD methods focus on modelling the structure and functionality of systems, relegating the interaction and usability features to manual implementations at the last steps of the software development process. Some usability features are strongly related to the functionality of the system and their inclusion is not so easy. In order to facilitate the inclusion of functional usability features from the first steps of the development proc…

Keywords: Cognitive walkthrough; Pluralistic walkthrough; Computer science; Usability; Conceptual model (computer science); Model-driven development; Software development process; Heuristic evaluation; Usability engineering; Conceptual model; Data mining; Software engineering; Component-based usability testing; LENGUAJES Y SISTEMAS INFORMATICOS; Software

Compaction of Open-Graded HMAs Evaluated by a Fuzzy Clustering Technique

2015

The aim of this paper is to propose an expeditious procedure, to be used while laying an asphalt layer, for improving the compaction task. The procedure, based on a fuzzy clustering technique, starts from information recorded by ordinary measuring instruments and helps the decision-maker determine the number of roller passes needed to achieve a specific density at a certain temperature. This result can be obtained very rapidly during paving operations on site, without the delay of core extraction and the subsequent laboratory analysis. In this way it is possible to identify more precisely which aspects of the execut…
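The underlying fuzzy clustering step can be sketched with a standard fuzzy c-means iteration. This is the generic algorithm only; the roller-pass decision support built on top of it is not reproduced here:

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means: alternate between weighted center updates and
    membership updates until n_iter passes. m > 1 controls fuzziness."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # memberships: rows sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None] - centers, axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))       # inverse-distance memberships
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# Two synthetic groups of (density, temperature)-like readings.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.2, (40, 2)), rng.normal(2, 0.2, (40, 2))])
centers, U = fuzzy_c_means(X)
```

Unlike hard k-means, each observation keeps a graded membership in every cluster, which is what lets a procedure like the one described grade "how close" the current compaction state is to the target condition.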

Keywords: Compaction; Density; Fuzzy C-means; Hot mix asphalt; Fuzzy clustering; Computer science; Specific density; Task (project management); Asphalt pavement; Measuring instrument; Settore ICAR/04 - Strade Ferrovie Ed Aeroporti; Data mining; Layer (object-oriented design)