Search results for "High-dimensional data"

showing 4 items of 24 documents

Data Analysis and Bioinformatics

2007

Data analysis methods and techniques are revisited in the case of biological data sets. Particular emphasis is given to clustering and mining issues. Clustering is still a subject of active research in several fields such as statistics, pattern recognition, and machine learning. Data mining adds to clustering the complications of very large data sets with many attributes of different types, a situation typical in biology. Some case studies are also described.

Clustering high-dimensional data; Fuzzy clustering; Computer science; business.industry; Correlation clustering; Conceptual clustering; Machine learning; computer.software_genre; ComputingMethodologies_PATTERNRECOGNITION; CURE data clustering algorithm; Consensus clustering; Canopy clustering algorithm; Data mining; Artificial intelligence; Cluster analysis; business; computer
researchProduct

A Feature Set Decomposition Method for the Construction of Multi-classifier Systems Trained with High-Dimensional Data

2013

Data mining for the discovery of novel, useful patterns encounters obstacles when dealing with high-dimensional datasets; these obstacles have been documented as the "curse" of dimensionality. A strategy to deal with this issue is the decomposition of the input feature set to build a multi-classifier system. Standalone decomposition methods are rare and generally based on random selection. We propose a decomposition method which uses information theory tools to arrange input features into uncorrelated and relevant subsets. Experimental results show that this approach significantly outperforms three baseline decomposition methods in terms of classification accuracy.

Clustering high-dimensional data; business.industry; Computer science; Pattern recognition; Information theory; computer.software_genre; Uncorrelated; Decomposition method (queueing theory); Data mining; Artificial intelligence; business; Feature set; computer; Classifier (UML); Curse of dimensionality
researchProduct

GenClust: A genetic algorithm for clustering gene expression data

2005

Background: Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. Results: GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, …

Clustering high-dimensional data; DNA Complementary; Computer science; Rand index; Correlation clustering; Oligonucleotides; Evolutionary algorithm; lcsh:Computer applications to medicine. Medical informatics; computer.software_genre; Biochemistry; Pattern Recognition Automated; Biclustering; Open Reading Frames; Structural Biology; CURE data clustering algorithm; Consensus clustering; Genetic algorithm; Cluster Analysis; Cluster analysis; lcsh:QH301-705.5; Molecular Biology; Gene expression data Clustering Evolutionary algorithms; Oligonucleotide Array Sequence Analysis; Models Statistical; Brown clustering; Heuristic; Gene Expression Profiling; Applied Mathematics; Computational Biology; Computer Science Applications; lcsh:Biology (General); Gene Expression Regulation; Mutation; lcsh:R858-859.7; Data mining; Sequence Alignment; computer; Software; Algorithms; BMC Bioinformatics
researchProduct

Making nonlinear manifold learning models interpretable: The manifold grand tour

2015

Smooth nonlinear topographic maps of the data distribution to guide a Grand Tour visualisation. Prioritisation of data linear views that are most consistent with data structure in the maps. Useful visualisations that cannot be obtained by other more classical approaches. Dimensionality reduction is required to produce visualisations of high dimensional data. In this framework, one of the most straightforward approaches to visualising high dimensional data is based on reducing complexity and applying linear projections while tumbling the projection axes in a defined sequence which generates a Grand Tour of the data. We propose using smooth nonlinear topographic maps of the data distribution to…

Clustering high-dimensional data; QA75; Nonlinear dimensionality reduction; Discriminative clustering; Computer science; Visualització de la informació; computer.software_genre; Data visualization; Projection (mathematics); Information visualization; Artificial Intelligence; QA:Informàtica::Infografia [Àrees temàtiques de la UPC]; business.industry; Data visualization; Dimensionality reduction; Grand tour; General Engineering; Nonlinear dimensionality reduction; Topographic map; Data structure; Computer Science Applications; Visualization; Manifold learning; Data mining; business; computer; Generative topographic mapping; Linear projections
researchProduct