Search results for "dimensionality"

Showing 10 of 129 documents

Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors.

2013

The Sparse Manifold Clustering and Embedding (SMCE) algorithm was recently proposed for simultaneous clustering and dimensionality reduction of data lying on nonlinear manifolds, using sparse representation techniques. In this work, the SMCE algorithm is applied to the differential discrimination of glioblastoma and meningioma tumors by means of their gene expression profiles. Our purpose was to evaluate the robustness of this nonlinear manifold method for classifying gene expression profiles, which are characterized by the high dimensionality of their representations and the low discriminative power of most genes. To this end, we used SMCE to reduce the dimensionality of a preprocessed dataset of 35 single…
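The sparse-representation step can be sketched as follows: a minimal self-expressive affinity in the spirit of SMCE (the full algorithm adds proximity weighting and an affine constraint), run on hypothetical toy data rather than the paper's 35-sample tumor dataset:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
# Toy stand-in: two groups of 20 points in a 50-dimensional space.
X = np.vstack([rng.normal(0.0, 1.0, (20, 50)),
               rng.normal(4.0, 1.0, (20, 50))])

n = X.shape[0]
W = np.zeros((n, n))
for i in range(n):
    others = np.delete(np.arange(n), i)
    # Sparse self-expression: reconstruct x_i from a few other points.
    lasso = Lasso(alpha=0.05, fit_intercept=False, max_iter=10000)
    lasso.fit(X[others].T, X[i])
    W[i, others] = np.abs(lasso.coef_)

A = W + W.T                     # symmetric affinity from sparse coefficients
L = np.diag(A.sum(axis=1)) - A  # graph Laplacian
vals, vecs = np.linalg.eigh(L)
Y = vecs[:, 1:3]                # low-dimensional spectral embedding
print("embedding shape:", Y.shape)
```

The sparse coefficients double as a neighborhood graph, so the same computation supports both the clustering (graph components) and the embedding.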

Bioinformatics; Health Informatics; Microarray data analysis; Robustness (computer science); Databases, Genetic; Cluster Analysis; Humans; Manifolds; Cluster analysis; Mathematics; Oligonucleotide Array Sequence Analysis; business.industry; Dimensionality reduction; Gene Expression Profiling; Computational Biology; Discriminant Analysis; Pattern recognition; Sparse approximation; Linear discriminant analysis; Manifold; Computer Science Applications; FISICA APLICADA; Embedding; Automatic classification; Artificial intelligence; business; Glioblastoma; Meningioma; Transcriptome; Algorithms; Curse of dimensionality; Computers in biology and medicine

Variability of Classification Results in Data with High Dimensionality and Small Sample Size

2021

The study focuses on the analysis of biological data containing information on the number of genome sequences of intestinal microbiome bacteria before and after antibiotic use. The data have high dimensionality (bacterial taxa) and a small number of records, which is typical of bioinformatics data. Classification models induced on such data sets are usually unstable, and their accuracy metrics have high variance. The aim of the study is to create a preprocessing workflow and a classification model that classify the microbiome into before- and after-antibiotic groups as accurately as possible while lessening the variability of the classifier's accuracy measures. To ev…
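A sketch of the kind of protocol the abstract describes, using synthetic data in place of the microbiome counts: feature selection is kept inside each fold, and repeated stratified cross-validation exposes the variance of the accuracy estimate:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# High-dimensional, small-sample stand-in: 40 samples, 500 "taxa".
X, y = make_classification(n_samples=40, n_features=500, n_informative=10,
                           random_state=0)
# Feature selection inside the pipeline, so it is refit per fold.
pipe = make_pipeline(SelectKBest(f_classif, k=20),
                     LogisticRegression(max_iter=1000))
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(pipe, X, y, cv=cv)
print(f"accuracy {scores.mean():.2f} +/- {scores.std():.2f}")
```

With `n_repeats=10`, the spread of the 50 fold scores directly quantifies the instability the study targets.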

Classification algorithms; feature selection; high dimensionality; machine learning; Information Technology and Management Science

A local complexity based combination method for decision forests trained with high-dimensional data

2012

Accurate machine learning with high-dimensional data is hindered by the phenomena known as the “curse” of dimensionality. One of the main strategies explored over the last decade to deal with this problem is the use of multi-classifier systems. Several such approaches are inspired by the Random Subspace Method for the construction of decision forests. Other studies additionally rely on estimates of the individual classifiers' competence to enhance the combination step in the multi-classifier and improve accuracy. We propose a competence estimate based on local complexity measurements, used to perform a weighted-average combination of the decision forest. Experimental results show how thi…
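To illustrate the idea, here is a sketch of a Random Subspace decision forest with a competence-weighted vote. Local accuracy over the k nearest training points stands in for the paper's local complexity measures, and all sizes and parameters are illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=40, random_state=0)
Xtr, ytr, Xte, yte = X[:150], y[:150], X[150:], y[150:]

# Random Subspace forest: each tree sees a random half of the features.
subspaces = [rng.choice(40, size=20, replace=False) for _ in range(15)]
trees = [DecisionTreeClassifier(random_state=0).fit(Xtr[:, s], ytr)
         for s in subspaces]

# Competence stand-in: a tree's accuracy on the k nearest training points.
nn = NearestNeighbors(n_neighbors=10).fit(Xtr)

def predict(x):
    idx = nn.kneighbors(x.reshape(1, -1), return_distance=False)[0]
    votes = np.zeros(2)
    for tree, s in zip(trees, subspaces):
        local_acc = (tree.predict(Xtr[idx][:, s]) == ytr[idx]).mean()
        votes[int(tree.predict(x[s].reshape(1, -1))[0])] += local_acc
    return int(votes.argmax())

preds = np.array([predict(x) for x in Xte])
print("weighted-vote accuracy:", (preds == yte).mean())
```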

Clustering high-dimensional data; Computational complexity theory; business.industry; Computer science; Decision tree; Machine learning; computer.software_genre; Random forest; Random subspace method; Artificial intelligence; Data mining; business; Competence (human resources); computer; Classifier (UML); Curse of dimensionality; 2012 12th International Conference on Intelligent Systems Design and Applications (ISDA)

Making nonlinear manifold learning models interpretable: The manifold grand tour

2015

Highlights: Smooth nonlinear topographic maps of the data distribution to guide a Grand Tour visualisation. Prioritisation of the linear views of the data that are most consistent with the data structure in the maps. Useful visualisations that cannot be obtained by other, more classical approaches.

Dimensionality reduction is required to produce visualisations of high-dimensional data. In this framework, one of the most straightforward approaches to visualising high-dimensional data is based on reducing complexity and applying linear projections while tumbling the projection axes in a defined sequence, which generates a Grand Tour of the data. We propose using smooth nonlinear topographic maps of the data distribution to…
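The Grand Tour ingredient alone can be sketched in a few lines: smoothly interpolating between random orthonormal 2-D projection bases (the paper's contribution, guiding this tour with a nonlinear topographic map, is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))      # stand-in high-dimensional data

def random_basis(d, k=2):
    # Orthonormal k-frame in R^d via QR of a Gaussian matrix.
    q, _ = np.linalg.qr(rng.normal(size=(d, k)))
    return q

def tour_frames(d, n_targets=3, steps=10):
    """Yield a sequence of 2-D projection bases that 'tumble' smoothly."""
    frame = random_basis(d)
    for _ in range(n_targets):
        target = random_basis(d)
        for t in np.linspace(0.0, 1.0, steps):
            # Re-orthonormalise the linear interpolation of the two frames.
            interp, _ = np.linalg.qr((1 - t) * frame + t * target)
            yield interp
        frame = target

views = [X @ F for F in tour_frames(10)]
print(len(views), "projected views of shape", views[0].shape)
```

Each element of `views` is one 2-D linear view of the data; plotting them in sequence produces the tumbling animation.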

Clustering high-dimensional data; QA75; Nonlinear dimensionality reduction; Discriminative clustering; Computer science; Visualització de la informació; computer.software_genre; Data visualization; Projection (mathematics); Information visualization; Artificial Intelligence; QA; Informàtica::Infografia [Àrees temàtiques de la UPC]; business.industry; Dimensionality reduction; Grand tour; General Engineering; Topographic map; Data structure; Computer Science Applications; Visualization; Manifold learning; Data mining; business; computer; Generative topographic mapping; Linear projections

Dimensionality reduction via regression on hyperspectral infrared sounding data

2014

This paper introduces a new method for dimensionality reduction via regression (DRR). The method generalizes Principal Component Analysis (PCA) in such a way that the variance of the PCA scores is reduced. To do so, DRR relies on a deflationary process in which a nonlinear regression reduces the redundancy between the PC scores. Unlike other nonlinear dimensionality reduction methods, DRR is easy to apply, has an out-of-sample extension, is invertible, and its learned transformation is volume-preserving. These properties make the method useful for a wide range of applications, especially for very high-dimensional data in general and for hyperspectral image processing in particular…
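A rough sketch of the deflationary idea, with a random-forest regressor standing in for whichever nonlinear regressor DRR actually uses: each PC score is predicted from the preceding ones and replaced by its residual:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Curved data: the second coordinate depends nonlinearly on the first.
t = rng.normal(size=300)
X = np.column_stack([t,
                     t**2 + 0.1 * rng.normal(size=300),
                     rng.normal(size=300)])

scores = PCA().fit_transform(X)
reduced = scores.copy()
# Deflation: predict each PC score from the preceding ones and keep the
# residual, removing nonlinear redundancy that PCA cannot capture.
for i in range(1, scores.shape[1]):
    reg = RandomForestRegressor(n_estimators=50, random_state=0)
    reg.fit(reduced[:, :i], scores[:, i])
    reduced[:, i] = scores[:, i] - reg.predict(reduced[:, :i])

print("variance per component:", reduced.var(axis=0).round(3))
```

Because each step modifies component i using only components before it, the transformation is triangular, which is what makes an invertible mapping of this kind possible.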

Clustering high-dimensional data; Redundancy (information theory); business.industry; Dimensionality reduction; Principal component analysis; Feature extraction; Nonlinear dimensionality reduction; Hyperspectral imaging; Pattern recognition; Artificial intelligence; business; Mathematics; Curse of dimensionality; 2014 6th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS)

A Feature Set Decomposition Method for the Construction of Multi-classifier Systems Trained with High-Dimensional Data

2013

Data mining for the discovery of novel, useful patterns encounters obstacles when dealing with high-dimensional datasets, a problem documented as the "curse" of dimensionality. One strategy to deal with this issue is the decomposition of the input feature set to build a multi-classifier system. Standalone decomposition methods are rare and generally based on random selection. We propose a decomposition method that uses information-theory tools to arrange input features into uncorrelated and relevant subsets. Experimental results show how this approach significantly outperforms three baseline decomposition methods in terms of classification accuracy.
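A simplified sketch of the strategy: features are ranked by mutual information with the class and dealt round-robin into subsets (a stand-in for the paper's information-theoretic arrangement into uncorrelated, relevant subsets), with one Naive Bayes member per subset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=30, n_informative=10,
                           random_state=0)

# Rank features by relevance (mutual information with the class), then deal
# them round-robin so each subset mixes strong and weak features.
order = np.argsort(mutual_info_classif(X, y, random_state=0))[::-1]
n_subsets = 3
subsets = [order[i::n_subsets] for i in range(n_subsets)]

members = [GaussianNB().fit(X[:, s], y) for s in subsets]
# Majority vote of the per-subset classifiers.
votes = np.stack([m.predict(X[:, s]) for m, s in zip(members, subsets)])
pred = (votes.mean(axis=0) > 0.5).astype(int)
print("training accuracy:", (pred == y).mean())
```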

Clustering high-dimensional data; business.industry; Computer science; Pattern recognition; Information theory; computer.software_genre; Uncorrelated; Decomposition method (queueing theory); Data mining; Artificial intelligence; business; Feature set; computer; Classifier (UML); Curse of dimensionality

Incremental Generalized Discriminative Common Vectors for Image Classification.

2015

Subspace-based methods have become popular due to their ability to represent complex data in such a way that dimensionality is reduced and discriminativeness is enhanced. Several recent works have concentrated on the discriminative common vector (DCV) method and other closely related algorithms based on the concept of null space. In this paper, we present a generalized incremental formulation of the DCV methods, which allows a given model to be updated when new examples, even from unseen classes, are added. Having efficient incremental formulations of well-behaved batch algorithms allows us to conveniently adapt previously trained classifiers without th…
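The batch DCV construction underlying the paper can be sketched with plain numpy (the incremental update itself is not reproduced): all samples of a class project to a single common vector in the null space of the within-class scatter:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_per, classes = 50, 5, 3      # dimension >> samples, so a null space exists
# Each class: a few samples around a random class mean.
X = [rng.normal(size=(n_per, d)) + 10 * rng.normal(size=d)
     for _ in range(classes)]

# Within-class difference vectors span the "indifference" subspace.
diffs = np.vstack([Xi - Xi.mean(axis=0) for Xi in X])
U, s, _ = np.linalg.svd(diffs.T, full_matrices=True)
rank = int((s > 1e-10).sum())
null_basis = U[:, rank:]          # null space of the within-class scatter

# The common vector of each class: its samples all project to the same point.
commons = [Xi @ null_basis for Xi in X]
spread = max(np.ptp(C, axis=0).max() for C in commons)
print("within-class spread in the null space:", spread)
```

The spread is zero up to floating-point error, which is exactly the property the DCV classifier exploits.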

Complex data type; Contextual image classification; Computer Networks and Communications; business.industry; Pattern recognition; Machine learning; computer.software_genre; Computer Science Applications; Discriminative model; Artificial Intelligence; Principal component analysis; Artificial intelligence; business; computer; Software; Subspace topology; Curse of dimensionality; Mathematics; IEEE transactions on neural networks and learning systems

The impact of sample reduction on PCA-based feature extraction for supervised learning

2006

"The curse of dimensionality" is pertinent to many learning algorithms; it denotes the drastic increase of computational complexity and classification error in high dimensions. In this paper, different feature extraction (FE) techniques are analyzed as means of dimensionality reduction and constructive induction, with respect to the performance of the Naive Bayes classifier. When a data set contains a large number of instances, a sampling approach is applied to address the computational complexity of the FE and classification processes. The main goal of this paper is to show the impact of sample reduction on the process of FE for supervised learning. In our study we analyzed the conventional PC…
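A minimal sketch of the setup under study, with illustrative sizes: PCA is fitted on a random subsample only (the sample reduction), while the Naive Bayes classifier is trained on the transformed full training set:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, n_features=100, n_informative=15,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0,
                                      stratify=y)

# Fit PCA on a random subsample only, but transform all of the data with it.
rng = np.random.default_rng(0)
idx = rng.choice(len(Xtr), size=300, replace=False)
pca = PCA(n_components=10).fit(Xtr[idx])

clf = GaussianNB().fit(pca.transform(Xtr), ytr)
acc = clf.score(pca.transform(Xte), yte)
print(f"accuracy with subsampled-PCA features: {acc:.2f}")
```

Varying the subsample size (here 300 of 1500) is the experiment the paper runs: how much can PCA's fitting sample shrink before the downstream classifier degrades?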

Computer science; Covariance matrix; business.industry; Dimensionality reduction; Feature extraction; Supervised learning; Nonparametric statistics; Sampling (statistics); Pattern recognition; Stratified sampling; Naive Bayes classifier; Sample size determination; Artificial intelligence; business; Eigenvalues and eigenvectors; Parametric statistics; Curse of dimensionality; Proceedings of the 2006 ACM symposium on Applied computing

Feature Dimensionality Reduction for Mammographic Report Classification

2016

The amount and variety of available medical data, coming from multiple heterogeneous sources, can inhibit analysis, manual interpretation, and the use of simple data-management applications. In this paper, a thorough overview of the principal algorithms for dimensionality reduction is carried out; moreover, the most effective techniques are applied to a dataset composed of 4461 mammographic reports. The most useful medical terms are converted and represented using a TF-IDF matrix in order to enable data mining and retrieval tasks. A series of queries has been performed on the raw matrix and on the same matrix after the dimensionality reduction obtained using the most useful techni…
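The TF-IDF plus LSA pipeline the abstract outlines can be sketched as follows, on a few hypothetical stand-in reports (the 4461-report dataset is not public):

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in reports.
reports = ["bilateral breast density increased no mass",
           "left breast mass with irregular margins",
           "right breast calcifications benign appearance",
           "no suspicious mass or calcifications seen"]

tfidf = TfidfVectorizer()
M = tfidf.fit_transform(reports)            # sparse TF-IDF matrix
lsa = TruncatedSVD(n_components=2, random_state=0)
M_red = lsa.fit_transform(M)                # reduced (LSA) representation

# A query is projected into the same reduced space and ranked by cosine.
query = lsa.transform(tfidf.transform(["irregular mass in breast"]))
sims = cosine_similarity(query, M_red)[0]
print("best match:", reports[sims.argmax()])
```

Running the same cosine-similarity queries against the raw TF-IDF matrix and against the reduced one reproduces the comparison described in the abstract.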

Computer science; Latent semantic analysis; business.industry; Dimensionality reduction; Data management; Cosine similarity; Pattern recognition; Latent Semantic Analysis (LSA); 02 engineering and technology; Singular Value Decomposition (SVD); Medical Application; 03 medical and health sciences; Matrix (mathematics); 0302 clinical medicine; Feature Dimensionality Reduction; Feature (computer vision); Singular value decomposition; Principal component analysis; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 030212 general & internal medicine; Artificial intelligence; business; Principal Component Analysis (PCA)

A hybrid virtual–boundary element formulation for heterogeneous materials

2021

In this work, a hybrid formulation based on the combined use of the recently developed Virtual Element Method (VEM) and the Boundary Element Method (BEM) is proposed for the effective computational analysis of multi-region domains, representative of heterogeneous materials. VEM has been developed recently as a generalisation of the Finite Element Method (FEM); it allows the straightforward employment of elements of general polygonal shape while maintaining a high level of accuracy. Owing to these inherent features, it allows the use of meshes of general topology, including non-convex elements. On the other hand, BEM is an effective technique for the numerical solution of sets of boundary i…

Computer science; Mechanical Engineering; 02 engineering and technology; 021001 nanoscience & nanotechnology; Condensed Matter Physics; Homogenization (chemistry); Finite element method; Computational science; Matrix (mathematics); 020303 mechanical engineering & transports; 0203 mechanical engineering; Mechanics of Materials; Convergence (routing); Fibre-reinforced Composite Materials; Computational Micro-mechanics; Computational Homogenization; Continuum Damage Mechanics; Virtual Element Method; Boundary Element Method; General Materials Science; Polygon mesh; Settore ING-IND/04 - Costruzioni E Strutture Aerospaziali; 0210 nano-technology; Reduction (mathematics); Boundary element method; Civil and Structural Engineering; Curse of dimensionality; International Journal of Mechanical Sciences