Search results for "mining"

showing 10 items of 1730 documents

Cross-Sensor Adversarial Domain Adaptation of Landsat-8 and Proba-V images for Cloud Detection

2021

The number of Earth observation satellites carrying optical sensors with similar characteristics is constantly growing. Despite their similarities and the potential synergies among them, derived satellite products are often developed for each sensor independently. Differences in retrieved radiances lead to significant drops in accuracy, which hampers knowledge and information sharing across sensors. This is particularly harmful for machine learning algorithms, since gathering new ground truth data to train models for each sensor is costly and requires experienced manpower. In this work, we propose a domain adaptation transformation to reduce the statistical differences between images of two…

FOS: Computer and information sciences; Atmospheric Science; Computer Science - Machine Learning; Generative adversarial networks; 010504 meteorology & atmospheric sciences; Computer science; Remote sensing application; domain adaptation; Geophysics. Cosmic physics; 0211 other engineering and technologies; Cloud computing; 02 engineering and technology; computer.software_genre; 01 natural sciences; Image (mathematics); Data modeling; Machine Learning (cs.LG); convolutional neural networks; FOS: Electrical engineering electronic engineering information engineering; Landsat-8; Computers in Earth Sciences; Adaptation (computer science); TC1501-1800; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; business.industry; QC801-809; Image and Video Processing (eess.IV); Electrical Engineering and Systems Science - Image and Video Processing; Ocean engineering; Transformation (function); cloud detection; Satellite; Data mining; Proba-V; Transfer of learning; business; computer
researchProduct

A probabilistic estimation and prediction technique for dynamic continuous social science models: The evolution of the attitude of the Basque Country…

2015

In this paper, a computational technique to deal with uncertainty in dynamic continuous models in Social Sciences is presented. Considering data from surveys, the method consists of determining the probability distribution of the survey output, which makes it possible to sample data and fit the model to the sampled data using a goodness-of-fit criterion based on the χ2-test. Taking the fitted parameters that were not rejected by the χ2-test, substituting them into the model and computing their outputs, 95% confidence intervals that capture the uncertainty of the survey data are built at each time instant (probabilistic estimation). Using the same set of obtained model parameters, a prediction over …

FOS: Computer and information sciences; Attitude dynamics; Probabilistic prediction; Computer science; Population; Divergence-from-randomness model; Sample (statistics); computer.software_genre; Machine Learning (cs.LG); Probabilistic estimation; Social science; education; Probabilistic relevance model; education.field_of_study; Applied Mathematics; Probabilistic logic; Confidence interval; Computer Science - Learning; Computational Mathematics; Social dynamic models; Probability distribution; Survey data collection; Data mining; MATEMATICA APLICADA; computer; Applied Mathematics and Computation
researchProduct

Helminth Microbiota Profiling Using Bacterial 16S rRNA Gene Amplicon Sequencing: From Sampling to Sequence Data Mining

2021

Symbiont microbial communities play important roles in animal biology and are thus considered integral components of metazoan organisms, including parasitic worms (helminths). Nevertheless, the study of helminth microbiomes has thus far been largely overlooked, and symbiotic relationships between helminths and their microbiomes have been only investigated in selected parasitic worms. Over the past decade, advances in next-generation sequencing technologies, coupled with their increased affordability, have spurred investigations of helminth-associated microbial communities aiming at enhancing current understanding of their fundamental biology and physiology, as well as of host-microbe intera…

FOS: Computer and information sciences; Bioinformatics; Computational biology; Biology; DNA sequencing; Symbiosis; Helminths; RNA Ribosomal 16S; parasitic diseases; Helminth; Animals; Data Mining; Helminths; Microbiome; Gene; Bacterial 16S rRNA gene; Indirect life cycle; High-throughput sequencing; Microbiota; High-Throughput Nucleotide Sequencing; Genes rRNA; Schistosoma mansoni; Amplicon sequencing; Human genome; Sample collection; Worm-associated microbiome
researchProduct

Constrained Role Mining

2013

Role Based Access Control (RBAC) is a very popular access control model that has long been investigated and is widely deployed in the security architecture of different enterprises. To implement RBAC, roles must first be identified within the organization under consideration. The process of (automatically) defining the roles bottom-up, starting from the permissions assigned to each user, is usually called role mining. In the literature, the role mining problem has been formally analyzed and several techniques have been proposed to obtain a set of valid roles. Recently, the problem of defining different kinds of constraints on the number and the size of the roles included in the resu…

FOS: Computer and information sciences; Computer Science - Cryptography and Security; Process (engineering); business.industry; Computer science; Distributed computing; Vertex cover; Access control; Top-down and bottom-up design; Enterprise information security architecture; computer.software_genre; Set (abstract data type); Order (exchange); Role-based access control; Data mining; business; Cryptography and Security (cs.CR); computer
researchProduct
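
The bottom-up idea described in the "Constrained Role Mining" abstract (deriving roles from the permissions assigned to each user) can be illustrated with a deliberately naive sketch. It is not the method of the paper: it simply treats every distinct permission set as one candidate role and ignores any constraint on the number or size of roles. The function name `mine_roles` and the sample users are hypothetical.

```python
from collections import defaultdict


def mine_roles(user_permissions):
    """Group users that hold exactly the same permission set.

    Each distinct permission set becomes one candidate role (an
    illustrative simplification, not the algorithm from the paper).
    Returns a dict mapping frozenset(permissions) -> list of users.
    """
    roles = defaultdict(list)
    for user, permissions in user_permissions.items():
        roles[frozenset(permissions)].append(user)
    return dict(roles)


# Hypothetical user-permission assignment used only for illustration.
assignments = {
    "alice": {"read", "write"},
    "bob": {"read", "write"},
    "carol": {"read"},
}
candidate_roles = mine_roles(assignments)
```

A constrained variant would additionally have to bound, for example, how many permissions a single role may contain, which this sketch does not attempt.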

Transfer Learning with Convolutional Networks for Atmospheric Parameter Retrieval

2018

The Infrared Atmospheric Sounding Interferometer (IASI) on board the MetOp satellite series provides important measurements for Numerical Weather Prediction (NWP). Retrieving accurate atmospheric parameters from the raw data provided by IASI is a major challenge, but it is necessary in order to use the data in NWP models. The performance of statistical models is compromised by the extremely high spectral dimensionality and the large number of variables to be predicted simultaneously across the atmospheric column. All this poses a challenge for selecting and studying optimal models and processing schemes. Earlier work has shown that non-linear models such as kernel methods and neural networks perform w…

FOS: Computer and information sciences; Computer Science - Machine Learning; Computer science; Feature extraction; 0211 other engineering and technologies; Transfer learning; FOS: Physical sciences; 02 engineering and technology; Atmospheric model; Infrared atmospheric sounding interferometer; computer.software_genre; Convolutional neural network; Machine Learning (cs.LG); 0202 electrical engineering electronic engineering information engineering; Infrared measurements; 021101 geological & geomatics engineering; Artificial neural network; Statistical model; Numerical weather prediction; Parameter retrieval; Physics - Atmospheric and Oceanic Physics; Kernel method; 13. Climate action; Atmospheric and Oceanic Physics (physics.ao-ph); Convolutional neural networks; 020201 artificial intelligence & image processing; Data mining; computer; Curse of dimensionality; IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium
researchProduct

Multi-label Methods for Prediction with Sequential Data

2017

The number of methods available for classification of multi-label data has increased rapidly over recent years, yet relatively few links have been made with the related task of classification of sequential data. If label indices are treated as time indices, the problems can often be seen as equivalent. In this paper we detect and elaborate on connections between multi-label methods and Markovian models, and study the suitability of multi-label methods for prediction in sequential data. From this study we draw upon the most suitable techniques from the area and develop two novel competitive approaches which can be applied to either kind of data. We carry out an empirical evaluation inves…

FOS: Computer and information sciences; Computer Science - Machine Learning; Computer science; Markov models; Multi-label classification; Machine Learning (stat.ML); 02 engineering and technology; computer.software_genre; Markov model; Machine learning; Task (project management); Machine Learning (cs.LG); Statistics - Machine Learning; Artificial Intelligence; 020204 information systems; Computer Science - Data Structures and Algorithms; 0202 electrical engineering electronic engineering information engineering; Sequential data; Data Structures and Algorithms (cs.DS); Multi-label classification; ta113; business.industry; Problem transformation; Signal Processing; Sequence prediction; 020201 artificial intelligence & image processing; Sequential data; Computer Vision and Pattern Recognition; Data mining; Artificial intelligence; business; computer; Software
researchProduct

Mislabel Detection of Finnish Publication Ranks

2019

The paper proposes to analyze a data set of Finnish ranks of academic publication channels with an Extreme Learning Machine (ELM). The purpose is to introduce and test a recently proposed ELM-based mislabel detection approach with a rich set of features characterizing a publication channel. We compare the architecture, accuracy, and, especially, the set of detected mislabels of the ELM-based approach with the corresponding results in the reference paper.

FOS: Computer and information sciences; Computer Science - Machine Learning; Computer science; rankinglistat (ranking lists); Machine Learning (stat.ML); computer.software_genre; Machine Learning (cs.LG); Set (abstract data type); Statistics - Machine Learning; Digital Libraries (cs.DL); julkaisukanavat (publication channels); virheanalyysi (error analysis); mislabel detection; Extreme learning machine; Extreme Learning Machine (ELM); publication channels; Computer Science - Digital Libraries; Data set; koneoppiminen (machine learning); data; Data mining; rankings; arviointi (evaluation); computer; tieteellinen julkaisutoiminta (scientific publishing); Communication channel
researchProduct

Integrating Domain Knowledge in Data-Driven Earth Observation With Process Convolutions

2022

The modelling of Earth observation data is a challenging problem, typically approached by either purely mechanistic or purely data-driven methods. Mechanistic models encode the domain knowledge and physical rules governing the system. Such models, however, need the correct specification of all interactions between variables in the problem and the appropriate parameterization is a challenge in itself. On the other hand, machine learning approaches are flexible data-driven tools, able to approximate arbitrarily complex functions, but lack interpretability and struggle when data is scarce or in extrapolation regimes. In this paper, we argue that hybrid learning schemes that combine both approa…

FOS: Computer and information sciences; Computer Science - Machine Learning; Earth observation; Advanced microwave scanning radiometer-2 (AMSR-2); moderate resolution imaging spectroradiometer (MODIS); Computer science; leaf area index (LAI); 0211 other engineering and technologies; Extrapolation; Machine Learning (stat.ML); 02 engineering and technology; computer.software_genre; Machine Learning (cs.LG); Data-driven; Convolution; symbols.namesake; advanced scatterometer (ASCAT); Statistics - Machine Learning; ordinary differential equation (ODE); Electrical and Electronic Engineering; Gaussian process; soil moisture and ocean salinity (SMOS); 021101 geological & geomatics engineering; Interpretability; Forcing (recursion theory); machine learning (ML); soil moisture (SM); time series analysis; gaussian process (GP); symbols; General Earth and Planetary Sciences; Domain knowledge; Data mining; gap filling; physics; computer; fraction of absorbed photosynthetically active radiation (faPAR); IEEE Transactions on Geoscience and Remote Sensing
researchProduct

A perspective on Gaussian processes for Earth observation

2019

Earth observation (EO) by airborne and satellite remote sensing and in-situ observations plays a fundamental role in monitoring our planet. In the last decade, machine learning, and Gaussian processes (GPs) in particular, have attained outstanding results in the estimation of bio-geo-physical variables from the acquired images at local and global scales in a time-resolved manner. GPs provide not only accurate estimates but also principled uncertainty estimates for the predictions, can easily accommodate multimodal data coming from different sensors and from multitemporal acquisitions, allow the introduction of physical knowledge, and a formal treatment of uncertainty quantification and error pr…

FOS: Computer and information sciences; Computer Science - Machine Learning; Earth observation; Computer science; Datenmanagement und Analyse; Machine Learning (stat.ML); 02 engineering and technology; 010402 general chemistry; computer.software_genre; Statistics - Applications; 01 natural sciences; Machine Learning (cs.LG); symbols.namesake; Statistics - Machine Learning; Applications (stat.AP); Uncertainty quantification; Gaussian process; Physical law; Propagation of uncertainty; Multidisciplinary; business.industry; Perspective (graphical); gaussian processes; 021001 nanoscience & nanotechnology; 0104 chemical sciences; 13. Climate action; Causal inference; Computer Science; Global Positioning System; symbols; Data mining; 0210 nano-technology; business; computer; Perspectives; National Science Review
researchProduct

Using the Tsetlin Machine to Learn Human-Interpretable Rules for High-Accuracy Text Categorization With Medical Applications

2019

Medical applications challenge today's text categorization techniques by demanding both high accuracy and ease of interpretation. Although deep learning has provided a leap ahead in accuracy, this leap comes at the expense of interpretability. To address this accuracy-interpretability challenge, we here introduce, for the first time, a text categorization approach that leverages the recently introduced Tsetlin Machine. In brief, we represent the terms of a text as propositional variables. From these, we capture categories using simple propositional formulae, such as: if "rash" and "reaction" and "penicillin" then Allergy. The Tsetlin Machine learns these formulae from a labelled tex…

FOS: Computer and information sciences; Computer Science - Machine Learning; General Computer Science; Computer science; text categorization; Natural language understanding; Decision tree; Machine Learning (stat.ML); 02 engineering and technology; VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550::Annen informasjonsteknologi: 559; Machine learning; computer.software_genre; supervised learning; Machine Learning (cs.LG); Naive Bayes classifier; Text mining; Statistics - Machine Learning; 0202 electrical engineering electronic engineering information engineering; General Materials Science; Tsetlin machine; health informatics; Interpretability; Propositional variable; Classification algorithms; Artificial neural network; business.industry; Deep learning; 020208 electrical & electronic engineering; General Engineering; Random forest; Support vector machine; machine learning; Categorization; 020201 artificial intelligence & image processing; Artificial intelligence; lcsh:Electrical engineering. Electronics. Nuclear engineering; business; Precision and recall; computer; lcsh:TK1-9971
researchProduct
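
The rule quoted in the Tsetlin Machine abstract (if "rash" and "reaction" and "penicillin" then Allergy) can be made concrete with a small sketch. It only shows how terms might be turned into propositional variables and how one hand-written conjunctive clause is evaluated; it does not implement the learning procedure of the Tsetlin Machine. The vocabulary, the helper names, and the extra term "fever" are assumptions made for illustration.

```python
import re

# Hypothetical vocabulary; "fever" is added only to show an unused variable.
VOCABULARY = ["rash", "reaction", "penicillin", "fever"]

# Hand-written clause mirroring the example rule in the abstract.
ALLERGY_CLAUSE = ["rash", "reaction", "penicillin"]


def terms_to_variables(text, vocabulary):
    """Map a text to propositional variables: True if the term occurs."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {term: term in tokens for term in vocabulary}


def clause_holds(variables, required_terms):
    """A conjunctive clause is true when every required term is present."""
    return all(variables[term] for term in required_terms)


def categorize(text):
    """Return "Allergy" if the clause fires, otherwise None."""
    variables = terms_to_variables(text, VOCABULARY)
    return "Allergy" if clause_holds(variables, ALLERGY_CLAUSE) else None
```

Because the category is decided by an explicit conjunction of terms, the reason for each decision can be read directly from the clause, which is the interpretability property the abstract emphasizes.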