Search results for "Mining"
showing 10 items of 1730 documents
Cross-Sensor Adversarial Domain Adaptation of Landsat-8 and Proba-V images for Cloud Detection
2021
The number of Earth observation satellites carrying optical sensors with similar characteristics is constantly growing. Despite their similarities and the potential synergies among them, derived satellite products are often developed for each sensor independently. Differences in retrieved radiances lead to significant drops in accuracy, which hampers knowledge and information sharing across sensors. This is particularly harmful for machine learning algorithms, since gathering new ground truth data to train models for each sensor is costly and requires experienced manpower. In this work, we propose a domain adaptation transformation to reduce the statistical differences between images of two…
A probabilistic estimation and prediction technique for dynamic continuous social science models: The evolution of the attitude of the Basque Country…
2015
In this paper, a computational technique to deal with uncertainty in dynamic continuous models in Social Sciences is presented.Considering data from surveys,the method consists of determining the probability distribution of the survey output and this allows to sample data and fit the model to the sampled data using a goodness-of-fit criterion based the χ2-test. Taking the fitted parameters that were not rejected by the χ2-test, substituting them into the model and computing their outputs, 95% confidence intervals in each time instant capturing the uncertainty of the survey data (probabilistic estimation) is built. Using the same set of obtained model parameters, a prediction over …
Helminth Microbiota Profiling Using Bacterial 16S rRNA Gene Amplicon Sequencing: From Sampling to Sequence Data Mining
2021
Symbiont microbial communities play important roles in animal biology and are thus considered integral components of metazoan organisms, including parasitic worms (helminths). Nevertheless, the study of helminth microbiomes has thus far been largely overlooked, and symbiotic relationships between helminths and their microbiomes have been only investigated in selected parasitic worms. Over the past decade, advances in next-generation sequencing technologies, coupled with their increased affordability, have spurred investigations of helminth-associated microbial communities aiming at enhancing current understanding of their fundamental biology and physiology, as well as of host-microbe intera…
Constrained Role Mining
2013
Role Based Access Control (RBAC) is a very popular access control model, for long time investigated and widely deployed in the security architecture of different enterprises. To implement RBAC, roles have to be firstly identified within the considered organization. Usually the process of (automatically) defining the roles in a bottom up way, starting from the permissions assigned to each user, is called {\it role mining}. In literature, the role mining problem has been formally analyzed and several techniques have been proposed in order to obtain a set of valid roles. Recently, the problem of defining different kind of constraints on the number and the size of the roles included in the resu…
Transfer Learning with Convolutional Networks for Atmospheric Parameter Retrieval
2018
The Infrared Atmospheric Sounding Interferometer (IASI) on board the MetOp satellite series provides important measurements for Numerical Weather Prediction (NWP). Retrieving accurate atmospheric parameters from the raw data provided by IASI is a large challenge, but necessary in order to use the data in NWP models. Statistical models performance is compromised because of the extremely high spectral dimensionality and the high number of variables to be predicted simultaneously across the atmospheric column. All this poses a challenge for selecting and studying optimal models and processing schemes. Earlier work has shown non-linear models such as kernel methods and neural networks perform w…
Multi-label Methods for Prediction with Sequential Data
2017
The number of methods available for classification of multi-label data has increased rapidly over recent years, yet relatively few links have been made with the related task of classification of sequential data. If labels indices are considered as time indices, the problems can often be seen as equivalent. In this paper we detect and elaborate on connections between multi-label methods and Markovian models, and study the suitability of multi-label methods for prediction in sequential data. From this study we draw upon the most suitable techniques from the area and develop two novel competitive approaches which can be applied to either kind of data. We carry out an empirical evaluation inves…
Mislabel Detection of Finnish Publication Ranks
2019
The paper proposes to analyze a data set of Finnish ranks of academic publication channels with Extreme Learning Machine (ELM). The purpose is to introduce and test recently proposed ELM-based mislabel detection approach with a rich set of features characterizing a publication channel. We will compare the architecture, accuracy, and, especially, the set of detected mislabels of the ELM-based approach to the corresponding reference results on the reference paper.
Integrating Domain Knowledge in Data-Driven Earth Observation With Process Convolutions
2022
The modelling of Earth observation data is a challenging problem, typically approached by either purely mechanistic or purely data-driven methods. Mechanistic models encode the domain knowledge and physical rules governing the system. Such models, however, need the correct specification of all interactions between variables in the problem and the appropriate parameterization is a challenge in itself. On the other hand, machine learning approaches are flexible data-driven tools, able to approximate arbitrarily complex functions, but lack interpretability and struggle when data is scarce or in extrapolation regimes. In this paper, we argue that hybrid learning schemes that combine both approa…
A perspective on Gaussian processes for Earth observation
2019
Earth observation (EO) by airborne and satellite remote sensing and in-situ observations play a fundamental role in monitoring our planet. In the last decade, machine learning and Gaussian processes (GPs) in particular has attained outstanding results in the estimation of bio-geo-physical variables from the acquired images at local and global scales in a time-resolved manner. GPs provide not only accurate estimates but also principled uncertainty estimates for the predictions, can easily accommodate multimodal data coming from different sensors and from multitemporal acquisitions, allow the introduction of physical knowledge, and a formal treatment of uncertainty quantification and error pr…
Using the Tsetlin Machine to Learn Human-Interpretable Rules for High-Accuracy Text Categorization With Medical Applications
2019
Medical applications challenge today's text categorization techniques by demanding both high accuracy and ease-of-interpretation. Although deep learning has provided a leap ahead in accuracy, this leap comes at the sacrifice of interpretability. To address this accuracy-interpretability challenge, we here introduce, for the first time, a text categorization approach that leverages the recently introduced Tsetlin Machine. In all brevity, we represent the terms of a text as propositional variables. From these, we capture categories using simple propositional formulae, such as: if "rash" and "reaction" and "penicillin" then Allergy. The Tsetlin Machine learns these formulae from a labelled tex…