AUTHOR

Adrian Perez-suay

Physics-Aware Machine Learning For Geosciences And Remote Sensing

Machine learning models alone are excellent approximators, but very often do not respect the most elementary laws of physics, like mass or energy conservation, so consistency and confidence are compromised. In this paper we describe the main challenges ahead in the field, and introduce several ways to live in the Physics and machine learning interplay: encoding differential equations from data, constraining data-driven models with physics-priors and dependence constraints, improving parameterizations, emulating physical models, and blending data-driven and process-based models. This is a collective long-term AI agenda towards developing and applying algorithms capable of discovering knowled…

research product

Neural Network Emulation of Synthetic Hyperspectral Sentinel-2-Like Imagery With Uncertainty

Hyperspectral satellite imagery provides highly-resolved spectral information for large areas and can provide vital information. However, only a few imaging spectrometer missions are currently in operation. Aiming to generate synthetic satellite-based hyperspectral imagery potentially covering any region, we explored the possibility of applying statistical learning, i.e. emulation. Based on the relationship of a Sentinel-2 (S2) scene and a hyperspectral HyPlant airborne image, this work demonstrates the possibility to emulate a hyperspectral S2-like image. We tested the role of different machine learning regression algorithms (MLRA) and varied the image-extracted training dataset size. We f…

research product

Nonlinear Distribution Regression for Remote Sensing Applications

In many remote sensing applications, one wants to estimate variables or parameters of interest from observations. When the target variable is available at a resolution that matches the remote sensing observations, standard algorithms, such as neural networks, random forests, or Gaussian processes, are readily available to relate the two. However, we often encounter situations where the target variable is only available at the group level, i.e., collectively associated with a number of remotely sensed observations. This problem setting is known in statistics and machine learning as multiple instance learning (MIL) or distribution regression (DR). This article introduces a nonlinear (kern…

research product

Causal Inference in Geoscience and Remote Sensing From Observational Data

Establishing causal relations between random variables from observational data is perhaps the most important challenge in today’s science. In remote sensing and geosciences, this is of special relevance to better understand the Earth’s system and the complex interactions between the governing processes. In this paper, we focus on observational causal inference, and thus, we try to estimate the correct direction of causation using a finite set of empirical data. In addition, we focus on the more complex bivariate scenario, which requires strong assumptions and in which no conditional independence tests can be used. In particular, we explore the framework of (nondeterministic) additive noise models, …

research product

Nonlinear Cook distance for Anomalous Change Detection

In this work we propose a method to find anomalous changes in remote sensing images based on the chronochrome approach. A regressor between images is used to discover the most influential points in the observed data. Typically, the pixels with the largest residuals are declared anomalous changes. In order to find the anomalous pixels we consider the Cook distance and propose its nonlinear extension using random Fourier features as an efficient nonlinear measure of impact. Good empirical performance is shown over different multispectral images, evaluated both visually and quantitatively with ROC curves.
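The idea of scoring pixels by their influence on a regressor between two images can be illustrated with the classical (linear) Cook's distance. The following is only a minimal sketch under stated assumptions: it predicts a single band of the second image from the bands of the first with ordinary least squares, uses synthetic data, and does not reproduce the nonlinear random Fourier feature extension described in the abstract. All names (`cook_distance`, `first_image`, `second_band`) are hypothetical.

```python
import numpy as np

def cook_distance(X, y):
    """Cook's distance of every sample in an ordinary least-squares fit of y on X."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])      # design matrix with intercept
    p = A.shape[1]
    Q, _ = np.linalg.qr(A)                    # thin QR, so the hat matrix is H = Q Q^T
    leverage = np.sum(Q**2, axis=1)           # diagonal of H
    residuals = y - Q @ (Q.T @ y)             # y - H y
    s2 = residuals @ residuals / (n - p)      # residual variance estimate
    return residuals**2 * leverage / (p * s2 * (1.0 - leverage) ** 2)

# Synthetic stand-in for two co-registered images (assumption, not real data).
rng = np.random.default_rng(0)
n_pixels, n_bands = 300, 3
first_image = rng.normal(size=(n_pixels, n_bands))
second_band = first_image @ np.array([0.5, -1.0, 2.0]) + 0.1 * rng.normal(size=n_pixels)
second_band[7] += 5.0                         # planted anomalous change at pixel 7

scores = cook_distance(first_image, second_band)
most_influential = int(np.argmax(scores))
print("most influential pixel:", most_influential)
```

Unlike a plain residual threshold, the score also weighs each residual by the leverage of the pixel, so it measures how much the fitted regressor would move if that pixel were removed.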

research product

Randomized kernels for large scale Earth observation applications

Current remote sensing applications of bio-geophysical parameter estimation and image classification have to deal with an unprecedented amount of heterogeneous and complex data sources. New satellite sensors involving a high number of improved time, space and wavelength resolutions give rise to challenging computational problems. Standard physical inversion techniques cannot cope efficiently with this new scenario. Land cover classification of the new image sources has also turned out to be a complex problem requiring large amounts of memory and processing time. In order to cope with these problems, statistical learning has greatly helped in the last years to develop st…

research product

Pattern Recognition Scheme for Large-Scale Cloud Detection over Landmarks

Landmark recognition and matching is a critical step in many Image Navigation and Registration (INR) models for geostationary satellite services, as well as to maintain the geometric quality assessment (GQA) in the instrument data processing chain of Earth observation satellites. Matching the landmark accurately is of paramount relevance, and the process can be strongly impacted by the cloud contamination of a given landmark. This paper introduces a complete pattern recognition methodology able to detect the presence of clouds over landmarks using Meteosat Second Generation (MSG) data. The methodology is based on the ensemble combination of dedicated support vector machines (SVMs) dependent…

research product

About Combining Metric Learning and Prototype Generation

Distance metric learning has been a major research topic in recent times. Usually, the problem is formulated as finding a Mahalanobis-like metric matrix that satisfies a set of constraints as much as possible. Different ways to introduce these constraints and to effectively formulate and solve the optimization problem have been proposed. In this work, we start with one of these formulations that leads to a convex optimization problem and generalize it in order to increase the efficiency by appropriately selecting the set of constraints. Moreover, the original criterion is expressed in terms of a reduced set of representatives that is learnt together with the metric. This leads to further im…

research product

Remote Sensing Image Classification with Large Scale Gaussian Processes

Current remote sensing image classification problems have to deal with an unprecedented amount of heterogeneous and complex data sources. Upcoming missions will soon provide large data streams that will make land cover/use classification difficult. Machine learning classifiers can help with this, and many methods are currently available. A popular kernel classifier is the Gaussian process classifier (GPC), since it approaches the classification problem with a solid probabilistic treatment, thus yielding confidence intervals for the predictions as well as very competitive results to state-of-the-art neural networks and support vector machines. However, its computational cost is prohibitive for…

research product

Randomized Rx For Target Detection

This work tackles the target detection problem through the well-known global RX method. The RX method models the clutter as a multivariate Gaussian distribution, and has been extended to nonlinear distributions using kernel methods. While the kernel RX can cope with complex clutters, it requires a considerable amount of computational resources as the number of clutter pixels gets larger. Here we propose random Fourier features to approximate the Gaussian kernel in kernel RX; consequently, our development keeps the accuracy of the nonlinear method while reducing the computational cost, which is now controlled by a hyperparameter. Results over both synthetic and real-world image target detection…
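To make the approximation concrete, the sketch below maps pixels through random Fourier features, whose inner products approximate a Gaussian kernel, and then applies the usual RX (Mahalanobis) score in that feature space. It is an illustration under assumptions rather than the authors' implementation: the data are synthetic, the kernel width `sigma`, the number of features, and the regulariser `reg` are arbitrary choices, and the function names are hypothetical.

```python
import numpy as np

def rff_map(X, W, b):
    """Random Fourier features: z(x) . z(y) approximates exp(-||x - y||^2 / (2 sigma^2))."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def rx_scores(Z_clutter, Z_test, reg=1e-6):
    """RX (Mahalanobis) score of test features with respect to clutter statistics."""
    mu = Z_clutter.mean(axis=0)
    cov = np.cov(Z_clutter, rowvar=False) + reg * np.eye(Z_clutter.shape[1])
    diff = Z_test - mu
    return np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)

rng = np.random.default_rng(0)
dim, n_features, sigma = 5, 100, 3.0           # n_features controls the cost
W = rng.normal(scale=1.0 / sigma, size=(dim, n_features))
b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

clutter = rng.normal(size=(500, dim))          # synthetic background pixels
target = np.full((1, dim), 6.0)                # synthetic pixel far from the clutter
Z_clutter = rff_map(clutter, W, b)
clutter_scores = rx_scores(Z_clutter, Z_clutter)
target_score = rx_scores(Z_clutter, rff_map(target, W, b))[0]

# Sanity check of the kernel approximation with a larger number of features.
W_big = rng.normal(scale=1.0 / sigma, size=(dim, 5000))
b_big = rng.uniform(0.0, 2.0 * np.pi, size=5000)
pts = clutter[:20]
Z_big = rff_map(pts, W_big, b_big)
sq_dists = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=-1)
exact_kernel = np.exp(-sq_dists / (2.0 * sigma**2))
kernel_error = np.mean(np.abs(Z_big @ Z_big.T - exact_kernel))
print("mean kernel error:", kernel_error)
```

The covariance that must be inverted has size `n_features` by `n_features`, independent of the number of clutter pixels, which is where the saving over an exact kernel matrix comes from.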

research product

Consistent Regression of Biophysical Parameters with Kernel Methods

This paper introduces a novel statistical regression framework that allows the incorporation of consistency constraints. A linear and nonlinear (kernel-based) formulation are introduced, and both imply closed-form analytical solutions. The models exploit all the information from a set of drivers while being maximally independent of a set of auxiliary, protected variables. We successfully illustrate the performance in the estimation of chlorophyll content.
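One way to picture a closed-form linear solution of this kind is ridge regression with an extra quadratic penalty on the cross-covariance between the predictions and the protected variables. The sketch below is an assumption-laden illustration: the specific penalty, its scaling by `mu / n`, the synthetic data, and the name `consistent_ridge` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def consistent_ridge(X, y, S, lam=1e-3, mu=0.0):
    """Minimise ||y - Xw||^2 + lam ||w||^2 + (mu / n) ||S^T X w||^2 on centred data."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    Sc = S - S.mean(axis=0)
    C = Xc.T @ Sc                               # unnormalised cross-covariance, shape (d, q)
    A = Xc.T @ Xc + lam * np.eye(X.shape[1]) + (mu / n) * (C @ C.T)
    return np.linalg.solve(A, Xc.T @ yc)        # closed-form solution

def abs_covariance(pred, s):
    """Absolute sample covariance between predictions and a protected variable."""
    return abs(np.mean((pred - pred.mean()) * (s - s.mean())))

rng = np.random.default_rng(0)
n = 500
s = rng.normal(size=n)                          # protected variable (synthetic)
x1 = s + rng.normal(size=n)                     # driver correlated with s
x2 = rng.normal(size=n)                         # driver independent of s
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)
S = s[:, None]

w_plain = consistent_ridge(X, y, S, mu=0.0)
w_consistent = consistent_ridge(X, y, S, mu=100.0)
dep_plain = abs_covariance(X @ w_plain, s)
dep_consistent = abs_covariance(X @ w_consistent, s)
print(dep_plain, dep_consistent)
```

Raising `mu` trades predictive fit for lower linear dependence on the protected variable; a kernel version would replace the linear cross-covariance with a kernel dependence measure.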

research product

Causal inference in geosciences with kernel sensitivity maps

Establishing causal relations between random variables from observational data is perhaps the most important challenge in today's Science. In remote sensing and geosciences this is of special relevance to better understand the Earth's system and the complex and elusive interactions between processes. In this paper we explore a framework to derive cause-effect relations from pairs of variables via regression and dependence estimation. We propose to focus on the sensitivity (curvature) of the dependence estimator to account for the asymmetry of the forward and inverse densities of approximation residuals. Results in a large collection of 28 geoscience causal inference problems demonstrate the…

research product

Discovering Differential Equations from Earth Observation Data

Modeling and understanding the Earth system is a constant and challenging scientific endeavour. When a clear mechanistic model is unavailable, complex or uncertain, learning from data can be an alternative. While machine learning has provided excellent methods for detection and retrieval, understanding the governing equations of the system from observational data seems an elusive problem. In this paper we introduce sparse regression to uncover a set of governing equations in the form of a system of ordinary differential equations (ODEs). The presented method is used to explicitly describe variable relations by identifying the most expressive and simplest ODEs explaining data to model releva…
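As a rough illustration of sparse regression for recovering ODEs, the sketch below fits time derivatives against a small library of candidate terms and prunes small coefficients by sequentially thresholded least squares. The choice of that particular sparse solver, the polynomial library, the threshold, and the synthetic damped oscillator are all assumptions made for the example; the abstract does not specify them.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                xi[keep, k] = np.linalg.lstsq(theta[:, keep], dxdt[:, k], rcond=None)[0]
    return xi

# Synthetic trajectory of dx/dt = -0.1 x + 2 y, dy/dt = -2 x - 0.1 y (known solution).
t = np.arange(0.0, 10.0, 0.01)
x = np.exp(-0.1 * t) * np.cos(2.0 * t)
y = -np.exp(-0.1 * t) * np.sin(2.0 * t)
states = np.column_stack([x, y])
derivatives = np.gradient(states, t, axis=0, edge_order=2)   # numerical derivatives

library_names = ["1", "x", "y", "x^2", "x*y", "y^2"]
theta = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
coefficients = stlsq(theta, derivatives)

for name, row in zip(library_names, coefficients):
    print(f"{name:>4}: dx/dt {row[0]: .3f}   dy/dt {row[1]: .3f}")
```

Each column of `coefficients` is one recovered equation; the nonzero rows indicate which library terms were retained.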

research product

Kernel methods and their derivatives: Concept and perspectives for the earth system sciences.

Kernel methods are powerful machine learning techniques which implement generic non-linear functions to solve complex tasks in a simple way. They have a solid mathematical background and exhibit excellent performance in practice. However, kernel machines are still considered black-box models as the feature mapping is not directly accessible and difficult to interpret. The aim of this work is to show that interpreting the functions learned by various kernel methods is indeed possible and intuitive despite their complexity. Specifically, we show that derivatives of these functions have a simple mathematical formulation, are easy to compute, and can be applied to many different problems. We n…
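For a concrete case of such a derivative, consider kernel ridge regression with a Gaussian (RBF) kernel, where the learned function is f(x) = sum_i alpha_i k(x, x_i) and its gradient is sum_i alpha_i k(x, x_i) (x_i - x) / sigma^2. The sketch below checks that formula against finite differences. The choice of kernel ridge regression, the synthetic data, and all function names are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def krr_fit(X, y, sigma, lam):
    """Dual weights alpha of kernel ridge regression."""
    return np.linalg.solve(rbf_kernel(X, X, sigma) + lam * np.eye(len(X)), y)

def krr_predict(X_new, X, alpha, sigma):
    return rbf_kernel(X_new, X, sigma) @ alpha

def krr_gradient(x, X, alpha, sigma):
    """Analytic gradient of f(x) = sum_i alpha_i k(x, x_i) at one point x of shape (d,)."""
    k = rbf_kernel(x[None, :], X, sigma)[0]
    return ((X - x) * (alpha * k)[:, None]).sum(axis=0) / sigma**2

rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(50, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
sigma, lam = 1.0, 1e-2
alpha = krr_fit(X, y, sigma, lam)

x0 = np.array([0.3, -0.4])
analytic_grad = krr_gradient(x0, X, alpha, sigma)

# Central finite differences as an independent check of the formula.
eps = 1e-5
numeric_grad = np.array([
    (krr_predict((x0 + eps * e)[None, :], X, alpha, sigma)[0]
     - krr_predict((x0 - eps * e)[None, :], X, alpha, sigma)[0]) / (2.0 * eps)
    for e in np.eye(2)
])
print(analytic_grad, numeric_grad)
```

Evaluating `krr_gradient` over many inputs gives a per-feature sensitivity of the fitted function, which is one simple way to inspect an otherwise opaque kernel model.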

research product

Scaling Up a Metric Learning Algorithm for Image Recognition and Representation

Maximally Collapsing Metric Learning is a recently proposed algorithm to estimate a metric matrix from labelled data. The purpose of this work is to extend this approach by considering a set of landmark points which can in principle reduce the cost per iteration by one order of magnitude. The proposal is in fact a generalized version of the original algorithm that can be applied to larger amounts of higher dimensional data. Exhaustive experimentation shows that very similar behavior at a lower cost is obtained for a wide range of the number of landmark points used.

research product

A Random Extension for Discriminative Dimensionality Reduction and Metric Learning

A recently proposed metric learning algorithm which enforces the optimal discrimination of the different classes is extended and empirically assessed using different kinds of publicly available data. The optimization problem is posed in terms of landmark points and then, a stochastic approach is followed in order to bypass some of the problems of the original algorithm. According to the results, both computational burden and generalization ability are improved while absolute performance results remain almost unchanged.

research product

Fair Kernel Learning

New social and economic activities massively exploit big data and machine learning algorithms to do inference on people’s lives. Applications include automatic curricula evaluation, wage determination, and risk assessment for credits and loans. Recently, many governments and institutions have raised concerns about the lack of fairness, equity and ethics in machine learning to treat these problems. It has been shown that not including sensitive features that bias fairness, such as gender or race, is not enough to mitigate the discrimination when other related features are included. Instead, including fairness in the objective function has been shown to be more efficient.

research product

Interpretability of Recurrent Neural Networks in Remote Sensing

In this work we propose the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for multivariate time series of satellite data for crop yield estimation. Recurrent nets allow exploiting the temporal dimension efficiently, but interpretability is hampered by the typically overparameterized models. The focus of the study is to understand LSTM models by looking at the hidden units distribution, the impact of increasing network complexity, and the relative importance of the input covariates. We extracted time series of three variables describing the soil-vegetation status in agroecosystems (soil moisture, VOD and EVI) from optical and microwave satellites, as well as available in si…

research product

Convolutional Long Short-Term Memory Network for Multitemporal Cloud Detection Over Landmarks

In this work, we propose to exploit both the temporal and spatial correlations in Earth observation satellite images through deep learning methods. In particular, the combination of a U-Net convolutional neural network together with a convolutional long short-term memory (LSTM) layer is proposed. This model is applied for cloud detection on MSG/SEVIRI image time series over selected landmarks. Implementation details are provided and our proposal is compared against a standard SVM and a U-Net without the convolutional LSTM layer but including temporal information too. Experimental results show that this combination of networks exploits both the spatial and temporal dependence and provides st…

research product

Synergistic integration of optical and microwave satellite data for crop yield estimation

Developing accurate models of crop stress, phenology and productivity is of paramount importance, given the increasing need for food. Earth observation (EO) remote sensing data provides a unique source of information to monitor crops in a temporally resolved and spatially explicit way. In this study, we propose the combination of multisensor (optical and microwave) remote sensing data for crop yield estimation and forecasting using two novel approaches. We first propose the lag between Enhanced Vegetation Index (EVI) derived from MODIS and Vegetation Optical Depth (VOD) derived from SMAP as a new joint metric combining the information from the two satellite sensors in a unique feature or des…

research product

An Online Metric Learning Approach through Margin Maximization

This work introduces a method based on learning similarity measures between pairs of objects in any representation space, which allows convenient recognition algorithms to be developed. The problem is formulated through margin maximization over distance values so that it can discriminate between similar (intra-class) and dissimilar (inter-class) elements without enforcing positive definiteness of the metric matrix as in most competing approaches. A passive-aggressive approach has been adopted to carry out the corresponding optimization procedure. The proposed approach has been empirically compared to state of the art metric learning on several publicly available databases showing its potential bot…

research product

Online Metric Learning Methods Using Soft Margins and Least Squares Formulations

Online metric learning using margin maximization has been introduced as a way to learn appropriate dissimilarity measures in an efficient way when information as pairs of examples is given to the learning system in a progressive way. These schemes have several practical advantages with regard to global ones in which a training set needs to be processed. On the other hand, they may suffer from a poor performance depending on the quality of the examples and the particular tuning or other implementation details. This paper formulates several online metric learning alternatives using a passive-aggressive schema. A new formulation of the online problem using least squares is also introduced. The…

research product

Flipped evaluation: herramientas online para la evaluación participativa

[EN] The evaluation of a subject is a fundamental part of the teaching-learning process and one of the main concerns of our students. This is a complex task that requires a lot of effort from the teacher, and that effort grows in line with the increased weight of continuous evaluation in the current educational system. In this work, different methodologies focused on maximizing the student’s performance are presented, thus minimizing the extra effort for the teacher in the evaluation process. We provide several examples of activities through the Moodle platform such as the workshop, glossary, databases, questionnaires, etc. Some of them allow self-assessment once configured, whereas others pr…

research product

Efficient Nonlinear RX Anomaly Detectors

Current anomaly detection algorithms are typically challenged by either accuracy or efficiency. More accurate nonlinear detectors are typically slow and not scalable. In this letter, we propose two families of techniques to improve the efficiency of the standard kernel Reed-Xiaoli (RX) method for anomaly detection by approximating the kernel function with either data-independent random Fourier features or a data-dependent basis with the Nyström approach. We compare all methods for both real multi- and hyperspectral images. We show that the proposed efficient methods have a lower computational cost and perform similarly to (or outperform) the standard kernel RX algorithm thanks t…

research product

Estrategia de enseñanza y aprendizaje de programación basada en la idea de ’hackathon’

[EN] The acquisition of programming and data analysis skills in higher education is increasingly necessary in all areas of Science and Engineering. In this paper we present a methodology for motivating the learning of programming, mainly focused on the development of machine learning algorithms. This methodology is based on the hackathon idea and has different levels. The basic level is a competition proposed in an improvised way during the class. The second level is a scheduled hackathon held within the classroom environment and using learning management systems such as Moodle. The last level consists of participation in an exte…

research product

Sensitivity Maps of the Hilbert-Schmidt Independence Criterion

Kernel dependence measures yield accurate estimates of nonlinear relations between random variables, and they are also endowed with solid theoretical properties and convergence rates. Besides, the empirical estimates are easy to compute in closed form, involving only linear algebra operations. However, they are hampered by two important problems: the high computational cost involved, as two kernel matrices of the sample size have to be computed and stored, and the interpretability of the measure, which remains hidden behind the implicit feature map. We here address these two issues. We introduce the sensitivity maps (SMs) for the Hilbert–Schmidt independence criterion (HSIC). Sensi…
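The closed-form empirical estimate mentioned above can be sketched in a few lines: build the two kernel matrices, centre them, and take a trace. This is a minimal illustration assuming Gaussian kernels with unit widths and the biased estimator normalised by n squared, which is one common convention; the sensitivity maps themselves are not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def rbf_gram(v, sigma):
    """Gaussian kernel matrix of one sample; 1-D inputs are treated as a single feature."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    sq = np.sum((v[:, None, :] - v[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma**2))

def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2 with centring matrix H."""
    n = len(x)
    K = rbf_gram(x, sigma_x)                 # n x n kernel matrix on x
    L = rbf_gram(y, sigma_y)                 # n x n kernel matrix on y
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n**2

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y_dependent = x**2 + 0.1 * rng.normal(size=n)   # nonlinear dependence, zero correlation
y_independent = rng.normal(size=n)

hsic_dependent = hsic(x, y_dependent)
hsic_independent = hsic(x, y_independent)
print(hsic_dependent, hsic_independent)
```

The two n by n matrices are exactly the memory and compute bottleneck the abstract refers to, since both grow quadratically with the sample size.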

research product

Efficient Kernel Cook's Distance for Remote Sensing Anomalous Change Detection

Detecting anomalous changes in remote sensing images is a challenging problem, for which many approaches and techniques have been presented so far. We rely on the standard field of multivariate statistics of diagnostic measures, which are concerned with the characterization of distributions, detection of anomalies, extreme events, and changes. One useful tool to detect multivariate anomalies is the celebrated Cook's distance. Instead of assuming a linear relationship, we present a novel kernelized version of the Cook's distance to address anomalous change detection in remote sensing images. Due to the large computational burden involved in the direct kernelization, and the lack of out-…

research product

Passive millimeter wave image classification with large scale Gaussian processes

Passive Millimeter Wave Images (PMMWIs) are being increasingly used to identify and localize objects concealed under clothing. Taking into account the quality of these images and the unknown position, shape, and size of the hidden objects, large data sets are required to build successful classification/detection systems. Kernel methods, in particular Gaussian Processes (GPs), are sound, flexible, and popular techniques to address supervised learning problems. Unfortunately, their computational cost is known to be prohibitive for large scale applications. In this work, we present a novel approach to PMMWI classification based on the use of Gaussian Processes for large data sets. The proposed…

research product

Machine Learning Methods for Spatial and Temporal Parameter Estimation

Monitoring vegetation with satellite remote sensing is of paramount relevance to understand the status and health of our planet. Accurate and constant monitoring of the biosphere has large societal, economical, and environmental implications, given the increasing demand for biofuels and food by the world population. The current democratization of machine learning, big data, and high processing capabilities allow us to undertake such an endeavor in a decisive manner. This chapter proposes three novel machine learning approaches to exploit spatial, temporal, multi-sensor, and large-scale data characteristics. We show (1) the application of multi-output Gaussian processes for gap-filling time series of…

research product

Warped Gaussian Processes in Remote Sensing Parameter Estimation and Causal Inference

This letter introduces warped Gaussian process (WGP) regression in remote sensing applications. WGP models output observations as a parametric nonlinear transformation of a GP. The parameters of such a prior model are then learned via standard maximum likelihood. We show the good performance of the proposed model for the estimation of oceanic chlorophyll content from multispectral data, vegetation parameters (chlorophyll, leaf area index, and fractional vegetation cover) from hyperspectral data, and in the detection of the causal direction in a collection of 28 bivariate geoscience and remote sensing causal problems. The model consistently performs better than the standard GP and the more a…

research product

HyperLabelMe: A Web Platform for Benchmarking Remote-Sensing Image Classifiers

HyperLabelMe is a web platform that allows the automatic benchmarking of remote-sensing image classifiers. To demonstrate this platform's attributes, we collected and harmonized a large data set of labeled multispectral and hyperspectral images with different numbers of classes, dimensionality, noise sources, and levels. The registered user can download training data pairs (spectra and land cover/use labels) and submit the predictions for unseen testing spectra. The system then evaluates the accuracy and robustness of the classifier, and it reports different scores as well as a ranked list of the best methods and users. The system is modular, scalable, and ever-growing in data sets and clas…

research product

Efficient remote sensing image classification with Gaussian processes and Fourier features

This paper presents an efficient methodology for approximating kernel functions in Gaussian process classification (GPC). Two models are introduced. We first include the standard random Fourier features (RFF) approximation into GPC, which largely improves the computational efficiency and permits large scale remote sensing data classification. In addition, we develop a novel approach which avoids randomly sampling a number of Fourier frequencies, and alternatively learns the optimal ones using a variational Bayes approach. The performance of the proposed methods is illustrated in complex problems of cloud detection from multispectral imagery.

research product