Convolutional Neural Networks for Cloud Screening: Transfer Learning from Landsat-8 to Proba-V
Cloud detection is a key issue for exploiting the information from the multispectral sensors of Earth observation satellites. For Proba-V, cloud detection is challenging due to the limited number of spectral bands. Advanced machine learning methods, such as convolutional neural networks (CNN), have been shown to work well on this problem provided enough labeled data are available. However, simultaneous collocated information about the presence of clouds is usually not available or requires a great amount of manual labor. In this work, we propose to learn from the available Landsat-8 cloud mask datasets and transfer this learning to solve the Proba-V cloud detection problem. CNN are trained with Landsat images adap…
Optimizing Kernel Ridge Regression for Remote Sensing Problems
Kernel methods have been very successful in remote sensing problems because of their ability to deal with high-dimensional non-linear data. However, they are computationally expensive to train when a large number of samples is used. In this context, while the amount of available remote sensing data has constantly increased, the size of training sets in kernel methods is usually restricted to a few thousand samples. In this work, we modified the kernel ridge regression (KRR) training procedure to deal with large-scale datasets. In addition, the basis functions in the reproducing kernel Hilbert space are defined as parameters to be also optimized during the training process. This extends the n…
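To illustrate why standard KRR training becomes expensive, here is a minimal pure-Python sketch of the textbook closed-form solution, in which the dual weights solve (K + λI)α = y. It is not the modified training procedure described in the abstract. The function names, the RBF kernel choice, and the `gamma` and `lam` parameters are assumptions made for illustration.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs], copied so the inputs are not modified.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def krr_fit(X, y, gamma=1.0, lam=1e-3):
    """Return dual weights alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K_reg = [
        [rbf_kernel(X[i], X[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return solve_linear_system(K_reg, y)


def krr_predict(X_train, alpha, x_new, gamma=1.0):
    """Prediction is a kernel expansion over all training samples."""
    return sum(a * rbf_kernel(xi, x_new, gamma) for a, xi in zip(alpha, X_train))
```

The kernel matrix has one row and one column per training sample, and the direct solve scales cubically with the number of samples, which is the usual reason training sets are kept to a few thousand samples.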
Convolutional Neural Networks for Multispectral Image Cloud Masking
Convolutional neural networks (CNN) have proven to be state-of-the-art methods for many image classification tasks, and their use is rapidly increasing in remote sensing problems. One of their major strengths is that, when enough data are available, CNN perform end-to-end learning without the need for custom feature extraction methods. In this work, we study the use of different CNN architectures for cloud masking of Proba-V multispectral images. We compare such methods with the more classical machine learning approach based on feature extraction plus supervised classification. Experimental results suggest that CNN are a promising alternative for solving cloud masking problems.
Retrieval of Case 2 Water Quality Parameters with Machine Learning
Water quality parameters are derived by applying several machine learning regression methods to the Case2eXtreme dataset (C2X). The data are based on Hydrolight in-water radiative transfer simulations at Sentinel-3 OLCI wavebands, and the application is done exclusively for absorbing waters with high concentrations of coloured dissolved organic matter (CDOM). The regression approaches are: regularized linear, random forest, kernel ridge, Gaussian process and support vector regressors. The validation is made with an independent simulation dataset. A comparison with the OLCI Neural Network Swarm (ONNS) is made as well. The best approach is applied to a sample scene and compared with t…
Cloud detection machine learning algorithms for PROBA-V
This paper presents the development and implementation of a cloud detection algorithm for Proba-V. Accurate and automatic detection of clouds in satellite scenes is a key issue for a wide range of remote sensing applications. Without accurate cloud masking, undetected clouds are one of the most significant sources of error in both sea and land cover biophysical parameter retrieval. The objective of the algorithms presented in this paper is to detect clouds accurately, providing a cloud flag per pixel. For this purpose, the method exploits the information of Proba-V using statistical machine learning techniques to identify the clouds present in Proba-V products. The effectiveness of the propo…
Cross-Sensor Adversarial Domain Adaptation of Landsat-8 and Proba-V images for Cloud Detection
The number of Earth observation satellites carrying optical sensors with similar characteristics is constantly growing. Despite their similarities and the potential synergies among them, derived satellite products are often developed for each sensor independently. Differences in retrieved radiances lead to significant drops in accuracy, which hampers knowledge and information sharing across sensors. This is particularly harmful for machine learning algorithms, since gathering new ground truth data to train models for each sensor is costly and requires experienced manpower. In this work, we propose a domain adaptation transformation to reduce the statistical differences between images of two…
Cloud detection on the Google Earth Engine platform
The vast amount of data acquired by current high-resolution Earth observation satellites poses technical challenges. The Google Earth Engine (GEE) platform provides a framework for developing algorithms and products built on these data in an easy and scalable manner. In this paper, we take advantage of the GEE platform capabilities to exploit the wealth of information in the temporal dimension by processing a long time series of satellite images. A cloud detection algorithm for Landsat-8, which uses previous images of the same location to detect clouds, is implemented and tested on the GEE platform.
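The idea of using previous images of the same location can be sketched in a few lines of plain Python. This is an illustrative simplification and not the algorithm from the paper: it assumes single-band reflectance images stored as nested lists, a per-pixel median of earlier acquisitions as the background estimate, and a hypothetical brightness `threshold`. It does not use the GEE API.

```python
from statistics import median


def multitemporal_cloud_mask(previous_images, target_image, threshold=0.1):
    """Flag pixels that are much brighter than their multitemporal background.

    previous_images: list of 2-D lists (earlier acquisitions of the same location)
    target_image:    2-D list with the acquisition to be screened
    threshold:       hypothetical reflectance increase treated as cloud
    """
    rows = len(target_image)
    cols = len(target_image[0])
    mask = []
    for r in range(rows):
        mask_row = []
        for c in range(cols):
            # The median tolerates occasional cloudy samples in the history.
            background = median(img[r][c] for img in previous_images)
            mask_row.append(target_image[r][c] - background > threshold)
        mask.append(mask_row)
    return mask
```

Using the median rather than the mean keeps a single cloudy acquisition in the history from inflating the background estimate.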
Machine Learning Regression Approaches for Colored Dissolved Organic Matter (CDOM) Retrieval with S2-MSI and S3-OLCI Simulated Data
The colored dissolved organic matter (CDOM) variable is the standard measure of humic substance in water optics. CDOM is optically characterized by its spectral absorption coefficient, $a_{\mathrm{CDOM}}$, at a reference wavelength (e.g., ≈ 440 nm). Retrieval of CDOM is traditionally done using bio-optical models. As an alternative, this paper presents a comparison of five machine learning methods applied to Sentinel-2 and Sentinel-3 simulated reflectance ($R_{rs}$) data for the retrieval of CDOM: regularized linear regression (RLR), random forest regression (RFR), kernel ridge regression (KRR), Gaussian process regression (GPR) and support vector regression (SVR). Two different datasets of radiative t…
Multitemporal Cloud Masking in the Google Earth Engine
The exploitation of Earth observation satellite images acquired by optical instruments requires automatic and accurate cloud detection. Multitemporal approaches to cloud detection are usually more powerful than their single-scene counterparts, since the presence of clouds varies greatly from one acquisition to another whereas the surface can be assumed stationary in a broad sense. However, two practical limitations usually hamper their operational use: the access to the complete satellite image archive and the required computational power. This work presents a cloud detection and removal methodology implemented in the Google Earth Engine (GEE) cloud computing platform in order to meet these r…
HyperLabelMe: A Web Platform for Benchmarking Remote-Sensing Image Classifiers
HyperLabelMe is a web platform that allows the automatic benchmarking of remote-sensing image classifiers. To demonstrate this platform's attributes, we collected and harmonized a large data set of labeled multispectral and hyperspectral images with different numbers of classes, dimensionality, noise sources, and levels. The registered user can download training data pairs (spectra and land cover/use labels) and submit the predictions for unseen testing spectra. The system then evaluates the accuracy and robustness of the classifier, and it reports different scores as well as a ranked list of the best methods and users. The system is modular, scalable, and ever-growing in data sets and clas…