RESEARCH PRODUCT

Comparison of gap-filling techniques applied to the CCI soil moisture database in Southern Europe

Ángel González-Zamora, Maria Piles, José Martínez-Fernández, Laura Almendra-Martín

subject

010504 meteorology & atmospheric sciences; Database; Correlation coefficient; 0208 environmental biotechnology; Soil Science; Geology; 02 engineering and technology; computer.software_genre; 01 natural sciences; Normalized Difference Vegetation Index; 020801 environmental engineering; Random forest; Support vector machine; Autoregressive model; Principal component analysis; Potential evaporation; Computers in Earth Sciences; computer; 0105 earth and related environmental sciences; Mathematics; Interpolation; Remote sensing

description

Abstract: Soil moisture (SM) is a key variable in land-atmosphere interactions. Monitoring SM is crucial for many applications and can help to determine the impact of climate change. Therefore, it is essential to have continuous and long-term databases for this variable. Satellite missions have contributed to this; however, the continuity of the series is compromised by data gaps caused by different factors, including revisit time, the presence of seasonal ice and Radio Frequency Interference (RFI) contamination. In this work, the applicability of different gap-filling techniques is evaluated on the ESA Climate Change Initiative (CCI) SM combined product, which is the longest available satellite-based SM data record. The methods used were linear, cubic and autoregressive interpolation and support vector machines (SVMs). The study focused on Southern Europe and spanned the years 2003–2015. The methods were applied in the temporal and spatial domains and evaluated using the holdout cross-validation technique. A set of variables was introduced in the SVM model to estimate SM, namely land surface temperature, precipitation, normalized difference vegetation index (NDVI), potential evaporation, soil texture and geographical coordinates. For the SVMs, several combinations of these variables were considered, including a principal component analysis (PCA) containing all of them. Although all the methods generally perform well, the SVM method outperforms the rest. Using the SM of the preceding day (SMt-1) is key to obtaining good estimates. The median value of the correlation coefficient (R) obtained with the SVM and the SMt-1 series in the temporal analysis was 0.83, and the RMSE was 0.025 m³ m⁻³. Similar results were obtained in the spatial analysis, with the best performance (R = 0.88; RMSE = 0.024 m³ m⁻³) obtained by the SVM using the SMt-1 series and the static variables.
The application of PCA to the input variables was not beneficial, and the interpolation methods failed when dealing with large spatial or temporal gaps. The CCI SM series was also validated against in situ SM data from four networks located in Spain, France, Germany and Italy, and no substantial differences were observed between the results obtained with the original and with the reconstructed series. In addition, the best inputs obtained with the SVM were used to evaluate the random forest (RF) method in the temporal and spatial domains. This method showed a good ability to estimate soil moisture values in the temporal domain, although to a lesser extent than the SVM, while in the spatial domain it did not seem to be as accurate. Our results confirm that spatio-temporal gaps in observational SM databases can be dealt with efficiently using the SVM method with the past time series and soil texture as supporting information.
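
As an illustration of the evaluation idea described in the abstract, the following is a minimal sketch of holdout validation for the simplest of the methods, linear interpolation in the temporal domain. It is not the authors' implementation: the function names (`linear_fill`, `holdout_score`), the 20 % holdout fraction and the synthetic sinusoidal series are all assumptions made for the example. A subset of valid values is hidden, the gaps are refilled, and the estimates are scored with the correlation coefficient (R) and the RMSE.

```python
import math
import random


def linear_fill(series):
    """Fill None gaps by linear interpolation between the nearest valid neighbours.

    Gaps at the edges (no valid value on one side) are left as None.
    """
    filled = list(series)
    valid = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(valid, valid[1:]):
        for i in range(left + 1, right):
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def rmse(obs, est):
    """Root-mean-square error between observed and estimated values."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def pearson_r(obs, est):
    """Pearson correlation coefficient between observed and estimated values."""
    mo = sum(obs) / len(obs)
    me = sum(est) / len(est)
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_e = sum((e - me) ** 2 for e in est)
    return cov / math.sqrt(var_o * var_e)


def holdout_score(series, holdout_fraction=0.2, seed=0):
    """Hide a random subset of valid values, refill them, and score the estimates."""
    rng = random.Random(seed)
    valid = [i for i, v in enumerate(series) if v is not None]
    interior = valid[1:-1]  # keep the first and last valid values as anchors
    hidden = rng.sample(interior, max(2, int(holdout_fraction * len(interior))))
    masked = list(series)
    for i in hidden:
        masked[i] = None
    filled = linear_fill(masked)
    obs = [series[i] for i in hidden]
    est = [filled[i] for i in hidden]
    return pearson_r(obs, est), rmse(obs, est)


# Synthetic daily SM series (m3 m-3): a seasonal cycle with a few original gaps.
# This is made-up data for illustration, not CCI SM data.
sm = [0.25 + 0.10 * math.sin(2 * math.pi * d / 365) for d in range(365)]
for d in (40, 41, 42, 200):
    sm[d] = None

r, err = holdout_score(sm)
print(f"R = {r:.3f}, RMSE = {err:.4f} m3 m-3")
```

On a smooth synthetic series like this one, linear interpolation scores very well; the abstract's point is that such interpolation degrades for large gaps, which is where a model driven by auxiliary inputs such as SMt-1 would be swapped in for `linear_fill`.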

https://doi.org/10.1016/j.rse.2021.112377