RESEARCH PRODUCT

Hyperspectral dimensionality reduction for biophysical variable statistical retrieval

Jordi Muñoz-Marí, Jose Moreno, Gustau Camps-Valls, Jochem Verrelst, Juan Pablo Rivera-Caicedo

subject

010504 meteorology & atmospheric sciences; Mean squared error; Computer science; 0211 other engineering and technologies; 02 engineering and technology; 01 natural sciences; Linear regression; Computers in Earth Sciences; Engineering (miscellaneous); Gaussian process; HyMap; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Data stream mining; Dimensionality reduction; Hyperspectral imaging; Pattern recognition; Atomic and Molecular Physics and Optics; Computer Science Applications; Kernel (statistics); Data mining; Artificial intelligence

description

Abstract: Current and upcoming airborne and spaceborne imaging spectrometers produce vast hyperspectral data streams. This scenario calls for automated and optimized spectral dimensionality reduction techniques to enable fast and efficient hyperspectral data processing, such as inferring vegetation properties. In preparation for next-generation biophysical variable retrieval methods applicable to hyperspectral data, we present an evaluation of 11 dimensionality reduction (DR) methods in combination with advanced machine learning regression algorithms (MLRAs) for statistical variable retrieval. The predictive power of DR + MLRA combinations for retrieving leaf area index (LAI) was analyzed on two hyperspectral datasets: (1) a simulated PROSAIL reflectance dataset (2101 bands), and (2) a field dataset from airborne HyMap data (125 bands). For the majority of MLRAs, applying a DR method first leads to superior retrieval accuracies and substantial gains in processing speed compared with feeding all bands into the regression algorithm. This was especially noticeable for the PROSAIL dataset: in the most extreme case, using classical linear regression (LR), the cross-validation results $R^2_{\mathrm{CV}}$ ($\mathrm{RMSE}_{\mathrm{CV}}$) improved from 0.06 (12.23) without a DR method to 0.93 (0.53) when LR was combined with one of the best-performing DR methods (i.e., CCA or OPLS). However, these DR methods no longer excelled when applied to noisy or real sensor data such as HyMap. In that case, the combination of kernel CCA (KCCA) with LR, or of a classical PCA and PLS with an MLRA, showed more robust performance ($R^2_{\mathrm{CV}}$ of 0.93). Gaussian process regression (GPR) uncertainty estimates revealed that LAI maps from models trained in combination with a DR method can have lower uncertainties than maps from models using all HyMap bands. The obtained results demonstrate that, in general, biophysical variable retrieval from hyperspectral data can benefit substantially from dimensionality reduction in both accuracy and computational efficiency.
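The DR + regression idea described in the abstract can be illustrated with a minimal sketch: project the spectra onto a few PCA components, then fit linear regression on the reduced features. This is not the authors' implementation or data. The synthetic spectra, the function names (`make_synthetic_spectra`, `evaluate_dr_plus_lr`), the 200-band size, the three latent factors, and the simple hold-out split (used here in place of cross-validation) are all assumptions made for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the column mean and top principal axes of X (rows = samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def make_synthetic_spectra(n_samples=120, n_bands=200, seed=0):
    """Hypothetical data: many correlated bands driven by 3 latent factors."""
    rng = np.random.default_rng(seed)
    weights = np.array([1.0, -2.0, 0.5])  # assumed link from factors to target
    latent = rng.normal(size=(n_samples, len(weights)))
    loadings = rng.normal(size=(len(weights), n_bands))
    X = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_bands))
    y = latent @ weights + 0.1 * rng.normal(size=n_samples)
    return X, y

def evaluate_dr_plus_lr(n_components=3, n_train=80, seed=0):
    """Fit PCA + LR on a training split and return hold-out R^2."""
    X, y = make_synthetic_spectra(seed=seed)
    X_train, X_test = X[:n_train], X[n_train:]
    y_train, y_test = y[:n_train], y[n_train:]
    # Fit the projection on training data only to avoid leakage.
    mean, axes = pca_fit(X_train, n_components)
    Z_train = (X_train - mean) @ axes.T
    Z_test = (X_test - mean) @ axes.T
    coef = fit_linear(Z_train, y_train)
    return Z_train.shape, r2_score(y_test, predict_linear(coef, Z_test))

if __name__ == "__main__":
    shape, r2 = evaluate_dr_plus_lr()
    print(f"reduced training shape: {shape}, hold-out R^2: {r2:.3f}")
```

The regression here sees 3 features instead of 200 bands, which is the source of the speed gain the abstract mentions. PCA is only one of the DR methods named in the abstract; supervised projections such as CCA or OPLS would require a different fitting step.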

https://doi.org/10.1016/j.isprsjprs.2017.08.012