Active Learning Methods for Efficient Hybrid Biophysical Variable Retrieval
Jordi Muñoz-Marí, Jose Moreno, Sara Dethier, Gustau Camps-Valls, Juan Pablo Rivera, Jochem Verrelst
subject
Signal Processing (eess.SP); FOS: Computer and information sciences; 010504 meteorology & atmospheric sciences; Computer science; Active learning (machine learning); Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; 0211 other engineering and technologies; 02 engineering and technology; Machine learning; 01 natural sciences; Data modeling; Set (abstract data type); Kernel (linear algebra); FOS: Electrical engineering, electronic engineering, information engineering; Electrical Engineering and Systems Science - Signal Processing; Electrical and Electronic Engineering; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Training set; Image and Video Processing (eess.IV); Sampling (statistics); Electrical Engineering and Systems Science - Image and Video Processing; Geotechnical Engineering and Engineering Geology; Data set; Kernel (statistics); Data mining; Artificial intelligence
description
Kernel-based machine learning regression algorithms (MLRAs) are potentially powerful methods for implementation in operational biophysical variable retrieval schemes. However, they have difficulty coping with large training data sets. With the increasing amount of optical remote sensing data available for analysis, and the possibility of using large amounts of simulated data from radiative transfer models (RTMs) to train kernel MLRAs, efficient data reduction techniques need to be implemented. Active learning (AL) methods make it possible to select the most informative samples in a data set. This letter introduces six AL methods for achieving optimized biophysical variable estimation with a manageable training data set, and describes their implementation in a Matlab-based MLRA toolbox for semiautomatic use. The AL methods were evaluated on how efficiently they improve the estimation accuracy of leaf area index and chlorophyll content based on PROSAIL simulations. Each of the implemented methods outperformed random sampling, improving retrieval accuracy at lower sampling rates. In practice, AL methods open opportunities to feed advanced MLRAs with RTM-generated training data for the development of operational retrieval models.
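The sample-selection idea in the description can be sketched in a few lines. The Python example below is only an illustration under stated assumptions, not the letter's Matlab toolbox. It uses a simple distance-based diversity criterion, which may or may not correspond to one of the six AL methods. The function names (`select_diverse`, `active_learning`) are hypothetical, and the data are synthetic stand-ins rather than PROSAIL simulations. At each iteration, the pool samples lying farthest from the current training set are moved into it, so the training set grows while staying small.

```python
import math
import random


def min_distance_to_set(x, reference):
    """Smallest Euclidean distance from sample x to any sample in reference."""
    return min(math.dist(x, r) for r in reference)


def select_diverse(train_X, pool_X, n_add):
    """Return indices of the n_add pool samples farthest from the training set.

    Hypothetical helper: a distance-based diversity criterion, shown only to
    illustrate how an AL method might rank candidate samples.
    """
    scores = [min_distance_to_set(x, train_X) for x in pool_X]
    ranked = sorted(range(len(pool_X)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_add]


def active_learning(train_X, train_y, pool_X, pool_y, n_iter, n_add):
    """Grow the training set by moving selected samples out of the pool."""
    train_X, train_y = list(train_X), list(train_y)
    pool_X, pool_y = list(pool_X), list(pool_y)
    for _ in range(n_iter):
        if not pool_X:
            break
        chosen = select_diverse(train_X, pool_X, n_add)
        # Pop in descending index order so the remaining indices stay valid.
        for i in sorted(chosen, reverse=True):
            train_X.append(pool_X.pop(i))
            train_y.append(pool_y.pop(i))
        # A real workflow would retrain the regressor here and track accuracy.
    return train_X, train_y


# Synthetic stand-in for RTM simulations (assumption: NOT PROSAIL output).
rng = random.Random(0)
X = [(rng.random(), rng.random()) for _ in range(200)]
y = [math.sin(3 * a) + b for a, b in X]

X_small, y_small = active_learning(X[:5], y[:5], X[5:], y[5:], n_iter=10, n_add=2)
print(len(X_small))
```

In a fuller experiment, the regressor would be retrained after each iteration and its error compared against a run that adds the same number of samples at random, which is the baseline the description refers to.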
year | journal |
---|---|
2016-07-01 | IEEE Geoscience and Remote Sensing Letters |