RESEARCH PRODUCT

Comparison of machine learning models for gully erosion susceptibility mapping

Yang Li, Wei Chen, Marco Loche, Alireza Arabameri, Luigi Lombardo, Xia Zhao, Biswajeet Pradhan, Artemi Cerdà, Dieu Tien Bui

subject

Watershed; 010504 meteorology & atmospheric sciences; Computer science; Bivariate analysis; Logistic model tree model; 010502 geochemistry & geophysics; Machine learning; computer.software_genre; 01 natural sciences; Logistic model tree; Natural hazard; Entropy (information theory); Oil erosion; 0105 earth and related environmental sciences; business.industry; lcsh:QE1-996.5; Statistical model; GIS; lcsh:Geology; ITC-ISI-JOURNAL-ARTICLE; General Earth and Planetary Sciences; Alternating decision tree; Alternating decision tree model; Artificial intelligence; ITC-GOLD; business; computer; Decision tree model

description

© 2019 China University of Geosciences (Beijing) and Peking University

Gully erosion is a disruptive phenomenon that extensively affects the Iranian territory, especially the northern provinces. A number of recent studies have sought to understand this process, to predict it over space and ultimately, as part of a broader national effort, to limit its negative effects on local communities. We focused on the Bastam watershed, where 9.3% of the surface is currently affected by gullying. Machine learning algorithms are currently under close scrutiny across the geomorphological community because of their high predictive ability. However, unlike bivariate statistical models, their structure does not provide intuitive and quantifiable measures of the environmental preconditioning factors. To cope with this weakness, we interpreted the preconditioning causes using a bivariate approach, namely the Index of Entropy. We then performed the susceptibility mapping by testing three extensions of a decision tree model: Alternating Decision Tree (ADTree), Naïve-Bayes tree (NBTree), and Logistic Model Tree (LMT). We dichotomized the gully information over space into gully presence/absence conditions, which we further explored in the calibration and validation stages. Because the presence/absence information and the associated factors are identical across models, the resulting differences are due only to the algorithmic structures of the three models we chose. These differences are not significant in terms of performance; in fact, the three models produce outstanding predictive AUC values (ADTree = 0.922; NBTree = 0.939; LMT = 0.944). However, the associated maps depict very different patterns, and only the LMT yields reasonable susceptibility patterns. This is a strong indication of which model best combines performance and mapping for any natural hazard-oriented application.
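
The AUC values quoted in the description summarize how well each model ranks gully-presence locations above gully-absence locations. As a minimal sketch of that measure only (not the authors' workflow or software), AUC can be computed from dichotomized labels and susceptibility scores with the rank-based pairwise identity. The function name `auc` and the sample values are hypothetical and chosen for illustration; they are not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 = gully present, 0 = gully absent
    scores: susceptibility values, higher = more susceptible
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence sample")
    # Count presence/absence pairs that are ranked correctly;
    # a tie between a presence and an absence score counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation samples (illustrative only, not from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc(labels, scores))
```

In this toy sample, 8 of the 9 presence/absence pairs are ordered correctly, so the function returns 8/9. Because the measure depends only on ranking, two models with similar AUC can still produce visibly different susceptibility maps, which is the contrast the description draws.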

10.1016/j.gsf.2019.11.009
http://www.sciencedirect.com/science/article/pii/S1674987119302294