RESEARCH PRODUCT

Performance assessment of individual and ensemble data-mining techniques for gully erosion modeling

Aiding Kornejady, Hamid Reza Pourghasemi, Artemi Cerdà, Saleh Yousefi

subject

Environmental Engineering; Sòls Erosió; 010504 meteorology & atmospheric sciences; Ensemble forecasting; Principle of maximum entropy; 010501 environmental sciences; computer.software_genre; 01 natural sciences; Pollution; Stability (probability); Support vector machine; Goodness of fit; Robustness (computer science); Statistics; Range (statistics); Environmental Chemistry; Data mining; Waste Management and Disposal; computer; 0105 earth and related environmental sciences; Mathematics; Statistical hypothesis testing

description

Gully erosion is recognized as an important sediment source in a range of environments and plays a decisive role in the redistribution of eroded soils on a slope. Hence, mapping the spatial occurrence pattern of this phenomenon is very important. Different ensemble models and their single counterparts, mostly data mining methods, have been used for gully erosion susceptibility mapping; however, their calibration and validation procedures need to be thoroughly addressed. The current study presents a series of individual and ensemble data mining methods, including artificial neural network (ANN), support vector machine (SVM), maximum entropy (ME), ANN-SVM, ANN-ME, and SVM-ME, to map gully erosion susceptibility in the Aghemam watershed, Iran. To this end, a gully inventory map along with sixteen gully conditioning factors was used. Randomly partitioned training and test sets in a 70:30 ratio were used to assess the goodness-of-fit and predictive power of the models. Robustness, defined as the stability of a model's performance in response to changes in the dataset, was assessed through three training/test replicates. Preliminary statistical tests showed that ANN has the highest concordance and spatial differentiation, with a chi-square value of 36,656 at the 95% confidence level, while ME appeared to have the lowest concordance (1,772). The ME model gave an impractical result in which 45% of the study area was classified as highly susceptible to gullying; in contrast, ANN-SVM gave a practical result, focusing on only 34% of the study area. Across all three replicates, the ANN-SVM ensemble showed the highest goodness-of-fit and predictive power, with respective average values of 0.897 (area under the success rate curve) and 0.879 (area under the prediction rate curve), and correspondingly the highest robustness. This attests to the important role of ensemble modeling in building accurate and generalized models and emphasizes the need to examine different model integrations. The results of this study can provide an outline for further biophysical designs on gullies scattered across the study area.
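
The validation protocol described above (70:30 random partitions, three replicates, area under the curve on the training and test parts, and a two-model ensemble) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the ensemble is assumed to be a simple mean of two models' susceptibility scores (the abstract does not state the combination rule), robustness is summarized here as the range of AUC values across replicates, and the scores are treated as precomputed, whereas a real workflow would refit each model on every training partition.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def split_70_30(n_samples, seed):
    """Return (train_idx, test_idx) for one random 70:30 partition."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.7 * n_samples))
    return idx[:cut], idx[cut:]


def ensemble_mean(scores_a, scores_b):
    """Assumed combination rule: average the two models' scores per sample."""
    return [(a + b) / 2.0 for a, b in zip(scores_a, scores_b)]


def evaluate_replicates(labels, scores, seeds=(0, 1, 2)):
    """Per replicate, return (AUC on the 70% part, AUC on the 30% part).

    The first value stands in for the success-rate AUC (goodness-of-fit),
    the second for the prediction-rate AUC (predictive power).
    """
    results = []
    for seed in seeds:
        train_idx, test_idx = split_70_30(len(labels), seed)
        fit = auc([labels[i] for i in train_idx], [scores[i] for i in train_idx])
        pred = auc([labels[i] for i in test_idx], [scores[i] for i in test_idx])
        results.append((fit, pred))
    return results


def robustness_range(values):
    """Spread of a metric across replicates; smaller means more stable."""
    return max(values) - min(values)
```

Under these assumptions, two models would be compared by passing each score list (or their `ensemble_mean`) to `evaluate_replicates` and then comparing the average AUC values and the `robustness_range` of the test-part AUCs.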

https://doi.org/10.1016/j.scitotenv.2017.07.198