
RESEARCH PRODUCT

Assessing the performance of GIS-based machine learning models with different accuracy measures for determining susceptibility to gully erosion

Christian Conoscenti, Younes Garosi, Mohsen Sheklabadi, Hamid Reza Pourghasemi, Kristof Van Oost

subject

Environmental Engineering; 010504 meteorology & atmospheric sciences; Mean squared error; Settore GEO/04 - Geografia Fisica E Geomorfologia; 010501 environmental sciences; Machine learning; 01 natural sciences; Normalized Difference Vegetation Index; Cohen's kappa; Machine learning model; Discrimination; Environmental Chemistry; Gully erosion susceptibility; Digital elevation model; Waste Management and Disposal; Latin hypercube sampling technique (cLHS); 0105 earth and related environmental sciences; Mathematics; Receiver operating characteristic; Topographic attribute; Generalized additive model; Reliability; Pollution; Random forest; Support vector machine; Artificial intelligence

description

Younes Garosi (a), Mohsen Sheklabadi (a, corresponding author), Christian Conoscenti (b), Hamid Reza Pourghasemi (c, d), Kristof Van Oost (e, f)

a: Faculty of Agriculture, Department of Soil Science, Bu Ali Sina University, Ahmadi Roshan Avenue, 6517838695 Hamedan, Iran
b: Department of Earth and Sea Sciences (DISTEM), University of Palermo, Via Archirafi 22, 90123 Palermo, Italy
c: College of Marine Sciences and Engineering, Nanjing Normal University, Nanjing, 210023, China
d: Department of Natural Resources and Environmental Engineering, College of Agriculture, Shiraz University, Shiraz, Iran
e: A- Fonds de la recherche Scientifique, FNRS, Rue d'Egmont 5, 1000 Brussels, Belgium
f: B- TECLIM - Georges Lemaître Centre for Earth and Climate Research, Université catholique de Louvain, BE-1348 Louvain-la-Neuve, Belgium

Highlights
• Model performance was assessed in terms of discrimination capacity and reliability.
• The RF model had the highest performance of the models used to create the GESM.
• Model robustness was stable when the calibration-validation sets were changed.
• All machine learning models were found to be suitable for producing a reliable GESM.

Article history: Received 26 November 2018; received in revised form 5 February 2019; accepted 5 February 2019; available online 6 February 2019. Editor: Ouyang Wei.

Abstract
The main purpose was to compare the discrimination and reliability of four machine learning models for creating a gully erosion susceptibility map (GESM) in a part of the Ekbatan Dam Basin, Hamedan, western Iran. Extensive field surveys using GPS and the visual interpretation of satellite images were used to prepare a digital map of the spatial distribution of gullies. 130 locations were sampled to elucidate the spatial distribution of the soil surface properties. Topographic attributes were derived from a digital elevation model (DEM). The land use and normalized difference vegetation index (NDVI) maps were created from satellite imagery. The functional relationships between gully erosion and controlling factors were calculated using the random forest (RF), support vector machine (SVM), Naïve Bayes (NB), and generalized additive model (GAM) models. The performance of the models was evaluated by 10-fold cross-validation based on efficiency, Kappa coefficient, area under the receiver operating characteristic (ROC) curve (AUC), mean absolute error (MAE), and root mean square error (RMSE). The results showed that the RF model had the highest efficiency, Kappa coefficient, and AUC and the lowest MAE and RMSE compared with SVM, NB, and GAM. The RF model showed the highest predictive performance (mean AUC = 92.4%), followed by the SVM (mean AUC = 90.9%), GAM (mean AUC = 89.9%), and NB (mean AUC = 87.2%) models. Overall accuracy of the models ranged from the excellent (NB, GAM) to the outstanding (RF, SVM) class. The capacity of all models for creating a GESM was quite stable when the calibration and validation samples were changed through the 10-fold cross-validation technique. According to the variable importance analysis performed by the RF model, the most important variables are distance from rivers, calcium carbonate equivalent (CCE), and topographic position index (TPI). The obtained maps can help identify areas at risk of gully erosion and facilitate the implementation of plans for soil conservation and sustainable management.
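The accuracy measures named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes binary labels (1 = gully, 0 = non-gully), that "efficiency" means overall classification accuracy at a 0.5 probability threshold, and that MAE and RMSE are computed between the labels and the predicted probabilities. All function names and the toy data are hypothetical.

```python
import math
import random

def efficiency(y_true, y_pred):
    """Overall accuracy: fraction of correctly classified samples (assumed meaning)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for binary labels: agreement corrected for chance."""
    n = len(y_true)
    po = efficiency(y_true, y_pred)
    # Chance agreement from the marginal class frequencies of both label lists.
    pe = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in (0, 1))
    return (po - pe) / (1 - pe)

def auc(y_true, scores):
    """Area under the ROC curve as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def mae(y_true, scores):
    """Mean absolute error between labels and predicted probabilities."""
    return sum(abs(t - s) for t, s in zip(y_true, scores)) / len(y_true)

def rmse(y_true, scores):
    """Root mean square error between labels and predicted probabilities."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(y_true, scores)) / len(y_true))

def kfold_indices(n, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Toy data (hypothetical): observed labels and model-predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("efficiency:", efficiency(y_true, y_pred))
print("kappa:     ", cohen_kappa(y_true, y_pred))
print("AUC:       ", auc(y_true, scores))
print("MAE:       ", mae(y_true, scores))
print("RMSE:      ", rmse(y_true, scores))
```

In a 10-fold setting, each fold returned by `kfold_indices` would serve once as the validation set while the remaining samples calibrate the model, and the measures would be averaged across folds. The stability of those per-fold values is one way to read the robustness claim in the abstract.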

10.1016/j.scitotenv.2019.02.093
http://hdl.handle.net/10447/341643