RESEARCH PRODUCT

Comparing Boosting and Bagging for Decision Trees of Rankings

Antonella Plaia, Johannes Fürnkranz, Eneldo Loza Mencía, Simona Buscemi

subject

Ordinal data; Boosting (machine learning); Preference learning; Ensemble methods; Computer science; Decision tree learning; Decision trees; Decision tree; Library and Information Sciences; Machine learning; Ensemble learning; Boosting; Mathematics (miscellaneous); Ranking; Pattern recognition (psychology); Psychology (miscellaneous); Artificial intelligence; Statistics, Probability and Uncertainty; Rankings

description

Abstract: Decision tree learning is among the most popular and most traditional families of machine learning algorithms. While these techniques are intuitive and interpretable, they also suffer from instability: small perturbations in the training data may result in large changes in the predictions. Ensemble methods combine the output of multiple trees, which makes the decision more reliable and stable. They have been applied primarily to numeric prediction problems and to classification tasks. In recent years, some attempts to extend ensemble methods to ordinal data have appeared in the literature, but no concrete methodology has been provided for preference data. In this paper, we extend decision trees, and subsequently ensemble methods, to ranking data. In particular, we propose a theoretical and computational definition of bagging and boosting, two of the best-known ensemble methods. In an experimental study using simulated data and real-world datasets, our results confirm that known results from classification, such as boosting outperforming bagging, carry over to the ranking case.

10.1007/s00357-021-09397-2
http://hdl.handle.net/10447/518398