6533b85efe1ef96bd12c0305
RESEARCH PRODUCT
Ensemble methods for ranking data with and without position weights
Simona Buscemisubject
ranking databoostingweighted Kemeny distancebaggingSettore SECS-S/01 - Statisticalinear mixed modelensemble methoddescription
The main goal of this Thesis is to build suitable Ensemble Methods for ranking data with weights assigned to the items’positions, in the cases of rankings with and without ties. The Thesis begins with the definition of a new rank correlation coefficient, able to take into account the importance of items’position. Inspired by the rank correlation coefficient, τ x , proposed by Emond and Mason (2002) for unweighted rankings and the weighted Kemeny distance proposed by García-Lapresta and Pérez-Román (2010), this work proposes τ x w , a new rank correlation coefficient corresponding to the weighted Kemeny distance. The new coefficient is analized analitically and empirically and represents the main core of the consensus ranking process. Simulations and applications to real cases are presented. In a second step, in order to detect which predictors better explain a phenomenon, the Thesis proposes decision trees for ranking data with and without weights, discussing and comparing the results. A simulation study is built up, showing the impact of different structures of weights on the ability of decision trees to describe data. In the third part, ensemble methods for ranking data, more specifically Bagging and Boosting, are introduced. Last but not least, a review on a different topic is inserted in this Thesis. The review compares a significant number of linear mixed model selection procedures available in the literature. The review represents the answer to a pressing issue in the framework of LMMs: how to identify the best approach to adopt in a specific case. The work outlines mainly all approaches found in literature. This review represents my first academic training in making research. The main goal of this Thesis is to build suitable Ensemble Methods for ranking data with weights assigned to the items’positions, in the cases of rankings with and without ties. The Thesis begins with the definition of a new rank correlation coefficient, able to take into account the importance of items’position. Inspired by the rank correlation coefficient, τ x , proposed by Emond and Mason (2002) for unweighted rankings and the weighted Kemeny distance proposed by García-Lapresta and Pérez-Román (2010), this work proposes τ x w , a new rank correlation coefficient corresponding to the weighted Kemeny distance. The new coefficient is analized analitically and empirically and represents the main core of the consensus ranking process. Simulations and applications to real cases are presented. In a second step, in order to detect which predictors better explain a phenomenon, the Thesis proposes decision trees for ranking data with and without weights, discussing and comparing the results. A simulation study is built up, showing the impact of different structures of weights on the ability of decision trees to describe data. In the third part, ensemble methods for ranking data, more specifically Bagging and Boosting, are introduced. Last but not least, a review on a different topic is inserted in this Thesis. The review compares a significant number of linear mixed model selection procedures available in the literature. The review represents the answer to a pressing issue in the framework of LMMs: how to identify the best approach to adopt in a specific case. The work outlines mainly all approaches found in literature. This review represents my first academic training in making research.
year | journal | country | edition | language |
---|---|---|---|---|
2020-02-17 |