Last modified: 2018-05-17
Abstract
Preference data are a particular type of ranking data (widely used in sports, web search, and the social sciences), in which a group of people express their preferences over a set of alternatives. Within this framework, distance-based decision trees are a non-parametric tool for identifying the profiles of subjects who give similar rankings.
This paper aims to assess, in the framework of (complete and incomplete) ranking data, the impact of differently structured weighted distances on the construction of decision trees. Traditional metrics between rankings do not take into account the importance of swapping elements that are similar to each other (element weights) or elements at the top (or the bottom) of an ordering (position weights). By means of simulations, using weighted distances both to generate rankings and to build decision trees, we evaluate the impact of different weighting systems on both splitting and the consensus ranking. The distances used satisfy Kemeny's axioms and, accordingly, the rank correlation coefficient τ_x, proposed by Emond and Mason, is used both for pruning the trees and for assessing their "goodness".
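As a point of reference for the unweighted case, the following is a minimal sketch of the Kemeny distance and of the τ_x coefficient of Emond and Mason between two rankings. It is not the weighted distance studied in the paper: element and position weights are omitted. The function names and the input convention (a list giving the rank of each item, with equal values denoting ties) are assumptions made for illustration.

```python
def kemeny_scores(ranks):
    """Kemeny score matrix: +1 if item i is preferred to j, -1 if j is preferred to i, 0 if tied."""
    n = len(ranks)
    return [[0 if ranks[i] == ranks[j] else (1 if ranks[i] < ranks[j] else -1)
             for j in range(n)] for i in range(n)]


def tau_x_scores(ranks):
    """Emond-Mason score matrix: ties are scored +1 instead of 0; the diagonal is 0."""
    n = len(ranks)
    return [[0 if i == j else (1 if ranks[i] <= ranks[j] else -1)
             for j in range(n)] for i in range(n)]


def kemeny_distance(r, s):
    """Half the sum of absolute differences between the two Kemeny score matrices."""
    a, b = kemeny_scores(r), kemeny_scores(s)
    return sum(abs(x - y) for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b)) / 2


def tau_x(r, s):
    """Rank correlation tau_x: sum of elementwise score products divided by n(n-1)."""
    n = len(r)
    a, b = tau_x_scores(r), tau_x_scores(s)
    total = sum(x * y for row_a, row_b in zip(a, b)
                for x, y in zip(row_a, row_b))
    return total / (n * (n - 1))


# Ranks of four items under two orderings (hypothetical data).
r = [1, 2, 3, 4]
s = [2, 1, 3, 4]   # the first two items are swapped
d = kemeny_distance(r, s)
t = tau_x(r, s)
```

With this scaling the two quantities are linked by τ_x = 1 − 2d / (n(n − 1)), so the correlation falls from 1 to −1 as the distance grows from 0 to its maximum n(n − 1).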