Abstract
Labels representing value judgements are commonly elicited using an interval scale of absolute values. Data collected in this manner are not always reliable. Psychologists have long recognized a number of biases to which many human raters are prone, and which result in disagreement among raters as to the true gold-standard rating of any particular object. We hypothesize that the issues arising from rater bias may be mitigated by treating the data received as an ordered set of preferences rather than as a collection of absolute values. We experiment on real-world and artificially generated data, finding that treating label ratings as ordinal, rather than interval, data results in increased inter-rater reliability. This finding has the potential to improve the efficiency of data collection for applications such as Top-N recommender systems, where we are primarily interested in the ranked order of items rather than the absolute scores they have been assigned.
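The contrast between the two treatments can be made concrete with a small sketch. The snippet below is illustrative only and does not reproduce the paper's analysis or data: it takes hypothetical 1-5 ratings from two raters, measures agreement on the absolute scores (interval treatment, via Pearson correlation), and then derives each rater's pairwise preferences and measures agreement on those (ordinal treatment). The rater data, the `preferences` helper, and the choice of agreement statistics are assumptions made for illustration.

```python
# A minimal sketch (not the paper's exact analysis) of reading the same ratings
# two ways: as interval values, and as the set of pairwise preferences they imply.
# All ratings below are hypothetical.
from itertools import combinations

import numpy as np
from scipy.stats import pearsonr

# Hypothetical 1-5 ratings of the same eight items by two raters.
rater_a = np.array([2, 2, 3, 3, 3, 4, 4, 5])
rater_b = np.array([1, 1, 1, 2, 4, 5, 5, 5])

# Interval treatment: agreement on the absolute scores.
interval_r, _ = pearsonr(rater_a, rater_b)

# Ordinal treatment: derive each rater's pairwise preferences and count the
# item pairs on which the two raters agree.
def preferences(ratings):
    """Map ratings to a dict {(i, j): 'i', 'j' or 'tie'} over item pairs."""
    prefs = {}
    for i, j in combinations(range(len(ratings)), 2):
        if ratings[i] > ratings[j]:
            prefs[(i, j)] = "i"
        elif ratings[i] < ratings[j]:
            prefs[(i, j)] = "j"
        else:
            prefs[(i, j)] = "tie"
    return prefs

prefs_a, prefs_b = preferences(rater_a), preferences(rater_b)
shared = [pair for pair in prefs_a if prefs_a[pair] == prefs_b[pair]]
pairwise_agreement = len(shared) / len(prefs_a)

print(f"interval treatment (Pearson r):      {interval_r:.3f}")
print(f"ordinal treatment  (pair agreement): {pairwise_agreement:.3f}")
```

In the ordinal treatment, a rater who is systematically harsher or who compresses part of the scale can still produce the same pairwise preferences as another rater, which is the intuition behind expecting higher inter-rater reliability under that reading.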
| Original language | English |
| --- | --- |
| Pages (from-to) | 24-29 |
| Number of pages | 6 |
| Journal | CEUR Workshop Proceedings |
| Volume | 1884 |
| Publication status | Published - 2017 |
| Event | 4th Joint Workshop on Interfaces and Human Decision Making for Recommender Systems, IntRS 2017, Como, Italy. Duration: 27 Aug 2017 → … |