Abstract
One of the biggest challenges in crowdsourcing is detecting noisy and incompetent workers. One way to handle this problem is to estimate the reliability of workers dynamically, as they work, and to accept only those workers deemed reliable to date. Although many approaches to the dynamic estimation of rater reliability exist, they are often appropriate only for very specific categories of tasks, such as binary classification. They can also make unrealistic assumptions, such as requiring access to a large number of gold standard answers or assuming that every rater is always available. In this paper, we propose a novel approach to the dynamic estimation of rater reliability in regression (DER3) using multi-armed bandits. The approach is specifically suited to real-life crowdsourcing scenarios in which the task is to label or rate corpora for use in supervised machine learning and the annotations are continuous ratings, although it can easily be generalised to multi-class or binary classification tasks. We demonstrate that DER3 produces high-accuracy results while keeping the cost of the rating process low. Although our main motivating example is the recognition of emotion in speech, our approach shows similar results in other application areas.
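The abstract does not specify the bandit strategy or the reward signal, so the following is only a rough sketch of the general idea, not the DER3 algorithm itself. It assumes an epsilon-greedy policy in which each rater is an arm, and it assumes that a rater's reward is the negative absolute deviation of their rating from the mean rating of the raters queried for the same item, so no gold standard answers are needed. All names (`select_raters`, `update`, `simulate`) and parameter values are hypothetical.

```python
import random


def select_raters(mean_reward, counts, k, epsilon, rng):
    """Choose k distinct raters for the next item (assumed epsilon-greedy policy)."""
    raters = list(range(len(mean_reward)))
    if rng.random() < epsilon:
        # Explore: query a random subset of raters.
        return rng.sample(raters, k)
    # Exploit: untried raters first, then those with the highest estimated reward.
    ranked = sorted(raters, key=lambda r: (counts[r] > 0, -mean_reward[r]))
    return ranked[:k]


def update(mean_reward, counts, rater, reward):
    """Incrementally update the running mean reward of one rater."""
    counts[rater] += 1
    mean_reward[rater] += (reward - mean_reward[rater]) / counts[rater]


def simulate(noise_levels, n_items=300, k=3, epsilon=0.1, seed=0):
    """Simulate continuous ratings from raters with different noise levels.

    noise_levels[r] is the (assumed) standard deviation of rater r's error.
    Returns the estimated mean reward and the number of queries per rater.
    """
    rng = random.Random(seed)
    n_raters = len(noise_levels)
    mean_reward = [0.0] * n_raters
    counts = [0] * n_raters
    for _ in range(n_items):
        true_value = rng.uniform(0.0, 1.0)  # hidden target, never shown to the bandit
        chosen = select_raters(mean_reward, counts, k, epsilon, rng)
        ratings = {r: true_value + rng.gauss(0.0, noise_levels[r]) for r in chosen}
        consensus = sum(ratings.values()) / len(ratings)
        for rater, rating in ratings.items():
            # Assumed reward: closeness to the consensus of the queried raters.
            update(mean_reward, counts, rater, -abs(rating - consensus))
    return mean_reward, counts


if __name__ == "__main__":
    # Three fairly consistent raters and two noisy ones (illustrative values).
    estimates, queries = simulate([0.05, 0.05, 0.05, 0.8, 0.8])
    for rater, (estimate, n) in enumerate(zip(estimates, queries)):
        print(f"rater {rater}: estimated reward {estimate:.3f}, queried {n} times")
```

In this sketch the cost of the rating process corresponds to the total number of queries (`n_items * k`), and the policy tends to spend that budget on raters whose ratings have stayed close to the consensus so far. A consensus-based reward is only one possible choice and can be misled when most of the queried raters are noisy.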
Original language | English |
---|---|
Pages (from-to) | 6190-6210 |
Number of pages | 21 |
Journal | Expert Systems with Applications |
Volume | 41 |
Issue number | 14 |
Publication status | Published - 15 Oct 2014 |
Keywords
- Crowdsourcing
- Emotion recognition
- Multi-armed bandits
- Supervised machine learning
- Worker reliability