Abstract
Acquiring labels for large datasets can be a costly and time-consuming process. This has motivated the development of the semi-supervised learning problem domain, which makes use of unlabelled data, in conjunction with a small amount of labelled data, to infer the correct labels of a partially labelled dataset. Active learning is one of the most successful approaches to semi-supervised learning, and has been shown to reduce the cost and time taken to produce a fully labelled dataset. In this paper we present Activist, a free, online, state-of-the-art platform which leverages active learning techniques to improve the efficiency of dataset labelling. Using a simulated crowd-sourced label gathering scenario on a number of datasets, we show that the Activist software can speed up label acquisition and ultimately reduce its cost.
Original language | English |
---|---|
Pages (from-to) | 140-148 |
Number of pages | 9 |
Journal | CEUR Workshop Proceedings |
Volume | 1751 |
Publication status | Published - 2016 |
Event | 24th Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2016, Dublin, Ireland (20 Sep 2016 → 21 Sep 2016) |
Keywords
- labels
- datasets
- semi-supervised learning
- unlabelled data
- labelled data
- Active Learning
- Activist
- crowd-sourced label gathering
- label acquisition