Abstract
Neural network models have become increasingly popular for text classification in recent years. In particular, the use of word embeddings within deep learning architectures has attracted considerable attention from researchers. In this paper, we first examine how neural network models have been applied to text classification. Second, we extend our previous work [4, 3] using a neural network strategy for the task of abusive text detection. We compare word embedding features with traditional feature representations such as n-grams and handcrafted features. In addition, we use an off-the-shelf neural network classifier, FastText [16]. Based on our results, we conclude that: (1) extracting selected manual features can improve abusive content detection over using basic n-grams; (2) although averaging pre-trained word embeddings is a naive method, the distributed feature representation outperforms n-grams on most of our datasets; (3) while the FastText classifier is fast and efficient, its results are not remarkable, as it is a shallow neural network with only one hidden layer; (4) using pre-trained word embeddings does not guarantee better performance in the FastText classifier.
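The averaging of pre-trained word embeddings mentioned in conclusion (2) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors, the whitespace tokenisation, the zero-vector fallback for out-of-vocabulary text, and the names `TOY_EMBEDDINGS` and `average_embedding` are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical 3-dimensional "pre-trained" vectors; the values are illustrative only.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "you": [0.1, 0.3, -0.2],
    "are": [0.0, 0.1, 0.1],
    "awful": [-0.7, 0.5, 0.9],
}


def average_embedding(
    tokens: List[str], embeddings: Dict[str, List[float]], dim: int
) -> List[float]:
    """Return the element-wise mean of the vectors of all in-vocabulary tokens."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        # Assumption: a text with no known tokens maps to the zero vector.
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# One fixed-length feature vector per text, regardless of text length.
doc_vector = average_embedding("you are awful".split(), TOY_EMBEDDINGS, dim=3)
```

The resulting fixed-length vector could then be passed to any standard classifier in place of an n-gram count vector. Word order is discarded by the mean, which is one reason the abstract calls the method naive.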
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 258-260 |
Number of pages | 3 |
Journal | CEUR Workshop Proceedings |
Volume | 2086 |
Publication status | Published - 2017 |
Event | 25th Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2017, Dublin, Ireland (7 Dec 2017 → 8 Dec 2017) |
Keywords
- Neural network models
- text classification
- word embeddings
- deep learning architecture