Estimating distributed representation performance in disaster-related social media classification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper examines the effectiveness of a range of pre-trained language representations for determining the informativeness and information type of social media posts during natural or man-made disasters. Within the context of disaster tweet analysis, we aim to analyse tweets accurately while minimising both false positives and false negatives in the automated information analysis. The investigation is performed across a number of well-known disaster-related Twitter datasets. Models built from pre-trained Word2Vec, GloVe, ELMo and BERT embeddings are used for performance evaluation. Given the relative ubiquity of BERT as a standout language representation in recent times, it was expected that BERT would dominate the results. However, the results are more diverse, with the classical Word2Vec and GloVe both displaying strong results. As part of the analysis, we discuss some challenges related to automated Twitter analysis, including the fine-tuning of language models to disaster-related scenarios.
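The general idea of classifying tweets from pre-trained word embeddings, and of scoring a classifier by its false positives and false negatives, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-dimensional `TOY_VECTORS` stand in for real Word2Vec or GloVe vectors, the nearest-centroid rule is a simple substitute for whatever models the paper trains, and the labels `informative` and `not_informative` are hypothetical names. It is not the authors' method.

```python
from math import sqrt

# Hypothetical 3-dimensional vectors standing in for pre-trained embeddings.
TOY_VECTORS = {
    "flood": [0.9, 0.1, 0.0],
    "earthquake": [0.8, 0.2, 0.1],
    "rescue": [0.7, 0.3, 0.0],
    "trapped": [0.8, 0.1, 0.2],
    "coffee": [0.0, 0.9, 0.1],
    "movie": [0.1, 0.8, 0.3],
    "great": [0.1, 0.7, 0.2],
}


def embed(tweet):
    """Mean-pool the vectors of known tokens; zero vector if none are known."""
    vecs = [TOY_VECTORS[t] for t in tweet.lower().split() if t in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def centroid(tweets):
    """Average the tweet embeddings of one class."""
    vecs = [embed(t) for t in tweets]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def classify(tweet, centroids):
    """Return the label whose centroid is most similar to the tweet."""
    vec = embed(tweet)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one class, from TP, FP and FN counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical training tweets for each class.
CENTROIDS = {
    "informative": centroid(["flood rescue", "earthquake trapped"]),
    "not_informative": centroid(["great coffee", "great movie"]),
}
```

Calling `classify("people trapped by flood", CENTROIDS)` assigns the tweet to the class with the nearest centroid, and `precision_recall_f1` makes the trade-off between false positives (which lower precision) and false negatives (which lower recall) explicit.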

Original language: English
Title of host publication: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019
Editors: Francesca Spezzano, Wei Chen, Xiaokui Xiao
Publisher: Association for Computing Machinery, Inc
Pages: 723-727
Number of pages: 5
ISBN (Electronic): 9781450368681
DOIs
Publication status: Published - 27 Aug 2019
Event: 11th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019 - Vancouver, Canada
Duration: 27 Aug 2019 - 30 Aug 2019

Publication series

Name: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019

Conference

Conference: 11th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019
Country/Territory: Canada
City: Vancouver
Period: 27/08/19 - 30/08/19

Keywords

  • BERT
  • ELMo
  • Text classification
  • Twitter
  • Word embedding
