Inclusive Counterfactual Generation: Leveraging LLMs in Identifying Online Hate

M. Atif Qureshi, Arjumand Younus, Simon Caton

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Counterfactually augmented data has recently been proposed as a successful solution for socially situated NLP tasks such as hate speech detection. The chief component within the existing counterfactual data augmentation pipeline, however, involves manually flipping labels and making minimal content edits to training data. In a hate speech context, these forms of editing have been shown to still retain offensive hate speech content. Inspired by the recent success of large language models (LLMs), especially the development of ChatGPT, which have demonstrated improved language comprehension abilities, we propose an inclusivity-oriented approach to automatically generate counterfactually augmented data using LLMs. We show that hate speech detection models trained with LLM-produced counterfactually augmented data can outperform both state-of-the-art and human-based methods.

Original language: English
Title of host publication: Web Engineering - 24th International Conference, ICWE 2024, Proceedings
Editors: Kostas Stefanidis, Kari Systä, Maristella Matera, Sebastian Heil, Haridimos Kondylakis, Elisa Quintarelli
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 34-48
Number of pages: 15
ISBN (Print): 9783031623615
Publication status: Published - 2024
Event: 24th International Conference on Web Engineering, ICWE 2024 - Tampere, Finland
Duration: 17 Jun 2024 to 20 Jun 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14629 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Conference on Web Engineering, ICWE 2024
Country/Territory: Finland
City: Tampere
Period: 17/06/24 to 20/06/24

Keywords

  • ChatGPT
  • counterfactuals
  • inclusivity
  • model robustness
  • out-of-domain testing
