Measuring Gender Bias in Natural Language Processing: Incorporating Gender-Neutral Linguistic Forms for Non-Binary Gender Identities in Abusive Speech Detection

Nasim Sobhani, Kinshuk Sengupta, Sarah Jane Delany

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Predictions from Machine Learning models can reflect bias in the data on which they are trained. Gender bias has been shown to be prevalent in Natural Language Processing models. Research into identifying and mitigating gender bias in these models predominantly treats gender as binary, male and female, neglecting the fluidity and continuity of gender as a variable. In this paper, we present an approach to evaluating gender bias in a prediction task that recognises the non-binary nature of gender. We gender-neutralise a random subset of existing real-world hate speech data. We extend the existing template approach for measuring gender bias to include test examples that are gender-neutral. Measuring the bias across a selection of hate speech datasets, we show that the bias for the gender-neutral data is closer to that seen for test instances that identify as male than for those that identify as female.
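The gender-neutralisation step described in the abstract can be illustrated with a minimal sketch: mapping gendered English pronouns to singular "they" forms. The pronoun map, function name, and tokenisation below are illustrative assumptions, not the authors' exact procedure; a faithful implementation would also handle verb agreement (e.g. "she is" → "they are") and gendered nouns.

```python
import re

# Hypothetical pronoun map (an assumption, not the paper's exact mapping).
# Note: "her" is ambiguous (object vs. possessive); this crude sketch
# always maps it to the object form "them".
PRONOUN_MAP = {
    "he": "they", "she": "they",
    "him": "them", "her": "them",
    "his": "their", "hers": "theirs",
    "himself": "themself", "herself": "themself",
}

def gender_neutralise(text: str) -> str:
    """Replace gendered pronouns with gender-neutral equivalents."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        repl = PRONOUN_MAP[word.lower()]
        # Preserve the capitalisation of the original token.
        return repl.capitalize() if word[0].isupper() else repl

    pattern = r"\b(" + "|".join(PRONOUN_MAP) + r")\b"
    return re.sub(pattern, swap, text, flags=re.IGNORECASE)

print(gender_neutralise("She said he gave her his book."))
# → They said they gave them their book.
```

Applied to a random subset of real-world hate speech examples, a transformation along these lines yields the gender-neutral test instances whose bias scores are compared against the male- and female-identified ones.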

Original language: English
Title of host publication: International Conference Recent Advances in Natural Language Processing, RANLP 2023
Subtitle of host publication: Large Language Models for Natural Language Processing - Proceedings
Editors: Galia Angelova, Maria Kunilovskaya, Ruslan Mitkov
Publisher: Incoma Ltd
Pages: 1121-1131
Number of pages: 11
ISBN (Electronic): 9789544520922
DOIs
Publication status: Published - 2023
Event: 2023 International Conference Recent Advances in Natural Language Processing: Large Language Models for Natural Language Processing, RANLP 2023 - Varna, Bulgaria
Duration: 4 Sep 2023 - 6 Sep 2023

Publication series

Name: International Conference Recent Advances in Natural Language Processing, RANLP
ISSN (Print): 1313-8502

Conference

Conference: 2023 International Conference Recent Advances in Natural Language Processing: Large Language Models for Natural Language Processing, RANLP 2023
Country/Territory: Bulgaria
City: Varna
Period: 4/09/23 - 6/09/23
