Towards Fairer NLP Models: Handling Gender Bias In Classification Tasks

Nasim Sobhani, Sarah Jane Delany

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Measuring and mitigating gender bias in natural language processing (NLP) systems is crucial to ensure fair and ethical AI. However, a key challenge is the lack of explicit gender information in many textual datasets. This paper proposes two techniques, Identity Term Sampling (ITS) and Identity Term Pattern Extraction (ITPE), as alternatives to template-based approaches for measuring gender bias in text data. These approaches identify test data for measuring gender bias from the dataset itself and can be used to measure the gender bias of any NLP classifier. We demonstrate the use of these approaches across various NLP classification tasks, including hate speech detection, fake news identification, and sentiment analysis. Additionally, we show how these techniques can benefit gender bias mitigation and propose a variant of Counterfactual Data Augmentation (CDA), called Gender-Selective CDA (GS-CDA), which reduces the amount of data augmentation required in training data while effectively mitigating gender bias and maintaining overall classification performance.
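
To make the ideas named in the abstract concrete, the sketch below illustrates the general pattern behind counterfactual gender-term swapping and dataset-drawn bias test sets. It is a rough illustration only, not the authors' implementation: the term lexicon GENDER_PAIRS and the functions swap_gender_terms, identity_term_sample, and prediction_flip_rate are all hypothetical names, and the paper's actual ITS/ITPE procedures and GS-CDA selection criterion are not reproduced here.

import re

# Hypothetical, minimal lexicon of gendered term pairs; the paper's
# identity-term lists are not reproduced here.
GENDER_PAIRS = {
    "he": "she", "she": "he",
    "him": "her",
    "his": "her",   # English is not one-to-one: "her" maps back to "him"
    "her": "him",
    "man": "woman", "woman": "man",
    "men": "women", "women": "men",
    "boy": "girl", "girl": "boy",
}

_TERM_RE = re.compile(r"\b(" + "|".join(GENDER_PAIRS) + r")\b", re.IGNORECASE)

def swap_gender_terms(text):
    # Replace each gendered term with its counterpart, preserving
    # sentence-initial capitalisation. Real CDA systems use POS tagging
    # to disambiguate pronouns such as "her"; this sketch ignores that.
    def repl(m):
        swapped = GENDER_PAIRS[m.group(0).lower()]
        return swapped.capitalize() if m.group(0)[0].isupper() else swapped
    return _TERM_RE.sub(repl, text)

def identity_term_sample(texts):
    # ITS-style selection: draw bias-test instances from the dataset
    # itself (rather than from templates) by keeping only texts that
    # mention a gendered identity term.
    return [t for t in texts if _TERM_RE.search(t)]

def prediction_flip_rate(classify, texts):
    # Crude counterfactual bias probe: the fraction of sampled instances
    # whose predicted label changes when gendered terms are swapped.
    # `classify` is any text -> label callable.
    sample = identity_term_sample(texts)
    if not sample:
        return 0.0
    flips = sum(classify(t) != classify(swap_gender_terms(t)) for t in sample)
    return flips / len(sample)

In this framing, standard CDA would add a gender-swapped copy of every training instance that contains a gendered term, whereas GS-CDA, as the abstract describes it, augments only a selected subset, reducing the volume of added training data.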

Original language: English
Title of host publication: GeBNLP 2024 - 5th Workshop on Gender Bias in Natural Language Processing, Proceedings of the Workshop
Editors: Agnieszka Falenska, Christine Basta, Marta Costa-jussa, Seraphina Goldfarb-Tarrant, Debora Nozza
Publisher: Association for Computational Linguistics (ACL)
Pages: 167-178
Number of pages: 12
ISBN (Electronic): 9798891761377
Publication status: Published - 2024
Event: 5th Workshop on Gender Bias in Natural Language Processing, GeBNLP 2024, held in conjunction with the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 - Bangkok, Thailand
Duration: 16 Aug 2024 → …

Publication series

Name: GeBNLP 2024 - 5th Workshop on Gender Bias in Natural Language Processing, Proceedings of the Workshop

Conference

Conference: 5th Workshop on Gender Bias in Natural Language Processing, GeBNLP 2024, held in conjunction with the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
Country/Territory: Thailand
City: Bangkok
Period: 16/08/24 → …
