Author Gender Identification Considering Gender Bias

Manuela Nayantara Jeyaraj, Sarah Jane Delany

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Writing style and word choice in text can vary between men and women, both in terms of who the text is about and who is writing it. This paper focuses on author gender prediction: identifying the gender of the person who wrote the text. We compare closed and open vocabulary approaches on different types of textual content, including more traditional writing styles such as those in books and more recent writing styles used in user-generated content on digital platforms such as blogs and social media messaging. As supervised machine learning approaches can reflect human biases in the data they are trained on, we also consider the gender bias of the different approaches across the different types of dataset. We show that open vocabulary approaches perform better, achieving both higher prediction performance and less gender bias.

Original language: English
Title of host publication: Artificial Intelligence and Cognitive Science - 30th Irish Conference, AICS 2022, Revised Selected Papers
Editors: Luca Longo, Ruairi O’Reilly
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 214-225
Number of pages: 12
ISBN (Print): 9783031264375
Publication status: Published - 2023
Event: 30th Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2022 - Munster, Ireland
Duration: 8 Dec 2022 to 9 Dec 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1662 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 30th Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2022
Country/Territory: Ireland
City: Munster
Period: 8/12/22 to 9/12/22

Keywords

  • Author gender identification
  • Gender bias
  • Open-vocabulary approach
