TY - GEN
T1 - Using ChatGPT to Generate Gendered Language
AU - Soundararajan, Shweta
AU - Jeyaraj, Manuela Nayantara
AU - Delany, Sarah Jane
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Gendered language is the use of words that denote an individual's gender. This can be explicit where the gender is evident in the actual word used, e.g. mother, she, man, but it can also be implicit where social roles or behaviours can signal an individual's gender - for example, expectations that women display communal traits (e.g., affectionate, caring, gentle) and men display agentic traits (e.g., assertive, competitive, decisive). The use of gendered language in NLP systems can perpetuate gender stereotypes and bias. This paper proposes an approach to generating gendered language datasets using ChatGPT which will provide data for data-driven approaches for gender stereotype detection and gender bias mitigation. The approach focuses on generating implicit gendered language that captures and reflects stereotypical characteristics or traits of a particular gender. This is done by engineering prompts to ChatGPT that use gender-coded words from gender-coded lexicons. The evaluation of the datasets generated shows good instances of English-language gendered sentences that can be identified as those that are consistent with gender stereotypes and those that are contradictory. The generated data also shows strong gender bias.
KW - ChatGPT
KW - gendered language
KW - large language models
KW - machine learning
KW - natural language processing
KW - prompt engineering
KW - zero-shot prompting
UR - http://www.scopus.com/inward/record.url?scp=85189940246&partnerID=8YFLogxK
U2 - 10.1109/AICS60730.2023.10470830
DO - 10.1109/AICS60730.2023.10470830
M3 - Conference contribution
AN - SCOPUS:85189940246
T3 - 2023 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2023
BT - 2023 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2023
Y2 - 7 December 2023 through 8 December 2023
ER -