TY - GEN
T1 - From Clinic to Code
T2 - 2nd International Conference on Artificial Intelligence in Healthcare, AIiH 2025
AU - Heaney, Andrea
AU - Murphy, Emma
AU - Hickey, Eugene
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
AB - The development of Artificial Intelligence (AI) in healthcare is largely dependent on the quality of medical datasets. However, these datasets often fail to accurately represent women (and those seen as women) due to historic, implicit and biological biases. This under-representation can lead to biased, inequitable and even harmful models. Building upon the findings of previously conducted qualitative semi-structured semantic interviews with clinicians on their perceptions of women’s health, this paper presents a framework for translating qualitative findings to dataset characteristics via operationalisation. This framework outlines the key characteristics and considerations a dataset should include or consider to more accurately represent women in these datasets. Some of these factors include: pregnancy status, gender of provider of the care, menstruation and menopause, and ethnicity. Rather than considering fairness after model development and employing de-biasing metrics, this approach places fairness at the initial selection stages, with the goal of embedding equity throughout the entire development pipeline. It is both a checklist for data selection for model developers and also a guideline for those who collect medical data. The framework is divided into Necessities, Data Comprehension, Gender-Specific Factors, Clinician Information, Patient-Specific Factors, and Additional considerations. This framework is a step towards creating more gender-conscious, equitable and fair medical AI systems.
KW - Equitable Healthcare
KW - Ethical AI
KW - Health Informatics
UR - https://www.scopus.com/pages/publications/105024553657
U2 - 10.1007/978-3-032-00656-1_8
DO - 10.1007/978-3-032-00656-1_8
M3 - Conference contribution
AN - SCOPUS:105024553657
SN - 9783032006554
T3 - Lecture Notes in Computer Science
SP - 100
EP - 114
BT - Artificial Intelligence in Healthcare - 2nd International Conference, AIiH 2025, Proceedings
A2 - Cafolla, Daniele
A2 - Rittman, Timothy
A2 - Ni, Hao
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 8 September 2025 through 10 September 2025
ER -