TY - GEN
T1 - Practical Strategies for Applying the Disparate Impact Remover at Inference Time
AU - Danilevskyi, Mykhailo
AU - Perez-Tellez, Fernando
AU - Vasic, Jelena
N1 - Publisher Copyright: © 2026 Copyright held by the owner/author(s).
PY - 2026/2/16
Y1 - 2026/2/16
N2 - Bias mitigation in machine learning often relies on pre-processing techniques such as the Disparate Impact Remover (DIR). However, applying DIR at inference time remains underexplored. We investigate methods of reusing data from the training phase during the inference phase when data comes in single data instances, in mini-batches of ten, and as full test sets. We propose two new methods and compare them against the original approach based on the quantile function. The first method, dictionary-based DIR, stores the mappings learned during training and applies them at inference time using nearest-neighbour matching for unseen values. The second method, merge-based DIR, dynamically merges incoming instances with a subset of training data before reapplying DIR. We evaluated these methods on three benchmark datasets: UCI Adult Income, Ricci v. DeStefano, and German Credit Data, measuring both fairness (Disparate Impact) and predictive performance (F1-score). The results show that the dictionary-based approach achieves accuracy comparable to the original quantile-based DIR with outcomes that are independent of the data input size. In contrast, the merge-based approach can produce more fair but less stable results that vary depending on the data size.
AB - Bias mitigation in machine learning often relies on pre-processing techniques such as the Disparate Impact Remover (DIR). However, applying DIR at inference time remains underexplored. We investigate methods of reusing data from the training phase during the inference phase when data comes in single data instances, in mini-batches of ten, and as full test sets. We propose two new methods and compare them against the original approach based on the quantile function. The first method, dictionary-based DIR, stores the mappings learned during training and applies them at inference time using nearest-neighbour matching for unseen values. The second method, merge-based DIR, dynamically merges incoming instances with a subset of training data before reapplying DIR. We evaluated these methods on three benchmark datasets: UCI Adult Income, Ricci v. DeStefano, and German Credit Data, measuring both fairness (Disparate Impact) and predictive performance (F1-score). The results show that the dictionary-based approach achieves accuracy comparable to the original quantile-based DIR with outcomes that are independent of the data input size. In contrast, the merge-based approach can produce more fair but less stable results that vary depending on the data size.
KW - Bias mitigation
KW - Disparate Impact Remover
KW - Ethical AI
KW - Fairness
UR - https://www.scopus.com/pages/publications/105031764601
U2 - 10.1145/3777490.3777497
DO - 10.1145/3777490.3777497
M3 - Conference contribution
AN - SCOPUS:105031764601
T3 - HCAI-ep 2026 - Proceedings of the 2026 Conference on Human Centered Artificial Intelligence - Education and Practice
SP - 34
EP - 39
BT - HCAI-ep 2026 - Proceedings of the 2026 Conference on Human Centered Artificial Intelligence - Education and Practice
PB - Association for Computing Machinery (ACM)
T2 - 3rd International Conference on Human-Centred AI - Education and Practice, HCAI-ep 2026
Y2 - 21 January 2026 through 22 January 2026
ER -