Conference on Foundations and Advances of Machine Learning in Official Statistics, 3rd to 5th April, 2024

Session 3.2 Quality, Fairness and Reproducibility

Fairness in Machine Learning for National Statistical Organizations

Patrick Oliver Schenk1, Christoph Kern* 1, Frauke Kreuter1

Abstract

National Statistical Organizations (NSOs) increasingly draw on Machine Learning (ML) to improve the timeliness and cost-effectiveness of existing processes or to offer new products. In doing so, NSOs must ensure that high standards of robustness, reproducibility, and accuracy are upheld (Yung et al. 2022). At the same time, the ML community has started to focus on “algorithmic fairness” as a precondition for the safe deployment of ML models, particularly to prevent disparate social impacts in practice. However, this literature focuses on ML for data analysis rather than on ML for data collection, processing, and production, i.e., the main work of NSOs.

We discuss how the (safe) deployment of ML by NSOs can benefit from the concepts and methodology of algorithmic fairness research. First, we highlight the importance of fairness before, during, and after data processing and analysis. We then map fairness to the quality dimensions for NSOs in the QF4SA quality framework of Yung et al. (2022) by investigating how fairness interacts with the other dimensions, and we argue for fairness as a quality dimension in its own right. The proposed mapping sharpens existing requirements, e.g., by expanding overall accuracy assessments to notions of multi-group fairness. Furthermore, we critically discuss the current state of algorithmic fairness research (e.g., its focus on binary classification, its neglect of uncertainty, and its focus on algorithmic decision-making) and suggest how, conversely, the practice of ML in NSOs can enrich the fair ML literature. Taking a data-centric approach, we emphasize the importance of high-quality data products for detecting and addressing misrepresentation in training data as a root cause of downstream fairness issues.
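
As a purely illustrative sketch of what expanding an overall accuracy assessment to a multi-group view could look like, the following Python snippet compares overall accuracy with accuracy computed separately per group. It is not taken from the work summarized here: the function names `accuracy_by_group` and `max_accuracy_gap`, the toy labels, and the group codes "A" and "B" are all hypothetical, and the largest pairwise accuracy gap is only one of many possible multi-group criteria.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return (overall accuracy, dict of accuracy per group).

    Hypothetical helper for illustration; inputs are equal-length sequences.
    """
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("y_true, y_pred and groups must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")

    correct = defaultdict(int)  # number of correct predictions per group
    total = defaultdict(int)    # number of observations per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)

    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group


def max_accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    return max(per_group.values()) - min(per_group.values())


# Toy data (assumed for illustration): group "B" is smaller and served worse.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B"]

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
gap = max_accuracy_gap(per_group)
print(f"overall accuracy: {overall:.2f}")
for name, acc in sorted(per_group.items()):
    print(f"group {name}: {acc:.2f}")
print(f"largest accuracy gap: {gap:.2f}")
```

In this toy example the overall figure looks acceptable while the smaller group is classified much less accurately, which is the kind of disparity that a single aggregate accuracy measure can hide.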

*: Speaker

1: LMU Munich - Germany