Grant Details

Grant Number: 5R01CA287422-03
Principal Investigator: Sulam, Jeremias
Organization: Johns Hopkins University
Project Title: SCH: Quantifying and Mitigating Demographic Biases of Machine Learning in Real World Radiology
Fiscal Year: 2025


Abstract

The application of modern machine learning algorithms in radiology continues to grow, as these tools promise substantial improvements in the efficiency, accessibility, and accuracy of diagnostic and screening tools. At the same time, these increasingly complex machine learning models can produce predictions of differing accuracy across settings, such as those arising from different imaging acquisition protocols, different imaging equipment, or different phenotypic tissue characteristics. Such lack of robustness in predictive models is particularly important in public health applications focused on large-scale population-based screening, such as breast and lung cancer screening. Thus, it is paramount to understand these sources of variability in machine learning screening algorithms, as well as to develop methods to mitigate them. This proposal will develop tools to quantify, correct, and analyze the sources of prediction errors in algorithms in relation to different sources of variability in data acquisition in real-world settings. In particular, we will develop analyses and algorithms to quantify differences in the predictive performance of a machine learning model in situations where information about the sources of variability themselves (such as acquisition location, tissue characteristics, and acquisition protocols) is not directly observable, and we will provide algorithms that correct for their worst-case difference in predictive power. We will analyze our tools under distribution shifts, whereby differences across medical centers exist, as is common in large-scale cancer screening programs. This project will also perform inference on the training samples and features most strongly associated with differences in predictive power, thereby providing guidance for developing solutions that prevent these limitations in the future.
Our tools will be validated on a variety of large real-world radiology datasets spanning multiple imaging modalities, including general chest X-ray datasets that include lung cancer diagnoses (CheXpert and MIMIC-CXR), as well as the Emory Breast Cancer Imaging Dataset (EMBED) and the National Lung Screening Trial (NLST), evaluating and correcting disparities of predictive algorithms across different acquisition protocols, technical equipment, and data quality. The results of this project will establish critical knowledge about the propensity of machine learning models for medical imaging diagnosis to be sensitive to imaging settings, as well as foundational tools to quantify and mitigate these limitations in potentially game-changing technologies.
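One of the disparity notions underlying this work, equalized odds (see the cited NeurIPS publication "Estimating and Controlling for Equalized Odds via Sensitive Attribute Predictors"), can be illustrated with a minimal sketch. The function below is not the project's method, only a standard way to measure an equalized-odds violation between two demographic groups; the function name, arguments, and example data are all illustrative assumptions.

```python
import numpy as np

def equalized_odds_gap(y_true, y_pred, group):
    """Largest absolute gap in TPR and FPR between two demographic groups.

    A binary classifier satisfies equalized odds when its true-positive
    rate and false-positive rate are equal across groups; this gap
    measures the worst violation. Assumes exactly two group labels.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = {}
    for g in np.unique(group):
        m = group == g
        tpr = np.mean(y_pred[m & (y_true == 1)])  # P(pred=1 | y=1, group=g)
        fpr = np.mean(y_pred[m & (y_true == 0)])  # P(pred=1 | y=0, group=g)
        rates[g] = (tpr, fpr)
    (tpr_a, fpr_a), (tpr_b, fpr_b) = rates.values()
    return max(abs(tpr_a - tpr_b), abs(fpr_a - fpr_b))

# Toy example: group 0 has TPR 0.5 and FPR 0.0; group 1 has TPR 1.0
# and FPR 0.5, so the equalized-odds gap is 0.5.
gap = equalized_odds_gap(
    y_true=[1, 1, 0, 0, 1, 1, 0, 0],
    y_pred=[1, 0, 0, 0, 1, 1, 1, 0],
    group=[0, 0, 0, 0, 1, 1, 1, 1],
)
print(gap)  # 0.5
```

A key complication the abstract highlights, and which this sketch ignores, is that the group attribute itself may not be directly observable and must be estimated.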



Publications

Pitfalls and Best Practices in Evaluation of AI Algorithmic Biases in Radiology.
Authors: Yi P.H., Bachina P., Bharti B., Garin S.P., Kanhere A., Kulkarni P., Li D., Parekh V.S., Santomartino S.M., Moy L., et al.
Source: Radiology, 2025 May; 315(2), p. e241674.
PMID: 40392092

Estimating and Controlling for Equalized Odds via Sensitive Attribute Predictors.
Authors: Bharti B., Yi P., Sulam J.
Source: Advances in Neural Information Processing Systems, 2023 Dec; 36, p. 37173-37192.
PMID: 38867889


