Grant Details
Grant Number: 5R01CA287422-02
Principal Investigator: Sulam, Jeremias
Organization: Johns Hopkins University
Project Title: SCH: Quantifying and Mitigating Demographic Biases of Machine Learning in Real World Radiology
Fiscal Year: 2024
Abstract
The application of modern machine learning algorithms in radiology continues to grow, as these tools
represent potentially huge improvements in the efficiency, accessibility, and accuracy of diagnostic and
screening tools. At the same time, these increasingly complex machine learning models can have biased
predictions against individuals of under-represented demographic groups, potentially perpetuating
pre-existing health disparities. Such fairness concerns are particularly important in public health
applications that focus on large-scale, population-based screening, as in screening for breast and
lung cancer. In these settings, it is paramount to understand how often machine learning screening
algorithms can be unfair and biased, and how to mitigate these disparities. This proposal will develop
tools to quantify, correct, and analyze the biases of predictive algorithms in relation to different
demographic groups in real world settings. In particular, we will develop analysis and algorithms to
quantify a machine learning model's fairness violations in situations where the sensitive attribute
itself (such as biological sex, race, or age) is not directly observable, and we will provide
algorithms that correct for the worst-case fairness violations. We will analyze our tools under
distribution shifts, in which the underlying populations differ, as is common in large-scale cancer screening
programs. This project will also perform inference on the training samples and features most highly
associated with fairness violations, thereby providing guidance on the development of solutions to prevent
biased algorithms in the future. Our tools will be validated on a variety of large real-world radiology
datasets spanning multiple imaging modalities, including general chest X-ray datasets that include lung
cancer diagnoses (CheXpert and MIMIC-CXR), as well as the Emory Breast Cancer Imaging Dataset
(EMBED) and the National Lung Screening Trial (NLST), evaluating and correcting disparities for
predictive algorithms with respect to biological sex (where appropriate), race, and age. The results of this
project will establish critical knowledge about the propensity of machine learning models for medical
imaging diagnosis and cancer screening to be unfair and biased, as well as foundational tools to quantify
and mitigate these biases in these potentially game-changing technologies.
Publications
None