5R01CA262296-03
University Of Tennessee Health Sci Ctr
Algorithm-Based Prevention and Reduction of Cancer Health Disparity Arising From Data Inequality
Ethnic minority groups have a long-term cumulative data disadvantage in biomedical research and clinical
studies. Statistics have shown that over 90% of the samples in cancer-related GWAS and clinical omics projects
were collected from individuals of European ancestry. This severe data disadvantage of the ethnic minority
groups is set to produce new health disparities as data-driven, algorithm-based biomedical research and clinical
decisions become increasingly common. The new cancer disparity arising from data inequality can potentially
impact all ethnic minority groups in all types of cancers where data inequality exists. Thus, its negative impact is
not limited to the cancer types or subtypes for which significant ethnic disparities have already been evident. The
long-term goal of the proposed research is to prevent or reduce the health disparities arising from the data
disadvantage of ethnic minority groups. The overall objective of this work is to obtain key knowledge and create
open resources to establish a new paradigm for machine learning with multiethnic clinical omics data. Our central
hypothesis is that the knowledge learned from data of the majority population can be transferred to improve
machine learning performance on the data-disadvantaged ethnic minority groups. Guided by strong preliminary
data, we will pursue two specific aims to 1) Discover from cancer clinical omics data and genotype-phenotype
data: under what conditions and to what extent the transfer learning scheme improves machine learning model
performance on data-disadvantaged ethnic minority groups; 2) Create an open resource system for unbiased
multiethnic machine learning to prevent or reduce new health disparities arising from the data disadvantage of
ethnic minorities. The approach is innovative because it represents a substantive departure from the status quo
by shifting the paradigm of multiethnic machine learning from mixture learning and independent learning
schemes to a transfer learning scheme. The proposed research is significant, because it is expected to establish
a new paradigm for unbiased multiethnic machine learning and to provide an open resource system to facilitate
the paradigm shift, and thus to prevent or reduce health disparities arising from the data disadvantage of ethnic minorities.
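
The three learning schemes named in the abstract (mixture, independent, and transfer learning) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the project's method or data: the synthetic two-group dataset, the plain logistic regression, the helper names (`fit_logistic`, `make_group`, `accuracy`), and the use of warm-started fine-tuning as a simple stand-in for transfer learning.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, w_init=None, lr=0.1, epochs=200):
    """Gradient-descent logistic regression; w_init allows a warm start."""
    w = np.zeros(X.shape[1]) if w_init is None else w_init.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def accuracy(w, X, y):
    return float(np.mean((sigmoid(X @ w) >= 0.5) == (y == 1)))

def make_group(rng, n, w_true):
    """Hypothetical synthetic cohort: Gaussian features, Bernoulli labels."""
    X = rng.normal(size=(n, len(w_true)))
    y = (rng.random(n) < sigmoid(X @ w_true)).astype(float)
    return X, y

rng = np.random.default_rng(0)
d = 20
w_major = rng.normal(size=d)
# Assumption: the minority-group signal is related to, but shifted from,
# the majority-group signal.
w_minor = w_major + 0.3 * rng.normal(size=d)

X_maj, y_maj = make_group(rng, 2000, w_major)   # data-rich majority group
X_min, y_min = make_group(rng, 60, w_minor)     # data-disadvantaged group
X_test, y_test = make_group(rng, 1000, w_minor) # held-out minority samples

# Mixture learning: one model trained on the pooled data.
w_mixture = fit_logistic(np.vstack([X_maj, X_min]),
                         np.concatenate([y_maj, y_min]))

# Independent learning: a model trained on the minority data alone.
w_independent = fit_logistic(X_min, y_min)

# Transfer learning (simplified): pretrain on the majority group, then
# fine-tune briefly on the minority group starting from those weights.
w_pre = fit_logistic(X_maj, y_maj)
w_transfer = fit_logistic(X_min, y_min, w_init=w_pre, lr=0.05, epochs=30)

results = {
    "mixture": accuracy(w_mixture, X_test, y_test),
    "independent": accuracy(w_independent, X_test, y_test),
    "transfer": accuracy(w_transfer, X_test, y_test),
}
for name, acc in results.items():
    print(f"{name:12s} minority-group test accuracy: {acc:.3f}")
```

Which scheme performs best in this toy setting depends on the assumed shift between groups, the sample sizes, and the fine-tuning settings; the sketch only shows how the three schemes differ in what data each model sees and in what order.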
Addressing the Challenge of Biomedical Data Inequality: An Artificial Intelligence Perspective.
, Sharma T.
, Cui Y.
Annual review of biomedical data science, 2023-08-10; 6, p. 153-171.
Clinical time-to-event prediction enhanced by incorporating compatible related outcomes.
, Cui Y.
PLOS digital health, 2022; 1(5).
Malignant transformation in human colorectal mucosa as monitored by distribution of laminin, a basement membrane glycoprotein.
, Ekblom P.
, Scheinin T.M.
, Andersson L.C.
Acta pathologica, microbiologica, et immunologica Scandinavica. Section A, Pathology, 1985 Sep; 93(5), p. 285-91.