Grant Details

Grant Number: 5R01CA269398-04
Primary Investigator: Li, Yi
Organization: University of Michigan at Ann Arbor
Project Title: Causal Machine Learning in Cancer Survival By Integrating Multiple High-Dimensional Observational Studies
Fiscal Year: 2025


Abstract

Despite overall advances in cancer research, observational studies are often limited in how well they represent broader patient populations and in their comparability across clinical settings. For example, treatment practices may vary across cohorts, and differences in patient characteristics can complicate efforts to draw generalizable conclusions about treatment outcomes. This proposal is motivated by the Boston Lung Cancer Survival Cohort (BLCSC), one of the largest lung cancer cohorts globally, consisting of cases registered since 1992 at the Dana-Farber Cancer Institute (DFCI) and the Massachusetts General Hospital (MGH), with subsequent expansion to the MD Anderson Cancer Center (MDACC) and Mayo Clinic. We also have access to the International Lung Cancer Consortium (ILCCO), an international cohort established in 2004 with a data structure similar to that of BLCSC. These rich databases provide unique opportunities for advancing modern statistical methods in cancer survival analysis. They also pose methodological challenges such as unbalanced patient covariates across sites, heterogeneous data structures, and variability in treatment assignment. Leveraging BLCSC and ILCCO, we aim to develop integrative causal machine learning methods for analyzing multiple high-dimensional observational studies, with broad applications such as robust treatment comparisons and survival inference. Such methods will enable us to generate findings that are generalizable and applicable to wide-ranging patient populations.



Publications

Characterization of Occupational Endotoxin-Related Small Airway Disease With Longitudinal Paired Inspiratory/Expiratory CT Scans.
Authors: Sun Y., Kang J., Zhang F.Y., Wang H., Lai P.S., Washko G.R., San Jose Estepar R., Christiani D.C., Li Y.
Source: Chest, 2025 Jul; 168(1), p. 43-55.
EPub date: 2025-01-18.
PMID: 39832623

Nonparametric Bayes Differential Analysis of Multigroup DNA Methylation Data.
Authors: Gu C., Baladandayuthapani V., Guha S.
Source: Bayesian Analysis, 2025 Jun; 20(2), p. 489-518.
EPub date: 2023-11-23.
PMID: 40406512

Radiological distribution patterns in restrictive chronic lung allograft dysfunction: Impact on survival across all phenotypes.
Authors: Fukuda T., Nakamura Y., Tseng S.C., Ko Y., Gagne S.M., Johkoh T., Li Y., Christiani D.C., Ojiri H., Sholl L., et al.
Source: JHLT Open, 2025 May; 8, p. 100232.
EPub date: 2025-02-18.
PMID: 40144724

Residual Volume and Total Lung Capacity at Diagnosis Predict Overall Survival in Non-Small Cell Lung Cancer Patients.
Authors: Zhai T., Li Y., Brown R., Lanuti M., Gainor J.F., Christiani D.C.
Source: Cancer Medicine, 2025 May; 14(10), p. e70962.
PMID: 40371871

Comparing Analgesic Regimen Effectiveness and Safety after Surgery (CARES): protocol for a pragmatic, international multicentre randomised trial.
Authors: Bicket M.C., Ladha K.S., Haroutounian S., McFarlin K., Neff M., McDuffie R.L., Waljee J.F., Wijeysundera D.N., Brummet C., Li Y., et al.
Source: BMJ Open, 2025-04-05; 15(4), p. e099925.
EPub date: 2025-04-05.
PMID: 40187774

A clustering approach to integrative analyses of multiomic cancer data.
Authors: Yan D., Guha S.
Source: Journal of Applied Statistics, 2025; 52(8), p. 1539-1560.
EPub date: 2024-11-29.
PMID: 40497161

Automated chest CT three-dimensional quantification of body composition: adipose tissue and paravertebral muscle.
Authors: Hata A., Muraguchi Y., Nakatsugawa M., Wang X., Song J., Wada N., Hino T., Aoyagi K., Kawagishi M., Negishi T., et al.
Source: Scientific Reports, 2024-12-30; 14(1), p. 32117.
EPub date: 2024-12-30.
PMID: 39738489

Assessing the prognostic utility of clinical and radiomic features for COVID-19 patients admitted to ICU: challenges and lessons learned.
Authors: Sun Y., Salerno S., Pan Z., Yang E., Sujimongkol C., Song J., Wang X., Han P., Zeng D., Kang J., et al.
Source: Harvard Data Science Review, 2024 Winter; 6(1).
EPub date: 2024-01-31.
PMID: 38974963

Multi-task Learning for Gaussian Graphical Regressions with High Dimensional Covariates.
Authors: Zhang J., Li Y.
Source: Journal of Computational and Graphical Statistics: A Joint Publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America, 2024-12-20.
EPub date: 2024-12-20.
PMID: 40786561

Bayesian Estimation of Propensity Scores for Integrating Multiple Cohorts with High-Dimensional Covariates.
Authors: Guha S., Li Y.
Source: Statistics in Biosciences, 2024-12-09.
EPub date: 2024-12-09.
PMID: 40857526

Causal meta-analysis by integrating multiple observational studies with multivariate outcomes.
Authors: Guha S., Li Y.
Source: Biometrics, 2024-07-01; 80(3).
PMID: 39073772

Debiased lasso for stratified Cox models with application to the national kidney transplant data.
Authors: Xia L., Nan B., Li Y.
Source: The Annals of Applied Statistics, 2023 Dec; 17(4), p. 3550-3569.
EPub date: 2023-10-30.
PMID: 38106966

Simultaneous selection and inference for varying coefficients with zero regions: a soft-thresholding approach.
Authors: Yang Y., Pan Z., Kang J., Brummett C., Li Y.
Source: Biometrics, 2023-07-17.
EPub date: 2023-07-17.
PMID: 37459178

Use of machine learning to assess the prognostic utility of radiomic features for in-hospital COVID-19 mortality.
Authors: Sun Y., Salerno S., He X., Pan Z., Yang E., Sujimongkol C., Song J., Wang X., Han P., Kang J., et al.
Source: Scientific Reports, 2023-05-05; 13(1), p. 7318.
EPub date: 2023-05-05.
PMID: 37147440

Prediagnosis Smoking Cessation and Overall Survival Among Patients With Non-Small Cell Lung Cancer.
Authors: Wang X., Romero-Gutierrez C.W., Kothari J., Shafer A., Li Y., Christiani D.C.
Source: JAMA Network Open, 2023-05-01; 6(5), p. e2311966.
EPub date: 2023-05-01.
PMID: 37145597

Asynchronous and error-prone longitudinal data analysis via functional calibration.
Authors: Chang X., Li Y., Li Y.
Source: Biometrics, 2023-04-12.
EPub date: 2023-04-12.
PMID: 37042741

Traction Bronchiectasis/Bronchiolectasis in Interstitial Lung Abnormality: Follow-up in the COPDGene.
Authors: Hata A., Hino T., Li Y., Johkoh T., Christiani D.C., Lynch D.A., Cho M.H., Silverman E.K., Hunninghake G.M., Hatabu H., et al.
Source: American Journal of Respiratory and Critical Care Medicine, 2023-03-10.
EPub date: 2023-03-10.
PMID: 36898128

OWL: an optimized and independently validated machine learning prediction model for lung cancer screening based on the UK Biobank, PLCO, and NLST populations.
Authors: Pan Z., Zhang R., Shen S., Lin Y., Zhang L., Wang X., Ye Q., Wang X., Chen J., Zhao Y., et al.
Source: eBioMedicine, 2023 Feb; 88, p. 104443.
EPub date: 2023-01-24.
PMID: 36701900

Sex disparities in lung cancer survival rates based on screening status.
Authors: Rodriguez Alvarez A.A., Yuming S., Kothari J., Digumarthy S.R., Byrne N.M., Li Y., Christiani D.C.
Source: Lung Cancer (Amsterdam, Netherlands), 2022 Sep; 171, p. 115-120.
EPub date: 2022-08-01.
PMID: 35939954