Grant Details

Grant Number: 1R21CA245855-01A1
Primary Investigator: Hu, Liangyuan
Organization: Icahn School Of Medicine At Mount Sinai
Project Title: Flexible Bayesian Approaches to Causal Inference with Multilevel Survival Data and Multiple Treatments
Fiscal Year: 2020


Project Summary: Combining comparative effectiveness research (CER) and dissemination and implementation research is playing an increasing role in public health and health care service by allowing practitioners to make informed decisions about treatments and improving adoption of evidence-based practices. In circumstances where CER questions do not lend themselves to direct experimentation, or in implementation trials where incomplete adoption of the intervention occurs, causal inference tools for “field data” are recommended for evaluating treatment effects. The increased complexities in large national electronic health databases pose challenges for statistical analyses and demand approaches beyond conventional causal inference techniques, which have traditionally focused on binary treatment. Given the wealth of information captured in large-scale data, it is rare that treatment regimens are defined in terms of two treatments only. The data are typically pooled from treating facilities across the nation with considerable variability in the institutional effect. Although it has been established that popular tools for binary treatment are inappropriate for the multiple treatment setting, and that ignoring the multilevel data structure can bias the estimate of the treatment effect, few alternative methods have been proposed to deal with both complications simultaneously. The first aim of our proposed project is to develop a novel and flexible Bayesian approach to estimating the causal effects of multiple treatments on survival with clustered data. We then fully investigate the operating characteristics of our proposed method in a variety of simulated scenarios and contrast it with approaches often used in practice.
For causal estimates to be unbiased, researchers commonly make the assumption of no unmeasured confounding (UMC). Although sensitivity analysis is highly recommended with binary treatment, there is no known implementation or framework for sensitivity analysis with multiple treatments and multilevel survival data. The second aim of our project is to develop and apply a flexible and interpretable Bayesian approach to assessing the sensitivity of causal estimates to possible departures from the assumption of no UMC, at both the cluster and individual levels. This approach is capable of gauging the amount of unobserved confounding needed to change the direction of the observed treatment effects. Our project will apply the methods developed in the first two aims to a large representative high-risk localized prostate cancer population, drawn from the National Cancer Data Base, to evaluate the average causal effects of three popular treatment options on survival and to evaluate how unmeasured confounding might alter causal conclusions. We will also estimate treatment heterogeneity and identify distinct subgroups of patients for whom a treatment is effective or harmful. Our methods will establish the effectiveness component and lay the groundwork for building the cost-effectiveness models, and provide evidence for further investigations of variations in intervention implementation and modifications in recommendations for treatments leading to different patient outcomes. To facilitate the dissemination of our work, we will share the underlying statistical code via an R package.


Estimating the causal effects of multiple intermittent treatments with application to COVID-19.
Authors: Hu L., Ji J., Joshi H., Scott E.R., Li F.
Source: ArXiv, 2023-08-04.
EPub date: 2023-08-04.
PMID: 34981032

Using Tree-Based Machine Learning for Health Studies: Literature Review and Case Series.
Authors: Hu L., Li L.
Source: International journal of environmental research and public health, 2022-12-01; 19(23).
EPub date: 2022-12-01.
PMID: 36498153

A Flexible Approach for Assessing Heterogeneity of Causal Treatment Effects on Patient Survival Using Large Datasets with Clustered Observations.
Authors: Hu L., Ji J., Liu H., Ennis R.
Source: International journal of environmental research and public health, 2022-11-12; 19(22).
EPub date: 2022-11-12.
PMID: 36429621

A flexible approach for causal inference with multiple treatments and clustered survival outcomes.
Authors: Hu L., Ji J., Ennis R.D., Hogan J.W.
Source: Statistics in medicine, 2022-11-10; 41(25), p. 4982-4999.
EPub date: 2022-08-10.
PMID: 35948011

Correlates of cancer prevalence across census tracts in the United States: A Bayesian machine learning approach.
Authors: Niu L., Hu L., Li Y., Liu B.
Source: Spatial and spatio-temporal epidemiology, 2022 Aug; 42, p. 100522.
EPub date: 2022-05-27.
PMID: 35934328

Authors: Hu L., Zou J., Gu C., Ji J., Lopez M., Kale M.
Source: The annals of applied statistics, 2022 Jun; 16(2), p. 1014-1037.
EPub date: 2022-06-13.
PMID: 36644682

A flexible approach for variable selection in large-scale healthcare database studies with missing covariate and outcome data.
Authors: Lin J.J., Hu L., Huang C., Jiayi J., Lawrence S., Govindarajulu U.
Source: BMC medical research methodology, 2022-05-04; 22(1), p. 132.
EPub date: 2022-05-04.
PMID: 35508974

Variable selection with missing data in both covariates and outcomes: Imputation and machine learning.
Authors: Hu L., Joyce Lin J.Y., Ji J.
Source: Statistical methods in medical research, 2021 Dec; 30(12), p. 2651-2671.
EPub date: 2021-10-25.
PMID: 34696650

Estimating heterogeneous survival treatment effects of lung cancer screening approaches: A causal machine learning analysis.
Authors: Hu L., Lin J.Y., Sigel K., Kale M.
Source: Annals of epidemiology, 2021 Oct; 62, p. 36-42.
EPub date: 2021-06-23.
PMID: 34157399

Estimating heterogeneous survival treatment effect in observational data using machine learning.
Authors: Hu L., Ji J., Li F.
Source: Statistics in medicine, 2021-09-20; 40(21), p. 4691-4713.
EPub date: 2021-06-10.
PMID: 34114252

Quantile Regression Forests to Identify Determinants of Neighborhood Stroke Prevalence in 500 Cities in the USA: Implications for Neighborhoods with High Prevalence.
Authors: Hu L., Ji J., Li Y., Liu B., Zhang Y.
Source: Journal of urban health : bulletin of the New York Academy of Medicine, 2021 Apr; 98(2), p. 259-270.
PMID: 32888155

Ranking sociodemographic, health behavior, prevention, and environmental factors in predicting neighborhood cardiovascular health: A Bayesian machine learning approach.
Authors: Hu L., Liu B., Li Y.
Source: Preventive medicine, 2020 Dec; 141, p. 106240.
EPub date: 2020-08-27.
PMID: 32860821

Identifying and understanding determinants of high healthcare costs for breast cancer: a quantile regression machine learning approach.
Authors: Hu L., Li L., Ji J., Sanderson M.
Source: BMC health services research, 2020-11-23; 20(1), p. 1066.
EPub date: 2020-11-23.
PMID: 33228683

Tree-Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level.
Authors: Hu L., Liu B., Ji J., Li Y.
Source: Journal of the American Heart Association, 2020-11-17; 9(22), p. e016745.
EPub date: 2020-11-03.
PMID: 33140687

Identifying and assessing the impact of key neighborhood-level determinants on geographic variation in stroke: a machine learning and multilevel modeling approach.
Authors: Ji J., Hu L., Liu B., Li Y.
Source: BMC public health, 2020-11-07; 20(1), p. 1666.
EPub date: 2020-11-07.
PMID: 33160324
