Grant Details

Grant Number: 1R21CA245855-01A1
Primary Investigator: Hu, Liangyuan
Organization: Icahn School Of Medicine At Mount Sinai
Project Title: Flexible Bayesian Approaches to Causal Inference with Multilevel Survival Data and Multiple Treatments
Fiscal Year: 2020


Project Summary

Combining comparative effectiveness research (CER) with dissemination and implementation research is playing an increasing role in public health and health care services by allowing practitioners to make informed decisions about treatments and by improving the adoption of evidence-based practices. In circumstances where CER questions do not lend themselves to direct experimentation, or in implementation trials where adoption of the intervention is incomplete, causal inference tools for “field data” are recommended for evaluating treatment effects. The increased complexity of large national electronic health databases poses challenges for statistical analysis and demands approaches beyond conventional causal inference techniques, which have traditionally focused on binary treatment. Given the wealth of information captured in large-scale data, treatment regimens are rarely defined in terms of only two treatments. The data are typically pooled from treating facilities across the nation, with considerable variability in the institutional effect. Although it has been established that popular tools for binary treatment are inappropriate for the multiple treatment setting, and that ignoring the multilevel data structure can bias the estimate of the treatment effect, few alternative methods have been proposed to deal with both complications simultaneously. The first aim of our proposed project is to develop a novel and flexible Bayesian approach to estimating the causal effects of multiple treatments on survival with clustered data. We will then fully investigate the operating characteristics of our proposed method in a variety of simulated scenarios and contrast it with approaches often used in practice. For causal estimates to be unbiased, researchers commonly assume no unmeasured confounding (UMC).
Though sensitivity analysis is highly recommended with binary treatment, there is no known implementation or framework for it with multiple treatments and multilevel survival data. The second aim of our project is to develop and apply a flexible and interpretable Bayesian approach to assessing the sensitivity of causal estimates to possible departures from the assumption of no UMC, at both the cluster and individual levels. This approach is capable of gauging the amount of unobserved confounding needed to change the direction of the observed treatment effects. Our project will apply the methods developed in the first two aims to a large, representative, high-risk localized prostate cancer population, drawn from the National Cancer Data Base, to evaluate the average causal effects of three popular treatment options on survival and to evaluate how unmeasured confounding might alter causal conclusions. We will also estimate treatment heterogeneity and identify distinct subgroups of patients for whom a treatment is effective or harmful. Our methods will establish the effectiveness component and lay the groundwork for building cost-effectiveness models, and will provide evidence for further investigations of variations in intervention implementation and of modifications to treatment recommendations that lead to different patient outcomes. To facilitate the dissemination of our work, we will share the underlying statistical code via an R package.


Ranking sociodemographic, health behavior, prevention, and environmental factors in predicting neighborhood cardiovascular health: A Bayesian machine learning approach.
Authors: Hu L., Liu B., Li Y.
Source: Preventive medicine, 2020-12; 141, p. 106240.
EPub date: 2020-08-27.
PMID: 32860821

Identifying and understanding determinants of high healthcare costs for breast cancer: a quantile regression machine learning approach.
Authors: Hu L., Li L., Ji J., Sanderson M.
Source: BMC health services research, 2020-11-23; 20(1), p. 1066.
EPub date: 2020-11-23.
PMID: 33228683

Tree-Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level.
Authors: Hu L., Liu B., Ji J., Li Y.
Source: Journal of the American Heart Association, 2020-11-17; 9(22), p. e016745.
EPub date: 2020-11-03.
PMID: 33140687

Identifying and assessing the impact of key neighborhood-level determinants on geographic variation in stroke: a machine learning and multilevel modeling approach.
Authors: Ji J., Hu L., Liu B., Li Y.
Source: BMC public health, 2020-11-07; 20(1), p. 1666.
EPub date: 2020-11-07.
PMID: 33160324

Quantile Regression Forests to Identify Determinants of Neighborhood Stroke Prevalence in 500 Cities in the USA: Implications for Neighborhoods with High Prevalence.
Authors: Hu L., Ji J., Li Y., Liu B., Zhang Y.
Source: Journal of urban health: bulletin of the New York Academy of Medicine, 2020-09-04.
EPub date: 2020-09-04.
PMID: 32888155
