An official website of the United States government
Grant Details

Grant Number: 7R21CA245855-02
Primary Investigator: Hu, Liangyuan
Organization: RBHS-School of Public Health
Project Title: Flexible Bayesian Approaches to Causal Inference with Multilevel Survival Data and Multiple Treatments
Fiscal Year: 2020


Abstract

Project Summary: Combining comparative effectiveness research (CER) with dissemination and implementation research is playing an increasing role in public health and health care services by allowing practitioners to make informed decisions about treatments and by improving adoption of evidence-based practices. In circumstances where CER questions do not lend themselves to direct experimentation, or in implementation trials where adoption of the intervention is incomplete, causal inference tools for “field data” are recommended for evaluating treatment effects. The increased complexities in large national electronic health databases pose challenges for statistical analyses and demand approaches beyond conventional causal inference techniques, which have traditionally focused on binary treatment. Given the wealth of information captured in large-scale data, it is rare that treatment regimens are defined in terms of only two treatments. The data are typically pooled from treating facilities across the nation, with considerable variability in the institutional effect. Although it has been established that popular tools for binary treatment are inappropriate for the multiple treatment setting, and that ignoring the multilevel data structure can bias the estimate of the treatment effect, few alternative methods have been proposed to deal with both complications simultaneously. The first aim of our proposed project is to develop a novel and flexible Bayesian approach to estimating the causal effects of multiple treatments on survival with clustered data. We will then fully investigate the operating characteristics of our proposed method in a variety of simulated scenarios and contrast it with approaches often used in practice.
For causal estimates to be unbiased, researchers commonly assume no unmeasured confounding (UMC). Although sensitivity analysis is highly recommended with binary treatment, there is no known implementation or framework for sensitivity analysis with multiple treatments and multilevel survival data. The second aim of our project is to develop and apply a flexible and interpretable Bayesian approach to assessing the sensitivity of causal estimates to possible departures from the assumption of no UMC, at both the cluster and individual levels. This approach is capable of gauging the amount of unobserved confounding needed to change the direction of the observed treatment effects. Our project will apply the methods developed in the first two aims to a large, representative, high-risk localized prostate cancer population, drawn from the de-identified National Cancer Data Base, to evaluate the average causal effects of three popular treatment options on survival and to evaluate how unmeasured confounding might alter causal conclusions. We will also estimate treatment heterogeneity and identify distinct subgroups of patients for which a treatment is effective or harmful. Our methods will establish the effectiveness component, lay the groundwork for building cost-effectiveness models, and provide evidence for further investigations of variations in intervention implementation and of modifications in treatment recommendations that lead to different patient outcomes. To facilitate the dissemination of our work, we will share the underlying statistical code via an R package.



Publications

A new method for clustered survival data: Estimation of treatment effect heterogeneity and variable selection.
Authors: Hu L.
Source: Biometrical Journal. Biometrische Zeitschrift, 2024 Jan; 66(1), p. e2200178.
EPub date: 2023-12-10.
PMID: 38072661

Using Tree-Based Machine Learning for Health Studies: Literature Review and Case Series.
Authors: Hu L., Li L.
Source: International Journal of Environmental Research and Public Health, 2022-12-01; 19(23).
EPub date: 2022-12-01.
PMID: 36498153

A Flexible Approach for Assessing Heterogeneity of Causal Treatment Effects on Patient Survival Using Large Datasets with Clustered Observations.
Authors: Hu L., Ji J., Liu H., Ennis R.
Source: International Journal of Environmental Research and Public Health, 2022-11-12; 19(22).
EPub date: 2022-11-12.
PMID: 36429621

CIMTx: An R Package for Causal Inference with Multiple Treatments using Observational Data.
Authors: Hu L., Ji J.
Source: The R Journal, 2022 Sep; 14(3), p. 213-230.
EPub date: 2022-12-19.
PMID: 39310290

A flexible approach for causal inference with multiple treatments and clustered survival outcomes.
Authors: Hu L., Ji J., Ennis R.D., Hogan J.W.
Source: Statistics in Medicine, 2022-08-10.
EPub date: 2022-08-10.
PMID: 35948011

Correlates of cancer prevalence across census tracts in the United States: A Bayesian machine learning approach.
Authors: Niu L., Hu L., Li Y., Liu B.
Source: Spatial and Spatio-temporal Epidemiology, 2022 Aug; 42, p. 100522.
EPub date: 2022-05-27.
PMID: 35934328

A flexible sensitivity analysis approach for unmeasured confounding with multiple treatments and a binary outcome with application to SEER-Medicare lung cancer data.
Authors: Hu L., Zou J., Gu C., Ji J., Lopez M., Kale M.
Source: The Annals of Applied Statistics, 2022 Jun; 16(2), p. 1014-1037.
EPub date: 2022-06-13.
PMID: 36644682

A flexible approach for variable selection in large-scale healthcare database studies with missing covariate and outcome data.
Authors: Lin J.J., Hu L., Huang C., Jiayi J., Lawrence S., Govindarajulu U.
Source: BMC Medical Research Methodology, 2022-05-04; 22(1), p. 132.
EPub date: 2022-05-04.
PMID: 35508974

Variable selection with missing data in both covariates and outcomes: Imputation and machine learning.
Authors: Hu L., Joyce Lin J.Y., Ji J.
Source: Statistical Methods in Medical Research, 2021 Dec; 30(12), p. 2651-2671.
EPub date: 2021-10-25.
PMID: 34696650

Estimating heterogeneous survival treatment effects of lung cancer screening approaches: A causal machine learning analysis.
Authors: Hu L., Lin J.Y., Sigel K., Kale M.
Source: Annals of Epidemiology, 2021 Oct; 62, p. 36-42.
EPub date: 2021-06-23.
PMID: 34157399

Estimating the causal effects of multiple intermittent treatments with application to COVID-19.
Authors: Hu L., Li F., Ji J., Joshi H., Scott E.
Source: arXiv, 2021-09-27.
EPub date: 2021-09-27.
PMID: 34981032

Estimating heterogeneous survival treatment effect in observational data using machine learning.
Authors: Hu L., Ji J., Li F.
Source: Statistics in Medicine, 2021-06-10.
EPub date: 2021-06-10.
PMID: 34114252

Ranking sociodemographic, health behavior, prevention, and environmental factors in predicting neighborhood cardiovascular health: A Bayesian machine learning approach.
Authors: Hu L., Liu B., Li Y.
Source: Preventive Medicine, 2020 Dec; 141, p. 106240.
EPub date: 2020-08-27.
PMID: 32860821

Identifying and understanding determinants of high healthcare costs for breast cancer: a quantile regression machine learning approach.
Authors: Hu L., Li L., Ji J., Sanderson M.
Source: BMC Health Services Research, 2020-11-23; 20(1), p. 1066.
EPub date: 2020-11-23.
PMID: 33228683

Tree-Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level.
Authors: Hu L., Liu B., Ji J., Li Y.
Source: Journal of the American Heart Association, 2020-11-17; 9(22), p. e016745.
EPub date: 2020-11-03.
PMID: 33140687

Identifying and assessing the impact of key neighborhood-level determinants on geographic variation in stroke: a machine learning and multilevel modeling approach.
Authors: Ji J., Hu L., Liu B., Li Y.
Source: BMC Public Health, 2020-11-07; 20(1), p. 1666.
EPub date: 2020-11-07.
PMID: 33160324

Quantile Regression Forests to Identify Determinants of Neighborhood Stroke Prevalence in 500 Cities in the USA: Implications for Neighborhoods with High Prevalence.
Authors: Hu L., Ji J., Li Y., Liu B., Zhang Y.
Source: Journal of Urban Health: Bulletin of the New York Academy of Medicine, 2020-09-04.
EPub date: 2020-09-04.
PMID: 32888155



