Grant Details

Grant Number: 5R21CA227613-02
Primary Investigator: Hubbard, Rebecca
Organization: University of Pennsylvania
Project Title: Improving Confounder Control in EHR-Based Studies of Cancer Epidemiology
Fiscal Year: 2020


Project Summary: Data from Electronic Health Records (EHR) are a valuable research tool, providing information on outcomes and exposures that would be costly and difficult to obtain through primary data collection. However, EHR data capture is driven by clinical and administrative rather than research needs, necessitating substantial methodological innovation to obtain valid results. While a number of prior methodological studies have focused on reducing confounding in observational studies conducted using EHR data, they have not considered the risk of residual confounding that results when confounder variables are measured with error. The proposed study will develop novel statistical tools tailored to the EHR context to address measurement error and missing data in confounders. Under Aim 1, we will use a recently developed statistical approach, integrated likelihood, to develop a method for confounder control using imperfect confounders that does not require validation data. Under Aim 2, we will develop an index of sensitivity of study results to the assumption of “informative presence,” i.e., that absence of information on a confounder is indicative of absence of the confounder. Novel methods will be evaluated and compared to standard approaches using simulated data and applied to existing data from a study of colon cancer recurrence. Statistical software code for these methods will be developed in the R programming language and disseminated via our project website and GitHub. This research will provide methodological tools to improve the validity of results obtained through secondary analysis of EHR-derived data.
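
The residual confounding described in the summary can be illustrated with a small exact calculation. The sketch below is not the integrated likelihood method of Aim 1 or the sensitivity index of Aim 2, and it is written in Python rather than the project's R. All numbers (confounder prevalence, exposure probabilities, risks, sensitivity, specificity) and all function names are hypothetical choices made for illustration. It assumes a binary confounder that is recorded in the EHR with imperfect sensitivity, treats an unrecorded confounder as absent, and assumes the exposure has no true effect on the outcome.

```python
from itertools import product

def joint_distribution(p_c=0.3, p_a_given_c=(0.2, 0.6), base_risk=0.1,
                       confounder_effect=0.3, sensitivity=0.6, specificity=1.0):
    """Exact joint probabilities over (c, c_star, a, y).

    c      : true confounder
    c_star : confounder as recorded in the EHR (possibly misclassified)
    a      : exposure, depends on c only
    y      : outcome, depends on c only (the exposure has NO true effect)
    All parameter values are illustrative assumptions.
    """
    joint = {}
    for c, c_star, a, y in product((0, 1), repeat=4):
        pr_c = p_c if c else 1 - p_c
        pr_recorded = sensitivity if c else 1 - specificity
        pr_c_star = pr_recorded if c_star else 1 - pr_recorded
        pr_a = p_a_given_c[c] if a else 1 - p_a_given_c[c]
        risk_y = base_risk + confounder_effect * c
        pr_y = risk_y if y else 1 - risk_y
        joint[(c, c_star, a, y)] = pr_c * pr_c_star * pr_a * pr_y
    return joint

def risk(joint, a, stratum_index=None, stratum_value=None):
    """P(Y=1 | A=a), optionally within one stratum of c (index 0) or c_star (index 1)."""
    numerator = denominator = 0.0
    for key, prob in joint.items():
        if key[2] != a:
            continue
        if stratum_index is not None and key[stratum_index] != stratum_value:
            continue
        denominator += prob
        if key[3] == 1:
            numerator += prob
    return numerator / denominator

def standardized_risk_difference(joint, stratum_index):
    """Stratum-specific risk differences averaged over the stratifier's distribution."""
    total = 0.0
    for value in (0, 1):
        weight = sum(p for key, p in joint.items() if key[stratum_index] == value)
        total += weight * (risk(joint, 1, stratum_index, value)
                           - risk(joint, 0, stratum_index, value))
    return total

joint = joint_distribution()
crude_rd = risk(joint, 1) - risk(joint, 0)               # no adjustment
rd_true_c = standardized_risk_difference(joint, 0)       # adjust for the true confounder
rd_observed_c = standardized_risk_difference(joint, 1)   # adjust for the recorded confounder

print(f"crude: {crude_rd:.4f}  true C: {rd_true_c:.4f}  recorded C: {rd_observed_c:.4f}")
```

Under these assumed values, adjusting for the true confounder recovers the null effect, while adjusting for the recorded version removes only part of the crude bias. That remaining gap is the residual confounding the summary refers to.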


Performance of Multiple Imputation Using Modern Machine Learning Methods in Electronic Health Records Data.
Authors: Getz K., Hubbard R.A., Linn K.A.
Source: Epidemiology (Cambridge, Mass.), 2023-03-01; 34(2), p. 206-215.
EPub date: 2022-12-09.
PMID: 36722803

Informative presence bias in analyses of electronic health records-derived data: a cautionary note.
Authors: Harton J., Mitra N., Hubbard R.A.
Source: Journal of the American Medical Informatics Association : JAMIA, 2022-06-14; 29(7), p. 1191-1199.
PMID: 35438796

SAT: a Surrogate-Assisted Two-wave case boosting sampling method, with application to EHR-based association studies.
Authors: Liu X., Chubak J., Hubbard R.A., Chen Y.
Source: Journal of the American Medical Informatics Association : JAMIA, 2022-04-13; 29(5), p. 918-927.
PMID: 34962283

Integrating real world data and clinical trial results using survival data reconstruction and marginal moment-balancing weights.
Authors: Getz K., Mamtani R., Hubbard R.A.
Source: Journal of biopharmaceutical statistics, 2022-01-02; 32(1), p. 191-203.
EPub date: 2021-11-10.
PMID: 34756156

A cost-effective chart review sampling design to account for phenotyping error in electronic health records (EHR) data.
Authors: Yin Z., Tong J., Chen Y., Hubbard R.A., Tang C.Y.
Source: Journal of the American Medical Informatics Association : JAMIA, 2021-12-28; 29(1), p. 52-61.
PMID: 34718618

Core concepts in pharmacoepidemiology: Violations of the positivity assumption in the causal analysis of observational data: Consequences and statistical approaches.
Authors: Zhu Y., Hubbard R.A., Chubak J., Roy J., Mitra N.
Source: Pharmacoepidemiology and drug safety, 2021 Nov; 30(11), p. 1471-1485.
EPub date: 2021-08-24.
PMID: 34375473

Characterizing Bias Due to Differential Exposure Ascertainment in Electronic Health Record Data.
Authors: Hubbard R.A., Lett E., Ho G.Y.F., Chubak J.
Source: Health services & outcomes research methodology, 2021 Sep; 21(3), p. 309-323.
EPub date: 2021-01-04.
PMID: 34366704

Bias Reduction Methods for Propensity Scores Estimated from Error-Prone EHR-Derived Covariates.
Authors: Harton J., Mamtani R., Mitra N., Hubbard R.A.
Source: Health services & outcomes research methodology, 2021 Jun; 21, p. 169-187.
EPub date: 2020-09-10.
PMID: 34149306

Neighborhood-level measures of socioeconomic status are more correlated with individual-level measures in urban areas compared with less urban areas.
Authors: Xie S., Hubbard R.A., Himes B.E.
Source: Annals of epidemiology, 2020 Mar; 43, p. 37-43.e4.
EPub date: 2020-02-11.
PMID: 32151518

The proportion of Model for End-stage Liver Disease Sodium score attributable to creatinine independently predicts post-transplant survival and renal complications.
Authors: Bittermann T., Hubbard R.A., Lewis J.D., Goldberg D.S.
Source: Clinical transplantation, 2020 Mar; 34(3), p. e13817.
EPub date: 2020-02-20.
PMID: 32027405

Effectiveness of First-line Immune Checkpoint Blockade Versus Carboplatin-based Chemotherapy for Metastatic Urothelial Cancer.
Authors: Feld E., Harton J., Meropol N.J., Adamson B.J.S., Cohen A., Parikh R.B., Galsky M.D., Narayan V., Christodouleas J., Vaughn D.J., et al.
Source: European urology, 2019 Oct; 76(4), p. 524-532.
EPub date: 2019-07-28.
PMID: 31362898