Division of Cancer Control & Population Sciences

Grant Details

Grant Number:	5R21CA143242-02
Primary Investigator:	Chubak, Jessica
Organization:	Kaiser Foundation Health Plan Of Washington
Project Title:	Algorithms to Identify Second Breast Cancer Events From Electronic Data
Fiscal Year:	2011

Abstract

DESCRIPTION (provided by applicant): U.S. breast cancer survivors number 2.5 million, more than the survivors of any other cancer. Studies on how to improve survival and quality of life in this ever-growing population are critical in reducing the national cancer burden. The ability to identify second breast cancer events (i.e., breast cancer recurrences and second primary breast cancers) is critical for cancer survivorship research. In response to the National Cancer Institute's call for studies of cancer surveillance using health claims data, we propose to develop and validate algorithms to identify second breast cancer events from automated healthcare utilization data in order to minimize the need for expensive and time-consuming manual medical record review. Automated healthcare utilization data are becoming increasingly accessible; however, these sources have yet to be validated against gold-standard medical record abstraction for obtaining information on second breast cancer events. This work is significant and necessary since state tumor registries do not routinely collect information on cancer recurrences. The proposed study will be conducted using data from two integrated healthcare delivery systems within the Cancer Research Network (CRN): Group Health Cooperative (in western Washington State) and the Henry Ford Health System (in Detroit, Michigan). These healthcare systems have extensive automated data on enrollment, diagnoses, procedures, and prescription medication fills. The proposed study is efficient because it will use gold-standard data on second breast cancer events that have already been abstracted on ~2500 women as part of previously funded studies of breast cancer outcomes. The sample of women will be divided into a training dataset (60%) for algorithm development and a testing dataset (40%) for validation. 
The primary aim of this study is to develop a "menu" of algorithms that researchers can select from under different circumstances; i.e., when they want to maximize sensitivity, specificity, or positive predictive value. Secondary analyses will explore: 1) whether algorithms developed in one population are valid in another, and 2) whether valid algorithms can be developed using more limited sources of data that are likely to be available in a larger number of healthcare settings. This project will use innovative approaches to develop the algorithm "menu" and to explore the generalizability of algorithm development.
PUBLIC HEALTH RELEVANCE: As the number of breast cancer survivors grows, research on breast cancer prognosis and quality of life is becoming increasingly important to public health; however, current methods for collecting data on breast cancer recurrences and second primary breast cancers are either time-consuming and costly or have not yet been validated. Being able to identify breast cancer outcomes from automated healthcare data is necessary for conducting large-scale, population-based studies to identify and modify factors that impact the prognosis and quality of life of women with breast cancer.
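
As an illustration only, the 60%/40% training/testing division and the three accuracy measures named in the abstract (sensitivity, specificity, positive predictive value) can be sketched in Python. This is not the investigators' code: the function names, the fixed random seed, and the toy data are all assumptions made for the example.

```python
import math
import random
from typing import NamedTuple, Sequence


class Accuracy(NamedTuple):
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    ppv: float          # TP / (TP + FP)


def split_train_test(ids: Sequence[int], train_percent: int = 60, seed: int = 0):
    """Randomly divide subject IDs into training and testing sets.

    The 60/40 proportion comes from the abstract; the seed and the use of
    simple random shuffling are assumptions for this sketch.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a measure is undefined.
    return numerator / denominator if denominator else math.nan


def accuracy_measures(predicted: Sequence[bool], gold: Sequence[bool]) -> Accuracy:
    """Compare algorithm output with gold-standard chart abstraction."""
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    fn = sum(not p and g for p, g in zip(predicted, gold))
    tn = sum(not p and not g for p, g in zip(predicted, gold))
    return Accuracy(_ratio(tp, tp + fn), _ratio(tn, tn + fp), _ratio(tp, tp + fp))


# Toy usage with made-up flags (hypothetical data, not study results).
train_ids, test_ids = split_train_test(range(2500))
toy = accuracy_measures(
    predicted=[True, True, False, False, True],
    gold=[True, False, False, True, True],
)
```

A "menu" in the abstract's sense would then amount to reporting these measures for several candidate algorithms on the testing set, so that a researcher can pick the one that favors the measure that matters most for a given study.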

Publications

Optimal Surrogate-Assisted Sampling for Cost-Efficient Validation of Electronic Health Record Outcomes.
Authors: Marks-Anglin A., Chen J., Luo C., Hubbard R., Chen Y.
Source: Statistics in Medicine, 2025 May; 44(10-12), p. e70095.
PMID: 40404279

SAT: a Surrogate-Assisted Two-wave case boosting sampling method, with application to EHR-based association studies.
Authors: Liu X., Chubak J., Hubbard R.A., Chen Y.
Source: Journal of the American Medical Informatics Association : JAMIA, 2021 Dec 28.
EPub date: 2021 Dec 28.
PMID: 34962283

Incorporating Breast Cancer Recurrence Events Into Population-Based Cancer Registries Using Medical Claims: Cohort Study.
Authors: A'mar T., Beatty J.D., Fedorenko C., Markowitz D., Corey T., Lange J., Schwartz S.M., Huang B., Chubak J., Etzioni R.
Source: JMIR Cancer, 2020 Aug 17; 6(2), p. e18143.
EPub date: 2020 Aug 17.
PMID: 32804084

Inflation of type I error rates due to differential misclassification in EHR-derived outcomes: Empirical illustration using breast cancer recurrence.
Authors: Chen Y., Wang J., Chubak J., Hubbard R.A.
Source: Pharmacoepidemiology and Drug Safety, 2018 Oct 30.
EPub date: 2018 Oct 30.
PMID: 30375122

An Electronic Health Record-based Algorithm to Ascertain the Date of Second Breast Cancer Events.
Authors: Chubak J., Onega T., Zhu W., Buist D.S.M., Hubbard R.A.
Source: Medical Care, 2017 Dec; 55(12), p. e81-e87.
PMID: 29135770

Administrative Data Algorithms to Identify Second Breast Cancer Events Following Early-stage Invasive Breast Cancer.
Authors: Chubak J., Yu O., Pocobelli G., Lamerato L., Webster J., Prout M.N., Ulcickas Yood M., Barlow W.E., Buist D.S.
Source: Journal of the National Cancer Institute, 2012 Jun 20; 104(12), p. 931-40.
PMID: 22547340

Tradeoffs Between Accuracy Measures for Electronic Health Care Data Algorithms.
Authors: Chubak J., Pocobelli G., Weiss N.S.
Source: Journal of Clinical Epidemiology, 2012 Mar; 65(3), p. 343-349.e2.
PMID: 22197520
