1R01CA266574-01A1
Refined Capture-Recapture Methods for Surveilling Cancer Recurrence
The monitoring of disease prevalence and estimation of the number of affected individuals in a defined
population are among the crucial goals of epidemiologic surveillance for chronic and infectious diseases. This
proposal aims to provide novel and reliable statistical tools to improve best practices for design and analysis of
such surveillance studies. We take specific motivation from timely challenges associated with the
registry-based monitoring of cancer recurrences through the Georgia Cancer Registry (GCR).
We focus on customizing capture-recapture (C-R) methods, which are increasingly used for
estimating total numbers of cases or deaths based on multiple epidemiologic surveillance streams. We clarify
underappreciated pitfalls associated with widely popular log-linear model-based C-R techniques, and propose
an accessible approach to sensitivity analysis with data visualization that promotes a general strategy for more
appropriate propagation of uncertainty into ultimate estimates of case totals. This in turn provides a gateway to
a broad class of useful models, whereby practitioners can transparently encode assumptions about how
surveillance streams operate relative to one another at the population level. As a next step, we consider the
case in which one surveillance stream is implemented by means of a well-controlled sampling design. Under
appropriate conditions, this provides what we refer to as an “anchor stream”, whereby otherwise ever-present
inherent uncertainties in specifying a defensible C-R model are overcome. In this setting, we will promote best
statistical practices for estimating case totals by means of a novel C-R estimator that harnesses the power of
the principled sampling behind the anchor stream while offering markedly enhanced precision. We propose to
extend this approach to account for misclassification, which is inevitable in the case of our motivating study of
cancer recurrence and in any setting in which surveillance streams identify cases in an error-prone manner.
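
For orientation, the two ingredients discussed above can be sketched with classical textbook estimators. This is a minimal illustration only, not the novel estimator the proposal describes: the first function is Chapman's bias-corrected two-stream estimator, which assumes the two streams capture cases independently, and the second is the plain design-based estimate available when one stream is a simple random sample of the registry population. The function names and all numbers are hypothetical.

```python
def chapman_estimate(n1: int, n2: int, m: int) -> float:
    """Chapman's bias-corrected two-stream capture-recapture estimate.

    n1, n2: cases identified by stream 1 and stream 2 (hypothetical counts)
    m:      cases identified by both streams
    Assumes the two streams capture cases independently.
    """
    if min(n1, n2, m) < 0 or m > min(n1, n2):
        raise ValueError("need 0 <= m <= min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def random_sample_estimate(population_size: int, sample_size: int,
                           cases_in_sample: int) -> float:
    """Design-based case total from a simple random sample of the population.

    Scales the sample case proportion up to the full population; its validity
    rests on the sampling design, not on a model for how streams interact.
    """
    if sample_size <= 0 or not 0 <= cases_in_sample <= sample_size:
        raise ValueError("need 0 <= cases_in_sample <= sample_size, sample_size > 0")
    return population_size * cases_in_sample / sample_size


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(round(chapman_estimate(60, 50, 30), 2))
    print(random_sample_estimate(1000, 200, 20))
```

The random-sample estimate needs no assumption about dependence between streams but is imprecise when the sample is small, while the Chapman estimate uses all captured cases but depends on the independence assumption. The proposal's aim, as stated above, is an estimator that keeps the design-based validity of the anchor stream while gaining precision from the other streams.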
We will tailor proposed methodology toward breast and colorectal cancer recurrence monitoring via the
ongoing Cancer Recurrence Information and Surveillance Program (CRISP), based on the GCR. CRISP
actively compiles informative but potentially false-positive recurrence signals from up to 6 data streams, and
conducts validation sampling through protocol-based medical record review to confirm true cases among
signaled recurrences. We will use such validation data to adjust for misclassification in estimating C-R-based
recurrence counts. In particular, the current project will implement a principled “anchor stream” random sample
of 200 GCR patients for validation through medical record review, leading to valid and demonstrably precise
estimates of true recurrence counts over the study period that are free of misclassification bias.
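
As a rough illustration of how validation data can correct false-positive signals, the sketch below scales a count of signaled recurrences by the positive predictive value (PPV) estimated from reviewed records. This is a simplified stand-in, not the project's estimator: it assumes the reviewed records are a random sample of the signaled ones, it corrects only for false positives (not for recurrences that no stream signaled), and the function name and numbers are hypothetical.

```python
def ppv_adjusted_count(signaled: int, reviewed: int, confirmed: int) -> float:
    """Adjust a signaled-case count using validation data.

    signaled:  number of recurrences flagged by a surveillance stream
    reviewed:  number of flagged records checked by medical record review
    confirmed: number of reviewed records confirmed as true recurrences
    Assumes reviewed records are a random sample of the flagged records.
    """
    if signaled < 0 or reviewed <= 0 or not 0 <= confirmed <= reviewed:
        raise ValueError("need signaled >= 0 and 0 <= confirmed <= reviewed, reviewed > 0")
    ppv = confirmed / reviewed  # estimated positive predictive value
    return signaled * ppv


if __name__ == "__main__":
    # Made-up counts: 500 flagged, 100 reviewed, 80 confirmed.
    print(ppv_adjusted_count(500, 100, 80))
```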
Tailoring capture-recapture methods to estimate registry-based case counts based on error-prone diagnostic signals.
, Zhang Y., Ward K.C., Lash T.L., Waller L.A., Lyles R.H.
Statistics in Medicine, 2023-07-30; 42(17), p. 2928-2943.
Using Capture-Recapture Methodology to Enhance Precision of Representative Sampling-Based Case Count Estimates.
, Zhang Y., Ge L., England C., Ward K., Lash T.L., Waller L.A.
Journal of Survey Statistics and Methodology, 2022 Nov; 10(5), p. 1292-1318.