Grant Details
Grant Number: 3P50CA271357-02S1
Primary Investigator: Panageas, Katherine
Organization: Sloan-Kettering Inst Can Research
Project Title: MATCHES: Making Telehealth Delivery of Cancer Care at Home Effective and Safe - Addressing Missing Data in the MATCHES Study to Improve ML/AI Readiness
Fiscal Year: 2023
Abstract
Project Summary:
The MATCHES (Making Telehealth Delivery of Cancer Care at Home Effective and Safe) Telehealth Research
Center aims to build the evidence base necessary to establish best practices for telehealth-enabled cancer care.
Prior work demonstrates that oncology-focused telehealth can achieve favorable outcomes, but large-scale trials
have been limited to specific contexts like palliative care or survivorship. Adoption has been constrained by
restricted reimbursement. The MATCHES Center will help remediate this evidence gap by executing prospective
trials and conducting observational analyses. Data will be integrated across multiple layers, including telehealth platforms,
patient portals, mobile tracking devices, and the electronic health record (EHR). This will help develop a new
paradigm in oncology—precision care delivery—with the ultimate goal of matching individual patients with the
most beneficial combination of clinic-based or telehealth-supported home-setting care at the appropriate time—
all based on the totality of dynamically available data. This will be accomplished by applying data science
methods—including nimble trial designs and machine learning—that have had limited application to telehealth.
Missing data have been observed in the MATCHES curated data sets, a common issue in
both EHR and patient-reported health data. Because of these missing data, the MATCHES data are not
ready for machine learning or artificial intelligence applications, as inappropriate handling of missing data can
lead to both bias and loss of statistical power. Bias is particularly concerning if a subgroup of patients is more
likely to have missing data. For example, if low-income patients are more likely to skip self-reported outcomes
for fear of triggering costly work-up, their experience will be underrepresented in the data and analysis,
compromising the robustness and generalizability of conclusions. These issues are well-recognized in the
statistical literature and a wide array of tools have been developed to impute missing data with plausible values
obtained from a probabilistic model and perform analyses recognizing that some data points are imputed.
However, many imputation methods do not scale up to the dimensions in the MATCHES data, and they may not
be robust to different missing data mechanisms. Additionally, there is no guidance on how to examine the missing
data patterns systematically, especially in a high-dimensional feature space such as that of MATCHES. Hence, in this
supplement, we propose to develop machine-learning-based approaches that can handle a high-dimensional
feature matrix, complex patterns of missingness, and more general missing data mechanisms. We will
then apply these methods to examine the complex missing data patterns and provide imputed data sets that are
ready for ML/AI applications, both for the researchers of the MATCHES program and for sharing with others
across the Telehealth Research Centers of Excellence (TRACE). We will also provide analysis pipelines that will
help appropriately handle missing data in other large-scale multi-modality healthcare data sets.
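As a rough illustration of what examining missing data patterns can involve, the sketch below tabulates which combinations of features are missing together and compares the missing rate of a self-reported outcome across a subgroup. It is not the method proposed in this supplement: the records, the feature names (`pain_score`, `steps`, `income`), and both helper functions are hypothetical, and a real analysis of high-dimensional data would need far more scalable tooling.

```python
from collections import Counter

# Hypothetical toy records (not MATCHES data); None marks a missing value.
records = [
    {"income": "low", "pain_score": None, "steps": 4200},
    {"income": "low", "pain_score": None, "steps": None},
    {"income": "low", "pain_score": 3, "steps": 5100},
    {"income": "high", "pain_score": 2, "steps": 8000},
    {"income": "high", "pain_score": 4, "steps": None},
    {"income": "high", "pain_score": 1, "steps": 7600},
]
features = ["pain_score", "steps"]


def missingness_patterns(rows, cols):
    """Count each distinct pattern of missing (1) / observed (0) across cols."""
    return Counter(tuple(int(r[c] is None) for c in cols) for r in rows)


def missing_rate_by_group(rows, col, group_col):
    """Fraction of rows with `col` missing, within each level of `group_col`."""
    totals, missing = Counter(), Counter()
    for r in rows:
        totals[r[group_col]] += 1
        missing[r[group_col]] += int(r[col] is None)
    return {g: missing[g] / totals[g] for g in totals}


patterns = missingness_patterns(records, features)
rates = missing_rate_by_group(records, "pain_score", "income")

for pattern, count in sorted(patterns.items()):
    print(dict(zip(features, pattern)), count)
print(rates)
```

In this toy data the self-reported outcome is missing more often in the low-income group than in the high-income group, which is the kind of subgroup imbalance the abstract describes as a source of bias when missing data are handled inappropriately.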
Publications
None. See parent grant details.