Grant Details

Grant Number: 3P50CA271357-02S1
Primary Investigator: Panageas, Katherine
Organization: Sloan-Kettering Inst Can Research
Project Title: MATCHES: Making Telehealth Delivery of Cancer Care at Home Effective and Safe - Addressing Missing Data in the MATCHES Study to Improve ML/AI Readiness
Fiscal Year: 2023


Project Summary: The MATCHES (Making Telehealth Delivery of Cancer Care at Home Effective and Safe) Telehealth Research Center aims to build the evidence base necessary to establish best practices for telehealth-enabled cancer care. Prior work demonstrates that oncology-focused telehealth can achieve favorable outcomes, but large-scale trials have been limited to specific contexts such as palliative care or survivorship. Adoption has been constrained by restricted reimbursement. The MATCHES Center will help close this evidence gap by executing prospective trials and conducting observational analyses. Data will be integrated across multiple layers: telehealth platforms, patient portals, mobile tracking devices, and the electronic health record (EHR). This will help develop a new paradigm in oncology, precision care delivery, with the ultimate goal of matching individual patients with the most beneficial combination of clinic-based or telehealth-supported home-setting care at the appropriate time, based on the totality of dynamically available data. This will be accomplished by applying data science methods, including nimble trial designs and machine learning, that have had limited application to telehealth. Missing data have been observed in the MATCHES curated data sets, a common issue in both EHR and patient-reported health data. Because of these missing data, the MATCHES data are not ready for machine learning or artificial intelligence applications: inappropriate handling of missing data can lead to both bias and loss of statistical power. Bias is particularly concerning if a subgroup of patients is more likely to have missing data. For example, if low-income patients are more likely to skip self-reported outcomes for fear of triggering costly work-up, their experience will be underrepresented in the data and analysis, compromising the robustness and generalizability of conclusions.
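The subgroup example above can be made concrete with a small sketch. The records, group labels, and scores below are invented for illustration and are not MATCHES data. Group-mean imputation is used only as a simple stand-in, not as the method proposed in this supplement, and it assumes that missingness depends only on the subgroup.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical toy records: (income_group, symptom_score), None = not reported.
# The low-income group reports higher scores but skips reporting more often.
records = [
    ("low", 8), ("low", None), ("low", None), ("low", 7),
    ("high", 3), ("high", 4), ("high", 2), ("high", None),
]

def missing_rate_by_group(rows):
    """Fraction of missing outcome values within each subgroup."""
    total, missing = Counter(), Counter()
    for group, score in rows:
        total[group] += 1
        if score is None:
            missing[group] += 1
    return {g: missing[g] / total[g] for g in total}

def complete_case_mean(rows):
    """Mean over observed scores only (missing records are dropped)."""
    return mean(s for _, s in rows if s is not None)

def group_mean_impute(rows):
    """Fill each missing score with the observed mean of its own subgroup."""
    observed = defaultdict(list)
    for group, score in rows:
        if score is not None:
            observed[group].append(score)
    group_means = {g: mean(v) for g, v in observed.items()}
    return [(g, s if s is not None else group_means[g]) for g, s in rows]

rates = missing_rate_by_group(records)
cc_mean = complete_case_mean(records)
imputed_mean = mean(s for _, s in group_mean_impute(records))

print(rates)         # missingness is higher in the "low" group
print(cc_mean)       # complete-case mean underweights the "low" group
print(imputed_mean)  # mean after subgroup-aware imputation
```

In this toy data the complete-case mean is pulled toward the group that reports more often, which is the underrepresentation the summary describes.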
These issues are well recognized in the statistical literature, and a wide array of tools has been developed to impute missing data with plausible values obtained from a probabilistic model and to perform analyses that account for the fact that some data points are imputed. However, many imputation methods do not scale to the dimensions of the MATCHES data, and they may not be robust to different missing data mechanisms. Additionally, there is no guidance on how to examine missing data patterns systematically, especially in a high-dimensional feature space such as that of MATCHES. Hence, in this supplement, we propose to develop machine-learning-based approaches that can handle a high-dimensional feature matrix, complex patterns of missingness, and more general missingness mechanisms. We will then apply these methods to examine the complex missing data patterns and provide imputed data sets that are ready for ML/AI applications, both for researchers in the MATCHES program and for sharing with others across the Telehealth Research Centers of Excellence (TRACE). We will also provide analysis pipelines that will help appropriately handle missing data in other large-scale, multi-modality healthcare data sets.
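One standard way to analyze data while acknowledging that some values are imputed is multiple imputation with Rubin's rules: the analysis is run on each of several imputed data sets and the results are pooled so that the variance reflects imputation uncertainty. The sketch below shows only the pooling step. The function name and the input numbers are hypothetical, and the summary does not state that this is the pooling approach the supplement will use.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their squared standard errors.

    Assumes at least two imputed data sets, since the between-imputation
    variance uses an (m - 1) denominator.
    """
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # average within-imputation variance
    b = variance(estimates)          # between-imputation variance
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total

# Hypothetical results from m = 3 imputed data sets (illustrative numbers only)
estimates = [5.0, 5.5, 6.0]
variances = [0.25, 0.25, 0.25]

pooled, total_var = pool_rubin(estimates, variances)
print(pooled, total_var)
```

The total variance exceeds the average within-imputation variance whenever the imputed data sets disagree, so an analysis that treats a single imputed data set as fully observed would understate uncertainty.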


None. See parent grant details.
