Grant Details

Grant Number: 3R21CA258242-01S1
Primary Investigator: Yetisgen, Meliha
Organization: University of Washington
Project Title: Extraction of Symptom Burden From Clinical Narratives of Cancer Patients Using Natural Language Processing
Fiscal Year: 2022


Abstract

Although cancer in children and adolescents is rare, it is the leading cause of death by disease past infancy among children in the United States. The US Department of Health defines social determinants of health (SDOH) as “conditions in the environment that affect health, functioning, and quality of life outcomes and risks.” There is an extensive literature base linking race, ethnicity, and SDOH to pediatric cancer outcomes. SDOH are commonly queried in pediatric clinical practice. Very few SDOH data points, such as race and ethnicity, are recorded as discrete data fields; most are documented in clinical narratives in Electronic Health Records (EHRs), which makes it difficult to collect SDOH in clinical and research settings to improve patient care and advance clinical research. We therefore propose to develop novel deep learning-based NLP technologies that can extract detailed SDOH information from the EHRs of pediatric patients for secondary use. Our dataset will include clinical notes of pediatric patients from two institutions: Seattle Cancer Care Alliance (SCCA) and University of Washington Medical Center (UWMC). The SCCA cohort will include only pediatric cancer patients. To ensure the generalizability of extraction approaches across different institutions and patient populations, the UWMC cohort will include a random sample from the general pediatric population. Our final corpus will include thousands of clinical notes from hundreds of pediatric patients over a period of ten years (January 1, 2012 to December 31, 2021). We will design a frame-based event representation schema to capture the salient details of the following categories of SDOH: (1) health care access and quality, (2) living arrangements, (3) economic stability, (4) housing and hunger insecurity, (5) prior trauma/loss, (6) education access and quality, (7) patient and family substance use history, and (8) patient/family mental.
We will use active learning to sample a diverse and representative set of notes for gold standard annotation. Given this gold standard, our goal is automated extraction of SDOH from clinical narratives of pediatric patients with deep learning-based NLP approaches. The proposed frame-based event representation, active learning framework, and NLP architectures will be based on ongoing work from our ITCR - R21 project titled “Extraction of Symptom Burden from Clinical Narratives of Cancer Patients using Natural Language Processing” (1 R21 CA258242-01). All models and their implementations produced during the execution of this project will be shared with the community as open-source resources.



Publications


None. See parent grant details.

