Grant Details
Grant Number: 5U24CA248010-03
Primary Investigator: Savova, Guergana
Organization: Boston Children's Hospital
Project Title: Cancer Deep Phenotype Extraction From Electronic Medical Records
Fiscal Year: 2022
Abstract
Summary
Precise phenotype information is needed to advance translational cancer research, particularly to unravel the
effects of genetic, epigenetic, and systems changes on tumor behavior and responsiveness. Examples of
phenotypic variables in cancer include: tumor morphology (e.g. histopathologic diagnosis), co-morbid
conditions (e.g. associated immune disease), laboratory findings (e.g. gene amplification status), specific tumor
behaviors (e.g. metastasis) and response to treatment (e.g. effect of a chemotherapeutic agent on tumor).
Current models for correlating EMR data with –omics data largely ignore the clinical text, which remains one of
the most important sources of phenotype information for cancer patients. Unlocking the value of clinical text
has the potential to enable new insights about cancer initiation, progression, metastasis, and response to
treatment. We propose further collaboration to enhance the DeepPhe platform with new methods for cancer
deep phenotyping. Several aims propose investigation of biomedical information extraction where there has
been little or no previous work (e.g. clinical genomics). Visualization of extracted data, usability of the software,
and dissemination are also emphasized. A diverse set of oncology studies led by accomplished translational
investigators in Breast Cancer, Melanoma, Ovarian Cancer, Colorectal Cancer and Diffuse Large B-cell
Lymphoma will demonstrate the utility of the software. These labs will contribute phenotype variables for
extraction, test utility and usability of the software, and provide the setting for an extrinsic evaluation. The
proposed research bridges novel methods to automate cancer deep phenotype extraction from clinical text with
emerging standards in phenotype knowledge representation and NLP. This work is highly aligned with recent
calls in the scientific literature to advance scalable and robust methods of extracting and representing
phenotypes for precision medicine and translational research.
Publications
Inferring gender from first names: Comparing the accuracy of Genderize, Gender API, and the gender R package on authors of diverse nationality.
Authors: VanHelene A.D., Khatri I., Hilton C.B., Mishra S., Gamsiz Uzun E.D., Warner J.L.
Source: PLOS Digital Health, 2024 Oct; 3(10), p. e0000456.
EPub date: 2024-10-29.
PMID: 39471154
DeepPhe-CR: Natural Language Processing Software Services for Cancer Registrar Case Abstraction.
Authors: Hochheiser H., Finan S., Yuan Z., Durbin E.B., Jeong J.C., Hands I., Rust D., Kavuluru R., Wu X.C., Warner J.L., et al.
Source: medRxiv: The Preprint Server for Health Sciences, 2023-10-26.
EPub date: 2023-10-26.
PMID: 37205575
An End-to-End Natural Language Processing System for Automatically Extracting Radiation Therapy Events From Clinical Texts.
Authors: Bitterman D.S., Goldner E., Finan S., Harris D., Durbin E.B., Hochheiser H., Warner J.L., Mak R.H., Miller T., Savova G.K.
Source: International Journal of Radiation Oncology, Biology, Physics, 2023-09-01; 117(1), p. 262-273.
EPub date: 2023-03-27.
PMID: 36990288
DeepPhe-CR: Natural Language Processing Software Services for Cancer Registrar Case Abstraction.
Authors: Hochheiser H., Finan S., Yuan Z., Durbin E.B., Jeong J.C., Hands I., Rust D., Kavuluru R., Wu X.C., Warner J.L., et al.
Source: JCO Clinical Cancer Informatics, 2023 Sep; 7, p. e2300156.
PMID: 38113411
Open-source Software Sustainability Models: Initial White Paper From the Informatics Technology for Cancer Research Sustainability and Industry Partnership Working Group.
Authors: Ye Y., Barapatre S., Davis M.K., Elliston K.O., Davatzikos C., Fedorov A., Fillion-Robin J.C., Foster I., Gilbertson J.R., Lasso A., et al.
Source: Journal of Medical Internet Research, 2021-12-02; 23(12), p. e20028.
EPub date: 2021-12-02.
PMID: 34860667
Characterizing the Anticancer Treatment Trajectory and Pattern in Patients Receiving Chemotherapy for Cancer Using Harmonized Observational Databases: Retrospective Study.
Authors: Jeon H., You S.C., Kang S.Y., Seo S.I., Warner J.L., Belenkaya R., Park R.W.
Source: JMIR Medical Informatics, 2021-04-06; 9(4), p. e25035.
EPub date: 2021-04-06.
PMID: 33720842
Use of Natural Language Processing to Extract Clinical Cancer Phenotypes from Electronic Medical Records.
Authors: Savova G.K., Danciu I., Alamudun F., Miller T., Lin C., Bitterman D.S., Tourassi G., Warner J.L.
Source: Cancer Research, 2019-11-01; 79(21), p. 5463-5470.
EPub date: 2019-08-08.
PMID: 31395609