Grant Details

Grant Number: 5U24CA194215-05
Primary Investigator: Tao, Cui
Organization: University of Texas Health Science Center Houston
Project Title: Advancing Cancer Pharmacoepidemiology Research Through EHRs and Informatics
Fiscal Year: 2020


Abstract

DESCRIPTION (provided by applicant): The goal of cancer pharmacoepidemiology is to identify adverse and/or long-term effects of chemotherapeutic agents and to determine the impact of drugs on cancer risk, prevention, and response to treatment. Pharmacoepidemiology studies exert a strong influence on defining optimal treatments and accelerating translational research. It is therefore imperative that these studies be conducted efficiently and that they leverage real-world patient data such as electronic health records (EHRs). Massive clinical data from EHRs are being tapped for research in disease-gene associations, comparative effectiveness, and clinical outcomes. There is, however, a paucity of pharmacoepidemiological studies using comprehensive EHR data, owing to the inherent challenges of data abstraction, handling, and analysis. The hurdles include heterogeneity of reports, detailed clinical information embedded in narrative text, differing EHR platforms across sites, and missing data, to name a few. In this study, we propose to integrate and extend preexisting tools to build an informatics infrastructure for EHR data extraction, interpretation, management, and analysis to advance cancer pharmacoepidemiology research. We will leverage existing natural language processing (NLP) tools, standardized ontologies, and clinical data management systems to extract and manipulate EHR data for cancer pharmacoepidemiological research. To achieve our goal, we propose four specific aims. In aim 1, we intend to develop a high-performance, user-centric information extraction framework with advanced features such as active learning (to reduce annotation cost), domain adaptation (to transfer data across multiple sites), and user-friendly interfaces (for non-technical end users).
In aim 2, we plan to improve data harmonization across differing platforms, develop components for seamless data export, and expand methodologies to address impediments inherent to EHR-based data (such as the missing data problem). In aim 3, we will conduct demonstration projects of cancer pharmacoepidemiology, including pharmacovigilance and pharmacogenomics of chemotherapeutic agents, to evaluate, refine, and validate the broad uses of our tools. Finally, in aim 4, we propose to disseminate the methods and tools developed in this project to the cancer research and pharmacoepidemiology communities.



Publications

Dermoscopy Differential Diagnosis Explorer (D3X) Ontology to Aggregate and Link Dermoscopic Patterns to Differential Diagnoses: Development and Usability Study.
Authors: Lin R.Z., Amith M.T., Wang C.X., Strickley J., Tao C.
Source: JMIR Medical Informatics, 2024-06-21; 12, p. e49613.
EPub date: 2024-06-21.
PMID: 38904996

Building a Dose Toxo-Equivalence Model From a Bayesian Meta-Analysis of Published Clinical Trials.
Authors: Sigworth E.A., Rubinstein S.M., Warner J.L., Chen Y., Chen Q.
Source: The Annals of Applied Statistics, 2023 Dec; 17(4), p. 2993-3012.
EPub date: 2023-10-30.
PMID: 39104542

Acquisition of a Lexicon for Family History Information: Bidirectional Encoder Representations From Transformers-Assisted Sublanguage Analysis.
Authors: Wang L., He H., Wen A., Moon S., Fu S., Peterson K.J., Ai X., Liu S., Kavuluru R., Liu H.
Source: JMIR Medical Informatics, 2023-06-27; 11, p. e48072.
EPub date: 2023-06-27.
PMID: 37368483

Assessment of a Naloxone Coprescribing Alert for Patients at Risk of Opioid Overdose: A Quality Improvement Project.
Authors: Nelson S.D., McCoy A.B., Rector H., Teare A.J., Barrett T.W., Sigworth E.A., Chen Q., Edwards D.A., Marcovitz D.E., Wright A.
Source: Anesthesia and Analgesia, 2022-07-01; 135(1), p. 26-34.
EPub date: 2022-06-16.
PMID: 35343932

Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing.
Authors: Wang L., Fu S., Wen A., Ruan X., He H., Liu S., Moon S., Mai M., Riaz I.B., Wang N., et al.
Source: JCO Clinical Cancer Informatics, 2022 Jul; 6, p. e2200006.
PMID: 35917480

COVID-19 Vaccination Gap in Admitted Trauma Patients: A Critical Opportunity.
Authors: Turer R.W., Chen Q., Jones I.D., Gondek S.P., Guillamondegui O.D., Dennis B.M.
Source: Journal of the American College of Surgeons, 2022-05-01; 234(5), p. 727-735.
PMID: 35426382

Development of a Bayesian toxo-equivalence model between docetaxel and paclitaxel.
Authors: Sigworth E.A., Rubinstein S.M., Chaugai S., Rivera D.R., Walker P.D., Chen Q., Warner J.L.
Source: iScience, 2022-04-15; 25(4), p. 104045.
EPub date: 2022-03-11.
PMID: 35359803

Are synthetic clinical notes useful for real natural language processing tasks: A case study on clinical entity recognition.
Authors: Li J., Zhou Y., Jiang X., Natarajan K., Pakhomov S.V., Liu H., Xu H.
Source: Journal of the American Medical Informatics Association : JAMIA, 2021-09-18; 28(10), p. 2193-2201.
PMID: 34272955

Nonselective beta-blockers are associated with a lower risk of hepatocellular carcinoma among cirrhotic patients in the United States.
Authors: Wijarnpreecha K., Li F., Xiang Y., Xu X., Zhu C., Maroufy V., Wang Q., Tao W., Dang Y., Pham H.A., et al.
Source: Alimentary Pharmacology & Therapeutics, 2021 Aug; 54(4), p. 481-492.
EPub date: 2021-07-05.
PMID: 34224163

Characterizing the Anticancer Treatment Trajectory and Pattern in Patients Receiving Chemotherapy for Cancer Using Harmonized Observational Databases: Retrospective Study.
Authors: Jeon H., You S.C., Kang S.Y., Seo S.I., Warner J.L., Belenkaya R., Park R.W.
Source: JMIR Medical Informatics, 2021-04-06; 9(4), p. e25035.
EPub date: 2021-04-06.
PMID: 33720842

Seven decades of chemotherapy clinical trials: a pan-cancer social network analysis.
Authors: Li X., Sigworth E.A., Wu A.H., Behrens J., Etemad S.A., Nagpal S., Go R.S., Wuichet K., Chen E.J., Rubinstein S.M., et al.
Source: Scientific Reports, 2020-10-16; 10(1), p. 17536.
EPub date: 2020-10-16.
PMID: 33067482

Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies.
Authors: Rasmy L., Tiryaki F., Zhou Y., Xiang Y., Tao C., Xu H., Zhi D.
Source: Journal of the American Medical Informatics Association : JAMIA, 2020-10-01; 27(10), p. 1593-1599.
PMID: 32930711

COVID-19 TestNorm: A tool to normalize COVID-19 testing names to LOINC codes.
Authors: Dong X., Li J., Soysal E., Bian J., DuVall S.L., Hanchrow E., Liu H., Lynch K.E., Matheny M., Natarajan K., et al.
Source: Journal of the American Medical Informatics Association : JAMIA, 2020-07-01; 27(9), p. 1437-1442.
PMID: 32569358

Efficient and Accurate Extracting of Unstructured EHRs on Cancer Therapy Responses for the Development of RECIST Natural Language Processing Tools: Part I, the Corpus.
Authors: Li Y., Luo Y.H., Wampfler J.A., Rubinstein S.M., Tiryaki F., Ashok K., Warner J.L., Xu H., Yang P.
Source: JCO Clinical Cancer Informatics, 2020 May; 4, p. 383-391.
PMID: 32364754

Electronic Health Records for Drug Repurposing: Current Status, Challenges, and Future Directions.
Authors: Xu H., Li J., Jiang X., Chen Q.
Source: Clinical Pharmacology and Therapeutics, 2020 Apr; 107(4), p. 712-714.
EPub date: 2020-02-03.
PMID: 32012237

Recognizing software names in biomedical literature using machine learning.
Authors: Wei Q., Zhang Y., Amith M., Lin R., Lapeyrolerie J., Tao C., Xu H.
Source: Health Informatics Journal, 2020 Mar; 26(1), p. 21-33.
EPub date: 2019-09-30.
PMID: 31566474

BERT-based Ranking for Biomedical Entity Normalization.
Authors: Ji Z., Wei Q., Xu H.
Source: AMIA Joint Summits on Translational Science Proceedings. AMIA Joint Summits on Translational Science, 2020; 2020, p. 269-277.
EPub date: 2020-05-30.
PMID: 32477646

Normalizing Clinical Document Titles to LOINC Document Ontology: an Initial Study.
Authors: Zuo X., Li J., Zhao B., Zhou Y., Dong X., Duke J., Natarajan K., Hripcsak G., Shah N., Banda J.M., et al.
Source: AMIA ... Annual Symposium Proceedings. AMIA Symposium, 2020; 2020, p. 1441-1450.
EPub date: 2021-01-25.
PMID: 33936520

Natural language processing for populating lung cancer clinical research data.
Authors: Wang L., Luo L., Wang Y., Wampfler J., Yang P., Liu H.
Source: BMC Medical Informatics and Decision Making, 2019-12-05; 19(Suppl 5), p. 239.
EPub date: 2019-12-05.
PMID: 31801515

Applying a deep learning-based sequence labeling approach to detect attributes of medical concepts in clinical text.
Authors: Xu J., Li Z., Wei Q., Wu Y., Xiang Y., Lee H.J., Zhang Y., Wu S., Xu H.
Source: BMC Medical Informatics and Decision Making, 2019-12-05; 19(Suppl 5), p. 236.
EPub date: 2019-12-05.
PMID: 31801529

Cox regression increases power to detect genotype-phenotype associations in genomic studies using the electronic health record.
Authors: Hughey J.J., Rhoades S.D., Fu D.Y., Bastarache L., Denny J.C., Chen Q.
Source: BMC Genomics, 2019-11-04; 20(1), p. 805.
EPub date: 2019-11-04.
PMID: 31684865

Enhancing clinical concept extraction with contextual embeddings.
Authors: Si Y., Wang J., Xu H., Roberts K.
Source: Journal of the American Medical Informatics Association : JAMIA, 2019-11-01; 26(11), p. 1297-1304.
PMID: 31265066

Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP.
Authors: Soysal E., Warner J.L., Wang J., Jiang M., Harvey K., Jain S.K., Dong X., Song H.Y., Siddhanamatha H., Wang L., et al.
Source: Studies in Health Technology and Informatics, 2019-08-21; 264, p. 1041-1045.
PMID: 31438083

HemOnc: A New Standard Vocabulary for Chemotherapy Regimen Representation in the OMOP Common Data Model.
Authors: Warner J.L., Dymshyts D., Reich C.G., Gurley M.J., Hochheiser H., Moldwin Z.H., Belenkaya R., Williams A.E., Yang P.C.
Source: Journal of Biomedical Informatics, 2019-06-22; p. 103239.
EPub date: 2019-06-22.
PMID: 31238109

Detect Attributes of Medical Concepts via Sequence Labeling.
Authors: Xu J., Xiang Y., Li Z., Lee H.J., Xu H., Wei Q., Zhang Y., Wu Y., Wu S.
Source: IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics, 2019 Jun; 2019.
EPub date: 2019-11-21.
PMID: 32537570

Information Extraction for Populating Lung Cancer Clinical Research Data.
Authors: Wang L., Luo L., Wang Y., Wampfler J.A., Yang P., Liu H.
Source: IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics, 2019 Jun; 2019.
EPub date: 2019-11-21.
PMID: 32537571

A study of deep learning approaches for medication and adverse drug event extraction from clinical text.
Authors: Wei Q., Ji Z., Li Z., Du J., Wang J., Xu J., Xiang Y., Tiryaki F., Wu S., Zhang Y., et al.
Source: Journal of the American Medical Informatics Association : JAMIA, 2019-05-28.
EPub date: 2019-05-28.
PMID: 31135882

Discovery of Noncancer Drug Effects on Survival in Electronic Health Records of Patients With Cancer: A New Paradigm for Drug Repurposing.
Authors: Wu Y., Warner J.L., Wang L., Jiang M., Xu J., Chen Q., Nian H., Dai Q., Du X., Yang P., et al.
Source: JCO Clinical Cancer Informatics, 2019 May; 3, p. 1-9.
PMID: 31141421

Parsing clinical text using the state-of-the-art deep learning based parsers: a systematic comparison.
Authors: Zhang Y., Tiryaki F., Jiang M., Xu H.
Source: BMC Medical Informatics and Decision Making, 2019-04-04; 19(Suppl 3), p. 77.
EPub date: 2019-04-04.
PMID: 30943955

Integrating shortest dependency path and sentence sequence into a deep learning framework for relation extraction in clinical text.
Authors: Li Z., Yang Z., Shen C., Xu J., Zhang Y., Xu H.
Source: BMC Medical Informatics and Decision Making, 2019-01-31; 19(Suppl 1), p. 22.
EPub date: 2019-01-31.
PMID: 30700301

Achievability to Extract Specific Date Information for Cancer Research.
Authors: Wang L., Wampfler J., Dispenzieri A., Xu H., Yang P., Liu H.
Source: AMIA ... Annual Symposium Proceedings. AMIA Symposium, 2019; 2019, p. 893-902.
EPub date: 2020-03-04.
PMID: 32308886

Relation Extraction from Clinical Narratives Using Pre-trained Language Models.
Authors: Wei Q., Ji Z., Si Y., Du J., Wang J., Tiryaki F., Wu S., Tao C., Roberts K., Xu H.
Source: AMIA ... Annual Symposium Proceedings. AMIA Symposium, 2019; 2019, p. 1236-1245.
EPub date: 2020-03-04.
PMID: 32308921

Identifying direct temporal relations between time and events from clinical notes.
Authors: Lee H.J., Zhang Y., Jiang M., Xu J., Tao C., Xu H.
Source: BMC Medical Informatics and Decision Making, 2018-07-23; 18(Suppl 2), p. 49.
EPub date: 2018-07-23.
PMID: 30066643

Infusion Related Hypersensitivity Reactions with Bio-similar Anti CD-20 Monoclonal Antibody Rituximab in Indian Patients: A Retrospective Study.
Authors: Prakash G., Malhotra P., Khadwal A., Lad D., Suri V., Kumari S., Varma S.
Source: Indian Journal of Hematology & Blood Transfusion : An Official Journal of Indian Society of Hematology and Blood Transfusion, 2018 Apr; 34(2), p. 273-277.
EPub date: 2017-09-30.
PMID: 29622869

Assessing the Practice of Biomedical Ontology Evaluation: Gaps and Opportunities.
Authors: Amith M.F., He Z., Bian J., Antonio Lossio-Ventura J., Tao C.
Source: Journal of Biomedical Informatics, 2018-02-17.
EPub date: 2018-02-17.
PMID: 29462669

Detecting Pharmacovigilance Signals Combining Electronic Medical Records With Spontaneous Reports: A Case Study of Conventional Disease-Modifying Antirheumatic Drugs for Rheumatoid Arthritis.
Authors: Wang L., Rastegar-Mojarad M., Ji Z., Liu S., Liu K., Moon S., Shen F., Wang Y., Yao L., Davis III J.M., et al.
Source: Frontiers in Pharmacology, 2018; 9, p. 875.
EPub date: 2018-08-07.
PMID: 30131701

Computerized Approach to Creating a Systematic Ontology of Hematology/Oncology Regimens.
Authors: Malty A.M., Jain S.K., Yang P.C., Harvey K., Warner J.L.
Source: JCO Clinical Cancer Informatics, 2018; 2018.
EPub date: 2018-05-11.
PMID: 30238070

Combine Factual Medical Knowledge and Distributed Word Representation to Improve Clinical Named Entity Recognition.
Authors: Wu Y., Yang X., Bian J., Guo Y., Xu H., Hogan W.
Source: AMIA ... Annual Symposium Proceedings. AMIA Symposium, 2018; 2018, p. 1110-1117.
EPub date: 2018-12-05.
PMID: 30815153

PIE: A prior knowledge guided integrated likelihood estimation method for bias reduction in association studies using electronic health records data.
Authors: Huang J., Duan R., Hubbard R.A., Wu Y., Moore J.H., Xu H., Chen Y.
Source: Journal of the American Medical Informatics Association : JAMIA, 2017-12-01.
EPub date: 2017-12-01.
PMID: 29206922

CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines.
Authors: Soysal E., Wang J., Jiang M., Wu Y., Pakhomov S., Liu H., Xu H.
Source: Journal of the American Medical Informatics Association : JAMIA, 2017-11-24.
EPub date: 2017-11-24.
PMID: 29186491

A hybrid approach to automatic de-identification of psychiatric notes.
Authors: Lee H.J., Wu Y., Zhang Y., Xu J., Xu H., Roberts K.
Source: Journal of Biomedical Informatics, 2017-06-07.
EPub date: 2017-06-07.
PMID: 28602904

Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients.
Authors: Soysal E., Warner J.L., Denny J.C., Xu H.
Source: AMIA Joint Summits on Translational Science Proceedings. AMIA Joint Summits on Translational Science, 2017; 2017, p. 268-277.
EPub date: 2017-07-26.
PMID: 28815141

Automating the Determination of Prostate Cancer Risk Strata From Electronic Medical Records.
Authors: Gregg J.R., Lang M., Wang L.L., Resnick M.J., Jain S.K., Warner J.L., Barocas D.A.
Source: JCO Clinical Cancer Informatics, 2017; 2017.
EPub date: 2017-06-08.
PMID: 29541700

Leveraging existing corpora for de-identification of psychiatric notes using domain adaptation.
Authors: Lee H.J., Zhang Y., Roberts K., Xu H.
Source: AMIA ... Annual Symposium Proceedings. AMIA Symposium, 2017; 2017, p. 1070-1079.
EPub date: 2018-04-16.
PMID: 29854175

Clinical Named Entity Recognition Using Deep Learning Models.
Authors: Wu Y., Jiang M., Xu J., Zhi D., Xu H.
Source: AMIA ... Annual Symposium Proceedings. AMIA Symposium, 2017; 2017, p. 1812-1819.
EPub date: 2018-04-16.
PMID: 29854252



