Grant Details

Grant Number: 7R21CA231904-02
Primary Investigator: Zhang, G.Q.
Organization: University Of Texas Hlth Sci Ctr Houston
Project Title: An Ontology-Driven Faceted Query Engine for the Kentucky Cancer Registry
Fiscal Year: 2019


ABSTRACT

A key barrier in cancer research is the traditional data access workflow, which requires a hypothesis before patient data can be accessed, rather than a workflow that begins with data exploration while protecting privacy. Existing query engines allow researchers to explore clinical data, build queries, and execute them without needing to understand how the data is stored. However, the interfaces of such query engines have not achieved usability approaching that of consumer websites, due in critical part to the lack of faceted capabilities. Faceted systems for querying clinical data are currently unavailable because of the complexity of the data and the mismatch between the ontologies used for organizing and annotating clinical data (such as the NCI Thesaurus) and the desired facet structures and properties. We propose to overcome these challenges by developing OncoSphere, a query engine that uses the NCI Thesaurus as a nested facet system (NFS) to provide web-based exploration of the Kentucky Cancer Registry data, through three Specific Aims. In Aim 1, we will develop an approach to transform the NCI Thesaurus into an NFS and implement it to enable OncoSphere’s interface features. In Aim 2, we will develop methods to audit the quality of the hierarchical structure of the NCI Thesaurus so that it better supports faceted query in OncoSphere. In Aim 3, we will evaluate OncoSphere’s query expressiveness and query performance and conduct a preliminary usability assessment. OncoSphere will break new ground in web-based tools and capitalize on available data resources to accelerate cancer research. We expect OncoSphere and its future versions to become an invaluable resource for the cancer research community.
The long-term goal of this study is to create data exploration systems for NCI’s Surveillance, Epidemiology, and End Results (SEER) program and other related cancer data resources, using data science innovations to transform the user experience with a new generation of data interaction modalities.
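
To make the nested facet idea in the abstract concrete, the following is a minimal Python sketch of how an IS-A hierarchy could drive faceted filtering: selecting a concept matches records annotated with that concept or any of its descendants, and each child concept is shown with a record count. It is not OncoSphere's implementation. The concept labels, the records, and the function names (`build_children`, `descendants`, `select_facet`, `child_facet_counts`) are all hypothetical and chosen only for illustration; they are not taken from the NCI Thesaurus or the Kentucky Cancer Registry.

```python
from collections import defaultdict

# Toy IS-A edges as (child, parent) pairs. Hypothetical labels for
# illustration only; this is not real NCI Thesaurus content.
IS_A = [
    ("Carcinoma", "Neoplasm"),
    ("Sarcoma", "Neoplasm"),
    ("Adenocarcinoma", "Carcinoma"),
    ("Squamous Cell Carcinoma", "Carcinoma"),
]

# Toy records, each annotated with one concept (made-up data).
RECORDS = [
    {"id": 1, "concept": "Adenocarcinoma"},
    {"id": 2, "concept": "Adenocarcinoma"},
    {"id": 3, "concept": "Squamous Cell Carcinoma"},
    {"id": 4, "concept": "Sarcoma"},
    {"id": 5, "concept": "Carcinoma"},
]


def build_children(is_a):
    """Index the hierarchy as parent -> list of direct children."""
    children = defaultdict(list)
    for child, parent in is_a:
        children[parent].append(child)
    return children


def descendants(concept, children):
    """Return the concept itself plus every concept below it."""
    found = {concept}
    stack = [concept]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:  # guards against revisiting shared descendants
                found.add(child)
                stack.append(child)
    return found


def select_facet(concept, records, children):
    """Keep records annotated with the concept or any of its descendants."""
    scope = descendants(concept, children)
    return [r for r in records if r["concept"] in scope]


def child_facet_counts(concept, records, children):
    """Count matching records for each direct child, as a facet panel would."""
    return {
        child: len(select_facet(child, records, children))
        for child in children.get(concept, [])
    }


CHILDREN = build_children(IS_A)
print([r["id"] for r in select_facet("Carcinoma", RECORDS, CHILDREN)])
print(child_facet_counts("Neoplasm", RECORDS, CHILDREN))
```

Because a count at each level depends on the descendants reachable through IS-A links, a missing or incorrect link would silently change what a facet returns, which is one way to read the motivation for the hierarchy auditing proposed in Aim 2.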


Publications

Web-based interactive mapping from data dictionaries to ontologies, with an application to cancer registry.
Authors: Tao S., Zeng N., Hands I., Hurt-Mueller J., Durbin E.B., Cui L., Zhang G.Q.
Source: BMC medical informatics and decision making, 2020-12-15; 20(Suppl 10), p. 271.
EPub date: 2020-12-15.
PMID: 33319710

Detecting missing IS-A relations in the NCI Thesaurus using an enhanced hybrid approach.
Authors: Zheng F., Abeysinghe R., Sioutos N., Whiteman L., Remennik L., Cui L.
Source: BMC medical informatics and decision making, 2020-12-15; 20(Suppl 10), p. 273.
EPub date: 2020-12-15.
PMID: 33319703

A transformation-based method for auditing the IS-A hierarchy of biomedical terminologies in the Unified Medical Language System.
Authors: Zheng F., Shi J., Yang Y., Zheng W.J., Cui L.
Source: Journal of the American Medical Informatics Association : JAMIA, 2020-10-01; 27(10), p. 1568-1575.
PMID: 32918476

Enhancing the Quality of Hierarchic Relations in the National Cancer Institute Thesaurus to Enable Faceted Query of Cancer Registry Data.
Authors: Cui L., Abeysinghe R., Zheng F., Tao S., Zeng N., Hands I., Durbin E.B., Whiteman L., Remennik L., Sioutos N., et al.
Source: JCO clinical cancer informatics, 2020-05; 4, p. 392-398.
PMID: 32374632

SSIF: Subsumption-based Sub-term Inference Framework to audit Gene Ontology.
Authors: Abeysinghe R., Hinderer E.W., Moseley H.N.B., Cui L.
Source: Bioinformatics (Oxford, England), 2020-05-01; 36(10), p. 3207-3214.
PMID: 32065617

Web-based Interactive Visualization of Non-Lattice Subgraphs (WINS) in SNOMED CT.
Authors: Zhu W., Tao S., Cui L., Zhang G.Q.
Source: AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science, 2020; 2020, p. 740-749.
EPub date: 2020-05-30.
PMID: 32477697

Leveraging Non-lattice Subgraphs to Audit Hierarchical Relations in NCI Thesaurus.
Authors: Abeysinghe R., Brooks M.A., Cui L.
Source: AMIA ... Annual Symposium proceedings. AMIA Symposium, 2019; 2019, p. 982-991.
EPub date: 2020-03-04.
PMID: 32308895

Identifying Similar Non-Lattice Subgraphs in Gene Ontology based on Structural Isomorphism and Semantic Similarity of Concept Labels.
Authors: Abeysinghe R., Qu X., Cui L.
Source: AMIA ... Annual Symposium proceedings. AMIA Symposium, 2018; 2018, p. 1186-1195.
EPub date: 2018-12-05.
PMID: 30815161