ABSTRACT
A key barrier in cancer research is the traditional data access workflow that requires a hypothesis prior to
accessing patient data, rather than a workflow that begins with data exploration while protecting privacy.
Existing query engines allow researchers to explore clinical data, build queries, and execute queries without
the need for the user to understand how the data is stored. However, the interfaces of such query engines
have not achieved usability approaching the levels of those for consumer websites due in critical part to the
lack of faceted capabilities. Faceted systems for querying clinical data are currently unavailable due to the
complexity of the data and the mismatch between the ontologies used for organizing and annotating clinical data
(such as the NCI Thesaurus) and the desired facet structures and properties.
We propose to overcome these challenges by developing OncoSphere, a query engine using the NCI
Thesaurus as a nested facet system (NFS) to provide web-based exploration of the Kentucky Cancer Registry
data through three Specific Aims. In Aim 1 we will develop an approach to transform the NCI Thesaurus
into an NFS and implement it to enable OncoSphere’s interface features. In Aim 2 we will develop methods to perform quality
auditing on the hierarchical structure of the NCI Thesaurus so that it better supports faceted querying
in OncoSphere. In Aim 3 we will evaluate OncoSphere’s query expressiveness and query
performance and conduct a preliminary usability assessment. OncoSphere will break new ground in web-based
tools and capitalize on available data resources to accelerate cancer research. We expect OncoSphere and its
future versions to become an invaluable resource for the cancer research community. The long-term goal of
this study is to create data exploration systems for NCI’s Surveillance, Epidemiology, and End Results (SEER)
program and other related cancer data resources through data science innovations to transform user
experience with a new generation of data interaction modalities.