Grant Details
Grant Number: 3U01CA235488-03S1
Primary Investigator: Kechris-Mays, Katherina
Organization: University Of Colorado Denver
Project Title: Addressing Sparsity in Metabolomics Data Analysis
Fiscal Year: 2021
Abstract
Comprehensive profiling of the small molecule repertoire in a sample is referred to as metabolomics, and it is
being used to address a variety of scientific questions in biomedical studies. Recent technological advances in
mass spectrometry-based metabolomics have allowed for more comprehensive and sensitive measurements
of metabolites. Despite the technological advances, the bottleneck for taking full advantage of metabolomics
data is often the availability and usability of analysis tools. The goal of the parent award (U01CA235488) is to
develop novel statistical methods and software for the research community to improve the utilization of
metabolomics data, which will help maximize the potential of metabolomics to provide new discoveries in
disease etiology, diagnosis, and drug development. Software tools specifically designed for metabolomics
data, like those proposed in the parent U01 award and attendant RFA (NIH RFA-RM-17-012), are being
developed at an increasing rate. Many of these tools are open-source and freely available, but they are very
diverse with respect to programming language, data formats, and stage in the metabolomics pipeline. Several
of the challenges recognized in the NIH Common Fund Metabolomics Program are to “meet increasing
demand for user-friendly, open-source, bioinformatics tools for data analysis and interpretation” and
“coordinate community-wide identification and adoption of best practices for rigor, reproducibility and data
reuse.” To mitigate these challenges and further the consortium’s goals, we have built the MSCAT database
(https://mscat.metabolomicsworkbench.org) of metabolomics software tools that can be sustainably and
continually updated (U01CA235488-02S1). The database provides a survey of the landscape of available tools
and can assist researchers in the selection of data analysis workflows according to their specific needs. This
supplement proposal aims to extend this database project by further mining the literature to characterize tool
interoperability as outlined by their use in metabolomics studies and by analyzing the collected data about
software tools to extract factors contributing to tool adoption, usability, and utility. In Aim 1, we will develop a
text-mining process where the full text and co-citations of metabolomics studies are mined to identify which
combinations of tools were used in past studies, in order to validate the set of tools suggested by our database. In Aim
2, we will assess the metabolomics software landscape for tool redundancy (based on functionality) and correlate
software characteristics with tool adoption and interoperability.
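
As a rough illustration of the kind of co-usage mining described in Aim 1, the sketch below counts how often pairs of tool names are mentioned in the same study text. It is a minimal example under stated assumptions, not the project's actual pipeline: the tool list, the toy study sentences, and the function names `tools_mentioned` and `count_tool_pairs` are all hypothetical, and simple whole-word matching stands in for whatever text-mining method the project uses.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical tool list for illustration; a real run would draw names
# from a curated tool database rather than a hard-coded list.
TOOLS = ["XCMS", "MZmine", "MetaboAnalyst"]


def tools_mentioned(text, tools=TOOLS):
    """Return the sorted list of tool names that appear as whole words in text."""
    found = set()
    for tool in tools:
        pattern = r"\b" + re.escape(tool) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(tool)
    return sorted(found)


def count_tool_pairs(texts, tools=TOOLS):
    """Count, over all texts, how many texts mention each pair of tools together."""
    pairs = Counter()
    for text in texts:
        for pair in combinations(tools_mentioned(text, tools), 2):
            pairs[pair] += 1
    return pairs


# Invented example sentences standing in for the full text of studies.
studies = [
    "Peaks were picked with XCMS and statistics were run in MetaboAnalyst.",
    "We processed raw data in MZmine, then used MetaboAnalyst for pathway analysis.",
    "XCMS was used for alignment; MetaboAnalyst for downstream analysis.",
]

pair_counts = count_tool_pairs(studies)
for pair, n in pair_counts.most_common():
    print(pair, n)
```

Pairs that recur across many studies would be candidates for tool combinations that work together in practice; a fuller approach would also need to handle name variants, versions, and citations that mention a tool without using it.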
Publications
None. See parent grant details.