Grant Details

Grant Number: 3U01CA235488-03S1
Primary Investigator: Kechris-Mays, Katherina
Organization: University Of Colorado Denver
Project Title: Addressing Sparsity in Metabolomics Data Analysis
Fiscal Year: 2021


Comprehensive profiling of the small molecule repertoire in a sample is referred to as metabolomics, and it is being used to address a variety of scientific questions in biomedical studies. Recent technological advances in mass spectrometry-based metabolomics have allowed for more comprehensive and sensitive measurements of metabolites. Despite these advances, the bottleneck for taking full advantage of metabolomics data is often the availability and usability of analysis tools. The goal of the parent award (U01CA235488) is to develop novel statistical methods and software for the research community to improve the utilization of metabolomics data, which will help maximize the potential of metabolomics to provide new discoveries in disease etiology, diagnosis, and drug development. Software tools specifically designed for metabolomics data, like those proposed in the parent U01 award and attendant RFA (NIH RFA-RM-17-012), are being developed at an increasing rate. Many of these tools are open-source and freely available, but they are very diverse with respect to programming language, data formats, and stage in the metabolomics pipeline. Among the challenges recognized in the NIH Common Fund Metabolomics Program are to “meet increasing demand for user-friendly, open-source, bioinformatics tools for data analysis and interpretation” and “coordinate community-wide identification and adoption of best practices for rigor, reproducibility and data reuse.” To mitigate these challenges and further the consortium’s goals, we have built the MSCAT database of metabolomics software tools, which can be sustainably and continually updated (U01CA235488-02S1). The database provides a survey of the landscape of available tools and can assist researchers in the selection of data analysis workflows according to their specific needs.
This supplement proposal aims to extend this database project by further mining the literature to characterize tool interoperability, as evidenced by tool use in metabolomics studies, and by analyzing the collected data about software tools to extract factors contributing to tool adoption, usability, and utility. In Aim 1, we will develop a text-mining process in which the full text and co-citations of metabolomics studies are mined to identify which combinations of tools were used in past studies, in order to validate the set of tools suggested by our database. In Aim 2, we will assess the metabolomics software landscape for tool redundancy (based on functionality) and correlate software characteristics with tool adoption and interoperability.


None. See parent grant details.
