Grant Details
Grant Number: 5U01CA235507-04
Primary Investigator: Du, Xiuxia
Organization: University Of North Carolina Charlotte
Project Title: Cross-Platform and Graphical Software Tool for Adaptive LC/MS and GC/MS Metabolomics Data Preprocessing
Fiscal Year: 2021
Abstract
Project Summary / Abstract
Data preprocessing is critical for the success of any MS-based untargeted metabolomics study, as it is the first
informatics step for making sense of the data. Despite the enormous contributions that existing software tools
have made to metabolomics, errors in compound identification and relative quantitation are still plaguing the field.
This issue is becoming more serious as the sensitivity of LC/MS and GC/MS platforms is constantly increasing.
Preprocessing involves peak detection; peak grouping and annotation (for LC/MS data) or spectral deconvolution
(for GC/MS data); and peak alignment. Existing software tools invariably yield an immense number of false positive
and false negative peaks, produce inaccurate peak groups, misalign detected peaks, and extract inaccurate
relative quantitation information for metabolites. These errors can translate downstream into spurious or missing
compound identifications and cause misleading interpretations of the metabolome. Furthermore, users need to
specify a large number of parameters for existing software tools to work. Unfortunately, general users usually
do not understand how to optimize these parameters, and maximizing one aspect (e.g., sensitivity) often has
deleterious effects on another (e.g., specificity). We will address these challenges by developing more accurate
algorithms for improving the rigor and reproducibility of data preprocessing. The proposed algorithms will be
implemented in Java and integrated with the widely used MZmine 2, making the software cross-platform and
user-friendly with rich visualization capabilities. In addition, the implementation will be optimized for memory
efficiency and computing speed, allowing large-scale data preprocessing. Extensive testing of the software will be
conducted in close collaboration with metabolomics core facilities and users around the world.
Publications
None