Grant Details

Grant Number: 5U01CA235507-04
Primary Investigator: Du, Xiuxia
Organization: University of North Carolina Charlotte
Project Title: Cross-Platform and Graphical Software Tool for Adaptive LC/MS and GC/MS Metabolomics Data Preprocessing
Fiscal Year: 2021


Project Summary / Abstract

Data preprocessing is critical for the success of any MS-based untargeted metabolomics study, as it is the first informatics step for making sense of the data. Despite the enormous contributions that existing software tools have made to metabolomics, errors in compound identification and relative quantitation still plague the field. This issue is becoming more serious as the sensitivity of LC/MS and GC/MS platforms continues to increase. Preprocessing involves peak detection; peak grouping and annotation for LC/MS data, or spectral deconvolution for GC/MS data; and peak alignment. Existing software tools invariably yield an immense number of false positive and false negative peaks, produce inaccurate peak groups, misalign detected peaks, and extract inaccurate relative quantitation information for metabolites. These errors can translate downstream into spurious or missing compound identifications and cause misleading interpretations of the metabolome. Furthermore, users need to specify a large number of parameters for existing software tools to work. Unfortunately, general users usually do not understand how to optimize these parameters, and maximizing one aspect (e.g., sensitivity) often has deleterious effects on another (e.g., specificity).

We will address these challenges by developing more accurate algorithms for improving the rigor and reproducibility of data preprocessing. The proposed algorithms will be implemented in Java and integrated with the widely used MZmine 2, making the software cross-platform and user-friendly with rich visualization capabilities. In addition, the implementation will be optimized for memory efficiency and computing speed, allowing large-scale data preprocessing. Extensive testing of the software will be conducted in close collaboration with metabolomics core facilities and users around the world.


Current Practices in LC-MS Untargeted Metabolomics: A Scoping Review on the Use of Pooled Quality Control Samples.
Authors: Broeckling C.D., Beger R.D., Cheng L.L., Cumeras R., Cuthbertson D.J., Dasari S., Davis W.C., Dunn W.B., Evans A.M., Fernández-Ochoa A., et al.
Source: Analytical chemistry, 2023-12-26; 95(51), p. 18645-18654.
EPub date: 2023-12-06.
PMID: 38055671

Recent advances in mass spectrometry-based computational metabolomics.
Authors: Ebbels T.M.D., van der Hooft J.J.J., Chatelaine H., Broeckling C., Zamboni N., Hassoun S., Mathé E.A.
Source: Current opinion in chemical biology, 2023 Jun; 74, p. 102288.
EPub date: 2023-03-24.
PMID: 36966702

Memory-Efficient Searching of Gas-Chromatography Mass Spectra Accelerated by Prescreening.
Authors: Smirnov A., Liao Y., Du X.
Source: Metabolites, 2022-05-29; 12(6).
EPub date: 2022-05-29.
PMID: 35736424

ADAP-KDB: A Spectral Knowledgebase for Tracking and Prioritizing Unknown GC-MS Spectra in the NIH's Metabolomics Data Repository.
Authors: Smirnov A., Liao Y., Fahy E., Subramaniam S., Du X.
Source: Analytical chemistry, 2021-09-14; 93(36), p. 12213-12220.
EPub date: 2021-08-29.
PMID: 34455770

A Practical Guide to Metabolomics Software Development.
Authors: Chang H.Y., Colby S.M., Du X., Gomez J.D., Helf M.J., Kechris K., Kirkpatrick C.R., Li S., Patti G.J., Renslow R.S., et al.
Source: Analytical chemistry, 2021-02-02; 93(4), p. 1912-1923.
EPub date: 2021-01-19.
PMID: 33467846

Auto-deconvolution and molecular networking of gas chromatography-mass spectrometry data.
Authors: Aksenov A.A., Laponogov I., Zhang Z., Doran S.L.F., Belluomo I., Veselkov D., Bittremieux W., Nothias L.F., Nothias-Esposito M., Maloney K.N., et al.
Source: Nature biotechnology, 2021 Feb; 39(2), p. 169-173.
EPub date: 2020-11-09.
PMID: 33169034

Metabolomics Data Preprocessing Using ADAP and MZmine 2.
Authors: Du X., Smirnov A., Pluskal T., Jia W., Sumner S.
Source: Methods in molecular biology (Clifton, N.J.), 2020; 2104, p. 25-48.
PMID: 31953811

The metaRbolomics Toolbox in Bioconductor and beyond.
Authors: Stanstrup J., Broeckling C.D., Helmus R., Hoffmann N., Mathé E., Naake T., Nicolotti L., Peters K., Rainer J., Salek R.M., et al.
Source: Metabolites, 2019-09-23; 9(10).
EPub date: 2019-09-23.
PMID: 31548506

ADAP-GC 4.0: Application of Clustering-Assisted Multivariate Curve Resolution to Spectral Deconvolution of Gas Chromatography-Mass Spectrometry Metabolomics Data.
Authors: Smirnov A., Qiu Y., Jia W., Walker D.I., Jones D.P., Du X.
Source: Analytical chemistry, 2019-07-16; 91(14), p. 9069-9077.
EPub date: 2019-07-05.
PMID: 31274283
