REVIEW article

Front. Public Health

Sec. Environmental Health and Exposome

Volume 13 - 2025 | doi: 10.3389/fpubh.2025.1687056

AI Redefines Mass Spectrometry Chemicals Identification: Retention Time Prediction in Metabolomics and for a Human Exposome Project

Provisionally accepted
  • 1 Biology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, United States
  • 2 Johns Hopkins University Whiting School of Engineering, Baltimore, United States
  • 3 Insilica, Rockville, MD, United States
  • 4 Universität Konstanz, Konstanz, Germany

The final, formatted version of the article will be published soon.

The comprehensive identification of environmental and endogenous chemicals in human biospecimens is a critical bottleneck for realizing the Human Exposome Project. Untargeted metabolomics, particularly liquid chromatography–high-resolution mass spectrometry (LC-HRMS), offers unparalleled coverage of small molecules, but most detected features remain unidentified due to limited spectral libraries and structural ambiguity. Retention time (RT) prediction—based on quantitative structure–retention relationships (QSRR) and enhanced by artificial intelligence (AI)—is an underutilized orthogonal parameter that can substantially improve metabolite annotation confidence. This review synthesizes advances in machine learning–based RT prediction, probabilistic calibration, and cross-platform harmonization for liquid chromatography and gas chromatography, including deep learning, graph neural networks, and transfer learning approaches. We evaluate workflows integrating RT prediction with mass-based searches and network-based annotation tools, highlighting their potential to refine candidate ranking and reduce false positives in environmental exposure assessment. The use of endogenous compounds as internal calibrants is discussed as a practical strategy for improving RT transferability across laboratories. We further outline how RT-aware annotation supports non-targeted screening of emerging contaminants, transformation products, and exposure biomarkers, thereby enhancing the interpretability and reproducibility of exposomics data. By integrating RT prediction, QSRR modeling, and AI into untargeted metabolomics pipelines, researchers can move from qualitative detection toward quantitative, inference-driven mapping of environmental influences on human health, strengthening the scientific foundation for environmental health policy and preventive public health strategies.
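
As an illustration of how RT can act as an orthogonal filter alongside a mass-based search, the following is a minimal sketch, not the authors' workflow. It assumes a simple linear mapping between reference-method RTs and local RTs, fitted on a few endogenous calibrants, and then keeps only candidates that match the feature within an m/z tolerance (ppm) and an RT window. The names `Candidate`, `fit_linear_calibration`, and `rank_candidates`, the tolerance defaults, and all numeric values are hypothetical and chosen for illustration only; real pipelines may need non-linear or probabilistic calibration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    mz: float            # theoretical m/z of the candidate ion
    predicted_rt: float  # model-predicted RT on the reference method (min)


def fit_linear_calibration(reference_rts, observed_rts):
    """Least-squares line mapping reference-method RTs to local RTs.

    Assumption: the shift between methods is roughly linear.
    """
    n = len(reference_rts)
    if n < 2 or n != len(observed_rts):
        raise ValueError("need at least two paired calibrant RTs")
    mean_x = sum(reference_rts) / n
    mean_y = sum(observed_rts) / n
    sxx = sum((x - mean_x) ** 2 for x in reference_rts)
    if sxx == 0:
        raise ValueError("calibrant reference RTs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(reference_rts, observed_rts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rank_candidates(feature_mz, feature_rt, candidates, calibration,
                    ppm_tol=5.0, rt_tol=0.5):
    """Keep candidates within both tolerances, ordered by RT error."""
    slope, intercept = calibration
    kept = []
    for c in candidates:
        ppm_error = abs(c.mz - feature_mz) / c.mz * 1e6
        local_rt = slope * c.predicted_rt + intercept  # calibrated prediction
        rt_error = abs(local_rt - feature_rt)
        if ppm_error <= ppm_tol and rt_error <= rt_tol:
            kept.append((rt_error, c.name))
    return [name for _, name in sorted(kept)]


# Hypothetical endogenous calibrants: reference RT vs. RT observed locally.
calibration = fit_linear_calibration([1.0, 5.0, 9.0], [1.5, 5.5, 9.5])

# Hypothetical candidates for one feature at m/z 180.0634, RT 3.6 min.
candidates = [
    Candidate("isomer_A", 180.0634, 3.0),
    Candidate("isomer_B", 180.0634, 7.0),      # same mass, RT far off
    Candidate("near_isobar_C", 180.0700, 3.0), # RT fits, mass does not
    Candidate("isomer_D", 180.0635, 3.4),
]
ranked = rank_candidates(180.0634, 3.6, candidates, calibration)
print(ranked)
```

In this toy case the mass filter alone cannot separate the isomers, while the calibrated RT window removes the one that elutes far from the observed feature, which is the kind of false-positive reduction the abstract refers to.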

Keywords: untargeted metabolomics, Retention time prediction, Quantitative structure–retention relationships, artificial intelligence, Exposomics, Environmental Health, Human Exposome Project

Received: 16 Aug 2025; Accepted: 24 Sep 2025.

Copyright: © 2025 Sille, Prasse, Luechtefeld and Hartung. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Thomas Hartung, thartung@jhsph.edu

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.