AUTHOR=Cheng Kaiqi , Xiao Jingzhe , He Jingyuan , Yang Rongguang , Pei Jinjin , Jin Wengang , Abd El-Aty A. M. TITLE=Unraveling volatile metabolites in pigmented onion (Allium cepa L.) bulbs through HS-SPME/GC–MS-based metabolomics and machine learning JOURNAL=Frontiers in Nutrition VOLUME=Volume 12 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/nutrition/articles/10.3389/fnut.2025.1582576 DOI=10.3389/fnut.2025.1582576 ISSN=2296-861X ABSTRACT=Introduction: Colored onions are favored by consumers due to their distinctive aroma, rich phytochemical content, and diverse biological activities. However, comprehensive analyses of their phytochemical profiles and volatile metabolites remain limited. Methods: In this study, total phenols, flavonoids, anthocyanins, carotenoids, and antioxidant activities of three colored onion bulbs were evaluated. Volatile metabolites were identified using headspace solid-phase microextraction combined with gas chromatography-mass spectrometry (HS-SPME/GC-MS). Multivariate statistical analyses, feature selection techniques (SelectKBest, LASSO), and machine learning models were applied to further analyze and classify the metabolite profiles. Results: Significant differences in phytochemical composition and antioxidant activities were observed among the three onion types. A total of 243 volatile metabolites were detected, with sulfur compounds accounting for 51-64%, followed by organic acids and their derivatives (4-19%). Multivariate analysis revealed distinct volatile profiles, and 19 key metabolites were identified as biomarkers. Additionally, 33 and 38 feature metabolites were selected by SelectKBest and LASSO, respectively. The 38 features selected by LASSO enabled clear differentiation of onion types via PCA, UMAP, and k-means clustering. Among the four machine learning models tested, the random forest model achieved the highest classification accuracy (1.00).
SHAP analysis further confirmed 20 metabolites as potential key markers. Conclusion: The findings suggest that the combination of HS-SPME/GC-MS and machine learning, particularly the random forest algorithm, is a powerful approach for characterizing and classifying volatile metabolite profiles in colored onions. This method holds potential for quality assessment and breeding applications.