AUTHOR=Kou Ranran , Wang Cong , Liu Jinxia , Wan Ran , Jin Zhe , Zhao Le , Liu Youjie , Guo Junwei , Li Feng , Wang Hongbo , Yang Song , Nie Cong TITLE=Construction and interpretation of tobacco leaf position discrimination model based on interpretable machine learning JOURNAL=Frontiers in Plant Science VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2025.1619380 DOI=10.3389/fpls.2025.1619380 ISSN=1664-462X ABSTRACT=Tobacco leaf position is closely associated with leaf quality, and the material basis of that quality is the chemical composition of the leaf. In recent years, near-infrared (NIR) spectroscopy combined with algorithmic models has become a popular method for identifying tobacco leaf position. However, these models often rely on principal components derived from dimensionality-reduced spectral signals, which limits their interpretability and makes it difficult to identify key chemical components. Chemical composition data combined with algorithmic models can also be used to discriminate tobacco leaf positions, but acquiring such data with traditional instrumental analytical methods is time-consuming and labor-intensive and covers only a limited number of compounds. This study proposes a novel approach that integrates machine learning with advanced interpretability techniques for both discriminating and analyzing tobacco leaf position. Based on 70 tobacco leaf chemical components obtained using NIR rapid analysis technology, tobacco leaf position discrimination models were built using Support Vector Machine (SVM), Back Propagation Neural Network (BPNN), and Random Forest (RF). Particle swarm optimization (PSO) was used to optimize the parameters of each model.
Chemical components were analyzed for statistical significance across leaf positions, and their influence on model predictions was interpreted using SHapley Additive exPlanations (SHAP). The experimental results showed that, among all models, the SVM with a hybrid kernel delivered the most robust and accurate performance, achieving discrimination accuracies of 98.17% and 96.33% on the training and test sets, respectively. SHAP analysis provided a clear ranking of feature importance and revealed the positive and negative contributions of individual chemical components. The proposed method may be useful for position traceability and chemical feature analysis of various crops.
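To illustrate how a Shapley-based explanation assigns positive and negative contributions to individual features, here is a minimal sketch in plain Python. It is not the authors' pipeline and does not use the SHAP library: it computes exact Shapley values by enumerating every feature subset, which is only feasible for a handful of features (with 70 components, practical tools rely on approximations). The scoring function `toy_score`, the feature names, the sample, and the baseline are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature subsets.

    Features outside a coalition are set to their baseline value, which is
    one common (assumed) way to represent a "missing" feature.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight shared by every coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical feature names and coefficients, for illustration only.
FEATURES = ["nicotine", "total_sugar", "potassium"]


def toy_score(v):
    """Toy model output with two linear terms and one interaction term."""
    nicotine, sugar, potassium = v
    return 2.0 * nicotine - 1.0 * sugar + 0.5 * nicotine * potassium


sample = [3.0, 28.0, 2.0]      # hypothetical leaf sample
baseline = [2.0, 25.0, 1.5]    # hypothetical reference (e.g. dataset mean)

phi = shapley_values(toy_score, sample, baseline)
for name, value in zip(FEATURES, phi):
    print(f"{name}: {value:+.3f}")
```

The signed values show which features push the output above or below the baseline prediction, and they sum to the difference between the model output for the sample and for the baseline. Ranking features by the mean absolute value across many samples gives the kind of importance ordering the abstract describes.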