ORIGINAL RESEARCH article
Front. Plant Sci.
Sec. Technical Advances in Plant Science
Enabling rapid and accurate grade discrimination of flue-cured tobacco: a near-infrared hyperspectral and machine learning approach
Provisionally accepted
- 1Kunming University of Science and Technology, Kunming, China
- 2Yunnan Agricultural University, Kunming, China
- 3Yunnan Tobacco Company, Kunming, China
To address the inefficiency and subjectivity of manual grading, this study established a machine learning model based on near-infrared hyperspectral data (950–1650 nm) for the accurate classification of first-roasted tobacco grades. Multivariate statistical analysis was used to uncover the intrinsic correlations among grade, spectral data, and chemical composition, laying a theoretical foundation for hyperspectral-based grading technology. Three preprocessing methods were compared: multiplicative scatter correction (MSC), standard normal variate transformation, and Savitzky–Golay convolutional smoothing. Four classification models were evaluated: random forest, backpropagation neural network, extreme learning machine, and partial least squares–discriminant analysis (PLS-DA). In addition, characteristic bands were selected with the successive projections algorithm (SPA) and competitive adaptive reweighted sampling to investigate how the number of characteristic bands affects grade classification accuracy. The results showed that grade exhibited highly significant correlations with nicotine, reducing sugars, total sugars, and the sugar-nicotine ratio, and that the spectra exhibited highly significant correlations with nicotine. Full-band MSC preprocessing combined with the PLS-DA model reached a classification accuracy of 98.5%, and the accuracy reached 94.0% when using 70% of the full bands, selected with the SPA. In conclusion, near-infrared hyperspectroscopy combined with machine learning offers efficient, accurate, and non-destructive grading of first-roasted tobacco leaves, and it provides a theoretical basis for industrial hyperspectral grading by elucidating the correlations among spectrum, chemical composition, and grade. The method avoids the subjectivity of manual grading and offers key technical support for advancing intelligent, automated grading of first-roasted tobacco leaves in the tobacco industry.
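As a rough illustration of two of the preprocessing steps named in the abstract, the following minimal Python sketch implements multiplicative scatter correction and the standard normal variate transformation in their common textbook forms. It is not the authors' implementation: the function names `msc` and `snv`, the use of the mean spectrum as the MSC reference, and the use of the sample standard deviation in SNV are all assumptions made for this example.

```python
from statistics import fmean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation (assumed n - 1 form)."""
    m = fmean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    Each spectrum is regressed on a reference spectrum (assumed here to be
    the mean spectrum unless one is supplied), then the fitted intercept is
    subtracted and the result is divided by the fitted slope.
    """
    if reference is None:
        # Mean spectrum: average over samples at each band.
        reference = [fmean(band) for band in zip(*spectra)]
    ref_mean = fmean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)

    corrected = []
    for spec in spectra:
        spec_mean = fmean(spec)
        # Ordinary least-squares fit: spec ~ intercept + slope * reference
        sxy = sum((r - ref_mean) * (x - spec_mean)
                  for r, x in zip(reference, spec))
        slope = sxy / sxx
        intercept = spec_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in spec])
    return corrected


if __name__ == "__main__":
    # Hypothetical toy data: one underlying shape with different additive
    # offsets and multiplicative gains, mimicking scatter effects.
    base = [0.1, 0.4, 0.9, 0.3, 0.2]
    spectra = [[a + b * x for x in base]
               for a, b in [(0.0, 1.0), (0.2, 1.5), (-0.1, 0.5)]]
    for row in msc(spectra):
        print([round(v, 4) for v in row])
```

In the toy data every spectrum is an exact linear function of the same underlying shape, so MSC maps all of them onto the mean spectrum. Real spectra are only approximately linear in the reference, so the corrected spectra would differ by their chemically meaningful residuals.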
Keywords: characteristic bands, chemical analysis, machine learning, Mantel test correlation analysis, near-infrared hyperspectroscopy, tobacco leaf grading
Received: 28 Nov 2025; Accepted: 31 Jan 2026.
Copyright: © 2026 Zou, Gao, Wang, Chen, Deng, Shi, Yang, Huang, Zi, Du, Bai, Wang, Wang, Liu, Zhang and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Zhengling Liu
Junhua Zhang
Peng Zhou
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
