ORIGINAL RESEARCH article

Front. Plant Sci.

Sec. Technical Advances in Plant Science

Volume 16 - 2025 | doi: 10.3389/fpls.2025.1679754

Rice Origin Traceability Using Mid-Infrared and Fluorescence Spectral Data Fusion

Provisionally accepted
Changming Li*, Yong Tan, Chunyu Liu, Xun Gao, Zhong Lv
  • Changchun University of Science and Technology, Changchun, China

The final, formatted version of the article will be published soon.

This study addresses the limitations of traditional single-spectroscopy techniques by constructing an intelligent discrimination system for rice geographic origin that fuses mid-infrared (MIR) and fluorescence (FLU) spectral features and classifies them with machine learning. Samples of the "Zhongke Fa 5" rice variety from eight major production regions in Jilin Province, China, were measured with Fourier transform infrared (FTIR) and fluorescence spectrometers. A "Normalization-Smoothing-Multiplicative Scatter Correction" preprocessing framework was proposed, which significantly enhanced the signal-to-noise ratio and separability of the spectral features. The two spectral ranges were found to be complementary: MIR spectra (500-3750 cm⁻¹) represented the molecular vibration features of key components such as starch, protein, and lipids, while FLU spectra (450-850 nm) captured the fluorescence characteristics of phenolic compounds and protein-pigment complexes. The successive projections algorithm (SPA) was used to extract 286-310 highly discriminative features from the original 7625-dimensional data, mitigating the overfitting associated with high-dimensional data. Data-level and feature-level fusion strategies were then compared. The feature-level fusion model optimized by SPA performed best, achieving a test set accuracy of 95.55%. Among the classifiers, the logistic regression (LR) model combined with enhanced spectral features (LR-SPA) significantly outperformed the support vector machine (SVM, 83.17%) and gradient boosting algorithms in both precision (93.05%) and robustness. The approach offers a new technical route for agricultural product quality and safety supervision, with both theoretical and practical value.
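The preprocessing chain and the feature-level fusion step described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say which normalization or smoothing method was used, so min-max scaling and a moving average are stand-ins; the SPA routine is a simplified greedy variant with a fixed starting column; all function names are hypothetical; and the spectra are synthetic.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each spectrum (row) to [0, 1]. Assumed normalization choice."""
    lo = X.min(axis=1, keepdims=True)
    hi = X.max(axis=1, keepdims=True)
    return (X - lo) / (hi - lo)

def moving_average(X, window=5):
    """Smooth each spectrum with a centered moving average (odd window).
    Assumed smoothing choice; edges are padded by repeating end values."""
    kernel = np.ones(window) / window
    pad = window // 2
    padded = np.pad(X, ((0, 0), (pad, pad)), mode="edge")
    return np.array([np.convolve(row, kernel, mode="valid") for row in padded])

def msc(X, reference=None):
    """Multiplicative scatter correction: regress each spectrum on a
    reference (the mean spectrum by default) and remove slope and offset."""
    ref = X.mean(axis=0) if reference is None else reference
    corrected = np.empty(X.shape, dtype=float)
    for i, row in enumerate(X):
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected

def preprocess(X, window=5):
    """Normalization -> smoothing -> MSC, in the order named in the abstract."""
    return msc(moving_average(min_max_normalize(X), window))

def spa_select(X, n_features, start=0):
    """Simplified successive projections algorithm: greedily pick columns
    with the largest norm after projecting out the columns already chosen.
    n_features must stay below the rank of the centered matrix."""
    Xp = (X - X.mean(axis=0)).astype(float)
    selected = [start]
    for _ in range(n_features - 1):
        v = Xp[:, selected[-1]].copy()
        # Remove the component of every column that lies along v.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never re-select a chosen column
        selected.append(int(np.argmax(norms)))
    return selected

# Synthetic stand-ins for MIR and FLU spectra (not real measurements).
rng = np.random.default_rng(0)

def synthetic(n_samples, n_points):
    base = np.sin(np.linspace(0, 6, n_points)) + 2.0
    gain = rng.uniform(0.8, 1.2, (n_samples, 1))
    offset = rng.uniform(-0.1, 0.1, (n_samples, 1))
    noise = rng.normal(0, 0.01, (n_samples, n_points))
    return gain * base + offset + noise

mir = preprocess(synthetic(40, 200))
flu = preprocess(synthetic(40, 120))

# Feature-level fusion: select features per modality, then concatenate.
mir_idx = spa_select(mir, 10)
flu_idx = spa_select(flu, 10)
fused = np.hstack([mir[:, mir_idx], flu[:, flu_idx]])
```

Data-level fusion would instead concatenate the full preprocessed spectra before any selection. In either case the resulting matrix would then be passed to a classifier such as logistic regression, which is omitted here to keep the sketch dependency-free.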
Keywords: spectrometry, data preprocessing, origin discrimination, machine learning, data fusion

Received: 05 Aug 2025; Accepted: 13 Oct 2025.

Copyright: © 2025 Li, Tan, Liu, Gao and Lv. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Changming Li, 83463808@qq.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.