AUTHOR=Wang Cong , Fu Yufeng , Wan Ran , Zhao Le , Wang Hongbo , Guo Junwei , Liu Qiang , Li Shan , Ma Shengtao , Wang Zhicai , Huang Wei , Liu Huimin , Yang Song , Nie Cong TITLE=Using preprocessed datasets to construct and interpret multiclass identification models JOURNAL=Frontiers in Plant Science VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2025.1597673 DOI=10.3389/fpls.2025.1597673 ISSN=1664-462X ABSTRACT=IntroductionImage and near-infrared (NIR) spectroscopic data are widely used for constructing analytical models in precision agriculture. While model interpretation can provide valuable insights for quality control and improvement, the inherent ambiguity of individual image pixels or spectral data points often hinders practical interpretability when using raw data directly. Furthermore, the presence of imbalanced datasets can lead to model overfitting and consequently, poor robustness. Therefore, developing alternative approaches for constructing interpretable and robust models using these data types is crucial.MethodsThis study proposes using preprocessed data—specifically, morphological features extracted from images and chemical component concentrations predicted from NIR spectra—to build multiclass identification models. Combined kernel SVM based models were proposed to identify the rice variety and cultivation region of tobacco. The determination of kernel parameters and percentage of different types of kernel functions were accomplished by PSO, which make the approach self-adaptive. Feature importance and contribution analyses were conducted using Shapley additive explanations (SHAP).ResultsThe resulting models demonstrated high robustness and accuracy, achieving classification success rates of 97.9 and 97.4% via n-fold cross validation on rice and tobacco datasets, respectively, and 97.7% on an independent test set (tobacco dataset 2). This analysis identified key variables and elucidated their specific contributions to the model predictions.DiscussionThis study expands the applicability of image and NIR spectroscopic data, offering researchers an effective methodology for investigating factors crucial to the quality control and improvement of agricultural products.