Intelligent grading of sugarcane leaf disease severity by integrating physiological traits with the SSA-XGBoost algorithm

Wang, Xinrui; Sun, Jihong; Tian, Peng; Wu, Mengyao; Zhao, Jiawei; Chen, Jiangquan; Qian, Ye; Wang, Canyu

doi:10.3389/fpls.2025.1698808

ORIGINAL RESEARCH article

Front. Plant Sci., 15 October 2025

Sec. Technical Advances in Plant Science

Volume 16 - 2025 | https://doi.org/10.3389/fpls.2025.1698808


Xinrui Wang¹

Jihong Sun²

Peng Tian¹

Mengyao Wu¹

Jiawei Zhao¹

Jiangquan Chen¹

Ye Qian¹*

Canyu Wang¹*

¹College of Big Data, Yunnan Agricultural University, Kunming, Yunnan, China
²College of Information Engineering, Kunming University, Kunming, Yunnan, China

Introduction: Accurate assessment of sugarcane leaf disease severity is crucial for early warning and effective disease control.

Methods: In this study, we propose an intelligent method for identifying sugarcane foliar disease severity based on physiological traits. Field-collected data, including Soil Plant Analysis Development (SPAD) values, leaf surface temperature, and nitrogen content, were acquired using a plant nutrient analyzer (TYS-4N) from sugarcane leaves infected with brown stripe disease, ring spot disease, and mosaic disease at four severity levels (mild, moderate, moderately severe, and severe). After min-max normalization, six classification models (k-nearest neighbors (KNN), AdaBoost, Random Forest (RF), Logistic Regression (LR), Decision Tree (DT), and XGBoost) were developed, and the Sparrow Search Algorithm (SSA) was employed to optimize hyperparameters for enhanced performance.

Results: SSA significantly improved the classification capability of all models. The SSA-XGBoost model achieved the best performance, with Precision, Recall, F1 Score, and Accuracy all at or above 0.9186, and a comprehensive PRFA score of 0.9326. When validated on an independent dataset from Gengma County, the model achieved an overall accuracy of 0.91, indicating strong generalization ability and field applicability.

Discussion: Compared to image-based deep learning approaches, the proposed method offers advantages in terms of data accessibility, computational efficiency, and model transparency, making it well-suited for rapid on-site diagnosis in agricultural settings. This study provides an efficient and reliable technical framework for intelligent diagnosis and early warning of sugarcane disease severity.

1 Introduction

Sugarcane is one of the most important sugar crops worldwide, accounting for approximately 75% of global sugar production (Qaadan et al., 2025; Kurniawan et al., 2025). However, the frequent occurrence of foliar diseases severely threatens sugarcane growth and development, leading to yield loss, reduced sugar content, and significant economic losses (Bao et al., 2024). Therefore, accurate identification of disease severity is not only essential for timely control measures and crop health maintenance, but also critical for maximizing yield potential, optimizing sugar accumulation, and promoting sustainable agricultural development.

In recent years, the integration of artificial intelligence and sensing technologies has opened new pathways for intelligent diagnosis of crop diseases. Researchers have extensively explored deep learning models based on image analysis, achieving notable success in various crop disease recognition tasks. Early approaches often combined convolutional neural networks (CNNs) with traditional classifiers. For instance, Shradha et al. (2023) extracted features using multiple CNNs and employed a support vector machine (SVM) for classification, achieving an accuracy of 82.80% in plant disease detection. As model architectures evolved, the You Only Look Once (YOLO) series demonstrated strong performance in object detection tasks; Kalezhi and Shumba (2025) achieved over 80% mean average precision (mAP) in cassava disease detection. For sugarcane-specific diseases, Sun et al. (2023) proposed the SE-Vit hybrid network, achieving an accuracy of 89.57%, while Hong et al. (2024) improved a VGG-16-based model to achieve an accuracy of 98.89%. To further enhance model performance, attention mechanisms and optimized loss functions have been introduced. Sun et al. (2024) integrated the Efficient Multi-Scale Attention (EMA) mechanism and focal loss into YOLOv8, effectively mitigating issues of complex backgrounds and sample imbalance in field images. Moreover, the application of Transformer architectures has pushed performance boundaries; Kuppusamy et al. (2024) achieved a classification accuracy of 98.5% in sugarcane leaf disease recognition using a Hybrid Shifted Vision Transformer.

Meanwhile, researchers have begun to transcend the limitations of single-modality imaging by exploring diagnostic methods that fuse multi-source information to improve the scientific rigor and robustness of assessments. Hyperspectral imaging has gained attention due to its sensitivity to plant biochemical parameters. Bao et al. (2024) combined hyperspectral data with deep neural networks to enable early detection of sugarcane smut, achieving over 90% accuracy. Pereira et al. (2025), in a systematic review, noted that 88% of related studies employed hyperspectral technology, often combined with vegetation indices (VIs) and principal component analysis (PCA), with classification accuracies generally exceeding 71%. Poblete et al. (2023) utilized high-resolution satellite data to detect vascular disease symptoms in trees, extending the application of remote sensing to large-scale monitoring. Additionally, Gianni and Maridina (2021) proposed a multi-output learning framework to simultaneously diagnose disease types and stress severity, enhancing the comprehensiveness of assessment. Adluri and Bhukya (2025) further incorporated gene expression data into predictive modeling, achieving 96.16% accuracy in rice disease early warning using their adaptively optimized residual long short-term memory with multilayer perceptron (AO-RLSTM-MLP) model, enabling detection of asymptomatic infections.

Despite their strong performance under controlled conditions, these technologies face multiple challenges in real-field applications. First, environmental interference significantly affects model performance: variations in illumination, leaf overlap, and background noise degrade image quality, causing the accuracy of hyperspectral models to drop from over 90% in laboratory settings to below 70% in field conditions (Abbas et al., 2023; Pereira et al., 2025). Moreover, the high cost and complex calibration requirements of hyperspectral equipment limit their adoption among smallholder farmers (Kurniawan et al., 2025). Second, most disease severity assessments still rely on lesion area or visual scoring (Qin et al., 2025), making it difficult to dynamically reflect changes in plant physiological status (e.g., chlorophyll and nitrogen levels). Although Vasavi et al. (2023) used models such as random forest and AdaBoost to predict chili diseases, and Bin et al. (2023) proposed the triple-branch Swin Transformer classification (TSTC) network to simultaneously classify disease and severity, their inputs remain limited to image features, lacking integration with physiological parameters. Third, model optimization is inefficient: traditional grid search or random search incurs high computational costs (Sharma et al., 2025), and hybrid optimization algorithms (e.g., the hybrid WOAAPSO algorithm of Vijayan and Chowdhary (2025), which merges Adaptive Particle Swarm Optimization (APSO) with the Whale Optimization Algorithm (WOA)) still face challenges in convergence speed within high-dimensional parameter spaces. The stacked ensemble framework proposed by Qaadan et al. (2025) improved classification performance but relied on resource-intensive models, limiting its deployability in edge environments.

To address these challenges, this study proposes a novel approach for assessing sugarcane leaf disease severity using plant physiological traits and intelligent optimization algorithms. The main research contributions are:

1. Field Data Collection and Dataset Construction: To overcome the environmental adaptability issues associated with traditional image-dependent methods, this study employs a portable plant nutrient analyzer (TYS-4N) to collect non-image physiological data from sugarcane leaves in the field. By measuring SPAD values, leaf surface temperature, and nitrogen content, we constructed a comprehensive dataset covering various sugarcane diseases (brown stripe, ring spot, and mosaic) at different severity levels.

2. Machine Learning Model Optimization Using SSA: To address the inefficiency of traditional hyperparameter tuning methods, we employed the SSA to optimize six mainstream machine learning models (KNN, AdaBoost, Random Forest, Logistic Regression, Decision Tree, and XGBoost). We introduced a composite evaluation metric PRFA, consisting of Precision, Recall, F1 Score, and Accuracy, to comprehensively assess model performance. The objective of SSA optimization was to maximize the PRFA score on the validation set, thereby enhancing model robustness and generalization.

3. Physiological Trait-Based Disease Severity Assessment Model: Based on the optimized models, we developed a disease severity assessment model centered on SPAD values, leaf surface temperature, and nitrogen content. The input layer directly maps physiological features, while the output layer adopts a multi-classification strategy (mild, moderate, moderately severe, and severe) to identify disease severity. To validate model robustness, cross-regional testing was conducted in sugarcane fields in Gengma County, Yunnan Province.

This study not only provides a new approach for intelligent identification of sugarcane disease severity but also offers methodological support for digital management of diseases in other crops, demonstrating significant theoretical innovation and practical value.

2 Materials and methods

2.1 Data collection and preprocessing

The data used in this study were collected from two representative sugarcane cultivation sites in Yunnan Province, China. The primary dataset was obtained from the sugarcane germplasm resource nursery/breeding station of Yunnan Agricultural University, where two cultivars—Dianzhe and Xintaitang—were planted. Prior to data collection, disease severity levels for brown stripe disease, ring spot disease, and mosaic disease were systematically classified based on expert consultation and field observations. The assessment was conducted by evaluating visual symptoms on green leaves, including lesion morphology (size, number, spatial distribution, and color change), and disease severity was categorized into four levels: mild, moderate, moderately severe, and severe (see Figure 1 and Supplementary Table 1).

Figure 1

Three panels labeled A, B, and C showing plant leaves with varying disease severity levels. Panel A shows four leaves labeled mild, moderate, moderately severe, and severe. Panel B displays four leaves with visible spots, also labeled mild, moderate, moderately severe, and severe. Panel C shows four leaves with subtle differences, labeled mild, moderate, moderately severe, and severe.

Figure 1. Representative symptoms of sugarcane leaf diseases at different severity levels: (A) Brown Stripe, (B) Ring Spot, and (C) Mosaic.

Physiological parameters were measured using a portable plant nutrient analyzer (TYS-4N, Top Cloud-Agro Technology, China). For each infected leaf, three measurements of SPAD value (indicating chlorophyll content), leaf surface temperature, and nitrogen content were taken at different locations within the lesion area. The average of the three readings was recorded as the representative value for that sample. The corresponding disease severity level was also documented for each measurement. Data were collected during the early maturity stage of sugarcane in November 2024, resulting in a total of 2,212 valid samples: 343 mild, 628 moderate, 670 moderately severe, and 571 severe cases.

To evaluate the model’s generalization capability, an independent validation dataset was collected from the Gengma Sugarcane Plantation, the largest sugarcane production base in Yunnan Province, primarily cultivating the Dianzhe variety. The same measurement protocol, with identical severity grading criteria and instrument settings, was strictly followed. Data collection was completed in December 2024, yielding 635 validation samples: 28 mild, 63 moderate, 127 moderately severe, and 417 severe. The geographical, climatic, and agronomic differences between the two sites make this dataset a demanding test of model robustness and enable cross-regional and cross-ecological validation.

The dataset used in this study comprises three physiological variables measured from diseased sugarcane leaf regions (SPAD value, leaf surface temperature, and nitrogen content) along with a categorical label indicating disease severity, classified into four levels: mild, moderate, moderately severe, and severe. To facilitate model training and evaluation, the severity labels were numerically encoded using ordinal encoding: “mild” → 0, “moderate” → 1, “moderately severe” → 2, and “severe” → 3. An example of the preprocessed dataset is presented in Table 1.

Table 1

Table 1. Example data of SPAD values, leaf surface temperature, nitrogen content, and disease severity.
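The preprocessing described above (ordinal encoding of the severity labels and min-max normalization of the three physiological features) can be sketched as follows. This is a minimal illustration, not the authors' code: the four sample rows are hypothetical and are not taken from Table 1, and the helper name `min_max_normalize` is an assumption introduced here.

```python
# Ordinal encoding of the four severity labels, as described in the text.
SEVERITY_CODES = {"mild": 0, "moderate": 1, "moderately severe": 2, "severe": 3}


def min_max_normalize(column):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical readings: (SPAD, leaf temperature in deg C, nitrogen, label).
# These values are illustrative only and do not come from the dataset.
samples = [
    (51.2, 21.4, 16.1, "mild"),
    (44.8, 20.9, 14.0, "moderate"),
    (38.5, 20.1, 12.2, "moderately severe"),
    (30.1, 19.6, 9.4, "severe"),
]

# Normalize each feature column independently, then rebuild per-sample rows.
columns = list(zip(*[s[:3] for s in samples]))
scaled_columns = [min_max_normalize(list(c)) for c in columns]
features = [list(row) for row in zip(*scaled_columns)]
labels = [SEVERITY_CODES[s[3]] for s in samples]
```

In practice the minimum and maximum of each feature would normally be computed on the training split only and then reused for the test and external validation data, so that no information leaks from held-out samples.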

2.2 Model selection and hyperparameter optimization

To address challenges such as limited sample size and class imbalance inherent in the dataset, six representative machine learning algorithms were systematically selected and comparatively evaluated: KNN, a non-parametric method that classifies samples based on majority voting among their nearest neighbors (Parthasarathy and Chatterji, 1990); AdaBoost, an adaptive boosting algorithm that dynamically adjusts sample weights to focus on hard-to-classify instances (Schwenk and Bengio, 2000); RF, an ensemble of decision trees built using bagging, offering strong generalization capability (Breiman, 2001); LR, a simple and interpretable linear classifier (Singh et al., 2009); DT, a model that makes decisions based on tree-structured rules, easy to interpret but prone to overfitting (Geibel et al., 2002); and XGBoost, an efficient and regularized gradient boosting framework that delivers state-of-the-art performance across a wide range of machine learning tasks (Chen and Guestrin, 2016). These models span linear classifiers, instance-based learning, and ensemble learning frameworks, enabling a comprehensive assessment of the mapping between physiological features and disease severity across diverse hypothesis spaces, thereby ensuring robustness and representativeness in model selection.

Some of the selected models inherently possess a certain degree of robustness to class imbalance due to their algorithmic mechanisms. For instance, ensemble-based methods such as Random Forest and XGBoost mitigate class bias to some extent by constructing multiple base learners and incorporating randomness or gradient-based optimization. AdaBoost, on the other hand, dynamically adjusts the weights of misclassified samples, thereby placing greater emphasis on minority-class instances that are difficult to classify. Given the limited overall sample size, resampling techniques such as oversampling (e.g., SMOTE) and undersampling are prone to causing overfitting under small-sample conditions and may hinder model generalization. Furthermore, in model evaluation, we primarily rely on metrics robust to class imbalance, such as the F1-score and recall, rather than accuracy alone, to ensure the objectivity and reliability of our assessment results.

To overcome the inefficiency and tendency to converge to local optima of traditional hyperparameter tuning methods (e.g., grid search and random search) in high-dimensional spaces, this study employs the SSA for automated hyperparameter optimization (Xue and Shen, 2020). SSA is a metaheuristic optimization algorithm inspired by the foraging and anti-predation behaviors of sparrow groups. In this model, the sparrow population is divided into two roles: discoverers (scouts), responsible for exploring new food sources (i.e., potential optimal solutions in the search space), and joiners (followers), who follow the discoverers and exploit existing information. Additionally, some sparrows act as sentinels, triggering group position updates upon sensing danger (such as getting stuck in a local optimum), thereby enhancing the ability to escape local optima. By simulating this social behavior mechanism, SSA achieves a balance between global exploration and local exploitation, making it suitable for complex, non-convex, high-dimensional optimization problems, such as hyperparameter tuning in machine learning. The flowchart is shown in Figure 2.

Figure 2

Flowchart illustrating the sparrow search algorithm. It starts with initializing the sparrow population, followed by fitness evaluation. Roles are assigned as discoverers or joiners. Discoverers perform global exploration for new food sources, while joiners follow them for local exploitation. Sentinels monitor danger, detecting local optima. If danger is detected, a group position update occurs to escape local optima. The position is updated, and if termination criteria are met, the optimal solution is output. If not, the process reiterates.

Figure 2. The SSA flowchart.

SSA is known for its strong global search capability and rapid convergence. In this work, SSA is applied to optimize key hyperparameters of each model, with the objective of maximizing the composite evaluation metric PRFA, the equally weighted average of Precision, Recall, F1-score, and Accuracy, on the validation set. This objective function is designed to balance classification performance across all severity levels, particularly improving detection accuracy for minority classes (e.g., mild disease cases). Although the F1-score inherently integrates Precision and Recall, the practical application context of this study (early detection of mild crop diseases) entails diverse performance priorities among different stakeholders: agronomists prioritize minimizing missed diagnoses (high Recall), system operators emphasize the reliability of alerts (high Precision), and managers require a balanced view of overall classification accuracy (Accuracy). Therefore, the PRFA metric is not intended to be a theoretically non-redundant evaluation measure; rather, it serves as a compromise proxy metric that reflects the multi-stakeholder requirements and guides the hyperparameter optimization process toward a balanced trade-off across multiple performance dimensions.

The entire optimization process is conducted within a cross-validation framework (e.g., 5-fold CV) to ensure stable and generalizable performance estimation. The final optimized models are then evaluated on an independent test set and used to construct the physiological trait-based model for sugarcane disease severity assessment. The overall technical workflow is illustrated in Figure 3.

Figure 3

Flowchart illustrating the process of classifying sugarcane leaf disease severity. It includes data acquisition using TYS-4N device, observing symptoms like brown stripe, ring spot, and mosaic with factors such as SPAD, temperature, and nitrogen levels. Severity is classified from mild (0) to severe (3). The model building section employs algorithms like KNN, AdaBoost, and others, optimized by the sparrow search algorithm and PRFA, to develop the classification model validated by precision, recall, F1 score, and accuracy.

Figure 3. Technical workflow.
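The cross-validation framework mentioned above can be illustrated with a small fold generator. The sketch below assumes stratified 5-fold splitting, which the text does not state explicitly; the function name `stratified_k_fold` is hypothetical, and only the class counts are taken from Section 2.1.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=42):
    """Yield (train_idx, val_idx) pairs with class proportions roughly preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the samples of each class round-robin across the k folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        val = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, val


# Class counts of the primary dataset (mild, moderate, moderately severe, severe).
labels = [0] * 343 + [1] * 628 + [2] * 670 + [3] * 571
splits = list(stratified_k_fold(labels))
```

Each candidate hyperparameter set would then be scored by averaging its validation metric over the five folds, which dampens the effect of any single lucky or unlucky split.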

All experiments were performed on a computer equipped with an AMD Ryzen 7 4800H with Radeon Graphics (2.90 GHz), using Python 3.9. The primary software libraries include Scikit-learn 1.6.0, XGBoost 2.1.3, NumPy, Pandas, and a custom-developed SSA optimization framework.

Within the SSA framework, hyperparameter optimization for each machine learning model is conducted over a predefined search space. Each individual in the sparrow population represents a candidate hyperparameter combination, and the algorithm iteratively updates their positions to maximize the PRFA score on the validation set, which serves as the fitness function. The algorithm is configured with a maximum of 100 iterations and a population size of 30. Early stopping is applied if the fitness value does not improve for 10 consecutive generations. To enhance computational efficiency, parallel execution is enabled via the n_jobs=-1 parameter in scikit-learn, leveraging all available CPU cores.
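To make the search loop concrete, the following is a deliberately simplified sketch of an SSA-style maximizer using the configuration stated above (population of 30, at most 100 iterations, early stopping after 10 stagnant generations). It is not the authors' custom framework. The role fractions, the safety threshold, and all function and parameter names are assumptions, and the update rules are reduced forms of those in Xue and Shen (2020): the joiner step uses a random sign in place of the original matrix term, and only one branch of the sentinel rule is kept. The toy fitness function stands in for the PRFA score of a trained model.

```python
import math
import random


def ssa_maximize(fitness, bounds, pop_size=30, max_iter=100, patience=10,
                 discoverer_frac=0.2, sentinel_frac=0.1, safety=0.8, seed=42):
    """Simplified Sparrow Search; returns (best_pos, best_fit, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [fitness(p) for p in pop]
    best_fit = max(fit)
    best_pos = list(pop[fit.index(best_fit)])
    history, stall = [best_fit], 0
    n_disc = max(1, int(discoverer_frac * pop_size))
    n_sent = max(1, int(sentinel_frac * pop_size))

    for _ in range(max_iter):
        # Rank sparrows from best to worst fitness.
        order = sorted(range(pop_size), key=lambda i: fit[i], reverse=True)
        pop = [pop[i] for i in order]
        worst = list(pop[-1])

        # Discoverers: shrink their step when safe, jump randomly when alarmed.
        alarm = rng.random()
        for i in range(n_disc):
            for d in range(dim):
                if alarm < safety:
                    alpha = rng.uniform(0.01, 1.0)  # lower bound avoids division by zero
                    pop[i][d] *= math.exp(-(i + 1) / (alpha * max_iter))
                else:
                    pop[i][d] += rng.gauss(0.0, 1.0)
        leader = list(pop[0])

        # Joiners: the worse half relocates, the rest move around the leader.
        for i in range(n_disc, pop_size):
            for d in range(dim):
                if i > pop_size / 2:
                    step = math.exp((worst[d] - pop[i][d]) / (i + 1) ** 2)
                    pop[i][d] = rng.gauss(0.0, 1.0) * step
                else:
                    sign = rng.choice((-1.0, 1.0))
                    pop[i][d] = leader[d] + sign * abs(pop[i][d] - leader[d])

        # Sentinels: a random subset is pulled toward the best-known position.
        for i in rng.sample(range(pop_size), n_sent):
            for d in range(dim):
                gap = abs(pop[i][d] - best_pos[d])
                pop[i][d] = best_pos[d] + rng.gauss(0.0, 1.0) * gap

        # Enforce the search bounds, then re-evaluate the whole population.
        pop = [[min(hi, max(lo, x)) for x, (lo, hi) in zip(p, bounds)] for p in pop]
        fit = [fitness(p) for p in pop]
        if max(fit) > best_fit:
            best_fit = max(fit)
            best_pos = list(pop[fit.index(best_fit)])
            stall = 0
        else:
            stall += 1
        history.append(best_fit)
        if stall >= patience:  # early stopping on stagnation
            break
    return best_pos, best_fit, history


def toy_fitness(p):
    """Stand-in objective with its maximum (0.0) at the point (1, -2)."""
    return -((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2)


best_pos, best_fit, history = ssa_maximize(toy_fitness, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a hyperparameter search, `fitness` would train a model with the decoded parameters and return its validation PRFA, so each call is far more expensive than this toy function and the early-stopping rule matters correspondingly more.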

Selected hyperparameters and their corresponding search ranges are listed in Table 2. A fixed random seed (random_state=42) was used for reproducibility, while remaining hyperparameters were set to default values. All hyperparameters were encoded (either continuously or discretely) into the SSA search vector, with boundary constraints and type validation enforced during optimization. Ultimately, the optimal hyperparameter combination yielding the highest PRFA score is selected for each model and used in subsequent performance evaluation on the independent test set.

Table 2

Table 2. Selected hyperparameters and their search spaces for the base model.
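The encoding of hyperparameters into the search vector, with boundary constraints and type validation, might look like the sketch below. The parameter names are standard XGBoost classifier settings, but the ranges shown are hypothetical placeholders rather than the values in Table 2, and `decode` is an illustrative helper, not part of the authors' framework.

```python
# Hypothetical search space: (name, type, lower bound, upper bound).
# The actual hyperparameters and ranges are those listed in Table 2.
SEARCH_SPACE = [
    ("n_estimators", "int", 50, 500),
    ("max_depth", "int", 2, 10),
    ("learning_rate", "float", 0.01, 0.3),
]


def decode(position):
    """Map a raw search vector to a valid hyperparameter dictionary."""
    if len(position) != len(SEARCH_SPACE):
        raise ValueError("position length does not match the search space")
    params = {}
    for value, (name, kind, lo, hi) in zip(position, SEARCH_SPACE):
        value = max(lo, min(hi, value))  # boundary constraint
        # Type validation: integer-valued parameters are rounded, others kept continuous.
        params[name] = int(round(value)) if kind == "int" else float(value)
    return params


# An out-of-range candidate is clipped back into the search space.
params = decode([612.7, 5.4, 0.005])
```

Clipping is only one way to handle out-of-range candidates; reflecting or resampling the offending coordinate are common alternatives when too many individuals pile up on the boundary.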

2.3 Model evaluation metrics

To comprehensively evaluate the performance of different machine learning models in the sugarcane disease severity classification task, this study employs multiple classification evaluation metrics, including Accuracy, Precision, Recall, F1-score, and a custom composite metric named PRFA. All metrics are computed based on the confusion matrix constructed from the predicted labels and true labels on the test set.

2.3.1 Accuracy

Accuracy represents the proportion of correctly classified samples among the total number of samples. It is a widely used overall performance metric suitable for most classification tasks. The formula for accuracy is defined as:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

2.3.2 Precision

Precision is the ratio of true positive predictions to all samples predicted as positive. It reflects the model’s ability to avoid false alarms when identifying diseased samples. The precision for each class is calculated as:

Precision = \frac{TP}{TP + FP}

2.3.3 Recall

Recall, also known as True Positive Rate (TPR) or sensitivity, measures the proportion of actual positive samples that are correctly identified by the model. It indicates the model’s capacity to detect all instances of a given severity level. Recall is computed as:

Recall = \frac{TP}{TP + FN}

2.3.4 F1-score

The F1-score is the harmonic mean of Precision and Recall, providing a balanced assessment of model performance, especially in the presence of class imbalance. The F1-score ranges from 0 to 1, with values closer to 1 indicating better performance. It is calculated as:

F1\text{-}score = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}

2.3.5 Composite performance metric

To balance the trade-offs among Precision, Recall, F1-score, and Accuracy, this study proposes a custom composite metric, PRFA, which computes the equally weighted average of these four metrics:

PRFA = \frac{1}{4} (Precision + Recall + F1\text{-}score + Accuracy)

This metric ensures a holistic evaluation of model performance across multiple dimensions, particularly enhancing sensitivity to minority classes while maintaining overall classification consistency.

Definitions of Confusion Matrix Components:

TP (True Positive): Number of samples that are actually positive and correctly predicted as positive.

TN (True Negative): Number of samples that are actually negative and correctly predicted as negative.

FP (False Positive): Number of samples that are actually negative but incorrectly predicted as positive.

FN (False Negative): Number of samples that are actually positive but incorrectly predicted as negative.
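For a four-class problem, TP, FP, and FN are counted per class in a one-vs-rest fashion and then averaged. The sketch below computes the metrics of Section 2.3 from a confusion matrix. It assumes support-weighted averaging across classes, which the text does not state but which is consistent with the reported Recall and Accuracy values being identical; the confusion matrix itself is hypothetical.

```python
def classification_metrics(cm):
    """Accuracy, support-weighted precision/recall/F1, and PRFA.

    cm[i][j] is the number of samples of actual class i predicted as class j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n_classes)) / total

    precision = recall = f1 = 0.0
    for k in range(n_classes):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                                # actual k, predicted otherwise
        fp = sum(cm[r][k] for r in range(n_classes)) - tp   # predicted k, actually otherwise
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = sum(cm[k]) / total                         # class support share
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    prfa = (precision + recall + f1 + accuracy) / 4
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "prfa": prfa}


# Hypothetical confusion matrix (rows: actual Level 0-3, columns: predicted).
confusion = [
    [30, 4, 0, 0],
    [3, 55, 5, 0],
    [0, 4, 60, 3],
    [0, 0, 2, 56],
]
metrics = classification_metrics(confusion)
```

Under support-weighted averaging, recall is mathematically equal to accuracy, so PRFA effectively counts overall accuracy twice; this is in line with the text's own remark that PRFA is a pragmatic proxy rather than a non-redundant measure.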

3 Results

3.1 Distribution analysis of sugarcane leaf disease severity

Figure 4 illustrates the distribution patterns of SPAD values, leaf surface temperature, and nitrogen content in sugarcane leaves across disease severity levels (Level 0–3), visualized using violin plots. These plots effectively capture the central tendency, dispersion, and skewness of each variable within severity classes, providing insights into their response to disease progression.

Figure 4

Three violin plots labeled A, B, and C. Plot A shows SPAD values versus levels, with wider spread at higher levels. Plot B displays nitrogen content decreasing across levels. Plot C depicts temperature with similar distribution across all levels.

Figure 4. Distribution of SPAD values, leaf surface temperature, and nitrogen content across different sugarcane disease severity levels (Level 0–3). (A) shows the distribution of SPAD values; (B) shows the distribution of nitrogen content; (C) shows the distribution of leaf surface temperature.

3.1.1 SPAD value distribution

Figure 4A and Table 3 illustrate the distribution patterns of SPAD values across different disease severity levels. At Level 0, SPAD values are primarily distributed between 47 and 58, with a median of approximately 51, indicating that chlorophyll content in mildly affected leaves is concentrated at higher levels. As the disease severity increases, SPAD values show a systematic downward trend, with the violins becoming wider but shorter. By Level 3, SPAD values reach their lowest range, and the violin becomes markedly taller and narrower, indicating that most values remain concentrated around the median while the overall range is larger, with some low-value outliers. This reflects the individual variability in chlorophyll degradation under severe disease conditions.

Table 3

Table 3. Distribution ranges and medians of SPAD values across different disease severity levels.

Overall, both the median and interquartile range of SPAD values decrease monotonically with disease progression, demonstrating a strong negative correlation between chlorophyll content and disease severity. Except for Level 3, the distributions in the first three levels are relatively symmetric and compact, suggesting stable physiological responses in early to mid-stage infections.

3.1.2 Nitrogen content distribution

Figure 4B and Table 4 illustrate the evolution patterns of nitrogen content as the disease progresses. At Level 0, nitrogen content is concentrated between 15 and 18 mg/kg, with a median of approximately 16 mg/kg, showing a symmetric and dense distribution. Similar to the trend observed for SPAD values, as disease severity increases, the range, median, and height of the nitrogen content distribution all gradually decrease, while the violin widens. By Level 3, both the range and median of the distribution have reached their lowest points, with the distribution becoming narrower but showing significant tailing, particularly at the lower end (<10 mg/kg), where there is a notable extension of density. This indicates that severe disease leads to substantial nitrogen depletion and an increased variability among individuals.

Table 4

Table 4. Distribution ranges and medians of nitrogen content across different disease severity levels.

These results confirm a continuous decline in nitrogen content with disease progression, with distribution morphology transitioning from symmetric and concentrated to skewed and dispersed in advanced stages. This supports nitrogen content as a sensitive indicator of disease severity.

3.1.3 Leaf surface temperature distribution

Figure 4C displays the leaf temperature distribution across severity levels. In Level 0, temperatures range from 18°C to 25°C, with a median of 20–22°C. From Level 1 to Level 3, the overall range remains largely unchanged (17–25°C), and the median shows only a slight downward trend, indicating a weak response of leaf temperature to disease progression.

Notably, all severity levels exhibit multi-modal and asymmetric distributions, with multiple density peaks and unequal tails. This suggests substantial intra-class variability, likely influenced by non-disease factors such as microclimate, stomatal conductance, or water stress. Consequently, leaf temperature alone demonstrates limited discriminative power compared to SPAD and nitrogen content, highlighting its limited utility as a standalone diagnostic feature.

3.2 Training and optimization of classification models

Six machine learning models—KNN, AdaBoost, RF, LR, DT, and XGBoost—were trained to classify sugarcane disease severity based on SPAD, temperature, and nitrogen data collected from plants infected with brown stripe, ring spot, and mosaic diseases. Hyperparameters were optimized using the SSA to enhance model performance and generalization. The dataset was split into training (90%) and testing (10%) sets to ensure sufficient training and independent evaluation.

With default hyperparameters, the models were evaluated on the test set using Precision, Recall, F1-score, Accuracy, and PRFA (see Table 5). Logistic Regression (LR) achieved the best performance among default models, with Precision=0.9154, Recall=0.9144, F1-score=0.9145, and Accuracy=0.9144, indicating high consistency and stability. In contrast, AdaBoost performed the worst (Precision=0.5898, Recall=0.6216, F1-score=0.5367), reflecting its sensitivity to class imbalance and suboptimal default settings. KNN achieved scores close to LR (>0.90), while RF and XGBoost showed robust performance (~0.89). DT scored ~0.85—29% higher than AdaBoost—demonstrating acceptable baseline performance.

Table 5

Table 5. Comparison of model performance before and after optimization.

SSA significantly improved the performance of all models by optimizing key hyperparameters to maximize PRFA on a validation subset. After optimization: XGBoost emerged as the top performer, achieving Precision=0.9199, Recall=0.9189, F1-score=0.9186, Accuracy=0.9189, surpassing even the unoptimized LR model. Improvements ranged from +0.027 to +0.028 across metrics, demonstrating SSA’s effectiveness in fine-tuning ensemble models. AdaBoost showed the most dramatic improvement: F1-score increased by 0.1043, with all metrics converging toward 0.67–0.69, indicating enhanced stability and reduced bias due to better parameter configuration. KNN, RF, LR, and DT also improved by 0.01–0.03 on average, confirming the broad applicability of SSA in enhancing model robustness.

To clearly illustrate the relative effectiveness of different optimization strategies, this study directly compares the original XGBoost model (accuracy: 89.19%, F1-score: 89.17%) with various models optimized by the Sparrow Search Algorithm (SSA). The results show that SSA-XGBoost achieves the best performance among all compared models, with an accuracy improvement of 2.70 percentage points over the baseline XGBoost. Among the other SSA-optimized models, SSA-LR and SSA-RF both attain an accuracy of 91.44% (F1-scores of 91.45% and 91.41%, respectively), slightly lower than SSA-XGBoost. SSA-DT achieves an accuracy of 89.19%, comparable to the baseline XGBoost, while SSA-KNN (89.64%) and SSA-AdaBoost (71.62%) show no clear advantage. In summary, XGBoost optimized by SSA not only significantly outperforms its original version but also maintains a leading position in comparison with other SSA-optimized models, demonstrating superior overall classification capability.
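The accuracy figures above can be translated into sample counts, which makes the size of the differences easier to judge. The sketch below assumes that the 10% test share of the 2,212 samples was rounded up to 222 samples; this test-set size is an inference, not a number stated in the text, although every reported accuracy is reproduced by a whole number of correct predictions out of 222.

```python
import math

n_samples = 2212
n_test = math.ceil(0.10 * n_samples)  # assumed rounding of the 10% test share

# Accuracies as reported in the text.
reported_accuracy = {
    "XGBoost (default)": 0.8919,
    "SSA-XGBoost": 0.9189,
    "SSA-LR": 0.9144,
    "SSA-RF": 0.9144,
    "SSA-DT": 0.8919,
    "SSA-KNN": 0.8964,
    "SSA-AdaBoost": 0.7162,
}

# Implied number of correctly classified test samples per model.
correct = {name: round(acc * n_test) for name, acc in reported_accuracy.items()}

# Difference between SSA-XGBoost and the default XGBoost, in samples.
gain = correct["SSA-XGBoost"] - correct["XGBoost (default)"]
```

Under this assumption, the 2.70 percentage point improvement corresponds to a handful of additional correctly classified test samples, and SSA-XGBoost leads SSA-LR and SSA-RF by a single sample, so the ranking among the top models should be read with the small test-set size in mind.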

The PRFA metric, defined as the equally weighted average of the four core metrics, was used to rank model performance. As summarized in Table 6, the SSA-XGBoost model achieved the highest PRFA of 0.9326, outperforming all other models. This indicates superior overall performance in balancing precision, recall, and accuracy across severity levels.

Table 6. PRFA scores and optimal hyperparameters of the six machine learning models after SSA optimization.
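
The PRFA definition above reduces to a plain arithmetic mean, which can be sketched in a few lines. The function name `prfa` and the two candidate models are hypothetical, and the metric values are illustrative placeholders rather than entries from Table 6.

```python
from statistics import fmean


def prfa(precision: float, recall: float, f1: float, accuracy: float) -> float:
    """Equally weighted mean of the four core metrics."""
    return fmean([precision, recall, f1, accuracy])


# Hypothetical metric tuples (precision, recall, F1, accuracy); not from Table 6.
candidates = {
    "model_a": (0.90, 0.88, 0.89, 0.89),
    "model_b": (0.70, 0.66, 0.68, 0.72),
}

# Rank the candidates from highest to lowest PRFA.
ranking = sorted(candidates, key=lambda name: prfa(*candidates[name]), reverse=True)
```

Because every metric carries the same weight, a model cannot reach a high PRFA by excelling on one metric while lagging on the others.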

We evaluated the contribution of each feature to the predictive performance of the SSA-XGBoost model by computing feature importance scores from the model’s `feature_importances_` attribute and visualizing them in a bar plot (see Figure 5). Among the three physiological features, nitrogen content exhibited a substantially higher importance score than both SPAD and leaf temperature, while leaf temperature ranked lowest and contributed minimally. This finding is consistent with our qualitative observations from the disease severity distribution analysis: SPAD and nitrogen concentration effectively capture the gradient of disease severity, whereas leaf temperature shows limited discriminative power across severity levels.

Bar chart titled “Feature Importance” comparing the significance of three features: Nitrogen, SPAD, and Temperature. Nitrogen is the most important, followed by SPAD and then Temperature.

Figure 5. The feature importance results of the SSA-XGBoost model.
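
Importance scores of this kind are typically reported as shares that sum to one, so the ranking in Figure 5 can be read as each feature's fraction of the total. The sketch below shows only that normalization and ranking step in plain Python. The helper name `importance_shares` and the raw scores are hypothetical, chosen to reproduce the ordering in Figure 5 and not the values behind it.

```python
def importance_shares(raw_scores: dict[str, float]) -> list[tuple[str, float]]:
    """Normalize raw importance scores to shares and sort them in descending order."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("raw scores must sum to a positive value")
    shares = {name: score / total for name, score in raw_scores.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores; only their ordering mirrors Figure 5.
raw = {"SPAD": 3.0, "Nitrogen": 6.0, "Temperature": 1.0}
ranked = importance_shares(raw)
```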

3.3 External validation and generalization analysis

To evaluate real-world applicability, the optimized SSA-XGBoost model was externally validated on an independent dataset of 635 samples collected from the Gengma Sugarcane Plantation. The dataset includes 28 (Level 0), 63 (Level 1), 127 (Level 2), and 417 (Level 3) samples, reflecting the field-realistic increasing prevalence of severe disease.

As shown in Table 7, the model achieved an overall Accuracy of 0.91, with a weighted F1-score of 0.91 and a macro F1-score of 0.89, indicating strong generalization. Precision was highest for Level 2 (0.92) and lowest for Level 0 (0.88), meaning that samples predicted as moderately severe were rarely false positives. Recall was perfect for Level 0 (1.00), indicating no missed detection of healthy/lightly diseased plants, but lowest for Level 2 (0.65), revealing substantial under-detection of that class. F1-score was highest for Level 3 (0.95) and lowest for Level 2 (0.76), highlighting classification ambiguity in the moderately severe category.

Table 7. Classification report on the validation dataset.
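
The quantities in a classification report of this kind can all be derived from a confusion matrix. The following is a minimal sketch in plain Python: the function `per_class_report` is a hypothetical helper, and `toy_cm` is an invented four-level matrix, not the data behind Table 7.

```python
def per_class_report(cm: list[list[int]]) -> dict:
    """Summarize a confusion matrix where cm[i][j] counts true class i predicted as j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    rows = []
    for c in range(k):
        tp = cm[c][c]
        support = sum(cm[c])                          # true samples of class c
        predicted = sum(cm[r][c] for r in range(k))   # samples predicted as class c
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append({"precision": precision, "recall": recall, "f1": f1, "support": support})
    return {
        "per_class": rows,
        "accuracy": sum(cm[c][c] for c in range(k)) / total,
        # Macro F1 treats every class equally; weighted F1 weights by class support.
        "macro_f1": sum(r["f1"] for r in rows) / k,
        "weighted_f1": sum(r["f1"] * r["support"] for r in rows) / total,
    }


# Hypothetical confusion matrix with an over-represented, well-classified last class.
toy_cm = [
    [9, 1, 0, 0],
    [1, 8, 1, 0],
    [0, 1, 6, 3],
    [0, 0, 1, 19],
]
report = per_class_report(toy_cm)
```

In this toy matrix the weighted F1 exceeds the macro F1 because the largest class is also the best classified, the same pattern as the 0.91 versus 0.89 gap reported for the validation set.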

To further assess performance stability, confusion matrices were generated for both the original test set and the independent validation set (see Figure 6). Both datasets yielded an overall accuracy of 0.91, confirming model consistency. On the original test set, classification was balanced: per-class accuracy was 86.21% for Level 0, 93.02% for Level 1, and 98.48% for Level 3, with a Level 2 F1-score of 88.41%. On the Gengma validation set, recall reached 100% for Level 0 and 98.56% for Level 3, confirming robust detection of healthy and severely diseased plants.

Two confusion matrices for XGBoost are shown. Matrix A has true labels from 0 to 3, with the highest correct predictions at (1,1) and (2,2). Matrix B also ranges from 0 to 3, with the highest correct predictions at (2,2) and (3,3). Both matrices include a color gradient indicating prediction frequency.

Figure 6. Confusion matrices on the original test set and the independent validation set. (A) is the confusion matrix of the optimized XGBoost in the original test set; (B) is the confusion matrix of the optimized XGBoost in the independent validation set.

However, Level 2 recall dropped to 64.57% (F1 = 76.34%), with many samples misclassified as Level 3. This suggests symptom overlap and transitional characteristics between moderately severe and severe disease stages in real-field conditions.
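
As a consistency check on these Level 2 figures, the F1-score can be recomputed from its definition, taking the rounded precision of 0.92 from Table 7 and the recall of 0.6457 reported above. This is a worked approximation, not a value from the source tables.

$$
F_1 = \frac{2PR}{P + R} = \frac{2 \times 0.92 \times 0.6457}{0.92 + 0.6457} \approx \frac{1.1881}{1.5657} \approx 0.759
$$

This agrees with the rounded F1-score of 0.76 in Table 7; the slightly higher 76.34% implies an unrounded precision a little above 0.92. The recall itself corresponds to about 82 of the 127 Level 2 samples being correctly identified, since 82/127 ≈ 0.6457.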

This performance gap indicates that while the current physiological features (SPAD, temperature, nitrogen) are effective for detecting extreme disease states, they struggle to distinguish transitional stages (Level 2), likely due to overlapping symptom expression and environmental noise.

The SSA-XGBoost model demonstrates superior performance in classifying sugarcane disease severity using physiological traits, achieving high accuracy and strong generalization, particularly for healthy (Level 0) and severely infected (Level 3) plants. SPAD values and nitrogen content are identified as highly sensitive and reliable indicators of disease progression, while leaf temperature exhibits limited discriminative power due to high intra-class variability. External validation confirms the model’s robustness in real-world conditions. However, classification of moderately severe cases (Level 2) remains challenging due to ambiguous symptom expression, suggesting the need for increased sampling during transitional stages, integration of temporal or multi-modal data (e.g., canopy imaging, weather variables), and advanced feature engineering to improve boundary discrimination. Overall, this study validates the feasibility of deploying SSA-optimized XGBoost models for large-scale, non-destructive monitoring of sugarcane health across diverse agro-ecological environments.

4 Discussion

The current study introduces a novel approach for classifying the severity of sugarcane leaf diseases by integrating physiological features with machine learning techniques, aiming to address challenges in early detection and classification accuracy within disease management. By leveraging key indicators such as SPAD values, leaf surface temperature, and nitrogen content acquired through portable plant nutrient analyzers (TYS-4N), six classification models—KNN, AdaBoost, RF, LR, DT, and XGBoost—were constructed. The SSA was employed for hyperparameter optimization, significantly enhancing model performance.

4.1 Biological basis of selected features

The chosen physiological traits are grounded in biological principles. SPAD values, reflecting chlorophyll content, typically decrease due to damage to chloroplast structure during disease progression. Leaf surface temperatures are influenced by stomatal conductance and transpiration rates, often rising as stomata close upon infection, reducing heat dissipation. Nitrogen levels directly affect plant growth and disease resistance, showing systematic changes throughout disease development. These metrics provide stable and sensitive inputs for classification models, contrasting with deep learning approaches that rely on image data. Numerical physiological parameters obtained via portable devices offer advantages like ease of collection, robustness against environmental interference, and minimal preprocessing requirements, making them more suitable for rapid field detection and practical application.

Nevertheless, this study has two key limitations that should be acknowledged. First, the current models focus exclusively on disease severity grading and do not distinguish among specific sugarcane disease types (e.g., brown stripe, ring spot, or mosaic). Future work should incorporate visual, spectral, or molecular signatures to enable precise pathogen identification alongside severity assessment.

Second—and equally important—the set of physiological features is limited to only three parameters: SPAD value, leaf surface temperature, and nitrogen content. While these are biologically meaningful and field-accessible, they capture only a partial view of the plant’s stress response. Additional physiological indicators, such as leaf water potential, relative chlorophyll fluorescence (e.g., Fv/Fm), or leaf moisture content, reflect complementary mechanisms of plant defense and could significantly enhance model sensitivity—particularly to early-stage infections that may not yet manifest in SPAD or nitrogen changes.

Looking forward, integrating these physiological traits with multimodal data sources—such as hyperspectral imaging, thermal infrared sensing, or even metabolomic profiles—could unlock a more holistic understanding of plant health. Such a multi-layered approach would not only improve predictive accuracy but also support earlier and more robust diagnosis under diverse field conditions, paving the way for next-generation precision disease management systems in sugarcane and other crops.

4.2 Enhanced model performance through SSA optimization

The SSA algorithm’s global search capability and fast convergence were leveraged to optimize critical hyperparameters across all six machine learning models. This approach reduces the risk of settling on suboptimal configurations, a known weakness of conventional tuning methods such as grid or random search when the search space is large. Experimental results demonstrated clear improvements in classification accuracy and stability after optimization, with SSA-XGBoost achieving the best overall performance (Precision, Recall, F1 Score, and Accuracy all exceeding 0.9186; PRFA=0.9326), thereby confirming SSA’s effectiveness in enhancing model performance.

Notably, XGBoost exhibited greater performance gains from SSA optimization compared to Random Forest (RF), Logistic Regression (LR), and Decision Tree (DT). This can be attributed to XGBoost’s algorithmic structure: as a gradient boosting framework, it relies on a complex set of interdependent hyperparameters—such as learning rate, maximum tree depth, subsample ratio, and L1/L2 regularization—that jointly control model complexity, bias-variance trade-off, and generalization. The global exploration ability of SSA is particularly well-suited to navigating this high-dimensional, non-convex hyperparameter space, enabling XGBoost to fully exploit its capacity for modeling nonlinear relationships and high-order feature interactions—common in agricultural physiological data (e.g., SPAD, leaf temperature, nitrogen content).

In contrast, LR has limited expressiveness due to its linear nature and few tunable parameters; DT, while interpretable, lacks robustness and is prone to overfitting without ensemble strategies; and RF, though inherently stable through bagging and random feature selection, exhibits reduced sensitivity to hyperparameter tuning because of its stochastic design. Consequently, the synergy between SSA’s efficient global search and XGBoost’s flexible, high-capacity architecture yields more substantial performance improvements than with other base learners, highlighting the importance of aligning optimization strategies with model-specific characteristics.
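
To make the search procedure concrete, the sketch below implements a simplified sparrow search in plain Python, loosely following the producer, scrounger, and danger-aware update rules of Xue and Shen (2020). It is an illustration under stated assumptions, not the configuration used in this study: the population size, fractions, and safety threshold are arbitrary defaults, and `toy_loss` is a hypothetical stand-in for the real objective (for example, 1 − PRFA on a validation subset) over two XGBoost-style hyperparameters.

```python
import math
import random


def sparrow_search(objective, bounds, n_sparrows=20, n_iter=40,
                   producer_frac=0.2, scout_frac=0.1, safety_threshold=0.8, seed=0):
    """Minimize `objective` inside the box `bounds` with a simplified SSA."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_sparrows)]
    fit = [objective(x) for x in pop]
    best_f = min(fit)
    best_x = pop[fit.index(best_f)][:]
    history = [best_f]
    n_prod = max(1, int(producer_frac * n_sparrows))
    n_scout = max(1, int(scout_frac * n_sparrows))

    for _ in range(n_iter):
        # Rank sparrows from best (index 0) to worst.
        order = sorted(range(n_sparrows), key=fit.__getitem__)
        pop = [pop[i] for i in order]
        fit = [fit[i] for i in order]
        worst_x, worst_f = pop[-1][:], fit[-1]

        # Producers: contract while "safe", otherwise take a random jump.
        alarm = rng.random()
        for i in range(n_prod):
            if alarm < safety_threshold:
                alpha = 1.0 - rng.random()  # uniform in (0, 1]
                scale = math.exp(-(i + 1) / (alpha * n_iter))
                pop[i] = clip([v * scale for v in pop[i]])
            else:
                jump = rng.gauss(0.0, 1.0)
                pop[i] = clip([v + jump for v in pop[i]])
            fit[i] = objective(pop[i])
        lead = pop[min(range(n_prod), key=fit.__getitem__)][:]

        # Scroungers: the worse half relocates, the better half follows the lead producer.
        for i in range(n_prod, n_sparrows):
            rank = i + 1
            if rank > n_sparrows / 2:
                q = rng.gauss(0.0, 1.0)
                pop[i] = clip([q * math.exp((w - v) / rank ** 2)
                               for v, w in zip(pop[i], worst_x)])
            else:
                signs = [rng.choice((-1, 1)) for _ in range(dim)]
                shift = sum(abs(v - p) * s for v, p, s in zip(pop[i], lead, signs)) / dim
                pop[i] = clip([p + shift for p in lead])
            fit[i] = objective(pop[i])

        # Danger-aware sparrows: a random subset moves relative to the best or worst position.
        for i in rng.sample(range(n_sparrows), n_scout):
            if fit[i] > best_f:
                beta = rng.gauss(0.0, 1.0)
                pop[i] = clip([b + beta * abs(v - b) for v, b in zip(pop[i], best_x)])
            else:
                k = rng.uniform(-1.0, 1.0)
                gap = abs(fit[i] - worst_f) + 1e-12  # guard against division by zero
                pop[i] = clip([v + k * abs(v - w) / gap for v, w in zip(pop[i], worst_x)])
            fit[i] = objective(pop[i])

        i_best = min(range(n_sparrows), key=fit.__getitem__)
        if fit[i_best] < best_f:
            best_f, best_x = fit[i_best], pop[i_best][:]
        history.append(best_f)

    return best_x, best_f, history


def toy_loss(params):
    """Hypothetical smooth loss over (learning_rate, max_depth); not a real validation score."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + ((max_depth - 6.0) / 10.0) ** 2


search_space = [(0.01, 0.3), (2.0, 10.0)]  # assumed ranges, for illustration only
best_params, best_loss, history = sparrow_search(toy_loss, search_space)
```

In practice the objective would train a model with the candidate hyperparameters and return a validation loss, and integer-valued parameters such as tree depth would be rounded before use. The `history` list records the best loss after each iteration, which makes convergence easy to inspect.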

4.3 Advantages over deep learning methods

Compared to deep learning alternatives, our method exhibits several practical advantages. Firstly, it requires minimal annotated image data, relying instead on a small set of biologically interpretable physiological parameters (e.g., SPAD, leaf temperature, nitrogen content) to achieve high-precision severity grading—thereby significantly reducing data acquisition costs and eliminating the need for labor-intensive pixel-level or image-level labeling. Secondly, the lightweight model architecture and fast training/inference speeds make it well-suited for deployment in resource-constrained agricultural environments, such as on edge devices in sugarcane fields. Moreover, the transparency of input features and model decisions enhances interpretability, fostering greater trust among agronomists and end-users.

That said, recent comparative studies (e.g., Maurya et al., 2023; Gupta et al., 2024) highlight that deep learning models—particularly convolutional neural networks (CNNs) and vision transformers—excel at discriminating between visually distinct disease types from leaf images, a capability our current physiology-only framework does not address. To bridge this gap, we envision a hybrid diagnostic system that synergistically combines the strengths of both paradigms: a deep learning module could first identify the specific disease type (e.g., brown stripe vs. mosaic) from RGB or hyperspectral images, while a physiology-driven module (such as SSA-XGBoost) would then assess the severity level based on real-time field measurements of SPAD, temperature, and nitrogen.

Such a two-stage or multi-branch architecture would enable comprehensive disease diagnosis—simultaneously answering “what disease is present?” and “how severe is it?”—while leveraging the robustness of image-based recognition and the field-deployability of physiological sensing. This integrated approach represents a promising direction for future work, aligning with emerging trends in multimodal plant health monitoring.
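
A minimal sketch of how such a two-stage interface might be wired together is shown below. All names (`LeafReading`, `Diagnosis`, `diagnose`, and the two stub models) are hypothetical; the stubs return fixed values and stand in for a trained image classifier and a trained severity model such as SSA-XGBoost.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class LeafReading:
    """Physiological measurements for one leaf (units as reported by the instrument)."""
    spad: float
    leaf_temperature_c: float
    nitrogen: float


@dataclass(frozen=True)
class Diagnosis:
    disease_type: str
    severity_level: int


def diagnose(image: Any, reading: LeafReading,
             type_model: Callable[[Any], str],
             severity_model: Callable[[LeafReading], int]) -> Diagnosis:
    """Stage 1 names the disease from the image; stage 2 grades severity from physiology."""
    disease_type = type_model(image)
    severity_level = severity_model(reading)
    if severity_level not in (0, 1, 2, 3):
        raise ValueError(f"unexpected severity level: {severity_level}")
    return Diagnosis(disease_type, severity_level)


def stub_type_model(image: Any) -> str:
    """Placeholder for an image-based disease-type classifier."""
    return "mosaic"


def stub_severity_model(reading: LeafReading) -> int:
    """Placeholder for a physiology-based severity grader."""
    return 2


# Hypothetical measurement values, for illustration only.
result = diagnose(
    image=None,
    reading=LeafReading(spad=35.0, leaf_temperature_c=29.5, nitrogen=3.1),
    type_model=stub_type_model,
    severity_model=stub_severity_model,
)
```

Keeping the two stages behind separate callables means either module can be retrained or replaced without touching the other.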

4.4 Challenges in classifying moderately severe disease (class 2)

Class 2 (moderately severe) samples achieved a recall of only 64.57% and an F1-score of 76.34%, significantly lower than other severity levels, with the majority misclassified as Class 3. This performance gap suggests that the model struggles to reliably identify cases at the intermediate stage of disease progression, a critical window for timely intervention.

This limitation stems not merely from visual or symptomatic similarity between Class 2 and Class 3 under field conditions, but more fundamentally from the insufficient representational capacity of the current physiological feature set during the transitional phase of disease development. While SPAD, leaf temperature, and nitrogen content effectively differentiate healthy (Class 0) and severely diseased (Class 3) plants, their response patterns tend to plateau or change nonlinearly as symptoms advance from moderate to severe, resulting in ambiguous decision boundaries in the feature space. The problem is further highlighted in cross-dataset validation: Class 2 performance remains relatively stable on the internal test set but declines markedly in the external validation set collected under real-world, complex field conditions, indicating limited robustness to environmental perturbations—such as variations in light and humidity—and natural inter-plant heterogeneity.

To address these challenges, future efforts could focus on enriching the input representation through multiple complementary strategies. Incorporating dynamic physiological indicators—such as daily SPAD decline rates or diurnal leaf temperature ranges—along with higher-order feature interactions (e.g., SPAD × nitrogen content) may better capture the temporal dynamics of disease progression. Simultaneously, increasing sampling density during the peak occurrence of Class 2 symptoms or in representative field plots, combined with resampling or cost-sensitive learning techniques, could help mitigate class boundary ambiguity. Most promisingly, fusing canopy-scale imaging data from UAV-based RGB or multispectral sensors—providing texture, color, and structural phenotypic cues—with micro-meteorological variables such as temperature, humidity, and precipitation would enable a more holistic “physiology–phenotype–environment” modeling framework.
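
The dynamic indicators and interaction term mentioned above can be derived from repeated readings of the same leaf. The sketch below is one possible formulation under simple assumptions: the record layout, the helper name `engineered_features`, and all sample values are hypothetical, and the decline rate is taken as a straight difference between the first and last observation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyRecord:
    """One day's readings for a single leaf (hypothetical layout)."""
    day: int
    spad: float
    nitrogen: float
    leaf_temps_c: tuple[float, ...]  # several temperature readings across the day


def engineered_features(records: list[DailyRecord]) -> dict[str, float]:
    """Derive temporal and interaction features from at least two daily records."""
    ordered = sorted(records, key=lambda r: r.day)
    if len(ordered) < 2:
        raise ValueError("need at least two daily records")
    first, last = ordered[0], ordered[-1]
    span_days = last.day - first.day
    if span_days <= 0:
        raise ValueError("records must span more than one day")
    return {
        # Average SPAD loss per day over the observation window.
        "spad_decline_per_day": (first.spad - last.spad) / span_days,
        # Spread between the warmest and coolest reading on the latest day.
        "diurnal_temp_range_c": max(last.leaf_temps_c) - min(last.leaf_temps_c),
        # Interaction term on the latest day.
        "spad_x_nitrogen": last.spad * last.nitrogen,
    }


# Hypothetical readings, for illustration only.
records = [
    DailyRecord(day=0, spad=40.0, nitrogen=3.2, leaf_temps_c=(23.0, 30.0, 26.0)),
    DailyRecord(day=4, spad=36.0, nitrogen=3.0, leaf_temps_c=(24.0, 31.5, 27.0)),
]
features = engineered_features(records)
```

Features of this kind would be appended to the three static inputs before training, so they require repeated visits to the same plants rather than a single measurement.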

Although the current model demonstrates strong performance for healthy and severely diseased cases—making it well-suited for large-scale screening and early warning—its accuracy for moderately severe disease remains a key bottleneck. By implementing these integrated approaches, future systems could evolve from coarse-grained severity grading toward fine-grained, context-aware diagnosis, ultimately supporting more precise and actionable crop protection strategies in real-world agricultural environments.

5 Conclusion

This study proposes an intelligent classification methodology for determining the severity of sugarcane leaf diseases using physiological characteristics combined with machine learning. Key physiological indicators including SPAD values, leaf surface temperature, and nitrogen content were collected to construct six classification models: KNN, AdaBoost, RF, LR, DT, and XGBoost. Hyperparameter optimization was conducted using the SSA. Results indicate that the SSA-XGBoost model outperformed others on the test set, with evaluation metrics exceeding 0.9186 and a PRFA score of 0.9326. In the independent validation set from Gengma County, the overall accuracy reached 0.91, demonstrating excellent generalization ability and field applicability.

Compared to deep learning models, our approach offers distinct advantages in terms of data acquisition convenience, computational efficiency, and model interpretability, making it highly suitable for rapid diagnostics and early warning in agricultural settings. This study provides an effective technological pathway for intelligent management of sugarcane diseases and offers a replicable methodology for precise identification of other crop diseases.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Author contributions

XW: Methodology, Conceptualization, Visualization, Formal Analysis, Writing – review & editing, Data curation, Validation, Writing – original draft. JS: Methodology, Visualization, Conceptualization, Validation, Formal Analysis, Writing – review & editing, Project administration. PT: Data curation, Writing – original draft, Validation. MW: Validation, Data curation, Writing – review & editing. JZ: Data curation, Writing – original draft. JC: Writing – original draft, Visualization. YQ: Funding acquisition, Writing – review & editing, Supervision, Project administration, Formal Analysis. CW: Project administration, Supervision, Writing – review & editing, Formal Analysis.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was supported by the Yunnan Provincial Science and Technology Talent and Platform Program (Academician Expert Workstation) (202405AF140077, 202505AF350026), the Yunnan Province Basic Research Special Project (202401AT070253), and the Yunnan Province Young and Middle-aged Academic and Technical Leaders Reserve Talent Project (202405AC350108).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2025.1698808/full#supplementary-material

References

Abbas, A., Zhang, Z., Zheng, H., Alami, M. M., Alrefaei, A. F., Abbas, Q., et al. (2023). Drones in plant disease assessment, efficient monitoring, and detection: A way forward to smart agriculture. Agronomy 13, 1524. doi: 10.3390/agronomy13061524

Adluri, V. L. and Bhukya, R. (2025). An intelligent framework of heuristic approach-aided optimal gene selection and residual LSTM with MLP for disease prediction in rice crop using gene expression data. Signal Image Video Process. 19, 307. doi: 10.1007/s11760-025-03859-5

Bao, D., Zhou, J., Bhuiyan, S. A., Adhikari, P., Tuxworth, G., Ford, R., et al. (2024). Early detection of sugarcane smut and mosaic diseases via hyperspectral imaging and spectral-spatial attention deep neural networks. J. Agric. Food Res. 18, 101369. doi: 10.1016/j.jafr.2024.101369

Bin, Y., Zhulian, W., Jinyuan, G., Lili, G., Qiaokang, L., Qiu, Z., et al. (2023). Identifying plant disease and severity from leaves: A deep multitask learning framework using triple-branch Swin Transformer and deep supervision. Comput. Electron. Agric. 209, 107809. doi: 10.1016/j.compag.2023.107809

Breiman, L. (2001). Random forests. Mach. Learn. 45, 5–32. doi: 10.1023/A:1010933404324

Chen, T. and Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System (The Institute of Electronics, Information and Communication Engineers: CoRR, abs/1603.02754).

Geibel, P., Schädler, K., and Wysotzki, F. (2002). Connectionist construction of prototypes from decision trees for graph classification. Intelligent Data Anal. 7, 125–140. doi: 10.3233/IDA-2003-7204

Gianni, F. and Maridina, M. F. (2021). Using multioutput learning to diagnose plant disease and stress severity. Complexity 2021, 6663442. doi: 10.1155/2021/6663442

Gupta, V. K., Sharma, G., and Singh, M. K. (2024). “A deep learning based approach for sugarcane disease detection,” in 2024 3rd Edition of IEEE Delhi Section Flagship Conference (DELCON). (New Delhi, India: IEEE), 1–6. doi: 10.1109/DELCON64804.2024.10866920

Hong, P., Luo, X., and Bao, L. (2024). Crop disease diagnosis and prediction using two-stream hybrid convolutional neural networks. Crop Prot. 184, 106867. doi: 10.1016/j.cropro.2024.106867

Kalezhi, J. and Shumba, L. (2025). Cassava crop disease prediction and localization using object detection. Crop Prot. 187, 107001. doi: 10.1016/j.cropro.2024.107001

Kuppusamy, A., Sundaresan, S. K., and Cingaram, R. (2024). Enhancing sugarcane leaf disease classification through a novel hybrid shifted-vision transformer approach: technical insights and methodological advancements. Environ. Monit. Assess. 197, 37. doi: 10.1007/s10661-024-13468-3

Kurniawan, R., Siaga, E., Samsuryadi, S., Susilo, A. T., and Sunardi, L. (2025). Hybrid DCNN-SVM architecture for optimizing sugarcane leaf disease classification. IAENG Int. J. Comput. Sci. 52 (4), 974–985. Available online at: https://www.researchgate.net/publication/390394263.

Maurya, R., Kumar, A., and Singh, J. (2023). “A deep convolutional neural network for leaf disease detection of sugarcane,” in 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT). (New Delhi, India: IEEE), 1–6.

Parthasarathy, G. and Chatterji, B. N. (1990). A class of new KNN methods for low sample problems. IEEE Trans. Systems Man Cybernetics 20, 715–718. doi: 10.1109/21.57285

Pereira, M. R., Tosin, R., Santos, F. N. D., Tavares, F., and Cunha, M. (2025). Digital assessment of plant diseases: A critical review and analysis of optical sensing technologies for early plant disease diagnosis. Comput. Electron. Agric. 236, 110443. doi: 10.1016/j.compag.2025.110443

Poblete, T., Navas-Cortes, J. A., Hornero, A., Camino, C., Calderon, R., Hernandez-Clemente, R., et al. (2023). Detection of symptoms induced by vascular plant pathogens in tree crops using high-resolution satellite data: Modelling and assessment with airborne hyperspectral imagery. Remote Sens. Environ. 295, 113698. doi: 10.1016/j.rse.2023.113698

Qaadan, S., Alshare, A., Ahmed, A., and Altartouri, H. (2025). Stacked ensembles powering smart farming for imbalanced sugarcane disease detection. Appl. Sci. 15, 2788. doi: 10.3390/app15052788

Qin, F., Wang, H., Jiang, Q., and Wang, H. (2025). A novel method based on lesion expansion to assess plant disease severity. Front. Plant Sci. 16, 1510663. doi: 10.3389/fpls.2025.1510663

Schwenk, H. and Bengio, Y. (2000). Boosting neural networks. Neural Comput. 12, 1869–1887. doi: 10.1162/089976600300015178

Sharma, P., Sharma, D. P., and Bansal, S. (2025). Optimum RBM encoded SVM model with ensemble feature Extractor-based plant disease prediction. Chemometrics Intelligent Lab. Syst. 258, 105319. doi: 10.1016/j.chemolab.2025.105319

Shradha, V., Anuradha, C., Prakash, S. A., and Dinesh, S. (2023). Plant disease detection and severity assessment using image processing and deep learning techniques. SN Comput. Sci. 5 (1), 83. doi: 10.1007/s42979-023-02417-5

Singh, Y., Kaur, A., and Malhotra, R. (2009). Comparative analysis of regression and machine learning methods for predicting fault proneness models. IJCAT 35, 183–193. doi: 10.1504/IJCAT.2009.026595

Sun, J., Li, Z., Li, F., Shen, Y., Qian, Y., and Li, T. (2024). EF yolov8s: A human–computer collaborative sugarcane disease detection model in complex environment. Agronomy 14, 2099. doi: 10.3390/agronomy14092099

Sun, C., Zhou, X., Zhang, M., and Qin, A. (2023). SE-visionTransformer: hybrid network for diagnosing sugarcane leaf diseases based on attention mechanism. Sensors 23 (20), 8529. doi: 10.3390/s23208529

Vasavi, P., Punitha, A., and Rao, T. V. N. (2023). Chili crop disease prediction using machine learning algorithms. Rev. d’Intelligence Artificielle 37 (3), 727–732. doi: 10.18280/ria.370321

Vijayan, S. and Chowdhary, C. L. (2025). Hybrid feature optimized CNN for rice crop disease prediction. Sci. Rep. 15, 7904. doi: 10.1038/s41598-025-92646-w

Xue, J. and Shen, B. (2020). A novel swarm intelligence optimization approach: sparrow search algorithm. Syst. Sci. Control Eng. 8, 22–34. doi: 10.1080/21642583.2019.1708830

Keywords: sugarcane leaf diseases, disease severity grading, physiological traits, machine learning classification, hyperparameter optimization

Citation: Wang X, Sun J, Tian P, Wu M, Zhao J, Chen J, Qian Y and Wang C (2025) Intelligent grading of sugarcane leaf disease severity by integrating physiological traits with the SSA-XGBoost algorithm. Front. Plant Sci. 16:1698808. doi: 10.3389/fpls.2025.1698808

Received: 04 September 2025; Accepted: 26 September 2025;
Published: 15 October 2025.

Edited by:

Parvathaneni Naga Srinivasu, Amrita Vishwa Vidyapeetham University, India

Reviewed by:

Rahul Maurya, Indian Institute of Technology BHU Varanasi Design and Innovation Centre, India
Haneen Altartouri, University of Fujairah, United Arab Emirates

Copyright © 2025 Wang, Sun, Tian, Wu, Zhao, Chen, Qian and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Ye Qian, 2014014@ynau.edu.cn; Canyu Wang, 2001027@ynau.edu.cn
