- 1College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling, China
- 2Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling, China
- 3Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling, China
- 4College of Plant Protection, Northwest A&F University, Yangling, China
- 5School of Computer and Computing Science, Zhejiang University City College, Hangzhou, China
Apple Valsa canker (AVC), which is characterized by an incubation period in its early stage, is a severe apple tree disease that causes significant orchard yield loss. Early detection of infected trees is critical to prevent the disease from developing rapidly. Surface-enhanced Raman scattering (SERS) spectroscopy, which simplifies detection procedures and improves detection efficiency, is a potential method for AVC detection. In this study, a method for detecting early AVC infection was proposed by combining SERS spectroscopy with chemometric methods and machine learning algorithms, and chemical distribution imaging was successfully applied to the analysis of disease dynamics. Results showed that the healthy, early-disease, and late-disease sample datasets formed clearly separated clusters. The adaptive iterative reweighted penalized least squares (air-PLS) algorithm was selected as the best baseline correction method to eliminate the interference of baseline shifts. The BP-ANN, ELM, Random Forest, and LS-SVM machine learning algorithms, incorporating optimal spectral variables, were used to establish discriminative models for detecting the AVC disease stage. The accuracy of these models was above 90%. SERS chemical imaging results showed that cellulose and lignin were significantly reduced at the phloem disease-health junction under AVC stress. These results suggest that SERS spectroscopy combined with chemical imaging analysis is feasible and promising for early detection of AVC. This study provides a practical method for the rapid diagnosis of apple orchard diseases.
Introduction
Apple Valsa canker (AVC), caused by the fungus Valsa mali, is a severe apple tree disease resulting in serious economic losses in Southeast Asia and China (Wang et al., 2011). AVC is commonly recognized at the early infection stage by cankers, softening of infected tissue, light brown water stains, and sunken or cracked areas on trunks (Zang et al., 2012). At the initial infection stage, the fungal pathogen mainly infects the subcutaneous phloem through wounded bark tissue. After infection, fungal hyphae colonize the phloem tissues, leading to severe tissue cell death (Suzaki, 2008). Moreover, plant protection experts have shown that Valsa mali can survive in weak and dead tissues of apple trees for more than 1 year before visible symptoms appear (Meng et al., 2019). For example, Zang et al. (2012) found Valsa mali in symptomless apple tree tissues in more than 50% of apple orchards. Once visible symptoms appear, it is challenging to prevent AVC from spreading throughout the orchard with conventional treatments such as spraying fungicides, manually removing the diseased areas, and pruning the dead branches. Unfortunately, no adequate treatment for AVC exists so far because of its complicated pathogenic mechanism. Thus, early detection of infected trees is necessary to prevent the rapid development of the disease in orchards.
Various molecular biology methods, including Enzyme-Linked ImmunoSorbent Assay (ELISA) and Polymerase Chain Reaction (PCR), have been developed for the isolation and identification of pathogens (Liu et al., 2015; Golhani et al., 2018). ELISA kits have been widely used thanks to their low cost but are ineffective for detecting symptomless tissue (Fang and Ramasamy, 2015), whereas PCR is an effective detection method. Zang et al. (2012) developed a nested PCR assay to detect the presence of Valsa mali in apple trees and achieved an accuracy of 64.7%. However, DNA derived from woody plant tissues contains PCR-inhibiting compounds that can affect the accuracy of the PCR reaction (Martinelli et al., 2015). Moreover, PCR requires a well-equipped laboratory and experienced personnel, which makes it infeasible for on-site detection (Okiro et al., 2019). Therefore, it is of great significance to develop a fast, non-destructive, and economical method for accurate detection of AVC.
Reported studies have demonstrated that advanced non-invasive measuring technologies, such as RGB image processing (Cruz et al., 2019; Hu et al., 2020), dielectric spectroscopy (Khaled et al., 2018), laser scanning (Khairunniza and Vong, 2014), and spectroscopic methods (Ranulfi et al., 2016; Dou et al., 2021), have great potential for diagnosing tree diseases. Among them, spectroscopy is a powerful tool for quality and safety inspection because of its simplicity, rapidity, and affordability, which makes it well suited to tree disease detection. Raman spectroscopy (RS) is a non-invasive, rapid, and high-throughput spectroscopic technique (Farber et al., 2020; Huang et al., 2020; Zhao et al., 2021). The Raman shift depends only on the vibrational frequency of the molecular functional group, not on the frequency of the incident light. Therefore, the Raman “fingerprint” of each sample is unique (Fang et al., 2021). Importantly, RS can provide essential information about the biochemical composition of tree tissue cells, such as proteins, polysaccharides, and lipids. Whether or not a tree shows symptoms, these biochemical compositions differ significantly between diseased and healthy tissue. The compositional changes are reflected in Raman shifts or intensity changes of the specific Raman bands assigned to those molecules. Therefore, RS provides an accessible way to identify subtle changes in molecular compounds, which offers a theoretical basis for detecting tree diseases. Vallejo et al. (2016) investigated the application of RS combined with statistical analysis for detecting citrus Huanglongbing (HLB) infection in the field and obtained an overall classification accuracy of about 89.2%. Sanchez et al. (2019b) readily distinguished between healthy and early-HLB citrus trees using a handheld Raman system and achieved an accuracy of 94%. In their following study, Sanchez et al. (2019a) demonstrated that a handheld Raman spectrometer combined with chemometric analyses enabled the detection and identification of secondary diseases on HLB-infected orange trees. These studies indicate that the RS technique combined with chemometric methods can detect diseased trees.
However, RS frequently suffers from fluorescence interference caused by chromophores in plant tissue, and compositional changes under disease stress may lead to Raman band broadening or drift (Mukherjee et al., 2017; Petrov, 2017). This drawback may lead to significant deviations in the biochemical composition analysis of RS data. Surface-enhanced Raman scattering (SERS) spectroscopy, an improvement on traditional RS, uses metallic nano-substrates such as gold or silver nanoparticles (AgNPs) to enhance signals under low laser power, which maximizes fluorescence suppression. Meanwhile, a Raman system combined with micro-imaging technology can scan collection points at the micron scale (e.g., one-micron pixels) (Li X. L. et al., 2019), which provides in situ chemical information on the constituents at high spatial resolution. Qin et al. (2011) developed a Raman chemical imaging system and used it to visualize the internal spatial distribution of lycopene in postharvest tomatoes at different stages of maturity. Yang et al. (2018) used a Raman imaging system to detect the spatial distribution of chemical components in maize seeds. These studies show that Raman chemical imaging has great potential for visualizing plant tissue components.
Therefore, this study aimed to develop a fast, non-invasive, and in situ diagnosis method for detecting AVC at early infection stages using SERS combined with micro-imaging technology. The main objectives were to: (1) optimize the experimental conditions (i.e., laser intensity and exposure time) for obtaining valid SERS micro-imaging data, including the synthesis and characterization of the SERS AgNPs; (2) establish optimal discriminative models for detecting AVC in early infection stages based on machine learning algorithms; and (3) generate micro-distribution maps of cellulose and lignin at the disease-health junction of the tree phloem tissues to reveal the dynamic development characteristics of the disease.
Materials and Methods
Fungal Culture and Sample Inoculation
The fungus Valsa mali, stored at −80°C in an ultra-low temperature refrigerator, was inoculated onto potato dextrose agar (PDA) medium. Two-year-old apple branches (Malus domestica cv. Fuji) were collected from the Economic Tree Garden of Northwest A&F University. The selected branches were cut into 15 cm segments, and their surfaces were disinfected with 75% alcohol for 15 min. They were then rinsed with sterile water three times until there was no odor. The ends of the branches were sealed with wet skimmed cotton to keep them fresh, and holes were then punched in the branches with a hole puncher (hole diameter 5 mm). The activated Valsa mali fungus was inoculated onto the wounds, at two points on each branch. After inoculation, the branches were transferred to a 25°C incubator for further incubation.
Synthesis and Surface-Enhanced Raman Scattering Silver Nanoparticles Characterization
In the present research, AgNPs were synthesized using the Lee–Meisel method. The synthesis steps were as follows: AgNO3 (36 mg) was dissolved in 200 mL of ultrapure water and brought quickly to a boil. A solution of 1 wt.% trisodium citrate (6 mL) was added to the reaction solution, which was kept boiling for 25 min with stirring at 200 rpm. After cooling to room temperature, the AgNPs solution was poured into a centrifuge tube and stored away from light. The chemical reaction equation is as follows:
Subsequently, the prepared AgNPs were characterized to verify their validity. The morphology of the AgNPs was measured by Tecnai G2 transmission electron microscopy (FEI Inc., Hillsboro, OR, United States). The UV-Vis absorption spectra of the AgNPs were measured using Lambda 35 Spectrophotometer (PerkinElmer Inc., Waltham, MA, United States). The Raman spectra of the AgNPs were collected by DXR3xi Raman micro-imaging spectrometer (Thermo Fisher Scientific Inc., Waltham, MA, United States).
Surface-Enhanced Raman Scattering Spectroscopy Acquisition
First, branches were removed from the incubator, and the inoculation points on the phloem were scraped with a knife to obtain the samples. Each sample was placed on a glass slide, and the AgNPs solution was dripped onto it. Then, each sample was placed on the automatic stage and aligned with the Raman laser using a 10×/0.25 NA objective lens for SERS imaging with a DXR3xi Raman micro-imaging system (Thermo Fisher Scientific Inc., Waltham, MA, United States). The acquisition parameters were as follows: the excitation wavelength was 785 nm; the collected spectral range was 300–3,000 cm–1 (Raman shift); the laser intensity was 2.6 mW; the exposure time was 0.00285 s (350 Hz); and the number of scans was 40.
For spectral imaging in the x and y directions, the samples were scanned point by point in 2 μm steps. It should be noted that no destructive effects of the laser on the samples were observed. Routinely, before starting the Raman measurements, the calibration procedure that came with the instrument was executed automatically. At this time, the software interface displayed “Performing automatic X axis calibration.” The data acquisition software OMNICxi v1.6 was used to adjust the acquisition parameters.
Spectral Data Processing and Analysis
Spectra Preprocessing
Background noises and baselines were generated during the acquisition of the SERS spectra, which seriously impaired the interpretability of the spectra. Meanwhile, these noises and baselines would also reduce the simplicity and robustness of the calibration model built on these spectra. Therefore, selecting the optimal pretreatment method was necessary to improve the spectral quality. In this study, spectral curves were first extracted for each pixel point of the imaging data before spectra preprocessing. Then, the spectral data were preprocessed with three algorithms to eliminate noise and correct the baseline background. These three algorithms include the multiple spectral baseline correction (MSBC), the asymmetric least squares (AsLS), and the adaptive iterative reweighted penalized least squares (air-PLS). Subsequently, the advantages and disadvantages of the three algorithms were compared using the correlation analysis method.
The AsLS method, proposed by Eilers (2003, 2004), is a classical baseline correction algorithm that combines a smoother with asymmetric weighting of deviations from the smoothed trend to form an effective baseline estimator. The MSBC method, proposed by Peng et al. (2010), is an improved approach based on the AsLS algorithm. It learns baselines that perform well on the corresponding spectra and then “co-regularizes” the selection by correcting inconsistencies between the spectra. Air-PLS is an improved approach based on penalized weighted least squares, in which the weights are adaptively updated at each iteration. The background is estimated and subtracted automatically through this iterative regression (Baek et al., 2015).
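As an illustration of the AsLS idea described above (a smoother combined with asymmetric weights), the following Python sketch estimates a baseline for one synthetic spectrum. It is not the MATLAB code used in this study: the function name `asls_baseline`, the dense solver, the fixed iteration count, and the synthetic data are assumptions made for illustration, and only the values λ = 5,000 and ρ = 0.0001 are borrowed from the text.

```python
import numpy as np

def asls_baseline(y, lam=5000.0, p=0.0001, n_iter=10):
    """Estimate a baseline by asymmetric least squares (after Eilers)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n).
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    z = y.copy()
    for _ in range(n_iter):
        # Weighted, smoothness-penalized fit: (W + lam * D'D) z = W y.
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the fit (likely peaks) get weight p, the rest 1 - p,
        # which pushes the fitted curve underneath the peaks.
        w = np.where(y > z, p, 1.0 - p)
    return z

# Synthetic spectrum (illustrative only): sloping background plus one peak.
x = np.linspace(300.0, 3000.0, 300)
true_baseline = 0.001 * x + 2.0
peak = 5.0 * np.exp(-0.5 * ((x - 1600.0) / 20.0) ** 2)
spectrum = true_baseline + peak

estimated = asls_baseline(spectrum)
corrected = spectrum - estimated
```

A dense solve is adequate for spectra of a few thousand points; a sparse banded solver would be the usual choice for longer spectra or for whole imaging cubes.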
Optimal Variables Selection and Dimension Reduction
Multivariate calibration methods in chemometrics aim to construct relationships between variables and properties of interest in order to build a classification model. However, redundant spectral variables usually carry noise and unnecessary information, which renders the predictions unreliable. Therefore, optimal variable selection and dimension reduction were used to address these problems.
Principal component analysis (PCA) replaces the original variables with a few principal components that capture most of the variance, thereby reducing the original high-dimensional variable space (Dong et al., 2014). In addition, the competitive adaptive reweighted sampling (CARS) and random frog (RFrog) algorithms were combined to select the optimal variables associated with the predicted properties and to exclude the interference of unrelated variables. The CARS algorithm uses an exponentially decreasing function (EDF) as a selection strategy and competitively selects critical variables by adaptive reweighted sampling (Li et al., 2009; Li Q. Q. et al., 2019). The RFrog algorithm calculates the selection probability of each variable by moving across models of different dimensions, enabling the search for the optimal variables (Li et al., 2012).
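The dimension reduction performed by PCA can be sketched in a few lines of Python. The example below is a generic SVD-based PCA on simulated data, not the authors' MATLAB routine; the function name `pca_scores` and the random test matrix are assumptions for illustration.

```python
import numpy as np

def pca_scores(X, n_components=3):
    """Return PCA scores and the explained-variance ratio of each kept component."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # mean-center every variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)      # fraction of variance per component
    scores = U[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]

# Simulated data: 60 "spectra" of 50 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 2))
loadings = rng.normal(size=(2, 50))
X_demo = latent @ loadings + 0.01 * rng.normal(size=(60, 50))

scores, ratio = pca_scores(X_demo, n_components=3)
```

Plotting the first three columns of `scores` against each other gives a score scatter plot, and summing `ratio` gives the cumulative contribution of the retained components.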
Classification Models
The BP artificial neural network (BP-ANN) (Zhang et al., 2018) is a classical and successful neural network commonly used for non-linear fitting and pattern recognition. BP-ANN is a one-way multi-layer feedforward network composed of an input layer, hidden layers, and an output layer. The learning process consists of forward propagation of signals and back-propagation of errors.
The random forest (RForest) is a widely used machine learning algorithm that has been successfully applied to pattern recognition (Lussier et al., 2020). Choosing an appropriate number of decision trees is crucial in RForest. When test data enter the classifier, each decision tree classifies the data, and the class that receives the most votes from all decision trees is taken as the final result.
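The voting step of the random forest can be illustrated with a small Python sketch. It covers only the aggregation of per-tree predictions, not tree training; the function name `majority_vote` and the example votes are hypothetical, and ties are resolved in favor of the smallest label, which is an arbitrary choice.

```python
import numpy as np

def majority_vote(tree_predictions):
    """Combine per-tree class labels (n_trees x n_samples) by majority vote."""
    votes = np.asarray(tree_predictions)
    labels = np.unique(votes)
    # counts[k, j] = number of trees that assigned label labels[k] to sample j.
    counts = np.array([(votes == lab).sum(axis=0) for lab in labels])
    # argmax returns the first maximum, so ties go to the smallest label.
    return labels[np.argmax(counts, axis=0)]

# Five hypothetical trees voting on four samples (1 = healthy, 2 = disease-1, 3 = disease-2).
tree_outputs = [
    [1, 2, 3, 3],
    [1, 2, 2, 3],
    [2, 2, 3, 3],
    [1, 1, 3, 2],
    [1, 2, 3, 3],
]
forest_prediction = majority_vote(tree_outputs)
```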
The least squares support vector machine (LS-SVM) is a machine learning method that emerged from statistical learning theory. LS-SVM separates the data samples into classes by determining a hyperplane that maximizes the separation between the classes (Lucay et al., 2020). Its key parameters are the kernel function and the parameters of that function.
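For orientation, a binary LS-SVM classifier with an RBF kernel can be written as a single linear system, as in the Python sketch below. This is a textbook-style formulation, not the toolbox used in this study: the function names, the kernel parameterization `exp(-d^2 / (2 * sigma2))`, the parameter values, and the two-cluster demo data are assumptions, and a three-class problem would additionally need a scheme such as one-vs-rest.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-sq / (2.0 * sigma2))

def lssvm_train(X, y, c=10.0, sigma2=1.0):
    """Solve the LS-SVM linear system for labels y in {-1, +1}."""
    n = X.shape[0]
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma2)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / c      # c is the penalty coefficient
    rhs = np.concatenate(([0.0], np.ones(n)))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]       # bias b, multipliers alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma2=1.0):
    """Classify new samples by the sign of the kernel expansion."""
    K = rbf_kernel(X_new, X_train, sigma2)
    return np.sign(K @ (alpha * y_train) + b)

# Two well-separated synthetic clusters (illustrative only).
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(-2.0, 0.5, size=(20, 2)),
                    rng.normal(2.0, 0.5, size=(20, 2))])
y_demo = np.concatenate([-np.ones(20), np.ones(20)])

b, alpha = lssvm_train(X_demo, y_demo)
train_pred = lssvm_predict(X_demo, y_demo, b, alpha, X_demo)
test_pred = lssvm_predict(X_demo, y_demo, b, alpha,
                          np.array([[-2.0, -2.0], [2.0, 2.0]]))
```

Because training reduces to solving one linear system, the solution is unique for a given pair of penalty and kernel parameters; in practice those two values are chosen by a search such as a grid search.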
The extreme learning machine (ELM) is a practical training algorithm for single-hidden-layer feedforward neural networks (Qiu et al., 2015). ELM trains faster and generalizes better than traditional machine learning algorithms and can overcome issues such as local minima, inappropriate learning rates, and overfitting (Wu et al., 2021). Therefore, it is widely used for classification and regression.
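The reason ELM trains quickly is that the hidden-layer parameters are drawn at random and only the output weights are computed, in one least-squares step. The Python sketch below shows this for a three-class toy problem. It is a minimal illustration rather than the implementation used in this study: the sigmoid activation, the function names, the random seed, and the synthetic clusters are assumptions, and only the hidden-layer size of 34 echoes the value reported later in the text.

```python
import numpy as np

def elm_train(X, labels, n_hidden=34, seed=0):
    """Train an ELM classifier: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    targets = (labels[:, None] == classes[None, :]).astype(float)  # one-hot
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never tuned
    bias = rng.normal(size=n_hidden)              # random hidden biases, never tuned
    H = 1.0 / (1.0 + np.exp(-(X @ W + bias)))     # hidden-layer output (sigmoid)
    beta = np.linalg.pinv(H) @ targets            # output weights via pseudoinverse
    return W, bias, beta, classes

def elm_predict(model, X):
    """Predict the class with the largest output-node response."""
    W, bias, beta, classes = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + bias)))
    return classes[np.argmax(H @ beta, axis=1)]

# Three synthetic clusters labeled 1, 2, 3 (illustrative only).
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X_demo = np.vstack([c + 0.4 * rng.normal(size=(30, 2)) for c in centers])
labels_demo = np.repeat([1, 2, 3], 30)

model = elm_train(X_demo, labels_demo)
train_accuracy = np.mean(elm_predict(model, X_demo) == labels_demo)
```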
In summary, Figure 1 shows the key steps for detecting apple Valsa canker at an early stage based on SERS spectroscopy combined with chemical imaging analysis. All procedures were written in MATLAB R2018b (The MathWorks, Natick, MA, United States) and run on a personal computer with an Intel Core i5-9400F CPU, 16 GB RAM, and a Windows 10 operating system.
 
  Figure 1. Key steps for detecting apple Valsa canker at early stage based on SERS combined with chemical imaging analysis. There were four main steps in the experiment: step 1, preparation of the samples; step 2, data acquisition; step 3, data processing; step 4, discriminant and analysis.
Results and Discussion
Phenotypic Development of Healthy and Inoculated Branch
Figure 2a shows the strains of the fungus Valsa mali on the PDA medium. The junctions of diseased and healthy tissues in the inoculated branch samples were assessed visually in the early stage of AVC disease. The bark surface of the inoculated branch samples showed no visible symptoms during the first 7 days. However, the phloem inside the bark showed early infection symptoms. Figure 2b shows the dynamic process of the diseased phloem in the first 7 days. The healthy phloem (the first 3 days) had a smooth surface and a tender green color. The diseased phloem became rough and pale brown when the symptoms of mild infection became visible on the 5th day. Subsequently, the diseased phloem turned dark brown, and the tissue was rotten on the 7th day. The infected area of the diseased phloem, centered on the inoculation site, extended continuously outward with time. Most notably, the infection symptoms remained in the phloem and did not appear on the bark surface in the first 7 days. The phloem regions were manually labeled as healthy, disease-1 (the disease-health intersection), and disease-2 (late-disease) according to the infection progression of the pathogen. The purpose of dividing the region into three categories was to simulate the time-series dynamic process of pathogen infection (i.e., the infection spreading outward from the center point). In Figure 2c, the disease-health intersection of the diseased phloem is shown by optical microscopy. The healthy tissue appeared green with an intact cellular tissue structure; the disease-1 tissue appeared dark brown, with light brown water stains flowing out of the infected tissue; and the disease-2 tissue was mainly characterized by canker and softened tissue.
 
  Figure 2. Phenotypic development of healthy and inoculated branch. (a) The strains of the fungus Valsa mali on PDA medium. (b) The dynamic process of the diseased phloem in the first 7 days. (c) Optical micrograph of the disease-health junction.
Surface-Enhanced Raman Scattering Silver Nanoparticles and Its Characterization
The microstructure, UV-Vis spectrum, and Raman spectrum of AgNPs were analyzed to investigate the enhancement effects of the synthesized AgNPs. Figure 3A is the transmission electron microscopy (TEM) image of AgNPs, Figure 3B displays the UV-Vis spectra, and Figure 3C shows the Raman spectra.
 
  Figure 3. SERS AgNPs and its characterization. (A) Transmission electron microscopy image of AgNPs. (B) The UV-Vis spectra of AgNPs. (C) Raman spectrum of AgNPs.
Figure 3A shows that the AgNPs were very uniform, monodisperse, and spherical, with an average diameter of about 50 nm. As shown in Figure 3B, only one UV-Vis characteristic absorption peak (at 410 nm), corresponding to a single plasmon resonance mode, was observed, and its half-peak width was only 90 nm. These features further indicated that the shape and size of the synthesized AgNPs were very uniform. In Figure 3C, the Raman spectrum had only a faint signal, suggesting that the synthesized AgNPs themselves had no strong Raman characteristic peaks and did not interfere with the experimental results. Therefore, the synthesized AgNPs were suitable as a SERS substrate for detecting branch samples in this research.
Overview of Surface-Enhanced Raman Scattering Spectra
Spectral imaging acquires a spectrum from a specified point on the sample surface. By adjusting the x, y position, spectra can be acquired from multiple points on the sample surface and assembled into a spectral image of the sample. Figure 4 shows the spectra of healthy tissue samples with and without AgNPs. No Raman peaks appeared for healthy samples without AgNPs, whereas the SERS characteristic peaks of healthy samples with AgNPs were obvious, which further proved that the AgNPs were effective. Figure 5 shows the micro-spectral image of diseased phloem obtained through pointwise scanning with the Raman micro-imaging system. The spectral data were obtained by extracting the spectrum at each pixel point of the spectral image. All the original SERS spectra are also shown in Figure 5. The pathogenic mechanism of AVC remains poorly understood (Wang et al., 2021). On the one hand, cell wall degrading enzymes (e.g., pectinases) play an important role in the infection process (Yin et al., 2013). On the other hand, studies have shown that phloridzin in apple tissues can be degraded by the AVC pathogen and that the metabolites have toxic effects on apple tissue cells (Feng et al., 2020). These findings may explain why the vibration bands of the disease-2 spectrum are weaker than those of the healthy spectrum.
 
  Figure 4. The spectra of healthy tissue samples with and without AgNPs. (A) No Raman peaks appeared for healthy samples without AgNPs. (B) The SERS characteristic peaks of healthy samples were obvious, which further proved that the AgNPs were effective.
 
  Figure 5. The sketch represents the basic principle of the spectral data cube and shows the raw spectra and spectral imaging of three types of samples.
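The step of extracting one spectrum per pixel from the spectral data cube amounts to unfolding a three-dimensional array into a two-dimensional matrix. The Python sketch below illustrates this; the array layout (rows × columns × wavenumbers), the function names, and the toy cube are assumptions, since the export format of the acquisition software is not described in the text.

```python
import numpy as np

def unfold_cube(cube):
    """Reshape a (rows, cols, bands) cube into a (rows * cols, bands) spectral matrix."""
    rows, cols, bands = cube.shape
    return cube.reshape(rows * cols, bands)

def pixel_index(row, col, n_cols):
    """Row of the unfolded matrix that holds the spectrum of pixel (row, col)."""
    return row * n_cols + col

# Toy cube: 4 x 5 pixels, 6 spectral variables per pixel.
cube = np.arange(4 * 5 * 6, dtype=float).reshape(4, 5, 6)
spectra = unfold_cube(cube)
one_spectrum = spectra[pixel_index(2, 3, n_cols=5)]
```

Keeping the pixel-to-row mapping explicit makes it possible to fold per-pixel results (for example, a band intensity) back into an image of the original shape.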
An obvious baseline offset remained in the disease-1 and disease-2 spectra even after the dropwise addition of AgNPs to suppress fluorescence. Therefore, the MSBC, AsLS, and air-PLS algorithms were adopted to eliminate the disturbance of the baseline offset. The parameters of these methods were set manually to obtain the best result. For the MSBC algorithm, the parameters were set to λ = 150, μ = 8 × 10^7, and ρ = 0. For the AsLS algorithm, the parameters were set to λ = 5,000 and ρ = 0.0001. For the air-PLS algorithm, the parameters were set to λ = 150 and ρ = 0.01. The corrected spectra and the predicted fluorescence baselines are plotted in Figures 6A–C. As shown in Figure 6, the curved baselines were well fitted and subtracted by the three algorithms. In the corrected spectra, the baselines were pulled back to zero, the peak locations remained unchanged, and the peak shapes were more prominent, which indicated the effectiveness of the baseline correction methods.
 
  Figure 6. Spectral baseline correction. (A) Baseline correction using MSBC. (B) Baseline correction using AsLS. (C) Baseline correction using air-PLS. The blue line represents the original spectrum, the red line represents the estimated baseline, and the yellow line represents the corrected spectrum.
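To make the air-PLS correction more concrete, the following Python sketch implements a simplified version of the adaptive reweighting scheme. It is an assumption-laden illustration, not the MATLAB routine used here: it omits the ρ parameter quoted in the text, uses a dense solver, and is run on a synthetic spectrum; the function name `airpls_baseline`, the iteration cap, and the stopping tolerance are hypothetical choices, and only λ = 150 is taken from the text.

```python
import numpy as np

def airpls_baseline(y, lam=150.0, order=2, max_iter=15, tol=1e-3):
    """Simplified adaptive iteratively reweighted penalized least squares baseline."""
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), n=order, axis=0)       # difference operator
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    z = y.copy()
    for t in range(1, max_iter + 1):
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        d = y - z
        below = d < 0                              # points lying under the fit
        neg_sum = np.abs(d[below].sum())
        if neg_sum < tol * np.abs(y).sum():
            break                                  # fit no longer overshoots the signal
        w = np.zeros(n)                            # points above the fit are ignored
        w[below] = np.exp(t * np.abs(d[below]) / neg_sum)
        # Keep both ends anchored so the system stays well posed.
        w[0] = w[-1] = np.exp(t * d[below].max() / neg_sum)
    return z

# Synthetic spectrum (illustrative only): gentle background plus one narrow peak.
x_demo = np.linspace(300.0, 3000.0, 300)
true_bl = 0.0002 * x_demo + 0.2
y_demo = true_bl + 5.0 * np.exp(-0.5 * ((x_demo - 1600.0) / 12.0) ** 2)

baseline_est = airpls_baseline(y_demo)
corrected_demo = y_demo - baseline_est
```

Unlike AsLS, no asymmetry parameter has to be chosen for the interior points: the weights are derived at each iteration from the current residuals.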
As shown in Figure 6, many SERS peaks can be clearly observed. In detail, the peaks at 319, 957, 1,026, 1,165, 1,242, and 1,325 cm–1 were indicators of cellulose, corresponding to C-C-C or C-O-C skeletal bending (Szymanska et al., 2011), C-C or C-O stretching vibration (Beć et al., 2020), C-C or C-O stretching vibration (Beć et al., 2020), H-C-C or H-C-O skeletal bending (Edwards et al., 1997), C = O stretching vibration (Beć et al., 2020), and C-H bending vibration (Edwards et al., 1997), respectively. The peaks at 625, 731, 1,599, and 2,939 cm–1 were indicators of lignin, corresponding to skeletal bending (Agarwal et al., 2011), skeletal bending (Agarwal et al., 2011), C-C aromatic ring (Agarwal, 2006), and C-H asymmetric stretching vibration (Gierlinger and Schwanninger, 2007), respectively. The assignment of characteristic wavenumbers was presented in Table 1.
Selecting Optimal Preprocessing Method
Correlation analysis was adopted to select the best preprocessing algorithm. The correlations between the spectral variables before and after correction are plotted in Figure 7. Notably, for the original spectra, the regions close to the line y = x had correlation coefficients close to 1, indicating that the original spectra were greatly disturbed by the baseline offset. This high degree of collinearity would adversely affect the classification analysis. Comparing Figures 7B–D with Figure 7A, the regions with a high degree of collinearity decreased noticeably, and most of the spectral variables had low correlations with the others except in the spectral ranges of 300–400, 640–880, and 1,490–1,970 cm–1. In addition, the proportion of pixel points with values greater than 0.6 relative to the total number of pixel points was calculated; the proportions were 0.35, 0.09, 0.24, and 0.07 for the original, MSBC-corrected, AsLS-corrected, and air-PLS-corrected spectra, respectively. The AsLS method failed to fit the baseline effectively at 1,200–1,600 cm–1, resulting in relatively poor baseline correction. These results indicated that the MSBC and air-PLS baseline correction strategies could greatly reduce the high correlation levels among spectral variables and that the air-PLS algorithm had the best elimination effect. Therefore, the spectra corrected by the air-PLS algorithm were used for further analysis.
 
  Figure 7. Correlations between the spectral variables. (A) High correlations were found among the original spectral variables. (B) Correlations declined noticeably after MSBC correction. (C) Correlations declined noticeably after AsLS correction. (D) Correlations declined noticeably after air-PLS correction. The air-PLS algorithm had the best elimination effect.
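The proportion of highly correlated variable pairs used for this comparison can be computed as in the Python sketch below. Whether the original analysis thresholded signed or absolute correlations is not stated, so the signed version is assumed here; the function name `high_correlation_fraction` and the simulated spectra are likewise assumptions.

```python
import numpy as np

def high_correlation_fraction(spectra, threshold=0.6):
    """Fraction of entries in the variable-by-variable correlation map above threshold."""
    corr = np.corrcoef(spectra, rowvar=False)   # columns are spectral variables
    return float(np.mean(corr > threshold))

# Simulated data: 200 spectra x 30 variables of independent noise ...
rng = np.random.default_rng(0)
noise = rng.normal(size=(200, 30))
# ... and the same data with a large per-spectrum offset, mimicking a baseline shift.
offset = 10.0 * rng.normal(size=(200, 1))

frac_clean = high_correlation_fraction(noise)
frac_offset = high_correlation_fraction(noise + offset)
```

A shared offset makes nearly every pair of variables strongly correlated, so a lower fraction after correction indicates that more of the common baseline has been removed.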
Clustering Visualization by Principal Component Analysis
As an unsupervised learning strategy, PCA is often used to show clustering based on the similarity of samples in the feature space. In the present research, PCA was performed on the raw spectra of the total sample set to visualize the distribution of healthy, disease-1, and disease-2 samples. The score scatter plot of the clustering analysis is shown in Figure 8. PC1, PC2, and PC3 explained 51.74, 15.01, and 11.56% of the variance among samples, respectively, and the cumulative contribution of the first three PCs reached 78.31%. Figure 8 shows that the healthy, disease-1, and disease-2 samples clustered clearly. Therefore, the three types of samples had distinct spectral characteristics.
Optimal Variables Selection
There were 1,401 variables in the SERS spectra. However, spectral data contained many non-critical variables, which might reduce the accuracy and stability of subsequent discriminant models. Therefore, selecting optimal variables was essential for better choices of discriminant models. In the present research, two strategies were used to select characteristic variables: algorithm selection (CARS combined with RFrog) and manual selection.
Important variables were extracted from the total 1,401 spectral variables in the full range of 300–3,000 cm–1, as shown in Figure 9. The selected optimal variable subsets were set to subset-1 and subset-2, respectively. In the algorithm selection method, 10 wavenumbers at 448, 536, 667, 1,165, 1,211, 1,312, 1,314, 1,412, 1,707, and 2,951 cm–1 in the subset-1 were identified. In the manual selection method, 10 wavenumbers at 319, 625, 731, 957, 1,026, 1,165, 1,325, 1,460, 1,570, and 2,939 cm–1 in the subset-2 were identified.
 
  Figure 9. The characteristic variables for early disease detection of the AVC disease. Two strategies were used to select characteristic variables: algorithm selection (subset-1) and manual selection (subset-2).
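Once characteristic wavenumbers have been chosen, building the reduced variable matrix is a matter of column selection. The Python sketch below does this for the manually selected subset-2. It assumes a uniformly spaced axis of 1,401 points between 300 and 3,000 cm–1, which is consistent with the variable count given in the text but is not confirmed by it; the function name `nearest_indices` and the random spectral matrix are hypothetical.

```python
import numpy as np

def nearest_indices(axis, targets):
    """Index of the axis point closest to each target wavenumber."""
    axis = np.asarray(axis, dtype=float)
    return np.array([int(np.argmin(np.abs(axis - t))) for t in targets])

# Assumed wavenumber axis: 1,401 evenly spaced points from 300 to 3,000 cm-1.
axis = np.linspace(300.0, 3000.0, 1401)
subset2 = [319, 625, 731, 957, 1026, 1165, 1325, 1460, 1570, 2939]

rng = np.random.default_rng(0)
X_full = rng.normal(size=(5, axis.size))   # stand-in for 5 corrected spectra

idx = nearest_indices(axis, subset2)
X_sub = X_full[:, idx]                      # 5 samples x 10 selected variables
```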
Discriminant Models Establishment
Before establishing discriminant models, SERS spectral data were divided into a calibration set and a prediction set at the ratio of 3:1. Generally, the independent variable (x) represented the spectral matrix of samples, and labeled grades (y) stood for the AVC infection severities. Therefore, the labels for healthy, disease-1, and disease-2 were 1, 2, and 3, respectively. BP-ANN, ELM, RForest, and LS-SVM models were established using four variable matrices (x) to classify the healthy, disease-1, and disease-2 samples. These four variable matrices (x) included the full SERS spectra, the subset-1, the subset-2, and the predicted fluorescence baselines.
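A 3:1 division into calibration and prediction sets can be sketched as follows. The text does not state how samples were assigned to each set, so a seeded random permutation is assumed here; the function name `split_calibration_prediction` and the demo arrays are hypothetical.

```python
import numpy as np

def split_calibration_prediction(X, y, ratio=3, seed=0):
    """Randomly split samples into calibration and prediction sets at ratio:1."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_cal = (len(y) * ratio) // (ratio + 1)
    cal, pred = order[:n_cal], order[n_cal:]
    return X[cal], y[cal], X[pred], y[pred]

# Demo: 120 samples with labels 1 (healthy), 2 (disease-1), 3 (disease-2).
X_demo = np.arange(120 * 5, dtype=float).reshape(120, 5)
y_demo = np.repeat([1, 2, 3], 40)

X_cal, y_cal, X_pred, y_pred = split_calibration_prediction(X_demo, y_demo)
```

A stratified split that preserves the class proportions in both sets would be a reasonable alternative when the classes are imbalanced.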
After formula calculation and experience screening, the learning rate of the BP-ANN model was set uniformly to 0.1, and the number of neurons in the hidden layers were 10, 3, 3, and 10, respectively. The number of neurons in the hidden layer of the ELM model was determined by comparing the performances of the ELM model using different numbers of neurons from 1 to 100 with a step of 1. The ELM with 34 neurons was selected as the optimal model. The number of decision trees in the RForest model was determined by comparing the model performances using different numbers of decision trees from 1 to 500 with a step of 1. The RForest with 100 decision trees was selected as the optimal model. The LS-SVM model used RBF as the kernel function, and the optimal penalty coefficient (c) and the kernel function parameter gamma (g) were obtained by a grid search procedure. Finally, the best-c was 379, and the best-g was 45.
The discriminant accuracy of the models is presented in Table 2. There were significant differences in the classification results of the four models on the full spectra dataset. The classical BP-ANN model learns complex relationships between data, which can improve the analytical performance (such as sensitivity and specificity) of classification. However, the BP-ANN model has a tendency to converge to a local optimum rather than the global optimum (Lussier et al., 2020). This may explain why the BP-ANN model had the lowest classification accuracy on the full spectra dataset compared with the other three models. In contrast to the BP-ANN model, the LS-SVM model is deterministic, and its solution is global and unique. As a result, the classification accuracy of the LS-SVM model improved significantly compared with the BP-ANN model. In the RForest model, each tree selected the features that maximize the separation of the dataset into three classes. The outputs of the decision trees were then pooled, leading to the final classification result. Therefore, the RForest model also exhibited excellent analytical performance, comparable to that of the LS-SVM model.
Compared with the full spectra dataset, over 99% of the input variables were removed as non-critical in subset-1 and subset-2 (10 vs. 1,401 variables). Meanwhile, the classification accuracy of the subset models did not decrease significantly, which demonstrated the value of the optimal variable selection strategies. Generally, fluorescence baselines reduce the simplicity and robustness of a calibration model built on the raw spectra, and existing studies by other scholars have removed the fluorescence baseline from the raw data. However, the classification accuracy of the models based on the fluorescence baseline dataset was surprisingly high in the present research. When infecting the phloem tissue, the fungus Valsa mali produces various chemical substances such as protocatechuic acid, isocoumarin, and phlorizin. Although these substances produce fluorescence interference, the baseline reflects their chemical composition and content. Thus, the fluorescence baseline can serve as valid information. This finding will guide our subsequent research.
However, the above three methods mainly focused on feature extraction, parameter optimization, and optimal variable selection, without considering model run time, which is also crucial for intelligent online detection. Furthermore, intelligent online detection will be an important research direction in the field of plant disease detection. The ELM model randomly generates the hidden node parameters and then analytically determines the output weights instead of tuning them iteratively (Huang et al., 2006). Thus, the ELM model runs quickly and lends itself to real application scenarios. As seen in Table 2, the ELM model ran in as little as 0.01 s, far faster than the other three methods. The LS-SVM model first used a grid search to select the best-c and best-g parameters, which severely slowed discrimination and raised the run time to as much as 0.91 s. Therefore, the ELM algorithm can be considered as the detection model in the subsequent online detection study.
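The one-pass training that makes the ELM fast can be illustrated with a minimal sketch. It is not the implementation used in this study: the tanh activation, the 50 hidden nodes, the function names `train_elm` and `predict_elm`, and the synthetic three-class data with 10 variables are all assumptions chosen for illustration.

```python
import numpy as np

def train_elm(X, y, n_hidden=50, n_classes=3, seed=0):
    """Fit a single-hidden-layer ELM classifier in one pass (no iterative tuning)."""
    rng = np.random.default_rng(seed)
    # Hidden node parameters are drawn at random and never updated.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)        # hidden-layer output matrix (tanh is an assumption)
    T = np.eye(n_classes)[y]      # one-hot class targets
    # Output weights are the least-squares solution via the pseudoinverse of H.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Assign each sample to the class with the largest output."""
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Synthetic stand-in for three classes described by 10 selected variables.
rng = np.random.default_rng(42)
centers = rng.normal(0.0, 3.0, size=(3, 10))

def sample(n_per_class):
    X = np.vstack([c + 0.5 * rng.normal(size=(n_per_class, 10)) for c in centers])
    y = np.repeat(np.arange(3), n_per_class)
    return X, y

X_train, y_train = sample(50)
X_test, y_test = sample(50)
W, b, beta = train_elm(X_train, y_train)
accuracy = np.mean(predict_elm(X_test, W, b, beta) == y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Training reduces to one matrix product and one pseudoinverse, which is why no iterative weight updates or grid search are needed in this sketch.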
Chemical Imaging Analysis of the Disease-Health Junction
The SERS micro-spectral image data cube of each phloem sample was processed by the air-PLS algorithm to eliminate the fluorescence baseline, with parameter values consistent with section “Overview of Surface-Enhanced Raman Scattering Spectra.” The processed micro-spectral cube was then used, pixel by pixel, to generate the chemical distribution images in Figure 10. The symmetric stretching vibration at 1,600 cm–1 was identified as the characteristic peak of the lignin components, while the bands at 300–550 cm–1 were contributed by cellulose. Therefore, these images were constructed based on the cellulose signature band at 300–550 cm–1 and the lignin signature peak at 1,600 cm–1.
 
  Figure 10. Chemical distribution images of phloem samples based on SERS spectroscopy. (a–c) The optical micrographs of the phloem tissues taken from the branches of three trees. (d–f) The chemical imaging results based on the cellulose signature peak at 300–550 cm–1. (g–i) The chemical imaging results based on the lignin signature peak at 1,600 cm–1.
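The pixel-wise construction of such chemical images can be sketched as follows. This is an illustrative example rather than the processing pipeline of this study: the function names `band_image` and `rescale`, the Raman shift axis, the 1,590–1,610 cm–1 window around the lignin peak, and the synthetic, already baseline-corrected cube are assumptions.

```python
import numpy as np

def band_image(cube, shifts, lo, hi):
    """Sum each pixel's spectrum over the Raman shift window [lo, hi] (cm^-1)."""
    mask = (shifts >= lo) & (shifts <= hi)
    if not mask.any():
        raise ValueError("no spectral points inside the requested band")
    return cube[..., mask].sum(axis=-1)

def rescale(img):
    """Map an image to [0, 1] so it can be shown with a blue-to-red colormap."""
    span = img.max() - img.min()
    if span == 0:
        return np.zeros_like(img, dtype=float)
    return (img - img.min()) / span

# Synthetic baseline-corrected cube: rows x columns x spectral points.
shifts = np.linspace(200.0, 1800.0, 1401)                         # hypothetical axis
lignin_peak = np.exp(-0.5 * ((shifts - 1600.0) / 8.0) ** 2)       # narrow peak
cellulose_band = np.exp(-0.5 * ((shifts - 425.0) / 60.0) ** 2)    # broad band
amplitude = np.ones((4, 6))
amplitude[:, 3:] = 0.2    # right half mimics tissue with weaker cell wall signal
cube = amplitude[..., None] * (lignin_peak + cellulose_band)

cellulose_img = rescale(band_image(cube, shifts, 300.0, 550.0))
lignin_img = rescale(band_image(cube, shifts, 1590.0, 1610.0))
print(cellulose_img.round(2))
print(lignin_img.round(2))
```

In this synthetic case both images are high on the left half and low on the right half, which mirrors the red-to-blue intensity coding described for Figures 10d–i.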
Because the cell walls of the phloem tissues were probed, the collected spectra did not contain any intracellular signals. Figures 10a–c show the optical micrographs of the phloem tissues taken from the branches of three trees. The chemical imaging results based on the cellulose signature band at 300–550 cm–1 are shown in Figures 10d–f. The redder a pixel, the stronger the spectral signal of the chemical component; the bluer a pixel, the weaker the signal. The SERS signal of the healthy tissue exhibited high intensity, with red, bright yellow, and green pixels. The diseased phloem tissue exhibited low intensity, with blue and green pixels, and the disease-health junction exhibited green pixels. These differences among regions in the SERS images can be attributed to differences in cell wall components. The chemical imaging results based on the lignin signature peak at 1,600 cm–1 are shown in Figures 10g–i and follow a pattern similar to that of the cellulose distribution. The different regions of the phloem tissue showed distinct distributions of cellulose and lignin, in good agreement with the optical micrographs. The results suggested that cellulose and lignin in the cell walls of infected tissues were significantly reduced. This is consistent with previous research (Ke et al., 2013) indicating that cell wall degrading enzymes play an important role in fungal infection. Therefore, Raman microimaging was capable of detecting AVC at early infection stages. It is worth noting that Raman microimaging can visualize the intensity and distribution of cell wall components in situ through cytological observations. Meanwhile, this rapid and non-invasive chemical imaging strategy offers advantages over other methods, such as reagent staining and transmission electron microscopy.
Conclusion
In this study, SERS spectroscopy combined with chemometric methods was applied to the early detection of AVC. First, three spectral preprocessing algorithms were compared, and the air-PLS algorithm was found to be effective in removing the fluorescence background of the spectra. PCA then provided a good clustering effect for visualizing the distribution of the samples in three classes. Two variable selection strategies were used to choose optimal variables for the machine learning models for detecting AVC, and these models exhibited excellent analytical performance. Meanwhile, the classification accuracy of the models based on the fluorescence dataset was unexpectedly high, which was an encouraging finding. In addition, this study proposed a new strategy for SERS chemical imaging of diseased apple phloem tissues using a non-destructive, label-free method. This chemical imaging provided the spatiotemporal dynamic characteristics of changes in the cellulose and lignin of the phloem disease-health junction under fungal stress, which would be helpful for early AVC detection and the analysis of disease dynamics.
Data Availability Statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.
Author Contributions
SF: writing – original draft and writing – review and editing. JL: investigation, resources, writing – review and editing, and revision. YW, FZ, and YZ: investigation and resources. KY: conceptualization, investigation, resources, and writing – review and editing. All authors contributed to the article and approved the submitted version.
Funding
This work was supported by the National Natural Science Foundation of China (Program Nos. 31901403 and 61705188), Natural Science Basic Research Plan in Shaanxi Province of China (Program No. 2020JQ-267), China Postdoctoral Science Foundation (Program No. 2018M641023), Shaanxi Province Postdoctoral Science Foundation (Program No. 2017BSHYDZZ61), Science and Technology Innovation and Achievement Transformation Project of Experimental Demonstration Station (Program No. TGZX2019-10), and Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, China.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
We would like to thank Guangyu Sun (Northwest A&F University) for providing the experimental materials. We would also like to thank Life Science Large Instrument Sharing Platform (Northwest A&F University) for its support for TEM experiments.
References
Agarwal, U. P. (2006). Raman imaging to investigate ultrastructure and composition of plant cell walls: distribution of lignin and cellulose in black spruce wood (Picea mariana). Planta 224, 1141–1153. doi: 10.1007/s00425-006-0295-z
Agarwal, U. P., James, D. M., and Sally, A. R. (2011). FT-Raman investigation of milled-wood lignins: softwood, hardwood, and chemically modified black spruce lignins. J. Wood Chem. Technol. 31, 324–344. doi: 10.1080/02773813.2011.562338
Baek, S. J., Aaron, P., Young, J. A., and Jaebum, C. (2015). Baseline correction using asymmetrically reweighted penalized least squares smoothing. Analyst 140, 250–257. doi: 10.1039/c4an01061b
Beć, K., Justyna, G., Günther, B., Michael, P., and Christian, H. (2020). Principles and applications of vibrational spectroscopic imaging in plant science: a review. Front. Plant Sci. 11:1226. doi: 10.3389/fpls.2020.01226
Cruz, A., Yiannis, A., Roberto, P., Alberto, M., Alessandra, P., Luigi, D. B., et al. (2019). Detection of grapevine yellows symptoms in Vitis vinifera L. with artificial intelligence. Comput. Electron Agric. 157, 63–76.
Dong, C. W., Yang, Y., Zhang, J. Q., Zhu, H. K., and Liu, F. (2014). Detection of thrips defect on green-peel citrus using hyperspectral imaging technology combining PCA and B-spline lighting correction method. J. Integr. Agric. 13, 2229–2235. doi: 10.1016/s2095-3119(13)60671-1
Dou, T. Y., Lee, S., Sonia, I., Nicolas, G., Prakash, N., Kranthi, M., et al. (2021). Biochemical origin of Raman-based diagnostics of Huanglongbing in grapefruit trees. Front. Plant Sci. 12:680991. doi: 10.3389/fpls.2021.680991
Edwards, H. G. M., Farwell, D. W., and Webster, D. (1997). FT Raman microscopy of untreated natural plant fibres. Spectrochim. Acta A Mol. Biomol. Spectrosc. 53, 2383–2392. doi: 10.1016/s1386-1425(97)00178-9
Fang, S. Y., Cui, R. Y., Wang, Y., Zhao, Y. R., Yu, K. Q., and Jiang, A. (2021). Application of multiple spectral systems for the tree disease detection: a review. Appl. Spectrosc. Rev. doi: 10.1080/05704928.2021.1930552
Fang, Y., and Ramasamy, P. R. (2015). Current and prospective methods for plant disease detection. Biosensors 5, 537–561. doi: 10.3390/bios5030537
Farber, C., Rebecca, B., Li, P., Charles, R., and Dmitry, K. (2020). Non-invasive characterization of single-, double- and triple-viral diseases of wheat with a hand-held Raman spectrometer. Front. Plant Sci. 11:1300. doi: 10.3389/fpls.2020.01300
Feng, Y. Q., Yin, Z. Y., Wu, Y. M., Xu, L. S., Du, H. X., Wang, N. N., et al. (2020). LaeA controls virulence and secondary metabolism in apple canker pathogen Valsa mali. Front. Microbiol. 11:581203. doi: 10.3389/fmicb.2020.581203
Gierlinger, N., and Schwanninger, S. (2007). The potential of Raman microscopy and Raman imaging in plant research. Spectrosc. Int. J. 21, 69–89.
Golhani, K., Siva, K. B., Ganesan, V., and Biswajeet, P. (2018). A review of neural networks in plant disease detection using hyperspectral data. Information Processing Agric. 5, 354–371.
Hu, G. S., Yin, C. G., Wan, M. Z., Zhang, Y., and Fang, Y. (2020). Recognition of diseased Pinus trees in UAV images using deep learning and AdaBoost classifier. Biosystem Eng. 194, 138–151.
Huang, C. H., Gajendra, P. S., Su, H. P., Chua, N. H., Rajeev, J. R., and Bong, S. P. (2020). Early diagnosis and management of nitrogen deficiency in plants utilizing Raman spectroscopy. Front. Plant Sci. 11:663. doi: 10.3389/fpls.2020.00663
Huang, G. B., Zhu, Q. Y., and Siew, C. K. (2006). Extreme learning machine: theory and applications. Neurocomputing 70, 489–501.
Ke, X. W., Huang, L. L., Han, Q. M., Gao, X. N., and Kang, Z. S. (2013). Histological and cytological investigations of the infection and colonization of apple bark by Valsa mali var. mali. Australas Plant Pathol. 42, 85–93.
Khairunniza, B. S., and Vong, N. V. (2014). Detection of basal stem rot (BSR) infected oil palm tree using laser scanning data. Agric. Agric. Sci. Procedia. 2, 156–164.
Khaled, A. Y., Samsuzana, A. A., Siti, K. B., Nazmi, M. N., and Idris, A. S. (2018). Spectral features selection and classification of oil palm leaves infected by Basal Stem Rot (BSR) disease using dielectric spectroscopy. Comput. Electron Agric. 144, 297–309.
Li, H. D., Liang, Y. Z., Xu, Q. S., and Cao, D. S. (2009). Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. Anal. Chim. Acta 648, 77–84. doi: 10.1016/j.aca.2009.06.046
Li, H. D., Qing, S. X., and Liang, Y. Z. (2012). Random frog: an efficient reversible jump Markov Chain Monte Carlo-like approach for variable selection with applications to gene selection and disease classification. Anal. Chim. Acta 740, 20–26. doi: 10.1016/j.aca.2012.06.031
Li, Q. Q., Yue, H., Song, X. Z., Zhang, J. X., and Min, S. G. (2019). Moving window smoothing on the ensemble of competitive adaptive reweighted sampling algorithm. Spectrochim. Acta A Mol. Biomol. Spectrosc. 214, 129–138. doi: 10.1016/j.saa.2019.02.023
Li, X. L., Sha, J. J., Chu, B. Q., Wei, Y. Z., Huang, W. H., Zhou, H., et al. (2019). Quantitative visualization of intracellular lipids concentration in a microalgae cell based on Raman micro-spectroscopy coupled with chemometrics. Sensor Actuat B Chem. 292, 7–15. doi: 10.1016/j.snb.2019.04.048
Liu, M., Elisa, M., Julie, T. C., Julie, C., Sylvia, K. W., Raymond, T., et al. (2015). Detection and identification of selected cereal rust pathogens by TaqMan® real-time PCR. Can. J. Plant Pathol. 37, 92–105. doi: 10.1080/07060661.2014.999123
Lucay, F. A., Luis, A. C., and Edelmira, D. G. (2020). An LS-SVM classifier based methodology for avoiding unwanted responses in processes under uncertainties. Comput. Chem. Eng. 138:106860. doi: 10.1016/j.compchemeng.2020.106860
Lussier, F., Vincent, T., Benjamin, C., Gregory, Q. W., and Jean, F. M. (2020). Deep learning and artificial intelligence methods for Raman and surface-enhanced Raman scattering. TrAC Trend Anal. Chem. 124:115796.
Martinelli, F., Riccardo, S., Salvatore, D., Stefano, P., Giuseppe, S., Paolo, R., et al. (2015). Advanced methods of plant disease detection. A review. Agron. Sustain. Dev. 35, 1–25.
Meng, X. L., Xing, H. Q., Han, Z. Y., Guo, Y. B., Wang, Y. N., Hu, T. L., et al. (2019). Latent infection of Valsa mali in the seeds, seedlings and twigs of crabapple and apple trees is a potential inoculum source of Valsa canker. Sci. Rep. 9:7738. doi: 10.1038/s41598-019-44228-w
Mukherjee, D., Bolla, G. R., and Benjaram, M. R. (2017). Characterization of ceria-based nano-oxide catalysts by Raman spectroscopy. Top Catal. 60, 1673–1681.
Okiro, L. A., Matthew, A. T., Steven, G. N., Christine, D. S., and Monica, L. P. (2019). Comparative evaluation of LAMP, qPCR, conventional PCR, and ELISA to detect Ralstonia solanacearum in Kenyan potato fields. Plant Dis. 103, 959–965. doi: 10.1094/PDIS-03-18-0489-RE
Peng, J. T., Peng, S. L., Jiang, A., Wei, J. P., Li, C. W., and Tan, J. (2010). Asymmetric least squares for multiple spectra baseline correction. Anal. Chim. Acta 683, 63–68. doi: 10.1016/j.aca.2010.08.033
Petrov, D. V. (2017). Pressure dependence of peak positions, half widths, and peak intensities of methane Raman bands (ν2, 2ν4, ν1, ν3, and 2ν2). J. Raman Spectrosc. 48, 1426–1430. doi: 10.1002/jrs.5141
Qin, J. W., Chao, K. L., and Moon, S. K. (2011). Investigation of Raman chemical imaging for detection of lycopene changes in tomatoes during postharvest ripening. J. Food Eng. 107, 277–288. doi: 10.1016/j.jfoodeng.2011.07.021
Qiu, S. S., Gao, L. P., and Wang, J. (2015). Classification and regression of ELM, LVQ and SVM for E-nose data of strawberry juice. J. Food Eng. 144, 77–85.
Ranulfi, A. C., Marcelo, C. B. C., Thiago, M. K. K., Juliana, F. A., Ednaldo, J. F., Barbara, S. B., et al. (2016). Laser-induced fluorescence spectroscopy applied to early diagnosis of citrus Huanglongbing. Biosyst. Eng. 144, 133–144.
Sanchez, L., Shankar, P., Mike, I., Kranthi, M., and Dmitry, K. (2019a). Detection and identification of canker and blight on orange trees using a hand-held Raman spectrometer. J. Raman Spectrosc. 50, 1875–1880. doi: 10.1002/jrs.5741
Sanchez, L., Shankar, P., Xing, Z. L., Kranthi, M., and Dmitry, K. (2019b). Rapid and noninvasive diagnostics of Huanglongbing and nutrient deficits on citrus trees with a handheld Raman spectrometer. Anal. Bioanal. Chem. 411, 3125–3133. doi: 10.1007/s00216-019-01776-4
Suzaki, K. (2008). Population structure of Valsa ceratosperma, causal fungus of Valsa canker, in apple and pear orchards. J. Gen. Plant Pathol. 74, 128–132.
Szymanska, C., Monika, J. C., and Artur, Z. (2011). Sensing the structural differences in cellulose from apple and bacterial cell wall materials by Raman and FT-IR spectroscopy. Sensors 11, 5543–5560. doi: 10.3390/s110605543
Vallejo, P., Moises, R., Maria, G. G. M., Miguel, G. R. E., Francisco, J. G., Hugo, R. N. C., et al. (2016). Raman spectroscopy an option for the early detection of citrus Huanglongbing. Appl. Spectrosc. 70, 829–839. doi: 10.1177/0003702816638229
Wang, W. D., Nie, J. J., Lv, L. Q., Gong, W., Wang, S. L., Yang, M. M., et al. (2021). A Valsa mali effector Protein 1 targets apple (Malus domestica) pathogenesis-related 10 Protein to promote virulence. Front. Plant Sci. 12:741342. doi: 10.3389/fpls.2021.741342
Wang, X. L., Wei, J. L., Huang, L. L., and Kang, Z. S. (2011). Re-evaluation of pathogens causing Valsa canker on apple in China. Mycologia 103, 317–324. doi: 10.3852/09-165
Wu, D. M., Wang, X. L., and Wu, S. C. (2021). A hybrid method based on extreme learning machine and wavelet transform denoising for stock prediction. Entropy 23:440. doi: 10.3390/e23040440
Yang, G. Y., Wang, Q. Y., Liu, C., Wang, X. B., Fan, S. X., and Huang, W. Q. (2018). Rapid and visual detection of the main chemical compositions in maize seeds based on Raman hyperspectral imaging. Spectrochim. Acta A Mol. Biomol. Spectrosc. 200, 186–194. doi: 10.1016/j.saa.2018.04.026
Yin, Z. Y., Ke, X. W., and Huang, L. L. (2013). Validation of reference genes for gene expression analysis in Valsa mali var. mali using real-time quantitative PCR. World J. Microb. Biot. 29, 1563–1571. doi: 10.1007/s11274-013-1320-6
Zang, R., Yin, Z. Y., Ke, X. W., Wang, X. J., Li, Z. L., Kang, Z. S., et al. (2012). A nested PCR assay for detecting Valsa mali var. mali in different tissues of apple trees. Plant Dis. 96, 1645–1652. doi: 10.1094/PDIS-05-11-0387-RE
Zhang, L., Wang, F. L., Sun, T., and Xu, B. (2018). A constrained optimization method based on BP neural network. Neural Comput. Appl. 29, 413–421.
Keywords: apple Valsa canker, early detection, Surface-Enhanced Raman Scattering, chemical imaging, machine learning
Citation: Fang S, Zhao Y, Wang Y, Li J, Zhu F and Yu K (2022) Surface-Enhanced Raman Scattering Spectroscopy Combined With Chemical Imaging Analysis for Detecting Apple Valsa Canker at an Early Stage. Front. Plant Sci. 13:802761. doi: 10.3389/fpls.2022.802761
Received: 29 October 2021; Accepted: 14 January 2022;
Published: 04 March 2022.
Edited by:
Nam-Hai Chua, Temasek Life Sciences Laboratory, Singapore
Reviewed by:
Gajendra Pratap Singh, Singapore-MIT Alliance for Research and Technology (SMART), Singapore
Dmitry Kurouski, Texas A&M University, United States
Copyright © 2022 Fang, Zhao, Wang, Li, Zhu and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Keqiang Yu, keqiang_yu@nwafu.edu.cn