Comparative Analysis and Development of a Flavor Fingerprint for Volatile Compounds of Vegetable Soybean Seeds Based on Headspace-Gas Chromatography-Ion Mobility Spectrometry

Evaluating the volatile compounds and characteristic fingerprints of the core cultivars of vegetable soybean would provide useful data for improving their aroma in breeding programs. The present study used headspace-gas chromatography-ion mobility spectrometry (HS-GC-IMS) to evaluate the volatile compounds of vegetable soybean seeds at a specific growth stage. In total, 93 signal peaks were detected, from which 63 compounds were identified qualitatively, with 14 volatile flavor compounds providing multiple signals. The 63 volatile compounds consisted of 15 esters, 15 aldehydes, 13 alcohols, 15 ketones, one acid, and four other compounds. The peak intensity of most of the volatile compounds varied greatly between the core cultivars. The alcohols and aldehydes determined the basic volatile flavor of the vegetable soybean seeds, whereas the particular volatile flavor of each cultivar was determined by its esters, ketones, or other components. Characteristic fingerprints were found in some core vegetable soybean cultivars. Four cultivars (Xiangdou, ZHE1754, Zhexian 65018-33, and Qvxian No. 1) had pleasant aromas because of their higher content of 2-acetyl-1-pyrroline (2-AP). A principal component analysis (PCA) was used to distinguish the samples based on the signal intensity of their volatile components. The results showed that the composition and concentration of volatile compounds differed greatly between the core cultivars, with the volatile flavor compounds of soybeans being determined by the ecotype of the cultivar, the direction of breeding selection, and their geographical origin. Characteristic fingerprints of the cultivars were established by HS-GC-IMS, enabling them to be used to describe and distinguish cultivars and their offspring in future breeding studies.


INTRODUCTION
Soybean (Glycine max L. Merr.) is one of the most important crops cultivated worldwide. It is a major source of protein and vegetable oil for human and animal consumption and contains several phytochemicals, such as isoflavones and phenolic compounds. Because of its high nutritional value, the soybean is processed into many different products, such as soybean flour, soybean milk, soy sauce, tofu, natto, and other snacks. The vegetable soybean (Glycine max L. Merr., also known as edamame) is a food-grade soybean variety that is generally harvested when the pods are fully filled and still green (Zhang and Kyei-Boahen, 2007). Vegetable soybeans are consumed widely in China, Japan, and south-east Asia as a snack food. The main edible part of the vegetable soybean is the seed, which is rich in carbohydrates, proteins, vitamins, minerals, and phytochemicals. Apart from its macronutrients and micronutrients, the dark green color of the vegetable soybean at maturity, its large seed size, soft texture, sweetness, and less beany flavor differentiate it from the regular soybean (Saldivar et al., 2011). Of these attributes, flavor often has the greatest influence on consumer acceptance and behavior. The aromatic vegetable soybean has now become more popular and gained wider acceptance in Japan, the United States, and Europe than the regular soybean (Saldivar et al., 2011). The aromatic type of vegetable soybean commands a higher price than the non-aromatic varieties in international markets, mainly because of its characteristic flavor. Flavor is perceived primarily by the senses of taste and olfaction (aromatics/aroma) (Glanz et al., 1998), with a unique flavor being associated with a complex mixture of compounds belonging to different chemical classes (Bravo, 1998).
Determining the diversity of these flavor compounds and their contribution to the volatile flavor of vegetable soybean seeds is invaluable for assessing the quality of the soybean at the edible stage (Castada et al., 2019).
The diversity of volatile compounds has been studied in many crops, such as rice (Monsoor and Proctor, 2004), soybean (Ramasamy et al., 2019), sorghum (Zanan et al., 2016), melon (Shi et al., 2020), pear (Qin et al., 2012), peppers (Ge et al., 2020), and mushroom (Li et al., 2019). The nutritional quality attributes of the vegetable soybean are mainly investigated at specific stages of seed maturity, with its aroma and overall acceptability usually evaluated by its taste (Xu et al., 2016; Jadhav et al., 2018). However, little quantitative information is available for comparing the volatile compounds of a large number of vegetable soybean cultivars at specific stages, and no information is available on the composition and ratio of the volatile flavor components in vegetable soybean seeds. The present study detected the volatile flavor compounds in boiled vegetable soybean seeds, harvested at the R6-R7 growth stage, using headspace-gas chromatography-ion mobility spectrometry (HS-GC-IMS). Ion mobility spectrometry (IMS) is reported to be suitable for characterizing volatile compounds because it can rapidly analyze samples with low detection limits without pretreatment, is highly sensitive to compounds with high electronegativity and high proton affinity, and can detect many chemically diverse compounds, such as alcohols, aldehydes, esters, and ketones (Márquez-Sillero et al., 2011). It has therefore become widely used in food analysis (Arroyo-Manzanares et al., 2017). Accordingly, the present study aimed to evaluate the volatile compounds of the vegetable soybean at the core cultivar level using HS-GC-IMS. The results will provide a reference for identifying the cultivars and their offspring for improving the taste and flavor of vegetable soybeans as part of a breeding program.

Plant Materials
In this study, 30 vegetable soybean core cultivars from a breeding program were used. These consisted of eight spring vegetable soybean cultivars with either a high protein content or good flavor released by the Zhejiang local government; three autumn local cultivars with large-sized seeds, two of them with a green cotyledon; eight autumn vegetable soybean varieties with a different seed color and size and good flavor released in Zhejiang province; three imported vegetable soybean cultivars, representing the typical vegetable soybean flavor, to be used as core parents in the breeding program; and eight vegetable soybean breeding lines with different flavors.
All 30 cultivars (Table 1) were preserved in the Hangzhou Sub-Center of National Soybean Improvement and sown in plots following a completely randomized block design with three replications in the experimental field of Zhejiang Academy of Agricultural Sciences, Hangzhou, Zhejiang province, China, in 2019, with each block measuring 1.5 m wide by 10 m long. The field management procedures, such as fertilization and irrigation, were the same for all the cultivars. The fresh soybean pods were harvested at the R6-R7 stage, with 50 pods collected from each cultivar block and combined for each of the three replicates. The fresh soybean pod samples were wrapped in aluminum foil and then stored at −80 °C until the subsequent analyses.

HS-GC-IMS Data Acquisition
The soybean pod samples were first boiled in water at 100 °C for 5-8 min, then saved for subsequent analyses using a FlavourSpec gas chromatograph (G.A.S. Gesellschaft für Analytische Sensorsysteme mbH, Dortmund, Germany), equipped with a CombiPal GC autosampler (CTC Analytics AG, Zwingen, Switzerland).
The soybean seed samples (3 g) were placed in 20-ml headspace vials and incubated at 60 °C for 15 min while spinning at 500 rpm. A headspace volume of 400 µl was then automatically injected by a syringe at 65 °C into an MXT-5 capillary column (15 m, i.d. 0.53 mm). Nitrogen (99.99% purity) was used as the carrier gas and programmed as follows: an initial flow rate of 2 ml/min, maintained for 2 min, increased to 100 ml/min over 18 min, then maintained at 100 ml/min for 2 min before stopping. The analytes were separated in the column at 60 °C then ionized in the IMS ionization chamber at 45 °C, with a constant gas flow of 150 ml/min. Furthermore, 2-ketones were used to standardize the instrument because the IMS was not responsive to alkanes.
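The carrier-gas program described above can be sketched as a simple piecewise function. This is only an illustrative sketch: the text states the start and end flow rates and the durations, but not the shape of the ramp, so a linear increase is assumed here, and the function name `carrier_flow_ml_min` is hypothetical rather than part of any instrument software.

```python
def carrier_flow_ml_min(t_min: float) -> float:
    """Return the carrier-gas flow (ml/min) at t_min minutes after injection.

    Follows the program given in the text: 2 ml/min held for 2 min,
    raised to 100 ml/min over the next 18 min, held for 2 min, then stopped.
    ASSUMPTION: the 18-min increase is linear (the text does not say).
    """
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= 2.0:
        return 2.0                      # initial hold
    if t_min <= 20.0:
        # assumed linear ramp from 2 to 100 ml/min between minute 2 and 20
        return 2.0 + (100.0 - 2.0) * (t_min - 2.0) / 18.0
    if t_min <= 22.0:
        return 100.0                    # final hold
    return 0.0                          # flow stopped after 22 min
```

Under the linear-ramp assumption, the total run time is 22 min and the flow halfway through the ramp (minute 11) is 51 ml/min.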

Data Analysis
All the experiments were performed three times. All data were acquired in the positive ion mode, with each spectrum formed from an average of 12 scans. Software programs developed by G.A.S. were used to view the analytical spectra and for the quantitative analysis. During the first step, the Laboratory Analytical Viewer (v.2.2.1, G.A.S.) and Reporter analysis (v.1.2.12, G.A.S.) were used to compare the 2D top view, 3D spectrogram, and the spectral differences among the samples. In addition, a GC × IMS Library Search (v.1.0.3, G.A.S.) was used for the qualitative analysis of the volatile compounds based on their retention time in the GC column and drift time (time of flight in the drift tube). The reference retention index (RI) data were supplied by NIST 2014, and the drift time data by G.A.S. The quantitative analysis was based on the peak height of the selected signal peak using the gallery plot analysis (v.1.0.7, G.A.S.). A principal component analysis (PCA) was used to visualize the differences between the soybean cultivars with Dynamic PCA software (G.A.S.). A standard curve was established between the peak height and the concentration of a 2-acetyl-1-pyrroline (2-AP) standard sample (99%). The content of 2-AP in the vegetable soybean cultivars was calculated using the external standard method.
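The external standard method mentioned above can be illustrated with a short sketch: a calibration line is fitted to the peak heights of standards of known concentration and then inverted for an unknown sample. The text does not give the form of the standard curve, so a straight line is assumed here (IMS responses are often linear only over a limited range), and the function names, concentrations, and peak heights are all hypothetical placeholders, not data from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_peak(height, slope, intercept):
    """Invert the calibration line to estimate concentration from peak height."""
    return (height - intercept) / slope


# HYPOTHETICAL calibration standards (concentration units are arbitrary here)
standard_conc = [0.5, 1.0, 2.0, 4.0]
standard_height = [100.0, 150.0, 250.0, 450.0]   # made-up peak heights (a.u.)

slope, intercept = fit_line(standard_conc, standard_height)
estimated = concentration_from_peak(300.0, slope, intercept)
```

In practice, a sample whose peak height falls outside the range spanned by the standards should be treated with caution, because the fitted line is only supported within that range.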

Analysis of HS-GC-IMS Spectra in Vegetable Soybean Seeds
Headspace-gas chromatography-ion mobility spectrometry three-dimensional (3D) and two-dimensional (2D) spectra were used to analyze the changes and diversity of the volatile flavor compounds of the vegetable soybean samples. The data are represented using a 3D topographical visualization and 2D topographic plot. The differences between the different cultivars were obvious (Figures 1A,B). In Figure 1A, the X-axis represents the ion drift time of the volatile flavor compounds, the Y-axis represents the gas phase retention time, and the Z-axis represents the peak intensity. The peak signal distributions of the different samples were very similar, but the signal intensity varied, indicating that the content of volatile flavor compounds differed among the samples.
In a further comparison using 2D spectra (Figure 1B), the reactive ion peak (RIP) is represented by the red vertical line at the horizontal coordinate of 1.0. Each point to the right of the RIP represents a volatile compound, and the retention times of most signals fall between 100 and 600 s. To compare the differences among the samples, sample ZH716 was used as a reference, and the reference spectrum was subtracted from the spectra of the other samples. After subtraction, the background was white, with a red area indicating that the content of the compound was higher than that of the reference sample, and a blue area indicating that it was lower. Figure 1B thus shows the diversity of volatile flavor compounds among the different samples directly.
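The reference-subtraction view described above can be sketched as follows. This is a minimal illustration, not the vendor software's algorithm: each spectrum is assumed to be a small grid of intensities indexed by retention time and drift time, and the function names and the tolerance `tol` that decides what counts as "white" are hypothetical.

```python
def difference_map(sample, reference):
    """Subtract the reference intensities from the sample, cell by cell."""
    if len(sample) != len(reference):
        raise ValueError("spectra must have the same shape")
    return [
        [s - r for s, r in zip(sample_row, reference_row)]
        for sample_row, reference_row in zip(sample, reference)
    ]


def color_of(delta, tol=50.0):
    """Map a difference to the color convention used in the text.

    ASSUMPTION: differences within +/- tol are shown as white background.
    """
    if delta > tol:
        return "red"     # higher than the reference sample
    if delta < -tol:
        return "blue"    # lower than the reference sample
    return "white"       # effectively the same as the reference


# HYPOTHETICAL 2 x 2 intensity grids (rows: retention time, columns: drift time)
reference = [[100.0, 800.0], [400.0, 50.0]]
sample = [[110.0, 300.0], [900.0, 60.0]]

colors = [[color_of(d) for d in row] for row in difference_map(sample, reference)]
# colors == [["white", "blue"], ["red", "white"]]
```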

Qualitative Analysis of Volatile Flavor Compounds in Vegetable Soybean Seeds by HS-GC-IMS
The spectral topographic plots of the Huning 95-1 cultivar were selected for the qualitative analysis, because all the samples had similar volatile flavor compounds. In Figure 2, each dot represents a volatile flavor compound. Most of the dots were concentrated in the range of retention times between 100 and 400 s and abscissae between 1 and 2. The compounds identified were numbered, with unmarked dots denoting unidentified compounds. From all the samples, 93 signal peaks were detected and 63 compounds were qualitatively identified using the built-in NIST database and IMS database in the GC-IMS library search (Table 2). Fourteen volatile flavor compounds provided multiple signals, such as monomers and dimers. These included methyl octanoate, ethyl hexanoate, ethyl 2-methylpropanoate, ethyl propanoate, (E)-2-octenal, (E)-hept-2-enal, (E)-2-hexen-1-ol, oct-1-en-3-ol, pentan-1-ol, trans-2-pentenal, 6-methyl-hept-5-en-2-ol, 3-octanone, 1-octen-3-one, and 2-heptanone. The signals of each of these compounds exhibited similar retention times but different migration times, related to the content of the compound. In the ionization region, the protonated molecules and neutral molecules combined to form dimers, whose quantity could be enhanced by a high content of the compounds (Ewing et al., 1999; Arroyo-Manzanares et al., 2017; Rodríguez-Maecker et al., 2017).

Relative Abundance of Major Volatile Flavor Compounds in the Core Vegetable Soybean Cultivars
A total of 63 diverse volatile compounds were identified by using the GC-IMS, consisting of esters (15), aldehydes (15), alcohols (13), ketones (15), acid (1), and other volatiles (4). The relative peak signal volumes of the volatile compounds from the different cultivars are given in a Supplementary Table. The results showed that the peak volumes of most of the volatile compounds varied greatly between the different cultivars.
Of the 15 ester compounds, acetyl esters (9/15), such as ethyl acetate, ethyl hexanoate, ethyl hex-3-enoate, ethyl 2-methylpropanoate, ethyl 2-methylbutanoate, and ethyl nonanoate were predominant.
Ketones, which contribute to the flavor of food, were detected in the vegetable soybean seeds. Compared with the alcohols and aldehydes, the peak volumes of most ketone compounds (10/15) were less than 1,000 a.u. The highest peak volumes of the ketone compounds were for 3-octanone-D and 3-octanone-M, followed by 3-pentanone and 2-butanone. Like the ester compounds, the ketones in the vegetable soybean seeds provided the characteristic volatile flavor of the different vegetable soybean cultivars.

Fingerprints of Cultivars Based on Volatile Substances Using HS-GC-IMS
The volatile compounds in the different vegetable soybean cultivars were analyzed using HS-GC-IMS. Fingerprint imaging (Figure 3) shows the gallery plot of each sample and their color differences so that the content of volatile compounds can be approximated by the color of each square, with a brighter color representing a higher content of the compound. Each column indicates a signal peak, and each row represents a sample, with three repeats per sample. Ninety-three signal peaks were detected in the 30 samples, with 63 compounds being analyzed qualitatively and quantitatively. Of the 63 compounds, pentan-1-ol, (E)-2-hexen-1-ol, (E)-hept-2-enal, oct-1-en-3-ol, and hexanal were detected and had a comparatively high content in all 30 samples. In contrast, ethyl hex-3-enoate, amyl acetate, isoamyl acetate, and ethyl 2-methylpropanoate-D were not detected in any of the 30 samples except for Zhexian No. 19. The seeds of this cultivar exhibited the strongest signal peaks for nine ester signals: ethyl acetate, amyl acetate, ethyl hex-3-enoate-M, ethyl 2-methylpropanoate-D, ethyl 2-methylpropanoate-M, ethyl 2-methylbutanoate, ethyl propanoate-D, isoamyl acetate, and ethyl propanoate-M. Therefore, these peak signals constituted a very particular fingerprint feature for the Zhexian No. 19 cultivar (Figure 3).
The strongest peak signals for ethyl nonanoate, methyl octanoate-M, and methyl octanoate-D were provided by the Huning 95-1 cultivar; therefore, these three signals could be taken as marker signals for the Huning 95-1 cultivar. The characteristic fingerprint of the Lvpiqingren cultivar consisted of strong peak signals for (E)-2-octenal-D, (E)-2-octenal-M, heptanal, and trans-2-pentenal-M. Of the 13 alcohol compounds, the peak signals of (Z)-3-hexen-1-ol, pentan-1-ol-D, and n-hexanol provided a special fingerprint for the Zhechun No. 3, Zhechun No. 4, and Zhechun No. 8 cultivars, which all have a high protein content. The Lvpiqingren and Qingpiqingren cultivars exhibited stronger peak signals for 6-methyl-hept-5-en-2-ol-D and 6-methyl-hept-5-en-2-ol-M than the other cultivars. These two cultivars were summer ecotypes and local germplasms with green cotyledons, an exceptional phenotype compared with the other cultivars. Ten ketones and one acid were detected in the present study, with the peak signals of 2,3-butanedione, 2-hexanone, and allylacetic acid endowing the characteristic fingerprints for the Zhexian 76004 and TMD cultivars. These three compounds also exhibited strong peak signals in the Zhexian 65018-33 cultivar, which in addition showed the strongest peak signal for 3-hydroxybutan-2-one. The Lvpiqingren cultivar exhibited special peak signals for 2-heptanone-D, 2-heptanone-M, and 2-butanone. Of the five other volatile compounds, the peak signal for 2-acetyl-1-pyrroline showed the brightest color in the Zhexian 65018-33 cultivar.
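The way marker signals are read off the gallery plot (the cultivar with the strongest signal for a given peak, averaged over its three repeats) can be sketched in a few lines. This is only an illustration of the idea, not the gallery plot software: the function name `strongest_cultivar_per_peak` is hypothetical and the peak heights below are made-up numbers, although the cultivar and peak names are taken from the text.

```python
from statistics import mean


def strongest_cultivar_per_peak(peak_heights):
    """For each signal peak, return the cultivar with the highest mean height.

    peak_heights maps cultivar -> {peak name -> list of replicate heights}.
    """
    peaks = {peak for replicates in peak_heights.values() for peak in replicates}
    strongest = {}
    for peak in sorted(peaks):
        # average the repeats of every cultivar in which this peak was measured
        means = {
            cultivar: mean(replicates[peak])
            for cultivar, replicates in peak_heights.items()
            if peak in replicates
        }
        strongest[peak] = max(means, key=means.get)
    return strongest


# HYPOTHETICAL peak heights (a.u.), three repeats per cultivar
example = {
    "Huning 95-1": {"ethyl nonanoate": [900.0, 950.0, 880.0], "heptanal": [200.0, 210.0, 190.0]},
    "Lvpiqingren": {"ethyl nonanoate": [300.0, 320.0, 310.0], "heptanal": [700.0, 720.0, 680.0]},
}

markers = strongest_cultivar_per_peak(example)
# markers == {"ethyl nonanoate": "Huning 95-1", "heptanal": "Lvpiqingren"}
```

A real marker selection would also need to check that the difference between cultivars is large relative to the spread of the repeats, which this sketch does not attempt.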

PCA Analysis Based on Volatile Substances Detected by HS-GC-IMS
Principal component analysis can reduce the number of dimensions and classify the original data. Figure 4 shows that the first two principal components explained 78% of the total variance: PC1 54% and PC2 24%. The data from the present study were separated into four groups. Group I, consisting of five high-protein soybean cultivars, was poorly correlated with the other vegetable soybean cultivars. The autumn and summer ecotype vegetable soybean cultivars were gathered in the same region to form group II. The spring ecotype vegetable soybean cultivars were divided into groups III and IV, with the flavors varying greatly between these two groups. The six cultivars in group III included Taiwan 75 and Huning 95-1, which were introduced from the Shanghai City and Taiwan regions and each served as one of the hybrid parents of the other four cultivars. In group IV, six cultivars were bred by crossing local soybean cultivars with cultivars from Japan. Four cultivars, ZK1754, TMD, Zhexian No. 19, and Zhexian 76004, each formed a category of their own because of their special volatile flavor components, which was consistent with the results on the special fingerprints of volatile flavor.
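The "percentage of variance explained" reported for PC1 and PC2 is the share of the total variance carried by the largest eigenvalues of the covariance matrix of the peak intensities. The sketch below illustrates that calculation in plain Python using power iteration with deflation. It is not the Dynamic PCA software used in the study: all function names are hypothetical, the toy data are invented, and any scaling or preprocessing applied by the vendor software is not reproduced.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of rows (samples) by columns (peak intensities)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector by power iteration."""
    d = len(matrix)
    v = [1.0 / (i + 1) for i in range(d)]           # deterministic start vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0, v                            # no variance left
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(a * b for a, b in zip(v, mv)), v      # Rayleigh quotient


def explained_variance_ratios(rows, n_components=2):
    """Fraction of total variance explained by the leading components."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))         # trace = total variance
    ratios = []
    for _ in range(n_components):
        lam, v = top_eigenpair(cov)
        ratios.append(lam / total)
        # deflation: remove the component just found before the next pass
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return ratios


# HYPOTHETICAL toy data: four samples, three signal peaks
toy = [[2.0, 0.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
pc1, pc2 = explained_variance_ratios(toy)
```

For the toy data the variance lies along two axes in a 4:1 ratio, so the first two components account for about 80% and 20% of the total, analogous to how the 54% and 24% in Figure 4 add up to 78%.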

HS-GC-IMS Analytical Approach for the Measurement of Volatile Compounds in Plant-Based Products
Several analytical approaches have been developed for measuring the volatile compounds in plant-based products. GC-MS has been considered a powerful analytical instrument for identifying chemicals. However, its time-consuming and complex sample pretreatment and its significant constraints in distinguishing isomeric molecules, particularly ring-isomeric compounds (Kranenburg et al., 2020), limit its application for screening plant-based products. GC-O-MS is a powerful tool for extracting aroma-active compounds from complex mixtures, but because it involves repetitive, time-consuming labor, this method is not ideal for the rapid detection of volatile organic compounds (VOCs) in plant-based products. Although gas sensors are sensitive at room temperature and have strong selectivity and low detection limits, their detection performance is greatly influenced by moisture and air fluctuation in the environment, resulting in unstable results and poor instrument repeatability (Chen et al., 2018). Ion mobility spectrometry is an emerging technique for detecting trace gases and for characterizing chemical ionic substances based on the differences in the rate of migration of gas phase ions in an electric field (Li et al., 2019). This technique has many advantages over other methods, such as a rapid speed of detection, high sensitivity, and simple sample preparation. However, IMS also has some limitations, especially for complex samples (Arce et al., 2014). Combining IMS with a GC column could provide a better method for detecting volatile compounds and has already proved useful for analyzing and characterizing them. HS-GC-IMS has been widely and usefully applied for analyzing wine, eggs, jujube fruits, and honey (Garrido-Delgado et al., 2011; Cavanna et al., 2019; Sun et al., 2019; Wang et al., 2019).
More importantly, HS-GC-IMS can also be used to establish the visual fingerprints of volatile compounds which show changes in their variety and progress during a food process (Ge et al., 2020). This could be the most powerful function of GC-IMS compared with other analysis methods.

Volatile Compounds in Vegetable Soybean
Various volatile compounds may serve as indicators of the maturity of soybeans and biochemical markers to evaluate seed quality. Several classes of compounds, such as alcohols, aldehydes, esters, and ketones, were the main volatile compounds identified in the present study. Of these compounds, acetyl esters were the main characteristic components, agreeing with previous studies reporting that esters with fruity notes were the major aromatic components found in ripe fruit (Lara et al., 2003; Moyaleon et al., 2006; Qin et al., 2012). C6 aldehydes and alcohols, often referred to as "green leaf" volatiles because of their green grass note (Yang et al., 2009), were found to be important background volatiles of vegetable soybeans in the present study. Some previous studies have reported that light and water conditions and growth temperatures can significantly affect the formation of volatile metabolites, particularly alcohols and aldehydes, in plants (Bertrand et al., 2012; Benelli et al., 2015). In addition, the present study found that some alcohols and aldehydes in vegetable soybean seeds exhibited significantly different peak volumes, even though these cultivars were produced in the same environment. These results indicated that the genetic background of the cultivar was another major factor determining the volatile flavor metabolites.
Previous studies have reported that the major volatiles of grain soybeans were ethanol, 1-octen-3-ol, phenylethyl alcohol, hexanal, octanal, 2-propanone, and γ-butyrolactone (Lee and Shibamoto, 2000; Boué et al., 2003; Dings et al., 2005; Kim et al., 2020). However, most of these compounds were not detected in the present study except for oct-1-en-3-ol and hexanal, possibly because of the different sampling stages and different soybean processing methods. In the present research, the vegetable soybean seed samples were harvested at the R6 growth stage then the fresh pods were boiled in water. The specific harvesting stage and treatment method also contributed to the presence of some particular volatile compounds. In addition, the different extraction method used by HS-GC-IMS could be another factor affecting the composition of volatile flavor compounds. Two other important types of compound were found in the present study, furans and 2-acetyl-1-pyrroline (2-AP). Furans, which are formed through Maillard reactions, usually exhibit sweet, burned, and baking odors (Fischer et al., 2017); they showed a stable peak volume among the cultivars and so could have contributed to the basic flavor of the vegetable soybean. 2-Acetyl-1-pyrroline (2-AP), a 5-membered N-heterocyclic ring compound, has been identified as the most important compound contributing to the aroma of soybean (Arikit et al., 2011a,b). It was also detected in the six cultivars where its content was significantly higher than that in the other cultivars, a result consistent with their taste and aroma.
Overall, the soybean cultivars had stronger alcohol and aldehyde component peak signals, but also possessed special ester and ketone signals. In other words, the alcohol and aldehyde compounds determined the basic volatile flavor of the vegetable soybean seeds, with particular volatile flavors being determined by their respective compositions of esters, ketones, or other compounds.
The most abundant volatile flavor compound signals were detected in the spring vegetable soybean cultivars, the Zhexian series of cultivars from Zhexian 12 to Zhexian 65018-33 (Figure 4). The autumn soybean cultivars, Zheqiudou No. 2 and Zheqiudou No. 5, exhibited the most similar volatile flavor compound signals. A summer vegetable soybean cultivar, Kaixinlv, and some imported vegetable soybean cultivars, such as Taiwan 75, exhibited fairly abundant flavor compound signals. These results could help in setting breeding objectives and in selecting the direction of breeding and the growing environment. The quality objectives for breeding spring vegetable soybean varieties were good taste, flavor, and shape, with individual plants or lines with better aromas being selected by breeders during the selection procedure. In contrast, Zheqiudou No. 2 and Zheqiudou No. 5 were bred for producing soy food products, such as tofu, soy sauce, and soybean milk, with little attention being paid to the flavor of the fresh seed. Most summer and autumn cultivars were intended for use both as vegetables and for producing soy foods, so the flavor of the fresh soybean seeds was given only moderate consideration. However, the growing environment for all the cultivars would have some effect on their volatile flavor compounds.
The use of volatile-compound imaging and marker determination based on HS-GC-IMS for discriminating among the vegetable soybean cultivars was a non-targeted approach to analysis. This could play an important role in screening for specific markers and in extracting reliable, unbiased, and visual information from a large amount of data. The present study found that some specific fingerprints belonged to different cultivars. Although the content of volatile flavor compounds in the different soybean cultivars could be affected by their cultivation method or growing environment, some particular volatile flavors (specific fingerprints) could be determined by special genes that could be passed on to their offspring. Therefore, these types of data visualization combined with the results of sensory evaluation could be applied for selecting the volatile flavors and breeding new vegetable soybean varieties with pleasant and acceptable flavors.

PCA Analysis of Volatile Compounds in Vegetable Soybean
PCA is a multivariate statistical analysis method that linearly transforms several variables to select a smaller number of significant variables (Li et al., 2019). The PCA results on volatile compounds from the 30 vegetable soybean cultivars tended to show a clear separation. The soybean cultivars with a high seed protein content were less correlated with the other vegetable soybean cultivars. The volatile flavor compounds of these high-protein soybean cultivars were different from those of the soybean cultivars with a medium or lower protein content. The autumn and summer ecotype vegetable soybean cultivars were located in the same PCA region, thus reflecting the higher correlation of the volatile flavor components of these samples. The flavors of the spring ecotype vegetable soybean cultivars varied greatly.
The particular volatile flavor compounds were determined by the ecotype of the cultivars, the direction of breeding selection, and the original cultivation area. The data from HS-GC-IMS contained useful information and could be a reasonably useful tool for distinguishing among the vegetable soybean cultivars.
These non-targeted characteristic markers offer the potential for selecting the new vegetable soybean lines with good volatile flavors in the future breeding programs.
In conclusion, 93 signal peaks were detected in the seeds from 30 vegetable soybean cultivars, from which 63 compounds were identified qualitatively. Alcohols and aldehydes were the predominant volatile compounds followed by esters and ketones. The composition and concentration of volatile compounds differed greatly between the cultivars. Some characteristic fingerprints of vegetable soybean cultivars, established using HS-GC-IMS, could be used to describe and distinguish cultivars and their offspring in future studies. Four vegetable soybean cultivars that exhibited an aromatic flavor because of their high 2-AP content would be valuable in vegetable soybean breeding programs. Based on the PCA, the volatile flavor compounds of the soybean seeds were determined by the ecotype of the cultivars, the direction of breeding selection, and the geographical origin.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
XF participated in planting the vegetable soybean cultivars. XY performed the statistical analysis and helped to draft the manuscript. HJ collected the materials. QY helped with the analysis of the data. LZ helped to perform the HS-GC-IMS experiment. FY designed the study, carried out the HS-GC-IMS experiment, and helped in drafting the manuscript. All authors have read and approved this manuscript.