Determination of Differentiating Markers in Coicis Semen From Multi-Sources Based on Structural Similarity Classification Coupled With UPCC-Xevo G2-XS QTOF

Coicis semen, a medicinal food, is derived from the dried and mature seeds of Coix lacryma-jobi L. var. ma-yuen (Rom.Caill.) Stapf, a member of the Gramineae family. Lipids are its main constituents. Previous literature reported that coicis semen contains twenty triglycerides and twelve diglycerides. However, we identified thirty-five triglycerides, sixteen diglycerides, four monoglycerides, and two sterols under the preoptimized conditions of UPCC-Xevo G2-XS QTOF combined with a personalized TCM database. Furthermore, we successfully determined glycerol trioleate content to evaluate quality differences. Finally, we identified the fatty acid compositions of seven out of nine differential markers via Progenesis QI using principal component analysis, orthogonal projection to latent structures–discriminant analysis, and the LipidMaps database. In addition, we applied a software-based classification, a method that was previously developed by our team, to verify and predict structurally similar compounds. Our findings confirmed that UPCC-Xevo G2-XS QTOF combined with software-based group classification could be used as an efficient method for exploring the potential lipid markers of seed medicine.


INTRODUCTION
Traditional Chinese medicine (TCM) is well known to contain a wide variety of natural ingredients. However, most reports on the active components of TCM have focused on polysaccharides, alkaloids, and flavonoids. Fatty oils are widely available ingredients of herbs, and the limited attention that they have received may restrict their further development and application. Fatty oils can be obtained as an active ingredient from animals and plants (Liu et al., 2015; Chen et al., 2016; Hong et al., 2016). A number of TCM materials contain fatty oils, which are mainly derived from the seeds and fruits of herbs. Diverse fatty oils comprise glycerol and different types of saturated, monounsaturated, and polyunsaturated fatty acids, each of which exerts therapeutic effects.
However, several components of coicis semen oil, especially glycerides, still require analysis, and the identification of the active ingredients of this material needs in-depth research. Therefore, our team used Acquity UPCC-Xevo G2-XS QTOF coupled with software-based group classification to further explore, identify, and visually classify the active ingredients in coicis semen oils. Coicis semen oils have broad development prospects and need to be explored in depth to lay a foundation for their new preparations, development, and extensive clinical application.

Materials and Reagents
Glycerol trioleate with a purity of more than 99.9%, as determined via HPLC-ELSD, was purchased from the Nature Standard Co. Ltd (Shanghai, China). Acetonitrile and methanol (HPLC-MS grade) were purchased from Merck (Darmstadt, Germany). Ammonium formate (HPLC grade) was purchased from Sigma-Aldrich. High-purity CO2 (99.999%) was purchased from the Shanghai Yizhi Industry Gases Co., Ltd. (Shanghai, China). All other reagents used in sample preparation were of analytical grade. Seven batches of dried coicis semen were purchased from different TCM enterprises in China. The manufacturers and batch numbers of the samples were as follows: batch number 180901 (Zhejiang Chinese Medical University Medical Pieces Co., Ltd., Hangzhou); batch number 190101 (Jirentang Pharmaceutical Co., Ltd., Guiyang); batch number 190105 (Zuoli Baicao Herbal Pieces Co., Ltd., Jiangxi); batch number 181201 (Huadong Herbal Pieces Co., Ltd., Hangzhou); batch number 181206 (Zuoli Baicao Herbal Pieces Co., Ltd., Zhejiang); batch number 190122 (Haiyuan Prepared Slices of Chinese Crude Drugs Co., Ltd., Nanjing); and batch number 190216 (Haichang Chinese Medicine Group Co., Ltd., Nanjing).

Preparation of Reference and Sample Solutions
An appropriate amount of glyceryl trioleate, which was used as the reference substance, was weighed accurately and diluted with n-hexane to prepare a series of working solutions with concentrations of 0.0099, 0.0988, 0.9881, 4.9405, and 9.8810 mg/mL.
All samples were ground and passed through a No. 3 sieve (355 ± 13 μm). The passing rate of the particles was maintained at more than 80%. The sample preparation procedure was as follows: A total of 50.0 mL of n-hexane was added to 0.6 g of powdered coicis semen. The seed powder was soaked for 2 h and then sonicated (50 kHz, 250 W, KQ-500DB) for 30 min. The supernatant was filtered to obtain the sample solution.
The filtrate was diluted with n-hexane 5-fold for the differential component analysis of the samples and 100-fold for the quantitative analysis of glyceryl trioleate. A total of 200 μL of the solution of each batch was mixed together and used as a pooled QC sample solution for the analysis of free fatty acids.
The data processing software included UNIFI 1.9.4 and MassLynx V4.1.

Data Acquisition and Analysis
Xevo G2-XS QTOF uses the patented LockSpray technology to ensure the accuracy of the collected data in real time. High-accuracy masses can be obtained, and the combination of accurate mass, isotope distribution, and secondary fragment information allows molecular formulas to be assigned reliably. Xevo G2-XS QTOF also applies the patented MSE technology to obtain the primary and secondary mass spectral information of the compounds simultaneously in a single injection for further structural confirmation.
The ESI+ and ESI− modes were used for the analysis of glycerides and free fatty acids, respectively. In the first step, the overall fragmentation pattern of glycerides was established. In the second step, by combining the fragmentation patterns and consulting related literature, a UNIFI database for the glyceride analysis of coicis semen was established. In the third step, the self-built database was imported into UNIFI in addition to the ChemSpider online database, and the appropriate analysis method was set. The software then automatically analyzed the primary and secondary mass spectral information. Finally, it quickly screened out the targets through its workflow.
Our team developed a classification program in the Visual Basic for Applications (VBA; Microsoft, USA) and MATLAB v7.1 (The Mathworks, Natick, USA) environments. The classification program consisted of three parts (Shan et al., 2012).
A total of 2,916 features were imported into the SIMCA-P 13.5 software (Umetrics, Umeå, Sweden) for principal component analysis (PCA) and orthogonal projection to latent structures-discriminant analysis (OPLS-DA). The variable importance in the projection (VIP) value of each feature was calculated in the OPLS-DA model. A feature was selected as a potential differential marker when its VIP value exceeded 2.00 and its S-plot value exceeded 0.95.
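The selection rule above can be illustrated with a minimal sketch. The feature labels and values below are hypothetical placeholders, not data from this study, and the use of the absolute S-plot value (so that features elevated in either group are retained) is an assumption rather than a documented setting of the software used here.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    label: str     # hypothetical feature identifier
    vip: float     # VIP value from the OPLS-DA model
    s_plot: float  # S-plot correlation coordinate, assumed to lie in [-1, 1]


def select_markers(features, vip_cutoff=2.00, s_plot_cutoff=0.95):
    """Keep features whose VIP and |S-plot| values both exceed the cutoffs."""
    return [
        f for f in features
        if f.vip > vip_cutoff and abs(f.s_plot) > s_plot_cutoff
    ]


# Hypothetical example values, for illustration only.
features = [
    Feature("feature_A", 3.10, 0.98),
    Feature("feature_B", 2.40, 0.60),
    Feature("feature_C", 1.20, 0.99),
    Feature("feature_D", 2.75, -0.97),
]
markers = select_markers(features)
print([m.label for m in markers])
```

Requiring both criteria at once keeps only features that contribute strongly to the model (VIP) and separate the two groups reliably (S-plot).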

Qualitative Results of Glycerides and Free Fatty Acids in Coicis Semen Oils
In the ESI+ mode, fifty-six and fifty-seven compounds were identified in five (No. 180901, No. 181201, No. 181206, No. 190101, and No. 190122) and two (No. 190105 and No. 190216) batches of samples, respectively. These compounds were mainly composed of glycerides. The total ion chromatogram of all samples is provided in Figure 1A, and the corresponding identified glycerides are listed in Table 1. Among them, fifty-six common compounds, including thirty-five triglycerides, fifteen diglycerides, four monoglycerides, and two sterols, were identified, compared with the twenty triglycerides and twelve diglycerides reported previously (Hou et al., 2018a). However, the diglyceride OP was identified only in two batches of samples, namely No. 190105 and No. 190216, likely because of the different processing technologies used by different enterprises. The QC sample shown in Figure 1B was overlaid in the following differential component analysis.
A total of thirty free fatty acids were identified in the ESI− mode. Some unsaturated fatty acids may have isomers, which need to be confirmed further by using reference substances. The total ion chromatogram of the QC sample is given in Figure 1C, and the corresponding identified free fatty acids are listed in Table 2.
As shown in Figure 2A, PLO (tR 3.90 min) provided a precursor ion ([M+NH4]+) at m/z 874.7874 with a double-bond equivalent.
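As a consistency check on the precursor ion assignment, the theoretical m/z of the ammonium adduct can be computed from the molecular formula. The sketch below assumes that PLO denotes the triglyceride built from palmitic (16:0), linoleic (18:2), and oleic (18:1) acids, which gives the formula C55H100O6. The helper names are illustrative and are not part of UNIFI or any vendor software.

```python
import re

# Monoisotopic masses (u) of the elements needed here, and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}
ELECTRON = 0.00054857990946


def parse_formula(formula):
    """Turn a simple formula such as 'C55H100O6' into {'C': 55, 'H': 100, 'O': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    return sum(MONO[el] * n for el, n in parse_formula(formula).items())


def ammonium_adduct_mz(formula):
    """m/z of [M+NH4]+ : neutral mass plus NH4, minus one electron."""
    return monoisotopic_mass(formula) + monoisotopic_mass("NH4") - ELECTRON


def ppm_error(observed, theoretical):
    return (observed - theoretical) / theoretical * 1e6


def double_bond_equivalents(formula):
    """DBE of a neutral CHNO formula: C - H/2 + N/2 + 1."""
    c = parse_formula(formula)
    return c.get("C", 0) - c.get("H", 0) / 2 + c.get("N", 0) / 2 + 1


PLO = "C55H100O6"  # assumed formula for TG(16:0/18:2/18:1)
theoretical = ammonium_adduct_mz(PLO)
print(round(theoretical, 4), round(ppm_error(874.7874, theoretical), 1))
print(double_bond_equivalents(PLO))
```

Under this assumption, the theoretical value is about m/z 874.7858, so the reported m/z 874.7874 lies within roughly 2 ppm of it.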

Software-Based Group Classification of Glycerides
Our team previously developed a program in VBA and MATLAB for the classification of multiple complex components. This program successfully grouped the constituents in the n-hexane extract of coicis semen. Through the comprehensive analysis of herbal samples, fifty-seven peaks were identified and divided into four groups, as shown in Figures 3 and 4. Three of these groups consisted of triglycerides and diglycerides. The remaining group was composed of four monoglycerides, two diglycerides, and two sterols. The chemical structures and characteristic MS fragmentation pathways of these compounds indicated that members of the same group might share similar features, and that unknown ingredients could be identified through the comprehensive software-based group classification of these compounds.

Differential Component Analysis of Different Samples
The Progenesis QI omics analysis software was used for differential component research. Before quality markers were explored, the analytical system was first validated for repeatability through six injections of the QC sample. A total of 2,916 features were extracted and then imported into EZinfo for multivariate statistical analysis. PCA was used to study the variations in the oils of seven batches of coicis semen (Figure 5A). The differences between the groups of samples, namely No. 181216 and No. 190122, were large. Furthermore, No. 190105 and No. 190101, which showed the largest differences, were subjected to OPLS-DA (Figure 5B). These two groups were clearly distinct. We then selected compounds with S-plot ≥ 0.95 and VIP ≥ 2 as markers (Figures 5C, D) and transferred them back to QI for identification. Nine markers were found (Table 3). The LipidBlast, LipidMaps, and ChemSpider databases were then searched in QI for further identification. Among the nine markers found, seven were identified (five diglycerides, one triglyceride, and one stigmasterol), and the molecular formulas of the remaining two unknown compounds were estimated using an elemental composition tool. The abundance distribution of the nine markers in all the samples is shown in Figure 6. The abundances of the markers, except the diglyceride (16:0/18:0/0:0), were remarkably higher in No. 190105 than in No. 190101. This result could be related to the largest differences being found between No. 190105 and No. 190101 and indicated that large differences in resources and processing technologies existed among enterprises.
[Figure 5 caption, partially recovered: ... No. 190101 and No. 190105 with significant differences; (C) S-Plot of No. 190101 and No. 190105; (D) VIP diagram of No. 190101 and No. 190105.]
Zhu et al. Differentiating Markers in Coicis Semen. Frontiers in Pharmacology | www.frontiersin.org | October 2020 | Volume 11 | Article 549181

Investigation of Linear Relations
Reference solutions with concentrations of 0.0099, 0.0988, 0.9881, 4.9405, and 9.8810 mg/mL were each injected three consecutive times. The results showed that glyceryl trioleate had a good linear relationship in the range of 0.0099-9.8810 mg/mL (r² > 0.9990).
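The linearity check can be sketched as an ordinary least-squares fit of detector response against concentration, followed by the coefficient of determination. The five concentrations are those stated above; the peak areas are hypothetical placeholders chosen only to make the example runnable, and the unweighted fit is an assumption, since the regression model used in the study is not stated.

```python
def linear_fit(x, y):
    """Unweighted least-squares line y = slope * x + intercept, plus r^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Concentrations of the reference solutions (mg/mL), as given in the text.
concentrations = [0.0099, 0.0988, 0.9881, 4.9405, 9.8810]
# Hypothetical mean peak areas (arbitrary units); NOT measured data.
peak_areas = [15.2, 103.1, 995.0, 4950.0, 9880.0]

slope, intercept, r_squared = linear_fit(concentrations, peak_areas)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r2={r_squared:.5f}")
```

For a calibration range spanning three orders of magnitude, a weighted fit (for example 1/x) is a common alternative. It is not implemented here.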

Quantitative Limit Investigation
The reference solution was diluted stepwise until glyceryl trioleate presented S/N ≈ 10. The results showed that the quantitative limit was 4.94 ng/mL.

Instrument Precision Inspection
The low, middle, and high concentrations (0.0099, 0.9881, and 9.8810 mg/mL) of the reference solution on the calibration curve were each injected six consecutive times to check instrument precision. The RSD value was less than 3%, which indicated good precision.

Repeatability Test
Six powder samples (0.6 g each) of the same batch (No. 180901) were weighed and prepared via the sample solution preparation method. The average content of glyceryl trioleate was determined and calculated as 0.91%. The results showed that the RSD value was 4.12% (n = 6).
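For reference, the RSD reported in the precision and repeatability tests is the sample standard deviation expressed as a percentage of the mean. The sketch below uses hypothetical replicate values, not the measured contents, and assumes the n − 1 (sample) form of the standard deviation.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%), using the sample (n - 1) standard deviation."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical glyceryl trioleate contents (%) for six replicate preparations.
replicates = [0.95, 0.88, 0.93, 0.87, 0.94, 0.89]
print(f"mean = {mean(replicates):.2f}%, RSD = {rsd_percent(replicates):.2f}%")
```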

Recovery Rate Test
Nine powder samples (0.3 g each) of the same batch of known content were weighed, and then the reference substance was added precisely at low, medium, and high levels. The reference substance/sample ratio was controlled at 0.5:1, 1:1, and 1.5:1, and each concentration level was tested in triplicate.
The results showed a high average recovery of 102.28%.
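The recovery calculation for the spiked samples can be sketched as follows, assuming the conventional definition (amount found minus amount originally present, divided by amount added). All amounts below are hypothetical placeholders; only the three spiking ratios and the triplicate design come from the text.

```python
from statistics import mean


def recovery_percent(found_mg, original_mg, spiked_mg):
    """Spike recovery (%) = (found - original) / spiked * 100."""
    return (found_mg - original_mg) / spiked_mg * 100.0


# Hypothetical amount of glyceryl trioleate already present in each 0.3 g sample (mg).
original_mg = 2.73
# Hypothetical amounts found (mg) at each reference/sample ratio, in triplicate.
found_by_ratio = {
    0.5: [4.12, 4.08, 4.15],
    1.0: [5.50, 5.47, 5.55],
    1.5: [6.90, 6.85, 6.80],
}

recoveries = [
    recovery_percent(found, original_mg, ratio * original_mg)
    for ratio, found_values in found_by_ratio.items()
    for found in found_values
]
print(f"n = {len(recoveries)}, mean recovery = {mean(recoveries):.2f}%")
```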

Sample Measurement Results
The content of glyceryl trioleate in seven batches of the samples was determined using the established method above. The representative chromatogram is shown in Figure 7. Each batch was replicated in triplicate. The average content ranged from 0.84% to 1.05%. The RSD value was less than 5%.

CONCLUSION
The established analytical method demonstrated that ACQUITY UPCC enabled the fast and efficient chromatographic separation of lipids in coicis semen. Xevo G2-XS QTOF combined with LockSpray real-time external standard mass calibration technology ensured mass accuracy. The data collection method based on MSE tandem mass spectrometry without content discrimination ensured the full collection of information, and a single acquisition could obtain precursor ion and fragment ion information simultaneously in a convenient, fast, and high-throughput manner. By using the ACQUITY UPCC/Xevo G2-XS QTOF system combined with the UNIFI software, fifty-seven compounds, mainly glycerides, were identified in the ESI+ mode and divided into four groups on the basis of their similar features via software-based group classification. Moreover, thirty free fatty acids were identified in the ESI− mode. In addition, the QI omics analysis software found nine differential compounds between No. 190101 and No. 190105, and seven of these compounds were identified. Finally, the quantitative analysis of glyceryl trioleate (the quality control component in the 2015 Edition of the Chinese Pharmacopoeia) and methodological verification were performed, and the results showed that the linearity, precision, reproducibility, recovery, and other parameters of the method were good. The established quantitative method determined that the glyceryl trioleate contents of the seven batches of samples ranged from 0.84% to 1.05%.
In summary, we identified additional glycerides and free fatty acids in coicis semen oils. Our results could supplement existing research on the components of this material. Furthermore, nine differential components were found to be potential quality markers for differentiating coicis semen of different origins. Finally, the glyceryl trioleate content was determined to evaluate quality differences among the samples. This approach might be useful for assessing the quality of TCM.

DATA AVAILABILITY STATEMENT
All datasets presented in this study are included in the article/Supplementary Material.