Integrated Transcriptomics and Widely Targeted Metabolomics Analyses Provide Insights Into Flavonoid Biosynthesis in the Rhizomes of Golden Buckwheat (Fagopyrum cymosum)

Golden buckwheat (Fagopyrum cymosum) is used in Traditional Chinese Medicine. It has received attention because of the high value of its various medicinal and nutritional metabolites, especially flavonoids (catechin and epicatechin). However, the metabolites and their encoding genes in golden buckwheat have not yet been identified in the global landscape. This study performed transcriptomics and widely targeted metabolomics analyses for the first time on rhizomes of golden buckwheat. As a result, 10,191 differentially expressed genes (DEGs) and 297 differentially regulated metabolites (DRMs) were identified, among which the flavonoid biosynthesis pathway was enriched in both transcriptome and metabolome. The integration analyses of the transcriptome and the metabolome revealed a network related to catechin, in which four metabolites and 14 genes interacted with each other. Subsequently, an SG5 R2R3-MYB transcription factor, named FcMYB1, was identified as a transcriptional activator in catechin biosynthesis, as it was positively correlated to eight flavonoid biosynthesis genes in their expression patterns and was directly bound to the promoters of FcLAR2 and FcF3'H1 by yeast one hybrid analysis. Finally, a flavonoid biosynthesis pathway was proposed in the rhizomes of golden buckwheat, including 13 metabolites, 11 genes encoding 9 enzymes, and 1 MYB transcription factor. The expression of 12 DEGs were validated by qRT-PCR, resulting in a good agreement with the Pearson R ranging from 0.83 to 1. The study provided a comprehensive flavonoid biosynthesis and regulatory network of golden buckwheat.


INTRODUCTION
Common buckwheat (Fagopyrum esculentum Moench), Tartary buckwheat (Fagopyrum tataricum (L.) Gaertn), and golden buckwheat (Fagopyrum cymosum, also called Fagopyrum dibotrys) are three major species of the genus Fagopyrum Mill in the Polygonum family (Ohnishi and Matsuoka, 1996;Chen et al., 2018). Among them, common buckwheat and Tartary buckwheat are utilized as food or feed and mostly found as diploid, whereas golden buckwheat is utilized as medicine or tea and found as diploid or tetraploid (Mazza and Oomah, 2003;Gu et al., 2014;Chen et al., 2018). Golden buckwheat is one of the National Key Protected Wild Plants (Class II) in China, and its rhizome is a traditional Chinese medicine included in the Chinese Pharmacopoeia (2020) with high medicinal value (Chiang et al., 2000). Golden buckwheat is also known as a health food because it has high nutritional value and healthcare function. Therefore, it is utilized to make healthy products such as fermented tea and root slices (Yang et al., 2007;Gu et al., 2014). The rhizome, as the medicinal tissue of golden buckwheat, is rich in various medicinal and hygienical components such as flavonoids (catechin and epicatechin), antioxidant phenolic compounds, γaminobutyric acid, and terpenoids (Shao et al., 2005;Wang et al., 2005;Zhang et al., 2016). Among these, flavonoids are some of the most functionally valuable components beneficial to health that are good for gastrointestinal dysfunction; enhance vascular toughness; reduce blood sugar, blood fat, and inflammation; promote tumor inhibition; and improve immunity (Chan, 2003;Li et al., 2021).
Proanthocyanidins are plant polyphenols formed by the condensation of flavane-3-ol monomers such as catechin and epicatechin . They are the last components in flavonoid biosynthesis and essential bioactive components in golden buckwheat. The content of epicatechin is a medicinal indicator for evaluating the quality of the rhizomes of golden buckwheat in the Chinese Pharmacopoeia (2020) (Chiang et al., 2000). Therefore, more and more research has focused on flavonoids in golden buckwheat in recent years. Li et al. (2021) purified proanthocyanidins and analyzed their detailed structure in golden buckwheat rhizome. The contents, antioxidant activity, and antidiabetic activity in golden buckwheat rhizome were higher than those in six other Polygonaceae plants . Chen et al. obtained a golden buckwheat mutant that had higher epicatechin content and elucidated the molecular mechanism underlying proanthocyanidin accumulation by transcriptomes in which unigenes in radiation-mediated flavonoid biosynthesis were identified (Jia and Li, 2008;Chen et al., 2012;Chen and Li, 2016). Besides, several genes in the flavonoid biosynthesis of golden buckwheat were cloned and analyzed, including FdPAL, FdFLS, FdCHI, FdCHS, FdDFR, FdANS, FdLAR, and FdMYB (Liu et al., 2009;Ma et al., 2009Ma et al., , 2012Meng et al., 2010;Li et al., 2011;Jiang et al., 2013;Luo et al., 2013;Pu et al., 2014).
Previously, we collected 211 golden buckwheat accessions from different provinces in China. They were divided into three categories based on total flavonoid content (Wang et al., 2019). The current study aimed to make up for the deficiency of flavonoid biosynthesis pathways and regulatory networks in golden buckwheat. We performed transcriptome and widely targeted metabolome analyses using four golden buckwheat accessions that differed in total flavonoid content. Massive differentially expressed genes (DEGs) and differentially regulated metabolites (DRMs) were identified. Integration analyses of the transcriptome and metabolome profiles were carried out, and a flavonoid biosynthesis pathway in the rhizomes of golden buckwheat was proposed. A yeast one-hybrid (Y1H) analysis was conducted to validate the binding of an MYB transcription factor in promoters of flavonoid biosynthesis genes. A quantitative reverse transcription PCR (qRT-PCR) analysis of 13 DEGs was performed to verify the transcriptome results. This study provided a comprehensive flavonoid biosynthesis and regulatory network of golden buckwheat.

Plant Growth and Sampling
Four golden buckwheat collections were used as plant materials in this study, namely, R1, R2, R3, and R4. Among these, R1 originated from Shangri-la, Yunnan province (longitude 100 • 05 ′ , latitude 27 • 19 ′ , and altitude 3,452 m); R2 originated from Kunming City, Yunnan province (longitude 102 • 82 ′ , latitude 24 • 89 ′ , and altitude 1,916 m); R3 originated from Kaiyang county, Guizhou province (longitude 106 • 96 ′ , latitude 27 • 06 ′ , and altitude 1,228 m); R4 was kept by our laboratory. They were planted by cutting in the experimental field in March 2017, and with normal field management during the growth periods. For transcriptome and widely targeted metabolome analyses, the golden yellow rhizomes were sampled and washed by ddH 2 O. The fibrous root was removed, the remaining rhizomes were cut into portions, and quickly subjected to liquid nitrogen for subsequent experiments. Three biological replicates were included for each golden buckwheat collection.

Transcriptome Analyses
Total RNA was isolated using a plant RNA purification kit [TianGen Biotech (Beijing) Co., Ltd., China], followed by treatment with Dnase I for 20 min to digest genomic DNA. For each sample, 1-µg RNA was used for library construction, using NEBNextUltraTM RNA Library Prep Kit for Illumina (NEB, United States) according to the manufacturer's instruction. The library's quality was evaluated with the Agilent Bioanalyzer 2100 system. High-throughput transcriptome sequencing was performed on NOVASEQ 6000 System (Illumina Inc., United States) with a read length of 150 bp and a paired-end method.
Raw reads were filtered by in-house Perl scripts to obtain clean data. Then, the clean reads were mapped to the Tartary buckwheat genome data using Hisat2 (Kim et al., 2015;Zhang et al., 2017). Only reads with a perfect match or one mismatch were further analyzed and annotated based on the reference genome. Gene function annotation was performed by a local BLASTP against NR, Pfam, Swiss-Prot, KO, and GO database (1e-5). Afterward, gene expression was normalized to fragments per kilobase of transcript per million fragments mapped (FPKM) as the following: FPKM = cDNA Fragments Mapped Fragments (Millions) * Transcript Length (kb) . DEGs were identified with the DESeq2 package in the R language (https://www.R-project.org/), with an adjusted P-value (padj) <0.05 and absolute value of log 2 ratio ≥ 1 as the thresholds (Love et al., 2014). Principal component analysis (PCA), Spearman's correlation coefficient, GO enrichment analyses, and hierarchical cluster were performed and visualized by the libraries in the R language, namely scatterplot3d, ggplot2, clusterProfiler, pheatmap, reshape2, and factoextra. The KEGG pathway was enriched with KOBAS 3.0 (http://kobas.cbi.pku.edu.cn/kobas3/? t=1).

Widely Targeted Metabolome Analyses
Metabolite analysis was performed by Metware Biotechnology Co., Ltd. (Wuhan, China) as previously described Zhao et al., 2019). The freeze-dried rhizome was crushed using a mixer mill (MM 400; Retsch) with a zirconia bead for 1.5 min at 30 Hz. Then, 100-mg powder for each sample was weighed and treated with 0.6 ml 70% aqueous methanol overnight at 4 • C. Following centrifugation at 10, 000 g for 10 min, extracts were filtrated with a 0.22-µm pore size membrane and stored in chromatographic sample bottles.
Then, the sample extracts were analyzed using a UPLC-ESI-MS/MS system. The UPLC (Shim-pack UFLC SHIMADZU CBM30A system) analytical conditions were set as follows: (1) Column, Waters ACQUITY UPLC HSS T3 C18 (1.8 µm, 2.1 mm * 100 mm); (2) mobile phase, solvent A was pure water with 0.04% acetic acid and solvent B was acetonitrile with 0.04% acetic acid; (3) gradient elution: sample measurements were performed with a gradient program that employed the starting conditions of 95% A and 5% B. Within 10 min, a linear gradient was programmed, and a composition of 5% A and 95% B was kept for 1 min. Subsequently, a composition of 95% A and 5% B was adjusted within 0.1 min and kept for 2.9 min; (4) injection volume, 4 µl; column oven, 40 • C; flow rate, 0.35 ml/min. The effluent was alternatively connected to a triple quadrupole-linear ion trap (QTRAP)-MS (Applied Biosystems 4500 QTRAP) for LIT and QQQ scan, equipped with an ESI Turbo Ion-Spray interface operating in positive and negative ion modes and controlled with the Analyst 1.6.3 software (AB Sciex). ESI source operation parameters were as follows: ion source, turbo spray; source temperature 550 • C; ion spray voltage 5,500 V (positive ion mode)/−4,500 V (negative ion mode); ion source gas I, gas II, curtain gas were set at 50, 60, and 30 psi, respectively; the collision gas was high. Instrument tuning and mass calibration were performed with 10 and 100 µmol/L polypropylene glycol solutions in QQQ and LIT modes, respectively. QQQ scans were acquired as MRM experiments with collision gas (nitrogen) set to 5 psi. Declustering potential (DP) and collision energy (CE) for individual MRM transitions were conducted with further DP and CE optimization. A specific set of MRM transitions were monitored for each period according to metabolites eluted within this period Zhou et al., 2019). Mass spectrometric data were processed with Analyst 1.6.3. Then, the quality and quantity of metabolites were processed with MultiaQuant based on a local metabolite database constructed by Metware Biotechnology. Quality control was performed by evaluating the repeatability of three technical replicates prepared from a mixed extract of all the samples. Then, the obtained data were log transformed (log 2 ) and subjected to the MetaboAnalystR package in the R language for Pearson correlation analysis, PCA analysis, and OPLS-DA analysis (Chong and Xia, 2018). DRMs were identified by variable importance in the projection value (VIP) ≥ 1 and absolute log 2 ratio ≥ 1. A permutation test with 200 permutations was performed to avoid overfitting.

Integration of the Transcriptome and Metabolome Profiles
The transcriptome and metabolome data were log transformed (log 2 ) for integration analyses. A Pearson correlation coefficient (PCC) analysis was performed in the R language, and significantly correlated genes and metabolites were identified by PCC > 0.8 and a P-value of PCC (PCCP) < 0.05 (Jozefczuk et al., 2010). The network of genes and metabolites with PCC > 0.8 was visualized with the software Cytoscape (Shannon et al., 2003). A canonical correlation analysis (CCA) was performed for genes and metabolites in the correlation network using the CCA package in the R language (González et al., 2008).

Y1H Analysis
The obtaining of promoter sequences (1,500 bp) of candidate genes and scanning of DNA-binding specificities of R2R3-MYB proteins (Kelemen et al., 2015) in the promoters were performed with in-house Python programs. Y1H was performed as previously described . Briefly, the promoters of FcDFR1 (FtPinG0002371500.01), FcF3H1 (FtPinG0006662600.01), FcF3 ′ H1 (FtPinG0002353900.01), FcLAR2 (FtPinG0002428800.01), and FcCHS2 (FtPinG0002106500.01) were cloned and linked into pHIS2 with the EcoRI and MluI restriction sites. The coding sequence of FcMYB1 (FtPinG0007213700.01) was cloned and linked to pGADT7 with the EcoRI and BamHI restriction sites. Yeast Y187 Frontiers in Plant Science | www.frontiersin.org was co-transformed with pGADT7-MYB and five constructs containing corresponding promoters. The co-transformed yeast was selected in an SD/-Leu/-Trp medium to test whether the cotransformations were successful. Then, an SD/-His/-Leu/-Trp medium was used to test the expression of the HIS3 reporter; 90, 110, and 150 mM 3-AT were used to inhibit the auto-activation of pHIS2 constructs with five promoters. The primer sequences described above are listed in Supplementary Table 1.

qRT-PCR Analysis
The transcriptome results were verified by qRT-PCR. A total of 13 DEGs related to flavonoid biosynthesis were selected, FcLAR1 (FtPinG0000053800.01), FcLAR2, and FcMYB1. Actin was used as the inner reference gene.
Primer3Plus (http://www.primer3plus.com/cgi-bin/ dev/primer3plus.cgi) was used to pick gene-specific primers (Supplementary Table 1). qRT-PCR was performed on an ABI 7500 Fast Real-Time PCR system (Applied Biosystems, United States) using an SYBR R Premix Ex Taq TM II kit (Takara Biomedical Technology (Beijing) Co., Ltd., China). The procedure was performed in accordance with the manufacturers ′ instructions, with three technical replicates. qRT-PCR results were calculated using the 2 − Ct method.

Transcriptome Analysis of Golden Buckwheat Rhizomes
High-throughput RNA-Seq was performed on 12 RNA samples from the rhizomes of four golden buckwheat collections (R1, R2, R3, and R4), with three biological replicates for each collection. As a result, we obtained 41,034,456-52,182,636 clean reads and 6,111,106,094-7,791,174,252 clean bases for each library. GC content ranged from 46.17-49.63%. Among these, 63.6-72.36% of clean reads were uniquely mapped to the predicted coding sequences of the genome data of Tartary buckwheat. Meanwhile, 34.3-36.98% of clean reads were mapped to the "+" strand of the coding sequences, whereas 34.04-37.01% were mapped to the "-" strand of the coding sequences (Supplementary Table 2). After mapping, 28,492 genes were identified in all the 12 libraries. Besides, new genes that were not included in the reference genome data were also identified, resulting in 1,665 new genes.
In addition, all of the 12 samples were clustered, and the repeatability of biological replicates was evaluated. As shown in Supplementary Figure 1, samples of the same biological replicates were clustered closer, with correlations ranging from 0.8503 to 0.9546, whereas samples out of the biological replicates were not closely related, with correlations ranging from 0.6777 to 0.8799, indicating our data had high reliability and reproducibility.
A total of 10,191 DEGs were identified in all the samples, among which 1, 903, 3,401, 4,054, 4,095, 5,809, and 5,903 were identified in R4_vs._R1, R4_vs._R2, R3_vs._R1, R3_vs._R4, R3_vs._R2, and R1_vs._R2, respectively (Supplementary Table 3 and Figure 1A). Of these, 155 DEGs were common in five comparisons except for R1_vs._R2; 80 DEGs were common in five comparisons except for R3_vs._R2; 61 DEGs were common in five comparisons except for R3_vs._R1, and 52 DEGs were common in five comparisons except for R3_vs_R4 ( Figure 1A). Besides, a hierarchical cluster analysis based on gene expression pattern was performed and six clusters, namely, C1 to C6, were obtained (Figures 1B,C). Among these, C1 included 626 genes that were upregulated in R3; C2 included 1,492 genes that were upregulated in R1; C3 included 2,743 genes that showed higher expression levels in R1, R2, and R3 but lower expression levels in R4; C4 included 2,020 genes that showed higher expression level in R1, R2, and R4 but lower expression level in R3; C5 included 757 genes that showed higher expression level in R2, R3, and R4 but lower expression level in R1; and C6 included 2,554 genes that showed higher expression level in R3 and R4 but lower expression level in R1 and R2.
A GO analysis was performed, and DEGs in the six comparisons were classified into three component categories. The top 30 GO terms are presented in Supplementary Figure 2. Interestingly, some biological processes related to the pathway of flavonoid biosynthesis and metabolism, such as "flavonoid biosynthetic process, " "flavonol biosynthetic process, " "flavone biosynthetic process, " and "flavonoid glucuronidation, " were enriched in at least two comparisons. At the same time, we analyzed the metabolic pathways based on KEGG annotation. Among the top 20 most enriched pathways in the six comparisons (Supplementary Figure 3), "flavonoid biosynthesis, " "flavone and flavonol biosynthesis, " and "phenylpropanoid biosynthesis" also took place. These implied that flavonoids might show some difference among four golden buckwheat rhizomes.

Metabolomics Analyses of Golden Buckwheat Rhizomes
Widely targeted metabolomics analyses were subsequently performed to identify the metabolites existing in golden buckwheat rhizomes, resulting in 439 metabolites being identified (Supplementary Table 4). These included 90 flavonoids, 64 amino acids and derivatives, 59 phenolic acids, 57 lipids, 52 others, 30 organic acids, 30 nucleotides and derivatives, 26 alkaloids, 15 tannins, 8 lignans and coumarins, and 8 quinones (Figure 2A). The PCA analysis showed that samples of the same biological replicates were less variable than those of the biological replicates, suggesting that the metabolomics data had high reliability and reproducibility (Supplementary Figure 4)  means cluster analysis (Supplementary Figure 5). Metabolites in different subclasses showed different regulation patterns: subclass 1 included 58 metabolites that were upregulated in R4, subclass 2 included 37 metabolites that were upregulated in R1, subclass 3 included 23 metabolites that were upregulated in R2 and R3, subclass 4 included 26 metabolites that were downregulated in R3 but were upregulated in R4, subclass 5 included 21 metabolites that were upregulated in R2, R3, and R4, subclass 6 included 32 metabolites that were upregulated in R3, subclass 7 included 26 metabolites that were upregulated in R3 but were downregulated in R4, subclass 8 included 52 metabolites that were upregulated in R2. Subclass 8 also included 52 metabolites that were downregulated in R2. In addition, KEGG metabolic pathways based on the DRMs were analyzed. Similar to the transcriptomics result, "flavonoid biosynthesis, " "flavone and flavonol biosynthesis, " and "phenylpropanoid biosynthesis" were also listed in the top 20 significantly enriched pathways in all of the six comparisons (Supplementary Figure 6). These indicated that flavonoids might be the most essential compounds in golden buckwheat rhizomes.
It is well-known that some kind of flavonoids, such as epicatechin, is the most essential index for evaluating whether golden buckwheat rhizomes are up to the standard of Chinese medicinal material (Chiang et al., 2000). In all kinds of identified metabolites, flavonoids accounted for the most abundant one, so we subsequently analyzed the type of flavonoids. Among the 90 flavonoids, 29 flavonoids, 25 flavonols, 15 flavanols, 7 dihydroflavones, 5 flavonoid carbonosides, 3 isoflavones, 2 anthocyanins, 1 flavonoid, 1 chalcone, 1 dihydroflavonol, and 1 other flavonoid were included ( Figure 2C).
Previous studies have exhaustively characterized the DNAbinding specificities of R2R3-MYB proteins in relation to flavonoid biosynthesis. Nine DNA-binding specificities could bind to most R2R3-MYB proteins related to flavonoid biosynthesis in plants, namely, TACTGTTG, TGCGGTTG, AAAAGTTA, GTCAGTTA, ACAAGTTA, ATTAGTTG, GCTTGTTG, TATGGTTA, and TAGTTA (Kelemen et al., 2015). Therefore, we obtained and scanned the 1,500 bp promoter sequences from the translation initiation site of the eight DEGs in flavonoid biosynthesis that were highly correlated to FcMYB1 to determine whether these DNA-binding specificities were included in them. Notably, FcLAR2 had four R2R3-MYB DNA-binding specificities on its promoter; FcDFR1 had two R2R3-MYB DNA-binding specificities on its promoter; FcCHS2, FcF3 ′ H1, and FcF3H1 had one R2R3-MYB DNA-binding specificity on each of their promoters (Figure 4B). These suggested that FcMYB1 might directly bind to the promoter of these five genes to regulate flavonoid biosynthesis.
Yeast one-hybrid technology was performed to determine the interactions of the FcMYB1 protein and the promoter of five flavonoid biosynthesis genes, namely, FcCHS2, FcLAR2, FcDFR1, FcF3 ′ H1, and FcF3H1. The constructs of pHIS2-promoter and pGADT7-MYB were co-transformed into yeast strain Y187. The results showed that MYB could bind to the promoter of FcLAR2 and FcF3 ′ H1 ( Figure 4C) but could not bind to the promoter of the other three genes. This suggested that MYB had a DNAbinding activity and could directly regulate the expression of FcLAR2 and FcF3 ′ H1.

Validation of Gene Expression by qRT-PCR Analysis
The expression of 13 DEGs was validated by qRT-PCR, in which 12 flavonoid biosynthesis DEGs and FcMYB1 that was mentioned in Figure 4 were included. As shown in Figure 5, the expressions of 12 out of the 13 DEGs were well correlated by qRT-PCR, with a Pearson R ranging from 0.83 to 1. The Pearson R of FcMYB1 was 0.93. The Pearson R of FcLAR2 and FcF3 ′ H1, whose promoter could be bound by the FcMYB1protein, was 0.92 and 1, respectively. One CHS gene, namely, FcCHS1, showed a lower correlation between the transcriptome and qRT-PCR (Pearson R = 0.46), so it was eliminated in the subsequent analysis.

Flavonoid Biosynthesis Pathway in the Rhizomes of Golden Buckwheat
Based on our analyses, a putative flavonoid biosynthesis pathway in the rhizomes of golden buckwheat was proposed (Figure 6).  Orange rounded boxes represent enzymes that catalyze the reactions in the flavonoid biosynthesis pathway. Purple circles represent the FcMYB1 protein that could bind to the promoters of candidate genes. Gene names of 11 enzymes are listed as follows : FcCHS2, FcCHS3, FcCHI1, FcF3H1, FcF3 FcDFR1,FcFLS1,FcLAR1,and FcLAR2. This included 13 metabolites, 11 genes that were annotated to encode eight enzymes, and one MYB transcription factor.
The expression patterns of genes in this pathway are shown in Figure 5 and validated by qRT-PCR.

First Integrative Transcriptomics and Metabolomics Analyses of Golden Buckwheat
Among the three buckwheat species, Tartary buckwheat is the most thoroughly studied species in molecular biology. This is largely due to its high content of flavonoids, which are known as antioxidants with anti-inflammatory, anticancer, and vascularprotective properties (Fabjan et al., 2003). Since its genome sequencing was finished years ago, flavonoid biosynthesis and its regulatory mechanisms have been extensively reported Li et al., 2019c;Yao et al., 2020;Ding et al., 2021). Meanwhile, types and amounts of Tartary buckwheat flavonoids were identified and measured (Fabjan et al., 2003;Ma et al., 2019;Li et al., 2019a,b;Yang et al., 2020;Hou et al., 2021). By contrast, only a few studies related to the above areas were reported on golden buckwheat, as its genome is complex and not sequenced yet, which limited its molecular biology development. To date, only one transcriptome analysis of underlying proanthocyanidin accumulation was performed by de novo assembly, and eight coding sequences in flavonoid biosynthesis were cloned in golden buckwheat Ma et al., 2009Ma et al., , 2012Meng et al., 2010;Li et al., 2011;Jiang et al., 2013;Luo et al., 2013;Pu et al., 2014;Chen and Li, 2016). In the current transcriptome, we tried to map the clean reads to Tartary buckwheat genome data, in view of F. tataricum being very close to F. cymosum (Ohnishi and Matsuoka, 1996), which showed the expected results, with a unique mapping percentage ranging from 63.6-72.36%. A total of 28,492 genes based on the genome data of Tartary buckwheat and 1,665 new genes not included in the reference genome data were identified in the rhizomes of golden buckwheat, among which 10,191 genes were DEGs. This would enrich the gene sequence resources of golden buckwheat.
According to the metabolome analysis, 439 metabolites were identified, including 90 flavonoids. The number of flavonoids in golden buckwheat was much less than that in Tartary buckwheat (Ma et al., 2019;Zhao et al., 2019;Li et al., 2019a,b). A recent study determined nine flavonoids from rhizomes of golden buckwheat, namely, catechin, epicatechin, epicatechin gallate, gallocatechin, epigallocatechin, catechin gallate, epicatechin gallate, afzelechin, and epiafzelechin, of which catechin and epicatechin together constituted about 90% . Compared with our study, eight of the nine flavonoids were included in our metabolome except for afzelechin. All the other 82 flavonoids were identified in golden buckwheat for the first time. Catechin, gallic acid, hesperidin, nepetin, and Lepicatechin were the top five flavonoids identified in this study, which together constituted about 57% of the total flavonoids. This difference in the percentage of catechin and epicatechin is largely due to the difference in methods. Our metabolome was a large-scale method that included more than 1,000 flavonoids in the constructed flavonoid metabolite database . The most abundant flavonoid in Tartary buckwheat was rutin, which belongs to the flavonol subclass and accounts for 50-80% of total flavonoids (Fabjan et al., 2003;Li et al., 2019a;Ma et al., 2019). In our golden buckwheat metabolome data, rutin was 23rd.
To our knowledge, this is the first metabolome study, and the first to integrate metabolome and transcriptome data in golden buckwheat. Based on our analyses, we enriched the expressed genes and the existing metabolites and proposed a flavonoid biosynthesis pathway in golden buckwheat rhizomes.

Integration Analysis of the Transcriptome and Metabolome Revealed a Catechin Integrative Network in the Rhizomes of Golden Buckwheat
A catechin integrative network was identified based on the transcriptome and metabolome integration analyses, in which 14 genes and four metabolites were included. The Pearson correlation of four metabolites indicated that catechin (mws0054) was negatively correlated to myricetin (mws0032) and chlorogenic acids (mws0178 and pmp000544), which implied two competitive synthesis reactions during catechin biosynthesis. The first competitive synthesis reaction occurred during the biosynthesis of chlorogenic acids and flavonoids where they competed for the same substrate, p-coumaroyl-CoA. The formation of p-coumaroyl-CoA by 4CL was a common step for both chlorogenic acid biosynthesis and flavonoid biosynthesis (Kim et al., 2013;Saito et al., 2013;Tohge et al., 2017). After this step, HQT catalyzed p-coumaroyl-CoA to p-coumaroyl quinic acid or hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase catalyzed pcoumaroyl-CoA to p-coumaroyl shikimic, which started the biosynthesis pathway of chlorogenic acids (Kim et al., 2013). Meanwhile, CHS catalyzed p-coumaroyl-CoA to naringenin chalcone, which started the biosynthesis pathway of flavonoids (Saito et al., 2013;Tohge et al., 2017). Catechin was the most abundant flavonoid, so it was negatively correlated to chlorogenic acids. The second competitive synthesis reaction occurred during the biosynthesis of catechin and myricetin. Although catechin is a kind of flavanol and myricetin is a flavonol, naringenin and dihydrokaempferol are their common upstream substrates. For catechin biosynthesis, naringenin and dihydrokaempferol were first catalyzed by F3 ′ H to form dihydroquercetin; then dihydroquercetin was catalyzed by DFR to form leucocyanidin; finally, leucocyanidin was catalyzed by LAR to form catechin. For myricetin biosynthesis, naringenin and dihydrokaempferol were first catalyzed by F3 ′ 5 ′ H to form dihydrotricetin; then, dihydrotricetin was catalyzed by FLS to form myricetin (Saito et al., 2013;Tohge et al., 2017). They competed with the same substrates during their biosynthesis, so they were negatively correlated to each other.
It is worth mentioning that the enzymes in the network were proposed based on the gene expression in transcriptome data. Although there are some reports that construct the flavonoid biosynthesis pathway based on transcriptome data (Chen and Li, 2016;Chen et al., 2020;Zhang et al., 2022), we do not have experimental evidence for the enzymatic activity of gene products. Therefore, enzyme activity should be determined for further analyses.

MYB Transcription Factor Positively Regulated Catechin Biosynthesis
In plants, R2R3-MYB transcription factors are involved in the regulation of flavonoid biosynthesis by binding to DNAbinding specificities on promoters of flavonoid biosynthesis genes (Kelemen et al., 2015;Xu et al., 2015). In Tartary buckwheat, massive genes have been identified and, thus, form a comprehensive flavonoid biosynthesis pathway, in which some R2R3-MYB transcription factors are included (Zhang et al., , 2018Li et al., 2019c;Yao et al., 2020;Ding et al., 2021). After the release of the complete Tartary buckwheat genome, genes involved in rutin biosynthesis and regulation pathway were comprehensively identified . Afterward, several FtMYBs were found to directly regulate flavonoid biosynthesis, among which FtMYB11, FtMYB13, FtMYB14, FtMYB15, and FtMYB16 could directly bind to the MBS site on the promoter of FtPAL and repress rutin biosynthesis, FtMYB116 could bind directly to the promoter region of FtF3 ′ H and increase the content of rutin and quercetin, and FtMYB6 promoted the activity of FtF3H and FtFLS1 promoters and significantly increased the accumulation of flavonols (Zhang et al., 2018Li et al., 2019c;Yao et al., 2020;Ding et al., 2021). However, there has been no reported regulator involved in the flavonoid biosynthesis of golden buckwheat so far. We identified a novel MYB transcription factor, FcMYB1, whose expression patterns were positively correlated to eight out of 12 flavonoid biosynthesis genes significantly. The Y1H analysis showed that the FcMYB1 protein could directly bind to the promoters of FcLAR2 and FcF3 ′ H1. F3 ′ H is an essential enzyme that promotes the formation of flavanols in the competitive biosynthesis of flavonols and flavanols, and LAR is the key enzyme catalyzing leucocyanidin to form catechin (Saito et al., 2013;Tohge et al., 2017). This suggested that FcMYB1 might be a transcriptional activator in catechin biosynthesis by directly binding to the promoters of FcLAR2 and FcF3 ′ H1. This is the first identified regulator in the flavonoid biosynthesis of golden buckwheat. It is a very good beginning for further identification of flavonoid regulatory factors including R2R3-MYB, bHLH, WD40 proteins, and the interaction of MYB-bHLH-WD40 complex in golden buckwheat.

CONCLUSION
In conclusion, for the first time, this study performed transcriptome and widely targeted metabolome analyses using four golden buckwheat accessions that differed in total flavonoid content. A total of 10,191 DEGs and 297 DRMs were identified, among which the flavonoid biosynthesis pathway was enriched in both transcriptome and metabolome. Integration analyses of the transcriptome and metabolome profiles revealed a network related to catechin in which 4 metabolites and 14 genes interacted with each other. An SG5 R2R3-MYB transcription factor, FcMYB1, was found as a transcriptional activator in catechin biosynthesis for the first time, as it was positively correlated to eight flavonoid biosynthesis genes in their expression patterns and could directly bind to the promoters of FcLAR2 and FcF3 ′ H1. Finally, a flavonoid biosynthesis pathway in the rhizomes of golden buckwheat was proposed, including 13 metabolites, 11 genes encoding 9 enzymes, and 1 MYB transcription factor. The result provided a comprehensive flavonoid biosynthesis and regulatory network of golden buckwheat.

DATA AVAILABILITY STATEMENT
The raw sequence data are deposited in the Genome Sequence Archive in the National Genomics Data Center, Chinese Academy of Sciences, under accession number CRA003372 which is publicly accessible at https://ngdc.cncb.ac.cn/gsa.

AUTHOR CONTRIBUTIONS
JH and QC designed and coordinated the study. LW, BT, RR, TS, LZ, and CL planted the materials, took the samples, and carried out the experiments. JH, JD, and LW analyzed the transcriptomics and metabolomics data, made the figures and wrote the draft of the manuscript. All authors approved the final version of the manuscript.