Co-expression Gene Network Analysis and Functional Module Identification in Bamboo Growth and Development

Bamboo is one of the fastest-growing non-timber forest plants. Moso bamboo (Phyllostachys edulis) is the most economically valuable bamboo in Asia, especially in China. With the release of the whole-genome sequence of moso bamboo, there are increasing demands for refined annotation of bamboo genes. Recently, large amounts of bamboo transcriptome data have become available, including data on the multiple growth stages of tissues. It is now feasible for us to construct co-expression networks to improve bamboo gene annotation and reveal the relationships between gene expression and growth traits. We integrated the genome sequence of moso bamboo and 78 transcriptome data sets to build genome-wide global and conditional co-expression networks. We overlaid the gene expression results onto the network with multiple dimensions (different development stages). Through combining the co-expression network, module classification and function enrichment tools, we identified 1,896 functional modules related to bamboo development, which covered functions such as photosynthesis, hormone biosynthesis, signal transduction, and secondary cell wall biosynthesis. Furthermore, an online database (http://bioinformatics.cau.edu.cn/bamboo) was built for searching the moso bamboo co-expression network and module enrichment analysis. Our database also includes cis-element analysis, gene set enrichment analysis, and other tools. In summary, we integrated public and in-house bamboo transcriptome data sets and carried out co-expression network analysis and functional module identification. Through data mining, we have yielded some novel insights into the regulation of growth and development. Our established online database might be convenient for the bamboo research community to identify functional genes or modules with important traits.


INTRODUCTION
Bamboo, an important fast-growing non-timber forest plant worldwide, has been an essential forest resource with an annual trade value of >2.5 billion US dollars, and approximately 2.5 billion people depend on it economically (Peng et al., 2013a,b;Zhao et al., 2017). Moso bamboo (Phyllostachys edulis, once known as Phyllostachys heterocycla) is the most economically valuable bamboo in Asia, especially in China. With the release of the whole-genome sequence of moso bamboo, there are increasing demands for refined annotation of bamboo genes on the whole-genome level. Considering the small proportion of annotated genes in the bamboo genome and high accumulation of data, it is necessary and urgent to conduct big-data mining to yield novel insights into bamboo growth and development.
Generally, genes with coordinated expression across a variety of experimental conditions indicate the presence of functional linkages between genes. Thus, co-expression gene networks can associate these genes of unknown function with biological processes in an intuitive way. An increasing number of studies have supported the versatility of co-expression analysis for inferring and annotating gene functions (D'Haeseleer et al., 2000;Aoki et al., 2007;Usadel et al., 2009;Morenorisueno et al., 2010;Li et al., 2015;Serin et al., 2016). Through data mining tools and algorithms that describe complex co-expression patterns of multiple genes in a pairwise fashion, global co-expression network analyses consider all samples (multiple data sources with independence) together and establish connections between genes based on the collective information available (Bassel et al., 2011). Compared with such a network, the conditional coexpression network aims to enhance our understanding of gene function from a portion of transcriptome data sets that have much in common, such as having the same source and a similar acquisition of raw materials and inferring gene transcriptional regulatory mechanisms in developmental processes based on a series of selected associated samples. In co-expression analysis, gene expression views can help clearly present the tendency of differential gene expression between samples. Consequently, co-expression networks with expression views can be used to associate genes of unknown function with biological processes, to discern gene transcriptional regulatory mechanisms in vivo and to prioritize candidate regulatory genes or modules of vital traits.
Based on the de novo sequencing data, together with the fulllength complementary DNA and RNA-seq data of moso bamboo, BambooGDB has become the first genome database with comprehensively functional annotation for bamboo . It is also an analytical platform composed of comparative genomic analysis, protein-protein interaction networks, pathway analysis and visualization of genomic data. However, it has only 12 RNA-seq data sets in different tissues of moso bamboo, which falls far short of existing RNA-seq data sets and does not meet the needs of researchers. Moreover, there are no analyses of coexpression networks, functional modules, cis-elements and gene set enrichment in BambooGDB. ATTED-II (Aoki et al., 2016), a co-expression database for plant species, provides a view of multiple co-expression data sets for nine species (Arabidopsis, field mustard, soybean, barrel medic, poplar, tomato, grape, rice and maize). Only two of them are members of the grass family (Poaceae), like bamboo. It is exceedingly necessary to present co-expression networks for bamboo.
Recently, large amounts of transcriptome data have become available on bamboo for the establishment of co-expression gene networks associated with plant growth and development. We collected 52 high-quality genome-wide transcriptome data sets on moso bamboo covering six tissues from the NCBI SRA database (He et al., 2013;Peng et al., 2013a;Huang et al., 2016;Wei et al., 2016;Zhao et al., 2016Zhao et al., , 2018. In addition, we have newly produced 26 in-house transcriptome data sets across six tissues of different growth stages from the Genome Atlas of Bamboo and Rattan (GABR). To efficiently extract information from large data sets, we applied in silico methods to build genome-wide global and conditional co-expression networks, and further, to identify functional modules for annotating and predicting bamboo gene functions. Furthermore, we constructed the BambooNET database 1 to integrate the high-throughput transcriptome data, co-expression networks, functional modules, etc. BambooNET also included co-expression network analysis, cis-element analysis and GSEA tools, which might be an online server for refining annotation of bamboo gene functions.

Data Process and Gene Expression Profiling Analysis
The whole-genome sequence of moso bamboo was accessed from the 2013 public version 1 (Peng et al., 2013a) and corresponded to a genome size of ∼2 Gb and 31,987 protein-coding genes. The reads of 78 RNA-seq samples were aligned to the bamboo genome (version 1.0) using TopHat v2.1.1 software (Trapnell et al., 2009). Calculation of FPKM and the identification of differentially expressed genes were performed using Cuffdiff in Cufflinks v2.2.1 software (Trapnell et al., 2010). GO enrichment analysis was performed using the agriGO website (Du et al., 2010).
To determine the minimum threshold of the gene expression value (FPKM) among 78 bamboo samples, the lowest 5% of all gene FPKM values in each RNA-seq sample and the SD of each experimental group were computed. Then, the mathematical formula "threshold = average(5% value) + 3 * SD" (You et al., 2016 was used to calculate the minimum expression value of each experimental group. The minimum threshold of FPKM was 0.1474.

Co-expression Network Construction Algorithm and Parameters
The PCC represents the co-expression relationship between two genes among the 78 samples. The closer the relationships between the genes were, the higher the PCC scores. MR, an algorithm for calculating the rank of PCC, takes a geometric average of the PCC rank from gene A to gene B and from gene B to gene A. Specifically, when gene A is the third highest co-expressed gene for gene B, the PCC rank of gene A to gene B is 3. Thus, MR ensures more credible co-expression gene pairs would be left out, so the PCC and MR were used to construct a co-expression network.
Pearson correlation coefficient: Here, we retained co-expressed gene pairs with a single direction rank of PCC (Rank AB or Rank BA ) less than 3 and MR score less than 30 in a co-expression network (Aoki et al., 2016), and these gene pairs were regarded as having positive coexpression relationships when their PCC values were more than zero and negative co-expression relationships when their values were less than zero.
All samples were used to construct global networks, while ICBR samples were used for conditional networks. Following a similar procedure, 65 data sets without the stress treatment were selected to define tissue-preferentially expressed genes, and 10 data sets associated with dehydration and cold treatment were selected for stress-differentially expressed genes.

Modules Identification Algorithm and Parameters
The CPM (Adamcsek et al., 2006) was used to find modules with nodes more densely connected to each other than to nodes outside the group in the bamboo co-expression networks. Parameter selection was based on the number of modules, the coverage rate of genes and the overlap rate of community. Hence, we selected a k = 6 clique size, which meant each node had co-expression interactions with at least five nodes in a module (Supplementary Figure S5). The functions of modules were predicted by gene set analysis (Yi et al., 2013) through integrating annotations such as GO, gene families (transcription regulators, kinases, and carbohydrate-active enzymes), and KEGG. The TF family and kinase family classifications were collected from iTAK  and PlantTFDB (Jin et al., 2017). A total of 3,305 TFs and 1,598 kinases were identified. Moreover, nonsignificant entries were filtered by the Fisher's exact test and multiple hypothesis testing (FDR ≤ 0.05). In the end, 1,896 modules containing at least 6 genes each were found in bamboo, covering functions such as metabolism, hormones, development, and transcriptional regulation.

Cis-Element Significance Analysis
The cis-element significance test is a statistical algorithm based on Z score and P-value filtering. When scanned in the 3 kb promoter region of bamboo genes, motifs with a P-value less than 0.05 were significantly enriched in a regulatory module (Yu et al., 2014;You et al., 2016).
The Z score was calculated as X is the sum value of a motif in the promoters of a list of genes, µ is the mean value of the same motif in 1000(n) random lists of genes with same scale, and σ is the SD of the 1000-mean value based on random selection.

Ortholog Identification in Arabidopsis
Bidirectional blast alignments were conducted for the analysis of protein sequences in moso bamboo and Arabidopsis. Our criteria for the orthologous search were as follows: the top three hits in each bidirectional blast alignment were selected as the best orthologous pairs; in addition, pairs with an E-value less than 1E-25 were regarded as secondary orthologous pairs. Table 3 lists the results of the orthologous search, including for NST1, SND1 and VND7.

Search and Visualization Platform
The network search function was based on MySQL, Apache and PHP scripts. Cytoscape.js, an open source java script package, can dynamically display the components, construction and variation of the network.

Network Construction and Module Identification
We integrated 78 transcriptome data sets for moso bamboo (Phyllostachys edulis), which can be divided into two parts according to the data source: 52 public data sets from NCBI and 26 in-house data sets from ICBR ( Table 1). The data sets spanned most tissues of bamboo, including leaf, culm (stem), shoot, root, rhizome, bud and panicle as well as stress-treated (dehydration and cold) samples from the public platform. The data sets from ICBR were available for the construction of conditional network, covering different portions from tissue root, shoot, bud, leaf and so forth. Furthermore, each ICBR sample was a mixture from six areas of bamboo production in China. This variety is beneficial for the study of fast growth and development regulation in bamboo. A well-developed integrated strategy was used for network construction (You et al., 2016. First, the raw RNA-seq reads of bamboo samples were mapped to the moso bamboo reference genome by TopHat, and then the FPKM of genes were calculated by Cufflinks. Since the read mapping ratios of 7 RNA-seq samples (culm tissue) were too low (<25%), we ultimately retained 78 RNA-seq samples for global co-expression network construction (details of mapping results shown in Supplementary Table S1). Figure S1), the minimum threshold FPKM value of 0.1474 among 78 bamboo samples was chosen as a cut-off value to identify whether the gene was expressed according to all genes' FPKM values in each sample. The PCC was adopted as the correlation coefficient between two genes and to measure the coexpression relationship. The lowest 5% PCC (−0.4) and highest 5% PCC values (0.6) were considered thresholds for negative and positive correlation, respectively, in the PCC distribution diagram of all gene pairs (Supplementary Figure S2A). The MR method (Aoki et al., 2016) was widely used in many species such as the model plant Arabidopsis. Strict parameters were set to get optimal co-expressed gene pairs in this way. It mainly classified co-expressed gene pairs into three levels: MR top3, MR ≤ 5 and 5 < MR ≤ 30. As a result, the PCC cut-off of 0.6 and the MR top3 + MR cut-off of ≤30 were better for the construction of co-expression network. Finally, there were 302,383 and 185,044 pairs with 31,681 nodes in the positive co-expression network (PCC value > 0) and negative co-expression network (PCC value < 0), respectively. The network was inferred to be scale-free from the distribution of nodes and their linked edges numbers (Supplementary Figure S2B).

Through the FPKM value distribution boxplot of all samples (Supplementary
All transcriptome data sets were used to construct the global networks in this study, while the 26 data sets from ICBR were used for conditional networks as a parallel analysis with the same method (MR) and procedure as global networks. In addition, 65 data sets without stress treatment were selected to define tissue-preferentially expressed genes, and 10 data sets associated with dehydration and cold treatment were selected for stressdifferentially expressed genes. We overlaid the gene expression results onto the co-expression network with multiple dimensions (development and stress).
Furthermore, co-expression networks allow modularized analysis of biological processes to discover regulatory genes or modules of vital traits. The CPM (Adamcsek et al., 2006;Li et al., 2014) together with function enrichment tools was applied to classify possible function modules. As a result, 1,896 functional modules containing at least 6 genes each were identified in bamboo, covering functions such as metabolism, hormones, development, and transcriptional regulation.

Gene Network Analysis of Photosynthesis-Related Genes
Photosynthesis provides energy for the fast growth and development of bamboo. It may possess a unique carbon assimilation mechanism, and it would be interesting to study the light-harvesting process in bamboo (Jiang et al., 2012). Additionally, an efficient light-harvesting step is critical for the success of photosynthesis (Cheng and Fleming, 2009;Zhao et al., 2016). We selected three light-harvesting complex (LHC) genes of photosystem I and photosystem II in bamboo ( Table 2), including PH01003036G0080 (orthologous gene of LHCA1), PH01001378G0550 (orthologous gene of CAB1 or LHCB1.3) and PH01000242G0150 (orthologous gene of CAB2 or LHCB1.1), and searched their global co-expression networks with a tissue-preferential gene expression view in bamboo. Based on the LHCB-related gene expression views among different tissues, three samples for each tissue were used to detect different expression levels, which were quantified by FPKM  value ( Figure 1A). These genes preferentially expressed in leaf compared to other tissues, while they were almost not expressed in root. To validate the robustness and credibility of the networks and further study the possible regulatory mechanisms of LHCA1 and their co-expressed genes in bamboo, we selected the LHCA1 gene (PH01003036G0080) as an example to visualize the global co-expression network ( Figure 1B). Through GO enrichment analysis of all genes from this network by using agriGO (Du et al., 2010;Tian et al., 2017) (Figure 1D), the results showed that these co-expressed genes were strongly associated with the GO terms of photosynthesis and light harvesting, light reaction, and generation of precursor metabolites and energy, which matched the previous findings that the primary function of LHC protein was the absorption of light through chlorophyll excitation and transfer of absorbed energy to photochemical reaction centers (Dolganov et al., 1995;Li et al., 2000;Montané and Kloppstech, 2000;Zhao et al., 2016). A similar result was also obtained in CAB1, CAB2 and their co-expressed genes following the above process ( Figure 1C). Zhao et al. (2016) found that more copies of LHC genes indicated more energy may be required in the fast-growth stage of moso bamboo.
LHCA and LHCB coexist with some other LHC genes in these global co-expression networks ( Figure 1B). From the perspective of only LHCA and LHCB genes' co-expression ( Figure 1E), LHCA and LHCB are intimately linked with each other.
In addition, we also searched the co-expression network of photosynthesis-related genes in the conditional co-expression network. We performed the same procedure for the global network analysis as in the conditional network of LHC genes. The gene expression views in the conditional network (Figure 2A) showed a similar tendency to those in the global network. Meanwhile, we conducted GO enrichment analysis of all genes from the conditional co-expression network for LHCA1 (PH01003036G0080), CAB1 (PH01001378G0550) and CAB2 (PH01000242G0150) by agriGO (Du et al., 2010;Tian et al., 2017). The GO terms were associated with photosynthesis, light reaction and light harvesting ( Figure 2D). In addition to overlaps, the conditional co-expression network had some specific genes that were different from the global network (Figures 2B,C).
Comparative genomics might help to construct and identify functional modules in bamboo. We made a comparison between the top 300 PCC co-expressed genes in bamboo and in Arabidopsis (collected from ATTED-II and AraNet) (Figures 2E,F). The co-expression networks of PH01003036G0080 and AT3G54890 (LHCA1) showed high similarity, suggesting the reliability of our bamboo co-expression network.

Network Analysis of Genes Related to Brassinosteroid Biosynthetic and Signal Transduction Pathways
Phytohormones are indispensable in plant development and various environment adaption (Lacombe and Achard, 2016). BRs are a group of plant steroidal hormones that play vital roles in almost all aspects of plant growth and development . Several key enzymes in BR biosynthesis pathways have been found in Arabidopsis, such as DET2/DWF6, CYP90B1/DWF4, CYP90A1/CPD/DWF3, CYP90C1/ROT3, CYP90D1, and CYP85A2/BR6OX2 (Fujioka et al., 1997;Choe et al., 1998;Yukihisa et al., 2003;Kim et al., 2005;Ohnishi et al., 2012). First, we searched the global network for gene PH01003419G0030 (orthologous gene of CYP90A1/CPD/DWF3), PH01000278G0580 (orthologous gene of CYP85A1/BR6OX1), PH01001995G0390 (orthologous gene of CYP85A2/BR6OX2), and PH01003429G0090 (orthologous gene of CYP90D1) (Figure 3A). Second, we conducted GSEA analysis of GO, gene family, PlantCyc and KEGG categories for all genes from this global network by using PlantGSEA (Yi et al., 2013) (Figure 3C). The GO terms of BR biosynthetic process, BR metabolic process and steroid biosynthetic process were significantly enriched, suggesting that this network corresponds to the BR biosynthetic pathway and the genes from this network may be involved in BR biosynthesis in bamboo. Third, we chose the genes CYP85A1/BR6OX1 and CYP85A2/BR6OX2 in bamboo and their top 300 co-expressed genes and compared them with their orthologous genes and their top 300 from ATTED-II in Arabidopsis ( Figure 3B). There were many orthologous gene pairs between them, which could indicate these co-expressed genes were conserved and increased the credibility of predicting BR biosynthetic functional modules in bamboo. We also searched the global network for PH01000234G0890 (orthologous gene of BAK1, also known as BRI1-associated receptor kinase), PH01000584G0630 (orthologous gene of BIN2, also known as BR-insensitive 2) and their co-expressed genes ( Figure 3D). With the GO enrichment analysis by agriGO on all the genes from this network, some GO terms were enriched such as the BR-mediated signaling pathway, steroid hormon mediated signaling pathway and responses to steroid hormone stimuli (Supplementary Figure S4).

Co-expression Network Analysis of Secondary Cell Wall Biosynthesis
Transcription factors in the NAC family, including VNDs, SNDs, and NSTs, acting as master switches for SCW thickening, play important roles in the SCW formation process, including the deposition of hemicellulose, cellulose and lignin (Table 3). MYB46 directly binds to the promoters and activates the transcription of genes involved in lignin and xylan biosynthesis, functioning as a central and direct regulator of the genes involved in the biosynthesis of all three major secondary wall components in Arabidopsis (Kim et al., 2013(Kim et al., , 2014a. Thus, we selected some key NAC and MYB TFs to study their functions in regulating SCW formation and strong lignified culms in bamboo ( Table 4). The gene expression profiling of SCW-related NAC family genes was statistically analyzed with the Z-score test in ICBR samples. The hierarchical cluster results of these genes demonstrated that NST/SND genes were highly expressed in the bamboo shoot compared to other tissues (Figure 4). We searched the constructed global and conditional networks with a gene expression view for these clustered NAC genes. The networks might indicate the possible regulatory mechanism of the SCW thickening process during bamboo development (Figure 5).
In the global network, SND3 was co-expressed with SND2 and NST1, CESA4, HAT22, PAL1 and an ortholog of AtMYB83 in bamboo (PH01000006G2680). In Arabidopsis, both MYB46 and MYB83 act in the regulation of secondary wall biosynthesis by binding to the promoters of the xylan and lignin biosynthetic genes (Mccarthy et al., 2009;Kim et al., 2014b).
With regard to the global network for PH01000508G0100 (orthologous gene of AtMYB103), the co-expressed genes were  Figure 3A. The results shown in this table list the description, category and FDR value. The red bar on the right represents the FDR value of the enriched GO terms. The darker the red color becomes, the lower the FDR value is. Other colored bars, including blue, yellow, green, and purple bars, represent KEGG pathways, Plantcyc, PO terms and TFT, respectively. (D) A global co-expression network of BR signal transduction genes BAK1 (PH01000234G0890) and BIN2 (PH01000584G0630) in bamboo. A yellow line between two nodes indicates a positive co-expression relationship, and a green line between two nodes indicates a negative co-expression relationship.
mainly HCT, ATCESA7, ATCESA8, CESA4, and some NAC genes SND2, SND3. There were a few specific co-expressed genes, the NAC gene NST2 and another ortholog of AtMYB103 in bamboo (PH01000462G0290). Meanwhile, we supplied a visualization of global networks for some genes related to the phenylpropanoid biosynthesis pathway, such as PH01001164G0160 (orthologous gene of HCT), PH01001569G0030 (orthologous gene of HCT) and PH01000009G1900 (orthologous gene of C4H) (Figure 5). These genes were also co-expressed with some genes related to phenylpropanoid biosynthesis pathways, including CCR1, CCR4, 4CL1, 4CL2, and CYP98A, whose promoter regions share a cisacting motif called ' AC element' that is recognized by MYB58 and MYB63 in Arabidopsis (Zhou et al., 2009). Through motif analysis of 3 kb of these bamboo genes' promoter regions, the ' AC element' was found to be significantly enriched.
To increase the reliability of networks in bamboo, we further made a comparison between the top 300 PCC co-expressed genes of SND3 and MYB103 in Arabidopsis (from ATTED-II) and those in bamboo. Plenty of orthologous gene pairs in SND3 network comparison could be grouped into several sections, such as LAC genes, MYB genes, zinc finger genes, IRX family genes and other NAC genes. Generally, our co-expression network analysis, together with the cis-element and GO enrichment analyses, efficiently identified components and recapitulated a regulatory module of the SCW biosynthetic process.

A Combination of Several Functional Regulatory Modules Related to Bamboo Development
The function modules contained nodes that were more densely connected to each other than to nodes outside the group in bamboo co-expression networks. We identified an important functional module related to photosynthesis by co-expression network analysis, and the function of this module was predicted to associate with photosynthesis and light harvesting (FDR: 2.00E-8) by GSEA (Figure 6). We also identified functional modules related to BR biosynthetic pathways (FDR: 1.83E-3) and diterpenoid biosynthetic pathways (FDR: 6.18E-3) based on a similar approach (Figure 6). In addition, three  regulatory modules were found to possibly participate in phenylpropanoid biosynthetic pathways. For example, one functional module was significantly related to phenylpropanoid biosynthesis (FDR: 1.07E-7) and flavonoid biosynthesis (FDR: 0.02), including PH01000009G1900 (orthologous gene of C4H), PH01001044G0220 (orthologous gene of CCR1), and PH01001444G0130 (orthologous gene of CCR1). We further combined several regulatory gene modules that were identified and conducted module analysis of functions related to fast growth of bamboo culms (Figure 6). All these modules were related to bamboo growth and development, such as photosynthesis, BR biosynthesis, and phenylpropanoid biosynthesis. Among the different modules related to phenylpropanoid biosynthesis, the connected node (PH01001164G0160, orthologous gene of HCT) could play a vital role in the regulatory pathway based on its co-expression network analysis. The connections between functional modules could represent crosslinks between different modules related to the similar function or different pathways. Thus, modules with nodes connected to other modules were selected and displayed in the database to help to further study their key functions. Accordingly, the combination of these functional modules displayed a series of possible key genes from hormone signals to culm development, mimicking the dynamic regulatory process in bamboo and highlighting the connections between these nodes in regulatory modules during growth stages.

Online Co-expression Network Database for Moso Bamboo
Here, we developed the BambooNET database, a platform with co-expression network analysis, cis-element analysis and GSEA tools and provided an online server for gene functional module analyses in multi-dimensional co-expression networks for moso bamboo (Phyllostachys edulis), which will help to refine annotation of bamboo gene functions. In this database, different categories of the co-expression network can be selected to visualize using the Cytoscape web tool, including the global network and the conditional network, which includes a search function for either a single gene or a list of genes. Notably, there are three main analysis options in the co-expression network platform: positive relationship, negative relationship and predicted protein-protein interaction relationship. In the tissue-preferential analysis, there are eight tissues, namely, the shoot, root, culm (stem), leaf, panicle, bud, rhizome and sheath. In addition, the gene expression changes of a certain sub-network among different tissues can be clearly observed. The stress-differential analysis displays not only the differences in gene expression between 2 and 8 h under dehydration and cold stress but also the fold changes of gene expression after each stress treatment. In the view of the network, the nodes in red or blue represent up-or downregulated genes in leaves after a stress treatment, respectively. Moreover, some tools in this platform are available for the gene lists from the selected specific network to analyze, annotate and identify some functional modules, besides gene set analysis (Yi et al., 2013) and UCSC Genome Browser (Speir et al., 2016), such as BLAST search, keyword search and module enrichment analysis, comprising co-expression network and miRNA-target network. The website can be accessed at http://bioinformatics.cau.edu.cn/bamboo.

DISCUSSION
In this study, we constructed a genome-level co-expression network containing more than 90% of predicted genes and FIGURE 4 | Cluster analysis of SND, NST and VND genes from ICBR samples of bamboo. The gene names are indicated on the right, while the tissue types are shown above each column. High, average and low levels of expression in a specific tissue are indicated by red, white, and blue, respectively. 487,427 positive and negative edges with existing transcriptome data on bamboo. The samples cover most of the development stages of bamboo growth and development, such as the root, leaf, culm (stem), and shoot. In addition, a network-based platform, covering global, conditional and predicted proteinprotein interaction networks, has been built successfully to refine the annotation of bamboo genes or modules with functions related to bamboo growth and development. Through the data mining system, networks of various aspects have combined with several functional analysis tools, including ortholog annotation, gene family classification, cis-element analysis and GO analysis, to evaluate the reliability of the predictions.
Although the whole-genome sequence of moso bamboo has been released, the genome annotation is still far from complete. For these sparsely annotated genes, compared with the study of single-gene identifications, modules are a valuable resource for predicting gene function. Combined with genes from co-expression networks, it would be interesting to identify modules associated with the biological process of bamboo growth and development. The potential functional modules related to phytohormones can display the module functions for BR biosynthetic pathways (FDR: 1.83E-3) and diterpenoid biosynthetic pathways (FDR: 6.18E-3) by GSEA analysis of KEGG pathways (Figure 6). These tightly linked genes within the one module may have related key biological functions in the process of fast growth in bamboo and can be used for genetic improvement and molecular regulatory mechanisms of moso bamboo. There is great potential for producing a large number of mutant traits of target genes related to bamboo growth and development using the clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein 9 (CRISPR/Cas9)based genome-editing systems. Natural variants of these genes in different bamboo species may also be favorable for genetic improvement of traits in crops. Compared with the model plant Arabidopsis, the similar functions of orthologous genes have validated the credibility of the above network analysis and module implications based on the network comparison between Arabidopsis and bamboo ( Figure 3B).
The co-expression networks with different samples are usually different. An ideal method should be able to incorporate global networks and conditional networks for different samples. Compared with all samples, there is much less diversity and far more comparability between ICBR samples (Supplementary Figure S3). Moreover, all ICBR samples should be classified as vegetative tissues, which may have parts specifically related to FIGURE 5 | A co-regulatory network for NSTs/SNDs-related genes in bamboo. PH01006140G0010 (SND3) and PH01000508G0100 (orthologous to AtMYB103) were used to present regulatory networks with all samples (global network) and partial samples (conditional network). The rounded boxes represent conditional networks, while the rectangular boxes represent global networks. The networks for the genes PH01006140G0010 (SND3) and PH01000508G0100 (orthologous to AtMYB103) are in the boxes with green solid borders and orange solid borders, respectively. The networks for HCT (PH01001164G0160 and PH01001569G0030) and C4H (PH01000009G1900) are in the boxes with yellow-dotted borders. The comparison views of conditional co-expression networks are also shown. The gray edges link to SND3 in bamboo, and brown edges link to AT1G28470 (SND3) in Arabidopsis. The red edges link the AtMYB103 (AT1G63910) gene and orthologs of AtMYB103 in bamboo (PH01000508G0100). Dotted lines link orthologous pairs between the two species, and the numbers in the middle of the dotted lines are the E-values of the BLAST results. The NAC, MYB, zinc finger, IRX and LAC genes are highlighted with green, orange, pink, blue, and purple nodes, respectively. In the conditional co-expression network for PH01006140G0010 (SND3), the LAC4 gene PH01001798G0410 is highlighted in purple, which is a miR397 target gene.
fast growth and the development of shoot and culm. For the coexpression networks for LHC genes (Figures 2B,C), the genes between global and conditional networks have their differences and overlaps. Therefore, the conditional co-expression network together with the global network can be complementary and then imitate the potential LHC genes' regulatory mechanism of fast-growth stage in bamboo. Specifically, we also investigated whether network modules are associated with specific tissue types and are enriched for specific biological process analysis by agriGO based on cluster analysis of SND, NST and VND FIGURE 6 | A combination of functional regulatory modules related to bamboo development through module prediction. The functions of the modules were predicted through integrating annotations such as GO, gene families (transcription regulators, kinases, and carbohydrate-active enzymes), and KEGG, and non-significant entries were filtered by the Fisher test and multiple hypothesis testing (FDR ≤ 0.05). The different modules with similar functions are in the same section.
genes between conditional samples (from ICBR) and global samples (all source) of bamboo. These genes are preferentially expressed within shoot tissues relative to all other tissue types in conditional samples (Figure 4), which would be essential in the growth of bamboo, especially the tissue shoot. For example, one regulatory module with the gene SND3 for SCW thickening was identified based on the conditional co-expression networks, which might indicate that these genes can fulfill their function in shoot development stages. Meanwhile, the global networks provided additional genes for further exploration of shoot tissue development.
Although BambooGDB  has been integrated high-throughput sequencing data and provided researchers worldwide with a central genomic resource and an extensible analysis platform for bamboo genome, it is still necessary to build an online database for refining gene annotation and discovering novel gene functions. Through Cytoscape, our online bamboo co-expression database displays the multidimensional network structure and module enrichment for clear visualization and convenient analysis. Based on the co-expression network, the strategy for functional module prediction and refined gene function annotation is general and effective, so more regulatory modules could be identified by the same strategy based on a detailed biological focus or event, such as fast growth. We successfully identified 1,896 functional modules through the CPM method (Adamcsek et al., 2006), which can be searched and studied through the module enrichment in the database. These findings make it more convenient to understand the molecular regulatory mechanisms of bamboo's vital developmental traits, extremely its fast growth, which can help to dissect the molecular biological processes of bamboo. In addition, the unannotated genes establish connections to their co-expressed genes and can be refined by functional module enrichment analysis to further study unknown functions with biological processes and discern gene transcriptional regulatory mechanisms in vivo with the help of gene expression view in different tissues. With our multi-dimensional co-expression network, more than 90% of unannotated bamboo genes might be predicted potential functions.
However, the results might be unsatisfactory owing to the lack of complete data sets on all kinds of tissues in bamboo development stages. We believe that the detection of function modules will become much more efficient with more comprehensive transcriptome data sets of moso bamboo. Furthermore, our online bamboo co-expression database will be improved to facilitate data visualization. We will further incorporate faster and more efficient tools, such as JBrowse (Buels et al., 2016), which are very convenient for genomic track data visualization. Finally, we expect to functionally characterize modules and to investigate how to alter modules to drive developmental changes across all developmental stages and how genes in these modules act in biological pathways.

CONCLUSION
Here, multi-dimensional bamboo samples and comparable computing measurements have been used to build a coexpression network to refine the annotation of bamboo genes or functional modules with important agronomic traits, such as growth processes. Meanwhile, module functional enrichment analysis tools, such as gene family classification, cis-element analysis and GO analysis, have been used to evaluate the reliability of the predictions. Based on the gene expression analysis and conditional network, the strategy for functional module prediction and refined gene function annotation is general and effective. Thus, more regulatory modules could be identified by the same strategy based on a detailed biological focus or event, such as fast growth. Therefore, this approach will improve our understanding of the molecular regulatory mechanisms underlying vital agronomic traits, such as growth and development. We hope that more transcriptome data will improve the network analysis for functional module identification and reduce biases or mistakes caused by its current limitations, increasing our understanding of bamboo growth and development.

AUTHOR CONTRIBUTIONS
ZS, WX, and ZG designed the project. XM, HZ, WX, and HY performed the research. XM, QY, and WX analyzed the data and conducted the bioinformatics analysis. XM, HZ, WX, ZG, and ZS wrote the article.

FUNDING
This work was supported by grants from the National Natural Science Foundation of China (31771467 and 31371291).