Original Research ARTICLE
Exploring potential new floral organ morphogenesis genes of Arabidopsis thaliana using systems biology approach
- School of Life Sciences, Sun Yat-Sen University, Guangzhou, China
Flowering is one of the defining features of angiosperms. The initiation of flower development and the formation of different floral organs result from the interplay among numerous genes. Until now, however, only a few genes have been linked with flower development, and the functions of many Arabidopsis thaliana genes are still unknown. Although the quartet model successfully simplified the ABCDE model by introducing protein-protein interactions (PPIs) to explain the molecular mechanism, several important aspects of flower development remain poorly understood, and more genes involved in flower development need to be identified. In this study, we identified seven differentially co-expressed modules by integrating weighted gene co-expression network analysis (WGCNA) with a Support Vector Machine (SVM) method to analyze the co-expression network and PPIs, using public floral and non-floral expression profile data of Arabidopsis thaliana. Gene set enrichment analysis was used for functional annotation of the related genes, and some hub genes were identified in each module. The potential floral organ morphogenesis genes of two significant modules were integrated with PPI information to detail the underlying regulation mechanisms. Finally, the functions of the floral patterning genes were elucidated by combining PPI and evolutionary information. The results indicated that sub-networks or complexes, rather than individual genes, were the regulatory units of flower development. The most likely new genes underlying floral pattern formation in A. thaliana were FY, CBL2, ZFN3, and AT1G77370. Among them, FY and CBL2 acted as upstream regulators of AP2, and ZFN3 possibly activated the flower primordium-determining genes AP1 and AP2 through HY5/HYH via photoinduction. AT1G77370 exhibited a function in floral morphogenesis similar to that of ELF3. RFC3 possibly formed a complex with RPS15 in the cytoplasm, which regulated TSO1 and CPSF160 in the nucleus to control floral organ morphogenesis. This process might also be fine-tuned by AT5G53360 in the nucleus.
Flowering is one of the defining features of angiosperms. It is also the pivotal stage between vegetative growth and fruiting during the development of higher plants. Each flower starts from a small group of undifferentiated cells and develops into a complex patterned structure in which different organs precisely occupy different positions. This process, known as floral pattern formation, has attracted growing attention in recent years (Bemis et al., 2013).
The transition from the vegetative phase to the reproductive phase is of great importance for all flowering plants. The hallmark of the reproductive phase is the differentiation of the flower. The shoot apical meristem transforms into the floral meristem early in this phase. Later, floral organ primordia initiate within the floral meristem and give rise to the sepals, petals, stamens and carpels. The development of floral organs is controlled by homeotic genes during the reproductive phase. The ABC model was proposed by Coen and Meyerowitz (1991) to classify the homeotic genes and to explain how class A, B, and C genes specify floral organs in precise positions during flower development. The hypotheses behind the model are: firstly, the genes in each class are required to function in two adjacent whorls to specify organ types; secondly, each floral organ type arises from a combination of class A, B, and C gene functions; finally, class A and class C genes are mutually antagonistic. Soon afterwards, Colombo et al. revealed that the gene FBP11 determined ovule development (Colombo et al., 1995), and class D genes were added. In addition, studies of multiple gene mutants showed that four SEPALLATA genes interact redundantly with the ABC genes to specify floral organ identity (Rounsley et al., 1995). Genes of these classes encode MADS box transcription factors that are widely expressed in the sepal, petal, stamen, carpel and ovule. The ABC model was thus expanded into the ABCDE model, which was more detailed but also more complicated than its predecessor. Because proteins execute the functions of genes, Theissen et al. proposed a quartet model, which presumed that the development of a specific floral organ is achieved through the formation of a single protein complex of ABC transcription factors and SEPALLATA transcription factors (Theissen and Saedler, 2001). The quartet model successfully simplified the ABCDE model by introducing protein-protein interactions (PPIs).
From the early cloning and expression analysis of homologous genes to the later large-scale computational mining of regulatory relationships among genes, flower development in A. thaliana has been intensely studied (O'Maoileidigh et al., 2014). Genes differentially expressed between mutant and wild-type A. thaliana have been systematically identified by microarray, and the experimental results indicated that the expression of floral organ-specific genes was spatially restricted (Wellmer et al., 2004). The flower organ specification gene regulatory network (FOS-GRN) of A. thaliana has been modeled, and the characteristics of its network signal transduction have been surveyed (Sanchez-Corrales et al., 2010). However, the effects of PPIs have not been fully considered in flower development research. Functional tetramers were found to be widespread in the MADS domain protein-protein interaction networks (Espinosa-Soto et al., 2014). Protein complexes might therefore provide additional information for describing the flower development process.
Considerable progress has been made in recent years in deciphering the molecular mechanisms underlying flower formation (Krouk et al., 2013). Floral pattern formation is an extremely complex process. The initiation of flower development and the formation of different floral organs result from the interplay among numerous genes. Until now, however, only a few genes have been linked with flower development, and the functions of many Arabidopsis thaliana genes are still unknown. Several important aspects of flower development remain poorly understood, so more genes involved in flower development need to be identified. Several lines of investigation must be followed to address these knowledge gaps and to further unravel the structure and composition of the flowering gene network. In particular, the regulatory complexes that control gene expression during flower development must be characterized (O'Maoileidigh et al., 2014). In this research, we used a systems biology approach to identify potential new flower development genes, in order to better understand the sophisticated gene regulatory relationships underlying floral pattern formation in A. thaliana.
Materials and Methods
The gene expression data of A. thaliana development were obtained from TAIR (Lamesch et al., 2012). Eighteen samples of wild-type Columbia (Col-0), each in triplicate, were collected from different tissues of A. thaliana and split into two groups according to tissue specificity (Table 1). The floral and non-floral groups contained data from the same period but from different tissues; in the floral group, the flower stages ranged from 9 to 12.
The PPI data set of Arabidopsis was constructed from PPI data validated by biological experiments. The data mainly came from the following public databases: TAIR (Lamesch et al., 2012), BIND (Willis and Hogue, 2006), BioGRID (Chatr-Aryamontri et al., 2013), IntAct (Kerrien et al., 2012), and MINT (Licata et al., 2012).
Co-expression Network Analysis
A gene co-expression network was constructed using the weighted gene co-expression network analysis (WGCNA) method, as implemented in the WGCNA package in R (Langfelder and Horvath, 2008). To analyze the data within the WGCNA framework in reasonable time and with limited hardware resources, the data set was reduced by filtering on the Pearson correlation coefficient (PCC) between pairs of genes. After filtering, 6337 genes were retained for the unsigned WGCNA co-expression network analysis. Soft-thresholding powers in the interval (1, 40) were evaluated, and a power of 14, at which the scale-free topology model fitting index R² exceeded 0.6, was applied to maximize the scale-free topology of the network. A minimum module size of 30 members was chosen.
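As a minimal sketch of the unsigned soft-thresholding step, and not the WGCNA R implementation itself, the adjacency between two genes can be computed as the absolute Pearson correlation raised to the chosen power (14 in this study). The function names and the toy expression matrix below are hypothetical and serve only to illustrate the calculation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def unsigned_adjacency(expr, power=14):
    """a_ij = |cor(x_i, x_j)| ** power for every pair of genes (rows of expr)."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(expr[i], expr[j])) ** power
            adj[i][j] = adj[j][i] = a
    return adj


def connectivity(adj):
    """Row sums of the adjacency matrix (the diagonal is left at zero)."""
    return [sum(row) for row in adj]


# Toy expression matrix: 3 hypothetical genes x 5 samples.
toy_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with the first gene
    [5.0, 3.0, 4.0, 1.0, 2.0],   # correlation of -0.8 with the first gene
]
toy_adj = unsigned_adjacency(toy_expr, power=14)
toy_k = connectivity(toy_adj)
```

With a high power such as 14, a moderately strong correlation of 0.8 is reduced to an adjacency of about 0.04, which is how soft thresholding suppresses weak links without a hard cut-off.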
To incorporate external information into the co-expression network, we used the gene significance (GS) measure. Gene significance was defined as $GS_i = |\mathrm{cor}(x_i, T)|$, the absolute correlation between the expression profile $x_i$ of node $i$ and a phenotypic trait $T$, which may be a binary trait variable, across $m$ samples (Langfelder and Horvath, 2008). A network hub was defined as a highly connected gene with high intra-modular connectivity. To identify possible highly connected intra-modular hub genes, module membership (MM) was applied. Module membership, also known as eigengene-based connectivity kME, was defined as $kME_{\mathrm{cor},i}^{(q)} = \mathrm{cor}(x_i, E^{(q)})$, where $E^{(q)}$ is the module eigengene of module $q$.
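The two measures defined above can be illustrated with a short sketch. It assumes that the module eigengene is the first principal component of the standardized module expression matrix, approximated here by power iteration rather than by the singular value decomposition used in the WGCNA package. All names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(row):
    """Centre a profile and scale it to unit sample standard deviation."""
    n = len(row)
    mean = sum(row) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / (n - 1))
    return [(v - mean) / sd for v in row]


def module_eigengene(module_expr, iterations=200):
    """Approximate the first principal component across samples.

    Starts from the mean standardized profile, so the result is oriented
    along the average expression. Assumes that this mean is not zero.
    """
    z = [standardize(r) for r in module_expr]
    n_samples = len(z[0])
    v = [sum(r[s] for r in z) / len(z) for s in range(n_samples)]
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(r, v)) for r in z]
        v = [sum(scores[g] * z[g][s] for g in range(len(z)))
             for s in range(n_samples)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    return v


def gene_significance(profile, trait):
    """GS_i = |cor(x_i, T)|."""
    return abs(pearson(profile, trait))


def module_membership(profile, eigengene):
    """kME_i = cor(x_i, E)."""
    return pearson(profile, eigengene)


# Toy module: 3 hypothetical genes x 6 samples; trait 1 = floral, 0 = non-floral.
module_expr = [
    [1.0, 2.0, 1.0, 5.0, 6.0, 5.0],
    [2.0, 3.0, 2.0, 6.0, 7.0, 6.0],
    [1.0, 1.0, 2.0, 4.0, 6.0, 5.0],
]
trait = [0, 0, 0, 1, 1, 1]
eigengene = module_eigengene(module_expr)
gs_values = [gene_significance(g, trait) for g in module_expr]
kme_values = [module_membership(g, eigengene) for g in module_expr]
```

Genes with both a high GS and a high kME would be the natural hub candidates under these definitions.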
Protein-protein Interaction Analysis
The PPI data sets were pre-processed as follows. Firstly, protein pairs containing a protein with < 50 amino acids or with unknown amino acids were removed. Secondly, all proteins in the data set were compared using the sequence clustering program cd-hit (Li and Godzik, 2006), and protein pairs with ≥ 40% identity were removed; the remaining 6505 protein pairs comprised the final positive data set. Although the overwhelming majority of these pairs had < 40% pairwise sequence identity to one another, the classifier could still be biased toward any remaining homologous sequence pairs.
Since non-interacting protein pairs were not readily available for Arabidopsis, a negative data set was constructed under the assumption that proteins occupying different subcellular localizations do not interact. The subcellular localization information of the proteins in the positive data set was extracted from SUBA3 (http://suba.plantenergy.uwa.edu.au/) (Tanz et al., 2013), and the non-interacting pairs were generated by pairing proteins from different localization subsets. The negative data set based on subcellular localization information was called Psub. The negative data set had to meet three requirements: (i) the protein pairs cannot appear in the whole PPI data set of Arabidopsis; (ii) the number of negative pairs is equal to that of positive pairs (Pitre et al., 2006; Shen et al., 2007); (iii) the pairs, encoded with the auto covariance (AC) algorithm proposed by Guo et al. (2008), are subsequently fed to LIBSVM (Chang and Lin, 2011) to construct a two-class classification model. The RBF (radial basis function) kernel was used in the support vector machine (SVM) model, and the cost (c) and gamma (γ) parameters were optimized by grid search and set to 5.278 and 0.574, respectively (Supplementary Figure 1). In addition, a co-expression-based PPI network was constructed by setting an independent co-expression threshold (α) for each module with high GS. Two genes whose co-expression value is higher than the threshold are considered to interact at the protein level. The threshold is calculated as $\alpha = (weight_{\max} - weight_{\min}) \times 0.6 + weight_{\mathrm{average}}$, where $weight_{\max}$, $weight_{\min}$, and $weight_{\mathrm{average}}$ are the maximum, minimum, and average weight values, respectively.
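Two of the steps described above lend themselves to a short illustration: the co-expression threshold α and the sampling of negative pairs from proteins in different compartments. The sketch below is only one possible reading of the text. The function names, protein identifiers, localizations and weights are hypothetical, and the feature encoding and SVM training are not shown.

```python
import itertools
import random


def coexpression_threshold(weights, factor=0.6):
    """alpha = (max - min) * factor + mean, following the formula in the text."""
    w = list(weights)
    return (max(w) - min(w)) * factor + sum(w) / len(w)


def sample_negative_pairs(localization, known_pairs, n_pairs, seed=0):
    """Pair proteins from different compartments, skipping known interactions."""
    known = {frozenset(p) for p in known_pairs}
    candidates = [
        (a, b)
        for a, b in itertools.combinations(sorted(localization), 2)
        if localization[a] != localization[b]
        and frozenset((a, b)) not in known
    ]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(candidates, min(n_pairs, len(candidates)))


# Hypothetical toy inputs.
toy_weights = [0.01, 0.02, 0.03, 0.10]
toy_alpha = coexpression_threshold(toy_weights)

toy_localization = {
    "P1": "nucleus",
    "P2": "nucleus",
    "P3": "cytosol",
    "P4": "plastid",
}
toy_known_pairs = [("P1", "P2"), ("P1", "P3")]
# Requirement (ii): as many negative pairs as positive pairs.
toy_negatives = sample_negative_pairs(
    toy_localization, toy_known_pairs, n_pairs=len(toy_known_pairs)
)
```

Sampling exactly as many negatives as positives keeps the two classes balanced for the two-class model.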
Module Enrichment Analysis
Gene ontology (GO) enrichment in modules was carried out with ClueGO (Bindea et al., 2009) in Cytoscape v.2.8. The hypergeometric test was applied (P < 0.05). Each module was tested for enrichment in the molecular function (MF) and biological process (BP) categories. The Bonferroni method was applied to correct the P-values for multiple testing. ClueGO uses kappa statistics to link functional group terms in the network. The functional groups were created by iterative merging of initially defined groups based on a predetermined kappa score threshold. The kappa score can be adjusted on a positive scale from zero to one to limit the network connectivity in a customized way. We built a functionally grouped network with terms as nodes, linked on the basis of a kappa score ≥ 0.3. The co-expression network and the subcellular localization annotation of genes of interest were visualized with Cerebral (Barsky et al., 2007). Only GO terms with corrected P < 0.005 were considered overrepresented in our analysis.
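For readers unfamiliar with the test, the following sketch shows a one-sided hypergeometric enrichment P-value followed by Bonferroni correction. It illustrates the statistics only and is not ClueGO's implementation. The function names and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): probability of at least k annotated genes in a module.

    N: genes in the background, K: background genes carrying the GO term,
    n: genes in the module, k: module genes carrying the GO term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def bonferroni(pvalues):
    """Multiply each P-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Hypothetical example: background of 10 genes, 4 annotated with the term,
# module of 3 genes, 2 of them annotated.
toy_p = hypergeom_pvalue(k=2, n=3, K=4, N=10)
toy_corrected = bonferroni([0.01, 0.5, 0.2])
```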
Sequences of flower development genes of rice (Oryza sativa) (Yoshida and Nagato, 2011), snapdragon (Antirrhinum majus) (Hudson et al., 2008), and petunia (Petunia hybrida) (Mallona et al., 2010) were retrieved from the literature. Sequences of flower development genes of A. thaliana were selected from the predicted-PPI of the brown and magenta modules. The phylogenetic tree was constructed using an alignment-free method to avoid the influence of sequence heterogeneity. This method, based on K-tuple counting and background subtraction, is termed the composition vector (CV) approach and is implemented as CVTree (Xu and Hao, 2009). The K-tuple length was set to 6, and the resulting tree was visualized with MEGA 5 (Hall, 2013).
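The idea of K-tuple counting with background subtraction can be sketched as follows. This is a simplified illustration written from the general description of the composition vector approach, not the CVTree program itself. It assumes that the expected frequency of a K-tuple is predicted from its two overlapping (K-1)-tuples and their shared (K-2)-tuple, and that the distance is derived from the cosine correlation of two vectors. The function names and peptide strings are hypothetical, and K = 3 is used only because the toy sequences are short.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(seq, k):
    """Relative frequency of every k-letter word in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def composition_vector(seq, k=6):
    """Relative deviation of observed K-tuple frequencies from the prediction.

    Requires k >= 3. Words that are predicted but never observed get -1.
    """
    fk = kmer_freqs(seq, k)
    fk1 = kmer_freqs(seq, k - 1)
    fk2 = kmer_freqs(seq, k - 2)
    # Index (k-1)-words by their first k-2 letters to find overlapping pairs.
    by_start = {}
    for word in fk1:
        by_start.setdefault(word[:-1], []).append(word)
    vec = {}
    for left, p_left in fk1.items():
        for right in by_start.get(left[1:], []):
            p0 = p_left * fk1[right] / fk2[left[1:]]  # background prediction
            word = left + right[-1]
            vec[word] = (fk.get(word, 0.0) - p0) / p0
    return vec


def cv_distance(vec_a, vec_b):
    """D = (1 - C) / 2, where C is the cosine correlation of two vectors."""
    dot = sum(v * vec_b.get(word, 0.0) for word, v in vec_a.items())
    norm_a = sqrt(sum(v * v for v in vec_a.values()))
    norm_b = sqrt(sum(v * v for v in vec_b.values()))
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


# Hypothetical peptide fragments, for illustration only.
seq_a = "MKVLAAGIVKVLAAGMKV"
seq_b = "MKVLSAGIVKVLSAGMKV"
vec_a = composition_vector(seq_a, k=3)
vec_b = composition_vector(seq_b, k=3)
d_ab = cv_distance(vec_a, vec_b)
```

A matrix of such pairwise distances could then be passed to a standard distance-based tree-building method.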
Modules Organization and Gene Set Enrichment Analysis
As shown in Figure 1, a weighted co-expression network with scale-free topology, composed of seven modules of Arabidopsis genes, was obtained. WGCNA assigned each module a unique color label, which is used as the module identifier below. The largest module (“magenta”) contained 1333 genes; the smallest module (“red”) contained 158 genes. About 177 probesets were not grouped into any of the above modules, so they were added to the “gray” module, which represented poorly connected genes.
Figure 1. Network analysis dendrogram showing modules identified by WGCNA. (A) Dendrogram plot with color annotation. (B) Module significance.
Gene set enrichment analysis of GO terms within each module was conducted to provide a biological interpretation of the constructed gene networks (Table 2 and Supplementary Tables 1–6). The magenta module had an over-representation of BP terms related to negative regulation of flower development (P = 1.16E-6). Floral organ development (P = 2.06E-03) and nuclear-transcribed mRNA catabolic process (P = 9.31E-05) were notably enriched in the black module. GO terms including development of the floral whorl, carpel and ovule were enriched in the blue module (P = 1.07E-4). GO terms for response to far-red light (P = 4.19E-19) and NADPH regeneration (P = 1.22E-11) were significantly enriched in the green module. Response to abscisic acid stimulus (P = 5.12E-06), regulation of photomorphogenesis (P = 3.09E-04) and interphase of mitotic cell cycle (P = 6.37E-25) were notably enriched in the brown module. The red module was enriched for genes in regulation of actin filament depolymerization (P = 3.75E-04) and the jasmonic acid metabolic process (P = 1.87E-03). Hormone-mediated signaling pathway (P = 2.42E-6), regulation of photomorphogenesis (P = 3.09E-04) and RNA splicing (P = 3.84E-10) were overrepresented in the magenta and black modules.
Each module was filtered to identify the top hub proteins using measures such as intra-modular connectivity (kME) and gene significance (GS). The brown module scored the highest among the differentially co-expressed gene modules, followed by the magenta module (Supplementary Figure 2). Multiple genes in the brown module, i.e., AT1G13030, AT3G09630, AT3G23940, AT4G28450, AT5G07090, AT5G47210, ATARCA, ATG2, CARA, EIF2-GAMMA, GYRA, HD2B, NDPK1, NOP56, NUC-L1, PUR5, and TOM40, were essential factors in the pyrimidine metabolic process. AT5G38895 and EIN3 were factors in the reactive oxygen species metabolic process. AT3G14390, also known as diaminopimelate decarboxylase 1, was the hub protein in the brown network. In the magenta module, AHP3, EIN2, ERS1, KEG, PGGT-I, PIF4, RGS1, and RHA2A participated in the regulation of the signaling pathway. ELF3, GSTU19, HY5, JAR1, PIF4, PKS1, and RD2 were involved in the response to far-red light stimulus.
PPIs in Brown Module
The brown module scored the highest among the differentially co-expressed gene modules (GS = 0.3109, Figure 1B). Functional annotation showed that this module was enriched in post-embryonic organ morphogenesis and in flower organ development and morphogenesis (Supplementary Table 3), which suggested a close relationship with floral patterning.
Twenty-four proteins, including FY, EGL3, CRN, and CSN5A, were involved in floral organ morphogenesis (P = 3.18E-04) and in other floral development processes. When these proteins were mapped to the experimental PPI databases described above, 13 proteins formed a sub-network (Figure 2A). As the hub protein within the sub-network, CSN5A interacted with FUS7 (COP9), CSN6B, CSN6A, FUS11, FUS12, PI, EMB144 (FUS9), EMB134 (COP8), TIF3H1, and SK31 (FUS6) to form the COP9 signalosome (CSN) complex.
Figure 2. PPI network of floral organ morphogenesis in the brown module. (A) Experiment-PPI. (B) Predicted-PPI. Rhombus: functionally enriched proteins of interest in this module.
Experimentally validated PPIs may miss certain interactions. To gain more information, the co-expression values between these 24 proteins and the other proteins in the brown module were calculated and filtered with the threshold α set to 0.08. The 81 proteins selected as highly co-expressed were submitted to the SVM model to predict possible interactions. The predicted interactions were further filtered to retain only PPIs between proteins with the same subcellular localization. Two nucleus-localized proteins, TSO1 and CPSF160, interacted with RPL34, RPS15, AT2G27710, and AT3G12390 (Figure 2B).
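The filtering sequence described above (co-expression threshold, classifier prediction, shared localization) can be sketched as follows. This outline rests on stated assumptions: the trained SVM is represented by a stand-in callable, and every identifier, weight and localization is hypothetical rather than taken from the study's data.

```python
def candidate_partners(seed_proteins, coexpr, alpha):
    """Proteins whose co-expression weight with any seed protein exceeds alpha."""
    seeds = set(seed_proteins)
    partners = set()
    for (a, b), weight in coexpr.items():
        if weight > alpha:
            if a in seeds:
                partners.add(b)
            if b in seeds:
                partners.add(a)
    return partners - seeds


def filter_predictions(pairs, predict, localization):
    """Keep pairs predicted to interact whose members share a compartment."""
    kept = []
    for a, b in pairs:
        same_place = a in localization and localization.get(a) == localization.get(b)
        if predict(a, b) and same_place:
            kept.append((a, b))
    return kept


# Hypothetical toy inputs; alpha = 0.08 mirrors the threshold used for the brown module.
toy_seeds = {"S1"}
toy_coexpr = {
    ("S1", "P1"): 0.12,
    ("S1", "P2"): 0.05,
    ("P3", "S1"): 0.09,
    ("P1", "P2"): 0.50,  # no seed protein involved, so it is ignored
}
toy_partners = candidate_partners(toy_seeds, toy_coexpr, alpha=0.08)

toy_pairs = [(s, p) for s in sorted(toy_seeds) for p in sorted(toy_partners)]
toy_locations = {"S1": "nucleus", "P1": "nucleus", "P3": "cytosol"}


def always_interacts(a, b):
    """Stand-in for the trained classifier; a real model would score the pair."""
    return True


toy_kept = filter_predictions(toy_pairs, always_interacts, toy_locations)
```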
PPIs in Magenta Module
Genes participating in negative regulation of flower development were found in the magenta module, which was the second most important module based on gene significance score (Figure 1B). The 104 genes involved in flower development (P = 3.08E-04) attracted special attention; they included the class A genes AP1 and AP2 and the class B gene PI. AP1/AP2 controlled sepal development, while PI regulated petal development, both of which belonged to the first two stages of floral organ formation.
To reduce complexity, the sub-network including AP1, AP2, and PI was extracted from the experimental PPIs of the 104 genes for further investigation (Figure 3A). AP1, which interacted with AP3, AG, SEU, LUG, SEP3, SEP4, PI, SVP, and AGL, was the hub protein of this experimental sub-network. WSIP1, WSIP2, and TPR2 were the interaction partners of the AP2 protein.
Figure 3. PPI network of flower development regulation in the magenta module. (A) Experiment-PPI involving AP1/AP2/PI. (B) Predicted-PPI. Rhombus: functionally enriched proteins of interest in this module.
The predicted-PPI of flower development in the magenta module was constructed in the same way as for the brown module, with 101 proteins retained by setting the threshold α to 0.16. AP2 was the hub protein in this predicted-PPI and interacted with 10 proteins including CBL2, ERS, and SRL2. AP2, a class A transcription factor, collaborated with AP1 to regulate the development of sepals and petals.
Discussions and Conclusions
Modules Organization and Gene Set Enrichment Analysis
Validating the results of computational methods is always a problem. In biological research, a common way of cross-checking is literature retrieval: partial information about the functions of the genes or proteins can be obtained from the literature to support our predictions.
Literature retrieval confirmed that early flowering 3 (ELF3) of Arabidopsis was responsible for the generation of circadian rhythm as well as for the regulation of photoperiodic flowering (Zhao et al., 2012). Mutation of ELF3 led to arrhythmic circadian output in continuous light (Covington et al., 2001; Kolmos et al., 2011) and late flowering (Zhao et al., 2012). The membrane-associated progesterone binding protein 2 (ATMP2) was the hub protein in the module based on MM, and took part in both negative regulation of cellular process and the indoleacetic acid biosynthetic process (Kao et al., 2005).
Potential Floral Organ Morphogenesis Genes in Brown Module
CSN was a conserved protein complex that interacted with the CDD complex and was involved in the ubiquitin-proteasome pathway to orchestrate the repression of photomorphogenesis (Chen et al., 2006; Nezames and Deng, 2012). The F-box protein Unusual Floral Organs (UFO) also interacted with CSN5A and participated in flower development of Arabidopsis (Wang et al., 2003). Mutation of UFO led to dramatic changes in floral organ type (Hepworth et al., 2006). Chae et al. (2008) showed that UFO, acting as a DNA-associated transcriptional co-factor, physically interacted with the LFY transcription factor to activate AP3 expression in developing petal and stamen primordia and to control class B and C genes in floral organ formation.
TSO1 regulated directional processes in cells during floral organogenesis (Hauser et al., 1998). It encoded a floral-specific cell division component, but its function was redundant in non-floral tissue (Liu et al., 1997). That study showed that TSO1 mutants displayed defects in cell division of floral meristem cells, including partially formed cell walls and increased DNA ploidy (Liu et al., 1997). CPSF160, a subunit of the cleavage and polyadenylation specificity factor (CPSF), was an important component of the mRNA 3′-end processing apparatus in Arabidopsis (Xu et al., 2006). CPSF was physically associated with the flowering time regulator FY (Herr et al., 2006). It recruited FCA to control FLC mRNA expression and thereby affect flowering time (Simpson et al., 2004). The replication factor C subunit 3 (RFC3) was highly homologous to RFC3 in yeast and other eukaryotic species, functioning in cell replication, proliferation, DNA replication and damage repair (Xia et al., 2009). Genetic research showed that RFC3 mutation resulted in smaller leaf blades and flower petals, implying that it functioned in cell replication and proliferation (Xia et al., 2009) and played an essential role in DNA replication and damage repair (Mossi and Hübscher, 1998). The function of chloroplast ribosomal protein S15 (RPS15) has been little studied, but recent results showed that the replication factor and the ribosomal protein might jointly participate in protein synthesis (Daijiro et al., 2014). Thus, based on the predicted PPIs, we proposed that RFC3 formed a complex with RPS15 in the cytoplasm, which was then transported into the nucleus to regulate the mRNA expression of TSO1 and CPSF160 and thereby control floral organ morphogenesis. This process might also be fine-tuned by AT3G12390 and AT5G53360 in the nucleus.
Potential Floral Organ Morphogenesis Genes in Magenta Module
Most of the AP1 partners belong to the MADS-box family, whose members are transcription factors (Shore and Sharrocks, 1995) that control all major aspects of development (Becker and Theissen, 2003) and determine floral organ identity (Ng and Yanofsky, 2001) or flowering time (Michaels and Amasino, 1999) in plants. The MADS-box protein SVP, which interacted with AP1, SEP3, AGL6 and many other proteins, was a negative regulator of the floral transition (Hartmann et al., 2000). Another MADS-box gene, FLC, was also known to repress flowering (Sheldon et al., 1999). SVP consistently interacted with FLC to form a functional heterodimer, and associated with the promoter regions of the flowering time regulators FT and SOC1 to repress flowering (Li et al., 2008). Over-expression of SVP and/or FLC dimerization led to precocious flowering and abnormal floral organ development (Li et al., 2008). SEP3, a member of the class E genes, activated class B and C gene expression in the stage 3 floral meristem. Class B and C genes were not expressed before late stage 2 because SEP3 was repressed by SVP in the floral meristem. This repression was reversed by AP1, which repressed SVP and thereby derepressed SEP3 and LFY to activate the expression of these two gene classes in early stage 3 (Liu et al., 2009).
The antagonistic interaction between class A and class C genes was triggered by AP2 through negative regulation of AG, the class C gene (Krogan et al., 2012). TPR2 was also involved in this process as a binding partner of AP2 (Figure 3A) (Krogan et al., 2012). ERS (ethylene response sensor), a gene in the A. thaliana ethylene hormone-response pathway, was strongly expressed in young floral primordia and floral organ primordia (Hua et al., 1998). Its predicted interaction with AP2 suggested that it might regulate AP2 in the early stage of flower development. The F-box protein COI1, a critical component of the jasmonate receptor, was also noteworthy. Jasmonates modulate the expression of numerous genes and mediate responses to stress-related growth inhibition, wounding and pollen development (Devoto et al., 2002; Gfeller et al., 2010). The COI1 mutant was insensitive to methyl jasmonate and was male sterile due to abnormal pollen production (Xie et al., 1998). A yeast two-hybrid assay showed that the flowering protein terminal flower 2 (TFL2) was associated with the potential transporter AT-IMP (Arabidopsis Interactome Mapping, 2011). TFL2 had a repressive function in jasmonate signaling and localized preferentially to euchromatic regions instead of heterochromatic chromocenters (Valdés et al., 2012). COI1 was predicted to associate with AT-IMP in the predicted-PPI. We proposed that when COI1 responded to a jasmonate stimulus, AT-IMP was activated and transferred the signal to TFL2, engaging it in the flower development process.
Functional Inference of Vital Genes in Flower Development
The studies above showed that flower development is a complex biological process involving multiple genes/proteins. Research on gene regulatory networks has made profound progress in Arabidopsis and other model plants (Azpeitia et al., 2014; O'Maoileidigh et al., 2014). Gene function is directly related to a specific protein and therefore to its interaction partners. The preceding analysis elaborated the roles of proteins through co-expression clustering and the functions of their interaction partners. In addition, it is widely accepted that evolutionarily related proteins tend to perform similar functions (Ranea et al., 2007; Engelhardt et al., 2011). We therefore further investigated the evolutionary relationships of flower development genes, which were selected from the experiment-PPI/predicted-PPI in the brown and magenta modules of A. thaliana, together with class A/B/C/D/E genes from rice, snapdragon and petunia.
Most of the known flower development proteins were close to each other in the phylogenetic tree (Figure 4, marked by black circles), which suggested that they were evolutionarily related and possibly had similar biological functions. This result was reasonable because the ABCDE organ identity genes in Arabidopsis encode MADS-box transcription factors, except for the class A gene AP2 (Figure 4) (Martinez-Castilla and Alvarez-Buylla, 2003). The floral homeotic gene DROOPING LEAF (DL) in Oryza sativa, which had already been characterized (Yamaguchi et al., 2004), was distinct from the well-known ABC genes, as was also seen in the phylogenetic tree (Figure 4). ACS10 was close to the class B genes in the tree, while in the predicted-PPIs of the magenta module it was predicted to interact with AP2 (Figure 3B), which indicated that ACS10 participated in the early stage of floral organ development. ACS10 was also recorded in the TAIR database as being expressed during the petal differentiation and expansion stage (https://www.arabidopsis.org/servlets/TairObject?name=AT1G62960&type=locus). CBL2, which clustered with the flowering time regulator FY in the phylogenetic tree, was also predicted to associate with AP2 (Figure 3B). Expression of CBL2 in mature leaves disappeared during dark treatment and recovered upon illumination, which strongly suggested that it was influential in light-signal transduction (Nozawa et al., 2001). Thus, we proposed that the function of CBL2 was similar to that of FY, and that it acted as an upstream regulator of AP2. The transcription factor HY5 controlled light-induced gene expression and targeted genes including light-signaling components and flowering time regulators (Lee et al., 2007). Two genes, HY5 and HYH, were highly similar in Arabidopsis (Sibout et al., 2006). The predicted interaction between HY5/HYH and ZFN3 (Figure 3B), together with the clustering of ZFN3 with AP2 (Figure 4), indicated that ZFN3 might be involved in flowering time control. ELF3, AT1G77370, AT2G27710, ATMTK, and AT-IMP were on a similar branch. Few studies had explored the functions of AT-IMP, ATMTK, and AT2G27710. However, genetic analysis showed that ELF3 had some functions in early photomorphogenesis (Liu et al., 2001). AT1G77370, also named glutaredoxin-C3, might play a vital role in floral morphogenesis (Wang et al., 2009). Therefore, ELF3 and AT1G77370 might exhibit similar functions in floral morphogenesis.
Figure 4. Phylogenetic analysis of flower development genes. Solid circles in the figure represent known flower development genes. Arabidopsis genes are denoted by AT|symbol|AGI or AGI. Other species genes are denoted by Species|Class|Symbol. “Species” abbreviation: AT, A. thaliana; Oryzac, Oryza sativa; AM, Antirrhinum majus; Petunia, Petunia hybrida. “Class” includes A/B/C/D/E. “Symbol” indicates gene symbol.
Conclusions and Limitations
Floral pattern formation is an extremely complex process that depends on the interplay of many different genes. Until now, only a few genes have been linked with flower development, and the functions of many A. thaliana genes are still unknown. More genes involved in flower development need to be identified to better understand the molecular regulation mechanism of floral pattern formation in A. thaliana.
This study aimed to find potential new genes underlying floral pattern formation in A. thaliana by combining gene expression data, PPIs and phylogenetic information. The results showed that the genes involved in this process could be classified into seven modules with different functions, of which the brown and magenta modules were significantly correlated with floral organ morphogenesis. By examining the modules with different types of PPI information, we assigned biological meaning to each module and found that the PPI networks were consistent with the regulatory relationships proposed by the ABCDE model.
The results also suggested that the most likely new genes involved in floral pattern formation in A. thaliana were FY, CBL2, ZFN3, and AT1G77370. FY and CBL2 acted as upstream regulators of AP2. ZFN3 possibly activated the flower primordium-determining genes AP1 and AP2 through HY5/HYH via photoinduction. AT1G77370 exhibited a function in floral morphogenesis similar to that of ELF3. RFC3 possibly formed a complex with RPS15 in the cytoplasm to regulate TSO1 and CPSF160 in the nucleus and thereby control floral organ morphogenesis. This process might also be fine-tuned by AT5G53360 in the nucleus. Taking some previous results into account (O'Maoileidigh et al., 2014), we inferred a possible pathway describing the molecular regulation mechanism among these genes/proteins in floral pattern formation in A. thaliana (see Figure 5).
Figure 5. The integrated pathway of floral pattern formation in Arabidopsis thaliana. Dotted lines indicate indirect interactions. Some of the proteins/genes that combine with AP2 are taken from the literature (O'Maoileidigh et al., 2014).
In general, in silico methods inevitably produce some false positives. The novel PPIs and related protein functions inferred here from the module-based PPI networks combined with phylogenetic information therefore still need to be validated experimentally in the future.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by the National Natural Science Foundation of China (No. 91130009 and No. 11475273), the Fundamental Research Funds for the Central Universities of China (No. 141gjc06), and the Guangdong Teaching Reform Project of Higher Education (Undergraduate) (No. GDJG20142022).
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/article/10.3389/fpls.2015.00829
References
Azpeitia, E., Davila-Velderrain, J., Villarreal, C., and Alvarez-Buylla, E. R. (2014). Gene regulatory network models for floral organ determination. Methods Mol. Biol. 1110, 441–469. doi: 10.1007/978-1-4614-9408-9_26
Barsky, A., Gardy, J. L., Hancock, R. E., and Munzner, T. (2007). Cerebral: a Cytoscape plugin for layout of and interaction with biological networks using subcellular localization annotation. Bioinformatics 23, 1040–1042. doi: 10.1093/bioinformatics/btm057
Becker, A., and Theissen, G. (2003). The major clades of MADS-box genes and their role in the development and evolution of flowering plants. Mol. Phylogenet. Evol. 29, 464–489. doi: 10.1016/S1055-7903(03)00207-0
Bemis, S. M., Lee, J. S., Shpak, E. D., and Torii, K. U. (2013). Regulation of floral patterning and organ identity by Arabidopsis ERECTA-family receptor kinase genes. J. Exp. Bot. 64, 5323–5333. doi: 10.1093/jxb/ert270
Bindea, G., Mlecnik, B., Hackl, H., Charoentong, P., Tosolini, M., Kirilovsky, A., et al. (2009). ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25, 1091–1093. doi: 10.1093/bioinformatics/btp101
Chae, E., Tan, Q. K., Hill, T. A., and Irish, V. F. (2008). An Arabidopsis F-box protein acts as a transcriptional co-factor to regulate floral development. Development 135, 1235–1245. doi: 10.1242/dev.015842
Chatr-Aryamontri, A., Breitkreutz, B. J., Heinicke, S., Boucher, L., Winter, A., Stark, C., et al. (2013). The BioGRID interaction database: 2013 update. Nucleic Acids Res. 41, D816–D823. doi: 10.1093/nar/gks1158
Chen, H., Shen, Y., Tang, X., Yu, L., Wang, J., Guo, L., et al. (2006). Arabidopsis CULLIN4 forms an E3 ubiquitin ligase with RBX1 and the CDD complex in mediating light control of development. Plant Cell 18, 1991–2004. doi: 10.1105/tpc.106.043224
Colombo, L., Franken, J., Koetje, E., van Went, J., Dons, H. J., Angenent, G. C., et al. (1995). The petunia MADS box gene FBP11 determines ovule identity. Plant Cell 7, 1859–1868. doi: 10.1105/tpc.7.11.1859
Covington, M. F., Panda, S., Liu, X. L., Strayer, C. A., Wagner, D. R., and Kay, S. A. (2001). ELF3 modulates resetting of the circadian clock in Arabidopsis. Plant Cell 13, 1305–1315. doi: 10.1105/tpc.13.6.1305
Devoto, A., Nieto-Rostro, M., Xie, D., Ellis, C., Harmston, R., Patrick, E., et al. (2002). COI1 links jasmonate signalling and fertility to the SCF ubiquitin-ligase complex in Arabidopsis. Plant J. 32, 457–466. doi: 10.1046/j.1365-313X.2002.01432.x
Engelhardt, B. E., Jordan, M. I., Srouji, J. R., and Brenner, S. E. (2011). Genome-scale phylogenetic function annotation of large and diverse protein families. Genome Res. 21, 1969–1980. doi: 10.1101/gr.104687.109
Espinosa-Soto, C., Immink, R. G. H., Angenent, G. C., Alvarez-Buylla, E. R., and de Folter, S. (2014). Tetramer formation in Arabidopsis MADS domain proteins: analysis of a protein-protein interaction network. BMC Syst. Biol. 8:9. doi: 10.1186/1752-0509-8-9
Guo, Y., Yu, L., Wen, Z., and Li, M. (2008). Using support vector machine combined with auto covariance to predict protein-protein interactions from protein sequences. Nucleic Acids Res. 36, 3025–3030. doi: 10.1093/nar/gkn159
Hartmann, U., Höhmann, S., Nettesheim, K., Wisman, E., Saedler, H., and Huijser, P. (2000). Molecular cloning of SVP: a negative regulator of the floral transition in Arabidopsis. Plant J. 21, 351–360. doi: 10.1046/j.1365-313x.2000.00682.x
Hepworth, S. R., Klenz, J. E., and Haughn, G. W. (2006). UFO in the Arabidopsis inflorescence apex is required for floral-meristem identity and bract suppression. Planta 223, 769–778. doi: 10.1007/s00425-005-0138-3
Herr, A. J., Molnàr, A., Jones, A., and Baulcombe, D. C. (2006). Defective RNA processing enhances RNA silencing and influences flowering of Arabidopsis. Proc. Natl. Acad. Sci. U.S.A. 103, 14994–15001. doi: 10.1073/pnas.0606536103
Hua, J., Sakai, H., Nourizadeh, S., Chen, Q. G., Bleecker, A. B., Ecker, J. R., et al. (1998). EIN4 and ERS2 are members of the putative ethylene receptor gene family in Arabidopsis. Plant Cell 10, 1321–1332. doi: 10.1105/tpc.10.8.1321
Kao, A. L., Chang, T. Y., Chang, S. H., Su, J. C., and Yang, C. C. (2005). Characterization of a novel Arabidopsis protein family AtMAPR homologous to 25-Dx/IZAg/Hpr6.6 proteins. Bot. Bull. Acad. Sin. 46, 107–118.
Kerrien, S., Aranda, B., Breuza, L., Bridge, A., Broackes-Carter, F., Chen, C., et al. (2012). The IntAct molecular interaction database in 2012. Nucleic Acids Res. 40, D841–D846. doi: 10.1093/nar/gkr1088
Kolmos, E., Herrero, E., Bujdoso, N., Millar, A. J., Tóth, R., Gyula, P., et al. (2011). A reduced-function allele reveals that EARLY FLOWERING3 repressive action on the circadian clock is modulated by phytochrome signals in Arabidopsis. Plant Cell 23, 3230–3246. doi: 10.1105/tpc.111.088195
Krogan, N. T., Hogan, K., and Long, J. A. (2012). APETALA2 negatively regulates multiple floral organ identity genes in Arabidopsis by recruiting the co-repressor TOPLESS and the histone deacetylase HDA19. Development 139, 4180–4190. doi: 10.1242/dev.085407
Krouk, G., Lingeman, J., Colon, A. M., Coruzzi, G., and Shasha, D. (2013). Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol. 14:123. doi: 10.1186/gb-2013-14-6-123
Lamesch, P., Berardini, T. Z., Li, D., Swarbreck, D., Wilks, C., Sasidharan, R., et al. (2012). The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 40, D1202–D1210. doi: 10.1093/nar/gkr1090
Lee, J., He, K., Stolc, V., Lee, H., Figueroa, P., Gao, Y., et al. (2007). Analysis of transcription factor HY5 genomic binding sites revealed its hierarchical role in light regulation of development. Plant Cell 19, 731–749. doi: 10.1105/tpc.106.047688
Li, D., Liu, C., Shen, L., Wu, Y., Chen, H., Robertson, M., et al. (2008). A repressor complex governs the integration of flowering signals in Arabidopsis. Dev. Cell 15, 110–120. doi: 10.1016/j.devcel.2008.05.002
Licata, L., Briganti, L., Peluso, D., Perfetto, L., Iannuccelli, M., Galeota, E., et al. (2012). MINT, the molecular interaction database: 2012 update. Nucleic Acids Res. 40, D857–D861. doi: 10.1093/nar/gkr930
Liu, X. L., Covington, M. F., Fankhauser, C., Chory, J., and Wagner, D. R. (2001). ELF3 encodes a circadian clock-regulated nuclear protein that functions in an Arabidopsis PHYB signal transduction pathway. Plant Cell 13, 1293–1304. doi: 10.1105/tpc.13.6.1293
Mallona, I., Lischewski, S., Weiss, J., Hause, B., and Egea-Cortines, M. (2010). Validation of reference genes for quantitative real-time PCR during leaf and flower development in Petunia hybrida. BMC Plant Biol. 10:4. doi: 10.1186/1471-2229-10-4
Martinez-Castilla, L. P., and Alvarez-Buylla, E. R. (2003). Adaptive evolution in the Arabidopsis MADS-box gene family inferred from its complete resolved phylogeny. Proc. Natl. Acad. Sci. U.S.A. 100, 13407–13412. doi: 10.1073/pnas.1835864100
Nozawa, A., Koizumi, N., and Sano, H. (2001). An Arabidopsis SNF1-related protein kinase, AtSR1, interacts with a calcium-binding protein, AtCBL2, of which transcripts respond to light. Plant Cell Physiol. 42, 976–981. doi: 10.1093/pcp/pce126
Pitre, S., Dehne, F., Chan, A., Cheetham, J., Duong, A., Emili, A., et al. (2006). PIPE: a protein-protein interaction prediction engine based on the re-occurring short polypeptide sequences between known interacting protein pairs. BMC Bioinformatics 7:365. doi: 10.1186/1471-2105-7-365
Ranea, J. A., Yeats, C., Grant, A., and Orengo, C. A. (2007). Predicting protein function with hierarchical phylogenetic profiles: the Gene3D Phylo-Tuner method applied to eukaryotic genomes. PLoS Comput. Biol. 3:e237. doi: 10.1371/journal.pcbi.0030237
Sanchez-Corrales, Y. E., Alvarez-Buylla, E. R., and Mendoza, L. (2010). The A. thaliana flower organ specification gene regulatory network determines a robust differentiation process. J. Theor. Biol. 264, 971–983. doi: 10.1016/j.jtbi.2010.03.006
Sheldon, C. C., Burn, J. E., Perez, P. P., Metzger, J., Edwards, J. A., Peacock, W. J., et al. (1999). The FLF MADS box gene: a repressor of flowering in Arabidopsis regulated by vernalization and methylation. Plant Cell 11, 445–458. doi: 10.1105/tpc.11.3.445
Shen, J., Zhang, J., Luo, X., Zhu, W., Yu, K., Chen, K., et al. (2007). Predicting protein-protein interactions based only on sequences information. Proc. Natl. Acad. Sci. U.S.A. 104, 4337–4341. doi: 10.1073/pnas.0607879104
Sibout, R., Sukumar, P., Hettiarachchi, C., Holm, M., Muday, G. K., and Hardtke, C. S. (2006). Opposite root growth phenotypes of hy5 versus hy5 hyh mutants correlate with increased constitutive auxin signaling. PLoS Genet. 2:e202. doi: 10.1371/journal.pgen.0020202
Simpson, G. G., Quesada, V., Henderson, I. R., Dijkwel, P. P., Macknight, R., and Dean, C. (2004). RNA processing and Arabidopsis flowering time control. Biochem. Soc. Trans. 32, 565–566. doi: 10.1042/BST0320565
Tanz, S. K., Castleden, I., Hooper, C. M., Vacher, M., Small, I., and Millar, H. A. (2013). SUBA3: a database for integrating experimentation and prediction to define the SUBcellular location of proteins in Arabidopsis. Nucleic Acids Res. 41, D1185–D1191. doi: 10.1093/nar/gks1151
Valdés, A. E., Rizzardi, K., Johannesson, H., Para, A., Sundås-Larsson, A., and Landberg, K. (2012). A. thaliana TERMINAL FLOWER2 is involved in light-controlled signalling during seedling photomorphogenesis. Plant Cell Environ. 35, 1013–1025. doi: 10.1111/j.1365-3040.2011.02468.x
Wang, X., Feng, S., Nakayama, N., Crosby, W. L., Irish, V., Deng, X. W., et al. (2003). The COP9 signalosome interacts with SCF UFO and participates in Arabidopsis flower development. Plant Cell 15, 1071–1082. doi: 10.1105/tpc.009936
Wang, Z., Xing, S., Birkenbihl, R. P., and Zachgo, S. (2009). Conserved functions of Arabidopsis and rice CC-type glutaredoxins in flower development and pathogen response. Mol. Plant 2, 323–335. doi: 10.1093/mp/ssn078
Wellmer, F., Riechmann, J. L., Alves-Ferreira, M., and Meyerowitz, E. M. (2004). Genome-wide analysis of spatial gene expression in Arabidopsis flowers. Plant Cell 16, 1314–1326. doi: 10.1105/tpc.021741
Willis, R. C., and Hogue, C. W. (2006). Searching, viewing, and visualizing data in the Biomolecular Interaction Network Database (BIND). Curr. Protoc. Bioinformatics Chapter 8, Unit 8.9. doi: 10.1002/0471250953.bi0809s12
Xia, S., Zhu, Z., Hao, L., Chen, J. G., Xiao, L., Zhang, Y., et al. (2009). Negative regulation of systemic acquired resistance by replication factor C subunit3 in Arabidopsis. Plant Physiol. 150, 2009–2017. doi: 10.1104/pp.109.138321
Xie, D. X., Feys, B. F., James, S., Nieto-Rostro, M., and Turner, J. G. (1998). COI1: an Arabidopsis gene required for jasmonate-regulated defense and fertility. Science 280, 1091–1094. doi: 10.1126/science.280.5366.1091
Xu, R., Zhao, H., Dinkins, R. D., Cheng, X., Carberry, G., and Li, Q. Q. (2006). The 73 kD subunit of the cleavage and polyadenylation specificity factor (CPSF) complex affects reproductive development in Arabidopsis. Plant Mol. Biol. 61, 799–815. doi: 10.1007/s11103-006-0051-6
Yamaguchi, T., Nagasawa, N., Kawasaki, S., Matsuoka, M., Nagato, Y., and Hirano, H. Y. (2004). The YABBY gene DROOPING LEAF regulates carpel specification and midrib development in Oryza sativa. Plant Cell 16, 500–509. doi: 10.1105/tpc.018044
Keywords: Arabidopsis thaliana, floral pattern formation, systems biology, co-expression, protein-protein interactions
Citation: Xie W, Huang J, Liu Y, Rao J, Luo D and He M (2015) Exploring potential new floral organ morphogenesis genes of Arabidopsis thaliana using systems biology approach. Front. Plant Sci. 6:829. doi: 10.3389/fpls.2015.00829
Received: 19 December 2014; Accepted: 22 September 2015; Published: 13 October 2015.
Edited by: David Toubiana, Ben Gurion University, Israel
Reviewed by: Sudip Kundu, University of Calcutta, India; Chuang Ma, Northwest Agricultural and Forestry University, China
Copyright © 2015 Xie, Huang, Liu, Rao, Luo and He. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Da Luo and Miao He, School of Life Sciences, Sun Yat-sen University, No. 135 West Xingang RD, Guangzhou 510275, Guangdong, China, firstname.lastname@example.org; email@example.com