Escape from Preferential Retention Following Repeated Whole Genome Duplications in Plants

Schnable, James  C; Wang, Xiaowu; Pires, J.  Chris; Freeling, Michael

doi:10.3389/fpls.2012.00094

ORIGINAL RESEARCH article

Front. Plant Sci., 15 May 2012

Sec. Plant Genetics and Genomics

Volume 3 - 2012 | https://doi.org/10.3389/fpls.2012.00094

James C. Schnable ¹
Xiaowu Wang ²
J. Chris Pires ³
Michael Freeling ¹*

1. Freeling Lab, Plant and Microbial Biology, University of California – Berkeley, Berkeley, CA, USA
2. Molecular Genetics Lab, Biotechnology Department, Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, China
3. Biological Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA

Abstract

The well-supported gene dosage hypothesis predicts that genes encoding proteins engaged in dose–sensitive interactions cannot be reduced back to single copies once all interacting partners are simultaneously duplicated in a whole genome duplication. The genomes of extant flowering plants are the result of many sequential rounds of whole genome duplication, yet the fraction of genomes devoted to encoding complex molecular machines does not increase as fast as expected through multiple rounds of whole genome duplication. Using parallel interspecies genomic comparisons in the grasses and crucifers, we demonstrate that genes retained as duplicates following a whole genome duplication have only a 50% chance of being retained as duplicates in a second whole genome duplication. Genes which fractionated to a single copy following a second whole genome duplication tend to be the member of a gene pair with the less complex promoter, the lower level of expression, and the lower level of purifying selection. We suggest the copy with lower levels of expression and less purifying selection contributes less to effective gene-product dosage and therefore is under less dosage constraint in future whole genome duplications, providing an explanation for why flowering plant genomes are not overrun with subunits of large dose–sensitive protein complexes.

Introduction

Plants have been colorfully labeled the “big kahuna of polyploidization” (Sémon and Wolfe, 2007). The lineages leading to the two preeminent models for plant genetics – Arabidopsis (a eudicot) and maize (a monocot) – each show evidence of multiple independent whole genome duplications (Figure 1) since monocots and eudicots diverged approximately 120 million years ago (Soltis et al., 2009). Recent evidence suggests at least two additional, shared, whole genome duplications prior to the monocot/eudicot split (Jiao et al., 2011). The cumulative ploidy numbers relative to a pre-seed plant ancestor are listed in parentheses in Figure 1. Whole genome duplication creates duplicate, potentially redundant, copies of all the genes within a genome. The loss of these duplicate copies from the genomes of ancient polyploid species is known as fractionation (Langham et al., 2004) and – over evolutionary time scales – the majority of genes duplicated by polyploidy will be reduced back to a single copy. If fractionation did not occur, an ancestral genome of 10,000 genes would grow to an unrealistically large 640,000 genes in maize, and 1.44 million genes in Brassica rapa.
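The two totals quoted above can be reproduced by multiplying the copy-number multiplier of each whole genome duplication along a lineage. The following is a minimal sketch, not part of the original analysis. It assumes that every event in Table 1 is a doubling except the gamma and Brassica hexaploidies, which are treated as triplications, and the assignment of events to the maize and B. rapa lineages is our reading of Table 1 and Figure 1. The function and dictionary names are hypothetical.

```python
from math import prod

# Copy-number multiplier per event (2 = duplication, 3 = triplication).
# Event names follow Table 1; the lineage assignments are an assumption.
MAIZE_EVENTS = {
    "Pre-seed plant": 2,
    "Pre-flowering plant": 2,
    "Sigma1": 2,
    "Sigma2": 2,
    "Pre-grass/Rho": 2,
    "Maize lineage WGD": 2,
}
BRASSICA_EVENTS = {
    "Pre-seed plant": 2,
    "Pre-flowering plant": 2,
    "Gamma/pre-eudicot hexaploidy": 3,
    "Beta": 2,
    "Alpha": 2,
    "Brassica hexaploidy": 3,
}


def genes_without_fractionation(ancestral_genes, events):
    """Gene count expected if no duplicate were ever lost."""
    return ancestral_genes * prod(events.values())


if __name__ == "__main__":
    print(genes_without_fractionation(10_000, MAIZE_EVENTS))     # 640000
    print(genes_without_fractionation(10_000, BRASSICA_EVENTS))  # 1440000
```

The products of the multipliers (64 for maize, 144 for B. rapa) correspond to the cumulative ploidy levels relative to the pre-seed plant ancestor.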

Figure 1

Some classes of genes, particularly those encoding organelle proteins, preferentially revert to single copy status following whole genome duplications (Duarte et al., 2010). However, other classes of genes – such as subunits of large multiprotein complexes, transcription factors, and signal transduction machinery – tend to resist fractionation following whole genome duplication (Blanc and Wolfe, 2004; Seoighe and Gehring, 2004; Maere et al., 2005). This observation has been explained by the Gene Dosage Hypothesis (Birchler and Veitia, 2007), which predicts that fractionation of genes encoding proteins involved in dose–sensitive interactions will be selected against, as the loss of either gene copy is expected to throw the dosage of that gene pair’s product out of balance with its interaction partners, which also tend to remain duplicated. The influence of gene dosage constraints on post-tetraploidy genome evolution has been well reviewed (Sémon and Wolfe, 2007; Edger and Pires, 2009; Freeling, 2009; Birchler and Veitia, 2010). A previous study of multiple sequential tetraploidies in the Arabidopsis lineage found a general tendency for genes retained following one tetraploidy to also be retained following a second one (Seoighe and Gehring, 2004).

Since the divergence of the Arabidopsis and grape lineages, Arabidopsis has experienced two additional rounds of whole genome duplication. The rate of duplicate gene retention for transcription factors after single polyploidies has been observed to be approximately 25% (Blanc and Wolfe, 2004; Seoighe and Gehring, 2004). If no mitigation of gene dosage occurred, our expectation after two rounds of whole genome duplication is that Arabidopsis should contain approximately 156% as many transcription factor encoding genes as grape (1.25 × 1.25 ≈ 1.56). However, a detailed annotation of transcription factors using conserved protein domains found that the number of transcription factors in the Arabidopsis genome is only 25.4% greater than the number found in grape (Lang et al., 2010). The fitness cost of changes in relative gene dosage must, to some extent, be mitigated over multiple whole genome duplications or the genomes of plants would long ago have become over-burdened with genes encoding life’s most complicated machines.
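The expectation in this paragraph is simple compounding, and a short sketch makes the gap between expected and observed counts explicit. It assumes each round is a duplication in which a fixed fraction of genes stays duplicated, so the gene count is multiplied by (1 + retention rate) per round. The function names are hypothetical, and the "implied" per-round rate is only illustrative arithmetic under the additional assumption that both rounds behaved identically.

```python
def expected_relative_count(retention_rate, rounds):
    """Gene count relative to the pre-duplication ancestor.

    Assumes every round is a duplication after which `retention_rate`
    of genes remain as pairs and the rest return to single copy.
    """
    return (1.0 + retention_rate) ** rounds


def implied_retention_rate(relative_count, rounds):
    """Per-round retention rate that would produce `relative_count`."""
    return relative_count ** (1.0 / rounds) - 1.0


if __name__ == "__main__":
    expected = expected_relative_count(0.25, 2)  # 1.5625, i.e. about 156%
    observed = 1.254                             # 25.4% more than grape (Lang et al., 2010)
    print(f"expected: {expected:.4f}, observed: {observed:.3f}")
    print(f"implied per-round retention: {implied_retention_rate(observed, 2):.3f}")
```

Under these assumptions the observed count corresponds to a per-round retention of roughly 12% rather than 25%.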

This paper provides evidence that duplicate genes do not equally maintain their progenitor’s preference for duplicate gene retention. Duplicate genes produced by whole genome duplication are not equivalent. Parental genomes originating from different species within a polyploid almost immediately differentiate into dominant and non-dominant subgenomes (Chang et al., 2010), and these expression differences are preserved for millions of years (Flagel and Wendel, 2010; Schnable et al., 2011a). Bias in gene loss between duplicate regions (fractionation bias) has been observed in Arabidopsis (Thomas et al., 2006) and maize (Woodhouse et al., 2010) and seems to be a general rule for whole genome duplications ranging from Paramecium to fish (Sankoff et al., 2010). Bias in fractionation and genome dominance are linked because it is expected that genes on the underexpressed, non-dominant subgenome simply matter less to purifying selection and dosage constraints (Schnable et al., 2011a). In maize, genes with known mutant phenotypes are indeed preferentially found on the dominant subgenome (Schnable and Freeling, 2011). As bias in expression predicts which subgenome will experience more fractionation following polyploidy, either subgenome identity or the expression patterns of individual gene pairs may also predict which copy of a duplicate gene pair will be more prone to duplicate gene retention in future polyploidies.

We addressed the issue of mitigation of gene dosage constraints with two experimental systems: the grasses and the crucifers. Both clades have roughly parallel histories of polyploidy among species with sequenced genomes (Figure 1; Table 1). Both grasses and crucifers contain a more ancient whole genome duplication which is shared by all sequenced species in the clade (Bowers et al., 2003; Paterson et al., 2004), and in both clades one well-studied species with a sequenced genome has experienced a second, subsequent whole genome duplication – maize in the grasses (Gaut and Doebley, 1997) and B. rapa in the crucifers (Lysak et al., 2005). In both cases many duplicate genes retained from the older clade-wide polyploidy did not retain additional duplicate copies in the subsequent lineage-specific polyploidy. Therefore we were able to carry out parallel experiments to identify characteristics associated with preferential retention. It was possible to control, to some extent, for the effect of protein function by focusing on pairs of duplicate genes retained in the clade-wide polyploidy which had different fates in the subsequent lineage-specific polyploidy. A model is proposed to explain how the duplicate copies of dose–sensitive genes escape preferential retention in later polyploidies.

Table 1

Footnote ID from Figure 1	One name (often of many)	One citation (often of many)
1	Pre-seed plant	Jiao et al. (2011)
2	Pre-flowering plant	Jiao et al. (2011)
3	Sigma1	Tang et al. (2010)
4	Sigma2	Tang et al. (2010)
5	Pre-grass/Rho	Paterson et al. (2004)
6	Maize Lineage WGD	Gaut and Doebley (1997)
7	Gamma/pre-eudicot hexaploidy	Jaillon et al. (2007)
8	Beta	Bowers et al. (2003)
9	Alpha	Bowers et al. (2003)
10	Brassica hexaploidy	Lysak et al. (2005)

Whole genome duplications.

Materials and Methods

Data sources

The genome assemblies and annotation used in this study were TAIR 10 (Arabidopsis thaliana), Arabidopsis lyrata v1.0 (Hu et al., 2011), the initial release of the B. rapa genome (The Brassica rapa Genome Sequencing Project Consortium, 2011), MSU 6 (Oryza sativa; Goff et al., 2002), Sorghum bicolor 1.4 (Paterson et al., 2009), and B73_refgen1 (Zea mays; Schnable et al., 2009).

Gene pair identification

Orthologous genes between A. thaliana and A. lyrata were identified using SynMap (Lyons et al., 2008) with QuotaAlign settings of 1:1 (Tang et al., 2011). Arabidopsis–Brassica orthologous relationships were taken from Tang et al. (2012). All orthologous and homeologous relationships between grass species are those published in Schnable et al. (2012).

Expression calculations

Gene expression levels were calculated using previously published RNA-seq data from wild type seedlings of A. thaliana (SRX019140: 44.7 million reads; Deng et al., 2010) and rice (SRX020118: 8.9 million reads; Zemach et al., 2010). These datasets were selected because, at the time these analyses were originally conducted, they represented the RNA-seq experiments with the most sequencing depth for these two species deposited in the Sequence Read Archive. Reads were aligned to reference genomes using Bowtie (Langmead et al., 2009) and gene expression levels were quantified using Cufflinks (Trapnell et al., 2010). Bowtie does not perform spliced alignments, which means some reads from regions of mRNA molecules which span exon junctions were not recovered in our analysis. However, given that homeologous genes will in almost all cases possess the same intron–exon structure, any bias introduced by this approach will be equivalent between gene copies.

Measuring purifying selection

Synonymous and non-synonymous substitution rates were calculated using the synonymous_calculation package included with bio-pipeline¹ using the Nei–Gojobori method (Nei and Gojobori, 1986). All other settings remained as default.

Identification of Rice CNSs

Rice CNSs were identified using version 3 of the CNS Discovery pipeline² (Schnable et al., 2011b).

Statistics

p-Values for the difference in retention frequencies between singleton genes and homeologously paired genes were calculated using Fisher’s Exact Test. In the crucifers, Arabidopsis genes with two or three retained co-orthologs in B. rapa were grouped together as “retained.”
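The test described here can be sketched in a few lines of standard-library Python. This is an illustrative reimplementation, not the code used in the study: the function names are hypothetical, the two-sided p-value is computed by summing the probabilities of all tables no more likely than the observed one (one common convention), and the counts in the usage example are made-up placeholders rather than values from the paper.

```python
from math import comb


def is_retained(n_coorthologs):
    """Crucifer grouping rule from the text: two or three B. rapa
    co-orthologs both count as "retained"."""
    return n_coorthologs >= 2


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows: homeologously paired genes / singleton genes.
    Columns: retained / not retained in the later polyploidy.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical co-ortholog counts, for illustration only.
    paired = [3, 2, 2, 1, 2, 1, 3, 2]
    singleton = [1, 1, 2, 1, 1, 1, 1, 2]
    a = sum(is_retained(n) for n in paired)
    c = sum(is_retained(n) for n in singleton)
    print(fisher_exact_two_sided(a, len(paired) - a, c, len(singleton) - c))
```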

Results

Genes syntenically conserved through the crucifers or grasses were categorized as (1) those without a homeologous duplicate from the older polyploidy in each lineage and (2) those with a retained homeolog from the older polyploidy in each lineage. In the crucifer lineage, the older tetraploidy is Arabidopsis lineage alpha (23–40 MYA); in the Poales, the earlier tetraploidy is “pre-grass” (about 70 MYA; Figure 1). In crucifers, these genes were classified by the number of co-orthologs conserved in B. rapa after the hexaploidy shared by all Brassica species (Figure 2A). In grasses, genes were classified by whether maize retained only one or both co-orthologs following the more recent tetraploidy of the Zea/Tripsacum lineage (Figure 2B). Retention in older polyploidies does predict retention in future polyploidies (p < 2.2 × 10⁻¹⁶ for both crucifers and grasses), as previously shown in Arabidopsis (Seoighe and Gehring, 2004). However, in both experiments approximately half of the genes previously retained as a duplicate pair in the older whole genome duplication – and therefore presumed to be sensitive to changes in gene dosage – fractionated to a single copy in the more recent whole genome duplication.

Figure 2

The crucifer dataset consisted of 817 Arabidopsis gene pairs where one copy was orthologous to only a single gene in B. rapa and the other possessed either two or three co-orthologs (Data Sheet S1 in Supplementary Material). The grass dataset consisted of 407 gene pairs conserved in both rice and sorghum where one copy was orthologous to only a single gene in maize, its duplicate having been fractionated, and the other was represented by two co-orthologs in maize (Data Sheet S2 in Supplementary Material). Gene pairs resulting from more ancient whole genome duplications were identified and removed, as these tend to introduce confounding factors. Members of gene pairs were assigned to under- and over-fractionated subgenomes using differences in the number of genes syntenically retained in multiple species between homeologous regions of the rice and Arabidopsis genomes (Schnable et al., 2011a, 2012). In both datasets, the analysis of the relative levels of RNA encoded by duplicate gene pairs – measured by RNA-seq – was carried out in an outgroup lineage which shared only the older clade-wide polyploidy. In the grasses we used the expression of syntenic orthologs in rice and in the crucifers syntenic orthologs in A. thaliana (see Materials and Methods). The relative levels of purifying selection acting on each member of a gene pair were also compared using the ratio of non-synonymous substitutions to synonymous substitutions between orthologous genes in A. thaliana and A. lyrata (for the crucifers) and between rice and sorghum (for the grasses; see Materials and Methods). Promoter complexity, as measured by number of conserved non-coding sequences, has previously been shown to influence the odds a gene will be retained as a duplicate pair following polyploidy in the grasses (Schnable et al., 2011b). Gene pairs were therefore also sorted based on number of conserved non-coding sequences in the grasses, and on total quantity of upstream non-transposon sequence in Arabidopsis; this length is a crude proxy for promoter complexity and has previously been shown to correlate with complexity of gene expression patterns (Sun et al., 2010).

All four potential markers examined showed significant power to predict which copy of a homeologous gene pair would be more resistant to fractionation in subsequent whole genome duplications (Figure 3). In general the gene copy retained in duplicate tended to be the more highly expressed copy, to show evidence of greater purifying selection, and to be associated with greater amounts of non-coding regulatory sequence. These genes also tended to be located on the dominant subgenome.

Figure 3

Discussion

Following polyploidy, a genome possesses two or more homeologous genes, each with the same coding sequence and regulatory elements. Yet these gene copies can immediately show very different patterns of expression (Flagel et al., 2008; Buggs et al., 2011). It has been proposed that the deletion of the less expressed copy of a gene following polyploidy is more likely to be selectively neutral (Schnable and Freeling, 2011; Schnable et al., 2011a). When combined with the observation that expression levels are unequal between parental subgenomes in allotetraploids (Chang et al., 2010; Flagel and Wendel, 2010; Schnable et al., 2011a), this model may explain the fractionation bias which has been found in ancient polyploid species (Schnable et al., 2011a).

Here we have shown that the dominant gene copy – more expressed, under higher purifying selection, associated with more regulatory sequence – of a homeologous gene pair is more likely to retain the ancestral characteristic of preferential retention of duplicate copies in subsequent polyploidies. A number of explanations could be proposed for the link between expression and future resistance to fractionation. We propose a model based on the same expression-based reasoning that predicts fractionation bias between parental subgenomes. If all the co-orthologs of a single ancestral gene contribute to a single pool of gene-product, the loss of less expressed gene copies would result in the smallest change in total gene-product dosage. If the total expression of a group of homeologous genes is constrained in either relative or absolute terms (Bekaert et al., 2011), smaller changes in total gene-product dosage – created by the loss of a less expressed gene copy – are predicted to be more often selectively neutral, and therefore more common (Figure 4). This model also predicts that, for gene pairs in A. thaliana where only one copy possesses any orthologous genes in B. rapa, it should more often be the more expressed copy, as is indeed the case (Table A1 in Appendix).
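The pooled gene-product argument can be made concrete with a small calculation. The sketch below is only an illustration of the reasoning, not a model fitted in this study: the function name, the copy labels, and the expression values are all hypothetical, and it assumes that every copy contributes to one shared pool in proportion to its expression level.

```python
def dosage_change_on_loss(expression):
    """Fractional drop in pooled gene-product if each copy were deleted.

    `expression` maps a copy label to its expression level (e.g. FPKM).
    Assumes all copies feed a single shared pool of gene-product.
    """
    total = sum(expression.values())
    return {copy: level / total for copy, level in expression.items()}


if __name__ == "__main__":
    # Hypothetical homeologous pair: a dominant and a non-dominant copy.
    pair = {"dominant": 30.0, "non_dominant": 10.0}
    print(dosage_change_on_loss(pair))

    # Hypothetical state after a further duplication of both copies,
    # with each new copy carrying half of its parent's expression.
    quartet = {"dominant_1": 15.0, "dominant_2": 15.0,
               "non_dominant_1": 5.0, "non_dominant_2": 5.0}
    print(dosage_change_on_loss(quartet))
```

In the first case, losing the non-dominant copy removes a quarter of the pool while losing the dominant copy removes three quarters. In the second, losing either non-dominant copy removes only an eighth, which illustrates why the least expressed copies are expected to be the most dispensable.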

Figure 4

When combined with previous results linking genome dominance with biased fractionation (Chang et al., 2010; Schnable et al., 2011a), our results suggest the Gene Dosage Hypothesis could perhaps be better thought of as the Gene-Product Dosage Hypothesis, in that it can generally be considered to act on the concentration of the proteins encoded by duplicate genes, not gene copy number itself. Even when both copies of a gene are retained following whole genome duplication, the less expressed copy will often be lost in subsequent whole genome duplications. Furthermore, the more duplicate copies of a gene are found within a genome, the less each individual copy contributes to total expression and the more likely it becomes that the loss of individual copies can be tolerated. In other words, the protection against fractionation provided by selection for gene dosage – either absolute or relative – becomes less powerful the less a given gene copy contributes to total expression, and the more total gene copies are present within the genome. This explains, at least in part, why despite being the “big kahuna” of whole genome duplications, plant genomes are not over-burdened with subunits of large dose-sensitive protein complexes.

Supplementary Material

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Plant_Genetics_and_Genomics/10.3389/fpls.2012.00094/abstract

Statements

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Footnotes

1. https://github.com/tanghaibao/bio-pipeline/

2. https://github.com/gturco/find_cns

References

Bekaert, M., Edger, P. P., Pires, J. C., and Conant, G. C. (2011). Two-phase resolution of polyploidy in the Arabidopsis metabolic network gives rise to relative and absolute dosage constraints. Plant Cell 23, 1719–1728. doi: 10.1105/tpc.110.081281
Birchler, J. A., and Veitia, R. A. (2007). The gene balance hypothesis: from classical genetics to modern genomics. Plant Cell 19, 395–402. doi: 10.1105/tpc.106.049338
Birchler, J. A., and Veitia, R. A. (2010). The gene balance hypothesis: implications for gene regulation, quantitative traits and evolution. New Phytol. 186, 54–62. doi: 10.1111/j.1469-8137.2009.03087.x
Blanc, G., and Wolfe, K. H. (2004). Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16, 1679–1691. doi: 10.1105/tpc.021410
Bowers, J. E., Chapman, B. A., Rong, J., and Paterson, A. H. (2003). Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422, 433–438. doi: 10.1038/nature01521
Buggs, R. J. A., Zhang, L., Miles, N., Tate, J. A., Gao, L., Wei, W., Schnable, P. S., Barbazuk, W. B., Soltis, P. S., and Soltis, D. E. (2011). Transcriptomic shock generates evolutionary novelty in a newly formed, natural allopolyploid plant. Curr. Biol. 21, 551–556. doi: 10.1016/j.cub.2011.02.016
Chang, P. L., Dilkes, B. P., McMahon, M., Comai, L., and Nuzhdin, S. V. (2010). Homeolog-specific retention and use in allotetraploid Arabidopsis suecica depends on parent of origin and network partners. Genome Biol. 11, R125. doi: 10.1186/gb-2010-11-7-125
Deng, X., Lianfeng, G., Chunyan, L., Tiancong, L., Falong, L., Zhike, L., Peng, C., Yanxi, P., Baichen, W., Songnian, H., and Xiaofeng, C. (2010). Arginine methylation mediated by the Arabidopsis homolog of PRMT5 is essential for proper pre-mRNA splicing. Proc. Natl. Acad. Sci. U.S.A. 107, 19114–19119. doi: 10.1073/pnas.1007883107
Duarte, J. M., Wall, P. K., Edger, P. P., Landherr, L. L., Ma, H., Pires, J. C., Leebens-Mack, J., and dePamphilis, C. W. (2010). Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels. BMC Evol. Biol. 10, 61. doi: 10.1186/1471-2148-10-61
Edger, P. P., and Pires, J. C. (2009). Gene and genome duplications: the impact of dosage-sensitivity on the fate of nuclear genes. Chromosome Res. 17, 699–717. doi: 10.1007/s10577-009-9055-9
Flagel, L., Udall, J., Nettleton, D., and Wendel, J. (2008). Duplicate gene expression in allopolyploid Gossypium reveals two temporally distinct phases of expression evolution. BMC Biol. 6, 16. doi: 10.1186/1741-7007-6-16
Flagel, L. E., and Wendel, J. F. (2010). Evolutionary rate variation, genomic dominance and duplicate gene expression evolution during allotetraploid cotton speciation. New Phytol. 186, 184–193. doi: 10.1111/j.1469-8137.2009.03107.x
Freeling, M. (2009). Bias in plant gene content following different sorts of duplication: tandem, whole-genome, segmental, or by transposition. Annu. Rev. Plant Biol. 60, 433–453. doi: 10.1146/annurev.arplant.043008.092122
Gaut, B. S., and Doebley, J. F. (1997). DNA sequence evidence for the segmental allotetraploid origin of maize. Proc. Natl. Acad. Sci. U.S.A. 94, 6809–6814. doi: 10.1073/pnas.94.13.6809
Goff, S. A., Ricke, D., Lan, T. H., Presting, G., Wang, R., Dunn, M., Glazebrook, J., Sessions, A., Oeller, P., Varma, H., Hadley, D., Hutchison, D., Martin, C., Katagiri, F., Lange, B. M., Moughamer, T., Xia, Y., Budworth, P., Zhong, J., Miguel, T., Paszkowski, U., Zhang, S., Colbert, M., Sun, W. L., Chen, L., Cooper, B., Park, S., Wood, T. C., Mao, L., Quail, P., Wing, R., Dean, R., Yu, Y., Zharkikh, A., Shen, R., Sahasrabudhe, S., Thomas, A., Cannings, R., Gutin, A., Pruss, D., Reid, J., Tavtigian, S., Mitchell, J., Eldredge, G., Scholl, T., Miller, R. M., Bhatnagar, S., Adey, N., Rubano, T., Tusneem, N., Robinson, R., Feldhaus, J., Macalma, T., Oliphant, A., and Briggs, S. (2002). A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296, 92–100. doi: 10.1126/science.1068275
Hu, T. T., Pattyn, P., Bakker, E. G., Cao, J., Cheng, J.-F., Clark, R. M., Fahlgren, N., Fawcett, J. A., Grimwood, J., Gundlach, H., Haberer, G., Hollister, J. D., Ossowski, S., Ottilar, R. P., Salamov, A. A., Schneeberger, K., Spannagl, M., Wang, X., Yang, L., Nasrallah, M. E., Bergelson, J., Carrington, J. C., Gaut, B. S., Schmutz, J., Mayer, K. F. X., Van de Peer, Y., Grigoriev, I. V., Nordborg, M., Weigel, D., and Guo, Y.-L. (2011). The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat. Genet. 43, 476–481. doi: 10.1038/ng.875
Jaillon, O., Aury, J. M., Noel, B., Policriti, A., Clepet, C., Casagrande, A., Choisne, N., Aubourg, S., Vitulo, N., Jubin, C., Vezzi, A., Legeai, F., Hugueney, P., Dasilva, C., Horner, D., Mica, E., Jublot, D., Poulain, J., Bruyère, C., Billault, A., Segurens, B., Gouyvenoux, M., Ugarte, E., Cattonaro, F., Anthouard, V., Vico, V., Del Fabbro, C., Alaux, M., Di Gaspero, G., Dumas, V., Felice, N., Paillard, S., Juman, I., Moroldo, M., Scalabrin, S., Canaguier, A., Le Clainche, I., Malacrida, G., Durand, E., Pesole, G., Laucou, V., Chatelet, P., Merdinoglu, D., Delledonne, M., Pezzotti, M., Lecharny, A., Scarpelli, C., Artiguenave, F., Pè, M. E., Valle, G., Morgante, M., Caboche, M., Adam-Blondon, A. F., Weissenbach, J., Quétier, F., Wincker, P., and French-Italian Public Consortium for Grapevine Genome Characterization. (2007). The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449, 463–467. doi: 10.1038/nature06148
Jiao, Y., Wickett, N. J., Ayyampalayam, S., Chanderbali, A. S., Landherr, L., Ralph, P. E., Tomsho, L. P., Hu, Y., Liang, H., Soltis, P. S., Soltis, D. E., Clifton, S. W., Schlarbaum, S. E., Schuster, S. C., Ma, H., Leebens-Mack, J., and dePamphilis, C. W. (2011). Ancestral polyploidy in seed plants and angiosperms. Nature 473, 97–100. doi: 10.1038/nature09916
Lang, D., Weiche, B., Timmerhaus, G., Richardt, S., Riano-Pachon, D. M., Correa, L. G. G., Reski, R., Mueller-Roeber, B., and Rensing, S. A. (2010). Genome-wide phylogenetic comparative analysis of plant transcriptional regulation: a timeline of loss, gain, expansion, and correlation with complexity. Genome Biol. Evol. 2, 488–503. doi: 10.1093/gbe/evq032
Langham, R. J., Walsh, J., Dunn, M., Ko, C., Goff, S. A., and Freeling, M. (2004). Genomic duplication, fractionation and the origin of regulatory novelty. Genetics 166, 935–945. doi: 10.1534/genetics.166.2.935
Langmead, B., Trapnell, C., Pop, M., and Salzberg, S. L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25. doi: 10.1186/gb-2009-10-3-r25
Lyons, E., Pedersen, B., Kane, J., and Freeling, M. (2008). The value of nonmodel genomes and an example using SynMap within CoGe to dissect the hexaploidy that predates the rosids. Trop. Plant Biol. 1, 181–190. doi: 10.1007/s12042-008-9017-y
Lysak, M. A., Koch, M. A., Pecinka, A., and Schubert, I. (2005). Chromosome triplication found across the tribe Brassiceae. Genome Res. 15, 516–525. doi: 10.1101/gr.3531105
Maere, S., De Bodt, S., Raes, J., Casneuf, T., Van Montagu, M., Kuiper, M., and Van De Peer, Y. (2005). Modeling gene and genome duplications in eukaryotes. Proc. Natl. Acad. Sci. U.S.A. 102, 5454–5459. doi: 10.1073/pnas.0501102102
Nei, M., and Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3, 418–426.
Paterson, A. H., Bowers, J. E., Bruggmann, R., Dubchak, I., Grimwood, J., Gundlach, H., Haberer, G., Hellsten, U., Mitros, T., Poliakov, A., Schmutz, J., Spannagl, M., Tang, H., Wang, X., Wicker, T., Bharti, A. K., Chapman, J., Feltus, F. A., Gowik, U., Grigoriev, I. V., Lyons, E., Maher, C. A., Martis, M., Narechania, A., Otillar, R. P., Penning, B. W., Salamov, A. A., Wang, Y., Zhang, L., Carpita, N. C., Freeling, M., Gingle, A. R., Hash, C. T., Keller, B., Klein, P., Kresovich, S., McCann, M. C., Ming, R., Peterson, D. G., Rahman, M., Ware, D., Westhoff, P., Mayer, K. F., Messing, J., and Rokhsar, D. S. (2009). The Sorghum bicolor genome and the diversification of grasses. Nature 457, 551–556. doi: 10.1038/nature07723
Paterson, A. H., Bowers, J. E., and Chapman, B. A. (2004). Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc. Natl. Acad. Sci. U.S.A. 101, 9903–9908. doi: 10.1073/pnas.0308567100
Sankoff, D., Zheng, C., and Zhu, Q. (2010). The collapse of gene complement following whole genome duplication. BMC Genomics 11, 313. doi: 10.1186/1471-2164-11-313
Schnable, J. C., and Freeling, M. (2011). Genes identified by visible mutant phenotypes show increased bias toward one of two subgenomes of maize. PLoS ONE 6, e17855. doi: 10.1371/journal.pone.0017855
Schnable, J. C., Freeling, M., and Lyons, E. (2012). Genome-wide analysis of syntenic gene deletion in the grasses. Genome Biol. Evol. 4, 265–277. doi: 10.1093/gbe/evs009
Schnable, J. C., Springer, N. M., and Freeling, M. (2011a). Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc. Natl. Acad. Sci. U.S.A. 108, 4069–4074. doi: 10.1073/pnas.1101368108
Schnable, J. C., Pedersen, B. S., Subramaniam, S., and Freeling, M. (2011b). Dose-sensitivity, conserved non-coding sequences and duplicate gene retention through multiple tetraploidies in the grasses. Front. Plant Sci. 2:2. doi: 10.3389/fpls.2011.00002
Schnable, P. S., Ware, D., Fulton, R. S., Stein, J. C., Wei, F., Pasternak, S., Liang, C., Zhang, J., Fulton, L., Graves, T. A., Minx, P., Reily, A. D., Courtney, L., Kruchowski, S. S., Tomlinson, C., Strong, C., Delehaunty, K., Fronick, C., Courtney, B., Rock, S. M., Belter, E., Du, F., Kim, K., Abbott, R. M., Cotton, M., Levy, A., Marchetto, P., Ochoa, K., Jackson, S. M., Gillam, B., Chen, W., Yan, L., Higginbotham, J., Cardenas, M., Waligorski, J., Applebaum, E., Phelps, L., Falcone, J., Kanchi, K., Thane, T., Scimone, A., Thane, N., Henke, J., Wang, T., Ruppert, J., Shah, N., Rotter, K., Hodges, J., Ingenthron, E., Cordes, M., Kohlberg, S., Sgro, J., Delgado, B., Mead, K., Chinwalla, A., Leonard, S., Crouse, K., Collura, K., Kudrna, D., Currie, J., He, R., Angelova, A., Rajasekar, S., Mueller, T., Lomeli, R., Scara, G., Ko, A., Delaney, K., Wissotski, M., Lopez, G., Campos, D., Braidotti, M., Ashley, E., Golser, W., Kim, H., Lee, S., Lin, J., Dujmic, Z., Kim, W., Talag, J., Zuccolo, A., Fan, C., Sebastian, A., Kramer, M., Spiegel, L., Nascimento, L., Zutavern, T., Miller, B., Ambroise, C., Muller, S., Spooner, W., Narechania, A., Ren, L., Wei, S., Kumari, S., Faga, B., Levy, M. J., McMahan, L., Van Buren, P., Vaughn, M. W., Ying, K., Yeh, C. T., Emrich, S. J., Jia, Y., Kalyanaraman, A., Hsia, A. P., Barbazuk, W. B., Baucom, R. S., Brutnell, T. P., Carpita, N. C., Chaparro, C., Chia, J. M., Deragon, J. M., Estill, J. C., Fu, Y., Jeddeloh, J. A., Han, Y., Lee, H., Li, P., Lisch, D. R., Liu, S., Liu, Z., Nagel, D. H., McCann, M. C., SanMiguel, P., Myers, A. M., Nettleton, D., Nguyen, J., Penning, B. W., Ponnala, L., Schneider, K. L., Schwartz, D. C., Sharma, A., Soderlund, C., Springer, N. M., Sun, Q., Wang, H., Waterman, M., Westerman, R., Wolfgruber, T. K., Yang, L., Yu, Y., Zhang, L., Zhou, S., Zhu, Q., Bennetzen, J. L., Dawe, R. K., Jiang, J., Jiang, N., Presting, G. G., Wessler, S. R., Aluru, S., Martienssen, R. A., Clifton, S. W., McCombie, W. R., Wing, R. A., and Wilson, R. K. (2009). The B73 maize genome: complexity, diversity, and dynamics. Science 326, 1112–1115. doi: 10.1126/science.1178534
Sémon, M., and Wolfe, K. H. (2007). Consequences of genome duplication. Curr. Opin. Genet. Dev. 17, 505–512. doi: 10.1016/j.gde.2007.09.007
Seoighe, C., and Gehring, C. (2004). Genome duplication led to highly selective expansion of the Arabidopsis thaliana proteome. Trends Genet. 20, 461–464. doi: 10.1016/j.tig.2004.07.008
Soltis, D. E., Albert, V. A., Leebens-Mack, J., Bell, C. D., Paterson, A. H., Zheng, C., Sankoff, D., Depamphilis, C. W., Wall, P. K., and Soltis, P. S. (2009). Polyploidy and angiosperm diversification. Am. J. Bot. 96, 336–348. doi: 10.3732/ajb.0800079
Sun, X., Zou, Y., Nikiforova, V., Kurths, J., and Walther, D. (2010). The complexity of gene expression dynamics revealed by permutation entropy. BMC Bioinformatics 11, 607. doi: 10.1186/1471-2105-11-607
Tang, H., Bowers, J. E., Wang, X., and Paterson, A. H. (2010). Angiosperm genome comparisons reveal early polyploidy in the monocot lineage. Proc. Natl. Acad. Sci. U.S.A. 107, 472–477. doi: 10.1073/pnas.1009566107
Tang, H., Lyons, E., Pedersen, B., Schnable, J. C., Paterson, A. H., and Freeling, M. (2011). Screening synteny blocks in pairwise genome comparisons through integer programming. BMC Bioinformatics 12, 102. doi: 10.1186/1471-2105-12-102
Tang, H., Woodhouse, M. R., Cheng, F., Schnable, J. C., Pedersen, B. S., Conant, G., Wang, X., Freeling, M., and Pires, J. C. (2012). Altered patterns of fractionation and exon deletions in Brassica rapa support a two-step model for paleohexaploidy. Genetics 190, 1563–1574. doi: 10.1534/genetics.111.137349
The Brassica rapa Genome Sequencing Project Consortium. (2011). The genome of the mesopolyploid crop species Brassica rapa. Nat. Genet. 43, 1035–1039. doi: 10.1038/ng.919
Thomas, B. C., Pedersen, B., and Freeling, M. (2006). Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes. Genome Res. 16, 934–946. doi: 10.1101/gr.5089806
Trapnell, C., Williams, B. A., Pertea, G., Mortazavi, A., Kwan, G., van Baren, M. J., Salzberg, S. L., Wold, B. J., and Pachter, L. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515. doi: 10.1038/nbt.1621
Woodhouse, M. R., Schnable, J. C., Pedersen, B. S., Lyons, E., Lisch, D., Subramaniam, S., and Freeling, M. (2010). Following tetraploidy in maize, a short deletion mechanism removed genes preferentially from one of the two homeologs. PLoS Biol. 8, e1000409. doi: 10.1371/journal.pbio.1000409
Zemach, A., McDaniel, I. E., Silva, P., and Zilberman, D. (2010). Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science 328, 916–919. doi: 10.1126/science.1186366

Appendix

Table A1

	Less expressed copy lost in Brassica rapa	More expressed copy lost in Brassica rapa	p-Value
All alpha pairs where one copy has been completely lost in Brassica rapa	428 gene pairs	217 gene pairs	p = 3.60 × 10⁻¹⁷
Alpha pairs where there are multiple co-orthologs in Brassica rapa of the retained copy	271 gene pairs	98 gene pairs	p = 3.48 × 10⁻²⁰
Both copies expressed above five FPKM in Arabidopsis thaliana	191 gene pairs	128 gene pairs	p = 2.49 × 10⁻⁴

Expression in Arabidopsis and complete gene loss in Brassica rapa.
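The source does not state which test produced the p-values in Table A1. One simple way to ask how surprising each split is, assuming the null hypothesis that the less and the more expressed copy are equally likely to be lost, is an exact binomial (sign) test. The sketch below is offered on that assumption only; the function name is hypothetical, and the values it returns need not match those printed in the table.

```python
from math import comb


def binomial_two_sided(k, n):
    """Exact two-sided binomial p-value for k successes in n trials
    against a success probability of 0.5.

    The null distribution is symmetric, so the p-value is twice the
    tail at or beyond the more extreme of k and n - k, capped at 1.
    """
    extreme = max(k, n - k)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # (less expressed copy lost, more expressed copy lost) from Table A1.
    rows = {
        "All alpha pairs": (428, 217),
        "Multiple co-orthologs of retained copy": (271, 98),
        "Both copies above five FPKM": (191, 128),
    }
    for label, (less, more) in rows.items():
        print(label, binomial_two_sided(less, less + more))
```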

Summary

Keywords

polyploidy, gene dosage, gene loss, genome evolution, comparative genomics, crucifers, grasses

Citation

Schnable JC, Wang X, Pires JC and Freeling M (2012) Escape from Preferential Retention Following Repeated Whole Genome Duplications in Plants. Front. Plant Sci. 3:94. doi: 10.3389/fpls.2012.00094

Received

06 January 2012

Accepted

24 April 2012

Published

15 May 2012

Volume

3 - 2012

Edited by

Elena R. Alvarez-Buylla, Universidad Nacional Autónoma de Mexico, Mexico

Reviewed by

Paula Casati, Centro de Estudios Fotosinteticos-CONICET, Argentina; Amy Louise Lawton-Rauh, Clemson University, USA

This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.

*Correspondence: Michael Freeling, Freeling Lab, Plant and Microbial Biology, University of California – Berkeley, 111 Koshland Hall, PMB, Berkeley, CA 94720, USA. e-mail: freeling@berkeley.edu

This article was submitted to Frontiers in Plant Genetics and Genomics, a specialty of Frontiers in Plant Science.

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
