The Diversity of Sequence and Chromosomal Distribution of New Transposable Element-Related Segments in the Rye Genome Revealed by FISH and Lineage Annotation

Transposable elements (TEs) in plant genomes exhibit a great variety of structure, sequence content and copy number, making them important drivers of species diversity and genome evolution. Even though a genome-wide statistical summary of TEs in rye has been obtained using high-throughput DNA sequencing technology, the accurate diversity of TEs in rye, as well as their chromosomal distribution and evolution, remains elusive because repetitive sequences are difficult to assemble and TEs are highly dynamic and frequently nested. In this study, using genomic plasmid library construction combined with dot-blot hybridization and fluorescence in situ hybridization (FISH) analysis, we isolated 70 unique FISH-positive TE-related sequences, including 47 rye genome-specific ones: 30 showed homology or partial homology with previously FISH-characterized sequences and 40 had not been characterized. Among the 70 sequences, 48 carried Ty3/gypsy-derived segments, 7 carried Ty1/copia-derived segments and 15 carried segments homologous with multiple TE families. Twenty-six TE lineages were found in the 70 sequences. Among these lineages, Wilma was found in sequences dispersed in all chromosome regions except telomeric positions; Abiba was found in sequences predominantly located at pericentromeric and centromeric positions; Wis, Carmilla, and Inga were found in sequences displaying signals dispersed from distal regions toward pericentromeric positions; and, except for the DNA transposon lineages, all the other lineages were found in sequences displaying signals dispersed from proximal regions toward distal regions. A high percentage (21.4%) of chimeric sequences was identified in this study, and their high abundance in the rye genome suggests that new TEs might form through recombination and nested transposition. Our results also provide evidence that diverse TE lineages are arranged at centromeric and pericentromeric positions in rye, and that lineages like Abiba might play a role in their structural organization and function. These results may help in understanding the diversity and evolution of TEs in rye, as well as their role as driving forces in rye genome organization and evolution.

INTRODUCTION
Transposable elements (TEs) represent a high percentage of eukaryotic genomes: 58.58% in Pinus taeda (Wegrzyn et al., 2014), 63% in Sorghum bicolor (Paterson et al., 2008), 80% in maize (Feschotte et al., 2002), and more than 72% in Secale cereale (Bauer et al., 2017). Besides their high copy number, the serial transposition of individual TEs into previously inserted elements can form large nested structures in genomes (Bergman et al., 2006; Bousios et al., 2016). It has been proposed that such clustered, scrambled TE nests can subsequently be copied and amplified, resulting in a large number of duplicated TE nests in the genome (Bergman et al., 2006; Coline et al., 2014), and might even form new TE families (Losada et al., 1999). As a consequence of their variety in structure, size and mechanism of transposition, and of their high copy number, TEs contribute substantially to genomic rearrangement, nucleotide diversity and speciation (Middleton et al., 2013; Belyayev, 2014).
Besides their high abundance and their structural and sequence diversity, TEs in plant genomes also differ in chromosomal distribution among lineages. In Triticum boeoticum, for instance, the Ty3/gypsy lineage Wgel was preferentially clustered at both the centromeric and pericentromeric positions, while two other Ty3/gypsy lineages (Erika and Sukkula) were rare at the centromeric positions (Liu et al., 2008). Even the same TE lineage might differ between species and ploidy levels. For four Ty3/gypsy lineages (CRM, Athila, Del, and Tat) in Brachiaria, evident differences in location and abundance were observed between diploids and polyploids (Santos et al., 2015). CRM (centromeric retrotransposon in maize) is a special Ty3/gypsy element located at centromeric positions of maize, and ChIP assays demonstrated that this element can interact with CENH3 (centromere-specific H3 histone) throughout its length (Zhong et al., 2002). CRR (centromeric retrotransposon in rice) and CRW (centromeric retrotransposon in wheat), which belong to the same family as CRM, were also shown to interact with CENH3 (Nagaki et al., 2004; Li et al., 2013), which suggests that the Centromeric Retrotransposon (CR) family in grass species plays an important role in centromere structural organization and function (Zhong et al., 2002).
Rye (Secale cereale L., 2n = 2x = 14) is an important member of the Triticeae, with a high percentage of repetitive elements of more than 92% (Bartoš et al., 2008). Repetitive sequences in rye have been analyzed since the 1970s (Weimarck, 1975; Appels et al., 1978). Since then, many sequences, including some TE-derived sequences, have been located and extensively investigated, such as the Secale dispersed repeat sequence R173 elements, a rye-specific family distributed in a dispersed manner over all rye chromosomes (Rogowsky et al., 1992); the Secale pSc20H family, which was identified as a retrotransposon-related sequence and is dispersed throughout the rye genome except at telomeric positions and nucleolar organizing regions (Ko et al., 2002; Tang et al., 2011); the transposon-like gene Revolver, which is dispersed on all seven chromosomes of rye (Tomita, 2008); the Superior families, a transposon-like gene family also dispersed in the rye genome (Tomita et al., 2009); the Secale cereale clone B2465 retrotransposon Ty3/gypsy-like sequence, which displayed strong hybridization signals on rye chromosomes (Carchilan et al., 2009); the predominantly pericentromere-located pSc10C families (Ko et al., 2002); the centromere-located Ty1-copia retrotransposons of the Bilby family (Francki, 2001); and the centromere-located Sc192 bp repeats, which were identified as Ty3/gypsy-type sequences (Banaei-Moghaddam et al., 2012).
Even though some TEs have been cytologically defined, and great progress has been achieved in rye genome sequencing and expressed sequence tag analysis (Martis et al., 2013; Bauer et al., 2017), understanding of the constitution, chromosomal distribution, diversity and abundance of TEs in rye remains limited. In addition, due to the complex organization of TEs and the assembly problems they cause, whole genome-wide analysis may not accurately reflect TE distribution and abundance for any given region of the genome (Bergman et al., 2006), especially for genomes that have not been successfully assembled. The fluorescence in situ hybridization (FISH) technique, developed by Langer-Safer et al. (1982), is widely used for physical mapping of high copy number sequences clustered in plant genomes (Iwata-Otsubo et al., 2016; Gouveia et al., 2017). FISH thus provides an efficient tool for locating hard-to-assemble TE sequences on rye chromosomes (Li et al., 2016).
To gain more insight into the sequence diversity, chromosomal distribution and evolution of TEs in rye, we isolated 70 unique FISH-positive TE-related sequences and investigated their chromosomal location and sequence composition using FISH and TE lineage annotation. Twenty-six TE lineages were found in these newly identified sequences, and the lineages differed in their chromosomal distribution bias; additionally, the TE lineage Abiba was found both in sequences located at pericentromeric positions and in sequences located at centromeric positions. Our results may provide new information on the highly dynamic nature of TEs in rye and their important roles in driving genome diversity, evolution and speciation, as well as centromere organization.

Plant Materials
The materials used in this work included Secale cereale var. King II rye (2n = 2x = 14, R genome), allohexaploid triticale (AABBRR, 2n = 6x = 42) and Triticum aestivum L. var. Chinese Spring wheat (AABBDD, 2n = 6x = 42). To quickly identify rye chromosome-specific sequences, allohexaploid triticale was used for the first round of FISH. For sequences displaying signals on A, B and R chromosomes, a second round of FISH was performed using King II rye and Chinese Spring wheat to check whether the signals on A, B, and R chromosomes in allohexaploid triticale coincided with those in rye and wheat. The universal probe pSc119.2 was used to help identify the rye chromosomes. The plants used for DNA isolation were grown in the greenhouse under a 16 h light/8 h dark photoperiod at 25 °C.

Genomic Plasmid Library Construction
A rye (var. King II) plasmid library for repetitive element screening was constructed by partially digesting rye genomic DNA with HindIII (Takara Bio, Shiga, Japan). The DNA of rye seedlings was extracted using the CTAB method, and the restriction digestion with HindIII was performed in a 200 µl reaction containing 20 µg genomic DNA, 1× Buffer, sterile H2O, and 200 U of HindIII. The DNA was digested at 37 °C for 20 min and then separated on a 1% agarose gel by electrophoresis. The 1,000-2,000 bp fraction was collected using an EasyPure Quick Gel extraction kit (TransGen Biotech, Beijing, China). The recovered fragments were ligated into the pUC118 vector (Takara Bio, Shiga, Japan) using the TaKaRa DNA ligation kit (Takara Bio, Shiga, Japan) and transformed into competent E. coli DH5α (TransGen Biotech, Beijing, China) according to the manufacturer's instructions.

Library Screening
Transformed clones were screened using dot-blot hybridization, following the method described by Zhang et al. (2016). For probe labeling, the rye genomic DNA was labeled by digoxigenin-11-dUTP with a random primer DNA labeling kit (Takara Bio, Shiga, Japan) according to the manufacturer's instructions, but using 1× DIG DNA labeling mix instead of the dNTP in the kit. The darker blots, which were interpreted as high copy number repetitive sequences, were then used in subsequent FISH for chromosomal distribution analysis and sequence identification.

Slide Preparation and FISH Identification of the Sequences
Slides for FISH were prepared according to Han et al. (2006) and Fu et al. (2015), with minor modifications. Briefly, actively growing root tips were treated with 1.0 MPa nitrous oxide gas (N2O) for 2 h, then fixed in 90% glacial acetic acid for 10 min on ice. The root tips could be used immediately or stored in 70% ethanol at −20 °C. The root tips were washed three times and digested at 37 °C for 1 h in an enzyme solution of 0.5% pectolyase Y-23 (Kikkoman, Co., Tokyo, Japan) and 1% cellulase Onozuka R-10 (Yakult Honsha, Co., Ltd., Minato-ku, Tokyo, Japan) dissolved in citric buffer (10 mM NaC, 10 mM EDTA, pH 5.5). After digestion, the root sections were washed with 70% ethanol and mashed with forceps. The cells were washed with 100% ethanol, resuspended in 100% acetic acid and dropped onto clean glass slides.
For probe labeling, the plasmids carrying the subject sequences were labeled with Texas Red-5-dCTP using a nick translation procedure (Han et al., 2006). The labeled probes were dissolved in 2× SSC and 1× TE (20 ng µl⁻¹), dropped onto the chromosome spreads and denatured together with them by heating at 100 °C for 5 min. Images were captured using a Nikon Ni-E fluorescence microscope (Nikon, Tokyo, Japan) and further processed with Photoshop 5.0 (Adobe).

Homology-Based Sequence Identification
The clones were sequenced in both directions with the universal M13 primers synthesized by AuGCT Biotechnology (AuGCT, China) using an ABI PRISM 377 DNA sequencer (Applied Biosystems). Next, the sequences were annotated and classified by a homology search against RepBase (Bao et al., 2015), the TREP database (Wicker et al., 2002) and the REdat_9.0_Poaceae section of the PGSB transposon library (Spannagl et al., 2015) with the default settings. Following the rules proposed by Wicker et al. (2007), nested sequences were annotated segmentally and only homologous regions longer than 80 nucleotides were considered. To check whether these sequences had already been characterized, they were further queried against the GenBank database using BLASTN for sequence identity analysis, with a threshold e-value ≤ 10⁻⁵ and without filtering out low-complexity regions. The BLAST results from the four databases are listed in Supplementary Table S1, and sequences showing homology with TEs were subjected to a final screening using FISH. To visualize the constitution of each sequence and the TE lineages found in these sequences, Venn diagrams and pie charts (Figure 5 and Supplementary Figure S1) were created from the BLAST results listed in Supplementary Table S1. Venn diagrams were created using the online tool Venny 2.1.0 and pie charts were drawn using GraphPad Prism 5.
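The two numeric criteria described above (homologous regions longer than 80 nucleotides, e-value ≤ 10⁻⁵) can be illustrated with a minimal Python sketch. It assumes the hits are available in the standard 12-column BLAST tabular format; the text does not state which output format was used, the two thresholds were applied to different database searches in the study and are combined here only for illustration, and the sequence and lineage names in `DEMO` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    q_start: int
    q_end: int
    evalue: float

    @property
    def q_len(self) -> int:
        # Length of the homologous region on the query sequence.
        return self.q_end - self.q_start + 1


def parse_outfmt6(text):
    """Parse 12-column BLAST tabular lines (assumed input format) into Hit records."""
    hits = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        q_start, q_end = sorted((int(f[6]), int(f[7])))
        hits.append(Hit(f[0], f[1], q_start, q_end, float(f[10])))
    return hits


def filter_hits(hits, min_len_exclusive=80, max_evalue=1e-5):
    """Keep hits longer than 80 nt with e-value <= 1e-5 (thresholds from the text)."""
    return [h for h in hits if h.q_len > min_len_exclusive and h.evalue <= max_evalue]


def segments_by_query(hits):
    """Segmental annotation: ordered (start, end, subject) segments per query."""
    grouped = defaultdict(list)
    for h in sorted(hits, key=lambda h: (h.query, h.q_start)):
        grouped[h.query].append((h.q_start, h.q_end, h.subject))
    return dict(grouped)


# Hypothetical demo hits; none of these names or values come from the study.
DEMO = "\n".join([
    "HKdemo-1\tRLG_demoA\t92.0\t300\t20\t1\t1\t300\t50\t349\t1e-80\t400",
    "HKdemo-1\tRLC_demoB\t88.0\t60\t5\t0\t310\t369\t10\t69\t1e-12\t90",
    "HKdemo-1\tRLC_demoD\t90.0\t351\t30\t2\t350\t700\t1\t351\t1e-40\t300",
    "HKdemo-1\tDTM_demoC\t85.0\t120\t15\t2\t710\t829\t5\t124\t0.01\t40",
])

kept = filter_hits(parse_outfmt6(DEMO))
print(segments_by_query(kept))
```

In this sketch a query whose retained segments match more than one TE family would be the kind of sequence the study calls chimeric; mapping subject names to families is left out because it depends on the naming scheme of each database.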

Immunofluorescence and FISH Assay
Root tips for the immunofluorescence assay were prepared and treated according to Guo et al. (2016). After washing with 1× PBS, the slides were incubated with a rabbit monoclonal anti-CENH3 antibody synthesized by MBL (Nagoya, Japan; 1:250) in 1× TNB [100 mM Tris-HCl, 150 mM NaCl, and 0.5% blocking reagent (w/v)] at 4 °C overnight in a wet chamber. The rabbit antibodies were detected using fluorescein isothiocyanate-conjugated goat anti-rabbit antibody (1:1,000; Jackson Immuno Research Labs). Before performing FISH, the slides were dehydrated in 70, 90, and 100% ethanol for 5 min. Images were captured using a Nikon Ni-E fluorescence microscope (Nikon, Tokyo, Japan).

Isolation and FISH Characterization of Repetitive DNA Sequences from Rye
In this work, a total of 1,800 clones from a HindIII-digested rye genomic DNA library were screened by dot-blot hybridization using rye genomic DNA as the probe. Then, 200 clones appearing as dark dots in the blot hybridization were sequenced and examined for the presence of FISH signals on the metaphase chromosomes of allohexaploid triticale. After eliminating 130 clones that were duplicates or lacked FISH signals, 70 unique sequences were retained for further analysis. Selected examples are given for all FISH distribution patterns (Figures 1-3), and data for all 70 unique sequences are summarized in Table 1.
According to the FISH signal patterns, the identified sequences fell into two main categories: signals enriched in the rye genome (Table 1, part I, 47 sequences) and signals enriched in both the rye and wheat genomes (Table 1, part II, 23 sequences).
Of the 47 rye genome-specific sequences (Table 1, part I), 17 sequences (Table 1 [...]
Of the 23 sequences that hybridized with both rye and wheat chromosomes (Table 1, part II), 13 sequences (Table 1, part II-1) displayed stronger signals on rye chromosomes but weaker signals on wheat chromosomes (Figures 2A-F), including 2 centromere-located sequences (Figures 2A-C, 3C,F); 10 sequences (Table 1, part II-2) produced equally intense dispersed signals on both rye and wheat chromosomes (Figures 2G-I). Among the 23 sequences, only three (HK18-5, HK17-88, and HK5-70) produced signals dispersed from distal regions toward pericentromeric positions, without obvious signals at pericentromeric and centromeric positions. All the other non-centromere-located sequences produced signals dispersed from proximal regions toward distal regions (data not shown).
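The screening funnel and the split into FISH categories reported in this section can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid; the function name `stepwise_retention` and the dictionary labels are illustrative and do not come from the study.

```python
def stepwise_retention(counts):
    """Percentage of clones retained at each step, relative to the previous step."""
    return [round(100 * after / before, 1) for before, after in zip(counts, counts[1:])]


# Counts reported in the text: clones screened, clones sequenced, unique FISH-positive sequences.
FUNNEL = [1800, 200, 70]

# FISH categories of the 70 sequences (Table 1, parts I and II).
CATEGORIES = {"rye-specific (part I)": 47, "rye and wheat (part II)": 23}

# Subdivision of part II.
PART_II = {"stronger on rye (II-1)": 13, "equal intensity (II-2)": 10}

removed = FUNNEL[1] - FUNNEL[2]  # duplicates or clones without FISH signal
print(stepwise_retention(FUNNEL), removed)
print(sum(CATEGORIES.values()), sum(PART_II.values()))
```

The checks confirm that the category counts add up to the 70 retained sequences and that the two part II subgroups add up to 23.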

Immunofluorescence Analysis of Centromere Located Sequences
Functional centromeres are epigenetically specified by incorporation of CENH3, a centromere-specific histone H3 variant (Li et al., 2013;Cech and Peichel, 2016). To determine whether the centromere located sequences are part of the functional areas of centromeres, we conducted immunofluorescence assay and sequential FISH experiments on the same interphase nuclei of King II rye. All the six centromere located sequences were co-localized with CENH3 on all the seven pairs of rye chromosomes (Figure 4), but the signals were larger than those of CENH3, which suggested that not all of their sequences were present at the kinetochore positions.

Annotation of the FISH-Positive Sequences
The FISH-positive fragments were sequenced, and all the sequence data were deposited in GenBank under accession numbers KY327841-KY327936.
Based on the homology search, all 70 isolated sequences were identified as TE-derived sequences: 48 sequences carried Ty3/gypsy-derived segments, 7 sequences carried Ty1/copia-derived segments and 15 sequences (chimeric sequences) carried segments homologous with multiple TE families (Table 1). Twenty-six TE lineages (including six unknown lineages) were found in these sequences (Figure 5B and Supplementary Table S1): 53 sequences carried segments exclusively homologous with TE lineages belonging to Ty3/gypsy (seven chimeric sequences included); seven sequences carried segments exclusively homologous with TE lineages belonging to Ty1/copia (one chimeric sequence included); four sequences (chimeric sequences) carried segments homologous with TE lineages belonging to Ty3/gypsy and Ty1/copia; two sequences (chimeric sequences) carried segments homologous with TE lineages belonging to Ty3/gypsy and DNA transposons; and one sequence (chimeric sequence) carried segments homologous with TE lineages belonging to Ty3/gypsy, Ty1/copia and DNA transposons (Figure 5A and Supplementary Table S1). Among these TE lineages (excluding the six unknown lineages), four were exclusively found in non-chimeric sequences: Barbara, Carmilla, Latidu, and Erika; seven were exclusively found in chimeric sequences: Cereba, Mariner, MuDR, Ophelia, Polinton, Sukkula, and Vandal (MuDR); and nine were found in both chimeric and non-chimeric sequences: Abiba, Angela, Daniela, Inga, Wis, Sabrina, Wham, Wilma, and Sumana (Figure 5 and Supplementary Figure S1A, Table S1).
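The partition of the named lineages into those found only in non-chimeric sequences, only in chimeric sequences, or in both is a two-set Venn diagram. A minimal Python sketch of that partition is given below; it uses only the lineage lists stated in the text (with spellings normalized, e.g. Carmilla, Daniela, Sumana), and the helper name `venn_regions` is illustrative rather than part of any tool used in the study.

```python
# Lineage lists as stated in the text; spellings normalized (e.g. Carmilla, Daniela, Sumana).
NON_CHIMERIC_ONLY = {"Barbara", "Carmilla", "Latidu", "Erika"}
CHIMERIC_ONLY = {"Cereba", "Mariner", "MuDR", "Ophelia", "Polinton",
                 "Sukkula", "Vandal (MuDR)"}
SHARED = {"Abiba", "Angela", "Daniela", "Inga", "Wis", "Sabrina",
          "Wham", "Wilma", "Sumana"}
UNKNOWN_LINEAGES = 6  # unknown lineages are counted but not named in the text

# Lineages observed in each type of sequence.
in_non_chimeric = NON_CHIMERIC_ONLY | SHARED
in_chimeric = CHIMERIC_ONLY | SHARED


def venn_regions(a, b):
    """Return the three regions of a two-set Venn diagram."""
    return {"only_a": a - b, "both": a & b, "only_b": b - a}


regions = venn_regions(in_non_chimeric, in_chimeric)
named_total = len(in_non_chimeric | in_chimeric)

for name, members in regions.items():
    print(name, len(members), sorted(members))
print("total lineages:", named_total + UNKNOWN_LINEAGES)
```

The region sizes recover the four, nine and seven lineages listed above, and adding the six unknown lineages gives the 26 lineages reported.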
In addition, the TE lineages differed in how often they occurred in the 70 sequences: Angela was found in 6 sequences, Daniela in 5, Erika in 6, Inga in 2, Sabrina in 18, Sumana in 7, Wilma in 8, Wham in 5, Abiba in 12, Barbara in 3, Wis in 2, and each of the other lineages in only one sequence (Figure 5B and Supplementary Table S1). Even though some TE lineages were found in only one sequence, they still appear to be present at high copy number in the rye genome, as inferred from the strong FISH signals displayed by the sequences in which they reside.
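The occurrence counts listed above can be ranked and expressed as a share of the 70 sequences with a short Python sketch. Only the counts stated in the text are used (lineage spellings normalized); lineages reported in a single sequence are omitted, and the helper names `ranked` and `share` are illustrative.

```python
# Number of sequences in which each lineage was found, as listed in the text.
OCCURRENCE = {
    "Angela": 6, "Daniela": 5, "Erika": 6, "Inga": 2, "Sabrina": 18,
    "Sumana": 7, "Wilma": 8, "Wham": 5, "Abiba": 12, "Barbara": 3, "Wis": 2,
}
TOTAL_SEQUENCES = 70


def ranked(counts):
    """Sort lineages by decreasing count, breaking ties alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def share(count, total=TOTAL_SEQUENCES):
    """Percentage of the isolated sequences that carry a given lineage."""
    return round(100 * count / total, 1)


for lineage, n in ranked(OCCURRENCE):
    print(f"{lineage:8s} {n:3d} {share(n):5.1f}%")

# The sum exceeds 70 because chimeric sequences carry more than one lineage.
print(sum(OCCURRENCE.values()))
```

Sabrina and Abiba are the most frequently encountered lineages in this set, and the fact that the counts sum to more than 70 reflects the chimeric sequences, each of which contributes to several lineages.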
As suggested by the FISH pattern each sequence displayed and the TE lineages found in these sequences, the TE lineages differed in chromosomal distribution (Supplementary Figure S1B and Table S1): Abiba (Gypsy type) was found in sequences located at the pericentromeric and centromeric positions; Cereba (Gypsy type), Mariner (DNA transposon) and MuDR (DNA transposon) were only found in centromere-located sequences; Latidu (Gypsy type) was only found in HK5-34, which displayed stronger signals on [...] on chromosome arms of both rye and wheat; Angela (Copia type), Sabrina (Gypsy type), Daniela (Gypsy type), Wham (Gypsy type), and Sumana (Gypsy type) were found in sequences dispersed in all chromosome regions except centromeric and telomeric positions, but the last three lineages (Daniela, Wham, and Sumana) were not found in sequences producing equally intense signals on chromosome arms of both rye and wheat; Wis (Copia type) was only found in sequences displaying equally intense signals on chromosome arms of both rye and wheat. In addition, Erika, Sumana, Barbara, Sukkula, and Latidu (highlighted in green) were only found in sequences displaying signals dispersed from proximal regions toward distal regions, while Wilma, Daniela, Sabrina, Wham, and Angela (highlighted in red) were found both in sequences displaying signals dispersed from proximal regions toward distal regions and in sequences displaying signals dispersed from distal regions toward pericentromeric positions, without obvious signals at centromeric and pericentromeric positions.
(G-I) The signal distribution of HK15-21 on chromosomes of allohexaploid triticale, Secale cereale L. var. King II and "Chinese Spring" wheat (AABBDD, 2n = 42). The signal of each sequence hybridized with both wheat and rye chromosomes (red signals) was typically displayed by the enlarged 1B and 1R chromosomes placed in the inset, with pSc 119.2 (green signals) removed. Bars = 10 µm.
To examine the relationship between previously FISH-defined families and TE lineages, previously FISH-identified sequences (R173, Revolver, Secale cereale clone B2465, pSc20H, Bilby, Superior and Secale cereale clone Sc192 bp-rye-sortedBclone-1) were downloaded and searched by BLAST against the same databases as our sequences. pSc20H showed full-length homology with the Erika lineage, B2465 showed full-length homology with the Daniela lineage, Superior and Bilby showed partial homology with the Abiba lineage, and both the R173 and Revolver families showed homology with multiple TE lineages (Supplementary Table S2).

DISCUSSION
Because TEs make up a major part of the Triticeae genomes, understanding their sequence diversity and distribution dynamics can help in investigating genome evolution and speciation (Middleton et al., 2013; Bauer et al., 2017). In rye, this remains challenging because the whole-genome assembly is unfinished. However, because of their high abundance and their tendency to cluster on chromosomes, TEs, especially high copy number TE lineages, can be located on chromosomes relatively easily using cytological methods like FISH (Rogowsky et al., 1992; Francki, 2001; Kalendar et al., 2004; Tomita, 2008, 2010; Carchilan et al., 2009; Tang et al., 2011). In this study, the chromosomal distribution and sequence diversity of 70 TE-related sequences were investigated, which should help in understanding the organization and evolution of TEs in the rye genome.
Transposable elements constitute at least 72% of the rye genome, with 60% LTR retrotransposons and 7% DNA transposons (Bauer et al., 2017). Ty3/gypsy and Ty1/copia are two major groups of LTR retrotransposons, and Ty3/gypsy elements are generally more abundant than Ty1/copia elements in angiosperms (Dereeper et al., 2013; Natali et al., 2015; Guyot et al., 2016). Besides differing in abundance, Ty3/gypsy elements are also more diverse than Ty1/copia elements in plants (Santos et al., 2015). In this study, 62 of the 70 identified sequences contained Ty3/gypsy-derived segments, almost five times as many as those containing Ty1/copia-derived segments (Figure 5A). DNA transposons were also found in the identified sequences: Polinton in HK-26 (dispersed in interstitial regions of rye chromosomes), Vandal (MuDR) in HK16-18 (mainly located at the pericentromeric positions), and Mariner and two MuDR in HK5-64 (located at centromeric positions). Our results showed that Ty3/gypsy elements constitute a major part of the TEs in rye and that DNA transposon lineages might also play a role in centromere structural organization and function.
Sequences presented here are registered in GenBank as accession numbers KY327841-KY327936. a, Sequences showing homology to the reported sequences. b, Sequences showing partial homology to the reported sequences. c, Newly identified FISH-positive sequences in rye.
Complex or hybrid TEs are commonly seen in genomic sequences; these elements might arise from nested TE integration, intrachromosomal recombination or variant replication (Wicker et al., 2007; Vitte et al., 2013; Gao et al., 2015). Hybrid TEs of this kind are often clustered in plant genomes and can spread over distances as large as 200 kb (Choulet et al., 2010). In this study, 15 chimeric sequences (about 21.4%) were characterized (Supplementary Table S1), which involved nearly all the TE lineages found in this work (Supplementary Figure S1A and Table S1). None of these chimeric sequences were head-tail/head TE junction structures, so these sequences should be independent fragments. To further confirm their existence, all 15 chimeric sequences were searched against the available Secale BAC clones deposited in the NCBI database and the WGS sequence contigs deposited in the IPK Rye BLAST Server. Because the whole-genome assembly is unfinished, only two sequences showed full-length homology with published data: the full length of HK5-7 was found in 5 WGS sequence contigs (e-value = 0), and the full length of HK15-5 was found in a Secale BAC clone (Secale cereale BAC956-D7, e-value = 0). These results suggest that the chimeric sequences do exist in the rye genome and might have been formed by a series of nested transpositions and/or recombination events among TEs. Additionally, the strong FISH signals given by the chimeric sequences suggest that they are highly abundant and stretch over long distances in the rye genome, which might result from duplication of these nested copies following nested insertion and recombination, as suggested by Bergman et al. (2006).
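The proportions quoted in this part of the discussion can be reproduced from the per-class counts given in the annotation results. The Python sketch below assumes the 48 and 7 non-chimeric counts and the breakdown of the 15 chimeric sequences (7, 1, 4, 2 and 1) as reported there; the total for Ty1/copia-containing sequences is derived here rather than stated in the text, and the data layout is purely illustrative.

```python
# (families present, is_chimeric, number of sequences), taken from the annotation results.
CLASSES = [
    ({"Ty3/gypsy"}, False, 48),
    ({"Ty1/copia"}, False, 7),
    ({"Ty3/gypsy"}, True, 7),
    ({"Ty1/copia"}, True, 1),
    ({"Ty3/gypsy", "Ty1/copia"}, True, 4),
    ({"Ty3/gypsy", "DNA transposon"}, True, 2),
    ({"Ty3/gypsy", "Ty1/copia", "DNA transposon"}, True, 1),
]


def count_containing(family):
    """Number of sequences carrying at least one segment of the given family."""
    return sum(n for families, _, n in CLASSES if family in families)


total = sum(n for _, _, n in CLASSES)
chimeric = sum(n for _, is_chimeric, n in CLASSES if is_chimeric)
gypsy = count_containing("Ty3/gypsy")
copia = count_containing("Ty1/copia")  # derived value, not stated in the text

print(total, chimeric, round(100 * chimeric / total, 1))
print(gypsy, copia, round(gypsy / copia, 1))
```

Under these assumptions the sketch recovers the 70 sequences, the 15 chimeric sequences (about 21.4%) and the 62 Ty3/gypsy-containing sequences, and the derived Ty3/gypsy to Ty1/copia ratio is consistent with the "almost five times" stated above.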
Duplication of nested TEs is not a rare phenomenon in eukaryote genomes and has been widely observed in Drosophila (O'hare et al., 2002; Bergman et al., 2006), as well as in barley (Wicker et al., 2005) and Arabidopsis (Lippman et al., 2004). Moreover, chimeric TEs (TE nests) were mostly found in rye chromosome-specific sequences or in sequences displaying stronger signals on rye chromosomes than on wheat chromosomes (Table 1 and Supplementary Table S1). Another example is the well-studied rye genome-specific transposon-like gene Revolver, which also shows homology with multiple TE lineages (Supplementary Table S2). Together, these results indicate that nested transposition, recombination among TE lineages and duplication of nested sequences are important driving forces of speciation and genome evolution, and might also be an important mechanism of new TE family formation (Losada et al., 1999).
Transposable elements are relatively neutral elements within the genome (Petrov, 2001), which allows them to accumulate more changes in the genome (Warren et al., 2015). FISH signal intensity can help to evaluate the homology between probes and the target genome, as well as the copy number of target sequences in the genome. In this study, FISH signal intensity differed among TE lineages (Supplementary Figure S1, Table S1, and Table 1): for example, Wis and Barbara were found in sequences displaying strong signals on wheat chromosomes, while Carmilla, Inga, Erika, and Sukkula were found in sequences displaying no signals on wheat chromosomes. Besides differences among TE lineages, FISH signal intensity also differed among members of the same lineage: Angela, like Sabrina and Wilma, was found in sequences displaying strong signals, weaker signals and no signals on wheat chromosomes; Sumana, like Daniela and Wham, was found in sequences displaying weaker signals and no signals on wheat chromosomes. This high diversity among TE lineages, and among different members of the same TE lineage, indicates that these lineages have evolved at variable rates and in variable directions, which has contributed substantially to genome diversity and speciation.
Transposable elements constitute a considerable proportion of the centromeric DNA sequences in cereals; for instance, 96% of the centromeric DNA of hexaploid wheat chromosome 3B consists of TE sequences (Li et al., 2013). Even though the function of centromeres is conserved, TEs located at centromeric positions keep evolving (Ma et al., 2007; Neumann et al., 2011). During species evolution, new TE lineages might form and play a role in centromere structural organization and function, and some ancestral elements might lose their function or head toward extinction. In wheat, for instance, satellite repeats lost their ability to bind CENH3, and might have been replaced by the CRW and Quinta elements at the functional centromere (Li et al., 2013). In addition, some species might evolve their own species-specific elements, such as the Bilby family in rye, which is significantly enriched at the centromeric positions of rye chromosomes (Francki, 2001). In this study, four of the six centromere-located sequences contained segments homologous with the Bilby family (Table 1 and Supplementary Table S1). Except for HK1-71, all the centromere-located sequences, including the Bilby family, contained segments homologous with the Abiba TE lineage. These results support the idea that retrotransposon families located at centromeric positions in cereals probably derived from a single conventional Ty3/gypsy family or a non-autonomous derivative (Langdon et al., 2000; Nagaki et al., 2003), and that, from an evolutionary perspective, older families keep being replaced by newly emerged families. Immuno-colocalization of the six centromere-located sequences with CENH3 suggests that they might be involved in centromere structural organization. More work is needed to confirm this.
At the centromeric and pericentromeric positions, meiotic recombination is almost completely suppressed (Gore et al., 2009), but rearrangements caused by retrotransposons have frequently been detected (Henikoff et al., 2001; Hall et al., 2004; Liu et al., 2008; Li et al., 2013; Wolfgruber et al., 2016). In our study, the chimeric sequence HK5-64 contains segments from Ty3/gypsy (Abiba), Ty1/copia (Copia3) and two types of DNA transposon lineages (MuDR and Mariner), which suggests that recombination events have occurred during the evolution of the rye genome. When HK5-64 was searched by BLAST against the NCBI database, only a 289 bp segment homologous with Abiba was found in the Triticum database, which indicates that this chimeric sequence probably formed after rye and wheat diverged from a common ancestor. However, we failed to obtain its full length, even though a 782 bp segment was found in the released rye database (Bauer et al., 2017).
Plant pericentromeres are regions that physically separate the centromere core from the gene-rich chromosome arms and are characterized by large TE islands (Sigman and Slotkin, 2016). In this study, the TE lineage Abiba was found not only in almost all the centromere-located sequences (except HK1-71), but also in all the pericentromere-located sequences, which supports the idea that centromeric and pericentromeric regions are similar (Gent et al., 2011). However, more TE lineages dispersed in interstitial regions were found at pericentromeric positions, such as Angela, Barbara, Daniela, Erika, Sumana, Latidu, Sabrina, and Wham. This result suggests that pericentromeric regions might share more TE lineages with interstitial regions.

CONCLUSION
The rye genome contains a substantial fraction of repetitive sequences, especially TE sequences. Although broad-scale patterns of TE abundance have been investigated in rye using high-throughput DNA sequencing technology (Bartoš et al., 2008; Fluch et al., 2012; Bauer et al., 2017), the precise sequence diversity and chromosomal distribution of TEs in rye remain enigmatic due to their dynamic nature and nested transposition. In this work, the constitution and chromosomal distribution of 70 unique FISH-positive TE-related sequences were identified and characterized. Of the 70 sequences, 30 contained segments homologous with previously FISH-characterized TE-related sequences and 40 have not been characterized. Of the 70 sequences, 62 contained Ty3/gypsy-derived sequences (14 chimeric sequences included), which suggested a high percentage of Ty3/gypsy type TEs in the rye genome. Twenty-six TE lineages were found in these identified sequences, and almost all of them could be found in chimeric sequences, which suggested that extensive nested transposition and recombination have occurred among these TE lineages in the rye genome. In addition, the strong FISH signals produced by the chimeric sequences indicated that nested TE insertions, recombination, and duplication of nested sequences contributed substantially to new TE family formation and to rye genome organization and evolution. In addition to the conserved centromeric retrotransposon Cereba, the TE lineage Abiba and three DNA transposons were also found in centromere-located sequences, which suggested that diverse TE lineages are involved in centromere structural organization in rye. To fully understand the structure, organization, potential function and transposition mechanisms of the identified TEs, it will be necessary to obtain their full lengths in further work.
Our study provides valuable insights into the constitution, distribution and diversity of TEs in the rye genome, which are helpful in understanding the roles of TEs in driving rye genome organization and evolution.

AUTHOR CONTRIBUTIONS
YZ, CF, and ZH designed the experiments. YZ conducted the study, processed the data and wrote the manuscript. CF, SL, YC, RW, XZ, FH, and ZH discussed the results and modified the manuscript. All authors have read and approved the final manuscript.

ACKNOWLEDGMENTS
This research was supported by the National Science Foundation of China (31170209) and the National Key Research and Development Plan from the Ministry of Science and Technology of China (2016YFD0102003-10).

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017.01706/full#supplementary-material
FIGURE S1 | Venn diagram showing TE lineages found in the 70 identified sequences. (A) TE lineages were classified based on the types of their residing sequences: non-chimeric sequences or chimeric sequences, the lineages falling in overlapped regions were found in both types of sequences. (B) TE lineages were classified based on the FISH patterns displayed by their residing sequences, TE lineages highlighted in green were exclusively found in sequences displaying signals dispersed from proximal regions toward distal regions; TE lineages highlighted in red were found both in sequences displaying signals dispersed from proximal regions toward distal regions and sequences displaying signals dispersed from distal regions toward pericentromeric positions.