Aegilops crassa Boiss. repeatome characterized using low-coverage NGS as a source of new FISH markers: Application in phylogenetic studies of the Triticeae

Aegilops crassa Boiss. is a polyploid grass species that grows in the eastern part of the Fertile Crescent, Afghanistan, and Middle Asia. It consists of tetraploid (4x) and hexaploid (6x) cytotypes (2n = 4x = 28, D1D1XcrXcr and 2n = 6x = 42, D1D1XcrXcrD2D2, respectively) that are similar morphologically. Although many Aegilops species have been used in wheat breeding, the genetic potential of Ae. crassa has not yet been exploited due to its uncertain origin and significant genome modifications. Tetraploid Ae. crassa is thought to be the oldest polyploid Aegilops species, the subgenomes of which still retain some features of its ancient diploid progenitors. The D1 and D2 subgenomes of Ae. crassa were contributed by Aegilops tauschii (2n = 2x = 14, DD), while the Xcr subgenome donor is still unknown. Owing to its ancient origin, Ae. crassa can serve as a model for studying genome evolution. Despite this, Ae. crassa is poorly studied genetically and no genome sequences were available for this species. We performed low-coverage genome sequencing of 4x and 6x cytotypes of Ae. crassa and of four Ae. tauschii accessions belonging to different subspecies; the diploid wheatgrass Thinopyrum bessarabicum (Jb genome), which is phylogenetically close to D (sub)genome species, was taken as an outgroup. Subsequent data analysis using the RepeatExplorer2 pipeline allowed us to characterize the repeatomes of these species and identify several satellite sequences. Some of these sequences are novel, while others are homologous to already known satellite sequences of Triticeae species. The copy number of satellite repeats in the genomes of different species and their subgenome (D1 or Xcr) affinity in Ae. crassa were assessed by means of comparative bioinformatic analysis combined with quantitative PCR (qPCR). Fluorescence in situ hybridization (FISH) was performed to map newly identified satellite repeats on chromosomes of common wheat, Triticum aestivum, 4x and 6x Ae. crassa, Ae. tauschii, and Th. bessarabicum. The new FISH markers can be used in phylogenetic analyses of the Triticeae for chromosome identification and the assessment of their subgenome affinities and for evaluation of genome/chromosome constitution of wide hybrids or polyploid species.


Introduction
The genus Aegilops L. is closely related to wheat and represents an important gene pool for wheat improvement (Molnár-Láng et al., 2014; Kishii, 2019). Modern taxonomy recognizes 10 diploid and 11 polyploid Aegilops species with various genome compositions (Van Slageren, 1994; Kilian et al., 2011). Six major genomic types, S* (Sitopsis section), U (Ae. umbellulata), C (Ae. markgrafii), M (Ae. comosa), N (Ae. uniaristata), and D (Ae. tauschii), have been distinguished among diploid Aegilops species (Kimber and Tsunewaki, 1988); they are thought to have originated approximately 3 million years ago from hybrid populations of the progenitor of Ae. speltoides (S genome) × ancient diploid wheat (A genome) via the mechanism of homoploid hybrid speciation (Marcussen et al., 2014). Polyploid Aegilops emerged from the hybridization of diploid progenitors carrying different genomic types. Despite the broad genome diversity of polyploid Aegilops, the D (sub)genome was detected in four species and the U (sub)genome in seven. Depending on the presence of the D or U genome, which are designated as "pivotal," all polyploid Aegilops species are divided into the D genome cluster and the U genome cluster (Kimber and Feldman, 1987).
An extinct or still unknown diploid species contributed the second subgenome to Ae. crassa (Dubcovsky and Dvořák, 1995; Dvořák, 1998; Edet et al., 2018). Kihara (1963) proposed that this subgenome could be inherited from Ae. comosa and suggested the genomic formula DM for tetraploid and DDM for hexaploid Ae. crassa. Molecular and meiotic (Kimber and Abu-Bakar, 1981) analyses did not confirm the presence of the M genome in Ae. crassa. Comparison of the restriction profiles of nuclear repeated nucleotide sequences, RNS (Dubcovsky and Dvořák, 1995; Dvořák, 1998), and DArTseq genotyping (Edet et al., 2018) revealed higher similarity of the Xcr subgenome with the S genome of the Sitopsis group, most likely Ae. speltoides, or with the T genome of Ae. mutica (Edet et al., 2018). These observations contradict the result of cytogenetic analysis, which showed correspondence to 5S and 45S rDNA patterns of the Xcr subgenome chromosomes of Ae.
Despite the significant progress made in genome sequencing, the number of DNA probes employed in FISH analysis of cereal species is still very limited. In addition to the pSc119.2 and pAs1 probes that have traditionally been used for chromosome identification and phylogenetic studies of the Triticeae since the mid-1980s (Bedbrook et al., 1980; Rayburn and Gill, 1986), several new DNA sequences have been isolated from nuclear DNA and used as probes in FISH analysis of wheat and Aegilops as well as of other grass species (Kato et al., 2004, 2011; Komuro et al., 2013; Chen et al., 2019; Xi et al., 2019).
Progress in whole-genome sequencing and bioinformatics pipelines makes it possible to obtain detailed information on the structure of the repeatome. To date, genome assemblies of Ae. tauschii, Ae. longissima, Ae. speltoides, Ae. sharonensis, and Ae. bicornis have been obtained (Tiwari et al., 2015; Wang et al., 2021; Avni et al., 2022; Li et al., 2022; Yu et al., 2022). However, the genome of Ae. crassa has not yet been sequenced, which limits the possibilities for its comprehensive study. On the other hand, even unassembled reads can be used to search for new tandem satellite repeats from which chromosomal markers can be developed (Koo et al., 2016; Du et al., 2017; Liu et al., 2018b; Chen et al., 2019; Kroupin et al., 2019b; Nikitina et al., 2020). In particular, comparative genome analysis has been successfully used to obtain specific chromosomal markers for the detection of alien chromosomes in wheat addition lines (Liu et al., 2018a) and for identification of the Y subgenome in Roegneria (Wu et al., 2021). Koo et al. (2016) mapped satellite repeats identified by flow-sorting and sequencing to chromosome 5Mg of Ae. geniculata. However, in most studies of structural genomic diversity of Aegilops, a limited set of "standard" DNA probes based on the tandem repeats pTa71, pTa794, pSc119.2, pAs1, and pTa-713, as well as a number of microsatellites, is still being used (Song et al., 2020; Said et al., 2021). Single-gene FISH probes are also employed to compare structural rearrangements of chromosomes in Aegilops species (Danilova et al., 2014, 2017; Tiwari et al., 2015; Said et al., 2021). Despite the informativeness of the results obtained using single-gene FISH probes, both flow-sorting and creation of cDNA clones remain costly and labor-intensive procedures.
Owing to modern bioinformatics approaches, tandem repeats can be efficiently selected from even shallow whole-genome sequencing data, and new FISH probes can be obtained either by direct labeling of PCR products or by designing labeled oligonucleotides, which facilitates their transfer between different scientific teams (Kroupin et al., 2019a; Lang et al., 2019a; Xi et al., 2020).
In addition to Ae. crassa, the D subgenome is present in hexaploid wheat and several Aegilops species. According to meiotic, cytogenetic, and molecular analyses, the D subgenome of common wheat is not significantly modified relative to the parental genome (Kimber and Zhao, 1983; Rayburn and Gill, 1987; Dvořák et al., 1998) and therefore can be used for tracing evolutionary changes in the orthologous chromosomes of other Triticeae species. In addition, the wheatgrass (Thinopyrum) species are genetically related to D subgenome species of wheat and Aegilops (Chen et al., 2001; Guo et al., 2016; Bernhardt et al., 2017, 2020). Introgression of useful genes from wheatgrass to wheat usually occurs between J (wheatgrass) and D (wheat) chromosomes, which might be due to higher homology between them than with the homoeologues of the A or B subgenomes of wheat (Liu et al., 2007; He et al., 2009; Patokar et al., 2015). High synteny between the Jb genome of Th. bessarabicum and common wheat subgenomes has been detected using a combination of cytogenetic (GISH) and molecular (SNP-mapping) analyses of 12 wheat-Th. bessarabicum introgression lines (Grewal et al., 2018), indicating that the divergence of wheat and Th. bessarabicum genomes was accompanied not by large chromosomal rearrangements, but by alterations of the repeated nucleotide sequences. The comparison of copy number variation of transposable elements between polyploid and diploid Triticeae revealed the similarity between the Jb genome of Th. bessarabicum and the D genome of Ae. tauschii (Divashuk et al., 2019). Therefore, the comparison of Jb and D (sub)genomes in the abundance and chromosomal localization of repeated DNA elements can be informative not only for repeatome and evolutionary studies, but may also have practical application as a source for increasing the genetic diversity of wheat.
The aim of our study was to trace evolutionary changes of Ae. crassa subgenomes (with a special emphasis on the D subgenome) based on a complex approach, which includes low-coverage sequencing followed by identification of repetitive DNA families using bioinformatics, quantitative assessment of repeats using qPCR, and their physical mapping on chromosomes of Ae. crassa (4x and 6x) in comparison with diploid Ae. tauschii (DD), hexaploid common wheat (BBAADD), and diploid wheatgrass Th. bessarabicum (JbJb) as an outgroup.

Plant material
The plant materials used in this study are listed in Table 1. The images of heads, vegetating plants, and spikelets of Ae. crassa accessions K-2485 (4x) and IG 131680 (6x) are shown in Supplementary Figure S1.

Sequencing
Fresh young leaves of growing plants were ground in liquid nitrogen; genomic DNA was then extracted using the CTAB protocol (Rogers and Bendich, 1985) and used for whole-genome sequencing, qPCR, and probe preparation for FISH. The quantity and quality of the extracted DNA were checked using a NanoDrop OneC spectrophotometer (Thermo Fisher Scientific) and by electrophoresis in 0.8% agarose gel, respectively. Only genomic DNA samples with an OD260/280 value ranging from 1.8 to 2.0 and an OD260/230 value ranging from 2.0 to 2.2 were considered to be of good quality. DNA concentration was measured on a Qubit 4 instrument using Qubit™ dsDNA HS and BR Assay Kits (Thermo Fisher Scientific, Waltham, MA, United States). The shotgun libraries were synthesized using the Swift 2S® Turbo DNA Library Kit (Swift Bioscience, Ann Arbor, MI, USA) according to the manufacturer's protocol. A test run to check the quality of the libraries was carried out on the MiSeq instrument with the MiSeq Reagent Nano Kit v2 (300 cycles). Next, the libraries went through the conversion step and were sequenced on DNBSEQ-G400 on 1 lane. The initial amount of DNA was 25 ng, with a fragment length of around 350 bp and paired-end indexing using the Swift 2S Turbo Unique Dual Indexing Kit. The run was performed on the Illumina NextSeq with the NextSeq 500/550 Mid Output Kit v2.5 (300 cycles) as described in Illumina protocols for paired-end reads. The read length was 151 bp and the index length was 8 bp. The sequencing was performed at Genomed, Ltd. (Moscow, Russia).
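The purity criteria stated above can be written as a simple filter. The following is only an illustrative sketch; the function name and the choice of inclusive interval bounds are our assumptions, not part of the published protocol.

```python
def passes_purity_check(od260_280: float, od260_230: float) -> bool:
    """Return True if both absorbance ratios fall within the accepted ranges.

    Assumption: the ranges 1.8-2.0 (OD260/280) and 2.0-2.2 (OD260/230)
    are treated as inclusive at both ends.
    """
    return 1.8 <= od260_280 <= 2.0 and 2.0 <= od260_230 <= 2.2


# Hypothetical readings for two DNA samples.
print(passes_purity_check(1.92, 2.10))
print(passes_purity_check(1.65, 2.10))
```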

Repeats alignment and identification
Global alignment is not sensitive enough when applied to repeat consensus monomers because their start positions are selected arbitrarily. We therefore took another approach: each repeat consensus was tripled and aligned against the consensus database using blastn from the BLAST+ package (v2.9.0, ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.9.0; Camacho et al., 2009) with -task dc-megablast. All the high-scoring pairs (HSPs) for each aligned pair were analyzed, and the total coverage, without overlaps, of the longer sequence by HSPs of the shorter one was calculated with our custom script. We assumed two repeat consensuses to be related if they produced alignments with e-value <0.05 and their total coverage without overlaps was ≥80%. To identify previously known repeats, the NCBI Nucleotide database (Coordinators, 2012) was used, and all the repeat-related sequences of Triticeae were downloaded on Oct 18 2021. Each tandem repeat was designated as follows: AC4x_CL##_###nt for 4x Ae. crassa, AC6x_CL##_###nt for 6x Ae. crassa, ATs_CL##_###nt for Ae. tauschii subsp. strangulata and
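The coverage criterion described in this section can be illustrated with a short sketch. The custom script itself is not reproduced here, so the code below is our assumption of how non-overlapping coverage might be computed: HSPs are given as hypothetical (start, end, e-value) tuples in 1-based inclusive coordinates already expressed on the longer sequence, and overlapping intervals are merged before their lengths are summed.

```python
def triple(consensus: str) -> str:
    """Concatenate a monomer three times so that an alignment can span
    the arbitrary start position of a tandem repeat consensus."""
    return consensus * 3


def merged_coverage(intervals, length):
    """Percentage of a sequence of the given length covered by the
    intervals, counting overlapping regions only once."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return 100.0 * covered / length


def are_related(hsps, length, max_evalue=0.05, min_coverage=80.0):
    """Apply the two thresholds from the text: e-value < 0.05 and
    total non-overlapping coverage >= 80%."""
    kept = [(s, e) for s, e, evalue in hsps if evalue < max_evalue]
    return bool(kept) and merged_coverage(kept, length) >= min_coverage


# Hypothetical HSPs on a 100-bp monomer: two overlapping hits cover 90 bp.
hsps = [(1, 50, 1e-10), (40, 90, 1e-5)]
print(merged_coverage([(s, e) for s, e, _ in hsps], 100))
print(are_related(hsps, 100))
```

Merging intervals before summing is what prevents a region hit by several HSPs from being counted more than once.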

Repeatome structure comparison
RepeatExplorer2 outputs containing proportions of reads by repeat type were parsed. For all the repeat type categories the cumulative proportions, i.e., including subcategories, were summed up and compared between species. For each category the average proportions, standard deviations, and coefficients of variation were obtained.
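The summary statistics described above can be sketched as follows. This is a minimal illustration under stated assumptions: the slash-separated category paths and all numeric values are hypothetical, and the sample standard deviation is used because the text does not specify which estimator was applied.

```python
from collections import defaultdict
from statistics import mean, stdev


def cumulative(proportions):
    """Add each category's proportion to itself and to all of its
    parent categories (paths are assumed to be '/'-separated)."""
    totals = defaultdict(float)
    for path, value in proportions.items():
        parts = path.split("/")
        for i in range(1, len(parts) + 1):
            totals["/".join(parts[:i])] += value
    return dict(totals)


def summarize(values):
    """Return mean, sample standard deviation, and coefficient of
    variation (in percent) for one category across species."""
    avg = mean(values)
    sd = stdev(values)
    cv = sd / avg * 100 if avg else float("nan")
    return avg, sd, cv


# Hypothetical proportions (% of reads) for one species.
one_species = {"LTR/Ty3_gypsy": 20.0, "LTR/Ty1_copia": 10.0, "LTR": 1.5}
print(cumulative(one_species)["LTR"])

# Hypothetical cumulative LTR proportions across three species.
print(summarize([10.0, 12.0, 14.0]))
```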

In silico identification of Xcr subgenome-specific tandem repeats in 4x Aegilops crassa
Raw reads of 4x Ae. crassa were cleaned of adapter sequences and truncated to 100 bp from the 3′ end using bbduk from the BBTools v38.93 toolkit (sourceforge.net/projects/bbmap/). Trimmed reads were mapped to the Ae. tauschii Aet v4.0 genome assembly (Luo et al., 2017) using bwa mem v0.7.17 (github.com/lh3/bwa). For further analysis, the reads that mapped perfectly to the reference assembly were removed using samtools (Danecek et al., 2021). The resulting 831,000 reads were used for de novo identification of tandem repeats with the RepeatExplorer2 pipeline (Novák et al., 2017). Novel tandem repeats were identified by comparison with the previously obtained consensus sequences from 4x Ae. crassa using BLAST (Altschul et al., 1990). Primers for the identified tandem repeats selected for further estimation of repeat number by qPCR and for probe synthesis for FISH by PCR were designed using Primer3Plus (Untergasser et al., 2012), and their sequences are given in Table 2.
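The read-filtering step can be illustrated with a plain-Python sketch operating on SAM text records. The exact samtools options are not given in the text, so the definition of a "perfectly mapped" read used here (mapped, CIGAR consisting of a single full-length match, and an NM:i:0 edit-distance tag) is our assumption, and the function names are hypothetical.

```python
def is_perfect_match(sam_line: str) -> bool:
    """Assumed definition of a perfectly mapped read: the read is mapped,
    aligned end to end without indels or clipping, and has edit distance 0."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])      # bitwise FLAG
    cigar = fields[5]          # CIGAR string
    seq = fields[9]            # read sequence
    if flag & 0x4:             # 0x4 = segment unmapped
        return False
    if cigar != f"{len(seq)}M":
        return False
    return "NM:i:0" in fields[11:]


def keep_imperfect(sam_lines):
    """Discard header lines and perfectly mapped reads; keep the rest."""
    return [
        line for line in sam_lines
        if not line.startswith("@") and not is_perfect_match(line)
    ]


# Hypothetical SAM records: a perfect hit, a hit with one mismatch,
# and an unmapped read.
records = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\tNM:i:0",
    "r2\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\tNM:i:1",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
print([line.split("\t")[0] for line in keep_imperfect(records)])
```

Removing reads that match the D genome reference exactly is expected to enrich the remaining pool for sequences that differ from, or are absent in, Ae. tauschii.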

Real-time quantitative PCR
qPCR using primers developed for repeat monomers (Table 2) was performed in three technical replicates with water as a negative control and VRN1 as a reference gene (Yaakov et al., 2013) according to the protocol described in Kroupin et al. (2019b). The amplification was performed on a CFX Real-Time PCR Detection System (Bio-Rad) in Real-Time PCR Mix reaction mixture with Eva Green (Syntol Ltd., Moscow, Russia) according to the manufacturer's protocol. Primers were synthesized at Syntol Ltd. (Moscow, Russia). The primer

Assemblies' characterization
Repeat assemblies for single species ("individual" assemblies) were prepared for tetraploid and hexaploid Ae. crassa, Th. bessarabicum, and Ae. tauschii subsp. strangulata and subsp. typica. The general features of these assemblies are summarized in Supplementary Table S1. In addition, we compiled three comparative assemblies, in which the number of reads for each species was calculated in direct proportion to ploidy level: 0.5 million TB + 0.5 million ATs + 1 million AC4x; 0.5 million TB + 0.5 million ATt + 0.5 million AC4x; 1 million ATs + 2 million AC4x. For all repeats extracted from the above assemblies, extensive summary tables were constructed (Supplementary Table S2). For each comparative assembly, we extracted and described clusters containing ≥80% repeats from only one species and, where possible, having no highly similar alignments with clusters from individual assemblies of the other species involved in the comparative assembly.
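The ≥80% criterion for species-specific clusters in a comparative assembly can be sketched as below. We assume that the share is computed from per-species read counts within a cluster; the counts and the function name are hypothetical, and no correction for unequal read input per species is applied in this sketch.

```python
def dominant_species(read_counts, threshold=0.8):
    """Return the species contributing at least `threshold` of the reads
    in a cluster, or None if no single species reaches that share."""
    total = sum(read_counts.values())
    if total == 0:
        return None
    species, count = max(read_counts.items(), key=lambda item: item[1])
    return species if count / total >= threshold else None


# Hypothetical read counts for two clusters of a TB + ATs + AC4x assembly.
print(dominant_species({"TB": 900, "ATs": 50, "AC4x": 50}))
print(dominant_species({"TB": 500, "ATs": 300, "AC4x": 200}))
```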
Frontiers in Plant Science 07 frontiersin.org

Repeats' characterization
Altogether, 34 repeats were identified in individual assemblies of Ae. crassa 4x (11 high confidence, 8 low confidence satellites and 1 LTR) and Th. bessarabicum (6 high confidence, 6 low confidence satellites and 2 LTRs). The characteristics of the identified repeats are shown in Supplementary Tables S3, S4, including consensus sequences, layout of the TAREAN graph, estimated proportion of a given repeat in the genome, homology to previously found repeats, and repeats found in this study. All repeat consensus sequences (excluding those shorter than 100 bp) together with the putative satellites of 6x Ae. crassa and Ae. tauschii were clustered after all-to-all blast into 19 groups by their identity and total coverage percentage; based on these data we developed 19 probes. Seventeen of these groups contained sequences from Ae. crassa 4x and were classified into high-probability (№1-№11) and low-probability (№12-№16) repeats, and a putative LTR (№17). We also distinguished a further group (№18), in which a sequence obtained from 6x Ae. crassa was not paired with a sequence from 4x Ae. crassa. Two other satellites (high confidence №19, and low confidence №20) were found in the Th. bessarabicum genome. Based on bioinformatic analysis, all repeats were classified according to copy number as very-low-copy (below 0.29%), low-copy (0.3-0.59%), and common (0.6% and more).
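The copy-number classes defined above can be expressed as a small helper. The published ranges leave narrow gaps between classes (for example, between 0.29% and 0.3%), so the cut-offs of 0.3% and 0.6% used in this sketch are our assumption; the function name is hypothetical.

```python
def copy_class(genome_proportion_percent: float) -> str:
    """Classify a repeat by its estimated genome proportion (in percent).

    Assumed cut-offs: < 0.3 very-low-copy, < 0.6 low-copy, otherwise common.
    """
    if genome_proportion_percent < 0.3:
        return "very-low-copy"
    if genome_proportion_percent < 0.6:
        return "low-copy"
    return "common"


# Hypothetical genome proportions for three repeats.
for proportion in (0.05, 0.45, 1.2):
    print(proportion, copy_class(proportion))
```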
The high-probability repeats, representatives of the groups based on 4x Ae. crassa:
1. AC4x_CL3_339nt is a common repeat found in all species analyzed. The density of the TAREAN graph layout produced for the given length of repeat monomer (Supplementary Figure S2a) is high, suggesting that self-similar features in the monomer influence the graph. Subsequent analysis with the YASS online tool revealed an imperfect palindromic region at positions 58-176 with 60% identity and an e-value of 9.7e−05. Two similar repeats, ATs_CL11_337nt and ATs_CL51_343nt, were identified in the assembly of Ae. tauschii subsp. strangulata by RepeatExplorer2; the latter differed from the analogous Ae. crassa monomer mostly in this palindromic region. AC4x_CL3_339nt was less similar to the P335 repeat from the Ae. crassa genome.
2. AC4x_CL131_334nt is a very-low-copy repeat found in all analyzed species. It is highly similar to P334 found in Ae. tauschii, with 90% identity across the 322 bp alignment, and to pTa-465 (FISH-positive repetitive sequence KC290905.1), with 91% identity across 299 bp.
3. AC4x_CL170_369nt is a very-low-copy repeat found in all species analyzed in this study. It directly corresponds to P369 found in Ae. tauschii and is highly similar to the tandem repeat sequence 4P6-14 found in Ae. tauschii (AY249985.1) and to the ACRI_TR_CL78 satellite sequence found in Agropyron cristatum (MG323512.1).
4. AC4x_CL209_316nt is a very-low-copy repeat found in all species except Th. bessarabicum. It is highly similar to T. aestivum clone p451 (genomic repeat region AF139201.1) and only partially similar to P320 found in Ae. tauschii, aligning over 197 bp with 75% identity.
5. AC4x_CL219_319nt is a very-low-copy repeat found in all species except Th. bessarabicum and Ae. tauschii subsp. strangulata. This sequence was found in tandem in Ae. tauschii (AH013688.3), but was not annotated.
6. AC4x_CL228_312nt is a very-low-copy repeat not yet annotated in the NCBI database and found only in the context of T. aestivum BAC libraries and assemblies. It is, however, found in our assemblies of both Ae. tauschii subspecies and 6x Ae. crassa, but is lacking in Th. bessarabicum. Mapping of trimmed reads from different sources to the AC4x_CL228_312nt monomer (Supplementary Figure S2c) shows similar mapping profiles for Ae. crassa 4x and 6x and Ae. tauschii subsp. typica, more notable SNPs in Ae. tauschii subsp. strangulata, and a lack of this repeat in Th. bessarabicum. AC4x_CL228_312nt is less similar to P321 found in Ae. tauschii, although the two produce an alignment of 242 bp with 92.6% identity.
7. AC4x_CL232_320nt is a very-low-copy repeat found in all studied species except Th. bessarabicum. It is highly similar to T. aestivum clone p451 (genomic repeat region AF139201.1) and shows partial high similarity to P320, aligning over 276 bp with 96% identity.
8. AC4x_CL241_88nt is a very-low-copy repeat found across all species. This repeat is 97.72% identical to Oligo-44 and Oligo-3A1 found in T. aestivum and highly similar to the oligonucleotide BSCL184-2 based on the tandem repeat BSCL184 (88 bp long) found in Th. bessarabicum and to P132 found in Ae. tauschii.
9. AC4x_CL244_376nt is a very-low-copy repeat found in all species except Ae. tauschii. It is highly similar (with 100% identity) to the oligos BSCL1-1 and BSCL1-2, based on the putative satellite BSCL1 (376 bp long) found in Th. bessarabicum, and to DP4J27982 and DP4J28086, developed based on the sequences of chromosome arm 4JbL of Th. bessarabicum.
10. AC4x_CL257_820nt is a very-low-copy repeat found exclusively in 4x Ae. crassa and having no similar annotated sequences.
11. AC4x_CL258_1307nt is a very-low-copy repeat specific to Ae. crassa; it is not annotated either in the NCBI database or in the analyzed literature.
The low-probability repeats, representatives of the groups based on 4x Ae. crassa:
12. AC4x_CL8_584nt is a common repeat, highly similar to the AC6x_CL6_584nt monomer (6x Ae. crassa) and less similar to the monomers ATs_CL16_551nt and ATt_CL17_553nt of both Ae. tauschii subspecies. According to gsnap mapping (Supplementary Figure S2b), the Th. bessarabicum variant of this repeat differs from AC4x_CL8_584nt by many SNPs. It has no similar annotated sequences either in the NCBI database or in the analyzed literature.
13. AC4x_CL60_251nt is a low-copy repeat found exclusively in tetraploid Ae. crassa and only partially similar to an Ae. bicornis RAPD-generated marker sequence (AF120172.1), aligning over 200 bp with 78% identity.
14. AC4x_CL193_504nt is a very-low-copy repeat found in all species except Th. bessarabicum. It is somewhat similar to T. aestivum clone pTa-451 (FISH-positive repetitive sequence KC290912.1), with an alignment of 500 bp at 68% identity.
15. AC4x_CL239_178nt is a very-low-copy repeat found in all species except Ae. tauschii. This sequence is tandemly organized in the Ae. tauschii sequence AH013688.3, but is not annotated either in the NCBI database or in the analyzed literature.
16. AC4x_CL261_553nt is a very-low-copy repeat found in all species except Ae. tauschii. It is less similar to T. aestivum clone CentT550 (satellite sequence MN161206.1), aligning across 540 bp with 83.6% identity.
The putative LTR, representative of the groups based on 4x Ae. crassa:
17. AC4x_CL18_487nt is a common repeat, similar to the FAT element.
The high-probability repeats, representative of the groups based on 6x Ae. crassa:
18. AC6x_CL232_145nt is a low-copy repeat found only in 6x Ae. crassa and in both Ae. tauschii subspecies, being fully identical to P436 found in Ae. tauschii. The FISH probe developed based on AC6x_CL232_145nt is designated CL27_232.
The high-probability repeats, representative of the groups based on Th. bessarabicum:
19. TB_CL2_379nt is a common repeat found in all species except Ae. tauschii and somewhat similar to AC4x_CL244_376nt, with 62% identity. It is also highly similar to the oligos BSCL1-1, BSCL1-2, DP4J27982, and DP4J28086, with 100% identity (see №9).
The low-probability repeats, representative of the groups based on Th. bessarabicum:
20. TB_CL148_662nt is a low-copy repeat not found in 6x Ae. crassa and Ae. tauschii subsp. strangulata. It is highly similar to AC4x_CL162_661nt and to the A. cristatum ACRI_TR_CL80 satellite sequence (MG323513.1).

Repeats' comparative assembly analysis
The comparative analysis allowed us to find repeats unique for 4x Ae. crassa, which are absent in other studied species.
TB + ATs + AC4x_CL82_379nt and TB + ATs + AC4x_CL88_376nt are presumably Th. bessarabicum-specific, although this assembly provides no additional information: the first repeat has a clear analogue in the Th. bessarabicum individual assembly, and the second, contrary to what its share of 4x Ae. crassa reads would suggest, aligns well with a known AC4x repeat. However, both of them are absent in Ae. tauschii subsp. strangulata. TB + ATs + AC4x_CL217_316nt and TB + ATs + AC4x_CL234_319nt are 4x Ae. crassa-specific, and both were found in the 4x Ae. crassa single-species assembly (Supplementary Table S5).
TB + ATt + AC4x_CL87_379nt and TB + ATt + AC4x_CL93_376nt are both absent in Ae. tauschii subsp. typica and are equivalent to the first pair of repeats described above for the previous assembly. TB + ATt + AC4x_CL204_316nt and TB + ATt + AC4x_CL220_319nt are 4x Ae. crassa-specific and are equivalent to the last pair of repeats described above for the previous assembly (Supplementary Table S6).

In silico identification of Xcr subgenome-specific tandem repeats in 4x Aegilops crassa
Four novel tandem repeats were identified in the RepeatExplorer2 output among the filtered reads (Supplementary Table S3). A homology search showed that they have no similarity to previously identified tandem repeats of 4x Ae. crassa. Biased filtering (removing reads that map perfectly to the Ae. tauschii D genome) probably increases the fraction of these reads of interest and makes it possible to assemble the consensuses and classify them as tandem repeats. A BLASTn search against the entire Nucleotide database showed numerous hits with satellite and transposon sequences from other grass species for CL162 and CL244 and no hits against repeat-related sequences for CL257 and CL262.

Assessment of copy numbers of repetitive DNA clusters using qPCR
The threshold cycle (Ct) values were determined by qPCR for each repeat in each species, and the relative copy number of each repeat was calculated by normalization against the single-copy gene VRN1 (Supplementary Table S8). The decimal logarithm of the relative quantity (RQ) was used for copy number comparison (Table 3), and all repeats were conventionally (with high probability) divided into high-copy (log10 RQ > 4), medium-copy (2 < log10 RQ < 4), and low-copy (log10 RQ < 2). In 4x and 6x Ae. crassa, CL3, CL18, CL209, CL170, CL8, and CL239 represented high-copy repeats; CL131 and CL2 were low-copy repeats, while the others were classified as medium-copy repeats. In the Ae. tauschii genome, CL3, CL18, and CL170 were highly abundant; CL148, CL60, CL8, CL232, CL193, CL261, CL228, and CL27_232 were moderately abundant, while the others were found to be low-copy repeats.
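The normalization against VRN1 can be illustrated with a short calculation. The formula is not stated in the text, so this sketch assumes the common delta-Ct model with an amplification efficiency of 2 per cycle, RQ = 2^(Ct_VRN1 − Ct_repeat); the Ct values are hypothetical, and values falling exactly on a class boundary are assigned to the lower class by assumption.

```python
import math


def log10_rq(ct_repeat: float, ct_reference: float, efficiency: float = 2.0) -> float:
    """Decimal logarithm of the relative quantity under the assumed model
    RQ = efficiency ** (ct_reference - ct_repeat)."""
    return (ct_reference - ct_repeat) * math.log10(efficiency)


def abundance_class(log_rq: float) -> str:
    """Classes from the text: high-copy (> 4), medium-copy (2 to 4),
    low-copy (< 2)."""
    if log_rq > 4:
        return "high-copy"
    if log_rq > 2:
        return "medium-copy"
    return "low-copy"


# Hypothetical mean Ct values; VRN1 (reference) is detected at cycle 28.
for name, ct in (("repeat_a", 12.0), ("repeat_b", 20.0), ("repeat_c", 25.0)):
    value = log10_rq(ct, 28.0)
    print(name, round(value, 2), abundance_class(value))
```

Under this model, each cycle by which a repeat precedes the reference gene corresponds to roughly 0.3 units of log10 RQ.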
Two Ae. crassa repeats, CL8 and CL60, and the Th. bessarabicum repeat CL148 are distributed evenly along the entire lengths of all chromosomes irrespective of their (sub)genome affinities (Supplementary Figures S3, S7). Because such dispersed signals are of little use for genome or chromosome identification, we excluded these repeats from further analyses.
Other DNA probes containing already known and novel repeated sequences demonstrated discrete signals. Three of them are species-specific and produce signals on one chromosome (pair) each. Thus, CL257 and CL258 occur only in Ae. crassa (4x and 6x). Small CL257 signals appear in the subterminal region of 5D1S  S4). Although 6x Ae. crassa possesses two copies of the D* subgenome, only one chromosome pair belonging to the D1 subgenome carries the signals of each of the two abovementioned probes; signals are a little more intense for CL258 than for CL257 (Figures 1A,I). The CL193 probe produces a single hybridization site on the tip of the short arm of one of the two Th. bessarabicum homologous chromosomes 7J (Figure 1O). FISH fails to detect CL193 on Ae. crassa, Ae. tauschii, or common wheat chromosomes.
Large signals of the CL27_232 probe are found in the proximal region of 3D1 short arms and faint signals in a distal part of 5Xcr long arms in both 4x and 6x cytotypes of Ae. crassa (Figure 2A; Supplementary Figures S3, S4). Hexaploid Ae. crassa possesses additional CL27_232 signals in the short arm of 3D2 and in the terminus of the short arm of 4D1, which was transferred from the Fcr short arm as a result of species-specific translocation T2 (Figure 2A; Supplementary Figure S4). Only one pair of clear CL27_232 signals is detected on chromosome 3D of Ae. tauschii (Figure 2C) and 3D of common wheat (Figure 2J).
The CL244 probe displays relatively poor hybridization to Ae. crassa chromosomes. Thus, three pairs of very faint signals appear in the terminal regions of 1XcrL, AcrL, and 4D1S of 4x Ae. crassa (Figures 1A, 3A,B). In the karyotype of 6x Ae. crassa the largest CL244 signals appear on 5D1S, and smaller signals on chromosomes AcrL and FcrS. As Fcr was modified following species-specific translocation T2, this site is probably derived from 4D1S (Figure 3C). Common wheat possesses three pairs of very small CL244 signals on chromosomes 2AS, 1BS, and 4BL (Figures 3E,F), and no hybridization is detected in Ae. tauschii. On the contrary, most Th. bessarabicum chromosomes carry intense hybridization sites in either the short or long arm (Figure 3D). Signals appear on both homologous chromosomes 1JS, 5JS, and 7JS, but only on one homolog of the 3JL and 6JS pairs.
The probe CL241 hybridized to the proximal part of the short arms of all group 5 chromosomes of 4x and 6x Ae. crassa (Figure 2G; Supplementary Figures S3, S4) and of Ae. tauschii. In common wheat, the CL241 sites on 5A and 5B are located in the long, but not in the short arms, while additional CL241 loci are found on 3A (long arm) and both arms of 7A (Figure 2I; Supplementary Figure S6). Chromosome 7J of Th. bessarabicum carries two CL241 sites in opposite arms (Figure 1O; Supplementary Figure S7). Additional CL241 sites are observed in the middle of 4JL, but there are no signals on group 5 chromosomes.
The CL209 signals appear mainly on the Xcr subgenome chromosomes of Ae. crassa. Small CL209 sites are present on chromosomes Ccr (middle of the short arm), 5Xcr (distal third of the long arm), and on a distal part of the 7D1L arm (Figure 1J; Supplementary Figures S3, S4). Hexaploid Ae. crassa has additional large subtelomeric CL209 clusters on chromosomes 1XcrS and in a distal part of 6D2L (Figure 2F; Supplementary Figure S4). CL209 is absent in Ae. tauschii and Th. bessarabicum, but produces very weak signals on common wheat chromosomes 2BS, 2DL, and 6DS (Supplementary Figure S6).
The CL219 sequence is present in abundance on Ae. crassa (4x and 6x) chromosome 7D1, which contains two prominent clusters located in the proximal half of the short arm and in the distal third of the opposite arm (Figures 1G, 2B; Supplementary Figures S3, S4). In addition, small signals occur in a distal part of 5D1S and in the terminus of FcrS. In the 6x cytotype the site from chromosome Fcr is transferred onto chromosome 4D1S following species-specific translocation T2. Hexaploid Ae. crassa also contains clear CL219
FISH reveals a large signal of the CL232 probe in a distal part of the long arm of chromosome 2D (or its derivative) in all wheat and Aegilops species (Figure 4; Supplementary Figures S3-S6), while a smaller site in the short arm of 2D is present only in Ae. tauschii, common wheat, and the D2 subgenome of Ae. crassa (6x). Prominent subtelomeric CL232 signals are detected in the long arm of 1Xcr in 4x (Figures 1D,I) as well as in 6x Ae. crassa, which also possesses a smaller signal terminally in the short arm (Figure 4). Hexaploid Ae. crassa has smaller signals of CL232 in a terminal part of 4D1S and distal quarters of 7D1L and 6D2L. A small CL232 site in 7DL appears in diploid Ae. tauschii (Figure 2D), and this sequence is absent in the Th. bessarabicum genome.
CL239 is not detected in Ae. tauschii and common wheat; it is also absent in the D2 subgenome chromosomes of 6x Ae. crassa (Figure 4). Both Ae. crassa cytotypes contain large CL239 clusters in subterminal regions of several Xcr and D1 chromosomes (Figure 1D). Among them, five sites are common (BcrL, CcrL, 5XcrL, 2D1L, and 3D1), but the tetraploid form contains two additional loci on 1D1L and 2D1S. Two pairwise CL239 signals and one odd signal are detected terminally on Th. bessarabicum chromosomes 2JL and 7JS + L (Supplementary Figure S7).
The  Figure 2J). Ae. crassa contain several o-45 sites located in the middle of 6D 1 S; on the chromosome A cr S, close to the centromere; interstitially and terminally in 1X cr L; in the middle of 5X cr L; and in pericentromere of 6X cr (4x and 6x Ae. crassa) and 7D 2 (6x Ae. crassa; Supplementary Figures S3, S4).
All studied species possess the CL261(=CL198) sequence. According to FISH, it localizes in pericentromeric regions of most chromosomes and signal intensity varies between chromosomes and between species (Figures 1J, 2F,H; Supplementary Figures S3-S7).
According to FISH, the CL170 sequence is present in the D (sub)genome(s) of Triticum and Aegilops species and in the J b genome of Th. bessarabicum (Figure 2H; Supplementary Figures S3-S7). The distribution of CL170 sites along the D (sub)genome chromosomes is highly conserved across species (Figure 5). Two clear signals are located interstitially in the opposite arms of 1D; two signals in the middle of 4DL are interrupted by a small pTa-713 site; a large signal is present proximally in the 5D short arm alongside a distinct pTa-713 site, and large double signals occur in the proximal half of 6DS (Figures 1B,E,H,K). A distinct CL170 site occurs in the middle of chromosome 2D of Ae. tauschii (Figures 2H, 5, TAU) and 2D of common wheat (Figure 5, D AEST). Two small CL170 sites are detected on opposite arms of chromosome A cr of the tetraploid and in the long arms of the A cr and 6X cr chromosomes of the hexaploid Ae. crassa cytotype; the split of the two CL170 sites between two different chromosomes is due to species-specific translocation T1. Hexaploid Ae. crassa contains CL170 sites in both the long (subterminal) and short (double distal) arms of chromosome 7D 2 , which are absent on the orthologous chromosomes of other species.
[Figure caption] Localization of CL244 (green) repeat on Ae. crassa, 4x (A,B), Ae. crassa, 6x (C), Th. bessarabicum (D), and common wheat (E,F) chromosomes. Chromosomes are identified using pAs1 (B,C, red), pTa-713 (D, red) or GAA n (F, green) and pTa-535 (F, red). Chromosomes carrying CL244 signals are indicated. Scale bar, 10 μm.
CL228 is located predominantly on the D 1 subgenome chromosomes of 4x (Supplementary Figure S3) and 6x Ae. crassa (Figures 1M, 2B; Supplementary Figure S4). Two clear signals are detected in the terminus and in the middle part of the 1D 1 and 6D 1 short arms; a prominent, probably double signal is observed in the terminal part of 3D 1 L, and one or a pair of small signals are present in opposite arms of 2D 1 and 7D 1 (Figure 6). Distinct signals are found on chromosomes B cr L and A cr L (6x Ae. crassa)/6X cr L (4x Ae. crassa); in the latter case the signal exchange is due to species-specific translocation T1.
Five pairs of the D 2 subgenome chromosomes of 6x Ae. crassa possess CL228 sites. The largest signal occurs on 3D 2 L, similarly to 3D 1 L, and other intense sites are present in the proximal third of 2D 2 L and a distal part of 7D 2 S. Signals detected on chromosomes 1D 2 and 6D 2 are very weak; however, their location is similar to that observed on the orthologous D 1 subgenome chromosomes (Figure 6).
The CL228 probe hybridized to all chromosomes of diploid Ae. tauschii subsp. strangulata; however, hybridization is much weaker than in Ae. crassa (Figure 6; Supplementary Figure S5). Labeling patterns of the D subgenome chromosomes of common wheat are similar to those of Ae. tauschii, except for the lack of signals on 7D and less intense sites on 1D. The CL228 sequence also hybridized to some A and B subgenome chromosomes of common wheat. Small but distinct signals are present on 6BS close to the centromere and in distal parts of 2AS and 7AS. Several very weak but consistent signals appear on chromosomes 2B, 4B, 5B, 1A, 3A, 4A, and 6A (Figure 6).
CL131 is homologous to T. aestivum clone pTa-465. The CL131 hybridization patterns obtained in our study on common wheat chromosomes (Figure 7, AEST; Supplementary Figure S6) correspond to those previously reported for pTa-465, thus confirming that these sequences are homologous. In common wheat and Ae. tauschii karyotypes only four out of seven D (sub)genome chromosomes carry small CL131 signals; these are 1D, 2D, 6D, and 7D in Aegilops and 2D, 5D, 6D, and 7D in wheat. Both Ae. crassa cytotypes carry numerous CL131 sites (Figure 1G), which have higher intensities and appear on both D 1 and X cr chromosomes (Figure 7). The most prominent sites appear on chromosome 2D 1 ; subterminal signals on 3D 1 L and 3D 2 L are also large. Prominent CL131 signals are observed in the perinucleolar region of 6X cr and in subtelomeric parts of 7C cr L chromosomes; the hexaploid accession also contains a clear signal on C cr S (Figure 7; Supplementary Figures S3, S4).
P332 is homologous to the pTa-k566 sequence and has a similar distribution pattern on common wheat chromosomes (Figure 8). FISH detected P332 signals of variable sizes on most Ae. crassa (4x and 6x) and Ae. tauschii chromosomes (Figures 2E, 8; Supplementary Figures S3-S6), but not in Th. bessarabicum. No intraspecific variation of labeling patterns has been observed in Ae. crassa, and the 4x and 6x cytotypes differ only in the size of the P332 site on chromosome C cr (Figure 8). The D 1 subgenome differs from D 2 in the labeling patterns of 2D and, to a lesser extent, of the 5D and 7D chromosomes, and the D 2 subgenome shows high similarity with the D (sub)genomes of Ae. tauschii and common wheat.
FIGURE 5 Comparison of CL170 patterns on chromosomes of different cereal species: TAU, Ae. tauschii ssp. strangulata; D 1 , 4x cr, D 1 subgenome of tetraploid Ae. crassa; D 1 , 6x cr and D 2 , 6x cr, D 1 and D 2 subgenomes of hexaploid Ae. crassa; AEST, D subgenome of T. aestivum; J, bessarab, J b genome of Th. bessarabicum. 1-7, homoeologous groups. T1, T2, T10, translocated chromosomes of Ae. crassa. T10 causes a small length reduction of the chromosome region distal to the CL170 site in the long arm of 1D 1 .
CL18 is homologous to FAT repeat and is similar to it in chromosomal distribution. Thus, CL18 signals localize unevenly along the length of all chromosomes in all species studied. It shows more intense labeling in proximal chromosome regions ( Figures 2G,I; Supplementary Figures S3-S6) being especially abundant in the chromosome(s) 4DS. The sequence CL3 is homologous to pAs1 clone and CL187 to pSc119.2. Labeling patterns of these probes are almost identical to those reported earlier for the respective sequences and are not described in this paper.
Six new repeats, five identified in Ae. crassa (CL8 = CL16, CL131 = CL149, CL170, CL239, and CL241) and the Th. bessarabicum-derived repeat CL148, were shared by Aegilops and Thinopyrum species. Two Th. bessarabicum sequences, CL16 (homologous to CL8) and CL148, are dispersed along all chromosomes; of them, CL16 is more abundant in proximal and CL148 in distal chromosome regions (Supplementary Figure S7). The probe CL198 (related to CL261) hybridizes to pericentromeric chromosome regions, while CL2, CL193, CL239, and CL244 hybridize to subterminal regions of one to several Th. bessarabicum chromosomes. A clustered pattern of CL2 is observed in terminal regions of the J b genome chromosomes. Three probes, CL149 (=CL131), CL241, and CL170, hybridize to interstitial regions of Th. bessarabicum chromosomes. Signals obtained with CL131 are very faint and inconsistent, and are therefore not reliable for chromosome identification. Together, CL241 and CL170 can serve as reliable chromosomal markers for Th. bessarabicum (Supplementary Figure S7). For all these repeats, heteromorphism of homologous chromosomes in signal presence and/or size is often observed.

Discussion
Repeated nucleotide sequences are a major component of plant genomes and play an important role in evolution (Salina et al., 2004b; Sharma and Raina, 2005; Dvořák, 2009; Mehrotra and Goyal, 2014; Macas et al., 2015; Liu et al., 2019). Divergence of diploid species or formation of new species via polyploidization is often associated with alterations in the repetitive DNA fraction, manifested in the emergence of new repeated DNA families, amplification or elimination of repeats, or their redistribution between chromosomes (Zhao et al., 1998; Hemleben et al., 2007; Macas et al., 2015; Liu et al., 2019; Kuo et al., 2021; Waminal et al., 2021). Repetitive DNAs are located in structurally and functionally important chromosome regions (Heslop-Harrison et al., 2003; Shapiro and von Sternberg, 2005; Sharma and Raina, 2005; Heslop-Harrison and Schwarzacher, 2011; Mehrotra and Goyal, 2014; Garrido-Ramos, 2015) and are essential for the maintenance of chromosome and genome integrity. On the other hand, chromosomal breaks causing chromosomal rearrangements often occur at sites enriched with repetitive DNA (Raskina et al., 2008; Molnár et al., 2011; Murat et al., 2017; Pollak et al., 2018). Owing to this, many researchers studying genome evolution and speciation in plants have focused on the analysis of repetitive DNA (Song et al., 2020; Ebrahimzadegan et al., 2021; Waminal et al., 2021). Fluorescence in situ hybridization is one of the most broadly exploited approaches in this field.
FIGURE 6 Comparison of CL228 labeling patterns on chromosomes of different cereal species: TAU, Ae. tauschii ssp. strangulata; D 1 , 4x cr, D 1 subgenome of tetraploid Ae. crassa; D 1 , 6x cr and D 2 , 6x cr, D 1 and D 2 subgenomes of hexaploid Ae. crassa; AEST, A, B, and D subgenomes of T. aestivum. 1-7, homoeologous groups. White arrowheads show minor CL228 sites detected on the X cr , A, B, and D subgenome chromosomes.
Development of new markers for genome studies is an important task in molecular biology and cytogenetics (Du et al., 2017; Meng et al., 2018; Said et al., 2018; Liu et al., 2019; Nikitina et al., 2020; Xi et al., 2020; Zagorski et al., 2020; Li et al., 2021; Liu and Zhang, 2021; Singh et al., 2021). Current progress in plant genome sequencing and bioinformatic analysis has opened broad perspectives for discovering new DNA sequences that can potentially be used as FISH probes, and the number of such markers is increasing rapidly. In our current study, as in some other publications (Zagorski et al., 2020; Waminal et al., 2021), we combine the benefits of molecular biology (in vitro), bioinformatics (in silico), and molecular cytogenetics (in situ) to gain deeper insight into the genome organization and karyotype evolution of Ae. crassa. The integration of different methods allowed us to identify several new repetitive DNA families, to assess their genome abundance, and to map them on chromosomes by FISH. We reveal the complicated genome organization of Ae. crassa and trace changes in the pattern of repetitive DNA over the course of evolution.

Novel markers for chromosome and genome identification
The results obtained in the current study provide more detailed information on the genome and chromosome organization of Ae. crassa and related species. We found that some repetitive sequences are widespread, whereas others are restricted to a particular species, subgenome(s), homoeologous group, or even a single chromosome of a particular species. Thus, eight repeats isolated from Ae. crassa (CL8, CL18, CL131, CL170, CL239, CL241, CL261 = CL198, and CL244) and one Th. bessarabicum repeat (CL148) were found to be common between wheat, Aegilops, and Th. bessarabicum; the results of the qPCR and FISH assays correlate with each other rather strongly. Importantly, at least two of these sequences, CL241 and CL170, exhibited unique labeling patterns, which differed from those of other FISH probes suggested for cytogenetic analysis of this grass (Du et al., 2017; Grewal et al., 2018; Chen et al., 2019; Badaeva et al., 2019b). The probes CL2 and CL244 produced large signals overlapping with the position of C-bands on Th. bessarabicum chromosomes (Mirzaghaderi et al., 2010); these sequences may be important constituents of heterochromatin blocks in this species. Together with the CL244 and CL2 probes, the sequences CL241 and CL170 can be used as supplementary probes in FISH analysis of wheatgrass and its hybrids with common wheat. Our analyses showed that individual Ae. crassa chromosomes differ in the number and combination of repeated families, their abundance, and physical distribution. Chromosome 5D 1 , comprising nearly all known repetitive DNA families, showed the highest diversity, while only a few variants were recorded for 2D 1 and 7X cr .
FIGURE 7 Comparison of CL131 labeling patterns on chromosomes of different cereal species: TAU, Ae. tauschii ssp. strangulata; D 1 , 4x cr, D 1 subgenome of tetraploid Ae. crassa; D 1 , 6x cr and D 2 , 6x cr, D 1 and D 2 subgenomes of hexaploid Ae. crassa; AEST, A, B, and D subgenomes of T. aestivum; J, bessarab, J b genome of Th. bessarabicum. 1-7, homoeologous groups.
Five variants of repetitive DNA hybridized exclusively to terminal chromosome regions: CL239, CL244, CL257, CL258, and CL2 (only in Th. bessarabicum). One of the newly discovered Ae. crassa repeats, CL261, was localized in pericentromeric regions of several chromosome pairs of wheat, Ae. crassa, and Ae.
CL219 and pTa-713 repeats and much lesser amounts of CL232, CL18, and pTa-k566 sequences. As in the previous case, a large Giemsa band was observed in the respective chromosome region (Badaeva et al., 1998, 2002). Repeats CL27_232 and CL241 found in Triticum-Aegilops species occupied a similar position on the orthologous chromosomes. CL27_232 clusters appeared in the short arm of 3D of common wheat, Ae. crassa, and Ae. tauschii. Although a few additional minor signals have been detected on chromosomes 4D 1 (T2) and 5X cr of 6x Ae. crassa, the CL27_232 repeat can be used as a marker of the short arm of chromosome 3D. Such chromosome-specific markers based on single-copy genes (Danilova et al., 2014) or pooled oligo-probes have been developed for wheat and have found broad application in phylogenetic studies of the Triticeae (Danilova et al., 2017). Repeat CL241 was found in the short arms of all homoeologous group 5 chromosomes of Ae. crassa and Ae. tauschii. This synteny, however, was disturbed in wheat and Thinopyrum species, which possessed clear CL241 sites on A, B, and J b (sub)genome chromosomes, but in different positions.
Two sequences, CL170 and CL228, occurred mainly in the D 1 subgenome. The hybridization pattern of the CL170 probe was highly conserved across the D (sub)genomes of wheat and Aegilops species (Figure 5), whereas the more intense hybridization sites of CL228 on Ae. crassa chromosomes compared to Ae. tauschii or common wheat (Figure 6) suggested sequence amplification following speciation of this amphiploid. Interestingly, labeling patterns of the CL170 and CL244 probes on Ae. tauschii subsp. strangulata and the D subgenome chromosomes of common wheat differed from those on the D 1 and D 2 chromosomes, suggesting that these differences may already have existed in the genome of the diploid progenitor. This assumption is legitimate, as many molecular (Luo et al., 2007; Wang et al., 2013; Li et al., 2018; Singh et al., 2019; Gaurav et al., 2021) and cytogenetic (Majka et al., 2017; Zhao et al., 2018; Badaeva et al., 2019a; Ebrahimzadegan et al., 2021) data support significant genome divergence between the two Ae. tauschii subspecies.
Two repeats were more abundant in the X cr subgenome, CL131, homologous to pTa-465, and P332, homologous to pTa-k566 (Komuro et al., 2013), but also occurred in the D 1 and D 2 subgenomes (Figures 7, 8; Supplementary Figures S3, S4). Both probes hybridized to subtelomeric and interstitial chromosome regions and showed different labeling patterns between the D genomes of Ae. tauschii subsp. strangulata and tauschii vs. the D 1 and D 2 subgenome chromosomes of Ae. crassa. According to our current results, CL131 is more abundant in the X cr than in the D 1 subgenome of Ae. crassa (Figure 7). Thus, three prominent CL131 clusters were detected on 2D 1 and another one, overlapping with CL228, on 3D 1 L. Such prominent CL131 (pTa-465) clusters were not found in wheat (Komuro et al., 2013) or in subsp. strangulata (accession K-112), but smaller pTa-465 sites in similar positions were recorded in several Ae. tauschii accessions by Majka et al. (2017). Unfortunately, these authors did not provide a full taxonomic description of the material they used.
Most CL131 sites on X cr chromosomes were small and only 6X cr possessed large signals comparable in size with signals on chromosome 4A of wheat. We found differences between 4x and 6x accessions in the size of CL131 sites on chromosomes C cr S and F cr L, which are probably not related to the formation of hexaploid form, whereas modifications of labeling patterns of F cr S and 4D 1 S chromosomes were likely caused by species-specific translocation T2. No traces of CL131 repeat were recorded on Th. bessarabicum chromosomes by FISH, which corresponds to qPCR results.
According to bioinformatics and qPCR, the novel CL239 repeat is absent from the Ae. tauschii genomes. Indeed, FISH detected CL239 sites on most X cr chromosomes, but only on two D 1 subgenome chromosomes. In addition to Ae. crassa, CL239 was discovered in Th. bessarabicum, but it was absent in the D (sub)genomes of wheat and Ae. tauschii, and in the D 2 subgenome of 6x Ae. crassa, in agreement with results obtained by other methods. Based on these observations, we suggest that the CL239 sequence was probably contributed to Ae. crassa by the putative progenitor of the X cr subgenome, and that it spread to the D 1 subgenome chromosomes during subsequent evolution of the species. Alternatively, this sequence could have been present in a putative progenitor of the D 1 subgenome and, after formation of the primary allopolyploid, was amplified in the polyploid descendant but eliminated from the diploid ancestor. Amplification of the CL131 and CL239 repeats in Ae. crassa can be suggested based on the lack of these sequences in the Ae. tauschii genome. Comparison with 6x Ae. crassa, however, showed that this is true only for CL239, because most D 2 subgenome chromosomes had a CL131 hybridization pattern similar to that of the D 1 chromosomes.
Although the CL232 repeat hybridized to only a few Ae. crassa chromosomes, it helped us to shed some light on the structure of chromosome 2D 1 , the origin of which is still highly speculative. Labeling patterns of all repeats used in our current and in all previous studies (Badaeva et al., 1998, 2002, 2021; Abdolmalaki et al., 2019) showed that the most drastic changes occurred in this Ae. crassa chromosome as compared to Ae. tauschii. Chromosome 2D 1 was assigned to the D subgenome because (i) it showed distinct hybridization with the D genome-specific probe pAs1; and (ii) the labeling patterns of the CL232 and CL239 probes on the long arm of 2D 1 were almost identical to those on the long arm of the 2D orthologs of Ae. tauschii, the D 2 subgenome of 6x Ae. crassa, and the D subgenome of common wheat. At the same time, 2D 1 lacks hybridization sites of CL170, CL228, and pTa-k566, and has acquired three CL131 clusters, which points to a significant structural rearrangement of Ae. crassa chromosome 2D 1 .
Preliminary qPCR estimates of repeat abundance generally agreed with the FISH localization of the identified repeats on chromosomes of the studied species. Thus, CL8, CL18, CL60, and CL148 demonstrated a high copy number and noticeable dispersed signals on the chromosomes in all studied species. CL3, CL261, CL27_232, and CL170 also showed a high copy number in all studied species according to qPCR, together with a discrete pattern of hybridization on chromosomes. The species specificity revealed by qPCR was also observed in the FISH results: CL257 and CL258 are found only in Ae. crassa, CL209 and CL219 in Ae. crassa
However, we also revealed differences between the qPCR data and the FISH localization of repeats on chromosomes for CL193, CL2, and CL131. Although CL193 showed a high copy number in all species in the qPCR experiment, it hybridized to one pair of Th. bessarabicum chromosomes only. CL193 is homologous to the dispersed-clustered FAT and P631 repeats, as well as to the microsatellite-related FISH-positive repetitive sequence pTa-451 (Supplementary Table S9). We can assume that the CL193 repeat is strongly dispersed throughout the chromosomes without forming clusters, so that FISH is not able to detect it. On the other hand, qPCR in this case could give false-positive results due to primer annealing and non-specific amplification on other repetitive elements. CL2, homologous to terminal repeats of various Triticeae species (Supplementary Table S9), was absent in the studied species according to qPCR in our experiments, but showed bright terminal signals on the Th. bessarabicum chromosomes. Similarly, CL131 showed extremely low abundance in qPCR, but produced clear local signals on the chromosomes of all species studied. In these cases, qPCR gave false-negative results, which can be explained by the low efficiency of the selected primers or by regions that are difficult for the polymerase to amplify. Nevertheless, despite these individual cases, the qPCR method has in general demonstrated its suitability for preliminary screening of novel DNA repeats for their application as chromosomal markers.
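The comparison of qPCR and FISH outcomes described above can be summarized as a small concordance table. The sketch below is illustrative only: the qualitative calls are transcribed from the text, while the encoding ("high", "low", "absent" for qPCR; True or False for a visible FISH signal), the choice to treat only "high" as qPCR-positive, and the function name `classify` are assumptions introduced here.

```python
# Qualitative calls transcribed from the text. The encoding is an
# illustrative assumption, not the authors' scoring scheme.
observations = [
    # (repeat, species, qPCR call, FISH signal seen)
    ("CL8", "Ae. crassa", "high", True),
    ("CL193", "Ae. crassa", "high", False),
    ("CL2", "Th. bessarabicum", "absent", True),
    ("CL131", "T. aestivum", "low", True),
]


def classify(qpcr_call, fish_signal):
    """Label agreement between a qPCR call and a FISH observation.

    Assumption: only a "high" qPCR call counts as qPCR-positive.
    """
    qpcr_positive = qpcr_call == "high"
    if qpcr_positive and fish_signal:
        return "concordant (present)"
    if not qpcr_positive and not fish_signal:
        return "concordant (absent)"
    if qpcr_positive:
        return "qPCR-positive, FISH-negative"
    return "qPCR-negative, FISH-positive"


results = {(r, s): classify(q, f) for r, s, q, f in observations}
for key, label in results.items():
    print(key, "->", label)
```

With this encoding, CL193 falls into the qPCR-positive, FISH-negative class, while CL2 and CL131 fall into the qPCR-negative, FISH-positive class, mirroring the false-positive and false-negative cases discussed in the text.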

Evolutionary changes of repetitive DNA families in Aegilops crassa genome
The five repeats we identified, CL2, CL239, CL244, CL257, and CL258, are distinguished by their conserved subtelomeric localization. Sequence analysis showed that CL2, which was found in the Th. bessarabicum genome, is homologous to terminal/subterminal repeats found in Th. bessarabicum, Leymus racemosus, Dasypyrum villosum, and Secale cereale (Wilkes et al., 1995; Francki et al., 1997; Kishii et al., 1999; Pace et al., 2011; Du et al., 2017; Chen et al., 2019), as well as to CL244; the CL244 repeat itself showed homology to terminal repeats of Th. bessarabicum, Ae. speltoides (Spelt52.1), and S. cereale (including pSc200; Appels et al., 1981; Vershinin et al., 1995; Salina et al., 2004a; Du et al., 2017; Chen et al., 2019), whereas CL239 is homologous to the Spelt1-like repeat Tri-MS-6 (EF469549.1; Supplementary Table S9). Subtelomeric repeats are localized in terminal heterochromatic blocks, and their copy number may vary between accessions and between species and can change during evolution. Moreover, even homologous FISH probes may give different signals (or no signals at all) in the same species. For example, the oligo-probes DP4J20764 and DP4J30938, which are homologous to CL2, gave signals on chromosomes of Th. bessarabicum and wheat, while DP4J31304 was found only on chromosomes of Th. bessarabicum (Du et al., 2017). CL2 hybridized only to Th. bessarabicum chromosomes, while CL244 and CL239 hybridized to chromosomes of Th. bessarabicum, as well as to the X cr and D 1 subgenome chromosomes of Ae. crassa, but were absent in Ae. tauschii. Judging from the wide range of genomes (V, R, J, S) in which the terminal repeats CL2, CL239, and CL244 occur, we can propose their antiquity and even a possible common origin of CL2 and CL244. The CL2, CL239, and CL244 repeats were probably amplified during Th. bessarabicum speciation, but CL2 and CL239 were totally eliminated from Aegilops and Triticum, while CL244 was retained in the S, R, and X cr genomes and in the putative ancestor of the wheat B subgenome. Signals on D and A subgenome chromosomes of common wheat may appear due to the transfer of the CL244 repeat from the B to the A subgenome chromosomes, while CL239 and CL244 may have been transferred from the X cr to the D 1 subgenome chromosomes of Ae. crassa after polyploidization, as a result of coevolution of subgenomes, as was suggested for the terminal repeats Spelt1 and Spelt52 (Zoshchuk et al., 2007; Raskina et al., 2011).
Subtelomeric repeats play an important role in recognition of homologous chromosomes during meiosis (Corredor et al., 2007;Aguilar and Prieto, 2020). We found CL257 and CL258 repeats only in the D 1 subgenome of the Ae. crassa (4x and 6x) on chromosomes belonging to genetic groups 1 and 5, respectively, which may indirectly indicate their role in the recognition of these chromosomes during meiosis and promote their differentiation from homoeologues. In turn, CL244 forms signals of varying intensity in termini of several X cr and D 1 chromosomes, which may also indirectly indicate their putative involvement in chromosome recognition in meiosis.
CL244, which was filtered bioinformatically as the putative X cr subgenome-specific repeat, may also exhibit partial sequence elimination in Ae. crassa. Only a few faint FISH signals were observed on Ae. crassa and common wheat chromosomes, and it was totally absent in Ae. tauschii, which agrees with the qPCR results. Despite some polymorphisms in CL244 location, the signals were always found in subtelomeric regions of wheat and Aegilops chromosomes, and most chromosomes carrying CL244 sites belong to subgenomes other than D/D 1 . According to both FISH and qPCR analyses, CL244 is highly abundant in Th. bessarabicum, which suggests that this repeat was inherited from an ancient grass ancestor and remained mainly in the J b and X cr (sub)genomes.
Only two repetitive DNA families analyzed in our study, CL257 and CL258, were found to be species-specific. Each of them was mapped to a single Ae. crassa chromosome, 5D 1 and 1D 1 , respectively, and in both cases hybridization signals had a terminal location. This is not surprising, because much experimental evidence indicates that species-specific repeats often accumulate in subtelomeric regions of plant chromosomes, which comprise rapidly evolving families of satellite repeats (Anamthawat-Jonsson and Heslop-Harrison, 1993; Salina et al., 2004b; Lim et al., 2005; Sharma and Raina, 2005; González-García et al., 2006). Bioinformatically, CL257 and CL258 were classified as "true satellites," which occur in the genome of Ae. crassa with the same frequency of 0.019% (Supplementary Table S3). Real-time qPCR confirmed their presence in both 4x and 6x Ae. crassa and their absence in other species (Supplementary Table S8). None of the methods we used here revealed these repeats in other species, and no hits have been found in the NCBI database. Based on these observations, we assume that these two repeats emerged de novo in the D 1 subgenome at the stage of formation of primary amphiploids or during the subsequent evolution of 4x Ae. crassa. Probably owing to this novelty, bioinformatics attributed the CL257 repeat to the X cr subgenome, although physically it localizes on a D 1 chromosome, which emphasizes the necessity of complex approaches in repeatome studies. The emergence of five novel Ae. crassa-specific repetitive DNA sequences was documented earlier by Dubcovsky and Dvořák (1995); however, the relationships of these repeats with CL257 and CL258 remain unknown. No differences in CL257 and CL258 signal intensities have been observed between 4x and 6x Ae. crassa, suggesting that hexaploidization did not cause significant changes in their content.
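A genome proportion such as the 0.019% quoted above can be converted into a rough copy-number estimate when the genome size and the satellite monomer length are known. The sketch below shows only the arithmetic; the genome size and monomer length used in the example call are hypothetical placeholders, not values from this study, and the calculation assumes that sequencing reads sample the genome uniformly.

```python
def estimate_copy_number(genome_percent, genome_size_bp, monomer_bp):
    """Approximate number of monomer copies per genome.

    genome_percent: share of the genome occupied by the repeat, in percent.
    genome_size_bp: genome size in base pairs.
    monomer_bp: length of one repeat monomer in base pairs.
    """
    if genome_size_bp <= 0 or monomer_bp <= 0:
        raise ValueError("genome size and monomer length must be positive")
    repeat_bp = genome_size_bp * genome_percent / 100.0
    return repeat_bp / monomer_bp


# 0.019% is the proportion quoted in the text. The two values below are
# HYPOTHETICAL placeholders chosen for illustration only.
HYPOTHETICAL_GENOME_SIZE_BP = 10_000_000_000
HYPOTHETICAL_MONOMER_BP = 500

copies = estimate_copy_number(
    0.019, HYPOTHETICAL_GENOME_SIZE_BP, HYPOTHETICAL_MONOMER_BP
)
print(f"approximate copies: {copies:.0f}")
```

Replacing the placeholders with a measured genome size and the actual monomer length of CL257 or CL258 would give an order-of-magnitude estimate that could be compared with qPCR-based abundance.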
Formation of 4x Ae. crassa was associated with massive amplification of certain satellite repeats that may have pre-existed in the genome of the putative progenitor species. This mechanism was probably responsible for the emergence of two huge CL219 clusters on 7D 1 in both 4x and 6x Ae. crassa. According to FISH, this sequence is lacking on the orthologous chromosomes of common wheat and Ae. tauschii, and on 7D 2 of Ae. crassa. On the other hand, CL219 was absent in Th. bessarabicum and Ae. tauschii, but occurred on chromosomes 2BS of common wheat and 1X cr , 6X cr , and 7X cr of Ae. crassa. All these chromosomes belong to subgenomes other than D, and all sites were found only in subtelomeric regions. Based on these observations, we assumed that the CL219 satellite was present in minor quantities in the putative X cr subgenome progenitor. It could probably have been transferred to the proximal region of 7D, which was highly enriched with other satellite sequences (e.g., pTa-713, pTa-k566, CL18, and CL232), via the mechanism of ectopic pairing (Schubert and Lysak, 2011; Waminal et al., 2021) soon after formation of the primary Ae. crassa amphiploid. Genomic shock might have caused massive amplification and spread of the repeat to other chromosomal sites, leading to the emergence of prominent CL219 clusters in proximal (7D 1 S) and distal (7D 1 L) chromosome regions.
We identified two repeats, CL261 and CL198, localized in the pericentromeric region, which are homologous to the centromeric repeats CentT550 and 17-202 (Supplementary Table S9). Centromeres play an important role in the precise segregation of sister chromatids in mitosis and meiosis, mediated by the centromere-specific histone protein CENH3. Localization of this protein coincides with the centromeric arrangement of CentT550 repeats, which is characteristic mainly of the D subgenome of wheat (and Ae. tauschii), and of CentT566, which is characteristic mainly of the B subgenome of wheat (and Ae. speltoides) and probably served as a source of these repeats in other subgenomes of bread wheat after polyploidization (Mach, 2019). Repeat 17-202 was found in the genome of Th. ponticum and is localized to chromosomes of both wheat and wheatgrass (Nikitina et al., 2020). The CL261 repeat is localized mainly on the D 1 chromosomes of 4x and 6x Ae. crassa and on the D subgenome chromosomes of common wheat. Its analogue, CL198, is localized on five pairs of Th. bessarabicum chromosomes. Thus, we most likely identified a new centromeric repeat that has a common origin with CentT550 and passed to both the J and D (sub)genomes of wheatgrass and Aegilops during evolution. Considering the importance of this repeat in ensuring proper meiosis, we can assume that the simultaneous presence of CentT550-like repeats in wheat and wheatgrass chromosomes of intergeneric hybrids may explain the predominant introgression of J genome chromatin of Thinopyrum into the D subgenome of wheat.
We also identified a few chromosome- or group-specific repeats. Thus, CL241, which is homologous to Oligo-3A1 (Lang et al., 2019b) and Oligo-44 (Tang et al., 2018; Supplementary Table S9), was mapped to the same wheat or Ae. tauschii chromosomal regions as described in the literature. Their signals are predominantly found on group 5 chromosomes, either in the short [(sub)genomes D, S l , S s ] or long [(sub)genomes S, B, A] arms. In Th. bessarabicum, we detected signals on two chromosome pairs (with one and two hybridization sites), while Lang et al. (2019b) identified two and four chromosomes carrying a single Oligo-3A1 site each in Th. ponticum and Th. intermedium, respectively. We did not find this repeat in the X cr subgenome, and Lang et al. (2019b) did not detect it in barley and Dasypyrum breviaristatum. Thus, we can assume that the X cr subgenome donor diverged from a common ancestor after barley and D. breviaristatum, but prior to active amplification of this repeat in Triticum/Aegilops/Thinopyrum/D. villosum.
We can reach a similar conclusion considering that the CL170 repeat is absent in X cr genome, but is abundant in the D (sub) genome of wheat and Ae. tauschii, Ae. crassa, and Th. bessarabicum. The homologous sequences were detected on chromosomes of Th. ponticum, Th. bessarabicum, Th. intermedium, A. cristatum, but they were absent in D. villosum and Pseudoroegneria spicata chromosomes (Supplementary Table S9). Probably, the putative X cr genome progenitor diverged from a common ancestor after separation of the V and St genomes, and prior to the start of massive amplification of the CL170-like repeat in Triticum/Aegilops/Thinopyrum/A. cristatum.
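The reasoning in the two paragraphs above rests on presence/absence patterns of CL241 and CL170 (and their homologs) across genomes. A minimal sketch of that comparison follows; the table is transcribed from the text, the taxon labels are informal, taxa not mentioned for a given repeat are left out, and the helper names `lacking` and `lacking_all` are introduced here for illustration only.

```python
# Presence (True) / absence (False) of two repeats or their reported
# homologs, transcribed from the text. Labels are informal assumptions.
presence = {
    "CL241": {
        "Xcr subgenome": False,
        "barley": False,
        "D. breviaristatum": False,
        "Triticum/Aegilops D": True,
        "Thinopyrum": True,
        "D. villosum": True,
    },
    "CL170": {
        "Xcr subgenome": False,
        "D. villosum": False,
        "Pseudoroegneria spicata": False,
        "Triticum/Aegilops D": True,
        "Thinopyrum": True,
        "A. cristatum": True,
    },
}


def lacking(repeat):
    """Return the sorted taxa in which the given repeat was not detected."""
    return sorted(t for t, found in presence[repeat].items() if not found)


def lacking_all():
    """Return taxa scored for every repeat and lacking each of them."""
    scored_everywhere = set.intersection(*(set(p) for p in presence.values()))
    return sorted(
        t for t in scored_everywhere
        if not any(p[t] for p in presence.values())
    )


for name in presence:
    print(name, "not detected in:", lacking(name))
print("lacking both repeats:", lacking_all())
```

Among the taxa scored for both repeats, only the X cr subgenome lacks both, which is the pattern the text uses to place the divergence of its donor before amplification of these repeats.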
Results obtained in our study showed that the evolution of Ae. crassa was associated not only with amplification, but also with elimination of repetitive DNA sequences, as was described for other plant species (Salina et al., 2004b; Adams and Wendel, 2005; Feldman and Levy, 2005; Kumar et al., 2010). For example, bioinformatic analysis failed to detect sequence AC4x_CL193_504nt in the Th. bessarabicum J b genome; qPCR showed its high abundance in the genomes of all analyzed species, while FISH detected a single CL193 signal on one of the two homologs of 7J, and no signals were found in Ae. crassa. Probably, this sequence was present in the genome of the putative progenitor(s) of Ae. crassa, but was eliminated during the evolution of Aegilops species. Ae. crassa, being the most ancient polyploid species in the genus Aegilops, may retain a trace amount of this repeat, which cannot be detected at the chromosomal level by FISH. Thus, if full-length genome assemblies are unavailable, identification of new repetitive DNA sequences in particular species based on the results of low-coverage NGS sequencing makes it possible to broaden the pool of data and to approach their analysis at the level of "omics" technologies with less time and financial expense.

Data availability statement
The original contributions presented in the study are publicly available. These data can be found here: NCBI, ON872662-ON872692.

Author contributions
EB, GK, and MD conceived and designed the experiments, analyzed data, interpreted results, wrote and edited the manuscript. EB, VS, TK, and AY designed and conducted the cytogenetic experiments and analyzed chromosome images. NC and MB provided and botanically verified plant material for analysis. SS designed and synthesized oligo-probes for FISH analyses. EN, DU, and AE performed bioinformatics analysis. AK performed and analyzed qPCR experiments. EB, OR, and PK wrote the first manuscript version. All authors contributed to the article and approved the submitted version.

Funding
This research was funded by the Russian Science Foundation, grant number 21-16-00123.