Insights into comparative genomics, structural features, and phylogenetic relationship of species from Eurasian Aster and its related genera (Asteraceae: Astereae) based on complete chloroplast genome

Aster L. is an economically and phylogenetically important genus in the tribe Astereae. Here, the complete plastomes of the eight Aster species were assembled and characterized using next-generation sequencing datasets. The results indicated the complete plastomes of Aster had a quadripartite structure. These genomes were 152,045–152,729 bp in length and contained 132–133 genes, including 87 protein-coding genes, 37–38 tRNA genes, and eight rRNA genes. Expansion or contraction of inverted repeat regions and forward, palindromic, complement, and reverse repeats were detected in the eight Aster species. Additionally, our analyses showed the richest type of simple sequence repeats was A/T mononucleotides, and 14 highly variable regions were discovered by analyzing the border regions, sequence divergence, and hotspots. Phylogenetic analyses indicated that 27 species in Astereae were clustered into six clades, i.e., A to D, North American, and outgroup clades, and supported that the genera Heteropappus, Kalimeris, and Heteroplexis are nested within Aster. The results indicated the clades B to D might be considered as genera. Divergence time estimate showed the clades A, B, C, and D diverged at 23.15 Mya, 15.13 Mya, 24.29 Mya, and 21.66 Mya, respectively. These results shed light on the phylogenetic relationships of Aster and provided new information on species identification of Aster and its related genera.


Introduction
The tribe Astereaehas has ~222 genera and ~3,100 species, which is the second largest tribe of Asteraceae (Noyes and Rieseberg, 1999;Brouillet et al., 2004;Panero et al., 2004;Panero and Crozier, 2016;Fu et al., 2016).The tribe Senecioneae has over 150 genera and 3,500 species (Nordenstam, 2007), more than the species number of the tribe Astereae.Aster is one of the large genera of Astereae and contains more than 152 species.The majority of Aster species are distributed in Eurasia, with only one species reaching North America (Nesom, 1994a, b;Chen et al., 2011).The species of Aster are mainly perennial herbs and are rarely annual or biennial herbs, subshrubs, or shrubs.The genus is characterized by capitula solitary or arranged in corymbiform or, sometimes, paniculiform capitulescences; white, pink, purple, or blue ray florets; and phyllaries imbricate or arranged in two equal layers.
Traditionally, Aster was defined as a genus encompassing around 300 species distributed in both the New World and the Old World (Jones, 1980;Semple and Brouillet, 1980;Jones and Young, 1983).However, in recent years, studies on the basis of morphology (Nesom, 1994a;Nesom and Robinson, 2007), Restriction Fragment Length Polymorphism (RFLPs) (Xiang and Semple, 1994), or DNA markers (Noyes and Rieseberg, 1999;Selliah and Brouillet, 2008;Li et al., 2012;Jafari et al., 2015;Korolyuk et al., 2015) have shown that the New World Aster species were distinct from the Old World taxa with a considerable genetic divergence.These New World taxa were treated as 13 separate genera (Nesom, 1994a, b), and the generic delimitation of the Old World species remained controversial.Some studies accepted a border Aster s.l., which includes most or all of the species of Aster from the Old World (Merxmüller et al., 1976;Ito et al., 1995;Chen et al., 2011).On the contrary, other studies treated the Old World species into Aster s.s. and 12 segregated genera (e.g., Kalimeris Cass., Heteroplexis C.C.Chang, Heteropappus Less.) (Tamamschjan, 1959;Grierson, 1975;Nesom, 1994b;Nesom and Robinson, 2007;Chen et al., 2011).However, recent molecular phylogenetic analyses have suggested that neither Aster s.l.nor Aster s.s. was monophyletic (Selliah and Brouillet, 2008;Pelser et al., 2010;Li et al., 2012;Jafari et al., 2015;Korolyuk et al., 2015;Fu et al., 2019).On the basis of analyses using Internal Transcribed Spacer (ITS), Enternal Transcribed Spacer (ETS), and trnL-F sequences, Li et al. (Li et al., 2012).showed that Eurasian Aster (referred to as EA Aster hereafter) is polyphyletic and supported that the genera Kalimeris and Heteropappus belonged to Aster.In addition, Li et al. (Li et al., 2012).proposed and suggested that Aster section Alpigenia, Aster ser.Albescentes, and Aster ser.Hersileoides should be elevated to the generic rank.However, the taxonomic position of Aster pycnophyllus Franch.ex Diels.remained unresolved.Another phylogenetic study using ITS and psbA-trnH sequences showed that the genera of Heteropappus and Kalimeris were nested within Aster, supporting the results of Jafari et al. (Jafari et al., 2015).Korolyuk et al. (Korolyuk et al., 2015).divided the Eurasian (EA) Aster into three groups, namely, a typical Eurasian asters group, Heteropappus group, and Asterothamnus group, but the relationships among these three groups were not strongly supported, and, hence, the boundary of Aster remained unclear.Although, the previous studies have indicated that the non-monophyly of Aster, the insufficient sampling of species, and low coverage and inadequate informative sites of molecular markers hampered the resolution of the phylogenetic trees of Aster and its related genera.
The chloroplast genome is one of the three DNA genomes, alongside the nuclear and mitochondrial genomes.In general, it is inherited maternally and possesses a highly conserved circular DNA arrangement, typically ranging from 115 kb to 165 kb in size (Wicke et al., 2011;Daniell et al., 2016).The complete chloroplast genome is a quadripartite structure, consisting of a large single copy (LSC), a small single copy (SSC), and two inverted repeats (IRs) (Daniell et al., 2016).The length differences are mostly due to expansion/contraction of IR regions (Zhu et al., 2016) or gene losses (Magee et al., 2010).In addition, the complete sequences of chloroplast genomes are commonly used for phylogenetic reconstruction at lower taxonomic levels, e.g., within genus, and population genetic analyses in plants.The utilization of complete chloroplast genomes has become widespread as an efficient tool for molecular phylogenetics in Aster (Kumar et al., 2009;Choi and Park, 2015;Zhang et al., 2015;Shen et al., 2017;Wang et al., 2019;Zhang et al., 2019a;Zhang X. et al., 2021;Duan et al., 2022;Palazzesi et al., 2022;Wang and Liu, 2023) and other tribes of Asteraceeae (Vargas et al., 2017;Do et al., 2019;Tyagi et al., 2019;Zhang et al., 2019b;Yu et al., 2022;Liu et al., 2023).Previous studies on Aster classification used one to several molecular markers, such as ITS, ETS, trnL-F, and psbA-trnH sequences (Li et al., 2012;Jafari et al., 2015), and some studies used only ITS sequences (Korolyuk et al., 2015), leaving many phylogenetic and taxonomic questions unresolved.Additionally, the lack of complete chloroplast genome sequences severely hampers the evaluation analyses of the genetic diversity of Aster germplasm resources.
In this study, to explore the genetic variation of Aster and its related genera, we report eight newly sequenced chloroplast genomes in the genus Aster, namely, Aster polius C.K. Schneid., Aster albescens Wall., Aster argyropholis Hand.-Mazz., Aster lavandulifolius Hand.-Mazz., Aster procerus Hemsl., A. pycnophyllus, Aster falcifolius Hand.-Mazz., and Aster yunnanensis Franch.The objectives of this study were to (1) analyze the evolution of chloroplast genomes within Aster using genetic comparative methods, (2) reconstruct the phylogenetic relationships of Aster and its related genera and further determine the phylogenetic backbone of Aster, and (3) estimate the divergence time of Aster and its related genera.This study provides new insights into the phylogenetics and evolution of Aster and its related genera and also shed the lights on the genetic diversity of Aster wild germplasm resources.

Sampling, extraction, and genome sequencing
Fresh leaves of the eight Aster species were gathered from the wild (Table 1).The formal identification of the plant material was undertaken by Dr. Zhixi Fu.The voucher specimens were then preserved in the herbarium at Sichuan Normal University in China (SCNU) (contact person: Dr. Zhixi Fu, fuzx2017@sicnu.edu.cn).As these species are not included in List of National Key Protected Wild Plants in China, there was no need to obtain a permit for their collection.Following the CTAB DNA extraction protocol (Allen et al., 2006), genomic DNA was extracted using the Plant Genomic DNA Kit (Tiangen, Beijing, China).The construction of the DNA library was carried out using the Illumina Paired-End DNA Library Kit (Illumina Inc., San Diego, CA, USA), and, subsequently, sequencing was performed on the Illumina Genome Analyzer (Hiseq 2000, Illumina, San Diego, CA, USA).The resulting raw data for each of the eight species consisted of approximately 150-bp paired-end read lengths.The 27 complete chloroplast genomic datasets are available for download on NCBI (Table 2).In the final supermatrix, names of species were checked based on Flora of China (Chen et al., 2011).

Comparative genome analysis
To identify potential IR expansion or contraction in eight Aster species, the reference species was used from A. ageratoides.This analysis was conducted using the perl script of Irscope (Amiryousefi et al., 2018).With the A. albescens as reference, the homology of these sequences was visualized using the mVISTA program (Frazer et al., 2004, https://genome.lbl.gov/vista/mvista/submit.shtml) with the LAGAN mode (Brudno et al., 2003).
The alignment of all sequences from the eight Aster species was performed using the "-auto" strategy of Multiple Alignment using Fast Fourier Transform (MAFFT).Nucleotide diversity was then calculated using a sliding window approach in DnaSP v.6.12.03 (Rozas et al., 2017) with a window length of 600 bp and a step size of 200 bp.

Phylogenetic analysis
The phylogenetic analysis of the complete chloroplast genomic dataset, consisting of 27 species of Astereae, was performed using the maximum likelihood (ML) method implemented in RAxML.The species of Nannoglottis ravida and Llerasia caucana from basal group of Astereae were selected as outgroups (Stamatakis et al., 2008).The analysis was performed on the CIPRES platform (Miller et al., 2010) (https://www.phylo.org/portal2/).ModelTest (Posada, 2006) was employed to determine the most suitable model for the dataset.The molecular model GTRCAT was applied for the analysis.For bootstrap support assessment, the fast bootstrap option with 1,000 replicates was utilized in RAxML from CIPRES pla tform.The morphologica l identi fi cation characteristics of the genus Aster and its related genera have been described more clearly by Chen et al. (2011).Therefore, we define the key to the Aster and related species with reference to the criteria proposed by Chen et al. (2011) in combination with classification of previous studies (Li et al., 2012;Jafari et al., 2015;Korolyuk et al., 2015).

Divergence time estimations
For divergence time estimation, we used the complete chloroplast sequence dataset.The BEAST v.1.8(Drummond et al., 2012) was applied to estimate the divergence times with Bayesian uncorrelated lognormal relaxed clock model.The node of Astereae was set at 31.42 Mya according to Zhang C. F. (Zhang C. F. et al. 2021).The tree Yule model was selected.The Markov chain Monte Carlo (MCMC) was run for 10,000,000 generations and sampled every 1,000 generations.TreeAnnotator v. 1.6 (BEAST package) was used to summarize and annotate the tree, with the initial 10% of trees discarded as burn-in.Finally, the tree was visualized in the program Figtree v.1.4.4 (http://tree.bio.ed.ac.uk/) with 95% highest posterior density being shown.

Chloroplast genome structure and feature of Aster
In this study, the complete chloroplast genomes of the eight species of Aster were sequenced and assembled.The results revealed a high degree of conservation in the structures of these genomes (Figure 1).These chloroplast genomes exhibited the standard quadripartite structure, consisting of a LSC region, a SSC region, and a pair of IR regions (IRa and IRb).The size of these genomes varied from 152,045 bp (A.polius) to 152,729 bp (A.albescens) (Table 2).The GC content ranged from 37.27% (A.procerus) to 37.35% (A.polius).Overall, all chloroplast genomes have 133 genes except A. procerus and A. pycnophyllus having 132 genes, including 87 protein-coding genes, 37/38 tRNA genes, and eight rRNA genes.Additionally, 115 of these genes were unique and 18 genes were duplicated in the IR regions (Table 3).The arrangement of these 133 genes in all chloroplast genomes was found to be completely collinear.There were two introns of four genes (rps12, rps12, ycf3, and clpP) and single intron of 16 genes (ndhA, ndhB, petB, petD, atpF, rbcL, rpl16, rpl2, rps16, rpoC1, trnA-UGC, trnG, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC).The gene rps12 was trans-spliced, and the genes ndhD and psbL experienced RNA editing.

Expansion and contraction of the border regions
In general, the IR/Single Copy (SC) expansion and contraction might cause the IR/SC junction position change.The IR/SC borders of the eight newly sequenced Aster chloroplast genomes were compared to analyze the expansion and contraction variation in junction regions (Figure 2).Although overall genomic structure including gene order and gene number was well conserved, these genomes exhibited slight differences at four junctions (JLB, JSB, JSA, and JLA).The rps19 gene of all Aster species located the JLBs, with the IRa region including 60 bp to 62 bp, except for A. pycnophyllus (36 bp).Likewise, the JLAs of all Aster species were located between rps12 and trnH.The ndhF gene, related to photosynthesis, was entirely located in the SSC region and the distance to the junction ranged from five to 54 bp.In our newly sequenced genomes, the ycf1 pseudogene was identified in all newly sequenced genomes.The main part of ycf1 gene was in the SSC region, with other 564 bp to 567 bp in the IRa region.The same fragment was also found in the IRb region of the ycf1 pseudogene and extended to SSC region with extension region with 9 bp to 147 bp.
The forward, palindromic, complement, and reverse repeats were detected in the eight newly sequenced chloroplast genomes (Figure 4A). A. procerus, A. pycnophyllus, A. yunnanensis, and A. falcifolius had all four type repeats.A. polius, A. albescens, A. argyropholis, and A. lavandulifolius had forward, palindromic, and reverse repeats.On average, 46-49 repeat sequences were identified in these genomes, with 17-23 forward repeats, 19-25 palindromic repeats, and 1-8 reverse repeats.However, complement repeats were only detected in A. procerus, A. pycnophyllus, A. yunnanensis, and A. falcifolius, with number of 1 to 3.Moreover, the repeats with 30 bp to 39 bp in length were the most common type in these genomes (Figure 4B), and none of the repeats with 50 bp to 59 bp in length.These interspersed repeat sequences were mainly present in the intergenic spacers, and several were observed within the coding regions and introns.The ycf15, rps12, ycf2, rrn5, rrn4.5, psbN, trnG, trnT-GGU, ycf4, cemA, trnS-GCU, trnS-UGA, trnS-GGA, psaB, psaA, accD, psal, psbE, petL, ndhD, and psaC genes contained LDRs.The interspersed repeat sequences were also more commonly detected in   (2), ycf3**, ycf4 LSC and IR than SSC regions.The overall distribution of interspersed repeat sequences was similar in both IR regions.

Sequence divergence and hotspots
The mVISTA analysis of these eight Aster species indicated the complete chloroplast genome shared high levels of sequence similarity.Genetic variability was more prevalent in the noncoding regions than in the coding regions.The five genes with the highest variation were matK, atpA, rps19, ycf2, and ycf1 (Figure 5).DnaSP analysis revealed nucleotide diversity in single copy genes and intergenic regions with nucleotide diversity (Pi) ranged from 0.00068 to 0.04577.Six mutation hotspots showed significantly high Pi values (p > 0.014) (Figure 6), much higher than the average pi value (p = 0.0038).Among the gene coding regions, the highest Pi values were found in ndhF, followed by ndhC, trnV (UAC), and trnM (CAU).Among intergenic regions, the highest Pi values were detected in the rpl12-ndhF region, followed by ndhC-trnV (UAC), trnV (UAC)-trnM (CAU), and trnM (CAU)-atpE (Figure 6).We analyzed the nucleotide diversity of the complete chloroplast genomes and the LSC, SSC, and IR regions.The nucleotide diversity in the complete chloroplast genome was 0.0038, and higher nucleotide diversity was found in the LSC and SSC regions than the IR region, showing that the IR regions were more conserved than the single-copy regions.We found only two regions with p > 0.02, i.e., trnT (GGU)-psbD and trnL (UAA)-trnF (GAA), and three regions with p>0.015 and <0.02, i.e., trnU (UAC)-trnM (CAU), accD, and ycf4/ycf4-cemA (Figure 6).

Codon usage analysis
The preferences for codon are extremely similar among species.The analyses showed that 87 protein-coding genes were encoded by 64 codons (including three are stop codons: UAA, UGA, and UAG; Figure 7).The most prevalent amino acid was leucine.Leucine was encoded by CUA, CUC, CUG, CUU, UUA, and UUG with 2,420 codons (A.falcifolius) to 2,440 codons (A.yunnanensis).However, the rarest one was cysteine.Cysteine was encoded by UGC and UGU with 251 codons (A.falcifolius) to 252 codons (A.polius).In the complete chloroplast genome of these Aster, only codons tryptophan (encoded by UGG) exhibited no bias with RSCU = 1.00.The common start codon for the protein coding genes was AUG (M), except for the psbL, rps19, and ndhD genes, which have start codons of ACG, GUG, and GUG in all species, respectively.

Phylogenetic analysis
In this study, the complete chloroplast genomes of 27 Astereae species were used to perform phylogenetic reconstruction, with Nannoglottis ravida (C.Winkl.)Y.L.Chen and Llerasia caucana (S.F.Blake) Cuatrec used as outgroup.Phylogenetic analyses of the supermatrix of 25 taxa (not include outgroups) using the ML methods yielded a topology (Figure 8

Divergence time estimations
On the basis of the newly reconstructed phylogeny, the origin and divergence times of lineages within the genus Aster were estimated (Figure 9).Divergent time estimate showed that the divergent time of clade A was dated back to 23.15 Mya.clades B, C, and D were divergent from 15.13 Mya, 24.29 Mya, and 21.66 Mya, respectively.

Plastome structure and characteristics analysis
The structure, gene position, size, orientation, and gene content of the plastid genomes of the eight Aster species were highly conserved ( Doorduin et al., 2011;Curci et al., 2015;Cheon et al., 2017;Chen et al., 2018).These genomes of Aster have a standard quadripartite structure, including a LSC, a SSC, and a pair of IRs (IRa and IRb), which was the same as that reported for most other Aster (Shen et al., 2018;Wang et al., 2019;Zhang X. et al., 2021).The sizes of plastomes of the eight Aster species are between 152,045 bp and 152,729 bp (Table 2).In addition, these plastomes did not have any loss of introns.Additionally, the GC contents, which are a crucial factor in genome organization and stability, of the chloroplast genomes were low (37.3%) and were similar to that of other Asteraceae species, such as Aster spathulifolius Maxim.(37.28%) and A. hypoleucus (37.3%) (Ravi et al., 2008;Tyagi et al., 2019;Wang et al., 2019).We detected losses of the trnT (GGU) gene in A. procerus and A. pycnophyllus.In previous study, the loss of the tRNA was detected in some Asteraceae species (Lee et al., 2017).

Expansion and contraction of the border regions
The IR regions are known to be highly conserved in the genome of chloroplasts.During evolution, the expansion and contraction of the IR, LSC, and SSC regions are common, which leads to variability in genome length (Kim and Lee, 2004).In this study, the examination of chloroplast genome variation (Figure 2) showed that great expansion or contraction of the chloroplast IR region was not detected.However, ycf1 and ndhF genes located at the SSC/IR border had the slight variation in position and length in the eight Aster chloroplast genomes, suggesting boundary contraction and expansion between the SSC/IR regions in Aster (Liu et al., 2018).

Repeat sequence analysis
The SSRs are effective molecular markers, and they are often used for species identification and population genetic analyses (Thiel et al., 2003).In the eight Aster species analyzed here, A/T repeats, AT/AT repeats, AAT/ATT repeats, and AAAT/ATTT repeats were commonly detected (Figure 3C).This phenomenon may be related to that the AT preference pattern is widely reported in many plant plastids (Somaratne et al., 2019).In the rearrangement of the complete chloroplast genomes and sequence divergence, larger and more complex repeat sequences may play an important role (Weng et al., 2014).The interspersed repeat The analysis of the number and length of the long repeats identified from the eight Aster complete chloroplast genomes.The hamming distance of 3, the minimal repeats of 30, and the maximum repeats of 5,000 were applied during the calculating process.sequences were more prevalent in the non-coding regions than the coding regions (Kim et al., 2015).In our study, the ycf2 gene includes rich repeats, which contained many repeats: forward and palindromic.This result was consistent with the previous analysis that showed the gene has already been shown to be associated with many evolutionary events (Huang et al., 2010).

Sequence divergence and hotspots
DNA barcoding technology has been widely used in the species identification, phylogeny, and evolution (Doorduin et al., 2011;Palazzesi et al., 2022).In mVISTA analysis, the matK, atpA, rps19, ycf2, and ycf1 genes had large differences and were putative markers for The nucleotide variability (Pi) values were compared among the eight Aster taxa.The RSCU values of the 20 amino acids of the complete chloroplast genome of the eight Aster taxa and their different codon usages.

FIGURE 8
The best maximum likelihood (ML) phylogram inferred from 27 chloroplast genomes (bootstrap value are indicated on the branches).The circled species are the newly sequenced species in this study.
population genetic and barcoding analyses (Figure 5).Among these genes, the matK and ycf1 genes have been used in previous plant phylogenetic and DNA barcoding analyses for land plants (Dong et al., 2015).Some regions of the plastomes of the eight Aster species showed high sequence divergence and might be used for phylogenetic reconstruction.However, these regions are different from the phylogenetic markers previously reported for Asteraceae (Do et al., 2019).Therefore, the complete chloroplast genome sequences and molecular markers might provide fundamental data for further studies on genus of Aster and related species in tribe Astereae.

Codon usage analysis
Thirty-one codons with RSCU value >1 were found, indicating that these codons are preferentially used in coding amino acids.An identical trend was discovered among the eight species (Figure 7).Leucine was the most abundant amino acid, whereas the cysteine was the least abundant amino acid, which is consistent with other Asteraceae species (Salih et al., 2017;Shen et al., 2017).In addition, most of A/U-ending codons had RSCU values >1; meanwhile, most of G/C-ending codons had RSCU values <1, indicating that amino acids tended to using A/Uending codons, similar to a previous study (Zhao et al., 2021).

Phylogenetic analysis
In the tribe Astereae, there are numerous morphologically similar but distantly related taxa, such as some species of Aster (Noyes and Rieseberg, 1999).Whether some taxa should remain as genera or be merged into a single genus remains to be determined, such as genus Kalimeris, Heteropappus, and Heteroplexis, as wells as Aster series Albescentes, Aster Ser.Hersileoides, and Aster section Alpigenia (Nesom, 1994b;Noyes and Rieseberg, 1999;Li et al., 2012;Jafari et al., 2015;Korolyuk et al., 2015;Nesom, 2020).Nesom (1994b) proposed that Aster series Albescentes should be removed from Aster.In the Flora of China, Aster series Albescentes species A. nitidus and A. hersileoides were treated as the unplaced Aster group.The molecular phylogeny of Li et al. (2012) suggested that Aster section Alpigenia should be elevated to the new genera, series Albescentes is considered to be more closely related to section Alpigenia, and the Aster series Hersileoides is a wellsupported monophyletic group.Therefore, according to the results of previous studies and the phylogenetic tree of this study, we classified the 25 species (not including outgroups) into
In the previous studies, Heteropappus was considered for generic rank based on the heteromorphic pappus of ray and disc flowers (Jones, 1980;Noyes and Rieseberg, 1999;Chen et al., 2011;Jafari et al., 2015).Based on RFLPs and gene sequences, it is suggested that Heteropappus altaicus should be classified within Aster (Ito et al., 1998;Li et al., 2012).Our study showed that Aster altaicus Willd.(=Heteropappus altaicus) belong to clade A (BS = 100), supporting the previous results.Li et al. (2012) proposed A. pycnophyllus should be kept separate as it was found to be nested within a clade with Myriactis Less.and distantly related to Aster.In our study, the result shown that A. pycnophyllus was nested within clade A, with a strong support (BS = 97).
The genus Heteroplexis, comprising five species, is an herb endemic to Guangxi, China (Chen et al., 2011).In the Flora of China, the genus Heteroplexis shares similarities in morphology and inflorescence with the genus Aster, but they could be distinguished by its bilaterally symmetrical corolla and climbing or erect herb (Chen et al., 2011).Therefore, it is placed within the subtribe Asterinae as close allies of Aster.Based on the number of outer flowers over the number of bisexual flowers, Zhang and Bremer (1993) treated Heteroplexis in Erigeron-Conyza group.According to some characters, e.g., disciform capitula, oblong-obovoid achenes, and long corolla lobes, Nesom (1994a) treated Heteroplexis as a member of Baccharidinae.In recent study, it is the unplaced Aster group (Nesom and Robinson, 2007).Our results suggested that Heteroplexis should be included with Aster and treated as a synonym of Aster.

Clade B (Aster series Albescentes)
The species of clade B exhibit a shrubby growth habit and are classified within the Aster series Albescentes (Li et al., 2012;Nesom, 2020).Based on the characters of muti-layers involucre and mutiribs achene, Nesom (1994b) proposed that Aster series Albescentes has a distinct position in Aster.Based on the character of pappus, Nesom (1994b) noted that A. series Albescentes is sister to the NA Doellingeria Nees.In the Flora of China, the species of Aster ser.Albescentes were considered as the unplaced Aster group (Chen et al., 2011).Li et al. (2012) showed that ser.Albescentes is a monophyletic taxon with high support in a polytomy with Myriactis and other segregates of Aster s.s., implying that series Albescentes may belong to the Australasian lineages, in disagreement with the study of Nesom (1994b).In this study, A. albescens, A. argyropholis, A. lavandulifolius, A. polius, and A. hypoleucus formed a strong supported clade B (BS = 88/100) as sister of clade C (Aster Ser.Hersileoides) (Figure 8).It demonstrates that series Albescentes is a well-supported monophyletic genus.The newly defined taxon possesses the following distinct characteristics: shrubs, leaves cauline, basal leaves and proximal leaves withered at anthesis usually, non-rosulate, stem leaves well developed, nearly as long as basal leaves, capitula many, much branched, in corymbiform, terminal solitary rarely, involucres herbaceous or membranous, involucres 3-5, imbricate, margin membranous, irregularly lobed.margin membranous, irregularly lobed, and 4-5(8)-ribbed.

Clade C (Aster Ser. Hersileoides)
Aster series Hersileoides consists of two species, Aster hersileoides C.K.Schneid.and Aster nitidus C.C.Chang (Yin et al., 2010;Chen et al., 2011).Chen et al. (2011) treated the A. hersileoides within the unplaced status in Aster.Based on molecular phylogenetic studies, Li et al. (2012) strongly suggested that Aster ser.Hersileoides should be removed from Aster and considered as a separate genus.Our results supported that the series (represented by A. hersileoides) should be kept separately from Aster.They are characterized by shrubs, leaves cauline, non-rosulate, leaf oblanceolate and glabrous, capitula many, terminal solitary, involucres 3-5, imbricate, 2 inner involucres equaling, and 3-ribbed.It is reasonable to propose the elevation of the Aster Series Hersileoides to a generic rank, considering its unique traits and the phylogenetic results here.Further investigations and comprehensive molecular analyses will be essential in demonstrating the full taxonomic status and evolutionary relationships of this clade.
In conclusion, the previously classification and definition of Aster is not monophyletic.Clade A (core Aster) includes most Aster taxa, Heteropappus, Kalimeris, and Heteroplexis.Additionally, clade B (Aster series Albescentes), clade C (Aster Ser.Hersileoides), and clade D (Alpine Aster) are identified as independent groups.Furthermore, it is estimated that the genus Aster comprise more than 152 species.However, this study only encompassed 25 species and two outgroup species.Therefore, a more comprehensive and extensive sampling of chloroplast genome and more data are necessary to conduct a thorough and comprehensive phylogenetic study of the genus Aster and its related genera.Based on the results of both general morphological and molecular phylogenetic analysis, the identification key was presented as following.

Divergence time estimations
The divergence time estimation of Aster relied on secondary calibration because of the lack of fossil record for most Aster taxa.Most species of Aster and its related genera are distributed in East Asia (Nesom, 1994b;Brouillet et al., 2009;Chen et al., 2011).The result of Brouillet et al. (2009) indicated that Aster originated from a clade with a dispersal from Australasia into East Asia.The result of our molecular dating suggests that clades A and C began to diversify in the late Oligocene (23.15 Mya and 24.29Mya,respectively). Clades B and D originated in the Early Miocene (15.13 Mya and 21.66Mya, respectively).The rapid radiation may be related to collisions between geological plates (Audley-Charles, 1987;Liu et al., 2002).Geologic uplift events (first of which began at about 50 Ma) have taken place in the Tibetan Plateau during at least four different periods since the early Miocene, i.e., 22 Mya, 15-13 Mya, 8-7 Mya, and 3.5-1.6 Mya (Shi et al., 1998;Guo et al., 2002;Spicer et al., 2003).The origin of the four clades (Clade A-D) likely occurred independently at first two stages of the uplift and formation of the Tibetan Plateau.Geological evidence suggests that the strong uplift of the Tibetan Plateau, coupled with favorable oceanic and continental environments, produced a strong Asian monsoon dominated by the Summer Monsoon (Shi et al., 1998;Ding et al., 2020).During this uplift movement, the original Planetary Wind System in East Asia was changed and the arid zone retreated to the northwest (Shi et al., 1998;Ding et al., 2020).Eastern China was gradually covered by tropical or subtropical forests.This scenario also correlates with the current habitat preferences of the studied taxa (clade A, understorey vegetation; clades B and C, dry slopes and scree regions; and clade D, cold and dry alpine meadows).The similar rapid radiation has also been found in other groups of Asteraceae in the Tibetan Plateau, such as Saussurea (Liu et al., 2002) and the Dolomiaea-Diplazoptilon-Xanthopappus group (Wang et al., 2007).

Conclusions
The complete chloroplast genomes of the eight Aster species were sequenced in this study.The results revealed that cp genome size, structure, gene content, as well as compositional organization were highly conserved among these species.The chloroplast genomes of all species exhibited the standard quadripartite structure, and the size of these species of Aster varied from 152,045 bp to 152,729 bp.They include 87 protein-coding genes, 37/38 tRNA genes, and eight rRNA genes.They have three/four types of repeats, and the number of SSRs ranged from 75 to 99.Genes located at the junctions were well conserved among the Aster species.Furthermore, the genic and IR regions were more conserved than the intergenic and SC regions, respectively.In addition, the plastid genome structure of Aster exhibited high consistency and was obviously different in some regions, such as rps19, ycf1, and ndhf.Furthermore, the preferences for codon use in our study are all similar.The most prevalent amino acid was leucine, whereas the rarest one was cysteine.Moreover, we detected six hotspots that could be used as candidate DNA barcodes.The analysis of complete chloroplast genomes and combined datasets provided clear evidence supporting the moderate to strong differentiation of clades (clades A, B, C, and D and North American clade).The phylogenetic results showed that the traditionally defined Aster was not monophyly.For the delimitation of the genus Aster, Kalimeris, Heteropappus, and Heteroplexis, the closed allied genera of Aster were revealed to be nested within the Aster clade and should be included in Aster.Additionally, we suggest that the clade B (Aster series Albescentes), clade C (Aster Ser.Hersileoides), and clade D (Alpine Aster) should be treated as separated genera and taxonomic treatment.Divergent time estimate showed that the divergent time of clade A was dated back to 23.15Mya.Clades B, C, and D were divergent from 15.13 Mya, 24.29 Mya, and 21.66Mya, respectively.Our analyses suggested that the divergence of the genus Aster is closely related to the uplift of the Qinghai-Tibet Plateau.This study sequenced eight plastid genomes of Aster, provided a well resolved phylogenetic tree of Aster and related genera, and selected putative markers for further barcoding analysis.This study is important for us to understand the phylogeny and evolution of Aster and the further phylogenetic, population genetic, and related studies.

FIGURE 1
FIGURE 1Gene map of the Aster chloroplast genome.Genes shown outside the outer circle are transcribed clockwise, and those insides are transcribed counterclockwise.Genes are color-coded according to different functional groups.The darker gray in the inner circle indicates the GC content, and the lighter gray indicates the AT content.The inner circle also shows that the chloroplast genome contains two copies of inverted repeats (IRA and IRB), a large single-copy (LSC) region, and a small single-copy (SSC) region.

FIGURE 2
FIGURE 2Comparison of IR-SC border positions across plastomes of the eight Aster taxa.Genes are denoted by colored boxes.The gaps between the genes and the boundaries are indicated by the base lengths (bp).
) with in-group fell into five clades: clade A, clade B, clade C, clade D, and North American clade (Figure 8).Clade A was the largest clade and was strongly supported (Bootstrap value (BP) = 97/100), containing 12 Aster species and two Heteroplexis species.The newly sequenced species, namely, A. falcifolius, A. pycnophyllus, and A. procerus, were nested in clade A. Other four newly sequenced species, namely, A. argyropholis, A. albescens, A. lavandulifolius, and A. polius, together with Aster hypoleucus Hand.-Mazz.formed the strongly supported clade B. A. hersileoides lonely formed clade C, as the sister group of clade B with a moderate support (BS = 68).Symphyotrichum subulatum (Michx.)G.L. Nesom and Erigeron canadensis L. formed the North American clade, and it was a sister group of clades A, B, and C with a high support (BS = 100).The newly sequenced A. yunnanensis were placed together with A. flaccidus and A. batangensis in clade D with strong support (BS = 100).
FIGURE 3 Comparison of simple sequence repeats (SSRs) among eight plastomes.(A) Numbers of SSRs detected in the eight newly sequenced Aster plastomes.(B) Frequencies of identified SSR types in all eight Aster plastomes.(C) Analysis of SSRs in eight Aster plastid genomes species.
(A) Numbers of the type of long repeats contains F, P, R, and C. (B) Numbers of the length of long repeats.

FIGURE 5
FIGURE 5Comparison of the chloroplast genome of the eight Aster newly sequenced species.Dark blue bars represent protein-coding genes, pale blue bars represent rRNA genes, and red bars represent conserved non-coding sequences.The y-scale axis represents the percentage identity (50%-100%).mVISTA was used to perform the comparison.

FIGURE 9
FIGURE 9Divergence time estimates of Aster based on complete cp genomes, based on BEAST analysis using the complete chloroplast genomes dataset.Blue bars indicate 95% highest posterior density intervals.

TABLE 1
Information on the 27 Aster species used in the study.

TABLE 2
Comparative analysis of chloroplast genomes of the seven Aster species.

TABLE 3
List of genes found in the complete chloroplast genomes of Aster species.