Original Research ARTICLE
The Population Genetics of Alternaria tenuissima in Four Regions of China as Determined by Microsatellite Markers Obtained by Transcriptome Sequencing
- College of Plant Protection, Department of Plant Pathology, China Agricultural University, Beijing, China
A total of 32,284 unigenes were obtained from the transcriptome of Alternaria tenuissima, a pathogenic fungus causing foliar disease in tomato, using next-generation sequencing (NGS) technology. In total, 24,670 unigenes were annotated using five databases: NCBI non-redundant protein, Swiss-Prot, euKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes, and Gene Ontology. A total of 1,140 simple sequence repeats were also identified for use as molecular markers. Sixteen of the simple sequence repeat loci were selected to study the population structure of A. tenuissima. A population genetic analysis of 191 A. tenuissima isolates, sampled from four geographic regions in China, indicated that A. tenuissima had a high level of genetic diversity, and that the selected simple sequence repeat markers could reliably capture the genetic variation. The null hypothesis of random mating was rejected for all four geographic regions in China. Isolation by distance was observed for the entire data set, but not within clusters, which is indicative of barriers to gene flow among geographic regions. Bayesian clustering and principal coordinates analyses, however, did not separate the four geographic regions into four distinct genetic clusters. The different levels of historical migration rates suggest that isolation by distance did not represent a major biological obstacle to the spread of A. tenuissima. The potential epidemic spread of A. tenuissima in China may occur through the transport of plant products or other factors. The presented results provide a basis for a comprehensive understanding of the population genetics of A. tenuissima in China.
Alternaria tenuissima is an important global pathogen on a large variety of economically important crops, including broad bean, tomato, sunflower, potato, watermelon, and muskmelon (Rahman et al., 2002; Agamy et al., 2013; Wang et al., 2014; Zheng et al., 2015; Zhao et al., 2016a,b). The pathogen affects the above-ground parts of the crops, and is the causal agent of early blight, stem canker, and some fruit rots (Abdelfattah et al., 2016; Bessadat et al., 2017). Foliar disease outbreaks caused by A. tenuissima are primarily epidemic and are especially devastating on tomato leaves (Agamy et al., 2013; Bessadat et al., 2017). High humidity and fairly high temperatures can lead to severe epidemics in tomato-growing regions (Bessadat et al., 2017). Although environmental conditions vary significantly among crop production regions, the infection spreads rapidly once established. In fact, sporadic epidemic transmission is the main factor responsible for the high frequency of occurrence of this disease (Agamy et al., 2013; Meng et al., 2015). The increasing frequency of A. tenuissima outbreaks has affected the distribution of Alternaria species responsible for causing foliar diseases (Wang et al., 2014; Zheng et al., 2015; Zhao et al., 2016a,b). The extent of genetic variation and spatial distribution in A. tenuissima associated with tomato foliar diseases in China, however, remains largely unknown.
Genetic variation in a species results from evolutionary events, including drift, migration, type of mating system, selection, and mutation, all of which are influenced by human activity and natural events (Wright et al., 2004; Zhan and McDonald, 2013). For example, indiscriminate use of fungicides in agro-ecosystems can increase the rate of mutation and impact the virulence and aggressiveness of a pathogen (Piotrowska et al., 2016). The artificial dissemination of a pathogen also affects migration or selection in pathogen populations and causes a change in natural ecosystems. Emergence of a sexual stage also plays an important role in dispersion (Artero et al., 2016). Species are expected to exhibit a greater level of genetic variation in response to environmental changes that increase their ability to adapt (McDonald and Linde, 2002; Meng et al., 2015). The genetic structure of a species determines the evolutionary potential of a pathogen, as it reflects the level of allelic diversity available to selection pressures (McDonald, 1997; Parker and Gilbert, 2004). A comprehensive understanding of the genetic structure of a pathogen species is essential for managing disease occurrence and developing sustainable management practices (Miller et al., 2003).
Molecular markers are a reliable tool for assessing genetic variation and inferring mating systems in fungal populations (Santha Lakshmi Prasad et al., 2009; Stewart et al., 2011). To date, however, relatively few molecular markers, such as random amplified polymorphic DNA (RAPD) and amplified fragment length polymorphism (AFLP), have been reported for Alternaria spp. (Morris et al., 2000; Gannibal et al., 2007). Additionally, most molecular markers do not capture genetic structure with the degree of resolution and reliability that is provided by simple sequence repeats (SSRs) or microsatellites. SSRs are tandem repeat motifs of 1–6 bases that are abundantly spread throughout eukaryotic genomes and reflect genetic diversity (Andeden et al., 2015).
Genetic SSRs occur in the coding and regulatory regions of genes (Zheng et al., 2013), while genomic SSRs are in non-coding regions of the genome. Genetic SSRs have the advantage of being more highly conserved and thus more transferable across species (Chen et al., 2015). Although the identification of genetic SSRs is less expensive and time-consuming than identifying genomic SSRs, little information is available on genetic SSR markers in A. tenuissima. Due to their utility, it would be useful to identify SSR loci in A. tenuissima from transcriptomic data and design primer pairs that could be used to identify these SSR loci in genetic analyses of A. tenuissima.
A high level of genetic variation has been reported in A. alternata, A. brassicicola, and A. solani, suggesting that a cryptic sexual stage dominates in these Alternaria species (Morris et al., 2000; Bock et al., 2005; Meng et al., 2015). High levels of genetic variation and evidence of sexual reproduction have also been reported in A. tenuissima isolated from wheat in Russia (Gannibal et al., 2007). Little information is available, however, on the population structure of A. tenuissima in tomato. Therefore, in the present study, a transcriptome of A. tenuissima was sequenced using next-generation sequencing (NGS) technology and used to identify large numbers of SSRs. This was done to determine the level of genetic diversity in A. tenuissima populations in China and to infer the main evolutionary factors influencing the epidemic outbreaks of A. tenuissima. The distribution of SSR motifs in the transcriptome of A. tenuissima was characterized and the assembled unigenes were functionally annotated.
Materials and Methods
Sample Collection and Fungal Populations
A total of 191 A. tenuissima isolates were collected during 2015 and 2017 from 34 sampling locations in China (Supplementary Table S1). The isolates were obtained from tomato leaves exhibiting typical symptoms of foliar disease. The 34 sampling locations, in twelve provinces, autonomous regions, or municipalities, were organized into four tomato-cropping regions based on geography, climate, and agricultural management (Bernardes-de-Assis et al., 2009). They were designated Northeastern China (Heilongjiang, Jilin, and Liaoning Provinces), Northern China (Hebei and Shanxi Provinces, and Beijing Municipality), Eastern China (Anhui, Fujian, Jiangxi, and Zhejiang Provinces), and Northwestern China (Ningxia Hui Autonomous Region and Gansu Province). These groupings represent the four major tomato-cropping regions in China (Figure 1). The four geographic regions are separated from each other by more than 500 km.
FIGURE 1. Geographic locations of the four tomato-cropping regions from which the Alternaria tenuissima isolates used in the study were sampled.
Isolates were identified using the standard procedures reported in our previous study (Zheng et al., 2015). The procedure includes both morphological characteristics and molecular analyses. The collected isolates were transferred to potato carrot agar (PCA) plates and grown for 7 days at 25°C with an 8 h light/16 h dark photoperiod to characterize their growth and conidia morphology. Genomic DNA was extracted from the A. tenuissima isolates using a cetyltrimethylammonium bromide (CTAB) procedure and used for molecular identification and additional SSR assays (Lee and Taylor, 1990). For the molecular analysis, partial coding sequences of the histone 3 gene and the internal transcribed spacer (ITS) region of ribosomal DNA (rDNA) were amplified from the extracted genomic DNA using the primer sets H3-1a/H3-1b and ITS1/ITS4, respectively (Glass and Donaldson, 1995). The PCR amplification products were shipped to Beijing TSINGKE Biotechnology Co. Ltd. (Beijing, China) for sequencing. The obtained sequence data were used to conduct BLAST searches using BLASTn on the NCBI website to identify the Alternaria species.
cDNA Library Construction and Illumina Sequencing
One isolate, “BJ319-1,” was randomly selected from among the 191 collected A. tenuissima isolates. The mycelia from a culture of “BJ319-1” grown on potato dextrose agar (PDA) plates for 7 days were harvested for the isolation of total RNA and subsequent transcriptome analysis. Total RNA was extracted using TRIzol reagent (Ambion, Thermo Fisher Scientific, United States). Any traces of DNA were then removed from the RNA extracts using DNase I (TaKaRa, Japan). The purity of the RNA extract was determined using a Nano-Drop 2000 (Thermo Fisher Scientific, United States). Qubit 2.0 (Life Technologies, United States) and an Agilent 2100 Bioanalyzer (Agilent Technologies, United States) were used to estimate the concentration and integrity of the total RNA. The cDNA library of pooled RNA was constructed with a method described by Li et al. (2014), with minor modifications. The resulting A. tenuissima cDNA library was sequenced using an Illumina HiSeq 4000 sequencing platform at Beijing Biomarker Technologies Co., Ltd. (Beijing, China).
De novo Assembly and Unigene Annotation
To obtain high-quality reads, raw sequences from the Illumina sequencing were filtered using Trimmomatic, a flexible read-trimming tool for Illumina NGS data (Bolger et al., 2014). After filtering out low-quality reads, the resulting clean reads were deposited into the Sequence Read Archive (SRA) database under the accession number SRP136412. Subsequently, clean reads were assembled using Trinity (Haas et al., 2013).
To annotate the A. tenuissima transcriptome, unigenes were searched against various databases, including NCBI non-redundant protein (Nr protein), Swiss-Prot, euKaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) (Altschul et al., 1997; Cameron et al., 2004). Blast2GO software (Conesa et al., 2005) was used to assign the Gene Ontology (GO) terms to the unigenes. All unigene annotations were performed using the method of Wu et al. (2014).
Development of SSRs and Primer Design
SSR loci were identified from the A. tenuissima transcriptome sequence data using MISA (MIcroSAtellite identification tool) and SAMtools (Li et al., 2009). The minimum number of repeats was defined as ten for mono-nucleotide repeats, six for di-nucleotide repeats, five for tri-nucleotide repeats, and three for tetra-, penta-, and hexa-nucleotide repeats. Subsequently, SSR primers were designed using Primer Premier 5.0 software (PREMIER Biosoft International, Palo Alto, CA, United States). Based on the methodology described by Chen et al. (2015), the criteria used for designing the primers were: primer length of 16–22 bp, PCR product size of 100–300 bp, annealing temperature of 40–60°C, and GC content of 40–60%.
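The repeat thresholds above can be illustrated with a minimal Python sketch. This is not MISA itself and does not reproduce its handling of compound or interrupted repeats; the function names `is_primitive` and `find_ssrs` and the example sequence are hypothetical, and only the minimum repeat counts are taken from the text.

```python
# Minimum repeat counts per motif length, as stated in the text:
# mono >= 10, di >= 6, tri >= 5, tetra/penta/hexa >= 3.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Skip ambiguous bases and motifs that belong to a shorter class.
            if set(motif) - set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            n = 1
            while seq[i + n * k:i + (n + 1) * k] == motif:
                n += 1
            if n >= min_rep:
                hits.append((i, motif, n))
                i += n * k  # jump past the repeat tract
            else:
                i += 1
    return sorted(hits)


# Hypothetical example: an A(10) tract and a (CAG)5 tract.
example = "GGC" + "A" * 10 + "TTC" + "CAG" * 5 + "T"
ssrs = find_ssrs(example)
```

A real pipeline would also record flanking sequence for primer design; that step is omitted here.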
SSR Assays of A. tenuissima Populations
To further characterize the population genetics of the A. tenuissima populations, primer pairs were synthesized for 16 SSRs. These SSRs were chosen as suitable markers for subsequent analyses based on preliminary tests (Table 1). The forward primers were separately labeled with a fluorescent dye (Dye set: FAM, ROX, TAMRA, HEX; Applied TSINGKE Biotechnology Co. Ltd.) at the 5′ end. PCR amplifications were performed in a 25 μL PCR mixture that included 1 μL DNA template (100 μg mL⁻¹), 9.5 μL ddH2O, 12.5 μL 2 × T5 Super PCR Mix (TSINGKE Biotechnology Co. Ltd.), and 1 μL each of the two primers (10 μM). The amplification was conducted in an Eppendorf Mastercycler® using the following protocol: an initial denaturation step at 95°C for 5 min; 35 cycles of denaturation at 94°C for 30 s, annealing at 57°C for 30 s, and extension at 72°C for 30 s; and a final extension at 72°C for 5 min. The obtained amplicons were then sequenced using an ABI 3730 DNA sequencer (Applied Biosystems).
TABLE 1. Repeat motifs (RM) in the core SSR loci sequences, primer sequences, the length of the cloned alleles (LCA), and annealing temperatures (AT) in the sixteen microsatellite loci developed from transcriptomic library of Alternaria tenuissima.
Alleles were aligned using GeneMarker v.2.2.0 software (SoftGenetics, State College, PA, United States). Sequenced fragments with an identical size originating from the same primer pair were considered as an allele. Multilocus genotypes, defined as having the same alleles at each of the single SSR loci, were detected using GenClone v.2.0 (Arnaud-Haond and Belkhir, 2007). Isolates with the same multilocus haplotype were considered as the asexual progeny of a genotype.
Genotypic diversity, gene diversity (Nei, 1973), allelic richness, and clonal fraction (CF) were used to evaluate genetic variation for each of the assigned geographical groups and the pooled geographic regions (Zhan et al., 2003). The genetic variation data, except for CF, were calculated using POPGENE v. 1.32 according to the method of Meng et al. (2015). The Shannon index was computed to estimate genotypic diversity (Grünwald et al., 2003). CF, defined as the proportion of isolates resulting from asexual reproduction (Zhan et al., 2003), was calculated as 1 - (number of genotypes/number of isolates assayed). The Ewens–Watterson test was used to evaluate the selective neutrality of SSR markers (Ewens, 1972; Watterson, 1978). The index of population differentiation (FST), summary heterozygosity (H) from each locus (Nei, 1973; Nei and Chesser, 1983), and the test for selective neutrality were also computed by POPGENE v. 1.32 (Yeh et al., 1997).
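The diversity measures defined above can be sketched in a few lines of Python. This is an illustrative sketch, not the POPGENE implementation: the function names are hypothetical, the gene diversity shown is the uncorrected form 1 - Σp², and the Shannon index is left unnormalized, so values may differ from those produced by the software. The example genotype list is constructed from the genotype counts reported in the results of this article (175 genotypes found once, 6 found twice, and 1 found four times).

```python
import math
from collections import Counter


def clonal_fraction(genotypes):
    """CF = 1 - (number of distinct genotypes / number of isolates)."""
    return 1 - len(set(genotypes)) / len(genotypes)


def gene_diversity(alleles):
    """Uncorrected gene diversity at one locus: 1 - sum of squared allele frequencies."""
    n = len(alleles)
    return 1 - sum((c / n) ** 2 for c in Counter(alleles).values())


def shannon_index(genotypes):
    """Unnormalized Shannon index of genotypic diversity: -sum(p * ln p)."""
    n = len(genotypes)
    return -sum((c / n) * math.log(c / n) for c in Counter(genotypes).values())


# Genotype labels are arbitrary integers; only their multiplicities matter.
genotypes = (
    list(range(175))                                   # 175 genotypes seen once
    + [g for g in range(175, 181) for _ in range(2)]   # 6 genotypes seen twice
    + [181] * 4                                        # 1 genotype seen four times
)
cf = clonal_fraction(genotypes)  # 1 - 182/191, about 0.047
```

With 182 genotypes among 191 isolates, the clonal fraction is about 0.047, which rounds to the 0.05 reported for the pooled population.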
Finally, to assess whether random mating occurs in each of the geographic regions, MULTILOCUS v. 1.3 was used to test the null hypothesis of random mating using the index of association (IA) and multilocus linkage disequilibrium (rd), with 1,000 randomizations, according to Hemmati et al. (2009). If the values of IA and rd are not significantly different from the expected value of 0, the population is considered to be randomly mating (Brown et al., 1980). Because IA depends on the number of loci included, the modified statistic rd was used to supplement it. The proportion of compatible pairs of SSR loci (PrCompat) was also calculated using MULTILOCUS v. 1.3 software (Agapow and Burt, 2001). Two SSR loci are compatible if all the observed genotypes can be explained by mutation rather than recombination (Estabrook and Landrum, 1975).
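The logic of the IA and rd statistics can be sketched as follows for haploid multilocus genotypes. This is a simplified illustration under stated assumptions, not the MULTILOCUS code: the function names, the p-value convention, and the toy data are hypothetical. For every pair of isolates, the number of loci at which they differ is counted; IA compares the observed variance of that count (V_O) with the sum of the per-locus variances (V_E), and rd rescales the difference so that it is less dependent on the number of loci.

```python
import itertools
import math
import random


def _variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def association_indices(data):
    """Return (IA, rd) for a list of haploid genotypes (tuples of alleles).

    Assumes at least two polymorphic loci; otherwise a division by zero occurs.
    """
    pairs = list(itertools.combinations(range(len(data)), 2))
    n_loci = len(data[0])
    # per_locus[j][p] is 1 if pair p differs at locus j, else 0.
    per_locus = [[int(data[a][j] != data[b][j]) for a, b in pairs]
                 for j in range(n_loci)]
    totals = [sum(col) for col in zip(*per_locus)]
    v_obs = _variance(totals)
    var_j = [_variance(d) for d in per_locus]
    v_exp = sum(var_j)
    denom = 2 * sum(math.sqrt(a * b)
                    for a, b in itertools.combinations(var_j, 2))
    return v_obs / v_exp - 1, (v_obs - v_exp) / denom


def permutation_p(data, n_perm=1000, seed=0):
    """One-sided p-value for rd, shuffling alleles independently within each locus."""
    rng = random.Random(seed)
    observed = association_indices(data)[1]
    columns = [list(c) for c in zip(*data)]
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(c, len(c)) for c in columns]
        hits += association_indices(list(zip(*shuffled)))[1] >= observed
    return (hits + 1) / (n_perm + 1)


# Toy data: two completely linked loci in four isolates.
toy = [(1, 1), (1, 1), (2, 2), (2, 2)]
ia, rd = association_indices(toy)
```

Shuffling alleles within loci preserves allele frequencies while breaking associations among loci, which is the null expectation under random mating.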
STRUCTURE v. 2.3.4 software (Pritchard et al., 2000) was used to analyze population structure and test for admixture. A Bayesian Markov chain Monte Carlo (MCMC) approach was implemented in STRUCTURE v. 2.3.4 using the protocol described by Tsui et al. (2012). A burn-in period of 100,000 iterations followed by 1,000,000 MCMC iterations was run under an admixture model with correlated allele frequencies, for K-values from 1 to 10. For each value of K (1–10), ten independent runs were performed to check consistency (Tsui et al., 2012). Structure Harvester was used to compute ΔK (Evanno et al., 2005) to estimate the optimal K-value. Replicate simulations of cluster membership (q-matrices) at K = 4 were used as input for CLUMPP_Windows v. 1.1.2 (Jakobsson and Rosenberg, 2007) using the Fullsearch algorithm, with weighted H and the G similarity statistic. Summarized cluster membership matrices (q-values) for both individuals and populations were then visualized using DISTRUCT v. 1.1 (Rosenberg, 2004).
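The ΔK criterion can be illustrated with a short sketch. It assumes the common formulation in which ΔK is the absolute second-order difference of the mean log-likelihood across replicate runs, divided by the standard deviation of the log-likelihood at K; the function name and the log-likelihood values below are hypothetical and are not results from this study.

```python
from statistics import mean, stdev


def delta_k(ln_prob):
    """Compute Delta K from a dict mapping K to a list of ln P(D|K) over replicate runs.

    Delta K is undefined for the smallest and largest K tested.
    """
    result = {}
    for k in sorted(ln_prob):
        if k - 1 in ln_prob and k + 1 in ln_prob:
            second_diff = abs(mean(ln_prob[k + 1])
                              - 2 * mean(ln_prob[k])
                              + mean(ln_prob[k - 1]))
            result[k] = second_diff / stdev(ln_prob[k])
    return result


# Hypothetical replicate log-likelihoods for K = 1..5 (two runs each).
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-700.0, -702.0],
    4: [-690.0, -692.0],
    5: [-685.0, -687.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

In this toy input the likelihood gain levels off after K = 3, so ΔK peaks there.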
Nei’s unbiased genetic distance (Nei, 1978) was calculated among all pairs of sampling populations and visualized by Principal Coordinates Analysis (PCoA) with GenAlEx v. 6.5 (Peakall and Smouse, 2006).
Genetic differentiation was calculated using an analysis of molecular variance (AMOVA) with ARLEQUIN (Excoffier et al., 2005). Statistical significance of φ-statistics was tested based on 1,023 permutations (default). Pairwise FST was calculated and evaluated using a randomization test with 1,000 iterations utilizing ARLEQUIN v. 3.11 (Excoffier et al., 2005).
Isolation by distance (IBD) was evaluated by assessing the correlation between pairwise geographical distance and Nei’s unbiased genetic distance (Nei, 1978) for all population pairs with the package GENEPOP in R v.3.5.1 (using Isolde) (Raymond and Rousset, 1995) using 1000 random permutations.
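The permutation logic behind an isolation-by-distance test can be sketched as a simple Mantel test. This sketch uses the Pearson correlation for brevity, and the statistic used by the Isolde option of GENEPOP may differ; the function names and the toy distance matrices are hypothetical.

```python
import math
import random


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(genetic, geographic, n_perm=1000, seed=0):
    """Return (r, one-sided p) for two symmetric distance matrices."""
    n = len(genetic)
    idx = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo = [geographic[i][j] for i, j in idx]
    observed = pearson(geo, [genetic[i][j] for i, j in idx])
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        # Permute population labels of the genetic matrix only.
        perm = list(range(n))
        rng.shuffle(perm)
        shuffled = [genetic[perm[i]][perm[j]] for i, j in idx]
        hits += pearson(geo, shuffled) >= observed
    return observed, (hits + 1) / (n_perm + 1)


# Toy example: four populations on a line, genetic distance proportional to km.
positions = [0.0, 100.0, 300.0, 700.0]
geo_km = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.001 * d for d in row] for row in geo_km]
r, p_value = mantel(gen, geo_km, n_perm=200)
```

Permuting whole population labels, rather than individual matrix cells, keeps the dependence among pairwise distances intact.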
The possibility and rate of migration among geographic regions were tested with MIGRATE v. 3.6.11 (Beerli and Felsenstein, 1999), which uses an expansion of the coalescent theory to estimate migration rates between populations (Nem) and Θ (2Neμ), where Ne is the effective population size, m is the constant migration rate between population pairs, and μ is the mutation rate per generation at the locus considered. Likelihood surfaces for each parameter were estimated by simulating genealogies using an MCMC approach. The computations were carried out under a Brownian motion approximation of the stepwise mutation model (SMM). The runs consisted of two replicates of 10 short chains (with 10,000 genealogies sampled) and three long chains (with 500,000 genealogies sampled), with the first 10,000 genealogies discarded. A likelihood ratio test was used to compare the likelihoods of all models (Beerli and Felsenstein, 1999).
Illumina Sequencing Data
After stringent quality assessment, a total of 32.25 million clean reads with a GC content of 54.18% and a Q30 (Quality Score 30) percentage of 95.27% were obtained from the transcriptome sequence of A. tenuissima (Supplementary Table S2). Clean reads accounted for 99.72% of the total raw reads (Supplementary Figure S1). Sequencing was conducted on an Illumina HiSeq 4000 sequencing platform. Based on the clean reads, 50,992 individual transcripts were identified and 32,284 unigenes were assembled. The transcripts had an average length of 2,007.88 bp (N50 = 4,172 bp) and the unigenes had an average length of 1,088.76 bp (N50 = 2,451 bp). N50 is the length such that sequences of that length or longer account for at least 50% of the total assembled length, and it is used to evaluate the quality of assembled sequences. Among the unigenes, 9,555 (29.60%) were 201 to 300 bp in length; 8,304 (25.72%) were 301 to 500 bp; 5,112 (15.83%) were 501 to 1,000 bp; 3,899 (12.08%) were 1,001 to 2,000 bp; and 5,413 (16.77%) were over 2,000 bp.
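For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of the usual calculation is given here: sequences are sorted from longest to shortest, and N50 is the length at which the running total first reaches half of the total assembled length. The function name and the toy lengths are hypothetical and are not data from this study.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must not be empty")


# Toy assembly of eight contigs (32 bases in total).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because the statistic is weighted by sequence length, it is typically larger than the mean length, as with the transcript and unigene values reported here.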
Collectively, 24,670 unigenes were annotated utilizing five databases, NCBI non-redundant protein (Nr protein), Swiss-Prot, euKaryotic Orthologous Groups (KOG), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO). Among the unigenes, 24,570 (99.6%) exhibited significant similarity to proteins in the Nr protein database, among which 10,562 (42.8%) were also found in the Swiss-Prot database.
A total of 15,503 (62.8%) of the unigenes were classified into 25 functional categories according to the KOG functional classification (Figure 2). The general function prediction was the most highly represented category (2,349 unigenes, 15.2%). Extracellular structures (31, 0.2%), followed by cell motility (3, <0.1%), were the least represented categories. KEGG was employed to identify the biological pathways present in the transcriptome sequences obtained from A. tenuissima. This analysis resulted in the clustering of 7,412 (30.0%) unigenes into 111 pathways. Highly represented pathways included: carbon metabolism (378 unigenes, 5.1%), biosynthesis of amino acids (352, 4.7%), and protein processing in the endoplasmic reticulum (321, 4.3%). The top 20 biological pathways of the enriched KEGG annotations are presented in Figure 3. Further analysis indicated that 15,589 (63.2%) of the unigenes could be assigned into three GO categories: cellular component, molecular function, and biological process (Figure 4). The highest represented subcategory in the cellular component category was cell part (6,475, 41.5%). Within the molecular function category, catalytic activity (8,860, 56.8%) and binding activity (7,810, 50.1%) were the most highly represented. A total of 1,096 unigenes (7.0%) were associated with transporter activity. Under the biological process category, metabolic process (11,068, 71.0%) was the most highly represented, followed by cellular process (8,984, 57.6%), and single-organism process (7,913, 50.8%).
Development of SSR Markers
A total of 1,140 SSRs were identified in the A. tenuissima transcriptome (Table 2). The SSRs were identified in 1,072 unigenes, among a total of 9,312 unigenes that were more than 1,000 bp in length. The number of repeats in the SSRs varied from 5 to 24, and SSRs with more than 10 repeats were the most abundant. The percentage of motifs with 9 repeats was low (3.1%) (Table 2). A total of 111 of the unigenes contained more than one SSR. The distribution density of SSR loci in A. tenuissima unigenes was one per 22.9 kb, and the frequency of SSRs varied with repeat number. The 487 mono-nucleotide repeat motifs were the most abundant, with a frequency of 42.7%, followed by tri- (368 or 32.3%), di- (258 or 22.6%), tetra- (15 or 1.3%), penta- (7 or 0.6%), and hexa-nucleotide (5 or 0.4%) repeat motifs (Table 2). Sixteen polymorphic SSR markers were selected for population genetic structure analysis based on the presence of different motifs. The unigenes with SSR markers were annotated with 51 functions, assigned to three categories in GO terms, and grouped into 13 different classifications in the KOG database. The unigenes with SSR markers were mapped onto 15 pathways in the KEGG pathway database (Supplementary Table S3).
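The motif-class percentages above follow directly from the reported counts, as the short check below shows. The counts are taken from the text; the variable names are illustrative.

```python
# SSR counts per motif class, as reported in the text.
motif_counts = {
    "mono": 487,
    "tri": 368,
    "di": 258,
    "tetra": 15,
    "penta": 7,
    "hexa": 5,
}

total_ssrs = sum(motif_counts.values())
percent = {name: round(100 * count / total_ssrs, 1)
           for name, count in motif_counts.items()}

# Share of unigenes longer than 1,000 bp that carry at least one SSR.
share_with_ssr = round(100 * 1072 / 9312, 1)
```

The six classes sum to the 1,140 SSRs reported, and 1,072 of 9,312 unigenes corresponds to about 11.5%.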
Genetic Variation and Linkage Disequilibrium
Sixteen SSR markers were used to analyze the genetic structure of A. tenuissima in four geographic regions in China (Table 1). The flanking primers designed for each SSR provided distinct amplicons of the expected size. The observed fixation indexes fell within the 95% confidence intervals of the test for selective neutrality, suggesting that each SSR conformed to selective neutrality (Table 3). The 191 isolates from the four populations of A. tenuissima were determined to represent 182 distinct genotypes. Genotypic diversity was 0.87 and the CF was 0.05 in the population pooled from the four geographic regions (Table 4). Across the geographic regions, 175 genotypes were detected only once, 6 genotypes were detected twice, and one genotype was detected four times. A total of 180 of the genotypes were detected in only one geographic region, while 2 genotypes were present in two geographic regions. No genotypes were found in three or four geographic regions. The genotypic diversity of isolates collected from Eastern China was higher than in the other three regions, Northeastern China, Northern China, and Northwestern China (Table 4).
TABLE 3. Size range, number of isolates analyzed (n), number of alleles (Na), and test for neutrality of the SSR loci identified in the transcriptome sequence data of Alternaria tenuissima.
TABLE 4. Genetic diversity in four Alternaria tenuissima geographic regions sampled from the principal tomato production regions in China.
The total number of alleles in the four geographic regions ranged from 2 to 13, and the number of private alleles ranged from 0.25 to 0.69 (Table 4). A. tenuissima from Eastern China had the most private alleles, followed by Northeastern China, Northern China, and Northwestern China. Correspondingly, the population from Eastern China also had the highest genetic diversity value among all four populations. Values of gene diversity ranged from 0.48 in Northwestern China to 0.56 in Eastern China (Table 4). The proportion of total genetic diversity attributed to population differentiation (FST) ranged from 0.309 to 0.582 for the sixteen SSR loci, with an overall average of 0.461. The gene diversity per locus ranged from 0.102 to 0.854 (Table 5).
TABLE 5. Summary of the index of population differentiation (FST) and heterozygosity (H) for each locus in Alternaria tenuissima isolates collected from China.
In an analysis of multilocus gametic disequilibrium, two measures of association, the index of association (IA) and multilocus linkage disequilibrium (rd), were found to be significant in the four geographic regions and in the total sample (all four geographic regions combined), so the null hypothesis of complete panmixia was rejected (Table 6).
TABLE 6. Proportion of compatible pairs of loci (PrCompat), index of association (IA), and multilocus linkage disequilibrium (rd) for Chinese populations of Alternaria tenuissima obtained from infected tomato leaves.
Population Structure and Differentiation
The Bayesian cluster analysis using STRUCTURE v. 2.3.4 indicated that the number of genetically distinct ancestral populations was best represented by K = 4 clusters, which was the highest value of ΔK (Figure 5 and Supplementary Figure S2). The isolates from Northeastern China were assigned to cluster q4 (17 isolates, 38%) and q3 (15 isolates, 34%). The fifteen populations from Northern China exhibited a high level of admixture, and the isolates were assigned to cluster q4 (21 isolates, 29%) and q3 (19 isolates, 26%), followed by q2 (16 isolates, 22%). Most of the isolates from Eastern China were assigned to cluster q1 (18 isolates, 47%), and only one isolate was assigned to cluster q2. Most of the isolates from Northwestern China were assigned to q2 (16 isolates, 43%) and to a lesser extent q3 (8 isolates, 22%). Only two Northwestern China isolates were assigned to cluster q4.
FIGURE 5. Population structure of A. tenuissima based on 16 microsatellites (different shadings represent different genetic groups; each column represents an individual isolate, and the height of the column segments shows the probability of assignment of this isolate to a particular genetic group. The height of each shaded region within an individual bar is the measure of proportional affiliation. When K = 4, q1 red, q2 green, q3 yellow, q4 blue, individuals with membership coefficients of qi ≥ 0.7 were assigned to a specific genetic cluster).
Based on the PCoA, the eight populations from Northeastern China clustered within the two right quadrants of the first PCoA axis (Figure 6). The fifteen Northern China populations were spread across the first component space (explaining 45% of the variation), partially overlapping with Eastern China and two Northwestern China regional populations. Northwestern China and Northeastern China regional populations were slightly differentiated along the second PCoA axis (explaining 12.86% of the variation). The populations in Eastern China and Northwestern China tended to fall into different clusters along the first PCoA axis (Figure 6). The PCoA results were similar to the results obtained in the STRUCTURE analysis. The genetic clusters did not group completely according to geographic region in the PCoA, which may be explained by the small sample sizes of some populations.
FIGURE 6. Principal Coordinates Analysis (PCoA) among 34 populations based on Nei’s genetic distance using GenAlEx.
The analysis of molecular variance (AMOVA) performed on the 34 populations indicated that 13.25 and 86.75% of the genetic variation was attributed to variations among and within populations, respectively (P < 0.001) (Table 7). AMOVA was used to further analyze the level of differentiation among the four geographic regions established in the analysis utilizing STRUCTURE and geography. AMOVA attributed 5.43, 8.88, and 85.69% of the total variation to variations among geographic regions, among sampling locations within geographic region, and among individual isolates within populations, respectively, all of which were highly significant (P < 0.001) (Table 7).
TABLE 7. Analysis of molecular variance (AMOVA) for Alternaria tenuissima populations based on (i) sampling locations, and (ii) four geographic regions.
In general, pairwise genetic differentiation (FST) between populations was not significant within the geographic regions Northern China, Eastern China, and Northwestern China, except for Baoding city (Supplementary Table S4). These results indicate that the levels of genetic differentiation within Northern China, Eastern China, and Northwestern China are similar.
The correlation between genetic and geographical distances was weak and non-significant within the genetic clusters of Northeastern China (r² = 0.0151, P = 0.668), Northern China (r² = 0.0580, P = 0.999), Eastern China (r² = 0.0847, P = 0.907), and Northwestern China (r² = 0.3949, P = 0.007) (Supplementary Figure S3). However, a significant correlation was observed between genetic distances FST/(1-FST) and geographical distances (km) for the entire data set (r² = 0.0362, P < 0.001). These results suggest that isolation by distance exists among the geographic regions.
Considerable levels of gene flow were observed among the geographic regions with an estimated number of migrants per generation M (2Nem) ranging from 0.54 (Eastern China from Northwestern China) to 3.70 (Northwestern China from Northern China) (Table 8). The observed gene flow was asymmetric between Eastern China and Northwestern China (1.13 vs. 0.54), depending on the direction of the gene flow.
TABLE 8. Estimates of the mean population mutation rate (2Neμ) and mean number of migrants per generation M (2Nem).
Whole genome sequences in the genus Alternaria have been obtained for A. consortialis (GenBank accession no. BCGG00000000), A. alternata (LMXP00000000), A. arborescens (AIIC00000000), and A. brassicicola (PHFN00000000) (Hu et al., 2012; Nguyen et al., 2016). A whole genome or transcriptome sequence of A. tenuissima, however, has not been reported or deposited in a public DNA database. In the current study, Illumina sequencing of a transcriptome of A. tenuissima generated 32.25 million reads with a 95.27% Q30 and 32,284 unigenes were predicted after assembly. The N50 length of the unigenes was 2,451 bp, which was longer than the N50 obtained from the transcriptome sequencing of Alternaria sp. MG1 (N50 = 2,153 bp) using the Illumina HiSeq 2500 platform in a previous study (Che et al., 2016). Collectively, the results indicate that the quality and integrity of the obtained sequences are high.
Next-generation sequencing is a highly efficient and low-cost technology that can be used to develop large numbers of new SSR markers (Zhang et al., 2014). Detecting SSR markers in NGS data derived from a transcriptome is more efficient and rapid than previous, standard methodologies (Zheng et al., 2013). A total of 1,140 SSR loci were identified from 1,072 unigene sequences of A. tenuissima. Approximately 11.5% of the transcriptomic sequences contained SSR loci. The distribution density of SSRs in A. tenuissima is similar to that of many higher plant species, such as rice, wheat, and soybean, which generally express a larger number of genes in a transcriptome than fungi due to the overall size of their genomes. Our results clearly identified a large number of SSR loci in the genes expressed in the A. tenuissima transcriptome.
Genetic markers should be selectively neutral, moderately diverse, and not linked if they are to be reliably used to study population genetics (Brown, 1996; Cooke and Lees, 2004). In the current study, sixteen SSR loci were selected from among different unigenes to analyze genetic diversity in A. tenuissima populations from four geographic regions. Most of the unigenes with SSR loci had different annotations in GO, KOG, and KEGG databases. The average size of a genome within different Alternaria species is more than 30 Mb (Hu et al., 2012; Woudenberg et al., 2015). Therefore, the probability that the sixteen selected genetic SSR markers are linked is extremely low based on the size of the genome.
Sexual recombination is expected to produce high levels of genetic diversity and random association among different loci (Milgroom, 1996; Kreis et al., 2016). In recent studies, some Alternaria species, such as A. solani (Meng et al., 2015), A. alternata (Morris et al., 2000), and A. helianthi (Santha Lakshmi Prasad et al., 2009), have been reported to have high levels of genetic diversity and recombination. Stewart et al. (2011), based on the results of mating system tests, suggested that Alternaria has a sexual cycle. Linkage equilibrium was found in A. brassicicola among the microsatellite loci (Linde et al., 2010). The complete sexual cycle of the above Alternaria species, however, has not been observed anywhere in the world. In the present study, the analysis of genotypic disequilibrium of populations from four geographic regions revealed a significant degree of non-random association, although high levels of diversity were observed. These results are consistent with Meng et al. (2015), who found that populations of A. solani from the Fujian Province (Eastern China) displayed high genetic variation and a lack of random mating. Bock et al. (2005) reported high levels of genetic diversity with a significant level of linkage disequilibrium in populations of A. brassicicola and suggested that recombination occurred only occasionally. Van Der Waals et al. (2004) indicated that the high genetic variation in A. solani could be accounted for by mutations rather than by sexual reproduction. These results suggest that random mating is not the main biotic factor that governs the high variation present in the four geographic regions.
High levels of genetic diversity were found in the four geographic regions and within each sampled population, except for populations with a small sample size (e.g., Beijing Municipality, Songyuan, Ganzhou, Shaoxing, and Zhangye cities) (Table 4 and Supplementary Table S1). There are reports of A. tenuissima causing foliar diseases in China on wheat (Bensassi et al., 2009), potato (Zheng et al., 2015), and watermelon (Zhao et al., 2016a). These crops are often grown in rotation with tomato in some regions of China. The high genetic variation within each geographic region and the low spatial differentiation among different geographic regions are similar to the findings of a study of A. alternata in China (Meng et al., 2018). The genetic structure of the Northeastern China, Northern China, and Northwestern China geographic regions was highly admixed, and the three regions could not be separated into three distinct clusters by admixture and principal coordinate analyses. These results are consistent with gene flow driven by anthropogenic activities: geographically closer populations exchange genetic information over time and tend to exhibit higher genetic similarity (Meng et al., 2018).
Samples in the current study were collected from four geographic regions, separated from each other by more than 500 km. It is difficult for pathogen spores to be disseminated over such a distance by air. The isolation by distance observed for the entire data set, but not within geographic regions, is indicative of a barrier to gene flow. Seed-borne dispersal or transport of other goods contaminated with A. tenuissima, however, may account for the observed gene flow between the geographic regions of Northeastern China, Northern China, and Northwestern China (Malik et al., 1991; Bock et al., 2005; Meng et al., 2015). Long-distance dispersal via human-mediated gene flow was also reported in populations of A. alternata in potato-growing areas of China (Meng et al., 2018) and Rhynchosporium secalis in agricultural systems (Linde et al., 2009). Our present results suggest that human-mediated dispersal also plays an important role in the dynamics of the population genetic structure of A. tenuissima.
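Isolation by distance is commonly assessed with a Mantel test, which correlates a matrix of pairwise genetic distances with a matrix of pairwise geographic distances and judges significance by permuting population labels. The sketch below is a minimal pure-Python version of that general idea; the function names, the number of permutations, and the toy matrices are assumptions made for illustration and are neither the exact procedure nor the data of the present study.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided p) for two symmetric distance matrices."""
    n = len(genetic)
    pairs = list(combinations(range(n), 2))          # upper-triangle index pairs
    geo_vec = [geographic[i][j] for i, j in pairs]
    observed = pearson([genetic[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    labels = list(range(n))
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(labels)                          # relabel populations at random
        permuted = [genetic[labels[i]][labels[j]] for i, j in pairs]
        if pearson(permuted, geo_vec) >= observed:
            at_least_as_large += 1
    # The observed arrangement is counted as one of the permutations.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Toy matrices for four hypothetical populations (not values from the study).
geo_km = [[0, 600, 1500, 1200],
          [600, 0, 1000, 1100],
          [1500, 1000, 0, 900],
          [1200, 1100, 900, 0]]
gen_dist = [[0.00, 0.05, 0.16, 0.11],
            [0.05, 0.00, 0.09, 0.12],
            [0.16, 0.09, 0.00, 0.10],
            [0.11, 0.12, 0.10, 0.00]]
r, p = mantel_test(gen_dist, geo_km)
print(round(r, 3), round(p, 3))
```

Rows and columns are permuted together, which preserves the dependence among distances that share a population; this is why a label permutation is used rather than a shuffle of individual matrix entries.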
In contrast to Northeastern China, Northern China, and Northwestern China, the Eastern China region exhibited a relatively simple genetic structure (Figure 1). The sampled locations in the Eastern China region are geographically far from the other three tomato-cropping regions and separated from them by the Yellow and Yangtze Rivers. We suggest that the populations located within Eastern China may be separated by weak natural barriers. In this scenario, the significant correlation between genetic differentiation and geographic distance would mainly be influenced by the population genetic structure in the Eastern China geographic region. This implies that genetic isolation exists between the Eastern China geographic region and the other three tomato-cropping regions.
In recent years, A. tenuissima has become an important pathogen, causing foliar disease in various crops throughout China (Wang et al., 2014; Zheng et al., 2015; Zhao et al., 2016a,b). A comprehensive understanding of the population genetics of A. tenuissima has been lacking. In the present study, high levels of genetic diversity were found in A. tenuissima, potentially brought about by gene flow among individuals within the populations. This may explain why A. tenuissima has developed the ability to infect different crops. The population genetics and biology of A. tenuissima in other tomato-growing regions in China (e.g., Central China and Southern China) have yet to be determined. Additional studies covering these and wider geographic regions of China are needed to determine the population genetic structure of Alternaria isolates.
Data Archiving Statement
Data for this study will be available at the Dryad Digital Repository after the manuscript is accepted for publication.
NY and XW conceived and designed the study. NY, GM, and KC performed the experiments. NY and XW wrote the paper. XW reviewed and edited the manuscript.
This work was supported by the Chinese Universities Scientific Fund (2015NX005).
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.02904/full#supplementary-material
FIGURE S1 | The percentage of clean reads, adapter related reads, and low quality reads.
FIGURE S2 | The estimated Delta K (ΔK) for number of clusters ranging from 2 to 10 in STRUCTURE analysis.
FIGURE S3 | Plot of isolation by distance for the entire population, the geographic region Northern China, the geographic region Northeastern China, and the geographic region Eastern China.
TABLE S1 | Isolates of Alternaria tenuissima comprising four different geographic regions in China.
TABLE S2 | Transcriptome reads and assembled contig information for Alternaria tenuissima.
TABLE S3 | The SSR of unigene annotations in KOG and GO database, and the pathway of KEGG orthology.
TABLE S4 | Pairwise FST calculated with Arlequin (assessed after 1000 permutations).
- ^ https://blast.ncbi.nlm.nih.gov/Blast.cgi
- ^ https://www.ncbi.nlm.nih.gov/sra
- ^ http://taylor0.biology.ucla.edu/struct_harvest/
Abdelfattah, A., Wisniewski, M., Droby, S., and Schena, L. (2016). Spatial and compositional variation in the fungal communities of organic and conventionally grown apple fruit at the consumer point-of-purchase. Hortic. Res. 3:16047. doi: 10.1038/hortres.2016.47
Agamy, R., Alamri, S., Moustafa, M. F., and Hashem, M. (2013). Management of tomato leaf spot caused by Alternaria tenuissima Wiltshire using salicylic acid and Agrileen. Int. J. Agric. Biol. 15, 266–272.
Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J. H., Zhang, Z., Miller, W., et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402. doi: 10.1093/nar/25.17.3389
Andeden, E. E., Baloch, F. S., Çakır, E., Toklu, F., and Özkan, H. (2015). Development, characterization and mapping of microsatellite markers for lentil (Lens culinaris Medik.). Plant Breed. 134, 589–598. doi: 10.1111/pbr.12296
Arnaud-Haond, S., and Belkhir, K. (2007). GENCLONE: a computer program to analyse genotypic data, test for clonality and describe spatial clonal organization. Mol. Ecol. Notes 7, 15–17. doi: 10.1111/j.1471-8286.2006.01522.x
Artero, A. S., Silva, J. Q., Albuquerque, P. S. B., Bressan, E. A., Leal, G. A., Sebbenn, A. M., et al. (2016). Spatial genetic structure and dispersal of the cacao pathogen Moniliophthora perniciosa in the Brazilian Amazon. Plant Pathol. 66, 912–923. doi: 10.1111/ppa.12644
Bensassi, F., Zid, M., Rhouma, A., Bacha, H., and Hajlaoui, M. R. (2009). First report of Alternaria species associated with black point of wheat in Tunisia. Ann. Microbiol. 59, 465–467. doi: 10.1007/BF03175132
Bernardes-de-Assis, J., Storari, M., Zala, M., Wang, W., Jiang, D., ShiDong, L., et al. (2009). Genetic structure of populations of the rice-infecting pathogen Rhizoctonia solani AG-1 IA from China. Phytopathology 99, 1090–1099. doi: 10.1094/PHYTO-99-9-1090
Bessadat, N., Berruyer, R., Hamon, B., Bataille-Simoneau, N., Benichou, S., Mebrouk, K., et al. (2017). Alternaria species associated with early blight epidemics on tomato and other Solanaceae crops in northwestern Algeria. Eur. J. Plant Pathol. 148, 181–197. doi: 10.1007/s10658-016-1081-9
Bock, C. H., Thrall, P. H., and Burdon, J. J. (2005). Genetic structure of populations of Alternaria brassicicola suggests the occurrence of sexual recombination. Mycol. Res. 109, 227–236. doi: 10.1017/S0953756204001674
Che, J., Shi, J., Gao, Z., and Zhang, Y. (2016). Transcriptome analysis reveals the genetic basis of the resveratrol biosynthesis pathway in an endophytic fungus (Alternaria sp. MG1) isolated from Vitis vinifera. Front. Microbiol. 7:1257. doi: 10.3389/fmicb.2016.01257
Chen, H., Liu, L., Wang, L., Wang, S., Somta, P., and Cheng, X. (2015). Development and validation of EST-SSR markers from the transcriptome of adzuki bean (Vigna angularis). PLoS One 10:e0131939. doi: 10.1371/journal.pone.0131939
Conesa, A., Götz, S., García-Gómez, J. M., Terol, J., Talón, M., and Robles, M. (2005). Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674–3676. doi: 10.1093/bioinformatics/bti610
Evanno, G., Regnaut, S., and Goudet, J. (2005). Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol. Ecol. 14, 2611–2620. doi: 10.1111/j.1365-294X.2005.02553.x
Excoffier, L., Laval, G., and Schneider, S. (2005). Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol. Bioinform. Online 1, 47–50. doi: 10.1177/117693430500100003
Gannibal, P. B., Klemsdal, S. S., and Levitin, M. M. (2007). AFLP analysis of Russian Alternaria tenuissima populations from wheat kernels and other hosts. Eur. J. Plant Pathol. 119, 175–182. doi: 10.1007/s10658-007-9159-z
Grünwald, N. J., Goodwin, S. B., Milgroom, M. G., and Fry, W. E. (2003). Analysis of genotypic diversity data for populations of microorganisms. Phytopathology 93, 738–746. doi: 10.1094/PHYTO.2003.93.6.738
Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., et al. (2013). De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512. doi: 10.1038/nprot.2013.084
Hu, J., Chen, C., Peever, T., Dang, H., Lawrence, C., and Mitchell, T. (2012). Genomic characterization of the conditionally dispensable chromosome in Alternaria arborescens provides evidence for horizontal gene transfer. BMC Genomics 13:171. doi: 10.1186/1471-2164-13-171
Jakobsson, M., and Rosenberg, N. A. (2007). CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 23, 1801–1806. doi: 10.1093/bioinformatics/btm233
Kreis, R. A., Dillard, H. R., and Smart, C. D. (2016). Population diversity and sensitivity to azoxystrobin of Alternaria brassicicola in New York State. Plant Dis. 100, 2422–2426. doi: 10.1094/PDIS-03-16-0414-RE
Lee, S. B., and Taylor, J. W. (1990). “Isolation of DNA from fungal mycelium and single spores,” in PCR Protocols. A Guide to Methods and Applications, eds M. A. Innis, D. H. Gelfand, J. J. Sninsky, and T. J. White (San Diego, CA: Academic Press), 282–287.
Li, M. Y., Wang, F., Jiang, Q., Ma, J., and Xiong, A. S. (2014). Identification of SSRs and differentially expressed genes in two cultivars of celery (Apium graveolens L.) by deep transcriptome sequencing. Hortic. Res. 1:10. doi: 10.1038/hortres.2014.10
Linde, C. C., Liles, J. A., and Thrall, P. H. (2010). Expansion of genetic diversity in randomly mating founder populations of Alternaria brassicicola infecting Cakile maritima in Australia. Appl. Environ. Microbiol. 76, 1946–1954. doi: 10.1128/AEM.01594-09
Linde, C. C., Zala, M., and McDonald, B. A. (2009). Molecular evidence for recent founder populations and human-mediated migration in the barley scald pathogen Rhynchosporium secalis. Mol. Phylogenet. Evol. 51, 454–464. doi: 10.1016/j.ympev.2009.03.002
Meng, J. W., He, D. C., Zhu, W., Yang, L. N., Wu, E., Xie, J. H., et al. (2018). Human-mediated gene flow contributes to metapopulation genetic structure of the pathogenic fungus Alternaria alternata from Potato. Front. Plant Sci. 9:198. doi: 10.3389/fpls.2018.00198
Meng, J. W., Zhu, W., He, M. H., Wu, E. J., Yang, L. N., Shang, L. P., et al. (2015). High genotype diversity and lack of isolation by distance in the Alternaria solani populations from China. Plant Pathol. 64, 434–441. doi: 10.1111/ppa.12275
Miller, N. J., Birley, A. J., Overall, A. D. J., and Tatchell, G. M. (2003). Population genetic structure of the lettuce root aphid, Pemphigus bursarius (L.), in relation to geographic distance, gene flow and host plant usage. Heredity 91, 217–223. doi: 10.1038/sj.hdy.6800331
Morris, P. F., Connolly, M. S., and Clair, D. A. S. (2000). Genetic diversity of Alternaria alternata, isolated from tomato in California assessed using RAPDs. Mycol. Res. 104, 286–292. doi: 10.1017/S0953756299008758
Piotrowska, M. J., Ennos, R. A., Fountaine, J. M., Burnett, F. J., Kaczmarek, M., and Hoebe, P. N. (2016). Development and use of microsatellite markers to study diversity, reproduction and population genetic structure of the cereal pathogen Ramularia collo-cygni. Fungal Genet. Biol. 87, 64–71. doi: 10.1016/j.fgb.2016.01.007
Rahman, M. Z., Honda, Y., Islam, S. Z., Muroguchi, N., and Arase, S. (2002). Leaf spot disease of broad bean (Vicia faba L.) caused by Alternaria tenuissima-a new disease in Japan. J. Gen. Plant Pathol. 68, 31–37. doi: 10.1007/PL00013049
Santha Lakshmi Prasad, M., Sujatha, M., and Chander Rao, S. (2009). Analysis of cultural and genetic diversity in Alternaria helianthi and determination of pathogenic variability using wild Helianthus species. J. Phytopathol. 157, 609–617. doi: 10.1111/j.1439-0434.2009.01542.x
Stewart, J. E., Kawabe, M., Abdo, Z., Arie, T., and Peever, T. L. (2011). Contrasting codon usage patterns and purifying selection at the mating locus in putatively asexual Alternaria fungal species. PLoS One 6:e20083. doi: 10.1371/journal.pone.0020083
Tsui, C. K. M., Roe, A. D., El-Kassaby, Y. A., Rice, A. V., Alamouti, S. M., Sperling, F. A. H., et al. (2012). Population structure and migration pattern of a conifer pathogen, Grosmannia clavigera, as influenced by its symbiont, the mountain pine beetle. Mol. Ecol. 21, 71–86. doi: 10.1111/j.1365-294X.2011.05366.x
Wang, T. Y., Zhao, J., Sun, P., and Wu, X. H. (2014). Characterization of Alternaria species associated with leaf blight of sunflower in China. Eur. J. Plant Pathol. 140, 301–315. doi: 10.1007/s10658-014-0464-z
Woudenberg, J. H. C., Seidl, M. F., Groenewald, J. Z., De Vries, M., Stielow, J. B., Thomma, B. P. H. J., et al. (2015). Alternaria section Alternaria: species, formae speciales or pathotypes? Stud. Mycol. 82, 1–21. doi: 10.1016/j.simyco.2015.07.001
Wright, E. R., Rivera, M. C., Esperón, J., Cheheid, A., and Rodríguez Codazzi, A. (2004). Alternaria leaf spot, twig blight, and fruit rot of highbush blueberry in Argentina. Plant Dis. 88, 1383–1383. doi: 10.1094/PDIS.2004.88.12.1383B
Wu, Z. J., Li, X. H., Liu, Z. W., Xu, Z. S., and Zhuang, J. (2014). De novo assembly and transcriptome characterization: novel insights into catechins biosynthesis in Camellia sinensis. BMC Plant Biol. 14:277. doi: 10.1186/s12870-014-0277-4
Zhan, J., Pettway, R. E., and McDonald, B. A. (2003). The global genetic structure of the wheat pathogen Mycosphaerella graminicola is characterized by high nuclear diversity, low mitochondrial diversity, regular recombination, and gene flow. Fungal Genet. Biol. 38, 286–297. doi: 10.1016/S1087-1845(02)00538-8
Zhang, S., Chen, W., Xin, L., Gao, Z., Hou, Y., Yu, X., et al. (2014). Genomic variants of genes associated with three horticultural traits in apple revealed by genome re-sequencing. Hortic. Res. 1:14045. doi: 10.1038/hortres.2014.45
Zhao, J., Bao, S. W., Ma, G. P., and Wu, X. H. (2016b). Characterization of Alternaria species associated with muskmelon foliar diseases in Beijing municipality of China. J. Gen. Plant Pathol. 82, 29–32. doi: 10.1007/s10327-015-0631-x
Keywords: Alternaria tenuissima, SSR marker, tomato, population genetic structure, next-generation sequencing
Citation: Yang N, Ma G, Chen K and Wu X (2018) The Population Genetics of Alternaria tenuissima in Four Regions of China as Determined by Microsatellite Markers Obtained by Transcriptome Sequencing. Front. Microbiol. 9:2904. doi: 10.3389/fmicb.2018.02904
Received: 16 July 2018; Accepted: 13 November 2018;
Published: 03 December 2018.
Edited by: Weiguo Fang, Zhejiang University, China
Reviewed by: Bo Huang, Anhui Agricultural University, China
Kin-Ming (Clement) Tsui, Weill Cornell Medicine - Qatar, Qatar
Copyright © 2018 Yang, Ma, Chen and Wu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Xuehong Wu, firstname.lastname@example.org