Genomics Evolutionary History and Diagnostics of the Alternaria alternata Species Group Including Apple and Asian Pear Pathotypes

The Alternaria section alternaria (Alternaria alternata species group) represents a diverse group of saprotrophs, human allergens, and plant pathogens. Alternaria taxonomy has benefited from recent phylogenetic revision but the basis of differentiation between major phylogenetic clades within the group is not yet understood. Furthermore, genomic resources have been limited for the study of host-specific pathotypes. We report near complete genomes of the apple and Asian pear pathotypes as well as draft assemblies for a further 10 isolates representing Alternaria tenuissima and Alternaria arborescens lineages. These assemblies provide the first insights into differentiation of these taxa and allow the description of effector and non-effector profiles of apple and pear conditionally dispensable chromosomes (CDCs). We define the phylogenetic relationship between the isolates sequenced in this study and a further 23 Alternaria spp. based on available genomes. We determine which of these genomes represent MAT1-1-1 or MAT1-2-1 idiomorphs and designate host-specific pathotypes. We show for the first time that the apple pathotype is polyphyletic, present in both the A. arborescens and A. tenuissima lineages. Furthermore, we profile a wider set of 89 isolates for both mating type idiomorphs and toxin gene markers. Mating-type distribution indicated that gene flow has occurred since the formation of A. tenuissima and A. arborescens lineages. We also developed primers designed to AMT14, a gene from the apple pathotype toxin gene cluster with homologs in all tested pathotypes. These primers allow identification and differentiation of apple, pear, and strawberry pathotypes, providing new tools for pathogen diagnostics.


INTRODUCTION
Species within the genus Alternaria encompass a range of lifestyles, acting as saprotrophs, opportunistic pathogens, and host-adapted plant pathogens (Thomma, 2003). Large spored species include Alternaria solani, a major pathogen of potato, whereas small spored taxa include the Alternaria alternata species group (Alternaria sect. alternaria), which are found ubiquitously in the environment acting as saprotrophs and opportunistic necrotrophs. This species group is responsible for opportunistic human infections and a range of host-adapted plant diseases.
Taxonomy within this presumed asexual genus has been subject to recent revision. Large spored species can be clearly resolved by standard phylogenetic markers such as ITS and are supported by morphological characters. However, small spored species within the A. alternata species group overlap in morphological characters, possess the same ITS haplotype (Kusaba and Tsuge, 1995), and show low variation in other commonly used barcoding markers (Armitage et al., 2015; Woudenberg et al., 2015). Highly variable phylogenetic markers have provided resolution between groups of isolates that possess morphological patterns typical of descriptions for Alternaria gaisen, Alternaria tenuissima, and Alternaria arborescens (Armitage et al., 2015).
The taxonomy of the species group is complicated by designation of isolates as pathotypes, each able to produce polyketide host-selective toxins (HST) adapted to apple, Asian pear, tangerine, citrus, rough lemon, or tomato (Tsuge et al., 2013). Genes involved in the production of these HSTs are located on conditionally dispensable chromosomes (CDCs) (Hatta et al., 2002). CDCs have been estimated to be 1.05 Mb in the strawberry pathotype (Hatta et al., 2002), 1.1-1.7 Mb in the apple pathotype (Johnson et al., 2001), 1.1-1.9 Mb in the tangerine pathotype (Masunaka et al., 2000, 2005), and 4.1 Mb in the pear pathotype (Tanaka et al., 1999; Tanaka and Tsuge, 2000). These CDCs are understood to have been acquired through horizontal gene transfer and as such, the evolutionary history of CDCs may be distinct from the core genome.
The polyketide synthase genes responsible for the production of the six HSTs are present in clusters. Some genes within these clusters are conserved between pathotypes (Hatta et al., 2002; Miyamoto et al., 2009), while other genes within these clusters are unique to particular pathotypes (Ajiro et al., 2010; Miyamoto et al., 2010). This is reflected in structural similarities between the pear (AKT), strawberry (AFT), and tangerine (ACTT) toxins, with each containing a 9,10-epoxy-8-hydroxy-9-methyl-decatrienoic acid moiety. In contrast, the toxin produced by the apple pathotype (AMT) does not contain this moiety and is primarily cyclic in structure (Tsuge et al., 2013).
Studies making use of bacterial artificial chromosomes (BAC) have led to the sequencing of toxin gene cluster regions from three apple pathotype isolates (GenBank accessions: AB525198, AB525199, AB525200; unpublished). These sequences are 100-130 kb in size and contain 17 genes that are considered to be involved in synthesis of the AMT apple toxin (Harimoto et al., 2007). AMT1, AMT2, AMT3, and AMT4 have been demonstrated to be involved in AMT synthesis, as gene disruption experiments have led to loss of toxin production and pathogenicity (Johnson et al., 2000; Harimoto et al., 2007, 2008). However, experimental evidence has not been provided to show that the remaining 13 AMT genes have a role in toxin production. Four genes present in the CDC for the pear pathotype have been identified and named AKT1, AKT2, AKT3, and AKTR-1 (Tanaka et al., 1999; Tanaka and Tsuge, 2000; Tsuge et al., 2013), and a further two genes (AKT4, AKTS1) have been reported (Tsuge et al., 2013).
The toxicity of an HST is not restricted to the designated host for that pathotype. All or some of the derivatives of a toxin may induce necrosis on "non-target" host leaves. For example, AMT from the apple pathotype can induce necrosis on the leaves of Asian pear (Kohmoto et al., 1976). Therefore, non-host resistance may be triggered by recognition of non-HST avirulence genes.
Alternaria spp. are of phytosanitary importance, with apple and pear pathotypes subject to quarantine regulations in Europe under Annex IIAI of Directive 2000/29/EC as A. alternata (non-European pathogenic isolates). As such, rapid and accurate diagnostics are required for identification. Where genes on essential chromosomes can be identified that phylogenetically resolve taxa, then these can be used for identification of quarantine pathogens (Bonants et al., 2010;Quaedvlieg et al., 2012). However, regulation and management strategies also need to consider the potential for genetic exchange between species. The Alternaria sect. alternaria are presumed asexual but evidence has been presented for either the presence of sexuality or a recent sexual past. Sexuality or parasexuality provides a mechanism for reshuffling the core genome associated with CDCs of a pathotype. It is currently unknown whether pathotype identification can be based on sequencing of phylogenetic loci, or whether the use of CDC-specific primers is more appropriate. This is of particular importance for the apple and Asian pear pathotypes due to the phytosanitary risk posed by their potential establishment and spread in Europe.

DNA and RNA Extraction and Sequencing
Apple pathotype isolate FERA 1166 and Asian pear pathotype isolate FERA 650 were sequenced using both Illumina and nanopore MinION sequencing technologies and the remaining 10 isolates were sequenced using Illumina sequencing technology. For both Illumina and MinION sequencing, DNA extraction was performed on freeze dried mycelium grown in PDB for 14 days.
High molecular weight DNA was extracted for MinION sequencing using the protocol of Schwessinger and McDonald (2017), scaled down to a starting volume of 2 ml. This was followed by phenol-chloroform purification and size selection to a minimum of 30 kb using a Blue Pippin. The resulting product was concentrated using AMPure beads before library preparation was performed using a Rapid Barcoding Sequencing Kit (SQK-RBK001) modified through exclusion of LLB beads. Sequencing was performed on an Oxford Nanopore GridION generating 40- and 34-fold coverage of sequence data for isolates FERA 1166 and FERA 650, respectively. gDNA for Illumina sequencing of isolate FERA 1166 was extracted using a modified CTAB protocol (Li et al., 1994). gDNA for Illumina sequencing of the eleven other isolates was extracted using a Genelute Plant DNA Miniprep Kit (Sigma) using the manufacturer's protocol with the following modifications: the volume of lysis solutions (PartA and PartB) was doubled; an RNase digestion step was performed as suggested in the manufacturer's protocol; twice the volume of precipitation solution was added; elution was performed using elution buffer EB (Qiagen). A 200 bp genomic library was prepared for isolate FERA 1166 using a TruSeq protocol (TruSeq Kit, Illumina) and sequenced using 76 bp paired-end reads on an Illumina GA2 Genome Analyzer. Genomic libraries were prepared for the other eleven isolates using a Nextera Sample Preparation Kit (Illumina) and libraries sequenced using a MiSeq Benchtop Analyzer (Illumina) using 250 bp paired-end reads.
RNAseq was performed to aid training of gene models. mRNA was extracted from isolates FERA 1166 and FERA 650 grown in full strength PDB, 1% PDB, Potato Carrot Broth (PCB), and V8 juice broth (V8B). The protocol for making PCB and V8B was as described in Simmons (2007) for making Potato Carrot Agar and V8 juice agar, with the exception that agar was not added to the recipe. Cultures were grown in conical flasks containing 250 ml of each liquid medium for 14 days. mRNA extraction was performed on freeze dried mycelium using the RNeasy Plant RNA extraction Kit (Qiagen). Concentration and quality of mRNA samples were assessed using a Bioanalyzer (Agilent Technologies). mRNA from the sample grown in 1% PDB for isolate FERA 650 showed evidence of degradation and was not used further. Samples were pooled across growth media for each isolate and 200 bp cDNA libraries prepared using a TruSeq Kit (Illumina). These libraries were sequenced in multiplex on a MiSeq (Illumina) using 200 bp paired-end reads.
Gene prediction was performed on softmasked genomes using Braker1 v.2 (Hoff et al., 2016), a pipeline for automated training and gene prediction of AUGUSTUS v.3.1 (Stanke and Morgenstern, 2005). Additional gene models were called in intergenic regions using CodingQuarry v.2 (Testa et al., 2015). Braker1 was run using the "fungal" flag and CodingQuarry was run using the "pathogen" flag. RNAseq data generated from FERA 1166 and FERA 650 were aligned to each genome using STAR v.2.5.3a (Dobin et al., 2013), and used in the training of Braker1 and CodingQuarry gene models. Orthology was identified between the 12 predicted proteomes using OrthoMCL v.2.0.9 (Li et al., 2003) with an inflation value of 5.
Draft functional genome annotations were determined for gene models using InterProScan-5.18-57.0 (Jones et al., 2014) and through identifying homology (BLASTP, e-value < 1 × 10^-100) between predicted proteins and those contained in the March 2018 release of the SwissProt database (Bairoch and Apweiler, 2000). Putative secreted proteins were identified through prediction of signal peptides using SignalP v.4.1 and removing those predicted to contain transmembrane domains using TMHMM v.2.0 (Käll et al., 2004; Krogh et al., 2001). Additional programs were used to provide evidence of effectors and pathogenicity factors. EffectorP v.1.0 was used to screen secreted proteins for characteristics of length, net charge and amino acid content typical of fungal effectors (Sperschneider et al., 2016). Secreted proteins were also screened for carbohydrate active enzymes using HMMER3 (Mistry et al., 2013) and HMM models from the dbCAN database (Huang et al., 2018). DNA binding domains associated with transcription factors (Shelest, 2017) were identified along with two additional fungal-specific transcription factor domains (IPR007219 and IPR021858). Annotated assemblies were submitted as Whole Genome Shotgun projects to DDBJ/ENA/GenBank (Table 1). This included passing assemblies through the NCBI contamination screen, which did not identify presence of contaminant organisms.
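The secretome filter described above amounts to a set difference. The following is a minimal sketch, assuming the SignalP and TMHMM results have already been parsed into sets of protein identifiers; the function name and the identifiers are hypothetical and are not taken from the study's pipeline.

```python
def putative_secretome(signal_peptide_ids, transmembrane_ids):
    """Keep proteins with a predicted signal peptide and no predicted
    transmembrane domain (inputs are iterables of protein IDs)."""
    return set(signal_peptide_ids) - set(transmembrane_ids)


# Hypothetical protein IDs, for illustration only.
with_signal_peptide = {"g1.t1", "g2.t1", "g3.t1"}
with_tm_domain = {"g2.t1", "g9.t1"}

secreted = putative_secretome(with_signal_peptide, with_tm_domain)
print(sorted(secreted))
```

Downstream screens (EffectorP, dbCAN) would then be run on this reduced set only.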

Phylogenetics
BUSCO hits of single copy core ascomycete genes to assemblies were extracted and retained if a single hit was found in all of the 12 sequenced genomes and 23 publicly available Alternaria spp. genomes from the Alternaria genomes database (Dang et al., 2015). Nucleotide sequences from the resulting hits of 500 loci were aligned using MAFFT v6.864b (Katoh and Standley, 2013), and a tree was calculated for each locus using RAxML (Liu et al., 2011). The most parsimonious tree from each RAxML run was used to determine a single consensus phylogeny of the 500 loci using ASTRAL v.5.6.1. The resulting tree was visualized using the R package GGtree v.1.12.4.
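The locus-selection rule above (a BUSCO gene is retained only if it has exactly one hit in every genome) can be sketched as follows. The input structure, genome labels, and locus identifiers are illustrative assumptions and do not reproduce BUSCO's own output format.

```python
from collections import Counter


def shared_single_copy_loci(hits_by_genome):
    """Return the loci that have exactly one hit in every genome.

    hits_by_genome maps a genome label to a list of locus IDs,
    with one list entry per hit (duplicated loci appear twice).
    """
    shared = None
    for hits in hits_by_genome.values():
        counts = Counter(hits)
        single = {locus for locus, n in counts.items() if n == 1}
        shared = single if shared is None else shared & single
    return shared if shared is not None else set()


# Hypothetical hits for three genomes.
example = {
    "genome_A": ["locus1", "locus2", "locus3"],
    "genome_B": ["locus1", "locus2", "locus2"],  # locus2 duplicated
    "genome_C": ["locus1", "locus3"],            # locus2 missing
}
print(shared_single_copy_loci(example))
```

Only the loci returned by this kind of filter would be carried forward to alignment and per-locus tree building.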

CDC Identification
Contigs unique to apple and pear pathotypes were identified through read alignment to assembled genomes. Short read alignment was performed using Bowtie2 (Langmead and Salzberg, 2012), returning a single best alignment for each paired read, whereas long read alignments were performed using Minimap2 v.2.8-r711-dirty (Li, 2018). Read coverage was quantified from these alignments using Samtools (Li et al., 2009).
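One way to turn per-contig read depth into presence/absence and copy-number calls is to normalise each contig against the median contig depth. The sketch below assumes mean depths per contig have already been derived from the alignments (for example from Samtools depth output). The thresholds, contig names, and depth values are hypothetical and are not those used in the study.

```python
from statistics import median


def classify_contigs(depth_by_contig, absent_below=0.1, multicopy_from=1.5):
    """Label each contig by its depth relative to the median contig depth.

    The median is used as the baseline on the assumption that most
    contigs belong to the single-copy core genome.
    """
    baseline = median(depth_by_contig.values())
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    calls = {}
    for contig, depth in depth_by_contig.items():
        ratio = depth / baseline
        if ratio < absent_below:
            label = "absent"
        elif ratio >= multicopy_from:
            label = "multi-copy"
        else:
            label = "single-copy"
        calls[contig] = (ratio, label)
    return calls


# Hypothetical mean depths of one isolate's reads on a reference assembly.
depths = {"contig_1": 40.0, "contig_2": 38.0, "contig_3": 42.0,
          "contig_x": 0.5, "contig_y": 81.0}
for contig, (ratio, label) in classify_contigs(depths).items():
    print(f"{contig}\t{ratio:.2f}\t{label}")
```

Contigs called absent in non-pathotype isolates but present in pathotype isolates would be candidates for CDC contigs under this scheme.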

Toxin-Synthesis Genes in Alternaria Genomes
Sequence data for 40 genes located in A. alternata HST gene clusters were downloaded from GenBank. BLASTn searches were performed for all 40 gene sequences against one another to identify homology between these sequences. Genes were considered homologous where they had >70% identical sequences over the entire query length, and an e-value < 1 × 10^-30. tBLASTx was used to search for the presence of these genes in assemblies.
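The homology criteria above can be applied to tabular BLAST output as in the sketch below. It assumes the default 12-column tabular format (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and query lengths supplied separately. Interpreting ">70% identical over the entire query length" as (identical positions / query length) > 0.70 is an assumption of this sketch, and the sequence names are hypothetical.

```python
def is_homolog(hit_line, query_lengths,
               min_identity_fraction=0.70, max_evalue=1e-30):
    """Apply identity-over-query-length and e-value thresholds to one hit."""
    fields = hit_line.rstrip("\n").split("\t")
    qseqid = fields[0]
    pident = float(fields[2])    # percent identity within the alignment
    aln_length = int(fields[3])  # alignment length
    evalue = float(fields[10])
    # Approximate number of identical positions in the alignment.
    identical = pident / 100.0 * aln_length
    identity_over_query = identical / query_lengths[qseqid]
    return identity_over_query > min_identity_fraction and evalue < max_evalue


# Hypothetical hits for a 1,000 bp query.
lengths = {"queryA": 1000}
good = "queryA\tsubj1\t85.0\t900\t135\t0\t1\t900\t1\t900\t1e-50\t500"
short = "queryA\tsubj2\t85.0\t500\t75\t0\t1\t500\t1\t500\t1e-50\t300"
print(is_homolog(good, lengths), is_homolog(short, lengths))
```

The second hit fails because its 425 identical positions cover less than 70% of the query, even though its e-value passes.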

PCR Screens for Apple and Pear Toxin-Synthesis Genes
A set of 89 previously characterized isolates was used to further investigate the distribution of pathotypes throughout the A. alternata species group. PCR primers were designed for the amplification of three genes (AMT4, AMT14, AKT3) located within CDC gene clusters involved in toxin synthesis. Primers for AMT4 were designed to amplify apple pathotype isolates, AKT3 to amplify pear pathotype isolates and AMT14 to identify both apple and pear pathotype isolates. These primers were then used to screen isolates for the presence of these genes in 30 cycles of PCR using 0.25 µl DreamTaq, 1 µl of 10x PCR buffer, 1 µl of dNTPs, 1 µl of gDNA, 1 µl of each primer (5 µM), and 4.75 µl purified water (Sigma-Aldrich). PCR products were visualized using gel electrophoresis and amplicon identity confirmed through Sanger sequencing. Primers AMT4-EMR-F (5′-CTCGACGACGGTTTGGAGAA-3′) and AMT4-EMR-R (5′-TTCCTTCGCATCAATGCCCT-3′) were used for amplification of AMT4. Primers AKT3-EMR-F (5′-GCAATGGACGCAGACGATTC-3′) and AKT3-EMR-R (5′-CTTGGAAGCCAGGCCAACTA-3′) were used for amplification of AKT3. Primers AMT14-EMR-F (5′-TTTCTGCAACGGCGKCGCTT-3′) and AMT14-EMR-R (5′-TGAGGAGTYAGACCRGRCGC-3′) were used for amplification of AMT14. PCR reaction conditions were the same as described above for mating type loci, but with annealing performed at 66 °C for all primer pairs.
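The AMT14 primers contain IUPAC degenerate bases (K, Y, R), so each represents a small pool of distinct oligonucleotides. A minimal sketch of expanding a degenerate primer into its concrete sequences is given below; the primer strings are those listed above with the typesetting spaces removed, and the helper name is hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


amt14_f = "TTTCTGCAACGGCGKCGCTT"  # one K
amt14_r = "TGAGGAGTYAGACCRGRCGC"  # one Y and two R

print(len(expand_primer(amt14_f)))  # 2 variants
print(len(expand_primer(amt14_r)))  # 8 variants
```

The degeneracy is what lets a single primer pair anneal to the AMT14 homologs of several pathotypes.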

Virulence Assay
Pathogenicity assays were performed on apple cv. Spartan and cv. Bramley's Seedling to determine differences in virulence between A. tenuissima isolates possessing the apple pathotype CDC (FERA 635, FERA 743 or FERA 1166) and non-pathotype isolates lacking the CDC (FERA 648, FERA 1082 or FERA 1164). Briefly, leaves were inoculated with 10 µl of 1 × 10^5 spores ml^-1 suspension at six points and the number of leaf spots counted at 14 days post inoculation. One isolate was inoculated per leaf, with 10 replicates per cultivar. Binomial regression using a generalized linear model (GLM) was used to analyse the number of resulting lesions per leaf. Unfolded adult apple leaves, less than 10 cm in length, were cut from young (less than 12 months old) apple cv. Spartan or cv. Bramley's Seedling trees. These were quality-checked to ensure that they were healthy and free from disease. Leaves were grouped by similar size and age and organized into ten experimental replicates of nine leaves. Leaves were placed in clear plastic containers, with the abaxial leaf surface facing upwards. The base of these boxes was lined with two sheets of paper towel and wetted with 50 ml of sterile distilled water (SDW). The cultivars were assessed in two independent experiments. Spore suspensions were made by growing A. alternata isolates on 1% PDA plates for 4 weeks at 23 °C before flooding the plate with 2 ml of SDW and scraping the plate with a disposable L-shaped spreader. Each leaf was inoculated with 10 µl of 1 × 10^5 spores ml^-1 A. alternata spore suspension or 10 µl of SDW at each of six points on the abaxial leaf surface. Of the nine leaves in each box, three leaves were inoculated with spore suspensions from isolates carrying the apple pathotype CDC, three leaves were inoculated with non-pathotype isolates lacking the CDC, and three leaves were inoculated with SDW. Following inoculation, each container was sealed and placed in plastic bags to prevent moisture loss. Boxes were then kept at 23 °C with a 12 h light/12 h dark cycle.

Generation of Near-Complete Genomes for the Apple and Pear Pathotype Using MinION Sequencing
Assemblies using nanopore long-read sequence data for the apple pathotype isolate FERA 1166 and pear pathotype isolate FERA 650 were highly contiguous, with the former totaling 35.7 Mb in 22 contigs and the latter totaling 34.3 Mb in 27 contigs (Table 2). Whole genome alignments of these assemblies to the 10 chromosomes of A. solani showed an overall macrosynteny between genomes (Figure 1), but with structural rearrangement of apple pathotype chromosomes in comparison to A. solani chromosomes 1 and 10. The Asian pear pathotype had distinct structural rearrangements in comparison to A. solani chromosomes 1 and 2 (Figure 1). Scaffolded contigs of FERA 1166 spanned the entire length of A. solani chromosomes 2, 3, 6, 8, and 9 and chromosomes 4, 5, 6, and 10 for FERA 650 (Figure 1). Interestingly, sites of major structural rearrangements within A. solani chromosome 1 were flanked by telomere-like TTAGGG sequences.
Genome assembly of 10 Illumina sequenced isolates yielded assemblies of a similar total size to MinION assemblies (33.9-36.1 Mb) but fragmented into 167-912 contigs. Assembled genomes were repeat sparse, with 1.41-2.83% of genomes repeat masked (Table 2). Genome assemblies of A. arborescens isolates (33.8-33.9 Mb) were of similar total size to non-pathotype A. tenuissima isolates and had similar repetitive content (2.51-2.83 and 1.41-2.83%, respectively). Despite this, identification of transposon families in both genomes showed expansion of DDE (T 5df = 5.36, P < 0.01) and gypsy (T 5df = 6.35, P < 0.01) families in A. arborescens genomes (Figure 2).
[Table legend] Sequencing depth is shown, representing median coverage of trimmed reads aligned to the assembled genome. Numbers of genes predicted to encode secreted proteins, secreted effectors (EffectorP) and secreted carbohydrate active enzymes (CAZymes) are shown, as well as the total number of secondary metabolite clusters in the genome. The percentage of 1315 conserved ascomycete genes that were identified as complete and present in a single copy within assemblies or gene models is shown.

Phylogeny of Sequenced Isolates
The relationship between the 12 sequenced isolates and 23 Alternaria spp. with publicly available genomes was investigated through phylogenetic analysis of 500 shared core ascomycete genes. A. pori and A. destruens genomes were excluded from the analysis due to low numbers of complete single copy ascomycete genes being found in their assemblies (Supplementary Table S1).
The 12 sequenced isolates were distributed throughout A. gaisen, A. tenuissima, and A. arborescens clades (Figure 3). The resulting phylogeny formed the basis for later assessment of CDC presence and mating type distribution among newly sequenced and publicly available genomes, as discussed below.

Gene and Effector Identification
Gene prediction resulted in 12757-13733 genes from our assemblies (Table 2), with significantly more genes observed in the apple pathotype isolates than in A. tenuissima clade non-pathotype isolates (P < 0.01, F 2,8df = 51.19). BUSCO analysis identified that gene models included over 97% of the single copy conserved ascomycete genes, indicating well trained gene models.

Genomic Differences Between A. tenuissima and A. arborescens Clades
Orthology analysis was performed upon the combined set of 158,280 total proteins from the 12 sequenced isolates. In total, 99.2% of proteins clustered into 14,187 orthogroups. Of these, 10,669 orthogroups were shared between all isolates, with 10,016 consisting of a single gene from each isolate. This analysis allowed the identification of 239 orthogroups that were either unique to A. arborescens isolates or expanded in comparison to non-pathotype A. tenuissima isolates.
[Figure legend] Isolate pathotype is labeled following identification of genes involved in synthesis of apple, pear, strawberry, tangerine, rough lemon, and tomato toxins.
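The two orthogroup categories reported in this section (orthogroups shared between all isolates, and those consisting of a single gene from each isolate) can be counted as in the sketch below. The data structure, isolate labels, and gene identifiers are hypothetical and do not reflect OrthoMCL's output format.

```python
from collections import Counter


def summarise_orthogroups(orthogroups, isolates):
    """Count orthogroups shared by all isolates and the single-copy subset.

    orthogroups maps an orthogroup ID to a list of (isolate, gene_id) pairs.
    Returns (n_shared_by_all, n_single_copy_in_all).
    """
    isolates = set(isolates)
    shared = single_copy = 0
    for members in orthogroups.values():
        per_isolate = Counter(isolate for isolate, _gene in members)
        if isolates <= set(per_isolate):
            shared += 1
            if all(per_isolate[i] == 1 for i in isolates):
                single_copy += 1
    return shared, single_copy


# Hypothetical orthogroups across three isolates.
groups = {
    "OG1": [("A", "a1"), ("B", "b1"), ("C", "c1")],              # single copy in all
    "OG2": [("A", "a2"), ("A", "a3"), ("B", "b2"), ("C", "c2")],  # shared, duplicated in A
    "OG3": [("A", "a4"), ("B", "b3")],                           # missing from C
}
print(summarise_orthogroups(groups, ["A", "B", "C"]))
```

Lineage-specific sets, such as orthogroups unique to one clade, could be derived from the same per-isolate counts by testing presence in one group of isolates and absence in the other.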
Genes expanded in or unique to A. arborescens isolates were further investigated using FERA 675 (Supplementary Table S2). Genes involved in reproductive isolation were in this set, including 21 of the 148 heterokaryon incompatibility (HET) loci from FERA 675. CAZymes were also identified within this set, three of which showed presence of chitin binding activity and the other three having roles in xylan or pectin degradation. In total, 25 genes encoding secreted proteins were within this set. Secreted proteins with pathogenicity-associated functional annotations included a lipase, a chloroperoxidase, an aerolysin-like toxin, a serine protease and an aspartic peptidase. A further six secreted genes had an effector-like structure by EffectorP but no further functional annotations. Furthermore, one gene from this set was predicted to encode a fungal-specific transcription factor unique to A. arborescens isolates.
Further to the identification of genes unique or expanded in A. arborescens, 220 orthogroups were identified as unique or expanded in A. tenuissima. These orthogroups were further investigated using isolate FERA 648 (Supplementary Table S2). This set also contained genes involved in reproductive isolation, including nine of the 153 HET loci from FERA 648. CAZymes within the set included two chitin binding proteins, indicating a divergence of LysM effectors between A. tenuissima and A. arborescens lineages. The five additional CAZymes in this set represented distinct families from those expanded/unique in A. arborescens, including carboxylesterases, chitooligosaccharide oxidase, and sialidase. In total, 18 proteins from this set were predicted as secreted, including proteins with cupin protein domains, leucine-rich repeats, and astacin family peptidase domains, with four predicted to have effector-like structures but no further annotations. A. tenuissima isolates had their own complement of transcription factors, represented by four genes within this set.

Identification of CDC Contigs and Assessment of Copy Number
Alignment of Illumina reads to the apple and Asian pear pathotype MinION reference assemblies identified variable presence of some contigs, identifying these as contigs representing CDCs (CDC contigs). Six contigs totaling 1.87 Mb were designated as CDCs in the apple pathotype reference (Table 3).
Read alignments showed that CDC contigs were present in multiple copies within A. alternata pathotype isolates. FERA 1166 Illumina reads aligned to its own assembly showed two-fold coverage over contigs 14, 15, 20, and 21 in comparison to core contigs (Table 3). This was more pronounced in isolate FERA 1177, which had between two- and eight-fold coverage of these contigs. The same was observed in pear pathotype CDC regions, with contigs 14 and 24 in isolate FERA 650 showing two-fold coverage from Illumina reads in comparison to core contigs (Table 4).

Toxin Gene Clusters Are Present on Multiple CDC Contigs
Homologs to 15 of the 17 AMT cluster genes were located on contigs 20 and 21 in the apple pathotype reference genome (e-value < 1 × 10^-30, >70% query alignment), confirming them as CDC regions (Table 5). Of the remaining two genes, AMT11 had low-confidence BLAST homologs on contigs 18 and 21 (e-value < 1 × 10^-30) whereas the best BLAST hit of AMT15 was located on contig 18 (e-value < 1 × 10^-30). Duplication of toxin gene regions was observed between CDC contigs, with contig 20 carrying homologs to 16 toxin genes, but with contig 21 also carrying the AMT1 to AMT12 section of the cluster (Table 5).
The three other apple pathotype isolates (FERA 635, FERA 743 and FERA 1177) also showed presence of 15 of the 17 AMT genes (e-value < 1 × 10^-30, >70% query alignment), with some AMT genes present in multiple copies within the genome, indicating that the AMT toxin region has also been duplicated in these isolates. The Asian pear pathotype was also found to carry toxin gene clusters in multiple copies, with homologs to the four AKT cluster genes present on contig 14 of the FERA 650 assembly (e-value < 1 × 10^-30, >70% query alignment), and three of these also present on contig 24 (e-value < 1 × 10^-30, two with >70% query alignment). BLAST hit results from AKT genes were supported by their homologs from strawberry and tangerine pathotypes also found in these regions (Table 5). The pear pathotype genome was also found to contain additional homologs from apple (AMT14), strawberry (AFT9-1, AFT10-1, AFT11-1, and AFT12-1) and citrus (ACTT5 and ACTT6) located on CDC contigs 14 and 24 (Table 5).

CDCs Carry Effectors Alongside Secondary Metabolites
A total of 624 proteins were encoded on the six contigs designated as CDCs in the reference apple pathotype genome, with 502 proteins encoded on the four Asian pear pathotype CDC contigs (Supplementary Table S3). We further investigated the gene complements of these regions.
Approximately a quarter of gene models on apple pathotype CDC contigs were involved in secondary metabolism, with 153 genes present in six secondary metabolite gene clusters. This included AMT toxin gene homologs on contigs 20 and 21, which were located within NRPS secondary metabolite gene clusters. Three other secondary metabolite clusters were located on CDC contigs, with two of these involved in the production of T1PKS secondary metabolites and the third with unknown function. A further two secondary metabolite clusters were located on contig 14, shared with two non-pathotype isolates, one of which is involved in the production of a T1PKS. The pear pathotype also carried 153 genes in secondary metabolite gene clusters. These 30% of CDC genes were located in four clusters, with the AKT toxin genes in T1PKS clusters of contigs 14 and 24. A second cluster was present on contig 14 with unknown function and a T1PKS cluster was present on contig 16. Approximately 5% of the genes on apple CDC contigs encoded secreted proteins, with 32 in isolate FERA 1166, many of which had potential effector functions, with six designated as CAZymes and 13 testing positive by EffectorP. Similarly, a total of 38 secreted proteins were predicted on the CDC regions of the Asian pear pathotype, with eight of these designated as secreted CAZymes and 12 testing positive by EffectorP.
[Table legend] Results from reference genome isolates FERA 1166 and FERA 650 indicate toxin clusters are present in multiple copies within the genome. This is supported by identification of multiple AMT1 homologs in FERA 635 and FERA 743. Homology between query sequences is shown (homolog groups), as determined from reciprocal BLAST searches between queries. Homologs are identified by e-value < 1 × 10^-30, >70% query alignment. *Marks lower-confidence hits with <70% query alignment.
Further investigation into the 32 secreted proteins from the apple pathotype identified three CAZymes from the chitin-active AA11 family, two from the cellulose-active GH61 family and one cellulose-active GH3 family protein. Six of the 13 EffectorP proteins also had domains identifiable by InterProScan: four carried NTF2-like domains, which are envelope proteins facilitating protein transport into the nucleus; one was a fungal hydrophobin protein; one was a member of the PANTHER superfamily PTHR40845, which shares structural similarity with proteins from the plant pathogens Phaeosphaeria nodorum, Sclerotinia sclerotiorum, and Ustilago maydis. Of the 38 secreted proteins identified from the pear pathotype, two CAZymes were also identified from the chitin-active AA11 family, two from the AA3 family, and single proteins from the GH5, CBM67 and AA7 families. Ten of the twelve secreted EffectorP proteins had no functional information as predicted by InterProScan, with the other two identified as carrying WSC domains (IPR002889), which are cysteine-rich domains involved in carbohydrate binding. CDCs may also play important roles in transcriptional regulation, with 29 putative transcription factors identified in the apple pathotype CDC contigs (4.6% of CDC genes) and 35 identified in pear pathotype CDC contigs (7.0% of CDC genes).

Polyphyletic Distribution of Apple and Tangerine Pathotypes
The evolutionary relationship between A. alternata pathotypes sequenced in this study and publicly available genomes was analyzed by the core gene phylogeny (Figure 3). We identified four isolates as tangerine pathotypes (Z7, BMP2343, BMP2327, BMP3436), two as tomato pathotypes (BMP0308, EGS39-128), one Asian pear pathotype (BMP2338), one rough lemon pathotype (BMP2335) and two apple pathotypes (BMP3063, BMP3064) through searches for genes from HST-gene clusters (Supplementary Table S4). When plotted on the genome phylogeny, we found the apple and tangerine pathotypes to be polyphyletic (Figure 3). Five of the six sequenced apple pathotype isolates were located in the A. tenuissima clade and one in the A. arborescens clade, whereas the tangerine pathotype was present in both the A. tenuissima clade and in the A. tangelonis/A. longipes clade.

Molecular Tools for Identification of Apple, Pear, and Strawberry Pathotypes
PCR primers for three loci (AMT4, AKT3, and AMT14) were designed to identify the distribution of pathotypic isolates through the A. alternata species group and were screened against a set of 89 previously characterized isolates (Figure 4). Four isolates tested positive for the presence of AMT4, each of which was from the A. tenuissima clade (FERA 635, FERA 743, FERA 1166, FERA 1177). Five isolates tested positive for the presence of AKT3, including the three isolates from Asian pear in the A. gaisen clade and a further two isolates from the A. tenuissima clade that were from strawberry. Sequencing of the AKT3 amplicons from the two isolates ex. strawberry identified them as the AFT3-2 ortholog of AKT3, showing that these isolates were strawberry pathotypes rather than pear pathotypes. Sequencing of PCR products from the other isolates confirmed them to be apple or pear pathotypes as expected. All of the isolates testing positive for AMT4 or AKT3 also tested positive for AMT14, indicating its suitability as a target gene for identification of a range of pathotypes. Presence of apple pathotype CDCs was confirmed to be associated with pathogenicity through detached apple leaf assays. Apple pathotype isolates showed significantly greater numbers of necrotic lesions when inoculated onto cv. Spartan (F 72df = 100.64) and cv. Bramley's Seedling (F 72df = 69.64) leaves than non-pathotype A. tenuissima isolates (Figure 5).

DISCUSSION
This work builds upon the current genomic resources available for Alternaria, including the A. brassicicola and A. solani genomes (Belmas et al., 2018; Wolters et al., 2018), A. alternata from onion (Bihon et al., 2016), the additional 25 Alternaria spp. genomes available on the Alternaria Genomes Database (Dang et al., 2015), as well as recent genomes for other pathotype and non-pathotype A. alternata (Hou et al., 2016; Wang et al., 2016; Nguyen et al., 2016). Of the previously sequenced genomes, A. solani, the citrus pathotype and a non-pathotype A. alternata isolate have benefited from long read sequencing technology, with each comprising less than 30 contigs (Wolters et al., 2018; Wang et al., 2016; Nguyen et al., 2016). Total genome sizes in this study (33-36 Mb) were in line with previous estimates for A. alternata, with the tomato pathotype also previously assembled into 34 Mb (Hu et al., 2012). Synteny analysis of our two reference genomes against the chromosome-level A. solani genome revealed structural differences for chromosomes 1 and 10 in the apple pathotype and for chromosomes 1 and 2 in the pear pathotype. These structural differences may represent distinct traits between clades of the A. alternata species group, and may represent a barrier to genetic exchange involved in the divergence of A. gaisen and A. tenuissima lineages. The number of essential chromosomes in our reference genomes is in line with previous findings in A. alternata (Kodama et al., 1998), with 9-11 core chromosomes.
Species designations within the species group have been subject to recent revision (Woudenberg et al., 2015; Lawrence et al., 2013; Armitage et al., 2015), leading to potential confusion when selecting isolates for study. For example, the available Alternaria fragariae genome (Dang et al., 2015) did not represent a strawberry pathotype isolate and was located in the A. gaisen clade. As such, the phylogenetic context for sequenced Alternaria genomes described in this study, along with pathotype identification, provides a useful framework for isolate selection in future work.

Evidence of Genetic Exchange
A 1:1 ratio of MAT loci was observed within A. arborescens and A. tenuissima clades. This supports previous identification of both idiomorphs within A. alternata, Alternaria brassicae, and A. brassicicola (Berbee et al., 2003). Furthermore, presence of both MAT idiomorphs within apple pathotype isolates indicates that genetic exchange (sexuality or parasexuality) has occurred since the evolution of CDCs, providing a mechanism of transfer of CDCs. Evidence for cryptic sexuality or a parasexual cycle has been previously presented for the citrus pathotype of Alternaria alternata (Stewart et al., 2013). We also show that some recent or historic genetic exchange has occurred between A. tenuissima and A. arborescens clades, with both apple and tangerine pathotypes exhibiting a polyphyletic distribution throughout the phylogeny.
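One common way to assess whether idiomorph counts are consistent with a 1:1 ratio is a chi-square goodness-of-fit test with one degree of freedom. The sketch below is illustrative only: this section does not state which test was applied, the function name is hypothetical, and the counts in the usage example are invented rather than taken from the study.

```python
import math
from typing import Tuple


def mat_ratio_test(n_mat1_1_1: int, n_mat1_2_1: int) -> Tuple[float, float]:
    """Chi-square goodness-of-fit test against a 1:1 idiomorph ratio.

    Returns (chi-square statistic, p-value) with one degree of freedom.
    """
    total = n_mat1_1_1 + n_mat1_2_1
    if total == 0:
        raise ValueError("at least one isolate is required")
    expected = total / 2.0
    chi2 = ((n_mat1_1_1 - expected) ** 2 + (n_mat1_2_1 - expected) ** 2) / expected
    # For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts, not data from this study.
    chi2, p = mat_ratio_test(21, 19)
    print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A large p-value means the counts give no evidence against a 1:1 ratio, which is the pattern expected under regular sexual or parasexual exchange; it does not on its own demonstrate that exchange has occurred.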

Duplication of Toxin-Gene Contigs
Toxin genes have been proposed to be present in multiple copies within A. sect. alternaria pathotype genomes, with AMT2 proposed to be present in at least three copies in the apple pathotype CDC (Harimoto et al., 2008), and multiple copies of AKTR and AKT3 in the pear pathotype (Tanaka et al., 1999; Tanaka and Tsuge, 2000). Through read mapping we demonstrated that this is the case. Furthermore, we show that toxin gene clusters are present on multiple contigs, with differences in the gene complements between these clusters. At this stage, it is unclear whether these different clusters are responsible for the production of the variant R-groups previously characterized in AMT or AKT toxins (Nakashima et al., 1985; Harimoto et al., 2007). Differences were also noted between non-pathotype isolates from the A. tenuissima clade in the presence/absence of contigs 14 and 19, representing a total of 775 kb. Chromosomal loss has been reported in the apple pathotype (Johnson et al., 2001), and it is not clear if this represents chromosomal instability in culture or additional dispensable chromosomes within A. tenuissima clade isolates.
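Copy number is often inferred from read mapping by comparing depth over a gene with depth over loci assumed to be single copy. The sketch below illustrates that general idea under stated assumptions: it is not the procedure used in this study, the function name is hypothetical, and the depth values in the usage example are invented.

```python
from statistics import median
from typing import Sequence


def estimate_copy_number(gene_depth: float,
                         single_copy_depths: Sequence[float]) -> float:
    """Estimate gene copy number from mean read depth (illustrative sketch).

    gene_depth: mean read depth across the gene of interest.
    single_copy_depths: mean read depths across loci assumed to be single
        copy, used as the one-copy baseline.
    """
    if not single_copy_depths:
        raise ValueError("at least one single-copy locus is required")
    baseline = median(single_copy_depths)
    if baseline <= 0:
        raise ValueError("baseline depth must be positive")
    return gene_depth / baseline


if __name__ == "__main__":
    # Hypothetical depths, not data from this study.
    print(estimate_copy_number(90.0, [28.0, 30.0, 32.0]))
```

Using the median of several single-copy loci rather than a single reference gene reduces the influence of local coverage bias on the baseline.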

PCR Primers for Diagnostics
It is now clear that genes on essential chromosomes do not provide reliable targets for identification of different pathotypes and hence loci located directly on CDCs should be used. We found AMT14 homologs to be present in all pathotype genomes and designed primers to this region. These demonstrated specificity to apple, pear and strawberry pathotypes within a set of 86 Alternaria isolates. Furthermore, Sanger sequencing of these amplicons confirmed this to be a single locus that can both identify and discriminate a range of pathotypes. Wider validation of this primer set is now required to test its suitability across other pathotypes.

Divergence of A. arborescens and A. tenuissima
The divergence of A. tenuissima and A. arborescens lineages was investigated through identification of expanded and unique gene complements. We identified HET loci unique to A. arborescens or A. tenuissima lineages. HET loci may act as incompatibility barriers to common genetic exchange between these taxa (Glass and Kaneko, 2003). Taxa also showed divergence in effector profiles, including chitin binding effectors, with A. arborescens isolates possessing unique xylan/pectin degradation CAZymes, while A. tenuissima isolates possessed unique carboxylesterase, chitooligosaccharide and sialidase CAZymes. Chitin binding proteins are important in preventing MAMP-triggered host recognition by plants and animals during infection, and may also aid persistence of resting bodies outside of the host (Kombrink and Thomma, 2013). Putative transcription factors were also amongst the proteins specific to A. arborescens or A. tenuissima, indicating that these taxa not only possess distinct gene complements but may also differ in how they respond to stimuli. Dispersed repeat sequences such as transposable elements have been shown to serve as sites of recombination within and between fungal chromosomes (Zolan, 1995) and we also show distinct transposon profiles between A. arborescens and A. tenuissima. Transposons are known to aid host adaptation in plant pathogens (Faino et al., 2016; Gijzen, 2009; Schmidt et al., 2013) and may have been a mechanism for differentiation of these taxa.

Effectors on CDC Regions
Alternaria HSTs are capable of inducing necrosis on non-host leaves (Kohmoto et al., 1976), meaning that non-host resistance must be associated with recognition of other avirulence genes. We investigated the complements of other putative pathogenicity genes and effectors produced by the apple and Asian pear pathotypes and identified additional CAZymes and secondary metabolite profiles on CDC regions, distinct between pathotypes, suggesting additional host-adapted tools for pathogenicity. Additional secondary metabolite clusters were present on both apple and pear pathotype CDCs, as well as unique complements of secreted CAZymes. CAZyme families AA3, AA7 and AA9 have previously been reported to be present in greater numbers in the citrus pathotype than in non-pathotypes (Wang et al., 2016). Furthermore, putative transcription factor genes were identified in CDCs, indicating that these regions may have some level of transcriptional autonomy from the core genome. This has been shown in Fusarium, where effector proteins are regulated by the SGE transcription factor on the core genome but also by FTF and other transcription factor families (TF1-9) located on lineage-specific chromosomes (van der Does et al., 2016).

CONCLUSION
We report near-complete reference genomes for the apple and Asian pear pathotypes of A. sect. alternaria and provide genomic resources for a further ten diverse isolates from this clade. For the first time we show sequenced Alternaria genomes in a phylogenetic context allowing the identification of both mating type idiomorphs present in A. arborescens and A. tenuissima, with a distribution throughout subclades that was indicative of recent genetic exchange. The presence of the apple CDC in isolates of both mating types supports gene flow between isolates. Furthermore, the distribution of isolates from different pathotypes throughout the phylogeny indicated that apple and tangerine pathotypes are polyphyletic. This means that gene flow is not limited to within, but has also occurred between A. tenuissima and A. arborescens lineages. We also developed PCR primers to aid identification of pathotypes, with those targeting the AMT14 locus identifying a range of pathotypes due to its conservation between CDCs. Despite evidence of genetic exchange between A. arborescens and A. tenuissima clades, we show that these taxa are sufficiently isolated to have diverged, with significant differences in core effector profiles and transposon content.

DATA AVAILABILITY STATEMENT
Accession numbers for genomic data are provided in Table 1. Sanger sequence data are deposited at NCBI under accession numbers MK255031-MK255052.

AUTHOR CONTRIBUTIONS
AA, SS, JW, CL, and JC contributed to the conception and design of the study. AA, HC, and RH performed the lab work including library preparation and sequencing. AA performed the bioinformatic analyses and wrote the manuscript. All authors contributed to the manuscript revision, read, and approved the submitted version.

ACKNOWLEDGMENTS
Thanks are given to FERA Science Ltd. and to Drs. P. Gannibal, R. Roberts, and E. Simmons for access to Alternaria isolates. The authors are grateful to the BBSRC for supporting associated research on fungal and oomycete pathogens at NIAB EMR, underpinning the advances presented here.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2019.03124/full#supplementary-material

TABLE S1 | Identification of single copy Ascomycete genes in reference Alternaria spp. genomes.