Comparative genomics and transcriptome analysis reveals potential pathogenic mechanisms of Microdochium paspali on seashore paspalum

The sparse leaf patch of seashore paspalum (Paspalum vaginatum Sw.) caused by Microdochium paspali seriously impacts the landscape value of turf and poses a challenge to the maintenance and management of golf courses. Little is known about the genome of M. paspali or the potential genes underlying pathogenicity. In this study, we present a high-quality genome assembly of M. paspali with 14 contigs using the Nanopore and Illumina platform. The M. paspali genome is roughly 37.32 Mb in size and contains 10,365 putative protein-coding genes. These encompass a total of 3,830 pathogen-host interactions (PHI) genes, 481 carbohydrate-active enzymes (CAZymes) coding genes, 105 effectors, and 50 secondary metabolite biosynthetic gene clusters (SMGCs) predicted to be associated with pathogenicity. Comparative genomic analysis suggests M. paspali has 672 species-specific genes (SSGs) compared to two previously sequenced non-pathogenic Microdochium species, including 24 species-specific gene clusters (SSGCs). Comparative transcriptomic analyses reveal that 739 PHIs, 198 CAZymes, 40 effectors, 21 SMGCs, 213 SSGs, and 4 SSGCs were significantly up-regulated during the process of infection. In conclusion, the study enriches the genomic resources of Microdochium species and provides a valuable resource to characterize the pathogenic mechanisms of M. paspali.


Introduction
Seashore paspalum (Paspalum vaginatum Sw.), a perennial warm-season grass with dense rhizomes and stolon's, is widely distributed in Malaysia, Australia, South Africa, Southeastern China, Southern United States and other tropical and subtropical coastal areas (Lonard et al., 2015).It is one of the most attractive grass species because of its excellent characteristics, including trampling resistance, salt tolerance, drought tolerance, and is widely used in forage production, ecological restoration, landscape, and athletic fields (e. g. golf courses) (Pompeiano et al., 2016;Karimi et al., 2018).However, with its increasing use, the disease problems of seashore paspalum have become severe, and many fungal diseases have been reported to cause Jin et al. 10.3389/fmicb.2023.1259241Frontiers in Microbiology 02 frontiersin.orginfection, including fusarium blight (Fusarium sp.), large patch (Rhizoctonia solani), dollar spot (Clarireedia spp.), basal leaf blight (Waitea circinata), and take-all patch (Gaeumannomyces graminis var.graminis) (Kammerer et al., 2011;Wang, 2015;Zhang et al., 2022).In 2015, a new disease of seashore paspalum named "sparse leaf patch" was reported at golf courses in Hainan province, China.The disease is caused by a new plant pathogenic fungal species, Microdochium paspali (Zhang et al., 2017).At present, there is no report that M. paspali has been isolated from other turfgrasses or plants, but study has verified that it can infect other warm-season and cold-season turfgrasses in control environments (Zhang et al., 2017).Initially, the disease symptoms of infected seashore paspalum are dark green spots with sizes varying from 3-10 cm in diameter.The spots are white with clumped mycelia when the environment is highly humid, especially in the morning.The infected leaves show signs of waterlogging.As the disease progresses, the leaves turn yellow and may become necrotic.And finally, the whole leaves can die rapidly leading to obviously sparse patches on the turf; seriously impacting the landscape and causing significant economic losses (Figure 1).
Microdochium is a genus in Microdochiaceae (Xylariales), first described by Sydow in Germany with M. phragmitis isolated from reed leaves as the model.It is distributed worldwide, especially in Asia and Europe (Huang et al., 2020).Many Microdochium species are important plant pathogens causing diseases of cereals and grasses with severe yield losses.In addition to M. paspali, M. majus, and M. nivale are reported to cause pink snow mold of cereals and cool-season turfgrass in winter (Jewell and Hsiang, 2013).M. oryzae can cause leaf scale of rice (Oryza sativa) (Parkinson, 1980), M. maydis can cause tar-spot disease of maize (Zea mays) (Müller and Samuels, 1984), and recently, M. poae was reported to cause leaf blight on Kentucky bluegrass (Poa pratensis) and creeping bentgrass (Agrostis stolonifera) in North China (Liang et al., 2019).Some Microdochium species also have biological control effects.For example, the spores of the endophyte M. bolleyi can effectively inhibit the take-all disease of wheat and barley (Jadubansa et al., 1994).
At present, studies on Microdochium species mainly focused on the identification and biological characterization of the pathogenic or endophytic fungi, but there is a lack of genome-scale studies.In the NCBI database, only a few gene sequences commonly used for fungal classifications have been recorded for Microdochium species, and the genomes of only two endophytes, M. bolleyi (David et al., 2016) and M. trichocladiopsis (Mesny et al., 2021), have been sequenced.Sparse leaf patch caused by M. paspali has become one of the most serious diseases on seashore paspalum at golf courses in East and South China.Current research on M. paspali is limited and little is known at molecular level (Zhang et al., 2017).In recent years, comparative genomics and transcriptomics have been successfully applied to plant pathogenic fungi (Li et al., 2009;Shen et al., 2017) to better understand the mechanisms of species evolution, pathogenicity and pathogenhost interactions.
In this study, we sequenced the genome of M. paspali using a combination of Nanopore and Illumina high-throughput sequencing technologies.The putative pathogenic related factors in M. paspali were systematically predicted, including pathogen-host interaction (PHI) proteins, carbohydrate-active enzymes (CAZymes), secreted proteins and effector proteins, species-specific genes (SSGs).The species-specific secondary metabolite biosynthetic gene clusters (SMGCs) were further analyzed by comparative genomics with the non-pathogenic fungi, M. bolleyi and M. trichocladiopsis.In addition, comparative transcriptomics analysis were performed to identify the putative diseaserelated factors during the infection of M. paspali on seashore paspalum.This study enriches the genome resources of Microdochium species and provides a scientific basis for further characterization of the mechanisms of pathogenicity and pathogen-host interactions.

Genomic DNA extraction
The M. paspali strain I16 was preserved at the College of Agrograssland Science, Nanjing Agricultural University.It was isolated from symptomatic leaves of seashore paspalum (Paspalum vaginatum) at the research station of the Jurong Turf Institute.The strain, stored in glycerol (20%), was inoculated on potato dextrose agar (PDA) medium, and incubated for 3 d at 25°C in the dark.Thereafter, actively growing mycelial plugs were transferred to fresh PDA at 25°C for 7 d, and subsequently the mycelia were collected.Highly purified total genomic DNA was extracted according to a modified CTAB method (Murray and Thompson, 1980).DNA quality and concentration were examined by electrophoresis in 1% agarose gels (BIOWEST Regular agarose G-10) and the Nanodrop ND-100 (Nanodrop Technologies, Waltham, MA, United States), and sent to Wuhan Benagen Tech Solutions Co., Ltd. for genome sequencing.

Genome sequencing and assembly
The whole-genome of M. paspali was sequenced by the third generation nanopore sequencing combined with the second generation Illumina sequencing.For nanopore sequencing, the Oxford Nanopore Technology (ONT) library was constructed following the manufacturer's protocols of a Ligation Sequencing Kit SQK-LSK109 and a Native Barcoding Expansion Pack EXP-NBD104/114 (Oxford Nanopore Technologies, Oxford, UK).The prepared ONT library was loaded onto flow cells for single molecule sequencing on a PromethION device (Oxford Nanopore Technologies).Illumina sequencing was conducted on an Illumina NovaSeq (Illumina Inc., San Diego, CA, United States) after the genomic DNA was randomly fragmented by Covaris M220 Focused Ultra-sonicator device (Covaris, LLC, United States).The library for Illumina sequencing was constructed following the steps of DNA fragment end repair, 3′ A-tailing, adapter ligation, purification, and PCR amplification.The raw sequencing data were filtered to obtain clean reads, by removing reads containing adapters and ploy-N and low-quality reads with more than 50% of the bases having Q-scores ≤10 in the whole read.The clean reads were analyzed by Genomescope v1.0 (Vurture et al., 2017) with k-mer (k = 19) to estimate the genome size, heterozygosity, and duplication rate before assembly.An initial splicing result was obtained by performing genome error correction and splicing with NECAT software (Chen et al., 2021).The obtained result was subjected to two rounds of error correction based on the third generation sequencing data using Racon v1.4.11 (Vaser et al., 2017), followed by two rounds of Pilon v1.23 (Walker et al., 2014) error correction of the second generation reads.Finally, the haplotigs (also known as duplicate regions) were removed using Purge Haplotigs (Roach et al., 2018) with default parameters.The completeness of the genome assembly was assessed based on the lineage data set fungi_odb10 using BUSCO v4.1.4 (Simão et al., 2015).

Genome structural prediction and functional annotation
The de novo prediction of coding genes in the M. paspali genome was mainly performed with BRAKER v2.1.4 (Brůna et al., 2021), a fully automated pipeline for gene predictions with GeneMark-EP+ and AUGUSTUS.Firstly, the genomic data and RNA-seq data were used for obtaining initial gene structure using the GeneMark-EP+ tool, and then AUGUSTUS was used for further training based on the initial gene structure to generate the final gene prediction results.Non-coding RNA was predicted and statistically classified based on the Rfam database using Infernal v1.1.3 (Nawrocki and Eddy, 2013).The RepeatModeler v2.0 (Flynn et al., 2020) was used to construct a de novo repeat library.The repbase libraries (Bao et al., 2015) were merged and RepeatMasker v4.0.5 (Tempel, 2012) was used for repeat sequence annotations at the genome level.
Functional annotations of coding genes in the M. paspali genome were performed based on searches of sequence and motif similarities.The protein sequences encoded by the genes were compared with the existing protein databases Uniprot, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups of Proteins (COG), Nonredundant (Nr), Pathway, and Reference Sequence (Refseq) with BLASTP (BLAST+ v2.9, E-value <1e-5) (Camacho et al., 2009) to predict gene function.The prediction of structural domains was performed using the program HMMScan in HMMER v3.2.1 (Johnson et al., 2010), and the conserved sequences, motifs and structural domains of the proteins were obtained The predicted proteins were further annotated using the Protein Families (Pfam) database (Mistry et al., 2021) and InterProScan v5.0 (Jones et al., 2014).

Transcriptome sequencing and analyses
To obtain the interactive transcriptomes between M. paspali and seashore paspalum, both fungal mycelia and plant leaf tissues were collected after inoculation.Specifically, ten actively growing mycelial plugs (5 mm in diameter) of M. paspali were put into potato dextrose broth (PDB) medium and grown for 7 days.Mycelia were collected and rinsed with sterile distilled water (SDW) and a portion was stored directly in centrifuge tubes in liquid nitrogen.This served as the control (0 h of inoculation) with the rest being used to make a mycelial suspension by adding SDW and pulverizing with a grinder.The mycelial suspension was evenly sprayed on healthy seashore paspalum cultivar "Sea Spray" until water drops fell off the leaves.The plants were then transferred to a sterilized glass box and placed in a greenhouse at 25°C.Inoculated plants were watered daily to maintain high humidity for infection.It was observed that the mycelium of M. paspali began to grow gradually within 0-12 h after infection, and there were no visible symptoms on the leaves.At 24 h post infection, the water-soaked leaves could be observed, and at 36 h the symptoms of the leaves began to worsen and chlorosis was observed.At 48 h post infection the mycelium of M. paspali spread all over the leaves, which appeared to be uniformly blighted and showed obvious black lesions.Therefore, we collected leaf tissues at 6 h, 12 h, 24 h, 36 h, and 48 h after inoculation (Supplementary Figure S1), and preserved in liquid nitrogen before sequencing.Three biological replicates for each treatment were harvested.
RNA of the samples was extracted using the TRlzol Reagent (Life technologies, California, United States) and RNA concentrations and purity were measured using the NanoDrop 2000 (Thermo Fisher Scientific, Wilmington, DE).RNA integrity was assessed using the RNA Nano 6,000 Assay Kit and the Agilent Bioanalyzer 2,100 system (Agilent Technologies, CA, United States).RNA-seq was performed using the Illumina NovaSeq6000 platform and the 150 bp paired-end chemistries.Raw reads were processed through in-house perl scripts to obtain clean reads by removing reads containing adapters and ploy-N and low-quality reads with more than 50% of the bases having Q-scores ≤10 in the whole read.Subsequently, Hisat2 v2.2.1 (Kim et al., 2015) was used to map the clean reads to the genome of M. paspali.Fragments per kilobase of transcript per million fragments mapped (FPKM) was applied to measure the expression level of a gene or transcript by StringTie (Pertea et al., 2015) employing the maximum flow algorithm.DESeq2 v1.30.1 (Love et al., 2014) provided statistical routines for determining the differential expressions in digital gene expression data using a model based on the negative binomial distribution.Genes with an adjusted fold change ≥1.5 and false discovery rate (FDR) <0.01 were assigned as differentially expressed.Principal component analysis (PCA) of different samples and GO enrichment analysis of differential expressed genes (DEGs) were performed using TBtools v1.108 (Chen et al., 2020).

Prediction and analysis of pathogenicity-related genes
Homologs of known PHI genes were annotated using the BLASTP (E-value <1e-5) based on the PHI v4.14 database (Urban et al., 2021).The prediction and annotation of CAZymes in M. paspali was performed with the dbCAN2 meta-server (E-value <1e-5, coverage >35%) (Zhang et al., 2018).Identification of SMGCs was accomplished in the three species of Microdochium and comparisons with known clusters using antiSMASH v6.0 (Blin et al., 2021) at default parameters.
The classical secreted proteins in M. paspali were identified according to multiple software analyses.The protein sequences of all genes were analyzed using SignalP v5.0 (Almagro Armenteros et al., 2019a,b) to identify proteins containing signal peptides.TMHMM v2.0 (Krogh et al., 2001) was used to eliminate proteins containing transmembrane domains.PredGPI (Pierleoni et al., 2008) was used to eliminate GPI ankyrin.Subcellular localization analysis was performed by TargetP v2.0 (Almagro Armenteros et al., 2019a,b) and the exocrine secreted proteins were screened as classical secreted proteins.Finally, EffecorP v3.0 (Sperschneider and Dodds, 2022) was used for further prediction, and proteins with amino acid length < 300 and the number of cysteine residues ≥4 (Jones et al., 2018) were manually screened as putative effectors.
The density statistics of each type of pathogenicity-related genes and transposable elements in the M. paspali genome were performed at 50-kb length intervals using TBtools v1.108 (Chen et al., 2020).Based on the gene density profile, the circos plot of M. paspali genome was performed using shinyCircos v2.0 (Wang et al., 2023) to display the genome-wide pathogenicity-related genes density distribution.

High-quality genome assembly and annotation of Microdochium paspali
The genome of M. paspali was assembled using a combination of Nanopore sequencing and correction with Illumina short reads.Nanopore sequencing constituted 5.5 Gb data, a total of 707,581 nanopore long reads with an average length of 7.4 kb were obtained after data filtering.Additionally, approximately 28.73 million Illumina short reads were used to correct errors.The assembled genome size of 37.32 Mb was 94.29% of the estimated genome size (39.58Mb) by K-mer analysis (Supplementary Figure S2; Supplementary Table S1).The genome consisted of 14 contigs with an N50 of 3.54 Mb.The maximum contig length was 5,367,041 bp (Contig 1), the minimum contig length was 102,989 bp (Contig 14), and the average length was 2,665,914 bp.The GC content of the assembled contigs was 54.17% (Table 1).Using the fungi_odb10 data set of BUSCO to evaluate the integrity of genome assembly; 99.1% of the expected single-copy  (David et al., 2016;Mesny et al., 2021).A total of 10,365 protein-coding genes were predicted in the genome of M. paspali with an average gene length of 1703 bp and an average of 2.71 exons per gene, while no gene was predicted on contig 14.The total length of genes were 17.66 Mb, accounting for 47.32% of the genome (Table 1).The repeat sequences identified in M. paspali represented 1.78% of the genome length.Most of the repeated sequences were simple repeats with a coverage of 1.13%, while the transposable elements (long terminal repeats (LTR), Long interspersed nuclear elements (LINE), short interspersed nuclear elements (SINE), and DNA element) were 0.51%% (Supplementary Table S2).In addition, a total of 205 non-coding RNAs were annotated, including 40 rRNAs, 3 sRNAs, 25 snRNAs, and 137 tRNAs (Table 1).
To predict gene function, functional annotation was performed using several public databases including GO, KEGG, eggCOG, Nr, Uniprot, Pfam, Pathway, Refseq, and Interproscan.There were 9,877 genes annotated, accounting for 95.29% of the total predicted genes (Supplementary Table S3).The GO database annotated 5,839 genes (56.33%), and most genes involved in the biological processes were transmembrane transport and protein transport; the cellular components were mainly concentrated in nucleus, cytoplasm, and integral component of membrane; the molecular functions were mainly concentrated in binding, oxidoreductase activity, and monooxygenase activity (Supplementary Figure S3).There were 2,405 genes (23.20%) annotated in the KEGG database, of which, 2,263 genes (21.83%) were enriched into 331 metabolic pathways, including amino acid metabolism, carbohydrate metabolism, global and overview maps (Supplementary Figure S4).The COG analysis showed that 810 genes (7.81%) were annotated and clustered into 24 orthologous groups.The functions of these genes were mainly concentrated in carbohydrate transport and metabolism (97 genes), lipid transport and metabolism (87 genes), amino acid transport and metabolism (79 genes), general function prediction only (76 genes), translation, ribosomal structure and biogenesis (66 genes), and energy production and conversion (63 genes) (Supplementary Figure S5).

Identification of pathogenicity-related genes in Microdochium paspali
The pathogenicity-related genes in M. paspali were systematically identified by a combination of bioinformatic prediction and transcriptome sequencing.After sequencing quality control, a total of 198.78 Gb of clean data was obtained for 18 samples through transcriptome sequencing, and the clean data of each sample reached 10.03 Gb.The samples generated clean reads of approximately 33.48 M to 41.86 M, the percentages of Q30 bases were all higher than 91.68%, and a total of 9,629 genes (92.90% of the genes in M. paspali genome) were detected in all samples (Supplementary Table S4).The result of PCA showed that the samples at each time point after infection were obviously separated from the samples before infection (Supplementary Figure S6).A total of 1,983 (1,109 up-regulated and 874 down-regulated), 2,535 (1,487 up-regulated and 1,048 downregulated), 3,244 (1867 up-regulated and 1,377 down-regulated), 3,700 (1908 up-regulated and 1792 down-regulated), and 4,157 (2,338 up-regulated and 1819 down-regulated) DEGs were identified at 6 h, 12 h, 24 h, 36 h, and 48 h after infection, respectively (Supplementary Figure S7; Supplementary Table S5).Subsequently, we performed GO analysis of the DEGs at each time after M. paspali infection.The result showed that the up-regulated DEGs functions were significantly enriched in catalytic activity, membrane and metabolic process, while the down-regulated DEGs functions were significantly associated with binding, metabolic process and cellular process (Supplementary Figure S8).
Abundant pathogenicity-related genes were found in M. paspali genome, a total of 3,830 genes were annotated, accounting for 29.72% of the total predicted genes by using the PHI database (Supplementary Table S6).Of these genes, 1,819, 342 and 136 were annotated as related to reduced virulence, loss of pathogenicity, and increased virulence, respectively, and 739 pathogenicity-related PHI homologs were up-regulated during infection (Figure 2A).Most of the annotated genes were homologs from Fusarium graminearum, Magnaporthe oryzae, and Aspergillus fumigatus (Supplementary Table S6).
In the M. paspali genome, 481 CAZymes coding genes were predicted suggesting that M. paspali may use a variety of CAZymes to break through the first barrier of host cell wall during infection (Supplementary Table S7).The glycoside hydrolases (GHs) accounted for the largest number (247), followed by 102 auxiliary activities (AAs), 85 glycosyl transferases (GTs), 37 carbohydrate esterases (CEs), 13 carbohydrate-binding modules (CBMs), and 9 polysaccharide lyases (PLs).There were 12 genes with two CAZymes annotations (1 AA/CBM and 11 GH/CBM).In total, one hundred ninety-eight CAZymes in M. paspali were up-regulated in at least one time point after infection (Figure 2B).A total of 678 secreted proteins were predicted in M. paspali, of which 105 were putative effectors.Among these putative effectors, fifty-six were hypothetical proteins of unknown function, and thirty-one were not annotated in the NR database.Thirty-two putative effectors shared homologies with the functional domains, and ten were related to reduced virulence, two were related to increased virulence, one was related to loss of pathogenicity, and four were clearly annotated as effectors in other pathogenic fungi (Supplementary Table S8).Transcriptome analysis revealed that forty of the putative effectors were significantly up-regulated during M. paspali infection (Figure 2C), and the expression patterns of up-regulated genes encoding effectors were clustered into three groups, where 13, 9, and 18 putative effectors showed significantly dynamic changes in the early (6-24 h), middle (36 h), and late (48 h) stage of the infection, respectively (Figure 2D).Six putative effectors (g1621, g1879, g2155, g3289, g421, and g9387) were up-regulated at all infection stages.
In order to evaluate the production potential of secondary metabolites of M. paspali, SMGCs were further predicted, and a total of 50 SMGCs were identified in M. paspali (Supplementary Table S9).Three unique SMGCs (SMGC 21, SMGC 6 and SMGC 39) in M. paspali were identified to be involved in the synthesis of known compounds, as compared with M. bolleyi and M. trichocladiopsis (Figure 2E).The similarity between SMGC 21 and the tenuazonic acid (TeA) biosynthetic gene cluster from Pyricularia oryzae reached 75% (Figure 2F), which is a mycotoxin widely produced in various plant pathogenic fungi and causes significant damage to a variety of crops (Lebrun et al., 1988).Transcriptome analysis revealed that twenty-one SMGCs were up-regulated after infection (Figure 2G) and SMGC 21 was specifically up-regulated at the early stage of infection (6-24 h).

Identification of species-specific genes in Microdochium paspali
To further investigate the pathogenic factors in M. paspali, SSGs of M. paspali were identified and analyzed through comparison with two non-pathogenic fungal species M. bolleyi and M. trichocladiopsis.At total of 438, 1818 and 2,803 unique orthologous groups were identified in M. paspali, M. bolleyi and M. trichocladiopsis, respectively, which were termed as SSGs (Figure 3A).SSGs were further screened through the protein-coding genes in the M. paspali genome with the identity or coverage of <50% to the genes in the genomes of M. bolleyi and M. trichocladiopsis through BLAST, and 672 SSGs were finally identified in M. paspali (Supplementary Table S10).These SSGs were widely distributed across each contig in the M. paspali genome, with the most on contig 6 (Figure 3B; Supplementary Table S10).
Only 285 SSGs were annotated in the NR database and most (218 genes) had unknown functions.GO analysis showed that the functions of SSGs were classified into sixteen terms, of which twelve SSGs were significantly enriched in term of pathogenesis (Figure 3C).Eighty-six PHI homologous genes (forty-two related to pathogenicity), nine CAZymes, seventeen candidate effectors, and eighty secondary metabolite (SM) genes were identified from the SSGs in M. paspali (Supplementary Tables S10, S11).SSGs arranged in clusters on the genome (n ≥ 3) were defined as species-specific gene clusters (SSGCs), and there were 24 SSGCs identified in M. paspali genome (Supplementary Table S10).
The expression of most SSGs was significantly altered after infection, with two hundred and thirteen SSGs being up-regulated at least at one stage after infection (Figure 3D).Among them, there were twelve PHI pathogenicity-related genes, four CAZymes of GH family, four candidate effectors and forty SM genes (including g3022, the biosynthesis core gene of SMGC21 responsible for TeA production) (Figures 4A-D; Supplementary Table S11).Four SSGCs were specifically up-regulated (SSGC 7,SSGC 15,SSGC 17,and SSGC 23) during the infection stages (Figure 4E).Interestingly, the four SSGCs were fully or partially located on the respective SMGCs (Figure 4F), where SSGC 7 and SSGC 23 were all located on SMGC 22 and SMGC 48, while SSGC 15 and SSGC 17 were partially located on SMGC 38 and SMGC 44.

Comparative analyses of Microdochium paspali at the genomic level
The evolutionary relationship between M. paspali and other fungi was compared with the genomes of two non-pathogenic Microdochium species, M. bolleyi and M. trichocladiopsis, and 12 other representative pathogenic fungi.The phylogenetic tree was constructed according to sequence alignment of 3,024 singlecopy orthologous proteins, which suggested that M. paspali had a close evolutionary relationship with other Microdochium species (Figure 5A).M. paspali formed a clade with M. trichocladiopsis, while M. bolleyi was a self-contained clade, indicating a closer evolutionary relationship between M. paspali and M. trichocladiopsis.In addition, M. paspali was also relatively close to Truncatella angustata, a fungal pathogen that can cause root and leaf diseases of vascular plants (Kellner et al., 2022).
Collinearity analysis revealed 8,297 collinearity gene pairs in M. paspali and M. bolleyi, and 8,575 in M. paspali and M. trichocladiopsis, further supporting the close relationship of M. paspali and M. trichocladiopsis.Compared to M. bolleyi, M. paspali had two distinct inversions on contig 1, an insertion of large fragments on contig 7, while contig 11, 12, and 13 were not collinear with M. bolleyi.Compared to M. trichocladiopsis，M.paspali had significant inversions on contig 3, and fusion events were identified on contig12 and contig 13, which formed contig 9 in M. trichocladiopsis.Furthermore, M. paspali showed large translocations on contig 9, as compared to both species (Figure 5B).
The overview of M. paspali genome features revealed the enriched regions of PHI pathogenicity-related genes, CAZymes, secreted proteins (including putative effectors), SMGCs, and SSGs were mainly concentrated at the edges of contigs, and there were also more up-regulated genes during infection in these pathogenic factor-enriched regions (Figures 6C-I).Transposable elements were relatively dense in the SSGs-enriched regions, and various pathogenic factors were also more abundant (especially the SMGCs) in these regions (Figure 6J).In addition, three short contigs of M. paspali (contig11, contig12, and contig13) held more densely distributed SSGs and pathogenic factors than other contigs (Supplementary Table S12).There were 20 putative effectors distributed on contig13, far more than the number on other contigs (Supplementary Table S8).It is suggested that these short contigs may be related to the evolution of M. paspali pathogenicity.

Discussion
The genus of Microdochium includes some important plant pathogenic fungi which cause a variety of diseases on cereals and grasses leading to serious damage to these economically useful plants.M. paspali is a new pathogenic fungus causing sparse leaf patch on seashore paspalum in China (Zhang et al., 2017).However, there have been limited studies on the pathogen-host interaction process and pathogenic mechanisms of this fungus.In this study, a high-quality genome of M. paspali was produced containing 14 contigs and at total size of 37.32 Mb.The genome provides a valuable resource to characterize the pathogenic mechanisms of M. paspali and the evolutionary architectures of the genus Microdochium.
Comparative genomics was useful for identifying potential pathogenic factors.To investigate the potential pathogenic factors between pathogenic species M. paspali and non-pathogenic species  during infection, suggesting that a high number of SSGs responded to the process of M. paspali-seashore paspalum interaction, and might play important roles in the pathogenicity of M. paspali.It was worth noting that the three short contigs (contig11, 12 and 13) of M. paspali contained more dense SSGs and pathogenicity-related genes (up-regulated genes, CAZymes, secreted proteins, and effectors) than other contigs (Supplementary Table S12; Figure 6).Especially, there were 20 putative effectors in contig13, accounting for the largest proportion (19.04%) (Supplementary Table S8; Figure 6).The results indicate that these short contigs may be related to the pathogenicity of M. paspali.Further population genomics analysis is needed to characterize if high plasticity chromosome regions exist in the genome of M. paspali and what roles these regions play.At initial stages of host infection, plant pathogenic fungi secrete a large number of CAZymes to degrade plant cell walls and support the processes of pathogen attachment, invasion, colonization and nutrient absorption in host plant cells (Kubicek et al., 2014).In recent years, the family of CAZymes has become an important category of pathogenic factors garnering close attention insystemic researche (Garron and Henrissat, 2019).In this study, a total of 481 CAZymes-coding genes were predicted in the M. paspali genome, and 198 CAZymes showed increased expression during infection.GHs accounted for the largest number in CAZymes (120 genes) and all four up-regulated SSGs encoding CAZymes belonged to GHs (g4711, g2960, g7995, and g1742).The observed result suggestes GHs may be widely involved in the interaction between M. paspali and seashore paspalum.Existing studies indicate GHs decomposed the cell wall during the infection of plant pathogenic fungi, helping them to invade and colonize, and also play a toxic role by inducing the immune response of the host plants (Zhao et al., 2013;Rafiei et al., 2021).Effectors are important weapons during Comparative genomic analysis of Microdochium paspali.(A) Phylogenetic analysis and gene family clustering between M. paspali and 14 other fungi (12 plant pathogenic fungi and 2 non-pathogenic Microdochium species).The phylogenetic analysis was constructed based on single-copy orthologous proteins.(B) The genome collinearity among M. paspali, M. bolleyi, and M. trichocladiopsis.The thick line segments with different colors represent the contigs of each species.Each gray line connects a pair of collinearity blocks between two genomes.The first 15 long contigs of each species were selected for mapping.
the interaction between pathogenic fungi and host plants (Horbach et al., 2011).In this study, we found that forty putative effectors were up-regulated in at least at one time point after infection, which might be important for pathogenicity.Four of them were identified as SSGs in M. paspali (g3013, g3253, g4446, and g6900).However, the four SSGs have not been functionally annotated and their function needs to be further analyzed.Among the up-regulated effectors, six (g1621, g1879, g2155, g3289, g421, and g9387) were consistently up-regulated at 6-48 h after infection, which might also be the targets of interest to understand the pathogenic mechanism of M. paspali.In the Pfam database, g2155 is annotated as phospholipase, which can aid pathogen invasion by disrupting components of the host cell membrane, and widely exists in a variety of pathogenic fungi and bacteria (Ghannoum, 2000).The effector g3289 is annotated as trypsin, and studies reveal that the expressions of trypsin-like peptidases in fungi may represent a marker for fungal pathogenicity (Dubovenko et al., 2010).The effector g9387 was annotated as fungal hydrophobin.It is reported to be an important virulence factor in fungi and is secreted during pathogenesis.Some hydrophobin deletion mutants in Fusarium graminearum showed significantly reduced pathogenicity (Quarantin et al., 2019).In summary, these putative CAZymes and effectors identified by genomic and transcriptomic analyses may play important roles in the Overview of the Microdochium paspali genome features.The circles labeled from the outside are as follows: (A) the outermost layer of colored blocks is a circular representation of the 13 contigs with coding sequences, with a scale mark labeling each 200 Kb.(B-J) density of all genes, genes upregulated during infection, pathogen-host interactions (PHI), carbohydrate active enzymes (CAZymes) genes, secreted protein genes, effector genes, secondary metabolite genes (SMGs), species-specific genes (SSGs), and transposable elements in each contig.All statistics are based on 50-kb nonoverlapping windows.
During the complex processes of interaction with host plants, pathogenic fungi can produce a variety of different secondary metabolites, which affect the symptom expression and disease progression of plant hosts, and are also important virulence factors (Keller et al., 2005;Howlett, 2006).Bioinformatics predictions identified 50 different types of SMGCs in M. paspali, and ten showed high identities with the clusters producing known compounds.When compared with M. bolleyi and M. trichocladiopsis, it was found that M. paspali held a unique biosynthetic gene cluster (SMGC21).It contained thirty-one genes, and its core biosynthetic gene g3022 was a SSG of M. paspali, which was predicted to be responsible for the production of tenuazonic acid (TeA).TeA is a type of tetramic acidic (2,4-pyrrolidinediones) toxin that is generally considered to be a potential virulence factor (Lebrun et al., 1988;Wang et al., 2022), which can induce a chloroplast-derived oxidative burst leading to cell death and plant necrosis and has potential herbicidal activity against both monocotyledonous and dicotyledonous plants (Chen and Qiang, 2017).TeA was first reported to be isolated from Alternaria tenuis (Rosett et al., 1957).The plant pathogen Phoma sorghina (a pathogen of sorghum) and Magnaporthe oryzae (a pathogen of rice) were also reported to produce TeA (Umetsu et al., 1972;Steyn and Rabie, 1976).The biosynthetic gene cluster of TeA was first identified in M. oryzae, which contained four genes (Yun et al., 2015).Through blast analysis, we found that g3020, g3021, and g3022 in SMGC21 were 73%, 50%, and 59% similar to MGG_07802, MGG_07800, and MGG_07803 in the TeA biosynthesis gene cluster of M. oryzae (MIBiG accession: BGC0002158), respectively.The core biosynthetic genes of the two gene clusters, MGG_07803 and g3022, were both predicted to be the fungal non-ribosomal peptide synthetase and polyketide synthase (NRPS-PKS) hybrid enzyme.Transcriptome analysis revealed that fourteen genes (including g3020, g3021, and g3022) in SMGC21 were up-regulated after infection.Therefore, it is hypothesized that M. paspali has the potential to produce TeA, which may play an important role in the pathogenicity of M. paspali.Interestingly, we also found that four SSGCs (SSGC7, 23, 15, and 17) significantly up-regulated during infection were all partially or fully distributed on different SMGCs (SMGC22,48,38,and 44), suggesting that these SSGCs might be involved in the biosynthesis of certain secondary metabolic genes (SMs) in M. paspali.Whether the biosynthetic SMs have toxic effects on plants and what types of SMs they are will be an important direction for our future research.

Conclusion
In this study, the whole genome of M. paspali was sequenced for the first time.The results provide a valuable reference for genome assemblies of other related pathogenic fungi in the genus of Microdochium.In addition, comparative genomics and transcriptomics analyses identified large numbers of pathogenicity-related genes in M. paspali.These provide an important baseline for further characterization of pathogenic and evolutionary mechanisms and may assist in developing effective management strategies to control M. paspali on turfgrass.

FIGURE 1
FIGURE 1 Symptoms of sparse leaf patch on seashore paspalum caused by Microdochium paspali.(A) The irregular sparse patches on the diseased seashore paspalum fairway.(B) Infected leaves died quickly, with white mycelium attached.

FIGURE 2
FIGURE 2 Identification and analysis of pathogenicity-related genes in Microdochium paspali.(A-C) Heat map of the normalized fragments per kilobase of transcript per million fragments mapped (FPKM) values of up-regulated pathogenicity-related pathogen-host interactions (PHI) genes, carbohydrate active enzymes (CAZymes) genes, and effector genes at different infection stages.The expression level was indicated by different colors, red indicates a high expression and blue indicates a low expression.(D) Expression patterns of up-regulated expressed effector genes during the infection process.(E) The presence of predicted secondary metabolite biosynthetic gene clusters (SMGCs) in the three Microdochium species compared to known gene clusters and the distribution of the SMGC core biosynthetic genes family in each species.NRPS: nonribosomal peptide synthetase, NRPS-like: NRPSlike fragment, PKS: poly keytide synthase, HYBRID: multiple core biosynthetic genes.(F) Comparison of the similarity between M. paspali SMGC21 and the tenuazonic acid biosynthetic gene cluster from Pyricularia oryzae 70-15.0(G) Heat map of normalized FPKM values of SMGC core biosynthetic genes at different infection stages.

FIGURE 3
FIGURE 3 Identification and analysis of species-specific genes (SSGs) in Microdochium paspali.(A) Venn diagram for gene families clustering of the three Microdochium species.(B) Distribution of SSGs on contigs.The red color represents the position of SSGs in M. paspali compared with M. bolleyi and M. trichocladiopsis.(C) Gene Ontology (GO) enrichment analysis of SSGs in M. paspali.(D) Heat map of the normalized fragments per kilobase of transcript per million fragments mapped (FPKM) values of SSGs at different infection stages.The expression level was indicated by different colors, red indicates a high expression and blue indicates a low expression.

FIGURE 4
FIGURE 4 Analysis of pathogenicity-related species-specific genes (SSGs) in Microdochium paspali.(A-D) Heat map of the normalized fragments per kilobase of transcript per million fragments mapped (FPKM) values of up-regulated pathogenicity-related pathogen-host interactions (PHI) genes, carbohydrate active enzymes (CAZymes) genes, effector genes, and secondary metabolite genes in the SSGs of M. paspali at different infection stages.The expression level was indicated by different colors, red indicates a high expression and blue indicates a low expression.(E) Heat map of the four species-specific gene clusters (SSGCs) that were significantly up-regulated during the infection process.(F) Localization of the four SSGCs on secondary metabolite biosynthetic gene clusters (SMGCs).

TABLE 1
Features of the Microdochium paspali genome assembly and annotations.