AUTHOR=Wu Jing Qin, Dong Chongmei, Song Long, Park Robert F.
TITLE=Long-Read–Based de novo Genome Assembly and Comparative Genomics of the Wheat Leaf Rust Pathogen Puccinia triticina Identifies Candidates for Three Avirulence Genes
JOURNAL=Frontiers in Genetics
VOLUME=11
YEAR=2020
URL=https://www.frontiersin.org/articles/10.3389/fgene.2020.00521
DOI=10.3389/fgene.2020.00521
ISSN=1664-8021
ABSTRACT=Leaf rust, caused by Puccinia triticina (Pt), is one of the most devastating diseases of wheat, affecting production in nearly all wheat-growing regions worldwide. Despite its economic importance, genomic resources for Pt are very limited. In the present study, we used long-read sequencing (LRS) and the FALCON and FALCON-Unzip pipeline (v4.1.0) to carry out the first LRS-based de novo genome assembly for Pt. Using 22.4 Gb of data with an average read length of 11.6 kb and average coverage of 150-fold, we generated a genome assembly for Pt104 [strain 104-2,3,(6),(7),11; isolate S423], considered to be the founding isolate of a clonal lineage of Pt in Australia. The Pt104 genome contains 162 contigs with a total length of 140.5 Mb and an N50 of 2 Mb, with the associated haplotigs providing haplotype information for 91% of the genome. This is the highest-quality Pt genome assembly to date, reducing the contig number 91-fold and improving the N50 4-fold compared to the previous Pt race 1 assembly. An annotation pipeline that combined multiple lines of evidence, including transcriptome assemblies derived from RNA-Seq, previously identified expressed sequence tags, and Pt race 1 protein sequences, predicted 29,043 genes for the Pt104 genome. Based on the presence of a signal peptide, the absence of a transmembrane segment, and no predicted targeting to mitochondria, 2,178 genes were identified as encoding secreted proteins (SPs).
Whole-genome sequencing (Illumina paired-end) was performed for Pt104 and six additional strains with differing virulence profiles on the wheat leaf rust resistance genes Lr26, Lr2a, and Lr3ka. To identify candidates for the corresponding avirulence genes AvrLr26, AvrLr2a, and AvrLr3ka, genetic variation within each strain was first identified by mapping reads to the Pt104 genome. Variants within predicted SP genes were then correlated with the virulence profiles of the strains, identifying 38, 31, and 37 candidates for AvrLr26, AvrLr2a, and AvrLr3ka, respectively. These candidate genes lay a foundation for future studies aimed at isolating the avirulence genes, investigating the molecular mechanisms underlying host–pathogen interactions, and developing new diagnostic tools for pathogen monitoring.
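
The correlation step described above can be illustrated with a minimal sketch. The abstract does not state the exact criterion used, so the rule below is an assumption: a secreted-protein gene is kept as a candidate when no strain avirulent on a given resistance gene shares an allele with any strain virulent on it. The function name, gene identifiers, strain labels, and allele codes are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Set


def candidate_genes(
    alleles: Dict[str, Dict[str, str]],
    avirulent: Set[str],
    virulent: Set[str],
) -> List[str]:
    """Return genes whose alleles separate avirulent from virulent strains.

    `alleles` maps gene -> strain -> allele label (hypothetical encoding).
    Assumed rule, not the authors' stated method: a gene is a candidate
    when the alleles seen in avirulent strains and those seen in virulent
    strains do not overlap.
    """
    if not avirulent or not virulent:
        raise ValueError("both strain groups must be non-empty")
    candidates = []
    for gene, per_strain in alleles.items():
        avr_alleles = {per_strain[s] for s in avirulent}
        vir_alleles = {per_strain[s] for s in virulent}
        if avr_alleles.isdisjoint(vir_alleles):
            candidates.append(gene)
    return sorted(candidates)


# Hypothetical toy data: three genes genotyped in four strains.
example_alleles = {
    "gene_A": {"S1": "ref", "S2": "ref", "S3": "alt", "S4": "alt"},
    "gene_B": {"S1": "ref", "S2": "alt", "S3": "alt", "S4": "ref"},
    "gene_C": {"S1": "ref", "S2": "ref", "S3": "ref", "S4": "ref"},
}
example_result = candidate_genes(
    example_alleles, avirulent={"S1", "S2"}, virulent={"S3", "S4"}
)
print(example_result)
```

In this toy input only gene_A carries alleles that split cleanly along the virulence phenotype. A real analysis would start from called variants per strain and would also need to handle heterozygous calls in a dikaryotic genome, which this sketch ignores.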