Identification of candidate genes for LepR1 resistance against Leptosphaeria maculans in Brassica napus

Utilising resistance (R) genes, such as LepR1, against Leptosphaeria maculans, the causal agent of blackleg in canola (Brassica napus), could help manage the disease in the field and increase crop yield. Here we present a genome wide association study (GWAS) in B. napus to identify LepR1 candidate genes. Disease phenotyping of 104 B. napus genotypes revealed 30 resistant and 74 susceptible lines. Whole genome re-sequencing of these cultivars yielded over 3 million high quality single nucleotide polymorphisms (SNPs). GWAS in mixed linear model (MLM) revealed a total of 2,166 significant SNPs associated with LepR1 resistance. Of these SNPs, 2108 (97%) were found on chromosome A02 of B. napus cv. Darmor bzh v9 with a delineated LepR1_mlm1 QTL at 15.11-26.08 Mb. In LepR1_mlm1, there are 30 resistance gene analogs (RGAs) (13 nucleotide-binding site-leucine rich repeats (NLRs), 12 receptor-like kinases (RLKs), and 5 transmembrane-coiled-coil (TM-CCs)). Sequence analysis of alleles in resistant and susceptible lines was undertaken to identify candidate genes. This research provides insights into blackleg resistance in B. napus and assists identification of the functional LepR1 blackleg resistance gene.


Introduction
Brassica napus L. (canola, oilseed rape) is one of the most important crops for edible oil, food and industrial purposes (Sȩnsöz et al., 2000;Abbadi and Leckband, 2011). The breeding of canola cultivars has focused on oil quality, seed yield and resistance to pests and diseases Salisbury et al., 2016). One of the major diseases affecting canola production is blackleg, caused by the fungal pathogen Leptosphaeria maculans. Blackleg can reduce seed yield by at least 50% (Gugel and Petrie, 1992;West et al., 2001; Van de Wouw et al., 2010;Potter et al., 2016;Salisbury et al., 2016) and also affect oil quality (Fikere et al., 2020a). Blackleg disease is difficult to manage, even with fungicide application and recently there has been identification of fungicide tolerant L. maculans isolates (Kutcher et al., 2010;Van de Wouw et al., 2017;Van de Wouw et al., 2021). Thus, utilising resistant cultivars is recognised as an effective and environmentally safe approach in managing L. maculans and minimising yield loss.
There are two types of resistance against L. maculans in B. napus; qualitative and quantitative resistance (Delourme et al., 2006). Qualitative resistance involves an interaction between a resistance (R) gene and an avirulence (Avr) gene (Flor, 1971). These R genes, also termed major genes, are highly heritable and manifest effectively from the cotyledon stage through to the adult stage . On the other hand, quantitative resistance, considered to be governed by multiple minor genes, provides partial resistance to individual isolates and is poorly understood both phenotypically and genotypically (Huang et al., 2009). Both types of blackleg resistance are important in canola and many commercial cultivars have both qualitative and quantitative resistance.
To determine the position of an R gene in the genome, genetic mapping of genotypic variants is frequently undertaken. Two of the most common approaches for genetic mapping are association mapping, or genome wide association studies (GWAS), and biparental linkage mapping. GWAS is an alternative approach to traditional mapping of biparental populations and has been widely used in B. napus (Raman et al., 2014;Zhang et al., 2015;Gacek et al., 2016;Raman et al., 2016;Wu et al., 2016;Fu et al., 2020;Hu et al., 2022). The detection of quantitative trait loci (QTL) through GWAS depends on the level of linkage disequilibrium (LD) between functional loci and markers, which most GWAS models can statistically and easily determine. Generalized linear models (GLM) and mixed linear models (MLM) are the most common GWAS models used in plant studies, as these models are flexible and can detect multiple genotypic variants. Of the two models, MLM simultaneously incorporates kinship and principal component analyses (PCA) to address cryptic relatedness in tested materials (Yu et al., 2006). Currently 16 qualitative blackleg R genes have been mapped in Brassica species, however only LepR3, Rlm2, Rlm4, Rlm7 and Rlm9 have been cloned to date (Larkan et al., 2013;Larkan et al., 2015;Larkan et al., 2020;Haddadi et al., 2022). LepR3 and Rlm2 are alleles of the same gene, a resistance gene analog (RGA) encoding a receptorlike protein (RLP), while Rlm4, Rlm7 and Rlm9 are also allelic variants of an RGA encoding wall-associated kinases (WAKs) (Larkan et al., 2020;Haddadi et al., 2022). Other blackleg RGAs have also been cloned in Arabidopsis thaliana; RLM1a, RLM1b and RLM3 which encode nucleotide-binding site (NBS)-leucine rich repeats (LRR) (Staal et al., 2006;Staal et al., 2008). One of the blackleg R genes yet to be cloned is LepR1, which was originally introgressed from Brassica rapa L. subsp. sylvestris into B. napus and has been mapped on chromosome A02 of B. napus (Yu et al., 2005;Larkan et al., 2016;Dolatabadian et al., 2019;Fikere et al., 2020b). This study used whole genome re-sequencing (WGRS) data from 104 Australian B. napus genotypes, screened with blackleg isolates containing AvrLep1 to identify a precise location of the QTL harbouring LepR1 and to identify candidate genes within this region. Lastly, sequence analysis was used to identify candidate LepR1 genes in B. napus.

Materials and methods
Blackleg isolates and disease phenotyping A set of 14 well-characterised differential L. maculans isolates were used to genotype B. napus genotypes for the presence or absence of resistance gene LepR1. All isolates and their genotypes (Table S1) were maintained on 10% V8 media. Pycnidiospores (10 -5 ) of each L. maculans isolate were dropped onto cotyledons of wounded ten-dayold seedlings grown in the glasshouse at 22°C. Disease symptoms were scored at 14 days post-inoculation on a 0-9 scale, as previously described . For each isolate x cultivar interaction, eight replicate plants were inoculated, each with four wounds per plant. Average pathogenicity scores of <5.0 were considered avirulent, whilst average scores of >5.0 were considered virulent.
DNA extraction and whole genome re-sequencing A total of 104 Australian B. napus genotypes (89 commercial cultivars and 15 breeding lines) were used in this study (Table S2). The genotypes were grown in a glasshouse at 18°C during the day and 13°C at night in a 12-hour photoperiod cycle for two weeks. Approximately 100 mg of young, healthy leaf material was collected, snap-frozen in liquid nitrogen and finely ground using a Geno-Grinder 2010 (SPEX SamplePrep, Mettuchen, Germany). DNA was extracted using the DNeasy Plant Kit (Qiagen © , Hilden, Germany) according to the manufacturer's instructions. Total DNA was quantified using a Qubit 3.0 Fluorometer and the Qubit dsDNA BR Assay kit (Invitrogen, Carlsbad, United States), while DNA quality was assessed using the LabChip GX Touch 24 (PerkinElmer, Waltham, United States). WGRS of the cultivars was carried out using an Illumina NovaSeq ™ 6000 Sequencing System by GeneWiz (Brookes Life Sciences, Guangzhou, China), which generated approximately~6 Gb of 150 bp paired-end sequencing data per cultivar.

SNP calling and filtering
From the WGRS data, raw FASTQ files were trimmed of adapter sequences and low quality bases using Trimmomatic v0.39 (Bolger et al., 2014) with a maximum mismatch score of 2, a palindrome clip score threshold of 30 and a clip score threshold of 10. Low quality bases with a Phred+ 33 score of < 3 were trimmed from both the start and end of each read. Sliding window trimming was carried out using a 4-base wide window to remove bases with an average quality per base of <15. All the reads with fewer than 36 bases, as well as unpaired reads paired after trimming, were discarded. To verify adapter and read quality trimming, untrimmed and trimmed reads were analysed using FastQC (Andrews, 2010) followed by MultiQC (Ewels et al., 2016), which were compared afterwards.
Merged variants were further filtered using VCFtools v0.1.16 (Danecek et al., 2011). Indels and multiallelic SNPs were omitted (remove-indels -max-alleles 2 -min-alleles 2). Individuals with > 0.9 missing genotypes were removed before the filtering of SNPs. Genotypes with a depth of < 5 (-minDP 5) were set to missing to minimise the rate of heterozygous alleles incorrectly called as homozygous alleles due to insufficient read depth. SNPs displaying a minor allele frequency (MAF) of < 0.05 (-MAF 0.05) or when genotypes were not present in > 80% of all individuals (-max-missing 0.8) were also discarded. The filtered SNP data was then used as genotype data for GWAS.

Association analysis
The GWAS results, along with quantile-quantile (QQ) plot in mixed linear model (MLM) with PCA (3 principal components) + Kinship matrix for family relatedness estimates (K) (Yu et al., 2006), were computed using the GAPIT3 R package (Wang and Zhang, 2021). An additional GWAS model; Fixed and random model Circulating Probability Unification (FarmCPU) , computed through GAPIT3 R package was used to determine consistent significant SNPs across two models. From the GWAS result, CMplot R package (Yin et al., 2021) was used to visualise circular Manhattan and SNP density plots. MAF and adjusted P-value following a false discovery rate (FDR)-controlling procedure (Benjamini and Hochberg, 1995), additional criteria to select SNPs associated to the trait, were also identified using GAPIT3. GAPIT was also used to show the heterozygosity information and LD, represented by R 2 which measures two biallelic markers (Hill and Robertson, 1968). The phenotypic variation explained by a SNP marker (PVE) was taken using the following formula: PVE = [(R2 with SNP model value-R2 without SNP model value) x 100] (Sun et al., 2010). Figure plotting of the SNP statistics was also done in GAPIT3 while the other SNP statistics were computed in TASSEL 5.0 (Bradbury et al., 2007).

Current and previous QTL dissection
Putative QTL were derived from the significant SNPs identified through MLM GWAS. A QTL was defined either as a 100 kb upstream and downstream region of each significant SNP (Anderson et al., 2020) or a region with two significant SNPs located ≤ 15 Mb from each other on the same chromosome (Martıńez-Montes et al., 2018), also integrating a 100 kb region upstream and downstream from the SNP. For QTL comparison, the previous QTL (chromosome A02) based on LepR1 introgressed lines and structural variants, derived from the results of L. maculans containing AvrLep1 screened at the cotyledon stage (Larkan et al., 2016;Dolatabadian et al., 2019), and based on adult plant survival rate and average internal infection of the stem at maturity stage (Fikere et al., 2020b) (considered as the QTL outlier) were physically mapped to B. napus cv Darmor bzh v9 using MapChart 2.32 for comparison (Voorrips, 2002). Then, the number of genes and RGAs in the derived QTL intervals were identified for comparison.

Molecular analysis
The LepR1 candidates which had at least 2 SNPs in CDS regions were chosen as the most likely candidates. We also included two flanking RGAs (for comparison) nearest to the LepR1 QTL identified in this study. The marker development was done in two stages. First, markers flanking each candidate were designed using the Primer3 function in Geneious R10 (Kearse et al., 2012) and the markers were tested for allele detection of the resistant and susceptible materials. Then, the amplicons showing segregation between resistant and susceptible lines were purified for sequencing. MiSeq sequencing was done using an Illumina MiSeq platform at the Australian Genome Research Facility (AGRF), Perth, Australia. MiSeq reads were quality-trimmed using Trimmomatic (Bolger et al., 2014) and assembled using Spades version 3.6.0 (Bankevich et al., 2012). The markers showing sequence variation between resistant and susceptible materials were used to design a resistant-specific marker.

Phenotyping
Out of 104 B. napus cultivars, 30 showed a resistance response to L. maculans isolates containing AvrLep1, indicating the presence of LepR1, while 74 showed a susceptible response, indicating they do not harbour LepR1 (Table S2).

WGRS SNP analysis
The WGRS data of 104 B. napus genotypes produced a total of 3,235,008 high quality SNPs that were used in the GWAS ( Figure S1).
BnaA02g33310D3 and BnaA02g25030D3 had a sequence difference between the resistant and susceptible alleles, while BnaA02g33850D3 and BnaA02g34440D3 had similar sequences TABLE 1 Delineation of the LepR1_mlm1 quantitative trait loci (QTL) including number of genes and the resistance gene analogs (RGAs) content using the mixed linear model (MLM) genome-wide association study (GWAS) and its significant single nucleotide polymorphism (SNP) associated with LepR1 blackleg resistance in Brassica napus.

QTL name LepR1_mlm1
Chromosome ( throughout the tested materials. SNP differences among the alleles of the resistant cultivars was also observed in BnaA02g25030D3. In all, BnaA02g33310D3 had 92 SNP differences between the resistant and susceptible B. napus cultivars ( Figure S5). There were also two large deletions in positions 697 to 706 bp and 2836 to 2854 bp observed in the resistant allele, while there was one large deletion in position 2881 and 2894 bp in the susceptible allele ( Figure S5). For the resistantspecific marker developed based on the sequencing result (Table S10), 22 out of 25 LepR1 resistant cultivars had a band while no band was detected in the 31 susceptible cultivars ( Figure S6).

Discussion
Leptosphaeria maculans isolates containing specific Avr gene profiles are widely used either to verify known blackleg R genes or discover novel blackleg R genes in B. napus (Balesdent et al., 2005;Van de Wouw et al., 2009;Marcroft et al., 2012;Jiquel et al., 2021). However, a mild or inconsistent hypersensitive response in B. napus cultivars can be observed due to different genomic backgrounds (Rouxel et al., 2003;Balesdent et al., 2005;Larkan et al., 2016). Nevertheless, LepR1 is a single dominant gene and its interaction to  AvrLep1 interaction in B. napus follows a gene-for-gene relationship (Yu et al., 2002;Yu et al., 2005). Thus, the phenotype data from L. maculans isolates containing AvrLep1 confirmed the presence of the R gene in the resistant cultivars and the absence of the R gene in the susceptible cultivars in this study. The use of WGRS in crop studies is advantageous because it can identify a large number of SNPs that are densely dispersed throughout the entire genome, therefore improving the ability for GWAS to identify functional QTL. The genotype data used here comprised over 3 million SNP markers, which has a better coverage and was more dense compared to low coverage markers such as RFLPs, SSRs, the Brassica Infinium 60K SNP array, and imputed SNPs, that were used to detect previous LepR1 QTL in B. napus (Yu et al., 2005;Yu et al., 2012;Larkan et al., 2016;Fikere et al., 2020b). In previous studies which had defined positions of the LepR1 QTL, they used previous versions (with different quality and assembly size) of Darmor bzh genomes (v4.1 or v8.1). For instance, Darmor bzh v9 has a total size of 1,043.4 Mb compared to the 850.3 Mb of Darmor bzh v4.1 .
GWAS is commonly used to reveal genetic mechanisms underlying disease resistance as it integrates historical recombination of the different cultivars and thus, uncovers significant associations between genotype and phenotype. Among the GWAS models, MLM is widely regarded as the most common model in analysing marker trait associations in crop research. Kinship and PCA results were integrated into the MLM to overcome any bias from genetic ancestry (relatedness) (Price et al., 2006;Yu et al., 2006;VanRaden, 2008).
LepR1_meta was based on adult plant survival rate and average internal infection of the stem at maturity stage paired with imputed SNPs, a methodology that would identify quantitative instead of qualitative blackleg resistance in B. napus (Raman et al., 2018;Fikere et al., 2020b). This is likely the reason why LepR1_meta was located 6.89 Mb away from LepR1_mlm1 of this study. Our LepR1_mlm1 corresponds to qualitative blackleg resistance. Similarly, LepR1_TLepR1 was a result from L. maculans containing AvrLep1 screened at the cotyledon stage of the LepR1 introgressed lines (Larkan et al., 2016). LepR1_TLepR1 was also used as the basis to define structural variants of LepR1 in LepR1_SV (Dolatabadian et al., 2019). Taken together, our QTL was consistent with previous LepR1 QTL based on gene-for-gene screening which were all mapped to A02 (Larkan et al., 2016;Dolatabadian et al., 2019;Fikere et al., 2020b).
The few significant SNP associations on C01 and C02 (having RGAs in the QTL) could possibly be due the incorrect mapping of DNA sequences containing notable SNPs that occurred during the genome assembly process in B. napus as SNPs were commonly observed in a genome due to their abundance. Another reason is due to homology between chromosomes, as C02 is homeologous to A02 in B. napus (Parkin et al., 2005). Another reason could be that some genes might have a role (for example as a helper or decoy gene) in the gene-for-gene interaction between LepR1 and AvrLep1. RNA sequencing in previous studies has revealed several significantly activated genes in the LepR1-AvrLep1 interaction on chromosomes C01 and C02 (Becker et al., 2017;Haddadi et al., 2019), indicating the possible roles of other genes in the LepR1-AvrLep1 interaction in B. napus. Recent findings in the B. napus-L. maculans pathosystem indicate that the simple gene-for-gene interaction may indeed be more complicated than once thought (Cantila et al., 2021). Functional gene studies revealed that B. napus mitogen-activated protein (MAP) kinase 9 (BnMPK9), BnaCnng11720D or C05p48150.1_BnaDar on C05 (Rousseau-Gueutin et al., 2020), is an essential partner to the gene-for-gene mediated resistance of blackleg R gene Rlm1 on A07 and pathogen effector AvrLm1 (Ma et al., 2018). Furthermore, Suppresor of BAK1-interacting receptor like kinase 1 (SOBIR1) genes on A03 and C03 (BnaA03g14760D and BnaC03g17800D) are also required for the blackleg resistance of the alleles LepR3 and Rlm2 (BnaA10g20720D) of A10 to effectors AvrLm1 and AvrLm2, respectively (Larkan et al., 2015;Ma et al., 2018).
RGAs with CDS-SNPs are promising candidates for the LepR1 gene. The CDS-SNPs of these RGAs could alter amino acids (aa) of the predicted proteins explaining the difference between susceptible and resistant phenotypes. Among the SNP types, a nonsense SNP leads to a premature stoppage of the amino acids, potentially changing the gene expression among cultivars with different genotypes. Although this study did not examine gene expression between cultivars with different LepR1 phenotypes, previous studies have shown that RLKs and NLRs were significantly expressed in LepR1 resistant cultivars (Becker et al., 2017;Ferdous et al., 2020).
Among the candidates, only BnaA02g33310D3 showed interesting SNP variation, including large deletions, between LepR1 resistant and susceptible materials, indicating the gene may be considered further as a candidate gene. Large deleted sequences  (Fikere et al., 2020b), c (Larkan et al., 2016). All BLAST results obtained E-values of "0"and 100% sequence coverage.
were also found in functionally characterised (mutated) genes RPM1 against Pseudomonas syringae (Grant et al., 1998) and LepR3/Rlm2 against L. maculans (Larkan et al., 2013;Larkan et al., 2015) compared to its respective wild gene. Also, SNP variation has been the basis to develop specific markers with blackleg resistance ( Van de Wouw et al., 2022). This study's resistant-specific marker for BnaA02g33310D3 also signified the gene as a candidate in B. napus and will be useful for MAS. On the other hand, the other candidates which did not have successful band detection can still be considered candidates for the LepR1 but they require further analysis.

Conclusion
This study has narrowed down the LepR1 resistance in B. napus by comparing the current QTL to the previous LepR1 QTLs. While all RGAs in LepR1_mlm1 are promising candidates, those co-localised within the intervals in previous LepR1 QTLs aligned in Darmor bzh v9 and having CDS-SNPs, especially with nonsense SNPs, should be prioritised for validation. In addition, RGAs in the QTLs identified on chromosomes other than A02 in this study can be tested for the possible roles of accessory, helper or decoy genes in LepR1 gene-for-gene resistance in B. napus. Our identified RGAs especially BnaA02g33310D3 are a useful resource to identify the functional LepR1 gene in B. napus.
LepR1 can also be an significant source of blackleg resistance for the cultivars in Australia, which has virulent L. maculans population causing the breakdown of the cultivars containing R gene like LepR3 and ineffectiveness of R genes like Rlm1 and Rlm3 Van de Wouw et al., 2014;Van de Wouw et al., 2021). Other studies have also shown that cultivars with LepR1 are more effective than those with R genes Rlm1, Rlm3, Rlm4, and LepR3 (Alnajar et al., 2022;Rashid et al., 2022).

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI accession PRJNA885991.

Author contributions
AC and JB conceptualized the paper. AC wrote the original draft along with formal analyses. WT, PB, DE, and AV helped improve the paper by suggesting additional ideas and by thorough revision/ editing. AC and NS planted the B. napus materials including DNA extraction. AV provided the phenotype data. AS-E, NS and RA provided bioinformatics support by calling SNPs on the reference B. napus cv. Darmor-bzh v9 genome and statistical codes. AC designed the primers/markers and prepared the materials for sequencing analysis. JB supervised, reviewed, and suggested revisions to the paper. All authors contributed to the article and approved the submitted version.

Funding
This study is funded by the Australian Research Council projects DP200100762, and DP210100296 and the Grains Research and Development Corporation (UWA1905-006RTX).

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.