Genome-wide association study of Verticillium longisporum resistance in Brassica genotypes

Verticillium stripe, caused by Verticillium longisporum, presents an emerging threat to Canadian canola (Brassica napus). Initially detected in Manitoba in 2014, the presence of this pathogen has since been confirmed across western Canada. Infections by V. longisporum can result in yield losses of up to 50%, which is a cause for concern given the susceptibility of most commercial Canadian canola cultivars. The objective of this study was to screen a collection of 211 Brassica genotypes for their reactions to V. longisporum, and to use genome-wide association study (GWAS) to identify single nucleotide polymorphism (SNP) markers for resistance. The plant material consisted of 110 rutabaga (B. napus ssp. napobrassica), 35 canola, 40 Brassica rapa, and 15 Brassica oleracea accessions or cultivars, alongside 11 hosts of the European Clubroot Differential (ECD) set. These materials were screened for resistance under greenhouse conditions and were genotyped using a 19K Brassica SNP array. Three general linear models (GLM), four mixed linear models (MLM), and three GWAS methods were employed to evaluate the markers. Eleven non-commercial Brassica accessions and 9 out of 35 commercial canola cultivars displayed a low normalized area under the disease progress curve (AUDPCnorm.). The non-commercial accessions could prove valuable as potential sources of resistance against V. longisporum. Forty-five SNP markers were identified to be significantly associated with V. longisporum resistance using single-SNP based GWAS analysis. In comparison, haplotype-based GWAS analyses identified 10 to 25 haplotype blocks to be significantly associated with V. longisporum resistance. Between 20% and 56% of QTLs identified by the more conventional single-SNP based GWAS analysis were also detected by the haplotype-based GWAS analysis. The overlapping genomic regions identified by the two GWAS methods present promising hotspots for marker-assisted selection in the future development of Verticillium stripe-resistant canola.


Introduction
Verticillium stripe, caused by the fungal pathogen Verticillium longisporum (C.Stark) Karapapa, Bainbridge and Heale, is an important soilborne disease of canola (Brassica napus L.) in Canada.The first case of Verticillium stripe in this country was identified in a canola field in Manitoba in 2014 (Canadian Food Inspection Agency, 2018).Subsequently, V. longisporum has been detected in other Canadian provinces, including British Columbia, Alberta, Saskatchewan, Ontario, and Quebec (Canadian Food Inspection Agency, 2018).Yield losses due to V. longisporum infection were reported to range from approximately 10% to 50% on canola, although they could exceed 80% on a single plant (Dunker et al., 2008).Since the survival structures (microsclerotia) of V. longisporum can persist in the soil for up to 10 years (Schnathorst, 1981), strategies such as minimizing soil movement, implementing longer rotations out of host crops, and good weed management can potentially reduce V. longisporum inoculum levels (Johansson et al., 2006).However, these strategies may not be practical for growers due to economic concerns.Moreover, there are currently no registered fungicides available for controlling this disease (Dunker et al., 2008).Therefore, genetic resistance stands out as the most effective and environmentally friendly approach for managing Verticillium stripe.Unfortunately, no commercial canola varieties in Canada have been registered as resistant to V. longisporum (Norman, 2023).
However, to the best of our knowledge, no screening or resistance gene/QTL detection studies have been conducted in Canada for the identification of Brassica germplasm suitable for breeding V. longisporum resistance in commercial canola cultivars.Therefore, the objective of this study was to screen a large collection of rutabaga (B.napus ssp.napobrassica) accessions and commercial canola cultivars from Canada, as well as B. rapa and B. oleracea genotypes from China, for resistance to this fungus.Additionally, a genome-wide association study (GWAS) was utilized to identify ac cessions and genomic regions assoc iate d with V. longisporum resistance.

Resistance phenotyping
The single-spore isolate VL 43 of V. longisporum, collected from an infected canola plant sampled near Edmonton, Alberta (Cui et al., 2022), was cultured in Petri dishes (9-cm diameter) filled with potato dextrose agar (PDA).The multiplex PCR method described by Inderbitzin et al. (2013) was employed to identify isolate VL 43 as V. longisporum lineage A1/D1.Cultures were incubated in darkness at room temperature for 28 days before harvesting the conidia.Briefly, 10 mL of sterile distilled water was added to each Petri dish, and a sterile inoculating loop was used to gently dislodge the spores.The resulting conidial suspension was filtered through four layers of sterile cheesecloth to remove mycelial fragments.The spore concentration was then estimated using a haemocytometer (Hausser Scientific, Horsham, Pennsylvania, USA), and adjusted to 1 × 10 6 spores mL −1 with sterile distilled water.
Seven-day old seedlings of the 211 Brassica genotypes were inoculated using the root-dip method as described by Cui et al. (2022).Non-inoculated controls were dipped in sterile distilled water instead.The experimental setup consisted of 32 L plastic tubs filled with Sunshine Mix #4 growing mix (Sun Gro Horticulture, Vilna, Alberta, Canada).Each tub accommodated five seedlings of the same genotype, with four Brassica genotypes per tub, totaling 20 plants (5 plants × 4 genotypes) per tub and each genotype had 4 replicates.The plants were maintained in a greenhouse under an 18-h photoperiod (22°C day/16°C night).
Disease severity assessments were conducted weekly for each plant over a 4-week period.The assessment utilized a 1-9 rating scale as described by Eynck et al. (2009), where a rating of 1 = no symptoms, while 9 = the plant is dead.The experiment was arranged in a randomized completely block design with four replicates, and was independently repeated.

Statistical analysis of the disease data
The area under the disease progress curve (AUDPC) was calculated for each host genotype based on Verticillium stripe severity using the formula described by Campbell and Madden (1990) , where y i is the disease severity for each observation number i, t i is the number of days after inoculation at the time of observation number i, and n is the number of observations.Non-inoculated plants were also assessed on the same scale at the same times.A net AUDPC value (AUDPC net ) was calculated following Eynck et al. (2009): AUDPC net = AUDPC(X inoc. ) -AUDPC(X contr.), where X inoc. is the inoculated plants and X contr. is non-inoculated controls.
The AUDPC values were normalized for each genotype relative to the susceptible check 'Westar' and moderately resistant check 'Granaat' to account for fluctuating disease severity between trials.The normalized AUDPC (AUDPC norm. ) was calculated according to Eynck et al. (2009): The phenotype data was analyzed statistically using R: A Language and Environment for Statistical Computing (R Core Team, 2013).Brassica genotypes with significantly lower AUDPC norm.(P ≤ 0.05) compared to the moderately resistant cultivar 'Granaat' were considered resistant (Rygulla et al., 2007;Eynck et al., 2009).If 0.05 < P ≤ 0.1, the genotypes were regarded as moderately resistant.In addition, for those not assigned as resistant or moderately resistant, the susceptible check 'Westar' was also used for comparison.Brassica genotypes with P ≤ 0.05 were regarded as susceptible and P > 0.05 were considered as moderately susceptible.

SNP genotyping
SNP genotyping was performed on all 211 Brassica genotypes using a Brassica 19K SNP array from TraitGenetics GmbH (Gatersleben.Germany).This array included 9,966 SNP markers on the A-genome, 7,740 SNP markers on the C-genome, and 1,146 SNP markers on scaffolds.After filtering monomorphic, lowcoverage site markers, as well as markers with minor allele frequency (MAF) ≤ 0.05 and those missing data for >10%, 4,972 A-genome markers and 4,621 C-genome markers were retained for the GWAS.The GWAS was conducted separately on the 45 B. rapa (AA) and 149 B. napus (AACC) accessions using the A-genome markers, and on the 17 B. oleracea (CC) and 149 B. napus accessions using the C-genome markers.Additionally, the average inter-SNP marker distance was determined for each combination and each chromosome (Table 1).

Linkage disequilibrium estimation
Intra-chromosomal linkage disequilibrium (LD) between allelic values at two loci was estimated using Pearson's squared correlation coefficient (r 2 ) statistic with TASSEL 5.0 (Bradbury et al., 2007).To determine the significance of pairwise marker r 2 -values, P < 0.001 of the Chi-square (c2) statistic for each SNP pair was used according to Fredua-Agyeman et al. (2020).The LD decay curves were determined by calculating the Chi-square (c 2 ) statistic for each SNP pair in relation to physical map distance (in Mb) using R v. 4.3.2(R Core Team, 2013).The extent of LD was estimated based on the interaction of the fitted LD decay curve and r 2 -threshold lines for each chromosome (Breseghello and Sorrells, 2006;Bellucci et al., 2015).

Population structure analysis
To determine the population structure (Q) of the Brassica accessions used in this study, a Bayesian clustering approach was employed.Burn-in periods ranged from 5,000 to 100,000 iterations, and Markov Chain Monte Carlo (MCMC) analyses ranged from 5,000 to 100,000 permutations through the population-genetic software STRUCTURE v. 2.3.4 (Pritchard et al., 2000).The Brassica rapa + B. napus genotypes and B. oleracea + B. napus genotypes were analyzed separately to determine the number of genetically homogeneous clusters (K) based on 4,972 and 4,621 SNP markers, respectively.Runs for each K=1-10 were replicated 10 times.The number of clusters and average log-likelihood plots were determined according to Evanno et al. (2005) through STRUCTURE HARVESTER (Earl and vonHoldt, 2012).
The genetic diversity of B. rapa + B. napus genotypes and B. oleracea + B. napus genotypes was determined separately.This analysis was based on 4,972 A-genome markers and 4,621 Cgenome markers.The unweighted pair group method with arithmetic mean (UPGMA) and the neighbor joining (NJ) method implemented in TASSEL v. 5.0 were used to generate phylogenetic trees.

SNP-based genome-wide association analyses
Three general linear models (GLM) and four mixed linear models (MLM) implemented in TASSEL v. 5.0 (Bradbury et al., 2007) were tested for the SNP-based marker-trait association studies.The GLM tested consisted of the population structure (Q)-only, Kinship (K)-only, and Principal Coordinate Analysis (PCA)-only models.The MLM models comprised Q+K, PCA+K, Q+PCA and PCA+D (Distance matrices) (Fredua-Agyeman et al., 2020).Furthermore, three additional GWAS methods were employed using the GAPIT v. 3 (Wang and Zhang, 2021) package in R.These included the Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK) (Huang et al., 2019), the Fixed and random model Circulating Probability Unification (FarmCPU) (Liu et al., 2016), and the Multiple Locus Mixed Linear Model (MLMM) (Segura et al., 2012).
The SNP-based GWAS was conducted for the B. rapa + B. napus accessions using the 4,972 A-genome SNP marker data.Two independent AUDPC measurements and the average of two sets of AUDPC phenotype data were utilized.This analysis was performed using the seven models and three methods noted above.Similarly, the 4,621 C-genome SNP marker data and two independent measurements of AUDPC, along with the average of two sets of AUDPC data for each genotype, were used for the GWAS of the B. oleracea + B. napus genotypes.For each model/method and genotype combination, Quantile-Quantile (Q-Q) plots were examined to identify which plot showed the least amount of deviation from the expected -log 10 P-value.Significant markers associated with Verticillium stripe resistance were identified by examining the best-fitted Q-Q plots and Manhattan plots.These plots were generated using the CMplot package in R. To establish the significance cut-off (-log 10 (0.05/n), n = number of markers), the Bonferroni correction was applied (Benjamini and Hochberg, 1995).A slightly lower threshold of -log 10 P = 3.0 was employed for association.Significant SNP markers associated with Verticillium stripe resistance were identified using the various models and methods.

Construction of haplotype blocks
Haplotype association tests were performed using three different algorithms (Confidence Interval [CI], Four Gamete Rule [FGR], and Solid Spine of LD [SS]) implemented in the software Haploview v 4.2 (Barrett et al., 2005).Haplotype blocks were constructed for each chromosome separately to identify SNPs in the same haploblock and to investigate the combined effect of the *The number and percentages of SNP pairs in significant LD were determined from Chi-squared tests at p-value < 0.001.f The extent of LD decay was estimated from the projection of the intersection between the fitted curve of the data points and the 95th percentile of an unlinked r 2 threshold line (background LD) onto the physical distance axis.linked significant SNPs.Each analysis was carried out using the default settings and the standardized disequilibrium coefficient (D') was used to determine the LD between SNP markers and to generate halpoview LD plots.Blocks were formed if ≥ 95% of the comparisons exhibited strong LD.The haplotype blocks were transformed into multiallelic markers and used for haplotype-trait association analyses (Abed and Belzile, 2019;Bajgain and Anderson, 2021).

Candidate gene prediction
The probe sequences of SNP markers significantly associated with VL resistance were utilized in BlastN searches of the B. rapa (AA) genome assembly CAAS_Brap_v3.01,B. oleracea (CC) genome assembly BOL, and B. napus (AACC) genome assembly Da-Ae in the EnsemblPlants (plants.ensembl.org)and National Centre for Biotechnology Information (www.ncbi.nlm.nih.gov)databases.Using a threshold of ≥ 90% identity and an E-value ≤ 1e -20 , candidate genes were mapped to the reference genomes.The physical locations of these genes were determined based on the significant SNPs and haploblocks in strong LD.

Distribution of polymorphic SNP markers
Table 1 presents the number and distribution of SNP markers retained in the GWAS to determine resistance to V. longisporum.In the GWAS of the B. rapa and B. napus accessions, the mean number of filtered SNP markers was 497.2 ± 130.6, ranging from 370 on chromosome A09 to 779 on chromosome A03 (Table 1).The filtered set of 4,972 markers covered 291.6 Mb of the Agenome in B. rapa and B. napus (Table 1).The mean inter-SNP marker distance or density for the A-genome was 61.6 ± 21.5 Kb, ranging from 41.3 on chromosome A07 to 115.7 on chromosome A09 (Table 1).In the GWAS of the B. oleracea and B. napus accessions, the mean number of filtered SNP markers was 513.4 ± 207.5, ranging from 231 on chromosome C09 to 829 on chromosome C03 (Table 1).The filtered set of 4,621 markers covered 463.4 Mb of the C-genome in B. oleracea and B. napus (Table 1).The mean inter-SNP marker distance or density for the C-genome was 117.8 ± 59.3 Kb, ranging from 65.0 on chromosome C07 to chromosome C09 255.5 (Table 1).

Linkage disequilibrium
The average of the squared allele correlation LD (r 2 ) for all chromosomes is presented in Table 1 and the plots of correlation coefficient (r 2 ) and physical distance (in Mb) for SNP markers on chromosomes A and C are presented in Supplementary Figure S2.The mean r 2 value for the A-genome of B. rapa and B. napus was calculated to be 0.163, ranging from 0.144 on chromosome A09 to 0.223 on chromosome A08 (Table 1).The average extent of LD decay for the 10 A-genome chromosomes (A 1 -A 10 ) ranged from 0.42 Mb on chromosome A08 to 1.15 Mb on chromosome A09, with an estimated mean of 0.60 Mb (Table 1).The mean r 2 for the C-genome of B. oleracea and B. napus was 0.277, ranging from 0.243 on chromosome C08 to 0.358 on chromosome C01 (Table 1).The estimated mean LD decay for the nine C-genome chromosomes (C 1 to C 9 ) ranged from 0.68 Mb on chromosome C07 to 1.85 Mb on chromosome C09, with an estimated mean of 0.42 Mb (Table 1).Therefore, the extent of LD for the C-genome c h r o m o s o m e s wa s s li g h t ly gr e a t e r t h a n f o r t h e Agenome chromosomes.

Population structure analyses
Two clusters (K=2) were determined at all runs (10,000, 50,000 and 100,000 burn-in iterations and MCMC lengths) by the method of Evanno et al. (2005) using STRUCTURE for the analyses of both B. rapa + B. napus (Figure 3A) and B. napus + B. oleracea (Figure 3B).At a probability of 70%, 77 (39.7%) of the B. rapa and B. napus genotypes were placed in group 1, 108 (55.7%) were placed in group 2, while 9 (4.6%) were classified as admixtures (Figure 3C).In group 1, there were 45 B. rapa genotypes, including 40 vegetable cultivars from China and the ECD lines 01-05, along with 32 Canadian canola cultivars.In Group 2, there were 106 rutabagas, along with ECD 10 and one Canadian canola cultivar.Additionally, there were two Canadian canola cultivars, along with ECD 06-09 and four rutabagas classified as admixtures.

SNP-based association mapping of Verticillium stripe resistance loci
In the two GWAS, the observed -log 10 P distribution showed greater deviation from the expected distribution in the Q-Q plots of the three GLMs than in the four MLMs.Among the four MLMs, the observed -log 10 P distribution of the PCA + K and Q + K models deviated the least from the expected distribution compared with the Q + D and PCA + D models (Supplementary Figures S3A-G, K-Q).The observed -log 10 P distribution of three other GWAS methods, namely BLINK, FarmCPU and MMLM, also exhibited minimal deviation from the expected distribution (Supplementary Figures S3H-J, R-T).Therefore, among the 10 models and methods tested, the PCA + K and Q+K models, along with the BLINK, FarmCPU, and MLMM methods, generated the best Q-Q plots.Consequently, Manhattan plots for these five methods were utilized to identify significant SNPs for Verticillium stripe resistance (Figures 5A-E,  6A-E).Based on the Manhattan plots, 45 SNP markers were found to be associated with resistance to this disease (Table 2).Among these significant markers, 38 SNPs were identified on the Agenome, while seven SNPs were on the C-genome (Table 2).The significant SNPs were distributed across all chromosomes except for chromosomes C01, C04, C07, and C09 (Table 2).Markers identified on chromosome A and chromosome C explained 1.7% to 34.2% and 7.7% to 58.2% of variation, respectively (Table 2).Marker Bn_A03_p2130281 on chromosome A03 and Bn_scaff_18181_1_p572911 on chromosome C05 explained the highest percentage of variation of 34.2% and 58.2% respectively (Table 2).The markers effect size ranged from -0.33 to 0.35 and -0.19 to 0.43 for chromosome A and chromosome C, respectively (Table 2).The allelic effects of 45 SNP markers were listed in Supplementary Table S2.

Haplotype associated with verticillium resistance
The 9,593 (4,972 A-genome + 4,621 C-genome) SNP markers formed a mean of 2283 (1345 A-genome + 938 C-genome) blocks in the haplotype analyses.The haplotypes from the A-genome contained between 2 and 10 SNPs while the mean size of the haplotype blocks ranged from 47 to 90 kilobases (mean 65 kb).
In the case of the C-genome, the number of SNPs in the haplotypes ranged from 2 to 24 while the mean size of the haplotype blocks ranged from 109 to 207 kb (mean 152 kb).The SS, FGR and CI genome association analyses yielded 25, 23 and 9 haplotype blocks, that were significantly associated with V. longisporum resistance, respectively (Table 3).Therefore, the SS and FGR methods significantly outperformed the CI method in identifying haplotype blocks.Twenty of the haplotype blocks were detected by both the SS and FGR methods whiles the SS and FGR methods independently detected 5 and 3 haplotype blocks, respectively (Table 3).Nine haplotype blocks were identified by the three methods.Altogether, a total of 28 haplotype blocks determined by the three methods (SS, GR FGR and CI) were used in the BlastN searches to determine the structural and functional genomic information.Our results showed that the haplotype-based GWAS analysis allowed the detection of more candidate genes than if only the SNP-based GWAS had been used.The haplotype-trait associations were located on 13 different chromosomes, with the LOD scores ranging from 1.8 to 69.7 (Table 3).

Candidate genes based on SNP markers significantly associated with VL resistance
Based on the significant 38 SNP markers (Table 2) and 23 haplotype blocks on the A-genome (Table 3), a total of 962 genes on Da-Ae anchored genome assembly and 1143 genes on CAAS_Brap_v3.01anchored genome assembly (Supplementary Table S3).Similarly, a total of 261 genes on Da-Ae anchored genome assembly and 242 genes (Supplementary Table S3) on BOL anchored genome assembly were detected for 7 SNP markers (Table 2) and 5 haplotype blocks (Table 3) identified on chromosome C.Among total of 2608 candidate genes, some of genes encode proteins such as disease resistance protein RML 1Blike, multisubstrate pseudouridine synthase 7, leucine-rich repeat receptor-like serine, L-type lectin-domain containing receptor kinase S.5, F-box protein family, ethylene-responsive transcription factor ERF106-like and serine/threonine-protein kinase 16-like that were reported to be associated with plant disease response (Supplementary Table S3).Other genes encoded functional proteins include formin-like protein, NAD-dependent protein, ATPase 1, plasma membrane-type-like, internal metabolism, and biosynthesis, which are associated with cellular and biochemical processes (Supplementary Table S3).Additionally, genes that encoded zinc transporter 12, COP1-interactive protein 1, transcription factor PIF3, and transcription factor DIVARICATA, play an important role in basic plant biological and physiological processes (Supplementary Table S3).Proteins of unknown molecular function were also detected (Supplementary Table S3).

Discussion
As an emerging canola disease, Verticillium stripe continues to spread across the Canadian prairies (Oosterhuis, 2022), and has recently been detected in North Dakota (Chapara et al., 2023).However, no V. longisporum resistant canola cultivars have been registered in Canada, resulting in increased yield losses (Norman, 2023).Therefore, the identification of germplasm for breeding resistant cultivars and identifying molecular makers tightly linked with V. longisporum resistance for marker-assisted selection is critical.Association mapping, based on linkage disequilibrium of markers with QTLs, is a powerful tool for marker-assisted selection, enabling the exploitation of variation in plant materials (Jestin et al., 2011).GWAS is one of the most popular approaches for association mapping, offering significant advantages over linkage analysis.It provides higher resolution, incorporates a greater number of alleles, and allows for the simultaneous analysis of various traits of interest (Zhu et al., 2008).Currently, single nucleotide polymorphism     (SNP) markers are widely utilized for GWAS (Ben-Ari and Lavi, 2012).These markers are co-dominant and suitable for highthroughput genotyping (Ben-Ari and Lavi, 2012).Their biallelic and high heritability contribute to increased genotyping accuracy (Zhu et al., 2008).In this study, GWAS was employed to find significant SNP markers associated with V. longisporum resistance in a large collection of B. napus, B. rapa, and B. oleracea genotypes.
The Brassica 15K SNP array used by Fredua-Agyeman et al. (2020) had a total of 13,714 SNP markers which comprised 7,214 Agenome markers and 6,500 C-genome markers which were mostly on scaffolds, while the 19K array had 9,966 A-genome markers and 8,886 C-genome markers which were mostly mapped to specific chromosomes.To minimize missing data, the B. rapa and B. napus accessions were analyzed separately using only the A-genome markers, while the B. oleracea and B. napus accessions were analyzed separately using only the C-genome markers.The filtered set of 4,972 SNP markers obtained from the Brassica 19K SNP array were uniformly distributed on the A 1 -A 10 chromosomes and covered 291.6 Mb of the A-genome of B. rapa and B. napus.For the GWAS of the B. oleracea and B. napus accessions, the filtered set of 4,621 SNPs were also uniformly distributed on the C 1 -C 9 chromosomes and covered 463.4 Mb of the C-genome.The two GWAS conducted in this study had approximately the same Agenome coverage and close to 1.5× more coverage of the C-genome compared to a previous study that utilized a Brassica 15K SNP array from SGS TraitGenetics (Fredua-Agyeman et al., 2020).In the current study, the mean maker density was 61.6 ± 21.5 Kb on the A-genome, and 117.8 ± 59.3 Kb on the C-genome.In contrast, Fredua-Agyeman et al. (2020) reported mean marker densities using the Brassica 15K SNP assay of approximately 63.4 ± 21.9 Kb and 15.0 ± 8.4 Kb for the A-and C-genomes, respectively.
Therefore, the marker density on the A-genome was similar, but on the C-genome, the density was almost 9× lower with the 19K SNP array used in this study.While the 19K SNP array provided increased coverage, more C-genome markers need to be developed.
Linkage disequilibrium refers to the association and linkage of alleles among SNPs within a genomic sequence, which is important in GWAS for identification of genetic markers (Joiret et al., 2022).Wang et al. (2017) observed that the extent of LD decay ranged from 0.15 to 3.3 Mb for the A-genome and from 0.4 to 8.3 Mb for the C-genome.Using a Brassica 60K Illumina Infinium SNP array, Xu et al. (2015), reported LD decay for the A-genome ranging from 0.6 to 5.6 Mb and from 1.2 to 8.5 Mb for the C-genome.In another study, Fredua-Agyeman et al. (2020) estimated LD decay ranging from 1.1 to 2.3 Mb for the A-genome and from 0.20 to 1.5 Mb for the C-genome using the Brassica 13.2K SNP array.The extent of LD decay found in this study ranged from 0.42 to 1.15 Mb for the Agenome and from 0.68 to 1.85 Mb for the C-genome, which was similar to the values reported in these earlier studies.
In the present study, GWAS was conducted using three GLMs (Q-only, K-only, and PCA-only) and four MLMs (PCA+D, PCA +K, Q+K, and Q+D), in addition to three methods (BLINK, FarmCPU, and MLMM).Mixed linear models are versatile and widely used in GWAS.They offer a balance between complexity and computational efficiency by incorporating population structure and kinship to adjust associations on markers (Yang et al., 2014).However, MLMs can lead to increased false positive rates due to overfitting (Kaler et al., 2020).Therefore, other GWAS algorithms were also employed in this study.To reduce false positive or false negative associations, BLINK considers both main effects and interactions among genetic variants (Huang et al., 2019).FarmCPU integrates fixed and random effects and adjusts for population structure and relatedness using a kinship matrix (Kaler et al., 2020), while MLMM simultaneously considers multiple loci, accommodating polygenic effects (Kaler et al., 2020).By employing multiple GWAS methods, SNP markers could be identified more consistently, thereby increasing the accuracy and efficiency of QTL detection.
Six SNPs, each corresponding to a distinct QTL, were identified on chromosome A03 based on the SNP-based association and haplotype block analysis.Gabur et al. (2020) mapped the major QTLs for VL resistance to the A03 chromosome using a biparental DH population.In this GWAS study, the significant SNPs on A03 seem to be distal from the SNPs bordering the QTL for V. longisporum resistance previously reported by Gabur et al. (2020) on chromosome A03 (physical position 7,963,059 to 11,419,476).Therefore, the QTL regions on chromosome A03 identified in this study appear to be different from the QTLs reported by Gabur et al. (2020).One of the SNP markers identified in this study, Bn_A03_p14870270, was found in a genomic region reported to be a hotspot for clubroot resistance (Fredua-Agyeman et al., 2021).This region contains the clubroot resistance gene(s)/QTLs Bn.A3P2F, Crr3, CRk, and CRd.However, this SNP did not form a haploblock but was found to be in the genomic region encoding multisubstrate pseudouridine synthase.The haplotype hapVLA10.1 was associated with three significant SNP markers: Bn_A10_p15719803, Bn_A10_p15727608, and Bn_A10_p15731773.The haplotype hapVLC6.1 was associated with two significant SNP markers, Bn_scaff_15892_1_p310757 and Bn_scaff_15892_1_p404259.
The rutabaga accessions used in this study were also screened for clubroot resistance by Fredua-Agyeman et al. (2020).Unfortunately, accessions previously classified as resistant or moderately resistant to clubroot (FGRA036, FGRA037, FGRA044, FGRA072, FGRA106, FGRA108, FGRA109 and FGRA112) were found to be susceptible or moderately susceptible to V. longisporum.Hirani et al. (2018) reported that ECD 01, ECD 02, ECD 03 and ECD 04 (B.rapa), as well as ECD 11 (B.oleracea), were resistant to Canadian field isolates of the clubroot pathogen, while ECD 05 (B.rapa) was susceptible.In the current study, ECD 01, ECD 02, ECD 03 and ECD 04 were all susceptible to V. longisporum isolate VL 43.Among all of the ECD hosts screened, only ECD 11 exhibited resistance to this fungus, while ECD 05 was moderately resistant.Similarly, ECD 08 (B.napus) and ECD 13 (B.oleracea) were moderately resistant to V. longisporum, but they were susceptible and segregated in response to clubroot (Hirani et al., 2018).Consequently, the present findings suggest a negative relationship between resistance to Verticillium stripe and clubroot.
Of the 20 Verticillium stripe-resistant Brassica genotypes identified in this study, eight were B. rapa and B. oleracea vegetable types, three were rutabagas, and nine were canola.Gabur et al. (2020); Rygulla et al. (2007), andSu et al. (2023) identified QTLs/genes for Verticillium stripe resistance in vegetable-type Brassica germplasm.This indicates that B. rapa and B. oleracea might be potential donors for resistance breeding programs in canola/oilseed rape.Nine or about a quarter of the Canadian canola cultivars in this study were found to be resistant to V. longisporum, suggesting that the deployment of resistant hosts holds promise for the management of Verticillium stripe.
In the current GWAS, genes located within the QTLs were identified that were associated with plant disease resistance and immunity mechanisms.The gene identified in the QTL region on chromosome A04 encoded disease resistance protein RML1B-like.It was reported to be related to Leptosphaeria maculans resistance in A. thaliana (Staal et al., 2006).Similarly, genes encoding disease resistance protein RML1A-like on chromosome A07 were identified as involved in resistance to powdery mildew (Bhattarai et al., 2020).In one of the QTL regions on chromosome A06, the gene encoded disease resistance protein RPV1-like was identified which confers resistance to Plasmopara viticola (Feechan et al., 2013).In the other QTL region on chromosome A06, genes encoding disease resistance protein RBA1-like and TMV resistance protein N were identified, which were associated with cell death for disease resistance in Arabidopsis (Nishimura et al., 2017) and resistance to tobacco mosaic virus (Marathe et al., 2002).Additionally, the gene encoding protein PYRICULARIA ORYZAE RESISTANCE 21 was identified in the QTL region on chromosome A08 which confers resistance to Magnaporthe oryzae (Nawaz et al., 2020).Unlike QTL regions on the A-genome, only one of the regions on chromosome C08 identified a gene encoding a probable disease resistance protein At1g15890, which was reported as an Arabidopsis-resistance gene related to cell death (Huang et al., 2024).Other genes encoded proteins that were associated with disease resistance responses such as the ethylene-responsive transcription factor ERF106 (Yang et al., 2022), a probable leucine-rich repeat receptor-like protein kinase which modulates the development of Phytophthora sojae on soybean (Si et al., 2021), and an L-type lectin-domain containing receptor kinase that was found to positively regulate disease resistance against Phytophthora in pepper (Woo et al., 2020).It is possible that these QTL regions habour genes controlling V. longisporum resistance.
In conclusion, screening 211 Brassica genotypes for resistance to V. longisporum identified 20 resistant accessions/cultivars, including representatives from B. rapa, B. oleracea, and B. napus.Additionally, significant SNP markers on chromosome A03 may be important for Verticillium stripe resistance breeding.

FIGURE 1
FIGURE 1 Reaction of Brassica genotypes to inoculation with Verticillium longisporum.Genotypes were rated as susceptible (S), moderately susceptible (MS), moderately resistant (MR), or resistant (R) to the fungus based on disease severity.Frequency distributions for host reactions are shown for (A) the entire collection of 209 genotypes excluding the susceptible and moderately resistant checks; (B) 110 rutabagas (B.napus ssp.napobrassica); (C) 34 canola (B.napus) cultivars; (D) 39 vegetable-type B. rapa; (E) 15 vegetable-type B. oleracea; and (F) 10 selected hosts of the European Clubroot Differential (ECD) set, excluding the moderately resistant-check ECD 05.

FIGURE 3
FIGURE 3 Bayesian cluster analysis of 211 Brassica accessions including B. napus, B. oleracea and B. rapa estimated with STRUCTURE using 50,000 burn-in iterations and Markov Chain Monte Carlo (MCMC) lengths.The value of K, determined following Evanno et al. (2005), indicated two clusters for the B. rapa and B. napus genotypes (A), and for the B. oleracea and B. napus genotypes (B), in all runs.Detailed Bayesian clustering of the B. rapa and B. napus genotypes (C), and of the B. oleracea and B. napus genotypes (D), is also shown, with each color representing one ancestry component.The simplified view suggests two ancestral populations.

FIGURE 4
FIGURE 4 Neighbor joining (NJ) (A) and unweighted pair group method with arithmetic mean (UPGMA) (B) analysis with 4,972 A-genome markers grouped 194 Brassica rapa and Brasscia napus genotypes into five clusters.NJ (C) and UPGMA (D) analysis with 4,621 C-genome markers grouped 166 Brassica oleracea and B. napus genotypes into four clusters.

FIGURE 5
FIGURE 5 Manhattan plots of the PCA+K (A), Q+K (B), BLINK (C), FarmCPU (D) and MLMM (E) models for identifying Verticillium longisporum resistance in Brassica rapa + Brassica napus genotypes.The dashed horizontal lines indicate the Bonferroni-adjusted significance threshold known as "logarithmof-odds" (LOD score).The solid lines indicate a slightly lower threshold of -log10 P = 3.0.The dots above the significance threshold indicate singlenucleotide polymorphisms (SNPs) associated with resistance to V. longisporum.

FIGURE 6
FIGURE 6 Manhattan plots of the PCA+K (A), Q+K (B), BLINK (C), FarmCPU (D) and MLMM (E) models for identifying Verticillium longisporum resistance in Brassica oleracea + Brassica napus genotypes.The dashed horizontal lines indicate the Bonferroni-adjusted significance threshold known as "logarithm-of-odds" (LOD score).The solid lines indicate a slightly lower threshold of -log10 P = 3.0.The dots above the significance threshold indicate single-nucleotide polymorphisms (SNPs) associated with resistance to V. longisporum.

TABLE 1
Single nucleotide polymorphism (SNP) marker density and extent of intrachromosomal linkage disequilibrium (LD) in Brassica rapa, Brassica napus and Brassica oleracea genotypes included in a genome-wide association study of resistance to Verticillium longisporum.

TABLE 2
Single nucleotide polymorphism (SNP) markers identified in two genotype combinations, Brassica rapa + Brassica napus and Brassica oleracea + B. napus, including their chromosomal locations and linkage association with resistance to Verticillium longisporum.
Q Mixed Linear Model (MLM) designations: PCA, principal component analysis; Q, population structure; K, Kinship.GWAS method designations: BLINK, Bayesian-information and Linkagedisequilibrium Iteratively Nested Keyway; FarmCPU, the Fixed and random model Circulating Probability Unification; MLMM, the Multiple Locus Mixed Linear Model.a SNP markers denoted with the same superscript letter mapped to multiple chromosomes on the reference genomes.The type of PCR-based markers showing trait association has been specified.Putative functions are based on matching entries in the EnsemblPlants and NCBI GenBank databases.

TABLE 3
Determination of haplotype blocks associated with Verticillium stripe resistance in canola determined using the confidence interval, four gamete rule, and solid spine of LD methods.