Whole Genome Re-sequencing and Bulk Segregant Analysis Reveals Chromosomal Location for Papaya Ringspot Virus W Resistance in Squash

Squash (Cucurbita moschata) is among the most important cucurbit crops grown worldwide. Plant pathogen, Papaya ringspot virus W (PRSV-W) causes significant yield loss in commercial squash production globally. The development of virus-resistant cultivars can complement integrated disease management and mitigate losses due to viral infections. However, the genetic loci and molecular markers linked to PRSV-W resistance that could facilitate marker-assisted selection (MAS) for accelerated cultivar development are unknown. In this study, quantitative trait loci (QTL), molecular markers, and candidate genes associated with PRSV-W resistance in squash were identified in an F2 population (n = 118) derived from a cross between Nigerian Local accession (resistant) and Butterbush cultivar (susceptible). Whole genome re-sequencing-based bulked segregant analysis (QTLseq method; n = 10 for each bulk) and non-parametric interval mapping were used to identify a major QTL associated with PRSV-W resistance on chromosome 9 (QtlPRSV-C09) (p < 0.05) of C. moschata. QtlPRSV-C09 extended from 785,532 to 5,093,314 bp and harbored 12,245 SNPs among which 94 were high-effect variants. To validate QtlPRSV-C09, 13 SNP markers were assayed as Kompetitive allele-specific PCR (KASP) markers in the F2 population and tested for the association with PRSV-W resistance. Among these, two KASP markers (Ch09_2080834 and Ch09_5023865-1) showed significant association with PRSV-W resistance (p < 0.05). The two SNPs were located within exons of putative disease-resistant genes encoding the clathrin assembly family and actin cytoskeleton-regulatory complex proteins, which are implicated in disease resistance across plant species. The findings of this study will facilitate MAS for PRSV-W resistance in squash and allow further understanding of the functional mechanisms underlying potyvirus resistance in Cucurbita species.


INTRODUCTION
Squash and pumpkin (Cucurbita species) constitute an economically important horticultural crop grown in most of the subtropical and temperate regions of the world (Kates, 2019). Cucurbita moschata is among the major cultivated species of squash valued for culinary, processing, and ornamental purposes worldwide (Whitaker and Robinson, 1986). Flowers and tender shoots of squash are relished as vegetables while fruits are consumed either baked or cooked. Fruits of squash are also used in pies, soups, and jams (Lim, 2012). The flesh and leaves of C. moschata are a good source of vitamins, amino acids, flavonoids, and carbohydrates, whereas the seeds have significant levels of unsaturated fatty acids and antioxidants (Stevenson et al., 2007;Yadav et al., 2010;Vinayashree and Vasu, 2021).
The US is among the top-five producers of squash in the world with 116,600 acres of land under production delivering an annual value of more than 411.8 million dollars in 2020 (USDA-National Agricultural Statistics Service [NASS], 2021). Disease outbreaks caused by potyviruses (Potyviridae family) are the major limiting factors for squash production in the US (Boyhan et al., 2009). More than 165 potyvirus species are known to infect plants (Nigam et al., 2019); however, the most devastating for Cucurbita crops are Papaya ringspot virus W (PRSV-W), Zucchini yellow mosaic virus (ZYMV), and Watermelon mosaic virus (WMV) (Paris, 2008). These three viruses can infect both vegetative and reproductive parts of the plants, which cause severe mottling in leaves, stunted growth, and deformed fruits, and can cause 100% yield loss when susceptible cultivars are grown (Seda-Martínez et al., 2021). Chemical and cultural management methods aim at suppressing aphid-vector populations but have limited efficacy due to the obligate intracellular nature of the viruses (Rubio et al., 2020). Utilizing genetic host resistance is the most reliable and cost-efficient way of minimizing losses due to the viruses (Brown et al., 2003;Collard et al., 2005;Pachner et al., 2011;Rubio et al., 2020).
Resistance to PRSV-W, ZYMV, and WMV has been reported in Nigerian Local, a landrace of C. moschata from Nigeria (Munger and Provvidenti, 1987;Brown et al., 2003). Resistance to PRSV-W in Nigerian Local is controlled by a single recessive gene (prv) with modifiers (McPhail-Medina et al., 2012;Seda-Martinez, 2019), whereas resistance to WMV is controlled by a single dominant gene (Wmv) (Behera et al., 2012). On the other hand, resistance to ZYMV in Nigerian Local is conferred by two genes designated Zym-0 and Zym-4, with the latter only conferring resistance in complementary interaction with a recessive gene designated zym-5 in ZYMV-susceptible cultivar, Waltham Butternut (Brown et al., 2003;Pachner et al., 2011). Traditional breeding strategies for developing cultivars of squash resistant to potyviruses are costly and dependent on resourceintensive phenotypic selection methods. For example, the first commercial squash cultivars integrating virus resistance from wild germplasm took nearly 20 years to develop using traditional breeding approaches (Brown et al., 2003). This is due to interspecific barriers across Cucurbita species and the non-allelic nature of resistance to the three potyviruses (Brown et al., 2003). However, molecular tagging of resistant genes through markerassisted selection (MAS) can reduce the length of selection cycle by allowing the identification of desirable plants at the seedling stage or through seed-based genotyping (Collard et al., 2005;Martinez et al., 2021). Molecular markers tightly linked to ZYMV resistance in Nigerian Local have recently become available for MAS (Pachner et al., 2011;Shrestha et al., 2021). However, the quantitative trait loci (QTL) and molecular markers linked to PRSV-W and WMV resistance in Nigerian Local are currently unknown, thus limiting potential genetic gain through MAS.
The advances in next-generation sequencing technologies and availability of cost-effective genotyping platforms have facilitated the development of rapid genomic tools for crop advancement (Buermans and Den Dunnen, 2014). The recent availability of a reference genome for C. moschata (Sun et al., 2017) has accelerated the rapid development of MAS tools for economically important traits through linkage mapping (Sáez et al., 2020;Vogel et al., 2021) and QTLseq (Ramos et al., 2020;Shrestha et al., 2021). QTLseq is a genetic mapping method that integrates whole genome sequencing data into traditional bulk-segregant analysis to provide robust and rapid identification of genomic loci associated with the traits of interest (Takagi et al., 2013). Briefly, the method involves developing a mapping population segregating for a trait of interest and identifying individuals expressing the extreme phenotype. DNA from these individuals is bulked into high and low pools for sequencing, with each pool containing genomes from both parents in the ratio of 1:1, except for the regions harboring QTL of interest (Takagi et al., 2013).
In this study, the QTLseq approach was employed to identify genomic regions and DNA markers associated with resistance to PRSV-W in Nigerian Local accession of C. moschata.

Plant Materials and Phenotyping
A controlled cross between Nigerian Local (resistant accession, maternal) and Butterbush (susceptible Butternut cultivar, paternal) was made in the greenhouse, and a single F 1 individual was self-pollinated to generate an F 2 population (n = 118). A strain of PRSV-W (provided by Dr. Jane Polston, University of Florida) was maintained on susceptible Yellow Crookneck plants (commercial cultivar, highly susceptible to PRSV) in insect-proof cages (Megaview Science, Talchung, Taiwan). The isolate was confirmed as PRSV-W using enzyme-linked immunosorbent assay (ELISA) (Agdia Inc., Elkhart, IN, United States).
Seeds of each parent (n = 12) and the F 2 population (n = 118) were sown in 4-inch pots filled with Proline C/B growing mix (Jolly Gardener, PA, United States) amended with controlled release fertilizer (14:4.2:11.6 NPK). Seeds (n = 9) of Yellow Crookneck were also sown to serve as a susceptible check. Inoculation was done at the two true-leaf stages. Fresh inoculum was prepared by grinding leaf tissues of PRSV-W-infected Yellow Crookneck plants in 0.02 M phosphate buffer (pH 7) at 1:5 weight by volume ratio. Cotyledon and first true leaf of the plants were dusted with silicon carbide powder (Thermo Fisher Scientific, MA, United States) and mechanically inoculated with the freshly prepared PRSV-W inoculum. Disease rating (DR) was performed on a scale of 0-3, with 0 representing asymptomatic plants, 1 for plants showing slight yellowing on the leaves, 2 for plants showing moderate yellowing and mottling of the leaves, and 3 for plants with severe leaf yellowing and mottling and stunted plant growth. Disease severity data were recorded 7, 14, 21, and 28 days after inoculation (DAI).

Whole Genome Re-sequencing and Quantitative Trait Loci Mapping
The DNA was extracted from 10 resistant (DR = 0) and 10 susceptible (DR = 3) F 2 individuals and the parents using FavorPrep Plant DNA kit (Favorgen Biotech Corp., Ping-Tung, Taiwan) following the manufacturer's instruction. The resistant and susceptible sequencing libraries were prepared by pooling an equal amount (500 ng) of DNA from individuals constituting the resistant and susceptible bulks, respectively. Whole genome re-sequencing of the parents, resistant, and susceptible bulks was performed on Illumina HiSeq X platform at BGI sequencing center (Shenzhen, Guangdong, China). The resulting paired-end sequencing reads from the parents were mapped into the C. moschata (Rifu) genome using BWA-MEM (Li and Durbin, 2009). A consensus reference sequence was generated by replacing C. moschata reference alleles with sequence reads from resistant and susceptible parents using SAMtools . Variant calling between the resistant and susceptible bulks was performed following the standard bioinformatic procedures. Briefly, reads from resistant and susceptible bulks were aligned to consensus reference sequence followed by filtering out of duplicate reads using Picard 2.19.1. HaplotypeCaller of GATK4 was used for variant calling (Garrison and Marth, 2012;Van der Auwera et al., 2013). The called variants were then compiled using GVCFs to obtain a raw variant VCF file, which was used as raw input data for QTL analysis in QTLseqr (Mansfeld and Grumet, 2018). In QTLseqr, runQTLseqAnalysis function was implemented to identify significant QTL associated with PRSV-W resistance. Filtering criteria set on QTLseqr to filter out the variants with low confidence were "minimum depth ≥ 50, maximum depth ≤ 500, GQ ≥ 99." The SNP-index for each SNP position was calculated by determining the difference between SNP index of resistant and susceptible bulks (Takagi et al., 2013). SNP index values across the entire genome of resistant and susceptible bulks were calculated with a sliding window average of 1 Mb. Significant QTL were detected at 95 and 99% confidence intervals (Takagi et al., 2013;Mansfeld and Grumet, 2018).

Variant's Characterization, Marker Development, and Validation
Annotation and characterization of SNPs identified within QTL region were performed using SnpEff software (Cingolani et al., 2012). Since C. moschata (Rifu) reference genome is not present in SnpEff genomic database, a custom database was built using the reference genome and gff3 annotation files available through Cucurbit Genomics Database prior to the analysis. High-effect SNPs identified by SnpEff were further explored for the marker development and QTL validation. A total of 13 high-effect SNPs were converted into Kompetitive allele-specific PCR (KASP) markers using BatchPrimer3 software (Ali and Al-Koofee, 2019; Supplementary Table 1) and tested for the association with PRSV-W resistance in the F 2 population (n = 118). KASP assays were performed in 10-µl reaction volumes containing 5 µl of 2X low rox master mix, 0.16 µl of each forward primers (10 µm), 0.41 µl of reverse primer, 2 µl of genomic DNA (50 ng/µl), and 2.27 µl of H2O. The PCR amplification process consisted of initial incubation at 94 • C for 15 min, touchdown step at 94 • C for 20 s, and 61 • C for 60 s, with a 0.6 • C decrease per cycle for 10 cycles, followed by two-stage PCR step of 26 cycles, 94 • C for 20 s and 55 • C for 60 s. Fluorescent plate reading and cluster calling were performed using LightCycler R 480 Instrument II (Roche Life Sciences, Penzberg, Germany).
The markers were tested for the association with PRSV-W resistance using the Kruskal-Wallis test (p < 0.05), whereas the R/qtl package was used for distance estimation and nonparametric interval mapping using the est.map and scanone functions, respectively (Broman et al., 2003;Broman and Sen, 2009). Likelihood of the odd (LOD) values were determined using 1,000 permutations, and the significance thresholds were viewed at 99, 95, and 90 percentiles. The SNP effect on the predicted amino acid sequence was checked using resources available through the Cucurbit Genomics Database.

Phenotypic Data
The onset of leaf yellowing was evident on the susceptible parent (Butterbush) and the susceptible check (Yellow Crookneck) at 14 DAI, whereas leaf distortion and plant stunting were observed at 28 DAI (Supplementary Figure 1). On the contrary, plants of the Nigerian Local accession remained asymptomatic throughout the experiment. At 28 DAI, the F 2 population was segregated into four distinct phenotypic classes. Individuals with a DR of 0 and 1 were considered resistant, whereas individuals with DR of 2 and 3 were considered susceptible (Supplementary Table 2). Of the total 118 F 2 individuals evaluated for PRSV-W resistance, 54% were susceptible and 46% were resistant. A chi-square test for the susceptible and resistant classes in the F 2 population fits a 9:7 ratio (p = 0.65; 54 resistant and 64 susceptible) (Supplementary Table 3).

Quantitative Trait Lociseq Analysis
A high mapping rate (98.07-99.03%) and mapping depth (58.72-79.92) across samples were the indicative of good quality whole genome re-sequencing sequence data ( Table 1). The alignment of reads from susceptible and resistant bulks to the Butterbush consensus reference FASTQ file revealed 1,887,568 SNPs. A total of 599,997 SNPs were filtered out by QTLseqr based on the maximum (500) and minimum read depths (50) and genotype quality (GQ =≥ 99). After filtering, 1,287,571 SNPs were utilized in QTLseqr for QTL analysis. The SNP indices across the genome were calculated by subtracting the SNP index of resistant and susceptible bulks at each locus and the values plotted. QTLseq analysis revealed a single QTL (QtlPRSV-C09) on chromosome 9 that was significantly (95% confidence interval) associated with PRSV-W resistance in Nigerian Local (Figure 1). QtlPRSV-C09 extended from 785,532 to 5,093,314 bp and harbored 12,245 SNPs. A similar QTL region was identified using the resistant parent, Nigerian Local, as the consensus reference sequence (Supplementary Figure 2).

Marker-Trait Association
Of the 12,245 SNPs within QtlPRSV-C09, 2,004 were missense variants with moderate effect, 3,082 were synonymous variants with low effect, 7,065 were intron variants with modifier effect, and 94 SNPs were variants with high effect. Further examination of the high effect SNPs revealed that 50 were stop-gain, 10 start-loss, 20 splice acceptor, 11 splice donor, and 3 stop loss variants. Only three of the 94 high effect SNPs were located within genes that were previously annotated as putative TIR-NBS-LRR disease-resistant homologs (Supplementary Table 4). A total of 13 high effect SNPs, which include those identified within putative disease resistance genes, were converted into KASP assays and tested for the association with PRSV-W resistance. A total of four among these were polymorphic between the parents and individuals constituting two bulks and thus were tested in the entire F 2 population. A number of two SNP markers (Ch09_2080834 and Ch09_5023865-1) within QtlPRSV-C09 were significantly (p < 0.05) associated with PRSV-W resistance in Nigerian Local ( Table 2). Nonparametric interval mapping confirmed a significant association of the two markers with PRSV-W resistance (Figure 2). Genotype and phenotype plots of the two SNPs (Ch09_2080834 and Ch09_5023865-1) in the F 2 population revealed that individuals with the homozygous genotype (BB; SNP from C to T for Ch09_2080834 and SNP from T to C for Ch09_5023865-1) exhibited a higher average disease resistance than those with either the AA (allele C for Ch09_2080834 and allele T for Ch09_5023865-1) or heterozygous genotypes for both markers (Figure 3).

Candidate Genes for Papaya Ringspot Virus W Resistance
Further analysis revealed that QtlPRSV-C09 contained 70 genes (Supplementary Table 4). The two significant SNPs (Ch09_2080834 and Ch09_5023865-1) were located within putative disease-resistant gene homologs, CmoCh09G004640 and CmoCh09G009540, respectively. CmoCh09G004640 is a clathrin assembly family protein extending from 2,080,680 to 2,083,518 bp on chromosome 9 and shares 49.3% homology with the clathrin assembly family protein of Arabidopsis (Arabidopsis thaliana) that features an AP180 N-terminal  Figure 3A). On the other hand, homolog CmoCh09G009540 (5,023,313-5,023,867 bp) associated with SNP marker Ch09_5023865-1 is an actin cytoskeleton-regulatory complex gene. SNP Ch09_5023865-1 [T (susceptible)/C (resistant)] is a stop-loss variant located on exon 1 of CmoCh09G009540, which leads to the loss of a stop codon and an elongated amino acid chain in the resistant parent (Supplementary Figure 3B).

DISCUSSION
PRSV-W is one of the most destructive potyviruses in cucurbit production worldwide (Purcifull, 1984). Genetic resistance to PRSV-W in Nigerian Local (C. moschata) is well characterized and widely utilized in breeding (Munger and Provvidenti, 1987;Brown et al., 2003;Behera et al., 2012;  Seda-Martinez, 2019). The development and deployment of molecular markers associated with PRSV-W resistance could increase the rate of genetic gain by shortening the length of the selection cycle (Collard et al., 2005). In this study, the QTLseq method was employed to rapidly identify genomic loci associated with resistance to PRSV-W in Nigerian Local. As expected, the phenotypic data confirmed high resistance in Nigerian Local and susceptibility in the susceptible parent  Frontiers in Plant Science | www.frontiersin.org (Butterbush) and the cultivar check, Yellow Crookneck. The 9:7 (susceptible/resistant) segregation ratio observed in the F 2 indicates an oligogenic recessive epistatic mode of inheritance and supports previous inheritance studies in Nigerian Local (McPhail-Medina et al., 2012). On the contrary, Brown et al. (2003) reported that resistance in Nigerian Local is controlled by a single recessive gene (prv). The disparities in conclusions among these studies may be due to contrasting thresholds set for resistance and susceptible classes and also the differences in the susceptible parent genetic backgrounds which may contribute to modifying loci (McPhail-Medina et al., 2012). Although the segregation pattern in the F 2 was oligogenic, only single QTL (QtlPRSV-C09) linked to PRSV-W resistance in Nigerian Local was detected on chromosome 9. This may be due to the small mapping population (n = 118) utilized in this study that may have diminished the statistical power for the detection of QTL of minor effect (Vales et al., 2005;. The location of QtlPRSV-C09 on chromosome 9 is different from the QTL for ZYMV resistance in Nigerian Local on chromosomes 2, 4, 8, and 20 . This observation supports a previous study that demonstrated that the resistance to PRSV-W and ZYMV in Nigerian Local is non-allelic (Brown et al., 2003;Seda-Martinez, 2019).
QtlPRSV-C09 spanned 4.31 Mb and contained 12,245 SNPs, among which 94 were determined as high-effect SNPs using SnpEff, a tool for delimiting confidence intervals of QTL regions in linkage mapping (;Cingolani et al., 2012). The abundance of SNPs in the non-coding region with no effect was expected because of low sequence conservation and decreased selection pressure in these regions as compared to genic regions (Loewe and Hill, 2010). Among the 13 high-effect SNPs assayed as KASP markers in the F 2 population, two (Ch09_2080834 and Ch09_5023865-1) were significantly associated with PRSV-W resistance in Nigerian Local. These SNPs, together with those linked to ZYMV resistance in Nigerian Local , can be targeted through MAS to facilitate gene pyramiding for resistance to PRSV and ZYMV in squash. SNP Ch09_2080834 lies within a clathrin assembly family protein (CmoCh09G004640) that features ANTH domain that is associated with clathrin coat assembly and clathrindependent endocytosis in Arabidopsis (Mayer et al., 2004;Cheng et al., 2017). Interestingly, three different patternrecognition receptor kinases, Flagellin sensing 2, PEP receptor 1, and elongation factor-Tu involved in plant defense, are activated via clathrin-dependent endocytosis (Mbengue et al., 2016;Ortiz-Morea et al., 2016). On the other hand, SNP Ch09_5023865-1 was within CmoCh09G009540, an actin cytoskeleton-regulatory complex gene reportedly involved in the regulation of leaf senescence, salicylic acid, and pathogen attack in Arabidopsis and barley (Hordeum vulgare) (Krupinska et al., 2002;Fischer-Kilbienski et al., 2010). SNPs Ch09_2080834 and Ch09_5023865-1 are non-synonymous variants, which results in stop-gain and stop-loss in CmoCh09G004640 and CmoCh09G009540, respectively, and may result in altered pattern recognition in response to PRSV-W infection in susceptible and resistant parents (Cingolani et al., 2012).

CONCLUSION
This study demonstrated QTLseq as a rapid tool to map traits in horticultural crops, particularly for oligogenic traits. Segregation pattern in the F 2 population indicated that at least two genes are involved in PRSV-W resistance in Nigerian Local. A single QTL (QtlPRSV-C09) associated with resistance to PRSV-W resistance was mapped on chromosome 9 of C. moschata. The two SNP markers (Ch09_2080834 and Ch09_5023865-1) tightly linked to QtlPRSV-C09 in Nigerian Local may expedite the development of elite PRSV-W-resistant squash cultivars through MAS.

AUTHOR CONTRIBUTIONS
GM, SS, and YF designed the research. SS, YF, and VM conducted the experiments and analyzed the data. GM and SS wrote the manuscript. All authors read and approved the manuscript.