Development and Utilization of InDel Markers to Identify Peanut (Arachis hypogaea) Disease Resistance

Peanut diseases, such as leaf spot and spotted wilt caused by Tomato spotted wilt virus (TSWV), can significantly reduce yield and quality. Application of marker-assisted plant breeding requires the development and validation of different types of DNA molecular markers. Nearly 10,000 SSR-based molecular markers have been identified by various research groups around the world, but fewer than 14.5% showed polymorphism in peanut and only 6.4% have been mapped. Low levels of polymorphism limit the application of marker-assisted selection (MAS) in peanut breeding programs. Insertion/deletion (InDel) markers have been reported to be more polymorphic than SSRs in some crops. The goals of this study were to identify novel InDel markers and to evaluate their potential use in peanut breeding. Forty-eight InDel markers were developed from conserved sequences of functional genes and tested in a diverse panel of 118 accessions covering six botanical types of cultivated peanut, of which 104 were from the U.S. mini-core. Results showed that 16 InDel markers were polymorphic, with polymorphic information content (PIC) among InDels ranging from 0.017 to 0.660. With respect to botanical types, PICs varied from 0.176 for fastigiata var., 0.181 for hypogaea var., 0.306 for vulgaris var., 0.534 for aequatoriana var., 0.556 for peruviana var., to 0.660 for hirsuta var., implying that aequatoriana var., peruviana var., and hirsuta var. have higher genetic diversity than the other types and provide a basis for gene functional studies. Single marker analysis was conducted to associate specific markers with disease resistance traits. Five InDels from functional genes were identified as significantly correlated with TSWV infection and leaf spot, and these novel markers will be utilized to identify disease-resistant genotypes in breeding populations.


INTRODUCTION
Various types of molecular markers, such as random amplified polymorphic DNA (RAPD) (Williams et al., 1990; Burow et al., 1996; Subramanian et al., 2000), amplified fragment length polymorphism (AFLP) (Vos et al., 1995; He and Prakash, 1997), inter simple sequence repeat (ISSR) markers (Zietkiewicz et al., 1994; Raina et al., 2001), and simple sequence repeats (SSR) (Tautz, 1989; Liang et al., 2009), have been used in detecting the genetic diversity of plant germplasm resources (Cuc et al., 2008; Jiang et al., 2010; Moretzsohn et al., 2013), construction of genetic linkage maps (Varshney et al., 2009; Hong et al., 2010; Gautami et al., 2012; Nagy et al., 2012; Qin et al., 2012; Shirasawa et al., 2013), molecular marker-assisted selection (MAS), and mapping and cloning of genes/QTL (Chu et al., 2011; Ravi et al., 2011; Sujay et al., 2012) in peanut. Microsatellite or simple sequence repeat (SSR) markers have been developed using sequences derived from SSR-enriched genomic libraries and expressed sequence tags (ESTs) (Guo et al., 2009; Koilkonda et al., 2012; Wang et al., 2012; Zhang et al., 2012) and have been utilized to investigate genetic diversity in the U.S. peanut mini-core collection (Belamkar et al., 2011; Wang et al., 2011; Chen et al., 2014), the Chinese peanut mini-core collection (Jiang et al., 2010, 2014), and the ICRISAT peanut mini-core collections (Mukri et al., 2012; Upadhyaya et al., 2012). Functional SNP markers from the FAD2A/FAD2B genes have been used to screen the U.S. mini-core collection. Another new kind of marker, called start codon targeted polymorphism (SCoT), was also developed and showed potential for studying genetic diversity and relationships in cultivated peanut (Xiong et al., 2011).
Approximately 10,000 molecular markers have been identified by various research groups around the world, but only 14.5% showed polymorphism in peanut and only 6.4% were mapped, mainly because cultivated peanut possesses an extremely narrow genetic base (Xiong et al., 2011). Low genetic diversity among cultivated peanut accessions is likely due to a single hybridization event between two ancient diploid species, likely Arachis duranensis (A genome) and Arachis ipaensis (B genome) (Burow et al., 2009; Nagy et al., 2012; Shirasawa et al., 2013). This low level of polymorphism limits the application of molecular markers in peanut breeding and genetics studies.
Short insertions and deletions (InDels) have been recognized as an abundant source of genetic markers that are widely spread across the genome, and there is an increasing focus on InDel polymorphisms in genomic and breeding research (Lv et al., 2013; Yamaki et al., 2013). Short sequence and homonucleotide repeats tend to accumulate InDels due to polymerase slippage during replication, and frameshift InDels in coding regions can result in loss of function or nonsense mutations (Rockah-Shmuel et al., 2013). InDel markers have been reported to be more polymorphic than SSRs in some crops (Liu et al., 2013; Wu et al., 2014). No research using InDel markers for trait association in peanut has been reported. Therefore, it is vital to develop InDel markers in peanut and to apply these markers to associate important traits, such as disease resistance. The objectives of this research were: (1) to develop gene-specific InDel markers; (2) to evaluate their potential use in genetic diversity studies of cultivated peanut; and (3) to identify novel InDel markers that are related to disease resistance traits.

Plant Materials and Phenotyping of TSWV and Leaf Spot
One hundred and eighteen peanut accessions from the USDA peanut germplasm collection in Griffin, GA were used in the study, of which 104 accessions were selected from the U.S. peanut mini-core collection and an additional 14 accessions were selected to represent two botanical types (hirsuta var. and aequatoriana var.) of cultivated peanut that are not present in the mini-core (Table 1). Twenty seeds of each of the 118 Arachis hypogaea accessions were planted at Dawson, GA (31°45′ latitude, −84°30′ longitude) in 2010, 2012, and 2013 under irrigated conditions. The genotypes were planted in two-row plots 3 m long with 0.91 m between rows at a seeding rate of 3 seed m⁻¹ in early May with three replications. Before planting, the field area was cultivated and irrigated with 15 mm of water to ensure adequate moisture for uniform seed germination. Crop management for all entries followed best management practices for soil nutrients, herbicides, and pesticides. For evaluation of TSWV resistance, all plots of each PI were visually rated immediately prior to digging for foliar symptoms on a percentage basis, similar to the 1-10 method described by Tillman et al. (2007), where 1 = no disease and 10 = all plants severely diseased. Disease evaluations for leaf spot resistance were conducted in the field under a reduced fungicide treatment, with one application of 1.5 pt/A chlorothalonil in 2010 and no fungicide application in 2012 and 2013. Plants were rated using the Florida leaf spot scoring system during flowering, 2 weeks before harvest, and immediately prior to harvest (Chiteka et al., 1988). Data were analyzed with PROC GLM in SAS (version 9.2; SAS Institute, 2009). Means were separated using Fisher's Protected LSD at p < 0.05.

Identification of InDels and Primer Design
Publicly available peanut expressed sequence tags (ESTs) derived from various tissues, developmental stages, and different biotic and abiotic stresses were utilized to identify potential InDel markers. Sequences were downloaded and aligned with Sequencher v5.1 (Gene Codes, Ann Arbor, MI). Individual clusters or contigs were visually inspected to identify potential InDels, and selected contigs were reassembled using the "large gap" criteria of the assembly algorithm, resulting in the identification of 48 InDels. Primers were designed using Primer Express 3.0 (Applied Biosystems, Foster City, CA) for amplicon sizes of 150-500 bp. Potential plant gene function was identified through BLASTx (NCBI) and comparison of the sequences with conserved sequences of functional genes. The procedure for identification of peanut EST InDels, primer design, and marker scoring is illustrated by a flowchart (Figure 1).
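The screening of aligned contigs for InDels described above can be mimicked programmatically. The following is a minimal Python sketch, not the procedure used in this study (which relied on Sequencher and visual inspection): it scans two already-aligned sequences, with "-" as the gap character, and reports each run of gaps as a candidate InDel. The function names find_indels and amplicon_in_range and the toy sequences are hypothetical; only the 150-500 bp amplicon window is taken from the text.

```python
from typing import List, Optional, Tuple


def find_indels(aligned_a: str, aligned_b: str,
                min_len: int = 1) -> List[Tuple[int, int, str]]:
    """Return (start_column, length, gapped_sequence) for each gap run.

    gapped_sequence is "a" when the gap is in aligned_a and "b" when it
    is in aligned_b. Columns are 0-based alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    indels: List[Tuple[int, int, str]] = []
    run_start = 0
    run_side: Optional[str] = None  # which sequence carries the current gap run

    for col, (a, b) in enumerate(zip(aligned_a, aligned_b)):
        if a == "-" and b != "-":
            side: Optional[str] = "a"
        elif b == "-" and a != "-":
            side = "b"
        else:
            side = None  # match, mismatch, or gap in both sequences
        if side != run_side:
            # A run just ended (or switched sides): record it if long enough.
            if run_side is not None and col - run_start >= min_len:
                indels.append((run_start, col - run_start, run_side))
            run_start, run_side = col, side

    # Close a gap run that extends to the end of the alignment.
    if run_side is not None and len(aligned_a) - run_start >= min_len:
        indels.append((run_start, len(aligned_a) - run_start, run_side))
    return indels


def amplicon_in_range(size_bp: int, lo: int = 150, hi: int = 500) -> bool:
    """Check a predicted amplicon size against the 150-500 bp design window."""
    return lo <= size_bp <= hi


# Toy example (hypothetical sequences): a 2-bp gap in the first sequence.
print(find_indels("ACGT--ACGT", "ACGTTTACGT"))  # [(4, 2, 'a')]
```

A pairwise scan keeps the sketch short; screening a real contig would need a multiple alignment and some rule for ignoring gaps at read ends, neither of which is attempted here.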

DNA Extraction and PCR
Genomic DNA was extracted from dry seeds following the method of Dang and Chen (2013). A NanoDrop 2000c spectrophotometer (NanoDrop Technologies, USA) was used to evaluate the quality and concentration of all DNA. DNA samples were diluted to 20 ng/µL, and PCR was run under the following conditions: 94°C for 1 min; 30 cycles of 94°C for 30 s, 50°C for 1.0 min, and 72°C for 1.5 min; and 1 cycle at 72°C for 10 min. PCR products and DNA molecular weight marker (Promega, Madison, WI) were separated on a 1.2% TAE-agarose gel and

Data Analysis
Polymorphism Information Content (PIC) based on allelic frequencies among 118 genotypes was calculated for each InDel marker using the following formula: $\mathrm{PIC} = 1 - \sum_i x_i^2$, where $x_i$ is the relative frequency of the $i$th allele of the marker locus. Clustering analyses were performed using SAS (SAS 9.3; SAS Institute, 2009) to calculate the genetic similarity matrices, and a neighbor-joining (NJ) algorithm (Saitou and Nei, 1987) was used to construct a phylogram from a distance matrix using the MEGA4 software (Tamura et al., 2007). The single marker analysis (SMA) method was used for trait-marker analysis (Jansen and Stam, 1994). It was carried out with PROC GLM of SAS (SAS 9.3; SAS Institute, 2009) using the following linear model: $Y_{iklm} = \mu + E_i + M_k + F(M)_{kl} + E \times F(M)_{ikl} + e_{iklm}$, where $Y_{iklm}$ is each observed phenotype, $\mu$ is the population mean, $E_i$ is the effect of year (i = 1, 2), $M_k$ is the effect of marker genotype (k = 1, 2), $F(M)_{kl}$ is the effect of PIs within marker genotype (l = 1, . . . , 118), $E \times F(M)_{ikl}$ is the interaction between the effect of year and the effect of PIs within marker genotype, and $e_{iklm}$ is the residual error. The threshold for declaring a marker significant was chosen to be marker-wise p < 0.0001, which is approximately equal to an experiment-wise p < 0.05 in this study based on 16 polymorphic markers.
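As an illustration of the PIC calculation, the following minimal Python sketch computes PIC in its commonly used form, 1 minus the sum of squared allele frequencies, from a list of allele calls. The function name pic and the allele labels are hypothetical, and the toy data are not taken from this study.

```python
from collections import Counter
from typing import Hashable, Iterable


def pic(alleles: Iterable[Hashable]) -> float:
    """PIC = 1 - sum(x_i ** 2), with x_i the relative frequency of allele i."""
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one allele call is required")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Toy example: three accessions carry allele "A" and one carries "B",
# so PIC = 1 - (0.75**2 + 0.25**2).
print(pic(["A", "A", "A", "B"]))  # 0.375
```

A monomorphic marker gives a PIC of 0, and under this formula a marker with two alleles cannot exceed 0.5, which it reaches when both alleles are equally frequent.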

Polymorphic Information of the InDel Markers and Genetic Diversity of the Different Botanical Types Based on InDel Markers
Forty-eight primer pairs for InDel markers were designed from coding and non-coding regions of the 48 functional genes (Table 2). All 48 primer pairs generated PCR bands, of which 16 were polymorphic, with sizes ranging from 200 to 470 bp (Figure 2). The polymorphic information content (PIC) values of the primers ranged from 0.0169 for InDel-03 to 0.5960 for InDel-18, with an average of 0.1349 (Table 3). The distribution of the 16 polymorphic InDel markers differed considerably among the six botanical types. More polymorphic markers were detected in hirsuta var., aequatoriana var., hypogaea var., and fastigiata var. than in peruviana var. and vulgaris var. (12, 9, 9, and 7 vs. 2 and 2, respectively) (Table 3). The least polymorphic marker was InDel-03, which was polymorphic only in hirsuta var., while InDel-16 and InDel-18 showed polymorphism in five of the six botanical types. With respect to the different botanical types, PICs varied from 0.176 for fastigiata var., 0.181 for hypogaea var., 0.306 for vulgaris var., 0.534 for aequatoriana var., 0.556 for peruviana var., to 0.660 for hirsuta var., which implied that hirsuta var., peruviana var., and aequatoriana var. have higher genetic diversity than the other types (Table 4).
In general, the accessions carrying the alleles of the markers had a low leaf spot rating or a low percentage of TSWV incidence (Table 5). For example, the 43 accessions with InDel-018 alleles had an average leaf spot rating of 2.9, while the 75 accessions without the alleles had an average of 4.1 (Table 5). Similar results were observed for TSWV, in which the accessions carrying the alleles of InDel-032 showed a low disease incidence (10.7%) compared to the accessions lacking the alleles (46.1%) (Table 5).
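The comparison of mean disease scores between accessions with and without a marker allele is the core of single marker analysis. The following Python sketch shows a deliberately simplified version: a one-way ANOVA F statistic for phenotypes grouped by marker genotype. It is an illustration under stated assumptions, not the model fitted in this study, which used PROC GLM with year and PI-within-marker terms. The function name one_way_f and the toy scores are hypothetical, and no p-value is computed.

```python
from statistics import mean
from typing import Sequence


def one_way_f(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F statistic for phenotypes grouped by marker genotype."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n
    # Variation explained by marker genotype (between groups).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation among accessions sharing a genotype (within groups).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy leaf spot scores (hypothetical): allele present vs. allele absent.
with_allele = [2.0, 3.0, 4.0]
without_allele = [4.0, 5.0, 6.0]
print(one_way_f([with_allele, without_allele]))  # 6.0
```

A large F indicates that marker genotype explains much of the phenotypic variation relative to the spread within genotype classes. Converting F to a p-value requires the F distribution with (k − 1, n − k) degrees of freedom, which a statistics package would supply.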

DISCUSSION
Difference in genetic pattern, or polymorphism, is a main criterion for evaluating the potential usefulness of DNA molecular markers. In the present study, the polymorphism of the InDel markers was 33.3%, which was higher than that previously reported for RAPD markers (6.6%; Subramanian et al., 2000), AFLP markers (3.6%; He and Prakash, 1997), EST-SSR markers (10.4%; Liang et al., 2009), and SSR markers (14.5%; Zhao et al., 2012), but lower than that of Start Codon Targeted polymorphism (SCoT) markers (38.2%; Xiong et al., 2011) (Table 6). In those reports, the numbers of accessions evaluated were much smaller than the 118 accessions used in this study. In general, the larger the number of accessions with diverse genetic backgrounds, the higher the accuracy of the estimated polymorphism associated with a particular trait. Therefore, the polymorphism reported for the InDel markers in this study can be useful in peanut breeding programs. Germplasm resources provide fundamental materials for peanut genetic improvement, and the study of genetic diversity in cultivated peanut will enhance the utilization of peanut genetic resources. The genetic diversity of the six botanical types of cultivated peanut has been extensively investigated using molecular markers. Based on SSR markers, Jiang et al. (2010) demonstrated that the accessions of fastigiata and hypogaea were more diversified than other botanical types. The genetic diversity of 72 accessions of the U.S. mini-core was estimated using 67 SSR primer pairs, and the results indicated that the PIC of SSR markers ranged from 0.063 to 0.918 and the gene diversity ranged from 0.027 to 0.50 (Kottapalli et al., 2007). In the present study, PICs varied from 0.176 for fastigiata var. to 0.660 for hirsuta var., and hirsuta var., peruviana var., and aequatoriana var. have higher genetic diversity than the other types, indicating that, like other
Unlike QTL analysis, which uses biparental recombinant inbred line (RIL) mapping populations to link markers with target traits, the marker-trait associations identified in the present study cannot be validated in different backgrounds. However, in another, parallel association mapping study, we extensively evaluated leaf spot and TSWV resistance in the U.S. mini-core collection and mapped three SSR markers, named "pPGPseq2D12B," "pPGSseq19B1," and "TC04F12," that were associated with both leaf spot and TSWV resistance. The marker "TC20B05" can explain 15% of the phenotypic variation in leaf spot resistance.
Regarding the application of MAS in peanut, molecular markers for only two traits are currently being utilized in breeding programs: nematode resistance and high oleic seed chemistry. Chu et al. (2011) demonstrated that MAS reduced the time required for plant selection at least 3-fold when pyramiding nematode resistance with the high oleic trait in peanut. This recent success was possible only because of the initial discovery of the genetic markers and the development of breeding lines. For example, the high oleic marker was identified by drawing on genes of fatty acid biosynthesis that underlie high oleic chemistry in other oilseed crops, which enabled a straightforward characterization in peanut and the discovery of similar functional mutations in breeding populations (Jung et al., 2000; Lopez et al., 2002). Nematode resistance was introgressed from wild species (Simpson and Starr, 2001), and resistant plants were selected based on the molecular markers available at the time (Nagy et al., 2010). The high oleic trait results from the expression of two recessive genes (Lopez et al., 2001), while nematode resistance was determined to result from the expression of two dominant genes (Garcia et al., 1996). For other traits, such as disease resistance or drought tolerance, the complex interaction between genetics and environment poses a daunting challenge for breeders selecting resistant plants. Because the InDel markers were developed from sequences of functional genes, they will lay the groundwork for the identification of genes related to superior agronomic traits, provide information on population genetic variation, and identify homologous genes for functional studies. Because InDel markers were found to be associated with leaf spot and TSWV resistance and showed a higher level of DNA polymorphism than most other molecular markers, they provide a very useful type of molecular marker for identifying other agronomically important traits in peanut.

ACKNOWLEDGMENTS
We are indebted to Brian Gamble and Larry Wells for devoted assistance with management of field experiment research plots at the Wiregrass Research and Extension Center, Auburn University, Headland, Alabama. The contributions and assistance of Sam Hilton, Joseph Powell, Kathy Gray, Lori Riles, Dan Todd, Robin Barfield, Staci Ingram, and Bill Edwards from the USDA-ARS National Peanut Research Laboratory are gratefully acknowledged. The author LL was sponsored by a grant from the 948 project (2013-Z65) and The Excellent Going Abroad Experts Training Program in Hebei, China to conduct this research at Auburn University.