Allele Mining and Selective Patterns of Pi9 Gene in a Set of Rice Landraces from India

Allelic variants of the broad-spectrum blast resistance gene Pi9 (nucleotide binding site-leucine-rich repeat region) were analyzed in Indian rice landraces. The landraces were selected from a list of 338 rice landraces phenotyped in the rice blast nursery at the Central Rainfed Upland Rice Research Station, Hazaribag. Six of them were further selected, on the basis of their resistance and susceptibility patterns, for virulence analysis and a study of the selective pattern of the Pi9 gene. Sequence analysis and a phylogenetic study showed that the sequences are highly homologous and cluster into two groups. All the blast-resistant Pi9 alleles were grouped into one cluster, whereas the Pi9 alleles of susceptible landraces formed another cluster, even though these landraces have a low level of DNA polymorphism. A total of 136 polymorphic sites comprising transitions, transversions, and insertions and deletions (InDels) were identified in the 2.9 kb sequence of the Pi9 alleles: 77 mutations (transitions + transversions) and 59 InDels. The results showed that the Pi9 alleles of the selected rice landraces were less variable, suggesting that these landraces have been exposed to fewer pathotypes across the country. A positive but non-significant Tajima's D (0.33580, P > 0.10) was observed among the seven rice landraces, which suggests balancing selection of the Pi9 alleles. The value of synonymous substitution (−0.43337) was less than that of non-synonymous substitution (0.78808). The greater non-synonymous than synonymous substitution means that the coding region, mainly the leucine-rich repeat domain, was under diversifying selection.
In this study, the Pi9 gene has been subjected to balancing selection with low nucleotide diversity, which differs from earlier reports. This may be because the rice landraces are closely related, cultivated in the same region, and under low pathotype pressure.


INTRODUCTION
Rice blast (Magnaporthe oryzae), the most serious disease of rice, causes significant yield loss globally, and the complexity of pathogen, host, and microclimate has a profound effect on it (Valent, 1990; Teng et al., 1991; Kwon and Lee, 2002; Li et al., 2007). The blast fungus reproduces both sexually and asexually, which has resulted in the evolution of different variants under field conditions (Xia et al., 1993). The high adaptation frequency and variation of M. oryzae lead to the emergence of new races of the fungus, which in turn leads to the breakdown of resistance in newly released rice cultivars in the field (Kiyosawa et al., 1986; MacKill and Bonman, 1992; Valent and Chumley, 1994; Han et al., 2001). Under normal field conditions, incomplete (field) resistance to blast disease is a better option for the effective control of M. oryzae (Liu et al., 2005). The host-pathogen interaction can be better understood through the identification and characterization of both R and Avr genes, and through certain enzyme-based studies on technological improvements using combined approaches (Imam et al., 2014a,d, 2015a, 2016; Baweja et al., 2016; Kumar et al., 2016). To date, many blast resistance genes and QTLs have been identified and cloned (Sharma et al., 2005; Imam et al., 2014b, 2015b). Most of the rice blast resistance genes cloned to date encode nucleotide binding site-leucine-rich repeat (NBS-LRR) proteins, which suggests a common route involving a familiar resistance pathway to counter blast infections (Hammond-Kosack and Jones, 1997; Takken and Tameling, 2009; Liu et al., 2010; Imam et al., 2013b).
Plant breeders generally transfer valuable alleles found in rice germplasm to improve high-yielding varieties (Kumar et al., 2010). Natural mutations such as transitions, transversions, point mutations, and insertions and deletions (InDels) are the main driving force for the generation and evolution of new alleles. With the availability of enormous database information, desired and superior alleles can be easily identified and retrieved (Kumar et al., 2010). The potential application of the allele mining approach lies in the identification of new haplotypes and the study of evolution patterns, which helps rice improvement programs (Kumar et al., 2010). TILLING (Targeting Induced Local Lesions in Genomes) and sequence-based allele mining are the two main approaches for studying sequence polymorphism in natural populations of germplasm (Till et al., 2003; Kumar et al., 2010). Allele mining has emerged as an important approach for cloning and characterizing new and better forms of disease resistance genes. Isolation of orthologs provides insights into the evolutionary forces shaping their development, which helps identify better alleles for future experiments (Ashkani et al., 2015). Because of its facile nature, this approach is being used extensively to identify alleles of agriculturally important traits. Wild as well as cultivated rice varieties have been studied for blast resistance genes by the allele mining approach (Geng et al., 2008; Huang et al., 2008; Yang et al., 2008). An extensive study of the Pi-ta locus was described from wild species of rice, cultivated (AA) and wild species, and invasive weedy rice (Lee et al., 2009, 2011). Another gene, Pid3, studied in 36 rice lines of both cultivated and wild species, indicated pseudogenization of Pid3 in japonica cultivars (Shang et al., 2009). Liu et al. (2011) reported divergent selection at the Pi9 locus cloned from cultivated and wild species of rice.
Most of the above-mentioned blast genes were studied through the sequence-based allele mining approach.
Pi9, Pi2, and Piz-t, three paralogs of the blast resistance gene family, are now well characterized and their resistance mechanism is known (Zhou et al., 2006). MacKill and Bonman (1992) discussed the origin of the Piz-t and Pi2 genes from indica cultivars, while Oryza minuta, a wild rice, is the source of the Pi9 gene, for which high positive selection in the LRR region has been reported (Amante-Bordeos et al., 1992; Zhou et al., 2007; Dai et al., 2010). The Pi9 locus contains at least six known resistance genes specific to the fungal pathogen M. oryzae, and three R-genes from this locus (Pi9, Pi2, and Piz-t) have been cloned (Zhou et al., 2006). The resistance specificities of different broad-spectrum rice blast resistance genes have been shown to differ from one another; these differences arise mainly from different evolutionary changes in the NBS-LRR genes, which also help in adaptation to ever-changing pathogen populations (Bergelson et al., 2001; Shen et al., 2006; Liu et al., 2007, 2013; Yang et al., 2008). Knowledge of the variation patterns of R-genes is important for the preservation of resistant germplasm. Yang et al. (2006) carried out a genome-wide allelic analysis of R-genes between two rice cultivars and categorized the variation patterns into four types, namely type I, type II, type III, and type IV, ranging from conserved to presence/absence genes. Types I and II play the main role in R-gene allelic polymorphism and resistance specificity; this gives rise to rapid evolution in these blast resistance genes and enables them to adapt to the ever-changing pathogen population (Bergelson et al., 2001; Shen et al., 2006; Yang et al., 2008). The single-copy-gene-dominant group (type I) showed the lowest diversity (<0.005); the clustered-gene-dominant group (type III) had a high level of diversity; type II was intermediate; and type IV comprised R-genes present or absent in one of the genomes (P/A R-genes).
Different blast resistance genes show different levels of polymorphism and diversity. The LRR regions of the Pi54 and Piz-t genes are more prone to change, leading to positive directional selection, compared with the LRR regions encoded by the Pi-km1 and Pi-km2 blast resistance genes, which are highly conserved (Ashikawa et al., 2010; Thakur et al., 2013). It is interesting to note that LRRs interact directly with effector proteins (Young and Innes, 2006). Higher levels of polymorphism were observed in the LRR region of Pi54, which helps in effector recognition, and the evolutionary pressure exerted by virulent M. oryzae races results in variation in the LRR domain (Thakur et al., 2015).
Allelic polymorphisms in R-genes are mainly driven by balancing selection and positive selection. Positive selection is the main evolutionary force that maintains polymorphism in the R-gene families of plants (Bent and Mackey, 2007; Zhou et al., 2007). The genome structure at the Pi9 locus is highly conserved, but the LRR region shows high sequence variation, giving rise to positive selection for Pi9 genes among the rice germplasm (Zhou et al., 2007; Dai et al., 2010; Liu et al., 2010). Previous data on the prevalence of high pathotypic diversity in the M. oryzae population from Eastern India, and the very rare compatibility of the Pi9 gene with isolates from this region in field evaluations, prompted this study (Variar et al., 2009). Among the multiple genes near the waxy gene locus on chromosome 6, Pi9 was more effective than Pi2 (Piz-t) in the preceding investigation (Ballini et al., 2008; Imam et al., 2014b). Our recent studies on the interaction between rice and M. oryzae isolates collected from India revealed compatible and incompatible interactions between the R genes and the corresponding Avr genes (Imam et al., 2013b, 2014a, 2015a). Although resistance mediated by a single R-gene can be easily overcome by emerging virulence, some cultivars with major resistance genes have stayed resistant for an extended time without loss of resistance (Khush and Jena, 2009). A likely rationale for the durability of Pi9-mediated resistance to blast is that the gene confers broad-spectrum resistance to diverse isolates. The germplasm harboring the Pi9 gene identified in the earlier study, originating from different Eastern Indian locations, exhibited excellent resistance to several isolates from the region, which makes it appealing to hypothesize that the gene is effective and durable (Imam et al., 2014c).
To better understand the genetic polymorphism and molecular evolution of the Pi9 alleles, we analyzed the 2.9 kb region of the Pi9 gene in six accessions of cultivated rice landraces. The present investigation was therefore taken up for allele mining of the NBS-LRR region of the Pi9 gene from rice landraces, to better understand its sequence polymorphisms and their relevance to resistance and susceptibility patterns. The objectives of this study were (1) to isolate alleles of the Pi9 blast resistance gene, (2) to understand the nucleotide diversity in the NBS-LRR region of the Pi9 gene, and (3) to analyze the molecular evolution and patterns of selection in this region.

Plant Materials
A selected set of 338 rice landrace accessions that were re-evaluated in the uniform blast nursery (UBN) was considered for allele mining of the Pi9 gene. Of the 338, seven accessions were taken for further analysis and allele mining (Table 1). Selection for allele mining was based on the presence of the Pi9 gene, as detected by an STS marker, and on disease score. A pair of dominant STS markers, 195R-1 (5′-ATGGTCCTTTATCTTTATTG-3′) and 195F-1 (5′-TTGCTCCATCTCCTCTGTT-3′), derived from the Nbs2-Pi9 candidate gene was used to check for the presence of the Pi9 gene in the rice landraces in this study. Of the seven, one is the isogenic line for the Pi9 gene, three were resistant, and the remaining three were susceptible to blast disease.

Phenotype Evaluation of Landraces
A mixture of virulent isolates (Mo-ei-66, Mo-ei-79, Mo-ei-119, and Mo-ei-202) was used as inoculum for phenotyping the selected rice landraces (Imam et al., 2013a). Oat Meal Agar (HiMedia, India) medium was used to maintain the fungal culture of each isolate. Mathur's medium was used for sporulation and multiplication of fungal spores. These cultures were kept at 22 °C for 12-16 days under constant illumination from white fluorescent light (55 µF/Em/s) (Thakur et al., 2013). Conidia were dislodged from the conidiophores to prepare the spore suspension, and the inoculum was adjusted to approximately 10⁵ spores/ml. Leaf-stage seedlings (2-3 in number) in replicated sets were spray-inoculated with 1 ml of mixed spore suspension and then kept in darkness at 27 °C and over 90% relative humidity for 24 h. In this experiment, the positive control for the Pi9 gene (IRBL9-w) and the rice landraces were grown in plastic pots and maintained in the phenotyping facility. After inoculation with the mixed fungal cultures, the rice seedlings were maintained in the phenotyping chamber at the desired temperature and humidity. Virulence was analyzed on the basis of reaction type using a standard 0-5 evaluation scale. Resistance was scored when there was no visible infection and no conidia were produced from inoculated tissue (scores 0, 1, 2), while susceptibility was scored for sporulating lesions >3 mm in length (scores 3, 4, 5).

PCR and Sequencing
Overlapping oligos were designed using Primer 3 software to amplify the 2.9 kb NBS-LRR region of the Pi9 gene (DQ285630) using the primer walking technique (Thakur et al., 2015). A total of five primer pairs were designed to amplify the 2.9 kb region (Table 2). PCR was carried out on DNA isolated from the isogenic line IRBL9-w and six rice landraces using Q5 high-fidelity DNA polymerase (New England Biolabs, Life Technologies, USA) to amplify the full-length allele with high fidelity; thermal cycling began with an initial DNA denaturation at 95 °C. The amplified PCR products were then sequenced and assembled. Phred/Phrap/Consed software (Ewing and Green, 1998) was used to assemble multiple reads of different fragments into the full-length allele. Good-quality consensus sequence (Phred score > 30) was used for data analysis.
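The Phred threshold mentioned above can be made concrete with the standard definition Q = −10 log10(P), where P is the base-call error probability. The following is only a minimal sketch: the function names and the masking step are illustrative assumptions and are not part of the Phred/Phrap/Consed pipeline used in the study.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score Q to a base-call error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert a base-call error probability to a Phred quality score."""
    return -10 * math.log10(p)


def mask_low_quality(bases, quals, min_q=30, mask="N"):
    """Replace bases whose quality is not above min_q with a mask character.

    Hypothetical helper: keeps only calls with Q > min_q, mirroring the
    'Phred > 30' criterion stated in the text.
    """
    return "".join(b if q > min_q else mask for b, q in zip(bases, quals))


if __name__ == "__main__":
    # Q30 corresponds to roughly one expected error per 1000 base calls
    print(phred_to_error_prob(30))
    print(mask_low_quality("ACGT", [40, 30, 35, 10]))
```

Under this definition, a cut-off of Q30 means that each retained base has an expected error rate of at most about 0.1%.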

Sequence Analysis
Alignment of the assembled sequences and manual editing of the blast resistance gene Pi9 were done with ClustalW (Thompson et al., 1994) and BioEdit software version 7.0.9.0. The Pi9 gene sequence (DQ285630) was used as a reference for the prediction of gene coding regions using FGENESH. The functional domain(s) that play an important role in mediating resistance were predicted using the online tools Pfam and SMART. Phylogenetic analysis was performed with MEGA 4.0 (Tamura et al., 2007) using the Neighbor-Joining method (Saitou and Nei, 1987). All positions containing gaps and missing data were eliminated from the dataset (complete deletion option).
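The complete deletion option and a pairwise identity calculation can be sketched as follows. This is an illustration under stated assumptions, not the MEGA or BioEdit implementation: the function names are hypothetical, the input is assumed to be an alignment of equal-length strings, and identity is taken simply as the fraction of matching positions.

```python
def complete_deletion(alignment, missing="-N?"):
    """Drop every alignment column in which any sequence has a gap or missing base."""
    keep = [
        i for i in range(len(alignment[0]))
        if all(seq[i].upper() not in missing for seq in alignment)
    ]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def pairwise_identity(a, b):
    """Fraction of positions at which two aligned, equal-length sequences match."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


if __name__ == "__main__":
    # Toy alignment (not data from the study)
    aln = ["ACGT-A", "ACGTTA", "ACCTTA"]
    clean = complete_deletion(aln)
    print(clean)
    print(pairwise_identity(clean[0], clean[2]))
```

Removing gapped columns before comparison means that InDels do not contribute to the identity or diversity values computed from the remaining sites.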

Nucleotide Polymorphisms Analysis
Nucleotide polymorphism analysis of the aligned DNA sequences, including Tajima's D test, was done with the DnaSP 5.10 program (Rozas et al., 2003). BioEdit software was used to calculate pairwise identity at the DNA level. Synonymous and non-synonymous substitution (πsyn and πnon) were calculated to examine selection at the NBS-LRR region of the Pi9 gene.
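For readers unfamiliar with these statistics, the quantities named above can be sketched directly from their textbook definitions (Tajima, 1989). The code below is a minimal illustration, not the DnaSP implementation: the function name is hypothetical, the input is assumed to be gap-free sequences of equal length, and at least four sequences are assumed so that the variance term is well defined.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return S, k, pi, Watterson's theta (per site) and Tajima's D."""
    n, length = len(seqs), len(seqs[0])
    # S: number of segregating (polymorphic) sites
    S = sum(len({s[i] for s in seqs}) > 1 for i in range(length))
    # k: average number of nucleotide differences over all sequence pairs
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    stats = {"S": S, "k": k, "pi": k / length,
             "theta_w": S / a1 / length, "D": None}
    if S > 0:  # D is undefined when there are no segregating sites
        b1 = (n + 1) / (3 * (n - 1))
        b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
        c1 = b1 - 1 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1, e2 = c1 / a1, c2 / (a1 ** 2 + a2)
        stats["D"] = (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return stats


if __name__ == "__main__":
    # Toy data (not from the study): every variant is a singleton
    print(diversity_stats(["AAAA", "AAAT", "AATA", "ATAA"]))
```

D is positive when k exceeds S/a1, that is, when variants at intermediate frequency are in excess and rare variants are scarce; it is negative when rare variants dominate, as in the toy data above.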

Selection of Rice Landraces and Virulence Analysis
On the basis of pathotyping of 338 rice landraces at UBN, Hazaribag, six landraces, three resistant and three susceptible, were selected for allele mining of the NBS-LRR region of the Pi9 gene (Table 1). To further confirm their resistance and susceptibility, these rice landraces, along with the isogenic line for the Pi9 gene, IRBL9-w (control), were phenotyped with the mixture of virulent isolates described earlier (Imam et al., 2013a). The virulence analysis showed that, of the six landraces, three were consistently resistant while three were susceptible to the mixture of virulent M. oryzae isolates (Figure 1). IRBL9-w, the isogenic line for the Pi9 gene, also gave a resistant reaction.

Sequence Characterization of the Pi9 Alleles
To determine the nucleotide diversity at the Pi9 allele, a 2.9 kb fragment was amplified from all seven rice landraces by the primer walking technique and sequenced (Figure 1). Only high-quality reads of the sequenced fragments were selected for analysis. About 99% (98%-100%) homology between the sequences was observed after pairwise alignment at the DNA level. Low variation, in the form of 77 mutations (transitions + transversions) and 59 InDels, was observed in the Pi9 alleles isolated from the selected rice landraces. A phylogenetic tree was constructed from the nucleotide sequences of the seven rice landraces and one reference Pi9 sequence (DQ285630) (Figures 2 and 3). Phylogenetic analysis resulted in two groups, which clearly demonstrates the homology between the sequences. All the blast-resistant Pi9 alleles were grouped into one cluster, whereas the Pi9 alleles of susceptible landraces formed another cluster, even though these landraces have a low level of DNA polymorphism.
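The split of polymorphic sites into transitions, transversions, and InDels can be sketched as a per-column classification of the alignment. This is a simplified illustration with assumptions that may differ from the counting used in the study: each gapped column is counted as one InDel site (a multi-base InDel event would span several columns), columns with bases from both purines and pyrimidines are counted as transversions, and all names are hypothetical.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_site(column):
    """Classify one alignment column as invariant, indel, transition or transversion."""
    bases = {b.upper() for b in column}
    if "-" in bases:
        return "indel"
    if len(bases) == 1:
        return "invariant"
    # A <-> G or C <-> T changes stay within one chemical class
    if bases <= PURINES or bases <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_polymorphisms(alignment):
    """Count polymorphic columns of each kind in a list of aligned sequences."""
    counts = Counter(classify_site(col) for col in zip(*alignment))
    counts.pop("invariant", None)
    return dict(counts)


if __name__ == "__main__":
    # Toy alignment (not data from the study)
    print(count_polymorphisms(["ACGT-A", "GCTTTA", "ACGCTA"]))
```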

Nucleotide Polymorphism of the Pi9 Alleles
A total of 75 polymorphic sites were identified in the 2.9 kb sequence among all the Pi9 alleles using the DnaSP program. The average pairwise nucleotide diversity (π) and the silent Watterson's nucleotide diversity estimator (θw) over the Pi9 alleles were 0.01103 and 0.01011, respectively. The average number of nucleotide differences, k, was 31.536, and θ (per site) from Eta was 0.01038. Compared with earlier published results, the nucleotide diversity of the Pi9 alleles was low. The results showed that the Pi9 alleles of the selected rice landraces were less variable, suggesting that these rice landraces have been exposed to fewer pathotypes across the country. The LRR region showed higher average nucleotide diversity than the NBS region, which clearly suggests the importance of the LRR domain in the variation of the Pi9 alleles.

Selection of Pi9 Alleles
We used Tajima's D test of neutrality to examine the evolutionary selection dynamics of the Pi9 alleles in the rice landraces. Among the seven rice landraces, a positive Tajima's D (0.33580) was observed, which signifies balancing selection among them and differs from earlier results (Tajima, 1989). The presence of fewer rare alleles may be the reason for the positive Tajima's D. Average rates of synonymous and non-synonymous substitution (πsyn and πnon) were calculated to examine the selective patterns of the Pi9 gene in the rice landraces. The synonymous (πsyn) and non-synonymous (πnon) substitutions in the coding region as a whole were calculated for all seven Pi9 alleles. In the coding region, the value of synonymous substitution (−0.43337) was less than that of non-synonymous substitution (0.78808). The greater non-synonymous than synonymous substitution means that the coding region, mainly the LRR domain, was under diversifying selection. The Tajima's D ratio (Non-syn/Syn) was −1.81851 (<1), suggesting a low level of polymorphism in the coding regions of the rice landraces. A haplotype distribution analysis was done for all seven alleles to study mutations and polymorphisms. The study of sequence polymorphism led to the identification of a total of five haplotypes (Table 3).
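Haplotype identification of the kind summarized above amounts to grouping identical allele sequences. The sketch below is illustrative only: the accession names and sequences are placeholders rather than data from Table 3, the function names are hypothetical, and haplotype diversity is computed with the usual estimator Hd = n/(n − 1) × (1 − Σp²).

```python
from collections import defaultdict


def haplotypes(named_seqs):
    """Group accession names by identical sequence; largest groups first."""
    groups = defaultdict(list)
    for name, seq in named_seqs.items():
        groups[seq.upper()].append(name)
    return sorted(groups.values(), key=lambda g: (-len(g), g))


def haplotype_diversity(named_seqs):
    """Haplotype (gene) diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(named_seqs)
    freqs = [len(g) / n for g in haplotypes(named_seqs)]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


if __name__ == "__main__":
    # Placeholder alleles, not sequences from the study
    alleles = {"acc1": "ACGT", "acc2": "ACGT", "acc3": "ACGA", "acc4": "TCGT"}
    print(haplotypes(alleles))
    print(haplotype_diversity(alleles))
```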

DISCUSSION
The analysis of allelic variants of disease resistance genes provides essential information on the generation and specificity of novel resistance genes. Earlier reports showed both higher and lower levels of sequence diversity at different R-gene loci. In this study, polymorphism of the Pi9 allele was investigated in seven rice landraces. An earlier study on the prevalence of high pathotypic diversity in the M. oryzae population of Eastern India, and the very rare compatibility of the Pi9 gene with isolates from this region in field evaluations, led us to consider the Pi9 gene further in rice landraces (Variar et al., 2009). Our earlier virulence analysis of 72 M. oryzae isolates against 26 differential varieties revealed that matching virulence to all monogenic differentials carrying different resistance genes was present in the pathogen population, although the resistant check Tetep was resistant to all of them. The frequency of virulence on different monogenic lines ranged from 4.5 to 73%. Very low frequencies of isolates virulent on Pi9 (4.5%) and Piz-5 (Pi-2) (7%), followed by Pita-2 (16 and 18.2%), were reported (Alam et al., 2015). Therefore, complementary resistance spectra that exclude all the pathotypes of the pathogen are required for strategic resistance gene deployment. The Pi9 and Pita-2 genes exhibited complementary resistance spectra and excluded all the pathotypes of the pathogen. Pi9 was therefore taken into consideration for further study and analysis. The present results showed that the alleles of the rice landraces were mostly identical at the DNA sequence level, which further suggests a high level of conservation among Pi9 rice germplasms. A total of 136 polymorphic sites comprising transitions, transversions, and InDels were identified in the 2.9 kb sequence of the Pi9 alleles. Simple InDels and single nucleotide polymorphisms (SNPs) play a very important role in R-gene evolution (Shen et al., 2006).
A single nucleotide difference in the regulatory region of the Pi54 locus distinguishes the resistant phenotype from the susceptible one (Sharma et al., 2005). The Pita gene shows its resistance pattern when physically linked to a region called the superlocus (Jia and Martin, 2008; Lee et al., 2009). In cereal genomes, more SNPs are detected in non-coding regions (one in 100-600 bp) (Gupta and Rustogi, 2004). Similarly, between O. sativa and O. rufipogon, a 26 kb region of DNA sequence showed higher variation (Rakshit et al., 2007). Our results clearly showed 99% similarity and low polymorphism at the DNA level among all seven rice landraces; however, the presence of SNPs makes some regions slightly variable. The low polymorphism in the DNA sequences of the rice landraces indicates that these landraces are closely related and were exposed to fewer pathotypes. On the basis of a genome-wide analysis of allelic diversity in R-genes of the rice genome, four classes of diversification of R-genes have been described. The present study with seven rice landraces indicated that the Pi9 allele belongs to the type II category, since it was neither highly conserved nor highly diverse, even though it has low-diversity alleles, similar to another blast resistance gene, Pi54 (Thakur et al., 2015). Different studies have shown that the rapid evolution of R-genes is driven by a high level of diversification (Type III and Type IV) and polymorphism (Shen et al., 2006; Yang et al., 2008; Liu et al., 2011). Pairwise allelic diversity, genomic organization, and the genealogical relationships among different genes have been the criteria used to characterize the variation patterns, resulting in their categorization into four types.
Our studies also demonstrated a similar classification into conserved (Type I; π < 0.5%), intermediate-diversified (Type II; π = 0.5-5%), highly diversified (Type III; π > 5%), and present/absent genes (Type IV), as in previously published reports (Liu et al., 2011). An earlier study by Liu et al. (2011) suggested that both human and natural selection played a major role in the evolutionary divergence of the Pi9 gene after the differentiation of rice species. The allelic variation among the rice germplasm in the NBS-LRR region has increased our understanding of variation patterns. Earlier studies showed that the variation level of R-genes was generally constant among rice germplasms. It is now believed that polymorphism content correlates directly with evolutionary change (Shen et al., 2006; Yang et al., 2008). For R-gene resistance specificity, the LRR region is the major determinant and is largely responsible for the variation in NBS-LRR genes (Collier and Moffett, 2009). It is also interesting to note that among and within Oryza species (wild and cultivated rice), the LRR region showed more sequence variation than the NBS region (Liu et al., 2011). Since the Pi9 alleles showed a Type II, intermediate level of polymorphism, their evolution has been slow to intermediate over time (Ding et al., 2007; Yang et al., 2008). The present study suggests an intermediate level of polymorphism in the Pi9 alleles, which may be due to the mixed evolutionary pressure experienced by the gene during co-evolution with the rice blast pathogen. In another study, the Pita alleles of cultivated rice showed the lowest rate of diversification among the rice species examined (Lee et al., 2009). Low nucleotide variation was observed in the coding region (0.00067) of Pita alleles in US weedy rice compared with non-coding regions (0.00161) (Lee et al., 2011). Interestingly, the phylogenetic analysis showed that resistant and susceptible Pi9 alleles grouped into separate clusters.
This is in line with earlier findings for Pi9 alleles, in which cultivated rice along with its ancestors clustered into one group and African cultivated rice along with its ancestors grouped into a separate cluster, suggesting that the same selection pressure has acted on the two groups during domestication and/or natural selection (Liu et al., 2011). Thakur et al. (2013) also demonstrated the grouping of resistant and susceptible Piz-t alleles into two sub-clusters. In R-gene evolution and the development of resistance specificity, the LRR region plays the major role (Collier and Moffett, 2009). The present results also showed a high level of sequence variation in the LRR region among the rice landraces.
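The diversity classes used in this discussion reduce to simple thresholds on π. The sketch below assumes class boundaries of 0.5% and 5%, with Type II inclusive of both ends; the function name is hypothetical, and Type IV (presence/absence genes) is left out because it cannot be decided from π alone.

```python
def diversity_class(pi):
    """Assign an R-gene diversity class from nucleotide diversity pi (a proportion).

    Assumed boundaries: Type I below 0.5%, Type II from 0.5% to 5%,
    Type III above 5%.
    """
    if pi < 0.005:
        return "Type I"
    if pi <= 0.05:
        return "Type II"
    return "Type III"


if __name__ == "__main__":
    # pi reported for the Pi9 alleles in this study
    print(diversity_class(0.01103))
```

With the π of 0.01103 (about 1.1%) reported for the Pi9 alleles, these thresholds place the gene in the intermediate class, consistent with the Type II assignment made in the text.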

CONCLUSION
In R-gene evolution, balancing as well as positive selection has been observed, and different tests are used to estimate the selection pressure that drives the evolution of R-genes (Hudson et al., 1987; Tajima, 1989; McDonald and Kreitman, 1991). In this study, the Pi9 gene appears to be under balancing selection because of the small positive Tajima's D value, which differs from the earlier report of Liu et al. (2011) showing that the Pi9 gene is under positive selection. The reason for the positive Tajima's D value was the low nucleotide diversity within the rice germplasm.
This may be because the rice landraces are closely related, cultivated in the same region, and under low pathotype pressure. The Tajima's D ratio (Non-syn/Syn) is indicative of the selection pressure acting on protein-coding genes. Both balancing and purifying selection have been observed in the evolution of R-genes (Thakur et al., 2013).