Novel Alleles of Phosphorus-Starvation Tolerance 1 Gene (PSTOL1) from Oryza rufipogon Confers High Phosphorus Uptake Efficiency

Limited phosphorus availability in the soil is one of the major constraints to the growth and productivity of rice across Asian, African, and South American countries, where 50% of the rice is grown under rain-fed systems on poor and problematic soils. To identify novel alleles for enhanced phosphorus uptake efficiency in the wild rice species Oryza rufipogon, we investigated the phosphorus uptake1 (Pup1) locus with 11 previously reported SSR markers and sequence-characterized the phosphorus-starvation tolerance 1 (PSTOL1) gene. We screened 182 accessions of O. rufipogon, along with Vandana as a positive control, with the SSR markers. All of the O. rufipogon accessions in this study carried the 90 kb insertion, including Pup1-K46, a diagnostic marker for PSTOL1; this marker was absent from O. sativa cv. PR114, PR121, and PR122. The complete PSTOL1 gene was also sequenced in 67 representative accessions of O. rufipogon and in Vandana as a positive control. Comparative sequence analysis revealed 53 mutations (52 SNPs and 1 nonsense mutation) in the PSTOL1 coding region, of which 28 were missense mutations and 10 corresponded to changes in amino acid polarity. These 53 mutations correspond to 17 haplotypes, of which 6 were shared and 11 were scored only once. A major shared haplotype was observed among 44 accessions of O. rufipogon along with Vandana and Kasalath. Of the 17 haplotypes, accessions representing 8 haplotypes were grown under phosphorus-deficient conditions in hydroponics for 60 days. Root length and weight differed significantly among all the genotypes when grown under phosphorus-deficient conditions as compared to phosphorus-sufficient conditions. The O. rufipogon accession IRGC 106506 from Laos performed significantly better, with 2.5 times higher root weight and phosphorus content than the positive control Vandana. In terms of phosphorus uptake efficiency, the O. rufipogon accessions IRGC 104639, 104712, and 105569 also showed nearly two times higher phosphorus content than Vandana. Thus, these O. rufipogon accessions could be used as potential donors for improving the phosphorus uptake efficiency of elite rice cultivars.

INTRODUCTION
Rice (Oryza sativa L.), one of the major staple food crops in the world, is critical to food security for billions of people. Calories from rice are particularly important in Asia, especially among the poor, where rice accounts for 50-80% of the daily calorie intake (http://www.gramene.org/). The demand for rice in India is projected to rise to 121.2 million tons by 2030, 129.6 million tons by 2040, and 137.3 million tons by 2050, as compared to the 90-104 million tons being produced currently (http://www.crri.nic.in/ebook_crrivision2050_final_16Jan13.pdf). This indicates that rice production needs to be increased by 32% in the next 33 years to meet the internal consumption of India. Given that the area under rice is shrinking at a rate of 0.15% per year, utilizing poor and problematic soils is one of the most promising ways to sustain the required yield.
Rice requires phosphorus to survive and thrive. Phosphorus (P) is a key element in plant metabolism, root growth, maturity, and yield. P deficiency leads to various physiological disorders in rice, such as stunted growth, reduced tillering, thin and spindly stems, and a reduced number of grains per panicle (http://www.Knowledgebank.irri.org/phosphorus-deficiency), and ultimately reduces the yield of rice plants. In Asia, 60% of the rain-fed lowland rice is produced on poor and problem soils that are naturally low in phosphorus or P fixing (Gamuyao et al., 2012). Phosphorus deficiency is widespread in Bangladesh, India, Indonesia, Nepal, Pakistan, South China, and Vietnam (Wissuwa and Ae, 2001; Haefele and Hijmans, 2009). In India, nearly 61.02% of the soils are low in available P, while 25.89% and 13.09% are medium and high in available P content, respectively (Hasan, 1996; Muralidharudu et al., 2011). The problem is compounded because phosphatic fertilizers are derived from a non-renewable resource. The indigenous deposits of rock phosphate are barely able to meet 10% of the phosphate fertilizer demand in India; for the remaining 90%, India depends on imports of raw materials and processed phosphatic fertilizer products (Sharma and Thaker, 2011). Large quantities of finished fertilizer products are imported into India every year, along with raw materials and intermediates for producing different fertilizers indigenously. In 2000-01, imports of finished products (on an N + P2O5 + K2O nutrient basis) were 2.194 million tons, which rose to 12.208 million tons in 2010-11 (Majumdar et al., 2013). In addition, about 5 million tons of rock phosphate and 2 million tons of phosphoric acid are imported every year. The availability of rock phosphate from domestic sources is about 1.86 million tons (Majumdar et al., 2013), which is nearly one-seventh of the total demand. Further, the annual outgo on fertilizer subsidy during 2013-14 was Rs. 71,251 crores, of which Rs. 29,427 crores went to phosphatic and potassic fertilizers. Therefore, the development of rice varieties with sustainable productivity on problematic soils is a valid approach toward reducing the economic burden on the country.
The wild species germplasm of rice constitutes the most important genetic resource for rice improvement. Rice belongs to the genus Oryza and tribe Oryzeae of the family Gramineae (Poaceae). The genus Oryza contains 24 recognized species, of which 22 are wild species (Vaughan et al., 2003). The wild species have either 2n = 24 or 2n = 48 chromosomes representing the AA, BB, CC, BBCC, CCDD, EE, FF, GG, and HHJJ genomes (Brar and Khush, 2003). Several genes and QTLs have been mined from wild species of rice for resistance to biotic and abiotic stresses and for enhancing the productivity of modern cultivars (Khush et al., 1977; Xiao et al., 1996; Moncada et al., 2001; Aluko et al., 2004; Linh et al., 2008; Rangel et al., 2008; Chen et al., 2009; Khush, 2013). In rice, low-Pi tolerance is naturally present in wild germplasm and landraces and could be used to improve phosphorus acquisition efficiency (PAE) and phosphorus use efficiency (PUE) in modern varieties (Gamuyao et al., 2012). A major QTL for P-deficiency tolerance (Pup1) was mapped on chromosome 12 from the aus-type rice variety Kasalath, explaining 70% of the variance (Wissuwa et al., 2002). Among the various markers developed by Chin et al. (2010) for marker-assisted breeding of phosphorus uptake efficiency, only OsPupK46-2 was found to be associated with the trait; it was later named the phosphorus-starvation tolerance 1 (PSTOL1) gene by Gamuyao et al. (2012). This gene is absent from the rice reference genome (Nipponbare) and from the genomes of other indica varieties that are susceptible to phosphorus deficiency (Wissuwa et al., 2002). PSTOL1 acts as an enhancer of early root growth and promotes phosphorus uptake (Gamuyao et al., 2012). Therefore, it is highly desirable to explore, utilize, and transfer new alleles of the PSTOL1 gene to elite cultivars to improve their yield under low-phosphorus soil conditions.
Only a few reports are available on the allelic diversity present in the wild species germplasm of rice for the PSTOL1 gene (Pariasca-Tanaka et al., 2014; Vigueira et al., 2016). Moreover, breeding programs worldwide for improving phosphorus uptake are focused on the transfer of the PSTOL1 gene from Kasalath (aus type) and African rice (O. glaberrima Steud), leading to a narrowing of genetic variability. In order to deploy novel genes/alleles for improving phosphorus uptake efficiency, our primary objective was to investigate Oryza rufipogon accessions for allelic diversity at PSTOL1, to validate the alleles under phosphorus-deficient conditions, and subsequently to transfer them to elite indica rice cultivars.

Sequencing of PSTOL1 Gene in O. rufipogon Accessions
The Oryza sativa cv. Kasalath sequence (Accession AB458444.1) from position 275,525-276,499 bp, covering the 975 bp CDS region of PSTOL1, was used for designing sequencing primers (PSTOL1 forward: 5′-ATAGCAGGCATTTCTGGCTCA-3′ and PSTOL1 reverse: 5′-CCATGACAGCTGATTGCCTT-3′). High-fidelity long-read DNA polymerase (Phusion Taq) from Promega, USA, was employed to obtain the required amplicon size. The amplicons were purified using the Wizard® SV 96 PCR clean up/Gel extraction kit from Promega, USA, as per the manufacturer's protocol. The sequencing reaction was performed using ABI BigDye Terminator v3.1 chemistry, and the products were sequenced on an ABI 3730XL sequencer. A minimum of three replications were carried out to confirm single nucleotide polymorphisms (SNPs).
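As a quick consistency check on the figures quoted above, the following minimal Python sketch recomputes the CDS length from the stated reference coordinates and reports the length and GC content of the two primers. The coordinates and primer sequences are taken from the text; the helper names `span_length` and `gc_fraction` are hypothetical and are not part of any primer-design tool used in the study.

```python
# Values quoted in the text: Kasalath reference coordinates and primer sequences.
CDS_START, CDS_END = 275_525, 276_499
PRIMERS = {
    "PSTOL1_forward": "ATAGCAGGCATTTCTGGCTCA",
    "PSTOL1_reverse": "CCATGACAGCTGATTGCCTT",
}


def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based coordinate interval."""
    return end - start + 1


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


cds_len = span_length(CDS_START, CDS_END)
print(f"CDS length: {cds_len} bp ({cds_len // 3} codons)")
for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.1%}")
```

The inclusive interval gives 975 bp, in agreement with the CDS length stated in the text.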

Haplotype Determination and Protein Prediction
For comparative sequence analysis, the PSTOL1 sequences were trimmed to remove any poor-quality region at both ends. Multiple sequence alignment was performed using Clustal W in MEGA version 7.0 (Kumar et al., 2016). The Kasalath sequence was used as the reference for detecting SNPs and determining their positions among the PSTOL1 sequences obtained from O. rufipogon accessions. The identified SNPs were manually confirmed using chromatograms. DnaSP version 5.0 and the Selecton server (http://selecton.tau.ac.il/; Stern et al., 2007) were used to calculate summary statistics: nucleotide diversity (π), the number of segregating sites, non-synonymous (ka) and synonymous (ks) mutations, the ka/ks ratio (to estimate positive or purifying selection at a given amino acid), the number of haplotypes, and Tajima's D. The Bioinformatics toolkit (http://toolkit.tuebingen.mpg.de/) was used to predict protein structures of all sequences. A homology modeling approach was employed using the Modeler to determine the structure of the proteins based on the known structure of a template protein. Protein domains were predicted and compared using the Pfam (http://pfam.xfam.org/search) and Prosite (http://prosite.expasy.org) online tools. The quality of the protein models was checked using Ramachandran plots generated with Procheck through PDBsum (http://www.ebi.ac.uk/thornton-srv/databases/pdbsum).
The modeled protein structure was visualized and compared in UCSF Chimera (Pettersen et al., 2004). All the structures were superimposed for observing structural variations.
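The summary statistics named above can be illustrated with a minimal Python sketch. It is not DnaSP's implementation; it applies the standard textbook definitions of segregating sites, nucleotide diversity (π), haplotype counts, and Tajima's D to a small, hypothetical ungapped alignment (`toy`), and all function names are assumptions introduced for illustration.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Indices of alignment columns that carry more than one base."""
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """pi: mean pairwise differences per aligned site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def haplotype_counts(seqs):
    """Map each distinct sequence (haplotype) to its number of occurrences."""
    counts = {}
    for s in seqs:
        counts[s] = counts.get(s, 0) + 1
    return counts


def tajimas_d(seqs):
    """Tajima's D computed from the number of segregating sites S and k."""
    n = len(seqs)
    s = len(segregating_sites(seqs))
    if n < 3 or s == 0:
        raise ValueError("need at least 3 sequences and 1 segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment (not data from this study).
toy = ["ATGCATGC", "ATGCATGC", "ATGAATGC", "ATGAATGT"]
print("segregating sites:", segregating_sites(toy))
print("haplotypes:", len(haplotype_counts(toy)))
print("pi:", round(nucleotide_diversity(toy), 5))
print("Tajima's D:", round(tajimas_d(toy), 3))
```

A singleton haplotype in this sketch is simply a key of `haplotype_counts` whose count is 1, which mirrors the distinction between shared haplotypes and those scored only once.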

Phylogenetic Analysis
A phylogenetic tree was generated with MEGA version 7.0 using the alignment file obtained earlier. The molecular phylogeny was inferred using the Maximum Likelihood method (Tamura and Nei, 1993) with 1,000 bootstrap replicates. All positions containing gaps and missing data were eliminated; all other settings were left at the software defaults.
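The two data-handling steps mentioned above, removal of positions with gaps or missing data and bootstrap resampling, can be sketched in a few lines of Python. This is only an illustration of the general idea and does not reproduce MEGA's internals or the tree search itself; the gap symbols in `GAP_CHARS`, the function names, and the toy alignment are assumptions.

```python
import random

GAP_CHARS = "-?"  # assumed symbols for alignment gaps and missing data


def drop_incomplete_columns(seqs):
    """Keep only alignment columns in which no sequence has a gap or missing base."""
    keep = [
        i for i in range(len(seqs[0]))
        if all(s[i] not in GAP_CHARS for s in seqs)
    ]
    return ["".join(s[i] for i in keep) for s in seqs]


def bootstrap_replicate(seqs, rng):
    """Resample alignment columns with replacement to build one pseudo-alignment."""
    n_sites = len(seqs[0])
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(s[i] for i in cols) for s in seqs]


# Hypothetical toy alignment (not data from this study).
toy = ["ATG-CA", "ATGACA", "ACGAC?"]
clean = drop_incomplete_columns(toy)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(clean, rng) for _ in range(1000)]
print(clean, len(replicates))
```

In a full analysis, a tree would be inferred from each pseudo-alignment, and the bootstrap support of a branch would be the percentage of replicate trees that contain it.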

Validation of Haplotypes under Phosphorus Starvation
For functional validation of the PSTOL1 haplotypes with respect to phosphorus uptake efficiency, eight accessions representing seven different haplotypes, along with the positive control Vandana and the negative control PR121, were grown in replicates under low and high phosphorus conditions in the greenhouse facility following the protocol of Gamuyao et al. (2012). High and low P growth conditions were established by maintaining the NaH2PO4 concentration in the hydroponic media at 100 µM and 10 µM, respectively. The eight accessions (CR 100013, CR 10013A-H2, IRGC 104639-H3, IRGC 104712-H4, IRGC 100588-H8, IRGC 105569-H9, IRGC 81989-H11, and IRGC 106506-H17) along with the controls were grown in hydroponics for about 2 months. Accessions representing the remaining haplotypes were not included in the present study because of poor germination. The seeds were germinated on wet filter paper, and four seedling replicates per accession were assayed for each phosphorus treatment. Ten days after germination, seedlings were transferred to Styrofoam trays suspended in Yoshida growth media (Yoshida et al., 1976). The nutrient media was changed every third day. Data for root length, shoot length, and final dry root and shoot weight were taken after 60 days in growth media. Phosphorus content in roots was measured using an Inductively Coupled Plasma Spectrophotometer after digestion in a mixture of HNO3, HClO4, and H2SO4 (3:1:1) according to the protocol described by Neelam et al. (2011). The morphological data on root and shoot traits, along with phosphorus content on a dry root weight basis, were subjected to statistical analysis. Student's t-test was applied to test the significance of differences between the means of the O. rufipogon accessions and the controls.
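The comparison of an accession against a control can be illustrated with a minimal Python sketch of the two-sample Student's t statistic with pooled variance. The text does not state which variant of the test was used, so the equal-variance form is an assumption, and the root-weight values below are hypothetical placeholders (four replicates each, matching the design described above), not measurements from this study.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical dry root weights (g) for four seedling replicates each.
accession = [0.52, 0.48, 0.55, 0.50]
control = [0.21, 0.24, 0.19, 0.22]

t_stat, dof = student_t(accession, control)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The resulting t value would then be compared with the critical value of the t distribution for the given degrees of freedom at the chosen significance level.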

Genotyping of Pup1 Locus Using SSR Markers
The co-dominant SSR markers analyzed were monomorphic among all 182 O. rufipogon accessions and the indica rice cultivars (PR114, PR121, PR122, PB3, and Vandana) (Supplementary Table S3). For the dominant markers (Pup1-K41 to K-59), the Vandana alleles were detected in the majority of the O. rufipogon accessions as well as in the modern rice cultivars, except for marker K-46. Rice cultivars PR114, PR121, and PR122 did not show any amplification for the K-46 marker. This indicates the specificity of the K-46 marker for the assessment of phosphorus starvation tolerance.

Haplotype Variations in PSTOL1 Gene
From comparative sequence analysis, 53 nucleotide changes (52 SNPs and 1 nonsense mutation) across the exon were observed (Table 2). Both types of substitutions, i.e., transitions (n = 39) and transversions (n = 14), were observed, and the G/A transition was the most common (28.30%). Higher transitions indicated more of the synonymous substitutions were present among genotypes and hence no conformational changes in the structure of proteins were observed. Based on the nucleotide diversity present among O. rufipogon accessions, haplotypes were identified using DnaSP software v5.0. Of the 53 SNPs identified, 10 SNPs at positions 174, 260, 283, 303, 358, 410, 554, 633, 647, and 738 were singletons, whereas 43 SNPs were parsimony-informative sites occurring in two or more O. rufipogon accessions (Figure 1). The overall nucleotide diversity (π) of the identified PSTOL1 alleles was 0.00758, which indicates a low average number of nucleotide differences per site between two sequences. The number of mutations (n = 53) and the number of segregating sites (S = 53) were the same, suggesting positive selection. The value of Tajima's D obtained was negative (−1.09788), supporting this statement. Fewer haplotypes than segregating sites were observed, indicating a lower frequency of rare alleles in the population. A total of 17 haplotype groups were formed, revealing genotypic divergence at the PSTOL1 gene among the studied O. rufipogon accessions (
In parentheses, conservative mutations were marked as (:), semi-conservative as (.), and non-conservative/radical mutations were unmarked.

Protein Structure Prediction and Comparison
A total of 28 amino acid differences relative to the varieties Vandana and Kasalath were identified (Table 2). The ratio of non-synonymous to synonymous sites (ka/ks) was 1.52, suggesting that the amino acids were under positive selection and favored by the environment. The amino acids at positions 25, 31, 35, 55, 72, 85, 87, 95, 101, 115, 120, 127, 142, 146, 156, 157, 161, 185, 202, 209, 216, 219, 246, 253, 273, and 283, highlighted in yellow, were under positive selection (Supplementary Figure S2). Protein structures for Kasalath and the 67 accessions of O. rufipogon belonging to different haplotype groups were superimposed and analyzed for structural differences. The non-synonymous mutations were concentrated around the ATP binding site (LEU45, ARG47, GLY48, VAL53, ALA65, GLU112, MET114, TYR113, SER118, LYS168, GLN170, and LEU173; Supplementary Figure S3). The Ramachandran plot of Kasalath revealed that more than 99.3% of the residues were in the core and allowed regions and only two residues were in the disallowed region. Similar results were obtained for the protein models of the other accessions. All the accessions had a three-dimensional structure similar to that of the reference Kasalath except accession IRGC 106336 (Figure 2). The protein structures of Kasalath and the other accessions displayed 14 helices, two beta-pleated sheets, and nine strands, while accession IRGC 106336 showed only three helices, one sheet, and five strands. In accession IRGC 106336, the PSTOL1 sequence revealed a premature stop codon at position 137, and further domain analysis using Pfam and Prosite revealed that it encodes a partial protein kinase domain instead of the protein tyrosine kinase encoded by Kasalath.
The Prosite analysis of the PSTOL1 protein in Kasalath predicted the following features: the kinase domain at codons 39-319, a nucleotide phosphate binding site (NP_BIND) at positions 45-53, an ATP-binding site (BINDING) at position 67, and the proton acceptor (active site) at position 166. In contrast, IRGC 106336 had a partial protein kinase domain at positions 39-136, NP_BIND at positions 45-53, and the ATP-binding site (BINDING) at position 67, but lacked the proton acceptor, i.e., the active site (Supplementary Table S4).
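The effect of the premature stop codon on the annotated features can be checked with a minimal Python sketch. It uses the standard genetic code to locate the first in-frame stop codon and then classifies each feature as retained, partial, or lost. The feature coordinates and the stop position (137) are taken from the text; the function names and the short test sequence are hypothetical, and this is not the Pfam or Prosite procedure.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def first_stop_codon(cds):
    """1-based codon position of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if CODON_TABLE[cds[i:i + 3]] == "*":
            return i // 3 + 1
    return None


def feature_status(features, stop_pos):
    """Classify (start, end) features given a stop codon at codon stop_pos."""
    last = stop_pos - 1  # last residue that is still translated
    status = {}
    for name, (start, end) in features.items():
        if end <= last:
            status[name] = "retained"
        elif start <= last:
            status[name] = f"partial ({start}-{last})"
        else:
            status[name] = "lost"
    return status


# Feature coordinates as reported for Kasalath in the text.
KASALATH_FEATURES = {
    "kinase domain": (39, 319),
    "NP_BIND": (45, 53),
    "BINDING": (67, 67),
    "active site": (166, 166),
}

print(first_stop_codon("ATGGCCTAAGGG"))  # hypothetical 4-codon sequence
print(feature_status(KASALATH_FEATURES, stop_pos=137))
```

With a stop at codon 137, the sketch reports a partial kinase domain spanning 39-136 and a lost active site, consistent with the Prosite result described for IRGC 106336.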

Phylogenetic Analysis
Phylogenetic analysis at the PSTOL1 locus revealed divergence among the O. rufipogon accessions (Figure 3). The different node colors correspond to the different mutations present in the O. rufipogon accessions. Two major groups were observed. Clade A consisted of 59 O. rufipogon accessions, which can be further divided into 8 subgroups. Of these 59 accessions, 44 had the same sequence as the reference, whereas the others had one or more substitutions as compared to the reference.

Validation of Novel Alleles under Phosphorus Deficiency
Genotypic variation for root and shoot length, dry root and shoot weight, and phosphorus content on a dry root basis was examined under phosphorus-sufficient and phosphorus-deficient conditions after 2 months of the experiment (Tables 4A,B). Genotypic differences were observed among all O. rufipogon accessions under both growing conditions. All of the genotypes had almost double the root length under phosphorus-sufficient conditions as compared to deficient conditions. However, shoot length and shoot weight differed little between phosphorus-sufficient and phosphorus-deficient conditions for all the genotypes. The O. rufipogon accession IRGC 106506 (H17) showed the best root and shoot length under phosphorus-deficient conditions when compared to the other genotypes and the control. Within the H2 haplotype, O. rufipogon accession CR 100013A performed better than Vandana for all the traits studied. In terms of root and shoot weight, O. rufipogon IRGC 106506 (H17) showed the best root weight, followed by IRGC 81989 from the H11 haplotype. Approximately, 1.5 and 2.3 times higher phosphorus
Table S3). Chin et al. (2010) also observed the Kasalath-specific alleles for markers K-41, K-43, and K-48 in lowland/irrigated (indica, japonica, aus, and traditional or modern) rice cultivars, indicating that these markers are not useful for marker-aided selection for phosphorus uptake efficiency. Similarly, the markers K-42 and K-29 were not found to be linked with PUE by Sarkar et al. (2011) while assessing indica germplasm. It should be taken into consideration that Gamuyao et al. (2012) ruled out the other co-dominant and INDEL markers as indicative of PUE, except for K-46. This dominant marker was found useful for MAS in progenies involving Kasalath as the Pup1 donor variety and Asian lowland rice varieties (without this gene) by Chin et al. (2010) and Pariasca-Tanaka et al. (2014), supporting our results. Mukherjee et al. (2014) assessed 108 genotypes from different states of India for phosphorus acquisition efficiency with gene-specific markers and the closely linked SSR marker RM1261 and reported no association between the markers and PUE. The same was observed when they studied a RIL population developed from a cross between Gobindabhog (with the PSTOL1 gene) and Satabdi (PSTOL1 absent). The finding that the PSTOL1-specific marker was not indicative in the case of indica germplasm (Mukherjee et al., 2014) is most likely due to the complex nature of the Pup1 locus and the different genetic backgrounds and environments in which this gene has to be expressed.
The germplasm survey with Pup1-specific markers of Kasalath indicated that the entire inserted region of 90 kb was present in the studied O. rufipogon accessions. The probable explanation for this could be continuous gene flow between O. sativa and O. rufipogon populations throughout the history of domestication (Vaughan et al., 2008). Also, O. rufipogon accessions from South and Southeast Asia are considered as the wild progenitor of domesticated rice (Oka, 1988; Molina et al., 2011) and
FIGURE 3 | The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model. The bootstrap consensus tree inferred from 1,000 replicates is taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50% bootstrap replicates are collapsed. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The different node colors indicate the presence of different mutations as compared to the reference. The O. rufipogon accessions used for validation under phosphorus deficiency are indicated by an arrow.

Phylogeography of O. rufipogon Accessions under Study
Our results on molecular diversity at the PSTOL1 locus suggest lower diversity among O. rufipogon accessions from South Asian and Southeast Asian nations. This is expected, as they share common geographical boundaries. This result is in accordance with several studies on the genetic diversity of Asian wild rice using RFLP, microsatellite markers, SINEs, sequence-based polymorphism, ISSRs, chloroplast, and low-copy nuclear markers (Joshi et al., 2000; Cheng et al., 2002; Rakshit et al., 2007; Xu et al., 2007; Huang et al., 2012
(Ellis and Setter, 1999; Bhullar et al., 2010; Ravensdale et al., 2012; Vasudevan et al., 2014; Ashkani et al., 2015). The wheat powdery mildew resistance gene Pm3, with 17 identified functional alleles, is a remarkable example of natural variation present in GenBank accessions that can be efficiently utilized for conferring broad-spectrum disease resistance. For rice blast resistance genes, functional orthologs have been found in wild rice O. rufipogon accessions (Lv et al., 2013; Xu et al., 2014; Ashkani et al., 2015), demonstrating their utility in widening the genetic base of cultivated rice varieties. A novel allele of the PSTOL1 gene has been identified in O. glaberrima (CG14) and is being transferred to the NERICA (New Rice for Africa) cultivars using allele-specific markers. In their study, they identified 3 novel alleles in 10 studied O. rufipogon accessions and also the presence of Kasalath alleles for INDEL markers, which is consistent with our results. PSTOL1 was successfully transferred by Gamuyao et al. (2012) through marker-assisted backcross breeding to the Asian rice cultivar IR74, resulting in increased root growth and phosphorus uptake efficiency. The presence of PSTOL1 in all O. rufipogon accessions raises the question of the functionality of the different alleles under phosphorus deficiency.
Validation of the haplotype groups showed significant differences in root and shoot length and biomass as compared to PR121 and Vandana under both phosphorus-sufficient and phosphorus-deficient conditions. The correlation between root elongation and higher root and shoot biomass of genotypes under P deficiency is considered an important indicator of higher phosphorus uptake efficiency. A number of reports, including reports of QTLs, on P-deficiency-induced root elongation in plants have been published (Steingrobe et al., 2001; He et al., 2003; Ma et al., 2003; Wissuwa, 2005; Li et al., 2007; Rose et al., 2013). A near-isogenic line of "Nipponbare" carrying the Pup1 QTL from "Kasalath" showed high P content, high tillering, and high root growth under P-deficient upland conditions (Wissuwa and Ae, 2001; Wissuwa et al., 2002). The O. rufipogon accession IRGC 106506 showed the highest root growth under P deficiency and is thus the best candidate donor of this novel allele for improving P starvation tolerance in elite cultivars.

CONCLUSION
In summary, our efforts to harness superior alleles of PSTOL1 in O. rufipogon revealed three accessions (IRGC 106506, IRGC 81989, and IRGC 104639), from haplotypes H17, H11, and H3, with better performance under phosphorus-deficient conditions. However, the identified superior alleles still need to be confirmed in phosphorus-deficient soil. The transfer of these alleles and the development of allele-specific markers for MAS have already been initiated at Punjab Agricultural University. Marker-assisted transfer of these potential haplotypes to indica rice cultivars would be useful for breeding better rice with sustainable yield on phosphorus-deficient soil.