- Plant Genetics and Molecular Breeding Laboratory, Department of Botany, Rice Germplasm Conservation and Breeding Centre, University of North Bengal, Siliguri, India
Developing high-yielding rice varieties (Oryza sativa L.) is critical to ensure global food security. The narrow genetic base of released rice varieties has caused yield improvement to plateau. Considering the potential of wild rice (Oryza rufipogon), two distinct recombinant inbred line (RIL) populations were developed through interspecific hybridization (BWF: Badshabhog × O. rufipogon and CWF: Chenga × O. rufipogon) to broaden the genetic base via alien introgression of hidden genes. Genetic diversity was assessed using genetic variability parameters, broad-sense heritability, Mahalanobis D² statistics, and principal component analysis (PCA) of 15 agro-morphological characteristics, all of which indicated enhanced genetic variation. The first four principal components (PCs) together accounted for 73.74% of the variability in BWF, and the first six PCs accounted for 71.90% of the cumulative variability in CWF (eigenvalue >1). The broad-sense heritability ranged from 74.42% to 99.87% for all traits in both the RILs. Single plant yield was positively correlated with grain per panicle, 1,000 grain weight, grain length, and panicle weight. The cluster analysis showed that grain per panicle, grain weight, kernel breadth, and plant height were the key yield-contributing traits. The detection of petunidin 3-O-glucoside through HR-LCMS-QTOF indicated that anthocyanin was synthesized in the black-grain RILs, signifying nutritional improvement. Hence, underutilized wild rice contributed immensely to enhancing the genetic base of the RILs, with unusual genetic diversity associated with yield improvement and grain pigmentation. Pre-breeding materials are the cornerstone of future rice improvement programs, and our materials can be efficiently utilized to develop resilient, productive, and nutritious pigmented rice varieties.
Introduction
Rice (Oryza sativa L.) is the single most important crop in the world: half of the world’s population (>3.5 billion) eats rice every day, and it contributes to global food and nutritional security (Gross and Zhao, 2014; Rao et al., 2020; Alam et al., 2024; Huang et al., 2024). Production and consumption are dominated by Asian countries, which account for approximately 90% of the global total (FAO, 2024). Rice contributes about 20% of the world’s dietary energy supply, whereas wheat and maize contribute 19% and 5%, respectively (GRiSP, 2013; Bin Rahman and Zhang, 2022; FAO, 2024). In some Asian countries, rice provides 30%–80% of the calorie supply, and it is considered the major staple food of Asia (Laghari et al., 2025). Rice production will need to double by 2050 to feed a world population of more than 9 billion, even as constraints such as diminishing acreage, deteriorating soil health, and environmental stresses induced by global climate change intensify (Gaikwad et al., 2021; Xu et al., 2021; Malik et al., 2022; Seck et al., 2023). Undoubtedly, plant breeders achieved a substantial increase in yield during the green revolution era, but a major bottleneck has been the significant reduction in genetic diversity due to limited utilization of genetic resources. Consequently, the narrow genetic base of improved varieties has led to yield plateaus (Tanksley and McCouch, 1997; Tian et al., 2006; Sanchez et al., 2013; Seck et al., 2023). Moreover, a genetic bottleneck that occurred during the domestication of cultivated rice from its immediate ancestral progenitor, the wild rice Oryza rufipogon, reduced allelic diversity in cultivated rice by at least 50%–60% relative to O. rufipogon, leading to a loss of genetic variability and yield potential (Xiao et al., 1996; Tanksley and McCouch, 1997; Brar and Khush, 2006).
Wild rice O. rufipogon is considered a reservoir of many untapped genes and quantitative trait loci (QTLs) for important agronomic traits such as yield, quality, nutritional characteristics, and resistance to biotic and abiotic stresses (drought, salinity, submergence, and aluminum toxicity). It can be utilized in pre-breeding programs to broaden the genetic base of released varieties and break yield plateaus (Tanksley and McCouch, 1997; Sun et al., 2001; Tester and Langridge, 2010; Sanchez et al., 2013; McCouch et al., 2007; Brar and Khush, 2018; Solis et al., 2020; Siddiq and Vemireddy, 2021; Padmavathi et al., 2024). The transfer of genes controlling desirable traits (yield and grain quality) from the wild relative O. rufipogon to cultivated rice is an important strategy in rice breeding (Tian et al., 2006; Huang et al., 2012b; Qiao et al., 2016; Zhang B. et al., 2022). The introgression of superior alleles from diverse sources of wild rice has been shown to widen the gene pool of cultivated rice, allowing cultivars with improved yield, quality, and stress tolerance to be bred (Xiao et al., 1996; Subudhi et al., 2015; Singh et al., 2020; Gaikwad et al., 2021; Malik et al., 2022; Siddiq and Vemireddy, 2021; Zhang B. et al., 2022; Cao et al., 2022). Yield-related QTLs have been transferred from O. rufipogon to cultivated rice for yield enhancement through interspecific hybridization (pre-breeding) (Thomson et al., 2003; McCouch et al., 2007; Imai et al., 2013; Siddiq and Vemireddy, 2021; Eizenga et al., 2024). However, breeders face not only a yield barrier but also a quality improvement barrier. Although white rice provides food for approximately 3.5 billion people, its nutritional quality is poor compared to that of pigmented rice (Ito and Lacerda, 2019; Mbanjo et al., 2020). The nutritional quality of rice is determined by the levels of starch, protein, lipids, minerals, vitamins, and phytochemicals (Senguttuvel et al., 2023).
Pigmented rice varieties (brown, red, and black) are gaining popularity among consumers due to their nutritional and health benefits (Shao et al., 2018; Mendoza-Sarmiento et al., 2023; Xie et al., 2024; Zhu et al., 2024; Idrishi et al., 2024; Sakulsingharoj et al., 2024; Gogoi et al., 2024), and market demand is expected to increase (Kushwaha, 2016; Ito and Lacerda, 2019; Bhuvaneswari et al., 2023). Pigmented rice accumulates various types of secondary metabolites, such as phytosterols, polyphenols, flavonoids, anthocyanins, proanthocyanidins, vitamins, and micronutrients, which are known to have a high nutritional value and medicinal properties (Shao et al., 2018; Mbanjo et al., 2020; Zhu et al., 2024; Idrishi et al., 2024; Thilavech et al., 2025). Purple and red rice extracts inhibited the viability of the human colorectal cancer cell line SW480 after 24-h and 48-h exposures at doses of 0.5 mg/mL and higher. The red varieties had higher bioactivity than the purple varieties, whereas non-pigmented rice had no effect on cell viability (Rao et al., 2019; Brotman et al., 2021). Ghasemzadeh et al. (2018) demonstrated that black rice extracts at 119.2–148.6 mg/mL and red rice extracts at 151–175 mg/mL exhibited potent antiproliferative activity against MCF-7 and MDA-MB-231 breast cancer cell lines (Ghasemzadeh et al., 2018; Tiozon et al., 2023; Das et al., 2023). Reactive oxygen species (ROS) and reactive nitrogen species (RNS) are produced in the body mainly under induced oxidative stress and can lead to carcinogenesis, aging, and inflammation. Anthocyanin, the major polyphenol pigment present in black rice, scavenges the ROS and RNS produced in cells (Ito and Lacerda, 2019).
Mapoung et al. (2022) reported that the main anthocyanin components present in the bran and germ of black rice, viz. cyanidin 3-glucoside and peonidin 3-glucoside, can inhibit the inflammation induced by the spike glycoprotein of the SARS-CoV-2 virus. Black rice phenolics (BRPs) have also been reported to help manage type 2 diabetes mellitus (T2DM) in rats: BRPs significantly alleviated diabetic symptoms, lowered the fasting blood glucose and hemoglobin A1c (HbA1c) levels, and enhanced glucose tolerance in T2DM rats (Xu et al., 2024a). The antidiabetic effects of pigmented rice appear to arise from a synergistic effect of anthocyanin, proanthocyanidin, vitamin E, gamma-oryzanol, and various flavonoids, which inhibit alpha-glucosidase and alpha-amylase activity and thus delay the absorption of carbohydrates, as tested in streptozotocin-induced diabetic rats (Tantipaiboonwong et al., 2017). The nutritive value of pigmented rice is greatly influenced by genetics, genotypic variation, and environmental factors (Shirley, 1998; Sweeney et al., 2006; 2007; Furukawa et al., 2007; Gross and Zhao, 2014; Tiozon et al., 2023; Zhu et al., 2024), along with several external influences, such as soil fertility status, the degree of milling, and the method of preparation before consumption (Gogoi et al., 2024). Numerous black rice lines have been developed by crossing black rice Okunomurasaki with white rice Koshihikari and black rice Hong Xie Nuo with white rice Koshihikari (Maeda et al., 2014). Transgressive segregant lines with colored pericarp, high anthocyanin content, and increased yield compared to their parental lines were selected from crosses between black rice Chakhao Poireiton and white rice Sahbhagi Dhan (Lap et al., 2024). Looking forward, there is a remarkable opportunity for breeding programs to develop productive pigmented rice varieties enriched in phytonutrients.
Despite having nutritional importance (rich source of phytonutrients), pigmented rice is usually low yielding, prone to lodging, susceptible to diseases, and late-maturing (Devi et al., 2020; Bhuvaneswari et al., 2020; Sedeek et al., 2023). The narrow genetic base of modern rice varieties has led to yield plateaus, making it essential to introduce genetic diversity to overcome these barriers. Pre-breeding involves crossing elite cultivars with wild relatives to incorporate novel genes and QTLs for the improvement of traits. Pre-breeding facilitates the introgression of desirable traits such as yield potential, nutritional quality, and resistance to biotic and abiotic stresses (Rao et al., 2020; Bin Rahman and Zhang, 2022; Alam et al., 2024). Therefore, the present study aimed to broaden the genetic base through interspecific hybridization between the rice cultivars Badshabhog and Chenga with wild rice O. rufipogon to increase yield potential and enhance grain quality.
Materials and methods
Plant materials for interspecific hybridization
Wild rice O. rufipogon Griff. of Raiganj was used as one of the parental lines (donor parent). This wild rice variety grows naturally in the shallow marshy land/ditches of the Raiganj block, Uttar Dinajpur district, West Bengal, India, at latitude 25.62 °N and longitude 88.12 °E, and elevation of 40 m (130 ft). This wild rice variety is highly shattering in nature and fully spreading in the habitat with an annual growth pattern, and it produces awned spikelets on the spreading panicles and disperses the mature seeds. Its grain and hull colors are red and black, respectively. Two well-adapted farmer’s varieties of O. sativa L. subspecies indica cultivar, Badshabhog and Chenga, were used as the parents in this interspecific hybridization. Badshabhog has white aromatic grains with a straw-colored husk, whereas Chenga has a blackish husk with nonaromatic brown grains. Both the farmer’s rice varieties (Badshabhog and Chenga) have been deposited in the National Rice Gene Bank, NBPGR-ICAR, Govt. of India, New Delhi, for conservation purpose with indigenous collection numbers (IC No-0652950 for Chenga and IC no-0652952 for Badshabhog).
Development of RILs following the pedigree method
Interspecific hybridization between cultivated rice (O. sativa) and wild rice (O. rufipogon)
Two recombinant inbred line (RIL) populations were developed through interspecific hybridization between cultivated rice varieties (O. sativa) and wild rice (O. rufipogon). The RIL population developed from the cross “Badshabhog × O. rufipogon” (named BWF) comprised 100 distinct individual lines in the F7 generation, and the population developed from the cross “Chenga × O. rufipogon” (named CWF) likewise comprised 100 individual genotypes in the F7 generation. In the pedigree method, individual plants were selected from the F2 generation onward, their progeny were grown, and a detailed pedigree record was maintained following the standard method (Xu et al., 2024b). The process began by crossing cultivated rice varieties (O. sativa) with wild rice O. rufipogon. Initial crosses between O. sativa cv. Badshabhog × O. rufipogon and O. sativa cv. Chenga × O. rufipogon were made in 2016 according to standard protocols to create the F1 progenies, which were collectively planted and harvested (Sleper and Poehlman, 2007; Sha, 2013; Roy, 2017).
RIL population development was carried out at the NBU, Siliguri, India, with Badshabhog or Chenga as the female parent and O. rufipogon as the male parent in each cross. Crossability was determined by counting the number of seeds produced per cross. It was calculated as the ratio of the number of true F1 seeds developed per cross to the total number of spikelets emasculated and was expressed as a percentage.
Twelve viable F1 hybrid seeds were obtained from the cross Badshabhog × O. rufipogon (BWF) and nine from the cross Chenga × O. rufipogon (CWF), corresponding to seed set percentages of 14.28% (12/84 × 100) and 12.32% (9/73 × 100), respectively; crossability was thus higher in BWF than in CWF. The harvested viable F1 seeds from both crosses (12 for BWF and nine for CWF) were sown to obtain the next generation, F2. Due to hybrid sterility/hybrid breakdown, a few F1 plants could not grow properly and died (two in BWF and one in CWF). The remaining 10 F1 plants in BWF and eight F1 plants in CWF overcame hybrid sterility, and 174 F2 seeds were collected from each of the two crosses. F2 plants were randomly selected to proceed to F3, and 50 F2 plants were selected based on the phenotype to eliminate non-desirable characteristics, including very late- or non-flowering types, excessively tall plants, and sterile plants. A total of 243 F3:4 plants were selected and analyzed. F2 plants were individually grown to identify and select optimal lines. All F2 plants were highly shattering, so the panicles were bagged in nylon nets to collect the seeds. This selection process was extended to the F3 generation, where individual plant progenies were row-planted and optimal plants were selected. Progeny populations from the F2 generation (in 2017) were allowed to self-fertilize to develop RILs. Shattering decreased in the F3 and F4 generations, and from the F5 generation onward, it was absent in most of the breeding lines, and this non-shattering habit was maintained. This selection procedure was continued until the production of the F6 generation.
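The crossability calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `crossability_percent` is a hypothetical helper, and the seed and spikelet counts are the ones quoted in the text.

```python
def crossability_percent(true_f1_seeds: int, spikelets_emasculated: int) -> float:
    """Seed set (%) = true F1 seeds / emasculated spikelets x 100."""
    if spikelets_emasculated <= 0:
        raise ValueError("spikelets_emasculated must be positive")
    return 100.0 * true_f1_seeds / spikelets_emasculated


# Counts reported in the text for the two crosses.
bwf = crossability_percent(12, 84)  # Badshabhog x O. rufipogon
cwf = crossability_percent(9, 73)   # Chenga x O. rufipogon
print(f"BWF: {bwf:.2f}%  CWF: {cwf:.2f}%")
```

With conventional rounding, these ratios evaluate to about 14.29% and 12.33%; the values quoted in the text (14.28% and 12.32%) appear to be truncated to two decimals rather than rounded.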
From the F6 populations (2021 kharif crop), we selected 100 phenotypically distinct breeding lines based on 15 yield-related traits from each of the populations (BWF and CWF). To establish the RIL population following the pedigree method, 100 superior progeny plants were collectively harvested from the F7 generation, at which point they achieved more uniformity in trait expression.
Experimental design
Yield trials were conducted during two consecutive kharif seasons, 2022 and 2023. The parents and both RIL populations were grown at the experimental rice field of North Bengal University. Genotypes were laid out in the field in a randomized complete block design (RCBD) with three replications in each season (F7 in the 2022 and F8 in the 2023 kharif crop) to evaluate yield performance. The local black rice cultivar Chakhao was used as the control variety. The plot size was 6 m² (2.0 m × 3.0 m). Thirty-day-old seedlings were transplanted at a spacing of 20 cm × 20 cm with one seedling per hill. Fertilizer application and intercultural agronomic practices were carried out as per the recommended standard.
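As an illustration of the RCBD layout described above, the following minimal sketch randomizes every entry independently within each of three blocks. The entry labels (BW1 to BW100) and the seed value are hypothetical placeholders, and this is not presented as the randomization procedure actually used in the field.

```python
import random


def rcbd_layout(entries, n_blocks=3, seed=None):
    """Return one independently shuffled copy of `entries` per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = list(entries)   # every entry appears once in every block
        rng.shuffle(block)      # plot order is randomized within the block
        layout.append(block)
    return layout


# Hypothetical entry list: 100 RILs, the two parents, and the control.
entries = [f"BW{i}" for i in range(1, 101)]
entries += ["Badshabhog", "O. rufipogon", "Chakhao"]

layout = rcbd_layout(entries, n_blocks=3, seed=2022)
for number, block in enumerate(layout, start=1):
    print(f"Block {number}: first plots -> {block[:5]}")
```

Because each block contains the complete set of entries, block-to-block field variation can be separated from genotype effects in the ANOVA.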
Trait measurement
Phenotypic evaluation
The phenotypic evaluation of both the RIL populations containing 100 genotypes (BWF and CWF) was performed under natural conditions at the experimental field of North Bengal University in two kharif seasons of 2022 and 2023. Phenotypic data regarding 15 yield and yield-related traits, including plant height (PH), flag leaf length (FFL), flag leaf width (FLW), panicle length (PnL), panicle weight (PnWt), grain per panicle (GrPn), grain length (GL), grain breadth (GB), kernel length (KL), kernel breadth (KB), 1,000 grain weight (GrWt), tiller number (Till), heading date (HD), maturity time in days (MT), and single plant yield (PY), were recorded from five randomly selected representative plants in each plot of each replication using the DUS guideline (PPV&FR Act, 2001, Govt. of India). Data were recorded from the middle rows to avoid border effects, and the mean values of the 15 traits were used for further analysis. Other agro-morphological and grain quality parameters such as awn length (AwnL), aroma (Aroma), ASV, GT, and GC; pericarp pigmentation color (PC); seed shattering habit (Sh); and seed coat phenol test were recorded and analyzed.
Statistical data analysis and genetic diversity studies
The mean pooled data obtained from the two kharif seasons (2022 and 2023) were used for biometrical analysis. The genotypic and phenotypic variation, broad-sense heritability, genetic advance, and genotypic and phenotypic correlation coefficients were estimated. Genetic diversity analysis was performed using the D² statistic proposed by Mahalanobis (1936). The RIL genotypes were classified into clusters by Tocher’s method using Mahalanobis D² distances (Rao, 1952). The broad-sense heritability (H%) and other genetic variability parameters were calculated using standard methods (Johnson et al., 1955; Allard, 1960). Phenotypic coefficients of correlation were calculated based on the formula of Burton and de Vane (1953). Multivariate PCA, based on the original concept of Pearson as developed by Hotelling (1933), was used to estimate the relative contribution of the various traits to the total variability. Statistical analyses were carried out using SPSS v22, XLSTAT, PAST 4.03, Origin 2024, and R 4.4.1.
The following formulas were used to calculate the genetic variability parameters:
Heritability (broad-sense) measurement (H%):
Broad-sense heritability of the breeding lines was estimated using the formula by Allard (1960):
$$H\% = \frac{\sigma^2_g}{\sigma^2_p} \times 100$$
where H = broad-sense heritability; σ²p = phenotypic variance; σ²g = genotypic variance. Genotypic variance (σ²g) = (MS2 − MS3)/b; error variance (σ²e) = MS3; MS2 = mean square of populations; MS3 = mean square of error; b = number of blocks.
Different variance components such as the phenotypic coefficient of variation (PCV) and genotypic coefficient of variation (GCV) were estimated according to the method of Burton and de Vane (1953):
$$GCV\% = \frac{\sqrt{\sigma^2_g}}{\bar{X}} \times 100, \qquad PCV\% = \frac{\sqrt{\sigma^2_p}}{\bar{X}} \times 100$$
where X̄ = the mean of the trait. Environmental variance was calculated by the formula suggested by Burton and de Vane (1953). The genetic advance (GA) and the genetic advance as percentage of the mean (GAM) were calculated according to the method described by Johnson et al. (1955):
$$GA = K \times \sqrt{\sigma^2_p} \times \frac{\sigma^2_g}{\sigma^2_p}, \qquad GAM\% = \frac{GA}{\bar{X}} \times 100$$
where GA = genetic advance; K = standardized selection differential at 5% selection intensity (K = 2.063); σ²p = phenotypic variance; σ²g = genotypic variance; GAM = genetic advance as percentage of the mean; X̄ = grand mean of a character.
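The variability parameters defined above can be computed from the ANOVA mean squares as in the following minimal sketch. It assumes that the phenotypic variance is the sum of the genotypic and error variances (σ²p = σ²g + σ²e), which is a common convention but is not stated explicitly in the text; the function name and the example mean squares are hypothetical.

```python
import math


def variability_parameters(ms_genotype, ms_error, n_blocks, grand_mean, k=2.063):
    """Estimate H%, GCV%, PCV%, GA, and GAM% from RCBD mean squares."""
    var_g = (ms_genotype - ms_error) / n_blocks   # genotypic variance
    if var_g < 0:
        raise ValueError("negative genotypic variance estimate")
    var_e = ms_error                              # error variance
    var_p = var_g + var_e                         # ASSUMPTION: sigma2_p = sigma2_g + sigma2_e
    h = var_g / var_p                             # broad-sense heritability (fraction)
    ga = k * math.sqrt(var_p) * h                 # genetic advance at 5% selection intensity
    return {
        "H%": 100.0 * h,
        "GCV%": 100.0 * math.sqrt(var_g) / grand_mean,
        "PCV%": 100.0 * math.sqrt(var_p) / grand_mean,
        "GA": ga,
        "GAM%": 100.0 * ga / grand_mean,
    }


# Hypothetical mean squares for one trait, three blocks, grand mean of 50.
params = variability_parameters(ms_genotype=100.0, ms_error=10.0,
                                n_blocks=3, grand_mean=50.0)
for name, value in params.items():
    print(f"{name}: {value:.2f}")
```

Because σ²p ≥ σ²g under this assumption, PCV is never smaller than GCV, and the gap between the two reflects the size of the environmental (error) component.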
Physicochemical properties and sensory-based aroma test
The alkali spreading value (ASV) (on a scale of 1–7) was measured according to the standard method (Little et al., 1958). A low ASV corresponds to a high gelatinization temperature (GT), and conversely, a high ASV indicates a low GT (Little et al., 1958). Sensory-based aroma (on a scale of 0–3) was evaluated using the standard procedure (Sood and Siddiq, 1978). The gel consistency (GC) was measured as per the standard protocol (Little et al., 1958).
Phenol reaction of seed coat
The phenol reaction of the seed coat was tested according to the protocol of Kumar et al. (2021). Fifteen healthy, freshly harvested grains of each cultivar and breeding line were soaked in 1.5% aqueous phenol solution for 24 h. The solution was then drained, and the grains were air-dried. The hull color was recorded as stained or unstained relative to the control treatment, in which the grains were treated with distilled water.
Metabolomics analysis of grain quality through the HR-LCMS-QTOF method
Anthocyanin pigments were qualitatively identified in the grains of black rice lines (BW23 and CW16) according to standard methods (Bhuvaneswari et al., 2020). In brief, dried, pigmented black rice grain samples (1 g) were ground with 5 mL of 70% aqueous methanol at room temperature. After centrifugation at 10,000 × g for 10 min, the extracts were filtered (0.22 μm) before HR-LCMS-QTOF analysis at the SAIF, IIT Bombay, India. The instrument was an HR-LCMS QTOF (Agilent Technologies, United States); data were acquired with Agilent MassHunter and processed with Agilent MassHunter Qualitative Analysis B.06. The column was a ZORBAX Eclipse Plus C18 (150 × 2.1 mm, 5 µm; Agilent). The solvents were 0.1% formic acid in Milli-Q water (solvent A) and acetonitrile (solvent B). The instrument scanned over the mass-to-charge (m/z) range of 100–1,100 in both the positive and negative ion modes.
Identification of amino acids in black lines using HR-LC/MS-QTOF
Total amino acid (TAA) identification was performed using standard protocols (Liyanaarachchi et al., 2020; Tyagi et al., 2022). Briefly, 100 mg of black rice flour (BW23 and CW16) was hydrolyzed in 10 mL of 6 N HCl at 110 °C for 24 h. Approximately 20 µL of the hydrolysate was evaporated in a speed vac and then reconstituted in 50 µL of 0.1 N HCl. From this extract, 1 µL was loaded into the LC-MS system for amino acid profiling, along with standard amino acids. Amino acid identification and quantitative analysis were performed with an HR-LCMS-QTOF mass spectrometer (Agilent Technologies, United States; SAIF, IIT Bombay, India) with the following parameters: dual ion source AJS ESI, HiP sampler, binary pump, and diode-array detection (DAD), with gradient elution in the column compartment (Poroshell HPH-C18, 2.7 µm, 4.6 × 100 mm).
Results
Phenotyping for yield and yield-related traits in RIL populations
We developed two RIL populations (100 genotypes each) through interspecific hybridization, namely, BWF (O. sativa cv. Badshabhog × O. rufipogon) and CWF (O. sativa cv. Chenga × O. rufipogon), to broaden the genetic base of the rice cultivars. Both parents in each cross carry the AA genome and belong to the primary gene pool of Oryza (O. sativa, 2n = 24, AA genome and O. rufipogon, 2n = 24, AA genome), which is why the progeny lines (F1) showed recombination stability; however, hybrid sterility/hybrid breakdown was sometimes observed in the F1 generation, and a few F1 plants could not grow properly and died. The analysis of variance (ANOVA) revealed significant (p < 0.001) differences among the 100 RILs of each population for all 15 measured yield-related agro-morphological traits (Table 1). These findings indicate ample genetic variation among the genotypes in the RIL populations. The 15 yield-related agro-morphological traits were evaluated for two consecutive kharif seasons (2022 and 2023), and some of the RIL genotypes exhibited superior PY together with pigmented grains (Tables 2, 3). An unexpectedly wide range of phenotypic variation was recorded among the RILs of BWF and CWF (Figure 1; Tables 2, 3). The shortest PH, only 60 cm, was recorded in line BW98, which also had a short FFL (15.17 cm), a narrow FLW (5.07 mm), and small shattering seeds (Table 2). In contrast, the tallest PH (204 cm) was observed in line BW97, which also had shattering seeds. The grain pericarp color in both RIL populations (BWF and CWF) ranged from white, brown, red, and greenish to black, with distinctive grain quality parameters, viz. ASV, GT, GC, and aroma (Figure 1; Supplementary Tables S1, S2).

Table 1. Mean squares of the analysis of variance (ANOVA) for 15 yield-related agro-morphological characteristics for both the RIL populations (BWF and CWF).

Table 2. Mean performance for 15 yield and yield-related agro-morphological traits in the 100 BWF RILs.

Table 3. Mean performance for 15 yield and yield-related agro-morphological traits in the 100 CWF RILs.

Figure 1. Scheme for recombinant inbred line (RIL) development in the pre-breeding program and the field trial of both the RIL (CWF and BWF) populations with grain color variations. Demonstration of the development of the CWF RIL population: interspecific hybridization was made in 2016 between Oryza sativa ssp. indica cv. Chenga and Asian common wild rice Oryza rufipogon Griff., and then the F1 generation was allowed to self-fertilize up to the seventh generation for the generation of the RIL population. RIL genotypes showing different grain colors including black pericarp in the F2 generation and inherited till the F6 (2021), F7 (2022), and F8 (2023). Field trial and experimental design: progeny lines were plotted in the field in a randomized complete block design (RCBD) with three replications for two seasons (F7 in 2022 and F8 in 2023 kharif crop) to evaluate the yield performance; the purple leaf plot is also shown. Wild rice Oryza rufipogon, with red grain and black husk with long awns. Rice cultivar Chenga, with brown grain and blackish husk, Chakhao control black rice, with black grain and black husk. White grain series with the size and husk color of pre-breeding F8 lines. Brown grain series with the size and husk color of pre-breeding F8 lines. Red grain series with the size and husk color of pre-breeding F8 lines. Black grain series with the size and husk color of pre-breeding F8 lines.
Genetic variability of the 15 yield-contributing traits with broad-sense heritability
The genetic parameters describing the degree of variability among the RIL genotypes (BWF and CWF) were estimated using GCV, PCV, H%, GA, and GAM (Table 4). For the majority of the characters, PCV was only slightly greater than GCV. These small differences between GCV and PCV in both RIL populations indicate a close correspondence between phenotype and genotype, a small environmental effect, and a predominant role of genetic factors in the expression of these traits (Table 4). H% was high (>80%) for all characters except FFL in BWF (74.42%) and GB in CWF (76.58%), which indicates little environmental influence. H% ranged from 74.42% (FFL) to 98.01% (GrWt) in the BWF lines and from 76.58% (GB) to 98.71% (GrPn) in the CWF lines. In the BWF population, H% exceeded 90% for 11 of the 15 traits: PH, PnWt, GrPn, GL, GB, KL, KB, GrWt, HD, MT, and PY. In the CWF population, H% exceeded 90% for FLW, PnWt, GrPn, GrWt, GL, KL, and KB (Table 4). High heritability (>80%) in combination with high GA (>20) was observed for PH, PnL, GrPn, GrWt, and PY in the BWF population and for PH, PnL, GrPn, and KB in the CWF population, suggesting additive gene action for these characteristics. High GAM (>40%) was observed for PnWt, GrPn, GrWt, and Till in the BWF RILs and for PnWt, GrPn, and PY in the CWF RILs.

Table 4. Descriptive statistics and broad-sense heritability (H%) for 15 traits in both the RIL populations (BWF and CWF).
Trait correlation
The correlation coefficients among the 15 traits for both RIL populations (100 lines each) are shown in Figures 2A, B. GrPn, PnWt, GrWt, and GL were positively correlated with PY in BWF, and PnL, PnWt, GrPn, and GB were positively correlated with PY in CWF (Figures 2A, B). In contrast, HD and MT were negatively correlated with PY in BWF, and KL, HD, and MT were negatively correlated with PY in CWF. In BWF, the highest positive correlation (0.88) was observed between KL and GL, followed by GL and GrWt (0.79) and GL and PnWt (0.61). HD was negatively correlated with three traits in BWF: GrWt (−0.22), GB (−0.0.23), and KB (−0.03). In CWF, the highest positive correlation (0.76) was found between HD and MT, followed by GL and KL (0.61). PY was positively correlated with GrPn and PnWt in both RIL populations (BWF and CWF) and negatively correlated with HD and MT. The correlation analysis therefore indicates that GrPn, PnL, PnWt, GrWt, and GL are the most important traits to consider when developing high-yielding breeding lines.
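For readers who want to reproduce a correlation matrix of this kind, the following minimal sketch computes Pearson’s correlation coefficient for one pair of traits. The trait values are invented for illustration and are not data from this study; the function name is a hypothetical helper.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant trait")
    return sxy / math.sqrt(sxx * syy)


# Invented line means for two traits (illustration only).
grains_per_panicle = [110, 145, 98, 160, 132, 121]
plant_yield_g = [18.2, 25.1, 15.9, 27.4, 22.0, 20.3]
print(f"r(GrPn, PY) = {pearson_r(grains_per_panicle, plant_yield_g):.2f}")
```

Applying the function to every pair of the 15 trait columns yields the symmetric matrix displayed as a correlation heat map.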

Figure 2. Pearson’s correlations, PCA bi-plot, and clustering dendrogram for both the RIL populations (BWF and CWF) based on 15 agro-morphological traits. Pearson’s correlation coefficient matrix of the 15 agro-morphological traits of 100 BWF RILs (A) and 100 CWF RILs (B). Scree plot showing eigenvalue % and components of the RIL populations BWF (C) and CWF (D). PCA bi-plot distribution of 100 RILs and 15 quantitative traits across the first two components based on PCA scores of BWF (E) and CWF (F). Clustering dendrogram of the two RIL populations using 15 traits showing five clusters in BWF (G) and four clusters in CWF (H), where different colors and heights of the clusters tree indicate the grouping of the genotypes into different main clusters. Plant height (PH), flag leaf length (FFL), flag leaf width (FLW), panicle length (PnL), panicle weight (PnWt), grain per panicle (GrPn), grain length (GL), grain breadth (GB), kernel length (KL), kernel breadth (KB), 1,000 grain weight (GrWt), tiller number (Till), heading date (HD), maturity time in days (MT), and single plant yield (PY).
Cluster analysis and dendrogram construction
The UPGMA dendrogram grouped the 100 BWF RILs into five clusters and the 100 CWF RILs into four clusters (Figures 2G, H). The cluster means for the 15 traits among the 100 BWF RILs and 100 CWF RILs are presented in Table 5. In the BWF dendrogram, cluster I consists of a single genotype, the wild rice (blue), which is grouped separately, whereas cluster II contains a single genotype with the shortest PH of only 60 cm (yellow); clusters III, IV, and V were polygenotypic, comprising 72, eight, and 24 genotypes, respectively (gray, purple, and greenish colors). Similarly, the 100 genotypes of the CWF RIL population were grouped into four clusters, I, II, III, and IV, with 1, 8, 52, and 42 genotypes, respectively (different colors in the dendrogram).
The Mahalanobis D² statistic is widely used to analyze the relative contribution of various yield components to total divergence and to classify genotypes into clusters based on their genetic distances (D² values) following Tocher’s method. It estimates the relative contribution of several components at the intra- and intercluster levels, and genotypes drawn from widely divergent clusters are likely to form heterotic combinations. The 100 genotypes of the BWF RIL population were grouped into seven clusters following Tocher’s method based on Mahalanobis D² distance values (Tables 6 and 7). Among them, five clusters were polygenotypic (I, IV, V, VI, and VII), whereas clusters II and III were monogenotypic, suggesting that these genotypes are genetically unique (Supplementary Table S3). The average intra- and intercluster D² values are presented in Table 6. Intra-cluster D² values ranged from 0.00 (clusters II and III) to 2,233.83 (cluster VI), followed by clusters IV (1,688.55), V (1,567.85), VII (1,299.42), and I (952.77). The highest intra-cluster distance (2,233.83), in cluster VI, indicates wide genetic variation among the genotypes belonging to this cluster, whereas cluster I recorded the lowest non-zero intra-cluster distance (952.77), suggesting a closer relationship and a low degree of diversity among its genotypes. Clusters II and III consisted of only one genotype each and hence had no intra-cluster distance (0.00). The largest intercluster distance in the BWF lines was found between clusters IV and VII (25,865.50), indicating that the genotypes in cluster IV were highly divergent from those in cluster VII. The smallest distance was observed between clusters I and II (4,047.36), indicating that the genotypes in these clusters were closely related (Table 6). The cluster-wise mean values for the 15 characters in BWF are presented in Table 7.
These values help in assessing the superiority of clusters when characters are improved through a hybridization program. The cluster means showed a wide range of variation for the majority of the characters studied. Cluster II recorded the highest mean values for most of the traits, followed by clusters IV, VI, and VII (Table 7). The contribution of the different traits to total divergence is also shown in Table 7. GrPn (19.00%) made the largest contribution toward genetic divergence, followed by GrWt (13.90%), KB (13.60%), GB (10.50%), and PH (8.50%). Thus, of the 15 agro-morphological traits studied, these five traits (GrPn, GrWt, KB, GB, and PH) together accounted for 65.50% of the total divergence (Table 7). For the CWF population, the D2 matrix grouped the 100 RILs into 11 Tocher’s clusters, of which seven were multi-genotypic and four were mono-genotypic (Supplementary Table S4). The contribution of each of the 15 traits to the overall genetic divergence in CWF is given in Table 8. The contribution toward the total variation was highest for GrWt (24.32%), followed by the other traits (Table 8). Cluster XI had the highest PY (34.95 g), with the maximum contributions from GrWt, GL, KB, PH, GB, PnL, GrPn, and PnWt, followed by clusters X (33.65 g) and VIII (32.01 g) (Table 9). The average intra- and intercluster distances for the 11 clusters indicate the degree of divergence within and between the groups (Tables 8 and 9). The largest intercluster distance (25,817.49) was found between cluster II (CW27, CW71, CW95, CW23, CW16, CW84, CW85, and CW81) and cluster IV (CW44, CW48, CW26, and CW31), whose genotypes were therefore the most divergent. According to the D2 cluster matrix, cluster VII, comprising the RIL genotypes CW20 and CW34, had the largest intra-cluster distance (2,942.63).
The maximum heterosis would therefore be expected from a cross between genotypes from clusters II and IV, which had the greatest genetic distance (25,817.49).
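The percent contribution of each trait to total divergence is often obtained by a rank-based count: for every pair of genotypes, the trait with the largest squared difference is ranked first, and a trait's contribution is the share of pairs in which it ranks first. The text does not state how the reported percentages were derived, so the sketch below is an assumption about the procedure, and the function name `percent_contribution` and the standardized scores are hypothetical. The final lines only re-add the five BWF contributions quoted above.

```python
from itertools import combinations


def percent_contribution(values, traits):
    """Percent of genotype pairs in which each trait has the largest squared difference.

    values: one row per genotype, one (standardized) score per trait.
    """
    counts = {t: 0 for t in traits}
    pairs = list(combinations(range(len(values)), 2))
    for i, j in pairs:
        sq = [(values[i][k] - values[j][k]) ** 2 for k in range(len(traits))]
        counts[traits[sq.index(max(sq))]] += 1  # trait ranked first for this pair
    return {t: 100.0 * c / len(pairs) for t, c in counts.items()}


# Hypothetical standardized scores for four genotypes; NOT data from this study.
traits = ["GrPn", "GrWt", "PH"]
values = [
    [0.0, 0.0, 0.0],
    [2.0, 0.5, 0.1],
    [0.2, 1.5, 0.3],
    [2.1, 1.4, 0.6],
]
contribution = percent_contribution(values, traits)
print(contribution)

# Arithmetic check on the five BWF contributions quoted in the text.
top_five_bwf = [19.00, 13.90, 13.60, 10.50, 8.50]
top_five_total = round(sum(top_five_bwf), 2)
print(top_five_total)
```

With these toy scores, GrPn ranks first in four of the six pairs and GrWt in two. The quoted BWF contributions sum to 65.50, in agreement with the combined figure given for the five traits.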

Table 7. Cluster mean values estimated by Tocher’s method from 100 BWF RIL populations and the percent contribution of each trait toward total divergence.

Table 9. Cluster mean values estimated by Tocher’s method from 100 CWF RIL populations and the percent contribution of each trait toward total divergence.
Principal component analysis (PCA)
PCA was conducted to determine the independent contribution of each trait under study and to reveal the patterns of genetic variation among the rice RIL genotypes. The scree plot (Figures 2C, D) showed the amount of variance explained by each PC. In the BWF RILs, the first four PCs, each with an eigenvalue >1, accounted for 73.74% of the cumulative variability (PC1, 35.52%; PC2, 17.43%; PC3, 13.09%; and PC4, 7.68%), indicating substantial variability (Table 10). PC1 had high loadings for PnWt (0.350), GrWt (0.346), GL (0.335), GB (0.324), KL (0.332), PY (0.356), and others (Table 10). PnWt, GrWt, GL, GB, KL, and PY were the main contributing variables and thus the major drivers of differences among the 100 genotypes of the BWF population. The PCA bi-plot shows relationships among genotypes, traits, and environments in a simplified manner. Using the first two PCs as the X- and Y-axes, it indicated the comparative genetic distance between genotypes and phenotypic characteristics (Figure 2E), with the genotypes distributed across all four quadrants according to their genetic diversity. The RILs BW9, BW19, BW81, BW44, BW86, and BW88 were projected close to the bi-plot vectors of GrPn, GrWt, GL, KL, and PY, demonstrating a positive association with these traits (Figure 2). In the CWF RILs, PCA of the yield and yield-contributing traits gave six PCs that together explained 71.90% of the total variation (Figure 2; Table 11).
PC1, PC2, PC3, PC4, PC5, and PC6 accounted for 20.026%, 14.267%, 11.569%, 9.680%, 8.394%, and 7.961%, respectively, of the total variability in the CWF RILs. The PCA scores of the 100 CWF RIL genotypes for the first two PCs were plotted on a bi-plot, in which the genotypes were distributed across all four quadrants according to their diversity (Figure 2F). In this bi-plot, the RILs CW1, CW11, CW16, CW26, CW29, CW32, CW36, CW40, CW41, CW44, CW57, CW79, CW81, CW84, CW85, CW93, CW96, and CW98 were superior for PnWt, PnL, GrWt, PY, GL, GB, and KL (Figure 2; Table 11). The BWF RIL genotypes with a high positive score for PC1 were BW5 (1.912), BW6 (1.762), BW7 (1.577), BW8 (1.142), BW12 (1.950), BW15 (1.634), BW17 (1.935), BW18 (2.162), BW23 (4.285), BW24 (2.965), BW25 (2.786), BW30 (1.419), BW31 (1.625), BW44 (2.215), BW52 (2.737), BW77 (2.411), BW88 (2.564), BW90 (2.511), BW99 (3.362), and BW86 (1.950); these genotypes were superior for the traits PnWt, GrWt, GL, GB, KL, and PY.
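The link between the reported percentages and the eigenvalue >1 criterion can be checked with simple arithmetic. The sketch below assumes that PCA was run on the correlation matrix of the 15 standardized traits, so that the eigenvalues sum to 15; this is an assumption, since the text does not state it, and the helper names are hypothetical. Only the percentages quoted in the text are used as inputs.

```python
N_TRAITS = 15  # number of traits analysed in the study


def implied_eigenvalue(percent_variance, n_traits=N_TRAITS):
    """Eigenvalue implied by a PC's variance share when total variance equals n_traits."""
    return percent_variance / 100.0 * n_traits


def cumulative(percents):
    """Running total of percent variance across successive PCs, rounded to 2 decimals."""
    total, out = 0.0, []
    for p in percents:
        total += p
        out.append(round(total, 2))
    return out


bwf = [35.52, 17.43, 13.09, 7.68]                     # PC1-PC4, as quoted in the text
cwf = [20.026, 14.267, 11.569, 9.680, 8.394, 7.961]   # PC1-PC6, as quoted in the text

bwf_eigen = [implied_eigenvalue(p) for p in bwf]
cwf_eigen = [implied_eigenvalue(p) for p in cwf]
bwf_cum = cumulative(bwf)
cwf_cum = cumulative(cwf)
print(bwf_eigen, bwf_cum)
print(cwf_eigen, cwf_cum)
```

Under this assumption every retained PC has an implied eigenvalue above 1 (about 1.15 for BWF PC4 and about 1.19 for CWF PC6), and the CWF percentages accumulate to 71.90%. The rounded BWF percentages accumulate to 73.72%, marginally below the reported 73.74%, which is presumably a rounding difference.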

Table 10. Contribution of different traits toward the total variance in 100 BWF RIL populations (eigenvalue, contribution of variability, and factor loadings of PCs).

Table 11. Contribution of different traits toward the total variance in 100 CWF RIL populations (eigenvalue, contribution of variability, and factor loadings of PCs).
Pre-breeding lines with pigmented grains and nutritional benefits
Wide phenotypic variation was detected in grain color, which ranged from white, light brown, reddish-brown, brown, deep brown, reddish, red, blackish-red, greenish, blackish-brown, and black to deep black (Figure 1; Supplementary Tables S1 and S2), broadly showing a 9:6:1 ratio. We also observed many breeding lines in the CWF cross with purple leaf coloration together with black pericarp and black husk color (Figure 1). The purple leaf trait has been inherited from the F3 generation onward, suggesting that it was newly acquired by the breeding lines, as the parental lines lacked this trait. HR-LCMS-QTOF metabolomics analysis detected several anthocyanin pigment compounds in our black rice breeding lines (BW23 and CW16), such as petunidin 3-O-glucoside, peonidin 3-O-glucoside, peonidin 3-galactoside, cyanidin 3-O-glucogalactoside, and pelargonin, among a total of 46 metabolite compounds (Figure 3; Table 12). The most common metabolites identified were catechin, oryzanol, gallic acid, caffeic acid, quinic acid, quercetin, 3,5-dihydroxybenzoic acid, rutin, luteolin 4′-O-glucoside, heptadecatrienoic acid, PAB/4-aminobenzoic acid, kaempferol 7-O-glucoside, peganine, maritimetin, mitoxantrone, methyl 2-(10-heptadecenyl)-6-hydroxybenzoate, zinnimidine, azafrin, tubulosine, and other metabolite compounds of medicinal importance (Table 12). The total amino acid content of the pigmented grain of our pre-breeding lines was quantified by the HR-LCMS-QTOF method. It ranged from 8.76 mg/100 g (BW23) to 8.81 mg/100 g (CW16) on a dry weight basis, with the following amino acids detected: aspartic acid, alanine, arginine, cysteine, glutamate, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline (hydroxyproline), serine, threonine, tyrosine, glutamine, and valine (Figure 3; Table 13).
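A segregation pattern such as the 9:6:1 ratio noted for grain color is usually checked with a chi-square goodness-of-fit test after pooling the color classes into three groups. The sketch below is illustrative only: the observed counts are hypothetical and the pooling into three classes is an assumption, since the section does not report class counts. The critical value 5.991 is the standard chi-square threshold for 2 degrees of freedom at the 5% level.

```python
def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit statistic for observed counts against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


CRITICAL_DF2_05 = 5.991  # chi-square critical value, 2 degrees of freedom, alpha = 0.05

# Hypothetical counts for three pooled color classes in 100 lines; NOT study data.
observed = [58, 35, 7]
stat, expected = chi_square_gof(observed, [9, 6, 1])
fits_ratio = stat < CRITICAL_DF2_05
print(round(stat, 4), expected, fits_ratio)
```

A statistic below the critical value means that the counts do not deviate significantly from the tested ratio. The same function can be used for 9:7 or 9:3:4 by changing the ratio and the number of classes, with the critical value adjusted to the corresponding degrees of freedom.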

Figure 3. Chromatogram of HR-LCMS-QTOF for the identification of the total free amino acids from two black rice breeding lines (BW23 and CW16) and the qualitative detection of metabolites including anthocyanin pigments (petunidin 3-O-glucoside) from two black rice breeding lines (BW23 and CW16) responsible for black pigmentation and grain quality. (A) Chromatogram of amino acids identification. (B) Comparative amino acids composition of two lines represented using a bar graph. (C) Chromatogram of total metabolites detection from black rice lines (BW23 and CW16). (D) MS zoomed spectra of antioxidant and anthocyanin pigments such as 5-acetyl-3,4-dihydro-2H-pyrrole, petunidin 3-O-glucoside, cyanidin 3-O-glucogalactoside, and peonidin 3-galactoside.

Table 12. Distinct types (46 components) of metabolites (anthocyanins and others) qualitatively detected from two black rice RIL lines (BW23 and CW16) through HR-LCMS-QTOF and their nutritional values.

Table 13. Total amino acids in black rice breeding lines (mg/100 g dry weight basis) quantitatively detected by using the HR-LCMS-QTOF method.
Discussion
Yield enhancement with pigmented grain quality in both the RIL populations
In the present study, two distinct RIL populations (BWF and CWF), each comprising 100 genotypes, were developed and characterized using 15 yield-related agro-morphological traits, which revealed very high variability. ANOVA was employed to assess the significance of phenotypic variation in these traits across the BWF and CWF populations. The results indicated significant genotypic differences (p < 0.001) for most traits, confirming the presence of substantial genetic variation introgressed from O. rufipogon (Table 1). The results were consistent with the general notion that the larger the divergence between the parental genotypes, the higher the heterosis in crosses (Falconer, 1964) (Tables 2 and 3). The mean PY was 14.95 g, 22.65 g, and 10.66 g in the control black rice Chakhao, parent 1 Badshabhog, and parent 2 wild rice (O. rufipogon), respectively, whereas PY in the BWF RIL genotypes ranged from 5.00 g to 61.89 g with a mean of 28.62 g (Table 2). In the CWF cross, the mean PY was 14.95 g, 24.79 g, and 10.66 g in the control black rice Chakhao, parent 1 Chenga, and parent 2 wild rice (O. rufipogon), respectively, and PY in the CWF RIL genotypes ranged from 10.66 g to 43.87 g with a mean of 22.80 g (Table 3). At least 15 of the 100 BWF RILs were considered promising because they showed higher PY during two kharif seasons (2022 and 2023) and earlier maturity (125–135 days) than the control black variety Chakhao from Manipur (153 days; plant height of 156.55 cm).
The following breeding lines showed high PY: BW6 (33.48 g), BW18 (40.37 g), BW23 (61.89 g), BW24 (56.59 g), BW25 (51.93 g), BW26 (34.19 g), BW33 (32.48 g), BW44 (31.70 g), BW50 (42.74 g), BW77 (37.67 g), BW83 (50.22 g), BW88 (35.94 g), BW90 (43.93 g), BW91 (26.30 g), and BW99 (60.89 g) from the BWF population and the genotypes CW1 (37.99 g), CW11 (37.11 g), CW16 (43.87 g), CW20 (25.70 g), CW23 (26.32 g), CW39 (32.59 g), CW69 (27.86 g), CW78 (32.44 g), CW79 (37.76 g), CW80 (29.11 g), CW90 (33.65 g), CW94 (29.62 g), CW95 (22.78 g), CW97 (25.50 g), and CW98 (23.44 g) from the CWF population (Tables 2 and 3). Our finding was consistent with the earlier report that yield was enhanced when local rice Chakhao Poireiton (purple) was crossed with HYV Sahbhagi Dhan (white) and Shasharang (light brown) (Lap et al., 2024).
Genetic variability parameters
A wide range of phenotypic diversity was observed in both RIL populations, which showed transgressive segregation for GL, grain number per panicle, GrWt, PnL, and PnWt (Figure 1; Tables 2 and 3; Supplementary Tables S1 and S2). These results indicate substantial genetic variation among the genotypes of both RIL populations (BWF and CWF) (Tables 2 and 3). The nature and magnitude of this genetic divergence were estimated with genetic variability parameters (including broad-sense heritability), the Mahalanobis D2 statistic, PCA, and cluster analysis using standard formulae. The genetic divergence found in our pre-breeding RIL populations is consistent with earlier findings (Faysal et al., 2022; At-Ul-Karim et al., 2022; Mondal et al., 2024; Mudhale et al., 2024; Bordoloi et al., 2024). Success in crop improvement generally depends on the magnitude of genetic variability and/or diversity and the extent to which the desirable characteristics are heritable (Mondal et al., 2024). Heritability is crucial in determining a trait’s response to selection and predicting the transmission of desirable characteristics from parents to offspring during breeding (Acquaah, 2009). Traits with high heritability, such as PH, active tillering, filled grain per plant, GL, FLW, PnL, and GrWt, have been widely reported to affect rice yield (At-Ul-Karim et al., 2022). The yield of rice is controlled by three key components: the number of effective panicles, the number of grains per panicle, and grain weight (Zheng et al., 2024). Agronomic traits such as GrWt and PH have been widely used for the improvement of rice yield in breeding programs (Hernandez-Soto et al., 2021).
In this study, moderate-to-high heritability (74.42%–98.71%) was observed in both F8:9 RIL populations (BWF and CWF), indicating a moderate-to-high level of genetic control of the traits associated with yield (At-Ul-Karim et al., 2022; Hernandez-Soto et al., 2021; Zheng et al., 2024), and the RILs subsequently showed increased yield during field trials in two kharif seasons (2022 and 2023) (Tables 2–4). The RIL genotypes (BWF and CWF) showed high yield potential owing to the high heritability of the yield-enhancing characters evaluated in the present study (PH, active tillering, filled grain per plant, GL, FLW, PnL, and GrWt) (Tables 2 and 3). High heritability (>80%) with high GA (>20) was observed for PH, PnL, GrPn, GrWt, and PY in the BWF lines and for PH, PnL, GrPn, and KB in the CWF lines, suggesting additive gene action and that these traits could contribute substantially to the yield improvement of the pre-breeding lines (Tables 2–4). This strong genetic control is likely due to novel alleles from O. rufipogon, and it suggests that introgression enhanced the genetic contribution to phenotypic variation, facilitating selection for improved traits. High heritability (>90%) combined with high genetic advance as percent of mean (>40%) was also observed for PnWt, GrPn, and PY in the CWF RIL population, indicating strong additive genetic effects for increasing yield (Table 4). High heritability indicates strong genetic control, and genetic advance highlights the potential for trait improvement; both are likely driven by O. rufipogon alleles.
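The variability parameters discussed here (GCV, PCV, broad-sense heritability, genetic advance, and genetic advance as percent of mean) can be sketched from ANOVA mean squares with commonly used estimators. The sketch assumes a replicated trial in which the genotypic variance is (MS genotype − MS error)/r and the phenotypic variance is the sum of the genotypic and error variances, with a selection differential of 2.06 at 5% selection intensity. The function name and all input values are hypothetical, and the study may have used a different variant of these formulae.

```python
import math

K_5_PERCENT = 2.06  # standardized selection differential at 5% selection intensity


def variability_parameters(ms_genotype, ms_error, replications, mean):
    """Estimate GCV, PCV, broad-sense heritability, GA, and GA as percent of mean."""
    var_e = ms_error                                  # error (environmental) variance
    var_g = (ms_genotype - ms_error) / replications   # genotypic variance
    var_p = var_g + var_e                             # phenotypic variance
    h2 = 100.0 * var_g / var_p                        # broad-sense heritability (%)
    ga = K_5_PERCENT * math.sqrt(var_p) * h2 / 100.0  # expected genetic advance
    return {
        "GCV": 100.0 * math.sqrt(var_g) / mean,
        "PCV": 100.0 * math.sqrt(var_p) / mean,
        "H2": h2,
        "GA": ga,
        "GAM": 100.0 * ga / mean,
    }


# Hypothetical mean squares and trait mean; NOT values from Tables 2-4.
params = variability_parameters(ms_genotype=250.0, ms_error=10.0, replications=3, mean=25.0)
print({k: round(v, 2) for k, v in params.items()})
```

In this toy case the heritability is about 88.9% and the genetic advance as percent of mean is about 69.5%, a combination that would conventionally be read as evidence of additive gene action.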
Many genes/QTLs for yield-enhancing traits (spikelet number, grain number per panicle, PnL, GrWt, grain size, grain yield, and GL) have been identified in the progenitor wild rice and introgressed into elite rice germplasm to enhance yield characters (McCouch et al., 2007; Gaikwad et al., 2021; Roy and Shil, 2020a; Xu et al., 2021; Malik et al., 2022; Seck et al., 2023). Significant genetic gain can be achieved in improving varieties by utilizing novel genes of the neglected wild rice to restore the genetic diversity and allelic variation lost during domestication (Siddiq and Vemireddy, 2021; Eizenga et al., 2024). Superior genes/QTLs of agronomic importance from wild rice (O. rufipogon) can be directly incorporated into breeding programs to generate pre-breeding material, which will serve as a valuable germplasm resource for rice breeding (Henry, 2022; Zhang J. et al., 2022; Bedford et al., 2023; Zheng et al., 2024). Such yield-enhancing allelic variants, genes, or QTLs have most likely been introgressed into our pre-breeding lines from untapped wild rice (O. rufipogon), as both RIL populations (BWF and CWF) showed high heritability with high genetic advance for yield characteristics and also exhibited heterotic phenotypic features (Tables 2–4). The greater phenotypic variance observed in both RIL populations compared with the parental lines could be attributed to transgressive segregation of these yield-related components and to gene interactions (Tables 2 and 3). Therefore, our pre-breeding materials (RILs) are valuable resources for rice improvement programs and may provide a powerful tool for broadening the genetic base of breeding materials to improve rice productivity, including climate change-resilient phenotypes along with high yield and grain quality.
The improvement of rice grain quality has become an important breeding target in almost all rice breeding programs since the early 1980s (Mackill and Khush, 2018). Some of the grain qualities (ASV, GT, GC, and aroma) were evaluated in the present study and are depicted in Supplementary Tables S1 and S2.
Clustering and dendrogram construction
The BWF RIL population was grouped into five clusters according to the cluster means of the 15 traits, and the CWF RIL population was grouped into four clusters, indicating the relationships among the 100 genotypes of each population (Table 5; Figures 2H, G). These findings are consistent with the earlier studies of Ahmad et al. (2015) and Bordoloi et al. (2024).
Mahalanobis D2 test for genetic diversity assessment in BWF and CWF
The Mahalanobis D2 test is a multivariate statistical tool used to measure the genetic divergence, or distance, between genotypes based on multiple traits. Genotypes separated by large D2 values are genetically divergent, and the D2 values can be used to group genotypes into clusters using Tocher’s method. On this basis, the BWF RIL population was grouped into seven clusters and the CWF RIL population into 11 distinct clusters, reflecting the successful introgression of wild alleles (Tables 6 and 7). The average intercluster distances were greater than the average intra-cluster distances, suggesting that the genotypes of both RIL populations (BWF and CWF) possess a high degree of genetic diversity. The largest intercluster distance in the BWF lines was found between clusters IV and VII (25,865.50), indicating that the genotypes in cluster IV were highly divergent from those in cluster VII. In the CWF RIL population, the largest intercluster distance (25,817.49) was found between clusters II and IV. The maximum intercluster distance indicates wide diversity, whereas the minimum suggests a close relationship between the groups (At-Ul-Karim et al., 2022). Crossing genotypes separated by the largest genetic distance in yield-attributing parameters is expected to result in the complementation of gene effects in a hybridization program, and such genotypes were detected among the present RILs (Tables 6 and 7). Of the 15 agro-morphological traits studied, five traits (GrPn, GrWt, KB, GB, and PH) together accounted for 65.50% of the total divergence in the BWF RIL population (Table 7). In this population, the contribution toward the total divergence was highest for GrPn (19.00), followed by GrWt (13.90), KB (13.60), GB (10.50), GL (8.80), PH (8.50), PnWt (5.30), KL (5.00), and PY (4.30) (Table 7), whereas in the CWF RIL population it was highest for GrWt (24.32) (Table 8).
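For readers unfamiliar with the statistic, the D2 value between two genotypes is the quadratic form of their trait differences weighted by the inverse of the pooled variance-covariance matrix of the traits. The sketch below computes it in plain Python by solving a small linear system instead of inverting the matrix. The function names, the two-trait vectors, and the covariance matrices are hypothetical and serve only to show how correlation between traits changes the distance.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z


def mahalanobis_d2(x, y, cov):
    """D2 = (x - y)^T cov^{-1} (x - y) for two trait vectors and a covariance matrix."""
    d = [xi - yi for xi, yi in zip(x, y)]
    z = solve(cov, d)  # z = cov^{-1} d
    return sum(di * zi for di, zi in zip(d, z))


# Hypothetical two-trait means for two genotypes; NOT data from this study.
x, y = [10.0, 3.0], [6.0, 1.0]
d2_uncorrelated = mahalanobis_d2(x, y, [[4.0, 0.0], [0.0, 1.0]])
d2_correlated = mahalanobis_d2(x, y, [[4.0, 2.0], [2.0, 4.0]])
print(d2_uncorrelated, d2_correlated)
```

With uncorrelated traits the D2 value is simply the sum of squared differences scaled by each variance (8 here), whereas a positive covariance between the traits lowers the distance for the same trait differences (4 here). This is why D2 is preferred to a plain Euclidean distance when yield components are correlated.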
Overall, the traits that contributed most to the diversity observed in our study were PH, GrPn, GrWt, GB, KL, PY, and KB (Tables 8 and 9). These traits should be taken into consideration while selecting parents for hybridization. The Mahalanobis distance-based clustering of the genotypes of both RIL populations (BWF and CWF) into several groups confirmed the extent of diversity present in the developed pre-breeding lines and provides scope for its exploitation through breeding for yield improvement in rice (Tables 6–9). The narrow genetic base of modern rice varieties has led to yield plateaus, making it essential to introduce genetic diversity to overcome these barriers. Moreover, the genetic bottleneck that occurred during the domestication of cultivated rice from its immediate wild progenitor O. rufipogon reduced allelic diversity in cultivated rice by at least 50%–60% relative to O. rufipogon, leading to the loss of genetic variability, including variability for yield potential (Xiao et al., 1996; Tanksley and McCouch, 1997; Brar and Khush, 2006). Wild rice O. rufipogon is considered a reservoir of many untapped genes/QTLs for important agronomic traits, such as yield, quality, nutritional characteristics, and resistance to biotic and abiotic stresses, and it can be utilized in pre-breeding programs to broaden the genetic base of the released varieties and break the yield plateaus (Tanksley and McCouch, 1997; McCouch et al., 2007; Tester and Langridge, 2010; Sanchez et al., 2013; Brar and Khush, 2018).
The transfer of genes controlling desirable traits (yield and grain quality) from the wild relative O. rufipogon to cultivated rice is an important strategy in rice breeding (Tian et al., 2006; Huang et al., 2012B; Sanchez et al., 2013; Qiao et al., 2016; Gaikwad et al., 2021; Zhang B. et al., 2022). Pre-breeding facilitates the introgression of desirable traits such as yield potential, nutritional quality, and resistance to biotic and abiotic stresses (Rao et al., 2020; Bin Rahman and Zhang, 2022; Alam et al., 2024). Our results corroborate earlier findings that wild rice serves as a valuable genetic resource for enhancing rice varieties by widening the genetic base through the introgression of hidden alien genes/QTLs for high yield potential (Tanksley and McCouch, 1997; Henry, 2022; Bedford et al., 2023; Zheng et al., 2024). A complex trait such as grain yield is controlled by many genes, is influenced by the environment, and is related to other traits such as plant type, growth duration, and other yield-component traits (Mudhale et al., 2024; Bordoloi et al., 2024). The present results are consistent with earlier analyses in which higher intercluster distances in breeding lines indicated wide trait variability and genetic divergence (Faysal et al., 2022; At-Ul-Karim et al., 2022; Mondal et al., 2024). Both RIL populations (BWF and CWF) showed wide genetic divergence with respect to the 15 characters studied, signifying that the genetic base has broadened as a result of interspecific hybridization through the introgression of untapped genetic components from the underutilized wild rice (O. rufipogon) germplasm of India (Tables 2 and 3). Yield-enhancing traits such as spikelet number, grain number, grain size, grain weight, and PnL have been introgressed into populations developed from crosses of O. sativa × O. rufipogon, making O. rufipogon an ideal germplasm for mining yield-enhancing loci (McCouch et al., 2007; Malik et al., 2022; Gaikwad et al., 2021).
These traits also contribute to the high genetic variation and diversity in our pre-breeding RILs (BWF and CWF) (Tables 6–9).
Principal component analysis
PCA was used to explore the genetic diversity and population structure of the BWF and CWF populations. It assigned the yield-attributing traits to distinct PCs, thereby revealing each trait’s contribution to genetic divergence. The first four components in the BWF RIL population (Table 10) and the first six components in the CWF population (Table 11) were considered the primary PCs, as they had eigenvalues greater than 1 and explained most of the variability. The largest share of variation was captured by the first two PCs in both populations; hence, selecting genotypes on the basis of these PCs will be useful for obtaining higher genetic variation with higher yields (Figure 2; Tables 10 and 11). Breeders can use selection to influence the traits identified as significant in the divergence analysis of the BWF and CWF RIL populations. In this study, only PH and PnWt exhibited positive values in the first four PCs associated with divergence in the BWF RIL population (Table 10). PC1 and PC2 in the PCA bi-plot showed the dispersion and nature of diversity for both variables and genotypes (Figure 2). The cumulative variance of 73.74% explained by the first four axes with eigenvalues >1.00 indicates that the identified traits strongly influenced the phenotype of the RILs and could be used effectively for selection among the BWF genotypes (Table 10). The bi-plot analysis showed the relationships among the morphological traits in the tested RIL genotypes of BWF and CWF (Figure 2). The traits influencing PC1 were PnWt, GrWt, GL, GB, KL, and PY. These results also support the GCV estimates for PnWt, GrWt, GL, GB, KL, and PY; the first three traits, along with GrPn, KB, and PH, also corroborated the Mahalanobis distance-based divergence in the present study. The PCA and Mahalanobis D2 analyses further support the introduction of novel genetic diversity, as the RILs exhibit distinct genetic profiles and increased divergence from the cultivated parents.
These findings demonstrate that O. rufipogon introgression successfully broadened the genetic base, enhancing the diversity and the potential for developing resilient, high-yielding rice varieties (Tables 2 and 3). In the PCA bi-plot of the 100 BWF RILs, the RILs BW18, BW23, BW24, BW25, BW44, BW52, BW77, BW83, BW88, BW90, and BW99 were superior for PnWt, GrWt, PY, GL, GB, and KL. Hence, these PCA results will help breeders identify parents and select characters for future hybridization programs for varietal improvement. PCA clarified the genetic diversity of the breeding lines (BWF and CWF) and resolved the component traits contributing to variability, which could provide the framework for a well-designed hybridization program. The length of a trait’s vector in the PCA bi-plot represents its contribution to the overall divergence; the longer the vector, the larger the contribution (Figure 2). The traits with the longest vectors therefore contributed the most to the total diversity. These results are in conformity with the findings of Bhuvaneswari et al. (2020) and Mondal et al. (2024). The dispersion of RILs across the PCA axes confirmed the introduction of novel genetic diversity, broadening the genetic base beyond that of O. sativa.
Pre-breeding lines with pigmented grains and nutritional benefits
The wild species O. rufipogon is extensively used for mining new genes for biotic/abiotic stress tolerance and high-value QTLs for yield and grain quality traits (Gaikwad et al., 2021; Malik et al., 2022). Pre-breeding (O. sativa × O. rufipogon) has been utilized not only for improving qualitative and quantitative traits but also for introgressing new useful variability, which underlines the potential of O. rufipogon as a valuable reservoir of genetic variation (Tanksley and McCouch, 1997; Brar and Khush, 2018; Roy and Shil, 2020b; Gaikwad et al., 2021; Henry, 2022; Malik et al., 2022). In the present investigation, the most novel genetic change that we observed in the progeny populations (BWF and CWF) was the appearance of rice lines with black pericarp. We also observed many breeding lines in the CWF cross with purple leaf coloration together with black pericarp and black husk color (Figure 1; Supplementary Tables S1 and S2). The purple leaf trait has been inherited from the F3 generation onward, suggesting that it was newly acquired by the breeding lines, as the parental lines lacked this trait.
This finding is consistent with earlier results showing that new useful genetic variation may be created in progeny populations derived from crosses with the wild progenitor O. rufipogon (Tanksley and McCouch, 1997; Sanchez et al., 2013; Brar and Khush, 2018; Gaikwad et al., 2021; Xia et al., 2021; Malik et al., 2022). Notably, the parental lines were non-black (O. rufipogon, red grain; Chenga, brown; and Badshabhog, white) and had only the green leaf character. Wide phenotypic variation was detected in grain color (pericarp pigmentation), which ranged from white, light brown, reddish-brown, brown, deep brown, reddish, red, blackish-red, greenish, blackish-brown, and black to deep black (Figure 1; Supplementary Tables S1 and S2), broadly showing a 9:6:1 ratio (polymeric gene interaction) in the progeny populations. Other segregation ratios, such as 9:7 (complementary gene interaction) and 9:3:4 (supplementary gene interaction), have also been reported earlier (Devi et al., 2020; Lap et al., 2024). The exceptional range of color variation also supports the view that grain color is polygenically inherited, being controlled by many genes or quantitative trait loci (QTLs), and may involve as yet unidentified genes (Oikawa et al., 2015; Ham et al., 2015; Devi et al., 2020; Pham et al., 2024). Our present study is consistent with earlier analyses showing that domestication and population divergence in crops produce considerable phenotypic changes, reflecting their genomic evolutionary trajectories, particularly in structural variants (SVs) and gene expression (Shang et al., 2022; Xu et al., 2024b).
Following this notion, the black pericarp may have arisen either through the reappearance of a known genetic construct (the Kala4 gene with a LINE1 insertional mutation) from hidden genetic components or through new genetic constructs acquired de novo by the exchange of genomic segments during meiotic recombination or chromosomal rearrangement of SVs, leading to neofunctionalization. The nutritional quality of rice is determined by the levels of starch, protein, lipids, minerals, vitamins, and phytochemicals (Senguttuvel et al., 2023). Pigmented rice varieties (brown, red, and black) are gaining popularity among consumers due to their nutritional health benefits (Shao et al., 2018; Mendoza-Sarmiento et al., 2023; Xie et al., 2024; Zhu et al., 2024; Idrishi et al., 2024; Sakulsingharoj et al., 2024; Gogoi et al., 2024), and market demand is expected to increase (Kushwaha, 2016; Ito and Lacerda, 2019; Bhuvaneswari et al., 2023). Pigmented rice accumulates various types of secondary metabolites, such as phytosterols, polyphenols, flavonoids, anthocyanins, proanthocyanidins, vitamins, and micronutrients (Shao et al., 2018; Mendoza-Sarmiento et al., 2023; Zhu et al., 2024; Idrishi et al., 2024), which are recognized to have high nutritional value and medicinal properties, with antioxidant, antimutagenic, anticancer, antiviral, antidiabetic, anti-inflammatory, and antiaging potential (Mbanjo et al., 2020; Das et al., 2023; Sakulsingharoj et al., 2024; Gogoi et al., 2024). However, pigmented rice landraces often have lower yields and less favorable agronomic traits, necessitating pre-breeding to integrate their nutritional benefits into high-yielding, resilient varieties.
The nutritive value of pigmented rice is greatly influenced by genetics, genotypic variation, and environmental factors (Sweeney et al., 2006; 2007; Furukawa et al., 2007; Gross and Zhao, 2014; Tiozon et al., 2023; Zhu et al., 2024), along with several external influences such as soil fertility status, the degree of milling, and the method of preparation before consumption (Gogoi et al., 2024). Through HR-LCMS-QTOF metabolomics profiling and amino acid identification in the RIL genotypes BW23 and CW16, we confirmed that our black rice lines are rich in phytonutrients (polyphenols, flavonoids, anthocyanins, oryzanol, catechin, quercitrin, kaempferol 7-O-glucoside, 5-acetyl-3,4-dihydro-2H-pyrrole, and others) that have health benefits (Figure 3; Table 12). A total of 46 key metabolites were identified by the HR-LCMS-QTOF method, and total amino acid profiling was carried out on the black rice lines (BW23 and CW16) (Tables 12 and 13). Several anthocyanin pigment compounds were qualitatively identified in our black rice lines, such as petunidin 3-O-glucoside, peonidin 3-O-glucoside, peonidin 3-galactoside, cyanidin 3-O-glucogalactoside, and pelargonin, confirming that the black color is due to the presence of anthocyanin pigments (Figure 3; Table 12). The most common metabolites identified were catechin, oryzanol, gallic acid, caffeic acid, quinic acid, quercetin, 3,5-dihydroxybenzoic acid, rutin, luteolin 4′-O-glucoside, heptadecatrienoic acid, PAB/4-aminobenzoic acid, kaempferol 7-O-glucoside, peganine, maritimetin, mitoxantrone, methyl 2-(10-heptadecenyl)-6-hydroxybenzoate, zinnimidine, azafrin, tubulosine, and other metabolite compounds that have nutritional value and medicinal importance (Table 12).
Our results are consistent with previous reports on the health benefits and medicinal uses of these compounds, which include antioxidant, antimutagenic, anticancer, antiviral, antidiabetic, anti-inflammatory, and antiaging potential (Mbanjo et al., 2020; Das et al., 2023; Sakulsingharoj et al., 2024; Gogoi et al., 2024). The total amino acid estimation ranged from 8.76 mg/100 g (BW23) to 8.81 mg/100 g (CW16) on a dry weight basis, with the following amino acids detected: aspartic acid, alanine, arginine, cysteine, glutamate, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline (hydroxyproline), serine, threonine, tyrosine, glutamine, and valine (Figure 3; Table 12). The amino acid profile of our rice is high in glutamic and aspartic acid, whereas lysine is the limiting amino acid, similar to the profile reported in another analysis (Carcea, 2021). In the present study, glutamic acid was found in the highest amount (1,650 mg/100 g) in BW23, and methionine was found in the lowest amount (70 mg/100 g) in CW16. The taste is better in CW16 due to the presence of glutamine (358 mg/100 g). Proline (as hydroxyproline) was quite high in both CW16 (642 mg/100 g) and BW23 (578 mg/100 g), leading to a popcorn-like aroma. Proteins containing amino acids such as lysine, leucine, isoleucine, and threonine are considered high-quality proteins (Min et al., 2019; Liyanaarachchi et al., 2020; Tyagi et al., 2022; Jayaprakash et al., 2022). Our results showed that both black rice breeding lines (BW23 and CW16) are nutritionally enriched because the endosperm contains high-quality proteins with amino acids such as lysine, leucine, isoleucine, and threonine (Table 13).
Consistent with earlier reports (Min et al., 2019; Liyanaarachchi et al., 2020; Tyagi et al., 2022; Jayaprakash et al., 2022), the newly developed black rice breeding lines are nutritionally enriched with respect to high-quality amino acid content and other important nutraceuticals, namely oryzanol, anthocyanin, catechin, quercitrin, kaempferol 7-O-glucoside, and 5-acetyl-3,4-dihydro-2H-pyrrole (Tables 12 and 13) (Ahmad et al., 2015). A similar pattern of metabolites has been reported in pigmented rice varieties (Bhuvaneswari et al., 2020; Zhu et al., 2024).
Conclusion
Pre-breeding is indispensable for rice improvement programs, addressing critical challenges such as climate change, food security, malnutrition, and sustainability. By unlocking the genetic potential of a wild relative (O. rufipogon), we have successfully developed two wide-ranging pre-breeding populations (F8:9), BWF and CWF, each containing 100 RILs, in the genetic background of the local indica cultivars Badshabhog and Chenga. ANOVA, broad-sense heritability (H%), genetic advance, PCA, and Mahalanobis D2 statistics collectively demonstrate that the genetic base of the BWF and CWF RIL populations has been significantly broadened through the alien introgression of untapped hidden genes from underutilized wild rice (O. rufipogon). These findings support the use of O. rufipogon as a valuable resource for rice improvement, offering novel alleles to overcome the limitations of a narrow genetic base in cultivated rice. At least 15 breeding lines from each RIL population (BWF and CWF) were regarded as promising because of their high yield performance during two kharif seasons (2022 and 2023) and their early maturity (125 days–135 days) compared with the control black rice variety Chakhao from Manipur (153 days; plant height of 156.55 cm). Single plant yield (PY) was 14.95 g in the control Chakhao, whereas our black rice lines averaged 30 g with a shorter plant height (PH; 130 cm–145 cm). Novel and unique traits, such as black pericarp and purple leaf, were discovered in the progeny lines and were inherited persistently in the RIL populations (F8:9), supporting the earlier concept that domestication and population divergence in crops produce considerable phenotypic changes that reflect their genomic evolutionary trajectories, particularly in structural variants (SVs) and gene expression.
Anthocyanin pigments responsible for black pericarp color, such as petunidin 3-O-glucoside, peonidin 3-O-glucoside, peonidin 3-galactoside, cyanidin 3-O-glucogalactoside, and pelargonin, were qualitatively identified in the black rice lines (BW23 and CW16) through HR-LCMS-QTOF, signifying that genetic components of the anthocyanin biosynthetic pathway(s) were activated in the pre-breeding materials; otherwise, the black pericarp would not be possible. This study supports the earlier concept that pre-breeding (O. sativa × O. rufipogon) can be a valuable source of genetic variation because it not only improves qualitative and quantitative traits with a great deal of genetic diversity but also introduces new functional variability. Metabolomic analysis (HR-LCMS-QTOF) identified 46 different phytonutrient metabolites in the black rice lines (BW23 and CW16), indicating improved grain quality with medicinal value and health benefits. Our pre-breeding lines can be used as an important genetic resource for improving black rice varieties for food and nutritional security.
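The variability parameters referred to in this conclusion (broad-sense heritability, genetic advance, GCV, and PCV) are conventionally derived from ANOVA mean squares. The following Python sketch is illustrative only and is not the computation pipeline used in this study: it assumes a replicated single-environment trial with variance components expressed on a plot basis, and the function name, the input mean squares, and the selection differential k = 2.06 (5% selection intensity) are hypothetical choices made for the example.

```python
import math


def variability_parameters(ms_genotype, ms_error, replications, grand_mean, k=2.06):
    """Estimate standard variability parameters from ANOVA mean squares.

    Assumptions (illustrative, not taken from the study):
    - single-environment replicated trial
    - phenotypic variance = genotypic variance + error variance (plot basis)
    - k = 2.06 corresponds to 5% selection intensity
    """
    if grand_mean <= 0:
        raise ValueError("grand_mean must be positive")

    # Genotypic variance; negative estimates are truncated to zero.
    sigma2_g = max((ms_genotype - ms_error) / replications, 0.0)
    sigma2_e = ms_error
    sigma2_p = sigma2_g + sigma2_e
    if sigma2_p <= 0:
        raise ValueError("phenotypic variance must be positive")

    # Broad-sense heritability as a fraction of phenotypic variance.
    h2 = sigma2_g / sigma2_p

    # Coefficients of variation, expressed as a percentage of the grand mean.
    gcv = math.sqrt(sigma2_g) / grand_mean * 100
    pcv = math.sqrt(sigma2_p) / grand_mean * 100

    # Expected genetic advance under selection, absolute and as % of mean.
    ga = k * math.sqrt(sigma2_p) * h2
    ga_pct_mean = ga / grand_mean * 100

    return {
        "heritability_pct": h2 * 100,
        "gcv_pct": gcv,
        "pcv_pct": pcv,
        "genetic_advance": ga,
        "genetic_advance_pct_mean": ga_pct_mean,
    }


if __name__ == "__main__":
    # Hypothetical mean squares for one trait (not data from this study).
    result = variability_parameters(
        ms_genotype=310.0, ms_error=10.0, replications=3, grand_mean=100.0
    )
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, PCV is always at least as large as GCV, and a small gap between the two indicates that little of the observed variation is environmental, which is the pattern expected when broad-sense heritability is high.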
Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.
Author contributions
SR: Conceptualization, Data curation, Formal Analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review and editing. PS: Data curation, Formal Analysis, Resources, Writing – review and editing.
Funding
The author(s) declare that no financial support was received for the research and/or publication of this article.
Acknowledgments
SR is thankful to the North Bengal University authority for providing the necessary laboratory facilities and for establishing the “Rice Germplasm Conservation & Breeding Centre” in the Department of Botany, the University of North Bengal, with an “experimental rice field” for conducting trials and selection on a large number of breeding populations. The authors are also thankful to DST, Govt. of India, for providing the DST-FIST Fund [Sanc No. SR/FST/LS-I/2021/900; date 25.03.2022; duration 2022 to 2027] to the Department of Botany, the University of North Bengal, for upgrading the laboratory with sophisticated instruments.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2025.1659937/full#supplementary-material
Abbreviations
Awn, awn length; BWF, Badshabhog × wild rice progeny; CWF, Chenga × wild rice progeny; DUS, distinctiveness, uniformity, and stability (guideline); FLL, flag leaf length; FLW, flag leaf width; GrPn, grain per panicle; GL, grain length; GB, grain breadth; GrWt, 1,000 grain weight; HD, heading date; PC, pericarp color; PH, plant height; PY, single plant yield; PnL, panicle length; Till, active tiller number; HYV, high-yielding variety; Kala, key activator loci for anthocyanin; LINE1, long interspersed nuclear element-1; MT, maturity time in days; QTL, quantitative trait locus; PCA, principal component analysis; PCV, phenotypic coefficient of variation; GCV, genotypic coefficient of variation; HR-LCMS-QTOF, high-resolution liquid chromatography–mass spectrometry quadrupole time-of-flight.
References
Ahmad, F., Hanafi, M. M., Hakim, M. A., Rafii, M. Y., Arolu, I. W., and Akmar Abdullah, S. N. (2015). Genetic divergence and heritability of 42 coloured upland rice genotypes (Oryza sativa) as revealed by microsatellites marker and agro-morphological traits. PLoS One 10 (9), e0138246. doi:10.1371/journal.pone.0138246
Alam, M., Lou, G., Abbas, W., Osti, R., Ahmad, A., Bista, S., et al. (2024). Improving rice grain quality through ecotype breeding for enhancing food and nutritional security in Asia–Pacific region. Rice 17, 47. doi:10.1186/s12284-024-00725-9
Ata-Ul-Karim, S. T., Begum, H., Lopena, V., Borromeo, T., Virk, P., Hernandez, J. E., et al. (2022). Genotypic variation of yield-related traits in an irrigated rice breeding program for tropical Asia. Crop Environ. 1 (3), 173–181. doi:10.1016/j.crope.2022.08.004
Bedford, J. A., Carine, M., and Chapman, M. A. (2023). Detection of locally adapted genomic regions in wild rice (Oryza rufipogon) using environmental association analysis. G3 Genes, Genomes, Genet. 13 (10), jkad194. doi:10.1093/g3journal/jkad194
Bhuvaneswari, S., Gopala Krishnan, S., Bollinedi, H., Saha, S., Ellur, R. K., Vinod, K. K., et al. (2020). Genetic architecture and anthocyanin profiling of aromatic rice from Manipur reveals divergence of chakhao landraces. Front. Genet. 11, 570731. doi:10.3389/fgene.2020.570731
Bhuvaneswari, S., Ellur, R. K., Krishnan, S. G., Bollinedi, H., Saha, S., Vinod, K. K., et al. (2023). Deciphering the inheritance and QTL-seq aided mapping of the candidate locus governing pericarp pigmentation in Manipur black rice. Indian J. Genet. Plant Breed. 83 (1), 8–14.
Bin Rahman, ANMR, and Zhang, J. (2022). Trends in rice research: 2030 and beyond. Food and Energy Security. doi:10.1002/fes3.390
Bordoloi, D., Sarma, D., Sarma Barua, N., Das, R., and Das, B. K. (2024). Morpho-molecular and nutritional profiling for yield improvement and value addition of indigenous aromatic Joha rice of Assam. Sci. Rep. 14 (1), 3509. doi:10.1038/s41598-023-42874-9
Brar, D. S., and Khush, G. S. (2006). “Cytogenetic manipulation and germplasm enhancement of rice (Oryza sativa L.),” in Genetic resources, chromosome engineering and crop improvement. Editors R. J. Singh, and P. P. Jauhar (Boca Raton: CRC Press), 2, 115–158. doi:10.1201/9780203489260
Brar, D. S., and Khush, G. S. (2018). “Wild relatives of rice: a valuable genetic resource for genomics and breeding research,” in The wild oryza genomes. Compendium of Plant Genomes. Editors T. Mondal, and R. Henry (Cham: Springer), 1–25. doi:10.1007/978-3-319-71997-9_1
Brotman, Y., Llorente-Wiegand, C., Oyong, G., Badoni, S., Misra, G., Anacleto, R., et al. (2021). The genetics underlying metabolic signatures in a brown rice diversity panel and their vital role in human nutrition. Plant J. 106 (2), 507–525. doi:10.1111/tpj.15182
Burton, G. W., and De Vane, D. E. (1953). Estimating heritability in tall fescue (Festuca arundinacea) from replicated clonal material. Agron. J. 45, 478–481. doi:10.2134/agronj1953.00021962004500100005x
Cao, Z., Tang, H., Cai, Y., Zeng, B., Zhao, J., Tang, X., et al. (2022). Natural variation of HTH5 from wild rice, Oryza rufipogon Griff., is involved in conferring high-temperature tolerance at the heading stage. Plant Biotechnol. J. 20 (8), 1591–1605. doi:10.1111/pbi.13835
Carcea, M. (2021). Value of whole grain rice in a healthy human nutrition. Agriculture 11, 720. doi:10.3390/agriculture11080720
Das, M., Dash, U., Shekhar Mahanand, S., Nayak, P. K., and Kesavan, R. K. (2023). Black rice: a comprehensive review on its bioactive compounds, potential health benefits and food applications. Food Chem. Adv. 3, 100462. doi:10.1016/j.focha.2023.100462
Devi, W. J., Vivekananda, Y., Uddin, A., Laishram, J. M., and Chakraborty, S. (2020). Morphological markers associated with pericarp colour and its inheritance pattern in black scented rice of Manipur. Trop. Plant Res. 7, 396–402. doi:10.22271/tpr.2020.v7.i2.046
Eizenga, G. C., Edwards, J. D., Jackson, A. K., and Huggins, T. D. (2024). Substitution mapping of yield-related traits utilizing three cybonnet rice × wild introgression libraries. Crop Sci. 64, 2288–2304. doi:10.1002/csc2.21264
Falconer, D. S. (1964). An introduction to quantitative genetics. Edinburgh: Oliver and Boyd Publishing Co. Pvt. Ltd.
FAO (2024). Food outlook-biannual report on global food Markets- November 2024. Available online at: https://openknowledge.fao.org/handle/20.500.14283/cd3177en.
Faysal, A. S. M., Ali, L., Azam, M. G., Sarker, U., Ercisli, S., Golokhvast, K. S., et al. (2022). Genetic variability, character association, and path coefficient analysis in transplant Aman rice genotypes. Plants 11 (21), 2952. doi:10.3390/plants11212952
Furukawa, T., Maekawa, M., Oki, T., Suda, I., Iida, S., Shimada, H., et al. (2007). The Rc and Rd genes are involved in proanthocyanidin synthesis in rice pericarp. Plant J. 49 (1), 91–102. doi:10.1111/j.1365-313X.2006.02958.x
Gaikwad, K. B., Singh, N., Kaur, P., Rani, S., Babu, H. P., and Singh, K. (2021). Deployment of wild relatives for genetic improvement in rice (Oryza sativa). Plant Breed. 140 (1), 23–52. doi:10.1111/pbr.12875
Ghasemzadeh, A., Karbalaii, M. T., Jaafar, H. Z. E., and Rahmat, A. (2018). Phytochemical constituents, antioxidant activity, and antiproliferative properties of black, red, and brown rice bran. Chem. Cent. J. 12, 17. doi:10.1186/s13065-018-0382-9
Gogoi, S., Singh, S., Swamy, B. M., Das, P., Sarma, D., Sarma, R. N., et al. (2024). Grain iron and zinc content is independent of anthocyanin accumulation in pigmented rice genotypes of Northeast region of India. Sci. Rep. 14 (1), 4128. doi:10.1038/s41598-024-53534-x
Gross, B. L., and Zhao, Z. (2014). Archaeological and genetic insights into the origins of domesticated rice. Proc. Natl. Acad. Sci. U. S. A. 111 (17), 6190–6197. doi:10.1073/pnas.1308942110
Ham, T. H., Kwon, S. W., Ryu, S. N., and Koh, H. J. (2015). Correlation analysis between grain color and cyanidin-3-glucoside content of rice grain in segregate population. Plant Breed. Biotechnol. 3 (2), 160–166. doi:10.9787/PBB.2015.3.2.160
Henry, R. J. (2022). Wild rice research: advancing plant science and food security. Mol. Plant 15 (4), 563–565. doi:10.1016/j.molp.2021.12.006
Hernandez-Soto, A., Echeverría-Beirute, F., Abdelnour-Esquivel, A., Valdez-Melara, M., Boch, J., and Gatica-Arias, A. (2021). Rice breeding in the new era: comparison of useful agronomic traits. Curr. Plant Biol. 27, 100211. doi:10.1016/j.cpb.2021.100211
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 498–520. doi:10.1037/h0070888
Huang, X., Kurata, N., Wei, X., Wang, Z. X., Wang, A., Zhao, Q., et al. (2012b). A map of rice genome variation reveals the origin of cultivated rice. Nature 490 (7421), 497–501. doi:10.1038/nature11532
Huang, J., Zhang, Y., Li, Y., Xing, M., Lei, C., Wang, S., et al. (2024). Haplotype-resolved gapless genome and chromosome segment substitution lines facilitate gene identification in wild rice. Nat. Commun. 15 (1), 4573. doi:10.1038/s41467-024-48845-6
Idrishi, R., Singha, S., and Rangan, L. (2024). Quality characterization, variety discrimination and correlations amongst twelve landraces of black rice of Assam and Manipur, India. J. Food Compos. Analysis 130, 106182. doi:10.1016/j.jfca.2024.106182
Imai, I., Kimball, J. A., Conway, B., Yeater, K. M., McCouch, S. R., and McClung, A. (2013). Validation of yield-enhancing quantitative trait loci from a low-yielding wild ancestor of rice. Mol. Breed. 32, 101–120. doi:10.1007/s11032-013-9855-7
Ito, V. C., and Lacerda, L. G. (2019). Black rice (Oryza sativa L.): a review of its historical aspects, chemical composition, nutritional and functional properties, and applications and processing technologies. Food Chem. 301, 125304. doi:10.1016/j.foodchem.2019.125304
Jayaprakash, G., Bains, A., Chawla, P., Fogarasi, M., and Fogarasi, S. (2022). A narrative review on rice proteins: current scenario and food industrial application. Polymers 14, 3003. doi:10.3390/polym14153003
Jeke, E., Mzengeza, T., Kyung-Ho, K., and Imani, C. (2021). Correlation and path coefficient analysis of yield and component traits of KAFACI doubled haploid Rice (Oryza sativa L) Genotypes in Malawi. Int. J. Agric. Technol. 1 (2), 1–9. doi:10.33425/2770-2928.1005
Johnson, H. W., Robinson, H. F., and Comstock, R. E. (1955). Estimates of genetic and environmental variability in soybeans. Agron. J. 47, 314–318. doi:10.2134/agronj1955.00021962004700070009x
Kumar, S., Chakrabarty, S. K., and Singh, Y. (2021). Variation in phenol colour reaction in grains of rice (Oryza sativa L.) varieties. Indian J. Genet. Plant Breed. 81 (03), 367–375. doi:10.31742/IJGPB.81.3.9
Kushwaha, U. K. S. (2016). Black rice: research, history and development. Berlin: Springer International Publishing.
Laghari, A. A., Ahmad, A., Memon, S., Musavi, S. A. M., Ali, A., Kumar, A., et al. (2025). Genetic diversity in F3 segregating populations of rice (Oryza sativa L.) genotypes under salt stress. Front. Plant Sci. 16, 1568859. doi:10.3389/fpls.2025.1568859
Lap, B., Magudeeswari, P., Tyagi, W., and Rai, M. (2024). Genetic analysis of purple pigmentation in rice seed and vegetative parts—implications on developing high-yielding purple rice (Oryza sativa L.). J. Appl. Genet. 65 (2), 241–254. doi:10.1007/s13353-023-00825-0
Little, R. R., Hilder, G. B., and Dawson, E. H. (1958). Differential effect of dilute alkali on 25 varieties of milled white rice. Cereal Chem. 35 (2), 111–126.
Liyanaarachchi, G. V. V., Mahanama, K. R. R., Somasiri, HPPS, Pan, P., and Kottawa-Arachchi, J. D. (2020). Total and free amino acid contents of popular rice varieties (Oryza sativa L.) consumed in the capital city of Sri Lanka. J. Natn Sci. Found. Sri Lanka 48 (2), 199–211. doi:10.4038/jnsfsr.v48i2.9565
Mackill, D. J., and Khush, G. S. (2018). IR64: a high-quality and high-yielding mega variety. Rice 11, 18. doi:10.1186/s12284-018-0208-3
Maeda, H., Yamaguchi, T., Omoteno, M., Takarada, T., Fujita, K., Murata, K., et al. (2014). Genetic dissection of black grain rice by the development of a near isogenic line. Breed. Sci. 64 (2), 134–141. doi:10.1270/jsbbs.64.134
Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proc. Natl. Acad. Sci. India 2, 49–55. doi:10.1007/s13171-019-00164-5
Malik, P., Huang, M., Neelam, K., Bhatia, D., Kaur, R., Yadav, B., et al. (2022). Genotyping-by-sequencing based investigation of population structure and genome wide association studies for seven agronomically important traits in a set of 346 Oryza rufipogon accessions. Rice 15 (1), 37. doi:10.1186/s12284-022-00582-4
Mapoung, S., Semmarath, W., Arjsri, P., Thippraphan, P., Srisawad, K., Umsumarng, S., et al. (2022). Comparative analysis of bioactive-phytochemical characteristics, antioxidants activities, and anti-inflammatory properties of selected black rice germ and bran (Oryza sativa L.) varieties. Eur. Food Res. Technol. 9, 451–464. doi:10.1007/s00217-022-04129-1
Mbanjo, E. G. N., Kretzschmar, T., Jones, H., Ereful, N., Blanchard, C., Boyd, L. A., et al. (2020). The genetic basis and nutritional benefits of pigmented rice grain. Front. Genet. 11, 229. doi:10.3389/fgene.2020.00229
McCouch, S. R., Sweeney, M., Li, J., Jiang, H., Thomson, M., Septiningsih, E., et al. (2007). Through the genetic bottleneck: O. rufipogon as a source of trait-enhancing alleles for O. sativa. Euphytica 154, 317–339. doi:10.1007/s10681-006-9210-8
Mendoza-Sarmiento, D., Mistades, E. V., and Hill, A. M. (2023). Effect of pigmented rice consumption on cardiometabolic risk factors: a systematic review of randomized controlled trials. Curr. Nutr. Rep. 12 (4), 797–812. doi:10.1007/s13668-023-00496-7
Min, H., Zhang, H., Zhao, C., Chen, G., and Zou, Y. (2019). Amino acid content in rice grains is affected by high temperature during the early grain-filling period. Sci. Rep. 9, 2700. doi:10.1038/s41598-019-38883-2
Mondal, S., Pradhan, P., Das, B., Kumar, D., Paramanik, B., Yonzone, R., et al. (2024). Genetic characterization and diversity analysis of indigenous aromatic rice. Heliyon 10 (10), e31232. doi:10.1016/j.heliyon.2024.e31232
Mudhale, A., Sar, P., Kumar, J., Bhowmick, P. K., Banerjee, A., Chakraborty, K., et al. (2024). Characterization of rice (Oryza sativa L.) landraces from Majuli and surrounding riverine ecologies in Assam: assessment of morphogenetic variability and submergence tolerance. Plant Breed. 143 (4), 469–480. doi:10.1111/pbr.13181
Oikawa, T., Maeda, H., Oguchi, T., Yamaguchi, T., Tanabe, N., Ebana, K., et al. (2015). The birth of a black rice gene and its local spread by introgression. Plant Cell 27 (9), 2401–2414. doi:10.1105/tpc.15.00310
Padmavathi, G., Bangale, U., Rao, K. N., Balakrishnan, D., Arun, M. N., Singh, R. K., et al. (2024). Progress and prospects in harnessing wild relatives for genetic enhancement of salt tolerance in rice. Front. Plant Sci. 14, 1253726. doi:10.3389/fpls.2023.1253726
Pham, C. H., Do, T. D., Nguyen, H. T. L., Hoang, N. T., Tran, T. D., Vu, M. T. T., et al. (2024). Genome-wide association mapping of genes for anthocyanin and flavonoid contents in Vietnamese landraces of black rice. Euphytica 220 (1), 11. doi:10.1007/s10681-023-03268-0
PPV&FR Act (2001). Protection of plant varieties and farmers’ rights act 2001. Available online at: https://plantauthority.gov.in/.
Qiao, W., Qi, L., Cheng, Z., Su, L., Li, J., Sun, Y., et al. (2016). Development and characterization of chromosome segment substitution lines derived from Oryza rufipogon in the genetic background of O. sativa spp. indica cultivar 9311. BMC Genomics 17, 580. doi:10.1186/s12864-016-2987-5
Rao, C. R. (1952). Advanced statistical methods in biometrical research. New York: John Wiley and Sons Inc.
Rao, S., Chinkwo, K., Santhakumar, A., Johnson, S., and Blanchard, C. (2019). Apoptosis induction pathway in human colorectal cancer cell line SW480 exposed to cereal phenolic extracts. Molecules 24 (13), 2465. doi:10.3390/molecules24132465
Rao, D. S., Neeraja, C. N., Madhu Babu, P., Nirmala, B., Suman, K., Subba Rao, L. V., et al. (2020). Zinc biofortified rice varieties: challenges, possibilities, and progress in India. Front. Nutr. 7, 26. doi:10.3389/fnut.2020.00026
Roy, S. C. (2017). Molecular breeding and genetic resources of tulaipanji rice. Germany: Lambert Academic Publishing.
Roy, S. C., and Shil, P. (2020a). Assessment of genetic heritability in rice breeding lines based on morphological traits and caryopsis ultrastructure. Sci. Rep. 10 (1), 7830. doi:10.1038/s41598-020-63976-8
Roy, S. C., and Shil, P. (2020b). Black rice developed through interspecific hybridization (O. sativa × O. rufipogon): origin of black rice gene from Indian wild rice. bioRxiv. doi:10.1101/2020.12.25.423663
Sakulsingharoj, C., Vuttipongchaikij, S., Khammona, K., Narachasima, L., Sukkasem, R., Pongjaroenkit, S., et al. (2024). Overexpression of black rice OsC1 confers tissue-specific anthocyanin accumulation in indica rice cv. Kasalath and its potential use as a visible marker in rice transformation. Plant Gene. 37, 100446. doi:10.1016/j.plgene.2024.100446
Sanchez, P. L., Wing, R. A., and Brar, D. S. (2013). “The wild relative of rice: genomes and genomics,” in Genetics and genomics of rice. Editors Q. Zhang, and R. A. Wing (New York: Springer), 9–25. doi:10.1007/978-1-4614-7903-1_2
Seck, F., Covarrubias-Pazaran, G., Gueye, T., and Bartholomé, J. (2023). Realized genetic gain in rice: achievements from breeding programs. Rice 16 (1), 61. doi:10.1186/s12284-023-00677-6
Sedeek, K., Zuccolo, A., Fornasiero, A., Weber, A. M., Sanikommu, K., Sampathkumar, S., et al. (2023). Multi-omics resources for targeted agronomic improvement of pigmented rice. Nat. Food 4 (5), 366–371. doi:10.1038/s43016-023-00742-9
Senguttuvel, P., Govindaraj, M., D, S. R., Cn, N., V, J., et al. (2023). Rice biofortification: breeding and genomic approaches for genetic enhancement of grain zinc and iron contents. Front. Plant Sci. 14, 1138408. doi:10.3389/fpls.2023.1138408
Sha, X. Y. (2013). “Rice artificial hybridization for genetic analysis,” in Rice protocols. Methods in molecular biology. Editor Y. Yang (New York: Humana Press), 956, 1–12. doi:10.1007/978-1-62703-194-3_1
Shang, L., Li, X., He, H., Yuan, Q., Song, Y., Wei, Z., et al. (2022). A super pan-genomic landscape of rice. Cell Res. 32 (10), 878–896. doi:10.1038/s41422-022-00685-z
Shao, Y., Hu, Z., Yu, Y., Mou, R., Zhu, Z., and Beta, T. (2018). Phenolic acids, anthocyanins, proanthocyanidins, antioxidant activity, minerals and their correlations in non-pigmented, red, and black rice. Food Chem. 239, 733–741. doi:10.1016/j.foodchem.2017.07.009
Shirley, B. W. (1998). Flavonoids in seeds and grains: physiological function, agronomic importance and the genetics of biosynthesis. Seed Sci. Res. 8 (4), 415–422. doi:10.1017/S0960258500004372
Siddiq, E. A., and Vemireddy, L. R. (2021). “Advances in genetics and breeding of rice: an overview,” in Rice improvement: physiological, molecular breeding and genetic perspectives. Editors J. Ali, and S. H. Wani (Switzerland: Springer), 1–29. doi:10.1007/978-3-030-66530-2_1
Singh, N., Wang, D. R., Ali, L., Kim, H., Akther, K. M., Harrington, S. E., et al. (2020). A coordinated suite of wild-introgression lines in Indica and Japonica elite backgrounds. Front. Plant Sci. 11, 564824. doi:10.3389/fpls.2020.564824
Sleper, D. A., and Poehlman, J. M. (2007). “Breeding rice,” in Breeding field crops. Editors D. A. Sleper, and J. M. Poehlman 5th edn. (Iowa: Blackwell Publishing), 239–257.
Solis, C. A., Yong, M. T., Vinarao, R., Jena, K., Holford, P., Shabala, L., et al. (2020). Back to the wild: on a quest for donors toward salinity tolerant rice. Front. Plant Sci. 11, 323. doi:10.3389/fpls.2020.00323
Sood, B. C., and Siddiq, E. A. (1978). A rapid technique for scent determination in rice. Indian J. Genet. Plant Breed. 38, 268–271.
Subudhi, P. K., De Leon, T., Singh, P. K., Parco, A., Cohn, M. A., and Sasaki, T. (2015). A chromosome segment substitution library of weedy rice for genetic dissection of complex agronomic and domestication traits. PLoS One 10 (6), e0130650. doi:10.1371/journal.pone.0130650
Sun, C. Q., Wang, X. K., Li, Z. C., Yoshimura, A., and Iwata, N. (2001). Comparison of the genetic diversity of common wild rice (Oryza rufipogon Griff.) and cultivated rice (O. sativa L.) using RFLP markers. Theor. Appl. Genet. 102, 157–162. doi:10.1007/s001220051631
Sweeney, M. T., Thomson, M. J., Pfeil, B. E., and McCouch, S. (2006). Caught red-handed: Rc encodes a basic helix-loop-helix protein conditioning red pericarp in rice. Plant Cell 18 (2), 283–294. doi:10.1105/tpc.105.038430
Sweeney, M. T., Thomson, M. J., Cho, Y. G., Park, Y. J., Williamson, S. H., Bustamante, C. D., et al. (2007). Global dissemination of a single mutation conferring white pericarp in rice. PLoS Genet. 3 (8), e133. doi:10.1371/journal.pgen.0030133
Tanksley, S. D., and McCouch, S. R. (1997). Seed banks and molecular maps: unlocking genetic potential from the wild. Science 277 (5329), 1063–1066. doi:10.1126/science.277.5329.1063
Tantipaiboonwong, P., Pintha, K., Chaiwangyen, W., Chewonarin, T., Pangjit, K., Chumphukam, O., et al. (2017). Anti-hyperglycaemic and anti-hyperlipidaemic effects of black and red rice in streptozotocin-induced diabetic rats. Sci. Asia 43, 281–288. doi:10.2306/scienceasia1513-1874.2017.43.281
Tester, M., and Langridge, P. (2010). Breeding technologies to increase crop production in a changing world. Science 327 (5967), 818–822. doi:10.1126/science.1183700
Thilavech, T., Suantawee, T., Chusak, C., Suklaew, P. O., and Adisakwattana, S. (2025). Black rice (Oryza sativa L.) and its anthocyanins: mechanisms, food applications, and clinical insights for postprandial glycemic and lipid regulation. Food Prod Process Nutr. 7, 15. doi:10.1186/s43014-024-00288-8
Thomson, M. J., Tai, T. H., McClung, A. M., Lai, X. H., Hinga, M. E., Lobos, K. B., et al. (2003). Mapping quantitative trait loci for yield, yield components and morphological traits in an advanced backcross population between Oryza rufipogon and the Oryza sativa cultivar Jefferson. Theor. Appl. Genet. 107, 479–493. doi:10.1007/s00122-003-1270-8
Tian, F., Li, D. J., Fu, Q., Zhu, Z. F., Fu, Y. C., Wang, X. K., et al. (2006). Construction of introgression lines carrying wild rice (Oryza rufipogon Griff.) segments in cultivated rice (Oryza sativa L.) background and characterization of introgressed segments associated with yield-related traits. Theor. Appl. Genet. 112, 570–580. doi:10.1007/s00122-005-0165-2
Tiozon, R. J. N., Sartagoda, K. J. D., Fernie, A. R., and Sreenivasulu, N. (2023). The nutritional profile and human health benefit of pigmented rice and the impact of post-harvest processes and product development on the nutritional components: a review. Crit. Rev. Food Sci. Nutr. 63 (19), 3867–3894. doi:10.1080/10408398.2021.1995697
Tyagi, A., Chen, X., Shabbir, U., Chelliah, R., and Oh, D. H. (2022). Effect of slightly acidic electrolyzed water on amino acid and phenolic profiling of germinated brown rice sprouts and their antioxidant potential. LWT - Food Sci. Technol. 157, 113119. doi:10.1016/j.lwt.2022.113119
Xia, D., Zhou, H., Wang, Y., Li, P., Fu, P., Wu, B., et al. (2021). How rice organs are colored: the genetic basis of anthocyanin biosynthesis in rice. Crop J. 9 (3), 598–608. doi:10.1016/j.cj.2021.03.013
Xiao, J., Li, J., Grandillo, S., Ahn, S. N., McCouch, S. R., Tanksley, S. D., et al. (1996). Genes from wild rice improve yield. Nature 384 (6606), 223–224. doi:10.1038/384223a0
Xie, L., Wu, D., Fang, Y., Ye, C., Zhu, Q. H., Fan, L., et al. (2024). Population genomic analysis unravels the evolutionary roadmap of pericarp color in rice. Plant Commun. 5 (3), 100778. doi:10.1016/j.xplc.2023.100778
Xu, Y., Chu, C., and Yao, S. (2021). The impact of high-temperature stress on rice: challenges and solutions. Crop J. 9 (5), 963–976. doi:10.1016/j.cj.2021.02.011
Xu, S., Li, X., Podio, N. S., Han, Y., Wang, X.-Y., and Gong, Er S. (2024a). Black rice phenolics alleviate T2DM through comprehensive regulation of oxidative stress, metabolic pathways and gut microbiota. Food Biosci. 62, 105494. doi:10.1016/j.fbio.2024.105494
Xu, N., Yu, Z., Wang, X., Lu, J., Chen, H., Sun, Q., et al. (2024b). Influence of natural and artificial selection on the yield differences among progeny derived from crossing between subspecies in cultivated rice. New Crops 1, 100020. doi:10.1016/j.ncrops.2024.100020
Zhang, B., Ma, L., Wu, B., Xing, Y., and Qiu, X. (2022). Introgression lines: valuable resources for functional genomics research and breeding in rice (Oryza sativa L.). Front. Plant Sci. 13, 863789. doi:10.3389/fpls.2022.863789
Zhang, J., Pan, D., Fan, Z., Yu, H., Jiang, L., Lv, S., et al. (2022). Genetic diversity of wild rice accessions (Oryza rufipogon Griff.) in Guangdong and Hainan provinces, China, and construction of a wild rice core collection. Front. Plant Sci. 13, 999454. doi:10.3389/fpls.2022.999454
Zheng, X., Peng, Y., Qiao, J., Henry, R., and Qian, Q. (2024). Wild rice: unlocking the future of rice breeding. Plant Biotechnol. J. 22 (11), 3218–3226. doi:10.1111/pbi.14443
Keywords: Rice pre-breeding, Oryza rufipogon, black rice RIL development, widening genetic base, anthocyanin pigment
Citation: Roy SC and Shil P (2025) Comprehensive genetic diversity revealed in the pre-breeding RILs (O. sativa × O. rufipogon) with enhanced yield and pigmented grain quality. Front. Genet. 16:1659937. doi: 10.3389/fgene.2025.1659937
Received: 04 July 2025; Accepted: 20 August 2025;
Published: 01 October 2025.
Edited by:
Rajib Roychowdhury, Volcani Center, Israel
Reviewed by:
Arvind H. Hirani, Kemin Industries, Inc., United States
Soumya Prakash Das, Seacom Skills University, India
Zahed Hossain, University of Kalyani, India
Copyright © 2025 Roy and Shil. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Subhas Chandra Roy, subhascr2011@gmail.com