Weedy Rice From South Korea Arose From Two Distinct De-domestication Events

Agro-ecosystems are dominated by crop plants and the weedy species that thrive under agricultural conditions. Weedy crop relatives are some of the most difficult weeds to manage and can dramatically reduce crop yields when left unchecked. Weedy rice has resulted from multiple de-domestication events from crop rice in different rice growing regions. Interestingly, both South Korea and the United States harbor weedy rice populations that share ancestry with indica cultivars and temperate japonica cultivars. Here we compare weedy rice populations from South Korea and the United States in order to determine whether they are the result of the same de-domestication events. We find that weedy rice populations in South Korea are genetically distinct from weedy rice found in the USA and are therefore the result of two unique de-domestication events. Low levels of genetic diversity among Korean weedy rice accessions (haplotype diversity = 0.0188 and 0.0324) indicate recent de-domestication events from crop relatives.


INTRODUCTION
Agricultural weeds account for approximately one third of all crop yield loss (Oerke, 2006), contributing to food shortages worldwide. Understanding the population structure and mechanisms of adaptation in weedy plants informs best management practices in agro-ecosystems. In particular, weedy crop relatives have played a longstanding role in agro-ecosystem dynamics, driving both the evolution of crops and the development of new management strategies (Kwit et al., 2011; Li and Olsen, 2020). A well-documented example is weedy rice, a conspecific weed of cultivated rice (Oryza sativa L.). Weedy rice has a distribution that spans nearly all rice growing regions around the world. Infestations of this species can cause up to an 80% loss in harvest for cultivated rice and are often cited as a major limiting factor for rice production. In the United States alone, estimates of production loss due to weedy rice could feed an additional 12 million people annually (Durand-Morat et al., 2018). Management efforts for weedy rice have ranged from manual removal to large scale herbicide application.
There are five main cultivated rice subtypes that are genetically and phenotypically distinguishable: indica, aus, aromatic, tropical japonica, and temperate japonica. Weedy rice has arisen through de-domestication events from at least 3 of the 5 cultivated subtypes (Qiu et al., 2020).
In the United States, which has no wild Oryza species, there are three main weedy rice subgroups: (1) straw-hulled weedy rice from the southern US (SH weeds) that is genetically similar to indica type cultivated rice, (2) black-hulled weedy rice from the southern US (BH weeds) that is genetically similar to aus type cultivated rice, and (3) California weedy rice (CA weeds) that is genetically similar to temperate japonica cultivated rice. All three of these subgroups seem to have evolved from de-domestication events in each progenitor cultivar group (Kanapeckas et al., 2016; Qiu et al., 2020). In other world regions where wild rice (Oryza rufipogon) is common, weedy rice populations have genetic contributions from both wild rice and cultivated rice populations (Vigueira et al., 2019; Qiu et al., 2020).
Weedy rice from South Korea (Korean weedy rice) is composed of two main subgroups based on population structure analysis: indica-like and temperate japonica-like (Vigueira et al., 2019). Using whole genome sequencing, other researchers have also placed Korean weedy rice into the same two subgroups (He et al., 2017). Like the United States, South Korea is not in the natural range of wild Oryza species. Therefore, all rice crops have been imported into the region for cultivation. Weedy rice populations in these countries have therefore either been introduced with cultivated rice seed or have evolved in place since rice cultivation began (Kanapeckas et al., 2016; Qiu et al., 2020).
Here, we aim to more closely examine the genetic similarities of weedy rice from South Korea and the United States. We have used both candidate genes (Rc, controlling pericarp color; Bh4, controlling hull color; and sh4, controlling seed shattering) and genome-wide neutral genetic markers (Sequence Tagged Sites) to better understand the evolutionary history and population structure of weedy rice from these two regions. We find that Korean weedy rice is genetically distinct from US weedy rice populations despite their phenotypic similarities, indicating that these weeds were the result of unique de-domestication events.

MATERIALS AND METHODS
Sampling and Sequencing
Rice seeds were obtained from the International Rice Germplasm Collection (IRGC) of the International Rice Research Institute (IRRI; Los Baños, Philippines). Twenty-four accessions of weedy rice from South Korea were selected to represent the phenotypic diversity for hull color, pericarp color, and presence of awns (phenotypes are listed in Table 1) from a total sample of 226 accessions (Supplementary Table 1). Eighteen of these samples were previously included in a comparative study with weedy, wild, and cultivated rice from Southeast Asia and the United States (Vigueira et al., 2019). Seeds were germinated and grown to the young seedling stage in the greenhouse. DNA was extracted from young leaf tissue using DNeasy Plant DNA kits (QIAGEN, Hilden, Germany).
Polymerase Chain Reaction (PCR) was carried out using standard conditions to amplify 48 Sequence Tagged Sites (STS loci) as described in Reagon et al. (2010). Due to inconsistent amplification, 7 loci were excluded from the analysis. Regions of the three candidate genes (Rc, Bh4, and sh4) were amplified by PCR using primers and conditions as previously described (Konishi et al., 2006; Sweeney et al., 2006; Zhu et al., 2011). Successful PCR amplification was confirmed using gel electrophoresis, and excess primers and dNTPs were removed using Exonuclease I and Antarctic phosphatase treatment. Direct Sanger sequencing in both the forward and reverse directions was carried out by Eurofins Genomics (Louisville, KY, USA).
Sequences were assembled into contiguous sequences ("contigs") and aligned using CodonCode Aligner. All sequences were inspected visually for quality and for the presence of heterozygous sites. Low quality sequences were removed from the dataset. Heterozygous base calls were randomly assigned to two pseudo-haplotypes, which were then phased using PHASE version 2.1 (Stephens et al., 2001; Stephens and Scheet, 2005). Due to very low levels of heterozygosity in the data set, haplotypes were inferred with very high probabilities and were consistently assigned across five independent runs. All sequences have been submitted to NCBI GenBank (accession numbers MT976168-MT977030).
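The random assignment of heterozygous base calls to two pseudo-haplotypes can be illustrated with a minimal Python sketch. It assumes that heterozygous sites are encoded as two-base IUPAC ambiguity codes in the consensus sequence; the function name and the input format are hypothetical and are not taken from the software used in this study. The output would still need to be phased (e.g., with PHASE) before analysis.

```python
import random

# Two-base IUPAC ambiguity codes and the pair of bases each one represents.
IUPAC_HET = {
    "R": ("A", "G"), "Y": ("C", "T"), "S": ("C", "G"),
    "W": ("A", "T"), "K": ("G", "T"), "M": ("A", "C"),
}

def split_pseudo_haplotypes(seq, rng=None):
    """Split one diploid consensus sequence into two pseudo-haplotypes.

    At each heterozygous site the two bases are assigned to the two
    output sequences in random order; all other sites are copied to both.
    """
    rng = rng or random.Random()
    hap_a, hap_b = [], []
    for base in seq.upper():
        if base in IUPAC_HET:
            first, second = IUPAC_HET[base]
            if rng.random() < 0.5:  # coin flip decides the orientation
                first, second = second, first
            hap_a.append(first)
            hap_b.append(second)
        else:
            hap_a.append(base)
            hap_b.append(base)
    return "".join(hap_a), "".join(hap_b)
```

For a fully homozygous sequence the two pseudo-haplotypes are identical, which is the expected outcome for most loci in a largely selfing species such as rice.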

Analysis of STS Loci
Phased haplotypes were aligned with STS sequences from 27 weedy rice accessions collected in California (Kanapeckas et al., 2016; GenBank accessions KT441140-KT443009) as well as from a diverse sampling of 206 accessions that includes Southern US weedy as well as wild and cultivated rice representing major Oryza varieties and species (GenBank accessions GQ999668-GQ999777). Population structure was inferred using STRUCTURE 2.3.4 (Pritchard et al., 2000). Initial runs included all samples in the dataset. To limit the number of possible subpopulations with proposed ancestry to Korean weedy rice and therefore better resolve differences between closely related groups, we reduced the data set to include the following groups: all Korean weedy rice (24 accessions), weedy rice from California (27 accessions), weedy rice from the southern US (58 accessions of BH and SH weedy rice), and the five cultivar groups (75 accessions representing indica, aus, aromatic, temperate japonica, and tropical japonica). The number of populations (K) was tested with five permutations each between values of K = 1 to K = 10. Each permutation had a burn-in period of 100,000 steps and an MCMC chain length of 500,000 steps after the burn-in. STRUCTURE HARVESTER (Earl and vonHoldt, 2012) was used to calculate Delta K (Evanno et al., 2005) and determine the K value that maximized the marginal likelihood. DISTRUCT version 1.1 (Rosenberg, 2003) was used to produce the graphical display of structure results. As a complement to STRUCTURE analysis, principal component analysis (PCA) was run on haplotype data using the ML model in JMP. PCAs were produced for the full dataset and the subset of data used in the STRUCTURE analysis. The PCA produced from data in our STRUCTURE analysis separated California weedy rice from the rest of the groups. Therefore, a PCA was also run with California weedy rice removed from the dataset.
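For readers unfamiliar with the Delta K statistic, the following Python sketch shows one way it can be computed from the ln P(D|K) values reported by replicate STRUCTURE runs. It is an illustration only: it uses a common simplification in which the second-order difference is taken on the per-K means, and the function name, input layout, and any example values are hypothetical rather than output from this study.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Approximate Evanno's Delta K for each interior value of K.

    lnp_by_k maps K -> list of ln P(D|K) values from replicate runs.
    Delta K(K) = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    The smallest and largest K tested have no Delta K value.
    """
    avg = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # second difference is undefined at the ends
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        result[k] = second_diff / sd
    return result
```

The K with the largest Delta K is usually taken as the best-supported number of clusters, for example with `max(result, key=result.get)`. Because Delta K cannot be evaluated at K = 1 or at the largest K tested, it is often interpreted alongside the raw likelihood values.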
Summary statistics for each STS locus, including nucleotide diversity at silent sites (π) using the Jukes-Cantor correction (Jukes and Cantor, 1969), Watterson's estimator of θ at silent sites (Watterson, 1975), number of segregating sites (S), and haplotype diversity, were calculated in DnaSP version 5.0 (Librado and Rozas, 2009). Averages for these statistics across all STS loci were calculated in Excel.
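The summary statistics named above follow standard definitions, which the Python sketch below makes explicit for a single locus. It is a simplified illustration and not a reimplementation of DnaSP: it assumes gap-free aligned haplotypes of equal length, treats every site as silent, and applies the Jukes-Cantor correction to each pairwise distance before averaging. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log

def haplotype_diversity(haps):
    """Haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haps)
    freqs = [count / n for count in Counter(haps).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))

def segregating_sites(haps):
    """Number of alignment columns with more than one base (S)."""
    return sum(1 for column in zip(*haps) if len(set(column)) > 1)

def watterson_theta(haps):
    """Watterson's theta per site: S / a_n / L, with a_n = sum of 1/i for i < n."""
    n = len(haps)
    a_n = sum(1 / i for i in range(1, n))
    return segregating_sites(haps) / a_n / len(haps[0])

def jc_distance(p):
    """Jukes-Cantor corrected distance for a proportion of differences p < 0.75."""
    return -0.75 * log(1 - 4 * p / 3)

def nucleotide_diversity_jc(haps):
    """Mean Jukes-Cantor corrected pairwise distance per site (pi)."""
    length = len(haps[0])
    dists = []
    for a, b in combinations(haps, 2):
        p = sum(x != y for x, y in zip(a, b)) / length
        dists.append(jc_distance(p))
    return sum(dists) / len(dists)
```

Per-locus values from functions like these could then be averaged across loci, as was done in Excel for the values reported in Table 2.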

Candidate Gene Analysis
Candidate gene sequences were aligned with rice sequences from previous studies (Thurber et al., 2010; Vigueira et al., 2013). Genetic variants were determined by identifying haplotypes and mutations shared between Korean weedy rice and wild, weedy, or cultivated rice varieties.
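As an illustration of how a shared deletion can be recognized in aligned sequences, the Python sketch below reports the lengths of gap runs in a sample relative to a reference from the same alignment. It is a hypothetical helper, not the procedure used in this study, and it assumes that gaps are written as "-" and that both sequences have the same aligned length.

```python
def deletion_lengths(aligned_ref, aligned_sample):
    """Lengths of gap runs in the sample where the reference has bases."""
    if len(aligned_ref) != len(aligned_sample):
        raise ValueError("sequences must come from the same alignment")
    lengths = []
    run = 0
    for ref_base, sample_base in zip(aligned_ref, aligned_sample):
        if ref_base == "-" and sample_base == "-":
            continue  # column is gapped in both sequences; ignore it
        if sample_base == "-":
            run += 1  # reference base missing from the sample
        else:
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths

def carries_deletion(aligned_ref, aligned_sample, size):
    """True if the sample has a deletion of exactly `size` bases."""
    return size in deletion_lengths(aligned_ref, aligned_sample)
```

A call such as `carries_deletion(ref, sample, 22)` would flag a candidate 22 bp deletion, although the position of the gap would still need to be checked against the known causal site before assigning an allele.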

RESULTS
Population Structure
Analyses of neutral STS markers grouped Korean weedy rice into two distinct genetic subpopulations. STRUCTURE analysis comparing Korean weeds with the five major cultivar groups and weedy rice from the United States partitions Korean weeds into an indica-like group and a temperate japonica-like group (Figure 1). STRUCTURE plots are shown for K = 4 based on Delta K results and K = 5 because temperate japonica was distinguishable from tropical japonica at that number of subpopulations. Principal Component Analysis reveals the same genetic groupings as found in STRUCTURE analysis. PCA was performed with and without California weedy rice (Figure 2), as it was the most genetically distinct group. Due to consistent genetic grouping, we analyzed the Korean weedy rice as two separate populations, indica-like and temperate japonica-like weedy rice, for the remainder of the analysis.

Diversity of Korean Weedy Rice
Summary statistics indicated very low genetic diversity within Korean weedy rice sub-groups, consistent with a population bottleneck during de-domestication from cultivated rice (Table 2). Average haplotype diversity and nucleotide diversity are lower in weedy rice groups compared to cultivated rice. Korean weedy rice has even lower values of average haplotype diversity (0.0188 and 0.0324) than SH weedy rice (0.0515), BH weedy rice (0.1456), and California weedy rice (0.0466). In addition, Korean weedy rice had an average pairwise nucleotide diversity (π) of about half the average value found in SH and BH weedy rice from the Southern US (Table 2). This level of genetic diversity is similar to that found in California weedy rice. Low levels of genetic diversity in Korean weedy rice could be a result of a very recent genetic bottleneck associated with a recent de-domestication event.

Candidate Gene Alleles
Korean weedy rice phenotypes and candidate gene allele information can be found in Table 1. Of the 24 Korean weedy rice accessions included in our analysis, 14 had a straw colored hull and 10 had a black colored hull. All straw colored hull accessions with the exception of one (IRGC #112820) contained the 22 bp causal deletion in Bh4 that is found in most cultivated rice. As expected, black hulled Korean weeds carried the ancestral wildtype allele (lacking the 22 bp deletion) which is found in black hull weedy and wild rice. Four Korean weedy rice accessions had a white pericarp color and twenty had red pericarps. Of the four, one contained a previously determined white pericarp allele (14 bp deletion) found in nearly all cultivated rice. Interestingly, this accession is the same accession (IRGC #112820) that does not have the 22 bp deletion at the Bh4 locus despite having a straw hull. The other three had no obvious deletions or loss of function mutations.
FIGURE 1 | STRUCTURE plot of five cultivated rice subgroups and four weedy rice groups. K = 4 was chosen using Delta K, while K = 5 was included as it separated temperate and tropical japonica. Korean weedy rice groups with indica (red) and temperate japonica (yellow) cultivars.
All 24 Korean weedy rice accessions contained the "T" reduced-shattering allele found in cultivated rice at the sh4 locus. This allele is also present in weedy rice from other world regions, which further supports the idea that the shattering phenotype in weedy rice was gained during de-domestication.

DISCUSSION
Korean weedy rice groups most closely with two distinct cultivated rice subtypes: indica and temperate japonica. These weeds likely originated from two distinct de-domestication events from cultivated varieties. These de-domestication events are likely recent, given low genetic divergence from cultivated groups and low genetic diversity within weedy rice groups. We find patterns consistent with cultivated rice de-domestication at both STS loci and candidate genes for weedy traits (Bh4, Rc, and sh4). These patterns have also been found in weedy rice from the United States (Thurber et al., 2010; Vigueira et al., 2013). Weedy rice subtypes from the southern US include straw-hulled (indica-like) and black-hulled (aus-like). In California, there is a distinct weedy rice population that groups most closely with temperate japonica cultivars based on coalescent modeling in Kanapeckas et al. (2016). Our PCA results from the California weedy rice group may better resolve this grouping. One weedy accession from California groups closely with japonica cultivars in our PCA, while the other accessions are genetically distinct from all other rice subtypes. This is an interesting discovery that warrants additional sampling from the California weedy rice population. Weedy rice from Korea does not seem to share recent ancestry with any of the US weedy rice populations, further supporting the previous findings that this group is the result of two distinct de-domestication events from indica and japonica cultivars.
Although this study does not provide definitive evidence for the location of Korean weedy rice de-domestication, recent studies of world-wide samples of weedy and cultivated rice point to other de-domestication events from rice cultivars found in the Korean peninsula (Qiu et al., 2020). Interestingly, kinship analysis results for Korean weeds identified closest cultivar relatives from Korea, Japan, China, India, and Egypt (Qiu et al., 2020). Taken together, it seems that Korean weedy rice may have multiple origins, possibly including de-domestication events from cultivars in situ.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI GenBank (accession numbers MT976168-MT977030).

AUTHOR CONTRIBUTIONS
CV, PV, and KO designed the study. CV, PV, CW, and ZC collected the data. CV and PV analyzed the data. CV wrote the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This research was funded by High Point University.