Genomic characterisation and dissection of the onset of resistance to acetyl CoA carboxylase-inhibiting herbicides in a large collection of Digitaria insularis from Brazil

An in-depth genotypic characterisation of a diverse collection of Digitaria insularis was undertaken to explore the neutral genetic variation across the natural expansion range of this weed species in Brazil. With the exception of Minas Gerais, populations from all other states showed high estimates of expected heterozygosity (HE > 0.60) and genetic diversity. There was a lack of population structure based on geographic origin and a low population differentiation between populations across the landscape as evidenced by average Fst value of 0.02. On combining haloxyfop [acetyl CoA carboxylase (ACCase)-inhibiting herbicide] efficacy data with neutral genetic variation, we found evidence of presence of two scenarios of resistance evolution in this weed species. Whilst populations originating from north-eastern region demonstrated an active role of gene flow, populations from the mid-western region displayed multiple, independent resistance evolution as the major evolutionary mechanism. A target-site mutation (Trp2027Cys) in the ACCase gene, observed in less than 1% of resistant populations, could not explain the reduced sensitivity of 15% of the populations to haloxyfop. The genetic architecture of resistance to ACCase-inhibiting herbicides was dissected using a genome wide association study (GWAS) approach. GWAS revealed association of three SNPs with reduced sensitivity to haloxyfop and clethodim. In silico analysis of these SNPs revealed important non-target site genes belonging to families involved in herbicide detoxification, including UDPGT91C1 and GT2, and genes involved in vacuolar sequestration-based degradation pathway. Exploration of five genomic prediction models revealed that the highest prediction power (≥0.80) was achieved with the models Bayes A and RKHS, incorporating SNPs with additive effects and epistatic interactions, respectively.


Introduction
Digitaria insularis, commonly known as sourgrass, is an outcrossing perennial grass weed native to Central and South America, where glyphosate-tolerant corn and soybean varieties are primarily grown as a double-crop year system.The extensive glyphosate-based weed management in South America has led to widespread evolution of glyphosate resistant populations of D. insularis over the years (Heap, 1999).In Brazil, glyphosate resistant sourgrass populations were first reported from soybean and maize fields in 2008 (de Carvalho et al., 2011).In just over a decade, D. insularis has become one of the most persistent weed species competing with soybean, corn and cotton.In soybean, yield losses to sourgrass infestation can reach up to 80% and it is a major concern to the agricultural sector (Lopez Ovejero et al., 2017;Gazziero et al., 2019).D. insularis propagates by seed and asexually by rhizomes, which makes it even more challenging to control in the field as compared to many other weeds.
Acetyl coenzyme A carboxylase (ACCase) inhibiting herbicides have been used intensively for post-emergence control of D. insularis in Brazil, which substantially increased selection pressure for this class of herbicides.A total of 49 weeds have already evolved resistance to ACCase inhibitors, conferred predominantly by target-site resistance (TSR) mechanism.The TSR mechanism involves point mutation(s) in genes encoding the protein targets of herbicides affecting the binding of the herbicide at or near catalytic domains.Several mutations at ACCase codons Ile1781, Trp1999, Trp2027, Ile2041, Asp2078, Cys2088 and Gly2096 in the carboxyltransferase (CT) domain of the enzyme have been associated with resistance to ACCaseinhibiting herbicides (Powles and Yu, 2010;Beckie and Tardif, 2012;Kaundun, 2014;Takano et al., 2020).The non-target site resistance (NTSR) mechanisms are more complex and can include one or more physiological processes resulting in reduced absorption and translocation of the herbicides and/or their enhanced metabolism or sequestration.The metabolism-based NTSR mechanism, involving the increased activity of enzymes such as cytochrome P450s, glutathione S-transferases and/or Uridine 5′diphospho (UDP)-glucosyl transferases, has been reported for ACCase inhibitors in a few grass weed species (Powles and Yu, 2010;Yu et al., 2013;Han et al., 2016;Iwakami et al., 2019).
Hitherto, limited knowledge exists on the genetic diversity and population genetic structure in D. insularis as is the case with most other weed species (Gonçalves Netto et al., 2021).The information on the genetic diversity and population structure in D. insularis is important to increase our understanding of the genetic structure and gene flow across natural expansion area of this weed species.Recent advances in next-generation sequencing (NGS) technologies and bioinformatic and statistical tools have opened new vistas to characterize plant genomes at a much greater depth than before.Genotyping by sequencing (GBS), particularly, has been used extensively in crops to assess genetic diversity, investigate population genetic structure and gene discovery for a plethora of traits using genome wide association study (GWAS) approach (Kim et al., 2016;Sehgal et al., 2017;2020;Shi et al., 2020;Abou-Khater et al., 2022).It is a rapid, high-throughput, and cost-effective method for the simultaneous discovery of single nucleotide polymorphisms (SNPs) and genotyping of target germplasm (Poland and Rife, 2012).The GBS technology has made it possible to generate thousands to millions of SNPs with a high resolution with and without the availability of a reference genome in weed species (Küpper et al., 2018).
The present study aims to i) characterize the genetic diversity in a large Brazilian collection of D. insularis using the GBS approach ii) determine the population genetic structure and investigate pattern of gene flow between populations iii) identify marker trait associations (MTA) for resistance to reduced sensitivity to ACCase inhibitors such as haloxyfop and clethodim using the GWAS approach and iv) investigate genomic prediction (GP) models to explore the potential of this approach for predicting NTSR.The present study is the first report of a detailed genotypic characterization of the largest collection of D. insularis and a preliminary investigation to explore the potential of GWAS approach and GP models dissecting NTSR in its natural diverse populations.

Plant material
A total of 205 D. insularis weed populations were used in the study.The populations were collected from the most relevant soybean-(and corn) growing regions across Brazil inhabiting crop fields and crop margins in the southern, mid-western, south-eastern and north-eastern regions.Briefly, these 205 populations originated from seven states including Mato Grosso (77), Mato Grosso do Sul (10), Minas Gerais (19), Bahia (24), Goiás (20), Paraná (PR) and Rio Grande do Sul (18).The detail information on the geographic origin and geographic coordinates are provided in the (Supplementary Table S1).Some collection sites (1/5th of the collection) had a history of multiple herbicide treatments on the crop including glyphosate, ACCase inhibitors such as haloxyfop or clethodim alone or in mixtures and/or ALS inhibitors (Supplementary Table S1).The populations were collected in 2020 and 2021 and were investigated for their sensitivity to ACCase inhibitors haloxyfop and clethodim at multiple dose rates by Weed Research team at Brazil Resistance Management Laboratory in Uberlândia, Brazil, as part of resistance monitoring studies.

Herbicide resistance screening of Digitaria insularis collection
The rates used in the population screening were initially determined in a pilot study using ten D. insularis sensitive populations sampled in Brazil urban areas with no history of herbicide application.Informative herbicide rates were determined as 7.8; 15.0, 27.0 and 64.0 g a. i ha -1 for haloxyfop and 27.0, 54.0 and 108.0 g a. i ha -1 for clethodim.
Seeds from all D. insularis populations were germinated and planted in 1 L pots filled with a commercial substrate to produce around 15 plants per pot.Each combination of population and herbicide dose was replicated 3 times (45 plants tested per treatment) in a completely randomized design.The herbicide treatments were sprayed on plants at the 4 leaf-stage, in a spray chamber equipped with flat fan nozzles calibrated to deliver 200 L ha-1 at 200 kPa pressure.Plant control was evaluated 21 days after treatment, using a visual scale of 0%-100%; 0% represents healthy plants and the absence of symptoms, and 100% represents the death of the plant.
DNA extraction, genotyping-by-sequencing (GBS) library preparation and sequencing Four pinches of seeds were sown in a punnet of size 18 cm × 6 cm filled with peat keeping 3 cm between pinches.For each population, two such punnets were sown.The punnets were watered manually and put on trolleys in a glass house with controlled conditions (day temperature, 24 °C; night temperature, 18 °C; light, 16 h; humidity −65%).The punnets were watered daily for 3 weeks.After 3 weeks, a 1 cm × 2 cm of leaf sample was cut from 25 individual plants for each population and pooled into a 14 mL falcon tube.The falcon tubes were stored in a −80 °C freezer until further manipulation.
The leaf samples were dried in a freeze dryer for three consecutive days and nights by keeping the shelves at a contact temperature of 1.0 °C and freezer at −60 °C.After freeze drying, the samples were shipped to LGC Genomics GmbH, Germany for DNA extraction, reduced representation library preparation and sequencing.
Genomic DNA extraction was performed from the pooled samples using the sbeadex ™ maxi plant kit (LGC) on KingFisher Flex (after lysis step) followed by a spectrophotometric quantification step using Nano Drop 8,000 (Thermo Fisher Scientific).Reduced representation library preparation was done by the standardized ddRAD protocol at LGC Genomics.Briefly, 100 ng of genomic DNA were digested with 2 units each of Apek I and Pst I enzymes (NEB) in 1 times NEB buffer 3.1 in 20 μL volume for 1 hour at 37 °C.The restriction enzymes were heat inactivated by incubation at 75 °C for 60 min.The detailed protocol for the ligation reaction, library purification, amplification and normalization were performed according to the standardized ddRAD protocol at LGC Genomics, GmbH.The library was size selected on a LMP-Agarose gel, removing fragments smaller than 300 bp and those larger than 500 bp.Sequencing was done on an Illumina NovaSeq 6,000 (150bp paired-end read).

Genotypes and SNP filtering
Demultiplexing of all libraries for each sequencing lane was done using the Illumina bcl2fastq v2.20 software.Demultiplexing of library groups into samples was done according to their inline barcodes and verification of restriction site.No mismatches or Ns were allowed in the inline barcodes, but Ns were allowed in the restriction site.Reads with final length <20 bases were rejected and reads with 5′ ends not matching the restriction enzyme site were also discarded.The reads were quality trimmed at 3′-end to get a minimum average Phred quality score of 20 over a window of ten bases.The mapping of quality trimmed reads on the D. insularis reference genome v01.0 (available at Weedpedia, https://weedpedia.weedgenomics.org/)was done using BWA-MEM v0.7.12.One combined alignment file of all samples in the BAM format was used for variant discovery and genotyping of samples with Freebayes v1.2.0.Filtering of variants was done using the following GBSspecific rule set; 1.The read count for a locus must exceed 8 reads 2. Genotypes must have been observed in at least 66% of samples 3. Minimum allele frequency across all samples must exceed 5%.

Genetic diversity and population differentiation
The genetic diversity indices expected heterozygosity (H E ) and inbreeding coefficients (F IS ) were calculated using the R packages "adegenet" and "hierfstat" (Goudet, 2005).The polymorphic information content (PIC) was calculated using an in-house R package.The two-and three-dimensional principal component analysis (PCA) was conducted using the R packages "stats" and "rgl".A Bayesian clustering approach implemented in the program STRUCTURE version 2.3.4 (Pritchard et al., 2000) was used to analyse population genetic structure by setting replication number to 10,000 for the burn-in and Markov Chain Monte Carlo (MCMC) iterations each and using options of admixture model and correlated allele frequencies.The number of subpopulations, i.e., K was set from 1 to 7 and three independent runs were performed for each K.The Structure Harvester (https://taylor0.biology.ucla.edu/structureHarvester/) was used to analyze the results from the STRUCTURE software, which constructs a deltaK vs K plot using the method of Evanno et al. (2005).The weighted neighbour joining (NJ) was constructed in DARwin 6.0.(Perrier and Jacquemoud-Collet, 2006).

Genome wide association study (GWAS) and in silico analysis of significant SNPs
The normality of the resistance data scores was checked in PAST3 program (Hammer et al., 2001).The resistance scores data of haloxyfop at dose rate of 7.8 g a. i ha −1 and clethodim at dose rate of 27.0 g a. i ha −1 showed normal and near-normal distribution, respectively and hence were used in GWAS.Both general linear model (GLM) and mixed linear model (MLM) were used for GWAS in TASSEL software ver 5.0 (Bradbury et al., 2007).The kinship matrix was calculated using VanRaden algorithm (VanRaden, 2008) in the GAPIT package 2.0 (Lipka et al., 2012).In the GLM, PCA was used as a fixed variate and in the MLM, PCA and kinship matrices were used as fixed and random variates, respectively.The threshold to declare significant marker-trait associations (MTA) was ≥10 −3 (log10p) after applying a correction for a false discovery rate (FDR) at p < 0.05.
The VCF file was annotated with SnpEff version 5.1 using the IWGC D. insularis reference genome and annotation v01.0.In addition, the in silico analysis of the significant SNPs was conducted using nucleotide Basic Local Alignment Search Tool (BLAST) in the EnsemblPlants database (https://plants.ensembl.org/index.html).The EnsemblPlants database has cDNA/transcript sequences of more than 80 monocots and dicots and of model plants, which were used to find the homologies.The genes found in the overlapping region and within 1.0 Mb upstream and downstream of the matched regions were selected as candidate genes.To determine their molecular functions, the protein sequences of the candidate genes were downloaded from EnsemblPlants database and used in protein BLAST analysis in NCBI server (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and their molecular functions were determined after ascertaining their homologies with known proteins in grasses.

Genomic prediction models
Four parametric models (Ridge regression, Bayes A, Bayes B, Bayes C) and a non-parametric model (RKHS) were used and all these models are implemented in the "BGLR" package (De Los Campos et al., 2022) in R. Ridge regression (RR) method considers common variance for all markers and shrinks the marker effects toward zero (Meuwissen et al., 2001).All Bayesian models do not consider the common variance of markers and incorporate additive genetic effects.The RKHS model uses a kernel function and captures non-additive effects (epistatic interactions).Either all available SNP markers were used or different sets of SNP markers were employed according to preliminary GWAS results.The SNP markers were ranked according to increasing p-values in GWAS analyses.Prediction accuracy of all models was calculated through "Pearson correlation coefficient" between observed and predicted values based on 100 iterations and 10-fold cross-validations.

GBS marker distribution
A total of 105,699 single nucleotide polymorphisms (SNPs) were generated across 205 populations using the criterion of a minimum coverage to call a SNP.Of these, 5,238 SNPs were retained for genetic analysis after filtering for 5% minor allele frequency (MAF) and missing data 30% and removing markers that could not be mapped on the currently available reference genome of D. insularis on Weedpedia (https://weedpedia.weedgenomics.org/).The polymorphic information content (PIC) of the filtered SNPs varied from 0.10 to 0.38.Overall, SNPs showed good distribution with chromosomes S02 and S06 showing the highest (842) and the least (344) number of SNPs, respectively (Figure 1).On an average, 624 Mb of the physical genome was encompassed by these SNPs with the maximum genome coverage achieved on chromosome S01 (93.7 Mb) and minimum (51.9 Mb) on chromosome S09 (Supplementary Table S2).

Within populations genetic diversity and population differentiation
The expected heterozygosity (H E ) and polymorphic information content (PIC) values of the populations ranged from 0.56 to 0.64 and 0.20 to 0.37, with an average of 0.62 and 0.37, respectively (Table 1).Both parameters, H E and PIC, showed that Minas Gerais (MG) population was moderately diverse while most other populations showed high level of genetic diversity (H E > 0.60).The inbreeding coefficient (F IS ) for all the seven populations was negative (Table 1) suggesting highly outcrossing nature of the populations under local environmental conditions.
The average Fst across all samples s was 0.020, indicating a low genetic differentiation among populations.The pairwise Fst values ranged from 0.003 (between populations MS and PR, GO and PR, MT and RS and PR and RS) to 0.064 (between MG and BA) (Table 2).As expected, a high gene flow (Nm) was detected among populations, varying from 3.65 to 83.08, with an overall average of 12.25.Interestingly, the lowest Nm estimates were obtained between MG and five populations (MT, MS, PR, RS and BA) (Table 2).

Principal component analysis (PCA) and STRUCTURE analysis
The PCA plot with genome wide 5,238 SNPs revealed two important features; i) all populations were scattered in the PCA plot and ii) populations did not show clusters based on their geographical origin (Figure 2).The population structure explored by Bayesian STRUCTURE analysis revealed two subpopulations at the best K (K = 2) (Supplementary Figures S1A, B).Both subpopulations comprised of individuals from all seven populations (Supplementary Figure S1B).Only populations from MG had a defined cluster (cluster II) based on a cluster membership threshold of 0.80 (Supplementary Figure S1B), whereas all remaining populations were distributed between the two clusters.
The weighted neighbour joining (NJ) tree confirmed that two groups comprised individuals from seven populations clustered randomly irrespective of their geographic origin (Supplementary Figure S2).
With the Fst outlier analysis, we detected 63 SNPs that showed Fst values higher than the average Fst (Figure 3A).We used these 63 SNPs to redraw PCA to investigate if any geographical differentiation could be observed (Figure 3B).Interestingly, using a subset of 63 SNPs (Fst >0.03), MT population could be moderately differentiated from rest of the populations while the remaining six populations remained mixed (Figure 3B).To understand the functional significance of this subtle geographical differentiation, we conducted an in silico analysis of the sequences of the top fifteen outlier hits (Fst >0.04) to identify candidate genes (Supplementary Table S3).It was revealed that orthologs of the genes involved in growth and development and/or stress pathways showed hits with these sequences.

Population
Number 27.0 and 64.0 g a. i ha −1 , respectively.For clethodim, 16 (8.2%),8 (4.1%) and 4 (2.0%) populations were found to be controlled at less than 90% at doses 27.0, 54.0 and 108.0 g a. i ha -1 , respectively.There were no significant differences in the sensitivity scores for the two lower doses of haloxyfop, i.e., 7.8 and 15.0 g a. i ha −1 , hence sensitivity scores from the dose 7.8 g a. i ha −1 were used in all further analysis.The state wise distribution showed that more than 50% of populations with reduced sensitivity to the haloxyfop dose rate 7.8 g a. i ha -1 were from MG state followed by MT (31%) (Supplementary Figure S3A).For higher doses of haloxyfop and for all the three doses of clethodim, populations with reduced sensitivity were from PR and MS states (Supplementary Figure S3B).The efficacy scores at 7.8 g a. i ha −1 dose of haloxyfop and 27.0 g a. i ha −1 of clethodim showed a normal and near-normal distribution, respectively (Supplementary Figures S3C, D).Sequencing of the ACCase gene in all samples revealed that only two populations (BR-20-Din-144 and BR-21-Din-046) showed the presence of Trp2027Cys mutation.Whilst both populations showed reduced sensitivity to all the four doses of haloxyfop (7.8, 15.0, 27.0 and 64.0 g a. i ha −1 ), only one exhibited reduced sensitivity to the two doses (27.0 and 54.0 g a. i ha −1 ) of clethodim.The four populations (BR-21-Din-417, BR-21-Din-419, BR-21-Din-421 and BR-21-Din-423), that showed reduced sensitivity to the higher doses of haloxyfop (27.0 and 64.0 g a. i ha −1 ) and to all the three doses of clethodim (27.0, 54.0 and 108.0 g a. i ha −1 ), did not show the Trp2027Cys mutation.None of the other known mutations in this gene were identified in the resistant populations.
To understand whether the high gene flow observed in the species is shaping the genetic structure of resistance, we investigated the population genetic structure of a subset of 65 populations, contrasting for resistance or reduced sensitivity to 7.8 g a. i ha −1 of haloxyfop regardless of geographic origin.Only 7.8 g a. i ha −1 dose rate was selected for these analyses as this dose rate showed a good size of populations with reduced sensitivity (29) which could be compared with sensitive populations (36).The other dose rates produced too few populations showing reduced sensitivity (<8) and hence were not analysed further.The PCA plot showed that resistant and sensitive populations were interspersed, and no clear groups were observed (Figure 4A), which led us to suggest that the resistance has evolved independently across the landscape.We  then drew weighted NJ trees of populations from MG and MT regions separately (Figures 4B, C).For MG, the tree showed distinction between resistant and sensitive populations but for MT, resistant and sensitive populations were interspersed.

Genome wide association study (GWAS) and genomic prediction models
Since efficacy scores at 7.8 g a. i ha −1 dose of haloxyfop and 27.0 g a. i ha −1 of clethodim showed a normal and near-normal distribution, respectively (Supplementary Figures S3C, D), characteristics of a quantitative trait, a pilot GWAS study was conducted to identify candidate genes associated with reduced efficacy to haloxyfop and clethodim.Three and two SNPs were found associated with reduced efficacy with haloxyfop in GLM and MLM models, respectively (Figure 5).Two SNPs, S02_45439268 and S03_5213977, were common in both models (Table 3).S02_ 45439268 on chromosome 2 explained the highest percentage variation (R 2 ) of 26.6% (p-value = 3.13E-06), while S03_ 5213977 on chromosome 3 explained 14.6% R 2 (p-value = 9.71E-04).In silico analysis (BLAST analysis using reference genomes/ cDNA sequences of D. exilis or model species) of these SNPs revealed that the SNP S02_45439268 was located 0.1 Mb upstream of a D. exilis gene Dexi2A01G0009650.The protein BLAST analysis of the protein sequence of this gene showed homologies with UDP-glycosyltransferase 91C1-like protein (UDPGT91C1) of Panicum virgatum, Sorghum bicolor, Oryza glaberrima and Setaria italica.The SNP S03_5213977 showed homologies with the galactinol--sucrose galactosyltransferase (GT) 2 in Hordeum vulgare, Aegilops tauschii, Triticum urartu and Lolium arundinaceum.
For efficacy shift to clethodim, four and one SNP were identified with GLM and MLM, respectively (Figure 6).The four SNPs with the GLM model explained 11.3%-16.6% of R 2 , while the SNP identified with the MLM explained 24.0% of R 2 (Table 3).Interestingly, two SNPs were located within 0.1 Mb of potential NTSR genes.The SNP S05_47637569 was located within 0.1 Mb of D. exilis gene Dexi1A01G0019350.The BLASTP analysis of protein sequence of this gene showed homologies with vacuolar protein sorting (VPS)-associated protein 20 of P. virgatum, S. italica, Zea mays and Lolium rigidum.The SNP S08_15270772 is located within 0.1 Mb of D. exilis gene Dexi9A01G0037640, the protein sequence of which showed homologies with CMP-sialic acid transporter 2 of S. italica, P. virgatum and S. bicolor.
We further investigated the potential of genomic prediction (GP) models to predict NTSR resistance in the panel using resistance data scores at haloxyfop dose of 7.8 g a. i ha -1 .We tested five models, Ridge Regression, Bayes A, Bayes B, Bayes C and RKHS, on different subsets of SNPs selected either based on GWAS hits or complete set of 5,238 SNPs (Figure 7).Although predictions based on models having 50 SNPs from GWAS showed more than 50% prediction accuracy, the highest accuracy was achieved with complete set of 5,238 SNPs with better prediction accuracy from Bayes A or RKHS models.

Discussion
The present study highlights the outcomes of the assessment of neutral genetic variation in D. insularis in its natural expansion range in Brazil, where it has recently become one of the most problematic agricultural weeds.Both genetic diversity parameters (PIC and H E ) indicated an overall high genetic diversity in the populations.The H E estimates in populations (0.56-0.64) are higher than the previously reported values for this species by GBS markers (0.15-0.45;Netto et al., 2021).This discrepancy is partly due to differences in the sample size, i.e., a smaller set analysed in the previous study and due to the fact that only glyphosate resistant and susceptible populations were investigated earlier (Gonçalves Netto et al., 2021).A further comparison with other outcrossing weed species revealed higher genetic diversity in D. insularis as compared to Alopecurus myosuroides (average H E = 0.24) and Ipomoea purpurea (average H E = 0.24) and lower genetic diversity than Lolium multiflorum (average H E = 0.81) (Délye et al., 2010;Kuester et al., 2015;Karn and Jasieniuk, 2017;Dixon et al., 2021).
The PCA, STRUCTURE and NJ tree analyses based on neutral GBS markers unveiled the absence of geographic differentiation among populations and concurrently reiterated the fact that geographical distance does not govern the genetic structure of D. insularis across the landscape (Figure 2).This lack of spatial patterning has been revealed in several weed species that are outcrossing and possess a high seed dispersal ability such as black grass and common morning glory (Kuester et al., 2015;Dixon et al., 2021).The same is true for invasive weed species that have shown recent expansion such as common ragweed (Martin et al., 2016) and M. micrantha (Ruan et al., 2021).Several factors are associated with the lack of structure or spatial patterning observed in weeds, including their outcrossing nature leading to higher gene flow between populations as compared to inbreeding species, human-mediated modes of dispersal (e.g., dispersal through farm machinery) and their recent expansion across the landscape (Kuester et al., 2015).The population genetic differentiation coefficient (Fst) confirmed a very low genetic differentiation between D. insularis populations (0.003 ≤ F ST ≤ 0.064), suggesting that gene flow between populations was common.The interpopulation gene flow estimates (Nm = 3.65-83.08)were approximately two-fold higher than the values obtained in invasive weed Mikania micrantha (Ruan et al., 2021).Such low level of population differentiation and high level of gene flow among populations are indicative of recent expansion of the species as an agricultural weed in Brazil.
We also hypothesized that the prevailing agricultural practices in Brazil have played a significant role in shaping the genetic structure of D. insularis populations investigated here.As a common practice, many Brazilian soybean growers own many large-scale farms, serving urban and rural markets, across the country and transport of combines is common between the farms during harvesting (Lopez Ovejero et al., 2017).Through hair like appendages, D. insularis seeds adhere to combines, leading to an increased genetic exchange between different populations thereby weakening population divergence.This also explains the high gene flow detected between D. insularis populations collected from southern and mid-western states (Table 2).
The Fst outlier analysis detected 63 SNPs that showed Fst values above genome-wide average obtained with 5,238 SNPs.Using these SNPs in PCA evidenced a subtle genetic structure; populations from MT could be moderately differentiated from the remaining populations (Figure 3B).The in silico analysis of the top outlier loci revealed hits with adaptive genes such as DUF579, DUF2404 and GRAS family transcription factor involved in growth and development and/or abiotic stress responses (Lee et al., 2012;Wang et al., 2022).In invasive weed species, adaptive signatures throughout the genomes have been identified using GBS markers and SNPs active in geographic differentiation detected more often pathways related to growth and defence responses (Martin et al., 2016;van Boheemen et al., 2017;Ruan et al., 2021).The identification of similar functional genes in the present study therefore suggests that the populations have been adapting to changes in the environments while going through population expansion.There are evidences that indicate that the populations from southern, south-eastern, and north-eastern states (PR, RS, MS, MG, BA) have been exposed to cyclical droughts more often than populations from mid-western states such as MT (Nobre et al., 2016;Cunha et al., 2018;Marengo et al., 2020).
Two different models of resistance evolution have been proposed in weed species, i.e., gene flow and independent evolution (Kuester et al., 2015;Délye et al., 2010).Currently, only a limited number of studies have addressed the issue by simultaneous assessment of neutral genetic variation with the level of resistance in weed species across the spatial scale (Küpper et al., 2018;Dixon et al., 2021).Such studies are required in more weed species, both inbreeding and outcrossing, to provide insights into the evolutionary units of herbicide resistance.By combining neutral genetic variation with the level of resistance, if a pattern of isolation-by-distance (IBD) is displayed, we can infer that resistance has spread by gene flow in a weed species.In the second scenario, if a mosaic resistance pattern is exhibited and no evidence of IBD across populations, it would suggest independent evolution of resistance on a local scale or at larger distances.Such studies will prove of high relevance to applied weed scientists willing to maintain low levels of resistance as these will help to make informed decisions about weed management strategies, which would differ in the two scenarios.
We investigated, using a subset of populations showing susceptibility and reduced sensitivity to haloxyfop, whether the potential for spread of resistance occurs through gene flow or through independent evolution in this weed species.Although we did see a moderate differentiation between the resistant and sensitive populations, a mosaic pattern of resistance, i.e., resistant populations placed closely to sensitive populations, was generally observed across the landscape in the PCA (Figure 4A).The weighted NJ trees of the local resistant and sensitive populations within MT and MG states (Figures 4B, C) provided explicit evidence that resistance is evolving by both means (gene flow and independent evolution) in D. insularis.Within MG, two clusters of resistant populations were evident which were separated from the sensitive populations (Figure 4B), whereas within MT a mosaic pattern was evident (Figure 4C).These results suggests that both evolutionary mechanisms, gene flow and multiple, independent resistance evolution, are contributing to the evolution of resistance in D. insularis.This contrasts with the results obtained in black grass (Délye et al., 2010;Dixon et al., 2021) and common morning glory (Kuester et al., 2015) that reported only independent evolution of resistance across the landscape.
The sequencing of the ACCase gene to screen the seven known target-site mutations (Powles and Yu, 2010;Kaundun et al., 2021) showed the presence of Trp2027Cys mutation in two populations only.It is noteworthy that these two populations showed clear survivorship at field rate (64.0 g a. i ha −1 ) of haloxyfop, consistent with the impact of the W2027C mutation on FOP herbicides (Kaundun, 2014).A large majority of populations showing reduced sensitivity to lower doses of haloxyfop (7.8 or 15.0 g a. i ha −1 ) or clethodim (27.0 g a. i ha −1 ) did not show this or other known target-site mutations.In the light of the present results, although it seems safe to assume that NTS mechanisms are likely the cause of reduced sensitivity of 15% of populations, further studies including sequencing of the ACCase gene on hundreds of individuals of these populations will be required to rule out the possibility of coexistence of TSR (at a very low frequency) and NTSR.It is important to note that previous studies in weeds reporting TSR as the leading mechanism of resistance to ACCase inhibitors largely involved smaller number of populations and only field dose rates, which precluded the plausibility of detection of NTSR genes.GWAS analysis for haloxyfop resistance identified two important SNPs that together explained 41.0% of variation.One of these SNPs (S03_5213977) showed hits with galactinol--sucrose galactosyltransferase (GT) 2 of multiple grass species including H. vulgare, A. tauschii, T. urartu and L. arundinaceum.GTs are a wellresearched group of NTSR enzymes known for herbicide detoxification in several weed species such as L. rigidum (Gaines et al., 2014;Duhoux et al., 2015), A. myosuroides (Gardin et al., 2015) and Eleusine indica (Chen et al., 2015).The second SNP (S02_ 45439268) for reduced sensitivity to haloxyfop was located within 0.1 Mb upstream of a gene Dexi2A01G0009650 in D. exilis (reference genome of a cultivated species of Digitaria) encoding a putative NTSR enzyme UDP-glycosyltransferase 91C1-like protein (UDPGT91C1), the role of which in herbicide detoxification mechanisms has been shown only recently in Arabidopsis Manhattan plots showing marker-trait associations identified for resistance to Clethodim.
Frontiers in Genetics frontiersin.org10 thaliana through its glucosylating activity (Huang et al., 2021).Using both mutant and overexpression lines, it was demonstrated that UDPGT91C1 can glycosylate the herbicide sulcotrione, a triketone herbicide.UDPGTs (also known as UGTs), in addition to CYP450s and GSTs, have been implicated in herbicide detoxification by metabolism, and upregulated expression of some UDPGT genes has been observed with degradation of herbicides (Matzrafi et al., 2017;Vega et al., 2020;Huang et al., 2021;Chandra and Leon, 2022).
For reduced sensitivity to clethodim, two important candidate gene hits with multiple grass species were obtained in close vicinities of two significant SNPs (S05_47637569 and SNP S08_15270772).Of these, vacuolar protein sorting (VPS)associated protein 20 (VPS20) is part of Endosomal Sorting Complex Required for Transport (ESCRT)-III, responsible for vacuolar-based degradation pathway (Agaoua et al., 2021).The herbicides that are inactivated by GST-catalyzed glutathione conjugation are transported into vacuoles for further metabolism (YU et al., 2013;Cai et al., 2014).A VPS60-type candidate target gene has been selected for validation in relation to NTS mechanism of glyphosate resistance mechanism in Lolium multiflorum (Cechin et al., 2020).
Finally, we tested five genomic models; Ridge Regression, Bayes A, Bayes B, Bayes C and RKHS, to explore the ability of these models for predicting NTSR in D. insularis.While Ridge Regression model assumes that all markers have same effects, Bayesian models (Bayes A, Bayes B and Bayes C) incorporate additive genetic effects and RKHS captures complex epistatic interactions.As anticipated, we got reasonably high prediction with the reduced set of SNPs selected by GWAS (the case of 50 GWAS SNPs), comparable to the whole set of 5,238 SNPs (Figure 7).All models invariably showed ≥0.60 prediction accuracy with the 50 SNPs selected by GWAS as compared to 10 SNPs.However, the highest prediction power (≥0.80) was achieved with the complete set of 5,238 SNPs with the models Bayes A and RKHS, incorporating SNPs with additive effects and epistatic interactions, respectively.These results contrast with the ones obtained by Dixon et al. (2021) in black grass, who obtained the highest prediction with the top few hundreds GWAS loci for resistance to fenoxaprop compared to using all markers.

Conclusion
Digitaria insularis populations showed high genetic diversity and a population structure typical of a weed species that has shown recent expansion across the landscape.In most weed species, where resistance evolution to ACCase inhibitors has been investigated, it is indicated that resistance has evolved independently in weed populations.Digitaria insularis, on the other hand, is a fascinating example of herbicide resistance evolution, in which resistance has evolved by multiple mechanisms, i.e., gene flow and independent evolution.Since resistance to ACCase inhibitor haloxyfop is observed in Brazil, the identification of candidate genes conferring shifts in the sensitivity scores or reduced sensitivity to haloxyfop opens new opportunities to further investigate resistance mechanisms in this species and come up with effective weed management strategies.The present study also opens new avenues of functional validation of the candidate genes identified here to determine their role in NTSR-linked processes.Prediction accuracy of five genomic prediction models tested using a subset of GWAS.

FIGURE 1
FIGURE 1Genome wide GBS marker distribution of 5,238 SNPs; number of SNPs per chromosome (A) and SNP density per chromosome (B) (physical positions based on the current Digitaria insularis reference genome in Weedpedia).

FIGURE 2
FIGURE 2 Two-dimensional (A) and three-dimensional (B) principal component analysis of Digitaria insularis populations based on 5,238 SNPs.

FIGURE 3
FIGURE 3 Plot of Fst outlier analysis showing 63 SNPs Fst>0.03 (A) and 2D-PCA with 63 SNPs showing a moderate differentiation of MT population from rest of the populations (B).

FIGURE 4
FIGURE 4Two-dimensional PCA plot of resistant and sensitive populations from different geographic regions in Brazil (A) and weighted NJ trees of resistant and sensitive populations from Minas Gerais (B) and Mato Grosso (C) based on complete set of 5,238 SNPs.

FIGURE 5
FIGURE 5Manhattan plots showing marker-trait associations identified for resistance to Haloxyfop in Digitaria insularis with GLM (A) and MLM (B) models.

TABLE 3
Marker-trait associations identified by genome wide association study for resistance to Haloxyfop and clethodim and candidate gene hits.