Meta-Analysis Enables Prediction of the Maximum Permissible Arsenic Concentration in Asian Paddy Soil

It is now well established that not only drinking water but also irrigation water contaminated with arsenic (As) is an important source of human As exposure through water-soil-rice transfer. While drinking water As has a permissible, or guideline, value, quantification of guideline values for soil and irrigation water is limited. Using published data from 26 field studies (not pot-based experiments) from Asia, each of which reported irrigation water, soil and rice grain As concentrations from the same site, this meta-analysis quantitatively evaluated the relationship between soil and irrigation water As concentrations and the As concentration in the rice grain. A generalized linear regression model revealed As in soil to be a stronger predictor of As in rice than As in irrigation water (beta of 16.72 and 0.6, respectively, p < 0.01). Based on the better performing decision tree model, using soil and irrigation water As as independent variables, we determined that Asian paddy soil As concentrations greater than 14 mg kg−1 may result in rice grains exceeding the Codex recommended maximum allowable inorganic As (i-As) concentrations of 0.2 mg kg−1 for polished rice and 0.35 mg kg−1 for husked rice. Both the logistic regression and decision tree models identified soil As as the main determining factor and irrigation water as a non-significant factor, preventing determination of any guideline value for irrigation water. The seemingly non-significant contribution of irrigation water in predicting grain i-As concentrations below or above the Codex recommendation may be due to the complexity of the relationship between irrigation water As and rice grains. Despite modeling limitations and heterogeneity in meta-data, our findings can inform the maximum permissible As concentrations in Asian paddy soil.


INTRODUCTION
Arsenic (As) is a toxic, carcinogenic (Cohen et al., 2013) metalloid that occurs naturally in terrestrial and aquatic environments. Arsenic exposure, mainly through contaminated groundwater used for drinking, has widely been associated with detrimental health effects (Rahman et al., 2009). Though As exposure affects more than 200 million people worldwide (Shakoor et al., 2017), it has emerged as a major public health concern in Bangladesh and India over the last few decades (Chakraborti et al., 2015). The World Health Organization (WHO) has established a guideline value of 10 μg L−1 for As in drinking water. Although contaminated irrigation water also contributes to As exposure by enhancing As concentrations in food crops (Mandal et al., 2019; Bhattacharyya et al., 2021), no WHO or international guideline value for irrigation water has been established to date.
Rice is a staple food for more than half of the global population, especially in Asian, African and Latin American countries (Majumder and Banik, 2019). In India and Bangladesh, consumption of milled rice is high (approximately 103 and 268 kg per capita year−1, respectively; FAO, 2017). In Bangladesh, approximately 73% of calorific intake comes from rice (Mwale et al., 2018) and in India it comprises 30% (IRRI, Knowledge Bank). Rice is a rich source of dietary fiber and nutrients, including carbohydrates, proteins, vitamins, and minerals (Dipti et al., 2012; Mwale et al., 2018). However, rice consumption may also be a major route of As exposure (Mondal and Polya, 2008; Mondal et al., 2010; Mondal et al., 2020). Soil serves as a significant sink for As, which is highly bioavailable to rice roots under the conditions in which rice is cultivated (Kumarathilaka et al., 2018). Rice plants are major accumulators of As compared to other cereal crops (Williams et al., 2007), and irrigation of a paddy field with As-contaminated water elevates As concentrations in paddy soil (Meharg and Rahman, 2003), rice straw, and grain (Panaullah et al., 2008). In Asia, rice is the basic staple food for the majority of the population, including the region's 560 million poor (GRiSP Global Rice Science Partnership, 2013). During 2018-19, rice consumption in China amounted to 146.7 million tons, followed by India at 102 million tons (ICAR-NRRI Annual Report 2020). Apart from China and India, the other major rice producing countries are Bangladesh, Indonesia, Vietnam, Thailand, and the Philippines. Together, these countries account for more than 80% of global rice production (ICAR-NRRI Annual Report 2020), but unfortunately some of these regions are As contaminated. For example, in Bangladesh, 2.4 million out of 4 million hectares of paddy field have been found to be As contaminated (Akinbile and Haque, 2012).
Arsenic intake from rice has become a global concern, hence different countries have set maximum tolerable concentrations of As in rice grain (Schmidt, 2015). The Joint FAO-WHO Codex Alimentarius Commission has recommended a maximum concentration of 0.2 mg kg−1 for inorganic As in polished rice and 0.35 mg kg−1 in husked rice (Codex Alimentarius Commission, 2017). However, there have been limited attempts to establish paddy soil and irrigation water As concentrations above which the maximum recommended concentrations in rice may be exceeded. The usual range of total As in uncontaminated soil is 0.1-10 mg kg−1 (Zhao et al., 2010). The European Union (EU) recommended that As in agricultural soil should not exceed 20 mg kg−1 (Rahman et al., 2007; Hussain et al., 2021). Lower and upper guideline values of 10 and 50 mg kg−1, respectively, have been prescribed by Finnish regulators (Ministry of the Environment, 2007; Toth et al., 2016). However, the values recommended by the EU and the Ministry of the Environment in Finland were for generic agricultural soils rather than for paddy soils. These generic agricultural values may not be appropriate for application to paddy soil conditions, which are known to enhance As bioavailability to rice roots (Meharg and Rahman, 2003). For irrigation water, a regulatory limit of 100 μg L−1 for As has been adopted (Food and Agriculture Organization FAO, 1992; Pescod, 1992). This is in line with the 100 μg L−1 maximum concentration recommended by Ayers and Westcot (1985) for trace elements in irrigation waters but is again focused on generic agricultural production rather than rice specifically.
To our knowledge, no previous studies have derived maximum tolerable concentrations of paddy soil and irrigation water As above which rice grain As may exceed the maximum allowable concentrations set by the Joint FAO-WHO Codex Alimentarius Commission (JECFA, 2017). Using a meta-analysis approach, we attempt to determine soil and irrigation water As concentrations above which rice grains cultivated in Asian paddy fields may exceed the maximum tolerable concentrations of 0.2 mg kg−1 for inorganic As in polished rice and 0.35 mg kg−1 in husked rice.

Data Sources
We systematically reviewed published articles reporting As concentrations in paddy soil, irrigation water, and rice grains cultivated in Asian countries. We used Boolean operators (e.g., "OR" and "AND") to develop search terms from keywords ("arsenic," "contamination," "soil," "water," "rice," "risk"). Searching ISI Web of Science and PubMed with these terms, we identified relevant research papers published between 1980 and 2021, since the severity of As contamination in Asia has been recognized from 1980 onward. Studies were only included in the subsequent meta-analysis if (1) the research was carried out in the field and not as pot experiments in the laboratory; (2) it was undertaken in Asian countries; (3) the As concentration data presented included total As of soil, rice grain, and irrigation water from the same study location; (4) the analysis of As was carried out using appropriate laboratory instruments rather than Field Testing Kits; and (5) details of the analytical method(s) and quality assurance procedures used for the study were provided. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analysis) flowchart can be seen in Figure 1.

Classification of Data
The proportion of inorganic As (i-As) in rice grain was determined from published data (18 studies) from Asian countries, and weighted means of 80 and 75% for polished rice and husked rice, respectively, were calculated (Supplementary Table S1). The total As (t-As) concentrations in rice grain in our meta-data were converted to i-As. The rice grain concentrations were divided into two groups: (1) "within the maximum tolerable concentration (≤MTC)": As ≤0.35 mg kg−1 (husked rice) and ≤0.20 mg kg−1 (polished rice); (2) "above the maximum tolerable concentration (>MTC)": >0.35 mg kg−1 (husked rice) and >0.20 mg kg−1 (polished rice) (based on the recommendation of JECFA, 2017). The whole data set was randomly split into two: 80% of the data were used as the training set and the remaining 20% formed the testing set (Mukherjee et al., 2021).
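The classification and split described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code (the analysis itself was run in R): the function names are hypothetical, and it assumes that the t-As to i-As conversion is a simple multiplication by the weighted mean i-As proportion for each rice type.

```python
import random

# Weighted mean share of i-As in t-As and Codex limits, as stated in the text.
I_AS_FRACTION = {"polished": 0.80, "husked": 0.75}
CODEX_LIMIT_MG_KG = {"polished": 0.20, "husked": 0.35}


def classify_grain(t_as_mg_kg, rice_type):
    """Convert total As to inorganic As and label it against the Codex limit.

    Assumption: i-As = t-As multiplied by the weighted mean proportion.
    """
    i_as = t_as_mg_kg * I_AS_FRACTION[rice_type]
    return "<=MTC" if i_as <= CODEX_LIMIT_MG_KG[rice_type] else ">MTC"


def split_train_test(records, train_fraction=0.8, seed=0):
    """Randomly split records into a training and a testing set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

For example, a polished rice sample with 0.40 mg kg−1 t-As corresponds to 0.32 mg kg−1 i-As under this assumption and is labelled ">MTC", whereas the same t-As in husked rice (0.30 mg kg−1 i-As) is labelled "<=MTC".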

Logistic Regression
A logistic regression (LR) model, or logit model (James et al., 2013), was used to model our binary dependent variable, with classes ≤MTC and >MTC. A probability value between 0 and 1 was allocated to each class. To identify the best fitting model, both accuracy and kappa values were considered. The residuals of the LR model were checked for normality and the distribution was further confirmed from the plot. The model can have two (X1 and X2) or more predictors, in this case As in soil and As in irrigation water, whose coefficients are estimated from the data. The linear relationship can be written in the mathematical form shown by Equation 1, where p(X) is the probability of the event Y = 1 and Y is the binary response variable:

$$\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 \qquad (1)$$

The quantity p(X)/(1−p(X)) is called the odds, which can take any value between 0 and ∞. β0, β1, and β2 are the coefficients, which are estimated by the maximum likelihood method.

Decision Tree
In this study, we used a Classification and Regression Tree (CART), which is a non-parametric supervised learning method proposed by Breiman (1984) and Ripley (1996). Decision trees (DT) are not black-box models: their outputs are easy to interpret, and the DT maps the behavior or relationship between the predictor and target variable (Dreiseitl and Ohno-Machado, 2002). The DT method has been used widely, for example, to identify heavy metals in the environment (Jouanneau et al., 2011) and to streamline the mapping of soil pollution, as in a study on rice cadmium concentration (Wang et al., 2020). The algorithm divides the data set several times according to a criterion that maximizes data separation, resulting in a tree-like structure (Breiman, 1984). The most used criterion is information gain, which implies that the decrease in entropy due to each split is maximized. The ratio of y class elements over all elements of the leaf node that contains data item x is the estimate of P(y|x) (Dreiseitl and Ohno-Machado, 2002). The best DT model was selected based on the Complexity Parameter (cp) and accuracy. The cp was used to control the size of the DT and to select the optimal tree size. If the cost of adding another variable to the DT from the current node was above the value of cp, then tree building was discontinued. Model complexity is low in logistic regression, particularly when no or few interaction terms and variable transformations are used. The LR method uses only the statistically significant predictor variables in the model, whereas DT uses the predictor variables in a hierarchical and recursive manner. DT has the flexibility of assigning the classes in one or more steps. One advantage of LR is that it can be used to generate probabilities of class membership for each object, whereas DT only generates average probabilities applicable to all the objects assigned to a particular group (Worth and Cronin, 2003).
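The entropy-based splitting criterion described above can be illustrated with a short Python sketch. It is only a toy version of a single split search, not the CART implementation used in the study: the function names are hypothetical, the example data are invented for illustration, and many CART implementations default to the Gini index rather than entropy.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    total = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        total -= p * math.log2(p)
    return total


def best_split(values, labels):
    """Find the threshold on one predictor that maximizes information gain.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (threshold, gain); threshold is None if no split reduces entropy.
    """
    parent = entropy(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_gain = None, 0.0
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        gain = parent - children
        if gain > best_gain:
            best_threshold, best_gain = (low + high) / 2, gain
    return best_threshold, best_gain


# Invented soil As values (mg/kg) and grain classes, for illustration only.
soil_as = [2, 5, 9, 12, 16, 20, 30]
grain_class = ["<=MTC", "<=MTC", "<=MTC", "<=MTC", ">MTC", ">MTC", ">MTC"]
threshold, gain = best_split(soil_as, grain_class)
```

In a full tree, the same search is repeated over every predictor and then recursively within each child node, and pruning by cp decides where that recursion stops.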

Model Limitations and Assumptions
The two criteria used to assess the quality of a classification model are discrimination and calibration. Discrimination is a measure of how well the two classes in the data set are separated; calibration describes how accurately the model's probability estimate predicts the true probability (Dreiseitl and Ohno-Machado, 2002). To provide an unbiased estimate of a model's discrimination and calibration, these values should be calculated from a data set not used in the model building process. Usually, a portion of the original data set, called the test or validation set, is put aside for this purpose, since testing on a separate data set would, in an ideal case, provide an unbiased estimation of generalization error. In small data sets, as in this study, there may not be enough data for both training and testing. For this reason, the total data set was split into a training set and a testing set, and the training data set was used as the source of information. In this case, the data set was divided into k pieces: k−1 pieces were used for training, and the last piece was the test set. This process of k-fold cross-validation builds k models; the numbers reported are the averages over all k test sets (Stone, 1974; Allen, 1977). The problem of overfitting in both the logistic regression and the DT analysis was controlled by k-fold cross-validation (k = 10) of the training data (James et al., 2013). On the observations in the remaining fold, the number of misclassified observations was calculated. This procedure was repeated, with each validation set consisting of a different set of observations (James et al., 2013). To quantify the extent to which the predicted response value for a given observation was close to the true response value for that observation, the receiver operating characteristic (ROC) curve was used. The overall performance of a classifier, summarized over all possible thresholds, was given by the area under the curve (AUC) (James et al., 2013).
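The k-fold procedure described above can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the caret workflow used in the study: the function names, the `fit` and `predict` callables, and the fold assignment scheme are all hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


def cross_validated_error(features, labels, fit, predict, k=10, seed=0):
    """Average misclassification rate over the k held-out folds."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(labels), k, seed):
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        wrong = sum(predict(model, features[i]) != labels[i] for i in val_idx)
        fold_errors.append(wrong / len(val_idx))
    return sum(fold_errors) / k
```

Each observation is held out exactly once, so every model is scored only on data it did not see during fitting.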
The data analysis was performed using R-Studio (version 1.3.1093 2.3.1). Splitting the data into training and test data was performed using the stats (version 4.0.3) package. The Caret package (version 6.0-86) was used to conduct logistic regression and DT analysis (Kuhn, 2008). The probability graph from the logistic regression was prepared using ggplot2 (version 3.3.3) and tidyr (version 1.1.3), and for ROC and AUC pROC (version 1.17.0.1) was used.
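As a rough guide to what the AUC measures, it can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The Python sketch below illustrates this definition; it is not the pROC implementation, and the function name and the 0/1 label coding are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels use 1 for the positive class and 0 for the negative class.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

Under this definition a classifier that assigns scores at random has an expected AUC of 0.5, and a classifier that ranks every positive above every negative has an AUC of 1.0.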

Study Characteristics
Reviewing the titles and abstracts of 156 research articles that were obtained using the search terms resulted in 82 articles being identified for further consideration, of which 26 met the inclusion criteria (Section 2.1). From these 26 selected research papers (Table 1), an overall sample of 134 grain As concentrations was collated with corresponding soil and irrigation water As concentrations. Among the 26 selected studies, 15 reported grain As concentrations in husked rice and 11 reported concentrations in polished rice (Table 1). Most of the studies (n = 20) reported mean values. For the 6 studies that only reported concentration ranges (Roychowdhury et al., 2008b; Biswas et al., 2013; Srivastava et al., 2015; Sharma et al., 2017; Chowdhury et al., 2020; Reid et al., 2021), the minimum and maximum As concentrations reported for rice grain, soil and irrigation water were used. All the selected studies used credible methods for sample preparation and analysis: Atomic Absorption Spectrophotometer with Flow Injection Hydride Generation (AAS-FI-HG) was used in 18 studies, Inductively Coupled Plasma Mass Spectrometry (ICP-MS) in 7 studies, and High Resolution Inductively Coupled Plasma Mass Spectrometry (HR ICP-MS) in 1 study (Supplementary Table S2). All but 3 studies provided the results of analysis for certified reference materials, thus ensuring quality assurance. Of these three, Rahman et al. (2010) and Rahman et al. (2014) mentioned the use of reagent blanks in duplicates, and Talukder et al. (2011) followed the method as outlined and established by Loeppert and Biswas (2002). In all these studies, the irrigation water samples were filtered before instrumental analysis and the soil and rice grain samples were dried before digestion followed by instrumental analysis.

Relationship Between t-As Concentration in Rice Grain with Soil and Irrigation Water As Concentrations
Arsenic concentrations in rice grain, soil, and irrigation water based on the meta-data (n = 134) are summarized in Table 2. The rice grain t-As concentration ranged from 0.0018 to 1.56 mg kg−1 with a mean value of 0.40 mg kg−1. The As concentration in soil ranged from 0.06 to 112 mg kg−1 with a mean value of 11.73 mg kg−1 and the irrigation water As content ranged from 0 to 1014 µg L−1 with a mean value of 235.49 µg L−1. The rice grain t-As content was found to be positively and significantly (p < 0.01) correlated with the soil As (Spearman rho = 0.65) and the irrigation water As (Spearman rho = 0.46) concentrations (Figures 2A,B). The soil As had a significant (p < 0.05) positive correlation with irrigation water As concentrations (Spearman rho = 0.32) (Figure 2C). Of the collated meta-data, 12.68% had soil As concentrations above 20 mg kg−1 and 63.43% had irrigation water As above 100 μg L−1; 54% of the polished rice grain i-As meta-data exceeded the concentration of 0.20 mg kg−1 and 74% of the husked rice grain i-As meta-data exceeded 0.35 mg kg−1.
When a generalized linear regression model (GLM) was used on the meta-data, it was observed that $\text{Grain}_{\text{As}} = 0.60\,\text{Irri}_{\text{As}} + 16.72\,\text{Soil}_{\text{As}}$ (AIC = 1900.40) when the intercept was forced through zero. Both coefficients were significant (p < 0.01). The model with intercept, $\text{Grain}_{\text{As}} = 200.91 + 0.28\,\text{Irri}_{\text{As}} + 11.76\,\text{Soil}_{\text{As}}$ (AIC = 1877.8), had a slightly better fit, and the coefficients were significant (p < 0.05 for $\text{Irri}_{\text{As}}$ and p < 0.01 for $\text{Soil}_{\text{As}}$).
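As a quick plausibility check, the two fitted GLM equations can be evaluated at the mean soil and irrigation water concentrations reported for the meta-data. The Python sketch below does this; the function name is hypothetical, and the units are an assumption inferred from the magnitude of the coefficients (irrigation water As in µg L−1, soil As in mg kg−1, grain As in µg kg−1), since the text does not state them alongside the equations.

```python
def predict_grain_as(irrigation_as_ug_l, soil_as_mg_kg, with_intercept=True):
    """Evaluate the reported GLM equations (assumed output unit: ug/kg)."""
    if with_intercept:
        return 200.91 + 0.28 * irrigation_as_ug_l + 11.76 * soil_as_mg_kg
    return 0.60 * irrigation_as_ug_l + 16.72 * soil_as_mg_kg


# Mean concentrations reported for the meta-data (n = 134).
MEAN_IRRIGATION_AS = 235.49  # ug/L
MEAN_SOIL_AS = 11.73         # mg/kg

at_means_with_intercept = predict_grain_as(MEAN_IRRIGATION_AS, MEAN_SOIL_AS)
at_means_no_intercept = predict_grain_as(MEAN_IRRIGATION_AS, MEAN_SOIL_AS, with_intercept=False)
```

Under these unit assumptions the model with intercept returns about 405 µg kg−1 at the mean inputs, close to the reported mean grain t-As of 0.40 mg kg−1, while the zero-intercept model returns about 337 µg kg−1.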

Determination of the Maximum Concentration of As in Soil and Irrigation Water
The LR model predicted $\Pr(\leq\text{MTC} \mid {>}\text{MTC}) = -1.6822 + 0.1429\,\text{Soil}_{\text{As}}$ (AIC = 123.68). The soil As coefficient significantly (p < 0.01) explained the grain i-As content. When irrigation water As was added to the model, the coefficient was statistically non-significant (p > 0.05) and AIC increased to 128.17. Soil As content was 11.75 mg kg−1 when the probability (≤MTC | >MTC) reached 50% (Figure 3), and hence this can be considered the soil concentration above which the i-As in grain may exceed 0.20 mg kg−1 for polished rice and 0.35 mg kg−1 for husked rice. Performance of the LR model was lower than that of the DT, based on both the classification ability in terms of accuracy and misclassification (grain As as ≤MTC vs. >MTC; Table 3) and the ROC plots (Figure 4). Since classifiers that perform no better than chance are expected to have an AUC of 50% (James et al., 2013), both models performed well overall, but the ROC plots confirmed that DT performed better than LR. During the training phase, DT achieved an AUC of 72.5% and LR 65.5% (Figures 4A,B), and in the testing phase (Figures 4C,D), the AUC for DT was 70.6% and for LR 65.5%.
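The 50% soil As value can be checked by hand from the fitted coefficients. The short derivation below assumes that the right-hand side of the fitted LR equation is the log-odds of the modelled class probability p, as in a standard logistic regression; a probability of 0.5 then corresponds to log-odds of zero.

```latex
\begin{align}
\log\left(\frac{p}{1-p}\right) &= -1.6822 + 0.1429\,\text{Soil}_{\text{As}} \\
p = 0.5 \;\Rightarrow\; 0 &= -1.6822 + 0.1429\,\text{Soil}_{\text{As}} \\
\text{Soil}_{\text{As}} &= \frac{1.6822}{0.1429} \approx 11.77\ \text{mg kg}^{-1}
\end{align}
```

This agrees with the reported 11.75 mg kg−1 to within what rounding of the published coefficients could plausibly explain.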
The probability of grain i-As being classed ≤MTC by the DT model was 0.52 (Figure 5). Soil As was the primary splitting variable at 14 mg kg−1. Below this soil concentration, the probability of grain i-As being classed as ≤MTC was 0.76 (69% of the data were in this second node). When soil As was greater than 14 mg kg−1, the probability of grain As being classed >MTC was 0.85, and 31% of the data were in this node. No further splitting of the tree, and no inclusion of irrigation water as a successful variable, was observed. An attempt was made to predict the maximum concentration in irrigation water above which the soil As exceeded 11.75 and 14 mg kg−1 using the LR and DT models, respectively. With the LR model, the irrigation water As was observed to be non-significant, and in the case of DT, irrational splitting was observed which was also not suitable for pruning based on the complexity parameter.
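Because the fitted tree has a single split, it reduces to a one-line decision rule. The Python sketch below encodes that rule with the node probabilities reported above; the function name is hypothetical, and assigning a soil As of exactly 14 mg kg−1 to the lower node is an assumption, since the text does not specify the boundary case.

```python
SOIL_AS_SPLIT_MG_KG = 14.0  # primary (and only) split reported for the DT


def classify_by_soil_as(soil_as_mg_kg):
    """Return (predicted class, reported probability of that class in the node).

    Assumption: a value exactly at the split falls in the lower (<=MTC) node.
    """
    if soil_as_mg_kg > SOIL_AS_SPLIT_MG_KG:
        return ">MTC", 0.85
    return "<=MTC", 0.76
```

For instance, a paddy soil at 18 mg kg−1 lies below the 20 mg kg−1 generic agricultural value but above the split, so this rule places it in the ">MTC" node.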

Subgroup Analysis
The subgroup analysis (Supplementary Table S2) was performed using the data from the studies that also reported soil parameters like pH and organic carbon (OC) content in soil (Table 1). Several studies have reported pH (Hussain et al., 2021) and organic carbon (OC) (Mandal et al., 2019; Kumari et al., 2021) content in soil as major determinants of soil As. The GLM $\text{Grain}_{\text{As}} = 258.05 + 0.38\,\text{Irri}_{\text{As}} - 0.29\,\text{pH} - 0.32\,\text{OC} + 19.43\,\text{Soil}_{\text{As}}$ (AIC = 420.47) was only significant for the $\text{Irri}_{\text{As}}$ (p < 0.05) and $\text{Soil}_{\text{As}}$ (p < 0.01) coefficients. For the GLM $\text{Soil}_{\text{As}} = 1.30 + 0.003\,\text{Irri}_{\text{As}} + 1.03\,\text{pH} - 0.01\,\text{OC}$ (AIC = 210.37), none of the coefficients were statistically significant (p > 0.05). In the logistic regression of the subgroup, $\Pr(\leq\text{MTC} \mid {>}\text{MTC}) = -1.811 + 0.003\,\text{Irri}_{\text{As}} - 0.62\,\text{pH} - 0.01\,\text{OC} + 0.28\,\text{Soil}_{\text{As}}$ (AIC = 46.50), only the coefficient of $\text{Soil}_{\text{As}}$ was statistically significant (p < 0.05). The LR model for the subgroup thus identified that soil As alone significantly affected the class probabilities of the rice grain As in terms of being ≤MTC or >MTC.
Frontiers in Environmental Science | www.frontiersin.org December 2021 | Volume 9 | Article 760125

DISCUSSION
To our knowledge, this is the first meta-analysis utilizing published data from Asian paddy fields on As in rice grain, soil, and irrigation water to determine the relationship between them and to develop a model to estimate the maximum concentration in paddy soil and irrigation water above which Codex standards for the maximum allowable i-As in rice would be exceeded (JECFA, 2017). From the 156 papers reviewed, only 26 studies (15 from India, 1 from Taiwan, 1 from Nepal, 1 from Vietnam, 1 from China and 7 from Bangladesh) met the inclusion criteria for the meta-analysis; these studies all reported t-As concentrations in rice grain, soil, and irrigation water. There was a near equal split between studies that reported t-As concentrations in husked rice (15 studies) and those in polished rice (11 studies). After converting t-As to i-As, 43% of husked and 60% of polished rice concentrations exceeded the Codex standard. Based on the GLM, both soil and irrigation water As concentrations significantly influenced As concentrations in grain. While it is known that the bioavailability of As from soil to rice depends on various parameters including microbial activity (Kumarathilaka et al., 2019), As concentrations in soil and irrigation water are acknowledged to be the most significant factors affecting rice grain t-As concentrations (Mukherjee et al., 2017; Kumari et al., 2021). Our subgroup analysis on the impact of soil parameters (e.g., pH and OC) on rice grain As supports this, the other soil parameters being non-significant. However, in this meta-analysis, soil As was the main determining factor and this was confirmed by (1) the stronger positive correlation between paddy soil and rice As concentrations compared to irrigation water and rice grain concentrations (Figure 2); (2) the LR model having a non-significant contribution of irrigation water As; and (3) the DT model predictions taking into account only the soil As classified data. This aligns with previous studies where authors reported the most significant impact from soil As and a "modest if any" impact of irrigation water on t-As content of rice (Van Geen et al., 2006). On the contrary, in a recent study, mean As concentrations in groundwater used for irrigation were strongly correlated with grain t-As (Reid et al., 2021). Regardless, many studies have suggested that soil As concentrations increase with contaminated groundwater irrigation of paddy fields (Huq et al., 2006; Panaullah et al., 2008; Dittmar et al., 2010), eventually resulting in an increase of grain As (Rahman et al., 2007; Lu et al., 2009; Rahman et al., 2010). The non-significant influence of irrigation water on the grain As concentrations in this study was perhaps unsurprising given the complexity in the transfer of irrigation water As to rice grain via the soil. For example, the accumulation of As in soil from the irrigation water is dependent on several factors like the temporal variation in As concentration throughout the crop-growth period, the volume of irrigation water used, and the area of the field being irrigated (Chowdhury et al., 2018; Chowdhury et al., 2020).
FIGURE 2 | Spearman correlation between (A) rice grain t-As and soil As, (B) rice grain t-As and irrigation water As, and (C) irrigation water As and soil As (n = 134).
As seen in Figure 2C, there is only a moderate increase in soil As with increase in irrigation water As concentration. In fact, some of the very high As concentrations in soil were observed when the irrigation water As concentration was less than 300 μg L−1. The accumulation of soil As also depends on long-term leaching, which is influenced by soil texture including particle size, clay content, and saturated hydraulic conductivity (Dousova et al., 2016). The complexity in the relationship between irrigation water As and grain As could be further enhanced by irrigation practices, which often include the use of both groundwater and rainwater. For example, the phase-wise soil As movement and its enrichment pattern in rice due to the use of As-contaminated irrigation water showed a moderate accumulation of soil As in the vegetative phase followed by a severe drop in the reproductive phase and continued buildup of As in the ripening phase (Chowdhury et al., 2018). In rainfed rice cultivation, by contrast, a moderate accumulation of As in soil in the vegetative phase was followed by a rise in the reproductive phase and a decrease at the ripening stage, mainly due to dilution of the accumulated soil As by rainwater (Chowdhury et al., 2020). Furthermore, the binding of As via ternary complexes and mobile colloids has been considered the most important but not the sole driver of As mobility in soils (Mandal et al., 2019).
Other competing inorganic anions like phosphates and silicates affect the As adsorption/desorption mechanism significantly. For example, they form surface complexes with Fe and organic matter, causing an easier release of As from surface bindings at higher concentrations (Raj et al., 2021).
The maximum concentration of As in soil from the LR model was found to be 11.75 mg kg−1 whereas, based on the better performing DT model, the maximum concentration above which the As concentration in rice grain would exceed the Codex recommendation was 14 mg kg−1. This is in agreement with (1) the study from Bangladesh investigating the accumulation and distribution of As in rice grain, which reported that the rice grown in soils contaminated with As of 14.5 ± 0.1 mg kg−1 was not safe for human consumption (Rahman et al., 2007); (2) the recommendation of the maximum acceptable limit of As in soil of 20 mg kg−1 by the European Union (Rahman et al., 2007); and (3) the limit of 15 mg kg−1 of As in paddy soils set by the Ministry of Environment, Government of Japan (Punshon et al., 2017). These findings suggest that, based on the scientific data currently available for rice cultivation in Asia, an As guideline value of 14 mg kg−1 in paddy soil may be appropriate.
As this is a meta-analysis, one of the limitations of our study might be unavoidable heterogeneity. We have carefully limited the data based on our inclusion criteria to reduce the heterogeneity. However, both models were influenced to a minor extent by the data from the six research papers that reported only minimum and maximum values; to overcome this, repeated cross-validation was performed (James et al., 2013). Alongside the inherent limitations of modelling, the challenge was to increase the model's generalizability so that its applicability is not limited. It would be naive to believe that our models could be applicable to all contaminated rice growing sites worldwide, as the models have been trained with a particular set of data from a specific geographical region. However, some points should be taken into consideration. For example, the soil As data used in the meta-analysis is total As, which is obtained using an acid extraction technique, and total As content in soil does not take into account how its availability is modified by important soil properties (Golui et al., 2017). However, the bioavailability of As in soil solutions is governed mainly by pH, redox potential, organic matter content, and the presence of other ions in soil (Hussain et al., 2021). The use of soil available As (extracted by NaHCO3) should be investigated as a better explanatory variable. Future work incorporating analysis of As in rhizosphere soil, soil solution, and irrigation water samples, and sequential analysis of As in soil over time would enable further improvement and validation of our model.

CONCLUSION
This meta-analysis determined that a concentration of 14 mg kg−1 in paddy soil may be an appropriate guideline value above which rice grains cultivated in Asian paddy fields may exceed the Codex recommended maximum allowable concentrations of 0.2 mg kg−1 for i-As in polished rice and 0.35 mg kg−1 in husked rice. While the GLM predicted an overall positive relationship between As in rice grain and As in irrigation water, both the LR and DT models predicted that soil As was the main determining factor. A guideline value for the irrigation water could not be derived using either the LR or DT models. The non-significant contribution of irrigation water was unsurprising, given that the As accumulation in soil due to contaminated irrigation water depends on several factors and the relationships governing transfer to rice grains are complex. Considering the uncertainties and limitations of the available meta-data and models, experimental studies collecting more appropriate soil and irrigation water samples along with rice grain, and analysis of soil available As rather than total As in paddy soil, are warranted to validate these findings.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.