Noise-Resilience Horizon of Groundwater Potential Maps Generated by Frequency-Ratio Data Integration

The study of groundwater distribution is gaining importance due to the mounting pressures exerted by rapid urban growth on water supply, especially in small islands that could experience faster supply deterioration through saltwater intrusion. Understanding the interplay between the groundwater supply and demand dynamics requires seeing the resources beneath the surface. One typical visualization technique is groundwater potential (GWP) mapping, which predicts groundwater’s spatial distribution from measurable variables on or above the Earth’s surface. However, system errors and noise can affect the quality of the input variables, which can influence the reliability and explanatory power of the GWP maps. Herein, we analyzed the effect of noise on the GWP map accuracy for Cebu and Mactan islands, Philippines. We found that the GWP map retains the fidelity of the zonal structure information in the presence of noise in the input map layers. With a combination of two binary-classifier performance curves, we established the noise-resilience horizon. This horizon is the limit noise-level that the input maps may contain such that the GWP maps retain high accuracy. This horizon indicates that the input maps may carry as much as 20% to 25% error without significantly corrupting the GWP map’s predictive accuracy. Our findings contribute to the knowledge of GWP mapping’s accuracy limits, which is valuable as such diagrams comprise the core of decision-support systems in groundwater management. We also anticipate our dither approach as a foundation for the generic assessment of GWP map accuracy, regardless of a priori details of the map-generating model.


INTRODUCTION
Groundwater is an essential resource at the very foundation of modern human civilization. The growth of urban centers worldwide presents an ever-mounting pressure on this water supply for three reasons. First, the population has grown tremendously in a period shorter than the development of groundwater extraction systems (Wyman, 2013). Second, the threat of climate shifts seems to be happening at a rate faster than cities could adapt (Green, 2016). Third, the speed of groundwater contamination, which anthropogenic activities accelerate, is sufficiently significant in magnifying the perceived scarcity (Burri et al., 2019). Hence, studies that can help in enhancing the community's adaptation rate, such as on the informative search of untapped groundwater sources, will be valuable.
Small island communities are among the densest in the world, thus facing a prominent threat of strained groundwater supply. For instance, the small area exposes such islands to saltwater intrusion, a situation that rising sea levels would exacerbate (Robins, 2013). The motivation to explore groundwater sources to support the demands of a growing human population, driven mainly by rapid urbanization, is paramount. Cebu is an island in the Philippines with a population density of about 890 persons in every square kilometer, among the most notable in the world. The rapid urbanization of the island led to the surging water demand for residential, commercial, and industrial consumption. Indeed, in many small islands, the population density exerts the most pressure on the water supply. A further complication with small island hydrogeology is the difficulty in quantifying catchment water balance, a major part of which is unseen (Robins, 2013). The largely immeasurable water balance of small islands will thus require indirect methods of tracking groundwater viability. These tracking methods will help address the possible repercussions from a strained supply.
The present study applies the concept of the groundwater potential (GWP) to delineate zones within the densely inhabited islands of Cebu province (notably, the Cebu and Mactan islands) in the context of water source prospecting. A GWP map is a spatial distribution of the best estimate of the terrain's physical capacity to supply a required amount of groundwater for a given use. The GWP map is the output of a combination of input map data on physical variables that relate to the presence or absence of groundwater. Díaz-Alcaide and Martínez-Santos (2019) identified eight (8) of the most common input variables GWP researchers include. We utilized the frequency ratio (FR) method in determining the GWP values due to its data-driven approach. The technique assigns weights objectively based on the available data, although it requires a large number of wells (such as in the order of 1,000). Semi-quantitative methods, such as the analytical hierarchy process (AHP), consider human-expert opinions to assign the weights of importance of the putative factors that influence GWP.
The noise-resilience horizon is a framework for setting a theoretical limit to the expected predictive accuracy of any GWP model, regardless of the details of the method that generated it. We utilized the notion of dithering to inject random noise to a GWP map and explore the maximum intensity of this noise at which the original map's accuracy degrades significantly. Dithering has been discussed in the context of signal processing as a method of modulating the inherent quantization errors of measured signals (Sullivan et al., 1993;Hu, 2016). The modulation renders the signal's total error component into a steady white noise that can be separated from the underlying information by conventional filtering techniques (Lipshitz et al., 1992). In this study, we used dithering similarly, but for showing how far the degradation corrupts the original information content of the map, analogous, in context, to the lossy transmission of signals over noisy computer networks (Mohr et al., 2000). Demonstrating our approach of finding the noise-level horizon requires that we minimize as much human bias as possible, which may be challenging to filter out through techniques like AHP. Nevertheless, the framework should be applicable, with suitable adjustments, to generate GWP maps using other methods.

MATERIALS AND METHODS
A GWP map typically results from the data integration of multiple predictor variables. For every map unit (e.g., pixel for a digital map) centered at coordinate (x,y), the value of the GWP is expressible as a multivariable linear regression model of the form, The F i are the input factors, grouped into eight (8) thematic layers. Each layer L, in turn, comprises M L classes; hence, We may assume a 0 = 0 as a boundary condition (i.e., GWP = 0 if all F = 0).
The modeler does not typically have a priori information on the value of the weight coefficient a i . The common approaches for determining those values, such as the multi-influence factor (MIF) technique and AHP, use a qualitative system of ranking the importance of factors. However, this ranking system relies on subjective assessment of the importance (Nasir et al., 2018), which can amplify site-specific biases. In the present study, we opted to use the frequency-ratio (FR) technique to determine the weight values based on prior information about the location of existing wells (Díaz-Alcaide and Martínez-Santos, 2019). The National Water Resources Board (NWRB) and Local Water Utilities Administration (LWUA) of the Philippines provide data on existing well locations 1 . For Cebu and Mactan islands, the NWRB and LWUA identified a total of 1,204 unique wells. This data set is a 70:30 split between training set for constructing the GWP map, and validation set for the map evaluation.
The present study has been carried out through four main steps, as outlined in Figure 1: (1) data collection and preparation, (2) noise injection, (3) application of frequency ratio model, and (4) map validation and assessment.

Frequency Ratio
The basis of determining the coefficients is the frequency ratio (FR), a value calculated given a data set of existing well locations. The training data set includes a total of 842 wells as indicated in Figure 2, which also highlights the Cebu and Mactan islands of the Cebu province.
Every pixel on the map corresponds to a 10-m by 10-m area, which could accommodate at most one well. With this oneto-one correspondence, a total of 842 pixels represent existing well locations in the training set. For every thematic layer L categorized into M L classes, the FR k of the k-th class (for k = 1, 2,. . ., M L ) is a ratio of two ratios. The numerator is a ratio between the number of wells located in pixels belonging to the k-th class, W k , and the total number of wells in the study area (N = 842) considered. The denominator is the ratio between the number of pixels categorized in the k-th class, P k , and the total number of pixels representing the entire study area, P. Hence, FR k = (W k / N) / (P k / P), which is a real-valued positive number.
For every thematic layer L, a threshold W L * = N/M L , represents the equipartition scenario for which wells can be found in pixels regardless of the class to which the pixel belongs. This scenario implies that the thematic layer offers no useful information of the zones in which to develop groundwater sources. We use W L * to further assess the significance of the classes in which the FR values are relatively substantial. Particularly, if W k /N > W L * , then we consider the k-th class as a significant factor that determines the GWP. This method of assessing the k-th class checks if a high FR is a result of consistent recurrence of wells in k-class pixels. A high FR is possible when very few wells are in pixels of a rare class (i.e., covers a small portion of the study area). This scenario might imply a chance incidence of wells in a rare class of a certain thematic layer.

Thematic Layers
A thematic layer corresponds to any of the eight input variables assumed as relevant in determining GWP across various geographic settings (Díaz-Alcaide and Martínez-Santos, 2019). The layer provides pixel-level data classified either according to quantitative ranges or qualitative characteristics. The following paragraphs provide further details about each thematic layer as they apply to the context of the study. The class labels for all layers are in the Supplementary Table 1.

Geomorphology
The geomorphologic features have a link to the topographic landforms found in the study area. Hammond's macro-landform classification technique was used in the ArcMap ModelBuilder (Karagulle et al., 2017). This approach generates the landform map from the relief, slope and profile parameter maps resulting from a digital elevation model (DEM).

Lineament Density
The lineament density is a quantity referring to the total lineament length per unit area. The shaded relief images from a LiDAR DEM generated, via manual interpretation and digitization, the lineament data. Each relief image corresponds to an azimuth angle: 45 • , 135 • , 225 • , and 315 • . These angles allow for the enhancement of the linear features from the DEM. With ArcMap's (version 10) line density tool, we automated the calculation of the lineament density values. Finally, with the natural breaks method, we categorized the lineament density values into five classes that correspond to ranges: very low (0.00-1.33 km/sq.km), low (1.33-3.06), moderate (3.06-4.99), high (4.99-7.92), and very high (7.92-16.97).

Drainage Density
The drainage density is the total length of the drainage channels per unit area. Each drainage channel is a result of the watershed delineation framework applied to the LiDAR DEM of the study area. Using ArcMap (version 10) line density tool, the generated channels went through further data integration that resulted in the estimates of the drainage density values. With the natural breaks method, we categorized the drainage density values into five classes: very low (0.00-0.22 km/sq. km), low (0.22-0.67), moderate (0.67-1.13), high (1.13-1.67), and very high (1.67-4.38).

Soil
The soil map came from the Philippine GIS Data Clearinghouse or PhilGIS (philgis.org, Accessed December 2019), an online repository for free GIS data of the entire Philippine archipelago. The study area consists of eleven (11) soil types: hydrosol, beach sand, clay (Faraon, Bolinao, Lugo, and Medellin), clay loam (Macolod, Mandawe, Mantalongan, and Baguio), and Mandawe silt loam.

Land Use/Land Cover
The National Mapping and Resource Information Authority (NAMRIA) of the Philippines provided the maps on the land use and cover features of the study area. Due to the dynamic nature of land use, especially in built-up areas, the land use/land cover (LULC), at best, only covers a snapshot. The present study utilized the 2015 version of the NAMRIA data for LULC.

Geology
The PhilGIS is also the provider of data on the geologic map of the study area. This map comprises twelve (12) identified features dominated by Carcar limestone and Talavera Group, which are 37% and 30%, respectively, of the study area. Only 0.03% of the study area does not belong to a geologic category.

Rainfall
The annual average rainfall amount has a north-south gradation highest at the northernmost tip of the study area decreasing towards the south. The long-term rainfall data between 1981 and 2010 came from the Philippine Atmospheric, Geophysical and Astronomical Services Administration (PAGASA). We used the Thiessen polygon method to generate the rainfall map from these data. Four weather stations were crucial sources for the Thiessen method: Tacloban, Mactan, Tagbilaran and Dumaguete. The Tacloban station accounting for 0.46% of the study area, which is at the northernmost tip, registered the highest rainfall level at 2659.3 mm. The Mactan station, covering the majority of the area (69%), registered the second highest level at 1564.5 mm. The Tagbilaran station, the scope of which is the second widest (19%), registered 1412.6 mm. The Dumaguete station, which covered the southernmost 11% of the study area also registered the lowest rainfall level at 1218.4 mm.

Slope
The slope data is a result obtained from the same data processing steps as in creating the geomorphology map. With a combination of Hammond's classification procedure and the natural breaks technique, we considered five (5) categories of slope angles from the gentlest to the steepest. These categories are: Gentle (0.00 to 7.73 degrees); Moderate (

Sensitivity Analysis
Models that use multiple parameters such as the FR technique curb the impact of ambiguities in the generated GWP map.
An adequate model is one that optimizes the balance between biases and unexplained variances arising from the use of too few parameters and overfitting from including too many parameters (Saidi et al., 2011). To deal with this bias-variance tradeoff and provide an objective and efficient justification of the factors of the GWP map, a sensitivity analysis was performed.
The map removal sensitivity analysis technique (Lodwick et al., 1990) was used in the present work. It was performed by removing one thematic layer at a time from the FR integration. If the exclusion of a factor results in a large variation in the output map, the output map was said to depend on that factor. The layers which caused the least changes to the GWP map were then removed in succession until the layer with the largest variation caused was left. The sensitivity, S = S(x,y) at position (x,y) on the map is calculated according to the following expression: (1) accounting for all U = 8 thematic layers, and V' = V'(x,y) is the GWP value with one or more thematic layers (or input factors) removed. The quantities U and u are the number of factors included in the calculation of V and V' respectively. For a given u, the overall map sensitivity is examined from the statistics taken across all the pixels on the map. Particularly, we focused the sensitivity analysis on the mean sensitivity across all pixels.

Noise Injection by Dithering
This study focused on investigating the robustness of the GWP map generation approach to the inherent errors embedded into the input map data (i.e., thematic layers). The reliability of the GWP map rests on the stability of the output to the potential variabilities arising from the fidelity issues in the input data. We simulate the errors in the input data by injecting uncorrelated noise in the form of a dither probability K that the value in the pixel located in (x,y) switches to a different category. In this study, we explored the following dither probabilities: 10%, 20%, 25%, 50%, and 80%. For a given thematic layer L, the effective probability that any pixel changes its value (i.e., category) would be K (1 -1/M L ). The value of this effective probability is less than K but never less than K/2. Applying this procedure onto all pixels generates a dithered map GWP K (x,y) for a given value of K.

Map Validation and Assessment
The GWP map is an output of the evaluation of Equation 1 over all the pixels of the map. The delineation of the GWP zones required the discretization of the pixel values into five categories determined using the natural breaks technique: Very Low (0.00 to 4.55), Low (4.55 to 7.69), Moderate (7.69 to 12.56), High (12.56 to 18.62), and Very High (18.62 to 27.61).
The validation set consisting of 362 wells consist of wells that were not part of constructing the GWP map. We used this data set to test the accuracy of the constructed GWP map. First, the GWP categories go through a further categorization that groups the values belonging to Very Low and Low as "NO" and Moderate through Very High as "YES." This binarization of the categories prepares the map for the assessment using the receiver operating characteristic (ROC) curve. The ROC visualizes the map's diagnostic ability to predict those zones in which groundwater wells are feasible or not. Second, we define the four fundamental categories for the evaluation of binary classifiers, namely, true positive (TP), false positive (FP), false negative (FN), and true negative (TN). TP is when a well in the data set is in a YES pixel; FP is when no well is in a YES pixel; FN is when a well is in a NO pixel; TN is when no well is in a NO pixel.
The assessment of the GWP map dwells on the examination of the ROC and another graph, known as the precisionrecall (PR) curve. While the ROC plots the TP rate (also known as recall, = n(TP)/[n(TP) + n(FN)]) against the FP rate ( = n(FP)/[n(FP) + n(TN)]), the PR curve plots the precision ( = n(TP)/[n(TP) + n(FP)]) against the recall. The area under the curve (AUC), which has a value between 0 and 1, is the metric for the classification performance of the map. Applied to either the ROC or PR curve, a high AUC within the allowable range implies the better performance of the GWP map. While the AUC of the ROC curve usually suffices, its high value may be misleading, especially for class-imbalanced data sets (Lever et al., 2016). The AUC of the PR curve offers confirmation for the AUC of the ROC curve. A high AUC from both curves would suggest that the binary classifier is accurate.
In addition to the high AUC, the PR curves also would exhibit features that indicate the predictive accuracy of the binary classifier. A binary classifier's baseline performance is random guessing, which implies that it will be correct half of the time, on average. This baseline should be a horizontal line at 0.5 in Figures 4 and 5.

RESULTS
The FR values resulting from the application of the frequencyratio calculations onto the training data set reveals the noteworthy correlations between the thematic layers and GWP. In comparison with a designated threshold (Supplementary Table 1), the existing wells tend to be in sites (pixels) belonging to certain classes more often than chance. Hence, the data set exhibits a class imbalance in the context discussed by Lever et al. (2016). This result translates to relatively high coefficients on certain factors in Equation 1, further implying a strong correlation between those factors and GWP.
On Cebu and Mactan islands, the following features seem to have noteworthy, strong correlations with GWP: (1) quaternary alluvium geology; (2) very low hills geomorphology; (3) clay or clay loam soil; (4) built-up area; (5) gentle slope. We also find that GWP takes strong influence from low lineament density and high drainage density, a counterintuitive combination. Some authors suggest that GWP would benefit from a high lineament density due to enhanced groundwater recharging by precipitation. Meanwhile, a low drainage density tends to arrest groundwater leakage. Nevertheless, the high GWP in those areas of the study area is likely a confluence of the extensive quaternary alluvial geology and transitional zone topography. Walther et al. (2017) found that small dams in Oregon are typically hiding in "very low" hills and areas of high population density (as do built-up areas). Small dams, like groundwater sources, are mostly hidden. They also refer to very low hills as a transition zone between plain and steeper topography. The high GWP in these areas seems to corroborate with this aspect of a transition zone, especially in an area that receives an average of more than 1,500 mm of rainfall annually.  and  likewise found a similar confluence of geology, geomorphology, and soil type in their geospatial characterization of the Pravara basin in India.

Sensitivity Analysis of the Estimated GWP
The results of the sensitivity analysis justified the thematic layers that were considered in determining the GWP map. Table 1 shows the variation index statistics calculated from the map removal sensitivity analysis approach. The analysis disclosed that the rainfall factor had the highest influence in the GWP (mean variation index: 1.33%) followed by geology (mean variation index: 1.1%) and lineament density (mean variation index: 1.01%). A similar deduction was found by Al-Ruzouq et al.
(2019) on geology and precipitation in northern UAE, although lineament density was found least significant. Conversely, excluding the LULC layer delivered the least variation in GWP implying that the factor has the least impact on the GWP map. Although higher FRs were computed for the geomorphology classes, the factor only ranks fifth in the most significant factors in GWP. The application of a one map removal sensitivity analysis indicated that rainfall, lineament density, and geology are the most significant factors influencing groundwater potential in the region of study. Table 2 displays the variation of the GWP as a result of excluding one or more factors in the integration process. A small variation of the GWP was calculated by removing the factor LULC while higher variations manifested as more factors were removed from the integration. The assimilation of fewer parameters divulged that variations are apparent in the GWP map with the complete number of parameters. Despite the varying importance of each factor, sensitivity analysis confirmed the significance of the eight (8) thematic layers in GWP.

Dithered GWP Maps at Varying Noise Injection Levels
The noise injection study established the limits of the dithering level at which the GWP maps retain a substantial level of accuracy. Particularly, our results show that the baseline map retains the zonal structure in the presence of noise in the input map layers, but only up to a maximum level. We designated this maximum level as the noise-resilience horizon. The output GWP maps (Figures 3A-F) show a progressively decreasing zonal structure with an intensifying noise injection level (i.e., as the value of K increases). This trend is likewise visually evident in the progression of the input maps (Supplementary Figures 1-7) from K = 0% to K = 80%. The noiseless case (Figure 3A) show zones of high to very high GWP in the northern tip of Cebu Island. GWP is high as well on a major part of Mactan Island, which lies to the right of the middle part of Cebu Island. The coastal areas exhibit high GWP especially around the strait between Cebu and Mactan islands in which a dense cluster of wells actually exist (Figure 2). The zonal structure of the GWP map is still remarkable at K = 10%, 20% and even at 25% (Figures 3B-D). However, at K = 50%, the GWP map already demonstrates significant smudging in the zonal structure. Compared to the K = 0 case, the GWP map at K = 50% (Figure 3E) is not an informative guide for groundwater prospecting. The situation is much worse at K = 80% (Figure 3F).

Validation and Assessment of the GWP Map and the Noise-Resilience Horizon
The AUC for the ROC and PR curves on the training data set ( Table 3) are above 0.80 when the noise levels are K = 0%, 10%, and 20%. The binary classification performance of the GWP map appreciably degrades at K = 50% and 80%, with the AUC of the ROC and PR curves both dropping to less than 0.70. The GWP map performs just as well (if not somewhat better) on the validation set across the different K values (Table 3). This consistency in the AUC between the training and validation data sets seems to suggest that the GWP map of the study area is useful for identifying potential areas of groundwater development.
A cluster of ROC and PR curves form within a recognizable band for a range of noise levels between K = 0% and 25% (Figures 4 and 5). This band is distinctly separate from the curves for K = 50% and 80% and also significantly above the indicative performance (see dashed lines in Figures 4 and 5) of a randomly guessing classifier. The high AUC > 0.80 for all the curves in this band suggests that the GWP map preserves zonal information in spite of noise. Consequently, the GWP map is resilient to noise with intensities as high as K = 25%. Visually, the cluster of curves constitute the noise-resilience horizon, which represents the limit noise-level that the input maps may contain to retain high accuracy.
The noise-resilience horizon that we found in our analysis seems to agree with results obtained from the lossy transmission of an image over noisy computer networks (Mohr et al., 2000). In that study, the authors determined how many packets can be lost through network transmission and still retain an acceptable image quality level at the receiver's end. From the context of  that study, the image is analogous to the GWP map; the packet losses correspond to the inherent noise within the input maps; and the lossy transmission can be interpreted as the noisy inputoutput transformation that generates the GWP map. Mohr et al. (2000) found that the image quality is robust to packet losses of as much as 40%, which is analogous to the noise-resilience horizon we obtained in our study. The calculation of such a horizon is valuable for assessing the realistic limits to the accuracy of relaying information via inherently noisy methods.

DISCUSSION
The study by Elbeih (2015) established the scientific utility of remote-sensing and GIS data sets as inputs for estimating GWP maps. However, as did numerous studies after it, authors did not report the inherent statistical uncertainties of the input GIS thematic layers. For this reason, the reliability of the groundwater maps, regardless of the reported site-specific accuracy, is not easy to establish. Most studies also assume that the input variables do not change in time, which neglects the demand level dynamics associated with rapid urbanization. An earlier study by Nas and Berktay (2010) seems to be the only one that provided an explicit declaration of the geo-statistical characteristics of the input data from which they generated the spatial distribution of groundwater quality. The geo-statistical characterization of the input data underscored the importance of examining input uncertainty to gauge the reliability of estimated groundwater maps. For example, the kurtosis was exceptionally high (i.e., > 3), suggesting heavy-tailed distributions of values, having more frequent outliers than a normal distribution would allow. A large kurtosis would likely correspond to high dither probabilities, implying weaker reliability levels of the predicted GW map. Machine-learning approaches for GWP mapping have increased popularity in recent years (Miraki et al., 2019;Pham et al., 2019;Chen et al., 2020). Although such approaches report high accuracies in specific sites to where many researchers applied the analyses, the transferability of the results to other sites is inconclusive, contrary to the authors' claims. Miraki et al. (2019) used machine learning and considered 12 inputs to train a classifier in predicting a GWP map. However, the model's transferability to other sites may not be a straightforward consequence of the study. Pham et al. (2019) claim that their GWP mapping method is generically applicable to other areas worldwide. Like similar studies, the input GIS data sets are mentioned and described, but not geo-statistically characterized. Chen et al. (2020) used 16 inputs and did not report the uncertainties or errors associated with the data sets. These studies commonly do not provide a statistical characterization of the input data sets. Instead, the learning algorithms implicitly address any inherent errors or uncertainties in the training data. The framework to assess the reliability and transferability of the generated maps did not exist until now. Thus, a scientific assessment of the reliability and transferability of machine learning approaches can now be made through the framework we propose.
Although we opted to use the FR method in this study to demonstrate the noise-resilience horizon, the framework should be applicable for other GWP estimation methods. The choice of the FR (Díaz-Alcaide and Martínez-Santos, 2019) is made in consideration of eliminating as much artificial bias originating from the subjective assessments present in other semi-quantitative methods such as the MIF (Nasir et al., 2018), AHP (Sahoo et al., 2015), and simple additive weight (SAW) (Abrams et al., 2018). Previous GWP studies typically assumed that the input map layers are clean. Also, there was possibly no systematic way to investigate the impact of inherent uncertainties from maps with various data types (e.g., categorical or numerical). Without loss of applicability, we chose the FR method, mainly because it is data-driven and thus minimizes the degree of subjective biases as in the case of AHP, which rely on human-expert judgment to attribute weights of importance to input factors.
The present study applied dithering as an approach to implementing uncertainty analysis on generated GWP maps. We established the dithering level, which is between 20 and 25%, at which the GWP map's accuracy remains at a preferably high  level (AUC > 80%). This noise-resilience horizon is the limit of the noise level, which the input maps may contain so that the GWP map achieves an accuracy level. This horizon may vary for different methods of generating GWP due to the differences in the sensitivity to the input factors, such as what Oh et al. (2011) found in their application of the FR method to analyze the GWP in Pohang, South Korea. Kleinen (1994) discussed the complementarity between sensitivity and uncertainty analysis extensively.
The statistical uncertainties of the measured input factors for determining the GWP are implicit in several studies that relate GWP to groundwater recharge. Herrera-Pantoja and Hiscock (2007) employed a stochastic weather generator in their estimates of groundwater recharge. The stochastic approach thus assumes an inherent random uncertainty from the generated input values. Szymczycha et al. (2017) used the recharge temperature as a proxy for the GWP interpreted as the aquifer recharge rate. The calculation of the recharge temperature hinged on several assumptions, the uncertainties of which could significantly alter the estimates within and among sites. The authors observed a range of estimated recharge temperatures as wide as 13.5%, which is well within the noise-resilience horizon we found. In other studies, the statistical uncertainties originated from the assumed measurable factors: e.g., concentrations of water contaminants (Kourgialas et al., 2018) or heat-exchange rates (Shrestha et al., 2018). Valois et al. (2018) applied sounding measurements to map the groundwater reserves in an area. The authors noted multiple sources of uncertainties in the obtained sounding measurements that could affect the accuracy of the GWP representations. Indeed, quantifying the extent of these uncertainties will be valuable for estimating the noise-resilience horizon for this survey-type method of GWP mapping. Therefore, a conceptual framework for accounting the statistical uncertainties of the input factors needed to determine GWP is necessary to objectively qualify any conclusions originating from GWP maps concerning the search for untapped potential groundwater sources.
Small islands in the Pacific Ring of Fire have complex hydrogeological structures, unlike landlocked areas, arid regions, or most coastal zones of continents or big islands. The complexity of these groundwater-relevant structures is commonly a consequence of the prevalence of volcanic aquifers (Margat and Van der Grun, 2013, p. 58). Cebu and Mactan islands are found well within this volcanically and tectonically active zone in the Pacific. The challenge of applying GWP mapping in these settings will be the degree of uncertainty in the measurement of thematic factors due to geologic features that could influence instrument measurements. For example, Valois et al. (2018) reported multiple sources of uncertainty in their transient electromagnetic sounding measurements arising from the geologic fractures or folds, which are likewise typical in islands with active tectonic features. It will not be surprising that most of the GWP estimation methods found to be useful in such settings as the northeastern region of the Indian subcontinent (Sahoo et al., 2015), or the eastern Arabian peninsula (Abrams et al., 2018), will not work on the hydrogeological complexity of small islands.
Most studies on GWP maps utilized only ROC to assess binary classification performance. Here we used the PR curve to provide confirmation (Lever et al., 2016), mainly because the data set of existing wells exhibits class imbalance. The preponderance of wells in urban centers, such as in the area flanking the strait between Cebu and Mactan islands, manifests this imbalance. The consistently high AUC from both ROC and PR curves, nevertheless, confirms the GWP map's generalization performance. The ROC alone would have provided a misleading indication of this performance. For instance, the cluster of wells may be a result of factors that are unique to the areas in which it exists. As a result, the precision is only high over a small interval of recall values starting at zero. In our results, the high AUC from the ROC curve corroborates with a similarly high AUC from the PR curve. The utilization of both curves will be an instructive part of the process of generating GWP maps.
The availability of a substantial data set of existing well locations contributed to the GWP map's consistent, satisfactory performance. Without an extensive data set, the output GWP map may perform satisfactorily on the training data, but not as well on the validation data set (Guru et al., 2017)-a case known as overfitting. We had access to the coordinates of existing wells scattered across the study area (N = 1,204). Thus, in our results, the AUC values are consistently higher on the validation data set than on the training data set. The size and territorial scope of the well-location data set will influence the statistical reliability of determining the noise-resilience horizon. Extensive prior ground-truth surveys of existing wells across the study area could curb the limitations imposed by small data sets on the GWP map's predictive value.
Although the number of wells in past GWP mapping studies that utilized machine learning is rather small: N = 66 (Chen et al., 2019) or N = 279 (Naghibi et al., 2019), it will also be interesting to apply the noise-resilience horizon approach to test the limits of such artificial-intelligence models (Miraki et al., 2019;Pham et al., 2019;Chen et al., 2020). Machinelearning techniques are recently gaining traction in the academic discourse on GWP mapping. The need to examine the impact of the inherent uncertainty in the input GIS data is imperative to any rigorous efforts to represent the unseen GWP with measurable or measured spatiotemporal variables.
The implementation of ground-truth validation may benefit from the interpretation of false-positive detections as opportunities for groundwater development. As defined in this study, the false positives represent the absence of wells in sites where the GWP map predicts high GWP. Where the GWP map predicts a zone of these false positives is especially a reasonable candidate for prospecting. The exploration of groundwater sources has been driving worldwide interest because rapid urbanization and climate change exacerbate an impending future shortage.

CONCLUSION
The quality and reliability of GWP maps are essential to its role as a decision-support tool. The methods of generating such maps have operated on the assumption that the input data sets are 100% accurate. However, in reality, such data sets could carry issues related to the limitations of the measuring methods applied to obtain them. Establishing a framework that could propagate the input-variable uncertainties onto the GWP map output was our study's central aim. We showed using a dithering approach that GWP maps may only be considered adequately reliable, within the limitations of our research, if the input uncertainty levels do not exceed 25%. This finding will help assess the accuracy levels of the input maps before generating a GWP map, especially for groundwater management planning purposes.
The noise-resilience horizon is a semi-theoretical framework for delineating the limits to GWP estimation methods as imposed by the input-factor statistical uncertainties. The limitation of the present work is that we only showed single instances of the dithered maps at each noise injection level. It will be interesting if the same conclusions will hold for the average of an ensemble of random realizations for dither probability. Also, it will be interesting to see how the ROC and PR curves vary across a denser range dither probabilities. The exact position of the horizon may be found by considering finer sweeps of this parameter. The conversion of detected opportunities for groundwater development and protection into focused, economically feasible prospecting is the value that GWP maps may provide especially for sustainable groundwater resource management.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: https://doi.org/10.5061/dryad.c866t1g4c.

AUTHOR CONTRIBUTIONS
FC and CP designed the methods and gathered the results. DJ analyzed and interpreted the data. All authors wrote and agreed to the final version of the manuscript.