<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Water</journal-id>
<journal-title>Frontiers in Water</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Water</abbrev-journal-title>
<issn pub-type="epub">2624-9375</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/frwa.2024.1332888</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Water</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Fuzzy C-Means clustering for physical model calibration and 7-day, 10-year low flow estimation in ungaged basins: comparisons to traditional, statistical estimates</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>DelSanto</surname> <given-names>Andrew</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2567619/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Palmer</surname> <given-names>Richard N.</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/2622519/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Andreadis</surname> <given-names>Konstantinos</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/655490/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>
<aff><institution>Department of Civil and Environmental Engineering, University of Massachusetts</institution>, <addr-line>Amherst, MA</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Omid Chatrabgoun, Coventry University, United Kingdom</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Babak Jamshidi, King&#x00027;s College London, United Kingdom</p>
<p>Mohammad Bashirgonbad, University of Malayer, Iran</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Andrew DelSanto <email>adelsanto&#x00040;umass.edu</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>26</day>
<month>01</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>6</volume>
<elocation-id>1332888</elocation-id>
<history>
<date date-type="received">
<day>03</day>
<month>11</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>09</day>
<month>01</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2024 DelSanto, Palmer and Andreadis.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>DelSanto, Palmer and Andreadis</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>In the northeast U.S., resource managers commonly apply 7-day, 10-year (7Q10) low flow estimates for protecting aquatic species in streams. In this paper, the efficacy of process-based hydrologic models is evaluated for estimating 7Q10s compared to the United States Geological Survey&#x00027;s (USGS) widely applied web-application StreamStats, which uses traditional statistical regression equations for estimating extreme flows. To generate the process-based estimates, the USGS&#x00027;s National Hydrologic Modeling (NHM-PRMS) framework (which relies on traditional rainfall-runoff modeling) is applied with 36 years of forcings from the Daymet climate dataset to a representative sample of ninety-four unimpaired gages in the Northeast and Mid-Atlantic U.S. The rainfall-runoff models are calibrated to the measured streamflow at each gage using the recommended NHM-PRMS calibration procedure and evaluated using Kling-Gupta Efficiency (KGE) for daily streamflow estimation. To evaluate the 7Q10 estimates made by the rainfall-runoff models compared to StreamStats, a multitude of error metrics are applied, including median relative bias (cfs/cfs), Root Mean Square Error (RMSE) (cfs), Relative RMSE (RRMSE) (cfs/cfs), and Unit-Area RMSE (UA-RMSE) (cfs/mi<sup>2</sup>). The calibrated rainfall-runoff models display both improved daily streamflow estimation (median KGE improving from 0.30 to 0.52) and 7Q10 estimation (smaller median relative bias, RMSE, RRMSE, and UA-RMSE, especially for basins larger than 100 mi<sup>2</sup>). The success of calibration is extended to ungaged locations using the machine learning algorithm Fuzzy C-Means (FCM) clustering, finding that traditional K-Means clustering (FCM clustering with no fuzzification factor) is the preferred method for model regionalization based on (1) Silhouette Analysis, (2) daily streamflow KGE, and (3) 7Q10 error metrics. 
The optimal rainfall-runoff models created with clustering show improvement for daily streamflow estimation (a median KGE of 0.48, only slightly below that of the calibrated models at 0.52); however, their 7Q10 error metrics are similar to those of the uncalibrated models, and neither set of models improves on the statistical estimates. Results suggest that the rainfall-runoff models calibrated to measured streamflow data provide the best 7Q10 estimation in terms of all error metrics except median relative bias; among the methods applicable to ungaged locations, however, the statistical estimates from StreamStats display the lowest error metrics in every category.</p></abstract>
<kwd-group>
<kwd>hydrology</kwd>
<kwd>machine learning</kwd>
<kwd>physical modeling</kwd>
<kwd>low flow estimation</kwd>
<kwd>prediction in ungaged basins</kwd>
</kwd-group>
<counts>
<fig-count count="10"/>
<table-count count="7"/>
<equation-count count="6"/>
<ref-count count="67"/>
<page-count count="20"/>
<word-count count="10974"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Water and Artificial Intelligence</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Estimating the recurrence of low-flow events on rivers and streams is necessary for municipal, industrial, and agricultural planning, as well as for considering water quality, energy production, and species habitat (Smakhtin, <xref ref-type="bibr" rid="B58">2001</xref>; Blum et al., <xref ref-type="bibr" rid="B8">2019</xref>). Resource managers in the Northeast United States apply low flow statistics, such as the 7-day, 10-year low flow (7Q10), to establish environmental flows to protect aquatic species. The 7Q10 is defined as an estimate of the lowest streamflow for 7 consecutive days that occurs, on average, once every 10 years (EPA Office of Water, <xref ref-type="bibr" rid="B21">2018</xref>). At gaged locations, the 7Q10 is calculated by fitting an extreme value distribution to the measured record and estimating the lowest 7-day average flow that recurs, on average, once every 10 years (EPA Office of Water, <xref ref-type="bibr" rid="B21">2018</xref>). These calculated 7Q10 values are often extended to other locations within a basin through flow scaling. Flow scaling is typically recommended only for locations whose watershed area is 0.5 to 1.5 times the gaged area (Asquith and Thompson, <xref ref-type="bibr" rid="B1">2008</xref>). For locations more distant from a gage, and for all locations on streams without a stream gage, another method is required for 7Q10 estimation.</p>
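<p>As an illustration of flow scaling, the following is a minimal sketch that assumes a simple linear drainage-area ratio (an exponent of 1), which is one common form of the technique and not necessarily the form used in any particular application. The function name <monospace>scale_7q10</monospace> and its arguments are hypothetical. The function also reports whether the area ratio falls inside the 0.5 to 1.5 range noted above.</p>
<preformat>
```python
def scale_7q10(q7q10_gaged, area_gaged, area_target, ratio_bounds=(0.5, 1.5)):
    """Scale a gaged 7Q10 to a nearby site using a drainage-area ratio.

    Assumption: linear scaling (exponent of 1). Other exponents are possible
    and are not covered by this sketch.
    Returns (scaled 7Q10, True if the area ratio is inside ratio_bounds).
    """
    if not (area_gaged > 0 and area_target > 0):
        raise ValueError("drainage areas must be positive")
    ratio = area_target / area_gaged
    lower, upper = ratio_bounds
    # Flag whether the target site is close enough in size to the gaged basin
    within_range = ratio >= lower and upper >= ratio
    return q7q10_gaged * ratio, within_range


# Hypothetical example: gaged 7Q10 of 10 cfs at 100 mi2, target site at 120 mi2
scaled, ok = scale_7q10(10.0, 100.0, 120.0)
```
</preformat>
<p>In this hypothetical example the area ratio is 1.2, so the scaled value is flagged as within the recommended range. A target site of 200 mi<sup>2</sup> would be flagged as outside it, indicating that another estimation method is needed.</p>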
<p>The two most common techniques to estimate long-term low flows at ungaged locations are statistical regression modeling and process-based hydrologic modeling. Statistical regression models transfer information from gaged locations with measured historical data to an ungaged location of interest (e.g., Ries et al., <xref ref-type="bibr" rid="B51">2008</xref>). For example, the 7Q10 is calculated at all gaged locations in a homogenous area, and a regression equation is fitted to the 7Q10s using generally available predictors such as a watershed&#x00027;s physical attributes (basin area, elevation, soil type, and/or other features). The developed regression can then be used to estimate 7Q10s at other locations in the homogenous area where data on the predictor variables are available (Worland et al., <xref ref-type="bibr" rid="B66">2018</xref>). Common applications of this methodology include the USGS&#x00027; web-application StreamStats (Ries et al., <xref ref-type="bibr" rid="B51">2008</xref>) and a module in the EPA&#x00027;s desktop program BASINS (US EPA, <xref ref-type="bibr" rid="B63">2019</xref>). In contrast, process-based hydrologic modeling uses complex physical equations that describe the variability in water storage and fluxes, essentially solving the water, mass, and energy balance to create streamflow data (e.g., Berghuijs et al., <xref ref-type="bibr" rid="B7">2016</xref>). One common example of a process-based hydrologic model is a rainfall-runoff model, which can be used to simulate daily or sub-daily streamflow data. The 7Q10 can then be calculated from the simulated streamflow data rather than measured streamflow data (e.g., Siddique et al., <xref ref-type="bibr" rid="B57">2020</xref>).</p>
<p>In practice, the statistical regression models described above are most often used by resource managers to estimate 7Q10s in ungaged locations. The associated regression equations rely on relative &#x0201C;stationarity,&#x0201D; the assumption that the statistical properties of streams do not change over time. Recent studies suggest that anthropogenic changes (land cover, water withdrawal, and climate change) that impact hydrologic processes may violate this assumption (Milly et al., <xref ref-type="bibr" rid="B47">2008</xref>; Bayazit, <xref ref-type="bibr" rid="B4">2015</xref>; Salas et al., <xref ref-type="bibr" rid="B55">2018</xref>; Blum et al., <xref ref-type="bibr" rid="B8">2019</xref>; Hesarkazzazi et al., <xref ref-type="bibr" rid="B32">2021</xref>). For instance, Williams et al. (<xref ref-type="bibr" rid="B65">2022</xref>) estimate that the southwestern United States is experiencing its driest 22-year period since 800 A.D., with approximately 20% of the drought attributed to recent anthropogenic changes. In contrast, recent studies in the Northeast United States have found that both average baseflows and 7-day summer baseflows are increasing with statistical significance (Hodgkins and Dudley, <xref ref-type="bibr" rid="B33">2011</xref>; Ayers et al., <xref ref-type="bibr" rid="B3">2022</xref>). In the Mid-Atlantic, Blum et al. (<xref ref-type="bibr" rid="B8">2019</xref>) found increasing 7Q10s in the northern part of the region (New York, Pennsylvania) and decreasing 7Q10s in the southern part (Virginia, Maryland). In addition, the authors found that, when a trend in the annual low flows is detected, using only the most recent 30 years of the streamflow record reduces error and bias in 7Q10 estimators compared to using the full record (Blum et al., <xref ref-type="bibr" rid="B8">2019</xref>). 
This result is significant because it implies that anthropogenic changes may be affecting 7Q10s and that statistical models relying on long-term stationarity fail to account for these changes.</p>
<p>Because regression models inherently rely on stationarity, there has been a renewed interest in improving process-based hydrologic modeling when estimating current and future extreme low flows. For example, because of recent extremely dry conditions in the western U.S., the California Department of Water Resources (DWR) concluded that: &#x0201C;The significant overestimation in DWR&#x00027;s spring 2021 forecasts of snowmelt runoff forecasts illustrate the importance of shifting away from statistical approaches that rely on a historical record no longer reflective of observed conditions, including the need to invest in the data to support better forecasting. DWR is transitioning to physically based watershed [rainfall-runoff] models that have the capability to include a changing climate and to use gridded data sets, including remotely sensed snowpack observations&#x0201D; (California Department of Water Resources, <xref ref-type="bibr" rid="B12">2021</xref>). The continued interest in process-based hydrologic modeling has encouraged federal agencies charged with natural resource management to create and update national hydrologic databases that can be used to facilitate the implementation of rainfall-runoff models. The National Oceanic and Atmospheric Administration (NOAA) continues to develop and improve the National Water Model (National Water Model: Improving NOAA&#x00027;s Water Prediction Services) for short- and long-range forecasts, vulnerability assessments, and parameter sensitivity analyses (e.g., El Gharamti et al., <xref ref-type="bibr" rid="B19">2021</xref>). The United States Geological Survey (USGS) has also developed the National Hydrologic Modeling framework using their rainfall-runoff modeling software, the Precipitation Runoff Modeling System (National Hydrologic Model Infrastructure: NHM-PRMS). 
The base version of PRMS has been used to predict future shifts in winter streamflow in southern Ontario (Champagne et al., <xref ref-type="bibr" rid="B14">2020</xref>), analyze how changing river network synchrony affects high flows (Rupp et al., <xref ref-type="bibr" rid="B54">2021</xref>), and evaluate climate change impacts in an agricultural valley irrigated with snowmelt runoff in Nevada and northern California (Kitlasten et al., <xref ref-type="bibr" rid="B37">2021</xref>). The NHM version of PRMS has been used for simulation of water availability in the Southeastern United States for historical and potential future climate and land-cover conditions (LaFontaine et al., <xref ref-type="bibr" rid="B39">2019</xref>), modeling surface-water depression storage in a Prairie Pothole region (Hay et al., <xref ref-type="bibr" rid="B30">2018</xref>), and quantifying spatiotemporal variability of watershed scale surface-depression storage and runoff in the U.S. (Driscoll et al., <xref ref-type="bibr" rid="B15">2020</xref>).</p>
<p>Although process-based hydrologic models depict the complex physical processes that affect streamflow, they present their own unique challenges. For rainfall-runoff models to inform managers in the planning process, accurate and long-term historic data must be available for calibration and validation. Calibration/verification is an iterative process that includes determining the interdependence and correlation between model variables in order to estimate the values of input variables that are hard to characterize accurately (Boyle et al., <xref ref-type="bibr" rid="B10">2000</xref>; Duan, <xref ref-type="bibr" rid="B16">2003</xref>). Most rainfall-runoff models rely on some such variables (e.g., groundwater depths, soil porosity, and underground soil types), making calibration an important step in the rainfall-runoff modeling process (Gupta and Waymire, <xref ref-type="bibr" rid="B29">1998</xref>). Without measured streamflow data for comparison, an uncalibrated rainfall-runoff model can generate results with unknown errors, requiring additional verification that the model is working as intended. Calibration directly to measured streamflow can improve model performance, but in the absence of measured streamflow data, it is difficult to verify (1) that a rainfall-runoff model is properly simulating every step of the water budget, and (2) that the streamflow estimates provided by the model are accurate enough for decision-making.</p>
<p>In watersheds that lack stream gages, hydrologic models can infer model parameters using data from similar catchments for which observations are available, known as parameter regionalization (Hrachowitz et al., <xref ref-type="bibr" rid="B35">2013</xref>). This is achieved by transferring catchment parameters from locations with measured data to an ungaged location of interest (Brunner et al., <xref ref-type="bibr" rid="B11">2021</xref>). Regression is one of the main methods for hydrologic regionalization (Guo et al., <xref ref-type="bibr" rid="B27">2021</xref>). Many recent studies document the successful application of regression-based methods for hydrologic regionalization, including regional prediction of flow-duration curves using three-dimensional kriging (Castellarin, <xref ref-type="bibr" rid="B13">2014</xref>) and the combination of regression and spatial proximity for catchment model regionalization (Steinschneider et al., <xref ref-type="bibr" rid="B60">2014</xref>). Additionally, with continued access to improved data and computational power, machine learning algorithms have been increasingly utilized for hydrologic applications (Kratzert et al., <xref ref-type="bibr" rid="B38">2019</xref>). Machine learning has been extensively tested for hydrologic regionalization in recent years, including the application of a genetic algorithm for annual runoff estimation in ungaged basins (Hong et al., <xref ref-type="bibr" rid="B34">2017</xref>), the regionalization of hydrological model parameters using gradient boosting machine learning (Song et al., <xref ref-type="bibr" rid="B59">2022</xref>), and robust regionalization using deep learning for a global hydrologic model (Li et al., <xref ref-type="bibr" rid="B40">2022</xref>). However, few papers test regionalization using machine learning for both daily streamflow and extreme flow estimation. Golian et al. 
(<xref ref-type="bibr" rid="B26">2021</xref>) document the use of K-Nearest-Neighbors (KNN) and statistical methods for predicting low, average, and high flow quantiles, finding that &#x0201C;Regionalization was least satisfactory for low flows.&#x0201D; For resource managers who may assume that regionalization using machine learning can be used to calibrate their models, the distinction between &#x0201C;successful&#x0201D; model calibration using regionalization and a model&#x00027;s ability to estimate low flows must be further studied and documented.</p>
<p>This study&#x00027;s objective is to test whether a regionally calibrated, process-based hydrologic model can provide better estimates of 7Q10 flows than common statistical methods. Future updates to the NWM and NHM will make it possible to quickly create uncalibrated rainfall-runoff models at virtually any location on a stream in the U.S., and application of these models may prove to be attractive to individuals seeking a process-based model for low flow estimation. Although there is substantial research on how process-based models will perform for daily streamflow estimation in ungaged basins, there is a paucity of research on how they perform for specific use-cases like extreme low-flow and/or 7Q10 estimation, especially against commonly applied statistical methods. Farmer et al. (<xref ref-type="bibr" rid="B22">2019</xref>) tested a procedure that used statistical at-site streamflow to calibrate the NHM in ungaged basins, finding that their models performed within 23% of rainfall-runoff models calibrated to daily streamflows at the same locations. However, the authors note that their initial results suggest these models may not reproduce both low and high streamflow magnitudes, and that further research should be conducted to examine this (Farmer et al., <xref ref-type="bibr" rid="B22">2019</xref>). As noted in the previous paragraph, the authors of Golian et al. (<xref ref-type="bibr" rid="B26">2021</xref>) tested their hydrologic model for both daily (median) flows and extreme flows, finding that their machine learning-based regionalization was least satisfactory for low flows when compared to both average and high flows. In this paper, the ability of process-based models to estimate 7Q10s is evaluated against open-source statistical 7Q10 estimates. 
Regardless of their success for average and high flows, if the process-based models provide lower errors for 7Q10 estimation than statistical estimates at the same locations, this will help managers justify the use of process-based models over traditional statistical models for 7Q10 estimation. For this analysis, 94 uncalibrated rainfall-runoff models from the USGS&#x00027; National Hydrologic Modeling network at unimpaired, gaged locations in the Northeast and Mid-Atlantic United States are used to generate the process-based estimates. The uncalibrated rainfall-runoff models generate daily streamflow data, which can be used to calculate 7Q10 values. Each model is then calibrated to the measured streamflow data at its location using the USGS&#x00027; auto-calibration software LUCA (Hay and Umemoto, <xref ref-type="bibr" rid="B31">2007</xref>), generating new daily streamflow data and new 7Q10s. To extend calibration to ungaged locations without measured streamflow data, the adaptive machine learning algorithm Fuzzy C-Means (FCM) clustering (Dunn, <xref ref-type="bibr" rid="B18">1973</xref>) is used for parameter regionalization to re-calibrate the models at each location. This clustering algorithm was selected because similar studies have demonstrated success in using it for regionalization of rainfall-runoff model parameters (e.g., Mosavi et al., <xref ref-type="bibr" rid="B48">2021</xref>). Each process-based model (uncalibrated, calibrated, FCM) is then evaluated for its ability to estimate daily streamflow and 7Q10s. This process is summarized in <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Summary of this study&#x00027;s experimental design.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0001.tif"/>
</fig>
<p>The results from this experiment will answer the following questions:</p>
<list list-type="simple">
<list-item><p>(1) Do uncalibrated rainfall-runoff models, created using extractions from the USGS&#x00027; National Hydrologic Modeling framework, perform comparably to regression-based 7Q10 estimation?</p></list-item>
<list-item><p>(2) Can models calibrated using Fuzzy C-Means clustering provide improved 7Q10 estimation compared to publicly available regression models?</p></list-item>
</list>
</sec>
<sec id="s2">
<title>2 Data and study area</title>
<p>The following sections describe the study area (Section 2.1) and data used in this research (Section 2.2).</p>
<sec>
<title>2.1 Study area</title>
<p>The study area for this analysis is the Northeast United States, including the states of Maine, New Hampshire, Vermont, Massachusetts, Rhode Island, Connecticut, New York, Pennsylvania, New Jersey, Delaware, Maryland, Virginia, and West Virginia. This area is roughly 260,000 square miles and is not homogenous, as it covers two distinct Hydrologic Unit Code (HUC) regions of the U.S. (Seaber et al., <xref ref-type="bibr" rid="B56">1987</xref>). Basins selected for this study have been defined as &#x0201C;unimpaired&#x0201D; in USGS&#x00027;s Hydro-Climatic Data Network, HCDN-2009 (Lins, <xref ref-type="bibr" rid="B41">2012</xref>). This set of stream gages includes 94 watersheds of varying size and physical attributes. The basins range in area from 2.1 mi<sup>2</sup> to 1,419 mi<sup>2</sup> (<xref ref-type="fig" rid="F2">Figure 2</xref>).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>The 94 unimpaired gaged basins in the Northeast United States.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0002.tif"/>
</fig>
</sec>
<sec>
<title>2.2 Data</title>
<sec>
<title>2.2.1 Streamflow and 7Q10 data</title>
<p>Streamflow data from these 94 gages were downloaded from the USGS&#x00027;s Current Water Data for the Nation (<ext-link ext-link-type="uri" xlink:href="https://waterdata.usgs.gov/nwis/rt">https://waterdata.usgs.gov/nwis/rt</ext-link>). For this experiment, the full record of streamflow was used to calculate the 7Q10 at each site. The &#x0201C;fasstr&#x0201D; R package (<ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/fasstr/index.html">https://cran.r-project.org/web/packages/fasstr/index.html</ext-link>) was used to calculate the 7Q10 directly from the daily streamflow data. This package applies a quantile distribution to daily streamflow data, allowing for the efficient calculation of low-flow frequency metrics, including the 7Q10. These 7Q10 values were identical to those calculated by the USGS at each site, as presented in the USGS&#x00027;s StreamStats Data-Collection Station reports (<ext-link ext-link-type="uri" xlink:href="https://streamstatsags.cr.usgs.gov/">https://streamstatsags.cr.usgs.gov/</ext-link>).</p>
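<p>For readers unfamiliar with the calculation, the general idea can be sketched as follows. This is a minimal illustration, not a reproduction of the fasstr or USGS procedure: it assumes calendar-year grouping and a log-normal fit to the annual minimum 7-day flows, both chosen only for simplicity, and the function names are hypothetical. It uses only the Python standard library.</p>
<preformat>
```python
import math
from statistics import NormalDist, mean, stdev


def annual_min_7day(flows_by_year):
    """Map each year to its minimum 7-day moving-average daily flow."""
    minima = {}
    for year, flows in flows_by_year.items():
        if len(flows) >= 7:
            # Average every window of 7 consecutive days within the year
            windows = (sum(flows[i:i + 7]) / 7.0 for i in range(len(flows) - 6))
            minima[year] = min(windows)
    return minima


def estimate_7q10(flows_by_year):
    """Estimate the 7Q10 as the 10% non-exceedance quantile.

    Assumption: annual minima follow a log-normal distribution. Requires at
    least two years of data and strictly positive annual minima.
    """
    minima = annual_min_7day(flows_by_year)
    logs = [math.log(q) for q in minima.values()]
    z = NormalDist().inv_cdf(0.10)  # standard normal quantile for p = 0.10
    return math.exp(mean(logs) + z * stdev(logs))
```
</preformat>
<p>Here <monospace>flows_by_year</monospace> is a hypothetical dictionary mapping each year to a list of daily mean flows (cfs). A probability of 0.10 corresponds to a 10-year recurrence interval for annual minima.</p>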
</sec>
<sec>
<title>2.2.2 Process-based hydrologic models</title>
<p>Rainfall-runoff models for each of the 94 gaged locations were extracted from the USGS&#x00027;s National Hydrologic Model version of the Precipitation Runoff Modeling System (NHM-PRMS). The USGS National Hydrologic Model (NHM) infrastructure was developed to support the efficient creation of local, regional, and national-scale hydrologic models for the United States (Regan et al., <xref ref-type="bibr" rid="B50">2019</xref>). These models incorporate data stored in the NHM, including the basin and subbasin landcover values and area-weighted average climate forcings required to run PRMS. Selecting a location or gage that is a point-of-interest in the NHM generates a ready-to-run rainfall-runoff model at that location, with all necessary variables being extracted for the basin of interest, including land data (area, elevation, landcover) and the corresponding climate data. Climate forcing dataset choices include Daymet (1980&#x02013;2016) (Thornton et al., <xref ref-type="bibr" rid="B62">2016</xref>), Maurer (1949&#x02013;2010) (Maurer et al., <xref ref-type="bibr" rid="B45">2002</xref>), and Livneh (1915&#x02013;2015) (Livneh and National Center for Atmospheric Research Staff, <xref ref-type="bibr" rid="B42">2019</xref>) in the form of basin area-weighted precipitation and temperature timeseries. For this analysis, the Daymet climate dataset was chosen because it offers the finest resolution of the three (1 km, as opposed to 6 km for Livneh and &#x0007E;12 km for Maurer) and allows for trend analysis, as there are no temporal discontinuities.</p>
<p>Additionally, the NHM-PRMS makes several major assumptions to model hydrologic processes. These include:</p>
<list list-type="order">
<list-item><p>Dividing basins into subbasins using pre-determined Hydrologic Response Units (HRUs) from the USGS&#x00027; Geospatial Fabric (Bock et al., <xref ref-type="bibr" rid="B9">2020</xref>).</p></list-item>
<list-item><p>Using a daily time-step, whereas some other models use finer time-steps.</p></list-item>
<list-item><p>Calculating evapotranspiration using the Jensen-Haise (JH) formulation (Jensen and Haise, <xref ref-type="bibr" rid="B36">1963</xref>).</p></list-item>
</list>
<p>The assumptions above are specific to the NHM-PRMS, but this does not imply that the results of this study are specific to this hydrologic modeling software. The HRUs from the NHM-PRMS are based on pre-determined areas of homogeneity. Though some other models utilize gridded landcover data, all of the gridded values that would fall within an HRU should be similar to the value used for that HRU from the NHM-PRMS. Using a daily time-step may produce inaccurate high-flow estimates, because statistics such as the 100-year flood are calculated from 15-min gage data when available (England et al., <xref ref-type="bibr" rid="B20">2019</xref>), but the 7Q10 is always calculated from daily average streamflows. Calculating evapotranspiration using the JH formulation may have an impact on results, but steps have been taken to minimize that impact. JH calculates evapotranspiration based on temperature for each HRU. In this study, the Daymet climate dataset is used, which provides the finest resolution of the available climate datasets with no discontinuities. Additionally, extensive model calibration is used to minimize the impact of the JH formulation. JH is calibrated during its own step of the standard NHM calibration procedure, which is further described in Section 3.1.</p>
</sec>
<sec>
<title>2.2.3 StreamStats 7Q10 estimates</title>
<p>To compare 7Q10 estimates from the process-based hydrologic models to current statistical estimates, the USGS&#x00027;s statistical estimation program StreamStats is used. StreamStats was chosen for comparison because (1) it is widely utilized by resource managers in the study area, (2) it provides direct comparisons, and (3) it utilizes a statistical methodology for estimation, as opposed to another process-based methodology. Without estimating daily streamflow, this program uses multiple linear regression equations, derived in log-space, to directly estimate flow statistics (Ries et al., <xref ref-type="bibr" rid="B51">2008</xref>). Though the input variables vary by state, the typical process is as follows:</p>
<list list-type="order">
<list-item><p>Calculate the historic 7Q10 at various gaged locations in a homogenous hydrologic area.</p></list-item>
<list-item><p>Collect the physical characteristics (watershed area, elevation, slope, etc.) for each of the watersheds attributed to the gages used above.</p></list-item>
<list-item><p>Fit a multiple linear regression, in log-space, to relate the input variables (watershed area, elevation, slope, etc.) to the corresponding 7Q10 value.</p></list-item>
<list-item><p>Delineate the watershed that is attributed to the ungaged location of interest.</p></list-item>
<list-item><p>Calculate the physical characteristics of the delineated watershed.</p></list-item>
<list-item><p>Apply the physical characteristics from the ungaged, delineated watershed to the regression equation developed in step 3 to calculate the 7Q10.</p></list-item>
</list>
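<p>Steps 3 and 6 above can be illustrated with a minimal sketch. It assumes a single predictor (drainage area) and ordinary least squares, whereas the StreamStats equations use several explanatory variables that vary by state. The function names and the basin values are hypothetical, and no retransformation bias correction is applied.</p>
<preformat>
```python
import math


def fit_log_regression(areas, q7q10s):
    """Fit log10(7Q10) = intercept + slope * log10(area) by least squares."""
    x = [math.log10(a) for a in areas]
    y = [math.log10(q) for q in q7q10s]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict_7q10(intercept, slope, area):
    """Apply the fitted equation and back-transform from log-space."""
    return 10 ** (intercept + slope * math.log10(area))


# Hypothetical gaged basins: drainage area (mi2) and historic 7Q10 (cfs)
gaged_areas = [10.0, 50.0, 200.0, 800.0]
gaged_7q10s = [0.5, 3.1, 14.0, 60.0]
intercept, slope = fit_log_regression(gaged_areas, gaged_7q10s)

# Estimate the 7Q10 for a hypothetical ungaged watershed of 120 mi2
estimate = predict_7q10(intercept, slope, 120.0)
```
</preformat>
<p>Fitting in log-space keeps predicted flows positive and treats errors as proportional rather than absolute, which suits flow statistics that span several orders of magnitude across basins.</p>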
<p>StreamStats uses different regression equations and explanatory variables for Massachusetts (Ries, <xref ref-type="bibr" rid="B52">2000</xref>), Rhode Island (Bent et al., <xref ref-type="bibr" rid="B6">2014</xref>), New Hampshire (Flynn and Tasker, <xref ref-type="bibr" rid="B24">2002</xref>), Maine (Dudley, <xref ref-type="bibr" rid="B17">2004</xref>), Pennsylvania (Stuckey, <xref ref-type="bibr" rid="B61">2006</xref>), Virginia (Austin et al., <xref ref-type="bibr" rid="B2">2011</xref>), and West Virginia (Wiley, <xref ref-type="bibr" rid="B64">2008</xref>). StreamStats 7Q10 estimates have not been developed for Connecticut, Delaware, Maryland, New Jersey, New York, or Vermont, which partially limits later comparisons.</p>
</sec>
</sec>
</sec>
<sec id="s3">
<title>3 Methodology</title>
<p>In the following section, all methods used in this research are described in detail. This includes calibration of the NHM-PRMS models (Section 3.1), Fuzzy C-Means clustering for regionalization (Section 3.2), Silhouette Analysis for evaluating the optimal number of clusters (Section 3.3), and the evaluation metrics used for both the daily streamflow models and 7Q10 estimates (Section 3.4).</p>
<sec>
<title>3.1 Calibration of the NHM-PRMS models</title>
<p>Models of the 94 basins were calibrated using the procedure recommended in the PRMS IV Manual (Markstrom et al., <xref ref-type="bibr" rid="B44">2015</xref>) by executing the USGS&#x00027;s automated calibration software LUCA (Hay and Umemoto, <xref ref-type="bibr" rid="B31">2007</xref>). This procedure applies a multi-objective, multi-step process of continuously revising sub-basin parameters. To achieve calibration, parameters are varied individually for each of the 94 locations, with the objective of minimizing the difference between the simulated daily streamflow and the measured daily streamflow at each gage. The parameters recommended for calibration in PRMS are summarized in <xref ref-type="table" rid="T1">Table 1</xref>, along with their default values and recommended calibration bounds (Markstrom et al., <xref ref-type="bibr" rid="B44">2015</xref>).</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Parameters calibrated in the process-based hydrologic model.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Variable</bold></th>
<th valign="top" align="left"><bold>Description</bold></th>
<th valign="top" align="left"><bold>Units</bold></th>
<th valign="top" align="left"><bold>Default value</bold></th>
<th valign="top" align="left"><bold>Lower bound</bold></th>
<th valign="top" align="left"><bold>Upper bound</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">dday_intcp</td>
<td valign="top" align="left">Monthly (January to December) intercept in degree-day equation</td>
<td valign="top" align="left">dday</td>
<td valign="top" align="left">&#x02212;40</td>
<td valign="top" align="left">&#x02212;70</td>
<td valign="top" align="left">10</td>
</tr>
<tr>
<td valign="top" align="left">dday_slope</td>
<td valign="top" align="left">Monthly (January to December) slope in degree-day equation</td>
<td valign="top" align="left">dday/temp units</td>
<td valign="top" align="left">0.4</td>
<td valign="top" align="left">0.2</td>
<td valign="top" align="left">1.1</td>
</tr>
<tr>
<td valign="top" align="left">tmax_index</td>
<td valign="top" align="left">Monthly (January to December) index temperature used to determine precipitation adjustments to solar radiation</td>
<td valign="top" align="left">Temperature units</td>
<td valign="top" align="left">50</td>
<td valign="top" align="left">50</td>
<td valign="top" align="left">90</td>
</tr>
<tr>
<td valign="top" align="left">jh_coef</td>
<td valign="top" align="left">Monthly (January to December) air temperature coefficient used in Jensen-Haise potential ET computations</td>
<td valign="top" align="left">Per degree temperature</td>
<td valign="top" align="left">0.014</td>
<td valign="top" align="left">0.005</td>
<td valign="top" align="left">0.10</td>
</tr>
<tr>
<td valign="top" align="left">emis_noppt</td>
<td valign="top" align="left">Average emissivity of air on days without precipitation.</td>
<td valign="top" align="left">Decimal fraction</td>
<td valign="top" align="left">0.757</td>
<td valign="top" align="left">0.757</td>
<td valign="top" align="left">1</td>
</tr>
<tr>
<td valign="top" align="left">fastcoef_lin</td>
<td valign="top" align="left">Linear coefficient in equation to route preferential-flow storage down slope.</td>
<td valign="top" align="left">Fraction/day</td>
<td valign="top" align="left">0.1</td>
<td valign="top" align="left">0.0001</td>
<td valign="top" align="left">1</td>
</tr>
<tr>
<td valign="top" align="left">fastcoef_sq</td>
<td valign="top" align="left">Non-linear coefficient in equation to route preferential-flow storage down slope.</td>
<td valign="top" align="left">None</td>
<td valign="top" align="left">0.8</td>
<td valign="top" align="left">0.00001</td>
<td valign="top" align="left">1</td>
</tr>
<tr>
<td valign="top" align="left">freeh2o_cap</td>
<td valign="top" align="left">Free-water holding capacity of snowpack expressed as a decimal fraction of the frozen water content of the snowpack.</td>
<td valign="top" align="left">Decimal fraction</td>
<td valign="top" align="left">0.08</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">0.2</td>
</tr>
<tr>
<td valign="top" align="left">gwflow_coef</td>
<td valign="top" align="left">Linear coefficient in the equation to compute groundwater discharge for each GWR.</td>
<td valign="top" align="left">Fraction/day</td>
<td valign="top" align="left">0.03</td>
<td valign="top" align="left">0.0005</td>
<td valign="top" align="left">0.10</td>
</tr>
<tr>
<td valign="top" align="left">gwstor_init</td>
<td valign="top" align="left">Storage in each GWR at the beginning of a simulation.</td>
<td valign="top" align="left">Inches</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">20.0</td>
</tr>
<tr>
<td valign="top" align="left">potet_sublim</td>
<td valign="top" align="left">Fraction of potential ET that is sublimated from snow in the canopy and snowpack.</td>
<td valign="top" align="left">Decimal fraction</td>
<td valign="top" align="left">0.5</td>
<td valign="top" align="left">0.1</td>
<td valign="top" align="left">0.75</td>
</tr>
<tr>
<td valign="top" align="left">smidx_coef</td>
<td valign="top" align="left">Coefficient in non-linear contributing area algorithm.</td>
<td valign="top" align="left">Decimal fraction</td>
<td valign="top" align="left">0.001</td>
<td valign="top" align="left">0.0001</td>
<td valign="top" align="left">1</td>
</tr>
<tr>
<td valign="top" align="left">smidx_exp</td>
<td valign="top" align="left">Exponent in non-linear contributing area algorithm.</td>
<td valign="top" align="left">1/inch</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">0.2</td>
<td valign="top" align="left">1.8</td>
</tr>
<tr>
<td valign="top" align="left">soil_moist_max</td>
<td valign="top" align="left">Maximum available water holding capacity of capillary reservoir from land surface to rooting depth of the major vegetation type.</td>
<td valign="top" align="left">Inches</td>
<td valign="top" align="left">5</td>
<td valign="top" align="left">0</td>
<td valign="top" align="left">20</td>
</tr>
<tr>
<td valign="top" align="left">soil_rechr_max_frac</td>
<td valign="top" align="left">Maximum storage for soil recharge zone (upper portion of capillary reservoir where losses occur as both evaporation and transpiration).</td>
<td valign="top" align="left">Decimal fraction</td>
<td valign="top" align="left">0.5</td>
<td valign="top" align="left">0</td>
<td valign="top" align="left">1</td>
</tr>
<tr>
<td valign="top" align="left">soil2gw_max</td>
<td valign="top" align="left">Maximum amount of the capillary reservoir excess that is routed directly to the groundwater reservoir (GWR)</td>
<td valign="top" align="left">Inches</td>
<td valign="top" align="left">0.1</td>
<td valign="top" align="left">0</td>
<td valign="top" align="left">0.5</td>
</tr>
<tr>
<td valign="top" align="left">rain_cbh_adj</td>
<td valign="top" align="left">Monthly (January to December) adjustment factor to measured precipitation determined to be rain, to account for deficiencies in gage catch.</td>
<td valign="top" align="left">Decimal fraction</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">2</td>
</tr>
<tr>
<td valign="top" align="left">snow_cbh_adj</td>
<td valign="top" align="left">Monthly (January to December) adjustment factor to measured precipitation determined to be snow, to account for deficiencies in gage catch.</td>
<td valign="top" align="left">Decimal fraction</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">2</td>
</tr>
<tr>
<td valign="top" align="left">adjmix_rain</td>
<td valign="top" align="left">Monthly (January to December) factor to adjust rain proportion in a mixed rain/snow event by month.</td>
<td valign="top" align="left">Decimal fraction</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">1.4</td>
</tr>
<tr>
<td valign="top" align="left">cecn_coef</td>
<td valign="top" align="left">Monthly (January to December) convection condensation energy coefficient.</td>
<td valign="top" align="left">Calories/deg Celsius</td>
<td valign="top" align="left">5</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">20</td>
</tr>
<tr>
<td valign="top" align="left">tmax_allrain_offset</td>
<td valign="top" align="left">Monthly (January to December) maximum air temperature when precipitation is assumed to be rain; if HRU air temperature is greater than or equal to tmax_allsnow plus this value, precipitation is rain.</td>
<td valign="top" align="left">Temperature units</td>
<td valign="top" align="left">5</td>
<td valign="top" align="left">0</td>
<td valign="top" align="left">10</td>
</tr>
<tr>
<td valign="top" align="left">tmax_allsnow</td>
<td valign="top" align="left">Monthly (January to December) maximum air temperature when precipitation is assumed to be snow; if HRU air temperature is less than or equal to this value, precipitation is snow.</td>
<td valign="top" align="left">Temperature units</td>
<td valign="top" align="left">30</td>
<td valign="top" align="left">20</td>
<td valign="top" align="left">40</td>
</tr></tbody>
</table>
</table-wrap>
<p>Each parameter that is to be calibrated begins with the default value and is continually refined during the calibration process. During calibration, the parameters are constrained to lie within the process-driven limits given above. In preparation for clustering, the updated values of the parameters are normalized using standard min-max normalization:</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>z</mml:mi><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mfrac><mml:mrow><mml:mi>x</mml:mi><mml:mo>-</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mo class="qopname">min</mml:mo></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mo class="qopname">max</mml:mo></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mo class="qopname">min</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p><xref ref-type="fig" rid="F3">Figure 3</xref> displays the range of parameter values after calibration and normalization. Normalization causes all the parameters to share the same range of 0 to 1. These values can be returned to their actual values by using the minimum and maximum values given below each boxplot in <xref ref-type="fig" rid="F3">Figure 3</xref> and inverting <xref ref-type="disp-formula" rid="E1">Equation 1</xref> to solve for x.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Parameter ranges after calibration and min-max normalization.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0003.tif"/>
</fig>
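<p>A minimal sketch of the normalization in <xref ref-type="disp-formula" rid="E1">Equation 1</xref> and its inverse is given below. The function names and the example values are hypothetical and are not calibrated values from this study.</p>
<preformat><![CDATA[
```python
def min_max_normalize(values):
    """Equation 1: scale values to [0, 1]; also return the min and max used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalization is undefined for constant data")
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(normalized, lo, hi):
    """Invert Equation 1: x = x_normalized * (x_max - x_min) + x_min."""
    return [n * (hi - lo) + lo for n in normalized]


# Hypothetical values of one parameter across four sub-basins.
example_values = [0.0005, 0.03, 0.10, 0.05]
normalized, x_min, x_max = min_max_normalize(example_values)
recovered = denormalize(normalized, x_min, x_max)
```
]]></preformat>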
<p>Calibration of the rainfall-runoff models was initially set up to minimize the error for low-flow estimation rather than for daily streamflows, deviating from the recommended calibration procedure in the PRMS IV manual (Markstrom et al., <xref ref-type="bibr" rid="B44">2015</xref>). This was considered appropriate because the goal of this experiment is to test hydrologic models for low flow estimation, and the LUCA software allows for calibration to the lowest annual daily streamflows. Initial results suggested that the rainfall-runoff models calibrated to low flows estimated the magnitude of 7Q10s well, but a more thorough analysis of the hydrograph suggested that the models were not properly maintaining the water budget and corresponding streamflow throughout the rest of the year. This discrepancy is illustrated in <xref ref-type="supplementary-material" rid="SM1">Appendix A</xref>, using gage 01552000 on Loyalsock Creek in Pennsylvania as an example. For this reason, the calibration described above targets daily streamflow, as recommended.</p>
</sec>
<sec>
<title>3.2 Fuzzy C-Means clustering</title>
<p>Traditional calibration of a rainfall-runoff model to daily streamflow is only possible at gaged locations. Furthermore, at these locations, 7Q10s can be directly calculated from the measured streamflow data. However, to extend calibration to ungaged locations, hydrologic regionalization can be used to transfer model parameters. Fuzzy C-Means (FCM) is a clustering algorithm that utilizes soft assignments of data points to clusters (Dunn, <xref ref-type="bibr" rid="B18">1973</xref>). Unlike traditional clustering algorithms like K-means clustering (MacQueen, <xref ref-type="bibr" rid="B43">1967</xref>) that create hard assignments for each data point to a single cluster, FCM assigns membership values to indicate the degree of &#x0201C;belongingness&#x0201D; of data points to each cluster. The objective function seeks the cluster centers and membership values that minimize the membership-weighted sum of squared distances between the data points and the cluster centers. The FCM algorithm provides greater flexibility in clustering tasks, as it can handle scenarios where data points may partially belong to multiple clusters or where cluster boundaries are ambiguous. This is advantageous in this application, as this methodology is tested for a large geographical area that includes two pre-determined HUC regions of the United States. By assigning membership values, FCM provides a more nuanced representation of the clustering structure and allows for capturing overlapping clusters or gradual transitions between clusters. These membership values were used to create a weighted average of the parameters to use in the rainfall-runoff models. The difference between hard and soft assignment is illustrated in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Difference between hard clustering algorithms (e.g., K-Means) and soft clustering algorithms (e.g., Fuzzy C-Means).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0004.tif"/>
</fig>
<p>For implementation, a value &#x0201C;m&#x0201D; sets the &#x0201C;fuzzification&#x0201D; factor, dictating how much overlap to allow between clusters. As m increases, the allowed overlap between clusters increases. The FCM algorithm is executed for every possible number of clusters over a specified range of m values. The possible number of clusters for FCM is any integer in [2, N/2], where N is the number of data points; here N = 525 individual sub-basins, each with its own set of parameters, so the number of clusters ranges from 2 to 262 (261 values). The range of m values possible is [1.5, &#x0221E;). In this study, m values in [1.5, 5] are used in steps of 0.10. This tests 36 different m values for each number of clusters, leading to a total of 261 &#x000D7; 36 = 9,396 cluster and fuzziness combinations for this experiment. Note that when m = 1, no overlap is allowed between clusters and FCM reduces to K-Means clustering.</p>
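<p>The size of this search grid can be checked with a short enumeration. The variable names below are illustrative; only the counts (525 sub-basins, 2 to 262 clusters, m from 1.5 to 5 in steps of 0.10) come from the text.</p>
<preformat><![CDATA[
```python
n_subbasins = 525

# Integer cluster counts from 2 up to N/2 (rounded down): 2, 3, ..., 262.
cluster_counts = list(range(2, n_subbasins // 2 + 1))

# Fuzzification factors 1.5, 1.6, ..., 5.0; rounding avoids float drift.
m_values = [round(1.5 + 0.1 * step, 1) for step in range(36)]

# Every (number of clusters, m) pair evaluated in the experiment.
combinations = [(c, m) for c in cluster_counts for m in m_values]
```
]]></preformat>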
<p>Clustering algorithms are typically used descriptively to highlight patterns in a dataset, but they can be used prescriptively given a set of predictor variables and response variables. For this study, the response variables for the FCM clustering are the calibrated parameters given in <xref ref-type="table" rid="T1">Table 1</xref>, as that is what must be predicted for calibrating the models at ungaged locations. The predictor variables, which will be used to predict which set of calibrated parameters to use, are the publicly available physical parameters for each sub-basin from the NHM, all designated as numeric values. These are highlighted in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Predictor variables used in the Fuzzy C-Means analysis from the 94 test basins.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Model parameter</bold></th>
<th valign="top" align="left"><bold>Description</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Area</td>
<td valign="top" align="left">Area of the basin</td>
</tr>
<tr>
<td valign="top" align="left">Elevation</td>
<td valign="top" align="left">Mean elevation of the basin</td>
</tr>
<tr>
<td valign="top" align="left">Percent impervious</td>
<td valign="top" align="left">Percent of the basin considered to be impervious</td>
</tr>
<tr>
<td valign="top" align="left">Slope</td>
<td valign="top" align="left">Average slope of the basin</td>
</tr>
<tr>
<td valign="top" align="left">Land type</td>
<td valign="top" align="left">Primary land designation (land, lake, swale)</td>
</tr>
<tr>
<td valign="top" align="left">Soil type</td>
<td valign="top" align="left">Soil type of the basin</td>
</tr>
<tr>
<td valign="top" align="left">Vegetation type</td>
<td valign="top" align="left">Type of vegetation primarily covering the area</td>
</tr></tbody>
</table>
</table-wrap>
<p>Given the above parameters, the process for creating the new models using Fuzzy C-Means is summarized below.</p>
<list list-type="order">
<list-item><p>Calibration creates response variables (<xref ref-type="table" rid="T1">Table 1</xref>) for each location.</p></list-item>
<list-item><p>Predictor variables (<xref ref-type="table" rid="T2">Table 2</xref>) are extracted from each location.</p></list-item>
<list-item><p>All parameters are normalized using min-max normalization (<xref ref-type="disp-formula" rid="E1">Equation 1</xref>).</p></list-item>
<list-item><p>Clustering is used to create groups of similar basin parameters.</p></list-item>
<list-item><p>Fuzzy C-Means clustering is used to evaluate each location&#x00027;s membership to each group.</p></list-item>
<list-item><p>New calibrated parameters are created using a weighted average of each location&#x00027;s membership to each cluster and the cluster&#x00027;s corresponding centroid (<xref ref-type="fig" rid="F4">Figure 4</xref>).</p></list-item>
</list>
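<p>One possible reading of these steps is sketched below in plain Python. It is an assumption-laden illustration rather than the implementation used in this study: it assumes that clustering is performed on the normalized predictor variables, that each cluster's parameter centroid is the membership-weighted mean of the calibrated parameters, and that a fixed number of iterations stands in for a convergence test. The data, function names, and the choice m = 2 are hypothetical.</p>
<preformat><![CDATA[
```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def memberships(point, centers, m):
    """FCM membership of one point in every cluster; the values sum to 1."""
    dists = [distance(point, c) for c in centers]
    if min(dists) == 0.0:
        # A point lying exactly on one or more centers belongs only to them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]


def weighted_means(vectors, u, m):
    """Mean of `vectors` for each cluster, weighted by membership ** m."""
    means = []
    for i in range(len(u[0])):
        weights = [row[i] ** m for row in u]
        total = sum(weights)
        means.append([sum(w * v[d] for w, v in zip(weights, vectors)) / total
                      for d in range(len(vectors[0]))])
    return means


def fuzzy_c_means(points, n_clusters, m, n_iter=100, seed=0):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = random.Random(seed).sample(points, n_clusters)
    for _ in range(n_iter):
        u = [memberships(p, centers, m) for p in points]
        centers = weighted_means(points, u, m)
    return centers, [memberships(p, centers, m) for p in points]


def regionalize(predictors, centers, parameter_centroids, m):
    """Membership-weighted average of the cluster parameter centroids."""
    u = memberships(predictors, centers, m)
    return [sum(ui * pc[d] for ui, pc in zip(u, parameter_centroids))
            for d in range(len(parameter_centroids[0]))]


# Hypothetical normalized predictors (e.g., area, slope) for six gaged sub-basins
# and one hypothetical normalized calibrated parameter for each of them.
predictors = [[0.10, 0.10], [0.15, 0.20], [0.20, 0.10],
              [0.80, 0.90], [0.90, 0.80], [0.85, 0.85]]
calibrated = [[0.20], [0.25], [0.30], [0.80], [0.70], [0.75]]
m = 2.0

centers, u = fuzzy_c_means(predictors, n_clusters=2, m=m)
parameter_centroids = weighted_means(calibrated, u, m)

# Parameters for a hypothetical ungaged sub-basin that resembles the first group.
ungaged_parameters = regionalize([0.12, 0.12], centers, parameter_centroids, m)
```
]]></preformat>
<p>Because the memberships sum to one, the regionalized parameter always lies within the range spanned by the cluster centroids, which is the behavior a weighted average is meant to provide.</p>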
<p>The FCM algorithm is known to suffer from several drawbacks, including computational time complexity, sensitivity to the initial cluster centers, reliance on the membership matrix, and sensitivity to noise. In this study, measures were implemented to mitigate each of these drawbacks. To address computational time complexity, parallel computing techniques were used. To reduce the sensitivity to initial cluster centers, multiple runs with distinct initializations were performed and the configuration that yielded the best results was retained. Additionally, the membership matrix was manually evaluated and refined to enhance the stability of the clustering process and minimize the impact of noise.</p>
</sec>
<sec>
<title>3.3 Silhouette analysis</title>
<p>Silhouette analysis (Rousseeuw, <xref ref-type="bibr" rid="B53">1987</xref>) is a common methodology used for calculating the optimal number of clusters for a dataset. This methodology evaluates the quality of clustering by assessing the separation and cohesion of clusters, as well as the fit of data points within their assigned clusters (Rousseeuw, <xref ref-type="bibr" rid="B53">1987</xref>). First, a silhouette coefficient is calculated for each data point that measures how well it fits within its cluster compared to neighboring clusters. Next, the average silhouette coefficient is computed across all data points for each value of the number of clusters. Finally, the optimal number of clusters is determined by selecting the value that maximizes the average silhouette coefficient, indicating the presence of well-separated and compact clusters.</p>
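<p>The three steps described above can be sketched as follows. The sketch assumes hard cluster labels, so for FCM results it presumes each data point has first been assigned to its highest-membership cluster; that assignment rule, the toy data, and the function names are assumptions made for illustration.</p>
<preformat><![CDATA[
```python
import math
from statistics import mean


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_distance(points, i, members):
    """Mean distance from point i to the points indexed by `members`."""
    return mean([distance(points[i], points[j]) for j in members])


def silhouette_coefficients(points, labels):
    """Silhouette coefficient s = (b - a) / max(a, b) for every data point."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = mean_distance(points, i, own)  # cohesion within the own cluster
        b = min(mean_distance(points, i, members)  # nearest other cluster
                for other, members in clusters.items() if other != label)
        scores.append((b - a) / max(a, b))
    return scores


# Toy one-dimensional data with two obvious groups.
points = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]

# Candidate labelings for two and three clusters (hypothetical).
candidates = {2: [0, 0, 0, 1, 1, 1], 3: [0, 0, 1, 1, 2, 2]}

average = {k: mean(silhouette_coefficients(points, labels))
           for k, labels in candidates.items()}
best_k = max(average, key=average.get)
```
]]></preformat>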
<p>Silhouette analysis was chosen to determine the optimal number of clusters because it quantifies both cohesion and separation within clusters, providing a clear and intuitive measure of clustering quality that considers both the compactness of clusters and their distinctiveness. It can be evaluated across varying cluster configurations, making it well-suited to this study, where identifying an optimal number of clusters is crucial for optimizing the rainfall-runoff models. For this experiment, models are created and executed from the four cluster/m-value combinations with the highest average silhouette coefficients. Because parameters are regionalized for use in a rainfall-runoff model, the cluster/m-value combination with the best average silhouette coefficient may differ slightly from the one that produces the best daily streamflow simulation. Testing four models allows the link between the optimal silhouette coefficients and optimal physical model performance to be verified, and ensures that the single model with the best daily streamflow and 7Q10 estimation is identified. If the silhouette analysis suggests that there are distinct clusters, which would occur if the highest average silhouette coefficients occur when m = 1.5 (the smallest fuzzification factor possible) and decrease as m is incrementally increased, models with m = 1 should also be tested, which would reduce the FCM to K-Means clustering.</p>
</sec>
<sec>
<title>3.4 Evaluation metrics for daily streamflow and 7Q10 estimates</title>
<p>In this experiment, the Kling-Gupta Efficiency (KGE) (Gupta et al., <xref ref-type="bibr" rid="B28">2009</xref>) is used to evaluate the daily streamflow models. KGE is widely used for hydrologic applications (Formetta et al., <xref ref-type="bibr" rid="B25">2011</xref>; Beck et al., <xref ref-type="bibr" rid="B5">2016</xref>) because it incorporates three components into its definition: the Pearson&#x00027;s correlation coefficient between simulated and observed flows (r), the bias ratio (&#x003B2;, the ratio of the simulated mean to the observed mean), and the variability ratio (&#x003B1;, the ratio of the simulated standard deviation to the observed standard deviation). KGE is calculated using the following formula (<xref ref-type="disp-formula" rid="E2">Equation 2</xref>):</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M2"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>K</mml:mi><mml:mi>G</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msqrt><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>r</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x0002B;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B1;</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x0002B;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B2;</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:msqrt></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Unlike traditional metrics like R<sup>2</sup> (Wright, <xref ref-type="bibr" rid="B67">1921</xref>) and Nash-Sutcliffe Efficiency (NSE) (Nash and Sutcliffe, <xref ref-type="bibr" rid="B49">1970</xref>), KGE considers three key components of model performance: correlation, bias, and variability, which combined offer a holistic assessment of model fit. This provides a nuanced understanding of the model&#x00027;s ability to capture not only the mean and variability of the observed data, but also the temporal dynamics.</p>
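<p>A minimal sketch of <xref ref-type="disp-formula" rid="E2">Equation 2</xref> is given below, assuming the standard component definitions of Gupta et al. (2009): r is the Pearson correlation between simulated and observed flows, &#x003B1; is the ratio of their standard deviations, and &#x003B2; is the ratio of their means. The function name and the flow values are hypothetical.</p>
<preformat><![CDATA[
```python
import math
from statistics import mean, pstdev


def kling_gupta_efficiency(simulated, observed):
    """Equation 2: KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    mu_s, mu_o = mean(simulated), mean(observed)
    sd_s, sd_o = pstdev(simulated), pstdev(observed)
    covariance = mean([(s - mu_s) * (o - mu_o)
                       for s, o in zip(simulated, observed)])
    r = covariance / (sd_s * sd_o)  # Pearson correlation
    alpha = sd_s / sd_o             # variability ratio
    beta = mu_s / mu_o              # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Hypothetical daily flows: a perfect simulation and one that doubles every flow.
observed = [1.0, 2.0, 4.0, 3.0, 5.0]
kge_perfect = kling_gupta_efficiency(observed, observed)
kge_doubled = kling_gupta_efficiency([2.0 * q for q in observed], observed)
```
]]></preformat>
<p>The doubled simulation is perfectly correlated with the observations (r = 1) but has &#x003B1; = &#x003B2; = 2, so its KGE is 1 minus the square root of 2, which is negative. This illustrates how KGE penalizes bias and variability errors that correlation alone would miss.</p>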
<p>The goal of this experiment is to evaluate the models for 7Q10 estimation, so four error metrics (<xref ref-type="disp-formula" rid="E3">Equations 3</xref>&#x02013;<xref ref-type="disp-formula" rid="E6">6</xref>) are used to evaluate the 7Q10 estimates. These are Relative Bias,</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M3"><mml:mrow><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>v</mml:mi><mml:mi>e</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>B</mml:mi><mml:mi>i</mml:mi><mml:mi>a</mml:mi><mml:mi>s</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mrow><mml:mo>|</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>Root Mean Square Error (RMSE),</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M4"><mml:mrow><mml:mi>R</mml:mi><mml:mi>M</mml:mi><mml:mi>S</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msqrt><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:mo>&#x02217;</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mstyle></mml:mrow></mml:msqrt></mml:mrow></mml:math></disp-formula>
<p>Relative Root Mean Square Error (R-RMSE),</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M5"><mml:mrow><mml:mi>R</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mi>R</mml:mi><mml:mi>M</mml:mi><mml:mi>S</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mfrac><mml:mrow><mml:msqrt><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:mo>&#x02217;</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mstyle></mml:mrow></mml:msqrt></mml:mrow><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover></mml:mfrac></mml:mrow></mml:math></disp-formula>
<p>and Unit-Area RMSE (UA-RMSE),</p>
<disp-formula id="E6"><label>(6)</label><mml:math id="M6"><mml:mrow><mml:mi>U</mml:mi><mml:mi>A</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mi>R</mml:mi><mml:mi>M</mml:mi><mml:mi>S</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msqrt><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:mo>&#x02217;</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mi>D</mml:mi><mml:mi>A</mml:mi></mml:mrow></mml:mfrac><mml:mo>&#x02212;</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mi>D</mml:mi><mml:mi>A</mml:mi></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mstyle></mml:mrow></mml:msqrt></mml:mrow></mml:math></disp-formula>
<p>where <italic>y</italic><sub><italic>i</italic></sub> is the observed 7Q10 at site <italic>i</italic>, <italic>&#x00177;</italic><sub><italic>i</italic></sub> is the model-predicted 7Q10 at site <italic>i</italic>, <italic>n</italic> is the total number of sites, <italic>DA</italic> is the drainage area, and <inline-formula><mml:math id="M7"><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:math></inline-formula> is the associated mean value. Similar studies have also chosen RMSE over MAE for its sensitivity to outliers (Mekanik et al., <xref ref-type="bibr" rid="B46">2016</xref>; Ferreira et al., <xref ref-type="bibr" rid="B23">2021</xref>). These four metrics were chosen to complement each other and provide a comprehensive analysis of performance. RMSE provides a standard error estimate, while relative RMSE provides an error estimate adjusted for the mean of the dataset. Additionally, relative bias provides a direct bias estimate in relation to the size of the value itself, and Unit-Area RMSE adjusts RMSE for the size of the basin being analyzed. These additional metrics attempt to weight 7Q10 estimation in small basins and large basins similarly, where a bias estimate may be heavily skewed by the size of the values themselves (e.g., a 7Q10 estimate of 1.00 cfs when the actual value is 0.00 cfs, compared to a 7Q10 estimate of 100.00 cfs when the actual value is 99.00 cfs).</p>
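<p>The four metrics in <xref ref-type="disp-formula" rid="E3">Equations 3</xref>&#x02013;<xref ref-type="disp-formula" rid="E6">6</xref> can be sketched as follows. The sketch assumes that the mean in the R-RMSE denominator is the mean of the observed 7Q10 values and that observed values are non-zero wherever relative bias is computed; the three sites, their flows (in cfs), and their drainage areas (in square miles) are hypothetical.</p>
<preformat><![CDATA[
```python
import math
from statistics import mean, median


def relative_bias(observed, predicted):
    """Equation 3, evaluated per site: |(predicted - observed) / observed|."""
    return [abs((p - o) / o) for o, p in zip(observed, predicted)]


def rmse(observed, predicted):
    """Equation 4: root mean square error across all sites."""
    return math.sqrt(mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def relative_rmse(observed, predicted):
    """Equation 5: RMSE divided by the mean observed value (an assumption)."""
    return rmse(observed, predicted) / mean(observed)


def unit_area_rmse(observed, predicted, drainage_area):
    """Equation 6: RMSE of flows divided by each site's drainage area."""
    return math.sqrt(mean([(o / a - p / a) ** 2
                           for o, p, a in zip(observed, predicted, drainage_area)]))


# Hypothetical observed and predicted 7Q10 values and drainage areas.
observed_7q10 = [2.0, 10.0, 50.0]
predicted_7q10 = [3.0, 9.0, 54.0]
drainage_area = [10.0, 40.0, 200.0]

median_relative_bias = median(relative_bias(observed_7q10, predicted_7q10))
site_rmse = rmse(observed_7q10, predicted_7q10)
site_r_rmse = relative_rmse(observed_7q10, predicted_7q10)
site_ua_rmse = unit_area_rmse(observed_7q10, predicted_7q10, drainage_area)
```
]]></preformat>
<p>In this toy case the largest absolute error occurs at the largest site, which dominates RMSE, whereas the largest relative bias occurs at the smallest site. This is the kind of contrast the complementary metrics are intended to expose.</p>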
</sec>
</sec>
<sec id="s4">
<title>4 Results and discussion</title>
<p>The following sections present and discuss uncalibrated vs. calibrated model performance for daily streamflow estimation (Section 4.1) and 7Q10 estimation (Section 4.2), the results from the silhouette analysis (Section 4.3), the application of the optimal silhouette configurations with Fuzzy C-Means regionalization for daily streamflow estimation (Section 4.4), and the results of using the models calibrated using FCM for 7Q10 estimation (Section 4.5).</p>
<sec>
<title>4.1 Uncalibrated model performance vs. calibrated model performance</title>
<p>In <xref ref-type="fig" rid="F5">Figure 5</xref>, the general results for the uncalibrated rainfall-runoff models directly from the NHM-PRMS, compared to the calibrated models, are presented.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Results from calibration using KGE to evaluate daily streamflow estimation.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0005.tif"/>
</fig>
<p>As expected, calibration improved daily streamflow estimation for every basin. Before calibration, the median KGE was 0.30, with some locations having negative KGE values. Calibration improved the median KGE to 0.52, with the lowest KGE value being 0.25. The results are further analyzed by disaggregating the basins by size into small (&#x0003C;100 mi<sup>2</sup>) and large (&#x0003E;100 mi<sup>2</sup>) basins. This threshold is chosen because many regression equations used for 7Q10 estimation, including some of the StreamStats equations in the study area, are only recommended for basins up to 100&#x02013;150 mi<sup>2</sup> (e.g., Ries, <xref ref-type="bibr" rid="B52">2000</xref>). <xref ref-type="fig" rid="F6">Figures 6A</xref>, <xref ref-type="fig" rid="F6">B</xref> display the physical model KGEs split by small basins (<xref ref-type="fig" rid="F6">Figure 6A</xref>) and large basins (<xref ref-type="fig" rid="F6">Figure 6B</xref>).</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Results from calibration for basins smaller than 100 mi<sup>2</sup> <bold>(A)</bold> and basins larger than 100 mi<sup>2</sup> <bold>(B)</bold> using KGE to evaluate daily streamflow estimation.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0006.tif"/>
</fig>
<p>The calibrated models differ little between small and large basins, but the uncalibrated models differ noticeably. For the larger basins, the uncalibrated models display a lower median KGE, more negative KGE values, and a much wider interquartile range, with a negative 25<sup>th</sup> percentile. Because the calibrated models perform similarly for both size classes, this suggests that the default parameter values are less appropriate for larger basins, which therefore benefit more from calibration than small basins.</p>
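<p>For readers less familiar with the metric, the following sketch illustrates how a KGE value is computed and why it can be negative. It assumes the standard three-component form of the Kling-Gupta efficiency (correlation, variability ratio, and bias ratio); the function name kge and the flow values are hypothetical and are not taken from the study, which may use a different KGE variant.</p>
<preformat>
```python
import numpy as np


def kge(simulated, observed):
    """Kling-Gupta efficiency in its three-component form; 1 is a perfect match."""
    sim = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]      # linear correlation (timing and shape)
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Hypothetical daily flows in cfs, for illustration only (not study data).
observed = np.array([12.0, 10.0, 8.0, 15.0, 30.0, 22.0, 14.0])
perfect = kge(observed, observed)        # identical series
biased = kge(3.0 * observed, observed)   # same timing, flows three times too high

print(round(perfect, 3), round(biased, 3))
```
</preformat>
<p>In this sketch, a simulation with perfect timing but flows three times too large is penalized through both the variability and bias ratios, which is enough to push the KGE below 0.</p>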
</sec>
<sec>
<title>4.2 Uncalibrated vs. calibrated models for gaged 7Q10 estimation</title>
<p>This experiment&#x00027;s primary goal is to test the hypothesis that physical models can be used for 7Q10 estimation in ungaged locations. Of the 94 basins in the study, a StreamStats 7Q10 estimate is not available for 28 (the USGS has not developed StreamStats 7Q10 estimation for certain states, as discussed in Section 2.2.3). These locations were removed from the analyses presented in Sections 4.2 and 4.5. <xref ref-type="table" rid="T3">Table 3</xref> summarizes the Median Relative Bias, RMSE, RRMSE, and UA-RMSE of the uncalibrated and calibrated models for 7Q10 estimation compared to StreamStats.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Error metrics for 7Q10 estimation (only StreamStats locations).</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>7Q10 estimation technique</bold></th>
<th valign="top" align="center"><bold>Median relative bias (cfs/cfs)</bold></th>
<th valign="top" align="center"><bold>RMSE (cfs)</bold></th>
<th valign="top" align="center"><bold>RRMSE (cfs/cfs)</bold></th>
<th valign="top" align="center"><bold>UA-RMSE (cfs/mi<sup>2</sup>)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Uncalibrated hydrologic models</td>
<td valign="top" align="center">0.81</td>
<td valign="top" align="center">13.99</td>
<td valign="top" align="center">116.48</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">Calibrated hydrologic models</td>
<td valign="top" align="center">0.61</td>
<td valign="top" align="center">12.33</td>
<td valign="top" align="center">102.63</td>
<td valign="top" align="center">0.07</td>
</tr>
<tr>
<td valign="top" align="left">StreamStats</td>
<td valign="top" align="center">0.42</td>
<td valign="top" align="center">13.75</td>
<td valign="top" align="center">114.5</td>
<td valign="top" align="center">0.05</td>
</tr></tbody>
</table>
</table-wrap>
<p>For the 66 basins where StreamStats 7Q10 estimation is available, StreamStats provides substantially lower median relative bias (0.42 vs. 0.81) and UA-RMSE (0.05 vs. 0.08 cfs/mi<sup>2</sup>) than the uncalibrated models, but similar RMSE and RRMSE. The calibrated models perform best in terms of RMSE and RRMSE but have larger median relative bias (0.61) and UA-RMSE (0.07 cfs/mi<sup>2</sup>) than StreamStats. RMSE is heavily influenced by larger values, while UA-RMSE attempts to weight smaller and larger values equally by scaling the larger values down based on their larger watershed areas. Because the calibrated models perform best for RMSE but not for UA-RMSE, the calibrated models appear to perform well for larger basins. <xref ref-type="table" rid="T4">Table 4</xref> confirms this by presenting the same metrics separately for the small and large basins defined in Section 4.1.</p>
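<p>The contrast between RMSE and UA-RMSE described above can be illustrated with a small sketch. The definitions used here are assumptions made for illustration and may differ from the formal definitions in the methods: median relative bias is taken as the median of the absolute error divided by the actual value, RRMSE as RMSE divided by the mean actual 7Q10 and expressed in percent, and UA-RMSE as the RMSE of errors divided by drainage area. The function name error_metrics and all basin values are hypothetical.</p>
<preformat>
```python
import numpy as np


def error_metrics(estimated, actual, area_mi2):
    """Four 7Q10 error metrics under the assumed definitions in the lead-in."""
    est = np.asarray(estimated, dtype=float)
    act = np.asarray(actual, dtype=float)
    area = np.asarray(area_mi2, dtype=float)
    err = est - act
    rmse = np.sqrt(np.mean(err ** 2))
    return {
        "median_relative_bias": float(np.median(np.abs(err) / act)),
        "rmse": float(rmse),                                    # cfs
        "rrmse": float(100.0 * rmse / act.mean()),              # percent of mean actual
        "ua_rmse": float(np.sqrt(np.mean((err / area) ** 2))),  # cfs per square mile
    }


# Hypothetical 7Q10s (cfs) and drainage areas (square miles); not study data.
actual = np.array([1.0, 2.0, 4.0, 40.0])
area = np.array([10.0, 20.0, 50.0, 400.0])
model_a = np.array([1.0, 2.0, 4.0, 30.0])  # exact on small basins, 10 cfs off on the large one
model_b = np.array([2.0, 4.0, 8.0, 40.0])  # doubles every small basin, exact on the large one

a = error_metrics(model_a, actual, area)
b = error_metrics(model_b, actual, area)
print(a)
print(b)
```
</preformat>
<p>In this hypothetical case, model_b has the lower RMSE because it matches the single large basin, yet it has the higher UA-RMSE and median relative bias because it misses every small basin. This is the same pattern that leads to the interpretation that a method with good RMSE but poorer UA-RMSE is performing well mainly in larger basins.</p>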
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Error metrics for 7Q10 estimation by basin size (only StreamStats locations).</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>7Q10 estimation technique</bold></th>
<th valign="top" align="center"><bold>Median relative bias (cfs/cfs)</bold></th>
<th valign="top" align="center"><bold>RMSE (cfs)</bold></th>
<th valign="top" align="center"><bold>RRMSE (cfs/cfs)</bold></th>
<th valign="top" align="center"><bold>UA-RMSE (cfs/mi<sup>2</sup>)</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:#dee1e1">
<td valign="top" align="left" colspan="5"><bold>Basins smaller than 100 mi</bold><sup>2</sup><bold>-35 locations</bold></td>
</tr>
<tr>
<td valign="top" align="left">Uncalibrated rainfall-runoff models</td>
<td valign="top" align="center">0.90</td>
<td valign="top" align="center">7.26</td>
<td valign="top" align="center">188.82</td>
<td valign="top" align="center">0.09</td>
</tr>
<tr>
<td valign="top" align="left">Calibrated rainfall-runoff models</td>
<td valign="top" align="center">0.90</td>
<td valign="top" align="center">5.47</td>
<td valign="top" align="center">142.24</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">StreamStats</td>
<td valign="top" align="center">0.42</td>
<td valign="top" align="center">2.23</td>
<td valign="top" align="center">58.00</td>
<td valign="top" align="center">0.04</td>
</tr>
<tr style="background-color:#dee1e1">
<td valign="top" align="left" colspan="5"><bold>Basins larger than 100 mi</bold><sup>2</sup><bold>-31 locations</bold></td>
</tr>
<tr>
<td valign="top" align="left">Uncalibrated rainfall-runoff models</td>
<td valign="top" align="center">0.65</td>
<td valign="top" align="center">19.57</td>
<td valign="top" align="center">92.17</td>
<td valign="top" align="center">0.07</td>
</tr>
<tr>
<td valign="top" align="left">Calibrated rainfall-runoff models</td>
<td valign="top" align="center">0.42</td>
<td valign="top" align="center">16.25</td>
<td valign="top" align="center">76.53</td>
<td valign="top" align="center">0.06</td>
</tr>
<tr>
<td valign="top" align="left">StreamStats</td>
<td valign="top" align="center">0.32</td>
<td valign="top" align="center">19.92</td>
<td valign="top" align="center">93.86</td>
<td valign="top" align="center">0.06</td>
</tr></tbody>
</table>
</table-wrap>
<p><xref ref-type="table" rid="T4">Table 4</xref> suggests that StreamStats outperforms both the uncalibrated and calibrated physical models for 7Q10 estimation in small basins, where StreamStats&#x00027; median relative bias, RMSE, RRMSE, and UA-RMSE are all roughly half or less of those of the calibrated models. For the large basins, StreamStats still performs best in terms of median relative bias, but the calibrated models display the same UA-RMSE and better RMSE and RRMSE than StreamStats. Calibrated physical models at gaged locations over 100 mi<sup>2</sup> can therefore provide errors similar to those of StreamStats for 7Q10 estimation in terms of the metrics presented. However, StreamStats is primarily intended for ungaged basin estimation, and this calibration methodology cannot be used for ungaged locations. The following sections present a methodology for calibrating models in ungaged locations using Fuzzy C-Means clustering for hydrologic regionalization.</p>
</sec>
<sec>
<title>4.3 Results from silhouette analysis with Fuzzy C-Means clustering</title>
<p>As summarized in Section 3.3, silhouettes were used to find the optimal FCM parameters. The full range of possible cluster numbers is combined with m-values ranging from 1.5 to 5 in steps of 0.1. <xref ref-type="fig" rid="F7">Figure 7</xref> displays the average silhouette widths for each combination of clusters and m-values.</p>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Silhouette analysis for the Fuzzy C-Means clustering.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0007.tif"/>
</fig>
<p>In addition, the top 10 average silhouette coefficients, arranged from largest to smallest, are presented in <xref ref-type="table" rid="T5">Table 5</xref>.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Highest 10 average silhouette coefficients.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Number of clusters (c)</bold></th>
<th valign="top" align="center"><bold>M-value (m)</bold></th>
<th valign="top" align="center"><bold>Average silhouette coefficient</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">1.5</td>
<td valign="top" align="center">0.344265</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">1.6</td>
<td valign="top" align="center">0.334988</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">1.7</td>
<td valign="top" align="center">0.33009</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">1.8</td>
<td valign="top" align="center">0.32043</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">1.9</td>
<td valign="top" align="center">0.306479</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">1.5</td>
<td valign="top" align="center">0.302499</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">1.6</td>
<td valign="top" align="center">0.301588</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">1.8</td>
<td valign="top" align="center">0.301469</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">1.7</td>
<td valign="top" align="center">0.301405</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">1.9</td>
<td valign="top" align="center">0.300441</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">0.29949</td>
</tr></tbody>
</table>
</table-wrap>
<p>Results suggest that the average silhouette coefficient is maximized at the minimum tested value of both the number of clusters and m. The best average silhouette coefficient occurs with c = 2 clusters and an m-value of 1.5. The next four best silhouette coefficients also occur at c = 2 and decrease slightly as m increases in steps of 0.1. The next best number of clusters is c = 3, which first appears as the 6<sup>th</sup> highest average silhouette coefficient, with the minimum m-value of 1.5 once again being optimal. For c = 3, the silhouette coefficients again generally decrease as m increases, although the differences are small. As shown in <xref ref-type="table" rid="T5">Table 5</xref> and also visible in <xref ref-type="fig" rid="F7">Figure 7</xref>, the silhouette coefficients decrease as the m-value increases, suggesting less optimal solutions as the fuzziness between clusters increases. Together, these results suggest that there may be distinct clusters that need to be considered.</p>
<p>Based on the results from <xref ref-type="fig" rid="F7">Figure 7</xref> and <xref ref-type="table" rid="T5">Table 5</xref> that suggest possible distinct clusters, rather than test the four parameter combinations which display the best average silhouette coefficient (which would be when c = 2 and m = 1.5, 1.6, 1.7, and 1.8), the following four models will be tested:</p>
<list list-type="order">
<list-item><p>FCM when c = 2 and m = 1 (the hard-clustering limit, in which FCM reduces to K-Means clustering where k = 2).</p></list-item>
<list-item><p>FCM when c = 2 and m = 1.5 (FCM for c = 2 with the minimum fuzziness factor m = 1.5).</p></list-item>
<list-item><p>FCM when c = 3 and m = 1 (the hard-clustering limit, in which FCM reduces to K-Means clustering where k = 3).</p></list-item>
<list-item><p>FCM when c = 3 and m = 1.5 (FCM for c = 3 with the minimum fuzziness factor m = 1.5).</p></list-item>
</list>
<p>This will allow us to test: (1) the optimal result from the silhouette analysis (c = 2 and m = 1.5), (2) multiple clusters (c = 2 and c = 3), and (3) whether applying distinct clusters leads to better calibration, as suggested by the silhouettes.</p>
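<p>The parameter search described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the data are synthetic two-dimensional points rather than basin characteristics, only c = 2 to 4 is searched, and the silhouette is computed from hardened labels (each point assigned to its highest-membership cluster). The function names fuzzy_c_means and mean_silhouette are hypothetical.</p>
<preformat>
```python
import numpy as np


def fuzzy_c_means(X, c, m, n_iter=300, tol=1e-6, seed=0):
    """Plain FCM on data X of shape (n, p); returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    u = rng.random((X.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)             # each row of memberships sums to 1
    for _ in range(n_iter):
        w = u ** m                                # fuzzified weights
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                  # guard against zero distances
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.abs(u_new - u).max()
        u = u_new
        if tol > shift:                           # memberships have stopped changing
            break
    return centers, u


def mean_silhouette(X, labels):
    """Average silhouette width for hard labels (singleton clusters score 0)."""
    ids = np.unique(labels)
    if ids.size == 1:
        return float("nan")                       # undefined for a single cluster
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(X.shape[0])
    for i in range(X.shape[0]):
        own = labels == labels[i]
        if own.sum() == 1:
            continue
        a = dist[i, own].sum() / (own.sum() - 1)  # excludes the zero self-distance
        b = min(dist[i, labels == k].mean() for k in ids if k != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())


# Synthetic stand-in data: two well-separated groups of 30 points each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(4.0, 0.5, (30, 2))])

scores = {}
for c in (2, 3, 4):
    for m in np.round(np.arange(1.5, 5.01, 0.1), 1):   # m = 1.5, 1.6, ..., 5.0
        _, u = fuzzy_c_means(X, c, float(m))
        scores[(c, float(m))] = mean_silhouette(X, u.argmax(axis=1))

best_c, best_m = max(scores, key=scores.get)
print(best_c, best_m, round(scores[(best_c, best_m)], 3))
```
</preformat>
<p>Because the silhouette in this sketch is computed on hardened labels, it measures how distinct the resulting groups are. Other ways of computing silhouettes for fuzzy partitions exist and could rank the combinations differently.</p>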
</sec>
<sec>
<title>4.4 Results from clustering for daily streamflow</title>
<p>New rainfall-runoff models were created using clustering for the four scenarios discussed in the previous section. <xref ref-type="fig" rid="F8">Figure 8</xref> summarizes how each model created from clustering performs for daily streamflow estimation.</p>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Daily streamflow KGE using FCM for calibration.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0008.tif"/>
</fig>
<p><xref ref-type="fig" rid="F8">Figure 8</xref> shows that some models calibrated using FCM display improved KGEs compared to the uncalibrated models. Both models with no fuzzification factor (FCM: C = 2 and FCM: C = 3) display median KGEs slightly below 0.50, while the models with a fuzzification factor (FCM: C = 2, M = 1.5, and FCM: C = 3, M = 1.5) display much poorer median KGEs of around 0.30. All four models calibrated using FCM display minimum KGEs close to 0, but this is still an improvement over the uncalibrated models, for which the 25<sup>th</sup> percentile KGE is just above 0. The silhouette analysis suggested that distinct clusters may provide more optimal solutions, and the results in <xref ref-type="fig" rid="F8">Figure 8</xref> support this: on average, using distinct clusters improves KGE considerably more than using a fuzzification factor.</p>
<p>In <xref ref-type="fig" rid="F9">Figures 9A</xref>, <xref ref-type="fig" rid="F9">B</xref>, the models are once again separated by area for further analysis.</p>
<fig id="F9" position="float">
<label>Figure 9</label>
<caption><p>Results from calibration for basins smaller than 100 mi<sup>2</sup> <bold>(A)</bold> and basins larger than 100 mi<sup>2</sup> <bold>(B)</bold> using KGE to evaluate daily streamflow estimation.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0009.tif"/>
</fig>
<p>The results from <xref ref-type="fig" rid="F9">Figures 9A</xref>, <xref ref-type="fig" rid="F9">B</xref> provide further explanation for the general results given in <xref ref-type="fig" rid="F8">Figure 8</xref>. <xref ref-type="fig" rid="F9">Figure 9A</xref> shows that results for the smaller basins are very similar to the overall results given in <xref ref-type="fig" rid="F8">Figure 8</xref>: models calibrated with distinct clusters once again provide only slightly poorer median KGEs than the models calibrated to daily streamflow at each location. It also supports that using distinct clusters provides a substantial improvement over using a fuzzification factor. However, <xref ref-type="fig" rid="F9">Figure 9B</xref> shows that for the larger basins, all models created using clustering (with or without a fuzzification factor) display almost the same median KGE of about 0.30. The models created using a fuzzification factor even display slightly higher median KGEs, with smaller interquartile ranges and better minimum KGEs. Overall, results from <xref ref-type="fig" rid="F9">Figures 9A</xref>, <xref ref-type="fig" rid="F9">B</xref> suggest that distinct clusters should be used when creating physical models for smaller basins (&#x0003C; 100 mi<sup>2</sup>), but for larger basins (&#x0003E;100 mi<sup>2</sup>), adding a fuzzification factor may provide similar results on average but with less variability.</p>
<p>Next, the models are evaluated specifically for their ability to estimate 7Q10s.</p>
</sec>
<sec>
<title>4.5 Results from clustering for 7Q10 estimation</title>
<p>Similar to Section 4.2, an analysis of all models for 7Q10 estimation is provided in <xref ref-type="table" rid="T6">Table 6</xref>.</p>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Error metrics for 7Q10 estimation (only StreamStats locations).</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>7Q10 estimation technique</bold></th>
<th valign="top" align="center"><bold>Median relative bias (cfs/cfs)</bold></th>
<th valign="top" align="center"><bold>RMSE (cfs)</bold></th>
<th valign="top" align="center"><bold>RRMSE (cfs/cfs)</bold></th>
<th valign="top" align="center"><bold>UA-RMSE (cfs/mi<sup>2</sup>)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Uncalibrated rainfall-runoff models</td>
<td valign="top" align="center">0.81</td>
<td valign="top" align="center">13.99</td>
<td valign="top" align="center">116.48</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">Calibrated rainfall-runoff models</td>
<td valign="top" align="center">0.61</td>
<td valign="top" align="center">12.33</td>
<td valign="top" align="center">102.63</td>
<td valign="top" align="center">0.07</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 2)</td>
<td valign="top" align="center">0.84</td>
<td valign="top" align="center">14.31</td>
<td valign="top" align="center">119.16</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 2, M = 1.5)</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">19.34</td>
<td valign="top" align="center">161.10</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 3)</td>
<td valign="top" align="center">0.83</td>
<td valign="top" align="center">14.47</td>
<td valign="top" align="center">120.48</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 3, M = 1.5)</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">19.35</td>
<td valign="top" align="center">161.08</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">StreamStats</td>
<td valign="top" align="center">0.42</td>
<td valign="top" align="center">13.75</td>
<td valign="top" align="center">114.5</td>
<td valign="top" align="center">0.05</td>
</tr></tbody>
</table>
</table-wrap>
<p>The results in <xref ref-type="table" rid="T6">Table 6</xref> suggest that even with the improvement in daily streamflow estimation shown in Section 4.4, none of the models created using FCM provides better median relative bias, RMSE, RRMSE, or UA-RMSE than StreamStats. The models created using distinct clusters have only slightly higher RMSE and RRMSE than StreamStats but substantially higher median relative bias and UA-RMSE. <xref ref-type="table" rid="T7">Table 7</xref> examines this further by separating small and large basins.</p>
<table-wrap position="float" id="T7">
<label>Table 7</label>
<caption><p>Error metrics for 7Q10 estimation by basin size, including models created using FCM (only StreamStats locations).</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>7Q10 estimation technique</bold></th>
<th valign="top" align="center"><bold>Median relative bias (cfs/cfs)</bold></th>
<th valign="top" align="center"><bold>RMSE (cfs)</bold></th>
<th valign="top" align="center"><bold>RRMSE (cfs/cfs)</bold></th>
<th valign="top" align="center"><bold>UA-RMSE (cfs/mi<sup>2</sup>)</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:#dee1e1">
<td valign="top" align="left" colspan="5"><bold>Basins smaller than 100 mi</bold><sup>2</sup><bold>-35 locations</bold></td>
</tr>
<tr>
<td valign="top" align="left">Uncalibrated rainfall-runoff models</td>
<td valign="top" align="center">0.90</td>
<td valign="top" align="center">7.26</td>
<td valign="top" align="center">188.82</td>
<td valign="top" align="center">0.09</td>
</tr>
<tr>
<td valign="top" align="left">Calibrated rainfall-runoff models</td>
<td valign="top" align="center">0.90</td>
<td valign="top" align="center">5.47</td>
<td valign="top" align="center">142.24</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 2)</td>
<td valign="top" align="center">0.81</td>
<td valign="top" align="center">5.63</td>
<td valign="top" align="center">146.53</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 2, M = 1.5)</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">6.06</td>
<td valign="top" align="center">157.72</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 3)</td>
<td valign="top" align="center">0.82</td>
<td valign="top" align="center">5.55</td>
<td valign="top" align="center">144.47</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 3, M = 1.5)</td>
<td valign="top" align="center">0.98</td>
<td valign="top" align="center">6.08</td>
<td valign="top" align="center">158.08</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">StreamStats</td>
<td valign="top" align="center">0.42</td>
<td valign="top" align="center">2.23</td>
<td valign="top" align="center">58.00</td>
<td valign="top" align="center">0.04</td>
</tr>
<tr style="background-color:#dee1e1">
<td valign="top" align="left" colspan="5"><bold>Basins larger than 100 mi</bold><sup>2</sup><bold>-31 locations</bold></td>
</tr>
<tr>
<td valign="top" align="left">Uncalibrated rainfall-runoff models</td>
<td valign="top" align="center">0.65</td>
<td valign="top" align="center">19.57</td>
<td valign="top" align="center">92.17</td>
<td valign="top" align="center">0.07</td>
</tr>
<tr>
<td valign="top" align="left">Calibrated rainfall-runoff models</td>
<td valign="top" align="center">0.42</td>
<td valign="top" align="center">16.25</td>
<td valign="top" align="center">76.53</td>
<td valign="top" align="center">0.06</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 2)</td>
<td valign="top" align="center">0.88</td>
<td valign="top" align="center">20.00</td>
<td valign="top" align="center">94.23</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 2, M = 1.5)</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">27.48</td>
<td valign="top" align="center">129.44</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 3)</td>
<td valign="top" align="center">0.88</td>
<td valign="top" align="center">20.27</td>
<td valign="top" align="center">95.49</td>
<td valign="top" align="center">0.07</td>
</tr>
<tr>
<td valign="top" align="left">FCM (C = 3, M = 1.5)</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">27.47</td>
<td valign="top" align="center">129.45</td>
<td valign="top" align="center">0.08</td>
</tr>
<tr>
<td valign="top" align="left">StreamStats</td>
<td valign="top" align="center">0.32</td>
<td valign="top" align="center">19.92</td>
<td valign="top" align="center">93.86</td>
<td valign="top" align="center">0.06</td>
</tr></tbody>
</table>
</table-wrap>
<p><xref ref-type="table" rid="T7">Table 7</xref> shows that for small basins, StreamStats provides by far the best 7Q10 estimation for all metrics employed: its median relative bias, RMSE, RRMSE, and UA-RMSE are roughly half of those of all other methods, or less. For larger basins, however, results are more mixed. StreamStats provides the best median relative bias by far, but for all other metrics, the calibrated models and the models created with no fuzzification factor perform comparably to StreamStats for 7Q10 estimation. The calibrated models provide better RMSE and RRMSE, as well as an identical UA-RMSE. The models created using FCM with no fuzzification factor display comparable but slightly larger RMSE, RRMSE, and UA-RMSE than StreamStats.</p>
<p>In <xref ref-type="fig" rid="F10">Figures 10A</xref>&#x02013;<xref ref-type="fig" rid="F10">D</xref>, the 7Q10 bias for each model is plotted against drainage area. Bias is calculated as [estimated&#x02013;actual], so points above the 0 line are overestimates and points below are underestimates. The line through the points is a loess smoothing curve (<ext-link ext-link-type="uri" xlink:href="https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/loess">https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/loess</ext-link>), with the corresponding standard error highlighted in gray. This provides those interested in using these models for 7Q10 estimation with three valuable insights:</p>
<list list-type="order">
<list-item><p>For any of the models, the average bias that can be expected with a basin area of X.</p></list-item>
<list-item><p>The corresponding confidence in that estimate, given by the highlighted area.</p></list-item>
<list-item><p>If any of the models are consistently overestimating or underestimating 7Q10s for a range of basin sizes.</p></list-item>
</list>
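<p>The bias calculation and smoothing described above can be sketched as follows. This is a simplified stand-in and not the R loess routine used for the figure: it fits a tricube-weighted local straight line at each point, assumes a span of 0.75, requires distinct x values, and does not compute the standard-error band. The function name local_linear_smooth and all values are hypothetical.</p>
<preformat>
```python
import numpy as np


def local_linear_smooth(x, y, span=0.75):
    """Tricube-weighted local linear fit at each x (a simplified loess-style smoother)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = max(3, int(np.ceil(span * x.size)))       # neighbours used for each local fit
    fitted = np.empty(x.size)
    for i, x0 in enumerate(x):
        dist = np.abs(x - x0)
        h = np.sort(dist)[k - 1]                  # bandwidth: distance to the k-th nearest point
        w = np.clip(1.0 - (dist / h) ** 3, 0.0, None) ** 3
        # np.polyfit applies weights to residuals before squaring,
        # so the square root of the tricube weights is passed.
        slope, intercept = np.polyfit(x, y, 1, w=np.sqrt(w))
        fitted[i] = slope * x0 + intercept
    return fitted


# Hypothetical drainage areas (square miles) and 7Q10s (cfs); not study data.
area = np.array([5.0, 20.0, 45.0, 80.0, 120.0, 180.0, 260.0, 400.0])
actual = np.array([0.2, 0.9, 2.0, 4.1, 6.5, 9.8, 15.0, 24.0])
estimated = np.array([0.3, 0.8, 1.5, 3.0, 4.6, 7.0, 11.5, 19.0])

bias = estimated - actual                 # positive = overestimation, negative = underestimation
trend = local_linear_smooth(area, bias)   # smoothed bias as a function of drainage area
trend_sign = np.sign(trend)               # -1 marks areas where the smoothed bias is an underestimate

for a_i, b_i, t_i in zip(area, bias, trend):
    print(f"{a_i:6.0f} mi2  bias {b_i:+.2f} cfs  smoothed {t_i:+.2f} cfs")
```
</preformat>
<p>Reading the sign of the smoothed curve across drainage areas corresponds to the third insight in the list: a curve that stays below 0 over a range of basin sizes indicates consistent underestimation for basins of that size.</p>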
<fig id="F10" position="float">
<label>Figure 10</label>
<caption><p><bold>(A&#x02013;D)</bold> Bias created from each model used, arranged by drainage area.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frwa-06-1332888-g0010.tif"/>
</fig>
<p>For both StreamStats and the calibrated models, the average bias remains around 0 across the full range of basin sizes. There is some minimal oscillation that suggests slight overestimation or underestimation depending on the basin size, but 0 remains within the highlighted confidence interval for the full range of basin sizes. For the uncalibrated models and the models created using FCM (C = 2), there are clear patterns that demonstrate weaknesses in these modeling approaches. The uncalibrated models heavily underestimate 7Q10s for basin sizes from 50 to 250 mi<sup>2</sup>, and the models calibrated with FCM (C = 2) consistently underestimate 7Q10s for basins larger than 50 mi<sup>2</sup>. Given the results in <xref ref-type="table" rid="T7">Table 7</xref> and <xref ref-type="fig" rid="F10">Figures 10A</xref>&#x02013;<xref ref-type="fig" rid="F10">D</xref>, the only physical models that provide sufficient 7Q10 estimation are the individually calibrated hydrologic models, which cannot be used in ungaged basins.</p>
</sec>
</sec>
<sec id="s5">
<title>5 Conclusions</title>
<p>The results of this experiment support that resource managers who require 7Q10 estimates in gaged basins can use calibrated models in basins larger than 100 mi<sup>2</sup> and expect errors similar to those of current statistical estimates. For smaller basins, statistical estimates still provide smaller median relative bias, RMSE, RRMSE, and UA-RMSE for 7Q10 estimation. 7Q10s in these smaller basins can be extremely small values, however, and users should consider how accurate estimates need to be for their specific design application. For example, the 7Q10 is frequently used in wastewater treatment plant design as a mixing flow. Though it is difficult to attribute exact costs to changes in the value, an estimated mixing flow of 1.00 cfs could lead to a significantly different design than a mixing flow of 0.10 cfs, even though the two differ by only 0.90 cfs. However, in the case of a future climate change study, where the accuracy of the value itself is less important than model-predicted future changes in the value, the process-based estimates readily allow for the incorporation of altered climate and/or landcover projections. Given few alternatives, the process-based models also allow resource managers to estimate 7Q10s in states where StreamStats 7Q10 estimation is not yet developed. Future work related to this study could include testing this procedure with other process-based models, determining other ways to calibrate physical models in ungaged locations for both daily streamflow and low flow estimation, and further analyzing the performance of physical models for gaged and ungaged high flow estimation (e.g., 100-year-flood estimation).</p>
</sec>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="supplementary-material" rid="SM1">Supplementary material</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec sec-type="author-contributions" id="s7">
<title>Author contributions</title>
<p>AD: Writing&#x02014;original draft. RP: Writing&#x02014;review &#x00026; editing. KA: Writing&#x02014;review &#x00026; editing.</p>
</sec>
</body>
<back>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was funded by U.S. Geological Survey Northeast Climate Adaptation Science Center award G21AC10556, &#x0201C;A decision support system for estimating changes in extreme floods and droughts in the Northeast U.S.,&#x0201D; to AD.</p>
</sec>
<ack><p>The authors thank Parker Norton and Jacob Lafontaine of the United States Geological Survey for providing hydrologic models using extractions from the National Hydrologic Modeling framework, access to the Let Us Calibrate Software, and initial advice on modeling steps.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec sec-type="supplementary-material" id="s10">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/frwa.2024.1332888/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/frwa.2024.1332888/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/></sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Asquith</surname> <given-names>W. H.</given-names></name> <name><surname>Thompson</surname> <given-names>D. B.</given-names></name></person-group> (<year>2008</year>). <source>Alternative regression equations for estimation of annual peak-streamflow frequency for undeveloped watersheds in Texas using PRESS minimization</source>. U.S. Geological Survey Scientific Investigations Report 2008&#x02013;5084 p. 40. <pub-id pub-id-type="doi">10.3133/sir20085084</pub-id></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Austin</surname> <given-names>S. H.</given-names></name> <name><surname>Krstolic</surname> <given-names>J. L.</given-names></name> <name><surname>Wiegand</surname> <given-names>U.</given-names></name></person-group> (<year>2011</year>). <source>Low-flow characteristics of Virginia streams</source>. U.S. Geological Survey Scientific Investigations Report 2011&#x02013;5143 p. 122. <pub-id pub-id-type="doi">10.3133/sir20115143</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ayers</surname> <given-names>J.</given-names></name> <name><surname>Villarini</surname> <given-names>G.</given-names></name> <name><surname>Jones</surname> <given-names>C.</given-names></name> <name><surname>Schilling</surname> <given-names>K.</given-names></name> <name><surname>Farmer</surname> <given-names>W.</given-names></name></person-group> (<year>2022</year>). <article-title>The role of climate in monthly baseflow changes across the continental United States</article-title>. <source>J. Hydrol. Eng</source>. 27, 2170. <pub-id pub-id-type="doi">10.1061/(ASCE)HE.1943-5584.0002170</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bayazit</surname> <given-names>M.</given-names></name></person-group> (<year>2015</year>). <article-title>Nonstationarity of hydrological records and recent trends in trend analysis: a state-of-the-art review</article-title>. <source>Environ. Process</source>. <volume>2</volume>, <fpage>527</fpage>&#x02013;<lpage>542</lpage>. <pub-id pub-id-type="doi">10.1007/s40710-015-0081-7</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beck</surname> <given-names>H. E.</given-names></name> <name><surname>van Dijk</surname> <given-names>A. I. J. M.</given-names></name> <name><surname>de Roo</surname> <given-names>A.</given-names></name> <name><surname>Miralles</surname> <given-names>D. G.</given-names></name> <name><surname>McVicar</surname> <given-names>T. R.</given-names></name> <name><surname>Schellekens</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>A Global-scale regionalization of hydrologic model parameters</article-title>. <source>Water Resour. Res</source>. <volume>52</volume>, <fpage>3599</fpage>&#x02013;<lpage>3622</lpage>. <pub-id pub-id-type="doi">10.1002/2015WR018247</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bent</surname> <given-names>G. C.</given-names></name> <name><surname>Steeves</surname> <given-names>P. A.</given-names></name> <name><surname>Waite</surname> <given-names>A. M.</given-names></name></person-group> (<year>2014</year>). <source>Equations for estimating selected streamflow statistics in Rhode Island.</source> U.S. Geological Survey Scientific Investigations Report 2014&#x02013;5010 p. 65. <pub-id pub-id-type="doi">10.3133/sir20145010</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berghuijs</surname> <given-names>W. R.</given-names></name> <name><surname>Woods</surname> <given-names>R. A.</given-names></name> <name><surname>Hutton</surname> <given-names>C. J.</given-names></name> <name><surname>Sivapalan</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <article-title>Dominant flood generating mechanisms across the United States</article-title>. <source>Geophys. Res. Lett.</source> <volume>43</volume>, <fpage>4382</fpage>&#x02013;<lpage>4390</lpage>. <pub-id pub-id-type="doi">10.1002/2016GL068070</pub-id></citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blum</surname> <given-names>A. G.</given-names></name> <name><surname>Archfield</surname> <given-names>S. A.</given-names></name> <name><surname>Hirsch</surname> <given-names>R. M.</given-names></name> <name><surname>Vogel</surname> <given-names>R. M.</given-names></name> <name><surname>Kiang</surname> <given-names>J. E.</given-names></name> <name><surname>Dudley</surname> <given-names>R. W.</given-names></name></person-group> (<year>2019</year>). <article-title>Updating estimates of low-streamflow statistics to account for possible trends</article-title>. <source>Hydrol. Sci. J.</source> <volume>64</volume>, <fpage>1404</fpage>&#x02013;<lpage>1414</lpage>. <pub-id pub-id-type="doi">10.1080/02626667.2019.1655148</pub-id></citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bock</surname> <given-names>A. E.</given-names></name> <name><surname>Santiago</surname> <given-names>M.</given-names></name> <name><surname>Wieczorek</surname> <given-names>M. E.</given-names></name> <name><surname>Foks</surname> <given-names>S. S.</given-names></name> <name><surname>Norton</surname> <given-names>P. A.</given-names></name> <name><surname>Lombard</surname> <given-names>M. A.</given-names></name></person-group> (<year>2020</year>). <source>Geospatial Fabric for National Hydrologic Modeling, version 1.1 (ver. 3.0, November 2021).</source> U.S. Geological Survey data release.</citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Boyle</surname> <given-names>D. P.</given-names></name> <name><surname>Gupta</surname> <given-names>H. V.</given-names></name> <name><surname>Sorooshian</surname> <given-names>S.</given-names></name></person-group> (<year>2000</year>). <article-title>Toward improved calibration of hydrologic models: combining the strengths of manual and automatic methods</article-title>. <source>Water Resour. Res</source>. <volume>36</volume>, <fpage>3663</fpage>&#x02013;<lpage>3674</lpage>. <pub-id pub-id-type="doi">10.1029/2000WR900207</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brunner</surname> <given-names>M. I.</given-names></name> <name><surname>Slater</surname> <given-names>L.</given-names></name> <name><surname>Tallaksen</surname> <given-names>L. M.</given-names></name> <name><surname>Clark</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Challenges in modeling and predicting floods and droughts: a review</article-title>. <source>WIREs Water</source>. 8, e1520. <pub-id pub-id-type="doi">10.1002/wat2.1520</pub-id></citation>
</ref>
<ref id="B12">
<citation citation-type="web"><person-group person-group-type="author"><collab>California Department of Water Resources</collab></person-group> (<year>2021</year>). <source>Water Year 2021: An Extreme Year</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://water.ca.gov/-/media/DWR-Website/Web-Pages/Water-Basics/Drought/Files/Publications-And-Reports/091521-Water-Year-2021-broch_v2.pdf">https://water.ca.gov/-/media/DWR-Website/Web-Pages/Water-Basics/Drought/Files/Publications-And-Reports/091521-Water-Year-2021-broch_v2.pdf</ext-link> (accessed September 9, 2023).</citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Castellarin</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <article-title>Regional prediction of flow-duration curves using a three-dimensional kriging</article-title>. <source>J. Hydrol</source>. <volume>513</volume>, <fpage>179</fpage>&#x02013;<lpage>191</lpage>. <pub-id pub-id-type="doi">10.1016/j.jhydrol.2014.03.050</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Champagne</surname> <given-names>O.</given-names></name> <name><surname>Arain</surname> <given-names>M. A.</given-names></name> <name><surname>Leduc</surname> <given-names>M.</given-names></name> <name><surname>Coulibaly</surname> <given-names>P.</given-names></name> <name><surname>McKenzie</surname> <given-names>S.</given-names></name></person-group> (<year>2020</year>). <article-title>Future shift in winter streamflow modulated by the internal variability of climate in southern Ontario</article-title>. <source>Hydrol. Earth Syst. Sci</source>. <volume>24</volume>, <fpage>3077</fpage>&#x02013;<lpage>3096</lpage>. <pub-id pub-id-type="doi">10.5194/hess-24-3077-2020</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Driscoll</surname> <given-names>J. M.</given-names></name> <name><surname>Hay</surname> <given-names>L. E.</given-names></name> <name><surname>Vanderhoof</surname> <given-names>M.</given-names></name> <name><surname>Viger</surname> <given-names>R. J.</given-names></name></person-group> (<year>2020</year>). <article-title>Spatiotemporal variability of modeled watershed scale surface-depression storage and runoff for the conterminous United States</article-title>. <source>J. Am. Water Resour. Assoc.</source> <volume>56</volume>, <fpage>16</fpage>&#x02013;<lpage>29</lpage>. <pub-id pub-id-type="doi">10.1111/1752-1688.12826</pub-id></citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duan</surname> <given-names>Q.</given-names></name></person-group> (<year>2003</year>). <article-title>&#x0201C;Global optimization for watershed model calibration,&#x0201D;</article-title> in <source>Calibration of Watershed Models</source>, eds. Q. Duan, H. V. Gupta, S. Sorooshian, A. N. Rousseau, and R. Turcotte. <pub-id pub-id-type="doi">10.1029/WS006</pub-id></citation>
</ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dudley</surname> <given-names>R. W.</given-names></name></person-group> (<year>2004</year>). <source>Estimating Monthly, Annual, and Low 7-Day, 10-Year Streamflows for Ungaged Rivers in Maine.</source> U.S. Geological Survey Scientific Investigations Report 2004-5026, p. 22. <pub-id pub-id-type="doi">10.3133/sir20045026</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dunn</surname> <given-names>J. C.</given-names></name></person-group> (<year>1973</year>). <article-title>A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters</article-title>. <source>J. Cybern.</source> <volume>3</volume>, <fpage>32</fpage>&#x02013;<lpage>57</lpage>. <pub-id pub-id-type="doi">10.1080/01969727308546046</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>El Gharamti</surname> <given-names>M.</given-names></name> <name><surname>McCreight</surname> <given-names>J. L.</given-names></name> <name><surname>Noh</surname> <given-names>S. J.</given-names></name> <name><surname>Hoar</surname> <given-names>T. J.</given-names></name> <name><surname>RafieeiNasab</surname> <given-names>A.</given-names></name> <name><surname>Johnson</surname> <given-names>B. K.</given-names></name></person-group> (<year>2021</year>). <article-title>Ensemble streamflow data assimilation using WRF-Hydro and DART: novel localization and inflation techniques applied to Hurricane Florence flooding</article-title>. <source>Hydrol. Earth Syst. Sci</source>. <volume>25</volume>, <fpage>5315</fpage>&#x02013;<lpage>5336</lpage>. <pub-id pub-id-type="doi">10.5194/hess-25-5315-2021</pub-id></citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>England</surname> <given-names>J. F.</given-names></name> <name><surname>Cohn</surname> <given-names>T. A.</given-names></name> <name><surname>Faber</surname> <given-names>B. A.</given-names></name> <name><surname>Stedinger</surname> <given-names>J. R.</given-names></name> <name><surname>Thomas</surname> <given-names>W. O.</given-names></name> <name><surname>Veilleux</surname> <given-names>A. G.</given-names></name> <etal/></person-group>. (<year>2019</year>). <source>Guidelines for determining flood flow frequency &#x02014; Bulletin 17C (ver. 1.1, May 2019).</source> U.S. Geological Survey Techniques and Methods, p. 148. <pub-id pub-id-type="doi">10.3133/tm4B5</pub-id></citation>
</ref>
<ref id="B21">
<citation citation-type="web"><person-group person-group-type="author"><collab>EPA Office of Water</collab></person-group> (<year>2018</year>). <source>Low Flow Statistics Tools: A How-To Handbook for NPDES Permit Writers</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.epa.gov/sites/default/files/2018-11/documents/low_flow_stats_tools_handbook.pdf">https://www.epa.gov/sites/default/files/2018-11/documents/low_flow_stats_tools_handbook.pdf</ext-link> (accessed October 3, 2023).</citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Farmer</surname> <given-names>W. H.</given-names></name> <name><surname>LaFontaine</surname> <given-names>J. H.</given-names></name> <name><surname>Hay</surname> <given-names>L. E.</given-names></name></person-group> (<year>2019</year>). <article-title>Calibration of the US Geological Survey National Hydrologic Model in ungauged basins using statistical at-site streamflow simulations</article-title>. <source>J. Hydrol. Eng.</source> <volume>24</volume>, <fpage>11</fpage>. <pub-id pub-id-type="doi">10.1061/(ASCE)HE.1943-5584.0001854</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ferreira</surname> <given-names>R. G.</given-names></name> <name><surname>da Silva</surname> <given-names>D. D.</given-names></name> <name><surname>Elesbon</surname> <given-names>A. A.</given-names></name> <name><surname>Fernandes-Filho</surname> <given-names>E. I.</given-names></name> <name><surname>Veloso</surname> <given-names>G. V.</given-names></name> <name><surname>de Souza Fraga</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Machine learning models for streamflow regionalization in a tropical watershed</article-title>. <source>J. Environ. Manage</source>. 280, 111713. <pub-id pub-id-type="doi">10.1016/j.jenvman.2020.111713</pub-id><pub-id pub-id-type="pmid">33257181</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Flynn</surname> <given-names>R. H.</given-names></name> <name><surname>Tasker</surname> <given-names>G. D.</given-names></name></person-group> (<year>2002</year>). <source>Development of Regression Equations to Estimate Flow Durations and Low-Flow-Frequency Statistics in New Hampshire Streams</source>. U.S. Geological Survey Water-Resources Investigations Report 02-4298, p. 66.</citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Formetta</surname> <given-names>G.</given-names></name> <name><surname>Mantilla</surname> <given-names>R.</given-names></name> <name><surname>Franceschi</surname> <given-names>S.</given-names></name> <name><surname>Antonello</surname> <given-names>A.</given-names></name> <name><surname>Rigon</surname> <given-names>R.</given-names></name></person-group> (<year>2011</year>). <article-title>The JGrass-New Age system for forecasting and managing the hydrological budgets at the basin scale: models of flow generation and propagation/routing</article-title>. <source>Geosci. Model Dev</source>. <volume>4</volume>, <fpage>943</fpage>&#x02013;<lpage>955</lpage>. <pub-id pub-id-type="doi">10.5194/gmd-4-943-2011</pub-id></citation>
</ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Golian</surname> <given-names>S.</given-names></name> <name><surname>Murphy</surname> <given-names>C.</given-names></name> <name><surname>Meresa</surname> <given-names>H.</given-names></name></person-group> (<year>2021</year>). <article-title>Regionalization of hydrological models for flow estimation in ungauged catchments in Ireland</article-title>. <source>J. Hydrol. Reg. Stud.</source> <volume>36</volume>, <fpage>100859</fpage>. <pub-id pub-id-type="doi">10.1016/j.ejrh.2021.100859</pub-id></citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name></person-group> (<year>2021</year>). <article-title>Regionalization of hydrological modeling for predicting streamflow in ungauged catchments: a comprehensive review</article-title>. <source>WIREs Water</source>. 8, e1487. <pub-id pub-id-type="doi">10.1002/wat2.1487</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gupta</surname> <given-names>H. V.</given-names></name> <name><surname>Kling</surname> <given-names>H.</given-names></name> <name><surname>Yilmaz</surname> <given-names>K. K.</given-names></name> <name><surname>Martinez</surname> <given-names>G. F.</given-names></name></person-group> (<year>2009</year>). <article-title>Decomposition of the mean squared error and NSE performance criteria: implications for improving hydrological modeling</article-title>. <source>J. Hydrol</source>. <volume>377</volume>, <fpage>80</fpage>&#x02013;<lpage>91</lpage>. <pub-id pub-id-type="doi">10.1016/j.jhydrol.2009.08.003</pub-id></citation>
</ref>
<ref id="B29">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gupta</surname> <given-names>V.</given-names></name> <name><surname>Waymire</surname> <given-names>E.</given-names></name></person-group> (<year>1998</year>). <article-title>&#x0201C;Spatial variability and scale invariance in hydrologic regionalization,&#x0201D;</article-title> in <source>Scale Dependence and Scale Invariance in Hydrology</source>, ed. G. Sposito (<publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>88</fpage>&#x02013;<lpage>135</lpage>. <pub-id pub-id-type="doi">10.1017/CBO9780511551864.005</pub-id></citation>
</ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hay</surname> <given-names>L.</given-names></name> <name><surname>Norton</surname> <given-names>P.</given-names></name> <name><surname>Viger</surname> <given-names>R.</given-names></name> <name><surname>Markstrom</surname> <given-names>S.</given-names></name> <name><surname>Regan</surname> <given-names>R. S.</given-names></name> <name><surname>Vanderhoof</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>Modeling surface-water depression storage in a Prairie Pothole Region</article-title>. <source>Hydrol. Proc</source>. <volume>32</volume>, <fpage>462</fpage>&#x02013;<lpage>479</lpage>. <pub-id pub-id-type="doi">10.1002/hyp.11416</pub-id></citation>
</ref>
<ref id="B31">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Hay</surname> <given-names>L. E.</given-names></name> <name><surname>Umemoto</surname> <given-names>M.</given-names></name></person-group> (<year>2007</year>). <source>Multiple-objective stepwise calibration using Luca</source>. U.S. Geological Survey Open-File Report 2006&#x02013;1323, p. 25. Available online at: <ext-link ext-link-type="uri" xlink:href="http://pubs.usgs.gov/of/2006/1323/">http://pubs.usgs.gov/of/2006/1323/</ext-link> (accessed March 22, 2023).</citation>
</ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hesarkazzazi</surname> <given-names>H.</given-names></name> <name><surname>Arabzadeh</surname> <given-names>R.</given-names></name> <name><surname>Hajibabaei</surname> <given-names>M.</given-names></name> <name><surname>Rauch</surname> <given-names>W.</given-names></name> <name><surname>Kjeldsen</surname> <given-names>T. R.</given-names></name> <name><surname>Prosdocimi</surname> <given-names>I.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Stationary vs non-stationary modelling of flood frequency distribution across northwest England</article-title>. <source>Hydrol. Sci. J.</source> <volume>66</volume>, <fpage>729</fpage>&#x02013;<lpage>744</lpage>. <pub-id pub-id-type="doi">10.1080/02626667.2021.1884685</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hodgkins</surname> <given-names>G. A.</given-names></name> <name><surname>Dudley</surname> <given-names>R. W.</given-names></name></person-group> (<year>2011</year>). <article-title>Historical summer base flow and stormflow trends for New England rivers</article-title>. <source>Water Resour. Res</source>. 47, W07528. <pub-id pub-id-type="doi">10.1029/2010WR009109</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hong</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>R.</given-names></name> <name><surname>Wang</surname> <given-names>D.</given-names></name> <name><surname>Qian</surname> <given-names>L. X.</given-names></name> <name><surname>Hu</surname> <given-names>Z. H.</given-names></name></person-group> (<year>2017</year>). <article-title>Spatial interpolation of annual runoff in ungauged basins based on the improved information diffusion model using a genetic algorithm</article-title>. <source>Discr. Dyn. Nat. Soc.</source> <volume>18</volume>, <fpage>1</fpage>&#x02013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.1155/2017/4293731</pub-id></citation>
</ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hrachowitz</surname> <given-names>M.</given-names></name> <name><surname>Savenije</surname> <given-names>H. H. G.</given-names></name> <name><surname>Bl&#x000F6;schl</surname> <given-names>G.</given-names></name> <name><surname>McDonnell</surname> <given-names>J. J.</given-names></name> <name><surname>Sivapalan</surname> <given-names>M.</given-names></name> <name><surname>Pomeroy</surname> <given-names>J. W.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>A decade of Predictions in Ungauged Basins (PUB)&#x02014;A review</article-title>. <source>Hydrol. Sci. J.</source> <volume>58</volume>, <fpage>1198</fpage>&#x02013;<lpage>1255</lpage>. <pub-id pub-id-type="doi">10.1080/02626667.2013.803183</pub-id></citation>
</ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jensen</surname> <given-names>M. E.</given-names></name> <name><surname>Haise</surname> <given-names>H. R.</given-names></name></person-group> (<year>1963</year>). <article-title>Estimating evapotranspiration from solar radiation</article-title>. <source>J. Irrig. Drain. Div.</source> <volume>89</volume>, <fpage>15</fpage>&#x02013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1061/JRCEA4.0000287</pub-id></citation>
</ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kitlasten</surname> <given-names>W.</given-names></name> <name><surname>Morway</surname> <given-names>E. D.</given-names></name> <name><surname>Niswonger</surname> <given-names>R. G.</given-names></name> <name><surname>Gardner</surname> <given-names>M.</given-names></name> <name><surname>White</surname> <given-names>J. T.</given-names></name> <name><surname>Triana</surname> <given-names>E.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Integrated hydrology and operations modeling to evaluate climate change impacts in an agricultural valley irrigated with snowmelt runoff</article-title>. <source>Water Resour. Res.</source> <volume>57</volume>, <fpage>e2020WR027924</fpage>. <pub-id pub-id-type="doi">10.1029/2020WR027924</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kratzert</surname> <given-names>F.</given-names></name> <name><surname>Klotz</surname> <given-names>D.</given-names></name> <name><surname>Herrnegger</surname> <given-names>M.</given-names></name> <name><surname>Sampson</surname> <given-names>A. K.</given-names></name> <name><surname>Hochreiter</surname> <given-names>S.</given-names></name> <name><surname>Nearing</surname> <given-names>G. S.</given-names></name></person-group> (<year>2019</year>). <article-title>Toward improved predictions in ungauged basins: Exploiting the power of machine learning</article-title>. <source>Water Resour. Res</source>. <volume>55</volume>, <fpage>11344</fpage>&#x02013;<lpage>11354</lpage>. <pub-id pub-id-type="doi">10.1029/2019WR026065</pub-id></citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>LaFontaine</surname> <given-names>J. H.</given-names></name> <name><surname>Hart</surname> <given-names>R. M.</given-names></name> <name><surname>Hay</surname> <given-names>L. E.</given-names></name> <name><surname>Farmer</surname> <given-names>W. H.</given-names></name> <name><surname>Bock</surname> <given-names>A. R.</given-names></name> <name><surname>Viger</surname> <given-names>R. J.</given-names></name> <etal/></person-group>. (<year>2019</year>). <source>Simulation of water availability in the Southeastern United States for historical and potential future climate and land-cover conditions.</source> U.S. Geological Survey Scientific Investigations Report 2019&#x02013;5039 p. 83. <pub-id pub-id-type="doi">10.3133/sir20195039</pub-id></citation>
</ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Khandelwal</surname> <given-names>A.</given-names></name> <name><surname>Jia</surname> <given-names>X.</given-names></name> <name><surname>Cutler</surname> <given-names>K.</given-names></name> <name><surname>Ghosh</surname> <given-names>R.</given-names></name> <name><surname>Renganathan</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>Regionalization in a global hydrologic deep learning model: from physical descriptors to random vectors</article-title>. <source>Water Resour. Res.</source> <volume>58</volume>, <fpage>e2021WR031794</fpage>. <pub-id pub-id-type="doi">10.1029/2021WR031794</pub-id></citation>
</ref>
<ref id="B41">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Lins</surname> <given-names>H. F.</given-names></name></person-group> (<year>2012</year>). <source>USGS Hydro-Climatic Data Network (HCDN-2009)</source>. U.S. Geological Survey Fact Sheet 2012-3047, Reston, VA. Available online at: <ext-link ext-link-type="uri" xlink:href="https://pubs.er.usgs.gov/publication/fs20123047">https://pubs.er.usgs.gov/publication/fs20123047</ext-link> (accessed January 18, 2023).</citation>
</ref>
<ref id="B42">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Livneh</surname> <given-names>B.</given-names></name> <collab>National Center for Atmospheric Research Staff</collab></person-group> (<year>2019</year>). <italic>The Climate Data Guide: Livneh gridded precipitation and other meteorological variables for continental US, Mexico and southern Canada</italic>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://climatedataguide.ucar.edu/climate-data/livneh-gridded-precipitation-and-other-meteorological-variables-continental-us-mexico">https://climatedataguide.ucar.edu/climate-data/livneh-gridded-precipitation-and-other-meteorological-variables-continental-us-mexico</ext-link> (accessed December 12, 2019).</citation>
</ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>MacQueen</surname> <given-names>J. B.</given-names></name></person-group> (<year>1967</year>). <article-title>&#x0201C;Some methods for classification and analysis of multivariate observations,&#x0201D;</article-title> in <source>Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability</source> (University of California Press), p. <fpage>281</fpage>&#x02013;<lpage>297</lpage>.</citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Markstrom</surname> <given-names>S. L.</given-names></name> <name><surname>Regan</surname> <given-names>R. S.</given-names></name> <name><surname>Hay</surname> <given-names>L. E.</given-names></name> <name><surname>Viger</surname> <given-names>R. J.</given-names></name> <name><surname>Webb</surname> <given-names>R. M. T.</given-names></name> <name><surname>Payn</surname> <given-names>R. A.</given-names></name> <etal/></person-group>. (<year>2015</year>). <source>PRMS-IV, the precipitation-runoff modeling system, version 4.</source> U.S. Geological Survey Techniques and Methods, p. 158. <pub-id pub-id-type="doi">10.3133/tm6B7</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maurer</surname> <given-names>E. P.</given-names></name> <name><surname>Wood</surname> <given-names>A. W.</given-names></name> <name><surname>Adam</surname> <given-names>J. C.</given-names></name> <name><surname>Lettenmaier</surname> <given-names>D. P.</given-names></name> <name><surname>Nijssen</surname> <given-names>B.</given-names></name></person-group> (<year>2002</year>). <article-title>A long-term hydrologically-based data set of land surface fluxes and states for the conterminous United States</article-title>. <source>J. Clim.</source> <volume>15</volume>, <fpage>3237</fpage>&#x02013;<lpage>3251</lpage>. <pub-id pub-id-type="doi">10.1175/1520-0442(2002)015&#x0003C;3237:ALTHBD&#x0003E;2.0.CO;2</pub-id></citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mekanik</surname> <given-names>F.</given-names></name> <name><surname>Imteaz</surname> <given-names>M. A.</given-names></name> <name><surname>Talei</surname> <given-names>A.</given-names></name></person-group> (<year>2016</year>). <article-title>Seasonal rainfall forecasting by adaptive network-based fuzzy inference system (ANFIS) using large scale climate signals</article-title>. <source>Clim. Dynam</source>. <volume>46</volume>, <fpage>3097</fpage>&#x02013;<lpage>3111</lpage>. <pub-id pub-id-type="doi">10.1007/s00382-015-2755-2</pub-id></citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Milly</surname> <given-names>P. C. D.</given-names></name> <name><surname>Betancourt</surname> <given-names>J.</given-names></name> <name><surname>Falkenmark</surname> <given-names>M.</given-names></name> <name><surname>Hirsch</surname> <given-names>R. M.</given-names></name> <name><surname>Kundzewicz</surname> <given-names>Z. W.</given-names></name> <name><surname>Lettenmaier</surname> <given-names>D. P.</given-names></name> <etal/></person-group>. (<year>2008</year>). <article-title>Stationarity is dead: whither water management?</article-title> <source>Science</source>. <volume>319</volume>, <fpage>573</fpage>&#x02013;<lpage>574</lpage>. <pub-id pub-id-type="doi">10.1126/science.1151915</pub-id><pub-id pub-id-type="pmid">18239110</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mosavi</surname> <given-names>A.</given-names></name> <name><surname>Golshan</surname> <given-names>M.</given-names></name> <name><surname>Choubin</surname> <given-names>B.</given-names></name> <name><surname>Ziegler</surname> <given-names>A. D.</given-names></name> <name><surname>Sigaroodi</surname> <given-names>S. K.</given-names></name> <name><surname>Zhang</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Fuzzy clustering and distributed model for streamflow estimation in ungauged watersheds</article-title>. <source>Sci. Rep.</source> <volume>11</volume>:<fpage>8243</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-87691-0</pub-id><pub-id pub-id-type="pmid">33859280</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nash</surname> <given-names>J. E.</given-names></name> <name><surname>Sutcliffe</surname> <given-names>J. V.</given-names></name></person-group> (<year>1970</year>). <article-title>River flow forecasting through conceptual models part I&#x02014;A discussion of principles</article-title>. <source>J. Hydrol</source>. <volume>10</volume>, <fpage>282</fpage>&#x02013;<lpage>290</lpage>. <pub-id pub-id-type="doi">10.1016/0022-1694(70)90255-6</pub-id></citation>
</ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Regan</surname> <given-names>R. S.</given-names></name> <name><surname>Juracek</surname> <given-names>K. E.</given-names></name> <name><surname>Hay</surname> <given-names>L. E.</given-names></name> <name><surname>Markstrom</surname> <given-names>S. L.</given-names></name> <name><surname>Viger</surname> <given-names>R. J.</given-names></name> <name><surname>Driscoll</surname> <given-names>J. M.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>The U.S. Geological Survey National Hydrologic Model infrastructure: rationale, description, and application of a watershed-scale model for the conterminous United States</article-title>. <source>Environ. Model. Softw.</source> <volume>111</volume>, <fpage>192</fpage>&#x02013;<lpage>203</lpage>. <pub-id pub-id-type="doi">10.1016/j.envsoft.2018.09.023</pub-id></citation>
</ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ries</surname> <given-names>K. G.</given-names> <suffix>III.</suffix></name> <name><surname>Guthrie</surname> <given-names>J. D.</given-names></name> <name><surname>Rea</surname> <given-names>A. H.</given-names></name> <name><surname>Steeves</surname> <given-names>P. A.</given-names></name> <name><surname>Stewart</surname> <given-names>D. W.</given-names></name></person-group> (<year>2008</year>). <source>StreamStats: A Water Resources Web Application.</source> U.S. Geological Survey Fact Sheet 2008-3067, p. 6. <pub-id pub-id-type="doi">10.3133/fs20083067</pub-id></citation>
</ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ries</surname> <given-names>K. G.</given-names> <suffix>III.</suffix></name></person-group> (<year>2000</year>). <source>Methods for estimating low-flow statistics for Massachusetts streams.</source> U.S. Geological Survey Water Resources Investigations Report 00-4135, p. 81.</citation>
</ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rousseeuw</surname> <given-names>P. J.</given-names></name></person-group> (<year>1987</year>). <article-title>Silhouettes: a graphical aid to the interpretation and validation of cluster analysis</article-title>. <source>J. Comput. Appl. Math.</source> <volume>20</volume>, <fpage>53</fpage>&#x02013;<lpage>65</lpage>. <pub-id pub-id-type="doi">10.1016/0377-0427(87)90125-7</pub-id></citation>
</ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rupp</surname> <given-names>D. E.</given-names></name> <name><surname>Chegwidden</surname> <given-names>O. S.</given-names></name> <name><surname>Nijssen</surname> <given-names>B.</given-names></name> <name><surname>Clark</surname> <given-names>M. P.</given-names></name></person-group> (<year>2021</year>). <article-title>Changing river network synchrony modulates projected increases in high flows</article-title>. <source>Water Resour. Res.</source> <volume>57</volume>:<fpage>e2020WR028713</fpage>. <pub-id pub-id-type="doi">10.1029/2020WR028713</pub-id></citation>
</ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salas</surname> <given-names>J. D.</given-names></name> <name><surname>Obeysekera</surname> <given-names>J.</given-names></name> <name><surname>Vogel</surname> <given-names>R. M.</given-names></name></person-group> (<year>2018</year>). <article-title>Techniques for assessing water infrastructure for nonstationary extreme events: a review</article-title>. <source>Hydrol. Sci. J.</source> <volume>63</volume>, <fpage>325</fpage>&#x02013;<lpage>352</lpage>. <pub-id pub-id-type="doi">10.1080/02626667.2018.1426858</pub-id></citation>
</ref>
<ref id="B56">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Seaber</surname> <given-names>P. R.</given-names></name> <name><surname>Kapinos</surname> <given-names>F. P.</given-names></name> <name><surname>Knapp</surname> <given-names>G. L.</given-names></name></person-group> (<year>1987</year>). <source>Hydrologic Unit Maps</source>. <publisher-loc>Washington, DC, USA</publisher-loc>: <publisher-name>US Government Printing Office</publisher-name>.</citation>
</ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Siddique</surname> <given-names>R.</given-names></name> <name><surname>Karmalkar</surname> <given-names>A.</given-names></name> <name><surname>Sun</surname> <given-names>F.</given-names></name> <name><surname>Palmer</surname> <given-names>R.</given-names></name></person-group> (<year>2020</year>). <article-title>Hydrological extremes across the Commonwealth of Massachusetts in a changing climate</article-title>. <source>J. Hydrol. Reg. Stud.</source> <volume>32</volume>:<fpage>100733</fpage>. <pub-id pub-id-type="doi">10.1016/j.ejrh.2020.100733</pub-id></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smakhtin</surname> <given-names>V. U.</given-names></name></person-group> (<year>2001</year>). <article-title>Low flow hydrology: a review</article-title>. <source>J. Hydrol</source>. <volume>240</volume>, <fpage>147</fpage>&#x02013;<lpage>186</lpage>. <pub-id pub-id-type="doi">10.1016/S0022-1694(00)00340-1</pub-id></citation>
</ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Song</surname> <given-names>Z.</given-names></name> <name><surname>Xia</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>G.</given-names></name> <name><surname>She</surname> <given-names>D.</given-names></name> <name><surname>Hu</surname> <given-names>C.</given-names></name> <name><surname>Hong</surname> <given-names>S.</given-names></name></person-group> (<year>2022</year>). <article-title>Regionalization of hydrological model parameters using gradient boosting machine</article-title>. <source>Hydrol. Earth Syst. Sci</source>. <volume>26</volume>, <fpage>505</fpage>&#x02013;<lpage>524</lpage>. <pub-id pub-id-type="doi">10.5194/hess-26-505-2022</pub-id></citation>
</ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steinschneider</surname> <given-names>S.</given-names></name> <name><surname>Yang</surname> <given-names>Y.</given-names></name> <name><surname>Brown</surname> <given-names>C.</given-names></name></person-group> (<year>2014</year>). <article-title>Combining regression and spatial proximity for catchment model regionalization: a comparative study</article-title>. <source>Hydrol. Sci. J</source>. <volume>60</volume>. <pub-id pub-id-type="doi">10.1080/02626667.2014.899701</pub-id></citation>
</ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stuckey</surname> <given-names>M. H.</given-names></name></person-group> (<year>2006</year>). <source>Low-flow, base-flow, and mean-flow regression equations for Pennsylvania streams.</source> U.S. Geological Survey Scientific Investigations Report 2006-5130, p. 84. <pub-id pub-id-type="doi">10.3133/sir20065130</pub-id></citation>
</ref>
<ref id="B62">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Thornton</surname> <given-names>M. M.</given-names></name> <name><surname>Thornton</surname> <given-names>P. E.</given-names></name> <name><surname>Wei</surname> <given-names>Y.</given-names></name> <name><surname>Mayer</surname> <given-names>B. W.</given-names></name> <name><surname>Cook</surname> <given-names>R. B.</given-names></name> <name><surname>Vose</surname> <given-names>R. S.</given-names></name></person-group> (<year>2016</year>). <source>Daymet: Monthly Climate Summaries on a 1-km Grid for North America, Version 3</source>. <publisher-loc>Oak Ridge, TN</publisher-loc>: <publisher-name>ORNL DAAC</publisher-name>.</citation>
</ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><collab>US EPA</collab></person-group> (<year>2019</year>). <source>BASINS 4.5 (Better Assessment Science Integrating point and Non-point Sources) Modeling Framework</source>. National Exposure Research Laboratory, RTP, North Carolina. BASINS Core Manual.</citation>
</ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wiley</surname> <given-names>J. B.</given-names></name></person-group> (<year>2008</year>). <source>Estimating Selected Streamflow Statistics Representative of 1930&#x02013;2002 in West Virginia.</source> U.S. Geological Survey Scientific Investigations Report 2008-5105, Version 2, p. 24. <pub-id pub-id-type="doi">10.3133/sir20085105</pub-id></citation>
</ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Williams</surname> <given-names>A. P.</given-names></name> <name><surname>Cook</surname> <given-names>B. I.</given-names></name> <name><surname>Smerdon</surname> <given-names>J. E.</given-names></name></person-group> (<year>2022</year>). <article-title>Rapid intensification of the emerging southwestern North American megadrought in 2020&#x02013;2021</article-title>. <source>Nat. Clim. Chang.</source> <volume>12</volume>, <fpage>232</fpage>&#x02013;<lpage>234</lpage>. <pub-id pub-id-type="doi">10.1038/s41558-022-01290-z</pub-id></citation>
</ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Worland</surname> <given-names>S. C.</given-names></name> <name><surname>Farmer</surname> <given-names>W. H.</given-names></name> <name><surname>Kiang</surname> <given-names>J. E.</given-names></name></person-group> (<year>2018</year>). <article-title>Improving predictions of hydrological low-flow indices in ungaged basins using machine learning</article-title>. <source>Environ. Model. Softw.</source> <volume>101</volume>, <fpage>169</fpage>&#x02013;<lpage>182</lpage>. <pub-id pub-id-type="doi">10.1016/j.envsoft.2017.12.021</pub-id></citation>
</ref>
<ref id="B67">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wright</surname> <given-names>S.</given-names></name></person-group> (<year>1921</year>). <article-title>Correlation and causation</article-title>. <source>J. Agric. Res</source>. <volume>20</volume>, <fpage>557</fpage>&#x02013;<lpage>585</lpage>.</citation>
</ref>
</ref-list>
</back>
</article>