Tandem Use of Multiple Tracers and Metrics to Identify Dynamic and Slow Hydrological Flowpaths

Current understanding of the dynamic and slow flow paths that support streamflow in mountain headwater catchments is inhibited by the lack of long-term hydrogeochemical data and the frequent use of short residence time age tracers. To address this, the current study combined the traditional mean transit time and the state-of-the-art fraction of young water (Fyw) metrics with stable water isotopes and tritium tracers to characterize the dynamic and slow flow paths at Marshall Gulch, a sub-humid headwater catchment in the Santa Catalina Mountains, Arizona, USA. The results show that Fyw varied significantly with period when using sinusoidal curve fitting methods (e.g., iteratively re-weighted least squares or IRLS), but not when using the transit time distribution (TTD)-based method. Therefore, Fyw estimates from TTD-based methods may be particularly useful for intercomparison of dynamic flow behavior between catchments. However, the utility of 3H to determine Fyw in deeper groundwater was limited due to both data quality and inconsistent seasonal cyclicity of the precipitation 3H time series data. Although a Gamma-type TTD was appropriate to characterize deep groundwater, there were large uncertainties in the estimated Gamma TTD shape parameter arising from the short record length of 3H in deep groundwater. This work demonstrates how co-application of multiple metrics and tracers can yield a more complete understanding of the dynamic and slow flow paths and observable deep groundwater storage volumes that contribute to streamflow in mountain headwater catchments.

Current understanding of the dynamic and slow flow paths that support streamflow in mountain headwater catchments is inhibited by the lack of long-term hydrogeochemical data and the frequent use of short residence time age tracers. To address this, the current study combined the traditional mean transit time and the state-of-the-art fraction of young water (F yw ) metrics with stable water isotopes and tritium tracers to characterize the dynamic and slow flow paths at Marshall Gulch, a sub-humid headwater catchment in the Santa Catalina Mountains, Arizona, USA. The results show that F yw varied significantly with period when using sinusoidal curve fitting methods (e.g., iteratively re-weighted least squares or IRLS), but not when using the transit time distribution (TTD)-based method. Therefore, F yw estimates from TTD-based methods may be particularly useful for intercomparison of dynamic flow behavior between catchments. However, the utility of 3 H to determine F yw in deeper groundwater was limited due to both data quality and inconsistent seasonal cyclicity of the precipitation 3 H time series data. Although a Gamma-type TTD was appropriate to characterize deep groundwater, there were large uncertainties in the estimated Gamma TTD shape parameter arising from the short record length of 3 H in deep groundwater. This work demonstrates how co-application of multiple metrics and tracers can yield a more complete understanding of the dynamic and slow flow paths and observable deep groundwater storage volumes that contribute to streamflow in mountain headwater catchments.

INTRODUCTION
Globally, mountain headwater catchments are critical sources of water to downstream valleyfill aquifers (Viviroli et al., 2003(Viviroli et al., , 2007Kohler and Marselli, 2009;Harpold et al., 2012;Carroll et al., 2019;Milly and Dunne, 2020;Bryan, 2021;Eppolito and Fonseca, 2021). Consequently, accurate characterization of the dynamic and slow flow components of streamflow is needed to more thoroughly understand and predict water quantity and quality derived from these headwater catchments. The current study couples the transit time distribution (TTD)-based mean transit time (mTT) metric and the state-of-the-art fraction of young water (F yw ) metric to stable water isotopes and tritium tracers in order to address this challenge.
Deep groundwater is a critical component of catchment hydrology because it supports baseflow under dry conditions. However, the contribution of deep groundwater is hard to determine using stable water isotope data alone. Therefore, the literature on tracing old groundwater suggests using tritium as a tracer (Stewart et al., 2010(Stewart et al., , 2012. This recommendation is corroborated by previous work indicating that mTTs based on stable water isotopes alone may be underestimated because the TTDs have tails that correspond to longer transit times that can become truncated (DeWalle et al., 1997;Stewart et al., 2010Stewart et al., , 2012Frisbee et al., 2013). Additionally, certain model performance criteria (e.g., Nash-Sutcliffe Efficiency) that are used to evaluate TTDs based on stable water isotope data can become insensitive to longer transit times (Seeger and Weiler, 2014).
The F yw is defined as the fraction of water younger than a certain threshold age. Stable water isotope data and synthetic numerical models suggest that the dynamic nature of flowpaths and the hydrogeochemical behavior of a catchment can be more accurately characterized by the F yw metric than by the mTT (Kirchner, 2016a,b). Principal reasons for this include significant spatiotemporal aggregation errors associated with mTT estimates from heterogeneous catchments, and the fact that the F yw metric is largely insensitive to spatial and temporal aggregation errors when evaluated for annual tracer cycles in inflow and outflow. Given the scarcity of long duration tracer time series data, F ywbased work typically fits annual or seasonal sinusoidal cycles to catchment tracer data (Jasechko, 2016, Stockinger et al., 2016, 2019Jasechko et al., 2017;Clow et al., 2018;Jacobs et al., 2018;von Freyberg et al., 2018;Gallart et al., 2020) such that it remains unknown how F yw may vary with period in either tracer concentrations or tracer flux cycles. However, at basin-floor elevation in the Tucson Basin watershed, Arizona, which includes the present study area, the time series of water isotopes in precipitation spans four decades, and the seasonal tritium time series spans two decades. In both cases, annual and multi-annual cycles are present (Eastoe et al., 2012;Eastoe and Dettman, 2016; Supplementary Figure 1), and similar patterns, adjusted for altitude, are likely in the surrounding mountains. If F yw varies significantly with period without an identifiable trend, both intercomparison of F yw estimates and differences in the dynamic nature of flow paths between catchments will be affected, even when using the same method (e.g., sinusoidal curve fitting) (see also Stockinger et al., 2019). Therefore, examination of the dependence of F yw on period for a study area with long-term tracer time series data represents an opportunity to gain transferrable information about hydrological cycling in mountain headwater catchments.
At Marshall Gulch, Arizona, USA, previous work has reported TTDs and mTTs from high-density stable water isotope data (Heidbüchel et al., 2012;Dwivedi et al., 2021) and tritium-based transit times based on the implicit, a priori assumption of a piston-flow type TTD (Dwivedi et al., 2019a). To compliment this work and provide new insight into the nature of dynamic and slow flow paths, the current study presents an original evaluation of TTD type based on tritium data at Marshall Gulch (MGC). We specifically considered the following questions: (i) What is the appropriate TTD type and mTT for characterization of deep groundwater at MGC? (ii) How does F yw vary with period when using stable water isotope data? And (iii), how does F yw vary with period when using tritium data? We rely on previously collected stable water isotope data in precipitation and streamflow and tritium data collected from streamflow under low flow conditions and from deep groundwater at MGC to address these questions. The tritium-based TTD and mTT are estimated by objectively evaluating various TTD types with set criteria, whereas the F yw estimates are generated from stable water isotope data using both the sinusoidal curve fitting method for periods ranging from 2 days to 5 years and an alternative method that used TTD type and parameters from a previous study (Dwivedi et al., 2021). The tritium-based F yw was estimated using the TTD-based method only.

Study Site
Marshall Gulch (MGC) is a 1.55 km 2 headwater catchment located within the Santa Catalina Mountains ∼26 km northeast of Tucson in southeast Arizona, USA (Figure 1). The elevation at MGC ranges from 2,285 to 2,632 m above sea level (asl) with a mean of 2,428 m asl and a mean topographic slope of 22 • (or ∼40%). Bedrock at the field site is mostly granite at upper elevations and micaceous schist at lower elevations (Dickinson et al., 2002). The prevailing soil texture is sandy loam (Holleran, 2013) with soil depth varying from 0 to 1.5 m (Pelletier and Rasmussen, 2009). Soils overlying micaceous schist are generally deeper and have a higher clay content than soils overlying granite (Heidbüchel et al., 2013;Holleran, 2013a). Based on a 30-year (1981-2010) record, the long-term average annual precipitation at MGC is 920 mm (PRISM Climate Group, 2018). The catchment received an average of 654 mm ± 158 mm (mean ± one standard deviation) of precipitation per year between water years (WY) 2008 through 2017; the mean annual streamflow for the same period was 247 mm ± 138 mm. WY n is defined here as the period from July 1 of year n-1 through June 30 of year n. Instrumentation relevant to this study (Figure 1) is described in detail in the following sections.

Hydrologic Fluxes
The MGC-scale daily precipitation (P) and streamflow (Q) data were calculated between WY 2008 and 2017 (Figures 2A,B; Dwivedi et al., 2019bDwivedi et al., , 2021. Precipitation was observed at 15-min intervals at eight measurement sites equipped with tipping bucket precipitation gages at seven locations and a heated precipitation gage at the remaining site (Figure 1). From the precipitation time series, Thiessen polygon-derived weights were used to estimate daily catchment-scale mean precipitation (Dwivedi et al., 2019b). Streamflow was measured at 30-min FIGURE 1 | Marshall Gulch Catchment (MGC; the catchment boundary is shown in green), located within the Santa Catalina Mountains Critical Zone Observatory (SCM-CZO) in southeast Arizona, USA (inset map), and the precipitation, stream water, and deep groundwater collection sites used in this study. The source of the digital elevation model is U.S. Geological Survey (2021a) and the source for the drainage network data is U.S. Geological Survey (2021b).
intervals at the MG-Weir site (Figure 1) using a pressure transducer (HOBO by Onset U20-001-01; Onset Computer Corporation) with maximum error of 0.62 kPa and accuracy of 0.02 kPa, and a previously derived stage-discharge relationship (Heidbüchel et al., 2012).

Stable Water Isotopes in Precipitation and Streamflow Precipitation
The precipitation bulk samples at MGC were collected using: (i) bulk samplers at the Schist, Fern Valley, and Granite stations and (ii) autosamplers at the MG-Weir and Mt. Lemmon stations (Figure 1). At the Fern Valley, Granite, and Schist stations, two collectors were installed at each station and bulk samples were collected every 5-7 d (Lyon et al., 2009;Heidbüchel et al., 2012). Daily bulk precipitation samples were also collected from the Mt. Lemmon and MG-Weir stations (Figure 1). At the Mt. Lemmon station, sampling mainly focused on the summer monsoon season (Heidbüchel et al., 2012), whereas continuous samples were collected beginning in December 2009 at the MG-Weir station. At all stations, the data density decreased after 2012 due to logistical constraints (see Figure 2C). All samples were analyzed using a DLT-100 laser spectrometer, Los Gatos Research, Inc., model # 908-0008 with analytical precisions (1σ) of 0.37% for δ 2 H and 0.12% for δ 18 O, respectively (Lyon et al., 2009). The catchment-scale time series of δ 18 O in precipitation was calculated as the unweighted mean of results from all stations and was characterized by irregular time intervals between WY 2008 through WY 2012 (Heidbüchel et al., 2012;Dwivedi et al., 2021).

Streamflow
Stream water samples were collected using a Teledyne ISCO autosampler (model 6712c) installed at the MG-Weir site prior to 2012 and by grab sampling after 2012 [ Figure 2D; see also Heidbüchel et al. (2012) for more details]. While the stream water autosampler collected daily samples, sub-daily samples were also collected on the rising and falling limbs of the hydrograph during large runoff events (Heidbüchel et al., 2012). In the current study, sub-daily samples are volume-weighted to daily resolution (Dwivedi et al., 2021).

Tritium in Precipitation and Deep Groundwater
We calculated the amount-weighted time series of Tritium ( 3 H) in Tucson precipitation since 1992 using the following: (i) Eastoe et al. (2004) The 3 H time series data for deep groundwater, i.e., the groundwater residing in the fractured bedrock aquifer at MGC, was assembled from observations in streamflow (baseflow conditions; n = 9) and groundwater from Pigeon Spring (n = 5; Supplementary Table 1 and Figure 3 inset). The five discharge measurements from Pigeon Spring (Figure 1) were considered representative of deep groundwater (Dwivedi et al., 2019a). Data are grouped into half-yearly brackets using the following criteria: (i) sampling months 6-10 of year n, and (ii) sampling months 11 of year n-1 to 5 of year n. For groups with three or more measurements, the data are expressed as a mean ± 1σ (Figure 3 inset).

Stable Water Isotope-Based TTD and mTT
In this study, stable water isotope-based TTD and mTT estimates are based on our previous work (Dwivedi et al., 2021) that estimated TTD and mTT from long-term stable water isotope data in precipitation and streamflow. Using wavelet analysis of tracer flux (i.e., a product of tracer concentration and hydrologic flux for precipitation and stream water), TTD and mTT values were obtained from objectively evaluating the following TTD types: Exponential (Exp), Gamma (Gam), Fixed-path one-dimensional advection-dispersion model [ADE-1x;Maloszewski and Zuber (1982)], Multiple-paths advectiondispersion model [ADE-nx; Kirchner et al. (2001)], and Piston flow (PF). More details are provided in Dwivedi et al. (2021).

Tritium-Based TTD and mTT
The 3 H time series data were used in conjunction with the Stewart et al. (2017) method, i.e., Equation (1) of their work, to estimate the mTT and best fitting TTD for deep groundwater. Previous work at MGC determined that deep groundwater was recharged at a time scale of 0.5 years or less (Ajami et al., 2011;Dwivedi et al., 2019a). Because the half-life of 3 H is 12.32 years (Lucas and Unterweger, 2000), decay of 3 H during recharge was insignificant relative to analytical precision (± 0.5 TU). Therefore, the 3 H time series in precipitation was used as the 3 H time series in recharge to deep groundwater (Figure 3).
The TTD types evaluated using 3 H data had the following fitting parameters: Exp-mTT; Gam-shape parameter (α) and mTT; ADE-1x-Péclet number (Pe) and mTT; ADEnx-Péclet number (Pe) and mTT; and PF-mTT. The following ranges of TTD model parameters were considered: mTT: 1-50 years when using tritium, which serves as a groundwater age tracer at a time scale of 1-50 years (Aggarwal, 2013;Suckow, 2014;Gleeson et al., 2015); shape parameter (α) for the Gamma TTD: 0.1-15 ; average catchment-scale Péclet number (Pe): 0.1-100 (Kirchner et al., 2001;Kirchner and Neal, 2013). When evaluating TTD types, the Downhill Simplex method (Nelder and Mead, 1965;Gupta, 2016) was used to search for optimal model parameters and the modified Kling Gupta efficiency or KGE' (Gupta et al., 2009;Kling et al., 2012) was used as the model performance criterion. A perfectly fitting model will have a KGE' value of zero and the worst fitting model will have a KGE' value of ∞. Following Godsey et al. (2010), the KGE' criterion was estimated in a log-transformed space. Both KGE' and the characteristics of the criterion response surface were utilized to search for the optimum model parameters (Dwivedi et al., 2021). Furthermore, additional tests were also conducted to constrain how uncertainty in amount-weighted precipitation and stream water 3 H concentrations affected both TTD types and its parameters. In these tests, various combinations of uncertainty in amount-weighted precipitation and stream water 3 H concentrations were considered with respect to their means (µ) ± one standard deviation (σ) (see Figure 3), i.e., µ-σ, µ, and µ + σ were specifically related to three possible cases of deep groundwater 3 H concentration uncertainty (Supplementary Table 2). Subsequently, both TTD type and TTD parameters were objectively estimated using the previously mentioned TTD types. The same allowable parameter space for all model parameters was used to evaluate the most suitable TTD type and parameters.

Fraction of Young Water (F yw )
General Description F yw can be estimated from the amplitude ratio of tracer concentrations in outflow and inflow for any tracer (Kirchner, 2016a;von Freyberg et al., 2018). Thus, if the amplitudes of the tracer concentrations in outflow and inflow for period λ are A Q (λ) and A P (λ), respectively, then: This method is preferred when sufficient long-term tracer data are available and can be applied without a priori knowledge of the TTD type (Kirchner, 2016a). However, if a TTD h(τ ) is known, F yw can also be estimated using Equation (2), in which T yw is the threshold age for young water, defined as the upper limit in Equation (2) for which both Equations (1) and (2) provide equivalent values of F yw (λ): For a Gamma type TTD, Equation (2) can be simplified to Equation (3) where β is defined as the ratio of mTT (y) to the shape parameter (α, unitless), κ is the decay constant for a tracer [log e (2)/half-life; 0 for stable water isotope and 0.056 yr −1 for 3 H], and ω is defined as 2 π/λ: In this study, δ 18 O-based F yw calculations were based on tracer fluxes in precipitation and streamflow, whereas the 3 H-based F yw estimates were based on tracer concentrations in recharging and discharging deep groundwater. For consistency with the literature, we express tracer flux-based fraction of young water values as F * yw and tracer concentration-based fraction of young water values as F yw .

Estimation of F * yw Using Stable Water Isotopes
The F yw or F * yw are typically estimated by fitting sinusoidal curves to long-term tracer data in precipitation and streamflow (e.g., Jasechko, 2016;von Freyberg et al., 2018). In the current study, stable water isotope-based F * yw was specifically estimated using the iteratively re-weighted least squares (IRLS) method that determines tracer cycle amplitude by fitting sinusoidal functions to tracer flux data in precipitation and streamflow (Equation 1; Kirchner, 2016a; von Freyberg et al., 2018), and the TTD method (Equation 3). The tracer flux is the product of tracer concentration and hydrologic flux, and daily precipitation (aggregated into time intervals corresponding to the availability of isotope data in precipitation) and streamflow data are used to calculate the fraction of precipitation contributing to streamflow. The IRLS method was specifically applied to estimate F * yw for periods ranging from 2 days to 5 years.

Estimation of F yw Using 3 H
The 3 H tracer data in precipitation and deep groundwater at MGC were sparse relative to δ 18 O, especially for deep groundwater (Figure 3). As a result, application of sinusoidal curve fitting methods such as IRLS to deep groundwater 3 H data was unsatisfactory i.e., fitting a sinusoidal function to the data resulted in significant amplitude uncertainty at a period of 1 year. Therefore, the 3 H F yw estimates were based on the TTD method (Equations 2 and 3 above).
Uncertainty Estimation for F * yw and F yw When using stable water isotope data, the temporal variability of δ 18 O in precipitation was addressed by means of F * yw uncertainty analyses where the temporal variability of δ 18 O was expressed as three statistics: daily mean, mean + 1σ, and mean −1σ calculated for both precipitation (P) and stream water (Q); consideration of all pair combinations resulted in nine scenarios. For each period, the minimum, mean (referred to as the ensemble mean below), maximum, and 1σ of the F * yw results were computed for all nine scenarios. Finally, the ensemble means for F yw estimates from 3 H data were calculated similarly, using mean, mean + 1σ, and mean −1σ of the Gamma TTD parameters α and mTT.

H-Based TTD Type, Parameters, and Modeled Outflow Tracer Concentrations H-Based TTD Type and mTT
Piston-flow (PF) and Gamma (Gam) TTD types performed adequately and yielded TTD parameters within permissible parameter spaces (Table 1, Figure 4), whereas the other TTD types (Exp, ADE-1x, ADE-nx) did not. In addition, the response surfaces for certain TTD types were found to be rough (e.g., Figure 4E), and therefore, the model parameters for each TTD type were estimated from three separate model runs (Figure 4 and Supplementary Table 3). The model performance expressed in terms of KGE' was slightly better for the PF relative to the Gam TTD type (PF KGE' was 29% lower). Comparison of the PF and Gam TTDs further suggested "approximate equifinality" (Kirchner, 2016b)   of variation = 0.05) and 2.17 to 14.58 (unitless) (mean = 6.53; coefficient of variation = 0.64), respectively.

Modeled Tracer Concentrations in Deep Groundwater
Modeled 3 H concentrations in deep groundwater were generally within their observed ranges, and remained within 1σ of their simulated means for both PF and Gam TTDs, but modeled 3 H concentration variability was lower for the Gam TTD (Figure 5). The error bars in Figure 5 were determined by considering the analytical uncertainty in 3 H concentrations and were based on the first set of model runs. Although the ADE-1x TTDbased model produced 3 H concentrations that were within their observed ranges, the estimated TTD parameters were sometimes at the edge of the allowable parameter space (section  (2) and (3) are based on the input and output functions shown in Figure 3.
3H-Based TTD Type and mTT); simulated 3 H concentrations from the ADE-nx TTD-based model were far from observed concentrations and indicated that this TTD type is not applicable at MGC.

Fraction of Young Water
F * yw Based on δ 18 O Considering F * yw variability due to δ 18 O variability in precipitation and stream water, the IRLS method yielded highly variable results, particularly for periods of <1 year (Figure 6A). For a seasonal tracer cycle, i.e., period = 0.5 years, the F * yw estimate was 52 ± 2.1 % (mean ± one standard deviation). However, for periods close to 0.5 years, F * yw estimates were 17 ± 0.3 % (for λ = 0.45 years) and ∼99% (for λ = 0.56 years), respectively. Similarly, for the annual tracer cycle, the F * yw estimate was 11 ± 0.7 %, but for periods close to 1 year, F * yw estimates were 22 ± 1.3% (λ = 0.83 years) and 99 ± 0.6% (λ = 1.25 years). Note that the field site experiences two dominant wet seasons: a winter wet season due to Pacific cold fronts and a summer wet season due to the North American Monsoon (Eastoe and Dettman, 2016) such that both seasonal and annual periods may occur in isotopic time series data for precipitation. Thus, the F * yw estimates using sinusoidal curve fitting to the tracer cycle data showed significant variability with period, which is not yet recognized in the F yw literature.

F yw Based on 3 H
The time series of 3 H in groundwater and streamflow (Figure 3, inset) were too sparse and coarse for reliable estimation of A Q /A P using the IRLS method (Equation 1; see also Supplementary Section Sources of uncertainty for F yw when considering annual cycles of 3 H in precipitation and deep groundwater). Instead, the TTD method (Equation 3) was used to calculate F yw with model parameters drawn from Section 3H-based TTD type and mTT for a Gamma TTD. The pattern of F yw with period was characterized by a gradual increase with period ( Figure 6B). For an annual tracer cycle, the ensemble mean-based F yw was 1.6 ± 2.4 × 10 −3 % (blue triangle in Figure 6B). For the highest period considered in our analysis, 27 years, F yw was 6.4 ± 1.1%. Between periods 1 and 27 years, F yw aand its slope with respect to period increased gradually. For example, dF yw /dλ changed from 0.06%/year (5 < λ < 10 years), to 0.19%/year (10 < λ < 15 years), to 0.36%/year (15 < λ < 20 years), and finally 0.47%/year (20 < λ < 25 years).

DISCUSSION 3 H-Based TTD Type and mTT Estimates
Previous estimates of TTD type and mTT at Marshall Gulch (MGC) were based on stable water isotopes (Heidbüchel et al., 2012;Dwivedi et al., 2021). The current work complements these studies by using a tracer ( 3 H) that is applicable over decades rather than years. For both 3 H and δ 18 O tracers, a Gamma TTD type was appropriate at MGC with an mTT of ∼0.82 years (α = 0.42) using δ 18 O and ∼27 years (α = 6.53) using 3 H; similar differences in 3 H-and δ 18 O-based mTTs have been noted by previous work (e.g., Stewart et al., 2010). The 3 Hbased mTT estimate was close to the 26-year interval over which the amount-weighted 3 H data were available for precipitation and is consistent with ages of bedrock-hosted groundwater from Dwivedi et al. (2019a). Even with the coarser resolution of the tritium data, the current study was able to constrain deep groundwater flow paths at a representative mountain headwater site and identify unique solutions for Gamma TTD parameters ( Figure 4B). Therefore, neither applicability nor poor fit of a particular TTD type should be considered as an indication of data limitation alone. Consequently, the large difference between mTTs from 3 H and δ 18 O can be attributed to the sampled flow regime, i.e., slow flow of deep groundwater vs. highly dynamic flow of near surface and soil waters at MGC and is consistent with the range of applicability of each tracer (DeWalle et al., 1997;Aggarwal, 2013;Suckow, 2014;Dwivedi et al., 2021).
Dependence of F * yw on Period and an Improved Method for F * yw Estimation Estimates of F * yw from the IRLS method depended variably and non-systematically on period (section F * yw Based on δ 18 O). This finding presents a significant impediment to the use of sinusoidal curve fitting methods to estimate F yw from a particular catchment or to compare flow dynamics between catchments. In contrast, the TTD-based method (Equations 2, 3) yielded a systematic relationship between F * yw and period ( Figure 6A). On the basis of stable water isotope data, Dwivedi et al. (2021) reported that a combined Piston Flow and Gamma TTD was applicable for periods below 1 month, and a Gamma TTD alone was applicable for longer periods at MGC. Linking their estimated TTD parameters and corresponding uncertainties for the Gamma TTD type to a method similar to that used to calculate the tritium-based ensemble F yw mean ± standard deviation (Section Estimation of F yw using 3H), F * yw estimates for periods longer than 1 month gradually increased with period. At an annual period, the TTD-based ensemble-mean F * yw at MGC was 34.5%, which is comparable to results of previous work that used similar methods (Table 2). However, recent work FIGURE 6 | (A) F * yw (ensemble mean ± one standard deviation) vs. period (λ) based on δ 18 O data and (B) F yw (ensemble mean ± one standard deviation) vs. period (λ) based on 3 H data. The vertical error bars in each plot denote one standard deviation. The two gray points in (A) show ensemble mean F * yw estimates for seasonal (0.5 years) and annual periods using the IRLS method; the two black triangles show ensemble mean F * yw estimates for seasonal and annual periods using the TTD method. The two vertical dashed lines in (A) mark the seasonal and annual periods. The blue triangle in (B) shows the ensemble mean of F yw for the annual period, calculated using the TTD method and 3 H data.
using least-square fitting (Gallart et al., 2020) and IRLS and TTD (Stockinger et al., 2016) methods has suggested that F yw estimates depend on the tracer sampling frequency. The TTD-based F * yw estimates from the current work support this line of reasoning insofar as the TTD-based F * yw represents a thorough sampling of flowpaths with transit time between 0 and T * yw . In this way, the TTD based F * yw results may be more reliable than the estimates derived from the IRLS methods that potentially lack thorough sampling of flowpaths and show non-systematic variability with period. However, the literature on F * yw is mostly based on IRLS or similar methods with few studies reporting TTD-based results ( Table 2; Supplementary Table 6).

Suitability of Tritium-Based F yw Estimates and Inferred Deep Groundwater Storage From Tritium-Based mTT Estimates
Context for Tritium-Based F yw Estimates The tritium-based F yw values reported in the current study are significantly lower than previously published estimates. The ensemble-mean F yw determined from annual 3 H cycles at MGC was 1.6 x 10 −3 %, or effectively 0% (Figure 6B; Section F yw based on 3 H). For comparison, the lowest tritium-based F yw value in  for an annual cycle was ∼8% for a system composed of two homogeneous sub-systems, each having an mTT of 25 years with a Gamma TTD shape parameter α = 10. We attribute this difference i.e., 8 vs. ∼0%, to either the constant T yw value that  used for both α and tracer cycle period, or the use of multiple lumped parameter models by , as opposed to the F yw values in the current work that are based on fitting a single TTD to the whole 3 H dataset for deep groundwater. Using TTD parameter estimates from rSAS (rank StorAge Selection) functions, Rodriguez et al. (2021) reported a F yw estimate of 1.8% for a forested headwater catchment that is closer to the near-zero F yw estimate at MGC. This difference may result from differences in sampling protocols and/or calculation methods; Rodriguez et al. (2021) sampled stream water under varying flow conditions, in contrast to deep groundwater and base flow sampling in the current work.

Suitability of Tritium-Based F yw Estimates for the Annual Tracer Cycle
A near-zero 3 H-based F yw at MGC calls into question the suitability of the 3 H-based F yw approach for deeper groundwater. The deep groundwater residing in fractured bedrock at MGC has a large mTT (∼27 years), and the annual tracer cycle will be highly damped as a result (F yw ∼ 0; Section F yw based on 3 H). From the standpoint of the IRLS method (Equation 1), useful 3 H data in precipitation or deep groundwater should have an amplitude A P or A Q greater than the 3 H measurement precision (0.5 TU, Section Tritium in Precipitation and Deep Groundwater) for some period between 1 and 27 years. At MGC, this is true for A P at periods <19 years, but not for A Q at any period up to 27 years (Supplementary Section Sources of uncertainty for F yw when considering annual cycles of 3 H in precipitation and deep groundwater); consequently, the available data are inadequate for calculating F yw . This is apparent in the lack of consistent annual periodicity in the Tucson Basin precipitation data (Figure 3), which may not be possible to overcome even with a much larger 3 H dataset.

Inferred Deep Groundwater Storage at MGC
Deep groundwater storage within the fractured bedrock aquifer is important because it recharges the surrounding valley fill aquifer through mountain front and/or mountain block recharge pathways (Wilson and Guan, 2004;Ajami et al., 2011). In the traditional approach [see Rodriguez et al. (2021) and references therein], subsurface storage is a function of the tracer-based mTT multiplied by the long-term mean discharge. Application of this approach to MGC for deep groundwater results in a storage estimate of 6.7 m using a 3 H-based mTT of 27 years and  (von Freyberg et al., 2018).
the observed long-term streamflow (WY 2008to WY 2017. In contrast, Gleeson et al. (2015) determined that global presentday groundwater is equivalent to a depth of only 3 m on the land surface. Acknowledging such issues, Kirchner (2016b)-using stable water isotopes and virtual experiments-suggested the use of volume-weighted rather than time-weighted mTTs. However, Peters et al. (2013) and Dwivedi et al. (2021) have shown that volume-weighted mTTs and time-weighted mTTs differ by only a factor of ∼2 as opposed to orders of magnitude. Volumeweighted mTTs are also difficult to obtain via 3 H data due to the lack of multi-decade observations of both streamflow and tracer concentrations in outflow. Further, the use of mean long-term discharge is problematic as the contribution of deep groundwater to streamflow is likely to be lower than contributions from soil water or other near surface storages. Using end-member mixing analysis, Dwivedi et al. (2019a) reported that deep groundwater contributes 4.5% of the long-term streamflow at MGC. If this fraction is included in the storage calculation, then the storage estimate decreases to a more plausible 0.3 m. Taken together, the results of the current work demonstrate that the tracer-based mTT can be used to estimate storage volumes, provided they are interpreted with appropriate caution.

Limitations of the Proposed Approach and Recommendations for Future Work
Considered together, F yw and TTD parameters such as the shape parameter and mTT for a Gamma TTD provide complementary information about subsurface flowpaths and their dynamic vs. slow behavior. However, the existing literature on F yw mainly reports F yw estimates based on either IRLS or similar sinusoidal curve fitting methods for annual or seasonal tracer cycles ( Table 2 and Supplementary Table 6). The current work demonstrates that flux-weighted F yw estimates or F * yw , when made using sinusoidal curve fitting methods, vary greatly and non-systematically with period ( Figure 6A). For an annual tracer cycle, F * yw from the IRLS method was one-third of F * yw from the TTD method, and it is therefore likely that previously-reported F yw estimates may be underestimated. This would have significant implications for F yw -based understanding of contaminant and nutrient transport, surface water quality (Jasechko, 2016;Kirchner, 2016a), and estimation of TTD parameters (Lutz et al., 2018). As a result, we urge future studies utilizing IRLS or similar methods to report F yw for various periods, in addition to the annual period, in order to better constrain the variability of results. Future studies that report TTD-based results will be useful to characterize the methodological sensitivity of F yw across a broader range of natural systems.
The use of a 3 H-based F yw metric has been recommended for improved understanding of deep and/or slow flowpaths contributing to streamflow (Jacobs et al., 2018;Jasechko, 2019). However, the current study highlights a difficulty: that the 3 H-based F yw metric may be inappropriate when there are insufficient deep groundwater data, which is a general limitation of groundwater aquifers (Gleeson et al., 2015;Rodriguez et al., 2021), including MGC. This limitation can also lead to significant variability in the estimated Gamma TTD parameters when the 3 H tracer is applied to the question of "hidden streamflow" (Stewart et al., 2010(Stewart et al., , 2012Seeger and Weiler, 2014;Jacobs et al., 2018) i.e., the flowpaths that are untraceable by stable water isotope tracers alone. In contrast to F yw , the 3 H-based mTT metric does not depend on any particular period of tracer cycles in inflow and outflow, but aggregation errors may lead to estimates of mTT that are low by several orders of magnitude relative to known mTTs from virtual experiments (Kirchner, 2016b;, especially in heterogeneous catchments such as MGC. The current work also demonstrates that the 3 Hbased mTT can lead to greatly over-estimated deep groundwater storage estimates. To address this issue, appropriate long-term discharge estimates, not including storm runoff or discharge from shallow storages, are critical to accurate storage calculations for deep groundwater (Section Inferred Deep Groundwater Storage at MGC). Although the current literature supports the use of multiple ("lumped") parameter models qualified by site-specific hydrogeological information to reduce aggregation errors in real catchments, model parameters may be difficult to constrain in the multiple parameter approach (Hrachowitz et al., 2009;Jacobs et al., 2018).

CONCLUSIONS
Resultsv of the current study indicate that concurrent application of multiple metrics and short and long-residence time age tracers provide a more complete understanding of the highly transient and slow flow paths that contribute to streamflow in a sub-humid mountain headwater catchment. Among the various combinations of age tracers and metrics that were tested, the most appropriate metrics at MGC included δ 18 Obased F yw and 3 H-based deep groundwater mTT. Application of sinusoidal curve fitting methods (e.g., iteratively re-weighted least square or IRLS) for F yw estimation yielded large, nonsystematic changes in F yw estimates with period. Conversely, F yw estimates using the alternative transit time distribution or TTD-based method showed consistent patterns with period. The current study therefore suggests that data resultant from sinusoidal curve fitting methods to estimate F yw and/or to compare dynamic groundwater flow behavior among catchments must be interpreted with caution.
The Gamma TTD-based mTT for deep groundwater using 3 H data was 27 years. The same methodology yielded an mTT of 0.82 years when based on δ 18 O (Dwivedi et al., 2021); hence, we conclude that the former mTT may correspond to deep groundwater stored in fractured bedrock, whereas the latter applies to shallow storages in the soil profile. The shape parameters of the 3 H-based Gamma TTD at MGC demonstrated significant variability arising from the short length of the available 3 H time series data, and the utility of 3 H for determining F yw in deeper groundwater was limited due to both data quality and inconsistent seasonal cyclicity of the precipitation 3 H time series data. Although data quality could be addressed by longerterm observation and attention to precision, irregular seasonal cyclicity of 3 H in precipitation at some locations may restrict the applicability of this approach. In summary, using F yw together with mTT and stable water isotope and tritium tracers can yield a more complete understanding of highly transient and slow flow paths that contribute to streamflow in mountain headwater catchments.

DATA AVAILABILITY STATEMENT
The original contributions in terms of tritium data presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
RD: conceptualization, formal analysis, investigation, methodology, validation, and visualization. CE: writing-original draft preparation and writing-review and editing. JK, RM, and GB-G: writing-review and editing. JM, PF, TM, and JC: funding acquisition, project administration, resources, supervision, writing-original draft preparation, and writing-review and editing. NA: data curation and writing-review and editing. MS: resources and writing-review and editing. All authors contributed to the article and approved the submitted version.