EURO-CORDEX: A Multi-Model Ensemble Fit for Assessing Future Hydrological Change?

Human-induced changes in climatic behavior and variations in future river flows have been at the forefront of recent academic and political discourse. Future climate projections are a vital tool in tackling climate change and supporting future adaptation; however, until recently, models have been viewed individually with a lack of uncertainty quantification. A multi-model ensemble (MME) with a wide range of general circulation models, regional climate models and emissions scenarios, EURO-CORDEX provides climate projections as well as flow series projections across the European domain from 1950 to 2100. This paper explores the validity of the 68-chain MME flow projections by investigating their ability to match observed flow records in the UK over the period 1975–2004. The work explores magnitude through quantile matching and seasonality matching by time-series decomposition of trends. Two statistical tests [Mann-Whitney, and Mean Arctangent Absolute Percentage Error (MAAPE)] were used to compare EURO-CORDEX flow projections to observed river flows recorded by the National River Flow Archive (NRFA) across 1,436 UK river catchments. Results indicate a high degree of similarity, justifying the application of this dataset for assessing future hydrological changes across a regional scale. Discretizing the flow projections into regional and hydrometric areas highlights the variability in performance between neighboring domains and the strong influence local features may have on climate model performance. The validation of EURO-CORDEX flow projection data regionally enables a wide range of applications including the exploration of future changes in local and national river flows.


INTRODUCTION
In 2018, it was reported that human-induced warming had reached ∼1 ± 0.2 °C (likelihood of outcome, 66-100% probability) (IPCC, 2018) above "pre-industrial" levels (1850-1900; Allen et al., 2018). Changes in climatic behavior driven by anthropogenic influences, in terms of both mean and variability, are projected throughout the twenty-first century (IPCC, 2014). National assessments [such as the Independent Assessment of UK Climate Risk (CCRA3, 2020)] have highlighted to governments that greater urgency is required across society in order to adapt to the climate emergency.
Climate is a major determinant of hydrological processes, where precipitation, temperature and evaporation represent the dominant drivers (Arthington, 2012; Cisneros et al., 2014). Consequently, changes in climate will invariably lead to alterations of river flow regimes (Rahel and Olden, 2008; Arnell and Gosling, 2016). Recent research by the authors has investigated the impact of these changes on the river flow regime in terms of future flood and drought magnitude, duration, and frequency (Collet et al., 2017, 2018; Visser-Quinn et al., 2019a; Ellis et al., 2021), which is also corroborated by other authors (Kay, 2021; Lane and Kay, 2021). The work on changing extremes in the UK indicates that the threats faced at a regional level, as well as more widely across Europe, are likely to intensify in the coming decades.
Hydrological models serve to bridge the gap between global climate change projections and the need to understand the impact of climate change at a more localized scale (Gleick, 1986) on components of interest such as river flows. Research into this hydrological impact has been ongoing for over two decades (Olsson et al., 2016). A typical hydroclimatological modeling chain is depicted in Figure 1A. Emissions scenario(s), representing potential pathways of future emissions [e.g., representative concentration pathways (RCPs)] serve as input to General Circulation Models (GCMs) or Earth System Models (ESMs), to model climate at the global scale. Note, hereafter we refer to only GCMs for conciseness. Given their coarse scale (>50 km grid; Taylor et al., 2012), these outputs must be downscaled prior to their use in hydrological models. Downscaling is typically dynamical or statistical (Flato et al., 2013). Dynamical downscaling uses a higher resolution model [Regional Climate Model (RCMs)] with GCM outputs as boundary conditions; statistical downscaling develops a relationship between baseline observed and simulated GCM outputs, applying this relationship across the future projections. In order to explore the influence of climate projections on future hydrological processes, the downscaled climate outputs (e.g., precipitation and temperature) are then normally used as inputs for hydrological models.
The complexity of the hydroclimatological modeling chain necessitates the quantification of the uncertainty propagating through the system (Jacob et al., 2020). Uncertainty can be evaluated at each step in the modeling chain through the assessment of input, structural and parameter variations: input uncertainty may be measured through consideration of multiple emissions scenarios; structural uncertainty is captured through the use of a range of GCMs known as a multi-model ensemble (MME); and parameter uncertainty is captured through the systematic variance of model parameters, known as a Perturbed Physics (or Parameter) Ensemble (PPE). The computational demands of climate modeling limit the ability to consider MMEs and PPEs simultaneously. Clark et al. (2016) discuss the characterization and minimization of uncertainty associated with the hydrological impacts of climate change, arguing that much of the focus has been directed toward climate models, and less so on hydrological models and the propagation of uncertainty. They also argue that the substantial uncertainty associated with hydrological projections represents a key challenge to practical application. Several authors (Thober et al., 2018; Visser-Quinn et al., 2019a) have shown that these uncertainties can in certain circumstances be greater than the uncertainty associated with the climate projections. Consequently, this raises questions about the validity of using these flow projections for further impact assessments and policy decisions. There are, however, a number of modeling studies which have found the uncertainty associated with climate models to be far larger than that of hydrological models (Eisner et al., 2017; Vetter et al., 2017; Hattermann et al., 2018). This highlights the intricacies of multimodel frameworks and the necessity to quantify uncertainties at every level.
Understanding uncertainty at each step in the modeling chain is therefore critical to support decision-making. To this end, further assessment of hydroclimatological ensembles is required. EURO-CORDEX, the European branch of the Coordinated Regional Downscaling Experiment (CORDEX), has downscaled global climate projections from 13 CMIP5 GCMs dynamically through the application of 19 Regional Climate Models (RCMs) across the European domain. This is a significant contribution to the climate science community, only possible through large-scale collaboration across different science disciplines and institutions, in order to provide robust climate projections which capture different components of uncertainty for the analysis of future change (Jacob et al., 2020). Input uncertainty is further explored using multiple RCPs. Hydrological processes (e.g., soil moisture, snow, precipitation, runoff, and outflow) are captured in the land surface component of climate models, both at global and regional scales (Overgaard et al., 2006). In EURO-CORDEX, global climate projections are dynamically downscaled using RCMs, delivering spatial detail at finer resolutions. The component-based structure of climate models allows for explicit representation of river hydrology; for example, CNRM-ALADIN53, one of the RCMs used in EURO-CORDEX and this study (see Figure 3), utilizes the TRIP (Total Runoff Integrating Pathways) model (Oki and Sud, 1998; Sevault et al., 2014). Models from the IPSL modeling center (included in EURO-CORDEX and this study, see Figure 3) are also known to include a substantial land component. Dynamic coupling within the RCM also allows for consideration of feedbacks in the hydrological cycle through atmospheric and oceanic components (Overgaard et al., 2006). With 68 modeling chains available (Figure 3), EURO-CORDEX represents a unique opportunity to quantify input and structural uncertainties in flow projections at a scale not previously possible.
Climate model outputs such as EURO-CORDEX are typically used as drivers for hydrological models to create river flow time series. Current assessments of climate model outputs use one or more hydrological models for river flow projections under the future climate (Hannaford et al., In Review). Validation is performed across a chosen domain (Pastén-Zapata et al., 2020) with an ensemble of hydrological models used to account for structural and parameter uncertainty (Di Sante et al., 2021). However, depending on the accuracy of RCM datasets, hydrological models may be entirely unnecessary for future flow projection studies. Therefore, it is necessary to assess whether this data is fit for hydrological studies. To the best of our knowledge this has not been done before. In this paper, we assess the validity of using daily flow projections directly from EURO-CORDEX, bypassing the hydrological model requirement (Figure 1), thus removing an additional source of uncertainty and computational cost within the modeling chain. Gridded flow projections from EURO-CORDEX are mapped to UK river catchments, providing time series of daily flow projections for almost all catchments (1,436 out of 1,575). With up to 50 modeling chains available for each catchment, the resultant dataset offers a robust ensemble of future flow projections for the UK which captures modeling uncertainty and is applicable to a wide range of practical applications including assessment of variability in future extremes, engineering vulnerability and community resilience. To understand the skill of the EURO-CORDEX hydrological projections, the performance of the models against observed data must be assessed. For hydrological assessments it is important that the models capture the components of the river flow, for example the range of flow magnitudes experienced in a catchment, and the seasonal occurrence of these.
Here the performance of the EURO-CORDEX ensemble daily flow projection data is assessed across 1,436 UK river catchments.

DATA PRE-PROCESSING
Data downloaded from EURO-CORDEX has been processed from raw climate model outputs into flow projections following methods outlined in Figure 2. This involves transforming large scale climate data into gridded data for individual flow gauging stations, performing a bias correction to observed historical data and producing flow duration curves. Extreme value analysis and seasonal trends can then be extracted for further impact assessments.

EURO-CORDEX Flow Projections
In EURO-CORDEX, the total surface runoff variable, mrro, represents flow. Data was extracted from the Earth Systems Grid Federation (ESGF; https://esgf-data.dkrz.de/projects/esgfdkrz/) on a 0.11° grid (∼12.5 km) across the European domain for 68 RCP-GCM-RCM combinations (Figure 3). To ensure consistency across the dataset, 11 RCP-GCM-RCM combinations using rotated projection on a smaller grid (∼12.1 km) were removed (Figure 3). Four RCP-GCM-RCM combinations represented a PPE, consisting of 3-3-3-2 members, respectively. To reduce bias toward these model combinations, only one member from each was utilized. For the remainder of this study, a total of 50 RCP-GCM-RCM combinations were considered, hereafter referred to as modeling chains.
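As a quick consistency check on the ensemble size, the counts quoted above can be tallied as follows. This assumes (it is not stated explicitly in the text) that the four PPE groups all sit among the chains retained after the rotated-grid removals, so that keeping one member per group discards seven further members:

$$68 - 11 = 57, \qquad (3 + 3 + 3 + 2) - 4 = 7, \qquad 57 - 7 = 50.$$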
The EURO-CORDEX gridded flow projections were subset to the UK domain: Great Britain and Northern Ireland. These gridded flow projections were mapped to catchments using catchment boundaries provided on request from the National River Flow Archive (NRFA). Flow in each catchment, $Q_{\mathrm{catchment}}$, was determined as the sum of the grid cell fraction, $A_i$, multiplied by the gridded flow projection, $Q_i$ (Figure 4):
$$Q_{\mathrm{catchment}} = \sum_i A_i Q_i$$
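The catchment mapping described above amounts to a weighted sum over the grid cells that intersect a catchment. The following is a minimal Python sketch, not the authors' processing code: the function name `catchment_flow`, the array layout (time along rows, grid cells along columns) and the example weights are all illustrative assumptions, and the grid cell fractions are taken as given rather than derived from catchment boundaries.

```python
import numpy as np

def catchment_flow(grid_flows, cell_fractions):
    """Weighted sum of gridded flows: Q_catchment = sum_i A_i * Q_i.

    grid_flows     : array of shape (n_timesteps, n_cells), flow Q_i per grid cell
    cell_fractions : array of shape (n_cells,), grid cell fraction A_i
    Returns an array of shape (n_timesteps,) with the catchment flow series.
    """
    grid_flows = np.asarray(grid_flows, dtype=float)
    cell_fractions = np.asarray(cell_fractions, dtype=float)
    if grid_flows.shape[-1] != cell_fractions.shape[0]:
        raise ValueError("one fraction is required per grid cell")
    # Matrix-vector product applies the weighted sum at every time step.
    return grid_flows @ cell_fractions

# Hypothetical example: two time steps, three grid cells.
flows = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
fractions = [0.5, 0.25, 0.25]
series = catchment_flow(flows, fractions)
```

Keeping the fractions as a plain vector means the same weights can be reused for every modeling chain mapped to that catchment.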

Observed Flow (NRFA)
A baseline period of 1975-2004 was selected to maximize data availability, ensuring the largest number of stations with an acceptable level of missing data across a 30-year time slice. Observed flow data was extracted from the National River Flow Archive (NRFA) for a total of 1,436 gauging stations across 107 hydrometric areas (Figures 5A,B). The NRFA provides observed flow time series for every gauging station in the UK. The measuring authorities vary between UK regions: the Environment Agency (EA) in England, the Scottish Environment Protection Agency (SEPA) in Scotland, Natural Resources Wales (NRW) in Wales, and the Department for Infrastructure-Rivers (DfIR) in Northern Ireland. The NRFA processed flows are not naturalized to account for human intervention; therefore, calibrating to these observed flows will not account for artificial influences (Hannaford et al., In Review). It should be noted that water abstraction processes upstream of gauging stations are assumed to remain stationary between time periods due to the unquantifiable uncertainty associated with human interventions.
Multivariate imputation via chained equations (MICE) was applied where gaps in the continuous time series were ≤3% of this 30-year baseline period. MICE derives multiple imputations for each missing value in the time series (Azur et al., 2011). As such, MICE takes into account the uncertainty around the imputed values and gives accurate standard errors. The principle behind MICE is to analyze a range of possible values for the missing data, pool the results of the imputed values at each time step and create a continuous time series for further analysis.

Bias Correction
Quantile mapping, a well-established bias correction technique in hydrological studies (Gudmundsson et al., 2012; al., 2013), was applied to improve the fit between modeled flow hindcasts and observations for the baseline period, as is common practice in hydrological flow projections (Ellis et al., 2021; Hannaford et al., In Review). To prevent overfitting, five equal quantile bins were selected, bounded by the 0th, 20th, 40th, 60th, 80th, and 100th percentiles. For each catchment and modeling chain, baseline flows were split into the five quantiles. These were mapped to the corresponding quantiles from observed data and a linear transformation, to improve the fit of the hindcasts, was determined for each observed-projected pair. This transformation was used to bias correct the flows from 1976 to the end of the projection (up to 2100). [See Supplementary Figures A.2
The application of bias correction directly to RCM river flow series is a novel approach which removes the need for hydrological modeling. Quantile mapping the RCM outputs directly to historical data enables rapid and direct access to hydrological projections. This provides a more equitable data source, removing the need for specific knowledge of hydrological models, the computational costs associated with running these models over national scales, and further sources of uncertainty (hydrological model structure, parameter, and input). The approach has the potential to improve the process of flood hazard assessments.
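The five-bin quantile mapping described above can be sketched as a piecewise-linear transfer function. This is an illustrative Python sketch only: the text does not give the exact form of the linear transformation, so the code assumes one straight line per bin joining the simulated and observed breakpoints at the 0th to 100th percentiles, with the end segments extended to values outside the baseline range. The function names and the clipping of negative flows to zero are also assumptions.

```python
import numpy as np

# Breakpoints bounding the five equal quantile bins.
PERCENTILES = np.array([0, 20, 40, 60, 80, 100])

def fit_quantile_map(obs_baseline, sim_baseline):
    """Return simulated and observed breakpoints for the baseline period."""
    sim_q = np.percentile(np.asarray(sim_baseline, dtype=float), PERCENTILES)
    obs_q = np.percentile(np.asarray(obs_baseline, dtype=float), PERCENTILES)
    return sim_q, obs_q

def apply_quantile_map(flows, sim_q, obs_q):
    """Bias correct flows with one linear transformation per quantile bin."""
    flows = np.asarray(flows, dtype=float)
    # Bin index of each value; values outside the baseline range use the end bins.
    idx = np.clip(np.searchsorted(sim_q, flows, side="right") - 1, 0, len(sim_q) - 2)
    x0, x1 = sim_q[idx], sim_q[idx + 1]
    y0, y1 = obs_q[idx], obs_q[idx + 1]
    width = x1 - x0
    # Guard against zero-width bins (e.g., many identical flows) with a unit slope.
    slope = np.where(width > 0, (y1 - y0) / np.where(width > 0, width, 1.0), 1.0)
    corrected = y0 + slope * (flows - x0)
    return np.maximum(corrected, 0.0)  # assumption: flows cannot be negative

# Hypothetical example: the simulation is exactly twice the observations.
obs = np.arange(1.0, 101.0)
sim = 2.0 * obs
sim_q, obs_q = fit_quantile_map(obs, sim)
corrected = apply_quantile_map(sim, sim_q, obs_q)
```

Fitting the breakpoints once on the baseline and then applying them to the full series mirrors the described workflow, in which the baseline-derived transformation is carried forward to the end of the projection.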

METHODS
The validity of the bias-corrected EURO-CORDEX daily flow time series was assessed for the baseline period of 1975-2004 per catchment for each modeling chain. Validity was assessed through the application of two tests: Mann-Whitney U-test and the Mean Arctangent Absolute Percentage Error, as described below. These tests were applied to the flow duration curve calculated for each catchment per modeling chain at 1-percentile intervals, and compared to the observed flow duration curve. The tails of the distribution were clipped, 0-5th and 95-100th percentiles, due to known errors associated with the simulation of climatic extremes in climate modeling, particularly with regards to precipitation (IPCC, 2014).
Both methods were selected as a means of testing the similarity between observed and simulated flows with a focus on the magnitude and frequency of flood and drought events. Alternative methods may produce more insight into similarities between time series seasonality [e.g., cross-correlation analysis (Chandra and Al-Deek, 2008)] but this is outside the scope of this paper.

Mann-Whitney
The Mann-Whitney test (independent two-sample Wilcoxon rank-sum test) is applied to determine the similarity between the two populations, the observed and projected flows. The null hypothesis states that there is a 50% probability that any randomly sampled value from the former is larger than a randomly sampled value from the latter, i.e., that both samples can be said to come from the same population. The alternative hypothesis is that this probability differs from 50%, i.e., that one population tends to produce larger values than the other. Larger p-values indicate weaker evidence of a difference between observed and projected flows and are interpreted here as stronger statistical similarity, with a value of 1 indicating exact agreement, whereas for a value <0.05 the null hypothesis is rejected and the two samples are considered statistically different (Mann and Whitney, 1947).
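As an illustration of how this test can be applied to clipped flow duration curves, the Python sketch below computes percentiles from the 5th to the 95th at 1-percentile intervals and passes the two curves to `scipy.stats.mannwhitneyu`. It is a sketch under stated assumptions, not the authors' code: the helper names are hypothetical, and whether the clipping bounds themselves are included is an assumption.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# 5th to 95th percentile at 1-percentile intervals (tails clipped).
CLIPPED_PERCENTILES = np.arange(5, 96)

def clipped_fdc(flows):
    """Flow quantiles at 1-percentile intervals, ignoring missing values."""
    flows = np.asarray(flows, dtype=float)
    return np.percentile(flows[~np.isnan(flows)], CLIPPED_PERCENTILES)

def fdc_mann_whitney(observed, simulated):
    """Two-sided Mann-Whitney p-value comparing two clipped flow duration curves."""
    result = mannwhitneyu(clipped_fdc(observed), clipped_fdc(simulated),
                          alternative="two-sided")
    return float(result.pvalue)

# Hypothetical example with synthetic daily flows.
rng = np.random.default_rng(0)
obs = rng.gamma(shape=2.0, scale=5.0, size=3000)
p_same = fdc_mann_whitney(obs, obs)            # identical curves
p_shifted = fdc_mann_whitney(obs, obs * 10.0)  # strongly biased simulation
```

Applying the test to the percentile curve rather than the raw daily series keeps the sample size fixed across catchments with different record lengths.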

MAAPE
The second statistical test, MAAPE, is a measure of the average percentage error between the observed and projected flow (Kim and Kim, 2016). A variant of the mean absolute relative error, MAAPE addresses limitations in infinite/undefined values close to or equal to zero.
Let $A_t$ and $F_t$ be the actual (observed) and forecast (simulated) values at data point $t$, and let $N$ be the number of data points. Then, the error ratio $|A_t - F_t| / |A_t|$ (belonging to $[0, \infty]$) is equivalent to the angle $\theta_t$:
$$\theta_t = \arctan\left(\left|\frac{A_t - F_t}{A_t}\right|\right)$$
where $\theta_t \in [0, 90^{\circ}]$ and is thus defined for actual values $A_t$ close to or equal to zero. Then,
$$\mathrm{MAAPE} = \frac{1}{N}\sum_{t=1}^{N} \arctan\left(\left|\frac{A_t - F_t}{A_t}\right|\right)$$
Lower and higher values indicate a better and worse fit, respectively. To aid in the interpretation of the results, the outcomes of the two tests have been assigned to four quality categories (Table 1). Thresholds have been defined to allow easily understood, consistent assessment between tests and across the nation.
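A minimal Python sketch of the MAAPE calculation follows. It returns the mean angle in radians, as in the definition of Kim and Kim (2016); how this is converted to the percentage values reported in the results is not stated in the text, so no conversion is applied here. Treating the case where both the observed and simulated values are zero as zero error is an assumption of this sketch.

```python
import numpy as np

def maape(actual, forecast):
    """Mean Arctangent Absolute Percentage Error, in radians (0 to pi/2)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    abs_err = np.abs(actual - forecast)
    # Division by zero yields inf (mapped to pi/2 by arctan) or nan for 0/0.
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = abs_err / np.abs(actual)
    # Assumption: a perfect match at a zero observation counts as zero error.
    ratio = np.where(abs_err == 0, 0.0, ratio)
    return float(np.mean(np.arctan(ratio)))

# Hypothetical examples.
perfect = maape([1.0, 2.0], [1.0, 2.0])
doubled = maape([1.0], [2.0])       # error ratio of 1
zero_actual = maape([0.0], [1.0])   # error ratio is infinite
```

Because the arctangent is bounded, a single near-zero observed flow cannot dominate the mean, which is the property that motivates the choice of MAAPE over the mean absolute relative error.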

Seasonality
The seasonality component of the flow projections (the timing of flow series peaks and troughs) was assessed through decomposition of the time series (observed flows per catchment, and flow projections per catchment and chain combination). The time series were split into systematic (average, trend and seasonal) and non-systematic (noise) components using R's decompose function which uses moving averages from a symmetric window and equal weights (Kendall and Stuart, 1983). Extracted seasonality trends are then compared to decomposed NRFA observed seasonality trends. Mann-Whitney and MAAPE values were then determined as above.
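The decomposition step can be illustrated with a small NumPy re-implementation of a classical additive decomposition based on a centered, equally weighted moving average. This is a sketch in the spirit of R's decompose function, not a substitute for it: the function name `classical_decompose`, the additive form, and the choice of period (e.g., 365 for daily flows) are assumptions not confirmed by the text.

```python
import numpy as np

def classical_decompose(x, period):
    """Additive decomposition into trend, seasonal and remainder components."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if period % 2 == 0:
        # Even period: half weights at both ends keep the window symmetric.
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    if n < len(weights) + period:
        raise ValueError("series is too short for this period")
    trend = np.full(n, np.nan)  # undefined where the window is incomplete
    trend[half:n - half] = np.convolve(x, weights, mode="valid")
    detrended = x - trend
    # Average each position in the cycle, then center the seasonal figure.
    figure = np.array([np.nanmean(detrended[k::period]) for k in range(period)])
    figure -= figure.mean()
    seasonal = np.tile(figure, n // period + 1)[:n]
    remainder = x - trend - seasonal
    return trend, seasonal, remainder

# Hypothetical example: constant level plus a 12-step sinusoidal cycle.
t = np.arange(60)
series = 10.0 + np.sin(2.0 * np.pi * t / 12.0)
trend, seasonal, remainder = classical_decompose(series, period=12)
```

The extracted seasonal component for observed and simulated flows can then be compared with the same Mann-Whitney and MAAPE measures used for the flow duration curves.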

RESULTS
EURO-CORDEX verification results have been produced for time-series quantiles and decomposed seasonality trends across 1,436 UK river catchments and 50 climate model chains (71,800 total combinations). Each chain-station combination has been categorized as "Good," "Okay," "Bad," or "Unusable" depending upon the value attained from the Mann-Whitney and MAAPE tests. Seasonality trend errors between NRFA and simulated flow quantiles have been compared to examine projection timing. National results are presented first as a general guide to EURO-CORDEX validity; however, due to the volume of results (up to 50 chains per catchment, 1,436 catchments), understanding generalities at this scale is difficult. As such, results are then presented for each hydrological region within the UK to highlight underperforming regions and chains within these regions. Finally, further discretization to hydrometric areas shows the influence of individual catchments and the highly localized nature of EURO-CORDEX performance. Individual station results can be found for each of the 1,436 stations across the UK.

National Results
Mann-Whitney results suggest the majority of simulated data is statistically similar to the observed. Over half of all chain-catchment combinations produce p-values > 0.8, with 53.2% above the threshold indicating good/okay statistical agreement (Table 2). Of the remaining 46.8%, only 1.95% are less than the 0.05 null hypothesis threshold, defining statistical differences in flow time series quantiles. Focusing on these unusable combinations, every chain is represented at least once, with 136 stations producing non-agreement with observed data. Upon inspection, 93 of these underperforming stations have <10 years of observed data; however, 3 stations have over 40% and one spans the entire 30-year period. Median p-values across each chain (Figure 6A) generally remain above the 0.05 threshold. These results indicate a strong correlation between the length of observed data and the accuracy of bias-corrected data, in agreement with previous studies (Lafon et al., 2013); however, it is not the sole controlling feature. Station MAAPE error has been investigated to compare station performance (Table 2; Figure 6B) against the observed flows. Low MAAPE values are seen across the ensemble, with 78.0% of MAAPE errors <10%, showing a strong agreement between observed and simulated flow quantiles. A further 17.97% of errors are classified as "Okay" and only 3.7% produce MAAPE values >20%. Taking the median MAAPE value across all chains per catchment, over 97% of results are categorized as Good/Okay with only 2.26% producing Bad median values (Figure 6B). Three gauging stations were found to have median MAAPE values >50%: Balder at Balderhead Reservoir (25022), Flore at Experimental Catchment (32029), and Gallica Stream at Gallica Bridge (52020). Each of these stations provided <10 years of observed data, with none functioning past 1985.
The aggregation of chain performance into a median station value provides a general overview of ensemble performance but reduces the catchment-chain performance detail. The interquartile ranges in Figure 6 demonstrate the difficulty in understanding the performance of individual chains across particular catchments or regions, which is important when assessing the performance of ensemble members for a particular use. Therefore, a regional assessment has been performed.

Regional Results
The spatial variability of MAAPE values has been analyzed over 17 UK regions, defined in Figure 5C. MAAPE was selected as it provides clear, well-defined results which can be easily compared between regions. Cumulative distribution functions (CDFs) of regional MAAPE values identify variability in national projection performance (Figure 7). Poorly performing regions are typically found in the southeast of England, where multiple chains break the 10% threshold before the 50th percentile (indicating that a majority of MAAPE values fall outside the good-performance threshold). However, due to the large number of chains included in the EURO-CORDEX ensemble, accurate flow time series are still produced in the worst performing regions. Scottish, Welsh, Northern Irish and northern English regions produce lower MAAPE values between observed and simulated data, increasing confidence in regional projections. Within regional performance, the controlling factor was found to be the RCM rather than the GCM or RCP. Across every region assessed, the worst performing RCM was found to be IPSL-WRF381P from the Institut Pierre-Simon Laplace research center.
FIGURE 7 | Regional MAAPE CDFs for each chain separated by RCM.
Regional results identify underperforming areas; however, the large scale limits conclusions as the contributions from each catchment are hidden. Therefore, further discretization to NRFA hydrometric areas has been undertaken.

Hydrometric Areas
The EURO-CORDEX flow time series performance has been considered over 107 NRFA-defined hydrometric areas (HAs) and 3 RCPs in order to explore spatial differences. MAAPE results have been used to define spatial performance and offer more detailed catchment characteristics, while faceting by RCP examines variability between emissions scenarios. The percentage of MAAPE value categories ("Good," "Okay," "Bad," "Unusable") reveals projection behavior across HAs and RCPs (Figure 8). Differences in HA performance between RCPs are seen to be minimal, with the same areas performing well or poorly regardless of emissions scenario. This behavior is mirrored across error categories, signaling a larger influence associated with location than with climate pathway.
Across HAs, a north-west south-east divide is once more identified; however, individual MAAPE results are now emphasized. The hydrometric areas of Welland (31) and Nene (32) contribute significantly to the poor performance of southeast regions, but some neighboring areas are seen to produce more robust projections, e.g., the Norfolk Rivers Group (34). The individual influence of specific catchments on regional performance shows the need to consider flow projection performance on a highly discretized scale. For example, the Shetlands hydrometric area (108) is the worst performing area over each RCP, reporting over 90% of the chains giving a MAAPE value >10%. As a one-catchment HA, flow projections for this area are dependent upon errors from a single gauging station, increasing the chance of misrepresentation. Furthermore, as an island catchment in the North Sea, the meteorological processes on Shetland will be different from mainland UK. The poor ensemble performance emphasizes the critical need for catchment knowledge when investigating climate model projections.
MAAPE CDFs for each chain separated by RCM for each Hydrometric Area (107 no.) are included in Supplementary Material.

Decomposed Time-Series-Seasonality Component
The calculation of Mann-Whitney and MAAPE on the seasonality component of the time series reveals a limitation of the EURO-CORDEX flow projections. The majority of stations produce median Mann-Whitney p-values less than the 0.8 "Okay" category threshold (Figure 9A). Large interquartile ranges (covering almost the entire range of p-values) indicate low similarity between observed and simulated results. Median MAAPE seasonality values (Figure 9B) suggest difficulties in replicating temporal facets of the flow regime. Large errors are found across stations, with only 3 stations with errors <20% and no MAAPE values <19%. The remaining 1,433 median error values suggest a large disparity between observed and simulated time series. In light of the superior performance in replicating the flow duration curve (flow magnitude), the relatively poor replication of the temporal aspect is unsurprising. It is well-established that hydrological models alone are incapable of modeling all, or even multiple, facets of the flow regime (e.g., magnitude and timing) concurrently (Blöschl and Montanari, 2010). These difficulties are highlighted in recent efforts by the hydrological modeling community to replicate specific ecologically relevant "hydrological indicators" (Pool et al., 2017; Visser-Quinn et al., 2019b).

DISCUSSION
Flow projections extracted from EURO-CORDEX were mapped to 1,436 UK river catchments, bias corrected and validated against observed flows from the NRFA. The bias correction of the data used 5 points in the flow duration curve to adjust the hindcast time series. Both validation checks were then completed on the full flow duration curve with the extremes removed (i.e., on 90% of the data). The validation tests (Mann-Whitney and MAAPE) show that, across the ensemble flow projections available, there are robust, usable flow time series projections in every catchment in the UK. No catchment has all ensemble chains performing badly. Regional variations have been investigated and the results suggest a north-west south-east divide. This work has highlighted southern English regions with a greater proportion of underperforming projections. Increasing the spatial resolution of the analysis to hydrometric area identified specific catchments and localized areas where specific model performance is worse, e.g., the Nene catchment group-West Midlands. In such areas a number of different factors may be influencing the performance; for example, the Nene is a fen catchment, which may make hydrological processes more difficult to replicate.
The large ensemble dataset produces valid flow projections across the UK domain, allowing exploration of future water availability on a regional level. The size of the ensemble (up to 50 chains per catchment) and the breadth of scenario included (RCP 2.6, 4.5 and 8.5) means that reasonable uncertainty quantification can be undertaken on future analysis in a consistent manner. It must be noted however, that the seasonality of flows is poorly represented across every UK station and caution must be applied to seasonal trends analysis of this dataset. Misrepresentation of timings is a result of GCM complexity and a prioritization of different hydrological interactions which should not hinder the application of EURO-CORDEX future flow projections.
Higher MAAPE values seen with the IPSL-WRF381P RCM reduce confidence in these chains at specific locations; in general, however, this chain performs poorly across large parts of the UK. The inclusion of underperforming models remains a topic of discussion for climate researchers: is it better to remove the poorly performing models or include every possible projection for maximum uncertainty quantification? Results presented herein suggest that although specific catchments or HAs may be poorly represented by certain chains, this underperformance is highly localized and complete removal of climate models would negatively impact uncertainty quantification across the entire UK domain. Furthermore, large ensembles of data such as EURO-CORDEX reduce individual bias associated with poorly performing chains, allowing accurate flow projections with no loss in uncertainty quantification.
Exploring the role of catchment size on MAAPE values highlights no consistent trend. As the catchments get smaller, some display increasing error; however, this is not universally observed. Figure 10 plots MAAPE vs. catchment area. Given the resolution of the RCMs (∼12.5 km grid), there is a chance that catchments under 50 km² may be poorly represented by this grid. However, the results show an inconsistent message which may be correlated to other factors such as location, topography, and complexity. Therefore, each catchment should be considered independently at the outset of any study.

CONCLUSION
The EURO-CORDEX dataset has been shown to produce flow time series for every UK river catchment. Tests have been undertaken to examine whether these are statistically significantly different from observed flows, and to examine in more detail the errors (using MAAPE) in the flow projections on the baseline. Across the UK, this study has found that in the large majority of catchments the ensemble flow projections available through EURO-CORDEX are usable. The performance of different model chains is explored for both representation of the flow range and flow seasonality. In general, the results for the flow range are good across the UK; however, representation of seasonality is poor. Climate model uncertainty and parameter uncertainties are quantified within the ensemble, which will allow robust uncertainty quantification in future flow availability analysis. EURO-CORDEX outputs, appropriately bias corrected, are thus suitable and available for future climate change impact assessments of water resources across the UK.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: DOI: 10.5281/zenodo.5609987, Aitken et al. (2021).

AUTHOR CONTRIBUTIONS
GA performed the data collection with support from AV. GA and AV performed the statistical analysis. LB, AV, and GA analyzed results. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by EPSRC-EP/N030419/1.