This article was submitted to Cryospheric Sciences, a section of the journal Frontiers in Earth Science
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Estimation of spatial correlations should be an integral part of objective analysis of geophysical variables. However, a statistical assessment of spatial correlations has been absent from studies of objective analysis of snow depth since its debut over 2 decades ago. We show a method for computing regional spatial correlations of observed snow depth and the daily snow depth increment and fitting them to correlation functions to estimate the correlation scale parameters. Both horizontal and vertical distance correlations are computed from station observations over a well-sampled part of North America. The vertical and horizontal distance correlations are fitted to exponential functions using the least-squares method to estimate the correlation scale parameters, including the amplitude, which represents the short-distance correlation. Our assessment suggests a large horizontal e-folding correlation scale for both the observed snow depth and the daily increment, with implications for improving predictions in poorly monitored areas with relatively flat topography. Over mountainous terrain, the vertical e-folding correlation scale for observed snow depth is much smaller than that for the daily snow depth increment and for the snow depth increment used in operational snow analyses. This means that optimal interpolation-based analysis of the increments may be more accurate than the interpolation of snow depth data.
Spatial distribution of snow depth exhibits long spatial correlations, which can be exploited for regional mapping of snow depth using the Kriging interpolation method. Kriging requires the specification of spatial correlation functions, which the technique uses to compute the spatial correlations between pairs of observations and between each observation and the grid point being considered for estimation. These spatial correlation values are then used to compute the weight of each observation and the grid point snow depth as a spatially weighted average. Spatial correlation functions of the
A central assumption of the Kriging/OI method is that the embedded correlation functions represent the spatial correlation structure of the interpolated variable. Only then is the method optimal and are the spatial weights realistic; in our case, this means estimating snow depth with minimum error and with an accuracy significantly higher than that of a simple average. Therefore, fitting observations to the correlation functions to estimate the function parameters should be an integral part of Kriging/OI-based analysis.
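To illustrate how a correlation function turns into interpolation weights, the following minimal sketch solves the standard normalized OI system, (B + εI) w = b, where B holds correlations between observation pairs, b holds correlations between each observation and the grid point, and ε is the noise/signal variance ratio. The function names, the planar kilometre coordinates, and the simple exponential correlation used in the usage note are illustrative assumptions, not the operational configuration.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def oi_weights(obs_xy, grid_xy, corr, noise_to_signal):
    """Weights of each observation for one grid point.

    obs_xy: list of (x_km, y_km) positions (hypothetical planar coordinates).
    corr: function mapping a separation distance in km to a correlation.
    noise_to_signal: observation noise variance divided by signal variance.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    n = len(obs_xy)
    # Correlations between observation pairs, with noise added on the diagonal.
    a = [[corr(dist(obs_xy[i], obs_xy[j])) + (noise_to_signal if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Correlations between each observation and the grid point.
    b = [corr(dist(p, grid_xy)) for p in obs_xy]
    return solve_linear(a, b)
```

For example, with `corr = lambda d: math.exp(-d / 100.0)`, a single observation located exactly at the grid point and a noise/signal ratio of 0.43 receives a weight of 1/1.43, so the noise term pulls the analysis away from a perfect fit to the observation.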
In operational OI-based snow depth analyses, the spatial correlations of the analysis snow depth increments are computed using the family of exponential correlation functions as follows:
The correlation functions and the scale parameters as above indicated were suggested by
When analysis increment correlation scales were chosen over 2 decades ago there was much less information available for their estimation. The accumulation of snow depth data since then gives an opportunity to consider launching statistical assessments of the spatial snow depth correlations. This study presents a method for computing regional spatial correlation statistics of observed snow depth and the daily snow depth increments over a well sampled part of North America. Specifically, correlations are computed from
In this regard, it is important to make the distinction between the observed snow depth and the analysis snow depth increment correlation statistics. In the latter case, the statistics depend on both the specific model used to produce the first guess snow depth and the sampling of snow depth observations. This study focuses on the correlations of observed snow depth and the daily snow depth increments, the latter defined as the difference between observed snow depth on a given day and that from the previous day. We chose to focus on the correlation statistics of observations because knowledge of spatial correlations of observations would be useful for a wider range of hydro-meteorological applications and employment of improved correlation functions and scale parameters based on observations has the potential for improving operational snow analyses
Correlations are computed separately for snow depth and the daily snow depth increment using snow depth observations from December, January, and February (DJF) for the years 2012–2016. For each day the daily snow depth increment is defined as the difference between the daily value and the value of the previous day. A snow depth must be defined at each location for both days for a daily change to be defined for use in computing statistics.
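The increment definition above can be sketched as follows. This is a minimal illustration that assumes one station's daily record is held as a list with `None` marking missing days; the function name is hypothetical.

```python
def daily_increments(depths):
    """Daily snow depth increment: today's depth minus the previous day's.

    depths: daily snow depths for one station in date order, None if missing.
    Returns a list of the same length; an entry is None when either the
    current or the previous day is missing (and for the first day).
    """
    if not depths:
        return []
    increments = [None]  # no previous day for the first record
    for previous, current in zip(depths, depths[1:]):
        if previous is None or current is None:
            increments.append(None)
        else:
            increments.append(current - previous)
    return increments
```

A single missing day therefore removes two increments: the one ending on that day and the one starting from it.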
Daily snow depth is obtained from NOAA’s Global Historical Climatology Network (GHCN)-Daily, available at NOAA’s National Center for Environmental Information (NCEI,
For DJF months, correlations are computed over North America where stations are densest (30°N-55°N).
The study region (30°N-55°N). Depicted is the fraction of days in DJF months for the GHCN stations over 2012–2016 with snow-depth sampling. Sampling = 1 if all days are sampled.
GHCN-Daily weather station density over the study domain is one of the highest globally, with greater than 100 snow depth measuring stations per 5° grid (
Both horizontal and vertical (altitude-dependent) correlations are considered. The goal of this step is to compute correlations from observations for a range of horizontal and vertical distances that can then be fitted to correlation functions of horizontal distance and to correlation functions of vertical distance. To this end, we divide the geographical extent of the study depicted in
Horizontal correlation statistics are computed using daily data from 30°N-55°N, 100°W-65°W, east of the greatest elevation changes over North America, with the assumption that elevational changes over this region have negligible influence on horizontal correlations. Vertical correlation statistics are computed using daily data from 30°N-55°N and west of 100°W, where elevation changes are greatest. As described below, to compute vertical correlation statistics, we developed a method that minimizes the influence of horizontal separation on vertical correlations.
For computing binned horizontal correlations from unevenly distributed stations, we resample observations onto a grid in which the horizontal distance between observations is measured by the distance between the centers of their grid boxes. Inspection of the daily data revealed that a 0.1° by 0.1° spatial grid represents a fine spatial resolution of the daily data over this sub-region, in that most grid boxes of this size contain only one observation. In the few grid squares with more than one observation, we use only the first one for consistency. Squares with no stations are not filled. Using station data on this fine grid allows the binned correlations to be more easily computed. Since correlation scales are much larger than the grid scale, any uncertainty associated with this method should be minimal. A coarse grid box with a size comparable to that of the e-folding scale would fail to detect the intrinsic correlation structure of snow depth. That could be the case, for example, if this technique is applied over a region with sparse observations.
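The resampling step can be sketched as below. The station tuple layout and function names are assumptions made for illustration, and "first" is taken to mean first in input order.

```python
import math

def grid_index(lat, lon, cell_deg=0.1):
    """Integer (row, column) index of the grid box containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def cell_center(cell, cell_deg=0.1):
    """Latitude and longitude of the center of a grid box."""
    return ((cell[0] + 0.5) * cell_deg, (cell[1] + 0.5) * cell_deg)

def first_station_per_cell(stations, cell_deg=0.1):
    """Keep only the first station that falls in each grid box.

    stations: iterable of (station_id, lat, lon) tuples (hypothetical layout).
    Boxes without stations are simply absent from the result.
    """
    chosen = {}
    for station_id, lat, lon in stations:
        cell = grid_index(lat, lon, cell_deg)
        if cell not in chosen:  # later stations in an occupied box are skipped
            chosen[cell] = station_id
    return chosen
```

Distances between observations are then measured between the values returned by `cell_center`, not between the original station coordinates.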
Horizontal correlations are computed with distance for bins 10 km wide, which is approximately the size of each grid square, with distances of 0 <
The method for computing lag correlations described above is similar to the usual way the empirical semi-variogram is modelled, whereby smoothing of the measured semi-variances is achieved by the binning of irregular distances between observations into distance classes of equal width and pooling of the data with the same lag distance to compute semi-variances. This averaging process which we chose to apply is described in
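A minimal sketch of distance-binned lag correlations is given below. It assumes that all value pairs whose grid-box centers fall in the same 10 km distance class are pooled and that one Pearson correlation is computed per class; the exact pooling and averaging used in the study may differ, and all names are illustrative.

```python
import math
from collections import defaultdict
from itertools import combinations

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def pearson(xs, ys):
    """Pearson correlation, or None if either sample has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def binned_correlations(series_by_center, bin_km=10.0, min_pairs=3):
    """Pool value pairs by distance class and correlate each class.

    series_by_center: {(lat, lon) of grid-box center: daily series},
    with None for missing days. Returns {bin number: correlation}.
    """
    pooled = defaultdict(lambda: ([], []))
    for (c1, s1), (c2, s2) in combinations(series_by_center.items(), 2):
        bin_number = int(haversine_km(c1[0], c1[1], c2[0], c2[1]) // bin_km)
        xs, ys = pooled[bin_number]
        for v1, v2 in zip(s1, s2):
            if v1 is not None and v2 is not None:  # both days must be sampled
                xs.append(v1)
                ys.append(v2)
    result = {}
    for bin_number, (xs, ys) in sorted(pooled.items()):
        if len(xs) >= min_pairs:
            r = pearson(xs, ys)
            if r is not None:
                result[bin_number] = r
    return result
```

Bin number k here covers separations from 10k km up to, but not including, 10(k + 1) km.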
For the vertical lag correlations, data are compared for adjacent grid squares: there are too few stations at different elevations within one grid box to compute vertical correlations over the same grid box, so it is necessary to use stations separated by one grid square. As with the horizontal lag correlations, the vertical lag correlations are computed using only one station observation per grid square. Several experiments are performed to find out how much horizontal separation is needed in the stations to compute vertical correlations. Here we use data within 0.1° of a central grid in computations
Lagged vertical correlations from a base grid box with observations are computed by searching only over the adjacent grid squares for all pairs of observations with the same vertical lag distance from that base. Next, the data are pooled to compute one vertical lag correlation value from the base. In a similar fashion, the search continues over all grid squares with observations.
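The neighbour search described above might look like the following sketch. It assumes 100 m elevation classes, pools the value pairs from all adjacent squares that share a class with the base, and then averages the per-base correlations within each class. The final averaging step and all names are assumptions for illustration only.

```python
import math
from collections import defaultdict

def pearson(xs, ys):
    """Pearson correlation, or None if either sample has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def vertical_lag_correlations(cells, bin_m=100.0, min_pairs=3):
    """Vertical lag correlations from horizontally adjacent grid squares.

    cells: {(row, col): (elevation_m, daily series with None if missing)}.
    Returns {elevation bin number: mean of the per-base correlations}.
    """
    per_bin = defaultdict(list)
    for (i, j), (z0, s0) in cells.items():
        pooled = defaultdict(lambda: ([], []))
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                if di == 0 and dj == 0:
                    continue  # skip the base square itself
                neighbour = cells.get((i + di, j + dj))
                if neighbour is None:
                    continue  # adjacent square has no station
                z1, s1 = neighbour
                bin_number = int(abs(z1 - z0) // bin_m)
                xs, ys = pooled[bin_number]
                for v0, v1 in zip(s0, s1):
                    if v0 is not None and v1 is not None:
                        xs.append(v0)
                        ys.append(v1)
        # One pooled correlation per elevation class for this base square.
        for bin_number, (xs, ys) in pooled.items():
            if len(xs) >= min_pairs:
                r = pearson(xs, ys)
                if r is not None:
                    per_bin[bin_number].append(r)
    return {b: sum(rs) / len(rs) for b, rs in sorted(per_bin.items())}
```

Restricting the search to the eight adjacent squares keeps the horizontal separation small, so that the elevation difference dominates the lag.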
The binned correlation values computed for observed snow depth and the daily snow depth increment are fit to Eqs
Three scenarios are evaluated by comparing how well the selected equation represents the observed correlation with (vertical or horizontal) distance: the autoregressive correlation function (Eq.
The observed correlations with distance are denoted
The first fit considered, referred to as Fit1, is the one used by
The function is a second-order autoregressive function also suggested for analysis of a background field (
A solution for Eq.
The second function considered, referred to as Fit2, is a modified version of Fit1 and the same as Eq.
The change from Fit1 is that here an amplitude,
Similarly, taking the first derivative of the MSE of Eq
Combining Equations
Equation
The third function fit considered, referred to as Fit3, uses a Gaussian function similar to Eq.
This is the Gaussian correlation function also used by
These equations are combined to cancel out
Using the numerical estimate of the e-folding scale α_{3}, the amplitude A_{3} is found using Equation
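The least-squares procedure for the three fits can be sketched numerically as follows. The functional forms are assumptions, since the equations are not reproduced here: a second-order autoregressive shape (1 + r/α)·exp(−r/α) and a Gaussian shape exp(−(r/α)²), each multiplied by an amplitude A. For a given scale, the amplitude that minimizes the MSE has a closed form, so the search reduces to one dimension; a golden-section search, which assumes a single minimum inside the bracket, stands in for whatever numerical solver was actually used.

```python
import math

def soar(r, scale):
    """Second-order autoregressive shape (assumed form for Fit1 and Fit2)."""
    return (1.0 + r / scale) * math.exp(-r / scale)

def gaussian(r, scale):
    """Gaussian shape (assumed form for Fit3)."""
    return math.exp(-((r / scale) ** 2))

def best_amplitude(dist, corr, shape, scale):
    """Amplitude minimizing the MSE for a fixed scale (closed form)."""
    f = [shape(r, scale) for r in dist]
    return sum(c * fi for c, fi in zip(corr, f)) / sum(fi * fi for fi in f)

def mse(dist, corr, shape, scale, amplitude):
    """Mean squared error between binned correlations and the model."""
    return sum((c - amplitude * shape(r, scale)) ** 2
               for r, c in zip(dist, corr)) / len(dist)

def fit_correlation(dist, corr, shape, lo, hi, fixed_amplitude=None, tol=1e-6):
    """Return (scale, amplitude, rmse) for scale searched within [lo, hi].

    Set fixed_amplitude=1.0 to mimic a fit with the amplitude fixed a priori.
    """
    def amplitude_for(scale):
        if fixed_amplitude is not None:
            return fixed_amplitude
        return best_amplitude(dist, corr, shape, scale)

    def cost(scale):
        return mse(dist, corr, shape, scale, amplitude_for(scale))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:  # golden-section search on the scale
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if cost(c) < cost(d):
            b = d
        else:
            a = c
    scale = 0.5 * (a + b)
    amplitude = amplitude_for(scale)
    return scale, amplitude, math.sqrt(mse(dist, corr, shape, scale, amplitude))
```

Eliminating the amplitude first mirrors the analytical step in which the two derivative equations are combined, leaving a single equation in the scale parameter.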
Horizontal correlations as a function of distance for observations drop below
Snow depth correlations as a function of horizontal separation distance. The observed correlations, Corr, and the indicated model fits are shown.
Model fits of horizontal correlations are also shown, and the fitted parameters are listed in
The scale parameter (
Fit | Scale (km) | Amplitude | EFD (km) | RMSE |
---|---|---|---|---|
Fit1 | 137.5 | 1.00 (fixed a priori) | 295.6 | 0.13 |
Fit2 | 200.0 | 0.72 | 430.1 | 0.03 |
Fit3 | 431.4 | 0.68 | 431.4 | 0.04 |
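The relation between the scale and the e-folding distance (EFD) in the table can be checked with a short calculation. This sketch assumes the autoregressive shape is (1 + r/α)·exp(−r/α) and the Gaussian shape is exp(−(r/α)²), and defines the EFD as the distance at which the shape falls to 1/e of its zero-lag value. Under these assumptions the EFD is about 2.146 times the scale for Fit1 and Fit2 and equal to the scale for Fit3. The tabulated ratios are close to 2.15, which may reflect rounding of this factor.

```python
import math

def soar_efolding_factor(tol=1e-12):
    """Solve (1 + x) * exp(-x) = exp(-1) for x > 0 by bisection."""
    target = math.exp(-1.0)
    lo, hi = 0.0, 10.0  # the shape decreases monotonically on this interval
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (1.0 + mid) * math.exp(-mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def efolding_distance(scale, kind):
    """E-folding distance for the assumed 'soar' or 'gaussian' shape."""
    if kind == "soar":
        return soar_efolding_factor() * scale
    if kind == "gaussian":
        return scale
    raise ValueError("kind must be 'soar' or 'gaussian'")
```

The amplitude does not enter the calculation because the EFD is measured relative to the fitted zero-lag value.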
The Fit1 estimate is distorted by its requirement to approach 1, as the lag distance approaches 0. Both Fit2 and Fit3 give estimates of the amplitude at roughly 0.7. As discussed by
For Fit2 and Fit3, the noise/signal variance ratio is about 0.43 (i.e., the signal/noise variance ratio is greater than 2).
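The quoted ratio is consistent with reading the amplitude A as the fraction of the total variance attributable to signal, which gives a noise/signal ratio of (1 − A)/A. The sketch below works under that inferred assumption; the function name is illustrative.

```python
def noise_to_signal(amplitude):
    """Noise/signal variance ratio implied by a fitted amplitude.

    Assumes amplitude = signal variance / (signal variance + noise variance).
    """
    if not 0.0 < amplitude <= 1.0:
        raise ValueError("amplitude must be in (0, 1]")
    return (1.0 - amplitude) / amplitude
```

With an amplitude of 0.7 this returns about 0.43, and an amplitude of exactly 1 corresponds to noise-free observations.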
For vertical correlations of snow depth, we found that after bin seven correlations sometimes become negative and values can be erratic. Also, correlations could not be computed for all of the higher bin numbers. This could be from data noise or decorrelation over too large a vertical distance, which, in either case, would suggest that data with that large separation distance are likely not reliable for interpolation. Therefore, we use only the first nine bins in the fits discussed here, giving a vertical range of 900 m.
The vertical correlations are fit to the same three equations and the observed and fit correlations are evaluated (
Snow depth correlations as a function of vertical separation distance. The observed correlations, Corr, and the indicated model fits are shown.
The scale parameter (
Fit | Scale (m) | Amplitude | EFD (m) | RMSE |
---|---|---|---|---|
Fit1 | 198.8 | 1.00 (fixed a priori) | 427.5 | 0.14 |
Fit2 | 197.6 | 1.01 | 424.9 | 0.14 |
Fit3 | 461.1 | 0.94 | 461.1 | 0.12 |
For horizontal daily snow depth increment statistics, Fit1 is clearly inferior to the others (
Daily snow depth increment correlations as a function of horizontal separation distance. The observed correlations, Corr, and the indicated model fits are shown.
The scale parameter (
Fit | Scale (km) | Amplitude | EFD (km) | RMSE |
---|---|---|---|---|
Fit1 | 84.6 | 1.00 (fixed a priori) | 181.9 | 0.18 |
Fit2 | 170.2 | 0.54 | 365.9 | 0.02 |
Fit3 | 378.0 | 0.51 | 378.0 | 0.03 |
For vertical separations (
Daily snow depth increment correlations as a function of vertical separation distance. The observed correlations, Corr, and the indicated model fits are shown.
The scale parameter (
Fit | Scale (m) | Amplitude | EFD (m) | RMSE |
---|---|---|---|---|
Fit1 | 330.4 | 1.00 (fixed a priori) | 710.4 | 0.18 |
Fit2 | 543.3 | 0.62 | 1,168.1 | 0.06 |
Fit3 | 1,166.7 | 0.59 | 1,166.7 | 0.06 |
The aim of the study was to estimate regional horizontal and vertical correlation scales from snow depth observations over North America for use in Kriging/OI. Scales were assumed to be fixed parameters over the selected region and are considered for the winter period only. The study showed that these assumptions are reasonable using snow depth data from GHCN-Daily. The distinct pattern of lag correlations of observations over the selected region is indicative of the large scale structure of snow depth spatial variability. The exponential correlation functions of horizontal distance and elevation used in operational snow depth analysis were found to be a good fit to the observed lag correlations.
It was necessary to perform two separate analyses: one over the eastern regions with relatively flat topography to estimate horizontal scales, and one over the western regions characterized by high mountain terrain to estimate vertical scales. Fitting observed correlations using horizontal distance and elevation simultaneously as predictors is a valid method (a correlation function of this kind can be used for interpolation), but the data are not dense enough for simultaneous estimation of horizontal and vertical scales. With these data, errors can occur where horizontal correlations are estimated over mountain environments without considering elevation. Our study found a large effect of elevation on observed snow depth correlations over the high mountain regions of North America: on average, the observed snow-depth correlation e-folding vertical distance is 461 m. Therefore, horizontal correlation scales estimated over areas with steep elevation gradients could be severely underestimated.
The amplitude is a useful parameter for Kriging/OI. An amplitude smaller than one indicates noise in the data. The noise variance is computed from the amplitude
The horizontal amplitude estimate over the eastern regions is about 0.7, indicating a signal/noise variance of about 2 (Eq.
For horizontal correlations of both full snow depth and daily increments, the horizontal length scales are an order of magnitude larger than the spatial grid scale of 10 km. Thus, sub-grid scale variability should have negligible influence on the measured large scale structure. However, the binned vertical correlations and the vertical scale estimates are less certain due to lower sampling available for their estimates. The data are not sufficient for resolving vertical scales less than the bin width of 100 m, and the vertical statistics should be thought of as a rougher estimate than the horizontal statistics. Because of the limited sampling, we used the data available to compute only one set of statistics and scale parameters for the region. Exploration of spatially and seasonally adjusted correlation scales was also considered initially but was abandoned because of the uneven sampling density. Future studies using better data and methods may be able to improve correlation statistics, especially for regions with large changes in elevation.
Horizontal and vertical correlation scales of DJF snow depth and the daily snow depth increment over a densely sampled part of North America are estimated by fitting observed correlations to the exponential equations used to represent correlations of analysis snow depth increments.
The results of our evaluation indicate large horizontal correlation scales for measured snow depth and the daily snow depth increments, estimated at about 430 km and 370 km, respectively, when the fit to equations includes an amplitude estimate, which is generally less than 1. These large horizontal scales indicate that in regions with limited topography snow depth and its daily variations may be better analyzed using data with greater horizontal separation. The horizontal Gaussian estimate of correlation (Fit3) gives estimates comparable to those of the autoregressive function (Fit2).
Correlations due to vertical separation distances are harder to measure due to lower sampling over mountainous regions. However, we showed here that vertical correlation equations can be fit to data from stations with minimal horizontal separation. All three fits to vertical correlations are similar, although the Gaussian estimate (Fit3) is slightly better. The measured and fitted vertical correlation scales for snow depth are much smaller than those for daily snow depth increments, suggesting that analysis based on interpolation of snow depth in regions with large topographic changes requires more data to maintain the same accuracy. It may be possible to use a satellite-based proxy, tuned and bias adjusted against
Estimating the amplitude reduces the fitting errors, avoids overestimation of the correlation at short separation distances, and allows estimation of the noise/signal variance ratio for use in optimal interpolation.
Our results indicate potential utility for operational snow analysis. For example, the persistence of large horizontal scales of observations and the daily increments in the scenarios considered imply that horizontal increment analysis scales larger than 120 km can be tested for improved predictions especially over remote areas, or assessments such as this can be performed to estimate scales for a specific analysis application.
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
CK and TS wrote the initial draft of the paper and provided equal contributions. All authors contributed to the writing—review and editing of the final written version.
This research was funded by NOAA, grant number NA18NWS4680053 and NOAA, grant number NA21OAR4590381.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.