Optimal Spatiotemporal Scales to Aggregate Satellite Ocean Color Data for Nearshore Reefs and Tropical Coastal Waters: Two Case Studies

Remotely sensed ocean color data are useful for monitoring water quality in coastal environments. However, moderate resolution (hundreds of meters to a few kilometers) satellite data are underutilized in these environments because of frequent data gaps from cloud cover and algorithm complexities in shallow waters. Aggregating satellite data over larger space and time scales is a common method to reduce data gaps and generate a more complete time series, but potentially smooths out the small-scale, episodic changes in water quality that can have ecological influences. By comparing aggregated satellite estimates of Kd(490) with related in-water measurements, we can understand the extent to which aggregation methods are viable for filling gaps while being able to characterize ecologically relevant water quality conditions. In this study, we tested a combination of six spatial and seven temporal scales for aggregating data from the VIIRS instrument at several coral reef locations in Maui, Hawai‘i and Puerto Rico and compared these with in situ measurements of Kd(490) and turbidity. In Maui, we found that the median value of a 5-pixels, 7-days spatiotemporal cube of satellite data yielded a robust result capable of differentiating observations across small space and time domains and had the best correlation among spatiotemporal cubes when compared with in situ Kd(490) across 11 nearshore sites (R2 = 0.84). We also found long-term averages (i.e., chronic condition) of VIIRS data using this aggregation method follow a similar spatial pattern to onshore turbidity measurements along the Maui coast over a three-year period. In Puerto Rico, we found that the median of a 13-pixels, 13-days spatiotemporal cube of satellite data yielded the best overall result with an R2 = 0.54 when compared with in situ Kd(490) measurements for one nearshore site with measurement dates spanning 2016–2019. As spatiotemporal cubes of different dimensions yielded optimum results in the two locations, we recommend local analysis of spatial and temporal optima when applying this technique elsewhere. The use of satellite data and in situ water quality measurements provide complementary information, each enhancing understanding of the issues affecting coastal ecosystems, including coral reefs, and the success of management efforts.

Remotely sensed ocean color data are useful for monitoring water quality in coastal environments. However, moderate resolution (hundreds of meters to a few kilometers) satellite data are underutilized in these environments because of frequent data gaps from cloud cover and algorithm complexities in shallow waters. Aggregating satellite data over larger space and time scales is a common method to reduce data gaps and generate a more complete time series, but potentially smooths out the small-scale, episodic changes in water quality that can have ecological influences. By comparing aggregated satellite estimates of K d (490) with related in-water measurements, we can understand the extent to which aggregation methods are viable for filling gaps while being able to characterize ecologically relevant water quality conditions. In this study, we tested a combination of six spatial and seven temporal scales for aggregating data from the VIIRS instrument at several coral reef locations in Maui, Hawai'i and Puerto Rico and compared these with in situ measurements of K d (490) and turbidity. In Maui, we found that the median value of a 5-pixels, 7-days spatiotemporal cube of satellite data yielded a robust result capable of differentiating observations across small space and time domains and had the best correlation among spatiotemporal cubes when compared with in situ K d (490) across 11 nearshore sites (R 2 = 0.84). We also found long-term averages (i.e., chronic condition) of VIIRS data using this aggregation method follow a similar spatial pattern to onshore turbidity measurements along the Maui coast over a three-year period. In Puerto Rico, we found that the median INTRODUCTION Good water quality is essential for healthy coastal ecosystems and recent increases in land-based sources of pollution (LBSP) threaten the persistence of many ecologically important nearshore species. Water quality is defined by several factors such as the turbidity, productivity, salinity, amount of sediment, nutrients, dissolved oxygen, and other pollutants present within the water column. Human expansion has dramatically increased coastal LBSP and negatively affected water quality in nearshore environments impacting most of the world's coral reefs (Fabricius, 2005;Sheridan et al., 2014;Otaño-Cruz et al., 2017). Climate change, sea level rise, and the increasing frequency of hurricanes and other natural disasters exacerbate the impacts of anthropogenic LBSP through increased runoff, coastal erosion, and sediment resuspension (Nearing et al., 2005). Excess sediment and nutrients that lead to eutrophication can degrade coastal ecosystem health by limiting light availability necessary for photosynthetic organisms and disrupting oligotrophic ecosystems.
Coral reef ecosystems are particularly sensitive to changes in water quality. There have been substantial ecosystem declines in reef regions around the world (Pandolfi et al., 2003;Burke et al., 2011;Hughes et al., 2018), and increased sediment and nutrient loads indirectly threaten one-third of remaining reef-building corals by disrupting reef biological processes and increasing coral disease (Rogers, 1990;Carpenter et al., 2008;Thompson et al., 2014;UNEP, 2017). In tropical nearshore waters, structurebuilding corals provide the underlying habitat that supports the high biodiversity of coral reefs and the many ecosystem services they provide. Tropical corals live close to their upper light and temperature tolerance thresholds (Rodolfo-Metalpa et al., 2014) and, as sessile organisms, they are highly sensitive to changes in water quality, which modifies temperature, light, and availability of both beneficial and detrimental nutrients (Lesser et al., 2009). Turbid water from sedimentation, nutrient input, and other LBSP can impair reef growth and lead to degradation of existing reefs. For example, suspended sediment decreases the light available to zooxanthellate corals for photosynthesis, and deposited sediment limits coral recruitment and is energetically costly for corals to remove (Fabricius, 2005;Weber et al., 2012). Sediment resulting from coastal dredging has similar deleterious impacts and may also act as a catalyst for the spread of coral disease (Pollock et al., 2014). High nutrient input from terrestrial runoff can cause eutrophication that leads to phytoplankton and algae blooms, further reducing light availability.
Quantifying water quality in coastal ecosystems has largely depended on highly accurate but small-scale in situ measurements and less accurate but larger-scale complementary data from remotely sensed ocean color instruments. While in situ water quality measurements can be highly accurate, they are time-consuming, labor-intensive, and often expensive and therefore difficult to collect over regional scales (Duan et al., 2013). These measurement methods are also limited in their ability to track the spatial and temporal variability in water quality in large water bodies (Gholizadeh et al., 2016). Weather conditions may also inhibit in situ measurement collection, and in the case of Hurricane Maria in Puerto Rico, destroy existing monitoring infrastructure (Hernández et al., 2020). As such, previous studies have used remotely sensed ocean color data to estimate water quality parameters in coastal environments (Thompson et al., 2014;Gholizadeh et al., 2016;Hernández et al., 2020). These data provide complementary information to in situ measurements and are useful for estimating water quality parameters over broad areas.
One such satellite-derived parameter is the diffuse attenuation coefficient at 490 nm, K d (490) (Lee et al., 2005;Morel et al., 2007;Wang et al., 2009). K d (490) is an apparent optical property that acts as a proxy for turbidity in the water column and can be used to quantify light availability for benthic ecosystems such as coral reefs (Kirk, 2011;Hernández et al., 2020). Two limitations of ocean color algorithms for remotely sensed water quality observations over coral reefs are the frequent presence of clouds along tropical shorelines (Gholizadeh et al., 2016), and the confounding of water column measurements due to bottom reflectance in clear, shallow water (Reichstetter et al., 2015). The presence of clouds increases when using instruments aboard an afternoon-overpass satellite like Suomi-National Polar-orbiting Partnership (NPP) or NOAA-20. Optically shallow waters can confound algorithms due to bottom reflectance, interfering with measurement of near-coast turbidity (McKinna and Werdell, 2018). Waters are considered optically shallow when the physical depth of the measurement location is less than the depth to which penetrating light would be optically significant (Bailey and Werdell, 2006). While previous work to improve algorithms for optically shallow waters and minimize bottom reflectance issues has been tested in Florida and the Gulf of Mexico (Barnes et al., 2013;Hu et al., 2014;Barnes et al., 2018), these algorithms have not been implemented operationally by NOAA or NASA. In addition to cloud presence and bottom reflectance, the practice of using only the highest quality satellite data affects the quantity of data available due to quality control flags, which account for perturbations such as stray light and sun glint, further removing data from consideration (Feng and Hu, 2015;Hu et al., 2020). Multi-satellite approaches can reduce data gaps, but are limited by the insufficient frequency of polar-orbiting satellite overpasses and prove difficult to implement due to instrument and algorithm differences across sensors (Sathyendranath et al., 2019). NOAA has been using Data Interpolating Empirical Orthogonal Functions (DINEOF) on single-sensor and merged-sensor datasets to successfully fill gaps and capture meso-and large-scale ocean features, highlighting the need for gapless datasets, but this method has not been tested at resolutions less than 9 km Wang, 2018, 2019). Aggregating data from one instrument over space and time scales is routinely done by NOAA and NASA to produce a more complete time series (e.g, 3-, 8-days, monthly, and seasonal averages at 4 and 9 km globally). However, aggregating over space and time has the potential to smooth out small-scale features that are of interest to coastal managers. Globally standardized aggregation scales may not be optimal for all locations due to localized small-scale episodic changes in water quality.
This study investigated the relationship of in situ data with space and time aggregations of corresponding satellite K d (490) to determine how well satellite data describe environmental variability at scales applicable to coral reefs. Data from the Visible Infrared Imaging Radiometer Suite (VIIRS) instrument (nominal spatial resolution of 750 m) aboard the Suomi-NPP satellite at several coral reef locations in Maui, Hawai'i and southwestern Puerto Rico were compared with in situ measurements of K d (490) and turbidity. The performance of aggregated satellite data was assessed in terms of how well the data correspond to in situ data to describe acute and chronic water quality, both of which are ecologically important to coral reefs. Improving the data availability of satellite-derived water quality while maintaining a robust relationship with in situ measurements will enhance the utility and reliability of satellite data for use in the monitoring and management of events that affect water quality, as well as understanding of the ecological implications of such events.

MATERIALS AND METHODS
This study compares the globally available satellite remote sensing dataset from VIIRS with two types of in situ data in Maui, Hawai'i and Puerto Rico (Figure 1). In Maui, the focus is on spatial dynamics and aggregated VIIRS data are compared with (i) in situ nearshore optical data collected at 11 sites over three days and (ii) in situ onshore turbidity measurements collected at 51 sites over three years. In Puerto Rico, the focus is on temporal dynamics and aggregated VIIRS data are compared with a four-year in situ time series of nearshore optical data at a single location. The following sections describe the data sources and collection methods in more detail.

Nearshore Optical Data Collection-Maui and Puerto Rico
Nearshore optical data in Maui and Puerto Rico were collected using a Sea-Bird Scientific HyperPro II optical profiler (Satlantic) in free-falling mode over the side of a boat, while avoiding shipshadowing errors (Mueller et al., 2003;Ondrusek et al., 2012). The raw instrument data were processed in Satlantic Prosoft software (v8.0) with calibrated reference data to obtain the attenuation coefficient of downward irradiance at 490 nm, K d (490) Wei et al., 2016). The Maui data consisted of 11 measurements taken at different nearshore sites covering a broad spatial domain (∼17 km 2 ) during November 26 and 28, 2018 (Figure 2). Nine of the nearshore sites in Maui were ∼750 m off the coast (one satellite pixel) and two were more than 750 m off the coast. The Puerto Rico data consisted of 22 measurements taken at one site in the La Parguera Natural Reserve during 2016-2019 (Figure 3). The site in Puerto Rico was more than 750 m off the coast, but within the shelf break. At each measurement site (across both study locations), three instrument casts were taken and collectively used to determine the attenuation coefficient (Supplementary Figure 1). In Maui, four sites can be considered optically shallow at the time of measurement using a conservative measure of apparent optical depth (AOD = 1.3/K d ; Bailey and Werdell, 2006); everywhere else the waters were optically deep (i.e., the apparent optical depth less than the physical depth; Supplementary Table 1).

Onshore Turbidity Data Collection-Maui
Onshore turbidity data were collected using a Hach 2100Q turbidimeter in < 1 m deep water along the west and south Maui coasts at 51 fixed sites (±3 m of accuracy) that are representative of water quality types along the stretch of coastline. Data were collected between June 15, 2016 and June 15, 2019 (Figure 4). Sites were chosen to be adjacent to but not directly in front of surface water discharges. The turbidimeter reports measurements in Nephelometric Turbidity Units (NTU). These measurements are an ongoing effort in collaboration with Hui o Ka Wai Ola, a local water quality-monitoring group (http://huiokawaiola.com), and the Hawai'i State Department of Health. The number of NTU measurements and dates they were collected varies between sites (Supplementary Table 2).

VIIRS K d (490) Ocean Color Data-Maui and Puerto Rico
Ocean color data are from the Visible Infrared Imaging Radiometer Suite (VIIRS) sensor aboard the Suomi-NPP satellite. Data are available for download as science-quality, Level-2 swaths (nominal resolution 750 m) from NOAA CoastWatch (NOAA, 2021). The NOAA Center for Satellite Applications and Research Ocean Color Team has been routinely producing ocean color products for VIIRS since its launch on October 28, 2011 using the NOAA Multi-Sensor Level-1 to Level-2 Processing System (NOAA-MSL12 v1.2) (Wang et al., 2009(Wang et al., , 2017. The algorithms   (Wang et al., 2017;Eqs. 12-16). Algorithm validation was undertaken using the predecessor to VIIRS (Moderate Resolution Imaging Spectroradiometer, MODIS) for which the mean ratio of satellite to in situ K d (490) was 1.037 (Wang et al., 2009). Subsequent comparison of VIIRS and MODIS data demonstrated consistent K d (490) values (Wang et al., 2017). For this study, Level-2 swath data for Hawai'i and Puerto Rico were binned to a geographic projection with pixel size of 0.0075 degrees (∼750 m) using CoastWatch Utilities software (v3.6.0). VIIRS overpasses occur once daily with an afternoon orbit. Swath data within 4 h before or after the average overpass time were temporally binned into daily images. For Hawai'i, which is near the International Date Line, this method prevented swaths that were sometimes greater than 22 h apart from being temporally binned into the same "day." Level-2 quality flags were applied during the spatial binning process to mask out potentially poor quality data as defined by the MSL12 process (Supplementary Table 3). The flags applied follow the current default flags used by NOAA to produce daily Level-3 global products (NOAA, 2021). Data using this process were previously used to investigate water quality impacts following Hurricanes Irma and Maria in Puerto Rico (Hernández et al., 2020). The daily, 750 m K d (490) imagery over Hawai'i and Puerto . White areas are missing data due to cloud cover or flagged due to poor quality. The red circles show the locations where 11 in situ optical data measurements were taken on November 26 or November 28, 2018 using a Sea-Bird Scientific HyperPro II; and (B) the 7-days period from November 24-30, 2018, illustrating the variability in satellite data coverage. Sites marked with a white asterisk were considered optically shallow during the time of measurement (see Supplementary Table 1).
FIGURE 3 | Map of VIIRS K d (490) for southwestern Puerto Rico for November 9, 2016 (750 m pixel resolution). White areas are missing data due to cloud cover or flagged due to poor quality. The red circle shows the location where 22 in situ optical data measurements were taken between 2016 and 2019 using a Sea-Bird Scientific HyperPro II.
Frontiers in Marine Science | www.frontiersin.org FIGURE 4 | (Left) Temporal median of the time series of (red, upper axis) in situ turbidity (NTU) and (black, lower axis) spatiotemporal (median of 5-pixels, 7-days cube) satellite K d (490) (m −1 ) for each site by latitude in West Maui, HI. In situ turbidity measurements were taken using a Hach 2100Q turbidimeter. Data were included in the analysis only when in situ observations and satellite observations (in a 5-pixels, 7-days cube) were both available. Whiskers represent the 1st and 3rd quartiles of underlying time series data for each dataset, respectively. (Right) Map of collection sites along the West Maui coast. The 1937 land use layer shows areas historically used for pineapple (orange) and sugarcane (yellow) production. The pink with yellow pattern area shows coral reef presence along the coast.
Rico produced following this process were used to test different aggregation methods in this study.

Aggregating VIIRS Data for Comparison With in situ Data
To evaluate the performance of different aggregations of satellite data, K d (490) values from every combination of six spatial and seven temporal scales were compiled for comparison with in situ data. Spatial scales were defined using a rosette-shaped buffer that incremented from the central pixel (containing the in situ data measurement) by one 750 m satellite pixel in every direction. This resulted in total potential pixels considered for the spatial dimension of 1, 5, 13, 25, 41, and 61 (effective resolutions ∼0. 75, 2.25, 3.75, 5.25, 6.75, and 8.25 km). For in situ measurements close to shore, the number of pixels considered in the spatial buffer was less than the potential maximum (typically around half of the maximum, depending on shape of the coastline) due to the buffer expanding over land. The temporal scales for the satellite data were centered on the date of the in situ data measurement, incrementing by one day before and after, that resulted in time scales of 1, 3, 5, 7, 9, 11, and 13 days centered on the in situ measurement date. The resulting spatiotemporal cubes of satellite data ranged in size from one pixel (1-pixel, 1-day) to 793-pixels (61-pixels, 13-days).
For each spatiotemporal cube of satellite data, the median value was determined (where an even number of satellite data values occurred, the mean of the two middle values was used). This statistic was chosen (e.g., instead of the mean) for comparison with in situ data to preserve a "real" measurement from the satellite data and to remove any influence of extreme values, which may be erroneous and indicate bottom reflectance in optically shallow near-shore environments (McKinna and Werdell, 2018). As distributions of K d (490) data are typically skewed toward low values, both in situ and satellite K d (490) data were log-transformed to reduce skew before comparison.

Comparing
For the 11 sites in West Maui (Figure 2A), a week of satellite K d (490) values illustrates the variability in satellite data coverage ( Figure 2B). On average, the 11 satellite pixels containing these sites had data on only 18% of days in 2018. In situ and satellite K d (490) values ranged within 0.022 and 0.223 m −1 . Median values from each spatiotemporal cube of VIIRS K d (490) were compared with in situ K d (490) measurements, and linear regressions of log-transformed data resulted in R 2 values ranging from 0.289 to 0.911 ( Figure 5). The highest R 2 resulted from the 5-pixels, 1-day combination; however, this combination yielded satellite data for only nine out of 11 in situ sites. The next highest R 2 of 0.843 resulted from the 5-pixels, 7-days combination, which yielded matching satellite data for all 11 of the in situ measurements in Maui ( Figure 5B). For this spatiotemporal cube, a total of 136 satellite values contributed to the 11 medians (6-21 gray dots for each black dot, Figure 5B) that were compared with in situ measurements. In general, increasing the spatial scale beyond the 5-pixels buffer and the 7-9-days time scale led to reduced R 2 values. Consideration of model fit using relative root mean square error (RRMSE) produced similar outcomes (Supplementary Figure 3). Due to sparse satellite data coverage, R 2 values for smaller and shorter spatiotemporal aggregation scales were more variable and, for the smallest aggregations, did not include all sites. Applying the 5-pixels, 7-days aggregation to site 15 resulted in an increase from 28 to 88% data coverage with an RMS difference of 0.005 m −1 between the two time series (Figure 6).

Comparing Aggregated VIIRS K d (490) With Onshore in situ Turbidity (NTU)-Maui
Having determined from the nearshore analysis that the 5-pixels, 7-days spatiotemporal aggregation was optimal, this aggregation scale was applied to VIIRS K d (490) values for comparison with the onshore turbidity measurements. While some of the 51 onshore sites fell within the same 750 m pixel, the sampling dates were unique to each site. From the resulting matched sets of satellite and in situ data, the chronic or long-term turbidity at each site was determined using the median value through time to investigate variation with latitude (Figure 4). The left panel of Figure 4 illustrates the chronic spatial pattern of measured turbidity and K d (490). Along the southern sector of the Maui coast, both measures increased from south to north until reaching Kalepolepo Park (∼20 • 46.6 N). Variability in turbidity (as indicated by the interquartile whiskers), also increased along this section of the coast. In the middle latitudes, the coastline runs from east to west, from Kalepolepo Park around the bight to Mā'alaea Harbor, which means the values are difficult to distinguish by latitude alone (Figure 4). Considering the sites by index (Supplementary Figure 2 and Keālia Pond,, also within the bight, had low turbidity and high K d (490) values. Along the northern sector of the Maui coast, turbidity (together with variability) increased heading north to Kā'opala Bay, although to a lesser extent than the southern sector. The northernmost site, Honolua Bay, had the highest variability in turbidity. Despite the differences in the measured quantities (in situ turbidity of particulates, and remotely sensed attenuation coefficient, which measures the effect of both particulates and dissolved matter), there is a consistency in the spatial pattern, indicating either that the components co-vary or that the satellite measurement is predominated by particulates. The R 2 between the long-term median K d (490) and turbidity values was 0.056. While spatial patterns in the chronic conditions appeared consistent for the two datasets, there was little to no relationship between turbidity values and VIIRS K d (490) when examined for each time and location. This was likely due to the inability of the satellite to measure changes in turbidity within tens of meters of the coast where turbidity samples were taken (typically <1 m deep water).

Comparing Aggregated VIIRS K d (490) With in situ K d (490)-Puerto Rico
We matched 22 in situ K d (490) measurements spanning four years from a single nearshore site in Puerto Rico with spatiotemporal aggregations of satellite data (Figure 3, which shows satellite K d (490) coverage for one matchup date). In situ and satellite K d (490) values ranged between 0.03 and 0.33 m −1 . Median values from each spatiotemporal cube of VIIRS K d (490) were compared with in situ K d (490) measurements, and linear regressions of log-transformed data resulted in R 2 values ranging from 0.003 to 0.547 (Figure 7). Every spatiotemporal cube smaller than 13-pixels and/or shorter than nine days resulted in an R 2 less than 0.35. The highest R 2 of 0.547 (to three decimal places) resulted from the 13-pixels, 13-days and the 25-pixels, 13-days combinations (the latter was higher in the 5th decimal place). There was little difference in R 2 between the regressions that fell within the 13-41 pixels and 9-13 days range of the matrix (ranging 0.45-0.547). The 13-pixels, 13-days matchup yielded the highest R 2 with the more parsimonious spatial scale (Figure 7B). In contrast to the West Maui dataset, the in situ values here represent measurement dates at the same location rather than measurements across different sites.
The matrix in Figure 7 shows little pattern between R 2 and increasing time and space scales, but it could be argued that increasing the time scale had a greater (positive) effect on R 2 than increasing the spatial scale. Evaluation of model fit using the RRMSE statistic (RRMSE < 15%) supported the selection of the 13-pixels, 13-days aggregation (using R 2 ) to balance the desired improvement to data coverage (number of dates matched) while maintaining representation of the in-water  Figure 2) for a range of spatial buffers and time windows applied to the satellite data. The outer horizontal axis indicates the spatial buffer and the outer vertical axis represents the time window, with icons on each representing the space (or time) domain. Together, the time window and spatial buffer make the spatiotemporal cube of satellite data from which values are compiled (gray circles) and the median of those is calculated (black circles). Shading of each plot represents the R 2 value of individual regression lines between the median satellite observation and the in situ observations. For each scatterplot, the horizontal axis values are the in situ observations from the sites in Figure 2, all of which were taken during November 26-28, 2018, and the vertical axis values are all satellite observations associated with that in situ measurement within the time window and space buffer indicated. The number in the upper left corner is the R 2 value for that plot; the number in the bottom right corner is the total number of satellite observations (gray circles) displayed in the plot. (B) Enlarged scatterplot for recommended spatiotemporal aggregation (5-pixels, 7-days). Axis tick values in (B) are the same as all scatterplots in (A) and major tick marks are bold for all scatterplots.  Figure 4). No spatiotemporal aggregation resulted in all in situ sampling dates having matched satellite data (the largest cube, 61-pixels and 13-days, matched with 19 out of 22 dates; the selected 13-pixels, 13-days cube matched with 17). Applying the 13-pixels, 13-days aggregation to Site LP6 resulted in an increase from 11 to 82% data coverage with an RMS difference of 0.025 m −1 between the two time series (Figure 8).

DISCUSSION
Understanding the long-and short-term variations in water quality are essential to enhance related management of marine ecosystems. We found that aggregated VIIRS K d (490) data adequately captures this variability in coastal waters, allowing characterization of water quality over broad regions. Researchers and managers can leverage the long time domain of satellite measures of water quality that are now available to assess both historical trends and variability, and to evaluate temporal dynamics. Our method of optimizing satellite data aggregation in nearshore areas, which are typically of low data density, successfully represented in situ measurements, supporting the use of this type of data for coral reefs and other coastal marine environments.
The nearshore Maui dataset-collected in a three-days snapshot at 11 sites-captured K d (490) data over the West Maui region. The relationship between in situ K d (490) and median VIIRS K d (490) from a 5-pixels, 7-days spatiotemporal cube was robust at differentiating observations even across these small space and time domains. Many of the nearshore in situ K d (490) measurement locations in Maui were within the satellite pixel directly adjacent to the VIIRS dataset land mask. This affected the overall number of pixels that had valid data and effectively reduced the total possible pixels in the spatiotemporal cubes by approximately half. Rather than attempting to shift the buffer to include additional water pixels and capture the maximum possible pixels for each buffer size, we chose instead to use a consistent method for all in situ sites. Time series using the 5pixels, 7-days spatiotemporal cube at site 15 in Maui highlight the improvement in data coverage, while maintaining features of the non-aggregated data (Figure 7).
The robustness of the methodology also was apparent in the Puerto Rico analysis, which considered a single location over a longer time period (four years). However, the optimal spatiotemporal cube dimension (13-pixels, 13-days) was notably different. Aggregating satellite data over longer time and broader spatial scales yielded better correlation with in situ data, with a preference for longer time scales. This is consistent with previous uses of VIIRS monthly K d (490) and chlorophyll-a data to monitor prolonged degradation of water quality after Hurricanes Maria and Irma in Puerto Rico (Hernández et al., 2020). Of the 22 in situ measurements, less than half had matching satellite data at combinations less than a 25-pixels spatial buffer or a 7-days time window. This can be explained by the typical occurrence of cloud cover at the time of the satellite overpass (Mikelsons and Wang, 2019;Hernández et al., 2020). However, the strength of the relationship between satellite and in situ data for larger spatiotemporal cubes suggests the persistence of water quality features in this region typically aligns with the scale selected for the aggregation. Importantly, and similar to the spatial analysis in Maui, VIIRS data seem capable of effectively capturing temporal dynamics in turbidity in Puerto Rico. Time series using the 13pixels, 13-days spatiotemporal cube at site LP6 in La Parguera, Puerto Rico highlight the improvement in data coverage, while maintaining features of the non-aggregated data (Figure 8).
VIIRS K d (490) also appear to be useful for monitoring and characterizing chronic states along the Maui coast where sediment input is known to be high and extends far offshore. The onshore turbidity dataset for Maui provided a spatially well-distributed dataset spanning three years, reflecting the commitment of both local organizations and state agencies FIGURE 7 | (A) Correlation matrix of in situ and VIIRS measures of log-transformed K d (490) over a 4-years in situ measurement period at one southwest Puerto Rico location ( Figure 5) for a range of spatial buffers and time windows applied to the satellite data. The outer horizontal axis indicates the spatial buffer and the outer vertical axis represents the time window, with icons on each representing the space (or time) domain. Together, the time window and spatial buffer make the spatiotemporal cube of satellite data from which satellite values are compiled (gray circles) and the median of those is calculated. Shading of each plot represents the R 2 value of individual regression lines between the median satellite observation and the in situ observations. Scatterplots with two or fewer matchups are excluded. For each scatterplot, the horizontal axis values are the in situ observations from the site in Figure 3, which were taken between 2016 and 2019, and the vertical axis values are all satellite observations associated with that in situ measurement within the time window and space buffer indicated. The number in the upper left corner is the R 2 value for that plot; the number in the bottom right corner is the total number of satellite observations (gray circles) displayed in the plot. (B) Enlarged scatterplot for recommended spatiotemporal aggregation (13-pixels, 13-days). Axis tick values in (B) are the same as all scatterplots in (A) and major tick marks are bold for all scatterplots. to ongoing water quality monitoring. While turbidity and K d (490) are measures of different in-water constituents (turbidity measures particles and K d (490) measures both particles and dissolved matter), K d (490) has been previously used to estimate turbidity levels in the Great Lakes (Son and Wang, 2019), and may act as a proxy for turbidity in particle-dominated waters. The similar spatial patterns of long-term conditions between in situ and satellite measures across south and west Maui suggest that nearshore patterns reflect those in shallow onshore waters. This provides confidence in the application of satellite data to inform likely onshore spatial patterns. Notably, the increase in both chronic VIIRS K d (490) and turbidity from south to north follows known sedimentation patterns in Maui related to historic agricultural land use practices, as shown in the map in Figure 4. Historical pineapple plantations were common in the northern West Maui watersheds and best management practices were typically poor, such as terrace creation leading to movement of soil into the adjacent stream gulch (Stock et al., 2015). Along the northern coast, the turbidity gradient is also related to an increase in rainfall moving north to Honolua Bay. In the southern sector, large sugar plantations were in cultivation up until 2016, and were developed using an extensive network of ditches that also deliver sediment and nutrients to the coast. The effects were concentrated in the Mā'alaea area, which may partially explain the south-to-north increase in turbidity along the southern sector. It is of note that while spatial patterns in long-term characteristics were similar, individually paired turbidity values and VIIRS K d (490) showed little to no relationship (maximum R 2 = 0.013 across all spatiotemporal cubes tested). This may be attributed to aforementioned differences in the in-water constituents satellite K d (490) and in situ turbidity are measuring and the resolution and onshore coverage limitations of the satellite. Sediment resuspension with wind or wave action may also affect onshore samples differently from nearshore patterns we see in the satellite data. Although broad spatial patterns are captured by satellite data, the temporal variability of particular sites is not well captured, emphasizing the independent value of in situ sampling.
Importantly, these results together indicate the complementary value of using both remotely sensed and in situ data on water quality. Satellite products can provide continuity of information between field excursions, while inwater measurements are invaluable during extended periods of cloud cover that usually correlate with storms and occur close to shore, where bottom reflectance and land masking prevent consistently available satellite data. In the shallow coastal zone, satellite products may best inform long-term conditions (e.g., baseline data) for water quality that couple with in situ data used to monitor short-term events. For some sites, it may be possible to develop satellite tools that alert managers and other stakeholders to unusual conditions nearshore, which trigger response monitoring onshore. Similarly, in situ measurements can be useful to inform the development of satellite algorithms in complex waters.
It is common practice to average satellite data over longer time scales of weeks and months to understand seasonal patterns and fill data gaps. It is less common to increase the spatial buffer used, especially near the coast, due to the potential of introducing more noise into the signal. However, for monitoring changes over an area, such as coastal coral reefs, this method of determining the optimal spatiotemporal scale may capture nearby events that can have ecological impacts, such as a sediment plume moving along the coast. While they can be informed by the analyses presented here, optimizing the dimensions of the spatiotemporal aggregation should ideally be undertaken for each geographic region of study using appropriate in situ measurements. This is especially important since satellite data coverage can vary greatly from location to location. Spatial buffer size may ultimately affect the correlation between satellite and in situ data when trying to track acute events; however, it can be useful for characterizing the chronic state of a location when looking for anomalies. Our relationships between satellite data and in situ data in Maui and Puerto Rico suggest that aggregating over longer time scales produces better correlations with in situ data than increasing spatial buffer size; however, further analysis would be required to categorically state this.
Enhancing the availability of temporally dense, long-term satellite data to managers opens up new possibilities for application and calibration/validation. The application of methods to maximize satellite data availability should include fine-tuning with in situ measurements at each new location. Ecological models that require regular data input to generate predicted changes in environmental conditions, such as seagrass habitat loss or coral disease occurrence, could utilize this method. We suggest testing this method with other remotely sensed metrics such as chlorophyll-a or colored dissolved organic matter (CDOM) where in situ data are readily available. Longer time series of in situ water quality data in nearshore and onshore environments would benefit the validation and applicability of this method.

CONCLUSION
Remotely sensed ocean color data from satellites provide useful information for characterizing and tracking changes in water quality over broad regions. Many challenges exist when creating management tools using these data, the most prevalent of which are data gaps due to clouds and capturing events that occur at small spatial and temporal scales close to the coast (e.g., tens of meters, less than one day in duration). In this study, we addressed the issue of overcoming data gaps by testing space and time aggregation methods on VIIRS data in Maui Hawai'i and southwestern Puerto Rico to determine when the correlation with corresponding in situ measurements is optimal. We found that this method worked well in nearshore waters of Maui for optimizing the correlation with in situ data and increasing the overall number of satellite matchups. In onshore waters in Maui, VIIRS data aggregated with this method follow similar climatological patterns as in situ turbidity measurements over a 3-years period. In Puerto Rico, the overall correlations between VIIRS data and in situ data were lower, but R 2 and matchups were maximized at longer time and space scales. The complementary nature of satellite and in situ water quality measurements enhance our understanding of coastal ecosystems and should continue to be used together in management plans and other applications.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
EFG, SFH, WJH, JMC, MJD, and ALG contributed to conception and design of the study. WJH, RAA, KF, and TC provided in situ data. EFG, SFH, MJD, JMC, WJH, CME, and ALG conceptualized and/or performed the statistical analysis. EFG wrote the first draft of the manuscript with WJH, KF, and SFH contributing sections of the manuscript. EFG, SFH, WJH, JMC, KF, TC, ALG, GL, JLD, RAA, MJD, and CME contributed to manuscript revision, read, and approved the submitted version.   (Figure 2) for a range of spatial buffers and time windows applied to the satellite data. The outer horizontal axis indicates the spatial buffer and the outer vertical axis represents the time window, with icons on each representing the space (or time) domain. Together, the time window and spatial buffer make the spatiotemporal cube of satellite data from which values are compiled (gray circles) and the median of those is calculated (black circles). Shading of each plot represents the RRMSE value of individual regression lines between the median satellite observation and the in situ observations. For each scatterplot, the horizontal axis values are the in situ observations from the sites in Figure 2, all of which were taken during November 26-28, 2018, and the vertical axis values are all satellite observations associated with that in situ measurement within the time window and space buffer indicated. The number in the upper left corner is the RRMSE value for that plot; the number in the bottom right corner is the total number of satellite observations (gray circles) displayed in the plot. (B) Enlarged scatterplot for recommended spatiotemporal aggregation (5-pixels, 7-days). Axis tick values in (B) are the same as all scatterplots in (A) and major tick marks are bold for all scatterplots.

Supplementary Figure 4 | (A)
Correlation matrix of in situ and VIIRS measures of log-transformed K d (490) over a 4-years in situ measurement period at one southwest Puerto Rico location ( Figure 5) for a range of spatial buffers and time windows applied to the satellite data. The outer horizontal axis indicates the spatial buffer and the outer vertical axis represents the time window, with icons on each representing the space (or time) domain. Together, the time window and spatial buffer make the spatiotemporal cube of satellite data from which satellite values are compiled (gray circles) and the median of those is calculated. Shading of each plot represents the RRMSE value of individual regression lines between the median satellite observation and the in situ observations. Scatterplots with two or fewer matchups are excluded. For each scatterplot, the horizontal axis values are the in situ observations from the site in Figure 3, which were taken between 2016 and 2019, and the vertical axis values are all satellite observations associated with that in situ measurement within the time window and space buffer indicated. The number in the upper left corner is the RRMSE value for that plot; the number in the bottom right corner is the total number of satellite observations (gray circles) displayed in the plot. (B) Enlarged scatterplot for recommended spatiotemporal aggregation (13-pixels, 13-days). Axis tick values in (B) are the same as all scatterplots in (A) and major tick marks are bold for all scatterplots.
Supplementary Table 1 | Site names, locations, and average depths of in situ measurement sites. The date the Maui samples were taken is November 28, 2018 and the date of the La Parguera sample is December 6, 2017. Apparent optical depth was calculated using the formula 1.3/K d (Bailey and Werdell, 2006). Rows with a gray background are considered optically shallow since the apparent optical depth exceeds the site average depth.
Supplementary Table 2 | West Maui, HI sites where in situ turbidity samples were taken sorted by descending latitude as they appear in Figure 4. N Observations is the total number of in situ observations, Matched Pairs is the number of in situ observations with corresponding satellite observations (within the 5-pixels, 7-days data cube), and Distance to Satellite Pixel is the distance of the in situ observation to the center point of the 750 m satellite pixel containing the sampling site (decimal degrees).
Supplementary Table 3 | List of Level-2 flags used for quality control during the binning process and corresponding bit values that make up the bitmask for flagging data.