Application of Generalized Cauchy Process on Modeling the Long-Range Dependence and Self-Similarity of Sea Surface Chlorophyll Using 23 years of Remote Sensing Data

Understanding the temporal characteristics of sea surface chlorophyll (SSC) is helpful for marine environmental management. This study chose 10 time series of remote daily sea surface chlorophyll products from the European Space Agency during the period from July 29, 1998 to December 31, 2020. A generalized Cauchy model was employed to capture the local and global behaviors of sea surface chlorophyll from a fractal perspective; the fractal dimension D measures the local similarity while the Hurst parameter H measures the global long-range dependence. The generalized Cauchy model was fitted to the empirical autocorrelation function values of each SSC series. The results showed that the sea surface chlorophyll was multi-fractal in both space and time with the D values ranging from 1.0000 to 1.7964 and H values ranging from 0.6757 to 0.8431. Specifically, regarding the local behavior, 9 of the 10 series had low D values (<1.5), representing weak self-similarity; on the other hand, regarding the global behavior, high H values represent strong long-range dependence that may be a general phenomenon of daily sea surface chlorophyll.


INTRODUCTION
Sea surface chlorophyll (SSC) is an important bio-indicator, representing the biomass of the phytoplankton in the surface layer of the ocean [1][2][3]. On one hand, phytoplankton have made significant contributions to capture greenhouse gas from the atmosphere and balance the carbon cycle globally [4,5]; on the other hand, under a suitable living environment condition (such as temperature, nutrients, etc.), the phytoplankton will grow rapidly and cause blooms, leading to the degradation of the water environment and ecosystem corruption [6][7][8]. Therefore, understanding the evolution and pattern of SSC is of great significance to ocean environmental management.
With the development of remote sensing technology, the sensors equipped on satellites can provide long-term SSC products at a global scale, which is conducive to the studies of SSC. For example, the pattern of global ocean primary production can be investigated at a large scale [9][10][11][12]. Likewise, the regional SSC variations were studied using remote sensing data. Yamada et al. [13] employed the Ocean Color and Temperature Scanner (OCTS) and the Sea-Viewing Wide Field of View Sensor (SeaWiFS) remote sensing data to study the SSC variation in the East China Sea and the Sea of Japan and found the interannual variability of the spring bloom and the weak temporal transition of the fall bloom. In the Bohai Sea and the North Yellow Sea of China, Zhai et al. [14] found the SSC exhibited a spatially coherent increasing trend over 2003-2011 and a decreasing trend over 2012-2018 by using Moderate Resolution Imaging Spectroradiometer (MODIS) data; specifically, the decreasing trend was more obvious than the increasing trend. Further, the Ocean Colour Climate Change Initiative (OC-CCI) standard products with locally modified SSC was also used to detect four types of SSC annual cycle in the East China Sea, i.e., the summer bloom, spring and autumn bloom, early spring bloom, and low SSC [15]. In summary, the studies mentioned above only focused on the trends and made simple statistics for exploring the space-time SSC patterns.
Recently, specific SSC variation modeling has been implemented in several studies. He et al. [16] chose the optimal theoretical model (such as Exponential model, Spherical model, Gaussian model, and their combinations) to fit the spatial covariance of the SSC distribution in the Gulf of St. Lawrence, and found that the highest SSC variability occurred in November while it changed a lot during the period from August to November. Despite this, few studies have modeled the temporal variance or pattern of SSC. The long-range correlation (or dependence) of SSC was detected in the South China Sea with time scales ranging from a few weeks to 2 years [17]. However, the long-term mathematical modeling of the longrange dependence (LRD) and self-similarity of SSC is still lacking.
In general, several important parameters are used to characterize the complex behavior and dynamics of a time series, such as the Hurst parameter and the fractal dimension/ index. Further, some methodologies have been developed to estimate these two parameters separately. Traditionally, the fractal dimension or index can be estimated by counting the number of level crossings, using increments, or the relationship between power variations and the fractal dimension [18][19][20]; besides this, some other fractal dimensions, such as numberbased fragmentation fractal dimension and mass fractal dimension for soil properties can be calculated as shown in other studies [21][22][23]. Regarding the Hurst parameter, the variance-plot with various block sizes were fitted to obtain the slope β and the Hurst parameter can be calculated subsequently by β 2H − 1; Kettani and Gubner [24] developed a variogrambased method to calculate the Hurst parameter and found the new method was superior to the wavelet method; Li [25] used the generalized fractional Gaussian noise to fit the autocorrelation function (ACF) of the traffic and further obtain the Hurst parameter; moreover, modified multifractal Gaussian noise theory was also developed to calculate the Hurst parameter of the sea level across the study period [26]. Given that the two parameters denote various fractal characteristics of the time series, it is important to seek ways to simultaneously obtain the fractal dimension and Hurst parameter. Luckily, the generalized Cauchy model provides a potential way to achieve this goal. It can be used to model the ACF of the studied time series, and it proves that the two parameters were independent of each other [27]. In the past few decades, the generalized Cauchy process has been successfully applied to model the sea-level fluctuations, teletraffic, and network traffic [27,28].
Given the above considerations, the objective of this work is to use the generalized Cauchy process to model the ACF of remote SSC data and explore the fractal characteristics of SSC, which will benefit local SSC monitoring, controlling, and policy-making.

Data Collection
The long-term daily SSC data was collected from the European Space Agency (ESA). It merged remote sensing reflectance (Rrs) from several satellites, including SeaWiFS, MERIS (Medium Resolution Imaging Spectrometer), Aqua-MODIS, VIIRS (Visible and Infrared Imager/Radiometer Suite), and OLCI (Ocean and Land Color Instrument) [29]. Then, the SSC products are generated using Algorithm Theoretical Baseline Document (Optical Classification and Algorithm Blending) [30]. In the present study, the daily SSC products with spatial resolution 1°× 1°were integrated during the period from July 29, 1998 to December 31, 2020 (8,192 days in total), and 10 locations were selected for further analysis, see Figure 1. Of the 10 locations, 7 are located in the Gulf of California ( Figure 1B), with 2 and 1 located in the western coastal regions of Madagascar and South Africa, respectively.

Long-Range Dependence
Let x(t) and r(τ) denote the time series of the studied natural attribute and its ACF, i.e., r(τ) E[x(t)x(t + τ)], where E represents the expectation operator. Thus, LRD or long memory is used to depict the situation that the ACF decays slowly with the characteristic as +∞ −∞ r(τ)dτ ∞ [31][32][33], i.e., the values of the studied natural attribute with large temporal lag show a strong correlation. Further, the asymptotic form of ACF with LRD can be expressed as Eq. 1 with the help of the Hurst parameter [34].
Where the Hurst parameter H ranges from 0.5 to 1 under the LRD condition, representing the global property of the time series x(t), a larger value of H implies that the LRD is stronger.

Self-Similarity
The ACF is self-similar when it remains the same through aggregating the sub-series of x(t) with nonoverlapping blocks [35], i.e., part of the time series is locally approximately similar to the entire time series.
According to the literature [36,37], the fractal index (α) was employed to measure the local self-similarity, as follows: where c > 0 and 0 < α ≤ 2. The fractal dimension, D, belongs to [1,2). A larger value of D means that the local self-similarity of the studied time series is stronger [27].

Generalized Cauchy Process
The generalized Cauchy process is applied when the time series x(t) and its ACF are of the form of the following equation, subject to 1 < α ≤ 2 and β ≥ 0 [28,38]: where ψ 2 is the intensity of x(t). The following comments discuss features of the two parameters in Eq. 3. Regarding the parameter β, it defines the dependence of where B is the beta function, i.e., it represents short-range dependence (SRD). Regarding the parameter α, it defines the self-similarity of x(t) by setting τ → 0, thus lim τ→0 C(τ) ψ 2 τ α with respect to α. In short, the LRD and self-similarity of x(t) only rely on the parameters β and α, respectively. In this case, with the definition of Eq. 1b and Eq. 2b, the generalized Cauchy process can be written as For modeling purpose, the intensity ψ 2 can be set to 1, and Eq. 4 becomes In this case, the generalized Cauchy process can simultaneously depict the LRD (global property) and self-similarity (local property) of x(t) by using the two parameters H and D, respectively. Regarding the Hurst parameter H, if 0 ≤ β < 1, i.e., 0.5 < H < 1, it represents LRD, and the values of ACF remain high even over large temporal lag; whereas if β > 1, i.e., 0 < H < 0.5, it represents SRD, and the value of ACF usually decays quickly, e.g., the value of ACF may decline to zero over a lag of several days. With various values of H and D, the ACFs were plotted in Figure 2. It was found that the ACF value of the generalized Cauchy process decreases as the temporal lag τ increases. Moreover, the ACF value increases when the value of H increases and the value of D is fixed, while the ACF value decreases a little when the value of D increases and the value of H is fixed. Among the six lines shown in the sub-figures, the three blue represent the LRD cases, and three red lines represent the SRD cases. Specifically, when the temporal lag, H value, and D value are equal to 7.2 days, 0.05, and 1.7 respectively, the value of ACF declines to 0.01, representing SRD characteristics; see the dark red line of the bottom right sub-figure in Figure 2. On the other hand, when the temporal lag, H value, and D value are equal to 31 days, 0.75, and 1.1 respectively, the value of ACF is still greater than 0.179, representing LRD characteristics; see the middle blue line of the top left sub-figure in Figure 2.

Autocorrelation Function Fitting Process
The original time series of SSC at each location was divided equally into 16 sub-series with no overlapping cases, each containing 512 data points. Considering that the value of the autocorrelation function may decay to zero with a 1-month temporal lag in some parts of the world [39,40], the theoretical autocorrelation function values for temporal lags between 0 and 32 was calculated by averaging the 16 autocorrelation functions of each sub-series. Then, the generalized Cauchy model (Eq. 5) was employed to fit each of the theoretical autocorrelation function values using the "lsqnonlin" function embedded in MATLAB software. Then, the fractal dimension D and the Hurst parameter H were estimated. The fitting performance of the generalized Cauchy model was evaluated by R 2 , MAE, and RMSE, as follows: where Y(τ) and Y(τ) represent the empirical ACF value and fitted ACF value at temporal lag τ respectively, Y represents the mean value of the ACF series, n represents the length of the series, and SS res and SS tot represents the sum of squared residuals and the sum of squares of deviation from mean respectively.

Descriptive Statistics
The proportion of missing daily SSC data from ESA at the 10 locations ranges from 8.02 to 11.56% over the entire period. Given that the missing values were discretely distributed in Frontiers in Physics | www.frontiersin.org September 2021 | Volume 9 | Article 750347 each SSC time series, they were interpolated by using the "spline" function in MATLAB software for further analysis. The SSC time series with full length (including 8192 SSC data) at the considered 10 locations are plotted in Figure 3 and the corresponding descriptive statistical results are presented in

DISCUSSION
The present work employed the generalized Cauchy process to model the ACF of daily remote SSC data during a 23-year period  and good performance of the fitting was obtained, indicating that the SSC in the 10 chosen locations follows a heavy-tailed distribution [41]. Compared to the generalized Gaussian noise model, the outstanding ability of the generalized Cauchy model is that it can simultaneously estimate both independent variables (Hurst parameter and fractal dimension). This means that it can describe the global correlation and local self-similarity of the considered natural attribute [27,42], such as the SSC in the present work. To the best of my knowledge, the present effort is one of the earliest studies in obtaining the LRD and selfsimilarity of the SSC in view of fractal statistics.

Impacts of Environmental Factors on the Fractal Characteristics of Sea Surface Chlorophyll
Relatively low self-similarity and high LRD of the SSC series at the 10 locations can be summarized. A rather different case was found in a lake study, i.e., the chlorophyll-a was autocorrelated over lags of five or 6 days [43], indicating SRD. The nutrients and aquatic environment (light, temperature, salinity, etc.) impact the algae growth and further influence the SSC variability [44][45][46][47][48]. With the continuous variation of these factors, the variation of SSC across time is also smooth according to the algae growth, causing the weak irregularity characteristics with low values of the fractal dimension and strong LRD with high values of the Hurst parameter. This phenomenon leads to the empirical values of ACF (blue dash line shown in Figure 4) being slightly higher than the theoretical values of ACF (black line shown in Figure 4) with the temporal lags between 2 and 10 days at most locations. However, with the temporal lag increasing from 10 to 25 days, the empirical values become smaller than the theoretical values, because the SSC may be influenced by the global climatic dynamics, such as the Southern Pacific Oscillation Index [49], which is different from algae's own growth condition. Moreover, there may be another situation that algae blooms with enough nutrient inputs [50], leading to extremely high SSC values, e.g., the SSC value increases rapidly and peaks for one or 2 days (as some peaks shown in Figure 3); and that may be the reason that the value of the fractal dimension is the highest among the 10 series, i.e., D 1.7964 with the highest maximum SSC value and standard deviation across the study period. On the other hand, the values of the fractal dimension and Hurst parameter varied at various locations, indicating that the environmental conditions are rather different from each other. Moreover, various species of algae may exist at various locations and their growth response to the environmental conditions are rather different [51,52]. Compared to the values of the fractal dimension (varies from 1.7244 to 1.7838) of SSC in the Chesapeake Bay [45], the values of the fractal dimension obtained in the current study are rather low; this may due to the fact that the river discharge and the nutrients it carries are not as large as the rivers (e.g., Susquehanna River and Potomac River) that run into the Chesapeake Bay. However, the values of the Hurst parameters in the current study are greater than that in the Chesapeake Bay study with LRD characteristics, indicating that a large nutrient load in the Chesapeake Bay may lead to weak LRD. Therefore, the LRD may be a general feature of SSC variations in oceans. Besides this, the aquatic environment will also influence the behavior of the zooplankton, e.g., warm waters will favor the consumption of the zooplankton, causing the reduction of SSC [53][54][55]. On the other hand, the upwelling system and surface currents around the coastal areas play important roles in shaping the distribution of zooplankton and further influence the variation of SSC [56]. The upwelling system on the California coast shows seasonal variabilities and can be summarized into four types: "Winter Storms" season (Dec-Jan-Feb), "Upwelling Transition" period (Mar and Jun), "Peak Upwelling" season (Apr-May), "Upwelling Relaxation" season (Jul-Aug-Sep), and "Winter Transition" season (Oct-Nov), so the impacts of upwelling system on the SSC are also seasonally continuous. That may be one of the reasons that SSC has LRD characteristic [57].

Comparisons to Previous Works
Comparisons between the current study and previous studies were conducted as follows. Some ACFs of teletraffic was rather high, above 0.98 with even 128 days lag [27], but the ACF value of SSC in the current study fall below 0.5 with 31 days lag. This may be the reason why very high values of the Hurst parameter (larger than 0.99) were detected with teletraffic rather than SSC. In addition, the values of the fractal dimension of teletraffic were much larger than that of SSC, demonstrating that stronger self-similarity was found in teletraffic series than SSC. This may be due to the fact that values for teletraffic are more random in occurrence while the values for SSC are more like a continuous series associated with several environmental factors mentioned above. In another study [28], the generalized Cauchy process was used to model the ACF of sea level fluctuations with a temporal resolution of 1 h, and found that the value of the fractal dimension was approximately equal to 1 at several locations while the most of them were larger than 1.8; the values of Hurst parameter were larger than 0.98. Interestingly, the locations with low fractal dimension values are located in the Gulf of Mexico, which is similar to the seven locations studied in the current study, i.e., the weak self-similarity may occur in a stable environment.

Limitations and Future Work
Certain limitations of the current study should be acknowledged: 1) Although there may be a relationship between the fractal characteristics of SSC and the living environment, rigorous proof and statistical analysis was not conducted in the current study due to lack of data. Therefore, future work can focus on quantitatively exploring the impacts of nutrients and temperature on the fractal dimension or Hurst parameters of SSC. 2) Even SSC products with high spatiotemporal coverage were used in the current study, there still exist missing values from other locations. Hence, spatiotemporal interpolation methods should be employed to obtain a more complete remote SSC dataset for mapping the global fractal dimension or Hurst parameter of SSC, such as the Bayesian maximum entropy approach [59][60][61], so that the spatial pattern of the fractal dimension and Hurst parameter can be further studied.
3) Taking into consideration the stochastic differential equations, the evolution pattern (or law) of SSC can be further explored, such as the fractional Brownian motion pattern [62,63].

CONCLUSION
The present study applied a novel generalized Cauchy model to depict the variations of SSC and good performance was obtained. The fractal characteristics of the SSC vary at different locations in terms of the fractal dimension and Hurst parameter; weak selfsimilarity was found in most locations with low values of the fractal dimension while strong LRD was detected across all locations with reactively high value of the Hurst parameter.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
JH contributed to conception and design of the study, organized the database, performed the statistical analysis, wrote the first draft of the manuscript. The author contributed to manuscript revision, read, and approved the submitted version.

ACKNOWLEDGMENTS
I would like to thank Prof. Ming Li for his guidance in modeling the generalized Cauchy process and Chen, Bairu for his assistant in improving the manuscript.