The Non-stationary Environmental Effects on Spawning Habitat of Fish in Estuaries: A Case Study of Coilia mystus in the Yangtze Estuary

The estuarine areas provide necessary spawning habitat, nursing habitat, and migratory for a variety of fishes and the environmental conditions are of significant heterogeneity. Coilia mystus is the key commercial fish that spawns in the Yangtze Estuary and the yield has declined sharply in the past 30 years. In order to understand the spawning habitat selection mechanism of C. mystus, the geographically weighted regression (GWR) model was applied to explore the non-stationary effect of environmental variables [e.g., the sea surface temperature (SST) and the sea surface salinity (SSS)] and geographical variables [e.g., the distance to the coast (DTC) and the depth (DEP) of water] on the egg density distribution of C. mystus. The data were derived from the spring and summer ichthyoplankton surveys carried out from 2019 to 2020 in the Yangtze Estuary, China. The GWR model results showed that the key variables in different seasons had spatial non-stationary effects on the distribution of spawning habitat of C. mystus, which were mainly caused by regional rise in temperature and saltwater intrusion of the South Branch. In the spring, the SSS, the DTC, and the DEP were the main impact factors and saltwater intrusion in the South Branch might lead to the trend that the spawning habitat moved further upstream to the Changxing Island. The SST was most important in the summer and the relatively lower SST was more crucial in the spawning habitat selection than the DTC and the DEP. The GWR model performed well in the study of C. mystus potential spawning habitat in the Yangtze Estuary. We recommend that more attention should be paid in regional variation of environmental factors to explore fish potential spawning habitat in the estuarine areas.


INTRODUCTION
The distribution of fish spawning habitat is usually determined by the adaptation of reproductive physiology and species response to the surrounding environmental conditions (Gallego et al., 2007;Zorica et al., 2013;Gonzalez-Irusta and Wright, 2016;Reglero et al., 2017). The spawning distribution could be influenced by various environmental factors, e.g., temperature, salinity, topography (Opdal, 2010;Dumont et al., 2011;Shi et al., 2014). There is evidence that compared with adults, the early life history stages of fish species seem to be more sensible and vulnerable to the variations in the environmental conditions (Portner and Farrell, 2008). To ensure more survival of eggs and larvae, spawning activity usually happens in the recurrent hydrographic features (Munk et al., 2009) such as estuaries, upwellings, or tidal mixing front (Tian et al., 2009;Zorica et al., 2013;Gibson et al., 2018). The estuaries, as important fish spawning and nursery habitats, are usually characterized by complex hydrological systems and spatially varying environmental conditions (Andrew et al., 2009). A study on fish spawning habitat is the fundamental step for the fish recruitment process (Lelievre et al., 2014), especially for the estuarine species. Examining fish spawning activity and its response to the spatial nonstationary environment is important for gaining a comprehensive understanding of spawning habitat selection mechanism, which helps to understand fish stock fluctuations and, hence, establish appropriate protection strategies for estuarine fish.
To evaluate statistical relationships between the spawning habitat and environmental variables, "global" regression models (Brunsdon et al., 1996;Windle et al., 2010) such as generalized additive model (GAM) and generalized linear model (GLM) are fully developed and used (Bacheler et al., 2010;Asch and Checkley, 2013;Gonzalez-Irusta and Wright, 2016). These models treat the entire data set as "whole map" and function under the assumption of spatial stationarity (Brunsdon et al., 2002;Windle et al., 2010). In the recent years, more and more studies have focused on spatial heterogeneity and the changes in species distribution caused by variations in the environmental conditions (Li et al., 2018). The geographically weighted regression (GWR) model provides a promising approach for investigating spatial non-stationarity effect and could make good predictions of fish species distribution (Lauri et al., 2014). The GWR model has been applied successfully to both the marine and freshwater habitats, e.g., North Atlantic Ocean (Windle et al., 2010), Gulf of Maine (Li et al., 2018), and Lake Malawi (Likongwe et al., 2015). However, rarely study has been reported that applies the GWR model method to demonstrate the non-stationary effect of the environment on the fish spawning habitat.
The Yangtze Estuary in China is the largest estuary in the western Pacific Ocean with three brunches and four outlets to the East China Sea (Figure 1). It is characterized by high biological productivity associated with relatively extreme and varying environmental conditions (Shan et al., 2010). It serves as an important spawning habitat, nursing ground, and migration channel for varieties of fish species because of the advantaged conditions (Zhuang et al., 2006;Yu and Zhang, 2017). More than 300 fish species have been found inhabiting the Yangtze Estuary (Zhuang et al., 2006). Among them, Coilia mystus is considered one of the most important commercial fish species in the Yangtze Estuary (Shi and Wang, 2002) with an annual yield of more than 1,500 tons in the 1990s (Ni, 1999). However, in 2019, China had to announce that it would stop issuing special fishing licenses for the wild populations of C. mystus because of the severe depletion of the resources Pauly, 2019, 2020) caused by overfishing and the habitat degradation Liu et al., 2013).
Coilia mystus belongs to Clupeiformes, with a short lifespan, seasonal migrations, and long spawning periods (Zhuang et al., 2006;Zhao et al., 2020). Also, it is a semi-anadromous fish (Elliott et al., 2007) that generally lives in the shallow marine area and spawns in the estuarine area (Zhuang et al., 2006;Zhao et al., 2020). The South Branch of the Yangtze Estuary is considered the main spawning location for C. mystus . The spawning period of C. mystus starts from late April to September and peaking in late May and June (Zhuang et al., 2006). In the spawning season, adults of C. mystus migrate to the brackish water in the Yangtze Estuary to spawn, but will never enter the freshwater areas above the Jiangyin section of the river (Yang et al., 2006b). There are few studies on the selection mechanism of spawning habitat of C. mystus and its response to environmental conditions, except for some previous work examining the migration pattern and biology trait (Yang et al., 2006a;Zhao et al., 2020). It is generally believed that the sea surface temperature (SST) and the sea surface salinity (SSS) would influence the spawning habitat of C. mystus . However, some geographical variables such as the distance to the coast (DTC) and the depth (DEP) of water, which were involved in other estuarine ichthyoplankton study (Locke and Courtenay, 1995;Halvor et al., 2007), were barely explored. An updated study on spawning activity and specific locations of spawning habitat in the area is necessary. Especially, understanding how the spawning habitat of C. mystus shifts over subtle spatial changes in the environment is the key problem.
In this study, the spawning habitat selection mechanism under the locally varying environmental conditions was explored based on a case study of C. mystus in the Yangtze Estuary. The spatial GWR model was applied to investigate the nonstationary effect of the environmental variables, i.e., the SST, the SSS, and geographical variables, i.e., the DTC and the DEP on the distribution of spawning habitat of C. mystus. The conclusions are helpful in gaining a comprehensive understanding of fish spawning habitat selection mechanism in the estuarine areas and conductive to the protection of the population.

Sample Collection and Analysis
The biological and environmental data were collected from four ichthyoplankton surveys carried out in 2019 and 2020 (Table 1) covering the peak and end of the spawning period of C. mystus. The sampling area covered both the South and North Branches of the Yangtze Estuary and its coastal zones (Figure 1). Considering the navigation risk and difficulties, the Deepwater Navigation Channel (DNC) was not involved in the survey. The study area was divided into four subareas (A, B, C, and D) based on the topographic features and salinity condition in the Yangtze Estuary (Luo and Shen, 1994;Yan et al., 2001). Besides, it was divided into 159 oblique grids of equal size 3 × 3 ( Figure 1B) based on the central axis of the Chongming Island. In each cruise, around 55 stations were chosen from the grids by using a stratified random sampling method. A higher weight (25 stations) was allocated in subarea B according to the historical study of C. mystus spawning habitat .
The surveys started around 4 a.m. and were conducted in the day. Eggs were collected by a plankton net with an opening diameter of 80 cm and a mesh size of 505 µm. At each survey station, the net was towed horizontally against the current at a speed of 2-3 knots for a duration of 10 min and a calibrated flow meter was mounted in the net opening to capture the filtered water volume. Samples were gathered from the net and fixed in a 5% buffered formalin seawater solution immediately for further identification in the laboratory. Eggs of C. mystus have distinguishing features such as spherical with a diameter of 0.83-1.03 mm, transparent, and contain multiple oil globules with different sizes (Zhang et al., 1985). The eggs were identified by morphology observation under a microscope according to the study by Thompson and Riley (1981) and Zhang et al. (1985). The identified eggs were standardized in terms of number of specimens per cubic meter (ind m −3 ). The stations where no specimens were found were recorded as zero. The environmental profile was recorded by the SBE 19plus V2 SeaCAT conductivitytemperature-depth system (CTD). SST ( • C) and SSS were defined by mean data of 3 m below the sea surface. Spatial data such as the Aug. 2020 Summer 55 29.14 ± 1.42 4.30 ± 6.52 0.027 ± 0.059 DEP (m) was derived by the electronic charts and the DTC (km), the shortest distance from the station to the coast or islands, was calculated in use of the R package "raster."

Data Analysis
Model Development The GWR model was applied to evaluate the relationship between the distribution of spawning habitat of C. mystus and the environmental constraints for each month. The square root of egg density with corresponding independent variables was formulated as below (Gollini et al., 2015): Where, y i is the dependent variable at location i, x ik is the value of kth independent variable at location i, m is the number of independent variables, β ik is the local regression coefficient for the kth independent variable at location i, and c i is random error at location i. All the model development is based on the R package "GWmodel". The spatial weighting matrix of the GWR model was built of three key elements: the type of distance, the kernel function, and the bandwidth. In this study, a great circle distance matrix, bisquare kernels function, and "adaptive" bandwidth (number of local data) were applied. Besides, bandwidth was obtained via cross-validation. For more detail, please see Gollini et al. (2015).
Before model development, collinearity between explanatory variables in four months was tested based on variance inflation factors (VIFs). By using a GLM model including all the independent variables (e.g., SST, SSS, DTC, DEP), a given predictor with VIFs greater than 10 should be removed from the model development (O'Brien, 2007). The result of the diagnostics of collinearity (Figure 2) showed that only SSS in August 2019 had a collinearity relationship with other indicators. However, the VIFs (10.16) were close to 10, so it was still involved in the following model development.
A global selection procedure (Brooks et al., 2014) was conducted by the GWR model to find the optimal model for each month and the best performing model comes to a variable combination that produces the minimum akaike information criterion (AIC) value among all the regressing models. Among four independent variables analyzed (SSS, SST, DTC, and DEP), the optimal independent variable subset for models showed both the consistency and difference. The DTC was subsequently involved in all the four models. The DEP was included in all the models, except for May 2020. The SSS was only removed from the model in August 2020 and the SST was included in August 2019 and 2020.

Model Fit and Prediction
The GWR model fit was tested by the Pearson correlation coefficient (r 2 ). Significant levels (p) of spatial autocorrelation in regression residuals were evaluated by the Moran's I statistics.
As the quantity of data in each month is small, the leave-oneout cross-validation (LOOCV) procedure was used to obtain test errors (Molinaro et al., 2005). In this procedure, each station was sequentially picked to be the validation data and the remaining data were used as the training data. Meanwhile, the root mean square error (RMSE) was used to validate the model (Christoph and Jose, 2012). The RMSE can be calculated as follows: Where, N is the number of samples andŷ i is the predicted value from the model for the ith observation. In the GWR model prediction, the entire survey area was divided into 6,000 oblique grids of equal size 0.52 × 0.52 and the selected environmental variables were evaluated by inverse distance weighting based on the survey data from each cruise.

Model Performance and Validation
Based on the minimum AIC scores, the GWR models that best approximated the observed distribution of C. mystus eggs were listed in Table 2. Compared to the total number of stations (Table 1), the year 2019 required a larger bandwidth size than the year 2020 to develop the GWR model, in which model of August 2019 yielded results increasingly close to the universal model. Smaller bandwidth for models in other months might lead to more rapid spatial variations in the results. Moran's I of residuals for the GWR models were randomly structured and the relevant p-value (0.13-0.78) indicated that spatial autocorrelation was not significant (Windle et al., 2010). In addition, the GWR models had good performance according to r 2 (0.53-0.69). The GWR models had reasonable prediction skills with RMSE ranging from 0.05 to 0.24 (Table 2). In general, the observed C. mystus eggs were confined in the South Branch, near the Chongming, Changxing, and Hengsha Islands (Figure 3). The predicted distribution of egg concentration was broadly consistent with the pattern where most C. mystus eggs appeared. The monthly and interannual variations in the predicted egg density basically corresponded to that of the survey data. There were far more eggs collected in May than in August. In May 2019, the highest egg density was up to 0.34 ind.m −3 (predicted 0.26 ind.m −3 ), while it was only 0.07 ind.m −3 (predicted 0.08 ind.m −3 ) in August. Besides, the egg density sampled in 2020 was higher compared with that in the same month in 2019. Of the four months compared, the highest egg density was 2.72 ind.m −3 (predicted 1.61 ind.m −3 ) in May 2020 and eggs were highly aggregated in the upper end of the Changxing Island. In August 2020, the highest egg density was as much as 0.32 ind.m −3 (predicted 0.14 ind.m −3 ). Although there were a few eggs sampled in the North Branch (3 eggs in May 2019 and 2 eggs in May 2020) and subarea C (16 eggs in May 2019 and 1 egg in August 2020), no obvious egg concentration was predicted in these areas.

Environmental Variables and Non-stationary Environmental Effects
In this study, the coefficients estimated of each independent variable determined in the GWR models were mapped to explore the non-stationary environmental effects. The value of coefficients is not limited between −1 and 1 (Li et al., 2018). All the independent variables showed non-stationary effects on the dependent variable.  The overall SST in the Yangtze Estuary showed a decreasing trend from the South to the North Branch and from the branch to the offshore area ( Figure 4A). The SST in May 2019 and 2020 ranged from 20 to 24 • C. SST in August was included in best performing models in both 2019 and 2020 models. In August 2019, the SST exceeded 25 • C in all the sites and ranged from 28 to 30 • C in the South Branch and the surrounding waters. The positive relationship between SST and egg density widely occupied the study area (coefficients up to 0.037), except for a slight negative relationship near the Hengsha Island (Figure 5). August 2020 had a general higher SST across the South Branch and the highest SST (34.17 • C) showed up in the North Channel. Meanwhile, the relationship between SST and egg density was negative in the North Channel and waters around the Changxing Island and positive in the rest of the study area. The whole estimated coefficients ranged from −0.029 to 0.04 in August 2020.
The SSS ranged from 0.01 to 28.5 in the study area showing an opposite distribution pattern to SST (Figure 4B). The salinity of water between the South Branch and the North Branch was of significant heterogeneity as the brackish water (salinity < 2) occupied most of the South Branch and the high salinity water (salinity > 10) distributed mainly in the North Branch and the offshore area. In May 2019 and 2020, the estimated coefficients of SSS were generally negative upstream of the Hengsha Island and upstream of the North Branch and close to zero in the rest of the study area (Figure 5). The strength of the negative effect of SSS was stable and the coefficients were approximately −0.022 in May 2019. However, it had an unstable negative effect on egg density in May 2020, as the strength of the negative effect in the upstream of the Changxing Island (coefficients up to −0.275) was stronger than that around the Hengsha Island (coefficients up to −0.1). The salinity in August 2019 had a weaker effect and the estimated coefficients were close to 0.
The relationship between DTC and egg density was obvious in subareas A and B (Figure 5). In May 2019 and 2020, the distribution pattern of a positive relationship between DTC and egg density was similar to that of a negative effect of SSS. The magnitude of positive coefficients in May 2020 (up to 0.249) was much higher than that (up to 0.04) in May 2019. The negative coefficients were in the rest of subareas A and B and the negative effect of SSS in May 2019 took up a wider area than that in May 2020 (Figure 5). In August 2020, the relationship between DTC and egg density showed high spatial variability in subarea B and the coefficients were from −0.019 to 0.039.
The DEP showed strong heterogeneity across the whole study area, especially in subarea B. The upstream of the Hengsha Island was occupied by both the water depths up to 25 m deep and shallow waters less than 5 m. Most of the waters in the rest area were less than 5 m deep, except for the right side of subarea C, which was appropriately 20 m deep. The mapping of estimated coefficients of DEP showed a wide range of positive relationships with egg density in the three models. In May 2019, it had positive coefficients in subareas A, B, and D, except for a weak negative effect in the bottom of subarea C. The positive strength in the upstream of the Hengsha Island (coefficients up to 0.015) was weaker than that in the rest of subareas A and B and the South Passage (coefficients up to 0.025). The DEP in August 2019 was completely positively related to egg density and displayed a weaker effect with coefficients around 0.005. The coefficients ranged from −0.20 to 0.040 in August 2020 and negative effects were located in the middle of the Changxing Island and the right end of the Hengsha Island.

DISCUSSION
The spatial regression GWR model was used to explore the spawning habitat of C. mystus and its response to the rapidly changing abiotic variables in the Yangtze Estuary. The estimated coefficients of each variable changed locally across the study  area, indicating non-stationary effects caused by environment heterogeneity in the Yangtze Estuary. Seasonal variations can be found in environmental factors that affect the distribution of the spawning habitat of C. mystus. In the spring, SSS, DTC, and DEP were the main impact factors and saltwater intrusion in the South Branch might lead to the trend that the spawning habitat moved further upstream to the Changxing Island. SST was most important in the summer and the relatively lower SST was more crucial in the spawning habitat selection than DTC and DEP. The predicted spawning habitat of C. mystus was mainly distributed in the middle of the South Branch, near the Chongming, Changxing, and Hengsha Islands, which basically corresponded with the known area of spawning habitat defined in previous studies Zhuang et al., 2006). However, there was possibility that the spawning habitat of C. mystus moved further upstream (Figure 3: May 2020) and adults of C. mystus might have the habit of spawning in the freshwater (Yang et al., 2006a,b).
The non-stationary effects of abiotic variables were reflected by the spatially varying relationship between the independent variables and dependent variables (Li et al., 2018). Both SST and SSS represented the primary habitat environment in defining C. mystus spawning habitat Shi et al., 2014). In the reproductive season, adults of C. mystus would spawn to the brackish water with salinity from 0.14 to 7.51 and temperature of 18-28 • C in the Yangtze Estuary Zhuang et al., 2006;Ding, 2011).
The temperature across the study area in May 2019 and May 2020 (19-24 • C) was within the range of suitable temperature for spawning, resulting in SSS the decisive environmental factor in defining the spawning habitat in the spring . Although SSS was also involved in August 2019, the nonstationary effect of SSS was only significant in spring. Several studies based on the stepwise model and GAM have shown a universal negative correlation between SSS and the number of C. mystus eggs Ding, 2011;Hu et al., 2021). However, results from the GWR models indicated that the strength of the negative relationship between SSS and egg density varied spatially upstream of the Hengsha Island in May 2020 (Figure 5). There was saltwater intrusion of the South Branch in this month, as high salinity water was found along the Jiuduan sandbank ( Figure 4B; Chen and Zhu, 2014). To avoid that eggs being transported to the unfavorable salinity waters (Fortier and Gagné, 1990), more adult fish might choose to spawn further upstream of the Changxing Island (Figure 3) and resulted in the strength difference. There was also a strong difference in the negative effect of SSS between May 2020 and May 2019. Overfishing and environmental degradation were the main reasons for the resource decline of C. mystus Liu et al., 2013). The policy that stopping issuing special fishing licenses of C. mystus in 2019 might contribute to more egg production in 2020 and lead to the strength difference.
Sea surface temperature was the dominant environmental factor in summer, as the strength of the effect of SST was much stronger than that of SSS in August 2019. The conclusion was contradicted with that of , who thought that SSS influenced greater than SST in summer. It might because the average SST in the summer of 2002 was lower than 28 • C . Previous studies have proved that the increase of temperature would accelerate fish gonadal maturation and, thus, promoting spawning activity Jiao et al., 2019). The widely distributed positive coefficients estimated of SST in August 2019 agreed with the conclusion. However, under higher SST in the North Channel, the area around the Changxing Island, and the South Passage in August 2020 (Figure 4A), the relationship between SST and egg density was negative (Figure 5), which indicated that the "extreme" temperature might restrain the spawning activity or harm the survival of eggs. Ni et al. (2020) reported that under the gradual warming of the Yangtze Estuary, the yield of adults of C. mystus have declined dramatically since a surge by 2 • C occurred in 1997 and did not recover in the recent years. The regional temperature rise in the South Branch might be one of the reasons that influence the recruitment success of C. mystus. More attention should be paid to such areas in the future surveys.
There has been no obvious alteration in the distribution of C. mystus spawning habitat over the years. Homing of mature adults to their natal area is observed in a huge variety of migratory fishes such as diadromous fish and Gadiformes (Stabell, 1984;Robichaud and Rose, 2001;Knutsen et al., 2007). However, cues for the homing behavior were still uncertain and complex to explain. In this study, DTC and DEP were applied to explore the geographic effect on the distribution of spawning. It was shown that the relationship between DTC and egg density is positive in the upstream of the Hengsha Island and negative in the rest of subareas A and B (Figure 5). The spatially varying relationship suggests that adult C. mystus might prefer to spawn in the middle of the river upstream of the Hengsha Island, while close to the coast near waters around the Hengsha Island (Yang et al., 1990). The less egg production around the Hengsha Island caused by weaker runoff in May 2020 might also account for the smaller area of the negative effect of DTC in the North Channel (Figure 5). In August 2020, the nonstationary effect of DTC around the Changxing and Hengsha Islands was different from that in the other months. The choice of relatively lower temperature areas in August 2020 rather than the suitable DTC might be the main reason for the "abnormal" phenomenon. Since it is possible to ensure the survival of eggs under "extreme" temperature, the environmental condition of SST would determine the spawning habitat of C. mystus. The reproductive migration process of C. mystus was from offshore to inshore waters and finally stopped upstream of the Hengsha Island (Zhuang et al., 2006); based on the relationship analyzed above, the gradually reducing DTC might be one cue for the homing process of adults of C. mystus.
It was shown that the estimated coefficients of DEP were positively correlated to egg density in most of the study areas ( Figure 5) in all the 3 months, which validated the result that C. mystus would spawn in deep waters (Zhou et al., 2011). The non-stationary effect caused by DEP was obvious in May 2019 and August 2020. As the deepest DEP yielded approximately the same level of egg density as other areas where water depth was relatively shallow (Figures 1A, 3), the strength of the positive effect of DEP was weaker in the upstream of the Hengsha Island in May 2019 (Figure 5). The negative relationship between DEP and egg density was due to the eggs found in the shallow waters at the bottom of subarea C in May 2019. However, these eggs might be a result of the transport of local current (Ye et al., 2004;Wan and Wang, 2018) rather than a choice of spawning, as the environmental conditions are unsuitable for spawning. In August 2020, there were small areas of negative relationship in waters near the Changxing and Hengsha Islands. Such a phenomenon might also be the result of lower temperature preference in summer, as eggs were found in the shallow water of relatively lower SST rather than the deep water with "extreme" waters.

CONCLUSION
In conclusion, the specific location of the spawning habitat of C. mystus was predicted and defined by the GWR model in the Yangtze Estuary. The GWR model showed capability in explaining the response of species to the environmental variation in an estuarine habitat and the non-stationary effect of local variables had been analyzed for better understanding the spawning habitat selection mechanism of C. mystus. However, it was difficult to predict the actual spawning habitat specifically based on our limited data. In the future, more surveys covering the upstream of the South and North Branch throughout the spawning season should be carried out. Meanwhile, there was inference about the potential spawning habitat of C. mystus in the North Branch (Zhao et al., 2020). The potential spawning habitat can be defined as "a habitat where the environmental conditions are suitable for spawning" (Planque et al., 2010). We have collected a few eggs in the North Branch and subarea C; however, such phenomenon did not appear repeatedly and the environmental condition in the North Branch seemed unsuitable for spawning. The member/vagrant hypothesis proposed that the successful stock recruitment depended heavily on the local current system for preventing the eggs from being dispersed to unfavorable conditions (Fortier and Gagné, 1990). The eggs found might be under the transport of local currents to the nursing ground (Hu et al., 2021), as residual currents in the North Channel to the offshore area have the direction first to East then to the North (Ye et al., 2004;Wan and Wang, 2018). Thus, a study on the transport mechanism of the early life stage of C. mystus is necessary to verify the inference and hypothesis.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
DW and ZL conceived and designed the study. RW, JZ, LZ, and SZ were involved in revising the manuscript for important intellectual content. XL and PS collected the data. DW compiled and analyzed the data and drafted the manuscript. All authors approved the final version of the manuscript and agreed to be accountable for all the aspects of the work.