Reduction of Genetic Variation When Far From the Niche Centroid: Prediction for Mangrove Species

The niche-centroid hypothesis states that populations that are distributed near the centroid of the species' ecological niche will have higher fitness-related attributes, such as population abundance and genetic diversity than populations near the edges of the niche. Empirical evidence based on abundance and, more recently, genetic diversity data support this hypothesis. However, there are few studies that test this hypothesis in coastal species, such as mangroves. Here, we focused on the black mangrove Avicennia germinans. We combined ecological, heterozygosity, and allelic richness information from 1,419 individuals distributed in 40 populations with three main goals: (1) test the relationship between distance to the niche centroid and genetic diversity, (2) determine the set of environmental variables that best explain heterozygosity and allelic richness, and (3) predict the spatial variation in genetic diversity throughout most of the species' natural geographic range. We found a strong correlation between the distance to the niche centroid and both observed heterozygosity (Ho; ρ2 = 0.67 P < 0.05) and expected heterozygosity (He; ρ2 = 0.65, P < 0.05). The niche variables that best explained geographic variation in genetic diversity were soil type and precipitation seasonality. This suggests that these environmental variables influence mangrove growth and establishment, indirectly impacting standing genetic variation. We also predicted the spatial heterozygosity of A. germinans across its natural geographic range in the Americas using regression model coefficients. They showed significant power in predicting the observed data (R2 = 0.65 for Ho; R2 = 0.60 for He), even when we considered independent data sets (R2= 0.28 for Ho; R2 = 0.25 for He). Using this approach, several genetic diversity estimates can be implemented and may take advantage of population genomics to improve genetic diversity predictions. We conclude that the level of genetic diversity in A. germinans is in agreement with expectations of the niche-centroid hypothesis, namely that the highest heterozygosity and allelic richness (the basic genetic units for adaptation) are higher at locations of high environmental suitability. This shows that this approach is a potentially powerful tool in the conservation and management of this species, including for modelling changes in the face of climate change.


INTRODUCTION
Genetic diversity plays a central role in the evolution of species; thus, understanding the factors that affect it and how to estimate it in natural populations are crucial for conservation efforts both now and in the future (Hendry et al., 2011). Genetic diversity is the raw material for natural selection to produce adaptive evolutionary change, and it is a strong predictor of fitness and extinction risk (Frankham, 1996(Frankham, , 2005(Frankham, , 2012Reed and Frankham, 2003;Spielman et al., 2004). Many factors influence genetic diversity among and within species and genomes (Romiguier et al., 2014;Ellegren and Galtier, 2016). These include, for example, effective population size, which undoubtedly has a key role in determining genetic diversity at the population level (Frankham, 2012;Ellegren and Galtier, 2016).
Genetic diversity and population density are not homogeneous among populations across the landscape of a species' geographic range. Two hypotheses seek to explain this variation empirically-the abundant-centre hypothesis and the niche-centroid hypothesis (Brown, 1984;Martínez-Meyer et al., 2013). The abundant-centre hypothesis predicts a peak of abundance in populations near the geographic centre of the range, while peripheral populations suffer a loss of genetic variation because of reduced effective population size, drift, and inbreeding (Diniz-Filho et al., 2009). Although the abundantcentre hypothesis has received repeated support, there are many exceptions (e.g., Pfeifer et al., 2009;Abeli et al., 2014;Pironon et al., 2015;Dallas et al., 2017;Santini et al., 2018;Kennedy et al., 2020). These exceptions point out that habitat suitability is not necessarily related to the geographic distance from the centre of its distribution range. Rather, fitness, and fitness-related attributes such as demographic parameters may be more strongly influenced by the suitability of local environmental conditions (Hutchinson, 1957;Martínez-Meyer et al., 2013;Lira-Noriega and Manthey, 2014), an idea expressed by the niche-centroid hypothesis. A species' ecological niche is a multidimensional hyper-volume constructed with n-axes, each representing a fundamental variable for population survival (Hutchinson, 1957). The niche hyper-volume has an internal structure such that its centroid represents the optimal conditions (e.g., the highest intrinsic population growth rate) (Maguire, 1973), while populations outside the environmentally defined niche show zero or negative growth (Lira-Noriega and Manthey, 2014). Therefore, distances to environmental niche centroids might better reflect fitness-related attributes, such as population abundance and genetic diversity (Martínez-Meyer et al., 2013).
Recent papers testing the niche-based hypothesis have found that the distance to the niche centroid is negatively related to population abundance and genetic variation, measured mainly as expected heterozygosity, nucleotide diversity, and allelic richness (Yañez-Arenas et al., 2012;Martínez-Meyer et al., 2013;Lira-Noriega and Manthey, 2014;Micheletti and Storfer, 2015;Osorio-Olvera et al., 2016, 2020aAltamiranda-Saavedra et al., 2020). Under this hypothesis, the highest amount of genetic variation is expected at locations of high environmental suitability, which occurs in regions closer to the niche centroid, while population genetic variability declines as the distance to the niche centroid increases (Lira-Noriega and Manthey, 2014). If population size varies in geographic space, a negative relationship between genetic variation and the distance to the niche centroid is also expected, since larger populations generally harbour more genetic diversity (Diniz-Filho et al., 2009, 2015Martínez-Meyer et al., 2013).
Integrating ecological niche modelling with population genetics studies can test the link between environmental suitability and population genetic diversity (Micheletti and Storfer, 2015). Meanwhile, predicting the genetic variability of species on the geographical space and defining the environmental variables that best correlate with genetic variability could have important implications for species ecology and conservation biology (Guo et al., 2020). Many studies applying niche modelling techniques have concentrated on temperate terrestrial species, while tropical and marine species have received less attention (Eckert et al., 2008).
Mangrove species play a crucial role in maintaining biodiversity and providing many ecosystem services such as carbon storage, climatic regulation, purifying water, and protection from coastal erosion caused by storms and tsunamis (Costanza et al., 1997;Donato et al., 2011;Tomlinson, 2016;Alongi, 2018). Located at the interface of land and sea, mangrove forests are subject to a variety of environmental stressors such as UV radiation, salinity, tidal inundation, and climatic variability. However, few studies have focused on understanding the environmental requirements associated with the distribution of mangroves (e.g., Crase et al., 2013;Record et al., 2013;Rodríguez-Medina et al., 2020). The information on environmental requirements is relevant to assess the spatial changes of mangrove communities, their composition and relationship to environmental predictors of geographic distribution, and in predicting the effects of future climate change (Rodríguez-Medina et al., 2020). In this study, we focus on the black mangrove Avicennia germinans. First, we evaluate whether environmental suitability, measured as the distance to the niche centroid is a good predictor of the genetic diversity of this species. Then, we determine the set of environmental variables that better explain its genetic variability, and finally, we predict the spatial variation in heterozygosity throughout most of its natural geographic distribution range in the Americas.

Study System
Mangroves are flowering trees that live on the coastlines of tropical and subtropical regions of the world (Tomlinson, 2016). Global distributions of mangroves are limited mainly by the physiological tolerance of each species to low temperatures (Duke et al., 1998). The range of conditions over which mangroves occur naturally encompasses other environmental extremes (Clough, 1992). Mangrove species evolved to withstand the tidal regimes of coastal and estuarine shorelines, such as seawater, periodical flooding, and exposure to waves, wind, and offshore currents (Duke, 2017). Mangroves can grow in a wide variety of soil types, including heavy consolidated clays, unconsolidated silts, calcareous and mineral sands, coral rubble, and organic peats, varying in salinity up to 90 ppt (Clough, 1992). Mangroves are also adapted to anaerobic, acidic, or alkaline soils with varying nutrients such as nitrogen, phosphorous, potassium, calcium, magnesium, sodium, and chlorine, common in tidal water (Hossain and Nuruddin, 2016).
Mangrove forests develop best on tropical shorelines with a low wave energy and abundant supply of fine-grained sediments. These ideal conditions occur in areas of high rainfall or ample supply of fresh water through river discharges. Also, these optimal areas are persistently cloudy with a low precipitation-to-evaporation ratio, mild solar radiation fluxes, and negligible climate seasonality. In contrast, high wave energy and sandy substrate are not favourable for mangrove establishment (Clough, 1992;Woodroffe, 1992).
The black mangrove, Avicennia germinans, is a conspicuous mangrove species that grows in the intertidal zone of the tropical and subtropical coastlines of America and West Africa. It is one of the most tolerant species to high salinity, aridity, and low temperatures (Clough, 1992). This species occurs in sympatry with the red mangrove Rhizophora spp. throughout most of its distribution range. However, at local scales, they occupy different intertidal zones, characterised by different types of soils. Avicennia germinans seem to prefer firmer and more sandy substrates associated with fibrous soils (i.e., with a high percentage of the organic matter in the form of undecomposed remains of Rhizophora rootles) (Jordan, 1964;Diop et al., 1997).

Genetic Data
The molecular genetic data of black mangrove derives from a total of 1,419 trees from 40 populations covering most of the species' natural geographic range in the Americas, both on the Caribbean and Pacific coasts (Supplementary Table 1). We used two previously published data sets: one from Mexico exclusively (Ochoa-Zavala et al., 2019) and another with Central America and Caribbean localities (Cerón-Souza et al., 2015). The Mexican study had 29 localities with 40 individuals per locality, for a total of 1,144 individuals. After excluding two loci with high null allele frequency, we used eight microsatellite loci ranging between 2.0 and 4.5 alleles per locus (Ochoa-Zavala et al., 2019). The Central American and Caribbean study included 11 localities. The number of individuals ranged from 13 to 40 individuals per population, for a total of 275 and used 11 microsatellites ranging from 3.3 to 9.8 alleles per locus (Cerón-Souza et al., 2015, Supplementary Table 2). Three of the microsatellites were shared between the two studies: AgT4 (tetranucleotide), GT_006 (dinucleotide GT), and CT_004 (dinucleotide CT) (Cerón-Souza et al., 2015;Ochoa-Zavala et al., 2019).
The observed and expected heterozygosity and allelic richness for each locality depends directly on several factors, including the motif, the number of repetitions of each motif, the number of microsatellite molecular markers used, and the number of individuals included in each population (Ellegren, 2000;Väli et al., 2008;Pemberton et al., 2009). To eliminate variation due to a different number of microsatellites between the two studies, we randomly eliminated three of the nonshared markers from the 11-marker set from the study of non-Mexican populations to use a total of eight markers from each dataset. Thus, the microsatellites used from the non-Mexican populations were the three shared loci plus CA_002 (dinucleotide CA), GA_003 (dinucleotide GA), CT_003 (dinucleotide CT), GT_003 (dinucleotide GT), and CTT_001 (trinucleotide CTT). Therefore, the non-Mexican database contained six microsatellites with a dinucleotide motif, one with a trinucleotide motif, and one with a tetranucleotide motif. In comparison, the microsatellites used from the Mexican populations were: Agerm1-25 (TG), Agerm1-18 (AG), AgT4 ). The genetic diversity was calculated separately for each of the 40 populations using observed and expected heterozygosity (H o , H e , respectively) in Arlequin version 3.5 (Excoffier and Lischer, 2010). In addition, we obtained allelic richness (Ar) based on a sample size of 11 alleles using the rarefaction method in hierfstat (Goudet and Jombart, 2015) in R (R Core Team, 2019).
To verify that estimates of heterozygosity (H o and H e ) and Ar are not affected by microsatellites' features, we compared the mean values obtained using the set of eight markers with estimates derived from three additional databases: one with three shared loci, a second of one tetranucleotide locus, and a third with four dinucleotide loci. Mean values of each parameter showed the same trend at each locality (Supplementary Figures 1A-E). Then, to corroborate the consistency in our results, we repeated the correlation analyses (see section Assessing the relationship between Niche-centroid distance and genetic diversity) using H o , H e , and Ar computed from these three datasets. For comparison, we also included the correlation coefficient values obtained using all eight microsatellite loci. Values of correlation coefficients showed very similar distribution and median values (Supplementary Figures 1B-F). It must be noticed that the dinucleotide dataset produced a wider distribution of H o and H e , but a similar median value (Supplementary Figures 1D,F). Thus, it is likely that levels of genetic diversity of A. germinans populations correlate with their distance to the niche centroid independently of the microsatellite markers used.

Occurrence Data
We obtained 3,710 occurrence data for A. germinans from the Global Biodiversity Information Facility (GBIF). We followed standardised steps to curate the data. First, we eliminated wrongly georeferenced records and dubious and duplicated localities. Then, to avoid problems related to spatial autocorrelation, we thinned the data using a spatial filter of 0.08333333 degrees (∼9.3 km), which was the spatial resolution of the environmental layers used to model the niche. The process of data curation left us with 227 geo-referenced records of the presence of the species over the extent of its geographic distributional range. We split environmental information at occurrence localities into training and testing data using a random partition of 70% of the records for training and 30% for testing (159 and 68, respectively) (step 1 in Figure 1). FIGURE 1 | General workflow of ecological niche modelling. Each step of the workflow is detailed in section ecological niche modelling (see Methods). White arrows indicate when information from a previous step was used.
Frontiers in Conservation Science | www.frontiersin.org

Environmental Information
We obtained 341 variables from three different sets of predictors at five arc-minutes of spatial resolution (∼ 9.3 km): 19 bioclimatic variables from WorldClim (Hijmans et al., 2005), 316 soil variables retrieved from the SoilGrids database (Hengl et al., 2014), and six variables related to salinity at the sea surface obtained from the Bio-Oracle 2.1 database (Assis et al., 2017). The WorldClim bioclimatic variables came from monthly temperature and precipitation reports from weather stations from 1970 to 2000. The soil variables refer to the following soil properties: depth to the bedrock, organic carbon stock, bulk density, clay, silt, and sand content, soil pH, and soil cationexchange capacity; soil grids also integrate the most probable class and predicted probabilities for each World Reference Base soil unit and Soil Taxonomy suborder (Hengl et al., 2014). Salinity variables were from satellite-based and in situ oceanographic data (Tyberghein et al., 2012). Salinity rasters for analysis included: maximum, minimum, range, and mean salinity, as well as salinity of the most saline month and salinity of the least saline month (step 2 in Figure 1).
We obtained the Pearson pairwise correlation matrix between all predictors at the occurrence localities to select which environmental data to use to fit ecological niche models (ENM) by retaining variables with a correlation value ≤ 0.8. This resulted in a total of 92 predictors (11, 79, and 2 from WorldClim, Soil Grids, and Bio-Oracle, respectively; see the list of variables in Supplementary Table 3). We masked all of the raster layers using the polygon of M, which is the region hypothesised to be accessible for a species (Barve et al., 2011). Because the black mangrove has high dispersal capacity (Nettel and Dodd, 2007), we defined M as the area occupied by mangroves forests (Giri et al., 2011) along the coastlines of America and West Africa and applied a buffer of 0.5 degrees (step 2 in Figure 1).

ENM Calibration and Selection
To model the black mangrove's niche, we used Minimum Volume Ellipsoid (MVE) models, given that theoretical and empirical work supports the idea that fundamental niches have convex shapes like ellipsoids (Maguire, 1973;Jiménez et al., 2019;Osorio-Olvera et al., 2020a,b). Three parameters define ellipsoids: the niche centroid, the shape matrix, and the cutoff value. The niche centroid is the point in the niche space where fitness reaches its maximum value (Hutchinson, 1957;Maguire, 1973;Martínez-Meyer et al., 2013;Osorio-Olvera et al., 2019). The shape matrix measures how dependent two niche axes are and how they covary. The cutoff value defines the ellipsoid's surface. The function ellipsoid_selection of the ntbox R package (Osorio-Olvera et al., 2020b) allows fitting MVE. This algorithm finds the ellipsoid of the smallest volume that contains a k proportion of training points. Each model is evaluated based on its capability to predict occurrence points via the statistical and performance criteria defined by the user. The ntbox R package uses the binomial test to compute statistical significance and estimates performance via the omission rate, which indicates how wellmodels created with training data predict occurrence points. In this study, we used as modelling variables the least correlated variables estimated in the environmental data section (steps 2 and 3 in Figure 1; Supplementary Table 3). Thus, we fit MVEs in two, three, and four dimensions within the M polygon using a proportion of k = 0.95 of the training data. To select the best models, we used a p ≤ 0.05 for the binomial test and an omission rate of ≤ 0.10 (step 3 in Figure 1).

Assessing the Relationship Between Niche-Centroid Distance and Genetic Diversity
We evaluated the relationship between distance to the niche centroid (DNC) and population genetic diversity in two consecutive steps (step 4 in Figure 1). First, based on the models selected in the ENM calibration and selection process, we extracted the environmental information of the geographic positions of the black mangrove populations with genetic diversity data (n = 40). Then, we estimated the Mahalanobis distance to the niche's centroid in R. To do this, for the mahalanobis function, we used as input the parameters that define ellipsoids models: niche-centroid, shape matrix, and the environmental information of each population with genetic diversity data. This produced a vector of distances to the centroid of each ellipsoid model for each genetic diversity record (step 4 in Figure 1). Using this vector of distances, we performed non-linear regression between centroid-distance and the estimated values of H o , H e , and Ar separately. Parameters of the regression models were estimated using the nlsfit function of the R package easynls. Thus, the set of environmental variables that best explained the genetic variability of A. germinans was based on models (for H o , H e , and Ar) that included the combination of variables that (1) met the evaluation criteria (statistical test and good performance, step 3 in Figure 1), and (2) showed the strongest correlation (ρ) between DNC and genetic diversity (hereafter best-fit models). The best-fit model was used to predict the geographic variation in heterozygosity. To do this, we computed a vector (which we called MD) containing the Mahalanobis distance between the niche-centroid and the environmental data of variables from the best-fit-model. We evaluated the regression model at every element of the MD vector and converted this information into a raster (step 4 in Figure 1).
To evaluate the models' accuracy when predicting H o and H e , we first performed linear regressions between the observed (computed from microsatellite data) and predicted (from best-fit-model) genetic estimates, using the observed genetic estimates as the independent variable (x). We tested the assumption of normal distribution of residuals using the Shapiro-Wilk normality test (Royston, 1995) and homoscedasticity by examining the correlation (Pearson method) between the residuals vs. fitted values. Finally, we used independent microsatellite estimates of H o and H e from Mori et al. (2015) and Kennedy et al. (2020) to perform linear regressions using the same strategy as above. We used tangent transformations for observed and predicted H o and H e to meet the assumption of normality of residuals. All statistics were performed using R.

RESULTS
We fit a total of 2,798,250 candidate niche models (4,095; 121,485; and 2,672,670; for ellipsoids built in two, three, and four dimensions, respectively). All of the candidate models generated, only 257,196 models met the statistical test and performance criteria (p ≤ 0.05 and an omission rate ≤ 0.10). Of these models, 50,120 indicated significant relationships between nichecentroid distances and H o (44,574 and 5,546 were negative and positive, respectively). For H e , 41,795 models showed significant associations with DNC (7,319 negative and 34,476 positive) and 58,855 (8,056 positive and 50,799 negative) significant associations were obtained for Ar.
The model fit with H o , H e , and Ar indicated an ecological niche of A. germinans characterized by four dimensions, related to precipitation seasonality (bio15 from WorldClim) and soil classification variables (World Reference Base for Soil Resources -WRB, and the United States Department of Agriculture Soil Taxonomy -USDA). Soil types referred to the estimated probability of occurrence of haplic arenosols, leptic cambisols, luvic phaeozems (IUSS Working Group WRB, 2007), and argids (Soil Survey staff, 2010). Note that Ar retrieve almost the same environmental variables as heterozygosity but included leptic cambisols instead of haplic arenosols. Haplic arenosols comprise sandy soils. Their coarse structure accounts for their general high permeability and low water and nutrient storage capacity, without mentioning other particular soil features (IUSS Working Group WRB, 2007). Leptic cambisols are moderately developed soils with clay or carbonate content; characterized by the absence of organic matter, aluminium and/or iron compounds. Cambisols occur in dry regions and are less common in the humid tropics and subtropics (IUSS Working Group WRB, 2007). Luvic phaeozems are dark soils rich in organic matter with a certain clay content through or up to 50 cm below its upper limit (IUSS Working Group WRB, 2007). Argids are aridisols with an argillic or natric subsuperficial horizon. Therefore, they refer to an accumulation of phyllosilicate clays or silicate clays, respectively (Soil Survey staff, 2010). In summary, the ecological niche of A. germinans was characterized by precipitation seasonality and by soils associated with organic matter, sand, and clay. These soils have some sodium accumulation in subsuperficial horizons.
Regression coefficients indicated strong negative relationships between genetic diversity estimates (ρ 2 = 0.67 P < 0.05 for H o ; ρ 2 = 0.65, P < 0.05 for H e ; ρ 2 = 0.65, P < 0.05 for Ar) and the distance to the centroid of the ecological niche (Figure 2). Spatial predictions of heterozygosity were estimated with the regression coefficients from the DNC models, based on the equations: y = 0.73 x/ (0.33 + x 1.80 ) and y = 0.76 x/ (0.3 + x 1.80 ) for H o and H e , respectively (Figures 3, 4). Our tests to evaluate the accuracy of genetic diversity predictions indicated that both models showed considerable explanatory power for the observed data (R 2 = 0.65 for H o ; R 2 = 0.60 for H e ; P < 0.05), even when we consider other independent data sets (R 2 = 0.28 for H o ; R 2 = 0.25 for H e ; P < 0.05) that surveyed a larger number of microsatellite markers to estimate H o and H e : 22 loci in Mori et al. (2015) and 12 in Kennedy et al. (2020) (Supplementary Figure 2).
Estimates of DNC suggest that populations closer to the black mangrove's niche centroid are located mainly in the Caribbean coast, specifically on the Yucatan Peninsula and in Honduras, Panama, and the Lesser Antilles (Supplementary Table 4). Black mangrove seemed to be more likely to occur in arenosols, cambisols, and argids when in peripheral populations than near to the centroid-niche ( Figure 5) and in phaeozems when nearer to the niche centroid. Unfortunately, there is no additional information from shapefiles of haplic arenosols, leptic cambisols, luvic phaeozems, or argids. Therefore, we cannot make inferences beyond their associations with organic matter, sand, or clay composition.
Among the models that showed a significant negative relationship with genetic diversity estimates, we identified models with significantly higher correlation (ρ ranging from −0.8 to −0.7) between niche-centroid distances and H o , H e , and Ar. We observed seven specific environmental variables from these models that appear with high frequency (black stars in Supplementary Figure 3). These variables were related to Luvic Phaeozem soils, Argid soils, precipitation variability (bio15), Aric Regosol soils, Ochrept soils, mean diurnal range (bio02), and mean temperature of the wettest quarter (bio08). Note that Luvic Phaeozems, Argids, and precipitation variability were variables included within the best-fit models for H o , H e , and Ar. Aric regosol refers to weakly developed soils that are particularly common in arid areas, with remnants of diagnostic horizons (IUSS Working Group WRB, 2007). Ochrepts originally formed a subgroup within the inceptisols (Smith, 1986), which are young soils with a weak but notable degree of profile development (Soil Survey staff, 2010). Similar to the other soil types, it seems more likely to find Aric Regosols and Ochrepts in populations farther from the niche centroid (Figure 6). We could not obtain additional information from shapefiles about specific features of these soil types.
Precipitation-related variables showed a clear trend between populations near the niche-centroid vs. those at the periphery. These variables were precipitation variability (bio15), mean diurnal range (bio02), and mean temperature of the wettest quarter (bio08). Populations close to the niche centroid exhibited lower precipitation variability than populations farther from the centroid (Figure 6), exhibiting less pronounced differences between dry and rainy seasons. The mean diurnal temperature range showed a trend in which central populations appear to experience, on average, more stable temperatures. In contrast, the mean temperature of the wettest quarter was warmer in the periphery than in central populations (Figure 6).

DISCUSSION
In this study, we evaluated whether there is a correlation between populations' distance from the niche centroid and their genetic variation. We defined the environmental variables that best predicted the spatial genetic variation of A. germinans along the coastlines of the Americas. Our results underscore the critical role that soil features and precipitation-related variables have on black mangrove population dynamics and distribution and how these variables relate to current genetic variation among populations of A. germinans. These results are relevant in the face of global changes affecting mangroves' distribution, defining conservation units, and understanding adaptation to future environments (Hoffmann and Sgro, 2011;Cavanaugh et al., 2014;Wee et al., 2018).

How Might Habitat Quality Influence the Demography of Populations?
How far a population is located from the centroid of its niche can determine population abundance, genetic diversity, and genetic structure (Yañez-Arenas et al., 2012;Martínez-Meyer et al., 2013;Lira-Noriega and Manthey, 2014;Micheletti and Storfer, 2015;Osorio-Olvera et al., 2016;Altamiranda-Saavedra et al., 2020). Under this niche-based point of view, distance to the ecological niche centroid can explain (among other factors) spatial patterns of genetic variation because environmental variables impact population dynamics. Thus, there is a close link between the set of habitable conditions and geographic genetic variation (Lira-Noriega and Manthey, 2014). We found a strong correlation between genetic variation and soil types. Therefore, this result suggests that the presence of organic matter, sand, and clay particles influence black mangrove development.
Mangrove forests represent significant global carbon sinks (Donato et al., 2011;Alongi, 2014). Stocks of soil organic carbon (a measurable component of organic matter) vary greatly with FIGURE 5 | Range values recorded from populations close to the niche centroid (centre) and populations farther from the central-niche (peripheral) in relation to the environmental variables from the best-fit model for genetic diversity estimates: observed and expected heterozygosity, and allelic richness.
FIGURE 6 | Range values recorded from populations close to the niche centroid (centre) and populations farther from the central-niche (peripheral) considering variables of higher frequency of the models with significantly higher correlation (ρ ranging from −0.8 to −0.7) between the distance to the niche's centroid and observed (H o ) and expected heterozygosity (H e ) and allelic richness (Ar). Light blue boxes refer to variables included within the best-fit models for H o , H e , and Ar.
climate, vegetation type, parent material, soil texture (silt, clay, and sand composition), and land use, among others (Jobbágy and Jackson, 2000;Gruba et al., 2015;Matsui et al., 2015;Gonçalves et al., 2017). Nonetheless, the annual precipitation is an essential predictor of spatial variation in the carbon concentrations of global mangrove soils (Jardine and Siikamäki, 2014). Studies that have assessed the correlation between the soil organic carbon and soil texture have suggested a close relationship among precipitation regimes, soil organic carbon, and clay content (Wiesmeier et al., 2011;Saiz et al., 2012;Feng et al., 2013). Recently, Zhong et al. (2018) found increased soil organic carbon and clay content levels in sites characterized by higher precipitation levels. The authors argued that higher rainfall and temperature favour the mineralization of soils and the production of clay particles. Thus, this promotes net primary productivity, which incorporates carbon in litter and root exudates. Other related studies highlight that the variation in clay mineral content partly produce soil organic carbon stocks (Blanco-Canqui and Lal, 2004).
At the local level (site-specific), the metabolic activity of microbes governs soil organic carbon storage, which in turn is mediated by the vegetation density of a diverse plant community via higher root inputs, including root exudation (Lange et al., 2015). This process could be occurring near the niche centroid (i.e., Panama). The abundant litter input by more productive trees (e.g., higher diameter at breast height and height) of the dense mangrove forest could promote organic matter accumulation by producing higher litter mass and increasing soil organic carbon (Rahman et al., 2021). Hence, it makes sense that in peripheral populations, Luvic Phaeozem soils are practically absent.
In addition, we found studies that suggested two mechanisms by which the proportion of clay and sand particles may act on mangrove growth and establishment. First, clay could be a favourable substrate for rooting mangrove propagules (seedling density and soil texture r = 0.62; p < 0.01) (Ukpong, 1997). Moreover, soil properties are essential components of the mangrove's environment because soil texture is a crucial determinant of trees' height and density (stems per hectare) in Avicennia and other mangrove species (Ukpong, 1997). On the other hand, seedlings of A. germinans and Laguncularia racemosa growing in hypersaline conditions in 100% sand failed to survive. However, seedlings grown in soil composed of 90% sand and 10% clay had 100% survival in hypersaline conditions, although they showed some leaf discolouration (McMillan, 1975). Seedlings grown in 75% sand and 25% clay soil had a 100 % survival and no observable effect on leaves (McMillan, 1975).
Temperature and precipitation are known to influence global distribution patterns in numerous taxa, including mangrove trees (Woodward and Williams, 1987;Duke et al., 1998;Tomlinson, 2016). Indeed, low air and soil temperatures explain a reduction in growth rate (Clough, 1992). In mangroves, extreme cold events such as frost have harmful effects (e.g., dieback) (Woodroffe and Grindrod, 1991;Krauss et al., 2008). Despite this, recent studies have concluded that temperature alone does not define the latitudinal range limits of the mangrove genera Avicennia, Rhizophora, and Laguncularia (Quisthoudt et al., 2012;Ximenes et al., 2018). Other factors may contribute to determining range limits and habitat suitability for A. germinans' successful establishment. Tree height is a reliable indicator of productivity and the relative suitability of a site (Li et al., 2010). Tree height of A. germinans is correlated negatively with latitude and positively with rainfall (Arreola-Lizárraga et al., 2004;Méndez-Alonzo and López-Portillo, 2008). Indeed, tree height is the most critical factor determining tree diameter (DBH), growth rate, and wood production of mangroves trees at the global scale (Xiong et al., 2019). For instance, there are records of low biomass in the limits of mangrove ecosystems distribution in western Mexico, where aridity and cold winter temperatures are common (Hutchinson et al., 2014). Rainfall affects mangrove tree growth by enhancing and supporting photosynthetic carbon gain (Xiong et al., 2019). The identification of precipitation seasonality (and other three additional precipitation-related variables) in the model that best explained the geographic variation in genetic diversity of A. germinans highlights its importance as a component of niche suitability.

The Relationship Between Genetic Diversity and Distance to the Niche-Centroid
The population genetics theory predicts that genetic diversity in a population results from a complex interplay between different evolutionary forces that depends on its effective population size (Kimura, 1983). We found solid non-linear relationships between genetic diversity and niche-centroid distance. Nonlinear relationships between abundance and distance to the niche centroid imply that optimal niche conditions are relatively narrow (cf., Rabinowitz, 1981). Thus, few sites hold suitable conditions for maintaining large populations (Martínez-Meyer et al., 2013). Our data in A. germinans suggest this pattern. Populations mainly located along the coasts of the Caribbean Sea (v.gr., Central America, Puerto Rico, Trinidad, and French Guiana) bore high genetic diversity and larger effective population size. However, island populations may maintain high genetic diversity owing to high gene flow and show genetic differentiation from continental coasts due to ocean currents (see Cerón-Souza et al., 2015). Populations of A. germinans located near the niche centroid are expected to deal with changing environments, owing to higher genetic variability, and thus might be more likely to survive and produce offspring. Nevertheless, this could depend on whether species differ in distributional range, life-history traits, or evolutionary history. Thus, eco-geographic and historical circumstances can have different demographic consequences across the species range (Griffin and Willi, 2014).
Many woody plants have high heterozygosity throughout their populations (Hamrick and Godt, 1996). They have long generation times and, in many cases, have extensive gene flow over large geographic distances through both pollen and seeds (Petit and Hampe, 2006). Studies have shown that life-history traits such as life form and dispersal capacity influence patterns of genetic variation (Hamrick and Godt, 1996). In many cases, high gene flow rates can erase patterns of genetic diversity. Assessment of the correlation between habitat suitability and abundance has shown that dispersal capability also influences the strength of this correlation (Osorio-Olvera et al., 2016). Eventually, increasing gene flow rates can blur the relationship between DNC and abundance. Therefore, we would expect a similar result for genetic diversity patterns. Hence, detecting the relationship between DNC and genetic diversity in woody plants may be difficult; however, our results support the notion that A.
germinans (and, in general, mangroves) are an ideal model to test the niche-centroid hypothesis (Kennedy et al., 2020).
The distance to the ecological niche centroid is a powerful tool to infer the role of some critical variables in population dynamics, especially when limited information is available. Since environmental conditions are not stable over time, the geographic ranges of many species have changed in concert with habitat suitability, resulting in geographic variation in effective population size; and the long-term estimates of heterozygosity are useful to track those changes. Furthermore, because losses of rare alleles occur during founder events, population bottlenecks, and sporadic changes in effective population size, we recommend targeting the analysis of allelic diversity in future studies to increase the likelihood to identify this geographical pattern (Eckert et al., 2008), especially for species with long generation times and high gene flow rates.
Levels of genetic diversity vary significantly in natural populations and species. Determinants of this variation are still controversial because many confounding factors affect genetic polymorphism, such as mutation rate, mating system, gene flow rates, population bottlenecks, environmental perturbations, and population size (Petit and Hampe, 2006;Romiguier et al., 2014). Linear regression analyses showed that our predicted models of observed and expected heterozygosity indicated considerably explanatory power of observed values (R 2 = 0.28 for H o ; R 2 = 0.25 for H e ; P < 0.05), similar to other related studies (Martínez-Meyer et al., 2013;Lira-Noriega and Manthey, 2014). Thus, correlation analysis between DNC and genetic variation is helpful to predict the spatial variation of heterozygosity throughout most of its natural geographic distribution. In the literature, some more complex models allow prediction of current (and future) genetic diversity, population size, and connectivity among populations (e.g., Benoliel-Carvalho et al., 2019;Hearn et al., 2019). Here, we may be able to predict spatial heterozygosity using a simple approach with a reasonable certainty rate.
Nowadays, various genetic diversity estimates can be applied, and the implementation of population genomics may further improve the prediction of population genetic diversity. These results are relevant to plant conservation and evolutionary biology because evolutionary potential is primarily a function of genetic variation (Reed and Frankham, 2003;Wee et al., 2018). Modelling such genetic parameters provides a helpful way to support and envisage management plans. Thus, this is a valuable tool for conservation and management efforts. The current niche models, coupled with general circulation models, could project likely future changes in genetic diversity (while recognizing the effects of uncertainties and assumptions on niche modelling; Wiens et al., 2009).
There are many predictions for mangrove forests in the face of climate change (Record et al., 2013;Alongi, 2015;Cerón-Souza et al., 2015). Given the multiple ecosystem services they provide, it is imperative to develop models that can anticipate the genetic diversity changes for more accurate information about which places are likely to be most at risk, in order to inform decision-making by conservationists and resource managers.
In conclusion, this study supports that black mangrove genetic diversity meets the niche-centroid hypothesis, in which the highest heterozygosity and allelic richness (the basic genetic units for adaptation) was observed at locations of high environmental suitability. Our findings contribute to understanding how habitat quality may affect the demography of populations of one of the most conspicuous mangrove species in the Americas. The strong correlation between heterozygosity and allelic richness estimates and DNC illustrates how the environmental variables that define the fundamental niche impact individuals' establishment, growth, development, and survival. Soil features and precipitation are related to the population dynamics of A. germinans, explaining, in part, extant genetic variation. Habitat suitability of A. germinans is spatially structured with few populations containing high genetic diversity, mainly located along the Caribbean coast. The ability to predict the amount of genetic variation across the natural range of a species and model future changes in the face of climate change are potent tools in species conservation and management efforts, especially in defining conservation units (Wee et al., 2018).

DATA AVAILABILITY STATEMENT
The datasets analysed for this study can be found in the DRYAD REPOSITORY