Unlocking environmental archives in the Arctic—insights from modern diatom-environment relationships in lakes and ponds across Greenland

Given the current rate of Arctic warming, the associated ecological changes need to be put into a longer-term context of natural variability. Palaeolimnology offers tools to explore archives stored in the sediments of Arctic lakes and ponds. The interpretation of these archives requires a sound knowledge of the ecology and distribution of the sedimentary proxy organisms used. Here we explored the relationship between diatoms, a widely used proxy group of siliceous algae, and the environmental drivers defining their assemblages and diversity in 115 lakes and ponds in Greenland, a markedly understudied arctic region covering extensive climate and environmental gradients. The main environmental drivers of diatom communities were related to climate and lake ontogeny, including both measured and unmeasured (spatially structured) environmental variables. The lakes and ponds in the northern study regions showed a distinctive dominance of small benthic fragilarioid species, while diatom communities in the South(west) of Greenland were more varied, including many epiphytes, owing to the longer growing season and higher habitat diversity of these lakes and ponds. The newly established lakes in the Ilulissat region host markedly different communities compared to all other sites. Species diversity followed an overall clear latitudinal decline towards the North. Despite the large distances between our study regions, diatom dispersal appeared not to be limited. Based on our results, diatoms are an excellent proxy for climate-mediated lake ecosystem change in the Arctic and thus a valuable tool for climate reconstructions in the region. Particular consideration should be given to often unmeasured climate-related drivers, such as in-lake habitat availability, due to their apparent importance in defining Arctic diatom communities.


Introduction
To fully understand current climate and environmental change, driven by the Great Acceleration since the mid-20th century (Steffen et al., 2015), we need to place observed change into the context of long-term natural environmental variability. However, data from environmental monitoring programmes commonly cover only the past decades, and in remote regions like the Arctic, monitoring and field studies are spatially and temporally patchy (Metcalfe et al., 2018; Kahlert et al., 2021). In the absence of long-term instrumental and biomonitoring data, lake sediment archives containing a wealth of microfossils and biogeochemical tracers can offer a valuable tool for assessing past environmental change over relevant time scales.
The Arctic has warmed nearly four times faster than the globe over the recent decades (Rantanen et al., 2022), a rate that is unprecedented over decades to millennia (Miller et al., 2013). Simultaneously, dramatic ecosystem change has been reported throughout the Arctic (Overpeck et al., 1997; Smith et al., 2005; Kaufman et al., 2009; Post et al., 2009; Wrona et al., 2016), as studies show that changes in temperature, and snow and ice cover alter limnological boundary conditions (e.g., light and mixing regime, nutrient cycling, length of growing season) and influence biological dynamics (Douglas et al., 2004; Schindler and Smol, 2006; Callaghan et al., 2010; Rautio et al., 2011).
To give these ongoing changes perspective, palaeolimnological studies (analysing sedimentary archives) are now commonly used as a substitute for missing long-term monitoring data in remote arctic regions to assess the rate, magnitude and direction of environmental change (e.g., Douglas et al., 2004; Smol et al., 2005; Rühland et al., 2008; Hobbs et al., 2010; Medeiros et al., 2012; Saros and Anderson, 2015). Interpreting palaeolimnological archives requires a sound understanding of the microfossils and biogeochemical tracers (or proxies) found in the sediments. Lake surface-sediment surveys or "training sets," i.e., a large number of lakes from which both surface-sediment proxies and measured environmental variables are analysed for the purpose of quantitative environmental reconstructions, offer a very good tool to investigate the links between proxy distribution, abundance, diversity and environmental setup.
Many palaeolimnological studies use diatoms as a proxy indicator. A premise for using diatoms, a key component of Arctic freshwater ecosystems, as a reliable proxy of environmental change over time is a sound understanding of their ecological and biogeographical characteristics (e.g., Smol et al., 2002; Bouchard et al., 2004). While this is the case for several common and abundant taxa, the ecology and distribution of a much larger number of species are still rather poorly known.
Here we explored surface-sediment diatom taxa-environment relationships from 115 lakes and ponds, located in five different regions along the ice-free margin of Greenland, covering an exceptionally large latitudinal (ca. 60° to 83° N) and longitudinal gradient (ca. −72° to −19° E). The large geographical scale of our sampling regime provides us with the opportunity to assess the overriding drivers of diatom community structure in the Arctic and to investigate the relative roles of spatial vs. environmental factors. The five different study regions are effectively isolated from each other, and diatoms can therefore only disperse overland passively via air or actively by water birds (Kristiansen, 1996). The lake communities of each region may therefore to some extent be defined by a low degree of interchange (dispersal) between the regions. Several studies have underlined the importance of spatial gradients in addition to, or sometimes over, environmental gradients in shaping diatom community structure in streams and lakes (Soininen and Weckström, 2009; Sweetman et al., 2010; Smucker and Vis, 2011; Virtanen and Soininen, 2012; Liu et al., 2016), and the discussion over dispersal influencing diatom species distribution is still ongoing (Keck et al., 2018; Falasco et al., 2019; Leboucher et al., 2020 and references therein).
The objectives of this study are to investigate diatom distribution and abundance across Greenlandic lakes and ponds, to determine which environmental factors are important in influencing diatom communities, and to establish to what degree these factors are spatially structured in this climatically and environmentally varied region. As such, this study contributes valuable and hard-to-obtain information on ecological and biogeographical characteristics of Arctic freshwater diatoms, vital for ecologists and palaeoecologists alike, adding important baseline knowledge for future palaeoclimatic and palaeoenvironmental interpretations in a still poorly investigated and climatically sensitive region.

Study area & sites
The 115 investigated lakes and ponds are located along the ice-free margins in five different regions of Greenland (Figure 1). Forty-eight sites are embedded in the unreworked Archaean gneiss of the southern West, in the region of Nuuk, Godhåbsfjord, where the annual mean temperature is −1.4°C, with 752 mm annual mean precipitation (Nuuk weather station). Fourteen sites are located in the central West, east of Ilulissat, at the foot of the inland ice and are embedded in reworked Archaean gneiss in early Proterozoic fold belts. The annual mean air temperature here is −3.9°C, and the annual mean precipitation is 266 mm (Ilulissat weather station). The sites in this region are further divided into young lakes/ponds (n = 7) that have emerged under the Greenland Ice Sheet since 1850 and into lakes/ponds (n = 7) significantly older than this date (Jeppesen et al., 2023). Seventeen sites are embedded in Proterozoic and lower [...] Henriksen and Higgins, 2009; Cappelen et al., 2011). Apart from the climatic differences between the regions, a climatic gradient also exists within each of the five regions from the more maritime outer coast to the continental ice-free inland. The study sites range in size from large proglacial lakes (ca. 2 km²) to small bedrock catchment ponds (0.002 km²) and their maximum

Sample collection and environmental data
The surface sediments were retrieved between 1998 and 2013, during the arctic summer (July-August), from the deepest area of each site using a Kajak corer (KC Denmark) or, in shallow lakes and ponds, with a tube on a rod. The cores were sliced into 0.25-0.5 cm intervals, with the exception of the 1998 samples from the Zackenberg/Daneborg region; here the cores were sliced in 1 cm intervals. The top half-centimetre was used for diatom sample preparation. Sampling regions and sites were chosen to cover wide environmental and spatial gradients across Greenland and to cover lakes and ponds from all depths, with approximately half between 2 and 10 m depth.
Lake water was collected for chemical analysis (total phosphorus (TP), total nitrogen (TN), specific conductivity, chlorophyll-a (chl a), and acidity (pH)) at the same time as the surface sediment was retrieved. From each site, a composite sample was collected with a 5 l Schindler sampler at 0.5-3 m intervals (depending on depth) from the surface to 0.5 m above the bottom in the deepest part of each site. The discrete samples were mixed and used for subsampling. For chemical analyses, a subsample of 250 ml was frozen, and for chl a a duplicate 1 l sample was filtered on a GF/C filter (Whatman) using vacuum and stored in snow/ice fans until arrival at the lab, where it was frozen until analysis. TP was determined as molybdate reactive phosphorus (Murphy and Riley, 1962) following persulphate digestion (Koroleff, 1970) and TN as nitrite + nitrate after potassium (K) persulphate digestion (Solórzano and Sharp, 1980). Chl a was determined spectrophotometrically after ethanol extraction for approx. 24 h (Jespersen and Christoffersen, 1987). Specific conductivity and pH in the surface water (0.5 m) were obtained using a YSI multiprobe recorder.
For information on fish absence or presence, multi-mesh sized monofilament survey gillnets were placed in the lakes/ponds in the afternoon and left overnight. Depending on the size and depth of the sites, the number of nets varied from one to five, covering the littoral, profundal and pelagic zone (Jeppesen et al., 2017).
Lake/pond area and altitude were extracted from Google Earth Pro (V 7.1.2.2041) to avoid error from the use of different (older) GPS technologies. To estimate vegetation cover in the catchment areas, the enhanced vegetation index (EVI) was used. We extracted EVI using a 250 m resolution, 16-day EVI band, compiled as an annual mean between 2000 and 2013. For EVI extraction, site coordinates were placed approx. 250 m away from the water body, where possible without interfering with neighbouring water bodies, to avoid false EVI extraction due to water surfaces. All EVI data were downloaded from the NASA USGS website via the MODISTools package in R (Tuck et al., 2014).
August mean air temperature for 1998-2013 was downloaded and extracted for our specific sites from WorldClim-Global Climate data in the highest resolution possible (30 arc-seconds, ~1 km; http://www.worldclim.org/; Hijmans et al., 2005) using the rgdal and raster packages in R.

Diatom analysis
Surface-sediment samples were prepared for diatom analysis following standard procedures using the water bath technique (Renberg, 1990; Battarbee et al., 2001). Slides were mounted using the mounting medium Naphrax®. The target total count for each of the 115 samples was 300-500 diatom valves. In 65 samples diatom valve concentrations were low, and the total count in these samples was 100-300 valves (26 samples: 100-200; 39 samples: 201-300). The samples were analysed at 1,000× magnification under oil immersion using a Zeiss AXIO phase contrast microscope. A complete list of identified taxa is provided in Supplementary Table 1.

Statistical analysis
Environmental data were checked for homogeneity of variance and log-transformed where necessary. Principal Components Analysis (PCA) using a correlation matrix was subsequently applied to explore the patterns across the different sampling regions. Diatom species abundance data were log- and Hellinger-transformed. PCA using a covariance matrix was subsequently applied to explore patterns in diatom composition. Despite relatively long floristic gradients revealed in Detrended Correspondence Analysis (DCA; 3.8 SD), the linear response model PCA, in combination with Hellinger transformation, was chosen as the more appropriate model (Legendre and Gallagher, 2001) to explore patterns of variation in the diatom assemblage. Diatom species diversity was calculated for each site using the inverse Simpson index and species richness on rarefied count data to account for different sample sizes. Correlations between richness, diversity and selected environmental and spatial variables were calculated using the Pearson correlation coefficient. The significance of the difference (or similarity) between species diversity in the different sampling regions was tested using the Welch two-sample t-test, which accounts for unequal variance in the groups being compared.
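As an illustration of the transformations and indices named above, the following is a minimal Python sketch, not the code used in this study. It assumes each sample is a list of raw valve counts per taxon with a non-zero total; the function names and the example counts are hypothetical. The Hellinger transformation is the square root of relative abundance, the inverse Simpson index is 1 / Σ p_i², and rarefied richness is computed here with Hurlbert's expected-richness formula for a subsample of n valves (the additional log-transformation mentioned in the text is not shown).

```python
import math


def hellinger(counts):
    """Hellinger transformation: square root of each taxon's relative abundance."""
    total = sum(counts)
    return [math.sqrt(c / total) for c in counts]


def inverse_simpson(counts):
    """Inverse Simpson diversity: 1 / sum(p_i^2) over taxa present in the sample."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def rarefied_richness(counts, n):
    """Expected number of taxa in a random subsample of n valves (Hurlbert's formula)."""
    total = sum(counts)
    if n > total:
        raise ValueError("subsample size exceeds the number of valves counted")
    denom = math.comb(total, n)
    # For each taxon: 1 minus the probability that none of its valves is drawn.
    return sum(1.0 - math.comb(total - c, n) / denom for c in counts if c > 0)


if __name__ == "__main__":
    sample = [50, 30, 20]  # hypothetical valve counts for three taxa
    print(hellinger(sample))
    print(inverse_simpson(sample))
    print(rarefied_richness(sample, 50))
```

Rarefying to a common subsample size (for example the smallest count in the data set) makes richness comparable between samples counted to 100 valves and samples counted to 500.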
To test for the importance of the spatial structure in our data set, spatial vectors (Principal Coordinates of Neighbour Matrices, PCNMs) were produced as a proxy for space and consequently added as spatial predictors in redundancy analysis (RDA). PCNMs are based on basic coordinates and produce a numerical expression of a range of possible spatial correlation in the data set. The vectors with small numbers (e.g., V1, V5) capture broad-scale spatial patterns across Greenland, whereas vectors with higher numbers (e.g., V42, V59) capture local-scale spatial patterns (variation in a single region).
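The first steps of a standard PCNM construction can be sketched as follows. This is a hedged illustration in Python rather than the procedure run for this study: it assumes site coordinates given as (latitude, longitude) in decimal degrees, great-circle distances on a sphere of radius 6,371 km, and the commonly used convention of truncating the distance matrix at the longest edge of the minimum spanning tree and replacing larger distances with four times that threshold. All function names are hypothetical. The final step, a principal coordinate analysis of the truncated matrix that yields the spatial vectors, is not shown.

```python
import math


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def distance_matrix(sites):
    """Pairwise distance matrix for a list of (lat, lon) sites."""
    return [[haversine_km(p, q) for q in sites] for p in sites]


def mst_longest_edge(dist):
    """Longest edge of the minimum spanning tree (Prim's algorithm)."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each node to the tree
    best[0] = 0.0
    longest = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        longest = max(longest, best[u])
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return longest


def truncate(dist, threshold):
    """Keep distances up to the threshold; replace larger ones with 4 * threshold."""
    return [[d if d <= threshold else 4.0 * threshold for d in row] for row in dist]


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    sites = [(64.2, -51.7), (64.3, -51.5), (69.2, -50.0), (82.2, -31.0)]
    d = distance_matrix(sites)
    print(truncate(d, mst_longest_edge(d)))
```

Truncation keeps only neighbourhood relationships, which is why the resulting eigenvectors range from broad-scale to local-scale patterns.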
To evaluate the respective importance of spatial and environmental factors for structuring diatom communities in our study sites, we used partial RDA on diatom abundance and presence/absence (PA) data. This was done under the assumption that PA data would be more strongly shaped by spatial factors than by environmental factors (compared to the proportion explained using diatom abundance data) if dispersal were a significant driver.
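The arithmetic behind variance partitioning with two explanatory sets can be sketched as below. This is a minimal Python illustration under stated assumptions, not the analysis code of this study: it takes the percentage of variance explained by the environment alone, by space alone, and by both together, and derives the unique and shared fractions. The function name is hypothetical, and the input totals in the example are back-calculated from the fractions reported in the results for abundance data (5% environment only, 17% shared, 14% space only), so they are derived values rather than figures stated in the text.

```python
def partition_variance(pct_env, pct_space, pct_both):
    """Split explained variance (in %) into unique and shared fractions.

    pct_env   -- variance explained by environmental variables alone in a model
    pct_space -- variance explained by spatial vectors alone in a model
    pct_both  -- variance explained by environment and space together
    """
    return {
        "env_only": pct_both - pct_space,           # environment after removing space
        "space_only": pct_both - pct_env,           # space after removing environment
        "shared": pct_env + pct_space - pct_both,   # spatially structured environment
        "unexplained": 100 - pct_both,
    }


if __name__ == "__main__":
    # Abundance data: 22 = 5 + 17 and 31 = 14 + 17 are derived from the reported fractions.
    print(partition_variance(22, 31, 36))
    # → {'env_only': 5, 'space_only': 14, 'shared': 17, 'unexplained': 64}
```

In practice such partitions are usually computed on adjusted R² values, so the simple subtraction shown here should be read as the underlying logic only.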
The first two PCs extracted from PCA of the environmental variables explain 44% of the variation (PC1 = 24%, PC2 = 20%). PCA clearly separates the lakes and ponds into the generally warmer, deeper, larger and more nutrient-enriched sites with a higher amount of catchment vegetation in the Southwest (Nuuk area), and the shallower, colder, ultra-oligotrophic, higher conductivity and more alkaline sites in the North and Northeast (Supplementary Figure 1; Table 1).

Richness and diversity
Overall, diatom species richness and diversity (with the exception of the Nunatarsuaq region) decrease from South to North, strongly correlating with latitude (r = −0.65 and −0.55, respectively, p < 0.001) and mean August air temperature (r = 0.59 and 0.51, respectively, p < 0.001; Figure 3). There is no significant relationship between species diversity and lake/pond area (r = 0.008) or depth (r = 0.06), but generally deeper lakes are amongst the more species-rich sites from the Southwest (Nuuk region). A weak, but statistically significant, relationship was found between richness/diversity and EVI (r = 0.29/0.27, p < 0.001). On the local scale, relationships between species diversity and the environment are less clear. The Nunatarsuaq lakes exhibit a diversity that is more similar to the species diversity range of the Nuuk region (Welch t-test: t = 2.2, p = 0.04; Figure 4A). The comparison of species diversity between same-latitude sites on the west coast versus the east coast of Greenland reveals no significant differences (Welch t-test: t = 0.46, p = 0.65; compare old Ilulissat lakes vs. Zackenberg/Daneborg, Figure 4A). For the three regions Nuuk, Zackenberg/Daneborg and Pearyland, where sites follow a quasi coast-to-inland-ice-margin transect, species diversity decreases from the sites closest to the coast or fjord towards the sites closest to the inland ice, but no statistically significant relationship with our measured environmental variables is found (Figures 4B,D).
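For readers unfamiliar with the two statistics reported above, the following Python sketch shows how a Pearson correlation coefficient and a Welch t statistic (with Welch-Satterthwaite degrees of freedom) are computed. It is an illustration only: the function names and the numbers in the example are hypothetical and are not data from this study, and the p-value, which requires the t distribution, is not computed here.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def welch_t(a, b):
    """Welch two-sample t statistic and degrees of freedom (unequal variances)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical diversity values for two regions and a hypothetical latitude series.
    region_a = [12.1, 9.8, 11.4, 10.2, 13.0]
    region_b = [6.3, 7.9, 5.5, 8.1]
    print(welch_t(region_a, region_b))
    print(pearson_r([60, 64, 69, 74, 82], [14.0, 12.5, 9.0, 8.2, 5.1]))
```

Because the Welch test does not pool variances, it remains valid when regions differ strongly in both the number of sites and the spread of diversity values.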

Environmental predictors
The RDA with forward selection (999 Monte Carlo permutations) reveals that eight out of the 11 environmental variables contribute significantly to the species-sites relationships of the data set, accounting for 29% of the variance (TP, area and fish insignificant). Mean August temperature, EVI and lake depth have the strongest correlation with axis 1, whilst specific conductivity and pH correlate

Spatial predictors
To test for the influence of the spatial factors on the diatom species distribution, PCNMs were used. In total, 59 spatial vectors (PCNMs) captured the spatial structure of the 115 locations across Greenland. In the RDA with Monte Carlo forward selection, 15 of these spatial vectors significantly explained patterns in diatom species communities. Combining these 15 significant spatial vectors with the 11 measured environmental variables, the RDA with Monte Carlo forward selection revealed that all 15 spatial vectors and lake depth, EVI, specific conductivity, chl a, TN and TP were statistically significant in shaping the diatom communities (p ≤ 0.01). Mean August air temperature was marginally significant, depending on the permutation run. In combination with spatial vectors (e.g., V4 and V59, which capture the regional and local spatial structure of the Ilulissat region, respectively), specific conductivity still remained strongly associated with community patterns along axis 2, whilst axis 1 was now more strongly associated with spatial vectors (e.g., V1 and V5, which effectively capture the degree of spatial structuring between all northern and all southern sites), and measured environmental variables were less important (Figure 6).
The partial RDA (variance partitioning) using diatom species abundance data revealed that the environmental factors independently explain 5% of the variance, while space explains 14%. The conditional effect relating to the covariance between environment and space explains 17% of the variance. Together they explain 36% of the total variation (p < 0.001). The partial RDA using PA data revealed a similar proportion of explanatory power (environmental factors 3%, space 12%, conditional effect 15%; together explaining 30% of the total variance, p < 0.001; Figures 7A,B). The variance explained is relatively low, but it is typical of data sets containing many taxa and many zero values in the species matrix. Important, however, are the associated

Environmental gradients
Typical for high latitude freshwaters, the majority of our lakes and ponds are ultra-oligotrophic to oligotrophic (Ryves et al., 2002; Bouchard et al., 2004; Cremer and Wagner, 2004). The distribution of trophy is associated with mean August air temperature. Warmer summer temperatures, characteristic for the southwestern region of Greenland, lead to increased terrestrial plant growth and hence catchment inputs (Normand et al., 2013), which affect the trophic state of the lakes and ponds: sites in the Southwest are more nutrient enriched (up to mesotrophic), while those in the North and Northwest are ultra-oligotrophic (Table 1). High specific conductivity sites are closely linked to carbonate bedrock geology and silty soil type (vs. rocky catchments) in the Pituffik and Pearyland regions. More dilute and acidic lakes are located in Southwest Greenland, embedded in gneiss bedrock and associated with more established soil and vegetation development binding base cations (Law et al., 2015). Dilute and acidic lakes are also found in the Nunatarsuaq region (Table 1), which, unlike the Pituffik lakes, are embedded in siliciclastic bedrock. Some sites in recently deglaciated areas, here the newly established lakes from the Ilulissat region, had high conductivity (Jeppesen et al., 2023) despite the gneiss bedrock.
Figure: Species richness and diversity on a regional scale vs. latitude (A,B), mean August air temperature (C,D), and enhanced vegetation index (EVI) (E,F). Sites are colour-coded by region and scaled by lake depth.

Diatom communities and species ecology
Many of the taxa encountered in our Greenland lake surface-sediment data set, including dominantly benthic species, have been previously reported in studies from the high Arctic. These taxa are typical for slightly acidic to circumneutral, oligotrophic, electrolyte-poor lakes (Bouchard et al., 2004; Jones and Birks, 2004; Antoniades et al., 2005; Keatley et al., 2008; Hadley et al., 2013; McGowan et al., 2018). The low abundance of planktic taxa has been observed in most high Arctic surveys and is linked to the shallowness of sites, extended ice-cover period and low nutrient concentration of the lake water (e.g., Smol, 1988; Weckström et al., 1997a,b; Weckström and Korhola, 2001; Teittinen et al., 2018). We found consistently higher abundances of planktic Pantocsekiella ocellata and Discostella stelligera in the deeper lakes of the Southwest (Nuuk region), indicating a longer open water season, sufficient depth and nutrient supply enabling planktic growth (Reynolds et al., 2002; Rühland et al., 2015; Saros and Anderson, 2015; Ossyssek et al., 2020). Higher abundances of planktic species were also observed in the deeper lakes of the regions Ilulissat (Cyclotella and Discostella) and Nunatarsuaq (Aulacoseira, especially A. lirata; Figure 2; Table 1). The difference in the dominant planktic species is likely related to temperature stratification, with more stratified lakes in the warmer Southwest. Heavily silicified Aulacoseira taxa require turbulent (mixed) conditions to thrive, whereas small Cyclotella (sensu lato) species dominate in a temperature-stratified water column (Rühland et al., 2003, 2015).
The high compositional species turnover in our survey identified by DCA (3.8 SD) was expected and indicates relatively large differences in diatom assemblages amongst regions and sites due to the wide range of habitats across Greenland. Species of mostly alkaliphilous Staurosirella, Staurosira and Staurosira are typical for the Pearyland, Pituffik and Zackenberg/Daneborg regions. These taxa are often described as pioneers in fast-changing environments, and are known to be competitive in shallow, hard-substrate, low-productivity cold sites, where they thrive under prolonged ice-cover and the resulting short growing season (Lotter and Bigler, 2000; Rühland et al., 2003; Pla-Rabes et al., 2016).
While pioneer species and generalists with wider niches (e.g., Achnanthes sensu lato) dominate Greenland's northerly sites, southerly sites host more complex communities and life forms. A succession along a gradient of increasing temperature, from relatively simple diatom assemblages (adnate benthic small Fragilaria) to more diverse and complex assemblages, has been observed in other lake surface-sediment studies in the Canadian Arctic (e.g., Michelutti et al., 2003) and in palaeolimnological studies, where it was suggested to be an indication of climate warming (Smol, 1988; Douglas and Smol, 1999). The southerly sites in Greenland host generalist Achnanthes sensu lato (Achnanthes spp., Achnanthidium spp., Planothidium spp., Psammothidium spp., Rossithidium spp.) along with high abundances of Eunotia spp. and larger epiphytic species like F. quadrisinuata, F. saxonica, Brachysira brebissonii, B. neoexilis and P. mesolepta, which are acidophilous taxa (Weckström et al., 2003) reflecting the dilute and slightly acidic waters of the Southwest. Here they also reflect a higher abundance of mosses and submerged macrophytes than in the northern sites due to the longer growing season and higher temperatures, as these epiphytic taxa have often been found in moss-rich Arctic freshwaters (Douglas and Smol, 1995; Michelutti et al., 2007; McGowan et al., 2018). Apart from the absence of Frustulia species, diatom communities in Nunatarsuaq are dominated by these same taxa, likely owing to the dilute and slightly acidic lake waters in this region (Table 1) and the presence of mosses also in the deeper sites. Many of the lakes here have very clear water (Secchi depth ≥ lake depth) due to stony catchment areas with little soil. As the Nunatarsuaq area is located at a higher elevation close to the inland ice, the number of sunny days is higher than near the coast, favouring benthic macrophytes.
Encyonopsis microcephala was encountered at high abundances together with the cold-water O. mesodon (Potapova, 2009) and the generalist N. perminuta in the newly established lakes and ponds at the foreland of the fast-retreating Ilulissat glacier (Figure 2; E. microcephala, which in Figure 2 is grouped together with Cymbella spp., was found at 20% on average). Also, A. libyca was abundant in several of these lakes and ponds. It was found at marked abundances only here and at the higher conductivity of the Pearyland sites. Despite their classification as pioneers, small fragilarioid taxa were scarce (Figure 2). As such, these sites, established after 1850 and some after 1960 (Jeppesen et al., 2023), host very different diatom communities compared with any other sites investigated in this study. They are high-conductivity and vegetation-free (both submerged and terrestrial) systems, characterised by a rocky lake bottom, with clay and a thin layer of lake sediment deposition. Species of Amphora and Diatoma (Odontidium) have been shown to prefer high-conductivity environments (Ryves et al., 2002; Weckström and Juggins, 2005). This could explain their occurrence in such glacier-proximal lakes, which are known to be ion-enriched and alkaline (Engstrom et al., 2000). Nitzschia perminuta was also observed at high abundances in Svalbard ponds, which were recently formed after glacier retreat (30-40 yr old), and which are similar to our sites in being hydrologically isolated from the retreating glacier (Pinseel et al., 2017). In the Kangerlussuaq region, 250 km south of Ilulissat, McGowan et al. (2018) found Encyonopsis (Cymbella) microcephala at high abundances co-dominant with different species of Nitzschia in the investigated inland lakes, whilst the abundance of small fragilarioid taxa was low. These taxa are known to inhabit microbial biofilms (McGowan et al., 2018). In addition to these more motile species, prostrate forms such as O. mesodon (Spaulding et al., 2010, as Diatoma mesodon) are also suited to grow as epiphytes on algal filaments in complex microbial mats (McGowan et al., 2018, and references therein). Cyanobacteria-dominated microbial mats are highly common in cold polar waters and are known to be primary colonisers after glacier/ice sheet retreat, offering a suitable habitat for other algal groups (Vincent, 2000).
Figure: Variance partition on (A) diatom abundance data, environmental variables, and spatial vectors and (B) diatom presence/absence data, environmental variables, and spatial vectors.
Frontiers in Ecology and Evolution, frontiersin.org, doi: 10.3389/fevo.2023.1177638

Diatom richness and diversity
With the exception of the Nunatarsuaq region, diatom diversity and species richness showed a decline with latitude, aligning with earlier studies encompassing the North American, European and Russian Arctic (Michelutti et al., 2003; Kahlert et al., 2021). Bouchard et al. (2004) further found a strong correlation between species diversity and lake area, which they attributed to a higher heterogeneity in habitats. Their study covered a much smaller region compared with our Greenland-wide lake survey, where other environmental gradients override the importance of lake area. Mean August temperature, which decreases from south to north, is strongly correlated with diversity and richness in our data set, emphasising that the duration of the growing season directly or indirectly strongly controls diatom diversity in the High Arctic. A longer and warmer summer in the southern sites will lead to earlier ice breakup and a longer growing season, creating a wider range of habitats. Diversity increases due to the development of more complex assemblages as the season progresses (Smol, 1988; Douglas and Smol, 1999; Smol et al., 2005; Keatley et al., 2008; Weckström et al., 2014). In addition, the terrestrial vegetation is generally denser (or present) at southern sites, promoting diatom abundance and diversity indirectly via additional nutrient input (Bouchard et al., 2004; Wrona et al., 2016). The surprisingly high species richness and diversity in the Nunatarsuaq region in comparison to all other northerly sampling regions (Figure 4A) could at least partly be explained by the presence of aquatic mosses in the majority of the sites, translating into a greater potential for diverse micro-habitats for benthic taxa (Bouchard et al., 2004). Additionally, half of the sites in the region are deep (Table 1), allowing for a planktic community to develop, adding to the species diversity.
In the regions with a wider ice-free margin (Nuuk, Zackenberg/Daneborg and Pearyland, Figure 1) the within-region diversity and richness decreased from the coastal and fjord lakes towards the inland and ice margin lakes (Figure 4), weakly correlating with decreasing mean August air temperature along an altitudinal gradient (the temperature range within each region varied from 1.6°C in Pearyland to 2.6°C in Zackenberg/Daneborg and Nuuk). This tentatively aligns with the above-discussed dependency of diatom diversity on the length of the growing season.

Diatoms-environment vs. space
The physical and chemical components in the RDA (without spatial vectors) explained 29% of the variation in the diatom community, a figure that is similar to other studies in the Arctic (e.g., Ryves et al., 2002; Lim et al., 2007; Keatley et al., 2008). Mean August temperature, specific conductivity, and vegetation cover (EVI) were most significant in driving the diatom community composition, indicating that climate, geology, and lake/catchment age determine (indirectly) species distribution along the extensive environmental gradients in Greenland.
The presence and abundance of fish has been suspected as a potential driver of diatom species distribution via controlling the direct grazing pressure on planktic diatom communities (e.g., McGowan et al., 2018). Indeed, fish presence has shown strong cascading effects on zooplankton and macroinvertebrate communities in arctic and boreal regions (Jeppesen et al., 2003, 2017; Milardi et al., 2016) and on the dominance of planktic or benthic diatom production (Milardi et al., 2017). In this study, however, we could not show a significant effect of fish presence on diatom species community distribution, similar to results from Jeppesen et al. (2017), who did not find cascading effects of fish on the phytoplankton biomass in low and high arctic lakes in Greenland.
When environmental and spatial components were analysed together in RDA (using diatom abundance data), the environment uniquely explained only 5% of the variation, whereas space, captured as spatial vectors (PCNMs) based on basic coordinates, appears to have a stronger impact on the patterns in the diatom community composition, uniquely accounting for 14% of the variation. Together, environmental components and space accounted for 36% of the variation (including the conditional effect relating to their covariance, Figure 7). Results using diatom presence/absence data were almost identical. We hypothesised that if space explained a markedly larger portion of diatom variability using P/A data, this would indicate dispersal as a significant driver over the large distances between our study regions. As this was not the case, we conclude that spatial factors relating to the varying environmental settings/long environmental gradients covered by our extensive study area are likely at play. As further support, there was no difference in species diversity between the sites on the east and west coast located at the same latitude; if diatom communities were affected by dispersal limitation over larger distances, we would have expected the more isolated east coast diatom communities to display lower species diversity. Many of the diatom species encountered were found in several of the study regions (see Supplementary Table 1). The most significant spatial vectors were those separating the southern sites in Nuuk from all northern study regions along a climate (temperature, EVI) gradient and those separating Ilulissat sites (and a few Pearyland lakes) from all other regions along a conductivity gradient, but also separating the old and new sites within the Ilulissat region relating to lake ontogeny.
For a much smaller regional-scale surface diatom data set in the Kangerlussuaq area, West Greenland, McGowan et al. (2018) found that space accounted for 9-20% of variation in the diatom community, depending on sampling season. Likewise, Virtanen and Soininen (2012) concluded that spatial structure was more important than local environmental variables for defining diatom communities in boreal streams, suggesting that large-scale processes related to climate, history and dispersal may play a significant role in community distribution, while Sweetman et al. (2010) observed a similar pattern for zooplankton in Canadian Arctic lakes. However, for lakes in the Canadian High Arctic, Keatley et al. (2008) found only environmental variables to be important in structuring diatom communities, although they concluded that this may only be true at the landscape level explored. Our study covers a significantly larger geographical area, where distance between study areas was hypothesised to increase in importance as a factor (Verleyen et al., 2009).
The spatial separation of the subset of young Ilulissat sites may be explained by these lakes having appeared recently (after 1850, and many after 1960) due to glacier retreat (Jeppesen et al., 2023). They are characterised by bare catchment ground and poor lake habitat development. The markedly different environmental conditions of these lakes, which were not fully captured by our measured environmental variables, are reflected in very distinct diatom communities. Hence, spatial structure may become more important in shaping species communities when relevant spatially linked environmental variables are not captured by the environmental data collected. Potential unmeasured variables contributing to the higher explanatory power of space in our study setting could be, for example, duration of lake ice cover, length of the growing season, abundance of aquatic macrophytes, topography of the catchment, wind conditions, lake shape, and trophic interactions between zoo- and phytoplankton (not related to fish presence). Judging by the positions of the main spatial and environmental vectors relative to each other in Figure 6, the unmeasured spatially structured environmental variables are largely linked to the latitudinal climate gradient and to lake ontogeny. Future climate warming is hence likely to affect several of the measured and unmeasured spatially structured environmental variables in this study, and potentially marked changes in the species composition and diversity of Greenlandic (and Arctic-wide) diatom communities may be expected.

Conclusion
One of the key requirements for the use of organisms as palaeoecological proxies is a sufficient understanding of their ecology (which modern environments species are found in and which changes in their environment they respond to). Exploring modern calibration data sets, which combine information on species assemblages and their surrounding environment, allows such critical information to be gained. Here we used a large 115-lake data set of remote Greenlandic lakes, covering extensive geographical, climatic and environmental gradients, to improve our knowledge of the ecological and biogeographical characteristics of Arctic freshwater diatoms.
There were marked differences in the diatom community structure between sites, although a number of species were found in all regions. The lakes and ponds in the northern regions showed a distinctive dominance of small benthic fragilarioid species, linked to their extensive ice-cover period and poor habitat diversity. Diatom communities in the Southwest of Greenland were more varied, including many epiphytes, expressing the longer growing season, more complex successional development, and higher habitat diversity of these lakes and ponds. The main environmental drivers of diatom communities in Greenland were summer (August) air temperature, catchment vegetation and lake conductivity. However, a larger part of diatom variability was explained by spatially structured (unmeasured) environmental variables that aligned with the identified main explanatory variables into climate-related and bedrock/lake development-related drivers. Species diversity was most strongly driven by climate (temperature), following an overall clear latitudinal decline towards the North. Regionally, diversity declined along an altitudinal gradient from the coast to the ice margin. Despite the large distances between our study regions, diatom dispersal appeared not to be limited.
Diatoms provide an excellent tool for (long-term) assessments of climate change in the Arctic and its effects on lake ecosystems, as they appear to be most strongly driven by climate-related variables over Arctic-wide scales. However, several of these drivers, such as in-lake habitat availability, are often left unmeasured in Arctic field campaigns, as are non-climatic parameters such as trophic interactions. Given their apparent importance, they should nevertheless be carefully considered when using diatoms as a proxy for reconstructing past and present climate change in the Arctic.

Frontiers in Ecology and Evolution, doi: 10.3389/fevo.2023.1177638
FIGURE 1 Location of study areas.

(Figure 5). The seven investigated areas show distinct clusters in the ordination plot. The Nuuk region (Southwest) plots on the positive side of the first axis, whilst the Zackenberg/Daneborg region (Northeast), Pearyland (North) and Pituffik (Northwest) plot on the opposite end. The Nunatarsuaq region (Northwest) is located in between. The older lakes in the Ilulissat region (West) and some Pearyland sites plot on the positive side of the second axis, with the young lakes in the Ilulissat region clustering closely together at the positive end of the axis.

FIGURE 3
geology. These high values are linked to the recent retreat of the inland ice, as inwash to these lakes from the newly emerged catchments is rich in nutrients and base-rich solutes (Law et al., 2015; Jeppesen et al., 2023).

FIGURE 4 Species diversity of study regions (A), and three local-scale examples of a "coast->ice margin" transect (B-D).

FIGURE 5 Redundancy analysis (RDA) (includes measured environmental variables only) with Monte Carlo forward selection. Regions are colour-coded, and the 40 most abundant taxa are shown.

FIGURE 6 Redundancy analysis (RDA) with significant spatial vectors (PCNMs) included, under Monte Carlo forward selection. Regions are colour-coded, and the 40 most abundant taxa are shown.

TABLE 1
Measured environmental variables for 115 Greenlandic lakes and ponds with means, medians, and ranges.
Each region hosts relatively well-defined diatom communities. Dominantly acidophilous and periphytic taxa (Brachysira spp., Frustulia spp., Cymbella sensu lato spp., Pinnularia spp.) and planktic/tychoplanktic taxa (Pantocsekiella, Discostella spp., Aulacoseira spp., Tabellaria spp.) plot together with the warmer, deeper, nutrient-enriched lakes in the Southwest. Small fragilarioid taxa (e.g., S. venter, S. pseudoconstruens, S. pinnata) are closely linked to the shallower, cooler, nutrient-poor lakes and ponds in the North, Northeast and Northwest. The intermediate region in the Northwest (Nunatarsuaq) hosts a combination of taxa from the Southwest (Nuuk) and the northern sites. The lakes from the Ilulissat region and Pearyland with higher alkalinity and conductivity are dominated by Nitzschia spp., O. mesodon, E. microcephala and Amphora libyca.