Holocene Vegetation and Plant Diversity Changes in the North-Eastern Siberian Treeline Region From Pollen and Sedimentary Ancient DNA

Although sedimentary ancient DNA (sedaDNA) has been increasingly used to study paleoecological dynamics (Schulte et al., 2020), the approach has rarely been compared with the traditional method of pollen analysis for investigating past changes in the vegetation composition and diversity of Arctic treeline areas. Here, we provide a history of latitudinal floristic composition and species diversity based on a comparison of sedaDNA and pollen data archived in three Siberian lake sediment cores spanning the mid-Holocene to the present (7.6–0 cal ka BP), from northern typical tundra to southern open larch forest in the Omoloy region. Our results show that the sedaDNA approach identifies more plant taxa found in the local vegetation communities, while the corresponding pollen analysis mainly captures the regional vegetation development and has its limitations for plant diversity reconstruction. Measures of alpha diversity were calculated based on sedaDNA data recovered from along a tundra to forest tundra to open larch forest gradient. Across all sites, sedaDNA archives provide a complementary record of the vegetation transition within each lake’s catchment, tracking a distinct latitudinal vegetation type range from larch tree/alder shrub (open larch forest site) to dwarf shrub-steppe (forest tundra) to wet sedge tundra (typical tundra site). By contrast, the pollen data reveal an open landscape, which cannot distinguish the temporal changes in compositional vegetation for the open larch forest site and forest-tundra site. Increasing Larix pollen percentages were recorded in the forest-tundra site in the last millenium although no Larix DNA was detected, suggesting that the sedaDNA approach performs better for tracking the local establishment of Larix. Highest species richness and diversity are found in the mid-Holocene (before 4.4 ka) at the typical tundra site with a diverse range of vegetational habitats, while lowest species richness is recorded for the forest tundra where dwarf-willow habitats dominated the lake’s catchment. During the late Holocene, strong declines in species richness and diversity are found at the typical tundra site with the vegetation changing to relatively simple communities. Nevertheless, plant species richness is mostly higher than at the forest-tundra site, which shows a slightly decreasing trend. Plant species richness at the open larch forest site fluctuates through time and is higher than the other sites since around 2.5 ka. Taken together, there is no evidence to suggest that the latitudinal gradients in species diversity changes are present at a millennial scale. Additionally, a weak correlation between the principal component analysis (PCA) site scores of sedaDNA and species richness suggests that climate may not be a direct driver of species turnover within a lake’s catchment. Our data suggest that sedaDNA and pollen have different but complementary abilities for reconstructing past vegetation and species diversity along a latitude.

Although sedimentary ancient DNA (sedaDNA) has been increasingly used to study paleoecological dynamics (Schulte et al., 2020), the approach has rarely been compared with the traditional method of pollen analysis for investigating past changes in the vegetation composition and diversity of Arctic treeline areas. Here, we provide a history of latitudinal floristic composition and species diversity based on a comparison of sedaDNA and pollen data archived in three Siberian lake sediment cores spanning the mid-Holocene to the present (7.6-0 cal ka BP), from northern typical tundra to southern open larch forest in the Omoloy region. Our results show that the sedaDNA approach identifies more plant taxa found in the local vegetation communities, while the corresponding pollen analysis mainly captures the regional vegetation development and has its limitations for plant diversity reconstruction. Measures of alpha diversity were calculated based on sedaDNA data recovered from along a tundra to forest tundra to open larch forest gradient. Across all sites, sedaDNA archives provide a complementary record of the vegetation transition within each lake's catchment, tracking a distinct latitudinal vegetation type range from larch tree/alder shrub (open larch forest site) to dwarf shrub-steppe (forest tundra) to wet sedge tundra (typical tundra site). By contrast, the pollen data reveal an open landscape, which cannot distinguish the temporal changes in compositional vegetation for the open larch forest site and forest-tundra site. Increasing Larix pollen percentages were recorded in the forest-tundra site in the last millenium although no Larix DNA was detected, suggesting that the sedaDNA approach performs better for tracking the local establishment of Larix. Highest species richness and diversity are found in the mid-Holocene (before 4.4 ka) at the typical tundra site with a diverse range of vegetational habitats, while lowest species richness is recorded for the forest tundra where dwarf-willow habitats dominated the lake's catchment. During the late Holocene, strong declines in species richness and diversity are found at the typical tundra site with the vegetation changing to relatively simple communities. Nevertheless, plant species richness is mostly higher than at the forest-tundra site, which shows a slightly decreasing trend. Plant species richness at the open larch forest site fluctuates

INTRODUCTION
Arctic ecosystems are assumed to be very sensitive to climate change. Shifts in vegetation composition and plant richness (or other alpha diversity measures) over the past few decades have thus been assigned to Arctic warming (Elmendorf et al., 2012;Pearson et al., 2013;Myers-Smith et al., 2015. Plant richness of the tundra is typically lower than in forest areas and is associated with harsh climate conditions (Chapin and Körner, 1995;Khitun et al., 2016). Hence, species diversity in present-day tundra areas is expected to increase in the course of forest expansion related to future warming. However, there is speculation that plant richness may decrease in the course of warming due to the reduction of habitats for arctic-adapted species (Callaghan et al., 2004). Additionally, other observationbased studies argue that plant species diversity is not mainly driven by climate (Hudson and Henry, 2009), but by other forcing factors such as soil conditions (e.g., water content and nutrition, Nabe-Nielsen et al., 2017), biotic interactions (Mod et al., 2016), or landscape diversity (e.g., polygonal terrain, Krauss et al., 2004;Zibulski et al., 2016). Accordingly, whether plant richness in tundra areas will decrease or increase under future climate-driven vegetation turnover is largely unknown.
This lack of understanding may, to some extent, originate from the slow response of vegetation to climate change  and because the observational period is too short to provide a comprehensive perspective on underlying causes of vegetation change (Willis and Birks, 2006). To fill this gap, natural archives such as lake sediments can be explored to gain long-term records of plant richness and vegetation composition in relation to a changing climate (Birks, 2019).
Vegetation change is traditionally reconstructed using pollen and plant macrofossil data, but both have their limitations . Several pollen studies from the high northern latitudes document an early-to mid-Holocene treeline advance Naidina and Bauch, 2011;Salonen et al., 2011) though studies from northern Siberia are still relatively scarce (Clayden et al., 1997). An open question exists with respect to lagged vegetation-climate relationships (Elmendorf et al., 2012;Herzschuh et al., 2016;Crump et al., 2019;Herzschuh, 2020). While it is assumed that the broad vegetation composition can be inferred from arctic pollen records despite the biases from different pollen productivities and dispersal characteristics (Klemm et al., 2013;Niemeyer et al., 2015), plant diversity estimates are challenging because of low taxonomic resolution in some studies (Naidina and Bauch, 2001;Rudenko et al., 2014). Aside from the lower taxonomic resolution, plant diversity estimates can also be confounded by pollen production, dispersal ability and representation . Although plant macrofossils can improve the taxonomic classification and complement the vegetational information of the pollen record (Birks and Birks, 2000;Andreev et al., 2011;Aarnes et al., 2012;Gałka et al., 2018), their different production, dispersion and preservation may prevent the inference of plant richness (Birks, 2014).
Recently, sedimentary ancient DNA (sedaDNA) analysis was established as a new paleoecological tool (Jørgensen et al., 2012;Pedersen et al., 2013;Willerslev et al., 2014;Epp et al., 2015;Parducci et al., 2017;Zimmermann et al., 2017;Clarke et al., 2018). The general g/h primers that amplify a part of the trnL P6 loop of the plant chloroplast genome (Taberlet et al., 2007) have been widely applied to identify vascular plants in various environmental samples (e.g., Jørgensen et al., 2012;Zimmermann et al., 2017;Clarke et al., 2019). The major advantages of sedaDNA over the classical methods are that more taxa can be identified to low taxonomic levels and it is well representative of the investigated site (Alsos et al., 2018). A previous study has shown that sedaDNA can identify almost twice as many taxa in total and at species-level and genus-level as pollen does (Niemeyer et al., 2017), allowing a semi-quantitative estimate of floristic species richness in north-eastern Siberia during the pre-and post-Last Glacial Maximum (Zimmermann et al., 2017). The sedaDNA approach may thus have the potential to track local richness variations in the course of compositional vegetation changes by itself.
Although sedaDNA has been used in addition to pollen or as an independent proxy to investigate past changes in Holocene floristic patterns (Jørgensen et al., 2012;Paus et al., 2015;Zimmermann et al., 2017;Voldstad et al., 2020), comparisons of how these two approaches reflect vegetation and diversity changes along a latitudinal vegetation gradient are still scarce, particularly for Siberia (Jørgensen et al., 2012;Zimmermann et al., 2017). In Siberia, the average July temperature in the mid-Holocene was warmer than today and its anomalies increased with latitude above 65 • N, which have been demonstrated by vegetation modeling (Monserud et al., 1998) and fossil records (e.g., Andreev et al., 2002;Nazarova et al., 2013). A similar latitudinal cold trend (with a lag in the south) has been recorded in this region (Andreev et al., 2004;Müller et al., 2009). Whether plant diversity has a similar latitudinal pattern in the context of climate variation is still unknown. Here, we analyze pollen and sedaDNA from three lake sediment cores from the Omoloy region in north-eastern Siberia (northern Yakutia), which are currently surrounded by different vegetation types ranging from typical tundra to open larch forest. First, our aim is to compare sedaDNA with the pollen data to see whether both methods track the same pattern with respect to compositional changes and diversity changes across the northern Russian treeline zone or are complementary to each other. Second, we reconstruct the mid-to late-Holocene changes of vegetation composition along a north-south transect. Third, we use the sedaDNA data to reconstruct variations in species richness and relate this to vegetation and climate change.

Study Area
The study area (70.96-70.53 • N and 132.57-132.92 • E, Figure 1A) is located in the northern part of the Republic of Sakha (Yakutia, north-eastern Siberia, Russia), to the east of Tiksi Bay. It spans a 75 km transect to the east of the Omoloy River stretching from typical tundra areas in the north to open larch forests in the south ( Figure 1B). The vegetation of the northernmost site is typical arctic tundra (site I, Figure 1B, 14OM12A), characterized by Poaceae and Cyperaceae with some dwarf shrubs, such as Betula spp. (dwarf birch) and Salix spp. (dwarf willow). The forest tundra (site II, 14OM02B) is characterized by dense shrubtundra ( Figure 1B), consisting of Alnus alnobetula (alder shrub), Salix spp., Betula spp., and patches of Larix cajanderi (larch). The southernmost site (site III, 14OM20B), open larch forest, is situated in a tree-free valley dominated by Poaceae and some Ericaceae (e.g., Ledum palustre, Vaccinium vitis-idaea), while the hills around the site within lake catchment are covered by dense larch forest (Larix cajanderi).
The landscape, with its periglacial landforms and low topographic relief, contains many thermokarst lakes on underlying continuous permafrost (Wetterich et al., 2011). The mean active layer depths are 35.10 cm (site I), 35.05 cm (site II) and 35.00 cm (site III), which are measured based on 400 m 2 plots (field measurements in July 2014). Mean July temperature at Tiksi weather station (71.58 • N, 128.92 • E, 8 m a.s.l., ca. 170 km west of site II) is 8 • C based on weather reports during 1980-2016 (NOAA's Integrated Surface Hourly). In the winter, the study area is influenced by a persistent high-pressure system and covered by a thin layer of snow. The coldest month is February with an average temperature of −31 • C. Annual rainfall is 189.1 mm and mostly falls during the summer months (June-August).

Lake Sediment Cores and Subsampling
Three lake sediment cores, 14OM12A (33 cm long), 14OM02B (49.5 cm long) and 14OM20B (86 cm long), were recovered from three sites using a UWITEC gravity corer (6 cm internal diameter) equipped with a hammer tool in July 2014. The sediment cores were stored cold and dark before being transported to the cooling room at the Alfred Wegener Institute for Polar and Marine Research (AWI). Core subsampling was conducted soon thereafter at 10 • C under clean conditions in the climate chamber of GFZ German Research Centre for Geosciences. The detailed subsampling procedures are described Frontiers in Ecology and Evolution | www.frontiersin.org in Zimmermann et al. (2017). In total, 54 samples (18 samples per core) were taken for ancient DNA analysis and pollen analysis.

Dating
From the three cores, 16 bulk organic carbon samples were selected because of the lack of macrofossil remains and radiocarbon dated using accelerator mass spectrometry (AMS) at Poznań radiocarbon laboratory of Adam Mickiewicz University, Poland. In addition, 30 freeze-dried samples per core at 0.25 or 0.5 cm intervals between 0 and 15 cm were analyzed for 210 Pb/ 137 Cs at the Liverpool University Environmental Radioactivity Laboratory. The age-depth model was built using Bacon (Blaauw and Christen, 2011) and all ages were calibrated to calendar years before present (BP = before 1950 AD) with the IntCal13 calibration curve (Reimer et al., 2013).

Pollen Analysis
For each lake sediment core, pollen samples, mostly at 1.5-cm intervals, were prepared using the standard acetolysis procedure (Faegri et al., 1989), including HCl (10%), KOH (10%)/NaOH (10%), hot HF (42%), and acetic anhydride with H 2 SO 4 for 2 min. In order to estimate the pollen concentration, a Lycopodium spore tablet (Batch nr. 1031; n = 20,848 ± 1460) was added to each pollen sample before starting the chemical treatment. The pollen concentrate was sealed with glycerol gel and examined at x400 magnification with a Zeiss Axioskop 40 microscope. Pollen identification follows Savelieva et al. (2013) and modern pollen reference collections from AWI, as well as Moore et al. (1991) and Beug (2004). The sum of terrestrial pollen grains in each sample is more than 500. The three raw pollen datasets are shown in Supplementary Data 1.

DNA Extraction and Amplification
SedaDNA was extracted from about 2-4 g of sediment using the PowerMax R Soil DNA Isolation Kit (MoBio Laboratories, Inc., United States) in a dedicated paleogenetic laboratory in AWI. In total, 54 samples were extracted in six batches by adding 15 ml bead solution, 1.2 ml C1 buffer, 400 µl of 2 mg/l proteinase K (VWR International) and 100 µl of 5 M dithiothreitol in the bead tube for each sample. Samples were incubated at 56 • C in a rocking shaker overnight. All subsequent extraction steps of the kit followed the manufacturer's instructions. We handled each extraction batch on different days. One extraction batch always contained nine samples and one extraction blank to help identify any potential contamination during the extraction process. Two PCRs (polymerase chain reactions) were performed for each extraction batch using the g-h universal plant primer targeting the P6 loop region of chloroplast trnL (UAA) intron (Taberlet et al., 2007) to amplify the short DNA fragments (10-146 bp) with identical tag-combinations. The massively parallel sequencing can be performed because the 5 end of the primers have a unique flanking sequence of 8 bp (Binladen et al., 2007). NNN tagging attached to the primers improved the cluster detection on the sequencing platform (De Barba et al., 2014). The PCR mixture, in total 25 µL, contained 3 µl DNA, 1.25 U Platinum R Taq High Fidelity DNA Polymerase (Invitrogen, United States), 0.2 mM of each primer, 10 x HiFi buffer, 2 mM MgSO 4 , 0.25 mM mixed dNTPs (consisting of dATP, dCTP, dGTP, and dTTP) and 0.8 mg Bovine Serum Albumin (VWR, Germany). In order to test for potential contamination, a PCR negative control (no template control, NTC) was included in each PCR batch. Then, the PCR mixtures were transported to the post-PCR laboratory which is physically separated from the paleogenetic laboratory. PCR was executed in the TProfessional Basic thermocycler (Biometra, Germany) at 94 • C for 5 min, followed by 50 cycles of 30 s at 94 • C, 30 s at 50 • C, 30 s at 68 • C, and 10 min at 72 • C. Only those PCR products that showed the expected gene bands in at least two replicates were selected for PCR purification and were afterward pooled in equimolar concentrations to avoid DNA concentration bias. The expected gene bands satisfied two criteria: (1) the brightest gene band for each lake sediment sample should be in the 100-200 bp range; (2) the gene bands of the associated DNA isolation control and NTC should be shorter than those of the lake sediment samples. The gene bands were displayed in 2% agarose gel. The pooled PCR products were sequenced on the Illumina HiSeq platform (2 × 125 bp, paired-end reads) at Fasteris SA sequencing service, Switzerland.

Sequencing Filtering and Taxonomic Assignment
To align, assign and filter Illumina sequences we used OBITools (Boyer et al., 2016). The main filtering procedures were: (1) paired-end sequence alignment using 'illuminapairedend' under quality control (minimal score = 40), (2) assign sequence to the corresponding sample using 'ngsfilter' , (3) group and count the identical sequences using 'obiuniq' , (4) remove short sequences (length < 10) and scarce sequences (count < 10) using 'obigrep' , (5) clean PCR/sequencing errors using 'obiclean' with the setting "-r 0.05 and -H", and (6) taxonomic assignment with 'ecotag' based on the EMBL nucleotide sequence database (version 133, Kanz et al., 2005) and the Arctic and Boreal vascular plant and bryophyte nucleotide sequence database (Sønstebø et al., 2010;Willerslev et al., 2014;Soininen et al., 2015). It should be noted that the sequence counts for each lake sediment sample is the sum of two PCR replicates because they were amplified with the same tag-combination (but in different PCR batches). To further reduce noise, sequences occurring < 10 times in each lake sediment sample were ignored (Zimmermann et al., 2017). For our purpose, we only collected sequences corresponding to the terrestrial seed plants (Spermatophyta) with the best identity value of 1 (100% matched to those sequences in taxonomic reference database). Furthermore, those sequences that were not related to plants from the catchment based on our vegetation survey and Arctic region were discarded to avoid scattered contamination. The final scientific name was decided based on two criteria: (1) the scientific name with the best identity value of 1 if the sequence is not 100% assigned to both taxonomic reference databases; (2) the scientific name with a lower taxonomic level if a sequence is 100% assigned to both taxonomic reference databases. The three sedaDNA datasets for statistical analyses are summarized in Supplementary Data 2.

Statistical Analyses
The lowest number of reads among all samples from 14OM12A is 20,241; from 14OM02B is 9433; and from 14OM20B is 3719.
We first resampled the sedaDNA reads to a base count of 10,000 with 100 repeats 1 . During the calculation, the series number of the sequence type (st.1, st.2 etc.) was attached to the corresponding raw taxon name when the raw taxon name has multiple sequence types. For example, Ranunculus.st1 and Ranunculus.st2 represent two sequences types of Ranunculus. We calculated the relative read abundance based on the rarefied sedaDNA data. The species richness and diversity was calculated using the iNEXT package (Hsieh et al., 2016). A similar approach was applied to the pollen data with a base count of 500. Accordingly, species diversity refers to the sequence diversity and taxon diversity for sedaDNA data and pollen data, respectively. The species diversity is represented by a form of Hill's numbers: q0 (rarefied species richness), q1 (the exponential form of Shannon entropy), and q2 (the inverse of Simpson's concentration).
Only those taxa that occurred at a minimum of 0.5% in five samples were used for the principal component analysis (PCA) and are shown in the stratigraphic profiles. Thus, we discuss the compositional vegetation based on these common taxa. The relative proportions of pollen data were square-root transformed and the relative proportions of sedaDNA data were double square-root transformed before PCA calculations following the suggestion by Zimmermann et al. (2017). The PCAs were performed based on a variance-covariance matrix. Zonation of pollen and sedaDNA was based on stratigraphically constrained cluster analysis (CONISS; Grimm, 1987). The taxonomy of the pollen and sedaDNA data were harmonized prior to PCA. Procrustes analysis was applied to assess the differences between pollen and sedaDNA based on the site and species scores of the first two PCA axes of the two datasets using 'procrustes()' and 'protest()' (Oksanen et al., 2019). The Spearman correlation coefficient was calculated using 'cor.test()' (R Core Team, 2019) with the Bonferroni correction to relate rarefied richness to the first PCA axes of the pollen and sedaDNA relative abundance.

RESULTS
Chronology 210 Pb/ 137 Cs dating and radiocarbon dates of the three cores are shown in Tables 1, 2. The three cores showed a relatively uniform and slow accumulation rate, particularly core 14OM20B. Thus, we assumed that the accumulation rate is stable in the upper part of the sediment core. For each core, we calculated ages for the 2.25 cm sample based on parallel dating by 210 Pb/ 137 Cs and 14 C AMS. The difference in the two ages provided an estimation of the "old carbon" effect: 475 years (14OM12A), 1,573 years (14OM02B), and 3,365 years (14OM20B). For core 14OM20B, four dates were probably out of sequence, which could be attributed to a changing "old carbon effect": they were therefore excluded. All age-depth models were established after subtraction of the "old carbon" effect (Figure 2) using 'Bacon ()' (Blaauw and Christen, 2011

SedaDNA and Pollen Assemblages
Core 14OM12A, Typical Tundra In total, 784,452 terrestrial seed-plant sequence reads were assigned to 120 terrestrial taxa, of which 42 were identified to species level. One sequence 100% assigned to Musaceae with 61 counts was excluded. Twenty-five taxa met our selection criteria, and were shown in the assemblages of sedaDNA, grouped into three zones (Figure 3, red bars). The dominant vegetation compositions showed a clear transformation from shrub-steppe (Salicaceae, Rhododendron, Rosoidaeae, Betula) at 5.7-5.2 ka to steppe (Ranunculus sceleratus.st1, Ranunculus.st2, Tephroseris, Saxifraga) at 5.2-1.6 ka and to wetland herbs with a portion of dwarf willow (Carex aquatilis, Comarum palustre, Salix) at 1.6-0 ka. Rarefied species richness (in 21-66 taxa range) and diversity indices (q1 and q2) were high before 4.4 ka, followed by a gradual decline of richness and diversity until 0.2 ka ( Figure 4A). A total of 52 pollen types were identified in this core, 14 of which met the filter criteria for statistical analyses and were shown in the pollen percentage diagram (Figure 3, blue bars). The pollen assemblages were dominated by Betula, Alnus alnobetula ssp. fruticosa, Cyperaceae, and Poaceae. There was a gradual decrease in pollen concentration and percentages of dwarf birch (Betula nana) and Poaceae since 2.0 ka, which were accompanied by an apparent increase in relative abundance of shrub alder (Alnus alnobetula ssp. fruticosa) and Cyperaceae. The rarefied species richness was in the range of 18-32 taxa, showing a slightly higher value at 5.0 ka and lower values in other periods, and species diversity was similar ( Figure 4B). The estimated species richness had large uncertainties with higher values between 5 and 4 ka ( Figure 4B).

Core 14OM02B, Forest Tundra
A total of 841,136 terrestrial seed-plant sequence reads were assigned to 90 taxa, of which 34 were at the species-level. All sequences with a best identity value 1 were kept. After filtering, 19 taxa remained and were shown in the sedaDNA assemblage (Figure 5, red bars). The floristic composition was dominated by willow shrubs (Salicaceae) and marsh cinquefoil (Comarum palustre) from 7.6 to 6.6 ka, followed by willow and moisture-adapted taxa (Koenigia islandica, Caltha) until 3.6 ka. In the late Holocene, various shrubs (Betula, Vaccinium vitisidaea, Rhododendron) and hygrophilous herbs (Carex aquatilis, Eriophorum.st2) were common occurrences and have high percentages. In addition, Larix was mostly below 1% except for two samples where it is up to 5.7% at 1.8 ka and 1.9% at 1.5 ka. Rarefied taxa richness ranged from 18 to 36 through the core and had a high value before 4.4 ka with a decline afterward, in contrast to the slight increases in the diversity indices ( Figure 4A).
Palynological analysis yielded 55 taxa in this core, of which 14 were shown in the stratigraphic profile covering the past 7.6 ka (Figure 5, blue bars). The dominance of Betula nana, Alnus alnobetula ssp. fruticosa, Poaceae, and Cyperaceae were typical of the Holocene. Pinus subgenus Strobus-type was also appearance in the pollen assemblage and showed a decrease from 7.6 to 5.0 ka and an increase from 2 to 0 ka. Alnus alnobetula ssp. fruticosa decreased in relative abundance after 6.2 ka. The rarefied species richness (in the 20-31 taxa range) did not show a clear trend, and neither did the asymptotic estimated richness that contains higher uncertainties (Figure 4B). The rarefied and estimated species diversity measures were almost constant over time.

Core 14OM20B, Open Larch Forest
After sequence filtering, 813,364 sequence reads were assigned to 117 taxa, of which 45 taxa were classified to specieslevel. Six sequences comprising of 1516 counts were excluded, which were 100% assigned to Musaceae (2 sequences), Persea, Pisum, Phaseolus and BOP clade. The percentages of 21 taxa reached the threshold for statistical analysis (Figure 6, red bars). Larix was detected in almost all samples and at the highest percentage (up to 8%) of the three cores. The whole dataset was dominated by Salicaceae sequences (average 31.6%), followed by Caltha (average 7.8%), Rhododendron (average 7.2%), and Betula (average 7.0%). In particular, the taxonomic composition was dominated by Salicaceae and Caltha from 4.8 to 3.5 ka and shifted to a mixed dwarf-shrub and herb assemblage (Salicaceae, Betula, Rhododendron, Vaccinium vitis-idaea st1, Agrostidinae, Poinae 20B) until 1.2 ka. The water sedge (Carex aquatilis) and waterside herb (Comarum palustre) were common taxa as well, with higher percentages since 1.2 ka. The rarefied species richness wavered between 22 and 55 taxa and no taxa showed particular dominance, as seen by the imperceptible changes in the diversity indices ( Figure 4A).
A total of 59 taxa based on pollen analysis were identified in this core, of which 13 were suitable for analysis. The percentage pollen diagram was dominated by Alnus alnobetula ssp. fruticosa, Betula nana, and Poaceae, with larger proportions of Larix than other cores (Figure 6, blue bars). However, Larix showed no clear trend through the core and neither does Pinus subgenus Strobus-type. From 4.8 to 3.5 ka, there was a gradual increase in the percentage of Alnus alnobetula ssp. fruticosa reaching a maximum value of 23.4%, followed by a steady decrease. The rarefied species richness showed a wave-like pattern in the range of 19-23 taxa, which was similar to the estimated richness that had extremely high upper limits ( Figure 4B).

Gradient Analysis and Correlation Analysis
PCA plots of taxa from the Omoloy cores reflected the characteristics of the vegetation transects and summarize the vegetation shifts (Figures 7A,B). The first two axes of the sedaDNA PCA together explained 51.7% of the total variance and not only distinguish typical tundra from forest tundra and open larch forest, but also distinguish these latter two vegetation types from each other. Such a latitudinal vegetation gradient simply indicated the latitudinal climatic gradient. The first and second axes of the pollen PCA explained 52.0 and 9.2% of the total variance, respectively, and show the changing dominance of pollen types. PCA axis 1 separated taxa of the typical tundra site (core 14OM12A) from the forest tundra (core 14OM02B) and open larch forest sites (core 14OM20B). However, these last two sites were not distinct from each other.
Results of the Procrustes rotation analyses and associated PROTEST ( Table 3) indicated significant correspondence in the pollen and sedaDNA sample scores (p-values < 0.05 for all three cores, with typical tundra showing the best fit (p < 0.001). There were no obvious trends in the residuals (Supplementary Figure 1). In contrast, we found no significant fit between the pollen and sedaDNA species scores ( Table 3). The residuals of Ranunculaceae and Comarum palustre were particularly high in all three cores (Supplementary Figure 1).
We further found a statistically significant but weak positive correlation between plant species richness and PCA axis 1 of sedaDNA (rho = 0.3444, p < 0.05; Table 4) and a negative correlation between plant species richness and PCA axis 2 of sedaDNA (rho = −0.4205, p < 0.05; Table 4).

Contributions of Pollen and sedaDNA to Vegetation Reconstruction and Taxon Richness
This study compares the sedaDNA data with the pollen data retrieved from three Arctic lakes collected along a vegetation transect. The sedaDNA approach yielded almost twice as many taxa as recorded by pollen in each core and classified more taxa at lower taxonomic levels ( Table 5), particularly herb taxa that can often only be identified to family level using pollen . Hence, our study supports earlier findings that for the estimation of past plant species diversity sedaDNA is superior to classical pollen analysis (Zimmermann et al., 2017;Clarke et al., 2019). The rarefaction curves (q0) show sampling saturation for sedaDNA but unsaturated sampling for pollen (Figure 8). This provides further evidence of the insufficiency of pollen to capture diversity compared to sedaDNA. The species diversity estimates are more precise using the sedaDNA approach than pollen analysis, as suggested by the lower uncertainty of the sedaDNA data (Figure 8 and Supplementary Figure 2). Moreover, high pollen producers are likely to obscure the pollenbased plant richness signals (Meltsov et al., 2011) as well as the vegetation compositional signal (Klerk et al., 2009). For instance, Betula, known to be a high pollen producer, dominates the pollen assemblages of the Omoloy area -even in the northern typical tundra site (Figure 3, blue bars), where it reaches up to 32%. By contrast, the Betula sedaDNA is prevalent in the typical tundra site before 4.4 ka (5.2-10.8%) but decreases afterward, becoming especially low in the last millenium (0.1-0.8%, Figure 3, red bars). We thus conclude that the history of high-latitude plant species diversity is better reconstructed using the sedaDNA approach at the local scale.
Larix, an important tree species in northern Siberia, has limited pollen productivity and dispersal ability, leading to its strong under-representation in pollen spectra (Pelánková et al., 2008). Therefore, its low percentage in pollen records does not necessarily reflect sparse occurrences but can actually indicate the presence of larch forests (Clayden et al., 1997;Niemeyer et al., 2015). Both the pollen and the sedaDNA data show that Larix occurs throughout the entire forest-tundra core (14OM02B) and open larch forest core (14OM20B), indicating that larch trees have been present since at least 7.6 ka. Moreover, the Larix sedaDNA and pollen grains disappeared from the typical tundra core (14OM12A) at 5.4 ka and 4.1 ka (Supplementary Data 1 and Data 2), respectively. These results confirm conclusions from northern Siberian records that invasion of Larix into typical tundra started in the early Holocene, approximately between 9 ka and 8 ka, and moved southward to modern limits by 4-3 ka (MacDonald et al., 2008). We see an increase in the percentage of Larix pollen in the last millenium in the forest-tundra core (14OM02B, Figure 5) but do not see any comparable signal in the sedaDNA records. We speculate that the increased Larix pollen abundance does not reflect an expanded larch forest in the past 1,000 years, but instead reflects a reduction in the overall vegetation cover thus favoring the northward transportation and/or increasing the proportional weighting of Larix pollen grains. Modern pollenvegetation studies note that low levels of Larix pollen can be found in typical tundra lake sediments even though Larix forests do not grow in the vicinity of the lakes (Pisaric et al., 2001). This is in line with a lack of Larix DNA in our typical tundra core (14OM12A) but frequent appearance of Larix pollen. Overall, we believe that Larix in the lake catchment can be more accurately tracked by sedaDNA than by pollen.
We notice that our pollen data have steadily high percentages of Betula nana, Alnus alnobetula ssp. fruticosa, Cyperaceae, and Poaceae, in contrast to their varying relative abundance in the sedaDNA records, confirming that the main origin of sedaDNA is not pollen but other plant components (Jørgensen et al., 2012;Pedersen et al., 2013). These differences are clearly reflected FIGURE 4 | Temporal changes in species diversity based on abundance data of sedaDNA (A) and pollen (B). The Hill numbers indicate species richness (q0), the exponential of Shannon's entropy (q1) and the inverse of Simpson's concentration (q2). The observed diversity, asymptotic estimates, and 95% confidence intervals are marked by solid lines, dashed lines, and shaded areas, respectively. Observed and asymptotic estimates overlie each other for q1 and q2.
FIGURE 5 | Percentages of selected taxa/sequences over the past 7.6 ka from the forest-tundra site (14OM02B) in the Omoloy region used in our statistical analyses, shown as red bar plots of relative read abundance of sedaDNA and blue bar plots of pollen percentages. The families detected by both of sedaDNA and pollen analyses are marked as black lines. CONISS zones for each proxy are shown on the right. by the Procrustes analyses that find no significant correlation between species scores of pollen data and sedaDNA data ( Table 3 and Supplementary Figure 1). To date, most sedaDNA studies suggest that the source of DNA sequences is more local or extralocal while pollen signals have a higher regional component (Pedersen et al., 2013;Parducci et al., 2014;Alsos et al., 2016Alsos et al., , 2018Sjögren et al., 2017). This local representation can also be inferred by high proportions of hydrophilic taxa (e.g., Salicaceae, Comarum palustre, Ericaceae; de Klerk et al., 2017;Raschke and Savelieva, 2017) in all sedaDNA records. For these  reasons, we believe that shrub alder (Alnus fruticosa), a high pollen-producing and wind-dispersed taxon, has not grown in the vicinity of northern Omoloy (typical tundra, forest tundra sites) since 7.6 ka due to a lack of Alnus fruticosa DNA in both site cores. Accordingly, pollen grains of Alnus alnobetula ssp. fruticosa likely originate from the southern region and reflect the regional pollen component (Sjögren et al., 2017). Therefore, small-scale shrub expansion should not be inferred from such pollen records, although it can be inferred from the sedaDNA record.
We find that our sedaDNA records are more sensitive to sitespecific vegetation compositional change than the pollen records (Figures 3, 5, 6). For example, typical tundra is reflected by a high proportion of water sedges with some dwarf shrubs while forest is characterized by higher percentages of erect shrubs: Alnus alnobetula is higher in the forest-tundra site (14OM02B) and the open larch forest site (14OM20B) than the typical tundra site (14OM12A). The sedaDNA approach, therefore, may be superior when aiming to reconstruct latitudinal vegetation shifts in Arctic regions, although the signal is likely to be rather local.  Pollen records are still useful when aiming to reconstruct regional vegetation change.

Variation in Holocene Vegetation Composition in the Omoloy Area, North-Eastern Siberia
Overall, the inferred vegetation changes point to landscape opening from the mid to late Holocene across all study sites, likely in response to climate cooling , and consistent with many other recent records from the Arctic Gajewski, 2015;Clarke et al., 2019;Sjögren and Damm, 2019). Vegetation composition within each lake's catchment follows its own trajectory, as revealed by nonoverlapping clusters of sedaDNA PCA sample scores ( Figure 7A).
In contrast, regional signals of vegetation change are similar for the forest tundra and open larch forest sites, as seen by the strong overlap of their pollen PCA sample scores ( Figure 7B). The vegetation history of the typical tundra site is distinctive both locally and regionally. The sedaDNA diagram for the typical tundra site (14OM12A) indicates that a shrubby community (Betula, Rhododendron, Vaccinium vitis-idaea, Salicaceae) shifted to a tundra-steppe community (Ranunculus spp., Saxifraga, Tephroseris) at approximately 5.2 ka (Figure 3, red bars), while the pollen record shows constant tundra-shrub (Betula nana, Poaceae, Cyperaceae) until 2.2 ka (Figure 3, blue bars). The expansion  of wet-sedge tundra during the last ∼2000 years is indicated by both sedaDNA and pollen. Soil moistening related to thawing permafrost, collapse of high-centered ice-wedge polygons, and related expansion of low-centered polygonal tundra is a common feature in lowland tundra areas (Ellis and Rochefort, 2006;Woo and Young, 2006;Zibulski et al., 2016). We assume that low-centered polygons expanded in the vicinity of the lake as indicated by Comarum palustre which generally inhabits depressions in polygon mires (de Klerk et al., 2011).
In the forest-tundra site (14OM02B), the vegetation was dominated by shrubs from 7.6 to 5.0 ka, as indicated by the high proportions of alder and dwarf-shrub pollen. During this stage, the vegetation within the lake catchment mainly comprised marsh cinquefoil (Comarum palustre) and dwarf shrub (Salicaceae) from 7.6 to 6.6 ka and Salicaceae shrubs with a small proportion of grasses until 5.0 ka. Alder shrub, as its low percentage of sedaDNA indicates, was rare in the lake's catchment. In the arctic, nitrogen-rich soils are common in Alnus-dominated area, whereas nitrogen-limited soil is favorable for Salix spp. growth (Binkley et al., 1994). Due to the high abundance of Salicaceae sedaDNA, we assume that the soils were nitrogen-poor, thus explaining the scarce Alnus in the catchment rather than it being a reflection of poor climate conditions. Alnus started to decline around 5 ka and shows a pronounced decrease from 2 ka, when the vegetation in the catchment shifted from Salicaceae-dominated shrubland to the modern mixed larch forest and dwarf shrubs with some herbs (Figure 5). We found a few wet sedges occurred in this typical tundra site in the late Holocene, which might be attributed to gradual permafrost thawing, and a simultaneous expansion of shrubs marked by the increased relative abundance of Betula (Blok et al., 2010).
In the open larch forest site (14OM20B), there is little change in the low proportion of Larix in either the pollen or sedaDNA record over the past 4.8 ka, suggesting that the larch forest developed its modern distribution during the mid-Holocene. This agrees with other pollen and macrofossil records, which indicate that forest retreated to its modern limits around 4.5-3.5 ka across most of Russia (Kremenetski et al., 1998;Payette et al., 2002). Alder shrub was well-established during 4.8-3.5 ka but subsequently became scarce (Figure 6), particularly in the lake's catchment after 1.2 ka (Figure 6, red bars). The decreases in Alnus pollen and sedaDNA proportions correspond well with other records from northern Siberia (Andreev et al., 2004;Salonen et al., 2011), suggesting a marked deterioration of habitat for erect shrubs in the late Holocene. According to our field observations, alder shrub grows in patches of larch forest in the uplands of this site, while Poaceae and dwarf shrubs inhabit lowland around the lake. We therefore assume that the modern landscape within the catchment was shaped around 1.2 ka.
Several studies have shown that microrelief and related small-scale hydrothermal regimes have a strong impact on the specific response of vegetation communities to climatic change (Boudreau and Rouse, 1995;Silvertown, 2004;Wolter et al., 2016;Nabe-Nielsen et al., 2017). This may explain why we find different wetting trends for the lake catchments of our three sites: an evident wetting trend in the typical tundra site since 1.6 ka, but a gentle trend in the forest-tundra site and a subtle trend in the open larch forest site. A thinner active layer and related poorer drainage in the tundra area compared to the southern forested sites could explain these differences (Nikolaev et al., 2009;Kuznetsova et al., 2010;Blok et al., 2011). The fact that turnover in taxonomic composition at each site follows its own time trajectory suggests a forcing factor other than climate changes, i.e., the effects of vegetation, soil characteristics, and topography within the lake catchment (Young et al., 1997), potentially explaining the latitudinal differences in vegetation composition and the site-specific trajectories of turnover in taxonomic composition captured by our pollen and sedaDNA results.
Most Holocene pollen records from Arctic Siberia (e.g., Andreev et al., 2001Andreev et al., , 2003Naidina and Bauch, 2001;Katamura et al., 2006;Klemm et al., 2013) document that forests extended farther north during the early to mid-Holocene, followed by treeline retreat. Our records show the same trends in principle, but they indicate that this vegetation shift was of limited spatial extent and that the timing might differ. Currently, forest tundra can be found around lake 14OM02B only 27 km away from our typical tundra site 14OM12A where, according to our sedaDNA record, Larix was not growing after 5.7 ka. This suggests that the start of forest retreat from the typical tundra site was earlier than the previous estimate of about 4.5 ka based on pollen (Kremenetski et al., 1998). The larch forest grew at the foresttundra site since 7.6 ka before retreating to its modern treeline position before 4.8 ka, as both the Larix pollen and sedaDNA indicate. As larch forest retreat is not seen in either our pollen or sedaDNA records, this suggests that a stable larch population has been established at this site since 7.6 ka. The frequent appearance of Larix DNA and pollen at the forest-tundra site also indicate that the larch forest retreated to its modern boundary before 4.8 ka. Because the taxonomic resolution of Larix is restricted to genus level, neither pollen analysis nor the sedaDNA approach can track the dynamics of individual larch species (i.e., Larix gmelinii, Larix sibirica, Larix cajanderi). However, an investigation by Schulte et al. (2020) that applied a capture approach of Larix chloroplast genomes from ancient lake sediments on the southern Taymyr peninsula, indicates that no broad-scale distributional changes of the individual Larix species are likely.

SedaDNA-Based Plant Diversity Changes Within Lake Catchments of the Omoloy Region
Our sedaDNA analyses track plant diversity variations over the mid to late Holocene. We have evidence that latitudinal alpha plant diversity is weakly linked with regional vegetation composition as indicated by the significant relationship between ordination axes and richness (Table 4). However, we find no evidence for a latitudinal increase in total plant species richness from north to south. In contrast, the highest total plant species richness is found at our typical tundra site, which is at the northernmost latitude, for before 2.6 ka, while the lowest richness is detected in the forest tundra. The open larch forest site, the southernmost region, generally has moderate total plant species richness but has larger fluctuations in both diversity indices than the other sites. Our results thus support earlier ideas that the latitudinal total plant species richness is not simply controlled by climate (Moeslund et al., 2013) but also by other factors.
For the typical tundra site, total plant species richness was highest between 5.7 ka and 4.4 ka ( Figure 4A) when the lake catchment contained diverse ecological habitats represented by dwarf shrubs (e.g., Betula nana, Salicaceae, Ericaceae spp.), riparian meadows (e.g., Ranunculus sceleratus, Caltha), open grassland (e.g., Asteraceae st2, Anthemideae, Poodieae), and patches of polygonal tundra (e.g., Comarum palustre, Carex aquatilis) (Figure 3). During this period, habitat types were generally relatively uniform, mainly dominated by dwarf-willow shrublands. Thus, our results clearly document that landscapes with a higher habitat diversity can be inhabited by more plant communities (Levine and HilleRisLambers, 2009). The corollary to this, is exemplified by the forest tundra, which has the lowest Shannon and Simpson diversity indices, possibly related to its simple Salicaceae-dominated habitat that potentially inhibits other species, forcing them to grow elsewhere to avoid extinction (Stewart and Levin, 1973).
For the late Holocene, total plant species richness and both diversity indices for the typical tundra site show a strong decline with meadows dominating the lake catchment until 1.6 ka (Figure 4A), followed by a steady decrease with a concomitant extension of wet-sedge habitat until 200 years ago (Figure 3). While high-centered polygon mires can support mesic species and various Ericaceous dwarf shrubs, low-centered polygons, particularly when the rims are flat, can mainly support only wetland species (e.g., Carex aquatilis) (Wolter et al., 2016). A decline in plant species richness during wetland taxa expansion is supported by the species richness pattern of the typical tundra site. Besides, the lowest plant species richness is found in the forest-tundra site. Flat terrain within lake catchments offer less opportunity for refugia than heterogeneous landscapes do (Clarke et al., 2019), which may explain the evident loss of species richness under the deteriorating climate in the late Holocene.
In the open forest site, plant species richness and diversity show a wave-like pattern but from around 2.5 ka have relatively higher values than the typical tundra and forest tundra sites when richness declines at these sites. We find low plant species richness in the open larch forest site at times of vegetation transition (e.g., Betula colonization at 3.2 ka; wet-sedge area expansion at 1.0 ka). One potential explanation for this pattern is that Betula and Carex aquatilis have good capacities for seed production but poor seed bank reserves leading to restricted co-existence with new species (Ebersole, 1989;Alsos et al., 2003). Furthermore, Betula nana is likely to be unable to sustain a long period of expansion under levels of abiotic stress due to its poor sexual reproduction in disturbed situations (de Groot et al., 1997). Therefore, total diversity temporarily increases with reduction of birch after 3.2 ka, composed of cold-tolerant taxa (Salicaceae, Caltha) and welladapted species (Rhododendron) recolonizing the habitats that facilitate diverse plant species (Rosenzweig, 1995).
More than half the total plant species richness has been lost from the typical tundra site since 4.4 ka, suggesting that typical tundra plant richness is most sensitive to environmental change. We infer that an increase in precipitation and related thermokarst degradation may have resulted in a decrease of herb taxa, restricted to water sedges and swamp cinquefoil. Such an outcome could be of high relevance for future richness forecasts in the Arctic. According to the latitudinal gradient hypothesis, more taxa would be expected in a diverse habitat experiencing moderate climate conditions, but our analyses suggest richness could decline in current typical tundra areas in the future.

CONCLUSION
Our aims were to reconstruct the latitudinal history of vegetation composition and diversity from the mid to late Holocene in the treeline area of northern Siberia based on pollen and sedaDNA analyses. We compared these two proxies with respect to their effectiveness at recording the long-term signals of vegetation transitions and plant diversity. Particularly, we compared the vegetation shifts and species diversity along vegetational gradients with a range from typical tundra in the north to forest tundra to open larch forest in the south.
Our study shows that inferences from the sedaDNA approach complement those based on pollen analysis. We conclude that sedaDNA proxy data can more precisely reflect variations in the local vegetation composition and are more suitable for reconstructing the history of latitudinal plant species diversity. In particular, sedaDNA analysis is a promising tool for recovering the site-specific establishment of larch forest. We believe that our sedaDNA data recorded vegetation turnover within each site's lake catchment while pollen reflected regional vegetation change.
SedaDNA results indicated that local vegetational transitions were likely driven by more than purely regional climate changes. We found that larch forest retreated earlier in the northern typical tundra site while sedge tundra developed later in the southern open larch forest site. However, this latitudinal pattern of vegetation compositional change is inconsistent with the observed variation in recorded total plant species richness and other diversity indices. We also found that water logging, likely related to permafrost degradation, resulted in a strong richness decline.

DATA AVAILABILITY STATEMENT
The raw sedaDNA sequence data and related scripts for analyzing are available at Dryad: doi: 10.5061/dryad.69p8cz900. The raw pollen counts and dating datasets are available at PANGAEA: doi: 10.1594/PANGAEA.922550.

AUTHOR CONTRIBUTIONS
UH designed this study. UH and SL led the interpretation and SL wrote this initial draft of manuscript. LP contributed to the fieldwork organization and KS-L organized the lab work. SL identified the pollen samples and performed the analyses of pollen and sedaDNA data. SL constructed the age-depth model under the guidance of UH. SK performed the rarefaction of sedaDNA data and SL did the subsequent statistical analyses. All other authors commented and provided input to the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by China Scholarship Council (201606180048 to SL), European Research Council (ERC Glacial Legacy 772852 to UH), and German Research Council (DFG EP98/3-1 to UH).

We thank Andrej Andreev and Mareike
Wieczorek for providing information about treeline location and pollen representativity of some taxa.