Modern Pollen Assemblages From Lake Sediments and Soil in East Siberia and Relative Pollen Productivity Estimates for Major Taxa

Modern pollen–vegetation–climate relationships underpin palaeovegetation and palaeoclimate reconstructions from fossil pollen records. East Siberia is an ideal area for investigating the relationships between modern pollen assemblages and near natural vegetation under cold continental climate conditions. Reliable pollen-based quantitative vegetation and climate reconstructions are still scarce due to the limited number of modern pollen datasets. Furthermore, differences in pollen representation of samples from lake sediments and soils are not well understood. Here, we present a new pollen dataset of 48 moss/soil and 24 lake surface-sediment samples collected in Chukotka and central Yakutia in East Siberia. The pollen–vegetation–climate relationships were investigated by ordination analyses. Generally, tundra and taiga vegetation types can be well distinguished in the surface pollen assemblages. Moss/soil and lake samples contain generally similar pollen assemblages as revealed by a Procrustes comparison with some exceptions. Overall, modern pollen assemblages reflect the temperature and precipitation gradients in the study areas as revealed by constrained ordination analysis. We estimate the relative pollen productivity (RPP) of major taxa and the relevant source area of pollen (RSAP) for moss/soil samples from Chukotka and central Yakutia using Extended R-Value (ERV) analysis. The RSAP of the tundra-forest transition area in Chukotka and taiga area in central Yakutia are ca. 1300 and 360 m, respectively. For Chukotka, RPPs relative to both Poaceae and Ericaceae were estimated while RPPs for central Yakutia were relative only to Ericaceae. Relative to Ericaceae (reference taxon, RPP = 1), Larix, Betula, Picea, and Pinus are overrepresented while Alnus, Cyperaceae, Poaceae, and Salix are underrepresented in the pollen spectra. 
Our estimates are in general agreement with previously published values and provide the basis for reliable quantitative reconstructions of East Siberian vegetation.


INTRODUCTION
Palaeoenvironmental studies in Siberia are important to understand climate and vegetation changes in the Northern Hemisphere. Pollen is the most widely used proxy for quantitative reconstructions of vegetation and climate in the past (Abraham et al., 2017; Sun et al., 2019; Chevalier et al., 2020; Liang et al., 2020). However, the number of quantitative climate and vegetation reconstructions available from Siberia is still low (e.g., Klimanov, 2000, 2001; Andreev et al., 2004, 2011, 2014; Andreev and Tarasov, 2013; Tarasov et al., 2013; Klemm et al., 2016; Kobe et al., 2020), partly because of a lack of modern pollen assemblages that provide the basis for the application of the modern analog technique or the generation of pollen-climate transfer functions (Overpeck et al., 1985; Magyari et al., 2014; Birks and Berglund, 2018). The relative pollen productivity (RPP) is an estimate of the pollen productivity of a taxon relative to a reference taxon, and it calibrates the relationship between vegetation cover and pollen data. Missing RPP estimates therefore limit the ability of models such as the Landscape Reconstruction Algorithm (LRA; Sugita, 2007a,b) and the Multiple Scenario Approach (MSA; Bunting and Middleton, 2009) to produce reasonable vegetation reconstructions.
Modern pollen datasets in Siberia have been published in previous studies (Tarasov et al., 2007; Müller et al., 2010; Klemm et al., 2016) and in databases such as the Eurasian Modern Pollen Database. However, compared to other Northern Hemisphere regions, studies on surface pollen are still rare for some regions such as Chukotka and central Yakutia. Moreover, only a few studies of modern pollen assemblages in arctic Siberia (Pisaric et al., 2001; Müller et al., 2010; Klemm et al., 2013, 2016) and southern Siberia (Pelánková et al., 2008) have been carried out that explore the relationships between pollen assemblages and vegetation or establish pollen-climate transfer functions.
Different pollen source areas and taphonomies can cause inconsistency in the pollen signals from different types of archives (Prentice, 1985; Minckley and Whitlock, 2000; Wilmshurst and McGlone, 2005; Klemm et al., 2013). As fossil pollen records in Siberia are mainly obtained from lake sediments, the quantitative reconstructions remain uncertain because the modern pollen datasets underpinning the reconstructions mostly originate from soils (e.g., Pelánková et al., 2008; Zhang et al., 2018; Cui et al., 2019; Geng et al., 2019). Modern pollen assemblages from lake sediments are less abundant (e.g., Clayden et al., 1996; Pisaric et al., 2001; Klemm et al., 2013, 2016; Niemeyer et al., 2015, 2017), even though lake sediments are commonly used as palaeoenvironmental archives. Since RPP estimates can vary between regions, the vast area of Siberia is still in need of RPP estimates to make better quantitative vegetation reconstructions. Hitherto, only Niemeyer et al. (2015) have provided RPP estimates for common taxa of the Siberian Arctic and investigated the differences between moss and lake pollen. However, Siberia covers a large area, and RPP information from forested boreal areas in eastern Siberia and from the Far East tundra-taiga transition area in Chukotka is completely lacking. The bias with archive type might be particularly strong in areas where taxa with strongly different transportation characteristics dominate, such as in Siberia. For instance, Larix, the dominant tree in East Siberian forests, is rare in pollen assemblages, probably because its pollen is produced in low numbers, is poorly preserved, and is poorly transported, while Alnus on the other hand is strongly overrepresented in pollen spectra (Niemeyer et al., 2015). The abundance and representation of Poaceae and Cyperaceae are also complex due to their diverse habitats (Bush, 2002; Semeniuk et al., 2006).
Here, we present the results of a pollen analysis of 48 moss/soil and 24 lake samples collected from East Siberia and RPP estimates for major plant taxa based on moss/soil samples in Chukotka and central Yakutia. The main objectives of this work are: (1) to assess how modern pollen assemblages reflect regional vegetation and climate conditions; (2) to compare and understand possible differences of pollen assemblages from lake sediments and surface soils; and (3) to obtain the relevant source area of pollen (RSAP) and RPP estimates of major taxa in East Siberia for future plant cover reconstructions.

Study Area
The study areas, Chukotka and central Yakutia, are situated in eastern Siberia (Figure 1). The elevations of the sampling sites range from 94 to 843 m above sea level. The climate is characterized by a mean annual temperature (MAT) of -14.3 to -11.7 °C in Chukotka and -9.5 to -5.1 °C in central Yakutia, with the lowest temperatures in January and the highest in July. The mean annual precipitation (MAP) ranges from 183 to 274 mm in Chukotka and from 240 to 477 mm in central Yakutia (Matsuura and Willmott, 2018a,b).

Sample Collection
A total of 72 surface samples were collected from both study areas: 30 moss/soil samples and 16 lake surface samples from Chukotka and 18 moss/soil samples and 8 lake surface samples from central Yakutia, in July 2016 (Overduin et al., 2017) and in July and August 2018. Coordinates of the sampling sites were obtained with a hand-held Global Positioning System (GPS) device. Vegetation and major plant taxa were described in the field. Ecoregions, including Floodplain and Anthropogenic Meadows, Mountain Tundra, Open Woodlands, and Middle Taiga, were extracted for each sampling site from a 1:4 million scale vegetation map for the land area of the former Soviet Union (Stone and Schlesinger, 2004) using ArcGIS 10.3. Detailed information on the 72 sites is presented in Table 1.

Pollen Analysis
Pollen samples were taken from the first centimeter of the soil/moss polster below the plant litter, while the lake sediment samples were taken from the uppermost centimeter of the lake cores. The samples were weighed (approximately 1-2 g for moss/soil samples and 3 g for lake samples), and a tablet with Lycopodium spores was added to the sample for the estimation of pollen concentrations (Stockmarr, 1971). Each sample was sieved to remove moss/plant residues and coarse particles, and was processed following a modified acetolysis procedure (Faegri et al., 2000), including HCl, NaOH, HF, and acetolysis treatments. The residue was then sieved through a 10 µm mesh. Waterfree glycerol was used for sample storage and preparation of the microscopic slides.
At least 300 terrestrial pollen grains were counted and identified in each sample under a microscope at 400× magnification using published pollen atlases and identification keys (Reille, 1992, 1995, 1998; Wang, 1995). Pollen percentages were calculated based on the total number of terrestrial pollen grains. Tilia software (Grimm, 2004) was used to plot the results as a pollen percentage diagram (Figure 2).
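The two quantities described above, pollen percentages and marker-based pollen concentrations, can be illustrated with a minimal sketch. It assumes the usual exotic-marker ratio for the concentration estimate; the function names, the counts, and the number of Lycopodium spores per tablet are hypothetical and are not data from this study.

```python
def pollen_percentages(counts):
    """Percentage of each taxon relative to the total terrestrial pollen sum."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty pollen count")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def pollen_concentration(pollen_counted, marker_counted, marker_added, sample_mass_g):
    """Grains per gram, estimated from the ratio of pollen to exotic marker spores."""
    return pollen_counted / marker_counted * marker_added / sample_mass_g


# Hypothetical counts for a single sample (illustration only).
counts = {"Betula": 150, "Alnus": 90, "Poaceae": 60}
percentages = pollen_percentages(counts)

# marker_added is an assumed value; the real number depends on the tablet batch.
concentration = pollen_concentration(
    pollen_counted=300, marker_counted=100, marker_added=20000, sample_mass_g=2.0
)
print(percentages, concentration)
```

With these invented numbers, Betula makes up half of the pollen sum and the concentration is 30,000 grains per gram.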

Vegetation Survey
For the moss/soil samples collected in 2018 (n = 48), a vegetation survey around the sampling point was carried out following the steps below. The vegetation within 0-2 m was surveyed in concentric rings of 0.5 m increments. Plant taxa composition was estimated as total cover in percentage. To investigate the vegetation within 2-100 m, vegetation plots were overflown with a consumer-grade drone and Survey Red Green Blue (RGB) [...]. The distribution of plant communities was mapped by a maximum likelihood classification of the layers using the superClass function in the "RStoolbox" package (Leutner et al., 2019) in R (R Core Team, 2019). The training dataset and the cover of plant taxa for each community were visually selected and estimated in the field using 2 m × 2 m quadrats. The cover of major taxa was extracted in concentric rings of 1 m increments within the first 25 m and extrapolated to a 100 m radius using the averaged taxa cover. The vegetation within 100-3000 m was inferred from a land-cover classification at 20 m resolution based on Sentinel-2 satellite image data (European Space Agency, ESA). The training dataset of land-cover types was created based on the 2018 vegetation plots (van Geffen et al., 2021a,b). Plant cover data within the different radii were extracted from these vegetation class maps within a 3000 m radius of each sampling site. Plant taxa composition in each land-cover type was estimated based on the taxa cover extracted from the drone-based orthomosaics. For Chukotka, the classification of the Sentinel-2 satellite data was performed with machine learning algorithms from the scikit-learn library in Python (Pedregosa et al., 2011). Several algorithms were trained, including Random Forest, Decision Tree, and Gaussian Naive Bayes, and the best performing algorithm was selected per region based on the mean accuracy over all classes.
For all 35 vegetation plots in Chukotka, the vegetation classes were classified together using the K-Nearest Neighbours algorithm, with an accuracy score of 82%. For the vegetation plots in central Yakutia, the classes were more diverse and there were too few vegetation plots per class for a robust classification. Instead of a supervised classification, we defined value ranges of the Normalized Difference Vegetation Index, NDVI = (Near Infrared − Red)/(Near Infrared + Red), and of the relative absorption depth (Murphy, 1995) of the red chlorophyll absorption band, (Green + Near Infrared)/(2 × Red) (European Space Agency, 2015), for each vegetation class, and assigned classes from the NDVI and pigment absorption depth maps.
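The threshold-based assignment used for central Yakutia can be sketched as follows. This is only an illustration: it assumes the two indices are read as (NIR − Red)/(NIR + Red) and (Green + NIR)/(2 × Red), and the reflectance values, class names, and value ranges are hypothetical rather than the ranges used in this study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def red_absorption_depth(green, nir, red):
    """Mean of the two shoulder bands divided by the red absorption band."""
    return (green + nir) / (2.0 * red)


def assign_class(ndvi_value, depth_value, class_ranges):
    """Return the first class whose NDVI and depth ranges both contain the pixel."""
    for name, ((n_lo, n_hi), (d_lo, d_hi)) in class_ranges.items():
        if n_lo <= ndvi_value < n_hi and d_lo <= depth_value < d_hi:
            return name
    return "unclassified"


# Hypothetical class ranges: (NDVI range, absorption-depth range).
CLASS_RANGES = {
    "closed_canopy": ((0.6, 1.0), (3.0, 10.0)),
    "open_canopy": ((0.3, 0.6), (1.5, 3.0)),
}

# Hypothetical surface reflectances for one pixel.
green, red, nir = 0.08, 0.05, 0.35
pixel_ndvi = ndvi(nir, red)
pixel_depth = red_absorption_depth(green, nir, red)
pixel_class = assign_class(pixel_ndvi, pixel_depth, CLASS_RANGES)
print(pixel_ndvi, pixel_depth, pixel_class)
```

A rule table of this kind needs no per-class training plots, which is the situation described for central Yakutia, but the ranges themselves still have to be chosen from field knowledge.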
To create vegetation input files for the Extended R-Value (ERV) models, the mean absolute cover (in m² m⁻²) of the plant taxa for ERV analysis was calculated for each chosen distance increment. In this study, we used different increments within different distances (1 m increment for the first 25 m radius, 5 m increment for between 20 and 100 m, 10 m increment for 100 to 1000 m, and 50 m increment for 1000-3000 m). We assume homogeneous vegetation cover in each concentric ring, as this is an assumption of data analysis using ERV models and related pollen-dispersal functions.
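The ring scheme described above can be written down as a short sketch. It assumes that the 5 m rings start where the 1 m rings end (at 25 m), which is one reading of the increments given in the text, and it treats cover as homogeneous within each ring. The helper names and the example cover values are hypothetical.

```python
import math


def ring_edges(segments):
    """Build ring boundaries from (outer_limit_m, step_m) pairs in increasing order."""
    edges = [0]
    for outer, step in segments:
        while edges[-1] < outer:
            edges.append(min(edges[-1] + step, outer))
    return edges


def ring_area(inner, outer):
    """Area of an annulus in square metres."""
    return math.pi * (outer**2 - inner**2)


def mean_cover(rings, covers):
    """Area-weighted mean cover over a set of rings with homogeneous cover each."""
    areas = [ring_area(inner, outer) for inner, outer in rings]
    return sum(c * a for c, a in zip(covers, areas)) / sum(areas)


# Assumed segments: 1 m steps to 25 m, 5 m to 100 m, 10 m to 1000 m, 50 m to 3000 m.
SEGMENTS = [(25, 1), (100, 5), (1000, 10), (3000, 50)]
edges = ring_edges(SEGMENTS)
rings = list(zip(edges[:-1], edges[1:]))

# Hypothetical cover (m2 per m2) of one taxon in the two innermost rings.
inner_mean = mean_cover(rings[:2], [0.5, 0.0])
print(len(rings), edges[-1], inner_mean)
```

Because ring area grows with distance, a taxon covering half of the innermost 1 m ring contributes only 0.125 to the mean cover within 2 m in this example.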

Climate Data
MAT, MAP, mean summer temperature (MST), mean winter temperature (MWT), mean summer precipitation (MSP), and mean winter precipitation (MWP) over 30 years and 100 years were interpolated using the weighted mean method from Terrestrial Air Temperature: 1900-2017 Gridded Monthly Time Series Version 5.01 (Matsuura and Willmott, 2018a) and Terrestrial Precipitation: 1900-2017 Gridded Monthly Time Series Version 5.01 (Matsuura and Willmott, 2018b) in R software.

Numerical Analysis
Taxa that occurred in at least three samples were used in the numerical analysis. A square root transformation of all the data was performed before all the numerical analyses to normalize skewed distributions and reduce the effect of extreme values.
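The two preprocessing steps can be sketched as below: taxa are kept only if they occur in at least three samples, and the retained percentages are square-root transformed. The sample values are invented for illustration, and the function name is hypothetical.

```python
import math


def filter_and_transform(samples, min_occurrences=3):
    """Keep taxa present in >= min_occurrences samples, then sqrt-transform."""
    taxa = {taxon for sample in samples for taxon in sample}
    kept = sorted(
        taxon
        for taxon in taxa
        if sum(1 for sample in samples if sample.get(taxon, 0) > 0) >= min_occurrences
    )
    matrix = [[math.sqrt(sample.get(taxon, 0.0)) for taxon in kept] for sample in samples]
    return kept, matrix


# Hypothetical pollen percentages for four samples.
samples = [
    {"Betula": 49.0, "Larix": 4.0, "Rumex": 1.0},
    {"Betula": 36.0, "Larix": 0.0},
    {"Betula": 25.0, "Larix": 9.0},
    {"Betula": 16.0, "Larix": 1.0},
]
kept_taxa, transformed = filter_and_transform(samples)
print(kept_taxa)
print(transformed)
```

Here Rumex is dropped because it occurs in a single sample, while Larix is kept because its zero value in the second sample does not count as an occurrence.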
Ordination analyses were used to investigate the main structure in the pollen data and its relation to vegetation type and environmental variables. A detrended correspondence analysis (DCA) was initially performed to estimate the underlying linearity of the data. The results of the DCA showed that the gradient lengths of the first four axes were less than 2.1 standard deviation units, suggesting linear underlying responses. Accordingly, the linear methods Principal Component Analysis (PCA) and Redundancy Analysis (RDA) were chosen to assess how well the pollen assemblages characterize the different vegetation types (Ter Braak and Prentice, 1988). The correlations between pollen assemblages and climate variables are explored in the RDA. Analyses were implemented using the "vegan" package in R (Oksanen et al., 2019).
PROCRUSTES rotation analysis (Peres-Neto and Jackson, 2001) was performed to compare the species scores of the PCA results for pollen data from different sample types using the "vegan" package in R (Oksanen et al., 2019). This analysis was to investigate the similarity and correlation between pollen data from moss/soil and lake samples. The non-randomness (significance) between the tested datasets was assessed by PROTEST (Jackson, 1995;Niemeyer et al., 2017).

Extended R-Value Analysis
Besides pollen and vegetation data, the ERV model requires the fall speed of pollen (FSP) and wind speed as input. We used a constant wind speed of 2.1 m s⁻¹, which is the mean wind speed calculated from the Global Surface Hourly dataset (1988-2018, NOAA National Centers for Environmental Information, 2001) based on selected weather stations in the study areas. FSP values for the selected taxa (Table 2; e.g., 0.022 for Salix, from Gregory, 1973) were taken from the literature (Eisenhut, 1961; Gregory, 1973; Sugita et al., 1999; Broström et al., 2004; Li et al., 2015). The ERV.Analysis.v1.3.1 program (Sugita, unpublished) was used to estimate RPP for the selected taxa in Chukotka and central Yakutia. This program provides four pollen dispersal models, including the Prentice-Sutton distance-weighting method (Prentice's model), which was chosen as the best fit for this study. We ran the ERV analyses using selected moss/soil sampling sites from Chukotka (14 sites) and central Yakutia (17 sites). We selected taxa that occurred sufficiently frequently in both the pollen and vegetation data for most sites and that are characterized by between-sample variation in pollen percentages and vegetation abundances (Li et al., 2017). Ericaceae was present in adequate quantities in the pollen assemblages and vegetation of most sites, with a wide variation in abundance between sites in both areas. Poaceae is the most common reference taxon in studies of RPPs. Therefore, Ericaceae (Chukotka, Yakutia) and Poaceae (Chukotka) were selected as the reference taxa to run the ERV model, and pollen productivity for the other taxa was estimated relative to the productivity of these reference taxa. We first ran the ERV model using all sites with available vegetation and pollen data to assess the log-likelihood curves, identify the RSAP, and evaluate the pollen-vegetation relationships.
The taxa with non-linear relationships and the site outliers in terms of regional vegetation composition and structure were excluded and then a second ERV analysis was conducted. We also explored the effects of including different numbers of taxa in the ERV analysis, repeating calculations with 6, 7, or 8 taxa, and found that the results from the analysis of 6 taxa (Betula, Cyperaceae, Ericaceae, Larix, Poaceae, Salix) for Chukotka and 7 taxa (Alnus, Betula, Cyperaceae, Ericaceae, Larix, Picea, Pinus) for central Yakutia are most reasonable.
All three sub-models of the ERV model were tried in the analysis. The input pollen and vegetation datasets for ERV sub-models 1 and 2 were percentages, while sub-model 3 used vegetation datasets expressed as absolute abundance (m² m⁻²). ERV sub-models 1 and 2 assume the background pollen to be a constant percentage (Parsons and Prentice, 1981) and a constant proportion of total plant abundance (Prentice and Parsons, 1983), respectively. Sub-model 3 assumes that the background pollen comes from beyond the RSAP. The RSAP was defined visually from a log-likelihood curve as the distance at which the values, increasing with distance, reached an asymptote. The RPP of each taxon was estimated as the average of the values at all distances greater than the RSAP.
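The last step, averaging the RPP values obtained at distances beyond the RSAP, can be sketched as follows. The distance series is invented for illustration, and the RSAP is passed in as a number because it was read visually from the log-likelihood curve rather than computed; the function name is hypothetical.

```python
from statistics import mean


def rpp_beyond_rsap(estimates_by_distance, rsap_m):
    """Average the RPP estimates at all distances greater than the RSAP."""
    values = [rpp for distance_m, rpp in estimates_by_distance if distance_m > rsap_m]
    if not values:
        raise ValueError("no estimates beyond the RSAP")
    return mean(values)


# Hypothetical (distance in m, RPP estimate) pairs for one taxon.
series = [(500, 2.4), (1000, 2.0), (1500, 1.9), (2000, 1.8), (2500, 1.7)]
rpp = rpp_beyond_rsap(series, rsap_m=1300)
print(round(rpp, 2))
```

Only the three estimates beyond 1300 m enter the average in this example, so values obtained before the curve levels off do not influence the result.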

Pollen Assemblages
A total of 46 pollen and spore taxa were identified in the 72 surface samples, 31 of which occurred in at least three samples. The overall dominant taxa are Alnus, Artemisia, Betula, Cyperaceae, Ericaceae, Larix, Poaceae, Pinus, and Salix.

Vegetation Data
A total of 46 harmonized taxa (some taxa were combined in order to correspond to pollen types) were recorded in the field survey. In the final vegetation dataset employed for the ERV modeling, 11 land-cover types were included (Table 3). For the Chukotka area, graminoid tundra, forest and shrub tundra, and prostrate herb tundra were adopted.

Pollen-Vegetation-Climate Relationships of Different Sediment Types
To explore the relationships between pollen assemblages, vegetation types, and climate variables, RDA was performed with climate variables as constraining variables (Figure 4). The first two RDA axes explain 50.46% (axis 1: 43.75%, axis 2: 6.71%) of the total variance observed in the pollen assemblages from moss/soil samples, while the first two axes of the RDA based on lake samples explain 64.11% (axis 1: 50.29%, axis 2: 13.82%) of the total variance.
The first axes of the lake and moss/soil datasets separate the samples from Chukotka and central Yakutia. The taxa scores of axis 1 reflect the pollen contents of the main taxa for different vegetation types, i.e., Pinus, Picea, and Larix (positive RDA scores) and Ericaceae, Betula, Alnus, Salix, and Cyperaceae (negative RDA scores) in both analyses with different scores.
Land-cover types: 1 - Graminoid tundra, 2 - Forest tundra and shrub tundra, 3 - Prostrate herb tundra, 4 - Open canopy pine and larch with lichen, 5 - Open canopy pine, 6 - Closed canopy pine, 7 - Open canopy mixed forest, 8 - Closed canopy mixed forest, 9 - Open canopy larch, 10 - Closed canopy larch, 11 - Closed canopy spruce. Types 1-3 occur in Chukotka and 4-11 in central Yakutia.
The land-cover type IDs are described in Table 3: 0 - Water body, 6 - Closed canopy pine, 7 - Open canopy mixed forest, 9 - Open canopy larch, 11 - Closed canopy spruce.
All climate variables are positively related to axis 1 with high scores. RDA1 scores of constraining variables are 0.9426 and 0.9327 (MAT) and 0.8948 and 0.9083 (MAP) for moss/soil and lake datasets, respectively. MAT and MAP are shown in the RDA plots (Figure 4). The first RDA axis apparently reflects temperature and moisture gradients as shown by the projections in both plots. The RDA plot of moss/soil samples indicates that temperature is the main controlling factor of the changes in pollen assemblages as the arrow of MAT is parallel to the first axis (Figure 4). Moreover, the relationship between the climate variables and pollen assemblages of lake samples is stronger than for the moss/soil samples since the first two RDA axes of lake samples explain more of the total variance (64.11%) than the moss/soil samples (50.46%). In general, the samples from Chukotka are associated with lower temperature and precipitation, whereas the samples from central Yakutia are strongly related to higher temperature and precipitation.
PROCRUSTES rotation analyses and PROTEST were performed to find the best fit in a statistical sense between PCA taxa scores of moss/soil and lake samples (Figure 5).
The results indicate a significant accordance between the PCA taxa scores of the different sample types. The PROCRUSTES rotation sum of squares (m12) is 0.378 and the root mean square error (RMSE) is 0.1104. Correlation between the two ordination results (r = 0.7887) is high. Pollen taxa residuals of Rumex, Poaceae, Alnus, Larix, and Asteraceae between different sample types are above 0.15 (Figure 5), showing a lack of consistency for these taxa in the tested datasets.
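The reported statistics are mutually consistent, as a short arithmetic check shows. The sketch assumes the conventions of the "vegan" implementation, where the PROTEST correlation is the square root of one minus the symmetric Procrustes sum of squares and the RMSE is the square root of the sum of squares divided by the number of points. The number of points is taken here to be the 31 taxa retained for numerical analysis, which is an assumption.

```python
import math

m12 = 0.378   # Procrustes sum of squares reported in the text
n_taxa = 31   # assumed number of taxa scores compared (taxa in >= 3 samples)

r = math.sqrt(1.0 - m12)        # correlation-like PROTEST statistic
rmse = math.sqrt(m12 / n_taxa)  # root mean square error of the residuals

print(round(r, 4), round(rmse, 4))
```

Under these assumptions the script reproduces the values given above (r = 0.7887, RMSE = 0.1104).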

Chukotka
Of the three sub-models, sub-model 1 was excluded first based on the plots of log-likelihood against distance (Figure 6). The RSAP, taken as the distance at which the log-likelihood approached an asymptote, is ca. 500 m with sub-model 3 and ca. 1300 m with sub-model 2 (Figure 6A). Ericaceae and Betula exhibit a relationship that is closest to an ideal linear relationship between ERV-adjusted pollen proportions and vegetation proportions using sub-model 2 (Supplementary Figure 1A).
The RPP estimates relative to both Poaceae and Ericaceae for 6 taxa and their standard deviations (SDs) were calculated using two ERV sub-models (Table 4). ERV sub-model 2 mostly produces higher RPPs than sub-model 3, except for Cyperaceae. The ranking of the RPPs is the same for both ERV sub-models: Betula > Larix > Ericaceae > Poaceae > Salix > Cyperaceae, with Ericaceae having an RPP of 1 (Figure 7). Cyperaceae and Salix have very low RPPs, but their large standard deviations suggest that these values may not be credible.

Central Yakutia
Of the three sub-models, sub-model 1 was excluded first based on the plots of log-likelihood against distance (Figure 6). The RSAP, taken as the distance at which the log-likelihood approached an asymptote, is ca. 250 m with sub-model 3 and ca. 360 m with sub-model 2 (Figure 6B). Picea and Pinus exhibit a relationship that is closest to an ideal linear relationship between ERV-adjusted pollen proportions and vegetation proportions using sub-model 2 (Supplementary Figure 1B). The RPP estimates relative to Ericaceae for 7 taxa and their standard deviations (SDs) were calculated using two ERV sub-models (Table 4). ERV sub-models 2 and 3 produce broadly similar RPPs. The ranking of the RPPs is the same for both ERV sub-models: Pinus > Larix > Picea > Ericaceae > Betula > Alnus > Cyperaceae, with Ericaceae having an RPP of 1 (Figure 7). Cyperaceae and Alnus have very low RPPs, but their large standard deviations suggest that these values may not be credible. Betula also has a very large standard deviation.

Pollen-Vegetation-Climate Relationship
Our surface pollen assemblages reflect the vegetation types well in terms of dominant taxa in the study regions. Furthermore, pollen spectra from Middle Taiga and Mountain Tundra are unique as revealed by RDA. Pinus and Picea have positive RDA scores (Figure 4), as do most samples from central Yakutia, reflecting the Middle Taiga ecoregion. Ericaceae, Betula, Alnus, Salix, and Cyperaceae have negative RDA scores, reflecting the ecoregion of Mountain Tundra from Chukotka. The samples from Open Woodlands are scattered around the center of the RDA plots, which is reasonable since this represents a spatially and ecologically transitional vegetation type between tundra and taiga. There are only two sites that are inconsistent with this pattern: EN18008 and EN18065. The vegetation survey of EN18008 found Pinus pumila shrubs, which has resulted in a high proportion of Pinus in the pollen data and may explain why it was distributed among the samples from central Yakutia in the RDA plot. Our ordination results are similar to other studies. For example, Pelánková and Chytrý (2009) compare proportions of plant species in actual vegetation and their pollen types in surface pollen spectra along transects in the steppe, forest, and tundra of the valleys of the Russian Altai Mountains and conclude that pollen taxa abundances of Betula nana, Larix, Picea, and Salix are significantly correlated to the surrounding vegetation.
FIGURE 7 | Relative pollen productivity estimates and errors for selected taxa, using plant cover data weighted by two ERV sub-models.
TABLE 5 | Relative pollen productivity (RPP) estimates rescaled relative to Ericaceae (RPPs from Changbai Mt. and Germany were rescaled based on the relationship between their original reference taxa and Ericaceae in this study) in previously published studies for selected taxa that are compared with this study.
In a PCA of modern pollen spectra in north-eastern Siberia from Klemm et al. (2013), regional differences between tundra and taiga are reflected. Betula and Ericaceae in Chukotka and Picea and Pinus in central Yakutia exhibit relationships that are closest to a perfect linear relationship (Supplementary Figure 1). It is common in RPP studies that only a few taxa can fit the theoretical ERV-model linear relationship. A non-perfect relationship might be due to the discrepancy between pollen productivity and pollen dispersion, leading to unevenness in the pollen data and vegetation data (e.g., Alnus, Salix, Cyperaceae). Other factors influencing the pollen-vegetation relationship may be stochastic pollen dispersal processes, for example by insects rather than wind transport, or pollen transported in clumps rather than as single grains (Tufto et al., 1997;Theuerkauf et al., 2013;Li et al., 2017).
Pollen composition is numerically related to temperature and precipitation (Figure 4). Pinus and Picea are characteristic of the samples from central Yakutia with relatively high MAT, whereas Betula, Cyperaceae, and Ericaceae, are indicator taxa in samples from Chukotka and dominate under low MAT. This is in accordance with results from previous studies of Siberian surface pollen spectra (e.g., Pelánková and Chytrý, 2009;Klemm et al., 2013). It suggests that the relationship between pollen assemblages and climate variables is rather tight, and that palaeoclimates can be reasonably accurately reconstructed from fossil pollen records.
Although these correlations may provide climate indicators, the relationships between climate variables and a single taxon can be different in different areas. For example, Pelánková et al. (2008), based on their studies in southern Siberia, suggest that a high proportion of Pinus pollen indicates low summer temperatures and higher precipitation and the occurrence of Larix pollen indicates low winter temperatures and low precipitation, which contrasts with the positive correlation between MAT and Pinus and the insignificant correlation between Larix and temperature in this study. This contrast can be attributed to the range of climate variables and vegetation types in geographically different areas.

Pollen Assemblages in Different Sediment Types
PROCRUSTES analysis of PCA taxa scores reveals that pollen data from moss/soil and lake samples have a similar distribution (m12 = 0.378, r = 0.7887). However, there are some differences in the representation of several pollen taxa originating from different sediment types as indicated by, for example, the high PROCRUSTES residuals of Poaceae, Alnus, Larix, and Salix (Figure 5). The pollen percentages of Larix in lake samples from central Yakutia are generally larger than those from the moss/soil samples (Supplementary Figure 2). This finding is in accordance with the RPP estimates from moss surface and lacustrine surface-sediment samples in arctic Siberia (Niemeyer et al., 2015). The pollen percentages of Ericaceae differed noticeably between sample types in terms of mean value and range of variation (Supplementary Figure 2). The variation of pollen assemblages in different sediment types may be attributed to the size of the source area, which depends on the diameter of a sampling site (Sugita, 1993, 1994). Pollen data from large lakes show little site-to-site variation even if vegetation is highly heterogeneous, compared with pollen data from sites with a smaller source area (Sugita, 2007a,b). Therefore, the majority of the pollen in moss/soil samples is likely to originate from local plants, while the lake surface sediments contain pollen from a larger source area (Bunting, 2002; Zhao et al., 2009). Lisitsyna et al. (2012) state that lake sediment samples from mountain birch woodland tend to overestimate pine and underestimate birch, while those from pine forest tend to have higher birch percentages than moss samples, which aligns with our results.
The vegetation type of the sampling sites from central Yakutia is Middle Taiga, with more trees and fewer herbs than the Mountain Tundra and Open Woodlands of the sampling sites from Chukotka. Lake samples are characterized by higher amounts of Poaceae and Cyperaceae, especially in samples from central Yakutia (Supplementary Figure 2). This might be explained by the higher presence of sedges around the lakes. The Poaceae family includes a number of plant species with very different ecological tolerances. As most fossil pollen records are collected from wet settings, it is difficult but critically important that palynologists recognize whether Poaceae pollen is derived from meadows and marshes that surround their coring site. Interpretations of the pollen signals of Poaceae are likely to overstate "dry" episodes (Bush, 2002).

Relative Pollen Productivity Estimates and Relevant Source Area of Pollen
Based on the plots of log-likelihood against distance from the sampling point (Figure 6), sub-model 2 provided the most satisfactory curve, reaching an asymptote with the most constant values for both regions. Therefore, sub-model 2 is the most appropriate model to estimate the RSAPs and RPPs for Chukotka and central Yakutia. Estimated RSAPs for moss/soil samples are 1300 m for the Chukotka tundra and 360 m for the central Yakutia taiga, and are similar to other estimates (see Table 5). Several studies (Sugita et al., 1999; Bunting et al., 2004; Broström et al., 2005) suggest that different RSAPs are caused by the distribution and size of the vegetation patches when the basin size is constant. Larger patches and grids will lead to an increase in RSAP. Other factors such as the taxa included in the analysis and the method used to select sample locations can also influence the estimated RSAP (Broström et al., 2005; Nielsen and Sugita, 2005). Differences in RSAPs may also be due to the number of taxa included and the plant characteristics of the studied landscape (Li et al., 2017). Mixed herb-tree vegetation may have a larger RSAP than all-herb or all-tree vegetation (Bunting and Hjelle, 2010). In our case, fewer taxa are included in the ERV analysis of sites from Chukotka than in that of sites from central Yakutia, and the vegetation type in Chukotka is mostly tundra with mixed herb and tree taxa, while the vegetation in central Yakutia is mainly different types of forest. Moreover, large patch size and openness can also lead to a relatively larger RSAP in Chukotka.
Even for the same taxa such as Betula (RPPchukotka: 1.80 ± 0.15, RPPyakutia: 1.02 ± 3.76) and Larix (RPPchukotka: 1.40 ± 0.24, RPPyakutia: 4.23 ± 2.24), the RPP estimates show variation. The reliability of RPP estimates can be assessed by their standard deviations. If the SD is larger than the RPP value, the estimated RPP is not different from zero and should be considered unreliable (Li et al., 2017). In our study, the large standard deviations for Alnus (RPPyakutia: 0.54 ± 0.97) and Betula (RPPyakutia: 1.02 ± 3.76) for Yakutia and for Cyperaceae (RPPchukotka: 0.00001 ± 0.00430, RPPyakutia: 0.01 ± 3.04) for both areas suggest that these values may not be credible. We can also compare the results from sub-model 2 with those from sub-model 3. Sub-model 2 (Prentice and Parsons, 1983) was developed for datasets where both pollen and vegetation data are available as percentages, while sub-model 3 (Sugita, 1994) can be used if absolute vegetation abundance (m² m⁻²) is known. Sub-model 2 assumes that the background pollen of each taxon is a constant proportion of total plant abundance (Prentice and Parsons, 1983) and sub-model 3 assumes that the background pollen comes from beyond the RSAP. Large variation in total plant abundance among sites may result in poorer estimates from sub-model 2 (Prentice and Parsons, 1983). The taxa with similar RPPs in both sub-models are Betula, Larix, and Poaceae for Chukotka and Larix, Picea, and Pinus for central Yakutia. The RPP of Salix (0.14, 0.0006) differs markedly between the two sub-models, hence it should be used with care.
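The reliability criterion described above (an estimate is treated as unreliable when its SD exceeds the RPP value) can be applied mechanically to the values quoted in this section. The sketch below uses only the sub-model 2 estimates given in the text; the dictionary layout and variable names are illustrative choices.

```python
# (region, taxon) -> (RPP relative to Ericaceae, standard deviation), from the text.
estimates = {
    ("Chukotka", "Betula"): (1.80, 0.15),
    ("Chukotka", "Larix"): (1.40, 0.24),
    ("Chukotka", "Cyperaceae"): (0.00001, 0.00430),
    ("Yakutia", "Betula"): (1.02, 3.76),
    ("Yakutia", "Larix"): (4.23, 2.24),
    ("Yakutia", "Alnus"): (0.54, 0.97),
    ("Yakutia", "Cyperaceae"): (0.01, 3.04),
}

# An estimate whose SD exceeds its value is not distinguishable from zero.
unreliable = sorted(key for key, (rpp, sd) in estimates.items() if sd > rpp)

for region, taxon in unreliable:
    print(region, taxon)
```

The check flags Cyperaceae in both regions and Alnus and Betula in central Yakutia, matching the taxa singled out in the text.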
To compare our results with other studies, RPP estimates from previously published studies were rescaled relative to Ericaceae for major taxa (Table 5 and Figure 8). The large differences between RPP estimates may be due to differences in species and vegetation types between regions. Differences in the methodology for vegetation data collection (Bunting and Hjelle, 2010) and in the fall speed of pollen can also influence the RPP estimates. Our RPP estimates for Betula (RPPchukotka: 1.80 ± 0.15, RPPyakutia: 1.02 ± 3.76) and Poaceae (RPPchukotka: 0.64 ± 0.18) are relatively small, while our RPP estimates for Picea (RPPyakutia: 2.18 ± 0.53) and Pinus (RPPyakutia: 10.38 ± 3.75) are relatively large compared with other studies. Although Larix is normally regarded as a strongly underrepresented taxon, our RPP for Larix is not as low as the value from moss/soil samples in Niemeyer et al. (2015) and is similar to values from other studies (e.g., Matthias et al., 2012; Zhang et al., 2017). In addition, RPP estimates for Larix (RPPchukotka: 1.40 ± 0.24, RPPyakutia: 4.23 ± 2.24) vary not only within our own study but also among other studies. Nevertheless, most RPP estimates of selected taxa in this study are comparable with published values, except those for Alnus, Cyperaceae, and Salix. The large standard deviations and/or the dissimilar values returned by the different sub-models may indicate that our results for these taxa are of limited reliability. Estimates of RPP values can be tested by using them in pollen-based reconstructions of modern or palaeovegetation.
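Rescaling RPPs to a common reference taxon amounts to dividing every value by the RPP of the new reference taxon. The minimal sketch below illustrates this arithmetic with the Chukotka values quoted in the text. The function name `rescale_rpp` is an assumption, uncertainties are not propagated, and the Poaceae-referenced numbers it produces are an arithmetic illustration only, not the separately estimated Poaceae-referenced RPPs of this study.

```python
def rescale_rpp(rpps, new_reference):
    """Express all RPPs relative to `new_reference` (whose RPP becomes 1)."""
    ref = rpps[new_reference]
    if ref <= 0:
        raise ValueError("The reference taxon must have a positive RPP")
    return {taxon: value / ref for taxon, value in rpps.items()}

# Chukotka RPPs relative to Ericaceae (values quoted in the text; SDs omitted).
chukotka_vs_ericaceae = {
    "Ericaceae": 1.0,
    "Betula": 1.80,
    "Larix": 1.40,
    "Poaceae": 0.64,
}

# Re-express relative to Poaceae, then convert back to the Ericaceae reference.
vs_poaceae = rescale_rpp(chukotka_vs_ericaceae, "Poaceae")
back_to_ericaceae = rescale_rpp(vs_poaceae, "Ericaceae")

for taxon, value in vs_poaceae.items():
    print(f"{taxon:10s} RPP relative to Poaceae: {value:.2f}")
```

Because rescaling is a simple division, converting to another reference taxon and back recovers the original values; a full comparison would also need to propagate the standard deviations.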

CONCLUSION
Our study reveals that the surface pollen assemblages from Chukotka and central Yakutia numerically reflect the main vegetation types as well as the climate, particularly temperature.
Pollen assemblages from moss/soil and lake samples are generally similar and significantly consistent with each other. Still, differences are observed for some pollen taxa: we find a high abundance of Poaceae and Cyperaceae in lake samples, which at least partly originates from overrepresented wetland taxa in the direct vicinity of the lakes. Contrary to expectation, Larix has higher abundances in lake samples than in moss/soil samples. Furthermore, pollen percentages of major taxa show higher variability in moss/soil samples than in lake samples.
The RSAPs of the tundra-forest transition area in Chukotka and the taiga area in central Yakutia are ca. 1300 and 360 m, respectively. For Chukotka, RPPs relative to both Poaceae and Ericaceae have been estimated, while RPPs for central Yakutia are relative to Ericaceae only. Larix, Betula, Picea, and Pinus are overrepresented while Alnus, Cyperaceae, Poaceae, and Salix are underrepresented in the pollen spectra. The RPPs for Alnus, Cyperaceae, and Salix should be used with caution. Our estimates are in general agreement with previously published values and provide a basis for reliable quantitative reconstructions of East Siberian vegetation.
Our new modern pollen data contribute to the existing modern pollen databases and can be used as analogs of tundra and taiga under cold climate conditions. Our results have implications for the interpretation of fossil pollen records and the quantitative reconstruction of past vegetation and climate.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://doi.pangaea.de/10.1594/PANGAEA.941577.

AUTHOR CONTRIBUTIONS
RG identified all the pollen samples and wrote the draft. AA helped with identifying pollen, discussed the results, and revised the draft. SK, BH, FvG, and IS provided and classified the vegetation data. LP, EZ, and ET helped to collect samples and identify the vegetation data. FL helped with running the ERV model. YZ and UH discussed the results and revised the manuscript. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
We thank our colleagues from the joint Russian-German expeditions in 2016 and 2018 for their support in the field. We thank Kathleen Stoof-Leichsenring, Sarah Olischläger, and Antonia Schönberg for support with the laboratory work. Special thanks to Feng Qin, Qiaoyu Cui, and Thomas Böhmer for their help in conducting the analyses. We also thank Cathy Jenks for her help with the English writing.