Use of RNA and DNA to Identify Mechanisms of Microbial Community Homogenization

Biotic homogenization is a commonly observed response following conversion of native ecosystems to agriculture, but our mechanistic understanding of this process is limited for microbial communities. In the case of rapid environmental changes, inference of homogenization mechanisms may be confounded by the fact that only a minority of taxa is active at any given point. RNA- and DNA-based community inference may help to distinguish the active fraction of a community from inactive taxa. Using these two community inference methods, we asked how soil prokaryotic communities respond to land use change following transition from rainforest to agriculture in the Congo Basin. Our results indicate that the magnitude of community homogenization is larger in the RNA-inferred community than the DNA-inferred perspective. We show that as the soil environment changes, the RNA-inferred community structure tracks environmental variation and loses spatial structure. The DNA-inferred community loses its association with environmental variability. Homogenization of the DNA-inferred community appears to instead be driven by the range expansion of a minority of taxa shared between the forest and conversion sites, which is also seen in the RNA-inferred community. Our results suggest that complementing DNA-based surveys with RNA can provide unique perspectives on community responses to environmental change. IMPORTANCE Two primary mechanisms by which community homogenization occurs are: 1) the loss of environmental heterogeneity driving community convergence, and 2) increased rates of biotic mixing, driven by exotic invasions or range expansions. Better identifying these mechanisms could help inform future mitigation strategies. Only a minority of soil taxa tends to be active at any time, which makes identifying these mechanisms difficult. 
To circumvent this problem, we measured prokaryotic community structure in two ways: RNA-based inference (which should enrich for active taxa), and DNA-based inference (which includes active and inactive taxa) along a gradient of land use change. Our results suggest that changes to soil heterogeneity impact the RNA-inferred community, while range expansions contribute to the homogenization of both DNA- and RNA-inferred communities. Thus, RNA-based community inference may be a more sensitive indicator of environmentally driven homogenization, and researchers interested in microbial responses to rapid environmental change should consider this method.


INTRODUCTION
One of the most rampant forms of environmental change today is land use change following the conversion of tropical rainforests to agriculture (1-4). Both above- and below-ground communities have been shown to experience species loss and community change at unprecedented rates following land use change (5-7), and this is of concern because tropical rainforests are some of the most diverse and productive ecosystems on the planet. Predicting community responses to tropical land use change is a priority if we are to better understand how human activities will impact species loss and global-scale biogeochemical cycling (8, 9), but in order to gain such a level of predictability we must better understand the mechanisms underlying community change.

[...] change on Congo Basin ecosystems, which would be better tested using replication at the land type level (54). Limited access to sites and logistical challenges with sampling in this area required that we extensively survey one site within each of three land types, rather than performing higher levels of replication on fewer land types. This design is appropriate for asking how these sites differ from one another, or how RNA- and DNA-inferred community composition or diversity patterns differ from one another (55-57). Regarding inferences about general microbial responses to land use change in the Congo Basin, this study would be considered a case study (54), whereby our results may be suggestive of broader patterns, but such patterns should be corroborated using a design with land type replication.

[...] cores were taken, homogenized, and then subsampled. From the homogenized mixture, 3 ml (approximately 1 g) of soil was added to 9 ml Lifeguard solution (Mobio, California, USA) in the field, then transported cold and stored at -80 °C in order to stabilize nucleotides for later extraction. Our spatially explicit design allows for the estimation of spatial turnover (beta diversity) [...]

Bioinformatics and statistical analysis
Paired-end reads were joined then demultiplexed in QIIME (67) before quality filtering. Primers were removed using a custom script. UPARSE was used to quality filter and truncate sequences (416 bp, EE 0.5) (68). Sequences were retained only if they had an identical duplicate in the database. Operational taxonomic units (OTUs) were clustered de novo at 97% similarity using USEARCH (69). OTUs were checked for chimeras using the gold database in USEARCH. We used a custom script to format the UCLUST output for input into QIIME. To assign taxonomy, we used the repset from UPARSE in QIIME using Greengenes version 13_5 (RDP classifier algorithm). Finally, we averaged 100 rarefactions at a depth of 3790 counts per sample for each community inference (RNA or DNA) and each land type (forest, burned, or plantation) to achieve approximately equal sampling depth across comparisons, which excluded three samples in the DNA-inferred communities (two in the forest and one in the plantation).

Statistical analyses were performed in the R platform (70). Canberra pairwise community distances were calculated using the vegdist function in the package 'vegan' (71). Canberra was chosen because of its incorporation of abundance data, sensitivity to rare community members (72), and ability to detect ecological patterns even in instances of relatively low sampling extent (73). Rates of community spatial turnover were estimated by regressing pairwise community similarity (1 - Canberra distance) against pairwise geographic distance between samples (74). We used a similar regression approach between community similarity and environmental similarity to estimate the relationship between community turnover and environmental turnover. Pairwise soil environmental similarity was calculated as 1 - Gower dissimilarity (75, 76) using the daisy function in the package 'cluster' in R (77). Gower dissimilarity was chosen because it can incorporate and compare different classes or scales of data (78). Mantel tests were used to test for significant associations between geographic, community, and environmental distance, and partial Mantel tests (in the 'vegan' package in R) were used to estimate the relative contributions of environmental distance and geographic distance to variation in community dissimilarity. Differences in average pairwise similarity across land types were assessed using a one-way ANOVA after verifying normal distribution of data. Post-hoc comparisons of group means were made using Tukey's HSD. Distance-decay slopes were compared using the function diffslope [...]

We developed several community analysis approaches to investigate whether biotic invasion or range expansion contributes to biotic homogenization. Taxa found in a conversion land type (i.e. the burned or plantation site), but not the forest, were considered "newcomers". We removed these taxa from the community matrix, equalized sampling extent (using rarefaction), and then re-ran analyses of pairwise community similarity levels and distance-decay (described above). The expectation was that if newcomers contribute to homogenization (increased community similarity), then their removal should decrease pairwise community similarity levels. We took an analogous approach to ask if range expansion of forest-associated taxa (referred to as "bloomer" taxa) contributes to biotic homogenization. We identified taxa that were differentially abundant in converted sites relative to the forest site (described above), then removed them from the community matrix of the converted site and re-assessed community similarity levels and distance-decay. The expectation, as above, was that if these taxa contribute to homogenization, then their removal should render the communities less similar.
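
The distance-decay estimate described in this section (community similarity regressed against geographic distance) can be illustrated with a minimal Python sketch. This is not the analysis code used in the study, which was run in R. It assumes the scaled form of Canberra distance, in which the sum of per-taxon terms is divided by the number of taxa present in at least one of the two samples so that values fall between 0 and 1. The function names and the toy abundance and coordinate data are hypothetical.

```python
from itertools import combinations
from math import hypot


def canberra_similarity(a, b):
    """Return 1 - scaled Canberra distance between two abundance vectors."""
    # Taxa absent from both samples (x + y == 0) are skipped.
    terms = [abs(x - y) / (x + y) for x, y in zip(a, b) if x + y > 0]
    if not terms:
        return 1.0  # assumption: two empty samples count as identical
    return 1.0 - sum(terms) / len(terms)


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def distance_decay_slope(abundances, coords):
    """Regress pairwise community similarity on pairwise geographic distance."""
    geo, sim = [], []
    for i, j in combinations(range(len(abundances)), 2):
        geo.append(hypot(coords[i][0] - coords[j][0],
                         coords[i][1] - coords[j][1]))
        sim.append(canberra_similarity(abundances[i], abundances[j]))
    return ols_slope(geo, sim)


# Hypothetical toy data: three samples (rows) by three taxa (columns),
# with sample positions in meters along a transect.
toy_abundances = [[10, 0, 5], [8, 2, 5], [0, 9, 5]]
toy_coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
slope = distance_decay_slope(toy_abundances, toy_coords)
print(f"distance-decay slope: {slope:.4f}")
```

A negative slope indicates that similarity declines with distance, and a slope closer to zero corresponds to weaker spatial structure. Because pairwise values are not independent, the significance of such relationships is assessed with permutation-based approaches such as the Mantel tests described above, which this sketch omits.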
[...] communities were inferred via DNA or RNA (Supplemental Fig. 1). OTU-level richness also differed by land type (F(2,70) = 8.26, p < 0.001), but not by community inference method (p = 0.80), with the burned site being significantly lower in richness than the forest or plantation sites (Tukey's HSD p < 0.01 for both comparisons, Supplemental Fig. 4).

Evidence of biotic homogenization following land use change
We asked whether soil prokaryotic communities in the sites undergoing agricultural conversion were on average more similar to each other, relative to the communities found in the forest. The RNA-inferred community showed a strong trend towards homogenization across sites (F(2,219) = 23.33, p < 0.001, Fig [...] and showed a shallower distance-decay slope (slope = -0.027, difference in slope = -0.025, p = 0.001). Thus, burning and planting seem to introduce environmental heterogeneity, but this heterogeneity tends to show little to no spatial structure.

Environmental heterogeneity continues to influence RNA-inferred (and not DNA-inferred) community turnover, despite loss of spatial structure
We asked whether the loss of spatial structure of the soil chemical environment could be contributing to the loss of spatial turnover in the microbial community. To do so, we regressed pairwise community similarity (1 - Canberra distance) against pairwise environmental similarity (1 - Gower distance) for both the RNA- and DNA-inferred communities. In the forest site, both RNA- and DNA-inferred community similarity levels were positively correlated with environmental similarity (Fig. 4A, B), even after accounting for differences due to geographic distance (Table 1) [...] we look at the burned and plantation sites, however, this relationship persists for the RNA-inferred community, but disappears for the DNA-inferred community (Table 1), suggesting that the spatial homogenization of the DNA-inferred community may be driven by mechanisms other than soil chemical homogenization. Thus, as environmental heterogeneity loses its spatial structure, the RNA-inferred community similarity levels continue to vary with this heterogeneity and lose spatial structure, while the DNA-inferred community becomes decoupled from levels of environmental variation.
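
The comparison above (community similarity against environmental similarity, accounting for geographic distance) can be sketched in Python under simplifying assumptions. The sketch handles numeric soil variables only, whereas Gower dissimilarity can also accommodate other data classes. It uses a first-order partial correlation as the statistic, which is the quantity a partial Mantel test evaluates, and it omits the permutation step needed to assess significance. All function names and toy values are hypothetical and are not data from this study.

```python
from itertools import combinations
from math import sqrt


def gower_similarity(rows):
    """Pairwise 1 - Gower dissimilarity for numeric variables.

    Returns one value per sample pair, in itertools.combinations order.
    """
    n_vars = len(rows[0])
    ranges = [max(r[k] for r in rows) - min(r[k] for r in rows)
              for k in range(n_vars)]
    sims = []
    for i, j in combinations(range(len(rows)), 2):
        # Each variable contributes |difference| / range; constant
        # variables (range 0) carry no information and are skipped.
        parts = [abs(rows[i][k] - rows[j][k]) / ranges[k]
                 for k in range(n_vars) if ranges[k] > 0]
        sims.append(1.0 - sum(parts) / len(parts) if parts else 1.0)
    return sims


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical toy data: four samples with two soil variables each.
toy_soil = [[4.0, 10.0], [5.0, 20.0], [6.0, 30.0], [4.5, 12.0]]
env_sim = gower_similarity(toy_soil)
# Hypothetical pairwise community similarities and geographic distances,
# listed in the same pair order as env_sim.
community_sim = [0.40, 0.20, 0.55, 0.35, 0.50, 0.25]
geo_dist = [1.0, 2.0, 3.0, 1.0, 2.5, 3.5]

r_raw = pearson(community_sim, env_sim)
r_partial = partial_correlation(community_sim, env_sim, geo_dist)
print(f"raw r = {r_raw:.3f}, partial r (geography held constant) = {r_partial:.3f}")
```

Under this framing, a community that still tracks its environment keeps a positive partial correlation after geography is controlled for, while a decoupled community shows a value near zero.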

Biotic invasions do not contribute to homogenization
We next tested the hypothesis that the introduction of "newcomer" taxa (i.e. those that were not previously present) was driving community homogenization.
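
The logic of this newcomer-removal test can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's analysis code: it omits the rarefaction step used to equalize sampling extent after removal, it assumes the scaled form of Canberra distance, and all function names and toy abundances are hypothetical.

```python
from itertools import combinations


def canberra_similarity(a, b):
    """Return 1 - scaled Canberra distance between two abundance vectors."""
    terms = [abs(x - y) / (x + y) for x, y in zip(a, b) if x + y > 0]
    if not terms:
        return 1.0  # assumption: two empty samples count as identical
    return 1.0 - sum(terms) / len(terms)


def mean_pairwise_similarity(samples):
    """Average similarity over all pairs of samples within one site."""
    sims = [canberra_similarity(a, b) for a, b in combinations(samples, 2)]
    return sum(sims) / len(sims)


def drop_newcomers(forest, converted):
    """Remove taxa detected in the converted site but in no forest sample.

    Returns the filtered converted-site matrix and the retained taxon indices.
    """
    n_taxa = len(forest[0])
    keep = []
    for t in range(n_taxa):
        in_forest = any(sample[t] > 0 for sample in forest)
        in_converted = any(sample[t] > 0 for sample in converted)
        is_newcomer = in_converted and not in_forest
        if not is_newcomer:
            keep.append(t)
    filtered = [[sample[t] for t in keep] for sample in converted]
    return filtered, keep


# Hypothetical toy matrices: rows are samples, columns are taxa.
# Taxon index 2 occurs only in the converted site, so it is a newcomer.
toy_forest = [[5, 3, 0], [4, 0, 0]]
toy_converted = [[6, 2, 7], [5, 0, 7]]

before = mean_pairwise_similarity(toy_converted)
filtered, kept = drop_newcomers(toy_forest, toy_converted)
after = mean_pairwise_similarity(filtered)
print(f"mean similarity before: {before:.3f}, after removing newcomers: {after:.3f}")
```

If newcomers were driving homogenization, the mean pairwise similarity would drop once they are removed, as it does in this toy example, where the newcomer taxon is equally abundant in both converted samples.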

Range expansion of forest-associated taxa drives loss of community variation
Because soil bacterial communities in the forest tended to show high taxonomic overlap with the burned and plantation sites, we asked whether homogenization might rather be driven by changes to the relative abundance of certain taxa. We used DESeq2 (a generalized linear model with a negative binomial distribution) to identify "bloomer" taxa (i.e. those whose relative abundance significantly increased by land type). This [...]

[...] homogenization is driven by environmental homogenization, community turnover should continue to track environmental turnover, even when spatial structure is lost. We see this in our data when we infer community structure using RNA, but not DNA, suggesting that environmental spatial homogenization is likely a strong driver of the spatial homogenization of the RNA-inferred community. The decoupling of responses in the RNA- and DNA-inferred communities could represent differing levels of contribution from homogenization mechanisms. Our results suggest that taxa that are enriched in the burned or plantation sites relative to the forest are contributing to the loss of community variation (i.e. average pairwise dissimilarity) in those sites. Those taxa also collectively show wider spatial distributions (i.e. higher occurrence frequencies) in the disturbed sites relative to the forest. These findings are consistent with the idea of a range expansion, and the fact that we saw this trend in both the RNA- and DNA-inferred communities suggests that identifying this type of homogenization mechanism may not require RNA-based community inference.
A similar pattern has been observed in Amazonian sites that have undergone conversion to cattle pasture, where prokaryotic taxa shared across forest and agricultural sites tended to be more widespread in the agricultural sites (6), and fungal communities in agricultural sites tended to be enriched in generalist taxa that were more widespread (15). Thus, by distinguishing communities using RNA and DNA, we see that only part of the community seems to be responding to the environmental changes associated with conversion, while communities inferred via both methods appear to be shaped by biotic factors such as the breakdown of dispersal barriers and/or the range expansion of certain taxa.

The use of 16S rRNA as a proxy for activity has been the subject of recent controversy. Of particular concern are two main issues: the assignment of false positives (i.e. dormant taxa misidentified as active (28)), and the inaccurate assessment of activity levels (e.g. driven by comparing ratios of the relative abundance of taxa in the RNA- vs. DNA-inferred communities (29-32)). The ribosomal content of a community, however, should be at least enriched with the taxa that are active and/or growing, and there are a number of studies that support the notion that rRNA-inference represents activity. For example, if the active fraction of a community is more likely to be interacting with the environment than the dormant fraction (which is likely avoiding the current environmental conditions), then we would expect a stronger correspondence between environmental conditions and community turnover in a community that is enriched in active taxa (19). Indeed, this has been shown both along a marine environmental gradient (33) and in a grassland soil system experiencing re-wetting following drought (34).
It has also been shown that N-addition to forest soil elicits a stronger response in communities inferred from 16S rRNA than rDNA (35). Our results contribute to this narrative by showing that RNA-inferred community turnover persistently tracks environmental turnover, while this association is lost when inferring only with DNA. We also see that the RNA-inferred community shows a more pronounced loss of community variation and spatial structure than the DNA-inferred community. Thus, while rRNA inference may have certain limitations, our results, alongside others, suggest that this method should be enriching for active taxa, and this can have important implications for both qualitative and quantitative conclusions, especially in systems with strong environmental gradients.

Tropical ecosystems are characterized by immense heterogeneity, and this could make the task of detecting general responses to land use change difficult. Two important steps towards gaining a better understanding of common microbial responses to tropical land use change include 1) expanding the breadth (i.e. the geographic representation) of regions sampled, and 2) increasing the resolution of our study systems (e.g. by including more sites along the conversion continuum). Our study allows us to ask whether commonalities exist between our findings and those reported from other tropical ecosystems undergoing land use change. The changes we see to the spatial structuring of communities (i.e. a diminished distance-decay relationship) are consistent with responses reported from the Amazon Basin (6, 25). While our study was not replicated at the land type level (restricting our level of inference regarding how representative our findings are of other Congo Basin areas), our results at least suggest that a diminished rate of community distance-decay may be common across tropical areas facing a similar threat.
The method of conversion may be driving this similarity in microbial community response. The predominant method for converting tropical rainforests to agriculture is the use of slash-and-burn techniques (87). By including a recently slash-and-burned site in our design, we have gained a rare glimpse into the impacts directly following the initial step in agricultural conversion. Already at this stage we see that the loss of community spatial structure (i.e. distance-decay) has occurred. What this suggests is that, at least initially, spatial homogenization can be driven by the act of conversion, rather than other management practices such as planting or crop choice. Thus, by targeting a region that has otherwise not been sampled, and increasing the resolution at which we survey the conversion process, we have gained new insights that may help to elucidate common community responses to tropical land use change.

Considering the rate and magnitude at which tropical rainforests are being converted to agriculture (4), gaining a mechanistic understanding of community responses to environmental change is imperative (9). Future efforts could investigate whether the functional potential (i.e. gene content) or trait distributions of a community are similarly impacted by land use change (37, 88), or whether ecosystem functions (e.g. those involved in nutrient cycling or greenhouse gas emissions) are impacted by community homogenization. Our work highlights the importance of distinguishing between metabolic states of microbial community members, if we are to better understand community responses to environmental change. Lastly, our work demonstrates that trends in our system are consistent with those reported from geographically disparate areas (e.g. the Amazon Basin), suggesting that despite large differences between these areas, land use change may drive predictable community changes.