Climate Projections for the Southern Ocean Reveal Impacts in the Marine Microbial Communities Following Increases in Sea Surface Temperature

Anthropogenic global warming can have strong impacts on marine ecosystems, especially in climate-sensitive regions such as the Southern Ocean (SO). As key drivers of biogeochemical cycles, pelagic microbial communities are likely to respond to increases in sea surface temperature (SST). Thus, it is critical to understand how SST may change in future scenarios and how these changes will affect the composition and structure of microbial communities. Using a suite of Earth System Models participating in the Coupled Model Intercomparison Project Phase 6 (CMIP6), machine learning, and 16S rRNA sequencing data, we investigated the long-term changes in SST projected by CMIP6 simulations throughout the twenty-first century and the microbial diversity responses in the SO. Four Shared Socioeconomic Pathways (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5) were considered to assess the SO surface sensitivity to a warming climate. The SST changes across SSPs were ≈0.3, ≈0.7, ≈1.25, and ≈1.6 °C between 2015 and 2100, respectively, and the high emissions scenarios projected a much earlier emergence of the human-induced temperature change throughout the SO. The impacts on Antarctic marine diversity of bacteria and archaea are expected to be significant and persistent by the late twenty-first century, especially within the higher end of the range of future forcing pathways.


INTRODUCTION
Throughout the second half of the twentieth century, most of the Earth's energy imbalance has been stored in the ocean, primarily at the surface (Ishii and Kimoto, 2009; Lyman et al., 2010; Trenberth, 2010; Abraham et al., 2013), and more recently reaching deeper layers (Bilbao et al., 2019). The resulting increase in ocean heat content has been reported to be largely due to anthropogenic forcing (Barnett et al., 2005; Gleckler et al., 2012; Pierce et al., 2012; Bilbao et al., 2019). These human-induced changes have already spread to about half (20-55%) of the world oceans (Silvy et al., 2020), and the largest heat gain has occurred in the Southern Ocean (Roemmich et al., 2015; Trenberth et al., 2016). However, the ice-insulated regions of the SO protect its sea surface waters from climate forcing (Gille, 2002, 2008; Fahrbach et al., 2011; Tonelli et al., 2019). As a result, heat and carbon uptake happens mostly through the formation and subduction of water masses (Heuzé et al., 2013; Sallée et al., 2013), and the earliest detection of the anthropogenic warming fingerprint in the SO occurs in the ocean interior rather than in surface waters (Silvy et al., 2020).
Despite the ability of the ice-insulated regions of the SO to delay the warming trends throughout the global ocean due to the local circulation and the interaction with sea ice and ice shelves (Fahrbach et al., 2011; Tonelli et al., 2019), warmer surface waters lead to increased sea ice melting, loss of ice mass, and increased surface water flux (Rignot et al., 2008), ultimately leading to a slowdown of bottom water formation and the stalling of the CO2 sink in the SO (Lovenduski et al., 2008; Lenton et al., 2009; Purkey and Johnson, 2012). In particular, the surroundings of the Antarctic Peninsula and the western Weddell Sea have been exposed to intense warming and sea ice melting (Fan et al., 2014).
Climate change affects the growth, reproduction, and survival of aquatic organisms (Kroeker et al., 2010; McFeeters and Frost, 2011; Weydmann et al., 2012; Cripps et al., 2014). The increase in ocean temperature will likely affect the structure and dynamics of microbial communities, but since microorganisms have large population sizes and relatively fast reproduction, they might be capable of adapting to global environmental changes through phenotypic plasticity and adaptive evolution (Collins et al., 2014; Hellweger et al., 2014; Doblin and van Sebille, 2016; Verde et al., 2016; Cavan et al., 2019). Microorganisms' responses to climate change have strong implications for an environmentally sustainable future (Cavicchioli et al., 2019), but our understanding of how they respond to climate warming remains limited.
Although the effect of climate change on microbial functioning is still poorly understood, we know that marine phytoplankton perform half of the global photosynthetic CO2 fixation and half of the oxygen production, and that they can respond quickly to climate variations (Arrigo, 2005; Teeling and Glöckner, 2012; Buttigieg et al., 2018). Heterotrophic and chemolithotrophic microorganisms are also important drivers of ocean biogeochemical cycles and constitute the foundations of many marine ecosystems, acting as an essential part of the functioning of trophic levels (Arrigo, 2005; Teeling and Glöckner, 2012; Buttigieg et al., 2018). The SO has shown high and dynamic microbial diversity, with the community composition strongly influenced by temperature (Signori et al., 2014, 2018; Cavicchioli, 2015). Given that temperature increase is expected to modify microbial diversity and distribution with cascading effects at higher trophic levels, predicting how the microbial diversity and community composition will respond to climate change has become an important challenge (Thomas et al., 2012; Toseland et al., 2013; Cavicchioli et al., 2019).
The main tools to investigate projections of global climate change are state-of-the-art climate models designed to estimate the progression of Earth's climate system in the twenty-first century (21C; Eyring et al., 2016). These complex models, however, are not able to represent the marine microbial structure. It is, therefore, critical to make use of new methodologies combined with real data to assess the impacts of climate change on these sensitive pelagic microbiomes. Machine learning (ML) techniques enable the analysis of high-dimensional data, linking prediction and computational intelligence methods based on in situ measured data, which can be used to elucidate relationships between microorganisms and environmental factors, such as temperature (Yu and Liu, 2003; Bishop, 2006; Qu et al., 2019; Thompson et al., 2019). ML algorithms like random forest (RF) and neural networks have been reported to be among the most effective tools for analyzing microbiome data, with high accuracy across a range of 16S rRNA sequencing datasets (Liu et al., 2011; Larsen et al., 2012; Statnikov et al., 2013; Pasolli et al., 2016).
Here, we investigate the long-term sea surface temperature (SST) changes throughout the 21C, as simulated by CMIP6 projections (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5), as well as the response of microbial diversity and composition in the northwestern Antarctic Peninsula (NWAP) and the northwestern Weddell Sea (NWWS), using an RF model available from the Python toolkit scikit-learn.

Climate and Earth System Models
The World Climate Research Program coordinates the development of Climate and Earth System Models (ESM) by major modeling centers under the scope of the Coupled Model Intercomparison Project, now in its sixth phase (CMIP6; Eyring et al., 2016). CMIP6 models simulate the climate under different scenarios of future anthropogenic activity (Shared Socioeconomic Pathways, SSPs) within the scope of an endorsed CMIP6 project named ScenarioMIP (O'Neill et al., 2016). These SSPs are built upon the same radiative forcing range previously used in CMIP5 (Taylor et al., 2012) described as Representative Concentration Pathways (RCP2.6, RCP4.5, RCP6.0, and RCP8.5), and named after a possible range of radiative forcing values in the year 2100: 2.6, 4.5, 7.0, and 8.5 W/m2, respectively (Meinshausen et al., 2011). The tier 1 SSPs are SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5 (O'Neill et al., 2016). The SSP narratives illustrate possible anthropogenic drivers of climate change over the 21C (departing from the historical runs), ranging from sustainable to fossil-fueled development (Riahi et al., 2017):
• SSP1, Sustainability (Taking the Green Road): Low challenges to mitigation and adaptation;
• SSP2, Middle of the Road: Medium challenges to mitigation and adaptation;
• SSP3, Regional Rivalry (A Rocky Road): High challenges to mitigation and adaptation;
• SSP4, Inequality (A Road Divided): Low challenges to mitigation, high challenges to adaptation;
• SSP5, Fossil-fueled Development (Taking the Highway): High challenges to mitigation, low challenges to adaptation.

Sea Surface Temperature Data
To quantify the projected changes in surface temperature over the Southern Ocean (SO), we computed the area-weighted SST annual means between 60°S and 80°S across four future scenarios, ranging from the more sustainable scenarios (SSP1-2.6 and SSP2-4.5) to high emissions trajectories (SSP3-7.0 and SSP5-8.5). The SST data were retrieved from the Earth System Grid Federation (ESGF) CMIP6 archive (variable name "tos"; native grid "gn"). Monthly mean outputs from 18 ESMs (Table 1) were available at the time of writing for the preindustrial control experiment (piControl) and four ScenarioMIP 21C projections: SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5. CMIP6 models are usually run multiple times for each experiment. Here, a single run was used for each model (usually the variant named "r1i1p1f1"). For consistency, the SST outputs from each model were first interpolated onto the same regular 360° × 180° grid using bilinear interpolation with the same land-ocean mask, excluding marginal seas and interior water bodies such as the Mediterranean Sea, Red Sea, Arabian Gulf, Black Sea, Caspian Sea, Baltic Sea, and Hudson Bay. The first-year annual mean (2015) was subtracted from each time series, yielding the actual SST change over each SSP scenario projection. Although it has been suggested that model drift is negligible when considering multimodel means (Gupta et al., 2013), SST data were de-drifted based on each model's piControl run to remove trends potentially caused by model equilibrium adjustment rather than by external forcing, as in Ferrero et al. (2021).
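The SST processing steps described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, the cosine-of-latitude weighting assumes the regular 1° grid, and the de-drifting is assumed here to be the removal of a linear trend fitted to the piControl series (the text defers the exact procedure to Ferrero et al., 2021).

```python
import numpy as np

def area_weighted_mean(sst, lat, lat_min=-80.0, lat_max=-60.0):
    """Area-weighted mean of one SST field over a latitude band.

    sst : array (nlat, nlon), NaN where the land-ocean mask excludes a cell
    lat : array (nlat,), cell-centre latitudes in degrees
    """
    band = (lat >= lat_min) & (lat <= lat_max)
    field = sst[band]
    # On a regular lat-lon grid, cell area is proportional to cos(latitude).
    weights = np.broadcast_to(np.cos(np.deg2rad(lat[band]))[:, None], field.shape)
    valid = np.isfinite(field)
    return np.sum(field[valid] * weights[valid]) / np.sum(weights[valid])

def change_since_first_year(annual_means):
    """Subtract the first-year (e.g., 2015) value from an annual-mean series."""
    annual_means = np.asarray(annual_means, dtype=float)
    return annual_means - annual_means[0]

def dedrift(series, control):
    """Remove a linear drift estimated from the control run (assumed method)."""
    slope = np.polyfit(np.arange(len(control)), control, 1)[0]
    return np.asarray(series, dtype=float) - slope * np.arange(len(series))
```

Applying area_weighted_mean to each annual-mean field and passing the resulting series through dedrift and change_since_first_year would give one SST-change time series per model and scenario, under the assumptions stated above.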

Microbial Diversity Data
Microbial community datasets were obtained by our group from previously published studies (Signori et al., 2018; de Ferreira, 2019) under the Brazilian Antarctic Program, and comprised a total of 105 samples from surface waters (∼5 m depth) collected in the NWAP and NWWS. Sampling strategy and sample processing are detailed in Signori et al. (2018) and de Ferreira (2019). Briefly, approximately 3 L of seawater were filtered onto 0.22 µm-membrane Sterivex™ filters using a peristaltic pump onboard the Brazilian polar vessel Almirante Maximiano (Signori et al., 2018) or at Comandante Ferraz Brazilian Antarctic Station (King George Island, Antarctica; de Ferreira, 2019). After filtration, samples were frozen at −20 or −80 °C for molecular analysis. DNA extraction from the Sterivex™ filters was performed using the DNeasy PowerWater Kit (Qiagen, Hilden, Germany), following the manufacturer's protocol, or according to specifications available in Signori et al. (2018). Total extracted DNA was then sequenced on an Illumina MiSeq system in a paired-end 2 × 250 bp configuration, with the primers 515F (5′-GTGCCAGCMGCCGCGGTAA-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′; Caporaso et al., 2012), targeting the V4 region of the 16S rRNA gene for both Bacteria and Archaea. Reads are available in the NCBI database under the Bioproject IDs PRJNA383940 and PRJNA665033. The description of samples, coordinates, We obtained a total of 8,538,820 reads distributed among the 105 water samples, of which 5,968,201 passed quality filtering and were then analyzed with QIIME2 (Quantitative Insights into Microbial Ecology) and its plugins (Bolyen et al., 2019). Based on the quality scores observed using qiime demux summarize and the interactive quality plot, the forward reads were truncated at position 270, and the reverse reads at 200, using the q2-dada2 denoise script. This script uses the DADA2 software to obtain a set of observed amplicon sequence variants (ASVs), as described by Callahan et al. (2016).
Taxonomy was assigned through feature-classifier classify-sklearn using the SILVA database v.132 with a confidence threshold of 0.7. Alpha diversity indices (Chao1 and Shannon) were calculated using the phyloseq (McMurdie and Holmes, 2012) and vegan (Oksanen et al., 2020) packages in R (R Core Team, 2019). The Shannon index is calculated from the proportion of each ASV relative to the total, multiplied by the natural logarithm of that proportion and summed over all ASVs with a negative sign (Shannon, 1948), accounting for both the abundance and evenness of the ASVs. A map with the sampling regions is provided in Supplementary Figure 1.
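For reference, the two alpha diversity indices can be written out for a single sample as in the sketch below. The study computed them with phyloseq and vegan in R; this Python version is only an illustration of the definitions, the function names are hypothetical, and the bias-corrected form of Chao1 is an assumption about which variant was applied.

```python
import numpy as np

def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over ASVs with non-zero counts."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log(p)))

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of ASVs observed exactly once and twice.
    """
    counts = np.asarray(counts)
    s_obs = int(np.sum(counts > 0))
    f1 = int(np.sum(counts == 1))
    f2 = int(np.sum(counts == 2))
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))
```

For a perfectly even sample of four ASVs, shannon returns ln 4; Chao1 equals the observed richness whenever a sample has at most one singleton.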

Time of Emergence
The Time of Emergence (ToE) is defined as the time at which the signal of a forced response emerges from the noise of internal variability (Hawkins and Sutton, 2012), thus providing an indicator of the human-induced climate change for several climate variables (Chadwick et al., 2019). The ToE was computed for each model separately as the year when the SST time series at each grid point exceeds two standard deviations of the monthly mean SST from the piControl experiment, similar to previous studies (Hawkins and Sutton, 2012; Bordbar et al., 2015; Lyu et al., 2020; Ferrero et al., 2021). The results were averaged to obtain the 18-model mean ToE for each SSP future projection.
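The ToE criterion described above can be illustrated for a single grid point and a single model with the short sketch below. It is not the authors' implementation: the function name is hypothetical, the first exceedance of the threshold is taken as the ToE, and no persistence requirement is imposed, since the text does not specify one.

```python
import numpy as np

def time_of_emergence(years, sst_change, control_sst, n_sigma=2.0):
    """First year in which the SST change exceeds n_sigma control standard deviations.

    years       : array of calendar years, same length as sst_change
    sst_change  : SST change relative to the first projection year (°C)
    control_sst : piControl SST series used to estimate internal variability
    Returns None if the threshold is never exceeded.
    """
    years = np.asarray(years)
    sst_change = np.asarray(sst_change, dtype=float)
    threshold = n_sigma * np.std(control_sst, ddof=1)
    exceeded = np.flatnonzero(sst_change > threshold)
    return int(years[exceeded[0]]) if exceeded.size else None
```

Running this function at every grid point for each model, and then averaging the resulting maps across models, would correspond to the multi-model mean ToE described in the text.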

Machine Learning Predictions
The RF model used to predict the responses of microbial diversity and composition to SST in the NWAP and the NWWS is a supervised machine learning algorithm available from the Python toolkit scikit-learn (Pedregosa et al., 2011). This ML model is a classifier that combines multiple decision trees, each trained on a slightly different set of observations, and its final prediction is determined by a vote across the decision trees (Svetnik et al., 2003; Chen et al., 2018).
The RF model was first trained with Chao1 and Shannon indices and observed SST from oceanographic stations in the NWAP and the NWWS. From a total of 105 samples, 80% (84 samples) were used to train the model and the remaining 20% (21 samples) were used to test it.
In practical terms, each sample provides one value of observed SST, one Chao1 index, and one Shannon index. These variables are sorted by temperature, with values ranging from −1.6 to 2.3 °C. The RF model is trained separately for each index multiple times, using the same SST input as the predictor for both indices, which are the ML targets. To avoid overfitting, the dataset is randomly partitioned into subsets of 84 samples for training, while 21 samples are kept for testing. Once validated for both indices, the model is used to predict the spatial distribution of Chao1 and Shannon using area-weighted SST annual mean outputs from the projected future scenarios. It should be noted that, from a centennial-scale climate change perspective, the entire study area is subject to roughly the same warming trends within each SSP; therefore, the long-term SST time series used in RF predictions differ only across SSPs.
A similar approach was used to perform the composition analysis, where the percentage contribution of each microbial taxon identified within the sample community and the associated SST observation were used to train the RF model. The projected SST annual means for each SSP were used to predict the composition percentage of each taxon, and the results were merged to provide a temporal evolution of the microbial community composition. All trained RF models reached scores in the range of 0.87-0.93, suggesting good accuracy.
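The training and prediction workflow described in this section can be sketched with scikit-learn as follows. Several elements are assumptions made for illustration only: the data are synthetic (an invented linear SST-diversity relationship with noise), the hyperparameters are arbitrary, and RandomForestRegressor is used because the diversity indices are continuous, whereas the text describes the model as a classifier. The reported score is the coefficient of determination returned by the regressor, which may not be the metric used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the 105 samples: observed SST (°C) and one diversity index.
sst = rng.uniform(-1.6, 2.3, size=105)
shannon_index = 5.0 - 0.4 * sst + rng.normal(0.0, 0.1, size=105)  # invented relationship

# Random partition into 84 training and 21 testing samples.
X_train, X_test, y_train, y_test = train_test_split(
    sst.reshape(-1, 1), shannon_index, test_size=21, random_state=0
)

# One model per target index; SST is the only input feature.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
score = model.score(X_test, y_test)  # R^2 on the held-out samples

# Hypothetical projected SST values: a present-day mean and the same mean plus 1.6 °C.
projected_sst = np.array([[0.5], [0.5 + 1.6]])
predicted_index = model.predict(projected_sst)
```

Repeating the fit with different random partitions, and once per target (Chao1, Shannon, or the percentage of each taxon), would mirror the procedure outlined above; a tree ensemble cannot extrapolate beyond the range of SST values seen in training, which is worth keeping in mind when projected temperatures exceed the observed range.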

RESULTS AND DISCUSSION
Consistent with Bracegirdle et al. (2020), warming trends are projected to happen across all future climate scenarios, where higher human industrial activity leads to higher surface temperatures. The approximate SST change across these scenarios is ≈0.3, ≈0.7, ≈1.25, and ≈1.6 °C between 2015 and 2100 (Figure 1). Variation within these scenarios increases throughout the second half of the 21C, particularly for the high forcing scenarios, mostly due to uncertainties arising from internal variability, model structural differences, and radiative forcing (Ferrero et al., 2021).
Our SSP1-2.6 simulations project mild warming in the SO (Figure 1). Significant increases in SST are not expected to happen on the northwestern Antarctic Peninsula (NWAP) coast before 2080, nor until the end of the century in the northwestern Weddell Sea (NWWS; Figure 2A). This reaffirms the ability of the Weddell Sea (WS) to cushion the offshore warming signal advected by the Weddell Gyre (Ryan et al., 2016) due to the local dynamics and cryosphere-related processes (Fahrbach et al., 2011; Tonelli et al., 2019).
The "Middle of the Road" outcome from SSP2-4.5 simulations indicates a slightly earlier emergence of the climate change signal across the NWAP, which is projected to take place during the 2060s (Figure 2B). Again, the NWWS is able to postpone the anthropogenic warming until the last decade of the 21C, with the inner WS still showing areas within the range of internal variability (≈65°S).
Due to greater environmental pressure, the SSP3-7.0 and SSP5-8.5 simulations yielded even earlier ToE around the NWAP sea surface (Figures 2C,D). Close to coastal areas, rising SST exceeds the internal variability envelope in the early 2050s in these high forcing scenarios. Although the NWWS still exhibits a much later ToE than the NWAP, the human-induced warming is projected to reach the inner WS by the end of the 2070s based on SSP5-8.5 simulations. To assess the spatial variability of ToE, we computed the horizontal anomalies relative to the mean ToE for each SSP (Supplementary Figure 2). This corroborates the understanding that the higher the anthropogenic footprint, the sooner the human-induced warming will emerge in the regions adjacent to the Antarctic Peninsula. Moreover, Supplementary Figure 2 highlights that this warming signal is expected to emerge much sooner across the NWAP than across the NWWS. The more intense and faster warming in the NWAP is caused by the Antarctic Circumpolar Current carrying well-mixed warmer waters much closer to the continent, which is consistent with studies that investigated the ocean and ice shelf dynamics along the West Antarctic Peninsula (Jacobs et al., 2011; Hellmer et al., 2012; Paolo et al., 2015; Zhang et al., 2016; Smith et al., 2020).
Our machine learning models indicate a decrease in both the richness and the diversity of microbial communities within all climate projections, with higher emissions causing a more pronounced decrease in both indices, particularly under the most critical scenario, SSP5-8.5 (Figure 3). Therefore, microbial communities might be highly impacted by increasing temperatures, corroborating the trend observed for phytoplankton (Thomas et al., 2012) and for prokaryotic communities in a mesocosm experiment in the Baltic Sea (Lindh et al., 2013). Also, SSP1-2.6 and SSP5-8.5 yielded contrasting predictions for microbial composition (Figure 4). Whilst the low emission scenario projected small changes in the relative abundance of microorganisms, the three scenarios with the highest increase in temperature, including the "middle of the road" scenario, show changes in the structure of microbial communities, including the loss of diversity and a decrease in microbial taxa that are important contributors to the biogeochemical processes and ecosystem functioning in the NWAP and NWWS. In general, the heterotrophic Flavobacteriales are projected to have higher relative abundances, while the sulfur-oxidizing Thiomicrospirales, the ammonia-oxidizing Archaea within Nitrosopumilales, and the planktonic euryarchaeotal Marine Group II have reduced relative abundance. These orders are composed of several species with important roles in the functioning of pelagic ecosystems, including the sulfur, nitrogen, and carbon cycles, and are currently described as abundant taxa in the SO (Signori et al., 2014; Zhang et al., 2015; Liu et al., 2019). For instance, Nitrosopumilales members are important contributors to nitrogen remineralization and carbon fixation, and are often the most abundant taxa in cold waters, including the euphotic and aphotic zones of the SO (Signori et al., 2014; Cheung et al., 2019).
[Figure caption fragment: "... Figure 1) to illustrate the temporal evolution of the bacterial and archaeal community composition. Only taxa with relative abundance above 1% are presented."]
The implications of a decrease in ammonia oxidation in the pelagic ecosystems are still unclear, but some modeling studies have indicated that it might affect nutrient stoichiometry, denitrification, marine productivity, and the biological carbon pump (Beman et al., 2011; Kitidis et al., 2011). Members of the Marine Group II have not yet been cultured and their lifestyles are still not well known. The information obtained through the reconstruction of genomes from metagenomic data showed a photoheterotrophic lifestyle inferred from the presence of proteorhodopsin genes (Pereira et al., 2019). Some studies have suggested that photoheterotrophy contributes to biomass accumulation in oligotrophic waters, and a loss of this metabolism would potentially affect nutrient uptake by the pelagic communities (Evans et al., 2015). Furthermore, a decrease in Thiomicrospirales populations would affect the oxidation of sulfur compounds, which has been described as an important process that contributes to protecting marine ecosystems from sulfide toxicity in coastal areas (Hu et al., 2018). However, the importance of sulfur oxidation in the functioning of the SO ecosystem is still unknown.
Microbes respond to changes in light, temperature, metabolites, and other environmental factors, making them good candidates for monitoring both short- and long-term ecological variations (Hanson et al., 2012; Zinger et al., 2014; Brum et al., 2015; Turner et al., 2016; Baker-Austin and Oliver, 2018; Buttigieg et al., 2018). Climate change affects interactions between species and forces them to adapt, migrate, and even be replaced by others (Hoffmann and Sgrò, 2011; Hutchins and Fu, 2017). Understanding how well microorganisms are adapted to environmental factors, such as temperature, and predicting how well they will respond to warming is essential to elucidate ecological adaptation of these organisms (Cavicchioli, 2016).
In this work, we were able to project future changes in microbial community structure and a decrease in diversity for the NWAP and NWWS, according to the SSP climate change scenarios. The ML predictions showed a decreasing trend in important bacterial and archaeal taxa involved in crucial biogeochemical processes, such as Nitrosopumilales, Marine Group II, and Thiomicrospirales, and an increase in heterotrophic groups, such as Flavobacteriales. In other words, regardless of when the forced signal emerges from the internal variability, temperature changes are seen to modulate the dynamics of the Southern Ocean marine microbial community.
In response to SSP scenarios, our predicted results showed that the impacts of high latitude climate change on pelagic microbial communities vary across future projections and mainly indicate a loss of bacterial and archaeal diversity in surface waters. However, it should be noted that even though the use of machine learning is growing in microbial ecology studies, it remains a predictive tool whose results need to be interpreted carefully (Lucas, 2020). To our knowledge, this is the first study to predict the effect of long-term SST changes on microbial diversity in the SO. In addition, our results show a decreasing trend in important bacterial and archaeal taxa involved in crucial biogeochemical processes, such as Nitrosopumilales, Marine Group II, and Thiomicrospirales, and an increase in heterotrophic groups, such as Flavobacteriales. We suggest that these microorganisms can be used as potential models in future marine studies, to validate our predictions and to create new hypotheses regarding the response of these microbial taxa under the increasing temperatures predicted by the SSP scenarios.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
We apply Time of Emergence analysis, machine learning (ML) models, and 16S rRNA sequencing data to investigate the long-term changes in the SST throughout the twenty-first century in the Southern Ocean (SO), as projected by CMIP6 Earth System Model simulations (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5), and the microbial diversity responses to increasing temperature in the northwestern Antarctic Peninsula. The high emissions scenarios projected a much earlier emergence of the human-induced temperature change throughout the SO. Based on ML predictions, the impacts on Antarctic marine diversity of bacteria and archaea are expected to be significant and persistent by the late twenty-first century, especially within the higher end of the range of future forcing pathways. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
We acknowledge the World Climate Research Program, which, through its Working Group on Coupled Modeling, coordinated and promoted CMIP6. We thank the climate modeling groups for producing and making available their model output, the Earth System Grid Federation (ESGF) for archiving the data and providing access, and the multiple funding agencies that support CMIP6 and ESGF. We also thank the Brazilian Antarctic Program (PROANTAR); the captains of Npo. Almirante Maximiano and their respective crews of OPERANTAR XXXII and XXXIII; the Brazilian High Latitude Oceanography Group (GOAL) and our colleagues onboard for their help in sampling and data acquisition; and Ivan G. de C. Ferreira for MICROSFERA's data. The content of this manuscript has been presented in part at the SCAR 2020 OSC Online, Session 7-Southern Ocean Circulation: change and consequences, ISBN: 978-0-948277-59-7.