# ANTARCTIC BIOLOGY: SCALE MATTERS

EDITED BY: Peter Convey, Katrin Linse, Huw James Griffiths, Bruno Danis, Anton Pieter Van de Putte and Alison Elizabeth Murray

PUBLISHED IN: Frontiers in Ecology and Evolution, Frontiers in Microbiology and Frontiers in Marine Science

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-778-2 DOI 10.3389/978-2-88963-778-2


# ANTARCTIC BIOLOGY: SCALE MATTERS

Topic Editors:

Peter Convey, British Antarctic Survey (BAS), United Kingdom; Katrin Linse, British Antarctic Survey (BAS), United Kingdom; Huw James Griffiths, British Antarctic Survey (BAS), United Kingdom; Bruno Danis, Université libre de Bruxelles, Belgium; Anton Pieter Van de Putte, Royal Belgian Institute of Natural Sciences, Belgium; Alison Elizabeth Murray, Desert Research Institute (DRI), United States

Citation: Convey, P., Linse, K., Griffiths, H. J., Danis, B., Van de Putte, A. P., Murray, A. E., eds. (2020). Antarctic Biology: Scale Matters. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-778-2

# Table of Contents


Nuttapon Pombubpa, Eleonora Egidi, Ashley Franks, Pietro Buzzini and Laura Selbmann

*Mapping Antarctic Suspension Feeder Abundances and Seafloor Food-Availability, and Modeling Their Change After a Major Glacier Calving*

Jan Jansen, Nicole A. Hill, Piers K. Dunstan, Eva A. Cougnon, Benjamin K. Galton-Fenzi and Craig R. Johnson

*Linking Ross Sea Coastal Benthic Communities to Environmental Conditions: Documenting Baselines in a Spatially Variable and Changing World*

Vonda J. Cummings, Judi E. Hewitt, Simon F. Thrush, Peter M. Marriott, N. Jane Halliday and Alf Norkko


Scott Santagata, Veronica Ade, Andrew R. Mahon, Phillip A. Wisocki and Kenneth M. Halanych

*Local and Regional Scale Heterogeneity Drive Bacterial Community Diversity and Composition in a Polar Desert*

Kelli L. Feeser, David J. Van Horn, Heather N. Buelow, Daniel R. Colman, Theresa A. McHugh, Jordan G. Okie, Egbert Schwartz and Cristina D. Takacs-Vesbach


Franz M. Heindler, Henrik Christiansen, Bruno Frédérich, Agnes Dettaï, Gilles Lepoint, Gregory E. Maes, Anton P. Van de Putte and Filip A. M. Volckaert


John L. Darcy, Eli M. S. Gendron, Pacifica Sommers, Dorota L. Porazinska and Steven K. Schmidt

*Antarctic Relic Microbial Mat Community Revealed by Metagenomics and Metatranscriptomics*

Elena Zaikova, David S. Goerlitz, Scott W. Tighe, Nicole Y. Wagner, Yu Bai, Brenda L. Hall, Julie G. Bevilacqua, Margaret M. Weng, Maya D. Samuels-Fair and Sarah Stewart Johnson

*Comparison of Microbial Communities in the Sediments and Water Columns of Frozen Cryoconite Holes in the McMurdo Dry Valleys, Antarctica*

Pacifica Sommers, John L. Darcy, Dorota L. Porazinska, Eli M. S. Gendron, Andrew G. Fountain, Felix Zamora, Kim Vincent, Kaelin M. Cawley, Adam J. Solon, Lara Vimercati, Jenna Ryder and Steven K. Schmidt

*Spatial and Temporal Scales Matter When Assessing the Species and Genetic Diversity of Springtails (Collembola) in Antarctica*

Gemma E. Collins, Ian D. Hogg, Peter Convey, Andrew D. Barnes and Ian R. McDonald

*Actinobacteria and Cyanobacteria Diversity in Terrestrial Antarctic Microenvironments Evaluated by Culture-Dependent and Independent Methods*

Adriana Rego, Francisco Raio, Teresa P. Martins, Hugo Ribeiro, António G. G. Sousa, Joana Séneca, Mafalda S. Baptista, Charles K. Lee, S. Craig Cary, Vitor Ramos, Maria F. Carvalho, Pedro N. Leão and Catarina Magalhães

*Ensemble Modeling of Antarctic Macroalgal Habitats Exposed to Glacial Melt in a Polar Fjord*

Kerstin Jerosch, Frauke K. Scharf, Dolores Deregibus, Gabriela L. Campana, Katharina Zacher, Hendrik Pehlke, Ulrike Falk, H. Christian Hass, Maria L. Quartino and Doris Abele

# Editorial: Antarctic Biology: Scale Matters

Bruno Danis<sup>1</sup>\*†, Anton Van de Putte<sup>2,3</sup>†, Peter Convey<sup>4</sup>, Huw Griffiths<sup>4</sup>, Katrin Linse<sup>4</sup> and Alison E. Murray<sup>5</sup>

<sup>1</sup> Marine Biology Laboratory, Université Libre de Bruxelles, Brussels, Belgium, <sup>2</sup> OD Nature, Royal Belgian Institute of Natural Sciences, Brussels, Belgium, <sup>3</sup> KU Leuven, Leuven, Belgium, <sup>4</sup> British Antarctic Survey, Cambridge, United Kingdom, <sup>5</sup> Division of Earth and Ecosystem Sciences, Desert Research Institute, Reno, NV, United States

Keywords: biodiversity, Antarctica, global change, eco-evo, spatial scale, temporal scale

**Editorial on the Research Topic**

**Antarctic Biology: Scale Matters**

#### Edited by:

Charles K. Lee, University of Waikato, New Zealand

#### Reviewed by:

Hong Kum Lee, Korea Polar Research Institute, South Korea

> \*Correspondence: Bruno Danis bdanis@ulb.ac.be

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 24 January 2020 Accepted: 19 March 2020 Published: 21 April 2020

#### Citation:

Danis B, Van de Putte A, Convey P, Griffiths H, Linse K and Murray AE (2020) Editorial: Antarctic Biology: Scale Matters. Front. Ecol. Evol. 8:91. doi: 10.3389/fevo.2020.00091

## ANTARCTICA: A TIPPING SANCTUARY

A founding principle of the Antarctic Treaty is that, in the interests of all humankind, Antarctica should continue to be used exclusively for peaceful purposes and should not become the scene or object of international discord. From many standpoints, Antarctica is considered a sanctuary, and it plays a pivotal role in the global system. From an ecological point of view, Antarctica and the surrounding Southern Ocean harbor exceptional levels of biodiversity. Its ecosystems are, however, facing rapid climatic and environmental changes, and the scientific community, embodied by the Scientific Committee on Antarctic Research (SCAR), has identified the urgent need to understand the potential responses of these ecosystems. Such questions are extremely complex, as biodiversity, here defined as "the variability among living organisms from all sources, including inter alia, terrestrial, marine and other aquatic ecosystems and the ecological complexes of which they are part; this includes diversity within species, between species and of ecosystems" (Convention on Biological Diversity, 1992), can vary at many different spatio-temporal scales and levels of biological organization, from molecules to entire ecosystems.

Knowledge gaps (both Linnaean and Wallacean), previously identified by the Census of Antarctic Marine Life (2005–2010) (Schiaparelli et al., 2013), hamper our understanding and are being filled at a slow pace. This is due to both logistic and financial constraints tied to field work in the southern polar regions, and a continued reluctance from the research community and its funders toward considering raw data publication as a high priority and a means to justify their efforts.

This Research Topic offered the 137 participating authors the opportunity to publish research presented during the SCAR 12th Biology Symposium held in Leuven, Belgium, in July 2017. The main theme of the Symposium was "Scale matters." The rationale of the Symposium can be summarized as follows: biological processes and diversity span all levels, from the small molecular scale, through the population scale and up to the large ecosystem scale; understanding these processes, as well as past and present patterns of biodiversity, is essential for understanding possible threats to Antarctic biology and their impacts. This collection focuses on understanding biological distribution and trends, as well as adaptation and processes, in both marine and terrestrial realms, and including human biology. Special attention is paid to multidisciplinary research and how combining insights from different fields can help understand this unique region.

We highlight here some of the outcomes of the 20 papers assembled in this special issue, organized into three sections, focusing on our current understanding, recent developments, and future challenges that should be addressed by the Antarctic biological research community.

## CURRENT UNDERSTANDING OF ANTARCTIC BIODIVERSITY

Our current knowledge of Antarctic biodiversity is derived from a mosaic of information sources, including historic data, literature, museum collections, researchers' computers and notebooks, and accessible dedicated information systems. Heindler et al. focus on museum collections, in an attempt to delineate prey and microbiome composition in fish using metabarcoding techniques. Looking back in time (between 20 and 100 years), the authors were able to obtain data and detect significant shifts in microbiome composition of trematomid fish guts, suggesting a new way to utilize museum collections to gain insight into microevolutionary processes and changes in trophic networks. Biersma et al. focused on the evolutionary history (timescale of Myr) of Antarctic flora in the terrestrial realm and proposed the existence of regional glacial refugia in the mountainous area of the Antarctic Peninsula, exploring the adaptive potential of a group of endemic species facing rapid climate change. In the marine domain, on the Antarctic continental shelf, Santagata et al. analyzed species composition in communities of bryozoans that are exposed to environmental pressures, such as ocean acidification and warming, proposing that their assemblages, which play a pivotal role as ecosystem engineers, were tied to the combined effects of seasonal ice scour and carbonate chemistry. At the microbial level, Lee et al. studied ecosystem responses to moisture gradients in the very specific environmental context of the McMurdo Dry Valleys. Here, along intense moisture gradients, the response in terms of community diversity emerged as a switch between a system driven by abiotic factors to one more sensitive to biotic factors, advancing our understanding of potential ecosystem responses in threatened systems, such as polar deserts. Also in the McMurdo Dry Valleys, Sommers et al. 
focused on microbial communities, but in the specialized habitat of cryoconite holes (sediment-filled melt holes on glacier surfaces), hypothesizing that higher partitioning between sediments and water would be a biodiversity driver through spatial niche partitioning; contrary to expectation, they observed the opposite, highlighting the need for further studies at even finer scales. Rego et al. presented a more methodological approach, aiming at improving the potential for High Throughput Sequencing to access the diversity of Actinobacteria and Cyanobacteria in microenvironments from the McMurdo Dry Valleys, and highlighted the importance of combining cultivation and sequencing approaches to recover both the abundant and rare components of the bacterial assemblages present. In an area displaying exceptional environmental gradients, Deception Island, Bendia et al. studied how microbial ecosystems responded to temperature, salinity, and geochemistry gradients in a volcanic setting, finding that bacterial community structure was significantly driven by all these factors, while archaeal community structure only responded to temperature variations.

Evidently, even though our knowledge of Antarctic biodiversity has greatly improved over the last decade, the Antarctic research community is constantly uncovering new realms and challenges to address, particularly in the current context of rapid environmental change.

## THE IMPORTANCE OF SCALES: RECENT DEVELOPMENTS

Significant progress is being made from methodological and fundamental standpoints. In marine, freshwater and terrestrial environments, new habitats are being discovered and new approaches are being collaboratively developed to funnel new data into areas of priority research.

Once again, in the McMurdo Dry Valleys, Zaikova et al. worked on microbial mats from relict lake deposits, another under-studied, challenging oligotrophic environment. Using metagenome assemblies, they were able to attribute key roles, including nitrogen cycling and carbohydrate degradation, as well as stress responses, DNA repair and sporulation pathways, gaining new insights into functional paleoecology. Knowledge gaps not only characterize challenging environments, but also affect the taxonomy of groups that are generally assumed to be well-documented. As shown by Christiansen et al. and Christiansen et al. in studies focusing on the diversity of mesopelagic fishes using barcoding techniques, pseudo-cryptic or unrecognized species are often found in various taxa, while colonization of the Southern Ocean has occurred repeatedly for myctophids. Using a similar phylogeographic approach, combined with particle tracking models, Brasier et al. investigated the distribution of polychaete morphospecies and found that the observed distribution was linked to oceanographic parameters at regional scale, which has important implications for environmental management strategies. Using "species archetype models," Jansen, Hill, Dunstan, Eléaume et al. analyzed the value in grouping species by functional similarity in their response to environmental change, in an attempt to compensate for the scarcity of presence-absence data. They found that this approach may not fully resolve this issue, highlighting the need for caution in the use of distribution models when making statements about the distribution of biodiversity at various scales and taxonomic resolutions. Scale effects were also addressed by Feeser et al. in a study of soil microbial diversity and community dynamics, and the scale-dependent effects of environmental heterogeneity, in the McMurdo Dry Valleys. 
Their results highlight the importance of conducting such studies over large ranges of environmental gradients and across multiple spatial scales. Another functional approach was used by Cummings et al. in a study focusing on marine benthic systems from the Ross Sea and linking community composition to seafloor habitat and a series of other parameters. They showed that the most powerful approach was to include all parameters at all scales (and finding ways to connect them), which in the longer term produced the most robust models, with promising implications for the accuracy of predictions of change in coastal habitats.

## REMAINING CHALLENGES: CONNECTING THE SCALES

As mentioned above, understanding many biodiversity patterns and their responses to complex, interlinked environmental changes requires studies at multiple temporal and/or spatial scales. Coupling ocean-surface change to responses in the seafloor community is an example of such a connection, as illustrated by Jansen, Hill, Dunstan, Cougnon et al. These authors mapped patterns of abundance of suspension feeders in East Antarctica, modeling the ways in which they responded to the 2010 calving of the Mertz Glacier. Their data confirmed a strong increase in suspension feeder abundance, providing insight into the importance of a changing icescape on seafloor habitat and fauna in polar environments. In the western Antarctic Peninsula region, marine ecosystems have also been exposed to considerable icescape changes. In this region, Jerosch et al. studied the responses of macroalgae to glacial melt in fjord ecosystems, using ensemble modeling, and produced the first iteration of a quantitative model of macroalgal production under melt conditions. On a much smaller scale, Darcy et al. used a classic biogeographic theory, Island Biogeography, to address self-contained ecosystems in the form of cryoconite holes in Taylor Valley and, more specifically, the relationships between bacterial diversity and the size of the cryoconite holes and their geographic proximity. They found strong spatial structuring, suggesting that these holes are indeed behaving as "islands" in terms of bacterial diversity. Coleine et al. shared research about another peculiar niche found in Antarctica, the cryptoendolithic niche, setting out to identify whether there was any relationship between abiotic parameters and fungal community biodiversity and composition. In this case, no correlations were found, presumably due to strong environmental heterogeneity at local scales, once again highlighting the need for integrated studies at multiple scales. 
When assessing biodiversity patterns, spatio-temporal scales clearly matter, as highlighted by Collins et al.'s study of Victoria Land Antarctic springtails. Through collating and expanding an important dataset on Collembola CO1 DNA ("barcode") sequences, these authors found evidence for limited dispersal opportunities and strong heterogeneity in genetic diversity between sites. As a result, in the development of conservation strategies, they stress that species-specific spatio-temporal scales should be taken into consideration. Another study with urgent implications in terms of conservation is that of Ropert-Coudert et al., who focused on breeding failures in an Adélie penguin colony, and existing scenarios of entire populations collapsing, using the specific example of sharp changes in icescapes in the D'Urville Sea.
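The island-biogeography framing mentioned above for cryoconite holes is often summarized by the species-area relationship, S = c·A^z, which becomes a straight line on log-log axes. The following is only a minimal sketch of fitting that relationship by ordinary least squares; it is not the method used by Darcy et al., and the function name `fit_species_area` and the input values are hypothetical, synthetic illustrations rather than data from any study in this collection.

```python
import math


def fit_species_area(areas, richness):
    """Fit log(S) = log(c) + z * log(A) by ordinary least squares.

    Returns (c, z). All inputs must be positive numbers.
    """
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in richness]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the log-log regression line.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx
    c = math.exp(mean_y - z * mean_x)
    return c, z


# Hypothetical, synthetic "islands": areas in arbitrary units, with richness
# generated from S = 10 * A**0.25 purely to check that the fit recovers it.
areas = [1.0, 4.0, 16.0, 64.0]
richness = [10.0 * a ** 0.25 for a in areas]
c, z = fit_species_area(areas, richness)
print(f"c = {c:.3f}, z = {z:.3f}")
```

With real data, the exponent z would be estimated with uncertainty, and spatial proximity between holes would need separate treatment (for example, a distance-decay analysis).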

## INFORMATION ECOSYSTEMS ARE NEEDED TO CONNECT THE SCALES

Antarctic science is evolving rapidly, with developments in genomics, the digital revolution, big data, modeling, etc. now being the norm. However, data mining, probing deeper into available information and museum collections, and working in a transdisciplinary manner across scales also help the research community move ahead efficiently and respond to ongoing environmental emergencies. The publication of existing and new data still needs to be improved, 10 years after the Census of Antarctic Marine Life, as information remains scattered and incomplete, and often has restricted availability, which can be problematic in light of the mismatch between information needs and the pace of climate change in south polar regions. A further step is the predictive use of such information for research, conservation and management purposes. This requires further development in the parameterization and modeling of complex systems, while at the same time supporting the maintenance of the unique knowledge of the SCAR and Antarctic community in a series of biological disciplines, which are increasingly threatened with extinction.

These predictive applications also require a better integration of Antarctic biodiversity knowledge in global biodiversity assessments (including those focusing on biosphere integrity and rate of biodiversity -function- loss). A range of global biodiversity observation initiatives and interfaces are developing, such as IPBES (the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services), in which Antarctic biodiversity research outputs should be better represented, and in which SCAR community practices could pave the way in terms of international collaboration, and data valorization and sharing. This approach could refine the development of biodiversity assessments and indexes at the scale of the planet. As pointed out in the SCAR Horizon Scan (see e.g., Kennicutt et al., 2014) and in a recent review on sustained Antarctic research, identified priorities for Life Science research include improving our understanding of mechanisms leading to biodiversity loss, understanding how ecosystems respond to fast-changing environmental conditions, characterizing biological adaptations, defining strategies of resilience, and constantly assessing the efficacy of ongoing and future conservation practices (Kennicutt et al., 2019).

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## ACKNOWLEDGMENTS

We are grateful to the SCAR Biology Leuven Symposium participants for their efforts and good spirits during the event. We are especially grateful to the reviewers who returned their comments within the requested timeline.



**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Danis, Van de Putte, Convey, Griffiths, Linse and Murray. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Distributional Patterns of Polychaetes Across the West Antarctic Based on DNA Barcoding and Particle Tracking Analyses

Madeleine J. Brasier<sup>1,2</sup>\*, James Harle<sup>3</sup>, Helena Wiklund<sup>2</sup>, Rachel M. Jeffreys<sup>1</sup>, Katrin Linse<sup>4</sup>, Henry A. Ruhl<sup>3</sup> and Adrian G. Glover<sup>2</sup>

*<sup>1</sup> School of Environmental Science, University of Liverpool, Liverpool, United Kingdom, <sup>2</sup> Life Sciences, Natural History Museum, London, United Kingdom, <sup>3</sup> National Oceanography Centre, University of Southampton, Southampton, United Kingdom, <sup>4</sup> BioSciences, British Antarctic Survey, Cambridge, United Kingdom*

#### Edited by:

*Wei-Jen Chen, National Taiwan University, Taiwan*

#### Reviewed by:

*Stephane Hourdez, Centre National de la Recherche Scientifique (CNRS), France; Andrew Anthony David, Clarkson University, United States*

> \*Correspondence: *Madeleine J. Brasier m.brasier@liverpool.ac.uk*

#### Specialty section:

*This article was submitted to Marine Evolutionary Biology, Biogeography and Species Diversity, a section of the journal Frontiers in Marine Science*

> Received: *13 July 2017* Accepted: *24 October 2017* Published: *16 November 2017*

#### Citation:

*Brasier MJ, Harle J, Wiklund H, Jeffreys RM, Linse K, Ruhl HA and Glover AG (2017) Distributional Patterns of Polychaetes Across the West Antarctic Based on DNA Barcoding and Particle Tracking Analyses. Front. Mar. Sci. 4:356. doi: 10.3389/fmars.2017.00356*

Recent genetic investigations have uncovered a high proportion of cryptic species within Antarctic polychaetes. It is likely that these evolved in isolation during periods of glaciation, and it is possible that cryptic populations would have remained geographically restricted from one another, occupying different regions of Antarctica. By analysing the distributions of nine morphospecies (six of which contained potential cryptic species), we find evidence for widespread distributions within the West Antarctic. Around 60% of the cryptic species exhibited sympatric distributions, and at least one cryptic clade was found to be widespread. Additional DNA barcodes from GenBank and morphological records extended the observed range of three species studied here, and indicate potential circum-Antarctic traits. Particle tracking analyses were used to model theoretical dispersal ranges of pelagic larvae. Data from these models suggest that the observed species distributions inferred from genetic similarity could have been established and maintained through the regional oceanographic currents, including the Antarctic Circumpolar Current (ACC) and its coastal counter current. Improved understanding of the distribution of Antarctic fauna is essential for predicting the impacts of environmental change and determining management strategies for the region.

Keywords: circumpolar, biogeography, deep-sea, cryptic species, Southern Ocean, benthos

### INTRODUCTION

Enclosed by both the Antarctic Circumpolar Current (ACC) and frontal systems, the Southern Ocean is often described as an isolated marine environment (**Figure 1**). These oceanographic features act as physical barriers, which are thought to have prevented species movement into and out of the Southern Ocean. For this reason, early predictions suggested that the majority of benthic fauna within Antarctic waters would be endemic to the Southern Ocean (Ekman, 1953; Hedgpeth, 1969). Endemism has since been observed in many major taxonomic groups, based on species records from mostly morphological species identification (see reviews by Dell, 1972; Arntz et al., 1997; Clarke and Johnston, 2003; Thorpe et al., 2007; De Broyer and Danis, 2011; Brandt et al., 2012; Kaiser et al., 2013). Current estimates of endemism, as reported in the recently published Southern Ocean Biogeographic Atlas, range from 50 to 97% between taxa (De Broyer et al., 2014).

A further biogeographic pattern that has long been associated with Antarctic marine fauna is circumpolarity, i.e., circum-Antarctic distributions (Arntz et al., 1997; Clarke and Johnston, 2003). These distributions could result from a combination of factors that have a homogenizing effect on the faunal communities. The continuous coastline around the continent itself provides connectivity between the different seas around Antarctica. Furthermore, given the relatively uniform physical conditions across the continental shelf, the settlement and survival of individuals is not restricted by their physiology, e.g., temperature tolerance (Arntz and Gallardo, 1994). Larval dispersal around the continent could be aided by oceanographic currents, including the ACC, its coastal counter current (also referred to as the East Wind Drift), and the Weddell and Ross Sea gyres (**Figure 1**; Orsi et al., 1993, 1995; Fahrbach et al., 1994; Linse et al., 2007).

Antarctic benthic fauna often have extended depth ranges and are considered to be eurybathic, i.e., capable of living in benthic habitats within both shallow and deep water. Some of the first suggestions for this were made by Dell (1972) and Knox and Lowry (1977) for various taxa including sponges, corals, polychaetes and molluscs. Broad depth distributions of Antarctic fauna are thought to be associated with the advance and retreat of shelf ice during glacial-interglacial cycles (Brey et al., 1996). During periods of glacial expansion, some "shelf fauna" may have been moved down slope into ice-free habitats. During the subsequent glacial retreat, the now "slope fauna" could then recolonize shallower shelf areas, thus evolving eurybathic distributions (Clarke and Johnston, 2003). This movement was possible due to similar physical conditions (e.g., temperature) on the shelf, slope and deep-sea floor, which reduced the need for specific adaptations to survive in these environments (Clarke et al., 2009; Clarke and Crame, 2010).

Some of these generalized patterns have been challenged when it comes to the deep-sea benthos. Although the ACC and Polar Front may affect the movement of pelagic and shallow water species, they may not be a barrier to the benthos (Clarke, 2003). Antarctic polychaetes seem to have some of the broadest geographical distribution ranges amongst the Antarctic benthic macrofaunal invertebrates (Schüller and Ebbe, 2007). During analysis of the ANtarctic benthic DEEP-sea biodiversity (ANDEEP) samples, it was noted that more than half of all polychaetes identified matched species found north of the Polar Front, 20% of which are also found in the northern hemisphere (Brandt et al., 2007). However, in recent work by Neal et al. (2017), depth was identified as the main factor structuring the polychaete communities within the Amundsen and Scotia Seas, contradicting the broad depth distributions often associated with Southern Ocean benthic marine fauna.

Large-scale sampling programs and species databases are providing valuable insight into the biogeography of many species; however, these analyses are predominantly based on morphological identifications. Furthermore, early identifications of Antarctic benthos reflect the species names in pre-existing monographs of, for example, European fauna. Given the abundance of undescribed deep-sea species, the number of specimens collected on large sampling programs and the potential for identifying features to be damaged on collection, we use molecular taxonomy to validate the accuracy of biogeographic records. Schüller and Ebbe (2014) investigated all georeferenced Register of Antarctic Marine Species (RAMS) polychaetes within the Scientific Committee on Antarctic Research Marine Biodiversity Information Network (SCAR-MarBIN), concluding that while Southern Ocean polychaete taxonomy is improving with new species descriptions, there are still many "cosmopolitan" species which are most probably Antarctic species distinct from their Northern counterparts. This can be readily tested using DNA barcoding where comparative sequences exist (Brasier et al., 2016).

Our ability to define the geographic distribution, or biogeography, of species, including how their distribution was established and is maintained, has also progressed with the development of DNA sequencing, phylogeography and population genetics (Riesgo et al., 2015). These methods allow us to visualize and, with sufficient sample numbers, calculate the level of population connectivity between known populations of species from different localities, which can be controlled by several interacting biological, physical and chemical factors. Since the application of genetics in Antarctic diversity and biogeographic studies, many species presumed to be circum-Antarctic or cosmopolitan based on morphological analysis have been found to comprise more geographically restricted and separated cryptic clades. For example, genetically distinct restricted populations of the isopod Betamorpha fusiformis (Raupach et al., 2007), the cephalopod Pareledone spp. (Allcock et al., 2011) and the crinoid Promachocrinus kerguelensis (Wilson et al., 2007) have been identified by comparing mitochondrial DNA. In other cases, DNA barcoding has provided evidence for circum-Antarctic distributions, e.g., genetic homogeneity within benthic invertebrates has been recorded in the nemertean ribbon worm Parborlasia corrugatus (Thornhill et al., 2008), the two shrimp species Chorismus antarcticus and Nematocarcinus lanceopes (Raupach et al., 2010) and the pycnogonid Nymphon australe (Arango et al., 2011).

This study uses mitochondrial DNA barcoding data from one of the largest Antarctic polychaete barcoding projects to date (Brasier et al., 2016) to investigate the distributions of nine polychaete morphospecies previously considered to be cosmopolitan or circum-Antarctic. We combine our genetic analyses with particle tracking models to examine the directionality and distance of dispersal of pelagic polychaete larvae and the potential to maintain genetic connectivity across widely distributed populations. Together with our knowledge of Southern Ocean glaciations and polychaete larval biology, we aim to improve our understanding of the biological and physical factors influencing the distribution of polychaete species within the Southern Ocean. We hypothesize that although many morphospecies of polychaetes exhibit "characteristic" distributional patterns for Southern Ocean species, including endemism, eurybathy and circum-Antarctic distributions (Griffiths et al., 2009; Brandt et al., 2012), their cryptic clades may be more restricted than their morphospecies, i.e., cryptic clades are found within a single region of Antarctica, which may be an artifact of their evolution during glacial periods when populations were isolated from one another.

# METHODS

#### Specimen Selection and Presumed Distribution

Polychaetes were collected from three locations within the Southern Ocean: the Scotia Arc, containing six sampling sites, and the Amundsen Sea and Weddell Sea, both containing four sampling sites, **Figure 1**. All polychaetes were identified based on morphological characters; some individuals could not be identified to any currently described species and were considered new to science. These results are presented in Neal et al. (2017) and all voucher specimens were deposited in the Natural History Museum London; details can be found online using the Darwin Core Archive https://doi.org/10.5519/0068114. DNA was extracted and sequenced from nine polychaete morphospecies (211 specimens in all); for details on the phylogenetic analysis of these sequences, including the identification of cryptic species, see Brasier et al. (2016). Prior to DNA barcoding, a biogeographic distribution classification (cosmopolitan, circum-Antarctic or restricted) was allocated to each initial morphospecies analyzed (**Table 1**), where: "cosmopolitan" species are those that have been recorded throughout the majority of the world's oceans and both hemispheres; "circum-Antarctic" species are those that have been collected within different regions of the Southern Ocean and are considered to be widespread around the Antarctic; and "restricted" species are those only recorded in one Antarctic region or location, with no records within the Register of Antarctic Marine Species (RAMS) or Basic Local Alignment Search Tool (BLAST) matches from other localities.

These definitions are consistent with the most recent biogeographic review of Antarctic polychaetes, Schüller and Ebbe (2014), who considered species circum-Antarctic if there were georeferenced RAMS records from at least the Weddell Sea, West Antarctic Peninsula (WAP) or the Scotia Arc, and the Ross Sea or Eastern Antarctica. The presumed distributions in **Table 1** were defined using taxonomic databases including the Encyclopedia of Life (EOL) and RAMS. For undetermined morphospecies, their distribution was determined from their presence within samples collected in each of the regions as well as any BLAST matches with georeferenced sequences, including larval matches within the Ross Sea (Heimeier et al., 2010; Gallego et al., 2014).
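The circum-Antarctic criterion described above can be read as a simple set rule. The following is a minimal sketch of one possible reading, not the procedure used by Schüller and Ebbe (2014); the function name, the region labels and the grouping of regions into two sets are assumptions made for illustration.

```python
from typing import Iterable

# Assumed reading of the rule: at least one record from the first group
# AND at least one record from the second group.
WEST_GROUP = {"Weddell Sea", "West Antarctic Peninsula", "Scotia Arc"}
EAST_GROUP = {"Ross Sea", "Eastern Antarctica"}


def is_circum_antarctic(record_regions: Iterable[str]) -> bool:
    """Return True if georeferenced records span both region groups."""
    regions = set(record_regions)
    return bool(regions & WEST_GROUP) and bool(regions & EAST_GROUP)


# Hypothetical record sets, for illustration only.
print(is_circum_antarctic(["Scotia Arc", "Ross Sea"]))     # records in both groups
print(is_circum_antarctic(["Scotia Arc", "Weddell Sea"]))  # records in one group only
```

Under this reading, a species recorded in several western regions but with no Ross Sea or Eastern Antarctica record would not meet the criterion, however widespread it is within the West Antarctic.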

#### DNA Sequencing and Analysis

The genetic diversity of the DNA barcodes generated from the nine morphospecies was investigated in Brasier et al. (2016). Phylogenetic and distance analyses found evidence for twelve cryptic species, increasing the number of species in this study from nine to seventeen. In Brasier et al. (2016) clades were assigned MB# to distinguish species, including cryptic species, e.g., Scalibregma sp. MB1 and Scalibregma sp. MB2. Cryptic species were referred to as either "Genus cf species MB#," "Genus sp. MB#" or, if no genus could be assigned, "Family sp. MB#." The use of "cf" or "sp." was associated with the type locality of the original species and the likelihood of the Antarctic specimens being the "true" species; for more details see Brasier et al. (2016). Using haplotype networks we have analyzed the distribution of each genetically identified species, morphological and cryptic, listed in **Table 1**.

As the recovery of 16S sequences was greater than COI in all morphospecies examined, this marker was used to construct georeferenced haplotype networks of the sequenced specimens to visualize species distributions and speculate on biogeographic patterns. An exception was Hesionidae sp. A, for which cryptic species were revealed in COI but not 16S. If sequence matches within the GenBank database were found during phylogenetic analysis, these were also included in the network (**Table 2**). To avoid problems with gaps, all sequences of the same species were trimmed in Mesquite (Version 2.75) to the same length (**Table 1**) following MAFFT (for 16S) or MUSCLE (for COI) sequence alignment in Geneious (R7). GPS coordinates were assigned to each sequence for its given sample location, and networks were constructed using statistical parsimony (Templeton et al., 1992) and the TCS programme (Clement et al., 2000) in PopART for editing in CorelDRAW X8. For the number of sequences per species by site used in the georeferenced haplotype networks, please see Supplementary Material. Haplotype networks with depth-referenced sequences were also created using five depth bins: <500 m, 500–1,000 m, 1,001–1,500 m, 1,501–2,000 m and >2,000 m. The depth bins were chosen based on the discrete depth horizons sampled during collection expeditions (for details see Griffiths et al., 2008 and Linse et al., 2013). Most sequenced specimens were obtained from depths of 500, 1,000, 1,500, or 2,000 m; some specimens were collected from 100 and 200 m and were included in the <500 m depth bin, and those at depths greater than 2,000 m were also binned together.
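The depth binning described above can be sketched as follows. This is an illustrative example only, not the authors' workflow: the function name, the bin label strings and the example records are hypothetical, and the bin edges are taken directly from the five bins listed in the text.

```python
from collections import Counter


def depth_bin(depth_m: float) -> str:
    """Assign a collection depth (in metres) to one of the five depth bins."""
    if depth_m < 500:
        return "<500 m"
    if depth_m <= 1000:
        return "500-1,000 m"
    if depth_m <= 1500:
        return "1,001-1,500 m"
    if depth_m <= 2000:
        return "1,501-2,000 m"
    return ">2,000 m"


# Hypothetical (haplotype label, collection depth in m) records, for illustration only.
records = [("H1", 200), ("H1", 500), ("H2", 1000), ("H2", 1500), ("H3", 2100)]

# Number of sequences per haplotype and depth bin, as would be used to
# shade the nodes of a depth-referenced haplotype network.
counts = Counter((haplotype, depth_bin(depth)) for haplotype, depth in records)
for (haplotype, bin_label), n in sorted(counts.items()):
    print(haplotype, bin_label, n)
```

Because specimens were collected at discrete depth horizons, the bin edges at 500, 1,000, 1,500 and 2,000 m matter: a specimen from exactly 500 m falls in the 500–1,000 m bin, not the <500 m bin.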

#### Particle Tracking Model to Estimate Larval Dispersal

To gain an insight into the potential pathways of larval dispersal between the sampled localities, a particle tracking model (ARIANE; Blanke and Raynaud, 1997) was applied to the output of a NEMO (Madec, 2008) ocean general circulation model (OGCM). Five day mean ocean current data obtained from the National Oceanography Centre, U.K., NEMO 1/12° OGCM (Duchez et al., 2014), for the period 2000–2009, were used to calculate the 3D trajectories of passive particles released from 17 sites from four Antarctic locations, **Table 3**. Within each model grid cell co-located with a sampled site, eight evenly distributed particles were released at every model depth level and at 5 day intervals for the period 2000–2009. These particles were then tracked for 1 year. No vertical or horizontal dispersion was added to the particle motion and no buoyancy terms were included, so the subsequent particle pathway is purely an advective one. A 75 km search radius or "trap" at each sample site location for the full water column was used to determine whether foreign particles passed close by and were therefore potentially connected to another sample site.
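The 75 km "trap" test can be illustrated with a great-circle distance check on a particle track. The sketch below is not part of ARIANE or NEMO; the function names, the spherical-Earth radius and the example coordinates are assumptions. Because the trap covers the full water column, only horizontal distance is tested.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation (assumption)
TRAP_RADIUS_KM = 75.0     # search radius stated in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def first_arrival(track, site_lat, site_lon, radius_km=TRAP_RADIUS_KM):
    """Return the index of the first track position inside the trap, or None.

    `track` is a sequence of (lat, lon) positions at successive output steps.
    """
    for step, (lat, lon) in enumerate(track):
        if haversine_km(lat, lon, site_lat, site_lon) <= radius_km:
            return step
    return None


# Hypothetical track and site position, for illustration only.
track = [(-63.0, -55.0), (-62.0, -55.0), (-61.5, -55.0)]
print(first_arrival(track, site_lat=-61.0, site_lon=-55.0))
```

With positions stored at a fixed output interval, the returned step index multiplied by that interval gives an arrival time, which is the kind of quantity summarized later as the mean time of arrival.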

Particle tracking analysis is relevant to polychaete reproductive modes because six of the eight families included in this study are considered to have pelagic larvae (**Table 4**), with five of these known to have feeding (planktotrophic) larvae. The reproductive traits of each species examined in this study were inferred from family level trait data available from the largest polychaete trait database, Polytraits<sup>1</sup>. Species level reproductive traits are rare, especially for those that are the hardest to collect and identify, e.g., deep sea and Antarctic species. We know from genetic studies that three of our species, Hesionidae sp. (MB1), Aglaophamus trissophyllus and Laonice weddellia, have pelagic larvae in Antarctic waters (Heimeier et al., 2010; Gallego et al., 2014).

As discussed in our interpretation, several aspects of larval biology were not incorporated into this model, including larval mortality and behavior. These factors can lead to increased retention of larvae within their source location and thus it is possible that our observations of the distance and direction of dispersal are overestimated (e.g., Cowen et al., 2006; Levin, 2006). Additionally, we have not examined the seasonal differences in the release of larvae. Although some studies have found evidence for seasonal reproductive cycles in benthic fauna, different larval stages have been observed within the water column through the year (Bowden et al., 2009; Sewell and Jury, 2011). In this study particle tracking analyses are used purely to gain insight into how the oceanography of the Southern Ocean may theoretically affect population connectivity through the directionality and distance of dispersal. Furthermore, we do not know enough about Antarctic polychaete larvae to estimate these behavioral or biological model constraints.

<sup>1</sup>PolyTraits Team. Data from: Polytraits: A database on biological traits of polychaetes. Hellenic Centre for Marine Research: Lifewatch, Greece (2017) http://polytraits.lifewatchgreece.eu

 *in* Figures 2*–*4 *including the gene used, number of individuals per taxa and the number of base pairs and segregating sites in brackets.* \**Indicates cryptic species,* \*\**indicates additional sequences from GenBank used in alignment, see* Table 2*.*

TABLE 2 | Additional sequences included in the haplotype networks, including the MB species they matched, the gene sequenced, specimen locality, GenBank names and accession number.

TABLE 3 | The release sites used in particle tracking analysis by location with latitude and longitude position and the depth range in which particles were released.

*Abbreviations correspond to the labeling in* Figures 5*–*8*. For Scotia Arc, Amundsen Sea and Weddell Sea sites these positions are averages of multiple EBS tracks at these stations from which the barcoded polychaetes were collected and, for the Ross Sea the locations in the literature from which the barcoded larval specimens used in haplotype networks were collected (Heimeier et al., 2010; Gallego et al., 2014).*

TABLE 4 | Reproductive traits of the 8 polychaete families containing species from which DNA barcodes were collected in this study.

*Traits obtained from the Polytraits database (http://polytraits.lifewatchgreece.eu), trait definitions are based on those stated on Polytraits. Fertilization type: fertilization can take place internally (within the female body) or externally often by broadcast spawning. Development type: the mode of development from the larval to adult stage either indirect (one or more successive free living larval stages) or direct (no intermediate larval stages). Larval mode: position of larval development either pelagic (in the water column) or benthic (near or on the seafloor). Larval feeding mode: either planktotrophic (larvae capture their own food) or lecithotrophic (maternal derived nutrition). Original references: Strelzov (1979)<sup>1</sup>; Fauchald (1983)<sup>2</sup>; Bhaud (1988)<sup>3</sup>; Blake and Arnofsky (1999)<sup>4</sup>; Van Dover et al. (1999); Beesley et al. (2000)<sup>5</sup>; Rouse and Pleijel (2001)<sup>6</sup>; Pernet et al. (2002)<sup>7</sup>; Carson and Hentschel (2006)<sup>8</sup> and Rouse and Pleijel (2006)<sup>9</sup>.*

# RESULTS

The distribution patterns of seventeen genetically-determined polychaete species are based on the observed distributions from our haplotype networks and GenBank comparisons. In this study we found no evidence for cosmopolitan species. The genetic differences between our Antarctic specimens of Glycera capitata and Scalibregma inflatum, both considered to be cosmopolitan, and publicly available sequence data from their northern representatives [which were closer to their type localities, Greenland (G. capitata) and Norway (S. inflatum)] indicated that the Antarctic clades are cryptic species; these were therefore examined for widespread Antarctic distributions, **Figure 2**.

#### Observations of Widespread Species within the West Antarctic

The most common distribution observed was widespread Antarctic, with 71% of the species investigated recorded to have genetically similar specimens in at least two Antarctic locations. The most abundant were Laonice weddellia, Harmothoe fuligineum and Aricidea simplex, **Figure 3**. A. simplex was sequenced from all three Antarctic locations sampled. The distribution of L. weddellia was also extended to the Ross Sea, as its sequences were identical to that of a larval specimen from the Ross Sea (Gallego et al., 2014). No specimens of H. fuligineum were identified from the Weddell Sea, but specimens were found within the Amundsen Sea and Scotia Arc regions.

Most of the widespread Antarctic cryptic species were sequenced from two of the sampled locations, e.g., the Scotia Arc and Amundsen Sea [Glycera sp. (MB1) and (MB2), Scalibregma sp. (MB3)]. Aglaophamus sp. (MB2) was the only cryptic species sequenced from all three locations. Aglaophamus sp. (MB1) was only sequenced from specimens in the Scotia Arc but was recorded as widespread because sequenced specimens from the Scotia Arc matched larval sequences from the Ross Sea (Heimeier et al., 2010), **Figure 4A**. Further larval matches from the Ross Sea were found with Hesionidae sp. (MB2). In comparison to Hesionidae sp. (MB2) the distribution of Hesionidae sp. (MB1) was reduced with representation from the Scotia Arc (Southern Thule) and the Amundsen Sea only, **Figure 4B**.

#### Observations of Restricted Species

Whilst both Scalibregma sp. (MB1) and Euphrosinella cf cirratoformis (MB1) were collected from more than one Antarctic region, their cryptic counterparts Scalibregma sp. (MB2) and (MB3) and Euphrosinella cf cirratoformis (MB2) were each sequenced from only one location: Southern Thule, Elephant Island and the Amundsen Sea, respectively, **Figure 2B**. Scalibregma sp. (MB2) was only represented by a single individual, but Scalibregma sp. (MB3) and Euphrosinella cf cirratoformis (MB2) were represented by four and three individuals, respectively. Although these clades appear more restricted compared to their cryptic counterparts, they exist sympatrically within the same regions. The only cryptic species that appear to exist allopatrically are the three clades of the morphospecies Aricidea belgicae. Each species was sequenced from a single separate region: Aricidea cf belgicae (MB1) from the Amundsen Sea, Aricidea cf belgicae (MB2) from the Weddell Sea and Aricidea cf belgicae (MB3) from the Scotia Arc, **Figure 3C**.

#### Depth Distributions

The maximum depth range that could be recorded was 100–2,000 m. The greatest depth distribution recorded in this study was found in Laonice weddellia, which was collected at depths between 200 and 1,500 m. Aricidea simplex had a similar depth range but the deepest specimens were collected from 1,000 m, whilst Harmothoe fuligineum had the narrowest depth range for the non-cryptic species, from 200 to 500 m, **Figure 3**.

Some patterns in the depth distributions of cryptic species were observed. For example, Glycera sp. (MB1) was collected from 200 to 500 m and Glycera sp. (MB2) from 500 to 1,000 m depth, **Figure 1A**. Similar patterning was also observed between Hesionidae sp. (MB1) and Hesionidae sp. (MB2), Euphrosinella cf cirratoformis (MB1) and Euphrosinella cf cirratoformis (MB2), and Aglaophamus cf trissophyllus (MB1) and Aglaophamus sp. (MB2), **Figures 3**, **4**. Two of the cryptic species of Scalibregma, (MB1) and (MB3), exhibited the same depth distribution, 200–500 m, whilst the single representative of Scalibregma sp. (MB2) was only found at 1,000 m, **Figure 2B**. For the allopatric cryptic clades of Aricidea belgicae, it was Aricidea cf belgicae (MB1) that exhibited the greatest depth distribution of 500–1,000 m, whilst Aricidea cf belgicae (MB2) and Aricidea cf belgicae (MB3) were only observed at 200 and 600 m, respectively, **Figure 3B**.

#### Particle Tracking Analysis

Particles released within the Ross Sea shelf and slope sites were transported westward around the Antarctic continent in the counter current and thus away from the other sample locations, **Figure 5**. In contrast, particles released within the offshore site within the Ross Sea were carried eastward in the direction of the WAP but away from the continental slope in the ACC. Particles from sites within the Amundsen Sea followed the same westward direction of dispersal, with those released on the edge of the continental shelf (AS\_BIO3 and AS\_BIO6) transported toward the Ross Sea sites, **Figure 6** (AS\_BIO3). Particles from release sites at the northern tip of the WAP (Elephant Island, Livingston Island and Powell Basin) were transported in both directions, but with greater dispersal eastwards across the Scotia Sea, reaching the Shag Rocks and South Georgia sites, **Figure 7**. Particles from Shag Rocks and South Georgia were also transported east and northward, being advected by the ACC. Particle release sites within the Weddell Sea were well connected by the Weddell Sea gyre; however, the extent of dispersal varied between sites. For the inshore locations (WS\_FTSE and WS\_FSCE) particles were transported westward and up the WAP (Figure not shown), whilst those released further offshore (WS\_CS; **Figure 8**) reached further into the Scotia Sea and into the ACC, crossing through the Shag Rocks and South Georgia sites as well as those closer to the WAP: Powell Basin, Elephant Island and Livingston Island.

The statistical analyses in **Figures 9**, **10** show the percentage of particles passing through a 75 km radius or "trap" surrounding each of the other sites, and their mean time and depth of arrival. As expected, particles traveling the greatest distances were found in the surface layers as a result of the stronger advection (indicated by the lighter shaded squares in **Figures 9**, **10** and the lower percentages). In general, connectivity within regions was greater than between regions; for example, 60% of particles from Elephant Island passed within 75 km of Livingston Island. However, this connectivity was unidirectional, as <1% of particles released from Livingston Island were observed in the vicinity of Elephant Island. The level of inter- and intra-regional connectivity is highly dependent on the release site location in relation to local current pathways, and to a lesser degree on the physical distance between the release sites. For example, there appears to be no connectivity between AS\_BIO6 and AS\_BIO4/5, yet the Ross Sea sites receive particles from all Amundsen Sea sites. Southern Thule was the least connected site, receiving <1% of particles from the Powell Basin site. Overall, WS\_CS had the highest connectivity to sites outside of its region, with particles reaching all sites within the Scotia Arc.
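The percentages described above amount to a directed connectivity matrix, in which the value from site A to site B need not equal the value from B to A. A minimal sketch of that calculation is given below; the function name, the input layout and the toy particle counts are hypothetical, and each particle is assumed to appear at most once per destination in the arrivals list.

```python
from collections import defaultdict


def connectivity_percent(released, arrivals):
    """Percentage of particles from each source that entered each other site's trap.

    released: dict mapping source site -> number of particles released there.
    arrivals: iterable of (source, destination) pairs, one per particle that
              entered the destination trap (assumed de-duplicated per particle).
    """
    counts = defaultdict(int)
    for source, destination in arrivals:
        if source != destination:  # only "foreign" particles count as connections
            counts[(source, destination)] += 1
    return {pair: 100.0 * n / released[pair[0]] for pair, n in counts.items()}


# Toy numbers for illustration only: 10 particles released per site,
# 6 of the "EI" particles pass through the "LI" trap and none travel the other way.
released = {"EI": 10, "LI": 10}
arrivals = [("EI", "LI")] * 6
matrix = connectivity_percent(released, arrivals)
print(matrix.get(("EI", "LI"), 0.0), matrix.get(("LI", "EI"), 0.0))
```

Keeping the matrix keyed by ordered (source, destination) pairs makes the asymmetry explicit, which is what allows unidirectional connections to be identified.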

# DISCUSSION

In this study, nine of the seventeen genetically-identified species revealed different biogeographic patterns to the distributions inferred from morphology-based identification records of the initial morphospecies and the RAMS database (**Table 1**). These nine species have been described as cryptic species in Brasier et al. (2016), and their haplotype networks confirmed this, displaying characteristics of cryptic speciation as described in Allcock and Strugnell (2012). In some cases, e.g., Aricidea cf belgicae (MB1), (MB2), and (MB3), more restricted distributions were observed, but most remained widespread and potentially circum-Antarctic. Additionally, the networks showed that the majority of cryptic species exhibited sympatric distributions, occupying the same geographic locations.

#### Widespread Antarctic Species

Matching haplotypes between individuals of the same species from at least two of the sampled regions, as well as matches with larval DNA from the Ross Sea (Heimeier et al., 2010; Gallego et al., 2014), suggest an abundance of widespread polychaete species. Our results suggest that the three morphospecies, Laonice weddellia, Harmothoe fuligineum, and Aricidea simplex, are widespread within the West Antarctic, with potentially circum-Antarctic distributions. Similar results were also found for the Dorvilleidae polychaete Ophryotrocha orensanzi by comparing COI data from west and east Antarctic populations (Paxton et al., 2016). Eight of the cryptic species sequenced in this study appear to be widespread, occurring in multiple regions within the West Antarctic. The matching haplotypes between Antarctic regions may indicate continued dispersal and ongoing gene flow between these regions that maintain genetic similarity (Arango et al., 2011; Soler-Membrives et al., 2017). These results are supported by the model findings, documenting the potential for oceanographic currents to carry particles or "larvae" between Antarctic regions in both directions around the continent.

The majority of polychaete cryptic species studied exist sympatrically, covering the same or overlapping regions of the West Antarctic. How these species and their current distributions were established and maintained is most probably a result of several complex biophysical interactions over geological time. Their evolution and distribution are likely to be influenced by the relative roles of vicariance and dispersal (Clarke, 1992; Aronson et al., 2007; Waters, 2008; González-Wevar et al., 2011). Although many records of cryptic species in the Antarctic are restricted to certain locations, widespread cryptic species have been recorded in several taxa. Examples include: the crinoid Promachocrinus kerguelensis, where six genetically distinct phylogroups were considered circumpolar, sympatric and eurybathic (Hemery et al., 2012); the brittle star Ophionotus victoriae; and the pycnogonid Nymphon australe, where although some clades appear restricted others were considered widespread (Galaska et al., 2017; Soler-Membrives et al., 2017). Within the literature, similar insights into larval dispersal have also been obtained from oceanographic observations in conjunction with genetic analysis. For example, Matschiner et al. (2009) concluded that the lack of genetic structuring between populations of the notothenioid Gobionotothen gibberifrons throughout the Scotia Sea could be attributed to the passive transport of their larvae by the ACC during the pelagic development phase, as indicated by surface drifter trajectories. The geographic distribution of marine taxa within the Southern Ocean reflects the species' life-history traits (including bathymetric ranges, developmental modes and larval lifespans) and the influence of circum-Antarctic current systems (Young et al., 2015; González-Wevar et al., 2017; Moreau et al., 2017).

*hexbin* over the sample period. For site abbreviations see Table 3.

It is generally considered that the majority of cryptic species within Antarctic waters arose from physically separated populations over multiple glaciations (Allcock et al., 2001; Thatje et al., 2005; Galaska et al., 2017). Thus, the existence of sympatric species could suggest they evolved within the same area due to another method of isolation, for example differences in reproductive traits (Palumbi, 1994), responses to competition (Alizon et al., 2008) or predation (Wilson et al., 2013). An alternative explanation is that these species did evolve in geographic isolation and their widespread distribution was established after they evolved, possibly during post-glacial re-colonization. Given the limited size and mobility of these benthic polychaetes it is unlikely that adult specimens migrate between the three regions sampled. Instead, it is probable that genetic connectivity between adult populations is maintained by larval dispersal (Fraser et al., 2012).

The reproductive traits of polychaetes can vary on many taxonomic levels. For example within the family Polynoidae the brooding of eggs is considered to be relatively rare (Giangrande, 1997). However, Gambi et al. (2001) recorded brooding of eggs under dorsal elytra in three Antarctic Harmothoe species.

sample period. For site abbreviations see Table 3.

Within Spionidae, different populations of Polydora species are known to have different pelagic larval durations (PLDs) and feeding modes, exhibiting adelphophagy, the production of both planktotrophic and adelphophagic larvae (where unfertilized eggs are ingested by the developing larvae) (Blake, 1969; Mackay and Gibson, 1999). Other spionids such as Streblospio benedicti exhibit poecilogony, producing both planktotrophic and lecithotrophic larvae (Levin, 1984). Given the recorded variation in polychaete reproduction, the reproductive modes of the species studied may not reflect their generalized family level trait data (**Table 4**). However, many species may still have a larval dispersal phase (Blake and Arnofsky, 1999; Faulwetter et al., 2014). Even brooding species can still possess a dispersal phase, as brooders can release pelagic planktotrophic larvae, as indicated by egg size in Gambi et al. (2001). Thus, it is possible that the widespread distributions of polychaetes observed here were established and are maintained by larval dispersal in circum-Antarctic oceanographic currents (Starmans et al., 1999).

#### Eurybathic Species

Earlier studies of eurybathy suggested that the depth distributions of Antarctic polychaetes were comparable to those of European species. Thus, polychaetes did not conform to the general "eurybathic" characteristics assigned to Antarctic taxa due to reduced physical changes with depth, which could influence physiology (Brey et al., 1996). In most oceans there is a noticeable change in faunal composition at the shelf break (Gage and Tyler, 1991). However, in Antarctica, where the continental shelf is much deeper, the change in species composition with depth occurs at about 2,000 m (Brandt et al., 2007). Sequenced specimens from the Amundsen Sea and Scotia Arc were collected from depths of 500–1,500 m and the additional samples from the Weddell Sea were collected between depths of 400–2,000 m. Compared to studies such as Brandt et al. (2007), the depth range sampled may not be large enough to visualize any depth dependence in distribution patterns. Thus, the apparent eurybathy of some species within this study may actually be a result of the limited depth range sampled (up to 2,000 m). If deeper slope and abyssal communities were included, the observed depth-related patterns may have been different. However, Neal et al. (2017) observed depth patterns within the range of this study, noting the greatest similarity in polychaete community composition between 500 m stations on the inner and outer shelf of the Amundsen Sea, when compared to the communities from the same station at 1,000 and 1,500 m depths.

In conjunction with their geographic distribution, many cryptic species appear to exist sympatrically or have overlapping depth distributions. For example, Hesionidae sp. (MB1) was sampled from depths of 500 m or more, whereas Hesionidae sp. (MB2) was only collected at 500 m depth or shallower. It is possible that Hesionidae sp. (MB2) is more dominant at shallower depths, whilst Hesionidae sp. (MB1) is outcompeted. This interaction could then be reversed at the deeper sites and is potentially associated with functional differences between cryptic species. The true absence of species at certain depths, i.e., truly restricted species, is very difficult to confirm and in some cases is questioned by later studies. For example, Schüller (2011) described three cryptic clades of Glycera spp. from the Weddell Sea, two of which were thought to be restricted to 2,000 m. This is in contrast to this study, as one of the clades from Schüller (2011) matched the Glycera sp. (MB2) specimens that were sequenced from stations 500 m and shallower.

#### Restricted Species

Previous studies uncovered restricted cryptic populations in species from several phyla. For example, Held (2003) uncovered two cryptic clades of Ceratoserolis trilobitoides, one of which was found only on the WAP, the other extending to the Weddell Sea. Furthermore, Linse et al. (2007) compared DNA sequences of the widely distributed bivalve Lissarca notorcadensis and uncovered four explicit haplotype groups within the Scotia Sea. Similar results have been recorded for the isopod Glyptonotus antarcticus (Held and Wägele, 2005) as well as the cephalopod genus Pareledone (Allcock et al., 2011). This high diversity has previously been attributed to the fragmented nature and limited accessibility of available habitats in this region, favoring speciation by population fragmentation (Allcock et al., 2011), especially in species with limited dispersal capacities (Strugnell and Allcock, 2013). In comparison to pelagic species there is generally a higher level of genetic structuring in benthic invertebrates; Rogers (2007) attributed this to the lower dispersal capabilities of benthic species at both larval and adult life stages. Within this study some of the cryptic clades of presumed circum-Antarctic species were only found within a limited number of sampled stations within a single region. This was the case for Scalibregma sp. (MB2) and (MB3), and Euphrosinella cf cirratoformis (MB2), which were more restricted than their broadly distributed, potentially circum-Antarctic, sister cryptic clades. Most of the restricted clades listed above were limited to sites within the Scotia Arc, an area known for particularly high biodiversity within Antarctica (Linse et al., 2007; Allcock et al., 2011; Neal et al., 2017). However, we acknowledge that restricted species are difficult to define; as all these clades contained fewer individuals (<4), this result could be an artifact of undersampling.

The three potential cryptic species of Aricidea cf. belgicae (MB1), (MB2), and (MB3) were each restricted to a single region: Scotia Arc (MB1), Amundsen Sea (MB2) and Weddell Sea (MB3). This is the only example within this study of allopatric cryptic species, if their distribution evolved from geographically isolated populations and was maintained by limited dispersal capabilities. Again, we do not know the reproductive mode of these clades, and Paraonidae are known to undergo both direct and indirect development, making it difficult to speculate on their dispersal (**Table 4**). Furthermore, even broadcasting species may not be able to overcome geographic isolation (Galaska et al., 2017). As mentioned, these "restricted" distributions could again be an artifact of undersampling, as clades (MB2) and (MB3) were only represented by two and one individuals, respectively. Within undersampled environments it is difficult to determine the presence of both restricted and rare species (Smith et al., 2006). This is complicated further by the presence of cryptic diversity, and thus the submission of DNA sequences to open access databases is extremely important in the future assessment of species biogeography.

### The Use of Particle Tracking Analysis to Understand Antarctic Connectivity

Here we use the model results presented to speculate on the direction and potential distances of passive larval dispersal within the Southern Ocean from our sampled sites. Particles released in the Amundsen Sea were transported counterclockwise around the continent. This transport is likely to be due to the Antarctic coastal counter current. Topographically constrained by the Antarctic Slope Front (ASF), this westward current encircles the coastal margins of Antarctica (Thompson et al., 2009). The ASF is found consistently above or just offshore of the shelf break and the coastal counter current is found broadly over the continental shelf (Heywood et al., 1998). While the ACC is considered the dominant current in maintaining circumpolar connections, the coastal counter current has been shown to connect many regions of high krill density (Thorpe et al., 2007). In the Ross Sea, particles released on the shelf were tracked westward in the coastal counter current, whereas at the offshore location the Ross Sea gyre and ACC advect the particles eastward. Releases on the Ross Sea slope appear to be influenced by both regimes, but with the majority of particles following the coastal counter current pathway. The ACC provides a pathway to connect regions in the opposite direction to the coastal counter current; particles released within the Scotia Arc were transported eastward out into the Scotia Sea. This movement is likely to be driven by meandering of the ACC as the particles are transported further off shelf (Young et al., 2015). Similar particle movements have been observed in Lagrangian tracking models used to estimate krill connectivity (Hofmann and Murphy, 2004; Piñones et al., 2013). Particle dispersal from the Weddell Sea sites is dictated by the cyclonic Weddell Sea gyre. Releases above the continental slope travel the farthest distance, with the majority of particles tracking eastward once they reach the Scotia Arc. Sites closer to the coast within the Weddell Sea show a slower spreading toward the WAP, with particles advected both eastward toward the Scotia sites and westward in the coastal counter current.

The passive movement of particles around Antarctica within the PLD time frame indicates the possibility that larvae may be recruited into non-parent populations substantial distances from their origin. If larvae are able to settle, grow and reproduce in these locations this could maintain genetic connectivity and limit the potential for genetic differentiation between regions. Examples of source populations transporting larvae westward may include those within the Amundsen Sea supplying larvae to the Ross Sea and the Ross Sea shelf transporting larvae along the continental shelf. The Weddell Sea sites appear to be well connected and may provide larvae to sites along the WAP such as Livingston Island and Elephant Island, which in turn may supply larvae to northern sites within the Scotia Sea. However, it should be noted that given the observed depth ranges of these species (0–1,500 m), many of the particles advected over the deep open ocean may no longer be in suitable locations to establish populations, being restricted by their physiology. If larvae are unable to disperse over unsuitable habitats this can limit the degree of population connectivity (Rogers et al., 2006). This may be the case for dispersal from sites within the Scotia Arc, in which particles were transported north into the Scotia Sea. These sites included Shag Rocks and South Georgia but also Southern Thule, from which no particles reached other locations. Thus, these sites may act as sink populations rather than as sources for other regions.
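The source and sink reasoning above can be made concrete with a small connectivity table. The following is a minimal sketch, assuming a hypothetical mapping from each release site to the set of other sites its particles reached; the site names echo the text, but the connections and the helper names `export_import_counts` and `non_exporters` are illustrative assumptions, not output from the particle tracking model used in this study.

```python
# Hypothetical connectivity: release site -> sites reached by its particles.
# Placeholder values for illustration only; not results from the study.
reached = {
    "Amundsen Sea": {"Ross Sea"},
    "Weddell Sea": {"Livingston Island", "Elephant Island"},
    "Elephant Island": {"Scotia Sea"},
    "Southern Thule": set(),
}


def export_import_counts(reached):
    """Return {site: (n_sites_supplied, n_sites_supplying)} for every site."""
    sites = set(reached)
    for dests in reached.values():
        sites |= dests
    counts = {}
    for site in sorted(sites):
        # Number of other sites that receive particles from this site.
        n_out = len(reached.get(site, set()) - {site})
        # Number of other release sites whose particles arrive at this site.
        n_in = sum(
            1 for src, dests in reached.items() if src != site and site in dests
        )
        counts[site] = (n_out, n_in)
    return counts


def non_exporters(reached):
    """Release sites whose particles reached no other site."""
    return sorted(s for s, dests in reached.items() if not (dests - {s}))


counts = export_import_counts(reached)
print(non_exporters(reached))
```

A release site with no outgoing connections is only a candidate sink under passive transport; whether it is a true sink also depends on larval survival and settlement, which this sketch does not model.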

The maintenance of widespread distributions by pelagic larval dispersal and recruitment may not be applicable to all of the species, as we cannot confirm the presence of free-living larvae. Furthermore, Antarctic taxa in general are considered to lack free-living larval stages (Pearse et al., 1991, 2009) and the existence of circum-Antarctic brooding species is highly unlikely (Lörz et al., 2009). A potential explanation for the maintenance of genetic connectivity in brooding species or those lacking pelagic larvae includes the passive rafting of larvae or adults on floating substrate or ocean debris (Waters, 2008). Leese et al. (2010) suggested that this mechanism maintained connectivity among isolated shallow-water populations of the isopod Septemserolis septemcarinata across the sub-Antarctic. Furthermore, direct evidence of rafting on kelp has been observed in the widespread sub-Antarctic brooding bivalve Gaimardia trapesina (Helmuth et al., 1994), the sea slug Onchidella marginata (Cumming et al., 2014) and two species of sub-Antarctic amphipods (Nikula et al., 2010). An additional factor that could influence population connectivity is anthropogenic transport (David and Loveday, 2017). We have limited knowledge of the extent or potential growth of this in the Southern Ocean; however, there is evidence of anthropogenic transport of non-native species into the Antarctic (Lee and Chown, 2007).

#### Limitations of Particle Tracking Models to Estimate Larval Dispersal of Antarctic Polychaetes

Given the inherent difficulties of directly measuring dispersal when larvae are minute (∼200 µm) compared to the potential scale of dispersal (∼km) (Gilg and Hilbish, 2003), dispersal distance is more often estimated using coupled biophysical models. The model scenario used in this study was based on limited biological trait data. Limited biological knowledge is considered to be the main challenge when attempting to predict and validate dispersal pathways and distance (Levin, 2006; Metaxas and Saunders, 2009; Hilário et al., 2015). In this study we tracked particles for a maximum of 1 year; however, PLD can be highly variable. For example, recorded PLD in polychaetes from California has ranged from 13 to 150 days for planktotrophic larvae and 1–25 days for lecithotrophic larvae (Carson and Hentschel, 2006). Within the Southern Ocean region, the life histories of marine organisms are often much slower than those of similar temperate and tropical taxa (Pearse et al., 1991); for example, the lifespan of Antarctic echinoderm larvae is thought to exceed 1 year at a temperature of −1.5°C (Shilling and Manahan, 1994; Marsh et al., 1999).

If the PLD of the species studied is less than 360 days, the sites reached by particles may be an overestimate of larval dispersal. **Figure 9** presents the mean time of arrival of particles and can be used to estimate the distance of dispersal over shorter PLDs. With a shorter larval duration of 3–6 months (90–180 days in **Figure 9**) most particles will only reach locations within their region. For example most sites within the Weddell Sea would remain connected but particles would not reach the sites in the Scotia Arc. A shorter PLD would greatly increase particle retention within the site or region of release therefore reducing the potential connectivity between regions (e.g., Shanks, 2009; Faurby and Barber, 2012).
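The effect of a shorter PLD on connectivity can be illustrated by thresholding mean arrival times. The following is a minimal sketch, assuming a hypothetical table of mean arrival times (in days) between pairs of sites; the site pairs, the values and the function name `connections_within_pld` are placeholders chosen for illustration and are not read from **Figure 9**.

```python
# Hypothetical mean arrival times (days) from release site to destination.
# Placeholder values for illustration; they are not taken from Figure 9.
mean_arrival_days = {
    ("Weddell shelf", "Weddell slope"): 60,
    ("Weddell slope", "Elephant Island"): 210,
    ("Elephant Island", "Shag Rocks"): 300,
    ("Amundsen Sea", "Ross Sea shelf"): 340,
}


def connections_within_pld(arrivals, pld_days):
    """Keep only site pairs whose mean arrival time fits within the PLD."""
    return sorted(pair for pair, days in arrivals.items() if days <= pld_days)


for pld in (90, 180, 360):
    kept = connections_within_pld(mean_arrival_days, pld)
    print(f"PLD {pld:>3} d: {len(kept)} connection(s) retained")
```

Because a mean arrival time hides the spread among individual particles, a threshold on the mean is a coarse filter; a fuller treatment would use the distribution of arrival times for each site pair.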

As with PLD, our knowledge of polychaete larval behavior is very limited. Matschiner et al. (2009) showed that many larvae are capable of active vertical migration, which could lead to the avoidance of advection and increased retention near their site of dispersal (Swearer et al., 2002). Additionally, in our study, particles were only released from sites we sampled. The existence of populations in between our sites would be important if the PLD and dispersal distance are shorter than predicted. These in-between populations would provide additional sites for genetic mixing between populations and larval release, and so contribute to the maintenance of circum-Antarctic genetic connectivity.

# CONCLUSIONS AND WIDER IMPLICATIONS

This study showed that the previously accepted biogeographic patterns of a third of the nine morphospecies examined should be questioned or re-described. Widespread distributions within the West Antarctic were recorded in 12 of the 17 species. These included 9 cryptic species existing sympatrically. The presence of widespread morphological and cryptic species is likely to be explained by their larval dispersal between populations as demonstrated by particle tracking models. The lack of consistency between the biogeography of cryptic species, some being widespread and potentially circum-Antarctic, whilst others are restricted, recorded within this study demonstrates the complexity of Southern Ocean biogeography (Brandão et al., 2010; Strugnell and Allcock, 2013; Chown et al., 2015). Fine scale differences in species distributions may be a result of variable life histories, habitat preferences, biological responses and ecological interactions within and between species through past and present physical conditions rather than a lack of transport connectivity from oceanographic currents. To fully appreciate why some species may be more dominant at certain depths, e.g., Hesionidae spp., or some widespread, whilst others are restricted, e.g., Scalibregma spp., would require investigations into their ecological traits to understand their functional differences. This is important because species with restricted distributions or limited dispersal capacities are often considered more vulnerable to extinction or less likely to recover from physical disturbances, as disturbed or removed populations may not be resupplied if they do not inhabit neighboring sites (Chown et al., 2015).

The results presented here have valuable implications, improving our understanding of the drivers of biogeography and their consequences for marine management under changing environmental conditions. Multiple data sets, such as diversity, biogeographic and genetic data, together with ocean physics model data, are valuable tools for the designation of effective marine management practices such as Marine Protected Areas (MPAs) and fishing restrictions (Robinson et al., 2017). The effectiveness of an MPA is, in part, reliant on the ability of species within the MPA to source external populations; thus mapping the likely dispersal pathways and distances of known species provides biological evidence for designation (e.g., Le Quesne and Codling, 2009; Planes et al., 2009). With increased benthic sampling and DNA barcoding these data can be used to assess the level of genetic connectivity between different polychaete populations, assess the abundance of rare and restricted species and provide further insight into the processes that determine species distributions.

#### AUTHOR CONTRIBUTIONS

This paper includes the main findings of the second chapter of MB's PhD thesis. MB conducted all laboratory DNA data collection, led the analysis and wrote this manuscript. JH ran the particle tracking analyses, produced model figures and conducted statistical analyses on the particle data. HW significantly contributed to the acquisition, analysis and interpretation of the genetic data. RJ contributed to the conception of this project and manuscript editing. KL donated all of the specimens used in this project and assisted in the interpretation and discussion of the DNA data collected. HR contributed to hypothesis framing and model design, making a substantial contribution to its application, and contributed to the interpretation of results. AG led the polychaete sampling at sea of the BIOPEARL II samples, and project-managed the sorting and identification of all BIOPEARL polychaetes. AG also contributed to the conception of this project and manuscript editing. All authors contributed to the revised article and are accountable for all aspects of the work should the accuracy and integrity be questioned.

#### FUNDING

All DNA barcoding was supported by MB's PhD research training grant received from the University of Liverpool and the National Oceanography Centre. The expeditions JR144, JR179 (BIOPEARL I and II) and JR275 were part of the British Antarctic Survey core programmes "Global Science in the Antarctic Context" and "Polar Science for Planet Earth" funded by The Natural Environment Research Council.

#### ACKNOWLEDGMENTS

We would like to thank the crew and scientists who participated in the three research cruises that collected the polychaete specimens used in this study. We are particularly grateful to the BIOPEARL sample sorters that worked at sea, in the NHM Deep Sea Lab, and the BAS Lab: Rebekah Baker, David Barnes, Stefanie Kaiser, Ondine Cornubert, Adam Reed, Michael Mende, Wencke Krings, Moritz Stäbler and Chester Sands. Adrian Glover acknowledges the Natural Environment Research Council Collaborative Gearing Scheme for funding to participate in JR179 research cruise. Model data analyzed in this study was generated using the ARCHER UK National Supercomputing Service (http://www.archer.ac.uk). Particle tracking analysis was performed on the NERC funded JASMIN super-data-cluster facility (http://www.jasmin.ac.uk). Additional thanks are due to Lenka Neal for her help with specimen identification prior to DNA analyses. We are also grateful for the constructive comments provided by both the reviewers and editorial team.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmars.2017.00356/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Brasier, Harle, Wiklund, Jeffreys, Linse, Ruhl and Glover. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Mosaic of Geothermal and Marine Features Shapes Microbial Community Structure on Deception Island Volcano, Antarctica

Amanda G. Bendia<sup>1</sup> , Camila N. Signori<sup>1</sup> , Diego C. Franco<sup>1</sup> , Rubens T. D. Duarte<sup>1</sup> , Brendan J. M. Bohannan<sup>2</sup> and Vivian H. Pellizari<sup>1</sup> \*

<sup>1</sup> Departamento de Oceanografia Biológica, Instituto Oceanográfico, Universidade de São Paulo, São Paulo, Brazil, <sup>2</sup> Department of Biology, Institute of Ecology and Evolution, University of Oregon, Eugene, OR, United States

#### Edited by:

Alison Elizabeth Murray, Desert Research Institute, United States

#### Reviewed by:

William P. Inskeep, Montana State University, United States Charles K. Lee, The University of Waikato, New Zealand

> \*Correspondence: Vivian H. Pellizari vivianp@usp.br

#### Specialty section:

This article was submitted to Extreme Microbiology, a section of the journal Frontiers in Microbiology

Received: 24 November 2017 Accepted: 18 April 2018 Published: 07 May 2018

#### Citation:

Bendia AG, Signori CN, Franco DC, Duarte RTD, Bohannan BJM and Pellizari VH (2018) A Mosaic of Geothermal and Marine Features Shapes Microbial Community Structure on Deception Island Volcano, Antarctica. Front. Microbiol. 9:899. doi: 10.3389/fmicb.2018.00899

Active volcanoes in Antarctica contrast with their predominantly cold surroundings, resulting in environmental conditions capable of selecting for versatile and extremely diverse microbial communities. This is especially true on Deception Island, where geothermal, marine, and polar environments combine to create an extraordinary range of environmental conditions. Our main goal in this study was to understand how microbial community structure is shaped by gradients of temperature, salinity, and geochemistry in polar marine volcanoes. To this end, we collected surface sediment samples associated with fumaroles and glaciers at two sites on Deception, with temperatures ranging from 0 to 98°C. Sequencing of the 16S rRNA gene was performed to assess the composition and diversity of Bacteria and Archaea. Our results revealed that Deception harbors a combination of taxonomic groups commonly found both in cold and geothermal environments of continental Antarctica, and also groups normally identified at deep and shallow-sea hydrothermal vents, such as hyperthermophilic archaea. We observed a clear separation in microbial community structure across environmental gradients, suggesting that microbial community structure is strongly niche driven on Deception. Bacterial community structure was significantly associated with temperature, pH, salinity, and chemical composition; in contrast, archaeal community structure was strongly associated only with temperature. Our work suggests that Deception represents a peculiar "open-air" laboratory to elucidate central questions regarding molecular adaptability, microbial evolution, and biogeography of extremophiles in polar regions.

Keywords: polar marine volcano, Antarctica, environmental gradients, extremophiles, diversity, community structure

# INTRODUCTION

Despite its predominantly cold ecosystems, Antarctica harbors active volcanoes with versatile and extremely diverse microbial communities. These are unique habitats where psychrophiles, mesophiles, thermophiles, and hyperthermophiles coexist and interact in the same environment across a pronounced temperature gradient (Amenábar et al., 2013). At present, there are four active volcanoes in Antarctica, three located at continental sites and one in maritime Antarctica (Deception Island) (Kyle and Cole, 1974; Herbold et al., 2014b). Deception Island differs from continental volcanoes specifically by its strong marine influence and higher temperatures, reaching 100°C next to active fumaroles, while continental volcanoes reach up to 65°C (Muñoz-Martín et al., 2005; Herbold et al., 2014b).

Deception Island is also notable for its varied and steep environmental gradients. Because over half of Deception Island is covered by glaciers, there are pronounced temperature gradients over very short distances (e.g., a few meters) (Baker, 1969; Bartolini et al., 2014). There can be strong salinity gradients from glaciers to fumaroles, because the fumaroles are located in the intertidal zone. In addition to temperature and salinity, there are prominent geochemical gradients generated by continuous emissions of volcanic gases, creating a mosaic of environmental conditions that can favor metabolically diverse microbial communities (Amenábar et al., 2013; Herbold et al., 2014b).

Little is known about how volcanic activity may influence microbial communities in polar ecosystems, despite extensive study of geothermal sites in nonpolar regions (e.g., Antranikian et al., 2017; Price and Giovannelli, 2017; Ward et al., 2017), and Deception Island offers a unique opportunity to better understand these important ecosystems. Because of the geographical isolation and the predominantly cold habitats in Antarctica, we expect that community structure may differ from that of nonpolar geothermal systems. Furthermore, the existence of multiple steep environmental gradients represents a unique opportunity to understand the drivers of microbial community structure and diversity in geothermal polar regions. Several studies have shown that microbial communities can be structured by environmental parameters such as temperature, salinity, and geochemistry (e.g., Crump et al., 2004; Sharp et al., 2014; Antranikian et al., 2017; Price and Giovannelli, 2017; Ward et al., 2017); however, it is not clear how these different drivers interact and simultaneously affect community structure.

To date, only two studies have used molecular methods to survey microbial community composition and diversity on Deception Island (Muñoz et al., 2011; Amenábar et al., 2013). These studies sampled Deception fumaroles and characterized the microbial communities of these samples using denaturing gradient gel electrophoresis (DGGE) and subsequent sequencing of DGGE bands. They reported the presence of bacterial taxa, primarily members of the Firmicutes and Thermus/Deinococcus phyla, and observed for the first time hyperthermophilic Archaea in Antarctica. These studies were very limited in both sampling extent (only fumaroles were sampled) and sampling depth (only relatively coarse molecular techniques were used), and highlight the need for a more comprehensive survey of microbial communities on Deception Island.

In order to better understand how microbial community structure is shaped by gradients of temperature, salinity, and geochemistry in polar marine systems, we sampled sediments associated with fumaroles and glaciers from two geothermal sites on Deception Island, spanning temperatures between 0 and 98°C, and we used next-generation sequencing to characterize the communities in these samples. We observed that temperature, pH, salinity, and nutrient concentrations together explained significant amounts of the variation in bacterial diversity, whereas variation in archaeal diversity was primarily explained by temperature. Furthermore, we observed that these factors interacted to alter bacterial and archaeal community composition. Additionally, we observed the coexistence of putative psychrophiles, mesophiles, thermophiles, and hyperthermophiles, and that this unique community structure reflects the mosaic of environmental conditions created by interaction between the volcanic activity, the marine environment, and the cryosphere.

# MATERIALS AND METHODS

#### Study Site and Sampling Strategy

Deception Island (62°58′ S, 60°39′ W) is a complex, horseshoe-shaped stratovolcano whose central part collapsed during an eruption approximately 10,000 years ago, giving rise to a caldera called Port Foster Bay, approximately 9 km in diameter (Baker et al., 1975). Geothermal anomalies are found mainly at Fumarole Bay (FB), Whalers Bay (WB), and Pendulum Cove, probably originating during the last eruptions between 1967 and 1970 (Fermani et al., 2007). Fumaroles are distributed both in submerged and partially submerged regions (intertidal zones). Sediment temperature associated with fumaroles varies considerably, reaching values between 40 and 60°C in WB, 70°C in Pendulum Cove, and 80 and 100°C in FB (Rey et al., 1995; Somoza et al., 2004). Fumarolic gases in Deception are mainly composed of CO<sub>2</sub> and H<sub>2</sub>S (Somoza et al., 2004), the latter of which, in contact with atmospheric O<sub>2</sub>, is oxidized to products such as sulfite (SO<sub>3</sub><sup>2−</sup>) and sulfate (SO<sub>4</sub><sup>2−</sup>) (Zhang and Millero, 1993).

Sampling was performed during the XXXII Brazilian Antarctic Expedition (December 2013–January 2014), with logistical support from the polar vessel Npo. Almirante Maximiano. Surface sediment samples (ca. 5 cm) were collected in fumaroles and glaciers at the geothermally active sites of FB (62°58′02.7″ S, 60°42′36.4″ W) and WB (62°58′45.1″ S, 60°33′27.3″ W) (**Figures 1A,B**). At each site, three sediment samples were collected at each of three points with distinct temperatures: points A and B were defined as samples collected in fumaroles, while point C was glacier samples, collected below the glacier's edge (**Figures 1C,D**). Distances between fumaroles and glaciers at each site were approximately 15 m, and the WB and FB transects were approximately 10 km apart. All fumaroles were in the intertidal zone, with the exception of point B from FB, which was in the subtidal zone (submerged at 50 cm depth in the water column). Samples were stored at −20°C until arrival at the University of São Paulo, Brazil, in April 2014.

#### Physicochemical Analysis

We evaluated physicochemical parameters of the sediments, including granulometry, electrical conductivity, humidity, micronutrients (B, Cu, Fe, Mn, and Zn), organic matter, organic carbon, pH, P, Si, Na, K, Ca, Mg, Al, total nitrogen, nitrate, ammonia, and sulfate. These analyses were conducted at "Luiz de Queiroz" College of Agriculture (Department of Soil Sciences, ESALQ-USP, Brazil), according to methods previously described (Keeney and Nelson, 1982; Van Raij et al., 2001).

#### DNA Extraction and 16S rRNA Gene Sequencing

Total genomic DNA was extracted from 10 g of sediment using a PowerMax Soil DNA Kit (MoBio, United States), according to the manufacturer's protocol. Extracted DNA was concentrated and purified with PCR OneStep Inhibitor Removal Kit (Zymo Research, United States), and further quantified using Qubit dsDNA HS Assay (Thermo-Fisher Scientific, United States) and Qubit Fluorimeter 1.0 (Thermo-Fisher Scientific, United States). Microbial 16S rRNA gene fragments were amplified using the primers S-D-Bact-0341-b-S-17 and S-D-Bact-0785-a-A-21 for Bacteria, and S-D-Arch-0519-a-S-15 and S-D-Arch-1041-a-A-18 for Archaea (Klindworth et al., 2013), targeting the V3–V4 regions of the gene. The first PCR reaction was carried out with a thermal cycler (Thermo-Fisher Scientific, United States), using 25 µL of KAPA HiFi HotStart Ready Mix (KAPA Biosystems) polymerase, 5 ng of DNA, and 0.2 µM of each primer, under the following conditions: 95°C for 3 min, 30 cycles of 95°C for 30 s, 55 or 67°C for 30 s (for Bacteria and Archaea, respectively), 72°C for 30 s, and a final extension of 72°C for 5 min. After purification (QIAquick Gel Extraction Kit, QIAGEN, United States) and quantification, 50 ng of amplicons was amplified and used for library preparation, under the following conditions: 95°C for 3 min, eight cycles of 95°C for 30 s, 55 and 72°C for 30 s, and 72°C for 5 min. The libraries were purified using an AMPure XP beads kit (Beckman Coulter, United States). After quality checking (Bioanalyzer 2100, Agilent Technologies, United States), the amplicons from each sample were mixed at equimolar concentrations and then sequenced using the Illumina MiSeq platform at the Facilities Center for Research Support (CEFAP, Institute of Biomedical Sciences, University of São Paulo).
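Equimolar pooling of amplicon libraries requires converting each measured mass concentration into a molar concentration. The following is a minimal sketch of that arithmetic, assuming the common approximation of 660 g/mol per base pair of double-stranded DNA; the concentrations, the 600 bp fragment length, the 50 fmol target and the function names are hypothetical and are not values or procedures reported in this study.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def molarity_nM(conc_ng_per_ul, amplicon_bp):
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul / (AVG_BP_MASS * amplicon_bp) * 1e6


def pooling_volumes(libraries, target_fmol):
    """Volume (uL) of each library that contributes target_fmol of amplicon.

    libraries maps sample name -> (ng/uL, mean amplicon length in bp).
    1 nM equals 1 fmol/uL, so volume = target_fmol / molarity.
    """
    return {
        name: target_fmol / molarity_nM(conc, bp)
        for name, (conc, bp) in libraries.items()
    }


# Hypothetical fluorometer readings and fragment sizes; not study data.
libs = {"WBA1": (12.0, 600), "FBB1": (6.0, 600), "FBC1": (3.0, 600)}
vols = pooling_volumes(libs, target_fmol=50)
for name, vol in vols.items():
    print(f"{name}: {vol:.2f} uL")
```

In this sketch a library at half the concentration needs twice the volume, so every sample contributes the same number of amplicon molecules to the pool.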

#### Sequencing Data Processing and Statistical Analyses

Raw sequencing reads were filtered for length (>400 bp), quality score (mean, >30), and a maximum expected error threshold of 1.0 using USEARCH tools (Edgar, 2010) and PRINSEQ Software (Schmieder and Edwards, 2011). Paired-end reads were assembled using PEAR software (Zhang et al., 2014), with a minimum overlap of 50 bp. Sequences were clustered at 97% similarity using USEARCH (Edgar, 2010), including de novo and reference-based chimera checking (ChimeraSlayer) (Haas et al., 2011). Singleton operational taxonomic units (OTUs; n = 1) were removed. Taxonomy was assigned to each OTU by performing BLAST searches against the Silva database v. 132 (updated in December 2017) (Quast et al., 2013), with a maximum E-value of 1e−5. Sequences were filtered to retain only bacterial or archaeal sequences for further analyses in the Quantitative Insights into Microbial Ecology (QIIME) 1.8.0 pipeline (Caporaso et al., 2010). The phylogenetic tree was built using FastTree (Price et al., 2009). Alpha-diversity indexes (number of OTUs, Ace richness estimation, and Shannon and Simpson) were calculated, and differences in alpha-diversity estimates between groups of samples were tested using Student's t-test in R. The OTU table was normalized for beta-diversity analysis using cumulative sum scaling (CSS) (Paulson et al., 2013). Beta-diversity between samples was examined using a Bray–Curtis dissimilarity matrix visualized as a UPGMA dendrogram, as well as by weighted normalized Unifrac distance, visualized via non-metric multidimensional scaling (nMDS), with fitting of the environmental parameters (only parameters with p < 0.05 were represented), accomplished with the envfit function from the vegan package (Oksanen et al., 2013). To test the significance of differences between groups of samples (Fumaroles vs. Glaciers and FB vs. WB), analysis of similarity (adonis) using Unifrac values was performed. We performed Spearman correlations to determine relationships between community composition (selecting abundant OTUs, with >1% of relative abundance) and environmental parameters. Only parameters that exhibited p < 0.05 with at least one OTU were represented. In order to evaluate which combination of parameters was related to alpha-diversity (Shannon index), we performed multiple linear regressions. We applied univariate linear regressions to select only the significant parameters (p < 0.05) for multiple linear regressions. All sequencing data were deposited in the National Center for Biotechnology Information Sequence Read Archives (SRA) under BioProject ID PRJNA386506. Graphs and statistical analysis were carried out using R software (version 3.3.1), and the packages ggplot2, vegan, qiimer, reshape2, and flyr.
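The diversity measures named above have simple closed forms. The following is a minimal sketch, in plain Python rather than the QIIME and R tools used in the study, of the Shannon index, the Simpson diversity index and the Bray–Curtis dissimilarity. The OTU counts are hypothetical, the natural logarithm is an assumption (some tools report Shannon in base 2), and the sketch works on raw counts without the CSS normalization applied in the study.

```python
import math


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson diversity 1 - sum(p_i ** 2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples sharing one OTU order."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator


# Hypothetical OTU count vectors (same OTU order in both samples).
fumarole = [40, 30, 20, 10, 0]
glacier = [10, 10, 10, 10, 60]

print(f"Shannon (fumarole): {shannon(fumarole):.3f}")
print(f"Simpson (fumarole): {simpson(fumarole):.3f}")
print(f"Bray-Curtis: {bray_curtis(fumarole, glacier):.3f}")
```

Bray–Curtis ranges from 0 (identical composition) to 1 (no shared OTUs), which is why it is a convenient input for clustering methods such as UPGMA.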

# RESULTS

### Physicochemical Characteristics of the Sampling Site

Temperature measured in situ varied from 0 to 98°C. Fumarole temperatures were 50°C (WBA1, WBA2, and WBA3) and 10°C (WBB1, WBB2, and WBB3) for WB, and 98°C (FBA1, FBA2, and FBA3) and 80°C (FBB1, FBB2, and FBB3) for FB. FB and WB glaciers exhibited temperatures near 0°C. Values of pH varied between 6 (WBC3) and 7.9 (FBC2) for glaciers, and from 6.7 (WBA2 and FBA1) to 7.4 (WBB2) for fumaroles. Sediments were mainly composed of sand (representing 61.9–96.7% of sediment composition) (**Supplementary Table S1**).

Samples from the WB glaciers were characterized by higher concentrations of nitrogen compounds, such as ammonia (53–150 mg kg<sup>−1</sup>) and nitrate (18–206 mg kg<sup>−1</sup>), when compared to the fumaroles (ammonia: 11–70 mg kg<sup>−1</sup>; nitrate: 11–84 mg kg<sup>−1</sup>). Samples from the WB glacier exhibited the highest concentrations of total nitrogen (1344–1421 mg kg<sup>−1</sup>) in comparison to the other samples (280–567 mg kg<sup>−1</sup>). By contrast, fumarole samples exhibited higher concentrations of marine (Na: 15.1–144.1 mmol<sub>c</sub> kg<sup>−1</sup>; electrical conductivity: 429–6595 µS cm<sup>−1</sup>) and volcanic geochemicals (sulfate: 125–293 mg dm<sup>−3</sup>), when compared to the glaciers (Na: 3.4–12.9 mmol<sub>c</sub> kg<sup>−1</sup>; electrical conductivity: 84–210 µS cm<sup>−1</sup>; sulfate: 4–9 mg dm<sup>−3</sup>). Concentrations of Fe were higher in FB fumaroles (40–296 mg dm<sup>−3</sup>) in comparison to WB fumaroles (23–72 mg dm<sup>−3</sup>) and glaciers (45–115 mg dm<sup>−3</sup>).

### Prevalent Taxa in Deception Island Glaciers vs. Fumaroles

In this study, we used bacterial and archaeal 16S rRNA primer sets to obtain a total of 1,700,412 and 1,684,699 high-quality reads, respectively, from 18 sediment samples of fumaroles and glaciers. A total of 5,884 OTUs, ranging from 706 (FBB1) to 1,868 (FBC1) OTUs per sample, were classified as Bacteria, whereas a total of 120 OTUs, ranging between 4 (FBA2) and 44 (FBC1) OTUs per sample, were assigned as Archaea (**Supplementary Table S2**). Even after several attempts, samples from the FB fumarole at 98°C (FBA1, FBA2, and FBA3) could not be amplified for Bacteria under the analyzed conditions. In addition, no archaeal sequences were detected in samples from the WB glacier (WBC1, WBC2, and WBC3). Quantitative PCR analysis performed by our group with the same primers employed here showed a very low abundance of Archaea in the WB glacier (unpublished data).

At the phylum level, some bacterial groups were common across our samples (**Figure 2A**). For example, the bacterial groups Proteobacteria (classes Gammaproteobacteria and Alphaproteobacteria), Planctomycetes (class Planctomycetacia, order Pirellulales), and Bacteroidetes (class Bacteroidia, order Flavobacteriales) were abundant in all glaciers and fumaroles (except FBA1, FBA2, and FBA3) analyzed in this study. Proteobacteria was the most abundant, comprising about 50% of the total taxonomic composition of each sample. Within Proteobacteria, Gammaproteobacteria (17.81.74–51.48%) was the most abundant class, followed by Alphaproteobacteria (10–27.78%).

Most taxonomic groups varied in abundance or occurrence between glacier and fumarole samples, even at the phylum level. For example, WB and FB glaciers exhibited higher relative abundances of the bacterial phyla Verrucomicrobia (9.15–18.41%), FBP (0.6–1.48%), Gemmatimonadetes (3.45–5.20%), Acidobacteria (0.3–4.6%), and Nitrospirae than fumaroles (0.2–2.6, 0.0–0.1, 0.0, 0.0–0.2, and 1.3–1.7%, respectively). Prevalent genera in glaciers were Flavobacterium, Luteolibacter, Rhodoferax, Rhodanobacter, Dokdonella, and Polaromonas (**Figure 3**). The archaeal class Thermoplasmata (phylum Euryarchaeota) was dominant (>90%) in FB glacier samples but was not detected in fumarole samples (**Figure 2B**), and was exclusively represented by Marine Group II (**Figure 4**). When aligned against RDP database 11 (updated in 2016) (Cole et al., 2014), OTU1141 and OTU1575, related to archaeal Marine Group II, showed 85% identity with Methanomassiliicoccus sequences.

Fumarole samples displayed the highest occurrence of the bacterial phyla Calditrichaeota (previously classified as Deferribacteres) and Chloroflexi (**Figure 2A**). Calditrichaeota was abundant in WB fumarole samples, varying from 26.86 to 34.04% in WBA1, WBA2, and WBA3, and from 7.50 to 8.16% in WBB1, WBB2, and WBB3, and was less abundant in FB fumarole samples at 80°C (0.1–0.4%). Representatives of Calditrichaeota were not identified in glacier samples. Chloroflexi was identified in all samples associated with fumaroles (except for the FB fumarole at 98°C) and had greater abundance in the FB fumarole at 80°C (5.98–13.24% for FBB1, FBB2, and FBB3). The most abundant classified genera of Bacteria in fumarole samples were related to the class Alphaproteobacteria (Albimonas, Loktanella, Pleomorphobacterium, and Sulfitobacter), the class Gammaproteobacteria (Thalassomonas and Woeseia), and the phylum Calditrichaeota (Calorithrix, previously assigned as Caldithrix) (**Figure 3**). Sequences related to the phylum Chloroflexi, class Anaerolineae, family Caldilineaceae were not assigned at genus level.

Archaeal composition in fumaroles varied according to the temperature gradients, and no phylum was dominant among all samples. For example, samples from the WB fumaroles were dominated by Thaumarchaeota (>95%), mostly represented by Nitrosopumilales (80.5–94.9%) and Nitrosocaldales (4.4–16.3%). The classified genera within Nitrosopumilales were Nitrosopumilus, Nitrosoarchaeum, and Nitrosotenuis (**Figure 4**). Samples from the WB fumarole at 50°C (WBA1, WBA2, and WBA3) had approximately 20% of sequences classified as phylum Nanoarchaeota. Although one sample from the FB fumarole at 80°C (FBB3) showed a similar archaeal composition to the WB fumaroles (>90% of OTUs related to the order Nitrosopumilales), the others (FBB1 and FBB2) were dominated by euryarchaeotal members related to the genus Haloferax (98%), which was assigned as Haloferax volcanii. Samples from the FB fumarole at 98°C displayed a high abundance of Desulfurococcales (>90%) (with the exception of FBA1), with OTUs related to the marine hyperthermophilic genus Pyrodictium, followed by the orders Haloferacales (5%) and Nitrosopumilales (2%). In FBA1, 96% of OTUs were assigned to Haloferacales and only 2% to Desulfurococcales.

temperatures and geothermal sites of each sample are represented. Sequences were clustered at 97% similarity and taxonomy was assigned by performing BLAST searches against the Silva database v. 132 (E-value ≤ 1e–5).

#### Alpha-Diversity Estimates

Measures of bacterial and archaeal alpha-diversity were significantly higher in glaciers than in fumaroles (with the exception of bacterial richness) (**Supplementary Table S2**). Univariate linear regressions were carried out to analyze relations between alpha-diversity (using the Shannon index) and environmental parameters. Bacterial diversity was positively related to pH (r<sup>2</sup> = 0.20, p = 0.049) and nitrate (r<sup>2</sup> = 0.40, p = 0.006), and negatively related to temperature (r<sup>2</sup> = 0.38, p = 0.008), sulfate (r<sup>2</sup> = 0.37, p = 0.009), and Na (r<sup>2</sup> = 0.30, p = 0.01) (**Supplementary Figure S2**). Archaeal diversity was positively related to pH (r<sup>2</sup> = 0.48, p = 0.002), ammonia (r<sup>2</sup> = 0.22, p = 0.041), and nitrate (r<sup>2</sup> = 0.33, p = 0.013), and negatively related to temperature (r<sup>2</sup> = 0.81, p < 0.001) and sulfate (r<sup>2</sup> = 0.23, p = 0.01) (**Supplementary Figure S3**). Although all of these relations were statistically significant, only temperature exhibited a strong relation with archaeal diversity. Further, when we performed multiple linear regressions, combinations of parameters showed strong correlations and better explained bacterial alpha-diversity (best multiple linear model with pH and sulfate, r<sup>2</sup> = 0.83) and archaeal alpha-diversity (best multiple linear model with temperature and sulfate, r<sup>2</sup> = 0.86) (**Figure 5**).
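As an illustration of the two quantities used in this section, the following minimal Python sketch computes a Shannon index from OTU counts and the r<sup>2</sup> of a univariate least-squares regression. The function names and the example counts are hypothetical and are not taken from this study; the natural logarithm is assumed for the Shannon index, and the software actually used for the published analysis may differ in its details.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical OTU count vectors (illustrative only, not data from this study).
even_community = [25, 25, 25, 25]   # four equally abundant OTUs
skewed_community = [97, 1, 1, 1]    # one dominant OTU

# An even community has a higher Shannon index than a skewed one.
print(shannon_index(even_community) > shannon_index(skewed_community))  # True
```

For a single predictor, r<sup>2</sup> equals the squared Pearson correlation between the predictor and the diversity index, which is why one value per parameter is reported above.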

#### Bacterial Community Structure Across Environmental Parameters

Despite the geographic distance and remarkable differences in environmental conditions, seven bacterial OTUs were shared among all sediment samples from Deception Island. Among these, three OTUs were related to Gammaproteobacteria (Methylophagaceae, Burkholderiaceae, and Nitrosomonadaceae), two were related to Alphaproteobacteria (Rhizobiaceae and Rhodobacteraceae), and the other two were assigned to Acidimicrobiia (Ilumatobacteraceae) and Verrucomicrobiae (Rubritaleaceae). A total of 184 bacterial OTUs were shared among fumarole samples and 216 bacterial OTUs among glacier samples, representing 3.1 and 3.7% of the total, respectively.
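The shared-OTU percentages can be checked with a short calculation. The sketch below assumes that "the total" refers to the 5,884 bacterial OTUs reported earlier in the Results; the sample names and OTU identifiers in the set example are hypothetical.

```python
TOTAL_BACTERIAL_OTUS = 5884  # total bacterial OTUs reported in the Results


def percent_of_total(shared, total=TOTAL_BACTERIAL_OTUS):
    """Shared OTUs expressed as a percentage of all OTUs, to one decimal place."""
    return round(100 * shared / total, 1)


def shared_otus(samples):
    """Return the OTU identifiers present in every sample of a group."""
    return set.intersection(*(set(otus) for otus in samples.values()))


print(percent_of_total(184))  # 3.1 (fumarole samples)
print(percent_of_total(216))  # 3.7 (glacier samples)

# Hypothetical OTU lists for three samples (illustrative identifiers only).
example = {
    "sample_1": ["OTU_a", "OTU_b", "OTU_c"],
    "sample_2": ["OTU_a", "OTU_c", "OTU_d"],
    "sample_3": ["OTU_a", "OTU_e"],
}
print(sorted(shared_otus(example)))  # ['OTU_a']
```

Under this assumption, the computed values agree with the 3.1 and 3.7% given in the text.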

To identify key environmental drivers of microbial composition, Spearman correlations were calculated, and only significant (p < 0.05) and strong correlations (r < −0.6 or r > 0.6) were considered. In general, bacterial OTUs prevalent in glaciers showed negative correlations with marine and volcanic geochemicals (such as electrical conductivity, Na, Mg, and sulfate) and positive correlations with ammonia (**Figure 6A**). In contrast, bacterial OTUs prevalent in fumaroles showed negative correlations with organic matter, organic carbon, Ca, and ammonia, and positive correlations with temperature and with marine and volcanic geochemicals (such as electrical conductivity, Na, K, Mg, and sulfate).
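The strength criterion for the correlations can be sketched as follows. This is a minimal pure-Python illustration of Spearman's rank correlation (Pearson correlation of average ranks) with the |r| > 0.6 threshold; all names and data are hypothetical, and the significance test (p < 0.05) is deliberately omitted here because it would normally be obtained from a statistics package.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def is_strong(rho, threshold=0.6):
    """Strength filter used in the text: r < -0.6 or r > 0.6."""
    return abs(rho) > threshold


# Hypothetical values: an environmental parameter and one OTU's abundance.
parameter = [1.0, 2.0, 3.0, 4.0, 5.0]
abundance = [50, 40, 45, 10, 5]
rho = spearman_rho(parameter, abundance)
print(is_strong(rho))  # True: strongly negative monotonic relation
```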

Bacterial beta-diversity explored by Bray–Curtis and weighted Unifrac distances revealed a clear distinction between fumarole and glacier samples (**Figure 7A** and **Supplementary Figure S1a**). Adonis analysis showed that samples differed significantly when comparing fumaroles and glaciers (p = 0.001, r<sup>2</sup> = 0.57), but not for geographic location (FB vs. WB) (p = 0.103, r<sup>2</sup> = 0.16). Marine and volcanic geochemicals were positively correlated (p < 0.05) with fumarole bacterial communities in the nMDS analysis: temperature, Na, K, B, Mg, electrical conductivity, and sulfate (**Figure 5A**). In contrast, bacterial communities in WB glacier samples were more influenced by ammonia, total nitrogen, organic matter, organic carbon, Cu, and silt, whereas communities in FB glacier samples were positively related with nitrate and Ca.
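For readers unfamiliar with the Bray–Curtis measure, the sketch below shows how the dissimilarity between two samples is computed from OTU abundance vectors. The sample names and counts are hypothetical, and any normalization or rarefaction applied before the published analysis is not reproduced here.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


# Hypothetical OTU abundance vectors (same OTU order in every sample).
samples = {
    "glacier_1": [30, 10, 0, 0],
    "glacier_2": [25, 15, 0, 0],
    "fumarole_1": [0, 5, 20, 15],
}

# Pairwise dissimilarities: 0 means identical composition, 1 means no shared OTUs.
for name_a, name_b in combinations(samples, 2):
    print(name_a, name_b, bray_curtis(samples[name_a], samples[name_b]))
```

In this toy example the two glacier samples are much closer to each other (0.125) than either is to the fumarole sample (0.875), which is the kind of separation that ordination and clustering methods then visualize.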

#### Archaeal Community Structure Across Environmental Parameters

Only one archaeal OTU (OTU38, related to Candidatus Nitrosopumilus) was shared between fumarole and glacier samples. Five OTUs were shared between fumarole samples, all related to Nitrosopumilales. Glaciers shared 14 OTUs, 4 related to Marine Group II (Thermoplasmata class) and the other 10 were not classified at phylum level.

Fewer environmental parameters exhibited strong correlations with archaeal composition than with bacterial composition (**Figure 6B**). In general, the parameters that showed significant and higher correlation values were temperature, pH, and Ca. OTUs abundant in glaciers, related to Marine Group II (Thermoplasmata class), showed positive correlations with pH, Ca, nitrate, and ammonia, and negative correlations with electrical conductivity, Na, sulfate, and temperature. Haloferax (OTU11) and the hyperthermophilic archaeon Pyrodictium (OTU524) were positively correlated with temperature and Fe. OTUs highly abundant in WB fumaroles, mainly related to Nitrosopumilus, showed positive correlations with Si, Na, K, and sulfate. Unlike other archaeal OTUs, Nitrosopumilus-related OTUs showed no significant correlations with temperature.

Archaeal beta-diversity revealed a clear distinction between fumaroles and glaciers (**Figure 7B** and **Supplementary Figure S1b**). However, similar to what was observed for taxonomic composition, archaeal beta-diversity showed a distinct pattern among samples from the highest-temperature fumaroles of FB. Adonis analysis based on Unifrac distance showed that archaeal communities were significantly distinct between fumaroles and glaciers (r<sup>2</sup> = 0.49, p = 0.002), and likewise between geographic locations (FB vs. WB) (r<sup>2</sup> = 0.42, p = 0.003). nMDS revealed that temperature and Fe were the main parameters related to archaeal communities of high-temperature fumaroles (FB fumaroles), except for FBB3 (**Figure 7B**). Archaeal communities of WB fumaroles and FBB3 were positively influenced by sulfate, B, Mg, Na, K, and electrical conductivity, whereas FB glacier communities were positively influenced by nitrogen compounds (ammonia and nitrate), Ca, and pH.

# DISCUSSION

It is well known that temperature and geochemical composition can act as strong selective pressures on microbial growth and survival (e.g., Sharp et al., 2014; Antranikian et al., 2017; Price and Giovannelli, 2017; Ward et al., 2017), but little is known about how volcanic activity shapes microbial community structure in polar ecosystems. Using high-throughput sequencing of 16S rRNA, we observed that steep gradients of temperature, salinity, and geochemical characteristics over a short distance (ca. 15 m) strongly influenced the microbial community structure and diversity of glaciers and fumaroles. To the best of our knowledge, this work represents the first study using deep DNA sequencing to characterize bacterial and archaeal communities in a polar marine volcano in Antarctica.

Despite these strong gradients, we detected members of some bacterial groups in all of our samples, both glacier and fumarole. These included members of the Proteobacteria, Planctomycetes, and Bacteroidetes phyla, highlighting the diversity and versatility of these groups. This observation is consistent with previous reports that members of these phyla can be detected in geothermal sites across temperatures ranging from 7.5 to 99°C (Sharp et al., 2014). We did not observe any archaeal phyla common to all of our samples, suggesting that members of archaeal phyla may have much narrower ecological niches than Bacteria at our sites.

Higher alpha-diversity was observed for both Bacteria and Archaea in glacier samples when compared to fumarole samples. This may be due to seasonal variations of temperature and nutrients in glaciers, which may select different groups of microorganisms and enhance microbial diversity in cold Antarctic ecosystems (Hopkins et al., 2006; Kirchman et al., 2014; Bowman et al., 2017; Franco et al., 2017). Indeed, we identified phylogenetically and functionally distinct groups in our glacier samples, such as psychrophilic (Flavobacterium, Luteolibacter, Rhodoferax, Polaromonas, and Arthrobacter), methylotrophic (Methylotenera), denitrifying (Rhodanobacter), and nitrifying (Nitrospira, Nitrosovibrio) Bacteria.

The drivers of community diversity varied among the glacier, cooler fumarole (<50°C), and hotter fumarole (98°C) samples. The concentration of nitrogen compounds was strongly associated with microbial diversity in glaciers, while temperature and the concentrations of volcanic and marine geochemicals were associated with diversity in cooler fumaroles. We observed that temperature was the main driver of diversity in the hotter fumarole, which was dominated by Archaea.

Consistent with these observations, we found that sequences associated with Nitrospirae were particularly common in our glacier samples. Nitrospirae are known as the most abundant and diverse group of Bacteria performing nitrification (Lücker et al., 2010). Their presence is understandable given the high concentration of nitrogen compounds we observed in our glacier samples. Further, previous reports have also shown high annual nitrogen fluxes in polar glaciers and an abundance of microorganisms related to the nitrogen cycle (Segawa et al., 2014; Lutz et al., 2017). The presence of other abundant taxa, notably members of the phyla Verrucomicrobia and Patescibacteria (Parcubacteria, previously assigned as OD1), has been previously related to environments with high methane concentrations (Dunfield et al., 2007; Peura et al., 2012). Although their role in the methane cycle remains unclear, their co-occurrence with possibly methanogenic Archaea (Methanomassiliicoccus) in our FB glacier samples provides an additional indication of their potential role in this cycle. Previous surveys of subglacial sediments (Wanda Glacier, King George Island) also detected members of the Methanomassiliicoccales (Pessi et al., 2015). Taken together, our observations suggest that the nitrogen and methane cycles may be central biogeochemical processes in Deception glaciers.

In contrast with glaciers, one of the most abundant bacterial phyla exclusively found in fumaroles was Calditrichaeota (previously assigned as Caldithrix within the Deferribacteres phylum), a group identified in marine hydrothermal vents, such as those in Japan, Greece, and the Mid-Atlantic Ridge (Miroshnichenko et al., 2003, 2010; Takaki et al., 2010), and rarely described in Antarctic ecosystems. The majority of Calditrichaeota members described to date are thermophilic, with optimal growth temperatures between 40 and 65°C (Miroshnichenko et al., 2010). The gammaproteobacterial genus Thalassomonas, which has been mainly reported in tropical and coastal marine environments (Bowman and McMeekin, 2005), was the most abundant classified genus in our fumarole samples and was not found in our glacier samples. Other studies have detected these bacteria in deep sediments of the Gerlache Strait, Antarctica (López-García et al., 2001), and in shallow-sea hydrothermal vents on Panarea Island, Italy (Lentini et al., 2014).

Archaeal communities in our cooler fumarole samples (<50°C) were predominantly composed of the Nitrosopumilales and Nitrosocaldales orders, whose members are involved in the nitrogen cycle, particularly chemolithotrophic ammonia oxidation (Könneke et al., 2005; De la Torre et al., 2008). In environments without sources of organic energy and sunlight, ammonia oxidation contributes to primary productivity, explaining the success of marine members of Nitrosopumilales in ecological niches such as the deep ocean and shallow polar waters during summer and winter (Könneke et al., 2005; Signori et al., 2014; Learman et al., 2016). Nitrosopumilales were previously reported in several Antarctic sediments, associated with Wanda Glacier, Weddell Sea, and along the west coast (Gillan and Danis, 2007; Pessi et al., 2015; Learman et al., 2016). Nitrosocaldales members are thermophilic, growing at higher temperatures (>60°C) than other thaumarchaeal ammonia oxidizers, and are globally distributed in various geothermal environments (De la Torre et al., 2008; Qin et al., 2017).

In our samples from the hotter fumarole (98°C), microbial communities were strictly composed of Archaea, in agreement with the current knowledge of hyperthermophilic growth at this temperature (e.g., Stetter, 2006; Stetter, 2013). These communities were not as diverse as those in cooler fumaroles, suggesting that increased temperature resulted in a decrease of microbial diversity, as previously reported (Sharp et al., 2014; Antranikian et al., 2017). The dominant genus in these samples (>80%) was the hyperthermophilic Pyrodictium (in particular in samples FBA2 and FBA3), which has never been previously reported in Antarctic ecosystems. Members of this genus are adapted to a wide range of temperatures, with optimum growth between 80 and 105°C (Stetter et al., 1983), and have been reported from shallow-sea hydrothermal vents, such as those in Vulcano, Italy (Stetter et al., 1983) and in Tachibana Bay, Japan (Takai and Sako, 1999), as well as from deep-sea vents, such as the Mariana Volcanic Arc (Nakagawa et al., 2006) and the Manus Basin, New Guinea (Takai et al., 2001). In addition, our work suggests that these marine hyperthermophiles could also colonize sediments associated with a non-submerged fumarole. Our results indicate that not only the marine influence, but also the local geochemistry and high temperature in the hotter Deception fumarole can act together as strong selective pressures and preferentially select marine hyperthermophiles, despite the geographic isolation of Antarctica and its predominantly cold habitats. To date, hyperthermophilic Archaea have not been reported from any other geothermal site in Antarctica, probably due to the lower temperatures of continental volcanoes (up to 65°C at the surface) (Herbold et al., 2014b), which may not select for these microorganisms.

Surprisingly, microorganisms related to halophilic Haloferax were identified in some of our fumarole samples with >80°C (FBA1, FBB1, and FBB2). Although previously described in Antarctica as abundant members of hypersaline subglacial lakes, such as the Deep Lake in Vestfold Hills (Williams et al., 2014), the high relative abundance of Haloferax in FB fumaroles (>80°C) was unexpected, since no hyperthermophilic members of this genus had been previously described (only thermophiles such as H. volcanii, with growth temperature <50°C) (Hartman et al., 2010). Possibly, Haloferax may have been collected in an inactive or even dead state or, less likely, undescribed members may tolerate these high temperatures through an unknown mechanism. Further, differences found in archaeal composition within triplicates of FB fumaroles may be related to the rapid heat loss at higher-temperature sites, which promotes very steep temperature gradients and, consequently, can favor the presence of different micro-niches (Cao et al., 2012).

When compared to continental geothermal systems in Antarctica, such as Tramway Ridge on Mount Erebus (Soo et al., 2009; Herbold et al., 2014a), Deception fumaroles shared only a few taxa, notably Chloroflexi, Planctomycetes, and Nitrosopumilales. Bacterial groups previously detected in several geothermal fields and in shallow and deep-sea hydrothermal systems, such as the Aquificae and Thermotogae phyla (e.g., Lebedinsky et al., 2007; Miller et al., 2009) and the Epsilonproteobacteria class (now classified as the Epsilonbacteraeota phylum) (e.g., Akerman et al., 2013; Anderson et al., 2017; Price and Giovannelli, 2017), were not identified in our samples. In addition, we detected different hyperthermophiles in our samples than those described by a previous study in Deception (Amenábar et al., 2013), possibly due to the use of different molecular tools and PCR primers in our study. Our results indicate the importance of future studies on the microbial communities of Deception Island using other molecular techniques, such as metagenomics and metatranscriptomics, to elucidate the functionality of extremophiles in polar marine volcanoes.

#### CONCLUSION

By using a combination of 16S rRNA gene sequencing and physicochemical measurements we have found a strong separation of microbial community composition across environmental gradients, suggesting that bacterial community structure on Deception Island is strongly niche driven through the interaction of multiple environmental parameters (temperature, pH, salinity, sulfate, and nitrogen compounds), whereas archaeal community structure is mainly determined by temperature. Another important outcome of this study is the observation that Deception Island hosts bacterial and archaeal taxa previously reported from several highly contrasting environments, such as continental Antarctic volcanoes, non-volcanic polar ecosystems, and deep and shallow-sea hydrothermal vents. This likely reflects a mosaic of environmental conditions created by the interactions between volcanic activity, the marine environment, and the cryosphere, that can simultaneously select different groups of extremophiles (halophiles, psychrophiles, hyperthermophiles, and thermophiles). All of these factors make Deception a peculiar "open-air" laboratory to elucidate central questions regarding molecular adaptability, microbial evolution, and biogeography of extremophiles in polar regions.

#### AUTHOR CONTRIBUTIONS

AB collected the samples, conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, and prepared figures and/or tables. CS analyzed the data, wrote the paper, prepared figures and/or tables, and reviewed drafts of the paper. DF collected the samples, analyzed the data, wrote the paper, and prepared figures and/or tables. RD conceived and designed the experiments, wrote the paper, and reviewed drafts of the paper. BB suggested the statistical analysis, wrote the paper, and reviewed drafts of the paper. VP conceived and designed the experiments, contributed reagents/materials/analysis tools, and wrote and reviewed drafts of the paper.

# FUNDING

This study was part of the projects Microsfera (CNPq 407816/2013-5) and INCT-Criosfera (CNPq 028306/2009) and was supported by the Brazilian National Council of Technological and Scientific Development (CNPq) and the Brazilian Antarctic Program (ProAntar). The São Paulo Research Foundation (FAPESP) supported the following fellowships: AB Doctorate's fellowship (2012/23241-0), RD Post Doc fellowship (2012/11037-0), and CS Post Doc fellowship (2016/16183-5).

# ACKNOWLEDGMENTS

We thank the captain and the crew of the polar research vessel Almirante Maximiano, Dr. Wânia Duleba, and Dr. Antônio Carlos Rocha Campos for their support in sampling. We are very thankful to LECOM's research team and Rosa C. Gamba for their scientific support. We thank the Core Facility for Scientific Research – University of São Paulo (CEFAP-USP/GENIAL – Genome Investigation and Analysis Laboratory) for the Illumina MiSeq sequencing.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.00899/full#supplementary-material

FIGURE S1 | UPGMA dendrogram based on Bray–Curtis distance matrix of fumarole (orange) and glacier (blue) samples for (a) Bacteria and (b) Archaea.

FIGURE S2 | Models of linear regression between temperature, pH, ammonia, nitrate, sodium and sulfate parameters, and the Shannon diversity index for the Bacteria domain. The values of p and R <sup>2</sup> are described above for each model.

FIGURE S3 | Models of linear regression between temperature, pH, ammonia, nitrate, sodium and sulfate parameters, and the Shannon diversity index for the Archaea domain. The values of p and R <sup>2</sup> are described above for each model.

TABLE S1 | Physicochemical parameters of the sediments, including temperature, granulometry, electrical conductivity, micronutrients (B, Cu, Fe, Mn, and Zn), organic matter, organic carbon, pH, P, Si, Na, K, Ca, Mg, Al, total nitrogen, nitrate, ammonia, and sulfate.

TABLE S2 | Alpha-diversity estimates (number of OTUs, Ace = abundance-based coverage estimator, and Shannon and Simpson index) for Bacteria and Archaea. Results of Student's t-test, testing whether means of "fumaroles" vs. "glaciers" samples are different. P-value < 0.05 is significant.

## REFERENCES



and 454-pyrosequenced PCR amplicons. Genome Res. 21, 494–504. doi: 10.1101/gr.112730.110



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Bendia, Signori, Franco, Duarte, Bohannan and Pellizari. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Molecular Data Suggest Long-Term in Situ Antarctic Persistence Within Antarctica's Most Speciose Plant Genus, Schistidium

Elisabeth M. Biersma<sup>1,2</sup>\*, Jennifer A. Jackson<sup>1</sup>, Michael Stech<sup>3,4</sup>, Howard Griffiths<sup>2</sup>, Katrin Linse<sup>1</sup> and Peter Convey<sup>1</sup>

<sup>1</sup> British Antarctic Survey, Cambridge, United Kingdom, <sup>2</sup> Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom, <sup>3</sup> Naturalis Biodiversity Center, Leiden, Netherlands, <sup>4</sup> Leiden University, Leiden, Netherlands

From glacial reconstructions it is clear that Antarctic terrestrial life must have been extremely limited throughout Quaternary glacial periods. In contrast, recent biological studies provide clear evidence for long-term in situ persistence throughout glacial times within most extant Antarctic faunal and several microbial groups. However, even now, the evolutionary history of the Antarctic flora, despite playing a major role in Antarctic ecosystems, remains poorly studied. We assessed the diversity, richness and relative age divergences within Schistidium (Grimmiaceae, Bryophyta), the most species-rich plant genus in the Antarctic, as well as the plant genus containing most Antarctic endemic species. We applied phylogenetic and molecular dating methods based on nuclear ribosomal Internal Transcribed Spacer sequences, including all known Antarctic Schistidium species with available sample material. We additionally investigated the continent-wide genetic diversity within the most common Antarctic representative of the genus, the endemic species Schistidium antarctici, and performed preliminary phylogeographic analyses of the bipolar species Schistidium rivulare. Most previously described Antarctic Schistidium species were genetically distinct, confirming their specific status. Interspecific divergences of all species took place at least ∼1 Mya, suggesting a likely in situ persistence in Antarctica for (at least) all endemic Schistidium species. The widespread endemic species, Schistidium antarctici, diverged from other Antarctic congeners in the late Miocene, thereby revealing the oldest extant plant species currently known in Antarctica, and providing increasing support for the hypothesis of vegetation survival through multiple glacial periods. Within S. antarctici we identified several distinct clades dividing the eastern Antarctic Peninsula and Scotia Arc islands from the western Antarctic Peninsula and all continental locations.
This suggests that the mountainous spine on the Antarctic Peninsula forms a strong barrier to gene flow in this species, while increased genetic diversity in the northern Maritime Antarctic indicates likely glacial refugia in this area. This study provides an important first step toward assessing the diversity and evolutionary history of the most speciose moss genus in the Antarctic. The multi-million year presence of several endemic species contributes to studies on their adaptive potential to survive climate change over both historical and contemporary timescales.

Keywords: bryophyte, polar, biogeography, biodiversity, survival, Antarctic, moss, bipolar

#### Edited by:

Marco A. Molina-Montenegro, University of Talca, Chile

#### Reviewed by:

Cristian Atala, Pontificia Universidad Católica de Valparaíso, Chile Cristian Torres, University of the Bío Bío, Chile

> \*Correspondence: Elisabeth M. Biersma elibi@bas.ac.uk

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 16 March 2018 Accepted: 15 May 2018 Published: 05 June 2018

#### Citation:

Biersma EM, Jackson JA, Stech M, Griffiths H, Linse K and Convey P (2018) Molecular Data Suggest Long-Term in Situ Antarctic Persistence Within Antarctica's Most Speciose Plant Genus, Schistidium. Front. Ecol. Evol. 6:77. doi: 10.3389/fevo.2018.00077

# INTRODUCTION

Climatic oscillations during the Quaternary have played a major role in the occurrence and distribution of extant Antarctic biodiversity. Whilst only ∼0.18% of Antarctica is ice-free today (Burton-Johnson et al., 2016), reconstructions of Antarctica's past climate strongly suggest that, during the Last Glacial Maximum (LGM; ∼22–18 kya) as well as in previous glaciations, nearly all terrestrial areas in Antarctica were covered by thick, extensive ice sheets. Although terrestrial life must have been extremely limited during these periods, recent biogeographic and genetic studies find clear evidence for long-term (hundreds of thousands to millions and tens of millions of years) in situ persistence within most extant faunal and some microbial groups (Convey et al., 2008, 2009; Chong et al., 2015; Iakovenko et al., 2015; Bennett et al., 2016). Even with these recent advances, the origin and age of the extant Antarctic flora remain poorly studied, despite the flora playing a key role in Antarctic terrestrial ecosystems (Pisa et al., 2014). An improved understanding of the evolutionary history of Antarctica's flora is clearly needed to gain a better picture of past and current distributions as well as the adaptations of terrestrial life in the Antarctic.

Apart from just two species of vascular plants, the extant Antarctic flora is predominantly composed of bryophytes, particularly mosses (Ochyra et al., 2008). Schistidium Bruch and Schimp. (Grimmiaceae) is thought to be Antarctica's most speciose moss genus, and encompasses an estimated 13 species in the Antarctic (11.6% of all currently accepted Antarctic moss species; Ochyra et al., 2008). The seven Antarctic endemic Schistidium species furthermore represent roughly two-thirds (63.6%) of the total number of 11 presumed Antarctic endemic moss species (**Table 1**). Unlike most moss species in the Antarctic, which are often sterile, most Schistidium species produce sporophytes in profusion, making the genus particularly well-suited for dispersal and potentially well-connected across the continent (Ochyra et al., 2008).

Despite their relative abundance and large contribution to the endemic Antarctic moss flora, no studies to date have focused on the phylogeny and genetic diversity of Schistidium in Antarctica. Indeed, globally, Schistidium represents one of the most taxonomically neglected moss genera, and the genus is generally regarded as difficult to identify based on morphology (Blom, 1996; Ochyra et al., 2008; Ignatova et al., 2009; Milyutina et al., 2010). With about 110 species (Frey and Stech, 2009), Schistidium is a very widespread and common genus worldwide, particularly in high latitude, polar regions, and cool, high altitude regions at lower latitudes. Most genetic work has focused on the Northern Hemisphere, and then particularly on studies of the Russian flora (e.g., Ignatova et al., 2009; Milyutina et al., 2010). The genus is in urgent need of global revision, particularly in the Southern Hemisphere (Ochyra et al., 2008). Many Southern Hemisphere regions still await taxonomic assessment and, judging from preliminary studies, it is likely that the species diversity of Schistidium will increase to reach levels similar to that of the Northern Hemisphere (Ochyra et al., 2008, and references therein).

A particularly abundant Southern Hemisphere species within the genus is the Antarctic endemic Schistidium antarctici (Cardot) L.I.Savicz and Smirnova. This is one of the most widespread and abundant moss species in Antarctica, within the continent as well as on some maritime and sub-Antarctic islands in the South Atlantic region, including the South Shetland Islands, South Orkney Islands, South Georgia, the South Sandwich Islands and Bouvetøya (Ochyra et al., 2008). In the continental Antarctic it is found in nearly all ice-free coastal regions of all generally accepted Antarctic sectors (namely the Maud, Enderby, Wilkes, Scott, Byrd and Ronne Sectors; Pugh and Convey, 2008) and is present in at least 10 of the 16 currently recognized Antarctic Conservation Biogeographic Regions (Terauds and Lee, 2016). It is commonly found fruiting (i.e., with mature sporophytes) in the maritime Antarctic (Convey and Smith, 1993; Smith and Convey, 2002); however, it is seldom fertile in the drier and colder continental Antarctic, where it primarily reproduces asexually by means of protonemal gemmae (Ochyra et al., 2008). An early isozyme study on this species revealed no genetic variation between populations in an area spanning ∼25 km in the Windmill Islands, in East Antarctica (Melick et al., 1994). However, studies with much wider geographic sampling are required to increase the resolution of genetic variation amongst populations of S. antarctici on a continent-wide scale.

We here assessed the genetic variation between Antarctic Schistidium species within the nuclear ribosomal Internal Transcribed Spacer (ITS) region (ITS1-5.8S-ITS2), one of the most variable genetic markers known in bryophytes (Stech and Quandt, 2010), of which ITS2 is one of the most widely used and promising barcode markers for mosses (Hassel et al., 2013). The aims of this study were to: (i) assess morphological species delimitations among Antarctic Schistidium species from phylogenetic and Automatic Barcode Gap Discovery (ABGD) analyses, (ii) investigate the timing of divergences between putative Antarctic endemic and nonendemic species in order to assess their relative age on the continent; (iii) identify patterns of dispersal, diversity and gene flow within S. antarctici, one of the most widespread and common plant species in the Antarctic, and (iv) perform an initial assessment of the genetic variation present between Southern and Northern Hemisphere populations of the bipolar species Schistidium rivulare (Brid.) Podp. The study catalyses assessment of the phylogeny and genetic variability within and between Antarctic Schistidium species, with importance for evaluating the biogeography of the most speciose plant genus in the Antarctic as well as their adaptive potential to respond to climate change.

# MATERIALS AND METHODS

#### Sampling and Molecular Methods

Herbarium and fresh specimens of Schistidium species were sampled from most available regions in the Antarctic (see S1 Table for herbarium and location details). All herbarium samples were obtained from the herbaria based at the British Antarctic Survey (BAS) (herbarium code AAS), the Botanic Garden, Meise (BR), and the University of Wollongong (WOLL), and were augmented by fresh collections made during expeditions of the authors (EB, PC). The nine species included here (according to the original identifications) were: Schistidium falcatum (Hook.f. & Wilson) B.Bremer, Schistidium lewis-smithii Ochyra, Schistidium rivulare, Schistidium andinum (Mitt.) Herzog, Schistidium urnulaceum (Müll.Hal.) B. G. Bell, Schistidium leptoneurum Ochyra, Schistidium amblyophyllum (Müll.Hal.) Ochyra & Hertel, Schistidium cupulare (Müll.Hal.) Ochyra, and S. antarctici. We attempted to include representatives of all 13 described Antarctic Schistidium species; however, samples of four species (S. deceptionense Ochyra, Bedn.-Ochyra & R. I. L. Smith, S. halinae Ochyra, S. steerei Ochyra and S. praemorsum (Müll.Hal.) Herzog) were not available due to lack of material (see **Table 1**). Although not known to be present in the Antarctic continent, we also included three samples of Schistidium apocarpum (Hedw.) Bruch & Schimp., from southern Chile, and the sub-Antarctic locations of South Georgia and Macquarie Island. All sequenced specimens were morphologically identified based on Ochyra et al. (2008), using light microscopy.

TABLE 1 | Information on geographic range, endemic status and occurrence of Antarctic Schistidium species (based on Ochyra et al., 2008), and inclusion in the current study.

\*Due to a lack of available sample material these species were not included in the analyses of the current study. \*\*We included the only available sample of S. urnulaceum (AAS 01946), which subsequent genetic and taxonomic verification revealed to be S. antarctici. Endemic refers to whether the species is endemic in the sub-Antarctic or Antarctic. Geographic terms: SA, South America; SSI, South Shetland Islands; sub-A, sub-Antarctic; A, Atlantic Ocean; I, Indian Ocean; SG, South Georgia; SSW, South Sandwich Islands; SOI, South Orkney Islands; AP, Antarctic Peninsula; AC, Antarctic continent.

DNA extraction was performed using the DNeasy Plant Mini Kit (Qiagen GmbH, Hilden, Germany), grinding specimens using a mortar and pestle and liquid nitrogen, following manufacturer's instructions. In most cases, only one gametophyte shoot was included per sample. The ITS region was PCR-amplified in two parts, using primer combinations ITS-A/ITS-C for ITS1 and ITS-E/ITS-B for ITS2 (Blattner, 1999). We used the Taq PCR Core Kit (Qiagen GmbH, Hilden, Germany), following manufacturer's instructions, with addition of 1 µl of Bovine Serum Albumin (BSA) in all reactions, and using an annealing temperature of 50 °C. Sequencing (forward and reverse) was executed by LGC Genomics (Berlin, Germany), using the amplification primers.

#### Sequence Editing and Alignment

As outgroup representatives we included GenBank sequences of Schistidium sordidum I. Hagen, Schistidium sinensiapocarpum (Müll. Hal.) Ochyra and Schistidium pulchrum H.H. Blom (GenBank nos. HM053942, HM053940, and HQ890521, respectively), given their basal position in the genus according to molecular phylogenetic reconstructions (Ignatova et al., 2009). For the bipolar species, S. rivulare, we also included four herbarium samples from Europe, and four available ITS sequences from GenBank from Russia (GenBank nos. HM053934–HM053937; Ignatova et al., 2009). The sequence dataset was aligned with PRANK v.140603 (Löytynoja and Goldman, 2008), using default settings. Models of evolution were selected using jModelTest v2.7.1 (Darriba et al., 2012) using the SPR base tree search operation, G rate variation option and AICc calculations, resulting in the model TPM1uf+G.

#### Phylogenetic Analyses

Bayesian analysis was performed in MrBayes v.3.2 (Ronquist et al., 2012), running the analysis for 1.5 × 10<sup>6</sup> generations (applying default settings of two runs with four chains), with trees saved every 1.0 × 10<sup>3</sup> generations, and omitting the first 25% of trees as burn-in. Convergence was assessed by checking that split frequencies had an average standard deviation below 0.01 and all parameters exceeded effective sample sizes (ESS) of 200 using Tracer v.1.6 (Rambaut et al., 2014). A maximum clade credibility tree with median heights was visualized using Figtree v1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/). Maximum likelihood analyses were performed using RAxML-GUI v1.3.1 (Silvestro and Michalak, 2012), applying the 'bootstrap + consensus' option (1000 iterations) and the GTR+G model of evolution and default settings.

We used the Automatic Barcode Gap Discovery (ABGD; Puillandre et al., 2012) to examine species delimitations within our ITS dataset, using the online web server, and applying default settings. This automated species delimitation approach uses a pairwise genetic distance-based method to find nonoverlapping intra- and interspecific genetic distance distributions within the sequence dataset to construct hypothetical candidate species.
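The idea of a barcode gap can be illustrated with a minimal Python sketch. This is not the ABGD algorithm itself, which partitions the data recursively over a range of prior intraspecific divergences; it only shows how a gap in sorted pairwise distances can separate within-group from between-group comparisons. The function names and the toy alignment are hypothetical and are not data from this study.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def largest_gap(distances):
    """Return the two consecutive sorted distances that bound the widest gap."""
    d = sorted(distances)
    if len(d) < 2:
        raise ValueError("need at least two distances")
    i = max(range(len(d) - 1), key=lambda k: d[k + 1] - d[k])
    return d[i], d[i + 1]


def cluster(seqs: dict, threshold: float):
    """Single-linkage grouping: sequences closer than the threshold share a cluster."""
    parent = {name: name for name in seqs}

    def find(n):
        # Follow parent links to the cluster representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy alignment (illustrative only, not sequences from this study).
TOY = {
    "a1": "ACGTACGTACGTACGTACGT",
    "a2": "ACGTACGTACGTACGTACGA",
    "b1": "ACGTTTGTACCTACGAACGT",
    "b2": "ACGTTTGTACCTACGAACGA",
}

if __name__ == "__main__":
    dists = [p_distance(TOY[a], TOY[b]) for a, b in combinations(TOY, 2)]
    low, high = largest_gap(dists)
    print("gap between", low, "and", high)
    print(cluster(TOY, (low + high) / 2))
```

Placing the threshold at the midpoint of the widest gap is a simplification chosen for readability; a real analysis would rely on the ABGD web server or an equivalent implementation.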

#### Within-Species Variation in Schistidium antarctici and Schistidium rivulare

We examined the phylogeographic structure within species with a sufficient sample size (>10 samples; resulting in analyses of S. antarctici and S. rivulare only, with n = 53 and n = 12, respectively) by calculating statistical parsimony networks of the ITS haplotypes using the TCS (Templeton et al., 1992) method in the program Popart (Leigh and Bryant, 2015), using default settings. We also calculated standard genetic diversity indices using Arlequin v3.5.1.2 (Excoffier and Lischer, 2010) [using Kimura 2P genetic distances, (Kimura, 1980)] within these species. Within S. antarctici, we investigated population structure in different regions of the maritime Antarctic (WAP: West Antarctic Peninsula, including the South Shetland Islands; NEAP: north-east Antarctic Peninsula; SOI: South Orkney Islands) by calculating FST (using haplotype frequencies only) and ΦST (Excoffier et al., 1992) (using Kimura 2P genetic distances) in Arlequin, with 10,000 dataset permutations to assess significance.
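For readers unfamiliar with the Kimura two-parameter (K2P) distance used above, the following minimal Python sketch computes it for a pair of aligned sequences. It applies the standard formula d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitional and transversional differences. The function name and the handling of gaps and ambiguous bases (sites are simply skipped) are assumptions of this sketch, not a description of how Arlequin treats them.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    transitions = transversions = sites = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in BASES or y not in BASES:
            continue  # skip gaps and ambiguity codes (an assumption of this sketch)
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    arg1 = 1 - 2 * p - q
    arg2 = 1 - 2 * q
    if arg1 <= 0 or arg2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg1) - 0.25 * math.log(arg2)


if __name__ == "__main__":
    # One transition (A/G) in ten sites: slightly above the raw proportion of 0.1.
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

The correction matters little at the low divergences reported within species here, but it keeps distances comparable across more divergent pairs.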

#### Molecular Dating

We assessed the relative divergence times within ITS between different Antarctic Schistidium species, with a particular focus on the relative divergence times between currently-recognized Antarctic endemic and non-endemic species. Divergence times were calculated in BEAST v2.4.1 (Bouckaert et al., 2014). Because of a lack of fossil data suitable for our dataset, we used two different nucleotide substitution rates. Firstly, we used (a) a rate of 4.47 × 10<sup>−3</sup> subst./site/my (with 95% highest posterior density intervals (95HPD): 1.76 × 10<sup>−3</sup> to 8.34 × 10<sup>−3</sup> subst./site/my), corresponding to the evolutionary rate estimated for ITS in Polytrichaceae mosses (Method I2a in Biersma et al., 2017). We performed an additional dating analysis based on (b) a much slower nuclear substitution rate (1.35 × 10<sup>−3</sup> subst./site/my) originally derived from angiosperms (Les et al., 2003, and references therein), but previously used in molecular studies on bryophytes (Hartmann et al., 2006; Lang et al., 2015; Biersma et al., 2017). Apart from the differences in rate, all settings in both analyses remained the same. We applied a lognormal clock, the best-supported jModelTest model of evolution (GTR+G) and a coalescent tree prior, as this was both an intra- and inter-species analysis. The MCMC chains were run for a chain length of 4.0 × 10<sup>7</sup> generations, logging parameters every 1.0 × 10<sup>3</sup> generations. Convergence of the runs was assessed in Tracer v.1.6 (Rambaut et al., 2014), to ensure all parameters had ESS > 200 with a burn-in of 10%. A maximum clade credibility tree of the analysis implementing the Polytrichaceae-based rate (a) (which was phylogenetically closer to our species of interest) was constructed using TreeAnnotator v1.8.2 (Drummond and Rambaut, 2007), using median node heights and a 10% burn-in. The tree was visualized using Figtree v1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).
All figures were edited in Illustrator CS5 software (Adobe Systems, Inc.).

# RESULTS

#### Phylogenetic Analyses

Nine out of the 13 Antarctic Schistidium species were sampled from most regions in the Antarctic (see **Figure 1** for sample locations of specimens representing different clades of the phylogenetic analysis). No material was available from the remaining four species. The phylogenetic analysis of ITS (**Figure 2**) revealed at least eight strongly-supported clades matching morphologically delimited Antarctic Schistidium species: S. antarctici, S. rivulare, S. andinum, S. falcatum, S. lewis-smithii, S. amblyophyllum, S. leptoneurum, and specimens likely to represent S. cupulare (see below).

The species delimitation method ABGD revealed a clear "barcode gap" at Pmax = 0.0046, delimiting nine putative species clusters, while a conservative partition of ABGD was reached at Pmax = 0.0077, with eight putative species clusters (**Figure 2**). In the latter partition (Pmax = 0.0077), specimens originally identified as S. apocarpum and S. andinum were grouped together as one cluster.

Several specimens were initially misidentified according to their position in the molecular phylogenetic reconstructions as well as BLAST searches and/or reexamination of morphological characters. Three specimens from Alexander Island originally identified as S. antarctici (AAS 00508, AAS 09322, AAS 09346; cf. Ochyra et al., 2008) formed a clade that was separated from all three Schistidium species so far reported from the southernmost Antarctic Peninsula (Ochyra et al., 2008), viz. S. amblyophyllum, S. andinum, and S. antarctici. Morphologically, these specimens clearly differ from the former two species in their leaf areolation, but resemble both S. antarctici and S. cupulare. According to the molecular results, but considering the small number of analyzed collections and morphological variability observed between them, they are indicated as "S. sp.<sup>a</sup>/cf. S. cupulare" in **Figure 2**. Additionally, the single representative of the rare S. urnulaceum (AAS 1946) proved to belong to S. antarctici, both genetically and morphologically. The three specimens originally identified as S. apocarpum (AAS 00494, AAS 00123A, AAS 03299) likely represent a different species (named "S. sp.<sup>b</sup>" in **Figure 2**), as none obtained a high BLAST hit to other S. apocarpum on GenBank; the highest hit was S. sinensiapocarpum (KX443490; coverage 92%, identity 92%) and the first hit with a sequence of S. apocarpum (JQ040700; coverage 82%, identity 91%) was approximately 40th in line of all BLAST results.

### Population Genetic Analyses of Schistidium antarctici and Schistidium rivulare

A total of 53 samples of S. antarctici were included throughout the species' geographic range. **Figure 3** shows a TCS haplotype network and map of the different haplotypes within S. antarctici. Although the total nucleotide diversity within S. antarctici was low (π = 0.002 ± 0.001), five genetically and geographically distinct haplotypes were present within the species (**Figure 3**). The most common haplotype (haplotype 2; see **Figures 2**, **3**) falls in a basal position within the S. antarctici phylogeny (**Figure 2**) and is present in the west Antarctic Peninsula and associated islands (including the South Shetland Islands) and in East Antarctica, including the Byrd, Ross and Wilkes Sectors. The more phylogenetically derived S. antarctici group (haplotypes 3–5, see **Figures 2**, **3**) was predominantly present in the eastern Antarctic Peninsula (including James Ross I. and Vega I.) and more northern Scotia Arc archipelagoes (South Orkney Islands, South Georgia). Three samples within haplotypes 3–4 were obtained from the western side of the Antarctic Peninsula and the South Shetland Is. (AAS herbarium nos. 649, 1880, and 1771). The different regions in the maritime Antarctic (WAP, NEAP, SOI, see **Figure 3B**) exhibited highly significant genetic differentiation in S. antarctici, with all FST and ΦST values being highly significant.

The 12 sequences of the bipolar S. rivulare revealed higher genetic variation throughout its geographic range (π = 0.007 ± 0.004). Specimens from Russia were placed at the base of the clade, followed by more recent clades with samples of European and sub-Antarctic/Antarctic specimens, respectively. The sub-Antarctic and Antarctic specimens formed a distinct clade with high support (PP = 0.98; **Figure 2**). The haplotype network (**Figure 4**) revealed higher genetic variation in specimens from the Northern Hemisphere, with all Southern Hemisphere specimens represented by the same haplotype. Specimens from Eurasia were split into several branches but considerably enlarged sample sizes would be required to draw robust conclusions of the structure within these branches.
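As a reminder of what the nucleotide diversity values (π) above measure, the sketch below computes π as the mean number of pairwise differences per site across all pairs of sampled sequences. It is a simplified illustration that uses uncorrected differences, whereas the values reported here were calculated in Arlequin with Kimura 2P distances; the function name and the toy sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list) -> float:
    """Mean pairwise differences per site over all pairs of aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")
    total_diffs = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    # Hypothetical sample: three identical sequences and one single-site variant.
    sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
    print(nucleotide_diversity(sample))
```

A sample dominated by one haplotype therefore yields a low π even when several rare haplotypes are present, which is the pattern described for S. antarctici.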

#### Divergence Time Analysis

The divergence time analysis (**Figure 5**) revealed multi-million year divergences between all Antarctic Schistidium species, with either rate [(a) or (b)] applied. Using the moss-defined rate (a), the Schistidium outgroup representatives and ingroup were estimated to have diverged ∼10.77 (HPD95: 20.51–5.34) Mya. The split of S. antarctici from other Antarctic species was estimated at about 7.71 (HPD95: 13.06–4.21) Mya in the late Miocene. Divergences between different S. antarctici haplotypes (1–5; see **Figures 2–3, 5**) occurred around 1.18 (HPD95: 2.69–0.43) Mya. Within the specimens of S. rivulare examined, populations from the Northern and Southern Hemisphere were estimated to have separated ∼0.63 (HPD95: 1.16–0.22) Mya. The endemic and rare to very rare S. leptoneurum and S. lewis-smithii diverged from their closest Antarctic relatives approximately 1.27 (HPD95: 2.61–0.37) and 1.07 (HPD95: 2.23–0.22) Mya, respectively. Moreover, S. cupulare diverged from its nearest Antarctic relative approximately 2.92 (HPD95: 5.30–1.08) Mya. Based on this moss-derived rate, all Antarctic species diverged between the end of the Miocene (split between S. antarctici and the remaining species) and the Pliocene and Quaternary, a time when the global climate started to cool (see **Figure 5**). Applying a slower nuclear rate originally derived from angiosperms [rate (b); see section Materials and Methods], all clades diverged much earlier in time (∼3× earlier), resulting in divergences during the Pliocene and Miocene for most species, as well as possibly even a late Oligocene divergence for S. antarctici.
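The roughly threefold difference between the two dating analyses follows from the ratio of the two substitution rates. The short Python sketch below makes this arithmetic explicit under a simplifying assumption: that node ages scale inversely with the clock rate. The BEAST analyses used a relaxed lognormal clock and a coalescent prior, so the actual posterior ages need not scale exactly in this way, and the helper function is hypothetical.

```python
RATE_MOSS = 4.47e-3        # rate (a), subst./site/my, Polytrichaceae-based
RATE_ANGIOSPERM = 1.35e-3  # rate (b), subst./site/my, angiosperm-derived


def rescale_age(age_mya: float, rate_used: float, rate_new: float) -> float:
    """Rescale a node age, assuming age x rate (substitutions per site) stays constant."""
    return age_mya * rate_used / rate_new


if __name__ == "__main__":
    ratio = RATE_MOSS / RATE_ANGIOSPERM
    print(f"rate ratio: {ratio:.2f}")

    # Split of S. antarctici from the other Antarctic species under rate (a).
    rescaled = rescale_age(7.71, RATE_MOSS, RATE_ANGIOSPERM)
    print(f"approximate age under rate (b): {rescaled:.1f} Mya")
```

Under this approximation the rate ratio is about 3.3, and the 7.71 Mya split of S. antarctici rescales to roughly 25.5 Mya, which is consistent with the late Oligocene divergence mentioned above.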

specimens originally identified as S. apocarpum likely represent a different species (see discussion and notes in S1 Table). Different haplotypes (1–5) within S. antarctici as shown in Figure 3 are provided next to the relevant samples.

# DISCUSSION

# Phylogenetic Analyses and Species Delimitations

This is the first molecular phylogenetic study of Antarctic species of Schistidium, the most speciose moss genus in the Antarctic. Our data confirm the validity of at least seven of the 13 currently recognized Antarctic species (S. rivulare, S. andinum, S. falcatum, S. amblyophyllum and the endemic species S. antarctici, S. lewis-smithii, and S. leptoneurum). In general, the molecular data thus seem to support the morphological species concept for Antarctic Schistidium species. However, further morphological and molecular studies will be required to assess the status of S. urnulaceum, S. cupulare, and the four remaining Antarctic Schistidium species not included in this study through lack of available material (S. deceptionense, S. halinae, S. steerei, S. praemorsum).

overlapping locations are disentangled in (A,B) for better visibility of their haplotypes, and have lines pointing at their original location.

Our initial results suggest that the endemic S. urnulaceum may not be a distinct species, but possibly represents a phenotypic variant of S. antarctici. However, mis-identifications concerning S. antarctici and S. urnulaceum have been reported previously (Ochyra et al., 2008), and we found that the sequenced specimen was misidentified as well, and matches the morphological characters of S. antarctici. Schistidium cupulare has long remained a poorly known species (Ochyra et al., 2008). It is primarily distinguished by its leaf areolation, but considerable variation in the basal laminal cells was observed even in the three putative S. cupulare specimens included here. Since all three were originally identified as S. antarctici, the delimitation of S. antarctici and S. cupulare needs further study. Schistidium apocarpum is generally regarded as a problematic taxon due to its phenotypic variability (Ochyra et al., 2008), which has led to the grouping together (e.g., Bremer, 1980a,b) and subsequent differentiation (e.g., Blom, 1996) of various species within an "S. apocarpum complex" (see Blom, 1996; Ochyra et al., 2008). While this complex has been well-studied in Scandinavia (Blom, 1996), its taxonomy in the Southern Hemisphere is still in need of revision. Consequently, the identity of the three samples originally identified as S. apocarpum in the present study remains ambiguous, although the BLAST results indicate that they most likely do not belong to S. apocarpum s.str. nor to other Northern Hemisphere species of the S. apocarpum complex (sensu Blom, 1996).

Molecular relationships of the Antarctic Schistidium species partly agree with the intrageneric classification adopted in Ochyra et al. (2008). The Antarctic species of subg. Canalicularia, S. falcatum and S. lewis-smithii, form a well-supported clade, which is, however, nested inside subg. Apocarpa, to which all other Antarctic species belong. Within the latter subgenus, the species of sect. Conferta (except S. antarctici), sect. Rivularia (S. rivulare), and sect. Apocarpiformia form well-supported clades, too. Schistidium antarctici may be distinguished at section level as well; however, the low support for the clade comprising all other included species may not exclude the possibility of sect. Conferta being monophyletic. As acknowledged in Ochyra et al. (2008), the intrageneric classification of Schistidium will need refinement based on increased taxonomic study, which is supported by the present molecular results.

# Support for Long-Term Antarctic Persistence of Several Schistidium Species

Our data suggest the presence of several distinct and old (∼1 Mya or older) Antarctic endemic species within Schistidium (namely S. antarctici, S. leptoneurum and S. lewis-smithii), indicating a long (certainly well before LGM) persistence of some of these species on the continent. Although other explanations like periodic recolonizations cannot be ruled out completely, the present data complement the recently-recognized and recurring pattern of long-term (pre-LGM) Antarctic presence across a range of terrestrial Antarctic biota, suggested from both molecular and classical biogeographic studies of all major extant faunal, floral and even microbial groups (Stevens and Hogg, 2003; Convey and Stevens, 2007; Convey et al., 2008, 2009; De Wever et al., 2009; Vyverman et al., 2010; Fraser et al., 2014; Pisa et al., 2014; Chong et al., 2015; Iakovenko et al., 2015; Bennett et al., 2016). These findings combine to overturn a long-held but largely untested view that all Antarctic terrestrial life is of recent, post-LGM origin, derived from previous glaciological reconstructions suggesting extensive ice sheets covered nearly all terrestrial areas and extended far onto the Antarctic continental shelf throughout the LGM and previous glaciations. In part, this divergence of interpretation across different disciplines has been driven by a lack of spatial resolution in earlier glaciological models. However, recent modeling studies reconstructing Antarctica's past climate suggest considerably greater dynamism in Antarctica's ice sheets throughout the Pliocene and Quaternary than previously thought (Pollard and DeConto, 2009; DeConto and Pollard, 2016). Although at present precise locations of glacial refugia, where terrestrial life may have persisted in situ, remain unknown (Pugh and Convey, 2008; Convey et al., 2009), the biological evidence requiring such refugia, and at various regional scales, is increasingly clear (Convey et al., 2008).
Our results here may suggest the presence of a refugial area in the northern Antarctic Peninsula/South Shetland Islands region of the maritime Antarctic, where the diversity of S. antarctici is highest. A separate study of the Antarctic Peninsula/South Shetland Islands endemic fly, Belgica antarctica Jacobs, 1900, implies a similar conclusion (Allegrucci et al., 2006). Most recently, Carapelli et al. (2017) reported evidence that three springtail (Collembola) species native to the same region have persisted in situ on parts of the South Shetland Islands since at least the last interglacial (c. 150,000 years, two species) or the previous one (c. 500,000 years, one species). The age (>1 Mya) of the endemic mosses S. leptoneurum and S. lewis-smithii documented here, whose geographic range is currently restricted to the South Shetland Islands, provides further support for a regional refugial area having been present in this archipelago.

Our results suggest that the ancestors of the sub-Antarctic and Antarctic populations of S. rivulare dispersed from the Northern Hemisphere to the sub-Antarctic and Antarctic. As Northern and Southern Hemisphere populations were estimated to have separated ∼0.63 (HPD95: 1.16–0.22) Mya, sub-Antarctic and Antarctic populations may have been present at their current locations prior to the Last Glacial Maximum (LGM; ∼20–18 kya). However, further geographic sampling is required to identify the source location from which the current sub-Antarctic and Antarctic populations are derived, as well as the age of the Antarctic populations.

# Diversity Patterns Within Schistidium antarctici and Conservation Implications

Our data provide novel and valuable indications of where and when the Antarctic endemic species S. antarctici may have persisted through repeated glacial periods, and of patterns of dispersal and gene flow across the continent. We found significant genetic differentiation in the northern Maritime Antarctic and sub-Antarctic, dividing the regions east of the mountainous spine of the Antarctic Peninsula (eastern Antarctic Peninsula), the South Orkney Islands, and South Georgia from regions on the west of the Peninsula and the rest of the continent (**Figure 3B**). This suggests that connectivity between the Antarctic Peninsula and Wilkes Land might be stronger than between the two regions on either side of the spine of the Antarctic Peninsula. It also indicates that S. antarctici populations in the majority of the sectors of the Antarctic continent are genetically very similar and appear to have been derived from only one haplotype (haplotype 2). The highest genetic variation was found in the northern Antarctic Peninsula region, suggesting as noted above that this is likely a region where the species survived the LGM in situ. Haplotypes 1–5 (see **Figures 2–3**,**5**) were estimated to have diverged around 1.18 (HPD95: 2.69–0.43) Mya (**Figure 5**), revealing S. antarctici to be an enduring and old species on the continent, originating and persisting there on at least a million-year timescale.

A pattern of distinct genotypes east of the mountainous spine of the Antarctic Peninsula is also seen in other taxa, including rotifers (Iakovenko et al., 2015) and diatoms (Kociolek et al., 2017), possibly providing further evidence supporting distinct bioregions on either side. The north-east and north-west Antarctic Peninsula have also been differentiated as distinct Antarctic Conservation Biogeographic Regions (ACBRs), based on multivariate analyses of regional biodiversity patterns (Terauds et al., 2012; Terauds and Lee, 2016). However, at present, the north-east Antarctic Peninsula (ACBR1) is much less well protected in terms of conservation measures than the north-west Antarctic Peninsula (ACBR3). While the latter has 21 Antarctic Specially Protected Areas (ASPAs), covering 1.99% of the region, the eastern side has just one ASPA, covering just 0.03%. Furthermore, no ASPAs in the north-east Antarctic Peninsula have been declared for the purposes of protecting biodiversity (compared to 17 in the north-west Antarctic Peninsula), even though it is the second most visited ACBR by tourists in the Antarctic (Terauds and Lee, 2016). Such observations highlight the conclusion of Hughes et al. (2016) about the overall lack of protection afforded to vegetation in the ASPA system. Given the growing evidence that this area supports unique lineages of multiple terrestrial species, we suggest that priority be given to area protection within the north-east Antarctic Peninsula (ACBR1) region.

#### AUTHOR CONTRIBUTIONS

PC and EB conceived the study, with details further developed by JJ, MS, KL and HG. EB carried out the molecular work. MS and EB performed taxonomical analyses. EB, with guidance from JJ, conducted the molecular analyses and wrote the manuscript. All authors contributed significantly to the manuscript.



## DATA AVAILABILITY

Sequences have been deposited in GenBank under accession nos. MG582219–MG582295.

#### ACKNOWLEDGMENTS

We thank Sharon Robinson for providing new sample material from Wilkes Land, the curators of the AAS herbarium at the British Antarctic Survey (BAS), UK, and the Herbarium of the Botanic Garden Meise, Belgium, for providing herbarium material. The support of the Instituto Antartico Chileno (INACH) in assisting the authors' (EB and PC) collection of fresh specimens is gratefully acknowledged. We are grateful to Laura Gerrish (MAGIC, BAS) for preparing part of **Figure 3**. This study was funded by a Natural Environment Research Council (NERC) PhD studentship (ref NE/K50094X/1) to EB and NERC core funding to the BAS Biodiversity, Evolution and Adaptation Team. It also contributes to the Scientific Committee on Antarctic Research (SCAR) State of the Antarctic Ecosystem program.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00077/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Biersma, Jackson, Stech, Griffiths, Linse and Convey. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Taxonomic Resolution, Functional Traits, and the Influence of Species Groupings on Mapping Antarctic Seafloor Biodiversity

Jan Jansen<sup>1,2</sup>\*, Nicole A. Hill<sup>1</sup>, Piers K. Dunstan<sup>3</sup>, Marc P. Eléaume<sup>4</sup> and Craig R. Johnson<sup>1</sup>

<sup>1</sup> Institute for Marine and Antarctic Studies, University of Tasmania, Hobart, TAS, Australia, <sup>2</sup> Australian Antarctic Division, Kingston, VIC, Australia, <sup>3</sup> CSIRO Oceans and Atmosphere, Hobart, TAS, Australia, <sup>4</sup> Muséum National d'Histoire Naturelle, UMR 7205-ISYEB, Centre National de la Recherche Scientifique-UPMC-EPHE, Paris, France

#### Edited by:

Huw James Griffiths, British Antarctic Survey (BAS), United Kingdom

#### Reviewed by:

Juan Ernesto Guevara Andino, Field Museum of Natural History, United States Rakesh Bhutiani, Gurukul Kangri Vishwavidyalaya, India

> \*Correspondence: Jan Jansen jan.jansen@utas.edu.au

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 18 March 2018 Accepted: 29 May 2018 Published: 19 June 2018

#### Citation:

Jansen J, Hill NA, Dunstan PK, Eléaume MP and Johnson CR (2018) Taxonomic Resolution, Functional Traits, and the Influence of Species Groupings on Mapping Antarctic Seafloor Biodiversity. Front. Ecol. Evol. 6:81. doi: 10.3389/fevo.2018.00081

Benthic marine biodiversity on the Antarctic continental shelf is high and unique, yet its distributional patterns are still relatively poorly understood. Some of the main issues are that biological data are sparse, and that many species are rare and seem only weakly related to environmental conditions. Grouping species by taxonomic or functional similarity has historically been used to compensate for missing species identification, to generate a more widespread distribution of data-points, and this practice can help to gain a better understanding of the distribution of biodiversity. However, there are few guidelines on how to group species, the implicit assumptions about species associations in the groups are difficult to validate, and the information loss associated with grouping species is unknown. Here, we analyse whether grouping benthic macrofaunal species by taxonomic or functional similarity preserves distributional patterns seen in species distributions, using a model-based approach called "species archetype model" that groups species or other units based on the similarity in their responses to environmental factors. Using presence-absence data, the species archetype models identify twice as many assemblages when used on the highest taxonomic resolution data, than when applied to taxonomic data at lower resolution (e.g., class) or functional groups based on mobility, feeding type, and body shape. Further, confidence in the predictions of either taxonomic or functional groups is far less than for predictions based on the highest taxonomic resolution data.
Although using functional groups is often thought to accumulate species with similar environmental responses, our analysis shows that functional groups may insufficiently resolve assemblage structure for presence-absence data. Model-based approaches provide key information to understanding the regional distribution of Antarctic marine biodiversity, and care needs to be taken when using a-priori groupings of species to make statements about the distribution of biodiversity.

Keywords: marine biodiversity, Southern Ocean, functional trait, taxonomic resolution, species archetype model, species distribution, Antarctica, benthic assemblages

# INTRODUCTION

The ocean surrounding the Antarctic continent supports unique assemblages of highly diverse benthic marine species (Griffiths et al., 2009; De Broyer et al., 2014; Chown et al., 2015). However, the remoteness which has protected this pristine environment for a long time also means that biological data for the region are sparse (De Broyer et al., 2014). Sparse biological data, the rarity of many Antarctic species, limited environmental predictor variables (Jansen et al., 2018b) and inconsistent relationships between biological data and environmental conditions across the regions and taxa studied (Cummings et al., 2010; Convey et al., 2014) are all reasons that the distributional patterns of seafloor biodiversity around Antarctica are still relatively poorly understood (Brandt et al., 2007; Chown et al., 2015). This is an issue because lack of knowledge about the distribution of biodiversity hinders (1) informed marine spatial planning in Antarctica, including the implementation of conservation measures, (2) policy development underpinning regulation of human activity in Antarctica, and (3) predicting the response of Antarctic marine ecosystems to environmental change.

Statistical models that link the occurrence of biota with relevant environmental factors are one way of making the most of sparse biological data to understand and map the distribution of benthic communities and their biodiversity. To date efforts to map seafloor communities have either focused on a few common species of echinoids (Gutt et al., 2012; Pierrat et al., 2012), or have been based on dissimilarity metrics (Koubbi et al., 2011), expert opinion (Gutt et al., 2013), or aggregating species with functional similarity (Jansen et al., 2018a). However, while these efforts provide useful insight into the distribution of biodiversity, they either analyse only a single component of the benthic community, they lack reproducibility, or individual species responses are difficult to identify (in the case of multivariate distance based approaches).

Aggregating species by taxonomic or functional similarity is a common approach to overcome difficulties in the analysis of data with many rare species (e.g., Cunningham and Lindenmayer, 2005; Rooper et al., 2014; Jansen et al., 2018a), which comprise a substantial number of the species in any community. Traditionally, species are often grouped by their taxonomy, further motivated by constraints in time, expertise, or funding. More recently, species have been aggregated or classified according to functional traits. Functional traits define a species in terms of its ecological role and can include body size, other morphological characteristics and life history traits. Further, these traits are thought to be related to the performance of a species, and therefore its occurrence and abundance, under particular environmental conditions (Webb et al., 2010). In grouping species based on functional traits, it is assumed that different species with common functional traits respond to environmental gradients in a similar way. Because this approach provides a more mechanistic understanding of the structure and function of assemblages and their response to change (Sunday et al., 2015), it has attracted much interest (Petchey and Gaston, 2006; Cadotte et al., 2015). In Antarctica, functional traits such as mobility and feeding-strategy have been described as important factors influencing species distributions, with species commonly classified into mobile deposit feeders and sessile suspension feeders (e.g., Barry et al., 2003; Jansen et al., 2018b). While sessile suspension feeders, such as Bryozoans, Corals and Sponges, are often associated with steep slope habitats and rocky substrate, mobile deposit feeders such as Holothurians and a range of burying fauna inhabit softer sediments in areas with low current speeds (Barry et al., 2003).
Functional approaches can result in spatial patterns that are distinctly different to those based on traditional taxonomy and inform us about how communities are structured (Stuart-Smith et al., 2013). However, there are few guidelines on how and at which level to group species, yielding a wide variety of approaches that are dependent on individual researchers. Further, the assumptions about species associations in the groups are difficult to validate and it is unknown how much information about biodiversity patterns is lost when species are grouped.

In contrast to a-priori groupings, model-based approaches, such as Species Archetype Models (SAMs; Dunstan et al., 2011), allow grouping species or other units based solely on the similarity of their response to a suite of environmental covariates. SAMs are based on generalized linear models, are defined for a range of different types of data, and importantly for our context are able to model rarer species. Also, because SAMs are purely based on environmental responses, the resulting archetypes are not a result of any a-priori assumptions about species associations. Model-based approaches are a relatively new statistical tool, but have already shown promising results for mapping species and habitat distributions (e.g., Woolley et al., 2013; Hill et al., 2017; Ovaskainen et al., 2017).

In this study, we use SAMs to map the benthic invertebrate community on the George V shelf in East Antarctica. We use presence-absence data from underwater camera images in which taxa have been identified to the highest taxonomic resolution possible, and use two additional versions of the same dataset in which species were grouped a-priori by taxonomic or functional similarity. We hypothesize that species with similar functional traits in mobility, feeding-type, and body shape respond in a similar way to environmental conditions, and that grouping species before analysis only marginally affects predicted biodiversity patterns. Further, we hypothesize that taxonomic groupings of species aggregate different functional traits, and therefore expect predicted biodiversity patterns to differ from both the analysis of functional groups and of the species data, unless niche conservatism is high, in which case an analysis of taxonomic groups would differ only slightly from patterns observed in the species data.

# METHODS

#### Study Area

The study area is the George V continental shelf and slope in East Antarctica, spanning longitudes 139°E–147°E from the Antarctic coastline to the shelf break at around 65.5°S. Water depth on the shelf is typically 500–700 m, punctuated by bathymetric features including the Mertz and Adélie Banks (200–250 m depth) and the George V and Adélie Basins (depths up to 1,300 m; **Figure 1**). The oceanography in this area is mainly influenced by the Mertz Glacier Tongue and the adjacent Mertz Polynya (Cougnon et al., 2013), an area of ice-free water that drives water circulation (Massom et al., 2001) and supports a relatively long growing season of phytoplankton (Sambrotto et al., 2003; Beans et al., 2008). Abundant and diverse benthic communities have been found primarily on the shallower section of the shelf between 200 and 600 m and on the shelf break (Post et al., 2011), and modeling work suggests widespread cover of suspension feeders on the banks (Jansen et al., 2018a).

# Biological Data

#### Data Collection and Scoring

Biological data were collected during the Collaborative East Antarctic Marine Census (CEAMARC) for the Census of Antarctic Marine Life in December 2007–February 2008 (Hosie et al., 2011).

Detailed underwater still images were obtained from a forward-facing 8 megapixel Canon EOS 20D SLR with two speedlight strobes mounted on a beam trawl. Transects at 32 sites were mostly 4–6 km long (with some exceptions ranging between 3 and 16 km) and the water depth at the sample sites was between 200 and 1,550 m. The trawl was controlled using a deck winch and pictures were taken every 10 s. Fauna were identified to the lowest taxonomic level possible. Where species identification was not possible, specimens with similar overall appearance were grouped into morphotypes (operational taxonomic units, or OTUs). The bottom third of each image was scored. For each image, the abundance of each OTU was estimated within 5% bins from 0 to 50%, and 10% bins from 50 to 100%. Although abundance was recorded from the images, the statistical method we used required reducing the data to presence-absence for analysis (see section Statistical Analysis). The image-derived data from each transect were then split at the boundaries of the environmental grid cells to ensure that all pictures assigned to a cell shared the same values of the environmental covariates. A total of 2,685 images, distributed across 41 grid cells, were used in the analysis.
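The reduction from per-image cover estimates to presence-absence per grid cell can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the record layout, the cell and OTU names, and the rule that any non-zero cover in any image counts as a presence are all assumptions made for the example.

```python
def presence_absence_by_cell(records):
    """Collapse image scores to presence-absence per grid cell.

    records: iterable of (cell_id, otu, percent_cover) tuples, one per
    scored OTU per image (hypothetical layout). An OTU is marked present
    in a cell if any image in that cell shows non-zero cover.
    """
    records = list(records)
    otus = sorted({otu for _, otu, _ in records})
    table = {}
    for cell_id, otu, cover in records:
        # Every cell starts with all OTUs absent.
        row = table.setdefault(cell_id, dict.fromkeys(otus, 0))
        if cover > 0:
            row[otu] = 1
    return table


# Made-up example scores (percent cover per image).
example_records = [
    ("cell_A", "sponge_1", 5.0),
    ("cell_A", "ophiuroid_1", 0.0),
    ("cell_A", "sponge_1", 0.0),
    ("cell_B", "ophiuroid_1", 10.0),
]
pa_table = presence_absence_by_cell(example_records)
```

Note that a single image with cover is enough to record a presence, so the information carried by the cover bins is discarded at this step.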

#### Functional and Taxonomic Groupings

From the raw dataset comprising 172 OTUs (Supplementary Table 1), we generated two aggregated datasets using expert knowledge. One dataset comprised OTUs grouped purely by taxonomy, and the other comprised OTUs grouped by functional traits. Each OTU was identified to the highest taxonomic resolution possible, and for each OTU we defined its mobility (mobile or sessile), its feeding-type (deposit feeders, opportunists, predators, active and passive suspension feeders), and body shape (12 categories). We chose these three categories of functional traits because they can be identified from images and expert knowledge, and because Antarctic benthic communities have been categorized in a similar way in the past, although not in such detail (Gutt et al., 2013). By aggregating OTUs with the same combination of these three functional traits, we identified 30 different functional groups (the full range of functional groups is listed in Supplementary Table 2).
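As an illustration of this aggregation step, the sketch below groups OTUs that share the same combination of the three traits and then derives group-level presence-absence. The OTU names and body-shape labels are hypothetical, and the rule that a group is present wherever at least one member OTU is present is an assumption of this example rather than a statement of how the aggregated datasets were built.

```python
from collections import defaultdict


def functional_groups(traits):
    """Group OTUs by their (mobility, feeding_type, body_shape) combination.

    traits: {otu_name: (mobility, feeding_type, body_shape)}
    Returns {trait_combination: [otu_name, ...]}.
    """
    groups = defaultdict(list)
    for otu, combo in traits.items():
        groups[combo].append(otu)
    return dict(groups)


def aggregate_presence(site_row, groups):
    """Group-level presence for one site (assumed rule: any member present)."""
    return {
        combo: int(any(site_row.get(otu, 0) for otu in members))
        for combo, members in groups.items()
    }


# Hypothetical trait table; labels are illustrative only.
example_traits = {
    "sponge_1": ("sessile", "active suspension feeder", "erect"),
    "sponge_2": ("sessile", "active suspension feeder", "erect"),
    "seastar_1": ("mobile", "predator", "star"),
}
example_groups = functional_groups(example_traits)
example_site = aggregate_presence({"sponge_2": 1}, example_groups)
```

Here two sponge OTUs collapse into one functional group, which is recorded as present at the example site even though only one of them was observed.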

Taxonomic grouping was done mostly at the class and phylum level. As a general rule, we aimed at creating a number of taxonomic groups similar to the number of functional groups to aid comparison. On average five OTUs were grouped together to form a taxonomic group, although in seven instances only a single OTU represented an isolated taxonomic group. In total, we identified 27 taxonomic groups (Supplementary Table 3).

#### Environmental Data

The environmental covariates used for model predictions were those that are commonly considered important for describing the habitat of benthic invertebrates. They were depth, slope of the seafloor and topographic position index derived from Beaman et al. (2011), ocean current speed, tidal current speed and temperature at the seafloor derived from an oceanographic model (Cougnon et al., 2013), and three measures for the availability of food at the seafloor from Jansen et al. (2018b) (i.e., food particles arriving near the seafloor after sinking from the surface, horizontal flux of food along the seafloor, and food particles settling onto the seafloor). We did not include other environmental covariates because of their high correlation with variables already selected, namely surface productivity (highly correlated with the number of sinking particles arriving near the seafloor; Pearson's r = 0.971), and roughness of the seafloor (highly correlated with slope; Pearson's r = 0.992). In the map of the settling particles, 35 out of 2,515 grid cells contained exceptionally high values, ranging from 1,035 to 5,122. These values are likely an artifact of the modeling process rather than a pattern likely to be observed, and we therefore arbitrarily adjusted the values of those cells to 1,000 particles.
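The two preprocessing steps described here, screening covariates by Pearson correlation and capping outlying values, can be sketched in a few lines. The functions below are illustrative and use made-up input values; only the ceiling of 1,000 particles is taken from the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cap_values(values, ceiling=1000.0):
    """Replace every value above the ceiling with the ceiling itself."""
    return [min(v, ceiling) for v in values]


# Made-up grid-cell values for two covariates that rise together.
example_slope = [0.5, 1.0, 2.0, 4.0]
example_roughness = [1.1, 2.0, 4.1, 8.0]
example_r = pearson_r(example_slope, example_roughness)

# Made-up settling-particle values; the two largest exceed the ceiling.
example_capped = cap_values([10, 250, 1035, 5122])
```

A covariate pair with a coefficient as high as those reported (0.971 and 0.992) carries nearly the same information, which is the rationale for keeping only one member of each pair.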

#### Statistical Analysis

Data were analyzed using species archetype models (SAMs) (Dunstan et al., 2011), which are based on generalized linear models and define groups of taxa based on their similar responses to environmental covariates. These groups are termed "species archetypes." Currently, SAMs are developed only for presence-absence data and count data, and we therefore reduced our raw dataset from percent-cover estimates to presence-absence for the SAM-analysis. For the boxplots presented in **Figures 2** and **5**, we used the percentage-cover estimates for individual OTUs in the respective species archetypes to aid interpretation of the results. For the SAM-analysis, we considered all of the environmental covariates in section Environmental Data and included a polynomial term for depth and the logarithm for slope. We used the Bayesian information criterion (BIC) to select the optimal number of species archetypes in each of the three datasets (OTUs, taxonomic groups and functional groups), running 50 iterations of the same model with random starts and extracting the BIC in each of these models to ensure that inference was based on the global maximum of the likelihood surface. The model with the optimal number of species archetypes as determined by BIC (see Supplementary Figure 1 for the OTU-analysis) was then used to predict the occurrence of the species archetypes across the study area. This prediction uses the relationship identified between species archetypes and the environmental covariates to predict species archetype occurrence in areas where only environmental data are available. We restricted the prediction area to the continental shelf down to ∼1,500 m depth, the maximum depth at which the camera was deployed.

For the statistical analysis, we used R version 3.3.1 (R Core Team, 2016), and the SAMs were developed using the R package "SpeciesMix" (Dunstan et al., 2011).
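The model-selection logic described above (several random starts per candidate number of archetypes, then choosing the number with the lowest BIC) can be sketched independently of the fitting itself. The Python snippet below is not the SpeciesMix implementation and does not fit any model; it assumes that log-likelihoods and parameter counts have already been obtained from some fitting routine. The numbers are invented, and what counts as the sample size `n_obs` depends on the implementation and is left as an input here.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_n_archetypes(fits, n_obs):
    """Pick the number of archetypes with the lowest BIC.

    fits: {n_archetypes: [(log_likelihood, n_params), ...]}, one tuple per
    random start. Within each candidate, the best (lowest-BIC) start is kept,
    which guards against starts that stopped at a local optimum.
    """
    best_bic = {
        g: min(bic(ll, k, n_obs) for ll, k in runs)
        for g, runs in fits.items()
    }
    best_g = min(best_bic, key=best_bic.get)
    return best_g, best_bic


# Invented log-likelihoods and parameter counts for illustration only.
example_fits = {
    2: [(-520.0, 21), (-510.0, 21)],
    3: [(-470.0, 32), (-495.0, 32)],
    4: [(-468.0, 43)],
}
example_best, example_bics = select_n_archetypes(example_fits, n_obs=172)
```

In this invented example the four-archetype model has a slightly higher likelihood than the three-archetype model, but the penalty for its extra parameters outweighs the gain, so three archetypes are selected.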

#### Data Availability

The full biological dataset is published in Robineau et al. (2018) and is publicly available through the Australian Antarctic Division (doi: 10.4225/15/5ae7cf565cebb).

## RESULTS

Classifying benthic fauna to the highest taxonomic resolution possible, we identified a total of 172 operational taxonomic units (OTUs) from 11 different phyla (Supplementary Table 1), belonging to 27 taxonomic and 30 functional groups. Many OTUs were rare, with 26 OTUs observed only once at the 41 sites, and half of all OTUs found at five sites or fewer. Only 17 OTUs were found at more than 20 sites, and six of these common OTUs represent unidentified taxa where individuals could not be distinguished further than to broad categories of Bryozoans, Sponges, Seastars, Ophiuroids, Holothurians, and Actiniaria. The most dominant phyla were echinoderms (38 OTUs), sponges (35 OTUs), and cnidarians (34 OTUs). In total, we found 54 mobile OTUs compared to 118 sessile OTUs. Most sponges (i.e., active suspension feeders) were of a simple erect form or had a stalked base lifting them off the ground (Supplementary Tables 2, 3). In contrast, most passive suspension feeders were branching in three dimensions.

#### OTU-Level Archetypes

The statistical analysis using species archetype models identified six distinct species archetypes (SAs, **Figure 2** and Supplementary Figure 1) from the full dataset which contains benthic fauna classified to the highest possible taxonomic resolution (operational taxonomic units, or OTUs). All Species Archetypes (SAs) contain OTUs with a mix of functional traits and belonging to different taxonomic groups (Supplementary Tables 2–5). Predictions for OTU occurrences from the SAs match well with the observed values (R <sup>2</sup> between 0.533 and 0.923), with the exception of SA-5 (R <sup>2</sup> = 0.178). The strongest (steepest) overall response from SAs is in relation to depth of the seafloor (**Figure 3** and Supplementary Table 6), but all environmental variables used in the analysis influence SA-distributions.
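For readers who want to reproduce this kind of fit statistic, the sketch below computes R<sup>2</sup> for an ordinary least-squares line relating observed to predicted values. It is a generic illustration with invented numbers and assumes the inputs are paired values per site; it is not taken from the analysis code.

```python
def r_squared(predicted, observed):
    """R^2 of a least-squares line fitted to observed (y) against predicted (x)."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares around the fitted line and the mean.
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(predicted, observed))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Invented values: predictions track observations only loosely.
example_r2 = r_squared([0.2, 0.8, 0.6, 0.4], [0.0, 1.0, 0.0, 1.0])
```

A low value of this statistic, as reported for SA-5, means the regression line explains little of the variation in the observed values, even if the prediction standard error is small.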

OTUs in SA-1 are mostly mobile with a few sessile taxa of simple body-shape, living in the deeper part of the continental shelf, where unconsolidated sediments prevail (**Figure 2** and Supplementary Table 4). This assemblage is predicted with high probability on the slope near the Adélie Bank and Sill, but just outside of areas sampled in our study, meaning we cannot confirm these high values with our observations (see sampling sites in **Figure 1** and predicted vs. observed for SA-1 in **Figure 2**). SA-2 is an assemblage representative of the slope areas, and is dominated by sessile taxa. Generally, SA-1 and SA-2 share a similar environmental response apart from SA-1 occurring with a higher probability on higher topographic positions and SA-2 occurring in warmer waters (**Figure 3**).

FIGURE 2 | Distribution of six species archetypes (SA 1–6) identified from the OTU-dataset using the species archetype analysis, the standard error of the prediction, the total abundance of all OTUs combined within each SA and the relationship between predicted and observed values. A color-code is added on the left side of the graphs to help compare this figure with Figures 3, 4. The red dotted line indicates the 1:1 line between predicted and observed values; the R<sup>2</sup> value is for a linear regression between observed and predicted values. The predictions are based on data from 41 sites, and the area under the Mertz Glacier Tongue is excluded from the predictions.

SA-3 is an assemblage containing species with a diversity of body-shapes and, similar to SA-2, is dominated by passive suspension feeders (Supplementary Table 2). This archetype inhabits the edges of the banks and also occurs along the continental slope, typically in steep slope regions. SA-2 and SA-3 contain many taxa that are classified as vulnerable marine ecosystem taxa, meaning they are vulnerable to some fishing practices and deserve special protection due to their importance for ecosystem functioning (www.fao.org/in-action/vulnerable-marine-ecosystems).

SA-4 contains common taxa, similar to SA-5, but is restricted to the banks. This is a typical Antarctic assemblage with many suspension feeders and a diverse range of body shapes. SA-5 dominates the benthic assemblage on most of the George V shelf (**Figure 4**), and comprises a diverse set of a few (18) very abundant and common OTUs with a broad distribution (see also Supplementary Table 4). Although the standard error of the prediction is low (**Figure 2**) across the study region, comparing predicted and observed values reveals poor predictive power of this group (**Figure 2**, last column, R<sup>2</sup> = 0.18). Thus, the OTUs comprising this group are relatively abundant and ubiquitous independent of environmental variation. The response curve of SA-5 to environmental variables is shallow (**Figure 3**), indicating that even strong environmental differences only lead to weak changes in this group. SA-5 contains four OTUs that represent unidentified taxa where individuals could not be distinguished further than the broad categories of Bryozoans, Sponges, Seastars, and Ophiuroids.

SA-6 represents an assemblage with low abundance that is limited to shallow and steep sections on the shelf close to the coast and to locations characterized by muddy substrate, dropstones, and relative protection from icebergs (Marc Eléaume, unpublished data). This group largely responds to depth and slope but not to other environmental variables examined, and contains a mix of mobile and sessile taxa with many different body shapes (Supplementary Table 4). SA-6 is the only archetype that never dominates the benthic community on the George V shelf (**Figure 4**).

All six SAs contain mobile and sessile species, but not every SA contains all categories of feeding type and body-shape (Supplementary Table 2). For example, only suspension feeders and predators can be found in SA-2, but there are no deposit feeders or opportunists present. Similarly, not every SA contains every body-shape. Flat body-shapes are absent from SA-2, erect stalked forms are absent from SA-3, barrel and tube-like sponges are absent from SA-1, SA-3, and SA-5, 2D-structured suspension feeders are absent from SA-1 and SA-5, no Anemones are found in SA-5 while tubeworms are found only in SA-3, SA-4, and SA-5. Ball-shaped forms are absent in SA-1 and SA-6 and massive sponges only occur in SA-6 (although only 3 OTUs fall into this category).

### Taxonomic- and Functional-Group Archetypes

In contrast to the analysis of the OTU-dataset, which reveals six species archetypes, the aggregated datasets of taxonomic groups (TG) and functional groups (FG) each reveal three species archetypes (**Figure 5**). Both TG and FG archetypes show similar distributions, but all species archetypes show weak predictive power for the probability of occurrence (R<sup>2</sup> from 0.17 to 0.37, **Figure 5**, predicted vs. observed). Another similarity between TG- and FG-analyses is that each results in one archetype with abundant and very common OTUs (TG SA-1, FG SA-1), a second archetype with OTUs that are also abundant but less common (TG SA-2, FG SA-2), and a third archetype of rare OTUs (TG SA-3, FG SA-3). Interestingly, despite these similarities, the OTU composition varies considerably between the TG- and FG-archetypes (Supplementary Figure 2).

# DISCUSSION

Our results show distinct assemblages of benthic macrofauna can be identified with much greater confidence when using data comprising presence/absence of operational taxonomic units (OTUs) than when data describe a-priori determined taxonomic or functional groups. Although we expected the expert-grouped taxa not to perform as well as when using the more highly resolved data, the magnitude of differences was surprising to us, especially in regard to grouping by functional traits. In theory, if the functional traits selected truly correspond to species with similar responses to environmental conditions, we should (1) expect predicted values from the analysis of functional groups to fit similarly well to observed values as predicted values from the OTU-analysis, and (2) expect OTUs in each species archetype to be from mostly the same functional groups. However, our results show this is not the case, and a-priori grouping of OTUs by functional or taxonomic similarity merges OTUs that do not respond to environmental conditions in the same way.

A possible explanation why the functional grouping poorly predicts the distribution of biodiversity is that the functional traits selected are not important in determining the presence or absence of these taxa, or are not important at all. Our results show most combinations of the three functional traits feeding-type, mobility and body-shape can be found in at least half of all species archetypes, in a wide range of environmental conditions across the study region, and are therefore not good surrogates for predicting the presence and absence of species. However, this result does not mean the functional traits selected are not important at all, because they could act on different levels of community structure, such as on the abundance of different species. Interestingly, a very broad grouping of taxa based solely on their feeding type produces promising results for mapping patterns in the abundance of key components of the benthic community (Jansen et al., 2018a). More research is needed to resolve whether other functional traits not used here might be more suitable for predicting patterns in the presence and absence of species.

Another factor that influences the results is the taxonomic level at which to group species. In our study, we grouped species mostly at the class and phylum level. Classifying organisms into lower level taxonomic groups such as families is difficult when using image-data, because many family-specific features might not be distinguishable. Previous studies have also found that an aggregation at class or phylum level changes observed patterns in assemblage structure (Smale et al., 2010) and species distributions (Włodarska-Kowalczuk and Kędra, 2007). However, although there are examples where family-level aggregations may be used as effective surrogates for diversity patterns (e.g., in molluscs: Terlizzi et al., 2009), any taxonomic rank higher than species can behave as a random group of species not providing ecologically meaningful information (Bevilacqua et al., 2012) and the outcomes of taxonomic groupings may vary between habitats and trophic levels (Sutcliffe et al., 2012). Our results suggest that presence-absence data should be classified to the highest taxonomic resolution where possible, and not amalgamated into either taxonomic or functional groupings when studying responses of Antarctic benthic communities to environmental conditions.

Our mapped predictions of the benthic communities on the George V shelf show similarities to previously detailed descriptions and maps of the region (Post et al., 2010, 2011; Jansen et al., 2018a). We can confirm that areas of particular interest (i.e., areas with rare assemblages, and a turnover of species archetypes) are shallow, steep habitats close to the coast or on the edges of the banks, and on the continental shelf break and slope. Vulnerable marine ecosystem (VME) taxa, which are of particular interest for management, have previously been detected and then protected from bottom fishing practices in two separate areas in this region (see **Figure 1**). SA-2 and SA-3 are likely representative of the distribution of VMEs, and the mapped predictions can help to find further assemblages in need of protection. SA-4 and SA-5 are most similar to the distribution of seafloor suspension feeders mapped in an earlier effort in this region (Jansen et al., 2018a). Some small-scale features in SA-4 and SA-5, particularly on the Adélie Bank, stand out, but should be treated with care, as they could be artifacts from the ocean model and the food-availability maps that have translated into the predictive maps. More sampling in these interesting areas would help to clarify if such patterns exist. A proportion of the benthic community (i.e., SA-5) is difficult to predict, even with the cutting-edge statistical methods used here, which at least partially explains why the biogeography of Antarctic benthos has been found to be hardly predictable in other regions around the continent (Gutt et al., 2013). Nonetheless, the overwhelmingly high correlations between observed and predicted values from the species archetype models are encouraging, and we suggest model-based approaches should be a first choice when mapping Antarctic benthic communities.

Taxonomic resolution matters, and the level to which taxa in a dataset are identified influences the spatial patterns in biodiversity that are, and can be, observed. If resources such as time, money or expertise are limited, and taxa can only be identified at a broad taxonomic level, a more simplified and less accurate picture of the patterns of biodiversity has to be expected. Our study shows that, for the most part, the distribution of benthic fauna on the Antarctic continental shelf can be well explained by environmental conditions. This is a promising step toward mapping the distribution of Antarctic benthic fauna on a continental scale as a basis for informing management of this unique environment.

## AUTHOR CONTRIBUTIONS

JJ, NH, PD, and CJ conceived and designed the study; JJ, NH, PD, ME, and CJ analyzed the data; JJ created the figures and tables; JJ wrote the paper with contributions from all authors.

#### ACKNOWLEDGMENTS

We would like to thank Rachel Downey for classifying the sponges, Tina Molodtsova for helping to classify the cnidarians and the reviewers for their comments. Biological samples were collected during the CEAMARC program as part of the IPY #53 Census of Antarctic Marine Life program. Coastline and glacial features for the figures are taken from the Antarctic Digital Database version 5. JJ is supported by a Tasmanian Graduate Research Scholarship and a QAS Top-Up scholarship. This work was completed as part of Australian Antarctic Science project 4124.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00081/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Jansen, Hill, Dunstan, Eléaume and Johnson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Antarctic Cryptoendolithic Fungal Communities Are Highly Adapted and Dominated by Lecanoromycetes and Dothideomycetes

Claudia Coleine<sup>1,2</sup>, Jason E. Stajich<sup>2</sup>\*, Laura Zucconi<sup>1</sup>, Silvano Onofri<sup>1</sup>, Nuttapon Pombubpa<sup>2</sup>, Eleonora Egidi<sup>3</sup>, Ashley Franks<sup>4,5</sup>, Pietro Buzzini<sup>6</sup> and Laura Selbmann<sup>1,7</sup>

<sup>1</sup> Department of Ecological and Biological Sciences, University of Tuscia, Viterbo, Italy, <sup>2</sup> Department of Microbiology and Plant Pathology, Institute for Integrative Genome Biology, University of California, Riverside, Riverside, CA, United States, <sup>3</sup> Hawkesbury Institute for the Environment, Western Sydney University, Penrith, NSW, Australia, <sup>4</sup> Department of Physiology, Anatomy and Microbiology, La Trobe University, Melbourne, VIC, Australia, <sup>5</sup> Centre for Future Landscapes, La Trobe University, Melbourne, VIC, Australia, <sup>6</sup> Department of Agricultural, Food and Environmental Sciences, Industrial Yeasts Collection DBVPG, University of Perugia, Perugia, Italy, <sup>7</sup> Section of Mycology, Italian National Antarctic Museum (MNA), Genoa, Italy

#### Edited by:

Bruno Danis, Free University of Brussels, Belgium

#### Reviewed by:

Charles K. Lee, University of Waikato, New Zealand

Xuesong Luo, Huazhong Agricultural University, China

> \*Correspondence: Jason E. Stajich jason.stajich@ucr.edu

#### Specialty section:

This article was submitted to Terrestrial Microbiology, a section of the journal Frontiers in Microbiology

Received: 14 February 2018 Accepted: 06 June 2018 Published: 29 June 2018

#### Citation:

Coleine C, Stajich JE, Zucconi L, Onofri S, Pombubpa N, Egidi E, Franks A, Buzzini P and Selbmann L (2018) Antarctic Cryptoendolithic Fungal Communities Are Highly Adapted and Dominated by Lecanoromycetes and Dothideomycetes. Front. Microbiol. 9:1392. doi: 10.3389/fmicb.2018.01392

Endolithic growth is one of the most spectacular microbial adaptations to extreme environmental constraints and the predominant life-form in the ice-free areas of Continental Antarctica. Although Antarctic endolithic microbial communities are known to host among the most resistant and extreme-adapted organisms, our knowledge on microbial diversity and composition in this peculiar niche is still limited. In this study, we investigated the diversity and structure of the fungal assemblage in the cryptoendolithic communities inhabiting sandstone using a meta-barcoding approach targeting the fungal Internal Transcribed Spacer region 1 (ITS1). Samples were collected from 14 sites in Victoria Land, along an altitudinal gradient ranging from 1,000 to 3,300 m a.s.l. and from 29 to 96 km distance to coast. Our study revealed a clear dominance of a 'core' group of fungal taxa consistently present across all the samples, mainly composed of lichen-forming and Dothideomycetous fungi. Pareto-Lorenz curves indicated a very high degree of specialization (F<sub>o</sub> approximately 95%), suggesting these communities are highly adapted but have limited ability to recover after perturbations. Overall, both fungal community biodiversity and composition did not show any correlation with the considered abiotic parameters, potentially due to strong fluctuations of environmental conditions at local scales.

Keywords: Antarctica, endolithic communities, extremophiles, fungi, ITS meta-barcoding

## INTRODUCTION

The Victoria Land region in Antarctica encompasses a latitudinal gradient of 8°, from the Darwin Glacier (79° S) in the South, to Cape Adare (71° S) in the North. Along with the widest area of the McMurdo Dry Valleys of the Southern Victoria Land, mountain tops and nunataks hanging from the polar plateau in the Northern Victoria Land are ice-free areas, and the exposed naked rocks represent the main substratum for microbial life (Nienow and Friedmann, 1993), hosting the highest standing biomass in this area (Cowan and Tow, 2004; Cary et al., 2010; Cowan et al., 2010; Archer et al., 2017). The dramatic temperature fluctuation (Nienow et al., 1988a; Nienow and Friedmann, 1993), extremely low relative humidity, and scarce liquid water availability in this area constrain terrestrial ecosystem processes, leaving the microbes as the only life forms able to persist in one of the most extreme environments on Earth (Nienow and Friedmann, 1993; Vincent, 2000; Zucconi et al., 2016). Given the harsh conditions related to osmotic, thermal and UV stresses throughout the interior of the Antarctic continent, microbial communities are forced to develop in cryptic habitats as a stress avoidance strategy (Pointing and Belnap, 2012). Except for the coastal sites, where more permissive climatic conditions favor epilithic growth, life in the inland and high altitudes is predominantly present as endolithic colonization (Friedmann and Ocampo, 1976; Zucconi et al., 2016).

Studies focused on the isolation, identification, evolution and adaptation of microbial taxa from Antarctic rock communities revealed the occurrence of a surprising diversity of both prokaryotic and eukaryotic microorganisms, some of which are exclusive to this habitat (Selbmann et al., 2005, 2008, 2013, 2014a; Adams et al., 2006; Egidi et al., 2014). The taxonomic and functional diversity of bacteria from soil and hypolithic communities of the Miers Valley, in the McMurdo Dry Valleys of Antarctica, have been recently characterized (Wei et al., 2016), revealing a cyanobacteria-dominated community with a relatively high degree of functional redundancy; fungal soil communities are dominated by Chytridiomycota species (Dreesens et al., 2014). In contrast, the fungal endolithic diversity still remains largely unexplored (Zucconi et al., 2016; Archer et al., 2017). In particular, Zucconi et al. (2016) highlighted the importance of environmental parameters, mainly rock typology and to a lesser extent, altitude and distance to coast, in shaping microbial colonization of the lichen-dominated lithic communities, which are exceptionally widespread in the Victoria Land region. Cryptoendoliths prefer porous rocks and readily colonize sandstone. The structure of the fungal component of these communities has been investigated using a fingerprinting approach that revealed a high predominance of a few fungal species. This organization indicates a high degree of specialization of the community, with a consequent high resistance to stresses, but a poor resilience so that external perturbations may easily lead to possible extinctions (Selbmann et al., 2017).

The fungal diversity and taxa description of the Antarctic cryptoendolithic communities have been primarily investigated using culture-dependent approaches to identify dothideomycetous black yeasts, basidiomycetous yeasts and lichenized mycobionts (Selbmann et al., 2005, 2008; Egidi et al., 2014). With the development of culture-independent molecular methods, such as DNA meta-barcoding, a more accurate census of the microbial community is possible, resulting in a robust and comprehensive overview of the community composition both at global and local scales (Ji et al., 2013; Smith and Peay, 2014; Hiergeist et al., 2015; Tedersoo et al., 2015; Valentini et al., 2016). Commonly occurring organisms, that are shared among communities from the same habitat, are likely to play a crucial functional role in that particular assemblage (Shade and Handelsman, 2012), and the DNA meta-barcoding approach can be conveniently applied to have a better understanding of biodiversity and ecology of the 'core' fungal members.

Despite these recent advances, the few studies carried out on Antarctic endoliths have been limited to a few rock samples or to different samples from a single location (de la Torre et al., 2003; Pointing et al., 2009; Archer et al., 2017). Moreover, these studies have focused almost exclusively on the bacterial compartment (Wei et al., 2016).

In this study, we utilized an ITS meta-barcoding strategy to investigate the diversity, composition and distributional patterns of the fungal communities colonizing sandstone samples from 12 localities and 14 sites spanning from North to South Victoria Land, Antarctica. This research aimed to (i) explore the fungal diversity and community composition in relation to altitudinal (m a.s.l.) and distance-to-coast (km) gradients, (ii) define the 'core' group of fungal taxa associated with such communities, and (iii) investigate the structure of such extreme-adapted communities. These data may give clues for improving the accuracy of predictions on the effects of climate change on polar microbial diversity and potentially aid in developing strategies to preserve the biodiversity of these unique ecosystems.

## MATERIALS AND METHODS

#### Study Area

Sandstone outcrops distributed along a latitudinal transect ranging from 73°29′26″S (Stewart Heights, Northern Victoria Land) to 76°54′36″S (Battleship Promontory, McMurdo Dry Valleys, Southern Victoria Land), from 1000 m a.s.l. (Battleship Promontory) to 3300 m a.s.l. (Shafer Peak site 2) and from 29 km (Thern Promontory) to 96 km (Ricker Hills) distance to coast, were sampled (**Table 1** and **Figures 1**, **2**). All sites were located in Northern Victoria Land, except for Battleship Promontory, located in Southern Victoria Land.

Samples were collected during the XXVI Italian Antarctic Expedition (2010–2011); the presence of lithic colonization was first assessed by direct observation in situ using a magnifying lens. Sandstone samples were excised using a geological hammer and sterile chisel, placed in sterile bags, transported and stored at −20°C at the Tuscia University (Viterbo, Italy) until downstream analysis. The rocks investigated in this study are part of a larger set of different sandstone, granite, dolerite, quartz and lava-dike samples collected during the same expedition and analyzed using a DGGE approach (Selbmann et al., 2017).

#### DNA Extraction and Sequencing

Three rock samples from each site were crushed under sterile conditions by Grinder MM 400 RETSCH (Verder Scientific, Bologna, Italy) and 0.3 g were used for DNA extraction using MOBIO Power Soil DNA Extraction kit (MO BIO Laboratories, Carlsbad, CA, United States), according to the manufacturer's protocol. The ITS1 region was amplified using ITS1F (CTTGGTCATTTAGAGGAAGTAA) and ITS2 (GCTGCGTTCTTCATCGATGC) primers developed for short read length (White et al., 1990; Smith and Peay, 2014). The PCR reactions were carried out in duplicate with a total volume of 25 µl, containing 1 µl of each primer, 12.5 µl of Taq DNA Polymerase (Thermo Fisher Scientific Inc., Waltham, MA, United States), 9.5 µl of nuclease-free water (Sigma–Aldrich, United Kingdom) and 5 ng of DNA. PCR conditions were: initial denaturation at 93°C for 3 min, 35 cycles of denaturation at 95°C for 45 s, annealing at 50°C for 1 min, extension at 72°C for 90 s, followed by a final extension at 72°C for 10 min in an automated thermal cycler (Bio-Rad, Hercules, CA, United States). The obtained amplicons were purified with Qiagen PCR CleanUp kit (Macherey-Nagel, Hoerdt, France) and normalized after quantification with the Qubit dsDNA HS Assay Kit (Life Technologies, United States). The equimolar pool of uniquely barcoded amplicons was paired-end sequenced (2 × 300 bp) on an Illumina MiSeq platform at the Institute for Integrative Genome Biology, University of California, Riverside.

Two replicates for each rock sample were sequenced and sequence reads from the sample replicates were merged to increase the amount of sequence information (Supplementary Table S1).

#### Amplicon Sequencing Data

The ITS1 datasets were processed with AMPtk: Amplicon ToolKit for NGS data (formerly UFITS) v.0.9.3<sup>1</sup> (Palmer et al., 2018). Barcodes and primers were removed from the amplicon sequencing data and reads were demultiplexed with the split\_libraries.py function in QIIME v 1.9.1 (Caporaso et al., 2010).

<sup>1</sup>https://github.com/nextgenusfs/amptk

Reads were subjected to quality trimming, PhiX screening, and chimera removal in AMPtk utilizing USEARCH with default parameters (v. 9.1.13) (Edgar, 2010). Singletons were removed and rare OTUs (<5 reads) were additionally trimmed off as recommended by Lindahl et al. (2013). The cleaned individual sample sequence files were merged into a single file and clustered to identify molecular Operational Taxonomic Units (OTUs) with a 97% identity threshold using the VSEARCH (v 2.3.2) (Rognes et al., 2016) algorithm. Taxonomic identification was performed with SINTAX/UTAX (Edgar, 2010). Non-fungal sequences were excluded from the downstream analysis.

In addition, we mapped the relative abundances of compositional 'core' OTUs, defined as being present in at least 75% of the analyzed samples. The matrix display function in PRIMER-E software v7 (Ltd. Plymouth, United Kingdom) (Clarke and Gorley, 2015) was used to illustrate and compare the relative abundance of the community 'core' members on a heatmap using log-transformed taxonomic counts, while a UPGMA clustering method was implemented to reveal similarities in the community composition among the sampled sites, calculating Bray–Curtis index.
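The 'core' definition and the Bray–Curtis index used for clustering can be illustrated with a minimal Python sketch. This is not the PRIMER-E workflow used in the study; the function names and the toy OTU table are hypothetical, and presence is assumed to mean a read count above zero.

```python
from typing import Dict, List


def core_otus(table: Dict[str, List[int]], min_fraction: float = 0.75) -> List[str]:
    """Return OTUs detected (count > 0) in at least `min_fraction` of samples.

    `table` maps an OTU label to its read counts, one entry per sample,
    with the same sample order for every OTU (hypothetical layout).
    """
    core = []
    for otu, counts in table.items():
        n_present = sum(1 for c in counts if c > 0)
        if counts and n_present / len(counts) >= min_fraction:
            core.append(otu)
    return core


def bray_curtis(a: List[float], b: List[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Toy data only: three OTUs across four samples.
    toy = {"OTU_1": [10, 3, 0, 7], "OTU_2": [0, 0, 5, 0], "OTU_3": [1, 1, 1, 1]}
    print(core_otus(toy))
    print(bray_curtis([10, 0], [5, 5]))
```

In practice the abundance vectors passed to `bray_curtis` would be the (optionally log-transformed) per-site counts, and the resulting pairwise matrix would feed a UPGMA clustering routine.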

We examined whether the sampling effort was adequate to capture the fungal community richness by generating species rarefaction curves and species accumulation plots using the 'rarecurve' and 'specaccum' functions in the package 'vegan' (v. 2.3-4; Oksanen et al., 2013) in R 3.2.0 (R Development Core Team, 2015).

All raw sequence data have been submitted to the GenBank databases under BioProject accession number PRJNA379160.

#### Biodiversity and Statistical Analysis

Biodiversity indices were estimated on averaged data using Primer-E v7 to investigate species richness and evenness of the fungal community. Following Morris et al. (2014), our analyses included (i) species richness (S), estimated as a count of the total number of species found in each sample, (ii) the Shannon index (H'), a phylotype-based approach constructed using OTU groupings (Shannon and Weaver, 1949; Ludwig, 1988), (iii) the Simpson's index of Dominance (1-D), where D measures the probability that two individuals randomly selected from a sample will belong to the same species (or some category other than species) (Simpson, 1949), and (iv) the Pielou's equitability index (J') (Pielou, 1969). The nonparametric Spearman's correlation coefficients were calculated and graphically represented to explore relationships between the biodiversity indices and sampled localities (Spearman, 1904).
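As an illustration of how the four indices relate to one another, the following Python sketch computes them from a vector of OTU read counts. It assumes the natural logarithm for H' and the simple proportion-based form D = Σp², which may differ from the exact estimators implemented in Primer-E; the function name is hypothetical.

```python
import math
from typing import Dict, List


def diversity_indices(counts: List[int]) -> Dict[str, float]:
    """Compute S, H', 1-D and J' from OTU read counts for one sample.

    Assumptions (not taken from the study): natural log for Shannon,
    and Simpson's D as the sum of squared relative abundances.
    """
    present = [c for c in counts if c > 0]
    total = sum(present)
    richness = len(present)
    if total == 0:
        return {"S": 0, "H": 0.0, "1-D": 0.0, "J": 0.0}
    props = [c / total for c in present]
    shannon = -sum(p * math.log(p) for p in props)
    simpson_d = sum(p * p for p in props)
    # Pielou's evenness is undefined for a single OTU; report 0.0 in that case.
    pielou = shannon / math.log(richness) if richness > 1 else 0.0
    return {"S": richness, "H": shannon, "1-D": 1.0 - simpson_d, "J": pielou}


if __name__ == "__main__":
    # Toy sample dominated by one OTU.
    print(diversity_indices([90, 5, 3, 2]))
```

A perfectly even sample gives J' = 1, whereas a sample dominated by a single OTU pushes both H' and J' toward zero.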

The variability in species composition of the communities (β diversity) among the 14 sites was calculated (Whittaker, 1960; Anderson et al., 2006). The relationship between the considered environmental variables (altitude and distance to coast) and β diversity was tested by multiple linear regression on distance matrices (MRM) (Legendre et al., 2005; Lichstein, 2007; Ptacnik et al., 2010) implemented in the 'ecodist' package (Goslee and Urban, 2007) of R version 3.4.2. In this analysis, the environmental distance matrices between sampling sites were regressed against the species composition dissimilarity, to verify the effect of altitude (m a.s.l.) and distance to coast (km) on β diversity. Environmental distances were quantified by means of the Euclidean distance between each pair of sites, while the pattern in community similarity was calculated with the Jaccard index, starting from the species occurrence matrix. The robustness of results was estimated performing 1,000 permutations on the original dataset, and P-values for MRM models were obtained by comparing each observed regression coefficient with the distribution of 1,000 permuted values. All statistical tests were considered significant at P < 0.05.
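The permutation logic behind this kind of test can be sketched in Python. The example below is a single-predictor simplification (one environmental variable, simple rather than multiple regression) and is not the 'ecodist' implementation; the function names, the two-tailed P-value and the site-label permutation scheme are assumptions made for illustration.

```python
import random
from itertools import combinations
from typing import List, Sequence, Set, Tuple


def jaccard_dissimilarity(a: Set[str], b: Set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B| for two sets of OTU labels."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx if sxx else 0.0


def distance_regression(env: Sequence[float], otus: Sequence[Set[str]],
                        n_perm: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Regress Jaccard dissimilarity on environmental distance; permutation P-value.

    Site labels of the community data are shuffled so that rows and columns of
    the dissimilarity matrix are permuted together (an assumed scheme).
    """
    n = len(env)
    pairs = list(combinations(range(n), 2))
    x = [abs(env[i] - env[j]) for i, j in pairs]  # Euclidean distance in 1-D

    def response(order: List[int]) -> List[float]:
        return [jaccard_dissimilarity(otus[order[i]], otus[order[j]]) for i, j in pairs]

    observed = ols_slope(x, response(list(range(n))))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        if abs(ols_slope(x, response(order))) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: four sites with altitude (m) and hypothetical OTU sets.
    altitude = [1000.0, 1400.0, 2000.0, 3300.0]
    sites = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "d", "e"}, {"d", "e", "f"}]
    print(distance_regression(altitude, sites, n_perm=199))
```

Extending the sketch to two predictors, as in the study, would require a multiple-regression fit on the unfolded distance vectors in place of `ols_slope`.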

To further show the evenness of these communities, Lorenz distribution curves were set up based on the meta-barcoding profiles, and a functional organization index (F<sup>o</sup>) was obtained. In this study, the Lorenz curves were also evaluated based on the Pareto principle (Pareto, 1897). The theoretical perfect uniformity, represented by a line with a slope of 45° (F<sup>o</sup> = 25%), means that all species in the community have the same number of individuals. An F<sup>o</sup> value of 45% indicates a community where few species are dominant; higher values represent a highly specialized community where a small number of species are dominant, while the vast majority are present at low abundance (Marzorati et al., 2008; Chen et al., 2015; Wang et al., 2015).
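A Pareto-Lorenz curve and a functional organization value can be derived from OTU counts as in the Python sketch below. It assumes that F<sup>o</sup> is read as the cumulative share of reads contributed by the 20% most abundant OTUs (the Pareto cutoff), obtained by linear interpolation along the curve; the study's exact procedure is not specified here, and the function names are hypothetical.

```python
from typing import List, Tuple


def lorenz_curve(counts: List[int]) -> Tuple[List[float], List[float]]:
    """Cumulative proportion of OTUs (x) versus cumulative proportion of reads (y).

    OTUs are ranked from most to least abundant, so the curve lies on or
    above the 45-degree line of perfect evenness.
    """
    present = sorted((c for c in counts if c > 0), reverse=True)
    if not present:
        raise ValueError("at least one OTU with a positive count is required")
    total = sum(present)
    n = len(present)
    xs, ys = [0.0], [0.0]
    running = 0
    for rank, count in enumerate(present, start=1):
        running += count
        xs.append(rank / n)
        ys.append(running / total)
    return xs, ys


def functional_organization(counts: List[int], cutoff: float = 0.2) -> float:
    """Curve height at x = cutoff (assumed 20%), by linear interpolation."""
    xs, ys = lorenz_curve(counts)
    for i in range(1, len(xs)):
        if xs[i] >= cutoff:
            x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
            return y0 + (y1 - y0) * (cutoff - x0) / (x1 - x0)
    return 1.0


if __name__ == "__main__":
    # Toy community dominated by a single OTU.
    print(functional_organization([96, 1, 1, 1, 1]))
```

Under these assumptions, the value grows as the reads concentrate in fewer OTUs, which is the pattern described for highly specialized communities.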

## RESULTS

#### Amplicon DNA Sequencing and OTU Abundance

The multiplexed files contained 3,720,171 sequence reads, resulting in 1,097,371 fungal ITS rRNA gene reads passing the quality filtering step. Samples averaged 78,383 reads, with a minimum of 18,013 to a maximum of 297,482 reads per sample (**Table 2**). Sequence reads were obtained from all samples, including those collected at highest elevations, such as Mt New Zealand (2888 m a.s.l.) and Shafer Peak site 1 (3100 m a.s.l.) in Northern Victoria Land.

Singletons and rare taxa (<5 reads) were removed (87 out of 362 OTUs total). Clustering of OTUs was performed at 97% identity threshold, resulting in a total of 275 OTUs. About 5% of the OTUs were non-fungal and excluded from downstream analysis. Rarefaction curves of total fungal OTUs per rock sample rarely reached a plateau (Supplementary Figure S1). However, the curve of the species accumulation plots approached saturation, suggesting that additional samples would have recovered very few additional OTUs (Supplementary Figure S2).

#### Fungal Community Description

About 25% of the total OTUs retrieved were unidentified at Phylum or sub-Phylum level. The majority of the identified fungal sequences recovered among all samples belonged to the Ascomycota (ranging from 55 to 70% of relative abundances), followed by Basidiomycota (from 4 to 12%), and Mucoromycota and Zoopagomycota (present at 5% only in three sites: Battleship Promontory, Ricker Hills, and Stewart Heights) (**Figure 3**). The relative abundance of Ascomycota and Basidiomycota ITS sequences varied among locations.

At the class level, OTU distribution varied among sites (**Figure 4**). Lichenized fungi in the Lecanoromycetes (Ascomycota) were the most abundant taxa and occurred in all analyzed samples (relative abundance ranging from 23 to 60%), followed by the ascomycetous classes Dothideomycetes (10–30%) and Eurotiomycetes (10–20%). The Tremellomycetes (Basidiomycota) were present at 10% of relative abundance in most sites and totally absent in Trio Nunatak site 1; Agaricomycetes, Saccharomycetes, and Taphrinomycetes were the rarest members in these communities, detected only in a few sites, mainly Thern Promontory, Mt Bowen, Stewart Heights, and Shafer Peak site 2.

The fungal 'core' community (i.e., OTUs present in at least 75% of the samples) was composed of 47 OTUs (**Table 3**). Fourteen OTUs were unclassified beyond the Kingdom level, while most OTU 'core' components belonged to the phylum Ascomycota and only three OTUs were assigned to the phylum Basidiomycota, including the former genus Cryptococcus sp. and Solicoccozyma aeria. Among the Ascomycota, most taxa belonged to the classes Lecanoromycetes and Dothideomycetes, followed by Sordariomycetes, Pezizomycetes, and Eurotiomycetes, which were present at lower percentages.

We further examined the distribution of the fungal taxa from the 'core' community. Several taxa were present at all sites, including Solicoccozyma aeria (OTU 161), unidentified Lecanorales (OTU 351), the unclassified taxa OTUs 345, 117 and 227, one unidentified Acarosporaceae (OTU 281), Cryomyces antarcticus (OTU 7), Friedmanniomyces endolithicus (OTU 10), unidentified Basidiomycota (OTU 13), Buellia sp. (OTU 1), unidentified Sporormiaceae (OTU 26), Fusarium proliferatum (OTU 23) and Penicillium sp. (OTU 37). The distribution of taxa abundance was generally uniform across all the altitudes. Other 'core' taxa, such as unidentified Dothideomycetes (OTU 133), the unclassified taxa OTUs 137, 217 and 186, unidentified Dothideales (OTU 110) and unidentified Pleosporales (OTUs 203–205), were present only intermittently among sites. Sampled sites were also hierarchically clustered by OTU abundance to examine patterns of similarities in community composition but did not exhibit any clustering by sampled localities (**Figure 5**).

#### Diversity and Statistical Analysis

Species richness (S), Shannon (H'), Simpson (1-D), and Pielou (J') indices for each site were calculated and reported in **Table 2**. The highest observed fungal richness (111 OTUs) was recorded at Ricker Hills (1400 m a.s.l.). In contrast, the Trio Nunatak site 1 (1000 m a.s.l.) exhibited low fungal richness, with only 17 OTUs retrieved. Battleship Promontory, Richard Nunatak and Stewart Heights showed the highest values for the other biodiversity indices. Among all sites, H' ranged from 0.55 to 2.47, 1-D from 0.57 to 0.84 and J' from 0.21 to 0.55.

Analysis included: number of reads, species richness (S), Shannon index (H'), Simpson's Index of Dominance (1-D), and Pielou's equitability index (J').

Spearman's ρ values represented the correlation between biodiversity indices and sampled sites. In all cases we found no significant correlation between diversity indices and locations, even among sites located at similar altitudes (P-values > 0.1) (**Figure 6**). A similar trend was observed between biodiversity and distance to coast (data not shown).
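For reference, Spearman's ρ is the Pearson correlation computed on ranks. The Python sketch below shows the calculation with average ranks for ties; it is an illustration with hypothetical function names and toy values, not the software used in the study, and it does not compute P-values.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation; returns NaN if either variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values only: altitude (m a.s.l.) against a diversity index.
    altitude = [1000, 1400, 2000, 2670, 3300]
    shannon = [1.9, 0.6, 2.1, 2.4, 1.2]
    print(spearman_rho(altitude, shannon))
```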

The variability in species composition of these communities (β diversity) across the 14 sites was measured with a Jaccard index, and the contribution of different geographic parameters (altitude: m a.s.l.; distance to coast: km) in shaping the fungal communities was estimated using an MRM analysis. The total β diversity varied among all locations. The highest similarity values occurred between Timber Peak (2800 m a.s.l., 49.5 km) and Bobby Rocks (1680 m a.s.l., 91 km) (80% of similarity) and between Timber Peak, Mt Billing (1300 m a.s.l., 44 km) and Ricker Hills (1115 m a.s.l., 96 km) (70% of similarity). The highest values of dissimilarity were obtained between Mt Bowen (1874 m a.s.l., 39.5 km) and Battleship Promontory (1000 m a.s.l., 33.5 km) (only 30% of similarity). Even sites from the same locations (i.e., Trio Nunatak sites 1 and 2, Shafer Peak sites 1 and 2) showed low percentages of similarity. In addition, MRM analysis indicated no correlation between the fungal community composition and either altitude (m a.s.l.) or distance to coast (km) (data not shown) among all sites (P-value > 0.05).

#### Pareto-Lorenz Curves

Pareto-Lorenz distribution curves of the meta-barcoding profiles were plotted based on the numbers of OTUs and their frequencies. F<sup>o</sup> values were high in all sampled sites, ranging from 86 to 100%. These results indicate that these fungal endolithic communities were dominated by very few abundant and specialized species, accompanied by some rare species (**Figure 7**).

## DISCUSSION

Endolithic microbial communities occur globally and play an important role in biogeochemical processes, including rock and mineral transformations, bio-weathering, and biomineral formation (Gadd et al., 2012), particularly in border ecosystems (Nienow and Friedmann, 1993). Nevertheless, little is known about the composition, diversity and distribution of these communities, especially in continental Antarctica, where they represent the predominant life form (Nienow and Friedmann, 1993). Sequence-based fungal communities have been retrieved from samples in all locations, including the highest altitude sites, such as Shafer Peak (3300 m a.s.l.) and Mt New Zealand (3100 m a.s.l.) in Northern Victoria Land, previously reported as colonized by fungal populations (Zucconi et al., 2016).

Most fungi in the Antarctic mycobiome are Ascomycetes. This is supported by the predominance of ascomycetous OTUs in our study and previous molecular surveys of soil Antarctic communities (Cox et al., 2016; Ji et al., 2016; Wei et al., 2016), chasmoendolithic communities in Miers Valley, McMurdo Dry Valleys (Yung et al., 2014), and in association with mosses in ice-free coastal outcrops (Hirose et al., 2016). The majority of the Ascomycota were identified as members of the lichen-forming class Lecanoromycetes, followed by Dothideomycetes, the widest and most diverse class in the Ascomycota. Our metabarcoding observations are consistent with 20 years of culture-based isolation and identification, where the Dothideomycetes were the most frequently isolated fungi from these communities (Selbmann et al., 2005, 2008; Egidi et al., 2014). In contrast, Lecanoromycetes are infrequently detected with standard isolation procedures, likely due to difficulties in culturing these fungi with obligate symbiotic lifestyles. However, the widespread presence of Lecanoromycetes detected with molecular approaches is not surprising, as lichen-forming fungi have been reported as dominant in cryptoendolithic communities colonizing sandstone rocks in Victoria Land (Friedmann et al., 1988; Cockell et al., 2003; Zucconi et al., 2016). Similarly, lichens predominate in many other continental Antarctic localities, where cold-adapted mycobionts have been previously recorded (e.g., Ruprecht et al., 2010, 2012). In this study, Lecidea sp. (family Lecideaceae; class Lecanoromycetes) and Buellia sp. (family Physciaceae; class Lecanoromycetes) were recorded in almost all the analyzed samples. The genus Buellia encompasses species considered endemic to Antarctica, such as Buellia frigida, a crustose lichen which grows on rock surfaces in ice-free areas of Antarctica and has been widely found in both coastal and mountain locations across the continent (Jones et al., 2015). 
Similarly, Lecidea species live endolithically in granite rocks of continental Antarctica (de Los Rìos et al., 2005). Lichens are considered exceptionally well adapted to the lithic lifestyle, thanks to their low mineral nutrient demand, high freezing tolerance, and ability to be photosynthetically active at suboptimal temperatures (Kappen, 2000). Consistent with our findings, a recent study on lithic colonization patterns from an additional locality in the McMurdo Dry Valleys, University Valley, Southern Victoria Land, documented a lichen mycobiont prevalence in sandstone communities (Archer et al., 2017).

TABLE 3 | Fungal 'core' composition (OTUs present in ≥75% of samples) at 97% identity.

In addition to the lichen-forming taxa, Friedmanniomyces endolithicus (order Capnodiales; class Dothideomycetes) is a 'core' member of the sandstone cryptoendolithic community as it was found in the majority of the analyzed samples, confirming the widespread presence of Friedmanniomyces spp. in Northern Victoria Land. The genus Friedmanniomyces includes two described species of rock-inhabiting meristematic black fungi, F. simplex and F. endolithicus (Onofri et al., 1999; Selbmann et al., 2005). The genus is endemic to Victoria Land and the most frequently isolated non-lichenized fungus from rock substrata in this area (Selbmann et al., 2005, 2015; Ruisi et al., 2007). Friedmanniomyces spp., a well-studied genus from the phylogenetic, taxonomic, and physiological perspective (Selbmann et al., 2005; Egidi et al., 2014), possess stress-tolerant adaptations which allow them to inhabit inert surfaces and survive for long periods under dry conditions (Onofri et al., 2004). We also identified Cryomyces antarcticus (class Dothideomycetes) as a community 'core' member, although occurring at a relatively lower abundance compared to the other dominant taxa. The genus Cryomyces encompasses four species of cold-adapted rock-inhabiting black yeasts; Cryo. montanus and Cryo. funiculosus have been isolated from Alpine rocks collected above 3000 m a.s.l. (Selbmann et al., 2014a), while the two Antarctic species Cryo. antarcticus and Cryo. minteri have been isolated primarily from the McMurdo Dry Valleys in Southern Victoria Land (Selbmann et al., 2005), and rarely in Northern Victoria Land (Cecchini, 2015). Our results suggest a wider distribution for Cryo. antarcticus in the continent than previously recorded, highlighting the higher power of meta-barcoding approaches compared to the culture-dependent approach in recovering low-abundance, slow-growing fungi, even when these are cultivable.

Additional members of the Ascomycota were found in the cryptoendolithic fungal communities, including one Aspergillus sp. (family Trichocomaceae; class Eurotiomycetes) and several unidentified taxa belonging to the order Pleosporales (Dothideomycetes). Several members of the Pleosporales have been previously associated with rock formations in both cold and hot areas (Ruibal et al., 2009; Egidi et al., 2014). Filamentous fungi such as Aspergillus spp. in Antarctica have been isolated and described from oligotrophic soils (Godinho et al., 2015), but never from rocks. Indeed, the environmental pressure typically associated with the exposed polar rocks requires a high degree of specialization (Selbmann et al., 2005, 2013; Onofri et al., 2007), suggesting this substratum is not suitable for fast-growing, cosmopolitan taxa, such as members of the genus Aspergillus. Therefore, although a long-range aerial spore dispersal cannot be completely excluded (see Pearce et al., 2009), we hypothesize that the occurrence of Aspergillus spp. in our dataset is more likely the result of a post-sampling contamination, rather than the reflection of a true component of the Antarctic rock mycobiome.

Basidiomycota represents a rare fraction of the 'core' community. Only three Basidiomycota phylotypes were recovered as recurring community members, two identified as Cryptococcus sp. and one as the taxonomically related species Solicoccozyma aeria (formerly C. aerius). Although members of the former basidiomycetous genus Cryptococcus have been isolated globally, including Antarctica, both from rock and soil communities (Vishniac, 1985; Vishniac and Kurtzman, 1992; Montes et al., 1999; Scorzetti et al., 2000), the recent taxonomic revision of the Tremellomycetes (Liu et al., 2015) positioned this polyphyletic genus into many new taxa, thus modifying the taxonomical and ecological picture of the yeast distribution in polar and subpolar ecosystems. Accordingly, species of the genera Goffeauzyma, Naganishia, Papiliotrema, Solicoccozyma, Vishniacozyma, and Phenoliferia have been frequently found in both polar and non-polar cold ecosystems (Buzzini et al., 2017; Sannino et al., 2017). Although their association with rock substrates from cold sites worldwide has been previously reported by Selbmann et al. (2014b), overall Basidiomycota are infrequently found in cryptoendolithic communities, making up 3% or less of fungal sequences in the sub-Antarctic, low maritime and high maritime Antarctic (Cox et al., 2016). New species from class Taphrinomycetes have been repeatedly isolated and described as endemic Antarctic species (Selbmann et al., 2014c), but were rarely identified in this dataset. Interestingly, we have been able to retrieve members of Saccharomycetes at low abundance, although these taxa have never been isolated from Antarctic cryptoendolithic communities.

The overall low biodiversity index values obtained in this study are consistent with recent observations of Antarctic lithic communities in Victoria Land on different rock typology (Selbmann et al., 2017). Compared with species richness indices of soil microbial communities of the Antarctic Peninsula, our endolithic mycobiome diversity is very low (Chong et al., 2009), indicating a strict predominance of a restricted number of specialized species. The fungal community composition did not appear to be influenced by elevation: indeed, samples collected on the top of Mt New Zealand (2888 m a.s.l.), Stewart Heights (2670 m a.s.l.), and Richard Nunatak (2000 m a.s.l.) had biodiversity richness values similar to the ones detected at lower altitude, e.g., Trio Nunatak (1400 m a.s.l.), Ricker Hills (1115 m a.s.l.), Battleship Promontory (1000 m a.s.l.), and Mt Billing (1300 m a.s.l.).

There is a remarkable variability in the occurrence of lichen-forming and meristematic fungi across altitudes, suggesting that this parameter alone is not an important driving factor of fungal community composition. The black yeast Cryo. antarcticus was consistently present along all localities, suggesting that this 'core' taxon is remarkably resistant to ecological stresses. The members of the Cryomyces genus are among the most extreme-environment-resistant species known to date: they are able to resist extreme temperatures (−20 up to 90°C) and UV radiation (Onofri et al., 2007; Selbmann et al., 2011; Pacelli et al., 2017). Cryo. antarcticus strains have withstood ground-simulated space and Mars conditions (Pacelli et al., 2017) and 18 months of real space exposure and Mars-simulated exposure outside the International Space Station (Onofri et al., 2012, 2015, 2018). Based on this resilience, Cryo. antarcticus is considered among the best eukaryotic models for astrobiological studies.

FIGURE 6 | Spearman's correlation coefficients between the biodiversity indices calculated on 14 endolithic communities and altitudinal gradient. A significant correlation between fungal diversity and altitude was not found when R, H', D, and J' indices were considered (Spearman's correlation coefficient, P-value > 0.1).

The functionality (F<sup>o</sup>) of these communities, represented by the Pareto-Lorenz curves, was mostly around 100% (**Figure 7**). The theoretical perfect uniformity (i.e., a slope of 45°, F<sup>o</sup> = 25%) means all species in the community have the same number of individuals. Values of F<sup>o</sup> near 100% indicate a highly specialized community where few species dominate, while all others are present as few representatives (Marzorati et al., 2008). Our results indicate that all the examined communities have a high degree of specialization and adaptation. Such an exacerbated level of specialization also implies a lack of resilience to external perturbations, as reported by Selbmann et al. (2017).

Spearman's correlation coefficients confirm that fungal biodiversity (i.e., number of OTUs retrieved and biodiversity indices) was not related to the sampled sites and environmental parameters considered in this study (i.e., altitude and distance to coast). These data were further supported by MRM analysis, which did not show any correlation between environmental variables and community composition, and which also highlighted a high level of dissimilarity among samples collected in the same locations. These results lead to the conclusion that the environmental variables considered here (altitude and distance to coast) did not play a role in shaping the fungal community diversity and composition in these peculiar ecosystems. Selbmann et al. (2017) found a slight negative effect of these two parameters on fungal biodiversity in a study based on a much more heterogeneous sampling.

Nevertheless, it can be expected that in border ecosystems, which are particularly susceptible to environmental changes (Parmesan, 2006), even minimal local variations in environmental conditions could be crucial for microbial life. Therefore, additional parameters such as water availability, rock temperature and sun exposure might be considered in the future. For this reason, sensors for constant monitoring of these parameters have already been installed at some sites in both northern and southern Victoria Land. Combined data on the climatic and environmental conditions, and their daily and seasonal variation, will help to elucidate the processes influencing biodiversity variations, community composition and species extinction. These parameters, monitored in the long run, may allow prediction of the possible consequences of climate change on terrestrial biota in Antarctica (Friedmann et al., 1987; Nienow et al., 1988b; McKay et al., 1993).

This study is the largest attempt to comprehensively identify patterns of fungal diversity in rocks in Antarctica along a gradient of altitude and distance to coast. With the advent of NGS technologies, our understanding of Antarctic microbial diversity and evolution is improving, but further investigation is required to elucidate how microbiota responds to environmental pressure and, in the long run, how future environmental changes will impact these unique communities (Tilman, 1996; Van Horn et al., 2014). The high degree of specialization and the low taxonomic richness found in this study alert on the potential high susceptibility of Antarctic endolithic communities to environmental changes (Tilman, 1996; Nielsen et al., 2012; Van Horn et al., 2014). Data obtained in this study are of importance to set a proper experimental plan in the future, taking into consideration additional environmental parameters and organizing a more targeted sampling aimed to provide a better evaluation of the potential consequences of environmental changes on this unique ecosystem.

#### AUTHOR CONTRIBUTIONS

Samples were collected by LS and LZ during the XXVI Italian Antarctic Expedition (2010–2011). CC performed the DNA extraction. CC and NP performed the PCR and sequencing preparation. CC, JS, and NP performed the data processing and analyses. CC, LS, LZ, and JS wrote the paper with input from SO, EE, NP, PB, and AF.

#### ACKNOWLEDGMENTS

LS and LZ wish to thank the Italian National Program for Antarctic Researches for funding sampling campaigns. Fungal ITS primers were made available through the Alfred P. Sloan Foundation Built Environment Program and sequencing supported by funds through United States Department of Agriculture – National Institute of Food and Agriculture Hatch project CA-R-PPA-5062-H to JS. Data analyses were performed on the High-Performance Computing Cluster at the University of California-Riverside in the Institute of Integrative Genome Biology supported by NSF DBI-1429826 and NIH S10-OD016290. NP was supported by a Royal Thai Government Fellowship.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.01392/full#supplementary-material

#### REFERENCES

Adams, B. J., Bardgett, R. D., Ayres, E., Wall, D. H., Aislabie, J., Bamforth, S., et al. (2006). Diversity and distribution of Victoria Land biota. Soil Biol. Biochem. 38, 3003–3018. doi: 10.1016/j.soilbio.2006.04.030

the McMurdo Dry Valleys, Antarctica. Polar Biol. 40, 997–1006. doi: 10.1007/s00300-016-2024-9

community present in oligotrophic soil of continental Antarctica. Extremophiles 19, 585–596. doi: 10.1007/s00792-015-0741-6

models of the thermal regime. Microb. Ecol. 16, 253–270. doi: 10.1007/BF02011699



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Coleine, Stajich, Zucconi, Onofri, Pombubpa, Egidi, Franks, Buzzini and Selbmann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mapping Antarctic Suspension Feeder Abundances and Seafloor Food-Availability, and Modeling Their Change After a Major Glacier Calving

Jan Jansen<sup>1,2</sup>\*, Nicole A. Hill<sup>1</sup>, Piers K. Dunstan<sup>3</sup>, Eva A. Cougnon<sup>1,4</sup>, Benjamin K. Galton-Fenzi<sup>2,4</sup> and Craig R. Johnson<sup>1†</sup>

<sup>1</sup> Institute for Marine and Antarctic Studies, University of Tasmania, Hobart, TAS, Australia, <sup>2</sup> Australian Antarctic Division, Kingston, TAS, Australia, <sup>3</sup> CSIRO Oceans and Atmosphere, Hobart, TAS, Australia, <sup>4</sup> Antarctic Climate & Ecosystem Cooperative Research Centre, University of Tasmania, Hobart, TAS, Australia

#### Edited by:

Huw James Griffiths, British Antarctic Survey (BAS), United Kingdom

#### Reviewed by:

Ulrich Volkmar Bathmann, Leibniz Institute for Baltic Sea Research (LG), Germany Thomas Saucede, Université Bourgogne Franche-Comté, France

#### \*Correspondence:

Jan Jansen jan.jansen@utas.edu.au

†Craig R. Johnson orcid.org/0000-0002-9511-905X

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 22 February 2018 Accepted: 19 June 2018 Published: 06 July 2018

#### Citation:

Jansen J, Hill NA, Dunstan PK, Cougnon EA, Galton-Fenzi BK and Johnson CR (2018) Mapping Antarctic Suspension Feeder Abundances and Seafloor Food-Availability, and Modeling Their Change After a Major Glacier Calving. Front. Ecol. Evol. 6:94. doi: 10.3389/fevo.2018.00094

Seafloor communities are a critical part of the unique and diverse Antarctic marine life. Processes at the ocean-surface can strongly influence the diversity and abundance of these communities, even when they live at hundreds of meters water depth. However, even though we understand the importance of this link, there are so far no quantitative spatial predictions on how seafloor communities will respond to changing conditions at the ocean surface. Here, we map patterns in abundance of important habitat-forming suspension feeders on the seafloor in East Antarctica, and predict how these patterns change after a major disturbance in the icescape, caused by the calving of the Mertz Glacier Tongue. We use a purpose-built ocean model for the time-period before and after the calving of the Mertz Glacier Tongue in 2010, data from satellites and a validated food-availability model to estimate changes in horizontal flux of food since the glacier calving. We then predict the post-calving distribution of suspension feeder abundances using the established relationships with the environmental variables, and changes in horizontal flux of food. Our resulting maps indicate strong increases in suspension feeder abundances close to the glacier calving site, fueled by increased food supply, while the remainder of the region maintains similar suspension feeder abundances despite a slight decrease in total food supply. The oceanographic setting of the entire region changes, with a shorter ice-free season, altered seafloor currents and changes in food-availability. Our study provides important insight into the flow-on effects of a changing icescape on seafloor habitat and fauna in polar environments. Understanding these connections is important in the context of current and future effects of climate change, and the mapped predictions of the seafloor fauna as presented for the study region can be used as a decision-tool for planning potential marine protected areas, and for focusing future sampling and monitoring initiatives.

Keywords: food availability, Antarctic marine biodiversity, pelagic-benthic-coupling, sea-ice, climate change, surface productivity, Mertz Glacier Tongue

# INTRODUCTION

Primary productivity is at the base of most marine ecosystems. In Antarctica, primary production is highly seasonal and intricately tied to the location, timing and duration of sea-ice and ice-free areas such as polynyas (Arrigo and Van Dijken, 2003). The collapse of large ice-shelves or calving of massive icebergs, and the retreat of sea-ice that is mainly observed around the Western Antarctic Peninsula in recent years (Parkinson and Cavalieri, 2012), can dramatically alter the oceanographic setting with down-stream effects on the pattern of primary production hotspots and on Southern Ocean ecosystems (Arrigo et al., 2002; Gutt et al., 2011). Resulting changes in the location, timing and intensity of phytoplankton blooms (Cape et al., 2014) can influence the distribution of krill and predator aggregations (Gutt et al., 2011), the abundances of benthic suspension feeders (Fillinger et al., 2013; Gutt et al., 2013) and can affect carbon storage (Peck et al., 2010).

For most seafloor communities living below the photic zone (∼200 m), surface-derived primary production represents their main food source (Dayton and Oliver, 1977; Duineveld et al., 2004; Ruhl et al., 2014), and is therefore critical for their survival. Seafloor communities represent the richest component of Antarctic biodiversity (Griffiths, 2010), are highly endemic (Griffiths et al., 2009), and play an important role in the marine ecosystem (Thurber et al., 2014). However, despite evidence that a changing environment influences the distribution of these communities (Gutt et al., 2011, 2013; Fillinger et al., 2013; Griffiths et al., 2017), no study has so far quantified and mapped how their distribution might change due to a changing icescape at the ocean surface. One of the reasons for the lack of quantitative studies is that although surface-derived food is one of the main drivers, it is only recently that the nature and strength of this relationship has been quantified on the Antarctic shelf using a so-called Food-Availability-Model (FAM) (Jansen et al., 2018). Combining surface-productivity and ocean currents with particle-tracking, FAMs estimate the distribution of surface-derived food at the seafloor, and evaluate the estimates against data from sediment cores. Jansen et al. (2018) demonstrated a strong link between modeled flux of suspended food along the seafloor and abundances of sessile suspension feeders, providing a framework for estimating the distribution of key elements of the seafloor community and for predicting how they may change with changing ocean productivity and currents.

One Antarctic region that has recently undergone drastic environmental changes is the George V shelf in East Antarctica. The calving of the Mertz Glacier Tongue (MGT) in 2010 (Young et al., 2010) has resulted in profound environmental changes in the region, such as increased sea-ice concentrations (Campagne et al., 2015), and changes in ocean currents along the shelf as suggested by observations (Aoki et al., 2017) and modeling studies (Cougnon et al., 2017; Kusahara et al., 2017). These environmental changes have consequences for the dynamics and distribution of primary production (Shadwick et al., 2013) and the abundance of top-predators (Wilson et al., 2016), and have been observed to influence the community structure of shallow-water benthos (Clark et al., 2015). However, the effect of the MGT-calving on the seafloor across the region has so far neither been assessed nor observed, and so its impact on benthic communities across the continental shelf is still unknown. Obtaining this knowledge, however, is crucial for meaningful assessment of the comprehensiveness, effectiveness and representativeness of the proposed marine protected areas in this region.

Here, we (i) quantify differences in the environmental setting on the George V shelf that will affect the supply of food to the benthos. In our modeling, we apply a recently developed FAM (Jansen et al., 2018) on two 5-year climatologies of remotely sensed surface chlorophyll-a for the period before and after the glacier calving, and use ocean current velocities from a purpose-built oceanographic model (Cougnon et al., 2017) (more details can be found in the section Materials and Methods). We then (ii) map the distribution of benthic suspension feeder abundances before the glacier calving, using faunal abundances derived from underwater camera images and environmental predictor variables. Using the pre-calving statistical model for the suspension feeder abundances and the change in environmental conditions after the glacier calving, we then (iii) predict changes in suspension feeder abundances across the region, revealing the strong impact of the changing icescape on the seafloor ecosystem (**Figure 1** for general results, and **Figure 2** for an overview of the study-region).

# RESULTS

## Changes in Environmental Conditions

Our results reveal that several aspects of the observed and the modeled marine environment have changed since the calving of the MGT (**Figure 3**). Average sea-ice concentrations increased in the study region by 50–80%, particularly in the spring season, except over the Mertz Bank and where the calved MGT was located (Supplementary Figure 1). Near the Mertz Bank, average surface-chlorophyll-a (chl-a) concentrations increase by a factor of two or more (**Figure 3C**). The area of highest average surface-chl-a concentration also shows an eastward extension into areas previously covered by sea-ice. West of the Mertz Bank, the George V Basin and the Adélie Bank show decreasing values for surface-chl-a (**Figure 3C**). In this area, the breakup of the sea-ice post-calving occurs much later in the year (Supplementary Figure 1), shortening the time-period where surface phytoplankton is observed by satellites from around 4.5 to 3 months. At the South-East tip of the Adélie Bank, north of the grounded giant iceberg B09B, the outline of a newly formed polynya (Tamura et al., 2012; Fogwill et al., 2016) can be observed, marked by lower spring-time sea-ice concentrations and higher surface-chl-a relative to the surrounding area. Modeled seafloor current speeds (Supplementary Figure 2) increase by about 5 cm/s on the shallower sections of the shelf down to around 500 m depth, and decrease by almost 50% at the shelf break and slope, as well as in the area previously occupied by the MGT. Changing bottom tidal current speeds account for almost all the increase in current speed on the South-Western flanks of the Adélie and Mertz Bank, in the deep George V Basin and below the iceberg B09B at the North-Western edge of Commonwealth Bay (**Figure 3F**).
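Comparisons of this kind amount to cell-by-cell differences between a pre-calving and a post-calving grid. The sketch below is a minimal Python illustration of two ways such a change could be expressed (absolute difference and fold change) while propagating masked cells. The grid values and the function names `cell_change` and `change_map` are hypothetical; this is not the processing code used for the study.

```python
import math


def cell_change(before, after):
    """Return (difference, fold_change) for one grid cell.

    Missing data (None or NaN) propagates as NaN, and a zero baseline
    gives an undefined fold change (NaN).
    """
    if before is None or after is None:
        return math.nan, math.nan
    if math.isnan(before) or math.isnan(after):
        return math.nan, math.nan
    diff = after - before
    fold = after / before if before != 0 else math.nan
    return diff, fold


def change_map(grid_before, grid_after):
    """Apply cell_change to two equally shaped 2-D lists."""
    return [
        [cell_change(b, a) for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(grid_before, grid_after)
    ]


# Hypothetical chl-a values (mg m^-3); None marks a masked cell.
pre = [[0.4, 0.8], [None, 1.0]]
post = [[0.9, 0.6], [0.5, 1.0]]
changes = change_map(pre, post)
for row in changes:
    print([(round(d, 2), round(f, 2)) for d, f in row])
```

Reporting a fold change alongside the difference is useful where baseline values are small, since a doubling of chl-a can correspond to a small absolute change.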

FIGURE 1 | Graphic summarizing observed and predicted changes in environmental conditions (sea-ice, surface-chl-a, ocean current speeds, food-export) and seafloor fauna due to the calving of the Mertz Glacier Tongue (MGT) in 2010. The graphic shows a cross-section of the George V continental shelf ∼80 km off the coast, looking South toward the Antarctic continent. The top graphic shows pre-calving environmental conditions and displays abundances of suspension feeders as observed from towed camera images. The bottom graphic shows observed changes in sea-ice, surface-chl-a and the position of the grounded iceberg B09B, as well as modeled changes in ocean current speeds, food-availability and suspension feeder abundances. Simplified food-availability-model (FAM) results are indicated for (1) Mertz Bank, (2) Adélie Sill, (3) Adélie Bank, and (4) Adélie Basin. Additional indicators are included in the bottom graphic to highlight important changes.

The FAM tracks and quantifies three components of surface-derived food particles: the sinking component captures the advection of phytodetrital matter by currents as it sinks through the water column until it reaches the seafloor; the flux component represents the horizontal flux of food particles along the seafloor before sedimentation; the settling component represents the final location of advected particles after taking into account the redistribution by seafloor currents. Sinking and settling particles follow similar patterns to the other environmental variables described above, with an eastward shift for the peak number of sinking and settling particles. The model-output shows an absence of sedimentation on large parts of the Mertz Bank (closest to the former tip of the MGT) due to increased current speeds (Supplementary Figure 3). Horizontal food flux along the seafloor, which is dependent mainly on the interaction between the distribution of surface productivity and seafloor current speeds, increases 20- to 50-fold on wide sections of the Mertz Bank (**Figure 3I**). In contrast, changes are more patchy on the Adélie Bank, where increases in flux are mostly restricted to the inner section of the bank and the shelf break, while the edges of the bank experience a decrease in flux. Further, most of the deeper sections of the shelf experience lower flux than before the calving.

# Predicted Changes in Suspension Feeder Abundances

Mapped predictions of suspension feeder (SF) cover are based on the statistical relationship between pre-calving cover estimated from still-images, and the environmental covariates depth and log(horizontal flux) (Supplementary Table 1, deviance explained = 44%), which are selected as the best predictor variables by the stepwise regression process (Supplementary Table 2). There is a good fit between the predicted values from the statistical model and the observed values at the sampling sites, with a slight underestimation of high cover values (**Figure 4**). SF-cover before the calving is high on most of the shallower sections of the shelf (<500 m depth) (**Figure 5A**), with values estimated between 60 and 100% cover. The two shallow banks in the study area, the Mertz Bank closest to the MGT in the East and Adélie Bank in the West, show relatively similar patterns of SF-cover, with average values of around 60% except for on the edges of the banks, where cover of suspension feeders may reach up to 100%. In contrast, post-calving predictions show a clear difference between the two banks (**Figure 5B**) in that SF-cover is predicted to increase by 20–40% (**Figure 5C**) on large parts of the Mertz Bank, while the Adélie Bank and other areas retain similar SF-cover as previously. The strong predicted increase in SF-abundance on the Mertz Bank in the east stems directly from the 20- to 50-fold increase in predicted particle flux that is a direct result of both increased surface production and stronger tidal currents.
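To make the role of the log-transformed flux concrete, the Python sketch below evaluates a cover model with depth and log(flux) as predictors. Everything specific in it is an assumption for illustration: the logit link, the natural logarithm, the three coefficients, and the function names are hypothetical and are not the fitted model from Supplementary Table 1.

```python
import math

# Hypothetical coefficients for illustration only; the fitted values are
# in the authors' Supplementary Table 1 and are not reproduced here.
INTERCEPT = -1.0
BETA_DEPTH = -0.002    # per metre of depth (assumed)
BETA_LOG_FLUX = 0.5    # per unit of ln(flux) (assumed)


def predicted_cover(depth_m, flux):
    """Predicted fractional cover (0-1) from an assumed logit-link model."""
    eta = INTERCEPT + BETA_DEPTH * depth_m + BETA_LOG_FLUX * math.log(flux)
    return 1.0 / (1.0 + math.exp(-eta))


def log_flux_shift(fold_change):
    """Shift in ln(flux) caused by multiplying flux by fold_change."""
    return math.log(fold_change)


baseline = predicted_cover(depth_m=300, flux=100.0)
for fold in (20, 50):
    after = predicted_cover(depth_m=300, flux=100.0 * fold)
    print(f"{fold}-fold flux: ln shift = {log_flux_shift(fold):.2f}, "
          f"cover {baseline:.2f} -> {after:.2f}")
```

Because the predictor is logarithmic, a given fold change in flux shifts the linear predictor by the same amount whatever the baseline flux, which is why a 20- to 50-fold increase can translate into a bounded change in predicted cover.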

#### DISCUSSION

FIGURE 3 | Comparison of mean values for selected biologically relevant environmental variables before and after the calving of the Mertz Glacier Tongue (MGT) in 2010. The first row shows environmental conditions in the 5 years leading up to the calving, the second row in the 5 years after, and the bottom row shows the magnitude of the change. Panels (A–C) are satellite-derived (MODIS-A) estimates of surface-chl-a corrected for Southern Ocean application (Johnson et al., 2013). The missing data at the previous location of the MGT stems from a landmask-artifact in the NASA-data. Panels (D–F) represent the speed of fluctuating currents (tidal currents) at the bottom layer of the regional ocean model used for this study (Cougnon et al., 2017). Panels (G–I) show the number of particles moving horizontally along the seafloor before their permanent sedimentation (on log-scale). In all maps the strongest changes can be observed in the Eastern section of the region, close to the location of the MGT. In the post-calving maps, the outline of the newly grounded iceberg B09B is added for reference while dotted lines indicate the original position of the glacier tongue before it broke off.

We predict that the calving of massive icebergs will have far-reaching effects on benthic communities mediated through the mechanism of pelagic-benthic coupling, and that changes occur even hundreds of kilometers away from the glacier tongue. While previous studies have shown that calving events can have localized negative impacts on the benthos through iceberg scouring (Gutt et al., 1996), here we predict that the combination of changes in local oceanography and surface production influences patterns of seafloor food-availability at much larger scales. Particularly strong changes are predicted in the horizontal flux of food particles post-calving, which is important in determining the distribution of suspension feeders (Jansen et al., 2018). Similar to other Antarctic regions that have recently become ice-free (e.g., Fillinger et al., 2013), our results suggest environmental conditions on the Mertz Bank now are much more favorable for suspension feeders (SFs) than before the calving. Our modeling suggests that there will be a strong, but locally confined increase in SF-abundance on the Mertz Bank of up to 40%. Further away, near the Adélie Bank, increases in bottom current speeds seem to compensate for the overall decrease in food supply, resulting in a prediction of only marginal changes in SF-abundance. The distribution of surface production around the newly formed polynya on the leeward side of the grounded iceberg B09B (Fogwill et al., 2016) seems to slightly influence SF-abundances with relatively stable cover predicted beneath the polynya in contrast to decreasing cover in the ice-covered area directly north of the iceberg B09B. Close to and below the position of the tip of the MGT before it broke away, where Beaman and Harris (2005) have previously found a high number of macrobenthic species, including many sponges and bryozoans, our model predicts a substantial decrease in SF-abundance, due to a decrease in seafloor current speed affecting the horizontal food-flux. However, we caution that little confidence should be placed in this result; the environmental conditions in this area might be unique due to the glacier tongue and we lack biological samples for this area. Further, we also lack confidence in the food-availability-data because of missing data in the remotely sensed surface chl-a dataset (see section Surface Productivity and Sea-Ice). Unfortunately, the shallower sections of the Mertz Bank and the western part of the Adélie Bank, which we show here are particularly interesting areas, have not been physically sampled as part of the survey (see **Figure 2**). However, because the survey was designed to cover a wide range of depths and geomorphologies (Hosie et al., 2011), because of the high confidence in the predicted values from the statistical model (**Figure 4**), and because of the similarity in environmental conditions between the shallow banks prior to the glacier calving, we are confident the relationship between environmental variables and distributional patterns of suspension feeder abundances is consistent across the region.

Regions around ice-shelves and glacier tongues provide valuable insight into the dynamic environment of the Antarctic shelf. When an ice-shelf calves a massive iceberg or collapses entirely, the marine environment, to which species might have acclimated for many years, can transition quickly between a food-poor and a food-rich system (Gutt et al., 2011). The MGT is thought to calve massive icebergs in a ∼70 year cycle (Campagne et al., 2015; Giles, 2017), meaning that there are likely also differences in the long-term stability of environmental conditions, and in the frequency of iceberg-scour between the Mertz Bank near the MGT and the Adélie Bank in the West of the George V shelf. Studies on the West Antarctic Peninsula suggest that at least some components of Antarctic benthic communities on the shelf, such as glass sponges and pioneering species, can increase rapidly in areas that are newly ice-free, fueled by higher export of surface production (Gutt et al., 2011; Fillinger et al., 2013). Conversely, slower-growing deep-sea corals and bryozoans may respond more slowly to changing environmental conditions. Whether any species or communities have adapted to these long-term cyclic events in the George V region is unknown, because a comparative study between the Mertz and Adélie Bank has so far not been conducted due to a lack of biological data. Further, it is currently also not possible to validate our predictions with independent data, because there have been no comprehensive observations of the deep seafloor since the glacier calving. However, if the benthic community response in East Antarctica is similar to that of the West Antarctic Peninsula, the community composition on the Mertz Bank can be expected to change rapidly in the more favorable environment after the glacier calving, or will have undergone changes already, given that 8 years have passed since the calving event. 
The postulated more favorable environment on the Mertz Bank might continue to persist for some time until the MGT regrows the ice tongue. Further, oceanographic models from after the calving indicate an increase in basal melting of the MGT due to warmer, faster moving waters from the east after grounded tabular iceberg relocation and the MGT calving (Cougnon et al., 2017) which may slow the regrowth of the MGT.

Food-availability is a key factor influencing species distributions. Here, we map predicted changes in relevant seafloor-food-availability caused by the calving of a major glacier tongue, and predict change in distributional patterns of benthic suspension feeders, a key element of Antarctic biodiversity. The study area on the George V shelf lies within the recently proposed East Antarctic MPA (AAD, 2017), and we suggest the Mertz Bank and Adélie Bank should be considered as distinct areas for future sampling of the benthic community. The predicted distribution of suspension feeders after the glacier-calving provides an up to date picture of a key part of seafloor biodiversity, from which the representativeness of the proposed MPA can be assessed. Until regular monitoring programs are established, modeling studies such as ours give important information and context for future monitoring and assessment. Our study provides insight into temporal change and into the mechanisms that drive changes at the seafloor. This is important for a holistic understanding of the Antarctic marine ecosystem, and helps us to understand how climate change can affect the seafloor in the future.

#### MATERIALS AND METHODS

#### Study Area

The study area is located on the relatively deep (500–700 m) East Antarctic continental shelf and slope between longitudes 139 and 147°E (**Figure 2**). The depth of the shelf ranges from 200 m on the banks to 1,300 m in the basins. The most prominent feature in this region is the Mertz Glacier Tongue (MGT) in the east which has strong influences on both oceanography (Barber and Massom, 2007) and biology (Arrigo and Van Dijken, 2003; Sambrotto et al., 2003; Beans et al., 2008; Jansen et al., 2018). Strong katabatic winds in the region drive sea-ice production (Massom et al., 2001), convection of dense water that contributes to overturning circulation (e.g., Williams et al., 2008), and importantly also form ice-free surface areas on the westward sides of the MGT and the grounded icebergs. These permanent ice-free areas support a long growing season for phytoplankton resulting in high phytoplankton productivity (Arrigo and Van Dijken, 2003; Sambrotto et al., 2003; Beans et al., 2008). Abundant and diverse benthic suspension feeder communities have been found primarily on the shallower section of the shelf between 200 and 600 m (Post et al., 2011). Further, tidal currents on the seafloor redistribute surface derived production, with flux rates of organic particles directly related to the abundance and species richness of the benthic community (Jansen et al., 2018).

The MGT calves off massive icebergs in an estimated 70 year cycle (Campagne et al., 2015), the last event happening in 2010 after a collision between the massive iceberg B09B and the MGT. Since the calving of the MGT, the Mertz Polynya has decreased significantly in size (Tamura et al., 2016), changing ocean circulation (Aoki et al., 2017; Cougnon et al., 2017; Kusahara et al., 2017) and increasing sea-ice concentrations (Tamura et al., 2012). The iceberg B09B grounded on the South-Eastern flank of the Adélie Bank shortly after the collision, and a new polynya has formed on its leeward side (Fogwill et al., 2016). For more details on the study area and the oceanography we refer to numerous papers on the region (e.g. Beaman and Harris, 2005; Cougnon et al., 2013, 2017; Shadwick et al., 2013; Kusahara et al., 2017; Jansen et al., 2018).

## Environmental Data and Numerical Modeling

#### Ocean Model and Bathymetry

Ocean current speeds and directions before and after the glacier calving are derived from a tide-simulating oceanographic model for the George V shelf developed by Cougnon et al. (2017) based on the Regional Ocean Modeling System (ROMS) (Shchepetkin and McWilliams, 2005). The model setup used here is similar to that described by Cougnon et al. (2013), using the same horizontal and vertical grid. The horizontal grid has a resolution of 2.16 km near the southern boundary and 2.88 km near the northern boundary. The vertical grid is arranged to give higher resolution at the top and bottom of the water column. The model domain encompasses the area from the Antarctic coastline to the deep ocean at 62.72°S, and from 135.77°E, to the west of the French base Dumont D'Urville, to 158.08°E, to the east of George V Land (Cougnon et al., 2013).

The model includes ocean-ice shelf thermodynamics described by three equations following Holland and Jenkins (1999), frazil ice thermodynamics following Galton-Fenzi et al. (2012), as used in previous studies (e.g., Cougnon et al., 2013; Gwyther et al., 2014) and a simplified analytic tidal forcing at the lateral boundaries (Cougnon et al., 2017 for details). The bathymetry in both simulations is based on RTopo-1 (Timmermann et al., 2010), and modified to include local high-resolution bathymetry (Beaman et al., 2011) as described in Mayet et al. (2013). Ice draft of the MGT and B09B, along with the underlying bathymetry, is based on an early version of the most up-to-date product by Mayet et al. (2013).

The total run time of the model for each simulation (before and after calving) is 33 years, using an annually repeating loop of the same lateral forcing for both simulations and an annually repeating loop of the surface forcing corresponding to each icescape. This 33 year run includes a spin-up phase of 30 years to reach equilibrium. The spin-up phase of the model has no relevance for the seafloor communities, but is a procedure to ensure the ocean model reaches equilibrium (so that the ocean heat content and the ocean currents are relatively stable under the applied forcing, allowing the output to be analyzed). The iceberg B09B is at equilibrium in the model. In the model, icebergs and ice-shelves are steady and do not move or change shape. However, they are thermodynamically active, which means that heat and salt fluxes due to ocean-driven melting/refreezing are taken into consideration.

#### Surface Productivity and Sea-Ice

We estimated spatial patterns of surface productivity from measures of ocean color derived from NASA's Moderate Resolution Imaging Spectroradiometer (MODIS-Aqua) (NASA Goddard Space Flight Center, Ocean Ecology Laboratory, Ocean Biology Processing Group, 2014). We used Level-3 binned daily remote sensing reflectance, provided at a resolution of 4 km equal-area bins, and corrected the values for Southern Ocean application using the algorithm in Johnson et al. (2013). Daily measures of chlorophyll-a concentrations were averaged for southern hemisphere spring and summer in each year for a 5-year period before (2005–2009) and after (2011–2016) the calving of the MGT. There are no data available from NASA for the area previously covered by the MGT, presumably due to a landmask artifact.

We estimated seasonal patterns of sea-ice concentrations from satellite-measures of Nimbus-7 SMMR and DMSP SSM/I-SSMIS Passive Microwave Data (Cavalieri et al., 1996, updated yearly). Daily measures of sea-ice concentrations were averaged for southern hemisphere spring and summer in each year for a 5-year period both before (2005–2009) and after (2011–2016) the calving of the MGT.

#### Food Availability Model (FAM)

We mapped the availability of surface-derived food at the seafloor before and after the calving event using a validated food-availability model (FAM) as described in Jansen et al. (2018). The FAM uses a distribution of particles that is based on multi-year averages of satellite-derived chlorophyll-a (section Surface Productivity and Sea-Ice), and tracks individual particles from the surface to the seafloor while accounting for their sinking speed, the speed and direction of currents (section Ocean Model and Bathymetry) in 3D, and the sedimentation rate of particles on the seafloor based on particle sizes (Jansen et al., 2018). The model generates three maps of food availability, namely a sinking-map (showing the number of particles arriving/temporarily settling on the seafloor), a map of horizontal flux (showing where particles move along the seafloor before their sedimentation), and a settling-map (showing where particles permanently settle on the seafloor). While the objectives of our study differ strongly from those of Jansen et al. (2018) (who described and validated a new method, whereas we apply the method to map the distribution of suspension feeders and their changes through time), the only differences in the food-availability model are the use of a different ocean model and a different surface chl-a climatology. Current speeds on the shelf seem to be slightly lower in the pre-calving part of the model developed by Cougnon et al. (2017) compared to that of Cougnon et al. (2013), but are within the seasonal variation of the Cougnon et al. (2013) model.

For the particle tracking, we used four consecutive time slices of the ocean model for the summer season before and after the calving, respectively. We used the four consecutive time slices with the strongest differences in current direction and speed, to ensure that each time slice adequately captures one full tidal movement (a 6 h time slice with 3 h of incoming tide and 3 h of outgoing tide would show very little current speed). The maximum number of seed particles was ∼4.5 million for the pre-calving model and the particles were tracked in 30 min time-steps. At each time-step the location of each particle with respect to the ROMS cells was calculated, and the water current speed and direction at that location were updated for advection of the particle during the next time-step. Particles were stopped when they either moved out of the study area or matched the stopping criteria for the respective model, as described in Jansen et al. (2018). The resulting particle distributions from each model run were back-transformed into a regular grid with a resolution of 1/15 degrees.

We use the FAM parameters previously defined for this region (Jansen et al., 2018), namely a sinking speed of 300 m/day, a particle radius of 0.24 mm, a seawater density of 1,030 kg/m<sup>3</sup>, a settling-particle density of 1,100 kg/m<sup>3</sup> and an aspect ratio of 1, representing idealized spherical particles in our modeling.
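The time-stepping described above can be illustrated with a minimal sketch. It is not the "ptrackr" implementation: the `Particle` class, the `uniform_current` lookup and the rectangular `domain` are hypothetical stand-ins, and the only stopping rules shown are reaching the seafloor or leaving the study area, whereas the actual stopping criteria are those of Jansen et al. (2018). The sinking speed and the time-step are the values given in the text.

```python
from dataclasses import dataclass

SINKING_SPEED_M_PER_DAY = 300.0   # value reported in the text
TIME_STEP_S = 30 * 60             # 30 min time-step, as in the text
SECONDS_PER_DAY = 86_400
# Vertical distance covered in one time-step (6.25 m for the values above).
SINK_PER_STEP_M = SINKING_SPEED_M_PER_DAY * TIME_STEP_S / SECONDS_PER_DAY


@dataclass
class Particle:
    x: float             # eastward position (m)
    y: float             # northward position (m)
    depth: float         # depth below the surface (m, positive down)
    active: bool = True  # False once the particle has been stopped


def uniform_current(x, y, depth):
    """Hypothetical stand-in for an ocean-model lookup; returns (u, v) in m/s."""
    return 0.05, -0.02


def step(p, current_at, seafloor_depth, domain):
    """Advance one particle by one time-step and apply the stopping rules."""
    u, v = current_at(p.x, p.y, p.depth)   # current at the particle's location
    p.x += u * TIME_STEP_S                 # horizontal advection
    p.y += v * TIME_STEP_S
    p.depth += SINK_PER_STEP_M             # constant sinking
    xmin, xmax, ymin, ymax = domain
    if p.depth >= seafloor_depth:          # reached the seafloor
        p.depth = seafloor_depth
        p.active = False
    elif not (xmin <= p.x <= xmax and ymin <= p.y <= ymax):
        p.active = False                   # left the study area


def track(p, current_at, seafloor_depth, domain, max_steps=10_000):
    """Step a particle until it stops; returns the number of steps taken."""
    steps = 0
    while p.active and steps < max_steps:
        step(p, current_at, seafloor_depth, domain)
        steps += 1
    return steps
```

With these values a particle sinks 6.25 m per time-step, so in this simplified setting a particle released at the surface over a 500 m deep seafloor is advected horizontally for 80 time-steps (40 h) before it arrives.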

For the particle tracking, we used R Version 3.3.1 (R Core Team, 2016) with the packages "ptrackr" (Jansen and Sumner, 2017), "raster" (Hijmans, 2015), "ncdf4" (Pierce, 2014), "nabor" (Elseberg et al., 2012), "geostatspat" (Brown, 2015) and "spatstat" (Baddeley and Turner, 2005).

#### Biological Data Collection

We use the same dataset of benthic images as used by Jansen et al. (2018), which is available through the Australian Antarctic Division Data Centre (Jansen et al., 2017). It comprises detailed underwater still images collected during the Collaborative East Antarctic Marine Census (CEAMARC) for the Census of Antarctic Marine Life in December 2007 to February 2008 (Hosie et al., 2011). Transects during the CEAMARC were designed to cover a wide range of depths and geomorphologies in the region and therefore can be considered representative of the area modeled. A forward-facing 8-megapixel Canon EOS 20D SLR with two speedlight strobes was mounted on a beam trawl and pictures were taken every 10 s. Thirty-two sites were sampled, with transect lengths mostly between 4 and 6 km and exceptions ranging between 3 and 16 km. The trawl was controlled using a deck winch. Benthic fauna were identified to the lowest taxonomic resolution possible and, where species identification was not possible, specimens with similar overall appearance were grouped into morphotypes. The bottom third of each image was scored. For each image, the abundance of each species/morphotype was estimated within 5% bins from 0 to 50% and 10% bins from 50 to 100%. Using taxonomy and body type along with expert knowledge, the abundance of the suspension-feeding fauna in each picture was calculated.

#### Statistical Analysis

We use the image data and the maps of environmental data for 2005–2009 to generate a pre-calving statistical model, aiming to produce the statistical model that best explains the abundance of suspension feeders (SF). Each transect was split at the boundaries of the environmental grid cells to ensure that all pictures within a segment shared the same values of the environmental covariates. We multiplied %-cover estimates in each image by 100 and rounded up to generate integer values that better suit a multiple regression with a negative binomial GLM (assuming the values would then represent the number of pixels covered by the fauna). We selected variables by backward elimination from a full model using AIC. The full model contained the important environmental variables identified by Jansen et al. (2018), namely depth, tidal-current speed and the horizontal flux of particles along the seafloor. The final model contained only depth and log(horizontal-flux) as predictor variables. We found that using a negative binomial generalized linear model (compared to a linear model in the previous study) did not affect the selection of model terms. Therefore, the change in selected model terms is likely to come from the difference in the ocean model or the surface productivity. The pre-calving statistical model showed a good fit between the predicted and the observed values at the sample sites, with possibly a slight underestimation of suspension feeders at high abundances (**Figure 5**). Due to the limited amount of biological data available, we were not able to use separate datasets for training and testing the statistical model.
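Backward selection by AIC can be sketched generically as follows. This is an illustration of the procedure only, not the R code used in the study: `backward_select` and `toy_fit` are hypothetical helpers, and the log-likelihoods in `TOY_LOGLIK` are invented solely to exercise the loop (they are not results from the fitted models). In practice `fit` would refit the negative binomial GLM for each candidate set of terms.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def backward_select(full_terms, fit):
    """Drop one term at a time for as long as doing so lowers the AIC.

    fit(terms) must return (log_likelihood, n_params) for a model
    containing exactly those terms.
    """
    current = tuple(full_terms)
    current_aic = aic(*fit(current))
    while current:
        candidates = []
        for term in current:
            reduced = tuple(t for t in current if t != term)
            candidates.append((aic(*fit(reduced)), reduced))
        best_aic, best_terms = min(candidates)
        if best_aic >= current_aic:   # no single removal improves the model
            break
        current, current_aic = best_terms, best_aic
    return current, current_aic


# Invented log-likelihoods, for illustration only (not values from the study).
TOY_LOGLIK = {
    frozenset({"depth", "tidal_speed", "log_flux"}): -100.0,
    frozenset({"depth", "log_flux"}): -100.2,
    frozenset({"depth", "tidal_speed"}): -108.0,
    frozenset({"tidal_speed", "log_flux"}): -112.0,
    frozenset({"depth"}): -109.0,
    frozenset({"log_flux"}): -113.0,
    frozenset({"tidal_speed"}): -120.0,
    frozenset(): -125.0,
}


def toy_fit(terms):
    """Hypothetical fit: intercept + one coefficient per term + dispersion."""
    return TOY_LOGLIK[frozenset(terms)], len(terms) + 2
```

In this toy setting, dropping the tidal-speed term barely changes the log-likelihood, so the penalty of 2 per parameter makes the reduced model preferable; removing either remaining term raises the AIC and the loop stops.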

We then used the pre-calving statistical model to predict the spatial distribution of SF-cover in both the pre-calving and the post-calving environment. The difference between the resulting maps was used to make inferences about areas with expected increases and decreases in the abundance of suspension feeders. Further, we bootstrapped the parameters of the pre-calving statistical model to obtain estimates for the standard-deviation of the predictions.
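The idea of using a bootstrap to attach a standard deviation to a prediction can be sketched as below. The text does not specify the resampling scheme, so this sketch makes two loudly stated assumptions: it resamples observations with replacement (a case bootstrap), and it swaps the negative binomial GLM for an ordinary least-squares line so that the example stays self-contained. The function names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares line; a simple stand-in for the NB GLM."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope   # (intercept, slope)


def bootstrap_prediction_sd(xs, ys, x_new, n_boot=500, seed=1):
    """Standard deviation of the prediction at x_new across bootstrap refits."""
    rng = random.Random(seed)       # fixed seed for reproducibility
    n = len(xs)
    predictions = []
    while len(predictions) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:        # a line cannot be fitted to one x value
            continue
        by = [ys[i] for i in idx]
        intercept, slope = fit_line(bx, by)
        predictions.append(intercept + slope * x_new)
    return statistics.stdev(predictions)
```

Applied cell by cell to a prediction grid, the same loop would give a map of prediction standard deviations.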

For the statistical analysis, we used R Version 3.3.1 (R Core Team, 2016) and the packages "raster" (Hijmans, 2015), "MASS" (Venables and Ripley, 2002), "maptools," and "modEvA."

#### Data Availability

Estimates of suspension feeder abundances from benthic images are available through the Australian Antarctic Division Data Centre (Jansen et al., 2017). Raster files containing the mapped predictions of food availability and suspension feeder abundances presented in this study, from before and after the glacier calving, are available through the Australian Antarctic Data Centre (Jansen, 2018).

#### AUTHOR CONTRIBUTIONS

JJ, NH, CJ, PD, and EC conceived and designed the study. JJ, NH, CJ, and PD analyzed the data. EC and BG-F did the numerical modeling. JJ prepared all figures and wrote the paper with contributions from all authors.

#### ACKNOWLEDGMENTS

We thank Marc Eléaume for useful discussions and for making the original biological data available to us, and the reviewers of the paper for their comments. Biological samples were collected during the CEAMARC program as part of the IPY #53 Census of Antarctic Marine Life program. Coastline and glacial features for the figures are taken from the Antarctic Digital Database version 5. JJ is supported by a Tasmanian Graduate Research Scholarship and a QAS Top-Up scholarship. This work was completed as part of Australian Antarctic Science project 4124.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00094/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Jansen, Hill, Dunstan, Cougnon, Galton-Fenzi and Johnson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Linking Ross Sea Coastal Benthic Communities to Environmental Conditions: Documenting Baselines in a Spatially Variable and Changing World

Vonda J. Cummings<sup>1</sup>\*, Judi E. Hewitt<sup>2</sup>, Simon F. Thrush<sup>3</sup>, Peter M. Marriott<sup>1</sup>, N. Jane Halliday<sup>1</sup> and Alf Norkko<sup>4,5</sup>

<sup>1</sup> National Institute of Water and Atmospheric Research, Wellington, New Zealand, <sup>2</sup> National Institute of Water and Atmospheric Research, Hamilton, New Zealand, <sup>3</sup> Institute of Marine Science, University of Auckland, Auckland, New Zealand, <sup>4</sup> Tvärminne Zoological Station, University of Helsinki, Helsinki, Finland, <sup>5</sup> Baltic Sea Centre, Stockholm University, Stockholm, Sweden

#### *Edited by:*

Huw James Griffiths, British Antarctic Survey (BAS), United Kingdom

#### *Reviewed by:*

Jan Marcin Weslawski, Institute of Oceanology (PAN), Poland

Jeroen Ingels, Florida State University, United States

> *\*Correspondence:* Vonda J. Cummings vonda.cummings@niwa.co.nz

#### *Specialty section:*

This article was submitted to Marine Ecosystem Ecology, a section of the journal Frontiers in Marine Science

*Received:* 04 March 2018 *Accepted:* 15 June 2018 *Published:* 09 July 2018

#### *Citation:*

Cummings VJ, Hewitt JE, Thrush SF, Marriott PM, Halliday NJ and Norkko A (2018) Linking Ross Sea Coastal Benthic Communities to Environmental Conditions: Documenting Baselines in a Spatially Variable and Changing World. Front. Mar. Sci. 5:232. doi: 10.3389/fmars.2018.00232

Understanding the functionality of marine benthic ecosystems, and how they are influenced by their physical environment, is fundamental to realistically predicting effects of future environmental change. The Antarctic faces multiple environmental pressures associated with greenhouse gas emissions, emphasizing the need for baseline information on biodiversity and the bio-physical processes that influence biodiversity. We describe a survey of shallow water benthic communities at eight Ross Sea locations with a range of environmental characteristics. Our analyses link coastal benthic community composition to seafloor habitat and sedimentary parameters and broader scale features, at locations encompassing considerable spatial extent and variation in environmental characteristics (e.g., seafloor habitat, sea ice conditions, hydrodynamic regime, light). Our aims were to: (i) document existing benthic communities, habitats and environmental conditions against which to assess future change, (ii) investigate the relationships between environmental and habitat characteristics and benthic community structure and function, and (iii) determine whether these relationships were dependent on spatial extent. A very high percentage (>95%) of the between-location variability in macro- or epifaunal community composition was explainable using multi-scale environmental variables. The explanatory power varied depending on the scale of influence of the environmental variables measured (fine and medium-scale habitat, broad scale), and with community type. However, the inclusion of parameters at all scales produced the most powerful model for both communities.
Ice duration, ice thickness and snow cover were important broad scale variables identified that directly relate to climate change. Even when using only habitat-scale variables, extending the spatial scale of the study from three locations covering 32 km to eight locations covering ∼340 km increased the degree of explanatory power from 18–32 to 64–78%. The increase in explanatory power with spatial extent lends weight to the possibility of using an indirect "space for time" substitution approach for future predictions of the effects of change on these coastal marine ecosystems. Given the multiple and interacting drivers of change in Antarctic coastal ecosystems, a multidisciplinary, long-term, repeated observation approach will be vital to both improve and test predictions of how coastal communities will respond to environmental change.

Keywords: Antarctica, benthos, coast, environmental change, multi-scale observations, functional traits

#### INTRODUCTION

Seafloor community composition is the product of numerous biological and physical processes acting across multiple spatial and temporal scales (e.g., Ricklefs, 1987; Kolasa and Pickett, 1991). Such communities act as sentinels of change, integrating many processes occurring in the surrounding ocean. Understanding the complex functionality of benthic ecosystems, and how they are influenced by their physical environment, is fundamental to realistically predicting future effects of environmental change. Key to these predictions is the need for baseline information on biodiversity and the role of bio-physical processes in shaping existing marine ecosystems, as well as predictions of change at appropriate space and time scales (Dayton and Tegner, 1984; Wiens, 1989).

Antarctic marine environments are amid dramatic change, with significant air and ocean warming already observed (Meredith and King, 2005; Martinson et al., 2008; Turner et al., 2009; IPCC, 2013) and meta-analyses of multiple global climate models predicting that several more degrees of warming could occur in Antarctica over the coming decades (Walsh, 2009; Mayewski et al., 2015). Warming sea temperatures can, as well as directly affecting fauna (Peck et al., 2004; Peck, 2011), combine with changing wind patterns and oceanographic conditions to manifest further impacts through altered sea ice dynamics (e.g., Ducklow et al., 2013). The importance of changes in sea ice extent and persistence has been well highlighted, particularly because changes are anticipated across large areas of the Southern Ocean (Stammerjohn et al., 2008; Gutt et al., 2015, 2016). To date, sea temperatures have increased in the Antarctic Circumpolar Current, both at the surface and at depth (e.g., Aoki et al., 2003; Boning et al., 2008). The Western Antarctic Peninsula (WAP) and Potters Cove have exhibited the greatest increases in sea temperature, by ∼1°C since 1950 (Meredith and King, 2005; Schloss et al., 2012). The changes in sea ice dynamics that have accompanied warming in the WAP region have cascaded through trophic levels to affect a range of organisms (Clarke et al., 2007; Ducklow et al., 2007; Grange and Smith, 2013). Although such changes have not been detected in the Ross Sea region of Antarctica, variations in sea ice characteristics and productivity (multiyear, annual, polynya influence) are known to have a major influence on marine communities in this area (e.g., Dayton and Oliver, 1977; Thrush et al., 2006; Norkko et al., 2007; Smith et al., 2012a,b).

A recent review, conducted by Convey et al. (2014), found that ice cover, ice scour, salinity and productivity were the most important physical and biological drivers of Antarctic biodiversity, and that there was not one key factor acting Antarctic-wide. Ice-shelf collapse and iceberg calving have been shown to affect multiple ecosystem components through interruptions to "usual" sea ice patterns and productivity (e.g., Arrigo et al., 2002; Ainley et al., 2006; Kooyman et al., 2007; Siniff et al., 2008; Gutt et al., 2011, 2013; Thrush and Cummings, 2011). Considerable environmental change is expected in Antarctic marine ecosystems in the coming decades (e.g., Stammerjohn et al., 2008, 2012; Jacobs and Giulivi, 2010; Smith et al., 2014), including changes in ocean chemistry (Orr et al., 2005; McNeil and Matear, 2008). The complexity of the direct and indirect influences of this potential change on marine communities cannot be overstated (see Gutt et al., 2015). We are not currently in a good position to understand how these unique ecosystems will respond.

Nearshore communities, important to higher trophic levels (Grange and Smith, 2013) and often rich in primary production, are at the boundary between terrestrial and open ocean systems. Significant inputs of water (freshening), sediments, and associated minerals are observed in these areas due to their proximity to ice shelves and glaciers and the associated runoff with melt (Dierssen et al., 2002; Moline et al., 2008; Grange and Smith, 2013; Pasotti et al., 2015; Dayton et al., 2016; Monien et al., 2017). Consequently, impacts of warming could be expected to be greater in the coastal regions, at least in the short term (e.g., Gutt et al., 2015). For this reason, a good understanding of the environmental drivers of patterns in benthic community composition, and the potential resilience of these coastal communities is important.

In this study, we focus on shallow water benthic communities in the coastal Ross Sea. Our analyses link coastal benthic community composition to seafloor habitat and sedimentary parameters and broader scale environmental characteristics, at eight locations along the Ross Sea coastline (**Figure 1**). This spatially extends a study by Cummings et al. (2006) to encompass greater variation in environmental characteristics (e.g., seafloor habitat, sea ice conditions, hydrodynamic regime, light climate). Some of these characteristics can be considered aligned along a natural gradient of factors that will influence benthic community structure and function, and which are anticipated to change in future (e.g., sea ice persistence).

Our aims were three-fold: (i) to document existing benthic communities, habitats and environmental conditions against which to assess future change, (ii) to investigate the relationships between environmental and habitat characteristics and benthic community structure and function, and (iii) to determine whether the conclusions of Cummings et al. (2006) hold as the spatial and environmental extent of the data increases and whether explanatory power also increases. If the same characteristics are found to be important and if the explanatory power remains the same or increases then it is likely that natural gradients can be used as an indirect space for time (e.g., Pickett, 1989; Fukami and Wardle, 2005) substitution that might help predict the effects of their changes on these coastal marine ecosystems in the coming decades.

#### MATERIALS AND METHODS

Eight coastal Ross Sea locations were chosen, spanning ∼340 km of coastline (**Figure 1**) and considerable variability in seafloor habitat and sea ice conditions. At each location (Cape Evans, New Harbor, Dunlop Island, Spike Cape, Granite Harbor, Tethys Bay west, Tethys Bay south and Gerlache Inlet), three sites, separated by at least 50 m, were chosen within the 15–25 m depth range (**Table 1**). These locations were sampled from 2001 to 2007, in October/November (except for Tethys Bay west, which was sampled in January; see **Table 1** for exact dates). While two of these locations were situated in Tethys Bay, they were ∼1.6 km apart, and the ice persists for approximately 1 month longer each season at Tethys Bay west than at Tethys Bay south (**Table 1**). Each site was accessed through a hole drilled in the sea ice and sampling was conducted by SCUBA divers.

The methodologies used for sample collection and processing, and their rationale, are provided in detail in Cummings et al. (2006), along with our analysis of the three southernmost locations (New Harbor, Dunlop Island, Spike Cape; spanning ∼32 km of coastline). These methods are briefly reiterated below.

#### Transect Based, Nested Sampling

Benthic macro-infaunal communities (hereafter "macrofauna"), epifaunal communities, and fine and medium scale habitat features, were quantified at each site using a nested sampling design. At each site, a 20 m long transect line was laid on the seafloor oriented parallel to the shoreline. Five marker pegs were inserted into the sediment at random points along the length of the transect, which was then videoed by SCUBA divers at fixed heights of 70 and 40 cm above the seafloor. These two heights enabled us to categorize habitat features of different densities between sites, and the information was later scaled to numbers or percent cover m<sup>−2</sup> (Cummings et al., 2006). Epifaunal taxa and habitat features were quantified from 1.5 m lengths of video frame grabs, each centered on the location of a sediment core collected to quantify macrofauna (Cummings et al., 2006). Five medium-scale habitat features were quantified from the video frame grabs: % cover of sand, rock, pebble, cobble, and coarse sediments with detritus (Cummings et al., 2006).

TABLE 1 | Depths, dates of sampling, and broad scale environmental characteristics at the eight coastal Ross Sea locations.

nr, Not recorded.

Sediment cores were collected adjacent to each marker, to quantify macrofaunal communities (70 mm diameter, 100 mm deep core), and fine-scale sediment characteristics (26 mm diameter, 50 mm deep core). The large core samples were sieved (500 µm mesh), and then preserved in 70% isopropanol in seawater, before macrofauna were separated out, and identified to the lowest taxonomic level possible. Sediment from each small core was homogenized and sub-sampled for analysis of chlorophyll a (Chl a), phaeophytin, particle size and organic content. Chl a and phaeophytin were extracted from freeze-dried sediments by boiling in 90% ethanol. The extract was measured spectrophotometrically, and an acidification step was included to separate degradation products (phaeophytin) from Chl a (Sartory, 1982). Sediment for particle size analysis was digested in 6% hydrogen peroxide for 48 h to remove organic matter, and dispersed using Calgon. A Galai particle analyser (Galai CIS-100; Galai Productions Ltd, Midgal Haemek, Israel) was then used to calculate percentage volumes for the coarse, medium and fine sand, silt and clay fractions. Organic content was determined by drying the sediment at 60°C for 48 h, followed by combustion at 400°C for 5.5 h.

#### Broad Scale Environmental Variables

At each location we collected information on current velocity, water temperature (°C), light transmission to the seafloor, the thickness of sea ice (m), and snow cover (m). Current velocities (cm s<sup>−1</sup>) and water temperatures were measured by an S4 or Aanderaa current meter deployed 4 m above the seabed for the duration of our visit (at least 3 days). Light transmission, measured as percentage incident light, was calculated from five replicate measurements of photosynthetically available radiation (PAR) made both above the sea ice and at the seafloor, using a LiCor Li190SA quantum sensor for incident irradiance (background irradiance above the ice) and a Li192SB for underwater irradiance attached to a Li-1000 logger. Ice cover duration (i.e., number of months per year that sea ice covered the site) was also estimated using personal observations and satellite imagery. Current, temperature and irradiance were not measured at Tethys Bay west.
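As an illustration of the light transmission calculation, the sketch below expresses seafloor PAR as a percentage of incident PAR. How the five replicates were combined is not stated in the text; this sketch assumes the replicates are averaged before the ratio is taken, and the function name is hypothetical.

```python
from statistics import fmean


def light_transmission_percent(incident_par, seafloor_par):
    """Seafloor PAR as a percentage of incident (above-ice) PAR.

    Assumption: replicate readings are averaged first, then the ratio taken.
    Both arguments are sequences of readings in the same units.
    """
    if not incident_par or not seafloor_par:
        raise ValueError("at least one reading is needed in each series")
    incident = fmean(incident_par)
    if incident <= 0:
        raise ValueError("incident PAR must be positive")
    return 100.0 * fmean(seafloor_par) / incident
```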

#### Functional Traits of Macrofauna

To identify the relative predominance of macrofauna with particular functional characteristics, the taxon list was examined and traits assigned using a combination of literature information, advice from experts and our own observations. Three trait categories were characterized for each taxon: feeding traits (suspension feeders, deposit feeder-grazers or predator-scavengers); motility traits (sedentary, limited motility, freely motile); and living position (surface dwellers, or shallow (<2 cm) or deep (>2 cm) sediment dwellers). These traits were chosen for their potential influence on ecosystem functions, and the expectation that differences in traits may be closely linked to habitat (Hewitt et al., 2008). When a taxon exhibited more than one trait within a particular category, "fuzzy coding" was applied to reflect the relative importance of the multiple traits (sensu Chevenet et al., 1994). The fuzzy probability was then multiplied by the abundance of each taxon to provide an abundance-weighted measure of each trait. Results are presented as relative proportions per trait within each category.
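The abundance weighting of fuzzy-coded traits described above can be sketched as follows for a single trait category. The function name and the taxa in the usage note are hypothetical; the sketch assumes the fuzzy codes of each taxon sum to 1 within the category.

```python
def trait_proportions(abundance, fuzzy_codes):
    """Abundance-weighted relative proportion of each trait in one category.

    abundance:   {taxon: number of individuals}
    fuzzy_codes: {taxon: {trait: fuzzy probability}}, assumed to sum to 1
                 per taxon within the category.
    """
    totals = {}
    for taxon, count in abundance.items():
        for trait, probability in fuzzy_codes[taxon].items():
            # fuzzy probability multiplied by the taxon's abundance
            totals[trait] = totals.get(trait, 0.0) + probability * count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {trait: 0.0 for trait in totals}
    return {trait: value / grand_total for trait, value in totals.items()}
```

For example, with 10 individuals of a hypothetical taxon coded entirely as suspension feeder and 30 individuals of a second taxon coded half deposit feeder-grazer and half predator-scavenger, the feeding category would be 25% suspension feeders and 37.5% for each of the other two traits.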

#### Statistical Analyses

#### Community Descriptions

Diversity indices [number of individuals, number of taxa, and species evenness (J′)] were calculated for both the epifauna and macrofauna using the DIVERSE procedure within PRIMER (Clarke and Gorley, 2001). To assess the relative variability in abundance of the three epifaunal taxa common across locations (the scallop Adamussium colbecki, the seastar Odontaster validus and the sea urchin Sterechinus neumayeri), coefficients of variation (CoV; SD/mean) were calculated, using the average abundance values (m<sup>−2</sup>) for each of the three sites per location.
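For readers unfamiliar with these indices, the sketch below computes species evenness and the coefficient of variation from raw values. It is not the PRIMER DIVERSE routine: it assumes the standard definition of Pielou's evenness, J′ = H′/ln(S), with H′ the Shannon index and S the number of taxa present, and it assumes the sample standard deviation (n − 1 denominator) for the CoV.

```python
import math
import statistics


def pielou_evenness(counts):
    """Pielou's J' = H' / ln(S), using natural logarithms.

    Returns NaN when fewer than two taxa are present, because ln(S) is then
    zero (or undefined) and the index cannot be computed.
    """
    present = [c for c in counts if c > 0]
    n_taxa = len(present)
    if n_taxa < 2:
        return float("nan")
    total = sum(present)
    shannon = -sum((c / total) * math.log(c / total) for c in present)
    return shannon / math.log(n_taxa)


def coefficient_of_variation(values):
    """CoV = SD / mean, using the sample standard deviation (assumption)."""
    return statistics.stdev(values) / statistics.fmean(values)
```

A CoV near 0 indicates similar abundances across the three sites of a location, whereas values around or above 1 indicate a patchy distribution.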

The relationships within and between locations for (a) macrofaunal community composition (determined from the core data), and (b) epifaunal community composition (determined from the video) were investigated using non-metric multidimensional scaling ordination (MDS; Clarke and Gorley, 2001) on untransformed data, and averages for each site. Patterns in macrofaunal community functional traits were also investigated using MDS. SIMPER (PRIMER; Clarke, 1993), with a cutoff of 90%, was used to determine average similarities within locations, along with the taxa that characterized the communities at each location.

#### Linking Benthic Community Composition and Environmental Characteristics

The degree to which variations within and among locations in benthic community composition (macrofauna, epifauna) could be explained by broad scale environmental features and/or medium to fine-scale habitat features was investigated. The scales were assigned to the variables based on the spatial grain at which they were collected (**Table 2**). Habitat scale features were those specific to the habitat surrounding the macro- and/or epifauna. Those collected using small cores immediately adjacent to the macrofauna cores were considered to represent fine scale habitat structure pertaining to macrofauna; those collected using video along the 20 m transect represented the habitat loosely pertaining to epifauna. Our measures of fine- and medium-scale habitat features also encompassed variation at their respective scales through multiple measurements: (i) cores were taken at 5 random positions along each 20 m transect, (ii) medium scale habitat features were also assessed at 5 random positions, each of which encompassed a 1.5 m long section of transect. Factors such as currents, light, and ice conditions were considered broad scale because these were generally more consistent at the scale of the locations within which the transect and core data were collected. We note that some of these broad scale factors will vary within locations.

Broad scale environmental variables included in these analyses were ice thickness (m), ice cover duration (mo), snow cover (m), light transmission, current speed (minimum, maximum, average), maximum water temperature, and depth. Latitude and longitude were also included in this category. Habitat scale features included medium-scale data determined from the transect video (% cover m<sup>−2</sup> sand, rock, pebble, cobble, coarse sediments with detritus), and the fine-scale information from the core samples [sediment particle size (% composition of gravel, medium sand, fine sand, mud), % organic content, and chlorophyll a, phaeophytin (µg g<sup>−1</sup>)].

We used Canonical Correspondence Analysis (CCA; ter Braak, 1986), on an untransformed average for each location and with down-weighting of rare species, with forward selection to link changes in benthic communities to local and broad-scale environmental drivers. The relationship between functional traits of macroinfauna and the multiscale suite of environmental characteristics was investigated using the same procedure.

#### RESULTS

#### Communities

#### Epifauna

Numbers of epifaunal species per site ranged from 2 to 5 m<sup>−2</sup>, with abundances lowest at Spike Cape (Site 2, 10.0 ± 3.6 m<sup>−2</sup>) and highest at Cape Evans (Site 1, 219.8 ± 31.9 m<sup>−2</sup>) (**Table 3**). The extremely high numbers at the latter site are due mostly to the echinoderm Sterechinus neumayeri. Numbers of individuals were generally lowest at the southernmost locations, New Harbor, Spike Cape and Dunlop Island, and highest at Tethys Bay south (**Table 3**). Species evenness (J′) is generally moderate (values of 0.4–0.6; **Table 3**), although at a few sites, where only one or two species are found, J′ is consequently elevated (i.e., 1.63 at Tethys Bay west Site 3; and 0.8–1.0 at Gerlache Inlet, Dunlop Island Site 3, New Harbor Site 1 and Cape Evans Site 2; **Table 3**).

TABLE 2 | Environmental variables assessed at each site, and the scales at which they were observed.

Fine scale habitat features represent sediment characteristics, and were collected using small cores (26 mm diam. × 50 mm deep); Transect scale habitat features represent seafloor substrate type, and were assessed from 1.5 m long sections of transect; Broad scale characteristics were measured at each site, and incorporate characteristics of the general surrounding area.

Across locations, the epifaunal communities were dominated by one of five taxa: the echinoderms S. neumayeri, Odontaster validus, and Ophionotus victoriae, the bivalves Adamussium colbecki and Laternula elliptica, and the sponge Homaxinella balforensis (**Table 4**). At four locations (Cape Evans, Spike Cape, Tethys Bay south and Tethys Bay west) the first and second most abundant taxa were S. neumayeri and O. validus, respectively. Within-location similarity was greatest at New Harbor and Tethys Bay south (∼74%; **Table 4**). The MDS of epifaunal community composition shows generally clear separation and tight clustering in ordination space of sites within Dunlop Island, Spike Cape and New Harbor (**Figure 2A**). Community compositions at the Gerlache Inlet and Cape Evans sites are the most variable (**Figure 2A**).

The relative within-location variability in abundance of the three common epifaunal taxa, assessed using the CoV, indicated that A. colbecki was distributed in similar numbers across sites at New Harbor (0.12), but was considerably patchier at Gerlache Inlet (1.27) (**Table 4**). Distribution of O. validus was similar across sites at Cape Evans, Dunlop Island and Tethys Bay south (0.26–0.38), and most variable at Granite Harbor, Tethys Bay west and Spike Cape (0.8–1.19). S. neumayeri distributions were similar at Tethys Bay west, Tethys Bay south and Granite Harbor (0.3–0.4) and relatively patchy at Dunlop Island (1.09).



No. species, number of species; No. inds, number of individuals; J′ , species evenness. Numbers presented are means ± SE.

#### Macrofauna

Numbers of macrofauna species ranged from 2.6 ± 0.4 core<sup>−1</sup> at Tethys Bay west Site 2, to 17.4 ± 1.94 core<sup>−1</sup> at Cape Evans Site 3 (**Table 3**). Numbers of individuals were also variable, ranging from 3.6 ± 0.93 core<sup>−1</sup> at Gerlache Inlet Site 2 to 138.6 ± 36.8 core<sup>−1</sup> at Cape Evans. Evenness was high (>0.6) at most sites (**Table 3**), indicating that these communities were not dominated by only 1–2 species. The exceptions were all three of the Gerlache Inlet sites, where evenness was relatively low (0.33–0.49; **Table 3**). Dominant taxa in the macrofaunal communities varied with location. The burrowing anemone Edwardsia sp. was common at New Harbor, Spike Cape, Dunlop Island and Granite Harbor, and the polychaete Polygordius antarcticus at five of the eight locations (**Table 5**). Polychaetes featured at all locations except Granite Harbor and Tethys Bay north, and one (Ophryotrocha ?notialis) was the dominant taxon at Gerlache Inlet (**Table 5**).

The within location similarity was low compared to that of the epifauna, reaching a maximum at Cape Evans and Granite Harbor (47 and 58%, respectively), and a low of ∼15% at Tethys Bay west and Gerlache Inlet (**Table 5**). The MDS of macrofaunal community composition shows generally clear distinctions in ordination space of sites within a location, with little overlap between locations (**Figure 2B**). The exceptions to this are New Harbor and Tethys Bay north, and Dunlop Island and Spike Cape (**Figure 2B**).
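The within-location similarity values can be thought of as an average of pairwise similarities between sites. The sketch below assumes Bray–Curtis similarity on untransformed abundances, which is the measure commonly underlying SIMPER but is not stated in this section; the function names and the abundance vectors are hypothetical.

```python
from itertools import combinations

def bray_curtis_similarity(a, b):
    """Percentage Bray-Curtis similarity between two abundance vectors."""
    shared_total = sum(a) + sum(b)
    if shared_total == 0:
        return float("nan")  # undefined when both samples are empty
    difference = sum(abs(x - y) for x, y in zip(a, b))
    return 100.0 * (1.0 - difference / shared_total)

def mean_within_location_similarity(sites):
    """Average similarity over all pairs of sites within one location."""
    pairs = list(combinations(sites, 2))
    return sum(bray_curtis_similarity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical abundances of three taxa at three sites within one location
location = [[10, 0, 5], [8, 2, 5], [10, 0, 5]]
within_similarity = mean_within_location_similarity(location)
```

A location whose sites share the same taxa in similar proportions scores close to 100%, whereas sites with few taxa in common pull the average toward the low values reported for Tethys Bay west and Gerlache Inlet.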

For all three of the functional characteristics investigated here (i.e., feeding type, motility, living position), the Granite Harbor macrofaunal communities were clearly distinguishable from those of all other sites (**Supplementary Figure 1**). Specifically, these communities were dominated by suspension feeders with limited mobility that dwell in the near-surface sediment layer (i.e., <2 cm; **Supplementary Figure 1**). The macrofaunal community traits at the three Terra Nova Bay locations (Tethys Bay west, Tethys Bay south, Gerlache Inlet) tended to be similar to one another, with mostly freely motile, surface dwelling predator/scavengers (**Supplementary Figure 1**). While the remaining (more southerly) locations showed variable trait characteristics, macrofauna at New Harbor, Cape Evans and Dunlop Island were most commonly freely motile deposit-grazers dwelling in the top 2 cm (**Supplementary Figure 1**).

#### Environmental Characteristics

The largest distinction between locations in broad scale environmental characteristics was in sea ice cover. Annual sea ice was present at all locations except New Harbor (where multi-year ice is the norm), but varies in the number of months it persists each year (**Table 1**), in thickness (range from 2.0 m at Tethys Bay west to 3.5 m at New Harbor at the time of sampling), and in the amount of snow cover on the ice (**Table 1**). Note that the latter measure of snow cover can also vary considerably from year to year at the same location (authors' pers. obs.). Currents measured at the sites at the time of sampling were fastest at Granite Harbor (3.0 ± 1.85 cm sec<sup>−1</sup>) and slowest at Tethys Bay south (1.83 ± 1.13 cm sec<sup>−1</sup>). At each location, <1% of the above ice incident light is transmitted to the seafloor below (**Table 1**).

TABLE 4 | The percentage similarity in epifaunal community composition at each location, and the dominant taxa contributing to 90% of the variability (from SIMPER analysis of mean abundance m<sup>−2</sup> determined from video analysis).

The coefficient of variation for the three epifaunal taxa common across locations is also given. CE, Cape Evans; NH, New Harbor; DI, Dunlop Island; SC, Spike Cape; GH, Granite Harbor; TBW, Tethys Bay west; TBS, Tethys Bay south; GI, Gerlache Inlet.

The video transect-habitat analyses revealed simple habitat structure at all three of the New Harbor sites (100% sand) and the three Dunlop Island sites (from 77.33 ± 8.62 to 92.50 ± 1.89% sand; **Supplementary Table 1**), and the most complex at Granite Harbor and Gerlache Inlet (mixed habitat type; **Supplementary Table 1**). The proportion of the seafloor containing sediments available to be cored was low at several sites (i.e., coarse sediment/sand categories; **Supplementary Table 1**). At this finer core-habitat scale, the soft sediments were composed predominantly of coarse sand at Cape Evans (∼80%; **Supplementary Table 2**), New Harbor (52–60%; **Supplementary Table 2**), Tethys Bay south (50–64%; **Supplementary Table 2**) and Tethys Bay west (62–75%; **Supplementary Table 2**). Dunlop Island sediments also contained a significant portion of coarse sand (38.26 ± 1.66 to 41.30 ± 5.20%), with considerable percentages of gravel, fine and medium sand. Granite Harbor sediments were more variable, with one site dominated by coarse sand (Site 1, 43.05 ± 6.05%), and Sites 2 and 3 with similar percentages of medium sand (40.20 ± 0.40 and 42.46 ± 1.18%, respectively) and equal amounts of coarse and fine sand (28.30 ± 2.17 and 27.12 ± 2.87%). The sediments at Spike Cape were of mixed particle sizes. Sediment organic content was very low at all locations, only exceeding 1% at Granite Harbor Site 1 (1.31 ± 0.48%). Sediment associated chlorophyll a was also highest at this site (6.69 ± 2.77 µg g<sup>−1</sup>). Concentrations of phaeophytin were variable, ranging from 1.60 ± 0.36 µg g<sup>−1</sup> at New Harbor Site 1, to 26.09 ± 9.66 µg g<sup>−1</sup> at Tethys Bay west Site 3 (**Supplementary Table 2**).

#### Linking Benthic Community Composition and Environmental Characteristics

In the multi-scale CCA, six variables explained 97% of the variability in epifaunal community composition between the eight locations (**Figure 3A**). These include maximum current and longitude (aligned with CCA Axis 1), and ice thickness, % composition of silt, % cover of sand, and phaeophytin. The vectors for the latter two variables are aligned with the communities at New Harbor, Dunlop Island and Gerlache Inlet (% sand), and Spike Cape, Tethys Bay west, Tethys Bay south, and Cape Evans (phaeophytin) (**Figure 3**). Maximum current speed is the variable important in separating the Granite Harbor community from those at the remaining locations (**Figure 3A**).

A mixture of seven broad scale and habitat scale variables combined to explain 95% of the variability in macrofaunal community composition (i.e., longitude, ice duration, ice thickness, % cover of sand, snow cover, % cover of sediment detritus, % sediment organic content; **Figure 3B**). The southwestern McMurdo Sound locations tend to be arranged along Axis 1, with the Terra Nova Bay sites distributed along Axis 2 (**Figure 3B**). Cape Evans and Tethys Bay west are close together, near the bottom left of the CCA (**Figure 3B**). The variables closely associated with these axes are longitude and % cover of sand (Axis 1) and ice duration and snow cover (Axis 2).

When comparing the percentage variation explained at the different scales of observation, the multi-scale CCA (described above) explained the most variability in community composition for both epifauna and macrofauna (97 and 95%, respectively), with a similar number of variables in each case (**Table 6**; **Figures 3A,B**). For epifauna, the analysis incorporating only broad scale parameters also explained a high amount of community variation (91%), while that using parameters measured at only the habitat scale explained the least (64%) (**Table 6**). For macrofaunal communities, the variation explained by the separate habitat and broad scale CCAs was similar (i.e., 78 and 73%, respectively; **Table 6**).
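The gain from combining scales can be made explicit with a small worked calculation. The sketch below uses only the percentages quoted above and expresses the improvement of the multi-scale CCA over the better of the two single-scale CCAs in percentage points; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Percentage of community variation explained, as quoted in the text above
explained = {
    "epifauna": {"habitat": 64, "broad": 91, "multi": 97},
    "macrofauna": {"habitat": 78, "broad": 73, "multi": 95},
}

def multiscale_gain(scores):
    """Percentage points gained by the multi-scale CCA over the best single scale."""
    best_single_scale = max(scores["habitat"], scores["broad"])
    return scores["multi"] - best_single_scale

gains = {group: multiscale_gain(scores) for group, scores in explained.items()}
# gains == {"epifauna": 6, "macrofauna": 17}
```

Framed this way, combining scales adds relatively little for epifauna, for which broad scale parameters alone already explain most of the variation, and considerably more for macrofauna.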

At the multi-scale, a very high percentage of variation in functional traits of macrofauna (97%) was explained by a combination of five broad and habitat-scale parameters (i.e., ice duration and thickness, % cover of sand, sediment detritus and Phyllophora; **Figure 3C**). Interestingly, none of the habitat-scale variables identified as important were those measured at the finest scale (i.e., sediment cores). Axis 2 distinguished New Harbor and Dunlop Island, both of which had abundant sand and no coarse sediment detritus. The three Terra Nova Bay locations, with the shortest ice duration (8–9 months) and motile, surface dwelling predator/scavengers, were located in the top left-hand side of the ordination. Cape Evans was distinguished by its relatively high amount of Phyllophora (**Figure 3C** and **Supplementary Table 1**) and was grouped with Dunlop Island and New Harbor, reflecting the similarity in functional traits amongst these sites (i.e., freely motile near-surface dwelling deposit-grazers).

TABLE 5 | The percentage average similarity in macrofaunal community composition at each location, the number of taxa contributing to 90% of the similarity, and the taxa found in average abundances of >1 ind. core<sup>−1</sup>.

Taxa found in average abundances of >10 ind. core<sup>−1</sup> are indicated in bold.

### DISCUSSION

This paper has described the benthic communities and environments at eight coastal Ross Sea locations. Information was collected using a nested sampling design that had previously been implemented at the three southernmost locations. This new description has provided further quantitative, baseline information on each location, as a step toward assessing future change. We describe how the epifaunal and macrofaunal communities, and the functional traits of the latter, vary between locations. A major aim was to investigate whether community characteristics are predictable based on the surrounding environmental factors, and whether the explanatory power of these factors increases with spatial scale of observation. The original analysis of only three locations (∼32 km spatial extent) concluded that low or simple habitat structure did not necessarily result in low diversity of benthic fauna; a conclusion that was supported following the addition of five more locations to the north and east (∼340 km total spatial extent). Even when using only habitat-scale variables, we could explain considerably more of the variation in community composition across our eight, more spatially extensive, sampling locations (epifauna 64%; macrofauna 78% explained; **Table 6**) than in our previous analysis of only the three southernmost locations (32 and 18%, respectively; Cummings et al., 2006). In the present study, the combination of habitat and broad scale variables explained ≥95% of the variability across all eight locations, indicating the importance of capturing environmental and biological variability across a larger scale in documenting and understanding community patterns.

FIGURE 3 | Canonical Correspondence Analysis ordinations of the relationship between habitat and broad scale environmental variables and (A) epifaunal and (B) macrofaunal community composition, and (C) macroinfaunal functional traits. The variation explained by the variables was 97, 95, and 97%, respectively. The locations are represented by different colored symbols as defined in Figure 2. Phaeo, phaeophytin; ice-thick, ice thickness; ice-dur, ice duration; sed/detritus, sediment associated detritus; org, organic content.

TABLE 6 | Summary of results of the Canonical Correspondence Analysis (CCA) conducted to link benthic community composition to habitat scale and broad scale characteristics, separately and together (multiscale).


In each case, the overall % variation explained and the variables important in explaining this variation are given. Habitat scale features include fine- and medium-scale variables assessed using cores (26 mm diam.) and quadrats (1.5 × 0.5 m), respectively. Italics are used to distinguish fine-scale habitat features (% composition) from medium-scale habitat features (% cover m<sup>−2</sup>).

#### Communities

Across all eight locations, epifaunal communities were dominated by one of three invertebrate taxa (S. neumayeri, A. colbecki, and O. validus; **Table 4**), with only 2–5 species in total. Overall abundances were generally lowest at the southwesternmost locations, New Harbor, Spike Cape and Dunlop Island, but there were no obvious patterns in species evenness (**Table 3**). These three locations receive relatively food-poor water that flows northwards along the coastline after exiting the Ross Ice Shelf (Barry and Dayton, 1988; Thrush et al., 2006; **Figure 1**). When the distribution of the three main taxa within locations was examined using CoV, all showed patchy distributions at some locations, and more even distributions at others (**Table 4**). This patchiness was not highly correlated with diversity in seafloor habitat type (as characterized from the transect-video; cf. **Table 4** and **Supplementary Table 1**). The most dominant species in the macrofaunal communities were Polygordius antarcticus, Ophryotrocha ?notialis, Edwardsia sp., Munnidae type A, and copepods. Although, like the epifauna, one or two species tended to dominate these communities, many more species also featured in reasonable abundances (**Table 5**). Surface dwelling, freely motile macrofaunal taxa characterized the communities at the three Terra Nova Bay locations; at Gerlache Inlet and Tethys Bay south these were predator/scavengers, and at Tethys Bay west they were deposit feeder/grazers (**Supplementary Figure 1**). No patterns were apparent for the other locations (**Supplementary Figure 1**).

#### Linking Communities to Environment

Despite the obvious differences in environmental characteristics between some of the locations in this study (for example, New Harbor, where sea ice can persist for a decade and the seafloor is predominantly sandy, vs. Gerlache Inlet, where the ice breaks out annually and the seafloor habitat is heterogeneous), similarities in community composition exist that are often greater than those at locations with more similar environmental characteristics. In this example, epifaunal communities at both locations have high numbers of suspension feeding A. colbecki (**Table 4**), and the macrofaunal communities are dominated by polychaetes (**Table 5**). Surface dwelling freely motile macrofauna were the most common at both locations; however, feeding traits differed (deposit-grazers at New Harbor, predator/scavengers at Gerlache Inlet; **Supplementary Figure 1**).

Formally examining benthic community composition and its relationship to environmental characteristics did, however, reveal some interesting patterns. In the epifaunal MDS and CCAs, the three locations with A. colbecki amongst their most dominant taxa, New Harbor, Dunlop Island and Gerlache Inlet, were located together in ordination space (**Figures 2A**, **3A**). Of the environmental variables important in this separation, ice thickness and % cover of sand aligned with these communities in the CCA (**Figure 3A**). Maximum current speed is the variable important in separating the Granite Harbor epifaunal community (dominated by S. neumayeri, L. elliptica, and O. validus) from those at the remaining locations (**Figure 3A**). A total of six variables explained 97% of the variation in epifaunal community composition across the eight locations (longitude, phaeophytin, ice thickness, % cover of sand, maximum current, % composition of silt; **Figure 3A**). A similarly high percentage of the variation was explained for macrofaunal community composition (95%; longitude, ice duration, ice thickness, % cover of sediment detritus and sand, snow cover, % organic content; **Figure 3B**), and macrofaunal functional traits (97%; ice duration and thickness, sand, sediment detritus, Phyllophora; **Figure 3C**). While this explanatory power is extremely high, for both the epifaunal and macrofaunal community analyses, one of these variables was longitude, a likely surrogate for an environmental factor (or factors) not included in this analysis.

One of our aims in describing the functional traits of macrofaunal communities was to identify patterns that may reflect specific site characteristics. For example, an abundance of benthic suspension feeders could reflect a food-rich water column, while more exposed/disturbed sites may be characterized by motile species. However, such simple relationships were not always apparent: while the low abundance of macrofaunal suspension feeders at water column food-poor New Harbor supports this idea, suspension feeders did not predominate at Gerlache Inlet, a location adjacent to the phytoplankton-rich Terra Nova Bay polynya (**Supplementary Figure 1**). Potentially, the general nature of the trait categories we used (e.g., predator/scavenger; deposit/grazer) has made identification of clear correlations difficult. Alternatively, and perhaps more likely, it could be due to the generally facultative or omnivorous nature of Antarctic fauna (e.g., Norkko et al., 2007; Wing et al., 2012). In the latter case, the expression of traits by a particular species may differ between locations, or along environmental gradients, perhaps in response to differential food availability or competition with other fauna, thus confusing the signal (e.g., Concepción et al., 2017; Gianuca et al., 2017). Indeed, the plasticity of feeding strategies of epifauna found at some of these same locations has already been noted (Norkko et al., 2007). A clearer understanding of function at the species and community scale is needed to characterize the capacity for functional redundancy, and therefore the inherent response diversity, of these communities, and so to help predict their likely resilience to environmental change. Incorporating other types of natural history information which have not been considered here, such as competitive interactions within and between species, will also be important in determining and understanding community patterns (e.g., Dayton, 1989).

#### Which Environmental Factors, and at Which Scale?

Broad scale environmental variables were more important drivers of community composition for epifauna, while habitat scale variables were most important for macroinfauna (**Table 6**). However, in both community types, environmental information at multiple scales (habitat + broad scale parameters combined) explained more variability in community composition than when either habitat or broad scale parameters were considered on their own (**Table 6**). This improvement in explanatory power with increasing scale was considerably greater for macrofauna than for epifauna (**Table 6**). Nevertheless, even for epifauna, multiscale factors explained 6 percentage points more than was explained by broad scale variables alone, and three of the six important variables in the multiscale analysis were habitat scale variables (phaeophytin, % composition of silt, % cover of sand; **Table 6**). For both macro- and epifaunal communities, the high number and variety of variables identified as influencing community composition illustrate the complexity of the system, and the inherent difficulty of predicting community composition based on simple (few factors) environmental information (e.g., Gutt, 2007 and references therein). These results further highlight the influence of environmental factors across multiple scales on structuring benthic communities.

The structuring influence of sea ice conditions, and of their control of light availability and productivity, on benthic community structure and function is paramount and indisputable in this region (e.g., Thrush and Cummings, 2011; Fountain et al., 2016). Environmental factors reflective of sea ice conditions were important variables explaining variation in community composition and functional traits (i.e., ice thickness for epifauna; ice thickness and duration and snow cover for macrofauna; ice thickness and duration for macrofaunal functional traits; see **Figure 3**). This suggests that an indirect space-for-time substitution type approach, using gradients in these factors which are anticipated to change with warming, could be successfully used to predict ecosystem consequences under future scenarios. Nevertheless, the large number of other biophysical variables, processes and interactions contributing to structuring these communities signifies the complexity of the processes acting across this region. While we detected clear differences in environmental conditions between New Harbor and the considerably more productive and habitat diverse Gerlache Inlet, as already discussed there were also strong similarities in their epifaunal communities. More information is required to (i) understand interactions between structuring variables (including those not measured here), and (ii) incorporate temporal variability in these factors in the analysis.

### Future Variables to Consider

There are several broad scale variables not considered in this analysis for which longitude could be a placeholder: these include disturbance by ice (Dayton, 1989; Gutt, 2001; Smale, 2007) and the wind- and wave-induced disturbance likely to be experienced during the (relatively short) ice free periods in summer. Characterizing hydrodynamic conditions at different scales (site, location, bay; coastal, and open ocean connectivity; e.g., Gutt et al., 2018) will be crucial to understanding potential sources of food and larvae, freshening and ice formation processes going forward (e.g., Hauquier et al., 2015). Associated with circulation, proximity to productive polynyas, phytoplankton blooms and their influences on supply of food are also not considered here, but will be very important in sustaining benthic communities (Dayton and Oliver, 1977; Thrush et al., 2010), and may underlie similarities in community patterns between locations. Although we have "estimated" productivity via sediment-associated measures of benthic chlorophyll a and phaeophytin, and light availability via PAR, other measures are needed to more explicitly link productivity sources from the sea ice and the water column to the benthos. Information on biogeochemical characteristics of the water column is also necessary, particularly as ocean acidification is anticipated to be one of the most important and widespread changes facing Antarctic waters (e.g., Fabry et al., 2009; Constable et al., 2014; Gutt et al., 2015). Freshening, from melt of ice shelves and glaciers and elevated inputs from terrestrial stream flows (e.g., Grange and Smith, 2013; Sahade et al., 2015; Dayton et al., 2016) can also be expected to influence biogeochemistry, particularly in these coastal regions. More data are needed on variability in these key factors across multiple spatial and temporal scales to elucidate the relative importance of the various factors from a community and/or population perspective (cf. Emslie et al., 2007).

### Temporal Variation

The data presented here represent a snapshot in time of the epi- and macrofaunal benthic communities at each of our study locations. The influence of variation in environmental factors over time (seasonal, annual, and greater-than annual scales), particularly for ice conditions and productivity, has not been incorporated in our analyses, yet would be expected to have a strong influence on the benthic communities we documented at each sampling point (e.g., Dayton, 1989; Thrush and Cummings, 2011). While our study has explained virtually all the variability in community composition using our suite of environmental drivers, rapid or unexpected change to one (or several) of these parameters could deconstruct these relationships. For example, settlement and growth of a large hexactinellid sponge, absent in the preceding 22 years, was noted in correlation with a regionwide shift in phytoplankton productivity associated with the calving of a massive iceberg (Dayton et al., 2013). Such a change in community dominance resulting from episodic recruitment may be a good indicator of some environmental disruption having occurred. Although biological processes amongst Antarctic benthos are considered slow, this same study, where observations were made over a 40 year time period, suggests that population increases and decreases may occur over decade rather than century time periods, emphasizing the need to re-evaluate ideas of slow processes and stability over century time scales (Dayton et al., 2013), and, likewise, their response to change. As climate change will result in direct and indirect effects across the food chain through alteration of bottom up and top down processes (e.g., Constable et al., 2014), more detailed natural history information on species and communities of interest is needed to predict likely change and sensitivities (e.g., Ricklefs, 2012). The importance of repeated and consistently made observations of biological communities, and biologically relevant environmental parameters (at appropriate space and time scales) in understanding ecosystem structure and function cannot be overstated, particularly when the aim is to detect responses to, and predict implications of, future change (e.g., Ducklow et al., 2013; Sahade et al., 2015; Obryk et al., 2016; Barnes, 2017).

# CONCLUDING COMMENTS

We consider that the data presented here can act as a valuable baseline against which future patterns can be assessed, and that it will be useful in the choice of best/most representative sites in the coastal Ross Sea for longer term studies. However, our aims in this analysis were not to merely describe existing benthic communities, habitats, and environmental conditions. Instead we hoped to define variables important for driving community and habitat change and to determine whether, once these were established, natural gradients in key parameters could be used to predict future changes associated with climate warming. Our results suggest that this indirect space-for-time surrogacy approach, incorporating broad scale variables likely to be affected in future, may well be usable, but that the role of small-scale habitat features would also need to be incorporated for such predictions to be truly useful. When long term, consistent and repeated observations of coastal benthic communities are combined with measures of multiscale environmental characteristics likely to be important in influencing functioning and in structuring distributions of species, populations and communities, these observations become immensely powerful in understanding natural variability, detecting responses to and attributing causes of change.

# AUTHOR CONTRIBUTIONS

VC participated in all Antarctic sampling events, led the 2003–2007 field programmes, coordinated sample processing, analyses (sample identification, statistics) and planned and wrote this manuscript. AN led the 2001–2002 field programmes, and ST and PM were key participants in all field events. JH conducted the statistical analyses. PM analyzed the video transect footage, while NH contributed with sample identifications and figure preparation. All authors contributed to the design of the sampling protocols and the writing of the paper.

# FUNDING

This research was funded by the New Zealand Ministry of Fisheries (ZBD2001/02, ZBD2002/01, ZBD2006/03), the New Zealand Ministry for Business, Innovation and Employment (COX01707), and the National Institute of Water and Atmospheric Research (NSOF, COMS 1406).

#### ACKNOWLEDGMENTS

We thank the many scientists and divers involved in the field components of this research (Neil Andrew, Rod Budd, Scott Edhouse, Greig Funnell, Ian Hawes, Drew Lohrer, Steve Mercer, Peter Notman, Anne-Marie Schwarz), NIWA Hamilton laboratory staff for sample processing, particularly Greig Funnell for video analysis and identifications, and Geoff Read for polychaete identifications. We are very grateful to Antarctica New Zealand for their excellent logistics support, without which this work would not be possible. Programma Nazionale Di Ricerche in Antartide are thanked for their support in Terra Nova Bay. Finally, we thank the reviewers for their constructive comments on this manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmars.2018.00232/full#supplementary-material

Supplementary Figure 1 | The proportion of the macrofaunal community exhibiting different types of functional traits at each site and location. (A) Feeding mode, (B) motility, (C) living position.

Supplementary Table 1 | Habitat characteristics at the transect-video scale (mean ± SE). The coarse sediment category also contained detrital organic material.

Supplementary Table 2 | Habitat characteristics at the sediment core scale (mean ± SE). CE, Cape Evans; NH, New Harbor; DI, Dunlop Island; SC, Spike Cape; GH, Granite Harbor; TBW, Tethys Bay west; TBS, Tethys Bay south; GI, Gerlache Inlet.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Cummings, Hewitt, Thrush, Marriott, Halliday and Norkko. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Two Recent Massive Breeding Failures in an Adélie Penguin Colony Call for the Creation of a Marine Protected Area in D'Urville Sea/Mertz

Yan Ropert-Coudert<sup>1</sup>\*, Akiko Kato<sup>1</sup>, Kozue Shiomi<sup>2</sup>, Christophe Barbraud<sup>1</sup>, Frédéric Angelier<sup>1</sup>, Karine Delord<sup>1</sup>, Timothée Poupart<sup>1,3</sup>, Philippe Koubbi<sup>4</sup> and Thierry Raclot<sup>5</sup>

#### Edited by:

Anton Pieter Van de Putte, Royal Belgian Institute of Natural Sciences, Belgium

#### Reviewed by:

Brett W. Molony, Department of Primary Industries and Regional Development of Western Australia (DPIRD), Australia

Alastair Martin Mitri Baylis, South Atlantic Environmental Research Institute, Falkland Islands

Bruno Danis, Free University of Brussels, Belgium

#### \*Correspondence:

Yan Ropert-Coudert yan.ropert-coudert@cebc.cnrs.fr

#### Specialty section:

This article was submitted to Marine Fisheries, Aquaculture and Living Resources, a section of the journal Frontiers in Marine Science

Received: 11 December 2017 Accepted: 16 July 2018 Published: 06 August 2018

#### Citation:

Ropert-Coudert Y, Kato A, Shiomi K, Barbraud C, Angelier F, Delord K, Poupart T, Koubbi P and Raclot T (2018) Two Recent Massive Breeding Failures in an Adélie Penguin Colony Call for the Creation of a Marine Protected Area in D'Urville Sea/Mertz. Front. Mar. Sci. 5:264. doi: 10.3389/fmars.2018.00264

<sup>1</sup> Centre d'Etudes Biologiques de Chizé, UMR7372 CNRS-Université La Rochelle, Villiers en Bois, France, <sup>2</sup> National Institute of Polar Research, Tokyo, Japan, <sup>3</sup> School of Life and Environmental Sciences, Faculty of Science & Technology, Deakin University, Burwood, VIC, Australia, <sup>4</sup> Unité Biologie des Organismes et Ecosystèmes Aquatiques (BOREA, UMR 7208), Sorbonne Universités, Muséum National d'Histoire Naturelle, Université Pierre et Marie Curie, Université de Caen Basse-Normandie, CNRS, IRD, Paris, France, <sup>5</sup> Institut Pluridisciplinaire Hubert Curien, Département Ecologie, Physiologie et Ethologie, UMR 7178, CNRS-UDS, Strasbourg, France

In the d'Urville Sea in East Antarctica, a population of roughly 20,000 pairs of Adélie penguins of Ile des Pétrels (Terre Adélie) has experienced two massive breeding failures, with no chick surviving the 2013–14 and 2016–17 breeding seasons. In both seasons the extent of sea ice in front of the colony persisted throughout the breeding cycle of the birds. The timing of sea-ice recession differed greatly between seasons, and the absence of a polynya in a crucial phase of the cycle was paramount in driving these failures. The change in the icescape in front of Ile des Pétrels following the calving of the Mertz glacier in 2010, together with an increase in precipitation and changes in sea-ice firmness, explains this situation and is discussed in the present manuscript. To prevent additional future impacts on this colony, such as competition with fisheries, we strongly support a scientific research zone in the d'Urville Sea/Mertz area, one of the three zones of the Marine Protected Area in East Antarctica proposed to the Commission for the Conservation of Antarctic Marine Living Resources.

Keywords: eco-indicating species, Sea ice, breeding, foraging, marine protected areas, seabirds, extreme events

# INTRODUCTION

Antarctica and its surrounding ocean are at the forefront of global environmental change. Freshwater ice on land and sea ice are strongly affected by the warming of the air and sea, and this has drastic consequences for the biotic components of Antarctic ecosystems (Constable et al., 2014). A large number of mid-trophic level species that play pivotal roles in the food webs rely on sea ice for their embryonic development, food and breeding, and are the main prey species of marine birds and mammals. As such, monitoring the biology of Antarctic meso- and top predators has been successfully used as an eco-indicator of environmental changes (Hindell et al., 2003).

Adélie penguins (Pygoscelis adeliae) are often considered good indicators of environmental changes because their ecology is closely related to the state of sea ice (Ainley, 2002). Recently, satellite measurements have led to a new estimate that suggests an increase in the total number of Adélie penguins, as well as in the number of colonies across the continent (Lynch and LaRue, 2014). Using this technique and ground-based surveys, Adélie populations have been shown to be generally in decline in the Antarctic Peninsula, while other populations are stable or slightly increasing (Cimino et al., 2013; Lyver et al., 2014; Southwell et al., 2015). The use of satellite counts has contributed to the downgrading of the species on the IUCN scale from "Near Threatened" to "Least Concern" in 2017. However, this downgrading at a continental scale may mask major local disparities and future population trajectories: stable and/or slightly increasing populations are found where sea ice has increased and the opposite where sea ice is melting. With projected continuous warming, the increasing sea-ice trend is not expected to last.

In the d'Urville Sea of the East Antarctic sector, a well-studied region in terms of eco-regionalisation (Koubbi et al., 2011), the situation of Adélie penguins in Terre Adélie has rapidly deteriorated over the last 6 years. This population experienced a catastrophic breeding season with no chick surviving out of 20,196 breeding pairs in the 2013–14 breeding season (Barbraud et al., 2015; Ropert-Coudert et al., 2015). This had never been recorded over the 36 years that the Adélie colony was monitored, nor to such an extent in other locations around the continent. The physical factors that led to this massive failure, namely a larger than usual sea-ice extent and rain episodes, have recently been described for Adélie penguins (Barbraud et al., 2015; Ropert-Coudert et al., 2015). Here, we report on a second massive breeding failure for the same Adélie penguin colony that took place in the 2016–17 season. We aim to highlight the similarities and differences in the sea-ice and meteorological conditions that could have caused the two massive failures, and compare the impact these had on the foraging activity and reproductive output of the birds. Such unusual events, in terms of both the size of the population affected and the increasing frequency at which they occur, could be indicative of the start of massive changes in the environment and have specific relevance to the current discussion around the establishment of a Marine Protected Area in East Antarctica, which includes the D'Urville Sea/Mertz region.

#### MATERIALS AND METHODS

The study was conducted on Adélie penguin colonies on Ile des Pétrels, near the Dumont d'Urville station (66°40′S, 140°01′E). Female Adélie penguins lay one or two eggs in mid-November and both parents participate in the reproductive effort by alternating foraging trips at sea, where they feed primarily on two species of krill (Euphausia superba and E. crystallorophias) and Antarctic silverfish (Pleuragramma antarctica) (Ropert-Coudert et al., 2002), with presence on the nest to keep the eggs/chicks warm and protected from predators. When the chicks are developed enough and display thermoregulatory abilities, parental trips to sea become irregular and the chicks form crèches, until they fledge in early March (Ainley, 2002).

#### Biological Data

Breeding success data were collected between 1993 and 2017 as part of the program 109 of the French Polar Institute (IPEV), on the colonies of Ile des Pétrels, which ranged from 10,849 to 20,957 breeding pairs over these 24 seasons. Active nests (with at least one egg) were counted in November, and the number of chicks that were still alive was counted in February. The breeding success was calculated as the number of chicks in February divided by the number of active nests in November (see Jenouvrier et al., 2006 for more details).
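The breeding success ratio defined above can be written as a short function. This is a minimal sketch in Python rather than the authors' own workflow; the function name is hypothetical, and it assumes that the numbers of breeding pairs quoted in the Results equal the numbers of active nests counted in November.

```python
def breeding_success(chicks_in_february: int, active_nests_in_november: int) -> float:
    """Chicks alive in February divided by active nests counted in November."""
    if active_nests_in_november <= 0:
        raise ValueError("at least one active nest is required")
    if chicks_in_february < 0:
        raise ValueError("chick count cannot be negative")
    return chicks_in_february / active_nests_in_november


# Counts quoted in the Results for the two failed seasons
# (assumption: breeding pairs == active nests in November).
success_2013_14 = breeding_success(0, 20196)
success_2016_17 = breeding_success(2, 18163)
print(f"2013-14: {success_2013_14:.4f}, 2016-17: {success_2016_17:.4f}")
```

Because adélie females lay one or two eggs, the ratio can in principle exceed 1, so the function does not cap its result.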

Foraging activity of Adélie penguins was monitored within the framework of the IPEV program 1091 through the deployment of miniaturized GPS devices (CatLog™ GPS, Catnip Technologies, USA, and AxyTrek™ GPS-accelerometry loggers, Technosmart, Italy). These devices were temporarily attached to the back feathers of the penguins using marine tape (Wilson et al., 1997) and recorded the position of the birds every 15–30 min during the long trips of the incubation phase, and every 3 min during the shorter trips of the chick-rearing phase. Location data were processed following the methods described in Widmann et al. (2015), which showed that, typically, females travel at sea for 15 days after laying the eggs, followed by the foraging trip of males, which lasts between 10 and 15 days. The egg hatches after the return of the male. Thereafter, foraging trips become shorter, ranging from a few hours to 3–4 days during the chick-rearing phase. Foraging trip duration (days) and maximum distance (km) from the colony were calculated for each trip. Adults were weighed to the nearest 10 g at their departure from (initial body mass) and their return to their nest. The difference between these two body masses was calculated to give the body mass change for birds in incubation. Body mass changes were not calculated during chick-rearing, as we could not ascertain whether the adults had already fed chicks when recaptured at the nest, and only initial body mass is thus shown. Statistical analyses were conducted in R (R Core Team, 2016). Analysis of variance with post-hoc Bonferroni multiple comparison testing was used for each stage when more than three seasons were compared.
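The two trip metrics named above, duration and maximum distance from the colony, could be derived from a series of GPS fixes roughly as follows. This is an illustrative Python sketch, not the processing chain of Widmann et al. (2015): the function names are hypothetical, distances are great-circle distances on a spherical Earth (an assumption), and the sample fixes are invented.

```python
import math
from datetime import datetime, timedelta

# Dumont d'Urville colony position given in the text (66°40'S, 140°01'E).
COLONY_LAT = -(66 + 40 / 60)
COLONY_LON = 140 + 1 / 60
EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_metrics(fixes):
    """Return (duration in days, maximum distance from the colony in km).

    fixes: list of (datetime, latitude, longitude) tuples for one trip,
    sorted by time.
    """
    if len(fixes) < 2:
        raise ValueError("a trip needs at least two fixes")
    duration_days = (fixes[-1][0] - fixes[0][0]).total_seconds() / 86400
    max_km = max(haversine_km(COLONY_LAT, COLONY_LON, lat, lon) for _, lat, lon in fixes)
    return duration_days, max_km


# Invented fixes: leave the colony, reach a point 1 degree north, come back.
start = datetime(2015, 11, 20)
example_trip = [
    (start, COLONY_LAT, COLONY_LON),
    (start + timedelta(days=5), COLONY_LAT + 1.0, COLONY_LON),
    (start + timedelta(days=10), COLONY_LAT, COLONY_LON),
]
duration, max_distance = trip_metrics(example_trip)
print(f"{duration:.1f} days, {max_distance:.1f} km")
```

Note that duration here runs from the first to the last fix, so a logger that stops recording before the bird returns would yield an underestimate of both metrics.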

#### Environmental Data

Meteorological data (monthly air temperature and direction of prevailing winds in degrees from the north) between 1956 and 2017 were downloaded from the British Antarctic Survey website (https://legacy.bas.ac.uk/met/READER/). The percentage of sea-ice cover was calculated as the percentage of the area where sea-ice concentration was more than 15%, within an area between 135 and 145°E and between 61 and 67°S (foraging range of Adélie penguins at Ile des Pétrels, Widmann et al., 2015) and over a period ranging from 1992 to 2017, using the data provided by the Institut Français de Recherche pour l'Exploitation de la Mer (ftp://ftp.ifremer.fr/ifremer/cersat/products/gridded/psiconcentration/data/antarctic/). Sea-ice extent was calculated as the distance (km) from the colony to the nearest open water (sea-ice concentration < 15%). A polynya was defined as a zone of open water enclosed in sea ice. Summer sea-ice characteristics (firmness) around the colony were observed several times a week on foot, and by helicopter for distant sites (up to the ice edge).
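The two sea-ice metrics defined above can be sketched in Python as follows. This is a hedged illustration only: the function names are hypothetical, grid cells are assumed to cover equal areas, distances are great-circle distances on a spherical Earth, and the example grid is invented rather than taken from the IFREMER product.

```python
import math

OPEN_WATER_THRESHOLD = 15.0  # % sea-ice concentration, the cut-off used in the text
EARTH_RADIUS_KM = 6371.0     # assumption: mean radius of a spherical Earth


def percent_ice_cover(concentrations):
    """Percentage of grid cells with concentration above the threshold.

    Assumes equal-area cells; None marks land or missing data and is skipped.
    """
    valid = [c for c in concentrations if c is not None]
    if not valid:
        raise ValueError("no valid sea-ice cells")
    iced = sum(1 for c in valid if c > OPEN_WATER_THRESHOLD)
    return 100.0 * iced / len(valid)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_open_water_km(colony_lat, colony_lon, cells):
    """Distance from the colony to the nearest cell with concentration < 15%.

    cells: iterable of (lat, lon, concentration). Returns None if no cell
    qualifies as open water.
    """
    distances = [
        great_circle_km(colony_lat, colony_lon, lat, lon)
        for lat, lon, conc in cells
        if conc is not None and conc < OPEN_WATER_THRESHOLD
    ]
    return min(distances) if distances else None


# Invented transect of cells running north from a hypothetical colony at 66°S, 140°E.
grid = [(-66.0, 140.0, 95.0), (-65.5, 140.0, 60.0), (-65.0, 140.0, 10.0), (-64.5, 140.0, 0.0)]
cover = percent_ice_cover([conc for _, _, conc in grid])
extent = distance_to_open_water_km(-66.0, 140.0, grid)
print(f"cover = {cover:.1f}%, distance to open water = {extent:.1f} km")
```

Following the definitions in the text literally, a cell at exactly 15% counts neither as ice-covered nor as open water. A nearest-cell search of this kind also does not distinguish a polynya from the open sea, which would require checking whether the open-water cell is enclosed by ice.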

#### RESULTS

Breeding success varied greatly over the 1993–2017 period but declined from 2011 onward, with values below 0.4 and reaching zero, or near zero, in 2 years; the exception was the 2015–16 season, which showed a breeding success around 0.9. Indeed, breeding success was zero for the 20,196 breeding pairs in the 2013–14 season and only two chicks survived out of the 18,163 breeding pairs in the 2016–17 season (**Figure 1**). In the remainder of the analyses we focus on comparing the conditions between the zero success years (2013–14 and 2016–17 seasons) and the high breeding success season of 2015–16. We chose this season because its breeding success is exceptionally high in comparison with other recent years (post 2010 and Mertz glacier calving, see Discussion) and as such causes and consequences are expected to be exacerbated (see Ropert-Coudert et al., 2009 for a similar test case). In addition, 2015–16 is the only year of high breeding success for which we have tracking data.

The 2013–14 season showed the largest sea-ice extent and cover in our dataset (since 1992) at the beginning of the breeding season in November (**Figures 2A,B**). Interestingly, 2016–17 had the smallest sea-ice extent at the beginning of the breeding season, but both years kept a larger than usual extent later in the summer. While it was compact in 2013–14, the sea ice in January was sherbet-like in 2016–17 and did not break up for the whole summer. No polynya opened in the vicinity of the colony in 2013–14 and 2016–17, and the closest open water was 80–90 km away. In contrast, a polynya started to open in front of the colony in November 2015 and connected with the open sea by January 2016. Air temperatures in the two zero years were in the upper range and 2016–17 saw the warmest November in decades (**Figure 2C**). Wind direction over the 3 years considered here shows a clear shift in the regime, as the winds were blowing mainly from a more easterly direction than before (**Figure 2D**).

In the 2013–14 season, following the large extent of sea ice, birds traveled further during trips of longer duration at all breeding stages (**Table 1**, **Figure 3**). In 2015–16, the maximum distances reached during incubation by females did not differ from those of the 2013–14 season but the trips were shorter in time. During incubation by males and chick rearing, the distances traveled and the time spent at sea in 2015–16 were considerably less than during the 2013–14 season. During the 2016–17 season, incubating birds did not travel as far as those in the 2013–14 season but their trips were as long as in the previous zero year. Interestingly, duration and distance of foraging trips varied substantially over the 2016–17 chick-rearing phase. Birds could only access cracks and crevasses in the ice near the breeding colony to find food. Parents were observed provisioning chicks, and the reddish color of the feces suggested they found krill within a short radius around the colony. Yet, these resources lasted only for a few days and distances covered apparently increased again, as birds stopped returning to their nests or returned only after protracted periods of time (up to 20 days for those recaptured).

The body mass gained during foraging trips by incubating females (F = 16.7, P < 0.001, **Figure 4A**) and males (F = 31.0, P < 0.001, **Figure 4B**) was greater in the 2015–16 season compared with both the 2013–14 and 2016–17 seasons, which did not differ significantly. The initial body mass of chick-rearing birds was different among seasons: it was the heaviest in 2015–16 and the lightest in 2013–14, with 2016–17 being intermediate (F = 65.2, P < 0.001, **Figure 4C**).
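For readers unfamiliar with the F statistics reported above, the following Python sketch shows how a one-way analysis of variance compares group means, together with the Bonferroni-adjusted significance level for pairwise follow-up tests. The authors ran their analyses in R; this is not their code, the function names are hypothetical, and the sample values are invented placeholders rather than data from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n > k")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Invented placeholder values for three seasons (NOT measurements from the study).
season_a = [1.0, 2.0, 3.0]
season_b = [2.0, 3.0, 4.0]
season_c = [5.0, 6.0, 7.0]
f_value, df1, df2 = one_way_anova_f([season_a, season_b, season_c])
print(f"F({df1}, {df2}) = {f_value:.1f}; pairwise alpha = {bonferroni_alpha(0.05, 3):.4f}")
```

With three seasons there are three pairwise comparisons, so a nominal 0.05 level becomes roughly 0.0167 per comparison. Converting F to a P value requires the F distribution, which is left to a statistics package.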

#### DISCUSSION

Sea-ice distribution contributed to shaping the foraging and breeding performances of Adélie penguins, as has been reported in other locations around the Antarctic (Emmerson and Southwell, 2008). The reasons for the two catastrophic years recorded in Terre Adélie differ, but in both seasons the extent of the sea ice in front of the colony was greater than in other years and, most importantly, persisted throughout the season. This abundance of sea ice could be explained by several, non-exclusive, factors. For instance, in mid-February 2010 the massive B9B iceberg coming from the Ross Sea collided with the Mertz Glacier tongue, east of Dumont d'Urville (Young et al., 2010). The expected consequences of such a massive change in the icescape included, among other things, a decrease in polynya activity (Tamura et al., 2012). As several icebergs, freed from the collision, anchored themselves in the shallow bay in front of the colony of Dumont d'Urville, they created a network of "pillars" onto which sea ice could form and be retained (Massom et al., 2009). In parallel, summer temperatures rising above the seasonal norms also melted the freshwater ice of the icebergs and the continent, which then flowed toward the sea where it froze again on contact with the −1.8°C seawater (https://www.nasa.gov/content/goddard/antarctic-sea-ice-reaches-new-record-maximum). In addition, the main bearing of the winds has changed: instead of blowing from the mainland and detaching ice from the continent, the winds now blow more from the east and tend to concentrate the sea ice in front of the archipelago of Pointe Géologie (Massom et al., 2009). Finally, linked to the increased temperature and change in wind regimes, increased precipitation (including rain) in 2013–14 further aggravated the conditions for the Adélie penguins: besides the flooding of unprotected nests, malnourished, wet chicks with their non-waterproof plumage suffered from the cooling power of the wind, leading to massive failure events (Ropert-Coudert et al., 2015).

FIGURE 2 | Monthly distance to the nearest open water from the colony (A) and the percentage of ice-covered area between 135 and 145°E, and between 61 and 67°S from 1992 to 2017 (B). Monthly average air temperature in °C (C) and direction of prevailing winds in degrees from the north (°) (D) observed at the Dumont d'Urville station for the years 1956–2017 (months on the x-axis). The years 2013–14, 2015–16, and 2016–17 are in orange, blue and magenta, respectively, while other years are in gray.

TABLE 1 | Mean values (±SD) of durations and maximum distances reached during foraging trips of Adélie penguins, by year and by phase of the reproductive cycle.

Sample size (number of individuals) is given between brackets. Values with the same superscript letters do not differ significantly. Maximum distances in 2016–17 are not used because 8 GPSs stopped recording before the return of the birds due to battery exhaustion.

If sea ice is the principal reason behind the failures, there are major differences between the two "zero years": one is the extent of the sea ice early in the season, and the other is the state of the sea ice, although the role of the latter deserves further investigation and repeatable measurement. The differences between the 2 years when the breeding success was zero show that it is imperative to monitor the changes in bird activity throughout the season. Here, sea-ice extent during incubation in 2015–16 (high breeding success) was similar to that of 2016–17, but it subsequently decreased and a large polynya opened directly opposite the colony. The opening of a polynya in the vicinity of the colony has been shown to be crucial to the breeding success of the penguins from Dumont d'Urville, as it allowed parents to access food quickly and return rapidly to their nest to feed the growing chicks (Widmann et al., 2015). In other words, a very small sea-ice extent in October does not necessarily mean that the breeding success will be high, especially if, as in 2016–17, the sea ice turns into a glutinous sherbet that does not break with the offshore swell, preventing access to open water and reducing net food availability. These effects are visible in the lower body mass gains during incubation in the two zero years and the lower body mass of adults during chick rearing.

FIGURE 3 | Foraging tracks of Adélie penguin females (orange) and males (pink) during their first incubation trips, and males and females during chick rearing (yellow) superimposed on maps of sea-ice concentration (Dec. 1st of each year) from open water (navy blue) to maximum ice concentration (lighter shades of blue), for (A) 2013–14, (B) 2015–16, and (C) 2016–17. The periphery of the continent is indicated by a black border. The colony is indicated by a red dot.

In the light of the above, long-term (> 10 years) and fine-scale (combining at-sea tracking of foraging with on-land monitoring of breeding activities) observations are more than ever paramount to our understanding of the resilience of species to the growing pressure of environmental changes, but also to chaotic events and their consequences (Lescroël et al., 2014). These are some of the pressing questions highlighted by the 1st Horizon Scan of the Scientific Committee on Antarctic Research (Kennicutt et al., 2014). Our long-term observatories on the foraging and breeding activities of Adélie penguins contributed to demonstrating the fragility of penguin populations and of the ecosystem they depend upon. Recent studies in the Ross Sea highlighted the existence of an optimal sea-ice condition for Adélie populations (Ainley, 2002; Ballard et al., 2010). Here, we further suggest that this optimum may be narrower than expected, but also that the timing of sea-ice recession and the characteristics of the sea ice may play a major role in shaping the foraging and breeding successes of Adélie penguins (e.g., Lescroël et al., 2014).

In summary, the Adélie penguin population of Ile des Pétrels can now be considered as a population under severe environmental constraints and, based on current climatic projections of a global increase in air temperature (IPCC, 2014), this may continue for several years. Although difficult to anticipate, the absence of two generations at a 3-year interval will most certainly lead to changes in the dynamics of this population, which was increasing until now. In Terre Adélie, the breeding success of the emperor penguin is also very closely related to the extent of the fast ice attached to the Antarctic continent, partly for the same reasons as for the Adélie penguin (but on another time frame): the greater this extent, the less successful the breeding, especially if no polynya forms in the vicinity of the colony (Barbraud and Weimerskirch, 2001; Massom et al., 2009; Barbraud et al., 2015). For example, in 2013, 2014, and 2016, the mortality of emperor penguin eggs and chicks was as high as 0.89, 0.97, and 0.73, respectively (proportion of eggs laid in May of the year that died before fledging in December of the same year, C. Barbraud, unpublished data).

Given the aforementioned environmental constraints already imposed on the populations of the archipelago of Pointe Géologie, we strongly support the planning of a Marine Protected Area in East Antarctica. Three areas are proposed by the Australian and the EU delegations to the Commission for the Conservation of Antarctic Marine Living Resources (https://www.ccamlr.org/): the d'Urville Sea – Mertz, Drygalski and Mac Robertson regions. For the d'Urville Sea – Mertz region, there are multiple specific objectives to protect representative areas of pelagic and benthic biodiversity, including essential habitats for mid-trophic level species (krill species and Antarctic silverfish) and important bird areas. In October 2017, the proponents of the East Antarctic MPA proposal indicated to CCAMLR that, for the d'Urville Sea – Mertz region, a no-take zone for Antarctic krill fisheries should be proposed to at least protect the summer foraging range of the Adélie population and the emperor breeding ground of Ile des Pétrels, and to allow long-term monitoring of the region until clear trends in marine bird populations are obtained. While an MPA is no solution to global environmental changes, it removes some of the additional impacts (e.g., fisheries) that could weigh on populations. Predictive modeling is a powerful conservation tool to anticipate how populations may react to expected changes (e.g., Jenouvrier et al., 2015) but such approaches require data like those presented here to be obtained in a system with no additional pressure. Protecting the waters of East Antarctica would mean the creation of a large-scale MPA in the Southern Ocean, and be the next step in realizing the initial commitment of the members of CCAMLR to establish a network of protected areas around Antarctica.

# AUTHOR CONTRIBUTIONS

YR-C, KS, CB, FA, KD, TP, and TR collected data in the field. YR-C, AK, CB, and KD analyzed the data. YR-C and AK wrote the manuscript and all authors revised it.

#### FUNDING

This study was financially supported by the French Polar Institute Paul Emile Victor (IPEV), the WWF-UK through R. Downie, and the Zone Atelier Antarctique et Subantarctique – LTER France of the CNRS. This study was approved by the ethics committee of IPEV. This study is a contribution to the program SENSEI funded by the BNP Paribas Foundation.

#### ACKNOWLEDGMENTS

We thank A.J.J. MacIntosh, X. Meyer, and M. Pellé, who participated in the fieldwork in the 2013–14 season, and all the members of the on-site logistics team for their assistance.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Ropert-Coudert, Kato, Shiomi, Barbraud, Angelier, Delord, Poupart, Koubbi and Raclot. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Compositional Differences in the Habitat-Forming Bryozoan Communities of the Antarctic Shelf

Scott Santagata<sup>1</sup>\*, Veronica Ade<sup>2</sup>, Andrew R. Mahon<sup>3</sup>, Phillip A. Wisocki<sup>1</sup> and Kenneth M. Halanych<sup>4</sup>

*<sup>1</sup> Department of Biology, Long Island University-Post, Greenvale, NY, United States, <sup>2</sup> Department of Science, Syosset High School, Syosset, NY, United States, <sup>3</sup> Department of Biology, Central Michigan University, Mount Pleasant, MI, United States, <sup>4</sup> Department of Biological Sciences, Auburn University, Auburn, AL, United States*

In some areas of the Antarctic shelf, bryozoans are abundant, acting as ecosystem engineers creating secondary structures with wide benthic coverage and harboring numerous other species. As the combined forces of global warming and ocean acidification threaten these habitats, we measured the composition of habitat-forming bryozoan communities using two techniques for imaging the sea floor, a YoYo-camera system and the AWI Ocean Floor Observation System (OFOS). YoYo-camera transects of the Bellingshausen, Amundsen, and Ross Seas were conducted during a research cruise on the *R/V Nathaniel B. Palmer* in 2013. OFOS transects included sites in the northern Palmer Archipelago where it borders the Scotia Sea and the Weddell Sea as part of the DynAMO project during the PS81 and PS96 cruises of *R/V Polarstern* in 2013 and 2015–16, respectively. Areas of bryozoan colonies were measured from the sea floor images using machine-learning algorithms available through the Trainable Weka Segmentation plugin developed for FIJI software. Habitat-forming bryozoan communities in the Palmer Archipelago and Ross Sea were largely composed of anascan flustrid species with finely mineralized skeletons, and to a lesser extent of other ascophoran lepraliomorph and umbonulomorph species having more robustly mineralized skeletons. Although habitat-forming bryozoan communities in the shallower (200 m) sites of the Weddell Sea also contained flustrid species, percent area and composition of flustrid bryozoans declined with increasing depth. Lepraliomorph and umbonulomorph bryozoan morphotypes were more abundant in the Weddell Sea, maintaining their relative percent area and increasing their percent composition between 200 and 400 m. Moreover, our analyses of species composition based on externally gathered datasets show similar trends among sites, depths, and degrees of colony mineralization to our seabed imaging study. Variation present in the bryozoan species compositions of the Amundsen and Bellingshausen Seas suggests that these areas potentially represent divergent bryozoan communities requiring further validation via remote imaging surveys. Overall, compositional differences among Antarctic habitat-forming bryozoan communities are likely influenced by the combined effects of seasonal ice scour and carbonate chemistry, which in an increasingly acidified and warming ocean may put the communities of the eastern Weddell Sea at greater risk.

Keywords: bryozoa, sea floor imaging, Ross Sea, Weddell Sea, Antarctic shelf

#### Edited by:

*Peter Convey, British Antarctic Survey (BAS), United Kingdom*

#### Reviewed by:

*Rachel Victoria Downey, Australian National University, Australia Juan Ernesto Guevara Andino, Field Museum of Natural History, United States*

> \*Correspondence: *Scott Santagata scott.santagata@liu.edu*

#### Specialty section:

*This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution*

> Received: *19 April 2018* Accepted: *20 July 2018* Published: *14 August 2018*

#### Citation:

*Santagata S, Ade V, Mahon AR, Wisocki PA and Halanych KM (2018) Compositional Differences in the Habitat-Forming Bryozoan Communities of the Antarctic Shelf. Front. Ecol. Evol. 6:116. doi: 10.3389/fevo.2018.00116*

# INTRODUCTION

Among the rich benthic invertebrate communities of the Antarctic shelf are diverse assemblages of echinoderms, brachiopods, crustaceans, sea spiders, octocorals, glass sponges, and bryozoans (Thatje et al., 2005). Many of these species are endemic (Pugh and Convey, 2008) and survive only in a narrow thermal range (Pörtner et al., 2007). Many Antarctic shelf invertebrates have body types supported by calcite, aragonite, or bimineralic skeletons requiring increased metabolic costs to build due to the combined influences of cold temperature, lower pH, increased hydrostatic pressure, and mineral dissolution (Andersson et al., 2008; McNeil and Matear, 2008). As a result of these natural physiological challenges, the skeletons of some Antarctic invertebrates are often noted for being thinner and more delicate than those of comparable shallow-water temperate or warm-water species (McClintock et al., 2009; Watson et al., 2012; Duquette et al., 2018). The effects of typical environmental pressures on Antarctic shelf invertebrates are compounded by the impacts of ocean acidification (Hofmann et al., 2010), global warming (Ingels et al., 2012), and the shoaling of the calcite compensation depth (Hillenbrand et al., 2003; Griffiths, 2010). For these reasons, identifying specialized communities of the Antarctic shelf rich with species dependent on mineralized body types that act as ecosystem engineers (e.g., bryozoans) and provide secondary structure for numerous other taxa is crucial for protecting Antarctic diversity. Bryozoan diversity in the Southern Ocean is estimated at more than 400 species, with the majority belonging to the clades Cheilostomata and Cyclostomata (De Broyer and Danis, 2011; Figuerola et al., 2014; Pabis et al., 2014). Within these clades bryozoan species tend to converge on different kinds of encrusting and erect colonial forms that can make significant secondary structures or reef-like habitats. 
Although some habitat-forming bryozoan communities have been documented worldwide (Wood et al., 2012), little is known about the species assemblages that create these unique habitats and their importance to Antarctic shelf communities as a whole.

Faunal communities inhabiting the deeper regions of the Antarctic shelf are, in general, less physically disturbed than benthic assemblages on the shallower regions of the shelf (Barnes and Conlan, 2007). Disturbances in modern shelf communities are most notably caused by seasonal ice scour, which has its greatest effect down to 100 m but can progress with lesser effects down to 500 m in some places (Gutt et al., 1996; Barnes and Conlan, 2007). The last glacial maximum that traversed the western Antarctic shelf likely limited shelf benthic communities to the deeper continental margin until about 12–15 thousand years ago (Anderson et al., 2002). Since many benthic invertebrates inhabiting the shelf exhibit eurybathic distributions (Brey et al., 1996), the post-glacial recolonization process may have been sourced by populations from the continental slope and abyssal regions, surrounding continents and islands, or other surviving refuge habitats (Barnes and Kuklinski, 2010). Studies of Antarctic shelf bryozoans and sea spiders support the refugia hypothesis (Barnes and Kuklinski, 2010; Leese et al., 2015; Barnes et al., 2016). Based on older and current patterns of ecological disturbance and subsequent recolonization, invertebrate communities inhabiting the shallower and deeper regions of the Antarctic shelf may be comprised of significantly different species assemblages (Jones et al., 2007; Boger, 2011; Barnes et al., 2016).

Antarctic shelf communities have often been described by their species compositions using standard sampling methods (e.g., Agassiz trawl and epibenthic sledge). The development of remote imaging systems allows for the non-destructive characterization of shelf fauna on significantly wider spatial scales (Grange and Smith, 2013; Piepenburg et al., 2017). Used together, direct and remote sampling techniques provide powerful tools for comparing benthic species assemblages on both small and wide spatial scales that may be tracked over time. Our study focuses on measuring the coverage and composition of the most abundant systematic groups of bryozoans occurring on the Antarctic shelf: the anascan Flustrina (flustrids) and two ascophoran groups, the Lepraliomorpha and Umbonulomorpha. We lumped the latter two systematic groups together, as recent phylogenetic analyses have determined the Lepraliomorpha and Umbonulomorpha to be paraphyletic clades that interdigitate (Waeschenbach et al., 2012; Taylor et al., 2015). Taking advantage of more recent imaging surveys with wider geographic distributions and better-resolved images of the Antarctic benthos, we used novel machine learning-based segmentation algorithms to discern between major morphological grades and clades of bryozoans.

Specific aims for our study were to map and measure the relative coverage and clade compositions of habitat-forming bryozoan communities using our seabed image datasets. Bryozoan-rich habitats with greater benthic coverage were selectively investigated since these larger communities are more likely to be reinvestigated by future benthic studies, thus facilitating the documentation of potential temporal shifts. We test whether differences in coverage and clade composition correlate significantly with select abiotic factors such as depth, temperature, and salinity. As the clade groupings listed above differ significantly in degree of biomineralization (Smith et al., 2006; Smith, 2014), we also identify specific Antarctic bryozoan communities more at risk from the effects of ocean acidification. The criteria we used to categorize habitat-forming bryozoan communities emphasize select regions of the Antarctic shelf over wide spatial scales of the Ross Sea, Palmer Archipelago, and Weddell Sea, so we augmented our geographical and depth comparisons using bryozoan species records gathered from external datasets, testing for similar trends in community structure.

#### MATERIALS AND METHODS

#### Sampling

YoYo-camera transects were conducted during 2013 on the R/V Nathaniel B. Palmer in three western Antarctic seas, with the majority occurring in the Ross Sea (Bellingshausen Sea: six; Amundsen Sea: two; and Ross Sea: 12). Sea floor imaging on the R/V Nathaniel B. Palmer consisted of a vertically oriented YoYo-camera system using a DSC 10,000 camera, an OIS 3821 strobe, and a bottom contact trigger positioned 2.5 m above the bottom (Halanych et al., 2013). YoYo-camera images were standardized to a single size (4.32 m<sup>2</sup>) using parallel laser beams set 10 cm apart for scale. Seabird SBE3 oceanographic sensors mounted on a CTD rosette maintained by the US Antarctic Program collected environmental variables such as salinity, temperature, and fluorometer readings at maximum depth. The AWI Ocean Floor Observation System (OFOS) imaging of the seabed in the Antarctic Peninsula and Weddell Sea was part of the DynAMO project during the PS81/96 cruises of R/V Polarstern in 2013 and 2015–16. Using the OFOS system, image size varies based on distance from the seabed and is calculated for each image (median = 5.4 m<sup>2</sup>). Salinity, temperature, and chlorophyll a readings at maximum depth were collected by CTD (see Segelken-Voigt et al., 2016; Piepenburg et al., 2017; for sampling details). OFOS transects included a subset from the Palmer Archipelago where it borders the Scotia Sea (n = 4), but the majority of the OFOS transects were conducted throughout the Weddell Sea (n = 16). Respective depth ranges between OFOS and YoYo-camera transects were split between shallower (170–400 m) and deeper (600–1,200 m) areas of the shelf, with the greatest amount of overlap between 400 and 600 m (**Figure 1** and **Table 1**).
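The use of paired lasers for scale can be illustrated with a small calculation. This Python sketch is an assumption-laden illustration, not the authors' procedure: the function names and the pixel values are hypothetical, and it assumes a camera pointing straight down at a flat seabed with square pixels, so that one scale factor applies across the whole image.

```python
LASER_SPACING_M = 0.10  # parallel laser beams set 10 cm apart, as stated in the text


def metres_per_pixel(laser_separation_px: float) -> float:
    """Ground distance covered by one pixel, from the laser dot separation in pixels."""
    if laser_separation_px <= 0:
        raise ValueError("laser separation must be positive")
    return LASER_SPACING_M / laser_separation_px


def image_footprint_m2(width_px: int, height_px: int, laser_separation_px: float) -> float:
    """Seabed area covered by an image of the given pixel dimensions."""
    scale = metres_per_pixel(laser_separation_px)
    return (width_px * scale) * (height_px * scale)


# Hypothetical image: laser dots 100 px apart in a 2400 x 1800 px frame.
area = image_footprint_m2(2400, 1800, 100.0)
print(f"footprint = {area:.2f} m^2")
```

The pixel dimensions above were chosen only so that the example lands on the 4.32 m<sup>2</sup> footprint quoted in the text; the actual sensor resolution is not given in the source.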

#### Sea Floor Imaging Analysis

We used the machine learning algorithms included with the Trainable Weka Segmentation plugin (Arganda-Carreras et al., 2017) available through FIJI imaging software (Schindelin et al., 2012) to measure the area of bryozoan colonies. As the purpose of our study was to specifically document and characterize specialized habitat-forming bryozoan communities, the review and inclusion of images for further analysis was intentionally non-random. All YoYo-camera and OFOS images were visually inspected for the presence of bryozoan colonies. The ability to correctly identify and categorize bryozoan colony types in the sea floor images was crucial to this process. To that end, a database was created that includes macrophotographic images and scanning electron micrographs of bryozoan samples collected by benthic trawls that corresponded to the YoYo-camera transects as part of the R/V Nathaniel B. Palmer cruise (n = 133 samples). Our classifications were also cross-referenced with the bryozoan taxa collected from these sites as listed in our species records gathered from external databases. We chose to measure areas of the Antarctic shelf with more extensive bryozoan coverage and not every observed bryozoan colony, so sea floor image series were chosen where bryozoans occurred in at least three successive images. Ten adjacent images were processed as a unit in FIJI (see **Table 1** for the number of processed images per site). Resolution of all processed images was downsized to 1,024 × 1,024 pixels. Some image series gathered by YoYo-camera were color-corrected to improve contrast. The Trainable Weka Segmentation plugin was used with the default settings plus the entropy and structure filters to reduce classification errors among the resulting three classes (Flustrina, Lepralio-Umbonulomorpha, and Background Substrate/Other Taxa). The plugin requires that the user manually outline representative bryozoan colony types from each group, background substrate, and other taxa in a subset of images from the 10-image set. This process trains the algorithm to classify and outline each group in all provided images. Once the image segmentation is applied, all images are visually inspected for classification errors. If classification errors are found, the user can manually correct them, retrain the classifier, and reapply the improved classification to all images.
From these segmented images, percent areas of each group were measured with the Region Morphometry tool of MorphoLibJ. We processed 400 images in shelf areas of the Ross Sea, Palmer Archipelago/Scotia Sea, and Weddell Sea where more extensive bryozoan coverage was observed (total area analyzed from all photographs with bryozoans = 2232.7 m<sup>2</sup>).
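The two bookkeeping steps in this workflow, selecting runs of at least three successive images with bryozoans and converting a segmented image into percent area and percent composition, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the FIJI procedure used in the study: the integer label codes, the nested-list image representation, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical integer codes for the three segmentation classes.
BACKGROUND, FLUSTRINA, LEPRALIO_UMBONULOMORPHA = 0, 1, 2


def successive_runs(has_bryozoans, min_length=3):
    """Return (first, last) index pairs for runs of consecutive images
    in which bryozoans were seen at least min_length times in a row."""
    runs, start = [], None
    for i, present in enumerate(has_bryozoans):
        if present and start is None:
            start = i
        elif not present and start is not None:
            if i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(has_bryozoans) - start >= min_length:
        runs.append((start, len(has_bryozoans) - 1))
    return runs


def percent_area(label_image):
    """Percent of all pixels assigned to each class label."""
    counts = Counter(label for row in label_image for label in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("label_image is empty")
    return {label: 100.0 * n / total for label, n in counts.items()}


def percent_composition(areas):
    """Share of the bryozoan-covered area held by each bryozoan class,
    ignoring background substrate and other taxa."""
    f = areas.get(FLUSTRINA, 0.0)
    lu = areas.get(LEPRALIO_UMBONULOMORPHA, 0.0)
    if f + lu == 0:
        return None  # no bryozoan pixels in this image
    return {"F": 100.0 * f / (f + lu), "L-U": 100.0 * lu / (f + lu)}


# Toy 2 x 4 segmented image using the hypothetical codes above.
toy = [[0, 0, 1, 2],
       [0, 1, 1, 2]]
areas = percent_area(toy)
print(areas, percent_composition(areas))
```

Keeping percent area (relative to the whole image) separate from percent composition (relative to bryozoan cover only) mirrors the distinction the text draws between the two measures.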

Despite the reliability of the segmentation methods used here, we recognize that a low proportion of the bryozoans in the sea floor images may have been misclassified, as colony form varies significantly in some species (e.g., Adelascopora secunda and A. jeqolqa with flustriform and cellariiform colony morphologies, respectively). Since our main classification criterion between bryozoan groups was based on the presence or absence of flexible flustriform (foliaceous) colony morphologies, some cellariiform genera in the clade Flustrina (e.g., Melicerita and Cellaria) may have been misclassified with the lepralio-umbonulomorph morphotypes in some images. Although not as abundant as cheilostome bryozoans, moderately mineralized cyclostome bryozoans with erect colonies (e.g., Hornera and Fasciculipora) may also have been included with the lepralio-umbonulomorph morphotypes in some images. However, since cellariiform members of Flustrina and all cyclostome bryozoans have more mineralized skeletons than their flustriform counterparts (see **Supplemental Figure 1**; Hayward, 1995), our classification scheme succeeds in separating bryozoan species with predominantly chitinous epithelial body walls (cystids) from those with mainly mineralized epithelial cystids. This phylogenetic and functional split based on colony form and degree of epithelial mineralization further supports the compositional differences within our classification scheme.

TABLE 1 | Sites in the Ross, Scotia, and Weddell Seas where bryozoan-rich and habitat-forming bryozoan communities were found on the Antarctic Shelf.

*Habitat-forming bryozoan communities with significant benthic coverage are listed in bold. NBP, R/V Nathaniel B. Palmer Cruise during 01/14/13–02/07/13; N = Number of seabed images; % F/L-U, Percent Flustrid/Lepralio-Umbonulomorph bryozoans; PS81, Polarstern Cruise during 2013; PS96, Polarstern Cruise during 2015–2016; R, Ross Sea; S, Scotia Sea; W, Weddell Sea.* \**Total area represented by the subset of adjacent images.*

#### Geographic and Depth-Related Trends

We plotted our processed sea floor imaging data with ArcGIS software (ver. 10.5). Relationships among bryozoan percent area, percent composition, bryozoan colony type, Antarctic sea, temperature, salinity, fluorometric readings, and depth were explored using JMP 14 software (SAS). Values for percent area and percent composition were arcsine transformed before conducting the ANOVA and principal component analyses. We then augmented the geographic and depth analyses listed above using the combined Antarctic bryozoan collection records of the Smithsonian Institution National Museum of Natural History, the Census of Marine Antarctic Life, and the SCAR Marine Biodiversity Information Network (Barnes and Downey, 2014). This dataset consisted of unique bryozoan species records listing bryozoan species, latitude, longitude, and depth. From this dataset we quantified the number of occurrences of 145 bryozoan species as an indirect proxy of relative abundance in the Ross Sea, Amundsen Sea, Bellingshausen Sea, Palmer Archipelago, and Weddell Sea (Ross Sea, N = 629; Amundsen Sea, N = 183; Bellingshausen Sea, N = 134; Palmer Archipelago, N = 1,446; and Weddell Sea, N = 1,904 records; see **Supplemental Table 1**). Sampled depth ranges were not equally distributed among our geographic sites, so we focused on depth ranges of 200–1,000 m for all sites as these depth ranges best matched our seabed image dataset. Species counts for the Amundsen and Bellingshausen Seas were grouped together to offset the effects of decreased sampling intensity in these areas. We tested for trends in bryozoan community structure, biomineralization (see **Supplemental Table 2**), and depth among sites using non-metric multidimensional scaling (nMDS) with the vegan package available through R (ver. 3.5). The species dataset was autotransformed with a square root transformation and a Wisconsin double standardization. The Bray–Curtis method was used to create a dissimilarity matrix, and differences between geographic areas and depth ranges were tested with a permutational MANOVA (PERMANOVA). The percentages of finely, moderately, and heavily mineralized species were measured within each depth range among sites. These percentages were plotted as environmental vectors and surfaces on the resulting ordination plots. More broadly, depth-related trends among bryozoan clades were explored worldwide, adding bryozoan species records (Smithsonian Institution National Museum of Natural History, N = 1,723) from other non-Antarctic sites in the Southern and Northern Hemispheres with reliable depth, latitude, and longitude values.
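The data transformations named in this section can be written out explicitly. The study itself used JMP and the vegan package in R; the Python sketch below is only an illustrative re-expression of the arithmetic, with hypothetical function names and a toy count matrix. It assumes that "arcsine transformed" refers to the common arcsine square-root transform of a proportion, which the text does not state, and it does not reproduce the ordination or permutation test.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1]
    (assumed form of the arcsine transform mentioned in the text)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))


def wisconsin(matrix):
    """Wisconsin double standardization: divide each species (column)
    by its maximum, then each site (row) by its total."""
    col_max = [max(col) for col in zip(*matrix)]
    scaled = [[v / m if m > 0 else 0.0 for v, m in zip(row, col_max)]
              for row in matrix]
    result = []
    for row in scaled:
        total = sum(row)
        result.append([v / total if total > 0 else 0.0 for v in row])
    return result


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two site vectors."""
    denom = sum(a) + sum(b)
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


def dissimilarity_matrix(counts):
    """Square-root transform, Wisconsin standardize, then compute all
    pairwise Bray-Curtis dissimilarities (rows = sites, columns = species)."""
    rooted = [[math.sqrt(v) for v in row] for row in counts]
    std = wisconsin(rooted)
    n = len(std)
    return [[bray_curtis(std[i], std[j]) for j in range(n)] for i in range(n)]


# Toy occurrence counts: two site/depth groups by two species (hypothetical).
print(dissimilarity_matrix([[4, 0], [1, 9]]))
```

The resulting matrix is the kind of input an nMDS ordination or a permutational test would take; those steps are best left to an established statistics package.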

#### RESULTS

#### Composition of Habitat-Forming Bryozoan Communities on the Antarctic Shelf

Habitat-forming bryozoan communities were found in a subset of sites in the Ross Sea (see **Figure 2A**) consisting mainly of leaf-like colonies of anascan flustrid species (e.g., Carbasea, Nematoflustra, Isosecuriflustra, and Kymella) with finely mineralized skeletons (see **Figure 2B**). Other common types, though present to a lesser extent than flustrid bryozoans, are the more robustly mineralized colonies of diverse ascophoran lepraliomorph (e.g., Reteporella and Adelascopora) and umbonulomorph (e.g., Bostrychopora, Cellarinella, and Systenopora) species. **Figures 2C–E** show typical sea floor images from the Ross Sea, Palmer Archipelago/Scotia Sea, and Weddell Sea where bryozoans are common, forming substantial secondary structures. Although different kinds of invertebrate communities were present in the Bellingshausen and Amundsen Sea sites, bryozoan habitats applicable to our study were not found. The resolution of both the YoYo-camera and OFOS systems allowed for bryozoan colonies to be readily distinguished from the surrounding background substrate and associated taxa (e.g., sponges, cnidarians, echinoderms, and fish). Furthermore, due to the more robust mineralized skeletons and divergent colony morphologies present in the vast majority of lepraliomorph and umbonulomorph bryozoans as compared to flustrid species, the Trainable Weka Segmentation plugin for FIJI software was able to reliably discriminate between these two bryozoan categories (see **Figure 2F** for an example of a segmented image).

FIGURE 2 | Overview of bryozoan colony types and habitats. (A) Blake trawl gathered from the Ross Sea containing mainly flustrid bryozoans. (B) Typical flustrid, lepraliomorph, and umbonulomorph colony types. (C) Habitat-forming bryozoan communities of the Ross Sea. (D) Habitat-forming bryozoan communities of the Weddell Sea. (E) Habitat-forming bryozoan communities of the Scotia Sea. (F) Segmented image where flustrids are shaded red, lepralio-umbonulomorphs are shaded green, and the background substrate and other taxa are shaded purple.

Our surveys of all imaging datasets found 16 out of 37 sites in the Ross Sea, Palmer Archipelago/Scotia Sea, and Weddell Sea (n = 6, 4, 6, respectively; see **Table 1**) with applicable bryozoan communities in a depth range of 170–765 m. Fifteen of these sites had a total bryozoan area >3 m<sup>2</sup> (**Table 1** and **Figure 3**). No obvious geographic trend in the distribution of habitat-forming bryozoan sites was found in the Ross Sea (see **Figure 3B**), and only one of the Ross Sea sites exhibited a total area that matched sites in the Palmer Archipelago/Scotia Sea and Weddell Sea (**Figures 3A–D**). All of the habitat-forming bryozoan sites in the Weddell Sea were found in its more eastern region (**Figure 3D**). The total percent compositions for flustrid and lepralio-umbonulomorph morphotypes within habitat-forming bryozoan communities were similar between the Ross Sea and Palmer Archipelago/Scotia Sea sites, but differed from the Weddell Sea (Ross Sea: 79% flustrid/21% lepralio-umbonulomorph; Palmer Archipelago/Scotia Sea: 90/10%; Weddell Sea: 34/66%).

Overall, the Ross and Palmer Archipelago/Scotia sites are composed largely of flustrid bryozoans and, to a lesser extent, of lepralio-umbonulomorph morphotypes. However, the Palmer Archipelago/Scotia Sea sites are more variable, with several high outliers for lepralio-umbonulomorph morphotypes. Percent area and percent clade composition of these two bryozoan morphotypes are significantly different in the Weddell Sea sites, as lepralio-umbonulomorphs account for more of the percent area and percent composition (**Figures 4A,B**). A one-way ANOVA found significant differences in the means for both percent area and percent clade composition between flustrid and lepralio-umbonulomorph morphotypes, ignoring background substrate and other taxa, within each of the three sea regions (**Table 2**). Significant differences were also found for the percent areas and percent clade compositions of flustrid and lepralio-umbonulomorph morphotypes among the three regions, except in a few comparisons (**Table 2**).

#### Environmental Variables

Environmental variables such as salinity, temperature, and fluorometric readings at depth varied slightly among sites with habitat-forming bryozoan communities (**Supplemental Table 3**). A principal components analysis including these variables along with transformed measurements of percent area and percent composition resulted in the eigenvector weightings shown in **Table 3** that account for 66.3% of the total variation. Although environmental factors such as temperature and salinity were included as significant weightings in the first two eigenvectors, their relative importance is unclear since our data do not include any variation over time. The three Antarctic regions are differentiated largely by their different compositions of flustrid and lepralio-umbonulomorph bryozoans and depth (**Figure 5**). We explore the differences in bryozoan composition and depth in more detail in **Figure 6**. For this analysis, images in which bryozoans comprised <10% of the total area were excluded to rule out the influence of pioneer colonies and to focus more on larger bryozoan communities. The trends observed between maximum depth and the percent areas of bryozoans are largely based on the compositional differences among the three Antarctic regions. The percent area of flustrid bryozoans increases with depth in the Ross Sea (r<sup>2</sup> = 0.30, **Figure 6A**), but this trend is largely influenced by only two sites (NBP 1210 013113 and NBP 1210 020113, 585 and 544 m, respectively). In general, the deeper shelf sites in the Ross Sea are characterized by a greater prevalence of flustrid bryozoans as compared to lepralio-umbonulomorph bryozoans. Flustrid bryozoans comprise most of the shallower shelf sites in the northern Palmer Archipelago/Scotia Sea, showing no significant correlation with maximum depth (**Figure 6A**). However, the percent area of lepralio-umbonulomorph bryozoans did increase with maximum depth (r<sup>2</sup> = 0.31). In contrast to the Ross Sea and Palmer Archipelago/Scotia Sea sites, the percent area of flustrid bryozoans decreased in the shallower areas of the eastern Weddell Sea (r<sup>2</sup> = 0.36), and lepralio-umbonulomorph bryozoans are more prevalent in these habitats (**Figure 6A**). Similar trends are observed for the percent clade composition among regions and depths, with the only obvious difference being that the percent composition of lepralio-umbonulomorph bryozoans increases with depth in the Weddell Sea sites (r<sup>2</sup> = 0.25, **Figure 6B**).
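For readers less familiar with the r<sup>2</sup> values quoted above, the following Python sketch shows how a coefficient of determination for percent area against maximum depth could be computed after applying the 10% coverage filter. It is a minimal sketch, not the JMP analysis used in the study: the record layout, the function names, and the example values are hypothetical, and a simple least-squares line is assumed.

```python
def filter_images(records, min_total_percent=10.0):
    """Keep records whose total bryozoan cover is at least min_total_percent.

    Each record is a (depth_m, total_bryozoan_percent, group_percent) tuple;
    this layout is an assumption made for the example.
    """
    return [r for r in records if r[1] >= min_total_percent]


def r_squared(x, y):
    """Coefficient of determination for a simple least-squares line y ~ x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    return sxy ** 2 / (sxx * syy)


# Hypothetical records: (depth in m, total bryozoan %, flustrid %).
records = [(250, 4.0, 3.0), (300, 22.0, 15.0), (450, 30.0, 21.0), (580, 41.0, 38.0)]
kept = filter_images(records)
depths = [r[0] for r in kept]
flustrid = [r[2] for r in kept]
print(len(kept), round(r_squared(depths, flustrid), 2))
```

Because r<sup>2</sup> is sensitive to a few extreme points, a check like the one noted in the text (a trend driven by only two sites) is worth repeating with those sites removed.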

Significant differences in species composition and degree of mineralization are evident among sites and some depth groupings in our nMDS analyses. Overlap in species composition within the first 400 m is shown for the Ross Sea, Weddell Sea, and the Palmer Archipelago (**Figures 7A,C**). Deeper zones of these regions (>600 m) exhibit more unique species compositions. Amundsen and Bellingshausen species compositions by depth range were significantly more variable and different from the other regions (**Figure 7A**). Species compositions among regions also separate based on the percentage of finely, moderately, and heavily mineralized species groups (**Figures 7B,D,E**). Trends in the degree of biomineralization among species are similar to the results based on our seabed images. More finely skeletonized species are found in the Palmer Archipelago and Ross Sea, but moderately and heavily mineralized species are more prevalent in the Amundsen and Bellingshausen Seas as well as deeper (>800 m) regions of the Palmer Archipelago and Weddell Sea (**Figures 7B,E**). Due to sampling biases, it is difficult to test for significant differences in community structure beyond 1,000 m. Although based on a small number of samples, the deepest Antarctic bryozoans are all flustrids occurring between 3,000 and 5,000 m (N = 3). Beyond Antarctica, worldwide bryozoan records show that cyclostome, flustrid, and lepralio-umbonulomorph bryozoans all occur down to depths of 2,700–2,800 m. However, only flustrid bryozoans are found down to 5,900–6,000 m (**Figure 8**). These flustrid bryozoans of the abyssal zone include both Antarctic and non-Antarctic species of Notoplites, Himantozoum, and Camptoplites.

FIGURE 3 | GIS maps of the location and area of bryozoan-rich and habitat-forming bryozoan communities considered in this paper. (A) Map of Antarctica showing all sites. (B) Ross Sea. (C) Sites along the northern portion of the Palmer Archipelago where it borders the Scotia Sea. (D) Weddell Sea. Black dots are areas included in the imaging surveys, but where habitat-forming bryozoan communities were not observed. Red circles are areas with habitat-forming bryozoan communities scaled by the total bryozoan area. The NOAA GIS base maps of Antarctica were obtained from a web-shared repository.

#### DISCUSSION

#### Biogeography and Composition of Antarctic, Habitat-Forming Bryozoan Communities

Our study focused on mapping and characterizing unique bryozoan-rich communities producing significant coverage on areas of the Antarctic shelf. Although several studies include bryozoans as being one of the more common and diverse members of the shelf fauna (e.g., Barnes and Kuklinski, 2010; Barnes et al., 2016; Figuerola et al., 2018), the impact of expansive bryozoan secondary structures on the surrounding community has not been evaluated on a wide geographic scale (Wood et al., 2013). Evaluating the geographic distribution of habitat-forming bryozoan communities in other Antarctic regions based on published studies using direct sampling or remote-imaging techniques (or both) can be difficult due to interpretational differences in categorizing bryozoan-rich habitats. For example, based on trawl samples, bryozoans have been found to be both species-rich and abundant at some Amundsen Sea sites (Linse et al., 2013), but without corresponding seabed images to measure benthic coverage it is not possible to assess whether these sites include habitat-forming bryozoan communities as defined here. A recent YoYo-camera survey of the Sabrina Coast Shelf found abundant forms of bryozoans and many other invertebrate taxa, even though bryozoan-dominated communities were not observed (Post et al., 2017). However, bryozoan-rich communities (including sponges and soft corals) were found in imaging surveys of the George V Ice Shelf (Post et al., 2011). In general, habitat-forming bryozoan communities are more expansive in Antarctic regions than in other sites worldwide (Wood et al., 2012), albeit only a subset of the Antarctic regions accessible for remote imaging have been explored in detail.

Compositional differences discussed here in habitat-forming bryozoan communities among the Ross Sea, Palmer Archipelago/Scotia Sea, and Weddell Sea are similarly noted for particular species of bryozoans. For example, the flustrid bryozoan Melicerita obliqua makes moderately mineralized cellariiform colonies that do not fit our clade categories, but is segmented correctly by its degree of mineralization. This species is widely distributed in low abundances in the Ross Sea (trawl data, Winston, 1983). However, Melicerita obliqua is one of the most common benthic suspension feeders found in an expansive coastal region of the eastern Weddell and Lazarev Seas [grand average of 595.6 colonies per 100 m<sup>2</sup>, see (Gutt and Starmans, 1998)], lending support to our finding that habitat-forming communities in the Weddell Sea have a greater prevalence of moderately to robustly calcifying species. Similar to the results of Gutt and Starmans (1998), we found bryozoans abundant in eastern Weddell Sea sites where suspension feeders dominate, but not in more western Weddell Sea sites, which are characterized by a higher abundance of detritus feeders. Bryozoan-rich sites in the eastern Weddell and Lazarev Seas are also among the most diverse in other invertebrate taxa (Gutt and Starmans, 1998). Consistent with our study, Cellarinella spp. found on the Fimbul Ice Shelf region of the Weddell Sea are abundant in the shallower (245 m) sites, but decrease sharply at a maximum depth of 510 m (seabed imaging data, Jones et al., 2007). Other unidentified species of bryozoans are more common at the deeper Fimbul Ice Shelf sites (Jones et al., 2007). Similar depth-related differences in the abundances of bryozoan erect colony forms occur at sites from Signy Island (Scotia Sea), where flexible flustrid bryozoan species are more abundant with increasing depth down to 290 m as compared to other more heavily mineralized colony types (Barnes, 1995).

*\*\*\*P < 0.001; \*P < 0.05. F, Flustrid bryozoans; L-U, Lepralio-Umbonulomorph bryozoans.*

TABLE 3 | Breakdown of the first two eigenvectors resulting from the principal component analysis.

*Significant variable weightings are listed in bold. L-U, Lepralio-Umbonulomorph bryozoans.*

#### Mineralization Patterns of Antarctic Bryozoans

The composition of bryozoan skeletons varies with latitude, as species from high latitudes and polar regions (60–90°) predominantly use calcite, with a limited number of species being bimineralic (Kuklinski and Taylor, 2009; Taylor et al., 2009). This latitudinal trend in bryozoan skeletal composition may be due to aragonite skeletons' greater susceptibility to dissolution in colder water (Morse et al., 1980; Taylor et al., 2009). Another compounding factor is that bryozoan skeletons vary with respect to the weight percent of magnesium carbonate in the calcite, and skeletons with higher amounts of Mg-calcite are more soluble in colder water (Andersson et al., 2008). As the negative effects of seawater pH and hydrostatic pressure on biomineralization intensify with depth (Feely et al., 2009), selection should favor deep-water bryozoans with lower skeletal Mg content. However, no significant correlations have been found among pH, Mg-calcite content, and depth for species of Antarctic (n = 4, Figuerola et al., 2015) and Arctic (n = 52, Borszcz et al., 2013) bryozoans, perhaps because polar bryozoans already exhibit lower levels of Mg-calcite content at both shallow and deeper depths. Despite these results, latitudinal patterns in skeletal composition suggest that there may be a more significant effect from temperature than from depth on aspects of the biomineralization process in bryozoans (Kuklinski and Taylor, 2009; Taylor et al., 2009, 2015). This assertion was not supported by more rigorous sampling of four species of Antarctic bryozoans that exhibited significant variability in the Mg-calcite skeletal content among sites and showed no significant correlation with temperature (Loxton et al., 2014). Therefore, selection may favor local adaptations in response to discrete environmental conditions among isolated populations of bryozoans over small spatial scales (Loxton et al., 2014).
Local adaptive responses may be one reason why more significant correlations among calcite types and depth have not been found in the above studies. It should be noted that none of these studies measured the abundances of bryozoan colony morphotypes with different degrees of biomineralization as shown here.


Despite the variation present in the biomineralization patterns of Antarctic bryozoans, some studies do show their usefulness as ecological indicators of global change (Barnes et al., 2011; Fortunato, 2015). Moreover, Antarctic bryozoan species from shallow-water habitats exhibit wide variation in mass-specific metabolic rates (Peck and Barnes, 2004). Interestingly, one common flustrid species, Camptoplites bicornis, was found to have one of the highest mass-specific metabolic rates even when compared to some Antarctic molluscs (Peck and Barnes, 2004). Considering that species of Camptoplites are found in shallow and deep-water habitats alike (as deep as 5 km, see Hayward, 1981), this metabolic flexibility may be one key trait that allows species of Camptoplites, other lightly skeletonized flustrid bryozoans, and a small number of unmineralized ctenostome species (N = 5, Grischenko and Chernyshev, 2015) to occupy and potentially exploit areas of the deeper shelf and the deep sea.

#### Potential Abiotic Factors

Due to the inherent patchiness and abiotic complexity of benthic communities along the Antarctic shelf, other studies have not found specific environmental factors that account for most of the variation in percent area or species composition aside from depth (Jones et al., 2007) and latitude/longitude (Gutt and Starmans, 1998). However, the presence or absence of significant benthic faunal coverage may be correlated with patterns in the near-bottom current that provides sustenance to these communities (Gutt and Starmans, 1998). In terms of substrata, glacial dropstone densities correlate with increased species diversity and benthic coverage in sites off the Sabrina Coast and the West Antarctic Peninsula (Post et al., 2017; Ziegler et al., 2017). Although isolated bryozoan colonies are present on hard substrata in the seabed images included in our study, habitat-forming bryozoan communities largely occurred on soft bottom areas of the Ross Sea shelf. Dropstones were not observed in significant abundance at the Palmer Archipelago/Scotia Sea or Weddell Sea sites. These observations do not rule out that dropstone presence and specific sediment characteristics may in some areas facilitate recruitment and formation of habitat-forming bryozoan communities.

Sea surface temperatures largely influence seasonal and long-term sea ice coverage patterns surrounding Antarctica (Comiso et al., 2017). Although average sea ice cover increased for the whole of Antarctica over the last 34 years, seasonal patterns in sea ice expansion and melt in some seas of western Antarctica, the Antarctic Peninsula, and the Weddell Sea are different (Comiso et al., 2017). Winter-spring trends show ice melt in the Antarctic Peninsula, but ice expansion in the western Amundsen, Ross, and eastern Weddell Seas. Summer-autumn trends show significant ice retreat in the Bellingshausen and Amundsen Seas, but increases in the Weddell and Ross Seas (Comiso et al., 2017). Additionally, reduction in sea ice cover and subsequent increases in primary algal productivity promote growth in Antarctic benthic shelf communities that outweighs (but does not completely eliminate) the negative impacts of increased ice scour in the shallow shelf zones (Barnes and Conlan, 2007; Barnes, 2017). Interestingly, bryozoan recolonization patterns after recent ice scour differ between areas. Both the George V Ice Shelf (Post et al., 2011) and eastern Weddell Sea (Teixidó et al., 2004) have high bryozoan recolonization abundances, compared to low bryozoan recolonization abundances in the western Weddell (Gutt and Piepenburg, 2003). Based on environmental factors linked to patterns in sea ice coverage, zoobenthic communities in the Bellingshausen and Amundsen Seas would be predicted to increase with time. Interestingly, increased benthic colonization has already been observed in sites of the Antarctic Peninsula (Lagger et al., 2018). In general, the abundances of benthic suspension feeders in the western Ross Sea (Rowden et al., 2015), the northern Antarctic Peninsula (Gutt et al., 2016; Segelken-Voigt et al., 2016), and the eastern Weddell Sea (Gutt and Starmans, 1998) do positively correlate with long-term trends in seasonal ice melt. Spatial and temporal variability in Antarctic marine carbonate chemistry and seawater pH are modified by sea ice melt, primary productivity, freshwater inputs, ocean mixing, and photosynthetic production (Roden et al., 2013; Matson et al., 2014; Smith et al., 2014; Stark et al., 2018). Considering the complex interactions among these environmental drivers, heterogeneous species assemblages varying in degrees of body mineralization are to be expected in different regions and depths of the Antarctic shelf. Understanding how future trends in Antarctic sea ice melt will influence ocean acidification and potentially negatively impact the composition of Antarctic shelf communities is crucial for protecting these specialized habitats.

Our study supports the conclusion that flustrid bryozoans are more prevalent in the habitat-forming bryozoan communities in the Ross Sea and northern Palmer Archipelago/Scotia Sea. In contrast, sites in the eastern Weddell Sea are composed mainly of moderately to robustly mineralized cellariiform flustrid, lepraliomorph, and umbonulomorph bryozoans. The species compositional differences observed among Antarctic sites and depths are likely influenced by the spatial and temporal variability inherent in seasonal ice scour, carbonate chemistry, and primary productivity. How these specialized communities will respond to the combined forces of future global warming and ocean acidification remains an open question (although see Ashton et al., 2017), but considering the overall species diversity characteristic of habitat-forming bryozoan communities, these habitats should be protected (Wood et al., 2012). Marine protected areas are designated in the Ross Sea and the sub-Antarctic island of South Georgia (Hogg et al., 2018; Nyman, 2018); however, surrounding Antarctic regions remain unprotected and, in particular, shelf communities of the eastern Weddell Sea should be given careful consideration.

#### AUTHOR CONTRIBUTIONS

AM and KH conceived and planned the sampling on the R/V N.B. Palmer; SS, VA, and PW gathered and analyzed the data; SS wrote the paper; AM, VA, PW, and KH reviewed the manuscript.

#### FUNDING

KH, AM, and SS provided funding. NSF OPP Grant #1043745 to KH, NSF OPP 1043670 to AM, and LIU Grants in Aid of Research to SS.

#### ACKNOWLEDGMENTS

The authors thank the crew of the R/V N.B. Palmer, D. Piepenburg, and the crew of the Polarstern for their efforts. The authors also thank the reviewers for helpful suggestions on improving the paper.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00116/full#supplementary-material

Supplemental Figure 1 | Clade and mineralization classifications based on common genera of Antarctic bryozoans. Macrophotographic images of various genera are listed from top to bottom in the left panel. Across each panel the middle image is a scanning electron micrograph of the mineralized skeleton with the tissues removed. The right panel is an image taken from our seabed imaging dataset that corresponds to the particular bryozoan group or genus.

Supplemental Table 1 | Bryozoan species records gathered from external datasets.

Supplemental Table 2 | Antarctic bryozoan species grouped by degree of mineralization: fine, moderate, and heavy.

Supplemental Table 3 | Environmental variables measured at bottom depth in bryozoan-rich and habitat-forming bryozoan sites.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Santagata, Ade, Mahon, Wisocki and Halanych. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Local and Regional Scale Heterogeneity Drive Bacterial Community Diversity and Composition in a Polar Desert

Kelli L. Feeser<sup>1</sup> , David J. Van Horn<sup>1</sup> , Heather N. Buelow<sup>1</sup> , Daniel R. Colman<sup>1</sup> , Theresa A. McHugh<sup>2</sup> , Jordan G. Okie<sup>3</sup> , Egbert Schwartz<sup>4</sup> and Cristina D. Takacs-Vesbach<sup>1</sup> \*

<sup>1</sup> Department of Biology, University of New Mexico, Albuquerque, NM, United States, <sup>2</sup> Department of Biological Sciences, Colorado Mesa University, Grand Junction, CO, United States, <sup>3</sup> School of Life Sciences, School of Earth and Space Exploration, Arizona State University, Tempe, AZ, United States, <sup>4</sup> Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, United States

#### Edited by:

Peter Convey, British Antarctic Survey (BAS), United Kingdom

#### Reviewed by:

Don A. Cowan, University of Pretoria, South Africa Chun Wie Chong, International Medical University, Malaysia

\*Correspondence:

Cristina D. Takacs-Vesbach cvesbach@unm.edu

#### Specialty section:

This article was submitted to Terrestrial Microbiology, a section of the journal Frontiers in Microbiology

Received: 16 May 2018 Accepted: 30 July 2018 Published: 21 August 2018

#### Citation:

Feeser KL, Van Horn DJ, Buelow HN, Colman DR, McHugh TA, Okie JG, Schwartz E and Takacs-Vesbach CD (2018) Local and Regional Scale Heterogeneity Drive Bacterial Community Diversity and Composition in a Polar Desert. Front. Microbiol. 9:1928. doi: 10.3389/fmicb.2018.01928

The distribution of organisms in an environment is neither uniform nor random but is instead spatially patterned. The factors that control this patterning are complex and the underlying mechanisms are poorly understood. Soil microbes are critical to ecosystem function but exhibit highly complex distributions and community dynamics due in large part to the scale-dependent effects of environmental heterogeneity. To better understand the impact of environmental heterogeneity on the distribution of soil microbes, we sequenced the 16S rRNA gene from bacterial communities in the microbe-dominated polar desert ecosystem of the McMurdo Dry Valleys (MDV), Antarctica. Significant differences in key edaphic variables and alpha diversity were observed among the three lake basins of the Taylor Valley (Kruskal–Wallis; pH: χ<sup>2</sup> = 68.89, P < 0.001, conductivity: χ<sup>2</sup> = 35.03, P < 0.001, observed species: χ<sup>2</sup> = 7.98, P = 0.019 and inverse Simpson: χ<sup>2</sup> = 18.52, P < 0.001) and each basin supported distinctive microbial communities (ANOSIM R = 0.466, P = 0.001, random forest ratio of 14.1). However, relationships between community structure and edaphic characteristics were highly variable and contextual, ranging in magnitude and direction across regional, basin, and local scales. Correlations among edaphic factors (pH and soil conductivity) and the relative abundance of specific phyla were most pronounced along local environmental gradients in the Lake Fryxell basin where Acidobacteria, Bacteroidetes, and Proteobacteria declined while Deinococcus–Thermus and Gemmatimonadetes increased with soil conductivity (all P < 0.1).
Species richness was most strongly related to the soil conductivity gradient present within this study system. We suggest that the relative importance of pH versus soil conductivity in structuring microbial communities is related to the length of edaphic gradients and the spatial scale of sampling. These results highlight the importance of conducting studies over large ranges of key environmental gradients and across multiple spatial scales to assess the influence of environmental heterogeneity on the composition and diversity of microbial communities.

Keywords: environmental heterogeneity, 16S rRNA genes, gradient analysis, spatial scale, polar desert, McMurdo Dry Valleys

# INTRODUCTION


Understanding the controls on the distribution of organisms has been one of the fundamental goals of ecology for the past century. Numerous studies suggest that this distribution is neither uniform nor random, but instead spatially patterned (Legendre and Fortin, 1989; Ettema and Wardle, 2002; O'Brien et al., 2016). However, the factors that control this patterning are complex, multifaceted, and include abiotic characteristics, biotic interactions, and stochastic events. Additionally, many ecological processes that influence the distribution, abundance, and interactions of species are scale-dependent, including the flow of individuals within an environment, the impacts and extent of disturbances, and variation in environmental conditions and habitability (Leibold et al., 2004). Spatial variation in environmental conditions, referred to here as environmental heterogeneity, is often cited as the primary driver of biodiversity (Stein et al., 2014; Coyle and Hurlbert, 2016). Environmental heterogeneity increases resource diversity and provides opportunities for niche partitioning and speciation events (Stein et al., 2014), allowing for increased species coexistence (MacArthur and Levins, 1964; Coyle and Hurlbert, 2016). However, there is an inherent tradeoff between environmental heterogeneity and diversity: as niche opportunities increase, the effective area available for each species decreases and the probability of stochastic extinctions rises (Allouche et al., 2012). Consequently, environmental heterogeneity is an important factor for species coexistence, persistence, and diversification (Stein et al., 2014).

Recent evidence suggests that environmental heterogeneity strongly impacts the distribution of microbial communities. However, these relationships are complex and contingent on several factors. First, there are the confounding effects of spatial scale. Larger spatial areas generally encompass greater environmental heterogeneity which can come in the form of longer gradient lengths (e.g., wider ranges of conditions) or harsher gradient severity (e.g., more extreme conditions). Thus, as environmental heterogeneity is inextricably linked with spatial scale, patterns observed at small scales do not necessarily correspond to those found at larger scales (Franklin and Mills, 2009; Geyer et al., 2013; Van Horn et al., 2013; Bar-Massada and Wood, 2014; Stein et al., 2014). Additionally, because environmental heterogeneity typically involves simultaneous changes in numerous abiotic and biotic parameters, the impacts of one variable on microbial community structure may become overwhelmed by the impacts of other variables, creating threshold effects and difficulties in resolving drivers of community change. Clades within communities also respond differently to various environmental factors because their diverse physiological adaptations to resource limitations or environmental severity result in differences in relative fitness (Franklin and Mills, 2009; Geyer et al., 2013). Finally, a related complexity is the presence of "contextual effects," i.e., the observation that relationships between environmental factors and communities depend on geographic context (Van Horn et al., 2013). Therefore, multi-scale analyses along environmental gradients and across various landscape contexts are necessary to understand the abstruse dynamics of scale-dependent ecological processes that structure microbial communities.

A recent review of environmental heterogeneity-diversity studies noted that soil habitats are particularly underrepresented in the literature (Stein et al., 2014), despite the critical importance of soil microbial communities to ecosystem functioning (Cavigelli and Robertson, 2000). This limited understanding is due to the extremely high diversity of these systems, which often contain thousands of microbial species per gram of soil (Torsvik et al., 1996), the physical and chemical complexity and heterogeneity of soil habitats (Nannipieri et al., 2003), and the frequency of stochastic disturbances which result in the formation of complex spatial patterns (O'Brien et al., 2016). Spatially structured communities have been observed from continental scales (Fierer et al., 2009; Lauber et al., 2009; Barberán et al., 2012; Fierer et al., 2012, 2013) to centimeter scales (Morris, 1999; Grundmann and Debouzie, 2000; Franklin and Mills, 2009; O'Brien et al., 2016), but the underlying ecological mechanisms remain difficult to decipher. Two master variables frequently implicated in controlling microbial diversity and community composition are pH and salinity. Several studies have suggested that the most influential variable on soil microbial community composition is pH (Fierer and Jackson, 2006; Baker et al., 2009; Lauber et al., 2009; Smith et al., 2010) while others have suggested salinity (Lozupone and Knight, 2007; Zeglin et al., 2011; Lee et al., 2012). We propose that differences in scale-dependent environmental heterogeneity may underlie these conflicting conclusions, and we suggest that determining the impacts of environmental gradient length and severity is crucial to unlocking this puzzle.

The soils of the McMurdo Dry Valleys (MDV) are an ideal natural laboratory to investigate the relationships between microbial community structure, edaphic characteristics, and scale. Considered the coldest, driest, most oligotrophic desert on Earth (Hopkins et al., 2006; Cary et al., 2010), the MDV are a microbe-dominated ecosystem, as extreme conditions prohibit the existence of higher plants and animals. In the absence of vegetation and biotic interactions such as herbivory, MDV soils are shaped into distinct spatial patterns by physicochemical factors including moisture, salinity, pH, and carbon availability (Barrett et al., 2004; Lee et al., 2012; Van Horn et al., 2013, 2014; Okie et al., 2015). MDV soils contain relatively low levels of biodiversity, providing an ideal model system with which to investigate community-environment interactions. Invertebrates are rare and phyla include only the protozoa, rotifers, tardigrades, nematodes and Collembola. At many sites, only one species of nematode, the endemic Scottnema lindsayae, is found (Freckman and Virginia, 1997; Barrett et al., 2004). Fungal and archaeal community distribution is similarly patchy (Arenz and Blanchette, 2011; Richter et al., 2014) and, while bacteria are ubiquitous, their diversity is, on average, approximately one third of that found in most other soils (Van Horn et al., 2013, 2014).

Previous research has leveraged the existence of patterned ground formations or soil polygons in the MDV and elsewhere to investigate the role of geomorphic history (Brinkmann et al., 2007) and physicochemical variation across multiple spatial scales (Barrett et al., 2004) on soil biodiversity, although the former study was focused on cyanobacteria and the latter on invertebrates. Strong physical and biogeochemical gradients form along soil polygons that are related to nematode abundance and biodiversity variation. However, a similar, systematic investigation of MDV soil bacterial communities has not been conducted, despite their ubiquity across the MDV landscape (Takacs-Vesbach et al., 2010) and despite multiple studies showing that MDV soil bacterial communities are active under in situ conditions (Schwartz et al., 2014; Buelow et al., 2016). Understanding the effects of environmental heterogeneity is crucial to predicting the controls on the diversity and distribution of soil microbial communities. However, untangling these relationships is complicated by landscape contexts and spatial-scale-dependent evolutionary and ecological mechanisms. Using soil polygons as the organizing framework across multiple biogeochemically diverse basins is an ideal approach to address environmental heterogeneity. The goal of this study was to explore the effects of edaphic pH and electrical conductivity (EC) gradients on bacterial community diversity and composition as they vary across regional, watershed basin, and local scales. Specifically, we examined the degree to which these key edaphic gradients structure microbial communities by disentangling the impacts of (1) spatial scale, (2) edaphic gradients, and (3) related threshold effects on microbial distribution patterns.

# MATERIALS AND METHODS

#### Site, Sampling Description, and Chemical Analysis

The McMurdo Dry Valleys, Victoria Land, Antarctica (77°30′ S, 163°00′ E) comprise one of the harshest environments on Earth, with air and surface soil temperatures averaging between −15 and −30°C and extremes ranging from −60 to 25°C on the soil surface (Doran et al., 2002). The MDV are the largest ice-free zone in continental Antarctica (Fountain et al., 1999) with a total area of 22,700 km<sup>2</sup> and an ice-free area of 4,500 km<sup>2</sup> (Levy, 2013). The MDV receive little precipitation (<10 cm snow per year; Keys, 1980), most of which is lost through sublimation (Clow et al., 1988). The low soil moisture and precipitation result in an accumulation of salts and high pH in the upper soil stratum (Bockheim, 1997). Physicochemical gradients across the MDV are especially heterogeneous owing to their diverse glacial history (Lee et al., 2012). The hyper-arid mineral soils are primarily categorized as Anhyorthels or Anhyturbels and contain very little organic matter (Bockheim, 1997). Furthermore, the soils of the MDV are subjected to frequent freeze-thaw cycles that create physical sorting of rocks and soil particles and cause the expansion and contraction of permafrost layers 0.2–0.5 m below the ground surface (Kessler et al., 2001; Bockheim, 2002; Kessler and Werner, 2003). These processes create patterned ground formations, termed soil polygons. The polygons are clearly distinguished by intersecting troughs along their margins, and are a prominent landscape feature useful for geometrically scaling local ecological information (Barrett et al., 2004). Within polygons, significant differences in soil chemistry have been detected that are presumably due to naturally occurring edaphic heterogeneity and/or soil polygon mechanics, i.e., cryoturbation (Bockheim, 2002; Barrett et al., 2004).

Soils were aseptically collected from polygons during the austral summer of 2012 from the three major hydrological basins of the Taylor Valley: Bonney, Fryxell, and Hoare. A map of approximate sampling locations and major topographical features is provided in the **Supplementary Figure S1**. Eight soil polygons with an approximate radius of 6 m were randomly selected within each lake basin. Polygons within the Bonney basin were between 380 and 410 m from the lake margin, polygons within the Hoare basin 270–280 m, and polygons within the Fryxell basin were approximately 120 m away. Within each polygon, five samples were aseptically collected with sterilized scoops to a depth of approximately 10 cm along radial transects beginning from the trough edge to the center (at 0, 0.4, 0.8, 2, and 6 m), for a total of 120 samples. Soils were collected into sterile Whirl-Pak bags. Within 24 h, soils for molecular analysis were subsampled into sterile tubes by preserving approximately 10 g of soil with an equal volume of sucrose lysis buffer (Giovannoni et al., 1990). Samples were stored at −20°C until extraction. Soil pH was determined on 1:2 soil/deionized water extracts using an Orion pH probe. EC of 1:5 soil/water extracts was measured with a Yellow Springs Instrument 3100 conductivity meter.
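The nested sampling design described above (basins, polygons within basins, transect positions within polygons) can be enumerated in a few lines. The following is only an illustrative sketch: the function name `sampling_design` and the record layout are hypothetical and not part of the original study.

```python
from itertools import product

BASINS = ("Bonney", "Fryxell", "Hoare")
POLYGONS_PER_BASIN = 8
# Distance (m) from the polygon trough edge toward the polygon center.
TRANSECT_POSITIONS_M = (0.0, 0.4, 0.8, 2.0, 6.0)


def sampling_design():
    """Enumerate every (basin, polygon, transect position) sampling point."""
    polygons = range(1, POLYGONS_PER_BASIN + 1)
    return [
        {"basin": basin, "polygon": polygon, "distance_m": distance}
        for basin, polygon, distance in product(BASINS, polygons, TRANSECT_POSITIONS_M)
    ]


design = sampling_design()
print(len(design))  # 3 basins x 8 polygons x 5 positions
```

Each record carries the three grouping levels that correspond to the regional, basin, and local scales of analysis.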

### DNA Extraction, Sequencing, and Sequence Analysis

DNA from 0.7 g of soil was extracted using the cetyltrimethylammonium bromide (CTAB) method (Hall et al., 2008; Mitchell and Takacs-Vesbach, 2008). Barcoded amplicon pyrosequencing of 16S rRNA genes was performed as previously described (Dowd et al., 2008; Van Horn et al., 2013, 2014) using V6 universal bacterial primers 939F 5′-TTG ACG GGG GCC CGC ACA AG-3′ and 1492R 5′-GTT TAC CTT GTT ACG ACT T-3′ on a Roche 454 FLX instrument using Roche titanium reagents following the manufacturer's instructions.

The 16S rRNA gene sequences were quality filtered, denoised, screened for PCR errors, and chimera checked using default parameters in AmpliconNoise (Quince et al., 2011). The Quantitative Insights into Microbial Ecology (QIIME) pipeline was used to analyze the 16S rRNA gene sequences (Caporaso et al., 2010a). Unique 16S rRNA gene sequences or operational taxonomic units (OTUs) were identified at the 97% DNA identity criterion using UCLUST (Edgar, 2010). A representative sequence was chosen from each OTU and aligned using the PyNAST aligner (Caporaso et al., 2010b) and the Greengenes core set (version 13.8) (DeSantis et al., 2006). Taxonomic assignments of the OTUs were made using the Ribosomal Database Project (RDP) Classifier (Wang et al., 2007).

All measures of community diversity (observed species, inverse Simpson, Good's coverage, Bray–Curtis, and Jaccard distances) and composition were performed with randomly selected subsets of 500 sequences per sample to standardize for varying sequencing efforts across samples. Raw sequence data from this study are available through the NCBI Sequence Read Archive as PRJNA436435. The individual sff files from this study were assigned the accession numbers SAMN08624939–SAMN08625045.
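To make the rarefaction step and the diversity measures concrete, the sketch below subsamples an OTU count table to a fixed depth and computes observed species, inverse Simpson, Bray–Curtis, and Jaccard values. It is a minimal illustration in plain Python, not the QIIME implementation used in the study. The function names are hypothetical, and the Jaccard distance is assumed here to be presence/absence based.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from an OTU -> count mapping."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        return None  # sample is too shallow to be rarefied to this depth
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


def observed_species(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


def inverse_simpson(counts):
    """1 / sum(p_i^2), where p_i is the relative abundance of OTU i."""
    total = sum(counts.values())
    return 1.0 / sum((n / total) ** 2 for n in counts.values() if n > 0)


def bray_curtis(a, b):
    """Abundance-based dissimilarity between two samples."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    denominator = sum(a.get(o, 0) + b.get(o, 0) for o in otus)
    return numerator / denominator


def jaccard(a, b):
    """Presence/absence dissimilarity: 1 - shared OTUs / total OTUs."""
    present_a = {o for o, n in a.items() if n > 0}
    present_b = {o for o, n in b.items() if n > 0}
    return 1.0 - len(present_a & present_b) / len(present_a | present_b)
```

Rarefying every sample to the same depth (500 sequences in this study) before computing these measures keeps richness and distance estimates comparable across samples with unequal sequencing effort.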

#### Statistical Analysis


The normality of pH, EC, and alpha diversity distributions was assessed using Shapiro–Wilk tests. Significant differences in pH, EC, and alpha diversity data were assessed using non-parametric Kruskal–Wallis rank sum tests followed by post hoc pairwise Tukey honest significant difference (HSD) tests, corrected for multiple comparisons.

Data were pooled among all three lake basins for regional scale analysis. Patterns in microbial communities among and within lake basins (basin scale) were analyzed using nonmetric multidimensional scaling (NMDS) using Bray–Curtis and Jaccard distances. Differences in microbial community composition (i.e., Bray–Curtis distances) were assessed by an Analysis of Similarity (ANOSIM) test with 999 permutations to assess significance. In addition, we investigated the degree to which microbial community profiles were associated with environmental factors by using the Random Forests classification algorithm (Breiman, 2001) implemented in QIIME's supervised\_learning.py command with 10-fold cross-validation on a rarefied OTU table (−e 500) that was filtered to remove OTUs with fewer than 10 sequences. Performance of the random forests classifier is reflected by the ratio of baseline error to the estimated generalization error; ratios of 2 or greater indicate that the classifier is at least twice as accurate as random guessing. Phylum level patterns were investigated using linear regression analysis of relative abundance correlations along the radial transects of polygons.

Canonical correspondence analysis (CCA) was used to identify significant environmental variables that explained the variance of the OTU-level community structure. CCA is a constrained analysis that only partitions variation that can be explained by environmental factors while using chi-square distances to perform weighted linear mapping (Oksanen et al., 2010). CCA is considered a robust and valuable analysis for ecological data because it performs well with skewed species distributions, noise, interrelated environmental variables, and violations of assumptions (Palmer, 1993). The statistical significance of explanatory variables was assessed via the adonis test, as implemented in the vegan R package (Oksanen et al., 2010). Adonis is a permutational (n = 999) multivariate analysis of variance test that partitions distance matrices among sources of variation. The significance of CCA model constraints was assessed by the permutation test function anova.cca. CCA tests were run when considering communities at the regional and basin scales. NMDS and CCA tests were conducted using the vegan library (Oksanen et al., 2016) in the R programming environment.

We considered the degree and significance of spatial structuring on community-environmental relationships across regional, lake basin, and local scales via Mantel and partial Mantel tests (Mantel, 1967; Smouse et al., 1986; Legendre and Legendre, 2012). Mantel tests were conducted to assess spatial auto-correlation using Jaccard- (community composition) and Euclidean- (observed species, pH, EC, spatial) based distance matrices. Spatial distance matrices were based on geographic coordinates at the regional and basin scales and on distance from the polygon trough at local scales. Partial Mantel tests were used to compute the correlations among edaphic variables and community composition and richness while controlling for the effects of spatial structure. Mantel and partial Mantel tests were implemented using Spearman correlations within the vegan library (Oksanen et al., 2016) and the significance of results was assessed via permutational analyses (n = 999). Additional analysis of spatial structuring was performed using these distance matrices to create a multivariate Mantel correlogram.

We investigated the relative influence of pH versus EC on species richness using a sliding window model. To do this, we created windows (i.e., subsets) of 10 samples across each edaphic gradient and then ran the frame across the entire gradient, stepping by one sample at a time. For example, samples were ordered from lowest to highest pH and the first window contained samples 1–10 while the second window consisted of samples 2–11. The process was repeated along the EC gradient. Linear regressions of observed species in relation to pH + EC were calculated for each window and relative contributions of the edaphic variables to explainable variability (R<sup>2</sup>) were assessed using the relaimpo R package (Grömping, 2006). We used the recommended "lmg" metric, which provides a decomposition of the model explained variance into non-negative contributions while removing the effects of regression variable ordering. Results were visualized by plotting rectangles representative of the windows shaded by the proportion of variability explained along each edaphic gradient after normalization of overlapping regions. Normalization was conducted by averaging the relative contribution of each respective edaphic gradient in intervals of 5 µS/cm and 0.05 pH units.
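For a model with only two predictors, the "lmg" decomposition has a closed form: each predictor's share is the average of its R<sup>2</sup> contribution when entered first and when entered second. The sketch below applies that closed form within sliding windows of 10 samples. It is an illustrative reimplementation in plain Python rather than the relaimpo code used in the study. The dictionary keys `ph`, `ec`, and `richness` and the function names are hypothetical, and the normalization of overlapping windows is omitted.

```python
from math import sqrt


def _corr(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lmg_two_predictors(x1, x2, y):
    """Return (lmg_x1, lmg_x2, R2_full) for the regression y ~ x1 + x2."""
    r1, r2, r12 = _corr(x1, y), _corr(x2, y), _corr(x1, x2)
    # R^2 of the two-predictor model expressed through pairwise correlations.
    r2_full = (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)
    # Average each predictor's contribution over both orders of entry.
    lmg1 = 0.5 * (r1 * r1 + (r2_full - r2 * r2))
    lmg2 = 0.5 * (r2 * r2 + (r2_full - r1 * r1))
    return lmg1, lmg2, r2_full


def sliding_windows(samples, key, width=10):
    """Order samples along `key`, then step a window of `width` one sample at a time."""
    ordered = sorted(samples, key=lambda s: s[key])
    results = []
    for start in range(len(ordered) - width + 1):
        window = ordered[start:start + width]
        ph = [s["ph"] for s in window]
        ec = [s["ec"] for s in window]
        richness = [s["richness"] for s in window]
        lmg_ph, lmg_ec, r2 = lmg_two_predictors(ph, ec, richness)
        results.append((window[0][key], window[-1][key], lmg_ph, lmg_ec, r2))
    return results
```

Calling `sliding_windows(samples, "ph")` and `sliding_windows(samples, "ec")` yields, for each window, the gradient range it covers and the share of explained variance attributed to each variable; the two shares always sum to the window's R<sup>2</sup>.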

Lastly, the effects of pH and EC as drivers of microbial community diversity and composition were assessed by Spearman rank correlations. P-values of environmental factors were adjusted for multiple comparisons using the Benjamini and Hochberg (1995) method.
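The Benjamini and Hochberg adjustment scales each P-value by the number of tests divided by its rank and then enforces monotonicity from the largest P-value downward. The following minimal sketch illustrates the procedure; the function name is hypothetical and the example P-values are invented for demonstration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that each adjusted
    # value never exceeds the adjusted value of the next larger P-value.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

For the four example values, the adjusted P-values are 0.02, 0.04, 0.04, and 0.02 (up to floating-point rounding).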

# RESULTS

### Sequencing Results

Pyrosequencing of 16S rRNA gene libraries resulted in 680,727 reads (6,275 ± 4,540 reads per sample, n = 107 samples), with a mean length of 393 ± 28 bp. Samples were rarefied to 500 sequences per sample to account for uneven sequencing depth among samples. When data from all samples were considered together (regional scale), a total of 25,449 and 5,092 OTUs (97% sequence similarity) were identified in the non-rarefied and rarefied datasets, respectively. The Good's coverage statistic of the rarefied dataset ranged from 0.66 to 0.97 with an average of 0.80, indicating that the majority of diversity was detected in most samples (Good, 1953; Kemp and Aller, 2004). The rarefied dataset included 5,092 OTUs that were primarily assigned to the phyla Acidobacteria (average of 28% relative abundance), Actinobacteria (9%), Bacteroidetes (19%), Deinococcus–Thermus (8%), Gemmatimonadetes (5%), and Proteobacteria (9%).
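Good's coverage estimates the fraction of a community that has been sampled as one minus the proportion of reads belonging to singleton OTUs. A minimal sketch of the calculation follows; the function name and the example counts are hypothetical.

```python
def goods_coverage(otu_counts):
    """Good's coverage: 1 - (number of singleton OTUs / total number of reads)."""
    total_reads = sum(otu_counts)
    singletons = sum(1 for n in otu_counts if n == 1)
    return 1.0 - singletons / total_reads


# Hypothetical rarefied sample: 500 reads, of which 100 fall in singleton OTUs.
example = [1] * 100 + [400]
print(goods_coverage(example))
```

In this invented example, 100 singletons among 500 reads give a coverage of 0.80, the same value as the average reported for the rarefied dataset.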

### Spatial Scale, Edaphic Gradients and Microbial Distributions

The relationships between edaphic gradients and the distribution of bacteria were investigated at three spatial scales: regional, basin, and local. The CCA model constructed at the regional scale (**Table 1** and **Figure 1**) explained 17.1% of the variability of the OTU-level community matrices, leaving 82.9% of the variation unexplained (model P = 0.001). Additionally, adonis tests revealed that lake basin origin accounted for 25% of the community variance, EC for 14%, pH for 3%, and distance from the trough for 2% (all P-values ≤ 0.05). Mantel tests indicated significant spatial structuring of community and environmental variables at the regional scale (**Table 2**). Partial Mantel tests at the regional level indicated significant correlations between pH and community composition and between EC and both community composition and richness (**Table 3**). Similar results were obtained using tests based on both Jaccard and Bray–Curtis community dissimilarity matrices, so only results derived using Jaccard are reported. Additional evidence of spatial autocorrelation is provided by a Mantel correlogram (**Supplementary Figure S3**).

TABLE 1 | Results of canonical correspondence analysis performed at the regional and basin scales.

<sup>∗</sup>P ≤ 0.1, ∗∗P ≤ 0.01, ∗∗∗P ≤ 0.001.

At the basin scale, soil pH and EC varied significantly (Kruskal–Wallis; pH: P < 0.001, EC: P < 0.001) (**Table 4**). The pH values were least alkaline in the Bonney Basin (mean 8.77, ±SE 0.06), intermediate in the Fryxell Basin (mean 9.57, ±SE 0.08), and most basic in the Hoare Basin (mean 10.03, ±SE 0.04) (**Table 4**). In contrast, soil EC values were lowest in Hoare Basin (144 µS/cm, ±SE 10), intermediate in Bonney Basin (361 µS/cm, ±SE 54), and highest in Fryxell Basin (788 µS/cm, ±SE 135) (**Table 4**). Alpha diversity also varied significantly among basins (Kruskal–Wallis; observed species: P < 0.05 and inverse Simpson: P < 0.001) (**Table 4**). The highest average microbial richness, as measured by the number of observed species (OTUs), was found in the Lake Hoare basin (166 ± 6) and the lowest in the Lake Fryxell basin (136 ± 7). The inverse Simpson diversity index, which incorporates species evenness, was highest (43.31 ± 4.55) in the Lake Bonney basin soils and lowest (20.68 ± 2.16) in the Lake Fryxell Basin soils. Both indices indicated that the Fryxell basin communities were the least diverse.

Differences were observed in the overall taxonomic composition of the soils from the various basins. Soils from Lake Bonney Basin had the most even distribution of phyla, containing 4–15% relative abundances of Acidobacteria, Actinobacteria, Bacteroidetes, Firmicutes, Planctomycetes, Proteobacteria, Verrucomicrobia, and Deinococcus–Thermus (**Supplementary Figure S2**). Lake Fryxell basin soils were dominated by Acidobacteria (28%), Bacteroidetes (24%), and Deinococcus–Thermus (15%). Acidobacteria were especially dominant in the Lake Hoare basin soils (43%), which also included Bacteroidetes (18%), Actinobacteria (9%), and Verrucomicrobia (7%). Bacterial community composition varied significantly among lake basins as evidenced by statistically significant clustering by basin (ANOSIM R statistic = 0.47, P = 0.001) using both Bray–Curtis (Legendre and Gallagher, 2001) and Jaccard distances (**Figure 2**), in addition to high random forests classification ratios (ratio of 14.1). Basin-specific CCA models were only significant for communities within the Lake Hoare and Fryxell basins (**Table 1**). EC explained the most variation in the Lake Fryxell basin (adonis: R<sup>2</sup> = 0.41), while pH and distance from the trough only explained 4% of variation. Within the Hoare basin, EC explained 17% of variation, distance from the trough explained 14%, and pH accounted for 6%. Partial Mantel tests at the basin level indicated significant correlations between pH and community composition in all three basins, EC and composition in Hoare and Fryxell basins, and EC and richness in Fryxell Basin (**Table 3**).

At the local scale, soil pH varied significantly along the polygon transects only in the Lake Bonney basin (Kruskal–Wallis; P = 0.0358) (**Figure 3** and **Supplementary Table S1**), increasing from an average of 8.45 to 9.04 from the trough to the polygon center. Soil EC only varied significantly along transects in the Lake Fryxell basin (Kruskal–Wallis; P = 0.036), increasing from 228 µS/cm at the trough to 1,190 µS/cm in the center. Along the polygon transects, the highest number of observed species was found within the Lake Hoare polygon troughs while the lowest was in the center of Lake Fryxell polygons. Lake Bonney basin transects generally had the highest inverse Simpson values and Lake Fryxell basin transects had the lowest. Both metrics decreased toward the center of polygons within the Lake Fryxell basin (Kruskal–Wallis; observed species: P = 0.0006, inverse Simpson: P = 0.0028). The relative composition of phyla along the polygon transects within Bonney and Hoare basins was stable and indistinguishable (random forests ratios of 1.0 and 1.7, respectively, **Figure 4**). In contrast, communities along the transects in the Fryxell basin soils diverged significantly (random forests ratio of 2.1) as Deinococcus–Thermus and Gemmatimonadetes increased in relative abundance toward the center of the polygons while Acidobacteria and Bacteroidetes decreased. Populations of Acidobacteria and Actinobacteria were inversely correlated, as were Bacteroidetes versus Deinococcus–Thermus and Gemmatimonadetes (**Table 5**). Both Bacteroidetes and Proteobacteria, as well as Gemmatimonadetes and Deinococcus–Thermus, were positively correlated. Correlations between soil chemistry and community structure at the local level were only significant for some transects in some basins. In the Fryxell basin, significant correlations were observed between community composition and EC along six out of eight transects, and between richness and EC at four out of eight transects (**Table 3**). Raw edaphic and alpha diversity values are reported in **Supplementary Table S2**.

Local scale key indicates basin\_polygon number. <sup>∗</sup>P ≤ 0.1, ∗∗P ≤ 0.01, ∗∗∗P ≤ 0.001, ND, not determined due to small sample size.

### Threshold Effects and Phylum-Specific Patterns

Our sliding window model constructed at the regional scale simultaneously assessed the effects of pH and EC on richness and indicated that EC generally explained the majority of variation in the number of observed species (**Figure 5**). EC was the more influential edaphic variable across intervals from 80–165, 290–640, and at 1,000 µS/cm and above (**Figure 5B**). In contrast, pH explained more variation in small windows from 8.95–9.1 to 10.05–10.15 (**Figure 5A**).

TABLE 4 | Mean ± standard error of soil geochemical properties and alpha diversity values for soils collected from each lake basin.

Ranges of edaphic factors (minimum–maximum, difference) are reported inside parentheses. <sup>∗</sup> Indicates significant difference in raw data among basins (P < 0.05). Basin means with same superscript letter are not statistically different after adjusting for multiple comparisons.

Phylum-specific Spearman rank correlations were calculated to determine the proportion of variance in alpha diversity and phyla relative abundance that could be explained by distance from the polygon trough, pH, and EC (**Figure 6**). Significant correlations between distance from trough and alpha diversity were found within the Lake Hoare and Fryxell basin soils based on positive correlations between distance and observed species and negative correlations between distance and inverse Simpson (all P < 0.05). Significant correlations to phyla relative abundance were also only found in the Hoare and Fryxell lake basins. In all cases where a phylum was significantly correlated to an edaphic gradient in more than one basin, the direction of the correlation was consistent between basins; for example, Deinococcus–Thermus was positively correlated with EC in both Hoare and Fryxell basins (**Figure 6**).

# DISCUSSION

To better understand the relationship between bacterial species distribution patterns and the environmental factors that shape them, we performed a spatially stratified examination of the inherent edaphic gradients within the cold, dry, and oligotrophic polar desert ecosystem of the MDV, Antarctica. This study focused on the effects of edaphic pH and EC gradients on bacterial communities as they vary across local, lake basin, and regional scales. The simplified trophic structure and high spatial physicochemical heterogeneity of MDV soils provide an ideal model system in which to study natural, low-complexity microbial community-environment interactions and thus better understand the underlying scale-dependent processes that structure these communities. By minimizing the effects of biotic interactions, we can more directly address the poorly understood impacts of environmental heterogeneity and spatial scale on microbial community composition and diversity. Our findings corroborate previous studies of soil bacterial communities within the lake basins of Taylor Valley by describing high spatial variability and linking soil microbial community structure to edaphic geochemical gradients (Barrett et al., 2006; Niederberger et al., 2008; Smith et al., 2010; Zeglin et al., 2011; Lee et al., 2012; Sokol et al., 2013; Van Horn et al., 2013). However, this is the first study to use the inherent edaphic gradients within soil polygons to investigate the effects of spatial scale, environmental heterogeneity, and landscape context on bacterial community structure.

### Environmental Gradients at Different Scales

Edaphic gradients, which strongly affect the spatial patterning of soil microbial communities (Barrett et al., 2006; Fierer and Jackson, 2006; Lozupone and Knight, 2007; Niederberger et al., 2008, 2015; Smith et al., 2010; Zeglin et al., 2011; Lee et al., 2012; Sokol et al., 2013; Van Horn et al., 2013), occur across different spatial scales due to variation in the underlying drivers producing these gradients. For example, broad-scale edaphic gradients can be caused by differences in topography, climate, and geologic history that occur across landscapes (Jenny, 1941; Sinsabaugh et al., 2008; Townsend et al., 2008). The variations in edaphic conditions that we observed between lake basins in the MDV were most clearly illustrated by the absence of overlap between samples from different lake basins in our pH versus EC bi-plots (**Figure 5**). These observations are an example of landscape-scale gradients likely due to differences in parent geology, glacial till sequence, and paleo-lacustrine organic matter deposition (Péwé, 1960; Burkins et al., 2000).

FIGURE 2 | Non-metric multidimensional scaling plots created using (A) Bray–Curtis and (B) Jaccard distances show distinctiveness of communities among basins.

While broad-scale gradients create a general template for the formation of biological communities, fine-scale factors such as soil structure, microclimates, topography, and transition zones between habitats (e.g., riparian zones) can superimpose local-scale gradients on top of regional patterns (Ettema and Wardle, 2002). In the MDV, fine-scale gradients in edaphic factors including soil moisture, organic matter, pH, EC, ions, and nutrients have been observed near snowpacks (Gooseff et al., 2003; Van Horn et al., 2013), lake margins (Northcott et al., 2009; Zeglin et al., 2009), hyporheic zones (Barrett et al., 2009; Northcott et al., 2009; Zeglin et al., 2009; Niederberger et al., 2015), ponds (Moorhead et al., 2003), and mummified seals (Tiao et al., 2012). In this study, the physical processes inherent to polygon formation (e.g., frost sorting and aeolian deposition; Kessler et al., 2001; Bockheim, 2002; Kessler and Werner, 2003; Barrett et al., 2004) created local-scale gradients within some (Fryxell), but not all (Bonney and Hoare), basins. Thus, both the basin- and local-scale gradients for pH and EC provided an opportunity to study the effects of edaphic gradients on microbial community diversity and composition.

# Bacterial Community Responses to Gradients

In this study, we focused on bacterial community responses to pH and EC gradients because these edaphic parameters are master drivers of microbial community structure and diversity (Fierer and Jackson, 2006; Lozupone and Knight, 2007). High soil salinity increases water limitation by controlling total ion concentrations, and therefore the total water potential of the soil. Salinity can exert additional biological stress on soil microbes by osmotically increasing intracellular ion concentrations to potentially toxic levels resulting in decreases in respiration, critical enzyme activity, and nitrogen and carbon cycling (Frankenberger and Bingham, 1982; Zahran, 1997). The exact mechanism by which pH exerts its effects on microbes is less well defined, though two possible explanations have been proposed. First, pH may be an amalgam of other edaphic characteristics such as nutrient concentrations, cationic metal solubility, organic matter content, moisture, and salinity, and it is these variables that directly impact the soil communities (Lauber et al., 2009). Alternatively, severe pH gradients may create a selective advantage for species with increased tolerance to pH extremes (Lauber et al., 2009).

As described above, our sampling encompassed both broad- and local-scale variation in these primary edaphic variables, along with the subsequent impacts to bacterial community structure. Similar to other studies (e.g., Fierer and Jackson, 2006; Lauber et al., 2009), broad-scale environmental differences created distinct communities in spatially distant, but cohesive, areas (lake basins) based on ordination analyses (**Figures 2**, **5**). In particular, the Lake Bonney communities were differentiated from those in the other two lake basins, and the Lake Bonney soils had the lowest minimum, maximum, and mean soil pH. This result, coupled with the overall low within- and high between-basin variation in pH, suggests that this variable is particularly important in shaping the baseline community found in different regions of the MDV. Superimposed upon the broad-scale pH pattern, local polygon-scale conductivity gradients appear to drive local community composition, with predictable increases in Deinococcus–Thermus and Gemmatimonadetes and decreases in Acidobacteria, Bacteroidetes, and Proteobacteria with increasing EC (**Table 5** and **Figures 3**, **4**). These findings were particularly strong in Lake Fryxell basin, indicating that edaphic variations may elicit sufficient physiological stress to result in community changes due to environmental filtering and competitive interactions, as observed elsewhere (Weiher and Keddy, 2001).

TABLE 5 | Statistical metrics of significant linear regressions comparing phyla relative abundances along polygon transects within the Fryxell basin.

<sup>∗</sup>P ≤ 0.1, ∗∗P ≤ 0.01, ∗∗∗P ≤ 0.001.

pH, and conductivity adjusted for multiple comparison using Benjamini and Hochberg (1995).
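The Benjamini and Hochberg (1995) adjustment cited in the Table 5 footnote can be sketched as follows. This is a generic implementation of the standard step-up procedure, not the code used in this study; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that each adjusted
    # value never exceeds the adjusted value of a larger raw P-value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values from four regressions (illustration only).
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

Each raw P-value is scaled by the number of tests divided by its rank, so the smallest P-values are penalized most heavily, which controls the false discovery rate rather than the family-wise error rate.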

Our findings of both local- and broad-scale edaphic and community patterns contrast with several previous studies both within and outside of the MDV that found diverse or non-existent patterning at local scales and coherent patterns at larger scales (Barrett et al., 2004; O'Brien et al., 2016). We propose that a key to understanding this discrepancy is the degree of edaphic heterogeneity captured within our study. More specifically, not only is the presence of gradients important in structuring microbial communities, but the length or severity of the gradient is also crucial. For example, Lake Fryxell basin soils have the highest average EC (788 ± 135 µS) and range (86–2808 µS) (**Table 4**) and the bacterial communities at Lake Fryxell were the most highly responsive to EC (**Figures 1**, **5**). Furthermore, the strength of relationships among edaphic conditions and microbial communities depended on the magnitude of the heterogeneity exhibited at each scale that was analyzed (**Table 1**). Although the magnitude of relationships among community structure and edaphic factors varied among basins, the direction of change remained constant (**Figure 6**). This implies that, regardless of location, phyla are responding similarly to edaphic conditions. However, because the core communities were different due to the broad-scale underlying factors, communities from different basins did not appear to converge even at the extreme end of the edaphic ranges.

Our results suggest that Deinococcus–Thermus was likely the only phylum thriving in the high EC soils of the Lake Fryxell polygon centers. These organisms have previously been associated with low productivity soils (Niederberger et al., 2008). Deinococcus–Thermus had the largest variation across all scales, second only to Proteobacteria (**Supplementary Figure S2**). Interestingly, Proteobacteria have been identified as constituting a substantial proportion of the active communities in the MDV by studies using stable isotope probing (Schwartz et al., 2014) and RNA sequencing (Buelow et al., 2016). Thus, we can surmise that Proteobacteria and Deinococcus–Thermus are active and adapting to environmental conditions within our study system. Furthermore, the two aforementioned activity-based studies indicated that Acidobacteria and Bacteroidetes were largely inactive. Our study found comparably low variation for these phyla across spatial scales, indicating that perhaps they (or their relic DNA) had a more cosmopolitan distribution that decreased in relative abundance when environmental conditions better suited more specialized taxa, i.e., high EC selection of Deinococcus–Thermus.

Together, these results suggest several consistent patterns with respect to interactions between environmental gradients and bacterial community structure. First, sufficiently long gradients allowed a greater number of niches (and therefore a greater number of taxa), resulting in variation in community composition along the gradients. These niches may be due to differences in physiological stresses or other factors correlated to the abiotic gradient (Okie et al., 2015). We found that at any scale, there was potential for thresholds of effect: we observed local-scale thresholds within the Fryxell lake basin soils, but basin-scale thresholds in pH between Lake Bonney and Lake Hoare soils. Additionally, these edaphic factors did not work in isolation, but instead interacted synergistically. Further investigation into the relative contributions of EC and pH to microbial richness at the regional scale revealed the dominance of EC (**Figure 5B**), while pH explained more variation in small windows that correspond to EC levels below 500 µS (**Figure 5A**). This not only highlights the importance of gradient severity, but reveals that when EC is low, pH is more influential to community diversity and composition. Thus, it appears that environmental gradient length organizes soil bacterial communities. This conclusion is especially interesting considering the discrepancies over the suggested dominant environmental drivers reported in the literature. For example, Fierer and Jackson (2006) suggest that soil bacterial diversity is primarily correlated to pH, whereas Lozupone and Knight (2007) found diversity strongly correlates to soil salinity, but not pH. Our study encompassed almost neutral to very basic soils, ranging from ∼7.6 to 10.4 while Fierer and Jackson (2006) studied a pH range of 3.5 to 9, and did not consider soil salinity. 
We note that it is plausible that microbial communities are more responsive to acidity than alkalinity due to the complexities involved in the physiological adaptations toward acidophily (Colman et al., 2017). The pH and salinity ranges in the Lozupone and Knight (2007) study are not reported, and samples were binned into "saline" and "non-saline" groups. Thus, it is possible there are no actual contradictions in the conclusions given by these research groups but that these reportedly differing results were due to limitations of the edaphic gradients present in the different studies.

#### CONCLUSION

These findings stress the importance of a spatially explicit experimental design and recognition of the inherent gradients across a variety of spatial scales. In particular, effort is needed to sample the entirety of gradients present at each relevant spatial scale before reaching conclusions about the distribution of organisms or relationships among communities and environmental factors. Failure to do so may lead to spurious inferences or results that are artifacts of limited and unrepresentative data. Random sampling with the objective of capturing an unbiased representation of soil heterogeneity may capture edaphic ranges but will not capture the local structure needed to inform our understanding of ecological processes. For instance, we captured clear patterns at scales less than 6 m in Lake Fryxell soils because we had sufficient EC gradients to elicit physiological response. The substantial local EC gradients resulted from the physical processes of polygon formation and significantly affected larger-scale patterning. While environmental extremes may be less frequent, they play an important role in structuring biota across the landscape, even in the physiologically challenging environment posed by the MDV. Furthermore, we observed coherent local-scale patterns because the bacterial communities were (1) diverse, (2) active, and (3) adapted to the local environment. Barrett et al. (2004) did not observe clear local-scale patterns, potentially because the edaphic gradients sampled did not impart sufficient physiological stress on the nematodes, or perhaps because the eukaryotic community was not diverse enough and many samples did not contain organisms. In contrast, the ubiquity of bacteria in these soils allows us to see an additional aspect of environment-community interactions, that is, the effects of environmental filtering across finer spatial scales.

We conclude that a combination of local-scale polygon mechanisms as well as regional-scale geological histories drove changes in edaphic gradients that played a large role in determining the microbial community composition and diversity within the McMurdo Dry Valleys of Antarctica. Our results suggest that the relative importance of pH versus EC in structuring microbial communities is contextually related to the length and severity of edaphic gradients and the spatial scale of sampling, creating a framework in which to interpret conflicting literature.

# AUTHOR CONTRIBUTIONS

Funding was secured by CT-V, DVH, and ES. DVH conceived and designed the experiments. DVH, DC, TM, and HB performed the fieldwork. KF processed the samples for DNA sequencing. DVH, CT-V, and KF analyzed the data. KF, DVH, and CT-V wrote the manuscript with input from all the authors.

# FUNDING

This work was funded by NSF Grant OPP1142096 awarded to CT-V, DVH, and ES. The McMurdo LTER (NSF Grant OPP1115245) provided additional support to CT-V and DVH acknowledges support from NSF Grant OPP124599. The DNA sequencing reported in this publication was done by the Molecular Biology Facility in the Department of Biology and the Center for Evolutionary and Theoretical Immunology at the University of New Mexico, which is supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number P30GM110907. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

### ACKNOWLEDGMENTS

We would like to thank the support staff at McMurdo Station for their assistance, as well as Raytheon Company, Inc. and Petroleum Helicopters, Inc. for logistical support. We appreciate assistance with sample processing provided by George Rosenberg of the Center for Evolutionary and Theoretical Immunology (CETI), University of New Mexico.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.01928/full#supplementary-material

#### REFERENCES

ecosystems. Hydrol. Earth Syst. Sci. 13, 2349–2358. doi: 10.5194/hess-13-2349-2009

at global scale. Ecol. Lett. 11, 1252–1264. doi: 10.1111/j.1461-0248.2008.01245.x


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Feeser, Van Horn, Buelow, Colman, McHugh, Okie, Schwartz and Takacs-Vesbach. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Diversity of Mesopelagic Fishes in the Southern Ocean - A Phylogeographic Perspective Using DNA Barcoding

Henrik Christiansen<sup>1</sup>\*, Agnès Dettai<sup>2</sup>, Franz M. Heindler<sup>1</sup>, Martin A. Collins<sup>3</sup>, Guy Duhamel<sup>4</sup>, Mélyne Hautecoeur<sup>4</sup>, Dirk Steinke<sup>5,6</sup>, Filip A. M. Volckaert<sup>1</sup> and Anton P. Van de Putte<sup>1,7</sup>

<sup>1</sup> Laboratory of Biodiversity and Evolutionary Genomics, KU Leuven, Leuven, Belgium, <sup>2</sup> UMR 7205 ISYEB CNRS-MNHN-Sorbonne Universite-EPHE, Département Systématique et Évolution, Muséum National d'Histoire Naturelle, Paris, France, <sup>3</sup> Centre for Environment, Fisheries and Aquaculture Science, Lowestoft, United Kingdom, <sup>4</sup> UMR 7208 BOREA, Département Milieux et Peuplements Aquatiques, Muséum National d'Histoire Naturelle, Paris, France, <sup>5</sup> Centre for Biodiversity Genomics, University of Guelph, Guelph, ON, Canada, <sup>6</sup> Department of Integrative Biology, University of Guelph, Guelph, ON, Canada, <sup>7</sup> OD Nature, Royal Belgian Institute of Natural Sciences, Brussels, Belgium

#### Edited by:

Oana Moldovan, Emil Racovita Institute of Speleology, Romania

#### Reviewed by:

Peter John Unmack, University of Canberra, Australia Anna Maria Pappalardo, Università degli Studi di Catania, Italy

#### \*Correspondence:

Henrik Christiansen henrik.christiansen@kuleuven.be orcid.org/0000-0001-7114-5854

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

Received: 21 February 2018 Accepted: 24 July 2018 Published: 05 September 2018

#### Citation:

Christiansen H, Dettai A, Heindler FM, Collins MA, Duhamel G, Hautecoeur M, Steinke D, Volckaert FAM and Van de Putte AP (2018) Diversity of Mesopelagic Fishes in the Southern Ocean - A Phylogeographic Perspective Using DNA Barcoding. Front. Ecol. Evol. 6:120. doi: 10.3389/fevo.2018.00120

Small mesopelagic fish are ubiquitous in the ocean, representing an important trophic link between zooplankton and tertiary consumers such as larger fish, marine mammals and birds. Lanternfishes (Myctophidae) are common worldwide as well as in the Southern Ocean. However, only 17 of the approximately 250 myctophid species occur exclusively in sub-Antarctic or Antarctic waters. It is unclear whether they colonized these latitudes once and diversified from there, or whether multiple colonization events took place in which multiple ancestral phenotypes entered the Southern Ocean at various times. Phylogeographic patterns have been investigated for individual myctophid species, but so far no study has compared species across the Southern Ocean. Here, we present a dataset with previously unpublished cytochrome c oxidase I (COI; n = 299) and rhodopsin (rh1; n = 87) gene sequences from specimens collected at various locations in the Southern Ocean. Our data extend the DNA barcode library of Antarctic mesopelagic fish substantially. Combined morphological and molecular taxonomy led to confident species level identification in 271 out of 299 cases, providing a robust reference dataset for specimen identification, independently of incomplete morphological characters. This is highly topical in light of prospective ecological metabarcoding studies. Unambiguous sequences were subsequently combined with publicly available sequences of the global DNA barcode library yielding a dataset of over 1,000 individuals for phylogenetic and phylogeographic inference. Maximum likelihood trees were compared with results of recent studies and with the geographical origin of the samples. As expected for these markers, deep phylogenetic relationships remain partially unclear. However, COI offers unmatched sample and taxon coverage and our results at the subfamily to genus level concur to a large extent with other studies. Southern Ocean myctophids are from at least three distant subfamilies suggesting that colonization has occurred repeatedly. Overall, spatial divergence of myctophids is rare, potentially due to their enormous abundance and the homogenizing force of ocean currents. However, we highlight potential (pseudo-)cryptic or unrecognized species in Gymnoscopelus bolini, Lampanyctus achirus, and the non-myctophid genus Bathylagus.

Keywords: marine biodiversity, adaptation, Antarctic, COI, Myctophidae, phylogeny, rhodopsin

#### INTRODUCTION

The mesopelagic fauna of the world's oceans is dominated by ubiquitous small filter-feeding fish. These fishes likely represent a total biomass of up to 10 billion tons and include perhaps the most abundant vertebrate species on earth, Cyclothone sp. (Irigoien et al., 2014; Proud et al., 2018). Approximately 90% of all small mesopelagic fishes belong to the bristlemouths (Gonostomatidae) and lanternfishes (Myctophidae). They form an important trophic link between primary consumers (predominantly mesozooplankton) and higher trophic levels such as large fish, squid, marine mammals, and birds (Smith et al., 2011a). Most small mesopelagic fish, which are generally found in the zone between approximately 200 and 1,000 m, undertake a diurnal vertical migration following their prey into the epipelagic zone to feed at night (Isaacs et al., 1974). During daytime they retreat into the deep again, where they digest and excrete, which likely results in a substantial vertical carbon flux (Irigoien et al., 2014). Sonar reflections of their swim bladders cause the oceanic deep scattering layer (Barham, 1966). However, despite their importance for marine food webs and organic carbon cycling, small mesopelagic fish are largely understudied.

The sub-Antarctic and Antarctic waters of the Southern Ocean are of particular importance both for global climate through ocean circulation and as a relatively pristine sanctuary for marine biodiversity. The Southern Ocean harbors considerable biodiversity (Brandt et al., 2007; Griffiths, 2010), although species richness of fish is low compared to temperate and tropical seas with 322 currently recognized species from 19 families (Eastman, 2005). Nevertheless, the Southern Ocean has been identified as an evolutionary hotspot, particularly because of the morphological and ecological diversity of species and a high degree of endemism, which amounts to 88% in fish (Eastman, 1991, 2005). It is believed that a key factor for such evolutionary uniqueness is the relative isolation of the Southern Ocean fauna, initiated approximately 24–25 mya by the formation of the Antarctic Circumpolar Current (ACC), a system of ocean currents flowing around Antarctica from West to East between 50° and 60° South (Eastman, 1991; Rintoul et al., 2001; Lyle et al., 2007). The ACC is the dominant hydrographic feature in the Southern Ocean (Orsi et al., 1995), and by providing a continuous, strong flow it forms a variable, but permanent boundary between Antarctic waters and water masses of lower latitudes (Rintoul et al., 2001). This greatly hampers any possible north-south (or south-north) migration of organisms. However, Saunders et al. (2017) recently showed that lanternfish biomass in the Scotia and Weddell Sea must be supported by mass immigration from lower latitudes. The mesopelagic zone in temperate regions is generally strongly stratified and includes a distinct thermocline. Temperatures tend to range between 2° and 15°C with the upper layer being warmer and well mixed, followed by a sharp decrease in temperature at the thermocline (at around 50–400 m) and a gradual decrease of temperature with increasing depth. In contrast, the Southern Ocean is relatively well-mixed with temperatures ranging between −0.5 and 2.0°C (Ikeda, 1988). Temperatures at 1,000 m depth are therefore similar in temperate and Antarctic regions, whereas temperatures of upper water masses are very different.

Although less abundant than at lower latitudes, mesopelagic fish are still numerous in Antarctic and sub-Antarctic waters and represent a major part of the biomass (Eastman, 1993). In terms of species richness, abundance, and biomass, the mesopelagic zone there is dominated by lanternfishes (Myctophidae) (Donnelly et al., 1990; Kock, 1992). Myctophids are common in oceanic waters north of the Antarctic Slope Front (ASF; near the Antarctic continental shelf break), where they act as largely opportunistic mesozooplankton feeders with some interspecific dietary variation (Pakhomov et al., 1996; Pusch et al., 2004; Connan et al., 2010; Saunders et al., 2014, 2015). Charismatic Antarctic top predators such as king penguins (Cherel et al., 2009), Antarctic fur seals (Casaux et al., 2011; Santora, 2013), and seabirds (Connan et al., 2007) rely heavily on myctophids as a food source. Despite their small size, some myctophid species (Gymnoscopelus spp., Electrona carlsbergi) were commercially exploited in the 1980s (Hulley, 1990; Kock, 1992). Of the approximately 240 myctophid species recognized worldwide, 68 have been recorded south of the Sub-Tropical Front, 15 of which have a Sub-Antarctic and two an Antarctic distribution pattern (Duhamel et al., 2014). The remaining species exhibit widespread distribution patterns and only sporadically occur in the Southern Ocean.

Ecological studies are dependent on accurate biological identification to a level of taxonomic resolution appropriate for the study goal (Tautz et al., 2003). In myctophids, photophore patterns are mainly used to distinguish species. However, accurate identification can be impeded, because myctophids tend to lose scales during capture and are easily damaged in the net. The identification of early life stages may also be challenging. DNA barcoding is a molecular technique that uses the mitochondrial cytochrome c oxidase I gene (COI) as a genetic marker to provide biological identifications (Hebert et al., 2003). The system is now widely accepted and many taxa, including teleosts, have been successfully integrated in barcoding initiatives and data systems (Ratnasingham and Hebert, 2007, 2013; Ward et al., 2009). A sufficiently complete reference dataset of DNA barcodes thus enables fast and efficient verifications for morphologically identified specimens as long as COI exhibits levels of interspecific divergence that are higher than the intraspecific divergence of a given group. Furthermore, it can assist with the discovery of misidentified specimens, cryptic or simply not yet identified new species, help settle synonymies, or hint at intraspecific genetic structuring (Hajibabaei et al., 2007; see Bucklin et al., 2011 for an extended overview of marine barcoding applications). The latter can be used in phylogeography, a discipline concerned with phylogenetic relatedness and connectivity of species or populations with respect to geographic distribution. Genetic distance, derived from markers such as COI, is used to study the historical processes that may be responsible for the contemporary geographic distribution of individuals. In order to increase robustness of results derived from COI data, it can be useful to include an additional genetic marker, particularly nuclear and thus biparentally inherited (Cao et al., 2016; Thiel and Knebelsberger, 2016). 
Rhodopsin belongs to a family of genes, the so-called G-protein-coupled receptors, that are involved in translating external information (e.g., light, molecules) into internal signals that can be processed by organisms. Rhodopsin encodes a protein that is involved in photoreception (Palczewski et al., 2000). It occurs in the rod cells and is extremely light sensitive, enabling vision under low-light conditions (Yokoyama and Yokoyama, 1996). In Actinopterygians, the rhodopsin gene generally occurs in two copies, homologous to other vertebrates. One copy, rh1, is an intronless retrogene that does not recombine anymore with other opsins and has proven useful for fish identification and phylogeny (Fitzgibbon et al., 1995; Chen et al., 2003; Lin et al., 2017; Morrow et al., 2017).
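The barcoding criterion described above, that interspecific divergence in COI should exceed intraspecific divergence, can be illustrated with a minimal sketch. It uses the uncorrected p-distance on pre-aligned sequences as a simplifying assumption (the distance model is not specified in this passage), and the function names, species labels, and ten-base sequences are hypothetical toy data, not COI barcodes.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ('-') or 'N' in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def barcoding_gap(records):
    """Return (max intraspecific, min interspecific) pairwise distance.

    records is a list of (species_label, aligned_sequence) tuples.
    """
    intra, inter = [], []
    for (sp_a, seq_a), (sp_b, seq_b) in combinations(records, 2):
        target = intra if sp_a == sp_b else inter
        target.append(p_distance(seq_a, seq_b))
    if not intra or not inter:
        raise ValueError("need both within- and between-species pairs")
    return max(intra), min(inter)


# Hypothetical aligned toy sequences (illustration only).
toy_records = [
    ("species A", "ACGTACGTAC"),
    ("species A", "ACGTACGTAT"),
    ("species B", "ACGAACTTGC"),
    ("species B", "ACGAACTTGC"),
]
max_intra, min_inter = barcoding_gap(toy_records)
has_gap = min_inter > max_intra
```

When the smallest between-species distance exceeds the largest within-species distance, as in this toy example, a query sequence can be assigned to a species unambiguously; overlap between the two ranges would indicate that the marker cannot separate those taxa.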

Less than a decade ago Grant and Linse (2009) recognized a lack of Antarctic barcoding studies. The Census of Antarctic Marine Life (CAML) set an explicit focus on DNA barcoding, resulting in many studies making significant progress in addressing this gap (Schiaparelli et al., 2013 and references therein). Over the past few years, Antarctic barcoding demonstrated the usefulness of COI sequencing, e.g., to identify Trematomus fishes (Lautredou et al., 2010), and showcased the presence of cryptic species in various groups, e.g., pycnogonids (Krabbe et al., 2010), amphipods (Havermans et al., 2011), octopuses (Allcock et al., 2011), skates (Smith et al., 2008), and grenadier fishes (Smith et al., 2011b; McMillan et al., 2012). These examples clearly demonstrate that despite the fact that Antarctic biodiversity is still underexplored (Grant et al., 2010; Griffiths, 2010), molecular techniques can enhance our understanding of contemporary diversity patterns and the processes that shaped these (Allcock and Strugnell, 2012). Fish communities of the Southern Ocean have been studied using DNA barcoding, but these studies primarily focused on benthic fish in the Scotia Sea (Rock et al., 2008), Dumont d'Urville Sea (Dettaï et al., 2011), and Ross Sea (Smith et al., 2012). Phylogeographic patterns of myctophids have been investigated for a few species (e.g., Electrona antarctica, Van de Putte et al., 2012), but to date no study has compared species across the Southern Ocean. Here, we present an extensive DNA barcoding approach to investigate the ecologically relevant community of Antarctic mesopelagic fish.

Our objectives were (1) to extend the DNA barcode library of Antarctic mesopelagic fish, (2) to assess the success of specimen identification using this system, (3) to discover potential mismatches between taxonomy and genetic identification, (4) to compare our Antarctic myctophid phylogenetic data with recent myctophid phylogenies, and (5) to investigate phylogeographic patterns of common Antarctic myctophids. To achieve these objectives, we used a large-scale dataset of mesopelagic Antarctic fish, covering over 1,000 specimens from a circum-Antarctic sampling range. This dataset includes 386 new samples and combines these with publicly available sequences found on the Barcode of Life Data Systems, BOLD (Ratnasingham and Hebert, 2007). We focused on the analysis of COI, but extended our results by incorporating an additional nuclear marker (rh1). Thus, a comprehensive picture of the inter- and intraspecific diversity of mesopelagic fishes occurring in the Southern Ocean was drawn.

# MATERIALS AND METHODS

### Sampling and Identification

Mesopelagic fish were captured in the Southern Ocean and sub-Antarctic waters during various expeditions. The sampling effort comprised cruise 200 with RV James Clark Ross (see Collins et al., 2012b), cruises PS65 and PS69 with RV Polarstern, BROKE-West with RV Aurora Australis and additional Atlantic samples collected with RV G.O. Sars (BOLD project FISCO); the POKER sampling campaign 2010 off Kerguelen and additional Pacific samples from the JAMSTEC survey with RV Hakuho Maru (BOLD project MYCSO); cruises JR100 (Collins et al., 2008), JR161 and JR177 (Collins et al., 2012b) with RV James Clark Ross and a few specimens from commercial vessels (BOLD project BASMF); and finally 23 myctophid specimens collected off South Africa (BOLD project DSSAU). Samples from the Atlantic and Pacific Oceans were included to provide an outgroup framework. Overall, these sampling efforts yielded a total of 386 previously unpublished specimens (**Table 1**). All specimens were identified morphologically aboard the research vessels or, in absence of a taxonomic expert, immediately frozen or preserved whole in high-grade ethanol or formalin and identified at the respective institutions. Muscle tissue or fin biopsies were excised using sterile tools and stored in ethanol. In most cases identifications were carried out to species level and only in some instances to family or genus level (juvenile/larval or severely damaged specimens). The majority of specimens are stored at the Muséum National d'Histoire Naturelle (MNHN, Paris), KU Leuven (Belgium), the British Antarctic Survey (BAS, Cambridge) or the Natural History Museum (NHM, London), and the South African Institute for Aquatic Biodiversity (Grahamstown). Detailed collection data of all specimens are shown in **Supplementary Table S1**.

### DNA Extraction, PCR and Sequencing

DNA was extracted from the tissue sample using a modified standard salting-out protocol (Cruz et al., 2016). Extracts from the datasets FISCO, BASMF, and DSSAU were subsequently shipped to the University of Guelph, Canada, for COI amplification and sequencing following protocols described in Steinke and Hanner (2011). Primers used for COI were the cocktails C_FishF1t1–C_FishR1t1 as described in Ivanova et al. (2007). Rhodopsin gene fragments (rh1) were amplified using the primers Rh193 (5′-CNTATGAATAYCCTCAGTACTACC-3′) and Rh1039r (5′-TGCTTGTTCATGCAGATGTAGA-3′) (Chen et al., 2003). Amplification was conducted in a 25 µl volume with 0.2 mM dNTPs, 2.5 mM MgCl2, 20 µM primer mix, and conventional PCR buffer and Taq polymerase. PCR conditions were 2 min initial denaturation at 94°C, followed by 30–40 cycles of 20 s at 94°C, 30 s at 50–60°C, 70 s at 72°C, and a final 3 min elongation at 72°C. The MYCSO COI and rh1 dataset was generated at MNHN (France) following Dettaï et al. (2011) for extraction, PCR, and sequencing using standard automatic capillary sequencers. Additional rh1 sequences were generated at KU Leuven (Belgium) for the FISCO samples following the same protocol.

TABLE 1 | Overview of fish specimens and species and the respective numbers of DNA sequences that were successfully obtained for the cytochrome c oxidase I (COI) and rhodopsin (rh1) gene.

Project name abbreviations as used in the Barcode of Life Data Systems, BOLD (Ratnasingham and Hebert, 2007). Some samples with unclear identification status were excluded from tree building.

#### Dataset Augmentation and Trimming and Phylogenetic Statistics

We were able to retrieve 299 COI and 87 rh1 sequences of 308 specimens from 16 locations (**Table 1**, **Figure 1**). These sequences were deposited in the BOLD datasets: "Fishes of the Scotia Sea" (FISCO), "Myctophids of the Southern Ocean" (MYCSO), "BASMF," and "DSSAU." To increase taxonomic and spatiotemporal coverage our unpublished dataset was extended with publicly available data from BOLD/GenBank including some previously published Antarctic COI barcode sets: Rock et al. (2008) (samples from the Scotia Sea, South Orkney Islands, and Elephant Island), Mabragaña et al. (2011) (Argentina), Smith et al. (2012) (Ross Sea, Heard and McDonald Islands and more). Altogether, sequences cover an unprecedented area of the Southern Ocean, although many regions remain underrepresented (**Figure 1**). This can be attributed to the enormous logistic and financial challenges posed by Antarctic exploration. Species identity of all previously unpublished specimens from the Southern Ocean dataset was confirmed using internal tools of BOLD (Ratnasingham and Hebert, 2007) using all available COI sequences > 500 bp with species level identification on 15th December 2017. If molecular and morphological identification did not match, a second morphological examination was performed and only specimens that were attributed to the species identified by BOLD were kept as such. In addition, in case of any doubt, e.g., the absence of crucial morphological characters, the specimen was excluded from further analysis.
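The reference-set criteria described above (COI sequences longer than 500 bp that carry a species-level identification) can be illustrated with a minimal Python sketch. It is not the BOLD tooling used in this study: the record fields, the function names, and the rule for recognizing open nomenclature such as "sp." are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List

# IUPAC nucleotide codes; gap characters are deliberately not counted.
NUCLEOTIDE_CODES = set("ACGTRYKMSWBDHVN")
# Assumed markers of an identification that stops above species level.
OPEN_NOMENCLATURE = {"sp", "spp", "cf", "aff"}


@dataclass
class BarcodeRecord:
    record_id: str  # hypothetical field names, not the BOLD schema
    taxon: str
    sequence: str


def sequence_length(seq: str) -> int:
    """Number of nucleotide characters, ignoring gaps and whitespace."""
    return sum(1 for c in seq.upper() if c in NUCLEOTIDE_CODES)


def has_species_level_id(taxon: str) -> bool:
    """True for a binomial such as 'Electrona antarctica', False for 'Diaphus sp.'."""
    parts = taxon.split()
    return len(parts) >= 2 and parts[1].rstrip(".").lower() not in OPEN_NOMENCLATURE


def filter_references(records: List[BarcodeRecord], min_bp: int = 500) -> List[BarcodeRecord]:
    """Keep records with a species-level name and strictly more than min_bp nucleotides."""
    return [
        r for r in records
        if has_species_level_id(r.taxon) and sequence_length(r.sequence) > min_bp
    ]
```

The strict inequality mirrors the "> 500 bp" wording of the text; whether ambiguous bases should count toward the length is a design choice that the text does not specify.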

Sequences from five Synodus binotatus (two-spot lizardfish) specimens curated on BOLD were included for both COI and rh1. Synodus binotatus is an aulopiform fish, belonging to the order with the closest common ancestor to myctophiform fishes (Betancur et al., 2017). These sequences were used as outgroup for phylogenetic tree rooting. Two COI sequences each of Neoscopelus macrolepidotus and N. microchir (Neoscopelidae (blackchins), the other family in Myctophiformes, next to Myctophidae) were also included. No rh1 sequences of Neoscopelidae were available. Three different datasets were used for phylogenetic reconstruction: (1) all available COI sequences (new sequences, published myctophid sequences, and outgroup; total N = 1073); (2) all available rh1 sequences (new sequences and outgroup; N = 90); and (3) a concatenated dataset consisting of specimens from (1) and (2) for which good quality sequences of COI and rh1 were available (N = 68, including outgroup). Sequences were aligned via MUSCLE (Edgar, 2004) within Geneious v.8.1.5 (Biomatters Ltd) using a maximum of eight iterations and standard preset values. Tree building was performed in R v3.1.2 (R Core Team, 2016) using the packages "ape" (Paradis et al., 2004; Popescu et al., 2012) and "phangorn" (Schliep, 2011). Kimura's two-parameter substitution model (Kimura, 1980) is commonly used in DNA barcoding studies to construct genetic distance matrices, although the fit might be poor (Collins et al., 2012a). We decided to assess a variety of nucleotide substitution models with phangorn's "modelTest" function. The most appropriate model for all three datasets as determined by Akaike's information criterion (AIC) was the general time reversible model with gamma distributed rate variation among sites and a proportion of invariable sites ("GTR+G+I").
This substitution model was used as initial fit and for subsequent maximum likelihood (ML) optimization using a stochastic algorithm instead of nearest-neighbor-interchange to avoid local maxima. Edge support was evaluated with 10,000 randomly seeded bootstraps. Consensus trees were created in Geneious with a support threshold of 70% (Hillis and Bull, 1993) and were subsequently manually checked and annotated using MEGA7 v7.0.26 (Kumar et al., 2016). COI haplotype networks were created by median joining (Bandelt et al., 1999) in popART v.1.7 (Leigh and Bryant, 2015).

Location codes used in **Figure 1**: COS, Cosmonauts Sea; CPS, Cooperation Sea; KG, Kerguelen Islands; HMI, Heard and McDonald Islands; DDU, Dumont d'Urville Sea; ROS, Ross Sea.
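For readers unfamiliar with Kimura's two-parameter (K2P) model mentioned above, the following Python sketch computes the K2P distance between two aligned sequences from the proportions of transitions (P) and transversions (Q), using d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). It is an illustration only and not the R code used in this study; the function name and the handling of gaps and ambiguous bases (pairwise exclusion) are assumptions.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip sites with gaps or ambiguity codes in either sequence.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

For closely related barcodes the K2P distance is almost identical to the uncorrected proportion of differing sites; the correction matters only as divergence grows.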

# RESULTS

#### Extension of the DNA Barcode Library

Mesopelagic fish of various research expeditions were identified, cataloged, and when possible sequenced for COI and/or rh1. In some cases (not listed here) sequencing was impossible due to DNA degradation or amplification failure. Overall, 297 reliable COI sequences were added to BOLD after rigorous validation and exclusion of doubtful samples (two samples excluded, see below). Some of these sequences belong to larval, juvenile or incidentally caught fishes whose adult stages are generally not mesopelagic (notothenioids, grenadiers), leaving 264 Antarctic meso- or bathypelagic specimens from 35 different species with validated identification and COI sequences—a biogeographic assemblage that was previously almost absent in the database. The worldwide database for myctophids was extended by 23.7% to a total of 1,021 sequences. Furthermore, 87 validated rh1 sequences were added to BOLD. All samples and sequence IDs and associated metadata can be found in **Supplementary Table S1**.

#### Specimen Identification

All 299 previously unpublished COI sequences were identified using BOLD data and tools (using only species level barcode records). In some instances this revealed most likely misidentified or mislabeled sequences in BOLD. If even a single COI sequence of a given species on BOLD is misidentified, the identification engine reports no species-level match. For instance, at time of study BOLD contained 68 sequences with the Barcode Index Number (BIN; Ratnasingham and Hebert, 2013) corresponding to Electrona antarctica (Antarctic lanternfish; BOLD: AAB3737), all with low pairwise distance (average: 0.08%, maximum: 0.78%). The nearest neighbor of this BIN is Symbolophorus veranyi (large-scale lanternfish; BOLD: AAC4870; pairwise distance: 2.39%). Yet, one of the specimens in AAB3737 (E. antarctica) is labeled Nannobrachium achirus, a more distant species well represented by 27 other sequences in the clearly distinct BIN BOLD: AAB3778. In these cases we highlighted the likely misidentified or mislabeled sequences present in the database (**Supplementary Table S2**). Five such sequences were accessible to us and are now flagged in BOLD to avoid future misidentifications when using the database.
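The within-BIN and nearest-neighbor distances quoted above can be illustrated with a short Python sketch. It uses uncorrected pairwise distances (the proportion of differing sites), which is an assumption: the distance measure applied by BOLD is not stated here, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of differing sites among sites where both sequences have A, C, G or T."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    compared = differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def within_group_summary(seqs):
    """Average and maximum pairwise distance (in percent) within one group."""
    distances = [100 * p_distance(a, b) for a, b in combinations(seqs, 2)]
    return mean(distances), max(distances)


def nearest_neighbor_distance(group, other):
    """Smallest distance (in percent) between any member of group and any member of other."""
    return min(100 * p_distance(a, b) for a in group for b in other)
```

A group whose maximum internal distance stays well below its nearest-neighbor distance, as reported for the E. antarctica BIN, shows the gap on which barcode-based identification relies.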

Using this procedure, morphological and molecular identification showed high levels of congruence (97.65%). In eleven cases the morphological identification could be confidently improved using the BOLD identification engine, as specimens were attributed at least to genus level, which was confirmed through the placement of the specimen in the phylogenetic trees. In seven further cases a mismatch between morphological and molecular identification was detected. After detailed inspection the identification was revised (**Table 2**). Five specimens with unclear species or genus level identification could not be matched to any available COI sequence in BOLD. However, they could be attributed to genus level based on the phylogenetic tree (highlighted in bold italics in **Figure 2**). Two specimens were excluded from further analysis, because COI and rh1 gave conflicting results, likely indicating contamination or a similar error in the laboratory. Lastly, nine specimens had matching morphological and molecular identification, although only uni-directional sequences were obtained. These were included in phylogenetic analyses, but flagged as non-barcode compliant on BOLD.

#### Phylogeny

The curated datasets were used to produce three phylogenetic ML consensus trees: one for COI, one for rh1, and one for both combined. Sequence alignment was not problematic, as these are coding sequences without gaps. In each case "GTR+G+I" was identified as the most appropriate nucleotide substitution model. Clades with bootstrap support below 70% after consensus tree building were collapsed, i.e., these splits were not retained or displayed in the figures.
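Collapsing clades below the 70% support threshold turns weakly supported splits into polytomies. The Python sketch below shows the principle on a small tree structure; it is not the Geneious procedure used here, and the `Node` class, the function names, and the choice to retain clades with exactly 70% support are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str = ""
    support: Optional[float] = None  # bootstrap % of the clade rooted at this node
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def collapse_low_support(node: Node, threshold: float = 70.0) -> Node:
    """Return a copy in which internal nodes below the threshold are merged into their parent."""
    new_children = []
    for child in node.children:
        child = collapse_low_support(child, threshold)
        weak = (not child.is_leaf
                and child.support is not None
                and child.support < threshold)
        if weak:
            # Attach the grandchildren directly to the parent, creating a polytomy.
            new_children.extend(child.children)
        else:
            new_children.append(child)
    return Node(node.name, node.support, new_children)


def to_newick(node: Node) -> str:
    """Minimal Newick writer with support values as internal node labels."""
    if node.is_leaf:
        return node.name
    inner = ",".join(to_newick(c) for c in node.children)
    label = "" if node.support is None else f"{node.support:g}"
    return f"({inner}){label}"
```

For example, a tree `((A,B)95,(C,D)60)` becomes `((A,B)95,C,D)`: the well-supported clade is kept, while C and D attach directly to the root.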

#### The Cytochrome C Oxidase I Gene

The dataset includes 1,073 sequences of 539 bp length with 337 variable sites. Myctophidae are not resolved as monophyletic, because the aulopiform species Notolepis coatsi (Antarctic jonasfish) and Lagiacrusichthys macropinna (previously Benthalbella macropinna, see Davis, 2015) are placed within a Lampanyctinae clade and the neoscopelid species Neoscopelus macrolepidotus and N. microchir are placed next to Notolychnus valdiviae (topside lanternfish; **Figure 2**). Other outgroup taxa are placed outside of Myctophidae, but their exact position is not well resolved (i.e., often <70% bootstrap support and thus displayed as polytomic). Within Myctophidae the tribe Electronini (sensu Paxton, 1972) and subfamily Gymnoscopelinae (sensu Martin et al., 2017) are monophyletic with medium bootstrap support (BS = 78 and 79%, respectively). Diaphinae (sensu Martin et al., 2017) is monophyletic as well (BS = 89%), with the exception of the inclusion of Symbolophorus boops (bogue lanternfish). The placement of these three subfamilies/tribes and the remaining genera within the Myctophidae is less clear, with bootstrap support at times below the applied cut-off threshold. Myctophid species with their main distribution range in Sub-Antarctic or Antarctic waters all belong to the three groups mentioned above, except for S. boops and Lampanyctus achirus (previously Nannobrachium achirus, see Martin et al., 2017), a bathypelagic species that was placed within a clade of Lampanyctus spp., sister group of Parvilux ingens. As the focus of this study is on (sub-)Antarctic mesopelagics, further description and discussion is restricted to these species and their position in the phylogenetic trees.
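The number of variable sites reported for an alignment is the count of columns in which more than one nucleotide occurs. A minimal Python sketch follows; ignoring gaps and ambiguity codes when comparing a column is an assumption, as the text does not specify how such characters were treated.

```python
def count_variable_sites(alignment):
    """Count alignment columns that contain at least two different unambiguous bases."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("alignment must be non-empty with sequences of equal length")
    variable = 0
    for column in zip(*alignment):
        # Only A, C, G and T are compared; gaps and ambiguity codes are skipped.
        bases = {c for c in (ch.upper() for ch in column) if c in "ACGT"}
        if len(bases) > 1:
            variable += 1
    return variable
```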

Within Electronini, the position of the genera Metelectrona, Electrona, Krefftichthys, and Protomyctophum is unclear. Electrona antarctica forms a clade with K. anderssoni (BS = 75%). Metelectrona ventralis (flaccid lanternfish) is resolved as sister group (BS = 84%) to Protomyctophum, the only monophyletic genus (BS = 99%). Electrona subaspera (rough lanternfish) and E. paucirastra (belted lanternfish), and E. risso (electric lanternfish) and E. carlsbergi (electron subantarctic lanternfish), respectively, appear to be closely related (BS = 89% in both cases). Within Protomyctophum, the split into the subgenera Hierops and Protomyctophum is supported except for P. tenisoni, which is placed next to these subgenera (BS = 99%). Hierops contains P. parallelum (parallel lanternfish), P. thompsoni (bigeye lanternfish), P. arcticum (Arctic telescope), and P. crockeri (California flashlightfish) (BS = 97%) and the subgenus Protomyctophum contains P. bolini, P. choriodon, P. andriashevi, and P. gemmatum (BS = 88%).

In Gymnoscopelinae (sensu Martin et al., 2017), Gymnoscopelus is monophyletic (BS = 99%) and sister group to a clade with medium support (BS = 74%) containing Scopelopsis multipunctatus and the also monophyletic Notoscopelus (BS = 99%). Within Gymnoscopelus, two single specimens identified as G. piabilis (Southern blacktip lanternfish) and G. nicholsi (Nichol's lanternfish) form their own clade apart from all other specimens with the same identification (BS = 93%). Sister group to these is a clade comprised of G. hintonoides (false-midas lanternfish), G. piabilis, and G. fraseri (BS = 100%), in which G. hintonoides and G. piabilis are resolved as monophyletic (BS = 88 and 98%), but G. fraseri is not. Gymnoscopelus bolini and G. nicholsi form a clade (BS = 91%), in which G. nicholsi is placed as one group (BS = 99%), but G. bolini as three. Lastly, G. braueri and G. opisthopterus form a monophyletic sister group to all others (BS = 98%) and are also monophyletic within each species (BS = 82% for G. braueri and 99% for G. opisthopterus).

FIGURE 2 | Phylogenetic consensus tree of myctophid fishes based on cytochrome c oxidase I (COI) gene variation using a maximum likelihood analysis with 10,000 bootstrap permutations. Bootstrap support > 70% is shown above the branches; branches with lower support are collapsed to polytomies; species with Antarctic, broadly Antarctic, and sub-Antarctic distribution pattern following Duhamel et al. (2014) are depicted in purple, dark blue, and light blue, respectively (please see online version of the article for full-scale, color figure). Number of collapsed samples noted in brackets. Samples where genus level identity was added a posteriori based on position in the tree are noted in bold italics.

#### The Rhodopsin Gene

The compiled rhodopsin dataset includes 90 sequences of 820 bp length including 511 variable sites. The Myctophidae are monophyletic with 100% bootstrap support (**Figure 3**). Diaphinae and Gymnoscopelinae are resolved as monophyletic similarly to the COI tree, but Electronini are not. However, the taxonomic sampling is much smaller than for the COI dataset and covers 21 myctophid species, whereas Duhamel et al. (2014) report 66 species that are at least occasionally recorded south of the Sub-Tropical Front.

Electronini are paraphyletic with the inclusion of Diogenichthys sp. and Myctophum species. Electrona antarctica is placed outside the remaining Electronini and Myctophum spp. as sister group to Diogenichthys sp. (BS = 86%). Within the other Electronini, Krefftichthys anderssoni and E. carlsbergi diverge first from the monophyletic Protomyctophum (BS = 98%), represented by P. bolini and P. choriodon.

The genus Gymnoscopelus (no other Gymnoscopelinae were available for rh1) forms a monophyletic group with high bootstrap support (100%), with two G. fraseri and four G. nicholsi diverging first from all other specimens (BS = 99%). Three further G. fraseri are resolved within the remaining clade, next to G. bolini and G. hintonoides (BS = 78%) and another clade that comprises G. braueri and G. opisthopterus (BS = 100%). However, the resolved topology differs from the COI tree, although G. braueri and G. opisthopterus are resolved as sister taxa in both analyses. Lampanyctus achirus, the only other myctophid common in sub-Antarctic waters in this dataset, clusters with a clade of Lampanyctus spp., similarly to COI.

#### Both Markers Combined

The concatenated dataset comprises 68 specimens. In total this dataset has 795 variable sites. As expected, the concatenation of both markers reduced the size of the dataset, but provided at times higher confidence in the resolved consensus topology. The Myctophidae are monophyletic with 100% bootstrap support (**Figure 4**). The tribe Electronini is monophyletic except for the inclusion of a Diogenichthys sp. and placed within a clade also containing Symbolophorus spp., Hygophum spp., and Myctophum species. This entire clade is a sister group of a clade containing Notolychnus valdiviae, Lampadena luminosa (luminous lanternfish), a Diaphinae clade, and a clade of Gymnoscopelus and Lampanyctus. The latter are both monophyletic with 100% bootstrap support.
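The reduction in dataset size follows directly from how a concatenated matrix is built: only specimens with a sequence for both markers can be retained. A minimal Python sketch under that assumption is shown below; the specimen identifiers and the function name are hypothetical, and both inputs are assumed to be already aligned.

```python
def concatenate_markers(coi, rh1):
    """Join COI and rh1 alignments for specimens that have both markers.

    coi, rh1: dicts mapping a specimen identifier to its aligned sequence.
    Returns a dict mapping each shared identifier to COI followed by rh1.
    """
    shared = sorted(coi.keys() & rh1.keys())
    return {specimen: coi[specimen] + rh1[specimen] for specimen in shared}
```

Specimens present in only one of the two dictionaries drop out, which is why the concatenated dataset is smaller than either single-marker dataset.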

Within the Electronini a clade comprising E. antarctica and the single Diogenichthys sample diverges first from the remaining samples. Krefftichthys anderssoni is sister group to a clade with E. carlsbergi and the remaining Protomyctophum (P. bolini and P. choriodon). Hence, Protomyctophum is monophyletic (BS = 100%), but Electrona and Krefftichthys are not. Diaphinae are monophyletic, but only D. richardsoni, D. brachycephalus (short-headed lanternfish), and unidentified Diaphus spp. are included; therefore, further inferences are impossible.

Gymnoscopelinae (although only Gymnoscopelus is present in this dataset, neither Scopelopsis nor Notoscopelus) is resolved as monophyletic with 100% bootstrap support. Within Gymnoscopelus, G. nicholsi diverges first from other taxa, with high support (BS = 100%). Gymnoscopelus bolini is sister group (BS = 96%) to a clade that contains two other clades, with G. fraseri and G. hintonoides (BS = 100%) and G. braueri and G. opisthopterus (BS = 100%), respectively. Lampanyctus achirus is again placed inside a clade of Lampanyctus spp., here sister group of the Gymnoscopelus clade (BS = 80%). Sister to these two clades are the Diaphinae, Notolychnus valdiviae, and two samples of Lampadena luminosa.


#### Phylogeography and Cryptic Species

In addition to further parameterizing the BOLD database and investigating phylogenetic relationships of particularly sub-Antarctic and Antarctic myctophids, our data was used to identify phylogeographic diversity patterns of Southern Ocean myctophids. These analyses focused on the largest dataset, COI, and mainly on species of the tribe Electronini and the subfamily Gymnoscopelinae, comprising 203 specimens from 16 species and 167 specimens from 12 species, respectively. They were plotted as sub-trees of the COI tree (**Figure 2**) with all (sub-) Antarctic species coded corresponding to sampling locality and associated haplotype networks (**Figures 5**, **6**; codes as in **Figure 1**). The geographical coverage within species varies from circum-Antarctic to only a few single sites. In general, these data cover specimens of most waters around Antarctica, as well as more northerly areas (Scotia Sea, South Atlantic Ocean north of the Scotia Sea and off Argentina, waters around Bouvet Island, and the Kerguelen and Heard and McDonald Islands plateaus, and off South Africa; **Figure 1**). Lampanyctus achirus is not resolved as a single clade, but rather two groups – one with seven individuals caught off South Africa, and one with 21 individuals from the Ross Sea, Dumont d'Urville Sea, Scotia Sea, Kerguelen Islands, and also one individual from South Africa (**Supplementary Figure S1**). Symbolophorus boops is resolved as group of six individuals with low COI variation. Phylogeographic patterns of the other (sub-) Antarctic myctophids are discussed by tribe/subfamily below.

#### Electronini

Electrona antarctica is the most common species in available COI sequences (N = 61). Nonetheless, intraspecific variation appears to be minimal, with only one small group of five individuals clustering apart with moderate support (BS = 84%) and the vast majority of specimens showing one identical haplotype (**Figure 5**). The group that clusters apart comprises samples from the Ross Sea, Heard and McDonald Islands, and Cosmonauts Sea, all locations also present in the other group. The intraspecific diversity is low and appears not to be related to geography. Krefftichthys anderssoni is present in sufficient numbers (N = 20) and with relatively broad geographical coverage, but no structure is apparent, although haplotype diversity is a little higher compared to E. antarctica. The sub-Antarctic Electrona carlsbergi features one sample of unknown origin that is separated with moderate support (BS = 71%). The remaining samples show some haplotype diversity, but no phylogeographic pattern. Electrona subaspera as well as Protomyctophum parallelum are only present in small numbers (N = 1 and 4, respectively). Metelectrona ventralis is represented by only eight samples, five of which come from Agulhas Bank off South Africa (**Figure 5**). The remaining three samples have no public locality information. However, two of these build a distinct cluster divergent from all others (BS = 95%). All nine P. tenisoni are from the Scotia Sea or the South Atlantic Ocean, showing no signs of phylogeographic diversity. The remaining Protomyctophum, i.e., P. bolini, P. choriodon, P. andriashevi, and P. gemmatum, also show no sign of elevated intraspecific variability or clustering by location.
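Statements about haplotype diversity can be made quantitative with Nei's haplotype (gene) diversity, H = n/(n - 1) · (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sequences. The Python sketch below is an illustration only; no such values are reported in this study, and treating every distinct aligned sequence string as its own haplotype is a simplifying assumption.

```python
from collections import Counter


def haplotype_counts(seqs):
    """Group identical aligned sequences into haplotypes and count them."""
    return Counter(s.upper() for s in seqs)


def haplotype_diversity(seqs):
    """Nei's haplotype diversity with the n / (n - 1) sample-size correction."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = haplotype_counts(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)
```

A sample dominated by a single shared haplotype, as described for E. antarctica, yields a value close to 0, whereas a sample in which every individual carries a different haplotype yields 1.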

#### Gymnoscopelinae

In Gymnoscopelus (**Figure 6**), G. nicholsi shows moderate diversity with all but one specimen represented in one clade, but more different haplotypes than E. antarctica. The only outlier here is a specimen from the Kerguelen Islands plateau (BS = 99%), although the remaining samples include individuals from the Kerguelen area as well. In contrast, G. bolini is split into three groups with high support: (1) one group with two samples each from the Ross Sea, Heard and McDonald Islands, and Scotia Sea (BS = 98%); (2) one group (including a sub-split) with samples from Heard and McDonald as well as Kerguelen Islands, but also Ross and Scotia Sea (BS = 92%); and (3) one individual from off Dumont d'Urville Sea (DDU). East Antarctic coastal waters seem to stand out, but the sampling density is too low for solid inferences. Two other individuals, one nominal G. piabilis caught off Argentina and one nominal G. nicholsi collected off South Africa, appear as sister group to a clade of G. fraseri, remaining G. piabilis, and G. hintonoides (BS = 93%). The latter form a polyphyletic group with only G. hintonoides monophyletic in the tree (BS = 88%), but all three species separated in the haplotype network. All 24 individuals of G. opisthopterus form one group with low COI variation. Gymnoscopelus braueri in turn exhibits more divergence, with one DDU individual clustering as sister to a group of two Ross Sea and one Bouvet Island samples (BS = 92%), which diverge from the 26 remaining individuals. However, the haplotype network of G. opisthopterus and G. braueri rather resembles the pattern of E. antarctica (**Figure 5**) with one very common shared haplotype.

FIGURE 5 | Haplotype networks and phylogenetic consensus tree of myctophid fishes of the tribe Electronini based on cytochrome c oxidase I (COI) gene variation using a maximum likelihood analysis with 10,000 bootstrap permutations. Bootstrap support > 70% is shown above the branches; branches with lower support are collapsed to polytomies. Branches of species with Antarctic, broadly Antarctic, and sub-Antarctic distribution pattern following Duhamel et al. (2014) are depicted in purple, dark blue, and light blue, respectively (color figure available online). Geographic origin is reflected by colored circles in the tree and networks, approx. clockwise from West (dark) to East Antarctic (light). In the haplotype networks one branch represents one mutation. Additional mutation steps between samples are indicated with small black circles.

FIGURE 6 | Haplotype networks and phylogenetic consensus tree of myctophid fishes of the subfamily Gymnoscopelinae (sensu Martin et al., 2017) based on cytochrome c oxidase I (COI) gene variation using a maximum likelihood analysis with 10,000 bootstrap permutations. Bootstrap support > 70% is shown above the branches; branches with lower support are collapsed to polytomies. Branches of species with Antarctic, broadly Antarctic, and sub-Antarctic distribution pattern following Duhamel et al. (2014) are depicted in purple, dark blue, and light blue, respectively (color figure available online). Geographic origin is reflected by colored circles in the tree and networks, approx. clockwise from West (dark) to East Antarctic (light). In the haplotype networks one branch represents one mutation. Additional mutation steps between samples are indicated with small black circles.

# DISCUSSION

Our study adds to the increasing knowledge and baseline data of Antarctic marine biodiversity as envisioned by the Census of Antarctic Marine Life and associated initiatives (Schiaparelli et al., 2013). It successfully uses the BOLD database to uncover mismatches between morphological and molecular specimen identification and highlights targets for deeper phylogenetic study to ascertain the position of some species and specimens in the lanternfish family. Lastly, we discuss phylogeographic patterns and the evolution of the family Myctophidae in the Southern Ocean in general and hypothesize that the presence of myctophids in the high polar seas is the result of multiple colonization events.

### Extending and Using the DNA Barcode Library for Specimen Identification

The newly added sequences expand the public DNA barcoding database of Myctophidae to more than 1000 individual sequences. Of these, 263 belonged to specimens captured in the Southern Ocean, which represents a substantial increase of the barcode library of Antarctic mesopelagic fish. This will be of major importance for future ecological studies that intend to use the library for specimen identification. The value of the BOLD database is likely to increase even further with the development of metabarcoding studies. Recent approaches include for example the detection of tropical sharks (Bakker et al., 2017), large-scale larval fish ecology through efficient identification of thousands of larvae (Kimmerling et al., 2018), as well as Antarctic studies characterizing notothenioid fish assemblages (Cowart et al., 2017) and toothfish diet (Yoon et al., 2017). All these examples are fully dependent on a high quality reference database to match metabarcoding sequences, as the lack of identification can lead to reduced or biased results and interpretations. Good coverage of the (sub-)Antarctic teleost fauna, which now includes demersal fishes (Rock et al., 2008; Dettaï et al., 2011; Smith et al., 2011b, 2012; Mabragaña et al., 2016) and large parts of the meso- and bathy-pelagic fish fauna (this study), will likely be highly valuable for future metabarcoding studies investigating for example the diet and trophic position of top predators. Species that are quickly digested can be detected with this molecular approach, although quantification remains challenging. Such studies can contribute to refine our understanding of food webs in Antarctic and sub-Antarctic waters (Cornejo-Donoso and Antezana, 2008; Pinkerton and Bradford-Grieve, 2014), a task of high relevance with regard to the ecosystem approach of the Commission for the Conservation of Antarctic Marine Living Resources (CCAMLR; Kock et al., 2007; Constable, 2011).

Morphological specimen characterizations were verified with DNA barcoding. Generally, the success of specimen identification using the BOLD database (Ratnasingham and Hebert, 2007) was very high. Where morphological identification is challenging because discriminating characters are frequently lost during sampling, such as in the Myctophidae, DNA barcoding represents a useful complement to traditional identification. In eleven cases we were able to improve the initial identification and in seven further cases we discovered mismatches between taxonomy and genetic signature, which were attributed to initial misidentification or mislabeling. Without DNA barcoding these would likely have retained an erroneous identification, which in turn poses problems for further ecological analysis and interpretation. As other studies have shown, DNA barcoding of Antarctic vertebrates is a useful molecular taxonomic approach (Rock et al., 2008; Smith et al., 2008, 2011b, 2012; Duhamel et al., 2010; Lautredou et al., 2010; Dettaï et al., 2011; Rey et al., 2011). Furthermore, it may also serve as a starting point for phylogenetic and phylogeographic investigations (Duhamel et al., 2014; Mabragaña et al., 2016). Such help for taxonomy is highly needed at a time when classical taxonomic expertise has become rare (Cao et al., 2016). At least some myctophids, however, may be particularly prone to DNA degradation problems. Some of the authors observed that samples from non-myctophid fishes collected during the same expeditions and processed in the same way had much higher amplification success rates. This might be linked to (taxon-)specific degradation processes and we therefore recommend that myctophids are processed first when treating a fish catch for scientific purposes.

### Phylogeny and Phylogeography of Southern Ocean Mesopelagic Fishes

The topology of phylogenetic trees constructed using COI and rh1 concur to a great extent with recent multi-marker phylogenies (Poulsen et al., 2013; Davis et al., 2014; Denton, 2014; Martin et al., 2017). However, our data cannot be used to discuss the relationship between myctophid tribes. The elsewhere well-supported monophyly of Myctophidae is not resolved in our full COI data set with some aulopiform and neoscopelid species placed within Myctophidae. In addition, the first split of Myctophidae showed polytomy with nine branches; in other words, the placement of these clades is unclear (**Figure 2**). The rh1 tree spans fewer taxa and resolved Lampadena as sister group to all other Myctophidae, which is not supported by other studies. COI and rh1 alone are clearly not sufficient to accurately resolve deeper phylogenetic relationships of Myctophidae. However, these markers offer the most comprehensive datasets, which are important for a holistic understanding of the approximately 395 currently described Myctophidae (Eschmeyer and Fong, 2018). High taxon density can positively affect tree topology through breaking down of otherwise very long branches. COI in particular is unmatched regarding coverage with more than 700 sequences from 149 species already previously available in BOLD for our analysis. At the subfamily to genus level our results indeed match better with recent hypotheses of myctophid intrarelationships. For example, within Lampanyctinae (sensu Martin et al., 2017) the COI tree shows that the genera Lampadena and Taaningichthys are related, as are Bolinichthys, Lepidophanes, and Ceratoscopelus and, lastly, Stenobrachius, Triphoturus, Parvilux, and Lampanyctus. This concurs with the multi-marker results of Denton (2014) and Martin et al. (2017). The concatenated dataset even resolves an initial split into a clade with Diaphinae, Lampanyctinae, Gymnoscopelinae, and Notolychninae and another clade with Electronini and Myctophini (**Figure 4**). 
This pattern matches the analyses of Poulsen et al. (2013), Davis et al. (2014), Denton (2014), and of Martin et al. (2017), with the exception that the latter find Diaphinae more closely related to Myctophinae. Ultimately, the deep phylogenetic hypotheses of Myctophidae still need further work. With respect to phylogeny our data can serve as a starting point to highlight genera or species that are of particular interest for further analysis, such as studies analyzing entire mitogenomes or large numbers of single nucleotide polymorphisms. We restrict this discussion here to species common in the Southern Ocean but extend it toward intraspecific genetic diversity by analyzing phylograms and haplotype networks in relation to geography. Such phylogeographic patterns are important to understand the distribution of biodiversity in the mesopelagic zone of the Southern Ocean and to plan conservation and management actions accordingly in light of climatic changes. We discuss specific phylogenetic and phylogeographic implications and recommendations of our study as compared to other recent phylogenies for: (1) Electronini; (2) Gymnoscopelinae (sensu Martin et al., 2017); (3) other Southern Ocean myctophids; and (4) other Southern Ocean mesopelagic, non-myctophid fishes.

The tribe Electronini is monophyletic within our COI analysis, but not in the rh1 tree, where Diogenichthys appears related to Electrona antarctica (**Figure 3**). This signal is likely the reason that Diogenichthys is also placed within Electronini in the concatenated analysis (**Figure 4**). Other recent studies have all corroborated the monophyly of Electronini (Poulsen et al., 2013; Davis et al., 2014; Denton, 2014; Martin et al., 2017). It is possible that the accuracy of the rh1 marker here is affected by a bias in base composition across taxa (Chen et al., 2003). Inferences from rh1 alone as well as results from the concatenated dataset that may stem primarily from the rh1 signal as is the case here must therefore be interpreted with caution. The relationships within Electronini are still somewhat obscure (Denton, 2014). All our trees support the monophyly of Protomyctophum, but the placement of Protomyctophum and the other genera, Metelectrona, Electrona, and Krefftichthys remains unclear. In the COI dataset, E. antarctica and K. anderssoni cluster together and apart from remaining Electrona spp. With rh1 and the concatenated dataset K. anderssoni rather appears to be a sister group of all other species, except E. antarctica, somewhat similar to Martin et al. (2017). However, Denton (2014) resolved Krefftichthys as sister group of only Protomyctophum. These contradictory results are evidence that more detailed studies are needed to clarify relationships within this tribe. Within Protomyctophum and using COI, where more than two Protomyctophum species were included, the split between the subgenera Protomyctophum and Hierops is supported except for P. tenisoni which diverges first (**Figure 2**; Gordeeva, 2013; Denton, 2014). With some additional support a revision of Protomyctophum as suggested by Denton (2014) appears sensible.

All Electronini species are recovered as single clusters, with low to moderate intraspecific levels of diversity (**Figure 5**). They show no divergent groups that might point to undescribed or cryptic species. The most striking example of this is E. antarctica, where the majority of individuals belong to the same haplotype despite the distant sampling locations. A dominant, widespread haplotype may indicate reduced genetic diversity due to, for example, a recent bottleneck. However, more variable markers show high levels of genetic diversity in this species (Van de Putte et al., 2012). Electrona antarctica is arguably the most common myctophid in Antarctic waters with a circum-Antarctic distribution and preference for water temperatures below 2.5°C (Hulley, 1990; Duhamel et al., 2014). Similar to Antarctic krill (Deagle et al., 2015), its enormous abundance may be the key factor leading to virtual panmixia, i.e., genetic homogeneity across the entire distribution range (Van de Putte et al., 2012). Other Electronini species, in particular Protomyctophum bolini and Electrona carlsbergi, show more variability in their COI haplotypes, but no relation to geography was detectable. It seems that in addition to high effective population size, the strong force of the Antarctic Circumpolar Current causes high connectivity, resulting in at most subtle spatial genetic structure, but clearly no pronounced phylogeographic structure of fishes in the tribe Electronini.

The subfamily Gymnoscopelinae is well supported as monophyletic group in all our analyses confirming findings of other phylogenetic studies (Poulsen et al., 2013; Davis et al., 2014; Denton, 2014; Martin et al., 2017). We therefore adopted the taxonomic revision of Martin et al. (2017), who promoted the former lampanyctine tribe of Gymnoscopelini to Gymnoscopelinae. The genus Gymnoscopelus appears clearly monophyletic. Our concatenated dataset resolves G. nicholsi as sister group to all other species (BS = 100%; **Figure 4**), as suggested by Denton (2014), the only other study that included more than two Gymnoscopelus species. In the individual COI and rh1 datasets the placement of G. nicholsi was not apparent (**Figures 2**, **3**), highlighting the value of an additional nuclear marker to increase confidence in phylogenetic positioning. There are examples where rhodopsin can distinguish fish species, where COI fails (Thiel and Knebelsberger, 2016), such as in the present case. In our case all Gymnoscopelus spp. can be delineated using COI, but the exact position within the tree remains to be evaluated, except where they are corroborated by Denton's (2014) results.

With respect to phylogeography, the Gymnoscopelinae show a different pattern from the Electronini (**Figure 6**). G. nicholsi features various COI haplotypes, but only one individual, collected off the Kerguelen Islands, stands out in the phylogenetic tree (**Figure 6**). This is similar to the pattern of P. bolini within the Electronini. Gymnoscopelus bolini on the other hand splits into three groups based on COI variation (**Figure 6**). This pattern might hint at genetic structuring, but this needs to be investigated further, as the single sample from East Antarctica (DDU) is divergent, while samples from Ross and Scotia Sea, and Heard and McDonald, and Kerguelen Islands group together. Other factors than circumpolar position may be at play here, for instance trophic niche partitioning, sexual selection, or simply increased levels of genetic variability. The other species show relatively low variability and no pattern related to sampling locality, with two exceptions, both samples collected in lower latitudes (off South Africa and Argentina, respectively). These two samples were identified as G. piabilis and G. nicholsi. Given their very high COI sequence divergence (23 mutations apart from the nearest neighbor in median joining networks, not shown in figure), we recommend reexamination of the specimens, if feasible, to investigate whether they possibly belong to different (cryptic) species or subspecies.
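The notion of a sample lying a given number of mutations from its nearest neighbor can be illustrated with a minimal sketch. It assumes aligned sequences of equal length and simply counts differing sites between pairs. This is only a rough proxy for distances in a median joining network, which may also insert inferred intermediate haplotypes. The function names and the toy sequences are hypothetical and are not data from this study.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count sites that differ between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def nearest_neighbor_distances(sequences: dict[str, str]) -> dict[str, int]:
    """For each sample, return the number of differences to its closest other sample."""
    nearest = {}
    for name, seq in sequences.items():
        nearest[name] = min(
            count_differences(seq, other)
            for other_name, other in sequences.items()
            if other_name != name
        )
    return nearest


# Toy, made-up aligned fragments (hypothetical; not real COI data).
toy = {
    "sample_A": "ACGTACGTAC",
    "sample_B": "ACGTACGTAT",
    "sample_C": "ACGTACGTAC",
    "sample_D": "TTGTACCTGC",
}
print(nearest_neighbor_distances(toy))
```

In this toy set, `sample_D` is the only sequence that sits several substitutions away from every other sample, which is the kind of outlier that would prompt a reexamination of the voucher specimen.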

The available sequences identified as Symbolophorus boops (BOLD references DSFSE476-08 to DSFSE480-08 and DSFSG260-10) cluster apart from the two other Symbolophorus clades resolved in our COI tree (one composed of S. californiensis, S. reversus, S. evermanni, Symbolophorus sp., and S. rufinus and the other composed of S. barnardi and S. veranyi; **Figure 2**). Instead these sequences settle within the Diaphinae (sensu Martin et al., 2017). Unfortunately we discovered a posteriori that the COI sequences included here as S. boops were likely misidentified on BOLD. These sequences are probably from a Diaphus species (P. A. Hulley, pers. comm.) currently also not present on BOLD, but the specimens are in poor condition, preventing definite identification. The correction has been transmitted to the BOLD database. Other studies that included genetic data proposed that Symbolophorus is closer related to Myctophum, Hygophum, and other genera, as opposed to Diaphinae, but they all lacked specimens of S. boops (Poulsen et al., 2013; Denton, 2014; Martin et al., 2017). Therefore, we highly recommend the collection of further samples/sequences in order to resolve the phylogenetic position of S. boops, and to re-identify the specimens erroneously labeled as Symbolophorus boops. In fact, the entire genus would benefit from a detailed systematic revision as already noted by Wisner (1976).

The genus Nannobrachium was recently placed into synonymy with Lampanyctus (Martin et al., 2017). Our results fully support this across all datasets and all former Nannobrachium were therefore labeled as Lampanyctus. Interestingly, Lampanyctus achirus is the only species of the Lampanyctinae with a sub-Antarctic distribution (sensu Martin et al., 2017). Based on COI the species also splits into two divergent clades with 99 and 100% bootstrap support, respectively (**Figure S1**). One clade consists entirely of specimens caught off South Africa and the other clade of specimens from around Antarctica. These results, however, need to be corroborated with nuclear and morphological data. This again underscores the importance of specimen vouchers for groups with difficult morphological characters and uncertain species delineation. Nevertheless, recent studies have shown that undescribed species abound even in groups that were thought well known (Geiger et al., 2014). The pattern observed for L. achirus might be indicative of cryptic or undescribed species as found before in Myctophidae, for example in Benthosema pterotum (skinnycheek lanternfish; Zahuranec et al., 2012). Currently, there are two specimens of B. pterotum with COI sequences in BOLD (from Poulsen et al., 2013 and Denton, 2014) and they also show a deep split. Additional COI sequencing of the specimens used by Zahuranec et al. (2012) could therefore enable fast, cost-efficient, and confident discrimination between the two cryptic species with COI in the future.

According to Duhamel et al. (2014) the most abundant non-myctophid mesopelagic fish families in the Southern Ocean are Bathylagidae (deep-sea smelts; 5 spp.), Gonostomatidae (bristlemouths; 4 spp.), Notosudidae (waryfishes; 2 spp.), Paralepididae (barracudinas; 4 spp.), and Stomiidae (barbeled dragonfishes; 5 spp.). We found COI sequences for only four of these species (Notolepis coatsi, Bathylagus antarcticus, Idiacanthus atlanticus, and Borostomias antarcticus), plus five bathypelagic species (Icichthys australis, Lagiacrusichthys macropinna, Poromitra capito, Sio nordenskjoldii, and Oneirodes notius). Apart from Myctophidae and Nototheniidae, and from species only occasionally recorded south of the Sub-Tropical Convergence, Duhamel et al. (2014) list 51 species for the whole Southern Ocean pelagic zone. Thirty-four of those are present with COI in BOLD (January 2018). Many less abundant species remain to be sequenced in order to complete the reference database. Intraspecific variability is difficult to assess for the available species due to the limited number of samples. Poromitra capito (N = 3) showed two haplotypes and Notolepis coatsi (N = 4) showed three haplotypes. The genus Bathylagus is believed to comprise at least three species in the Southern Ocean, although morphological discrimination is very difficult. Preliminary evidence suggests that four species with distinct COI signatures are present in the Scotia Sea (Collins et al., unpublished). In this study twelve Bathylagus antarcticus (Antarctic deep-sea smelt), all collected in the Lazarev Sea, were included, which had twelve unique haplotypes and showed at least two divergent clades in our COI tree (BS = 97 and 84%). Dettaï et al. (2011) also found diverging clades of B. antarcticus in the Dumont d'Urville Sea. We recommend a detailed integrative taxonomic investigation of all available Bathylagus specimens using morphology and several genetic markers to clarify the status of this genus. 
The other species mentioned above were only available in low numbers (N ≤ 2), which does not permit an examination of intraspecific variation.
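To illustrate why such small sample sizes limit inference, the haplotype counts reported above can be converted into a summary statistic. The sketch below assumes Nei's unbiased haplotype diversity, h = n/(n − 1) · (1 − Σ pᵢ²), which is a common choice but is not stated as the measure used in this study. The function name and haplotype labels are hypothetical. The haplotype compositions follow directly from the reported counts, because two haplotypes among three samples must split 2 + 1, and three among four must split 2 + 1 + 1.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    h = n / (n - 1) * (1 - sum(p_i ** 2)), with p_i the frequency of haplotype i.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Haplotype labels are placeholders; only the counts come from the text.
poromitra = ["h1", "h1", "h2"]             # N = 3, two haplotypes
notolepis = ["h1", "h1", "h2", "h3"]       # N = 4, three haplotypes
bathylagus = [f"h{i}" for i in range(12)]  # N = 12, twelve unique haplotypes

for name, sample in [("P. capito", poromitra),
                     ("N. coatsi", notolepis),
                     ("B. antarcticus", bathylagus)]:
    print(f"{name}: h = {haplotype_diversity(sample):.3f}")
```

With only three or four individuals, a single additional sample can shift the estimate substantially, so values like these are best read as descriptive rather than as population-level estimates.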

# Evolution of Myctophidae in the Southern Ocean

Overall, intraspecific genetic divergences are very low with only one case where COI variation clearly relates to sampling locality (Lampanyctus achirus). This may be expected, because large abundance promotes gene flow and homogenization (Hauser and Carvalho, 2008), especially in conjunction with the strong oceanographic connectivity enforced by the ACC (Orsi et al., 1995; Young et al., 2015). Another key contributing factor is the pelagic lifestyle of myctophids, characterized by seemingly free dispersal and the lack of a connection to a specific benthic habitat. Recent analyses suggesting that myctophid biomass in Antarctic waters is dependent on mass immigration from lower latitudes (except for E. antarctica and K. anderssoni) may support the idea of high connectivity (Saunders et al., 2017). However, examples of extended geographic structure despite high dispersal capabilities have been found in the Antarctic (Havermans et al., 2011; Damerau et al., 2014). In addition, weak genetic structuring has been observed for myctophids, although this was between the Mediterranean Sea and the Atlantic Ocean (Pappalardo et al., 2015). Circumpolar connectivity patterns in the sub-Antarctic and Antarctic are complex and variable, largely depending on the interplay of oceanography and life history traits (Moon et al., 2017 and references therein). For myctophids the combination of large abundance and a free-roaming, pelagic lifestyle seems to cause a lack of genetic structuring. Our analysis is only a preliminary attempt to characterize such structure and is inherently biased toward common species, for which sufficient numbers of samples were available. If abundance indeed has a strong effect on the population genetic or phylogeographic structure of lanternfishes, rare species in particular should be investigated in detail. So far, only one study has investigated the genetic structure of an Antarctic myctophid with multiple, variable markers (Van de Putte et al., 2012). 
Insights into the genetics of myctophid populations would be useful in order to optimize current modeling efforts (Koubbi et al., 2011; Freer et al., 2018), which in turn are important for conservation planning in the Southern Ocean (Hill et al., 2017). Attempts to explain and forecast mesopelagic fish distribution ranges typically use oceanographic parameters, particularly temperature and salinity (Koubbi et al., 2011; Duhamel et al., 2014; Olivar et al., 2017). Unsurprisingly, characteristics of deep water masses explain myctophid species occurrence better than surface water properties (Olivar et al., 2017). For both the characterization of current distributions and future predictions, questions arise: what temperatures can Southern Ocean myctophids physiologically sustain? To what level have they already adapted to colder waters, and can they adapt to current rates of environmental change?

Benthic biodiversity in the Southern Ocean is comparably high, including many endemic species (Brandt et al., 2007). This is not the case for the mesopelagic fauna, mostly because it is not as isolated as Antarctic shallow water systems. However, it appears that only a few myctophid species adapted to permanently thrive under the prevailing environmental conditions. Hence, only 17 of 68 species ever recorded in the Southern Ocean are truly (sub-)Antarctic and endemic to these waters. This corresponds to 7.1% of all described myctophids, probably an underestimate considering the deep molecular divergences within the non-Antarctic species in the group, for example in the supposedly monotypic genus Notolychnus. In contrast, equatorial and tropical fish communities feature high myctophid species richness (Olivar et al., 2017). Compared to, for example, pycnogonids, where 17.3% of all species are endemic to (sub-)Antarctic waters (Krabbe et al., 2010), myctophids seem to have diversified less in the Southern Ocean. In fact, just two species (E. antarctica and G. opisthopterus) exhibit what is described as an Antarctic distribution pattern (Duhamel et al., 2014). Looking at the phylogenetic trees, it becomes even clearer that adaptation to Southern Ocean conditions must have occurred repeatedly. There is no single species flock of Southern Ocean myctophids, but species from at least three subfamilies sensu Martin et al. (2017) are in fact true Southern Ocean species, although the vast majority belong either to the Electronini or to Gymnoscopelus. This suggests parallel evolution within similar environments based on similar genomic architecture. Denton (2018) recently showed that lanternfishes experienced elevated diversification rates initiated around the Eocene-Oligocene transition, which, on the other hand, could indicate that the formation of the ACC was an important evolutionary trigger for mesopelagic fish species. 
Southern Ocean lanternfishes are an interesting model to study evolution and speciation in the deep sea (de Busserolles et al., 2013; Denton, 2014). The diversification of Electronini (especially Protomyctophum, see also Denton, 2018) seems particularly intriguing, as it might represent an example of a (relatively small scale) marine adaptive radiation.

# CONCLUSIONS

With this study we substantially extend the DNA barcode library of Antarctic mesopelagic fish, particularly lanternfishes. The combination of morphological and molecular identification led to confident species level identification in 281 out of 299 cases. Several misidentifications or otherwise uncertain samples were identified in the database. Overall, DNA barcode libraries provide a robust reference dataset for specimen identification, especially to the rescue of fragile morphological characters. As expected, the mitochondrial COI and nuclear rh1 genetic markers were not sufficient to resolve deep phylogenetic relationships. However, our results are largely congruent with recent phylogenetic studies of the family. Some of our findings suggest the importance of further study or re-identification, e.g., of Symbolophorus boops. In addition, we highlight potential (pseudo-)cryptic or unrecognized species and recommend further investigation of Gymnoscopelus bolini, two specific Gymnoscopelus specimens (nominally identified as G. piabilis and G. nicholsi), Lampanyctus achirus and the non-myctophid genus Bathylagus. The fact that myctophid species from at least three subfamilies are Southern Ocean species suggests that colonization and adaptation to this environment has occurred repeatedly. Overall, spatial divergence of species is rare in this family, potentially due to the enormous abundance of many myctophids and the homogenizing force of ocean currents. Finally, this study provides an overview of currently available Antarctic samples and associated levels of intraspecific diversity, which both may facilitate future ecological, phylogenetic, and evolutionary investigations of Southern Ocean myctophids, a fish family that surely warrants increased scientific attention.

#### ETHICS STATEMENT

All procedures involving the capture of fish followed internationally recognized CCAMLR CEMP standard methods and were permitted under the Antarctic Marine Living Resources Act.

#### AUTHOR CONTRIBUTIONS

AVdP conceived the study with input from AD, HC, FH, MC, and FV. AVdP, AD, MC, GD, and MH were involved in sample collection. AVdP, AD, MC, DS, and HC contributed to sequencing and compiled BOLD datasets. HC and FH analyzed the data with help of AD and wrote the manuscript with help of AD, DS, MC, AVdP, and FV. All authors read and approved the manuscript.

#### FUNDING

Molecular work including sequencing was supported by the governments of France and Canada. The former through the Service de Systematique Moleculaire of the Muséum National d'Histoire Naturelle (MNHN; UMS2700), supported by the network Bibliothèque du Vivant funded by the CNRS, the MNHN, the INRA, and the CEA (Centre National de Séquençage). The latter through Genome Canada and the Ontario Genomics Institute for the International Barcode of Life Project. This work was further supported by the Belgian projects vERSO and RECTO (http://rectoversoprojects.be). This is contribution no. 25 to the vERSO project, funded by the Belgian Science Policy Office (BELSPO, Contract no. BR/132/A1/vERSO). The first author was funded by a grant from the former Flemish agency for Innovation by Science and Technology, now Flanders Innovation and Entrepreneurship (VLAIO, grant no. 141328). DS was supported by the Alfred P. Sloan Foundation and the Food From Thought research program funded by the Canada First Research Excellence Fund.

#### ACKNOWLEDGMENTS

We thank the officers, crew, and scientists of the cruises involved in the capture of the samples. The cruises with RV James Clark Ross were supported by British Antarctic Survey (BAS) and its Discovery 2010 project. The RV Polarstern cruises were supported by the Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research (AWI). The CAML-CEAMARC cruises of RV Aurora Australis and RV Umitaka Maru (IPY project no. 53) were supported by the Australian Antarctic Division (AAD), the Japanese Science Foundation, the French polar institute IPEV, the CNRS, the MNHN, and the ANR (White project ANTFLOCKs USAR n07-BLAN-0213-01). The Austral cruise POKER 2 was supported by specific grants of the Ministry of Alimentation, Agriculture and Fisheries (MAAP), the Marine Reserve of TAAF and the contributions of French ship owners involved in the Kerguelen Islands fisheries, with the logistic help of SAPMER and TAAF. We thank F. Busson and the 2013 JAMSTEC RV Hakuho Maru campaign led by K. Tsukamoto. We thank the editor and reviewers for constructive comments.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00120/full#supplementary-material

#### REFERENCES

photoreceptors and found in other ray-finned fishes. J. Exp. Biol. 220, 294–303. doi: 10.1242/jeb.145953

comparative application of morphological and molecular methods. Zookeys 2016, 139–164. doi: 10.3897/zookeys.617.8866


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Christiansen, Dettai, Heindler, Collins, Duhamel, Hautecoeur, Steinke, Volckaert and Van de Putte. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Corrigendum: Diversity of Mesopelagic Fishes in the Southern Ocean – A Phylogeographic Perspective Using DNA Barcoding

Henrik Christiansen<sup>1</sup>\*, Agnès Dettai<sup>2</sup>, Franz M. Heindler<sup>1</sup>, Martin A. Collins<sup>3</sup>, Guy Duhamel<sup>4</sup>, Mélyne Hautecoeur<sup>4</sup>, Dirk Steinke<sup>5,6</sup>, Filip A. M. Volckaert<sup>1</sup> and Anton P. Van de Putte<sup>1,7</sup>

<sup>1</sup> Laboratory of Biodiversity and Evolutionary Genomics, KU Leuven, Leuven, Belgium, <sup>2</sup> UMR 7205 ISYEB CNRS-MNHN-Sorbonne Universite-EPHE, Département Systématique et Évolution, Muséum National d'Histoire Naturelle, Paris, France, <sup>3</sup> Centre for Environment, Fisheries and Aquaculture Science, Lowestoft, United Kingdom, <sup>4</sup> UMR 720 BOREA, Département Milieux et Peuplements Aquatiques, Muséum National d'Histoire Naturelle, Paris, France, <sup>5</sup> Centre for Biodiversity Genomics, University of Guelph, Guelph, ON, Canada, <sup>6</sup> Department of Integrative Biology, University of Guelph, Guelph, ON, Canada, <sup>7</sup> OD Nature, Royal Belgian Institute of Natural Sciences, Brussels, Belgium

#### Edited and reviewed by:

Oana Moldovan, Emil Racovita Institute of Speleology, Romania

#### \*Correspondence:

Henrik Christiansen henrik.christiansen@kuleuven.be

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

Received: 22 September 2018 Accepted: 27 September 2018 Published: 02 November 2018

#### Citation:

Christiansen H, Dettai A, Heindler FM, Collins MA, Duhamel G, Hautecoeur M, Steinke D, Volckaert FAM and Van de Putte AP (2018) Corrigendum: Diversity of Mesopelagic Fishes in the Southern Ocean – A Phylogeographic Perspective Using DNA Barcoding. Front. Ecol. Evol. 6:162. doi: 10.3389/fevo.2018.00162

Keywords: marine biodiversity, adaptation, Antarctic, COI, Myctophidae, phylogeny, rhodopsin

#### **A Corrigendum on**

#### **Diversity of Mesopelagic Fishes in the Southern Ocean – A Phylogeographic Perspective Using DNA Barcoding**

by Christiansen, H., Dettai, A., Heindler, F. M., Collins, M. A., Duhamel, G., Hautecoeur, M., et al. (2018). Front. Ecol. Evol. 6:120. doi: 10.3389/fevo.2018.00120

In the original article, there was an error. The sequences of Symbolophorus boops included from the Barcode of Life Datasystems (BOLD) in our analyses were likely misidentified, which was kindly brought to our attention by P. A. Hulley.

A correction has been made to the last sentence of the Abstract:

However, we highlight potential (pseudo-)cryptic or unrecognized species in Gymnoscopelus bolini, Lampanyctus achirus, and the non-myctophid genus Bathylagus.

A correction has been made to the Discussion, Sub-section Phylogeny and Phylogeography of Southern Ocean Mesopelagic Fishes, Paragraph 6:

The available sequences identified as Symbolophorus boops (BOLD references DSFSE476-08 to DSFSE480-08 and DSFSG260-10) cluster apart from the two other Symbolophorus clades resolved in our COI tree (one composed of S. californiensis, S. reversus, S. evermanni, Symbolophorus sp., and S. rufinus and the other composed of S. barnardi and S. veranyi; Figure 2). Instead these sequences settle within the Diaphinae (sensu Martin et al., 2017). Unfortunately we discovered a posteriori that the COI sequences included here as S. boops were likely misidentified on BOLD. These sequences are probably from a Diaphus species (P. A. Hulley, pers. comm.) currently also not present on BOLD, but the specimens are in poor condition, preventing definite identification. The correction has been transmitted to the BOLD database. Other studies that included genetic data proposed that Symbolophorus is closer related to Myctophum, Hygophum, and other genera, as opposed to Diaphinae, but they all lacked specimens of S. boops (Poulsen et al., 2013; Denton, 2014; Martin et al., 2017). Therefore, we highly recommend the collection of further samples/sequences in order to resolve the phylogenetic position of S. boops, and to re-identify the specimens erroneously labeled as Symbolophorus boops. In fact, the entire genus would benefit from a detailed systematic revision as already noted by Wisner (1976).

A correction has been made to the seventh sentence of the Conclusions:

With this study we substantially extend the DNA barcode library of Antarctic mesopelagic fish, particularly lanternfishes. The combination of morphological and molecular identification led to confident species level identification in 281 out of 299 cases. Several misidentifications or otherwise uncertain samples were identified in the database. Overall, DNA barcode libraries provide a robust reference dataset for specimen identification, especially to the rescue of fragile morphological characters. As expected, the mitochondrial COI and nuclear rh1 genetic markers were not sufficient to resolve deep phylogenetic relationships. However, our results are largely congruent with recent phylogenetic studies of the family. Some of our findings suggest the importance of further study or re-identification, e.g., of Symbolophorus boops. In addition, we highlight potential (pseudo-)cryptic or unrecognized species and recommend further investigation of Gymnoscopelus bolini, two specific Gymnoscopelus specimens (nominally identified as G. piabilis and G. nicholsi), Lampanyctus achirus and the non-myctophid genus Bathylagus. The fact that myctophid species from at least three subfamilies are Southern Ocean species suggests that colonization and adaptation to this environment has occurred repeatedly. Overall, spatial divergence of species is rare in this family, potentially due to the enormous abundance of many myctophids and the homogenizing force of ocean currents. Finally, this study provides an overview of currently available Antarctic samples and associated levels of intraspecific diversity, which both may facilitate future ecological, phylogenetic, and evolutionary investigations of Southern Ocean myctophids, a fish family that surely warrants increased scientific attention.

Finally, in Table 2, the third row of the third column should read Protomyctophum luciferum instead of Porotomyctophum luciferum.

The authors apologize for these errors and state that this does not change the remainder of the scientific conclusions of the article in any way. The original article has been updated.

#### REFERENCES


Wisner, R. (1976). The Taxonomy and Distribution of Lanternfishes (family Myctophidae) of the Eastern Pacific Ocean - NORDA-Report 3. Bay St. Louis, MS.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Christiansen, Dettai, Heindler, Collins, Duhamel, Hautecoeur, Steinke, Volckaert and Van de Putte. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Historical DNA Metabarcoding of the Prey and Microbiome of Trematomid Fishes Using Museum Samples

Franz M. Heindler<sup>1</sup>\*, Henrik Christiansen<sup>1</sup>, Bruno Frédérich<sup>2,3</sup>, Agnes Dettaï<sup>4</sup>, Gilles Lepoint<sup>2</sup>, Gregory E. Maes<sup>5</sup>, Anton P. Van de Putte<sup>6</sup> and Filip A. M. Volckaert<sup>1</sup>

<sup>1</sup> Laboratory of Biodiversity and Evolutionary Genomics, University of Leuven, Leuven, Belgium, <sup>2</sup> Laboratory of Oceanology, FOCUS, University of Liège, Liege, Belgium, <sup>3</sup> Laboratory of Functional and Evolutionary Morphology, FOCUS, University of Liège, Liege, Belgium, <sup>4</sup> UMR 7138, CNRS-UPMC-IRD-MNHN, Département Systématique et Evolution, Muséum National d'Histoire Naturelle, Paris, France, <sup>5</sup> Laboratory of Cytogenetics and Genome Research, Genomics Core, University of Leuven, Leuven, Belgium, <sup>6</sup> OD Nature, Royal Belgian Institute of Natural Sciences, Brussels, Belgium

#### Edited by:

Oana Moldovan, Emil Racovita Institute of Speleology, Romania

#### Reviewed by:

Katy Morgan, University of Bath, United Kingdom Luiz Henrique Garcia Pereira, Universidade Federal da Integração Latino-Americana, Brazil

#### \*Correspondence:

Franz M. Heindler franzmaximilian.heindler@kuleuven.be orcid.org/0000-0003-4305-7296

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

Received: 11 June 2018 Accepted: 10 September 2018 Published: 28 September 2018

#### Citation:

Heindler FM, Christiansen H, Frédérich B, Dettaï A, Lepoint G, Maes GE, Van de Putte AP and Volckaert FAM (2018) Historical DNA Metabarcoding of the Prey and Microbiome of Trematomid Fishes Using Museum Samples. Front. Ecol. Evol. 6:151. doi: 10.3389/fevo.2018.00151

Antarctic specimens collected during various research expeditions are preserved in natural history collections around the world potentially offering a cornucopia of morphological and molecular data. Historical samples of marine species are, however, often preserved in formaldehyde which may render them useless for genetic analysis. We sampled stomachs and hindguts from 225 Trematomus specimens from the Natural History Museum London. These samples were initially collected between 20 and 100 years ago and fixed in either formaldehyde or ethanol. A 313 bp fragment of the cytochrome c oxidase subunit I (COI) was amplified and sequenced for prey item identification in the stomach and a 450 bp region of the 16S rRNA gene to investigate microbiome composition in the gut system. Both data sets were characterized by large dropout rates during extensive quality controls. Eventually, no unambiguous results regarding stomach content (COI) were retained, possibly due to degraded DNA, inefficient primers and contamination. In contrast, reliable microbiome composition data (16S rRNA) was obtained from 26 samples. These data showed a correlation in change of microbiome composition with fish size as well as year of the catch, indicating a microbiome shift throughout ontogeny and between samples from different decades. A comparison with contemporary samples indicated that the intestinal microbiome of Trematomus may have drastically changed within the last century. Further extensive studies are needed to confirm these patterns with higher sample numbers. Molecular analyses of museum-stored fish can provide novel microevolutionary insights that may benefit current efforts to prioritize conservation units in the Southern Ocean.

Keywords: natural history museum, notothenioidei, 16S rRNA, COI, Southern Ocean, Antarctica

# INTRODUCTION

The Antarctic continent and the surrounding Southern Ocean contain fragile and unique ecosystems, molded by cold climate, seasonal photoperiod, and remoteness. Despite its distance to congested areas, human influences on the ecosystems include both direct impacts (such as commercial fishing, tourism, and research) and indirect impacts (such as global warming or ocean acidification). These impacts increased considerably during the last 100 years. For example, Antarctica is among the areas most affected by climate change (Solomon et al., 2009), which is expected to have negative effects on marine Antarctic ecosystems (Griffiths et al., 2017). All over the world, climatic changes have already had severe effects on fish populations (Roessig et al., 2004). These effects are expected to be especially grave in the Southern Ocean, as it is home to a unique fish fauna adapted to the prevailing stenothermal conditions (Eastman, 1993). This fragility is due to thermal limitations of the fish species itself, but also to temperature restraints and preferences of complex associated prey, symbiont, and microbiome communities. The wellbeing of fish stocks is often dependent on high abundance of their main prey items (Roessig et al., 2004). Key prey species may be forced to change their distribution range, driving the fish to either prey on other taxa or migrate and follow the initial prey. Baseline data studying these shifts in associated fauna is often lacking, especially for remote areas such as the Southern Ocean. Therefore, describing the associated fauna of fish species in the present and in the past may be a yet underappreciated way to understand eco-evolutionary processes and follow environmental changes.

Natural history museums worldwide house an immense number of preserved specimens. Many of these samples were collected before or during the period of rapid increase in anthropogenic impact on natural populations, and could therefore yield valuable information regarding recent eco-evolutionary processes (Ceballos and Ehrlich, 2002; Wandeler et al., 2007). While such museum samples have been used extensively for morphological studies in the past (e.g., taxonomy, morphometrics, and meristic counts), they often pose challenges for genetic and genomic analysis due to nucleic acid degradation (Chakraborty et al., 2006; Bi et al., 2012, 2013). Amongst the first reports of successful DNA extraction from archived samples was the study of Higuchi et al. (1984) targeting the extinct quagga, a member of the horse family. Molecular analyses of museum samples, sometimes also referred to as "ancient DNA" studies, have gained popularity since (e.g., Lambert et al., 2002; Nielsen et al., 2017). Advances in high-throughput sequencing techniques applied to museum samples have changed the possible scale of, and the effort needed for, archived DNA investigations and have shown how powerful and valuable these can be (Bi et al., 2012, 2013; Nielsen et al., 2017). To date, studies using museum samples have mainly focused on the host (e.g., extracting DNA from bones, otoliths, teeth), whereas organisms that lived on or within the host, as well as ingested prey organisms, have mostly been ignored. Museum samples can be used to reveal the genetic variation of a host species in space and time. In addition, these samples may be useful to study associated symbionts, parasites, prey items, or microbiome composition, with enormous potential to unveil eco-evolutionary processes over large time scales.

Metabarcoding has the potential to simultaneously assess the presence and to some extent also the abundance of hundreds of species, which makes it attractive for diversity assessment studies. Knowledge of prey items or the microbiome composition within species can often highlight spatial differences in diet, behavior, or environmental pressure. Recently, studies have assessed the biodiversity of ancient samples by metabarcoding permafrost samples (Jørgensen et al., 2012; Bellemain et al., 2013; Willerslev et al., 2014) and dental calculus (Eisenhofer et al., 2017); however, such studies are few. Using ancient samples gives a unique opportunity to compare community structures through time and space.

The gastrointestinal microbiome, the community of protozoans, bacteria, and viruses inhabiting the digestive system of a fish host, may display fundamental interspecific as well as intraspecific variation (Ghanbari et al., 2015; Egerton et al., 2018). It is affected by the physical properties of a habitat, such as water temperature, salinity or pollution, as well as biological factors, such as preferred prey, interactions with other species, and the ambient microbiome of the water column (Tarnecki et al., 2017; Bagi et al., 2018; Chen et al., 2018). In turn, the gastrointestinal bacterial community can influence the nutrition (Bäckhed et al., 2004; Turnbaugh et al., 2006) and therefore growth and reproduction, general behavior (Cryan and Dinan, 2012), and vulnerability to diseases (Kau et al., 2011) of the host itself. The microbiome composition varies and reflects both host-specific changes as well as environmental changes. Understanding past and present diversity patterns of diet and gastrointestinal microbiome composition can help pinpoint ecological implications of centennial-scale change in the Southern Ocean. Such insights may be useful in the context of current conservation planning, especially when integrated with other information sources and disciplines (e.g., Dawson et al., 2011). With the present proof-of-concept study, we aim to advance the development of metabarcoding techniques applied to museum samples originating from the Southern Ocean. We hope that such innovative methods lead to a better understanding of eco-evolutionary dynamics between host, prey, and microbiome species in times of rapid change.

Notothenioidei include the dominant fishes in the High-Antarctic and feature prominent adaptations such as antifreeze glycoproteins (Chen et al., 1997), reduced ossification (Eastman and Devries, 1981), and loss of heat-shock response (e.g., Hofmann et al., 2000). This makes them valuable model species for evolutionary and developmental biology (Rutschmann et al., 2011; Postlethwait et al., 2016). The genus Trematomus includes 13 species (sensu Duhamel et al., 2014) of medium-sized high-Antarctic shelf fishes that display ecological diversification (Lannoo and Eastman, 2000; Janko et al., 2011; Duhamel et al., 2014). They show differing levels of habitat preference in terms of depth and bottom association (inshore to deep-sea; cryopelagic to benthic), population genetic structure, and feeding habits (Dewitt et al., 1990; Van De Putte et al., 2012; Mcmullin et al., 2017). Morphological stomach content identification has shown that Trematomus species feed mainly on a variety of small crustaceans (amphipods, copepods, euphausiids, isopods, mysids), but also molluscs, polychaetes, algae, and fish (Vacchi and La Mesa, 1995; La Mesa et al., 1997, 2004, 2015; Moreira et al., 2014; Jurajda et al., 2016). Precise, species-level prey identification necessary for high-resolution intra- and interspecific comparisons, however, can be challenging due to degradation and lost taxonomic characters. The diversification of Trematomus is part of the adaptive radiation of the Notothenioidei, believed to coincide with extinction and recolonization cycles in the Antarctic (Near et al., 2012). Ancient climate change has likely played a major role in facilitating the evolution of these fishes (Matschiner et al., 2011; Near et al., 2012). However, contemporary rates of environmental change and additional stressors such as pollution and new colonizing species are unprecedented and may fundamentally alter future evolution or even cause extinction of high-Antarctic fish species (Dornburg et al., 2017).

The aims of this study are threefold: (1) to test the applicability of metabarcoding techniques to determine the prey and microbiome composition of museum samples, (2) to compare results from museum samples to those of contemporary samples (2017–2018), and (3) to explore potential driving factors of microbiome variability found in museum samples.

# MATERIALS AND METHODS

Stomach and hindgut samples of 225 specimens of the genus Trematomus were obtained from the Natural History Museum, London. Fish were carefully dissected to minimize damage to the specimens. Stomachs were opened to remove stomach content and a small portion of the hindgut (1 cm) was removed. Stomach content and hindgut were stored separately in 70% ethanol. Sampling dates ranged from 1901 to 1988. Standard length (SL), year of catch, location of catch, and species identity (based on the morphological identification of the initial identifier) were recorded from the ledgers of the museum for all samples as far as available. Contemporary samples (n = 15) were caught with hook and line in the vicinity of the Gerlache Strait, Antarctic Peninsula in the season of 2017–2018. They were morphologically identified and frozen at −20 °C until being processed.

## Laboratory Procedures

Eight protocols were tested for DNA purification, including two commercial kits specialized for formalin-fixed and paraffin-embedded (FFPE) tissue (**Supplementary Material**; Sato et al., 2001; Shi et al., 2002, 2004; Joshi et al., 2013). The method described below is closely based on Shi et al. (2002) and Shi et al. (2004) and showed the most promising results. During molecular laboratory work special care was taken to prevent (cross-)contamination of samples. A large piece of stomach content (0.5 × 0.5 cm) or the entire piece of hindgut (1 cm) was placed into a screwcap microtube (fitted with a rubber seal) with 500 µl of Phosphate Buffered Saline (PBS) at pH 9. Tissue was fragmented thoroughly in each tube to improve extraction efficiency. Samples were heated to 100 °C for 10 min, left to cool on ice for 5 min, and then spun down at 20,000 × g for 5 min. PBS was carefully removed without taking any tissue and replaced by 500 µl of PBS at pH 7.2, and the samples were again heated to 100 °C for 10 min. PBS was again carefully removed and further purification steps were conducted using the commercial NucleoSpin® Tissue DNA extraction kit (Macherey-Nagel, Accession number: 740952) following the manufacturer's protocol. Since more tissue was used than the manufacturer anticipated, multiples (2×, 3×, or 4×) of the manufacturer's recommended reagent volumes were used depending on the amount of initial tissue. Furthermore, digestion was extended from 2 to 48 h. Final elution of DNA from the columns was also extended to 1 h. Workbench wipes (workbench contamination), human saliva wipes (human contamination) and no-template extractions (blanks) were included as contamination controls for amplification and sequencing.

For prey identification a 313 bp region of the COI gene was amplified from the stomach content using the tailed primers NGSmlCOIint and NGSjgHCO2198 according to Leray et al. (2013). The V3 and V4 region (460 bp) of the 16S rRNA gene was amplified using the tailed primers 16s-IllumTS-F and 16s-IllumTS-R to assess the microbiome composition (Klindworth et al., 2013). The reaction mix for the amplicon PCR for COI contained 12.5 µl of MyTaq™ 2x Mix (Bioline, Accession number: BIO-25041), 0.5 µl of each primer (20 µM), 10.5 µl of molecular grade water and 1 µl of DNA template with a PCR profile of 10 s of denaturation at 95 °C, 30 s of annealing at 62 °C and 60 s elongation at 72 °C for 16 cycles with the annealing temperature dropping every cycle by 1 °C, followed by 25 cycles with an annealing temperature of 46 °C. The reaction mix for the amplicon PCR for 16S contained 12.5 µl of MyTaq™ 2x Mix, 2.5 µl of each primer (1 µM), 2.5 µl of DNA template (5 ng µl<sup>−1</sup>) and 5 µl of molecular grade water with a PCR profile of 60 s of initial denaturation at 95 °C followed by 25 cycles of 15 s denaturation at 95 °C, 15 s of annealing at 55 °C and 10 s elongation at 72 °C, finishing with a final extension at 72 °C for 300 s. PCR products were cleaned up using Agencourt AMPure XP beads (Beckman Coulter, Accession number: A63882) following the manufacturer's instructions with a bead to template ratio of 0.8 to 1. Thereafter followed an indexing PCR, which binds a unique primer barcode to each respective sample following Lange et al. (2014) with a PCR mix of 10 µl of MyTaq™ 2x Mix, 0.5 µl of each forward and reverse indexing-primer (to form a unique identifiable primer combination for each sample; 20 µM) and 9 µl of DNA template with a PCR profile of an initial denaturation of 1 min at 95 °C followed by 15 cycles of denaturation for 15 s at 95 °C, 15 s of annealing at 51 °C and 10 s of extension at 72 °C, finishing with a final extension of 5 min at 72 °C. The PCR product was cleaned up again, then quantified using the commercial Quant-iT™ PicoGreen® kit (Thermo Fisher) and pooled, if sufficient template (20 ng) was available. Sequencing took place on an Illumina MiSeq PE 3000 (Genomics Core, KU Leuven, Belgium).
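The touchdown profile used for the COI amplicon PCR can be written out cycle by cycle. The following is a minimal sketch, assuming that the first cycle anneals at 62 °C and that the 1 °C decrement is applied after each of the 16 touchdown cycles; the function name and its parameters are illustrative and not part of the original protocol.

```python
def touchdown_schedule(start_c=62, step_c=1, touchdown_cycles=16,
                       final_c=46, final_cycles=25):
    """Return the annealing temperature (in °C) for every PCR cycle."""
    # Touchdown phase: the annealing temperature drops by step_c per cycle
    # (assumption: the first cycle runs at start_c).
    touchdown = [start_c - step_c * i for i in range(touchdown_cycles)]
    # Plateau phase: fixed annealing temperature for the remaining cycles.
    plateau = [final_c] * final_cycles
    return touchdown + plateau


schedule = touchdown_schedule()
print(len(schedule))                # total number of cycles
print(schedule[0], schedule[15])    # first and last touchdown temperature
print(schedule[16], schedule[-1])   # plateau temperature
```

Under these assumptions the touchdown phase ends at 47 °C, one step above the 46 °C plateau, for 41 cycles in total.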

## Filtering and Statistical Analysis

After the generation of the raw reads, samples were demultiplexed using the bcl2fastq v2.16 tool integrated in the Illumina platform. Barcode mismatch was set to 0 to avoid index cross ambiguity errors. 16S rRNA data was analyzed through QIIME v.1.9.1 (Caporaso et al., 2010) to OTU level via NEPHELE (NEPHELE, 2016) for FASTQ paired-end reads: All reads with ambiguous base calls or a Phred score below 20 were removed from the dataset. Forward and reverse reads were joined with a minimum overlap of 30 bp and a maximum of 25% difference in base calls in the overlapping regions. Reads were classified by similarity based on Operational Taxonomic Units (OTUs) using an open reference approach. Chimeras were identified and removed using uchime (Edgar et al., 2011). OTUs were identified by alignment to the SILVA 128 (Quast et al., 2012) reference database based on 99% similarity. The final output for both 16S rRNA and COI data resulted in an OTU table with number of reads per OTU for each sample. Analysis of COI data followed the protocol of Aylagas and Rodríguez-Ezpeleta (2016). The quality of the reads was checked using FASTQC v0.11.5 (Andrews, 2010) and reads were merged using FLASH v1.2.11 (Magoč and Salzberg, 2011) with a minimum and maximum overlap of 217 bp and 257 bp, respectively. Reads with a Phred score below 25 were removed using Trimmomatic v0.36 (Bolger et al., 2014). Reads were classified into OTUs using an open reference approach with mothur v.35.1 (Schloss et al., 2009). Taxonomic assignment of OTUs was conducted using the Barcode of Life Datasystem (BOLD, www.boldsystems.com). Rarefaction curves were created in Nephele to assess the number of identified OTUs over the number of reads per sample and per group. Data was analyzed using Calypso (Zakrzewski et al., 2016). Principal Coordinate Analysis (PCoA) was applied in R Studio with R v.3.3.2 (R Core Team, 2016). Linear models (LM; R base package "stats" v3.5.0) were utilized to assess the relationship between the microbiome composition as reflected in the second PCoA axis and standard length (SL) and the year of catch. Species identity and location were found to be non-significant, possibly due to small sample size, and were therefore not considered further. Both significant variables (SL, year of catch) are not independent of each other and cannot be disentangled due to the limited sample number caused by large dropout rates (**Figure 1**). Therefore, both variables were used separately to create LMs, which were tested for significance (ANOVA) and adjusted for multiple testing using the Bonferroni correction. The diversity was assessed using Calypso and significance was tested using Tukey's (HSD) post-hoc test.
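The 16S read filtering and joining criteria described above (no ambiguous base calls, Phred score of at least 20, minimum overlap of 30 bp, at most 25% differing base calls in the overlap) can be illustrated with a short sketch. This is not the QIIME or NEPHELE implementation: the function names are hypothetical, the Phred threshold is assumed to apply to every base of a read, and mismatches in the overlap are resolved by simply keeping the forward read's base.

```python
def passes_quality(seq, quals, min_phred=20):
    """Keep a read only if it has no ambiguous base (N) and no base
    with a Phred score below min_phred (per-base threshold is an assumption)."""
    return "N" not in seq.upper() and min(quals) >= min_phred


def join_pair(fwd, rev_rc, min_overlap=30, max_diff=0.25):
    """Join a forward read with the reverse-complemented reverse read.

    Tries the longest possible overlap first and returns the joined
    sequence, or None if no overlap of at least min_overlap bases has a
    mismatch fraction of at most max_diff.
    """
    for olen in range(min(len(fwd), len(rev_rc)), min_overlap - 1, -1):
        tail, head = fwd[-olen:], rev_rc[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches / olen <= max_diff:
            # Keep the forward read in the overlap, append the remainder.
            return fwd + rev_rc[olen:]
    return None


# Toy example: a 40 bp overlap flanked by read-specific sequence.
overlap = ("GGGTTT" * 7)[:40]
forward = "A" * 30 + overlap
reverse_rc = overlap + "C" * 30
joined = join_pair(forward, reverse_rc)
print(len(joined) if joined else None)
```

Production tools additionally use the base qualities to resolve disagreements in the overlap, which this sketch deliberately omits.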

# RESULTS

All data are available online at http://dx.doi.org/10.17632/8cr8yzvsj2.9, which contains metadata for all museum fish (including dorsal and lateral photos) as well as demultiplexed raw MiSeq reads from museum and contemporary samples. Metadata for contemporary samples are available at http://dx.doi.org/10.17632/gk94xj8ydg.1.

## Success Rate of Museum Sample Metabarcoding

The complete workflow was characterized by high dropout rates of museum samples at every stage (**Figure 1**). Initially, 400 specimens of the genus Trematomus were identified in the museum's catalogs. Of these only 225 were suitable for stomach and gut analysis, either due to fish size or to the fact that the intestines had already been removed previously. After extraction and amplification, sufficient DNA (at least 20 ng for sequencing) was available in 84 gut and 67 stomach samples. Only 44 gut and 49 stomach samples were sequenced with at least 1,000 reads per sample. Quality filtering reduced the sample size further to 35 gut and 27 stomach samples. In the final step samples were compared to blank, human contamination, and workbench contamination samples. The microbiome composition (16S rRNA) of most museum samples clustered distinctly apart from the majority of the control samples (**Figure 2A**). Samples with a negative value on PCoA 1 axis (n = 5) clustered close to the control samples. They were therefore removed from the dataset as being contaminated by either the environment (indicated by proximity to workbench samples), the researcher (proximity to human contamination samples), or a combination of both. One blank control (positive value on the PCoA 1 axis) clustered close to the museum samples. This blank control was most likely contaminated by museum samples. The PCoA of the prey item composition (COI) shows that museum samples and control samples were evenly distributed (**Figure 2B**). Here, blank as well as workbench contamination samples cluster within the museum samples, indicating great homogeneity between them. All prey items from museum samples were manually assessed, and the taxonomic classification evaluated (data not shown). Samples were dominated by reads linked to environmental (other species used in the same laboratory) and human contamination. Few reads (<3% in all samples) were of species that actually occur in the Southern Ocean.

fish from natural history museum (gray) and control samples (black).
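The stage-by-stage sample counts reported in this section can be turned into retention rates to show where most samples were lost. The sketch below uses only the counts stated in the text; the stage labels and the helper name are illustrative, and the contamination-screening step is left out because the text does not give a per-stage count for stomach samples.

```python
# Sample counts per workflow stage, as reported in the text.
stages = ["dissected", "sufficient DNA", ">= 1,000 reads", "passed quality filter"]
counts = {
    "gut": [225, 84, 44, 35],
    "stomach": [225, 67, 49, 27],
}


def retention(series):
    """Fraction of samples retained at each stage relative to the previous stage."""
    return [round(after / before, 3) for before, after in zip(series, series[1:])]


for tissue, series in counts.items():
    overall = round(series[-1] / series[0], 3)
    print(tissue, retention(series), "overall:", overall)
```

For both tissues the largest single loss occurs at the extraction and amplification step, where roughly two thirds of the dissected samples drop out.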

## Contrasting Museum and Contemporary Samples

Microbial data (16S rRNA) was reanalyzed for the remaining museum (M) samples together with contemporary (C) Trematomus samples to evaluate a temporal contrast with 23 museum samples, one blank extraction, three human contamination, three workbench contamination controls, and 15 contemporary samples (**Table 1**). In total 2,331,397 reads were obtained for all samples with an average of 51,809 (± SE 10,266) reads per sample. Contemporary samples produced more than twice as many reads as museum samples with averages of 88,578 (± SE 8,399) and 38,899 (± SE 17,774) reads per sample, respectively. The blank sample had 14 reads in total. For most samples rarefaction curves of the number of observed species over number of sequences per sample leveled out at about 1,000 sequences per sample, indicating that the sampling effort was more than sufficient, and a majority of the species present were recorded (**Figure 3A**). Rarefaction curves for the samples clustered by treatment groups (museum samples, contemporary samples, workbench controls, and human controls) show that the curve for museum samples leveled out only at about 4,000 reads, indicating more rare species in the museum samples than in the contemporary samples (**Figure 3B**). Overall, this indicates that sufficient reads per sample and per treatment were obtained in order to sequence a majority of the bacterial species present. For further analysis all OTUs with fewer than two reads were removed from the dataset.
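The logic behind a rarefaction curve and the removal of OTUs with fewer than two reads can be sketched in a few lines. This is an illustration only, not the NEPHELE or Calypso implementation; the OTU names and counts are invented toy data, and a single random subsample per depth is drawn where real tools typically average over several.

```python
import random


def drop_rare(otu_counts, min_reads=2):
    """Remove OTUs with fewer than min_reads reads."""
    return {otu: n for otu, n in otu_counts.items() if n >= min_reads}


def rarefy(otu_counts, depth, seed=0):
    """Number of OTUs observed in a random subsample of `depth` reads."""
    # Expand the count table into one OTU label per read.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return len(set(rng.sample(reads, depth)))


def rarefaction_curve(otu_counts, depths, seed=0):
    """Observed OTU richness at each subsampling depth."""
    return [(d, rarefy(otu_counts, d, seed)) for d in depths]


# Hypothetical sample: two dominant OTUs and two rare ones.
sample = {"otu_a": 500, "otu_b": 300, "otu_c": 5, "otu_d": 1}
print(rarefaction_curve(sample, [10, 100, 806]))
print(sorted(drop_rare(sample)))
```

A curve that stops rising with increasing depth suggests that additional sequencing would recover few further OTUs, which is the interpretation applied to the curves in the text.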

Principal coordinate analysis (PCoA) showed that in both the microbiome (16S rRNA) and the prey item composition (COI) data the contemporary and museum samples clustered distinctly away from each other (data not shown), with little overlap between the two groups. Contemporary samples did not cluster with the control samples in either dataset.

#### TABLE 1 | List of fish included in Figures 2, 4, 5.

For each fish the identification number corresponding to the metadata (museum: http://dx.doi.org/10.17632/8cr8yzvsj2.9; contemporary: http://dx.doi.org/10.17632/gk94xj8ydg.1), the species name (based on morphological identification), the standard length (SL) in cm, the year of catch, and the location of catch are given.

The microbiome composition of Trematomus museum samples proved to be highly diverse (a full abundance list of all taxa can be found online (http://dx.doi.org/10.17632/8cr8yzvsj2.9) under bacterial\_taxa\_summary.html). Here, the focus lies on a comparison between species for which both museum and contemporary samples were available (T. hansoni, T. newnesi, T. loennbergii). Bacterial composition is compared at the phylum and family level (**Figure 4**). The most abundant phylum in all sample groups was Proteobacteria. However, abundance of both phyla and families varied greatly between museum and contemporary samples of the same species. Trematomus hansoni showed the most similar microbiome between museum and contemporary samples when analyzed at the phylum level. However, family level analyses presented large differences as well. In total, museum and contemporary samples shared only 32, 49, and 33 OTUs for T. hansoni (total number of OTUs M: 131, C: 110), T. newnesi (M: 131, C: 134), and T. loennbergii (M: 80, C: 96), respectively (**Figure 5**). In T. hansoni the composition of phyla was most similar with Proteobacteria (M: 75%, C: 79%), Deinococcus-Thermus (M: 14%, C: 2%), Bacteroidetes (M: 2%, C: 9%), and Actinobacteria (M: 0.5%, C: 2%) present in notable (≥1%) abundance in both museum and contemporary samples. Museum and contemporary samples of T. newnesi differed most with only Proteobacteria (M: 79%, C: 22%), Cyanobacteria (M: 1%, C: 60%), Bacteroidetes (M: 2%, C: 1%), and Actinobacteria (M: 1%, C: 1%) occurring in notable abundances. Similarly, T. loennbergii showed overlaps in the phyla Proteobacteria (M: 94%, C: 80%) and Bacteroidetes (M: 1%, C: 5%). A visual comparison at the family level (**Figure 4**) points to greater overlap between different species within the same time frame (museum vs. contemporary) rather than within species (T. hansoni vs. T. newnesi vs. T. loennbergii). Furthermore, museum samples showed much more similarity to each other when compared to contemporary samples, as is reflected in the number of OTUs shared between all species in museum (65 OTUs) and contemporary (33 OTUs) samples (**Figure 5**).
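To put the shared-OTU counts on a common scale, they can be expressed as a Jaccard similarity (shared OTUs divided by the total number of distinct OTUs in either group). The Jaccard index is not used in the original analysis; it is introduced here only as a worked illustration of the reported numbers, and the function name is hypothetical.

```python
def jaccard_from_counts(n_museum, n_contemporary, n_shared):
    """Jaccard similarity |M ∩ C| / |M ∪ C| computed from set sizes."""
    union = n_museum + n_contemporary - n_shared
    return n_shared / union


# (museum OTUs, contemporary OTUs, shared OTUs) as reported in the text.
otu_counts = {
    "T. hansoni": (131, 110, 32),
    "T. newnesi": (131, 134, 49),
    "T. loennbergii": (80, 96, 33),
}

for species, (m, c, shared) in otu_counts.items():
    print(species, round(jaccard_from_counts(m, c, shared), 3))
```

On this measure fewer than a quarter of the OTUs detected in any of the three species are common to museum and contemporary samples.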

An overall comparison between museum and contemporary samples shows that there was no significant difference between average Shannon indices, although museum samples exhibited much less variability than contemporary samples [TukeyHSD: F(1, 21) = 1.988, p > 0.05, **Figure 6A**]. Species richness, on the other hand, was higher and exhibited less variation in museum samples [TukeyHSD: F(1, 21) = 16.85, p < 0.001, **Figure 6B**]. There were no statistically significant differences between evenness of museum and contemporary samples [TukeyHSD: F(1, 21) = 1.5518, p > 0.05, **Figure 6C**].

There were no statistically significant differences [TukeyHSD: F(3, 19) = 6.798, p > 0.05] in average species richness between museum and contemporary samples of T. hansoni (M: 134.4 ± 13; C: 85 ± 38.4), T. newnesi (M: 112.5 ± 4.4; C: 50.8 ± 10.4), or T. loennbergii (M: 98.4 ± 14.3; C: 105.2 ± 8.3). However, there was a significant difference in average species richness between museum T. hansoni and contemporary T. newnesi samples (p = 0.0054). The average Shannon index [TukeyHSD: F(3, 19) = 1.039, p > 0.05] and evenness [TukeyHSD: F(3, 19) = 0.832, p > 0.05] did not differ significantly between groups.
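The three diversity measures compared above can be computed directly from a vector of OTU read counts. The sketch below is a generic illustration and not Calypso's implementation: it assumes the natural logarithm for the Shannon index and Pielou's definition of evenness, neither of which is specified in the text, and the example counts are invented.

```python
import math


def richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts if n > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with reads."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S); defined as 0 for a single OTU."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


even_sample = [10, 10, 10, 10]      # hypothetical, perfectly even community
skewed_sample = [97, 1, 1, 1]       # hypothetical, one dominant OTU

for name, counts in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, richness(counts), round(shannon(counts), 3),
          round(pielou_evenness(counts), 3))
```

The two toy samples have the same richness but very different Shannon indices and evenness, which is why the text reports the three measures separately.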

## Biological Background

Non-contaminated samples of the microbiome analysis differ mainly along the second PCoA axis (**Figure 2A**). PCoA values were extracted and tested for correlations with metadata. There was no correlation of PCoA 2 values of the microbiome composition with either species identity or location. However, there was a correlation between the PCoA 2 values and the standard length of the fish [Bonferroni corrected p = 0.0053, F(1, 21) = 11.6, R<sup>2</sup> = 0.3556, **Figure 7A**] as well as a correlation between the PCoA 2 values of the microbiome composition and year of catch [Bonferroni corrected p = 0.0345, F(1, 21) = 6.685, R<sup>2</sup> = 0.2415, **Figure 7B**]. The linear models explain 35.6 and 24.2% of the variability of the data, respectively. In contemporary fish the microbiome composition differed mainly along the first PCoA axis and these values were therefore used for further analysis. However, the PCoA 1 values did not correlate with any of the potential explanatory variables (**Figure 7A**). Noteworthy is also the distribution of the year of catch of samples from which meaningful results were obtained (**Figure 7B**). A majority of these samples were caught between 1901 and 1913 (n = 14), with another small cluster (n = 4) from 1938/39, another cluster (n = 4) from 1988, and one single sample from 2006.
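The reported R² values follow from the F statistics and their degrees of freedom, because for a linear model F = (R²/df₁) / ((1 − R²)/df₂). The sketch below recomputes R² from the reported F values and shows a generic Bonferroni adjustment. The helper names are illustrative, and the p-values passed to the Bonferroni example are arbitrary placeholders, not values from the study.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """R^2 of a linear model recovered from its overall F statistic."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# F statistics as reported in the text, each with (1, 21) degrees of freedom.
r2_length = r_squared_from_f(11.6, 1, 21)
r2_year = r_squared_from_f(6.685, 1, 21)
print(round(r2_length, 3), round(r2_year, 3))

# Placeholder p-values to show the adjustment (not taken from the study).
print(bonferroni([0.01, 0.6]))
```

Both recomputed values agree with the reported R² (0.3556 and 0.2415) to within the rounding of the F statistics.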

# DISCUSSION

## Museum Sample Metabarcoding

In this study, we targeted the stomach content and internal microbiome of fish that have been stored in museum collections for a prolonged amount of time. Extraction of DNA, amplification of a gene fragment (COI/16S rRNA), and successful sequencing proved extremely difficult and were characterized by high dropout rates. We established an intensive control system in order to ensure that results do not merely reflect contamination by ambient bacterial communities, which has been problematic in many metabarcoding studies (Ficetola et al., 2016). The COI data, targeting prey items in the stomach, showed particularly high contamination rates and even the removal of all OTUs from the dataset that were present in contamination controls yielded no usable results. We therefore conclude that this method as presented and applied here is not suitable to amplify COI fragments from the stomach of museum stored fish specimens.

In contrast, reliable results of the microbiome composition (16S rRNA) were obtained for 23 fish. Even with high dropout rates, this is a promising finding, because it opens up enormous possibilities for future studies to assess the intestinal microbiome of fish from museum collections. The 16S rRNA data of museum samples, targeting the bacterial community within the hindgut, was distinctly different from control samples, indicating no contamination issues. One confounding explanation might be that the bacterial communities found here were remnants of bacterial communities from the ethanol used to preserve the fish in the museum. However, if that was the case, we would expect greater homogeneity in the contamination of the samples, i.e., bacterial communities would be more homogenous between samples, indicated by nearly all OTUs being shared, and more samples would likely have been contaminated. In addition, fish were not previously opened (access to gut) before this study, which should limit contamination within the intestine. Interestingly, the few samples that passed all pipeline and quality control steps were all collected at few time points (year of catch). This possibly reflects a strong influence of the initial (and long-term) preservation method. Formalin (formaldehyde) gained increasing popularity for specimens after the first quarter of the twentieth century in all fields of biological sciences (Herbin, 2013). Formalin replaced the more expensive ethanol as standard preservative, also providing better preservation and higher efficiency, especially for larger specimens. However, even if buffered properly, formalin causes crosslinking among DNA molecules, between DNA molecules and nucleoproteins, and between nucleoproteins alone (Koshiba et al., 1993). This hampers DNA extraction and marker amplification. If not buffered correctly, formic acid forms with time in the formalin preservative. Depending on the pH, the effects of this process can range from structural modifications, through denaturation, to complete depolymerization of the DNA (Thomas and Doty, 1956; Geiduschek, 1958). Furthermore, if conditions remain acidic, DNA hydrolyzes, resulting in further structural changes (Koshiba et al., 1993). In museum collections, the initial preservative (formalin/ethanol fixation) is generally not documented. Formalin, as a preservative, was first introduced in 1891 and gained attention between 1896 and 1937 (Herbin, 2013). Between 1960 and 1980 it was used almost exclusively. Due to the health risks of formalin, starting around 2000 samples were gradually transferred to ethanol, without keeping a record of the initial preservative. Most likely the successful metabarcoding sequences of this study are from specimens that have never been stored in formalin. That might explain why older samples (collected before 1913) have better amplification success and come in batches (same time and same expedition, so preserved in a similar way). Unfortunately, preservation data is not available, so this hypothesis cannot be tested here. We recommend that future museum metabarcoding studies carefully evaluate from which time periods samples are available and whether there is any information on preservation techniques. It would be interesting to specifically test sequencing success in relation to formalin/ethanol use, although many confounding factors (e.g., time between capture and transfer to preservative, organism size, tissue permeability) may be present.

FIGURE 5 | Six-way Venn diagram of the microbial OTU composition in the guts of members of the genus Trematomus (T. hansoni, T. newnesi, T. loennbergii) from museum and contemporary samples.

In cases with no information available, researchers have to critically weigh their options as many samples may fail as shown here, which can drastically increase the cost of such a project. Unfortunately, most surveys and collections from the Southern Ocean date from the second half of the twentieth century, after formalin became the preservative of choice for fish, but before the potential of DNA was taken into account. The use of these samples for molecular studies is therefore probably limited.

It is also unclear why we were able to obtain data for the microbiome composition (16S rRNA; both historical and contemporary) but not for the prey item composition of the historical samples (COI; data for the prey item composition of contemporary samples were generated but not shown, available under http://dx.doi.org/10.17632/gk94xj8ydg.1). It might be that bacterial communities, owing to their cellular structure and properties, are more resistant and cope better with long term storage in the environment of the stomach and intestines, which can include aggressive digestive acids. Prey items may not withstand such an environment as well, with the consequence that their genetic material degrades over the many years of storage. As there is no evidence for this suggestion in the literature, it remains speculation at this point in time. Another possible hypothesis would be that the 16S rRNA primers used in this study are generally more efficient than the ones used for the COI fragment. In combination with low quality template DNA, this might result in a successful amplification of the 16S rRNA fragment, but not the COI fragment. Further technical tests or replication studies are needed to clarify the cause of the problem here and/or to find alternative ways to successfully obtain COI data from historical stomach contents.

DNA metabarcoding studies in general can suffer from poor taxonomic resolution due to primer bias (Deagle et al., 2014). For museum samples, short target fragments are often the only option due to fragmented and/or degraded DNA. If well-preserved samples yielding high-quality DNA are available, metagenomics (i.e., using genome-wide markers or whole-genome data) offers a taxonomically more representative alternative (Porter and Hajibabaei, 2018). Natural history museums may contain samples amenable to such approaches as well (e.g., long-term ethanol-preserved specimens that were not exposed to formalin). Despite these limitations and tradeoffs, museum metabarcoding may offer unique insights into the temporal dynamics and driving factors of fish microbiome variation. Such data can be especially valuable for understanding the responses to decadal to centennial scale environmental changes. In the following we explore temporal and biological signals with our (admittedly small) data set after quality control to showcase potential trends in Antarctic fish microbiome composition.

#### Microbiome Composition Through Time

So far only one study has investigated the microbial gut fauna of Antarctic fishes (Notothenia coriiceps and Chaenocephalus aceratus), using Sanger sequencing (Ward et al., 2009). We found much greater microbial diversity in both museum and contemporary samples of Trematomus spp., which can be expected given that our data are based on high-throughput sequencing. In general, the number of OTUs found in the intestinal microbiome is very variable in marine fish (Sullam et al., 2012). Two of the OTUs found in Ward et al. (2009) were also present in our contemporary samples (AF206298-Ehrlichia sp. "trout isolate," FM178379.1-Aliivibrio salmonicida LFI1238). In nine cases the same genus (Fusobacterium, Photobacterium, Aliivibrio, Desulfovibrio, Mycoplasma, Desulfovibrio, Shewanella, Moritella, Sphingomonas) was present in our contemporary samples. The museum sample microbiomes from our study included three genera (Fusobacterium, Shewanella, Sphingomonas) that were found by Ward et al. (2009) as well. Differences between the two studies are probably largely driven by the different sequencing techniques and possibly to a lesser extent by the different target species. Interestingly, there is less overlap between our museum samples and the results of Ward et al. (2009) than between our contemporary samples and Ward et al. (2009).

We also directly compared microbiome composition between contemporary and museum samples of three Trematomus species and found little overlap. Temporal comparisons of the same species show more variation than between species comparisons from the same sample type. Such results, albeit based on small sample size, could indicate that the microbiome of all fishes has undergone a drastic community shift. While there is large overlap in the functional groups (phylum level) in terms of presence and absence, the actual proportions of each group vary greatly between museum and contemporary samples. At higher resolution of the microbial community (family level), there is a clear change in composition, with few OTUs present in both museum and contemporary samples. This supports the idea that the microbiome may have fundamentally changed in these three species within the last century. We need to stress, however, that due to the large amounts of dropout during quality control our findings are based on very small sample sizes. The observed differences could therefore be driven by an insufficient community representation (especially in T. hansoni and T. loennbergii). Further studies ideally selecting individuals from similar locations and with known preservation history are needed to confirm these patterns. It is also unclear how such a drastic change could have occurred. The microbiome composition can rapidly and drastically be affected by change in behavior (David et al., 2014a), changes in diet (David et al., 2014b) or due to environmental influences such as pollution (Bagi et al., 2018; Chen et al., 2018). Ancient intestinal microbiome studies in humans show great resemblance of coprolite samples (between 8,000 and 1,400 years B.P.) with that of contemporary traditional rural communities, but drastic differences with samples associated with a cosmopolitan lifestyle (Tito et al., 2012). 
This shows that strong community shifts in the intestinal microbiome do occur across populations, but they are generally thought to be associated with environmental pressures or changes in lifestyle (Schnorr et al., 2016). In our case, it is unclear what could have caused the indicated shift. One suggestion might be that humans have an increasing impact on the Antarctic environment. Despite the distance to congested areas, impacts on the Antarctic ecosystem have increased dramatically in the last 100 years. They include direct impacts such as pollution, tourism, and research (Clarke and Harris, 2003), as well as indirect impacts such as the emission of greenhouse gases (Trathan and Agnew, 2010). While the consequences of direct impacts are relatively minor compared to other areas in the world (Halpern et al., 2008; Trathan and Agnew, 2010), indirect impacts have already affected, and are expected to greatly affect and permanently alter, the ecosystems of the Southern Ocean and Antarctica in the near future (Clarke and Harris, 2003; Schofield et al., 2010; Mintenbeck et al., 2012; Griffiths et al., 2017). Changing microbiome composition may be among these alterations. However, only a larger dataset with more environmental information and samples from multiple museum collections will be able to validate this hypothesis. Our results for contemporary samples also represent an important baseline that will be useful for the study of future changes in the microbiome composition of fish.

#### Biological Factors Influencing Microbiome Composition

A correlation between the microbiome composition and the size of the fish, and therefore also its age (White, 1991), was found for museum samples and indicated as well in contemporary samples. The contemporary samples were used as a control in this study with low sample size. The trend hinted at here remains to be validated. There are many cases where the intestinal microbiome evolves throughout the development of an individual, as it changes its lifestyle or diet. There is a clear change in the microbiome composition of children compared to adults (Kau et al., 2011). In fish, body size (and therefore age) plays a crucial role in the interaction between predators and prey (Lundvall et al., 1999). With increasing size bigger prey items can be ingested and smaller ones might become less important. This shift might be reflected in the changing composition of the intestinal microbiome with size. Notothenioid fish have diversified ecologically and therefore feature a variety of life styles and feeding habits. Members of the genus Trematomus also show diversification in habitat use and diet (e.g., Brenner et al., 2001). At the same time they attain a similar range of maximum sizes and undergo ontogenetic shifts in life style (Dewitt et al., 1990). It seems therefore plausible that microbiome composition of Trematomus fishes is dependent on fish size. This hypothesis would be supported by studies concerning the development of the intestinal microbiome of Coho salmon (Oncorhynchus kisutch) where the authors conclude that an early life microbiome is unstable (Romero and Navarrete, 2006). Initial colonization of the gut occurs after first feeding by microbes derived from the water column and the prey items. Over time, initial microbes are outcompeted by strains that are adapted to the intestinal environment.
This is in line with other studies of humans (e.g., Cho and Blaser, 2012) and mice (El Aidy et al., 2012, 2013), which show that the initial intestinal microbiome is highly unstable and subject to change during development. In contrast, some studies support a more vertical transmission (from parent to progeny) of bacteria and highlight its importance (Funkhouser and Bordenstein, 2013). It seems that in Trematomus size, development, and age play a predominant role in the composition of the intestinal microbiome. Species and location appear to be of lesser importance. Larger sample sizes, a better understanding of the life history and more information on feeding habits are needed in order to better comprehend this relationship.

#### CONCLUSION

Here we show how to obtain information about the intestinal microbiome of century-old fish from museum collections through metabarcoding. The feasibility of this provides an excellent opportunity to go back in time and learn about centennial scale microbiome community shifts. Our results indicate that a drastic shift may have occurred in recent years, coinciding with increasing environmental pressure from global change. Furthermore, microbiome composition of Trematomus fishes seems linked to ontogeny, rather than species identity or locality. Due to many samples failing at quality control steps, these findings are based on very small sample sizes and more extensive studies are needed to confirm such patterns. Future studies could try to select specimens with known preservation history and avoid formalin-fixed samples, as these might be responsible for the high dropout rates we experienced. Overall, metabarcoding studies of museum fish harbor great potential for understanding eco-evolutionary processes that lead to adaptation within relatively short time scales.

#### DATA AVAILABILITY STATEMENT

All generated data (raw reads) and metadata for this study can be found in the Mendeley Data repository (http://dx.doi.org/10.17632/8cr8yzvsj2.9) under the title "Historical DNA metabarcoding of the prey and microbiome of trematomid fishes using museum samples." Metadata and COI data for contemporary fish can be found in the Mendeley Data repository (http://dx.doi.org/10.17632/gk94xj8ydg.1) under the title "Trophic assessment of Antarctic fish."

# AUTHOR CONTRIBUTIONS

FH conceived the study with input from HC, BF, AD, GL, GM, AV, and FV. FH collected data, performed molecular laboratory work, and analyzed the data. FH and HC led the writing of the manuscript.

#### FUNDING

This research received support from the SYNTHESYS Project (http://www.synthesys.info/), which is financed by European Community Research Infrastructure Action under the FP7 Capacities Program. It furthermore received funds through the Brilliant Marine Research Idea Philanthropy Award 2017 issued by the Vlaams Instituut voor de Zee (VLIZ), Belgium. Research was funded by the Refugia and Ecosystem Tolerance in the Southern Ocean project (RECTO; BR/154/A1/RECTO) as well as the Ecosystem Responses to global change—a multiscale approach in the Southern Ocean project (vERSO; BR/132/A1/vERSO) (http://rectoversoprojects.be), both funded by the Belgian Science Policy Office (BELSPO). This is contribution number 003 of the RECTO project and contribution number 029 of the vERSO project. HC was supported by a grant from the former Flemish agency for Innovation by Science and Technology (IWT), now managed through Flanders Innovation & Entrepreneurship (VLAIO, Grant No. 141328).

#### ACKNOWLEDGMENTS

We thank the Natural History Museum, London, for museum samples and especially J. Maclaine for assistance during sampling. Furthermore, we thank B. Wallis and the crew of Ocean Expeditions as well as Thomas Desvignes for the acquisition of contemporary samples. We also thank Bart Hellemans for assistance during molecular laboratory work. Furthermore, we thank the reviewers for their constructive and detailed comments and suggestions.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00151/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Heindler, Christiansen, Frédérich, Dettaï, Lepoint, Maes, Van de Putte and Volckaert. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Stochastic and Deterministic Effects of a Moisture Gradient on Soil Microbial Communities in the McMurdo Dry Valleys of Antarctica

Kevin C. Lee<sup>1</sup>, Tancredi Caruso<sup>2,3</sup>, Stephen D. J. Archer<sup>1</sup>, Len N. Gillman<sup>1</sup>, Maggie C. Y. Lau<sup>4</sup>, S. Craig Cary<sup>5</sup>, Charles K. Lee<sup>5</sup> and Stephen B. Pointing<sup>6,7</sup>\*

<sup>1</sup> Institute for Applied Ecology New Zealand, Auckland University of Technology, Auckland, New Zealand, <sup>2</sup> School of Biological Sciences, Queen's University Belfast, Belfast, United Kingdom, <sup>3</sup> Institute for Global Food Security, Queen's University Belfast, Belfast, United Kingdom, <sup>4</sup> Department of Geosciences, Princeton University, Princeton, NJ, United States, <sup>5</sup> International Centre for Terrestrial Antarctic Research, University of Waikato, Hamilton, New Zealand, <sup>6</sup> Yale-NUS College, National University of Singapore, Singapore, Singapore, <sup>7</sup> Department of Biological Sciences, National University of Singapore, Singapore, Singapore

#### Edited by:

Bruno Danis, Free University of Brussels, Belgium

#### Reviewed by:

Angel Valverde, University of the Free State, South Africa

Asuncion de los Ríos, Consejo Superior de Investigaciones Científicas (CSIC), Spain

#### \*Correspondence:

Stephen B. Pointing, stephen.pointing@yale-nus.edu.sg

#### Specialty section:

This article was submitted to Extreme Microbiology, a section of the journal Frontiers in Microbiology

Received: 08 August 2018 Accepted: 12 October 2018 Published: 01 November 2018

#### Citation:

Lee KC, Caruso T, Archer SDJ, Gillman LN, Lau MCY, Cary SC, Lee CK and Pointing SB (2018) Stochastic and Deterministic Effects of a Moisture Gradient on Soil Microbial Communities in the McMurdo Dry Valleys of Antarctica. Front. Microbiol. 9:2619. doi: 10.3389/fmicb.2018.02619

Antarctic soil supports surface microbial communities that are dependent on ephemeral moisture. Understanding the response to availability of this resource is essential to predicting how the system will respond to climate change. The McMurdo Dry Valleys are the largest ice-free soil region in Antarctica. They are a hyper-arid polar desert with extremely limited moisture availability. Microbial colonization dominates this ecosystem but surprisingly little is known about how communities respond to changing moisture regimes. We utilized the natural model system provided by transiently wetted soil at lake margins in the Dry Valleys to interrogate microbial responses along a well-defined contiguous moisture gradient and disentangle responses between and within phyla. We identified a striking non-linear response among bacteria where at low moisture levels small changes resulted in a large impact on diversity. At higher moisture levels community responses were less pronounced, resulting in diversity asymptotes. We postulate that whilst the main drivers of observed community diversity were deterministic, a switch in the major influence occurred from abiotic factors at low moisture levels to biotic interactions at higher moisture. Response between and within phyla was markedly different, highlighting the importance of taxonomic resolution in community analysis. Furthermore, we resolved apparent stochasticity at high taxonomic ranks as the result of deterministic interactions taking place at finer taxonomic and spatial scales. Overall the findings provide new insight on the response to moisture and this will be useful in advancing understanding of potential ecosystem responses in the threatened McMurdo Dry Valleys system.

Keywords: Antarctica, dry valleys, hyporheic, oligotrophic, soil bacteria, soil fungi, water availability

# INTRODUCTION

The McMurdo Dry Valleys of Antarctica constitute the largest ice-free region on the continent (SCAR, 2004). This extreme polar desert presents exceptional environmental and resource challenges for life (Convey et al., 2014). Water bio-availability is limited to a short period during the austral summer when temperatures allow ice and snow to melt (Pointing et al., 2015).

Microhabitats within and beneath rocks support patchy islands of cryptic but well-developed microbial colonization (Yung et al., 2014; Wei et al., 2016; Archer et al., 2017). Soils, however, are a poorly developed oligotrophic and moisture-limited microbial habitat and consequently they support extremely low standing biomass characterized by heterotrophic bacteria (Niederberger et al., 2008; Pointing et al., 2009; Lee et al., 2012). Terrestrial meltwater ponds and streams occur where liquid water can accumulate throughout the Dry Valleys and these aquatic systems support extensive colonization dominated by cyanobacterial mats (Vincent et al., 1993; Vincent, 2002). The coupling of these meltwater features to water input/output also results in expansive wetted hyporheic soils where water is retained over several weeks during the growing season (Gooseff et al., 2003), creating favorable environments for microbial colonization (Ball and Levy, 2015). Developing an understanding of microbial response to moisture as the primary driver of habitability is essential to predicting potential impacts on Dry Valleys ecology in a landscape on the threshold of climate-induced change (Fountain et al., 2014).

Evidence for microbial community responses to moisture is scarce and somewhat contradictory for hyporheic soils, since most previous studies have focused on the aquatic environment. A study of transiently wetted Antarctic hyporheic soils identified a defined community structure between moisture-sufficient (Cyanobacteria-dominated) and <5% moisture soils (Acidobacteria, Actinobacteria, Deinococci and Bacteroidetes-dominated) (Niederberger et al., 2015b), although another comparing Antarctic and hot desert soils did not (Zeglin et al., 2011). Where low moisture soils were experimentally augmented with water and nutrients, a drive toward the desiccation-tolerant taxa Acidobacteria, Firmicutes and Proteobacteria (Buelow et al., 2016), or Actinobacteria and Bacteroidetes (Tiao et al., 2012), was observed. This highlights the complexity of community response when diverse taxa within and between phyla may be present and respond differently along an environmental gradient.

It is reasonable to assume that deterministic processes drive community assembly in response to moisture for this system since this is the key factor determining habitability in the Dry Valleys (Cowan et al., 2014). The deterministic model predicts niche partitioning will result in segregation in terms of species co-occurrence or even aggregation if niche partitioning interacts with environmental stochasticity (Chave, 2004; Dornelas et al., 2006; Chase, 2007), such as a meltwater moisture regime in the Dry Valleys. The Antarctic system is an ideal model against which to test this due to the extreme oligotrophic nature of soils and lack of trophic complexity. It is also reasonable to assume that bacterial taxa do not display the same response to abiotic variables and this has been suggested by one recent study that revealed striking differences even at a simple delineation between photoautotrophs and heterotrophs (Caruso et al., 2011). We therefore predict considerable heterogeneity in moisture response may occur among diverse taxonomic groups.

Microbial response to moisture availability along a contiguous gradient remains poorly defined for Antarctic Dry Valleys soil. In addition, previous studies employed typical distance-related comparisons of communities and so whilst broad trends in overall community have been observed, the approach may have obscured potential differential responses at finer taxonomic resolution, and the latter is more informative with regard to functionality (Fierer et al., 2007). Here we report bacterial community response to a contiguous and well-defined moisture gradient at an unprecedented level of taxonomic resolution. We identify support for a deterministic process driving bacterial diversity shifts in response to moisture availability and establish for the first time that a striking non-linear response to moisture occurs. The findings provide critical new insight on the response to moisture and will allow better predictions of resilience in the threatened McMurdo Dry Valleys system.

# MATERIALS AND METHODS

# Field Sampling

Sampling was conducted around the hyporheic zone of Spaulding Pond, Taylor Valley in the McMurdo Dry Valleys of Antarctica, during the austral summer of 2015. Four transects were defined in north-east (S77° 39.489′, E163° 07.501′), south-east (S77° 39.513′, E163° 07.626′), south (S77° 39.589′, E163° 07.006′), and north-west (S77° 39.470′, E163° 06.336′) facing hyporheic moisture gradients. We adopted a zonal sampling approach where soils were retrieved that matched visible delineations along linear transects extending across the hyporheic zone from the water's edge at each sampling station: Zone (1) Saturated soil adjacent to water's edge; Zone (2) Wet soil as indicated by dark coloration; Zone (3) Ephemerally wet soil, dry but with evidence for previous water indicated by surface evaporites; Zone (4) Dry soil with no indication of recent moisture. Soils from each zone in each transect were recovered using aseptic technique and saturated with Lifeguard solution (Qiagen, Netherlands) (n = 16) with parallel sampling for geochemical and moisture analysis (n = 16). All samples were stored frozen in darkness at −20°C until processed.

# Soil Analysis

Moisture content was estimated gravimetrically after drying soil samples to constant dry mass at 120°C for 48 h. Soil chemical variables known to affect microbial colonization, including pH, extractable cations, cation exchange capacity, phosphorus and sulfur, were measured according to standard chemical analysis methods (Blakemore et al., 1987; Landcare Research<sup>1</sup>). In brief, geochemical tests were conducted using the following methodology: All sediments were dried in a forced air convection drier at 35°C, and after drying, sediments were crushed to pass through a 2 mm sieve. For pH, 10 mL of sediment was slurried with 20 mL of water, and after standing, the pH was measured (1:2 v/v slurry). Cations (K<sup>+</sup>, Ca<sup>2+</sup>, Mg<sup>2+</sup>, Na<sup>+</sup>) were extracted using ammonium acetate (1.0 M, pH 7, 1:20 v/v sediment:extractant ratio, 30 min extraction), and determined by ICP-OES. Cation Exchange Capacity (CEC) was calculated by summation of the extractable cations and the extractable acidity. Phosphorus was extracted using Olsen's procedure (0.5 M sodium bicarbonate, pH 8.5, 1:20 v/v sediment:extractant ratio, 30 min extraction), and the extracted phosphate was determined colorimetrically by a molybdenum blue procedure. For sulfate-sulfur, sediments were extracted using 0.02 M potassium dihydrogen phosphate after 30 min shaking, and sulfate-sulfur was measured by anion-exchange chromatography (IC). Total nitrogen (TN) and total carbon (TC) were determined by the Dumas method of combustion. Each sample was combusted to produce varying proportions of CH<sub>4</sub> and CO gas. The CH<sub>4</sub> and CO gas was oxidized to CO<sub>2</sub> using the catalysts copper oxide and platinum. The CO<sub>2</sub> was measured using a thermal conductivity detector. Available nitrogen was estimated after incubating sediment samples for 7 days at 40°C, after which the ammonium-N was extracted with potassium chloride (2 M potassium chloride, 1:5 v/v sediment:extractant ratio, 15 min shaking), and determined colorimetrically. Moisture was measured in percentage w/w; pH in pH unit; Olsen P (bicarbonate-extractable P), K, Ca, Mg, Na, and S (Sulfur and Sulfate) were measured in mg per kg of soil; cation-exchange capacity (CEC) was measured in me per 100 g of soil.

<sup>1</sup>https://www.landcareresearch.co.nz/resources/laboratories/environmentalchemistry-laboratory/services/soil-testing/methods (accessed April 2018).
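Because the extractable cations are reported in mg per kg while CEC is reported in me per 100 g, the summation described in the Soil Analysis paragraph implies a unit conversion. The Python sketch below illustrates that arithmetic only; it is not the laboratory's procedure. The function names and the example concentrations are hypothetical, and the sketch assumes the usual definition of equivalent weight as atomic mass divided by ionic charge.

```python
# Atomic mass (g/mol) and ionic charge of the four extractable cations.
CATIONS = {
    "K": (39.098, 1),
    "Ca": (40.078, 2),
    "Mg": (24.305, 2),
    "Na": (22.990, 1),
}


def mg_per_kg_to_me_per_100g(element, mg_per_kg):
    """Convert a cation concentration from mg/kg to me/100 g of soil."""
    atomic_mass, charge = CATIONS[element]
    equivalent_weight = atomic_mass / charge  # mg per milliequivalent
    me_per_kg = mg_per_kg / equivalent_weight
    return me_per_kg / 10.0  # 1 kg = 10 x 100 g


def cec_by_summation(cations_mg_per_kg, extractable_acidity_me_per_100g):
    """Sum the extractable cations (in me/100 g) and the extractable acidity."""
    bases = sum(
        mg_per_kg_to_me_per_100g(element, value)
        for element, value in cations_mg_per_kg.items()
    )
    return bases + extractable_acidity_me_per_100g


if __name__ == "__main__":
    # Hypothetical concentrations, for illustration only.
    sample = {"K": 120.0, "Ca": 900.0, "Mg": 150.0, "Na": 300.0}
    print(round(cec_by_summation(sample, 0.5), 2))
```

Keeping the conversion separate from the summation makes it easy to check each cation's contribution before the values are combined.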

#### Environmental 16S rRNA Gene-Defined Diversity

Samples were thawed on ice and three 0.5 g extractions were conducted using the CTAB method optimized for Antarctic oligotrophic environmental samples (Archer et al., 2015). DNA yield was measured in ng/g soil. Illumina MiSeq libraries were prepared as per the manufacturer's protocol (Metagenomic Sequencing Library Preparation Part # 15044223 Rev. B; Illumina, San Diego, CA, United States) as previously described (Lee et al., 2016). PCR targeting the V3–V4 regions of the bacterial and archaeal 16S rRNA gene with the primer set PCR1 forward (5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3′) and PCR1 reverse (5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3′) was conducted using KAPA HiFi Hotstart Readymix (Kapa Biosystems, Wilmington, MA, United States) and the following thermocycles: (1) 95°C for 3 min, (2) 25 cycles of 95°C for 30 s, 55°C for 30 s, 72°C for 30 s, 72°C for 5 min, and (3) holding the samples at 4°C. The amplicons were then indexed using the Nextera XT index kit (Illumina). AMPure XP beads (Beckman-Coulter, Brea, CA, United States) were used to purify the amplicons. Sequencing was conducted with an Illumina MiSeq system (Illumina) with the 500-cycle V2 chemistry at Auckland University of Technology, New Zealand. A 5% PhiX spike-in was used, as per the manufacturer's recommendation.
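The two PCR1 primers contain IUPAC degenerate bases (N and W in the forward primer, H and V in the reverse primer), so each primer is in practice a pool of several oligonucleotides. As an illustration only, the short Python sketch below counts how many distinct sequences each primer represents. The function name `degeneracy` is hypothetical; the lookup table is the standard IUPAC nucleotide code.

```python
from math import prod

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

PCR1_FORWARD = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
PCR1_REVERSE = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"


def degeneracy(primer):
    """Number of distinct non-degenerate sequences encoded by a primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


if __name__ == "__main__":
    print(degeneracy(PCR1_FORWARD))  # N (4 bases) x W (2 bases)
    print(degeneracy(PCR1_REVERSE))  # H (3 bases) x V (3 bases)
```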

#### Data Processing

The paired-end reads were merged using USEARCH v9.0.2132 (Edgar, 2013). The merged reads were then filtered to remove extraneous sequences that were unlikely to be the target marker amplicon (16S rRNA gene). Mothur v1.36.1 (Schloss et al., 2009) was used to remove sequences outside of the 200–500 bp range or containing homopolymers longer than 6 bp. Low quality sequences (>1 expected error) and singletons (except when calculating bacterial richness and diversity) were removed to reduce false identification of operational taxonomic units (OTUs). These curated high quality sequences were clustered de novo with USEARCH using a 97% identity threshold into OTUs. The representative OTU sequences were taxonomically classified using the RDP classifier (Wang et al., 2007) implemented in QIIME v1.9.1 (Caporaso et al., 2010) with the Greengenes 13\_8 reference database (McDonald et al., 2012). From the 16 samples, a total of 4.3 GB (gzipped) of data was generated. The processing identified 321 OTUs, and resulted in a total of 1,173,355 reads and a mean sampling depth of 73,334 reads. All sequence data acquired during this investigation have been deposited in the EMBL Sequence Read Archive (SRA) as BioProject PRJEB27415 under accession numbers ERS2573055 to ERS2573070.

FIGURE 2 | Pairwise comparison of multivariate geochemistry data from the soil samples. The upper diagonal displays the correlations, the diagonal boxes show the densities, and a matrix of scatterplots is shown in the lower diagonal. Linear regressions and 95% confidence intervals in the scatterplots are represented by blue lines/areas. Red lines/areas represent local regressions and their 95% confidence intervals. Moisture was measured in percentage; pH in pH unit; Olsen P (bicarbonate-extractable P), K, Ca, Mg, Na, and S (Sulfur and Sulfate) were measured in mg per kg of soil; cation-exchange capacity (CEC) was measured in me per 100 g of soil. DNA yield was measured in ng/g soil.
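The read-level filters described under Data Processing (length window, homopolymer limit, expected-error cutoff) can be sketched in a few lines of Python. This is a minimal illustration of the criteria, not the Mothur or USEARCH implementation. The function names are hypothetical, and the sketch assumes that the expected error of a read is the sum of the per-base error probabilities implied by its Phred scores.

```python
import re


def expected_errors(phred_scores):
    """Sum of per-base error probabilities, p = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def passes_filters(seq, phred_scores, min_len=200, max_len=500,
                   max_homopolymer=6, max_ee=1.0):
    """Apply the three criteria named in the text to one merged read."""
    return (
        min_len <= len(seq) <= max_len
        and longest_homopolymer(seq) <= max_homopolymer
        and expected_errors(phred_scores) <= max_ee
    )


if __name__ == "__main__":
    read = "ACGT" * 60  # hypothetical 240 bp merged read
    print(passes_filters(read, [40] * len(read)))  # high-quality read
    print(passes_filters(read, [20] * len(read)))  # too many expected errors
```

The thresholds are exposed as keyword arguments so that the same function can be reused with different cutoffs.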

#### Statistical Analyses

Models for regression fitting estimated taxa (OTU) richness (Chao1 index) and diversity (Shannon's index) against soil moisture(%) were compared and selected based on p-values, adjusted R 2 values, and Akaike information criterion (AIC) of the models. In both cases, asymptotic regression model through the origin (SSasympOrig in R) was chosen. However, due to evidence of heteroscedastic residual variances with Shannon's diversity data, its model was fitted with a variance power weighting function. The relationships between the relative abundances of major phyla (overall mean relative abundance > 2%, i.e., Acidobacteria, Actinobacteria, Bacteroidetes, Chloroflexi, Cyanobacteria, Firmicutes, and Proteobacteria) and moisture (%) were investigated using Spearman's rank order correlation, due to violation of assumptions for Pearson's correlation (homoscedasticity and linearity). Cyanobacteria and Actinobacteria showed significant correlation (p < 0.05) confirming monotonic relationships (positive and negative) between the phyla and moisture content in the soil. PERMANOVA was performed using the adonis2 function in R "vegan" package (Oksanen et al., 2015) to analyze partitioning of community distance matrix among sources of variations. The analysis aims to test the significance of factors associated with samples (explanatory variables such as moisture and pH) in explaining the degrees of difference between

communities. Terms including moisture, pH, and soluble cation (combining Mg, K, and Na measurements as a proxy of salinity) were added sequentially to the test. Phylogenetic trees for community phylogenetic structure analysis was constructed with FastTree v2.1.9 (Price et al., 2010). Representative OTU sequences were aligned by MUSCLE v3.8.31 (Edgar, 2004). Net relatedness index (NRI) was calculated based on mean phylogenetic distance (MPD) from the tree. Specifically, the MPD index quantifies mean distance of any taxon from every other taxon and was converted to NRI by multiplying with −1. Null model algorithm based on independent swap (999 randomization) was used to test whether communities were phylogenetically clustered (positive values) or overdispersed (negative values). Results for NRI (**Figure 1F**) was expressed as effects size, i.e., [–(MPD –MPDnull)/SD(MPDnull)]. In significantly clustered communities, OTUs are phylogenetically closer than expected under random sampling (i.e., MPD would be smaller than the average null MPD and thus NRI would be positive and larger than expected under the null model). Bacterial phylogenetic metrics were calculated using the R package "picante" (Kembel et al., 2010) and other package that support phylogenetically informed statistical analyses ("ape," "phylobase," "adephylo," "phytools") (Swenson, 2014). Detecting non-random phylogenetic patterns with metrics such as the Net Relatedness Index would reject the null hypothesis that bacterial taxa associate randomly in terms of phylogenetic relationships. Rejecting this null hypothesis would support, while not prove, the alternative hypothesis that non-random patterns can be generated by selection for shared traits (assuming trait conservatism). Selection can be exerted either by interactions between taxa, or the environment, or both. We decided that non-random phylogenetic patterns discriminated using Net


TABLE 1 | PERMANOVA test statistics of the main environmental factors influencing partitioning between communities.

fmicb-09-02619 November 2, 2018 Time: 15:25 # 6

adonis2 (formula, ps.d ∼ Moisture + pH + MgKNa; data, ps.df).

Relatedness Index were a more robust basis for inference than network analysis, in which interactions must either be assumed a priori or demonstrated experimentally before the analysis (Pascual and Dunne, 2006; Caruso et al., 2012). Network analysis would not allow us to prove biotic interactions, and we would also lose the information embodied in the phylogeny of our taxa. In this case, the Net Relatedness Index could indicate that phylogenetically related bacterial taxa share traits that are key to adapting to the extreme conditions of Antarctic soil. Thus, the fact that bacterial taxa do not associate randomly in terms of phylogenetic relationships implies potential selection for shared traits (either by interactions between taxa or the environment, or both). We believe there is more information in phylogeny-based species distribution (community phylogenetics) than in species distribution alone (which would be the only trait considered by analyzing the species correlation matrix with network analysis).
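The NRI effect size described in this section can be illustrated with a short Python sketch. It is not the picante workflow used in the study. It assumes a precomputed pairwise phylogenetic distance matrix, and for brevity it replaces the independent swap null model with simple shuffling of taxon labels. The function names `mpd` and `nri` and the toy distance matrix are hypothetical.

```python
import numpy as np

def mpd(dist, present):
    """Mean pairwise phylogenetic distance among the taxa present in one sample."""
    idx = np.flatnonzero(present)
    if idx.size < 2:
        raise ValueError("MPD needs at least two taxa")
    sub = dist[np.ix_(idx, idx)]
    # Use each unordered pair once (upper triangle, diagonal excluded).
    return float(sub[np.triu_indices(idx.size, k=1)].mean())

def nri(dist, present, n_null=999, seed=0):
    """Effect size NRI = -(MPD_obs - mean(MPD_null)) / sd(MPD_null).

    Null model (an assumption of this sketch): shuffle which taxa are
    present while keeping the sample's richness fixed.
    """
    rng = np.random.default_rng(seed)
    present = np.asarray(present, dtype=bool)
    observed = mpd(dist, present)
    null = np.array([mpd(dist, rng.permutation(present)) for _ in range(n_null)])
    return -(observed - null.mean()) / null.std(ddof=1)

# Toy phylogeny: two clades of three taxa each; distance 1 within a clade,
# 10 between clades.
dist = np.full((6, 6), 10.0)
dist[:3, :3] = 1.0
dist[3:, 3:] = 1.0
np.fill_diagonal(dist, 0.0)

clustered = nri(dist, [1, 1, 1, 0, 0, 0])  # all taxa from one clade
mixed = nri(dist, [1, 1, 0, 1, 0, 0])      # taxa drawn from both clades
print(f"clustered sample NRI = {clustered:.2f}, mixed sample NRI = {mixed:.2f}")
```

In this toy case the single-clade sample yields a positive NRI (phylogenetic clustering) and the mixed sample a negative one, matching the sign convention stated in the Methods.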

#### RESULTS

#### Clearly Defined Abiotic Gradients Occurred in the Hyporheic Zone

The key abiotic driver of the hyporheic soil landscape was moisture [PERMANOVA, R<sup>2</sup> = 0.31, F(1,12) = 6.7603, p = 0.003; **Table 1** and **Figures 1**–**3**], and moisture data supported the broad delineation of four zones in the hyporheic soil (**Figure 1A**). The gradient of moisture decreased sharply from over 60% at the water's edge to approximately 2% in soils at the transition to arid soil. A steep transition occurred between 6.8 and 15.6% moisture content across all samples (n = 16). Other variables including pH (**Figure 1B**) and soluble cations, a proxy for osmotic challenge in soils (**Figures 1C**, **2**, **3**), also varied along the transects but were not significant after accounting for moisture [PERMANOVA, p = 0.062 (pH), p = 0.4729 (soluble cations); **Table 1**]. These were identified as closely linked covariables with moisture and thus all subsequent analyses employed moisture as the primary abiotic variable. The low variability in other abiotic variables along transects did not display a significant trend.

# Bacterial Diversity Displayed a Non-linear Response to Moisture

Biomass estimation in poorly colonized desert soil is notoriously problematic and so we used recoverable environmental DNA as a proxy for biomass. These suggested relatively low biomass

FIGURE 4 | Asymptotic trends of Chao1 species richness estimator (Adj R<sup>2</sup> = 0.3875, p = 0.00278) and Shannon's diversity index with soil moisture content (Adj R<sup>2</sup> = 0.0679, p = 0.02748). For more detailed description of the models, please refer to Table 2.

TABLE 2 | Asymptotic regression analyses of estimated species richness and diversity showing significant correlations.


AIC, Akaike information criterion; logLik, log likelihood; L(χ<sup>2</sup>), Chi-squared statistic.

in all samples, although those with intermediate sub-saturated moisture levels supported the highest DNA yield (**Figure 1D**). Alpha diversity displayed a clear positive correlation with moisture-defined zones (**Figure 1E**) and analysis of bacterial phylogenetic structure revealed a trend toward random community assembly as moisture decreased within zones more distant from the water's edge (**Figure 1F**). Deeper analysis using regression models revealed a striking non-linear response to moisture for alpha

diversity (**Figure 4** and **Table 2**). At low soil moisture levels of < 6.8%, Chao1 richness estimates and Shannon's diversity index increased sharply with small increments in soil moisture. At higher moisture levels of up to 62.6%, even large increases in moisture resulted in relatively little change in alpha diversity, suggesting communities had reached an asymptotic state and were no longer limited by water.

Community structure was clearly delineated by principal coordinates analysis (PCoA) and weighted UniFrac (biotic data) into three groupings that broadly reflected moisture-defined habitat zones within the hyporheic soils (**Figure 5**). The contrast between the highest-moisture soil cluster and the lower-moisture soil clusters (zones 1–2 vs. zones 3–4) explained 73% of overall community dissimilarity, whilst the other clusters for wet/ephemerally wet soils (zones 3–4) and dry soils (zone 4) explained less than 10% of community dissimilarity. Canonical correspondence analysis (CCA) was used to verify that the abiotic gradient due to moisture and its covariables was the most significant factor in community assembly and taxon occurrence (**Figure 3**).

# Response to Moisture Varied Markedly Between Taxonomic Groups

We identified 19 bacterial phyla, and among these seven were above 2% mean relative abundance (**Figure 6**). These were considered as major phyla and subjected to further analysis. Five distinct patterns in response to the moisture gradient were identified (**Figure 7**). The Acidobacteria and Actinobacteria decreased in overall relative abundance as moisture increased, and displayed evidence for a succession of taxa within each phylum. Bacteroidetes and Chloroflexi displayed no discernible change with moisture and this in part reflected their low overall relative abundance. The Cyanobacteria increased in relative abundance with increasing moisture (Spearman's correlation for the two major phyla: Cyanobacteria S = 174, p = 0.001, rho = 0.744; Actinobacteria S = 1134, p = 0.006, rho = −0.668) whilst maintaining overall taxonomic complexity. The relative abundance of Firmicutes was stochastic across the moisture range. Proteobacteria exhibited an increase in relative abundance with moisture, peaking at 15.57% moisture and decreasing at high levels. For most phyla an abrupt shift occurred between 6.82 and 15.57% moisture, where the abundance of multiple taxa changed markedly. For example, within the Actinobacteria, Euzebyales were abundant only below 6.82%; Chloroflexi were abundant only above 15.57%; and among the Firmicutes, a transition in dominance occurred from Planomicrobium at lower moisture levels to Clostridium above 6.82% moisture.

# DISCUSSION

# A Tipping Point for Bacterial Diversity at Low Moisture Levels

The steep response of diversity at low moisture levels indicated dynamic communities that were highly responsive to moisture. We identify 6.8% soil moisture as a tipping point below which extreme challenge to maintenance of diversity occurs. Previous studies in polar and non-polar deserts have reported differences

FIGURE 6 | Taxa distribution of soil arranged in the order of the moisture gradient (lowest, 2.2% to highest, 62.56%). The stacked bar plots show relative abundance of the taxa of interest to total bacterial abundance. Detailed taxonomic levels therein (e.g., Order, Family, Genus) are shown to illustrate response to moisture at lower taxonomic rank for abundant phyla. The dotted red line indicates the moisture threshold where shifts in member taxa were most common. The position of the dotted red line in each plot is based on the rank order position of the threshold between samples ranked 9–10 in moisture content (dry to wet), which corresponds to the 6.82–15.57% threshold.

between "wet" and "dry" soils without resolution of a contiguous moisture gradient, although comparisons with alpha diversity corroborate our findings (Pointing et al., 2007; Zeglin et al., 2011; Niederberger et al., 2015b). The tipping point occurred at a moisture level regarded as very low for edaphic systems globally (Pointing and Belnap, 2012) although it is typical for Dry Valleys soils and thus the response is relevant on a landscape scale to this Antarctic system.

The influence of pH and soluble salts has also been previously shown to be a major biological determinant across global datasets (Fierer and Jackson, 2006; Lozupone and Knight, 2007), although our study shows that at least for polar desert systems this likely reflects moisture availability since they strongly covaried. Our findings indicate that the fragile and endangered Antarctic terrestrial ecosystem (Cowan and Tow, 2004; Chown et al., 2012) is highly susceptible to shifts in biodiversity under low moisture regimes but that after a threshold soil moisture content of around 25% was reached then limited additional biological change would occur with increasing moisture. This highlights the immediacy of the threat to bacterial diversity from even small increases in liquid water availability due to climate change.

# Biotic Interactions Become Significant at Higher Moisture Levels

We observed phylogenetic patterns consistent with potential over-dispersion in wetter soil and high interdependence in taxon distribution (Kembel and Hubbell, 2006; Kembel, 2009; Cadotte et al., 2010). These patterns could be caused either by limited dispersal or distantly related taxa, or intense and possibly negative biotic interactions, or strong selection by environmental filtering, or a combination of these processes (Harper et al., 1961; Kembel, 2009). The scale of the study and the dispersal capability of Antarctic soil bacteria (Bottos et al., 2013) suggest that limited dispersal may play a minor role, although we cannot rule this out completely. Instead, the wetter communities were dominated by Cyanobacteria and formed significantly non-random assemblages, where an increase in biocomplexity may also be accompanied by greater structural complexity that may influence community assembly (De Los Ríos et al., 2004, 2014) and be phylogenetically non-random due to the shared traits that are necessary to adapt to the extreme environment. This implies that biotic traits may be important in influencing the high degree of cyanobacterial endemism in Antarctic soils (Jungblut et al., 2010; Vyverman et al., 2010; Bahl et al., 2011). Other taxa likely to be influenced by biotic interactions include four of the six most abundant taxa in our study, all of which were psychrophilic species previously identified only from Antarctica (Franzmann et al., 1991; Spring et al., 2003; Reddy et al., 2004), which suggests competitive traits may be desirable in endemic taxa.

In contrast, stochastic processes were more pronounced in drier soil, i.e., the phylogenetic structure of taxa present was closer to being randomly assembled. Stochasticity may be particularly relevant to explaining occurrence of the spore-forming bacteria and other taxa tolerant of environmental stress via poikilohydric responses. For example, the Firmicutes exhibited a particularly high variance in relative abundance throughout the moisture range and particularly low diversity, with only six taxa across two endospore-forming genera. We speculate that this accounted, at least in part, for the stochastic signal by reflecting recolonization from endospore reservoirs following dormancy events where low moisture resulted in extinction for other taxa.

# Implications for Ecosystem Functionality and Resilience

A fairly tight coupling between taxonomy and functionality has been shown for Antarctic Dry Valleys soil bacteria

(Chan et al., 2013; Wei et al., 2016) as well as other systems (Delgado-Baquerizo et al., 2016; Fernández Martínez et al., 2017) and so the taxonomic shifts observed in this study may also reflect shifts in ability to conduct some geobiological transformations. The steep nature of the relationship between moisture and diversity below the tipping point suggests major changes are more likely at low moisture levels. The cyanobacteria-dominated communities supported by high moisture result in net carbon input to the system and this has a potential positive feedback on other trophic levels through provision and utilization of photosynthetic exudates (Niederberger et al., 2015a). This may facilitate persistence of cyanobacterial mats over multiple growing seasons despite relatively low productivity rates per unit biomass (Vincent and Howard-Williams, 1986; Conovitz et al., 1999). Rates of genetic substitution have been linked to water availability and productivity and therefore the diversity pattern we reveal might be due to an elevated tempo of evolution and speciation in wetter soils (Gillman and Wright, 2014). A shift toward more productive soils may lead to accelerated speciation over millennia. However, the more immediate impact would likely be a loss of polar desert-adapted endemicity that has evolved with low moisture or stochastic moisture stress. Given that recruitment estimates to Antarctic soils are extremely low (Burrows et al., 2009) this may reduce overall resilience to further shifts in moisture regime. A further concern is that since we have shown community shifts may occur in response to relatively small changes in moisture and at low moisture levels, the propensity for 'greening' of the Antarctic Dry Valleys in a warmer world should therefore be considered a risk in light of the nature of this system as a unique and protected hyper-arid environment (SCAR, 2004).

#### Stochastic Patterns May Be Deterministic at Taxonomic Resolution Beneath Phylum Level

Our findings have broad implications for understanding the mechanistic basis for microbial community assembly. We highlight the critical importance of scale in taxonomic analysis and demonstrate that it has a fundamental impact on the outcome of estimates for the relative influence of stochastic and deterministic processes on community assembly. The widely accepted dogma is that microbial communities are largely driven by deterministic processes, although most microbial

#### REFERENCES


ecology studies have restricted their consideration of these to environmental filters (Zhou and Ning, 2017). Another important deterministic influence, however, is biotic interaction, since deterministic drivers include all non-random processes. Our study showed that in a simple system moisture availability was the key abiotic process, but that this obscured the strong influence of biotic interactions at finer taxonomic resolution. This has largely been overlooked because of the scale of taxonomic interrogation in many earlier studies of Antarctic microbial ecology, as most have focused on phylum or class level patterns. Stochastic processes also occupy a key role in our system (Volkov et al., 2003; Chave, 2004), especially for oligotrophic or resource-limited habitats where niche factors dominate but there also remains a relatively large amount of random variation that is not a function of environmental variables. Some of this unexplained variance can be resolved when data at the phylum level are further decomposed into data at the family or genus level. For example, in the case of Chloroflexi and Firmicutes in our model system, their distribution appeared stochastic at phylum level but genus-level distribution was clearly a result of deterministic influence. Overall, we advocate that broader consideration of deterministic and stochastic factors be accompanied by fine-scale taxonomic resolution in order to yield a more comprehensive understanding of microbial interactions.

### AUTHOR CONTRIBUTIONS

ML and SP conceived the study. CL, SC, and SP secured research funding. CL led the expedition team in the McMurdo Dry Valleys. LG and ML conducted the fieldwork. KL and SA performed the laboratory experiments. KL, SA, TC, and SP performed data analysis and interpretation. SP wrote the manuscript. All authors read and commented on the draft manuscript.

# FUNDING

Field and logistical support was provided by Antarctica New Zealand. The research was funded by a grant from the New Zealand Ministry of Business, Innovation & Employment (UOWX1401) and Yale-NUS College Start-Up Fund.


communities across a richness gradient in Northern California. Divers. Distrib. 16, 892–901. doi: 10.1111/j.1472-4642.2010.00700.x



Reddy, G. S. N., Matsumoto, G. I., Schumann, P., Stackebrandt, E., and Shivaji, S. (2004). Psychrophilic pseudomonads from Antarctica: Pseudomonas antarctica sp. nov., Pseudomonas meridiana sp. nov. and Pseudomonas proteolytica sp. nov. Int. J. Syst. Evol. Microbiol. 54, 713–719. doi: 10.1099/ijs.0.02827-0

SCAR. (2004). SCAR bulletin 155. Polar Rec. 40, 371–382.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Lee, Caruso, Archer, Gillman, Lau, Cary, Lee and Pointing. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Island Biogeography of Cryoconite Hole Bacteria in Antarctica's Taylor Valley and Around the World

John L. Darcy <sup>1</sup> \*, Eli M. S. Gendron<sup>2</sup> , Pacifica Sommers <sup>2</sup> , Dorota L. Porazinska<sup>3</sup> and Steven K. Schmidt <sup>2</sup>

<sup>1</sup> Department of Botany, University of Hawaii, Manoa, HI, United States, <sup>2</sup> Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, United States, <sup>3</sup> Department of Entomology and Nematology, University of Florida, Gainesville, FL, United States

#### Edited by:

Anton Pieter Van de Putte, Royal Belgian Institute of Natural Sciences, Belgium

#### Reviewed by:

Jianjun Wang, Nanjing Institute of Geography and Limnology (CAS), China Ana M. C. Santos, University of Alcalá, Spain

> \*Correspondence: John L. Darcy jdarcy@hawaii.edu

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

Received: 14 July 2018 Accepted: 22 October 2018 Published: 20 November 2018

#### Citation:

Darcy JL, Gendron EMS, Sommers P, Porazinska DL and Schmidt SK (2018) Island Biogeography of Cryoconite Hole Bacteria in Antarctica's Taylor Valley and Around the World. Front. Ecol. Evol. 6:180. doi: 10.3389/fevo.2018.00180

Cryoconite holes are holes in a glacier's surface caused by sediment melting into the glacier. These holes are self-contained ecosystems that include abundant bacterial life within their sediment and liquid water, and have recently gained the attention of microbial ecologists looking to use cryoconite holes as "natural microcosms" to study microbial community assembly. Here, we explore the idea that cryoconite holes can be viewed as "islands," in the same sense that an island in the ocean is an area of habitat surrounded by a barrier to entry. In the case of a classic oceanic island, the ocean is a barrier between islands, but in the case of cryoconite holes, the ocean is comprised of impermeable solid ice. We test two hypotheses, born out of island biogeographic theory, that can be readily applied to cryoconite hole bacteria. First, we ask to what extent the size of a cryoconite hole is related to the amount of bacterial diversity found within it. Second, we ask to what extent cryoconite holes exhibit distance decay of similarity, meaning that geographically close holes are expected to harbor similar bacterial communities, and distant holes are expected to harbor more different bacterial communities. To test the island size hypothesis, we measured the sizes of cryoconite holes on three glaciers in Antarctica's Taylor Valley and used DNA sequencing to measure diversity of bacterial communities within them. We found that for two of these glaciers, there is a strong relationship between hole size and bacterial phylogenetic diversity, supporting the idea that cryoconite holes on those glaciers are "islands." The high biomass dispersing to the third glacier we measured could explain the lack of size-diversity relationship, remaining consistent with island biogeography. To test the distance decay of similarity hypothesis, we used DNA sequence data from several previous studies of cryoconite hole bacteria from across the world. Combined with our Taylor Valley data, those data showed that cryoconite holes have strong spatial structuring at scales of one to several hundred kilometers, also supporting the idea that these dirty holes on glaciers are really islands in the cryosphere.

Keywords: Antarctica, cryoconite, biogeography, islands, microbial ecology

# INTRODUCTION

Cryoconite holes are microbial oases within the extreme environment of a glacier's surface ice. These holes form when sediment is blown onto the ice and is heated by solar energy, causing it to melt into the glacier's surface. Because cryoconite holes can be easily visually identified on a glacier's surface, and because they are discrete, self-contained, isolated pockets of life occurring in an otherwise inhospitable environment, cryoconite holes have attracted the interest of ecologists studying microbial community assembly who wish to use them as natural microcosm experiments (Ambrosini et al., 2016; Sommers et al., 2018). Much like an island within a sea of glacial ice, each cryoconite hole harbors its own microbial community (Stanish et al., 2013), which is active at least during the summer months when the holes contain liquid water (Telling et al., 2014). Cryoconite holes on the glaciers of Antarctica's Taylor Valley are even more akin to islands than those on Arctic or alpine glaciers in that significant dispersal barriers exist between them, since older holes are "lidded" by ice year-round (Fountain et al., 2004), meaning that their captive microbiota cannot easily escape, and new species cannot easily immigrate. Understanding the extent to which these "natural microcosms" are indeed islands within glaciers is important to the study of cryoconite holes and to their use as model systems for studying microbial community assembly. This is because biogeographic structure of microbial communities is a result of community assembly processes (Nemergut et al., 2013; Darcy et al., 2017), and theory for community assembly in island systems is well developed (Rominger et al., 2016). Here, we investigate the extent to which bacterial communities living within cryoconite holes match two predictions made by island biogeographic theory (MacArthur and Wilson, 1967; Nekola and White, 1999).

The first prediction is that larger islands are more species-rich than smaller islands (MacArthur and Wilson, 1967; Rosenzweig, 1995). This pattern has been shown to be true for bacteria several times, most notably in two model systems with "island sizes" comparable to those of cryoconite holes. Bell et al. (2005) surveyed bacterial diversity as a function of "island" size in tree-holes, which are permanent or semi-permanent pools of water that form at the base of European beech trees due to the buttressing of the tree's roots. They found that bacterial genetic diversity was strongly associated with the volume of water in tree holes, obeying the species-area power law, S = cA<sup>Z</sup>, where S is species richness or species diversity (alpha-diversity), c is a constant that is specific to the location or taxon in question, A is island area (or volume, in the case of tree holes), and Z is the slope of the line relating species to area. Species-area relationships are usually visualized and tested as linear fits of log-transformed S vs. log-transformed A, which is why Z, an exponent, is the slope. The other model system in which island size was a strong predictor of bacterial genetic diversity was the oil reservoirs of machines that are used to cut metal, such as lathes and mills (Van Der Gast et al., 2005). This study used the total volume of the oil sump tank as area, and also found a highly significant relationship between island size and bacterial genetic diversity, even when temporal variation in diversity was accounted for. These patterns have been shown for other microbes as well, including a significant relationship between the size of trees and the diversity of fungi that live among the tree's roots (Glassman et al., 2017). 
Given previous successful demonstrations of bacteria and other microbes following the island biogeographic species-area relationship, if cryoconite holes are indeed island-like, larger cryoconite holes should have more species-rich bacterial communities than smaller holes.
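To make the link between the power law and its log-log form concrete: taking log10 of S = cA<sup>Z</sup> gives log10(S) = log10(c) + Z·log10(A), so an ordinary least-squares line through the log-transformed values recovers Z as the slope and log10(c) as the intercept. The Python sketch below illustrates this with synthetic "islands". The function name `fit_species_area` and the data are hypothetical and are not taken from any of the cited studies.

```python
import numpy as np

def fit_species_area(area, diversity):
    """Estimate c and Z in S = c * A**Z from a straight-line fit in log10 space."""
    log_a = np.log10(np.asarray(area, dtype=float))
    log_s = np.log10(np.asarray(diversity, dtype=float))
    z, log_c = np.polyfit(log_a, log_s, deg=1)  # returns slope, then intercept
    return 10.0 ** log_c, z

# Synthetic islands generated without noise from S = 5 * A**0.25.
area = np.array([1.0, 10.0, 100.0, 1000.0])
diversity = 5.0 * area ** 0.25

c_hat, z_hat = fit_species_area(area, diversity)
print(f"c = {c_hat:.3f}, Z = {z_hat:.3f}")
```

With noise-free input the fit returns the generating values of c and Z. With real data, the scatter around the line is what the regression test evaluates.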

However, relationships between island size and diversity are not comparable between tree holes and machine oil reservoirs, because those two very different microbiomes harbor different sorts of organisms with different baseline diversities for islands of the same size. Similarly, we do not expect bacterial communities from different glaciers to necessarily fall on the same species-area curve. Here, we intentionally sampled three glaciers whose characteristics differ, and used each glacier separately to test our hypothesis that larger cryoconite holes have more diverse bacterial communities than small cryoconite holes. The glaciers we analyze here are all located in Antarctica's Taylor Valley, and have each been the subject of previous microbiological study (Foreman et al., 2007; Stanish et al., 2013; Sommers et al., 2018). These previous studies have found that the Commonwealth, Canada, and Taylor glaciers represent a biomass and diversity gradient across the Taylor Valley (Sommers et al., 2018), making them ideal to study as separate island biogeographic experiments.

The second island biogeographic prediction we test here is that of distance decay of similarity. Simply stated, islands that are geographically close together should have similar communities living on them, and islands that are far apart should have communities that are less similar (more different) from each other (Nekola and White, 1999; Soininen et al., 2007). Microbial communities often follow this pattern, as they are spatially autocorrelated even when sample locations are not "islands" in a strict sense (Mackas, 1984; Franklin and Mills, 2003; Robeson et al., 2011; Wang et al., 2013). There can be many factors that confound this pattern, but it is commonly understood that the main drivers of community-level distance decay of similarity patterns (sometimes called "isolation by distance") (Green et al., 2004; Martiny et al., 2011) are dispersal barriers (the inability of species to travel between distant islands) and environmental dissimilarity (Nekola and White, 1999). In the latter case, distant islands may be less likely to have similar environments than islands that are close together, and species distributions are often functions of their environment, even for bacteria (Fierer and Jackson, 2006; Nemergut et al., 2013). Although this type of spatial pattern is not diagnostic for islands, since many non-island study systems exhibit strong spatial autocorrelation (Robeson et al., 2011; Darcy et al., 2017), it is expected that island communities will be strongly spatially autocorrelated.

Unlike our first hypothesis, which we test here using three glaciers within the same valley, for the distance decay of similarity hypothesis we compiled a global dataset of cryoconite hole bacterial DNA sequence data from past studies. To test for patterns of distance decay of similarity, which are expected if cryoconite holes are truly island-like, we obtained geographic coordinates for each sample from the cryoconite holes we sampled in Taylor Valley and from the other studies we used to assemble the global data set. Using these data, we tested the distance decay of similarity hypothesis at the global scale, but also at the regional scale for Antarctica's McMurdo Dry Valleys.
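A distance decay analysis needs a geographic distance for every pair of samples. The text does not state how distances were derived from the coordinates, so the Python sketch below is only one plausible approach: great-circle distance from the haversine formula on a spherical Earth. The function names and the example coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in metres (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

def pairwise_distances(coords):
    """Symmetric matrix of distances for a list of (lat, lon) tuples."""
    n = len(coords)
    return [[haversine_m(*coords[i], *coords[j]) for j in range(n)] for i in range(n)]

# Made-up coordinates for illustration only (two nearby polar sites, one distant site).
coords = [(-77.70, 162.30), (-77.62, 163.00), (46.40, 10.50)]
matrix = pairwise_distances(coords)
print(f"nearby pair: {matrix[0][1] / 1000:.1f} km, distant pair: {matrix[0][2] / 1000:.0f} km")
```

The resulting matrix can be paired with a community dissimilarity matrix for the same samples to examine how dissimilarity changes with distance.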

# METHODS

# Sample Collection

Cryoconite hole sediment samples were collected frozen in November 2016, from the Taylor, Canada, and Commonwealth glaciers in Taylor Valley, Antarctica (**Figure 1**). More detailed information about these glaciers can be found elsewhere (Sommers et al., 2018). On each of the three glaciers, 30 sample sites were selected in a spatially nested sampling scheme (modified from King et al., 2010), containing geographic scales of roughly 2, 15, and 100 m (**Figure 1**). These 30 samples consisted of 5 triangles with side length 15 m, spaced 100 m from each other, and each of these 5 triangles contained 6 samples, two at each vertex, roughly 2 m apart. For each ice core taken, first the dimensions of the cryoconite hole were measured using a measuring tape. Since most holes were slightly ellipsoid, diameters were measured along each hole's north-south and east-west axes. Geographic locations of cryoconite holes were recorded using precision GPS measurements.

Next, a 10 cm SIPRE corer attached to a handheld gasoline-powered motor was used to extract the cryoconite from the glacier. The corer was centered over the cryoconite hole, and the motor was run until the swarf (debris displaced by the corer and ejected through its flutes) turned brown to indicate it passed through the sediment, and then white again, to indicate it drilled past the sediment and into the ice below. Cores were placed in Whirl-Pak bags (Nasco) and kept in a freezer at −20°C until further processing. Some ice cores broke apart immediately during sampling, but in all cases the sediment "puck" was entirely intact.

# Sample Processing and DNA Sequencing

Each cryoconite hole core was processed in the Crary Laboratory at McMurdo Station to remove excess ice from the sediment, homogenize the sediment, and allocate a subset of the sediment for DNA sequencing. Excess ice was removed from each core using a chisel, then sterile water was used to melt away ∼2 mm of the core's surface to avoid potential contaminants. Cores were then placed into acid-washed plastic beakers and thawed overnight at 4°C, with additional thawing taking place at room temperature where necessary. Thawed cores were then homogenized vigorously using a sterilized spatula, and ∼0.3–0.5 g of homogenized sediment from each core was placed in PowerSoil DNA extraction kit bead-beating tubes (MoBio, USA), and genomic DNA was extracted following the manufacturer's instructions. Extracted DNA was kept at −80°C until it was shipped frozen from McMurdo Station to Boulder, Colorado, USA with a temperature data logger included. The temperature of DNA samples never exceeded −10°C during transit. The 16S ribosomal RNA gene was PCR amplified in triplicate from each sample using the bacterial universal primers 515f and 806r with attached Illumina barcodes unique to each sample (Caporaso et al., 2012). Amplified DNA was pooled and normalized to equimolar concentrations using SequalPrep Normalization Plate Kits (Invitrogen, Carlsbad CA, USA), and was sequenced on the Illumina MiSeq platform using 2 × 250 bp chemistry at the BioFrontiers Sequencing Core Facility at the University of Colorado, Boulder.

# Bioinformatics and DNA Sequence Processing

Raw data from the Taylor Valley cryoconite holes (above) were de-multiplexed and quality filtered, and paired-end reads were joined together using QIIME v.1.9.1 (Caporaso et al., 2010). DNA sequencing data from other studies of cryoconite holes on different glaciers across the Earth (**Table 1**) were downloaded from their respective repositories, and a metadata table was compiled to keep track of each sample and its geographic origins. Spatial locations for each individual sample were taken directly from manuscripts, supplementary materials, or published datasets. When locations were not specific (e.g., one geographic coordinate given for several samples), only the stated value was used; as a result, many pairs of samples had distances of 0 between them even though this was not true at the time of sampling. These 0-distance pairs were excluded from downstream analyses, but pairs containing the same samples at larger distances (>0 m) were retained. Although all studies aggregated here used the same 16S ribosomal RNA gene in their analyses, not all studies used the same regions within that gene. Therefore, in order to compare those different studies, we used QIIME to perform reference-based OTU (operational taxonomic unit, used instead of "species") clustering, which enables analysis of DNA sequencing data generated using different regions of the 16S gene. We used the GreenGenes (DeSantis et al., 2006) database as a reference, because it contains a phylogeny for the sequences within it, and each sequence in the database covers the full length of the 16S gene. This resulted in an OTU table (species × site table), which was then rarefied (randomly downsampled) to 2,000 sequences per sample, so that samples that were sequenced more deeply (e.g., Ambrosini et al., 2016) would not have artificially higher phylogenetic diversity values than samples that were sequenced at a lower depth. Samples with <2,000 sequences were discarded.
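The rarefaction step can be sketched in a few lines of Python. This is an illustration of random downsampling without replacement, not the QIIME routine used in the study. The function name `rarefy`, the seed, and the toy counts are assumptions made for the example.

```python
import numpy as np

def rarefy(counts, depth=2000, seed=0):
    """Randomly downsample one sample's OTU counts to `depth` reads.

    Reads are drawn without replacement. Samples with fewer than `depth`
    reads return None, mirroring the rule that shallow samples are discarded.
    """
    counts = np.asarray(counts, dtype=np.int64)
    if counts.sum() < depth:
        return None
    rng = np.random.default_rng(seed)
    # Each OTU is a "colour"; draw `depth` reads from the pooled reads.
    return rng.multivariate_hypergeometric(counts, depth)

deep_sample = [5000, 3000, 1500, 400, 100, 0]  # 10,000 reads across six OTUs
shallow_sample = [900, 500, 100, 0, 0, 0]      # 1,500 reads, below the cutoff

rarefied = rarefy(deep_sample)
print(rarefied, rarefied.sum())
print(rarefy(shallow_sample))
```

Because every retained sample ends up with the same number of reads, differences in diversity between samples are not driven by sequencing depth alone.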

# Island Size and Phylogenetic Diversity Analysis

The "island size" of each cryoconite hole was calculated as A = π × r<sub>1</sub> × r<sub>2</sub>, where r<sub>1</sub> and r<sub>2</sub> are the two radii of the cryoconite hole (half of the measured diameters). Diversity of each hole was calculated with QIIME using Faith's (Faith, 1992) phylogenetic diversity metric, which uses the sum of branch-lengths in a phylogenetic tree as a diversity index. The phylogenetic tree in this analysis was provided by the GreenGenes database (DeSantis et al., 2006). Faith's (Faith, 1992) phylogenetic diversity was used because it is less sensitive to clustering artifacts (i.e., spurious OTUs) than OTU richness, although this analysis was done with OTU richness as well. Because the three glaciers we studied have significantly different geography and biogeochemistry (Sommers et al., 2018), each glacier was analyzed separately using linear regression. For each of the three glaciers, a linear model was fit to the log10-transformed phylogenetic diversity values as modeled by log10-transformed cryoconite hole areas, and statistically tested using R 3.33 (R Core Team, 2016). This analysis was not performed on data from glaciers outside the Taylor Valley because measurements of cryoconite hole area were not available for the other samples.
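The area calculation and the log-log regression can be sketched as follows. This is an illustrative ordinary least squares fit in plain Python, not the R code used in the study; the diameters and diversity values are synthetic, and the function names are assumptions for this example.

```python
import math

def ellipse_area(d1, d2):
    """Area of an ellipse from two measured diameters: A = pi * r1 * r2."""
    return math.pi * (d1 / 2.0) * (d2 / 2.0)

def fit_loglog(areas, diversities):
    """Ordinary least squares of log10(diversity) on log10(area).

    Returns (slope, intercept, r_squared).
    """
    x = [math.log10(a) for a in areas]
    y = [math.log10(d) for d in diversities]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared

# Made-up diameters (cm) and diversity values, for illustration only.
diameters = [(10, 12), (20, 18), (35, 30), (50, 55)]
areas = [ellipse_area(d1, d2) for d1, d2 in diameters]
diversity = [2.0 * a ** 0.25 for a in areas]  # synthetic power law, exponent 0.25
slope, intercept, r2 = fit_loglog(areas, diversity)
```

On log-log axes a power-law relationship between area and diversity becomes a straight line, so the fitted slope is the exponent of that power law and r-squared is the fraction of variance in log diversity explained by log area.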

#### Spatial Autocorrelation Analysis

Compositional dissimilarity was calculated for each pairwise comparison between samples for the entire data set using unweighted UniFrac distance (Lozupone and Knight, 2005) in QIIME 1.9 (Caporaso et al., 2010). UniFrac distance is similar to Faith's (Faith, 1992) phylogenetic diversity (used above) in that it takes the phylogeny of microbial communities into account. UniFrac distance is the fraction of branch lengths in a phylogenetic tree containing all OTUs from two communities that are not shared by those two communities. The GreenGenes tree (DeSantis et al., 2006) was used for this calculation. The resulting dissimilarity matrix was used in the analysis of spatial autocorrelation of bacterial community structure for each of the three glaciers we sampled in this study (**Figure 1**). We used Mantel tests to test for statistical significance in R 3.33 with the vegan package (Oksanen et al., 2016). Mantel tests were also run for all samples from the McMurdo Dry Valleys (all samples from Antarctic glaciers except Ecology, **Table 1**) and for all samples worldwide (all samples from all glaciers in **Table 1**). Because the global pattern appeared to plateau at small (<10<sup>3</sup> m) and large (>10<sup>6</sup> m) geographic scales while being driven by turnover at intermediate distances, we fit a logistic model to the data using:

$$y = \frac{a - b}{1 + e^{-r(x - i)}} + b \tag{1}$$

where a is the upper asymptote, b is the lower asymptote, r is the exponential rate of change, and i is the x-coordinate of the curve's inflection point. This model was fit to the data using log-transformed geographic distances in R 3.33 (R Core Team, 2016) using downhill simplex optimization as implemented in R's optim function. A linear model was fit as well.
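To make the model concrete, the sketch below writes Equation 1 as a function together with the sum-of-squared-errors objective that a simplex optimizer would minimize. It deliberately stops short of the optimization itself, and the parameter values and data points are invented for illustration; they are not the fitted values from this study.

```python
import math

def logistic(x, a, b, r, i):
    """Four-parameter logistic curve from Equation 1."""
    return (a - b) / (1.0 + math.exp(-r * (x - i))) + b

def sum_squared_error(params, xs, ys):
    """Objective that a downhill simplex optimizer would minimize."""
    a, b, r, i = params
    return sum((logistic(x, a, b, r, i) - y) ** 2 for x, y in zip(xs, ys))

# Illustrative parameter values, not the fitted values from the study.
true_params = (0.9, 0.6, 2.0, 4.5)
log_distances = [1.0, 2.0, 3.0, 4.0, 4.5, 5.0, 6.0, 7.0]  # log10(metres)
dissimilarity = [logistic(x, *true_params) for x in log_distances]

# The objective is zero at the generating parameters and positive elsewhere.
sse_at_truth = sum_squared_error(true_params, log_distances, dissimilarity)
sse_elsewhere = sum_squared_error((0.9, 0.6, 1.0, 3.0), log_distances, dissimilarity)
```

At the inflection point the curve sits halfway between the two asymptotes, which is why i can be read as the distance at which community turnover is fastest.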

# RESULTS AND DISCUSSION

#### Island Size and Bacterial Diversity

Previous studies have shown that cryoconite hole microbial communities in the Taylor Valley follow a biodiversity gradient related to both spatial and environmental variables (Sommers et al., 2018), and our results here agree with those previous findings (**Figure 1A**). Taylor, Canada, and Commonwealth glaciers each had a significantly different mean bacterial phylogenetic diversity within its cryoconite holes, even though the sizes of the cryoconite holes selected for sampling were fairly uniform across glaciers. Thus, similar to other island biogeographic studies, the baseline diversity that an environment can support is a more important determinant of diversity than island size. For example, one would not expect tree hole bacterial diversity (Bell et al., 2005) to have the same relationship to island size as machine oil reservoir bacteria (Van Der Gast et al., 2005), because other factors of those environments such as nutrient availability, temperature, and immigration may be very different and may influence diversity more strongly than island size. While different glaciers within the same valley are clearly not as different from one another as the ecosystems in the aforementioned examples, there is still a large effect of the glacier itself. This effect of local geography or biogeochemistry has been previously documented for cryoconite holes both within the Dry Valleys (Webster-Brown et al., 2015; Sommers et al., 2018) and elsewhere (Liu et al., 2015; Ambrosini et al., 2016).

TABLE 1 | Locations of glaciers.

When we analyzed the glaciers separately, both Canada Glacier and Taylor Glacier exhibited statistically significant relationships between cryoconite hole (island) size and bacterial phylogenetic diversity (**Figure 2**), and a significant relationship with species (OTU) richness as well (**Supplementary Figure 1**). In both of these cases, cryoconite hole surface area explained roughly 30% of the variance in bacterial phylogenetic diversity, indicating that although geography is likely the largest determinant of diversity [through climate and biogeochemical processes; (Bagshaw et al., 2013; Sommers et al., 2018)], island size is an important part of microbial community assembly within cryoconite holes. Island size can change over time as a cryoconite hole ages, since the growth of phototrophs can add to the volume of sediment and new sediments can be added during the rare times that the holes are open to the atmosphere. Diversity in these holes likely fluctuates as a function of hole age as well, so temporal variation in size and diversity may both be limitations of our analysis. Nonetheless, our analysis of Canada and Taylor Glacier diversity data supports our hypothesis that cryoconite hole bacterial communities exhibit a relationship between island size and diversity.

This was not the case for Commonwealth Glacier, however, where cryoconite hole size had no detectable qualitative or quantitative relationship with diversity (**Figure 2**). During our sample collection atop Commonwealth Glacier, we frequently observed pieces (up to 1 cm) of cyanobacterial mat littering the glacier's surface. These same cyanobacteria were present in the cryoconite sediment from Commonwealth Glacier as well, and these conspicuous glacial denizens provided a stark contrast between Commonwealth Glacier and the two more inland glaciers we sampled. Other researchers have noticed these cyanobacterial mats as well, and have jokingly named Commonwealth Glacier "The Jungle of the Dry Valleys." These cyanobacterial mats are blown onto Commonwealth Glacier from nearby streams (McKnight et al., 1999; Taton et al., 2003), but they are not so common atop the glacier that they are abundant in every cryoconite hole. As we melted out cryoconite sediment from Commonwealth Glacier during sample processing, we noticed that not all samples contained visually conspicuous cyanobacterial mat. This wealth of wind-dispersed microbial biomass and diversity may explain why we detected no significant relationship between island size and bacterial diversity on Commonwealth Glacier: each cryoconite hole may encounter so much immigration that the slope of the relationship is significantly lowered, making it more difficult to detect.

Islands that are close to large sources of biodiversity, such as large islands or continents, have increased immigration, meaning that they are at less risk of extinction (loss of diversity) and have a greater chance of recruiting new diversity. In this way, cryoconite holes on Commonwealth Glacier are close to the streams where cyanobacterial mats and other microbial life are abundant, so they essentially have a large nearby continent providing a source of new microbial diversity, and the other two glaciers do not have this feature. We therefore hypothesize that the lack of a relationship between island size and bacterial diversity in Commonwealth Glacier cryoconite holes is the result of the immigration rate being so high for Commonwealth Glacier that island size is not an important factor. The higher average bacterial phylogenetic diversity we and others observed on Commonwealth Glacier supports this hypothesis qualitatively. While cryoconite holes on Canada and Taylor glaciers appear very island-like, cryoconite holes on Commonwealth Glacier may be islands too, albeit islands close to a continental shore.

## Spatial Autocorrelation of Bacterial Communities

Within the McMurdo Dry Valleys, cryoconite holes that were geographically closer together had more similar bacterial communities, while cryoconite holes that were more geographically distant had more dissimilar bacterial communities (**Figure 3**), supporting the distance decay of similarity hypothesis. This pattern was statistically significant (Mantel test; P = 0.001), and the correlation coefficient was fairly large (r<sub>M</sub> = 0.718), indicating that the distance decay of similarity relationship is fairly robust even at the regional scale within the Dry Valleys.
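For readers unfamiliar with the Mantel statistic reported here, the following is a minimal sketch of a one-sided permutation Mantel test using Pearson correlation between two distance matrices. It is written from the general definition of the test rather than from the vegan source code, and the toy matrices, permutation count, and seed are assumptions for illustration.

```python
import random

def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

def upper_triangle(matrix, order):
    """Flatten the upper triangle after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(geo, comm, permutations=999, seed=0):
    """Return (r_M, p_value) for a one-sided permutation Mantel test."""
    n = len(geo)
    identity = list(range(n))
    geo_flat = upper_triangle(geo, identity)
    observed = pearson(geo_flat, upper_triangle(comm, identity))
    rng = random.Random(seed)
    at_least = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel samples in the community matrix
        if pearson(geo_flat, upper_triangle(comm, order)) >= observed:
            at_least += 1
    return observed, at_least / (permutations + 1)

# Toy data: five samples on a line; dissimilarity grows with distance.
positions = [0.0, 1.0, 2.0, 5.0, 9.0]
geo = [[abs(p - q) for q in positions] for p in positions]
comm = [[abs(p - q) / 10.0 for q in positions] for p in positions]
r_m, p_value = mantel(geo, comm)
```

Permuting sample labels rather than individual matrix entries preserves the dependence among pairwise distances that share a sample, which is the reason a Mantel test is used instead of an ordinary correlation test.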

While this relationship was statistically significant at smaller, within-glacier scales, its effect size was extremely low (**Supplementary Figure 2**) compared to analyses including multiple glaciers. We sampled at multiple nested spatial scales within each of the three Taylor Valley glaciers (**Figure 1**), and we expected the spatial structure of cryoconite hole bacterial communities to be fine-scale, occurring at spatial scales on the order of tens or hundreds of meters. Strong spatial structuring at small spatial scales (<1 km) is often found in biogeographic analyses of other supraglacial microbial communities, such as those in debris atop debris-covered glaciers (Franzetti et al., 2013; Darcy et al., 2017). In the present study this was not the case, as very little spatial structuring was observed for each of the three glaciers when analyzed separately (all r<sub>M</sub> values < 0.09). However, it is assumed that spatial structuring on debris-covered glaciers is caused by biogeochemical gradients over space, rather than dispersal limitation. Indeed, for many organisms including microbes, patterns of distance decay of similarity are likely caused more by spatially autocorrelated environmental gradients than by dispersal limitation (Soininen et al., 2007; Astorga et al., 2012). While the physical isolation of Antarctic cryoconite holes made strong spatial structuring seem likely at small spatial scales, it may be that biogeochemical conditions within cryoconite holes on a given glacier are too uniform to produce detectable spatial structuring.

At the global scale, spatial structuring of cryoconite hole bacterial communities is strong and statistically significant (Mantel test; P = 0.001, r<sub>M</sub> = 0.888). All data points from smaller spatial scales that we analyzed for the Dry Valleys data are still present in the global analysis, but in this analysis comparisons between close-together pairs of cryoconite holes from different regions across the globe are overlaid at short distances in the plot (**Figure 3**). In other words, comparisons at 100 meters come from Antarctic, Arctic, and other glaciers worldwide. Therefore, across the world, cryoconite holes that are close together harbor similar bacterial communities, and perhaps more importantly, cryoconite holes that are farther apart do not harbor similar bacterial communities. Without a global-scale analysis of cryoconite hole biogeochemical conditions (which would require uniform global-scale data), it is not feasible to disentangle the effects of dispersal limitation and selection by environmental variables in driving this pattern.

Above distances of 1,000 kilometers, there is no longer any effect of geographic distance on cryoconite hole bacterial community composition (**Figure 3**, **Supplementary Figure 4**), suggesting that even though spatial structuring at geographic distances <1,000 kilometers is a global phenomenon, at larger scales geographic distance no longer plays a role in shaping bacterial community composition of cryoconite holes. This is supported by our logistic model fit (**Figure 3**), which fit our data much better than a linear model considering the linear model's biased residuals (**Supplementary Figure 3**, R<sup>2</sup> = 0.879). The logistic model shows that there is very little distance-decay pattern at small spatial scales (<1 km), a strong pattern at the regional spatial scale (1 to 1,000 km), and very little pattern at larger spatial scales (>1,000 km). This lack of spatial structuring at larger scales is not surprising, since for some microbes, a similar dropoff has been observed at distances of several hundred meters (Robeson et al., 2011). This pattern is not diagnostic of the underlying process creating spatial structuring in cryoconite hole bacterial communities, but may inform future efforts to use cryoconite holes as model systems. Studies wishing to capture the spatial structuring of these bacterial communities (or the potential underlying biogeochemical features of the holes) should sample at scales from one to several hundred kilometers, and should not expect strong spatial structuring at smaller or much larger scales.

# CONCLUSIONS

Cryoconite hole bacterial communities appear "island like," because they exhibit a significant relationship between island size (hole area) and bacterial phylogenetic diversity (**Figure 2**), and show significant distance decay of similarity patterns both at the regional (McMurdo Dry Valleys) scale and at the global scale (**Figure 3**). However, these patterns are not entirely perfect or clear-cut, since not all glaciers analyzed exhibited a relationship between island size and diversity, and distance decay of similarity patterns for cryoconite hole microbial communities were weak (but significant) within glaciers yet very strong at larger spatial scales. We conclude that cryoconite hole bacterial communities do have island biogeography, but the strength of this pattern depends on spatial scale.

#### AUTHOR CONTRIBUTIONS

JD, PS, DP, and SS designed experiments, and JD, PS, and DP collected and processed samples. JD, EG, and PS collected and analyzed data. JD primarily wrote the manuscript, and all authors contributed text and revisions.

#### FUNDING

This work was funded by NSF Polar Programs Award 1443578.

#### ACKNOWLEDGMENTS

The authors thank Diana Nemergut, who began this project. She will be missed. The authors also thank F. Zamora, A. Fountain, and UNAVCO for help in the field and help designing experiments.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00180/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Darcy, Gendron, Sommers, Porazinska and Schmidt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Antarctic Relic Microbial Mat Community Revealed by Metagenomics and Metatranscriptomics

Elena Zaikova<sup>1</sup>, David S. Goerlitz<sup>2</sup>, Scott W. Tighe<sup>3</sup>, Nicole Y. Wagner<sup>1</sup>, Yu Bai<sup>1</sup>, Brenda L. Hall<sup>4</sup>, Julie G. Bevilacqua<sup>1</sup>, Margaret M. Weng<sup>5</sup>, Maya D. Samuels-Fair<sup>5</sup> and Sarah Stewart Johnson<sup>1,6</sup>\*

#### Edited by:

*Alison Elizabeth Murray, Desert Research Institute (DRI), United States*

#### Reviewed by:

*Anne D. Jungblut, Natural History Museum, United Kingdom; Jamie S. Foster, Space Life Science Laboratory, University of Florida, United States; Stephen Brian Pointing, Yale-NUS College, Singapore*

#### \*Correspondence:

*Sarah Stewart Johnson sarah.johnson@georgetown.edu*

#### Specialty section:

*This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution*

> Received: *15 July 2018* Accepted: *03 January 2019* Published: *23 January 2019*

#### Citation:

*Zaikova E, Goerlitz DS, Tighe SW, Wagner NY, Bai Y, Hall BL, Bevilacqua JG, Weng MM, Samuels-Fair MD and Johnson SS (2019) Antarctic Relic Microbial Mat Community Revealed by Metagenomics and Metatranscriptomics. Front. Ecol. Evol. 7:1. doi: 10.3389/fevo.2019.00001*

*<sup>1</sup> Department of Biology, Georgetown University, Washington, DC, United States, <sup>2</sup> Georgetown University Medical Center, Georgetown University, Washington, DC, United States, <sup>3</sup> Advanced Genomics Lab, University of Vermont Cancer Center, Burlington, VT, United States, <sup>4</sup> School of Earth and Climate Sciences, University of Maine, Orono, ME, United States, <sup>5</sup> Department of Earth and Planetary Science, Washington University in St. Louis, St. Louis, MO, United States, <sup>6</sup> Science, Technology, and International Affairs Program, Georgetown University, Washington, DC, United States*

Buried upslope from the modern lakes in the McMurdo Dry Valleys of Antarctica are relict lake deposits embedded in valley walls. Within these relict deposits, ancient microbial mats, or paleomats, have been preserved under extremely arid and cold conditions since the receding of larger paleolakes thousands of years ago, and now serve as a sheltered niche for microbes in a highly challenging oligotrophic environment. To explore whether paleomats could be repositories for ancient lake cells or were later colonized by soil microbes, determine what types of metabolic pathways might be present, analyze potential gene expression, and explore whether the cells are in a vegetative or dormant state, we collected paleomat samples from ancient lake facies on the northern slopes of Lake Vanda in Wright Valley in December 2016. Using a gentle lysis technique optimized to preserve longer molecules, combined with a polyenzymatic treatment to maximize yields from different cell types, we isolated high-molecular weight DNA and RNA from ancient paleomat samples. Community composition analysis suggests that the paleomat community may retain a population of indigenous mat cells that may flourish once more favorable conditions are met. In addition to harboring a diverse microbial community, paleomats appear to host heterotrophs in surrounding soils utilizing the deposits as a carbon source. Whole genome long-read PacBio sequencing of native DNA and Illumina metagenomic sequencing of size-sorted DNA (>2,500 nt) indicated possible cell viability, with mat community composed of bacterial taxa. Metagenome assemblies identified genes with predicted roles in nitrogen cycling and complex carbohydrate degradation, and we identified key metabolic pathways such as stress response, DNA repair, and sporulation. 
Metatranscriptomic data revealed that the most abundant transcripts code for products involved in genetic information processing pathways, particularly translation, DNA replication, and DNA repair. Our results lend new insight into the functional ecology of paleomat deposits, with implications for our understanding of cell biology, Antarctic microbiology and biogeography, and the limits of life in extremely harsh environments.

Keywords: Antarctica, cell survival, DNA repair, dormancy, extremophiles, metagenomics, stress response, transcriptome

#### INTRODUCTION

Antarctica is uniquely positioned to investigate fundamental questions about life in extreme environments. Within the McMurdo Dry Valleys, microbes have adapted to some of the most physically and chemically challenging conditions on Earth (Cary et al., 2010). Previous studies have characterized communities in rock and soil samples in the hyper-arid and cold conditions of the Dry Valleys, and have found that the intense environmental pressure creates specialized communities (Pointing et al., 2009). Yet, the Dry Valleys support more diverse microbial life than was previously believed (Lee et al., 2012). Additional studies have also begun to assess the effects of climate change on these extreme and highly specialized environments (Chan et al., 2013). The region's high latitude produces extreme solar radiative states, with high UV flux in the summer and unabated darkness in the winter. The terrain is scoured by severe katabatic winds, and mean annual air temperatures range between −14.8°C and −30°C (Doran et al., 2002). The annual precipitation in snowfall is only 3–50 mm water equivalent, making it one of the driest deserts on the planet (Fountain et al., 2010).

Despite these challenges, a wide array of microbial life inhabits the Dry Valleys, much of which was unknown before the advent of molecular methods. This is particularly the case within the refugia of a dozen ice-covered lakes containing permanently unfrozen water or brine along the valley floors (Van Trappen et al., 2002; Taton et al., 2003; Karr et al., 2005; Murray et al., 2012). The lakes are thought to be the remnants of larger glacial lakes that once occupied the Dry Valleys (Doran et al., 1994), and geologic features associated with these paleolakes have been noted throughout the area (Doran et al., 1998; Hendy et al., 2000; Hall et al., 2002).

One such lake, Lake Vanda, is permanently ice-covered and meromictic (Canfield and Green, 1985; Spigel and Priscu, 1998; Green and Lyons, 2009). It has a maximum depth of about 75 m; the upper 50 m of the water column are cool, oxic, and oligotrophic, while the lower 20 m of the lake bottom are composed of a warm, hypersaline and anoxic brine (Canfield and Green, 1985; Green et al., 1998; Green and Lyons, 2009). Similar to other lakes in the Dry Valleys, Lake Vanda within Wright Valley is thought to be the remnant of a larger glacial lake, Glacial Lake Wright (Hall et al., 2001). Benthic microbial mats currently line the bottom of Lake Vanda, and moat mats form seasonally in meltwater at the ice's edge (Love et al., 1983; Wharton et al., 1983; Zhang et al., 2015). Throughout Wright Valley, ancient microbial mats have been detected in association with relict shorelines and deltas (Hall et al., 2010). The preservation of these features is excellent, as erosion of landforms associated with lake level change is thought to be minimal (Hall et al., 2010). The paleomat record embedded in the valley walls above Lake Vanda spans ∼30,000 years. Within 10–20 meters of the current shoreline, along facies dating to ∼2,000 years old, such paleomat deposits are particularly prevalent when compared to nearby sites (See **Figure 1**).

Over millennial time scales, these remnant mats are an important locus of biotic activity in the Dry Valleys. Moorhead et al. (1999) pointed to four potential sources of organic matter in the soils of the Dry Valleys: in situ primary production, extraction from endolithic communities, aeolian transport of organic matter from nearby lacustrine environments, and erosion of exposed sediments from ancient lake beds. Of these sources, the largest contributor seems to be erosion of exposed sediments from ancient lakebeds. A comparison of Miers Valley and Beacon Valley found soil from Miers Valley to harbor diverse microorganisms while microbial activity was largely absent within soils from Beacon Valley (Wood et al., 2008a,b). The authors tentatively attributed the difference to the presence of a modern lake in Miers Valley and the absence of lacustrine systems in Beacon Valley.

However, cells can survive over long time periods under conditions of desiccation and extreme cold. Indeed, freeze-drying has long been used as a method to store and transport microbial strains in culture collections (Morgan et al., 2006), and viable cells have been indicated in permanently frozen environments over timescales of tens of thousands (Soina and Vorobyova, 2005; Bergholz et al., 2009) and even hundreds of thousands of years (Johnson et al., 2007). A cultivation-based study of similar paleomat deposits from the Dry Valleys found that select strains of microbial cells could be revived from paleomat deposits that had been stored at room temperature (Antibus et al., 2012a). Although the authors of the study could not fully exclude the possibility of colonization by soil bacteria, ancient mat samples displayed low abundance and diversity of cultivable bacteria, which would be expected due to a loss of viability over time (Antibus et al., 2012a). A paired study assessing the persistence of 16S rRNA from the same samples also found that DNA abundance and integrity declined with sample age over several millennia (Antibus et al., 2012b). Additionally, extracellular DNA has been found to comprise about 40% of the DNA in soil samples, and this relic DNA can confound the determination of microbial diversity, resulting in inflated estimates (Carini et al., 2016).

Here we present a description of the microbial community composition, metagenomics and metatranscriptomics of a desiccated microbial mat from the Wright Valley. The overarching goal of this work was to investigate whether paleomats may serve as "seed banks" of stress-resistant organisms over long timescales, while also providing habitat and a source of carbon to modern soil-dwelling microbes. Our specific aims were to survey the microbial cells associated with Lake Vanda paleomat deposits to: (1) assess whether community structure and taxonomy within the relic mat material differ from those found in surrounding soils and whether they are similar to the composition of modern moat mats, (2) analyze what types of metabolic pathways might be present in the relic mat metagenome, (3) determine what gene transcripts could be recovered from the relic mat, and (4) discern if the cells in the relic mat are surviving via dormancy, a favored explanation for the persistence of microbes in harsh conditions (Jørgensen, 2011), particularly in response to extreme cold (Abyzov et al., 1998) and desiccation (Lebre et al., 2017). For the purposes of this study, modern mats are defined as microbial mats inhabiting the benthic portion of present-day Lake Vanda, and paleomats are defined as thin, desiccated mats buried upslope from the present lake and carbon-dated to millennial time-scales.

One obstacle to undertaking genomics work in Antarctica, however, is that DNA fragments may be preserved over long timescales. The hyper-arid, exceptionally cold, and salty conditions associated with the Dry Valleys paleolakes are optimal for preserving ancient DNA (Smith et al., 2001; Willerslev and Cooper, 2005; Hofreiter et al., 2015). Thus, the detection of genes or gene fragments using molecular methods may simply indicate the presence of well-preserved ancient DNA, not necessarily the viability or intactness of the parent organism. This has been especially problematic for next-generation sequencing techniques, which rely on short fragments of DNA (150–300 bp in length). The DNA fragment lengths associated with ancient dead biomass, even in cold environments, are typically <300 bp in length (Hofreiter et al., 2015). The longest ancient DNA sequence ever generated from a dead specimen on ancient timescales is 1,042 bp, from an exquisitely preserved 9,000-year-old Antarctic sample (Lambert et al., 2002). Therefore, longer fragments of DNA would be indicative of intact cells rather than eDNA or aDNA.

For this study, we used amplicon sequencing of the taxonomic marker 16S rRNA gene to compare the composition of the Lake Vanda paleomat, surrounding soil, and modern mat bacterial communities. To assist in discriminating highly fragmented ancient DNA from DNA derived from viable organisms, we focused on the recovery and sequencing of significantly longer read lengths (>2,500 nt). We leveraged Illumina metagenomic sequencing of whole and long-fragment fractions of DNA in combination with continuous long-read, third-generation Pacific Biosciences sequencing of native DNA to characterize the putative function and structure of the paleomat community. In addition, we isolated and sequenced RNA from the Lake Vanda paleomat to investigate potential transcriptional activity, which truly dormant cells do not undertake, and to identify which genes are expressed, thus providing insight into cell survival strategies.

#### MATERIALS AND METHODS

## Sample Collection

All samples were collected in December 2016 (austral summer), at ∼11 am. Tyvek suits and nitrile gloves were worn to minimize contamination. Paleomat samples were collected from beneath ∼5 cm of soil ∼500 ft upshore of Lake Vanda (S 77° 31.149′ E 161° 38.315′). Sterile, DNA-free copper utensils (ashed at 550°C overnight) were used to excavate the site, and multiple sample replicates were collected. A subset of replicates was collected into sterile cryotubes that were immediately placed into a Taylor Wharton cryoshipper charged with liquid nitrogen, and the remainder was placed into sterile Whirl-Pak bags (Nasco, Fort Atkinson, WI, USA) and immediately placed on dry ice. Soil adjacent to the paleomat samples, ∼20 cm away and judged by eye to be free of paleomat fragments, was collected in the same manner in multiple replicates. The modern mat sample was collected from a moat mat at the near edge of Lake Vanda (S 77° 31.168′ E 161° 38.381′), from beneath ∼5 cm of meltwater. Following helicopter transport to McMurdo Station, samples were either extracted (in the case of the MiniSeq run) or maintained at a temperature of −80°C for ∼1 month, then shipped on dry ice via air transport to Georgetown University, where they were maintained at a temperature of −80°C until additional processing and analysis. Because access to near-shore modern mats was limited by ice cover and biomass concentrations in the surrounding soil were lean, sufficient amounts of high-quality, high molecular weight DNA suitable for metagenomics could not be obtained from these two sample types; they were therefore analyzed only by amplicon-based 16S rRNA gene sequencing. Similarly, to attain µg-quantities of high molecular weight DNA from paleomat samples, several extractions from multiple replicates were pooled.

## Scanning Electron Microscopy

Scanning Electron Microscopy was performed by the UVM Microscopy Imaging Core Facility using the following procedure: briefly, the mat samples were suspended in Karnovsky's fixative (2.5% glutaraldehyde/2% formaldehyde in 0.1 M cacodylate buffer) and rinsed three times with 0.1 M cacodylate buffer followed by post-fixation with 1% osmium tetroxide (OsO4) in 0.1 M cacodylate buffer for 30 min at 4°C. A final rinse of the sample was done with 0.1 M cacodylate buffer, followed by serial ethanol dehydration and critical point drying in CO2. Samples were finally sputter-coated with gold/palladium and imaged using a JEOL JSM-6060 scanning electron microscope (JEOL, Peabody, MA) operating at 10–15 kV.

## δ13C and 14C

To assess the age of the material collected, paleomat samples were analyzed for δ13C and 14C by stable isotope ratio mass spectrometry (SIRMS) and accelerator mass spectrometry (AMS), respectively, at the National Ocean Sciences Accelerator Mass Spectrometry (NOSAMS) Facility at the Woods Hole Oceanographic Institution in Woods Hole, Massachusetts. Data were collected using the Pelletron Tandem 500 kV AMS system (Povinec et al., 2009; Roberts et al., 2010). The measured δ13C value, a function of taxonomy, productivity, and organic carbon burial, was −17.46‰. Radiocarbon ages were calculated using the Libby half-life of 5,568 years, following the convention described by Stuiver and Polach (1977) and Stuiver (1980). The calculated radiocarbon date, 2,150 ± 15 years, was refined with a 2-sigma calibration using the CALIB program and the IntCal13 dataset (Reimer et al., 2013), yielding a 99.6% probability that the calendar-year age lies between 2,040 and 2,148 years BP.
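The conventional radiocarbon age mentioned above follows from the Libby half-life by simple exponential decay. The sketch below is illustrative only: the function names are hypothetical, the fraction-modern value is back-calculated from the reported 2,150-year date rather than taken from the NOSAMS report, and calendar calibration (the CALIB step) is not reproduced.

```python
import math

# Libby half-life used by convention for reporting radiocarbon ages.
LIBBY_HALF_LIFE_YEARS = 5568.0
# Mean life = half-life / ln(2), roughly 8,033 years.
LIBBY_MEAN_LIFE_YEARS = LIBBY_HALF_LIFE_YEARS / math.log(2)


def conventional_radiocarbon_age(fraction_modern: float) -> float:
    """Return the uncalibrated 14C age (years BP) for a fraction-modern value."""
    if fraction_modern <= 0.0:
        raise ValueError("fraction_modern must be positive")
    return -LIBBY_MEAN_LIFE_YEARS * math.log(fraction_modern)


def fraction_modern_from_age(age_years: float) -> float:
    """Inverse relation: fraction modern implied by an uncalibrated 14C age."""
    return math.exp(-age_years / LIBBY_MEAN_LIFE_YEARS)


if __name__ == "__main__":
    # Assumption: the fraction modern is derived here from the reported age.
    fm = fraction_modern_from_age(2150.0)
    print(f"fraction modern ~ {fm:.4f}")
    print(f"age ~ {conventional_radiocarbon_age(fm):.0f} years BP")
```

The uncalibrated age is only the input to calibration; converting it to a calendar-year range requires a calibration curve such as the one cited in the text.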

#### Nucleic Acid Isolation and Sequencing

DNA and RNA were extracted from cells using protocols to maximize lysis efficiency and molecular weight of the molecules isolated.

#### DNA Extraction and Sequencing

DNA was isolated from ancient paleomat samples collected from Lake Vanda using multiple protocols based on either the ZymoBIOMICS DNA Miniprep Kit (Zymo Research, Irvine, CA) or the MoBio PowerSoil DNA Isolation kit (Qiagen, USA), with protocol modifications that depended on the downstream application. Each individual extraction was performed on ∼200 mg of starting material.

For 16S rRNA sequencing on the Illumina MiniSeq platform, DNA from mat and soil samples was extracted using physical lysis by bead-beating in the FastPrep-24 Classic Instrument (MP Biomedicals, Santa Ana, CA, USA) at a speed of 6 m/s for 40 s. Following cell lysis, DNA was purified using the ZymoBIOMICS DNA Miniprep Kit (Zymo Research, Irvine, CA) following the manufacturer's protocol. The DNA was eluted in DNase/RNase-free water and stored at −80°C. Library preparation, pooling, and MiniSeq sequencing were performed at the University of Illinois at Chicago Sequencing Core (UICSQC). Briefly, genomic DNA was PCR-amplified using a primer set targeting the V4 region of the bacterial 16S rRNA gene, CS1\_515F (5′-GTGCCAGCMGCCGCGGTAA-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′), and a two-stage PCR protocol (Bybee et al., 2011; Naqib et al., 2018). Samples were pooled in equal volume, and fragments smaller than 300 bp were removed from the pooled library, which was then sequenced on the Illumina MiniSeq platform together with a phiX spike-in, resulting in 150 bp paired-end (PE) reads.
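Both V4 primers contain degenerate IUPAC bases (M, H, V, and W), so each primer stands for a small family of concrete oligonucleotides. As a minimal sketch, not part of the published workflow, the snippet below expands a degenerate primer into a regular expression and counts how many concrete sequences it represents. The helper names are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"
PRIMER_806R = "GGACTACHVGGGTWTCTAAT"


def primer_to_regex(primer):
    """Compile a degenerate primer into a regex matching every concrete variant."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def degeneracy(primer):
    """Number of concrete sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base].strip("[]"))
    return count


if __name__ == "__main__":
    for name, seq in (("515F", PRIMER_515F), ("806R", PRIMER_806R)):
        print(name, degeneracy(seq), primer_to_regex(seq).pattern)
```

A pattern built this way can be used to locate primer sites at the start of reads before trimming, although dedicated trimming tools also tolerate mismatches, which this sketch does not.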

For shotgun sequencing on the Illumina MiniSeq platform, DNA was extracted using the MoBio PowerSoil DNA Isolation kit (Qiagen, USA) according to the manufacturer's protocol, with the exception of DNA elution in 50 µL of nuclease-free water instead of the reagent provided in the kit (solution C6). The extracted DNA was stored at −80°C. A whole genome shotgun metagenomic sequencing library was constructed from 50 ng of total gDNA using the Nextera XT DNA Library Preparation Kit (Illumina, USA) according to the manufacturer's instructions. The library was sequenced on the Illumina MiniSeq system using paired-end 150 bp read lengths and the MiniSeq High Output Reagent Kit (300-cycles) at the Albert P. Crary Science and Engineering Center, McMurdo Station, Antarctica.

An additional DNA preparation was isolated using enzymatic digestion. Briefly, paleomat samples were incubated with MetaPolyzyme (Sigma-Aldrich, USA), a mix of six enzymes (achromopeptidase, chitinase, lyticase, lysostaphin, lysozyme, and mutanolysin) designed to target bacteria and fungi (Tighe et al., 2017). For this DNA preparation, the bead-beating step was omitted to maximize the yield of high molecular weight DNA. DNA extraction from cell lysates was carried out using a ZymoBIOMICS DNA Miniprep Kit following the manufacturer's protocol, and the resulting DNA was eluted in DNase/RNase-free water and stored at −80°C. DNA from this set of extractions was used for whole genome shotgun sequencing of both the whole DNA fraction and a size-selected (>2,500 bp) gDNA fraction. The sequencing libraries were prepared using the MiSeq Reagent Kit v3 according to the manufacturer's instructions, and were sequenced at UICSQC on an Illumina MiSeq, producing 2×300 bp reads.

DNA was isolated for PacBio sequencing using enzymatic digestion and the ZymoBIOMICS DNA Miniprep Kit, following the manufacturer's protocol with modifications as described above. The sequencing library was prepared following the Pacific Biosciences 2 kb SMRTBell Template Preparation and Sequencing protocol with an additional bead-based template cleanup using 0.5X AMPure PB beads to eliminate smaller-sized fragments. A final library QC was performed with a Qubit fluorometer to determine library concentration and an Agilent Bioanalyzer 2100 (Agilent Genomics) using the DNA 12000 reagent kit to determine the size distribution of the library. The prepared sequence library was loaded onto a single Sequel v2.1 SMRTcell at a concentration of 6 pM, following the PacBio diffusion loading protocol and including a polymerase-bound complex cleanup. One 600-min movie was taken of the SMRTcell.

In all extraction protocols unless otherwise specified, DNA concentration, quality, molecular weight and fragment size distribution were determined via the Agilent Bioanalyzer 2100 using the HS DNA Assay (Agilent, USA). Spectrofluorometry using a Qubit High Sensitivity DNA Assay (Thermo Scientific, United States) was used to quantify the amount of gDNA present in the extractions. In both assessments the manufacturer's protocols were followed.

#### RNA Extraction and Sequencing

Total RNA was isolated from ∼500 mg of ancient paleomat material collected from Lake Vanda (WA3A) using the Zymo Direct-zol RNA MiniPrep kit (Zymo Research Corp., USA) according to the manufacturer's instructions. Briefly, the paleomat material was mechanically disrupted on dry ice in the original collection tube using a sealed, disposable RNase-free spatula. TRIzol reagent (500 µl) was added directly to the sample, which was subsequently vortexed and centrifuged at 12,000 × g for 30 s to remove particulates, and the supernatant was transferred to an RNase-free tube. Cell lysis in TRIzol was repeated for the remaining paleomat material to provide a total of 1,000 µl supernatant, which was then mixed with one volume of 100% ethanol and loaded onto a Zymo-Spin IIC Column. The RNA sample was treated in-column with 5 U of DNase I, washed and eluted in 20 µl of RNase-free water. RNA yield and integrity were determined via fluorometric quantitation using the Qubit RNA HS Assay Kit (Thermo Scientific, United States), and the Agilent Bioanalyzer 2100 using the RNA 6000 Nano Kit (Agilent, USA) (**Figure S1**). Sequencing libraries were constructed from 2 ng of total RNA using the SMARTer Stranded Total RNA-Seq Kit v2-Pico library preparation kit (Takara Bio, USA) according to the manufacturer's instructions using protocol option 2 (without prior RNA fragmentation). Library preparation included both an RT (-) control (using paleomat RNA as template, but excluding the SMARTScribe Reverse Transcriptase enzyme), and a negative control using water as template. Both controls resulted in no library construction, confirming the absence of any contaminating gDNA. The final RNASeq (cDNA) library was sequenced on the Illumina MiSeq System using paired-end 300 bp read lengths and the MiSeq Reagent Kit v3 (600-cycle) (Illumina, Inc, USA) at the Genomics and Epigenomics Shared Resource at Georgetown University Medical Center.

#### qRT-PCR

The RNA extraction was tested for gDNA contamination using qRT-PCR performed on a C1000 thermal cycler fitted with a CFX96 detection module, following the method described in Zaikova et al. (2010). The target gene, 16S rRNA, was amplified using either bacteria-specific (27F, 5′-AGAGTTTGATCMTGGCTCAG-3′) or archaea-specific (20F, 5′-TTCCGGTTGATCCYGCCRG-3′) forward primers and the universal reverse primer 519R (5′-GNTTTACCGCGGCKGCTG-3′). A 20 µL reaction consisting of 2 µL template, 0.6 µL each of the forward and reverse primers, 10 µL of SsoAdvanced Universal SYBR Green Supermix (Bio-Rad, USA) and 6.8 µL of nuclease-free water was prepared for each sample. Water was used as the template for the negative control reaction. Serial dilutions across 7 orders of magnitude of chromosomal copy number (10<sup>7</sup> to 10<sup>1</sup>) of custom gBlocks DNA oligos based on cosmopolitan gram-negative and gram-positive 16S bacterial sequences, and Thaumarchaeal and Euryarchaeal 16S archaeal sequences (IDT, USA), were used as the standard bacterial and archaeal PCR controls. Thermal cycle parameters included an initial denaturation at 98°C for 3 min, followed by 35 cycles of 98°C for 15 s, primer annealing at 55°C (for bacteria) or 65°C (for archaea) for 30 s, and extension at 72°C for 15 s, followed by a melt curve after the 35 cycles. The melt curve was performed from 65 to 95°C in 0.5°C increments.
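A dilution series of standards like the one described here is typically turned into a standard curve by regressing quantification cycle (Cq) on log10 copy number. The following sketch shows that calculation under stated assumptions: the Cq values are invented for illustration (the study does not report them), and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency from the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Interpolate the copy number of an unknown sample from its Cq."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical standards: 10^7 down to 10^1 copies, with made-up Cq values.
STANDARD_COPIES = [1e7, 1e6, 1e5, 1e4, 1e3, 1e2, 1e1]
STANDARD_CQ = [14.76, 18.08, 21.40, 24.72, 28.04, 31.36, 34.68]

if __name__ == "__main__":
    slope, intercept = fit_standard_curve(STANDARD_COPIES, STANDARD_CQ)
    print(f"slope={slope:.2f}, efficiency={amplification_efficiency(slope):.1%}")
    print(f"Cq 30 corresponds to ~{copies_from_cq(30.0, slope, intercept):.0f} copies")
```

In a contamination test of this kind, an RNA sample whose Cq is indistinguishable from the water control would fall outside the standard curve entirely, which is the outcome of interest.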

# Sequence Analysis

#### Microbiome Composition Analysis

Bacterial 16S rRNA gene sequences were processed using the dada2 R package (Callahan et al., 2016) implemented on R version 3.5.0 through RStudio version 1.1.447 (R Core Team, 2013; RStudio Team, 2016). First, demultiplexed raw reads were trimmed and filtered to remove phiX sequences, primer sequences and low-quality bases, and error rates were estimated for each sample. Identical sequences were then dereplicated into unique sequences, and their corresponding abundance was noted for each sample. Forward and reverse reads were merged, chimeric sequences were removed, and an abundance distribution table was constructed for each of the unique merged sequences, termed exact sequence variants (ESVs). Taxonomy was assigned using the naive Bayesian classifier implemented in dada2, with Silva v128 as the reference database (Quast et al., 2013). Alpha diversity was calculated using the phyloseq R package (McMurdie and Holmes, 2013), distributions of relative taxonomic abundances were visualized with Krona (Ondov et al., 2011), and a Venn diagram comparing ESV distribution among the three samples was produced in R. Since only three samples were examined, and no biological replicates were sequenced, we could not carry out comparative statistical analyses.
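The Venn comparison of ESVs among the three samples amounts to classifying each ESV by the set of samples in which it was detected. The study did this in R; the Python sketch below only illustrates the logic, using a small invented abundance table and hypothetical names.

```python
from collections import Counter


def venn_regions(abundance_table):
    """Count ESVs per Venn region, keyed by the set of samples containing them.

    abundance_table maps ESV id -> {sample name: read count}.
    """
    regions = Counter()
    for counts in abundance_table.values():
        present = frozenset(s for s, n in counts.items() if n > 0)
        if present:  # skip ESVs with no reads in any sample
            regions[present] += 1
    return regions


# Toy table for illustration only; these are not the study's data.
TOY_TABLE = {
    "ESV_1": {"modern_mat": 5, "paleomat": 12, "soil": 40},
    "ESV_2": {"modern_mat": 90, "paleomat": 0, "soil": 0},
    "ESV_3": {"modern_mat": 0, "paleomat": 30, "soil": 7},
    "ESV_4": {"modern_mat": 0, "paleomat": 8, "soil": 0},
}

if __name__ == "__main__":
    for samples, n_esvs in venn_regions(TOY_TABLE).items():
        print(sorted(samples), n_esvs)
```

Keying regions by a frozenset of sample names keeps the function independent of the number of samples, so the same code would serve a two-way or four-way comparison.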

#### Metagenomic Read Assembly and Annotation

#### **Illumina Data Assembly, Metagenomic Binning, and Annotation.**

Adapters, sequencing artifacts and poor-quality bases were trimmed from Illumina reads, and the trimmed reads were subsequently filtered for quality and length using BBDuk (Bushnell, 2014). The trimmed, filtered reads were assembled into contigs and scaffolds using MEGAHIT (Li et al., 2015, 2016). BWA MEM (Li, 2013) was used to map reads back to the assembly and samtools (Li, 2011) was used to convert between formats and calculate coverage. MetaBAT2 (Kang et al., 2015) was used to place sequences into genomic bins, using parameters optimized for sensitivity. CheckM (Parks et al., 2015) was used to assess binning quality and ensure that no sequence appeared in more than 1 bin. Following metagenomic binning, 16 MAGs with low contamination (<5%) and high completion (>90%) were produced. Of these, 9 were obtained from the MiniSeq assembly, 6 from the size-selected and 1 from the whole fraction MiSeq assemblies. Average nucleotide identity was calculated using JSpeciesWS using blast+ and MUMmer approaches (Richter et al., 2016). As the results using the two methods were highly similar, only the blast-based result is reported here.
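The quality thresholds stated above (completion >90%, contamination <5%) can be applied as a simple filter over per-bin quality estimates. This is a sketch only: the record type, function name, and bin values are hypothetical, and it does not parse CheckM's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinQuality:
    """Per-bin quality estimates, both expressed as percentages."""
    name: str
    completeness: float
    contamination: float


def high_quality_bins(bins, min_completeness=90.0, max_contamination=5.0):
    """Keep bins strictly above the completeness and below the contamination cutoffs."""
    return [
        b for b in bins
        if b.completeness > min_completeness and b.contamination < max_contamination
    ]


# Invented example bins, not values from the study.
EXAMPLE_BINS = [
    BinQuality("bin_a", 95.2, 1.1),
    BinQuality("bin_b", 88.0, 0.5),   # fails completeness
    BinQuality("bin_c", 97.0, 6.3),   # fails contamination
    BinQuality("bin_d", 90.0, 4.9),   # exactly 90% is not strictly above the cutoff
]

if __name__ == "__main__":
    for b in high_quality_bins(EXAMPLE_BINS):
        print(b.name, b.completeness, b.contamination)
```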

Metagenomic assemblies and high-quality genomic bins were annotated ab initio using Prokka version 1.12 (Seemann, 2014). The amino acid sequences of predicted open reading frames (ORFs) within each dataset were analyzed using KAAS to map the KO numbers to the KEGG GENES database (Moriya et al., 2007). Taxonomy of the ORFs identified by Prokka was assigned using Kraken (Wood and Salzberg, 2014). Gene assignments, taxonomy and KEGG pathway distributions were summarized, analyzed and plotted using R v 3.5.0.

#### **PacBio Sequence Assembly and Annotation.**

To build the PacBio assembly, Bamtools 2.4.1 (Barnett et al., 2011) was used to convert the CCS (Circular Consensus Sequence) processed subreads from bam format to a fasta file. No additional filtering was performed. Canu 1.6 was used to build the PacBio assemblies. The taxa present in both the CCS-corrected reads and CCS-corrected contigs were identified with Kraken 1.0 and visualized in Krona, and functional annotation was done using Prokka, for consistency with analyses of Illumina metagenomic data. With the exception of taxonomic distribution, all other outputs were visualized and summarized using R.

#### Transcriptome Analysis

RNA reads were trimmed and filtered for length and quality using BBDuk. Reads passing QC were mapped to contigs produced by assembly of Illumina and PacBio metagenomic sequences. To identify reads that represented subunits of the rRNA gene, reads were mapped to the small subunit (16S rRNA and 18S rRNA) and large subunit (23S rRNA and 28S rRNA) rRNA genes available in the Silva v 132 database (Quast et al., 2013). Reads that did not map to rRNA genes were extracted using samtools and used for transcriptome analysis. Non-rRNA reads were then mapped to the Prokka-predicted genes and assemblies, again using bwa. Alignments were assessed using samtools and R. Transcripts per kilobase million (TPM) calculation was performed in R. KEGG category expression was based on predicted genes that had >10 reads mapped and the summed corresponding TPM. As with metagenomic KEGG distributions, the same predicted gene can participate in more than one pathway, so the total exceeds the number of genes predicted.
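The TPM calculation referred to above normalizes read counts first by gene length and then by the total of the length-normalized counts. The authors performed it in R; the sketch below is a Python illustration with hypothetical function names and invented counts, and it also shows the >10 mapped reads cutoff used for KEGG category expression.

```python
def tpm(read_counts, gene_lengths_bp):
    """Transcripts per kilobase million for each gene.

    read_counts and gene_lengths_bp are dicts keyed by gene id.
    """
    # Reads per kilobase of gene length.
    rpk = {g: read_counts[g] / (gene_lengths_bp[g] / 1000.0) for g in read_counts}
    scale = sum(rpk.values()) / 1e6
    if scale == 0:
        return {g: 0.0 for g in rpk}
    return {g: value / scale for g, value in rpk.items()}


def genes_above_read_cutoff(read_counts, min_reads=10):
    """Gene ids with strictly more than min_reads mapped reads."""
    return sorted(g for g, n in read_counts.items() if n > min_reads)


# Invented example values for illustration.
EXAMPLE_COUNTS = {"gene_a": 100, "gene_b": 100, "gene_c": 4}
EXAMPLE_LENGTHS = {"gene_a": 1000, "gene_b": 2000, "gene_c": 500}

if __name__ == "__main__":
    values = tpm(EXAMPLE_COUNTS, EXAMPLE_LENGTHS)
    for gene in genes_above_read_cutoff(EXAMPLE_COUNTS):
        print(gene, round(values[gene], 1))
```

Because TPM values always sum to one million within a sample, two genes with equal read counts differ in TPM only through their lengths, as gene_a and gene_b do here.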

#### Sequence Accession Numbers

Metagenomic, metatranscriptomic and SSU rRNA amplicon sequences reported in this study were deposited in the GenBank Sequence Read Archive with accession numbers SRR8217969 - SRR8217976 and project accession PRJNA506221. Metagenomic assemblies, annotations and ESV table are available upon request.

# RESULTS

#### Paleomat Sample Appearance and Age

The Lake Vanda paleomat deposits sampled appeared as desiccated, thin, beige flakes (**Figure 1**), present in regular layers about 5 cm below the surface, and less structured layers at ∼20 cm below surface. The age of the Lake Vanda paleomats studied here was calibrated to be ∼2,100 ± 50 years (see Materials and Methods, Pathway Analysis and Survival Strategies), though the age of the mat and the age of the cells may differ, as viable cells present may only contribute a small fraction of the total carbon. The measured δ13C value of the paleomat was −17.46‰, most likely indicative of a primarily heterotrophic community.

Scanning electron microscopy imaging of the paleomat revealed the presence of several types of intact cells within the sheet-like mat flakes (**Figure 2A**). The mat material appeared layered and quite porous in texture (**Figure 2B**), and was covered by what appeared to be filamentous cells (**Figure 2C**). The filaments were ∼1.5 µm wide and were associated with coccus cells, which were ∼0.8 µm in diameter and often appeared in clusters (**Figures 2D,E**). In addition to filaments and coccus cells, rod-shaped cells were also observed in the sample, as were coryneform (club-shaped) cells that were observed in mat material grooves (**Figure 2F**). The coryneform cells had distinct banding on the cell surface and were ∼1 µm long and ∼0.4 µm wide.

#### Sequencing Datasets Summary

To characterize the community structure of the Lake Vanda paleomat microbial community, as well as the functional potential encoded by the genomes of organisms and their transcriptional activity, we generated 16S rRNA gene amplicon, four metagenomic, and one metatranscriptome datasets (summarized in **Table S1**). Bacterial composition and distribution in the paleomat, modern mat and soil surrounding the paleomat were explored using 16S rRNA gene sequencing. For each of the three samples, paired-end reads across the V4 region of the gene were generated, for a total of 305,452 reads for the modern mat, 363,111 reads for the paleomat, and 96,133 reads for the surrounding sand. After trimming, quality filtering, and removal of chimeras, singletons, and unassigned ESVs, 93% of modern mat reads, 92% of sand reads, and 80% of paleomat reads were used in composition analysis and description. These reads formed 964 ESVs: 476 in the modern mat, 481 in the paleomat, and 106 ESVs in the soil sample.

To investigate whether short eDNA and aDNA fragments would impact sequencing results, metagenomic data were generated using two different extraction methods, with differences in DNA fragments sequenced and sequencing platform, as detailed in Materials and Methods. The DNA sequenced on the PacBio platform, as well as the size-selected (>2,500 bp DNA fragments) and whole DNA fractions sequenced on the MiSeq platform, were all prepared from the same DNA extraction, whose average fragment length was ∼5,000 bp. The MiniSeq run, performed in Antarctica, yielded 23,883,870 150-bp paired-end reads, which were assembled into 276,991 contigs with an N50 value of 1,350 bp. The MiSeq whole DNA fraction run (300-bp paired-end) produced fewer reads (6,095,791), which assembled into 237,640 shorter contigs (N50 of 677 bp). The MiSeq size-selected sequencing run produced 16,527,393 300-bp paired-end reads, which resulted in 858,609 contigs with an N50 of 918 bp. PacBio sequencing and assembly resulted in 44,521 contigs with a total length of 322 Mbp; however, many reads were short (∼788,000 reads comprising ∼2.8 Gb of data) and could not be assembled. Sequence data from DNA isolated using either mechanical lysis (bead-beating) or enzymatic digestion with MetaPolyzyme identified highly similar taxonomic diversity, suggesting that the different DNA isolation methods lysed the same population of cells, and that sequencing captured the full representation of the microbial diversity present in the paleomat samples. In addition, we did not identify any pronounced differences in gene identity and frequencies between whole DNA isolations and size-selected fractions, indicating a negligible contribution of eDNA and/or aDNA to the metagenomic sequence data. RNA sequencing yielded 20,249,504 paired-end reads, with an average length of 76 nt. Following trimming, filtering, and phiX and artifact removal, >80% of the reads were used for transcriptome analysis.
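The N50 values quoted for each assembly summarize contig length distributions: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that definition follows, with a hypothetical function name and toy contig lengths rather than the study's assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the assembly is N50.
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_assembly = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("N50 =", n50(toy_assembly))
```

Because N50 depends only on the lengths of assembled contigs, a dataset with more reads but many short contigs can still report a lower N50 than a smaller dataset, which is one reason read counts and N50 are reported side by side.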

# Bacterial Community Composition of the Paleomat, Surrounding Soils, and Modern Mat

The paleomat, surrounding soil, and modern mat were distinct in bacterial composition, yet 12 ESVs were common to the three samples (**Figure 3A**). These shared ESVs were not represented by abundant sequences in the modern mat (<0.01–1% of reads each), were slightly more abundant in the paleomat (0.01–3% of reads in each ESV), and were moderately abundant in the soil surrounding the paleomat (0.5–1% of reads for most, with an ESV identified as unclassified Frankiales that represented 19.5% of the reads). Of these 12 ESVs, 8 were Actinobacterial sequences, 5 of which were Micrococcales taxa, 2 were Propionibacteriales taxa, and 1 was a Frankiales ESV. The other 4 ESVs were a Deinococcus-Thermus sequence (Truepera sp.), a Chloroflexi taxon, and two Bacteroidetes sequences (Pontibacter and Segetibacter). The majority of ESVs identified in the modern mat (90% of ESVs) and paleomat (83%) were unique and not detected in either of the other samples. However, these ESVs represented a different proportion of sequence abundance, with the modern mat unique ESVs representing 87% of the reads in the sample, whereas the unique paleomat ESVs comprised 46% of paleomat 16S reads. In the surrounding soil dataset, the unique (not shared) ESVs represented 46% of the soil ESVs but only 14% of the reads in the sample. Although the paleomat sample shared similar numbers of ESVs with either the modern mat or the surrounding soil, the abundances of those ESVs in the paleomat dataset were not equivalent, as ∼33% of reads were in ESVs shared with surrounding soil, and just 3% of reads by abundance were in ESVs common to the two mat samples. These observations indicate that the modern mat harbored the community most distinct from the other samples, and that, although the paleomat bacterial community was distinct, it was more similar to that of the soil sample than to that of the modern mat. Additionally, the modern mat community was the most diverse, and the surrounding soil community was the least diverse (**Figure 3A**).
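The distinction drawn here between the number of shared ESVs and the fraction of reads they hold can be made explicit with a small calculation. The sketch below uses an invented abundance table and a hypothetical function name; it is meant to show the arithmetic, not to reproduce the reported percentages.

```python
def read_fraction_in_shared_esvs(abundance_table, sample, other):
    """Fraction of `sample` reads that fall in ESVs also detected in `other`.

    abundance_table maps ESV id -> {sample name: read count}.
    """
    total = sum(counts.get(sample, 0) for counts in abundance_table.values())
    if total == 0:
        raise ValueError(f"no reads recorded for sample {sample!r}")
    shared = sum(
        counts.get(sample, 0)
        for counts in abundance_table.values()
        if counts.get(sample, 0) > 0 and counts.get(other, 0) > 0
    )
    return shared / total


# Toy table for illustration only; these are not the study's data.
TOY_ABUNDANCE = {
    "ESV_1": {"modern_mat": 5, "paleomat": 12, "soil": 40},
    "ESV_2": {"modern_mat": 90, "paleomat": 0, "soil": 0},
    "ESV_3": {"modern_mat": 0, "paleomat": 30, "soil": 7},
    "ESV_4": {"modern_mat": 0, "paleomat": 8, "soil": 0},
}

if __name__ == "__main__":
    for other in ("soil", "modern_mat"):
        frac = read_fraction_in_shared_esvs(TOY_ABUNDANCE, "paleomat", other)
        print(f"paleomat reads in ESVs shared with {other}: {frac:.0%}")
```

The measure is asymmetric: the fraction of paleomat reads in ESVs shared with soil generally differs from the fraction of soil reads in those same ESVs.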

Despite some similarity in composition and the presence of a few shared ESVs, the distribution, relative abundances, and taxonomic identities of ESVs differed among the samples (**Figure 3B**). The paleomat was dominated by Actinobacteria, Bacteroidetes, Firmicutes, Proteobacteria, Gemmatimonadetes, Chloroflexi, and Deinococcus-Thermus. The most abundant phylum in the paleomat, Actinobacteria, was dominated by Actinomycetales. Bacteroidetes showed a large abundance of Cytophagales and Saprospirales. Bacilli were prevalent among the Firmicutes, followed to a lesser degree by Clostridia. Within the proteobacterial members of the paleomat bacterial community, Alphaproteobacteria were the most dominant class, containing abundant Sphingomonadales and Rhodobacterales. Proteobacteria, Bacteroidetes, Cyanobacteria, and Actinobacteria dominated the modern Lake Vanda moat mat bacterial community (**Figure 3B**). Proteobacteria were represented primarily by Betaproteobacterial taxa, followed by Alphaproteobacteria and Gammaproteobacteria. Bacteroidetes contained a roughly even distribution of Sphingobacteriales, Flavobacteriales, and Cytophagales. Cyanobacteria were primarily represented by Synechococcophycideae, and to a lesser extent by sequences that were identified as possible chloroplast 16S.

The surrounding soil, on the other hand, was dominated by Actinobacteria, with taxa belonging to this phylum containing ∼60% of the reads. Other numerically-important phyla of the surrounding soil bacterial community were the Bacteroidetes, Proteobacteria, Deinococcus-Thermus, and Firmicutes. These phyla were also the dominant soil community members in McKelvey Valley, where they mediate nitrogen transformation and play key roles in carbon cycling (Chan et al., 2013).

# Taxonomic Composition of the Paleomat Metagenome

To identify the constituents of the paleomat microbial community, we assigned taxonomy to contigs and scaffolds assembled from each Lake Vanda paleomat metagenomic sequence dataset. The taxonomic assignment results indicated that in all cases, most sequences could not be assigned to any known taxon, and were unclassified (**Figure 4**). Patterns of community structure showed only minor differences between DNA isolation methods and sequencing approaches. Although sequences generated on the PacBio platform had a lower proportion of unclassified sequences (55% compared to 88–90% for Illumina), we recovered the same taxa with similar patterns as with Illumina short-read sequencing (**Figure 4**). In all four datasets, of the contig sequences that could be assigned to a particular taxonomic group, most were affiliated with Actinobacteria. The five most abundant Actinobacterial subdivisions, based on sequence count, were Micrococcineae (with strains belonging to genera that can degrade hydrocarbons and recalcitrant substrates), Frankineae, Streptomycineae, Corynebacterineae, and Propionibacterineae. The next-most abundant phylum was Proteobacteria, mostly Alphaproteobacteria (Rhizobiales, Sphingomonadales, Rhodobacterales, Caulobacterales, Rhodospirillales), Gammaproteobacteria (Xanthomonadales and Pseudomonadales), Betaproteobacteria (Burkholderiaceae, Comamonadaceae, Rhodocyclaceae), and Deltaproteobacteria, consisting of Myxococcales. The remaining bacterial phyla, including Firmicutes, Bacteroidetes, Deinococcus-Thermus, and Planctomycetes among others, were each represented by <1% of all assembled sequences in each dataset.

#### Taxonomic Composition of the Transcriptionally-Active Component of the Paleomat Community

To identify non-ribosomal RNA reads within the metatranscriptome dataset, 16,246,265 paired-end reads passing QC and filtering were mapped to known small subunit rRNA and large subunit rRNA sequences in the Silva database (Quast et al., 2013). Reads mapping to rRNA represented >92% of RNASeq reads, indicating that <8% of reads were likely mRNA. The rRNA reads were removed from downstream transcriptome analyses. Of the remaining reads, 64% mapped (as proper pairs) to the MiniSeq assembly contigs, 57% mapped to the PacBio assembly contigs, 69% mapped to the MiSeq size-selected assembly, and 53% mapped to the MiSeq whole DNA fraction assembly contigs. Mapping of RNASeq reads to the Silva database provided valuable information on potentially active members of the community based on rRNA gene expression. Over 99% of RNASeq-detected rRNA genes with >10 mapped reads were bacterial (**Figure S2**), and over 75% of these were Actinobacterial taxa, the majority of which were Micrococcales. There were an additional 50 bacterial taxa with mapped reads (**Figure S2**); however, some assignments, particularly those with only a few mapped reads, may be spurious, as the reference database included the full-length genes, including conserved regions, which are present in all bacterial taxa and difficult to reliably identify with short reads.

Transcripts per kilobase million (TPM) values indicated that sequences aligning to Actinobacteria were most abundant within expressed contigs (**Figure 5**). In decreasing order of expression abundance, the most common organisms within this phylum belonged to Micrococcales, Corynebacteriales, Streptomycetales, Propionibacteriales, Micromonosporales, Pseudonocardiales, Streptosporangiales, Frankiales, Geodermatophilales, and seven other orders with total abundances of <1% each. The next most highly expressed phylum was the Proteobacteria, particularly the Alphaproteobacterial order Sphingomonadales (including the genera Sphingobium, Sphingomonas, Novosphingobium, and Sphingopyxis and the pigmented photoheterotroph Erythrobacter), Rhizobiales (including the nitrogen-fixing Bradyrhizobium, Rhizobium, and Azorhizobium), Rhodobacterales (Rhodobacter, which can also fix nitrogen), Rhodospirillales, Gammaproteobacteria (Xanthomonadaceae), Betaproteobacteria (Burkholderiales) and Deltaproteobacteria. Also well-represented in the active community were the Gram-negative, non-spore-forming Bacteroidetes (mostly Cytophagaceae, Cyclobacteriaceae and Flavobacteriaceae), as well as the Firmicutes, grouped primarily into Clostridiales and Bacillales.

# Functional Genes in Lake Vanda Paleomat

As observed with taxonomic composition, the types of genes that were identified in each of the metagenome datasets were highly similar (**Figure 6**). In all four metagenomic datasets, the KEGG system with the greatest number of ORFs was Metabolism, particularly Amino Acid Metabolism (including amino acid related enzymes and amino acid biosynthesis and degradation) and Carbohydrate Metabolism (most ORFs in glycolysis, TCA cycle, pentose phosphate cycle, pyruvate, butanoate and propanoate metabolisms, glyoxylate and dicarboxylate metabolism, and amino sugar and nucleotide sugar metabolism). Metabolism of Cofactors and Vitamins (including CoA, biotin and folate biosynthesis, nicotinate and nicotinamide metabolism, as well as porphyrin and chlorophyll metabolism), nucleotide (purine and pyrimidine) metabolism and Energy Metabolism (oxidative phosphorylation, carbon fixation, photosynthesis, methane metabolism, and nitrogen and sulfur metabolisms) were encoded by >1% of ORFs in each metagenome (**Figure 6B**). Within the Genetic Information Processing system, most ORFs were associated with translation and DNA replication and repair. Most ORFs mapping to Cellular Processes were associated with transport and catabolism, including exosome and prokaryotic defense systems, and with Cellular Community (quorum sensing and biofilm formation). Predicted ORFs within Environmental Information Processing were associated with membrane transport (ABC and other transporters and secretion systems) and signal transduction, primarily two-component systems that enable bacteria to sense and respond to environmental cues. Additionally, predicted ORFs mapped to the Organismal Systems and Human Diseases systems; however, 95% of the annotations also mapped to other systems, namely Metabolism, Genetic Information Processing and Cellular Processes. The remaining 5% of annotations mapping to these two systems consisted of genes known to play roles in other functions, for example serpins and nitric oxide reductase, as well as antibiotic resistance genes. A full list of KO numbers and their KEGG system assignments is provided in **Supplementary Information** as a tsv file.

#### Expressed Functions in Lake Vanda Paleomat Community

The KEGG system with the most highly expressed predicted ORFs at the RNA level was Genetic Information Processing, mainly translation and ribosome-associated genes, although DNA replication and repair genes were also transcribed (**Figure 7**). Metabolism was the second-highest expressed system, and consisted of lipid metabolism, carbohydrate metabolism, energy metabolism (nitrogen and methane), lipopolysaccharide biosynthesis, and xenobiotics degradation. Transcription of lipid and fatty acid biosynthesis genes, together with RNA-level expression of membrane proteins, suggests that membrane fluidity and integrity may be actively maintained in at least some members of the paleomat community. Additionally, quorum sensing genes were expressed in the microbial community, as were two-component systems, which are important response systems to changing environmental conditions.

Among the most highly expressed predicted protein-coding genes, i.e., coding DNA sequence (CDS) ORFs, were hypothetical proteins and antitoxin genes. Antitoxins form part of toxin-antitoxin (TA) systems, where the toxin is a stable protein that is counteracted by the antitoxin, which can be either a protein or non-coding DNA sequence, and is expressed at a higher level than the toxin. TA systems can perform important cell functions, such as stress response, apoptosis, cell cycle arrest, cell wall biosynthesis and membrane integrity. Other highly expressed transcripts included stress response proteins, Sec-independent protein translocase protein TatAy, cold shock proteins, as well as ribosomal proteins and transcription and elongation factors (**Figure S3**). These observations suggest that the members of the community detected by RNA transcribe the necessary machinery for protein expression, and thus activity, within the Lake Vanda paleomat.

#### Metagenome Assembled Genome Taxonomic Identity, Function, and Expression

To determine which specific members of the community encoded specific functions, we binned metagenomic sequences to recover metagenome-assembled genomes (MAGs). Closely related bacterial species may have divergent biological roles, therefore identification of which genes likely came from which genome is a useful tool to explore the genomic potential of, and difference between, species in the absence of pure cultures. A total of 16 high-quality (>90% completion, <5% contamination) MAGs were identified in the Illumina-sequenced metagenome assemblies (**Figure 8A**). Owing to the similarity of community composition among all datasets, only MAGs from the MiniSeq whole DNA fraction and MiSeq size-selected fraction were used for downstream analyses. These MAGs represented four bacterial phyla: Actinobacteria, Bacteroidetes (Cellulophaga algicola and Cytophagales), Proteobacteria (Alphaproteobacteria, Sphingomonadales, and Xanthomonadaceae) and Firmicutes (Bacilli) (**Figure 8B**). Average nucleotide identity (ANI) was used to compare the MAGs to each other to identify any MAGs that were likely from the same population. Three pairs of MAGs (one from each sequencing approach) were highly similar (>99% ID), suggesting that they represented the same genomic population (**Figure 8B**).
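The selection logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bin names, quality estimates, and ANI values below are invented placeholders, not values from this study, and the functions are hypothetical helpers rather than part of any binning tool.

```python
from itertools import combinations

# Hypothetical per-bin quality estimates (percent); illustrative values only.
bins = {
    "whole_bin_3": {"completeness": 96.0, "contamination": 1.2},
    "size_bin_22": {"completeness": 94.5, "contamination": 2.0},
    "whole_bin_9": {"completeness": 71.0, "contamination": 8.3},
}

# Hypothetical pairwise ANI values (percent) between bins.
ani = {
    frozenset(("whole_bin_3", "size_bin_22")): 99.4,
    frozenset(("whole_bin_3", "whole_bin_9")): 74.1,
    frozenset(("size_bin_22", "whole_bin_9")): 73.8,
}


def high_quality(bins, min_completeness=90.0, max_contamination=5.0):
    """Return sorted names of bins that pass the thresholds quoted in the text."""
    return sorted(
        name
        for name, quality in bins.items()
        if quality["completeness"] > min_completeness
        and quality["contamination"] < max_contamination
    )


def same_population_pairs(names, ani, min_ani=99.0):
    """Return pairs of bins whose ANI exceeds the same-population threshold."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if ani.get(frozenset((a, b)), 0.0) > min_ani
    ]


hq = high_quality(bins)
pairs = same_population_pairs(hq, ani)
print(hq)
print(pairs)
```

Storing ANI values under `frozenset` keys makes the lookup independent of the order in which two bins are compared.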

The MAGs identified in the Lake Vanda paleomat varied in % GC content, ranging from 38 to 67%, coding density (82–90%) and number of genes encoded (**Table S2**). In addition, MAGs varied in the number of genes expressed, with up to 160 predicted ORFs that had mapped RNA reads (**Figure 9**). A full list of predicted genes expressed at the RNA level in MAGs is provided in **Figure S4**. The 50 most highly expressed CDS ORFs showed that the different MAGs express different sets of genes, and that in all MAGs hypothetical proteins were among the most highly expressed ORFs (**Figure 9**). The expressed genes that could be functionally annotated share homology with genes involved in stress response, ribosome and transcription, cell cycle regulation, and membrane transport, and likely serve similar functions in the Lake Vanda paleomat.
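For readers unfamiliar with the genome statistics reported above, the following minimal Python sketch shows one common way to compute % GC content and coding density. The sequence and CDS coordinates are toy inputs chosen for illustration, and the function names are hypothetical; the study's own values came from its annotation pipeline, which is not reproduced here.

```python
def gc_percent(seq):
    """Percent of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / total if total else 0.0


def coding_density(genome_length, cds_intervals):
    """Percent of the genome covered by coding sequences.

    cds_intervals holds (start, end) pairs, 0-based and end-exclusive.
    Overlapping intervals are merged so that no base is counted twice.
    """
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(cds_intervals):
        if current_end is None or start > current_end:
            # Close the previous merged interval before starting a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / genome_length


# Toy inputs (assumptions, not data from the paleomat MAGs).
print(gc_percent("ATGCGCGTTA"))
print(coding_density(1000, [(0, 400), (350, 600), (700, 950)]))
```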

#### Transcribed Genes in Lake Vanda Paleomat MAGs

Understanding which mechanisms organisms express, and therefore probably utilize, to cope with environmental stress in uncultivated organisms is made possible by mapping transcriptomic reads to individual genomes or genomic bins, i.e., MAGs. The two Xanthomonadaceae MAGs, whole fraction bin 3 and size-selected bin 22, likely represent the same population of cells due to the similarity of their genomic characteristics. Of the 2,911 genes encoded by the whole fraction bin 3 MAG, only 34 had mapped RNA reads. DNA repair may play an important role in this organism, as the proteins recA and recB were detected in the transcriptome. Also detected was the 60 kDa chaperonin groL, which functions in the proper assembly and refolding of unfolded polypeptides that are produced under stress conditions. In addition to proper protein folding and DNA repair, this MAG expresses penicillin-binding protein 1A, which is involved in peptidoglycan, and therefore cell wall, biogenesis.

The Alphaproteobacteria MAGs consisted of two distinct populations, the Sphingomonadales (whole fraction bin 15 and size-selected bin 12) and Alphaproteobacteria (size-selected bin 31). The Sphingomonadales MAG expressed just a few genes, 18 of 2,585, which included elongation factor G and chaperone protein ClpB, which processes protein aggregates as part of a stress-induced multi-chaperone system. The size-selected bin 31 MAG only had 23 predicted genes that were detected by transcriptomics. These included regulatory protein BlaR1, which is needed for beta-lactamase induction, as well as the blue light- and temperature-regulated antirepressor BluF, a photoreceptor protein induced at low temperatures and blue light irradiation.

The Actinomycetales MAG, whole fraction bin 2, contained 4,143 proteins, 91 of which were detected in the transcriptome. Among the expressed putative genes were 60 kDa chaperonin 2 and chaperone proteins DnaJ 1 and DnaK, transcription regulators and elongation factors, as well as DNA gyrase subunits A and B. This organism may be sensitive to oxygen concentrations, as it also expresses the hypoxic response protein hrp1, which is induced in response to hypoxia or low levels of nitric oxide and carbon monoxide.
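To put the counts reported so far on a common scale, the short Python sketch below converts them into the percentage of each MAG's predicted genes that had mapped RNA reads. The counts are taken from the text above; the dictionary labels and the helper function are illustrative names only, and the 4,143 predicted proteins of the Actinomycetales MAG are treated here as its gene count, which is an assumption.

```python
# (genes with mapped RNA reads, total predicted genes), as reported in the text.
mags = {
    "Xanthomonadaceae (whole fraction bin 3)": (34, 2911),
    "Sphingomonadales": (18, 2585),
    "Actinomycetales (whole fraction bin 2)": (91, 4143),
}


def expressed_percent(expressed, total):
    """Percent of predicted genes with at least one mapped RNA read."""
    return 100.0 * expressed / total


for name, (expressed, total) in mags.items():
    print(f"{name}: {expressed_percent(expressed, total):.2f}% of genes expressed")
```

In each case only a small fraction of the predicted genes (roughly 1 to 2%) was detected in the transcriptome.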

The Bacilli MAGs were not from the same population of cells, sharing ∼72% ANI. The whole fraction bin 4 had the highest number of expressed predicted genes, 163, and had low GC content. Among the transcribed predicted genes were general stress proteins yocK, yflT, yugI\_2, ysnF, yfkM, and the universal stress protein TeaD, as well as OpuCC, which helps cells cope with environmental stress. In addition to thioredoxin, ribosome proteins and transcriptional regulation genes, this MAG expressed regulatory protein Spx that may play a role in down-regulating genes involved in growth and negatively affecting competence and sporulation during periods of extreme stress. Other genes with roles in sporulation inhibition (Adapter protein MecA 1) and negative regulation of competence (ClpC/MecB) were also expressed, as was the response regulator Spo0A. These cells were likely in the vegetative state, as Vegetative protein 296 (sufC) and cell division proteins were expressed. Components of the UvrABC DNA repair system were also detected in the transcriptome of this MAG. Similarly, the second Bacilli MAG also contained Adapter protein MecA 1, ClpC/MecB, general stress proteins and UvrABC system components.

The Bacteroidetes MAGs, whole fraction bin 14 and size-selected bin 10, were identified as Cellulophaga algicola, and shared >99% ANI. This organism was very low GC (38%) and contained ∼3,300 predicted genes, 37 of which were detected in the transcriptome. The expressed predicted genes included the global nitrogen regulator ntcA\_1, gyrB and DNA repair proteins, macrolide export ATP-binding/permease protein MacB, which has roles in energy generation and antibiotic resistance, azurin, and BarA, the kinase component of the two-component regulatory system UvrY/BarA involved in carbon metabolism regulation.

The other four Bacteroidetes bins all were identified as Cytophagales, but did not share high ANI with each other. Among the predicted genes detected in the transcriptome for these MAGs were ClpC/MecB, macrolide export ATP-binding/permease protein MacB, oxidoreductases (including fructose, etc.), multidrug resistance protein MdtB, outer membrane proteins, chaperonin and chaperone proteins, stress proteins, cold shock proteins and metal (cobalt, zinc, cadmium) resistance enzymes. Putative genes with roles in reactive oxygen tolerance were also expressed. DNA repair genes including RecA, RadA, DNA gyrase subunits A and B, G/U mismatch-specific DNA glycosylase, an enzyme that excises ethenocytosine and uracil and is required for DNA damage lesion repair, as well as the modification methylase DpnIIA, which protects DNA from cleavage, were also detected in the transcriptome.

# DISCUSSION

The bacterial community composition of the paleomat appears to represent a continuum between indigenous mat cells and late colonization by soil microbes, distinct from the community prevalent in the modern moat mat, whose constituent organisms are exposed to significantly different conditions and nutrient availability than those in the paleomat sample. A portion of the OTUs in the 16S rRNA analysis reside within modern mat communities whereas a significant subset also represent soil contributions, though it should be noted that paleomat fragments, too small to be visible, could have been extracted as part of the surrounding soils. Although we could not perform robust statistical comparisons across samples due to a lack of biological replication, our results provide a foundational description of the composition, genomic content and transcription of microbial communities in an Antarctic paleomat, providing a first look into the ecology of these systems.

A variety of DNA extraction techniques, including bead beating and the use of a Metapolyzyme pre-treatment with lysing enzymes to induce spheroplast formation, were used to ensure full lysis of recalcitrant endospores. Extraction methods omitting mechanical lysis (bead-beating) and including Metapolyzyme digestion produced DNA with the highest molecular weight when compared to other methods tested, as determined by visualizing the purified DNA using an Agilent BioAnalyzer. This high molecular weight DNA could then be size-selected using BluePippin or gel-based approaches. Although different from the 16S rRNA gene sequencing results in terms of relative proportions of some bacterial phyla, possibly due to differences in genome sizes and coverage, there were broad similarities in the four metagenomic data sets, despite different sequencing platforms and different extraction techniques (bead beating, chemical lysis), suggesting that the same populations of cells were lysed and sequenced. Furthermore, since there was little difference between size-selected and whole DNA fractions, it appears that the amount of eDNA and/or aDNA present in the sample was negligible or too degraded for sequencing library construction, indicating that the results likely represent a viable community of intact cells.

#### Lake Vanda Paleomat Community Is Comprised of Bacterial Phyla

A subset of the organisms recovered was similar to those in the Antibus et al. (2012a) cultivation study targeting similar paleomats, suggesting that the bacterial groups identified sourced from within the paleomat are viable. The cultivable bacteria found by Antibus et al. (2012a) in modern mats include members of the Firmicutes, Actinobacteria, Proteobacteria, Cyanobacteria, and Deinococcus-Thermus. In contrast, cultivable bacteria in ancient samples were restricted to Firmicutes, Actinobacteria, and Bacteroidetes (Antibus et al., 2012a). In our study, the organisms associated with gene transcripts included Actinobacteria, Proteobacteria, Firmicutes, and Bacteroidetes, as well as low levels of Chloroflexi, Cyanobacteria, Actinobacteria, Deinococcus-Thermus, Planctomycetes, Acidobacteria, Pseudothermotoga, Dictyoglomi, Chlorobia, Deferribacteres, and Spirochaetes, along with members of multiple candidate divisions. Similarly, a recent mesocosm study that used soil sourced from near Lake Fryxell and aimed to characterize active taxa showed that Bacteria, particularly Actinobacteria, Firmicutes, and Proteobacteria, were the dominant members of the community, albeit at lower proportions than in our samples (Buelow et al., 2016). These results indicate that bacteria with diverse metabolic functions, including heterotrophy, autotrophy and roles in nitrogen cycling, are likely able to survive in a non-dormant state within the paleomat community.

In addition to differences inherent between cultivation-based and molecular detection approaches, one reason for the discrepancy in observed potentially active taxa may be that samples used by Antibus et al. (2012a) were sourced from older facies (up to 26 k year) than the Lake Vanda paleomats we studied, ∼2 k year. Another reason may be that culturing only recovered a subset of the viable cells in the paleomats. While culturing is an important approach to studying cell survival, the vast majority of microbes have not been cultured, and may be resistant to isolation in a laboratory setting due to their as-yet unknown physiological requirements (Schloss and Handelsman, 2005). In a follow-up study of 16S rRNA gene PCR products from bulk DNA, Antibus et al. (2012b) found Firmicutes, Bacteroidetes, Chloroflexi, Cyanobacteria, Actinobacteria, and Actinobacteria from the same microbial mat samples, sourced either from viable cells or preserved genetic material.

In our data set, Actinobacteria, which have been previously detected in modern and ancient permafrost environments (e.g., Kochkina et al., 2001; Yergeau et al., 2010) were the most abundant among clades with mapping gene transcripts, potentially because of the presence of DNA repair at low temperature (Johnson et al., 2007). Corynebacteria, one of the most abundant orders in the Lake Vanda paleomat, are pleomorphic bacteria whose shape depends on surrounding conditions. While usually rod-shaped, Corynebacteria appear distinctively club-shaped during certain stages of the life cycle. They are non-spore-forming, non-motile organotrophs, and are either aerobic or facultatively anaerobic. Micrococcales have diverse shapes (small cocci and rods), and members of this order, for example Arthrobacter, which is expressed in the Lake Vanda paleomat, can switch between a coccus (stationary phase) and rod-shape (exponential growth). Arthrobacter cocci are resistant to desiccation and starvation. The cells revealed in our scanning electron microscope images of the Lake Vanda paleomat are consistent with the morphologies of both of these abundant non-spore-forming orders (see **Figure 2**).

Among the Bacteroidetes with mapping gene transcripts, Cyclobacteriaceae had genes related to carbohydrate metabolism, quorum sensing, antibiotic resistance, and carotenoid biosynthesis. Cytophagaceae are heterotrophic bacteria whose members are primarily aerobes with a respiratory metabolism. Most of them are rod-shaped, but some exhibit filamentation, produce pigments and utilize polysaccharides and proteins. Most Flavobacteriaceae are aerobic with a respiratory metabolism; most are rod-shaped, but some form long filaments, and they utilize macromolecules such as polysaccharides and proteins.

We also detected gene transcripts related to members of the Firmicutes phylum, which can exist in a metabolically active state as well as form spores that are resistant to desiccation and extreme conditions. We detected Clostridia, which are anaerobic, mostly saprophytic organisms, as well as Bacillus bacteria, which are mainly aerobic. Despite their spore-forming capability, the gene transcripts mapping to Firmicutes suggest the organisms were not dormant at the time of collection. Since samples were collected into a −190 °C liquid nitrogen-primed cryoshipper, transferred directly into a −80 °C freezer, then processed on dry ice before a rapid addition of Trizol, it seems unlikely outgrowth could have been triggered. This result is reinforced by the two Firmicutes MAGs, size-selected bin 42 (MiSeq) and whole fraction bin 4 (MiniSeq). They are both Bacilli, but with an ANI of ∼72% ID, they are not from the same population of cells. Recovered gene transcripts include those associated with the down-regulation of sporulation as well as genes focused on stress response and thiol stability.

#### Pathway Analysis and Survival Strategies

In addition to taxonomy, our metagenomic and metatranscriptomic data provide insights into the functional ecology of the paleomat, both as a community and as MAGs. Different taxa appear to use different strategies, indicated by the observation that different MAGs are associated with different gene transcripts. Indeed, the only annotation that was detected across different MAGs was "hypothetical protein," which includes multiple predicted protein sequences. Yet, many of the genes perform similar functions, allowing for some common themes to emerge.

#### Stress Response and DNA Repair Are Key Functions in Paleomat Community

Starvation and nutrient scarcity are two of the main challenges faced by the cells in the Antarctic Dry Valleys. Indeed, mesocosm transcriptomes of organisms in Dry Valleys soils appear to be adapted to low nutrient availability, as the most abundant transcripts were associated with carbon and nitrogen cycling (Buelow et al., 2016). While Antarctic conditions are generally very oligotrophic, it may be that the abundance of organic carbon in the relic mats prevents dormancy from being a dominant survival strategy, despite the extreme desiccation and extreme temperatures. Most members of the community are heterotrophs, as revealed by putative function at metagenome level and expressed function at the metatranscriptome level. However, a few autotrophs are present, and the full Calvin-Benson-Bassham cycle is encoded in metagenomes. Similar observations of dominance by heterotrophic organisms, but presence of autotrophs have been made in Dry Valleys soil communities (Chan et al., 2013; Buelow et al., 2016), suggesting common strategies are utilized by diverse organisms. Additionally, a few organisms, such as Rhodobacter, also have physiological plasticity—the ability to grow autotrophically as well as heterotrophically under different environmental conditions. Whether the specific strains identified by sequencing are capable of this metabolic plasticity remains to be determined.

Amino acid uptake is a response and adaptation to starvation. The gene encoding carbon starvation protein A, cstA, was found in the Lake Vanda paleomat transcriptome. The CstA protein is induced in response to carbon starvation and works by facilitating and increasing peptide uptake (Schultz and Matin, 1991; Rasmussen et al., 2013). Moreover, the presence of cstA gives organisms the ability to survive and continue growing in a low-nutrient environment without forming a spore (Marshall, 1977). Within our data, this gene was primarily associated with Actinobacteria and Proteobacteria, two of the major phyla in our sample. Additionally, ABC transporters involved in amino acid uptake were present in the metagenomes. Interestingly, the expressed ABC transporters were similar to those involved in exoprotein transport and sugar (galactose) uptake. However, amino acid transport permeases, oligopeptide transport ATP-binding protein OppF, and dipeptide transport system permease protein DppB, which can be used for heme uptake (Létoffé et al., 2006), were expressed in the paleomat transcriptome. Our transcriptome data suggest that the organisms in the community are adept at taking up nutrients from their environment.

In addition to resisting environmental stresses, organisms in harsh environments must compete for scarce resources. Metagenomic sequences primarily associated with Actinobacteria and Proteobacteria and several MAGs encoded and expressed genes with roles in antibiotic resistance. In Lake Vanda, competition between bacteria within their harsh habitat is likely the leading cause for the evolution of antibiotic production and resistance, offering resistant organisms a competitive advantage (Tam et al., 2015; Wei et al., 2015). Many environments, especially pristine environments that are poor in nutrients and have extreme conditions, contain antibiotic resistance genes that are unlikely to have resulted from anthropogenic contributions to antibiotic resistance.

Universal stress protein, general stress proteins, OpuCC, and stress response proteins were all present in the paleomat transcriptome. Universal stress proteins are induced under high stress conditions, including nutrient starvation and exposure to oxidants and DNA-damaging agents, such as UV, to provide general stress protection via mechanisms independent of other stress responses (Gustavsson, 2002; Tkaczuk et al., 2013). The gene encoding the stress response protein YsnF was present and expressed in several MAGs. YsnF is dependent on the transcription factor Sigma B (Prágai and Harwood, 2002), which is abundantly present in the metagenome. Presence of this transcription factor usually points to an enhanced response to phosphate starvation, allowing the organism to survive under conditions of phosphate deprivation (Prágai and Harwood, 2002). Sigma B regulates gene response to phosphate starvation by limiting the metabolic consequences of phosphate deprivation (Allenby et al., 2005). Other stress response genes, rpoE and rpoH, both of which encode sigma factors (Jensen-Cain and Quinn, 2001), are present in the metagenome. Although no other genomic or transcriptomic data are available for other Antarctic paleomats, a GeoChip study of soil organisms from McKelvey Valley also indicated the importance of sigma factors necessary for various environmental stresses, including osmotic shock, desiccation, cold- and heat-shock, and starvation (Chan et al., 2013). RpoH is responsible for sigma 32 and plays a role in the activation of chaperones and DNA repair enzymes (Gruber et al., 2003). RpoE encodes sigma factor 24, and is essential in some Firmicutes for sporulation (Kirk et al., 2014). Therefore, the paleomat community members detected in the transcriptome express the genes necessary to resist environmental stresses and enable their survival.

Among the most highly expressed genes in our dataset were chaperone cold shock proteins. These proteins respond to rapid downward temperature shifts and help cells adapt to new conditions (reviewed in Keto-Timonen et al., 2016). Downward temperature changes cause changes in promoter activity during translation, suggesting that ribosomes play an important role in sensing temperature drops and eliciting a cold-shock response (Vanbogelen et al., 1990; Jones and Inouye, 1994, 1996; Graumann and Marahiel, 1998; Phadtare and Inouye, 2008). In the paleomat transcriptome, many of the expressed genes were associated with the ribosome and ribosome maturation. Of the nine members of the Csp family identified so far, two are constitutively expressed at physiological temperatures, one is induced under nutrient deficient conditions, and four are sensitive to cold shocks (Phadtare and Inouye, 2008). Interestingly, other chaperone proteins implicated in cold and heat shock responses also play essential roles in the folding of most proteins and are therefore required for survival even at physiological temperatures (Gragerov et al., 1992; Langer et al., 1992).

Chaperonins and chaperone proteins, including DnaK and DnaJ, were well-represented in the Lake Vanda paleomat transcriptome. Chaperones play an important role in proper protein folding and homeostasis maintenance under thermal stress conditions and hyperosmotic shock by preventing aggregation of misfolded proteins. The chaperones DnaK, DnaJ, GrpE, and GroEL act in a sequential manner to recognize a growing polypeptide, stabilize an intermediate folding state, hydrolyze ATP, and catalyze the formation of the protein's native state. Although these proteins function to fold many proteins, including ones unrelated to cold shock tolerance, studies have shown that at least one of the GroE genes was essential for growth at low temperatures (Fayet et al., 1989). Additionally, chaperone proteins can act to re-establish membrane fluidity. Their importance in this role is highlighted by the expression of lipid and fatty acid biosynthesis pathway genes in our dataset.

Besides stress and starvation response, DNA repair was a key function expressed in the transcriptome. The UvrABC complex, which forms part of the prokaryotic nucleotide excision repair system, recognizes and fixes DNA damage lesions (Truglio et al., 2006). This type of repair system removes DNA damage caused by UV radiation. We also detected components of the base excision repair system, which repairs endogenous DNA damage (deamination, alkylation, oxidation, and single-stranded breaks) throughout the life cycle (Krokan et al., 2013). Additionally, we found RNA-expressed genes that are essential for DNA maintenance and recombinational repair. The RecA protein functions in recombinational DNA repair and is regulated through the SOS response, which was present in the metagenome (Cox, 2007). RadA also functions in recombinational DNA repair, where it repairs DNA breaks. Taken together, the functions of genes expressed in the Lake Vanda microbial mat community act to cope with starvation, environmental stress including cold and osmotic shock, and DNA damage either endogenous or due to UV exposure.

### Future Work

Given the nature of the RNA we recovered in this study, we cannot exclude the possibility that the RNA is ancient, preserved in the paleomat. Dormancy, for instance, might protect RNA through bundling with protein or polyhydroxybutyrate (Ackermann et al., 1995) within a recalcitrant endospore. We saw little evidence for dormancy, however, or for the presence of aDNA or eDNA, which have a far longer half-life than RNA, in the paleomat. Protein isolation and metaproteomics would determine what levels of proteins and peptides are present in the paleomat, confirming the processes of transcription and translation.

Expanding this analysis to other, older paleomats in the Dry Valleys is also an intriguing direction for future work. It is possible that over longer time scales, low-level DNA repair and other gene transcripts would be less prevalent as more and more organisms form spores. Sequencing platforms like Oxford Nanopore Technologies' MinION, which has recently been demonstrated in the Dry Valleys as a proof of principle for future in situ analyses (Johnson et al., 2017), and PacBio may also open new ways to interrogate DNA damage and repair via base modification detection. Base modifications, like cytosine methylation, are directly detectable in PacBio data by measuring kinetic variation during base incorporation (Marx, 2016) and have also been detected in nanopore sequencing data (Simpson et al., 2017). As DNA does not have to be amplified before sequencing, this method opens the door to analyzing rare sequence variants in situ. While not yet demonstrated, deamination events may be detectable too. Over time, in the absence of DNA repair, the genomes of truly dormant organisms accumulate damage due to spontaneous chemical reactions. The most common form of hydrolytic DNA damage involves deamination of one of the three nucleobases containing an amine group, which in the case of the base cytosine generates uracil or its analogs (Shapiro and Hofreiter, 2014). The subsequent pairing of uracil with adenine leads to the observation of characteristic C→T/G→A transitions. If large numbers of deamination events were prevalent in the recovered DNA of spore-forming organisms, it would suggest true dormancy, in which DNA repair processes had been limited.
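As a rough illustration of how such a deamination signal might be tallied, the Python sketch below counts substitution types between a reference and an aligned read and reports the share that are C→T or G→A. It is a simplified sketch under stated assumptions: the two sequences are invented, the alignment is assumed to be gap-free and of equal length, and the function names are hypothetical. Dedicated damage-profiling approaches also consider read position and strand, which this example ignores.

```python
from collections import Counter


def count_substitutions(reference, read):
    """Count mismatch types between two equal-length, gap-free aligned sequences."""
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        # Skip matches and ambiguous bases such as N.
        if ref_base != read_base and ref_base in "ACGT" and read_base in "ACGT":
            counts[f"{ref_base}>{read_base}"] += 1
    return counts


def deamination_fraction(counts):
    """Fraction of all mismatches that are C>T or G>A."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["C>T"] + counts["G>A"]) / total


# Invented example sequences (not data from the paleomat).
reference = "ACCGTCAGGC"
read = "ATCGTTCGAC"

counts = count_substitutions(reference, read)
print(dict(counts))
print(deamination_fraction(counts))
```

A high fraction across many reads from spore-forming taxa would be the kind of pattern the paragraph above describes, whereas a low fraction would be more consistent with ongoing repair.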

# CONCLUSION

Paleomats in the Dry Valleys of Antarctica may play a key role in Antarctic biogeography, but they have never been studied with metagenomics or metatranscriptomics. The results of this study provide an exploratory look at the genomic potential and possible transcriptional activity of an Antarctic paleomat, and suggest that Dry Valleys paleomats may contain multiple types of viable cells, potentially sourced from indigenous mat communities, followed by later colonization by soil-dwelling microbes. We recovered long continuous strands of DNA from the Lake Vanda paleomat, significantly longer than those recovered from dead organisms on ancient timescales, along with RNA transcripts and several near-complete MAGs. Despite the extremely cold and arid ambient conditions, dormancy does not appear to be a prevalent cell survival strategy, even for spore-forming members of the Firmicutes. This may be because of the relatively abundant relic carbon associated with the ancient microbial mats, but further work is called for.

These paleomats may serve as "seed banks," outposts for life over long timescales, at the same time as being sheltered nutrient-rich niches for soil-dwelling microbes in the oligotrophic terrain of the Dry Valleys. Viable organisms from paleomats may have been reintroduced to benthic and moat mat communities via aeolian transport or during lake level fluctuations, which have occurred many times since the last glacial maximum (Hall et al., 2010). By elucidating the resilience of life in cold, hyper-arid conditions, our results may have implications for biotechnology applications involving stress response and DNA repair as well as providing new insight into Antarctic microbiology and biogeography, and the limits of life in extremely harsh environments.

#### AUTHOR CONTRIBUTIONS

EZ, DG, ST, and SJ conceived of the study. BH advised on Antarctic geology and study design. EZ, DG, ST, YB, and SJ performed field work in Antarctica. EZ, DG, ST, YB, and NW performed research. EZ analyzed the data. EZ, DG, ST, YB, NW, BH, JB, MW, MS-F, and SJ wrote the paper.

#### FUNDING

This work was supported by NSF Office of Polar Programs Award PLR-1620976 to SJ at Georgetown University. The Genomics and Epigenomics Shared Resource at Georgetown University Medical Center is partially supported by NIH/NCI grant P30-CA051008.

#### ACKNOWLEDGMENTS

We give special thanks to the Illumina Corporation, specifically Mostafa Ronaghi, for dedicated support of Antarctic research and for supplying necessary instrumentation and reagents while at the Crary Lab at McMurdo Station. We are grateful to Millipore Sigma for early access to Metapolyzyme for DNA extraction. Much appreciated technical contributions were made by the University of Vermont Microscopy Imaging Facility and the Advanced Genome Technology Lab. The University of Wisconsin–Madison Biotechnology Center provided outstanding PacBio sequencing support and advised on data analysis. Members of the Extreme Microbiome Project provided extremely helpful technical input. Last but not least, we would like to thank the U.S. Antarctic Program and the support staff at McMurdo Station, particularly the individuals who work at the Crary Lab, HeloOPS, and MacOPS.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2019.00001/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zaikova, Goerlitz, Tighe, Wagner, Bai, Hall, Bevilacqua, Weng, Samuels-Fair and Johnson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Comparison of Microbial Communities in the Sediments and Water Columns of Frozen Cryoconite Holes in the McMurdo Dry Valleys, Antarctica

Pacifica Sommers<sup>1</sup>\*, John L. Darcy<sup>2</sup>, Dorota L. Porazinska<sup>1,3</sup>, Eli M. S. Gendron<sup>1</sup>, Andrew G. Fountain<sup>4</sup>, Felix Zamora<sup>4</sup>, Kim Vincent<sup>1</sup>, Kaelin M. Cawley<sup>5,6</sup>, Adam J. Solon<sup>1</sup>, Lara Vimercati<sup>1</sup>, Jenna Ryder<sup>1</sup> and Steven K. Schmidt<sup>1</sup>

<sup>1</sup> Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, United States, <sup>2</sup> Department of Botany, University of Hawai'i Mānoa, Honolulu, HI, United States, <sup>3</sup> Department of Entomology and Nematology, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, FL, United States, <sup>4</sup> Geology Department, Portland State University, Portland, OR, United States, <sup>5</sup> National Ecological Observatory Network Operated by Battelle, Boulder, CO, United States, <sup>6</sup> Institute of Arctic and Alpine Research, University of Colorado Boulder, Boulder, CO, United States

#### Edited by:

Alison Elizabeth Murray, Desert Research Institute (DRI), United States

#### Reviewed by:

Rachael Marie Morgan-Kiss, Miami University, United States Rebecca Gast, Woods Hole Oceanographic Institution, United States

\*Correspondence: Pacifica Sommers pacifica.sommers@colorado.edu

#### Specialty section:

This article was submitted to Extreme Microbiology, a section of the journal Frontiers in Microbiology

Received: 13 July 2018 Accepted: 15 January 2019 Published: 04 February 2019

#### Citation:

Sommers P, Darcy JL, Porazinska DL, Gendron EMS, Fountain AG, Zamora F, Vincent K, Cawley KM, Solon AJ, Vimercati L, Ryder J and Schmidt SK (2019) Comparison of Microbial Communities in the Sediments and Water Columns of Frozen Cryoconite Holes in the McMurdo Dry Valleys, Antarctica. Front. Microbiol. 10:65. doi: 10.3389/fmicb.2019.00065

Although cryoconite holes, sediment-filled melt holes on glacier surfaces, appear small and homogenous, their microbial inhabitants may be spatially partitioned. This partitioning could be particularly important for maintaining biodiversity in holes that remain isolated for many years, such as in Antarctica. We hypothesized that cryoconite holes with greater species richness and biomass should exhibit greater partitioning between the sediments and water, promoting greater biodiversity through spatial niche partitioning. We tested this hypothesis by sampling frozen cryoconite holes along a gradient of biomass and biodiversity in the Taylor Valley, Antarctica, where ice-lidded cryoconite holes are a ubiquitous feature of glaciers. We extracted DNA and chlorophyll a from the sediments and water of these samples to describe biodiversity and quantify proxies for biomass. Contrary to our expectation, we found that cryoconite holes with greater richness and biomass showed less partitioning of phylotypes by the sediments versus the water, perhaps indicating that the probability of sediment microbes being mixed into the water is higher from richer sediments. Another explanation may be that organisms from the water were compressed by freezing down to the sediment layer, leaving primarily relic DNA of dead cells to be detected higher in the frozen water. Further evidence of this explanation is that the dominant sequences unique to water closely matched organisms that do not live in cryoconite holes or the Dry Valleys (e.g., vertebrates); so this cryptic biodiversity could represent unknown microbial animals or DNA from atmospheric deposition of dead biomass in the otherwise low-biomass water. Although we cannot rule out spatial niche partitioning occurring at finer scales or in melted cryoconite holes, we found no evidence of partitioning between the sediments and water in frozen holes. Future work should include more sampling of cryoconite holes at a finer spatial scale, and characterizing the communities of the sediments and water when cryoconite holes are melted and active.

Keywords: Antarctica, cryoconite, niche partitioning, extremophile, bacteria, ciliate

# INTRODUCTION

Cryoconite holes are a unique aquatic environment found on glacier surfaces that support active microbial communities capable of playing a significant role in regional nutrient cycles. They start as sediment patches that accumulate on icy surfaces (McIntyre, 1984). Solar radiation heats the sediment more efficiently than the surrounding ice because of its lower albedo. The warmer sediment melts the underlying ice and migrates downward, forming holes of meltwater that deepen quickly to an equilibrium depth, where the rate of deepening is equal to the ablation rate of the ice surface (Gribbon, 1979; Wharton et al., 1985; Fountain et al., 2008). On the glaciers in the McMurdo Dry Valleys of Antarctica, the surface energy balance is such that the ubiquitous water-filled holes quickly freeze over, leaving a subsurface pool of water that can persist for over a decade, melting every austral summer (Fountain et al., 2004; Tranter et al., 2004). However, in warm years, they may also flush their accumulated carbon and nitrogen into ice-covered lakes and contribute to increased productivity (Bagshaw et al., 2013).

The nitrogen from cryoconite holes is fixed by active microbial communities (Wharton et al., 1985; Christner et al., 2003; Tranter et al., 2004; Foreman et al., 2007; Telling et al., 2014) which exhibit distinct vertical partitioning of at least the ciliate community during summer melt-out (Mieczan et al., 2013). Given the differences in light availability (Bagshaw et al., 2016b), cell densities (Foreman et al., 2007), and nutrients and other physicochemical characteristics (Foreman et al., 2007; Mieczan et al., 2013) between the bottom sediment and the overlying water, other taxa are also likely to partition between the sediments and water within cryoconite holes. Partitioning at the water-sediment interface can be driven by physical processes such as freeze concentration of impurities (gases, ions, and particles) as the formation of ice rejects those impurities (as in Wait et al., 2006). It can also be generated by the organisms themselves through movement or population growth (Mieczan, 2010; Mieczan et al., 2013). For species requiring similar resources, but whose favorable locations differ, this can lead to spatial niche partitioning, promoting coexistence and enhancing biodiversity (Levin, 1974; Chesson, 2000; Cardinale, 2011). Not only could distinct spatial niche partitioning at the very coarse scale of sediment and water promote species richness through coexistence, but greater biomass and richness could enhance partitioning by forcing species to specialize their growth in either sediments or water through competitive interactions. Either of those dynamics would lead to more distinctive communities in the sediments and water in richer cryoconite holes than in lower biomass and lower richness holes.

To test whether we observed less overlap in communities between the water and sediments of richer cryoconite holes, we sampled frozen cryoconite holes across three glaciers representing a gradient of phylodiversity (Sommers et al., 2018) and biomass (Porazinska et al., 2004) in the Taylor Valley, Antarctica. Specifically, we employed high-throughput amplicon sequencing to characterize bacterial and microbial eukaryotic communities in each layer and to determine the percentage of phylotypes in the water also detected in the sediments, and vice versa. To verify the gradient, we also determined the richness of phylotypes, and we measured total carbon, chlorophyll a, and DNA as proxies for biomass. We expected that the structure of microbial communities would differ between the sediments and water to at least some degree, as they do in other Antarctic aquatic ecosystems (e.g., Archer et al., 2014, 2015). For example, we expected that non-cyanobacterial primary producers such as algae, and motile taxa such as ciliates would be found primarily in the water, while cyanobacteria would form mats in the sediments, associated with less motile grazers such as tardigrades. We also expected that like most aquatic environments, including cryoconite holes on one of our sampled glaciers, the sediments would have greater biomass than the water (Foreman et al., 2007). Despite this difference, we expected that the gradient of biomass across the glaciers would be reflected in both the water and sediments.

# MATERIALS AND METHODS

#### Field Site, Sample Collection, and Processing

The McMurdo Dry Valleys, Antarctica, stretch from the edge of the Antarctic ice sheet to the coast at McMurdo Sound. Our field study was in the Taylor Valley, a ∼40 km long, ∼12 km wide valley between two mountain ranges. The valley floor is dotted with lakes and exposed bedrock, with large expanses of poorly developed soils (Priscu, 1999). A major geologic feature, the Nussbaum Riegel, divides Taylor Valley physically and climatically into two main basins: the warmer (in summer), drier Lake Bonney basin to the west, which is adjacent to the ice sheet, and the cooler, wetter Lake Fryxell basin to the east, adjacent to the Sound (Fountain et al., 1999, 2010; Barrett et al., 2006). A third, much smaller basin is formed by Canada Glacier on the north side of the Riegel, which dams the natural drainage to Lake Fryxell, forming Lake Hoare.

The major wind directions are channeled by the mountains that border the valley and can be divided into up-valley or down-valley. In summer the more gentle breezes are up-valley, onshore winds from the Ross Sea. The stronger winds, particularly in winter, are down-valley föhn winds descending from the ice sheet (Nylen et al., 2004; Šabacká et al., 2012). The föhn winds are particularly important because they can transport sediments and biota from the valley floor soils, stream channels, and lakes on to the glaciers.

We sampled cryoconite holes between 7 and 17 November 2016, while they were frozen, on three different glaciers: Taylor ("Tay"), an outlet glacier of the ice sheet at the head of the valley; Canada ("Can"), about midway down-valley; and Commonwealth ("Com"), at the valley mouth by the McMurdo Sound. We used a SIPRE corer to drill cores 20 (±10) cm long and 10 cm in diameter. We stored the cores in sterile Whirl-Pak bags (Nasco, Fort Atkinson, WI, United States) at −20°C for up to 1 month. In the Crary Laboratory at McMurdo Station, cores were separated into the sediment layer and the overlying ice. Each component was separately washed with deionized water to melt off the outer layer and potential cross-contamination from the drill, then placed in separate, acid-washed, high-density polyethylene (HDPE) beakers covered with aluminum foil and melted at 4°C for 12–24 h. Final melting took place at room temperature when necessary immediately prior to homogenizing, subsampling, and filtering.

#### DNA Extraction and Sequencing

For DNA extractions from the sediment layer, excess water was poured off and 0.3–0.4 g wet sediment was transferred to a PowerSoil DNA Isolation Kit (MoBio Inc., Carlsbad, CA, United States) bead beating tube. For DNA extractions from the water, 50 ml water was poured through a Whatman Nuclepore (GE Healthcare, Pittsburgh, PA, United States) 0.2 µm filter, and that filter was transferred to a PowerWater DNA Isolation Kit (MoBio Inc., Carlsbad, CA, United States) bead beating tube. Tubes were frozen at −20°C until the DNA was extracted, up to 1 month later, following the manufacturer's protocol with one exception. For the final step in eluting the DNA from the spin filter in the water extractions only, we used only half the final solution in order to increase the concentration of DNA because we expected it to be near detection thresholds. For 15 arbitrarily selected samples from the sediments and the water on each glacier (45 sediments and 45 water total), we measured the concentration of extracted DNA using a Qubit fluorometer (Qubit, London, United Kingdom), according to the manufacturer's instructions. DNA concentrations were back-calculated to ng per approximate cubic centimeter (g dry sediment or ml water).
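The back-calculation described above can be illustrated with a minimal sketch. It assumes the fluorometer reading is expressed in ng per µl of eluate; the function name, the elution volumes, and all numeric values are hypothetical and are not taken from the study.

```python
def dna_per_unit(reading_ng_per_ul, elution_ul, amount_extracted):
    """Back-calculate DNA to ng per g dry sediment or per ml water.

    reading_ng_per_ul: fluorometer reading of the eluate (assumed unit).
    elution_ul: volume of eluate in microlitres (hypothetical parameter).
    amount_extracted: g dry sediment or ml water that went into the kit.
    """
    if amount_extracted <= 0:
        raise ValueError("amount_extracted must be positive")
    total_ng = reading_ng_per_ul * elution_ul  # total DNA recovered
    return total_ng / amount_extracted


# Hypothetical values, for illustration only.
sediment_ng_per_g = dna_per_unit(2.0, 100.0, 0.25)
water_ng_per_ml = dna_per_unit(0.01, 50.0, 50.0)
```

Expressing both habitats per approximate cubic centimeter is what allows the sediment and water values to be compared on one axis.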

Extracted genomic DNA was amplified in triplicate using 16S (515f-806r primers, Caporaso et al., 2012) and 18S (1391f-EukBr primers, Amaral-Zettler et al., 2009; Caporaso et al., 2012) SSU ribosomal gene markers. Amplified DNA was pooled and normalized to equimolar concentrations using SequalPrep Normalization Plate Kits (Invitrogen Corp., Carlsbad, CA, United States), and sequenced using the Illumina MiSeq V2 (2 × 250 bp chemistry) at the BioFrontiers Sequencing Core Facility at the University of Colorado at Boulder. Sequences have been deposited in the NCBI SRA database under project PRJNA480849.

#### Biomass and Production

As a measure of past primary production biomass, we measured chlorophyll a concentrations in the sediments and the water. For chlorophyll extractions from the water, 50 ml of melted ice was filtered through a Whatman GF/F glass fiber filter (GE Healthcare, Pittsburgh, PA, United States). For chlorophyll in the sediments, 5 g sediments were mixed with 50 ml melted ice from the same cryoconite hole, then allowed to settle for at least 5 min before being filtered. We wrapped the filters in aluminum foil and stored them at −20°C for up to 2 months before acetone extraction, which followed the methods of Chiuchiolo (2017). The filters were extracted in 10 ml 90% acetone for 24 h with agitation at 12 and 24 h, then 4 ml was pipetted into a cuvette and fluorescence was measured with a 10-AU Fluorometer (Turner Designs, Sunnyvale, CA, United States). Measurements were back-calculated to a chlorophyll concentration based on a regression of known chlorophyll standard solutions, then further transformed for sediment samples to a per kg dry mass equivalent, and to account for the chlorophyll already in that water sample.
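The chlorophyll back-calculation can be sketched as follows. This is an illustration under stated assumptions, not the protocol of Chiuchiolo (2017): it assumes a linear standard curve, that the whole 50 ml of slurry water was filtered, and that the dry mass of the sediment is known. All function names and numeric values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical standards: fluorescence reading vs. chl a (ug per l of extract).
std_fluorescence = [0.0, 10.0, 20.0, 40.0]
std_chl_ug_per_l = [0.0, 5.0, 10.0, 20.0]
SLOPE, INTERCEPT = fit_line(std_fluorescence, std_chl_ug_per_l)


def chl_on_filter_ug(fluorescence, extract_ml=10.0):
    """Mass of chlorophyll a on one filter, from the 10 ml acetone extract."""
    conc_ug_per_l = SLOPE * fluorescence + INTERCEPT
    return conc_ug_per_l * extract_ml / 1000.0


def water_chl_ug_per_l(fluorescence, filtered_ml=50.0):
    """Chlorophyll a per litre of melted water."""
    return chl_on_filter_ug(fluorescence) / (filtered_ml / 1000.0)


def sediment_chl_ug_per_kg(slurry_fluor, water_fluor, dry_mass_g, slurry_water_ml=50.0):
    """Chlorophyll a per kg dry sediment, minus what the added water carried."""
    from_water = water_chl_ug_per_l(water_fluor) * slurry_water_ml / 1000.0
    net_ug = chl_on_filter_ug(slurry_fluor) - from_water
    return net_ug / (dry_mass_g / 1000.0)


water_value = water_chl_ug_per_l(2.0)
sediment_value = sediment_chl_ug_per_kg(42.0, 2.0, dry_mass_g=4.0)
```

Subtracting the water's own chlorophyll matters most where the sediment signal is weak; where the sediments dominate, the correction is small.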

#### Data Processing and Analysis

Raw reads were de-multiplexed and quality filtered using the QIIME v1.9.1 bioinformatics package (Caporaso et al., 2010a,b), using paired-end sequences that were joined with VSEARCH (Rognes et al., 2016). Bacterial and eukaryotic sequences were separately clustered into OTUs at 97% similarity using UCLUST (Edgar, 2010). Taxonomy was assigned using QIIME's parallel\_assign\_taxonomy\_blast.py script with the SILVA 128 Ref NR99 database's taxonomic information (Quast et al., 2013). All mitochondrial and chloroplast OTUs based on this classification were removed from the bacterial data set and all bacterial OTUs were removed from the eukaryotic data set. OTUs that made up at least 1% of the extraction blank sequences were discarded as likely lab contamination, with two bacterial exceptions (both Burkholderiales) that were dominant members of the sediment community, where contaminants are unlikely to dominate, given the DNA concentrations. Common sequences from real samples could show up in the low-DNA blanks from pipette aspiration. Singleton OTUs were removed. The reads per sample were scaled by the g dry sediment or ml of water extracted to ensure more even comparison, and multiplied by 1,000 to maintain rare OTUs above zero before being rounded to nearest integers. Samples were then rarefied to 6,900 bacterial sequences and 5,680 eukaryotic sequences each. Richness of OTUs was calculated as the number of OTUs in the samples after transformation and rarefaction. The taxonomic assignments of the dominant OTUs from each component (water and sediments) on each glacier set were verified with the NCBI non-redundant ribosomal database using BLAST.
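The scaling, rarefaction, and richness steps described above can be sketched in a few lines. This is a simplified illustration, not the QIIME implementation used in the study; the function names, OTU labels, and counts are hypothetical, and rarefaction is assumed to be random subsampling without replacement.

```python
import random


def scale_counts(counts, amount_extracted, factor=1000):
    """Scale reads by g dry sediment or ml water, multiply, round to integers."""
    return {otu: round(n / amount_extracted * factor) for otu, n in counts.items()}


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` reads without replacement."""
    # Builds one list entry per read; fine for a sketch, memory-hungry at scale.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    rng = random.Random(seed)
    rarefied = {}
    for otu in rng.sample(pool, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


def richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


# Hypothetical sample: raw reads per OTU from 0.5 g dry sediment.
raw = {"otu_a": 30, "otu_b": 6, "otu_c": 1}
scaled = scale_counts(raw, amount_extracted=0.5)
subsample = rarefy(scaled, depth=6900, seed=1)
observed_richness = richness(subsample)
```

Multiplying by 1,000 before rounding keeps rare OTUs from rounding down to zero when the reads are divided by the amount extracted.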

Because we collected samples over a gradient of phylodiversity and microbial abundance within cryoconite holes along the Taylor Valley (Porazinska et al., 2004; Sommers et al., 2018), we expected differences among glaciers and between sediments vs. water. We therefore analyzed differences in the OTU richness per cm<sup>3</sup> with a two-way analysis of variance (ANOVA) including an interaction term, log transforming the data where appropriate. We used Tukey HSD post hoc comparisons of groups for more detailed discussion, as implemented in R 3.3.2 (R Core Team, 2016). We took a similar approach to analyzing differences in chlorophyll and DNA.
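As an illustration of the two-way ANOVA with an interaction term, the sums of squares and F statistics for a balanced design can be computed directly. The study itself used R, so this is only a sketch: it assumes equal replication in every cell, the data are hypothetical log-transformed richness values, and P-values would additionally require the F distribution (for example `scipy.stats.f.sf`).

```python
from itertools import product


def two_way_anova(data):
    """Balanced two-way ANOVA with interaction.

    data maps (level_a, level_b) -> list of replicate values.
    Returns sums of squares and F statistics for A, B and A x B.
    """
    levels_a = sorted({a for a, _ in data})
    levels_b = sorted({b for _, b in data})
    cells = list(product(levels_a, levels_b))
    n = len(data[cells[0]])
    if any(len(data[c]) != n for c in cells) or n < 2:
        raise ValueError("requires a balanced design with >= 2 replicates per cell")

    def mean(xs):
        return sum(xs) / len(xs)

    na, nb = len(levels_a), len(levels_b)
    grand = mean([x for c in cells for x in data[c]])
    mean_a = {a: mean([x for b in levels_b for x in data[(a, b)]]) for a in levels_a}
    mean_b = {b: mean([x for a in levels_a for x in data[(a, b)]]) for b in levels_b}
    mean_cell = {c: mean(data[c]) for c in cells}

    ss_a = nb * n * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = na * n * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ss_ab = n * sum(
        (mean_cell[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2 for a, b in cells
    )
    ss_err = sum((x - mean_cell[c]) ** 2 for c in cells for x in data[c])

    df_a, df_b = na - 1, nb - 1
    df_ab, df_err = df_a * df_b, na * nb * (n - 1)
    ms_err = ss_err / df_err
    return {
        "ss": {"A": ss_a, "B": ss_b, "AB": ss_ab, "error": ss_err},
        "F": {
            "A": (ss_a / df_a) / ms_err,
            "B": (ss_b / df_b) / ms_err,
            "AB": (ss_ab / df_ab) / ms_err,
        },
    }


# Hypothetical log richness: factor A = habitat, factor B = glacier.
example = {
    ("sediment", "Tay"): [5.0, 5.2], ("sediment", "Can"): [5.6, 5.8],
    ("sediment", "Com"): [6.0, 6.2], ("water", "Tay"): [3.0, 3.2],
    ("water", "Can"): [3.6, 3.8], ("water", "Com"): [4.0, 4.2],
}
result = two_way_anova(example)
```

In this made-up data set the habitat and glacier effects are purely additive, so the interaction sum of squares is essentially zero; a significant interaction would mean the sediment-water difference changes from glacier to glacier.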

We calculated the Bray-Curtis pairwise dissimilarity metric (Bray and Curtis, 1957) for all samples using QIIME. We used a principal coordinates analysis (PCoA) implemented in package "ape" (Paradis et al., 2004) in R to visualize the results, and a permutational multivariate analysis of variance (PERMANOVA) implemented in package "vegan" (Oksanen et al., 2018) with post hoc pairwise comparisons in "RVAideMemoire" (Hervé, 2018) to ask how assemblage structures differed among habitats and glaciers.
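For readers unfamiliar with the metric, Bray-Curtis dissimilarity between two samples is the sum of absolute differences in OTU counts divided by the sum of all counts in both samples. A minimal sketch follows; the study used QIIME for this step, and the function name, OTU labels, and counts here are hypothetical.

```python
def bray_curtis(sample_u, sample_v):
    """Bray-Curtis dissimilarity between two {otu: count} samples.

    0 means identical composition; 1 means no shared OTUs.
    """
    otus = set(sample_u) | set(sample_v)
    numerator = sum(abs(sample_u.get(o, 0) - sample_v.get(o, 0)) for o in otus)
    denominator = sum(sample_u.get(o, 0) + sample_v.get(o, 0) for o in otus)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


# Hypothetical rarefied counts for one sediment and one water sample.
sediment = {"otu_1": 60, "otu_2": 30, "otu_3": 10}
water = {"otu_1": 20, "otu_2": 30, "otu_4": 50}
dissimilarity = bray_curtis(sediment, water)
```

Because the metric is weighted by abundance, two samples can share most of their OTUs and still be highly dissimilar if the dominant OTUs differ.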

To compare the overlap in OTUs detected in the sediment and water we calculated the percent of OTUs in each habitat (water or sediments) also detected in the other for that sample. These were also compared with an interactive two-way ANOVA, log transforming data where appropriate.
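The overlap measure is asymmetric: the percentage of water OTUs also found in the sediments need not equal the percentage of sediment OTUs also found in the water. A minimal sketch, with a hypothetical function name and hypothetical OTU tables:

```python
def percent_shared(focal, other):
    """Percent of OTUs detected in `focal` that are also detected in `other`."""
    focal_otus = {otu for otu, n in focal.items() if n > 0}
    other_otus = {otu for otu, n in other.items() if n > 0}
    if not focal_otus:
        raise ValueError("focal sample contains no OTUs")
    return 100.0 * len(focal_otus & other_otus) / len(focal_otus)


# Hypothetical paired samples from one cryoconite hole.
sediment = {"otu_a": 40, "otu_b": 25, "otu_c": 10, "otu_d": 3}
water = {"otu_a": 5, "otu_b": 2, "otu_e": 7}

water_in_sediment = percent_shared(water, sediment)   # 2 of 3 water OTUs
sediment_in_water = percent_shared(sediment, water)   # 2 of 4 sediment OTUs
```

A high value for water OTUs found in the sediments combined with a low value in the other direction would indicate that the water assemblage is largely a subset of the sediment assemblage.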

# RESULTS

#### Microbial Assemblages

As expected, the overall richness of bacterial OTUs (n<sub>1</sub> = n<sub>2</sub> = 74) was at least 50% greater in sediments (scaled to 1 kg dry sediment) than in water (scaled to 1 l) (F = 136, P < 0.001), and this pattern was consistent across all glaciers. Also as expected, Commonwealth Glacier was the richest and Taylor was the most depauperate (F = 27, P < 0.001) (**Figure 1**). The richness of eukaryotic OTUs (n<sub>1</sub> = n<sub>2</sub> = 71) was similarly greater in the sediments than the water (F = 183, P < 0.001), with richness also differing significantly across glaciers (F = 34, P < 0.001), and a significant interaction between glacier and sample type (water vs. sediment) (F = 6.4, P = 0.002) (**Figure 1**).

FIGURE 1 | [...] Richness in water (light bars) is scaled up to 1 l of water, and richness in sediments (dark bars) to 1 kg dry sediment. Error bars are bootstrapped 95% confidence intervals implemented in "ggplot2" (Wickham, 2009) in R.
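The bootstrapped 95% confidence intervals shown as error bars in the figures can be illustrated with a percentile bootstrap of the mean. This sketch assumes the percentile method; the exact procedure ggplot2 applied in the study is not stated in the text, and the function name and data are hypothetical.

```python
import random


def bootstrap_ci_mean(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement n_boot times and record each resample's mean.
    means = sorted(
        sum(rng.choice(values) for _ in range(n)) / n for _ in range(n_boot)
    )
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical OTU richness values for one glacier and habitat.
richness_values = [10, 12, 9, 14, 11, 13]
ci_low, ci_high = bootstrap_ci_mean(richness_values)
```

The bootstrap makes no normality assumption, which suits small samples of count-like data such as OTU richness.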

Also as expected, assemblages of both bacteria and microbial eukaryotes differed significantly between sediments vs. water by glacier (bacteria F = 35, P = 0.001; eukaryotes F = 40, P = 0.001; **Figure 2**). More specifically, post hoc comparisons showed that all were significantly distinct from one another (P < 0.01 for all). Although dispersion of community distances differed significantly among comparisons both for bacteria (F = 19, P < 0.001) and eukaryotes (F = 26, P = 0.043), PERMANOVA is robust to heterogeneous dispersions for balanced designs (Anderson and Walsh, 2013).

We next determined whether the communities in the water were a subset of the communities in the sediments by calculating what proportion of OTUs from each stratum were also detected in the other. We found that significantly higher percentages of bacterial OTUs detected in the water were also found in the sediments than vice versa (F = 100, P < 0.001), and that the percentage differed by glacier (F = 54, P < 0.001), with a significant interaction between sediment-water comparison and glacier (F = 3.3, P = 0.040) (**Figure 3**). Specifically, the percentage of overlapping bacterial OTUs in both the sediments (Tukey HSD: Tay-Can: P = 0.012; Tay-Com: P < 0.001; Can-Com: P = 0.16) and water (Tukey HSD: Tay-Can: P < 0.001; Tay-Com: P < 0.001; Can-Com: P = 0.54) were higher on Canada and Commonwealth Glaciers than on Taylor Glacier. The percent of overlapping eukaryotic OTUs also varied between sediments and water (F = 62, P < 0.001) and by glacier (F = 31, P < 0.001) (**Figure 3**).

We furthermore plotted the ten dominant bacterial OTUs for sediments and water on each glacier (**Figure 4**). Sediments were dominated by cyanobacteria, with the genus of the dominant OTU differing among glaciers. In particular, the dominant cyanobacterium in Commonwealth sediments, Nostoc sp., was more dominant than the most dominant cyanobacterium on other glaciers, a Chamaesiphon sp. on Canada and a Phormidium sp. on Taylor. Polaromonas sp. (Betaproteobacteria) was also consistently in the ten most relatively abundant OTUs. These same cyanobacterial OTUs and the same Polaromonas sp. OTU also appeared in the water along with some other organisms, such as an Acinetobacter sp. (Gammaproteobacteria), that were not dominant in the sediments.

FIGURE 2 | Principal coordinates analysis biplot of Bray-Curtis dissimilarity metric for bacterial and eukaryotic assemblages in cryoconite holes on three glaciers along a gradient of diversity. Solid circles: sediments; open circles: water. Red: Commonwealth Glacier; gold: Canada Glacier; navy: Taylor Glacier.

Dominant eukaryotes in the sediments included an alga OTU (Pleurastrum sp.), bdelloid rotifers (Rotaria sp. and Adineta sp.), tardigrades (Acutuncus sp. and Diphascon sp.), and a ciliate, Stokesia sp. (**Figure 5**). The alga was much more dominant on Commonwealth Glacier than on the other glaciers. Dominant OTUs in the water included some of those taxa, but also OTUs representing organisms that are visibly absent from cryoconite holes, such as teleost fish (Emmelichthys sp.). Those OTUs were completely absent from the sediments, as were those of Chrysophyceae, despite their prevalence in the water.

#### Biomass and Production

The proxies for biomass and primary production were substantially greater in the sediments than in the water, as expected. DNA concentrations per cm<sup>3</sup> (N<sub>1</sub> = N<sub>2</sub> = 45) were four orders of magnitude greater in the sediments than the water (F = 1100, P < 0.001), even on Commonwealth Glacier, which had the highest concentrations (F = 41, P < 0.001) (**Figure 6**). Concentrations of chlorophyll a (N<sub>1</sub> = N<sub>2</sub> = 65) were also significantly higher in the sediments than in the water (F = 4100, P < 0.001), and differed among glaciers (F = 51, P < 0.001) with Commonwealth Glacier having the highest concentrations and Taylor Glacier the lowest (**Figure 6**).

# DISCUSSION

As expected, OTU richness and the concentrations of DNA and chlorophyll a were much greater in the sediments of cryoconite holes than in the water, and the sediments and water contained distinct assemblages of bacteria and microbial eukaryotes. The richness and structure of their assemblages also differed among glaciers, in both the sediments and water, in agreement with past work (Sommers et al., 2018). However, contrary to our main hypothesis, the overlap of OTUs between sediments and water was greater on the glaciers with higher biomass proxies and OTU richness (Canada and Commonwealth) than on the glacier with lower biomass and richness (Taylor).

[...] sediment (right panels). Tay: Taylor Glacier, Can: Canada Glacier, Com: Commonwealth Glacier. Error bars are bootstrapped 95% confidence limits, as implemented by ggplot2 (Wickham, 2009).

The greater overlap in OTUs between the sediment and water on richer glaciers is inconsistent with the idea that partitioning of sediment and water supports greater richness by increasing the niches available to support more species. Our work does not rule out that spatial partitioning may increase diversity at finer scales, both within the sediments (e.g., anaerobic microsites) and within the water (e.g., organisms attached to suspended particles). The greater biomass on Canada and Commonwealth Glaciers could have led to higher probabilities of abundant organisms being detected in both the sediments and water. The most abundant sequences from the sediments tended to be detected in the water, consistent with this explanation. Additionally, the partitioning of resources, consumers, or other factors could promote coexistence and OTU richness regardless of spatial relationships (Chesson, 2000). Finally, some abundant organisms could primarily inhabit the water while they are active during the peak of the melt season (as in Mieczan et al., 2013), but become inactive near the end of the season (as in Hawes et al., 2011; Safi et al., 2012), allowing gravity and freeze compression to settle them to the sediments. Modeling of Antarctic cryoconite holes indicates that they freeze from the top down at the end of the summer melt season (Zamora, 2018), creating the potential for a downward-moving freezing front to exclude ions and biological particles from the water (as in Watterson, 1985). This same process occurs in coastal Antarctic ponds, creating brine layers at the bottoms (Wait et al., 2006; Hawes et al., 2011; Safi et al., 2012). Such concentration in the benthic sediments, whether due to inactivity and settling or to freeze concentration, would generate a seasonal component to sediment-water partitioning in cryoconite holes. Further work on cryoconite holes should therefore investigate the seasonality of organisms' activity in cryoconite holes, and sediment-water partitioning of the broader community during peak melt as a comparison.

Bacterial assemblages in the sediments were dominated by cyanobacteria, which are common in aquatic ecosystems in the McMurdo Dry Valleys (McKnight et al., 1998), and contribute significantly to the in situ primary production of the valleys (Barrett et al., 2006). This dominance is consistent with cryoconite holes having the potential to be net photosynthetic ecosystems (Bagshaw et al., 2016a), and consistent with results from other cryoconite holes in the region (Webster-Brown et al., 2015). While this was particularly true on the Commonwealth Glacier, even the lowest biomass Taylor Glacier had a diverse array of cyanobacteria. As larger Antarctic ponds freeze, their primary production slows to the point that they become net heterotrophic systems, and the relative abundance of primary producers may be at a minimum during the winter (Hawes et al., 2011; Safi et al., 2012). Assuming the same processes occur in cryoconite holes, this relative abundance of cyanobacteria in their frozen state is a conservative measure of their relative abundance. The dominant cyanobacteria in the sediments of each glacier differed, possibly due to localized Aeolian transport (Stanish et al., 2013), which was also found to be the case on other glaciers in the Dry Valleys (Webster-Brown et al., 2015). The prevalence of an OTU matching Nostoc sp. on the Commonwealth Glacier was particularly striking. The ability of the Nostoc genus to fix atmospheric nitrogen into a biologically usable form could provide an escape from nutrient limitations on glaciers (Dodds et al., 1995), and Commonwealth had the greatest chlorophyll concentrations. However, there were also some similarities among glaciers, with a Chamaesiphon sp. and a Leifsonia sp. being dominant cyanobacteria on all three glaciers. A Polaromonas sp. was also consistently in the ten most relatively abundant OTUs. Polaromonas phylotypes are regularly part of polar and alpine microbial communities globally (Darcy et al., 2011).

The eukaryotic assemblage in the sediments was primarily composed of algae, microfauna, and ciliates. The dominant alga, a Pleurastrum sp., falls within the "CP clade" (Chlamydopodium-Pleurastrum) as described by Schmidt et al. (2011). This monophyletic clade also encompasses a "snow" alga (AF514408) isolated from the high (78°N) Arctic, algae inhabiting talus soils at the Niwot Ridge Long Term Ecological Research site (Freeman et al., 2009), and algae found in the sediments on top of the debris-covered Toklat Glacier in central Alaska (e.g., KM870774) (Schmidt and Darcy, 2015). Members of the CP clade have been found to make up a large proportion of sequences in molecular analyses of both High Himalayan soils (up to 36% of 18S sequences, Schmidt et al., 2011) and some Antarctic Dry Valley soils (Fell et al., 2006). However, members of this clade have also been cultured from Lake Fryxell and other lakes in Antarctica (De Wever et al., 2009), so it is unclear if they are terrestrial algae that end up in lakes or lake algae that end up in soil/sediments. The CP clade has been previously detected in a cryoconite hole on Canada Glacier (Christner et al., 2003). Dominant microfauna included the bdelloid rotifers Rotaria sp. and Adineta sp. and the tardigrades Acutuncus sp. and Diphascon sp. It is important to keep in mind that OTUs do not translate directly to species, and that the diversity of both rotifers and tardigrades in cryoconite holes is possibly underestimated by using the V9 region of the 18S SSU gene marker (Tang et al., 2012). Sequences from a ciliate, Stokesia sp., and a flatworm related to Phaenocora sp. were also common, as previously seen in DNA from these glaciers (Sommers et al., 2018).

Dominant OTUs in the water included some of the OTUs common in the sediments, but also sequences whose closest matches were organisms that are visibly not present in cryoconite holes, such as teleost fish, Emmelichthys sp., and the insects Rhyzobius sp. (Coleoptera) and Xylocoris sp. (Hemiptera). These sequences could reflect the presence of microbial animals without close relatives in the SILVA 128 RefNR99 database, and that therefore match to some other animal. As they were completely absent from the richer sediments, they could also be reflective of the low DNA concentrations, and reflect lab contamination or atmospheric deposition of DNA from dead cells that became trapped in the ice. We did not detect these sequences in the biologically richer sediments, although they may have been present but not detectable given the higher biomass and concentrations of DNA. While we cannot completely rule out the possibility of contamination, their presence after filtering out OTUs from extraction blanks suggests the DNA was in the water samples and not a reflection of contamination.

Such OTUs could plausibly represent a signal of atmospheric deposition of relic DNA that remained undegraded in the low-biomass water. Wind is a major driver of biotic and abiotic processes in Taylor Valley, and plays a role in cryoconite hole formation by transporting sediments onto glaciers (Lancaster, 2002). The microbial communities of closed cryoconite hole sediments in the Taylor Valley and the surrounding McMurdo Dry Valleys show a strong signal of dispersal from local habitats, with some dispersal from farther away (Webster-Brown et al., 2015; Sommers et al., 2018) and little spatial structuring of cryoconite hole communities at the within-glacier scale (Darcy et al., 2018). Material is primarily transported down valley by powerful föhn winds (Šabacká et al., 2012), common in winter, although the gentle up-valley onshore breezes, more common in summer, could bring DNA or light microbial material from the ocean into the valley (Nylen et al., 2004). Based on these wind patterns, however, we might expect to find evidence of coastal relic DNA more commonly in the Commonwealth Glacier samples than in those of the Taylor Glacier. The dominance of suspected inactive taxa in the holes of Taylor Glacier, therefore, may be driven less by wind patterns than by its low DNA concentrations.

Some of the OTUs found only in the water may be organisms truly able to grow in cryoconite holes. For example, the Massilia species in the water, which was not detected in our control blanks, matched sequences from liquor fermentation (Pang et al., 2018: MG859189), but also sequences transported by coastal winds (unpubl.: MG271571 and MG270648). However, many members of its order (Burkholderiales) are also very adaptable, showing high rates of horizontal gene transfer, and are often early-successional taxa (Nemergut et al., 2007) that can be globally dispersed in cold environments (Darcy et al., 2011). Similarly, the closest matches for the fungus Penicillium sp. were from aquatic areas with human contamination, such as municipal water and water tower systems, and in canals (e.g., KT265809, KX610136, and KX090324), but Penicillium species have been isolated from the soils of Antarctic islands, where they were relatively abundant (Gomes et al., 2018). Another eukaryote that was nearly absent in the sediments, but dominant in the water, was a golden alga (class Chrysophyceae). Its closest relatives are from marine samples, but it is also related to the "Hydrurus clade" described by Klaveness et al. (2011). Members of this clade occur almost exclusively in high mountain or other cold environments including Baltic Sea ice (FN690692), high-mountain snow (AJ867745), high-Arctic snow (HQ230104), and high-mountain streams (AY689714) (Klaveness et al., 2011). Unlike vertebrates, relatives of these taxa could conceivably be active in a cryoconite hole and represent an active part of the community, or they could represent contamination or relic DNA from dead cells, detectable only because of the low biomass in the frozen water.

The bulk of the DNA and chlorophyll a were found in the sediments of cryoconite holes across all glaciers, as in previous work on Canada Glacier's cryoconite holes (Foreman et al., 2007). Although we expected greater biomass in the sediments than the water, similar to most aquatic systems, the degree of the difference was surprising. Previous work had found approximately four times the number of bacterial cells in the sediments compared with the water (Foreman et al., 2007), whereas we found orders of magnitude differences in the proxies of biomass we used. A previous measurement of chlorophyll a within melted Canada Glacier cryoconite hole water was approximately five times our measurements of the water (Foreman et al., 2007), but more mixing with the productive sediments could have taken place for the previous measurement. Differences in the extraction protocols could make DNA concentrations less comparable between sediments and water, but DNA from 50 ml in the water was mostly undetectable by fluorometry, even when concentrated, suggesting biological material was extremely low in the frozen water relative to 0.3 g sediment, potentially allowing relic or contaminant DNA or rare organisms to be detected primarily in the water. Low biomass in the water relative to the sediments is furthermore consistent with regional Antarctic ponds (Foreman et al., 2011; Hawes et al., 2011).

Pond-sized aquatic ecosystems have been proposed as good model systems for studying macroecological processes (Adrian et al., 2009; Williamson et al., 2009; Hortal et al., 2014). The response of planktonic microbial communities in Antarctic ponds to freezing has been previously characterized (Hawes et al., 2011; Safi et al., 2012), and the substantial differences in biogeochemical factors between ponds with the same climate make them useful model systems (Archer et al., 2016). Ponds, however, are difficult to create experimentally. Antarctic cryoconite holes could serve as an even smaller study system for understanding the physical and biological interactions in lentic systems that repeatedly melt and freeze. With hundreds of holes in close proximity formed in the same glacial ice, the effect of freezing solid on the stratification of microbial life can be studied with highly replicated experiments or observational studies, which are less feasible in pond-sized ecosystems.

# CONCLUSION

In conclusion, we found that cryoconite holes with more OTUs and more biomass had greater overlap in OTUs present in both the sediment and water. This was contrary to a hypothesis that richer environments might be so due to partitioning of the sediments and water. Possible explanations include that other factors create niche partitioning to promote diversity, that spatial partitioning occurs at a finer scale than sampled, or that organisms primarily inhabiting the water were compressed by freezing down to the sediment layer, leaving primarily relic DNA in the frozen water to be sampled. Future work should include determining seasonal activity levels of dominant taxa in cryoconite holes, and whether changes in relative abundance are observed between the water of melted and frozen holes.

# AUTHOR CONTRIBUTIONS

SS, PS, DP, JD, FZ, and AF determined the study design. PS, DP, JD, and FZ collected and processed the samples. FZ and AF collected and analyzed the physical data on natural cryoconite holes. PS, EG, JD, DP, SS, KC, KV, AS, LV, and JR contributed to analysis and interpretation of biogeochemical and community data. PS, EG, FZ, and AF primarily wrote the manuscript. All authors contributed text and revisions to the manuscript.

# FUNDING

This work was funded by National Science Foundation Polar Programs Award 1443578. Publication of this article was funded by the University of Colorado Boulder Libraries Open Access Fund.

# ACKNOWLEDGMENTS

This project was originally proposed by the late Diana Nemergut, to whom we are grateful and whom we miss very much. Thanks to Amy Chiuchiolo and the McMurdo LTER for their assistance measuring chlorophyll, and to the BioFrontiers Sequencing Center at the University of Colorado Boulder. We appreciate the thoughtful suggestions of two reviewers.

### REFERENCES



algae: further evidence for the existence of glacial refugia. Proc. R. Soc. B 276, 3591–3599. doi: 10.1098/rspb.2009.0994



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Sommers, Darcy, Porazinska, Gendron, Fountain, Zamora, Vincent, Cawley, Solon, Vimercati, Ryder and Schmidt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Spatial and Temporal Scales Matter When Assessing the Species and Genetic Diversity of Springtails (Collembola) in Antarctica

Gemma E. Collins<sup>1</sup>, Ian D. Hogg<sup>1,2</sup>\*, Peter Convey<sup>3</sup>, Andrew D. Barnes<sup>1†</sup> and Ian R. McDonald<sup>1</sup>

*<sup>1</sup> School of Science, University of Waikato, Hamilton, New Zealand, <sup>2</sup> Polar Knowledge Canada, Canadian High Arctic Research Station, Cambridge Bay, NU, Canada, <sup>3</sup> British Antarctic Survey, Cambridge, United Kingdom*

#### Edited by:

*Angela McGaughran, Australian National University, Australia*

#### Reviewed by:

*Bettine Van Vuuren, University of Johannesburg, South Africa Katy Morgan, University of Bath, United Kingdom*

#### \*Correspondence:

*Ian D. Hogg ian.hogg@polar.gc.ca orcid.org/0000-0002-6685-0089*

*†Andrew Barnes orcid.org/0000-0002-6499-381X*

#### Specialty section:

*This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution*

> Received: *08 August 2018* Accepted: *28 February 2019* Published: *22 March 2019*

#### Citation:

*Collins GE, Hogg ID, Convey P, Barnes AD and McDonald IR (2019) Spatial and Temporal Scales Matter When Assessing the Species and Genetic Diversity of Springtails (Collembola) in Antarctica. Front. Ecol. Evol. 7:76. doi: 10.3389/fevo.2019.00076*

Seven species of springtail (Collembola) are present in Victoria Land, Antarctica and all have now been sequenced at the DNA barcoding region of the mitochondrial cytochrome *c* oxidase subunit I gene (COI). Here, we review these sequence data (*n* = 930) from the GenBank and Barcode of Life Data Systems (BOLD) online databases and provide additional, previously unpublished sequences (*n* = 392) to assess the geographic distribution of COI variants across all species. Four species (*Kaylathalia klovstadi, Cryptopygus cisantarcticus, Friesea grisea*, and *Cryptopygus terranovus*) are restricted to northern Victoria Land and three (*Antarcticinella monoculata*, *Cryptopygus nivicolus*, and *Gomphiocephalus hodgsoni*) are found only in southern Victoria Land, two biogeographic zones that are separated in the vicinity of the Drygalski Ice Tongue. We found highly divergent lineages within all seven species (range 1.7–14.7%) corresponding to different geographic locations. Levels of genetic divergence for the southern Victoria Land species *G. hodgsoni*, the most widespread species (∼27,000 km<sup>2</sup>), ranged from 5.9 to 7.3% at sites located within 30 km of each other, but separated by glaciers. We also found that the spatial patterns of genetic divergence differed between species. For example, levels of divergence were much higher for *C. terranovus* (>10%) than for *F. grisea* (<0.2%) that had been collected from the same sites in northern Victoria Land. Glaciers have been suggested to be major barriers to dispersal and two species (*C. cisantarcticus* and *F. grisea*) showed highly divergent (>5%) populations and over 87% of the total genetic variation (based on AMOVA) on either side of a single, 16 km wide glacier. Collectively, these data provide evidence for limited dispersal opportunities among populations of springtails due to geological and glaciological barriers (e.g., glaciers and ice tongues). Some locations harbored highly genetically divergent populations, and we highlight these areas from a conservation perspective, for example to avoid human-mediated transport between sites. We conclude that species-specific spatial and temporal scales need to be considered when addressing ecological and physiological questions as well as conservation strategies for Antarctic Collembola.

Keywords: Antarctica, biogeography, collembola, dispersal, mitochondrial DNA barcodes, population genetic structure, species diversity, springtails

# INTRODUCTION

Due to the extreme environmental conditions that characterize Antarctica (Convey, 2013), as well as the geographical, and island-like, isolation of suitable terrestrial habitats (Bergstrom and Chown, 1999), long-range dispersal events for Antarctic Collembola (springtails) are rare and they usually rely on liquid water for transport via flotation (Hawes et al., 2008; McGaughran et al., 2011a; Carapelli et al., 2017). It is likely that springtails are currently unable to disperse among the three Antarctic Conservation Biogeographic Regions (ACBRs; Terauds and Lee, 2016) in the Ross Sea Sector of Antarctica: northern Victoria Land (NVL), southern Victoria Land (SVL), and the Transantarctic Mountains (TAM), owing to various physical obstacles in the marine and terrestrial realms (e.g., the Drygalski Ice Tongue; **Figure 1**). Within each of these three ACBRs, available habitat is patchy and local microhabitats are likely to be important for long-term persistence of populations (Sinclair and Sjursen, 2001). These small "island" populations can thus accumulate genetic mutations resulting in distinct genetic patterns across the landscape. Previous studies on the diversity patterns of lichen communities within the Antarctic have suggested that distributions are likely to have been driven by geological or glaciological features rather than environmental gradients that are associated with latitudinal or longitudinal distances (Adams et al., 2006; Peat et al., 2007; Green et al., 2011; Colesie et al., 2014). Similarly, the distributions of Antarctic springtail species may not follow a pattern of decreasing species diversity with increasing latitude (e.g., Caruso et al., 2009). The spatial scales at which sampling is undertaken are, therefore, important as genetic differences between populations do not necessarily increase proportionately with distance. 
However, until now a broader synthesis of genetic data for multiple species across larger spatial scales has not been undertaken. This is unfortunate as such an analysis could help to assess the relative role of geographic barriers in structuring populations of Antarctic springtails.

Terrestrial invertebrates were first discovered in the Ross Sea sector of Antarctica during the early 1900s (e.g., Carpenter, 1902, 1908; Willem, 1902; Gregory, 1909; Macnamara, 1919). Further entomological research was undertaken during the 1960s and 1970s and provided morphological descriptions and distributional ranges as well as general ecological observations and physiological studies, particularly for the most widespread species in this area—Gomphiocephalus hodgsoni (Gressitt and Leech, 1961; Salmon, 1962; Gressitt et al., 1963; Janetschek, 1963, 1967, 1970). More recently, molecular approaches including allozyme analyses (e.g., Frati et al., 1996; Stevens and Hogg, 2003) and DNA sequencing of mitochondrial cytochrome c oxidase subunit II (e.g., Frati and Dell'Ampio, 2000; Frati et al., 2001; McGaughran et al., 2010) and cytochrome c oxidase subunit I (COI) gene regions (e.g., Nolan et al., 2006; Stevens and Hogg, 2006; McGaughran et al., 2010) have been undertaken for Antarctic springtails. These data have enhanced understanding of their evolutionary histories and allowed testing of hypotheses, such as endemism of the fauna, proposed by earlier researchers (e.g., Wise, 1967). Further, they have provided insights into the long-term persistence of endemic taxa, their survival in refugia and subsequent expansion during interglacial periods (e.g., Convey et al., 2008; McGaughran et al., 2011b; Beet et al., 2016; Carapelli et al., 2017). In turn, this has facilitated predictions as to how populations may respond to future changes in habitat availability and environmental conditions (e.g., Chown and Convey, 2007; Lee et al., 2017). While these studies either focus on a single species or have made comparisons between two species, none has assessed the full range of COI sequences available for all known species in the area.

Based on early studies of springtail distribution using morphological taxonomy, the Ross Sea sector (longitude 160–175°E) was subdivided into five areas (Wise, 1967), although more recent data have confirmed that some of the described species distributions within these five areas were incorrect (e.g., Ross Island contains only G. hodgsoni). Studies focused on specific taxa (e.g., Cryptopygus terranovus) have suggested additional landscape divisions based on genetic evidence for geographical isolation (Stevens et al., 2006b; Hawes et al., 2010; Carapelli et al., 2017). However, other species present in the same area do not necessarily exhibit the same divergence patterns (Caruso et al., 2009). For example, in an area of NVL where Friesea grisea and C. terranovus have overlapping ranges, the two species have very different levels of genetic divergence; <0.2% for F. grisea (Torricelli et al., 2010b) and >10% for C. terranovus (Carapelli et al., 2017).

Our current study compiled and consolidated all available COI sequences and collection information for each of the seven species found in Victoria Land; Cryptopygus cisantarcticus, C. terranovus, Friesea grisea, and Kaylathalia klovstadi in NVL, and Antarcticinella monoculata, C. nivicolus, and Gomphiocephalus hodgsoni in SVL (**Table 1**). We provide a comprehensive list of collection sites for each haplotype to assess spatial variability. We also examined whether spatially isolated populations would be distinct at the COI gene region, and if so, whether landscape features (such as particularly large glaciers and ice tongues) may provide contemporary barriers to dispersal. Where relevant, we highlight instances of possible cryptic speciation as indicated by highly distinct genetic lineages, as well as identify priority sites from a conservation perspective.

# METHODS

#### Existing Sequence Data

All publicly available sequences were downloaded from the BOLD and GenBank databases (n = 930 sequences), covering the seven springtail species present in Victoria Land. Supporting information was also downloaded from BOLD where possible, although in many instances these data were absent, particularly for sequences that had been obtained directly from GenBank. Where possible, we endeavored to manually retrieve collection details from original research articles. To maximize sequence length of previously-analyzed specimens, original trace files (where possible) were imported into Geneious v11.1.5 (https://www.geneious.com) for complete re-analysis. Forward and reverse sequence reads were assembled, any discrepancies were edited and then consensus sequences were extracted. For cases where no trace files were available, final sequences that had been uploaded to GenBank or BOLD were used.

#### New Sequence Data

A total of 392 previously unpublished sequences have been included in the current study (**Tables 2**–**4** and **Table S2**). Of these, 56 were obtained from individuals collected from NVL in January 2004 (C. Beard and R. Seppelt), January 2015 (C. Cary) and November 2017 (I. Hogg). For previously unpublished sequences of G. hodgsoni, specimens were collected from sites in the vicinities of Mackay Glacier (n = 60), Taylor Valley (n = 21) and the southern Dry Valleys (n = 249) from 2009 to 2016 (I. Hogg, G. Collins, C. Beet and N. Demetras). An additional six specimens of C. nivicolus were collected from Mount Seuss in 2008 (I. Hogg), and Mount Seuss and Tiger Island in 2015 (G. Collins, C. Beet, I. Hogg and D. Cowan). For a full list of sites where specimens were collected, see **Figures 1**, **2**; **Tables 2**, **3**, and **Tables S1–S9**.

These previously unpublished sequences were from specimens collected in pitfall traps (described in McGaughran et al., 2011a), using modified aspirators (Stevens and Hogg, 2002), or isolated by flotation from soil samples (Freckman and Virginia, 1993). In all cases, individual specimens were immediately preserved in 100% ethanol for later DNA extraction. DNA extractions, PCR amplifications and COI sequencing were carried out at the University of Waikato following procedures outlined in Collins and Hogg (2015) or at the Canadian Centre for DNA Barcoding following established protocols (see http://ccdb.ca/resources/).

#### Data Analyses

TABLE 1 | The seven springtail species present in the two Antarctic Conservation Biogeographic Regions (ACBRs) in Victoria Land (NVL and SVL), within the Ross Sea sector of Antarctica.

*Refer to Sinclair and Stevens (2006) for additional taxonomic and distributional detail, and Register of Antarctic Species (RAS; raw.biodiversity.aq) for history of name changes for each species.*

TABLE 2 | Collection sites in northern Victoria Land, grouped by region, including the number of COI sequences at each site for the four species in this area; *Cryptopygus cisantarcticus* (CCIS), *Cryptopygus terranovus* (CTER), *Friesea grisea* (FGRI), and *Kaylathalia klovstadi* (KKLO).

*The number of sequences newly generated in the current study are shown in parentheses. See* Table S1 *for additional details, such as correlation of site codes used in previous studies.*

A separate alignment was made for each species using ClustalW within Geneious. After forward (LCO or LepF1) and reverse (HCO or LepR1) primer regions (26 bp each) were removed, each species' alignment was individually assessed and trimmed accordingly to maximize alignment length but also retain maximum sequence coverage for each species. Final alignments varied in length, from 422 to 586 bp. Based on these final trimmed alignments, unique haplotypes for each species were then manually assigned codes in order of the date they were sequenced (e.g., GHOD-001 for Gomphiocephalus hodgsoni, haplotype 1). Future haplotypes can therefore be added sequentially to this dataset.
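The haplotype coding step can be illustrated with a minimal Python sketch. It assumes the sequences have already been trimmed to a common alignment and are listed in sequencing order; the function name `assign_haplotypes`, the toy sequences and the exact-match rule for spotting duplicates are illustrative assumptions, not the manual procedure used in this study.

```python
from collections import Counter


def assign_haplotypes(sequences, prefix):
    """Collapse identical aligned sequences into sequentially coded haplotypes.

    sequences: list of (specimen_id, aligned_sequence) pairs, assumed to be
               trimmed to the same length and ordered by sequencing date.
    prefix:    species code used in the haplotype label, e.g. "GHOD".
    Returns (specimen -> haplotype code, haplotype code -> number of sequences).
    """
    codes = {}       # unique sequence -> haplotype code
    assignment = {}  # specimen id -> haplotype code
    for specimen_id, seq in sequences:
        key = seq.upper()  # treat lower- and upper-case bases as identical
        if key not in codes:
            # New haplotypes receive the next number in the series
            codes[key] = f"{prefix}-{len(codes) + 1:03d}"
        assignment[specimen_id] = codes[key]
    counts = Counter(assignment.values())
    return assignment, counts


# Hypothetical toy data: three specimens, two distinct sequences
toy = [("spec1", "ACGTTA"), ("spec2", "ACGTTG"), ("spec3", "acgtta")]
assignment, counts = assign_haplotypes(toy, "GHOD")
print(assignment)  # spec1 and spec3 share GHOD-001; spec2 is GHOD-002
print(counts)
```

Because matching is exact, sequences containing ambiguous bases or missing data would be split into separate haplotypes under this rule; real datasets would need those cases handled explicitly.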

The numbers of duplicate sequences for each haplotype were manually calculated based on the number of replicates within our trimmed alignments, as well as from original tables and figures provided in published papers for cases where only unique sequences were originally deposited online (Nolan et al., 2006; Demetras, 2010; McGaughran et al., 2010). This was particularly problematic for G. hodgsoni, as the numbers of each haplotype from each specific location could only be determined by a process of elimination through cross-referencing all available tables, figures and supplementary data. Unfortunately, collection information is absent for 46 G. hodgsoni specimens (see **Table S9**), outlined as follows. The precise collection location is uncertain for the two full mitochondrial genomes that have been previously assembled for G. hodgsoni (GenBank accessions AY191995 and NC005438; Nardi et al., 2003) as well as the three sequences from Greenslade et al. (2011). GenBank accession numbers for the five sequences new to McGaughran et al. (2008) could not be linked with their respective site data, and 41 specimens from McGaughran et al. (2010) which include those from McGaughran et al. (2008) were also lacking site data. For specimens from Demetras (2010) where collection data were not resolved to precise site, we have used the broader "valley-level" locations (i.e., Garwood, Marshall and Miers Valleys, and Shangri La) in our analyses. Overall, sequences from a total of 88 sites were assessed in this study (**Figures 1**, **2**; **Tables 2**, **3** and **Table S1**). For specimens with collection data absent, sequences were included in the initial alignment for phylogenetic tree construction and haplotype assignments, and were then removed for subsequent biogeographic analyses.

TABLE 3 | List of collection sites in southern Victoria Land, grouped by region, including the number of COI sequences at each site for the three species in this area; *Antarcticinella monoculata* (AMON), *Cryptopygus nivicolus* (CNIV) and *Gomphiocephalus hodgsoni* (GHOD).

*The number of sequences newly generated in the current study are shown in parentheses. See* Table S1 *for additional details, such as correlation of site codes used in previous studies.*

TABLE 4 | Sequence details for the seven species of Collembola present in NVL and SVL, showing alignment lengths used in the current study (which vary for each species).

*Of the number of sequences (n) within each alignment, the number of sequences new to this study are shown in parentheses. The number of unique haplotypes are also shown (h), along with relative proportion of A and T nucleotides (% A–T) and maximum divergences (uncorrected p-distance) for each species. We also highlight the potential for multiple divergent lineages within each of the species, with mean divergences (uncorrected p-distance) provided in parentheses.*

The individual alignments for the seven species were then assembled into one master alignment, which was trimmed to 422 bp, and a Maximum Likelihood tree was generated in MEGA v7.0.26 (Kumar et al., 2016) based on the sequence model GTR+I+G (Guindon and Gascuel, 2003; Darriba et al., 2012; JModelTest2), with 1,000 bootstrap replicates. All sequence divergence values included in the current study were also generated in MEGA based on uncorrected p-distances.
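For readers unfamiliar with the measure, the uncorrected p-distance is the proportion of compared alignment sites at which two sequences differ. The Python sketch below illustrates that definition only; it is not MEGA's implementation, and the function name `p_distance` and the pairwise skipping of gap or ambiguous characters are assumptions made for illustration.

```python
def p_distance(seq_a, seq_b, ignore="-N?"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a character listed in `ignore`
    are skipped (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ignore or b in ignore:
            continue  # site is not comparable for this pair
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical 10 bp fragments differing at one site
print(p_distance("ACGTACGTAC", "ACGTACGTAA"))  # one difference over ten sites
```

On the 422 bp master alignment, a single substitution corresponds to a p-distance of 1/422, or about 0.24%.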

Haplotype pie charts (**Figures 4**–**7**) were generated in R v3.5.1 utilizing the packages "mapdata" (Becker et al., 2016) and "mapplots" (Gerritsen, 2014) based on the individual species alignments which were each trimmed to different lengths, depending on the sequences. To retain haplotype-level resolution, phylogenetic tree excerpts included in these figures were based on the individual species alignments (422–586 bp).

To determine potential barriers to dispersal, Analyses of Molecular Variance (AMOVA) were performed separately for each species in R v3.5.1 utilizing the package "poppr" with 16,000 permutations (Kamvar et al., 2014, 2015). For this, species alignments that had each been trimmed to between 422 and 586 bp (**Table 4**) were used, and haplotypes were grouped according to their occurrence at sites within particular regions (listed in **Tables 2**, **3**) to test the amount of genetic variability occurring within and between regions. As specimens of Cryptopygus nivicolus were all found within the Mackay Glacier region, we performed the AMOVA for this species with the sites Towle Glacier and Springtail Point designated as separate regions.
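The logic of partitioning variation within and between regions can be sketched in Python for the simplest case of a single grouping level, following the variance-component formulas of Excoffier et al. (1992). This is a hedged illustration, not the "poppr" implementation used here: the function names, the toy matrix, and the choice to treat the supplied pairwise values directly as squared distances are all assumptions, and the number of permutations is left as a parameter.

```python
import random


def amova_one_level(sq_dist, groups):
    """Single-level AMOVA from a symmetric matrix of squared distances.

    sq_dist: list of lists; sq_dist[i][j] is the squared distance between
             individuals i and j.
    groups:  list with one group (region) label per individual.
    Returns (sigma2_among, sigma2_within, phi_st).
    """
    n = len(groups)
    labels = sorted(set(groups))
    if len(labels) < 2 or n <= len(labels):
        raise ValueError("need >= 2 groups and more individuals than groups")
    members = {g: [i for i, lab in enumerate(groups) if lab == g] for g in labels}

    def ssd(idx):
        # Sum of squared deviations: pairwise squared distances / group size
        pair_sum = sum(sq_dist[i][j] for a, i in enumerate(idx) for j in idx[a + 1:])
        return pair_sum / len(idx)

    ssd_total = ssd(list(range(n)))
    ssd_within = sum(ssd(idx) for idx in members.values())
    ssd_among = ssd_total - ssd_within

    df_among = len(labels) - 1
    df_within = n - len(labels)
    msd_among = ssd_among / df_among
    msd_within = ssd_within / df_within
    # n0 corrects the among-group component for unequal group sizes
    n0 = (n - sum(len(idx) ** 2 for idx in members.values()) / n) / df_among

    sigma2_within = msd_within
    sigma2_among = (msd_among - msd_within) / n0
    total = sigma2_among + sigma2_within
    phi_st = sigma2_among / total if total > 0 else 0.0
    return sigma2_among, sigma2_within, phi_st


def permutation_p_value(sq_dist, groups, n_perm=999, seed=0):
    """Proportion of label permutations with phi_st >= the observed value."""
    rng = random.Random(seed)
    observed = amova_one_level(sq_dist, groups)[2]
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign individuals to groups at random
        if amova_one_level(sq_dist, shuffled)[2] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: two regions, identical sequences within each region
toy_dist = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
toy_groups = ["north", "north", "south", "south"]
sigma_a, sigma_w, phi = amova_one_level(toy_dist, toy_groups)
print(100 * sigma_a / (sigma_a + sigma_w))  # percent of variation among regions
```

The percentage of variation "between regions" reported in an AMOVA table corresponds to the among-group variance component divided by the total, as in the final line of the sketch.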

# RESULTS AND DISCUSSION

We compiled all available COI sequences for the seven springtail species in Victoria Land. Sequence coverage ranged from 25 to 950 individuals per species (total n = 1,322), with between 5 and 88 haplotypes identified for each species (**Figure 3**; **Table 4**). The final alignments for each species varied in length (422–586 bp), largely due to variation in sequence quality, and we treated each species separately for biogeographic analyses to maximize variability within the datasets, although a final alignment trimmed to 422 bp was used to construct the phylogenetic tree (**Figure 3**). Relative proportions of AT-richness ranged from 58.3 to 64.6% for each species (**Table 4**), and no insertions or deletions were found. Maximum intraspecific sequence divergences ranged from 2.7 to 15.4% (uncorrected p-distance) and for all seven species, high divergence values (1.7–14.7%) were found among individuals from different geographic locations (**Table 4**), supported by AMOVAs.

#### Northern Victoria Land Taxa

Northern Victoria Land is the northern-most Antarctic Conservation Biogeographic Region (ACBR) of the Ross Sea sector and extends from Cape Adare (71.3°S) to the Drygalski Ice Tongue (75.3°S) (**Figure 1**). We recovered a total of 299 COI sequences from the four springtail species that are present in this area; Kaylathalia klovstadi (KKLO; n = 35 sequences), Cryptopygus cisantarcticus (CCIS; n = 25), Friesea grisea (FGRI; n = 66), and Cryptopygus terranovus (CTER; n = 173). Each species was collected from a different range of sampling locations and also showed contrasting population genetic structures. For example, two genetically distinct lineages (clades C and D) of C. terranovus from sites in the central and northern zones as designated by previous studies (e.g., Fanciulli et al., 2001; Hawes et al., 2010; Carapelli et al., 2017), were highly divergent (>10%) whereas F. grisea specimens from the same area all had identical sequences (**Figures 4**, **5**). Similarly, sequences of K. klovstadi were only 0.7% divergent between Cape Hallett and Redcastle Ridge, whereas C. cisantarcticus specimens differed by 4.3% between the same two collection sites. The Tucker Glacier is a significant barrier to dispersal for Antarctic springtails in the area, with highly distinct (>5%) populations present on either side of this glacier and no shared haplotypes between populations, consistent across species.

#### Kaylathalia klovstadi (Carpenter, 1902)

The first collections of this species (originally described as Isotoma klovstadi) were obtained from Geikie Ridge during the British Antarctic Expedition (1898–1901), after which it was not collected again until 1965 from Ridley Beach at Cape Adare (Wise, 1971). The first five COI sequences obtained for K. klovstadi were from individuals collected at Cape Hallett and published in a study that reclassified the genus from Isotoma to Desoria (Stevens et al., 2006a). The same five sequences were further used to again reclassify the genus to Kaylathalia (Stevens and D'Haese, 2016).

We now include a further 30 previously unpublished COI sequences from three locations (including Cape Hallett), revealing 15 new haplotypes (KKLO-005 to KKLO-019; **Tables 2**, **4**; **Tables S2**, **S3**). By expanding the sequence coverage and including two additional sites, we are now able to demonstrate geographical isolation of distinct genetic lineages (2%) for K. klovstadi (**Figure 4B**). When haplotypes in the combined dataset were grouped by region, an AMOVA showed that 75% of the sequence variation occurred between the northern Cape Adare group (n = 8), and the southern Cape Hallett and Redcastle Ridge group (n = 27) (**Table 5**; p < 0.01). No haplotypes were shared between sites, although Redcastle Ridge was represented by only 3 sequences which were similar to those from Cape Hallett (0.7% divergence).

Based on a study of 40 COII gene sequences, Frati et al. (2001) suggested that the Tucker Glacier was a major barrier to dispersal for K. klovstadi. Currently, no COI sequences are available for K. klovstadi from the southern side of the Tucker Glacier and this would benefit from further attention, particularly as strong population genetic structure for the COI gene was observed for C. cisantarcticus and F. grisea on opposing sides of this glacier (see sections Cryptopygus cisantarcticus Wise, 1967 and Friesea grisea Schaeffer, 1891).

#### Cryptopygus cisantarcticus (Wise, 1967)

The first four COI sequences of C. cisantarcticus were from specimens collected at Cape Hallett and included in Stevens et al. (2006b), who suggested this species has been isolated for 18–11 MY based on comparisons with other Southern Hemisphere springtail species. An additional 10 sequences were obtained from Crater Cirque and presented as an outgroup in Carapelli et al. (2017).

Here, we added a further 10 sequences from Cape Hallett and a single sequence from Redcastle Ridge, representing nine new haplotypes (CCIS-006 to CCIS-014), and provide comparison among all COI sequences currently available for C. cisantarcticus (**Tables 2**, **4**; **Tables S2**, **S4**). When haplotypes in the combined dataset were grouped by location, an AMOVA showed that 87% of the sequence variation occurred across the Tucker Glacier, between the northern Cape Hallett and Redcastle Ridge group (n = 15) and the southern Crater Cirque population (n = 10) (**Table 5**; p < 0.01).

Population genetic structure differed for C. cisantarcticus as compared to other species in the area where their sequence distributions overlapped (**Figure 4**). Sites at Redcastle Ridge and Cape Hallett are in relatively close proximity (<16 km), and divergence between these two locations was very low for K. klovstadi (0.7%) while much higher for C. cisantarcticus (4.3%). Furthermore, mean divergence for C. cisantarcticus across the Tucker Glacier was 5.8%, whereas divergence was higher for F. grisea, at >9%.

#### Friesea grisea (Schaeffer, 1891)

This species has been considered "pan-Antarctic" until very recently; specimens of Friesea grisea from the Antarctic Peninsula were found to be morphologically distinct from those in Eastern Antarctica (Greenslade, 2018) in agreement with the absence of haplotype sharing between the two locations (>15% divergence; Torricelli et al., 2010b). A total of 55 COI sequences for F. grisea have been published (Torricelli et al., 2010b). In addition, the full mitochondrial genome has previously been assembled for F. grisea (GenBank accession KR180288) collected from Kay Island (Torricelli et al., 2010a) and we have included the COI gene region from this sequence in our analyses.

clades, to highlight geographical isolation of well-supported clades (bootstrap values provided on branches, where relevant).

We added an additional 10 unpublished sequences from Cape Hallett (situated to the north of Tucker Glacier; **Figure 4**), including one new haplotype, for a total of 20 sequences from this site (**Table 2**; **Tables S2**, **S5**). When haplotypes in the combined dataset were grouped by location, an AMOVA showed that 98.3% of the sequence variation occurred across the Tucker Glacier (**Table 5**; p < 0.01), between the northern Cape Hallett population (n = 20) and the southern group (n = 46) comprised of seven sites across approximately 220 km of coastal terrain (**Figure 4D**). Individuals of F. grisea along that entire coastal area south of the Tucker Glacier were within 0.2% divergence, while the population at Cape Hallett was highly differentiated (9.2% divergence), further supporting a lack of dispersal across the Tucker Glacier, in agreement with our findings for C. cisantarcticus.

#### Cryptopygus terranovus (Wise, 1967)

An analysis of allozymes for Cryptopygus terranovus (previously known as Gressittacantha terranova; Greenslade, 2015) showed northern, central and southern zones of genetic differentiation within the vicinity of Terra Nova Bay that were separated by the Aviator and Campbell Glaciers (**Figure 5A**; Fanciulli et al., 2001). It was also suggested that the population at Apostrophe Island (API; northern zone) was comprised of individuals that had more recently migrated from the central zone. The first COI sequences (n = 4) for this species were collected from near the Mario Zucchelli Station (C. Beard, pers. comm.) and were reported in Stevens et al. (2006b). Hawes et al. (2010) provided 54 sequences and 25 new haplotypes from fine-scale (∼15 km<sup>2</sup>) sampling in the southern zone (Terra Nova Bay). Broader-scale sampling by Carapelli et al. (2017) provided a further 114 sequences from 11 sites, including 38 new haplotypes. Each of these three genetic studies provided additional evidence to support the division of C. terranovus into the three zones as first suggested by Fanciulli et al. (2001) and our AMOVA analyses found that 59.8% of the sequence variation occurred between the three zones (**Table 5**; p < 0.01).

We provide a further five sequences (no new haplotypes) from near Jang Bogo Station in Terra Nova Bay, and O'Kane Glacier (**Table 2**; **Tables S2**, **S6**). We analyzed a total of 173 COI sequences for C. terranovus and identified 68 unique haplotypes from the 577 bp alignment. Sequences were grouped into the four distinct clades that have been previously identified for this species (Hawes et al., 2010; Carapelli et al., 2017). Overall, each clade contained between 20 and 78 sequences, mean sequence divergences within each clade ranged from 0.48 to 2.09%, and mean sequence divergences between clades ranged from 6.9 to 14.7% (**Figure 5**; **Table 4**; **Table S6**).
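Mean divergences within and between clades of the kind reported above are averages of pairwise p-distances over the relevant pairs of sequences. The following Python sketch shows one way such summaries could be computed; the function names and toy sequences are hypothetical, the sequences are assumed to be gap-free and of equal length (consistent with the absence of insertions or deletions noted earlier), and the values in this study were generated in MEGA rather than with this code.

```python
from itertools import combinations
from statistics import mean


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two gap-free aligned sequences."""
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def group_divergences(records):
    """Mean p-distance within each group and between each pair of groups.

    records: list of (group_label, aligned_sequence) pairs.
    Returns (within, between), where `between` is keyed by sorted label pairs.
    """
    within, between = {}, {}
    for (g1, s1), (g2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if g1 == g2:
            within.setdefault(g1, []).append(d)
        else:
            between.setdefault(tuple(sorted((g1, g2))), []).append(d)
    within_means = {g: mean(v) for g, v in within.items()}
    between_means = {pair: mean(v) for pair, v in between.items()}
    return within_means, between_means


# Hypothetical toy data: two clades with two sequences each
toy_records = [
    ("A", "AAAAAAAAAA"),
    ("A", "AAAAAAAAAT"),
    ("B", "TTTTTAAAAA"),
    ("B", "TTTTTAAAAA"),
]
within_means, between_means = group_divergences(toy_records)
print(within_means)
print(between_means)
```

Groups represented by a single sequence have no within-group pairs and are therefore absent from the first dictionary.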


TABLE 5 | Analysis of molecular variance (AMOVA) (Excoffier et al., 1992) results for the seven species of Collembola in Victoria Land, as implemented in R v3.5.1 using the package "poppr" (Kamvar et al., 2014, 2015).

*Sites were grouped into regions according to* Tables 2*,* 3*, while Towle Glacier and Springtail Point were each designated as separate regions for C. nivicolus. Statistical significance of variance components were tested with 16,000 permutations.*

#### Southern Victoria Land Taxa

This region extends from the Drygalski Ice Tongue in the north to the Koettlitz Glacier in the south, and includes the Mackay Glacier region as well as the Victoria Land Dry Valleys including Victoria, Wright, Taylor, Garwood, Marshall, and Miers Valleys (**Figures 1**, **2**). We analyzed a total of 1,023 COI sequences from the three springtail species that are currently known from this region; Antarcticinella monoculata (AMON; n = 33), Cryptopygus nivicolus (CNIV; n = 40), and Gomphiocephalus hodgsoni (GHOD; n = 950). All three species are range-restricted, with genetic differentiation found among different geographic locations. However, patterns of population structure vary among the three species. For example, individuals of G. hodgsoni at Mount Gran (n = 3) showed sequence divergences of 7.6% (Bennett et al., 2016), while individuals of C. nivicolus from Mount Gran (n = 7) were similar to those from other nearby sites such as Mount Seuss and Mount England (Beet et al., 2016). No sites were found with all three species present. Both C. nivicolus and A. monoculata were found at Springtail Point, while C. nivicolus and G. hodgsoni were both found at 5 sites (Towle Glacier, Tiger Island, and Mounts Seuss, Gran and England).

#### Antarcticinella monoculata (Salmon, 1965)

In total, 33 COI sequences were available for A. monoculata (Beet et al., 2016; Bennett et al., 2016) and all specimens were collected from within a 250 km<sup>2</sup> range of the Mackay Glacier (**Figure 6A**; **Table 3**; **Table S7**). Sequences from the two northern-most sites, Mount Murray and Cliff Nunatak, were all the same haplotype (AMON-005) and were divergent from the other four haplotypes that were found at the more southerly sites (10.1% mean p-distance). An AMOVA revealed that 98.7% of the genetic variation occurred between these two geographically isolated populations (**Table 5**; p < 0.01).
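To make the AMOVA result more concrete, the following Python sketch computes a single-level AMOVA (variance among versus within populations) from a matrix of squared pairwise distances. It is a simplified illustration under stated assumptions: the study used the hierarchical implementation in the R package "poppr", whereas the function `phi_st` below is a hypothetical helper that handles only one grouping level and omits the permutation test.

```python
def phi_st(sq_dist, groups):
    """Single-level AMOVA sketch: return (phi_st, var_among, var_within).

    sq_dist : square symmetric matrix (list of lists) of squared distances
              between individuals, e.g. counts of pairwise differences.
    groups  : population label of each individual, in matrix order.
    """
    n_total = len(groups)
    labels = sorted(set(groups))
    n_groups = len(labels)
    if n_groups < 2 or n_total <= n_groups:
        raise ValueError("need at least two groups and within-group replication")

    def ssd(indices):
        # Sum of squared deviations = sum of pairwise squared distances / n.
        total = sum(sq_dist[i][j]
                    for pos, i in enumerate(indices)
                    for j in indices[pos + 1:])
        return total / len(indices)

    members = {g: [i for i, lab in enumerate(groups) if lab == g] for g in labels}
    ssd_total = ssd(list(range(n_total)))
    ssd_within = sum(ssd(idx) for idx in members.values())
    ssd_among = ssd_total - ssd_within

    msd_among = ssd_among / (n_groups - 1)
    msd_within = ssd_within / (n_total - n_groups)
    # Effective group size, which corrects for unequal sample sizes.
    n0 = (n_total - sum(len(idx) ** 2 for idx in members.values()) / n_total) / (n_groups - 1)

    var_within = msd_within
    var_among = (msd_among - msd_within) / n0
    denominator = var_among + var_within
    if denominator == 0:
        raise ValueError("no variation in the data")
    return var_among / denominator, var_among, var_within
```

The returned `phi_st` is the proportion of total variance attributable to differences among groups; a value close to 1 corresponds to a situation such as that described for A. monoculata, where nearly all variation lies between the two isolated populations.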

#### Cryptopygus nivicolus (Salmon, 1965)

The first COI sequences (n = 2) for this species were obtained from Mount England (Stevens et al., 2006b). More recently, Bennett et al. (2016) and Beet et al. (2016) contributed another 32 sequences from an additional 5 sites, showing genetic divergence among habitats.

A further 6 sequences were included as part of our study (**Tables 3**, **4**; **Tables S2**, **S7**). In total, we identified 16 haplotypes from the 40 COI sequences for this species, clustering into three distinct groups (3.3–4.3% mean divergence) corresponding to different geographic locations (**Figure 6B**). Springtail Point and the Towle Glacier site each had unique C. nivicolus sequences that were not found at any other site. The remaining nine haplotypes (n = 27 sequences) occurred across four sites in the vicinity of the Mackay Glacier (Mounts Gran, Seuss, England and Tiger Island; **Figure 6B**), suggesting individuals may have recently dispersed among these sites (within 40 km of each other). Possible explanations for connectivity among these sites include dispersal via meltwater streams or in marine, nearshore environments when sea ice is absent along coastal boundaries. An AMOVA found that when sites at Springtail Point and Towle Glacier were designated as separate regions, only 35% of the genetic variation occurred between each of the three regions (**Table 5**; p < 0.01). However, genetically divergent populations of C. nivicolus were found at Springtail Point and the Towle Glacier site (**Figure 6B**), highlighting the importance of conservation strategies to prevent human-mediated transport of genetic variants between these currently isolated populations.

#### Gomphiocephalus hodgsoni (Carpenter, 1908)

#### **Previous Studies**

The first COI sequences for G. hodgsoni were contributed by a study that also examined allozymes and was focused on phylogeography (Stevens and Hogg, 2003). COI sequences were obtained for 45 individuals from 21 localities, and 14 unique haplotypes were identified (A-N). Among their findings it was suggested that two sympatric populations were present in Taylor Valley (1.5% divergence), and that recent transport (possibly by birds or humans) had occurred between Ross Island and Granite Harbor as similar haplotypes were shared between these locations. Sequences from Stevens and Hogg (2003) were included in their subsequent study (Stevens and Hogg, 2006), where genetically divergent populations were identified at Beaufort Island and in Taylor Valley, when compared to the remaining sites on Ross Island (n = 5 sites) and the continental mainland (n = 11 sites, including Taylor Valley). However, the Beaufort Island haplotype (our haplotype GHOD-010) has since been found at additional sites on the continent (see **Table S9**), while the unique Taylor Valley haplotype (our haplotype GHOD-011) remains restricted to Taylor Valley and belongs to Group Y in Collins and Hogg (2015) and Nolan et al. (2006).

A subsequent study of G. hodgsoni (Nolan et al., 2006) targeted the two sympatric groups in Taylor Valley that had previously been identified by Stevens and Hogg (2003). The eight Taylor Valley sequences from Stevens and Hogg (2003) were also incorporated into their dataset, and 10 unique haplotypes (A–J) were identified from the combined dataset (n = 48). The study concluded that the two sympatric phylogroups (haplotypes A–F = group X; G–J = group Y) probably diverged less than 1 million years ago, around the time when ancient Lake Washburn flooded the Valley (40 kya). Ancestors of the two populations probably survived in isolated refugia at either end of the Valley (group Y inland and group X in higher elevation coastal areas), recolonizing the Valley and reconnecting following the retreat of Lake Washburn.

An additional five COI sequences and two new haplotypes were recovered from GenBank (McGaughran et al., 2008). Unfortunately, we were unable to determine which sequences corresponded to haplotype codes referred to in the manuscript. As these sequences were included in the subsequent study of McGaughran et al. (2010), they have also been included in our study. The McGaughran et al. (2008) study included sequences from Stevens and Hogg (2003) and identified 20 haplotypes (G1-G20) from a total dataset of 96 sequences from the wider McMurdo Dry Valleys and Ross Island area. Divergences of G. hodgsoni COI sequences were up to 2.1%, suggesting the populations had diverged within the last million years (based on molecular clock calibration of 1.5–2.3%/my e.g., Brower, 1994; Juan et al., 1996; Quek et al., 2004). Further, haplotype diversity tended to be higher at more inland sites and particularly at higher altitudes, suggesting this was a result of persistence in refugial habitats. McGaughran et al. (2008) also suggested that Ross Island individuals were derived from a founder population that originated from the Dry Valleys.
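The molecular clock reasoning above can be written out as a short worked calculation. The sketch below assumes a strictly linear clock and treats the cited calibration (1.5–2.3% per million years) as a pairwise divergence rate; the function name is hypothetical and the result is only a rough bracket, not a replacement for the dated analyses in the original studies.

```python
def divergence_time_my(divergence_pct, rate_pct_per_my):
    """Time in million years implied by a pairwise divergence under a linear clock."""
    if rate_pct_per_my <= 0:
        raise ValueError("rate must be positive")
    return divergence_pct / rate_pct_per_my

max_divergence = 2.1             # % (maximum reported for G. hodgsoni COI)
slow_rate, fast_rate = 1.5, 2.3  # % per million years (cited calibration range)

youngest = divergence_time_my(max_divergence, fast_rate)
oldest = divergence_time_my(max_divergence, slow_rate)
print(f"{youngest:.2f}-{oldest:.2f} million years")  # prints "0.91-1.40 million years"
```

Under these assumptions the maximum observed divergence corresponds to roughly 0.9–1.4 million years, so divergences below the maximum fall within approximately the last million years.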

McGaughran et al. (2010) compared patterns of sequence divergences of G. hodgsoni to those of Cryptopygus antarcticus from the Antarctic Peninsula and provided COI and COII sequences for both taxa. Existing G. hodgsoni COI sequences from Stevens and Hogg (2003) and McGaughran et al. (2008) were also included, and 45 haplotypes (H1–H45) were identified from their final alignment of 289 individuals (471 bp). In reviewing these sequences, we were able to identify exact site information for all but 31 specimens (see **Table S9**). Key findings from McGaughran et al. (2010) included limited haplotype sharing between local sites within both the Peninsula and continental areas, and that survival in refugial habitats was likely to have occurred on a Pleistocene timescale. However, the Peninsula contained more rare haplotypes and it was suggested that differences in population structure were a result of landscape differences and colonization events between the two localities. Greenslade et al. (2011) provided three additional COI sequences (658 bp) for the existing haplotype H43 from McGaughran et al. (2010) (our haplotype GHOD-046) in a study which confirmed Gomphiocephalus as a distinct genus.

A further 90 G. hodgsoni COI sequences were contributed by Demetras (2010) from the southern Dry Valleys area (Marshall, Garwood and Miers Valleys) with limited diversity found (10 haplotypes TABS Gh1–Gh10; <0.8% divergence). Based on a haplotype network analysis, it was suggested that all 10 haplotypes were derived from a single lineage (300–500 kya). Possible reasons for the low diversity in G. hodgsoni from the southern Dry Valleys include more recent recolonization of this area or bottleneck events and differences in landscapes as compared to the larger Dry Valleys (e.g., Taylor) where there is much higher COI diversity. As the data from Demetras (2010) were previously unpublished, we have now uploaded them to BOLD and added them to our analyses.

Collins and Hogg (2015) provided a further 151 G. hodgsoni COI sequences as part of a two-hourly time-series of pitfall trap collections from Spaulding Pond in Taylor Valley. This study targeted the two main "X" and "Y" haplotype groupings previously reported by Nolan et al. (2006), known to occur in sympatry at Spaulding Pond (Gh1-Gh12 = Y; Gh13-Gh19 = X). More individuals were found from group Y (n = 120) relative to group X (n = 31). As group Y is thought to have recolonized from inland "colder" refugial sites, this study inferred that the site was dominated by intrinsically cold-adapted individuals, and that the relative proportions of X and Y individuals could change with a warming climate. Furthermore, activity of individuals from the X and Y haplotype lineages during each two-hourly pitfall trap collection was more closely linked to air temperature than any of the other measured environmental variables. This highlights the potential for temporal shifts in the genetic structure of populations as a consequence of environmental changes.

An additional 67 sequences (658 bp; 16 haplotypes, 8 new) were contributed by Bennett et al. (2016), along with sequences from A. monoculata and C. nivicolus. This study was focused on past isolation events using a molecular clock analysis. They concluded that isolation was likely to have occurred during the last 4 million years and that glaciation events since that time have further contributed to the high COI diversity within both G. hodgsoni and C. nivicolus. In particular, specimens of G. hodgsoni from Mount Gran (our haplotype GHOD-071) were found to be highly divergent (>7%) from the remaining sampled dataset. In a study focused on genetic diversity of Collembola from sites in the vicinity of Mackay Glacier and further north, Beet et al. (2016) contributed 66 new sequences (527 bp; 5 new haplotypes). Sequences from Bennett et al. (2016) were also included in Beet et al. (2016) analyses and, based on the combined dataset, it was hypothesized that populations may have been isolated for 3–5 million years, consistent with a collapse and reformation of the West Antarctic Ice Sheet.

#### **Our Findings**

As the most widespread and well-studied springtail in the Ross Sea region, the dataset of COI sequences for G. hodgsoni (n = 950) was larger than that of any other species and represented specimens from 67 sites (**Tables 3**, **4**; **Tables S8, S9**). Overall, 620 sequences were available from nine published studies as outlined above, and a further 330 previously unpublished sequences have now been added here (**Table S2**). The majority of sequences were <2.6% divergent, while distinct populations were present at Towle Glacier (n = 6; 5.9% divergence) and Mount Gran (n = 3; 7.3% divergence) which are located further inland (**Figure 7**). Similarly, genetically divergent populations of C. nivicolus were present at the inland sites Towle Glacier and Springtail Point, suggesting that greater connectivity occurs among coastal sites. Overall, an AMOVA revealed that the majority (>60%) of haplotype diversity occurred within regions (**Table 5**; p < 0.01), particularly as the highly diverged individuals from sites Towle Glacier and Mount Gran were sequenced at low numbers.

Tripp Island is approximately 70 km from other sites and located to the north of Mawson Glacier, which appeared to be a dispersal barrier for A. monoculata (>10% divergence). However, individuals of G. hodgsoni from Tripp Island were not genetically divergent (<0.5%) relative to other locations. In contrast, individuals of G. hodgsoni at Mount Gran, <10 km from other sites (**Figure 7**), were highly genetically divergent (>7%). These findings highlight that geographical barriers to dispersal such as glaciers, together with a lack of water transport between locations (water transport being unlikely for sites further inland), have a stronger influence on the population genetic structures of Antarctic springtails than distance alone.

# CONCLUDING DISCUSSION

In this study we examined the available COI sequence data for all seven of the Collembola species that occur within Victoria Land (71° to 78.5° S). All species harbored distinct lineages (1.4–14.7% mean sequence divergences) that were isolated by geological or glaciological features, rather than by distance alone. Such levels of COI diversity are suggestive of long-term isolation and lack of dispersal among locations.

In several cases, the distribution of genetic variants among sites differed for the different species. Indeed, differing distributional patterns were previously reported in a large-scale study of the three species C. terranovus, F. grisea, and G. hodgsoni (Caruso et al., 2009). Here, we showed that sequence divergence for F. grisea (0.2%) and C. terranovus (10.8%) differed greatly among the same sites in northern Victoria Land. As a possible explanation, F. grisea may have more recently dispersed throughout this area whereas C. terranovus may have remained isolated by the Aviator and Campbell Glaciers (e.g., Fanciulli et al., 2001; Carapelli et al., 2017). In southern Victoria Land (SVL), Springtail Point harbored a distinct lineage (>3.3% divergence) of C. nivicolus whereas A. monoculata from Springtail Point was highly similar (<0.9% divergence) to individuals from nearby sites.

FIGURE 7 | Geographical distribution of the three distinct (5.9–7.3% mean divergence) lineages of *Gomphiocephalus hodgsoni* within southern Victoria Land for specimens where collection data were available (GHOD; *n* = 904; 422 bp), showing widespread distribution (500 km reach) of the main group of haplotypes (*n* = 895; 2.6% mean divergence) and distinct populations at both Mount Gran and Towle Glacier sites, further inland. Pie charts are proportionate to the number of sequences, centered at collection site co-ordinates. See Tables S8, S9 for further haplotype and collection details for each specimen.

Similarly, the Mount Gran population of G. hodgsoni was highly distinct (7.3% divergence) whereas sequences of C. nivicolus from Mount Gran were identical to those from other sites. Populations to the north of Mawson Glacier were genetically distinct for A. monoculata (>10% sequence divergence; Cliff Nunatak and Mount Murray) while the G. hodgsoni population (Tripp Island) was genetically similar (0.5%), even though these sites are geographically isolated from other sites by >70 km (**Figures 2**, **6**, **7**).

Given the high levels of genetic divergence among populations for all seven springtail species (1.7–14.7%), we highlight the potential that these distinct lineages could be cryptic species. Cryptic diversity has been suggested for C. terranovus in NVL (Hawes et al., 2010; Carapelli et al., 2017) and G. hodgsoni, C. nivicolus, and A. monoculata in SVL (Beet et al., 2016). Suggestions of cryptic diversity for the springtail F. grisea between NVL and the Antarctic Peninsula (Torricelli et al., 2010a,b) have now been validated through morphological differences and redescription of the species (Greenslade, 2018). However, the potential for cryptic variability among Victoria Land specimens of F. grisea has not previously been suggested, although the Cape Hallett population has been identified as genetically distinct, possibly resulting from pre-Pleistocene isolation (Torricelli et al., 2010b).

The Tucker Glacier in NVL provides an example of a small-scale (<16 km) barrier to springtail dispersal as genetically distinct (>5% sequence divergence) populations of both C. cisantarcticus and F. grisea have been found either side of the glacier. Previous studies using the COII gene have also demonstrated this for K. klovstadi (Frati et al., 2001; Stevens et al., 2007), suggesting the potential for microspeciation processes occurring at these sites. In SVL, we show that the phylogeographic patterns differed for each of the three species, and we highlight Springtail Point, Towle Glacier, Mount Gran and sites to the north of Mawson Glacier as possible sites of conservation focus, as these locations harbored genetically divergent (3.3–10.1%) populations. The vicinity of the Drygalski Ice Tongue continues to provide a current barrier to dispersal for springtail taxa between NVL and SVL and we found no evidence of gene flow or species sharing between these regions.

Future studies focused on longer COI sequences (or indeed, additional genetic markers), are likely to reveal further genetic diversity for springtails in Victoria Land. The potential for temporal changes in genetic diversity as a consequence of environmental changes such as climate warming has also been highlighted. Next generation sequencing and metabarcoding approaches for genomic monitoring will be important for future studies, particularly to detect spatial and temporal shifts in genetic diversity. With a warming climate, future glacier melt will provide additional opportunities for the dispersal of springtails (via flotation) while also exposing new habitats for colonization.

The Antarctic Conservation Biogeographic Regions (Terauds et al., 2012; Terauds and Lee, 2016) provide a geographical framework for assessing broad-scale biological diversity. However, we suggest that knowledge of local-scale patterns of genetic diversity will be critical for addressing ecological and physiological questions, such as those pertaining to dispersal, and responses to climate changes. The understanding of relevant temporal and spatial scales as well as the current distribution of genetic diversity is essential to identify current barriers to dispersal as well as sites that harbor unique genetic resources and, therefore, areas of conservation value.

#### REFERENCES

Adams, B. J., Bardgett, R. D., Ayres, E., Wall, D. H., Aislabie, J., Bamforth, S., et al. (2006). Diversity and distribution of Victoria Land biota. Soil Biol. Biochem. 38, 3003–3018. doi: 10.1016/j.soilbio.2006.04.030

#### DATA AVAILABILITY

All previously unpublished sequences and associated data are available in the dataset "DSANTSP18" on BOLD. GenBank accession numbers and BOLD IDs are provided in supplementary data sheets for all sequences included in this manuscript. Original trace files can be found associated with specimens in the BOLD dataset, where available. For any additional information, contact GC: gec9@students.waikato.ac.nz/.

#### AUTHOR CONTRIBUTIONS

GC and IH conceived the study. IH collected new specimens from NVL, and processed these with GC. GC and IH collected and processed additional specimens from SVL. All alignments and analyses were performed by GC, with assistance from AB using R. The manuscript was prepared by GC, with contributions from all co-authors.

#### ACKNOWLEDGMENTS

We acknowledge support from a Waikato University Doctoral Scholarship, a Waikato Graduate Women Merit Award for Doctoral Study and an Antarctica New Zealand Postgraduate Scholarship (Sir Robin Irvine) to GC. IH is grateful to Antarctica New Zealand, Craig Cary and the Korean Polar Research Institute (KOPRI) for arranging/providing logistic support for field work in NVL, and to Polar Knowledge Canada for financial support. PC was supported by NERC core funding to the British Antarctic Survey's 'Biodiversity, Evolution and Adaptation' Team. We are also grateful for support provided to the Canadian Centre for DNA Barcoding from Genome Canada and the Ontario Genomics Institute as part of the International Barcode of Life Project and thank the Ontario Ministry of Economic Development and Innovation for support of the BOLD database. This paper contributes to the 'State of the Antarctic Ecosystem' (AntEco) and the Antarctic Thresholds, Ecosystems Resilience and Adaptation (AnT-ERA) programmes of SCAR. The satellite imagery used to generate **Figures 1**, **2** was retrieved from GloVis, courtesy of the NASA EOSDIS Land Processes Distributed Active Archive Center (LP DAAC), USGS/Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota (https://lpdaac.usgs.gov/data\_access/glovis).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2019.00076/full#supplementary-material

Becker, R., Wilks, A., and Brownrigg, R. (2016). Mapdata: Extra Map Databases. R package version 2.2–6.

Beet, C. R., Hogg, I. D., Collins, G. E., Cowan, D. A., Wall, D. H., and Adams, B. J. (2016). Genetic diversity among populations of Antarctic springtails (Collembola) within the Mackay Glacier ecotone. Genome 59, 762–770. doi: 10.1139/gen-2015-0194


Antarctica: evidence for population differentiation. Polar Biol. 24, 934–940. doi: 10.1007/s003000100302

Freckman, D. W., and Virginia, R. A. (1993). Extraction of nematodes from dry valley Antarctic soils. Polar Biol. 13, 483–487. doi: 10.1007/BF00233139

Gerritsen, H. (2014). Mapplots: Data Visualisation on Maps. R package version 1.5.


Victoria Land, Antarctica. Mol. Phylogenet. Evol. 46, 606–618. doi: 10.1016/j.ympev.2007.10.003


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, BV, declared past collaborations with one of the authors, PC, to the handling Editor.

Copyright © 2019 Collins, Hogg, Convey, Barnes and McDonald. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Actinobacteria and Cyanobacteria Diversity in Terrestrial Antarctic Microenvironments Evaluated by Culture-Dependent and Independent Methods

Adriana Rego<sup>1,2</sup>, Francisco Raio<sup>1</sup>, Teresa P. Martins<sup>1</sup>, Hugo Ribeiro<sup>1,2</sup>, António G. G. Sousa<sup>1</sup>, Joana Séneca<sup>1</sup>, Mafalda S. Baptista<sup>1,3</sup>, Charles K. Lee<sup>3,4</sup>, S. Craig Cary<sup>3,4</sup>, Vitor Ramos<sup>1†</sup>, Maria F. Carvalho<sup>1</sup>, Pedro N. Leão<sup>1</sup> and Catarina Magalhães<sup>1,5\*</sup>

#### Edited by:

Peter Convey, British Antarctic Survey (BAS), United Kingdom

#### Reviewed by:

Kimberley Warren-Rhodes, Ames Research Center, United States Geok Yuan Annie Tan, University of Malaya, Malaysia

\*Correspondence:

Catarina Magalhães cmagalhaes@ciimar.up.pt; catarinamagalhaes1972@gmail.com

#### †Present address:

Vitor Ramos, Instituto Politécnico de Bragança (IPB-CIMO), Campus de Santa Apolónia, Bragança, Portugal

#### Specialty section:

This article was submitted to Extreme Microbiology, a section of the journal Frontiers in Microbiology

Received: 23 August 2018 Accepted: 24 April 2019 Published: 31 May 2019

#### Citation:

Rego A, Raio F, Martins TP, Ribeiro H, Sousa AGG, Séneca J, Baptista MS, Lee CK, Cary SC, Ramos V, Carvalho MF, Leão PN and Magalhães C (2019) Actinobacteria and Cyanobacteria Diversity in Terrestrial Antarctic Microenvironments Evaluated by Culture-Dependent and Independent Methods. Front. Microbiol. 10:1018. doi: 10.3389/fmicb.2019.01018

<sup>1</sup> Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Porto, Portugal, <sup>2</sup> Institute of Biomedical Sciences Abel Salazar (ICBAS), University of Porto, Porto, Portugal, <sup>3</sup> International Centre for Terrestrial Antarctic Research, University of Waikato, Hamilton, New Zealand, <sup>4</sup> School of Science, University of Waikato, Hamilton, New Zealand, <sup>5</sup> Faculty of Sciences, University of Porto, Porto, Portugal

Bacterial diversity from McMurdo Dry Valleys in Antarctica, the coldest desert on earth, has become more easily assessed with the development of High Throughput Sequencing (HTS) techniques. However, some of the diversity remains inaccessible by the power of sequencing. In this study, we combine cultivation and HTS techniques to survey actinobacteria and cyanobacteria diversity along different soil and endolithic micro-environments of Victoria Valley in McMurdo Dry Valleys. Our results demonstrate that the Dry Valleys actinobacteria and cyanobacteria distribution is driven by environmental forces, in particular the effect of water availability and endolithic environments clearly conditioned the distribution of those communities. Data derived from HTS show that the percentage of cyanobacteria decreases from about 20% in the sample closest to the water source to negligible values on the last three samples of the transect with less water availability. Inversely, actinobacteria relative abundance increases from about 20% in wet soils to over 50% in the driest samples. Over 30% of the total HTS data set was composed of actinobacterial strains, mainly distributed by 5 families: Sporichthyaceae, Euzebyaceae, Patulibacteraceae, Nocardioidaceae, and Rubrobacteraceae. However, the 11 actinobacterial strains isolated in this study, belonged to Micrococcaceae and Dermacoccaceae families that were underrepresented in the HTS data set. A total of 10 cyanobacterial strains from the order Synechococcales were also isolated, distributed by 4 different genera (Nodosilinea, Leptolyngbya, Pectolyngbya, and Acaryochloris-like). In agreement with the cultivation results, Leptolyngbya was identified as dominant genus in the HTS data set. Acaryochloris-like cyanobacteria were found exclusively in the endolithic sample and represented 44% of the total 16S rRNA sequences, although despite our efforts we were not able to properly isolate any strain from this Acaryochloris-related group. 
The importance of combining cultivation and sequencing techniques is highlighted, as we have shown that culture-dependent methods employed in this study were able to retrieve actinobacteria and cyanobacteria taxa that were not detected in the HTS data set, suggesting that the combination of both strategies can be useful to recover both abundant and rare members of the communities.

Keywords: actinobacteria, McMurdo Dry Valleys, Antarctic soil, bacteria diversity, bacterial cultivability, endolithic microbiota, Antarctic microenvironments, cyanobacteria

# INTRODUCTION


The continent of Antarctica comprises about 0.34% ice-free areas (Convey, 2011) characterized by extreme cold and dry conditions (Wierzchos et al., 2004). In the McMurdo Dry Valleys (henceforth Dry Valleys), the largest ice-free region of the Antarctic continent (Pointing et al., 2009) and the coldest and driest desert on earth (Wood et al., 2008), the environmental stresses range from high variations in temperature (Doran et al., 2002), low nutrient availability and soil moisture (Buelow et al., 2016) to high ultraviolet solar radiation incidence (Perera et al., 2018).

Under such constraints, embracing the limits of physiological adaptability, microorganisms developed specialized strategies to survive, such as the colonization of edaphic and endolithic microenvironments (Walker and Pace, 2007), the entry into dormancy states (Goordial et al., 2017) and the biosynthesis of secondary metabolites (Wilson and Brimble, 2009; Tian et al., 2017).

Oligotrophic soils from the Dry Valleys are considered to be microbiologically distinct from all other soils worldwide (Fierer et al., 2012b) and High Throughput Sequencing (HTS) studies have proved their bacterial diversity is much larger than previously thought (Lee et al., 2012; Wei et al., 2016).

The Antarctica Dry Valleys soils are usually dominated by Actinobacteria, the prevalent phylum in cold arid soils (Pointing et al., 2009; Van Goethem et al., 2016; Goordial et al., 2017). Although the molecular basis behind actinobacteria dominance in cryoenvironments is still unknown (Goordial et al., 2015), metabolic activity at subzero temperatures has been detected (Soina et al., 2004). The formation of spores allows the survival in desert-like habitats (Mohammadipanah and Wink, 2016) and cyst-like resting forms have been described for non-sporulating actinobacteria species (Soina et al., 2004).

In addition to soil environments, rocky niches are of particular relevance in the Dry Valleys ecosystems since they provide protection to different biota from harsh environmental conditions such as intense solar radiation exposure, temperature fluctuations, wind, and desiccation (Cockell and Stokes, 2004; Walker and Pace, 2007). These rock-inhabiting organisms are very important in the Dry Valleys because of the extent of exposed rock surface, and thus account for a large share of the productivity and biomass in this system (Omelon, 2008; Pointing and Belnap, 2012). Dry Valleys hypolithic and endolithic communities are often dominated by cyanobacteria (Omelon, 2008; Chan et al., 2012; Van Goethem et al., 2016). Indeed, as primary colonizers after the retreat of glaciers, cyanobacteria are leading components of the Dry Valleys ecosystem, enabling colonization by other microorganisms (Vincent, 2000). They are well adapted to the stress of desiccation and, although most Dry Valleys soils lack any visible cyanobacterial growth, their presence was detected through HTS even in low-moisture soil samples (Wood et al., 2008).

In fact, HTS techniques have revolutionized the traditional biodiversity studies based on culturing approaches and opened an array of new opportunities to explore previously hard-to-access environments, such as extreme environments. These approaches can be used to study both cultured and uncultured diversity, being of particular relevance in these types of environments, where the unique environmental conditions are hard to mimic in the laboratory. Cultivation-independent studies have revealed that in the Dry Valleys, abiotic factors drive the diversity and structure of microbial communities (Pointing et al., 2009; Lee et al., 2012; Magalhães et al., 2012). Aridity seems to be the most preponderant factor shaping bacterial communities, not only in the Dry Valleys but across multiple deserts (Kastovská et al., 2005; Pointing et al., 2007; Wood et al., 2008). The type of habitat (lithic vs. soil) also has a preponderant role in shaping the bacterial community composition as shown for Antarctic and other hyper-arid deserts (Azúa-Bustos et al., 2011; Stomeo et al., 2013; Makhalanyane et al., 2015).

Although HTS techniques promised to be able to replace bacterial culturing (Venter et al., 2004), cultivation techniques are still necessary to improve taxonomic resolution and even diversity coverage (Lagier et al., 2015; Choi et al., 2017; Ramos et al., 2017), as well as to recover whole genomes or allow physiological and metabolic studies (Sørensen et al., 2002; Ramos et al., 2018; Lambrechts et al., 2019). The information provided by the bacterial isolates can further help to clarify cultivation requirements and to develop directed cultivation techniques leading to potential novel discoveries (Ramos et al., 2018; Tahon et al., 2018). Antarctic bacteria, by evolving in such extreme conditions, are particularly interesting in terms of biotechnological applications, including for bioremediation (Gran-Scheuch et al., 2017; Lee et al., 2018), antimicrobials (Gesheva and Vasileva-Tonkova, 2012; Tedesco et al., 2016) and production of anti-freezing molecules (Muñoz et al., 2017).

Combined approaches encompassing culture-dependent and independent techniques to retrieve a broader bacterial diversity have been employed for Antarctic studies (Babalola et al., 2009; Aislabie et al., 2013). However, only a fraction of the taxa recovered overlapped, highlighting the complementarity of both approaches. In a comprehensive study, Lambrechts et al. (2019) have shown that over 85% of Antarctic soil bacterial sequences available in databases still belong to uncultured genera or higher taxonomic level. Even for extensively studied phyla, such as actinobacteria, there are reports of uncultured phylotypes (Smith et al., 2006; Babalola et al., 2009).

By mimicking natural conditions, novel micro-culturing techniques, including soil substrate membrane system (SSMS) (Pudasaini et al., 2017) and extended incubation times (Tahon and Willems, 2017) have reduced the gap between the cultured and uncultured approaches. Several Antarctic studies have successfully retrieved novel genera and families, including first isolates of novel taxa of recalcitrant bacteria, by employing novel cultivation approaches (van Dorst et al., 2016; Pulschen et al., 2017; Tahon and Willems, 2017; Tahon et al., 2018). In the present study, we combine HTS and microbial cultivation techniques as culture-independent and dependent approaches to survey actinobacteria and cyanobacteria diversity along different soil and endolithic micro-environments of Victoria Valley, one of the Dry Valleys. Novel cultivation techniques, previously fruitful in retrieving novel and recalcitrant taxa from Antarctic and other deserts, were included. A comprehensive insight of the main factors shaping endolithic and edaphic microbial communities, such as moisture levels and pH is addressed.

# MATERIALS AND METHODS

#### Samples Location and Collection

Substrate from a rock with endolithic colonization (END) and from a soil transect with a gradient of water availability was collected in Victoria Valley during the K020 Mission in January 2013, integrated in the NZTABS international program. For the transect, a total of six sites (T1 to T6) were sampled along a 32 m transect with increasing distance from a water pond near the main water source in the Victoria Valley – Lake Vida (**Figure 1** and **Table 1**). Several scoops of soil were collected aseptically and stored in a sterile Whirl-Pak. All samples were kept at −30°C in Lifeguard solution (MoBio) during the first week after sampling and then at −80°C until further analysis.

## Physicochemical Analysis of Transect Soil Samples

Water activity (Aw) was measured in situ at all sampling points using a portable water activity analyzer (PaWKit AquaLab, Decagon). Conductivity and pH were also measured in all soil samples using a CyberScan PC 510 Bench Meter (Eutech Instruments) following the slurry technique, which consists of mixing sample and de-ionized water at a 1:2.5 mass ratio (Edmeades et al., 1985).



#### Bacterial Community Diversity Analysis

Environmental DNA (eDNA) was extracted using a modification of the CTAB extraction protocol (Barrett et al., 2006). The 16S rRNA gene was amplified by PCR using the universal primer pair 27F/1492R (Weisburg et al., 1991) and then sequenced by pyrosequencing technology. Briefly, the 16S rRNA gene was amplified for the V3-V4 hypervariable region with barcoded fusion primers containing the Roche-454 A and B Titanium sequencing adapters, an eight-base barcode sequence, the forward (5′-ACTCCTACGGGAGGCAG-3′) and reverse (5′-TACNVRRGTHTCTAATYC-3′) primers (Wang and Qian, 2009). The PCR reaction was performed using 5 U of Advantage Taq polymerase (Clontech), 0.2 µM of each primer, 0.2 mM dNTPs, 6% DMSO and 2–3 µL of template DNA. The PCR conditions employed were: initial denaturation step at 94°C for 3 min, followed by 25 cycles of 94°C for 30 s, 44°C for 45 s and 68°C for 60 s and a final elongation step at 68°C for 10 min. The amplicons were quantified by fluorimetry with PicoGreen (Invitrogen), pooled at equimolar concentrations and sequenced in the A direction with GS 454 FLX Titanium chemistry, according to the manufacturer's instructions (Roche, 454 Life Sciences) at Biocant (Cantanhede, Portugal). The 454 machine-generated FASTA (.fna) and quality score (.qual) files were processed using the QIIME (Quantitative Insights Into Microbial Ecology) pipeline (Caporaso et al., 2010). Initially, raw reads were demultiplexed and subjected to quality filtering, in which sequences with a quality score below 25 were removed. The next step, picking OTUs (Operational Taxonomic Units) (Seath et al., 1973), was performed in parallel with three different workflows: pick\_de\_novo\_otus.py (de novo method), pick\_otus.py (closed-reference method) and pick\_open\_reference\_otus.py (open-reference method). The OTU table obtained from the open-reference method was selected for the downstream analyses.
Essentially, all sequences were clustered into OTUs at 97% sequence similarity using UCLUST (Edgar, 2010) and the reads were aligned to the Greengenes v13\_8 (GG) (DeSantis et al., 2006) database using PyNAST. For the taxonomic assignment, the RDP Classifier 2.2 (Wang et al., 2007) was used with the UCLUST method. Alpha diversity metrics were calculated for each sample, and beta diversity was assessed using weighted and unweighted UniFrac metrics (Lozupone and Knight, 2005). The R packages phyloseq (McMurdie and Holmes, 2013) and ggplot2 (Wickham, 2009) were used for downstream analysis and visualization, including alpha diversity calculations and relative and total abundance taxonomy summary charts.
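
As a rough illustration of the alpha diversity and relative abundance summaries mentioned above, the following Python sketch computes the number of observed OTUs, the Shannon index and relative abundances from the read counts of a single sample. It is not the phyloseq or QIIME implementation: the function names and the example counts are hypothetical, and the natural logarithm is assumed for the Shannon index.

```python
import math
from typing import List, Sequence


def observed_otus(counts: Sequence[int]) -> int:
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts: Sequence[int]) -> float:
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over OTUs with reads."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def relative_abundance(counts: Sequence[int]) -> List[float]:
    """Fraction of the sample's reads assigned to each OTU."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


if __name__ == "__main__":
    # Hypothetical read counts for four OTUs in one sample (not study data).
    sample_counts = [50, 30, 20, 0]
    print(observed_otus(sample_counts))
    print(round(shannon_index(sample_counts), 3))
    print(relative_abundance(sample_counts))
```

Because both metrics depend on sequencing depth, comparisons between samples with very different read numbers are usually made after normalizing to a common depth.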

#### Isolation of Actinobacteria and Cyanobacteria

Some of the environmental samples that were collected in Victoria Valley were also used for bacterial isolation, based on their composition as estimated by HTS. Samples T5 and T6 from soil transect were selected for the isolation of actinobacteria, while endolithic and transect samples T1 and T3 were used for the isolation of cyanobacteria. In addition, sample T6 was also used for attempting to isolate low-abundance cyanobacteria in an actinobacteria-dominated sample. It is known that the cultivable fraction of the microbial richness is typically below 1% (Epstein, 2013), so in order to improve the cultivability and maximize the recovery of microbial strains from the samples, different culture strategies – including pre-treatments – were employed as described below.

## Culture Strategies for the Isolation of Actinobacteria

For the transect sample T6, 0.5 g of the original sample (soil) was weighed under sterile conditions and 5 mL of sterile saline solution (0.85% NaCl at 4°C) was added to resuspend the sample. The suspension was vortexed for 10 min and allowed to settle for 2 min in an ice bath. Sequential dilutions (down to a dilution factor of 10<sup>−2</sup>) of the supernatant were performed, inoculated (in duplicate) on solid media (as described below) and incubated at three different temperatures (4, 9 and 19°C). All the media were supplemented with 50 mg/L of cycloheximide (BioChemica) and streptomycin (BioChemica) to inhibit the growth of fungi or other eukaryotes and Gram-negative bacteria, respectively. The dilutions were plated onto an oligotrophic medium – nutrient-poor sediment extract (NPS) – initially made with an extract from the original Antarctic soil sample and then with sand collected from a beach in northern Portugal (Francelos Beach, Vila Nova de Gaia, Portugal), to simulate the oligotrophic environmental conditions. Previous works have indicated that soil-extract agar is able to retrieve a wider and more diverse range of biodiversity when compared to traditional media (Hamaki et al., 2005). Briefly, ca. 500 g of substrate was mixed with 500 mL of distilled water, homogenized and allowed to settle. For medium preparation, 100 mL of the supernatant solution was mixed with 900 mL of distilled water and 17 g of bacteriological agar. Colonies obtained were then streaked on the same medium and on richer media, in order to investigate which one could yield higher biomass. The richer media used were: modified nutrient-poor sediment extract (MNPS): 5 g/L soluble starch, 1 g/L potassium nitrate, 100 mL/L substrate extract and 17 g/L agar; International Streptomyces Project medium 2 (ISP2) (Shirling and Gottlieb, 1966) and raffinose histidine agar (RH) (Vickers et al., 1984). 
Bacterial colonies were successively streaked until pure colonies were achieved.
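
The dilution arithmetic behind this plating scheme can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the 0.5 g of soil and 5 mL of saline come from the text, whereas the plated volume and the colony count are hypothetical values, and the volume contributed by the soil itself is ignored.

```python
def soil_g_per_ml(soil_g: float, diluent_ml: float, dilution_factor: float) -> float:
    """Soil equivalent (g) per mL of suspension at a given dilution factor.

    dilution_factor is 1.0 for the undiluted suspension, 1e-1 for the first
    ten-fold dilution, 1e-2 for the second, and so on.
    """
    return soil_g / diluent_ml * dilution_factor


def cfu_per_gram(colonies: int, plated_ml: float, soil_g: float,
                 diluent_ml: float, dilution_factor: float) -> float:
    """Back-calculate colony-forming units per gram of soil from a plate count."""
    grams_on_plate = soil_g_per_ml(soil_g, diluent_ml, dilution_factor) * plated_ml
    return colonies / grams_on_plate


if __name__ == "__main__":
    # Values from the text: 0.5 g soil resuspended in 5 mL saline,
    # diluted down to a factor of 1e-2.
    for factor in (1.0, 1e-1, 1e-2):
        print(factor, soil_g_per_ml(0.5, 5.0, factor))
    # Hypothetical plate: 25 colonies from 0.1 mL of the 1e-2 dilution.
    print(cfu_per_gram(25, 0.1, 0.5, 5.0, 1e-2))
```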

For the transect sample T5, a more selective approach was used. The soil sample (0.5 g) was weighed under sterile conditions and 2.5 mL of sterile saline solution was added to resuspend the sample. The suspension was treated with ultrasound for 1 min and vortexed for 5 min. The sample was then subjected to two different pre-treatments to maximize the selection of sporulating actinobacteria: (1) heat shock, which consisted of incubating 1 mL of the suspension at 50°C for 5 min, and (2) incubation with antibiotics (Hameş-Kocabaş and Uzel, 2012), which consisted of incubating 1 mL of the suspension with 20 mg/L of streptomycin (BioChemica) and nalidixic acid (BioChemica) at 28°C for 30 min. For each pre-treatment, serial dilutions (down to a dilution factor of 10<sup>−2</sup>) were performed and plated onto different media selective for actinobacteria: Actinomycete Isolation Agar (AIA): sodium caseinate 2 g/L, L-asparagine 0.1 g/L, sodium propionate 4 g/L, dipotassium phosphate; Czapek agar (Axenov-Gribanov et al., 2016) and Starch Casein Nitrate Agar (SCN): 10 g/L soluble starch, 0.3 g/L casein sodium salt from bovine milk, 2.62 g/L potassium phosphate dibasic trihydrate, 2 g/L potassium nitrate, 2 g/L sodium chloride, 0.05 g/L magnesium sulfate heptahydrate, 0.02 g/L calcium carbonate, 0.01 g/L iron(II) sulfate heptahydrate. The plates were incubated at 4 and 28°C. The bacterial colonies grown on the plates were streaked on the same isolation media until pure colonies were obtained.

### Culture Strategies for the Isolation of Cyanobacteria

The endolithic sample (END) and samples T1, T3, and T6 of the soil transect, all preserved at −80°C in Lifeguard solution, were used for cyanobacterial isolation. Before inoculation, samples were washed by centrifugation at 4500 × g for 3 min, removal of the supernatant, resuspension of the pellet in BG11<sub>0</sub> medium [without nitrogen source (Rippka, 1988)], brief agitation of the suspension, a second centrifugation and discarding of the supernatant. Inoculation was carried out by adding equal parts of the pelleted samples to (a) 100 mL glass Erlenmeyer flasks with liquid medium [BG11<sub>0</sub> and Z8 (Kotai, 1972)] and (b) solid (Z8 and BG11<sub>0</sub>) agar plates, which were incubated at 19°C under a 12:12 h light (20–30 µmol m<sup>−2</sup> s<sup>−1</sup> photon irradiance):dark cycle. Plates were prepared with 1.5% agarose and supplemented with 0.5% of cycloheximide (5 mL/L), to prevent the growth of eukaryotic microorganisms. When visible growth was detected in the liquid/solid media, aliquots were transferred and streaked onto solid Z8 or BG11<sub>0</sub> medium plates. Single colonies were selected and re-streaked aseptically onto fresh Z8 and/or BG11<sub>0</sub> medium plates. The procedure was repeated until isolation was achieved (Rippka, 1988). Cyanobacterial isolates were visually inspected under a microscope (Leica DMLB) and then transferred to Z8, both in liquid and solid (agar) medium. Isolates are maintained in the LEGE Culture Collection (Ramos et al., 2018).

## Identification of Bacterial Isolates Through 16S rRNA Gene Sequence Amplification

After isolation of pure cultures, each bacterial isolate was grown in 10 mL of liquid medium (composition according to the medium from which the isolate was retrieved) in 50 mL Falcon tubes or 100 mL Erlenmeyer flasks, until enough biomass was obtained to extract DNA. DNA was extracted using the E.Z.N.A.® Bacterial DNA Kit (OMEGA bio-tek) and the Purelink Genomic DNA Mini Kit – Gram-negative bacterial cell protocol (Invitrogen), for bacterial and cyanobacterial strains, respectively. The manufacturer's instructions were followed, and DNA was eluted in 100 µL of elution buffer. The integrity of the gDNA was assessed by agarose gel electrophoresis (0.8% agarose gel prepared in 1× TAE buffer, stained with 1 µL of SYBR® Safe DNA Gel Stain from Thermo Fisher Scientific). One microliter of DNA (with loading dye) was loaded onto each lane before electrophoresis at 80 V for 30 min. The 16S rRNA gene was amplified by PCR in a Veriti® 96-Well Thermal Cycler (Thermo Fisher Scientific) using primer pair 27F/1492R (Weisburg et al., 1991) (1465 bp) for bacteria and primers CYA106F, CYA359F, CYA781R (Nubel et al., 1997) and 1492R (Weisburg et al., 1991) (combinations CYA106F – CYA781R: ∼675 bp and CYA359F – 1492R: ∼1130 bp) for cyanobacteria.

For bacteria, the PCR reaction was prepared in a volume of 10 µL containing 1× TaKaRa PCR Buffer (TAKARA BIO INC.), 1.5 mM MgCl<sub>2</sub> (TAKARA BIO INC.), 250 µM dNTPs (TAKARA BIO INC.), 1.5 µL of each primer (2 µM), 0.25 mg/mL of UltraPure™ BSA (Life Technologies), 0.25 U TaKaRa Taq™ Hot Start Version (TAKARA BIO INC.) and 1 µL of template DNA. The PCR conditions were: initial denaturation step at 98°C for 2 min, followed by 30 cycles of a denaturation step at 94°C for 30 s, annealing at 48°C for 90 s and extension at 72°C for 2 min, followed by a final extension step at 72°C for 10 min. PCR products (3 µL loaded in each well) were separated by electrophoresis on a 1.5% (w/v) agarose gel for 30 min at 150 V. The ladder used was GRS ladder 1 kb (Grisp). The gel was stained with 1 µL SYBR® Safe DNA Gel Stain (Thermo Fisher Scientific), visualized under UV light in a Gel Doc XR+ System (BIO-RAD) and analyzed with the Image Lab™ software (BIO-RAD). For cyanobacteria, the PCR reaction was prepared in a volume of 20 µL containing 1× Green GoTaq® Flexi Buffer (Promega), 2.5 mM MgCl<sub>2</sub> (Promega), 500 µM of dNTP Mix (Promega), 0.1 µM of each of the primers, 0.5 U of GoTaq® DNA Polymerase (Promega) and 2 µL of template DNA. The PCR conditions were: initial denaturation step at 92°C for 4 min, followed by 35 cycles of a denaturation step at 92°C for 30 s, annealing at 50°C for 30 s and extension at 72°C for 1 min, followed by a final extension step at 72°C for 5 min.
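
The reaction set-up above mixes final concentrations with added volumes, so a short worked calculation may help when reproducing it. The Python sketch below applies the dilution relation C1·V1 = C2·V2 to the bacterial primer addition stated in the text (1.5 µL of a 2 µM stock in a 10 µL reaction) and sums the programmed hold times of both thermal profiles. The function names are hypothetical, and ramping time between temperatures is not included.

```python
from typing import Sequence


def final_concentration(stock_conc: float, added_volume: float,
                        total_volume: float) -> float:
    """Final concentration after dilution (C2 = C1 * V1 / V2).

    The result has the same unit as stock_conc; both volumes must share a unit.
    """
    return stock_conc * added_volume / total_volume


def program_hold_seconds(initial_s: int, cycle_steps_s: Sequence[int],
                         n_cycles: int, final_s: int) -> int:
    """Total programmed hold time of a PCR profile, excluding ramping."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s


if __name__ == "__main__":
    # Bacterial reaction: 1.5 uL of 2 uM primer stock in a 10 uL reaction.
    print(final_concentration(2.0, 1.5, 10.0), "uM of each primer")
    # Bacterial profile: 2 min, 30 x (30 s + 90 s + 2 min), 10 min.
    print(program_hold_seconds(120, [30, 90, 120], 30, 600) / 60, "min")
    # Cyanobacterial profile: 4 min, 35 x (30 s + 30 s + 1 min), 5 min.
    print(program_hold_seconds(240, [30, 30, 60], 35, 300) / 60, "min")
```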

The PCR products of bacteria and cyanobacteria were then sequenced by Sanger sequencing at i3S (Porto, Portugal) and GATC Biotech (Constance, Germany), respectively. Raw forward and reverse sequences (ab1 files) were imported into Geneious 8.1.9 (Kearse et al., 2012) for de novo assembling.

#### Phylogenetic Analysis of Bacterial Isolates

The obtained sequences (approximately 1,400 and 1,100 bp for actinobacteria and cyanobacteria, respectively) were submitted to a blast(n) analysis against the NCBI Nucleotide collection database and the sequences from the first 5 blast(n) matches were retrieved. The multiple sequence alignment (using the ClustalW algorithm) and the phylogenetic analysis were performed in MEGA7 (Kumar et al., 2016). The alignments were manually curated to remove short sequences and gap regions. The best nucleotide substitution model was determined by the corrected Akaike Information Criterion (AICc) in MEGA7. The phylogenetic trees were reconstructed using the Maximum Likelihood statistical method, with bootstrap support (500 replications) and the corresponding best nucleotide substitution model (TN93 + G and GTR + G + I).
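
For readers unfamiliar with model selection by AICc, the criterion can be sketched as follows. This is a generic illustration of the standard formula, AICc = 2k − 2 ln L + 2k(k + 1)/(n − k − 1), and not MEGA7's internal code: the log-likelihoods, parameter counts and sample size in the example are hypothetical, and how the sample size n is defined for an alignment is an assumption left to the user.

```python
from typing import Dict, Tuple


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Corrected Akaike Information Criterion.

    log_likelihood: maximized ln L of the model
    k: number of free parameters
    n: sample size (must exceed k + 1)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_model(candidates: Dict[str, Tuple[float, int]], n: int) -> str:
    """Return the name of the candidate with the lowest AICc.

    candidates maps a model name to (log_likelihood, number_of_parameters).
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n))


if __name__ == "__main__":
    # Hypothetical fits; real values would come from the phylogenetics software.
    fits = {
        "simpler model": (-5230.4, 120),
        "richer model": (-5221.9, 125),
    }
    print(best_model(fits, n=1400))
```

The correction term penalizes parameter-rich models more strongly when the sample size is small, which is why AICc is often preferred over plain AIC for short alignments.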

# RESULTS AND DISCUSSION


The hyper-arid desert of McMurdo Dry Valleys, located in Victoria Land is considered one of the most inhospitable habitats, being restricted to microbial colonization (Pointing et al., 2009). In these environments, abiotic factors such as moisture, pH and conductivity, clearly drive the diversity and structure of the microbial communities (Pointing et al., 2009; Magalhães et al., 2012; Magalhães et al., 2014).

In this study, a transect of soil samples was collected in Victoria Valley, with a clear decrease in water availability (Aw) from the wetter sampling sites (T1, T2, and T3) to the drier ones (T5 and T6) (**Table 1**). Soil pH ranged from neutral (7.09) to moderately alkaline (8.41), increasing with distance from the water source.

# Cyanobacteria and Actinobacteria Diversity and Distribution

The bacterial community composition across the transect was assessed through 454 pyrosequencing of the 16S rRNA gene. A total of 180499 sequences were obtained for the seven studied samples, which after quality filtering decreased to 71447. The number of sequences per sample ranged from 2950 (endolithic sample) to 17570 (sample T6). In total, 4530 different OTUs (at 97% identity) were retrieved (**Supplementary Table S1**).

Alpha-diversity results indicated that bacterial diversity was not fully covered by the sequencing effort in transect samples, as a plateau phase was not reached, with the exception of the endolithic sample (**Supplementary Figure S1**). Two different beta-diversity metrics were employed – weighted and unweighted UniFrac (Lozupone and Knight, 2005), both phylogeny-based. The resulting output was summarized by Principal Coordinates Analysis (PCoA) (**Figures 2B,C**). The principal coordinate 1 (PC1) explained 28.9% and 42.4% of the variation for unweighted and weighted analyses, respectively. A similar clustering pattern is observable in both plots: samples T1 and T2 cluster together, as do samples T4, T5, and T6. Further, samples T3 and END seem to be distributed separately from the other samples. This clustering pattern may be indicative of a switch in habitat type from moist soils, comprising T1 and T2, which were the ones closest to the water source, to open arid soils, comprising locations T4, T5, and T6, which are the locations furthest from the water source.
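
The plateau criterion used above refers to rarefaction curves, which plot the expected number of OTUs against the number of reads subsampled. The Python sketch below computes the analytic expectation E[S_n] = Σ_i [1 − C(N − N_i, n)/C(N, n)] for a sample with N reads and N_i reads in OTU i. It is an illustration only: QIIME builds its curves by repeated random subsampling rather than by this formula, and the function names and example counts are hypothetical.

```python
import math
from typing import List, Sequence


def rarefied_richness(counts: Sequence[int], depth: int) -> float:
    """Expected number of OTUs in a random subsample of `depth` reads
    drawn without replacement."""
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    denominator = math.comb(total, depth)
    expected = 0.0
    for c in counts:
        if c > 0:
            # Probability that none of this OTU's reads is drawn.
            p_absent = math.comb(total - c, depth) / denominator
            expected += 1.0 - p_absent
    return expected


def rarefaction_curve(counts: Sequence[int], depths: Sequence[int]) -> List[float]:
    """Expected richness at each requested subsampling depth."""
    return [rarefied_richness(counts, d) for d in depths]


if __name__ == "__main__":
    # Hypothetical OTU counts with a few dominant and many rare OTUs.
    counts = [400, 250, 100, 50] + [2] * 60 + [1] * 80
    depths = [100, 250, 500, sum(counts)]
    for depth, richness in zip(depths, rarefaction_curve(counts, depths)):
        print(depth, round(richness, 1))
```

A curve that is still rising at the full sequencing depth, as reported here for the transect samples, indicates that additional reads would be expected to reveal further OTUs.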

In accordance with previous reports for Antarctic (Pointing et al., 2009; Van Goethem et al., 2016) and hot hyperarid deserts (Azúa-Bustos et al., 2011; Stomeo et al., 2013; Makhalanyane et al., 2015), the type of habitat (endolithic vs. soil) dramatically constrains bacterial community composition. Clearly distinct taxonomic and phylogenetic composition was observed in the two niches under study (**Figure 3**). Also consistent with previous studies (Van Goethem et al., 2016), diversity indices (alpha-diversity) revealed soil samples as more diverse than the endolithic sample, according to the number of observed OTUs (**Supplementary Figure S1**) and the richness and diversity of the sample (**Figure 2A**). A total of 34 bacterial phyla were detected across all the samples, with the phyla Actinobacteria, Proteobacteria, Cyanobacteria, and Bacteroidetes being the most abundant and present in all samples (**Figure 3A**). Actinobacteria, Acidobacteria, and Bacteroidetes are usually the dominant phyla described for Dry Valleys (Aislabie et al., 2006; Cary et al., 2010). Interestingly, in contrast to previous studies (Smith et al., 2006; Fierer et al., 2012b; Van Goethem et al., 2016), Proteobacteria was highly represented in the transect and endolithic samples. This phylum is generally dependent on high organic soil contents, which is not the case for most oligotrophic Antarctic soils (Cary et al., 2010; Chan et al., 2013). As expected, Acidobacteria, considered an oligotrophic phylum (Fierer et al., 2012b), was also well represented and dispersed among the different samples (**Figure 3A**).

In this study, special attention was given to the distribution of Actinobacteria and Cyanobacteria – the most abundant heterotrophic and autotrophic phyla, respectively.

Cyanobacteria are usually the dominant phylum in lithic-associated communities (Chan et al., 2012) across multiple deserts (Pointing et al., 2007; Crits-Christoph et al., 2016; Khomutovska et al., 2017; Perera et al., 2018). In this study, cyanobacteria closely related to Acaryochloris were found exclusively in the endolithic sample and represented 44% of the total relative abundance (**Figure 3B**). Up to now, the two described species of Acaryochloris are marine (Miyashita et al., 2003; Partensky et al., 2018), but cyanobacteria closely related to these taxa have already been associated with endolithic communities in Antarctic Dry Valleys and Atacama granite and calcite rocks, respectively (de Los Ríos et al., 2007; Crits-Christoph et al., 2016), and within the underexplored cold desert of the Pamir mountains (Khomutovska et al., 2017). These niches provide a barrier to penetration of organisms, protection against harmful solar irradiance as well as a microclimate distinct from the exterior of the rock, with higher moisture levels (Friedmann, 1980; de Los Ríos et al., 2007; Wierzchos et al., 2012). On the other hand, one of the main factors that influence the composition of endolithic photosynthetic communities is the quantity and quality of light available (de Los Ríos et al., 2005). Curiously, the same applies to the distribution of Acaryochloris in marine environments (Chan et al., 2007; Partensky et al., 2018). This genus is characterized by having sheathed and non-motile cells, but its chief distinctive character is the presence of chlorophyll d (chl d) as the major photosynthetic pigment (Miyashita et al., 2003; Partensky et al., 2018).

Over 30% of the total bacterial relative abundance in the current dataset was attributable to actinobacteria, mainly distributed across two families – Euzebyaceae and Rubrobacteraceae (**Figures 3A,B**). Members of the Euzebya genus have been reported from Victoria Valley's hypolithic and endolithic communities (Van Goethem et al., 2016). The Rubrobacter genus, commonly found in Dry Valleys (Wei et al., 2016), has been reported as the dominant genus of endolithic microbial communities in the Atacama Desert (Crits-Christoph et al., 2016; Meslier et al., 2018). Water availability seemed to clearly define a threshold or a limit for colonization by some bacterial phyla, mainly cyanobacteria, along the soil transect. The HTS data show that the percentage of cyanobacteria decreases from about 20% in sample T1 to negligible values in the last three samples of the transect (T4–T6) (**Figure 3A**). In sample T1, the genera Leptolyngbya and Pseudanabaena were well represented, while in sample T2 the genus Phormidium was the most abundant (**Figure 4**). Leptolyngbya is usually associated with lake and maritime Antarctic communities (Taton et al., 2006; Zakhia et al., 2008), while Phormidium is commonly found in Antarctic water-saturated soils and river beds (Vincent et al., 1993), but both have been found in polar deserts (Michaud et al., 2012). These findings suggest, as already shown for Antarctic (Pointing et al., 2007; Wood et al., 2008; Van Goethem et al., 2016) and other hyper-arid deserts (Kastovská et al., 2005), that water availability, and thus distance to aquatic ecosystems, shapes the taxonomic composition of the cyanobacterial communities. In fact, in hyper-arid habitats, water availability was shown to be the main driver and most limiting environmental factor for cyanobacteria distribution (Warren-Rhodes et al., 2006, 2007). 
In agreement, the results obtained in this study suggest the existence of a biological threshold for cyanobacterial colonization and survival, observed between water availability (Aw) values of 1 and 0.6, when moving from sample T3 to T4 (**Table 1**). Limited photosynthetic activity is related to limiting moisture levels (Tracy et al., 2010) and might be a reason for the reduction in cyanobacterial activity and distribution.

Among cyanobacteria, the orders Synechococcales (represented by the genera Acaryochloris, Leptolyngbya, and Pseudanabaena) and Oscillatoriales (Phormidium) dominated in the studied samples. Interestingly, Chroococcidiopsis (Caiola et al., 1993), a desiccation-tolerant cyanobacterium dominant in arid and hyper-arid deserts (Cámara et al., 2015; Khomutovska et al., 2017; Lacap-Bugler et al., 2017; Gómez-Silva, 2018), was not detected in this study.

Previous studies have stated that Oscillatoriales are capable of outcompeting Chroococcidiopsales in certain cold desert conditions; in particular, Phormidium has been suggested to have a competitive advantage in colonizing cold habitats (de la Torre et al., 2003; Pointing et al., 2007). This group is also commonly associated with places with higher availability of water, as described by Steven et al. (2013) in Arctic desert soils. Filamentous cyanobacteria are known to be able to thrive under extreme environmental constraints due to their mucilage production, motility or the production of akinetes in the case of some heterocyst-differentiating cyanobacteria (Murik et al., 2017; Ručová et al., 2018).

The dominance of coccoid, Acaryochloris-like cyanobacteria in the endolithic sample and their absence in soil samples seems to indicate a high level of specialization and adaptation to this type of environment. Indeed, as stated by Partensky et al. (2018), chl d allows Acaryochloris to thrive in (micro)habitats enriched in light radiation outside the visible spectrum, notably in the infrared region. The isolation of any strain would be of great relevance to help shed light on the possible presence of this pigment (and its eventual ecophysiological role) in the detected Acaryochloris-like cyanobacterium, which seems to be an important component of polar endolithic communities. The existence of aridity-associated phylotypes, as suggested by the data presented, was already documented for other cold deserts (Pointing et al., 2007). Conversely, actinobacterial abundance increased from about 20% in wet soils to roughly 50% in the driest samples, and also with higher pH (**Figure 3A** and **Table 1**). This pattern suggests that actinobacteria are favored by the decrease in moisture content, contrary to cyanobacteria. A similar shift from members of Proteobacteria to actinobacteria with decreasing water availability was observed by Takebayashi et al. (2007). Germination and growth at 0.5 Aw have been previously reported for actinobacteria (Evans et al., 2014; Stevenson and Hallsworth, 2014), and Barnard et al. (2013) have shown that actinobacterial relative abundance increases with desiccation. Interestingly, and in contrast with what we observed, it has been shown that the relative abundance of actinobacteria in the Namib desert increased with an increase in moisture (Armstrong et al., 2016), while the opposite has been shown to occur in the Chihuahuan desert (Clark et al., 2009). Recent studies revealed that the major predictors of moisture preferences are not phylogeny but physiological traits (Lennon et al., 2012), which can explain such contradictory observations. Also, soils with high pH usually support higher relative abundances of actinobacteria and lower relative abundances of Acidobacteria when compared to more acidic soils (Fierer et al., 2012a). A comprehensive study with 88 different soils revealed a positive correlation between soil pH and actinobacteria abundance (Lauber et al., 2009), as we report in this study. The shift observed in actinobacteria distribution across the soil transect studied might therefore result not only from a decrease in water availability but from its combination with an increase in pH.

Among Actinobacteria, families Sporichthyaceae, Euzebyaceae, Patulibacteraceae, and Nocardioidaceae were the most abundant (**Figure 3C**). The genus Rubrobacter (Pointing et al., 2009) was present in all samples, with a higher frequency in endolithic and in the last three samples of the transect. Previous studies have detected Rubrobacter in Dry Valleys soils (Aislabie et al., 2013; Wei et al., 2016), and the observed distribution suggests that this genus might be widely adapted to this environment.

Members of the Sporichthyaceae family, with a higher distribution in sample T5, have been reported from Dry Valleys soil (Van Goethem et al., 2016) and cryptoendolithic communities (de la Torre et al., 2003) as well as Atacama desert soils (Idris et al., 2017). The distribution of the Patulibacteraceae and Nocardioidaceae families increased in samples with lower water availability. Although at low frequencies, the Nocardioides genus was detected in all soil samples, suggesting, as for Rubrobacter, an important role for this genus in this arid environment. In fact, members of the Nocardioides genus have been previously found in other desert soils, such as the Atacama (Idris et al., 2017) and Badain Jaran (Sun et al., 2018) deserts, which have also yielded isolates of novel species (Tuo et al., 2015).

Members of Euzebyaceae were detected particularly in END and T3 sample (**Figure 3C**), which is corroborated by previous works that have found the presence of members of the Euzebya genus in Dry Valleys hypolithic and endolithic communities (Van Goethem et al., 2016). In Atacama desert, their presence was also detected in endolithic microhabitats (Meslier et al., 2018). According to the literature, Euzebya seems to be highly adapted to endolithic and hypolithic environments, however, our study seems to be the first that detected the presence of this genus in the Dry Valleys soils. Remarkably, the endolithic sample was the only among those studied to harbor actinobacteria affiliated with Streptomycetaceae, a family commonly found and retrieved by cultivation from Dry Valleys soils samples (Cameron et al., 1972; Babalola et al., 2009).

## Culture-Dependent Isolation and Diversity of Actinobacterial Strains

Culture-based studies on Dry Valleys initially proposed a dominance of a small number of aerobic groups, and few anaerobic isolates, for endolithic (Friedmann, 1982) and edaphic habitats. Although Antarctic soils have by now been extensively studied by culture-based approaches, these studies have still retrieved only a small number of bacterial phyla, from which only a small fraction of genera have been cultivated (Lambrechts et al., 2019).

However, molecular-based phylogenetic studies have revealed the microbial diversity of the Antarctic Dry Valleys to be remarkably high (Smith et al., 2006). At least 14 different bacterial phyla have been described from Dry Valleys bacterial lithic communities – dominated by Acidobacteria, Actinobacteria and Bacteroidetes (Cary et al., 2010). Still, culture-based studies have in general retrieved some specific genera of the actinobacteria phylum, such as Arthrobacter, Brevibacterium, Corynebacterium, Micrococcus, Nocardia, and Streptomyces (Johnson et al., 1972). Due to the low cultivable fraction of the microbial richness [typically below 1% (Epstein, 2013)], different strategies to improve the culturability of microorganisms have started to be used, including pre-treatment strategies and oligotrophic media (Xiong et al., 2013), which have provided fruitful results, in particular in Antarctic ecosystems (van Dorst et al., 2016; Pulschen et al., 2017; Tahon and Willems, 2017; Tahon et al., 2018). In fact, previous studies have found Antarctic edaphic bacteria to be resistant to cultivation, but recently Pulschen et al. (2017) have shown that it is possible to grow recalcitrant bacteria from Antarctic soils by using longer incubation periods, lower temperatures and oligotrophic media.

In the present study, different culture isolation strategies, including mimicking of oligotrophic conditions and application of selective pre-treatments were used, in order to isolate actinobacteria strains from Victoria Valley samples. Attempts to isolate actinobacteria were only successful in sample T5, where we used two different pre-treatments – heat shock and incubation with antibiotics. Eleven actinobacterial strains were isolated and identified in this study (**Table 2**), and they all belonged to the Micrococcales order, affiliated with two different families (Micrococcaceae and Dermacoccaceae) and four different genera (Micrococcus, Kocuria, Dermacoccus, and Flexivirga).

According to pyrosequencing data, the dominant actinobacteria family in sample T5 was Sporichthyaceae, followed by Patulibacteraceae, Nocardioidaceae, and Rubrobacteraceae (**Figure 3C**). For the Sporichthyaceae family, there are no records of cultivation isolates in Antarctica. Curiously, accordingly to the pyrosequencing data, Micrococcaceae were not found in sample 5 and Dermacoccaceae were not detected in any sample (**Figure 5B**). Bacterial species from Micrococcaceae family are commonly retrieved from Antarctic culture-based studies (Johnson et al., 1972; Liu et al., 2000; Cary et al., 2010). The obtained isolates were assigned to two different and less common genera from this family – Micrococcus and Kocuria. The Kocuria genus has resulted from the phylogenetic and chemotaxonomic division of Micrococcus genus and both include species isolated in Antarctica (Liu et al., 2000; Reddy et al., 2003). Despite being considered a less common genus, members of Kocuria genus are recurrently isolated across desert soils (Gommeaux et al., 2010; Schulze-Makuch et al., 2018), inclusive new species (Li et al., 2006). Species from the Dermacoccus

#### TABLE 2 | Summary of obtained isolates.

fmicb-10-01018 May 31, 2019 Time: 15:59 # 10


The isolates non-identified, clonal strains and strains in isolation process were excluded from the Table. <sup>1</sup> Isolation in AIA medium, at 28◦C, PT1, dilution 10−<sup>1</sup> . 2 Isolation in AIA medium, at 28◦C, PT2, not diluted. <sup>3</sup> Isolation in SCN medium, at 28◦C, PT2, substrate inoculated. <sup>4</sup> Isolation in Z8 medium, at 19◦C.

genus have also been isolated previously from Galindez Island, maritime Antarctica (Vasileva-Tonkova et al., 2014). One reason for the amenability of Micrococcaceae to cultivation may be the production of cyst-like resting forms. Micrococcus species from permafrost have been shown to harbor such cyst-like cells, which protect against adverse external factors and are responsible for survival under prolonged exposure to subzero temperatures (Soina et al., 2004). In addition, another study revealed that members of the Micrococcaceae family have higher growth rates than other bacterial groups in samples to which water was added (Schwartz et al., 2014), suggesting that members of this family are adapted to respond to transient water inputs in arid soils. For the remaining genus – Flexivirga – there are no reports of previous isolation in Antarctica; however, one of the five closest hits at NCBI (Flexivirga sp. M20-45) was isolated from an alpine forest soil by Franca, L. and Margesin, R. (unpublished). In the phylogenetic tree (**Figure 6**), the isolate AT20 groups with the strain Flexivirga sp. ID2601S, retrieved from an evaporation core by Kim et al. (unpublished). According

FIGURE 6 | Phylogenetic tree of the 16S rRNA gene nucleotide sequences of the obtained actinobacterial isolates and their closest matches at NCBI 16S database. The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura–Nei model. A discrete Gamma distribution was used to model evolutionary rate differences among sites [5 categories (+G, parameter = 0.4569)]. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 59 nucleotide sequences. Sequences of strains isolated in Antarctic or cold-environments are highlighted and the sequences obtained in this study are in bold.

to the phylogenetic analysis and the 16S rRNA similarity, the isolate AT20 might represent a new species from the Flexivirga genus.

# Culture-Dependent Isolation and Diversity of Cyanobacterial Strains

The dominance and adaptive success of cyanobacteria in Antarctica are well known (Vincent, 2007; Quesada and Vincent, 2012). They thrive particularly in lakes and ponds through the establishment of benthic microbial mats (Vincent, 2000). Pioneer studies on cyanobacterial distribution in the Dry Valleys have shown that the majority of cryptoendolithic cyanobacteria-dominated communities belong to the genus Phormidium (Friedmann, 1986).

Here, attempts were made to isolate cyanobacteria from soil and endolithic rock samples. In total, 10 cyanobacterial strains from the order Synechococcales were obtained (**Table 2**), distributed across three genera (Nodosilinea, Leptolyngbya, and Plectolyngbya) plus an unidentified Synechococcales.

From the END sample, it was only possible to isolate two clonal strains (Nodosilinea sp. LEGE 13457 and LEGE 13458) with high phylogenetic (**Figure 7**) and morphological (**Supplementary Figure S2**) similarity to a Leptolyngbya antarctica strain previously isolated from a benthic microbial mat in Dry Valleys and eastern Antarctic lakes (Taton et al., 2006). Although they showed over 99% similarity to L. antarctica ANT.LAC.1, the phylogenetic analysis indicates that strains LEGE 13457 and LEGE 13458 fit within the clade harboring Nodosilinea strains and are placed away from the reference strain Leptolyngbya boryana PCC6306 (**Figure 7**). This apparent inconsistency reflects the current state of cyanobacterial taxonomy, which is undergoing a protracted revision (Komárek, 2016; Walter et al., 2017). Thus, our findings suggest that a taxonomic revision of the species L. antarctica (West and G.S.West) Anagnostidis and Komárek (1988) is needed, as already demonstrated by Taton et al. (2006). According to the pyrosequencing data, over 40% of the total bacterial abundance in the END sample corresponded to the genus Acaryochloris and only 0.4% to Leptolyngbya; however, despite our efforts, we were not able to properly isolate any Acaryochloris strain. As mentioned above, such an isolate would be very important for exploring the role of chl d in the successful adaptation of this cyanobacterium to endolithic habitats in Antarctica, as has been shown for its marine counterparts (Chan et al., 2007). In addition, sequences from Nodosilinea strains were not detected by pyrosequencing. Interestingly, L. antarctica ANT.LAC.1 is the only Antarctic strain in the clade harboring the isolated strains (LEGE 13457 and LEGE 13458; **Figure 7**). L. antarctica has been detected in several environments, not exclusively in Antarctica, and has been found as a dominant OTU in Arctic soil crusts (Pushkareva et al., 2015).

Attempts were made to isolate cyanobacteria from three soil transect samples: T1, the sample with the highest water availability; T3; and T6, with an arid soil-type habitat. From sample T1, three strains identified as Leptolyngbya frigida and one unidentified Synechococcales were obtained. According to the phylogenetic tree, the unidentified Synechococcales strain is most closely related to uncultured cyanobacterium strains from Antarctica, and according to the distance matrix it shares only 97.5% similarity with Phormidesmis priestleyi ANT.LACV5.1, suggesting that it may represent a new species. In agreement with the cultivation results, the pyrosequencing data revealed that the dominant genera were Leptolyngbya, Pseudanabaena, and Phormidium; however, the genus Phormidesmis remained undetected (**Figure 4**). Interestingly, from sample T3, a strain identical (100% similarity in the 16S rRNA gene) to the one obtained from sample T1 (unidentified Synechococcales strain AR3-H-2A) and a Nodosilinea sp. strain with high identity to those isolated from the END sample were retrieved. According to the pyrosequencing data, sample T3 contained well-represented strains from the Pseudanabaenaceae family (**Figure 3B**) and a lower proportion of the genus Phormidium (**Figure 4**). From the sample with the lowest water availability, T6, two strains identified as Plectolyngbya hodgsonii and L. frigida were obtained. The phylogenetic tree (**Figure 7**) shows that the Plectolyngbya isolates group together in a subclade formed only by Antarctic strains. The pyrosequencing data revealed that this sample contained only 0.1% of cyanobacteria from the Synechococcaceae family and did not detect any from Leptolyngbyaceae, to which the isolated strains were affiliated (**Figure 5C**).
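The similarity values quoted above (e.g., 97.5% or 100% at the 16S rRNA gene) are pairwise identities over an alignment. As a minimal sketch of that calculation, assuming gapped columns are simply skipped (other conventions exist), the function below computes percent identity for two pre-aligned sequences. The function name and the toy fragments are hypothetical and are not real 16S sequences from this study.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the two sequences are already aligned to equal length and
    gaps are written as '-'. Gapped columns are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only)
isolate = "ACGTTGCA-ACGTAGCTAGCT"
reference = "ACGTTGCAGACGTAGCTTGCT"
identity = percent_identity(isolate, reference)
print(f"{identity:.1f}% identity")
```

In practice the alignment itself (and the region of the gene compared) affects the value, so identities computed this way are only comparable when the same alignment procedure is used throughout.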

All the 16S rRNA gene sequences from the strains obtained are most closely related to those of other Antarctic or cold-environment cyanobacterial strains (**Figure 7**). In particular, the clade of L. frigida is composed mostly of strains isolated in Antarctica or other cold environments. Notably, most of the closest relatives at NCBI come from the same study (Taton et al., 2006), which assessed cyanobacterial diversity in Antarctic lakes, including those in the Dry Valleys.

#### Culture-Dependent vs. Culture-Independent Approach

Over 99% of microorganisms from the environment are recalcitrant to cultivation in the laboratory (Kaeberlein et al., 2002), and the revolution in HTS techniques has opened an array of opportunities, with new discoveries and access to previously unknown and uncultivable diversity. However, a few bottlenecks, such as limited detection of minority populations (Lagier et al., 2012), difficulty in discriminating at the lowest taxonomic levels, and the fraction of unassigned sequences (Rinke et al., 2013), have renewed interest in bacterial cultivation practices for "non-cultivable" species. Together with the development of new approaches to retrieve bacteria previously considered uncultivable, a rebirth of culture in microbiology (Kaeberlein et al., 2002; Lagier et al., 2018) has emerged. The use of culture media and diffusion chambers to simulate natural environments (Kaeberlein et al., 2002), co-culture (Stewart, 2012), and most recently culturomics (Lagier et al., 2016) have yielded great improvements in culturability. Indeed, the combination of HTS methods with culture-dependent techniques has started to be used to identify new bacterial species (Ma et al., 2014).

Here, a combined approach including HTS and culture-dependent techniques (including mimicking of natural conditions and use of pre-treatments) was employed. In this study, the isolation of the same cyanobacterial species (including clonal strains) from different samples and different microenvironments suggests, as already reported (Taton et al., 2006), that the cultivation conditions may have selected for some specific genera. Although the OTU-level rarefaction curve of the END sample reached a plateau (**Supplementary Figure S1**), indicating that we captured most of the bacterial diversity, 50% of the isolated strains were not detected by HTS. Factors such as bias in DNA extraction due to the protocol used (Zielińska et al., 2017), differential PCR amplification, or the different distribution of rRNA operons among bacteria can influence the proportion of rRNA phylotypes (Klappenbach et al., 2000).
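To make the notion of a rarefaction plateau concrete, the following is a minimal sketch of the classical analytical rarefaction formula, in which the expected number of OTUs in a random subsample of n reads is the sum over OTUs of 1 − C(N − N_i, n)/C(N, n), with N the total number of reads and N_i the reads of OTU i. The OTU counts are hypothetical, and this is not necessarily the procedure used to produce Supplementary Figure S1.

```python
from math import comb


def rarefied_richness(otu_counts, depth):
    """Expected OTU richness in a random subsample of `depth` reads
    drawn without replacement (analytical rarefaction)."""
    total = sum(otu_counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must lie between 0 and the total read count")
    denom = comb(total, depth)
    # For each OTU, 1 - P(OTU absent from the subsample)
    return sum(1 - comb(total - c, depth) / denom for c in otu_counts if c > 0)


# Hypothetical read counts per OTU for one sample (illustration only)
counts = [50, 30, 15, 4, 1]
curve = [(d, rarefied_richness(counts, d)) for d in range(0, 101, 10)]
for depth, richness in curve:
    print(f"{depth:>3} reads -> {richness:.2f} expected OTUs")
```

A curve that flattens well before the full sequencing depth suggests that additional reads would add few new OTUs; it does not by itself rule out taxa that are lost to extraction or amplification bias.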

Concerning the isolation of actinobacteria, the combination of pre-treatments with culture conditions clearly dictated the success of isolation from the T5 sample. Interestingly, none of the strains isolated in this study, which were affiliated with Micrococcaceae and Dermacoccaceae, were represented in the pyrosequencing data set of the sample from which the isolates were retrieved (T5, **Figure 3C**). As already demonstrated by Pulschen et al. (2017) and Tahon and Willems (2017), it is possible to retrieve recalcitrant bacteria from Antarctic samples by adopting some simple approaches, such as longer incubation periods, low temperatures, and oligotrophic media. By applying these approaches, Tahon and Willems (2017) were able to obtain at least 12 representatives of novel genera or families and two potential first cultured isolates of novel taxa.

It is important to note that, because samples were preserved at −80 °C in life-guard solution, some diversity may not have been recovered owing to a decrease in viability associated with storage in this solution and at this temperature. In addition, other variables can influence the ability of bacteria to grow, from culture media composition (Xiong et al., 2013) to more complex requirements, such as the presence of specific growth signals (Lewis et al., 2010) or dependency on other microorganism(s) (Davis et al., 2011). Also, the percentage of active members of the community that can be cultivated is usually low. It has been previously suggested that in arid soils, such as those of the Antarctic Dry Valleys, the cyanobacterial populations are not actively growing (Aislabie et al., 2013), as they probably originate from wind dispersion (Michaud et al., 2012).

Previous studies have also demonstrated that only a fraction of species is detected concomitantly by culture-dependent and culture-independent approaches (Lagier et al., 2012; Pudasaini et al., 2017; Tahon and Willems, 2017). This low overlap can be explained by the limitations of both approaches. Furthermore, Carini et al. (2017) have shown that relic DNA accounts for about 40% of amplified prokaryotic 16S rRNA genes. In colder soils, DNA from non-viable cells can persist for even longer periods; thus, an inflation of bacterial diversity might be one of the reasons for the reduced overlap observed.
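The overlap between the two approaches can be expressed as a simple set comparison. The sketch below is illustrative only: it uses the family names reported above for sample T5 (the families of the isolates, and the four dominant families in the pyrosequencing data rather than the full data set), and the function name is hypothetical.

```python
def detection_overlap(cultured, sequenced):
    """Return taxa found by both approaches, taxa found only by culture,
    and the fraction of cultured taxa that sequencing also detected."""
    cultured, sequenced = set(cultured), set(sequenced)
    shared = cultured & sequenced
    culture_only = cultured - sequenced
    fraction = len(shared) / len(cultured) if cultured else 0.0
    return shared, culture_only, fraction


# Families of the T5 isolates and the dominant T5 families in the
# pyrosequencing data, as reported in the text (not the complete data set)
t5_cultured = {"Micrococcaceae", "Dermacoccaceae"}
t5_sequenced = {"Sporichthyaceae", "Patulibacteraceae",
                "Nocardioidaceae", "Rubrobacteraceae"}

shared, culture_only, fraction = detection_overlap(t5_cultured, t5_sequenced)
print(f"shared: {sorted(shared)}; culture only: {sorted(culture_only)}; "
      f"fraction detected by both: {fraction:.2f}")
```

The same comparison can be made at any taxonomic rank, although the result depends strongly on the rank chosen and on how unassigned sequences are handled.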

As already suggested, a combined approach that uses HTS to guide the culture-based isolation process (Babalola et al., 2009), together with pre-treatments and specific culture media (Pulschen et al., 2017), can improve the identification and culture retrieval of new bacterial species.

The limited number of sampling sites covered by this study may introduce some bias. Although this might reduce confidence in our hypotheses, most of the results presented are supported by previous studies; they should, however, be further explored in future sampling campaigns in the Dry Valleys.

#### CONCLUSION

This study combined pyrosequencing and cultivation techniques to assess the actinobacteria and cyanobacteria diversity of Dry Valleys microenvironments. The effect of environmental parameters on the distribution of these communities, in particular along a soil transect with a gradient of water availability, was also evaluated. This study highlights the capacity of Dry Valleys prokaryotic communities to thrive below thresholds that are considered to be life-limiting. The major role of actinobacteria and cyanobacteria, the dominant heterotrophs and phototrophs, in Dry Valleys ecosystem is supported by their distribution across environmental gradients.

Our findings agree with other studies in demonstrating that Dry Valleys bacterial diversity and abundance are driven by environmental forces. The effect of one particular environmental parameter – water availability – was evaluated, and a clear shift between microbial communities was registered. This shift was characterized by a pattern of phyla replacement as distance to the water source increased, likely resulting from a shift in habitat from high-moisture soils to open arid soils.

Our results revealed that the type of habitat (endolithic vs. soil) strongly constrains bacterial community composition, with clearly distinct taxonomic and phylogenetic compositions in the two habitats. Cyanobacteria dominated over the remaining phyla in the endolithic environment. The gradient of water availability (**Figure 5A**) and pH (**Table 1**) seem to dictate the distribution of cyanobacteria and actinobacteria, suggesting that actinobacteria, contrary to cyanobacteria, are favored by a decrease in moisture content and an increase in pH. However, caution is necessary when extrapolating from these results, since only a small number of samples were analyzed here. Finally, our study further illustrates the importance of combining cultivation and sequencing techniques. Indeed, despite the power of HTS technologies, we show that the culture-dependent methods employed in this study were able to retrieve taxa that were not detected in any of the pyrosequencing data. By combining the two approaches, we improved the coverage of the diversity present in the samples and were able to retrieve both abundant and rare members of the communities. The isolation of microorganisms from this environment remains challenging, and future work will include further optimization of isolation strategies and culture conditions.

# AUTHOR CONTRIBUTIONS

AR, MC, PL, and CM designed and conceived the research study and experiments. AR, FR, TM, HR, MB, and JS developed the experimental work. AR, AS, VR, and PL analyzed the data. AR wrote the main manuscript text with the support of CM, MC, VR, PL, and MB. SC and CL coordinated the Antarctic sampling campaign. All authors improved, reviewed, and approved the final manuscript.

#### FUNDING

This project was funded by the Portuguese Science and Technology Foundation (FCT) through a grant to CM (NITROLIMIT project - PTDC/CTA-AMB/30997/2017) and to PL (IF/01358/2014), and through a Ph.D. scholarship to AR (SFRH/BD/140567/2018). The Antarctic campaign was conducted as part of the New Zealand Terrestrial Antarctic Biocomplexity Survey (nzTABS), through an award from the New Zealand Foundation for Research and Technology (FRST) and an award from Antarctica New Zealand to SC (UOWX0710) that supported all field science and a post-doctoral fellowship to CL (UOWX0715). The work was also supported through awards from the New Zealand Marsden Fund to CL and SC (UOW1003), the New Zealand Ministry of Business, Innovation and Employment to SC and CL (UOWX1401), and the United States National Science Foundation to SC (ANT-0944556 and ANT-1246292). The Portuguese Polar Program (PROPOLAR) funded CM's participation in the Antarctic campaign through FCT national funds (PIDDAC). This research was partially supported by the Strategic Funding UID/Multi/04423/2019 through national funds provided by FCT and ERDF, in the framework of the programme PT2020.

#### REFERENCES


#### ACKNOWLEDGMENTS

We are sincerely grateful to Antarctica New Zealand for providing logistics support during the K020 event.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2019.01018/full#supplementary-material





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Rego, Raio, Martins, Ribeiro, Sousa, Séneca, Baptista, Lee, Cary, Ramos, Carvalho, Leão and Magalhães. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Ensemble Modeling of Antarctic Macroalgal Habitats Exposed to Glacial Melt in a Polar Fjord

Kerstin Jerosch<sup>1</sup>\*, Frauke K. Scharf<sup>1</sup>, Dolores Deregibus<sup>2</sup>, Gabriela L. Campana<sup>2,3</sup>, Katharina Zacher<sup>1</sup>, Hendrik Pehlke<sup>1</sup>, Ulrike Falk<sup>4</sup>, H. Christian Hass<sup>5</sup>, Maria L. Quartino<sup>2,6</sup> and Doris Abele<sup>1</sup>

<sup>1</sup> Alfred Wegener Institute Helmholtz Center for Polar and Marine Research, Functional Ecology, Bremerhaven, Germany, <sup>2</sup> Department of Coastal Biology, Argentinean Antarctic Institute, Buenos Aires, Argentina, <sup>3</sup> Department of Basic Sciences, National University of Luján, Luján, Argentina, <sup>4</sup> Department of Geography FB08, University of Bremen, Bremen, Germany, <sup>5</sup> Alfred Wegener Institute Helmholtz Center for Polar and Marine Research, Wadden Sea Research Station, List, Germany, <sup>6</sup> Museo Argentino de Ciencias Naturales Bernardino Rivadavia, Buenos Aires, Argentina

#### Edited by:

Bruno Danis, Free University of Brussels, Belgium

#### Reviewed by:

Charlene Guillaumot, Free University of Brussels, Belgium Charles Amsler, College of Arts and Sciences, University of Alabama at Birmingham, United States

> \*Correspondence: Kerstin Jerosch kerstin.jerosch@awi.de

#### Specialty section:

This article was submitted to Biogeography and Macroecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 26 June 2018 Accepted: 20 May 2019 Published: 13 June 2019

#### Citation:

Jerosch K, Scharf FK, Deregibus D, Campana GL, Zacher K, Pehlke H, Falk U, Hass HC, Quartino ML and Abele D (2019) Ensemble Modeling of Antarctic Macroalgal Habitats Exposed to Glacial Melt in a Polar Fjord. Front. Ecol. Evol. 7:207. doi: 10.3389/fevo.2019.00207

Macroalgae are the main primary producers in polar coastal regions and of major importance for the associated heterotrophic communities. On King George Island/Isla 25 de Mayo, West Antarctic Peninsula (WAP), several fjords are undergoing rapid glacial retreat in response to increasing atmospheric temperatures. As a result, extended meltwater plumes laden with suspended particulate matter (SPM) are generated that hamper primary production during the austral summer season. We used ensemble modeling to approximate changes in the benthic productivity of an Antarctic fjord as a function of SPM discharge. A set of environmental variables was statistically selected and an ensemble of correlative species-distribution models was devised to project scattered georeferenced observation data to a spatial distribution of macroalgae for a "time of measurement" ("tom") scenario (2008–2015). The model achieved statistically reliable validation results (true skill statistic 0.833, relative operating characteristic 0.975) and explained more than 60% of the modeled macroalgae distribution with the variables "hard substrate" and "SPM." This "tom" scenario depicts a macroalgae cover of ∼8% (63 ha) for the total study area (8 km<sup>2</sup>) and a summer production of ∼350 t dry weight. Assuming a linear increase of meltwater SPM load over time, two past (1991 and 1998) and two future (2019 and 2026) simulations with varying SPM intensities were applied. The simulation using only 50% of the "tom" scenario SPM amount (simulating 1991) resulted in increased macroalgal distribution (143 ha) and a higher summer production (792 t) compared to the "tom" status and could be validated using historical data. Forecasting the year 2019 from the "tom" status, an increase of 25% SPM results in a predicted reduction of macroalgae summer production by ∼60% (to 141 t). We present a first quantitative model for changing fjordic macroalgal production under continued melt conditions at the WAP. As meltwater-influenced habitats are extending under climate change conditions, our approach can serve to approximate future productivity shifts for WAP fjord systems. The reduction of macroalgal productivity as predicted for Potter Cove may have significant consequences for polar coastal ecosystems under continuing climate change.

Keywords: seaweed distribution modeling, bioclimatic ecosystem change, distribution shift, macroalgae summer production, South Shetland Islands, Antarctica

#### INTRODUCTION

The Western Antarctic Peninsula (WAP) is one of the regions responding most dramatically to climate change (Kim et al., 2018). In spite of the current phase of relative cooling over the past two decades (Turner et al., 2016), the long-term warming trend observed since the middle of the past century has caused loss of massive ice shelves and retreat of over 80% of all coastal glaciers along the Northern WAP (Cook et al., 2016). The changes in the cryosphere have led to pronounced ecosystem changes in the coastal systems (Barnes and Peck, 2008; Hoegh-Guldberg and Bruno, 2010; Sorte et al., 2010; Constable et al., 2014; Sahade et al., 2015). A major aspect of glacier retreat is the progressive subglacial and surface erosion during the melting seasons which generates extended sediment plumes mainly in inshore and glaciated areas (Monien et al., 2017). Surface transport of eroded sediments produces shading effects on, inter alia, benthic primary producers (Zacher et al., 2009; Deregibus et al., 2017). On the other hand, the glacier retreat is opening previously ice covered coastal seabed for new colonization by benthic organisms, e.g., macroalgae (Quartino et al., 2013; Barnes, 2017). Macroalgal communities not only serve as secondary habitats for a huge number of epiphytes and associated fauna but furthermore enhance local carbon burial by reducing flow velocity above ground and trapping particles, enhancing both inorganic, and organic deposition rates (Duarte et al., 2013). Macroalgae and their epiphytes are the main benthic primary producers of the coastal food web of Potter Cove (Iken et al., 1998; Quartino and Boraso de Zaixso, 2008) and contribute substantially to the dissolved and particulate carbon pool (Reichardt and Dieckmann, 1985; Fischer and Wiencke, 1992).

Macroalgal colonization of the seafloor is mainly affected by the availability of light, substrate type, grain size, steepness of the bottom slope (topology), wave action, and ice scour (Klöser et al., 1996; Zacher et al., 2009; Quartino et al., 2013; Wiencke et al., 2014; Clark et al., 2017; Campana et al., 2018). Especially in shallow water coastal systems, changes of sea ice duration and timing, and coastal fast ice dynamics are important drivers of benthic community composition and sustainability. Clark et al. (2013) predicted that earlier ice break-up can shift shallow water ecosystems from invertebrate dominated to macroalgae dominated communities, in areas with hard substrate. Macroalgal growth, productivity and vertical depth range are constrained by light availability under sea ice and sediment discharge plumes (DeLaca and Lipps, 1976; Wiencke, 1990a; Brouwer et al., 1995; Zacher et al., 2009; Clark et al., 2015, 2017; Deregibus et al., 2016), and by increasing physical disturbance from ice scour following sea ice break-up (Clark et al., 2015). Hence, while glacial retreat eventually supports increased macroalgal productivity on newly available hard substrates, shading due to turbid surface waters may curtail the net effect of macroalgal productivity to a currently unspecified extent. For any calculation of production or carbon budgets, an accurate quantification of coastal habitats suitable for macroalgal growth is an important prerequisite.

Species distribution models (SDM) statistically analyze the relationships between species distribution and the spatial patterns of environmental variables (Guisan and Thuiller, 2005; Elith and Leathwick, 2009; Dormann et al., 2012). They are applied in natural resource management and conservation planning (Miller, 2014), and form useful tools in ecosystem change modeling under future climate scenarios (Pineda and Lobo, 2009; Vorsino et al., 2014). Contrary to mechanistic models, they are independent from detailed species knowledge and require comparatively simple, widely available presence data. Therefore, SDMs provide a feasible and practical framework for an overarching environmental impact assessment (Elith and Leathwick, 2009), including a range of species over large spatial scales, especially in regions where difficult sampling conditions complicate in situ surveys.

We apply an ensemble modeling (EM) approach that combines a defined number of best-fitting SDMs into an optimized ensemble model, in order to assess the antithetic effects of a retreating glacier on macroalgae distribution in the studied fjord (Potter Cove, King George Island, South Shetland Islands, **Figure 1A**). A long-term data series from Carlini station covering the past 25 years shows that Potter Cove experienced an increase in summer sea surface temperature of 0.36 °C per decade between 1991 and 2010 (Schloss et al., 2012). The same paper reports that regional sea ice duration varies considerably between years, albeit with no significant trend over time. The earliest formation of a solid sea ice cover was at the end of April and the latest ice breakout at the end of November (Schloss et al., 2012), so that essentially no sea ice was or is present during the algal summer growth period. Mean summer meltwater stream discharge measured in the southeast of Potter Cove over three summer seasons (2010–2012) carried a suspended particulate matter (SPM) concentration of 283 mg l<sup>−1</sup>, while southwesterly areas are low in SPM (0–0.5 mg l<sup>−1</sup>) (Monien et al., 2017). A suite of environmental variables (raster data), either causative or indicative for macroalgal distribution patterns, was included: bathymetry, slope, SPM, hard substrate occurrence probability, distance to glacier front, and total organic carbon (TOC). Most of these raster data result from geostatistical models.
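One common way to combine several SDMs into an ensemble is a score-weighted mean of their predicted occurrence probabilities, after discarding models that fall below a quality threshold. The sketch below illustrates that idea only; the weighting rule, the threshold of 0.7, the model scores, and the per-cell probabilities are all assumptions for illustration and are not taken from this study.

```python
def weighted_ensemble(predictions, scores, min_score=0.7):
    """Score-weighted mean of per-cell probabilities.

    predictions: dict mapping model name -> list of probabilities (one per cell)
    scores:      dict mapping model name -> evaluation score (e.g., TSS)
    min_score:   hypothetical cut-off below which a model is excluded
    """
    kept = {m: s for m, s in scores.items() if s >= min_score}
    if not kept:
        raise ValueError("no model passes the score threshold")
    total_weight = sum(kept.values())
    n_cells = len(predictions[next(iter(kept))])
    return [
        sum(kept[m] * predictions[m][i] for m in kept) / total_weight
        for i in range(n_cells)
    ]


# Hypothetical probabilities for three raster cells from three models
predictions = {
    "RF": [0.9, 0.2, 0.6],
    "GBM": [0.8, 0.1, 0.4],
    "SRE": [0.5, 0.5, 0.5],
}
scores = {"RF": 0.8, "GBM": 0.8, "SRE": 0.4}  # hypothetical TSS values

ensemble = weighted_ensemble(predictions, scores)
print([round(p, 3) for p in ensemble])
```

Weighting by evaluation score lets better-validated models contribute more, while the threshold prevents poorly performing models from diluting the ensemble.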

Mayo: CH, Collins Harbor; EC, Esmerald Cove; MC, Marian Cove; EI, Ezcurra Inlet; MaI, Mackellar Inlet; MI, Martel Inlet; AG, Anna Glacier; BD, Bellingshausen Dome; CG, Crystal Glacier; DG, Drake Glacier; DoG, Domeyko Glacier; EG, Eldred Glacier; FG, Fourcade Glacier; HI, Hektors Icefall; MDI, Moby Dick Icefall; MG, Moczydlowski Glacier; PFG, Polar Friendship Glacier; PG, Poetry Glacier; UG, Usher Glacier (A), seabed morphology of Potter Cove (B), and photo of fresh, SPM-laden meltwater from Fourcade Glacier entering the inner cove (C).

Furthermore, we developed a set of model deviations, designed to evaluate variations of SPM entry into the cove simulating quantitative changes of sediment discharge as a function of glacial retreat.

Here, we analyze macroalgal presence/absence data from repeated surveys (2008–2015) in Potter Cove, taking into account new knowledge on the effects of shading on macroalgal productivity (Zacher et al., 2009; Deregibus et al., 2016). The data sets we used come from long-term, interdisciplinary ecosystem monitoring activities that are unique for Antarctic shallow water systems. Our aim was to run a distribution model that predicts and defines the potential ecological niche of macroalgae and allows for temporal and spatial simulations of their response to environmental changes in Potter Cove. The effects of climate-induced alterations of sediment discharge on macroalgae distribution and inferred productivity are shown for the "time of measurement (tom)" scenario (2008–2015), for two past simulations (1991 and 1998) with less SPM in the water column, and for two future simulations (2019 and 2026) with increased SPM.
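The scenario design can be sketched as a simple rescaling of the "tom" SPM raster by a scenario-specific factor. The two factors used below (50% of the "tom" SPM for 1991 and a 25% increase for 2019) are those stated in the abstract; the factors for 1998 and 2026 are not given in this part of the text and are therefore omitted. The raster values, function name, and use of `None` for no-data cells are hypothetical.

```python
# Scaling factors relative to the "tom" scenario, as stated in the abstract
SCENARIO_FACTORS = {"tom (2008-2015)": 1.00, "1991": 0.50, "2019": 1.25}


def scale_spm(spm_raster, factor):
    """Multiply every valid cell of an SPM raster by `factor`.

    spm_raster is a list of rows; None marks no-data cells (assumption).
    """
    return [
        [None if value is None else value * factor for value in row]
        for row in spm_raster
    ]


# Hypothetical 2 x 2 SPM raster in mg per liter (illustration only)
tom_spm = [[200.0, 80.0],
           [0.4, None]]

scenarios = {name: scale_spm(tom_spm, f) for name, f in SCENARIO_FACTORS.items()}
for name, raster in scenarios.items():
    print(name, raster)
```

Each rescaled raster would then replace the "tom" SPM layer as input to the fitted ensemble model, with all other predictors held constant.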

#### MATERIALS AND METHODS

#### Study Area

Potter Cove is a 4 km long and 2.5 km wide tributary fjord to Maxwell Bay on the southwestern coast of King George Island (KGI) (**Figure 1A**). The fjord covers ∼10 km<sup>2</sup> of surface area, has been almost free of glacial ice cover since 2016, and is surrounded by ice cliffs to the northeast. On the southern coast, meltwater streams intersect gravelly beaches that are occasionally occupied by grounded ice blocks (**Figure 1B**). The inner cove (6.5 km<sup>2</sup>) is divided into different basins by transversal ridges, remnants of underwater moraines that were formed during late Holocene glacial advances or stillstands (Wölfl et al., 2016). Ice melting and erosive processes (tides and waves) have given rise to newly exposed inshore hard bottom substrates, including a small rocky island (Isla D) with an above-sea-level diameter of 70 m (**Figures 1B,C**), which is currently colonized by mixed communities of macroalgae and invertebrate fauna (Campana et al., 2009; Quartino et al., 2013; Lagger et al., 2017).

In the north and west, the cove is surrounded by the rapidly retreating Fourcade glacier, part of the Warszawa Icefield, which has maximum elevations of ∼470 m (**Figure 1A**). From 1999 to 2008, the annual average frontline retreat rate was up to about 40 m y<sup>−1</sup> on the Potter Peninsula (Rückamp et al., 2011). In recent years, as the glacier moved onto land, slowing retreat rates were documented (Falk et al., 2018). The newly ice-free subglacial rock bed and surface area are subject to erosion (Monien et al., 2011, 2017; Rückamp et al., 2011), resulting in higher sediment discharge loads during warmer summer melt seasons (**Figure 1C**). This affects coastal marine areas up to 2 km from land (Jerosch et al., 2018). Monien et al. (2017) estimated an approximate sediment load of 4 × 10<sup>5</sup> tons y<sup>−1</sup> entrained locally into the surface water layer.

The Argentinian research base Carlini (former Jubany) with the German-Argentinian collaborative laboratory "Dallmann" is located at Potter Cove. The Potter Cove ecosystem and its responses to rapid glacial retreat have been extensively studied and monitored over the past decades. Within the recent period (2008–2017), the collaborative research was stimulated by two EU supported actions, IMCOAST (www.imcoast.org) and IMCONet (www.imconet.eu). Project data, collected and processed within these research programs, were compiled into a georeferenced database. The data sets are available at www.pangaea.de.

#### Modeling Approach

Geostatistical algorithms were applied in ArcGIS® to generate raster data from the environmental data collected at sampling sites; the rasters were quality-assessed with statistical measures such as Standard Error and Root-Mean-Square (**Supplementary Figure 1**, data sets and processing). The response variable, a spatial snapshot of the macroalgal communities in Potter Cove, was then compiled in order to statistically analyze the spatial response of macroalgae to environmental drivers and to simulate their spatial distribution under changing environmental conditions.

The biodiversity modeling package Biomod2 Version 3.1-64, described in detail in Thuiller et al. (2014), was used in the R statistics environment (R-3.1.2, RCoreTeam, 2014). The modeling technique includes a set of commonly used algorithms for SDMs, namely five machine learning methods: Random Forest (RF), Maximum Entropy modeling (MaxEnt), Artificial Neural Networks (ANN), Generalized Boosted Models (GBM), and Classification Tree Analysis (CTA); two regression models: Generalized Additive Models (GAM) and Generalized Linear Models (GLM); and, furthermore, the climate-envelope model Species Range Envelope (SRE), the non-parametric regression Multiple Adaptive Regression Splines (MARS), and Flexible Discriminant Analysis (FDA). For an explanation of the SDM algorithms, refer to Elith and Graham (2009). Uncertainty is expressed by the difference between alternative realizations, represented as deviations in response curves. For validation, two evaluation metrics, True Skill Statistic (TSS) and Relative Operating Characteristic (ROC), were applied within the Biomod2 package.

The number of input variables may constrain the complexity of models (Dormann et al., 2013). Accordingly, the variable selection decreases the resulting variance of regression parameters, improves processing time, reduces errors during processing, prevents possible misinterpretation of results, and eventually permits the evaluation of main abiotic drivers through an index of variable importance for shaping species distribution (Guisan and Thuiller, 2005; Elith and Leathwick, 2009; Merow et al., 2014). Testing the method by Dormann et al. (2013) shows that regression-type approaches (e.g., generalized linear models) and machine-learning techniques (e.g., MaxEnt) work reliably (i.e., condition number <10) if used under moderate collinearity. Here, the explanatory variables used for the SDMs were chosen through an iterative process (**Supplementary Material**). Environmental variables that are highly correlated with macroalgae presence/absence data (Pearson correlation coefficient | r | ≥ 0.7) as well as redundant variables were omitted. Further, variables with a low mean variable importance value (≤0.1) were excluded during an iterative implementation of the ensemble modeling. The six remaining environmental variables are ranked below by mean variable importance value (**Table 1**).

#### Macroalgae and Hard Substrate Data

For a "time of measurement" ("tom") status of macroalgae distribution, presence and absence data sampled during 2008–2015 were compiled from three main sources: video, imagery, and hydroacoustic data (**Supplementary Table 1**). The macroalgae presence/absence data set covers the total environmental data ranges obtained in the study area and is densely distributed, which, according to van Proosdij et al. (2016), supports the accurate operability of ensemble SDMs (**Figure 2A**). Additionally, three previously unpublished video transects recorded in 2011–2012 were converted to frames, georeferenced, and analyzed for macroalgal distribution data following the methodology described in Quartino et al. (2013). A chessboard-patterned hydroacoustic scan (RoxAnn GDX) of the seabed was carried out in 2012 (**Figures 2A,B**). An unsupervised visual classification method was used to annotate macroalgae and substrate from the hydroacoustic data set, which was validated by imagery from a drop-down camera (Hass et al., 2016). We considered these data less reliable (high precision in position but low entropy) than data obtained from video footage and photographs [lower precision in position but high entropy (section Species Distribution Modeling)]. Areas deeper than 45 m marked as "macroalgae present" on hydroacoustic scans were excluded, since macroalgal vertical distribution was only verified to this depth (video footage in Peñón de Pesca). As macroalgae occur only on hard substrates, the following assumptions were made for the data sets in **Table 1**: (1) substrate coarser than "gravelly sand" is assigned as hard substrate; (2) at locations with soft sediment, the absence of macroalgae is assumed; (3) "macroalgae present" sites were classified as hard substrate if macroalgae coverage was 100% and the sea floor was not visible.

The first description of the spatial extent of sublittoral macroalgae coverage in Potter Cove was published by Klöser et al. (1996). Macroalgae distribution was manually extrapolated toward depth based on small-scale dive observations (video transects) recorded down to 30 m water depth during the summer season 1993/1994. For comparability, we georeferenced and clipped the published map to the extent of our study area (785.31 ha) and digitized the areas assigned to macroalgae coverage (109.63 ha, 13.96%). A mean macroalgal summer production of 5.55 t ha−<sup>1</sup> was calculated based on the accumulated monthly production published by Quartino and Boraso de Zaixso (2008). Their sampling was performed by scuba diving at six sites, from January to March 1994, 1995, and 1996. Three sampling units of 1 m<sup>2</sup> were placed at 0, 5, 10, 20, and 30 m along 26 transects perpendicular to the shore. Biomass data from two sites obtained during two summer seasons 1994–1995 and published growth rates (Wiencke, 1990a,b; Gómez and Wiencke, 1997) were used to calculate the macroalgal production. Quartino and Boraso de Zaixso (2008) assumed biomass as a mean over the water depth of 0–30 m. We used the same assumptions to estimate the macroalgal summer production per area in the simulations.

FIGURE 2 | Input data for the spatial distribution modeling (SDM) in Potter Cove located between Barton and Potter Peninsula [map between (A) and (B)]: presence/absence macroalgae (A), presence/absence hard substrate (B) and six relevant environmental variables; probability of hard substrate (C), suspended particulate matter (SPM) (D), distance to the glacier front (E), bathymetry (F), total organic carbon (TOC) (G), and slope (H), ordered by the mean variable importance value resulting from the modeling process. For data sources please refer to Table 1 and Supplementary Table 1. Projection: WGS84, UTM 21S.

The hard substrate presence and absence data set resulted from several data sources acquired between 2010 and 2015 with different methodologies: van Veen grab samples, video material and photographs, acoustic data, and derivatives inferred from assumption 3 as defined above. The final data set was interpolated using indicator kriging, which produced a probability raster of occurrence (**Figure 2C**). For detailed information, see **Supplementary Material** and **Supplementary Table 1**.

#### Environmental Variables

The use of the Biomod2 spatial-temporal framework requires the definition of thematic maps (raster data) that describe the macroalgae related ecosystem.

The spatial coverage of an SPM plume (snapshot of a normal situation) is visible on the satellite image from 2013/03/07 (DigitalGlobe, 2014) and was derived from its 4th image band. The continuous raster cell values of the satellite image provide relative SPM values at a high spatial resolution; the darker the cell color, the higher the corresponding SPM value (**Figure 2D**). SPM is a highly dynamic variable in both space and time, depending on the weather conditions. It is mostly connected to air temperature and to wind speed and direction. The geostatistical data analysis revealed that the interpolation of a consistent data set would not result in a reliable raster data set representing, e.g., seasonal mean values. We therefore decided to use an SPM snapshot, which represents an accurate and spatially consistent data set of the SPM plume extent and the relative SPM amount in the water that is characteristic of normal weather conditions on King George Island according to meteorological data (doi: 10.1594/PANGAEA.80825, doi: 10.1594/PANGAEA.808250, doi: 10.1594/PANGAEA.758314).

The mean Euclidean distance to the nearest glacier front (**Figure 2E**) was calculated in ArcGIS 10.5.1 based on the glacier front digitized from the satellite image (DigitalGlobe, 2014).
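The distance calculation can be illustrated with a minimal sketch. It assumes that raster cell centers and digitized glacier-front vertices are available as projected coordinates in meters; the function name `distance_to_front` and the toy coordinates are hypothetical and do not reproduce the ArcGIS tool used in the study.

```python
import numpy as np

def distance_to_front(cell_xy, front_xy):
    """Euclidean distance from each cell center to the nearest front vertex.

    Hypothetical helper: cell_xy has shape (n, 2), front_xy has shape (m, 2),
    both in projected coordinates (meters).
    """
    # Pairwise coordinate differences, shape (n, m, 2)
    diffs = cell_xy[:, None, :] - front_xy[None, :, :]
    # Pairwise distances, shape (n, m); keep the smallest one per cell
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

# Toy example: two cells and a front described by two vertices
cells = np.array([[0.0, 0.0], [3.0, 4.0]])
front = np.array([[0.0, 0.0], [3.0, 0.0]])
distances = distance_to_front(cells, front)
```

For a densely digitized front line, the nearest-vertex distance approximates the distance to the line itself.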

The bathymetry raster (**Figure 2F**) with a resolution of 5 × 5 m was processed based on single beam data from the Argentinean Antarctic Institute (Instituto Antártico Argentino, IAA) published in Wölfl et al. (2014), and multibeam data acquired by the United Kingdom Hydrographic Office (UKHO, 2012) with a cell size of 5 × 5 m. A coastline digitized from the satellite image (DigitalGlobe, 2014) supplemented the interpolation process. The "Topo to Raster" tool in ArcMap 10.3 was used to merge the three data sets, with the coastline representing the 0-m-contour for the interpolation process ("contour type option"). For a detailed description of the data processing refer to Jerosch and Scharf (2015).

The TOC [mass%] raster (**Figure 2G**) was interpolated using the top sediment layers (up to 2 cm) of 47 published (Monien et al., 2014) and 10 unpublished (Monien, unpublished) push core samples taken in 2010. The statistical errors of several interpolation methods (e.g., IDW, Empirical Bayesian Kriging, Indicator, Ordinary, and Co-Kriging) with changing settings were compared (**Supplementary Material**).

The slope (**Figure 2H**) is defined as the seabed gradient in the direction of maximum inclination (e.g., Lundblad et al., 2006; Wilson et al., 2007) and was calculated from the directional East-West and North-South gradient of the processed bathymetry raster (DEM Surface Tools, Jenness, 2013).
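As a minimal sketch of this definition, the slope can be written as the arctangent of the combined magnitude of the two directional gradients. The function name `slope_degrees`, the use of NumPy finite differences, and the toy grid are illustrative assumptions; the DEM Surface Tools implementation may differ in detail.

```python
import numpy as np

def slope_degrees(depth, cell_size=5.0):
    """Slope in degrees from a bathymetry grid (hypothetical helper).

    depth: 2-D array of depths in meters; cell_size: grid spacing in meters.
    """
    # np.gradient returns the derivative along rows (north-south)
    # and along columns (east-west), in that order
    dz_dy, dz_dx = np.gradient(depth, cell_size)
    # Gradient magnitude in the direction of maximum inclination
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# Toy 5 m grid: a plane dropping 5 m per cell toward the east
depth = np.tile(np.arange(4) * -5.0, (4, 1))
slope = slope_degrees(depth)
```

On this toy plane, the seabed drops 5 m over each 5 m cell, so the slope is 45 degrees everywhere.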

Clipping and bilinear resampling of the environmental raster input to the same resolution of 5 × 5 m was processed on a raster stack in an R statistics environment (R-3.1.2, RCoreTeam, 2014). All geospatial raster data were projected to UTM21S (WGS1984) coordinates, clipped and resampled to the resolution and the extent of a template raster for the SDM.

#### Projection Induced by Climate Change

Even if climate warming slows down during a period of transient cooling (Turner et al., 2016; Oliva et al., 2017), the time lag of ice mass response to the current climatic conditions will cause further glacial retreat until the glacier is in equilibrium with the climatic boundary conditions (Osmanoglu et al., 2014; Falk et al., 2018). "Equilibrium" means mass accumulation equaling ablation, resulting in an overall mass balance of zero. For the future scenarios, we neglected the process of hard substrate variation in the model since the glacier is currently on land. For the past scenarios, we referred to the ice-free areas of the years 1988 and 1995 based on the glacier front lines published by Rückamp et al. (2011). Assuming constancy of the present retreat rates on land in the future, we can compute changes of the present ice-mass extent, which is between 35 and 90 m terrestrial elevation of the glacier equilibrium line, and the 110 or 230 m altitude of the Warszawa Icefield equilibrium line. Under these future scenarios, the distance melt water would travel through loose rocks and thawing soil would roughly double or triple (Falk et al., 2018). Based on these predictions for glacial retreat and assuming a linear increase of SPM discharge (derived from Schloss et al., 2012) with increasing distance between the glacier front and the coastline, we devised four different modeling scenarios for a predictive analysis with varying amounts of SPM entering the Potter Cove marine system. Scenarios 1 and 2 represent conditions with lower SPM discharge (0.5- and 0.75-fold). Scenarios 3 and 4 represent an increasing rate of meltwater discharge (moderate to intense) into the system (1.25- and 1.5-fold). For each of the scenarios, supplemental ensemble models (EMs) were calculated that consider the same modeling approaches identified for the "tom" status best fit. The only difference consists in the modified SPM raster cell values: scenario 1 assumes 50% of the "tom" status SPM raster value, scenario 2 75%, scenario 3 125%, and scenario 4 150% (**Table 2**). The appropriate years representing the scenarios were estimated by a linear extrapolation of the significant regression line published by Schloss et al. (2012) for Potter Cove summer SPM data between 1990 and 2010. We set the estimated SPM value therein for the year 2010 (17.296 mg m−<sup>3</sup>) as 100% and used this value as a basis to identify years for the simulated scenarios as follows: years 1991 (scenario 1), 1998 (scenario 2), 2019 (scenario 3), and 2026 (scenario 4). Scenario 3 qualifies as "future scenario" because the "tom" status of macroalgae presence data relates to data acquired between 2008 and 2015.
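The scenario construction amounts to multiplying the "tom" SPM raster by a constant factor per scenario. The sketch below illustrates this under stated assumptions: the raster is held as a NumPy array, NaN marks cells outside the cove, and the names `SCENARIO_FACTORS` and `scenario_rasters` are hypothetical.

```python
import numpy as np

# Scaling factors relative to the "tom" SPM raster, as listed in the text
SCENARIO_FACTORS = {1: 0.50, 2: 0.75, 3: 1.25, 4: 1.50}

def scenario_rasters(spm_tom):
    """Return one scaled copy of the SPM raster per scenario (hypothetical helper)."""
    return {scenario: spm_tom * factor
            for scenario, factor in SCENARIO_FACTORS.items()}

# Toy relative SPM values; NaN marks cells outside the cove (an assumption)
spm_tom = np.array([[0.2, 0.8], [np.nan, 1.0]])
scenarios = scenario_rasters(spm_tom)
```

All other predictor rasters stay unchanged between scenarios, so differences in the projected macroalgae distribution can be attributed to SPM alone.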

#### Biomod2 Model Calibration and Validation

#### Species Distribution Modeling

We applied ten different modeling algorithms and two prevalence-independent (discrimination) performance metrics, the threshold-independent Relative Operating Characteristic (ROC) and the threshold-dependent True Skill Statistic (TSS). The importance of each variable for the modeled response variable was tested with 10 permutations within the Biomod2 environment (section Modeling Approach). Single models were run with repeated random data splits; 70% of the data were used to train the models, while the remaining 30% were used to validate model performance. Due to the cross-validation procedure, each of the 10 algorithms was replicated 20 times for the model calibration, amounting to a total of 200 modeled results. Hydroacoustic data were weighted by 0.75 and image data by 1 (section Species Distribution Modeling).
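The calibration design (10 algorithms, 20 random 70/30 splits, 200 runs) can be sketched as follows. This is not Biomod2 code: the helper names, the record count of 1,000, the seed, and the choice to share each split across all algorithms are assumptions made for illustration.

```python
import random

ALGORITHMS = ["RF", "MaxEnt", "ANN", "GBM", "CTA",
              "GAM", "GLM", "SRE", "MARS", "FDA"]

def random_split(n_records, rng, train_fraction=0.7):
    """Shuffle record indices and split them into training and test sets."""
    indices = list(range(n_records))
    rng.shuffle(indices)
    cut = round(n_records * train_fraction)
    return indices[:cut], indices[cut:]

def plan_runs(n_records, n_replicates=20, seed=42):
    """List every (algorithm, replicate, train, test) combination."""
    rng = random.Random(seed)
    runs = []
    for replicate in range(n_replicates):
        # Assumption: one split per replicate, shared by all algorithms
        train, test = random_split(n_records, rng)
        for algorithm in ALGORITHMS:
            runs.append((algorithm, replicate, train, test))
    return runs

runs = plan_runs(n_records=1000)
```

Each entry in `runs` corresponds to one alternative realization; fitting and scoring the models would happen per entry.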

#### Model Validation

The area under the ROC curve (AUC) validation statistic is a commonly used threshold-independent accuracy index for assessing the capacity of species distribution models. It ranges from 0 to 1 (1 = highly accurate prediction, 0.5 = prediction no better than random). The ROC index defines the probability that an SDM will rank a presence locality higher than an absence (Pearce and Ferrier, 2000; Liu et al., 2009) and is therefore not well-suited for modeling presence-only data (van Proosdij et al., 2016). According to Pearce and Ferrier (2000), rates higher than 0.9 indicate excellent discrimination because the sensitivity rate is high relative to the false positive rate.
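The ranking interpretation of the AUC can be made concrete with a short sketch. It computes the fraction of presence/absence pairs in which the presence site receives the higher score, counting ties as one half; the function name and the toy scores are illustrative assumptions, not the Biomod2 implementation.

```python
def auc_by_ranking(presence_scores, absence_scores):
    """AUC as the share of presence/absence pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0   # presence ranked above absence
            elif p == a:
                wins += 0.5   # ties count as half
    return wins / (len(presence_scores) * len(absence_scores))

# Toy predicted probabilities at observed presence and absence sites
presence = [0.9, 0.8, 0.4]
absence = [0.7, 0.3, 0.2]
auc = auc_by_ranking(presence, absence)
```

In this toy case, eight of the nine pairs are ranked correctly, so the AUC is 8/9.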


TABLE 2 | Macroalgae progression over simulated time spans (1991–2026) depending on varying SPM intensities (\*Probability of macroalgal occurrence >75%; <sup>+</sup>after Quartino and Boraso de Zaixso, 2008, applying a mean summer production of 5.55 t ha−<sup>1</sup>).

The deviance between observations and predictions (the subtraction of the output score from 1 for presences and from 0 for absences) throughout the whole cove may reflect the areas where commission error (negative deviance) and omission error (positive deviance) spatially coincide (Lobo et al., 2008). The deviances between observations and predictions are expressed as under- and overestimations as follows: underestimation is defined as −97 to −50% deviance of measured to predicted value (e.g., measured 1 for presence and predicted 0.25), moderate underestimation −50 to −25%, good −25 to 25%, moderate overestimation 25 to 50%, overestimation 50 to 91% (e.g., measured 0 for absence and predicted 0.75). Besides over- and underestimation, this itemization also reveals the regional distribution of the errors, information that is not provided by the TSS and the ROC scores.
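The class boundaries above can be sketched as a simple lookup. The sign convention used here (predicted minus measured, in percent) is an assumption chosen so that the worked examples in the text fall into the stated classes; how values exactly on a boundary are assigned is also an assumption, and the function name is hypothetical.

```python
def deviance_class(measured, predicted):
    """Assign a deviance class to one site (hypothetical helper).

    measured: 1 for presence, 0 for absence; predicted: probability in [0, 1].
    Assumption: deviance = (predicted - measured) * 100.
    """
    deviance = (predicted - measured) * 100.0
    if deviance < -50:
        return "underestimation"
    if deviance < -25:
        return "moderate underestimation"
    if deviance <= 25:
        return "good"
    if deviance <= 50:
        return "moderate overestimation"
    return "overestimation"

# Worked examples from the text
presence_case = deviance_class(measured=1, predicted=0.25)
absence_case = deviance_class(measured=0, predicted=0.75)
```

Mapping these classes back onto the site coordinates gives the spatial error pattern that the global TSS and ROC scores cannot show.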

As the AUC is used for SDM evaluation (e.g., Vorsino et al., 2014) but has also been criticized (e.g., Jiménez-Valverde, 2012), we also provide the evaluation criterion TSS, recommended by Allouche et al. (2006). The TSS is a prevalence-independent measure of the performance of species distribution models and favors the combination of binary predictions that best separates presences from absences. It corresponds to the sum of sensitivity and specificity minus one. The "sensitivity" value denotes the proportion of presences correctly predicted, whereas the "specificity" value denotes the proportion of absences correctly predicted (Barbet-Massin et al., 2012). The TSS ranges from −1 to +1 and tests the concordance between the expected and observed distribution. A TSS value of +1 indicates perfect agreement between the observed and expected distributions, whereas a value of 0 defines a model with a predictive performance no better than random. An evaluation metric quality measure of 0.7 or higher indicates good or very good performance of the model (Thuiller et al., 2010).
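The TSS definition translates directly into a few lines of code. The sketch below assumes binary observations and binary predictions (after thresholding); the function name and the toy data are illustrative.

```python
def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1 for binary presence/absence data."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # presences hit
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # presences missed
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # absences hit
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false presences
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

# Toy data: 4 presences (3 predicted correctly), 6 absences (5 predicted correctly)
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
tss = true_skill_statistic(observed, predicted)
```

Here the sensitivity is 3/4 and the specificity is 5/6, giving a TSS of about 0.58, which would fall below the 0.7 threshold used in this study.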

#### Ensemble Modeling

The application of Biomod2 is one of several current approaches to modeling species' distributions using presence/absence data and environmental data. Because of stochastic elements in the algorithm and underdetermination of the system (multiple solutions for the model optimization), no unique solution is produced. Such ensemble models have been used extensively in credit scoring applications and other areas because they are considered to be more stable and, more importantly, to predict better than single classifiers (Lessmann et al., 2015). They are also known to reduce model bias and variance (Kim et al., 2006; Tsai and Hsiao, 2010). Ensemble classifiers pool the predictions of multiple base models. Much empirical and theoretical evidence has shown that model combination increases predictive accuracy (Paleologo et al., 2010; Finlay, 2011). Biomod2 proposes an ensemble modeling approach that synthesizes two or more SDMs of best fit into a single ensemble model (EM), to improve model accuracy by including an indication of fuzziness and to assess model congruence. This inclusive statistical procedure improves the stability and accuracy of predictive nonlinear models because it integrates uncertainties in parameter values and model structure. The approach has already been applied for ecosystem degradation by Vorsino et al. (2014) and for Antarctic sea-level rise simulations by DeConto and Pollard (2016). EMs consider uncertainties such as dependence on the initial conditions and partially incomplete model formulation.

In this study, the EM was enhanced successively by the systematic parameter selection and an optimization of the TSS threshold by visual revision. The total number of alternative realizations (200) was scaled by a binomial GLM to ensure comparable results. The EM was calculated as the mean value of 135 alternative realizations with a high TSS value (>0.7) defined as "good" prediction accuracy (Thuiller et al., 2010).
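A minimal sketch of the mean-ensemble step follows. It assumes that each realization is available as a probability raster with an associated TSS score; the function name, the toy rasters, and the toy scores are hypothetical, and the binomial GLM scaling step is not reproduced.

```python
import numpy as np

def ensemble_mean(predictions, tss_scores, threshold=0.7):
    """Average the prediction rasters whose TSS exceeds the threshold."""
    keep = np.asarray(tss_scores) > threshold
    if not keep.any():
        raise ValueError("no realization passes the TSS threshold")
    stack = np.asarray(predictions)[keep]
    # Cell-wise mean across the retained realizations
    return stack.mean(axis=0), int(keep.sum())

# Three toy 2 x 2 probability rasters with made-up TSS scores
predictions = [
    np.array([[0.9, 0.1], [0.5, 0.3]]),
    np.array([[0.7, 0.3], [0.5, 0.1]]),
    np.array([[0.0, 1.0], [1.0, 0.0]]),
]
tss_scores = [0.85, 0.72, 0.40]
em, n_used = ensemble_mean(predictions, tss_scores)
```

The third toy realization is dropped because its TSS is below 0.7, so the ensemble is the cell-wise mean of the first two.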

# RESULTS

#### Model Quality Assessment

The algorithm-ranking by TSS and ROC scores for all alternative realizations (**Figure 3**) shows best performance of the RF algorithm, followed by CTA, GBM, MaxEnt, and GAM. All 20 realizations of these five algorithms were incorporated into the EM. Furthermore, the majority of MARS (16) and FDA (14) realizations had mean TSS scores clearly >0.7 and a ROC >0.9. Five ANN realizations barely achieved a TSS evaluation score between 0.70 and 0.72 and a ROC between 0.90 and 0.91. GLM and SRE performance was comparatively weak. Hence, these algorithms were not included in the EM. The lineup of the most suitable realizations for each algorithm (**Figure 4**) highlights the spatial distinctions between the results. Most of the models showed good or very good performance in terms of predictive power and accuracy, with highest ROC values of 0.980 and 0.941, obtained for the RF and CTA models, respectively. The poorest performance was shown by the SRE model which had a mean ROC of 0.768. Nevertheless, the mapped distribution patterns varied remarkably depending on the model used. For response curves see **Supplementary Figure 2**. The EM predicts the spatial occurrence of macroalgae in Potter Cove with a high statistical reliability (TSS 0.833/ROC 0.975).

A validation across all 135 models revealed a ranking of environmental variables accounting for the "tom" status macroalgae distribution (**Figure 5A**) by mean variable importance values from high to low as follows: probability of hard substrate occurrence (36.62%), SPM (24.83%), distance to glacier front (12.35%), TOC (9.69%), bathymetry (9.08%), and slope (7.44%). The percent values were independently determined by the model and no interactions were taken into account between the variables.

## Simulation of Macroalgal Distribution in Potter Cove

In the "tom" status EM (2008–2015) in **Figure 5A**, macroalgae communities are concentrated on the northern and northwestern shores of Potter Cove, with diminishing occurrence probability toward the inner glacier-proximal cove section (northeast). The northwestern areas represent hard substrates exposed by glacial retreat. A second region of high occurrence is located close to the southeastern edge of the cove opening into Maxwell Bay (Peñón 1), which has a longer history as an ice-free area. Areas of uncertainty and differences between the alternative realizations were determined based on standard deviation and under- and overestimated areas (**Figure 5B**). Small standard deviations were typical for the deeper parts of the cove where macroalgae are absent. The high standard deviations in the northern part of Potter Cove resulted from small-scale variability in the binary-coded macroalgae input data set of presence (1) and absence (0), located in close vicinity to each other, as well as from areas of poor sample density. Where the standard deviation was high, the model revealed overestimations, predominantly in areas of macroalgae. Here, the model has difficulty representing the small-scale variability of the input data (presence alternates with absence at very close range) and predicts macroalgae presence in areas of general absence of algae. In contrast, only a few underestimations occurred in the inner cove (measured presence predicted as absence).

Macroalgae (water depth of >30 m, hard substrate) cover 62.88 ha of the 785.31 ha Potter Cove total bottom area, which equals 8.01% macroalgae coverage in Potter Cove. Applying the mean macroalgal summer production of 5.55 t ha−<sup>1</sup> estimated from Quartino and Boraso de Zaixso (2008), the mean summer macroalgae production for the study area in Potter Cove is approximately 348.98 t in the "tom" scenario (2008–2015). Of the total modeled macroalgal summer production, 5.22% is estimated for the areas newly ice-free due to glacier retreat (3.48 ha), which amounts to 19.31 t (**Supplementary Figure 3**).

The scenario maps derived from the "tom" EM display the potential macroalgae distributions, based on the assumptions of varying SPM transport dynamics into Potter Cove (see section Biomod2 Model Calibration and Validation) relating to different states of glacier retreat in past and future. Scenario 1 (simulating the state in 1991; **Figure 6A**) and scenario 2 (1998; **Figure 6B**) show more spatially extended macroalgae occurrence probability under reduced SPM input and thus higher light availability in the water column. Especially in the southern peripheral region and in the inner glacial proximal cove section of Potter Cove, macroalgal presence probability becomes significantly greater compared to the 2008–2015 "tom" status. We validated our modeling approach for the past by overlaying the macroalgae distribution obtained in scenario 1 (status 1991) with the digitized map of Klöser et al. (1996) and the original in situ data from 1994 to 1996 (Quartino et al., 2005), and with the coastline of Fourcade glacier from 1995 (Rückamp et al., 2011). Almost everywhere, the predicted areas match fully with the in situ data (**Figure 5**), which additionally confirms the validity of our model.

Scenarios 3 (year presumed 2019) and 4 (2026) in **Figures 6C,D** assume increased sediment stress in the near future and predict a decrease of areas inhabited by macroalgae compared to the "tom" (2008–2015) scenario. The most significant negative changes are predicted for the northwestern area close to the outer cove (Peñón de Pesca).

As a general trend, macroalgal coverage is decreasing between the past, over the "time of measurement," toward the future scenarios, under conditions of increasing SPM runoff into the cove, the basic assumption in our simulations (**Figure 6**). This results in a decrease of macroalgal summer production over time. In scenario 1 (1991) the macroalgae suitable habitat area (probability >75%) extends over 142.75 ha. Applying the mean macroalgal summer production estimate of 5.55 t ha−<sup>1</sup> (Quartino and Boraso de Zaixso, 2008) this equals 792.26 t. In scenario 2 (1998) only 100.25 ha are covered by macroalgae equaling 556.39 t. In contrast, macroalgae area extension decreases to 25.39 ha (≙ 140.91 t) in scenario 3 and to only 7.4 ha in scenario 4 (≙ 41.07 t). Distribution maps for macroalgae under varying SPM conditions (**Figure 6**) are available at doi: 10.1594/PANGAEA.854410 (Jerosch et al., 2015).
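The production estimates follow from multiplying each modeled habitat area by the mean summer production of 5.55 t ha−1. The sketch below reproduces this arithmetic with the areas reported in the text; the variable and function names are illustrative.

```python
# Mean macroalgal summer production (t per ha), after Quartino and Boraso de Zaixso (2008)
SUMMER_PRODUCTION_T_PER_HA = 5.55

# Modeled macroalgae areas (ha) as reported in the text
AREAS_HA = {
    "scenario 1 (1991)": 142.75,
    "scenario 2 (1998)": 100.25,
    "tom (2008-2015)": 62.88,
    "scenario 3 (2019)": 25.39,
    "scenario 4 (2026)": 7.40,
}

def summer_production(area_ha, rate=SUMMER_PRODUCTION_T_PER_HA):
    """Summer production in tons for a given macroalgae area."""
    return area_ha * rate

production_t = {label: round(summer_production(area), 2)
                for label, area in AREAS_HA.items()}
```

The resulting values match the tonnages quoted for each scenario, which confirms that the reported production figures are a direct area-based upscaling.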

their predictive quality according to their True Skill Statistic (TSS) and Relative Operating Characteristic (ROC) scores. The probability of macroalgae occurrence is given in %. Note that none of the models generated with GLM (J) and SRE (K) qualified for inclusion in the EM (TSS <0.7). The final mean Ensemble Model (I) is highlighted with a green frame.

# DISCUSSION

Enhanced spatial knowledge of suitable habitat conditions for macroalgal growth allowed for refined estimations of macroalgal summer production in Potter Cove on KGI, compared to previously published estimates based on straightforward up-scaling. We deem the integrative EM approach preferable to approaches based on a single SDM, as it accounts for uncertainty (e.g., initial conditions, imperfect model formulation) and model congruence (e.g., Vorsino et al., 2014; DeConto and Pollard, 2016). In agreement with the ecology and physiology of benthic macroalgae (Wiencke and Clayton, 2002; Hanelt et al., 2003; Zacher et al., 2009; Hurd et al., 2014), the present EM identified hard substrate occurrence and SPM distribution in surface waters as the main drivers for macroalgae distribution and colonization in this Antarctic fjord. Sufficient light availability is, indeed, an important prerequisite for a positive carbon balance and the build-up of macroalgal biomass (Gómez et al., 1997; Deregibus et al., 2016). On the other hand, the model does not include important factors such as ice scouring in the inner part of the cove (Deregibus et al., 2017) and sedimentation effects on algal grazers (Zacher et al., 2016), and it does not account for obvious interactions and feedback loops between different input variables. Newly opening hard substrate surfaces in glacial proximity can be smothered and buried under sediment deposits as glacial run-off continues, reducing macroalgae habitat availability.

To our knowledge, we here present the first correlative distribution model of macroalgae in polar regions. The SDM was evaluated reliably (TSS 0.833/ROC 0.975) by using independent field data for past and present scenarios. Both continuous performance metrics such as AUC and binary performance metrics such as TSS are independent of prevalence (i.e., proportion of spatial coverage). They are often employed as they are frequently implemented in software solutions such as Biomod2. Prevalence-independent metrics are, however, limited to measuring discrimination and cannot be used to assess calibration. Binary metrics often select models that have reduced ability to separate presences from absences, which can lead to uncertain estimates (Lawson et al., 2014). We applied 20 runs for each algorithm, generating 20 AUC and TSS values for each model, to provide a more informative evaluation of model accuracy than a single estimate. We used AUC and TSS, two evaluation methods found in many published examples (e.g., Hijmans, 2012; Vorsino et al., 2014), to compare the modeling results obtained with different algorithms for our data set. However, we agree with the recommendation of Lawson et al. (2014) and Muscarella et al. (2014) for wider application of prevalence-dependent continuous metrics, particularly likelihood-based metrics such as Akaike's Information Criterion (AIC), to assess the performance of presence–absence models. As AIC has not yet been implemented in Biomod2, combination with the R package ENMeval (Muscarella et al., 2014) is recommended for future SDM studies.

Furthermore, the model could be improved by applying structured, spatially segregated allocation of data to calibration and validation data sets instead of random splitting, to assess the ability of the model to predict more distant locations and to conduct spatially independent model evaluations (Dormann et al., 2007; Hijmans, 2012; Muscarella et al., 2014; Roberts et al., 2017). This is especially recommended in studies in which training and test data sets are not spatially independent in all areas as was the case in Potter Cove. This approach could have in particular improved the predictions in the northern part of the cove where under- and overestimations occur in close vicinity (**Figure 5**; Barnes and Clarke, 2011; Lawson et al., 2014; Muscarella et al., 2014; Boavida-Portugal et al., 2018).

The model considers two decisive but opposed effects influencing the distribution of macroalgae: new ice-free areas due to tidewater glacier retreat provide new potential habitats, while increased sediment run-off reduces light availability, the prerequisite for algal growth (Quartino et al., 2013). The extrapolation of a linear regression on long-term SPM data (1991–2010) (after Schloss et al., 2012) to the year 2026 was used to simulate two past and two future scenarios with distinct SPM levels. We studied macroalgae distribution in the year of glacier transition onto land; hence no more newly ice-free areas can be expected in future Potter Cove scenarios. However, the time elapsed since ice retreat has apparently not allowed for appreciable macroalgal growth in every section of the newly available habitat. Hence, the model predicts further alterations of the macroalgal habitat in Potter Cove as long as SPM concentrations increase during the growing season, and a likely decrease in macroalgae biomass, albeit with new colonization of recently ice-free and still not overgrown habitats in the inner cove.

The EM predicts decreasing macroalgal area coverage for particular scenarios with increasing sediment run-off through the past to the recent past ("tom" status) and into the future. Both the predicted "tom" status (2008–2015) and the past 1991 scenario 1 could be confirmed by comparison with empirical knowledge of past macroalgal distribution, based on underwater video and photographic transects. Macroalgae presence/absence data (**Figure 2A**) reveal low macroalgae distribution along the south coast, which explains low probability values in the "tom" scenario, whereas future and past scenarios show ample macroalgae probabilities. This effect is caused by hard substrate availability and by local variation of SPM. Both the measured and then interpolated SPM data and the satellite pixels show slightly lower SPM values along the southern coast. However, SPM residence time and plume extension are highly variable in space and time. Nevertheless, we can predict low macroalgae probability on the southern coast because the currents push the SPM plume (Lim et al., 2013) toward this side, and because the sandy bottom is covered by smaller and looser stones and rubble rather than by rocks as on the north coast.

The modeled mean summer macroalgae production for 1991 is 23% greater than the production estimated for 1994–1995 (790 vs. 608.65 t) based on phototransects (Klöser et al., 1996; Quartino and Boraso de Zaixso, 2008) and calculated for exactly the same spatial extent of study area. We attribute the main difference to a methodological underestimation due to the low data density. According to the modeled results, there are indications of a higher macroalgae coverage at the northern coast of the cove compared to the study of Klöser et al. (1996) (**Supplementary Figure 3**). However, decreasing production over time in our simulation is in good agreement with our recent results showing a slight decrease in production from 1991 to 1994–1995.

SPM depends significantly on weather conditions, mainly air temperature and wind speed, as well as on the duration of these conditions. Our analysis showed that the weather conditions (wind speed and direction) 1 day or hours before the measurement correlate with the SPM amount in the water column. We are currently working on modeling the SPM dynamics in Potter Cove coupled with FESOM-C (Androsov et al., 2019), which will be used for future SDMs in Potter Cove. This new analysis shows that meteorological conditions on the day before the measurement strongly influence spatial SPM concentrations in Potter Cove surface waters (Neder and Fofonova, unpublished data).

The SPM snapshot we used in this study indicates decreasing SPM concentration from the head toward the outer cove. Along the same transect, macroalgae maximum growth depth and summer production increase correspondingly, and macroalgal species composition changes (Quartino et al., 2013; Deregibus et al., 2016; Campana et al., 2018). Increasing turbidity, meaning less light availability, may lead to an upward shift of the macroalgae at the coastline, because they cannot maintain a positive carbon balance at greater depth under high SPM (Deregibus et al., 2016). An upward shift of macroalgae was also found for an Arctic fjord when comparing data from the mid-1990s with 2012–2014 (Bartsch et al., 2016). Not only the decreased light availability for the algae, but also the sediment per se may reduce macroalgal recruitment success by up to 100% (Zacher et al., 2016). This may be due to processes such as sediment scour and burial (Airoldi, 2003).

While scenarios 1 (1991) and 2 (1998) model macroalgal coverage of 143 ha and 100 ha, respectively, Klöser et al. (1996) estimated 110 ha for 1993/94. The model describes the past and the "tom" status of macroalgal distribution in Potter Cove with a slight tendency toward underestimation. The arguably crude projection of macroalgal community distribution and inferred productivity in a future of increasing discharge of eroded sediments for 2019 (scenario 3) and 2026 (scenario 4) highlights a general trend toward a dramatically reduced macroalgal summer productivity inside Potter Cove as melting of the Fourcade glacier continues. This process is mitigated by increased colonization and productivity of macroalgae inside the cove in shallow hard-bottom areas formerly covered by the glacier (Quartino et al., 2013; Deregibus et al., 2016; Campana et al., 2018), which in turn suggests a general increase in benthic primary production in polar coastal areas.

This rather drastic decline of productivity presumably represents an overestimation, because modeled SPM values were reduced or increased equally in all areas of the cove and because we assumed a linear relation between distance to the glacier and SPM change in time. Monien et al. (2017) stated, however, that up to 50% of the plume sediments are deposited in glacier-proximal areas (the inner cove) and do not affect all areas equally, as assumed in our model. This is also indirectly confirmed by light measurements at different sites of the cove, which show higher light penetration in the outer cove than in the inner cove (Quartino et al., 2013; Deregibus et al., 2016). Furthermore, the satellite image showing the SPM plume represents a snapshot of a single day, potentially failing to adequately represent the general situation. However, even if changes are less pronounced, the negative effect of increased sedimentation on macroalgal distribution and production is clearly visible and reproducible with our applied data sets.

Further spatial factors that structure benthic communities in polar areas, such as ice scouring and sea ice timing, can be added to the model (Smale et al., 2008; Barnes, 2017; Deregibus et al., 2017). Clark et al. (2013) predicted that earlier ice break-up can shift shallow water ecosystems from invertebrate-dominated to macroalgae-dominated communities in areas where hard substrate is present. In recent years we observed a higher frequency of ice-free winters in Potter Cove (doi: 10.1594/PANGAEA.773378, Gómez Izquierdo et al., 2009). In the outer areas of Potter Cove, big icebergs are a major disturbance factor (Klöser et al., 1996), and iceberg incidence inside the cove has increased in recent years (Deregibus et al., 2017). In contrast, ice disturbance in the inner cove is produced by the flow of ice blocks through the glacier front line (Falk et al., 2016), which may diminish in the future as the glacier front retreats further onto land. Ice scouring has been identified as a driver of increased macroalgal patchiness in shallower shelf areas (Clarke et al., 2007; Clark et al., 2015) and of the coexistence of early and late successional stages (Quartino et al., 2005; Barnes, 2017). Results of ongoing experiments in Potter Cove will provide detailed information about the influence of ice disturbance on macroalgae (Deregibus et al., 2017) and, as soon as they are transferred into a spatial data set, can be included in the model.

Arguably, the productivity estimates presented here are limited by present-day data availability. To obtain more reliable productivity measurements, it would be very important to take year-round data of the main biomass builders into account, as many Antarctic macroalgae show highly seasonal growth starting in late winter/spring (Wiencke and Clayton, 2002). However, late winter to early spring data on algal physiology and light climate are extremely rare and difficult to obtain for polar coasts. For the future, we recommend conducting an appropriate spatial habitat monitoring program for representative areas, which could serve to improve productivity estimates. For improved biomass estimations, we need models for single macroalgal species, since decreasing productivity can also result from the replacement of kelp by more robust but smaller macroalgae species. Furthermore, a monitoring program should include biological feedback in terms of zoobenthos succession on newly ice-free hard substrates, as shown by Lagger et al. (2017) and Campana et al. (2018).

## CONCLUSIONS

There is an urgent need to quantify and model geographic shifts of species and community distribution ranges in times of global change (Sahade et al., 2015; Singer et al., 2016; Urban et al., 2016). Quality-assessed extrapolation of single measurements in coastal areas and coastal structures such as glacial coves highlights the ecological relevance of local analyses and supports regional and global budgeting of carbon cycling. Correlative models can spatially simulate effects of climate change and allow for reproduction of the results. Species distribution models provide a statistical validation of the results and a ranking of the model-relevant environmental variables for cause-effect assessments. Statistical models are robust and applicable to many groups of species; however, they do not yet consider ecosystem functions and feedback loops such as species interactions. The combination of robust statistical correlative models with mechanism-orientated modeling is a promising approach for future biomass approximations in coastal Antarctic fjord systems under climate change with the next generation of species distribution models (Urban et al., 2016). It remains unclear, however, whether the projections resulting from such models are more reliable (Singer et al., 2016). Here, we provide the first step for spatial-temporal ecosystem modeling of macroalgae in the Antarctic Peninsula region. An exceptional long-term and high-density database, rarely available in polar coastal environments, was used for model construction. The approach can be expanded to associated questions, such as the distribution of macroalgal-associated fauna, or to improve regional macroalgal distribution estimates for systems similarly influenced by glacial erosion. Factors to be considered would include sea ice coverage, hard substrate, and SPM data in the areas of interest.
The significant reduction of macroalgal productivity that we predict for Potter Cove is a hypothesis that can be tested in the upcoming decades. As more data layers for ice scour and grazers become available, the model can be refined.

#### AUTHOR CONTRIBUTIONS

KJ, DA, and KZ contributed to the conception and design of the study. KJ, FS, and HP organized the database and ran the statistical analysis and the R codes. MQ and FS analyzed video transects recorded in 2011–2012. KJ wrote the first draft of the manuscript. DA, KZ, HP, UF, DD, GC, and MQ wrote sections of the manuscript. All authors contributed to manuscript revision and read and approved the submitted version.

#### FUNDING

The research was supported by Grants JE 680/1-1 and ZA 753/1-1 of the priority programme SPP 1158 Antarctic Research with Comparative Investigations in Arctic Ice Areas of the German Research Foundation (DFG). The present manuscript also presents an outcome of the EU research network IMCONet funded by the Marie Curie Action IRSES, within the Seventh Framework Programme (FP7 IRSES, Action No. 318718).

#### ACKNOWLEDGMENTS

The work was performed based on data collected at Carlini (former Jubany) Station, Dallmann Laboratory, within the framework of the scientific collaboration existing between Instituto Antártico Argentino/Dirección Nacional del Antártico and Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research. Special thanks to Patrick Monien for providing unpublished TOC data and to Irene Schloss for the scenario specification of the SPM magnitude. Many thanks to Stephan Frickenhaus for constructive comments, which improved the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2019.00207/full#supplementary-material

The following are the supplementary data to this article: **Supplementary Material** supplies details about the modeling procedure, the input data processing and the model validation. The table in **Supplementary Table 1** provides precise information about the input data.

#### REFERENCES

a dichotomy. J. Biogeogr. 39, 2119–2131. doi: 10.1111/j.1365-2699.2011.02659.x

Bathymetry for Potter Cove, WAP, and Antarctica. Bremerhaven: PANGAEA. doi: 10.1594/PANGAEA.854410

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Jerosch, Scharf, Deregibus, Campana, Zacher, Pehlke, Falk, Hass, Quartino and Abele. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.