Mare Incognitum: A Glimpse into Future Plankton Diversity and Ecology Research

integrated with recent advances in marine ecosystem modeling, may shed light on marine ecosystem structure and functioning. A EuroMarine foresight workshop on the “Impact of climate change on the distribution of plankton functional and phylogenetic diversity” (PlankDiv) identified five grand challenges for future plankton diversity and macroecology research: (1) What can we learn about plankton communities from the new wealth of high-throughput “omics” data? (2) What is the link between plankton diversity and ecosystem function? (3) How can species distribution models be adapted to represent plankton biogeography? (4) How will plankton biogeography be altered due to anthropogenic climate change? and (5) Can a new unifying theory of macroecology be developed based on plankton ecology studies? In this review, we discuss potential future avenues to address these questions, and challenges that need to be tackled along the way.


INTRODUCTION
Marine ecosystems are altered by anthropogenic climate change and ocean acidification at an unprecedented rate (Waters et al., 2016). In recent years, observational studies have documented shifts in plankton biogeography and community structure in several ocean basins associated with sea warming, with changes that rank among the fastest and largest documented (Beaugrand et al., 2002; Poloczanska et al., 2013; Rivero-Calle et al., 2015). How changes in plankton distribution, phenology, and biomass may impact fisheries and other ecosystem services is poorly quantified (Cheung et al., 2013), with large uncertainties in the magnitude of potential cascading effects caused by trophic mismatch (Edwards and Richardson, 2004), trophic amplification (Chust et al., 2014a), and on global biogeochemical cycles (Doney et al., 2012). In consequence, current management policies suffer from a lack of understanding of marine systems (Borja et al., 2010), and biases arise in the perception of potential ocean calamities in the absence of robust evidence.
While recent oceanographic efforts such as Tara Oceans (Pesant et al., 2015) and Malaspina (Duarte, 2015) expeditions have generated a staggering wealth of novel observational data on plankton distribution and diversity (Figure 1), these same data have revealed the extent of our ignorance of marine ecosystem structure and function. A large fraction of plankton diversity recorded in recent surveys cannot be assigned to known taxonomic groups (de Vargas et al., 2015), highlighting how profoundly our knowledge of the planktonic world is biased toward the taxa sampled or cultured. Not only the identity of major players, but also the drivers of community structure and interactions between organisms remain a "mare incognitum." In the surface ocean, plankton composed of prokaryotes (viruses, bacteria, and archaea) and eukaryotes (protists and metazoans; Figure 1) have been shown to form complex interaction networks driven by multiple biotic and abiotic factors (Lima-Mendez et al., 2015), and despite their key role as resource for higher trophic levels, mesopelagic plankton communities are some of the least studied on Earth (St. John et al., 2016).
Despite these gaps in our understanding, the existing data reveal the importance of community composition for marine ecosystem function. For instance, an investigation of planktonic communities at the global scale using high-throughput metagenomic sampling techniques has recently linked carbon export patterns to specific plankton interaction networks (Guidi et al., 2016), suggesting that the who's who in the plankton world is of paramount importance for the carbon cycle. Integrated with revised estimates of species abundance and biomass (Buitenhuis et al., 2013), and combined with advances in statistical (Robinson et al., 2011) and mechanistic modeling techniques (Follows et al., 2007), novel high-throughput metagenomic data may allow us to relate biogeographic patterns of plankton distribution and diversity to further ecosystem processes.
Marine plankton ecology research is thus at a crossroads: at a time when marine ecosystems reveal their nature for the first time, these transient ecosystems have already adapted to environmental changes and are continuing to do so (Waters et al., 2016), with unknown consequences for ecosystem function and ecosystem service provision. In this context, a close collaboration between researchers belonging to various fields of plankton ecology appears timely to identify the most pressing questions, and to accelerate progress in our understanding of marine ecosystem structure and function. Recently, a EuroMarine foresight workshop on the "Impact of climate change on the distribution of plankton functional and phylogenetic diversity" (PlankDiv), held in March 2016 in Villefranche-sur-Mer, France, gathered experts in climate change ecology, species distribution modeling, plankton biology, as well as genomics and evolution. They identified five fundamental questions in future plankton diversity and macroecology research: (1) What can we learn about plankton communities from the new wealth of high-throughput "omics" data? (2) What is the link between plankton diversity and ecosystem function? (3) How can species distribution models be adapted to represent plankton biogeography? (4) How will plankton biogeography be altered due to anthropogenic climate change? and (5) Can a new unifying theory of macroecology be developed based on plankton ecology studies? These questions, along with their associated challenges, are the subject of this review.

THE NEW WEALTH OF PLANKTON DATA
Several recent circumpolar missions have ushered in a new era of plankton biogeography research at the planetary scale. This recent explosion of biological data is perhaps best exemplified by the output of the Tara Oceans expedition (Karsenti et al., 2011). While still only offering a temporal snapshot of marine communities, the 7.2 terabases of metagenomic data gathered are 1,000 times the amount generated by the previous largest marine data project, the Sorcerer II Global Ocean Sampling (Rusch et al., 2007). High-throughput omics data offer great potential to reveal the global structure of transient marine planktonic ecosystems, since genetic methods compare favorably to traditional observational methods such as microscopy or flow cytometry in terms of the time expenditure, expert knowledge required to identify organisms, and the cost of equipment and analysis. The growing spatial coverage of data enables researchers to estimate global-scale taxonomic diversity of unicellular eukaryotes (de Vargas et al., 2015), to identify the main environmental drivers of community structure in marine prokaryotes (Sunagawa et al., 2015), and to delve into the complexity of biotic interactions between plankton species spanning multiple domains of life, as well as their link to global biogeochemical cycling (Lima-Mendez et al., 2015; Guidi et al., 2016). Complementary to a "bulk" screening of marine biodiversity, single-cell genomics approaches allow matching of phenotype and genotype, and have been used to investigate the phylogenetic affinities of microbial dark matter (i.e., currently unculturable microbial organisms; Rinke et al., 2013; Hug et al., 2016) and to uncover niche partitioning within globally distributed lineages of marine microbes (Kashtan et al., 2014).
In combination, bulk and targeted approaches could unravel the taxonomic composition of planktonic organisms, as well as aspects of their ecological function (Thrash et al., 2014; Louca et al., 2016) and genome evolution to new environments (Mock et al., 2017).
Both approaches are challenged by the lack of high-quality reference databases (Sunagawa et al., 2015). This highlights the need for comprehensive reference databases to guide the validation and integration of the streams of new data, and their comparison with taxonomic information (e.g., Buitenhuis et al., 2013). In addition, genomic sampling often results in temporal snapshots of one particular aspect of biodiversity [e.g., ribosomal-RNA based Operational Taxonomic Unit (OTU) richness]. Applying this approach to marine plankton communities at similarly broad geographic scales is difficult and expensive, but necessary to improve the assessments of the temporal variability of plankton diversity (Lewandowska et al., 2014). Currently, high-resolution time-series datasets are often restricted to easily accessible, mostly coastal locations, making extrapolation to the expanses of the open ocean difficult. Therefore, the use of these data for ecological purposes may not be straightforward, especially when trying to estimate abundances of planktonic organisms from metabarcoding (e.g., Decelle et al., 2014).
While the genomic quantification of species composition has become more and more common (Bik et al., 2012; Bik, 2014), and harbors potential for marine ecosystem monitoring in times of rapid environmental and ecosystem change, the link between the identity and the functional role of species remains obscure. Genomic approaches can provide thousands of OTUs, whose metabolic state, morphology, and environmental tolerances are largely unknown. Supplementary measurements of functional traits in laboratory experiments and the quantification of spatiotemporal variability across populations are severely limited by our success in culturing the large diversity of plankton in vitro.
Estimates that <30% of plankton are cultivable highlight the daunting task of obtaining such data across the heterogeneous plankton lineages and put alternatives, such as single-cell screens, metatranscriptomic approaches, or in silico method developments, to the forefront for the characterization of at least some aspects of plankton diversity.

ASSESSING FUNCTIONAL AND PHYLOGENETIC FACETS OF PLANKTON BIODIVERSITY
Traditional approaches have determined marine biodiversity using species occurrence or abundance information at the regional to global scale (e.g., Tittensor et al., 2010). However, there is a growing consensus about the need to assess other facets of biodiversity such as functional diversity, which accounts for biological traits, and phylogenetic diversity to link environmental changes, ecosystem composition and ecosystem function (Naeem et al., 2012; Mouillot et al., 2013). These two promising concepts developed for macro-organisms should be increasingly used within the marine and climate change contexts to further improve our understanding of the link between plankton diversity, ecosystem productivity, or additional functions related to global biogeochemical cycles.
Functional diversity uses a set of complementary indices (Mouillot et al., 2013) combining measures of species abundance with selected physiological and ecological traits suggested to reflect the fitness of an organism, and which may influence ecosystem function (Violle et al., 2007). Since certain traits may occur across species pertaining to different taxa, estimates of functional diversity allow for the comparison of assemblages with little or no taxonomic or phylogenetic overlap, but with similar responses to their environment. This metric can account for the intraspecific variability of ecological strategies (e.g., the trophic status of mixotrophic species), and it can include a diverse range of trait variables (e.g., size, feeding strategy, nutrient uptake kinetics). Although much progress has been made in understanding which characteristics of plankton determine their growth, reproduction, and survival (Litchman and Klausmeier, 2008; Litchman et al., 2013; Benedetti et al., 2016), information on traits is restricted to a few well-studied species (Barton et al., 2013). Consequently, trait choice often depends on subjective criteria such as the availability of data (Petchey and Gaston, 2006); open-access trait databases should therefore be developed for marine species (Costello et al., 2015). In addition, it is challenging to measure multiple functional traits of thousands of species. Although omics data could allow traits to be identified at the community level (Louca et al., 2016), more research is still needed to assign functional traits to sequences, especially for eukaryotic plankton. Despite these methodological issues, trait-based approaches to marine communities open new opportunities for a better understanding of ecosystem functioning and for the development of ecological indicators (Beauchard et al., 2017).
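To illustrate how abundance and trait information can be combined into a single index, the following minimal sketch computes Rao's quadratic entropy, one commonly used functional diversity measure. The choice of this index, the function name `rao_q`, the single trait (cell size on a log scale), and all values are assumptions made for illustration; they are not taken from the studies cited above.

```python
from itertools import combinations


def rao_q(abundances, traits):
    """Rao's quadratic entropy: expected trait distance between two
    individuals drawn at random (with replacement) from the assemblage."""
    total = sum(abundances.values())
    rel = {sp: n / total for sp, n in abundances.items()}  # relative abundances
    q = 0.0
    for a, b in combinations(rel, 2):
        distance = abs(traits[a] - traits[b])  # dissimilarity on one trait
        q += 2 * rel[a] * rel[b] * distance    # each unordered pair counts twice
    return q


# Hypothetical assemblage: counts and log10 cell volume for three taxa
abundances = {"taxon_A": 50, "taxon_B": 30, "taxon_C": 20}
log_size = {"taxon_A": 1.0, "taxon_B": 1.0, "taxon_C": 3.0}

print(rao_q(abundances, log_size))
```

In this toy assemblage, taxa A and B contribute nothing to the index because they share the same trait value, whereas the rarer but functionally distinct taxon C accounts for all of the functional diversity. With several traits, the absolute difference would be replaced by a multivariate dissimilarity.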
An alternative approach relies on the interspecific phylogenetic differences as a proxy for the overall diversity of a system, assuming that biological characteristics linked to individual fitness and ecological roles show phylogenetic conservatism, i.e., that communities consisting of species with a lower degree of relatedness differ more in their respective trait values, and are thus more diverse (Mouquet et al., 2012). Phylogenetic diversity indices (Tucker et al., 2016) measure the breadth and distribution of evolutionary history present in an assemblage (Mouquet et al., 2012; Cadotte et al., 2013), using DNA sequences to assess the phylogenetic distances between species, by aligning sequences to a reference tree, or by de-novo building of phylogenetic trees (Hinchliff et al., 2015).
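As a simple illustration of such an index, the sketch below computes Faith's phylogenetic diversity, i.e., the total branch length connecting a set of species to the root of a tree. The choice of this particular index, the tree encoding (each node mapped to its parent and branch length), and the toy tree itself are assumptions made for the example.

```python
def faith_pd(tree, species):
    """Sum the lengths of all branches connecting `species` to the root.

    `tree` maps each node to (parent, branch_length); the root itself
    does not appear among the keys."""
    visited = set()
    total = 0.0
    for node in species:
        # Walk towards the root, counting each branch only once
        while node in tree and node not in visited:
            parent, length = tree[node]
            visited.add(node)
            total += length
            node = parent
    return total


# Hypothetical rooted tree: ((A:1,B:1)X:2,C:3)root
tree = {
    "A": ("X", 1.0),
    "B": ("X", 1.0),
    "X": ("root", 2.0),
    "C": ("root", 3.0),
}

print(faith_pd(tree, ["A", "B"]))  # two close relatives
print(faith_pd(tree, ["A", "C"]))  # two distant relatives
```

The pair of distant relatives spans more branch length than the pair of sister taxa, which is the sense in which a less closely related assemblage is considered more diverse under this index.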
With the advent of metagenomic data, these promising approaches need to be further explored in terms of their applicability to and relevance for the description of marine ecosystem function. However, the use of phylogenetic diversity critically depends on methodological advances: a substantial fraction of high-throughput sequences obtained by second generation sequencing for microbial communities may still lack sufficient phylogenetic information to provide a reliable phylogenetic placement. In the near future, the popularization of third generation sequencing (e.g., PacBio, Nanopore), which sequences single molecules of DNA in real time, may circumvent this problem, and will provide full opportunities to use phylogenetic diversity estimates to study present and future ecosystem function.

SPECIES DISTRIBUTION MODELING: RUNNING BEFORE WE CAN WALK?
Species Distribution Models (SDMs) are statistical tools that model a species' realized niche, i.e., the environmental conditions under which the species can maintain a viable population (Hutchinson, 1957), by relating its occurrence or abundance to environmental conditions (Guisan and Zimmermann, 2000).
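The principle of relating occurrences to environmental conditions can be sketched with a deliberately crude environmental envelope, which declares a site suitable when every variable falls within the range observed at occurrence sites. This is not one of the statistical SDMs used in the studies cited in this section; the function names, variables (`sst`, `salinity`), and values are hypothetical.

```python
def fit_envelope(occurrences):
    """Record, for each environmental variable, the minimum and maximum
    values observed at sites where the species was found."""
    variables = occurrences[0].keys()
    return {
        var: (min(site[var] for site in occurrences),
              max(site[var] for site in occurrences))
        for var in variables
    }


def is_suitable(envelope, site):
    """A site is suitable if every variable lies inside the envelope."""
    return all(low <= site[var] <= high for var, (low, high) in envelope.items())


# Hypothetical occurrence records (sea surface temperature in deg C, salinity)
occurrences = [
    {"sst": 8.0, "salinity": 34.5},
    {"sst": 11.5, "salinity": 35.0},
    {"sst": 14.0, "salinity": 35.2},
]
envelope = fit_envelope(occurrences)

print(is_suitable(envelope, {"sst": 10.0, "salinity": 34.8}))  # inside the envelope
print(is_suitable(envelope, {"sst": 19.0, "salinity": 35.0}))  # too warm
```

Because the envelope is built from presences only, it inherits any bias in where samples were taken, which is one reason why sampling effort matters so much for plankton SDMs.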
Several key ecological attributes make planktonic species particularly well-suited for SDMs (Robinson et al., 2011): (i) their distribution reflects their environmental preferences, since plankton are short-lived organisms, with population dynamics tightly connected to climate (Sunday et al., 2012); (ii) plankton are less commercially exploited than other marine species, and thus, their spatial patterns are less biased by captures than in the case of many fish and shellfish species. These attributes make them a key group for monitoring the impacts of climate change on biodiversity and ecosystem functioning (Richardson, 2008). So far, SDMs have seldom been applied to study plankton biogeography, with only a handful of studies on phytoplankton (Irwin et al., 2012; Pinkernell and Beszteri, 2014; Brun et al., 2015; Rivero-Calle et al., 2015; Barton et al., 2016) and some more on zooplankton (e.g., Reygondeau and Beaugrand, 2011; Chust et al., 2014b; Villarino et al., 2015; Brun et al., 2016; Benedetti et al., in press). This is due not only to the limited data availability for model development, but also to several unaddressed methodological issues.
In plankton, a major problem with SDMs is the scarcity of occurrence data, which can lead to an incomplete niche description and/or biased models. A major challenge is therefore to discern biological distribution patterns from patterns of sampling effort, especially in traditional taxonomy-based plankton data sets where reliable absence data are usually unavailable and large regions, such as the South Pacific, are chronically undersampled. Using one of the most extensive plankton data sets to date, the North Atlantic Continuous Plankton Recorder data, Brun et al. (2016) found that a suite of commonly used SDMs are unable to predict and hindcast the distribution of zooplankton and phytoplankton example species on the decadal scale. One way to improve SDMs is through careful methodological adjustments, such as a targeted selection of the background (Phillips et al., 2009) and the reduction of environmental predictors and model complexity (Merow et al., 2014). Another approach could be to merge existing data archives and to combine genomic data with traditional approaches in order to reduce the sampling bias. However, since SDMs apply at the species level, this will require species-level identifications, either from microscopy, imaging, or sequencing, which in turn would require maintaining taxonomic expertise in our laboratories and, in parallel, developing specific tools for automatic identification.
In their basic form and most common use, classical SDMs generally do not account for three major ecological processes that may be crucial for plankton distribution: (i) the role of dispersal and its limitation, (ii) biotic interactions, and (iii) intraspecific variability, which we discuss below. The relative importance of these processes in shaping planktonic species' ranges is still under debate (Cermeño and Falkowski, 2009; Chust et al., 2013).
Plankton dispersal is controlled by ocean currents and can impact diversity and community structure (Lévy et al., 2014). Although barriers to dispersal are fewer in the marine realm compared to the terrestrial one (Steele, 1991), coupling ocean connectivity patterns (Treml et al., 2008; Foltête et al., 2012) with niche models is likely important. Source-sink dynamics may arise frequently because of the advection of water masses (e.g., Beaugrand et al., 2007; Villar et al., 2015) that can introduce species to unsuitable regions (Pulliam, 2000), potentially biasing SDMs. Future developments for plankton could ensue from graph-based techniques (Dale and Fortin, 2010) and from coupling SDMs with simple dispersal models (Foltête et al., 2012; Zurell et al., 2016).
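One way to picture such a coupling is to treat ocean connectivity as a directed graph and to intersect the sites reachable from the current range with the sites an SDM considers suitable. The sketch below is a minimal illustration under that assumption; the graph, the set of suitable sites, the step limit, and the function name `reachable` are all hypothetical and do not reproduce any of the cited methods.

```python
from collections import deque


def reachable(currents, sources, max_steps):
    """Sites that can be reached from `sources` within `max_steps`
    downstream moves along the directed graph `currents`."""
    seen = set(sources)
    queue = deque((site, 0) for site in sources)
    while queue:
        site, steps = queue.popleft()
        if steps == max_steps:
            continue  # dispersal budget exhausted along this path
        for downstream in currents.get(site, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append((downstream, steps + 1))
    return seen


# Hypothetical connectivity: site -> sites directly downstream of it
currents = {"A": ["B"], "B": ["C"], "C": ["D"], "E": ["A"]}
# Hypothetical SDM output: sites with suitable environmental conditions
suitable = {"A", "C", "D", "E"}

occupied = {"A"}
accessible = reachable(currents, occupied, max_steps=2)
predicted = suitable & accessible   # suitable and reachable
sink_risk = accessible - suitable   # reachable but unsuitable (potential sinks)

print(sorted(predicted), sorted(sink_risk))
```

In this toy case, site E is environmentally suitable but lies upstream and site D lies beyond the step limit, so neither is predicted to be colonized, whereas site B receives individuals despite being unsuitable, as in a source-sink system.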
Furthermore, the need to account for biotic interactions when predicting species distributions has been advocated (Boulangeat et al., 2012; Wisz et al., 2013). Recently, the exploration of the plankton "interactome" (Lima-Mendez et al., 2015) made it possible to describe how biotic interactions occur across trophic levels and relate to environmental conditions and ecosystem functioning, with many new symbiotic interactions identified (Guidi et al., 2016). When prior knowledge is too limited, food-web models could be inferred from simple size-based or multi-trait assumptions (Albouy et al., 2014), or based on ecosystem models (e.g., Follows et al., 2007; Le Quéré et al., 2016) in combination with satellite estimates of (phyto)plankton community composition (e.g., Hirata et al., 2011).
Finally, SDMs do not consider intraspecific variability, thus assuming that genetic adaptation is negligible. However, many planktonic species exhibit local adaptation (Peijnenburg and Goetze, 2013; Sjöqvist et al., 2015) or consist of several ecotypes with different environmental preferences, and phenotypic plasticity, dispersal, and evolutionary changes could mitigate climate change impacts as they could help species to adapt to changing conditions (O'Connor et al., 2012). One possibility to account for both local adaptation and phenotypic plasticity is to include a population-dependent component in mixed-effects models (e.g., Valladares et al., 2014). Furthermore, the joint use of genomic and taxonomic information may help to constrain the differences between subpopulations or ecotypes of a species, and to identify so-called cryptic species.

ADRIFT IN AN OCEAN OF CHANGE
In contrast to work on higher trophic levels (e.g., Cheung et al., 2009), the investigation of the response of plankton to future climate changes has mostly focused on bulk variables (e.g., biomass, production), with large uncertainties associated with the simulated response of primary and secondary production (e.g., Bopp et al., 2013; Laufkötter et al., 2015). Yet, observational evidence of changes in planktonic ecosystems has been accumulating over the past decades, with ongoing efforts to attribute these changes to specific environmental drivers (e.g., Beaugrand et al., 2008; Rivero-Calle et al., 2015).
SDMs have been used to support observations of poleward plankton distribution range shifts in response to global warming in the North Atlantic (Beaugrand et al., 2002; Richardson, 2008), as well as changes in the relative abundance of certain groups (Rivero-Calle et al., 2015). However, range shifts and in particular phenological changes can vary according to region and species, leading to unexpected emergent patterns (Richardson et al., 2012; Poloczanska et al., 2013; Burrows et al., 2014; Barton et al., 2015). In fact, multiple non-exclusive and interlinked adaptation strategies at the organismal level may all operate in concert, or, alternatively, the selection of one strategy may reduce the necessity to employ another. For example, shifts in spatial distribution may preclude the necessity for phenological adjustments in a given species attempting to maintain its thermal niche. Other adaptation strategies involve species plasticity and genetic modification in order to face changing conditions (Lavergne et al., 2010; Dam, 2013), which have been documented for spatially isolated zooplankton (Peijnenburg et al., 2006; Yebra et al., 2011), but could not be confirmed for other species (Provan et al., 2009). Another alternative adaptation strategy is the change in depth distribution, i.e., the migration to deeper waters in search of cooler temperatures carried out by fishes (Perry et al., 2005).
Given the multitude of adaptation options, future projections of ecosystem change are prone to large uncertainties. Moreover, disentangling the effects of anthropogenic climate change on plankton distribution and phenology shifts from other drivers (e.g., climate variability, population dynamics) is equally challenging (Chust et al., 2014b). In particular, the combination of controlling factors, together with systematic biases in sampling effort, can lead to biases in estimated trends. The decomposition of factors using different SDMs can detect the so-called "niche tracking," which is the shift of a species' distribution to follow the displacement of its habitat, e.g., poleward shifts (Monahan and Tingley, 2012; Bruge et al., 2016). At the community level, thermal biases between the average thermal affinity of assemblages and local temperature (Stuart-Smith et al., 2015) have to be considered to improve our understanding of the sensitivity of plankton reorganization with warming.
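The community-level thermal bias mentioned above can be made concrete with a short calculation: the abundance-weighted mean thermal affinity of an assemblage is compared with the local temperature. The following sketch assumes that each taxon has a single known thermal affinity and that the bias is expressed as community mean minus local temperature; the sign convention, function names, and all values are assumptions made for illustration.

```python
def community_temperature_index(abundances, thermal_affinity):
    """Abundance-weighted mean thermal affinity of an assemblage (deg C)."""
    total = sum(abundances.values())
    weighted = sum(n * thermal_affinity[sp] for sp, n in abundances.items())
    return weighted / total


def thermal_bias(abundances, thermal_affinity, local_temperature):
    """Community mean thermal affinity minus local temperature (deg C).
    Negative values indicate an assemblage dominated by taxa whose
    affinity is cooler than the water they currently live in."""
    cti = community_temperature_index(abundances, thermal_affinity)
    return cti - local_temperature


# Hypothetical assemblage and thermal affinities (deg C)
abundances = {"taxon_A": 60, "taxon_B": 30, "taxon_C": 10}
affinity = {"taxon_A": 12.0, "taxon_B": 18.0, "taxon_C": 24.0}

print(community_temperature_index(abundances, affinity))
print(thermal_bias(abundances, affinity, local_temperature=17.0))
```

Under these assumptions, the example assemblage has a mean affinity below the local temperature, so further warming would be expected to favor the currently rare warm-affinity taxon.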

TOWARD A UNIFIED THEORY OF MACROECOLOGY
Predicting how species will respond to global environmental change requires an understanding of the processes generating their current large-scale spatio-temporal patterns of diversity and distribution, which is the essence of macroecology. One such predominant pattern on Earth is the decline in biodiversity of terrestrial and marine macroorganisms from tropical to polar areas (e.g., Tittensor et al., 2010). Hypotheses explaining this pattern often call upon evolutionary history (Mittelbach et al., 2007), diversity-area relations (Rosenzweig, 1995), temperature effects (Allen et al., 2002), or climatic stability (Fraser and Currie, 1996). Although these premises often find empirical support, their testing in the open oceans has been limited. Whereas zooplankton likely reflect the general latitudinal trend (Beaugrand et al., 2013), bacterioplankton may form seasonal diversity peaks at high (Ladau et al., 2013) and mid (Sunagawa et al., 2015) latitudes, and for phytoplankton the validity of the global pattern itself and the processes that may explain it are still ambiguous (Rodríguez-Ramos et al., 2015; O'Brien et al., 2016). To alleviate data scarcity, which may have contributed to uncertainty, we suggest the implementation of SDMs as strategic tools to integrate novel with traditional data and to depict aspects of global diversity variation across major taxa and spatio-temporal scales.
The validity of the concept of SDM in plankton and its specific adaptation warrant further testing of the processes that determine plankton distribution, abundance, community assembly, and the maintenance of diversity at local to global scales. More than a decade after the appearance of the unified neutral theory of biodiversity (Hubbell, 2001), there is still an active debate on the relative contribution of demographic stochasticity, dispersal, and niche processes to plankton communities (Pueyo, 2006a,b; Cermeño and Falkowski, 2009; Chust et al., 2013), which promoted the revisiting of the "Paradox of the Plankton" (Hutchinson, 1961). Recent studies have tried to reconcile neutral and niche theories (Adler et al., 2007) and suggest that neutral theory combined with metabolic theory can explain macroecological patterns (Tittensor and Worm, 2016). Furthermore, neutral processes might similarly shape both population genetics and community patterns in plankton. The combination of data from time-series, global in situ observations and experiments on marine plankton provides a unique opportunity to characterize the niches of species (Brun et al., 2015) and to explore the relations between ecological niche characteristics (e.g., niche dissimilarity) and local species richness.
Thus, important open questions include: Is plankton community assembly mainly driven by niche assembly or neutral processes? Does this depend on the spatio-temporal scale of observation? Which method(s) can be used to disentangle the dominating process in community assembly and ecosystem structure? What will be the effect of the removal of geographical barriers that have long separated the Earth's biogeographical provinces on marine plankton diversity ("homogocene," Rosenzweig, 2001)? How does the evolution of microorganism dependency based on gene loss shape the structure and dynamics of communities (Mas et al., 2016)? Due to their fast duplication rates and rapid response to environmental conditions, planktonic communities assemble, dismantle, and re-assemble constantly in natural environments, thus tracking environmental disturbances. Therefore, they are optimally suited to test classical ecological theories established for terrestrial ecosystems, and to answer questions related to diversity-stability relationships, the area-diversity hypothesis, or food web interactions.

CONCLUSION
Plankton ecology research stands at a crossroads. The staggering increase in the wealth of plankton observation data coincides with a time of significant advances in marine ecosystem modeling, which allow, for the first time, the testing of important theories of macroecology in the marine realm. These achievements offer great promise to shed light on marine ecosystem functioning and ecosystem service provision within the context of global climate change. To unlock their potential, we identified a strong need for concomitant developments in the field of bioinformatics and biostatistics, ecological niche modeling, and genetic reference database assembly, thus allowing for a successful integration of these novel with traditional observations, including taxonomic expertise. Paired with the rigorous verification of new and existing macro-ecological theories in the marine realm, and the testing and application of novel biodiversity metrics that better link ecosystem composition to ecosystem function and ecosystem service provision, these theoretical and empirical advances may allow for the urgently needed quantification of potential impacts of climate change on marine ecosystems and feedbacks to higher trophic levels. Due to the complexity of the task, and the scarcity of observational evidence of these transient ecosystems, we conclude that interdisciplinary, collaborative efforts between experts focussing on different aspects of plankton ecology will be critical in mediating this process.

AUTHOR CONTRIBUTIONS
GC, MV, FB, TN, SV, AA, SMV, DR, JI, and SA conceived and wrote the main manuscript text. All authors reviewed the manuscript.

ACKNOWLEDGMENTS
This research was funded by the EuroMarine Network (http://www.euromarinenetwork.eu), through the organization of the PlankDiv EuroMarine Foresight workshop, held at the Observatoire Océanographique de Villefranche-sur-mer, Villefranche-sur-mer, France, in March 2016, and cofunded by the Basque Government (Department Deputy of Agriculture, Fishing and Food Policy). The PlankDiv workshop was also supported by the Laboratoire d'Océanographie de Villefranche-sur-mer (LOV, UPMC/CNRS), the PlankMed action of WP5 MERMEX/MISTRAL, and by the French national programme EC2CO-LEFE (FunOmics project). This is contribution 810 from AZTI Marine Research Division. We thank the editor and three reviewers for their insightful comments, which greatly improved the manuscript.