Synthesis of Thresholds of Ocean Acidification Impacts on Decapods

Assessing decapod sensitivity to regional-scale ocean acidification (OA) conditions is limited by a fragmented understanding of the thresholds at which decapods exhibit a biological response. To address this need, we undertook a three-step data synthesis. First, we compiled a dataset composed of 27,000 datapoints from 55 studies of decapod responses to OA. Second, we used statistical threshold analyses to identify OA thresholds, using pH as a proxy, for 13 response pathways from physiology to behavior, growth, development and survival. Third, we worked with a panel of experts to review these thresholds, weighing the contributing datasets by study quality, and to assign final thresholds and associated confidence scores based on the quality and consistency of findings among studies. The duration-dependent thresholds were within a pH range from 7.40 to 7.80, ranging from behavioral and physiological responses to mortality, with many of the thresholds being assigned medium-to-high confidence. Organism sensitivity increased with the duration of exposure but was not linked to a specific life-stage. The thresholds that emerge from our analyses provide the foundation for consistent interpretation of OA monitoring data or numerical ocean model simulations to support climate change marine vulnerability assessments and evaluation of ocean management strategies.


INTRODUCTION
Increased atmospheric CO2 absorption is causing ocean acidification (OA), with future climate scenarios predicting a decrease in global surface ocean pH by 0.3-0.4 units by the end of this century (Stocker et al., 2013; Bindoff et al., 2019). This phenomenon has the potential to substantially alter fisheries, marine biodiversity, ecosystem services, food security, and carbon storage (Barbier and Burgess, 2017). Particularly vulnerable to OA are marine calcifying organisms, such as mollusks, echinoderms and crustaceans (Kroeker et al., 2013; Fabricius et al., 2014), which are important structural and functional components of marine ecosystems.
Effectively monitoring and managing vulnerable species requires an understanding of the thresholds at which marine calcifiers become vulnerable to OA (Weisberg et al., 2016). Such thresholds have previously been described for pteropods (Bednaršek et al., 2019) and echinoderms (Bednaršek et al., 2021), yet scientific consensus on OA thresholds is lacking for decapod crustaceans despite their critical contribution to marine ecosystems. They form the bulk of zooplankton and serve as an important food source for higher trophic levels. Several species are key predators that recycle carbon from the benthos to the nekton and provide biotic resistance against invasive species (Baird and Ulanowicz, 1989; Silliman and Bertness, 2002; DeRivera et al., 2005; Boudreau and Worm, 2012). They have significant economic importance for commercial and recreational fisheries at global and regional scales, from the U.S. West Coast and polar regions (Hodgson et al., 2016; Long et al., 2017; Pacific States Marine Fisheries Commission, 2019), to the Gulf and Atlantic coasts (Tomasetti et al., 2018), the North Atlantic (Agnalt et al., 2013), and tropical regions (Stentiford et al., 2012). Understanding the effects of OA on this group is important, as negative responses to OA could have drastic ecological impacts and socio-economic implications.
A relatively large body of research exists on the effects of OA on decapods, predominantly on their physiological and behavioral responses. Decapods are considered less sensitive to decreasing levels of carbonate saturation than other calcifiers because of their tightly regulated physiology and calcification process in adults, which is not directly dependent on environmental carbonate chemistry (Boßelmann et al., 2007; Taylor et al., 2015). However, the early life-history stages are particularly vulnerable to OA (Walther et al., 2010; Carter et al., 2013; Ceballos-Osuna et al., 2013; Long et al., 2013a; Small et al., 2015; Gravinese, 2018), either through maternal carryover effects (Swiney et al., 2016) or direct effects including reduced growth rate (Allan and Maguire, 1992; Coffey et al., 2017; Ragagnin et al., 2018), increased oxidative stress and energy metabolism (Hu et al., 2016), exoskeleton dissolution (Bednaršek et al., 2020), reduced egg production (Kurihara et al., 2008; Meseck et al., 2016), and increased mortality (Kurihara et al., 2008; Long et al., 2013a,b; Coffey et al., 2017; Swiney et al., 2017; Tomasetti et al., 2018). Maintaining biomineralization under OA may come with high energetic costs, causing organisms to divert energy from vital physiological processes, such as reproduction (Long et al., 2013a; Meseck et al., 2016) and growth (Wood et al., 2008). The aerobic metabolic responses to reduced environmental pH are extremely variable in most invertebrate phyla; some marine decapods are highly tolerant to OA stress, which can be mainly attributed to their capacity for acid-base regulation that allows the organisms to buffer disruptions to internal pH (Small et al., 2010; Long et al., 2017). The maintenance of hemolymph pH (pHe) has the potential to be a strong indicator of organismal homeostasis (Truchot, 2012) in response to changing carbonate chemistry.
As with many other invertebrate phyla, crustacean pHe regulation occurs via proton-equivalent ion exchange, the presence of proteins and associated bicarbonate accumulation (Whiteley, 2011; Fehsenfeld and Weihrauch, 2017). In addition, acclimation can occur through carry-over or transgenerational effects, phenotypic plasticity, and maternal effects (Kurihara et al., 2008; Long et al., 2013b), and becomes apparent in neutral or even positive responses to OA exposure (Melzner et al., 2009; Ries et al., 2009).
This study presents a global literature review of 55 published studies of OA stress on decapods. From the approximately 27,000 total datapoints representing a variety of lethal and sublethal organismal responses to pH, statistical analyses were done to identify pH thresholds, which were reviewed and confirmed by the expert panel. This review further provides recommendations to apply these thresholds to monitoring data and numerical ocean model simulations in the context of marine vulnerability assessments. Metrics of exposure, such as duration (time interval below the threshold), intensity (magnitude of departure from threshold) and severity were recommended for consistent application across monitoring data and model outputs. Finally, research recommendations are provided to reduce uncertainty in thresholds and to support their implementation to various management applications, from climate change assessments to local pollution impact assessments.

MATERIALS AND METHODS
The synthesis was conducted in three parts. First, we conducted a global literature review on decapod crustaceans to compile information on the species-specific experimental or observational responses to OA conditions, including the treatment duration, life stage, and geographic region. Second, statistical analyses were used to identify thresholds for the compiled responses. Finally, a panel of decapod experts was asked to apply their professional judgment and assign a confidence score for each response pathway, based on the data quality and consistency of findings among the studies. In addition, the experts identified the most appropriate exposure duration associated with each threshold.

Literature Review and Compilation of Raw Experimental or Field Observation Data
The literature search was conducted by searching Google Scholar and Web of Science, as well as the reference lists of the studies identified for analysis, until March 1st 2019. We specify literature review metadata as requested by ROSES (RepOrting standards for Systematic Evidence Syntheses; Haddaway et al., 2018) in Supplementary Table 1. The search terms used included "ocean acidification" or "climate change" with "decapod crustacean," "decapod," or "crustacean." From this review, we identified 55 experimental studies and 1 observational study (Supplementary Table 2) that met the following minimum criteria: (1) only laboratory studies with a minimum of four experimental treatments were included; (2) a comprehensive record of the experimental water chemistry was required; (3) only experiments in which carbonate chemistry was changed by CO2 gas pumping were included, while studies that added HCl to the seawater were excluded. In cases where experimental studies were conducted with multiple parameters (e.g., warming, hypoxia), we extracted only the data related to OA. We obtained data either from the article itself or extracted them from the article's figures using WebPlotDigitizer version 4.3 (see footnote 1). This yielded ∼27,000 data points for 46 species (Table 1 and Figure 1). The majority of the data came from studies on Brachyura (true crabs), including spider and Tanner crabs, as well as lobster, king crab, hermit crab, and porcelain crab, with fewer data for shrimp and prawns (Table 1 and Supplementary Table 2). The resulting database included the level of OA stress, response measure, specific metric (including units), and treatment duration (Figure 1). Whenever pH was not reported in a study, we calculated it from the remaining carbonate chemistry parameters (dissolved inorganic carbon, DIC, or pCO2) using the R seacarb package (Gattuso et al., 2019) with dissociation constants of carbonic acid in seawater from Lueker et al. (2000), KHSO4 dissociation constants from Dickson et al. (1990), total boron from Lee et al. (2010), and kf from Dickson and Goyet (1994). When sufficient water chemistry data were available, pH values were converted to the total scale (pHT); when such data were not available, the study results were not added to our dataset.
The thresholds were duration-dependent for each life stage and associated with a specific vertical habitat (pelagic, benthic), with additional categorical variables including taxonomic information, location and geographic origin, and duration of exposure to low pH. Egg (pelagic, benthic), larval (pelagic), and juvenile and adult (benthic) life stages of decapods will experience different exposure regimes depending on which habitat they occupy. Raw data were categorized by life stage independently, with 38% for larval stages, 32% for juveniles, 25% for adults, and only 5% for egg stages. We separated the studies by exposure duration and binned them as follows: <1 day, 1 day-1 week, 1 week-1 month, 1-6 months, 6 months-1 year, and >1 year, always combining data with similar durations of experimental exposure. A total of 342 different response endpoints were documented from the studies across various species, life stages, and geographic regions. Across the 55 studies, experimental treatments spanned a range of pH conditions, from 6.11 to 8.20, with exposure duration varying from 0 to 755 days. Expert synthesis identified duration-dependent thresholds for pH effects on mortality, feeding, physiology (respiration, hemolymph pH), growth rate, search time, and hatching success (Supplementary Table 2). These thresholds ranged from pH 7.40 to 7.80 (Figure 2), with assigned confidence scores (Figure 3).
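The duration binning described above can be sketched in a few lines. This is an illustrative Python sketch rather than the code used in the study; the function name `duration_bin`, the day counts used for a month (30), six months (180) and a year (365), and the treatment of bin edges as inclusive on the upper side are all assumptions, since the text does not state them.

```python
def duration_bin(days: float) -> str:
    """Assign an exposure duration (in days) to one of the six bins in the text.

    Assumptions (not stated in the source): 1 month = 30 days,
    6 months = 180 days, 1 year = 365 days, upper bin edges inclusive.
    """
    if days < 0:
        raise ValueError("duration must be non-negative")
    if days < 1:
        return "<1 day"
    if days <= 7:
        return "1 day-1 week"
    if days <= 30:
        return "1 week-1 month"
    if days <= 180:
        return "1-6 months"
    if days <= 365:
        return "6 months-1 year"
    return ">1 year"


# The shortest and longest exposures reported in the dataset (0 and 755 days)
# fall into the first and last bins, respectively.
shortest_bin = duration_bin(0)
longest_bin = duration_bin(755)
```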

Threshold Analyses
Two types of statistical analyses were used to identify thresholds for each endpoint in the database (Figure 4). First, we applied a breakpoint analysis, with the breakpoint defined as the point at which there is a significant change in a specific decapod response (y-axis) for an incremental change in the environmental stressor (x-axis; e.g., pH). Piecewise regression analysis was used to identify the breakpoint in slopes in the response measures over the gradient in pH stress (package "segmented"; version 2.15.1, Muggeo, 2017; R Core Team, 2019). A Davies test was used to determine the significance of the breakpoint (Davies, 2002). Second, the data were fitted with a least-squares regression (LSR). When the LSR was significant (p-value < 0.05), a threshold value was calculated as the pH at which the response variable had decreased by 25% of the difference between the highest and lowest pH treatment in the dataset. The purpose of this 25% decline is to determine in a systematic fashion where the response parameter substantially differed from the control value. Data were analyzed in R (Version 3.6.1). The thresholds were grouped by response metric, life stage, duration, and species.
1 https://automeris.io/WebPlotDigitizer
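To make the 25% LSR rule concrete, the following is a minimal Python sketch, not the R workflow used in the study. It assumes the reading given in the Figure 4 caption, i.e., that the threshold is the pH at which the fitted response has changed by 25% of the difference between the fitted values at the highest and lowest pH treatments. The function name `lsr_threshold` is hypothetical, and the significance test (p-value < 0.05) that gates whether a threshold is reported is deliberately left out.

```python
def lsr_threshold(ph, response, fraction=0.25):
    """Return the pH at which an ordinary least-squares line has moved
    `fraction` of the way from its value at the highest pH to its value
    at the lowest pH.

    Assumption: the 25% rule is applied to fitted (predicted) values.
    The significance test on the regression is not implemented here.
    """
    n = len(ph)
    if n < 2 or n != len(response):
        raise ValueError("need at least two paired observations")
    mean_x = sum(ph) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in ph)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ph, response))
    if sxx == 0 or sxy == 0:
        raise ValueError("no pH gradient or no trend; threshold undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Fitted response at the two ends of the tested pH range.
    pred_high_ph = intercept + slope * max(ph)
    pred_low_ph = intercept + slope * min(ph)

    # Response level after `fraction` of the total fitted change.
    target = pred_high_ph - fraction * (pred_high_ph - pred_low_ph)
    return (target - intercept) / slope


# Made-up illustrative data: response declines linearly with pH.
example_ph = [8.0, 7.8, 7.6, 7.4, 7.2]
example_response = [100.0, 90.0, 80.0, 70.0, 60.0]
example_threshold = lsr_threshold(example_ph, example_response)
```

Because the target is expressed relative to the fitted change, the same function applies to responses that rise at low pH (e.g., mortality) as well as to those that fall (e.g., growth).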

Expert Assessment of Degree of Confidence and Determination of Appropriate Exposure Duration
We employed a formal, structured process to obtain the expert evaluation and synthesis of OA effects on selected decapod parameters, including: (1) assembly of expert panel members based on their expertise, (2) evaluation of study data and threshold analyses, and (3) expert discussion and judgments to select the final set of thresholds and associated confidence scores. Expert panel members were recruited to cover the following areas of required expertise: decapod/crustacean ecophysiology; taxonomy, morphology, biomechanics, and biomineralization; life history and ecology; marine community ecology related to OA and trophic levels; OA field/experimental observations; evolutionary biology; policy lead for OA impacts on decapod fisheries; and biogeochemistry. Numerous candidates were identified based on this list and prioritized based on a qualitative evaluation of the quality and depth of their publication record and their availability to attend a 3-day workshop. OA expertise was highly regarded, but not strictly required if the expert provided a unique expertise.
Our process required that the experts consider the statistical threshold analyses together with the raw data and derived results, apply their individual expertise, communicate relevant information, and make recommendations or challenges to other team members in order to generate a consistent expert judgment. The experts refined the interpretation of the analytical outcomes based on their knowledge of the studies and how they should be weighted. After discussion of these data and their interpretation, each of the 10 experts listed in the co-author contribution section voted on the numeric value and provided a confidence score for that threshold, giving a rationale for their choice as needed, particularly if it deviated from the majority opinion. Experts could abstain from voting if they did not feel confident that there was sufficient evidence that a threshold existed for a given metric. Numerical votes (with the details of the voting noted in each Response section) were averaged to arrive at a final threshold and exposure duration. In cases where it appeared that the response depended on the length of exposure, the experts identified the most appropriate exposure duration over which the thresholds should be applied.
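The vote-averaging step can be illustrated with a short Python sketch. It is a hedged example only: the function name `average_votes` is hypothetical, abstentions are represented here as `None` by assumption, and the vote values shown are invented for illustration rather than taken from the panel records.

```python
def average_votes(votes):
    """Average the numerical votes, ignoring abstentions (None).

    Returns None if every expert abstained, i.e., no threshold is set.
    """
    cast = [v for v in votes if v is not None]
    if not cast:
        return None
    return sum(cast) / len(cast)


# Invented example: four experts vote on an exposure duration (days),
# one abstains. The abstention does not count toward the average.
hypothetical_duration_votes = [7, 7, 7, None, 14]
hypothetical_final_duration = average_votes(hypothetical_duration_votes)
```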
Experts were asked to score their confidence in the final threshold into one of nine categories based on the scoring matrix of the IPCC (reported in Mastrandrea et al., 2010; Supplementary Figure 1). The matrix has two axes: agreement and evidence, with agreement being the consistency of results among studies and evidence encompassing the quantity and quality of studies used to establish the threshold. For agreement, a study comprising a single experiment had a low level of agreement by default, though a single paper could contribute to a high level of agreement if multiple experiments were done on multiple species. The group identified 1-3 studies as low evidence, 3-10 studies as moderate evidence, and 10 or more studies as robust evidence. These were guidelines rather than strict rules because individual studies contained different amounts of data.

Applying Thresholds to Biogeochemical Model Outputs
Experts discussed guidance that should be provided in order to achieve a more consistent application of these thresholds to OA-related carbonate chemistry observations and ocean numerical model output. As with a comparable pteropod synthesis (Bednaršek et al., 2019), they recommended the use of measures such as intensity, duration, and severity of exposure (sensu Hauri et al., 2013). Intensity represents the magnitude of the departure from the threshold, calculated as the average negative deviation from the threshold. Duration represents the time below the threshold, expressed as a percentage (%) of the total analysis period. Severity is the product of intensity and duration, i.e., the deviation of pH below the threshold weighted by the % of time spent below it.
FIGURE 2 | Synthesis of thirteen thresholds, each characterized by its magnitude (pH) and duration of exposure. The thresholds were determined based on experimental data using breakpoint analyses and linear regression, with a final step involving their interpretation through expert consensus. Numbers of studies per effect threshold were as follows: hatching success = 11, growth rate = 10, adult respiration = 12, hemolymph pH = 17, feeding rate = 4, search time = 4, 30-day adult survival = 6, 180-day adult survival = 3, 24-day juvenile survival = 6, 7-day larval survival = 6, 30-day larval survival = 11 studies. Details on thresholds and studies are found in Table 2.
FIGURE 3 | Confidence score based on the combination of evidence and agreement for 13 different endpoints as determined by expert consensus (based on Supplementary Figure 1). Numbers correspond to numbered thresholds in Table 2.
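The intensity, duration, and severity measures described above can be computed from a daily pH series as in the following Python sketch. It is an illustration under stated assumptions, not the diagnostic code applied to the model output: the function name `exposure_metrics` is hypothetical, intensity is assumed to be averaged only over the days that fall below the threshold, and no minimum run of consecutive days is enforced.

```python
def exposure_metrics(daily_ph, threshold):
    """Return (intensity, duration_pct, severity) for a daily pH series.

    intensity    : mean pH deficit below the threshold, averaged over
                   below-threshold days only (assumption)
    duration_pct : percentage of days with pH below the threshold
    severity     : intensity * duration_pct
    """
    if not daily_ph:
        raise ValueError("daily_ph must contain at least one value")
    deficits = [threshold - ph for ph in daily_ph if ph < threshold]
    duration_pct = 100.0 * len(deficits) / len(daily_ph)
    intensity = sum(deficits) / len(deficits) if deficits else 0.0
    severity = intensity * duration_pct
    return intensity, duration_pct, severity


# Invented four-day series evaluated against a pH threshold of 7.75.
example_series = [7.80, 7.70, 7.65, 7.85]
intensity, duration_pct, severity = exposure_metrics(example_series, 7.75)
```

For thresholds that carry an exposure duration (e.g., 30 days), an additional check on the length of consecutive below-threshold periods would be needed; that step is omitted here.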
To demonstrate threshold application, we employed numerical model simulations from the Regional Ocean Modeling System (ROMS; McWilliams, 2005, 2009), with biogeochemical elements (Biogeochemical Elemental Cycling; BEC, Moore et al., 2004), that have been developed for the California Current System (CCS) along the Pacific Coast of North America. The ROMS-BEC model provides a realistic three-dimensional representation of the physical circulation and the biogeochemical cycles of nutrients, oxygen, inorganic and organic carbon species, and related plankton functional groups, and captures the broad patterns of physical circulation, productivity, dissolved nutrients and DO along the CCS, as well as their temporal variability and trends. The simulation used for this demonstration has a spatial resolution of 1 km and spans the 10-year period from February 1997 to November 2007 with 1-day averaged outputs. Daily pH values at life-stage specific water depths (juveniles at 100 m, adults at 300 m) were calculated with the duration of exposure associated with a species- and life-stage-specific threshold. The threshold diagnostics included two case studies. First, the juvenile mortality threshold (pH 7.75 for a duration of 30 days at 100 m) was applied for a duration of 1 month averaged over the year, which coincides with the presence of juveniles along the California part of the CCS. Second, the adult mortality threshold (pH 7.65 for a duration of 180 days at 300 m) was applied across the same area of the CCS as the juvenile mortality threshold.
FIGURE 4 | Example of breakpoint and least-squares regression analysis of experimental stress-response from combined studies of juvenile decapod mortality (pH breakpoint = 7.75, 25% LSR = 7.84, p-value ≤ 0.0001). Black line indicates the linear best-fit line with the 95th percentile confidence interval of the least-squares regression (r² = 0.24, p-value < 0.0001); red cross indicates where the predicted response has decreased by 25% of the difference between maximum and minimum predicted values; blue line is the piecewise linear regression fit; dotted line is the breakpoint; gray area indicates standard error for the line; and light blue area signifies the 95% confidence interval for the breakpoint. Regression statistics are given for each statistical analysis. Color of data points denotes source of data from cited studies.

Synthesis of Thresholds by Endpoints and Carbonate Parameter of Stress
The majority of the 55 studies selected for the final threshold identification used pH as the chemical measure of organismal stress (30 out of 55). The remaining studies still measured pH and included these data in their papers. Thus, pH was chosen as the parameter against which biological responses were compared across all the studies. pH was preferred to pCO2 because more studies used pH in their datasets, owing to the ease of measuring and manipulating seawater pH compared to pCO2. However, we emphasize that pH is a proxy, rather than the single causal parameter, for explaining biological impacts related to multiple carbonate chemistry effects. Supplementary Figures 2-12 illustrate the statistical threshold analyses that were used in final deliberations by experts.

Growth and Development
Hatching success - larvae
Hatching success, a measure of successful embryo development, decreased under low pH. Our synthesis of hatching success data incorporated five studies on five species, including Dungeness crab, southern Tanner crab, stone crab, porcelain crab, and shrimp (Table 2). These taxa came from a variety of geographic regions, including tropical, polar, and upwelling regions, and a number of different depths, from intertidal (porcelain crab, stone crab) to the deep sea (snow crab). Hatching success can be highly variable even under control conditions; in the flat porcelain crab Petrolisthes cinctipes (Randall, 1840), for example, hatching ranged from 30 to 95% among broods. The exposure duration varied greatly among the studies, in part due to the high variance in crustacean embryo development periods. Among the species studied, the embryo duration ranged from about 10 days [the Florida stone crab Menippe mercenaria (Say, 1818); Gravinese, 2018] to several months to a year (Chionoecetes bairdi; Swiney et al., 2016). Although only two of the five studies showed an effect of OA on hatching success (Gravinese, 2018), these were the only two studies in which embryos were exposed for most or all of the embryonic development period. In the remaining studies, exposure time was less than ∼30% of the full period required for embryo development (Arnberg et al., 2013; Ceballos-Osuna et al., 2013; Miller et al., 2016).
Based on these considerations, the expert panel declined to specify a duration, noting that this will vary with species-specific development duration. Within predicted near-future pH conditions, the exposure time necessary to reduce hatching success needs to be a substantial portion of this developmental period. Indeed, C. bairdi did not have decreased hatching success in embryos held at low pH for nearly a year. It was only in a second brood extruded in the lab after the females had been in low pH water for a year that a reduction in hatching success was observed, suggesting that exposure during both oogenesis and embryo development was necessary to affect hatching success. Considering the two findings where significant effects were observed, the expert panel unanimously set the threshold for hatching success at pH 7.82, based largely on an LSR of 7.82 (p = 0.0013, R² = 0.11; Table 2 and Supplementary Figure 2). The expert panel had a medium degree of confidence in this threshold, citing medium evidence and medium agreement between studies.
TABLE 2 | Summary of threshold results, including the threshold numbers (corresponding to numbers in Figure 3), effect group (physiology, growth and development, behavior, mortality) and specific effect with associated units, life stage, direction of response (negative or positive effect), habitat, and species used in the study(ies).

Growth rate - adult and juvenile
The expert panel combined juvenile and adult growth rate because the thresholds for these two life history stages were similar. There were four papers involving shrimp or prawn species where adult growth rates were reported (Table 2). Of those four, two showed no effect of OA on growth (Taylor et al., 2015; Lowder et al., 2017) while two showed significant effects, though at widely different pH levels (pH 6.60, Wickins, 1984a; pH 7.60, Kurihara et al., 2008). Eight studies on eight species showed a reasonably consistent effect on juveniles. Except for one study on a prawn, all juvenile studies were on crab or lobster species. Five of the eight species studied showed a significant reduction in growth at pH values that ranged from 7.30 to 7.80. Except for the prawn species (Wickins, 1984a), the species that had reduced growth were all cold-water or temperate species that primarily live in open ocean benthic habitats: red and blue king crab (Long et al., 2013a,b, 2017; Swiney et al., 2017), southern Tanner crab (Long et al., 2013b), and American lobster (McLean et al., 2018). One study on the American lobster by Agnalt et al. (2013) did not find a significant effect on growth. The expert panel considered this work to be consistent with that by McLean et al. (2018), because the lowest pH tested in the former was 7.60 and the latter did not see a reduction in growth rate until a pH of 7.40. The two species that were not affected by pH were the blue crab Callinectes sapidus (Rathbun, 1896) (Glandon and Miller, 2017) and the porcelain crab P. cinctipes. This is unsurprising given that the blue crab is an estuarine species and the porcelain crab an intertidal species, so both are adapted to environments with strong diel pH fluctuations. The final adult and juvenile threshold was unanimously set at 7.75 for 105 days (Table 2), based on significant experimental effects in individual studies rather than statistical threshold analyses. The expert panel considered the confidence score of this threshold to be mid-level, with a high level of evidence but low agreement between studies.

Respiration - adults
Metabolism may increase as an adaptive response, decrease as a pathological (or short-term 'shut down') response, or remain unchanged. Literature comparisons are further complicated by the variety of methods used and the different units used when presenting data. Therefore, aerobic metabolism data were normalized to account for variation in units among studies, i.e., treatment values were divided by the control values for the corresponding time interval. In total, the expert group used data from 11 studies on 13 species to come to their decision (Table 2 and Supplementary Figure 3). In general, aerobic metabolism decreased with decreasing pH.
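The normalization described above amounts to a ratio of treatment to control at matching time points. The Python sketch below illustrates this under the assumption that each study supplies treatment and control values keyed by sampling day; the function name `normalize_to_control` and the numbers are hypothetical and are not drawn from any of the cited studies.

```python
def normalize_to_control(treatment_by_day, control_by_day):
    """Divide each treatment value by the control value measured on the
    same day, giving a unitless ratio (1.0 = no change from control)."""
    normalized = {}
    for day, value in treatment_by_day.items():
        control = control_by_day[day]  # KeyError if no matching control
        if control == 0:
            raise ValueError(f"control value is zero on day {day}")
        normalized[day] = value / control
    return normalized


# Invented respiration rates (arbitrary units) on days 7 and 14.
low_ph_treatment = {7: 30.0, 14: 20.0}
control = {7: 40.0, 14: 40.0}
relative_respiration = normalize_to_control(low_ph_treatment, control)
```

Because the result is a ratio, studies reporting respiration in different units can be placed on a common axis before threshold analysis.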
The expert panel noted that while the outputs from the statistical analysis were variable, there was generally good agreement between studies that the threshold occurred at ∼pH 7.75, below which adult respiration significantly decreased. This observation was supported by an LSR-derived threshold of 7.77 (p < 0.0001, R² = 0.17). The expert panel opinion on the threshold ranged between pH 7.75 and a slightly lower value of pH 7.70. There was also some debate about duration, with opinions varying from 7 to 21 days. The final adult respiration threshold was set at 7.74, with a duration of 13 days (Figure 2, Table 2, and Supplementary Figure 3). The expert panel generally had medium confidence in this threshold, judging it to have medium evidence and medium-to-high agreement.

Respiration - early life stages
The expert panel did not produce aerobic metabolism thresholds for either juveniles or larvae because data were either insufficient to determine a threshold (three juvenile studies on three species) or plentiful (eight larval studies on six species), but highly variable, with such within-study variability masking possible effects of pH. Measures of larval respiration have the additional complication of being highly stage-specific (molt stage affecting baseline values), duration-dependent and indirectly affected by any influence of current or past hypercapnic exposure on larval stage duration, all making inter-study and inter-species comparisons difficult. Therefore, the expert panel proposed that the results of physiological stress would be more detectable through mortality or growth rate data, as opposed to a subtle sublethal and variable metric like aerobic metabolism.

Hemolymph pH - adults
Our synthesis of the response of hemolymph pH to environmental hypercapnia was drawn from 15 studies encompassing 13 species (all except one were adult Pleocyemata decapods, half of those brachyuran, or true crab, species), with four species the subject of more than one study (Supplementary Figure 4). These taxa came from a variety of regions, from warm temperate to polar, and a variety of depths and locations. In most species, extracellular pH (pHe) was tightly regulated, but could be compromised by greatly reducing environmental pH, albeit occasionally to unrealistic levels.
Most brachyuran crabs showed well-developed pHe regulation even at very low seawater pH, down to at least pH 7.40 in the Carcinidae and even lower in the Polybiidae and Xenograpsidae (at least pH 6.74). Data for the Oregoniidae were more mixed. The southern Tanner crab Chionoecetes bairdi (Rathbun, 1924) was able to maintain pHe at very low environmental pH (at least pH 7.50) for 2 years, but the congeneric C. tanneri showed a dramatic reduction in pHe when exposed to seawater pH 7.63 after just 24 h. The extent to which this is a species difference due to the poor ion-regulatory ability of C. tanneri (see Pane and Barry, 2007), and not just a physiological lag in restoring hypercapnia-induced disturbance in acid-base balance, as has been noted for other decapod species, is not clear. The spider crab Hyas araneus (Linnaeus, 1758) could maintain pHe after 12 days at seawater pH 7.60. However, after 10 weeks of exposure, pHe regulation broke down between pH 7.55 and 7.81, depending on whether the exposure was accompanied by thermal stress, which can compromise acid-base regulation (Zittier et al., 2013).
Non-brachyuran species also generally show well-developed extracellular acid-base regulation, although this is not invariant. The two caridean shrimps for which we had data and the achelate South African rock lobster Jasus lalandii (H. Milne Edwards, 1837) also exhibited good pHe regulation, with the ability to negate pH change during 30 days of exposure to at least pH 7.49-7.51 and 28 weeks at pH 7.32. Over 2 weeks, juvenile European lobster Homarus gammarus (Linnaeus, 1758) also displayed good pHe regulation in seawater pH as low as 6.9, but for adult Norwegian lobster Nephrops norvegicus (Linnaeus, 1758), pHe regulation was compromised at seawater pH approaching 7.60 during longer-term (97-115 days) exposure. The mud lobster, Upogebia deltaura (Leach, 1816), showed good pH regulation at seawater pH 7.60, but regulation failed drastically below this point. The discrepancy could be species-specific or may be due to the different time scales employed.
Although the dataset contains a study that recorded unchanged pHe in crab after 2 years of hypercapnic exposure, most studies are short-term (<30 days). Therefore, our final threshold duration was set at 21 days, based on unanimous expert consensus, even though no clear trends relating pH to experiment duration were detectable. When all species were included, the LSR 25% analyses determined the change in hemolymph pH to occur at 7.60 (R² = 0.14, p < 0.0001). While this does not seem unreasonable, the fact that not all taxa are so tolerant to low pH (e.g., some members of the Oregoniidae and Nephropidae) led the expert group to unanimously set the threshold at a higher value of pH 7.70. The expert panel had a relatively high degree of confidence in this threshold, citing high evidence and a medium level of agreement between studies. While the effects of hypercapnia on physiological functions other than aerobic metabolism and acid-base balance have received some attention, there are either too few data and/or the rates of function are too variable (e.g., nitrogenous excretion) to allow the identification of thresholds for these endpoints.

Feeding rate -adults
Feeding rate was defined as the number of individual prey items (bivalves) consumed per individual per unit time during a feeding trial. Feeding trials ranged from 30 min to 6 days. Units were standardized across all studies. This threshold included data from four studies on four species of Brachyura (true crabs) from temperate/coastal environments (Table 2 and Supplementary Figure 5). In three out of four studies (Appelhans et al., 2012; Wu et al., 2017; Wang et al., 2018) there was no significant trend in feeding rate. Therefore, in choosing a threshold, experts placed greater emphasis on their interpretation of the data than on the statistical analyses. Most panel members voted for a pH threshold of 7.70 based on the first decline in feeding rate observed in the individual study data, rather than on the breakpoint. This threshold was therefore reached via expert consensus and was not based on the threshold derived from statistical analysis. Most panel members voted for the duration to be set at 12 days; one voted for 30 days and one voted for 20 days. The final average was 16 days. The expert panel was unanimous in assigning a low-to-medium degree of confidence to this threshold, specifically citing low overall evidence and medium agreement between studies. Because all studies used to derive the behavioral threshold were based on Brachyura from intertidal or shallow subtidal temperate systems, the expert panel advised caution in applying the threshold to systems beyond these habitats.
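The vote-averaging step can be illustrated with a short sketch. It assumes the final duration is the arithmetic mean of the individual votes, rounded to whole days; the number of panelists who voted for 12 days is not reported, so the tallies of four and five used here are hypothetical, as is the function name `average_votes`.

```python
from statistics import mean

def average_votes(votes):
    """Return the arithmetic mean of panel votes, rounded to whole days."""
    return round(mean(votes))

# Hypothetical tallies: the text reports one vote each for 30 and 20 days and
# "most" votes for 12 days, but not how many panelists chose 12 days.
for n_twelve in (4, 5):
    votes = [12] * n_twelve + [30, 20]
    print(f"{n_twelve} votes for 12 d -> mean {mean(votes):.2f} d, "
          f"rounded {average_votes(votes)} d")
```

Under this assumed averaging rule, both hypothetical tallies round to the reported 16 days.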

Search time -adults
Search time data were available only for crab species; search time was defined as the time needed to search for and locate a prey item during feeding trials. Data were extracted from three studies on three different species (two true crabs and one hermit crab); the two crabs came from the intertidal zone in temperate regions, whereas the hermit crab came from an upwelling region in the deep sea (884 m; Kim et al., 2016; Table 2). Feeding trials lasted between 15 min (Wang et al., 2018) and 2 days (Wu et al., 2017). The expert panel recommended combining taxa from multiple depths and environments, as they all showed similar trends, and set the threshold at either 7.76 or 7.80, with most voting for 7.76, supported by an LSR-derived threshold of 7.77 (p = 0.041, R 2 = 0.30, Table 2 and Supplementary Figure 6). The expert panel was also split on duration, with most advocating for 7 days and several advocating for 14 days. Once the votes were averaged, the final threshold was set at 7.76 for 9 days (Table 2). The expert panel had a low degree of confidence in the threshold, citing low evidence and low-to-medium agreement between studies, with the added recommendation that this threshold refers only to search time as a behavior pattern and should not be used to describe other types of behavior pathways.
Other behavioral parameters. There were other behavioral measures related to feeding that were measured in multiple studies (e.g., handling time, breaking time); however, these datasets had fewer clear trends. Ultimately the expert panel decided that these feeding metrics were redundant to search time, as they were all different aspects of feeding or foraging behavior.
Behavioral thresholds are particularly difficult to interpret because the underlying mechanisms may differ depending on the severity of the hypercapnic stress, as well as on the number of exogenous and endogenous factors that OA can affect via different pathways. Under moderate pH stress, the sensory capabilities of decapods may be compromised. For example, mechanoreceptors may be damaged by low pH (Bednaršek et al., 2020), impairing the ability to sense the surrounding environment. Although no studies have examined the loss of mechanoreceptors together with changes in behavior patterns, most of the feeding experiments show trends that could be indirectly associated with a potential loss of mechanoreceptors, namely increased search time, decreased foraging efficiency, and reduced overall organismal fitness. Alternatively, at very low pH values, search time will be affected because the animal may start shutting down due to the negative impact of elevated pCO 2 and low pH on hemolymph oxygen carrying capacity, and thus on aerobic metabolism, acid-base balance and other fundamental physiological functions. While this is separate from behavioral mechanisms, it can be difficult to determine from the available data alone which of these pathways is causing the organismal response.

Mortality -larvae
The crustacean larval survival dataset comprised 13 studies on eight species, including crabs, lobster, and shrimp (Table 2 and Supplementary Figures 7, 8). Taxa were drawn from a variety of geographic regions, including temperate, tropical, and polar, and the results showed high variability in survival both within and among species. This made it difficult to identify trends in the data and define a threshold. As with brooding mortality, the analysis was further complicated by among-species variance in the duration of the larval period. The expert panel decided to separate the data into two time periods: less than 7 days (Supplementary Figure 7) and 8-30 days (Supplementary Figure 8). Although the thresholds were similar, 7.40 and 7.52 for the <7 and 8-30-day periods respectively, the expert panel members had different degrees of confidence for the two time periods.
For larval survival over less than 7 days, we found six papers, which reported reduced larval survival in only one species. The expert panel decided on a threshold of pH 7.44 (supported by breakpoint analyses = 7.44; p < 0.001), while agreeing on a low degree of confidence in this threshold, citing low-to-medium agreement between studies and a medium amount of evidence.
There were 11 studies with exposure times of 8-30 days (Supplementary Figure 8) and among these, four reported significant reductions in larval survival over that period. However, the expert panel commented that the spread of the relatively large quantity of data lowered the confidence score and made the identification of a threshold more difficult. There was also debate as to whether it was better to generate a lower threshold (∼pH = 7.40) with a higher degree of confidence that was minimally protective, or a higher threshold (∼pH = 7.80) that was highly protective but in which the group had low confidence. The expert panel decided to set aside the statistically derived threshold and used expert voting to generate the final threshold of pH 7.52 (Table 2) with a low confidence score, citing low agreement between studies and medium-to-low evidence (Figure 3).

Mortality -juveniles
A total of 10 studies on nine species reported juvenile mortality, of which several showed a clear increase in sensitivity with increased exposure time. The expert panel, therefore, decided to set separate thresholds for three different exposure times (i.e., <8 days; 8-30 days; 1-6 months). The expert panel examined data for less than 8 d of exposure and found no evidence for a threshold over the range of pH tested (down to pH 7.40), suggesting that juveniles are tolerant of fairly low pH for periods of up to 1 week. When considering data from 8 to 30 days of exposure, experts weighted data from five different studies including five different species, but a sensitivity-related threshold emerged after considering only the following four studies (Long et al., 2013b; Swiney et al., 2017; Ragagnin et al., 2018; Supplementary Figure 9). Based on raw data rather than on breakpoint analyses, the expert panel set the threshold at pH = 7.60 and 24 days (Table 2).
The longest duration exposures, from 1 to 6 months, included nine studies covering eight different species that showed increased mortality with decreasing pH (Supplementary Figure 10). Several species (red king crab, blue king crab, and Tanner crab) showed a clear increase in effect size, such that mortality was detected at a lower pH or associated with prolonged exposure (Long et al., 2013a; Swiney et al., 2017). The most studied species, the American lobster, did not show a significant effect in two studies (Ries et al., 2009; McLean et al., 2018), while one study showed a negative effect at lower pH (Menu-Courey et al., 2019). The final threshold was set unanimously at 7.75 for 60 days, based solely on expert judgment rather than statistical analyses. The expert panel expressed a relatively high degree of confidence in this threshold given the high number of studies but noted that there was only moderate agreement among the studies (Figure 3). Yet, given that the majority of the studies were on crab or lobster species, the applicability of this threshold to other decapod groups remains unknown.

Mortality -adults
Similar to the juvenile thresholds, the expert panel decided on different thresholds for different exposure periods for adults: <1 month, 1-6 months, and 6-12 months. For the shortest duration, there were data from six different studies (Supplementary Figure 11), based on which the expert panel considered thresholds between 7.40 and 7.60. Supported by a significant breakpoint at 7.60 (p = 0.047), the final threshold was set at a pH of 7.52 for 30 days (Table 2). The expert panel observed high agreement in data among studies demonstrating that adults were highly tolerant of OA for this exposure period. The expert panel expressed a moderate-to-high degree of confidence in this threshold, citing medium-to-high evidence and medium agreement between studies (Figure 3). At exposure durations of 30-207 days, three studies were available (Supplementary Figure 12). Based on the significance of the breakpoint analyses at 7.68 (p < 0.001), the expert panel was split on a threshold value between 7.60 and 7.70. The final threshold was set at a pH of 7.65 with a duration of 180 days. The expert panel had a moderate degree of confidence in this threshold, citing medium evidence and low-to-medium agreement among studies (Figure 3). For the 6-12-month exposure, only two studies were available (Kurihara et al., 2008; Swiney et al., 2016). The final threshold was set unanimously at pH 7.80 with a duration of 365 days (Table 2), supported by a significant breakpoint of 7.80 (p < 0.0001). The expert panel expressed a very low degree of confidence in the threshold (Figure 3).
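The breakpoint analyses referred to above can be illustrated with a minimal sketch. It assumes a continuous two-segment linear model fitted by grid search over candidate breakpoints; this is a generic technique and not necessarily the procedure used in this synthesis. The function name `fit_breakpoint` and the data are hypothetical (synthetic values, not taken from the compiled dataset), and no significance test is included.

```python
import numpy as np

def fit_breakpoint(ph, response, n_candidates=200):
    """Grid-search one breakpoint for a continuous two-segment linear fit.

    Returns (breakpoint, coefficients, sum_of_squared_errors).
    """
    ph = np.asarray(ph, dtype=float)
    response = np.asarray(response, dtype=float)
    # Exclude the extremes so that both segments contain data.
    lo, hi = np.quantile(ph, [0.1, 0.9])
    best = None
    for c in np.linspace(lo, hi, n_candidates):
        # Design matrix: intercept, slope, and a hinge term that lets the
        # slope change at the candidate breakpoint c.
        X = np.column_stack([np.ones_like(ph), ph, np.maximum(ph - c, 0.0)])
        coef, *_ = np.linalg.lstsq(X, response, rcond=None)
        sse = float(np.sum((response - X @ coef) ** 2))
        if best is None or sse < best[2]:
            best = (float(c), coef, sse)
    return best

# Synthetic data (assumption): survival is flat above pH 7.6 and declines below it.
rng = np.random.default_rng(0)
ph = np.linspace(7.2, 8.1, 60)
survival = 0.9 - 1.5 * np.maximum(7.6 - ph, 0.0) + rng.normal(0, 0.02, ph.size)

bp, coef, sse = fit_breakpoint(ph, survival)
print(f"estimated breakpoint: pH {bp:.2f}")
```

In practice, a statistically derived breakpoint of this kind would be only one input; as described above, the expert panel often adjusted or replaced it based on the raw data.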

Pathways With Limited Information: Exoskeleton Integrity
Exoskeleton integrity refers to the observations of microhardness (Coffey et al., 2017), Ca 2+ content (Wickens, 1984b; Long et al., 2013b, 2016; Taylor et al., 2015; Swiney et al., 2016), calcification (Ries et al., 2009), and external and internal dissolution (Bednaršek et al., 2020). Exoskeleton integrity is the result of trade-offs between the processes of calcification (Ries et al., 2009) and dissolution, whereby the balance between dissolution and calcification is pH dependent (Bednaršek et al., 2020). However, an in-depth mechanistic understanding of the balance between these two processes is fundamentally lacking. No studies have comprehensively and simultaneously investigated both calcification and dissolution in the same species. While consistent trends in reduced microhardness existed across species, the expert panel concurred that a threshold should not be chosen for two reasons. First, the drivers behind the observed reduction in microhardness are not fully understood. Microhardness is affected by the organic content and ions within the cuticle. Second, the functional outcomes of reduced microhardness, such as impaired competitiveness and altered predator-prey relationships, are still poorly understood. Moreover, the trend in calcification is highly variable among different species (Ries et al., 2009), and investigation should also focus on intraspecific variation and life-stage specific patterns.
Exoskeleton dissolution consists of external and internal dissolution of the exoskeleton. While internal dissolution is more pronounced during molting, with intense calcium fluxes between hemolymph and old/new exoskeleton, and depends on developmental stage, it can also occur as a result of exposure to a high pCO 2 environment (Bednaršek et al., 2020). In addition, a variety of different physiological processes can contribute to internal dissolution, such as internal buffering and increased respiratory activity (Truchot, 1979; Cameron, 1985; Michaelidis et al., 2005; Hans et al., 2014), which in general point toward increased energetic trade-offs. Nevertheless, based on the complexity of processes and the lack of functional understanding related to internal dissolution, the expert panel excluded internal dissolution across species from threshold consideration.
In contrast to internal dissolution, external dissolution is not under direct physiological control of the organism, but is rather the consequence of direct exposure to low calcite saturation state in the seawater (Bednaršek et al., 2020). The extent of dissolution depends on the species-specific bio-composition and mechanical strength of the organic layer (Chadwick et al., 2019), as well as the crystalline composition of the exoskeleton. Exoskeleton dissolution is related to reduced growth (Bednaršek et al., 2020) and may compromise larval development (Miller et al., 2016). Since larval decapods display exoskeleton dissolution upon low OA exposure, this response could be used as a viable field measure. No breakpoint threshold was generated, but a significant LSR threshold of pH = 7.57 (p = 0.0023, R 2 = 0.4504) was obtained using the external dissolution dataset from Bednaršek et al. (2020; Figure 2). The expert panel agreed that external dissolution is a promising indicator that merits additional research for decapods. However, the evaluation of external dissolution should consider both species-specific responses and regional exposure history to allow for the most accurate interpretation of sensitivity.

Demonstration of Threshold Application to Ocean Model Numerical Simulations
Review of available data on the seasonal and vertical depth distribution of various life stages of decapod species living along the California coast indicated that larvae are generally present in the upper 30-50 m of the water column in the winter-spring period, while juveniles and adults are present year-round and occupy vertical habitats down to 200-1000 m (Pereyra, 1966; Shanks and Eckert, 2005; Delmanowski et al., 2017). Although there are distinct species differences in spatial distribution and preferred depth range, our thresholds are not species- or habitat-specific; they are only life stage specific, and thus seasonal application of the thresholds to vertical habitat extent was an important consideration for this exercise.
Application of the juvenile and adult decapod mortality thresholds to the ROMS-BEC numerical simulations highlights two key points (Figure 5). First, the potential habitat defined by application of these thresholds represents bookends of the range of effects on benthic versus pelagic life stages, reflecting different habitat depths. Exposure regimes with pH below the thresholds for adult mortality at 300 m are common, while pH below juvenile mortality thresholds was less frequent and found along the onshore and cross-shelf region in the lower part of the 100 m vertical habitat. Second, there is a consistent pattern of increasing magnitude and duration from offshore to onshore (Figure 5); on average, the greatest magnitude and duration of exposure to low pH conditions that could result in increased juvenile mortality is in the coastal regions, with conditions below threshold pH ∼30-50% of the time and at 0.04-0.06 pH units below the threshold (Figures 5A-C). This suggests that increased OA-related mortality of early life stages can occur frequently along the CA coast. For adult mortality, the spatial area characterized by prolonged duration of unfavorable pH conditions was below threshold conditions between 60 and 80% of the time, with an intensity of 0.05 pH units below the threshold. This indicates that pH-related mortality in the CCS is potentially wide-ranging and that adult decapods living at 300 m could be more at risk due to OA exposure (Figures 5D-F).
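The two exposure metrics reported above, the fraction of time below threshold and the intensity in pH units below threshold, can be computed from a pH time series as sketched below. This is a minimal illustration under stated assumptions: the function name `exposure_summary` and the daily pH values are hypothetical and do not come from the ROMS-BEC output; the threshold of 7.75 is the juvenile mortality threshold for 1-6 months of exposure.

```python
import numpy as np

def exposure_summary(ph_series, threshold):
    """Summarize how often and how far pH falls below a threshold.

    Returns (fraction_of_time_below, mean_deficit_when_below).
    """
    ph = np.asarray(ph_series, dtype=float)
    below = ph < threshold
    frac = float(below.mean())
    # Intensity: mean pH deficit over the time steps that are below threshold.
    intensity = float((threshold - ph[below]).mean()) if below.any() else 0.0
    return frac, intensity

# Hypothetical daily pH values at a single model grid cell (illustrative only).
daily_ph = [7.80, 7.72, 7.70, 7.71, 7.78, 7.69, 7.76, 7.70, 7.79, 7.77]
frac, intensity = exposure_summary(daily_ph, threshold=7.75)
print(f"below threshold {frac:.0%} of the time, "
      f"mean deficit {intensity:.3f} pH units")
```

Applied cell by cell over a model domain and depth range, the same summary would yield maps comparable in kind to those described for Figure 5.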

Synthesis of the Effects of Ocean Acidification on Decapods and Comparison With Other Taxa
Our global literature review and synthesis of OA thresholds for decapod crustaceans reveals that this group is sensitive to OA across multiple pathways, ranging from physiological and behavioral responses to mortality. Ten of the thirteen thresholds are in the narrow pH range of 7.65-7.80, indicating that multiple pathways will be affected simultaneously and suggesting that organismal fitness may be affected sooner than would be predicted if only a single pathway were considered. Thresholds were identified across different life stages, indicating sensitivity throughout the entire life history cycle, rather than being related to one life stage-specific bottleneck. Recent studies suggest that juveniles may be particularly sensitive to OA and other global change drivers (e.g., Walther et al., 2010; Small et al., 2015; Menu-Courey et al., 2019). We note that the duration of exposure required to impact decapod physiology was highly variable among endpoints and life stages, ranging from 9 days to 1 year for sublethal responses and 7-180 d for lethal responses, in comparison to 2-14 days for pteropods (Bednaršek et al., 2019, 2021), which means that risk to decapods is highest in habitats characterized by prolonged low pH exposure. We reviewed emerging patterns of comparative decapod sensitivities from previous studies, although we note that our study focused on reviewing sensitivity across the decapod group rather than comparing different taxonomic groups. Kroeker et al. (2013) found the mean response of decapods to OA not to be significantly different from that of a broader set of taxonomic groups, while Wittmann and Pörtner (2013) found crustaceans to be less sensitive to OA in comparison with echinoderms and mollusks.
In addition, the prevalent view that crustaceans have lower OA sensitivity has recently been refuted in many experimental studies; we found that the majority of sublethal thresholds were induced at pH ranging from 7.50 to 7.80, making this range comparable to the conditions reported in other synthesis work (Wittmann and Pörtner, 2013; 851 and 1370 µatm of pCO 2, which equates to an approximate pH of 7.80-7.50). It also puts decapods on par with pteropods, which have thresholds in the range of pH 7.55-7.85 (Bednaršek et al., 2019) and are among the most sensitive marine calcifiers known. We recommend that the synthesis for this taxonomic group be revisited frequently as our scientific understanding of thresholds matures.

Considerations for Application of Thresholds to Observations or Numerical Modeling Simulations
Ocean acidification thresholds can be used to provide consistent interpretation of monitoring data and regional oceanic model output for OA status and trends assessments and to predict the effects of climate change, an important foundation for marine water quality and marine resource management (Aminzadeh et al., 2016; Weisberg et al., 2016). Thresholds can be applied to numerical model simulations and observations to assess the magnitude, duration, and severity of OA effects, but guidance is required for their application, particularly with respect to decisions on data aggregation across spatial extent and temporal duration. Especially pertinent issues are local and regional species-specific life history, history of OA exposure, and species-specific optimal habitat conditions.
Understanding the vertical distribution of decapods, including specific life history stages, is one of the most important requirements for appropriately aggregating observations or model output. Separate application of thresholds for life stages (pelagic vs. benthic) provides insights into which habitats will be impacted first or more frequently. Life stage-specific OA thresholds should be applied seasonally, considering event-like pH settings, e.g., the onset and duration of major physical-biogeochemical events (i.e., phytoplankton productivity, upwelling cycles, timing of pelagic-benthic transitioning). Different life stages can sustain different levels of mortality without significant long-term impacts to the population (Batiuk et al., 2009). Uncertainty in applying these thresholds lies in whether there is recovery that resets the exposure clock when conditions rise above the threshold. Therefore, the identification of the critical conditions under which population-level effects occur will depend on life-stage sensitivities, on the cumulative effects of co-occurring environmental conditions, and on which life stages represent a potential demographic bottleneck for a given species. For example, the reproduction of Dungeness crab (Metacarcinus magister) in the northeast Pacific occurs in the winter and spring, with the coastal presence of the larval stages in the springtime, coinciding with the onset of upwelling (Bednaršek et al., 2020). Applying the threshold during upwelling helps in identifying the critical timing and regional aspects of the most critical life stages. Lastly, this review does not take into account the dynamic variation in pH at scales of less than a day in very shallow subtidal or intertidal habitats, although the variability in these habitats is subject to change under future climate change scenarios.
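The question of whether recovery resets the exposure clock can be made concrete with a small sketch. The reset rule used here is an assumption for illustration only (the synthesis does not specify one): the clock resets after `recovery_steps` consecutive time steps at or above the threshold, and shorter excursions pause the clock without resetting it. The function name and the pH values are hypothetical.

```python
def longest_exposure(ph_series, threshold, recovery_steps=1):
    """Return the longest accumulated run of below-threshold time steps.

    Assumed rule: the exposure clock resets only after `recovery_steps`
    consecutive steps at or above the threshold; shorter recoveries pause
    the clock but do not reset it.
    """
    longest = current = above = 0
    for ph in ph_series:
        if ph < threshold:
            current += 1
            above = 0
            longest = max(longest, current)
        else:
            above += 1
            if above >= recovery_steps:
                current = 0
    return longest

# Hypothetical daily pH values (illustrative only).
series = [7.70, 7.70, 7.80, 7.70, 7.70, 7.80, 7.80, 7.70]
strict = longest_exposure(series, threshold=7.75, recovery_steps=1)
lenient = longest_exposure(series, threshold=7.75, recovery_steps=2)
print(strict, lenient)
```

Comparing the resulting run length with a threshold duration (for example, 60 days for juvenile mortality at pH 7.75) shows how sensitive an exceedance assessment can be to the assumed recovery rule.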

Limits of This Synthesis and Priority Research Needs
Very few OA experimental studies had a sufficient number and appropriate range of treatment levels, which is why many studies were excluded from consideration. Where justified, we pooled information across diverse taxonomic groups within the order to simplify the applicability of thresholds across taxa and to augment statistical power and confidence. Frequently, this involved combining data from very different species (e.g., crabs and prawns) and regions (e.g., polar and tropical species), which occasionally produced a noisier dataset and decreased the expert panel's confidence in the threshold (e.g., larval mortality). Although species were pooled, the thresholds themselves were driven by those species that were sensitive to pH, as the thresholds identified were not adjusted to account for any non-sensitive species in the comparative group. Pooling across broader spatial scales can mask some of the region-specific responses that emerge from a common exposure history or local adaptation to pH levels (e.g., Calosi et al., 2013, 2017; Vargas et al., 2017; Thor et al., 2018), a consideration that is key to regional monitoring efforts and management goals. Pre-acclimation to such conditions could mean lower sensitivity compared to no previous exposure (Pane and Barry, 2007; Spicer et al., 2007; Small et al., 2020), but not necessarily, as this is a species- and stage-specific response. Nevertheless, we observed that in most cases the responses at the order level were similar, with the commonality of responses enhancing the ability to generalize the findings across larger spatial scales.

Regionally specific studies with appropriate exposure ranges
More studies are needed on benthic and pelagic species of regional importance, taken from locations that span their latitudinal range, including refinement of the OA conditions that different life stages experience locally and regionally. When conducting these studies, a greater number of treatment levels is needed to resolve OA thresholds more accurately, particularly in the range of OA levels where the effects are first observed.

Addressing multiple stressors
Multi-stressor experiments that include OA in combination with additional stressors are necessary because species responses can be non-linear. Univariate OA thresholds do not capture the modifying effects of multiple stressors, which can range from full compensation (no response) to additive, synergistic, and antagonistic responses. The interaction or even collinearity of pH with other potential stressors, such as temperature, salinity or oxygen will produce multi-dimensional response surfaces in decapods. There is an urgent need for improved information that can help to construct the multi-stressor response surfaces for a variety of different species and habitats. The shape of a response surface is an emergent property of the underlying system and as such, it can provide more accurate quantitative identification, assessment and interpretation of multiple parameter interactions and thresholds (Lintz et al., 2011).
The limited number of available studies suggests that the greatest uncertainties are associated with the interaction with elevated temperature. Elevated temperature in combination with reduced pH had negative effects on many biological endpoints, such as reduced hatching success, increased resting metabolic rate in shrimp and prawns, reduced survival in prawns (Dissanayake and Ishimatsu, 2011; Arnberg et al., 2013), and decreased search time and capture efficiency for prey and increased feeding rate (Wu et al., 2017). Warming experiments and field studies with low pH and increased temperature are the highest priority to derive more realistic threshold values and to develop more reliable estimates of habitat suitability. An OA threshold could be temperature-dependent over the entire temperature range, or over a narrow thermal window that defines a specific OA threshold. Other multiple-stressor studies report opposing effects that are most likely species-, life-stage- and treatment-specific; temperature combined with OA did not affect European lobster juveniles (Small et al., 2016), but in juvenile red king crab, temperature and OA had an antagonistic effect with a small increase in temperature and a synergistic negative effect with a larger increase. The effect of warming in combination with OA may be more severe for larval and megalopae stages; four out of five studies reported significant negative effects of increased temperatures on survival (Walther et al., 2010; Small et al., 2015; Waller et al., 2017; Gravinese et al., 2018), while three of the five studies noted that temperature had a more substantial impact on larval mortality than pH (Small et al., 2015; Waller et al., 2017; Gravinese et al., 2018).
Currently, there are limited data on hypoxia as a second stressor, although crustaceans appear to be more vulnerable to hypoxia than other taxa such as fishes, bivalves, and gastropods (Vaquer-Sunyer and Duarte, 2008; Spicer, 2014). As with temperature, the effects of combining OA and hypoxia seem to be species- and stage-specific; for example, while the combination of these stressors reduced respiration in adult porcelain crabs, no such effect was detected in hermit crabs (Steckbauer et al., 2015).
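The distinction between additive, synergistic, and antagonistic responses can be sketched for a 2x2 factorial design (control, stressor A alone, stressor B alone, both combined). The sketch assumes an additive null model on the response scale, which is one common convention among several; the function name, the tolerance `tol`, and all response values are hypothetical, and the sketch presumes both stressors shift the response in the same direction.

```python
def classify_interaction(control, stressor_a, stressor_b, combined, tol=0.05):
    """Classify a two-stressor interaction against an additive null model.

    Inputs are mean responses (e.g., survival fraction) from a 2x2 factorial
    design; `tol` is an assumed tolerance, in response units, within which
    the combined effect is called additive.
    """
    expected = (stressor_a - control) + (stressor_b - control)
    observed = combined - control
    if abs(observed - expected) <= tol:
        return "additive"
    # A combined effect larger in magnitude than the additive expectation
    # is synergistic; a smaller one is antagonistic.
    return "synergistic" if abs(observed) > abs(expected) else "antagonistic"

# Hypothetical survival fractions (illustrative only):
# control 0.90, low pH alone 0.80, warming alone 0.70.
for combined in (0.40, 0.60, 0.75):
    print(combined, classify_interaction(0.90, 0.80, 0.70, combined))
```

A full response surface would replace these discrete treatment means with a fitted function of pH and temperature, but the same comparison against an additive expectation applies at any pair of stressor levels.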

Role of adaptation and acclimatization
To understand the role of acclimation/acclimatization and adaptation in deriving thresholds and to avoid bias associated with more tolerant species, we need to systematically test different populations and species, as well as maternal and transgenerational effects, within the context of the range of environmental gradients to which they are naturally exposed now and in the future. Many decapod species can, to some extent, compensate for CO2-driven acidification of extracellular fluids by possessing effective mechanisms of acid-base regulation. Despite the apparent acid-base coping mechanisms, an individual's physiological development, life stage specificity and the ecological shifts it experiences can all contribute to the adaptation strategies for a number of reasons: physiological capacity may still be in development; larval exposure could carry negative impacts into subsequent stages, making juveniles less tolerant (sensu Walther et al., 2010); physiological control may be lost during juvenile organogenesis, which requires high energy expenditure (Noisette et al., 2021); or possessing such an ability as a larva may not be so critical given the habitats larvae occupy (although see Bednaršek et al., 2020). Exposure to hypercapnia may alter the swimming behavior of crab larvae in a species- and life stage-dependent manner; while there was no effect of hypercapnia on stage V stone crab larvae, earlier life stages changed to faster downward swimming, possibly affecting larval transport in increasingly hypercapnic sea water (Gravinese et al., 2019).
Such studies need to follow best practices to guarantee that experimental conditions and approaches are appropriate to account for and investigate epigenetic effects, environment/genetic interactions, and plastic responses, as well as conditioning, post-settlement selection and moderate-scale adaptation to local and regional conditions that can be of significant physiological importance. The majority of experimental studies included a relatively short exposure time, which could overestimate effects relative to what occurs in a natural ecological setting (Leuzinger et al., 2011). Many species are able to acclimate (i.e., exhibit phenotypic plasticity) if exposure is sufficiently long (e.g., Dupont et al., 2013; Zhao et al., 2019) and other species show potential for evolutionary adaptation (e.g., Parker et al., 2015; Thor and Dupont, 2015). Most of the studies used in this synthesis do not account for these processes because acclimation is difficult to account for in the statistical analyses. For example, in one experiment the mortality rate of juvenile blue king crab was high in the lowest pH treatment at the beginning of the experiment, but that rate decreased over time until it was equal to that of the control.

Need for realistic exposure regimes and community structure
Studies are needed to better integrate field observations and laboratory experiments (Spicer, 2014). This includes, for example, the use of experimental conditions that are more ecologically realistic, approaches that are less reductionist, prolonged exposure, and the use of exposure treatments that are informed by natural variability and fluctuations (e.g., Eriander et al., 2016) in duration, magnitude and frequency, as well as recovery periods. Uncertainty in applying these thresholds lies in understanding the duration of relaxation needed to allow for at least partial recovery of individuals before the next exposure event.
Consistent and standardized field observations, including studies along environmental gradients, in situ transplants and in situ experimental systems, can be used to validate experimental results and establish species population status. With multiple environmental conditions that define in situ responses, coupled with novel sensors, co-located chemical-biological observations and the use of sensitive biological indicators, field studies could help to delineate the importance of potentially confounding environmental parameters and adjust the experimentally derived thresholds. Field studies can also serve toward scaling up from individual to population and community responses, validating empirical models under future projected scenarios. The use of mesocosms can provide insight into how species-specific OA-related sensitivities scale up through multiple processes (e.g., predation pressure, competition, and invasive species interactions; e.g., McGaw et al., 2011) to structure ecosystems and determine climate change related ecosystem tipping points.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
NB designed and led the study. NB led the review process and analyses, with MR conducting the review and assembling and analyzing the data. MR conducted the threshold analyses, which were revised by NB and the expert panel. FK conducted the analyses of numerical model simulations. NB, RA, PC, RC, RF, SL, WL, JSp, JŠt, and JT served as the expert panelists. MS and SW facilitated the expert panel process. NB led and completed manuscript writing with input from all the co-authors. All the authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
The expert panel consisted of the first 10 co-authors of the study; the remaining four did not vote in the process. This research was supported by the California Ocean Protection Council, grant number C0302500. Reference to trade names or commercial firms does not imply endorsement by the National Marine Fisheries Service, NOAA. The findings and conclusions in the manuscript are those of the authors and do not necessarily represent the views of the National Marine Fisheries Service, NOAA. This is contribution number 5121 from NOAA PMEL.