Synthesis of Thresholds of Ocean Acidification Impacts on Echinoderms

Assessing the vulnerability of marine invertebrates to ocean acidification (OA) requires an understanding of critical thresholds at which developmental, physiological, and behavioral traits are affected. To identify relevant thresholds for echinoderms, we undertook a three-step data synthesis focused on California Current Ecosystem (CCE) species. First, literature characterizing echinoderm responses to OA was compiled, creating a dataset of >12,000 datapoints from 41 studies. Analysis of this dataset showed that low pH exposure affected physiology, behavior, growth and development, and mortality in both the larval and adult stages. Second, statistical analyses were conducted on selected pathways to identify OA thresholds specific to duration, taxa, and depth-related life stage. Exposure to reduced pH led to impaired responses across a range of physiology, behavior, growth and development, and mortality endpoints for both larval and adult stages. Third, through discussion and synthesis, the expert panel identified a set of eight duration-, life stage-, and habitat-dependent pH thresholds and assigned each a confidence score based on the quantity and agreement of evidence. The thresholds ranged in pH from 7.20 to 7.74 and in duration from 7 to 30 days, and all were characterized with either medium or low confidence. These thresholds yielded a risk range from early warning to lethal impacts, providing the foundation for consistent interpretation of OA monitoring data or numerical ocean model simulations to support climate change marine vulnerability assessments and the evaluation of ocean management strategies. As a demonstration, two echinoderm thresholds were applied to simulations of a CCE numerical model to visualize the effects of current pH conditions on potential habitat.

INTRODUCTION
Climate change is accelerating the oceanic uptake of anthropogenic CO2, causing a decline in pH and in the availability of calcium carbonate minerals, a process termed ocean acidification (OA). Intensification of OA conditions directly contributes to the decline in suitable habitats for many marine invertebrates, such as mollusks, echinoderms, crustaceans, and pteropods (Kroeker et al., 2013; Fabricius et al., 2014; Przeslawski et al., 2015; Bednaršek et al., 2019, 2020), potentially altering their population abundances and causing local extirpation, shifts in species' geographical ranges and, at worst, global extinction. Many studies have described pathways related to OA vulnerability, but only a few have identified the thresholds at which those effects occur. Taxon-specific syntheses are lacking on the critical thresholds at which OA impacts a range of developmental, physiological, biomineralization, growth, behavioral, and lethal endpoints. These thresholds are critical to the development of management strategies, as they provide an understanding of the priority species and geographic areas and a means with which to evaluate preventative or ameliorative strategies. Ocean acidification thresholds have been previously described for pteropods (Bednaršek et al., 2019) and decapods (Bednaršek et al., in review). Thresholds have yet to be comprehensively described for echinoderms despite their known sensitivity (e.g., Dupont et al., 2008; Wood et al., 2008; Spicer et al., 2011; Kroeker et al., 2013; Stumpp et al., 2013; Byrne and Fitzer, 2019; Byrne and Hernández, 2020); only a few thresholds have been reported (Dorey et al., 2013; Jager et al., 2016; Lee et al., 2019).
Echinodermata is one of the most abundant and ecologically successful phyla of marine animals and has colonized a range of habitats, including shallow coastal, neritic (0-200 m), upper bathyal (0.2-1 km), and lower bathyal (>2000 m) zones (Micael et al., 2009; Lawrence, 2013, 2020; Byrne and O'Hara, 2017). Echinoderm classes, including the Crinoidea (e.g., feather stars), Ophiuroidea (e.g., brittle stars), Asteroidea (e.g., sea stars), Echinoidea (sea urchins), and Holothuroidea (sea cucumbers), are ecologically important as ecosystem engineers in rocky reef ecosystems, serving as macroalgal grazers and keystone predators whose grazing and predation play a key role in structuring benthic communities (Paine, 1966; Lawrence, 2013, 2020; Filbee-Dexter and Scheibling, 2014). They also have important functions in benthic-pelagic coupling as bioturbators and remineralizers (Ambrose et al., 2001) and as a food source for crustaceans, fish, mammals, and sea birds (Baden et al., 1990; Mattson, 1990; Wolfe et al., 2018). Economically, sea urchin gonads and sea cucumber-based beche-de-mer products contribute to a multi-million-dollar industry for international food markets (Conand and Byrne, 1993; Eriksson and Byrne, 2015; O'Hara and Byrne, 2017; Teck et al., 2018).
Echinoderms represent an ideal animal model for understanding the balance between sensitivity and tolerance mechanisms to OA conditions, due to their diverse habitat preferences and the associated local and regional carbonate chemistry exposure regime (Wolfe et al., 2020) that structures their acclimatization and adaptation strategies (Barry et al., 2010; Yu et al., 2011; Calosi et al., 2013b, 2017; Kelly et al., 2013; Sunday et al., 2014; Vargas et al., 2017). The sensitivity of echinoderms to OA occurs via multiple pathways that are also life stage specific. Common pathways of OA sensitivity include reduced capacity for homeostasis, skeletogenesis, and growth; increased mortality; impaired reproduction; alterations to chemosensory behavior; and reduced feeding efficiency (Wood et al., 2008, 2011; Byrne et al., 2010; Spicer et al., 2011; Barry et al., 2014; Taylor et al., 2014). pH effects on larvae occur through various pathways (Politi et al., 2004; Byrne and Przeslawski, 2013; Padilla-Gamiño et al., 2013), such as the allocation of metabolic energy for protein synthesis and ion transport (Pan et al., 2015), with acid-base and osmoionic regulation reaching its capacity threshold (Dupont and Thorndyke, 2012; Stumpp et al., 2012b; Dubois, 2014; Smith et al., 2016; Byrne and Fitzer, 2019), resulting in reduced scope for growth and metabolic requirements and, ultimately, in impaired organismal performance and survival. Given the greater OA sensitivity of larval echinoderms and the control that larval recruitment exerts on population abundance and distribution, the larval OA thresholds define a key point of vulnerability by which the ecological and economic role of echinoderms within coastal marine ecosystems could be heavily compromised in the future.
This study presents a comprehensive literature review and synthesis of studies of experimental and in situ OA exposure published through October 2018 for echinoderm taxa of the California Current Ecosystem (CCE), an eastern boundary upwelling system on the Pacific Coast of North America. From ∼12,000 compiled data points, statistical analyses were conducted to identify duration-dependent pH thresholds specific to life stage and habitat (pelagic, benthic). Through expert discussion, deliberation, and consensus, eight representative pH thresholds were selected, ranging from duration-dependent early warning to lethal impacts. The experts also provide recommendations on how to apply these thresholds to monitoring data and model output, demonstrated here through an example application to a CCE numerical model that illustrates the importance of life history in determining species sensitivity. Finally, further research is recommended to improve the understanding of echinoderm responses to OA and to support the implementation of OA thresholds in various management applications, from climate change to local pollution impact assessments.

MATERIALS AND METHODS
The synthesis was conducted in three steps. First, a review of literature was conducted on the effects of OA on echinoderm species present in the CCE to compile information on the species-specific experimental or observational responses to OA stress, including the treatment duration, life stage, and geographic region. When limited information was available for a CCE species, a geographically expanded data search was conducted on phylogenetically closely related species. Second, statistical analyses were used to identify thresholds of OA for each of the impairment pathways. Finally, a panel of echinoderm experts provided their professional judgment to evaluate the quality of studies, weight (or remove) them based on that data quality assessment, and assign a duration-dependent threshold and associated confidence score, which was based on the data quality and consistency of findings for the studies (Table 1). The experts also scored the thresholds for regional importance in the CCE (Table 1). To illustrate how these thresholds could be applied, two thresholds were applied to simulations of a regional biogeochemical model to visualize potential habitat constraints.

Literature Review and Compilation of Raw Experimental or Field Observation Data
The literature search was conducted by searching Google Scholar and Web of Science, as well as the reference lists of the identified studies, for work published up to October 1, 2018. The search terms used included "ocean acidification" or "climate change" with "echinoderm." The literature search and review identified 40 experimental studies and one observational study that met the minimum criteria of: (1) experimental studies with data on a minimum of four treatment levels or (2) field stress-response studies conducted across a range of pH values. Note that individual studies were not each required to have four treatment levels; rather, the studies that together contributed to a specific threshold needed to span at least four experimental treatment levels.
This search yielded a review database of 13 species, including six sea urchins/sand dollars (Dendraster excentricus, Lytechinus pictus, Mesocentrotus franciscanus, Strongylocentrotus droebachiensis, Strongylocentrotus fragilis, and Strongylocentrotus purpuratus) and four brittle stars (Amphiura filiformis, Ophiocten sericeum, Ophionereis schayeri, and Ophiura ophiura). Approximately 55% of the data came from CCE-specific studies. No data suitable to generate pH thresholds were available for holothuroids and crinoids, so these groups were excluded from the review. Data were extracted from Supplementary Material associated with the articles or from the articles' figures using WebPlotDigitizer version 4.3 (https://automeris.io/WebPlotDigitizer). A range of covariates were recorded, including the life stage, taxonomic information, location, and treatment duration (days). A total of ∼12,000 datapoints were extracted from the raw experimental data in published literature or from authors with unpublished data. Data from multi-generation studies were excluded.
An explicit step in the review involved standardizing the specific response metric and/or the OA measure, particularly for growth rate and feeding rate in relation to individual body size. If needed, metabolic parameters were transformed so that the units were comparable among studies. A variety of OA exposure metrics were used across studies, but the majority used pH as the primary measure; as such, all data were placed in the context of this single parameter. When another carbonate system parameter was utilized but the paper contained the necessary water chemistry values, it was converted to pH using the R seacarb package (Gattuso et al., 2019), using Equation 1. Program preferences were set to use carbonate system solubility products from Lueker et al. (2000), KHSO4 dissociation constants from Dickson et al. (1990), total boron from Lee et al. (2010), and Kf from Dickson and Riley (1979) as cited in Dickson and Goyet (1994). If pH values were unavailable or could not be generated, the study was excluded. When possible (i.e., when sufficient water chemistry data were available), pH values were converted to pHT.
Additional categorical variables were recorded in the review database and included life stage, habitat depth, geographic origin, and duration. Egg and larval (pelagic), juvenile and adult (benthic) life stages of echinoderms are physiologically disparate and will experience fundamentally different exposure regimes. Thus, data were categorized by life stage independently. Most data (about 50%) were from studies of larval stages, while only 17% of the data came from adult life stages. Additionally, benthic organisms will experience different pH ranges depending on whether they inhabit shallow (<200 m) or deep water (>200 m depth) so species data were categorized by habitat depth. All the data came from shallow-water species, apart from one deepwater species (S. fragilis). A total of 237 response endpoints across various species, life stages and geographic locations were characterized into four effect categories: (1) behavior, (2) growth, (3) physiology, and (4) mortality.

Threshold Analysis
Two types of statistical analyses were used to support expert decisions on thresholds (Supplementary Figure 1). First, a breakpoint was identified, defined as the point at which there is a significant change in echinoderm responses (y-axis) as a result of the incremental change in the environmental stressor (x-axis, e.g., pH or pHT). Piecewise regression analysis was used to identify the breakpoint in slopes in the response measures over the gradient in OA stress (package "segmented"; version 2.15.1, Muggeo, 2008; R Development Core Team, 2020). A Davies test was used to determine the significance of the breakpoint. Second, the data were fitted with least squares regression (LSR). When the LSR was significant at p < 0.05 and the data followed a linear trend, the panel (see below) considered what constituted meaningful changes in the response parameters and used that information to support their threshold decisions. Table 1 provides the LSR-predicted percent change, relative to the mean of the control treatments in the dataset, that the final pH threshold represents, based on expert consensus. Data were analyzed in R (Version 3.5.0). Only analyses that were significant at p < 0.05 are reported in the text (Table 1). After running the threshold analyses for all studies, the thresholds were grouped by response metric, life stage, habitat depth, duration, and species.
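The two analyses described above can be illustrated with a minimal Python sketch. It is not the R "segmented" workflow used in the study and it does not implement the Davies test; it assumes a single breakpoint found by a simple grid search over a continuous two-segment line, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_lsr(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0]), float(coef[1])

def find_breakpoint(x, y, n_grid=200):
    """Grid-search one breakpoint for a continuous two-segment line.

    Assumption: candidate breakpoints are restricted to the 10th-90th
    percentile of x so that both segments contain data.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lo, hi = np.quantile(x, [0.1, 0.9])
    best_psi, best_rss = None, np.inf
    for psi in np.linspace(lo, hi, n_grid):
        # Columns: intercept, slope, and change in slope beyond psi.
        design = np.column_stack([np.ones_like(x), x, np.maximum(x - psi, 0.0)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ coef) ** 2))
        if rss < best_rss:
            best_psi, best_rss = float(psi), rss
    return best_psi

def percent_change_from_control(ph, intercept, slope, control_mean):
    """LSR-predicted response at a given pH, as percent change from control."""
    predicted = intercept + slope * ph
    return 100.0 * (predicted - control_mean) / control_mean

# Synthetic illustration only: response is flat above pH 7.6 and declines below it.
rng = np.random.default_rng(0)
demo_ph = np.linspace(7.0, 8.2, 61)
demo_response = (10.0 - 8.0 * np.maximum(7.6 - demo_ph, 0.0)
                 + rng.normal(0.0, 0.05, demo_ph.size))
demo_breakpoint = find_breakpoint(demo_ph, demo_response)
```

The grid search trades statistical rigor for transparency: it returns only a point estimate, so significance testing and confidence intervals would still require a dedicated tool such as the package cited above.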

Expert Consensus on Thresholds and Exposure Duration and Assessment of Uncertainty
We employed a formal, structured process to obtain the expert evaluation and synthesis of OA effects on selected echinoderm parameters, including: (1) evaluation of study data and threshold analyses; (2) assembly of expert panel members based on their expertise; (3) expert discussion through the consensus-based approach to select the final set of thresholds and associated confidence scores. The expert panel was recruited based on the following criteria related to the required expertise: (1) echinoderm ecophysiology; (2) echinoderm taxonomy and morphology, biomechanics, biomineralization; (3) echinoderm ecology and community ecology; and (4) OA biogeochemistry and observations. The candidates were ultimately selected based on a qualitative evaluation of the quality and depth of their publication record and their availability to attend a 3-day workshop. OA expertise was highly regarded, but not strictly required if the expert provided a unique expertise.
Table 1 caption (continued): The next set of columns gives the final expert consensus pH and associated duration (days), then, for reference, the percent change that the expert-derived threshold represents from the mean experimental control value when the least squares regression (LSR) was significant (p-value < 0.05), and, when significant, the statistical breakpoint pH. The expert certainty rating is specified based on the degree of evidence and the degree of agreement among the studies considered (ranging from high to low). CCE relevance is specified as "1" indicating low relevance (the species is not found in the CCE, and the study was not conducted on organisms from the CCE); "2" indicating medium relevance (the species is found in the CCE but the study was conducted on organisms that were not from the CCE); and "3" indicating high relevance (the species is found in the CCE and the study was conducted on organisms from the CCE). When "2.5" is specified, the mix of studies evaluated spanned a range of CCE relevance of 2 to 3. Finally, the source of the data used to derive the thresholds is given.
Frontiers in Marine Science | www.frontiersin.org
Our process required that the experts consider and discuss the statistical threshold analyses and the underlying study data, apply their individual expertise, communicate relevant information, and make recommendations or challenge the conclusions to ultimately generate a set of consolidated thresholds. The experts excluded studies with data quality limitations or interpretability issues and refined the interpretation of the analytical outcomes based on their knowledge of the studies and of how data should be weighted for unbiased results. In one case (swimming speed), they included Chan et al. (2011), which had only two treatments, by combining it with Chan et al. (2016), given the compatible response measures and experimental procedures. After discussion of these data and their interpretation, each of the nine experts, now listed in the co-author contribution section, voted on the numeric value as well as the confidence score for each specific threshold, providing a rationale for their choice as needed, particularly if it deviated from the majority opinion. The experts identified the most appropriate time frame (duration) over which the thresholds should be applied. Experts could abstain from voting if they did not feel confident that there was sufficient evidence that the threshold existed for a given metric. In cases where the experts disagreed with the statistical result and/or were divided in their opinions, the threshold value with the most votes was chosen; if there was no clear majority on the value and the uncertainty, the value was discussed until consensus was reached.
Experts were asked to score their confidence in the final threshold into one of nine categories based on the scoring matrix developed by the IPCC (Supplementary Figure 1). The matrix has two components, agreement and evidence. Agreement indicates the consistency of results among studies, while evidence encompasses the quantity and quality of studies used to establish the threshold. For agreement, a study comprising a single experiment had a low level of agreement by default, though a single paper could contribute to a high level of agreement if multiple experiments were done on multiple species. The group identified 1-3 studies as low evidence, 3-10 studies as medium evidence, and 10 or more studies as robust evidence. These were guidelines rather than strict rules because individual studies contained different amounts of data. Agreement referred to agreement among experiments on the specific parameter being considered (e.g., swimming speed) as opposed to the general category as a whole (e.g., behavior).

Scoring for Regional Relevance
Echinoderms have diverse habitat preferences and exposure related to the natural variability of the local and regional carbonate chemistry (Wolfe et al., 2020) that structures their acclimatization and adaptation strategies (Barry et al., 2010; Yu et al., 2011; Calosi et al., 2013b, 2017; Kelly et al., 2013; Sunday et al., 2014; Vargas et al., 2017). Thresholds were designated with a CCE regional relevance score, acknowledging this issue of local acclimatization and adaptation. A score of 1 indicated low relevance (species that are not present in the CCE, and study not conducted on specimens from the CCE); 2 indicated medium relevance (species that are found in the CCE but study conducted on specimens from outside of the CCE); and 3 indicated high relevance (species that are found in the CCE and study conducted on specimens from the CCE) (see Tables 1, 2). All urchin data came from CCE species, whereas brittle star data came from species not found in the CCE.
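The relevance scoring can be written down as a small Python sketch. The function names are hypothetical. Two points are assumptions rather than statements from the text: the combination of a non-CCE species with CCE specimens is not defined by the scheme and is scored 1 here, and a mixed score such as 2.5 is obtained by averaging the per-study scores.

```python
from statistics import mean

def cce_relevance(species_in_cce: bool, specimens_from_cce: bool) -> int:
    """Score one study for CCE regional relevance (1 = low, 3 = high)."""
    if species_in_cce and specimens_from_cce:
        return 3
    if species_in_cce:
        return 2
    # Assumption: any study on a species absent from the CCE scores 1.
    return 1

def threshold_relevance(studies):
    """Combine per-study scores for one threshold.

    `studies` is an iterable of (species_in_cce, specimens_from_cce) pairs.
    Assumption: the combined score is the arithmetic mean of study scores.
    """
    return mean(cce_relevance(species, specimens) for species, specimens in studies)

# One CCE study and one non-CCE study on a CCE species.
demo_score = threshold_relevance([(True, True), (True, False)])
```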

Applying Thresholds to Numerical Ocean Biogeochemical Model Simulations
Experts recommended the use of measures such as: intensity (magnitude of deviation from the threshold), duration (number of days below the threshold), frequency (the number of events below the pH threshold for events longer than the duration threshold), severity (duration times intensity of departure from the threshold), and recovery (average number of days between adverse events; Hauri et al., 2013). Application of OA thresholds to either monitoring data or numerical model simulations considered available data on relevant demographic parameters. This included habitat-specific life history, the vertical dimensions of the habitat of different life stages and their seasonal occurrence, appropriate spatial and vertical aggregation, duration of exposure, and adaptation traits.
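The five measures can be computed from a daily pH time series as in the following Python sketch. It is an illustration under stated assumptions, not the procedure used in the study: an event is taken to be a run of consecutive days below the threshold lasting at least `min_days`, intensity is the mean departure below the threshold during the event, and recovery is the mean gap between the end of one event and the start of the next. The function name and the example series are hypothetical.

```python
import numpy as np

def exposure_metrics(ph, threshold, min_days):
    """Summarize sub-threshold events in a daily pH series."""
    ph = np.asarray(ph, dtype=float)
    below = ph < threshold
    # Pad with False so runs touching either end are still delimited.
    padded = np.concatenate(([False], below, [False])).astype(int)
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)   # first day of each run
    ends = np.flatnonzero(change == -1)    # first day after each run
    # Assumption: runs of at least min_days count as adverse events.
    events = [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_days]
    durations = [e - s for s, e in events]
    intensities = [float(np.mean(threshold - ph[s:e])) for s, e in events]
    severities = [d * i for d, i in zip(durations, intensities)]
    gaps = [events[k + 1][0] - events[k][1] for k in range(len(events) - 1)]
    recovery = float(np.mean(gaps)) if gaps else float("nan")
    return {
        "frequency": len(events),
        "duration_days": durations,
        "intensity": intensities,
        "severity": severities,
        "recovery_days": recovery,
    }

# Synthetic 40-day series with two qualifying events and one short dip.
demo_series = [8.0] * 5 + [7.6] * 8 + [8.0] * 10 + [7.7] * 3 + [8.0] * 4 + [7.5] * 7 + [8.0] * 3
demo_metrics = exposure_metrics(demo_series, threshold=7.75, min_days=7)
```

In this example the 3-day dip to pH 7.7 is below the threshold but shorter than the 7-day duration, so it is not counted as an event and its days fall inside the recovery interval.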
These measures were applied for a set of OA thresholds to a set of model simulations from the Regional Ocean Modeling System (ROMS; McWilliams, 2005, 2009), with biogeochemical elements (Biogeochemical Elemental Cycling; BEC, Moore et al., 2004), that has been developed for the CCE, the Pacific Coast of North America. The ROMS-BEC model provides a realistic three-dimensional representation of the physical circulation and the biogeochemical cycles of nutrients, oxygen, inorganic and organic carbon species, and several plankton functional groups (Deutsch et al., in review). The simulation analyzed here is from the southern CCE domain of ROMS-BEC at 1 km horizontal resolution, a geographic region synonymous with the California coast (Renault et al., 2020). Daily pH values at specific water depths that were life stage- and species-specific were used together with the duration of exposure associated with a specific threshold over the 10-year simulation from 1996 to 2007. Two case studies focused on thresholds applicable to two different life stages. A respiratory effects threshold (pH = 7.75; 1-week duration) was applied year-round for the adult sea urchin species (e.g., S. fragilis, S. droebachiensis, etc.) to represent their habitat found along the 0-500 m shelf of the Californian coast. A larval behavioral response threshold (impaired swimming speed; pH = 7.7; 1-week duration) was applied to pelagic habitat for echinoderm species including S. purpuratus and D. excentricus over the April-July period (Kenner and Lares, 1991; Basch and Tegner, 2007; Hammond and Hofmann, 2010).
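A gridded application of a duration-dependent threshold can be sketched in Python as follows. The array is a small synthetic stand-in for daily model pH at one depth level, not ROMS-BEC output, and the function names are hypothetical. The sketch assumes that a grid cell is flagged when pH stays below the threshold for at least the stated number of consecutive days at any time in the analysis window.

```python
import numpy as np

def longest_run_below(ph, threshold):
    """Longest run of consecutive days below threshold in each grid cell.

    `ph` has shape (time, ny, nx) with one value per day.
    """
    run = np.zeros(ph.shape[1:], dtype=int)
    longest = np.zeros(ph.shape[1:], dtype=int)
    for day in ph:  # iterate along the time axis
        # Extend the current run where pH is below threshold, else reset it.
        run = np.where(day < threshold, run + 1, 0)
        longest = np.maximum(longest, run)
    return longest

def exceedance_mask(ph, threshold, min_days):
    """True where a cell had at least min_days consecutive sub-threshold days."""
    return longest_run_below(ph, threshold) >= min_days

# Synthetic 30-day, 2 x 2 grid (illustration only).
demo_ph = np.full((30, 2, 2), 8.0)
demo_ph[5:12, 0, 1] = 7.6   # 7 consecutive low-pH days
demo_ph[0:6, 1, 0] = 7.6    # 6 consecutive low-pH days
demo_ph[:, 1, 1] = 7.6      # low pH throughout
demo_mask = exceedance_mask(demo_ph, threshold=7.75, min_days=7)
demo_fraction = demo_mask.mean()  # fraction of cells flagged
```

Restricting the time axis to a seasonal window (for example the larval period) or selecting depth levels by life stage would be done by slicing the input array before calling these functions.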

RESULTS OF EXPERT CONSENSUS
pH as the Measure to Define OA-Related Stress Level
The expert panel recommended pH as the measure to describe stress levels in echinoderms because the majority of studies reported pH as a variable (12 out of 16) and because most of the synthesis and meta-analysis literature to date has used pH as a comparable response. Based on the studies selected, experimental treatments spanned a pH range of 6.44-8.31 (Figure 1), with exposure duration varying from 1 day to 5 months, but predominantly <50 days.

Synthesis of Thresholds by Endpoints
The expert synthesis produced eight thresholds across multiple physiological, growth and development, behavioral, and survival endpoints (Table 1 and Figure 2), scored by expert confidence (Figure 3 and Table 1) and CCE relevance (Table 1), and considered the implications of each for the echinoderm group.

Behavior: Larvae
The larval behavior threshold was based on larval swimming speed data from two papers on the sand dollar D. excentricus and the sea urchin S. purpuratus (Chan et al., 2011, 2016). Swimming speed increased with decreasing pH. The panel placed the threshold at pH = 7.70 (7-day duration), corresponding to an LSR-predicted 35% reduction from the experimental control (p = 0.023, R2 = 0.12, F = 5.22, DF = 40). The threshold was rated as low evidence and agreement (two studies, Table 1). The observed change in swimming behavior could be a result of coordinated morphological changes (D. excentricus), which have biomechanical implications for the ability of larvae to maintain directed movement (stability) in moving water and may imply settlement at a smaller size and an elevated risk of juvenile mortality (Chan et al., 2015).

Behavior: Adults
The threshold for behavior of adults living in deep water below 200 m was determined based on a single study of S. fragilis related to righting time. The righting time (after being flipped over), a measure of muscular function, could serve as an indirect measure of exposure to physiologically stressful conditions, and it increased with decreasing pH. The breakpoint analysis detected a significant breakpoint at pH = 7.25 (p = 0.034), which the panel adjusted down to 7.20 (30-day duration), with a low evidence confidence rating (single study). In general, the panel considered the behavioral category to be not only data deficient but also highly variable, making it difficult to generalize this threshold across a wide range of behavior patterns. Only deep-water data were available for this threshold; there is no equivalent shallow-water threshold that could be synthesized.

Physiology: Larvae
Respiration rate was considered a proxy for aerobic metabolism and was derived from Dorey et al. (2013) and Stumpp et al. (2013). At the chosen threshold of 7.74 (14-day duration), respiration was an LSR-predicted 25% higher than the experimental control value (p = 0.04, R2 = 0.04, F = 0.74, DF = 16), a value with low evidence (two studies). Larval respiration increased with reduced pH levels, an effect that could be attributed to increased basal metabolic demand (Dorey et al., 2013) and a reduction in gastric pH and intracellular pH (Stumpp et al., 2012a, 2013). This is indicative of an increased cost of maintaining cellular and digestive system acid-base balance, which resulted in increased compensatory feeding.

Physiology: Shallow-Water Adults
Two metrics, extracellular fluid pH and respiration rate, were combined into the category of physiological threshold for adults of shallow species (Wood et al., 2008, 2010, 2011; Christensen et al., 2011; Spicer et al., 2011; Taylor et al., 2014). Despite some initial evidence of compensation or intracellular buffering capacity (Stumpp et al., 2012b; Moulin et al., 2014; Stumpp and Hu, 2017), reduced seawater pH led to an overall linear decrease in extracellular fluid pH. For this reason, and because the response was strongly linear (LSR p < 0.001, R2 = 0.81, F = 167, DF = 40), the panel placed the extracellular fluid pH threshold at a lower pH = 7.60 (7-day duration), an LSR-predicted value 42% below the experimental control (Table 1), with low evidence. The panel noted that the buffering capacity is highly species-specific (Calosi et al., 2013a), varies among individuals within a species (Guscelli et al., 2019), and that it can take up to 6-12 months for a species to regain its initial acid-base status (Spicer, unpubl.). Data on rates of aerobic metabolism came from four studies (Table 2) of brittle stars (A. filiformis, O. ophiura, O. schayeri, and O. sericeum), and were normalized using control values to facilitate study comparisons (Wood et al., 2008, 2010, 2011; Christensen et al., 2011). The panel set the respiration threshold at pH = 7.75 (7-day duration) based on consistent changes after a week of exposure in all species, a value representing an LSR-predicted 35% increase from the experimental control (p = 0.004, R2 = 0.55, F = 13.22, DF = 11). The expert panel gave this threshold medium evidence and medium-to-high agreement.
Respiration rate initially increased with decreasing pH, indicating increased energetic demand for vital biological processes, such as calcification (Wood et al., 2008) or mucus production (Christensen et al., 2011), as well as a likely plastic metabolic response that buffers against the temporary increase in energetic demand. Brittle stars have the capacity to withstand a certain degree of physiological stress due to acclimatization and/or adaptation to local conditions (as seen in echinoderms, Pespeni et al., 2013, and in other phyla, see Calosi et al., 2013b; Lardies et al., 2014; Vargas et al., 2017). However, the respiration rates ultimately decreased, and the respiratory threshold identified here refers to the loss of compensation capacity.

Physiology: Deep-Water Adults
The feeding rate threshold for the adults of deep-water species was determined from a single study investigating the feeding rate of S. fragilis, a species that, while adapted to the lower pH conditions typical of depths >200 m, exhibits poor ability to regulate acid-base balance. To utilize this study, the panel adjusted the control pH in the experimental treatments from 7.92 to 7.64 to reflect naturally occurring low pH conditions. Feeding rate decreased linearly, and the final threshold of pH = 7.38 was roughly equivalent to an LSR-predicted 25% reduction from the control treatment (p < 0.001, R2 = 0.47, F = 45.23, DF = 52). With only a single study, it was rated low in confidence (Table 1). This threshold is important for understanding how shifts in grazing rates, competition, and other biological interactions can restructure communities in deep-sea benthic ecosystems. Reduced foraging efficiency could lead to food limitation and a potential decline in an individual's growth and survival, and could ultimately negatively affect population dynamics.

Growth and Development: Larvae
The thresholds for larval growth rate (expressed as µm d−1) were developed from three different metrics and 10 studies (see Table 1). The three metrics were: growth rate along the midline for S. purpuratus, total body length and symmetry index, for S. droebachiensis and S. purpuratus, respectively. The growth rate for S. purpuratus was calculated by fitting a logarithmic regression on the size data over the first week of the experiment, when growth is mostly linear (Stumpp et al., 2011; Dorey et al., 2013). Only papers that found a significant treatment-related difference in growth rate were included (Stumpp et al., 2011; Pespeni et al., 2013; Pan et al., 2015; Chan et al., 2016). The panel positioned the growth rate threshold for S. droebachiensis at pH = 7.49 (7-day duration) based on a 35% LSR-predicted reduction from the experimental control (p < 0.001, R2 = 0.61, F = 42.38, DF = 27). The total body length threshold, based on six studies of S. purpuratus (Yu et al., 2011; Matson et al., 2012; Kelly et al., 2013; Padilla-Gamiño et al., 2013; Pespeni et al., 2013), was placed at pH = 7.67 (7-day duration), 46% below the experimental control value, and below the statistical breakpoint of 7.74 (p = 0.025). The final threshold considered was based on a single study on S. droebachiensis (Dorey et al., 2013) of a symmetry index of the right and left postoral arms, with asymmetry indicating abnormal morphological development. The panel set the threshold at pH = 7.64 (14-day duration), a value corresponding to an LSR-predicted 35% linear reduction from the experimental control (p = 0.0018, R2 = 0.36, F = 12.63, DF = 22).
The panel thought it useful to provide a single mean threshold for larval growth and development based on the three thresholds. The panel judged that two of these thresholds (larval growth rate, total body length) were of higher quality and gave them twice as much weight as the symmetry index, which was of fair quality. This yielded a final threshold of pH = 7.62 (7-day duration, set conservatively) with a high level of evidence and medium-to-high agreement between studies for larval growth and development. Reduced larval growth and development signal important changes in the organismal energy budget. They can also alter larval movement and clearance rates (Chan, 2012), prolong the pelagic larval stage, delay settlement, and significantly reduce the number of settlers due to high mortality in the plankton.

Growth and Regeneration: Adults
The panel considered establishing a growth and regeneration threshold for shallow-water adults based on a single study examining arm regeneration rate in O. ophiura (Wood et al., 2010). Arm regeneration rate increased with decreasing pH, and a threshold was considered at pH = 7.63 (30-day duration), consistent with the LSR 35% value of pH = 7.63 (p = 0.004, F = 13.7, DF = 10). However, the study had only three treatments, and the panel ultimately chose not to set a threshold because of concerns about data quantity and consistency with other thresholds. Likewise, no equivalent dataset exists from which to derive a threshold for deep-water species.

Mortality: Larvae
Larval mortality, which increases with declining pH, signals the potential number of recruits to the population. The mortality threshold was based on data from two studies on S. droebachiensis that tracked mortality over time; mortality occurred after 1 week of exposure (Dorey et al., 2013; Chan et al., 2015). The panel set the threshold at pH = 7.23 (7-day duration), a value consistent with breakpoint analyses (p < 0.001).
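The text does not specify how the breakpoint analyses were carried out. One common approach, sketched below under that assumption, is a grid search for a single breakpoint in a hinge ("broken-stick") model: the response is flat above the breakpoint and changes linearly below it, and the candidate with the smallest residual sum of squares is selected. The mortality values, the candidate grid, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hinge_sse(ph, response, break_ph):
    """Residual sum of squares for a model that is flat above break_ph and linear below it."""
    hinge = [max(0.0, break_ph - p) for p in ph]
    if len(set(hinge)) < 2:
        return None  # no data below this candidate, so the slope is undefined
    slope, intercept = fit_line(hinge, response)
    return sum((r - (intercept + slope * h)) ** 2 for h, r in zip(hinge, response))


def find_breakpoint(ph, response, candidates):
    """Return the candidate breakpoint with the smallest residual sum of squares."""
    best = None
    for c in candidates:
        sse = hinge_sse(ph, response, c)
        if sse is not None and (best is None or sse < best[1]):
            best = (c, sse)
    if best is None:
        raise ValueError("no candidate breakpoint could be fitted")
    return best[0]


# Hypothetical larval mortality (%) that is stable down to pH 7.4 and rises below it.
ph_values = [8.0, 7.8, 7.6, 7.4, 7.2, 7.0, 6.8]
mortality = [10.0, 10.0, 10.0, 10.0, 30.0, 50.0, 70.0]
candidates = [round(7.0 + 0.05 * i, 2) for i in range(19)]  # pH 7.00 to 7.90

best_break = find_breakpoint(ph_values, mortality, candidates)
print(f"Estimated breakpoint: pH = {best_break:.2f}")
```

A grid search gives only a point estimate; a significance test for the breakpoint, such as the p-value reported above, would require a dedicated segmented-regression procedure.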
No adult mortality threshold could be produced from the synthesized studies, an artifact of the limited lower range of experimental pH conditions tested. Experiments are often designed to investigate future conditions of the surface open water, which deviate largely from current benthic pH conditions (e.g., as seen for intertidal habitats, Wolfe et al., 2020). In this case, insufficient experimental data could be partially complemented by evidence from spatially limited field studies around marine vents. In these environments, echinoderms have been observed at pH = 7.1 but not at pH = 6.8 (Calosi et al., 2013a; Foo et al., 2018; González-Delgado and Hernández, 2018), positioning the local threshold below 7.1. The panel declined to advance this threshold into the final set of thresholds.

Demonstration of Threshold Application to Southern CCE Model Numerical Simulations
Application of the exposure metrics (duration, intensity, frequency, and severity of departure from the thresholds) and the thresholds for larval behavioral and adult physiological impairment to ROMS-BEC numerical simulations illustrates key points in applying echinoderm thresholds (Figure 4). First, these exposure metrics indicate that declining habitat suitability is more severe for benthic (adult) than for pelagic (larval) life stages. Conditions with pH below the adult physiology threshold persist for a maximum of 60-100 days (Figure 4A), with 30-50 such events (Figure 4B) occurring annually in the coastal regions at 0.15-0.25 pH units below the threshold (Figures 4C,D). By comparison, the percent time below the larval thresholds is shorter (5-15%; Figure 4E), and the events are less frequent (Figure 4F) and less intense (Figures 4G,H). Second, there is a consistent pattern of increasing magnitude, duration, and severity from offshore to onshore (Figure 4); on average, the greatest magnitude and duration of exposure to low pH conditions that could impair the physiological responses of adults or the behavioral responses of larvae occur in the CA coastal regions, where both life stages are most abundant and ecologically important. Thus, increased OA-related exposure can occur frequently and might carry risks related to cross-generational carryover effects through the exposure of adults.
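The exposure metrics named above can be computed from a daily pH time series at a single model grid cell. The sketch below is a minimal illustration under assumed definitions, since the text does not define the metrics precisely: an event is a run of consecutive days with pH below the threshold, duration is the length of the longest event, frequency is the number of events, intensity is the mean pH deficit on days below the threshold, and severity is the largest cumulative deficit within one event. The pH series and the threshold value of 7.6 are hypothetical and are not ROMS-BEC output.

```python
def exposure_metrics(ph_series, threshold):
    """Summarize exposure to pH below `threshold` in a daily time series."""
    if not ph_series:
        raise ValueError("ph_series must contain at least one value")

    events = []   # one list of daily deficits (threshold - pH) per event
    current = []
    for ph in ph_series:
        if ph < threshold:
            current.append(threshold - ph)
        elif current:
            events.append(current)   # pH recovered, so the event ends
            current = []
    if current:
        events.append(current)       # series ended while still below threshold

    days_below = sum(len(e) for e in events)
    total_deficit = sum(sum(e) for e in events)
    return {
        "n_events": len(events),
        "max_duration_days": max((len(e) for e in events), default=0),
        "percent_time_below": 100.0 * days_below / len(ph_series),
        "mean_intensity": total_deficit / days_below if days_below else 0.0,
        "max_severity": max((sum(e) for e in events), default=0.0),
    }


# Hypothetical ten-day pH record at one grid cell.
daily_ph = [7.8, 7.5, 7.4, 7.7, 7.9, 7.5, 7.8, 7.8, 7.3, 7.4]
metrics = exposure_metrics(daily_ph, threshold=7.6)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Running the function over every grid cell and depth layer of interest, with the threshold appropriate to each life stage, would yield maps comparable in kind to the panels of Figure 4.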

Synthesis of the Effects of OA on Echinoderms: Synthesis and Limitation
This synthesis, represented in the form of eight thresholds, shows with a high level of confidence that OA negatively affects a variety of biological processes (behavior, physiology, growth and development, survival) and represents a potential risk for the demography and survival of echinoderm populations. It demonstrates robust evidence from the primary literature of echinoderm sensitivity to OA. Compared with the pH thresholds for pteropods (Bednaršek et al., 2019), decapods (Bednaršek et al., in review), and bivalves (Barton et al., 2012; Gimenez et al., 2018), the pH thresholds across all echinoderm life stages are significantly lower.
Our conclusion of lesser pH sensitivity of echinoderms contrasts with the results of several reviews of invertebrate groups, which indicated that echinoderms are more sensitive to acidification (Byrne and Przeslawski, 2013; Kroeker et al., 2013; Wittmann and Pörtner, 2013; Przeslawski et al., 2015). However, it is consistent with Dupont et al. (2010), who found that echinoderms appear to be robust to future 2100 conditions (RCP 8.5). Five of the eight thresholds occur in the narrow pH range of 7.60-7.75, indicating that multiple pathways will be affected simultaneously and impact organismal fitness. The magnitude and duration of exposure required to trigger these effects were highly variable among endpoints and life stages; the duration that triggered larval and shallow-water adult thresholds (7 days) was comparable to that for pteropods (2-14 days, Bednaršek et al., 2019), but adult deep-water species (>200 m) required prolonged OA conditions (>30 days) to trigger an effect. Lower pH thresholds were associated with less sensitive stages and processes (like lethality) in adults, while higher pH thresholds are characteristic of the early life stages. This indicates higher susceptibility in the early life stages compared to adults and a potential bottleneck for species and population responses, in agreement with previous studies (Kroeker et al., 2013; Przeslawski et al., 2015). Work by Lee et al. (2019), published after our working group's deliberations, determined sensitivity thresholds of larval S. purpuratus based on midgut pH, metabolic rate, and expression of acid-base transporters, and suggested a physiological tipping point of pH = 7.20. Similarly, Hu et al. (2017) showed that pH regulation in the larval gut was strongly correlated with larval sensitivity to OA. The outcome for these larvae, however, appears to be strongly influenced by carryover effects that depend on the pH environment of their parents.
Overall, robust evidence from the primary literature and agreement on echinoderm biological responses across reduced pH levels in experimental conditions (Sato et al., 2018) lend strong support to the potential role of echinoderm thresholds as a useful tool for interpreting monitoring data and biogeochemical model simulations.
Echinoderm OA thresholds are habitat dependent, underscoring that processes of acclimatization and adaptation to local environmental conditions and multiple stressors can structure species sensitivity across the small-scale spatial, temporal, and vertical variability in regional OA hotspots. The adults of the single deep-water species can tolerate inherently lower pH conditions due to their exposure and adaptation to naturally low pH conditions and variability, ultimately resulting in lower OA thresholds (e.g., Calosi et al., 2013b; Dorey et al., 2013; Chakravarti et al., 2016; Vargas et al., 2017). In contrast, shallow-water species tend to live at higher pH levels, which are reflected in their higher pH threshold values. These processes can influence larval sensitivity through carryover effects that depend on the pH environment of the parents. For instance, S. purpuratus larvae generated from parents living in the CCE upwelling zone are more resilient to low pH than larvae from parents living in higher ambient pH conditions (Yu et al., 2011; Kelly et al., 2013). How the greater resilience of CCE echinoid larvae to OA applies to other regions and taxonomic groups is not known. For instance, the larvae of the North Atlantic ophiuroid Ophiothrix fragilis are extremely sensitive to OA (Dupont et al., 2008), demonstrating that cross-life-stage impacts might appear sooner than predicted from a single pathway alone. This emphasizes that echinoderm OA thresholds depend on habitat OA exposure and need to be considered on the regional scale.

Threshold Application to Observations or Numerical Modeling Simulations
Application of thresholds to numerical model simulations and observations can visualize the magnitude, duration, and severity of OA exposure (Hauri et al., 2013) and the potential for biological effects, a critical step for marine resource and water quality management (Bednaršek et al., 2020). However, the management applications of these investments cannot be fully realized without practical guidance on how to apply them consistently (Weisberg et al., 2016). Such guidance should take into account all relevant demographic parameters, including habitat-specific life history, appropriate horizontal and vertical aggregation, duration of exposure, and adaptation traits. Here, we demonstrated the application of echinoderm thresholds to biogeochemical model output based on insights from current knowledge of echinoderm ecology. The model output shows that larval echinoderms along the California coast are already being exposed to pH conditions sufficient to cause negative sub-lethal responses (Figure 4). Late autumn and winter conditions indicate relaxation of low pH conditions for pelagic stages, but exposure to low pH in the benthic habitats for juveniles and adults. Based on the determined thresholds, prolonged benthic adult exposure to low pH can lead to reduced fitness and survival (Morley et al., 2009; Steckbauer et al., 2015); however, this could be offset by a species' plasticity or adaptation capacity. This is particularly the case in the CCE, with its known mosaic pattern in upwelling and pH variability (Chan et al., 2017) that can differentially structure the vulnerability of spatially distant adult populations. As such, information on the adaptation scope should be included in the model input when available.
Finally, one issue in accurate model interpretation of larval OA exposure is the lack of understanding of larval vertical distribution, which is considered a "black box" of echinoderm ecology (Chan et al., 2018) yet is paramount to understanding their temporal and spatial distribution with respect to particular life stages and critical habitats of impact.

Limits of Synthesis and Priority Research Recommendations
On several occasions, experts did not support assigning thresholds to a specific process, or thresholds had high uncertainty because they were based on a single study. These instances were due to insufficient or poorly curated data (which were removed from consideration); disparity in response measures or equivocal responses with a poor signal-to-noise ratio; data gaps that did not allow for extrapolation across spatial scales or life stages; insufficient evidence (shallow and deep-water adult growth and mortality, and shallow-water behavior); an insufficient number of experimental treatments (more than three are needed); and studies that did not realistically capture the in situ pH variability that defines species exposure in the field (e.g., Frieder et al., 2014; Kapsenberg et al., 2018; Hoshijima and Hofmann, 2019; Chan and Tong, 2020).
Beyond the expert scoring of confidence based on amount and agreement of evidence, other sources of uncertainty exist in the echinoderm pH thresholds. First, information was pooled across species from different localities in order to increase the applicability of thresholds across taxa, as well as to augment the statistical power and confidence for several endpoints. An important caveat to pooling is that it assumes that different locations can be treated comparably, despite potentially reducing the signal related to OA pre-exposure history. Second, the experimental data used in this synthesis measured effects under chronic, static pH conditions, which do not realistically capture the natural variability that species experience in the field, including press events (long-term perturbations), pulse events (extreme events, discrete variability; e.g., Frieder et al., 2014; Kapsenberg et al., 2018; Hoshijima and Hofmann, 2019; Chan and Tong, 2020), fluctuating conditions, or a "relaxation period" to test the duration of exposure needed for organisms to recover. Third, and potentially the largest knowledge gap, is the need to consider OA together with other stressors that can act as modulators of OA thresholds. Sufficient evidence exists to show that combinations of multiple stressors can change the magnitude and direction of the effects of single stressors (Christensen et al., 2011; Wood et al., 2011; Guscelli et al., 2019). However, there is still a considerable lack of experimental data that include treatments of OA with variable temperature and dissolved oxygen (DO), which precluded consideration of how to incorporate multiple stressors into the OA thresholds. Currently, only five studies have been identified that examine dual-stressor effects using the same species and response measures. Furthermore, no studies exist examining the effects of pH and low dissolved oxygen on the physiology of the echinoderm species found in the CCE.
Additional field and experimental data can refine these thresholds over time, particularly if studies are conducted in a way that allows thresholds to be extracted or identified rather than simply documenting adverse OA effects. These kinds of integrated analyses provide an opportunity for identifying information gaps; here we identified five areas where future research investments would substantially improve our ability to define OA thresholds. To overcome the limitations of synthesizing data from experiments that were not intended for threshold extraction, experimental designs should increase the number of treatments and scale the range of OA stress appropriately in order to optimize threshold derivation (see for example Christen et al., 2013; Dorey et al., 2013). The application of thresholds can also be improved if a critical understanding of the relationship between exposure duration and recovery time is acquired. Experimental evidence also needs to be validated in the field, supported by the co-location of biological and chemical monitoring and the development of geochemical proxies. Studies focusing on the role of local acclimatization and adaptation vs. short-term plasticity are needed, such as latitudinal experiments with genetically different populations to track genetic changes and variation in time and space, as well as laboratory breeding and natural selection experiments. Ultimately, for more accurate future projections that incorporate ecological complexities, studies with multiple stressors and trophic levels, supported by demographic and ecosystem models, are urgently needed.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available upon request, without undue reservation.

AUTHOR CONTRIBUTIONS
NB designed and led the study and led the manuscript writing with input from all the co-authors. MR and NB conducted the literature review. MR conducted the threshold analyses, which were revised by NB and the expert panel. FK conducted the analyses of numerical model simulations. NB, RA, MB, SD, PC, KC, RF, JS, and JP-G served as expert panelists. MS and SW facilitated the expert panel process. All authors contributed to the article and approved the submitted version.

FUNDING
This research was supported by the California Ocean Protection Council, grant number C0302500. This is PMEL contribution number 5124.