Perspectives on in situ Sensors for Ocean Acidification Research

As ocean acidification (OA) sensor technology develops and improves, in situ deployment of such sensors is becoming more widespread. However, the scientific value of these data depends on the development and application of best practices for calibration, validation, and quality assurance as well as on further development and optimization of the measurement technologies themselves. Here, we summarize the results of a 2-day workshop on OA sensor best practices held in February 2018, in Victoria, British Columbia, Canada, drawing on the collective experience and perspectives of the participants. The workshop on in situ Sensors for OA Research was organized around three basic questions: 1) What are the factors limiting the precision, accuracy and reliability of sensor data? 2) What can we do to facilitate the quality assurance/quality control (QA/QC) process and optimize the utility of these data? and 3) What sort of data or metadata are needed for these data to be most useful to future users? A synthesis of the discussion of these questions among workshop participants and conclusions drawn is presented in this paper.


INTRODUCTION
Ocean acidification (OA) studies increasingly take advantage of autonomous sensors for carbonate system parameters, especially CO2 partial pressure (pCO2) and pH. Typically, high-quality measurements of ocean carbonate chemistry [pCO2, pH, dissolved inorganic carbon (DIC), and total alkalinity (TA)] are made on discrete water samples collected from research vessels, but sample collection and laboratory-based analysis are laborious, expensive, and slow, and provide low temporal and spatial resolution. Underway pCO2 measurements (for surface water) represent a relatively mature technology, with millions of such data now available (Bakker et al., 2016), the ability to make repeated multi-point calibrations, published intercomparisons, and many widely adopted best practices (e.g., Körtzinger et al., 2000; Pierrot et al., 2009). Note, however, that underway instruments have historically offered greater opportunity for maintenance and calibration than is possible in a moored system. Here, we focus on the recent generation of compact, autonomous sensors (see Table 1 for examples), which are designed to offer the potential for collecting high-frequency data over long-term in situ deployments in a range of marine and estuarine environments. However, this in situ sensor technology is new, and the quality of the data is still uneven. Unlike underway systems, in situ measurements cannot accommodate frequent calibration using certified gases or standards and rely on pre- and post-deployment calibration and opportunistic collection of water samples. Although recommendations do exist, consensus regarding best practices for data evaluation and treatment is still evolving. This paper is an effort to identify what can be done to optimize the utility of OA sensor data, independent of (or complementary to) further development and optimization of the measurement technologies themselves.
We summarize the results of a 2-day workshop on OA sensor best practices, which was organized around three major questions: (1) What are the factors limiting the precision, accuracy and reliability of sensor data? (2) What can we do to facilitate the quality assurance/quality control (QA/QC) process and optimize the utility of these data? (3) What sort of data or metadata are needed for these data to be most useful to future users?
The workshop was held February 7th and 8th, 2018, in Victoria, British Columbia, Canada. The following discussion draws on the collective experience and perspectives of the participants to assess the current state of the art.

WHAT ARE THE FACTORS LIMITING THE PRECISION, ACCURACY AND RELIABILITY OF SENSOR DATA?
Acquiring routine confidence in a new oceanographic sensor can take up to 20 years of development and testing. Technologies for measuring salinity and oxygen, for example, are relatively mature. Carbonate system sensors are at a much earlier phase of development, but the technical problems will probably not prove to be inherently more intractable. To achieve a level of confidence suitable for routine deployment requires establishing the precision of the sensor and recognizing, mitigating, and correcting for accuracy issues such as sensor drift and biofouling. When a sensor technology is still new, as is currently the case with carbonate system sensors, the cumulative collective experience of the research community is required to resolve these questions about sensor performance.
The precision and accuracy required of sensor data depend on the intended application (Newton et al., 2015); e.g., the data quality requirements for confident detection of long-term trends associated with "climate" signals are more stringent than those for studying shorter time scale variations (termed "weather" signals). Newton et al. (2015) estimated that the required uncertainties in CO2 system data for "weather" versus "climate" monitoring are 0.02 versus 0.003 for pH, 10 versus 2 µmol kg−1 for TA, and 2.5% versus 0.5% for pCO2, respectively. Sensor evaluation studies must include consideration of such application-dependent uncertainty requirements (e.g., Atamanchuk et al., 2014; Okazaki et al., 2017).
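As an illustration only, the uncertainty targets quoted above can be encoded as a small lookup that reports the most stringent application a given measurement uncertainty would satisfy. This is a minimal Python sketch and not part of the workshop output: the function name `data_quality_tier`, the dictionary keys, and the label "below weather" are hypothetical names introduced here, and the comparison assumes the sensor uncertainty is expressed in the same units as the target (pH units, µmol kg−1 for TA, percent for pCO2).

```python
# Uncertainty targets as quoted in the text from Newton et al. (2015).
# Keys and structure are hypothetical; only the numbers come from the text.
# pH: pH units; TA: umol/kg; pCO2_percent: relative uncertainty in percent.
TARGETS = {
    "pH": {"weather": 0.02, "climate": 0.003},
    "TA": {"weather": 10.0, "climate": 2.0},
    "pCO2_percent": {"weather": 2.5, "climate": 0.5},
}


def data_quality_tier(parameter: str, uncertainty: float) -> str:
    """Return the most stringent tier whose target the uncertainty meets."""
    if uncertainty < 0:
        raise ValueError("uncertainty must be non-negative")
    limits = TARGETS[parameter]  # raises KeyError for unknown parameters
    if uncertainty <= limits["climate"]:
        return "climate"
    if uncertainty <= limits["weather"]:
        return "weather"
    return "below weather"
```

For example, under these assumptions a pH sensor with an estimated uncertainty of 0.01 would be classed as suitable for "weather" applications but not for "climate" trend detection.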
Most instruments have been developed and tested under conditions of moderate (e.g., 10-25 °C) temperature and relatively constant salinity, typical of temperate oceanic waters, possibly limiting deployments in more extreme environments. In high-latitude waters, for example, water temperatures are lower and salinities are more variable than the conditions under which the instruments have been tested. For example, pCO2 sensors in an experimental chamber simulating Arctic Ocean conditions showed poorer correspondence between sensor and discrete pCO2 measurements with larger salinity changes associated with sea-ice growth (König, 2017). However, long term field deployments in an Arctic coastal landfast sea-ice environment show promising results if adequate measures are taken to resolve drift and offset throughout deployment (Duke, 2019). Another substantial knowledge gap exists for nearshore and estuarine waters, where salinity and chemical composition may vary rapidly over large ranges. Ideally, sensors should be deployed in a variety of novel conditions and their responses carefully documented. Note, too, that each individual instrument is unique, and its particular characteristics need to be documented (see sections "What Sort of Data or Metadata are Needed for These Data to be Most Useful to Future Users?" and "Summary"). Furthermore, accuracy (and comparability to discrete samples) may be impacted by sensor-specific nuances associated with deployment such as power, connectivity, instrument orientation, time to equilibrium, and potential interference from colocated sensors.
Assessment of in situ sensor performance should incorporate sensor redundancy and include pre- and post-deployment laboratory calibrations where possible (e.g., Bresnahan et al., 2014; Miller et al., 2018). However, this approach requires that instruments are recovered in a condition suitable for such post-deployment calibrations to be meaningful, and assumes that drift during deployment is predictable. For example, Argo floats are not typically retrieved at the end of deployment, requiring benchmarking at regular intervals. Dissolved oxygen sensors are "benchmarked" relative to atmospheric oxygen concentrations when floats surface (Bittig et al., 2018). Referencing sensors against a reference water mass or climatology assumed to be stable is also a common practice (e.g., Gonski et al., 2018; Wolf et al., 2018). A particular difficulty with in situ referencing for carbonate system sensors is the non-stationarity of the reference state; ocean DIC concentrations are increasing over time, and often years have elapsed between collection of float data and the last available shipboard reference profile (Johnson et al., 2017). In addition, discrete calibration samples need to be collected at deployment and recovery (if the sensor is recovered); in special cases, additional mid-deployment calibration samples can be collected by passing ships or locally based scientists. However, field calibrations, whether based on reference water masses or discrete samples, are often single-point calibrations, which could lead to bias if conditions are highly variable (as in coastal or high-latitude waters).
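To make the "predictable drift" assumption concrete, the following Python sketch applies one of the simplest possible corrections: the sensor-minus-discrete offset is measured at deployment and at recovery, and is assumed to change linearly in time between the two. This is an illustrative assumption, not a method prescribed by the workshop or by the cited studies; real drift may be non-linear or episodic (e.g., biofouling), and the function and argument names are hypothetical.

```python
def linear_drift_correction(times, sensor_values,
                            t_deploy, offset_deploy,
                            t_recover, offset_recover):
    """Correct sensor values for an offset that drifts linearly in time.

    Offsets are defined as (sensor reading - discrete reference value)
    at deployment and recovery. Times may be in any consistent unit
    (e.g., days since deployment). Linear drift is an assumption.
    """
    span = t_recover - t_deploy
    if span <= 0:
        raise ValueError("t_recover must be later than t_deploy")
    corrected = []
    for t, value in zip(times, sensor_values):
        # Fraction of the deployment elapsed at time t.
        frac = (t - t_deploy) / span
        # Offset interpolated between the two calibration points.
        offset = offset_deploy + frac * (offset_recover - offset_deploy)
        corrected.append(value - offset)
    return corrected
```

Because this uses only two calibration points, it inherits the limitation noted above for sparse field calibrations: any mid-deployment discrete samples are better used to test the linear assumption than ignored.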
Another significant issue is human resources. Collection of discrete samples for comparison to sensor output may represent the ideal benchmark; however, collection of samples for validation purposes requires specialized training and collection is not trivial, especially when considering moored or mobile platforms in extreme or remote locations. Use of mercuric chloride (HgCl2) has been identified as a potential barrier to "citizen science" (sample collection by local people). It was also noted that unlike samples for DIC, pCO2, and TA, discrete pH samples cannot be stored for later analysis on land (Dickson et al., 2007). Use of carefully validated redundant reference sensors may provide a complementary strategy for benchmarking/validating newer "off the shelf" sensors.
The role of proprietary technology in ocean observing systems also needs to be carefully considered. Knowing exactly how a sensor works is obviously helpful to developing effective quality control protocols and accurately interpreting the resulting data. However, operators can and should work to understand the "black box" response (known inputs, observed outputs) even without fully knowing the internal workings. Most manufacturers are quite small companies, and their financial viability is important to the overall enterprise of building sustainable observing systems. Agencies funding construction of such observing systems should take account of the specific needs of small commercial enterprises, including the need to maintain proprietary control over some technology. Partnerships between academic and public sector scientists and sensor manufacturers should be encouraged, including the establishment of dedicated funding streams.

WHAT CAN WE DO TO FACILITATE THE QA/QC PROCESS AND OPTIMIZE THE UTILITY OF THESE DATA?
Best practices and QA/QC protocols for mature oceanographic sensor measurements (e.g., salinity and temperature) are established and have been broadly adopted for real-time applications and mobile platforms such as gliders and floats (e.g., IOOS, 2014, 2016). Consensus among the workshop participants was that QA/QC best practices for carbonate system sensors still require further development to meet the same standards. Continued efforts to characterize sensor performance and improve our understanding of their responses across the full range of environmental conditions for different marine environments are necessary. Nevertheless, we are in a good position to define the current state of best practices for currently available and anticipated carbonate system sensors and identify a way forward. The OceanBestPractices repository is archiving best practices submitted by individual authors and groups with an objective of developing community best practices. The current state of available QA/QC information for sensors lags behind that for, e.g., shipboard equilibrators or Argo O2 data. SOCAT has produced an online "Quality Control Cookbook" (Olsen et al., 2017) which provides a clear statement of the criteria used for assigning data submission flags in the context of existing World Ocean Circulation Experiment QA/QC protocols. Wanninkhof et al. (2013) have provided further guidance on incorporating in situ sensors into the SOCAT QC framework, including details such as degree of in situ calibration necessary for desired confidence and levels of accuracy. A number of published papers address practical suggestions and demonstrations for Durafet® pH sensor QC (Bresnahan et al., 2014; Johnson et al., 2016; Rérolle et al., 2016; McLaughlin et al., 2017; Gonski et al., 2018; Miller et al., 2018).
Similarly, there is considerable ongoing research into complications with in situ spectrophotometric pH analyzers, especially those resulting from indicator dye impurities and the moving parts within these systems (e.g., Liu et al., 2011; Lai et al., 2018). There is very little comparable literature on sensors for pCO2, DIC or TA, and much information about sensors' limitations in particular environments or deployment configurations still circulates largely by word of mouth.
Another useful example of QA/QC treatment for carbonate system sensors is the "FixO3 (The Fixed point Open Ocean Observatories) Handbook of Best Practices" available at several websites, including that of the International Ocean Carbon Coordination Project. The FixO3 Handbook details best practices for deployment, calibration, and quality control recommendations developed by the International OceanSITES initiative. Ultimately, however, the FixO3 Handbook acknowledges that quality control procedures for carbonate system sensors are not ready to be adopted as "best practices" and further work is required. The Essential Ocean Variable (EOV): Carbonate System description by the Global Ocean Observing System (GOOS) also reflects this conclusion with its assessment of the "Readiness" level of some OA sensor types described here.
Consensus among the workshop participants was that at this stage a broad community discussion is necessary. Vehicles for a clearer definition of best practices could take the form of a formal working group and/or a web-based discussion forum. A web-based forum in particular could serve as a central repository for information on current practices, as is now being hosted by the Intergovernmental Oceanographic Commission (IOC) through the OceanBestPractices repository.

WHAT SORT OF DATA OR METADATA ARE NEEDED FOR THESE DATA TO BE MOST USEFUL TO FUTURE USERS?
As with QA/QC best practices, data archiving and metadata requirements are not standardized and vary with individual manufacturers' output streams. This lack of standardization may be partly because some metadata are sensor-specific, and the underlying algorithms are proprietary. Although manufacturers are often disinclined to share all the operational details of their sensors, they must allow for easy download from the sensor of all potentially useful diagnostics. Ultimately, metadata requirements should be based on the needs of the scientific community.
Current resources for compiling metadata and suggested metadata standards for OA sensors are limited and scattered. Best practices and metadata standards exist for shipboard underway and discrete CO2 system measurements (e.g., Dickson et al., 2007) but have not yet been collectively agreed upon for sensors. Models and recommendations for sensor metadata have been developed for SOCAT (Pfeil et al., 2013) and the Global Ocean Acidification Observing Network (GOA-ON), and can serve as a starting point for more universal standards. Another resource is the National Centers for Environmental Information Ocean Carbon Data System (OCADS) data submission portal, which provides a template for carbonate system metadata requirements. The OCADS template accommodates DIC, TA, pH, and pCO2 discrete sample data and provides a relatively exhaustive list of input fields for metadata. Additional resources include the progressive development of metadata and data processing practices by BGC-Argo (http://www.argodatamgt.org/Documentation) and formal recommendations for pH sensor (ISFET) data (Johnson et al., 2018).
Data archiving standards for carbonate system sensors require discussion of metadata information requirements. However, the lack of standardized metadata requirements and the continuous development of sensor technology and best practices demand a broad, inclusive approach. There was consensus among workshop participants that entire data streams should be archived. Archiving of ancillary data (e.g., salinity) is needed for retrospective analyses and validation of measurements made throughout the sensor development cycle. Moreover, as discussed earlier, calibration/validation protocols vary among groups and by logistical constraints particular to the mode of deployment. Thus, metadata should include all available information about calibration procedures and data streams for co-located sensors.
Given that the final output (e.g., pH) from the sensor will in many cases be reprocessed in the future, all raw (voltage) data are required as well as detailed information about in situ sensor deployment configurations (e.g., continuous voltage measurement until voltages stabilize vs. fixed measurement period) and how the final values (in non-proprietary formats) were derived. There is an analogy to ocean color data: space agencies archive raw (Level 0) data and most users work with Level 2 or 3 data (Level 1 is individual scenes before atmospheric correction). Most ocean color data products go through multiple reprocessings, and it is now standard practice to specify the reprocessing number in scientific publications, in the interest of traceability and reproducibility. We expect ocean carbonate chemistry sensors to go through a similar process if they are to play a useful role in documenting temporal changes in ocean chemistry (see Gemmrich et al., 2011, for an example of misinterpretation of sensor data by investigators who did not properly consider compatibility of data collected on successive deployments). Furthermore, data reprocessing is likely to result in new scientific insights that will help guide future development of sensor technology and data interpretation. There was general agreement that estimates of pCO2, pH, etc. generated by sensors should be considered data products and thus require version numbers. Software versions used for data collection and processing should also be saved in repositories, as well as information regarding discrete sample measurements taken. Most information about limitations of specific sensors under specific conditions is passed along by word of mouth. Such limitations can include not only mechanical failures and electrical responses, but also software issues (e.g., algorithms that are only intended to work within a certain range of temperature and salinity).
This reinforces the need for raw data to be available for subsequent reprocessing and for the formal publication of best practices.
Frontiers in Marine Science | www.frontiersin.org
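The archiving recommendations in this section (raw data, deployment configuration, calibration information, co-located data streams, software versions, discrete samples, and a product version number) can be pictured as a single versioned record stored alongside each data product. The Python sketch below is illustrative only; it does not reproduce the OCADS, SOCAT, GOA-ON, or BGC-Argo templates, and every field name and example value is a hypothetical placeholder.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class SensorDataProduct:
    """Hypothetical metadata record for one versioned sensor data product."""
    parameter: str                    # e.g., "pH" or "pCO2"
    product_version: str              # version number of the derived product
    processing_software_version: str  # software used to derive final values
    raw_data_file: str                # archived raw (e.g., voltage) data
    measurement_mode: str             # e.g., fixed period vs. until stable
    calibration_notes: str            # pre-/post-deployment procedures used
    ancillary_streams: List[str] = field(default_factory=list)  # co-located data
    discrete_samples: List[str] = field(default_factory=list)   # validation samples


def to_json(record: SensorDataProduct) -> str:
    """Serialize the record to a non-proprietary, human-readable format."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Example usage with placeholder values only.
example = SensorDataProduct(
    parameter="pH",
    product_version="2",
    processing_software_version="0.1.0",
    raw_data_file="raw_voltages.csv",
    measurement_mode="fixed measurement period",
    calibration_notes="pre- and post-deployment laboratory calibration",
    ancillary_streams=["salinity", "temperature"],
    discrete_samples=["deployment", "recovery"],
)
```

Serializing to plain JSON is one way to keep the record in a non-proprietary format; reprocessing would create a new record with an incremented `product_version` that points at the same raw data file.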

SUMMARY
Carbonate system sensors (for measuring pH or pCO2) are becoming more widely available through multiple manufacturers. Use of these sensors is attractive because they provide high temporal resolution data that allow us to better understand the processes controlling carbonate system dynamics in the marine environment, particularly when deployed alongside more mature autonomous oceanographic sensors (e.g., CTD, dissolved oxygen, and chlorophyll fluorescence sensors). The current global development of ocean observatories provides additional platforms for these sensors, with opportunities for long deployments. However, recognition that these sensors are not mature demands continued technological and methodological development. Uniform QA/QC protocols and metadata requirements for these sensors are also lacking and require a community effort to consolidate and disseminate a set of formal "best practices." Ultimately, the goal of collecting sensor data is scientific evaluation of changing ocean chemistry, to understand the carbonate system, and not simply to monitor it. Collecting high-quality discrete samples (preferably of at least 2, and better 3, CO2 system parameters) even at only a few time points during the deployment substantially increases the value of sensor data. Empirical algorithms that estimate carbonate system parameters from other, more frequently measured, parameters (e.g., Fassbender et al., 2017) may also be useful, as long as their error distributions are understood. It is also important that methodologies used in validation measurements be carefully documented and included in the metadata. There was general consensus on the need for archiving as broad a set of metadata as possible, along with ancillary data such as salinity, and that persistent identifiers should provide for unambiguous citation and linking to assigned data product version numbers.