Divers as Citizen Scientists: Response Time, Accuracy and Precision of Water Temperature Measurement Using Dive Computers

There is a lack of depth-resolved temperature data, especially in coastal areas, which are commonly dived by SCUBA divers. Many case studies have demonstrated that citizen science can provide high quality data, although users require more confidence in the accuracy of these data. This study examined the response time, accuracy and precision of water temperature measurement in 28 dive computers plus three underwater cameras, from 12 models. A total of 239 temperature response times (τ) were collected from 29 devices over 11 chamber dives. Mean τ by device ranged from (17 ± 6) to (341 ± 69) s, with significant between-model differences found for τ across all models. Clear differences were found in τ by pressure sensor location and material, but not by size. Two models had τ comparable to designed-for-purpose aquatic temperature loggers. A total of 337 mean data points were collected from equilibrated temperatures in hyperbaric chamber (n = 185) and sea (n = 152) dives, and compared with baseline mean temperature from Castaway CTDs over the same time period. Mean bias, defined as mean device temperature minus baseline temperature, ranged by model from (0.0 ± 0.5) to (−1.4 ± 2.1) °C and by device from (0.0 ± 0.6) to (−3.4 ± 1.0) °C. Nine of the twelve models were found to have “good” accuracy (≤0.5 °C) overall. Irrespective of model, the overall mean bias of (−0.2 ± 1.1) °C is comparable with existing commonly used coastal temperature data sets, and within global ocean observing system accuracy requirements for in situ temperature. Our research shows that the quality of temperature data in dive computers could be improved, but, with collection of appropriate metadata to allow assessment of data quality, some models of dive computers have a role in future oceanographic monitoring.


INTRODUCTION
The oceans have a critical role in climate change, acting as a heat sink and being responsible for the uptake of more than 90% of the excess heat in our climate system between 1971 and 2010 (Pörtner et al., 2019; Johnson and Lyman, 2020). Warming ocean temperatures are intrinsically linked to sea level rise and projections show the rise accelerating because of non-linear thermal expansion (Widlansky et al., 2020). In addition, the number and severity of occurrences of extreme events linked to increased sea temperatures, such as heat waves, are expected to increase with global warming (Bindoff et al., 2019). Global sea surface temperature (SST) is projected to rise by up to 6.4 °C depending on the emission scenario (Aral and Guan, 2016); accordingly, both sea surface and subsurface temperatures are defined as essential climate variables (Bojinski and Richter, 2010; Lindstrom et al., 2012). However, there is regional variability (Kennedy, 2014); for example, SST around the British Isles has been increasing at a rate of up to six times the global average rate (Dye et al., 2013) and at twice the global rate in offshore China since 2011 (Tang et al., 2020). In contrast, parts of the North Atlantic have experienced cooling (Wright et al., 2016). Shifts in biodiversity have been seen in response to variations in temperature between 0.1 and 0.4 °C (Danovaro et al., 2020), with shallow seasonal thermoclines being important to ecosystem dynamics, horizontal and vertical distribution of fish (Aspillaga et al., 2017) and biological production (Palacios et al., 2004). Variation and oscillations in thermocline depth and temperature have been recorded during the stratification period (Bensoussan et al., 2010; Aspillaga et al., 2017).
In situ data are essential to monitor these local variations, supplement satellite sea surface temperature data and validate ocean models (Brewin et al., 2017), but there is a lack of depth-resolved temperature data (Wright et al., 2016) and few time series on localised variations in thermoclines (Bensoussan et al., 2010). This lack of data is especially acute in areas near the coast, which research vessels and Argo floats cannot commonly reach (Wright et al., 2016). Citizen science has been shown to provide opportunities for collecting data at broad spatial and temporal scales, which would not be possible by traditional means because of accessibility and financial constraints (Pocock et al., 2014; Wright et al., 2016). Many case studies have shown that citizen science can provide high quality data (Kosmala et al., 2016) with comparable accuracy to dedicated research studies (Vianna et al., 2014; Albus et al., 2019; Krabbenhoft and Kashian, 2020), but uncertainty remains regarding the reliability and quality of data (Burgess et al., 2016; Gibson et al., 2019). To address these concerns, and to increase the value of existing datasets, users require more confidence in the accuracy of these data (Burgess et al., 2016; Kosmala et al., 2016). In situ measurements should have associated uncertainty estimates (Barker et al., 2015). Post hoc data quality assessment and error detection have been found to dispel doubts about data quality (Burgess et al., 2016).
SCUBA divers (from here on referred to as divers) have been involved in many marine citizen science projects (Thiel et al., 2014; Hermoso et al., 2019) including marine protected area monitoring (Pocock et al., 2014), reef habitat/biodiversity surveys (Branchini et al., 2015; Hermoso et al., 2019) and marine debris collection (Pasternak et al., 2019). Some of the areas most frequently accessed by citizen scientist divers are the shallow coastal subtidal areas (e.g., to depths < 40 m; Thiel et al., 2014) where reliable physico-chemical data series are sparse. Among the estimated 6-10 million recreational divers globally (Wright et al., 2016) the use of dive computers may be approaching 100% (Azzopardi and Sayer, 2010). Dive computers are worn with the primary purpose of managing decompression limits via algorithms which calculate the level of nitrogen load in tissues. Most modern dive computers record profiles of temperature and depth, with the latter derived from a dedicated pressure sensor. Temperature data are required to correct for non-linear pressure sensor output as ambient temperature changes (Li et al., 2016), but as temperature does not have the same impact on decompression algorithms as pressure, the same level of accuracy is not required. Consequently, temperature data are obtained from thermal corrections applied to the pressure sensor (Azzopardi and Sayer, 2010; Wright et al., 2016), rather than from a dedicated temperature sensor. Temperature readings are not calibrated, and only have an advertised accuracy (where published by manufacturers) of ± 2 °C (Mares, 2020; Azzopardi and Sayer, 2012), or ± 2 °C within 20 min of temperature change (Suunto, 2018). Previous research has explored the possibility of collecting temperature data from dive computers. Wright et al. (2016) concluded that, with processing, temperature data from dive computers could be a useful resource. Other authors recommend that these data be avoided for scientific study (Azzopardi and Sayer, 2012), or state that dive computers do not have sufficient accuracy to measure ocean temperature changes (Egi et al., 2018).
This study builds on the work carried out by Wright et al. (2016) and investigates a range of dive computers in replicated experiments which aim to mimic real-world scenarios, to quantify the temperature responses of different models and so address some of the concerns regarding the potential use of these data. We focus on three objective measures: the time constant τ, accuracy and precision. Time constants are used to measure a sensor's response to change, representing the time taken for 63% of the total step change in temperature to have taken place. τ is useful in the context of oceanographic temperature change (such as thermocline identification) and, in conjunction with the sample rate, indicates the potential to gather useful data from relatively short dive profiles. Temperature accuracy is defined as the systematic error in the devices' temperature measurement when compared with a reference temperature, such as from a calibrated microCTD. By focusing on these measures, this paper investigates the potential of different devices as alternative sources of in situ temperature for oceanographic monitoring. The response to temperature change within and between models and as a function of the dive computer's body material, size, pressure sensor location and attachment to the diver (i.e., worn on the wrist or hanging freely) is analysed to ascertain whether some models or features may offer potential for higher quality data than others.
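The 63% figure follows from a first-order (single-exponential) step response. That form is assumed here purely for illustration, and the symbols T_0 (starting temperature) and T_∞ (final equilibrated temperature) are introduced for this sketch only:

```latex
T(t) = T_\infty + (T_0 - T_\infty)\, e^{-t/\tau},
\qquad
\frac{T(\tau) - T_0}{T_\infty - T_0} = 1 - e^{-1} \approx 0.632 .
```

Under the same assumption, about 95% of the step is complete after 3τ, since 1 − e^{−3} ≈ 0.950.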

MATERIALS AND METHODS
Equipment
Twenty-eight dive computers (11 models from 7 brands), along with three Paralenz Dive Camera+ cameras (for the purposes of this study referred to collectively as "dive computers"; see Table 1), were analysed. All devices shared the ability to record full profiles of temperature and depth as a function of time, except Suunto Vypers, which only store a single minimum temperature per dive. All devices were able to sample at intervals of 30 s or less and were set to the highest frequency possible for each model for all dives.
Recorded temperature resolution ranged from 0.1 to 1 °C. The devices were categorised into four "sizes": "Small" (diameter < 5 cm), "Medium" (5 cm < diameter < 7.5 cm), "Large" (diameter > 7.5 cm), and "Camera", and further classified by pressure sensor location, based on the identifying small holes in the housing material, into "Back" or "Edge", with Paralenzes being defined as "Covered" (Table 1). Material was a composite category based on front, edge and back material being metal (m) or plastic (p).
All hyperbaric tests were carried out in a cylindrical two-compartment, 2,000 mm diameter Divex therapeutic recompression chamber, manually controlled to compress to the simulated nominal depths, as described by Sayer et al. (2014). For all baseline temperature measurements with the exception of water bath trials, three SonTek CastAway CTDs (CTD = Conductivity, Temperature, Depth) with 0.01 °C resolution, ± 0.05 °C accuracy and a sampling rate of 5 Hz (SonTek CastAway CTD, 2020) were used. For unpressurised temperature comparison a Grant R4 refrigerated bath with TXF200 heating circulator was used.

Time Constants (τ)
Inside the hyperbaric chamber, all devices and Castaways were immersed to (8.5 ± 2.5) cm in a tub containing 13 litres of cold (10 ± 1) °C fresh water and allowed to acclimatise for 10 min, as high ambient air temperature has been demonstrated to affect temperature profiles for several minutes into a dive. Three further tubs were filled with well-mixed warm water between 18 and 25 °C. Although the chamber was fitted with an environmental control unit, it was not possible to regulate chamber air temperature precisely; it varied between 18 and 27 °C over the course of a single dive of 1 h duration, because of the heating effect of compression and subsequent cooling across the non-insulated chamber walls. To minimise the impact of the changing chamber temperature on tub temperature, the starting temperatures of the warm tubs approximated the mid-point of potential chamber ascent temperatures (as measured with a stick digital thermometer).
Some models allow manual switching between salt and freshwater mode (densities unspecified by manufacturers), but to allow comparison between dive computers which did not have this capability, all dive computers were left in default salt-water mode for all dives with the exception of the Shearwater Perdix, which was set to "EN13319" mode (1,020 kg m⁻³ water density) (Shearwater, 2020). All devices were allowed to automatically start recording temperature profiles according to their default pressure parameters, except for Paralenz Dive Camera+, which were started manually.
After acclimatisation, all tubs were compressed to a maximum simulated depth of between 9 and 10.4 m. Once the simulated depth was reached, one Castaway was moved from the cold bucket to each of the warm tubs and stirred well, followed by a further 2 min of acclimatisation. One Paralenz Dive Camera+ was then moved into each warm tub and stirred well. Early trials established that all devices reached temperature equilibrium before 7 min. Therefore, after 7 min all Paralenz Dive Camera+ were removed and switched off to conserve battery life. Subsequently, a dive computer was moved into each of the warm tubs, stirred well, then left for 7 min, repeated until all the devices had been transferred. This interval approach was designed to minimise any effect of cold-water ingress by the transfer of devices between tubs, without impacting the temperature response of previously added devices. Two dives were carried out with the same depth/tub protocol using only the three Paralenz Dive Camera+ devices, and nine replicates with all devices (Schema in Supplementary Figure 1).
Dive profiles were downloaded from individual devices into the open-source divelog software, Subsurface (Subsurface, 2020). Profiles were then combined in an XML-based format and exported into R Studio for processing. For each dive by device, data were aligned to the start point of the response curve and sliced at the first instance of the maximum temperature, isolating the full temperature response (Figure 1). In contrast to the findings of Egi et al. (2018), not all models' temperature response had a single exponential form, and linear regression did not consistently produce a good fit. Time constants were ascertained by exponential fitting via numerical integration as defined by Jacquelin (2009), using the area under the curve to calculate τ, which allows linear regression to be applied to non-linear data without estimation of parameters.
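The integration-based fit can be sketched as follows. This is a minimal Python illustration of a Jacquelin-style regression for a response of the form y = a + b·exp(c·t), not the R code used in the study; the function name `fit_time_constant` and the synthetic test values are assumptions made for the example. Integrating the model gives y − y₁ = −a·c·(t − t₁) + c·S, where S is the running area under the curve, so c (and hence τ = −1/c) can be obtained from an ordinary two-term linear least-squares fit.

```python
import math


def fit_time_constant(t, y):
    """Estimate tau (same units as t) for y = a + b*exp(c*t), with tau = -1/c.

    Uses the running trapezoidal integral S of y so that c is the
    coefficient of S in a linear, no-intercept regression of (y - y[0])
    on (t - t[0]) and S. No starting guesses are needed.
    """
    if len(t) != len(y) or len(t) < 3:
        raise ValueError("need at least three paired samples")

    # Running area under the curve from t[0] to each t[k] (trapezoid rule).
    s = [0.0]
    for k in range(1, len(t)):
        s.append(s[-1] + 0.5 * (y[k] + y[k - 1]) * (t[k] - t[k - 1]))

    dx = [tk - t[0] for tk in t]
    dy = [yk - y[0] for yk in y]

    # Normal equations for dy = A*dx + c*S, solved with Cramer's rule.
    sxx = sum(d * d for d in dx)
    sxs = sum(d * sk for d, sk in zip(dx, s))
    sss = sum(sk * sk for sk in s)
    sxy = sum(d * v for d, v in zip(dx, dy))
    ssy = sum(sk * v for sk, v in zip(s, dy))
    det = sxx * sss - sxs * sxs
    if det == 0.0:
        raise ValueError("degenerate data: cannot solve for the exponent")

    c = (sxx * ssy - sxs * sxy) / det
    if c >= 0.0:
        raise ValueError("response does not decay towards an equilibrium")
    return -1.0 / c


if __name__ == "__main__":
    # Synthetic cold-to-warm step: 10 degC to 20 degC with tau = 40 s (assumed).
    times = [float(k) for k in range(0, 301)]
    temps = [20.0 - 10.0 * math.exp(-tk / 40.0) for tk in times]
    print(round(fit_time_constant(times, temps), 1))
```

Because the regression is linear in the unknowns, a profile that is not well described by a single exponential shows up as a poor fit rather than as a failure to converge.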

Accuracy
Three protocols were followed to assess the temperature accuracy and consistency of the dive computers.

Water Bath
Dive computers only start to record profiles once they reach a prescribed pressure, but for safety reasons, it is not possible to put a temperature-controlled water bath in a pressurised chamber environment. Therefore, trials were conducted in an unpressurised environment and the temperatures were visually recorded from the computer displays. Water temperature was controlled using a Grant R4 refrigerated bath filled with deionised water, with the circulation set to maximum and temperature equilibrated to (20.0 ± 0.1) °C. Between 9 and 11 devices could be submerged in the water bath at once, so the experiments were run in a series of batches. An initial batch was submerged in the bath for 15 min (three times the maximum mean model time constant, by which time all devices had equilibrated to final temperature). Temperature was then read from the digital display of each device whilst still submerged, and the device removed from the bath. Once all device temperatures had been read the subsequent batch was submerged for 15 min and the process repeated. The process was then repeated at bath temperatures of 10 and 30 °C. For analysis, the deviation of on-screen temperature display from the water bath temperature was noted. On-screen temperature resolution for all devices is limited to 1 °C, with the exception of the Ratio iX3M GPS Deep, which displays temperature on-screen at a resolution of 0.1 °C.
In the material column, m denotes metal and p, plastic. E.g., ppp denotes plastic for the front, edge and back of the housing, respectively.
FIGURE 1 | Example of response curve for one dive/device. Elapsed seconds is the entire dive profile during which all devices were moved between cold and warm tub at 7 min intervals.

Chamber
Six replicate dives were carried out in the outer lock of the Divex chamber, with a maximum simulated depth of (10 ± 1) m. Three dives included a temperature change from a cold to warm environment and three a warm to cold transition, using one tub for the starting temperature and three for the contrast temperature. All devices acclimatised in a single tub for 10 min, unpressurised, to the same starting temperature (cold or warm, depending on dive). Devices were then shared across the three tubs with contrasting temperature, with one Castaway CTD in each tub to provide a baseline. All tubs were compressed to the simulated depth for 10 min, then decompressed and removed (Schema in Supplementary Figure 2). Over the six dives, cold tub final temperature ranged from 10.4 to 12.6 °C and warm tub final temperature ranged from 16.8 to 19.5 °C. Raw data output from the Castaways was used, retaining the full temperature profile as a function of pressure and time. Castaway depth was calculated from pressure using the swDepth function in R (swDepth, 2020), which uses Fofonoff and Millard's refitted equation (Fofonoff and Millard, 1983). Device profiles were aligned by depth and time with the relevant Castaway from the same tub. Mean device temperature was calculated from the final 180 s at > 2.5 m depth (to compensate for differences in the depth at which devices start recording), by which time all devices had equilibrated (Figure 3). The mean from the equivalent 180 s of Castaway data was used as baseline temperature for comparison. Mean bias was defined as mean device temperature minus mean Castaway temperature.
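The windowed comparison described above can be sketched in Python as below. This is an illustration rather than the study's R processing: the function names, the tuple layout of a profile and the reading of "final 180 s" as the last 180 s of samples recorded deeper than the threshold are all assumptions.

```python
def window_mean(times_s, depths_m, temps_c, min_depth_m=2.5, window_s=180.0):
    """Mean temperature over the final `window_s` seconds recorded deeper
    than `min_depth_m` (assumed interpretation of the averaging window)."""
    deep = [(t, temp) for t, d, temp in zip(times_s, depths_m, temps_c)
            if d > min_depth_m]
    if not deep:
        raise ValueError("no samples deeper than the depth threshold")
    t_end = max(t for t, _ in deep)
    window = [temp for t, temp in deep if t >= t_end - window_s]
    return sum(window) / len(window)


def mean_bias(device, reference, min_depth_m=2.5, window_s=180.0):
    """Device mean minus reference mean; each argument is a
    (times_s, depths_m, temps_c) tuple and may have its own sample rate."""
    return (window_mean(*device, min_depth_m=min_depth_m, window_s=window_s)
            - window_mean(*reference, min_depth_m=min_depth_m, window_s=window_s))


if __name__ == "__main__":
    # Hypothetical profiles: device logging every 10 s, reference every 1 s.
    dev_t = list(range(0, 601, 10))
    dev = (dev_t, [10.0 if 30 <= t <= 570 else 1.0 for t in dev_t],
           [11.0] * len(dev_t))
    ref_t = list(range(0, 601))
    ref = (ref_t, [10.0 if 30 <= t <= 570 else 1.0 for t in ref_t],
           [11.5] * len(ref_t))
    print(mean_bias(dev, ref))
```

A negative value means the device read colder than the baseline, matching the sign convention used for mean bias in the text.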

Sea Dives
Six sea dives were carried out by RHIB at dive sites local to Oban (56.41535° N, 5.47184° W), with maximum depths ranging from 13.5 to 30.7 m (mean: 18.6 m). For each pair of dives, half the dive computers were carried hanging loosely on a frame made from plastic piping, and half were worn on the arms of two divers (Figure 2). For subsequent dives in each dive pair, each device was switched to the other mounting position. All Paralenz Dive Camera+ were transported on the frame for all dives (as they were not wrist mountable), along with all Castaways for baseline temperature.
Raw Castaway data were imported and depth was calculated as per section "Chamber." The sea dives had a shallow cold surface thermocline from snow melt run-off. The mean temperature below the depth at which the Castaway temperatures equilibrated (top of the bottom mixed layer) was used as a baseline temperature for comparison for each dive (Supplementary Figure 4). In dive number order this depth was 5, 10, 10, 10, 10, and 12 m. As the frame was carried by divers, and therefore may not have been consistently horizontal, small variations were seen in Castaway depths. Device dive profiles were imported into R Studio and mean temperatures calculated for each device, Castaway and model for the final 180 s below the specified depth (Supplementary Figure 5). Mean bias was defined as mean device/model temperature minus mean Castaway temperature.

RESULTS
As per Wright et al. (2016), devices and models were categorised as accurate if the mean bias from baseline temperature was ≤0.5 °C and as precise if the standard deviation of the mean bias was ≤0.5 °C. Devices were defined as having quick, intermediate or slow response to temperature change (respectively τ < 60 s, 60 s ≤ τ < 120 s, τ ≥ 120 s).
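These categories can be written as a small set of threshold checks. The Python sketch below is illustrative only; the function names are hypothetical, and it assumes the 0.5 °C accuracy limit applies to the absolute value of the mean bias, since most biases reported here are negative.

```python
def classify_accuracy(mean_bias_c, limit_c=0.5):
    """'accurate' if the magnitude of the mean bias is within the limit
    (absolute value is an assumption about how the threshold is applied)."""
    return "accurate" if abs(mean_bias_c) <= limit_c else "not accurate"


def classify_precision(sd_bias_c, limit_c=0.5):
    """'precise' if the standard deviation of the bias is within the limit."""
    return "precise" if sd_bias_c <= limit_c else "not precise"


def classify_response(tau_s):
    """Quick (< 60 s), intermediate (60 s to < 120 s) or slow (>= 120 s)."""
    if tau_s < 60.0:
        return "quick"
    if tau_s < 120.0:
        return "intermediate"
    return "slow"


if __name__ == "__main__":
    # Overall mean bias reported in the text: (-0.2 +/- 1.1) degC.
    print(classify_accuracy(-0.2), classify_precision(1.1))
    # Shortest and longest mean model time constants reported: 18 s and 304 s.
    print(classify_response(18.0), classify_response(304.0))
```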

Time Constants
A total of 239 τ values were collected from 26 devices over 9 dives plus three Paralenz Dive Camera+ cameras over 6 dives. Thirteen τ values were lost because of battery failures or camera recording not initiating correctly. All Ratio iX3M GPS Deep dives and two Shearwater Perdix dives were removed from the analyses because of a poor regression fit (Figure 3). Mean τ by model ranged from (18 ± 5) s to (304 ± 45) s (Figure 4 and Supplementary Table 2). Uncertainties represent 1 σ unless otherwise described. Time constants and residuals were not normally distributed; time constants were best fitted to an inverse Gaussian distribution curve. A generalised linear model (GLM) approach was used in R Studio to look for significant differences. Significant between-model differences were found for τ for all models (p < 0.001; p < 0.01 for Mares Puck Pro). Mean τ by device ranged from (17 ± 6) to (341 ± 69) s (Figure 5). S(τ fit) represents 95% confidence intervals in the regression fit, based on the standard error of the regression (full data in Supplementary Table 3). S(τ fit) < 10 s was considered to be a good fit and applied to all profiles except for those mentioned in the first paragraph of this section. Clear differences were found in τ by pressure sensor location and material, but not by size (Figure 6). All devices with the pressure sensor at the edge, along with the Paralenz Dive Camera+, were defined as having a quick response (17 s ≤ τ < 52 s) and all with a pressure sensor at the back were classified as intermediate or slow responders. Devices with entirely metal housing had quick mean response (17 s ≤ τ < 24 s), part metal/part plastic were

Water Bath Trials
A total of 78 data points were collected from 29 devices over three conditions (bath temperatures). One Suunto D6i data point was missed because of a dead battery. Paralenz Dive Camera+ were not included in the water bath deployments because they have no on-screen temperature display. Mean bias is defined as displayed device temperature minus water bath temperature, averaged on a model or device basis. By model, this ranged from 0 to (−1 ± 1.7) °C. The mean bias by device ranged from 0 to (−2.3 ± 1.5) °C (Supplementary Tables 4, 5).

Sea Dives
A total of 152 mean bias values were collected from 31 devices over five sea dives. Three values are missing due to failure to recover data from Paralenz Dive Camera+. Mean bias by model, without taking into account experimental condition, ranged from (0.0 ± 0.1) to (−1.3 ± 2.2) °C and by device ranged from (0 ± 0.1) to (−3.5 ± 0.1) °C (Tables 2, 3).

"On Frame" vs. "On Arm"
Wearing a computer "on arm" led to a non-negative mean bias across all models (0.0 ± 1.6) °C (Table 4) and devices (0.0 ± 2) °C (Supplementary Table 6) when compared to being carried on the frame (Figure 7). Brand, housing material, shape or response group were not found to be significant for bias in "on arm"/"on frame" data.

Overall Accuracy
As depth-resolved temperature data are required for scientific purposes, and collecting temperature data from dive computers in an unpressurised environment would not be recommended, only data from sea and chamber accuracy dives were combined for overall accuracy results. Across the total n = 337 data points from the two accuracy protocols, overall mean bias was (−0.2 ± 1.1) °C. Mean bias by model ranged from (0.0 ± 0.5) to (−1.4 ± 2.1) °C (Figure 8 and Table 5) and by device ranged from (0.0 ± 0.6) to (−3.4 ± 1.0) °C (Supplementary Table 7).

DISCUSSION
Despite the inherent limitations of the existing technology, our research shows that, while there is wide between-model variation in both temperature bias and τ, there is value in data derived from devices commonly carried by SCUBA divers as a source of subsurface temperature data in coastal areas. We demonstrate that there is sufficient consistency in bias within some models to offer the potential for bias correction by model. In addition, an overall bias of (−0.2 ± 1.1) °C demonstrates that, with sufficient datapoints, valuable data may be produced irrespective of the models from which data were derived. Because of the variation in τ, not all models would be recommended for use in scenarios of temperature change, but some models demonstrate a τ which, in conjunction with a sufficiently high resolution, offers the potential for identification of thermoclines.

Response Time
τ varied widely between models, with less within-model variance than between. We saw less within-device variation in τ than Egi et al. (2018), although a similar mean τ (46 s compared with 52 s) was seen for the only model used in both papers (Mares Matrix). Within-model consistency is promising for the purposes of citizen science, as it offers projects the potential to select specific models based on the project objectives or run post hoc corrections. Six models were defined as quick responders (τ < 60 s) (Supplementary Table 8). Of these, the two models with the shortest τ [Suunto D6i (18 ± 5) s and Paralenz Dive Camera+ (22 ± 3) s] have τ comparable to designed-for-purpose aquatic temperature loggers; the plastic Star-Oddi Starmon mini has an 18 s standard τ. Although more commonly used in moored scenarios, Starmon minis have been used to measure lake temperature profiles, with corrections applied (Jóhannesson et al., 2007).
Exponential fits proved consistent across models; the exceptions causing poor fit were errant temperature data points recorded in the temperature profile (Suunto EON Steel) or a sharp rise in temperature followed by a levelling or drop before a further rise (Ratio iX3M GPS Deep). In the case of the Ratios, the response seen could be because of intermittent heating caused by internal electronic functions of the model, or, as a slow responding but higher resolution model, the devices may have been affected by cold water ingress introduced by adding additional devices. When dive computer model was excluded as a parameter from the generalised linear model, pressure sensor location and housing material were also found to significantly influence τ. As the two features are correlated (e.g., all devices with a pressure sensor at the back are entirely housed in plastic, Table 1), it is not possible to fully separate the effect of the two variables. Also, while pressure sensor location is identifiable (Supplementary Table 1), it is not known whether the temperature sensor is co-located with the pressure sensor in any given model. However, it is logical to postulate that in a small device, or where a sensor is close to the edge of the device housing, a more rapid response to temperature change will be seen than that of a sensor buried deep within a larger housing, where the thermal mass of the dive computer itself may slow the response.

Temperature Accuracy
All models performed well within the ± 2 °C advertised accuracy (Mares, 2020; Azzopardi and Sayer, 2012; Suunto, 2018) overall, with only one model having a mean absolute bias ≥1 °C (Aqualung i750TC), and only two (Aqualung i750TC, Suunto Vyper) having poor precision. The overall mean bias seen [(−0.2 ± 1.1) °C] is comparable with existing commonly used coastal temperature data sets, such as those using handheld digital thermometers for subsurface temperature measurement; Cefas coastal temperature datasets include data from thermometers and data loggers with accuracies of (± 0.2 to ± 0.3 °C) (Morris et al., 2018). A systematic negative bias of −1 °C has been seen in satellite sea surface temperature (satSST) (Brewin et al., 2017) and up to 6 °C bias between coastal satSST and in situ devices (Smit et al., 2013). Sampling requirements for the global ocean observing system in situ SST temperature are 0.2 to 0.5 °C (Needler et al., 1999), and bias-corrected numerical oceanic models have been shown to still have up to −0.86 °C offset from baseline satellite temperature after corrections have been applied (Macias et al., 2018). As nine of the twelve dive computer models were found to have "good" accuracy (≤0.5 °C) overall (Supplementary Table 8), these requirements and biases indicate that, with sufficient data points, some models of dive computers can offer an additional source of temperature data to contribute to ocean temperature monitoring, numerical models and composite satellite products.
Table fragment (columns: Model; Device ID; Sea dives; Chamber): Aqualung i750TC; Aqualung 2; 5, −1.9 ± 0.0; 6, −1.9 ± 0.8.
Differences were found in both bias and variance (accuracy and precision) across the two conditions (sea and chamber). Nine models had the same accuracy categorisation in both sea and chamber dives (Supplementary Table 8). Of these, only three models (Aqualung i750TC, Garmin Descent MK1, Scubapro G2) had the same precision across the two conditions. Precision was found to be improved in sea conditions, with eight models categorised as having "good" precision (Supplementary Table 8).
Only one model (Ratio iX3M GPS Deep) was found to have good precision in the chamber. The reduced precision found in nine of the models in the chamber is likely caused by differences between tub temperatures in dive repetitions, combined with the effect of a static water environment on the Castaway temperature sensor. Castaway CTDs are designed to work with a steady flow of water of around 1 m s⁻¹ through the sensor channel. Collection of data in real world scenarios will always lead to differences caused by environmental variation for which it is not possible to control. In the present study, all Castaways were positioned on a frame carried by one diver, while all the dive computers were worn on the wrists of two divers. It is therefore possible that, although precision was better than in the chamber, proximity differences combined with local variations in temperature led to additional variation being seen in the sea dives.
TABLE 4 | Comparison of bias by model worn "on arm" with loose on a frame. Columns: On frame; On arm; On arm minus on frame. Bias is defined as the mean temperature derived from the final 180 s of sea dives below the top of the bottom mixed layer, compared to baseline Castaway temperature data acquired over the same time.
With the exception of three devices [Ratio iX3M (n = 1), Garmin Descent Mk1 (n = 1), Suunto EON Steel (n = 1)], all individual devices aligned with their model's overall accuracy categorisation, demonstrating positive within-model consistency. Similarly, only one device had lower precision than its model's categorisation, with four devices [Suunto EON Steel (n = 2), Aqualung i750TC (n = 2)] having better precision than their model would indicate. This within-model consistency is encouraging for post hoc bias correction by model. Across both conditions, all models except three showed overall negative bias to the baseline temperature. In contrast, Mares Matrix had an overall bias of 0, whilst Ratio iX3M GPS Deep and Paralenz Dive Camera+ biased warm. This could be caused by an internal heating effect of the electronics due to additional active functions, as Ratio iX3M GPS Deep and Paralenz Dive Camera+ are both devices with additional functionality in comparison with some smaller devices.
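A post hoc correction by model could be as simple as subtracting the model's mean bias from each reading. The Python sketch below shows the idea; the function name, the model labels and the bias values in the lookup table are hypothetical placeholders, not results from this study.

```python
def correct_by_model(temps_c, model, model_bias_c):
    """Return readings with the model's mean bias removed.

    Bias follows the convention used in the text (device minus baseline),
    so a cold-biased model (negative bias) is corrected upwards.
    Raises KeyError if no bias estimate exists for the model.
    """
    bias = model_bias_c[model]
    return [t - bias for t in temps_c]


if __name__ == "__main__":
    # Hypothetical bias table in degC; real values would come from a
    # calibration exercise such as the chamber and sea dives described here.
    bias_table = {"model_a": -1.5, "model_b": 0.0}
    print(correct_by_model([10.0, 12.0], "model_a", bias_table))
```

A constant offset of this kind only addresses accuracy; it does not reduce the spread of readings for models with poor precision.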
Diver attachment placement also had a significant effect on bias in sea dives, with all models "on arm" having a non-negative mean bias compared with "on frame" (irrespective of whether the device was biased colder or warmer than the baseline). These differences could be caused by the heating effect of the diver's body, or by the effect of an additional barrier between the ambient water and the temperature sensor (dependent on sensor location within the housing). All divers were wearing drysuits, but the material and thickness varied (neoprene/membrane).
With the exception of two models (Mares Matrix, Suunto EON Steel) there was greater variation in within-model bias in "on arm" conditions. This could be due to differences in positioning of dive computers on arms, the amount of contact between the device and the diver's arm, or the dive suit material. When collecting or correcting data across different environments, console mounted devices which are mounted on a hose not attached to the diver may be preferable for temperature data accuracy. Alternatively, it is common for divers to have redundancy in kit, carrying two dive computers. The secondary device could be attached safely to the diver but not worn on the arm. It is recommended that attachment mechanism and thermal protection type be noted in data collection from citizen scientist divers so it can be taken into consideration.

Technology Limitations
Accuracy in recorded or displayed temperature, or response to temperature change, does not form part of primary dive computer function, and dive computer manufacturers are not providing temperature data for oceanographic purposes. The results found are in no way reflective of the performance of any model in its designed purpose as a diver safety device. Whilst dive computers in the United Kingdom must adhere to standards set in British Standard BS EN 13319:2000, which covers functional and safety requirements including depth and time, the Standard does not include temperature (British Standard, 2000).
FIGURE 7 | Effect of wearing devices "on arm" vs. "on frame." Bias from Castaway baseline data by device, black line represents an equal bias in both conditions.
FIGURE 8 | Normalised bias by model across sea and chamber dives. The black line represents the median. The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). Upper and lower whiskers extend from the hinge to the largest/smallest value, respectively, no further than 1.5 * inter-quartile range from the hinge. Data beyond the end of the whiskers are plotted individually as outliers.
The greatest potential for temperature data from citizen scientist divers is to address the lack of depth-resolved data in coastal regions. To improve the overall use of dive computers as oceanographic monitoring devices in less-well performing models, manufacturers could look at improving the quality of the out of the box measurements. The addition of an accurate dedicated temperature sensor, with considered placement of the sensor would support unbiased detection of water temperature change. Whilst the majority of dive computer models tested by Azzopardi and Sayer (2010) were found to be consistently within 1% of nominal depth, the addition of conductivity sensors to measure salinity would increase the accuracy of depth values, although this would not affect temperature data quality. Inclusion of geolocation ability would allow easy identification of dive locations. The combination of all of the above would maximise the citizen science potential of divers, due to their access to otherwise hard to reach locations. Within the limitations of the current commercially available devices, a citizen science project dataset could be improved by calibrating individual dive computers in advance, simply, using an iced bucket of water.
As evidenced by the water bath trials, this would be greatly improved by an additional significant figure in the unpressurised temperature display, as currently the majority of models display only positive integers, limiting the potential accuracy by introducing truncation effects.
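As a rough illustration of the ice-bath check and the truncation effect described above, the sketch below assumes a well-stirred ice-water bath at 0 °C as the reference point and a display that simply drops the fractional part of the reading. The function names and all readings are hypothetical and are not taken from the study. A single-point check of this kind can only estimate a constant offset, not any temperature-dependent error.

```python
import math

ICE_BATH_REFERENCE_C = 0.0  # assumed reference: well-stirred ice-water bath


def ice_bath_offset(readings_c, reference_c=ICE_BATH_REFERENCE_C):
    """Mean offset (device minus reference) from equilibrated readings."""
    if not readings_c:
        raise ValueError("at least one reading is required")
    return sum(readings_c) / len(readings_c) - reference_c


def apply_offset(recorded_c, offset_c):
    """Subtract a single-point offset from a recorded temperature."""
    return recorded_c - offset_c


def truncated_display(true_c):
    """Integer display that drops the fractional part (assumed behaviour)."""
    return math.trunc(true_c)


# Hypothetical readings logged once the device has equilibrated in the bath.
offset = ice_bath_offset([1.0, 1.0, 1.0, 1.0])
corrected = apply_offset(12.0, offset)

# Truncation alone can hide almost a full degree of real temperature.
worst_case_error = 12.9 - truncated_display(12.9)
```

With an integer-only display the offset itself can only be resolved to the nearest degree, which is why an extra displayed digit would make this kind of check more useful.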

Citizen Science and Use of Data
We need to better understand how model type affects temperature profiles so that citizen science diving projects can help fill gaps in coastal temperature datasets. To standardise data, there should be a focus on the models offering the greatest accuracy and shortest temperature response. Only one model (Aqualung i750TC) was found to have poor accuracy and precision across all conditions, along with a slow response to temperature change. Five of the six models with a quick temperature response (τ < 60 s) were also found to have good accuracy, with good/moderate precision overall (Figure 9). These comprise Mares Matrix (2/2), Garmin Descent (2/3), Suunto D6i (3/3), Suunto EON Steel (2/3) and Suunto D4i (1/1), all sharing promising characteristics as individual devices.
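To give a feel for what a response time of τ = 60 s means in practice, the following sketch assumes that the sensor behaves as a first-order (exponential) system, which is the usual interpretation of a time constant. Whether the τ values reported here were derived with exactly this model is not stated in this section, so the functions below are illustrative only and their names are hypothetical.

```python
import math


def fraction_registered(elapsed_s, tau_s):
    """Fraction of a step change registered after elapsed_s seconds,
    assuming a first-order (exponential) sensor response."""
    if tau_s <= 0:
        raise ValueError("tau_s must be positive")
    return 1.0 - math.exp(-elapsed_s / tau_s)


def time_to_fraction(fraction, tau_s):
    """Seconds needed to register a given fraction of a step change."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -tau_s * math.log(1.0 - fraction)


# At the tau = 60 s threshold, one minute after crossing a thermocline the
# device has registered about 63% of the temperature step.
quick = fraction_registered(60, 60)

# Registering 95% of the step takes roughly three time constants.
settle_s = time_to_fraction(0.95, 60)
```

Under this assumption, even a "quick" device needs around three minutes at a constant depth before its reading is within 5% of a step change.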
When considering models for citizen science data collection, those with the greatest potential have a high sample rate and resolution, are likely to have a pressure sensor located on an edge and have a metal or part-metal housing. In addition, a standardised model could be used by all volunteers in a project and simple corrections applied for systematic model bias. The most promising model tested here for overall use across citizen science projects is the Mares Matrix. This model had consistently good accuracy and precision and a quick response to temperature change, exhibiting an overall mean bias of (0.0 ± 0.4) °C and τ = (46 ± 5) s with a recorded resolution of 0.1 °C and a 5 s sampling rate. A close second is the Suunto EON Steel, which has good accuracy overall, moderate precision and a quick response to temperature change, with a recorded resolution of 0.1 °C and a 10 s sampling rate. Other models have shorter τ (Suunto D6i, Suunto D4i, Garmin Descent), but single degree resolution, making them less useful for monitoring temperature change.
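The simple per-model correction mentioned above could look something like the sketch below. It uses the definition of bias given in this study (device temperature minus baseline temperature), so the correction subtracts the model's mean bias. The Mares Matrix entry is the overall mean bias quoted in the text; the second entry and the example profile are hypothetical placeholders, and the function name is an assumption.

```python
# Mean bias per model in deg C, defined as device minus baseline temperature.
# "Mares Matrix" uses the overall mean bias quoted in the text;
# "Hypothetical Model X" is a made-up placeholder for illustration.
MODEL_MEAN_BIAS_C = {
    "Mares Matrix": 0.0,
    "Hypothetical Model X": -0.5,
}


def correct_for_model_bias(recorded_c, model, bias_table=MODEL_MEAN_BIAS_C):
    """Remove a model's mean bias from a recorded temperature."""
    if model not in bias_table:
        raise KeyError(f"no bias estimate for model {model!r}")
    return recorded_c - bias_table[model]


# Hypothetical profile from a device that reads 0.5 deg C cold on average.
profile_c = [14.0, 13.5, 12.0]
corrected_profile_c = [
    correct_for_model_bias(t, "Hypothetical Model X") for t in profile_c
]
```

A constant correction of this kind addresses only the mean bias of a model; it does not reduce between-device scatter or the lag caused by a slow temperature response.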
With sufficient data points, we found "good" accuracy, irrespective of originating device. Therefore, data collected by local groups or dive centres in commonly dived, discrete areas, may generate sufficient data points to provide a useful accuracy, irrespective of model. In addition, not all sampling locations have equal value (Callaghan et al., 2019) and lower quality data may still be of use to support decision making (Buytaert et al., 2016) if uncertainties are quantified. As such, in remote, less widely sampled areas where there are limited pre-existing records, dive computer information may still be of use as indicative data, even with fewer sampling points or from devices with less accuracy/precision.
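One way to see why larger numbers of data points help is the standard error of a mean, sketched below. This assumes that the ± 1.1 °C quoted with the overall mean bias in the abstract is a standard deviation and that errors from different profiles are independent; both are assumptions made for illustration, and the function names are hypothetical. Averaging reduces random scatter only, not any bias shared by all devices.

```python
import math


def standard_error(sd_c, n):
    """Standard error of the mean of n independent measurements."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd_c / math.sqrt(n)


def samples_needed(sd_c, target_se_c):
    """Smallest n for which the standard error is at or below the target."""
    if target_se_c <= 0:
        raise ValueError("target_se_c must be positive")
    return math.ceil((sd_c / target_se_c) ** 2)


SPREAD_C = 1.1  # assumed standard deviation, taken from the overall mean bias

se_100_profiles = standard_error(SPREAD_C, 100)
n_for_half_degree = samples_needed(SPREAD_C, 0.5)
```

Under these assumptions a handful of profiles from the same site and period would bring the random component of the mean below 0.5 °C, which is consistent with the suggestion that commonly dived areas can yield useful accuracy irrespective of model.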
In addition to the device-related effect, we found that mode of attachment and placement on the diver body had an influence on temperature accuracy. Therefore, for citizen science-derived dive computer profiles to be useful on a wider basis, collection of metadata is crucial. Downloaded profiles already contain metadata such as date, time and model, but diver attachment, placement and diver thermal protection type should be collected in addition, to enable a more comprehensive assessment of data quality on an individual profile basis. An online portal facilitating easy upload of profiles and associated metadata is currently in late-stage development. Ideally, data from different citizen science dive portals should be combined in a global dataset.
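The metadata listed above could be captured alongside each uploaded profile in a small record such as the one sketched here. The field names, the allowed attachment values and the example entry are all hypothetical; this is not the schema of the portal mentioned in the text, only an illustration of the kind of information that would support profile-level quality assessment.

```python
from dataclasses import asdict, dataclass
from datetime import datetime
from typing import Optional

# Hypothetical controlled vocabulary for where the device was carried.
ATTACHMENT_OPTIONS = {"on arm", "on frame", "console"}


@dataclass
class DiveProfileMetadata:
    """Metadata accompanying one downloaded dive profile (illustrative)."""

    model: str                  # already present in downloaded profiles
    start_time: datetime        # already present in downloaded profiles
    attachment: str             # where the device was carried
    thermal_protection: str     # e.g. drysuit material
    notes: Optional[str] = None

    def __post_init__(self):
        if self.attachment not in ATTACHMENT_OPTIONS:
            raise ValueError(f"unknown attachment: {self.attachment!r}")


# Hypothetical example entry.
example = DiveProfileMetadata(
    model="Mares Matrix",
    start_time=datetime(2020, 6, 1, 10, 30),
    attachment="on arm",
    thermal_protection="membrane drysuit",
)
example_record = asdict(example)
```

Restricting attachment to a fixed set of values keeps entries comparable between volunteers, at the cost of needing a new option whenever a project encounters an unlisted configuration.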
Temperature from dive computers could be used to complement biological datasets. For example, thermocline depth affects vertical distribution of fish (Sogard and Olla, 1993), so computer-derived temperature data could contribute to a better understanding of local variability in fish movements. Temperature data can also support regional assessment of hydrological conditions (Morris et al., 2018). In highly dived areas, the data would provide a time series allowing identification of seasonal variation, albeit without complete temporal coverage. They may also be useful for marine recreation (Brewin et al., 2015) or feeding into numerical models and satellite products (Smit et al., 2013) in areas where the accuracy is known to be < 1 °C. They could be especially useful in commonly dived, poorly sampled areas, such as the South Pacific, where the volume of dive profiles could provide data of a useful resolution irrespective of model.
In conclusion, the limitation of divers as citizen scientists for temperature data collection is inherent in the devices themselves. The challenge is to understand the uncertainty in accuracy and precision recorded by the devices rather than the abilities or knowledge of the citizen science diver. Our research shows that the quality of temperature data in dive computers could be improved, but implementation would need to be driven by manufacturers, or by diver demand. As some models of dive computers can demonstrably provide data comparable to that collected by more traditional methods, within required accuracy levels for some monitoring scenarios, they have a role to play in future oceanographic monitoring.

AUTHOR CONTRIBUTIONS
CM carried out the experiments and analysed the data. CM wrote the manuscript with contributions from all authors. All authors set the conceptual framework of this study.

FUNDING
CM's project was part of the Next Generation Unmanned Systems Science (NEXUSS) Centre for Doctoral Training, which was funded by the Natural Environment Research Council (NERC) and the Engineering and Physical Sciences Research Council (EPSRC) (Grant No: NE/N012070/1). The Ph.D. project was additionally supported by Cefas Seedcorn (DP901D). The diving and chamber tests were supported through a grant from the NERC National Facility for Scientific Diving (Grant No: NFSD/17/02).

ACKNOWLEDGMENTS
We acknowledge Mares and Paralenz for loans of dive computers, cameras, and technical information and Benita Maritz for water bath inspiration.