Validation and Intercomparison of Ocean Color Algorithms for Estimating Particulate Organic Carbon in the Oceans

Particulate Organic Carbon (POC) plays a vital role in the ocean carbon cycle. Though relatively small compared with other carbon pools, the POC pool is responsible for large fluxes and is linked to many important ocean biogeochemical processes. The satellite ocean-colour signal is influenced by particle composition, size, and concentration and provides a way to observe variability in the POC pool at a range of temporal and spatial scales. To provide accurate estimates of POC concentration from satellite ocean colour data requires algorithms that are well validated, with uncertainties characterised. Here, a number of algorithms to derive POC using different optical variables are applied to merged satellite ocean colour data provided by the Ocean Colour Climate Change Initiative (OC-CCI) and validated against the largest database of $\textit{in situ}$ POC measurements currently available. The results of this validation exercise indicate satisfactory levels of performance from several algorithms (highest performance was observed from the algorithms of \cite{stramski2008} and \cite{loisel2002}) and uncertainties that are within the requirements of the user community. Estimates of the standing stock of the POC can be made by applying these algorithms, and yield an estimated mixed-layer integrated global stock of POC between 0.77 and 1.3 Pg C of carbon. Performance of the algorithms vary regionally, suggesting that blending of region-specific algorithms may provide the best way forward for generating global POC products.

Particulate Organic Carbon (POC) plays a vital role in the ocean carbon cycle. Though relatively small compared with other carbon pools, the POC pool is responsible for large fluxes and is linked to many important ocean biogeochemical processes. The satellite ocean-color signal is influenced by particle composition, size, and concentration and provides a way to observe variability in the POC pool at a range of temporal and spatial scales. To provide accurate estimates of POC concentration from satellite ocean color data requires algorithms that are well validated, with uncertainties characterized. Here, a number of algorithms to derive POC using different optical variables are applied to merged satellite ocean color data provided by the Ocean Color Climate Change Initiative (OC-CCI) and validated against the largest database of in situ POC measurements currently available. The results of this validation exercise indicate satisfactory levels of performance from several algorithms (highest performance was observed from the algorithms of Loisel et al., 2002;Stramski et al., 2008) and uncertainties that are within the requirements of the user community. Estimates of the standing stock of the POC can be made by applying these algorithms, and yield an estimated mixed-layer integrated global stock of POC between 0.77 and 1.3 Pg C of carbon. Performance of the algorithms vary regionally, suggesting that blending of region-specific algorithms may provide the best way forward for generating global POC products.

INTRODUCTION
Total particulate organic carbon (POC or C o ) in the ocean is a key currency used in studies of both the biological export of carbon from the surface to the deep ocean, and the availability of food for marine organisms. The pool of POC in the ocean is relatively small (estimates include: 0.43 Pg C in the first lightattenuation depth- Gardner et al., 2006;2.28 Pg C over a 200 m surface layer- Stramska, 2009). Despite the relative small size of the POC compartment, its components (phytoplankton, bacteria, zooplankton, and organic detritus) are responsible for large fluxes in the ocean, because of their high turnover rates. The organic tissue generated by photosynthesis in the sunlit ocean is either exported from the surface via the "biological pump" (Volk and Hoffert, 1985;Ducklow et al., 2001), transferred to higher trophic levels through the food chain, transformed into detritus, or recycled via the microbial loop, with some of it going into the pool of dissolved organic (DOC) and inorganic carbon (DIC). Organic particles are therefore involved in two important carbon fluxes in the ocean, primary production and export to either the deep ocean or the DOC and DIC pools, in addition to being an integral part of the marine food web. In addition to the components of POC arising from local sources, POC may be transported to a particular location from distant sources: for example, by currents that move POC horizontally in the ocean, or by transport of POC of terrestrial origin to the oceans by river outflow. Though POC is typically treated as a single pool, there is growing awareness of the importance of different particles, such as defined by their size, because of their variety of biogeochemical functions, and their effect on ocean optical properties. For example, it has been shown that around 40% of POC concentration in the oligotrophic regions may be associated with bacteria alone (Cho and Azam, 1990) and submicron detrital particles can also make a significant contribution to the POC pool (Mel'nikov, 1976). Similarly, relatively large particles (generally larger than a few micrometers) can play an important role in POC export (e.g., Boyd and Newton, 1995;Dall'Olmo et al., 2009). The importance of particles characteristics in determining the optical signal of POC has also been recognized.
The theoretical work of Stramski and Kiefer (1991), assuming spherical and homogenous particles, indicated that small particles can make an important contribution to the backscattering signal in the oceans. Further work has shown the impact of non-sphericity and intracellular structures on optical properties, particularly backscattering (Meyer, 1979;Kitchen and Zaneveld, 1992;Bernard, 2004, 2006;Clavano et al., 2007;Matthews and Bernard, 2013;Robertson Lain et al., 2017). Work by Cetinić et al. (2012) linked variation in the beam attenuation coefficient with plankton community composition, and variability in particle backscattering with changes in particle composition due to remineralization. They also highlighted how measurement artifacts might influence the observed relationships between POC and optical properties. Further work has explored separation of the phytoplankton component in POC-through both indirect (Behrenfeld et al., 2005;Kostadinov et al., 2016) and direct methods (Graff et al., 2012(Graff et al., , 2015. The contribution of phytoplankton to the POC pool leads to the covariance between chlorophyll a concentration ([Chl]) and POC concentrations, although some scatter exists in these relationships, as a result of variability in the phytoplankton community composition, physiological factors that can affect the carbon-chlorophyll ratio in phytoplankton, and the variable contribution of substances other than phytoplankton (including detritus and bacteria) to POC (see discussion and references within Stramska and Stramski, 2005;Sathyendranath et al., 2009). POC is readily quantifiable by filtering seawater samples, and forms a key component of many biological ocean models. However, in situ samples are expensive to collect, leading to a scarcity of data that hinders efforts to both validate ocean models and develop a complete understanding of POC dynamics. Satellite ocean color data offers the opportunity to quantify POC at the global scale on an almost daily basis. Ocean color or more specifically the water-leaving radiance and corresponding remote sensing reflectance spectra, and derived [Chl] are recognized as Essential Climate Variables (ECVs) by the Global Climate Observing System (GCOS, 2011). This is in recognition of their importance for studying various biological variables and processes in the ocean. In fact, of all the oceanic ECVs that are amenable to remote sensing, ocean color is the only one that targets a biological property. In response to the GCOS requirements, the European Space Agency (ESA) Ocean Color Climate Change Initiative (OC-CCI) has generated a time series of merged satellite products for climate research, using data from the ESA satellite sensor MERIS (MEdium spectral Resolution Imaging Spectrometer) and NASA (National Aeronautics and Space Administration) satellite sensors SeaWiFS (Sea-viewing Wide Field-of-view Sensor) and MODIS-Aqua (Moderate-resolution Imaging Spectroradiometer-Aqua). The products include the normalized remote-sensing reflectances, R rs at SeaWiFS wavelengths and [Chl], as well as some additional inherent optical properties (IOPs) such as absorption and backscattering coefficients of phytoplankton and other particulate matter, and diffuse attenuation coefficient for downward plane irradiance, K d at 490 nm. There is a recognized need in the user community for additional products from ocean color that deal directly with POC, including separation of the contribution of phytoplankton, and the size distribution of particles. Further, these products need to be regionally optimized and their uncertainties well characterized. Both of these requirements may be addressed via optical classification (e.g., Moore et al., 2009), whereby waters are classified according to their spectral or bio-optical properties. Optical classification allows specific algorithms to be applied to the different optical water types (resulting in a global, merged product) and provides provides a method (Moore et al., 2009;Jackson et al., in press) that can be used to calculate per pixel errors. Other methods exist: e.g., formal error propagation (Lee et al., 2010), or estimation of uncertainties based on modelobservation comparison (Maritorena et al., 2010). However, the users consulted within the OC-CCI project expressed a preference for uncertainties based on comparison with in situ data (Sathyendranath et al., in press).
Remote sensing of POC through ocean color radiometry requires the exploitation of some optical signal that is associated with the material. In fact, optically, the beam attenuation coefficient of particles (c p ), particle scattering coefficient (b p ), backscattering coefficient (b bp ) and the attenuation coefficient of downward irradiance K d are all sensitive to particle abundance (to a first order), and to particle composition (through refractive index), size, shape and internal structure. It has been demonstrated that POC is correlated with in situ c p measured using transmissometers (Gardner et al., 1993;Bishop, 1999;Claustre et al., 1999;Stramska and Stramski, 2005), which has provided a robust optical method for measuring POC using in situ devices. Although b p and c p are not among the data products that are routinely retrieved from remote sensing, satellite-based algorithms exist for retrieving POC from all the optical properties listed above, as well as from remote-sensing reflectance values.
This paper compares five different algorithms for estimating POC concentrations, selected as being representative of varied approaches that are prevalent for POC retrieval from oceancolor data. Each algorithm is applied to different optical properties derived from satellite ocean color data, and each uses different formulations for linking the parameters to the POC concentration. Matchups between in situ measurements of POC and satellite ocean color allow for the validation, intercomparison of global performance, and estimation of uncertainties associated with the POC calculated using these algorithms.

Collation of an In situ Database
For this study, POC concentration data were collected from a number of existing databases and from individual contributors. Databases collated included PANGAEA (https://www.pangaea.de/) and SeaBASS (Werdell and Bailey, 2005), and those compiled by Martiny et al. (2014) and the Biological and Chemical Oceanography Data Management Office (BCO-DMO, USA). Further data were included from the Atlantic Meridional Transect (AMT) (including data derived from both CTD and the ship's clean water supply) and other cruises in the Southern Ocean (see a description of the Good Hope line and associated data collection in Thomalla et al., 2017). Operationally, POC is defined as all the organic carbon that is retained on GF/F filters (nominal pore size of 0.7 µm). To measure POC, samples are collected on pre-combusted (450 • C) GF/F filters and dried overnight at 65 • C before analysis. To remove particulate inorganic carbon, filters are acidified either by adding low-carbon HCl directly or by overnight exposure to the fumes of a concentrated HCl solution in a desiccator. Filters are then dried, packed in pre-combusted tin capsules, combusted at 960 • C in an elemental analyser to convert the organic carbon in CO 2 . The liberated CO 2 is finally detected by thermal conductivity (Sharp, 1974). Acetanilide is used as a standard. The procedure for applying a blank however is not always consistent across studies, and as such could be a source of bias within the data set collated here. Cetinić et al. (2012) (and references therein) have studied the consequences of different methodologies for treating POC blanks, summarizing that the effect of DOC adsorption on filters (if not accounted for with an adequate blank correction) can cause substantial bias at low POC concentrations. In the database used here, a large quantity of the samples from low POC regions (i.e., the oligotrophic gyres) are from the AMT cruise programme, where a multiple-volume intercept blank methodology is used to reduce potential bias from blanks. Where data was provided at depth, measurements were averaged over 10m to provide the "surface" value for the matchup. Optical weighting of these measurement were considered, however given the variability of the water types sampled and associated mixed layer depths, and the necessary assumptions to apply an optical model for this purpose, it was decided not to introduce additional sources of uncertainty for these data points.

Extraction of Satellite-In situ Matchups
Matchup extraction was based on the procedure developed for the OC-CCI. The daily, 4 km, sinusoidally projected OC-CCI version 2 data (Sathyendranath et al., 2016) were searched to find satellite data associated with each in situ data point. The OC-CCI data is a merged product from three sensors, each with a different overpass time. However these overpass times are generally around 12 p.m. ± 2.5 h, meaning a maximum time difference between the in situ and satellite data of <12 h. The OC-CCI data used contain all of the relevant optical and biogeochemical properties necessary for implementation of the different algorithms under consideration, as well as water class membership of each pixel which quantifies the similarities between the remote-sensing reflectance spectrum at that pixel and the characteristic mean and covariance spectra associated with each of the optical classes (Jackson et al., in press), see also the OC-CCI product user guide (http://www.esa-oceancolorcci.org/?q=documents). The OC-CCI data were interrogated to assess the availability of data covering the latitude and longitude of the in situ data point, on the same date as the in situ data collection. If the central pixel contained valid data, the data surrounding eight pixels are also extracted from the selected data (a 3 × 3 pixel box, corresponding to a 12 km × 12 km region) . The central value and the mean, median, and standard deviation, number of valid pixels out of the nine pixels, the optical class with the dominant membership, and the calculated POC products using various algorithms applied to the central and mean pixel values, are returned as output, along with the in situ data values and metadata.

Candidate Algorithms
Five different algorithms for determination of POC concentration were considered. For consistency of comparison, all algorithms were implemented using the appropriate variables from the OC-CCI product suite. Another reason for using the OC-CCI products is that a rigorous algorithm selection procedure for atmospheric correction  and in-water properties  (including derivation of [Chl], IOPs and the diffuse attenuation coefficient), had been put in place, to ensure quality of products. Furthermore, being a merged product, OC-CCI coverage is higher than that available from single-sensor products, ensuring a higher number of match-up points with in situ data. However, we recognize that all the five candidate algorithms were initially developed, implemented and tested using other ocean-color products, and that any systematic differences between the OC-CCI products and the datasets used by the algorithm developers, could be a potential source of difference in performance. Therefore, in the description of the algorithms, we also provide details of how the algorithms were implemented in the original work.
We also note that there are differences in how the five algorithms compared here were developed and implemented in the original work. For example, two of the algorithms (algorithms A and B presented below) were derived solely from coincidentally collected in situ data, whereas Algorithm D (G06-described below) combined in situ measurements of POC and beam attenuation, with satellite-derived measurements of diffuse attenuation coefficient. Algorithms A and B are based on some 50 measurements, whereas the POC-beam attenuation coefficient relationship used in Algorithm D was based on over 3,000 measurements. Algorithm C (described below) relies on a large in situ database of [Chl] and backscattering ratio to estimate total particle scattering coefficient from particle backscattering coefficient, and then relies on an extensive literature review to find a conversion factor between particle scattering coefficient and POC. Here we use a common set of satellite data (OC-CCI) to compute POC using the different algorithms and compare the products against a common set of in situ data. Where there are differences between how the various steps in the algorithms were implemented in the original work, and how OC-CCI treated similar steps in its product generation, they are highlighted, since such differences could have potential impact on algorithm performance.

Algorithm A Based on Remote-Sensing Reflectance
This algorithm (designated C o (A)) proposed by Stramski et al. (2008) uses remote-sensing reflectance at 443 and 555 nm (equation 1) as inputs, and takes the following form: (1) The model parameters were determined using some 53 pairs of co-located in situ measurements of both POC and spectral values of R rs from oligotrophic and upwelling waters of the East Atlantic and the South Pacific. The authors have provided various fits to the models for different pairs of wavebands for R rs and for different selections of data. The one we have used here is based on all data, and for the 443-555 waveband pair, as recommended by the authors. This algorithm is currently used by the NASA Ocean Biology Processing Group to generate the global POC data product from ocean color data. The R rs values in the OC-CCI product suite were used as input to this algorithm for the validation exercise presented here. We note that a similar POC algorithm based on the blue-to-green reflectance ratio has been developed and validated with the data from the Southern Ocean (Allison et al., 2010). For the Southern Ocean algorithm the best fit coefficients of the power function of the same form as Equation (1) were determined to be 189.29 and -0.87. A power function was found to provide better error statistics compared with those using modified fits of those typically used in maximum band ratio algorithms for [Chl].

Algorithm B Based on Backscattering by Particles
This algorithm (C o (B)), also proposed by Stramski et al. (2008) uses b bp (555), the particulate backscattering coefficient at 555 nm (2), as input. The equation has the form: Stramski et al. (2008) tested two approaches for calculating b bp for the eventual determination of POC: firstly, the method of Maffione and Dana (1997) as refined by Boss and Pegau (2001) was used to derive spectral backscattering coefficient from in situ measurements of volume scattering function at 140 o . Next, using a two-step empirical approach, Stramski et al. (2008) then calculated b bp by removing the backscattering coefficient for pure water [they used pure-water backscattering coefficients proposed by Buiteveld et al. (1994) and by Morel (1974), and reported that the difference between the measurements of b bw was likely within the range of errors associated with the measurements themselves, and therefore did not substantially impact the performance of the algorithm for deriving POC]. Then, b bp was empirically related to POC concentrations (Equation 2). Stramski et al. (2008) provide various fits to this equation, but the parameters selected above correspond to the results (number of observations = 54) excluding upwelling data, which provided better uncertainty metrics, and may better reflect the data within the database used here. Secondly, to provide the remote-sensing context, Stramski et al. (2008) used either a direct empirical relationship between the backscattering coefficient (b b ) and R rs or the Quasi Analytical Algorithm (QAA) approach of Lee et al. (2002) to derive b bp from R rs . They found that the comparison with measured b bp was improved considerably when an empirical correction based on their measurements of b bp was applied to the QAA algorithm. Thus, this algorithm uses a two-step approach: first, b bp is derived from R rs , and then that b bp is used in Equation (2) to calculate POC.
In the computation of OC-CCI b bp products, the QAA model was used, along with Zhang et al. (2009) for the purewater backscattering coefficient (Sathyendranath et al., 2016). In QAA the spectral backscattering coefficient is calculated empirically using R rs at 440 nm and at 555 nm. A power law is then used to calculate the particle backscattering coefficients at other wavelengths given the value of the spectral backscattering coefficient and b bp (555), estimated from an analytical relationship using R rs (555) and the absorption coefficient (a(555)). These values of b bp were used to compute POC in the calculations presented here, without any empirical correction. For this algorithm a linear fit provided the best error statistics in the original study of Stramski et al. (2008).

Algorithm C Based on both Backscattering Coefficient and Chlorophyll-a Concentration
This algorithm (C o (C)) is based on a combination of b bp (490) and [Chl], and was proposed by Loisel et al. (2002) (3). They derived b bp and [Chl] from SeaWIFS data using the method of Loisel and Stramski (2000) and the OC4 algorithm (based on O'Reilly et al., 1998) respectively. They then used a relationship proposed by Twardowski et al. (2001) to compute the total particle scattering coefficient given the particle backscattering coefficient (b bp ) and [Chl] (designated B in Equation 3). They then adopted a conversion value of 400 mg C m −2 to go from total scattering coefficient to POC concentration. Combining these steps yields the following algorithm: Algorithm C is implemented here using the Lee et al. (2002) QAA approach (for b bp ), and a more recent OC4 (v6) for [Chl].

Algorithm D Based on the Diffuse Attenuation Coefficient for Irradiance and Beam Attenuation Coefficient
This algorithm (C o (D)), based on Gardner et al. (2006), uses a two-step relationship relating K d (490) to beam attenuation coefficient for particles (c p ), and then the beam attenuation coefficient to POC: and In Gardner et al. (2006), K d (490) is obtained from SeaWIFS data where the algorithm of Mueller (2000) is used, based on water-leaving radiances at 490 and 555 nm. Gardner et al. (2006) used an extensive database (number of observations = 3,462) of concurrent measurements of POC and beam attenuation coefficient from Atlantic, Pacific and Indian Oceans, to derive the parameters of Equations (4) and (5) empirically. The K d (490) values used in the comparison presented here are obtained from the OC-CCI version 2 data, which uses the method of Lee et al. (2005).

Algorithm E Based on Spectral Backscattering Coefficient:
This algorithm (C o (E)) was developed by Kostadinov et al. (2009) and Kostadinov et al. (2016). It correlates the slope (η) of backscattering as a function of wavelength to the slope (ξ ) of the particle size distribution (PSD) that is assumed to follow a powerlaw. The method has three steps: firstly, η is calculated from the spectral b bp values at 490, 510, and 555 nm extracted from the OC-CCI matchup data (for which OC-CCI uses Lee et al. (2002)). Note that in Kostadinov et al. (2009) and Kostadinov et al. (2016), b bp is instead retrieved using the formulation of Loisel and Stramski (2000). Secondly, look-up-tables (LUTs) are used to retrieve the parameters of the PSD, namely the slope (ξ ) and the differential number concentration at a reference diameter of 2 µm, (N o ), given η and b bp at 443 nm. The LUTs are constructed using theoretical forward simulations using Mie code (Bohren and Huffman, 1983). Finally, to compute C o , the PSD is integrated to calculate particle volume in the 0.5 to 50 µm diameter range, and then volume is converted to carbon using existing allometric relationships derived from phytoplankton cultures (Menden-Deuer and Lessard, 2000), assuming biogenic origin for all the scattering particles. An empirical correction is applied to N o (based on PSD validation statistics) to achieve more realistic absolute carbon concentration values (Kostadinov et al., 2016).

Separation of Matchups by Optical Water Class and Calculation of Uncertainties
The OC-CCI product suite includes memberships of each pixel in 14 optical classes, following the fuzzy logic classification methodology of Moore et al. (2009), with some modifications as described in Jackson et al. (in press). The memberships of the 14 optical water classes associated with the satellite matchups were extracted alongside the radiometric and biogeochemical properties required for the validation. Each matchup point was then assigned to the optical class that had the dominant membership in the central match-up pixel. The statistical analyses were carried out for the global dataset as well as for subsets of the data grouped according to dominant optical class. These uncertainty metrics per optical class were then used to assign uncertainties at each pixel, by calculating the weighted average of the metrics associated with each of the water classes, with the membership of the classes in that pixel providing the weighting function.
Statistical analysis used in the assessment of each algorithm was based on that used by OC-CCI (see Brewin et al., 2015). The Kolmogorov-Smirnov test for normality of the in situ matchup data showed a significant deviation of normality for log 10 transformed and un-transformed data (p <0.001). Therefore, for completeness, the statistical analysis was conducted for both log 10 transformed using parametric tests and for un-transformed data using non-parametric, rank-based, statistics. Statistical metrics computed were: • Pearsons correlation coefficient for log 10 transformed data, and Spearman's correlation for un-transformed data (r p and r s respectively), • Root mean square differences for log 10 transformed and untransformed data (RMSD -, in log 10 for the transformed data and mg m −3 of POC for the untransformed data), • Bias for log 10 transformed data and untransformed data ((δ), in log 10 for the transformed data and mg m −3 of POC for the untransformed data), • Median absolute percentage deviation between predictions and observations (MAPD in %), an estimate of bias and precision was estimated as the interquartile range (IQR) of the absolute percentage deviation for the untransformed data, • Center pattern root mean square deviation ( ), in mg m −3 ), which is the error of the predicted values with respect to the observations, irrespective of the average bias between the two distributions. It is related to the RMSD and bias as follows: 2 =δ 2 , and was calculated for both the log 10 transformed and un-transformed data, • Slope and intercept (S, I) from a Type-II linear regression (Reduced Major Axis) for log 10 transformed and untransformed data.
To provide an indication of the stability of the statistics and to compute confidence intervals on them, bootstrapping (Efron, 1979;Efron and Tibshirani, 1993) with random re-sampling and replacement was used to construct 1,000 different datasets from which confidence intervals were computed for some statistics. Statistics were computed for the whole dataset, and also after segregating the data according to dominant water class, at the central matchup pixel.

Distribution of Validation Data and Matchups
Geographic distribution of the data in the in situ data base is shown in Figure 1, with the color scale representing the average POC concentration (mg m −3 ) over measurements made in the top 10 m (total N = 63, 704, depth averaged N = 19, 282). Figure 2 shows the in situ data with valid satellite data matchups (N = 3891), colored according to the concentration of POC (mg m −3 ) measured. A number of expected patterns can be observed. Firstly, the number of valid matchups is much reduced relative to the total number of data points in the in situ database. This is a result of several factors including: the averaging of the in situ data over the top 10m, elimination of data points outside of the OC-CCI period (1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012); and failure of matchup when there were no satellite observations corresponding to the date and location of the in situ sampling. Other factors are spatially heterogenous, such that there is some regional skew in the likely success of obtaining a matchup, for example cloud cover has a spatially and temporally variable influence on matchup availability, such that in some areas it is less likely that a matchup will be found, e.g., in the tropics, or during winter in the mid latitudes-where satellite coverage was relatively poor. There is a high concentration of data in the Atlantic-as a result of the substantial AMT cruise data. Similarly, there is a high concentration of data from some coastal regions, particularly in the northern hemisphere. Observed concentrations of POC follow an expected distribution, with higher values in coastal and shelf regions, and lower concentrations in the oligotrophic gyres (Figure 2). A bimodal distribution can be observed in the in situ data, as a result of the high numbers of data from the AMT cruises which go predominantly through the oligotrophic gyres, and from coastal regions, which tend to be predominantly high-POC areas (Figure 3).

Algorithm Performance
The histogram of the in situ data frequency distribution is replicated to a degree by all the algorithms (Figure 3). The histogram of Algorithm A is very similar to that of the in situ data, with both histogram shapes and peak locations reproduced. Algorithm B tends to overestimate the POC values relative to in situ data at the lower end of POC concentrations, resulting in the first peak in the histogram being offset toward the right of the figure (i.e., toward higher POC concentrations). By contrast, Algorithm C underestimates POC slightly, at the lower end. The range of estimates provided by Algorithm D is narrower than that of the in situ data. Algorithm E also has a narrower range, and in general underestimates POC concentrations. Both algorithms D and E show significant shifts in both peaks of the distribution relative to those of the in situ data.
As could be expected from the histograms (Figure 3), there is generally a high correlation between the algorithm estimates and in situ measurements of POC, with r values between 0.75 and 0.82 for all algorithms tested (Figures 4A-E). It is of note that all algorithms show a small cloud of underestimated concentrations associated with high in situ POC. These points are from different data sources, and from different regions, and as such, outliers do not appear to be related to any systematic errors in the in situ measurements. Furthermore, not all of these substantial underestimates are associated with common data points across all the algorithms. It is important to note, however, that although we compare the satellite-derived POC with in situ POC over a very broad range, extending to very high values >1,000 mg m −3 , the original formulations of the algorithms were not, in general, implemented with such high POC data.
The scatter plots and uncertainty statistics for Algorithm A shown in Figure 4A and Table 1 suggest that this algorithm, on average, performs better than the other algorithms shown in Figures 4B-E. Algorithm A has the smallest bias, a linear fit that is closest to the 1:1 line, with a slope of 0.92 and a relatively small intercept value (the second smallest after Algorithm C), and also the highest (albeit slightly) correlation coefficient and the smallest values (albeit slightly) of the Root Mean Square Difference. The statistical parameters are consistently good for Algorithm A, though the results for Algorithm C also present some advantages. The performance of Algorithm C has a slope of 1 and intercept closest to zero, but has a slightly higher RMSD, CRMSD and bias compared with Algorithm A (Figure 4C). Algorithm B has relatively small bias, however other parameters are clearly inferior compared with Algorithm A (Figure 4B). Some statistical parameters associated with the performance of Algorithm D are significantly inferior compared with those associated with algorithms A and C, especially the slope and intercept of linear fit which indicate a large deviation of this fit from the 1:1 line ( Figure 4D). As a result, low values of POC are overestimated, and high values underestimated, compared with the in situ data. The statistical performance for Algorithm E is poorest with significant negative bias and the best fit regression deviating greatly from the 1:1 line ( Figure 4E). Performance statistics are summarized for all algorithms in the first section of Table 1. A statistical analysis was also conducted for the nontransformed data, and provided in the second section of Table 1.

Performance Per Water Class
To further understand algorithm performance, the matchups were separated by their optical water class. The total number of matchups per water class, and their distribution spatially, is shown in Figures 5, 6 respectively. The lower water classes are associated with oligotrophic regions, such as in the gyres, whilst the higher classes correspond to progressively more turbid shelf and coastal waters. Statistical performance of the algorithms across the different water classes is summarized in Figure 7.
Performing the statistical analysis across the different water classes reveals some similarities in performance across all algorithms, and some consistency with the overall performance (Figure 7). Algorithms A, C, and D show little differences between them in terms of RMSD across the water classes, with the exception of water class 14 (i.e., the most optically complex waters) (Figure 7A). The RMSD associated with Algorithm E follows the same broad pattern as algorithms A and D across the water classes, although its RMSD is substantially higher than those algorithms in all instances. Algorithm B shows slightly  For log 10 : rs is Spearman's correlation; slope and intercept for were calculated with a Type-II linear regression model (Major Axis) and the statistics provided have uncertainty estimates (95% confidence interval), derived from 1,000 bootstrap realizations. For untransformed data: rp is Pearson's correlation; , δ and are provided with uncertainty estimates (95% confidence interval), derived from 1000 bootstrap realizations; slope and intercept for were calculated with a Reduced Major Axis regression model; MAPD is the median absolute percent deviation between predictions and observations and is a measure of bias, and IQR is the interquartile range of the absolute percent error, and is a measure of precision. Bold italic numbers are the best results for each statistic, for some is the highest value (e.g., rs or rp), for some is the lowest (e.g., , δ, , Intercept and MAPD) and for some is the closest to one (e.g., Slope).
higher RMSD than the other algorithms in some of the more oligotrophic water classes (1-5), then largely follows the same patterns as algorithms A and D. The cloud of outlier points observed in the overall comparisons are associated with water classes 6, 7, and 8, where RMSD is relatively high for most of the algorithms. The estimates of bias for each of the water classes is consistent with the results from the global application e.g., Algorithm A shows very little bias, and this is consistent across water classes, whilst Algorithm E generally underestimates across all classes ( Figure 7B). Algorithm C shows slight negative bias across most of the water classes, except 12 and 13, whilst algorithms D and B show slight (larger) positive bias in the more oligotrophic classes. Algorithm B shows large positive bias in water class 13, as with Algorithm C; however, both algorithms show negative bias in class 14. The center-pattern (or unbiased) RMSD in Figure 7C shows the largest uncertainties are associated with water classes 6, 7, and 8 whilst the largest differences in algorithm performance across the water classes are found in water classes 12, 13, and 14. The regional variability in algorithm performance, which can be associated with the optical water classes, is discussed further in Sections 3.4, 3.5, and 4.2 below.

Mapped Products
In addition to application to the matchups points, the POC algorithms can also be applied to global satellite data to compare algorithm performance at synoptic scales. POC concentrations FIGURE 4 | Scatter plots and statistics detailing performance of (A) Algorithm A by Stramski et al. (2008) (using R rs ), (B) Algorithm B by Stramski et al. (2008) (using b bp ), (C) Algorithm C by Loisel et al. (2002), (D) Algorithm D by Gardner et al. (2006), and (E) Algorithm E by Kostadinov et al. (2016). Statistical parameters are as follows: correlation coefficient (r), bias (δ), Root Mean Square Difference ( ), Center Patterned Root Mean Square Difference ( ), slope (s), intercept (I). The solid line is the 1:1 line, and the dashed line is the line of best fit for the linear regression.   were estimated by applying algorithms A-E to a sample set of OC-CCI monthly products from May 2005 (Figure 8). All algorithms produce the broad patterns that were observed in the in situ measurements and would be expected to be associated with POC, i.e., increased POC associated with regions of high [Chl] in upwelling zones, lower concentrations in the oligotrophic gyres, and higher concentrations in turbid shelf and coastal regions. However, there are some notable differences between the POC concentrations estimated by the different algorithms. Algorithm A and C perform similarly (Figures 8A,C). Algorithm B (Figure 8B) produces estimates that are generally higher relative to Algorithm A and C, particularly at low POC concentrations. Algorithm D (Figure 8D) underestimates at higher POC concentrations relative to all other algorithms, whilst at low concentrations its estimates are generally higher than algorithms A and C, and lower than algorithm B. In contrast, Algorithm E ( Figure 8E) estimates lower POC concentrations relative to the other methods. A transect, extracted along from 20 o west, shows the regional differences in algorithm estimates for POC, and the associated OC-CCI [Chl] for reference ( Figure 8F). Histograms of these products (not shown), show a similar range to the in situ data used for validation (≈ 10-1,000), though values greater than 1,000 mg m −3 are scarce in the satellite products (though higher values are more frequent with Algorithm C than with Algorithm A). Also, the pronounced bimodal frequency distribution of the in situ data is absent in the satellite products, which show a unimodal distribution. This difference between the frequency distribution of in situ data and satellite products could have had an impact on the comparative statistics presented here.

Mapped Uncertainties
Each algorithm has uncertainties associated with its performance for each water class, calculated from the validation exercise. These values can be used to estimate uncertainties for pixels outside of direct matchup locations, using a weighted average based on the percent membership to each of the classes. This procedure was applied to the data in Figure 8 to calculate per pixel RMSD and bias. As could be expected from the performance of the algorithms across the substantial in situ data set, Algorithm A (R rs based algorithm of Stramski et al., 2008) shows low RMSD and bias when uncertainties are calculated (Figures 9, 10). Algorithms B and particularly Algorithm C show higher RMSE, particularly in the gyre regions, consistent with the distribution of the matchupbased estimates in Figure 3. Algorithm D has low RMSD and bias in the oligotrophic gyres, but relatively higher values appear in the more productive (upwelling and coastal) areas. Algorithm E shows high RMSE throughout the image. Highest positive bias estimates are associated with algorithm B, relating to overestimation in the gyre regions ( Figure 10B). Bias for Algorithm E (Figure 10E) shows negative bias globally, consistent with the general trend toward underestimation shown in Figures 4E, 8E.

Variability in Algorithm Performance
As a Result of Input Satellite Data, Choice of Optical Models, and Regional Optical Properties The strength of the relationships between bio-optical properties and POC concentration has been quantified in the original studies where the algorithms examined above were formulated, and in some cases, validated against satellite data. To the extent that some of the satellite data used in the original studies are included in the match-up dataset used here, the impact on the results is likely to be small because the size of the match-up data used here is much larger than that used in any of the previous studies. Furthermore, the comparisons presented here are based on a common satellite product suite (OC-CCI). However, further insights are gained from the bigger in situ data base assembled for this study, and from the climate-quality data set provided by OC-CCI.
Results from the OC-CCI based validation are generally consistent with the observations made by Stramski et al. (2008). The in situ data used by Stramski (Stramski et al., 2008) were limited to the south eastern Pacific and eastern Atlantic Oceans and covered a POC range of 12-270 mg m −3 , whilst the range for the data used here cover a broader range (2.7-8,097 mg m −3 ). The waters sampled by Stramski et al. (2008) ranged from upwelling to oligotrophic, with significant contributions of mineral particle matter to the particle assemblage at some stations. Consistent with the results here, Stramski et al. (2008) found that empirical relationships between R rs and POC performed better than two-step approaches where an inherent optical property (IOP) is derived from the R rs and then related to the POC. They also indicated relatively better performance of the R rs relationship over that derived from b bp , highlighting the uncertainties in the derivation of b bp as one source of error in estimation of POC from IOPs. Additionally, the relationships between POC and IOPs would be expected to vary as a result of the particle size distribution (PSD), the refractive index of particles, and the fractional POC concentrations within different particle types in the assemblage. For example, significant variations in the POC-specific backscattering coefficient has been reported for different water bodies of the Southern Ocean (see Figure 1 in Stramski et al., 1999). Whilst variability in the POC-specific backscattering introduces uncertainty in total POC estimates, a better understanding of the relationship between particle characteristics and IOPs has the potential to provide further insight into the composition of the POC pool, and therefore to improve POC algorithms. Hence, it is important to pursue this line of algorithm development, even if the current performance of these methods might not be as good as that of some more empirical approaches. The line of investigation that accounts for the contributions of different types of POC to their optical properties is already yielding fruit (Stramski et al., 2008).
The algorithm of Loisel et al. (2002) (Algorithm C) is also a two-step approach, drawing on the relationship between b p and POC, via a relationship between b bp and [Chl]. Though Loisel et al. (2002) did not directly validate their POC estimation from SeaWiFS data, they found a good match between retrieved b bp and that measured in situ in a previous study (Loisel et al., 2001). Loisel et al. (2002) did indicate variability in the b bp : [Chl] relationship, linked to changes in the particulate pool; they highlighted the variable influence of small particles consisting of dead cells, grazers, and minerals. Gardner et al. (2006)  (Algorithm D ) also uses a two-step approach, exploiting the relationship between the beam attenuation coefficient (c p ) and POC. This relationship was shown to be strong, when in situ POC was compared with transmissometer profiles. Although no fully-validated algorithm exists for routine derivation of c p from satellite ocean color measurements, Gardner et al. (2006) showed in situ c p was strongly correlated with [Chl] (r = 0.845-0.897) and K d (490)) (r = 0.846-0.878) derived from SeaWIFS data over different oceanic regions.
Algorithm E, by Kostadinov et al. (2016), addresses some sources of variability between optical properties and POC, such as the influence of the particle size distribution, which was also identified as being important by Stramski et al. (2008). The method of Kostadinov et al. (2016) uses spectral values of b bp to derive a PSD, which is then converted to POC (and phytoplankton carbon) using allometric relationships. The focus of the Kostadinov et al. (2016) paper was on phytoplankton carbon, computed as 1/3 of POC. Relationships between the phytoplankton carbon estimated from in situ PSD measurements and direct analytical determinations, showed r values between 0.5 and 0.714, depending on the limits of integration of the PSD, with wider limits resulting in the lower r. As discussed by Stramski et al. (2008), Kostadinov et al. (2016) also notes the impact of uncertainties in retrieved backscattering arising from both measurement and theory. In particular, assumptions of sphericity and homogeneity used in Mie theory are likely to be violated in real seawater particle assemblages, particularly for backscattering and in coastal and more productive areas (which are included in the database used here). For a more detailed discussion of the sphericity and homogeneity assumption, see Kostadinov et al. (2009) and refs. therein. Future work needs to focus on developing and more widely adopting bio-optical models that relax the Mie assumptions (e.g., Bernard, 2004, 2006;Clavano et al., 2007;Matthews and Bernard, 2013;Robertson Lain et al., 2017). Understanding of PSD variability, how it relates to backscattering, and how FIGURE 13 | Summary of (A) POC to [Chl] relationships including the in situ POC data collated in this study and the extracted OC-CCI [Chl (gray dots)] and the relationships proposed by Morel (1988) (green line), Stramska and Stramski (2005) (red line), Legendre and Michaud (1999)  particle composition affects scattering over broad marine regions are required to develop further such detailed mechanistic approaches.
General sources of error associated with any ocean-color product include differences introduced by choice of sensor, sensor calibration, and the atmospheric correction procedure used to retrieve R rs . In addition to these, a further consideration, particularly in the cases where algorithms use IOPs, is the methods used to derive the IOP product from the R rs data. The OC-CCI processing uses the Quasi-Analytical Algorithm (QAA) of Lee et al. (2002) to calculate IOPs, including the b bp values used in this study. The original study by Kostadinov et al. (2016) used the method of Loisel and Stramski (2000) to estimate b bp . Stramski et al. (2008) also used different formulations to calculate b bp from R rs , finding a corrected version of QAA produced a better estimate of b bp , and a strong relationship with POC (r = 0.933). The effect of the choice of method to derive b bp on the POC estimates requires further consideration, which goes beyond the scope of this study, as this IOP is particularly poorly understood and validated (Lee et al., 2002). The differences in algorithm performance across the different water classes indicate that regional variability in performance of the different algorithms can be expected. This is confirmed in the mapped regional distribution of uncertainties (Figures 9,  10). These results suggest that algorithms either need to be selected carefully for applications in different regions, or that a selection of optimal algorithms may have to be blended for a global product (as done in Jackson et al., in press). This point is also raised in Stramski et al. (2008), where different formulations are provided for global application, and excluding upwelling data. Uncertainties in the underlying satellite data may also be responsible for a portion of this variability: for example, an IOP model may be more or less suitable to derive backscattering. It should also be noted that there can also be uncertainties in the in situ data and the validation process that can affect the assessment of uncertainties in algorithm performance. Ideally, multiple replicates will be taken to quantify uncertainties in the in situ measurement, and instruments will have a well-characterized calibration history, and be processed with community endorsed methodologies. For POC, the issue of blank correction was already highlighted in Section 2.1. Uncertainties resulting from variable methods used for the in situ data collated for this study may influence the results presented here, particularly at low POC concentrations. In terms of comparison to matchups, further uncertainties can be introduced by comparing values at different scales, i.e., point measurements may not represent the average over a pixel (in this case of 4 km in size). These uncertainties will limit the ultimate accuracy to which any satellite based product can be derived and validated. However, issues of spatial mismatch are beginning to be addressed with the use of underway systems (for example, . Despite the difficulties highlighted above, the overall performance of the algorithms studied here is encouraging. Percentage error estimates based on the OC-CCI methodology show how well these algorithms can generate products suitable for the needs of the scientific community. For example, the percentage errors associated with the Stramski et al. (2008) R rs algorithm applied to OC-CCI data in May 2005 (Figure 11), show that a majority of pixels fall within an error range that is widely accepted by the ocean color community for [Chl] (30%; GCOS, 2011).

Variability in the Ratio of Particulate Organic Carbon to Chlorophyll-a
Further perspective on the performance of the different algorithms can be gained by considering the covariance between POC and [Chl]. The relationship between the in situ POC data and satellite [Chl] is shown in Figure 12A, where the color indicates the associated dominant optical water class. These data then forms the background for each of the subsequent panels of Figure 12, which show the relationship between the POC estimated by each algorithm, and the satellite [Chl]. Algorithm A shares a commonality in method with the algorithms used to derive satellite [Chl], in that the same reflectance ratios are used to derive POC, and [Chl] (at lower concentrations); hence it shows a very constrained relationship in this domain of the parameter space ( Figure 12B). Other algorithms capture the scatter in the POC:[Chl] relationship to a greater or lesser degree, though offsets can be seen, associated with the behavior identified in the validation exercise, i.e., overestimation of POC relative to lower [Chl] in the case of Algorithm B (Figure 12C), and underestimation of the ratio at low [Chl] using POC from Algorithm C (Figure 12D-though it should also be noted that this algorithm is also dependent on [Chl] to derive POC). As with Algorithm A, Algorithm D shows a relatively constrained relationship between POC and [Chl]. Algorithm E produces similar variability between POC and [Chl] as seen in the in situ data, in terms of shape and scatter of the curve, but the bias of this algorithm toward lower estimates of satellite derived POC is clear ( Figure 12D).
The ratio of POC to [Chl] is important in the context of the discussion here for two reasons. Firstly, this ratio is important in the context of biogeochemical modeling, and the ecological and physiological processes that influence this ratio. Secondly, empirical relationships between POC and chlorophyll have been developed, which can be applied to satellite derived estimates of [Chl]. As mentioned above, these algorithms are typically similar to those employing blue:green reflectance ratios (e.g., Algorithm A from Stramski et al., 2008), and as such were not initially considered in the algorithm intercomparison here. Figure 13A shows a number of these empirical relationships, against a background of the same in situ POC and [Chl] data as shown in Figure 12. Figure 13B shows POC estimated using these [Chl]-based algorithms on OC-CCI [Chl] as a function of the blue-green R rs reflectance ratio. The same reflectance ratio is employed by Stramski et al. (2008) to derive POC, and is also used in a number of empirical [Chl] algorithms. A linear regression of the in situ POC concentrations, against the satellite derived [Chl], results in an r 2 value of 0.70. Using the various relationships shown in Figure 13A to estimate POC based on the satellite [Chl] returns r 2 values between 0.63 and 0.69, lower than those returned for all the other algorithms assessed. The [Chl] based approaches show in Figure 13 produce RMSD values (between 0.27 and 0.47) and bias (between −0.03 and 0.117) in the same range as the other algorithms.
Even though to first order Chl and POC are positively correlated in the global ocean, a residual scatter in the relationship remains (e.g., in satellite observations- Figure 12A, and in situ observations as well-e.g., Kostadinov et al., 2012). Ideally, a POC algorithm should be able to retrieve POC independently of [Chl] and capture the variable POC/[Chl] ratio correctly. Note that this ratio can vary due to both variability in the fraction of living phytoplankton carbon in the total POC pool, due to the physiology and photoacclimation of the phytoplankton component of POC (Geider, 1987;Geider et al., 1998;Behrenfeld et al., 2005), and species specific differences among phytoplankton themselves (Stramski, 1999). Therefore, independent knowledge of total POC, living phytoplankton carbon, and [Chl a] should be the goal of future bio-optical algorithm development.

Estimates of Total Pools of Carbon
The OC-CCI archive can be used to estimate total pools of POC in the mixed layer, taking into account interannual and regional variability, which is well captured by this merged dataset. Algorithms A-D were applied to the monthly OC-CCI version 2 data, and the values integrated over the mixed-layer depth (derived from MIMOC, Schmidtko et al., 2013), assuming homogeneity over the mixed layer. These were then averaged over all the months and for all the years of the OC-CCI version 2 (1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012) to provide estimates of the average standing pool of POC as follows: Algorithm A: 0.86 Pg C, Algorithm B: 1.3 Pg C, Algorithm C: 0.87 Pg C, Algorithm D: 0.77 Pg C. These are larger than the estimate of Gardner et al. (2006) and smaller than the estimate of Stramska (2009). Comparison of these estimates with those of phytoplankton carbon pools estimated in a parallel study (Martinez-Vicente et al., in review), indicates that phytoplankton carbon represents between 17 and 48% of the total POC pool. Whilst this ratio shows considerable variability, the often assumed value of 1/3 for phytoplankton carbon:POC falls within this range. High levels of variability in the phytoplankton carbon to POC ratio were also observed in situ by Graff et al. (2015). Satellite based estimates calculated by Kostadinov et al. (2016) (using a different set of mixed layer depth values) suggest a phytoplankton carbon standing stock of around 0.24 Pg C, implying a corresponding POC stock of around 0.72 Pg C when using the 1/3 assumption. Kostadinov et al. (2016) showed the estimated phytoplankton standing stock to be similar to estimates derived from the application of both the Stramski et al. (2008) b bp based algorithm, combined again with a 1/3 assumption and the method of Behrenfeld et al. (2005) to SeaWIFS data, and to model estimates from the Coupled Model Intercomparison Project 5 (CMIP5). The estimate of phytoplankton carbon standing stock from Kostadinov et al. (2016) is similar to that estimated by other size class based approaches, such as that of Roy et al. (2017) which used size classes derived from absorption to estimate a total phytoplankton carbon stock of 0.26 Pg C. Though the global estimates of POC from the different approaches assessed here are quite similar to each other, the differences are more pronounced at smaller scales, as can be seen in Figure 8F.

CONCLUSIONS
A variety of POC algorithms were applied to matchup pixels extracted from the satellite OC-CCI ocean color data, and validated against in situ data. The database used here represents the largest collection of in situ POC data available, to the author's knowledge. The five algorithms showed strong predictive capacity for estimating POC, with Algorithm A (based on R rs - Stramski et al., 2008) and C (based on Loisel et al., 2002) performing well across the broad range of the in situ dataset. Algorithms A and C performed consistently across different water types as defined in the OC-CCI data. From the water class based validation, errors can be estimated per pixel. For Algorithm A and C, the errors were mostly within the range requested by the user community. These results suggest a maturity in POC algorithms and their suitability for production of long term time series for climate related studies. However, several key points of development are highlighted from the inter comparison of the different algorithms and the various studies reviewed here. Greater knowledge of the composition of the particulate pool, and how it affects the IOPs of the oceans, may allow increased accuracy of POC algorithms (within the constraints of the sensitivities of current satellite ocean color radiometry), as well as providing further information on different types of particles, many of which play important roles in water quality and ocean biogeochemistry. To support this aim, further in situ data should be collected, including additional measurements to provide detail on phytoplankton community size structure, physiology, and photoacclimation. Further, it is recommended that future work seeks to use consistent methodology for blank correction of POC measurements, and clarify any trends in the low POC region which may be influenced by these uncertainties. Further understanding of the sources of variability between POC and optical parameters can then be incorporated in to future, semi-analytical algorithms. New understanding of these relationships may also inform future sensor development (e.g., hyperspectral sensors) and optical modeling techniques.

AUTHOR CONTRIBUTIONS
HE: All the calculations, preparation of figures, lead on writing the manuscript. VM: Project manager for Pools of Carbon project; development of matchup processing and statistical analysis. RB: Provision of code for statistical analysis based on OC-CCI methodology. GD: Provision of in situ data, content on b bp uncertainties and impact of particle sizing. AH: Perspectives on community requirements, particularly for ecosystem modeling. TJ: Provision of code for calculation of per optical water class uncertainties, based on the OC-CCI methodology. HK: Collation of the in situ database. TK: Algorithm provider, content on algorithm performance relating to particle size distributions. HL: Algorithm provider, content on regional variability in algorithm performance. SR: Input on statistical analysis and relative contributions of phytoplankton to the POC pool. RR: Collation of the in situ database. ST: Data contribution and guidance on POC data for the Southern Ocean. DS: Algorithm provider, content on history of POC algorithm development, and interpretation of comparative analysis of different algorithms. TP: Project leader, scientific advice. SS: