Monitoring Marine Habitats With Photogrammetry: A Cost-Effective, Accurate, Precise and High-Resolution Reconstruction Method

Underwater photogrammetry has been increasingly used to study and monitor the three-dimensional characteristics of marine habitats, despite a lack of knowledge on the quality and reliability of the reconstructions. More particularly, little attention has been paid to exploring and estimating the relative contribution of multiple acquisition parameters on the model resolution (distance between neighbor vertices), accuracy (closeness to true positions/measures) and precision (variability of positions/measures). On the other hand, some studies used expensive or cumbersome camera systems that can restrict the number of users of this technology for the monitoring of marine habitats. This study aimed at developing a simple and cost-effective protocol able to produce accurate and reproducible high-resolution models. Precisely, the effect of the camera system, ﬂying elevation, camera orientation and number of images on the resolution and accuracy of marine habitat reconstructions was tested through two experiments. A ﬁrst experiment allowed for testing all combinations of acquisition parameters through the building of 192 models of the same 36 m 2 study site. The ﬂying elevation and camera system strongly affected the model resolution, while the photo density mostly affected bundle adjustment accuracy and total processing time. The camera orientation, in turn, mostly affected the reprojection error. The best combination of parameters was used in a second experiment to assess the accuracy and precision of the resulting reconstructions. The average model resolution was 3.4 mm, and despite a decreasing precision in the positioning of markers with distance to the model center (0.33, 0.27, and 1.2 mm/m Standard Deviation (SD) in X, Y, Z, respectively), the measures were very accurate and precise: 0.08% error ± 0.06 SD for bar lengths, 0.36% ± 0.51 SD for a rock model area and 0.92% ± 0.54 SD for its volume. The 3D geometry of the rock only differed by 1.2 mm ± 0.8 SD from the ultra-high resolution in-air reference. These results suggest that this simple and cost-effective protocol produces accurate and reproducible models that are suitable for the study and monitoring of marine habitats at a small reef scale.


INTRODUCTION
Mapping marine habitats on a large scale has been of primary interest for years as it is essential to estimate, to understand, and to predict biotic assemblages and the distribution of marine biodiversity (Tittensor et al., 2014). To date, knowledge on the patterns and changes of marine biodiversity in Europe and its role in ecosystem functioning has been scattered and imprecise.
In the past few years, however, the study of marine biodiversity has risen from relative obscurity to becoming an important issue in European policy and science. The reason is obvious: the seas are not exempt from the impacts of the Anthropocene and the diversity of life in marine ecosystems is rapidly changing (McGill et al., 2015). Nevertheless, most indicators of biodiversity loss relate to the global scale (Pimm et al., 2014), whereas it is locally that biodiversity needs to be monitored in order to capture a potential consistent loss, which is still not properly detected at the global scale (Dornelas et al., 2014). Key data for studying the effects of anthropogenic pressures on the marine environment at different scales, from local to global, is in fact lacking (Halpern et al., 2008). Conversely, two-dimensional maps are insufficient for the understanding of fine scale ecological processes as the structural complexity of benthic habitats has been shown to play a major role in structuring their constitutive ecological assemblages (Friedlander et al., 2003;Kovalenko et al., 2012;Graham and Nash, 2013;Agudo-Adriani et al., 2016;Darling et al., 2017). Besides, monitoring marine habitats requires the accurate measurement of lengths, areas and volumes that are difficult or even impossible to get in situ with traditional methods. A variety of tools and techniques have been used to measure these metrics, from in situ diver measurements (Dustan et al., 2013) to remote sensors such as airborne Lidar (Wedding et al., 2008). These are either punctual with very low spatial resolution (distance between neighbor points/vertices), not accurate enough, or limited to very shallow waters. There is certainly a need for a cost-effective and operational, accurate [closeness of a measurement to the true value (Granshaw, 2016)], precise (low variability, highly reproducible) and high resolution three-dimensional (3D) protocol for capturing the fine-scale architecture of marine habitats.
Modern photogrammetry, also known as structure-frommotion, defines the 3D reconstruction of an object or scene from a high number of photographs taken from different points of view (Figueira et al., 2015). Photogrammetry was first developed for terrestrial applications, and it was later introduced for underwater use by archeologists in the 1970s (Pollio, 1968;Drap, 2012). This technique has seen impressive growth during the last decade and is now extensively used in marine ecology to study interactions between habitat structure and the ecological assemblages (Darling et al., 2017). It has also been used to automatically map the seabed from metrics such as luminance (Mizuno et al., 2017), and simply as a tool for monitoring growth of benthic sessile organisms (Bythell et al., 2001;Chong and Stratford, 2002;Abdo et al., 2006;Holmes, 2008;Gutiérrez-Heredia et al., 2015;Reichert et al., 2016). Based on this technique, recent developments in hardware and image processing have rendered possible the reconstruction of high-resolution 3D models of relatively large areas (1 ha) (Friedman et al., 2012;González-Rivero et al., 2014;Leon et al., 2015). However, the underwater environment is still challenging due to many factors: no GPS information is available to help positioning the photographs, light refraction reduces the field of view and makes necessary the use of a wide angle lens that increases image distortions (Guo et al., 2016;Menna et al., 2017), and large lighting and water clarity variations affect the image quality and consequently the calculation of the position and orientation of the photographs (i.e., bundle adjustment) . Despite these environmental constraints, many studies showed relatively accurate 3D models reconstructed with photogrammetry, notably individual scleractinian corals (2-20% accuracy for volume and surface area measurements, depending on colony complexity) (Bythell et al., 2001;Cocito et al., 2003;Courtney et al., 2007;Lavy et al., 2015;Gutierrez-Heredia et al., 2016).
Monitoring benthic biodiversity through 3D reconstruction requires a sub-centimetric to millimetric resolution, with a corresponding accuracy (closeness to true positions/measures) and precision (variability of positions/measures). This is because some habitats are highly complex in structure and have low growth such as coral reefs (Pratchett et al., 2015) or coralligenous reefs (Sartoretto, 1994;Garrabou and Ballesteros, 2000;Ballesteros, 2006). In order to be able to monitor these important habitats for biodiversity, very high-resolution and accurate reconstructions to detect slow growth or small changes in structure over time are needed. Nevertheless, improve resolution and accuracy comes at a cost. In this instance, the finer the resolution for a given surface area, the more time-consuming the modeling process is, and with a very high number of images the reconstruction software can even reach the machine limits (Agisoft, 2018). Ultimately, large photogrammetric models are usually made to the detriment of the final resolution and require expensive, hard to deploy "Underwater autonomous vehicles" (UAV) (Johnson-Roberson et al., 2010). There is consequently a need to strike a balance between a robust and highly accurate protocol, and a methodology that remains relatively low-cost and time-effective, in order to map numerous sites within a largescale monitoring system. The correlation between local ecological processes, quantified on punctual 3D models, and continuous macro-ecological variables available at broader scale, could help better understand and manage marine biodiversity of entire regions (González-Rivero et al., 2016).
Over the last decade, many research teams have developed their own methodology, some using monoscopic photogrammetry (Burns et al., 2015a(Burns et al., ,b, 2016Figueira et al., 2015;Gutiérrez-Heredia et al., 2015), and some using stereo photogrammetry (Abdo et al., 2006;Ferrari et al., 2016;Bryson et al., 2017;Pizarro et al., 2017). Others have worked on the effect of trajectory , camera orientation (Chiabrando et al., 2017;Raczynski, 2017), and camera system (Guo et al., 2016) on the resulting model accuracy. Others focused on the repeatability of measurements such as volume (Lavy et al., 2015) and surface rugosity in different environmental conditions . Most of these studies had the same goal underpinning: estimating the accuracy and precision of the models and their derived metrics, and eventually assessing the effect of a given driver. If some of these effects are well documented in traditional photogrammetry, there is still a lack of knowledge on their transposition below the surface. Indeed, underwater photogrammetry remains understudied. Moreover, none of the underwater studies explicitly tested different configuration arrangements on one given study area in order to assess the relative effects of all the parameters on the model, including resolution, accuracy and precision. Knowing the best possible results in underwater conditions remains a research gap. Recently, an effort has been made to assess the influence of photo density (the number of photos per m 2 ) on volume-area (Raoult et al., 2017) and rugosity measurements . Although most software manuals recommend a minimum overlap between photos to ensure correct bundle adjustments and accurate models [80% overlap + 60% side-overlap (Agisoft, 2018)], it proved difficult to quantify the influential factors. So far it seems that no study about underwater photogrammetry has considered either the effect of flying elevation, or the total processing time (TPT) of the images as criteria for the definition of an operational and cost-effective monitoring method. These two factors are yet crucial to take into account as they are likely to affect both model resolution and the capacity to monitor numerous sites.
Based on monoscopic multi-image photogrammetry, this study aims to define an operational, cost-effective methodology of producing high-resolution models of surfaces ranging between a few square meters to 500 m 2 , in order to monitor marine biodiversity of temperate and tropical habitats. This study will define a method that is easy to handle, compatible with off-theshelf commercial softwares, with a sub-centimetric resolution and the lowest processing time possible. Through the setup of two experiments, we tested in natura the influence of the camera system, flying elevation, camera orientation and photo density on the resolution, accuracy and precision of 3D reconstructions. More particularly, this study addresses the following questions: (1) What is the relative influence of camera system, flying elevation, camera orientation and photo density on: (a) The accuracy of bundle adjustment? (b) The number of points in the dense cloud? (c) The model resolution?
(2) What is the best value for the photo density considering the trade-off between accurate bundle adjustment, high dense cloud size, high resolution and low processing time? (3) Having considered the best combination of these parameters in relation to the photo density, what is the expected accuracy and precision for:

MATERIALS AND METHODS
Two experiments were carried out. The first experiment (hereafter referred to as "experiment 1") entailed testing all combinations of the parameters and assessing their relative effects on the accuracy of the bundle adjustment and model resolution. The second experiment (hereafter "experiment 2") entailed measuring the accuracy and precision of the models, reconstructed with the best combination of parameters found in experiment 1. The process flow diagram is illustrated in Figure 1.

Experimental Design
Experiment 1: Defining the Best Set of Parameters for Acquisition The first experiment tested different combinations of parameters, assessing their relative effects on the accuracy of bundle adjustment and model resolution. The experiment took place at 15 m depth in the Mediterranean Sea (Calvi bay, Corsica, France) and corresponded to a 6 × 6 m patch of sand in between dense Posidonia oceanica meadows, with a mixture of existing artificial structures and objects placed across the area. All acquisitions were conducted using two different cameras: • A Nikon D3S in a waterproof Seacam housing, mounted with a Nikon 20 mm fixed lens and a hemispherical dome port (hereafter referred to as "Nikon D3S + Dome, " see details Table 1). • A Nikon D4 in a waterproof Seacam housing, mounted with a Nikonos RS 20-35 mm marine lens (hereafter referred to as "Nikon D4 + RS, " see details Table 1).
The diver flew at two different elevations over the horizontal area (2.5 and 5 m above the seafloor, following a depth gauge), describing parallel and regularly spaced transects, commonly known as the "mow the lawn" method . He flew at a relatively low speed of 20-25 m/min with a time lapse of 0.5 s between pictures. Each acquisition was done in two steps: (i) a first with nadiral orientation (i.e., vertical downward) and (ii) a second with oblique orientation. Nadiral acquisition involved a single path along each transect, and oblique acquisition involved two paths per transect (one toward right, and the other toward the left). Oblique photos were taken with an angle of approximately 30 • from vertical.
All combinations of parameters (camera, elevation and orientation) were performed with three replicates per combination, to distinguish intrinsic and extrinsic variabilities. In total, this represents 24 photo datasets. For each dataset, eight sub-datasets were generated by uniformly sampling one over N images (1 ≤ N ≤ 8) to study the effect of photo density on the accuracy of bundle adjustment and model resolution. We ran all models with 100, 50, 33, 25, 20, 17, 14, and 12.5% of photos, which represents a total of 192 models ranging from 23 to 594 photos (see step 1 in Figure 1). For a given subsampling level (%), datasets constituted of both nadiral and oblique photos, by definition, resulted in higher camera densities than pure nadiral. All models were orientated and sized with a 1 × 1 m cross scale FIGURE 1 | Flow diagram of the methodology. C2M, Cloud-to-mesh.
Frontiers in Marine Science | www.frontiersin.org bar located in the center of the study area, and were aligned together with four 10 × 10 cm coded markers placed at the corners. The scale bar length was of the same magnitude as the edge of the scanned area (1 m vs. 6 m) (VDI / VDE, 2002). All models were produced within the same spatial extent (i.e., same bounding box), defined by the positions of the four markers placed at the corners of the study site. Camera settings were adjusted to achieve a sufficient balance between depth of field, sharpness, and exposure, with the constraint of a high shutter speed to avoid image blurring (20 mm focal length, F10, 1/250 s, ISO 400). Focus was set automatically at the beginning of each acquisition, then turned to manual.

Experiment 2: Measuring the Accuracy and Precision With the Best Set of Parameters
The aim of the second experiment was to evaluate the accuracy and precision of the models by assessing the variability in the positioning of coded markers, and comparing measurements and 3D geometries of true and modeled objects, using the best combination of parameters obtained in experiment 1. The site was located at 15 m depth in the Mediterranean Sea (Roquebrune Bay, next to Cap Martin, France) and corresponded to a 5 × 5 m sandy area, equipped with objects of known dimensions. These include a resin-made fake rock (surface area = 1.04 m 2 , volume = 20.03 L), a plant bucket (surface area = 0.191 m 2 , volume = 10.81 L), a 0.6 × 0.4 m pane (surface area = 0.24 m 2 ), ten 0.35 m bars, and four 0.85 m bars (see Figure 2). All these objects were tagged with 10 × 10 cm coded markers which were automatically detected by the reconstruction software with a subpixel accuracy. Their 3D position was exported for assessing the XYZ position and bar length measurements. Four coded markers fixed on a 0.9 × 0.9 m cross were used to scale and orientate the models. Additional coded markers were used at the corners of the study site, rock and bucket to automatically set the corresponding bounding boxes, thereby ensuring that the extent for all the global models and rock/bucket sub-models (separate models of the rock and bucket built for area, volume and cloud-to-mesh distance measurements) were identical. Six replicates were acquired with the best combination of parameters concluded from experiment 1 (camera system, flying elevation, camera orientation and photo density; see step 3 in Figure 1).

Models Processing and Camera Calibration
All models were processed with Agisoft PhotoScan Professional Edition V. 1.4.0 (Agisoft, 2018). This commercial software is commonly used by the scientific community (Burns et al.,   Figueira et al., 2015;Guo et al., 2016;Casella et al., 2017;Mizuno et al., 2017) and uses a classic photogrammetric workflow. This consists in the automatic identification of key points on all photos, bundle adjustment, point cloud densification, mesh building and texturing. We used the self-calibration procedure implemented in PhotoScan to be as simple and operational as possible, as it has been shown that the refraction effects at the air-port and port-water interfaces can be absorbed by the physical camera calibration parameters during self-calibration (Shortis, 2015). Calibration optimization was conducted for all properties after bundle adjustment (see step 1 in Figure 1).
For the experiment 2, the volume and surface area of the rock and bucket were computed and exported after completing a "hole filling" step to close each sub-model. The models were processed on a Windows 10 workstation with the following specifications: 64bits OS, 128 Gb RAM, 2 × NVIDIA GeForce GTX 1080 Ti 11 Gb, Intel Core i9-7920X 12 CPU 2.9 GHz.

Accuracy of Bundle Adjustment
For all processed models, we computed several metrics with a view to assessing the accuracy of bundle adjustment. The reprojection error (see Table 2) is a good indicator of the quality of self-calibration process during bundle adjustment.
The Ground Control Point Root Mean Square Error (GCP RMSE) measures the difference between true coordinates of a GCP and its coordinates calculated from all photos. The GCP RMSE/Ground Sample Distance ratio (GSD) gives good indication of the realized vs. potential accuracy (Förstner and Wrobel, 2016), as GSD corresponds to the pixel size in mm (see Table 2). Experiment 1 aims at minimizing these three metrics to ensure accurate bundle adjustment.

Model Statistics
Since the aim of this study was to define the best trade-off between the development of high-resolution models and the speed of their reconstruction, model resolution and TPT were computed for all processed models. Model resolution was defined as the average distance between a vertex of the mesh and its closest neighbor (see Table 2), and the TPT including bundle adjustment, point cloud densification, meshing and texturing. In order to take into account model completeness, the dense cloud size (million points) was also calculated.

Accuracy and Precision of the Models
The precision of the positioning of 52 coded markers was assessed by calculating the SD of their XYZ position across the six replicates (see Table 2). The SD was analyzed with regards to the distance to the model center, to assess a potential decrease

Metric
Unit Definition Experiment 1: accuracy of bundle adjustment Reprojection error pixels Root mean square reprojection error, averaged over all tie points on all images. The reprojection error represents the average distance, in pixels, between a tie point on the image (from which a 3D point has been reconstructed) and the reprojection of its 3D point back on the image.
Ground Sample Distance (GSD) mm/pix Distance between the center of two pixels measured on the ground ["pixel size in object space units" (Granshaw, 2016)]. GSD = [sensor width (mm) × flying elevation (m)]/[real focal length (m) × image width (pix)] for a given image. This value is averaged over all images by PhotoScan and is available in the model processing report (see "Ground resolution" in the "Survey Data" section of the report). The accuracy and precision of objects measurements were assessed with the mean and standard deviation of the relative measurement error, respectively.

Ground Control Point Root Mean
Cloud-To-Mesh distance (C2M) mm Absolute distance between a 3D point of the rock model to the in-air reference model. The accuracy and precision of 3D geometries were assessed with the mean and standard deviation of the C2M distance, respectively.

Model statistics
Dense cloud size million points Total number of points in the dense cloud.
Model resolution mm Average distance between a vertex of the mesh and its closest neighbor. This value was obtained by averaging the distance between a vertex and its closest neighbor for 10 000 randomly selected vertices. in precision for points located further away from the center. The accuracy of the positioning was not assessed because exact XYZ positions of the markers was not available. The accuracy and precision of measurements for all objects of known dimensions (lengths, areas, volumes; see step 4 in Figure 1) were assessed with the mean and SD of the relative measurement error (see Table 2). The simple geometry of the bucket allowed for the calculation of its volume and surface area. The rock volume and surface area were measured on an ultra-high resolution, in-air reconstruction  using 255 photographs taken with the Nikon D4. The photos were taken from a variety of orientations, using maximum quality parameters for reconstruction. The in-water rock models were aligned to the in-air reference with the four coded markers, and compared using cloud-to-mesh signed distance implemented in CloudCompare Version 2.10 (CloudCompare, GPL software, 2018). The accuracy and precision of the reconstructions were assessed with the mean and SD of absolute distances between the six replicates and the in-air reference (see Table 2), calculated for 5 000 000 points randomly selected on the reference rock surface.

Analysis of the Results
All marker positions and model statistics were exported from PhotoScan as text and PDF files (PhotoScan processing reports). They were parsed and analyzed with R Software V. 3.4.2 (R Core Team, 2017). The accuracy of bundle adjustment was assessed using the reprojection error, GCP RMSE and GCP RMSE/GSD. The resolution of the models after densification and meshing was defined as the average distance between two neighbor vertices of the mesh (see "Model Statistics"). The individual effect of each parameter (camera system, flying elevation, camera orientation and photo density) on the different metrics (reprojection error, GCP RMSE, GCP RMSE/GSD, dense cloud size, model resolution and TPT) was evaluated with a linear mixed effect model (LMM) (Zuur et al., 2009) after a log-transformation was performed to linearize the relationship with photo density. LMMs are statistical models that are suited for the analysis of clustered dependent data. Indeed, in all models built from the subsampling of one of the 24 datasets from experiment 1 (for studying the effect of photo density on the different metrics) data points are not independent. LMMs incorporate both fixed (the explanatory factors: camera system, flying elevation, camera orientation, photo density) and random effects used to control for pseudo-replication in the data by taking into account heterogeneity in the relationships between the explained variable and explanatory factors among the datasets (Patiño et al., 2013). The random effects structure depicts the nesting of datasets from the most to the least inclusive (camera > flying elevation > orientation > replicate). Models were fit with the "lmer" function in the "lme4" R library (v. 1.1-21). Each LMM was first built with all 4 predictors and sequentially pruned by dropping the least significant predictor until all remaining predictors were significant, such that the final LMMs only included the predictors that had a significant effect on each metric (t-test < 0.05). GSD was independently used as a predictor of the dense cloud size, model resolution and TPT in order to integrate both camera and flying elevation effects and make the results generalizable to a broader range of camera systems and flying elevations.
In experiment 1, the aim was to minimize the reprojection error, the GCP RMSE, the GCP RMSE/GSD, the model resolution, and the TPT, whilst maximizing the dense cloud size. Experiment 2 assessed the accuracy and precision of the reconstructions obtained with the best set of parameters from experiment 1. This was conducted through the analysis of the XYZ positioning of markers, objects lengths, surfaces and volume measurements, and cloud-to-mesh distance between the rock reconstructions and an ultra-high in-air reference model.

RESULTS
The weather conditions were clear during both experiments (swell < 0.5 m, wind < 5 knots), and the sky was cloudy, ensuring homogenous enlightening at those relatively shallow depths. Water conditions were homogenous, and refraction was supposed constant over time and space during acquisition, as it has been proven to be insensitive to temperature, pressure and salinity (Moore, 1976). The total acquisition time for both nadiral and oblique passes was 5 to 7 min for one given combination of parameters (see step 1 in Figure 1). All coded markers were well detected by PhotoScan for all models, and all appeared on a minimum of 4 photos.

Accuracy of Bundle Adjustment
The reprojection error ranged from 0.30 to 0.65 pixels across all models, but was significantly smaller at 2.5 m flying elevation and pure nadiral orientation. The proportion of variance explained for the overall reprojection error was 96.7% (see Table 3). All parameters had a significant effect on the reprojection error (t-test, p < 0.001), except camera system (t-test, p > 0.05).
The GCP RMSE ranged from 0.1 to 9.0 mm across all combinations, but 83 % of models had a value smaller than 1 mm (see Figure 3). The proportion of variance explained for the overall modeled GCP RMSE was 64.5%. The only parameter   that had a significant effect on GCP RMSE was the photo density (t-test, p < 0.001, see Table 3), with a negative trend. The GCP RMSE/GSD ratio ranged from 0.1 to 7.7 across all combinations, but 89 % of models had a ratio smaller than 1. The proportion of variance explained for the overall modeled GCP RMSE/GSD ratio was 67.3% Both flying elevation and photo density had a significant effect on the GCP RMSE/GSD ratio (t-test, p < 0.01 and p < 0.001, respectively). Camera system and orientation did not have any significant effect (t-test, p > 0.05).

Dense Cloud Size
The dense cloud size ranged from 3.3 to 24.1 million across all models, but was significantly smaller for the Nikon D3S + Dome with 5 m flying elevation and nadiral + oblique orientation (t-test, p < 0.001, See Table 4). The proportion of variance explained for the overall dense cloud size was 99.9%. The camera system and flying elevation accounted for 94.0% of the total variance. The orientation and photo density had a significant effect on the model resolution (t-test, p < 0.001) but only accounted for 5.9% of the total variance. GSD as single predictor had a significant effect (t-test, p < 0.001) and accounted for 78.3% of the total variance. This GSD effect corresponded to about −0.15 to −0.62 million points/m 2 per mm.pix −1 (log scale). The combination of parameters that gave the highest dense cloud size included the Nikon D4 + RS with a flying elevation of 2.5 m, and pure nadiral orientation (mean 19.4 million points ± 2.2 SD; see Figure 4).
The exponential regressions for each combination were in turn calculated (R 2 = 0.42-0.99). Using the regression curves for each combination, the dense cloud size reached 95% of the  maximum value achieved for a photo density between 3.3 and 18.9 photos/m 2 , depending on the camera system, elevation, and orientation. For the stated values of photo density, the dense cloud size was 95% of the maximum value achieved, and additional photos only increased the number by the remaining 5%. The lowest photo density value was obtained from the Nikon D4 + RS camera using pure nadiral orientation at 5 m elevation, while the highest value was obtained from the Nikon D3S + Dome camera using nadiral and oblique orientation at 5 m elevation. The value obtained from the Nikon D4 + RS, 2.5 m altitude and pure nadiral orientation was 4.7 photos/m 2 .

Model Resolution
The model resolution ranged from 4.5 to 11.6 mm across all models but was significantly smaller for the Nikon D4 + RS with 2.5 m flying elevation and nadiral orientation (t-test, p < 0.001, see Table 4). The proportion of variance explained for the overall model resolution was 99.9 %. The camera system and elevation explained 96.8% of the total variance. The orientation and photo density had a significant effect on the model resolution (t-test, p < 0.001) but only accounted for 3.1% of the total variance. GSD as single predictor had a significant effect (t-test, p < 0.001) and accounted for 89.0% of the total variance (with log scale). But GSD and model resolution were linearly positively correlated (Pearson correlation: 0.99), with the following intercept and Frontiers in Marine Science | www.frontiersin.org slope: model resolution = 0.4 + 5.8 mm/mm.pix −1 . The combination of parameters that gave the finest model resolution included the Nikon D4 + RS with a flying elevation of 2.5 m, and pure nadiral orientation (mean 4.6 mm ± 0.06 SD; see Figure 4). Model resolution was highly anti-correlated with the dense cloud size (spearman correlation: −0.98).

Total Processing Time
TPT increased exponentially with the photo density, ranging from 4 min 40 s to 17 h (see Figure 4). TPT was strongly dependent on the photo density, camera system, and orientation (t-test, p < 0.001, see Table 4), with 91.3 % explained variance, but flying elevation was not significant (t-test, p > 0.05). GSD as single predictor did not have any significant effect (t-test, p > 0.05). Exponential regressions were calculated for each combination (R 2 = 0.87-0.99). From the regression curves for each combination, it was deduced that the TPT reached a 5% of the overall maximum value for a photo density of 3.8-8 photos/m 2 , depending on the camera system, elevation, and orientation. The lowest photo density values were obtained for Nikon D4 + RS camera at 5 m elevation (for both orientations), while the highest value was obtained for Nikon D3S + Dome camera, pure nadiral orientation and 2.5 m elevation. The value for Nikon D4 + RS at 2.5 m and pure nadiral orientation was 5.0 photos/m 2 .

Accuracy and Precision of the Models With the Best Combination of Parameters
The results stated above show that the best trade-off between high bundle adjustment accuracy, high dense cloud size, high-resolution modeling and low TPT was achieved using the Nikon D4 + RS camera system at a flying elevation of 2.5 m, using pure nadiral orientation and a targeted photo density of 4-5 photos/m 2 (which corresponds approximately to 1 photo/s at a swimming speed of 20-25 m/min). This combination of parameters was used to assess the quality of the models in experiment 2 (six replicates). The average reached photo density was 4.3 photos/m 2 ± 0.18 SD over the six datasets with a model resolution of 3.4 mm ± 0.2 SD. GCP RMSE on the four markers of the cross scale bar ranged from 0.76 to 0.93 mm.

Positioning of Markers
The SD of the positioning of 52 markers placed across the modeled area ranged from 0.2 to 1.9 mm in X, 0.2 to 1.8 mm in Y and 0.2 to 4.3 mm in Z (see Figure 5). The SD increased with distance to the model center for all three axes (t-test, p-value < 0.001; see Table 5).

Measures of Lengths, Surfaces and Volumes
The calculation of all measurements was very accurate (mean error < 1%, see Table 6) and precise (SD error < 0.6%), except in the case of describing the bucket's surface area, which indicated a lower accuracy (mean error = 16.11%) and precision (SD error = 2.12%; see Table 6).

Reconstruction of 3D Geometries
The in-air model reference of the rock had a resolution of 0.5 mm. The absolute cloud-to-mesh distance between the underwater and the in-air rock model, computed on 5 000 000 randomly selected points, ranged from 0.04 to 9.6 mm across the six replicates (see Figure 6). The mean absolute C2M distance over all points and all replicates was 1.2 mm. The SD of the absolute C2M distance across the six replicates ranged from 0.01 to 9.5 mm, with a mean value over all points of 0.8 mm. Mean and SD of C2M distance were highly correlated (spearman correlation: 0.77).

DISCUSSION
This study aimed at developing a simple and operational methodology for the 3D reconstruction of marine habitats, with the highest resolution, accuracy and precision, for the shortest processing time. We evaluated the effects of the camera system, flying elevation, camera orientation, and photo density on the resulting reconstructions of a 6 × 6 m sand patch using reprojection error, GCP RMSE, model resolution, and TPT. By considering the parameters which gave the best results, and the photo density which achieved the best tradeoff between accuracy of the reconstructions and processing time, this study assessed the expected accuracy and precision of reconstructions on a second study site with reference objects of known dimensions and geometry.

Best Practices for Underwater Photogrammetry
The bundle adjustment was mostly accurate across all combinations, with a reprojection error no greater than 0.65 pixels and a GCP RMSE/GSD ratio smaller than one for 89% of the models. This ratio is a good indicator as it represents the ratio between realized and potential accuracy (Förstner and Wrobel, 2016). It was smaller at higher photo densities, indicating that photo density gives more robustness during bundle adjustment. The reprojection error increased with photo density (t-test, p-value < 0.001), but there was a saturation point around 5-7 photos/m 2 with a lower slope for higher densities. Indeed, the reprojection error naturally increases at low photo densities, as with very few photos it is expected that 3D points are reconstructed from only 2-3 images and thus have a null or low reprojection error. Flying elevation and nadiral + oblique orientation showed higher reprojection error, due to a lower sharpness of images taken in these configurations. The camera system did not have any significant effect on the reprojection error, which suggests that the self-calibration algorithm implemented in PhotoScan interpreted and corrected the distortions of the two different optical systems fairly successfully. Also, reprojection error is not affected by the sensor resolution as it is expressed in pixel unit. The two main drivers of model resolution were the camera system and the flying elevation. The model resolution was in average 2.4 mm smaller for the Nikon D4 + RS (log scale), and 1.8 mm/m elevation. By definition, the GSD is directly depending on the resolution and size of the camera sensor, and distance to modeled object (Förstner and Wrobel, 2016). Therefore, the closer to the object and the higher the resolution of the sensor, the smaller the GSD. Model resolution and GSD being highly linearly correlated (Pearson correlation: 0.99), a smaller GSD involves a smaller model resolution. Its effect corresponded to 5.8 mm/mm.pix −1 (any increase in GSD by 1 mm leads to an increase of 5.8 mm in model resolution, with quality setting for the dense cloud set to "high"). This is an interesting result to keep in mind when planning a photogrammetric acquisition, as GSD can be calculated from sensor properties and flying elevation (see Table 2). For a desired model resolution and a given camera system, one can estimate the corresponding flying elevation to be practiced. However, the quality of the camera system will condition the best possible model resolution, as decreasing the flying elevation would require more and more photos to maintain sufficient overlap between pictures, in turn increasing the overall processing time. Other unpublished experiments undertaken by the participants of this study confirm the impracticality of further decreasing flying elevation in the case of homogenous textures such as muddy sediment or dense seagrass, since the footprints of photos taken at lower elevation can be too small to ensure the reliable capture of key points, confusing the bundle adjustment algorithm. This issue was not observed in the case of biogenic reefs such as coralligenous or coral reefs that exhibit a high number of reliable key points.
On the other hand, dense cloud size was highly anti-correlated with model resolution (spearman correlation: −0.98), hence affected by the same parameters as model resolution, with reverse trends. Indeed, the model resolution was here defined as the average distance between a vertex of the mesh and its closest neighbor. But at low photo densities, the dense cloud size was also impacted by missing surfaces that could not be reconstructed because of the insufficient coverage (mostly at the model edges). This was not observed for photo densities greater than 3-4 photos/m 2 . Taking this effect into account by normalizing by the total area, GSD had an effect of about −0.15 to −0.62 million points/m 2 per mm.pix −1 (log scale; any increase in GSD by 1 mm.pix −1 leads to a decrease of 0.15 to 0.62 million points per square meter).
The camera orientation significantly influenced most of the metrics, but most importantly affected the reprojection error. All analyses showed that the best results could be obtained by using pure nadiral orientation rather than nadiral and oblique orientation, contrary to the findings of other studies (Chiabrando et al., 2017). It is nevertheless important to underline the fact that, whilst the selected study sites had a relatively flat topography with low-height objects, in the case of more complex environments such as reefs, oblique images can be vitally important to ensuring the comprehensive analysis of all details of the site. This is illustrated by the cloud-to-mesh distances represented in Figure 6: the lowest accuracy and precision corresponded to points located below overhanging parts of the rock that were not always wellrepresented by the photographs, depending on the exact position of each photo.
TPT exponentially increased with photo density, necessitating a trade-off between accurate bundle adjustment/high dense cloud size and expected TPT. It was concluded that a value of 4-5 photos/m 2 of mapped seafloor worked well at 2.5 m height. A higher number of photos would mean that the TPT would have been considerably higher whilst the accuracy of bundle adjustment and dense cloud size would have seen no such corresponding increase. With 16.2 Mpix, the Nikon D4 required a significantly higher TPT than the Nikon D3S and its 12.1 Mpix sensor, but this effect was marginal compared to the one of photo density (see Table 4).
The workflow designed in this study used "high" quality setting for dense cloud generation (see Figure 1), which corresponds to a downscaling of the original image size by factor of four [two times by each side, (Agisoft, 2018)]. However, with this quality level, depth maps and dense cloud generation are the two most time consuming steps of model production (70-80% of TPT). A "medium" quality corresponds to a downscaling of the original image size by factor of 16 (four times by each side) and could drastically reduce TPT. According to the size of the area to be modeled, there could be a trade-off between the GSD (through flying elevation and sensor properties) and the dense cloud quality in order to reach the desired resolution within the shortest TPT. At low flying elevation, more photos are required to cover the area, but the GSD is smaller, which might not require the same quality level as a higher flying elevation in order to reach a given model resolution. Further investigations should tackle this point to help researchers making decisions on this trade-off.

Accuracy and Precision of the Resulting Methodology
The positioning of markers in the scene was achieved with a SD smaller than 5 mm on the three axes, but was increasing with distance to the center at a rate of about 0.3 mm/m in XY and 1.2 mm/m in Z. If this precision can be sufficient in many usages, it must be taken into consideration notably in the case of model comparison, such as coral or coralligenous outcrops growth at a reef scale. For such applications, the model should always be centered on the area requiring the finest precision, and changes in Z must be larger than the precision expected at the given distance to reflect true reef changes or growth. This underlines the need for further study at a broader scale so as to evaluate the decrease in precision for areas > 100 m 2 . People using underwater photogrammetry can place coded markers at the boundaries of the region of interest to assess the lowest expected precision of their models, especially on the Z axis.
However, the methodology achieved a satisfactory accuracy for length measurements, with measurement errors corresponding to 8.0 mm per 10 m. It is important to note that the reference bars used were only 0.35 to 0.85 m long and that the accuracy could be different at larger scales. With regards to volumes and surfaces, the expected accuracy was predicted to decrease with increased object complexity, as informed by several studies (Figueira et al., 2015;Bryson et al., 2017). The poorest accuracy and precision occurred in the calculation of the bucket's surface area. It is quite possible that the texture was too homogenous and that PhotoScan could not identify enough tie points on the surface to build its shape, resulting in important surface artifacts. This must be taken into consideration when monitoring artificial structures with very poor texture such as newly submerged metallic or concrete structures. The reconstruction of the resin-made rock (an imitation of natural surfaces and shapes) on the other hand proved to be very accurate (0.92% error for volume and 0.36% for area) and precise (0.54% SD for volume and 0.51% for area), surpassing previous results in similar studies (Bythell et al., 2001;Courtney et al., 2007;Figueira et al., 2015;Lavy et al., 2015;Shortis, 2015;Gutierrez-Heredia et al., 2016). Comparing 3D meshes, the models were also very accurate and precise with a mean C2M distance of 1.2 mm over 5 000 000 randomly selected points on the rock surface, and a SD of 0.8 mm over the six replicates. The highest differences with the in-air reference model were observed for points below overhanging parts of the rock (see Figure 6), as they were not always captured by the pure nadiral photos. The highest variability was observed for the same points. Indeed, according to the exact position of photos for each dataset, these points were accurately reconstructed for some replicates, and showed artifacts for others. This highlights the limits of pure nadiral orientation in the case of more complex structures, where the operator should adapt camera orientation to follow the structure of the modeled object.
With the exception of the calculated surface of the bucket, the methodology produced highly accurate and precise reconstructions. The method is well-suited and easily deployed for the study and monitoring of marine benthic ecosystems. Using this method, artificial or natural habitats that exhibit the same level of complexity as the experimental setup (low complexity) could be monitored with an expected 1.2 mm accuracy. Healthy biogenic habitats such as coralligenous and coral reefs are expected to show a high complexity; the accuracy of the method still needs to be assessed for these habitats, as reconstruction errors are known to increase with habitat complexity (Figueira et al., 2015;Bryson et al., 2017). This solution is cost-effective and operational in terms of the material resources required for monoscopic photogrammetry, time required underwater, and post-processing time. By comparison, whilst stereoscopic mounts enable the direct scaling of the scene (Ahmadabadian et al., 2013), they cost twice as much as monoscopic cameras, produce twice as many photos increasing the overall processing time, and do not necessarily achieve more accurate results (Abdo et al., 2006;Figueira et al., 2015;Bryson et al., 2017).

Integration of This Method Within Marine Ecosystem Monitoring Networks
Photogrammetry has been increasingly used for studying and monitoring marine biodiversity (as aforementioned in the introduction). It has been deemed a suitable tool for monitoring natural or anthropogenic disturbances and their effects on biodiversity in marine ecosystems (Burns et al., 2016). The results of this study showed that, at the small reef scale, photogrammetry can provide very accurate and precise sub-centimetric reconstructions.
Biogenic reefs and seagrass meadows are two of the most frequently monitored marine habitats because of the ecological role they play for biodiversity (Ballesteros, 2006;Boudouresque et al., 2012;Coker et al., 2014). Photogrammetry is particularly well suited for the monitoring of reefs as they are built of sessile and immobile organisms, and their 3D structure is known to play a major role in structuring their constitutive ecological assemblages (see introduction). However, artificial light becomes mandatory over 30-40 m depth because of light absorption, notably in the case of deeper habitats such as coralligenous reefs (Ballesteros, 2006). The methodology outlined by this study can be used to monitor small to medium size reefs (20-500 m 2 ); to characterize their structural complexity, measure growth rates and study 3D relationships between species, for example. Given recent developments in deep learning, third-dimensional models can be exploited as an additional information layer for classifying species from their 3D characteristics (Maturana and Scherer, 2015;Qi et al., 2016). Models of sub-centimetric resolution are currently confined to use on a small area for practical reasons and computational limitations, but future technological advances might enable use on larger extents. Future research should focus on the use of photogrammetric technology in developing scientific understanding of complex ecological processes related to structure. In the coming years, photogrammetry should support detailed studies covering vast marine regions, opening opportunities to draw correlations with macroecological variables available at a broader scale.

CONCLUSION
There has been an increasing demand in recent years for innovative methods capable of easily and efficiently capturing fine-scale ecological processes and detecting small changes for the monitoring of the effects of anthropogenic pressures on marine habitats. This study outlined a method that enables the production of accurate and reproducible, highresolution models of marine environments with a low TPT. The simple experimental design supported the test of a set of variables proven operational. All in all, the method is suitable for the monitoring of marine benthic biodiversity in a large-scale multi-sites monitoring system. The study's results showed that flying elevation and camera system, and therefore GSD, strongly affected the model resolution. Photo density meanwhile vastly influenced processing time and bundle adjustment accuracy. If the operator effect was not studied here, because it indirectly accounts for several factors that are not straightforward (i.e., trajectory, flying elevation, camera orientation, swimming speed, etc.), it should be tackled as it could have a major influence, notably at large scales. Accounting for computing limitations, future investigations should assess the accuracy and precision of models built with the same parameters, but at a larger scale (>500 m 2 ). The influence of the dense cloud quality setting should also be studied as it is expected to play an important role the trade-off between model resolution and TPT. This study was necessary prior to ecological interpretations of 3D models built with the resulting methodology.

DATA AVAILABILITY
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
GM, JD, and FH designed the study and collected the data; GM produced the 3D models and performed the statistical analysis; GM, JD, SL, and PB contributed to the results interpretation; GM wrote the first draft of the manuscript and produced the figures; JD and SL supervised the research project. All authors contributed to manuscript revision, read and approved the submitted version.