
This article was submitted to Robotic Control Systems, a section of the journal Frontiers in Robotics and AI

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

One of the key distinguishing aspects of underwater manipulation tasks is the perception challenge of the ocean environment, including turbidity, backscatter, and lighting effects. Consequently, underwater perception often relies on sonar-based measurements to estimate the vehicle's state and surroundings, either standalone or in concert with other sensing modalities, to support the perception necessary to plan and control manipulation tasks. Simulation of the multibeam echosounder, while not a substitute for in-water testing, is a critical capability for developing manipulation strategies in the complex and variable ocean environment. Although several approaches exist in the literature to simulate synthetic sonar images, the methods in the robotics community typically use image processing and video rendering software to comply with real-time execution requirements. In addition to lacking a physics-based model of the interaction between sound and the scene of interest, these rendered sonar images omit several basic properties, notably the coherent imaging system and the coherent speckle that distort the object geometry in the sonar image. To address this deficiency, we present a physics-based multibeam echosounder simulation method to capture these fundamental aspects of sonar perception. A point-based scattering model is implemented to calculate the acoustic interaction between the target and the environment. This is a simplified representation of target scattering, but it can produce realistic coherent image speckle and the correct point spread function. The results demonstrate that this multibeam echosounder simulator generates qualitatively realistic images with high efficiency, providing both the sonar image and the physical time series signal data. These synthetic sonar data are a key enabler for developing, testing, and evaluating autonomous underwater manipulation strategies that use sonar as a component of perception.

Simulation of robotic manipulation systems has proven useful and effective for designing, testing, and evaluating their capabilities (

Manipulation has made much progress in recent years, notably using perception from images and point clouds. Underwater manipulation has been carried out using multi-sensor suites (

One of the key distinguishing aspects of underwater manipulation tasks is the perception challenge of the ocean environment, specifically turbidity, backscatter, and lighting effects. High-fidelity simulations of sonar-based perception are essential for bathymetric mapping and obstacle avoidance, as well as to support manipulation planning and control with accurate feedback (

Sonar-based perception is particularly challenging due to the slow propagation speed of sound compared to electromagnetic waves, and the large bandwidth to center frequency ratios of sonar applications. This challenge leads to a large amount of data produced by acoustic systems, and correspondingly a large computational load to simulate high-fidelity synthetic data, particularly for real-time simulations.
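The scale of this computational load can be illustrated with a back-of-the-envelope estimate (the numbers below are illustrative assumptions, not measurements from any specific sonar):

```python
# Back-of-the-envelope estimate of per-ping sample counts for a
# multibeam sonar. All values are illustrative assumptions.
SOUND_SPEED = 1500.0  # m/s, nominal sound speed in seawater

def samples_per_ping(max_range_m, bandwidth_hz, num_beams):
    """Two-way travel time times a bandwidth-limited sample rate,
    accumulated over every beam in the fan."""
    travel_time = 2.0 * max_range_m / SOUND_SPEED       # seconds
    samples_per_beam = int(travel_time * bandwidth_hz)  # band-limited samples
    return num_beams * samples_per_beam

# Example: 60 m range, 3 kHz bandwidth, 512 beams.
n = samples_per_ping(60.0, 3000.0, 512)
```

Even with these modest assumptions, a single ping already amounts to over 10^5 complex samples, before any oversampling, which motivates the attention to real-time efficiency below.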

Several approaches have been developed in the literature for the open source Gazebo robot simulator (

Previous methods in the literature are based on image processing and video rendering of the simulated rays intersecting a target, converting the result into a sonar image in a pixel-wise, image-based manner (

For any task that requires geometric information obtained from sonar data (such as autonomous manipulators), these basic acoustic properties must be accounted for in the simulation. In addition to the limited fidelity of the previous image-based approaches, they generate only post-processed imagery of the underlying acoustic measurements to qualitatively approximate the interface provided to a human operator through a sonar viewing application. Autonomous manipulation, perception, planning and control rely upon access to the sensor-level measurements that are independent of the viewer used to render the measurements into imagery.

This study presents a beam-level time series sonar simulation method to provide range-versus-intensity data that underwater vehicle manipulator systems can exploit. A point-based scattering model is implemented, following

The application results based on the proposed methods demonstrate that this multibeam echosounder simulator efficiently generates qualitatively realistic images, providing both the sonar image and the underlying physical time series signal data in real time at a sufficient refresh rate. This sonar data is a key enabler for developing, testing, and evaluating autonomous underwater manipulation strategies that use sonar as a perception component.

The contributions of the methods in this article are to virtually produce sonar perception data with appropriate fidelity and at sufficient refresh rates. Collectively, the methods can provide physical time series signal data to improve the simulation infrastructure that underwater manipulation strategies and systems can exploit. Individually, our contributions are as follows:

• A simplified physics-based forward-looking echosounder with a point-based scattering model within the Gazebo framework to support underwater vehicle simulations

• High fidelity acoustics simulation including multibeam, scattering, noise, and target-wise reflectivity to increase the fidelity of current capabilities and generate sensor-level beam intensity measurements suitable for exercising autonomous manipulation perception, planning and control

• GPU parallelization for real-time sonar simulation data processing

The model is based on a point-scattering representation of the echo level, using a ray-based spatial discretization of the model facets as scatterers, corrected with a beam pattern appropriate for a line array. We first present the geometric information provided by Gazebo for each beam, then detail the calculations needed to produce an intensity time series for each beam.

A single sonar beam within the field of view (FOV) of the sensor is shown below; beams are indexed $j \in \{1, \dots, N_B\}$ for $N_B$ beams. Individual rays correspond to each scatterer on the target mesh. The following information is generated for each ray within an individual beam:

A single sonar beam with its azimuth and elevation beam widths.

• The range $r_i$: the distance from the origin of the sonar reference frame to the first intersection between the ray and a target in the field of view. The azimuth of the ray is fixed in the sensor frame as $\phi_i$ and the elevation angle of the ray as $\theta_i$ (

Set of rays forming a single sonar beam.

• The incident angle $\alpha_i$: the angle between the ray vector and the surface normal at the intersection point.

Projected ray surface area and surface area patch of the visual object.

• The reflectivity $\mu_i$ of the target intersected by the $i$th ray, which is a property of the target object model (shape, size, and material) and of the sonar frequency and bandwidth.

This scene information is provided by the Gazebo 3D simulation framework at each execution cycle (typically 1 kHz). The scene information is then used as inputs to the sonar model described below to generate an acoustic time series. The range and the normal vector of the intersected target elements in the environment are obtained from the depth buffer interface to the Gazebo 3D rendering pipeline. The locations in the depth buffer are then correlated with the visual scene rendering to deduce the target reflectivity. The methods to calculate the time series are detailed below.

We define a ray as a vector, $\mathbf{r}_i$, from the sensor frame origin to the first intersection with a visual object within the scene. We calculate the incidence angle (the angle between the surface normal and the intersecting ray) as

$$\alpha_i = \cos^{-1}\!\left(\frac{-\,\mathbf{r}_i \cdot \mathbf{n}_i}{\lVert \mathbf{r}_i \rVert\, \lVert \mathbf{n}_i \rVert}\right)$$

where $\mathbf{n}_i$ is the surface normal at the intersection (the sign assumes the normal points back toward the sensor).

The projected ray surface area, $A_i$, follows from the ray geometry. The ray angular spacings $\delta\phi$ and $\delta\theta$ are assumed to be infinitesimally small, so the projected ray area at range $r_i$ can be calculated by

$$A_i = r_i^2\, \delta\phi\, \delta\theta$$

Using the ray area projected onto the surface of the model element, the surface patch area $dA_i$ is a function of the incident angle, calculated as

$$dA_i = \frac{r_i^2\, \delta\phi\, \delta\theta}{\cos\alpha_i}$$

This equation relates the geometry of the ray to the surface area of the element that intersects the ray.
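These two geometric steps can be sketched as follows, assuming the surface normal points back toward the sensor (the sign convention is our assumption):

```python
import numpy as np

def incidence_angle(ray_vec, surface_normal):
    """Angle between the incoming ray and the surface normal.
    Assumes the normal points back toward the sensor."""
    cos_a = -np.dot(ray_vec, surface_normal) / (
        np.linalg.norm(ray_vec) * np.linalg.norm(surface_normal))
    return np.arccos(np.clip(cos_a, -1.0, 1.0))

def patch_area(r, d_phi, d_theta, alpha):
    """Surface patch intersected by a ray: the projected ray area
    r^2 * d_phi * d_theta divided by cos(incidence angle)."""
    return (r ** 2) * d_phi * d_theta / np.cos(alpha)

# Head-on hit: ray along +z, normal along -z -> zero incidence angle.
alpha = incidence_angle(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, -1.0]))
dA = patch_area(10.0, 1e-3, 1e-3, alpha)
```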

The target strength of the model element, which represents the ratio of scattered to incident intensity, is given by

$$TS = 10 \log_{10} \frac{I_{sca}}{I_{inc}}$$

where $I_{sca}$ is the intensity scattered by the target measured at 1 m distance, and $I_{inc}$ is the incident intensity on the target. Models for target strength are typically complex, and depend on the shape and geometry of the target (

Substituting for $dA_i$, the intensity ratio becomes

Synthetic time series are simulated using the point-based scattering model of

Here, we define each ray intersected surface mesh element as a scatterer; the number of rays is equivalent to the number of scatterers. This approach is valid so long as the ray intersecting surface mesh is discretized with an average ray spacing smaller than the resolved area of the system. If this criterion is not satisfied, then the number of rays in a beam should be increased before use in this ray-based scattering model.

The overall spectrum of the signal received from each beam, $P_j(f)$, is modeled as a coherent sum over the scatterers within the beam:

$$P_j(f) = S(f) \sum_{n=1}^{N} a_n \frac{e^{\mathrm{i}\, 2 k_w r_n}}{r_n^2}$$

where $a_n$ is the complex scatterer amplitude, $S(f)$ is the source spectrum, and $k_w = 2\pi f / c_w$ is the acoustic wavenumber for sound speed $c_w$ in water. The beam time series, $p_j(t)$, is obtained from $P_j(f)$ by an inverse Fourier transform.

The source spectrum is a user-defined input, remains constant for each ray, and is modeled in the frequency domain by a Gaussian centered on the center frequency, $f_c$, with bandwidth $b$.

The Gaussian form here is simple, but in any realistic simulation, the spectrum of the wave transmitted by the sonar system and realistic filtering by the sonar receiving subsystem should be used. The source amplitude is parameterized by $S_0$ and has units of Pa⋅m. The decibel version of $S_0$ is the source level (

To obtain synthetic time series, the acoustic frequency is discretized into a linearly spaced vector from $f_{min}$ to $f_{max}$, centered on the center frequency, $f_c$. The full width of the transmit spectrum is the bandwidth, $b$. The record length is set by the maximum two-way travel time, $t_m = 2 r_m / c_w$, where $r_m$ is the maximum range of the sonar, and the frequency spacing is $\delta f = 1/t_m$.

Spherical spreading is an appropriate assumption for modeling acoustic propagation at short ranges. The two-way transmission loss for incoherent scattering appears in $P_j(f)$ as the $1/r_n^2$ spreading factor, where $2 r_n$ is the two-way distance (from source to scatterer, and back to the receiver).

The scatterer amplitude, $a_n$, is calculated for each ray, and is related to the target strength discussed in

The variables $\Gamma_{x_n}$ and $\Gamma_{y_n}$ are draws from independent Gaussian random variables. Although the random variable, $a_n$, is indexed by $n$, each draw is statistically independent of the others. The summation of complex Gaussian noise and a coherent field would result in Rician statistics (

A realistic acoustic image should show the side lobes of the beam pattern of a single point target. In some applications, it is advantageous to simulate the time series of each element of the receive array, and perform beamforming. This method is accurate, as it reproduces the signal processing of the sonar sensor exactly, but may require proprietary signal processing algorithms used in the system. However, simulating channel-level acoustic measurement, prior to beamforming, is computationally intractable for real time execution, at least using current computing hardware. To mitigate this inefficiency, we simulate each beam assuming that the horizontal beam pattern is a uniform, ideal beam pattern within the beamwidth. The time series are generated for a fan of beams whose directions correspond to the beamformed directions for the particular multibeam echosounder. The ideal beams are then corrected by performing a weighted sum of the beams, detailed below. This method does not perfectly reproduce the time series generated from the more accurate method, but should be sufficient so long as there are no extremely strong scatterers outside the fan of ideal beams to be simulated.

The beam pattern of an array is defined in polar coordinates where the acoustic intensity is the distance along the radial axis and the angle is relative to the transducer axis. The beam pattern is visualized as one main lobe in the center with smaller side lobes radiating away from the main axis (

Beam pattern schematics of half power beamwidth.

By inspection, the highest return will be along the main axis, and the response decreases off-axis. Local maxima appear away from the main axis and are referred to as side lobes. Any acoustic image or measurement will be subject to side lobes, which introduces some ambiguity into finding a target's direction in sonar data. This ambiguity is present when using any coherent imaging system that uses multiple receivers. The half-power beam width, $\theta_{bw}$, is marked at −3 dB on the main lobe. We define the half beam width as $\theta_w = \theta_{bw}/2$.

The effect of side lobes can be simulated by performing a correction whereby each beam is a weighted sum of all the other beams. The weight is the directivity pattern of the array with the main response axis steered in a particular direction, where $w_{i,j}$ is the weight for the contribution of the $i$th beam to the $j$th beam.

The weights are the values of the beam pattern sampled at specific angles. In this work, we use the directivity pattern corresponding to a uniform linear array, where $\phi_i$ is the horizontal angle corresponding to the direction of the $i$th beam.

The beam pattern is determined by the aperture; for an aperture of length $L_w$, the beam pattern is that of a uniform aperture function. The radiated pressure is modeled as a normalized sinc function.

The half intensity point, $\theta_w$, can be solved for by setting the normalized beam pattern equal to one half.

For high frequency sonar, we can assume the aperture is many wavelengths long, so that the small-angle approximation $\sin\theta_w \approx \theta_w$ holds and the expression for $\theta_w$ becomes

The final beam pattern is
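A compact sketch of the side lobe correction described above, using a sinc-squared intensity pattern for a uniform line aperture. The aperture-length-to-wavelength ratio and the unnormalized weights are illustrative assumptions, not the plugin's actual parameters:

```python
import numpy as np

def beam_pattern(theta, l_over_lambda=20.0):
    """Intensity pattern of a uniform line aperture, |sin(x)/x|^2 form,
    with x = pi * (L/lambda) * sin(theta). Parameters are illustrative."""
    x = np.pi * l_over_lambda * np.sin(theta)
    # np.sinc(u) = sin(pi*u)/(pi*u), so np.sinc(x/pi) = sin(x)/x.
    return np.sinc(x / np.pi) ** 2

def apply_sidelobes(beam_dirs, beam_spectra, l_over_lambda=20.0):
    """Each output beam is a weighted sum of all ideal beams, weighted by
    the pattern steered toward that beam's direction (weights unnormalized)."""
    d = beam_dirs[:, None] - beam_dirs[None, :]  # pairwise angle offsets (rad)
    W = beam_pattern(d, l_over_lambda)           # (n_beams, n_beams) weights
    return W @ beam_spectra                      # corrected beam spectra

# Example: a small fan of 5 beams with flat unit spectra of 3 bins each.
dirs = np.linspace(-0.5, 0.5, 5)
out = apply_sidelobes(dirs, np.ones((5, 3), dtype=complex))
```

The weight matrix is symmetric in the angle offset, so the main response axis of each corrected beam stays on its steered direction while off-axis energy leaks in at side lobe levels.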

The overall algorithm to calculate the sonar model is shown in the algorithm figure below. For each beam $j$, rays are cast at elevation angles $\theta_i$ in the range $[-\theta_w, \theta_w]$, each with an associated range and intensity pair $(r_i, I_i)$. Here, $\theta_w$ is the elevation half beam width, the equivalent of the azimuth half beam width for elevation angles. The set of ordered pairs from all rays forms the range-intensity series $(r_j, I_j)$ for a beam. Thereafter, to account for interference between beams, the beam pattern weights are applied across all $N_B$ beams as a corrector.

Multibeam echosounder calculation algorithm.

The Gazebo simulator and Robot Operating System (ROS) have become standard tools in the robotics community.

Overall procedure of the imaging sonar simulation process: (i) a Gazebo camera plugin obtains the underwater scene; (ii) a two-dimensional set of sonar field-of-view data is captured from the rendered scene; (iii) the point scattering model is calculated for each ray; (iv) rays are summed into each beam; (v) the beam pattern effect is calculated for the beams; (vi) windowing and a fast Fourier transform (FFT) are performed to produce range-intensity sonar data for each beam.

Using a GPU with the NVIDIA CUDA library, the per-ray and per-beam calculations are parallelized to achieve real-time performance.
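The dominant cost is the per-ray, per-frequency summation (see the "Summation" row of the timing table), which is an embarrassingly parallel reduction. As a CPU stand-in for the CUDA implementation (an illustrative sketch, not the plugin's kernel), the double loop collapses to a single matrix operation whose independent (ray, frequency) products map naturally onto GPU threads:

```python
import numpy as np

def beam_spectrum_loop(a, r, k):
    """Naive per-ray, per-frequency double loop: the work the GPU parallelizes."""
    P = np.zeros(k.size, dtype=complex)
    for i in range(a.size):
        for m in range(k.size):
            P[m] += a[i] * np.exp(2j * k[m] * r[i]) / r[i] ** 2
    return P

def beam_spectrum_vec(a, r, k):
    """Same sum as one (rays x freqs) matrix operation; each (ray, frequency)
    product is an independent work item for a GPU thread."""
    return ((a / r ** 2)[None, :] @ np.exp(2j * np.outer(r, k))).ravel()

# Small randomized check that both forms agree (illustrative sizes).
rng = np.random.default_rng(1)
a = rng.standard_normal(8) + 1j * rng.standard_normal(8)
r = rng.uniform(1.0, 10.0, 8)
k = 2.0 * np.pi * np.linspace(895e3, 905e3, 16) / 1500.0
P_loop = beam_spectrum_loop(a, r, k)
P_vec = beam_spectrum_vec(a, r, k)
```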

In order to apply the beam pattern effect, the weights in

The final signal data of the sonar simulation is produced in a custom ROS message format.

To evaluate our simulator, the sonar model with the developed plugin is placed inside a square sonar tank with a cylinder as a target, as shown in

A multibeam echosounder set inside the square sonar tank with a cylinder target.

Sonar configurations.

| Sonar configuration (Blueview P900-90) | Value |
|---|---|
| Frequency | 900 kHz |
| Bandwidth | 2.95 kHz |
| Field-of-view | 90° |
| Range | 10 or 60 m |
| Beam width | 1° × 20° |
| Beam spacing | 0.18° |
| # of beams | 512 |
| # of rays | 11 or 114 |
| Source level | 220 dB re |

The range-intensity sonar data obtained from the simulator is shown in

Simulated intensity-range sonar data of a cylinder in a square sonar tank (reflectivity = 1e-3, # of elevation rays = 11).

Simulated multibeam echosounder image of a cylinder in a square sonar tank with reflectivity = 1e-3 [

The result shows the target object and the sonar tank in the final image. Also, the beam scattering of the signal is shown in the vicinity of the target cylinder on the colorized image.

The refresh rate and calculation time of the sonar image for various sonar configurations.

| | Full calculation | Ray reduced | Ray/Range reduced |
|---|---|---|---|
| Range (m) | 60 | 60 | 10 |
| # of rays | 114 | 11 | 11 |
| Refresh rate (Hz) | 0.5 | 3.0 | 10 |
| Time (s): Ray signal | 0.3 | 0.02 | 0.00 |
| Time (s): Summation | 1.26 | 0.16 | 0.04 |
| Time (s): Correction | 0.05 | 0.05 | 0.01 |
| Time (s): FFT | 0.03 | 0.03 | 0.00 |

It is often the case that the target objects in the environment have different reflectivity according to their material properties. By prescribing the reflectivity parameter for each object model in the scene, the sonar amplitude is calculated accordingly.

Simulated multibeam echosounder image of various reflectivities [

Simulated multibeam echosounder image of two cylinder targets [

Simulated multibeam echosounder image of two tilted cylinder targets [

In this article we have developed a method to simulate physics-based, high-fidelity multibeam echosounder acoustic perception that underwater manipulation strategies and systems can exploit for development, testing, and evaluation. The contributions of this research are the implementation of the point-based scattering model to represent target scattering, producing realistic coherent image speckle and the correct point spread function, as well as the use of GPU parallelization to obtain a real-time refresh rate of up to 10 Hz. A fruitful area of future work would be to implement a more physically based scattering model, such as the Kirchhoff approximation (

The multibeam echosounder developed can provide a sonar image and the underlying physical intensity-range time series signal data approximating what would be produced by an actual sonar. It is implemented as a plugin in the Gazebo framework and released as an open source project for users to adapt for their own uses, such as benchmarking against real sensors for quantitative comparison. The parameterization of the plugin allows users to simulate various echosounders on the market. Accurate physics-based sensor modeling is a step toward simulating more realistic perception for manipulation, in order to leverage existing manipulation methods in the underwater domain.

The datasets and source codes in this study are publicly available as open-source and can be found here:

W-SC: main researcher for the article. DO: theoretical support and advising. DD: Gazebo implementation, depth camera plugin modification. MZ: variational reflectivity development support. AR: literature research and early code development. BB: research coordination and ROS message format development. MM: Docker environment and GitHub repository support. CV: simulation tests and code reviews. JH: development of the relative file path function for database files used in the plugin.

Author MZ was employed by the company Open Robotics.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

This research was performed while the author held an NRC Research Associateship award at Field Robotics Laboratory, Naval Postgraduate School.