METHODS article

Front. Robot. AI, 22 January 2026

Sec. Field Robotics

Volume 12 - 2025 | https://doi.org/10.3389/frobt.2025.1718177

This article is part of the Research Topic "Homo Aquaticus: New Frontiers in Living and Working in the Ocean".

ATRON: Autonomous trash retrieval for oceanic neatness

John Abanes1, Hyunjin Jang2, Behruz Erkinov3, Jana Awadalla4 and Anthony Tzes5*
  • 1Electrical and Computer Engineering, NYU Tandon School of Engineering, New York, NY, United States
  • 2Mechanical Engineering, Virginia Tech, Blacksburg, VA, United States
  • 3Electrical Engineering, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
  • 4Mechanical Engineering, Egyptian Refining Company, Cairo, Egypt
  • 5Center of Artificial Intelligence and Robotics (CAIR), New York University Abu Dhabi, Abu Dhabi, United Arab Emirates

The subject of this article is the development of an unmanned surface vehicle (USV) for the removal of floating debris. A twin-hulled boat with four thrusters placed at the corners of the vessel is used for this purpose. The trash is collected in a storage space through a timing belt driven by an electric motor. The debris is accumulated in a funnel positioned at the front of the boat and subsequently raised through this belt into the garbage bin. The boat is equipped with a spherical camera, a long-range 2D LiDAR, and an inertial measurement unit (IMU) for simultaneous localization and mapping (SLAM). The floating debris is identified from rectified camera frames using YOLO, while the LiDAR and IMU concurrently provide the USV’s odometry. Visual methods are utilized to determine the location of debris and obstacles in the 3D environment. The optimal order in which the debris is collected is determined by solving the orienteering problem, and the planar convex hull of the boat is combined with map and obstacle data via the Open Motion Planning Library (OMPL) to perform path planning. Pure pursuit is used to generate the trajectory from the obtained path. Limits on the linear and angular velocities are experimentally estimated, and a PID controller is tuned to improve path following. The USV is evaluated in an indoor swimming pool containing static obstacles and floating debris.

1 Introduction

There are an estimated 269,000 tons of plastic on the water surface (Prakash and Zielinski, 2025). Projections indicate that the global plastic waste production will surpass one billion metric tons (Subhashini et al., 2024). These vast amounts of primarily plastic debris persist in the environment and require hundreds of years for decomposition.

As plastics decompose, the resulting microplastics carry hazardous chemicals and cause lasting damage (Chrissley et al., 2017). Toxins originating from plastic elements disrupt ecosystems and pose threats to human health, including cancers, birth defects, and immune system disorders (Akib et al., 2019).

Traditional labor-intensive methods for waste collection are insufficient due to the scale and dispersed nature of the problem in remote or hazardous locations (Flores et al., 2021), and stationary solutions are hindered by environmental changes (Subhashini et al., 2024). Recently, the focus has shifted from manual cleanup (Chandra et al., 2021) to robotic systems that address ocean pollution (Akib et al., 2019).

Unmanned surface vehicles (USVs) are at the forefront of these efforts (Bae and Hong, 2023; Fulton et al., 2019; Turesinin et al., 2020; Costanzi et al., 2020; Shivaanivarsha et al., 2024; Lazzerini et al., 2024; Suryawanshi et al., 2024); they are designed for collecting floating debris from rivers, ponds, and oceans. Commercial solutions such as WasteShark, Clearbot, and MANTA are used for trash collection.

The development of these robots is underpinned by technologies such as a) edge computing (Carcamo et al., 2024; Chandra et al., 2021; Salcedo et al., 2024), b) computer vision and AI for object classification (Li et al., 2025), c) environmental awareness (Li et al., 2024; Wang et al., 2019), d) intelligent navigation and control (Gunawan et al., 2016; Li et al., 2025; Wang et al., 2019), and e) effective debris collection (Li et al., 2024; Subhashini et al., 2024; Akib et al., 2019).

Marine cleaning robots are often limited by their small-scale and restricted operational capacities (Chandra et al., 2021; Flores et al., 2021). The dynamic nature of aquatic environments poses a complex challenge for effective and adaptable path planning, and operation in remote or GPS-denied areas is particularly difficult (Wang et al., 2019).

Several challenges remain in developing autonomous debris collection systems, including lighting and weather conditions, efficient path planning, robust navigation, and mechanical designs for handling diverse floating waste.

This article presents Autonomous Trash Retrieval for Oceanic Neatness (ATRON), addressing these challenges through an integrated approach that combines mechanical design, perception, planning, and control, as shown in Figure 1. The designed USV is similar to that in Ahn et al. (2022) and utilizes a robust twin-hulled catamaran design that enables heavy-duty debris collection operations. There are some fundamental changes, including the use of SLAM and various sensors (spherical camera, LiDAR) for USV localization and algorithms related to path planning relying on the orienteering problem and the rapidly exploring random tree (RRT) algorithm.

Figure 1
A pool with a robotic device equipped with a 3D printed electronics box, sealed battery container, Insta360 X2 spherical camera, 2D LiDAR, and thrusters. The device features a conveyor belt and navigates around labeled obstacles and target debris in the water.

Figure 1. ATRON: an autonomous USV capable of extracting debris from water surfaces.

The contributions in this article include the following:

• The development of a twin-hulled, ROS-based USV, ATRON, for collecting up to 1 m3 of floating debris. Four independent thrusters placed at the edges of the vessel provide a linear velocity of up to 1.47 m/s and an angular velocity of up to 0.3 rad/s.

• The development of a visual system for debris and obstacle detection using a spherical camera; YOLOv11 is used for debris classification (Ali and Zhang, 2024).

• The utilization of the orienteering problem for task planning and RRT for obstacle avoidance.

• Simulation and experimental studies conducted in an indoor pool for debris collection and obstacle avoidance.

• The development of a simulation environment within the GPU-based Isaac Sim physics simulator for the evaluation of several classification algorithms under various sea states and lighting conditions.

2 ATRON design (materials and equipment)

The ATRON is a twin-hulled catamaran measuring (2.7×1.5×1.65) m (above the waterline), with a dry weight of 120 kg and a 0.15 m draught depth; and it is stable up to sea-state 2. Two parallel pontoons constitute its framework and are connected via an aluminum (1.75×1.2 m) platform, while a secondary elevated (1.35×0.6 m) platform is used for its electronics, including the 360° camera mounted on top of a 1.2-m pole. The conveyor belt used for trash removal, isolation absorbers, and the four underwater thrusters supplement the structure.

Figure 2 illustrates the comprehensive system architecture integrating the mechanical, electrical, and computational subsystems. The USV operates on a hierarchical control structure built on ROS 2 Humble (Abaza, 2025), enabling modular communication between perception, planning, and actuation components.

Figure 2
Diagram of a robotic system architecture featuring sensors, computation, control, and power subsystems. Sensors include a 2D LiDAR and 360 camera. Computation uses ROS2 Humble, integrating Yolov11, SLAM Toolbox, and OMPL. Control components comprise a control board, four thrusters, and a conveyor belt, managed by a remote. Feedback and display involve a Wi-Fi router and external display. Power is supplied by a 12V, 95Ah sealed battery. Connections include both wired and wireless transmissions. Subsystems are categorized as mechanical, electrical, and computer.

Figure 2. ATRON structural block diagram.

The Insta360 X2 spherical camera provides 360° visual coverage for debris detection, the Slamtec S2P 2D LiDAR is used for SLAM and obstacle avoidance, and the BNO055 9-DOF IMU is used for attitude estimation and sensor fusion. YOLOv11 (Ali and Zhang, 2024) is used for real-time debris and obstacle detection from the spherical camera feed, the SLAM Toolbox (Macenski and Jambrecic, 2021) provides GPS-denied localization using LiDAR and inertial measurement unit (IMU) fusion, and components of the OMPL (Şucan et al., 2012) are used for path planning. The system can be connected through a Wi-Fi/cellular router.

The ATRON utilizes a differential thrust propulsion system with four brushless thrusters positioned at the vessel’s corners on extended aluminum outriggers. Each thruster delivers 1.2 kgf at 12 V, controlled by bidirectional ESCs capable of 50 Hz PWM modulation for velocity control. A custom controller based on the STM32F103C8T6 microcontroller serves as the interface between high-level ROS 2 commands and low-level actuator control. It provides four PWM ESC-based output channels, PWM control for the conveyor motor via a solid-state relay, and I2C communication with the IMU. The controller communicates with the onboard computer via a USB-to-UART bridge, exchanging JSON-formatted commands and sensor data at 50 Hz for real-time control.
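To illustrate this command interface, the following minimal sketch sends thruster and conveyor commands from the onboard computer over the USB-to-UART bridge at 50 Hz using pyserial; the serial port name, baud rate, and JSON field names are assumptions, since the exact schema is not given in the text.

```python
import json
import time

import serial  # pyserial

# Hypothetical port, baud rate, and message schema; the actual values are not specified in the text.
link = serial.Serial("/dev/ttyUSB0", baudrate=115200, timeout=0.02)

def send_command(thrusters, conveyor_on):
    """Send one command frame: four thruster setpoints in [-1, 1] and a conveyor on/off flag."""
    msg = {"thr": thrusters, "conv": int(conveyor_on)}
    link.write((json.dumps(msg) + "\n").encode("utf-8"))

def read_feedback():
    """Read one JSON-formatted sensor frame (e.g., IMU data), if one is available."""
    line = link.readline().decode("utf-8").strip()
    return json.loads(line) if line else None

for _ in range(50):                      # one second of commands at the 50 Hz rate
    send_command([0.2, 0.2, 0.2, 0.2], conveyor_on=True)
    feedback = read_feedback()
    time.sleep(0.02)
```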

The debris collection system is a 300-mm-wide reinforced rubber timing belt with molded cleats spaced at 50 mm intervals. The belt spans a 600 mm incline from the waterline to the collection container, driven by a 500 W brushed DC motor with integrated 10:1 reduction gearing. Drainage holes (10 mm diameter) are distributed across the belt surface at 30 mm intervals to prevent water accumulation that could reduce lifting capacity or destabilize the collected items. The ATRON has a V-shaped extended metal funnel with a 120° opening angle to guide debris toward the collection zone.

All systems operate from a 12 V, 95 Ah sealed lead-acid battery in a waterproof enclosure, providing 5.84 operating hours through a DC-to-AC inverter unit, while the onboard computer's USB ports provide ample power for the LiDAR, the 360° camera, and the ATRON controller, which uses an onboard voltage regulator to power the IMU sensor and any other circuitry. The thrusters are powered through bidirectional ESCs, while a solid-state relay drives the conveyor mechanism, as shown in Figure 3.

Figure 3
Diagram of a power and data flow system for a robotic setup. A battery supplies 12V to an inverter, converting it to 220V AC for an onboard computer, which powers a LiDAR and a 360-degree camera. The setup also includes an ATRON controller connected through USB and 5V. The ATRON controller communicates with the BNO055 IMU using I2C and 3.3V. It sends PWM signals to an SSR and four ESC modules.

Figure 3. ATRON electrical systems.

The ATRON microcontroller relies on the STMicroelectronics STM32F103C8T6, which provides six PWM output channels using dedicated hardware timers. All PWM signals are level-shifted to 5 V using a TXB0104PWR bidirectional voltage-level translator to ensure compatibility between the MCU and the external devices. The high-speed I2C bus is interfaced to a BNO055 IMU. This microcontroller polls the IMU and provides the sensor data in JSON format.

3 ATRON software design (methods)

3.1 Image rectification

The 360° spherical camera observes the surrounding space and returns dual fisheye images, as shown in Figure 4. The camera's extrinsic calibration parameters include a) $c_x$ and $c_y$, which correspond to the deviation of each hemispherical image from the optical center; b) the crop size; and c) $T=[t_x,t_y,t_z]^T$ and $R=R_z(\psi)R_y(\theta)R_x(\phi)$, determining the relative translation and rotation between the front and back hemispherical images.

Figure 4
Diagram illustrating fisheye and equirectangular imaging of an indoor pool. Two circular images represent front and back fisheye views. Arrows indicate conversion to an equirectangular image below, with divisions indicating front and back hemispheres.

Figure 4. Dual fisheye image to equi-rectangular image.

After calibration, the equirectangular projection is applied as described by Flores et al. (2024). Each pixel (u,v) in the equirectangular image is first converted to spherical coordinates (ϕ,θ) via Equation 1

$$\phi=\frac{2\pi u}{W}-\pi,\qquad \theta=\frac{\pi v}{H}-\frac{\pi}{2},\tag{1}$$

where W(H) is the width (height) of the equirectangular image. These are then converted to 3D unit vectors described in Equation 2:

$$\mathbf{p}=[X,Y,Z]=[\cos\theta\sin\phi,\ \sin\theta,\ \cos\theta\cos\phi].\tag{2}$$

For the front hemisphere points ($Z\ge 0$), the projection is described in Equation 3

$$r=\sqrt{X^{2}+Y^{2}},\qquad \alpha=\arctan\frac{r}{|Z|},\qquad r_{\text{fisheye}}=\frac{\alpha w}{\pi},\tag{3}$$

where $w$ is the width of the fisheye image. The pixel coordinates in the front fisheye image are $x_f=c_x^{f}+\frac{X}{r}r_{\text{fisheye}}$ and $y_f=c_y^{f}+\frac{Y}{r}r_{\text{fisheye}}$. A similar approach is used for points in the back hemisphere. The resulting fisheye-to-equirectangular process is shown in Figure 4.
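A minimal numpy sketch of this forward mapping for a single equirectangular pixel, following Equations 1–3, is shown below; the fisheye center $(c_x^f, c_y^f)$ and image sizes are passed as arguments rather than the calibrated values of Table 1.

```python
import numpy as np

def equirect_to_front_fisheye(u, v, W, H, w, cxf, cyf):
    """Map an equirectangular pixel (u, v) to front-fisheye pixel coordinates (Eqs. 1-3)."""
    phi = 2.0 * np.pi * u / W - np.pi            # longitude (Eq. 1)
    theta = np.pi * v / H - np.pi / 2.0          # latitude (Eq. 1)
    X = np.cos(theta) * np.sin(phi)              # 3D unit direction vector (Eq. 2)
    Y = np.sin(theta)
    Z = np.cos(theta) * np.cos(phi)
    if Z < 0:                                    # back-hemisphere pixels are handled analogously
        return None
    r = np.hypot(X, Y)
    alpha = np.arctan2(r, abs(Z))                # angle from the optical axis (Eq. 3)
    r_fisheye = alpha * w / np.pi
    scale = r_fisheye / r if r > 1e-9 else 0.0
    return cxf + X * scale, cyf + Y * scale      # front-fisheye pixel coordinates
```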

The equirectangular format enables the extraction of perspective views at arbitrary viewing angles. Given a field of view (FoV), azimuth Θ, and zenith Φ, each pixel (i,j) in the output perspective image is mapped via Equations 4–7:

$$\mathbf{d}=\frac{\mathbf{p}}{\|\mathbf{p}\|},\qquad \mathbf{p}=\begin{bmatrix}1\\ 2\left(i/W_d-0.5\right)\tan\left(\mathrm{FoV}/2\right)\\ 2\left(j/H_d-0.5\right)\tan\left(\mathrm{FoV}\,H_d/(2W_d)\right)\end{bmatrix},\tag{4}$$
$$\mathbf{d}'=R_y(\Phi)\,R_z(\Theta)\,\mathbf{d}=[x,y,z]^{T},\tag{5}$$
$$u=W\left(\frac{\operatorname{arctan2}(y,x)}{2\pi}+0.5\right),\tag{6}$$
$$v=H\left(0.5-\frac{\arcsin z}{\pi}\right),\tag{7}$$

where (u,v) represent the equirectangular coordinates.
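A sketch of the corresponding per-pixel lookup, mapping a perspective-view pixel (i, j) back to the equirectangular coordinates it samples via Equations 4–7, is shown below; the rotation convention follows the reconstruction of Equation 5, and in practice the map is vectorized and used with interpolation.

```python
import numpy as np

def rot_y(a):
    return np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])

def rot_z(a):
    return np.array([[np.cos(a), -np.sin(a), 0.0], [np.sin(a), np.cos(a), 0.0], [0.0, 0.0, 1.0]])

def perspective_to_equirect(i, j, Wd, Hd, fov, azimuth, zenith, W, H):
    """Map perspective pixel (i, j) to the equirectangular (u, v) it samples; fov in radians."""
    p = np.array([1.0,
                  2.0 * (i / Wd - 0.5) * np.tan(fov / 2.0),
                  2.0 * (j / Hd - 0.5) * np.tan(fov * Hd / (2.0 * Wd))])   # Eq. 4
    d = rot_y(zenith) @ rot_z(azimuth) @ (p / np.linalg.norm(p))           # Eq. 5
    x, y, z = d
    u = W * (np.arctan2(y, x) / (2.0 * np.pi) + 0.5)                       # Eq. 6
    v = H * (0.5 - np.arcsin(z) / np.pi)                                   # Eq. 7
    return u, v
```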

The camera records (W×H) = (3840×1920) dual fisheye images. The cubemap images in Figure 5 have a FoV of 90° and a width $W_d$ and height $H_d$ of 960 pixels each. The generated left, front, up, down, right, and back perspective views are shown in Figure 5.

Figure 5
Indoor swimming pool with bright fluorescent lighting and clear blue water. Lane dividers are visible, along with a device floating in the foreground.

Figure 5. Cubemap representation with FoV 90°.

The ATRON utilizes a Slamtec S2P LiDAR (Madhavan and Adharsh, 2019) with a 50 m range, a 0.1125° angular resolution, and a 32 kHz sampling rate, along with a BNO055 9-DOF IMU.

3.2 2D to 3D object projection

Since the debris floats on the water's surface, LiDAR cannot be used in wavy sea states, and visual methods relying on YOLOv11 (Ali and Zhang, 2024) are utilized instead. Since the debris consists mostly of soda cans and the obstacles are floatable buoys, the YOLOv11 model was trained on 960×960 images containing objects of these two classes.

The distance (depth) of the debris or obstacles was estimated using a methodology similar to that of Li et al. (2025), given that the camera is mounted at a known height above the sea level, as shown in Figure 6, and assuming that $\theta_y$ is positive downward.

Figure 6
Diagram of a boat on the water with labeled axes for camera and USV frames, showing the camera's field of view (FOV) directed towards a buoy. On the right, a grid illustrates different angles with images of a pool from various perspectives marked as Top, Front, Right, Back, and Bottom.

Figure 6. 2D image to 3D Cartesian coordinate projection (left). $\theta_y$, the vertical angle between the camera axis and the detected object, is calculated using the heading equation for $\theta_y$. $\theta_x$ and $\theta_y$ are overlaid on the cubemap projection images (right). Notably, the calculations are valid only for $\theta_y>0$, with $h=1.5$ m.

The transformation from the 3D coordinates $[X_c,Y_c,Z_c]$ in the camera frame to pixel coordinates $[u,v]$ is shown in Equation 8 (Wang et al., 2010)

$$\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=\begin{bmatrix}f_x & 0 & c_x\\ 0 & f_y & c_y\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}X_c\\ Y_c\\ Z_c\end{bmatrix},\tag{8}$$

while the horizontal distance of the object (debris or obstacle) to the camera (Wang et al., 2010) is $Z_c=h/\tan(\theta_y)$.

The heading $\theta_y$ (the angle between the camera's principal axis and the ray connecting the camera to the object) is $\theta_y=\frac{\mathrm{FoV}}{2}\cdot\frac{v-c_y}{f_y}$. The yaw angle of the object about the camera's x-axis, $\theta_x$, is described in Equation 9

$$\theta_x=\frac{\mathrm{FoV}}{2}\cdot\frac{u-c_x}{f_x},\qquad X_c=Z_c\tan\theta_x.\tag{9}$$
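A minimal sketch of this projection for the center of a detected bounding box is shown below; it follows the reconstructed heading equations and Equation 9 with the camera mounted $h=1.5$ m above the water, and the intrinsic parameters are placeholders passed as arguments.

```python
import math

def bbox_to_planar_position(u, v, fx, fy, cx, cy, fov, h=1.5):
    """Estimate the planar camera-frame position (X_c, Z_c) of a floating object
    from the bounding-box center (u, v); fov is in radians, theta_y positive downward."""
    theta_y = (fov / 2.0) * (v - cy) / fy    # vertical angle below the principal axis
    theta_x = (fov / 2.0) * (u - cx) / fx    # horizontal angle about the camera x-axis
    if theta_y <= 0:                         # valid only when the object lies below the horizon
        return None
    Zc = h / math.tan(theta_y)               # horizontal range to the object
    Xc = Zc * math.tan(theta_x)              # lateral offset (Eq. 9)
    return Xc, Zc
```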

3.3 USV-kinodynamics

Assuming minimal roll and pitch of the USV, its state vector $\mathbf{x}$ in a 2D environment (Fossen, 2021) is $\mathbf{x}=[x,y,\theta,v,\omega]^T$, where a) $x,y$ are the USV's position coordinates in the 2D plane, b) $\theta$ is the USV's heading angle, c) $v$ is the linear velocity in the direction of the USV's heading, and d) $\omega$ is the USV's angular velocity around the vertical axis. The control input vector is $\mathbf{u}=[T,\tau]^T$ and affects the USV's speed and turning rate; $T$ corresponds to the thrust generated by the propulsion system, and $\tau$ is the USV's rotational torque that affects its heading.

The USV's differential kinematics is $\dot{x}=v\cos\theta$, $\dot{y}=v\sin\theta$, $\dot{\theta}=\omega$, while its simplified hydrodynamics model is expressed in Equation 10

$$\dot{v}=\frac{1}{m}\left(T-D_v v\right),\qquad \dot{\omega}=\frac{1}{I}\left(\tau-D_\omega\omega\right),\tag{10}$$

where $m$ ($I$) is the USV's mass (moment of inertia around the Z-axis) and $D_v$ ($D_\omega$) is the linear (angular) drag coefficient.
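A forward-Euler propagation of this model is sketched below; the 120 kg dry mass is taken from Section 2, while the inertia and drag coefficients are placeholder values, as they are not reported in the text.

```python
import numpy as np

def step(state, thrust, torque, dt, m=120.0, I=40.0, Dv=25.0, Dw=8.0):
    """Propagate the simplified USV state [x, y, theta, v, omega] by one time step dt."""
    x, y, theta, v, w = state
    x += v * np.cos(theta) * dt               # planar kinematics
    y += v * np.sin(theta) * dt
    theta += w * dt
    v += (thrust - Dv * v) / m * dt           # surge dynamics (Eq. 10)
    w += (torque - Dw * w) / I * dt           # yaw dynamics (Eq. 10)
    return np.array([x, y, theta, v, w])
```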

3.4 USV simultaneous localization and mapping

The USV utilizes 2D LiDAR complemented by its IMU to perform SLAM. The orientation data from the IMU are fused to improve the accuracy of the SLAM algorithm using the extended Kalman filter (EKF) (Ribeiro, 2004; Einicke and White, 2002). The EKF recursively estimates the robot state $x_k=[x,y,\theta]^T$ by fusing the planar odometry from scan matching with IMU orientation measurements. The filter's prediction step is expressed by Equations 11, 12

$$\hat{x}_{k|k-1}=f\left(\hat{x}_{k-1|k-1},u_k\right),\tag{11}$$
$$P_{k|k-1}=F_k P_{k-1|k-1}F_k^{T}+Q_k,\tag{12}$$

followed by its update step in Equations 13–15:

$$K_k=P_{k|k-1}H_k^{T}\left(H_k P_{k|k-1}H_k^{T}+R_k\right)^{-1},\tag{13}$$
$$\hat{x}_{k|k}=\hat{x}_{k|k-1}+K_k\left(z_k-h\left(\hat{x}_{k|k-1}\right)\right),\tag{14}$$
$$P_{k|k}=\left(I-K_k H_k\right)P_{k|k-1},\tag{15}$$

where $F_k$ and $H_k$ are the Jacobians of the motion and measurement models, respectively, $Q_k$ and $R_k$ represent the process and measurement noise covariances, respectively, and $K_k$ is the Kalman gain. The covariance of the LiDAR (IMU) odometry is obtained from its specifications.
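A compact numpy sketch of one predict/update cycle for the planar state is given below, with the scan-matching odometry supplying the body-frame increment and the IMU supplying a yaw measurement; the measurement model and covariances are illustrative assumptions rather than the exact SLAM Toolbox configuration.

```python
import numpy as np

def ekf_step(x, P, u, z_yaw, Q, R):
    """One EKF cycle (Eqs. 11-15) for the state [x, y, theta].
    u = (dx, dy, dtheta) is the body-frame odometry increment; z_yaw is the IMU yaw."""
    dx, dy, dth = u
    c, s = np.cos(x[2]), np.sin(x[2])
    x_pred = x + np.array([c * dx - s * dy, s * dx + c * dy, dth])   # Eq. 11
    F = np.array([[1.0, 0.0, -s * dx - c * dy],
                  [0.0, 1.0,  c * dx - s * dy],
                  [0.0, 0.0, 1.0]])                                  # motion-model Jacobian
    P_pred = F @ P @ F.T + Q                                         # Eq. 12
    H = np.array([[0.0, 0.0, 1.0]])                                  # yaw-only measurement model
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)           # Eq. 13
    innov = np.arctan2(np.sin(z_yaw - x_pred[2]), np.cos(z_yaw - x_pred[2]))
    x_new = x_pred + K @ np.array([innov])                           # Eq. 14, wrapped residual
    P_new = (np.eye(3) - K @ H) @ P_pred                             # Eq. 15
    return x_new, P_new
```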

3.5 Orienteering problem for USVs

The order of the collected debris is optimized by treating each piece of debris as a node in a classic orienteering problem (OP). Each debris node is assigned a score $S_i$, and the binary edge variable $E_{ij}\in\{0,1\}$ indicates whether the edge between nodes $i$ and $j$ is chosen as part of the optimized orienteering cycle ($E_{ij}=1$). The goal is to maximize the ensuing objective function (Gunawan et al., 2016) in Equation 16.

$$\arg\max_{E_{ij}}\ \sum_{i=2}^{N-1}\sum_{j=2}^{N}S_i E_{ij}.\tag{16}$$

Let $d(E_{ij})$ be the length of the edge $E_{ij}$; then, the OP satisfies the constraint in Equation 17

$$\sum_{i=1}^{N-1}\sum_{j=2}^{N}d(E_{ij})\,E_{ij}\le c,\tag{17}$$

where c is the maximum length of the orienteering cycle.

The USV implements a heuristic approach (Gunawan et al., 2016) to the OP. Figure 7 indicates that the orienteering path varies depending on the cost $c\in\{30,50\}$ m, where the depot node is at (0,0).

Figure 7
Chart a shows an orienteering problem solution with a value of 14, 15 nodes visited, and a path length of 27.88 meters, featuring the depot and visited nodes connected by red paths. Chart b presents a higher value of 22, visiting 23 nodes with a path length of 47.63 meters, showing a more complex route. Chart c displays a clustered orienteering problem solution with a value of 27, 9 clusters visited, and a path length of 43.33 meters, highlighting larger cluster nodes.

Figure 7. OP solution for various distance constraints. (a) OP solution with c = 30 m, (b) OP solution with c = 50 m, (c) Clustered OP solution via heuristic approach for c = 50 m. (a, b) Unclustered output; (c) clustered nodes.

The adopted approach assumes static debris, and a static high-level global planner is utilized, followed by a dynamic low-level local planner for small debris using a clustered OP (Angelelli et al., 2014; Elzein and Caro, 2022). A greedy algorithm is introduced that creates clusters with a maximum diameter, which represents regions in which the low-level dynamic follower is implemented.

Given the set of nodes $N$ with their positions and the maximum diameter $d_{\max}$, clusters are produced satisfying the maximum cluster diameter constraint $\mathrm{diameter}(C_j)=\max_{n_i,n_k\in C_j}\|p_i-p_k\|_2\le d_{\max}$, where $p_i$ is the position of node $n_i$.

A greedy algorithm selects a random node to seed a cluster and then inspects each unassigned node $n^*$. If including $n^*$ results in a cluster whose diameter does not exceed $d_{\max}$, it is added to that cluster. The algorithm continues until no further clusters can be created and has a complexity of $O(N^2)$.

Each cluster’s position is its centroid, and its score equals the number of nodes it contains. The original orienteering solution path is then mapped through these clusters, with consecutive duplicate clusters removed, as shown in Figure 7C. This enables hierarchical planning, in which global routes are determined using clusters, while local control manages individual debris.
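A sketch of this greedy, diameter-constrained clustering is given below; because the current cluster already satisfies the diameter constraint, it suffices to check the distances from the candidate node to all current members. The node positions form an N×2 array, and the d_max value in the usage comment is illustrative.

```python
import numpy as np

def greedy_clusters(positions, d_max):
    """Greedily group nodes into clusters whose pairwise diameter does not exceed d_max."""
    unassigned = list(range(len(positions)))
    clusters = []
    while unassigned:
        seed = unassigned.pop(0)                 # seed a new cluster with the next node
        members = [seed]
        for n in unassigned[:]:                  # inspect every still-unassigned node
            if all(np.linalg.norm(positions[n] - positions[m]) <= d_max for m in members):
                members.append(n)                # adding n keeps the cluster diameter <= d_max
                unassigned.remove(n)
        clusters.append(members)
    return clusters

# Each cluster's centroid becomes its OP position and its size becomes its score, e.g.:
# pts = np.asarray(debris_xy)
# clusters = greedy_clusters(pts, d_max=3.0)        # 3 m maximum diameter (illustrative value)
# centroids = [pts[c].mean(axis=0) for c in clusters]
# scores = [len(c) for c in clusters]
```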

This allows the planning algorithm in Section 3.6.1 to be agnostic to small deviations in the position of the debris (because of sea currents).

3.6 Path and trajectory planning

3.6.1 Path planning via OMPL

For path planning and obstacle avoidance, the USV relies on the OMPL (Şucan et al., 2012), and the USV's footprint (L×W) = (2.7×1.5 m) is utilized. The USV's configuration space is determined by extracting the free space obtained from the SLAM algorithm and subtracting the obstacle space. The path of the USV is then constrained by supplying the differential-drive constraint below to OMPL's constrained planner (Kingston et al., 2019). The USV can be thought of as a tank-driven system, as represented in Wang et al. (2019), where $v_L$ ($v_R$) is the velocity created by the left (right) thrusters, as described by Equation 18.

$$\begin{bmatrix}\dot{x}\\ \dot{y}\\ \dot{\theta}\end{bmatrix}=\begin{bmatrix}\tfrac{1}{2}\cos\theta & \tfrac{1}{2}\cos\theta\\ \tfrac{1}{2}\sin\theta & \tfrac{1}{2}\sin\theta\\ -\tfrac{1}{W} & \tfrac{1}{W}\end{bmatrix}\begin{bmatrix}v_L\\ v_R\end{bmatrix}.\tag{18}$$

These constraints are used in OMPL to obtain the path $\mathcal{P}=\{p \mid p=(x,y,\theta)\}$ via the RRT-Connect algorithm (Karaman and Frazzoli, 2011; Kuffner and LaValle, 2000). In environments with sparse obstacles, paths between subsequent nodes are obtained in less than 2 s on average, as the USV can, in the vast majority of cases, simply drive around the obstacle.
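A minimal sketch of such a planning query using OMPL's Python bindings with an SE(2) state space and RRT-Connect is given below; the planar bounds, start/goal poses, and validity check are placeholders, and the footprint and differential-drive constraint handling of the full system are omitted.

```python
from ompl import base as ob
from ompl import geometric as og

space = ob.SE2StateSpace()                       # states (x, y, theta)
bounds = ob.RealVectorBounds(2)
bounds.setLow(0.0)
bounds.setHigh(25.0)                             # placeholder planar extent in meters
space.setBounds(bounds)

ss = og.SimpleSetup(space)

def is_state_valid(state):
    # Placeholder: the real system tests the 2.7 m x 1.5 m footprint against the
    # SLAM occupancy grid and the detected obstacles.
    return True

ss.setStateValidityChecker(ob.StateValidityCheckerFn(is_state_valid))
ss.setPlanner(og.RRTConnect(ss.getSpaceInformation()))

start, goal = ob.State(space), ob.State(space)
start().setX(0.0)
start().setY(0.0)
start().setYaw(0.0)
goal().setX(10.0)
goal().setY(5.0)
goal().setYaw(0.0)
ss.setStartAndGoalStates(start, goal)

if ss.solve(2.0):                                # 2 s planning budget, as reported above
    path = ss.getSolutionPath()
    path.interpolate(50)
    print(path.printAsMatrix())
```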

3.6.2 Path tracking

The velocity of the USV is calculated using a regulated pure pursuit (RPP) controller (Macenski et al., 2023), an improved implementation of the classic pure pursuit (PP) controller, which operates at a constant linear velocity. The classic PP controller follows a look-ahead point specified by the look-ahead radius $R$; the furthest point on the desired path within this look-ahead radius is considered. The PP algorithm geometrically determines the curvature required to drive the vehicle from its current position to the look-ahead point, $\kappa=\frac{2\Delta x}{R^{2}}$, where $\Delta x$ is the lateral offset of the look-ahead point in the vehicle's coordinate frame and $R$ is the look-ahead radius. The vehicle's coordinate system is placed at the rear differential with the x-axis aligned with the vehicle's heading. The USV operates with a constant linear velocity $v$ (except for the starting and end points). The required angular velocity is $\omega=\kappa v=\frac{2\Delta x\,v}{R^{2}}$. The look-ahead distance $R$ acts as a tuning parameter: larger values result in smoother tracking with less oscillation but slower convergence to the path, while smaller values provide tighter path following at the cost of potential oscillations.

In RPP, the linear velocity is further scaled to improve performance along tight turns. Let $v$ be the desired nominal speed, and let $r_{\min}$ be the minimum radius of curvature at which the USV can turn at this nominal speed. RPP reduces the commanded linear velocity based on a threshold curvature $\kappa_{\text{thresh}}$. The implementation utilizes a modified regulated pure pursuit (MRPP) controller with a final threshold $\kappa_{\max}$, which acts as a safety net, to calculate the linear velocity $v_\kappa$ in Equation 19.

$$v_{\kappa}=\begin{cases}v, & |\kappa|<\kappa_{\text{thresh}},\\[4pt] \dfrac{v}{|r_{\min}|\,|\kappa|}, & \kappa_{\text{thresh}}\le|\kappa|<\kappa_{\max},\\[4pt] 0, & |\kappa|\ge\kappa_{\max}.\end{cases}\tag{19}$$

The MRPP was used to follow the path generated by OMPL, shown in Figure 8. The USV's actual path is represented by the blue line, using $R=0.4$ m, $r_{\min}=1.0$ m, and $v=1.47$ m/s.
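A sketch of the resulting velocity command, combining the pure-pursuit curvature with the regulation of Equation 19, is given below; the threshold values in the usage comment are illustrative, and the middle branch follows the reconstruction above.

```python
def regulated_pure_pursuit_cmd(delta_x, R, v_nom, r_min, k_thresh, k_max):
    """Return (v, omega) given the look-ahead point's lateral offset delta_x and radius R."""
    kappa = 2.0 * delta_x / R**2               # pure-pursuit curvature
    if abs(kappa) >= k_max:                    # safety net: stop on excessive curvature
        v = 0.0
    elif abs(kappa) >= k_thresh:               # slow down along tight turns (Eq. 19)
        v = v_nom / (abs(r_min) * abs(kappa))
    else:
        v = v_nom
    return v, kappa * v                        # omega = kappa * v

# Example with the reported experimental values R = 0.4 m, r_min = 1.0 m, v = 1.47 m/s
# (the curvature thresholds here are illustrative):
# v_cmd, w_cmd = regulated_pure_pursuit_cmd(delta_x=0.1, R=0.4, v_nom=1.47,
#                                           r_min=1.0, k_thresh=1.0, k_max=2.5)
```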

Figure 8
A grid map showing a path planning scenario with a black vehicle, represented as a rectangle with a highlighted border, navigating past orange circular obstacles. The vehicle's trajectory is marked with pink and green lines, avoiding red points that signify path nodes on the grid.

Figure 8. Orienteering (purple), OMPL (green), and actual (blue) path of the USV.

3.6.3 PID control and signal mixing

To achieve the desired behavior, the commanded USV velocity is regulated by PID controllers, with separate controllers for the linear and angular velocities. Discrete PID controllers with a sampling period $T_s$ are then utilized.
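A minimal discrete PID sketch with sampling period $T_s$ is given below; the positional form, output clamp, and example gains (derived from the tuning rules reported in Section 4.5) are illustrative rather than the exact onboard implementation.

```python
class DiscretePID:
    """Positional-form discrete PID controller with sampling period Ts."""

    def __init__(self, kp, ki, kd, Ts, out_limit):
        self.kp, self.ki, self.kd, self.Ts = kp, ki, kd, Ts
        self.out_limit = out_limit
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.Ts                     # rectangular integration
        derivative = (error - self.prev_error) / self.Ts     # backward difference
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(-self.out_limit, min(self.out_limit, u))  # clamp to the actuator range

# One controller per channel, e.g. with the 20 ms sampling period reported in Section 4.3:
# linear_pid = DiscretePID(kp=1.67, ki=1.67, kd=1.11, Ts=0.02, out_limit=1.0)
```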

4 Results

4.1 Camera calibration

The camera’s extrinsic parameters are estimated by performing live dual-fisheye to equirectangular mapping in order to minimize any stitching artifacts, as shown in Figure 4. The camera uses 1920×1920 pixel-images, resulting in the following intrinsic and extrinsic parameters from Wang et al. (2010), as summarized in Table 1.

Table 1
Table 1. Fisheye camera configuration parameters.

Using these parameters allows the formation of equirectangular images in 49 ms and cubemap images in 79 ms, resulting in a total per-image processing time of 128 ms at 4K resolution when deployed on an Intel i5 processor.

4.2 Object detection results

A dataset of debris (cans and plastic bottles) and obstacles (buoys) was collected by recording 3,840×1,920 images from the 360° camera and extracting four 960×960 cubemap images representing the front, left, right, and back views. A total of 1,416 images were extracted and annotated. Six image-augmentation steps were applied, including a) horizontal flipping, b) ±10% rotation, c) ±25% saturation, d) ±10% exposure, e) random bounding-box flipping, and f) mosaic stitching, as shown in Figure 9. These augmentations increased the training data size to 5,192 images. The training–validation–test split followed a 70–20–10 ratio.

Figure 9
Multiple images showing an indoor swimming pool from various angles. The pool is marked with lane lines, and some images reveal the ceiling and surrounding architecture.

Figure 9. Preprocessed images in YOLO training.

Varying lighting conditions were used by adjusting the image brightness and saturation. Cubemap face images were also rotated to make the model more adaptable against USV tilt/yaw, and mosaic stitching was implemented. These preprocessing adjustments are shown in Figure 9.

The model was trained with YOLOv11 using the Adam optimizer, an initial learning rate of 0.001, and a cosine-decay learning-rate scheduler with three warmup epochs. A batch size of 16 was used, and the dataset contained 4,000 random images split into a 70–20–10 distribution for training, validation, and testing, respectively. Validation results are shown in Figure 10 along with the training results. The model achieves an mAP50 of 0.76 and an mAP95 of 0.365, with the precision-recall curves shown in Figure 10 (bottom). Overall, in sea states of 3 and above with poor lighting conditions, YOLO efficiently identified buoys (98% accuracy) but had several false positives recognized as cans (74% accuracy).
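A sketch of such a training run using the Ultralytics API is shown below; the model variant, dataset YAML, and epoch count are placeholders, while the image size, batch size, optimizer, and learning-rate schedule follow the values reported above.

```python
from ultralytics import YOLO

model = YOLO("yolo11l.pt")           # placeholder model variant
model.train(
    data="atron_debris.yaml",        # placeholder dataset config with debris/obstacle classes
    epochs=100,                      # placeholder epoch count
    imgsz=960,                       # 960x960 cubemap faces
    batch=16,
    optimizer="Adam",
    lr0=0.001,                       # initial learning rate
    cos_lr=True,                     # cosine learning-rate decay
    warmup_epochs=3,
)
metrics = model.val()                # reports precision, recall, and mAP on the validation split
```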

Figure 10
Six images depict a swimming pool showing ground-truth labels and model-predicted labels for obstacles and debris. Below is a precision-recall curve graph comparing obstacles and debris with precision on the y-axis and recall on the x-axis.

Figure 10. Model validation and training results.

This model was further investigated using the confusion matrix shown in Figure 11a. Several false positives/negatives occurred in detecting debris (due to its small size) against the background (poor lighting conditions); several samples are shown in Figure 11b. The false positives were primarily due to sharp reflections on the water surface (environmental conditions), while the false negatives occurred when the debris appeared small in the images. Notably, a) the superclass debris contains several classes, including cans, plastics, and others, and b) the column elements of the normalized confusion matrix sum to one. The confusion matrix implies that the buoys were classified and identified correctly, the debris was recognized with 74% accuracy, and the few background instances were largely misclassified as debris (93% of instances). Inference on an RTX 4050 GPU resulted in a 25-ms delay.

Figure 11
The image contains two parts: (a) a normalized confusion matrix with three true classes—obstacle, debris, and background—and their predicted classes, showing values like 0.98 for obstacle correctly predicted. (b) three swimming pool images highlighting issues like reflections mistaken for debris and small or distant debris being undetected.

Figure 11. Confusion matrix and object detection failure cases. (a) Normalized Confusion Matrix, (b) Detection Failure Cases.

The model performs inference on the left, front, right, and back image views, as shown in Figure 12, where buoys are represented by orange cylinders and cans by red markers (bottom). Furthermore, Figure 12 showcases a successful 2D-to-3D projection using the bounding-box coordinates to estimate the positions of various debris and obstacles.

Figure 12
Indoor swimming pool with multiple lanes. In the lower section, a digital representation of the pool shows a path in red, marked with several orange and red points labeled as detections.

Figure 12. Stitched and annotated left, front, right, and back image views inferred with the YOLOv11 model (top) are used to locate debris and obstacles in the 2D map (bottom).

4.3 USV kinodynamic measurements

The data sampling period was 20 ms, while $\omega_{\max}=0.3$ rad/s; the latter was measured from the onboard IMU's gyroscope readings during rapid rotational maneuvers.

Since no direct velocity sensor was available, the maximum linear velocity was estimated by numerically integrating the acceleration data obtained from the IMU's accelerometer. The USV started from a stationary position and accelerated until it reached terminal velocity, $v_{\max}=1.47$ m/s. The mapping from the control signal to the resultant ATRON thrust and torque is nearly identical for clockwise and counterclockwise propeller rotation.
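A sketch of this estimate, trapezoidally integrating the forward accelerometer samples at the 20 ms sampling period, is given below (accelerometer bias handling is omitted).

```python
import numpy as np

def integrate_velocity(accel_x, Ts=0.02):
    """Trapezoidal integration of forward acceleration samples to estimate speed."""
    accel_x = np.asarray(accel_x, dtype=float)
    v = np.zeros_like(accel_x)
    for k in range(1, len(accel_x)):
        v[k] = v[k - 1] + 0.5 * (accel_x[k] + accel_x[k - 1]) * Ts
    return v    # the final samples approach the terminal velocity (about 1.47 m/s reported)
```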

4.4 Odometry and sensor fusion

An EKF was used to fuse LiDAR data with IMU data using the robot localization technique (Censi, 2008). Figure 13a shows the yaw angles obtained from the LiDAR odometry, from the IMU, and from the fused data. The EKF fuses the smooth LiDAR odometry (blue) with the noisy IMU orientation (red) to produce a robust and accurate fused orientation (yellow). Notably, the inclusion of the IMU is necessary since the USV can have small pitch and roll angles because of the sea state (waves and currents).

Figure 13
Chart a depicts yaw angle over time measured by LiDAR and IMU, with a filtered result. Chart b shows LiDAR and filtered lateral velocities over time. Image c illustrates a path trajectory with colored lines indicating motion direction and path.

Figure 13. Experimental USV results: (a) LiDAR fusion with IMU generating yaw angles. (b) USV lateral velocities. (c) Experimental USV path following.

The USV’s lateral velocity is shown in Figure 13b, reflecting tilting of the USV.

4.5 Path following results

To test the path-following capabilities of the USV, a path consisting of four corners was to be followed. The USV’s actual experimental response is shown in Figure 13c, where the use of the PID-based control significantly improves the robot’s path-following capabilities by reducing its overshoot.

Automatic tuning (Åström et al., 1993) was used for obtaining the PID parameters. The ultimate gain $K_u$ and ultimate period $T_u$ were different for the linear and angular (torsional) components of the USV controller: for the linear controller, $K_u=5$ and $T_u=2$ s, while for the angular controller, $K_u=3$ and $T_u=2$ s. The PID parameters used were

$$K_p=\frac{K_u}{3},\qquad K_i=\frac{2K_u}{3T_u},\qquad K_d=\frac{K_u T_u}{9}.$$
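Substituting the reported values into these rules gives, for the linear controller, $K_p=5/3\approx1.67$, $K_i=2\cdot5/(3\cdot2)\approx1.67$, and $K_d=5\cdot2/9\approx1.11$, and for the angular controller, $K_p=1.0$, $K_i=1.0$, and $K_d\approx0.67$.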

4.6 Path tracking for various sea states

The remote environment used in the simulation studies is shown at the bottom of Figure 14. This environment was built in NVIDIA's Isaac simulator (Collins et al., 2021), which simulates the USV's buoyancy, along with all sensors and transducers on top of the USV, under ROS. Different sea-state levels from 0 to 4 on the Beaufort wind scale (Meaden et al., 2007) are emulated, involving waves of up to 4 m. Furthermore, the induced wind and the nonzero roll and pitch angles affect the effectiveness of the 2D LiDAR, the odometry system, and the ability to recognize the debris. The varying angles create erroneous behavior in the MRPP algorithm (Macenski et al., 2023), resulting in rapid turns, as shown in the upper half of Figure 14, while tracking a 10×10 m square. In both cases, Bezier smoothing (Arvanitakis and Tzes, 2012) (passing through the vertices of this square) was utilized to improve the tracking performance. The maximum attainable velocities were enforced during the simulation while different sea states were applied (Meaden et al., 2007).

Figure 14
The top image shows graphs comparing USV odometry under different sea conditions. The left graph, for Sea State 0, has a mean squared error of 0.158, while the right graph, for Sea State 4, has an error of 0.968. Both graphs plot X and Y positions, with Bezier reference and EKF odometry paths. The bottom image is a 3D rendering of a coastal scene with dense trees, a small body of water, boats.

Figure 14. USV odometry tracking performance for different sea states (top) within the improved simulated environment (bottom).

5 Discussion

5.1 Summary of the results

This article presented ATRON, a large-scale autonomous USV designed for the removal of floating debris. With its 2.7×1.5 m footprint and 1 m3 collection capacity, ATRON represents a significant advancement over existing small-scale prototypes, demonstrating the feasibility of practical autonomous marine cleanup systems.

The system’s key innovations include the novel use of a 360° camera with projection techniques for debris localization, avoiding the limitations of depth cameras in aquatic environments. The dual fisheye to equirectangular projection relies on placing the camera at an a priori known height. YOLOv11L successfully estimated debris positions, while the mapping capabilities were obtained with a 2D LiDAR. The integration of OMPL with the orienteering problem enabled optimal path planning, with the clustered variant effectively reducing computational complexity while maintaining collection efficiency.

Experimental validation in both an indoor pool environment and a simulated environment confirmed the system’s capabilities. The sensor fusion approach combining LiDAR odometry with IMU data through an extended Kalman filter provided robust localization despite the pitch and roll variations common in aquatic environments. The MRPP controller with PID control delivered precise path following, thus validating the overall system architecture.

5.2 Limitations and concluding remarks

Despite the demonstrated effectiveness of ATRON in controlled conditions, several limitations must be acknowledged. The USV has thus far only been evaluated in an indoor pool environment where hydrodynamic perturbations are negligible. While the catamaran’s low center of mass provides inherent passive stability, the absence of trials in open-water conditions with currents, waves, and wind precludes the validation of performance under realistic operating scenarios due to the necessary regulatory constraints and permit requirements.

A further limitation concerns the visual localization pipeline. The 2D-to-3D projection requires approximately 0.13 s per frame for calibration, which, at an angular velocity of 0.3 rad/s and an object distance of 10 m, introduces a positional differential of approximately 0.45 m. In addition, the absence of damping on the camera mount transmits vibrations from the thrusters and conveyor mechanism to the sensor, introducing noise into the projection parameters. This combination of calibration latency and vibration-induced disturbance degrades the accuracy of debris localization.

Future work will focus on addressing these limitations. The image-processing pipeline is being optimized through reduced resolution and refined projection algorithms to decrease calibration latency. Image stabilization is under development using both software- and hardware-based approaches, including the utilization of the integrated gyroscope in the 360° camera and the addition of physical damping elements. Furthermore, open-water trials are planned once regulatory requirements are satisfied, enabling the assessment of ATRON’s stability and performance in dynamic marine environments and advancing the system toward operational deployment.

Data availability statement

Publicly available datasets were analyzed in this study. These data can be found at https://github.com/RISC-NYUAD/ATRON.

Author contributions

JoA: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project Administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review and editing. HJ: Conceptualization, Funding acquisition, Investigation, Methodology, Resources, Writing – original draft. BE: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft. JaA: Conceptualization, Data Curation, Investigation, Methodology, Resources, Writing – original draft. AT: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – review and editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the NYUAD Center for Artificial Intelligence and Robotics, Tamkeen under the New York University Abu Dhabi Research Institute Award CG010, and the Mubadala Investment Company in the UAE. The latter was not involved in the study design, collection, analysis, interpretation of data, the writing of this article, or the decision to submit it for publication.

Acknowledgements

This project acknowledges support from A. Oralov, D. Al Jorf, and F. Darwish for their contributions to the object detection system. Gratitude is expressed toward N. Evangeliou from NYUAD’s RISC Laboratory for his guidance on the various electromechanical systems.

Conflict of interest

Author JaA was employed by Mechanical Engineering, Egyptian Refining Company.

The remaining author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author AT declared that they were an editorial board member of Frontiers at the time of submission. This had no impact on the peer review process and the final decision.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frobt.2025.1718177/full#supplementary-material

References

Abaza, B. F. (2025). AI-Driven dynamic covariance for ROS 2 Mobile robot localization. Sensors 25, 3026. doi:10.3390/s25103026

Ahn, J., Oda, S., Chikushi, S., Sonoda, T., and Yasukawa, S. (2022). Design and development of ocean debris collecting unmanned surface vehicle and performance evaluation of collecting device in tank. J. Robotics, Netw. Artif. Life 9, 209–215. doi:10.57417/jrnal.9.3_209

Akib, A., Tasnim, F., Biswas, D., Hashem, M. B., Rahman, K., Bhattacharjee, A., et al. (2019). “Unmanned floating waste collecting robot,” in TENCON 2019 IEEE region 10 conference. IEEE, 2645–2650. doi:10.1109/TENCON.2019.8929537

Ali, M. L., and Zhang, Z. (2024). The YOLO framework: a comprehensive review of evolution, applications, and benchmarks in object detection. Computers 13, 336. doi:10.3390/computers13120336

Angelelli, E., Archetti, C., and Vindigni, M. (2014). The clustered orienteering problem. Eur. J. Operational Res. 238, 404–414. doi:10.1016/j.ejor.2014.04.006

Arvanitakis, I., and Tzes, A. (2012). “Trajectory optimization satisfying the robot’s kinodynamic constraints for obstacle avoidance,” in 2012 20th mediterranean conference on control and automation. MED IEEE, 128–133.

Åström, K. J., Hägglund, T., Hang, C. C., and Ho, W. K. (1993). Automatic tuning and adaptation for PID controllers-A survey. Control Eng. Pract. 1, 699–714. doi:10.1016/0967-0661(93)91394-c

Bae, I., and Hong, J. (2023). Survey on the developments of unmanned marine vehicles: intelligence and cooperation. Sensors 23, 4643. doi:10.3390/s23104643

Carcamo, J., Shehada, A., Candas, A., Vaghasiya, N., Abdullayev, M., Melnyk, A., et al. (2024). “AI-Powered cleaning robot: a sustainable approach to waste management,” in 2024 16th international conference on human System interaction. IEEE, 1–6.

Censi, A. (2008). “An ICP variant using a point-to-line metric,” in 2008 IEEE international conference on robotics and automation (IEEE), 19–25.

Chandra, S. S., Kulshreshtha, M., and Randhawa, P. (2021). “A review of trash collecting and cleaning robots,” in 2021 9th international conference on reliability, Infocom technologies and optimization (Trends and future directions). IEEE, 1–5.

Chrissley, T., Yang, M., Maloy, C., and Mason, A. (2017). “Design of a marine debris removal system,” in 2017 systems and information engineering design symposium. IEEE, 10–15.

Collins, J., Chand, S., Vanderkop, A., and Howard, D. (2021). A review of physics simulators for robotic applications. IEEE Access 9, 51416–51431. doi:10.1109/access.2021.3068769

Costanzi, R., Fenucci, D., Manzari, V., Micheli, M., Morlando, L., Terracciano, D., et al. (2020). Interoperability among unmanned maritime vehicles: review and first in-field experimentation. Front. Robotics AI 7, 91. doi:10.3389/frobt.2020.00091

Einicke, G. A., and White, L. B. (2002). Robust Extended Kalman filtering. IEEE Trans. Signal Process. 47, 2596–2599. doi:10.1109/78.782219

Elzein, A., and Caro, G. A. D. (2022). A clustering metaheuristic for large orienteering problems. PLOS ONE 17, e0271751. doi:10.1371/journal.pone.0271751

Flores, H., Motlagh, N. H., Zuniga, A., Liyanage, M., Passananti, M., Tarkoma, S., et al. (2021). Toward large-scale autonomous marine pollution monitoring. IEEE Internet Things Mag. 4, 40–45. doi:10.1109/iotm.0011.2000057

Flores, M., Valiente, D., Peidró, A., Reinoso, O., and Payá, L. (2024). Generating a full spherical view by modeling the relation between two fisheye images. Vis. Comput. 40, 7107–7132. doi:10.1007/s00371-024-03293-7

Fossen, T. I. (2021). Handbook of marine craft hydrodynamics and motion control, 2nd ed. John Wiley and Sons.

Fulton, M., Hong, J., Islam, M. J., and Sattar, J. (2019). Robotic detection of marine litter using deep visual detection models. Int. Conf. on Robotics and Automation (IEEE), 5752–5758. doi:10.1109/icra.2019.8793975

Gunawan, A., Lau, H. C., and Vansteenwegen, P. (2016). Orienteering problem: a survey of recent variants, solution approaches and applications. Eur. J. Operational Res. 255, 315–332. doi:10.1016/j.ejor.2016.04.059

Karaman, S., and Frazzoli, E. (2011). Sampling-based algorithms for optimal motion planning. Int. J. Robotics Res. 30, 846–894. doi:10.1177/0278364911406761

Kingston, Z., Moll, M., and Kavraki, L. E. (2019). Exploring implicit spaces for constrained sampling-based planning. Intl. J. Robotics Res. 38, 1151–1178. doi:10.1177/0278364919868530

Kuffner, J., and LaValle, S. (2000). RRT-connect: an efficient approach to single-query path planning. Proc. 2000 ICRA. Millenn. Conf. IEEE Int. Conf. Robotics Automation. Symposia Proc. (Cat. No.00CH37065) 2, 995–1001. doi:10.1109/robot.2000.844730

Lazzerini, G., Gelli, J., Della Valle, A., Liverani, G., Bartalucci, L., Topini, A., et al. (2024). Sustainable electromechanical solution for floating marine litter collection in calm marinas. OCEANS 2024 - Halifax, 1–6. doi:10.1109/oceans55160.2024.10753855

Li, X., Chen, J., Huang, Y., Cui, Z., Kang, C. C., and Ng, Z. N. (2024). “Intelligent marine debris cleaning robot: a solution to ocean pollution,” in 2024 international conference on electrical, communication and computer engineering (IEEE), 1–6.

Li, R., Zhang, B., Lin, D., Tsai, R. G., Zou, W., He, S., et al. (2025). Emperor Yu tames the flood: water surface garbage cleaning robot using improved A* Algorithm in dynamic environments. IEEE Access 13, 48888–48903. doi:10.1109/access.2025.3551088

Macenski, S., and Jambrecic, I. (2021). SLAM Toolbox: SLAM for the dynamic world. J. Open Source Softw. 6, 2783. doi:10.21105/joss.02783

Macenski, S., Singh, S., Martín, F., and Ginés, J. (2023). Regulated pure pursuit for robot path tracking. Aut. Robots 47, 685–694. doi:10.1007/s10514-023-10097-6

Madhavan, T., and Adharsh, M. (2019). “Obstacle detection and obstacle avoidance algorithm based on 2-D RPLiDAR,” in 2019 international conference on computer communication and informatics (IEEE), 1–4.

Meaden, G. T., Kochev, S., Kolendowicz, L., Kosa-Kiss, A., Marcinoniene, I., Sioutas, M., et al. (2007). Comparing the theoretical versions of the Beaufort scale, the T-Scale and the Fujita scale. Atmos. Research 83, 446–449. doi:10.1016/j.atmosres.2005.11.014

Prakash, N., and Zielinski, O. (2025). AI-enhanced real-time monitoring of marine pollution: part 1-A state-of-the-art and scoping review. Front. Mar. Sci. 12, 1486615. doi:10.3389/fmars.2025.1486615

Ribeiro, M. I. (2004). Kalman and extended Kalman filters: concept, derivation and properties. Inst. Syst. Robotics 43, 3736–3741.

Salcedo, E., Uchani, Y., Mamani, M., and Fernandez, M. (2024). Towards continuous floating invasive plant removal using unmanned surface vehicles and computer vision. IEEE Access 12, 6649–6662. doi:10.1109/access.2024.3351764

Shivaanivarsha, N., Vijayendiran, A. G., and Prasath, M. A. (2024). WAVECLEAN – an innovation in autonomous vessel driving using object tracking and collection of floating debris, in 2024 international conference on communication, computing and internet of things (IC3IoT), 1–6.

Subhashini, K., Shree, K. S., Abirami, A., Kalaimathy, R., Deepika, R., and Selvi, J. T. (2024). “Autonomous floating debris collection system for water surface cleaning,” in 2024 international conference on communication, computing and internet of things (IEEE), 1–6.

Şucan, I. A., Moll, M., and Kavraki, L. E. (2012). The open motion planning Library. IEEE Robotics and Automation Mag. 19, 72–82. doi:10.1109/mra.2012.2205651

Suryawanshi, R., Waghchaure, S., Wagh, S., Jethliya, V., Sutar, V., and Zendage, V. (2024). “Autonomous boat: floating trash collector and classifier,” in 2024 4th international conference on sustainable expert systems (ICSES), 66–70.

Turesinin, M., Kabir, A. M. H., Mollah, T., Sarwar, S., and Hosain, M. S. (2020). “Aquatic Iguana: a floating waste collecting robot with IOT based water monitoring system,” in 2020 7th international conference on electrical engineering, computer sciences and informatics (IEEE), 21–25.

Wang, Y., Li, Y., and Zheng, J. (2010). “A camera calibration technique based on OpenCV,” in The 3rd international conference on information sciences and interaction sciences. IEEE, 403–406.

Wang, W., Gheneti, B., Mateos, L. A., Duarte, F., Ratti, C., and Rus, D. (2019). Roboat: an autonomous surface vehicle for urban waterways. IEEE/RSJ International Conference on Intelligent Robots and Systems IROS, 6340–6347.

Keywords: collision avoidance, path planning, uncrewed marine vessel, YOLO object detection, orienteering problem

Citation: Abanes J, Jang H, Erkinov B, Awadalla J and Tzes A (2026) ATRON: Autonomous trash retrieval for oceanic neatness. Front. Robot. AI 12:1718177. doi: 10.3389/frobt.2025.1718177

Received: 03 October 2025; Accepted: 17 December 2025;
Published: 22 January 2026.

Edited by:

Mark R. Patterson, Northeastern University, United States

Reviewed by:

Alberto Topini, University of Florence, Italy
Wenyu Zuo, University of Houston, United States

Copyright © 2026 Abanes, Jang, Erkinov, Awadalla and Tzes. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Anthony Tzes, anthony.tzes@nyu.edu
