Automated oil spill detection using deep learning and SAR satellite data: a case study of the Lower Congo Basin

Cheng, Peng; Qi, Yanxin; Zhao, Jiasen

doi:10.3389/feart.2025.1667450

ORIGINAL RESEARCH article

Front. Earth Sci., 23 January 2026

Sec. Marine Geoscience

Volume 13 - 2025 | https://doi.org/10.3389/feart.2025.1667450


Peng Cheng

Yanxin Qi*

Jiasen Zhao

North China Institute of Aerospace Engineering, Langfang, China

Spaceborne Synthetic Aperture Radar (SAR) is a remote sensing sensor mounted on satellite platforms, offering image data with a wide spatial detection range and advantages such as all-weather, all-day observation, and low sensitivity to clouds and fog—making it ideal for marine oil spill monitoring. Sentinel-1A satellite has repeatedly detected oil slicks in SAR images of the Lower Congo Basin, a region with frequent natural oil and gas seepage. To address the limitations of existing segmentation models (e.g., blurred edge prediction and poor discrimination between oil spills and look-alikes) in natural oil spill detection, this study proposes a novel semantic segmentation model, NOS-Net, based on the U-Net architecture. The model deeply integrates the encoder-decoder structure of U-Net with the residual learning mechanism of ResNet, effectively solving the gradient vanishing/explosion problem in deep network training and enhancing the extraction of discriminative features. A lightweight Residual Convolutional Block Attention Module (RCBA) is innovatively introduced into the encoder’s bottleneck residual blocks, which adaptively learns feature channel importance weights via channel attention and focuses on oil spill-related spatial regions via spatial attention—strengthening feature representation while maintaining low computational overhead. Experimental validation was conducted on 110 preprocessed Sentinel-1 SAR images (1,250 × 650 × 3, subjected to radiometric calibration, 7 × 7 median filtering, and linear transformation). Results show that the proposed NOS-Net achieves an Intersection over Union (IoU) of 61.27% for oil spills and a mean IoU (mIoU) of 70.29% for five marine target categories (Oil Spill, Look-alike, Ship, Land, Sea Surface), outperforming the U-Net model with ResNet101 as the backbone by 7.48% and 5.32%, respectively.

1 Introduction

Marine oil slicks are microscopically thin, highly transient floating layers of liquid hydrocarbons (Leifer et al., 2012; Dong et al., 2022). They are generally categorized into two main types: natural seepage, where hydrocarbons seep from seabed reservoirs, and anthropogenic discharge, where hydrocarbons are intentionally or unintentionally released from ships, offshore oil and gas infrastructure, and land-based sources (MacDonald et al., 2015; McNutt et al., 2012). With the growing attention to the development and utilization of marine resources, oil slicks on the sea surface have become a significant environmental and public safety issue (Zeng and Wang, 2020). Currently, research on marine oil spills primarily focuses on anthropogenic discharge cases, such as the 2010 Deepwater Horizon oil spill in the Gulf of Mexico (Garcia-Pineda et al., 2017), while studies on oil spills caused by natural oil and gas seepage are relatively scarce. Natural oil and gas seepage has been reported in many regions worldwide (Mikolaj et al., 1972). Over the past few decades, Synthetic Aperture Radar (SAR) has been widely used to monitor natural oil and gas seepage areas in regions such as the Caspian Sea, Black Sea, and Gulf of Mexico.

The potential of SAR for precise oil spill detection stems from several characteristics. It offers all-weather, all-day observation, overcoming the weather and lighting limitations of optical remote sensing. Because oil spills suppress capillary and short gravity waves on the sea surface, they appear as identifiable dark spots in images. Multi-polarization technology enriches scattering information, helping to distinguish oil spills from look-alikes. SAR balances high spatial resolution with large-area coverage, adapting to multi-scenario requirements, and time-series data can support dynamic diffusion analysis of oil spills. Together, these characteristics make SAR a key technical means for oil spill monitoring.

SAR is highly effective for accurate oil spill detection, offering significant advantages and unique capabilities (Marghany, 2014). Operating reliably under diverse weather conditions and around the clock, SAR has become an indispensable tool for monitoring and identifying oil spills on the ocean surface (Espeseth et al., 2020). Its detection mechanism relies on the suppression of capillary and short gravity waves by oil, which reduces the surface roughness of seawater. This change appears as dark spots in SAR images, providing a distinct contrast to the surrounding ocean and enabling effective oil spill detection and monitoring (de Souza Júnior et al., 2024).

Beyond SAR, other remote sensing modalities such as unmanned aerial vehicles (UAVs) and aerial imaging have also been increasingly applied to oil spill detection, providing high-resolution RGB image data for accurate segmentation and analysis. For instance, De Kerf et al. (2024) constructed a segmented RGB image dataset captured by UAVs for oil spill detection in port environments, demonstrating the potential of UAV imagery in this field. A comprehensive review by Al-Sudani and Al-Suhail (2024) summarized deep learning techniques for image-based oil spill detection, covering various image modalities and methodological advancements. Kurah et al. (2025) optimized hyperparameters of deep learning models for oil spill segmentation in RGB images. Furthermore, research presented at the 2025 International Conference on Mobile, Intelligent, and Ubiquitous Computing (MIUCC) focused on explainable oil spill detection using UAV images (Golcarenarenji et al., 2025), highlighting the trend of integrating interpretability into UAV-based methods. These studies indicate that multi-modal image data, including UAV and aerial RGB images, have become important resources for oil spill detection, complementing SAR’s capabilities in different scenarios.

Despite its strengths, SAR-based methods face challenges in distinguishing oil spills from other phenomena that also produce dark spots in SAR imagery, such as wave shadows, algal blooms, and low-wind-speed regions behind land masses. These false positives present a significant obstacle to reliable oil spill detection (Ajadi et al., 2018). To overcome this limitation, multi-polarization SAR data offers a promising solution. By transmitting and receiving signals with varying polarimetric properties, multi-polarization SAR enriches the scattering information and improves the accuracy and reliability of oil spill detection (Topouzelis, 2008). A notable advance in automated oil spill detection from polarimetric SAR data was presented by Marghany (2019), who introduced the Quantum Immune Fast Spectral Clustering algorithm for the automatic detection of oil spills in quad-polarized RADARSAT-2 SAR data. This approach combines quantum-inspired principles with fast spectral clustering to improve the accuracy and efficiency of oil spill detection.

In the past, oil spill detection on the sea surface has primarily been achieved through semi-automatic image processing algorithms or manual visual interpretation. Semi-automatic algorithms include the Texture Classifier Neural Network Algorithm (TCNNA) (Garcia-Pineda et al., 2009) and Support Vector Machines (SVM) (Brekke and Solberg, 2008; Xu et al., 2014; Singha et al., 2016). These methods are simple in principle and fast to implement. However, they are susceptible to speckle noise on the sea surface and to uneven distribution of image grayscale, which can lower the accuracy of oil slick segmentation. Interactive analysis with visual interpretation remains a primary method of analysis (Ivanov et al., 2020; Jatiault et al., 2017). This method is highly accurate, but it is both time-consuming and labor-intensive, as it requires experienced personnel to spend considerable time and effort on identification.

In recent years, deep learning has achieved remarkable results in the field of computer vision, providing new avenues for research in remote sensing image classification (Song et al., 2017). Deep learning is a subset of machine learning that constructs multi-layer neural networks to automatically learn hierarchical feature representations from data, and semantic segmentation based on deep learning is a typical type of automatic segmentation technology—it achieves pixel-level classification of images, enabling precise identification of target regions (e.g., oil spills) and background. Its core working principle involves two key stages: feature extraction (via encoder networks to capture multi-scale spatial and semantic features) and feature reconstruction (via decoder networks to map extracted features back to the original image size for pixel-level classification). Compared with Marghany’s (2019) quantum computing-based method, deep learning differs significantly in technical logic: the Quantum Immune Fast Spectral Clustering algorithm relies on quantum-inspired optimization and spectral clustering to group similar pixels, emphasizing global data distribution patterns; while deep learning focuses on local and hierarchical feature learning, adapting to complex target shapes and environmental variations through network depth and attention mechanisms. However, deep learning does not always yield accurate results—its performance is highly dependent on dataset quality (e.g., sufficient sample size, balanced category distribution), and it may suffer from overfitting when facing unseen environmental conditions or target variations; additionally, deep networks are prone to misclassifying targets with similar appearances (e.g., oil spills and look-alikes) if fine-grained feature representation is insufficient.

Some deep learning (DL) methods have been used for SAR oil spill detection. Orfanidis et al. (2018) proposed a method that combines the advantages of deep convolutional neural networks (DCNN) with SAR images to semantically segment input SAR images into multiple regions of interest. Bianchi et al. (2020) proposed an Oil Fully ConvNet (OFCN) network based on the U-Net architecture. This network does not have dense layers, allowing it to handle variable-sized inputs and is specifically designed for large-scale oil spill detection in SAR images. Shaban et al. (2021) proposed a two-stage deep learning framework. In the first stage, a 23-layer convolutional neural network classifies images based on the percentage of oil spill pixels. In the second stage, a five-stage U-Net structure performs semantic segmentation, achieving promising results in oil spill detection. Although there is no universally recognized definition of how many layers constitute a “deep” learning model, typical deep networks generally include at least four to five layers (Ball et al., 2017). The deep learning networks mentioned earlier have relatively shallow depths. However, due to the complex marine environment where natural phenomena can suppress short waves and create dark spots on SAR images (Brekke and Solberg, 2005), deeper networks are indeed needed to capture more features and achieve better classification results.

Therefore, this study proposes NOS-Net, a novel deep learning segmentation model for oil spill detection. The core innovations of the model are:

Deep architecture fusion: Based on the encoder-decoder structure of U-Net, the residual learning mechanism of ResNet is deeply integrated, which effectively solves the problem of gradient vanishing/explosion in deep network training and significantly increases the network depth to capture richer and more discriminative features.

Residual Convolutional Block Attention Module (RCBA): A lightweight channel and spatial attention module is introduced into the bottleneck residual block of the encoder to form the RCBA. The RCBA module uses the channel attention mechanism to adaptively learn the importance weights of the feature channels, and focuses on the key spatial regions related to the oil spill through the spatial attention mechanism. This design significantly enhances the model’s ability to extract subtle features (such as oil spill edges) and to distinguish easily confused categories (such as oil spill and look-alike), while maintaining low computational overhead.

2 Materials and methods

2.1 Region of interest

The Lower Congo Basin is the world’s third-largest natural oil seepage area, located offshore near Angola and in the southern part of the Congo Delta (Jatiault et al., 2017). The rifting of the Congo/Angola continental margin began in the Early Cretaceous period (144–140 million years ago) and peaked between 127 and 117 million years ago (Karner et al., 2003). From the Late Cretaceous to the early Eocene, low-amplitude/low-frequency sea level changes and persistent warm climates (greenhouse conditions) led to the formation of a carbonate/siliciclastic sedimentary slope in the region (Séranne, 1999). The sedimentary sequence includes Tertiary source rocks, from which thermogenic oil and gas produced by the thermal decomposition of organic matter can escape (Burwood, 1999). This process is known as subseafloor fluid migration. Seafloor fluid seepage manifests as mud volcanoes, mounds, or depressions known as “pockmarks” (King and Maclean, 1970). In the Lower Congo Basin, seafloor oil and gas seepage primarily manifests as pockmarks (Gay et al., 2007). The bathymetry of the study area is shown in Figure 1.

Figure 1

Map depicting the bathymetry of a coastal region along the eastern Atlantic Ocean, from 4°S to 8°S and 10°E to 15°E. Depth is shown in shades of blue and purple, ranging from minus six thousand meters to zero meters. A compass rose indicates north at the top right. A scale bar represents distances of zero to one hundred kilometers.

Figure 1. Depth topographic map.

Oil and gas seeping from the seafloor rise to the sea surface, where they are identified by analyzing dark patches in SAR images. Ocean currents and wind affect the shape and lifespan of oil slicks, with wind speed significantly influencing the effectiveness of SAR-based oil spill detection. Therefore, research into the marine environment is essential for understanding these dynamics. Surface water circulation is dominated by the southward Angola Current (AC), which results from the combination of different surface and subsurface currents, including the South Equatorial Under-Current (SEUC), Equatorial Under Current (EUC), the steady-state surface Angolan Dome (AD), and the subsurface Angolan Gyre (AG) (Moroshkin et al., 1970). The Angola Current is seasonal, with its flow velocity varying over time; typically, the current is stronger in March than in July (Hardman-Mountford et al., 2003). The wind speed in the study area determines the quality of SAR images. When the wind speed is low, there are fewer capillary and short gravity waves on the calm sea surface, resulting in a decrease in backscattered echo signals. This reduces the contrast between the oil-covered and oil-free sea surface and makes oil spill detection and identification more challenging in SAR images. Conversely, high wind speeds can create turbulent conditions that may disperse or submerge floating oil, leading to chaotic information in SAR images that hinders oil spill recognition (Topouzelis, 2008). Under moderate to low wind speeds (3.0–7.0 m/s), conditions are favorable for oil spill detection because seawater and oil slicks are easier to distinguish. The Sentinel-1 satellite operates in the C-band (central frequency: 5.405 GHz, wavelength: approximately 5.55 cm), which is widely recognized as the optimal band for marine oil spill detection (Cheng et al., 2024).

This study adopts a dual-polarization (VV/VH) acquisition mode, where VV (Vertical transmit-Vertical receive) and VH (Vertical transmit-Horizontal receive) serve as the core polarization channels. The main reasons for selecting this configuration are as follows. VV polarization is sensitive to sea surface roughness and capillary wave distribution, enabling clear distinction between oil-covered and oil-free sea surfaces. VH polarization enhances the detection capability for weak scattering targets (e.g., thin oil slicks) by capturing cross-polarized scattering signals, compensating for the limitations of VV polarization in detecting thin oil films (Tong and Xie, 2025). Dual-polarization data can effectively suppress false positives caused by look-alike phenomena (e.g., algal blooms, low-wind-speed regions) through the polarization ratio (VH/VV), as the polarimetric scattering characteristics of oil spills and look-alikes differ significantly. The VV/VH dual-polarization mode is a standard acquisition configuration for Sentinel-1’s Interferometric Wide (IW) swath mode, enabling wide spatial coverage (up to 250 km) and providing sufficient available data for the Lower Congo Basin. Other key SAR data parameters are listed in Table 1.
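The polarization ratio mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the VV and VH channels are already calibrated to linear backscatter power; the function name `polarization_ratio_db`, the `eps` guard, and the sample values are hypothetical and not taken from the study.

```python
import numpy as np

def polarization_ratio_db(sigma0_vv, sigma0_vh, eps=1e-10):
    """Return the VH/VV backscatter ratio in decibels for linear-power inputs."""
    vv = np.asarray(sigma0_vv, dtype=np.float64)
    vh = np.asarray(sigma0_vh, dtype=np.float64)
    # eps (an assumed guard value) avoids division by zero and log of zero
    return 10.0 * np.log10((vh + eps) / (vv + eps))

# Hypothetical linear-power samples for a 2 x 2 patch
vv = np.array([[0.020, 0.002], [0.018, 0.019]])
vh = np.array([[0.0020, 0.0002], [0.0018, 0.0019]])
ratio_db = polarization_ratio_db(vv, vh)
```

A per-pixel ratio map of this kind could be used as an additional input channel or as a screening feature; the study does not state how the ratio enters the model.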

Table 1

Table 1. SAR data key parameter table.

The Sentinel-1 satellites (Sentinel-1A and 1B) adopt a sun-synchronous orbit, with a revisit cycle of 6 days for the study area. Based on ERA5 reanalysis data and in-situ buoy observations (Hersbach et al., 2020), the wind field characteristics during the satellite overpass period (2020–2022) are summarized as follows:

Wind Speed Distribution: The dominant wind speed ranges from 2.0 to 8.0 m/s, consistent with the moderate to low wind speed conditions (3.0–7.0 m/s) selected for image acquisition. Due to coastal sheltering effects, the wind speed on the inner continental shelf (2.0–4.0 m/s) is significantly lower than that on the outer continental shelf and slope zone (4.0–8.0 m/s).

2.1.1 Seasonal characteristics of wind direction

Dry season (June-September): Dominated by southeasterly trade winds, with stable direction and an average wind speed of 3.5–5.5 m/s.

Wet season (October-May): Influenced by equatorial westerlies, the wind direction shifts to southwest with increased variability, and the average wind speed is 4.5–6.5 m/s.

Impact on SAR Detection: The seasonal wind field directly regulates sea surface roughness. Under the influence of southeasterly trade winds in the dry season, the sea surface is relatively stable, and oil slicks form clear dark spots with high contrast; under the influence of westerlies in the wet season, wind-driven waves increase background noise, reducing the discriminability between oil spills and look-alikes.

2.1.2 Hydrodynamic parameters

The hydrodynamic regime of the study area is jointly dominated by surface currents, subsurface currents, and tidal motions. The core parameters are obtained based on ADCP observations (Buijsman and Ridderinkhof, 2007) and ocean circulation models (Fox-Ke et al., 2019).

2.1.3 Surface currents (0–200 m)

Angola Current (AC): Flows southward along the shelf edge (100–200 m water depth), with seasonal variations in flow velocity (20–50 cm/s in March, 15–35 cm/s in July). This current transports warm and saline equatorial water southward, affecting the horizontal migration of surface oil slicks.

Benguela Current (BC): Flows northward on the outer continental shelf (200–500 m water depth) at a velocity of 10–25 cm/s, forming a countercurrent system with the Angola Current and triggering coastal upwelling in the study area.

2.1.4 Subsurface currents (200–2000 m)

South Equatorial Under Current (SEUC): Located at 100–200 m water depth, flows eastward at a velocity of 10–30 cm/s, influencing vertical seawater mixing and the oil droplet ascent process.

North Atlantic Deep Water (NADW): Flows southward at 1,200–2,000 m water depth with a relatively low velocity (2–5 cm/s), participating in the long-term transport of deep-seated hydrocarbons.

Tidal Motions: The study area is dominated by semidiurnal tides with a tidal range of 1.5–2.5 m. The tidal current velocity is 5–15 cm/s, which, although weaker than the average velocity of ocean currents, can cause short-term displacement of oil slicks (especially in the inner continental shelf area).

2.2 Data set

Because oil spill accidents are sudden and the marine environment is complex, few SAR images containing oil spill areas are available for research, and oil spill datasets are difficult to produce. Most previous studies used datasets built for specific conditions. Models trained on such datasets segment well under those conditions but can produce large errors when applied elsewhere (Liu et al., 2025). Therefore, an open and common oil spill dataset is crucial for this field.

The images used in this study were acquired by the Sentinel-1 satellite. Sentinel-1 is the successor of the ENVISAT-ASAR satellite and is part of the European Space Agency (ESA) Global Monitoring for Environment and Security (GMES) programme, now known as the Copernicus Programme. Krestenitis et al. (2019) identified and produced an oil spill dataset using Sentinel-1 satellite images combined with the CleanSeaNet service of the European Maritime Safety Agency (EMSA). The dataset of this study is derived from Reference 45. A total of 110 preprocessed Sentinel-1 SAR images (specification: 1,250 × 650 × 3) were used in this study for model training and validation to ensure consistency in data scale and format.

Figure 2 shows the SAR images and their corresponding label images. The labels of the dataset are divided into five categories: Oil Spill, Look-alike, Ship, Land, and Sea Surface, which are represented by cyan, red, brown, green, and black respectively (Shaban et al., 2021). The production process of the dataset includes two key steps: first, identify the oil spill locations in the Lower Congo Basin from Sentinel-1 images and crop out the oil spill areas and related background areas; second, perform standardized preprocessing on the cropped images, including radiometric calibration, 7 × 7 median filtering, and linear transformation, to eliminate noise interference and unify data distribution. Due to the limited size of the dataset, this study did not set up an independent validation set, and the test set was used as a substitute for the validation set for model validation.
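The three preprocessing steps named above (radiometric calibration, 7 × 7 median filtering, and linear transformation) can be sketched as follows. This is an illustrative pipeline under stated assumptions, not the authors' code: the calibration uses the common Sentinel-1 form sigma0 = DN² / A² with a single constant gain `cal_gain` (real products supply A as a look-up table), and the "linear transformation" is assumed to be a min-max stretch to 0–255, which the text does not specify.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def calibrate(dn, cal_gain):
    """Radiometric calibration sketch: sigma0 = DN^2 / A^2 (A assumed constant)."""
    return (np.asarray(dn, dtype=np.float64) ** 2) / (cal_gain ** 2)

def median_filter(img, size=7):
    """size x size median filter with reflect padding; output keeps the input shape."""
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))  # (H, W, size, size)
    return np.median(windows, axis=(-2, -1))

def linear_stretch(img, out_max=255.0):
    """Assumed linear transformation: min-max rescale to [0, out_max]."""
    lo, hi = float(img.min()), float(img.max())
    if hi == lo:
        return np.zeros_like(img, dtype=np.float64)
    return (img - lo) / (hi - lo) * out_max

def preprocess(dn, cal_gain=500.0):
    """Chain the three steps in the order given in the text."""
    return linear_stretch(median_filter(calibrate(dn, cal_gain)))

# Hypothetical digital numbers standing in for a small SAR crop
rng = np.random.default_rng(0)
dn = rng.integers(0, 1000, size=(65, 125))
out = preprocess(dn)

# An isolated bright pixel is removed by the median filter
spike = np.zeros((21, 21))
spike[10, 10] = 1.0
despiked = median_filter(spike)
```

The median filter is the step that suppresses isolated speckle-like outliers while preserving the edges of larger dark regions, which is why it precedes the intensity rescaling here.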

Figure 2

Two sets of images labeled (a) and (b). On the left, (a) shows two grayscale satellite images, one focused on a wide landscape, the other on a detailed section with visible lines. On the right, (b) displays corresponding color-coded illustrations with red, green, cyan, and orange areas indicating different regions of interest.

Figure 2. Sample images of dataset label color: (a) SAR images; (b) RGB masks.
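For training, color-coded masks like those in Figure 2(b) are typically converted to integer class maps. The sketch below shows one way to do this. Only the color names come from the text; the exact RGB triplets and the class index order in `CLASS_COLORS` are assumptions and should be checked against the actual label files.

```python
import numpy as np

# Class index -> (name, RGB). The RGB values are assumed, not taken from the study.
CLASS_COLORS = {
    0: ("Sea Surface", (0, 0, 0)),      # black
    1: ("Oil Spill", (0, 255, 255)),    # cyan
    2: ("Look-alike", (255, 0, 0)),     # red
    3: ("Ship", (153, 76, 0)),          # brown
    4: ("Land", (0, 153, 0)),           # green
}

def rgb_mask_to_index(mask_rgb):
    """Convert an (H, W, 3) color mask to an (H, W) class-index map.

    Pixels whose color matches no class are marked -1.
    """
    mask_rgb = np.asarray(mask_rgb)
    index = np.full(mask_rgb.shape[:2], -1, dtype=np.int64)
    for class_id, (_, color) in CLASS_COLORS.items():
        match = np.all(mask_rgb == np.array(color), axis=-1)
        index[match] = class_id
    return index

# Tiny hypothetical mask: mostly sea, one oil pixel, one look-alike pixel
mask = np.zeros((2, 3, 3), dtype=np.uint8)
mask[0, 1] = (0, 255, 255)
mask[1, 2] = (255, 0, 0)
labels = rgb_mask_to_index(mask)
```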

To ensure the reproducibility and transparency of this study, the data sources and access channels are clearly specified as follows:

Sentinel-1 SAR raw images: Available for download via the European Space Agency (ESA) Copernicus Open Access Hub (https://scihub.copernicus.eu/dhus/). Users can search and obtain image data of the Lower Congo Basin (geographic coordinates: 4°S–8°S, 10°E−15°E) with acquisition times meeting moderate to low wind speed conditions (3.0–7.0 m/s).

Labeled dataset reference: The labeling system and category definitions of this study are consistent with the dataset proposed by Krestenitis et al. (2019). For access to the complete labeled dataset used in this study, researchers can contact the corresponding author (Yanxin Qi, email: 3252211094@qq.com) or refer to the original dataset repository of Krestenitis et al. (https://doi.org/10.3390/rs11151762) to obtain standardized labeling guidelines.

CleanSeaNet service data: Supplementary oil spill verification information can be retrieved through the European Maritime Safety Agency (EMSA) CleanSeaNet portal (https://cleanseanet.emsa.europa.eu/). This platform provides real-time and historical oil spill monitoring records, which can support dataset validation.

2.3 Semantic segmentation model

2.3.1 Introduction to U-Net model

U-Net (U-shaped Neural Network) was proposed by Ronneberger et al. (2015) as a fully convolutional encoder-decoder architecture specifically designed for biomedical image segmentation. As shown in Figure 3, its encoder (downsampling path) consists of four modules, each comprising two convolutional layers followed by a max-pooling layer; from one module to the next, pooling halves the spatial dimensions (width and height) of the feature maps and the convolutions double the number of channels, thereby enabling multi-scale feature extraction. Correspondingly, the decoder (upsampling path) includes four matching modules, each consisting of a 2 × 2 transposed convolutional layer and two 3 × 3 convolutional layers; the upsampling step doubles the spatial dimensions and halves the number of channels of the feature maps, which are then fused with the corresponding feature maps from the encoder via concatenation to ultimately restore the original image size. This architecture performs excellently in single-channel biomedical image segmentation and is compatible with the single-channel characteristic of SAR images. However, in marine oil spill detection, the complex marine environment (e.g., oil slick look-alike interference and blurred edge details) exposes its limitations of insufficient network depth and limited feature representation capability, making it difficult to meet the requirements of high-precision segmentation.
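The halving and doubling pattern described above can be traced with a short shape calculation. This is a sketch that assumes "same"-padded convolutions, so that only pooling and transposed convolution change the spatial size, and a base width of 64 channels; the original U-Net uses unpadded convolutions, so its exact sizes differ. The function name `unet_shapes` is hypothetical.

```python
def unet_shapes(height, width, base_channels=64, depth=4):
    """Return (encoder, bottleneck, decoder) feature-map shapes as (H, W, C)."""
    if height % (2 ** depth) or width % (2 ** depth):
        raise ValueError("height and width must be divisible by 2**depth")
    encoder = []
    h, w, c = height, width, base_channels
    for _ in range(depth):
        encoder.append((h, w, c))          # shape after the module's two convolutions
        h, w, c = h // 2, w // 2, c * 2    # pooling halves H, W; next convs double C
    bottleneck = (h, w, c)
    decoder = []
    for _ in range(depth):
        h, w, c = h * 2, w * 2, c // 2     # transposed conv doubles H, W; halves C
        decoder.append((h, w, c))          # shape after fusing with the skip feature
    return encoder, bottleneck, decoder

enc, mid, dec = unet_shapes(512, 512)
for level, shape in enumerate(enc):
    print("encoder level", level, shape)
print("bottleneck", mid)
```

Each decoder shape matches the encoder shape at the same level, which is what allows the skip features to be concatenated along the channel axis.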

Figure 3

Diagram illustrating a U-Net architecture for image segmentation. It features a symmetrical encoder-decoder structure with down-sampling operations on the left and up-sampling on the right. The left side shows layers of decreasing dimensions, marked as

Figure 3. U-Net network architecture diagram.

2.3.2 Introduction to ResNet model

ResNet (Residual Network) was proposed by He et al. (2016). Its core innovation is the residual learning framework, which addresses two major problems of deep convolutional neural networks. The first is gradient vanishing/explosion: if the gradient of each network layer is less than 1, the backpropagated gradient approaches 0 as the number of layers increases (gradient vanishing), whereas if it is greater than 1, the gradient grows sharply as the network deepens (gradient explosion); both situations make the network difficult to converge. The second is accuracy saturation and degradation: when the number of layers increases beyond a certain point, model accuracy first saturates and then decreases (network degradation). The residual structure, shown in Figure 4, avoids these problems by learning the residual mapping between input and output: if the input is x and the expected mapping is H(x), the residual is F(x) = H(x) - x. When the residual approaches 0, the block reduces to an identity mapping, so additional depth does not cause accuracy loss, which supports a significant increase in network depth. However, the original ResNet architecture is not designed for image segmentation and lacks the multi-scale feature fusion (encoder-decoder structure) of U-Net, resulting in insufficient adaptability in tasks requiring fine edge segmentation such as oil spill detection.
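Both points in this paragraph can be checked numerically. The sketch below is a toy illustration, not the ResNet implementation: it uses fully connected weights instead of convolutions, and the per-layer gradient factors 0.9 and 1.1 and the depth of 50 are arbitrary assumed values.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy residual block: y = ReLU(F(x) + x), with F(x) = w2 @ ReLU(w1 @ x)."""
    fx = w2 @ relu(w1 @ x)
    return relu(fx + x)

# When the residual branch outputs zero, the block passes its input through unchanged
x = np.array([0.5, 1.0, 2.0])            # non-negative, as after a previous ReLU
zero = np.zeros((3, 3))
y_identity = residual_block(x, zero, zero)

# Repeated multiplication by a per-layer factor shrinks or inflates the gradient
depth = 50
vanish = 0.9 ** depth     # factor below 1: gradient decays toward 0
explode = 1.1 ** depth    # factor above 1: gradient grows rapidly
print(y_identity, vanish, explode)
```

Because the skip path adds x directly, the block only has to learn a correction F(x) on top of the identity, which is the property the paragraph relies on.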

Figure 4

Diagram of a ResNet residual block for oil spill detection. Input X is processed via two weight layers to generate F(x), summed with the identity input X; the result (F(x)+x) passes through a ReLU activation, solving gradient vanishing in deep network training.

Figure 4. Residual structure diagram.

2.3.3 Introduction to NOS-Net model

To address the problems of U-Net’s insufficient depth and ResNet’s poor adaptability to segmentation tasks, this study proposes NOS-Net. Its core design integrates the encoder-decoder structure of U-Net with the residual learning mechanism of ResNet, while introducing the Residual Convolutional Block Attention Module (RCBA). RCBA builds on the Convolutional Block Attention Module (CBAM), which was proposed by Woo et al. (2018) and has demonstrated its effectiveness in multiple visual tasks, including image classification, object detection, and semantic segmentation.

1. Architecture Fusion Logic. NOS-Net retains U-Net’s encoder-decoder structure: the encoder uses downsampling to capture global oil spill region features, and the decoder adopts upsampling to restore edge details, satisfying the dual requirements of “large-scale recognition + fine-grained segmentation” for SAR oil spill detection. Meanwhile, it embeds ResNet’s residual learning mechanism by integrating residual structures into the convolutional layers of the encoder, which solves the gradient vanishing/explosion problem during deep network training and supports the improvement of network depth to extract richer discriminative features.

2. Residual Convolutional Block Attention Module. To enhance the network’s ability to distinguish oil spill edges and oil slick look-alikes, a lightweight channel-spatial attention mechanism is integrated into the residual block to construct the RCBA module. The structure diagram of the residual convolutional block attention module is shown in Figure 5.

Channel Attention: Adaptively learns the importance weights of feature channels to highlight key channels related to oil spill texture and grayscale features.

Spatial Attention: Focuses on spatial regions where oil spills may occur in the image, suppressing background noise interference such as ocean waves and ships.

Lightweight Design: Controls computational overhead while improving feature representation capability, making it suitable for large-scale batch processing of SAR images.

3. Overall Network Architecture. The architecture of NOS-Net is shown in Figure 6.

Figure 5

Diagram of the Residual Convolutional Block Attention Module (RCBA) integrating Conv1x1, Conv3x3, and Batch Normalization. It includes Channel Attention (feature pooling/concatenation/convolution/sigmoid) and Spatial Attention (fully connected layers), with arrows indicating data flow to enhance oil spill feature extraction.

Figure 5. Structure of residual convolutional block attention module.
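The channel and spatial attention steps can be illustrated with a forward pass in plain NumPy. This is a sketch in the style of CBAM (Woo et al., 2018), not the authors' RCBA: the Conv1x1/Conv3x3/Batch Normalization layers of Figure 5 are omitted, the residual wiring `x + refined` is an assumption, and all names, the reduction ratio, the 7 × 7 kernel size, and the random weights are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w_down, w_up):
    """x: (C, H, W). A shared two-layer MLP scores avg- and max-pooled descriptors."""
    avg = x.mean(axis=(1, 2))
    mx = x.max(axis=(1, 2))

    def mlp(v):
        return w_up @ np.maximum(w_down @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))                           # (C,)

def spatial_attention(x, kernel):
    """kernel: (2, k, k), applied to the stacked channel-wise mean and max maps."""
    k = kernel.shape[-1]
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])           # (2, H, W)
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))    # zero padding
    windows = sliding_window_view(padded, (k, k), axis=(1, 2))   # (2, H, W, k, k)
    return sigmoid(np.einsum("chwij,cij->hw", windows, kernel))  # (H, W)

def rcba_refine(x, w_down, w_up, kernel):
    """Channel attention, then spatial attention, then an assumed residual add."""
    x_c = x * channel_attention(x, w_down, w_up)[:, None, None]
    x_s = x_c * spatial_attention(x_c, kernel)[None, :, :]
    return x + x_s

rng = np.random.default_rng(0)
channels, reduction = 8, 4
x = rng.normal(size=(channels, 16, 16))
w_down = rng.normal(size=(channels // reduction, channels))
w_up = rng.normal(size=(channels, channels // reduction))
kernel = rng.normal(size=(2, 7, 7)) * 0.1
out = rcba_refine(x, w_down, w_up, kernel)
```

The added parameters are only the small shared MLP and one two-channel kernel, which is consistent with the "lightweight" description in the text.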

Figure 6

Flowchart of a convolutional neural network architecture with an input layer of size two hundred fifty-six by two hundred fifty-six by one. It includes layers labeled RCBA and operations like max pooling, stride equals two, upsampling, and convolution one by one. The diagram shows the flow from input to output through stacked RCBA blocks, with various dimension changes indicated along the way.

Figure 6. Structure of NOS-Net model.

The architecture of NOS-Net is illustrated in Figure 6. The encoder part consists of two stages. The first stage comprises a 7 × 7 convolutional layer (stride = 2) and a max-pooling layer (stride = 2), which converts a 512 × 512 single-channel SAR image into a 64-channel, 128 × 128 feature map. The second stage is made up of three RCBA modules, extracting deep discriminative features through residual learning and the attention mechanism.
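The 512 to 128 reduction in the first encoder stage follows from the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The check below assumes a padding of 3 for the 7 × 7 convolution and a 3 × 3 pooling window with padding 1, as in the standard ResNet stem; the text gives only the kernel size of the convolution and the two strides.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

# 7 x 7 convolution, stride 2 (padding 3 is an assumption)
after_conv = conv_out(512, kernel=7, stride=2, padding=3)
# Max pooling, stride 2 (3 x 3 window with padding 1 is an assumption)
after_pool = conv_out(after_conv, kernel=3, stride=2, padding=1)
print(after_conv, after_pool)
```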

The decoder part adopts the original decoder structure of U-Net. Through upsampling and feature concatenation, it finally outputs oil spill segmentation results consistent with the input size.

This design aims to address the issues of blurred oil spill edges, incomplete shapes, and difficulty in distinguishing oil spills from look-alikes in traditional methods, while balancing the model’s training stability and inference efficiency.
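As a quick check on the feature-map sizes quoted above, the following sketch traces the spatial size through the first encoder stage. It assumes "same" padding, so that each stride-2 layer yields ceil(n / 2); the padding mode and the helper names are assumptions made for illustration, not details stated in the text.

```python
import math

def same_padding_output_size(size: int, stride: int) -> int:
    """Spatial size after a strided layer, assuming 'same' padding."""
    return math.ceil(size / stride)

def stage1_output_shape(height: int, width: int, out_channels: int = 64):
    """Trace the 7x7 convolution (stride 2) and max-pooling (stride 2) of stage 1."""
    h, w = height, width
    for stride in (2, 2):  # convolution stride, then pooling stride
        h = same_padding_output_size(h, stride)
        w = same_padding_output_size(w, stride)
    return h, w, out_channels

print(stage1_output_shape(512, 512))  # (128, 128, 64)
```

Because the three RCBA modules keep both the channel count and the spatial size, the decoder needs two 2× upsampling steps to return to the 512 × 512 input resolution.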

2.4 Seafloor seep detection method

Oil slicks persist on the sea surface only briefly, about 3 h and 15 min (Marmorino et al., 2008), so we were unable to find two SAR images of the same location acquired within such a short interval. Instead, based on the SAR-detected oil slicks, we infer the approximate location of seabed leakage points from the water depth at the slick’s position and the speed of ocean currents. First, Jatiault et al. (2018), who studied oil droplet deviation at leakage points in the Lower Congo Basin, proposed that oil droplet ascent speeds range from 3 to 8 cm/s. Combining this with the local water depth, we can determine the range of time (T) required for oil droplets to rise from the leakage point to the sea surface. Because current velocities for the exact time and place of each acquisition are not available, we use the ocean current velocities recorded by ADCP in the study area. Since currents vary in speed and direction with depth at the same location, we divide the path from the leakage point to the sea surface into three segments for calculation. The displacement speed of surface oil slicks is taken as 3% of the wind speed plus 100% of the water current speed (Kim et al., 2014). Among the edge points of a surface oil slick, the origin is often found at the wider and more prominent end of the curve (Dong et al., 2025). After determining the origin of the oil slick, we calculate the displacement caused by wind and surface current. Then, based on the position of the leading edge before displacement and the underwater current speeds, we infer the location of the seabed leakage point.
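The surface-drift rule quoted from Kim et al. (2014) can be sketched as follows. This is a minimal illustration, assuming velocities are given as (east, north) components in m/s; the function names and the example wind, current, and elapsed-time values are hypothetical and do not come from the study data.

```python
def surface_drift_velocity(wind, current, wind_factor=0.03):
    """Slick drift velocity: 3% of the wind vector plus 100% of the surface current.

    wind and current are (east, north) components in m/s (assumed convention).
    """
    return (wind_factor * wind[0] + current[0],
            wind_factor * wind[1] + current[1])

def backtrack_position(position, wind, current, elapsed_s):
    """Move a slick position (east, north, in metres) backwards in time."""
    v_east, v_north = surface_drift_velocity(wind, current)
    return (position[0] - v_east * elapsed_s,
            position[1] - v_north * elapsed_s)

# Hypothetical example: 5 m/s wind towards the east, 0.2 m/s current towards the north.
wind = (5.0, 0.0)
current = (0.0, 0.2)
drift = surface_drift_velocity(wind, current)
start = backtrack_position((0.0, 0.0), wind, current, elapsed_s=3600)
print(drift)
print(start)
```

With these hypothetical inputs the wind contributes 0.15 m/s and the current 0.2 m/s, so a slick observed one hour after surfacing would have started roughly 540 m to the west and 720 m to the south of its observed position.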

The pseudocode for constructing and training the NOS-Net model is as follows.

// Pseudocode: Complete Workflow of NOS-Net Model Construction and Training

// Input: Sentinel-1 SAR dual-polarization dataset (1,250 × 650 × 3), hyperparameters (learning rate = 5e-5, batch size = 12, epochs = 1,000)

// Output: Trained NOS-Net model, optimal weights, performance metrics (IoU, mIoU, AUC, etc.)

// 1. Data Preprocessing Function

Function DataPreprocessing(raw_images, raw_labels):

For each image in raw_images:

Perform radiometric calibration (convert raw signals to backscatter coefficient σ⁰)

Apply 7 × 7 median filtering (suppress speckle noise)

Conduct linear transformation (normalize to [0,1] interval for uniform data distribution)

Encode labels (map 5 target classes to one-hot encoding: Oil Spill = 0, Look-alike = 1, Ship = 2, Land = 3, Sea Surface = 4)

Split into training set (80%)/test set (20%) (no independent validation set due to limited dataset size)

Return preprocessed training set/test set (images + labels)

// 2. RCBA Module Definition (Residual + Channel Attention + Spatial Attention)

Function ResidualConvolutionalBlockAttention(input_feature):

// Residual branch

residual = Conv2D(filters = input_feature.channels, kernel_size = 3, padding = “same”)(input_feature)

residual = BatchNormalization()(residual)

residual = ReLU()(residual)

residual = Conv2D(filters = input_feature.channels, kernel_size = 3, padding = “same”)(residual)

residual = BatchNormalization()(residual)

// Channel attention mechanism

channel_avg_pool = GlobalAveragePooling2D()(residual)

channel_max_pool = GlobalMaxPooling2D()(residual)

channel_fc = Concatenate()([channel_avg_pool, channel_max_pool])

channel_fc = Dense(input_feature.channels//16, activation = “ReLU”)(channel_fc)

channel_fc = Dense(input_feature.channels, activation = “Sigmoid”)(channel_fc)

channel_attention = Multiply()([residual, channel_fc])

// Spatial attention mechanism

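spatial_avg_pool = Mean(channel_attention, axis = channel, keepdims = True)//Channel-wise average, keeps the H × W size

spatial_max_pool = Max(channel_attention, axis = channel, keepdims = True)//Channel-wise maximum, keeps the H × W size

spatial_concat = Concatenate(axis = channel)([spatial_avg_pool, spatial_max_pool])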

spatial_conv = Conv2D(filters = 1, kernel_size = 7, padding = “same”, activation = “Sigmoid”)(spatial_concat)

spatial_attention = Multiply()([channel_attention, spatial_conv])

// Residual connection

output = Add()([input_feature, spatial_attention])

output = ReLU()(output)

Return output

// 3. NOS-Net Model Construction

Function BuildNOSNet(input_shape=(512,512,1)):

// Encoder (U-Net structure + ResNet residual + RCBA module)

input = Input(shape = input_shape)

// Stage 1: 7 × 7 convolution + max pooling

encoder1 = Conv2D(filters = 64, kernel_size = 7, stride = 2, padding = “same”)(input)

encoder1 = BatchNormalization()(encoder1)

encoder1 = ReLU()(encoder1)

encoder1_pool = MaxPooling2D(stride = 2)(encoder1)//Output: 128 × 128 × 64

// Stage 2: 3 RCBA modules (deep feature extraction)

encoder2 = ResidualConvolutionalBlockAttention(encoder1_pool)

encoder3 = ResidualConvolutionalBlockAttention(encoder2)

encoder4 = ResidualConvolutionalBlockAttention(encoder3)//Output: 128 × 128 × 64 (consistent channel number)

// Decoder (U-Net upsampling + feature concatenation)

decoder1 = Conv2DTranspose(filters = 32, kernel_size = 2, stride = 2, padding = “same”)(encoder4)

decoder1_concat = Concatenate()([decoder1, encoder1])//Concatenate with corresponding encoder features

decoder1_conv = Conv2D(filters = 32, kernel_size = 3, padding = “same”, activation = “ReLU”)(decoder1_concat)

decoder2 = Conv2DTranspose(filters = 16, kernel_size = 2, stride = 2, padding = “same”)(decoder1_conv)

decoder2_conv = Conv2D(filters = 16, kernel_size = 3, padding = “same”, activation = “ReLU”)(decoder2)

// Output layer (pixel-wise classification for 5 target classes)

output = Conv2D(filters = 5, kernel_size = 1, padding = “same”, activation = “Softmax”)(decoder2_conv)

model = Model(inputs = input, outputs = output)

Return model

// 4. Hybrid Loss Function Definition (MCCE + Dice Loss)

Function HybridLoss(y_true, y_pred):

// Multiclass Cross-Entropy (MCCE)

mcce = CategoricalCrossentropy()(y_true, y_pred)

// Multiclass Dice Loss (alleviate class imbalance)

dice_sum = 0

For c from 0 to 4://Iterate over 5 target classes

y_true_c = y_true[…, c]

y_pred_c = y_pred[…, c]

intersection = Sum(y_true_c * y_pred_c)

denominator = Sum(y_true_c) + Sum(y_pred_c)//Sum of the two region sizes (Dice denominator, not a set union)

dice_c = (2 * intersection + ε)/(denominator + ε)//ε = 1e-6 to avoid division by zero

dice_sum += dice_c

dice_loss = 1 - (dice_sum/5)

// Hybrid loss (equal weights for balancing stability and class balance)

Return mcce + dice_loss

// 5. Model Training Main Function

Function TrainNOSNet():

// Data preparation

raw_images = Load Sentinel-1 SAR raw images (110 images)

raw_labels = Load corresponding labels (5 target classes)

train_data, test_data = DataPreprocessing(raw_images, raw_labels)

// Model initialization

model = BuildNOSNet(input_shape=(512,512,1))

optimizer = Adam(learning_rate = 5e-5)

model.compile(optimizer = optimizer, loss = HybridLoss, metrics = [“accuracy”])

// Training configuration

best_miou = 0.0

patience = 100//Early stopping patience (stop if no improvement)

no_improve_count = 0

// Training loop

For epoch from 1 to 1,000:

// Batch training

train_loss, train_acc = model.train_on_batch(train_data.images, train_data.labels)

// Validation every 50 epochs (use test set as substitute for validation set)

If epoch % 50 == 0:

test_pred = model.predict(test_data.images)

// Calculate core metrics (arguments follow the (y_true, y_pred) order of the auxiliary functions)

iou_oil = CalculateIoU(test_data.labels[…, 0], test_pred[…, 0])//Oil Spill IoU

miou = CalculateMeanIoU(test_data.labels, test_pred)//Mean IoU

auc = CalculateAUC(test_pred[…, 0], test_data.labels[…, 0])//Oil Spill binary classification AUC

// Save optimal model

If miou > best_miou:

best_miou = miou

model.save_weights(“NOS-Net_best_weights.h5”)

no_improve_count = 0

Else:

no_improve_count += 1

// Early stopping check

If no_improve_count >= patience:

Print(“Early stopping triggered, training completed”)

Break

Print(f“Epoch {epoch}: Train Loss = {train_loss:.4f}, Train Acc = {train_acc:.4f}, Best mIoU = {best_miou:.4f}”)

// Post-training evaluation

final_pred = model.predict(test_data.images)

final_oil_iou = CalculateIoU(test_data.labels[…, 0], final_pred[…, 0])

final_miou = CalculateMeanIoU(test_data.labels, final_pred)

final_auc = CalculateAUC(final_pred[…, 0], test_data.labels[…, 0])

final_acc = model.evaluate(test_data.images, test_data.labels)[1]

// Output results

Save trained model, performance metrics table (IoU = final_oil_iou, mIoU = final_miou, AUC = final_auc, Acc = final_acc)

Return model, performance_metrics

// 6. Auxiliary Functions: Calculate IoU and mIoU

Function CalculateIoU(y_true, y_pred):

y_pred_binary = (y_pred > 0.5).astype(Integer)//Binarization with 0.5 probability threshold

intersection = Sum(y_true * y_pred_binary)

union = Sum(y_true) + Sum(y_pred_binary) - intersection//Subtract the overlap so that it is not counted twice

Return (intersection + ε)/(union + ε)

Function CalculateMeanIoU(y_true, y_pred):

miou_sum = 0.0

For c from 0 to 4:

miou_sum += CalculateIoU(y_true[…, c], y_pred[…, c])

Return miou_sum/5

// 7. Execute Training Workflow

model, performance_metrics = TrainNOSNet()

Print(“Training completed. Performance metrics:”, performance_metrics)
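To make the attention part of the RCBA pseudocode concrete, the following NumPy sketch applies channel attention followed by spatial attention to a single feature map and then adds the residual connection. It is a simplified illustration under stated assumptions: the two convolution and batch-normalization layers of the residual branch are omitted, the spatial attention pools along the channel axis as in CBAM, the weights are random rather than trained, and all function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: (H, W, C); w1: (2C, C // r); w2: (C // r, C)."""
    # Global average and max pooling over the spatial axes, then concatenation.
    pooled = np.concatenate([feat.mean(axis=(0, 1)), feat.max(axis=(0, 1))])
    hidden = np.maximum(pooled @ w1, 0.0)   # dense layer + ReLU
    weights = sigmoid(hidden @ w2)          # one weight per channel
    return feat * weights

def spatial_attention(feat, kernel):
    """feat: (H, W, C); kernel: (k, k, 2) for the two pooled maps."""
    # Channel-wise average and max give two H x W maps.
    stats = np.stack([feat.mean(axis=-1), feat.max(axis=-1)], axis=-1)
    pad = kernel.shape[0] // 2
    padded = np.pad(stats, ((pad, pad), (pad, pad), (0, 0)))  # zero 'same' padding
    windows = sliding_window_view(padded, kernel.shape[:2], axis=(0, 1))
    mask = sigmoid(np.einsum("hwcij,ijc->hw", windows, kernel))
    return feat * mask[..., None]

def rcba_attention(feat, w1, w2, kernel):
    """Channel attention, spatial attention, residual addition, ReLU."""
    refined = spatial_attention(channel_attention(feat, w1, w2), kernel)
    return np.maximum(feat + refined, 0.0)

rng = np.random.default_rng(0)
channels, reduction = 64, 16
feat = rng.standard_normal((32, 32, channels))
w1 = rng.standard_normal((2 * channels, channels // reduction)) * 0.1
w2 = rng.standard_normal((channels // reduction, channels)) * 0.1
kernel = rng.standard_normal((7, 7, 2)) * 0.1
out = rcba_attention(feat, w1, w2, kernel)
print(out.shape)  # (32, 32, 64)
```

The output has the same shape as the input, which is what allows three such modules to be stacked in the second encoder stage without changing the 128 × 128 × 64 feature-map size.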

2.4.1 Mechanism for separating mixed regions

Dark regions in SAR images of the Lower Congo Basin are mostly mixtures of oil spills and look-alikes (e.g., algal blooms, low-wind-speed zones). NOS-Net achieves pixel-level separation through a multi-dimensional collaborative design:

Dual-polarization feature fusion: C-band VV/VH dual-polarization data provides complementary scattering information. VV polarization highlights differences in sea surface roughness between oil spills and look-alikes, while VH polarization enhances the detection capability of weak-scattering targets such as thin oil films by capturing cross-polarized signals. The model fuses dual-polarization features in the encoder, laying the foundation for distinguishing mixed pixels.

Fine-grained discrimination of the RCBA module: The channel attention mechanism adaptively assigns higher weights to feature channels related to oil spill textures (e.g., stable low backscatter, uniform grayscale distribution) and suppresses feature channels dominated by look-alikes (e.g., uneven backscatter of algal blooms). The spatial attention mechanism further focuses on local regions with oil spill characteristics (e.g., continuous dark patches with gentle edges), reducing interference from discrete look-alike pixels. Ablation experiment results show that compared with U-Net + Res, RU-Net + CBAM with the attention mechanism increases the Oil Spill IoU by 6.06% and the Look-alike IoU by 13.20%, verifying the module’s effectiveness in separating mixed regions.

Hybrid loss function optimization: The combination of MCCE and Dice loss enables the model to focus on minority-class oil spill pixels while ensuring overall classification stability. This avoids model bias caused by the high proportion of look-alike pixels and improves the separation accuracy of mixed regions with a low oil spill ratio (e.g., in Figure 9, NOS-Net accurately extracts small oil spill patches from large-area look-alikes).

2.4.2 Seabed leakage point detection process

The detection of seabed leakage points relies on the accurate segmentation of surface oil slicks and hydrodynamic inversion, with the following core steps:

Determination of the oil slick origin: Based on the segmented oil slick shape, the wider and more prominent end of the curved feature is identified as the diffusion origin (Yan et al., 2025), which is the starting position of surface diffusion. This step relies on the model’s accurate extraction of oil slick edge features (enhanced by the RCBA module).

Multi-layer current displacement inversion: Combined with ADCP-measured current velocities at different depths (Siagian et al., 2023), the oil droplet ascent path is divided into three segments (0–20 m, 20–250 m, 250–2000 m). The median current velocity of each layer is used to calculate the displacement of oil droplets during ascent.

Leakage point localization: The seabed leakage point is inferred by reversing the displacement direction of the oil slick origin. For example, in the case study, the leakage point is located approximately 2,114 m northwest of the oil slick origin, falling within the typical deviation range (0–2,500 m) for such inferences, which verifies the reliability of the method. The model’s high segmentation accuracy ensures the accuracy of origin determination, which is a prerequisite for leakage point localization.

2.4.3 Oil spill semantic feature extraction mechanism

Instead of simply masking dark regions in SAR images, NOS-Net realizes hierarchical extraction of oil spill semantic features through architecture fusion and module collaboration:

Multi-scale feature capture: The downsampling process of the encoder (7 × 7 convolution + max-pooling + RCBA modules) extracts features at different scales: shallow layers capture low-level semantic features related to oil spills, while deep layers capture high-level semantic features that distinguish oil spills from look-alikes (continuous region distribution patterns, polarimetric scattering characteristics).

Feature enhancement via residual learning: The integration of ResNet’s residual structure solves the gradient vanishing problem in deep networks, enabling the model to extract more discriminative features.

Attention-guided semantic focusing: The channel attention of the RCBA module highlights feature channels related to oil spill semantics (e.g., VV/VH polarization ratio, grayscale uniformity), while spatial attention suppresses background semantic interference (e.g., ocean waves, ships). This two-dimensional attention mechanism allows the model to focus on oil spill-specific semantic information, avoiding misclassification caused by similar low-level features of look-alikes and oil spills.

2.5 Loss function

In the semantic segmentation of ocean targets in spaceborne SAR images, each pixel needs to be accurately classified into one of five types: Oil Spill, Look-alike, Ship, Land and Sea Surface. This is essentially a multi-class pixel-level classification problem. Multi-Class Cross-Entropy Loss (MCCE) is a common measure of the difference between the probability distribution predicted by the model and the real label distribution (Hernández-Hamón et al., 2023). Wu et al. proposed a unified weighted cross-entropy loss framework, providing a new perspective for understanding and comparing different loss functions (Wu et al., 2024). The MCCE loss is the negative log-likelihood of the true label under the predicted probabilities, averaged over all pixels, as shown in Equation 1.

L_{MCCE} = - \frac{1}{N} \sum_{n = 1}^{N} \sum_{c = 1}^{C} y_{n, c} \log(p_{n, c}) (1)

N is the total number of pixels in the image. C is the total number of categories. $y_{n, c}$ is an indicator variable (0 or 1) that indicates whether the true category of the nth pixel is c. $p_{n, c}$ is the probability predicted by the model that the nth pixel belongs to category c (after softmax, so the probabilities over all categories sum to 1). $\log$ is the natural logarithm.

However, in the ocean target segmentation of remote sensing images, especially for oil spill detection, a significant challenge is severe class imbalance. Targets such as Oil Spill and Ship usually occupy only a very small proportion of the pixels in an image, while Sea Surface and Land occupy most of the area. If the loss function assigns equal weights to the pixels of all categories (as the standard MCCE loss does), training will be strongly biased towards background categories such as Sea Surface because of their far larger pixel counts. The model then recognizes rare but important targets (such as Oil Spill) poorly, which ultimately lowers its segmentation accuracy (especially the IoU).

To alleviate the negative impact of class imbalance, this study introduces the Multi-Class Dice Loss (Li et al., 2020). The Dice coefficient measures the degree of overlap between the predicted segmentation result and the real label (twice the intersection divided by the sum of the sizes of the two regions). The Dice loss is 1 minus the Dice coefficient averaged over categories. Its formula is shown in Equation 2.

L_{Dice} = 1 - \frac{1}{C} \sum_{c = 1}^{C} \frac{2 \sum_{n = 1}^{N} p_{n, c} y_{n, c} + ϵ}{\sum_{n = 1}^{N} p_{n, c} + \sum_{n = 1}^{N} y_{n, c} + ϵ} (2)

Among them, $ϵ$ is a small positive number, which is used to avoid the denominator being zero and increase the numerical stability. Other symbols are the same as above.

The core advantage of Dice loss is that it evaluates the overlap between the predicted region and the real region. Its loss contribution to each category depends on the overlap between the category prediction and the real label, regardless of the number of pixels involved in the calculation. This allows the model to focus more directly on improving the segmentation accuracy of the target area (such as Oil Spill) during the training process, thus effectively balancing the huge differences in the number of pixels in different categories.

However, in the early stage of training or in the case of very small target area, the gradient calculation of Dice loss may be unstable and fluctuate greatly, which may lead to difficulty in model convergence or fall into sub-optimal solution. In order to combine the advantages of the two loss functions and overcome their shortcomings, this study designs a hybrid loss function. Its formula is shown in Equation 3.

L_{Hybrid} = L_{MCCE} + L_{Dice} (3)

The design motivation of this mixed loss function is:

Using the stability of MCCE: cross entropy loss provides a stable gradient signal, which helps the model to converge reliably at the initial stage of training and avoids the sharp gradient fluctuation that may be caused by Dice loss.

Using Dice's class balance: Dice loss optimizes the degree of overlap of the segmented region (a quantity closely related to IoU), effectively alleviating the class imbalance problem and directly improving the segmentation performance of the model on the target category (especially Oil Spill).

Collaborative optimization: By combining the two loss functions, the model simultaneously learns an accurate class probability distribution (MCCE) and produces segmented regions that overlap closely with the real label (Dice), improving both the overall segmentation accuracy (mPA) and the intersection over union (IoU, mIoU) of the target classes.
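Equations 1 to 3 can be written compactly in NumPy. The sketch below is an illustration rather than the training code used in this study: it assumes the labels are one-hot encoded with shape (N, C) and the predictions are softmax probabilities of the same shape, and the function names and the toy four-pixel example are hypothetical.

```python
import numpy as np

def mcce_loss(y_true, y_pred, eps=1e-12):
    """Equation 1: mean negative log-likelihood over N pixels."""
    return float(-np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1)))

def dice_loss(y_true, y_pred, eps=1e-6):
    """Equation 2: one minus the Dice coefficient averaged over C classes."""
    intersection = np.sum(y_true * y_pred, axis=0)            # per class
    sizes = np.sum(y_pred, axis=0) + np.sum(y_true, axis=0)   # per class
    dice = (2.0 * intersection + eps) / (sizes + eps)
    return float(1.0 - np.mean(dice))

def hybrid_loss(y_true, y_pred):
    """Equation 3: equally weighted sum of the two losses."""
    return mcce_loss(y_true, y_pred) + dice_loss(y_true, y_pred)

# Toy example: four pixels, five classes, one minority-class pixel.
labels = np.array([0, 0, 0, 1])
y_true = np.eye(5)[labels]
perfect = y_true.copy()
uniform = np.full((4, 5), 0.2)

print(hybrid_loss(y_true, perfect))
print(hybrid_loss(y_true, uniform))
```

A perfect prediction gives a loss close to zero, while the uninformative uniform prediction is penalized by both terms: the cross-entropy term equals ln 5 regardless of class frequencies, and the Dice term is computed per class, so the single minority-class pixel weighs as much as the three majority-class pixels.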

3 Results

3.1 Evaluation metrics

Before conducting experiments on the model, we need to establish performance metrics. Vlăsceanu et al. summarized the evaluation metrics used in the field of image segmentation (Vlăsceanu et al., 2024). From these, we selected several commonly used classification metrics: Precision (Equation 4), Recall (Equation 5), intersection over union (IoU, Equation 6), mean IoU (Equation 7), and Accuracy (Equation 8). The specific formulas for these metrics are as follows:

Precision = \frac{TP}{TP + FP} (4)

Detection Rate (Recall) = \frac{TP}{TP + FN} (5)

IoU = \frac{TP}{TP + FP + FN} (6)

mIoU = \frac{1}{n} \sum_{i = 1}^{n} IoU_{i} (7)

Recognition Rate (Accuracy) = \frac{TP + TN}{TP + TN + FP + FN} (8)

TP (True Positive) is the number of pixels that the model correctly predicts as 'Oil Spill'; the higher the TP, the more oil spill pixels the model finds. TN (True Negative) is the number of pixels that the model correctly predicts as not 'Oil Spill' (i.e., background or one of the other four categories: Look-alike, Ship, Land, Sea Surface). FP (False Positive) is the number of pixels that the model incorrectly predicts as 'Oil Spill'. FN (False Negative) is the number of 'Oil Spill' pixels that the model incorrectly predicts as not 'Oil Spill' (i.e., as background or another class).

Precision is the proportion of pixels predicted as 'Oil Spill' by the model that are truly 'Oil Spill'. It measures the reliability of the model's positive predictions: high Precision means that when the model says 'this is oil', it is more likely to be oil (fewer false positives). Recognition Rate (Accuracy) is the proportion of all pixels correctly classified by the model (whether as Oil Spill or non-Oil Spill). It measures the overall accuracy of the model. Detection Rate (Recall) is the proportion of real 'Oil Spill' pixels that are successfully detected by the model. It measures the ability of the model to find all real oil spill areas: high Recall means fewer misses.

IoU (Intersection over Union) measures the degree of overlap between the predicted 'Oil Spill' region (TP + FP) and the real 'Oil Spill' region (TP + FN). It is calculated as the intersection of the two (TP) divided by their union (TP + FP + FN). It is the core index of the overlap accuracy of the segmented region and ranges over [0, 1]; the higher the value, the better the overlap between the segmented region and the real region. Mean Intersection over Union (mIoU) is the arithmetic mean of the IoU values computed for all categories (Oil Spill, Look-alike, Ship, Land, Sea Surface). It is a key indicator of the overall segmentation performance of the model across all categories.
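The metrics defined above can be computed directly from per-pixel class labels. The following sketch assumes integer class maps with Oil Spill encoded as class 0 and five classes in total; the function names and the eight-pixel example are hypothetical, and skipping classes that are absent from both the prediction and the label when averaging is an assumption of this sketch rather than a rule stated in the text.

```python
import numpy as np

def safe_div(numerator, denominator):
    """Return NaN instead of dividing by zero."""
    return numerator / denominator if denominator > 0 else float("nan")

def class_metrics(y_true, y_pred, target=0):
    """Precision, Recall, IoU and Accuracy for one target class."""
    t = np.asarray(y_true) == target
    p = np.asarray(y_pred) == target
    tp = int(np.sum(t & p))
    fp = int(np.sum(~t & p))
    fn = int(np.sum(t & ~p))
    tn = int(np.sum(~t & ~p))
    return {
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "iou": safe_div(tp, tp + fp + fn),
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
    }

def mean_iou(y_true, y_pred, num_classes=5):
    """Average IoU over classes, ignoring classes absent from both maps."""
    ious = [class_metrics(y_true, y_pred, c)["iou"] for c in range(num_classes)]
    return float(np.nanmean(ious))

# Toy label maps: 0 = Oil Spill, 1 = Look-alike, 4 = Sea Surface.
y_true = [0, 0, 0, 4, 4, 4, 4, 1]
y_pred = [0, 0, 4, 4, 4, 4, 0, 1]
oil = class_metrics(y_true, y_pred, target=0)
print(oil)
print(mean_iou(y_true, y_pred))
```

In this example the Oil Spill class has TP = 2, FP = 1, FN = 1 and TN = 4, giving a Precision and Recall of 2/3, an IoU of 0.5 and an Accuracy of 0.75, which shows how Accuracy can stay high while the IoU of a minority class remains modest.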

To effectively demonstrate the impact of the IoU, we have compared examples with different IoU values as shown in Figure 7. A higher IoU indicates that the model exhibits improved performance in image recognition.

Figure 7

Three grayscale images labeled (a), (b), and (c) show dark shapes on a textured background, each with overlapping red and white rectangles highlighting different areas of the shape.

Figure 7. Examples of different IoU values. (a) IoU = 25%. (b) IoU = 50%. (c) IoU = 70%.
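For rectangular regions such as those drawn in Figure 7, the IoU can be computed in closed form. The sketch below assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max); the function name and the box coordinates are hypothetical and were chosen only to reproduce IoU values of 25% and 50%.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap, clipped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_w * inter_h
    union = ((ax1 - ax0) * (ay1 - ay0)
             + (bx1 - bx0) * (by1 - by0)
             - intersection)
    return intersection / union if union > 0 else 0.0

print(box_iou((0, 0, 5, 2), (3, 0, 8, 2)))  # 0.25
print(box_iou((0, 0, 3, 2), (1, 0, 4, 2)))  # 0.5
```

Note that two equally sized boxes must share two thirds of their area to reach an IoU of 50%, which is why moderate IoU values already correspond to substantial visual overlap.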

3.2 Hyperparameter optimization

We designed experiments to select a set of hyperparameters that is as close to optimal as possible, so as to improve the performance and effectiveness of model learning. We chose Adam as the optimizer of this model. In the first experiment, we used the hyperparameters recommended for the optimizer: a learning rate of 1e-4, a batch size of 12, and 1,000 training epochs. This configuration achieved an mIoU of 65.04%. To better show the training process, we plotted the mIoU curve over training, as shown in Figure 8.

Figure 8

Line chart showing training mean Intersection over Union (mIoU) of the NOS-Net model across 1000 epochs. The curve starts at 0, rapidly rises to ∼50, stabilizes near 60 after 600 epochs, demonstrating model convergence.

Figure 8. Training mIoU curve of the NOS-Net model.

It can be seen that the model has largely converged by 600 epochs. Based on this result, we adjusted the value of each hyperparameter. To ensure a reliable evaluation of each set of hyperparameters, we trained each set for 500 or 1,000 epochs. After training dozens of sets of hyperparameters, we found the best set: a learning rate of 5e-5, a batch size of 12, and 1,000 epochs, giving an mIoU of 70.29% and an Oil Spill IoU of 61.27%. The evaluation indexes are shown in Table 2.

Table 2

Table 2. Classification indicators.

3.3 The predicted outcome

The visual results of the segmentation model are shown in Figures 9, 10. Figure 9 is a worse case and Figure 10 is a better case. In Figure 9, the large dark area is a look-alike that contains only a very small proportion of oil spill. Such a low-contrast image greatly hinders recognition by the model, and the highlighted area is even classified as land. Compared with the U-Net model using ResNet101 as the backbone in Figure 9c, the NOS-Net model in Figure 9d predicts a more complete oil spill shape because the RCBA module enhances the feature representation of the output feature map.

Figure 9

Four-panel oil spill segmentation results (worse case): (a) original SAR image with large-area look-alikes and small oil spill patches; (b) ground truth data; (c) U-Net with ResNet101 segmentation; (d) NOS-Net segmentation, showing more complete oil spill shape recognition.

Figure 9. Segmentation results of the U-Net model with different modules: (a) SAR image; (b) the corresponding ground truth data; (c) result from the U-Net model with ResNet101; (d) result from the NOS-Net model. This is a worse case.

Figure 10

Four-panel oil spill segmentation results (better case): (a) original SAR image with clear oil spill-sea surface boundaries; (b) ground truth data; (c) U-Net with ResNet101 segmentation; (d) NOS-Net segmentation, showing higher boundary accuracy and detailed oil spill recognition.

Figure 10. Segmentation results of the U-Net model with different modules: (a) SAR image; (b) the corresponding ground truth data; (c) result from the U-Net model with ResNet101; (d) result from the NOS-Net model. This is a better case.

In Figure 10, the sea surface environment is relatively simple, and the boundary between the sea surface and the oil spill is relatively distinct. The segmentation quality of the model is correspondingly better, and the IoU is higher. The NOS-Net model in Figure 10d still shows better results than the U-Net model using ResNet101 as the backbone in Figure 10c, with better recognition of some oil spill details.

Boundary conditions (wind speed, incidence angle, polarization mode) directly affect the clarity of oil spill boundaries in SAR images and the model’s segmentation performance:

Regarding wind speed impact: Under moderate to low wind speeds (3.0–7.0 m/s), oil films suppress capillary waves, forming dark spots with clear boundaries. When wind speed is below 3 m/s, reduced sea surface roughness diminishes the contrast between oil spills and the background, leading to blurred boundaries; when wind speed exceeds 7 m/s, turbulence causes oil slick dispersion, resulting in irregular and fragmented boundaries. NOS-Net is optimized for moderate to low wind speed scenarios, and its RCBA module enhances edge feature extraction to compensate for boundary blurriness under suboptimal wind speeds.

Regarding incidence angle impact: The optimal incidence angle range (28°–38°) maximizes the contrast between oil spills and the sea surface (VV contrast: 5.8–7.2 dB; VH contrast: 7.5–9.1 dB), resulting in sharp oil spill boundaries. Beyond this range, the contrast decreases, leading to blurred boundaries. The dataset strictly selects images within the optimal incidence angle range to ensure the model learns clear boundary features.

Regarding polarization mode impact: The VV/VH dual-polarization mode complementarily improves boundary feature extraction: VV polarization captures clear boundaries of thick oil slicks, while VH polarization enhances the detection of thin oil slick boundaries. The model fuses dual-polarization boundary features, improving the continuity and accuracy of segmented oil spill boundaries (e.g., in Figure 10, the oil spill boundaries segmented by NOS-Net are more consistent with the ground truth than those of the U-Net with ResNet101 as the backbone).
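The wind speed and incidence angle thresholds above can be turned into a simple screening rule for candidate scenes. The sketch below is only an illustration of those thresholds: the function names and the regime labels are hypothetical, and treating the boundary values (3 m/s, 7 m/s, 28° and 38°) as inclusive is an assumption.

```python
def wind_regime(wind_speed_ms: float) -> str:
    """Classify wind speed into the three regimes described in the text."""
    if wind_speed_ms < 3.0:
        return "calm"        # weak contrast, blurred boundaries
    if wind_speed_ms <= 7.0:
        return "favourable"  # dark spots with clear boundaries
    return "rough"           # dispersed, fragmented boundaries

def in_optimal_incidence_range(angle_deg: float,
                               low: float = 28.0,
                               high: float = 38.0) -> bool:
    """True when the incidence angle lies in the 28-38 degree range."""
    return low <= angle_deg <= high

def expect_clear_boundary(wind_speed_ms: float, angle_deg: float) -> bool:
    """Both conditions must hold for sharp oil spill boundaries."""
    return (wind_regime(wind_speed_ms) == "favourable"
            and in_optimal_incidence_range(angle_deg))

print(expect_clear_boundary(5.0, 33.0))  # True
print(expect_clear_boundary(2.0, 33.0))  # False
print(expect_clear_boundary(5.0, 42.0))  # False
```

A rule of this kind could be used to flag scenes in which blurred or fragmented boundaries should be expected before segmentation results are interpreted.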

3.4 Ablation experiment

Deep learning models are often regarded as 'black boxes'. Ablation experiments can reveal the role of each component and improve the interpretability of the model by adding or removing components in a controlled way. In addition, complex models may show spurious performance improvements due to chance factors; ablation experiments can verify whether an improvement is robust and avoid the risk of 'over-design'. To verify the contribution of each module of the NOS-Net model to segmentation and the rationality of the number of network layers, this paper designs ablation experiments. Starting from the U-Net model, the residual structure and the attention mechanism are added in turn to verify the contribution of each module. Starting from the NOS-Net model, the number of network layers is increased to verify the rationality of the network depth. The U-Net + Res model is the U-Net model with a deep residual network as the encoder. RU-Net + CBAM is U-Net + ResNet101 combined with the attention mechanism. Compared with the NOS-Net model, 63 convolutional layers are added to the encoder. Evaluation metrics: In addition to the original Intersection over Union (IoU), four key metrics, namely Dice coefficient, Accuracy, F1 score, and mean Intersection over Union (mIoU), are reported to form a comprehensive performance evaluation system.

The complete performance metrics of each model in the ablation experiment are shown in Tables 3, 4. Table 3 focuses on the IoU results of each model for five marine target categories, directly reflecting the segmentation overlap between the predicted regions and the real regions; Table 4 supplements other core metrics to fully quantify the contribution value of each component.

Table 3

Table 3. IoU training results of each model in ablation experiment.

Table 4

Table 4. Comprehensive performance metrics of each model in ablation experiment.

Contribution of the residual structure: Compared with the baseline U-Net, the Oil Spill IoU of U-Net + Res increased by 10.16% and the mIoU increased by 5.70%. This proves that the residual learning mechanism effectively solves the gradient vanishing/explosion problem in deep network training, enhances the ability to extract discriminative features, and lays a foundation for the improvement of model performance.

Contribution of the attention mechanism: Compared with U-Net + Res, the Oil Spill IoU of RU-Net + CBAM further increased by 6.06% and the mIoU increased by 4.36%. The channel attention mechanism adaptively highlights the key feature channels related to oil spills, and the spatial attention mechanism suppresses background noise interference, which significantly enhances the model’s ability to distinguish between oil spills and look-alikes.

Rationality of the lightweight design of NOS-Net: On the basis of RU-Net + CBAM, NOS-Net optimizes the RCBA module (lightweight channel-spatial attention integration), leading to a further increase of 1.42% in Oil Spill IoU and 0.96% in mIoU. Meanwhile, tested on the same hardware platform, its computational overhead is 18.3% lower than that of RU-Net + CBAM, proving that the lightweight design of RCBA achieves a balance between performance and efficiency.
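The stepwise gains reported above can be checked for internal consistency. The sketch below assumes that each reported increase is an absolute difference in percentage points and that U-Net + Res corresponds to the U-Net with a ResNet101 backbone; both are assumptions of this illustration, and the dictionary layout and function name are hypothetical.

```python
# Stepwise gains in percentage points, as reported in the ablation discussion.
gains = {
    "U-Net -> U-Net + Res": {"oil_iou": 10.16, "miou": 5.70},
    "U-Net + Res -> RU-Net + CBAM": {"oil_iou": 6.06, "miou": 4.36},
    "RU-Net + CBAM -> NOS-Net": {"oil_iou": 1.42, "miou": 0.96},
}

def cumulative_gain(steps, metric):
    """Sum the gains of the listed steps, rounded to two decimals."""
    return round(sum(gains[step][metric] for step in steps), 2)

attention_steps = ["U-Net + Res -> RU-Net + CBAM", "RU-Net + CBAM -> NOS-Net"]
all_steps = list(gains)

print(cumulative_gain(attention_steps, "oil_iou"))  # 7.48
print(cumulative_gain(attention_steps, "miou"))     # 5.32
print(cumulative_gain(all_steps, "oil_iou"))        # 17.64
print(cumulative_gain(all_steps, "miou"))           # 11.02
```

Under these assumptions, the last two steps add up to the 7.48% and 5.32% improvements over the ResNet101-backbone U-Net quoted in the abstract, so the stepwise and overall figures agree.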

4 Discussion

We conducted a literature review from 2007 onwards and selected a typical SAR image of the Lower Congo Basin containing an oil spill. Using NOS-Net, we performed segmentation and recognition. This SAR image was captured by the Sentinel-1 satellite on 31 May 2022, in the Lower Congo Basin region. Using the trained deep learning model, we obtained the prediction results shown in Figure 11. In the predicted image, the dark patch in the upper left was misidentified as an oil spill, and the bright area in the lower right was incorrectly classified as a ship. This may be due to the lower proportion of ship labels in the dataset.

Figure 11

Side-by-side images of SAR oil spill detection: left is a grayscale Sentinel-1 SAR image of the Lower Congo Basin with a red circle marking the oil slick origin; right is the NOS-Net segmentation result (black background with highlighted target regions).

Figure 11. The left image shows the SAR image obtained from satellite capture; the right image displays the segmentation results using the NOS-Net.

Oil slicks often originate at the wider, more prominent end of curved features, and the red-circled area appears darker, suggesting that it could be the origin of the oil slick. The position of this point may be influenced by the average directions of the Benguela Current and Congo Plume, which generally flow from southeast to northwest, along with surface wind patterns. Our analysis utilized the Google Earth Engine platform and deep learning methods. In treating this point as the origin of the oil slick, we acknowledge that the wind direction is uncertain and that its impact, although minor, contributes to potential errors. The ascent pathways of oil droplets are primarily influenced by the Angola Current (depth 0–250 m, velocity 20–50 cm/s), South Equatorial Undercurrent (depth 0–200 m, velocity 10–30 cm/s), South Intermediate Counter Current (depth 500–1200 m), and North Atlantic Deep Water (depth 1200–2000 m, velocity 2 cm/s) (Gay et al., 2007). Where a velocity range is given, we use the median value for calculation. The seabed source feeding the oil slick lies at a water depth of approximately 2000 m. We divide the oil droplet’s ascent path into three sections:

First Section: From 0 m to 20 m, the average direction is northwest, with a velocity of approximately 49 cm/s;

Second Section: From 20 m to 250 m, it primarily flows southeast along the slope, with a velocity of approximately 35 cm/s;

Third Section: From 250 m to 2000 m, it flows southeast along the slope, with a velocity of approximately 2 cm/s.

Based on the literature-reported range of oil droplet ascent speeds (3–8 cm/s), a mid-range value of 5 cm/s was adopted for calculation, giving a total rise time from the seabed to the sea surface of approximately 11.11 h. During the ascent, the oil droplets drifted 700 m southeast in 9.72 h in the 250–2000 m segment, 1,610 m southeast in 1.28 h in the 20–250 m segment, and 196 m northwest in 0.11 h in the 0–20 m segment. The net southeast displacement is therefore approximately 2,114 m (700 + 1,610 − 196), which indicates that the seabed leakage point is located about 2,114 m northwest of the oil slick origin. This result falls within the typical deviation range (0–2,500 m) for such inference.
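The layer-by-layer arithmetic can be reproduced with a short script. This is a minimal sketch rather than the authors' code: the `Layer` class and the `ascent_drift` function are hypothetical names, and the sketch assumes a constant rise speed and purely southeast or northwest currents within each layer.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    top_m: float        # upper depth bound of the layer (m)
    bottom_m: float     # lower depth bound of the layer (m)
    speed_cm_s: float   # horizontal current speed (cm/s)
    heading: str        # "SE" or "NW"


# Layer values as quoted for the three ascent sections.
LAYERS = [
    Layer(0, 20, 49, "NW"),
    Layer(20, 250, 35, "SE"),
    Layer(250, 2000, 2, "SE"),
]


def ascent_drift(layers, rise_cm_s=5.0):
    """Return (total ascent time in h, net drift toward SE in m).

    A droplet needs thickness / rise speed to cross each layer, and is
    carried horizontally at the layer's current speed during that time.
    Drift toward NW is counted as negative SE drift.
    """
    total_s = 0.0
    net_se_m = 0.0
    for layer in layers:
        transit_s = (layer.bottom_m - layer.top_m) / (rise_cm_s / 100.0)
        drift_m = transit_s * layer.speed_cm_s / 100.0
        net_se_m += drift_m if layer.heading == "SE" else -drift_m
        total_s += transit_s
    return total_s / 3600.0, net_se_m


if __name__ == "__main__":
    hours, net_se = ascent_drift(LAYERS)
    print(f"ascent time: {hours:.2f} h, net SE drift: {net_se:.0f} m")
```

Because the droplets are displaced toward the southeast while rising, the seabed source is inferred to lie the same distance to the northwest of the surface origin.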

The NOS-Net model proposed in this study is based on the U-Net architecture and integrates a residual network and an attention mechanism. It addresses core limitations of the original U-Net model, such as blurred edge prediction and insufficient ability to distinguish oil spills from oil slick look-alikes. However, the model still has some inherent limitations, which are discussed below from three aspects: dataset characteristics, environmental adaptability, and inference uncertainties.

1. The model’s training and validation rely on a relatively small-scale dataset (110 images of 1,250 × 650 × 3 specification), with all data derived from Sentinel-1 SAR images of the Lower Congo Basin. This single-source and single-region dataset has two key drawbacks: first, the lack of an independent validation set may lead to overestimated model performance, reducing the reliability of evaluation results; second, although the class imbalance issue has been mitigated using a hybrid loss function (MCCE + Dice), the extremely low proportion of minority class pixels (e.g., oil spills and ships) remains unresolved, limiting the model’s ability to learn fine-grained features of minority classes.

2. NOS-Net achieves optimal performance under moderate to low wind speeds (3.0–7.0 m/s), but its segmentation accuracy degrades significantly under extreme wind conditions. When wind speed is below 3 m/s, reduced sea surface roughness diminishes the contrast between oil slicks and the background; when wind speed exceeds 7 m/s, oil slicks are dispersed and SAR image noise increases. Both scenarios hinder the model’s ability to distinguish oil spills from oil slick look-alikes (e.g., biological films, low-wind-speed regions).

3. The seabed leakage point inference method relies on indirect parameters, resulting in inherent uncertainties. The oil droplet ascent speed (3–8 cm/s) is a literature-derived reference range, and actual values may vary with seawater temperature, salinity, and oil droplet size. Additionally, the adopted ocean current velocities (median values) lack strict spatiotemporal matching with the leakage event, and mooring data lack real-time and location-specific precision. These discrepancies lead to deviations in the calculation of oil droplet displacement distance and direction, limiting the accuracy of leakage point localization.
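The sensitivity of the leakage point estimate to the assumed ascent speed, noted in the third limitation, can be illustrated with a small sweep. This is a hedged sketch: `SEGMENTS`, `net_se_drift_m`, and `sweep` are hypothetical names, the layer values are those quoted for the three ascent sections, and only the rise speed is varied while the current velocities are held fixed.

```python
# (layer thickness in m, current speed in m/s, +1 for SE / -1 for NW),
# taken from the three ascent sections described in the Discussion.
SEGMENTS = [
    (20.0, 0.49, -1),
    (230.0, 0.35, +1),
    (1750.0, 0.02, +1),
]


def net_se_drift_m(rise_m_s):
    """Net south-eastward drift (m) for a constant droplet rise speed."""
    if rise_m_s <= 0:
        raise ValueError("rise speed must be positive")
    # Signed sum of (current speed x layer thickness), in m^2/s.
    transport = sum(sign * speed * thickness
                    for thickness, speed, sign in SEGMENTS)
    return transport / rise_m_s


def sweep(rise_speeds_cm_s=(3, 4, 5, 6, 7, 8)):
    """Map each rise speed (cm/s) to the resulting net SE drift (m)."""
    return {v: net_se_drift_m(v / 100.0) for v in rise_speeds_cm_s}


if __name__ == "__main__":
    for speed, drift in sweep().items():
        print(f"{speed} cm/s -> {drift:.0f} m")
```

With these inputs the drift is inversely proportional to the rise speed: roughly 1,321 m at 8 cm/s, 2,114 m at 5 cm/s, and about 3,523 m at 3 cm/s. The slow end of the literature range therefore gives an offset beyond the 0–2,500 m range quoted above.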

Leveraging its encoder-decoder framework and lightweight RCBA module, NOS-Net has the potential to be transferred to single-channel SAR datasets from other sensors such as RADARSAT-2. This transfer requires three key adaptations: (1) standardized preprocessing (unified radiometric calibration, noise reduction, and resolution adjustment); (2) transfer learning with target region data (e.g., using 20% of the target dataset); and (3) adjustment of the output layer and loss function to fit the target label system.
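As an illustration of the second adaptation, the sketch below sets aside 20% of a target dataset for fine-tuning and keeps the rest for evaluation. It is an assumption-laden example: the function `split_target_scenes`, the scene identifiers, and the scene-level split are all hypothetical and are not described in the study.

```python
import random


def split_target_scenes(scene_ids, fine_tune_fraction=0.2, seed=0):
    """Split target-sensor scenes into fine-tuning and held-out subsets.

    The split is made per scene so that no image contributes to both
    subsets; the fixed seed keeps the split reproducible.
    """
    if not 0.0 < fine_tune_fraction < 1.0:
        raise ValueError("fine_tune_fraction must be between 0 and 1")
    shuffled = sorted(scene_ids)
    random.Random(seed).shuffle(shuffled)
    n_fine_tune = max(1, round(len(shuffled) * fine_tune_fraction))
    return shuffled[:n_fine_tune], shuffled[n_fine_tune:]


if __name__ == "__main__":
    scenes = [f"scene_{i:03d}" for i in range(50)]  # placeholder IDs
    fine_tune, held_out = split_target_scenes(scenes)
    print(len(fine_tune), len(held_out))
```

Keeping the held-out scenes untouched during fine-tuning gives an evaluation subset that is independent of the data used for adaptation.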

To address the identified limitations, future research will focus on three core directions: First, expand the dataset scale by incorporating multi-sensor and multi-regional SAR images, supplement an independent validation subset, and develop standardized preprocessing protocols to enhance model robustness; second, optimize the RCBA module to strengthen the ability to capture ultra-fine edges of oil spills and subtle differences between oil spills and oil slick look-alikes; third, integrate multi-modal data to mitigate the impact of extreme wind speed conditions, improve the spatiotemporal matching of ocean current data, and thereby enhance the accuracy of seabed leakage point inference.

These initiatives aim to strengthen the model’s adaptability to complex marine environments and promote its practical application in global natural oil spill monitoring and offshore oil and gas exploration.

5 Conclusion

Based on the U-Net architecture, this study constructed NOS-Net, a semantic segmentation model integrating a residual network and an attention mechanism, and applied it to natural oil spill detection in Sentinel-1 SAR images of size 1,250 × 650 × 3. All images underwent preprocessing including radiometric calibration, 7 × 7 median filtering, and linear transformation. Model training and validation were completed using 110 images of the same size, with Intersection over Union (IoU), Accuracy, and mean Pixel Accuracy (mPA) as core evaluation metrics (where mPA is the average of recall rates across all categories). Experimental results show that NOS-Net achieves an IoU of 61.27% for the oil spill category and a mean Intersection over Union (mIoU) of 70.29% for the five marine target categories (Oil Spill, Look-alike, Ship, Land, and Sea Surface). Compared with the U-Net model with ResNet101 as the backbone, these metrics are higher by 7.48 and 5.32 percentage points, respectively. Additionally, the model achieves an Accuracy of 95.37% and an mPA of 82.21%, supporting its effectiveness.
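For reference, the evaluation metrics named above can be computed from a pixel-level confusion matrix as sketched below. This is a generic illustration under the standard definitions, with mPA taken as the mean per-class recall as stated in the text. The function name `segmentation_metrics` and the toy two-class matrix are made up for the example and are not results from this study.

```python
def segmentation_metrics(confusion):
    """Compute per-class IoU, mIoU, Accuracy and mPA.

    confusion[i][j] counts pixels whose true class is i and whose
    predicted class is j. Classes with no pixels score 0.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    ious, recalls = [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # missed pixels of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp   # false alarms for class k
        union = tp + fp + fn
        ious.append(tp / union if union else 0.0)
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
    return {
        "iou": ious,
        "miou": sum(ious) / n,
        "accuracy": sum(confusion[k][k] for k in range(n)) / total,
        "mpa": sum(recalls) / n,
    }


if __name__ == "__main__":
    toy = [[8, 2],   # true class 0: 8 correct, 2 predicted as class 1
           [1, 9]]   # true class 1: 1 predicted as class 0, 9 correct
    print(segmentation_metrics(toy))
```

Because IoU penalizes both missed pixels and false alarms, it is typically lower than recall for minority classes such as oil spills, which is why it is reported alongside Accuracy and mPA.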

The performance advantages of NOS-Net stem from its core design and module synergy: Firstly, at the architectural level, it deeply integrates the encoder-decoder structure of U-Net with the residual learning mechanism of ResNet. The encoder captures global oil spill region features through downsampling, while the decoder restores edge details through upsampling, satisfying the dual requirements of “large-scale recognition + fine-grained segmentation”. Simultaneously, the residual structure addresses the gradient vanishing/explosion issue in deep network training, enabling increased network depth to extract richer discriminative features—effectively compensating for the insufficient depth of U-Net and poor segmentation adaptability of ResNet. Secondly, the innovatively introduced Residual Convolutional Block Attention Module (RCBA) plays a key role: it adaptively learns the importance weights of feature channels via the channel attention mechanism, highlighting key feature channels related to oil spill texture and grayscale; through the spatial attention mechanism, it focuses on potential oil spill regions and suppresses background noise interference (e.g., ocean waves, ships). Meanwhile, its lightweight design controls computational overhead while enhancing feature representation capability, significantly improving the model’s ability to distinguish oil spills from look-alikes and capture fine oil spill edges—this is the core reason for its IoU improvement over the traditional U-Net model.

Based on the trained model, this study completed the segmentation and recognition of oil spill regions in SAR images of the Lower Congo Basin. Through the determination of oil slick origin, and analysis of ocean currents and oil droplet ascent speed, indirect inference of seabed leakage points was realized. The results confirm that under moderate to low wind speed conditions (3.0–7.0 m/s), using SAR images for natural oil spill detection and seabed leakage point localization is feasible, providing reliable technical support for offshore oil and gas exploration.

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found here: https://m4d.iti.gr/oil-spill-detection-dataset/.

Author contributions

PC: Writing – review and editing, Writing – original draft. YQ: Formal Analysis, Conceptualization, Methodology, Writing – original draft, Funding acquisition. JZ: Writing – review and editing, Data curation, Validation, Resources, Project administration.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the North China Institute of Aerospace Engineering Doctoral Scientific Research Starting Fund Project (Grant No: BKY202133). This funding is non-commercial. Funder and grant number are accurately provided.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Ajadi, O. A., Meyer, F. J., Tello, M., and Ruello, G. (2018). Oil spill detection in synthetic aperture radar images using Lipschitz-regularity and multiscale techniques. IEEE J. Sel. Top. Appl. Earth Observations Remote Sens. 11 (7), 2389–2405. doi:10.1109/jstars.2018.2827996

Al-Sudani, I. A., and Al-Suhail, G. A. (2024). “Image-based oil spill detection using deep learning techniques: a review,” in 2024 5th International Conference on Communications, Information, Electronic and Energy Systems (CIEES), Veliko Tarnovo, Bulgaria (IEEE), 1–8. doi:10.1109/CIEES62939.2024.10811302

Ball, J. E., Anderson, D. T., and Chan, C. S. (2017). Comprehensive survey of deep learning in remote sensing: theories, tools, and challenges for the community. J. Appl. Remote Sens. 11 (4), 042609. doi:10.1117/1.jrs.11.042609

Bianchi, F. M., Espeseth, M. M., and Borch, N. (2020). Large-scale detection and categorization of oil spills from SAR images with deep learning. Remote Sens. 12 (14), 2260. doi:10.3390/rs12142260

Brekke, C., and Solberg, A. H. S. (2008). Classifiers and confidence estimation for oil spill detection in ENVISAT ASAR images. IEEE Geosci. Remote Sens. Lett. 5 (1), 65–69. doi:10.1109/lgrs.2007.907174

Brekke, C., and Solberg, A. H. (2005). Oil spill detection by satellite remote sensing. Remote Sens. Environ. 95 (1), 1–13. doi:10.1016/j.rse.2004.11.015

Buijsman, M. C., and Ridderinkhof, H. (2007). Long-term ferry-ADCP observations of tidal currents in the Marsdiep inlet. J. Sea Res. 57 (4), 237–256. doi:10.1016/j.seares.2006.11.004

Burwood, R. (1999). Angola: source rock control for lower Congo coastal and Kwanza Basin petroleum systems. Geol. Soc. Lond. Spec. Publ. 153 (1), 181–194. doi:10.1144/gsl.sp.1999.153.01.12

Cheng, L., Li, Y., Qin, M., and Liu, B. (2024). A marine oil spill detection framework considering special disturbances using Sentinel-1 data in the Suez Canal. Mar. Pollut. Bull. 208, 117012. doi:10.1016/j.marpolbul.2024.117012

De Kerf, T., Sels, S., Samsonova, S., and Vanlanduit, S. (2024). Oil spill drone: a dataset of drone-captured, segmented RGB images for oil spill detection in port environments. arXiv:2402.18202. doi:10.48550/arXiv.2402.18202

de Souza Júnior, J. M. N., de Mendonça, L. F. F., da Silva Costa, H., de Freitas, R. A. P., Casagrande, F., da Silva Lindemann, D., et al. (2024). Dispersion analysis of the 2017 Persian Gulf oil spill based on remote sensing data and numerical modelling. Mar. Pollut. Bull. 205, 116639. doi:10.1016/j.marpolbul.2024.116639

Dong, Y., Liu, Y., Hu, C., MacDonald, I. R., and Lu, Y. (2022). Chronic oiling in global oceans. Science 376 (6599), 1300–1304. doi:10.1126/science.abm5940

Dong, S., Feng, J., Gu, Z., Yin, K., and Long, Y. (2025). A review of artificial intelligence and remote sensing for marine oil spill detection, classification, and thickness estimation. Remote Sens. 17 (22), 3681. doi:10.3390/rs17223681

Espeseth, M. M., Jones, C. E., Holt, B., Brekke, C., and Skrunes, S. (2020). Oil-spill-response-oriented information products derived from a rapid-repeat time series of SAR images. IEEE J. Sel. Top. Appl. Earth Observations Remote Sens. 13, 3448–3461. doi:10.1109/jstars.2020.3003686

Fox-Kemper, B., Adcroft, A., Böning, C. W., Chassignet, E. P., Curchitser, E., Danabasoglu, G., et al. (2019). Challenges and prospects in ocean circulation models. Front. Mar. Sci. 6, 65. doi:10.3389/fmars.2019.00065

Garcia-Pineda, O., Zimmer, B., Howard, M., Pichel, W., Li, X., and MacDonald, I. R. (2009). Using SAR images to delineate ocean oil slicks with a texture-classifying neural network algorithm (TCNNA). Can. J. Remote Sens. 35 (5), 411–421. doi:10.5589/m09-035

Garcia-Pineda, O., Holmes, J., Rissing, M., Jones, R., Wobus, C., Svejkovsky, J., et al. (2017). Detection of oil near shorelines during the Deepwater Horizon oil spill using synthetic aperture radar (SAR). Remote Sens. 9 (6), 567. doi:10.3390/rs9060567

Gay, A., Lopez, M., Berndt, C., and Seranne, M. (2007). Geological controls on focused fluid flow associated with seafloor seeps in the lower Congo Basin. Mar. Geol. 244 (1-4), 68–92. doi:10.1016/j.margeo.2007.06.003

Golcarenarenji, G., Amer, E., Mohasseb, A., and Elboughdadly, T. (2025). “Explainable oil spill detection using UAV imagery,” in 2025 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC) (Cairo, Egypt: Institute of Electrical and Electronics Engineers Inc.), 129–134. doi:10.1109/MIUCC66482.2025.11196871

Hardman-Mountford, N. J., Richardson, A. J., Agenbag, J. J., Hagen, E., Nykjaer, L., Shillington, F. A., et al. (2003). Ocean climate of the south east Atlantic observed from satellite data and wind models. Prog. Oceanogr. 59 (2-3), 181–221. doi:10.1016/j.pocean.2003.10.001

He, K., Zhang, X., Ren, S., and Sun, J. (2016). “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 770–778. doi:10.1109/cvpr.2016.90

Hernández-Hamón, H., Ramírez, P. Z., Zaraza, M., and Micallef, A. (2023). Google Earth Engine app using Sentinel 1 SAR and deep learning for ocean seep methane detection and monitoring. Remote Sens. Appl. Soc. Environ. 32, 101036. doi:10.1016/j.rsase.2023.101036

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., et al. (2020). The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146 (730), 1999–2049. doi:10.1002/qj.3803

Ivanov, A. Y., Matrosova, E. R., Kucheiko, A. Y., Filimonova, N. A., Evtushenko, N. V., Terleeva, N. V., et al. (2020). Search and detection of natural oil seeps in the seas surrounding the Russian Federation using spaceborne SAR imagery. Izvestiya, Atmos. Ocean. Phys. 56, 1590–1604. doi:10.1134/s0001433820120439

Jatiault, R., Dhont, D., Loncke, L., and Dubucq, D. (2017). Monitoring of natural oil seepage in the Lower Congo Basin using SAR observations. Remote Sens. Environ. 191, 258–272. doi:10.1016/j.rse.2017.01.031

Jatiault, R., Dhont, D., Loncke, L., de Madron, X. D., Dubucq, D., Channelliere, C., et al. (2018). Deflection of natural oil droplets through the water column in deep-water environments: the case of the lower Congo Basin. Deep Sea Res. Part I Oceanogr. Res. Pap. 136, 44–61. doi:10.1016/j.dsr.2018.04.009

Karner, G. D., Driscoll, N. W., and Barker, D. H. N. (2003). Syn-rift regional subsidence across the West African continental margin: the role of lower plate ductile extension. Geol. Soc. Lond. Spec. Publ. 207, 105–129. doi:10.1144/gsl.sp.2003.207.6

Kim, T. H., Yang, C. S., Oh, J. H., and Ouchi, K. (2014). Analysis of the contribution of wind drift factor to oil slick movement under strong tidal condition: Hebei Spirit oil spill case. PloS One 9 (1), e87393. doi:10.1371/journal.pone.0087393

King, L. H., and MacLean, B. (1970). Pockmarks on the Scotian shelf. Geol. Soc. Am. Bull. 81 (10), 3141–3148. doi:10.1130/0016-7606(1970)81[3141:potss]2.0.co;2

Krestenitis, M., Orfanidis, G., Ioannidis, K., Avgerinakis, K., Vrochidis, S., and Kompatsiaris, I. (2019). Oil spill identification from satellite images using deep neural networks. Remote Sens. 11 (15), 1762. doi:10.3390/rs11151762

Kurah, I. M., Adamu, S., Alhussian, H., Alwadain, A., Adamu Aliyu, D., Mamman, H., et al. (2025). Deep learning-based hyperparameter optimization for enhanced segmentation of RGB images in oil spill detection within port environments. IEEE Access 13, 146052–146067. doi:10.1109/ACCESS.2025.3599593

Leifer, I., Lehr, W. J., Simecek-Beatty, D., Bradley, E., Clark, R., Dennison, P., et al. (2012). State of the art satellite and airborne marine oil spill remote sensing: application to the BP Deepwater Horizon oil spill. Remote Sens. Environ. 124, 185–209. doi:10.1016/j.rse.2012.03.024

Li, X., Sun, X., Meng, Y., Liang, J., Wu, F., and Li, J. (2020). “Dice loss for data-imbalanced NLP tasks,” in Proceedings of the 58th annual meeting of the association for computational linguistics, 465–476.

Liu, Y., Yin, Z., and Cai, H. (2025). Enhanced global oil spill dataset from 1967 to 2023 based on text-form incident information. Sci. Data 12 (1), 1394. doi:10.1038/s41597-025-05601-9

MacDonald, I. R., Garcia-Pineda, O., Beet, A., Daneshgar Asl, S., Feng, L., Graettinger, G., et al. (2015). Natural and unnatural oil slicks in the Gulf of Mexico. J. Geophys. Res. Oceans 120 (12), 8364–8380. doi:10.1002/2015JC011062

Marghany, M. (2014). Utilization of a genetic algorithm for the automatic detection of oil spill from RADARSAT-2 SAR satellite data. Mar. Pollut. Bull. 89 (1-2), 20–29. doi:10.1016/j.marpolbul.2014.10.041

Marghany, M. (2019). Synthetic aperture radar imaging mechanism for oil spills. Cambridge, MA: Gulf Professional Publishing.

Marmorino, G. O., Smith, G. B., Toporkov, J. V., Sletten, M. A., Perkovic, D., and Frasier, S. J. (2008). Evolution of ocean slicks under a rising wind. J. Geophys. Res. Oceans 113 (C4). doi:10.1029/2007jc004538

McNutt, M. K., Chu, S., Lubchenco, J., Hunter, T., Dreyfus, G., Murawski, S. A., et al. (2012). Applications of science and engineering to quantify and control the Deepwater Horizon oil spill. Proc. Natl. Acad. Sci. 109 (50), 20222–20228. doi:10.1073/pnas.1214389109

Mikolaj, P. G., Allen, A. A., and Schlueter, R. S. (1972). “Investigation of the nature, extent and fate of natural oil seepage off southern California,” in Offshore Technology Conference (Houston, TX: OTC).

Moroshkin, K. V., Bubnov, V. A., and Bulatov, R. P. (1970). Water circulation in the eastern South Atlantic Ocean. Oceanology 10 (1), 27–34.

Orfanidis, G., Ioannidis, K., Avgerinakis, K., Vrochidis, S., and Kompatsiaris, I. (2018). “A deep neural network for oil spill semantic segmentation in SAR images,” in 2018 25th IEEE International Conference on Image Processing (ICIP) (IEEE), 3773–3777.

Ronneberger, O., Fischer, P., and Brox, T. (2015). “U-net: convolutional networks for biomedical image segmentation,” in Medical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015 (Springer International Publishing), 234–241.

Séranne, M. (1999). Early Oligocene stratigraphic turnover on the west Africa continental margin: a signature of the Tertiary greenhouse-to-icehouse transition? Terra Nova. 11 (4), 135–140. doi:10.1046/j.1365-3121.1999.00246.x

Shaban, M., Salim, R., Abu Khalifeh, H., Khelifi, A., Shalaby, A., El-Mashad, S., et al. (2021). A deep-learning framework for the detection of oil spills from SAR data. Sensors 21 (7), 2351. doi:10.3390/s21072351

Siagian, H., Ismanto, A., Prasetyawan, I. B., Sukmadewa, Y., Hoir, I. F., Putra, T. W. L., et al. (2023). “Application of acoustic Doppler current profile (ADCP) to estimate suspended solid concentration (SSC) during the tidal phase, case study: Donggala, Palu,” in IOP Conference Series: Earth and Environmental Science (Semarang, Indonesia) 1224 (1), 012030. doi:10.1088/1755-1315/1224/1/012030

Singha, S., Ressel, R., Velotto, D., and Lehner, S. (2016). A combination of traditional and polarimetric features for oil spill detection using TerraSAR-X. IEEE J. Sel. Top. Appl. Earth Observations Remote Sens. 9 (11), 4979–4990. doi:10.1109/jstars.2016.2559946

Song, D., Ding, Y., Li, X., Zhang, B., and Xu, M. (2017). Ocean oil spill classification with RADARSAT-2 SAR based on an optimized wavelet neural network. Remote Sens. 9 (8), 799. doi:10.3390/rs9080799

Tong, J., and Xie, D. (2025). On the features extracted from dual-polarized Sentinel-1 images for deep-learning-based sea surface oil-spill detection. IEEE J. Sel. Top. Appl. Earth Observations Remote Sens. 18, 23045–23060. doi:10.1109/jstars.2025.3604418

Topouzelis, K. N. (2008). Oil spill detection by SAR images: dark formation detection, feature extraction and classification algorithms. Sensors 8 (10), 6642–6659. doi:10.3390/s8106642

Vlăsceanu, G. V., Tarbă, N., Voncilă, M. L., and Boiangiu, C. A. (2024). Selecting the right metric: a detailed study on image segmentation evaluation. BRAIN. Broad Res. Artif. Intell. Neurosci. 15 (4), 295–318. doi:10.70594/brain/15.4/20

Woo, S., Park, J., Lee, J. Y., and Kweon, I. S. (2018). “CBAM: convolutional block attention module,” in Proceedings of the European conference on computer vision (ECCV), 3–19. doi:10.1007/978-3-030-01234-2_1

Wu, Y. X., Du, K., Wang, X. J., and Min, F. (2024). Misclassification-guided loss under the weighted cross-entropy loss framework. Knowl. Inf. Syst. 66 (8), 4685–4720. doi:10.1007/s10115-024-02123-5

Xu, L., Li, J., and Brenning, A. (2014). A comparative study of different classification techniques for marine oil spill identification using RADARSAT-1 imagery. Remote Sens. Environ. 141, 14–23. doi:10.1016/j.rse.2013.10.012

Yan, H., Cheng, X., Liu, M., Li, Q., Ma, K., Chen, Z., et al. (2025). Study on optimization of prediction models for small-scale oil spill area: impacts of wind and waves. Process Saf. Environ. Prot. 201, 107595. doi:10.1016/j.psep.2025.107595

Zeng, K., and Wang, Y. (2020). A deep convolutional neural network for oil spill detection from spaceborne SAR images. Remote Sens. 12 (6), 1015. doi:10.3390/rs12061015

Keywords: SAR, remote sensing, satellite, natural seepage, oil spill

Citation: Cheng P, Qi Y and Zhao J (2026) Automated oil spill detection using deep learning and SAR satellite data: a case study of the Lower Congo Basin. Front. Earth Sci. 13:1667450. doi: 10.3389/feart.2025.1667450

Received: 16 July 2025; Accepted: 08 December 2025;
Published: 23 January 2026.

Edited by:

Jinran Wu, The University of Queensland, Australia

Reviewed by:

Maged Marghany, Universitas Malikussaleh, Indonesia
Ghaida Al-Suhail, University of Basrah, Iraq

Copyright © 2026 Cheng, Qi and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yanxin Qi
