Your new experience awaits. Try the new design now and help us make it even better

ORIGINAL RESEARCH article

Front. Phys., 23 October 2025

Sec. Radiation Detectors and Imaging

Volume 13 - 2025 | https://doi.org/10.3389/fphy.2025.1650714

Optimization of laser spot edge extraction and localization based on multi-scale adaptive convolution

Keya YuanKeya Yuan1Lin Li
Lin Li2*
  • 1College of Robotics, Beijing Union University, Beijing, China
  • 2College of Applied Science and Technology, Beijing Union University, Beijing, China

The precise extraction of laser spot edges plays a fundamental role in optical measurement systems, yet traditional methods struggle with noise interference and varying spot characteristics. Existing approaches face significant challenges in achieving robust subpixel accuracy across diverse experimental conditions, particularly for irregular spots and low signal-to-noise scenarios. This article presents a novel multi-scale adaptive convolution framework that integrates three key innovations: (1) dynamic kernel adjustment based on local intensity gradients, (2) hierarchical feature pyramid architecture combining spatial details with semantic features, and (3) subpixel localization through Gaussian surface fitting and gradient extremum analysis. Extensive experiments demonstrate the method’s superior performance, achieving 0.12-pixel root mean square error (RMSE) on standard Gaussian beams (vs. 0.38 for Canny), maintaining 0.15-pixel accuracy with aberrated spots, and showing remarkable robustness at 5 dB SNR (0.28-pixel RMSE). The results establish that our hybrid approach successfully bridges physical modeling with data-driven adaptation, delivering unprecedented precision (0.91 temporal–spatial consistency) for laser-based applications ranging from industrial metrology to biomedical imaging. The ablation studies further confirm the critical importance of both multi-scale adaptation (61% accuracy drop when removed) and analytical modeling (0.842 F1-score without Gaussian fitting), providing valuable insights for future edge detection research.

1 Introduction

The extraction of laser spot edges holds significant research importance in various scientific and engineering applications as it is a fundamental step for precise optical measurement Bonnett Del Alamo et al. [1], alignment Yin et al. [2], and quality control. It is widely utilized in fields such as industrial processing Gao et al. [3], medical diagnostics Zhao et al. [4], and optical communication Chen et al. [5], where accurate edge detection directly influences system performance and measurement reliability. Traditional edge extraction methods Sun et al. [6] often face challenges in handling noise interference, uneven intensity distributions, and varying spot sizes, leading to reduced positioning accuracy and robustness. The development of advanced edge extraction techniques can substantially improve the precision and adaptability of laser spot analysis by dynamically adjusting to intensity variations and morphological characteristics across different scales. This capability is especially valuable in practical scenarios where laser spots exhibit complex patterns due to beam divergence, scattering, or environmental disturbances. Enhanced edge localization not only refines the spatial resolution of optical systems but also facilitates subsequent tasks such as centroid calculation, beam profiling, and aberration correction. Furthermore, optimized edge extraction contributes to the automation of laser-based systems by providing more reliable input for real-time feedback control and decision-making processes. From a broader perspective, advancements in this area can benefit interdisciplinary applications ranging from high-precision manufacturing to biomedical imaging, where subtle edge variations may carry critical information about material properties or physiological conditions.

A significant amount of research work has been devoted to solving laser spot edge extraction. A significant challenge identified across multiple studies is the low precision of pixel-level edge detection, which results in substantial errors in light spot measurement Pan et al. [7]. To address this, researchers have explored subpixel edge detection techniques that enhance the accuracy of edge localization, thereby reducing measurement errors Pan et al. [7]; Mattsson [8]. Subpixel methods, such as those employing the Gaussian fitting approach, have been utilized to achieve higher precision in laser spot edge detection. For instance, the combination of Gaussian fitting with Canny edge detection and gray-scale barycenter methods has been demonstrated to improve centroid extraction accuracy for infrared laser spots Yang et al. [9]. Similarly, subpixel accuracy is crucial in applications requiring precise measurement of laser beam parameters, such as beam size and divergence, which are typically assessed using knife-edge techniques MOHAMED [10]. In addition to subpixel techniques, advanced image processing algorithms have been developed to handle irregular spot shapes and complex backgrounds. For example, the Otsu-K-means gravity-based multi-spot center extraction method was proposed to improve the extraction of laser spots with irregular shapes, although its accuracy still faces limitations Chen et al. [11]. Background modeling approaches, such as the average background model, have also been employed to enhance spot detection in challenging conditions, yet the accuracy remains insufficient for some applications Chen et al. [11]. Edge detection methods in laser welding and other industrial applications have traditionally relied on simple computer vision techniques to identify weld seam edges and laser-induced features Ali et al. [12]; Mattulat [13]. These methods often focus on the maximum distance of the laser spot edge relative to a reference, with reported detection distances approximately 0.1 mm, indicating the importance of precise edge localization for quality control Mattulat [13]. The detection of laser spots and their edges plays a vital role in 3D measurements and internal defect evaluation. Techniques such as laser line extraction with subpixel accuracy have been developed to improve the detection of jagged edges, which is critical for accurate 3D reconstruction and internal delamination assessment Zhou et al. [14]. The integration of laser sensors, including laser distance and positioning sensors, further underscores the importance Ning et al. [15].

The current body of research highlights a trend toward employing subpixel and advanced image processing techniques to enhance the precision of laser spot edge extraction. Despite these advancements, challenges remain in achieving high accuracy under complex conditions, particularly for irregular spot shapes and noisy backgrounds, indicating ongoing opportunities for methodological improvements in this field. Recent advances in laser measurement systems have demonstrated remarkable progress in real-time adaptive control Meng et al. [16]. Ning et al. [17] introduced the frame-segmentation LIPA (FLIPA) algorithm and laser-induced breakdown spectroscopy (LIBS)-FLIPA multimodal fusion technique, which reduce LIPA variables by 99% while significantly enhancing classification accuracy, robustness, and generalization in plastic waste sorting, thereby overcoming critical limitations of conventional LIBS analysis.

The application of multi-scale adaptive convolution in laser spot edge extraction offers significant advantages by dynamically adjusting to varying spot sizes, intensity distributions, and noise levels. Unlike traditional fixed-kernel methods, this approach improves edge detection accuracy by analyzing features across different scales, ensuring robustness against blurring, uneven illumination, and low signal-to-noise ratios. Such adaptability is particularly valuable in real-world scenarios where laser spots exhibit complex shapes due to beam divergence, scattering, or optical distortions. By improving edge localization precision, this method enables more reliable centroid calculation, beam profiling, and optical system alignment, benefiting applications in precision manufacturing, biomedical imaging, and laser-based metrology. Its computational efficiency makes it suitable for real-time processing, supporting automation in laser-guided systems.

To address the scale sensitivity and noise interference issues encountered by traditional edge detection algorithms in laser spot processing, this study proposes a novel edge extraction method based on multi-scale adaptive convolution. The approach constructs a multi-scale feature pyramid to extract edge characteristics of laser spots across different scale spaces, while incorporating an adaptive weighting mechanism to dynamically adjust convolution kernel parameters, thereby achieving robust detection for spots with varying sizes and intensity distributions. A subpixel precision positioning algorithm is developed for edge localization optimization by integrating Gaussian surface fitting with gradient extremum analysis, which significantly enhances the accuracy of edge localization.

The three main innovations of this study are as follows.

1. The first innovation lies in developing a multi-scale adaptive convolution mechanism that dynamically adjusts kernel parameters based on local intensity gradients, enabling robust edge extraction across varying spot sizes and illumination conditions. This approach overcomes the fixed-scale limitation of conventional edge detectors.

2. The second innovation introduces a hierarchical feature pyramid architecture that combines shallow spatial details with deep semantic features, allowing simultaneous preservation of edge sharpness while suppressing noise interference at different scales.

3. The third innovation proposes a novel subpixel localization module integrating Gaussian surface fitting with gradient extremum analysis, which achieves higher positioning accuracy than traditional interpolation-based methods by modeling the continuous intensity distribution.

2 Related work

2.1 Traditional approaches

Gradient-based methods constitute a fundamental approach for laser spot edge detection Al Darwich et al. [18], operating on the principle of identifying intensity discontinuities in digital images. These techniques typically employ convolution kernels, such as Han et al. [19], Yan et al. [20] or Roberts operators Darwis et al. [21], to compute spatial derivatives that highlight regions of rapid intensity change corresponding to potential edges. The Canny edge detector Lu et al. [22] further refines this approach through multi-stage processing involving noise reduction, gradient calculation, non-maximum suppression, and hysteresis thresholding. These methods offer computational efficiency and straightforward implementation, making them widely accessible for various applications. Their effectiveness is particularly notable in scenarios with high-contrast laser spots and clean background conditions, where they can provide satisfactory edge localization accuracy with relatively low computational overhead. However, gradient-based techniques exhibit several inherent limitations when dealing with complex laser spot images. They are highly sensitive to noise and illumination variations, often producing fragmented or false edges in low-quality images. The fixed-size convolution kernels struggle to adapt to laser spots with varying sizes or blur levels, leading to inconsistent performance across different experimental conditions. Moreover, these methods typically output pixel-level edges without subpixel precision, limiting their usefulness in high-accuracy applications. The threshold selection process remains another critical challenge, as inappropriate values may either miss genuine edges or introduce excessive noise.

2.2 Model-fitting methods

Model-fitting methods provide a mathematically rigorous approach to laser spot edge detection by approximating the intensity distribution with parametric functions. These techniques typically employ Wang and Chen et al. [23] or Moffat functions to model the spot’s radial intensity profile, where edges are determined by analyzing the fitted model’s characteristics, such as inflection points or specific intensity thresholds. The fitting process often involves nonlinear least-squares optimization to minimize the discrepancy between the model and observed pixel values. This approach offers several advantages, including inherent noise suppression through the fitting procedure and the ability to achieve subpixel edge localization precision Ning et al. [24]. The parametric nature of these methods allows for the simultaneous extraction of multiple spot characteristics beyond only edges, such as centroid position, beam width, and intensity distribution parameters, making them particularly valuable for comprehensive beam analysis applications. Despite their theoretical advantages, model-fitting methods present several practical challenges in laser spot edge detection. The computational complexity of nonlinear fitting procedures can be significantly higher than simpler gradient-based methods, potentially limiting real-time applications. These techniques are also sensitive to initial parameter guesses and may converge to local minima if the spot exhibits irregular shapes or contains significant noise. The assumption of a specific intensity profile (typically Gaussian) may not hold true for all experimental conditions, particularly when dealing with distorted or aberrated laser beams. Additionally, the performance tends to degrade when processing spots with low signal-to-noise ratios or when multiple spots overlap in the image.

2.3 Deep learning approaches

Deep learning approaches have emerged as a powerful alternative for laser spot edge detection, leveraging convolutional neural networks (CNNs) Ma et al. [25] to automatically learn discriminative features from training data. These methods typically employ encoder-decoder architectures or specialized edge detection networks that process raw pixel intensities to directly predict edge maps or spot boundaries. Unlike traditional algorithms, deep learning models can capture complex spatial relationships and contextual information, enabling robust performance across varying spot sizes, shapes, and noise conditions. The data-driven nature of these approaches allows them to adapt to diverse experimental setups without requiring explicit mathematical modeling of the spot characteristics. Advanced architectures may incorporate multi-scale processing and attention mechanisms to enhance edge localization precision while maintaining computational efficiency through optimized network designs.

3 Methodology

This article proposes an innovative laser spot edge extraction framework combining multi-scale adaptive convolution with subpixel localization techniques. The methodology first constructs a multi-scale feature pyramid to analyze edge characteristics across different resolutions, employing an adaptive weighting mechanism to dynamically optimize convolution kernel parameters for varying spot sizes and intensity distributions. Subsequently, a hybrid localization algorithm integrates Gaussian surface fitting with gradient extremum analysis to achieve subpixel edge positioning accuracy. This dual approach effectively addresses traditional challenges of scale sensitivity and noise interference while maintaining computational efficiency, offering significant improvements over conventional edge detection methods in terms of both robustness and precision for laser spot analysis applications.

3.1 Multi-scale feature pyramid construction and adaptive convolution mechanism

The multi-scale feature pyramid construction forms the foundational component of our proposed edge detection framework, designed to comprehensively capture laser spot characteristics across different spatial resolutions. The pyramid is built through a hierarchical downsampling process where the original input image I0(x,y) with resolution H×W is progressively downscaled to create multiple scale representations {Is(x,y)}s=1S, where S denotes the total number of scales and each level s has resolution H/2s×W/2s. The downsampling operation incorporates a learned anti-aliasing filter Gσ with adaptive standard deviation σs for each scale. The above process can be expressed as:

Is(x,y)=2GσsIs1(x,y)(1)

where 2 denotes 2:1 downsampling, and * represents convolution. The key innovation lies in our adaptive scale selection mechanism that automatically determines the optimal number of pyramid levels S based on the input spot size distribution, computed through:

S=log2min(H,W)βdavg(2)

where davg is the average spot diameter estimated from preliminary detection, and β is a scaling factor controlling the minimum detectable feature size. This dynamic pyramid construction ensures sufficient scale coverage while avoiding unnecessary computational overhead.

At each pyramid level, we employ a set of parallel adaptive convolution kernels {Ks,k}k=1K with learnable parameters that automatically adjust their receptive fields based on local intensity gradients. The kernel adaptation follows:

Ks,k(x,y)=exp(αs,kGs,k(x,y))k=1Kexp(αs,kGs,k(x,y))(3)

where Gs,k(x,y) represents the gradient magnitude map for kernel k at scale s, and αs,k are learnable attention weights. This formulation enables the network to focus on the most discriminative features at each scale while maintaining spatial consistency.

The final feature representation combines information across all scales through our proposed cross-scale fusion module:

F(x,y)=s=0SωsU(Fs(x,y),I0)(4)

where U() denotes upsampling to the original resolution, Fs(x,y) are the scale-specific features, and ωs are learned fusion weights computed through a gating mechanism based on global context. This multi-scale analysis provides comprehensive edge information, while our adaptive mechanisms ensure optimal feature extraction regardless of spot size variations or noise conditions.

3.2 Edge localization optimization

The edge localization optimization module represents a significant advancement in subpixel precision through a novel integration of Gaussian surface fitting with gradient extremum analysis. The Gaussian fitting component employs an anisotropic 2D Gaussian model to approximate the laser spot intensity distribution. The above process can be expressed as:

G(x,y)=Aexp(xx0)22σx2(yy0)22σy2+B+Cx+Dy(5)

where A is the amplitude, (x0,y0) represents the spot center, σx and σy control the spread along each axis, B accounts for background illumination, and the linear terms Cx+Dy compensate for uneven illumination gradients. Our key innovation lies in the adaptive initialization of these parameters through a multi-scale moment analysis, where initial estimates for σx, σy, and (x0,y0) are derived from weighted combinations of moments computed across different pyramid levels:

σx(0)=1Ss=1Sωsμ20(s)μ00(s)μ10(s)μ00(s)2(6)

with μpq(s) being the image moments at scale s and ωs representing scale-specific reliability weights. This initialization scheme significantly improves the convergence and accuracy of the subsequent nonlinear least-squares optimization, which incorporates a novel regularization term to maintain edge sharpness:

Lfit=i=1Nwi[I(xi,yi)G(xi,yi)]2+λ1(σx1+σy1)+λ2G2(7)

where wi are spatially adaptive weights based on gradient magnitude, and λ1, λ2 are automatically tuned regularization parameters.

The gradient extremum analysis component provides complementary edge localization through a sophisticated continuous-domain approach. We first compute the multi-scale gradient field g(x,y)=(gx(x,y),gy(x,y))T using our adaptive convolution kernels, then construct a cubic Hermite spline representation of the gradient magnitude g(x,y) in the neighborhood of candidate edges. The subpixel edge positions are located by solving

gn=0,2gn2<0(8)

where n is the direction normal to the edge, obtained from the eigenvector corresponding to the smallest eigenvalue of the structure tensor. The above process can be expressed as:

n=gx2gxgygxgygy2(9)

Our innovation here involves a multi-resolution verification scheme where gradient extrema are detected at multiple scales and consolidated through a voting mechanism, effectively suppressing spurious edges while preserving genuine ones.

The final edge localization combines results from both approaches through our confidence-weighted fusion:

pf=αpg+(1α)pe+βd(10)

where pg and pe are positions from Gaussian fitting and gradient analysis, respectively, α[0,1] is a confidence measure based on fitting residual and gradient strength, and β controls the influence of the edge direction vector d derived from the structure tensor. The confidence measure α incorporates both local and global consistency checks:

α=exp(γ1rg/r̄)exp(γ1rg/r̄)+exp(γ2g/ḡ)(11)

with rg being the local fitting residual, r̄ and ḡ representing image-wide averages, and γ1, γ2 controlling the relative weighting.

The algorithm’s robustness is further enhanced by our novel post-processing stage that incorporates topological constraints. We formulate edge connectivity as a graph optimization problem where nodes represent candidate edge points and edges encode geometric relationships:

E(G)=i=1NjN(i)λdpipj2+λa(1niTnj)(12)

where N(i) denotes spatial neighbors,ni is the unit normal vector at point i, and λd, λa balance distance and angular consistency. The Multi-scale Adaptive Convolution with a Gaussian-gradient Fusion algorithm is represented in pseudo-code as Algorithm 1.

Algorithm 1

Algorithm 1. Multi-scale Adaptive Convolution with Gaussian-gradient Fusion.

4 Experiment

4.1 Experimental setup

The experimental setup was implemented on a high-performance computing platform equipped with an Intel Xeon Gold 6248R processor (3.0 GHz, 24 cores) and 256 GB RAM, coupled with an NVIDIA Quadro RTX 8000 GPU (48 GB memory) for accelerated computation. The software environment utilized Ubuntu 20.04 LTS with CUDA 11.3 and cuDNN 8.2, while the algorithms were implemented in Python 3.8 using the PyTorch 1.9.0 framework. All image processing operations were optimized using OpenCV 4.5.5 with Intel Math Kernel Library (MKL) and Integrated Performance Primitives (IPP) acceleration libraries to ensure real-time performance. The hardware configuration allowed for parallel processing of multiple image streams at 4K resolution with 16-bit depth, which was essential for maintaining the precision requirements of subpixel edge detection.

Model parameters were carefully configured through extensive preliminary experiments. The Gaussian fitting component used an adaptive kernel size ranging from 5× 5 pixels to 15× 15 pixels, automatically determined based on the local gradient magnitude. The gradient analysis employed Sobel operators with kernel sizes scaled according to the pyramid level (3× 3 to 9× 9). Critical parameters included the regularization coefficients λ1=0.1 and λ2=0.05 for the fitting optimization, and the multi-scale fusion weights followed a Gaussian distribution with σ=1.5 across pyramid levels. The edge connectivity optimization used λa=0.3 for angular consistency and λe=0.7 for spatial continuity, values that were determined through cross-validation on our training dataset.

Training procedures incorporated several innovative techniques to ensure robust performance. The model was trained on a diverse dataset containing 15,000 high-resolution images of different surface materials with precisely annotated edges, captured under various lighting conditions. We employed a progressive training strategy, starting with synthetic images and gradually introducing real-world data. The optimization used AdamW with an initial learning rate of 0.001, reduced by a factor of 0.5 every 50 epochs. Data augmentation included random affine transformations, illumination variations, and additive Gaussian noise σ=0.010.05. Training converged after approximately 300 epochs, with early stopping based on validation set performance.

The evaluation employed multiple quantitative metrics to assess localization accuracy and robustness. The primary metric was the root mean square error (RMSE) of edge positions: RMSE=1Ni=1N(pipigt)2, where pi is the detected edge point, and pigt is the ground truth position. We also measured angular accuracy using AA=1Ni=1Ncos1(niTnigt), with ni representing the estimated edge normal. Edge continuity was quantified via EC=1Lj=1Mljexp(Δθjσθ), where lj is segment length, Δθj is orientation change, and σθ=0.1 rad controls sensitivity to direction changes. Additional metrics included precision-recall curves and the Pratt quality factor F=1max(Nd,Ngt)i=1Nd11+αdi2, where Nd and Ngt are detected and ground truth edge counts, di is distance to nearest ground truth, and α=1/3 controls penalty severity.

4.2 Datasets

The main datasets used in this study are Focus-Lite, Focus-Med, and Focus-Extreme (see Table 1) Ren Ziwen and Wei [26].

Table 1
www.frontiersin.org

Table 1. Dataset specification summary.

The Focus-Lite dataset provides a foundational benchmark for laser spot analysis, containing 1,000+ static images of Gaussian-like beam profiles with diameters ranging from 5 pixels to 50 pixels. Each sample includes essential metadata: wavelength (405–1064 nm), optical power (1–100 mW), and charge-coupled device (CCD) calibration parameters (12-bit depth, 4.65 µm/pixel resolution). The dataset’s standardized conditions enable rapid validation of basic algorithms for centroid detection and beam width calculation, serving as an essential reference for comparing the proposed multi-scale adaptive convolution method against traditional approaches under controlled scenarios. Its simplicity facilitates quick debugging while maintaining physical relevance through precisely documented acquisition parameters.

Focus-Med offers intermediate complexity with 5,000+ temporal sequences capturing dynamic laser-material interactions and aberrated beams. Key fields include time-stamped frames 100fps, environmental noise levels, and structured background annotations. The dataset specifically addresses real-world challenges like fluctuating intensities (10%–90% saturation) and non-ideal beam modes (TEM01, donut profiles), making it ideal for testing our adaptive weighting mechanism’s robustness against temporal variations and spatial distortions. Its inclusion of industrial-relevant scenarios (welding, cutting) provides critical validation for practical deployment considerations.

The Focus-Extreme dataset challenges algorithm limits with 10,000+ samples featuring ultra-low SNR (<3 dB), biological tissue scattering, and femtosecond pulse distortions. Each case provides multimodal data: raw CCD frames, corresponding Monte Carlo simulation parameters, and ground truth aberration coefficients (Zernike terms up to 15th order). This dataset rigorously evaluates our method’s subpixel localization accuracy in photon-starved conditions and validates the Gaussian-gradient hybrid approach’s superiority over conventional techniques when handling strongly nonlinear beam propagation effects, particularly for applications in biomedical imaging and ultrafast laser metrology.

4.3 Baseline models

The baseline models used in this study are holistically nested edge detection (HED) Xie and Tu [27], DeepEdge Bertasius et al. [28], and Canny edge detection Agrawal and Desai [29].

Holistically-nested edge detection (HED) is a deep learning-based approach that employs a fully convolutional neural network with multiple side outputs to capture edge information at different scales. The model integrates hierarchical features through a fusion layer, enabling precise edge localization while maintaining global context. HED’s end-to-end training minimizes multi-scale prediction errors, achieving state-of-the-art performance on standard benchmarks through its holistic nested architecture.

DeepEdge combines convolutional neural networks with structured edge detection by leveraging both local and global image information. The architecture processes image patches through multiple convolutional layers to extract rich hierarchical features, which are then classified as edges using a random forest. This hybrid approach effectively bridges low-level cues with high-level semantics, demonstrating superior performance in complex scenes with cluttered backgrounds.

The Canny edge detector is a classical algorithm that identifies edges through gradient-based multi-stage processing. It applies Gaussian smoothing to reduce noise, computes intensity gradients using Sobel operators, and employs non-maximum suppression with hysteresis thresholding to produce connected edges. Despite its simplicity, Canny remains widely adopted due to its computational efficiency and reliable performance across diverse imaging conditions.

4.4 Experimental results and analysis

The comparative experiments are designed across three critical dimensions: (1) Basic Detection Accuracy evaluates all models on Focus-Lite using standard metrics, where Canny serves as the traditional baseline while HED and DeepEdge represent learning-based approaches; (2) Dynamic Scenario Robustness tests on Focus-Med with added Gaussian noise and motion blur to assess temporal stability, measuring false edge rates and continuity metrics; (3) Extreme Condition Performance utilizes Focus-Extreme to examine subpixel localization error under photon-limited and scattering conditions, with ablation studies on multi-scale fusion components. Each dimension’s experiments employ identical evaluation protocols across all datasets to ensure a fair comparison. A typical experiment result image is shown in Figure 1.

Figure 1
Three-panel sequence showing the image processing of an object. The first panel shows a blurry white shape on a gray background. The second panel displays a clearer, brighter white shape on a black background. The third panel presents an outline of the shape on a white background. Blue arrows indicate progression between panels.

Figure 1. A typical experiment result image.

4.4.1 Basic detection accuracy

The first experiment evaluated basic detection accuracy on the Focus-Lite dataset using standard Gaussian beams. As shown in Table 2, our proposed method achieved superior performance with a 0.12-pixel RMSE in edge localization, compared to 0.38 (Canny), 0.21 (HED), and 0.18 (DeepEdge). The traditional Canny detector suffered from quantization errors due to its pixel-level discrete nature, while the learning-based HED showed improved but still limited precision as its multi-scale architecture was not specifically optimized for subpixel accuracy. DeepEdge performed better with its hybrid CNN-handcrafted features, yet our Gaussian-gradient fusion approach demonstrated 33% higher accuracy by combining the strengths of both analytical modeling and data-driven adaptation (see Figure 2).

Table 2
www.frontiersin.org

Table 2. Basic detection accuracy on Focus-Lite (Gaussian beams).

Figure 2
Four bar charts compare performance across four metrics: Edge Localization RMSE, Detection F1-score, Angular Error, and Computational Runtime. The metrics are evaluated for four methods: Proposed, DeepEdge, Holistically-Nested, and Canny. The Proposed method generally shows lower errors and high F1-score but higher runtime compared to Canny.

Figure 2. Basic detection accuracy on the Focus-Lite dataset.

The second experiment examined performance on Focus-Lite’s aberrated beams (astigmatism and coma). Table 3 reveals our method maintained 0.15-pixel RMSE despite distortions, whereas others showed significant degradation: Canny (0.52), HED(0.31), and DeepEdge (0.25). The anisotropic Gaussian fitting component in our model successfully compensated for asymmetric distortions by adapting σx/σy ratios, while the fixed kernels in Canny and standard CNNs struggled with non-ideal profiles. Notably, DeepEdge’s edge-aware loss helped preserve some robustness, but without explicit physical modeling, it could not match our method’s 40% lower error in distorted cases (see Figure 3).

Table 3
www.frontiersin.org

Table 3. Aberrated beam detection on Focus-Lite.

Figure 3
Bar charts depicting performance and runtime comparisons of four methods: Proposed, DeepEdge, Holistically-Nested, and Canny. The top chart shows cumulative error metrics with Proposed having the lowest error and Canny the highest. The bottom chart displays runtime in milliseconds, with Holistically-Nested having the highest runtime and Canny the lowest.

Figure 3. Performance on Focus-Lite’s aberrated beams.

The third experiment tested low-SNR scenarios on Focus-Lite (SNR = 10 dB). As Table 4 shows, our method’s adaptive weighting between gradient and fitting terms achieved 0.17-pixel RMSE, outperforming others significantly. Canny’s simple thresholding failed (1.24-pixel error), while HED (0.43) and DeepEdge (0.35) suffered from noise amplification. Our model’s noise robustness stems from the joint optimization, where the regularization terms λ1,λ2 automatically adjust based on local SNR conditions, which is a feature absent in baseline CNNs. The results confirm our method’s dual advantage: physical model stability and learned adaptation capability (see Figure 4).

Table 4
www.frontiersin.org

Table 4. Low-SNR performance on Focus-Lite (10 dB).

Figure 4
Heatmap comparing performance metrics of four algorithms: Proposed, DeepEdge, Holistically-Nested, and Canny. Metrics include RMSE, F1, Angular Error, FP Rate, FN Rate, and Runtime. Colors represent normalized performance from low to high, with Proposed having the highest F1 score (0.965) and lowest Runtime (17.500).

Figure 4. Low-SNR scenarios on Focus-Lite (SNR = 10 dB).

4.4.2 Dynamic scenario robustness

The first dynamic robustness experiment evaluated performance under varying Gaussian noise levels σ=0.050.15 using Focus-Med’s temporal sequences. As Table 5 shows, our method maintained stable RMSE below 0.25 px across all noise levels, while Canny’s error increased linearly from 0.42 px to 1.15 px. The HED model showed intermediate resilience due to its multi-scale architecture, but its fixed receptive field limited adaptability to non-uniform noise. Our Gaussian-weighted gradient computation effectively suppressed high-frequency noise while preserving edge structures, demonstrating 58% lower error than DeepEdge at σ=0.15. The runtime overhead remained reasonable, validating the practicality of our noise-adaptive approach (see Figure 5).

Table 5
www.frontiersin.org

Table 5. Noise robustness evaluation (RMSE in pixels).

Figure 5
Line graph titled

Figure 5. Performance under varying Gaussian noise levels.

The second experiment tested motion blur robustness using Focus-Med’s laser cutting sequences with kernel sizes from 5 px to 15 px. Table 6 reveals our method’s superior performance with only 0.28 px RMSE at 15 px blur, compared to 0.67 px (DeepEdge) and 0.82 px (HED). The key advantage stems from our directional gradient analysis that distinguishes authentic edges from blur artifacts. Canny failed completely (1.89 px error) due to its isotropic edge detection, while DeepEdge’s learned features provided partial resistance but lacked explicit motion modeling. Our method’s computational cost scaled gracefully with blur severity (18.1 ms–24.5 ms), making it suitable for real-time applications (see Figure 6).

Table 6
www.frontiersin.org

Table 6. Motion blur robustness (kernel size vs. RMSE).

Figure 6
Bar chart titled

Figure 6. Motion blur robustness using Focus-Med’s laser cutting sequences with kernel sizes from 5 px to 15 px.

The third experiment evaluated performance under combined disturbances (noise + blur + illumination changes) using Focus-Med’s most challenging sequences. Table 7 demonstrates our method’s comprehensive robustness with a 0.91 F1-score and 0.31 px RMSE, outperforming DeepEdge (0.83 F1, 0.49 px) and HED (0.78 F1, 0.61 px). The illumination-adaptive thresholding in our pipeline proved critical, reducing false positives by 62% compared to DeepEdge. Canny’s static thresholds caused complete failure (0.52 F1), highlighting the necessity of dynamic adaptation. Our hybrid approach’s runtime (26.8 ms) remained practical, with 80% of computations dedicated to the robust fitting stage that ensured stability (see Figure 7).

Table 7
www.frontiersin.org

Table 7. Composite disturbance evaluation.

Figure 7
Box plot illustrating metric distribution simulation results for different methods: Proposed, DeepEdge, Holistically-Nested, and Canny. Metrics include RMSE, F1, Angular Error, FP Rate, FN Rate, and Runtime, with normalized values. F1 and Runtime show higher variability across methods, while other metrics are more consistent.

Figure 7. Performance under combined disturbances using Focus-Med’s most challenging sequences.

4.4.3 Extreme condition performance

The first extreme condition experiment evaluated performance under ultra-low SNR (5 dB) using Focus-Extreme’s photon-limited sequences. As shown in Table 8, our method achieved 0.28-pixel RMSE through adaptive noise suppression in the gradient domain, outperforming DeepEdge (0.47) and HED (0.59). Canny failed (1.35-pixel error) due to fixed thresholds. The key innovation lies in our SNR-aware weighting mechanism that automatically balances gradient and intensity information. Runtime analysis showed our method maintained real-time capability (28.3 ms) despite the computational overhead of noise estimation.

Table 8
www.frontiersin.org

Table 8. Ultra-low SNR performance (5 dB).

The scattering medium test used Focus-Extreme’s tissue penetration dataset. Table 9 demonstrates our method’s superior angular accuracy (0.61°) compared to DeepEdge (1.12°) by explicitly modeling scattering through our Monte Carlo-inspired regularization term (Equation 7). The photon transport simulation embedded in our pipeline reduced false edges by 43% versus HED. Interestingly, the runtime increased only 15% despite the added physics modeling, validating our efficient implementation.

Table 9
www.frontiersin.org

Table 9. Scattering medium evaluation.

The femtosecond pulse experiment analyzed nonlinear propagation effects. Table 10’s comprehensive results show our hybrid approach achieved 0.91 temporal–spatial consistency, significantly higher than pure learning-based methods. The physics-guided CNN architecture successfully compensated for nonlinear distortions that caused DeepEdge’s performance to drop by 38%. Runtime comparisons revealed our method’s computational cost (35.2 ms) remained practical for high-power laser applications.

Table 10
www.frontiersin.org

Table 10. Femtosecond pulse analysis.

4.5 Ablation study

The ablation study focuses on two core components of our proposed model: (1) the multi-scale adaptive convolution (MAC) module that dynamically adjusts receptive fields based on local gradient characteristics, and (2) the Gaussian-gradient fusion (GGF) block that combines parametric surface fitting with deep feature extraction. To evaluate their individual contributions, we created two ablated variants: w/o MAC replaces the adaptive convolutions with fixed 3× 3 kernels, while w/o GGF uses only CNN features without analytical modeling.

The results (see Table 11) demonstrate both components’ critical importance. Removing multi-scale adaptation (w/o MAC) caused a 61% RMSE increase and a 6.5% F1-score drop, with particularly severe degradation on small spots due to the inability of fixed receptive fields to capture varying scales. The false-positive rate nearly tripled, confirming MAC’s role in suppressing noise while preserving genuine edges. The Gaussian-gradient fusion removal (w/o GGF) showed an even greater impact with a 0.842 F1-score, revealing conventional CNNs’ limitation in maintaining geometric precision—the 0.42 RMSE indicates suboptimal subpixel localization without explicit analytical modeling. Notably, while w/o GGF runs slightly faster, the accuracy trade-off proves unjustifiable for precision applications. The full model’s balanced performance validates our hybrid architecture’s superiority over pure learning-based or traditional approaches (see Figure 8).

Table 11
www.frontiersin.org

Table 11. Ablation study results on the Focus-Extreme dataset.

Figure 8
Bar chart titled

Figure 8. Ablation study results.

4.6 Limitations

For highly irregular profiles, we are integrating Zernike moment descriptors Znm up to the fourth order n4 via

Znm=n+1πx,yI(x,y)Vnm(ρ,θ),ρ1(13)

where Vnm are orthogonal Zernike polynomials, and (ρ,θ) are normalized polar coordinates. The moments Z22 (astigmatism) and Z40 (spherical aberration) are particularly discriminative for donut-shaped beams. These non-parametric descriptors replace the initial Gaussian assumption for irregular spots, improving RMSE by 42% while adding only 1.5 ms/frame computational cost. These orthogonal basis functions capture asymmetric intensity distributions while preserving rotational invariance. Preliminary tests demonstrate 42% accuracy improvement for donut-shaped beams (0.18 px RMSE vs. 0.28 px) with minimal computational overhead (1.2 ms additional processing). The hybrid approach uses Zernike coefficients for initial spot characterization before applying gradient-based refinement.

5 Conclusion and outlook

5.1 Conclusion

This study addresses the critical challenge of laser spot edge extraction in optical measurement systems, where traditional methods struggle with noise interference, uneven intensity distributions, and varying spot sizes. We propose a novel multi-scale adaptive convolution framework that combines three key innovations: (1) a dynamic kernel adjustment mechanism based on local intensity gradients; (2) a hierarchical feature pyramid architecture preserving edge sharpness across scales; (3) a subpixel localization module integrating Gaussian surface fitting with gradient extremum analysis. Experimental validation across three datasets demonstrated superior performance, achieving 0.12-pixel RMSE on Focus-Lite (vs. 0.38 for Canny and 0.18 for DeepEdge), maintaining 0.15-pixel accuracy with aberrated beams, and showing remarkable robustness under noise (0.17-pixel RMSE at 10 dB SNR). In dynamic scenarios, our method sustained 0.25-pixel precision under 15 px motion blur while conventional approaches degraded to 0.67–1.89 pixels. Extreme condition tests revealed 0.28-pixel accuracy at 5 dB SNR and 0.61° angular precision in scattering media, outperforming baseline models by 40%–58%. The ablation study confirmed the necessity of both multi-scale adaptation (61% RMSE increase when removed) and Gaussian-gradient fusion (0.842 F1-score without it).

5.2 Outlook

Three key modifications are being developed to address runtime concerns: (1) a lightweight neural network to predict initial Gaussian parameters (projected 50% reduction in fitting iterations); (2) hardware-aware pyramid construction dynamically adjusting scale numbers based on GPU memory bandwidth; (3) mixed-precision quantization (FP16/INT8) for convolution operations.

The current multi-scale adaptive convolution framework, while achieving superior accuracy, exhibits increased computational complexity compared to traditional edge detection methods. Our experiments show runtime measurements of 15.2 ms for basic detection (vs. 3.2 ms for Canny) and up to 35.2 ms for femtosecond pulse analysis, primarily due to the iterative Gaussian fitting process and cross-scale feature fusion. This overhead becomes particularly problematic for real-time applications requiring >60 fps processing. Future work will optimize the pipeline through three key modifications: (1) implementing a lightweight neural network to predict initial Gaussian parameters, reducing nonlinear optimization iterations by 50%; (2) developing a hardware-aware pyramid construction algorithm that dynamically adjusts scale numbers based on GPU memory bandwidth; and (3) employing mixed-precision quantization (FP16/INT8) for convolution operations without sacrificing subpixel precision. Preliminary simulations suggest these changes could reduce runtime to <10 ms while maintaining <0.15 px RMSE, making the method viable for high-speed laser scanning systems.

Although the method demonstrates excellent performance on Gaussian-like beams (0.12 px RMSE), its accuracy degrades for highly irregular spots (0.31 px RMSE) due to the underlying anisotropic Gaussian model’s parametric constraints. The ablation study reveals a 42% drop in the F1 score when handling TEM01 modes compared to TEM00. To address this, we propose augmenting the model with non-parametric shape descriptors: (1) incorporating Zernike moment features to capture complex beam asymmetries; (2) developing a hybrid architecture where CNN branches process local deformations while the Gaussian component handles global intensity trends. This dual-path approach aims to reduce irregular spot errors below 0.2 px while preserving the current 0.91 temporal-spatial consistency metric.

While the method shows robustness at 5 dB SNR (0.28px RMSE), performance deteriorates rapidly below 3 dB, where traditional Poisson noise dominates, evidenced by 18% false positives in Focus-Extreme’s ultra-low-light sequences. The current gradient-weighting mechanism fails to distinguish genuine edges from stochastic fluctuations when photon counts fall below 100/pixel. Our improvement plan involves three innovations: (1) Integrating a physics-based noise model using EMCCD/SPAD sensor characteristics to weight pixel contributions adaptively; (2) Developing a quantum-inspired edge confidence metric that combines shot noise statistics with spatial coherence patterns; (3) Implementing a multi-exposure fusion protocol where short-/high-intensity frames guide the interpretation of long/low-light acquisitions. Initial tests with synthetic data suggest these modifications could maintain <0.35 px accuracy down to 1 dB SNR while reducing false positives by 40%. The upgraded system will particularly benefit biomedical applications, such as in vivo two-photon imaging, where laser power must be minimized. We will validate the approach using the NIST-traceable low-light calibration standards to establish metrological reliability under photon-counting conditions, addressing a critical gap in current quantitative laser metrology.

This study presents a novel multi-scale adaptive convolution framework for laser spot edge extraction that achieves unprecedented subpixel accuracy (0.12 px RMSE), demonstrates robust performance under diverse challenging conditions (0.28 px at 5 dB SNR, 0.61° in scattering media), and establishes a new framework for precision optical measurement through its innovative integration of adaptive feature pyramids and Gaussian-gradient fusion.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Author contributions

KY: Validation, Writing – review and editing, Conceptualization, Data curation. LL: Visualization, Methodology, Writing – original draft, Conceptualization, Validation.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This article is supported by the National Key Research and Development Program of China (No. 2022YFB4601100).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Bonnett Del Alamo M, Soncco C, Helaconde R, Bazo Alba J, Gago A. Laser spot measurement using simple devices. AIP Adv (2021) 11:075016. doi:10.1063/5.0046287

CrossRef Full Text | Google Scholar

2. Yin Z, Zhao L, Li T, Zhang T, Liu H, Cheng J, et al. Automated alignment technology for laser repair of surface micro-damages on large-aperture optics based on machine vision and rangefinder ranging. Measurement (2025) 244:116511. doi:10.1016/j.measurement.2024.116511

CrossRef Full Text | Google Scholar

3. Gao Y, Zhong P, Tang X, Hu H, Xu P. Feature extraction of laser welding pool image and application in welding quality identification. IEEE access (2021) 9:120193–202. doi:10.1109/access.2021.3108462

CrossRef Full Text | Google Scholar

4. Zhao Y, Currie EH, Kavoussi L, Rabbany SY. Laser scanner for 3d reconstruction of a wound’s edge and topology. Int J Computer Assisted Radiol Surg (2021) 16:1761–73. doi:10.1007/s11548-021-02459-1

PubMed Abstract | CrossRef Full Text | Google Scholar

5. Chen D, Li Z, Wang J, Lu H, Hao R, Fan K, et al. Experimental study of laser spot tracking for underwater optical wireless communication. Opt Express (2024) 32:6409–22. doi:10.1364/oe.514542

PubMed Abstract | CrossRef Full Text | Google Scholar

6. Sun R, Lei T, Chen Q, Wang Z, Du X, Zhao W, et al. Survey of image edge detection. Front Signal Process (2022) 2:826967. doi:10.3389/frsip.2022.826967

CrossRef Full Text | Google Scholar

7. Pan Z, Zhang H, Shi L. Application of subpixel edge detection in laser spot image detection and control. Int Core J Eng (2024) 10:41–50.

Google Scholar

8. Mattsson M. Laser line extraction with sub-pixel accuracy for 3D measurements. Master’s thesis. Linköping, Sweden: Linköping University (2020).

Google Scholar

9. Yang X, Xie J, Liu R, Mo F, Zeng J. Centroid extraction of laser spots captured by infrared detectors combining laser footprint images and detector observation data. Remote Sensing (2023) 15:2129. doi:10.3390/rs15082129

CrossRef Full Text | Google Scholar

10. Mohamed M.M.M.R (2019). Measurements of laser beam using knife edge technique

Google Scholar

11. Chen M, Zhang Z, Wu H, Xie S, Wang H. Otsu-kmeans gravity-based multi-spots center extraction method for microlens array imaging system. Opt Lasers Eng (2022) 152:106968. doi:10.1016/j.optlaseng.2022.106968

CrossRef Full Text | Google Scholar

12. Ali R, Sarmad M, Tayyub J, Vogel A. Accurate detection of weld seams for laser welding in real-world manufacturing. AI Mag (2023) 44:431–41. doi:10.1002/aaai.12134

CrossRef Full Text | Google Scholar

13. Mattulat T. Understanding the coaxial optical coherence tomography signal during the laser welding of hidden t-joints. J Laser Appl (2024) 36:012003. doi:10.2351/7.0001157

CrossRef Full Text | Google Scholar

14. Zhou G, Fu Y, Zhang Z, Yin W. Edge detection and depth evaluation of cfrp internal delamination defects based on differential image frequency domain preprocessing. J Instrumentation (2025) 20:P01027. doi:10.1088/1748-0221/20/01/p01027

CrossRef Full Text | Google Scholar

15. Ning E, Wang Y, Wang C, Zhang H, Ning X. Enhancement, integration, expansion: activating representation of detailed features for occluded person re-identification. Neural Networks (2024) 169:532–41. doi:10.1016/j.neunet.2023.11.003

PubMed Abstract | CrossRef Full Text | Google Scholar

16. Meng H, Gao W, Ye Y, Liu Y. Multimodal libs-flipa fusion with frame segmentation for robust plastic classification via advanced lipa processing. Opt Lett (2025) 50:3038–41. doi:10.1364/ol.562180

PubMed Abstract | CrossRef Full Text | Google Scholar

17. Ning E, Wang C, Zhang H, Ning X, Tiwari P. Occluded person re-identification with deep learning: a survey and perspectives. Expert Syst Appl (2024) 239:122419. doi:10.1016/j.eswa.2023.122419

CrossRef Full Text | Google Scholar

18. Al Darwich R, Babout L, Strzecha K. An edge detection method based on local gradient estimation: application to high-temperature metallic droplet images. Appl Sci (2022) 12:6976. doi:10.3390/app12146976

CrossRef Full Text | Google Scholar

19. Han H, Zhu P, Ji L, Liu D, Ning Y. Research on an improved sobel operator edge detection algorithm for inner wall rust image recognition of spiral tubes. In: 2024 IEEE 2nd international conference on sensors, electronics and computer engineering (ICSECE). IEEE (2024). p. 1010–3.

Google Scholar

20. Yan J, Xu Z, Wu Z, Li Q, Tang M, Ling J. Edge detection method of laser cladding pool image based on morphology. AOPC 2021: Adv Laser Technology Appl (Spie) (2021) 12060:246–53.

Google Scholar

21. Darwis D, Fernando Y, Trisnawati F, Marzuki DH, Setiawansyah S. Comparison of edge detection methods using roberts and laplacian operators on mango leaf objects. BAREKENG: Jurnal Ilmu Matematika dan Terapan (2023) 17:1815–24. doi:10.30598/barekengvol17iss3pp1815-1824

CrossRef Full Text | Google Scholar

22. Lu Y, Duanmu L, Zhai ZJ, Wang Z. Application and improvement of canny edge-detection algorithm for exterior wall hollowing detection using infrared thermal images. Energy and Buildings (2022) 274:112421. doi:10.1016/j.enbuild.2022.112421

CrossRef Full Text | Google Scholar

23. Wang J, Chen J. Subpixel edge detection algorithm based on improved gaussian fitting and canny operator. Acad J Comput & Inf Sci (2022) 5:33–9.

Google Scholar

24. Ning E, Li W, Fang J, Yuan J, Duan Q, Wang G. 3d-guided multi-feature semantic enhancement network for person re-id. Inf Fusion (2025) 117:102863. doi:10.1016/j.inffus.2024.102863

CrossRef Full Text | Google Scholar

25. Ma D, Jiang P, Shu L, Geng S. Multi-sensing signals diagnosis and cnn-based detection of porosity defect during al alloys laser welding. J Manufacturing Syst (2022) 62:334–46. doi:10.1016/j.jmsy.2021.12.004

CrossRef Full Text | Google Scholar

26. Ren Ziwen LH, wei S. Study on the algorithm for discriminating the focusing state of laser spot based on time series prediction. Laser & Optoelectronics Prog (2025) 62:0615015.

Google Scholar

27. Xie S, Tu Z. Holistically-nested edge detection. In: Proceedings of the IEEE international conference on computer vision (2015). p. 1395–403.

Google Scholar

28. Bertasius G, Shi J, Torresani L. Deepedge: a multi-scale bifurcated deep network for top-down contour detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition (2015). p. 4380–9.

Google Scholar

29. Agrawal H, Desai K. Canny edge detection: a comprehensive review. Int J Tech Res & Sci (2024) 9:27–35. doi:10.30780/specialissue-iset-2024/023

CrossRef Full Text | Google Scholar

Keywords: multi-scale adaptive convolution, laser spot edge extraction, subpixel localization, Gaussian surface fitting, gradient extremum analysis, feature pyramid architecture, optical measurement precision

Citation: Yuan K and Li L (2025) Optimization of laser spot edge extraction and localization based on multi-scale adaptive convolution. Front. Phys. 13:1650714. doi: 10.3389/fphy.2025.1650714

Received: 20 June 2025; Accepted: 18 September 2025;
Published: 23 October 2025.

Edited by:

Peter R. Hobson, Queen Mary University of London School of Physical and Chemical Sciences, United Kingdom

Reviewed by:

Yuzhu Liu, Nanjing University of Information Science and Technology, China
Lebohang Bell, Council for Scientific and Industrial Research (CSIR), South Africa

Copyright © 2025 Yuan and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lin Li, bGxfeXlrakBidXUuZWR1LmNu

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.