- 1College of Robotics, Beijing Union University, Beijing, China
- 2College of Applied Science and Technology, Beijing Union University, Beijing, China
The precise extraction of laser spot edges plays a fundamental role in optical measurement systems, yet traditional methods struggle with noise interference and varying spot characteristics. Existing approaches face significant challenges in achieving robust subpixel accuracy across diverse experimental conditions, particularly for irregular spots and low signal-to-noise scenarios. This article presents a novel multi-scale adaptive convolution framework that integrates three key innovations: (1) dynamic kernel adjustment based on local intensity gradients, (2) hierarchical feature pyramid architecture combining spatial details with semantic features, and (3) subpixel localization through Gaussian surface fitting and gradient extremum analysis. Extensive experiments demonstrate the method’s superior performance, achieving 0.12-pixel root mean square error (RMSE) on standard Gaussian beams (vs. 0.38 for Canny), maintaining 0.15-pixel accuracy with aberrated spots, and showing remarkable robustness at 5 dB SNR (0.28-pixel RMSE). The results establish that our hybrid approach successfully bridges physical modeling with data-driven adaptation, delivering unprecedented precision (0.91 temporal–spatial consistency) for laser-based applications ranging from industrial metrology to biomedical imaging. The ablation studies further confirm the critical importance of both multi-scale adaptation (61% accuracy drop when removed) and analytical modeling (0.842 F1-score without Gaussian fitting), providing valuable insights for future edge detection research.
1 Introduction
The extraction of laser spot edges holds significant research importance in various scientific and engineering applications as it is a fundamental step for precise optical measurement Bonnett Del Alamo et al. [1], alignment Yin et al. [2], and quality control. It is widely utilized in fields such as industrial processing Gao et al. [3], medical diagnostics Zhao et al. [4], and optical communication Chen et al. [5], where accurate edge detection directly influences system performance and measurement reliability. Traditional edge extraction methods Sun et al. [6] often face challenges in handling noise interference, uneven intensity distributions, and varying spot sizes, leading to reduced positioning accuracy and robustness. The development of advanced edge extraction techniques can substantially improve the precision and adaptability of laser spot analysis by dynamically adjusting to intensity variations and morphological characteristics across different scales. This capability is especially valuable in practical scenarios where laser spots exhibit complex patterns due to beam divergence, scattering, or environmental disturbances. Enhanced edge localization not only refines the spatial resolution of optical systems but also facilitates subsequent tasks such as centroid calculation, beam profiling, and aberration correction. Furthermore, optimized edge extraction contributes to the automation of laser-based systems by providing more reliable input for real-time feedback control and decision-making processes. From a broader perspective, advancements in this area can benefit interdisciplinary applications ranging from high-precision manufacturing to biomedical imaging, where subtle edge variations may carry critical information about material properties or physiological conditions.
A significant amount of research work has been devoted to solving laser spot edge extraction. A significant challenge identified across multiple studies is the low precision of pixel-level edge detection, which results in substantial errors in light spot measurement Pan et al. [7]. To address this, researchers have explored subpixel edge detection techniques that enhance the accuracy of edge localization, thereby reducing measurement errors Pan et al. [7]; Mattsson [8]. Subpixel methods, such as those employing the Gaussian fitting approach, have been utilized to achieve higher precision in laser spot edge detection. For instance, the combination of Gaussian fitting with Canny edge detection and gray-scale barycenter methods has been demonstrated to improve centroid extraction accuracy for infrared laser spots Yang et al. [9]. Similarly, subpixel accuracy is crucial in applications requiring precise measurement of laser beam parameters, such as beam size and divergence, which are typically assessed using knife-edge techniques MOHAMED [10]. In addition to subpixel techniques, advanced image processing algorithms have been developed to handle irregular spot shapes and complex backgrounds. For example, the Otsu-K-means gravity-based multi-spot center extraction method was proposed to improve the extraction of laser spots with irregular shapes, although its accuracy still faces limitations Chen et al. [11]. Background modeling approaches, such as the average background model, have also been employed to enhance spot detection in challenging conditions, yet the accuracy remains insufficient for some applications Chen et al. [11]. Edge detection methods in laser welding and other industrial applications have traditionally relied on simple computer vision techniques to identify weld seam edges and laser-induced features Ali et al. [12]; Mattulat [13]. These methods often focus on the maximum distance of the laser spot edge relative to a reference, with reported detection distances approximately 0.1 mm, indicating the importance of precise edge localization for quality control Mattulat [13]. The detection of laser spots and their edges plays a vital role in 3D measurements and internal defect evaluation. Techniques such as laser line extraction with subpixel accuracy have been developed to improve the detection of jagged edges, which is critical for accurate 3D reconstruction and internal delamination assessment Zhou et al. [14]. The integration of laser sensors, including laser distance and positioning sensors, further underscores the importance Ning et al. [15].
The current body of research highlights a trend toward employing subpixel and advanced image processing techniques to enhance the precision of laser spot edge extraction. Despite these advancements, challenges remain in achieving high accuracy under complex conditions, particularly for irregular spot shapes and noisy backgrounds, indicating ongoing opportunities for methodological improvements in this field. Recent advances in laser measurement systems have demonstrated remarkable progress in real-time adaptive control Meng et al. [16]. Ning et al. [17] introduced the frame-segmentation LIPA (FLIPA) algorithm and laser-induced breakdown spectroscopy (LIBS)-FLIPA multimodal fusion technique, which reduce LIPA variables by 99% while significantly enhancing classification accuracy, robustness, and generalization in plastic waste sorting, thereby overcoming critical limitations of conventional LIBS analysis.
The application of multi-scale adaptive convolution in laser spot edge extraction offers significant advantages by dynamically adjusting to varying spot sizes, intensity distributions, and noise levels. Unlike traditional fixed-kernel methods, this approach improves edge detection accuracy by analyzing features across different scales, ensuring robustness against blurring, uneven illumination, and low signal-to-noise ratios. Such adaptability is particularly valuable in real-world scenarios where laser spots exhibit complex shapes due to beam divergence, scattering, or optical distortions. By improving edge localization precision, this method enables more reliable centroid calculation, beam profiling, and optical system alignment, benefiting applications in precision manufacturing, biomedical imaging, and laser-based metrology. Its computational efficiency makes it suitable for real-time processing, supporting automation in laser-guided systems.
To address the scale sensitivity and noise interference issues encountered by traditional edge detection algorithms in laser spot processing, this study proposes a novel edge extraction method based on multi-scale adaptive convolution. The approach constructs a multi-scale feature pyramid to extract edge characteristics of laser spots across different scale spaces, while incorporating an adaptive weighting mechanism to dynamically adjust convolution kernel parameters, thereby achieving robust detection for spots with varying sizes and intensity distributions. A subpixel precision positioning algorithm is developed for edge localization optimization by integrating Gaussian surface fitting with gradient extremum analysis, which significantly enhances the accuracy of edge localization.
The three main innovations of this study are as follows.
1. The first innovation lies in developing a multi-scale adaptive convolution mechanism that dynamically adjusts kernel parameters based on local intensity gradients, enabling robust edge extraction across varying spot sizes and illumination conditions. This approach overcomes the fixed-scale limitation of conventional edge detectors.
2. The second innovation introduces a hierarchical feature pyramid architecture that combines shallow spatial details with deep semantic features, allowing simultaneous preservation of edge sharpness while suppressing noise interference at different scales.
3. The third innovation proposes a novel subpixel localization module integrating Gaussian surface fitting with gradient extremum analysis, which achieves higher positioning accuracy than traditional interpolation-based methods by modeling the continuous intensity distribution.
2 Related work
2.1 Traditional approaches
Gradient-based methods constitute a fundamental approach for laser spot edge detection Al Darwich et al. [18], operating on the principle of identifying intensity discontinuities in digital images. These techniques typically employ convolution kernels, such as Han et al. [19], Yan et al. [20] or Roberts operators Darwis et al. [21], to compute spatial derivatives that highlight regions of rapid intensity change corresponding to potential edges. The Canny edge detector Lu et al. [22] further refines this approach through multi-stage processing involving noise reduction, gradient calculation, non-maximum suppression, and hysteresis thresholding. These methods offer computational efficiency and straightforward implementation, making them widely accessible for various applications. Their effectiveness is particularly notable in scenarios with high-contrast laser spots and clean background conditions, where they can provide satisfactory edge localization accuracy with relatively low computational overhead. However, gradient-based techniques exhibit several inherent limitations when dealing with complex laser spot images. They are highly sensitive to noise and illumination variations, often producing fragmented or false edges in low-quality images. The fixed-size convolution kernels struggle to adapt to laser spots with varying sizes or blur levels, leading to inconsistent performance across different experimental conditions. Moreover, these methods typically output pixel-level edges without subpixel precision, limiting their usefulness in high-accuracy applications. The threshold selection process remains another critical challenge, as inappropriate values may either miss genuine edges or introduce excessive noise.
2.2 Model-fitting methods
Model-fitting methods provide a mathematically rigorous approach to laser spot edge detection by approximating the intensity distribution with parametric functions. These techniques typically employ Wang and Chen et al. [23] or Moffat functions to model the spot’s radial intensity profile, where edges are determined by analyzing the fitted model’s characteristics, such as inflection points or specific intensity thresholds. The fitting process often involves nonlinear least-squares optimization to minimize the discrepancy between the model and observed pixel values. This approach offers several advantages, including inherent noise suppression through the fitting procedure and the ability to achieve subpixel edge localization precision Ning et al. [24]. The parametric nature of these methods allows for the simultaneous extraction of multiple spot characteristics beyond only edges, such as centroid position, beam width, and intensity distribution parameters, making them particularly valuable for comprehensive beam analysis applications. Despite their theoretical advantages, model-fitting methods present several practical challenges in laser spot edge detection. The computational complexity of nonlinear fitting procedures can be significantly higher than simpler gradient-based methods, potentially limiting real-time applications. These techniques are also sensitive to initial parameter guesses and may converge to local minima if the spot exhibits irregular shapes or contains significant noise. The assumption of a specific intensity profile (typically Gaussian) may not hold true for all experimental conditions, particularly when dealing with distorted or aberrated laser beams. Additionally, the performance tends to degrade when processing spots with low signal-to-noise ratios or when multiple spots overlap in the image.
2.3 Deep learning approaches
Deep learning approaches have emerged as a powerful alternative for laser spot edge detection, leveraging convolutional neural networks (CNNs) Ma et al. [25] to automatically learn discriminative features from training data. These methods typically employ encoder-decoder architectures or specialized edge detection networks that process raw pixel intensities to directly predict edge maps or spot boundaries. Unlike traditional algorithms, deep learning models can capture complex spatial relationships and contextual information, enabling robust performance across varying spot sizes, shapes, and noise conditions. The data-driven nature of these approaches allows them to adapt to diverse experimental setups without requiring explicit mathematical modeling of the spot characteristics. Advanced architectures may incorporate multi-scale processing and attention mechanisms to enhance edge localization precision while maintaining computational efficiency through optimized network designs.
3 Methodology
This article proposes an innovative laser spot edge extraction framework combining multi-scale adaptive convolution with subpixel localization techniques. The methodology first constructs a multi-scale feature pyramid to analyze edge characteristics across different resolutions, employing an adaptive weighting mechanism to dynamically optimize convolution kernel parameters for varying spot sizes and intensity distributions. Subsequently, a hybrid localization algorithm integrates Gaussian surface fitting with gradient extremum analysis to achieve subpixel edge positioning accuracy. This dual approach effectively addresses traditional challenges of scale sensitivity and noise interference while maintaining computational efficiency, offering significant improvements over conventional edge detection methods in terms of both robustness and precision for laser spot analysis applications.
3.1 Multi-scale feature pyramid construction and adaptive convolution mechanism
The multi-scale feature pyramid construction forms the foundational component of our proposed edge detection framework, designed to comprehensively capture laser spot characteristics across different spatial resolutions. The pyramid is built through a hierarchical downsampling process where the original input image
where
where
At each pyramid level, we employ a set of parallel adaptive convolution kernels
where
The final feature representation combines information across all scales through our proposed cross-scale fusion module:
where
3.2 Edge localization optimization
The edge localization optimization module represents a significant advancement in subpixel precision through a novel integration of Gaussian surface fitting with gradient extremum analysis. The Gaussian fitting component employs an anisotropic 2D Gaussian model to approximate the laser spot intensity distribution. The above process can be expressed as:
where
with
where
The gradient extremum analysis component provides complementary edge localization through a sophisticated continuous-domain approach. We first compute the multi-scale gradient field
where
Our innovation here involves a multi-resolution verification scheme where gradient extrema are detected at multiple scales and consolidated through a voting mechanism, effectively suppressing spurious edges while preserving genuine ones.
The final edge localization combines results from both approaches through our confidence-weighted fusion:
where
with
The algorithm’s robustness is further enhanced by our novel post-processing stage that incorporates topological constraints. We formulate edge connectivity as a graph optimization problem where nodes represent candidate edge points and edges encode geometric relationships:
where
4 Experiment
4.1 Experimental setup
The experimental setup was implemented on a high-performance computing platform equipped with an Intel Xeon Gold 6248R processor (3.0 GHz, 24 cores) and 256 GB RAM, coupled with an NVIDIA Quadro RTX 8000 GPU (48 GB memory) for accelerated computation. The software environment utilized Ubuntu 20.04 LTS with CUDA 11.3 and cuDNN 8.2, while the algorithms were implemented in Python 3.8 using the PyTorch 1.9.0 framework. All image processing operations were optimized using OpenCV 4.5.5 with Intel Math Kernel Library (MKL) and Integrated Performance Primitives (IPP) acceleration libraries to ensure real-time performance. The hardware configuration allowed for parallel processing of multiple image streams at 4K resolution with 16-bit depth, which was essential for maintaining the precision requirements of subpixel edge detection.
Model parameters were carefully configured through extensive preliminary experiments. The Gaussian fitting component used an adaptive kernel size ranging from 5
Training procedures incorporated several innovative techniques to ensure robust performance. The model was trained on a diverse dataset containing 15,000 high-resolution images of different surface materials with precisely annotated edges, captured under various lighting conditions. We employed a progressive training strategy, starting with synthetic images and gradually introducing real-world data. The optimization used AdamW with an initial learning rate of 0.001, reduced by a factor of 0.5 every 50 epochs. Data augmentation included random affine transformations, illumination variations, and additive Gaussian noise
The evaluation employed multiple quantitative metrics to assess localization accuracy and robustness. The primary metric was the root mean square error (RMSE) of edge positions:
4.2 Datasets
The main datasets used in this study are Focus-Lite, Focus-Med, and Focus-Extreme (see Table 1) Ren Ziwen and Wei [26].
The Focus-Lite dataset provides a foundational benchmark for laser spot analysis, containing 1,000+ static images of Gaussian-like beam profiles with diameters ranging from 5 pixels to 50 pixels. Each sample includes essential metadata: wavelength (405–1064 nm), optical power (1–100 mW), and charge-coupled device (CCD) calibration parameters (12-bit depth, 4.65 µm/pixel resolution). The dataset’s standardized conditions enable rapid validation of basic algorithms for centroid detection and beam width calculation, serving as an essential reference for comparing the proposed multi-scale adaptive convolution method against traditional approaches under controlled scenarios. Its simplicity facilitates quick debugging while maintaining physical relevance through precisely documented acquisition parameters.
Focus-Med offers intermediate complexity with 5,000+ temporal sequences capturing dynamic laser-material interactions and aberrated beams. Key fields include time-stamped frames
The Focus-Extreme dataset challenges algorithm limits with 10,000+ samples featuring ultra-low SNR (<3 dB), biological tissue scattering, and femtosecond pulse distortions. Each case provides multimodal data: raw CCD frames, corresponding Monte Carlo simulation parameters, and ground truth aberration coefficients (Zernike terms up to 15th order). This dataset rigorously evaluates our method’s subpixel localization accuracy in photon-starved conditions and validates the Gaussian-gradient hybrid approach’s superiority over conventional techniques when handling strongly nonlinear beam propagation effects, particularly for applications in biomedical imaging and ultrafast laser metrology.
4.3 Baseline models
The baseline models used in this study are holistically nested edge detection (HED) Xie and Tu [27], DeepEdge Bertasius et al. [28], and Canny edge detection Agrawal and Desai [29].
Holistically-nested edge detection (HED) is a deep learning-based approach that employs a fully convolutional neural network with multiple side outputs to capture edge information at different scales. The model integrates hierarchical features through a fusion layer, enabling precise edge localization while maintaining global context. HED’s end-to-end training minimizes multi-scale prediction errors, achieving state-of-the-art performance on standard benchmarks through its holistic nested architecture.
DeepEdge combines convolutional neural networks with structured edge detection by leveraging both local and global image information. The architecture processes image patches through multiple convolutional layers to extract rich hierarchical features, which are then classified as edges using a random forest. This hybrid approach effectively bridges low-level cues with high-level semantics, demonstrating superior performance in complex scenes with cluttered backgrounds.
The Canny edge detector is a classical algorithm that identifies edges through gradient-based multi-stage processing. It applies Gaussian smoothing to reduce noise, computes intensity gradients using Sobel operators, and employs non-maximum suppression with hysteresis thresholding to produce connected edges. Despite its simplicity, Canny remains widely adopted due to its computational efficiency and reliable performance across diverse imaging conditions.
4.4 Experimental results and analysis
The comparative experiments are designed across three critical dimensions: (1) Basic Detection Accuracy evaluates all models on Focus-Lite using standard metrics, where Canny serves as the traditional baseline while HED and DeepEdge represent learning-based approaches; (2) Dynamic Scenario Robustness tests on Focus-Med with added Gaussian noise and motion blur to assess temporal stability, measuring false edge rates and continuity metrics; (3) Extreme Condition Performance utilizes Focus-Extreme to examine subpixel localization error under photon-limited and scattering conditions, with ablation studies on multi-scale fusion components. Each dimension’s experiments employ identical evaluation protocols across all datasets to ensure a fair comparison. A typical experiment result image is shown in Figure 1.
4.4.1 Basic detection accuracy
The first experiment evaluated basic detection accuracy on the Focus-Lite dataset using standard Gaussian beams. As shown in Table 2, our proposed method achieved superior performance with a 0.12-pixel RMSE in edge localization, compared to 0.38 (Canny), 0.21 (HED), and 0.18 (DeepEdge). The traditional Canny detector suffered from quantization errors due to its pixel-level discrete nature, while the learning-based HED showed improved but still limited precision as its multi-scale architecture was not specifically optimized for subpixel accuracy. DeepEdge performed better with its hybrid CNN-handcrafted features, yet our Gaussian-gradient fusion approach demonstrated 33% higher accuracy by combining the strengths of both analytical modeling and data-driven adaptation (see Figure 2).
The second experiment examined performance on Focus-Lite’s aberrated beams (astigmatism and coma). Table 3 reveals our method maintained 0.15-pixel RMSE despite distortions, whereas others showed significant degradation: Canny (0.52), HED(0.31), and DeepEdge (0.25). The anisotropic Gaussian fitting component in our model successfully compensated for asymmetric distortions by adapting
The third experiment tested low-SNR scenarios on Focus-Lite (SNR = 10 dB). As Table 4 shows, our method’s adaptive weighting between gradient and fitting terms achieved 0.17-pixel RMSE, outperforming others significantly. Canny’s simple thresholding failed (1.24-pixel error), while HED (0.43) and DeepEdge (0.35) suffered from noise amplification. Our model’s noise robustness stems from the joint optimization, where the regularization terms
4.4.2 Dynamic scenario robustness
The first dynamic robustness experiment evaluated performance under varying Gaussian noise levels
The second experiment tested motion blur robustness using Focus-Med’s laser cutting sequences with kernel sizes from 5 px to 15 px. Table 6 reveals our method’s superior performance with only 0.28 px RMSE at 15 px blur, compared to 0.67 px (DeepEdge) and 0.82 px (HED). The key advantage stems from our directional gradient analysis that distinguishes authentic edges from blur artifacts. Canny failed completely (1.89 px error) due to its isotropic edge detection, while DeepEdge’s learned features provided partial resistance but lacked explicit motion modeling. Our method’s computational cost scaled gracefully with blur severity (18.1 ms–24.5 ms), making it suitable for real-time applications (see Figure 6).

Figure 6. Motion blur robustness using Focus-Med’s laser cutting sequences with kernel sizes from 5 px to 15 px.
The third experiment evaluated performance under combined disturbances (noise + blur + illumination changes) using Focus-Med’s most challenging sequences. Table 7 demonstrates our method’s comprehensive robustness with a 0.91 F1-score and 0.31 px RMSE, outperforming DeepEdge (0.83 F1, 0.49 px) and HED (0.78 F1, 0.61 px). The illumination-adaptive thresholding in our pipeline proved critical, reducing false positives by 62% compared to DeepEdge. Canny’s static thresholds caused complete failure (0.52 F1), highlighting the necessity of dynamic adaptation. Our hybrid approach’s runtime (26.8 ms) remained practical, with 80% of computations dedicated to the robust fitting stage that ensured stability (see Figure 7).
4.4.3 Extreme condition performance
The first extreme condition experiment evaluated performance under ultra-low SNR (5 dB) using Focus-Extreme’s photon-limited sequences. As shown in Table 8, our method achieved 0.28-pixel RMSE through adaptive noise suppression in the gradient domain, outperforming DeepEdge (0.47) and HED (0.59). Canny failed (1.35-pixel error) due to fixed thresholds. The key innovation lies in our SNR-aware weighting mechanism that automatically balances gradient and intensity information. Runtime analysis showed our method maintained real-time capability (28.3 ms) despite the computational overhead of noise estimation.
The scattering medium test used Focus-Extreme’s tissue penetration dataset. Table 9 demonstrates our method’s superior angular accuracy (0.61°) compared to DeepEdge (1.12°) by explicitly modeling scattering through our Monte Carlo-inspired regularization term (Equation 7). The photon transport simulation embedded in our pipeline reduced false edges by 43% versus HED. Interestingly, the runtime increased only 15% despite the added physics modeling, validating our efficient implementation.
The femtosecond pulse experiment analyzed nonlinear propagation effects. Table 10’s comprehensive results show our hybrid approach achieved 0.91 temporal–spatial consistency, significantly higher than pure learning-based methods. The physics-guided CNN architecture successfully compensated for nonlinear distortions that caused DeepEdge’s performance to drop by 38%. Runtime comparisons revealed our method’s computational cost (35.2 ms) remained practical for high-power laser applications.
4.5 Ablation study
The ablation study focuses on two core components of our proposed model: (1) the multi-scale adaptive convolution (MAC) module that dynamically adjusts receptive fields based on local gradient characteristics, and (2) the Gaussian-gradient fusion (GGF) block that combines parametric surface fitting with deep feature extraction. To evaluate their individual contributions, we created two ablated variants: w/o MAC replaces the adaptive convolutions with fixed 3
The results (see Table 11) demonstrate both components’ critical importance. Removing multi-scale adaptation (w/o MAC) caused a 61% RMSE increase and a 6.5% F1-score drop, with particularly severe degradation on small spots due to the inability of fixed receptive fields to capture varying scales. The false-positive rate nearly tripled, confirming MAC’s role in suppressing noise while preserving genuine edges. The Gaussian-gradient fusion removal (w/o GGF) showed an even greater impact with a 0.842 F1-score, revealing conventional CNNs’ limitation in maintaining geometric precision—the 0.42 RMSE indicates suboptimal subpixel localization without explicit analytical modeling. Notably, while w/o GGF runs slightly faster, the accuracy trade-off proves unjustifiable for precision applications. The full model’s balanced performance validates our hybrid architecture’s superiority over pure learning-based or traditional approaches (see Figure 8).
4.6 Limitations
For highly irregular profiles, we are integrating Zernike moment descriptors
where
5 Conclusion and outlook
5.1 Conclusion
This study addresses the critical challenge of laser spot edge extraction in optical measurement systems, where traditional methods struggle with noise interference, uneven intensity distributions, and varying spot sizes. We propose a novel multi-scale adaptive convolution framework that combines three key innovations: (1) a dynamic kernel adjustment mechanism based on local intensity gradients; (2) a hierarchical feature pyramid architecture preserving edge sharpness across scales; (3) a subpixel localization module integrating Gaussian surface fitting with gradient extremum analysis. Experimental validation across three datasets demonstrated superior performance, achieving 0.12-pixel RMSE on Focus-Lite (vs. 0.38 for Canny and 0.18 for DeepEdge), maintaining 0.15-pixel accuracy with aberrated beams, and showing remarkable robustness under noise (0.17-pixel RMSE at 10 dB SNR). In dynamic scenarios, our method sustained 0.25-pixel precision under 15 px motion blur while conventional approaches degraded to 0.67–1.89 pixels. Extreme condition tests revealed 0.28-pixel accuracy at 5 dB SNR and 0.61° angular precision in scattering media, outperforming baseline models by 40%–58%. The ablation study confirmed the necessity of both multi-scale adaptation (61% RMSE increase when removed) and Gaussian-gradient fusion (0.842 F1-score without it).
5.2 Outlook
Three key modifications are being developed to address runtime concerns: (1) a lightweight neural network to predict initial Gaussian parameters (projected 50% reduction in fitting iterations); (2) hardware-aware pyramid construction dynamically adjusting scale numbers based on GPU memory bandwidth; (3) mixed-precision quantization (FP16/INT8) for convolution operations.
The current multi-scale adaptive convolution framework, while achieving superior accuracy, exhibits increased computational complexity compared to traditional edge detection methods. Our experiments show runtime measurements of 15.2 ms for basic detection (vs. 3.2 ms for Canny) and up to 35.2 ms for femtosecond pulse analysis, primarily due to the iterative Gaussian fitting process and cross-scale feature fusion. This overhead becomes particularly problematic for real-time applications requiring >60 fps processing. Future work will optimize the pipeline through three key modifications: (1) implementing a lightweight neural network to predict initial Gaussian parameters, reducing nonlinear optimization iterations by 50%; (2) developing a hardware-aware pyramid construction algorithm that dynamically adjusts scale numbers based on GPU memory bandwidth; and (3) employing mixed-precision quantization (FP16/INT8) for convolution operations without sacrificing subpixel precision. Preliminary simulations suggest these changes could reduce runtime to <10 ms while maintaining <0.15 px RMSE, making the method viable for high-speed laser scanning systems.
Although the method demonstrates excellent performance on Gaussian-like beams (0.12 px RMSE), its accuracy degrades for highly irregular spots (0.31 px RMSE) due to the underlying anisotropic Gaussian model’s parametric constraints. The ablation study reveals a 42% drop in the F1 score when handling TEM01 modes compared to TEM00. To address this, we propose augmenting the model with non-parametric shape descriptors: (1) incorporating Zernike moment features to capture complex beam asymmetries; (2) developing a hybrid architecture where CNN branches process local deformations while the Gaussian component handles global intensity trends. This dual-path approach aims to reduce irregular spot errors below 0.2 px while preserving the current 0.91 temporal-spatial consistency metric.
While the method shows robustness at 5 dB SNR (0.28px RMSE), performance deteriorates rapidly below 3 dB, where traditional Poisson noise dominates, evidenced by 18% false positives in Focus-Extreme’s ultra-low-light sequences. The current gradient-weighting mechanism fails to distinguish genuine edges from stochastic fluctuations when photon counts fall below 100/pixel. Our improvement plan involves three innovations: (1) Integrating a physics-based noise model using EMCCD/SPAD sensor characteristics to weight pixel contributions adaptively; (2) Developing a quantum-inspired edge confidence metric that combines shot noise statistics with spatial coherence patterns; (3) Implementing a multi-exposure fusion protocol where short-/high-intensity frames guide the interpretation of long/low-light acquisitions. Initial tests with synthetic data suggest these modifications could maintain <0.35 px accuracy down to 1 dB SNR while reducing false positives by 40%. The upgraded system will particularly benefit biomedical applications, such as in vivo two-photon imaging, where laser power must be minimized. We will validate the approach using the NIST-traceable low-light calibration standards to establish metrological reliability under photon-counting conditions, addressing a critical gap in current quantitative laser metrology.
This study presents a novel multi-scale adaptive convolution framework for laser spot edge extraction that achieves unprecedented subpixel accuracy (0.12 px RMSE), demonstrates robust performance under diverse challenging conditions (0.28 px at 5 dB SNR, 0.61° in scattering media), and establishes a new framework for precision optical measurement through its innovative integration of adaptive feature pyramids and Gaussian-gradient fusion.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.
Author contributions
KY: Validation, Writing – review and editing, Conceptualization, Data curation. LL: Visualization, Methodology, Writing – original draft, Conceptualization, Validation.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This article is supported by the National Key Research and Development Program of China (No. 2022YFB4601100).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1. Bonnett Del Alamo M, Soncco C, Helaconde R, Bazo Alba J, Gago A. Laser spot measurement using simple devices. AIP Adv (2021) 11:075016. doi:10.1063/5.0046287
2. Yin Z, Zhao L, Li T, Zhang T, Liu H, Cheng J, et al. Automated alignment technology for laser repair of surface micro-damages on large-aperture optics based on machine vision and rangefinder ranging. Measurement (2025) 244:116511. doi:10.1016/j.measurement.2024.116511
3. Gao Y, Zhong P, Tang X, Hu H, Xu P. Feature extraction of laser welding pool image and application in welding quality identification. IEEE access (2021) 9:120193–202. doi:10.1109/access.2021.3108462
4. Zhao Y, Currie EH, Kavoussi L, Rabbany SY. Laser scanner for 3d reconstruction of a wound’s edge and topology. Int J Computer Assisted Radiol Surg (2021) 16:1761–73. doi:10.1007/s11548-021-02459-1
5. Chen D, Li Z, Wang J, Lu H, Hao R, Fan K, et al. Experimental study of laser spot tracking for underwater optical wireless communication. Opt Express (2024) 32:6409–22. doi:10.1364/oe.514542
6. Sun R, Lei T, Chen Q, Wang Z, Du X, Zhao W, et al. Survey of image edge detection. Front Signal Process (2022) 2:826967. doi:10.3389/frsip.2022.826967
7. Pan Z, Zhang H, Shi L. Application of subpixel edge detection in laser spot image detection and control. Int Core J Eng (2024) 10:41–50.
8. Mattsson M. Laser line extraction with sub-pixel accuracy for 3D measurements. Master’s thesis. Linköping, Sweden: Linköping University (2020).
9. Yang X, Xie J, Liu R, Mo F, Zeng J. Centroid extraction of laser spots captured by infrared detectors combining laser footprint images and detector observation data. Remote Sensing (2023) 15:2129. doi:10.3390/rs15082129
11. Chen M, Zhang Z, Wu H, Xie S, Wang H. Otsu-kmeans gravity-based multi-spots center extraction method for microlens array imaging system. Opt Lasers Eng (2022) 152:106968. doi:10.1016/j.optlaseng.2022.106968
12. Ali R, Sarmad M, Tayyub J, Vogel A. Accurate detection of weld seams for laser welding in real-world manufacturing. AI Mag (2023) 44:431–41. doi:10.1002/aaai.12134
13. Mattulat T. Understanding the coaxial optical coherence tomography signal during the laser welding of hidden t-joints. J Laser Appl (2024) 36:012003. doi:10.2351/7.0001157
14. Zhou G, Fu Y, Zhang Z, Yin W. Edge detection and depth evaluation of cfrp internal delamination defects based on differential image frequency domain preprocessing. J Instrumentation (2025) 20:P01027. doi:10.1088/1748-0221/20/01/p01027
15. Ning E, Wang Y, Wang C, Zhang H, Ning X. Enhancement, integration, expansion: activating representation of detailed features for occluded person re-identification. Neural Networks (2024) 169:532–41. doi:10.1016/j.neunet.2023.11.003
16. Meng H, Gao W, Ye Y, Liu Y. Multimodal libs-flipa fusion with frame segmentation for robust plastic classification via advanced lipa processing. Opt Lett (2025) 50:3038–41. doi:10.1364/ol.562180
17. Ning E, Wang C, Zhang H, Ning X, Tiwari P. Occluded person re-identification with deep learning: a survey and perspectives. Expert Syst Appl (2024) 239:122419. doi:10.1016/j.eswa.2023.122419
18. Al Darwich R, Babout L, Strzecha K. An edge detection method based on local gradient estimation: application to high-temperature metallic droplet images. Appl Sci (2022) 12:6976. doi:10.3390/app12146976
19. Han H, Zhu P, Ji L, Liu D, Ning Y. Research on an improved sobel operator edge detection algorithm for inner wall rust image recognition of spiral tubes. In: 2024 IEEE 2nd international conference on sensors, electronics and computer engineering (ICSECE). IEEE (2024). p. 1010–3.
20. Yan J, Xu Z, Wu Z, Li Q, Tang M, Ling J. Edge detection method of laser cladding pool image based on morphology. AOPC 2021: Adv Laser Technology Appl (Spie) (2021) 12060:246–53.
21. Darwis D, Fernando Y, Trisnawati F, Marzuki DH, Setiawansyah S. Comparison of edge detection methods using roberts and laplacian operators on mango leaf objects. BAREKENG: Jurnal Ilmu Matematika dan Terapan (2023) 17:1815–24. doi:10.30598/barekengvol17iss3pp1815-1824
22. Lu Y, Duanmu L, Zhai ZJ, Wang Z. Application and improvement of canny edge-detection algorithm for exterior wall hollowing detection using infrared thermal images. Energy and Buildings (2022) 274:112421. doi:10.1016/j.enbuild.2022.112421
23. Wang J, Chen J. Subpixel edge detection algorithm based on improved gaussian fitting and canny operator. Acad J Comput & Inf Sci (2022) 5:33–9.
24. Ning E, Li W, Fang J, Yuan J, Duan Q, Wang G. 3d-guided multi-feature semantic enhancement network for person re-id. Inf Fusion (2025) 117:102863. doi:10.1016/j.inffus.2024.102863
25. Ma D, Jiang P, Shu L, Geng S. Multi-sensing signals diagnosis and cnn-based detection of porosity defect during al alloys laser welding. J Manufacturing Syst (2022) 62:334–46. doi:10.1016/j.jmsy.2021.12.004
26. Ren Ziwen LH, wei S. Study on the algorithm for discriminating the focusing state of laser spot based on time series prediction. Laser & Optoelectronics Prog (2025) 62:0615015.
27. Xie S, Tu Z. Holistically-nested edge detection. In: Proceedings of the IEEE international conference on computer vision (2015). p. 1395–403.
28. Bertasius G, Shi J, Torresani L. Deepedge: a multi-scale bifurcated deep network for top-down contour detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition (2015). p. 4380–9.
Keywords: multi-scale adaptive convolution, laser spot edge extraction, subpixel localization, Gaussian surface fitting, gradient extremum analysis, feature pyramid architecture, optical measurement precision
Citation: Yuan K and Li L (2025) Optimization of laser spot edge extraction and localization based on multi-scale adaptive convolution. Front. Phys. 13:1650714. doi: 10.3389/fphy.2025.1650714
Received: 20 June 2025; Accepted: 18 September 2025;
Published: 23 October 2025.
Edited by:
Peter R. Hobson, Queen Mary University of London School of Physical and Chemical Sciences, United KingdomReviewed by:
Yuzhu Liu, Nanjing University of Information Science and Technology, ChinaLebohang Bell, Council for Scientific and Industrial Research (CSIR), South Africa
Copyright © 2025 Yuan and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Lin Li, bGxfeXlrakBidXUuZWR1LmNu