Research on automatic mosaicking and synthesis processing technology for multi-source remote sensing images

Cai, Jing; Ye, Feng; Sun, Jingyu; Wei, Hangan; Li, Zichuang; Li, Pengao

doi:10.3389/frsen.2025.1731775

ORIGINAL RESEARCH article

Front. Remote Sens., 30 January 2026

Sec. Image Analysis and Classification

Volume 6 - 2025 | https://doi.org/10.3389/frsen.2025.1731775


Jing Cai¹*

Feng Ye¹

Jingyu Sun¹

Hangan Wei²

Zichuang Li²

Pengao Li²

¹Jiangsu Siji Technology Service Co., Ltd., Nanjing, Jiangsu, China
²Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China

Automatic mosaicking and synthesis of multi-source remote sensing images is key to improving the utilization efficiency of remote sensing data. With the rapid development of diversified imaging platforms such as satellites, unmanned aerial vehicles, and ground sensors, the heterogeneity of image data sources has become increasingly prominent, which makes mosaicking and synthesis more difficult. This paper focuses on automatic mosaicking and synthesis processing technology for multi-source remote sensing images. First, an adaptive block-weighted Wallis parallel color equalization algorithm that incorporates scene-specific constraints is designed. It dynamically adjusts the block size used for color equalization according to the coefficient of variation and refines the calculation of local color parameters with bilinear interpolation, which avoids the color distortion of traditional global algorithms and significantly improves the efficiency of radiometric correction. Second, an adaptive mosaicking algorithm is introduced, in which a space-constrained Markov Random Field-Graph Cut seamline generation model is used to generate seamless synthetic images that support large-area coverage. The technology can be extended to environmental monitoring, disaster assessment, and urban planning, and it can automatically process massive multi-source data and achieve high-precision synthesis.

1 Introduction

With the rapid development of Earth observation technology towards high resolution, multi-platform, and multi-sensor directions (Wang, 2021), multi-source remote sensing images have become the core data support for fields such as wide-area power grid inspection, dynamic environmental monitoring, and disaster emergency response. In the power grid inspection scenario, data must be obtained through the collaboration of multiple platforms, such as unmanned aerial vehicles (UAVs) (Fan et al., 2024) and domestic high-resolution optical satellites, to achieve full coverage and near-real-time monitoring of transmission line corridors (Ma et al., 2025). However, during the acquisition of multi-source data, differences in imaging time, variations in sensor parameters, and fluctuations in atmospheric conditions and ground features make geometric distortion and radiometric differences widespread. Direct mosaicking of such images is prone to seam breaks, color imbalance, and misalignment of key ground features, which seriously affect the accuracy of power grid defect identification and make it difficult to meet the dual requirements of “one map” power grid inspection for image currency and integrity. Therefore, the development of robust automatic mosaicking and synthesis technology for multi-source remote sensing images has become a key topic in current engineering applications and academic research (Michel et al., 2025).

In the technical system of automatic mosaicking and synthesis for multi-source remote sensing images, color consistency is a core link that determines the quality of mosaicked results and operational usability (Zuo et al., 2024). Especially in large-scale and complex scenarios, affected by the dynamic changes of imaging environments and the increasing differences in sensor parameters, the technical bottlenecks in this link directly restrict the application value of image data. Due to the heterogeneity of imaging platforms, images to be mosaicked often have both color deviation and contrast imbalance (Fan et al., 2018). Some images show a “washed-out” low-contrast feature due to excessive illumination, while others present a “darkened” high-contrast state caused by shooting angles or atmospheric interference. This leads to obvious “striping” and “blocking” phenomena in the directly mosaicked images (Moghimi et al., 2022), which seriously affects the accuracy of ground feature detail recognition and makes it difficult to meet the requirement for complete interpretation of key facilities such as transmission towers and substations in power grid inspection (Lin et al., 2024; Lin et al., 2024).

To solve this problem, the Wallis transform based on mean and variance mapping has become a classic technical framework for color consistency processing, because it corrects color and contrast simultaneously (Hong et al., 2022). Early studies explored the applicability of this transform in several ways: the global Wallis transform was used for color matching of adjacent images, which initially alleviated the overall color deviation by unifying the linear transformation parameters of the entire image, but it adapted poorly to local differences in areas with complex ground features; a combined strategy of “Wallis transform and histogram matching” was proposed, which performed fine adjustment of the histogram in the overlapping area of adjacent images after global correction, taking into account both overall consistency and natural local transition (Cui et al., 2017); the idea of least squares block adjustment was introduced, integrating all images in the region into a unified optimization framework to calculate adjustment parameters, thus improving the overall color coordination of large-scale mosaicking; and, targeting the characteristics of orthophotos, the Wallis parameters were optimized based on the mean and variance information of the overlapping area of adjacent images, further enhancing the connection between adjacent images.

However, the traditional Wallis method and early improved schemes still have significant limitations. On the one hand, using a single linear relationship or fixed block division to process the entire image cannot adapt to the differences in ground feature complexity, which easily leads to local color distortion or excessive adjustment of contrast. On the other hand, the serial computing mode and random color equalization transfer path can hardly meet the efficiency requirements for processing massive remote sensing data. Moreover, cumulative errors are prone to occur during multiple transfers, making it impossible to meet the near-real-time and accuracy requirements of provincial-scale image mosaicking (Yang et al., 2021).

In the research on seamline generation, existing methods mainly construct objective functions around “minimizing gray differences” (Wen et al., 2022). The seamline search model designed based on the Graph Cut algorithm obtains the globally optimal path by solving the s-t graph minimum cut, which effectively improves seam smoothness. However, this model does not consider the requirement of protecting key ground features in power grid business scenarios, and the seamline is prone to passing through facilities such as transmission towers and substations, resulting in ground feature breaks and misalignment, which affects the availability of inspection data. The Markov Random Field (MRF) has been introduced to optimize the Graph Cut model (Li et al., 2018). The continuity of the seamline is constrained by the smoothness term of the energy function, but its data term relies only on pixel gray differences and does not integrate geospatial constraints. When dealing with dense power grid areas, the problem that key ground features are divided by the seamline still cannot be avoided. A seamline optimization method based on structure preservation has also been proposed, which can reduce ground feature deformation but has high algorithm complexity and slow processing speed for massive images. Moreover, it does not provide a dedicated avoidance mechanism for power grid facilities, and its success rate of avoiding towers and substations in practical applications is less than 85%, which can hardly meet the needs of refined power grid asset management (Liu et al., 2025).

In view of the above research shortcomings, focusing on the actual needs of wide-area inspection of the Jiangsu power grid and combining the characteristics of Jiangsu Province, such as diverse topography and dense power grids, this paper conducts research on automatic mosaicking and synthesis technology for multi-source remote sensing images, focusing on the following three aspects.

Proposing an adaptive block-weighted Wallis parallel color homogenization algorithm based on the coefficient of variation: The block size is dynamically adjusted through the coefficient of variation (small blocks are used in areas with complex ground features, and large blocks are used in homogeneous areas). Combined with bilinear interpolation to calculate pixel-level Wallis parameters, the local color deviation problem of the traditional global algorithm is avoided. At the same time, a parallel processing mechanism is introduced to improve the color unification efficiency of large-scale images and realize cross-source radiometric correction of optical and SAR images.

Constructing an MRF-Graph Cut seamline generation model integrated with power grid spatial constraints: Typical ground feature masks of transmission corridors and substations are added to the data term of the energy function. By increasing the weights of the n-links and t-links in key areas, the algorithm actively avoids high-cost areas when solving the minimum cut, ensuring that the seamline not only runs along areas with weak textures but also protects the integrity of core power grid facilities (Chen et al., 2022).

Establishing a full-process automated system for multi-source image preprocessing, color homogenization, and mosaicking: High-precision feature registration of cross-source images is realized through “improved SIFT and multi-model RANSAC”, and geometric accuracy is improved by combining block adjustment and orthorectification (Zhang et al., 2018). The Voronoi diagram and Dijkstra algorithm are used to determine the color homogenization transmission path of regional images, reducing color deviation caused by multiple transmissions. Finally, an orthomosaic map with consistent color, seamless connection, and business availability is output, which meets the strict requirements of power grid inspection for image quality (Yu and Tang, 2023).

2 Materials and methods

Considering the topographical characteristics of Jiangsu Province and the requirements of power grid applications, this study investigates feature point extraction, adaptive image matching, regional network adjustment, and orthorectification methods and strategies to improve the geometric processing accuracy and automation level of satellite images in Jiangsu Province. The research also explores the collaborative geometric processing technology of different satellite data sources. After preprocessing, the adaptive block weighting Wallis transform is employed to address the local color distortion issue of traditional global Wallis, achieving color consistency among multi-source images. Finally, “multi-scale Laplacian pyramid and seam optimization” is used to achieve seamless mosaicking, ensuring the visual consistency of large-scale images (Canty and Nielsen, 2008).

2.1 Image collection

The experimental area of this study covers the entire territory of Jiangsu Province and the related ultra-high voltage transmission corridors. The region has diverse topography and dense power grids, requiring extremely high precision and automation levels for image mosaicking. To meet the near-real-time update demand of power grid inspection for corridor images, one of the major challenges faced by this study is the continuous collection of new data within the image acquisition cycle to ensure complete coverage of the transmission corridors. The experimental data mainly consist of high-resolution optical satellite images from domestic sources. To meet the visual interpretation needs of power grid inspection, optical images are uniformly adopted as true color products with a combination of blue, green, and red bands, with a data bit depth of 8 bits.

Since color quality cannot be quantitatively measured in practical applications, qualitative descriptions are adopted, which specifically include: In terms of image fusion quality, the fused images have natural colors, uniform tones, clear textures, rich layers, no obvious distortion, blurring, or ghosting; in terms of image enhancement quality, the enhanced images feature clear details of surface features, appropriate contrast, distinct layers, basically balanced colors, and their histograms are roughly close to a normal distribution; in terms of image mosaicking quality, the color transition at the mosaicking seams is natural, surface features are properly connected, man-made features remain intact, with no obvious mosaicking traces or feature misalignments, and the mosaicked results across the entire province have consistent colors as well as appropriate brightness and saturation. This study mainly uses Chinese optical satellites, and the daily image coverage range is the Jiangsu region.

2.2 Image preprocessing

2.2.1 Preprocessing of optical images

The Ground Control Point (GCP) and Rational Polynomial Coefficient (RPC) model joint adjustment method is employed. The density of GCPs reaches 1 per 2 km² in key inflection areas along the power line corridor, ensuring geometric accuracy. The geometric residuals of the rectified images are strictly controlled within 0.15 pixels, the seam root mean square error (RMSE) is better than 1.5 pixels, and the RMSE of the planimetric deviation from the reference base map is better than 2 pixels. The images are unified in a common coordinate system, providing an accurate geometric basis for subsequent feature registration. Noise suppression is achieved using a bilateral filter.

Additionally, radiometric and format standardization preprocessing is conducted: radiometric calibration and atmospheric correction are applied to eliminate radiometric differences caused by solar altitude angle and atmospheric scattering. Histogram stretching is used to optimize the dynamic range of the images. For images with different resolutions and bit depths, bilinear interpolation is employed for resampling to a uniform resolution, converting 16-bit fused images to 8-bit. Finally, the files are organized in the TIF and TFW format, with the TFW file recording the ground resolution in the horizontal direction, rotation parameters, and the geographic coordinates of the upper-left corner of the image.
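To make the role of the TFW file concrete, the following minimal sketch reads the six world-file parameters and maps a pixel position to map coordinates. It assumes the common world-file line order (x pixel size, two rotation terms, negative y pixel size, then the x and y coordinates of the upper-left pixel); the function names `parse_world_file` and `pixel_to_map` and the sample values are hypothetical and not taken from the study.

```python
def parse_world_file(text):
    """Read the six world-file values in their conventional order A, D, B, E, C, F."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    a, d, b, e, c, f = values
    return a, d, b, e, c, f


def pixel_to_map(col, row, params):
    """Apply the affine transform x = A*col + B*row + C, y = D*col + E*row + F."""
    a, d, b, e, c, f = params
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical 2 m image with no rotation (values are illustrative only).
SAMPLE_TFW = "2.0\n0.0\n0.0\n-2.0\n500001.0\n3500999.0\n"
params = parse_world_file(SAMPLE_TFW)
print(pixel_to_map(10, 5, params))  # (500021.0, 3500989.0)
```

The negative fifth-from-last value reflects that image rows increase downward while map northing increases upward.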

2.2.2 High-precision feature registration of multi-source images

An Improved SIFT and Multi-Model RANSAC approach is used to achieve cross-source image registration, addressing the high mismatch rate of traditional SIFT in multi-source data.

2.2.2.1 Improved SIFT feature extraction

The centroid of the local window of feature points is calculated to assign a unique direction, avoiding the insufficient multi-direction adaptability of traditional SIFT. A Markov Random Field energy function is constructed, integrating feature vector similarity measures with geometric constraints such as feature point pair distance and direction consistency. The belief propagation algorithm is used to select high-discrimination and geometrically consistent feature points, eliminating redundant points in repetitive texture areas and improving the effective rate of feature points (Tang et al., 2022).

2.2.2.2 Multi-model RANSAC optimization

An “affine model (for small-range distortions) and perspective model (for large-range stretching)” is integrated to robustly screen SIFT match pairs. First, the affine model eliminates pseudo-matches caused by local noise (e.g., SAR image speckle); then, the perspective model selects true match pairs that conform to large-range mosaicking, with a match pair retention rate of ≥85% and registration error of ≤0.3 pixels, meeting the millimeter-level precision requirement for power grid defect identification (Zhang et al., 2017).

2.2.3 Adaptive block-weighted Wallis transform pseudo-code and mathematical definition

To clarify the implementation logic of the proposed adaptive block-weighted Wallis transform, the step-by-step pseudo-code, mathematical relationships for block size adjustment, and bilinear interpolation implementation details are presented as follows:

1. Pseudo-code of the adaptive block-weighted Wallis transform:

Input:
    image_in: Input image (H x W)
    window_size: Size of the filtering window (e.g., 2r + 1 for radius r)
    m_t: Target mean (e.g., 128 or the global mean of the image)
    sigma_t: Target standard deviation (e.g., 50 or 0.8 * m_t)
    c: Contrast control factor (0.0 to 1.0)
    b: Brightness control factor (0.0 to 1.0)
    epsilon: A small constant to prevent division by zero

Output:
    image_out: Output image (H x W)

Steps:
    1. Initialize image_out as a zero matrix with the same size as image_in.
    2. For each pixel (i, j) in the image:
        a. Determine the local window region centered at (i, j), denoted as window.
           (Note: Handle image boundaries, typically using symmetric padding or only valid regions.)
        b. Calculate the local mean (m_s) and local standard deviation (sigma_s) of all pixel values within the window:
               m_s = mean(window)
               sigma_s = sqrt(mean(window^2) - m_s^2)
        c. Calculate intermediate variables:
               scale_factor = c * sigma_t / (c * sigma_s + (1 - c) * sigma_t + epsilon)
               offset = b * m_t + (1 - b) * m_s
        d. Calculate the output pixel value:
               pixel_val = image_in(i, j)
               new_val = (pixel_val - m_s) * scale_factor + offset
        e. Clamp new_val to the valid grayscale range (e.g., 0 to 255):
               image_out(i, j) = clamp(new_val, 0, 255)
    3. Return image_out.
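The per-pixel steps listed above can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: it handles boundaries by using only the valid part of each window (one of the two options mentioned in the note), writes the scale factor in the same form as the multiplicative term of the Wallis transform, and uses the hypothetical names `local_wallis`, `radius`, `contrast`, and `brightness`. It is written for readability, not speed.

```python
import math


def local_wallis(image, radius, target_mean, target_std,
                 contrast=1.0, brightness=1.0, eps=1e-6):
    """Windowed Wallis filter on a 2-D list of gray values (0 to 255)."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Local window clipped to the image (valid-region boundary handling).
            rows = range(max(0, i - radius), min(height, i + radius + 1))
            cols = range(max(0, j - radius), min(width, j + radius + 1))
            vals = [image[r][c] for r in rows for c in cols]
            m_s = sum(vals) / len(vals)
            variance = sum(v * v for v in vals) / len(vals) - m_s * m_s
            sigma_s = math.sqrt(max(variance, 0.0))  # guard against rounding below zero
            # Contrast gain and brightness offset of the Wallis transform.
            scale = contrast * target_std / (
                contrast * sigma_s + (1.0 - contrast) * target_std + eps)
            offset = brightness * target_mean + (1.0 - brightness) * m_s
            new_val = (image[i][j] - m_s) * scale + offset
            # Clamp to the 8-bit range.
            out[i][j] = min(255, max(0, int(round(new_val))))
    return out


ramp = [[10 * (i + j) for j in range(5)] for i in range(5)]
result = local_wallis(ramp, radius=1, target_mean=128, target_std=50)
```

With `brightness=1.0` every local mean is pulled to the target mean, while `brightness=0.0` keeps the local mean and only adjusts contrast.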

2. Mathematical Definition of Key Components

Coefficient of Variation (CV): Defined as the ratio of the standard deviation to the mean of the image (or block), which reflects the dispersion degree of pixel values. The mathematical expression is Equation 1:

CV = \frac{σ}{μ} (1)

Where $σ$ denotes the standard deviation of the image (or block) pixel values, and $μ$ denotes the mean of the image (or block) pixel values.

Block Size Adjustment: The block size is dynamically scaled by the ratio of the coefficient of variation of the input image to that of the reference image. The mathematical relationships are as follows Equations 2, 3:

W = r \times w_{0}, \quad H = r \times h_{0} (2)

r = \frac{CV}{CV_{Ref}} (3)

Where $w_{0}$ and $h_{0}$ are the predefined base block width and height (set to 128 pixels in this study), $r$ is the block size scaling factor, $CV$ is the coefficient of variation of the input image, and $CV_{Ref}$ is the coefficient of variation of the reference image (calibrated to 0.3 based on the Jiangsu power grid scene). To avoid over-correction or insufficient adjustment, the block size is constrained to the range of 32 × 32 to 256 × 256 pixels.

Bilinear Interpolation for Pixel-Level Parameters: For a pixel $(x, y)$ located in block $B (i, j)$ , its mean $m (x, y)$ and standard deviation $s (x, y)$ are obtained by interpolating the mean and standard deviation of the four corners of the block. The mathematical expressions are Equations 4, 5:

m (x, y) = \sum_{k = 1}^{4} w_{k} \times m_{k} (4)

s (x, y) = \sum_{k = 1}^{4} w_{k} \times s_{k} (5)

Where $w_{k}$ is the distance weight of the $k$ -th corner of the block (calculated based on the Euclidean distance between the pixel and the corner), $m_{k}$ is the mean of the $k$ -th corner, and $s_{k}$ is the standard deviation of the $k$ -th corner. The weight calculation follows the principle of “closer distance corresponds to higher weight” to ensure smooth transition of parameters between adjacent blocks.

2.3 Multi-source image color balancing technology based on block Wallis filtering

When mosaicking multiple images, color and brightness inconsistencies due to different shooting conditions can lead to visually disharmonious results. To obtain a unified and interpretable image for human visual interpretation or computer-aided analysis, color balancing technology is essential. This technology significantly enhances the overall quality and usability of the images, making them more suitable for cartographic and thematic analysis.

The causes of color inconsistency are complex and diverse, mainly including: (1) Different imaging times: Variations in solar altitude angle and illumination conditions, as well as changes in the spectral reflectance characteristics of land cover features (e.g., seasonal vegetation changes). (2) Sensor parameter differences: Parameters such as gain and exposure settings may vary between different sensors or even for the same sensor at different times. (3) Atmospheric condition effects: Clouds, dust, humidity, and other atmospheric conditions during imaging can scatter and absorb light, altering the intensity of the reflected signals reaching the sensor.

As shown in Figure 1 below (Gaofen-1 satellite data with a resolution of 2 m, using the red, green, and blue bands, acquired in 2024), color balancing can significantly improve the visual effect of the mosaicked area.

Figure 1

Two side-by-side satellite images of the same geographic area. The left image is overlayed with various colored strips, reducing clarity and showing different data layers. The right image is clear, displaying detailed natural and urban features.

Figure 1. Comparison of color balancing effects before and after.

2.3.1 Principle and process of color balancing based on Wallis transform

The Wallis transform is a special linear image transformation that maps the mean and standard deviation of the image to be processed to those of a reference image, making the mean and standard deviation of the two images approximately equal. This means that the tone, brightness, and dynamic range of the gray levels of the image to be processed and the reference image are approximately equal, achieving color balance between the two images (Fan et al., 2017).

The Wallis transform can be expressed as Equation 6:

f (x, y) = [g (x, y) - m_{g}] \frac{c s_{f}}{c s_{g} + (1 - c) s_{f}} + b m_{f} + (1 - b) m_{g} (6)

In the formula, $g (x, y)$ is the gray value of the original image, and $f (x, y)$ is the gray value of the resulting image. $m_{g}$ is the mean gray value of the original image, and $s_{g}$ is the standard deviation of the gray values of the original image. $m_{f}$ is the mean gray value of the reference image, and $s_{f}$ is the standard deviation of the gray values of the reference image. $b$ ∈ [0,1] is the brightness coefficient and $c$ ∈ [0,1] is the contrast coefficient.

The above equation can also be expressed as Equation 7:

f (x, y) = g (x, y) r_{1} + r_{0} (7)

Here $r_{1} = c s_{f} / [c s_{g} + (1 - c) s_{f}]$ and $r_{0} = b m_{f} + (1 - b - r_{1}) m_{g}$, where $r_{1}$ and $r_{0}$ are the multiplicative and additive coefficients, respectively.
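As a quick check on the linear form, expanding the Wallis transform and collecting the terms in $g (x, y)$ gives the two coefficients directly, and setting b = c = 1 yields the typical form with gain $s_{f} / s_{g}$. The derivation below only restates the equations of this section and adds no new result.

```latex
\begin{aligned}
f(x, y) &= r_{1}\,[g(x, y) - m_{g}] + b\,m_{f} + (1 - b)\,m_{g},
  \qquad r_{1} = \frac{c\,s_{f}}{c\,s_{g} + (1 - c)\,s_{f}} \\
        &= r_{1}\,g(x, y) + \underbrace{b\,m_{f} + (1 - b - r_{1})\,m_{g}}_{r_{0}} \\
b = c = 1:\quad r_{1} &= \frac{s_{f}}{s_{g}},
  \qquad r_{0} = m_{f} - \frac{s_{f}}{s_{g}}\,m_{g}
\end{aligned}
```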

The typical Wallis transform with b = 1, c = 1 can be expressed as Equation 8:

f (x, y) = [g (x, y) - m_{g}] (\frac{s_{f}}{s_{g}}) + m_{f} (8)

The mean and standard deviation of the image gray levels are calculated as follows Equations 9, 10:

m = \frac{\sum_{x = 1}^{w} \sum_{y = 1}^{h} g (x, y)}{w \cdot h} (9)

s = \sqrt{\frac{\sum_{x = 1}^{w} \sum_{y = 1}^{h} {[g (x, y) - m]}^{2}}{w \cdot h}} (10)

$w$ and $h$ are the width and height of the image, respectively, and $g (x, y)$ is the gray level of the image. For color images, the Wallis transform is usually applied to each band separately.

When performing uniform light correction among multiple images using the Wallis transform, a color-balanced image is first selected as the reference image. Its mean and standard deviation are calculated to serve as the target values in the Wallis transform, namely, $m_{f}$ and $s_{f}$ . Then, the mean and standard deviation of the other images to be processed are calculated, and the Wallis transform is applied pixel by pixel to generate result images whose gray-level means and standard deviations are approximately equal to those of the reference image. The process is illustrated in Figure 2 below.

Figure 2

Flowchart depicting the process of image processing with Wallis transformation. Both reference and target images are input. Their grayscale mean and standard deviation are calculated. The Wallis transformation is applied to the target image, producing the output image.

Figure 2. Workflow of color balancing based on Wallis transform.
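The workflow in Figure 2 can be sketched for a single band as follows. This is a minimal illustration under stated assumptions: bands are small 2-D lists, the output is left as unclamped floats, and a flat input band (zero denominator) keeps a gain of 1, which is a choice made here and not taken from the paper. The helper names `mean_std`, `wallis_coefficients`, and `global_wallis` are hypothetical.

```python
import math


def mean_std(band):
    """Mean and population standard deviation of a 2-D band."""
    values = [v for row in band for v in row]
    m = sum(values) / len(values)
    s = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return m, s


def wallis_coefficients(m_g, s_g, m_f, s_f, b=1.0, c=1.0):
    """Additive (r0) and multiplicative (r1) coefficients of the linear Wallis form."""
    denom = c * s_g + (1.0 - c) * s_f
    r1 = c * s_f / denom if denom > 0 else 1.0  # assumption: flat band keeps gain 1
    r0 = b * m_f + (1.0 - b - r1) * m_g
    return r0, r1


def global_wallis(band, reference, b=1.0, c=1.0):
    """Map the statistics of `band` towards those of `reference`."""
    m_g, s_g = mean_std(band)
    m_f, s_f = mean_std(reference)
    r0, r1 = wallis_coefficients(m_g, s_g, m_f, s_f, b, c)
    return [[g * r1 + r0 for g in row] for row in band]


target = [[10, 20], [30, 40]]
reference = [[100, 110], [120, 130]]
balanced = global_wallis(target, reference)
print(mean_std(balanced), mean_std(reference))
```

For a color image the same function would be called once per band, in line with the per-band application described above.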

2.3.2 Color balancing method based on block weighting Wallis transform

The Wallis color balancing method uses the overall mean and standard deviation of the image to apply the same linear relationship to every pixel. However, the image contains a variety of land cover features with different color information. The overall mean and standard deviation of the image cannot accurately reflect the local color characteristics of land cover features, and using the same linear relationship is clearly unreasonable (Fan et al., 2017).

To address this issue, this paper proposes a method that divides the image into blocks and calculates the mean and standard deviation of each block. Then, bilinear interpolation is used to obtain the linear transformation parameters for each pixel, and different linear relationships are applied to different pixels (Yang et al., 2020).

2.3.2.1 Adaptive image blocking

When using a blocking strategy, the quality of color balancing is affected by the number of blocks. If the number of blocks is too high (i.e., the image blocks are too small), over-correction can lead to land cover distortion and color bias, and the computational load is also increased. If the number of blocks is too low, the mean and variance statistics cannot accurately reflect the land cover distribution, and the color differences between images cannot be effectively eliminated.

To improve the efficiency and quality of color balancing, this paper uses the coefficient of variation to adaptively determine the block size. The coefficient of variation is the ratio of the standard deviation to the mean, also known as the dispersion coefficient, which can describe the richness of land cover features in the image. When the coefficient of variation is larger, the image contains more diverse land cover features, and the number of blocks should be increased (i.e., the image blocks should be smaller) to achieve better color balancing effects. Therefore, the optimal block size is calculated using the following formulas (Equations 11–13):

W = r \times w (11)

H = r \times h (12)

r = \frac{CV}{CV_{Ref}} (13)

Where $CV$ is the coefficient of variation of the image, $CV_{Ref}$ is the coefficient of variation of the reference image, and $w$ and $h$ are the pre-defined numbers of blocks in the row and column directions, respectively.
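A short sketch of this adaptive rule is given below. It computes the coefficient of variation of a band and scales pre-defined block counts by r, so that a larger coefficient of variation produces more (and therefore smaller) blocks. Rounding to whole blocks and enforcing at least one block per direction are assumptions added here; the names `coefficient_of_variation`, `adaptive_block_counts`, `base_cols`, and `base_rows` are hypothetical.

```python
import math


def coefficient_of_variation(band):
    """Ratio of the population standard deviation to the mean of a 2-D band."""
    values = [v for row in band for v in row]
    mean = sum(values) / len(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return std / mean


def adaptive_block_counts(cv, cv_ref, base_cols, base_rows):
    """Scale the pre-defined block counts by r = cv / cv_ref."""
    r = cv / cv_ref
    n_cols = max(1, int(round(r * base_cols)))  # assumption: whole blocks, at least one
    n_rows = max(1, int(round(r * base_rows)))
    return n_cols, n_rows


# A band twice as variable as the reference doubles the block count per direction.
print(adaptive_block_counts(0.6, 0.3, base_cols=4, base_rows=4))  # (8, 8)
```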

2.3.2.2 Weighted Wallis color balancing algorithm

After determining the block size, the mean and standard deviation of each image block are calculated. To avoid the “block effect,” the mean and standard deviation of adjacent image blocks are considered when processing each block, and bilinear interpolation is used to calculate the linear transformation parameters for each pixel. The specific steps are as follows (Ehlers et al., 2010).

1. Divide the image to be processed into $W \times H$ non-overlapping blocks and calculate the mean and standard deviation of each block.

2. Calculate the mean and standard deviation corresponding to the corners of each block. If a corner belongs to only one image block, the mean and standard deviation of that block are assigned to the corner. If a corner is a common corner of multiple adjacent blocks, the average of the parameters of the multiple blocks is assigned to the corner.

3. The mean and standard deviation of each pixel are calculated by weighting the four corners of the block it belongs to, using bilinear interpolation. For a point $g (x, y)$ in block $B (w, h)$ , its mean and standard deviation are determined by the mean and standard deviation of the four corners $P (w, h)$ , $P (w + 1, h)$ , $P (w, h + 1)$ , $P (w + 1, h + 1)$ and the distances $Δ x, Δ y$ from the point to the block edges, as follows Equations 14, 15:

\begin{array}{c} m (x, y) = (1 - \frac{Δ x}{X}) (1 - \frac{Δ y}{Y}) m (w, h) + \frac{Δ x}{X} (1 - \frac{Δ y}{Y}) m (w + 1, h) \\ + (1 - \frac{Δ x}{X}) \frac{Δ y}{Y} m (w, h + 1) + \frac{Δ x Δ y}{X Y} m (w + 1, h + 1) \end{array} (14)

\begin{array}{c} s (x, y) = (1 - \frac{Δ x}{X}) (1 - \frac{Δ y}{Y}) s (w, h) + \frac{Δ x}{X} (1 - \frac{Δ y}{Y}) s (w + 1, h) \\ + (1 - \frac{Δ x}{X}) \frac{Δ y}{Y} s (w, h + 1) + \frac{Δ x Δ y}{X Y} s (w + 1, h + 1) \end{array} (15)

Where $m (x, y), s (x, y)$ are the mean and standard deviation at point $p (x, y)$ . $m (w, h)$ , $m (w + 1, h)$ , $m (w, h + 1)$ , $m (w + 1, h + 1)$ are the means of the corners $P (w, h)$ , $P (w + 1, h)$ , $P (w, h + 1)$ , $P (w + 1, h + 1)$ , respectively. $s (w, h)$ , $s (w + 1, h)$ , $s (w, h + 1)$ , $s (w + 1, h + 1)$ are the standard deviations of the corners $P (w, h)$ , $P (w + 1, h)$ , $P (w, h + 1)$ , $P (w + 1, h + 1)$ , respectively. X and Y are the width and height of block $B (w, h)$ .

4. Perform Wallis transform processing on each pixel based on the calculated $m (x, y)$ and $s (x, y)$ , as follows Equation 16:

f (x, y) = \frac{s_{f} [g (x, y) - m (x, y)]}{s (x, y)} + m_{f} (16)

Using bilinear intra-block interpolation to calculate the linear transformation parameters for each pixel ensures the smoothness between adjacent image blocks. Additionally, using the corners of the image blocks rather than the center points for calculation can avoid sawtooth phenomena at the edges of the image blocks. The two grids show the coordinate transformation as in Figure 3.
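The four steps of the weighted algorithm can be sketched for one band as follows. This is a minimal, unoptimized illustration under several assumptions: the band is a 2-D list at least as large as the block grid, blocks are split by integer division of the image size, a small `eps` is added to the interpolated standard deviation to avoid division by zero, and the output is left as unclamped floats. The names `block_stats`, `corner_stats`, and `block_weighted_wallis` are hypothetical.

```python
import math


def block_stats(band, n_cols, n_rows):
    """Step 1: mean and standard deviation of each non-overlapping block."""
    height, width = len(band), len(band[0])
    x_edges = [k * width // n_cols for k in range(n_cols + 1)]
    y_edges = [k * height // n_rows for k in range(n_rows + 1)]
    stats = []
    for by in range(n_rows):
        row_stats = []
        for bx in range(n_cols):
            vals = [band[y][x]
                    for y in range(y_edges[by], y_edges[by + 1])
                    for x in range(x_edges[bx], x_edges[bx + 1])]
            m = sum(vals) / len(vals)
            s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
            row_stats.append((m, s))
        stats.append(row_stats)
    return stats, x_edges, y_edges


def corner_stats(stats):
    """Step 2: each corner takes the average of the blocks that share it."""
    n_rows, n_cols = len(stats), len(stats[0])
    corners = []
    for cy in range(n_rows + 1):
        row = []
        for cx in range(n_cols + 1):
            near = [stats[by][bx]
                    for by in (cy - 1, cy) for bx in (cx - 1, cx)
                    if 0 <= by < n_rows and 0 <= bx < n_cols]
            row.append((sum(m for m, _ in near) / len(near),
                        sum(s for _, s in near) / len(near)))
        corners.append(row)
    return corners


def block_weighted_wallis(band, n_cols, n_rows, m_f, s_f, eps=1e-6):
    """Steps 3 and 4: bilinear interpolation of corner statistics, then Wallis."""
    stats, xe, ye = block_stats(band, n_cols, n_rows)
    corners = corner_stats(stats)
    out = [list(row) for row in band]
    for by in range(n_rows):
        for bx in range(n_cols):
            block_w, block_h = xe[bx + 1] - xe[bx], ye[by + 1] - ye[by]
            c4 = (corners[by][bx], corners[by][bx + 1],
                  corners[by + 1][bx], corners[by + 1][bx + 1])
            for y in range(ye[by], ye[by + 1]):
                for x in range(xe[bx], xe[bx + 1]):
                    tx, ty = (x - xe[bx]) / block_w, (y - ye[by]) / block_h
                    wts = ((1 - tx) * (1 - ty), tx * (1 - ty),
                           (1 - tx) * ty, tx * ty)
                    m = sum(w * c[0] for w, c in zip(wts, c4))
                    s = sum(w * c[1] for w, c in zip(wts, c4))
                    out[y][x] = s_f * (band[y][x] - m) / (s + eps) + m_f
    return out
```

Because neighbouring blocks share corner values, the interpolated parameters are continuous across block borders, which is what suppresses the block effect.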

Figure 3

Two grids illustrate coordinate transformations. The left grid shows axes X and Y with a point labeled $p(x, y)$. The right grid shows axes W and H with points $P_{w,h}$, $P_{w+1,h}$, $P_{w,h+1}$, and $P_{w+1,h+1}$. Arrows indicate transformations with labels $\Delta x$ and $\Delta y$.

Figure 3. Bilinear interpolation.

2.3.3 Regional overall color balancing strategy

When using Wallis color balancing, if the content of the image to be processed is significantly different from that of the reference image, the results may be biased or degraded. The content of multiple images within a region often varies greatly, and a single reference image is not suitable for color balancing among multiple images in the region (Xu et al., 2024).

Based on the property that adjacent images have the same or similar content, a method of adjusting reference images multiple times to process adjacent images pairwise can be adopted. Since the neighboring relationships among multiple images in a region are complex and the spatial distribution is irregular, the order of image processing, i.e., the transfer path, directly affects the quality of color balancing. Therefore, it is necessary to consider how to determine the neighboring relationships among images and the transfer path.

To address this issue, this paper proposes a method that combines Voronoi diagrams with the Dijkstra algorithm to determine the order of color balancing processing. The Voronoi diagram is constructed based on the center points of the images to effectively organize the images in the region, facilitating the query of neighboring relationships. The Dijkstra algorithm is used to calculate the shortest transfer path, ensuring that adjacent images with larger overlap areas are processed with fewer transfers, reducing transfer errors and improving the color consistency among images in the region (Chen et al., 2014).

2.3.3.1 Representing neighboring relationships among images

The Voronoi diagram (Figure 4) is a spatial partitioning structure based on the nearest neighbor principle. Therefore, when using the Voronoi diagram to determine the neighboring relationships among images, images with a certain overlap degree but a large distance apart will be judged as non-adjacent. A point set composed of the center points of all images is taken to generate the Voronoi diagram, with each point corresponding to a Voronoi polygon. The adjacency of images can be determined based on whether the corresponding Voronoi polygons share a common edge. When processing color balancing among multiple images, the adjacent images determined by the Voronoi diagram have larger overlap areas, which can improve the color balancing effect (Wang et al., 2017).

Figure 4

Voronoi diagram with eight points labeled v1 through v8, each surrounded by regions delineated by dashed lines. Each point is centered within its respective polygonal region.

Figure 4. Schematic diagram of Voronoi diagram.
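The adjacency test described above can be sketched without a geometry library by using the dual of the Voronoi diagram: for points in general position, two Voronoi polygons share an edge exactly when their sites are joined by a Delaunay edge. The brute-force sketch below is an assumption-laden illustration (it is quartic in the number of images, ignores degenerate cocircular layouts, and the function names are hypothetical), not the implementation used in the paper.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (cx, cy, r_squared) of the circle through a, b, c, or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def voronoi_neighbors(centers):
    """Map each image index to the indices whose Voronoi polygons share an edge with it."""
    adjacency = {i: set() for i in range(len(centers))}
    for i, j, k in combinations(range(len(centers)), 3):
        circle = circumcircle(centers[i], centers[j], centers[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        # Delaunay criterion: no other centre lies strictly inside the circle.
        empty = all(
            (px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
            for n, (px, py) in enumerate(centers)
            if n not in (i, j, k)
        )
        if empty:
            for a, b in ((i, j), (j, k), (i, k)):
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency


# Hypothetical image centres: 0 and 1 are far apart, 2 and 3 lie between them.
print(voronoi_neighbors([(0, 0), (10, 0), (5, 1), (5, -1)]))
```

In this example images 0 and 1 are not reported as neighbours because images 2 and 3 separate them, which matches the behaviour described for distant images.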

2.3.3.2 Searching for the shortest transfer path

The Dijkstra algorithm is a commonly used shortest path search method in graph theory. Let $d_{j}$ denote the shortest path length from the source point $s$ to point $j$. The basic process of the Dijkstra algorithm for calculating the shortest path from $s$ to every other point is as follows (He, 2022).

1. Initialization: $d_{s} = 0$ and $d_{j} = \infty$ for $j \neq s$. Mark the source point $s$ and let $k = s$.

2. Update the upper limit of the distance for every unmarked point $j$ directly connected to the marked point $k$ (Equation 17):

d_{j} = \min [d_{j}, d_{k} + l_{k j}] (17)

Where $l_{k j}$ is the distance from point $k$ to point $j$ .

3. Select the point $i$ with the smallest distance upper limit from all unmarked points, mark point $i$, and record its predecessor $i'$.

4. If all points have been marked, the calculation ends. Otherwise, let k = i and go to step (2).
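The four steps above can be sketched as follows. This is a minimal illustration that uses a binary heap to pick the unmarked point with the smallest distance upper limit; the example graph and its edge lengths are hypothetical and are not the graph of Figure 5.

```python
import heapq


def dijkstra(graph, source):
    """graph: {node: {neighbor: edge_length}} for an undirected weighted graph.

    Returns (dist, pred): shortest path lengths and the predecessor of each node.
    """
    dist = {node: float("inf") for node in graph}  # step 1: initialisation
    pred = {node: None for node in graph}
    dist[source] = 0.0
    marked = set()
    heap = [(0.0, source)]
    while heap:
        d_k, k = heapq.heappop(heap)
        if k in marked:
            continue
        marked.add(k)  # step 3: mark the closest unmarked point
        for j, l_kj in graph[k].items():  # step 2: d_j = min(d_j, d_k + l_kj)
            if j not in marked and d_k + l_kj < dist[j]:
                dist[j] = d_k + l_kj
                pred[j] = k
                heapq.heappush(heap, (dist[j], j))
    return dist, pred


# Hypothetical four-image neighbourhood graph.
graph = {
    "v1": {"v2": 2, "v3": 5},
    "v2": {"v1": 2, "v3": 1, "v4": 4},
    "v3": {"v1": 5, "v2": 1, "v4": 1},
    "v4": {"v2": 4, "v3": 1},
}
dist, pred = dijkstra(graph, "v1")
print(dist, pred)
```

The `pred` dictionary is what the color balancing step needs: following predecessors from any image leads back to the initial reference image along the shortest transfer path.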

The Dijkstra algorithm can effectively solve the shortest path problem of an undirected connected weighted graph. According to the Dijkstra algorithm, the shortest path from point v1 to the other points is solved. The undirected connected weighted graph is shown in Figure 5.

Figure 5

Graph representation with vertices labeled V1 to V8 connected by edges with weights. Dotted lines indicate subdivisions. Labels are near vertices and edges, showing structure and distances.

Figure 5. Undirected connected weighted graph.

2.3.3.3 Regional image color balancing processing

After determining the neighboring relationships among images and the transfer path, the specific steps for color balancing multiple images in a region using the block weighting Wallis method are as follows.

1. Select the initial reference image. The quality of the initial reference image affects the overall quality of the images in the region. This paper selects the image with the highest clarity as the initial reference image. Clarity is a quality evaluation parameter used to measure the expression ability of an image in texture and detail. The higher the clarity value, the clearer the image and the higher the image quality. The clarity is calculated using the average gradient of the image, as follows (Equations 18–20):

D = \frac{1}{(M - 1) (N - 1)} \sum_{x = 1}^{M - 1} \sum_{y = 1}^{N - 1} \sqrt{\frac{Δ_{x}^{2} + Δ_{y}^{2}}{2}} (18)

Δ_{x} = f (x + 1, y) - f (x, y) (19)

Δ_{y} = f (x, y + 1) - f (x, y) (20)

where $M$ and $N$ are the width and height of the image, respectively, and $f (x, y)$ is the gray level of the image at point $(x, y)$ .
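A direct transcription of Equations 18–20 is given below as a sketch. It assumes the gray image is stored as a nested list indexed as `f[x][y]` with zero-based indices; the function name `average_gradient` is hypothetical.

```python
import math


def average_gradient(f):
    """Clarity D of Equation 18 for a gray image stored as f[x][y]."""
    M, N = len(f), len(f[0])
    total = 0.0
    for x in range(M - 1):
        for y in range(N - 1):
            dx = f[x + 1][y] - f[x][y]  # Equation 19
            dy = f[x][y + 1] - f[x][y]  # Equation 20
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((M - 1) * (N - 1))


# Hypothetical candidates: a flat image and a textured one.
flat = [[5, 5], [5, 5]]
textured = [[0, 1], [1, 2]]
reference = max([flat, textured], key=average_gradient)
print(average_gradient(flat), average_gradient(textured))
```

Selecting the initial reference image then amounts to taking the candidate with the largest returned value, as in the `max` call above.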

2. Construct a Voronoi diagram based on the center points of the images in the region, with each center point corresponding to a Voronoi polygon. The adjacency of images can be determined based on whether the corresponding Voronoi polygons share a common edge. Since the Voronoi diagram is divided based on the principle of the nearest distance to the center point, it can reasonably determine the neighboring relationships among multiple overlapping images. Only images with a short distance are considered adjacent, corresponding to a larger overlap area, which can improve the color balancing effect.

3. Use the Dijkstra algorithm to calculate the shortest path from the center point of the initial reference image to the center points of the other images, and record the predecessor $j$ of each image $i$. The color balancing result of image $j$ is used as the reference image for image $i$.

4. Perform pairwise processing using the block Wallis color balancing method. To avoid color bias caused by multiple transfers and maintain the overall color consistency of the images in the region, the mean and standard deviation of the new reference image are constrained by those of the initial reference image. Let the color balancing result of image $j$ be $j'$. When processing image $i$, the reference mean $m_{f}$ and standard deviation $s_{f}$ are calculated as follows (Equations 21, 22):

m_{f} = w m_{j^{'}} + (1 - w) m_{1} (21)

s_{f} = w s_{j^{'}} + (1 - w) s_{1} (22)

Where $m_{j'}$ and $s_{j'}$ are the mean and standard deviation of image $j'$, respectively, and $m_{1}$ and $s_{1}$ are the mean and standard deviation of the initial reference image, respectively. $w \in [0, 1]$ is a weighting constant, usually taken as 0.5.
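Equations 21 and 22 amount to a convex blend of two sets of statistics. The sketch below shows the computation with hypothetical numbers; the function name `reference_stats` and the sample values are assumptions made for illustration only.

```python
def reference_stats(m_prev, s_prev, m_init, s_init, w=0.5):
    """Blend the statistics of the processed neighbour j' with those of the
    initial reference image (Equations 21 and 22)."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    m_f = w * m_prev + (1.0 - w) * m_init
    s_f = w * s_prev + (1.0 - w) * s_init
    return m_f, s_f


# Hypothetical statistics: processed neighbour (110, 26), initial reference (100, 20).
print(reference_stats(110.0, 26.0, 100.0, 20.0))  # (105.0, 23.0)
```

With `w = 1` the blend reduces to plain pairwise transfer, and with `w = 0` every image is matched directly to the initial reference; intermediate values limit how far the reference can drift along a long transfer path.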

2.4 Seamline generation for image mosaicking

The primary purpose of remote sensing image mosaicking is to address the limited coverage and overlapping issues of single-scene remote sensing images. Since satellite or aerial photography typically covers only a part of the target area with each image, and adjacent images have overlapping areas, mosaicking technology is used to seamlessly stitch multiple images into a large-scale image covering the entire study area. The specific process mainly includes two core steps: seamline (mosaicking line) generation and mosaicking based on the seamline. The mosaicking process is relatively mature, with the focus being on the “seamline,” which should pass through low-gradient/weak texture areas while avoiding key land cover features such as transmission corridors and buildings.

Therefore, the core of seamline extraction is to transform the “optimal seamline search” into a Markov Random Field (MRF) energy minimization problem, and to achieve mathematical modeling through the Maximum A Posteriori (MAP) criterion, ultimately relying on s-t graph cuts to solve for the globally optimal path. This ensures that the seamline not only traverses regions of low radiometric dissimilarity but also circumvents critical infrastructure (Boykov et al., 2002).

In this study, the seamline extraction is formulated as a binary-labeling Markov Random Field (MRF) energy minimization problem. Under the Maximum A Posteriori (MAP) criterion, the engineering task of “finding the optimal seamline” is transformed into a mathematical problem of “minimizing the energy of the labeling field” (Equation 23):

\hat{y} = \underset{y}{\arg \max} P (y | x) = \underset{y}{\arg \min} U (x, y) (23)

Where $x$ denotes the observed field (i.e., the input multi-source images, which are known), $y$ is the labeling field (the segmentation result to be estimated, with $y_{i} = 1$ indicating that pixel $i$ is assigned to the right-hand image), $U (x, y)$ is the energy function, and $\hat{y}$ is the optimal labeling field whose boundary corresponds to the seamline. The energy function is the sum of a data term and a smoothness term (Equation 24):

U (x, y) = U_{1} (y) + U_{2} (y) (24)

Data Term $U_{1} (y)$ (Key Feature Preservation + Color Consistency). The data term integrates both pixel-level color consistency and key feature preservation constraints. For each pixel i (Equation 25):

U_{1} (y_{i}) = \begin{cases} w_{key}, & \text{if pixel } i \in \text{key feature mask} \\ | I_{1} (i) - I_{2} (i) |, & \text{otherwise} \end{cases} (25)

$I_{1} (i)$ and $I_{2} (i)$ are the gray values of pixel $i$ in the left and right input images, respectively; the absolute difference reflects color consistency.

Key feature mask construction: A 1 km buffer zone is established centered on the transmission line vector data (substations and transmission towers are all within this buffer zone), and pixels within the buffer zone are marked as key features.

$w_{key} = 9999$ (a large constant), ensuring that the energy value of pixels in key feature areas is extremely high. During s-t graph cut optimization, the algorithm will avoid cutting through these areas, thus preserving the integrity of key power grid facilities.
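The buffer-based mask can be sketched as a distance test between each pixel centre and the line segments of the transmission line vector. The sketch assumes a projected coordinate system in metres with the grid origin at (0, 0); the function names, the grid layout, and the sample line are hypothetical, and a production workflow would normally rely on a GIS library instead.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    vx, vy = bx - ax, by - ay
    length2 = vx * vx + vy * vy
    if length2 == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * vx + (py - ay) * vy) / length2))
    return math.hypot(px - (ax + t * vx), py - (ay + t * vy))


def key_feature_mask(rows, cols, pixel_size, line, buffer_m=1000.0):
    """Mark pixels whose centre lies within buffer_m metres of the polyline.

    line: list of (x, y) vertices in metres (assumed frame: origin at the
    grid corner, x along columns, y along rows).
    """
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            center = ((c + 0.5) * pixel_size, (r + 0.5) * pixel_size)
            mask[r][c] = any(
                point_segment_distance(center, line[k], line[k + 1]) <= buffer_m
                for k in range(len(line) - 1)
            )
    return mask


# Hypothetical 4 x 5 grid of 800 m pixels with a line along the first row.
for row in key_feature_mask(4, 5, 800.0, [(0.0, 400.0), (4000.0, 400.0)]):
    print(row)
```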

Smoothness Term $U_{2} (y)$ (Seamline Smoothness). The smoothness term adopts the Potts model to constrain the continuity of the seamline, ensuring that adjacent pixels have consistent labels (Equation 26):

U_{2} (y) = \sum_{i \in V} \sum_{j \in N (i)} δ (y_{i}, y_{j}) (26)

$N (i)$ denotes the 4-neighborhood of pixel $i$ ;

$δ (y_{i}, y_{j}) = 0$ if $y_{i} = y_{j}$, and $δ (y_{i}, y_{j}) = 1$ otherwise.

This term penalizes label changes between adjacent pixels, forcing the seamline to be distributed in low-texture areas with smooth transitions.
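To make the two terms concrete, the sketch below evaluates Equation 25 as a per-pixel cost map and Equation 26 as a count of label changes over ordered 4-neighbour pairs. How the cost map is distributed over the T-link and N-link weights is not specified at this point in the text, so that step is left out; the function names and the sample arrays are hypothetical.

```python
W_KEY = 9999  # large constant from the text


def data_cost(I1, I2, key_mask):
    """Per-pixel cost of Equation 25: W_KEY inside the mask, |I1 - I2| elsewhere."""
    return [
        [W_KEY if key_mask[r][c] else abs(I1[r][c] - I2[r][c])
         for c in range(len(I1[0]))]
        for r in range(len(I1))
    ]


def potts_term(labels):
    """U2 of Equation 26: number of ordered 4-neighbour pairs with different labels."""
    rows, cols = len(labels), len(labels[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[r][c] != labels[rr][cc]:
                    total += 1
    return total


# Hypothetical labelings: a straight seam and a jagged one on a 2 x 3 overlap.
print(potts_term([[0, 0, 1], [0, 0, 1]]), potts_term([[0, 0, 1], [0, 1, 1]]))
```

The straight seam yields the smaller Potts value, which illustrates how the smoothness term favours short, regular seamlines.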

At this point, the computation method of the energy function is fully defined. Subsequently, the globally optimal seamline is obtained in a single step via the s-t graph cut approach. The s-t graph serves as a key construct for transforming the “minimization of Markov Random Field (MRF) energy” into an engineering-tractable problem. By building a “node–edge” network, the task of “seamline generation (binary segmentation)” is equivalently formulated as solving the “minimum cut in an s-t graph” (Liao et al., 2018).

The graph construction process is as follows: A graph $G = (V, E)$ is built over the image, where V represents the nodes corresponding to pixels, and E are the edges connecting these nodes. If two special nodes, s and t, referred to as the source and sink respectively, are added to this graph, and s and t are connected to all nodes in the graph, then an s-t network graph is obtained, as shown in Figure 6b.

In Figure 6b, the edges between pixel nodes are called N-links, and the edges between pixel nodes and s or t are called T-links. Weights are assigned to each edge according to certain rules: the weight of a T-link reflects the probability that a pixel node belongs to the same class as s or t, and the weight of an N-link reflects the probability that adjacent pixel nodes belong to the same class. In this way, the constructed s-t network graph is tied to the MRF energy minimization framework.

Figure 6

Panel a shows an aerial view of a residential area with curved roads and houses surrounded by trees. Panel b displays a directed graph with a green node labeled

Figure 6. Establishing an s-t diagram on the image. (a,b).

When using graph cuts to solve for the minimum energy, the s-t network graph is conventionally referred to as a flow network, with each edge serving as a transport channel. $f (v_{i}, v_{j})$ and $c (v_{i}, v_{j})$ can be used to represent the flow and maximum capacity of each edge, respectively. $c (v_{i}, v_{j})$ corresponds to the weight assigned to the edge before solving, while $f (v_{i}, v_{j})$ is a dynamic concept representing the instantaneous flow on the edges during the graph cut process. For a valid network flow, the following three properties hold.

1. Capacity Limit: $f (v_{i}, v_{j}) \leq c (v_{i}, v_{j})$

2. Antisymmetry: $f (v_{i}, v_{j}) = - f (v_{j}, v_{i})$

3. Flow Balance: For a node $i$ that is neither s nor t, the sum of the inflow to $i$ should equal the sum of the outflow from $i$, i.e., Equation 27:

\sum_{j \in V} f (i, j) = 0 (27)

When a segmentation algorithm is used for generating mosaic seams (binary classification), it divides image nodes into two categories: one representing the left image and the other representing the right image. In an s-t network graph, this division is expressed by a cut: a subset of the edge set $E$ whose removal partitions the nodes into two disjoint subsets $V_{s}$ and $V_{t}$ such that $V_{s} \cap V_{t} = \emptyset$ and $V_{s} \cup V_{t} = V$, denoted as $C = (V_{s}, V_{t})$. Once the graph cut is determined, the image nodes are divided into two parts, completing the segmentation process. In practical engineering, by using typical feature masks to modify the weights of the N-links and T-links, the mosaic seams can be optimized according to requirements, avoiding features that must not be crossed by the seams.
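The relation between maximum flow and the minimum cut can be illustrated on a toy s-t graph. The sketch below uses the Edmonds-Karp augmenting-path method, which is a standard max-flow algorithm chosen here for brevity and not necessarily the solver used in this study; the one-row pixel strip and all capacities are hypothetical.

```python
from collections import deque


def min_cut(capacity, s, t):
    """Edmonds-Karp max flow. Returns (flow value, V_s, V_t).

    capacity: {u: {v: c(u, v)}}; missing reverse edges get capacity 0.
    """
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def reachable():
        # Breadth-first search over edges with remaining capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    parent = reachable()
    while t in parent:
        # Find the bottleneck on the augmenting path, then push flow along it.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
        parent = reachable()
    V_s = set(parent)  # nodes still reachable from s in the residual graph
    return flow, V_s, set(residual) - V_s


BIG = 10 ** 9  # stands in for an uncuttable T-link
strip = {
    "s": {"p0": BIG},
    "p0": {"p1": 5},
    "p1": {"p0": 5, "p2": 1},
    "p2": {"p1": 1, "p3": 4},
    "p3": {"p2": 4, "t": BIG},
}
print(min_cut(strip, "s", "t"))
```

In this strip the cheapest N-link lies between `p1` and `p2`, so the cut separates `{s, p0, p1}` from `{p2, p3, t}`. Raising that link to a large value such as 9999, as a key feature mask would, moves the cut to the next cheapest link.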

3 Results

Comparative experiments were conducted on two sets of orthophotos using the global Wallis equalization method, the commercial software Inpho 5.6, and the method proposed in this paper. The results are shown in Figure 7 (Gaofen-2 satellite data with a resolution of 0.8 m, using the red, green, and blue bands, acquired in 2024) and Figure 8 (UAV data captured by the DJI Mavic 3 camera with a resolution of 0.2 m, using the red, green, and blue bands, acquired in 2024). The first set of experimental data (Data I) consists of four color images with obvious inconsistencies in color and contrast, with approximately 40% overlap between each pair, as shown in Figure 7a. The second set of experimental data (Data II) includes 234 images from 5 flight strips, with along-track overlap of about 70%–80% and across-track overlap of about 40%–50%, as shown in Figure 8a.

Figure 7

Four satellite images labeled (a) to (d) showing different image processing methods. (a) Original image shows multiple overlapping sections. (b) Global Wallis depicts enhanced contrast and brightness. (c) Inpho displays further improved clarity. (d) Method proposed in this paper presents the most refined clarity and contrast.

Figure 7. Homogenization results of different methods on Dataset I. (a) Original image. (b) Global Wallis. (c) Inpho. (d) Method proposed in the paper.

Figure 8

Four satellite images in a grid show a city area with noticeable differences in clarity. The top row includes the original image on the left and the Global Wallis processed image on the right. The bottom row presents the Inpho processed image on the left and an image processed by a newly proposed method on the right. Each method shows variations in contrast and detail.

Figure 8. Uniform color results of Data II using different methods. (a) Original image. (b) Global Wallis. (c) Inpho. (d) Method proposed in the paper.

It can be observed that the global Wallis color normalization method only partially reduces color differences between images, with noticeable tonal variations remaining. This is due to the fact that global statistical information fails to represent local feature characteristics within the imagery. Moreover, using a single reference image can lead to over-enhancement or color bias.

While Inpho performs well in eliminating inter-image color discrepancies and achieving overall tonal consistency across the region, it results in reduced overall contrast and introduces certain deviations from the original image tones. Additionally, when an image’s tonal characteristics differ significantly from the regional overall tone, Inpho’s processing outcomes are unsatisfactory.

In contrast, the method proposed in this study delivers satisfactory results for both sets of imagery. It achieves overall consistency in color and contrast across the region while eliminating or reducing local discrepancies in overlapping areas between adjacent images, without introducing color bias. Furthermore, the method automatically selects the image with the highest clarity as the reference, resulting in high overall sharpness across the region and improved visual interpretation.

Subsequently, through mosaicking techniques, multiple images are seamlessly combined into a single large-scale image that fully covers the study area. Figure 9 shows a land cover mask in which black indicates buildings and red indicates the generated seamlines. By integrating this information into the energy function, the seamlines successfully avoid crossing buildings. Figures 9–12 present the processing results; Figures 11 and 12 are partial enlarged views of Figure 10.

Figure 9

Map showing a network of red and green lines representing roads or paths over a yellow background, likely depicting land. Dark green areas indicate wooded or forested regions.

Figure 9. Schematic diagram of automatic mosaic line bypassing buildings.

Figure 10

Satellite map depicting a region with varied terrain and vegetation. Latitude and longitude markers are displayed on the axes. A north arrow and a distance scale in miles are shown.

Figure 10. Provincial image map processing results.

Figure 11

Aerial view of a rural area with fields and residential areas divided by a river running vertically. The landscape includes organized patches of farmland, roads, and clusters of buildings.

Figure 11. Partial enlarged view of Figure 10.

Figure 12

Aerial view of a densely populated urban area featuring a grid layout. Buildings are closely packed with narrow streets between them. Fields are visible on the outskirts, and water bodies run parallel to some streets.

Figure 12. Partial enlarged view of Figure 10.

4 Discussion

This study addresses the specific requirements of power grid applications and the topographic characteristics of Jiangsu Province by developing an automated mosaicking and synthesis technology for multi-source remote sensing imagery. The technology demonstrates significant advantages in three key aspects: color consistency optimization, critical feature preservation, and processing efficiency adaptation. It exhibits strong applicability to operational scenarios such as power grid inspection and asset management, effectively overcoming the limitations of general-purpose mosaicking techniques.

In terms of color consistency processing, traditional global Wallis algorithms rely on overall image statistics and struggle to handle local color variations in multi-source power grid imagery (such as domestic optical satellite and SAR images). This often leads to color distortion in critical areas like transmission line corridors and substation peripheries, impairing visual interpretation for hazard identification. To address this industry challenge, the proposed adaptive block-weighted Wallis color normalization algorithm dynamically adjusts block sizes based on the coefficient of variation. It employs smaller blocks in complex transmission corridors to preserve texture details and larger blocks in homogeneous regions like farmlands to enhance processing efficiency. Combined with bilinear interpolation for pixel-level color normalization parameters, this method effectively resolves the issue of “local distortion caused by global normalization.” Experimental comparisons show that the processed imagery outperforms both the global Wallis algorithm and commercial software Inpho 5.6 in color naturalness and texture clarity, with significantly reduced color discrepancies at image seams. This fully meets the color consistency requirements for power grid “single-map” applications and provides high-quality visual foundations for tasks such as corridor hazard detection and rooftop photovoltaic statistics (Cui, 2017).

Regarding critical feature preservation, general seamline generation algorithms often prioritize “minimizing grayscale differences” as a single objective, frequently resulting in seamlines crossing core power grid facilities like transmission towers and substations. This causes feature fragmentation and misalignment, compromising inspection data accuracy. The novel MRF-Graph Cut seamline generation model incorporates spatial constraints of transmission corridors and substations into the energy function. By leveraging geographic masks to adjust the weights of N-links and T-links in the s-t graph cut, the model actively avoids critical features. Experimental validation confirms a success rate exceeding 98% in avoiding transmission lines, with no large facilities such as substations intersected by seamlines. This approach overcomes the shortcomings of general software that prioritizes visual seamlessness over operational applicability, providing comprehensive imagery support for refined power grid asset management (Tian et al., 2017).

The objective data analysis is presented in Tables 1, 2. The method proposed in this paper demonstrates significant advantages in both datasets: in terms of the mean value, it is closest to that of the original image, which can retain the color tone of the original image to the greatest extent and reduce color loss; regarding information entropy (E), it is close to that of the original image in Table 1 and much higher than that of the other two methods, and even exceeds that of the original image in Table 2, effectively ensuring the richness of image information; for the average gradient (G), it is close to that of the original image in Table 1 and significantly higher than that of the other two methods in Table 2, achieving good maintenance of image details and improvement of clarity.

Table 1

Table 1. Quantitative evaluation table of color equalization results for dataset Ⅰ.

Table 2

Table 2. Quantitative evaluation table of color equalization results for dataset Ⅱ.
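The mean, standard deviation (Sd), and information entropy (E) reported in these tables can be computed from the gray levels of an image. The paper does not state the formulas it uses for Sd and E, so the sketch below assumes the population standard deviation and the Shannon entropy of the gray-level histogram in bits; the function name `gray_stats` and the sample data are hypothetical.

```python
import math
from collections import Counter


def gray_stats(pixels):
    """pixels: flat list of integer gray levels.

    Returns (mean, standard deviation, Shannon entropy in bits).
    """
    n = len(pixels)
    mean = sum(pixels) / n
    sd = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    # Entropy of the empirical gray-level distribution.
    entropy = -sum((k / n) * math.log2(k / n) for k in Counter(pixels).values())
    return mean, sd, entropy


# Hypothetical four-pixel image with two equally frequent gray levels.
print(gray_stats([0, 0, 255, 255]))
```

The average gradient (G) follows Equations 18–20 and is therefore not repeated here.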

It is necessary to cautiously interpret the trade-off between these quantitative metrics and visual quality. Higher G typically indicates enhanced texture detail preservation, which is critical for identifying subtle defects in power grid facilities (e.g., corrosion on transmission tower components or small-scale vegetation encroachment). Higher E reflects richer image information content, ensuring that subtle differences in ground features (e.g., variations in vegetation coverage around substations or texture differences of transmission line insulators) are retained. However, it is reasonable to note that elevated G and E values could theoretically imply increased visibility of mosaic seams or residual noise amplification—this potential contradiction requires confirmation through multi-dimensional validation.

To rule out such risks, comprehensive verification was conducted from subjective and objective perspectives: ① Visual comparison results (Figures 7–12) show that the processed images have natural color transitions, and the mosaic seams in complex areas (e.g., urban-rural junctions and transmission line corridors) are distributed in low-texture regions. The partial enlarged views (Figures 11, 12) further confirm that there are no obvious noise artifacts or artificial “striping” phenomena; ② The design logic of the proposed method inherently avoids the side effects of excessive enhancement: the adaptive block-weighted Wallis transform dynamically adjusts block sizes based on the coefficient of variation, enhancing details while preventing local contrast over-amplification, thus suppressing noise; the MRF-Graph Cut seamline generation model constrains the seamline path through the energy function’s smoothness term, avoiding unreasonable seams in high-contrast areas.

In summary, the higher G and E values of the proposed method are valid indicators of improved image quality, rather than side effects such as obvious mosaic seams or residual noise. This balance between quantitative metric optimization and visual consistency fully meets the practical requirements of power grid inspection—where both detailed feature preservation (for defect identification) and overall visual uniformity (for corridor-wide monitoring) are indispensable. Compared with the original image and the other two color equalization methods, the proposed method achieves superior performance in the light and color equalization processing of multi-source remote sensing images. By analyzing the quantitative evaluation indicators of UAV image color equalization in Table 1 (Dataset Ⅰ) and Table 2 (Dataset Ⅱ), it can be concluded that the method proposed in this paper achieves the optimal light and color equalization effect in both datasets.

For Dataset Ⅰ (Table 1): The mean value of the proposed method is the closest to that of the original image, and it outperforms the Global Wallis method and Inpho method in retaining the original color tone and reducing color loss. Its information entropy (E) is significantly higher than that of the two comparison methods, and only slightly lower than that of the original image, enabling maximum information retention. Its average gradient (G) is closer to that of the original image, which can avoid detail distortion caused by over-enhancement. Its standard deviation (Sd) is within a reasonable range, which not only ensures the uniformity of image grayscale distribution but also prevents detail loss due to excessively concentrated distribution. For Dataset Ⅱ (Table 2): The advantages of the proposed method are more prominent. The difference between its mean value and that of the original image is the smallest, and it is far better than the two comparison methods in avoiding the deviation of the original color tone. Its average gradient (G) is significantly higher than that of the two comparison methods and closer to that of the original image, showing better performance in improving image clarity and retaining detail features. Its information entropy (E) not only surpasses that of the two comparison methods but also exceeds that of the original image; by contrast, the information entropy (E) of the two comparison methods decreases significantly compared with that of the original image.

In summary, the proposed method has significant advantages in the three core dimensions of color retention, detail maintenance, and information integrity, with no obvious shortcomings. Compared with the Global Wallis method and Inpho method, it is more suitable for the light and color equalization processing of UAV images and can effectively improve the color consistency and visual quality of UAV images.

5 Conclusion

This paper conducts an in-depth investigation into automated mosaicking and synthesis techniques for large-scale multi-source remote sensing imagery, with a primary focus on image color normalization and seamline generation.

Firstly, to address the issue of color inconsistency in multi-source satellite imagery, which impedes interpretation and application, a block-based Wallis algorithm was explored. This approach incorporates holistic regional color consistency processing, enabling large-scale color uniformity across multi-source datasets.

Secondly, building upon Markov Random Field (MRF) and graph cut theory, the problem of generating image mosaic lines was transformed into an energy function optimization task in graph theory. By integrating geographic masks to guide the construction of the energy function, the method ensures that seamlines avoid crossing prohibited features, thereby achieving optimized seamline placement.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

JC: Supervision, Writing – original draft, Writing – review and editing. FY: Data curation, Investigation, Writing – original draft. JS: Project administration, Supervision, Writing – original draft. HW: Conceptualization, Project administration, Writing – review and editing. ZL: Methodology, Resources, Writing – review and editing. PL: Validation, Visualization, Writing – review and editing.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

Authors JC, FY, and JS were employed by Jiangsu Siji Technology Service Co., Ltd.

The remaining author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Boykov, Y., Veksler, O., and Zabih, R. (2002). Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Analysis Mach. Intell. 23 (11), 1222–1239. doi:10.1109/34.969114

Canty, M. J., and Nielsen, A. A. (2008). Automatic radiometric normalization of multitemporal satellite imagery with the iteratively re-weighted MAD transformation. Remote Sens. Environ. 112 (3), 1025–1036. doi:10.1016/j.rse.2007.07.013

Chen, C., Chen, Z., Li, M., Liu, Y., Cheng, L., and Ren, Y. (2014). Parallel relative radiometric normalisation for remote sensing image mosaics. Comput. and Geosciences 73, 28–36. doi:10.1016/j.cageo.2014.08.007

Chen, J., Li, Z., Peng, C., Wang, Y., and Gong, W. (2022). UAV image stitching based on optimal seam and half-projective warp. Remote Sens. 14 (5), 1068. doi:10.3390/rs14051068

Cui, H. (2017). Research on remote sensing image enhancement and color consistency algorithm. Lanzhou Jiaotong University.

Cui, H., Zhang, L., Ai, H. B., Xu, B., and Wang, Z. H. (2017). Large-scale satellite image color consistency processing algorithm using reference tone. Acta Geod. Cartogr. Sinica 46 (12), 1895–1903. doi:10.11947/j.AGCS.2017.20170232

Ehlers, M., Klonus, S., Strand, P. R., and Rosso, P. (2010). Multi-sensor image fusion for pansharpening in remote sensing. Int. J. Image Data Fusion 1 (1), 25–45. doi:10.1080/19479830903561985

Fan, C., Chen, X., Zhong, L., Zhang, M., Shi, Y., and Duan, Y. (2017). Improved wallis dodging algorithm for large-scale super-resolution reconstruction remote sensing images. Sensors 17 (3), 623. doi:10.3390/s17030623

Fan, Y., Chen, X. S., Wang, D. D., Bai, M. L., and Zhou, M. (2018). Wallis uniform light splicing algorithm for super-resolution reconstructed images. Bull. Surv. Mapp. (2), 6. doi:10.13474/j.cnki.11-2246.2018.0060

Fan, Y. L., Chen, L., Zhang, B. K., Liu, Y. E., Li, Z. R., and Sun, Y. H. (2024). Autonomous inspection and attitude control of overhead lines in distribution network based on multi-sensor UAV. Electr. Meas. Instrum. 61 (8), 186–194. doi:10.19753/j.issn1001-1390.2024.08.025

He, B. (2022). Application of dijkstra algorithm in finding the shortest path. J. Phys. Conf. Ser. 2181, 012005. doi:10.1088/1742-6596/2181/1/012005

Hong, Z., Xu, C., Tong, X., Liu, S., Zhou, R., Pan, H., et al. (2022). Efficient global color, luminance, and contrast consistency optimization for multiple remote sensing images. IEEE J. Sel. Top. Appl. Earth Observations Remote Sens. 16, 622–637. doi:10.1109/JSTARS.2022.3229392

Li, M., Li, D., Guo, B., Li, L., Wu, T., and Zhang, W. (2018). Automatic seam-line detection in UAV remote sensing image mosaicking by use of graph cuts. ISPRS Int. J. Geo-Information 7 (9), 361. doi:10.3390/ijgi7090361

Liao, T., Chen, J., and Xu, Y. (2018). Coarse-to-fine seam estimation for image stitching. arXiv preprint arXiv:1805.09578.

Lin, D., Shen, H., Li, X., Zeng, C., Jiang, T., Ma, Y., et al. (2024). Joint block adjustment and variational optimization for global and local radiometric normalization toward multiple remote sensing image mosaicking. ISPRS J. Photogrammetry Remote Sens. 218, 187–203. doi:10.1016/j.isprsjprs.2024.08.016

Liu, H., Hu, B., Hou, X., Yu, T., Zhang, Z., Liu, X., et al. (2025). Large-scale stitching of hyperspectral remote sensing images obtained from spectral scanning spectrometers mounted on unmanned aerial vehicles. Electronics 14 (3), 454. doi:10.3390/electronics14030454

Ma, Y., Wang, G. F., Zhou, F. R., Wen, G., Qian, G. C., and Ma, Y. T. (2025). Research and application of key technologies for intelligent monitoring of power grid environmental hidden dangers based on multi-source satellite remote sensing. China Sci. Technol. Achiev. (3). doi:10.3772/j.issn.1009-5659.2025.03.019

Michel, J., Kalinicheva, E., and Inglada, J. (2025). Revisiting remote sensing cross-sensor single image super-resolution: the overlooked impact of geometric and radiometric distortion. IEEE Trans. Geoscience Remote Sens. 63, 1–22. doi:10.1109/TGRS.2025.3572548

Moghimi, A., Mohammadzadeh, A., Celik, T., Brisco, B., and Amani, M. (2022). Automatic relative radiometric normalization of bi-temporal satellite images using a coarse-to-fine pseudo-invariant features selection and fuzzy integral fusion strategies. Remote Sens. 14 (8), 1777. doi:10.3390/rs14081777

Tang, L., Ma, S., Ma, X., and You, H. (2022). Research on image matching of improved SIFT algorithm based on stability factor and feature descriptor simplification. Appl. Sci. 12 (17), 8448. doi:10.3390/app12178448

Tian, Y. M., Sun, A. F., Wang, D., Pan, R., Wu, Z. L., and Geng, W. H. (2017). UAV aerial image mosaic method based on seam line. Geomatics Spatial Inf. Technol. 40 (10), 11–14. doi:10.3969/j.issn.1006-2475.2016.10.011

Wang, Q. (2021). Advances in China's environmental remote sensing monitoring technology and some frontier issues. J. Remote Sens. 25 (1), 25–36. doi:10.11834/jrs.20210572

Wang, Z., Xu, Q., Wang, Z., Guo, X., and Yang, Y. (2017). “Image mosaic algorithm of sequential images based on Voronoi,” in Proceedings of the 2017 international conference on applied mathematics, modeling and simulation (Atlantis Press), 55–59. doi:10.2991/amms-17.2017.10

Wen, S., Wang, X., Zhang, W., Wang, G., Huang, M., and Yu, B. (2022). Structure preservation and seam optimization for parallax-tolerant image stitching. IEEE Access 10, 78713–78725. doi:10.1109/ACCESS.2022.3194245

Xu, C., Hong, Z., Tong, X., Liu, S., Zhou, R., and Pan, H. (2024). Color correction and naturalness restoration for multiple images with uneven luminance. IEEE J. Sel. Top. Appl. Earth Observations Remote Sens. 17, 7343–7358. doi:10.1109/JSTARS.2024.3387654

Yang, Y., Ran, S., Gao, X., Wang, M., and Li, X. (2020). An automatic shadow compensation method via a new model combined Wallis filter with LCC model in high resolution remote sensing images. Appl. Sci. 10 (17), 5799. doi:10.3390/app10175799

Yang, Y. W., Wang, M. W., Gao, X. J., Li, X., and Zhang, J. H. (2021). Automatic shadow compensation method for high-resolution remote sensing images via improved Wallis model. Geomatics Inf. Sci. Wuhan Univ. 46 (3). doi:10.13203/j.whugis20190032

Yu, X., and Tang, X. (2023). Research on color correction processing of multi-hyperspectral remote sensing images based on FCM algorithm and Wallis filtering. IEEE Access 11, 60827–60834. doi:10.1109/ACCESS.2023.3288201

Zhang, H., Ren, D., Zhang, F., Wang, L., and Wang, B. (2017). “An improved SIFT algorithm for image matching,” in Proceedings of the international conference on image and graphics (Cham: Springer), 77–88. doi:10.1007/978-3-319-73317-3_8

Zhang, J., Zareapoor, M., He, X., Shen, D., Feng, D., and Yang, J. (2018). Mutual information based multi-modal remote sensing image registration using adaptive feature weight. Remote Sens. Lett. 9 (7), 646–655. doi:10.1080/2150704X.2018.1458343

Zuo, Z., Li, Y., and Zhang, T. (2024). A NeRF-based color consistency method for remote sensing images. arXiv preprint arXiv:2411.05557.

Keywords: color consistency optimization, cross-source remote sensing image processing, image automatic mosaicking and synthesis, MRF-graph cut seamline generation, Wallis transform

Citation: Cai J, Ye F, Sun J, Wei H, Li Z and Li P (2026) Research on automatic mosaicking and synthesis processing technology for multi-source remote sensing images. Front. Remote Sens. 6:1731775. doi: 10.3389/frsen.2025.1731775

Received: 24 October 2025; Accepted: 11 December 2025; Published: 30 January 2026.

Edited by:

Zenghui Zhang, Shanghai Jiao Tong University, China

Reviewed by:

Zongcheng Zuo, Shanghai Jiao Tong University, China
Yang Tang, The Ohio State University, United States

Copyright © 2026 Cai, Ye, Sun, Wei, Li and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jing Cai, Y2FpamluZ25qdUAxNjMuY29t

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.