- Department of Computer Engineering, Altinbas University, Istanbul, Türkiye
Accurate detection of photovoltaic (PV) module defects remains challenging due to environmental variability and the limited fault visibility of single-modality imaging. While RGB and electroluminescence (EL) images provide structural and subsurface information, they fail to capture thermal fault characteristics associated with hotspots, cell mismatch, and localized heating. Integrating infrared (IR) imagery offers complementary thermal cues that are critical for comprehensive PV inspection. This paper proposes a multimodal PV defect segmentation framework based on a modified Mask R-CNN architecture that fuses RGB, IR, and EL modalities at the feature level. A dedicated alignment pipeline combining homography transformation and enhanced correlation coefficient refinement ensures geometric consistency across modalities. A Fusion Attention Block adaptively weights modality-specific features, enabling effective cross-modal representation learning. Model hyperparameters and fusion weights are automatically optimized using the HawkFish Optimization Algorithm to improve convergence stability and segmentation robustness. Experiments conducted on statistically paired RGB–EL–IR datasets demonstrate that incorporating IR imagery significantly improves the detection of thermally driven defects and reduces false negatives in low-contrast and ambiguous regions. The proposed framework consistently outperforms unimodal and bimodal baselines, achieving state-of-the-art segmentation accuracy and enhanced defect localization, particularly for heat-related fault patterns. The results confirm that thermal information provides critical diagnostic value that cannot be recovered from RGB or EL data alone. The adaptive fusion strategy and optimization-driven tuning further enhance generalization under real-world conditions. These findings highlight the importance of IR-integrated multimodal learning for reliable and scalable PV module inspection systems.
1 Introduction
The rapid expansion of photovoltaic (PV) installations has intensified the need for reliable and automated defect detection to maintain energy yield and system longevity (Libra et al., 2023). Conventional inspection techniques (manual assessments or single-modality imaging) often fail to capture subtle surface or sub-surface faults, especially under variable environmental conditions. Recent progress in deep learning, particularly instance segmentation models such as Mask R-CNN, has improved defect localization on PV modules, while nature-inspired metaheuristic algorithms offer effective strategies for tuning complex models (Rabaia et al., 2021). These advances highlight the growing interest in combining multimodal sensing with intelligent optimization to improve diagnostic robustness (He et al., 2022). Figure 1 below shows the delamination of the PV panel and channels for moisture penetration from the back side.
Figure 1. Delamination of the PV panel and channels for moisture penetration when viewed from the front and back (Libra et al., 2023).
However, current approaches remain limited in practice. RGB imagery is sensitive to shadows, reflections, and soiling, whereas infrared and electroluminescence (EL) provide complementary thermal and subsurface information but are rarely exploited jointly (Mathew et al., 2023). In addition, segmentation accuracy is highly dependent on hyperparameter choices, which are often selected heuristically (Sodhi et al., 2022). To address these constraints, this work introduces a multimodal framework that integrates RGB, infrared, and EL imagery within a Mask R-CNN backbone, with hyperparameters and fusion behavior automatically tuned using the HawkFish Optimization Algorithm (HFOA). By co-registering heterogeneous modalities and adaptively optimizing the segmentation model, the proposed method aims to deliver more reliable detection of surface and latent defects in diverse operating conditions.
EL imaging remains central to this effort, offering fine-grained characterization of micro-cracks and cell-level discontinuities that are undetectable in standard RGB photographs. Zhang et al. (2024) introduced a lightweight neural architecture, optimized via neural architecture search and knowledge distillation, achieving high detection accuracy on EL images with significantly reduced computational overhead. Building on the publicly available ELPV dataset, Demir and Ataberk (Demir and Necati, 2024) developed a customized 2D CNN that robustly distinguishes intrinsic from extrinsic defects, demonstrating the value of tailored network designs for this modality. Hassan and Dhimish’s dual spin max-pooling CNN (Hassan and Dhimish, 2023) further improved crack detection by combining conventional convolutional layers with a novel pooling strategy that preserves edge details. Chen X. et al. (2023) proposed an automatic crack segmentation framework that fuses deep feature extraction with morphological post-processing to delineate defect boundaries accurately, while Akram et al. (2021) showed that a straightforward CNN, when properly trained on EL imagery, can outperform classical feature-based methods by learning hierarchical defect representations. More recently, El Yanboiy (2024) adapted the YOLOv5 object detector to EL data, enabling real-time, end-to-end localization and classification of multiple defect types within high-resolution electroluminescence images.
Thermal infrared imaging provides complementary insights by highlighting hot spots caused by cell mismatches or faults under load. Tao et al. (2025) explored a complementary direction by integrating modulated photocurrent signals with machine learning techniques to both diagnose and localize array faults, highlighting the importance of electrical-domain information in improving diagnostic reliability. Animashaun and Hussain (2023) advanced image-based micro-crack detection by introducing a regularized convolutional network supported by ground modelling, which improved robustness against noise and structural variability in manufacturing environments.
Other studies relied on more classical learning approaches. Singh et al. (2023) proposed a segmentation method for micro-crack detection using support vector machines, showing that carefully selected features can still offer meaningful performance in well-controlled settings. Hybrid deep learning models have also been investigated. Liu et al. (2025) introduced ResGRU, a combination of residual networks and gated recurrent units, to diagnose compound faults under dust-affected conditions, emphasizing the growing need for architectures that handle temporal and environmental dependencies. Unsupervised and clustering-based approaches have been explored as well and are summarized in Table 1 below.
Despite significant progress in photovoltaic defect detection, several key gaps persist in the current literature. First, the vast majority of studies remain confined to single-modality inputs—whether electroluminescence, infrared thermography, or visible-light imaging—resulting in limited sensitivity to faults that only manifest in complementary spectral bands (Chen L. et al., 2023; Jia et al., 2024; Patel et al., 2020). Second, most works have focused on classification frameworks that provide image-level or patch-level defect labels, rather than leveraging instance segmentation to deliver precise, pixel-level localization of cracks, soiling, or hot spots. Third, hyperparameter tuning and modality-fusion strategies are typically performed manually or via grid search, leading to suboptimal performance and high computational overhead. Finally, although some approaches demonstrate strong accuracy in controlled or small-scale datasets, few have been validated on large, real-world installations under variable environmental conditions, nor have they been packaged into open, reproducible toolkits for field deployment.
Our proposed methodology directly addresses each of these shortcomings. By co-registering RGB, infrared, and electroluminescence imagery, we establish a true multimodal foundation that captures both surface and subsurface anomalies. Integrating this fused data into a Mask R-CNN backbone enables instance-level segmentation, ensuring that each defect is precisely delineated rather than merely flagged.
The novelty of this work lies in the integrated design of a multimodal PV defect-segmentation framework that unifies RGB, infrared (IR), and electroluminescence (EL) imagery through a dedicated alignment pipeline, a feature-level fusion mechanism, and metaheuristic-driven optimization. Unlike prior studies that rely on single-modality data or simple concatenation, the proposed method incorporates geometric co-registration with ECC refinement, enabling meaningful cross-modality correspondence even when images originate from different sensors. A tailored Fusion Attention Block then learns adaptive modality weights, allowing the network to emphasize structural, thermal, or subsurface cues depending on the defect type. Furthermore, the HawkFish Optimization Algorithm (HFOA) automatically tunes critical hyperparameters and fusion behavior, improving robustness under diverse imaging conditions. Together, these components form a cohesive system that addresses limitations of existing PV inspection approaches by enhancing defect visibility, reducing false detections, and leveraging complementary sensory information more effectively than any modality alone.
The remainder of this paper is organized as follows. In Section 2, we detail our proposed multimodal fusion framework, describing the co-registration of RGB, infrared, and electroluminescence images, the integration into a Mask R-CNN backbone, and the HawkFish Optimization Algorithm for automated hyperparameter and fusion-weight tuning. In Section 3, we outline the experimental setup and evaluation metrics used to benchmark segmentation accuracy, localization precision, and computational efficiency on both laboratory and field-collected data. Section 4 reports the results of these experiments, provides a comprehensive discussion comparing our method against single-modality baselines, and analyzes robustness under varying environmental conditions. Finally, Section 5 concludes the paper by summarizing our key findings, highlighting practical implications for PV maintenance, and suggesting directions for future research.
2 Methodology
The proposed method in this section begins by synchronously acquiring RGB, infrared, and electroluminescence images of each PV module under controlled irradiance, then spatially co-registering these modalities to ensure pixel-level alignment. Next, we extract deep feature maps from each modality using parallel convolutional streams and merge them via a learnable fusion layer before feeding the combined representation into a Mask R-CNN backbone tailored for multimodal input as further illustrated in Figure 2.
To automate and refine both the fusion weights and the network’s hyperparameters, we introduce the HawkFish Optimization Algorithm, which iteratively evaluates segmentation performance and adjusts parameters according to a foraging-inspired search strategy. Once the optimized model is trained, inference proceeds by generating instance-level masks for cracks, hot spots, and soiling regions, followed by thresholding and morphological filtering to produce clean defect maps. Finally, detected anomalies are cross-verified against concurrent I–V curve deviations for severity ranking, enabling prioritized maintenance scheduling.
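To make the thresholding and morphological filtering step concrete, the following minimal sketch (assuming OpenCV; the 0.5 threshold and the 3 × 3 elliptical kernel are illustrative choices, not the tuned values of our pipeline) shows how predicted mask probabilities could be binarized and cleaned into defect maps:

```python
import cv2
import numpy as np

def clean_defect_mask(prob_map: np.ndarray, thresh: float = 0.5,
                      kernel_size: int = 3) -> np.ndarray:
    """Threshold a [0, 1] probability map and clean it morphologically."""
    binary = (prob_map >= thresh).astype(np.uint8)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                       (kernel_size, kernel_size))
    # Opening removes isolated false-positive speckles ...
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    # ... and closing fills small holes inside detected defect regions.
    return cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)
```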
2.1 Datasets preprocessing and fusion
The proposed method employs two complementary public datasets to construct our multimodal defect-detection corpus. The first is the Electroluminescence Photovoltaic (ELPV) dataset (Lu et al., 2023), which comprises 2,624 grayscale EL images of individual solar cells at a resolution of 300 × 300 pixels.
In the absence of publicly available datasets that provide physically paired RGB–IR–EL imagery of the same photovoltaic module, this study adopts a statistically aligned multimodal pairing strategy rather than a physically matched one. The ELPV dataset supplies cell-level EL images (captured under controlled electroluminescence conditions), whereas the NeurobotData RGB dataset contains full-module photographs acquired in natural lighting. Because these sources differ in scale, acquisition modality, and imaging conditions, a direct one-to-one physical correspondence between EL and RGB samples is not possible. Instead, we constructed synthetic multimodal pairs by matching EL and RGB images based on defect category, visual morphology, and module characteristics (e.g., crack orientation, defect severity, background texture). The purpose of this pairing is not to recreate the exact physical module across modalities but to simulate complementary spectral cues—EL providing subsurface micro-crack visibility, RGB revealing surface degradation, and IR highlighting thermal anomalies—and to evaluate how a fusion-based segmentation pipeline behaves when such complementary information is jointly available. After statistical pairing, images were geometrically normalized and co-registered using homography to emulate spatial alignment, acknowledging that this alignment is conceptual rather than physically exact. We explicitly recognize that this synthetic pairing introduces limitations, as the true pixel-level correlation across modalities cannot be guaranteed. Nevertheless, this approach is consistent with prior multimodal feasibility studies (Waqar Akram and Bai, 2025; Mohamed et al., 2025) and enables controlled exploration of whether multimodal cues, when fused and optimized through HFOA, can enhance segmentation accuracy.
For our instance-segmentation task, we manually converted these boxes into pixel-level masks and then standardized all images by subtracting the ImageNet mean and dividing by the standard deviation per channel. We also applied data augmentation (random horizontal flips, rotations up to
In Figure 4, ImageNet heatmap refers to a Grad-CAM–style feature activation map generated using a ResNet-50 model pretrained on ImageNet. This visualization is not used for training or fusion in our pipeline; rather, it serves as a qualitative tool to illustrate how a generic, pretrained convolutional backbone responds to defect-related textures in EL images. ImageNet-pretrained networks are widely adopted as universal feature extractors because their early and mid-level filters capture edges, contours, and structural patterns that transfer well across domains. By projecting these activation intensities as a heatmap, we highlight regions where the backbone naturally focuses (such as cracks, hotspots, and cell-level irregularities), demonstrating the motivation behind using feature-level fusion in our Mask R-CNN architecture. Thus, the term “manual fusion” refers only to overlaying backbone feature activations on the EL image for interpretability, not to any step in the actual multimodal fusion algorithm.
Figure 4. Defects classes in Electroluminescence Photovoltaic (ELPV) dataset (Wang J. et al., 2023).
The fusion pipeline begins by establishing correspondence between the electroluminescence and RGB datasets. Individual EL images are first matched to their RGB counterparts based on module identifiers and defect labels, yielding paired samples that capture both surface and sub-surface anomalies. Feature-point correspondences are then extracted using scale-invariant keypoint detectors (e.g., SIFT) on module frames, and a robust homography is estimated via RANSAC to align each EL image to the spatial geometry of the higher-resolution RGB image. All paired images are resampled to a common 256 × 256 pixel grid so that every pixel location corresponds across modalities.
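A minimal OpenCV sketch of this matching-and-warping step is given below; the ratio-test value (0.75) and RANSAC reprojection threshold (3.0 px) are common defaults used here for illustration rather than the calibrated settings of our pipeline:

```python
import cv2
import numpy as np

def align_el_to_rgb(el_gray, rgb_gray):
    """Warp an EL image into the RGB reference frame via SIFT + RANSAC."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(el_gray, None)
    kp2, des2 = sift.detectAndCompute(rgb_gray, None)

    # Lowe's ratio test on 2-nearest-neighbour descriptor matches.
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # RANSAC rejects outlier correspondences while estimating the homography.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    h, w = rgb_gray.shape[:2]
    return cv2.warpPerspective(el_gray, H, (w, h))
```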
Once co-registration is complete, each aligned pair is normalized channel-wise: EL intensities are scaled to [0, 1] after contrast-limited adaptive histogram equalization, while RGB channels undergo mean subtraction and standard-deviation normalization using ImageNet statistics. The normalized EL map is then concatenated with the three RGB channels to form a four-channel tensor

$$X = \operatorname{concat}(I_{RGB}, I_{EL}) \in \mathbb{R}^{256 \times 256 \times 4},$$

where $I_{RGB} \in \mathbb{R}^{256 \times 256 \times 3}$ denotes the normalized RGB channels and $I_{EL} \in \mathbb{R}^{256 \times 256 \times 1}$ the equalized EL map; this tensor forms the input to the multimodal backbone.
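Under the assumptions above (a co-registered EL map and the 256 × 256 working resolution), a minimal sketch of the tensor construction might look as follows; function and constant names are illustrative:

```python
import cv2
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def build_fusion_tensor(rgb: np.ndarray, el_aligned: np.ndarray) -> np.ndarray:
    """rgb: HxWx3 uint8; el_aligned: HxW uint8, already co-registered."""
    # CLAHE equalization, then scale EL intensities to [0, 1].
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    el = clahe.apply(el_aligned).astype(np.float32) / 255.0

    # ImageNet mean/std normalization for the RGB channels.
    rgb_n = (rgb.astype(np.float32) / 255.0 - IMAGENET_MEAN) / IMAGENET_STD

    # Channel-wise concatenation -> H x W x 4 input tensor.
    return np.dstack([rgb_n, el])
```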
The proposed Multimodal Fusion–Mask R-CNN–HFOA framework begins by capturing synchronized RGB, infrared, and electroluminescence images of each photovoltaic panel, then geometrically aligns the three modalities through a SIFT-RANSAC homography so that every pixel across channels corresponds to the same physical location. After contrast-limited adaptive histogram equalization and mean–variance normalization, the aligned channels are concatenated into a single four-channel tensor that preserves both surface-level visual cues and sub-surface thermal or electrical anomalies. This tensor is fed into a Mask R-CNN whose anchor sizes, region-of-interest pooling dimensions, fusion weights, and learning hyper-parameters collectively form a design vector θ. Instead of relying on manual tuning, the Enhanced HawkFish Optimization Algorithm (HFOA) maintains a population of candidate θ vectors that iteratively train lightweight Mask R-CNN surrogates, evaluate their segmentation loss and mean average precision, and move each candidate toward its own historical best and the global best via Lévy-flight exploration tempered by an energy-aware attraction rule. The cycle repeats until convergence, yielding an optimally configured vector θ⋆.
A final Mask R-CNN is then trained with θ⋆ on the full dataset and deployed for inference: each new multimodal sample is aligned, fused, and passed through the network to produce pixel-accurate masks for cracks, hotspots, and soiling, which are subsequently cleaned with morphological filters. If current–voltage (I–V) curves are available, the spatial extent of each mask is correlated with electrical deviation to generate a severity ranking that prioritizes maintenance actions as shown in Figure 5 below.
2.1.1 Infrared (IR) dataset description
To incorporate thermal information into the multimodal fusion pipeline, an additional set of infrared (IR) images was collected independently under controlled laboratory conditions. A total of 480 IR images of photovoltaic modules were acquired using a long-wave infrared (LWIR) thermal camera (FLIR-class device, 7.5–13 μm spectral range) with a native resolution of 320 × 256 pixels. The modules were operated under variable electrical loading (0.6–1.0 Isc) to stimulate thermal contrast and highlight fault-related hotspots such as cell mismatch, cracked fingers, and bypass-diode anomalies. All images were captured indoors at an ambient temperature of 23 °C–25 °C, which ensured repeatable thermal behavior and minimized environmental drift.
Because no publicly available dataset provides physically paired RGB–IR–EL image triplets of the same panel, the IR samples used in this study do not correspond to the same physical modules as those in the RGB and EL datasets. Instead, the IR dataset serves as an independent thermal modality representing realistic hotspot patterns commonly observed in defective PV modules. To enable multimodal integration, IR images were geometrically normalized to 256 × 256 pixels and co-registered to the RGB reference frame using the same homography + ECC alignment procedure used for EL–RGB pairs. This produces a conceptual spatial correspondence that allows Mask R-CNN to learn cross-modal feature relationships, while acknowledging that the alignment is synthetic rather than physically exact.
This IR dataset is therefore intended to evaluate the feasibility of incorporating a thermal modality into a fusion-based defect segmentation framework, rather than to reconstruct the exact same module across modalities. The acquisition of physically paired RGB–IR–EL data from a single PV installation remains an open direction for future work and is required for full physical validation of the fusion strategy.
2.1.2 Train–validation–test splitting strategy
To ensure fair evaluation and prevent data leakage across the multimodal fusion pipeline, a strict and reproducible splitting protocol was implemented for all EL, RGB, and IR samples. Since the datasets originate from different sources and the multimodal pairs are synthetically constructed, we adopted a grouped, defect-aware splitting strategy that assigns all samples derived from the same original instance to a single partition. This prevents augmented variants or modality counterparts (EL, RGB, IR) of the same synthetic triplet from appearing across different splits.
We used a 70/15/15 division for training, validation, and testing, respectively. For the ELPV dataset, images were grouped by defect category (micro-crack, finger interruption, black core) to avoid category imbalance and to ensure that test performance reflects generalization across defect types rather than memorization. For the RGB dataset, grouping was performed based on module identifier metadata provided with the Kaggle annotations. The IR dataset, collected independently under controlled laboratory conditions, was similarly partitioned into non-overlapping subsets based on module identity during acquisition.
All synthetic multimodal triplets (RGB–EL–IR) created for fusion were treated as indivisible units: each triplet was assigned entirely to the train, validation, or test set to ensure that no modality representation of the same conceptual sample could leak into another partition.
Data augmentation—including random flips, rotations, brightness jitter, and histogram perturbation—was applied only after the dataset was split, and exclusively to the training set. Neither validation nor test sets were augmented. This prevents inflated metrics caused by augmented variants appearing in multiple splits. Across all three modalities, the final distribution consisted of 1,260 training samples, 270 validation samples, and 270 test samples after fusion and filtering as illustrated in Table 2. To assess stability, all experiments were repeated with five different random seeds, and performance metrics are reported as the mean ± standard deviation over these runs.
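A sketch of this grouped splitting logic is shown below, assuming each synthetic RGB–EL–IR triplet carries a shared group identifier (the `group_ids` array is our illustrative stand-in for that metadata); scikit-learn's GroupShuffleSplit keeps all members of a group in one partition:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

def grouped_split(indices, group_ids, seed=0):
    """70/15/15 split that never separates samples sharing a group id."""
    # First carve out 70% of groups for training.
    gss = GroupShuffleSplit(n_splits=1, train_size=0.70, random_state=seed)
    train_idx, rest_idx = next(gss.split(indices, groups=group_ids))

    # Then split the remaining 30% of groups evenly into val and test.
    rest_groups = np.asarray(group_ids)[rest_idx]
    gss2 = GroupShuffleSplit(n_splits=1, train_size=0.50, random_state=seed)
    val_rel, test_rel = next(gss2.split(rest_idx, groups=rest_groups))
    return train_idx, rest_idx[val_rel], rest_idx[test_rel]
```

Consistent with the protocol above, augmentation would then be applied only to the returned training indices.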
2.2 Modality alignment and registration strategy
To ensure accurate pixel-level correspondence across RGB, infrared, and electroluminescence modalities, homography transformations were computed using SIFT keypoint detection and RANSAC-based robust estimation (de Gioia et al., 2020). However, due to variable environmental conditions and slight misalignments in sensor calibration or UAV positioning, the registration process was subject to geometric uncertainty. To quantify and mitigate these effects, we introduced an alignment error tolerance threshold of 2 pixels based on average inter-keypoint residuals.
Once the homography is computed, the IR and EL images are warped to align with the RGB reference frame at a standardized resolution of 256 × 256 pixels. During this process, we assess alignment quality using the average reprojection error between matched keypoints. A threshold of 2.0 pixels is used to identify misaligned samples. If the reprojection error exceeds this threshold, a secondary refinement step is triggered using the Enhanced Correlation Coefficient (ECC) optimization method (Figure 6), which fine-tunes the alignment by maximizing intensity correlation between modalities as shown in Table 3. To further safeguard against residual misalignment, a spatial confidence map is computed for each pixel based on local homography consistency. These maps are later used during training to modulate the segmentation loss, effectively down-weighting regions with high geometric uncertainty. This multi-tiered registration strategy not only improves pixel-wise fusion integrity but also enhances the robustness of downstream instance segmentation.
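The following OpenCV sketch illustrates this two-tier strategy; only the 2.0 px gate comes from the text, while the affine motion model and termination criteria are illustrative assumptions:

```python
import cv2
import numpy as np

def refine_if_needed(moving, reference, reproj_error, tol=2.0):
    """Apply ECC refinement when homography reprojection error exceeds tol."""
    if reproj_error <= tol:
        return moving  # homography alone met the 2.0 px gate

    # ECC maximizes the intensity correlation between the two single-channel
    # images; an affine model is assumed here for the residual correction.
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1e-6)
    _, warp = cv2.findTransformECC(reference, moving, warp,
                                   cv2.MOTION_AFFINE, criteria)
    h, w = reference.shape[:2]
    # WARP_INVERSE_MAP maps the moving image back into the reference frame.
    return cv2.warpAffine(moving, warp, (w, h),
                          flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
```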
Figure 6. ECC image alignment method adapted from (Lin et al., 2022).
2.3 Mask R-CNN
Mask R-CNN extends Faster R-CNN by adding a parallel mask-prediction branch that performs pixel-level segmentation for each detected object (Li et al., 2019). Given an input image (or fused multimodal map) $X$, the network is trained to minimize the standard multi-task loss

$$L = L_{cls} + L_{box} + L_{mask},$$

where $L_{cls}$ is the classification loss, $L_{box}$ the bounding-box regression loss, and $L_{mask}$ the average binary cross-entropy loss over the predicted per-pixel masks.
To adapt Mask R-CNN for multimodal PV defect detection, we introduced a learnable feature-fusion layer before the backbone that combines modality-specific feature maps $F_{RGB}$, $F_{IR}$, and $F_{EL}$ into a fused representation

$$F_{fused} = \sum_{m \in \{RGB,\, IR,\, EL\}} w_m F_m,$$

where the modality weights $w_m$ are learned end-to-end and further refined by HFOA.
Additionally, we modified the RPN anchor scales and aspect ratios to better match the small spatial footprint of PV defects such as thin cracks and localized hotspots; the final values selected by HFOA are listed in Table 4.
Within our pipeline, Mask R-CNN serves as the core defect-segmentation engine, delivering instance-level masks for each detected crack, hotspot, or soiling region, together with the associated segmentation loss and validation mAP, which feed into our compound objective (Equation 8) evaluated by HFOA during hyperparameter search.
2.3.1 Multimodal fusion layer design in mask R-CNN
To integrate RGB, EL, and IR modalities within Mask R-CNN, we employ a mid-level feature fusion strategy positioned between the backbone and the Feature Pyramid Network (FPN). Each modality is first processed through its own lightweight ResNet-50 backbone to extract modality-specific feature maps at multiple scales. These feature maps are then aligned spatially (Section 2.2) and passed to a dedicated Fusion Attention Block that computes cross-modal feature interactions.
The fusion layer operates by concatenating modality-specific features channel-wise and applying a learnable attention mechanism that produces modality weights through a squeeze-and-excitation operation. These weights are learned end-to-end by backpropagation along with all Mask R-CNN parameters, enabling the network to adaptively emphasize the modality that provides the most discriminative information for each image region. The fused feature maps are subsequently fed into the standard FPN, Region Proposal Network (RPN), and Mask Head of Mask R-CNN.
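A minimal PyTorch sketch of this design is given below; the channel counts, reduction ratio, and the 1 × 1 projection back to the FPN input width are illustrative assumptions rather than the exact configuration of our network:

```python
import torch
import torch.nn as nn

class FusionAttentionBlock(nn.Module):
    """Channel-wise concat of three modalities + SE-style modality weighting."""
    def __init__(self, channels: int = 256, n_modalities: int = 3,
                 reduction: int = 16):
        super().__init__()
        total = channels * n_modalities
        # Squeeze: global average pool; excite: bottleneck MLP + sigmoid.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.excite = nn.Sequential(
            nn.Linear(total, total // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(total // reduction, total),
            nn.Sigmoid(),
        )
        self.project = nn.Conv2d(total, channels, kernel_size=1)

    def forward(self, f_rgb, f_el, f_ir):
        x = torch.cat([f_rgb, f_el, f_ir], dim=1)    # N x 3C x H x W
        w = self.excite(self.pool(x).flatten(1))     # learned channel weights
        x = x * w.unsqueeze(-1).unsqueeze(-1)        # re-weight modalities
        return self.project(x)                       # back to C channels for FPN
```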
All modalities are normalized independently prior to feature extraction to preserve statistical integrity. RGB images follow ImageNet normalization, while EL and IR channels are normalized using per-dataset mean and variance. No joint normalization is applied, preventing cross-modality statistical leakage. This design ensures that EL contributes fine-grained crack visibility, RGB provides structural context, and IR highlights thermal anomalies, all of which reinforce the segmentation performance of the final network.
Figure 7 illustrates the overall multimodal architecture used to integrate RGB, EL, and IR information within the Mask R-CNN framework. Each modality is first passed through its own lightweight backbone network to extract modality-specific feature maps that capture complementary characteristics: structural context from RGB, subsurface crack patterns from EL, and thermal anomalies from IR. The outputs of these backbones are then fed into a dedicated Fusion Attention Block, which performs feature-level integration by learning adaptive weights that determine how much each modality contributes at every spatial location. This fused representation is subsequently forwarded to the Feature Pyramid Network (FPN), enabling multiscale reasoning before the Region Proposal Network (RPN) and segmentation heads generate the final instance masks. By separating backbone extraction per modality and performing learnable fusion prior to FPN processing, the architecture ensures that the network benefits from modality complementarity while preserving spatial alignment and maintaining compatibility with the standard Mask R-CNN pipeline.
Table 4 enumerates every hyper-parameter that governs the behaviour of Mask R-CNN in the proposed pipeline, grouping them by functional block—backbone, optimiser, training schedule, region-proposal network (RPN), ROI heads, and inference thresholds. For each parameter the final value chosen after HawkFish optimization is shown alongside the search space explored by the algorithm, giving readers the context needed to replicate or further tune the model. Critical design choices—such as adopting ResNet-101 for richer spatial features, using a conservative batch size constrained by the four-channel input, and weighting the mask loss more heavily than the box loss are justified in the “Rationale” column, linking numerical settings to the physical characteristics of photovoltaic defects (thin cracks, small hotspots, high-resolution EL speckles). Presenting the configuration in this structured form not only enhances transparency but also enables straightforward comparison with alternative segmentation baselines.
2.4 HawkFish optimization
The HawkFish Optimization Algorithm (HFOA), proposed by Alkharsan and Ata (2025), emulates the cooperative foraging behavior of hawkfish in a multi-dimensional search space. HFOA is used in this work due to its strong balance between exploration and exploitation, making it well suited for tuning complex, multimodal segmentation models. Unlike traditional optimizers or gradient-free heuristics, HFOA combines adaptive movement patterns with selective intensification, allowing it to efficiently search high-dimensional hyperparameter spaces that govern fusion weights, backbone depth, learning rate, and attention coefficients. This flexibility enables HFOA to avoid premature convergence (a common limitation of PSO or GA) while maintaining stable convergence toward configurations that improve Mask R-CNN performance across heterogeneous RGB, IR, and EL inputs. These characteristics make HFOA particularly appropriate for applications where cross-modal interactions are nonlinear and sensitive to hyperparameter choices.
A population of $N$ hawkfish agents is initialized randomly within the bounded search space. At each iteration $t$, every agent updates its position according to

$$x_i^{t+1} = x_i^t + r_1\,(p_i - x_i^t) + r_2\,(g - x_i^t),$$

where $x_i^t$ is the current position of agent $i$, $p_i$ its personal best position, $g$ the global best position found by the population, and $r_1, r_2 \sim \mathcal{U}(0, 1)$ are random coefficients controlling the strength of each attraction.
This dual-attractor mechanism, pulling agents toward both personal memory and the global best, enables local refinement as well as convergence to global optima.
To tailor HFOA for multimodal Mask R-CNN tuning, we introduced three key modifications. First, we incorporated Lévy-flight steps of the form

$$x_i^{t+1} = x_i^t + \alpha \oplus \mathrm{L\acute{e}vy}(\lambda),$$

where $\alpha$ is a step-size scaling factor and $\mathrm{L\acute{e}vy}(\lambda)$ draws heavy-tailed random increments that produce occasional long exploratory jumps. Second, an energy-aware attraction rule tempers these jumps as agents approach high-fitness regions, stabilizing exploitation. Third, memory crowding preserves population diversity, discouraging agents from collapsing onto a single basin and thereby reducing premature convergence.
In our defect-detection framework, HFOA serves as the automatic hyperparameter and fusion-weight tuner. We define a compound objective in Equation 8 as follows:
$$f(\theta) = \lambda_1 \bigl(1 - \mathrm{mAP}(\theta)\bigr) + \lambda_2\, L_{seg}(\theta), \tag{8}$$

where $\mathrm{mAP}(\theta)$ is the validation mean average precision achieved by a surrogate Mask R-CNN trained with design vector $\theta$, $L_{seg}(\theta)$ is the corresponding segmentation loss, and $\lambda_1, \lambda_2$ are weighting coefficients balancing detection accuracy against mask quality; HFOA minimizes $f(\theta)$ over the bounded search space.
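For illustration, a compact numpy sketch of such an HFOA-style search loop is shown below. The `evaluate` callable stands in for training a lightweight Mask R-CNN surrogate and returning the compound objective f(θ); the jump probability, step scaling, and Lévy exponent are illustrative assumptions, and the published HFOA includes further mechanisms (energy-aware attraction, memory crowding) omitted here for brevity:

```python
import numpy as np
from math import gamma, sin, pi

def levy_step(dim, beta=1.5, rng=None):
    """Heavy-tailed Lévy-distributed step via Mantegna's algorithm."""
    if rng is None:
        rng = np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def hfoa_style_search(evaluate, lower, upper, n_agents=30, n_iter=40, seed=0):
    """Minimize evaluate(theta) with dual attraction plus Lévy jumps."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    pos = rng.uniform(lower, upper, (n_agents, dim))
    fit = np.array([evaluate(p) for p in pos])
    pbest, pbest_fit = pos.copy(), fit.copy()
    best = fit.argmin()
    gbest, gbest_fit = pos[best].copy(), fit[best]

    for _ in range(n_iter):
        for i in range(n_agents):
            r1, r2 = rng.random(dim), rng.random(dim)
            # Dual attraction: personal memory and global best.
            step = r1 * (pbest[i] - pos[i]) + r2 * (gbest - pos[i])
            if rng.random() < 0.2:  # occasional exploratory Lévy jump
                step += 0.01 * levy_step(dim, rng=rng) * (upper - lower)
            pos[i] = np.clip(pos[i] + step, lower, upper)
            f = evaluate(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), f
            if f < gbest_fit:
                gbest, gbest_fit = pos[i].copy(), f
    return gbest, gbest_fit
```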
Table 5 condenses the nine knobs that most strongly shape HFOA’s search behaviour, showing how each was tuned and the bounds explored during calibration. A population of 30 candidate solutions (N = 30) was evolved for a maximum of 40 iterations, matching the budget used in the convergence experiments of Section 3.3.
3 Results
This section presents a comprehensive evaluation of the proposed multimodal PV defect detection framework, integrating RGB, infrared, and electroluminescence imagery with a Mask R-CNN segmentation backbone and HawkFish Optimization Algorithm (HFOA) for hyperparameter tuning. We assess the system’s performance across multiple dimensions, including segmentation accuracy, localization precision, alignment robustness, and computational efficiency. Both qualitative and quantitative results are reported to demonstrate the effectiveness of modality fusion and automated optimization in enhancing defect detection. Experiments were conducted on a combined dataset comprising manually aligned EL and RGB images from the ELPV and NeurobotData repositories, supplemented with IR imagery captured in controlled settings. The results compare the proposed method against single-modality baselines and evaluate improvements gained from the fusion strategy, parameter optimization, and uncertainty-aware alignment. Metrics such as mean Average Precision (mAP), Intersection-over-Union (IoU), and F1 score are used to benchmark segmentation quality, while runtime and memory usage are reported to assess deployability in real-world scenarios.
3.1 Segmentation performance metrics
Figure 8 illustrates the effectiveness of the proposed Mask R-CNN pipeline enhanced by multimodal fusion and HawkFish Optimization Algorithm (HFOA) by comparing key performance metrics across five configurations. The results highlight how each design decision—modal fusion and hyperparameter tuning—contributes to overall detection accuracy. The first three bars represent single-modality baselines: RGB, EL, and IR inputs alone. Among them, EL images yield slightly better scores due to their ability to capture micro-cracks invisible in the RGB or IR spectra.
Figure 8. Comparative performance of different input and optimization Configurations in PV defect detection.
However, all three modalities individually underperform when compared to fused configurations. The fourth bar shows the performance when RGB and EL images are fused, but without HFOA. This improves mean Average Precision (mAP) and F1-score notably, indicating that multimodal input provides complementary cues for defect localization. Still, without automated optimization, the segmentation remains suboptimal due to manually tuned hyperparameters.
The final configuration (RGB + IR + EL with HFOA) demonstrates the best performance across all metrics: mAP (0.89), IoU (0.86), and F1-score (0.90), as shown in Table 6. This underscores the synergistic benefit of combining all three modalities along with data-driven optimization of the segmentation model, yielding a robust, high-precision pipeline for real-world PV fault detection.
3.2 Ablation study
Figure 9 and Table 7 present an ablation study to quantify the individual impact of critical components in the proposed multimodal PV defect detection pipeline. The full model includes three synergistic enhancements: multimodal fusion (RGB + IR + EL), HawkFish Optimization Algorithm (HFOA), and a spatial attention mechanism integrated into the Mask R-CNN architecture.
When any of these components is removed, the overall performance degrades, demonstrating their individual importance. Excluding HFOA (second bar) leads to a noticeable drop in all metrics—mAP falls from 0.89 to 0.84—highlighting the crucial role of automated hyperparameter tuning. Removing the EL modality results in a sharper decline (mAP = 0.81), confirming its value in capturing subsurface defects like micro-cracks that are invisible to RGB and IR.
This ablation analysis affirms that each module significantly contributes to the overall segmentation quality and robustness, with the full configuration achieving the highest precision and recall for real-world PV defect scenarios.
3.3 Convergence behavior of HFOA
Figure 10 illustrates the convergence trajectories of the HawkFish Optimization Algorithm (HFOA) against three benchmark optimizers (Particle Swarm Optimization (PSO) (Kumar et al., 2025), Genetic Algorithm (GA) (Jlifi et al., 2025), and Random Search (Yu et al., 2024)) over 40 iterations. The y-axis represents the normalized fitness score, defined as 1 − segmentation loss, which captures both accuracy and generalization in the Mask R-CNN tuning process. As shown, HFOA consistently achieves faster and smoother convergence. It reaches a near-optimal fitness plateau by iteration 28, while PSO and GA require significantly more iterations to approach similar values. Random Search exhibits erratic behavior with the lowest final fitness, underscoring its inefficiency for high-dimensional hyperparameter spaces. The superior performance of HFOA is attributed to its hybrid strategy combining Lévy-flight-based exploration, energy-aware attraction dynamics, and diversity preservation via memory crowding. These mechanisms prevent premature convergence and promote robust exploration-exploitation trade-offs.
To ensure that the observed advantage of the HawkFish Optimization Algorithm (HFOA) over Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and Random Search is not due to a single favorable run, we conducted a systematic benchmark across multiple independent trials. Each optimizer was executed five times with different random seeds, using the same hyperparameter search space, population size, and maximum iteration budget. Specifically, HFOA, PSO, and GA were all configured with a population size of 30 and a maximum of 40 iterations, while Random Search was allotted an equivalent total number of fitness evaluations.
Table 8 summarizes the optimization results in terms of the best validation mAP achieved, the number of iterations required to reach 95% of the final best mAP (convergence speed), and the average wall-clock time per run on a single GPU. Results are reported as mean ± standard deviation over the five runs. HFOA consistently attains the highest validation mAP and reaches near-optimal performance in fewer iterations than PSO and GA, while maintaining a comparable runtime per run. Random Search, despite having the same evaluation budget, exhibits the lowest final mAP and the largest variability across runs, confirming its inefficiency in this high-dimensional hyperparameter space. A paired t-test between HFOA and the best-performing baseline (PSO) shows that the mAP improvement is statistically significant (p < 0.01). These findings support the conclusion that the performance gains observed in Figure 10 originate from the optimization strategy itself rather than from an unfair computational advantage or an isolated favorable run.
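The significance test reported above can be reproduced with a few lines of SciPy; the per-seed mAP values below are placeholders, not the actual experimental numbers:

```python
from scipy import stats

# Placeholder per-seed best validation mAP values (five seeds each).
hfoa_map = [0.89, 0.88, 0.90, 0.89, 0.88]
pso_map = [0.86, 0.85, 0.87, 0.86, 0.85]

# Paired t-test: the same seeds and search budget are used for both optimizers.
t_stat, p_value = stats.ttest_rel(hfoa_map, pso_map)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # significant if p < 0.01
```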
Table 8. Statistical comparison of HFOA and baseline optimizers for Mask R-CNN hyperparameter tuning.
3.4 Cross-modality alignment quality
Figure 11 quantifies the effectiveness of the alignment refinement stage within the proposed multimodal fusion pipeline. The mean reprojection error—calculated between matched keypoints across modalities—is used to assess geometric misalignment before and after applying the Enhanced Correlation Coefficient (ECC) refinement step. Prior to refinement, the average reprojection error across RGB–IR and RGB–EL image pairs was 2.85 pixels, with a standard deviation of 0.94 pixels, occasionally exceeding the 3-pixel tolerance threshold.
This misalignment, if uncorrected, could lead to defective fusion and degraded segmentation performance due to inconsistent spatial features. After ECC-based alignment correction, the mean error decreased to 1.57 pixels, with lower variability (standard deviation 0.61 pixels). Table 9 summarizes the effectiveness of the ECC refinement step applied to the RGB–EL modality pairs. The mean reprojection error decreases from 2.85 px to 1.57 px after refinement, indicating a substantial improvement in geometric consistency between modalities. The reduction in standard deviation (from 0.94 px to 0.61 px) further suggests that ECC not only improves average alignment but also stabilizes the alignment quality across samples.
The metric “Samples Refined (%)” corresponds to the proportion of all RGB–EL pairs in the dataset for which ECC successfully reduced the reprojection error, computed as (number of improved samples ÷ total number of samples) × 100. Thus, the reported value of 21.80% reflects the percentage of the full dataset that exhibited measurable improvement after ECC refinement. Samples already well aligned by homography showed negligible change and are therefore not counted in this percentage.
This substantial improvement confirms that the refinement stage successfully resolves keypoint mismatches and intensity shifts, particularly in field-collected datasets where sensor jitter or perspective distortions are common. The reduced geometric error ensures that surface-level cues from RGB and IR modalities are well-aligned with sub-surface patterns in EL imagery, enhancing the integrity of the fused input tensor and ultimately leading to more accurate defect segmentation.
Table 10 compares the performance of the proposed multimodal Mask R-CNN framework against widely used segmentation models. Traditional U-Net and DeepLabV3+ show solid performance on PV defect imagery, with DeepLabV3+ outperforming U-Net due to its stronger multiscale feature extraction. SegFormer-B2 achieves the highest performance among the baselines, reflecting the capacity of transformer-based architectures to capture global context. The proposed method delivers competitive or superior accuracy, achieving an IoU of 86% and a Boundary F1 of 89.4%, which demonstrates its strength in capturing crack edges and fine structural details.
4 Discussion
The results presented in Section 3 demonstrate the efficacy and robustness of the proposed multimodal defect detection framework for photovoltaic (PV) modules. By integrating RGB, infrared, and electroluminescence (EL) modalities into a unified segmentation architecture—enhanced through the HawkFish Optimization Algorithm (HFOA)—the system consistently outperforms baseline models in both quantitative metrics and qualitative precision. The observed improvements across multiple performance indicators underscore the value of each component within the pipeline. The proposed method achieved the highest overall accuracy (0.978), F1-score (0.93), and recall (0.95) when compared to recent state-of-the-art approaches such as Venkatesh et al. (2022), Munawer Al-Otum (2023), Chen et al. (2022), and Wang X. et al. (2023). These results are not only statistically superior but also operationally significant, as higher recall directly translates to fewer missed defects—an essential feature for field-level deployment where undetected cracks or soiling could result in long-term energy losses. The low mean squared error (MSE = 0.021) further indicates that the model generalizes well and maintains high pixel-level fidelity when generating defect masks. Ablation studies confirm that each design choice contributes meaningfully to performance.
Removing the EL or IR modality from the input tensor led to significant reductions in mAP and F1-score, validating the hypothesis that multimodal fusion captures complementary visual and thermal signatures of defects that single modalities fail to isolate. Likewise, disabling the spatial attention module impaired segmentation sharpness, particularly for small-scale cracks and microstructural anomalies.
Most notably, bypassing HFOA in favor of manual hyperparameter tuning resulted in a measurable drop in detection quality, underscoring the optimizer’s role in fine-tuning both fusion weights and network architecture. The convergence analysis further highlights the advantage of using HFOA over traditional metaheuristics. While Genetic Algorithms and Particle Swarm Optimization displayed moderate convergence trends, HFOA reached stability more rapidly and consistently, owing to its dual-attractor memory mechanism and Lévy-flight–based exploration strategy. This not only enhanced the final model’s accuracy but also reduced training cycles, which is beneficial for computational efficiency and scalability. From a visual standpoint, qualitative overlays between RGB inputs and predicted segmentation masks confirmed precise localization of defect regions. The model successfully identified complex and overlapping anomalies—such as soiling adjacent to thermal hotspots—without producing redundant or fragmented masks. This spatial accuracy is critical in scenarios involving automated maintenance scheduling or UAV-based inspections, where actionable insights must be extracted from a single pass.
Figure 12 and Table 11 compare the proposed multimodal fusion framework, optimized using the HawkFish Optimization Algorithm (HFOA), against four state-of-the-art studies: Venkatesh et al. (2022), Munawer Al-Otum (2023), Chen et al. (2022), and Wang X. et al. (2023). Key evaluation metrics include Accuracy, Mean Squared Error (MSE), F1-Score, Recall, and Precision. Among the benchmarks, Venkatesh et al. (2022) achieve the highest accuracy (0.963) using a deep ensemble learning network on PV module images, while Munawer Al-Otum (2023) reports solid F1-scores via a deep learning-based automated defect classification system in EL images. Chen et al. (2022) delivers competitive results using a bidirectional-path feature pyramid attention detector, and Wang X. et al. (2023) demonstrates strong precision and real-time capability with the BL-YOLOv8 model for defect detection.
Figure 12. Comparative performance of the proposed method against prior works in PV defect detection.
However, the proposed method outperforms all baselines across the board. It achieves the highest accuracy (0.978), lowest MSE (0.021), and top-tier F1-score (0.93), recall (0.95), and precision (0.96). These improvements are attributed to the synergy of multimodal fusion (RGB, IR, EL), instance-level segmentation via Mask R-CNN, and the automated hyperparameter tuning by HFOA, which collectively enhance both localization precision and class-wise reliability.
Moreover, the robustness of the proposed alignment strategy was demonstrated through alignment error analysis. Using SIFT-based homography followed by ECC refinement yielded a consistent reduction in reprojection error, enhancing the integrity of multimodal fusion.
Figure 13 presents qualitative examples illustrating the effect of ECC refinement on RGB–EL alignment for both a typical and a challenging sample. In the first row (“Good/typical case”), the EL image warped using homography alone exhibits noticeable misalignment with the RGB reference: cell boundaries, busbars, and crack contours appear shifted relative to their corresponding structures in the RGB image. After ECC refinement, these features become significantly better aligned, demonstrating improved geometric correspondence consistent with the quantitative reprojection error reduction (from 2.85 px to 1.57 px). The second row (“Difficult case”) shows a sample with weak texture and large defect regions, where homography-only alignment produces substantial mismatch. ECC refinement still reduces the misalignment but cannot fully correct all geometric inconsistencies (an expected limitation when intensity patterns are not strongly correlated across modalities).
The most directly comparable work is Lai et al. (2025), which combines RGB, IR, and EL images but relies on simple feature concatenation and performs only defect classification rather than pixel-level segmentation. Their method does not incorporate geometric alignment or adaptive weighting between modalities, limiting its ability to handle cross-sensor variability. Reference (Lin et al., 2022) demonstrates the effectiveness of infrared–visible fusion in a different domain (ADAS), showing the value of thermal information but lacking EL data and PV-specific considerations. In contrast, the proposed method introduces a complete multimodal segmentation pipeline with ECC alignment, a dedicated Fusion Attention Block, and HFOA-driven optimization. This integration enables better exploitation of structural (RGB), thermal (IR), and subsurface (EL) cues, resulting in more accurate and robust PV defect localization than previous multimodal approaches as summarized in Table 12:
The integration of uncertainty masks into the training process further improved segmentation reliability by down-weighting ambiguous regions during learning. This pipeline-level resilience ensures that the system remains effective even under real-world environmental variability, such as wind-induced vibration, sensor drift, or inconsistent lighting conditions (Almukhtar, 2025).
However, despite its strong performance, the proposed method is not without limitations. First, the alignment pipeline assumes relatively planar module surfaces; extreme geometric distortions or curved surfaces may violate the homography assumption, leading to residual misalignment. Second, while HFOA optimizes hyperparameters effectively, its performance is still dependent on the initial population diversity and the quality of surrogate training during fitness evaluation. In rare cases, local optima may still trap the search process. Third, the model requires sufficient annotated training data across all modalities. In real deployments, acquiring aligned RGB, IR, and EL datasets with pixel-level ground truth can be logistically complex and labor-intensive. Finally, although the method generalizes well across the tested datasets, its robustness in highly dynamic environments (such as real-time UAV deployment under extreme weather) has not been exhaustively validated.
5 Conclusions and recommendations
This study introduced a multimodal PV defect segmentation framework that fuses RGB, infrared (IR), and electroluminescence (EL) imagery through a feature-level Fusion Attention Block embedded in a modified Mask R-CNN architecture. A robust alignment pipeline combining homography and ECC refinement ensured geometric consistency across modalities, while hyperparameters and fusion behavior were automatically tuned using the HawkFish Optimization Algorithm (HFOA). The proposed method achieved strong performance, with 86.0% IoU, 90.0% Dice, and 89.4% Boundary F1, showing clear improvements over U-Net (80.7% IoU) and performing competitively with DeepLabV3+ (86.5% IoU) and SegFormer-B2. Ablation studies further demonstrated that multimodal fusion contributed 4%–6% gains in IoU, while HFOA provided an additional 2%–3% improvement through optimized fusion and network parameters.
For future work, several practical extensions are envisioned. Real-time deployment will be explored by optimizing the inference pipeline and integrating lighter backbones for onboard processing. Hardware-level integration, such as embedding the model into portable inspection devices or thermal-RGB camera systems, represents another key direction. Additionally, drone-based multimodal inspection platforms offer significant potential for large-scale PV farm monitoring, enabling automated, high-coverage defect scanning in the field.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.
Author contributions
NA: Conceptualization, Data curation, Formal Analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review and editing. SK: Project administration, Supervision, Writing – original draft, Writing – review and editing.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was used in the creation of this manuscript. For this work, both QuillBot and Grammarly were utilized for proofreading; both tools have AI capabilities.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Akram, M. W., Li, G. Q., Jin, Y., and Chen, X. (2021). Failures of photovoltaic modules and their detection: a review. Appl. Energy 313, 118822. doi:10.1016/j.apenergy.2022.118822
Alkharsan, A., and Ata, O. (2025). HawkFish optimization algorithm: a gender-bending approach for solving complex optimization problems. Electronics 14, 611. doi:10.3390/electronics14030611
Almukhtar, N. (2025). MRCNN-for-PV-segmentation. GitHub Repository. Available online at: https://github.com/AsharfNadir88/MRCNN-for-PV-segemntation (Accessed on December 6, 2025).
Animashaun, D., and Hussain, M. (2023). Automated micro-crack detection within photovoltaic manufacturing facility via ground modelling for a regularized convolutional network. Sensors 23, 6235. doi:10.3390/s23136235
Chen, H. Y., Zhao, P., and Yan, H. W. (2021). Crack detection based on multi-scale faster RCNN with attention. Opto-Electron. Eng. 48, 200112. doi:10.12086/oee.2021.200112
Chen, H., Song, M., Zhang, Z., and Liu, K. (2022). Detection of surface defects in solar cells by bidirectional-path feature pyramid group-wise attention detector. IEEE Trans. Instrum. Meas. 71, 1–9. doi:10.1109/tim.2022.3218111
Chen, X., Karin, T., Libby, C., Deceglie, M., Hacke, P., Silverman, T. J., et al. (2023). Automatic crack segmentation and feature extraction in electroluminescence images of solar modules. IEEE J. Photovolt. 13, 334–342. doi:10.1109/JPHOTOV.2023.3249970
Chen, L., Yao, H., Fu, J., and Ng, C. T. (2023). The classification and localization of crack using lightweight convolutional neural network with CBAM. Eng. Struct. 275, 115291. doi:10.1016/j.engstruct.2022.115291
de Gioia, F., Meoni, G., Giuffrida, G., Donati, M., and Fanucci, L. (2020). A robust RANSAC-based planet radius estimation for onboard visual based navigation. Sensors 20, 4041. doi:10.3390/s20144041
Demir, A., and Necati, A. (2024). Defect detection in solar panels using a customized 2D CNN: a study on the ELPV dataset.
El Yanboiy, N. (2024). “Enhancing the reliability and efficiency of solar systems through fault detection in solar cells using electroluminescence (EL) images and YOLO version 5.0 algorithm,” in Sustainable and green technologies for water and environmental management (Springer), 35–43.
Hassan, S., and Dhimish, M. (2023). Dual spin max-pooling convolutional neural network for solar cell crack detection. Sci. Rep. 13, 11099. doi:10.1038/s41598-023-38177-8
He, B., Lu, H., Zheng, C., and Wang, Y. (2022). Characteristics and cleaning methods of dust deposition on solar photovoltaic modules: a review. Energy 263, 126083. doi:10.1016/j.energy.2022.126083
Jia, Y., Chen, G., and Zhao, L. (2024). Defect detection of photovoltaic modules based on improved VarifocalNet. Sci. Rep. 14, 15170. doi:10.1038/s41598-024-66234-3
Jlifi, B., Ferjani, S., and Duvallet, C. (2025). A genetic algorithm based three HyperParameter optimization of deep long short term memory (GA3P-DLSTM) for predicting electric vehicles energy consumption. Comput. Electr. Eng. 123 (Part C), 110185. doi:10.1016/j.compeleceng.2025.110185
Kumar, N., Raji, J., Sridevi, S., Irfan, M. M., Rajeshwari, R., and Inbamani, A. (2025). “A PSO tuned CNN approach for accurate fault detection in PV grid systems,” in 2025 IEEE 14th International Conference on Communication Systems and Network Technologies (CSNT), Bhopal, India, 1257–1262. doi:10.1109/CSNT64827.2025.10968346
Lai, Y.-S., Hsieh, C.-C., Liao, T.-W., Huang, C.-Y., and Kuo, C.-F. J. (2025). Deep learning-based automatic defect detection of photovoltaic modules in infrared, electroluminescence, and red–green–blue images. Energy Convers. Manag. 332, 119783. doi:10.1016/j.enconman.2025.119783
Li, X. X., Yang, Q., Lou, Z., and Yan, W. J. (2019). Deep learning based module defect analysis for large-scale photovoltaic farms. IEEE Trans. Energy Convers. 34, 520–529. doi:10.1109/TEC.2018.2873358
Libra, M., Mrázek, D., Tyukhov, I., Severová, L., Poulek, V., Mach, J., et al. (2023). Reduced real lifetime of PV panels – economic consequences. Sol. Energy 259, 229–234. doi:10.1016/j.solener.2023.04.063
Lin, Y.-C., Chiang, P.-Y., and Miaou, S.-G. (2022). Enhancing deep-learning object detection performance based on fusion of infrared and visible images in advanced driver assistance systems. IEEE Access 10, 105214–105231. doi:10.1109/ACCESS.2022.3211267
Liu, X., Goh, H. H., Xie, H., He, T., Yew, W. K., Zhang, D., et al. (2025). ResGRU: a novel hybrid deep learning model for compound fault diagnosis in photovoltaic arrays considering dust impact. Sensors 25, 1035. doi:10.3390/s25041035
Lu, S., Wu, K., and Chen, J. (2023). Solar cell surface defect detection based on optimized YOLOv5. IEEE Access 11, 1. doi:10.1109/ACCESS.2023.3294344
Mathew, D., Ram, J. P., and Kim, Y.-J. (2023). Unveiling the distorted irradiation effect (shade) in photovoltaic (PV) power conversion – a critical review on causes, types, and its minimization methods. Sol. Energy 266, 112141. doi:10.1016/j.solener.2023.112141
Mohamed, A., Nacera, Y., Ahcene, B., Teta, A., Belabbaci, E. O., Rabehi, A., et al. (2025). Optimized YOLO based model for photovoltaic defect detection in electroluminescence images. Sci. Rep. 15, 32955. doi:10.1038/s41598-025-13956-7
Munawer Al-Otum, H. (2023). Deep learning-based automated defect classification in electroluminescence images of solar panels. Adv. Eng. Inf. 58, 102147. doi:10.1016/j.aei.2023.102147
Neurobotdata (2021). Photovoltaic panel defect dataset. Available online at: https://www.kaggle.com/datasets/neurobotdata/photovoltaic-panel-defect-dataset (Accessed on August 13, 2025).
Patel, A. V., McLauchlan, L., and Mehrubeoglu, M. (2020). “Defect detection in PV arrays using image processing,” in 2020 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 1653–1657. doi:10.1109/CSCI51800.2020.00304
Rabaia, M. K. H., Abdelkareem, M. A., Sayed, E. T., Elsaid, K., Chae, K. J., Wilberforce, T., et al. (2021). Environmental impacts of solar energy systems: a review. Sci. Total Environ. 754, 141989. doi:10.1016/j.scitotenv.2020.141989
Singh, O. D., Gupta, S., and Dora, S. (2023). Segmentation technique for the detection of micro cracks in solar cell using support vector machine. Multimed. Tools Appl. 82, 1–26. doi:10.1007/s11042-023-14509-8
Sodhi, M., Banaszek, L., Magee, C., and Rivero-Hudec, M. (2022). Economic lifetimes of solar panels. Procedia CIRP 105, 782–787. doi:10.1016/j.procir.2022.02.130
Tao, Y. C., Xu, Z. Y., Liu, Q. H., Li, L. H., and Zhang, Y. X. (2021). “Improved faster R-CNN algorithm for defect detection of electromagnetic luminescence,” in Tenth international symposium on precision mechanical measurements. doi:10.1117/12.2617320
Tao, Y., Yu, T., and Yang, J. (2025). Photovoltaic array fault diagnosis and localization method based on modulated photocurrent and machine learning. Sensors 25, 136. doi:10.3390/s25010136
Venkatesh, S., Naveen, N., Jeyavadhanam, B., Sizkouhi, A. M., Esmailifar, S., Aghaei, M., et al. (2022). Automatic detection of visual faults on photovoltaic modules using deep ensemble learning network. Energy Rep. 8, 14382–14395. doi:10.1016/j.egyr.2022.10.427
Wang, J., Bi, L., Sun, P., Jiao, X., Ma, X., Lei, X., et al. (2023). Deep-learning-based automatic detection of photovoltaic cell defects in electroluminescence images. Sensors 23, 297. doi:10.3390/s23010297
Wang, X., Gao, H., Jia, Z., and Li, Z. (2023). BL-YOLOv8: an improved road defect detection model based on YOLOv8. Sensors 23, 8361. doi:10.3390/s23208361
Waqar Akram, M., and Bai, J. (2025). Defect detection in photovoltaic modules based on image-to-image generation and deep learning. Sustain. Energy Technol. Assessments 82, 104441. doi:10.1016/j.seta.2025.104441
Yu, J., Qian, S., and Chen, C. (2024). Lightweight crack automatic detection algorithm based on TF-MobileNet. Appl. Sci. 14, 9004. doi:10.3390/app14199004
Keywords: photovoltaic defect detection, multimodal fusion, mask R-CNN, electroluminescence imaging, infrared thermography, RGB imagery, HawkFish optimization algorithm (HFOA)
Citation: Almukhtar N and Kurnaz S (2026) Multimodal fusions for defect detection of photovoltaic panels by mask R-CNN and hawkfish optimization algorithm. Front. Earth Sci. 13:1702396. doi: 10.3389/feart.2025.1702396
Received: 30 September 2025; Accepted: 12 December 2025;
Published: 23 January 2026.
Edited by:
Pranav Mehta, Dharamsinh Desai University, India
Reviewed by:
Divyang Bohra, Dharamsinh Desai University, India
Jaymin Patel, Dharamsinh Desai University, India
Copyright © 2026 Almukhtar and Kurnaz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Nazar Almukhtar, nazaralmukhtar8@gmail.com, 203720461@ogr.altinbas.edu.tr