
ORIGINAL RESEARCH article

Front. Neurosci., 08 October 2025

Sec. Brain Imaging Methods

Volume 19 - 2025 | https://doi.org/10.3389/fnins.2025.1634606

This article is part of the Research Topic: Advances in brain diseases: leveraging multimodal data and artificial intelligence for diagnosis, prognosis, and treatment

LEM-UNet: an edge-guided network for 3D multimodal images segmentation in focal cortical dysplasia


Qiunan Li1, Hao Yu2, Manli Zhang1, Xiaotong Yuan1, Lixin Cai2*, Guixia Kang1*
  • 1School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China
  • 2Pediatric Epilepsy Center, Peking University First Hospital, Beijing, China

Introduction: Focal cortical dysplasia (FCD) is one of the common causes of refractory epilepsy. The subtle and indistinct edges of FCD lesions pose considerable challenges for accurate lesion localization. Therefore, we propose an edge-guided segmentation network based on the Laplacian pyramid to improve the localization performance of FCD lesions.

Methods: This is a retrospective study evaluated on two independent datasets. The proposed Laplacian Edge Mix UNet (LEM-UNet) builds upon the MedNeXt baseline and incorporates the Laplacian Edge Attention (LEA) block and the Multi-strategy Feature Fusion (MFF) block. The LEA block captures lesion details and edge information during the encoding phase by integrating Laplacian pyramid feature maps with an attention mechanism, while the MFF block fuses edge features with high-level features during the decoding phase.

Results: The model's performance was assessed through 5-fold cross-validation across both Open and Private Datasets, demonstrating superior performance. The average Dice Coefficient achieved 0.452 and 0.597 on the Open and Private Datasets, respectively, representing improvements of 2.40% and 2.90% compared to the baseline model.

Discussion: The results demonstrate the importance of focusing on lesion edges in the FCD segmentation task. The integration of the Laplacian pyramid enhances the model's ability to capture lesions with blurred edges and subtle features. LEM-UNet exhibits significant advantages over current FCD segmentation algorithms. The source code and pretrained model weights are available at https://github.com/simplify403/LEM-UNet.

1 Introduction

Epilepsy is a prevalent chronic neurological disorder characterized by the abrupt, abnormal discharge of brain neurons (Fisher et al., 2005). Most epilepsy patients can control seizures using medications, but one-third of patients develop drug-resistant epilepsy (DRE). Prolonged epileptic seizures can lead to severe consequences such as brain damage, decreased quality of life, and even premature death (Avakyan et al., 2017). Focal cortical dysplasia (FCD), a malformation of cortical development, is the most common cause of DRE (Kabat and Król, 2012). Clinically, an effective treatment for patients with DRE is surgical resection of FCD lesions. About 70% of patients experience seizure relief after surgery (Wagstyl et al., 2022). A favorable outcome is associated with the complete resection of FCD lesions (Willard et al., 2022).

In clinical diagnosis, physicians can identify FCD lesions by analyzing neuroimaging results. Magnetic Resonance Imaging (MRI) is a medical imaging technique that uses the principles of nuclear magnetic resonance to produce detailed images of the internal structures of the human body. On MRI scans, characteristic manifestations of FCD include increased cortical thickness, blurring of the gray/white matter junction, the presence of a transmantle sign, abnormal cortical folding pattern, and increased signal intensity on FLAIR/T2-weighted MRI (Urbach et al., 2022) as depicted in Figure 1. Positron Emission Tomography (PET) is a nuclear medicine imaging technique used to observe biological processes and functional activities within the human body. Between seizures, PET images using fluorodeoxyglucose show hypometabolism in areas of gray matter tissue that are associated with the epileptogenic region (Baete et al., 2004). However, the identification of FCD lesions in clinical practice remains a formidable challenge. Firstly, FCD lesions are often subtle, exhibit diverse morphological features, and possess poorly defined edge, making them difficult to detect through routine visual inspection. Secondly, the analysis of high-resolution 3D imaging datasets is time-consuming, requires specialized expertise, and can be subjective in certain cases. Therefore, there is an urgent need for precise and efficient computational techniques to assist in the localization of FCD lesions.


Figure 1. (a) A T1-weighted MRI slice of normal brain; (b) abnormal gray matter thickening in a T1-weighted MRI slice in red box; (c) blurred junction of gray and white matter in a T1-weighted MRI slice in red box; (d) increased signal intensity in FLAIR in red box.

In the early stages, FCD lesion detection primarily focused on the extraction of cortical morphological features, with the main approaches including voxel-based morphometry (Mechelli et al., 2005) and surface-based morphometry (Thesen et al., 2011). The large sets of extracted cortical morphological features were then fed into machine learning models to distinguish between normal and pathological tissues, as in David et al. (2021), Jin et al. (2018), and Hong et al. (2014). However, these methods exhibit several limitations: the computation of features is complex and extremely time-consuming, and some basic features are insufficient to accurately differentiate FCD lesions.

Currently, deep learning-based models have achieved remarkable success in various medical imaging tasks, significantly advancing disease detection and diagnosis. Convolutional Neural Networks (CNNs) are the most widely used models for the detection of FCD lesions, and much prior work builds on them, such as Gill et al. (2021) and Wang et al. (2020). Enhanced model designs have led to improved detection performance. Dev et al. (2019) trained a U-Net using 2D slices derived from 3D FLAIR-weighted sequence images. Niyas et al. (2021) proposed a 3D Res-UNet model for segmenting FCD lesions from MRI volumes, leveraging inter-slice information from 3D MRI data to achieve superior segmentation performance. Thomas et al. (2020) introduced a Multi-Res-Attention UNet to address the significant semantic gap in feature mapping caused by long-range connections between the encoder and decoder layers, resulting in a higher FCD detection rate. The nnU-Net framework has been proposed as a self-adaptive solution for diverse medical image segmentation tasks (Isensee et al., 2018). Spitzer et al. (2023) developed a graph-based nnU-Net for segmenting FCD lesions using surface-based cortical data, achieving a 22%–27% improvement in specificity compared to baseline methods. Zhang S. et al. (2024) employed a 3D full-resolution nnU-Net for automatic lesion segmentation, demonstrating strong performance in FCD lesion detection with a sensitivity of 0.73. Zhang X. et al. (2024) integrated multi-scale transformers into CNN-based encoding and decoding structures to overcome the limitations of local receptive fields in CNN models and successfully identified lesions in 82.4% of patients.

The aforementioned methods have significantly contributed to the detection of FCD lesions. However, traditional methods based on manual extraction of cortical morphological features combined with machine learning exhibit several limitations: (1) feature extraction is computationally complex and time-consuming, and (2) basic features are insufficient to fully capture lesion characteristics. Transformer-based models, while powerful, demand substantial computational resources and are challenging to train on small datasets. CNN-based models have demonstrated significant potential in the field of lesion segmentation due to their excellent local perception capabilities. However, they still exhibit limitations when applied to FCD lesion segmentation tasks. FCD lesions are characterized by blurred edges, making it challenging for networks to accurately capture and identify edges and subtle lesions. Integrating edge detection methods into convolutional models opens new avenues to address these challenges. In other medical image segmentation tasks, models incorporating edge detection methods have shown exceptional performance, as seen in Zhu et al. (2023) and Lu et al. (2023). However, there is a paucity of relevant research on FCD lesion segmentation.

To address these clinical needs and the shortcomings of existing algorithms, this study proposes an edge information-guided FCD lesion segmentation model built on the MedNeXt (Roy et al., 2023) network architecture. The main contributions of this paper are as follows:

1. We propose a Laplacian edge attention block to extract edge features from the T1 modality and integrate them with attention mechanisms, enabling spatial localization from global edges to FCD lesion edges.

2. We propose a multi-strategy feature fusion block to effectively combine high-frequency edge information with network decoding information.

2 Materials and methods

2.1 Materials

Our network was evaluated on two distinct datasets: the Open Dataset, provided by the Department of Epileptology at the University Hospital Bonn (Schuch et al., 2023) and our proprietary dataset, referred to as the Private Dataset. More detailed information on the two datasets is shown in Table 1 and the brief introduction of the two datasets is as follows.


Table 1. Subjects characteristics of FCD.

Open dataset: The Open Dataset consists of 85 people with FCD type II and 85 control persons. Data for each patient include preoperative T1, preoperative FLAIR, and lesion ground truth. The lesion ground truth was annotated on FLAIR images by two neurologists. Because 7 patients underwent a FLAIR sequence with thick slices, we excluded these data and ended up with 78 3D cases.

Private dataset: The Private Dataset consists of 125 patients. Among them, 41.6% were pathologically diagnosed with FCD IIB, 24.8% with FCD IIA, and 18.4% with other types of malformation of cortical development, while the remaining small proportion was attributed to other etiologies. Data for each patient include preoperative 3-T T1, FLAIR, and FDG-PET images, together with the lesion ground truth, which was annotated on T1 images by a physician. This study was approved by the Ethics Committee of Peking University First Hospital. All participants provided written informed consent for the use of their data for research purposes.

2.2 Methods

2.2.1 Model architecture

Before detailing our method, we provide a concise overview of the MedNeXt network. MedNeXt enhances the nnU-Net framework by introducing several specialized modules: MedNeXt Block, MedNeXt 2 × down Block, MedNeXt 2 × up Block, Stem Block and Output/Deep Supervision Block. Our approach adopts the MedNeXt network, adhering to the architectural components outlined in the original paper.

The proposed network architecture, illustrated in Figure 2, comprises an encoder-decoder model, Laplacian edge attention blocks, and multi-strategy feature fusion blocks. The encoder-decoder model, derived from MedNeXt, supports composite scaling in depth, width, and receptive field to effectively extract contextual and high-level features from multimodal images. The Laplacian edge attention block integrates Laplacian pyramid features with encoded network features to enhance edge representation, directing the network's focus toward lesion edges. The multi-strategy feature fusion blocks combine edge features with decoded network features, employing different fusion strategies across decoding layers to retain edge information from shallow features. The subsequent sections elaborate on the structure and efficacy of the Laplacian edge attention block and the multi-strategy feature fusion block.


Figure 2. The overall framework of the proposed methodology. (a) Inputs to the model; (b) encoding of the T1 image; (c) the architecture of LEM-UNet. The LEA and MFF blocks are integrated into the first two encoding layers and the last two decoding layers, respectively. The three-colored dashed data stream denotes the three inputs to the LEA block. The dark blue blocks represent downsampling or stem results, the gray blocks represent the results after upsampling or MedNeXt block, and the light blue block represents the results after MedNeXt block.

2.2.1.1 Laplacian edge attention block

The segmentation of FCD lesions is particularly challenging due to their ambiguous edges. The LEA block addresses this by enhancing edge information, enabling the network to better focus on lesion edges and thereby improving segmentation performance. Since edge features are inherently shallow, the LEA block is applied to the first and second layers of the encoding process to effectively capture texture details of lesion edges.

The LEA block proposed in this paper is shown in Figure 3. Each LEA block receives three inputs: the encoded feature $f_e^i$ from the main network branch, the encoded feature $f_{e\text{-}T1}^i$ from the T1 branch, and the high-frequency features $f_l^i$ from the Laplacian pyramid. The LEA block processes these features into an output feature map $f_{edge}^i$, which then enters the subsequent network for feature fusion. The following subsections detail the inputs of the LEA block and its processing steps.


Figure 3. The architecture of Laplacian Edge Attention (LEA) block.

2.2.1.1.1 Input of LEA block

The input of LEA block consists of three features.

• Encoded feature $f_e^i$: Our backbone network continues to use the MedNeXt encoding-decoding model to process multimodal data, and we denote the feature maps extracted after the MedNeXt block as $f_e^i \in \mathbb{R}^{C_i \times D_i \times H_i \times W_i}$. To obtain rich shallow-level features, the first- and second-layer feature maps after MedNeXt processing are used, namely $f_e^1 \in \mathbb{R}^{C \times D \times H \times W}$ and $f_e^2 \in \mathbb{R}^{2C \times \frac{D}{2} \times \frac{H}{2} \times \frac{W}{2}}$, as shown by the black dotted data stream in Figure 2.

• Encoded feature $f_{e\text{-}T1}^i$ from the T1 branch: The T1-weighted MRI contains rich tissue texture and structural information. We extract its features separately, again using MedNeXt network blocks, and obtain $f_{e\text{-}T1}^1 \in \mathbb{R}^{C \times D \times H \times W}$ and $f_{e\text{-}T1}^2 \in \mathbb{R}^{2C \times \frac{D}{2} \times \frac{H}{2} \times \frac{W}{2}}$, as shown by the blue dotted data stream in Figure 2. The parameters R and B follow Roy et al. (2023).

• High-frequency features $f_l^i$: The Laplacian pyramid is a multiresolution image representation method based on the Gaussian pyramid, capable of capturing edges and detail information of images at different scales (Burt and Adelson, 1987). In this paper, this classic edge detection technique is selected to highlight the high-frequency edge features and the detail features of the images. In the LEA block, T1-weighted MRI is used as input to construct the Laplacian pyramid, as T1-weighted MRI provides rich texture information and clear anatomical structures. The specific calculation is given in Equation 1.

$$N_k = \begin{cases} N, & \text{if } k = 0, \\ d(g_s(N_{k-1})), & \text{if } k \geq 1, \end{cases} \qquad L_k = N_k - up(N_{k+1}), \tag{1}$$

where $N$ represents the input image, $g_s$ represents the convolution operator with a Gaussian filter, $d$ is the 2 × downsampling operation, and $up$ is the 2 × linear interpolation upsampling. We denote $f_l^i = L_i(N)$. As the image scale decreases, repeated Gaussian filtering, upsampling, and downsampling significantly reduce the high-frequency details in the feature maps. Therefore, the first and second layers of the Laplacian pyramid features, $f_l^1 \in \mathbb{R}^{1 \times D \times H \times W}$ and $f_l^2 \in \mathbb{R}^{1 \times \frac{D}{2} \times \frac{H}{2} \times \frac{W}{2}}$, are selected as inputs for the LEA block, as shown by the red dotted data stream in Figure 2.
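As a rough illustration of Equation 1, the following NumPy sketch builds the first two Laplacian levels of a 3D volume. It is not the authors' implementation: the binomial kernel standing in for the Gaussian filter, the edge padding, the strided downsampling, and the function names are all assumptions made for this example. The returned list is zero-indexed, so its first entry is the full-resolution level.

```python
import numpy as np

# Assumed 5-tap binomial kernel used as a stand-in for the Gaussian filter g_s.
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def gaussian_smooth(vol):
    """Apply the separable kernel along every axis (edge-padded)."""
    out = np.asarray(vol, dtype=float)
    for axis in range(out.ndim):
        out = np.apply_along_axis(
            lambda v: np.convolve(np.pad(v, 2, mode="edge"), KERNEL, mode="valid"),
            axis, out)
    return out

def downsample(vol):
    """2x downsampling d: keep every second voxel (assumed strided sampling)."""
    return vol[::2, ::2, ::2]

def upsample(vol, shape):
    """Linear-interpolation upsampling 'up' to the requested spatial shape."""
    out = vol
    for axis, n in enumerate(shape):
        src = np.linspace(0.0, 1.0, out.shape[axis])
        dst = np.linspace(0.0, 1.0, n)
        out = np.apply_along_axis(lambda v: np.interp(dst, src, v), axis, out)
    return out

def laplacian_pyramid(vol, levels=2):
    """Return [L_0, ..., L_{levels-1}] with L_k = N_k - up(N_{k+1})."""
    gauss = [np.asarray(vol, dtype=float)]
    for _ in range(levels):
        gauss.append(downsample(gaussian_smooth(gauss[-1])))
    return [gauss[k] - upsample(gauss[k + 1], gauss[k].shape) for k in range(levels)]
```

For a 128 × 128 × 128 T1 volume, `laplacian_pyramid(t1, levels=2)` would yield one full-resolution and one half-resolution map, matching the shapes of the two LEA inputs described above. A constant volume gives all-zero levels, which is the expected behaviour of a high-pass decomposition.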

2.2.1.1.2 Procedure of LEA block

At the $i$th layer, the LEA block receives the three inputs mentioned above. First, $f_e^i$ and $f_{e\text{-}T1}^i$ are each passed through a Convolutional Block Attention Module (CBAM) (Woo et al., 2018). The CBAM primarily consists of two components, channel attention and spatial attention, which emphasize useful channel features and important spatial locations, respectively. Subsequently, the weights of the feature maps are adjusted to enhance the representation of salient features, and $f_{map}^i$ is obtained. The process is shown in Equation 2.

$$\beta = 1 - \lambda, \qquad f_{map}^i = BN\left(CBAM(f_e^i) \times \lambda + CBAM(f_{e\text{-}T1}^i) \times \beta\right), \tag{2}$$

where $BN$ is batch normalization and $\lambda$ is a learnable parameter that adjusts the weights of $f_e^i$ and $f_{e\text{-}T1}^i$, allowing the network to autonomously learn the complementary information between these two features.

The high-frequency features $f_l^i$ correspond to the overall edge information and texture features of the original image. To obtain edge information and detailed features related to the FCD lesions, we fuse $f_l^i$ and $f_{map}^i$, further screening the features to ensure that the edges of the FCD lesions are highlighted, and finally obtain the edge attention feature $f_{edge}^i$, as given in Equation 3.

$$map^i = \sigma\left(con[f_l^i, f_{map}^i]\right), \qquad f_{edge}^i = g\left(BN(Cov(map^i))\right). \tag{3}$$

Here, $\sigma$ denotes the sigmoid function, $Cov$ denotes the 1 × 1 × 1 convolution, $con$ is the concatenation operation, and $g$ represents the GELU function.

Through the LEA block, we obtain the edge attention features for the first and second layers, denoted $f_{edge}^1 \in \mathbb{R}^{C \times D \times H \times W}$ and $f_{edge}^2 \in \mathbb{R}^{2C \times \frac{D}{2} \times \frac{H}{2} \times \frac{W}{2}}$, respectively. These features are then fed into the feature fusion block, where they are combined with the backbone network to achieve better segmentation performance.
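To make the data flow of Equations 2 and 3 concrete, here is a minimal NumPy sketch for a single sample. It is only a shape-level illustration under several assumptions: CBAM is replaced by a pass-through stub (`cbam_stub`, a hypothetical name), λ is a fixed scalar rather than a learned parameter, batch normalization uses per-channel statistics without affine parameters, GELU uses the tanh approximation, and the 1 × 1 × 1 convolution is a bias-free channel-mixing matrix.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gelu(x):
    """Tanh approximation of GELU (assumed; the exact form is also common)."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def batch_norm(x, eps=1e-5):
    """Per-channel normalization of a (C, D, H, W) array, no affine parameters."""
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def cbam_stub(x):
    """Placeholder for CBAM: returns its input unchanged (assumption)."""
    return x

def lea_block(f_e, f_t1, f_l, weight, lam=0.5):
    """f_e, f_t1: (C, D, H, W); f_l: (1, D, H, W); weight: (C, C + 1)."""
    beta = 1.0 - lam
    # Equation 2: weighted mix of the two attended branches, then BN.
    f_map = batch_norm(lam * cbam_stub(f_e) + beta * cbam_stub(f_t1))
    # Equation 3: concatenate along channels, squash, mix channels, BN, GELU.
    gate = sigmoid(np.concatenate([f_l, f_map], axis=0))
    mixed = np.einsum("oc,cdhw->odhw", weight, gate)
    return gelu(batch_norm(mixed))
```

The concatenation adds one Laplacian channel to the C feature channels, so the pointwise convolution maps C + 1 channels back to C, which keeps the output the same shape as $f_e^i$.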

2.2.1.2 Multi-strategy feature fusion block

The multi-strategy feature fusion (MFF) block is shown in Figure 4. Our backbone network performs four downsampling and four upsampling operations. The upsampled features are denoted $f_d^i$, with the last upsampling result denoted $f_d^1$ and the penultimate one $f_d^2$. Each MFF block accepts two inputs, $f_d^i$ and $f_{edge}^i$, and employs different fusion strategies at different decoder layers to make full use of the lesion edge information obtained from the LEA block. We denote the output features of the MFF block as $f_{mff}^i$.


Figure 4. The architecture of multi-strategy feature fusion (MFF) block. (a) MFF(1); (b) MFF(2).

In MFF (1), the inputs are processed as shown in Equation 4.

$$M^i = f_d^i - up\left(d(g_s(f_d^i))\right), \qquad f_{mff}^1 = \sigma(M^1) \times f_{edge}^1, \tag{4}$$

where $M^i$ represents the high-frequency information obtained from the decoded feature $f_d^i$ and is designed to further suppress noise in the output of the LEA block. In MFF (2), the inputs are processed as shown in Equation 5.

$$I = \sigma\left(Cov(f_{edge}^2)\right), \qquad f_{mff}^2 = \sigma(M^2) \times \left(f_{edge}^2 + Cov\left(d(f_{edge}^1)\right) \times I\right). \tag{5}$$

We employ distinct fusion strategies in MFF block. The encoded features contain rich texture information in the shallow layers, and as the network deepens, it increasingly focuses on complex high-level features and abstract representations. Similarly, in the Laplacian pyramid, with the increasing number of sampling iterations, the detail information in the feature maps is progressively lost.

Therefore, during the feature fusion phase, the edge-rich feature $f_{edge}^1$ undergoes 2 × downsampling and is fed into the subsequent layer for further processing. The purpose of this operation is to make full use of the edge information from the shallow layers of the network and to minimize the loss of information. Finally, the fused feature $f_{mff}^i$ of the $i$th layer is added to $f_d^i$ for decoding.
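The two fusion strategies in Equations 4 and 5 can be sketched in NumPy as follows. This is an illustrative approximation, not the released code: the binomial smoothing kernel, strided downsampling, bias-free pointwise convolutions, and all function and weight names are assumptions chosen for this example.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # assumed stand-in for g_s

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1x1(x, weight):
    """Pointwise convolution: weight is (C_out, C_in), x is (C_in, D, H, W)."""
    return np.einsum("oc,cdhw->odhw", weight, x)

def smooth(x):
    """Separable smoothing over the three spatial axes of a (C, D, H, W) array."""
    for axis in (1, 2, 3):
        pad = [(0, 0)] * 4
        pad[axis] = (2, 2)
        padded = np.pad(x, pad, mode="edge")
        n = x.shape[axis]
        x = sum(KERNEL[j] * np.take(padded, np.arange(j, j + n), axis=axis)
                for j in range(5))
    return x

def down(x):
    """2x strided downsampling of the spatial axes."""
    return x[:, ::2, ::2, ::2]

def up(x, spatial_shape):
    """Linear-interpolation upsampling of the spatial axes."""
    for axis, n in zip((1, 2, 3), spatial_shape):
        src = np.linspace(0.0, 1.0, x.shape[axis])
        dst = np.linspace(0.0, 1.0, n)
        x = np.apply_along_axis(lambda v: np.interp(dst, src, v), axis, x)
    return x

def high_freq(f_d):
    """M^i = f_d^i - up(d(g_s(f_d^i)))."""
    return f_d - up(down(smooth(f_d)), f_d.shape[1:])

def mff1(f_d1, f_edge1):
    """Equation 4: gate the shallow edge feature with decoder high frequencies."""
    return sigmoid(high_freq(f_d1)) * f_edge1

def mff2(f_d2, f_edge1, f_edge2, w_down, w_gate):
    """Equation 5: w_down is (2C, C), w_gate is (2C, 2C)."""
    gate = sigmoid(conv1x1x1(f_edge2, w_gate))      # I
    carried = conv1x1x1(down(f_edge1), w_down)      # Cov(d(f_edge^1))
    return sigmoid(high_freq(f_d2)) * (f_edge2 + carried * gate)
```

In both cases the result would then be added to the corresponding decoder feature, as described above. Note that when the decoder feature is constant, `high_freq` is zero and the gate reduces to a uniform factor of 0.5.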

2.2.2 Loss function

We employ a combination of Dice loss $L_{Dice}$ and Cross-Entropy loss $L_{CE}$ at the training stage of our network. It is defined in Equation 6.

$$L = \sum_{i=1}^{D} \left[ w_{dice}^i \, L_{Dice}(f_{DS}^i, f^i) + w_{ce}^i \, L_{CE}(f_{DS}^i, f^i) \right], \tag{6}$$

where $D$ denotes the total number of layers, as we use deep supervision in our network. $w_{dice}^i$ and $w_{ce}^i$ are the weights of $L_{Dice}$ and $L_{CE}$ at the $i$th decoding layer, $f_{DS}^i$ is the predicted feature map at the $i$th decoding layer, and $f^i$ is the ground truth (GT) at the $i$th layer.
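The deep-supervision loss of Equation 6 can be illustrated with a small pure-Python sketch on flattened binary predictions. The soft Dice formulation, the binary form of cross-entropy, the smoothing constants, and the example weights are assumptions for illustration; the paper does not state the weight values.

```python
import math

def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss on flattened probabilities and binary targets."""
    inter = sum(p * t for p, t in zip(prob, target))
    return 1.0 - (2.0 * inter + eps) / (sum(prob) + sum(target) + eps)

def cross_entropy_loss(prob, target, eps=1e-7):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for p, t in zip(prob, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(prob)

def deep_supervision_loss(probs, targets, w_dice, w_ce):
    """Sum the weighted Dice and CE losses over all supervised decoder layers."""
    return sum(
        wd * dice_loss(p, t) + wc * cross_entropy_loss(p, t)
        for p, t, wd, wc in zip(probs, targets, w_dice, w_ce)
    )
```

Each entry of `probs` and `targets` corresponds to one decoding layer, with the target resampled to that layer's resolution. A common choice, assumed here rather than taken from the paper, is to give lower-resolution layers smaller weights, for example `w_dice = w_ce = [1.0, 0.5]` for two layers.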

2.2.3 Evaluation metrics

To quantitatively compare our method with others, we employ five evaluation metrics.

Dice Coefficient (DC) is used to measure the overlap between the predicted segmentation and the ground truth. To assess individual performance, we set the DC thresholds at 0.0 and 0.22. A DC score greater than 0.0, the so-called “one voxel overlap,” has been used in other FCD detection studies (David et al., 2021; Spitzer et al., 2022; Gill et al., 2021), and a higher threshold of 0.22 has been shown to be effective in reproducing the precise positioning performance of expert raters (Walger et al., 2024). Precision (Pre) represents the proportion of samples that are actually positive among those predicted as positive by the model and reflects the accuracy of the positive predictions of the model. Recall (Rec) indicates the proportion of actual positive samples that are correctly predicted as positive by the model and reflects the model's coverage of the positive class. Intersection over Union (IoU) is used to measure the degree of overlap between the segmentation result and the ground truth. These metrics are defined in Equation 7.

$$DC(G,P) = \frac{2\,|G \cap P|}{|G| + |P|} = \frac{2\, G \cdot P}{G^2 + P^2}, \qquad Pre = \frac{TP}{TP + FP}, \qquad Rec = \frac{TP}{TP + FN},$$
$$IoU = \frac{|G \cap P|}{|G \cup P|}, \qquad Detection = \frac{N_d}{N_a} \times 100\%, \tag{7}$$

where $G$ and $P$ denote the GT and the predicted results of the model, respectively. $TP$ refers to the number of true positive voxels, $FP$ to the number of false positive voxels, and $FN$ to the number of false negative voxels. $N_d$ refers to the number of individuals with a Dice value higher than the threshold of 0.0 or 0.22, while $N_a$ denotes the total number of individuals. The higher these metrics, the better the model performance.
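The overlap metrics of Equation 7 map directly onto set operations. The sketch below, with hypothetical function names, represents each mask as a set of voxel coordinates labelled as lesion and computes the per-subject scores, followed by the detection rate across subjects at a chosen Dice threshold.

```python
def overlap_metrics(gt, pred):
    """gt and pred are sets of (z, y, x) voxel coordinates labelled as lesion."""
    tp = len(gt & pred)   # true positive voxels
    fp = len(pred - gt)   # false positive voxels
    fn = len(gt - pred)   # false negative voxels

    def ratio(num, den):
        # Convention assumed here: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }

def detection_rate(dice_scores, threshold=0.0):
    """Percentage of subjects whose Dice score exceeds the threshold."""
    detected = sum(1 for score in dice_scores if score > threshold)
    return 100.0 * detected / len(dice_scores)
```

For example, with per-subject Dice scores `[0.0, 0.1, 0.3, 0.5]`, the detection rate is 75% at the one-voxel-overlap threshold of 0.0 and 50% at the stricter threshold of 0.22.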

Hausdorff Distance (HD) is a measure of similarity or difference between two sets of points, especially the edge difference between two shapes. HD95 uses a 95% quantile and is more robust, while HD is susceptible to outliers. A lower value HD95 indicates a higher degree of similarity and less difference between the two sets of points. HD95 is defined as Equation 8.

$$HD95(G,P) = \max\{d_{95}(G,P),\ d_{95}(P,G)\} \tag{8}$$
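A brute-force sketch of Equation 8 is given below for small, non-empty point sets. It assumes that $d_{95}(A,B)$ is the 95th percentile of the distances from each point of $A$ to its nearest point in $B$, computed with linear interpolation between ranks, and that distances are Euclidean in voxel units. Practical implementations usually restrict the sets to surface voxels and use distance transforms for speed.

```python
import math

def percentile(values, q):
    """q-th percentile with linear interpolation between sorted ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def directed_d95(a, b):
    """95th percentile of nearest-neighbour distances from points in a to set b."""
    return percentile([min(math.dist(p, q) for q in b) for p in a], 95)

def hd95(g, p):
    """Symmetric 95% Hausdorff distance between two non-empty point sets."""
    return max(directed_d95(g, p), directed_d95(p, g))
```

Taking the 95th percentile instead of the maximum means that a few outlying voxels far from the lesion do not dominate the score, which is the robustness property mentioned above.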

Our experiment uses a five-fold cross-validation, so the above metrics are averaged.

2.2.4 Comparison models

We included a variety of comparison models, namely UNETR++ (Shaker et al., 2024), CoTr (Xie et al., 2021), nnFormer (Zhou et al., 2023), Res-Unet (Diakogiannis et al., 2020), and nnUNet. UNETR++, nnFormer, and CoTr represent successful integrations of CNNs and transformers. These hybrid frameworks leverage the transformer's strong long-sequence modeling capabilities to compensate for the limited receptive field and weak long-range dependencies of CNNs. They perform well on medical datasets for tasks such as brain tumor and organ segmentation. Both Res-Unet and nnUNet are pure CNN-based architectures. Res-Unet introduces residual connections in the encoder-decoder, effectively alleviating the problem of vanishing gradients and making it suitable for small-sample medical image segmentation. nnUNet is an automated medical image segmentation framework that has achieved excellent performance in multiple medical segmentation challenges and exhibits strong generalization. By including both pure CNN models and hybrid models, we can better demonstrate the advantages of our proposed method in the results.

3 Results

3.1 Implementation details

Our method was implemented in Python 3.9.12 with PyTorch 2.1.1. Experiments were run within the MedNeXt framework, using five-fold cross-validation and the framework's built-in data preprocessing operations to perform format conversion, cropping, resampling, and standardization, resulting in a data size of 128 × 128 × 128 with a voxel size of 1 mm × 1 mm × 1 mm. The training batch size was 2, the initial learning rate was $10^{-3}$, and the maximum number of epochs was 100. The frame size configuration was M, the kernel size was 3, λ was 0.5, and the optimizer was AdamW.

For all comparison models, the training batch size, patch size, and maximum number of epochs were the same as those of MedNeXt. The initial learning rate was $10^{-2}$ and the optimizer was SGD.

3.2 Results on open and private datasets

First, we use five indicators to evaluate the predictions of the various models, reporting both overall and subtype results. Second, we compare the prediction results of the different models in terms of individual detection rate and slice (axial, sagittal, and coronal) detection rate, enabling a comprehensive analysis of the results. Finally, we present and discuss the 16 MRI-negative cases in the Private Dataset separately.

Open dataset result: As listed in Table 2, compared to MedNeXt_M3, our method improved the DC metric by 2.40%. LEM-UNet also achieved a DC score 2.00% higher than MedNeXt_L3, which uses more convolutional layers. Compared with other CNN-based models, LEM-UNet showed significant improvements in these five metrics. Compared with transformer-based models, our model exhibited a substantial advantage, with a 5.80% increase in the DC metric over the comparatively high-performing CoTr model. The results indicated that nnFormer and UNETR++ performed poorly on this dataset, struggling to capture the details and characteristic edges of the FCD lesions. In terms of the ability to identify positive class samples, our model excelled, with a notably improved ability to capture FCD lesion details after integrating LEA blocks.


Table 2. Performance metrics for different methods on Open and Private datasets.

Private dataset result: As shown in Table 2, all networks achieved a higher DC score than on the Open Dataset. One reason is that each case in this dataset comprised three modalities of raw data (T1, FLAIR, and FDG-PET), providing the base network with a sufficient amount of rich information. Even so, our network still demonstrated commendable performance, with a 2.90% improvement in DC over MedNeXt_M3. In terms of the ability to identify positive class samples, our network remained stronger than the other networks, with a 2.90% increase in Rec over MedNeXt_M3. In addition, compared to transformer-based networks, LEM-UNet still demonstrated significant advantages, particularly in terms of predictive precision, indicating that our network is more capable of focusing on the subtle details of the lesions.

Classification results: In LEM-UNet, the segmentation performance according to histopathology and seizure-free state is shown in Table 3. In the Private Dataset, the DC score for 13 subjects with FCD type I reached 0.655, whereas for 23 subjects with the other MCDs, the DC score was 0.620.


Table 3. Segmentation performance based on pathology and seizure freedom in Open and Private dataset.

Detection result: We analyzed the detection rates of the different models. Table 4 shows the detection rates at the individual and slice levels. LEM-UNet again demonstrated superior performance. One of the major challenges in FCD lesion detection is the occurrence of false-positive clusters that do not overlap with the ground truth. Table 5 shows the individual-level statistics, counting the false-positive, true-positive, false-negative, and ground-truth clusters. All predicted results were subjected to the same connected component post-processing procedure. If retaining only the largest connected component did not improve the DC score, the original predictions were preserved and no smaller connected components were removed. Independent connected components were identified using the standard 3D 26-connectivity criterion. The results demonstrate that our method exhibits strong performance in identifying true-positive clusters, achieving the fewest false-positive clusters on the Open Dataset. In contrast, nnFormer predicted a relatively larger number of false-positive clusters on both datasets.
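The connected component post-processing described above can be sketched in pure Python as follows. Masks are represented as sets of voxel coordinates, components are grown by breadth-first search over the 26 neighbours of each voxel, and the largest component replaces the prediction only when it improves the Dice score. Function names are hypothetical and the sketch favours clarity over speed.

```python
from collections import deque
from itertools import product

# All 26 offsets around a voxel (face, edge, and corner neighbours).
NEIGHBOURS_26 = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

def connected_components(voxels):
    """Split a set of (z, y, x) voxels into 26-connected components."""
    remaining = set(voxels)
    components = []
    while remaining:
        seed = remaining.pop()
        component, queue = {seed}, deque([seed])
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in NEIGHBOURS_26:
                nb = (z + dz, y + dy, x + dx)
                if nb in remaining:
                    remaining.remove(nb)
                    component.add(nb)
                    queue.append(nb)
        components.append(component)
    return components

def dice(gt, pred):
    total = len(gt) + len(pred)
    return 2 * len(gt & pred) / total if total else 0.0

def postprocess(pred, gt):
    """Keep only the largest component if that improves Dice, else keep pred."""
    components = connected_components(pred)
    if not components:
        return set(pred)
    largest = max(components, key=len)
    return largest if dice(gt, largest) > dice(gt, pred) else set(pred)

def false_positive_clusters(pred, gt):
    """Predicted components that share no voxel with the ground truth."""
    return [c for c in connected_components(pred) if not (c & gt)]
```

Under 26-connectivity, two voxels that touch only at a corner, such as (0, 0, 1) and (1, 1, 2), belong to the same component, so diagonal lesion fragments are not split into separate clusters.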


Table 4. Comparison of individual detection rates and average detection rates across the three directional slices between LEM-UNet and other models across two Datasets.


Table 5. Comparison of the numbers of GT lesions, TP lesions, FP lesions, and FN lesions among different models on the two datasets.

MRI-negative result: We also examined MRI-negative patients. Table 6 reports the results of the different models on 16 MRI-negative patients from the Private Dataset. These 16 patients appeared positive on PET. Most models experienced varying degrees of performance degradation on these difficult-to-identify MRI-negative patients. For example, the DC score of nnUNet dropped by 6.00%, and that of Res-Unet dropped by 2.00%. The hybrid transformer architectures, however, appeared relatively stable. Our method showed only a slight performance degradation and identified all 16 MRI-negative patients.


Table 6. The evaluation results of MRI-negative patients on different models.

3.3 Visualization comparison and discussion

The visualization results of two datasets are illustrated in Figure 5. The first two rows display the visualization results on the Private Dataset, while the last two rows show the visualization results on the Open Dataset. The ground truth is set to white, while the model's predicted results are set to red and superimposed upon the ground truth for comparison. We observe that transformer-based models exhibit a significantly lower ability to capture positive samples in this task compared to CNN-based models. They struggle to correctly identify FCD lesions with blurred edges and subtle features, even resulting in missed detections, as seen in the CoTr prediction results. By comparing all visualized outcomes, LEM-UNet demonstrates superior performance in capturing the lesion details. Furthermore, our network, by enhancing the focus on edge information, becomes more sensitive to lesion edges, increasing the number of true positive identifications while reducing the likelihood of false positives.


Figure 5. Visualization comparison on the Open and Private Datasets (Private Dataset: first two rows; Open Dataset: last two rows). (a) T1 image, (b) ground truth, (c–j) results of LEM-UNet (ours), MedNeXt_M3, MedNeXt_L3, nnUNet, UNETR++, Res-UNet, CoTr, and nnFormer, respectively.

The heat maps in Figure 6 further illustrate where the network focuses when processing features. The output of the LEA block concentrates on the edges of the lesion; the color gradient from blue to red indicates low to high intensity.

Figure 6. Heatmap visualization of the outputs of the LEA and MFF blocks. Each row represents a case (a–c) from the Open Dataset. The first column displays the patient's T1 image, the second column the FLAIR image, the third column the output of the first LEA block (fedge1), the fourth column the output of the last MFF block (fmff1), the fifth column the prediction, and the last column the ground truth (GT), shown in red.
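
As an illustration of how a blue-to-red heat map of this kind can be produced, the sketch below min-max normalizes a list of activation values and maps them linearly from blue (lowest) to red (highest). The function name `to_blue_red` and the two-color ramp are assumptions made for illustration; the colormap actually used for Figure 6 is not specified here.

```python
from typing import List, Tuple


def to_blue_red(values: List[float]) -> List[Tuple[int, int, int]]:
    """Min-max normalize activations and map them linearly from blue to red."""
    if not values:
        raise ValueError("at least one activation value is required")
    lo, hi = min(values), max(values)
    span = hi - lo
    colors = []
    for v in values:
        # t = 0.0 for the lowest activation, 1.0 for the highest.
        # Assumed convention: a constant map is rendered entirely as "low".
        t = (v - lo) / span if span else 0.0
        colors.append((round(255 * t), 0, round(255 * (1 - t))))  # (R, G, B)
    return colors


if __name__ == "__main__":
    # Hypothetical activations along one row of a feature map.
    print(to_blue_red([0.1, 0.4, 0.9, 0.2]))
```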

3.4 Ablation study

In this section, we assess the contribution of each proposed block through a series of ablation studies. Before the ablation experiments, we briefly examine the initialization value of the learnable parameter λ. These experiments are conducted on the Private Dataset, and the results are shown in Table 7.

Table 7. λ initialization in LEM-UNet on Private Dataset.

Performance is evaluated using three metrics: DC, Pre, and Rec. With λ = 0.5, the model exhibits the best overall performance, with higher Pre and DC scores than with λ = 0.3 or λ = 0.7.

We then proceed to the ablation studies. The LEA and MFF blocks are the core components of the LEM-UNet framework. Because one of the inputs to the MFF block is the output of the LEA block, we evaluate the segmentation results when the MFF and LEA blocks are removed sequentially, in order to validate the contribution of each block. These assessments are conducted on the MedNeXt_M3 baseline with λ initialized to 0.5, using the Private Dataset; the results are detailed in Table 8.
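
For reference, the three metrics named above (DC, Pre, and Rec) can be computed from voxel-wise overlap counts. The following is a minimal sketch, assuming binary masks flattened to sequences of 0/1; the function name `overlap_metrics` and the convention of returning 0.0 for an empty denominator are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Dict, Sequence


def overlap_metrics(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Dice (DC), precision (Pre) and recall (Rec) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # lesion voxels found
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # false alarms
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # missed lesion voxels

    def ratio(num: float, den: float) -> float:
        # Assumed convention: an empty denominator yields 0.0.
        return num / den if den else 0.0

    return {
        "DC": ratio(2 * tp, 2 * tp + fp + fn),
        "Pre": ratio(tp, tp + fp),
        "Rec": ratio(tp, tp + fn),
    }


if __name__ == "__main__":
    # Hypothetical six-voxel masks, for illustration only.
    truth = [0, 1, 1, 1, 0, 0]
    pred = [0, 1, 1, 0, 1, 0]
    print(overlap_metrics(pred, truth))
```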

Table 8. Ablation study of proposed blocks for LEM-UNet on Private Dataset.

In LEM-UNet, the edge attention module improves model capability but also introduces additional resource overhead. We compared the parameter counts and computational complexity of LEM-UNet and the other models, as shown in Table 9. Compared with the baseline model MedNeXt_M3, LEM-UNet has higher computational complexity but achieves better segmentation performance. Overall, our approach offers a favorable balance between computational cost and performance gain.

Table 9. Comparison of computational complexity for different models on Private Dataset.
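
To give a feel for where parameter counts and computational cost come from in a 3D convolutional network, the sketch below applies the standard counting formulas to a single 3D convolution with a cubic kernel. The helper names `conv3d_params` and `conv3d_macs` and the layer sizes are hypothetical; this is a back-of-the-envelope illustration, not the procedure used to produce Table 9.

```python
from typing import Tuple


def conv3d_params(c_in: int, c_out: int, kernel: int,
                  groups: int = 1, bias: bool = True) -> int:
    """Learnable parameters of a 3D convolution with a cubic kernel."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    weights = (c_in // groups) * c_out * kernel ** 3
    return weights + (c_out if bias else 0)


def conv3d_macs(c_in: int, c_out: int, kernel: int,
                out_shape: Tuple[int, int, int], groups: int = 1) -> int:
    """Multiply-accumulate operations for one forward pass (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    d, h, w = out_shape
    return (c_in // groups) * c_out * kernel ** 3 * d * h * w


if __name__ == "__main__":
    # Hypothetical layer: 32 -> 32 channels, 3x3x3 kernel, 16^3 output volume.
    print("dense params:    ", conv3d_params(32, 32, 3))
    print("depthwise params:", conv3d_params(32, 32, 3, groups=32))
    print("dense MACs:      ", conv3d_macs(32, 32, 3, (16, 16, 16)))
```

The comparison between the dense and depthwise (`groups=32`) cases shows how strongly the choice of convolution type affects both numbers, which is why added modules can raise complexity noticeably even when they add few layers.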

4 Discussion

The segmentation of FCD lesions is highly complex: lesion characteristics vary across imaging modalities, and lesion edges are often blurred, which poses significant challenges for segmentation. The success of FCD lesion resection surgery depends heavily on the precision of the resection area. Therefore, we propose integrating edge enhancement methods into convolutional networks, aiming to improve segmentation performance by enhancing the high-frequency edge information of lesions and thereby provide a reliable reference for the clinical diagnosis of FCD lesions.

From the experimental results, LEM-UNet demonstrates superior performance, outperforming comparison models across multiple key metrics, as shown in Table 2. Furthermore, our method exhibits significant advantages in individual and slice detection rates, achieving 99.20% and 81.22% (DC > 0.0) on the Private Dataset, respectively. Based on visualization of feature maps, the LEA block demonstrates the ability to capture high-frequency edge information of lesions, as illustrated in Figure 6, although these edge cues are not entirely closed. According to the ablation study, the integration of the Laplacian pyramid with CBAM effectively enhances the network's focus on high-frequency edge information of FCD lesions, which is one of the key factors contributing to the improved segmentation performance of LEM-UNet.
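
As we read the DC > 0.0 criterion above, a case (or slice) counts as detected when its Dice score is strictly greater than zero, and the detection rate is the fraction of cases that meet it. A minimal sketch of that bookkeeping follows; `detection_rate` and the sample scores are hypothetical and are not values from the paper.

```python
from typing import Iterable


def detection_rate(dice_scores: Iterable[float], threshold: float = 0.0) -> float:
    """Fraction of cases whose Dice score is strictly above `threshold`."""
    scores = list(dice_scores)
    if not scores:
        raise ValueError("at least one Dice score is required")
    detected = sum(1 for s in scores if s > threshold)
    return detected / len(scores)


if __name__ == "__main__":
    # Hypothetical per-subject Dice scores; 0.0 means no overlap with the lesion.
    subject_dice = [0.71, 0.0, 0.43, 0.12, 0.58]
    print(f"individual detection rate: {detection_rate(subject_dice):.2%}")
```

The same function applies to slice-level rates if it is given one Dice score per slice instead of one per subject.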

Deep learning-based methods have demonstrated promising performance in FCD lesion segmentation tasks (e.g., Thomas et al., 2020; Zhang X. et al., 2024; Niyas et al., 2021). However, compared to the excellent segmentation results in other tasks, such as brain tumor segmentation and polyp segmentation, there is still under-segmentation, which may be closely related to the complexity of FCD lesions. Previous approaches primarily focused on feature fusion and enhancement, contributing to addressing segmentation challenges such as semantic gaps, spatial information loss, and limited receptive fields. However, they often overlooked edge features that are particularly meaningful for segmentation tasks. To address this, we introduced the Laplacian pyramid to process lesion edges, leveraging the characteristics of FCD lesions to reduce the false positive rate in segmentation results. By incorporating edge analysis tailored to FCD lesion features, LEM-UNet maximizes the capture of morphological characteristics of lesion edges while focusing on deep abstract features, which is a key advantage of our method over existing algorithms.
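
To make concrete why a Laplacian pyramid isolates edges, the sketch below computes one pyramid level for a 1D signal: the signal minus a blurred, downsampled, and re-expanded copy of itself. It is a simplified illustration under stated assumptions (1D rather than 3D, a [1, 2, 1] / 4 smoothing kernel, nearest-neighbor expansion, and no attention mechanism); the function names are hypothetical and this is not the LEA block itself.

```python
from typing import List


def blur(signal: List[float]) -> List[float]:
    """Smooth with a [1, 2, 1] / 4 kernel, replicating the border samples."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + 2.0 * signal[i] + right) / 4.0)
    return out


def downsample(signal: List[float]) -> List[float]:
    """Keep every second sample."""
    return signal[::2]


def upsample(signal: List[float], length: int) -> List[float]:
    """Nearest-neighbor expansion back to `length` samples, then smoothing."""
    expanded = [signal[min(i // 2, len(signal) - 1)] for i in range(length)]
    return blur(expanded)


def laplacian_level(signal: List[float]) -> List[float]:
    """One pyramid level: the detail lost by blurring and downsampling."""
    coarse = downsample(blur(signal))
    approx = upsample(coarse, len(signal))
    return [s - a for s, a in zip(signal, approx)]


if __name__ == "__main__":
    # A step edge: flat background followed by a flat "lesion" region.
    step = [0.0] * 8 + [1.0] * 8
    print([round(v, 4) for v in laplacian_level(step)])
```

On the step signal, the response is zero in the flat regions and concentrated around the transition, which is the high-frequency edge information that edge-guided designs aim to emphasize.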

Nevertheless, the proposed method still suffers from limitations inherent to deep learning frameworks, such as dependence on the accuracy of training data annotations. In addition, the small dataset size restricts the model's generalizability and robustness. We performed five-fold cross-validation on both the Open and Private Datasets and conducted a series of optimization experiments. However, because FCD data are scarce, we did not evaluate the model on scan data from another center as an independent test set, which is another limitation. Future work will integrate multi-center data to improve the model's generalization capability.
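
For readers unfamiliar with the protocol, five-fold cross-validation partitions the subjects into five disjoint folds and uses each fold once for validation while training on the other four. A minimal sketch at the subject level is given below; the function name `k_fold_splits`, the subject identifiers, and the fixed shuffle seed are assumptions for illustration, since the way the folds were drawn is not described here.

```python
import random
from typing import List, Sequence, Tuple


def k_fold_splits(subject_ids: Sequence[str], k: int = 5,
                  seed: int = 0) -> List[Tuple[List[str], List[str]]]:
    """Return k (train, validation) splits at the subject level."""
    if k < 2 or k > len(subject_ids):
        raise ValueError("k must be between 2 and the number of subjects")
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)       # fixed seed keeps the folds reproducible
    folds = [ids[i::k] for i in range(k)]  # round-robin: fold sizes differ by at most 1
    splits = []
    for i, val in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        splits.append((train, val))
    return splits


if __name__ == "__main__":
    # Hypothetical subject identifiers.
    subjects = [f"sub-{i:03d}" for i in range(23)]
    for fold, (train, val) in enumerate(k_fold_splits(subjects)):
        print(f"fold {fold}: {len(train)} train / {len(val)} validation")
```

Splitting by subject rather than by slice keeps all data from one patient in the same fold, which avoids leakage between training and validation.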

5 Conclusion

In this study, we introduce LEM-UNet, a framework for 3D FCD lesion segmentation in multimodal medical images built on the proposed LEA and MFF blocks. The LEA block uses a 3D Laplacian pyramid to capture shallow edge information in images and combines it with CBAM to localize from global information down to local lesion information, guiding the network to focus on lesion edges and details. The MFF block combines high-frequency edge features with high-level prediction features, bridging semantic differences between feature maps while eliminating redundant information. We deploy these modules in the MedNeXt_M3 framework, and results on five metrics show that our method provides competitive performance, with overall metrics superior to other SOTA methods. This supports the effectiveness of the proposed framework for medical image segmentation.

Data availability statement

The Private Dataset presented in this article is not readily available because it is restricted by ethical requirements. The Open Dataset can be accessed at https://openneuro.org/datasets/ds004199/.

Ethics statement

The studies involving humans were approved by the Ethics Committee of Peking University First Hospital (No. 102627). Written informed consent was obtained from the parents of all participants, allowing the use of their children's data for research purposes. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants' legal guardians/next of kin.

Author contributions

QL: Investigation, Conceptualization, Writing – review & editing, Methodology, Validation, Formal analysis, Visualization, Data curation, Software, Writing – original draft. HY: Writing – original draft, Data curation, Validation, Methodology, Writing – review & editing, Formal analysis. MZ: Writing – review & editing, Methodology, Formal analysis, Data curation. XY: Writing – review & editing, Data curation. LC: Resources, Validation, Writing – review & editing, Conceptualization, Supervision, Methodology. GK: Methodology, Supervision, Validation, Conceptualization, Writing – review & editing, Funding acquisition.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was supported by the State Key Program of the National Natural Science Foundation of China (82030037).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Avakyan, G., Blinov, D., Lebedeva, A., Burd, S., and Avakyan, G. (2017). ILAE classification of the epilepsies: the 2017 revision and update. Epilepsy Paroxysmal Condit. 9, 6–25. doi: 10.17749/2077-8333.2017.9.1.006-025

Baete, K., Nuyts, J., Van Paesschen, W., Suetens, P., and Dupont, P. (2004). Anatomical-based FDG-PET reconstruction for the detection of hypo-metabolic regions in epilepsy. IEEE Trans. Med. Imaging 23, 510–519. doi: 10.1109/TMI.2004.825623

Burt, P. J., and Adelson, E. H. (1987). “The Laplacian pyramid as a compact image code,” in Readings in Computer Vision (Elsevier), 671–679. doi: 10.1016/B978-0-08-051581-6.50065-9

David, B., Kröll-Seger, J., Schuch, F., Wagner, J., Wellmer, J., Woermann, F., et al. (2021). External validation of automated focal cortical dysplasia detection using morphometric analysis. Epilepsia 62, 1005–1021. doi: 10.1111/epi.16853

Dev, K. B., Jogi, P. S., Niyas, S., Vinayagamani, S., Kesavadas, C., and Rajan, J. (2019). Automatic detection and localization of focal cortical dysplasia lesions in MRI using fully convolutional neural network. Biomed. Signal Process. Control 52, 218–225. doi: 10.1016/j.bspc.2019.04.024

Diakogiannis, F. I., Waldner, F., Caccetta, P., and Wu, C. (2020). ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data. ISPRS J. Photogram. Rem. Sens. 162, 94–114. doi: 10.1016/j.isprsjprs.2020.01.013

Fisher, R. S., Boas, W. V. E., Blume, W., Elger, C., Genton, P., Lee, P., et al. (2005). Epileptic seizures and epilepsy: definitions proposed by the International League Against Epilepsy (ILAE) and the International Bureau for Epilepsy (IBE). Epilepsia 46, 470–472. doi: 10.1111/j.0013-9580.2005.66104.x

Gill, R. S., Lee, H.-M., Caldairou, B., Hong, S.-J., Barba, C., Deleo, F., et al. (2021). Multicenter validation of a deep learning detection algorithm for focal cortical dysplasia. Neurology 97, e1571–e1582. doi: 10.1212/WNL.0000000000012698

Hong, S.-J., Kim, H., Schrader, D., Bernasconi, N., Bernhardt, B. C., and Bernasconi, A. (2014). Automated detection of cortical dysplasia type II in MRI-negative epilepsy. Neurology 83, 48–55. doi: 10.1212/WNL.0000000000000543

Isensee, F., Petersen, J., Klein, A., Zimmerer, D., Jaeger, P. F., Kohl, S., et al. (2018). nnU-Net: self-adapting framework for U-Net-based medical image segmentation. arXiv preprint arXiv:1809.10486.

Jin, B., Krishnan, B., Adler, S., Wagstyl, K., Hu, W., Jones, S., et al. (2018). Automated detection of focal cortical dysplasia type II with surface-based magnetic resonance imaging postprocessing and machine learning. Epilepsia 59, 982–992. doi: 10.1111/epi.14064

Kabat, J., and Król, P. (2012). Focal cortical dysplasia-review. Polish J. Radiol. 77:35. doi: 10.12659/PJR.882968

Lu, L., Chen, S., Tang, H., Zhang, X., and Hu, X. (2023). A multi-scale perceptual polyp segmentation network based on boundary guidance. Image Vis. Comput. 138:104811. doi: 10.1016/j.imavis.2023.104811

Mechelli, A., Price, C. J., Friston, K. J., and Ashburner, J. (2005). Voxel-based morphometry of the human brain: methods and applications. Curr. Med. Imag. 1, 105–113. doi: 10.2174/1573405054038726

Niyas, S., Vaisali, S. C., Show, I., Chandrika, T., Vinayagamani, S., Kesavadas, C., et al. (2021). Segmentation of focal cortical dysplasia lesions from magnetic resonance images using 3D convolutional neural networks. Biomed. Signal Process. Control 70:102951. doi: 10.1016/j.bspc.2021.102951

Roy, S., Koehler, G., Ulrich, C., Baumgartner, M., Petersen, J., Isensee, F., et al. (2023). “MedNeXt: transformer-driven scaling of ConvNets for medical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer), 405–415. doi: 10.1007/978-3-031-43901-8_39

Schuch, F., Walger, L., Schmitz, M., David, B., Bauer, T., Harms, A., et al. (2023). An open presurgery MRI dataset of people with epilepsy and focal cortical dysplasia type II. Sci. Data 10:475. doi: 10.1038/s41597-023-02386-7

Shaker, A. M., Maaz, M., Rasheed, H., Khan, S., Yang, M.-H., and Khan, F. S. (2024). UNETR++: delving into efficient and accurate 3D medical image segmentation. IEEE Trans. Med. Imag. 43, 3377–3390. doi: 10.1109/TMI.2024.3398728

Spitzer, H., Ripart, M., Fawaz, A., Williams, L. Z., MELD Project, Robinson, E. C., et al. (2023). “Robust and generalisable segmentation of subtle epilepsy-causing lesions: a graph convolutional approach,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer), 420–428. doi: 10.1007/978-3-031-43993-3_41

Spitzer, H., Ripart, M., Whitaker, K., D'Arco, F., Mankad, K., Chen, A. A., et al. (2022). Interpretable surface-based detection of focal cortical dysplasias: a multi-centre epilepsy lesion detection study. Brain 145, 3859–3871. doi: 10.1093/brain/awac224

Thesen, T., Quinn, B. T., Carlson, C., Devinsky, O., DuBois, J., McDonald, C. R., et al. (2011). Detection of epileptogenic cortical malformations with surface-based MRI morphometry. PLoS ONE 6:e16430. doi: 10.1371/journal.pone.0016430

Thomas, E., Pawan, S., Kumar, S., Horo, A., Niyas, S., Vinayagamani, S., et al. (2020). Multi-res-attention UNet: a CNN model for the segmentation of focal cortical dysplasia lesions from magnetic resonance images. IEEE J. Biomed. Health Inf. 25, 1724–1734. doi: 10.1109/JBHI.2020.3024188

Urbach, H., Kellner, E., Kremers, N., Blümcke, I., and Demerath, T. (2022). MRI of focal cortical dysplasia. Neuroradiology 64, 443–452. doi: 10.1007/s00234-021-02865-x

Wagstyl, K., Whitaker, K., Raznahan, A., Seidlitz, J., Vértes, P. E., Foldes, S., et al. (2022). Atlas of lesion locations and postsurgical seizure freedom in focal cortical dysplasia: a MELD study. Epilepsia 63, 61–74. doi: 10.1111/epi.17130

Walger, L., Bauer, T., Kügler, D., Schmitz, M. H., Schuch, F., Arendt, C., et al. (2024). “Bridging the gap between human and artificial intelligence-an evaluation framework for computer-aided detection of brain lesions,” in SSRN Scholarly Paper (Rochester, NY). doi: 10.2139/ssrn.4692599

Wang, H., Ahmed, S. N., and Mandal, M. (2020). Automated detection of focal cortical dysplasia using a deep convolutional neural network. Comput. Med. Imag. Graph. 79:101662. doi: 10.1016/j.compmedimag.2019.101662

Willard, A., Antonic-Baker, A., Chen, Z., O'Brien, T. J., Kwan, P., and Perucca, P. (2022). Seizure outcome after surgery for MRI-diagnosed focal cortical dysplasia: a systematic review and meta-analysis. Neurology 98, e236–e248. doi: 10.1212/WNL.0000000000013066

Woo, S., Park, J., Lee, J.-Y., and Kweon, I. S. (2018). “CBAM: convolutional block attention module,” in Proceedings of the European Conference on Computer Vision (ECCV), 3–19. doi: 10.1007/978-3-030-01234-2_1

Xie, Y., Zhang, J., Shen, C., and Xia, Y. (2021). “CoTr: efficiently bridging CNN and transformer for 3D medical image segmentation,” in Medical Image Computing and Computer Assisted Intervention-MICCAI 2021: 24th International Conference, Strasbourg, France, September 27-October 1, 2021, Proceedings, Part III 24 (Springer), 171–180. doi: 10.1007/978-3-030-87199-4_16

Zhang, S., Zhuang, Y., Luo, Y., Zhu, F., Zhao, W., and Zeng, H. (2024). Deep learning-based automated lesion segmentation on pediatric focal cortical dysplasia II preoperative MRI: a reliable approach. Insights Imaging 15:71. doi: 10.1186/s13244-024-01635-6

Zhang, X., Zhang, Y., Wang, C., Li, L., Zhu, F., Sun, Y., et al. (2024). Focal cortical dysplasia lesion segmentation using multiscale transformer. Insights Imaging 15:222. doi: 10.1186/s13244-024-01803-8

Zhou, H.-Y., Guo, J., Zhang, Y., Han, X., Yu, L., Wang, L., et al. (2023). nnFormer: volumetric medical image segmentation via a 3D transformer. IEEE Trans. Image Proc. 32, 4036–4045. doi: 10.1109/TIP.2023.3293771

Zhu, Z., He, X., Qi, G., Li, Y., Cong, B., and Liu, Y. (2023). Brain tumor segmentation based on the fusion of deep semantics and edge information in multimodal MRI. Inf. Fusion 91, 376–387. doi: 10.1016/j.inffus.2022.10.022

Keywords: focal cortical dysplasia, multimodal medical imaging, deep learning, edge information, medical image segmentation

Citation: Li Q, Yu H, Zhang M, Yuan X, Cai L and Kang G (2025) LEM-UNet: an edge-guided network for 3D multimodal images segmentation in focal cortical dysplasia. Front. Neurosci. 19:1634606. doi: 10.3389/fnins.2025.1634606

Received: 24 May 2025; Accepted: 29 August 2025;
Published: 08 October 2025.

Edited by:

Dong Zeng, Southern Medical University, China

Reviewed by:

Sophie Adler, University College London, United Kingdom
Dezhi Cao, Shenzhen Children's Hospital, China
Xiaoming Jiang, Chongqing University of Post and Telecommunications, China

Copyright © 2025 Li, Yu, Zhang, Yuan, Cai and Kang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lixin Cai, pufhpecclx@163.com; Guixia Kang, gxkang@bupt.edu.cn

These authors have contributed equally to this work
