
ORIGINAL RESEARCH article

Front. Behav. Neurosci., 12 January 2026

Sec. Pathological Conditions

Volume 19 - 2025 | https://doi.org/10.3389/fnbeh.2025.1705385

This article is part of the Research Topic: Artificial Intelligence for Behavioral Neuroscience: Unlocking mechanisms, modeling behavior, and advancing prediction.

Customized SAM-Med3D with multi-view adapter and T2-FLAIR mismatch features for glioma IDH genotyping and grading


Xinyu Li1†, Hui Li2†, Yunyi Hu3*, Jingjing Zhang1, Lanlan Wang1 and Xinran Yang1
  • 1School of Computer Science and Engineering, Central South University, Changsha, China
  • 2School of Informatics, Xiamen University, Xiamen, China
  • 3School of Information Resource Management, Renmin University of China, Beijing, China

Objective: Gliomas, the most aggressive type of brain tumor, are infamous for their low survival rates. Tumor grading and isocitrate dehydrogenase (IDH) status are key prognostic biomarkers for gliomas. However, obtaining these markers typically requires invasive methods such as biopsy. As an effective, noninvasive alternative, multimodal MRI can reveal tumor spatial information and the microenvironment. Low-grade and IDH-mutant gliomas often exhibit T2-FLAIR mismatch signals. Medical image foundational models can explore complex representations in medical images, and fine-tuning them may further enhance glioma diagnosis.

Methods: We propose a multi-task network, MTSAM, for simultaneous glioma IDH genotyping and grading. MTSAM first uses dilated convolutions to simulate large-field-of-view convolutions and take an overview of the T2 and FLAIR images. We then employ convolutions to perform a detailed exploration of the two modalities and subtract the weighted T2 and FLAIR images to obtain T2-FLAIR mismatch features. These mismatch features are concatenated with the multimodal MRIs and input into the customized SAM-Med3D, which is fine-tuned by leveraging complementary information across multi-view modalities, including MRIs, handcrafted radiomics (HCR), and clinical features, and then extracts deep features for accurate IDH genotyping and grading.

Results: MTSAM achieves AUCs of 92.38 and 94.31% for glioma IDH typing and grading on the UCSF-PDGM dataset, respectively, and AUCs of 91.56 and 93.37% on the BraTS2020 dataset, outperforming other methods. Additionally, we use Grad-CAM to visualize the attention maps of MTSAM, demonstrating its potential for non-invasive glioma diagnosis.

Conclusion: The proposed method demonstrates that we can effectively fuse multi-view, non-invasive information and fully explore the knowledge learned by medical image foundational models from large-scale medical datasets to facilitate glioma diagnosis, thereby advancing glioma research.

1 Introduction

Glioma is the most common malignant primary brain tumor in adults and has gained notoriety for its extremely poor five-year survival rate (Weller et al., 2015; Jayaram and Phillips, 2024; Li et al., 2022a). In clinical practice, the formulation of preoperative treatment plans for gliomas, such as whether to perform total surgical resection or whether preoperative targeted therapy is needed, depends heavily on tumor grading and isocitrate dehydrogenase (IDH) status (Olar et al., 2015; van den Bent et al., 2024). According to the World Health Organization (WHO) Classification of Tumors of the Central Nervous System (Mahajan et al., 2022; Wen and Packer, 2021), adult diffuse gliomas are classified into Grades 2–4, with IDH wild-type glioblastoma having the highest annual incidence. Specifically, the prognosis of IDH-mutant gliomas is generally better than that of IDH-wildtype gliomas (Vettermann et al., 2019; Han et al., 2020). High-grade gliomas are highly malignant and invasive, leading to a poor prognosis (Navarria et al., 2022; Zhou et al., 2022), while low-grade gliomas are less malignant and less invasive, resulting in a relatively better prognosis (Khan et al., 2021). Therefore, more conservative treatment approaches should be adopted for patients with IDH-wildtype or high-grade gliomas. Traditional methods for evaluating IDH status rely on invasive tissue sampling (Yu et al., 2024; Lim-Fat et al., 2022), which carries risks such as bleeding, infection, and tumor metastasis. For gliomas located in deep-seated areas such as the brainstem or thalamus, or with small volumes, the surgical risk of invasive biopsy is extremely high, and approximately 20% of such patients are forced to delay diagnosis because the benefits of biopsy are outweighed by the risks (Esquenazi et al., 2018; Santos et al., 2024). Moreover, gliomas often exhibit high spatial heterogeneity (Nicholson and Fine, 2021). 
Multimodal magnetic resonance imaging (MRI) and hand-crafted radiomics (HCR) features have emerged as promising approaches for glioma diagnosis due to the rich spatial information they contain (Tan et al., 2019; van Santwijk et al., 2022; Dayarathna et al., 2024; Yang et al., 2025; Bijari et al., 2025; Wu et al., 2023; Sudre et al., 2020). Clinical features such as age and gender provide fundamental physiological information about gliomas (Li et al., 2022b; Akpinar and Oduncuoglu, 2025). These non-invasive multi-view data offer crucial information for glioma diagnosis.

In IDH-wildtype gliomas, cells undergo metabolic reprogramming via the “Warburg effect,” which enables the rapid generation of adenosine triphosphate (ATP) and produces large quantities of biosynthetic precursors such as pyruvate and glutamine (Braun et al., 2021; Murnan et al., 2023). These substances provide raw materials for the synthesis of DNA and proteins required for the rapid division of tumor cells, thereby accelerating proliferation. IDH-wildtype gliomas generally present at higher grades, suggesting a potential complementary relationship between IDH status and glioma grade. Specifically, the incidence of IDH mutations is approximately 12% in WHO Grade 4 gliomas, while this proportion is nearly 60% in Grade 3 gliomas (Jusue-Torres et al., 2023; Komori, 2022; Whitfield and Huse, 2022). Using a multi-task deep learning network to simultaneously perform IDH genotyping and glioma grading may enable exploration of the potential complementary relationship between IDH status and grade, thereby further enhancing glioma diagnosis (Sairam et al., 2023).

The T2-FLAIR mismatch signal is critical for glioma IDH genotyping and grading (Han et al., 2022; Park et al., 2021; Lee et al., 2024). Specifically, IDH-mutant gliomas and lower-grade gliomas are often closely associated with this characteristic T2-FLAIR mismatch signal. However, traditional methods (Jeon et al., 2025; Tang et al., 2024) rely on T2-FLAIR image subtraction to explore T2-FLAIR mismatch features, treating all image positions as having uniform weights, and struggling to effectively capture the subtle differences and mismatch signals between T2 and FLAIR images. The perceptual logic of the human visual system (Stewart et al., 2020) suggests that forming an overall overview first, then conducting detailed observation, can effectively capture differential information. Therefore, first performing an overview of the T2 and FLAIR images, then conducting a refined exploration of their complementary information for weighted subtraction, may further reveal T2-FLAIR mismatch features.

Medical image foundational models are capable of mining complex patterns in data (Moor et al., 2023; Willemink et al., 2022), yet their potential for IDH genotyping and grading of gliomas has not been fully explored (Zhang and Metaxas, 2024; He et al., 2024). The encoder of segmentation models can effectively capture tumor edge and location information, which contains rich prognostic information (Zhang J. et al., 2023; Cheng et al., 2022; Yu et al., 2024). Thus, using features extracted by segmentation encoders may further improve the performance of IDH genotyping and glioma grading. SAM-Med3D (Wang et al., 2025) is a foundational model for medical image segmentation, pre-trained on a large-scale dataset comprising 245 disease categories, 70 public datasets, and 8,000 privately authorized hospital cases, including 22,000 medical images and 143,000 corresponding segmentation masks. Additionally, SAM-Med3D is composed of transformer blocks that focus on exploring long-range dependencies in images and achieves an overall Dice score of 80.71% on 16 medical image segmentation datasets. However, when SAM-Med3D is applied to medical image diagnosis, its performance often remains suboptimal (Wang, 2025). Fine-tuning SAM-Med3D by fusing multi-view information, including multimodal MRIs, HCR, and clinical features, can leverage the prior knowledge that SAM-Med3D has learned from large-scale medical datasets, thereby improving IDH genotyping and grading performance for gliomas.

To improve the performance of glioma IDH genotyping and grading, we propose the multi-task network named MTSAM. MTSAM uses the customized SAM-Med3D to simultaneously conduct IDH genotyping and grading for gliomas. First, we employ dilated convolutions and convolutions with shared weights to respectively conduct an overview and detailed exploration of the complementary information between T2 and FLAIR images. We then perform a weighted subtraction of these two images to obtain T2-FLAIR mismatch features. Then, we concatenate the T2-FLAIR mismatch feature maps with multi-modal MRIs along the channels and feed them into the customized SAM-Med3D, which is fine-tuned by fusing multi-view information, to obtain deep features for IDH genotyping and grading. We employ an uncertainty-weighted method to balance the losses associated with IDH genotyping and grading. Overall, our main contributions include the following:

(1) We propose a multi-task network named MTSAM, which explores T2-FLAIR mismatch features and utilizes the customized SAM-Med3D to explore the SAM-Med3D's prior knowledge learned from large-scale medical data for accurate glioma IDH genotyping and grading.

(2) We propose a multi-view adapter called MVAdapter that explores complementary and multi-scale information in multi-view data, including HCR, clinical, and MRI features, to fine-tune SAM-Med3D and uncover deep features for glioma IDH genotyping and grading.

(3) We propose a T2-FLAIR mismatch feature extraction block based on the human visual system, named MFEB, which first provides an overview of MRIs through dilated convolutions and then conducts detailed exploration using convolutions, aiming to capture the complementary information between T2 and FLAIR images and perform weighted subtraction to obtain T2-FLAIR mismatch features.

2 Materials and methods

2.1 Datasets

We use the publicly available UCSF-PDGM (Calabrese et al., 2022) and BraTS2020 (Menze et al., 2014) datasets. Each sample includes T1-weighted, T1-contrast-enhanced (T1CE), T2-weighted, and fluid-attenuated inversion recovery (FLAIR) images, along with their segmentation results, IDH status, and grading information. After excluding samples with missing segmentation, IDH subtype, or grading information in the UCSF-PDGM and BraTS2020 datasets, 492 and 128 patients' data remain, respectively. Regions of interest (ROIs) are manually adjusted by radiologists and verified by experts, covering key areas such as the enhancing tumor region, necrotic tumor region, and peritumoral abnormal areas. We determine the center of the annotated tumor ROI by calculating the midpoint along its depth, width, and height. Using this center as a reference, we extract a (128, 128, 128) patch from the MRI data to retain peritumoral information (Cheng et al., 2020), which serves as the 3D MRI input to the network.
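The ROI-centered cropping described above can be sketched as follows. This is a minimal sketch, not the authors' released code: the function name `extract_tumor_patch` and the zero-padding fallback for volumes smaller than the patch are our assumptions.

```python
import numpy as np

def extract_tumor_patch(volume, roi_mask, size=128):
    # Center of the annotated tumor ROI: midpoint of its bounding box
    # along depth, height, and width.
    coords = np.argwhere(roi_mask > 0)
    center = (coords.min(axis=0) + coords.max(axis=0)) // 2
    # Clip the crop so it stays inside the volume; zero-pad if the
    # volume itself is smaller than the requested patch (assumption).
    starts = np.clip(center - size // 2, 0,
                     np.maximum(np.array(volume.shape) - size, 0))
    crop = volume[tuple(slice(s, s + size) for s in starts)]
    patch = np.zeros((size, size, size), dtype=volume.dtype)
    patch[tuple(slice(0, c) for c in crop.shape)] = crop
    return patch
```

The (128, 128, 128) patch deliberately extends beyond the ROI bounding box so that peritumoral context is preserved, as described above.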

We split the data into a training set and a test set in an 8:2 ratio, and 15% of the training data is used for validation to tune hyperparameters. As shown in Table 1, the demographic distributions of the training and test sets are consistent.


Table 1. Summary of the datasets used in this study.

2.2 Overview of MTSAM

As shown in Figure 1, the main steps of MTSAM are as follows: (A) First, MTSAM takes multi-modal MRI as 3D input, extracts 1D hand-crafted radiomics features from the multi-modal MRI, and combines them with clinical features. (B) Mimicking the human visual system, MTSAM uses multi-scale dilated blocks to first take an overview and then explore in detail the complementary information between the T2 and FLAIR images; after weighting the two images respectively, it extracts the T2-FLAIR mismatch feature map. (C) MTSAM concatenates the T2-FLAIR mismatch features and the multi-modal MRIs along the channels. The combined data are then input into the customized SAM-Med3D, which is fine-tuned by adapters that fuse multi-view information, to further extract deep features for glioma IDH genotyping and grading.


Figure 1. Overall pipeline of the proposed MTSAM. (A) Data preparation. (B) MFEB. (C) Customized SAM-Med3D. (D) Predictor. (E) DB.

2.2.1 HCR and clinical feature extraction

As shown in Figure 1, MTSAM first extracts HCR features (Liu et al., 2019) from multi-modal MRI and their corresponding Regions of Interest (ROIs). The detailed extraction method for HCR features is available at https://pyradiomics.readthedocs.io/en/latest/. A total of 2,153 HCR features are extracted from each sample, and these features are combined with clinical features, including patients' age and gender, to form 1D features. The value of each feature is standardized by subtracting the mean and dividing by the standard deviation. In the UCSF-PDGM training dataset, we use Lasso regression with five-fold cross-validation to perform HCR feature selection for IDH genotyping and grading, respectively. Features with a p-value greater than 0.05 are considered redundant and thus removed. The 1D features selected in each fold for IDH genotyping and grading are combined. Ultimately, we select 99 HCR features, including 18 first-order, 21 gray level size zone matrix (glszm), 16 gray level dependence matrix (gldm), 19 gray level co-occurrence matrix (glcm), 10 neighboring gray tone difference matrix (ngtdm), 10 gray level run length matrix (glrlm), and five shape-based HCR features, as well as two clinical features, including age and gender. We use the 101 1D features selected from the UCSF-PDGM dataset for the BraTS2020 dataset to ensure that the HCR and clinical features used in the two datasets remain consistent.
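The standardization and Lasso-based selection step can be sketched with scikit-learn. This is a simplified sketch under our assumptions: `select_hcr_features` is a hypothetical name, and the p-value filtering and per-fold, per-task merging described above are omitted for brevity.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def select_hcr_features(features, labels, n_folds=5):
    # Standardize each feature: subtract the mean, divide by the std.
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)
    # Lasso with five-fold cross-validation; features whose coefficient
    # shrinks to zero are treated as redundant and dropped.
    model = LassoCV(cv=n_folds, random_state=0).fit(z, labels)
    return np.flatnonzero(model.coef_)
```

Running the selection once per task (IDH genotyping and grading) and taking the union of the kept indices mirrors the combination step described above.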

2.2.2 T2-FLAIR mismatch feature extraction

Mismatch information between T2 and FLAIR can assist in IDH genotyping and glioma grading (Han et al., 2022; Park et al., 2021; Lee et al., 2024). Subtracting T2 and FLAIR images directly will treat all regions equally and may ignore key mismatch information. Therefore, we propose the MFEB, which extracts complementary features from T2 and FLAIR images and performs weighted subtraction to explore T2-FLAIR mismatch features.

2.2.2.1 Complementary feature extraction

We first explore the complementary information between T2 and FLAIR. We use global average pooling and max pooling, respectively, to comprehensively aggregate the local information from T2 and FLAIR images, which can be expressed as follows:

$x_{m}^{p} = P_{avg}(x_{m}) \oplus P_{max}(x_{m}), \quad m \in \{t2, flair\},$    (1)

where $P_{avg}$ and $P_{max}$ represent global average pooling and max pooling, respectively, which reduce the MRI resolution to half. $x_{t2}$ and $x_{flair}$ are the T2 and FLAIR images, respectively. $\oplus$ denotes channel concatenation. We then design multi-scale shared-weight dilated blocks (DB) that first use dilated convolutions with a larger field of view to obtain an overview and then use convolution blocks to explore details, uncovering the complementary information between T2 and FLAIR images, which can be expressed as:

$F_{db}^{n}(x_{m}^{p}) = FFN(f_{c}^{n}(drop(\beta(f_{c}^{n}(BN(\beta(f_{d}^{n}(x_{m}^{p})))))))),$    (2)

where $f_{d}^{n}$ represents the dilated convolutions with a kernel size of $n$ and a dilation rate of 3, while $f_{c}^{n}$ represents convolutions with a kernel size of $n$. $drop$ is the Dropout function, $\beta$ is the Leaky ReLU activation function, $BN$ refers to Batch Normalization, and $FFN$ is a feed-forward network composed of two pointwise convolutions and a Leaky ReLU activation function.
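Eq. (2) can be sketched as a PyTorch module. The channel width, dropout rate, and "same" padding are our assumptions, since the paper does not state them; only the operator ordering follows Eq. (2).

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Sketch of the DB in Eq. (2): dilated conv (wide-field overview) ->
    LeakyReLU -> BN -> conv -> LeakyReLU -> Dropout -> conv -> FFN."""

    def __init__(self, ch=4, k=3, dilation=3, p_drop=0.1):
        super().__init__()
        self.f_d = nn.Conv3d(ch, ch, k, padding=dilation * (k - 1) // 2,
                             dilation=dilation)          # f_d^n
        self.bn = nn.BatchNorm3d(ch)
        self.f_c1 = nn.Conv3d(ch, ch, k, padding=k // 2)  # inner f_c^n
        self.drop = nn.Dropout(p_drop)
        self.f_c2 = nn.Conv3d(ch, ch, k, padding=k // 2)  # outer f_c^n
        # FFN: two pointwise convolutions with a Leaky ReLU in between.
        self.ffn = nn.Sequential(nn.Conv3d(ch, ch, 1), nn.LeakyReLU(),
                                 nn.Conv3d(ch, ch, 1))
        self.act = nn.LeakyReLU()

    def forward(self, x):
        x = self.bn(self.act(self.f_d(x)))
        x = self.drop(self.act(self.f_c1(x)))
        return self.ffn(self.f_c2(x))
```

With "same" padding at every convolution, the block preserves the spatial resolution of its pooled input, so DBs with different kernel sizes can be chained as in Eqs. (3)-(4).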

2.2.2.2 Weighted subtraction

We adopt multi-scale DBs with shared weights to explore the complementary information between T2 and FLAIR images and weight them respectively, which can be expressed as follows:

$x_{t2}^{w} = \rho(f_{tc}(F_{db}^{3}(F_{db}^{5}(F_{db}^{7}(x_{t2}^{p}))))) \times x_{t2},$    (3)
$x_{flair}^{w} = \rho(f_{tc}(F_{db}^{3}(F_{db}^{5}(F_{db}^{7}(x_{flair}^{p}))))) \times x_{flair},$    (4)

where $f_{tc}$ represents transposed convolutions and $\rho$ represents the sigmoid activation function. Finally, we subtract the weighted T2 and FLAIR images to obtain the T2-FLAIR mismatch features, which can be expressed as:

$x_{tf} = x_{t2}^{w} - x_{flair}^{w}.$    (5)

Next, the T2-FLAIR mismatch features xtf are concatenated along the channels with the multi-modal MRIs xmri. The combined data are then input into the customized SAM-Med3D to extract deep features for glioma IDH genotyping and grading.
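The weighting-and-subtraction pipeline of Eqs. (1) and (3)-(5) can be sketched as below. To keep the sketch short, the shared-weight multi-scale DB stack (Fdb7 -> Fdb5 -> Fdb3) is stubbed by a single convolution; only the pooling, shared weights, transposed-convolution upsampling, sigmoid weighting, and subtraction follow the equations.

```python
import torch
import torch.nn as nn

class WeightedSubtraction(nn.Module):
    """Sketch of MFEB's weighted subtraction: the same (shared-weight)
    scoring path is applied to T2 and FLAIR, and the weighted images
    are subtracted to give the T2-FLAIR mismatch features (Eq. 5)."""

    def __init__(self):
        super().__init__()
        self.avg = nn.AvgPool3d(2)   # Eq. (1): halve the resolution
        self.max = nn.MaxPool3d(2)
        # Stand-in for the Fdb3(Fdb5(Fdb7(.))) stack of dilated blocks.
        self.shared = nn.Conv3d(2, 1, 3, padding=1)
        self.up = nn.ConvTranspose3d(1, 1, 2, stride=2)  # f_tc

    def weight(self, x):
        pooled = torch.cat([self.avg(x), self.max(x)], dim=1)  # Eq. (1)
        return torch.sigmoid(self.up(self.shared(pooled)))     # rho

    def forward(self, x_t2, x_flair):
        # Eqs. (3)-(5): weight each image, then subtract.
        return self.weight(x_t2) * x_t2 - self.weight(x_flair) * x_flair
```

Because `self.weight` is reused for both modalities, the T2 and FLAIR branches share parameters, matching the shared-weight design described above.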

2.2.3 Customized SAM-Med3D for feature extraction

To effectively leverage the prior knowledge that SAM-Med3D has learned from large-scale datasets, we fuse multi-view features and conduct comprehensive deep feature extraction to fine-tune SAM-Med3D for glioma diagnosis. The steps of the customized SAM-Med3D can be divided into multi-view feature fusion, deep feature extraction, and fine-tuning SAM-Med3D.

2.2.3.1 Multi-view feature fusion

We design MVAdapter to explore the complementary information among multi-view data to fine-tune the attention mechanism of SAM-Med3D for IDH genotyping and grading. As shown in Figure 1C, the fine-tuned SAM-Med3D encoder consists of a patch embedding layer and multiple transformer blocks. First, we expand the convolutional weights of the Patch Embedding layer fivefold to accommodate the concatenated xtf and xmri. After passing through the Patch Embedding layer, we obtain the feature map xfm with a shape of (8, 8, 8) and a channel number of 384. As shown in Figure 2, to effectively fine-tune SAM-Med3D, we first explore the complementary information among multi-view data, including 3D MRI images, 1D HCR, and clinical features. Specifically, we perform global average pooling along the channel dimension on the feature maps of MRI images to convert them into 1D features, then use a fully connected layer to reduce the number of channels to one-eighth, yielding 48 features. Similarly, we use another fully connected layer to reduce the 101 HCR and clinical features to 48 features. We use a bilinear layer to explore the complementary information among the multi-view data, including MRIs, HCR, and clinical features, to weight the feature maps xfm, which can be expressed as follows:

$x_{fm}^{w} = \rho(f_{bil}(f_{fc}(P_{avg}^{c}(x_{fm})), f_{fc}(x_{hc}))) \times f_{fc}(x_{fm}),$    (6)

where $f_{fc}$ represents the fully connected layers, $f_{bil}$ represents the bilinear layers, and $P_{avg}^{c}$ represents global average pooling along the channels.

Figure 2. Details of the multi-view adapter (MVAdapter).
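The bilinear fusion and gating of Eq. (6) can be sketched as a PyTorch module. The hidden width of 48 and the reduction of 101 HCR+clinical features match the text; representing the final ffc(xfm) term as a pointwise convolution (a per-voxel linear layer) is our assumption.

```python
import torch
import torch.nn as nn

class MultiViewGate(nn.Module):
    """Sketch of Eq. (6): pool the MRI feature map to a vector, project
    it and the HCR+clinical features to a common width, fuse them with a
    bilinear layer, and use the sigmoid output to gate the feature map."""

    def __init__(self, ch=384, n_hc=101, hidden=48):
        super().__init__()
        self.fc_mri = nn.Linear(ch, hidden)   # 384 -> 48 as in the text
        self.fc_hc = nn.Linear(n_hc, hidden)  # 101 -> 48 as in the text
        self.bil = nn.Bilinear(hidden, hidden, ch)  # f_bil
        # Assumed form of the final f_fc(x_fm): a per-voxel linear map.
        self.fc_fm = nn.Conv3d(ch, ch, 1)

    def forward(self, x_fm, x_hc):
        v = x_fm.mean(dim=(2, 3, 4))  # global average pooling -> 1D
        gate = torch.sigmoid(self.bil(self.fc_mri(v), self.fc_hc(x_hc)))
        return gate[:, :, None, None, None] * self.fc_fm(x_fm)
```

The bilinear layer computes a cross-term between the MRI and HCR/clinical projections, which is what lets the gate reflect complementary rather than merely concatenated information.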

2.2.3.2 Deep feature extraction

To fully explore the deep information in the multi-view fused feature $x_{fm}^{w}$, we design multi-scale dilated group convolution blocks (DGCB). Specifically, we utilize multi-scale dilated group convolutions and group convolutions to extract deep information from the weighted feature map $x_{fm}^{w}$. We then use pointwise convolution to explore inter-channel information and reduce the number of channels to 1,052, obtaining deep multi-view fused features whose dimensions are consistent with the query, key, and value generated by the transformer block. The process of deep feature extraction can be expressed as follows:

$x_{fusion} = f_{pc}(\beta(BN(f_{gc}^{7}(\beta(f_{dgc}^{7}(x_{fm}^{w}))) + f_{gc}^{5}(\beta(f_{dgc}^{5}(x_{fm}^{w}))) + f_{gc}^{3}(\beta(f_{dgc}^{3}(x_{fm}^{w})))))),$    (7)

where $f_{dgc}^{n}$ represents the dilated grouped convolutions with a kernel size of $n$ and a dilation rate of 3, $f_{gc}^{n}$ denotes the grouped convolutions with a kernel size of $n$, and $f_{pc}$ is a pointwise convolution.
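Eq. (7) can be sketched as follows; the group count and intermediate channel width are our assumptions, since the paper specifies only the kernel sizes, the dilation rate of 3, and the 1,052 output channels.

```python
import torch
import torch.nn as nn

class DGCB(nn.Module):
    """Sketch of Eq. (7): per kernel size, a dilated group convolution
    followed by a group convolution; the three branches are summed,
    batch-normalized, activated, and mixed by a pointwise convolution."""

    def __init__(self, ch=16, out_ch=1052, groups=4, sizes=(3, 5, 7)):
        super().__init__()
        def branch(k):
            return nn.ModuleDict({
                "dgc": nn.Conv3d(ch, ch, k, padding=3 * (k - 1) // 2,
                                 dilation=3, groups=groups),  # f_dgc^k
                "gc": nn.Conv3d(ch, ch, k, padding=k // 2,
                                groups=groups)})              # f_gc^k
        self.branches = nn.ModuleList([branch(k) for k in sizes])
        self.bn = nn.BatchNorm3d(ch)
        self.act = nn.LeakyReLU()
        self.pw = nn.Conv3d(ch, out_ch, 1)  # f_pc: inter-channel mixing

    def forward(self, x):
        s = sum(b["gc"](self.act(b["dgc"](x))) for b in self.branches)
        return self.pw(self.act(self.bn(s)))
```

Grouped convolutions keep the parameter count of the three large-kernel branches manageable, while the pointwise convolution restores full inter-channel interaction before the features are added to the query, key, and value.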

2.2.3.3 Fine-tuning SAM-Med3D

We add the deep multi-view fused features to the generated query, key, and value in the transformer block to fine-tune SAM-Med3D for glioma diagnosis. The process of fine-tuning SAM-Med3D can be expressed as:

$F_{ft}(x_{fusion}, x_{qkv}) = x_{fusion} + x_{qkv},$    (8)

where $x_{qkv}$ denotes the query, key, and value vectors produced by the transformer blocks in SAM-Med3D. Finally, the feature maps from the fine-tuned SAM-Med3D encoder are passed through global average pooling to extract deep features for both glioma IDH genotyping and grading.

2.3 Joint loss for IDH genotyping and grading

2.3.1 Weighted cross-entropy loss

To optimize the trainable parameters of MTSAM, we use the prediction results from its IDH genotyping and grading as inputs to the loss function. To address data imbalance, we design a weighted cross-entropy loss function that assigns higher weights to samples with lower occurrence frequencies. Additionally, we increase the loss weight for samples that are difficult to classify, thereby reducing the influence of easily classified samples. The weighted cross-entropy loss function can be expressed as:

$L_{ce}(y_{t}) = -y_{t}\log(y_{t}) - (1-y_{t})\log(1-y_{t}),$    (9)
$L(y_{t}) = -\alpha(1-y_{t})^{\gamma}\log(y_{t}) + \beta \cdot L_{ce}(y_{t}),$    (10)

where $L_{ce}(y_{t})$ is the standard binary cross-entropy loss and $y_{t}$ is the predicted probability of the true glioma class. $\alpha$ is a scaling factor used to adjust the importance of classes according to the class imbalance. $\gamma$ is the focusing parameter that controls the strength of the modulating factor $(1-y_{t})^{\gamma}$; by reducing the loss for easily classified samples, it enables the model to focus on hard-to-classify samples. Based on empirical engineering, for both glioma IDH genotyping and grading, $\alpha$ is set to the ratio of the different classes and $\gamma$ is set to 2. $\beta$ is the regularization factor for the cross-entropy term, used to balance the weighted modulating term and the original cross-entropy loss.
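Eqs. (9)-(10) can be sketched numerically as below; the clipping constant is our addition for numerical stability, and the cross-entropy term is written exactly as it appears in Eq. (9).

```python
import numpy as np

def weighted_ce_loss(p_t, alpha, gamma=2.0, beta=1.0, eps=1e-7):
    """Sketch of Eqs. (9)-(10). `p_t` is the predicted probability of
    the true class; the focal term down-weights easy samples, and
    `beta` balances in the plain cross-entropy term."""
    p_t = np.clip(p_t, eps, 1 - eps)
    # Eq. (9), as written in the paper.
    ce = -p_t * np.log(p_t) - (1 - p_t) * np.log(1 - p_t)
    # Eq. (10): focal modulating term plus the beta-weighted CE term.
    return alpha * (1 - p_t) ** gamma * -np.log(p_t) + beta * ce
```

An easy sample (high probability for the true class) incurs a much smaller loss than a hard one, which is the intended focusing behavior.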

2.3.2 Joint loss with uncertain loss weight

We adopt an uncertainty-based loss-weighting approach (Kendall et al., 2018) to jointly optimize the weighted cross-entropy losses for IDH genotyping and grading. The joint loss function can be expressed as:

$L_{joint} = \sum_{i=1}^{2}\frac{1}{2\gamma_{i}^{2}}L_{i} + \log\prod_{i=1}^{2}\gamma_{i},$    (11)

where $L_{1}$ and $L_{2}$ correspond to the losses for glioma IDH genotyping and grading obtained via the weighted cross-entropy loss function, respectively. $\gamma_{i}$ is the weight parameter for each loss, initially set to 1 based on empirical engineering and adjusted during training.
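Eq. (11) can be sketched directly; `gamma` is the trainable weight vector, initialized to 1 as stated above.

```python
import torch

def joint_loss(l_idh, l_grade, gamma):
    """Sketch of Eq. (11) (Kendall et al., 2018): each task loss is
    scaled by 1/(2*gamma_i^2), plus a log regularizer that stops the
    weights from growing without bound."""
    scaled = sum(l / (2 * g ** 2) for l, g in zip((l_idh, l_grade), gamma))
    return scaled + torch.log(gamma.prod())

# gamma is a trainable parameter, initialized to 1 as in the paper.
gamma = torch.nn.Parameter(torch.ones(2))
loss = joint_loss(torch.tensor(0.8), torch.tensor(0.5), gamma)
```

Because `gamma` is a `Parameter`, gradient descent can trade off the two tasks automatically: a noisier task drives its `gamma_i` up, shrinking its effective weight.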

2.4 Implementation details

The training process is implemented using PyTorch, and experiments are conducted on NVIDIA 4090 GPUs. To enhance the generalization ability of model training, we apply random rotation, flipping, Gaussian noise, intensity transformation, and shifting to multi-modal MRIs. We use the Ranger (Wright and Demeure, 2021) optimizer with an initial learning rate of 2e-5, and the learning rate gradually decreases during training. The batch size is set to 4. The publicly available code and data for MTSAM can be found at https://github.com/mtsams/MTSAM.

2.5 Compared methods and evaluation metrics

To validate the superiority of MTSAM, we compare it with several methods for IDH genotyping and grading in gliomas. We ensure consistency across all compared methods and MTSAM in implementation details, including training settings, data augmentation strategies, and hardware environment. Specifically, regarding methods for glioma IDH genotyping, Tan et al. (2019) used a support vector machine to model HCR features. Yang et al. (2025) used Swin Transformer to extract deep features from MRI slices for IDH genotyping. Zhang H. et al. (2023) adopted a CNN+LSTM-based neural network to extract deep features from slices of multi-modal MRIs, respectively, and concatenated these deep features with HCR features for glioma IDH genotyping. Li et al. (2025) proposed DLRN, which uses MRI slices as input, extracts deep features using a fine-tuned pre-trained ResNet-101, and concatenates these features with HCR features for IDH genotyping using an SVM. Regarding methods for glioma grading, Qin et al. (2025) used an FFN to extract deep-level information from HCR features. Wu et al. (2023) proposed AGCN, which explores channel and spatial information in multi-modal MRI through a dual-domain attention mechanism and combines multi-scale features obtained using multi-branch convolution for glioma grading. Bijari et al. (2025) used three convolutional blocks to extract deep features from T1 images, which were then concatenated with HCR features for grading using logistic regression. Regarding methods for both glioma IDH genotyping and grading, Sudre et al. (2020) used a random forest with HCR and clinical features as inputs to perform these tasks separately. Sairam et al. (2023) used InceptionV3, with T1, T2, and FLAIR slices as inputs, to simultaneously perform glioma IDH genotyping and grading.

To evaluate the performance of the model, we use the area under the curve (AUC), accuracy (ACC), F1_score, and their 95% confidence intervals (CI) for the quantitative assessment of the model's IDH genotyping and grading, which can be represented as follows:

$ACC = \frac{TP+TN}{TP+TN+FP+FN},$    (12)
$F1\_Score = \frac{2 \times TP}{2TP+FP+FN},$    (13)

where True Positives (TP) are the number of cases in which the model correctly predicts the positive class. True Negatives (TN) are the number of correctly predicted negative cases. False Positives (FP) are the number of incorrectly predicted positive cases, and False Negatives (FN) are the number of incorrectly predicted negative cases.
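Eqs. (12)-(13) can be computed from a binary confusion matrix as follows; the function name is ours.

```python
import numpy as np

def acc_f1(y_true, y_pred):
    """Accuracy (Eq. 12) and F1 score (Eq. 13) for binary labels."""
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    acc = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return acc, f1
```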

3 Results

3.1 Comparison with other methods

As shown in Table 2, MTSAM achieves AUCs of 92.38 and 94.31% for IDH genotyping and grading, respectively, on the UCSF-PDGM dataset, and AUCs of 91.56 and 93.37% on the BraTS2020 dataset, outperforming other methods. Specifically, compared with DLRN (Li et al., 2025), which uses multi-view information for IDH genotyping, MTSAM improves the AUC in IDH genotyping by 8.49 and 6.34%, respectively. Compared with the method of Bijari et al. (2025), which uses multi-view information for grading, MTSAM improves the AUC in grading by 9.62 and 5.07%, respectively. This may be attributed to MTSAM's effective exploration of the complementary information between IDH genotyping and grading. Compared with the multi-task method proposed by Sairam et al. (2023), MTSAM improves the AUC in IDH genotyping by 11.48 and 9.89%, respectively, and in grading by 9.63 and 8.2%, respectively, which could be because MTSAM effectively explores the prior knowledge that the medical foundational model has learned from large-scale medical data.


Table 2. Performance comparison with other methods.

3.2 Ablation study

3.2.1 Effectiveness of T2-FLAIR mismatch features

To validate the effectiveness of T2-FLAIR mismatch features for glioma IDH genotyping and grading on the UCSF-PDGM dataset, we compare MTSAM with two variants: one that extracts T2-FLAIR mismatch features by direct subtraction without MFEB (w/o MFEB) and one that replaces the DB in MFEB with a convolutional layer (replace DB in MFEB). As shown in Table 3, replacing DB in MFEB with convolution reduces the ACC for both glioma IDH genotyping and grading by 1.02%. This may be because dilated convolutions can more effectively capture tumor information by simulating large-kernel convolutions, which is consistent with the fact that large-field-of-view convolutions facilitate glioma diagnosis (Wu et al., 2023; Zhu et al., 2025), thereby better capturing complementary information in T2 and FLAIR images. Additionally, without using MFEB to extract T2-FLAIR mismatch features, the AUCs for IDH genotyping and grading decrease by 2.09 and 1.73%, respectively, validating the effectiveness of MFEB in extracting T2-FLAIR mismatch features.


Table 3. Ablation study of MTSAM.

3.2.2 Effectiveness of customized SAM-Med3D

To validate the effectiveness of fine-tuning SAM-Med3D by fusing multi-view features to extract deep features for IDH genotyping and grading, we compare MTSAM with the methods on the UCSF-PDGM dataset without using MVAdapter to fine-tune SAM-Med3D (w/o MVAdapter), without using HCR and clinical features in MVAdapter to fine-tune SAM-Med3D (w/o H in MVAdapter), and without using MRI features in MVAdapter to fine-tune SAM-Med3D (w/o M in MVAdapter). As shown in Table 3, without using MVAdapter to fine-tune SAM-Med3D for extracting deep features, the AUC of IDH genotyping and grading decreases by 18.94 and 27.64%, respectively, indicating that customized SAM-Med3D effectively exploits the prior knowledge learned by foundational medical image models from large-scale medical data.

3.2.3 Effectiveness of multi-task learning

To validate the effectiveness of performing IDH genotyping and grading simultaneously, we compare MTSAM with methods that do not perform IDH genotyping (w/o IDH genotyping) or grading (w/o grading). As shown in Table 3, when IDH genotyping and grading are performed simultaneously, MTSAM achieves the best performance, validating that MTSAM can effectively exploit the complementary information across tasks.

3.3 Visualization analysis of MTSAM

To analyze the attention maps generated by MTSAM, we employ Grad-CAM (Selvaraju et al., 2020) to visualize the attention maps of the non-fine-tuned and fine-tuned SAM-Med3D, as well as of the variants that do not conduct IDH genotyping or grading. To ensure a fair comparison, the attention values across all visualization results are constrained to the same range, with red regions representing higher attention values and blue regions representing lower ones. As shown in Figures 3a, f, the blue, red, and green regions correspond to the necrotic tumor cores, enhancing tumors, and edema regions of gliomas, respectively. As shown in Figures 3b, c, g, h, when IDH genotyping or grading of gliomas is performed in isolation, the network tends to focus on non-tumor regions, underscoring the effectiveness of simultaneously performing both tasks. According to the WHO Classification of Tumors of the Central Nervous System (Wen and Packer, 2021; Horbinski et al., 2022), tumor necrotic regions are key indicators for glioma diagnosis. Specifically, low-grade gliomas have smaller tumor necrotic regions due to their low proliferation rate and sufficient blood supply. In contrast, patients with high-grade gliomas have larger tumor necrotic regions. Furthermore, highly invasive tumor cells are often present around the necrotic foci of gliomas, which is a key factor in determining the prognosis of gliomas (Ratliff et al., 2023; Markwell et al., 2022). As shown in Figures 3d, e, i, j, MTSAM with fine-tuned SAM-Med3D produces more focused attention, concentrated mainly in the necrotic tumor core regions, which contain rich prognostic information (Greenwald et al., 2024; Markwell et al., 2022; Liu et al., 2025), highlighting the effectiveness of MTSAM in glioma diagnosis.


Figure 3. The visualization of attention maps in MTSAM. In MRIs, the blue, red, and green regions correspond to the necrotic tumor cores, enhancing tumors, and edema regions of gliomas, respectively.

3.4 Visual comparison of MTSAM with other methods

To further verify the effectiveness of MTSAM, we visualize MTSAM's attention map and compare it with those of other methods, including Yang et al. (2025), AGGN (Wu et al., 2023), and Sairam et al. (2023). As shown in Figure 4, the attention of the other methods is scattered, whereas MTSAM's attention is concentrated on tumor regions, primarily the necrotic regions of the tumor. This may be because MTSAM effectively leverages the prior knowledge that the foundational model SAM-Med3D has learned from large-scale data.


Figure 4. Visualization of attention maps from different methods.

3.5 Robustness validation of MTSAM

To validate the robustness of MTSAM, we conduct external validation. We collect 143 samples from TCGA-LGG (Pedano et al., 2016) and TCGA-GBM (Scarpace et al., 2016) that include glioma segmentation, IDH status, and grade information, and use them to externally validate MTSAM and other multi-task methods, including Sairam et al. (2023) and Sudre et al. (2020), all trained on the UCSF-PDGM dataset. As shown in Table 4, MTSAM achieves AUCs of 83.28% and 80.33% for IDH genotyping and grading, respectively, on the TCGA dataset, outperforming the other methods and validating its robustness.
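For reference, the AUC values reported here are rank statistics: the ROC AUC equals the Mann-Whitney probability that a randomly chosen positive case is scored above a randomly chosen negative one. A minimal sketch of that computation (illustrative only, not our evaluation code):

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive sample is scored above a random negative one."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()   # pairwise wins
    ties = (pos[:, None] == neg[None, :]).sum()     # ties count as half
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# A scorer that ranks every positive above every negative has AUC 1.0.
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # → 1.0
```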


Table 4. Validation of MTSAM on the TCGA Dataset.

3.6 Effectiveness of MFEB

To validate the effectiveness of MFEB, we visualize the direct subtraction results of T2 and FLAIR images, the T2-FLAIR mismatch features obtained via MFEB, and the attention maps of MTSAM on the MFEB-derived T2-FLAIR mismatch features. As shown in Figure 5, we observe that the T2-FLAIR mismatch features obtained via MFEB exhibit higher feature values near the tumor than the direct subtraction results. Additionally, the regions MTSAM focuses on in the T2-FLAIR mismatch feature maps are also around the tumor. This result demonstrates the potential of MFEB in exploring deep representations and validates its effectiveness in extracting T2-FLAIR mismatch features.
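As a rough illustration of the weighted-subtraction idea behind these mismatch features, the sketch below uses fixed scalar weights. In MFEB itself the weighting is produced by learned (dilated and standard) convolutions over each modality, so the scalar values here are placeholder assumptions:

```python
import numpy as np

def mismatch_features(t2, flair, w_t2=0.6, w_flair=0.4):
    """Toy T2-FLAIR mismatch map: a weighted difference of the two volumes.

    In MFEB the per-voxel weighting comes from learned convolutions; the
    fixed scalar weights here are placeholders for illustration only."""
    return w_t2 * np.asarray(t2, dtype=float) - w_flair * np.asarray(flair, dtype=float)

# Toy volumes: a T2-hyperintense region over a uniform FLAIR background.
t2 = np.full((2, 2, 2), 2.0)
flair = np.ones((2, 2, 2))
mm = mismatch_features(t2, flair)   # high values flag candidate mismatch voxels
```

The learned weighting is what lets MFEB suppress background differences that a direct, unweighted subtraction would keep, which is the contrast visualized in Figure 5.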


Figure 5. Visualization of T2-FLAIR mismatch feature.

3.7 Selection of foundational model

To validate the effectiveness of using the medical foundational model SAM-Med3D for feature extraction in IDH genotyping and grading, we compare it with other medical foundational models on the UCSF-PDGM dataset. To ensure a fair comparison, all foundational models, including SAM-Med2D (Cheng et al., 2023) and FastSAM3D (Shen et al., 2024), are fine-tuned using MVAdapter. As shown in Table 5, when the fine-tuned FastSAM3D, derived from SAM-Med3D via knowledge distillation, is used for glioma IDH genotyping and grading, the AUCs decrease by 0.63% and 0.56%, respectively. This may be because SAM-Med3D has learned effective prior knowledge from large-scale medical data.
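FastSAM3D is obtained from SAM-Med3D via knowledge distillation. As background, the standard soft-label distillation objective (a Hinton-style sketch, not necessarily FastSAM3D's exact training recipe) minimizes the KL divergence between temperature-softened teacher and student outputs:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """Soft-label KD loss: KL(teacher || student) on temperature-softened
    distributions, scaled by T^2 as is conventional."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

# Identical logits → zero loss; the student has fully matched the teacher.
print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # → 0.0
```

A distilled student is smaller and faster but can shed part of the teacher's prior knowledge, which is consistent with the small AUC drop observed for FastSAM3D in Table 5.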


Table 5. Comparison with different foundational models.

3.8 Comparison with other fine-tuning methods

To validate the effectiveness of fine-tuning SAM-Med3D using MVAdapter, we compare it with other fine-tuning methods, including BitFit (Ben Zaken et al., 2021), LoRA (Hu et al., 2022), and side-tuning (Zhang et al., 2020), on the UCSF-PDGM dataset. As shown in Table 6, when SAM-Med3D is fine-tuned using side-tuning, the AUCs for IDH genotyping and grading decrease by 4.81% and 9.82%, respectively. This may be because MVAdapter effectively exploits the complementary information among multi-view features to fine-tune SAM-Med3D and explore deep features.
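Among the baselines, LoRA freezes the pretrained weights and learns only a low-rank additive update. A minimal NumPy sketch of the idea (illustrative dimensions, not the configuration used in our experiments):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=8.0, r=4):
    """LoRA layer: y = x @ (W + (alpha/r) * A @ B), with W frozen and only
    the low-rank factors A (d_in, r) and B (r, d_out) trainable."""
    return x @ W + (alpha / r) * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 6, 5, 4
W = rng.standard_normal((d_in, d_out))     # frozen pretrained weight
A = rng.standard_normal((d_in, r)) * 0.01  # small random init
B = np.zeros((r, d_out))                   # zero init → adapter starts as a no-op
x = rng.standard_normal((3, d_in))
y = lora_forward(x, W, A, B)               # equals x @ W before any training
```

The zero initialization of B means the adapted model starts out exactly reproducing the frozen model, so fine-tuning begins from the pretrained behavior.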


Table 6. Comparison with different fine-tuning methods.

3.9 Feature interpretability of MTSAM

To explain the decision-making process of MTSAM, we pool the 3D MRI features output by the transformer block of the fine-tuned SAM-Med3D across channels, index them sequentially, concatenate them with the 1D HCR and clinical features, and perform SHAP analysis via linear regression on the UCSF-PDGM dataset. As shown in Figure 6a, for IDH genotyping, age is the most important feature: the larger its value, the more likely the glioma is predicted to be IDH-wildtype, which is consistent with the study by Reuss et al. (2015). In addition, the 143rd 3D MRI feature is important for IDH genotyping: the larger its value, the more likely the glioma is predicted to be IDH-mutant. As shown in Figure 6b, for glioma grading, lbp-2D_glszm_HighGrayLevelZoneEmphasis is the most important feature: the larger its value, the more likely the glioma is predicted to be lower grade, which is consistent with the study by Mapelli et al. (2022). Conversely, the 153rd 3D MRI feature is also important for grading: the larger its value, the more likely the glioma is to be higher grade.
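Because the SHAP analysis uses a linear surrogate, the attributions have a closed form: for a linear model with (assumed) independent features, the SHAP value of feature i on sample x is w_i (x_i − mean(x_i)). A self-contained sketch of this analysis on synthetic data (hypothetical features, not our actual feature set):

```python
import numpy as np

def linear_shap(X, y):
    """SHAP values for a least-squares linear model. Under the
    feature-independence assumption, phi_i(x) = w_i * (x_i - mean(x_i))."""
    Xc = np.column_stack([X, np.ones(len(X))])      # add intercept column
    coef, *_ = np.linalg.lstsq(Xc, y, rcond=None)   # fit the linear surrogate
    w = coef[:-1]                                   # drop the intercept
    return (X - X.mean(axis=0)) * w                 # per-sample, per-feature SHAP

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5            # feature 2 is irrelevant
phi = linear_shap(X, y)
importance = np.abs(phi).mean(axis=0)              # mean |SHAP| = feature ranking
```

Ranking features by mean absolute SHAP value, as in the last line, is exactly what the beeswarm plots in Figure 6 summarize: the sign of each point shows the direction of the effect, and the spread shows its magnitude.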


Figure 6. The top 10 most important 3D MRI, 1D HCR, and clinical features in MTSAM obtained by the SHAP method on the UCSF-PDGM dataset. The vertical axis represents the importance ranking of features, and the horizontal axis represents the SHAP value. Each point in the figure represents a sample, with colors reflecting the feature value from low (blue) to high (red). (A) Glioma IDH genotyping. (B) Glioma grading.

4 Discussion

This study proposes a network, MTSAM, for simultaneous glioma IDH genotyping and grading using non-invasive data. Previous studies (Yu et al., 2024; Lim-Fat et al., 2022) mainly rely on invasive methods such as biopsies and tissue resections. In contrast, MTSAM utilizes non-invasive features, including multi-modal MRIs, which can effectively reduce patient discomfort and surgical risks. Specifically, MTSAM uses multi-view data to fine-tune the foundational model SAM-Med3D, aiming to accurately perform glioma IDH genotyping and grading. We validate MTSAM on the UCSF-PDGM and BraTS2020 datasets, and Table 2 shows that MTSAM outperforms other existing methods in glioma IDH genotyping and grading, demonstrating its potential in glioma diagnosis. Furthermore, Figure 4 visualizes the attention maps of MTSAM and other methods, validating MTSAM's potential for mining diagnosis-related features and demonstrating that leveraging the knowledge learned by medical foundational models from large-scale data may improve glioma diagnosis.

We validate the effectiveness of each module in MTSAM. Table 3 shows that when the MFEB is not used for T2-FLAIR mismatch feature extraction, the performance of MTSAM in glioma IDH genotyping and grading decreases, which reveals the potential of MFEB in mining T2-FLAIR mismatch features. Furthermore, Figure 5 demonstrates that, compared with the direct subtraction of T2 and FLAIR images, the T2-FLAIR mismatch features extracted by MFEB more clearly highlight the mismatch phenomenon around the tumor region. Additionally, Table 3 indicates that using MVAdapter to fine-tune the medical foundational model SAM-Med3D with multi-view information effectively improves glioma diagnostic performance, which may be because multi-view information provides the model with more comprehensive and richer feature perspectives, thereby mitigating the limitations of a single view. Moreover, Table 6 compares MVAdapter with other fine-tuning methods, validating the potential of MVAdapter in fine-tuning medical foundational models.

Figure 6 demonstrates the interpretability of MTSAM and visualizes the contributions of MRI, HCR, and clinical features to glioma diagnosis. Specifically, we find that age is the most important feature for glioma IDH genotyping, as higher age is associated with a higher likelihood of IDH-wildtype gliomas. This is consistent with the fact that IDH-mutant gliomas often occur in young and middle-aged adults with active cell proliferation (Yamasaki, 2022; Weller et al., 2024). In contrast, the development of IDH-wildtype gliomas relies on the synergistic activation of multistep oncogenic pathways, such as EGFR amplification and TERT promoter mutation. These alterations depend on the long-term accumulation of DNA damage and a decline in cellular repair capacity, mostly correspond to gliomas with higher malignancy such as glioblastoma, and carry a risk that increases with age. For glioma grading, lbp-2D_glszm_HighGrayLevelZoneEmphasis is the most important feature, as higher values of this feature are associated with a higher likelihood of low-grade gliomas. This may be because lbp-2D_glszm_HighGrayLevelZoneEmphasis quantifies large, continuous high-gray-level regions in images. Low-grade gliomas exhibit low malignancy, characterized by low cell density, minimal nuclear atypia, no obvious necrosis or angiogenesis, and a more homogeneous tissue composition, which appears on MRI as continuous, large-scale, uniformly distributed high-gray-level tumor parenchyma, aligning with what this feature quantifies.

MTSAM exhibits considerable clinical value, with an inference time of only 0.82 s per sample, indicating high efficiency and suggesting that it can rapidly provide glioma diagnostic results to support treatment decisions. Although MTSAM has achieved some progress in glioma IDH genotyping and grading, it still has several limitations. First, real-world clinical settings involve substantial variability in scanners, imaging protocols, and patient populations, and the model has not yet been validated on larger datasets. Second, its applicability to other tumor types remains unproven. Therefore, we plan to collect more data from real clinical scenarios and conduct validation across multiple tumor types to further verify the robustness and effectiveness of MTSAM.

5 Conclusion

In this study, we propose a multi-task network, MTSAM, based on the customized SAM-Med3D. MTSAM fuses multi-view information to fine-tune SAM-Med3D in order to extract deep features from multi-modal MRI and T2-FLAIR mismatch features for glioma IDH genotyping and grading. MTSAM achieves the best performance on the publicly available UCSF-PDGM and BraTS2020 datasets. Additionally, we use Grad-CAM to verify that MVAdapter fine-tunes SAM-Med3D by fusing multi-view information, enabling the network to effectively focus on the regions of gliomas that contain rich prognostic information.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

XL: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. HL: Conceptualization, Data curation, Investigation, Methodology, Resources, Validation, Visualization, Writing – original draft, Writing – review & editing. YH: Conceptualization, Data curation, Investigation, Resources, Supervision, Writing – original draft, Writing – review & editing. JZ: Conceptualization, Data curation, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing. LW: Conceptualization, Investigation, Methodology, Visualization, Writing – review & editing. XY: Conceptualization, Data curation, Investigation, Methodology, Visualization, Writing – review & editing.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Akpinar, E., and Oduncuoglu, M. (2025). Hybrid classical and quantum computing for enhanced glioma tumor classification using TCGA data. Sci. Rep. 15:25935. doi: 10.1038/s41598-025-97067-3


Ben Zaken, E., Ravfogel, S., and Goldberg, Y. (2021). Bitfit: simple parameter-efficient fine-tuning for transformer-based masked language-models. arXiv [preprints]. arXiv:2106.10199. doi: 10.48550/arXiv.2106.10199


Bijari, S., Rezaeijo, S. M., Sayfollahi, S., Rahimnezhad, A., and Heydarheydari, S. (2025). Development and validation of a robust MRI-based nomogram incorporating radiomics and deep features for preoperative glioma grading: a multi-center study. Quant. Imaging Med. Surg. 15:1125. doi: 10.21037/qims-24-1543


Braun, Y., Filipski, K., Bernatz, S., Baumgarten, P., Roller, B., Zinke, J., et al. (2021). Linking epigenetic signature and metabolic phenotype in IDH mutant and IDH wildtype diffuse glioma. Neuropathol. Appl. Neurobiol. 47, 379–393. doi: 10.1111/nan.12669


Calabrese, E., Villanueva-Meyer, J. E., Rudie, J. D., Rauschecker, A. M., Baid, U., Bakas, S., et al. (2022). The university of California San Francisco preoperative diffuse glioma MRI dataset. Radiol. Artif. Intell. 4:e220058. doi: 10.1148/ryai.220058


Cheng, J., Liu, J., Kuang, H., and Wang, J. (2022). A fully automated multimodal MRI-based multi-task learning for glioma segmentation and IDH genotyping. IEEE Trans. Med. Imaging 41, 1520–1532. doi: 10.1109/TMI.2022.3142321


Cheng, J., Liu, J., Yue, H., Bai, H., Pan, Y., Wang, J., et al. (2020). Prediction of glioma grade using intratumoral and peritumoral radiomic features from multiparametric MRI images. IEEE/ACM Trans. Comput. Biol. Bioinform. 19, 1084–1095. doi: 10.1109/TCBB.2020.3033538


Cheng, J., Ye, J., Deng, Z., Chen, J., Li, T., Wang, H., et al. (2023). SAM-Med2D. arXiv [Preprint]. arXiv:2308.16184. doi: 10.48550/arXiv.2308.16184


Dayarathna, S., Islam, K. T., Uribe, S., Yang, G., Hayat, M., Chen, Z., et al. (2024). Deep learning based synthesis of MRI, CT and PET: review and analysis. Med. Image Anal. 92:103046. doi: 10.1016/j.media.2023.103046


Esquenazi, Y., Moussazadeh, N., Link, T. W., Hovinga, K. E., Reiner, A. S., DiStefano, N. M., et al. (2018). Thalamic glioblastoma: clinical presentation, management strategies, and outcomes. Neurosurgery 83, 76–85. doi: 10.1093/neuros/nyx349


Greenwald, A. C., Darnell, N. G., Hoefflin, R., Simkin, D., Mount, C. W., Castro, L. N. G., et al. (2024). Integrative spatial analysis reveals a multi-layered organization of glioblastoma. Cell 187, 2485–2501. doi: 10.1016/j.cell.2024.03.029


Han, S., Liu, Y., Cai, S. J., Qian, M., Ding, J., Larion, M., et al. (2020). IDH mutation in glioma: molecular mechanisms and potential therapeutic targets. Br. J. Cancer 122, 1580–1589. doi: 10.1038/s41416-020-0814-x


Han, Z., Chen, Q., Zhang, L., Mo, X., You, J., Chen, L., et al. (2022). Radiogenomic association between the t2-flair mismatch sign and IDH mutation status in adult patients with lower-grade gliomas: an updated systematic review and meta-analysis. Eur. Radiol. 32, 5339–5352. doi: 10.1007/s00330-022-08607-8


He, Y., Huang, F., Jiang, X., Nie, Y., Wang, M., Wang, J., et al. (2024). Foundation model for advancing healthcare: challenges, opportunities and future directions. IEEE Rev. Biomed. Eng. 18, 172–191. doi: 10.1109/RBME.2024.3496744


Horbinski, C., Berger, T., Packer, R. J., and Wen, P. Y. (2022). Clinical implications of the 2021 edition of the who classification of central nervous system tumours. Nat. Rev. Neurol. 18, 515–529. doi: 10.1038/s41582-022-00679-w


Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., et al. (2022). LoRA: low-rank adaptation of large language models. arXiv [Preprint]. arXiv:2106.09685. doi: 10.48550/arXiv.2106.09685


Jayaram, M. A., and Phillips, J. J. (2024). Role of the microenvironment in glioma pathogenesis. Ann. Rev. Pathol. Mech. Dis. 19, 181–201. doi: 10.1146/annurev-pathmechdis-051122-110348


Jeon, Y. H., Choi, K. S., Lee, K. H., Jeong, S. Y., Lee, J. Y., Ham, T., et al. (2025). Deep learning-based quantification of t2-flair mismatch sign: extending IDH mutation prediction in adult-type diffuse lower-grade glioma. Eur. Radiol. 35, 5193–5202. doi: 10.1007/s00330-025-11475-7


Jusue-Torres, I., Lee, J., Germanwala, A. V., Burns, T. C., and Parney, I. F. (2023). Effect of extent of resection on survival of patients with glioblastoma, IDH-wild-type, who grade 4 (who 2021): systematic review and meta-analysis. World Neurosurg. 171, e524–e532. doi: 10.1016/j.wneu.2022.12.052


Kendall, A., Gal, Y., and Cipolla, R. (2018). “Multi-task learning using uncertainty to weigh losses for scene geometry and semantics,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Salt Lake City, UT: IEEE), 7482–7491. doi: 10.1109/CVPR.2018.00781


Khan, M. T., Prajapati, B., Lakhina, S., Sharma, M., Prajapati, S., Chosdol, K., et al. (2021). Identification of gender-specific molecular differences in glioblastoma (GBM) and low-grade glioma (LGG) by the analysis of large transcriptomic and epigenomic datasets. Front. Oncol. 11:699594. doi: 10.3389/fonc.2021.699594


Komori, T. (2022). Grading of adult diffuse gliomas according to the 2021 who classification of tumors of the central nervous system. Lab. Invest. 102, 126–133. doi: 10.1038/s41374-021-00667-6


Lee, M. D., Jain, R., Galbraith, K., Chen, A., Lieberman, E., Patel, S. H., et al. (2024). T2-flair mismatch sign predicts dna methylation subclass and cdkn2a/b status in IDH-mutant astrocytomas. Clin. Cancer Res. 30, 3512–3519. doi: 10.1158/1078-0432.CCR-24-0311


Li, D., Hu, W., Ma, L., Yang, W., Liu, Y., Zou, J., et al. (2025). Deep learning radiomics nomograms predict isocitrate dehydrogenase (IDH) genotypes in brain glioma: a multicenter study. Magn. Reson. Imaging 117:110314. doi: 10.1016/j.mri.2024.110314


Li, T., Li, J., Chen, Z., Zhang, S., Li, S., Wageh, S., et al. (2022a). Glioma diagnosis and therapy: current challenges and nanomaterial-based solutions. J. Controlled Release 352, 338–370. doi: 10.1016/j.jconrel.2022.09.065


Li, Y., Liu, Y., Liang, Y., Wei, R., Zhang, W., Yao, W., et al. (2022b). Radiomics can differentiate high-grade glioma from brain metastasis: a systematic review and meta-analysis. Eur. Radiol. 32, 8039–8051. doi: 10.1007/s00330-022-08828-x


Lim-Fat, M. J., Youssef, G. C., Touat, M., Iorgulescu, J. B., Whorral, S., Allen, M., et al. (2022). Clinical utility of targeted next-generation sequencing assay in IDH-wildtype glioblastoma for therapy decision-making. Neurooncology 24, 1140–1149. doi: 10.1093/neuonc/noab282


Liu, J., Jiang, S., Wu, Y., Zou, R., Bao, Y., Wang, N., et al. (2025). Deep learning-based radiomics and machine learning for prognostic assessment in IDH-wildtype glioblastoma after maximal safe surgical resection: a multicenter study. Int. J. Surg. 111, 4576–4585. doi: 10.1097/JS9.0000000000002488


Liu, Q., Jiang, P., Jiang, Y., Ge, H., Li, S., Jin, H., et al. (2019). Prediction of aneurysm stability using a machine learning model based on pyradiomics-derived morphological features. Stroke 50, 2314–2321. doi: 10.1161/STROKEAHA.119.025777


Mahajan, S., Suri, V., Sahu, S., Sharma, M. C., and Sarkar, C. (2022). World health organization classification of tumors of the central nervous system 5th edition (WHO CNS5): what's new? Indian J. Pathol. Microbiol. 65, S5-S13. doi: 10.4103/ijpm.ijpm_48_22


Mapelli, P., Bezzi, C., Palumbo, D., Canevari, C., Ghezzo, S., Samanes Gajate, A., et al. (2022). 68ga-dotatoc PET/mr imaging and radiomic parameters in predicting histopathological prognostic factors in patients with pancreatic neuroendocrine well-differentiated tumours. Eur. J. Nucl. Med. Mol. Imaging 49, 2352–2363. doi: 10.1007/s00259-022-05677-0


Markwell, S. M., Ross, J. L., Olson, C. L., and Brat, D. J. (2022). Necrotic reshaping of the glioma microenvironment drives disease progression. Acta Neuropathol. 143, 291–310. doi: 10.1007/s00401-021-02401-4


Menze, B. H., Jakab, A., Bauer, S., Kalpathy-Cramer, J., Farahani, K., Kirby, J., et al. (2014). The multimodal brain tumor image segmentation benchmark (brats). IEEE Trans. Med. Imaging 34, 1993–2024. doi: 10.1109/TMI.2014.2377694


Moor, M., Banerjee, O., Abad, Z. S. H., Krumholz, H. M., Leskovec, J., Topol, E. J., et al. (2023). Foundation models for generalist medical artificial intelligence. Nature 616, 259–265. doi: 10.1038/s41586-023-05881-4


Murnan, K. M., Horbinski, C., and Stegh, A. H. (2023). Redox homeostasis and beyond: the role of wild-type isocitrate dehydrogenases for the pathogenesis of glioblastoma. Antioxid. Redox Signal. 39, 923–941. doi: 10.1089/ars.2023.0262


Navarria, P., Pessina, F., Clerici, E., Bellu, L., Franzese, C., Franzini, A., et al. (2022). Re-irradiation for recurrent high grade glioma (HGG) patients: results of a single arm prospective phase 2 study. Radiother. Oncol. 167, 89–96. doi: 10.1016/j.radonc.2021.12.019


Nicholson, J. G., and Fine, H. A. (2021). Diffuse glioma heterogeneity and its therapeutic implications. Cancer Discov. 11, 575–590. doi: 10.1158/2159-8290.CD-20-1474


Olar, A., Wani, K. M., Alfaro-Munoz, K. D., Heathcock, L. E., van Thuijl, H. F., Gilbert, M. R., et al. (2015). IDH mutation status and role of who grade and mitotic index in overall survival in grade ii-iii diffuse gliomas. Acta Neuropathol. 129, 585–596. doi: 10.1007/s00401-015-1398-z


Park, S. I., Suh, C. H., Guenette, J. P., Huang, R. Y., and Kim, H. S. (2021). The t2-flair mismatch sign as a predictor of IDH-mutant, 1p/19q-noncodeleted lower-grade gliomas: a systematic review and diagnostic meta-analysis. Eur. Radiol. 31, 5289–5299. doi: 10.1007/s00330-020-07467-4


Pedano, N., Flanders, A. E., Scarpace, L., Mikkelsen, T., Eschbacher, J. M., Hermes, B., et al. (2016). The Cancer Genome Atlas Low Grade Glioma Collection (TCGA-LGG) [Dataset]. The Cancer Imaging Archive. Available online at: https://www.cancerimagingarchive.net/collection/tcga-lgg


Qin, Y., You, W., Wang, Y., Zhang, Y., Xu, Z., Li, Q., et al. (2025). Magnetic resonance imaging radiomics-driven artificial neural network model for advanced glioma grading assessment. Medicina 61:1034. doi: 10.3390/medicina61061034


Ratliff, M., Karimian-Jazi, K., Hoffmann, D. C., Rauschenbach, L., Simon, M., Hai, L., et al. (2023). Individual glioblastoma cells harbor both proliferative and invasive capabilities during tumor progression. Neurooncology 25, 2150–2162. doi: 10.1093/neuonc/noad109


Reuss, D. E., Mamatjan, Y., Schrimpf, D., Capper, D., Hovestadt, V., Kratz, A., et al. (2015). IDH mutant diffuse and anaplastic astrocytomas have similar age at presentation and little difference in survival: a grading problem for who. Acta Neuropathol. 129, 867–873. doi: 10.1007/s00401-015-1438-8


Sairam, V., Bhaskar, N., and Tupe-Waghmare, P. (2023). “Automated glioma grading and IDH mutation status prediction using cnn-based deep learning models,” in International Conference on Robotics, Control, Automation and Artificial Intelligence (Cham: Springer), 391–400. doi: 10.1007/978-981-97-4650-7_29


Santos, J. L. M., Aljuboori, Z., Richardson, A. M., Hanalioglu, S., Peker, H. O., Aydin, I., et al. (2024). Microsurgical anatomy and approaches to thalamic gliomas. part 1: a cartography guide for navigating to the thalamus. integrating 3d model rendering with anatomical dissections. J. Neurosurg. 1, 1–15. doi: 10.3171/2024.3.JNS232049


Scarpace, L., Mikkelsen, T., Cha, S., Rao, S., Tekchandani, S., Gutman, D., et al. (2016). The Cancer Genome Atlas Glioblastoma Multiforme Collection (TCGA-GBM). The Cancer Imaging Archive. Available online at: https://www.cancerimagingarchive.net/collection/tcga-gbm


Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D., et al. (2020). Grad-CAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. 128, 336–359. doi: 10.1007/s11263-019-01228-7


Shen, Y., Li, J., Shao, X., Inigo Romillo, B., Jindal, A., Dreizin, D., et al. (2024). “Fastsam3d: an efficient segment anything model for 3d volumetric medical images,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Cham: Springer), 542–552. doi: 10.1007/978-3-031-72390-2_51


Stewart, E. E., Valsecchi, M., and Schütz, A. C. (2020). A review of interactions between peripheral and foveal vision. J. Vis. 20, 2–2. doi: 10.1167/jov.20.12.2


Sudre, C. H., Panovska-Griffiths, J., Sanverdi, E., Brandner, S., Katsaros, V. K., Stranjalis, G., et al. (2020). Machine learning assisted DSC-MRI radiomics as a tool for glioma classification by grade and mutation status. BMC Med. Inform. Decis. Mak. 20:149. doi: 10.1186/s12911-020-01163-5


Tan, Y., Zhang, S.-t., Wei, J.-w., Dong, D., Wang, X.-c., Yang, G.-q., et al. (2019). A radiomics nomogram may improve the prediction of IDH genotype for astrocytoma before surgery. Eur. Radiol. 29, 3325–3337. doi: 10.1007/s00330-019-06056-4


Tang, W.-T., Su, C.-Q., Lin, J., Xia, Z.-W., Lu, S.-S., Hong, X.-N., et al. (2024). T2-flair mismatch sign and machine learning-based multiparametric MRI radiomics in predicting IDH mutant 1p/19q non-co-deleted diffuse lower-grade gliomas. Clin. Radiol. 79, e750-e758. doi: 10.1016/j.crad.2024.01.021


van den Bent, M. J., French, P. J., Brat, D., Tonn, J. C., Touat, M., Ellingson, B. M., et al. (2024). The biological significance of tumor grade, age, enhancement, and extent of resection in IDH-mutant gliomas: How should they inform treatment decisions in the era of IDH inhibitors? Neurooncology 26, 1805–1822. doi: 10.1093/neuonc/noae107


van Santwijk, L., Kouwenberg, V., Meijer, F., Smits, M., and Henssen, D. (2022). A systematic review and meta-analysis on the differentiation of glioma grade and mutational status by use of perfusion-based magnetic resonance imaging. Insights Imaging 13:102. doi: 10.1186/s13244-022-01230-7


Vettermann, F., Suchorska, B., Unterrainer, M., Nelwan, D., Forbrig, R., Ruf, V., et al. (2019). Non-invasive prediction of IDH-wildtype genotype in gliomas using dynamic 18f-fet PET. Eur. J. Nucl. Med. Mol. Imaging 46, 2581–2589. doi: 10.1007/s00259-019-04477-3


Wang, H., Guo, S., Ye, J., Deng, Z., Cheng, J., Li, T., et al. (2025). SAM-Med3D: a vision foundation model for general-purpose segmentation on volumetric medical images. IEEE Trans. Neural Netw. Learn. Syst. 36, 17599–17612. doi: 10.1109/TNNLS.2025.3586694


Wang, K. (2025). “A survey on sam-based methods for medical image segmentation,” in International Symposium on Artificial Intelligence Innovations (IS-AII 2025), Vol. 13681 (SPIE), 223–230. doi: 10.1117/12.3073583


Weller, M., Wen, P. Y., Chang, S. M., Dirven, L., Lim, M., Monje, M., et al. (2024). Glioma. Nat. Rev. Dis. Primers 10:33. doi: 10.1038/s41572-024-00516-y

Weller, M., Wick, W., Aldape, K., Brada, M., Berger, M., Pfister, S. M., et al. (2015). Glioma. Nat. Rev. Dis. Primers 1, 1–18. doi: 10.1038/nrdp.2015.17

Wen, P. Y., and Packer, R. J. (2021). The 2021 WHO classification of tumors of the central nervous system: clinical implications. Neuro. Oncol. 23, 1215–1217. doi: 10.1093/neuonc/noab120

Whitfield, B. T., and Huse, J. T. (2022). Classification of adult-type diffuse gliomas: impact of the World Health Organization 2021 update. Brain Pathol. 32:e13062. doi: 10.1111/bpa.13062

Willemink, M. J., Roth, H. R., and Sandfort, V. (2022). Toward foundational deep learning models for medical imaging in the new era of transformer networks. Radiol. Artif. Intell. 4:e210284. doi: 10.1148/ryai.210284

Wright, L., and Demeure, N. (2021). Ranger21: a synergistic deep learning optimizer. arXiv [Preprint]. arXiv:2106.13731. doi: 10.48550/arXiv.2106.13731

Wu, P., Wang, Z., Zheng, B., Li, H., Alsaadi, F. E., Zeng, N., et al. (2023). AGGN: attention-based glioma grading network with multi-scale feature extraction and multi-modal information fusion. Comput. Biol. Med. 152:106457. doi: 10.1016/j.compbiomed.2022.106457

Yamasaki, F. (2022). Adolescent and young adult brain tumors: current topics and review. Int. J. Clin. Oncol. 27, 457–464. doi: 10.1007/s10147-021-02084-7

Yang, Z., Zhang, P., Ding, Y., Deng, L., Zhang, T., Liu, Y., et al. (2025). Magnetic resonance imaging-based deep learning for predicting subtypes of glioma. Front. Neurol. 16:1518815. doi: 10.3389/fneur.2025.1518815

Yu, D., Zhong, Q., Xiao, Y., Feng, Z., Tang, F., Feng, S., et al. (2024). Combination of MRI-based prediction and CRISPR/Cas12a-based detection for IDH genotyping in glioma. NPJ Precis. Oncol. 8:140. doi: 10.1038/s41698-024-00632-8

Zhang, H., Fan, X., Zhang, J., Wei, Z., Feng, W., Hu, Y., et al. (2023). Deep-learning and conventional radiomics to predict IDH genotyping status based on magnetic resonance imaging data in adult diffuse glioma. Front. Oncol. 13:1143688. doi: 10.3389/fonc.2023.1143688

Zhang, J., Wu, J., Zhou, X. S., Shi, F., and Shen, D. (2023). Recent advancements in artificial intelligence for breast cancer: image augmentation, segmentation, diagnosis, and prognosis approaches. Semin. Cancer Biol. 96, 11–25. doi: 10.1016/j.semcancer.2023.09.001

Zhang, J. O., Sax, A., Zamir, A., Guibas, L., and Malik, J. (2020). “Side-tuning: a baseline for network adaptation via additive side networks,” in Computer Vision-ECCV 2020: 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part III 16 (Cham: Springer), 698–714. doi: 10.1007/978-3-030-58580-8_41

Zhang, S., and Metaxas, D. (2024). On the challenges and perspectives of foundation models for medical image analysis. Med. Image Anal. 91:102996. doi: 10.1016/j.media.2023.102996

Zhou, Q., Xue, C., Ke, X., and Zhou, J. (2022). Treatment response and prognosis evaluation in high-grade glioma: an imaging review based on MRI. J. Magn. Reson. Imaging 56, 325–340. doi: 10.1002/jmri.28103

Zhu, Z., Lu, S.-Y., Huang, T., Liu, L., and Liu, Z. (2025). “LKA: Large Kernel Adapter for enhanced medical image classification,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Cham: Springer), 394–404. doi: 10.1007/978-3-032-04978-0_38

Keywords: deep learning, glioma, grading, IDH genotyping, medical foundational model, multimodal MRI

Citation: Li X, Li H, Hu Y, Zhang J, Wang L and Yang X (2026) Customized SAM-Med3D with multi-view adapter and T2-FLAIR mismatch features for glioma IDH genotyping and grading. Front. Behav. Neurosci. 19:1705385. doi: 10.3389/fnbeh.2025.1705385

Received: 15 September 2025; Revised: 10 December 2025;
Accepted: 15 December 2025; Published: 12 January 2026.

Edited by:

Nuno Sousa, Centro Universitário de Jaguariúna (UniFAJ), Brazil

Reviewed by:

Miao Chang, The First Affiliated Hospital of China Medical University, China
Ana Coelho, University of Minho, Portugal

Copyright © 2026 Li, Li, Hu, Zhang, Wang and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yunyi Hu, huyunyi@ruc.edu.cn

†These authors have contributed equally to this work

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.