Radiomics analysis of pancreas based on dual-energy computed tomography for the detection of type 2 diabetes mellitus

Objective To utilize radiomics analysis on dual-energy CT images of the pancreas to establish a quantitative imaging biomarker for type 2 diabetes mellitus. Materials and methods In this retrospective study, 78 participants (45 with type 2 diabetes mellitus, 33 without) underwent a dual energy CT exam. Pancreas regions were segmented automatically using a deep learning algorithm. From these regions, radiomics features were extracted. Additionally, 24 clinical features were collected for each patient. Both radiomics and clinical features were then selected using the least absolute shrinkage and selection operator (LASSO) technique and then build classifies with random forest (RF), support vector machines (SVM) and Logistic. Three models were built: one using radiomics features, one using clinical features, and a combined model. Results Seven radiomic features were selected from the segmented pancreas regions, while eight clinical features were chosen from a pool of 24 using the LASSO method. These features were used to build a combined model, and its performance was evaluated using five-fold cross-validation. The best classifier type is Logistic and the reported area under the curve (AUC) values on the test dataset were 0.887 (0.73–1), 0.881 (0.715–1), and 0.922 (0.804–1) for the respective models. Conclusion Radiomics analysis of the pancreas on dual-energy CT images offers potential as a quantitative imaging biomarker in the detection of type 2 diabetes mellitus.


Introduction
The anatomy of the human pancreas is closely related to its endocrine and exocrine functions.In individuals diagnosed with type 1 or type 2 diabetes, alterations in the pancreas have been observed (1).Chronic inflammation of the pancreas can cause damage to the insulin-producing cells, potentially leading to the development of diabetes.Pancreatitis and type 2 diabetes share similar risk factors (1).Several imaging techniques, such as computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound (US), have been used in various studies to investigate pancreatic changes in individuals with diabetes.These studies aim to assess the size (diameter, area, or volume) as well as the fat content of the pancreas using these imaging modalities.The comparisons are made between individuals with type 1 diabetes (T1DM) and/or type 2 diabetes (T2DM) and healthy controls (2).
These imaging studies hold significant potential for providing reliable insights into diabetes mellitus (DM).Between 2008 and 2013, a total of 1,478 lean individuals without diabetes underwent CT scans (3).The presence of fatty pancreas was evaluated using a validated histological method, which measured the attenuation of the pancreas on CT scans conducted at the beginning of the study.Lower pancreas attenuation values indicate higher fat content in the pancreas (3).Furthermore, a study aimed to investigate abdominal CT biomarkers for type 2 diabetes mellitus using advanced automated deep learning techniques within a substantial clinical dataset (4).The analysis demonstrated a correlation between the diagnosis of type 2 diabetes mellitus and specific abdominal CT biomarkers, particularly measurements related to pancreatic CT attenuation and visceral fat (4).Dual-energy CT (DECT) has shown promise in providing more precise quantitative information compared to conventional methods by utilizing two different X-ray beams with distinct absorption characteristics (5).
Radiomics analysis has made significant advancements in the field of medical image analysis in recent years (6).This approach enables the extraction of numerous features from medical images, facilitating the quantification of phenotypic characteristics of tumors (7).These quantifications play a crucial role in various areas such as diagnosis, clinical prognosis, treatment selection, and decision support.By optimizing the selection of features and utilizing machine learning algorithms, it becomes possible to effectively differentiate between patients with similar outcome conditions and establish personalized prediction models based on scientific and data-driven analyses for treatment outcomes.The emerging field of radiomics also holds promise in identifying previously undetectable characteristics and assisting in the diagnosis and prediction of diabetes through pancreas imaging.For instance, a study focused on evaluating the diagnostic accuracy of a Dual-Energy Computed Tomography (DECT)-based technique that utilizes iodine quantification and fat fraction analysis for early detection of acute pancreatitis (8).The results indicated that DECT with iodine quantification exhibits higher sensitivity in diagnosing early acute pancreatitis compared to standard image evaluation methods (8).Furthermore, another study aimed to assess whether an AI model based on pancreas radiomics can identify the CT imaging pattern associated with type 2 diabetes (9).The findings showed that the model successfully detects the imaging pattern linked to type 2 diabetes.However, further enhancements and validation are necessary to evaluate its potential for identifying type 2 diabetes in the millions of CT scans conducted annually.
Our previous study investigated the clinical value of pancreatic fat fraction measured on DECT images for the detection of type 2 diabetes mellitus (10).Given the burgeoning evidence linking pancreatic imaging characteristics with DM and the advancements in imaging and analytical technologies, we hypothesize that radiomics analysis of dual-energy CT images of the pancreas can identify specific imaging biomarkers that quantitatively differentiate individuals with type 2 diabetes mellitus from non-diabetic controls.This approach aims to establish a novel, quantitative imaging biomarker for T2DM, leveraging the precision of DECT and the analytical power of radiomics to enhance early detection and intervention strategies for diabetes mellitus.

Study population
This retrospective study received approval from the Ethics Committee of Liuzhou Municipal Liutie Central Hospital (Approval No. 2021033), and the requirement for informed consent was waived.We conducted a search in our medical information database for cases that occurred between September 2021 and July 2022.Eligible patients meeting the following criteria were included in the study: (1) those who underwent dual-energy abdominal CT with a non-contrast phase; (2) individuals who had blood tests done within 3 days before and after DECT; and (3) patients whose electronic medical records clearly indicated either "type 2 diabetes" or "non-diabetes." Patients with malignant tumors (n = 12), pancreatitis (n = 5), or significant pancreatic atrophy without visible pancreatic parenchyma (n = 4) were excluded from the study.

DECT images
Abdominal DECT was conducted using a Dual-source CT scanner (SOMATOM Drive, Siemens Healthineers, Forchheim, Germany).The tube voltages used were 100 kV and Sn140 kV, with average effective tube currents of 250 mAs and 193 mAs.Automatic tube current modulation technology (CARE Dose 4D) was employed for dose control.The CT parameters were as follows: detector collimation of 32 × 0.6 mm, pitch of 0.6, gantry rotation time of 0.5 s, and matrix size of 512 × 512.Reconstruction of the CT image was performed using an iterative reconstruction technique called Adaptive Model-based Iterative Reconstruction (ADMIRE) with the Q30f algorithm, resulting in a reconstructed image thickness of 1.5 mm (see Figure 1).

Clinical variables
The following 24 clinical data were collected: (1) categorical variables: gender, lipid turbidity index, hemolysis index, jaundice index, and clinical manifestations (now the criteria for the diagnosis of diabetes include the patient exhibiting symptoms of polydipsia, polyuria, polyphagia, and weight loss (referred to as "three excesses and one deficiency"), along with elevated blood glucose levels and the presence of glucose in the urine (which should not be found in normal urine).So if the subject has these manifestations the category of the subject is "positive"); (2)

Pancreas segmentation on DECT images
In our previous work (11), we implemented a deep learning framework that utilized a cascade coarse-to-fine segmentation approach with an attention mechanism to segment organs.In this study, we applied this deep learning framework specifically to segment the pancreas from fusion DECT images.The segmentation process was carried out using the uAI Research Portal (United Imaging Intelligence, China) (12), which is a clinical research platform based on the Python programming language (version 3.7.3). Figure 2 displays the original pancreas DECT image and the corresponding segmentation results.In our previous work (11), we conducted a quantitative analysis comparing the DECT results between the diabetes and control groups.Figure 2 demonstrates that the head of the pancreas exhibited lower density and higher fat fraction compared to the body and tail of the pancreas in the diabetes group.Finally, all delineations were reviewed by a highly experienced chief radiologist with 8 years of expertise in abdominal imaging.

Radiomics extraction and selection
A total of 2,264 radiomic features were extracted from each pancreas region.These features consisted of 104 original features, which were further categorized into 18 first-order statistics, 14 shape features, and 21 texture features.The texture features included Gray-Level Co-occurrence Matrix (GLCM), Gray-Level Run-Length Matrix (GLRLM), Gray-Level Size-Zone Matrix (GLSZM), Gray-Level Dependence Matrix (GLDM), and Neighboring Gray-Tone Difference Matrix (NGTDM).Additionally, 14 image filters including Box Mean, Additive Gaussian Noise, Binomial Blur Image, Curvature Flow, Box Sigma Image, LoG with sigma values of 0.5, 1, 1.5, and 2 were applied to generate derived images.Derived images were further processed using Wavelet filters (LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH), Normalize, Laplacian Sharpening, Discrete Gaussian Mean Speckle Noise Recursive Gaussian and Shot Noise to extract first-order statistics and texture features within the pancreas regions.This resulted in a total of 2,160 derived features.
Figure 3 illustrates the flowchart of the radiomics analysis.First, radiomic features were calculated from the region of interest (ROI) on each DECT image.All the radiomic features were then normalized using Z-score.Next, the least absolute shrinkage and selection operator (LASSO) technique was applied to sift through these normalized radiomic features, with the aim of identifying those with the highest predictive reliability for type 2 diabetes mellitus (T2DM).This critical step focused on isolating features that are predictive of the binary outcome of diabetes presence (yes or no).Then a similar LASSO selection process was conducted for the clinical features gathered from the study participants.The radiomic and clinical features deemed significant through these selection processes were then amalgamated.A final round of LASSO selection was performed on this combined set to refine and identify a comprehensive feature set.LASSO selection was employed to identify the most reliable predictive radiomic features.Finally, another round of LASSO selection was performed to evaluate clinical features.The selected radiomic and clinical features were merged and subjected to another round of LASSO selection to obtain a comprehensive combined feature set.

Model construction
Three machine learning models were developed for binary classification (diabetes or not) using three classifiers: random forest (RF), support vector machine (SVM), and logistic regression (LR).The models were built based on the selected features and/or clinical features.The input data for the detection model came from one of the three feature sets: (1) radiomic features with 7 variables, (2) clinical features with 8 variables, or (3) a combination of all features with 8 variables.To enhance performance, a grid search was performed to fine-tune the parameters for different features and classification algorithms.

Statistical analysis
The receiver operating characteristic curve (ROC) was generated to evaluate the performance of the detection model, and several performance metrics such as sensitivity (SEN), specificity (SPE), accuracy (ACC), F1-Score, and area under the curve (AUC) were computed.The confidence intervals for cross-validated AUC were computed to estimate the performance of each model.The demographic data was processed using the uAI Research Portal to examine significant differences in variables between the training set and the validation set.Python (version 3.6) was utilized for programming the training, validation of the prediction model, and conducting statistical analysis.Frontiers in Medicine 05 frontiersin.org

Results
Based on Harrell's guideline, the number of selected features should be less than 10% of the sample size.Consequently, in our experiment involving the radiomic and clinical features, as well as the final combination of radiomic and clinical features, the number of selected features was less than 10% of the sample size (13, 14).

Assessment of radiomic and clinical features
A total of 2,264 radiomic features were computed using the uAI Research Portal for each pancreas region.These features were then normalized using the Z-score approach.The feature selection method, Lasso, was employed to reduce dimensionality.Ultimately, 7-dimensional features were selected for the subsequent classification modeling.The names and corresponding importance coefficients of these 7 features calculated by Lasso are illustrated in Figure 4A.The selection of coefficient values was computed using the least square method.Each coefficient signifies the average impact of the corresponding feature on the selection results.In simpler terms, a higher value indicates a greater importance of the feature for the detection model.
Additionally, for each patient, a total of 24 clinical features were digitized and normalized using the Z-score method.Employing a similar feature selection procedure with Lasso, 8 clinical features were chosen.The names and coefficients of each clinical feature are featured in Figure 4B.
Moreover, a combination of all extracted radiomic features (2,264) and clinical features (24) resulted in a total of 2,288 features.Through Lasso computation, an optimal subset of 8 features was obtained, consisting of 6 radiomic features and 2 clinical features.Detailed information regarding all selected features is presented in Figure 4C.

Evaluation of models
The feature selection step automatically chose the most significant features for this classification task.In this step, all To effectively assess the performance of the detection models, the five-fold cross-validation technique was implemented due to the limited amount of data.Cross-validation is a widely accepted statistical technique used to evaluate predictive models by partitioning the original dataset into a training set to train the model, and a testing set to evaluate it.In the context of our study, we divided the complete dataset into five equal (or nearly equal) parts randomly.During the validation process, four of these subsets are combined to form the training set, and the remaining one subset is used as the testing set.This process is repeated five times (folds), with each of the five subsets serving as the testing set exactly once.
LR, RF, and SVM classifiers were constructed for each fold using the selected features, which consisted of radiomic features, clinical features, and combined features.The overall performance was summarized by calculating the mean AUC, sensitivity, specificity, accuracy, and F1 score for both the training set and testing set.The results are presented in Table 2, we can found that LR get the best performance among these three models (RF, SVM, and Logistic) in AUC 0.922, Specificity 0.886, Accuracy 0.862, and F1 Score 0.871 indexes on test dataset with combined model, which demonstrate LR model having a robust classification ability.Then we do some other tests in the LR model, according to the Delong test, there was no statistically significant difference in the area under the ROC curve between the training set and testing set (as shown in Figure 5).

Regenerate response
The performance of the models built with radiomic features, clinical features, and combined features shown in Table 2.

Discussion
In our previous study (10), we calculated the fat fraction and CT value of the head, body, and tail of the pancreas from dual-energy CT images of 45 patients with type 2 diabetes mellitus (T2DM) and 33 control subjects.The experimental results demonstrated a significant association between the fat content of the pancreas and diabetes (10).While our focus was on fat fraction measurements, there are numerous other quantitative parameters that can be derived from dual-energy CT data.Hence, further studies could evaluate additional parameters such as radiomic features.
This retrospective study aims to establish a quantitative imaging biomarker for type 2 diabetes mellitus using dual-energy CT images of the pancreas.We have constructed three models using radiomic features, clinical features, and combined features, obtained through the LASSO regression approach, as shown in Table 3.The coefficients listed in Table 3 were derived from LASSO regression and represent the average impact of each corresponding feature on the classification results.A higher coefficient value indicates greater importance of the feature for the detection model.The description and explanation of each feature for the three models are as follows: these radiomics features are linked to the focus of the doctor's attention, such as image texture, gray level intensity, local homogeneity, and etc.
1. Informational Measure of Correlation (IMC) 1: IMC1 evaluates the correlation between the probability distributions of variables i j and , using mutual information I i j , ( ).This feature quantifies the complexity of the texture within the region of interest (ROI).2. Minimum: This is a first-order feature that represents the minimum gray level intensity within the ROI.The selected features and their corresponding importance coefficients calculated by Lasso.For the clinical model, diabetes is primarily characterized by hyperglycemia and metabolic disturbances.Its key clinical manifestations include increased appetite, excessive thirst, frequent urination, and unintended weight loss (i.e., more than three kilograms but less than one kilogram).In terms of serum biochemical markers, triglycerides, aspartate aminotransferase, endogenous creatinine clearance, low-density lipoprotein cholesterol, total bilirubin, and albumin are all intricately linked to human metabolism.
For the combined model, two additional features were listed as follows: 1. Informational Measure of Correlation (IMC) 2: IMC2 also evaluates the correlation between the probability distributions of variables i j and , similarly quantifying the complexity of the texture within the region of interest (ROI).2. Busyness: This feature measures the change from a voxel to its neighbor.A high value for busyness indicates a "busy" image, with rapid changes in intensity between voxels and their neighborhood.
These additional features contribute to capturing more detailed information about texture complexity and local intensity variations in relation to diabetes detection.
From the selected features above, it is evident that some of them quantify the complexity of the texture of the pancreas.This suggests that there are rapid changes in intensity between voxels and their neighborhood in the medical images of patients with type 2 diabetes mellitus.These texture complexities may be indicative of underlying structural or compositional changes in the pancreas associated with diabetes.The identification and quantification of such changes can potentially provide valuable insights into disease progression and aid in the development of imaging biomarkers for diabetes diagnosis and monitoring.
Radiomics has indeed gained widespread utilization in clinical diagnosis.It has emerged as a valuable tool for auxiliary diagnosis by converting clinical images into mining data with high fidelity, repeatability, and minimal redundancy.This is achieved through the extraction of mathematical structural features from quantitative images.With the focus on personalized precision diagnosis and treatment, radiomics has played a pivotal role in advancing medical imaging beyond being just a diagnostic tool.It has become a fundamental instrument that provides specific and effective guidance for clinical diagnosis and treatment.By leveraging radiomic features, medical professionals can gain deeper insights into diseases, enabling them to make more accurate and personalized decisions regarding patient care (15).The integration of radiomics into clinical practice holds great promise for improving patient outcomes, optimizing treatment strategies, and facilitating precision medicine approaches.
Radiomics has been widely utilized in various diseases.van Griethuysen et al. ( 16) extracted radiomic features from 429 different lung lesions, including 48 texture features, 310 logarithmic features, and 158 wavelet features, to differentiate between benign and malignant nodules in the lungs.Their analysis revealed a correlation between image-based subtypes and the benign or malignant status of lung lesions.In another study, Grimm et al. (17) analyzed 275 preoperative breast MRI images of breast cancer patients and extracted a total of 56 imaging features encompassing morphology, texture, and dynamic characteristics for each patient's image.Utilizing multivariate analysis, they determined the correlation between these imaging  Their findings demonstrated that hyper-homocysteinemia and hypertension exhibited independent associations with symptomatic carotid plaque.These studies highlight the potential of radiomics in various disease contexts, including lung lesions, breast cancer, pancreatic cancer, and carotid plaque.Radiomics analysis can provide valuable insights into disease characterization, subtype classification, and prediction of clinical outcomes.
In this study, a total of 2,264 radiomic features were extracted from pancreatic dual-energy CT images.A feature selection procedure similar to Lasso was employed to identify 7 optimal features.Simultaneously, 24 clinical features were digitized and normalized.Out of these, 8 representative clinical features were selected.The radiomic features were integrated with the clinical features, resulting in an optimal subset comprising of 8 calculated features determined by the Lasso algorithm.Subsequently, three models were constructed: one solely based on radiomic features, another solely based on clinical features, and the third model incorporating both types of features.The performance evaluation of these models was conducted using cross-validation.In the test set, the area under the curve (AUC) values were 0.887 (0.73-1), 0.881 (0.715-1), and 0.922 (0.804-1), respectively.When compared with previous studies (10), the sensitivity (0.844, 0.822, 0.844), specificity (0.876, 0.757, 0.886), accuracy (0.86, 0.795, 0.862), and F1 score (0.873, 0.819, 0.871) of this study showed significant improvement in performance.
Furthermore, we evaluated the performance of the combined model using three clinical features: average blood glucose, duration of diabetes, and diabetic complications.The feature "average blood glucose" was used for the diagnosis of diabetes patients, which is consistent with the ground truth.Regarding diabetes duration, all patients who were classified incorrectly by the combined model had a disease duration of less than 10 years.This interesting finding may be due to the selected feature not being able to capture the difference between healthy controls and diabetes patients with a duration of less than 10 years as shown in Figure 6.Additionally, when classifying patients with (25 patients) or without (20 patients) diabetic complications, 2 patients were classified into the complication group.
There are several limitations to our study.Firstly, the sample size was relatively small, and further expansion of the sample size is necessary to enhance the credibility of the results.Secondly, we only analyzed the imaging features of all patients using dual-energy pancreatic plain scans without considering differences in pancreatic imaging features during different enhanced phases.Additionally, we solely selected axial pancreatic images for analysis, potentially resulting in the omission of certain characteristic radiomics data.

Conclusion
After a comprehensive analysis, the three models constructed based on CT radiomic features demonstrate the potential of pancreatic dual-energy CT images as quantitative imaging biomarkers for detecting type 2 diabetes.The ROC curves of different models using LR method, including radiomic features, clinical features, and combined features on train and test datasets, respectively.Statistical graph of the disease duration for 45 diabetes patients.
TABLE 3 The significance of radiomic features, clinical features, and combined features to build three models.

Models Features Coefficient
Radiomics The specific calculation method of each feature can be found from Pyradiomics website (https://pyradiomics.readthedocs.io/en/latest/features.html).The bold meaning is the best performance.

FIGURE 1 Flowchart
FIGURE 1Flowchart of the study including diabetes patients enrolled with dual-energy CT images.

FIGURE 2
FIGURE 2 Illustration of pancreas DECT images, dual-energy fat map, and the segmentation results (each row from top to bottom) of the healthy subject and diabetes patient.(A) Healthy subject.(B) Diabetes patient.

FIGURE 3 Flowchart
FIGURE 3Flowchart of radiomics analysis on pancreas DECT images.

3 .
Inverse Difference Normalized (IDN): IDN is a measure of the local homogeneity within the ROI. 4. 90 Percentile: This is a first-order feature that represents the number of voxels exceeding 90% of the voxel values in the set of all voxel values within the ROI. 5. Median: This first-order feature represents the median gray level intensity within the ROI. 6. Maximum 2D Diameter Slice: This shape feature is defined as the largest pairwise Euclidean distance between organ surface mesh vertices in the row-column plane (typically axial plane).
features and molecular subtypes of breast cancer.The findings revealed a significant association between radiomic features derived from dynamic contrast-enhanced MRI and the molecular subtypes of luminal A and luminal B hormone receptor-positive breast cancers.Kaissis et al.(18) retrospectively analyzed preoperative CT images of 207 patients diagnosed with pancreatic ductal adenocarcinoma.They developed a random forest machine learning algorithm to accurately predict the molecular subtype of pancreatic cancer based on radiomic features.The classification algorithm demonstrated high sensitivity (0.84 ± 0.05) and specificity (0.92 ± 0.01).Furthermore, the area under the receiver operating characteristic curve (AUC-ROC) was determined to be 0.93 ± 0.01, indicating that preoperative CT image radiomics analysis utilizing machine learning holds promise in predicting molecular subtypes closely associated with the survival outcomes of pancreatic cancer patients.Xue et al.(19) retrospectively analyzed CTA images and clinical data of 170 cases involving the head and radiomic features in conjunction with elevated levels of homocysteine and hypertension.Both the Rad-score model and joint model were established to investigate associations with symptomatic carotid plaque.

TABLE 1
Clinical variables of the study population.

TABLE 2
The performance of the models built with radiomic features, clinical features, and combined features.