ORIGINAL RESEARCH article

Front. Med., 22 July 2025

Sec. Nuclear Medicine

Volume 12 - 2025 | https://doi.org/10.3389/fmed.2025.1599739

This article is part of the Research Topic: Methods and Strategies for Integrating Medical Images Acquired from Distinct Modalities

Integrating CT radiomics and clinical data with machine learning to predict fibrosis progression in coalworker pneumoconiosis


Xiaobing Li1,2,3,4,5†; Qian Li1,2,3,4†; Xinyi Xie6†; Wei Wang7; Xuemei Li8; Tingqiang Zhang1,2,3,4; Li Zhang1,2,3,4; Yongsheng Liu1,2,3,4*; Li Wang1,2,3,4*; Wutao Xie7*
  • 1Science and Technology Industry Development Center, Chongqing Medical and Pharmaceutical College, Chongqing, China
  • 2Laboratory of Toxicology, The First Affiliated Hospital of Chongqing Medical and Pharmaceutical College, Chongqing, China
  • 3Chongqing Key Laboratory of Prevention and Treatment for Occupational Diseases and Poisoning, The First Affiliated Hospital of Chongqing Medical and Pharmaceutical College, Chongqing, China
  • 4Department of Occupational Disease and Poisoning Medicine, The First Affiliated Hospital of Chongqing Medical and Pharmaceutical College, Chongqing, China
  • 5College of Public Health and Health Management, Chongqing Medical University, Chongqing, China
  • 6Clinical Medicine Department, Medical College, Hebei University of Engineering, Handan, Hebei, China
  • 7Department of Radiology, The First Affiliated Hospital of Chongqing Medical and Pharmaceutical College, Chongqing, China
  • 8Department of Neurology, NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China

Objective: This study aims to develop a machine learning (ML) model that integrates computed tomography (CT) radiomics with clinical features to predict the progression of pulmonary interstitial fibrosis in patients with coalworker pneumoconiosis (CWP).

Methods: Clinical and imaging data from 297 patients diagnosed with CWP at The First Affiliated Hospital of Chongqing Medical and Pharmaceutical College between December 2021 and December 2023 were analyzed. Of these patients, 170 developed pulmonary interstitial fibrosis over a 3-year follow-up and were classified as the progression group, while 127 patients showed stable conditions and were classified as the stable group. The patients were divided into a training cohort (n = 207) and a test cohort (n = 90). Radiomic features were extracted from CT images of lung fibrosis lesions in the training cohort. These features were reduced in dimensionality to construct morphological biomarkers. ML methods were then used to develop three models: a clinical model, a radiomics model, and a multimodal joint model. The performance of these models was evaluated in the test cohort using receiver operating characteristic (ROC) curves and decision curve analysis (DCA).

Results: In the training cohort, the areas under the curve (AUCs) for the clinical, radiomics, and joint models were 0.835, 0.879, and 0.945, respectively. In the test cohort, the corresponding AUCs were 0.732, 0.750, and 0.845. The joint model demonstrated the highest predictive performance and clinical benefit in both the training and test cohorts.

Conclusion: The multimodal model, combining CT radiomics and clinical features, offers an effective and accurate tool for predicting the progression of pulmonary fibrosis in CWP.

1 Introduction

Pneumoconiosis is a chronic, progressive fibrotic lung disease caused by the prolonged inhalation and deposition of occupational dust particles, resulting in diffuse pulmonary fibrosis. It includes a spectrum of conditions linked to exposure to airborne particulates such as asbestos fibers, coal mine dust, and respirable crystalline silica (1, 2). Although pneumoconiosis is recognized as a global occupational health issue, its incidence remains disproportionately high in industrialized nations, where environmental dust exposure is endemic (3). China continues to report the highest number of pneumoconiosis cases annually, with a steadily increasing disease burden despite the implementation of occupational health regulations (4).

Among the various subtypes, coalworker pneumoconiosis (CWP) is one of the most common, attributed to prolonged coal dust exposure in mining environments (5). While CWP shares certain clinicopathological features with other dust-induced lung diseases, such as asbestosis and silicosis, its fibrogenic mechanisms differ due to the unique properties of coal dust (6). For instance, while asbestosis and silicosis involve interstitial lung damage triggered by asbestos fibers and crystalline silica particles, respectively, CWP is marked by the accumulation of coal dust, often compounded by silica contamination, leading to a distinct fibrotic response (7, 8). Despite their etiological differences, all types of pneumoconiosis converge on a common pathological trajectory: progressive pulmonary fibrosis (9).

Pulmonary fibrosis represents the principal driver of morbidity and mortality in advanced pneumoconiosis. It is characterized by the relentless accumulation of extracellular matrix proteins in lung parenchyma, which distorts the normal alveolar architecture and results in irreversible impairment of pulmonary function (10, 11). As fibrosis advances, patients often experience a marked decline in lung function, leading to complications such as pulmonary hypertension, cor pulmonale, and ultimately, respiratory failure (12, 13). Given these severe outcomes, early identification and continuous monitoring of fibrotic progression are essential to improving clinical prognosis and guiding therapeutic intervention (14).

However, early-stage fibrosis in pneumoconiosis typically lacks overt clinical symptoms or radiological markers, making it difficult to diagnose using conventional methods (15). Pulmonary function tests and chest X-rays, though routinely employed in occupational health surveillance, have limited sensitivity for detecting subtle interstitial changes (16). High-resolution computed tomography (CT), on the other hand, provides greater diagnostic accuracy, but its interpretation relies heavily on radiologist expertise, introducing subjectivity and variability in clinical assessments (17, 18).

In recent years, radiomics, a technique involving the extraction of high-dimensional features from medical imaging, has emerged as a promising tool to enhance diagnostic precision and capture latent imaging biomarkers not discernible by the human eye (19). In the context of lung disease, CT-based radiomics has been shown to reflect underlying pathophysiological alterations, including fibrotic remodeling, thereby enabling risk stratification and disease prediction (20). When combined with machine learning (ML), radiomic analysis can be further optimized to create predictive models that identify disease progression with improved accuracy and objectivity (21, 22).

The present study focuses on patients with CWP and aims to develop an interpretable, multimodal ML model to predict pulmonary fibrosis progression. By integrating CT radiomics with clinical parameters, we seek to enhance predictive performance beyond what is achievable with either modality alone. Three distinct models were constructed: a clinical model based solely on laboratory and demographic features; a radiomic model derived from CT feature sets; and a multimodal joint model that fuses both clinical and imaging data. These models were systematically trained, internally validated, and evaluated to determine their relative performance in predicting fibrosis progression.

The clinical significance of this study lies in its potential to offer a non-invasive, reproducible, and objective tool for the early identification of pulmonary fibrosis in CWP. Such an approach could facilitate timely clinical interventions, reduce disease burden, and ultimately improve patient outcomes. Furthermore, this methodological framework may be extended to other occupational lung diseases and interstitial lung conditions, underscoring its broader applicability in respiratory medicine.

2 Materials and methods

2.1 Demographic data

The present study retrospectively analyzed data from the Pneumoconiosis Diagnosis Center at The First Affiliated Hospital of Chongqing Medical and Pharmaceutical College. A total of 297 male patients with confirmed CWP were enrolled. The diagnosis of CWP was established by a certified expert panel at the Chongqing Prevention and Treatment Center for Occupational Diseases, in accordance with national diagnostic criteria for occupational pneumoconiosis.

All included patients were initially diagnosed with CWP without radiological evidence of pulmonary interstitial fibrosis and were followed for a period of 3 years. During the follow-up, 170 patients developed imaging features consistent with pulmonary fibrosis and were categorized into the progression group, while the remaining 127 patients exhibited no significant radiological progression and were assigned to the stable group.

To facilitate model development and validation, the entire cohort was randomly divided into a training cohort (n = 207) and a test cohort (n = 90). The training cohort was used for feature selection and model construction, while the test cohort served as an independent validation set. The overall study design, including inclusion criteria and cohort allocation, is summarized in Figure 1.
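As an illustration of the random allocation described above, the following minimal sketch partitions a list of patient identifiers into two cohorts of the stated sizes. The function name `split_cohort`, the seed value, and the integer IDs are assumptions made for the example; the source does not report how the randomization was implemented or whether it was stratified by outcome.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into a training list and a test list."""
    ids = list(patient_ids)
    if not 0 < n_train < len(ids):
        raise ValueError("n_train must be between 1 and len(patient_ids) - 1")
    # A seeded generator makes the allocation reproducible.
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Hypothetical IDs 0..296 standing in for the 297 enrolled patients.
train_ids, test_ids = split_cohort(range(297), n_train=207, seed=42)
print(len(train_ids), len(test_ids))
```

Fixing the seed is a design choice that lets the same split be regenerated later; a stratified split would additionally preserve the progression-to-stable ratio in both cohorts.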

FIGURE 1
Flowchart illustrating a process for model development with six steps: (A) Image segmentation, showing lung scans. (B) Feature extraction, listing first, second, and higher-order features. (C) Feature selection, with graphs and heatmaps. (D) Image model, displaying performance plots. (E) Multimodal combined model, with additional variables like gender, age, and medical metrics. (F) Model validation, showcasing performance charts.

Figure 1. Workflow of the CT radiomics and clinical feature-based machine learning (ML) model for predicting pulmonary fibrosis progression in coalworker pneumoconiosis (CWP). Overview of the workflow for constructing and validating a ML model based on CT radiomics and clinical features, including: (A) image segmentation: Regions of interest (ROIs) were segmented from high-resolution CT images, including lung parenchyma and fibrotic regions. (B) Feature extraction: radiomics features were extracted and categorized into first-order (intensity-based), second-order (texture-based), and higher-order features. (C) Feature selection: relevant features were selected using statistical analysis, correlation heatmaps, and importance ranking to optimize the model’s performance. (D) Image model construction: ML algorithms were applied to selected features, and performance was evaluated using metrics such as ROC and PR curves. (E) Multimodal combined model: clinical features (e.g., gender, age, FVC, OH, FEV1, and WBC) were integrated with radiomics features to construct a multimodal prediction model, achieving enhanced predictive performance. (F) Validation model: the model was validated with independent datasets, assessing metrics such as sensitivity, specificity, and calibration curves.

Inclusion criteria comprised: (1) male patients with a diagnosis of CWP based on the GBZ 70-2015 guidelines, a national Chinese standard titled “Diagnosis of Occupational Pneumoconiosis”, which defines diagnostic criteria for pneumoconiosis using chest radiographs; (2) availability of baseline high-resolution computed tomography (HRCT) imaging and complete clinical data; and (3) no evidence of pulmonary fibrosis at initial presentation. Diagnosis of pulmonary fibrosis during follow-up was based on HRCT criteria outlined in the 2022 American Thoracic Society guidelines for idiopathic and progressive pulmonary fibrosis in adults. Exclusion criteria included incomplete biomarker or imaging data, low-quality CT scans, or co-existing interstitial lung diseases (ILDs) such as tuberculosis-related fibrosis or connective tissue disease-associated pneumoconiosis.

Clinical variables collected for analysis included patient age, cumulative dust exposure time (in hours), pulmonary function test results [forced vital capacity (FVC), forced expiratory volume in the first second (FEV1), FEV1/FVC ratio], and key laboratory indices.

2.2 Feature extraction and data pre-processing

Lesion identification: HRCT images were independently reviewed by a senior thoracic radiologist (over 20 years of experience in occupational lung disease). Radiological signs indicative of fibrotic progression were documented, including interlobular septal thickening (± subpleural lines), ground-glass opacities (GGOs), reticular patterns (± parenchymal bands), honeycombing with or without traction bronchiectasis, and pleural plaques. ROI segmentation: lung window images were standardized in grayscale intensity. Target lesions were manually segmented to create regions of interest (ROIs) by a second radiologist with more than 10 years of experience in chest imaging. The segmented volumes were reconstructed into three-dimensional ROIs using 3D Slicer software (version 5.2.2)1. A third senior radiologist independently validated the segmentations to ensure reproducibility and anatomical accuracy.

Radiomic feature extraction: a total of 851 radiomic features were extracted from each ROI using the Pyradiomics extension in 3D Slicer. Extracted features included first-order statistics, shape descriptors (2D and 3D), and texture-based metrics, including gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighboring gray-tone difference matrix (NGTDM).

Data pre-processing: to address sample imbalance in the training set, oversampling techniques (random replication) were applied. All radiomic features were normalized to a range of [0, 1] using the MinMaxScaler function in R (v3.4.1).
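The two pre-processing steps above can be sketched as follows. This is an illustrative standard-library Python version, not the authors' R code. The function names are hypothetical, and fitting the scaling bounds on the training rows only is an assumption, since the source does not state how the test cohort was scaled.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Replicate minority-class rows at random until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def fit_min_max(rows):
    """Learn the per-feature minimum and maximum from the given rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, mins, maxs):
    """Scale each feature with the learned bounds; constant features map to 0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


# Toy training data: two features, two progression cases and one stable case.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
labels = [1, 1, 0]
balanced_rows, balanced_labels = random_oversample(rows, labels, seed=1)
mins, maxs = fit_min_max(balanced_rows)
scaled = apply_min_max(balanced_rows, mins, maxs)
```

Oversampling is applied to the training rows only, as in the text. If the bounds are learned from the training cohort, test-cohort values can fall slightly outside [0, 1].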

2.3 Feature dimensionality reduction and construction of radiomic biomarkers

To reduce overfitting risk and enhance model interpretability, dimensionality reduction was performed using the least absolute shrinkage and selection operator (LASSO) regression. This technique imposes an L1 penalty to shrink coefficients of irrelevant or collinear features, thereby improving model generalizability. The initial pool of 851 radiomic features underwent LASSO-based feature selection, yielding a subset of features with the highest predictive value. These features were subsequently used to construct radiomic signatures, referred to as “psychoradiomic signatures” (PS), which encapsulate multi-parametric image information reflective of microstructural changes in lung parenchyma associated with fibrotic progression.
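For reference, the L1-penalized estimate underlying this selection step can be written in the generic form below. The loss function is left unspecified because the source does not state which model family was fitted; the symbols n (training sample size), p = 851 (candidate features), and λ (penalty weight) are notation introduced here for illustration.

```latex
\hat{\beta}(\lambda)
  = \arg\min_{\beta_0,\,\beta}\;
    \frac{1}{n}\sum_{i=1}^{n} \ell\!\bigl(y_i,\; \beta_0 + x_i^{\top}\beta\bigr)
    \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert ,
\qquad
\mathcal{S}(\lambda) = \bigl\{\, j : \hat{\beta}_j(\lambda) \neq 0 \,\bigr\}
```

Here y_i is the progression label of patient i, x_i is that patient's feature vector, and S(λ) is the set of retained features. Larger values of λ drive more coefficients exactly to zero, so the size of S(λ) shrinks as the penalty grows.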

To build the PS, two ML classifiers, logistic regression (LR) and support vector machine (SVM), were employed. The classification performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, specificity, and calibration plots. A calibration curve was generated to assess agreement between predicted probabilities and observed outcomes, thus evaluating model reliability and potential overfitting.
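The discrimination metrics named above can be computed directly from predicted scores. The sketch below uses the rank-based (Mann-Whitney) definition of AUC and a fixed decision threshold for sensitivity and specificity. The function names, the 0.5 threshold, and the toy data are assumptions for illustration; the source does not report the threshold used.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: 1 = progression, 0 = stable.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)                               # 0.75
sens, spec = sensitivity_specificity(labels, scores, 0.5)   # 0.5, 1.0
```

The pairwise loop is quadratic in sample size, which is acceptable for cohorts of a few hundred patients; a sort-based version would be preferable for large datasets.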

2.4 Construction and validation of the joint model

An integrative model incorporating both radiomic and clinical variables was subsequently developed. The rationale was to exploit the complementary diagnostic information provided by HRCT-derived radiomics and conventional clinical indicators, e.g., pulmonary function metrics (FVC, FEV1, FEV1/FVC). The combined model was trained using an SVM algorithm, given its proficiency in high-dimensional and non-linear classification problems. Prior to training, five-fold cross-validation was employed within the training set to fine-tune hyperparameters, including the kernel function and the regularization coefficient (denoted as C). Model complexity was balanced against predictive performance to prevent overfitting.
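The tuning procedure described above (five-fold cross-validation over the kernel and C) can be outlined in a library-agnostic way. In this sketch, `fit_and_score` is a hypothetical callback that would train an SVM on the training indices and return a validation score such as AUC. The grid values and the placeholder scorer are assumptions, since the source does not list the candidate settings.

```python
import random
from itertools import product


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs that partition range(n_samples) into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def grid_search(n_samples, grid, fit_and_score, k=5, seed=0):
    """Return the setting with the highest mean validation score across folds."""
    best_params, best_score = None, float("-inf")
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [fit_and_score(params, tr, va)
                  for tr, va in k_fold_indices(n_samples, k, seed)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def dummy_fit_and_score(params, train_idx, val_idx):
    # Placeholder scorer (not a real model): peaks at C = 1.0 with the "rbf" kernel.
    return (1.0 if params["kernel"] == "rbf" else 0.5) - abs(params["C"] - 1.0)


# Hypothetical grid; 207 is the training-cohort size reported in the text.
grid = {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]}
best_params, best_score = grid_search(207, grid, dummy_fit_and_score)
fold_sizes = [len(va) for _, va in k_fold_indices(207, k=5)]
```

Reusing the same seed for every candidate setting means all settings are compared on identical folds, which removes fold-to-fold noise from the comparison.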

Following optimization, the final model was tested on the independent validation cohort (n = 90). Model performance was assessed using AUC, sensitivity, specificity, and overall classification accuracy. Calibration curves were again used to verify consistency between predicted risk and actual outcome. Furthermore, decision curve analysis (DCA) was conducted to evaluate clinical utility by quantifying the net benefit of model-assisted decisions compared with treat-all or treat-none strategies.
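The net-benefit quantity plotted in a decision curve can be sketched as below, using the standard definition: net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names and the toy data are illustrative assumptions, and this is not the DecisionLinnc implementation used in the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the treat-all strategy at the same threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example: 1 = progression, 0 = stable; treat-none has net benefit 0 by definition.
labels = [1, 1, 0, 0]
probs = [0.9, 0.6, 0.7, 0.2]
nb_model = net_benefit(labels, probs, threshold=0.5)   # 2/4 - (1/4) * 1 = 0.25
nb_all = net_benefit_treat_all(labels, threshold=0.5)  # 0.5 - 0.5 * 1 = 0.0
```

Evaluating `net_benefit` over a grid of thresholds and plotting it against the treat-all and treat-none curves yields a decision curve. A model is useful at a given threshold when its curve lies above both reference strategies.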

2.5 Statistical analysis

Statistical analyses were conducted using SPSS version 26.0 (IBM Corp., Armonk, NY, USA), R software version 3.4.1, and DecisionLinnc (version 1.0, Nov 2023)2. Continuous variables conforming to normal distribution were analyzed using independent-sample t-tests, while non-normally distributed data were assessed via the Mann-Whitney U test. Categorical variables were compared using chi-square or Fisher’s exact tests as appropriate. Model diagnostic performance was quantified through ROC-derived metrics including AUC, sensitivity, and specificity. All statistical tests were two-tailed, with a P < 0.05 considered statistically significant.

3 Results

3.1 Comparison of clinical characteristics and CT imaging findings

In the training cohort, a comparative analysis of clinical parameters between the fibrosis progression group and the stable group revealed statistically significant differences in dust exposure duration (DCH), FVC, FEV1, and FEV1/FVC ratio (all P < 0.05). These findings suggest that both occupational exposure and lung function metrics serve as important clinical indicators of fibrosis progression in patients with CWP. In contrast, no significant intergroup differences were observed for age or WBC count (P > 0.05), indicating limited prognostic value of these variables in this context.

Results in the test cohort were largely consistent: FVC, FEV1, and FEV1/FVC ratio remained significantly different between the progression and stable groups (P < 0.05), whereas age and DCH did not reach statistical significance (P > 0.05). DCH therefore differed significantly between groups only in the training cohort. A comprehensive comparison of clinical variables for both cohorts is presented in Table 1.

TABLE 1

Table 1. Comparative analysis of clinical characteristics between the training and test cohorts in coalworker pneumoconiosis (CWP) patients.

HRCT findings demonstrated excellent inter-observer agreement between two experienced thoracic radiologists in identifying key radiological features associated with pulmonary fibrosis. Principal imaging signs-including interlobular septal thickening, ground-glass opacity (GGO), reticular patterns (“grid shadows”), honeycombing, and pleural plaques-were consistently observed across both patient groups (Figure 2).

FIGURE 2
CT images of the lungs in two pairs. Pair (a) and (b) show coronal and axial cross-sections with notable opacities, likely indicating abnormal lung conditions. Pair (c) and (d) show clearer coronal and axial views with less visible opacities, suggesting healthier lung tissue. Each CT scan includes measurement details and identification codes.

Figure 2. Comparison of HRCT images between progressive and non-progressive CWP patients. (a) Coronal HRCT image of a patient with progressive pulmonary fibrosis showing extensive fibrotic lesions and lung structural distortion. (b) Axial HRCT images of the same progressive case reveal pronounced interstitial fibrosis with diffuse irregular opacities and honeycombing. (c) Coronal HRCT image of a patient with non-progressive pulmonary fibrosis demonstrating relatively preserved lung architecture and limited fibrotic changes. (d) Axial HRCT images of the same non-progressive case illustrate minor fibrotic features with a predominance of nodular or reticular patterns and no evidence of honeycombing.

Cohen’s kappa coefficients for inter-observer agreement were as follows: 0.786 for interlobular septal thickening, 0.769 for GGO, 0.828 for reticular pattern, 0.814 for honeycombing, and 0.792 for pleural plaques, all demonstrating substantial to excellent agreement (P < 0.05). These results, detailed in Table 2, underscore the robustness and reproducibility of the radiological assessment process.
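The agreement statistic reported above corrects the observed agreement between two readers for the agreement expected by chance. A minimal sketch follows; the function name and the toy ratings are assumptions for illustration, with 1 denoting "sign present" and 0 "sign absent".

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Toy example: the readers agree on 3 of 4 cases.
reader_1 = [1, 1, 0, 0]
reader_2 = [1, 0, 0, 0]
kappa = cohens_kappa(reader_1, reader_2)  # (0.75 - 0.5) / (1 - 0.5) = 0.5
```

In this toy case the raw agreement is 75%, but half of that is attributable to chance, so kappa is 0.5. This is why kappa is a stricter measure than percentage agreement.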

TABLE 2

Table 2. Comparative analysis of CT imaging features between stable and progression groups in coalworker pneumoconiosis.

Importantly, specific HRCT features such as honeycombing, reticular patterns, and interlobular septal thickening were more prevalent in the progression group, aligning with more advanced fibrotic changes. Although GGO and pleural plaques were present in both groups, their frequency was higher in the progression group, suggesting a potential association with early fibrotic evolution (Figures 3, 4). These observations highlight the complementary diagnostic value of HRCT in conjunction with clinical indicators for the early detection and monitoring of pulmonary fibrosis progression in CWP.

FIGURE 3
CT scans of the chest showing lung nodules highlighted with red circles. (a) and (c) display coronal views, while (b) and (d) present axial views. Each image highlights a different affected area within the lungs.

Figure 3. Progression of Pulmonary Lesions in a Stage I Pneumoconiosis Patient: A 1-Year Follow-Up CT Study. (a) and (b): Initial CT scans show multiple small high-density nodular opacities and linear fibrotic streaks distributed in the bilateral upper and middle lung fields, primarily affecting the upper lobe segments and lower lobe dorsal segment. (c) and (d): Follow-up CT images taken 1 year later reveal an increased number of nodular and fibrotic lesions in the same lung regions, indicating progression of fibrosis. The findings suggest disease advancement over time.

FIGURE 4
CT scan images showing various perspectives of a chest cavity. Images (a) and (c) display vertical cross-sections with highlighted areas in the upper lung region. Images (b) and (d) show horizontal cross-sections with circled areas in the lung fields, indicating areas of interest.

Figure 4. Progression of pulmonary fibrosis in a stage II pneumoconiosis patient: a 1-year follow-up CT study. (a,b) Initial CT scans show large patchy opacities in the apical-posterior segment of the bilateral upper lobes, with a long-axis diameter >2 cm and a short-axis diameter >1 cm. Additionally, multiple scattered high-density nodular opacities are observed in all lung lobes, indicating severe fibrosis. (c,d) Follow-up CT images taken 1 year later demonstrate significant progression of fibrotic lesions, with an increase in lesion size and density, suggesting further disease advancement.

3.2 Feature selection and machine learning model development

A total of 851 quantitative radiomic features were extracted from the manually segmented HRCT images of the 297 enrolled CWP patients. These features encompassed first-order statistics, three-dimensional shape descriptors, and multiple texture matrices (GLCM, GLRLM, GLSZM, and NGTDM), enabling a comprehensive representation of parenchymal tissue heterogeneity and fibrotic alterations.

To address the high dimensionality and potential multicollinearity of the dataset, feature selection was performed using the LASSO regression, a robust method for identifying informative predictors while preventing overfitting. Through LASSO penalization, 19 radiomic features with the highest predictive value for fibrosis progression were retained (Figure 5).

FIGURE 5
Left pane: Line plot showing coefficients versus log-transformed lambda values, illustrating paths of selected variables. Right pane: Scatter plot of cross-validated mean (CVM) against log-transformed lambda. Dashed lines indicate LogLambda_1se at minus six point three seven six and LogLambda_min at minus eight point nine eight one.

Figure 5. Selection of optimal lambda parameter for LASSO regression in predicting pulmonary fibrosis progression. (Left) Coefficient profile plot for the LASSO regression model, showing the trajectory of feature coefficients as the regularization parameter (log Lambda) changes. As the penalty increases, more coefficients shrink toward zero, emphasizing feature selection for model sparsity. (Right) Cross-validation curve for the LASSO model, with the mean squared error (CVM) plotted against log Lambda. The vertical dashed lines indicate the optimal Lambda values: the minimum error (left line) and the largest Lambda within one standard error of the minimum (right line). These Lambda values guide feature selection, balancing model complexity and predictive performance.

These selected features were subsequently used to construct predictive models using two ML algorithms: LR and SVM. LR was chosen for its interpretability and probabilistic output, while SVM was employed for its robustness in handling non-linear decision boundaries. Model performance was evaluated based on the area under the receiver operating characteristic curve (AUC), complemented by calibration curves to assess the agreement between predicted and observed outcomes (Figure 6). Both LR and SVM models demonstrated satisfactory predictive performance, with minimal evidence of overfitting.

FIGURE 6
Two plots are shown. The left plot is a Receiver Operating Characteristic (ROC) curve comparing a logistic model with an area under the curve (AUC) of 0.728 and an SVM prediction with an AUC of 0.879. The right plot is a calibration plot showing observed proportion versus predicted probability, with a red line representing the group “svc 1” closely following the diagonal reference line.

Figure 6. Comparative performance of logistic regression and SVM models. (Left) ROC curves illustrate the discrimination performance of the Logistic Regression and SVM models. The AUC for the Logistic Regression model was 0.728, indicating moderate predictive capability, whereas the SVM model achieved a markedly higher AUC of 0.879, demonstrating superior predictive accuracy and robustness. (Right) The calibration plot evaluates the agreement between predicted probabilities and observed outcomes. The SVM model calibration curve aligns closely with the ideal diagonal line, reflecting excellent calibration with no evidence of overfitting and reliable predictive performance.

These findings affirm the utility of radiomics as a non-invasive biomarker strategy for tracking fibrotic progression in CWP. The ability to extract and apply high-dimensional image-derived information, coupled with ML-based classification, provides a promising approach for risk stratification and early intervention planning.

3.3 Model construction and performance evaluation

Following the feature selection and radiomic biomarker construction, three different models were developed for predicting pulmonary fibrosis progression in CWP patients: the clinical model, the radiomic model, and the joint model. The clinical model relied solely on clinical features such as DCH, lung function measures (e.g., FVC, FEV1, FEV1/FVC ratio), and other clinical biomarkers. The radiomic model, in contrast, used only the selected radiomic features derived from the CT images. Finally, the joint model integrated both clinical and radiomic features in an attempt to combine the strengths of both data types for more accurate predictions. The performance of these models was assessed using ROC curves for both the training cohort and test cohort. As shown in Figure 7, the joint model consistently outperformed both the clinical and radiomic models in terms of predictive accuracy.

FIGURE 7
Two ROC curve graphs compare model performance. The left graph shows three models: clinical (red, AUC=0.835), image omics (blue, AUC=0.879), and joint model (green, AUC=0.945). The right graph shows clinical (black, AUC=0.732), image omics (blue, AUC=0.750), and joint model (green, AUC=0.845). Sensitivity and specificity are plotted on the y and x-axes, respectively.

Figure 7. Comparative predictive performance of clinical, radiomic, and joint models. (Left) ROC curves demonstrate the predictive performance of the clinical, radiomic, and joint models in the training cohort. The AUC values were 0.835, 0.879, and 0.945 for the clinical, radiomic, and joint models, respectively, highlighting the superior performance of the joint model. (Right) In the test cohort, the ROC curves show AUC values of 0.732 for the clinical model, 0.750 for the radiomic model, and 0.845 for the joint model, further validating the enhanced predictive capability of the joint model compared to the individual models.

In the training cohort, the AUC for the ROC curve was 0.835 for the clinical model, 0.879 for the radiomic model, and 0.945 for the joint model. These results indicate that while both clinical and radiomic models demonstrated satisfactory performance, the joint model achieved the highest predictive accuracy, suggesting that the integration of clinical and radiomic data provides superior results. Similarly, in the test cohort, the clinical model had an AUC of 0.732, the radiomic model had an AUC of 0.750, and the joint model reached an AUC of 0.845. These findings further validate the robustness of the joint model across different cohorts, demonstrating its potential for reliable prediction of pulmonary fibrosis progression in CWP patients. The AUC values for all three models are detailed in Table 3, providing a comprehensive overview of their performance.

TABLE 3

Table 3. Comparative predictive performance of clinical, radiomic, and joint models in training and test cohorts.

In addition to ROC curve analysis, DCA was performed to assess the clinical utility of the models by evaluating the net benefits of using each model at different threshold probabilities. In the training cohort, DCA showed that the clinical model performed better than the radiomic model, but the joint model outperformed both, demonstrating the highest net benefit. This trend was consistent in the test cohort, where the joint model continued to show superior performance over the clinical and radiomic models. The results of the DCA are illustrated in Figure 8, reinforcing the clinical relevance of the joint model for guiding treatment decisions in CWP patients.

FIGURE 8
Two line graphs compare net benefit versus high-risk threshold for different models: Clinical, Imaging_omics, Joint, All, and None. Lines represent each model’s performance, showing varying net benefits across risk thresholds from 0 to 1.

Figure 8. Decision curve analysis (DCA) curves of the training and test cohorts. (Left) DCA curves for the training cohort illustrate that the clinical model consistently outperformed the radiomic model, while the joint model provided the highest net benefit across a range of threshold probabilities, demonstrating its superior predictive utility. (Right) Similarly, in the test cohort, the clinical model surpassed the radiomic model, with the joint model achieving the highest net benefit, further reinforcing its robust performance and practical value in decision-making scenarios.

Overall, these results demonstrate that combining radiomic signatures with clinical indicators yields more accurate and clinically meaningful predictions of pulmonary fibrosis progression than either approach alone. The joint model offers a promising tool for early risk assessment, potentially enabling timely therapeutic intervention in high-risk CWP patients.

4 Discussion

4.1 Integration of radiomics and clinical features enhances prediction

High-resolution computed tomography remains central to the diagnosis and monitoring of ILDs, including pneumoconiosis-related pulmonary fibrosis. Owing to its excellent spatial resolution and sensitivity to fibrotic changes, HRCT allows accurate assessment of disease extent and progression (23, 24). Quantitative image analysis, enabled by radiomics, further augments this process by extracting high-dimensional features that capture tissue heterogeneity, structural alterations, and fibrotic remodeling not readily discernible by human readers (25, 26).

In our study, we extracted 851 radiomic features from baseline HRCT images of coalworkers, and employed a LASSO-based feature selection strategy to reduce redundancy and identify 19 predictive features. These were combined with pulmonary function indicators (FVC, FEV1, FEV1/FVC) to build a multi-modal ML model, which demonstrated strong predictive performance for fibrosis progression. The joint model achieved an AUC of 0.945 in the training cohort and 0.845 in the test cohort, outperforming radiomic-only and clinical-only models. DCA further confirmed its clinical net benefit across a wide range of threshold probabilities. These results suggest that integrating radiomics with clinical features can enhance early risk assessment in pneumoconiosis, providing a non-invasive and scalable tool to support individualized monitoring and early intervention strategies.
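The workflow described above (LASSO selection on radiomic features, then fusion with clinical indicators) can be sketched with scikit-learn. Everything here is an assumption made for illustration: the data are synthetic, the three "clinical" columns are random stand-ins for FVC, FEV1, and FEV1/FVC, and logistic regression is swapped in as a simple classifier rather than the models reported in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins: 851 "radiomic" columns and 3 "clinical" columns.
X_rad, y = make_classification(n_samples=200, n_features=851, n_informative=10,
                               n_redundant=0, class_sep=1.5, random_state=0)
X_clin = rng.normal(size=(200, 3)) + 0.8 * y[:, None]

idx_train, idx_test = train_test_split(np.arange(200), test_size=0.3,
                                       stratify=y, random_state=0)

# Fit the scaler and the LASSO on training data only to avoid leakage.
scaler = StandardScaler().fit(X_rad[idx_train])
Xr_train = scaler.transform(X_rad[idx_train])
Xr_test = scaler.transform(X_rad[idx_test])

lasso = LassoCV(cv=5, random_state=0, max_iter=5000).fit(Xr_train, y[idx_train])
selected = np.flatnonzero(lasso.coef_)  # indices of features with non-zero weight

# Joint model: selected radiomic features concatenated with clinical columns.
X_joint_train = np.hstack([Xr_train[:, selected], X_clin[idx_train]])
X_joint_test = np.hstack([Xr_test[:, selected], X_clin[idx_test]])

clf = LogisticRegression(max_iter=1000).fit(X_joint_train, y[idx_train])
auc = roc_auc_score(y[idx_test], clf.predict_proba(X_joint_test)[:, 1])
print(f"{selected.size} radiomic features kept; joint test AUC = {auc:.3f}")
```

The number of retained features and the AUC printed here depend entirely on the synthetic data and carry no relation to the 19 features or the AUC values reported in the study.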

4.2 Clinical and biological interpretability of selected features

To enhance the transparency and clinical relevance of our predictive model, we further explored the interpretability of the radiomic features selected during model construction. Among the 19 features retained after LASSO selection, many were texture-related metrics derived from GLCM, GLSZM, and first-order statistics. These features quantify intrapulmonary heterogeneity, density distribution, and spatial arrangement, attributes known to correlate with pathological alterations in fibrotic lung diseases (27, 28). For instance, features such as GLCM entropy, cluster shade, and GLSZM small area emphasis reflect increased structural complexity and textural irregularity, which are characteristic of fibrotic remodeling in pneumoconiosis (9). First-order features like skewness and kurtosis may reflect asymmetric density distribution, potentially related to the uneven deposition of fibrotic tissue and alveolar collapse (29).
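To make these feature families concrete, the sketch below computes first-order skewness and kurtosis and a GLCM entropy with NumPy. It is a simplified illustration under stated assumptions: a 2D image with integer gray levels, a single horizontal offset of one pixel, a symmetric co-occurrence matrix, and non-excess kurtosis. Radiomics toolkits typically work in 3D, aggregate several directions, and may use different conventions, so values from this sketch are not comparable with the study's features.

```python
import numpy as np

def first_order_shape(values):
    """Return (skewness, kurtosis) of an intensity sample from central moments."""
    v = np.asarray(values, dtype=float).ravel()
    d = v - v.mean()
    m2, m3, m4 = np.mean(d**2), np.mean(d**3), np.mean(d**4)
    return m3 / m2**1.5, m4 / m2**2  # kurtosis is not excess-corrected

def glcm_entropy(image, levels):
    """Entropy (bits) of a symmetric GLCM for horizontally adjacent pixels."""
    img = np.asarray(image)
    left, right = img[:, :-1].ravel(), img[:, 1:].ravel()
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (left, right), 1)  # count each (left, right) gray-level pair
    glcm = glcm + glcm.T               # make the matrix symmetric
    p = glcm / glcm.sum()              # normalize to joint probabilities
    nz = p[p > 0]
    return float(-np.sum(nz * np.log2(nz)))

uniform = np.zeros((4, 4), dtype=int)        # homogeneous texture
checker = np.array([[0, 1, 0, 1],
                    [1, 0, 1, 0]])           # alternating texture
print(glcm_entropy(uniform, levels=2), glcm_entropy(checker, levels=2))
print(first_order_shape([1, 2, 3]))
```

A homogeneous patch concentrates all co-occurrences in one cell and gives zero entropy, while more varied gray-level pairings spread the probability mass and raise it, which is the sense in which entropy tracks textural irregularity.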

Although these quantitative features do not directly correspond to conventional radiological signs, their patterns are consistent with the fibrotic processes observed in histopathology and high-resolution imaging studies (30). Their selection suggests that the model captures biologically meaningful signals beyond what can be perceived visually, offering potential non-invasive biomarkers for disease progression. Moreover, the combination of radiomics with clinical indicators such as FVC, FEV1, and FEV1/FVC enhances the robustness and interpretability of the model (31). Pulmonary function tests reflect physiological impairment due to restrictive ventilation, while radiomic features reflect structural deterioration. Their joint use aligns with the multidimensional nature of fibrosis progression (32).

This interpretability not only reinforces confidence in the model’s predictions but also promotes clinical acceptance by providing insight into the underlying biological rationale (33). In a broader context, our study demonstrates the feasibility of using explainable radiomic signatures to supplement functional metrics, potentially improving early detection, treatment planning, and monitoring in occupational lung diseases (34).

4.3 Calibration and threshold optimization for clinical use

To ensure the clinical applicability of our ML models, we conducted a detailed evaluation of calibration performance and optimized the decision threshold for risk stratification (35). Calibration analysis, which assesses the agreement between predicted probabilities and observed outcomes, revealed that the SVM-based radiomic model demonstrated excellent alignment with the ideal calibration line in the test cohort, indicating reliable probability estimates and minimal overfitting (36). This is visually supported by the calibration curve (Figure 6, right), which confirms the model’s strong probabilistic performance and its potential to inform clinical decision-making. Furthermore, threshold optimization was performed to determine a clinically relevant decision cutoff for distinguishing high-risk patients with likely fibrosis progression from those with stable disease (37, 38). Using Youden’s Index derived from ROC analysis, the optimal threshold was identified for the joint model, which achieved the highest predictive accuracy among all models (39). At this threshold, the model balanced sensitivity and specificity, providing a practical decision point for early intervention planning (40).
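Threshold selection with Youden's index (J = sensitivity + specificity - 1) can be sketched as below. The helper name `youden_threshold` and the toy data are hypothetical; the sketch only illustrates the general technique and does not reproduce the cutoff used in the study.

```python
import numpy as np
from sklearn.metrics import roc_curve

def youden_threshold(y_true, y_prob):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J."""
    fpr, tpr, thresholds = roc_curve(y_true, y_prob, drop_intermediate=False)
    j = tpr - fpr                      # J = sensitivity + specificity - 1
    best = int(np.argmax(j))
    return thresholds[best], tpr[best], 1 - fpr[best]

# Toy labels (1 = progression) and predicted risks, for illustration only.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
y_prob = np.array([0.05, 0.1, 0.2, 0.3, 0.45, 0.6, 0.35, 0.65, 0.7, 0.8, 0.9])

thr, sens, spec = youden_threshold(y_true, y_prob)
print(thr, sens, spec)
```

Youden's index weights sensitivity and specificity equally; when missing a progressing patient is costlier than a false alarm, a lower cutoff than the J-optimal one may be preferable.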

Although Brier scores were not explicitly reported in the results, the close calibration of the SVM and joint models suggests a low average prediction error, further supporting their utility in real-world settings (41). These findings underscore the model’s readiness for integration into routine risk assessment workflows, offering clinicians a non-invasive, data-driven tool to support personalized surveillance strategies in patients with CWP (42).
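For reference, the Brier score mentioned above is the mean squared difference between predicted probabilities and observed binary outcomes. The sketch below uses made-up values purely to show the calculation; it does not estimate the score of any model in this study.

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

# Illustrative outcomes and predictions (not study data).
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4]
print(brier_score(y_true, y_prob))
```

The score ranges from 0 (perfect) to 1, and a model that always predicts 0.5 scores 0.25, which gives a rough baseline for interpreting reported values.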

4.4 Addressing potential confounders and sampling bias

Despite the promising performance of our integrated radiomics-clinical model in predicting radiological progression in patients with CWP, several potential confounders and biases must be acknowledged (43). Firstly, this study was conducted retrospectively at a single institution with a limited sample size, which may have introduced selection bias. Patients included had relatively complete follow-up data and high-quality HRCT imaging, potentially excluding more severe or comorbid cases and thus limiting the generalizability of the model to the broader CWP population (44).

Secondly, although our model incorporated key clinical predictors, such as age, LDH, and pulmonary function indices (FEV1/FVC and FVC% pred), there may be unmeasured confounders that also influence disease progression. For example, occupational exposure intensity, smoking status, and comorbid pulmonary conditions (e.g., COPD, silicosis) were not fully accounted for due to data constraints (45). Additionally, manual segmentation of the lungs and lesions, while performed by experienced radiologists with good interobserver agreement, introduces some subjectivity that could affect the consistency of radiomic feature extraction, especially in borderline cases or early-stage disease.

Lastly, the class imbalance between progression and stable groups, though addressed by down-sampling and cross-validation, could still bias the model’s learning process (46). Furthermore, the lack of external validation in independent cohorts remains a key limitation (47, 48). Nevertheless, this study represents an important step toward integrating quantitative imaging biomarkers and clinical parameters to non-invasively assess early progression in CWP. Future multicenter, prospective studies with standardized imaging and a broader spectrum of clinical variables are needed to validate and refine this predictive approach (49, 50).
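Random down-sampling of the majority class, as mentioned above, can be sketched as follows. The text does not describe the exact procedure, so the helper `downsample_majority`, the fixed seed, and the choice to match every class to the smallest class are assumptions made for illustration.

```python
import numpy as np

def downsample_majority(X, y, seed=0):
    """Randomly drop samples so every class matches the smallest class size."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    keep.sort()  # preserve the original sample order
    return X[keep], y[keep]

# Toy cohort: 8 stable (0) and 3 progression (1) cases, illustrative only.
X = np.arange(22).reshape(11, 2)
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
X_bal, y_bal = downsample_majority(X, y)
print(np.bincount(y_bal))
```

When combined with cross-validation, resampling of this kind is usually applied inside each training fold only, so that validation folds keep the original class distribution.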

4.5 Conclusion and future directions

The present study constructed a combined predictive model based on HRCT radiomic features and pulmonary function indicators, which showed high accuracy, good calibration, and favorable clinical utility in identifying patients at risk of pulmonary fibrosis progression. These findings provide a novel and non-invasive approach for risk stratification in occupational pulmonary disease. However, several limitations should be acknowledged. Firstly, the study was conducted at a single center, which may restrict generalizability due to variations in CT acquisition protocols and population characteristics (51). Secondly, the model has not yet undergone external validation on an independent dataset. While internal validation yielded consistent results across training and test sets, future work should involve multi-center, prospective validation to confirm robustness (52, 53). Thirdly, the retrospective nature of the study may introduce bias in data collection and outcome classification. Although image acquisition was standardized, longitudinal follow-up data were limited (54). Future studies should adopt a prospective design with repeated imaging and clinical evaluation to enable dynamic prediction modeling (55). Lastly, while radiomic features provide valuable information, their direct biological correlates remain partially understood. Ongoing efforts in imaging-pathology correlation and multi-omics integration are essential to further refine feature selection and enhance clinical adoption (56, 57).

In conclusion, our study presents a predictive framework combining HRCT-derived radiomic features and pulmonary function data to identify patients at risk of pulmonary fibrosis progression among coalworkers. The integrated model demonstrated high predictive accuracy, reliable calibration, and favorable clinical utility (58). With further external validation and real-world testing, such a tool may support early intervention and individualized disease monitoring in occupational respiratory health (59).

Data availability statement

The original contributions presented in this study are included in this article; further inquiries can be directed to the corresponding authors.

Ethics statement

The studies involving humans were approved by the Institutional Review Board (IRB) of Chongqing Medical and Pharmaceutical College (approval number KYLLSC20240730015). The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. The animal study was approved by the Institutional Review Board (IRB) of Chongqing Medical and Pharmaceutical College (approval number KYLLSC20240730015).

Author contributions

XL: Supervision, Writing – original draft, Conceptualization, Methodology, Software, Data curation, Investigation, Validation, Formal Analysis, Funding acquisition, Resources, Visualization, Project administration, Writing – review and editing. QL: Writing – review and editing, Software, Writing – original draft, Investigation, Conceptualization. WW: Writing – original draft, Writing – review and editing, Software, Conceptualization, Investigation. XmL: Writing – review and editing, Investigation, Software, Writing – original draft, Conceptualization. TZ: Investigation, Conceptualization, Writing – review and editing, Writing – original draft. LZ: Methodology, Investigation, Supervision, Writing – review and editing, Writing – original draft, Data curation, Software, Conceptualization. YL: Funding acquisition, Resources, Visualization, Formal Analysis, Writing – review and editing, Software, Validation, Methodology, Conceptualization, Writing – original draft, Supervision, Investigation, Data curation, Project administration. LW: Writing – review and editing, Conceptualization, Writing – original draft, Software. WX: Visualization, Funding acquisition, Project administration, Resources, Writing – review and editing, Formal Analysis, Validation, Conceptualization, Methodology, Data curation, Supervision, Writing – original draft, Investigation, Software. XX: Writing – original draft, Data curation, Formal Analysis, Investigation.

Funding

The authors declare that financial support was received for the research and publication of this article. This work was funded by the Chongqing Medical Scientific Research Project (Joint Project of Chongqing Health Commission and Science and Technology Bureau) (Nos. 2023GGXM006 and 2024ZDXM026), the Key Research Project from Chongqing Medical and Pharmaceutical Vocational Education Group (No. CQZJ202329), the Chongqing Key Municipal Public Health Specialty Construction Project, the 2024 Scientific Research Project of Chongqing Medical and Pharmaceutical College (No. ygzrc2024101), the Chongqing Education Commission Natural Science Foundation (Nos. KJQN202402821 and KJQN202302820), the Chongqing Shapingba District Science and Technology Bureau Project (No. 2024071), the 2024 Chongqing Medical and Pharmaceutical College Innovation Research Group Project (No. ygz2024401), the Chongqing Shapingba District Science and Health Joint Medical Research Project (No. 2024SQKWLHMS051), the Foundation of Chongqing Key Laboratory of Prevention and Treatment for Occupational Diseases and Poisoning (No. 2021ZYBKF01), the Medical Research Program of Chongqing municipal health, Health Committee (No. 2023WJWYX12), and the Key Projects of Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJZD-K202302802).

Acknowledgments

We would like to extend our heartfelt appreciation to Zhenjun Xi and Chenyin He for their invaluable contributions to the data collection and experimental process. Their meticulous efforts ensured the accuracy and reliability of our research findings. Additionally, we are grateful to Mei Yu, Jiemei Jiang, Li Yan, and Lvsu Ye for their assistance in data analysis and interpretation, which greatly enhanced the depth and quality of our research outcomes.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Vanka K, Shukla S, Gomez H, James C, Palanisami T, Williams K, et al. Understanding the pathogenesis of occupational coal and silica dust-associated lung disease. Eur Respir Rev. (2022) 31:210250. doi: 10.1183/16000617.0250-2021

2. Cool C, Murray J, Vorajee N, Rose C, Zell-Baran L, Sanyal S, et al. Pathologic findings in severe coal workers’ pneumoconiosis in contemporary US coal miners. Arch Pathol Lab Med. (2024) 148:805–17. doi: 10.5858/arpa.2022-0491-OA

3. Duan Z, Zhou L, Wang T, Han L, Zhang J. Survival and disease burden analysis of occupational pneumoconiosis from 1956 to 2021 in Jiangsu Province. J Occup Environ Med. (2023) 65:407–12. doi: 10.1097/JOM.0000000000002795

4. Liu X, Jiang Q, Wu P, Han L, Zhou P. Global incidence, prevalence and disease burden of silicosis: 30 years’ overview and forecasted trends. BMC Public Health. (2023) 23:1366. doi: 10.1186/s12889-023-16295-2

5. Liu W, Liang R, Zhang R, Wang B, Cao S, Wang X, et al. Prevalence of coal worker’s pneumoconiosis: A systematic review and meta-analysis. Environ Sci Pollut Res Int. (2022) 29:88690–8. doi: 10.1007/s11356-022-21966-5

6. Akira M, Suganuma N. Imaging diagnosis of pneumoconiosis with predominant nodular pattern: HRCT and pathologic findings. Clin Imaging. (2023) 97:28–33. doi: 10.1016/j.clinimag.2023.02.010

7. Calabrese F, Montero-Fernandez M, Kern I, Pezzuto F, Lunardi F, Hofman P, et al. The role of pathologists in the diagnosis of occupational lung diseases: An expert opinion of the European society of pathology pulmonary pathology working group. Virchows Arch. (2024) 485:173–95. doi: 10.1007/s00428-024-03845-1

8. Weissman D. Progressive massive fibrosis: An overview of the recent literature. Pharmacol Ther. (2022) 240:108232. doi: 10.1016/j.pharmthera.2022.108232

9. Cohen R, Rose C, Go L, Zell-Baran L, Almberg K, Sarver E, et al. Pathology and mineralogy demonstrate respirable crystalline silica is a major cause of severe pneumoconiosis in U.S. coal miners. Ann Am Thorac Soc. (2022) 19:1469–78. doi: 10.1513/AnnalsATS.202109-1064OC

10. Selman M, Pardo A. From pulmonary fibrosis to progressive pulmonary fibrosis: A lethal pathobiological jump. Am J Physiol Lung Cell Mol Physiol. (2021) 321:L600–7. doi: 10.1152/ajplung.00310.2021

11. Lurje I, Gaisa N, Weiskirchen R, Tacke F. Mechanisms of organ fibrosis: Emerging concepts and implications for novel treatment strategies. Mol Aspects Med. (2023) 92:101191. doi: 10.1016/j.mam.2023.101191

12. Boucherat O, Agrawal V, Lawrie A, Bonnet S. The latest in animal models of pulmonary hypertension and right ventricular failure. Circ Res. (2022) 130:1466–86. doi: 10.1161/CIRCRESAHA.121.319971

13. Jone P, Ivy D, Hauck A, Karamlou T, Truong U, Coleman R, et al. Pulmonary hypertension in congenital heart disease: A scientific statement from the American heart association. Circ Heart Fail. (2023) 16:e00080. doi: 10.1161/HHF.0000000000000080

14. Hang W, Bu C, Cui Y, Chen K, Zhang D, Li H, et al. Research progress on the pathogenesis and prediction of pneumoconiosis among coal miners. Environ Geochem Health. (2024) 46:319. doi: 10.1007/s10653-024-02114-z

15. Libu C, Otelea M, Arghir I, Rascu A, Antoniu S, Arghir O. Challenges in diagnosing occupational chronic obstructive pulmonary disease. Medicina (Kaunas). (2021) 57:911. doi: 10.3390/medicina57090911

16. Jung T, Vij N. Early diagnosis and real-time monitoring of regional lung function changes to prevent chronic obstructive pulmonary disease progression to severe emphysema. J Clin Med. (2021) 10:5811. doi: 10.3390/jcm10245811

17. Choe J, Hwang H, Lee S, Yoon J, Kim N, Seo JB. CT Quantification of interstitial lung abnormality and interstitial lung disease: From technical challenges to future directions. Invest Radiol. (2025) 60:43–52. doi: 10.1097/RLI.0000000000001103

18. Hussain S, Mubeen I, Ullah N, Shah S, Khan B, Zahoor M, et al. Modern diagnostic imaging technique applications and risk factors in the medical field: A review. Biomed Res Int. (2022) 2022:5164970. doi: 10.1155/2022/5164970

19. Ibrahim A, Primakov S, Beuque M, Woodruff H, Halilaj I, Wu G, et al. Radiomics for precision medicine: Current challenges, future prospects, and the proposal of a new framework. Methods. (2021) 188:20–9. doi: 10.1016/j.ymeth.2020.05.022

20. Selvam M, Chandrasekharan A, Sadanandan A, Anand V, Murali A, Krishnamurthi G. Radiomics as a non-invasive adjunct to Chest CT in distinguishing benign and malignant lung nodules. Sci Rep. (2023) 13:19062. doi: 10.1038/s41598-023-46391-7

21. Stamate E, Piraianu A, Ciobotaru O, Crassas R, Duca O, Fulga A, et al. Revolutionizing cardiology through artificial intelligence-big data from proactive prevention to precise diagnostics and cutting-edge treatment-A comprehensive review of the past 5 years. Diagnostics (Basel). (2024) 14:1103. doi: 10.3390/diagnostics14111103

22. Serrano D, Luciano F, Anaya B, Ongoren B, Kara A, Molina G, et al. Artificial Intelligence (AI) applications in drug discovery and drug delivery: Revolutionizing personalized medicine. Pharmaceutics. (2024) 16:1328. doi: 10.3390/pharmaceutics16101328

23. Khanna D, Distler O, Cottin V, Brown K, Chung L, Goldin J, et al. Diagnosis and monitoring of systemic sclerosis-associated interstitial lung disease using high-resolution computed tomography. J Scleroderma Relat Disord. (2022) 7:168–78. doi: 10.1177/23971983211064463

24. Ricci F, Pugliese L, Cavallo A, Forcina M, De Stasio V, Presicce M, et al. Highlights of high-resolution computed tomography imaging in evaluation of complications and co-morbidities in idiopathic pulmonary fibrosis. Acta Radiol. (2020) 61:204–18. doi: 10.1177/0284185119857435

25. Suman G, Koo C. Recent advancements in computed tomography assessment of fibrotic interstitial lung diseases. J Thorac Imaging. (2023) 38:S7–18. doi: 10.1097/RTI.0000000000000705

26. Chen Z, Lin Z, Lin Z, Zhang Q, Zhang H, Li H, et al. The applications of CT with artificial intelligence in the prognostic model of idiopathic pulmonary fibrosis. Ther Adv Respir Dis. (2024) 18:17534666241282538. doi: 10.1177/17534666241282538

27. So A, Nicolaou S. Spectral computed tomography: Fundamental principles and recent developments. Korean J Radiol. (2021) 22:86–96. doi: 10.3348/kjr.2020.0144

28. Du Z, Tian W, Tilley M, Wang D, Zhang G, Li Y. Quantitative assessment of wheat quality using near-infrared spectroscopy: A comprehensive review. Compr Rev Food Sci Food Saf. (2022) 21:2956–3009. doi: 10.1111/1541-4337.12958

29. Mu M, Li B, Zou Y, Wang W, Cao H, Zhang Y, et al. Coal dust exposure triggers heterogeneity of transcriptional profiles in mouse pneumoconiosis and Vitamin D remedies. Part Fibre Toxicol. (2022) 19:7. doi: 10.1186/s12989-022-00449-y

30. Zhang L, Rong R, Li Q, Yang D, Yao B, Luo D, et al. A deep learning-based model for screening and staging pneumoconiosis. Sci Rep. (2021) 11:2201. doi: 10.1038/s41598-020-77924-z

31. Shah I, Mishra S. Artificial intelligence in advancing occupational health and safety: An encapsulation of developments. J Occup Health. (2024) 66:uiad017. doi: 10.1093/joccuh/uiad017

32. Sanchez-Morillo D, León-Jiménez A, Guerrero-Chanivet M, Jiménez-Gómez G, Hidalgo-Molina A, Campos-Caro A. Integrating routine blood biomarkers and artificial intelligence for supporting diagnosis of silicosis in engineered stone workers. Bioeng Transl Med. (2024) 9:e10694. doi: 10.1002/btm2.10694

33. Wei J, Zhao Q, Yang G, Huang R, Li C, Qi Y, et al. Mesenchymal stem cells ameliorate silica-induced pulmonary fibrosis by inhibition of inflammation and epithelial-mesenchymal transition. J Cell Mol Med. (2021) 25:6417–28. doi: 10.1111/jcmm.16621

34. Dong H, Zhu B, Kong X, Zhang X. Efficient clinical data analysis for prediction of coal workers’ pneumoconiosis using machine learning algorithms. Clin Respir J. (2023) 17:684–93. doi: 10.1111/crj.13657

35. Yang F, Tang Z, Chen J, Tang M, Wang S, Qi W, et al. Pneumoconiosis computer aided diagnosis system based on X-rays and deep learning. BMC Med Imaging. (2021) 21:189. doi: 10.1186/s12880-021-00723-z

36. Walsh S, Mackintosh J, Calandriello L, Silva M, Sverzellati N, Larici A, et al. Deep learning-based outcome prediction in progressive fibrotic lung disease using high-resolution computed tomography. Am J Respir Crit Care Med. (2022) 206:883–91. doi: 10.1164/rccm.202112-2684OC

37. Dack E, Christe A, Fontanellaz M, Brigato L, Heverhagen J, Peters A, et al. Artificial intelligence and interstitial lung disease: Diagnosis and prognosis. Invest Radiol. (2023) 58:602–9. doi: 10.1097/RLI.0000000000000974

38. Koo C, Larson N, Parris-Skeete C, Karwoski R, Kalra S, Bartholmai B, et al. Prospective machine learning CT quantitative evaluation of idiopathic pulmonary fibrosis in patients undergoing anti-fibrotic treatment using low- and ultra-low-dose CT. Clin Radiol. (2022) 77:e208–14. doi: 10.1016/j.crad.2021.11.006

39. Zhang G, Luo L, Zhang L, Liu Z. Research progress of respiratory disease and idiopathic pulmonary fibrosis based on artificial intelligence. Diagnostics (Basel). (2023) 13:357. doi: 10.3390/diagnostics13030357

40. Preuss K, Thach N, Liang X, Baine M, Chen J, Zhang C, et al. Using quantitative imaging for personalized medicine in pancreatic cancer: A review of radiomics and deep learning applications. Cancers (Basel). (2022) 14:1654. doi: 10.3390/cancers14071654

41. Yang H, Liu H, Lin J, Xiao H, Guo Y, Mei H, et al. An automatic texture feature analysis framework of renal tumor: Surgical, pathological, and molecular evaluation based on multi-phase abdominal CT. Eur Radiol. (2024) 34:355–66. doi: 10.1007/s00330-023-10016-4

42. Ramli Z, Karim M, Effendy N, Abd Rahman M, Kechik M, Ibahim M, et al. Stability and reproducibility of radiomic features based on various segmentation techniques on cervical cancer DWI-MRI. Diagnostics (Basel). (2022) 12:3125. doi: 10.3390/diagnostics12123125

43. Tharmaseelan H, Rotkopf L, Ayx I, Hertel A, Nörenberg D, Schoenberg S, et al. Evaluation of radiomics feature stability in abdominal monoenergetic photon counting CT reconstructions. Sci Rep. (2022) 12:19594. doi: 10.1038/s41598-022-22877-8

44. Guiot J, Vaidyanathan A, Deprez L, Zerka F, Danthine D, Frix A, et al. A review in radiomics: Making personalized medicine a reality via routine imaging. Med Res Rev. (2022) 42:426–40. doi: 10.1002/med.21846

45. Fathi Kazerooni A, Bagley S, Akbari H, Saxena S, Bagheri S, Guo J, et al. Applications of radiomics and radiogenomics in high-grade gliomas in the era of precision medicine. Cancers (Basel). (2021) 13:5921. doi: 10.3390/cancers13235921

46. Wei J, Jiang H, Zhou Y, Tian J, Furtado F, Catalano O. Radiomics: A radiological evidence-based artificial intelligence technique to facilitate personalized precision medicine in hepatocellular carcinoma. Dig Liver Dis. (2023) 55:833–47. doi: 10.1016/j.dld.2022.12.015

47. He J, Hu J, Liu H. A three-gene random forest model for diagnosing idiopathic pulmonary fibrosis based on circadian rhythm-related genes in lung tissue. Expert Rev Respir Med. (2023) 17:1307–20. doi: 10.1080/17476348.2024.2311262

48. Li L, Li W, Xiao L, Lai W. Lactylation signature identifies liver fibrosis phenotypes and traces fibrotic progression to hepatocellular carcinoma. Front Immunol. (2024) 15:1433393. doi: 10.3389/fimmu.2024.1433393

49. Constantinescu E, Udriștoiu AL, Udriștoiu ȘC, Iacob AV, Gruionu LG, Gruionu G, et al. Transfer learning with pre-trained deep convolutional neural networks for the automatic assessment of liver steatosis in ultrasound images. Med Ultrason. (2021) 23:135–9. doi: 10.11152/mu-2746

50. Song X, Xu H, Wang X, Liu W, Leng X, Hu Y, et al. Use of ultrasound imaging Omics in predicting molecular typing and assessing the risk of postoperative recurrence in breast cancer. BMC Womens Health. (2024) 24:380. doi: 10.1186/s12905-024-03231-8

51. Boldt J, Schuster M, Krastl G, Schmitter M, Pfundt J, Stellzig-Eisenhauer A, et al. Developing the benchmark: Establishing a gold standard for the evaluation of AI caries diagnostics. J Clin Med. (2024) 13:3846. doi: 10.3390/jcm13133846

52. Deniffel D, Abraham N, Namdar K, Dong X, Salinas E, Milot L, et al. Using decision curve analysis to benchmark performance of a magnetic resonance imaging-based deep learning model for prostate cancer risk assessment. Eur Radiol. (2020) 30:6867–76. doi: 10.1007/s00330-020-07030-1

53. Chu F, Liu Y, Liu Q, Li W, Jia Z, Wang C, et al. Development and validation of MRI-based radiomics signatures models for prediction of disease-free survival and overall survival in patients with esophageal squamous cell carcinoma. Eur Radiol. (2022) 32:5930–42. doi: 10.1007/s00330-022-08776-6

54. Tan Y, Liu R, Xue J, Feng Z. Construction and validation of artificial intelligence pathomics models for predicting pathological staging in colorectal cancer: Using multimodal data and clinical variables. Cancer Med. (2024) 13:e6947. doi: 10.1002/cam4.6947

55. Liu Y, Wu J, Zhou J, Guo J, Liang C, Xing Y, et al. Identification of high-risk population of pneumoconiosis using deep learning segmentation of lung 3D images and radiomics texture analysis. Comput Methods Programs Biomed. (2024) 244:108006. doi: 10.1016/j.cmpb.2024.108006

56. Li Z, Zhao M, Li Z, Huang Y, Chen Z, Pu Y, et al. Quantitative texture analysis using machine learning for predicting interpretable pulmonary perfusion from non-contrast computed tomography in pulmonary embolism patients. Respir Res. (2024) 25:389. doi: 10.1186/s12931-024-03004-9

57. He W, Jin N, Deng H, Zhao Q, Yuan F, Chen F, et al. Workers’ occupational dust exposure and pulmonary function assessment: Cross-sectional study in China. Int J Environ Res Public Health. (2022) 19:11065. doi: 10.3390/ijerph191711065

58. Nandavaram S, Mei X, Khan T, Faruqi M, Fyffe Z, Keshavamurthy S, et al. Outcomes of lung transplantation in coal workers pneumoconiosis: Analysis of UNOS database. Clin Transplant. (2024) 38:e70042. doi: 10.1111/ctr.70042

59. de Jersey A, Lavers J, Zosky G, Rivers-Auty J. The understudied global experiment of pollution’s impacts on wildlife and human health: The ethical imperative for interdisciplinary research. Environ Pollut. (2023) 336:122459. doi: 10.1016/j.envpol.2023.122459

Keywords: coalworker pneumoconiosis, pulmonary interstitial fibrosis, CT radiomics, clinical features, machine learning, predictive model, multimodal joint model, ROC curve

Citation: Li X, Li Q, Xie X, Wang W, Li X, Zhang T, Zhang L, Liu Y, Wang L and Xie W (2025) Integrating CT radiomics and clinical data with machine learning to predict fibrosis progression in coalworker pneumoconiosis. Front. Med. 12:1599739. doi: 10.3389/fmed.2025.1599739

Received: 25 March 2025; Accepted: 17 June 2025;
Published: 22 July 2025.

Edited by:

Chen Shanxiong, Southwest University, China

Reviewed by:

Chen Zhang, Anhui Normal University, China
Fubin Zhang, China West Normal University, China

Copyright © 2025 Li, Li, Xie, Wang, Li, Zhang, Zhang, Liu, Wang and Xie. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yongsheng Liu, 241002@cqmpc.edu.cn; Li Wang, 2430152@cqmpc.edu.cn; Wutao Xie, 2430153@cqmpc.edu.cn

†These authors have contributed equally to this work and share first authorship
