- 1Department of Pharmacy, Affiliated Hospital of Zunyi Medical University, Zunyi, Guizhou, China
- 2Department of Pharmacy, Yuncheng Central Hospital Affiliated to Shanxi Medical University, Yuncheng, Shanxi, China
- 3Department of Pharmacy, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
Background: Severe adverse drug reactions (SADRs) pose significant challenges to pharmacotherapy. Machine learning (ML) models hold promise in providing reliable solutions for predicting SADRs. This study is designed to pinpoint the independent risk factors contributing to SADRs through the application of ML techniques, thus constructing a predictive model for SADRs applicable in real-world clinical settings.
Methods: This retrospective dual-center cohort study analyzed adverse drug reaction (ADR) cases reported in two Chinese tertiary medical centers from 2014 to 2022. Per the World Health Organization - Uppsala Monitoring Centre severity criteria, cases were classified as SADRs or common ADRs. Independent predictors were identified via univariate and multivariate logistic regression (LR). A random partitioning of the data set resulted in a 75% training set and a 25% test set. The performance of three ML algorithms, including LR, Random Forest and Gradient Boosting Machine, was compared. A nomogram was constructed, model performance was measured by the area under the receiver operating characteristic curve (AUC), concordance index (C index), Hosmer-Lemeshow test (H-L test), Decision Curve Analysis (DCA), and Clinical Impact Curve (CIC).
Results: A total of 508 SADRs were identified. The AUC values of LR model demonstrates the highest predictability among the three ML models. The AUC was 0.707 in the test set and the AUC in the training set was 0.689. A nomogram was established based on the LR model and evaluated. The C-index was 0.714 in the test set and the AUC in the training set was 0.713; The H-L test produced a chi-square value of 9.769 (p = 0.369), indicating good calibration. The DCA and CIC verify that the LR model possesses significant predictive value. According to the LR model, there were 20 predictors, including age ≥54 years, concurrent diseases ≥3, cardiac insufficiency, hemorrhagic disorders, active malignancies, cerebral infarction, bone fractures, anti-infectives, cytotoxic antineoplastics, proton pump inhibitors, antiepileptics, anticoagulants, diagnostic agents, arterial administration.
Conclusion: This study established a predictive nomogram for SADRs based on LR through comparative analysis of three ML approaches. The developed nomogram enables clinically meaningful risk stratification for SADRs, facilitating prophylactic surveillance of high-risk populations.
Introduction
Adverse drug reactions (ADRs) refer to harmful and unexpected pharmacological reactions that manifest at standard therapeutic doses (Montané and Santesmases, 2020). Approximately 10%–20% of inpatients and 25% of outpatients experience ADRs (Garon et al., 2017). Systematic review and meta - analysis reveal marked variations in the incidence of serious adverse drug reactions (SADRs) linked to hospitalization, spanning from 1.0% to 16.8% (Formica et al., 2018). It was showed that 6.7% of hospitalized patients incur SADRs, with a 0.32% prevalence of fatal ADRs (Formica et al., 2018). Alarmingly, research in pharmacovigilance has shown that ADRs are the fourth top cause of death around the world (Moyer et al., 2019).
Pharmacovigilance constitutes a pivotal strategy for the prevention and mitigation of adverse drug events (ADEs) and ADRs (Song et al., 2023). Methodologically, it is dichotomised into passive and active monitoring systems (Li and Yin, 2019). Passive surveillance relies on spontaneous reporting systems (SRS), through which healthcare professionals voluntarily submit observed ADRs. Owing to its minimal infrastructural requirements, SRS is the predominant mode of data collection in many jurisdictions (Mulchandani and Kakkar, 2019). Its advantages include high data volume, ease of access, and low operational cost. Nevertheless, this approach is intrinsically limited by under-reporting, duplication, low reporting rates, and variable data quality (Shamim et al., 2024; Crisafulli et al., 2025; Alatawi and Hansen, 2017). Active pharmacovigilance leverages comprehensive electronic health records that encapsulate detailed patient-level information, thereby enabling the simultaneous control of confounders such as polypharmacy, combination therapies, and sociodemographic characteristics (Zhuo et al., 2014). However, the substantial human and financial resources required for its implementation have constrained its widespread adoption. Recently, artificial intelligence (AI), particularly machine learning (ML), has emerged as a transformative paradigm in pharmacovigilance (Demirsoy and Karaibrahimoglu, 2023; Hu et al., 2024; Salas et al., 2022). By learning patterns from large-scale data, these algorithms facilitate the automated identification of ADEs/ADRs, extraction of drug–drug interactions, and stratification of patients at elevated ADR risk (Salas et al., 2022). The principal advantages of AI-driven approaches are threefold: (i) Automated report generation and multimodal data integration, amalgamating electronic medical records, genomic data, social media content, and other heterogeneous sources, thereby enhancing both the velocity and accuracy of data processing (Kompa et al., 2022; Golder et al., 2025; Dsouza et al., 2025; Dimitsaki et al., 2024). (ii) Objective risk assessment, wherein AI models generate causal probability scores by systematically analysing confounding factors, temporal relationships, and the hierarchy of literature evidence, thereby mitigating subjective bias (Desai, 2024). (iii) Individualised risk prediction, achieved through predictive models that integrate patients’ genomic, metabolic, and medication histories to deliver personalised ADR risk estimates (Ward et al., 2021). Although ML approaches have demonstrated significant advantages in ADR research, the prediction of ADRs still face challenges, particularly as studies on ML models for predicting SADRs remain scarce.
Given the significant adverse impact of SADRs on patient health, it is important to identify risk factors associated with SADRs and develop predictive models. This retrospective study aims to establish an accurate and reliable model for predicting SADRs by using ML algorithms.
Materials and methods
Study design
Pharmacovigilance data from two different tertiary care institutions in China were used in this dual-center retrospective cohort study: Affiliated Hospital of Zunyi Medical University (Zunyi, Guizhou); Yuncheng Central Hospital affiliated to Shanxi Medical University (Yuncheng, Shanxi). The dataset spans from 1 January 2014, to 31 December 2022, and includes all ADR reports registered in the national surveillance systems of the institutions involved.
This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the Affiliated Hospital of Zunyi Medical University (Approval No. KLL-2021-257) and the collaborating center. The requirement for written informed consent was waived by the Ethics Committee due to the retrospective nature of the research.
Data assessment
Experts from the Drug Reevaluation Centre of China National Medical Products Administration evaluated all the adverse drug reactions (ADRs), which were categorized as “certain”, “probable”, “possible”, “irrelevant”, “to be evaluated”and “unable to evaluate” based on World Health Organization - Uppsala Monitoring Centre (WHO-UMC) causality assessment criteria (Chen et al., 2018). The authors reassessed the causal relationship of the ADR using the Naranjo algorithm (Shukla et al., 2021). Inclusion criteria were limited to ADRs categorized as “certain”, “probable”, or “possible” (The Uppsala Monitoring Centre, 2013). Cases were excluded if: (i) Causality assessments classified as “irrelevant”, “to be evaluated” or “unable to evaluate”; (ii) Key information were missing, as gender, age; (iii) Occurrence time was unclear; (iv) Diagnosis incomplete.
In accordance with ICH E2A guidelines (Therapeutic Goods Administration of Department of Health of Australian Government, 2000), SADRs are defined as any event meeting at least one of the following standards: (i) Requires inpatient hospitalization or prolongation of hospitalization; (ii) Results in persistent or significant disability/incapacity; (iii) A congenital anomaly/birth defect; (iv) Life-threatening; (v) Fatal outcome; (vi) Other serious medical events that may lead to the aforementioned outcomes (Chen et al., 2018).
Additionally, ADR outcomes were typically encompassed within several definitive categories: cured, improvement, recovered with sequelae, no healing, death and unknown (Sun et al., 2020).
The MedDRA v24.0 classification system was used to describe organ-specific injury manifestations of common ADRs and SADRs (International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use, 2021). The WHO-ATC classification system was employed to evaluate the ADR profiles of different drug categories (Norwegian Institute of Public Health, 2025).
Data analysis
SPSS (version 26.0, Chicago, IL, United States) and R software (version 3.5.1, R Foundation for Statistical Computing, Vienna, Austria) were used for statistical analyses. The Shapiro test was employed to evaluate the distribution of continuous variables. For normallydistributed variables, data were shown as mean ± standard deviation (SD), while skewed data were presented as median (interquartile range, IQR). Categorical data were represented by frequencies and percentages.
The comparison of the common ADRs cohort with the SADRs cohort was conducted using the unpaired Student’s t-test or the non-parametric Mann-Whitney test for continuous variables, and chi-square or Fisher’s exact tests for categorical variables. Through receiver operating characteristic (ROC) curve analysis, continuous variables that showed statistical significance were divided into two groups, with the Youden index used to find the optimal diagnostic thresholds (Obuchowski and Bullen, 2018). To identify risk factors for SADRs, chi-square (or Fisher’s exact) tests or univariate LR analysis were conducted. In the univariate analysis, variables with p-values <0.10 along with clinically relevant parameters were included in the multivariate LR model via forward stepwise selection (Faria et al., 2022; Eastment et al., 2019). Clinically relevant parameters were included in the multivariate LR model via forward stepwise selection (Faria et al., 2022). Two-tailed tests were used, and a P value below 0.05 was regarded as statistically significant.
In this study, a fixed ratio of 75%-25% was employed for data partitioning in accordance with the requirements of TRIPOD-AI and PROBAST by using the R language (via the createDataPartition function from the caret package).
In building the model, three ML algorithms were used: LR, Random Forest (RF), and Gradient Boosting Machines (GBM). The hyperparameter tuning part is in the Supplementary File 1. These classifiers were systematically implemented to construct prediction systems using two independent datasets containing optimally selected feature variables (Lin et al., 2024). The effectiveness of each model’s predictions was assessed using the area under the receiver operating characteristic curve (AUC) (Li et al., 2022). Furthermore, we also calculated specificity, accuracy, sensitivity, F1-score and precision for each model’s AUC at the “best” thresholds (Zhang S. T. et al., 2025; Peng et al., 2025). Bootstrap calculated using 1,000 stratified repetitions in the “pROC” program.
Nomogram implementation and test
Using the “rms” package in R statistical software, a nomogram for SADRs was developed from a multivariate LR model (Hoshino et al., 2018). Its performance was assessed with a concordance index (C-index) and calibration plots derived from bootstrap samples. Calibration plots provide a visual evaluation of predictive accuracy by comparing observed probabilities with those predicted by the nomogram, while a C-index measures discriminative ability numericallya. Decision Curve Analysis (DCA) was utilized to measure the clinical utility of the predictive models. By assessing the net benefit over various threshold probabilities, DCA allows for comparison of the nomogram with other models and highlights their respective differences. By showing the false-positive and true-positive rates as functions of the risk threshold, DCA effectively addresses the limitations of ROC curves (Zhang et al., 2022). Finally, the net clinical benefit of the model with the best diagnostic results was assessed by plotting clinical impact curves (CIC) (Hou et al., 2020).
Results
Baseline characteristics
Of the 4,860 cases, 4,333, accounting for 89.2%, fulfilled the inclusion/exclusion criteria and were retained for subsequent analysis (Figure 1). Causality assessment using the WHO-UMC criteria classified 2.5% (n = 109) of the cases as “certain,” with the remainder distributed between “probable” (62.4%, n = 2,702) and “possible” (35.1%, n = 1,522).
To better understand the risk factors associated with SADRs, the cases were divided into two cohorts: the common ADRs cohort (3,825 cases, 88.3%) and the SADRs cohort (508 cases, 11.7%). A comparison of demographic and clinical characteristics is shown in Table 1. The key findings are summarized as follows: (i) Male patients exhibited higher risks of SADRs, constituting 53.3% of SADRs. (ii) Patients aged 54 years and older had a significantly higher proportion of SADRs compared to those younger than 54 years (58.3% vs. 41.7%, p < 0.001). (iii) Among the routes examined, IA administration was associated with high occurrences of SADRs (2.6% vs. 0.3% for common ADRs). (iv) There were significant differences (p < 0.001) between SADRs and common ADRs in ADR type, clinical aoutcomes, impacts on the primary disease, and the occurrence time of ADRs.
Compared with common ADRs, SADRs showed significantly higher incidence in systemic organ injury, hepatobiliary dysfunction, hematologic abnormalities and urinary system injury (p < 0.05) (Supplementary File 2; Supplementary Tables S1, S2).
In terms of medications, diagnostic drugs had the highest proportion in SADRs at 28.9%, followed by central nervous system agents (18.0%), hematopoietic modulators (16.9%), anti-tumor medications (15.5%), gastrointestinal drugs (12.6%), immunomodulatory compounds (12.2%), anti-infective agents (12.0%), and traditional Chinese medicine preparations (11.0%) (Supplementary File 2; Supplementary Tables S3).
Univariate analysis of SADRs
Univariate analysis of the association between patients’ diseases and the occurrence of SADRs are shown in Supplementary File 2; Supplementary Table S4, which yielded the following results: (i) multiple comorbidities: patients with concurrent diseases ≥3 had elevated risks of SADRs compared to those with fewer than 3 concurrent diseases (p < 0.001). (ii) Disease-specific risks (p < 0.05): active malignancies, cardiac insufficiency, hemorrhagic disorder, cerebral infarction and bone fractures.
The univariate analysis results showed no statistically significant association between multidrug combination therapy and SADRs incidence (p > 0.05). The subsequent univariate LR analysis of SADRs included drugs that met the inclusion criteria of having 10 or more recorded SADR cases and an increased incidence. The results showed that SADRs occurrence were significantly associated with the following drugs (p < 0.05): ceftazidime, ceftriaxone, cefoperazone/sulbactam, carbapenems, vancomycin, antifungal agents, antiviral medications, antiepileptics, cytotoxic antineoplastics, antithrombotic agents, proton pump inhibitors and diagnostic agents (Supplementary File 2; Supplementary Table S5).
Multivariate analysis of SADRs
The “administration routes” were dichotomized into intra-arterial (IA) versus non-IA administration. All predictors were assessed for multicollinearity, with only those exhibiting a variance inflation factor <10 being retained in the model (Supplementary File 2; Supplementary Table S6). At last, the model identified 20 independent predictors of SADRs (Supplementary File 2; Supplementary Tables S7): (i) demographic predictors included age ≥54 years (OR 1.280, 95% CI 1.041–1.573; p = 0.019) and multi-morbidity (concurrent diseases ≥3: OR 1.581, 95% CI 1.223–2.043; p < 0.001). (ii) Pathological conditions: bone fractures showed the strongest pronounced association with SADRs (OR 2.900, 95% CI 1.703–4.939; p < 0.001), followed by cerebral infarction (OR 2.658, 95% CI 1.827–3.866; p < 0.001), hemorrhagic disorder (OR 1.984, 95% CI 1.272–3.094; p = 0.003), cardiac insufficiency (OR 1.694, 95% CI 1.247–2.303; p = 0.001) and active malignancies (OR 1.386, 95% CI 1.016–1.890; p = 0.040). (iii) Drug exposure: Among anti-infective drugs, cefoperazone/sulbactam showed the highest association (OR 9.499, 95% CI 5.187–17.397; p < 0.001), exceeding antiviral medications (OR 5.484, 95% CI 2.597–11.578; p < 0.001), vancomycin (OR 5.021, 95% CI 2.560–9.848; p < 0.001), ceftriaxone (OR 4.259, 95% CI 2.029–8.939; p < 0.001), ceftazidime (OR 3.457, 95% CI 1.964–6.083; p < 0.001), antifungal agents (OR 2.922, 95% CI 1.390–6.142; p = 0.005) and carbapenems (OR 2.880, 95% CI 1.331–6.232; p = 0.007). Among other types of drugs, antiepileptics (OR 5.732, 95% CI 3.139–10.467; p < 0.001) was higher than that of proton pump inhibitors (OR 5.283, 95% CI 2.746–10.164; p < 0.001), diagnostic agents (OR 3.884, 95% CI 2.538–5.944; p < 0.001), cytotoxic antineoplastics (OR 2.497, 95% CI 1.618–3.853; p < 0.001), antithrombotic agents (OR 2.271, 95% CI 1.351–3.818; p < 0.001). (iv) IA administration (OR 2.768, 95% CI 1.126–6.809; p = 0.027) also had a significant risk.
Comparison of SADRs prediction model by different ML
Utilizing the 20 variables selected (Figure 2), we employed three ML algorithms to construct an prediction model for SADRs. Following rigorous nested hyperparameter tuning, the possibility that the observed superiority arose from overfitting or stochastic parameter selection was effectively ruled out. As revealed in Figure 3A, AUC values of the LR, RF and GBM models were 0.707, 0.673, and 0.703 respectively based on the training set. Among the three models, LR showed the highest level of predictive accuracy (AUC = 0.707, 95% CI 0.676–0.738). The results of the test set were similar to the training sets (Figure 3B). The additional performance metrics of each model, including specificity, accuracy, sensitivity, F1-score and precision, are detailed in Supplementary File 2; Supplementary Tables S8, S9.
Figure 2. Forest plots of risk factors associated with SADRs. SADRs: serious adverse drug reactions; IA Administration: Intra-arterial Administration.
Figure 3. Comparative evaluation of ROC among the three ML models for the prediction of SADRs. ROC: the receiver operating characteristic curve; LR: logistic regression; RF: random forest; GBM: gradient boosting machine.
Establishment of a nomogram model for SADRs
Utilizing LR-derived predictors, a prognostic nomogram was developed to visualize individual risk factors (Figure 4). This graphical model incorporated all significant covariates identified through multivariate LR, with weighted point allocations reflecting effect magnitudes as detailed in Figure 2. Each risk factor was assigned a specific score, and the cumulative scores of all risk factors can correspond to the predicted probability of SADRs occurrence.
Figure 4. Nomogram for predicting SDARs. To estimate the probability of SADRs, mark patient values at each axis, draw a straight line perpendicular to the point axis, and sum the points for all variables. Next, mark the sum on the total point axis and draw a straight line perpendicular to the probability axis.
Evaluation of a nomogram model for predicting SADRs
A random partitioning of the data set resulted in a 75% training set and a 25% test set. Discriminative performance evaluation demonstrated moderate predictive accuracy in the training set (Figure 5A), with an AUC of 0.713 and a bootstrap-corrected C-index of 0.714. Calibration accuracy, assessed via 1000-resample bootstrap validation, revealed strong agreement between predicted probabilities and observed outcomes (Hosmer-Lemeshow goodness-of-fit test: χ2 = 9.769, p = 0.369), as visualized in the calibration curve (Figure 5B). DCA quantified clinical translatability across a threshold probability range of 7%–56%. Within this range of clinical relevance, the nomogram demonstrated a higher net benefit compared to the “treat-all” or “treat-none” strategies (Figure 5C), with the optimal threshold probability resulting in a maximum absolute risk reduction of 32.5%. The CIC illustrated that with a 20% threshold, the model’s prediction of at-risk individuals greatly surpassed the real number (Figure 5D). When the threshold probability was above 55%, the predicted number of high-risk subjects (predicted positive cases by the scoring system) was nearly identical to the true high-risk cases, indicating the nomogram model’s significant predictive value for SADRs.
Figure 5. Evaluation of a nomogram model for predicting SADRs. (A) Receiver Operating Characteristic curve (ROC) of the training set and the test set. The area under the curve (AUC) of the purposed nomogram for predicting SADR in the training set and in the test set was 0.713 and 0.709, respectively. (B) The calilbration of the nomogram. The Calibration curve analysis of the nomogram in the test set. (C) The DCA of the nomogram for predicting SADR in the test set. The DCA curve of the nomogram. it was revealed that SADR prediction using the nomogram isaccompanied with a higher net benefit than using any single factor alone, as wel as by treating either no or all patients. (D) The CIC curve of the purposed nomogram in the test set. The red curve (the number of individuals at high risk) indicates the number of persons who are classified as positive (high risk) by the prediction model at each threshold probability; the blue curve (the number of individuals at high risk with outcomes) is the number of true positives at each threshold probability.
Discussion
SADRs are critical concerns in pharmacovigilance, linked to prolonged hospitalization, higher medical costs, and adverse clinical outcomes. This retrospective study analyzed 4,333 ADRs, including 508 SADRs, aiming to identify SADR determinants. Three ML algorithms were developed for SADR prediction. The LR model outperformed RF and GBM. Subsequently, a nomogram was developed using the LR model.
It is important to highlight that in the present study, the continuous variable of age was dichotomized. The cut-off point of ≥54 years was determined based on the threshold corresponding to the optimal cut-off point in ROC analysis, rather than arbitrarily setting 60 years as the age limit for the elderly (Obuchowski and Bullen, 2018). This data-driven approach is consistent with the methodology employed in the pharmacovigilance study conducted by Han YZ et al., which identified 52 years as the optimal cut-off point for predicting ADRs of hepatotoxicity (Han et al., 2022). Consistent with the findings reported by Toni E et al., the present study corroborated the significant roles of age, comorbidities, and specific drug types in the occurrence of SADRs (Toni et al., 2024a; Toni et al., 2024b). For patients identified as high-risk, targeted preventive strategies can be implemented. By employing close monitoring or optimizing treatment plans, potential adverse reactions can be promptly detected and managed, thereby enhancing both the efficacy and safety of therapeutic interventions for these patients.
The application of ML techniques to predict SADRs is essential, particularly when conducted in accordance with relevant policies. This approach not only enhances the standardization and interpretability of research data but also promotes regulatory consistency and facilitates cross - departmental collaboration (Toni and Ayatollahi, 2025). ML, encompassing unsupervised learning, supervised learning, reinforcement learning, etc., has demonstrated notable advantages in ADR-related studies (Salas et al., 2022). However, differences in study population, data structure and confounding factors contribute to variations in performing of different ML algorithms for predicting ADRs (Deimazar and Sheikhtaheri, 2023). In the systematic report by Deimazar G et al., eight comparative studies on various ML algorithms for ADR prediction reported inconsistent results (Deimazar and Sheikhtaheri, 2023). In a study comparing three ML algorithms--LR, decision trees, and artificial neural networks-for predicting chemotherapy - induced ADRs, LR model had the highest AUC (0.67–0.83) for the six types of ADRs caused by chemotherapy drugs (On et al., 2022).
Following hyperparameter optimisation, the LR model exhibited the best discriminative performance among the three ML models we established, a finding that aligns with recent evidence (On et al., 2022). In a cross-sectional study to predict osteoporosis in older adults at high risk of cardiovascular disease, the LR model outperformed SVM, Random Forest, XGBoost, and Decision Tree models (Peng et al., 2025). Guo Y. et al. developed an ultrasound-based radiomics nomogram for identifying HER2 status in breast cancer patients, the LR model was found to perform the best on the validation set (Guo et al., 2022). Similarly, Xu R et al. reported that the LR model exhibited the highest discriminative power in identifying individuals with low bone mineral density using ML algorithms (Xu et al., 2024). Christodoulou E et al. compared the predictive capabilities of LR with other machine learning models, concluding that LR often performs comparably to or even outperforms ML algorithms in clinical prediction studies, particularly in datasets with limited sample sizes (Christodoulou et al., 2019). While ML algorithms hold theoretical advantages, especially in datasets with complex interactions or large sample sizes (Peng et al., 2025). In contrast, the LR model demonstrates strong reliability and applicability in clinical predictive analytics, with enhanced interpretability for binary outcomes (Dsouza et al., 2025). This highlights that LR and other ML models may be applicable in distinct scenarios: the LR model is preferable in studies with limited sample sizes, a small number of predictors, or low signal-to-noise ratio (SNR) data. Conversely, other ML models are more suitable for large datasets with numerous predictors, complex interactions/confounding factors, or high-SNR data (Christodoulou et al., 2019; Wei et al., 2024). We posit that even with hyperparameter tuning, the LR model may still exhibit advantages in datasets characterized by limited sample sizes and low signal-to-noise ratios. In future experiments, we recommend conducting a more extensive exploration of the performance of various models across different datasets, particularly under conditions of varying sample sizes and signal-to-noise ratios. Additionally, we suggest experimenting with a broader range of models and tuning methods to further validate the reliability of the aforementioned conclusion.
Nomograms have been identified as an efficient means of measuring ADRs risks in the past few years. Bai H et al. developed a nomogram for hospitalized adult patients to predict cefoperazone/sulbactam-induced hypoprothrombinemia, which demonstrated satisfactory predictive performance (Bai et al., 2023). Li P et al. constructed a nomogram to predict granisetron-associated arrhythmias, showing high discriminative and calibratioan capabilities (Li et al., 2024). Hong H et al. designed a predictive model with high efficacy for identifying vancomycin-induced acute kidney injury in overweight patients (Hong et al., 2024). Zhang Z et al. created a nomogram to forecast cutaneous adverse reactions induced by targeted cancer therapies and immunotherapy, offering critical insights for optimizing treatment efficacy and improving quality of life (Zhang Z. M. et al., 2025). However, to the best of our knowledge, no nomogram model specific to SADRs has been developed, particularly among large hospitalized populations using routinely collected clinical data.
Based on the LR model, we developed a nomogram to predict SADRs. This nomogram integrated multiple predictors as scaled axes aligned on a single plane, enabling clinicians to intuitively visualize individualized SADR probabilities. In clinical practice, physicians can utilize the nomogram model to assess the likelihood of SADR occurrence based on a patient’s specific risk factors. For instance, consider a 60-year-old male patient with more than three underlying diseases and heart insufficiency, who received treatment with cefoperazone/sulbactam for an infectious disease during hospitalization. According to the nomogram, his cumulative score for is 166, indicating a 60%–70% probability of SADR occurrence. Upon receiving this risk signal, physicians should implement stringent monitoring of the patient’s various clinical indicators following the administration of cefoperazone/sulbactam to preempt the occurrence of SADRs. Alternatively, physicians may consider substituting the anti-infective agent with piperacillin/tazobactam to mitigate the likelihood of SADR occurrence. The nomogram model we have developed is characterized by its simple structure and ease of understanding, thereby ensuring good interpretability. This feature not only facilitates the application and validation of the model across different settings but also provides convenience for future collaborative efforts among multiple institutions and departments. Therefore, this study offers theoretical value by enriching and expanding the methodological framework of SADR monitoring and pharmacovigilance, and serves as a reference for the development of personalized early-warning systems for SADRs.
The study presents multiple limitations. Firstly, it is a two-center retrospective cohort study. While it provided a preliminary pharmacovigilance approach, it was constrained by a small sample size and limited inclusion of variables, necessitating validation through larger, multicenter studies. Second, the study was restricted to the Han Chinese population, limiting its generalizability to other ethnicities. Third, the present study acknowledges several methodological limitations. Pharmacokinetic and pharmacogenomic parameters were not incorporated; future research could integrate these factors to develop a more comprehensive predictive model. Fourthly, despite our model demonstrating acceptable discrimination in internal validation (AUC = 0.713; C-index = 0.714), its performance remains at a moderate level. This suggests that while the nomogram holds value for signal detection within the existing data range, it is not yet robust enough to serve as an independent clinical decision-making tool. Further validation through multicenter, prospective cohort studies, or randomized clinical trials is necessary to comprehensively assess its calibration, clinical utility threshold, and cost-effectiveness, thereby delineating its true scope of application.
Future research on SADRs will be pursued along five complementary trajectories to strengthen surveillance and advance pharmacovigilance: (i) All variables in this study are derived from the mandatory reporting fields of the national “Drug Adverse Reaction Monitoring Specifications”. The two hospitals involved exhibit differences in demographics and drug catalogues, suggesting that the model has the potential for cross-regional migration. Subsequently, a prospective validation will be conducted, employing the same variable set and thresholds, and reporting the C-index, calibration slope, and decision curve to further evaluate the external validity. (ii) Employing ML models to perform sub-classification-based predictions of SADRs, thereby achieving a higher level of prognostic precision. (iii) Harnessing natural language processing for large-scale corpus analysis and integrating it with ML algorithms to shift SADR surveillance from retrospective tracing to prospective, real-time early warning (Le Glaz et al., 2021; Khanbhai et al., 2021). (iv) Embedding validated ML-SADR prediction modules into the Hospital Pharmacovigilance System to enable continuous, real-time monitoring and support proactive pharmacovigilance. (v) Collaborative research endeavors involving multiple stakeholders, including research institutions, pharmaceutical companies, and government agencies (Toni and Ayatollahi, 2025).
Conclusion
The occurrence of SADRs is associated with multiple factors. This study identified key predictors of SADRs and construct a nomogram, which facilitated prophylactic surveillance of high-risk populations.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.
Ethics statement
The studies involving humans were approved by Ethics Committee of the Affiliated Hospital of Zunyi Medical University. The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants’ legal guardians/next of kin due to the retrospective nature of the study. Written informed consent was not obtained from the minor(s)’ legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article because the requirement for written informed consent was waived by the Ethics Committee due to the retrospective nature of the research.
Author contributions
WB: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft. XW: Data curation, Formal Analysis, Investigation, Resources, Validation, Writing – review and editing. CW: Data curation, Formal analysis, Validation, Writing – original draft. YC: Conceptualization, Methodology, Project administration, Supervision, Validation, Writing – review and editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by the Science and Technology Bureau Program of Zun Yi, Guizhou Province [HZ (2020) 251].
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphar.2025.1669995/full#supplementary-material
References
Alatawi, Y. M., and Hansen, R. A. (2017). Empirical estimation of under-reporting in the U.S. food and drug administration adverse event reporting system (FAERS). Expert Opin. Drug Saf. 16 (7), 761–767. doi:10.1080/14740338.2017.1323867
Bai, H. H., Li, H., Nie, X. H., Yao, Y. Q., Han, X. N., Wang, J. P., et al. (2023). Development and validation of a nomogram for predicting cefoperazone/sulbactam-induced hypoprothrombinaemia in hospitalized adult patients. PLoS One 18 (9), e0291658. doi:10.1371/journal.pone.0291658
Chen, S. Q., Kwong, J. S. W., Zheng, R., Wang, Y. P., and Shang, H. C. (2018). Normative application of Xiyanping injection: a systematic review of adverse case reports. Evid. Based Complement. Altern. Med. 2018, 4013912. doi:10.1155/2018/4013912
Christodoulou, E., Ma, J., Collins, G. S., Steyerberg, E. W., Verbakel, J. Y., and Calster, B. (2019). A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J. Clin. Epidemiol. 110, 12–22. doi:10.1016/j.jclinepi.2019.02.004
Crisafulli, S., Bate, A., Brown, J. S., Candore, G., Chandler, R. E., Hammad, T. A., et al. (2025). Interplay of spontaneous reporting and longitudinal healthcare databases for signal management: position statement from the real-world evidence and big data special interest group of the international society of pharmacovigilance. Drug Saf. 48 (9), 959–976. doi:10.1007/s40264-025-01548-3
Deimazar, G., and Sheikhtaheri, A. (2023). Machine learning models to detect and predict patient safety events using electronic health records: a systematic review. Int. J. Med. Inf. 180, 105246. doi:10.1016/j.ijmedinf.2023.105246
Demirsoy, I., and Karaibrahimoglu, A. (2023). Identifying drug interactions using machine learning. Adv. Clin. Exp. Med. 32 (8), 829–838. doi:10.17219/acem/169852
Desai, M. K. (2024). Artificial intelligence in pharmacovigilance-opportunities and challenges. Perspect. Clin. Res. 15 (3), 116–121. doi:10.4103/picr.picr_290_23
Dimitsaki, S., Natsiavas, P., and Jaulent, M. C. (2024). Applying AI to structured real-world data for pharmacovigilance purposes: scoping review. J. Med. Internet Res. 26, e57824. doi:10.2196/57824
Dsouza, V. S., Leyens, L., Kurian, J. R., Brand, A., and Brand, H. (2025). Artificial intelligence (AI) in pharmacovigilance: a systematic review on predicting adverse drug reactions (ADR) in hospitalized patients. Res. Soc. Adm. Pharm. 21 (6), 453–462. doi:10.1016/j.sapharm.2025.02.008
Eastment, M. C., Wanje, G., Richardson, B. A., Nassir, F., Mwaringa, E., Barnabas, R. V., et al. (2019). Performance of family planning clinics in conducting recommended HIV counseling and testing in Mombasa County, Kenya: a cross-sectional study. BMC Health Serv. Res. 19 (1), 665. doi:10.1186/s12913-019-4519-x
Faria, M. H. D., Silva, L. M. A. C., Mafra, R. P., Santos, M. M. D., Soares, S. C. M., and Moura, J. M. B. O. (2022). Actinic cheilitis in rural workers: prevalence and associated factors. Einstein-Sao Paulo 20, eAO6862. doi:10.31744/einstein_journal/2022AO6862
Formica, D., Sultana, J., Cutroneo, P. M., Lucchesi, S., Angelica, R., Crisafulli, S., et al. (2018). The economic burden of preventable adverse drug reactions: a systematic review of observational studies. Expert Opin. Drug Saf. 17 (7), 681–695. doi:10.1080/14740338.2018.1491547
Garon, S. L., Pavlos, R. K., White, K. D., Brown, N. J., Stone, C. A., and Phillips, E. J. (2017). Pharmacogenomics of off-target adverse drug reactions. Br. J. Clin. Pharmacol. 83 (9), 1896–1911. doi:10.1111/bcp.13294
Golder, S., Xu, D., O'Connor, K., Wang, Y., Batra, M., and Hernandez, G. G. (2025). Leveraging natural language processing and machine learning methods for adverse drug event detection in electronic health/medical records: a scoping review. Drug Saf. 48 (4), 321–337. doi:10.1007/s40264-024-01505-6
Guo, Y. H., Wu, J. F., Wang, Y. L., and Jin, Y. (2022). Development and validation of an ultrasound-based radiomics nomogram for identifying HER2 status in patients with breast carcinoma. Diagn. (Basel) 12 (12), 3130. doi:10.3390/diagnostics12123130
Han, Y. Z., Guo, Y. M., Xiong, P., Ge, F. L., Jing, J., Niu, M., et al. (2022). Age-associated risk of liver-related adverse drug reactions. Front. Med. (Lausanne) 17 (9), 832557. doi:10.3389/fmed.2022.832557
Hong, H. D., Chen, Y. C., Zhou, L., Bao, J. A., and Ma, J. J. (2024). Risk factors analysis and construction of predictive models for acute kidney injury in overweight patients receiving vancomycin treatment. Expert Opin. Drug Saf. 23, 1–10. doi:10.1080/14740338.2024.2393285
Hoshino, N., Hida, K., Sakai, Y., Osada, S., Idani, H., Sato, T., et al. (2018). Nomogram for predicting anastomotic leakage after low anterior resection for rectal cancer. Int. J. Colorectal Dis. 33 (4), 411–418. doi:10.1007/s00384-018-2970-5
Hou, N. Z., Li, M. Z., He, L., Xie, B., Wang, L., Zhang, R. M., et al. (2020). Predicting 30-days mortality for MIMIC-III patients with sepsis-3: a machine learning approach using XGboost. J. Transl. Med. 18 (1), 462. doi:10.1186/s12967-020-02620-5
Hu, X. M., Hou, Y. Y., Teng, X. R., Liu, Y., Li, Y., Li, W., et al. (2024). Prediction of cytochrome P450-mediated bioactivation using machine learning models and in vitro validation. Arch. Toxicol. 98 (5), 1457–1467. doi:10.1007/s00204-024-03701-w
International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (2021). Introductory guide for standardised MedDRA queries (SMQs) version 24.0. Available online at: https://admin.new.meddra.org/sites/default/files/guidance/file/SMQ_intguide_24_0_English.pdf (Accessed October 06, 2024).
Khanbhai, M., Anyadi, P., Symons, J., Flott, K., Darzi, A., and Mayer, E. (2021). Applying natural language processing and machine learning techniques to patient experience feedback: a systematic review. BMJ Health Care Inf. 28 (1), e100262. doi:10.1136/bmjhci-2020-100262
Kompa, B., Hakim, J. B., Palepu, A., Kompa, K. G., Smith, M., Bain, P. A., et al. (2022). Artificial intelligence based on machine learning in pharmacovigilance: a scoping review. Drug Saf. 45 (5), 477–491. doi:10.1007/s40264-022-01176-1
Le Glaz, A., Haralambous, Y., Kim-Dufor, D. H., Lenca, P., Billot, R., Ryan, T. C., et al. (2021). Machine learning and natural language processing in mental health: systematic review. J. Med. Internet Res. 23 (5), e15708. doi:10.2196/15708
Li, L. S., and Yin, J. (2019). Drug safety evaluation in China. Curr. Allergy Asthma Rep. 19 (9), 39. doi:10.1007/s11882-019-0872-4
Li, J. L., Liu, S. R., Hu, Y. D., Zhu, L. F., Mao, Y. J., and Liu, J. L. (2022). Predicting mortality in intensive care unit patients with heart failure using an interpretable machine learning model: retrospective cohort study. J. Med. Internet Res. 24 (8), e38082. doi:10.2196/38082
Li, P., Zhu, M., Gao, A., Guo, H. L., Fu, A., Zhao, A. Q., et al. (2024). A case-control study on the clinical characteristics of granisetron-related arrhythmias and the development of a predictive nomogram. Int. J. Clin. Pharm. 46 (3), 684–693. doi:10.1007/s11096-024-01703-3
Lin, S. W., Lu, W. B., Wang, T., Wang, Y., Leng, X. Q., Chi, L. D., et al. (2024). Predictive model of acute kidney injury in critically ill patients with acute pancreatitis: a machine learning approach using the MIMIC-IV database. Ren. Fail. 46 (1), 2303395. doi:10.1080/0886022X.2024.2303395
Montané, E., and Santesmases, J. (2020). Adverse drug reactions. Med. Clin. Barc. 154 (5), 178–184. doi:10.1016/j.medcli.2019.08.007
Moyer, A. M., Matey, E. T., and Miller, V. M. (2019). Individualized medicine: sex, hormones, genetics, and adverse drug reactions. Pharmacol. Res. Perspect. 7 (6), e00541. doi:10.1002/prp2.541
Mulchandani, R., and Kakkar, A. K. (2019). Reporting of adverse drug reactions in India: a review of the current scenario, obstacles and possible solutions. Int. J. Risk Saf. Med. 30 (1), 33–44. doi:10.3233/JRS-180025
Norwegian Institute of Public Health (2025). WHO collaborating centre for drug statistics methodology ATC/DDD index 2025. Available online at: https://www.whocc.no/atc_ddd_index_and_guidelines/atc_ddd_index/ (Accessed April 30, 2025).
Obuchowski, N. A., and Bullen, J. A. (2018). Receiver operating characteristic (ROC) curves: review of methods with applications in diagnostic medicine. Phys. Med. Biol. 63 (7), 07TR01. doi:10.1088/1361-6560/aab4b1
On, J., Park, H. A., and Yoo, S. (2022). Development of a prediction models for chemotherapy-induced adverse drug reactions: a retrospective observational study using electronic health records. Eur. J. Oncol. Nurs. 56, 102066. doi:10.1016/j.ejon.2021.102066
Peng, Y., Zhang, C., and Zhou, B. (2025). A cross-sectional study comparing machine learning and logistic regression techniques for predicting osteoporosis in a group at high risk of cardiovascular disease among old adults. BMC Geriatr. 25 (1), 209. doi:10.1186/s12877-025-05840-w
Salas, M., Petracek, J., Yalamanchili, P., Aimer, O., Kasthuril, D., Dhingra, S., et al. (2022). The use of artificial intelligence in pharmacovigilance: a systematic review of the literature. Pharm. Med. 36 (5), 295–306. doi:10.1007/s40290-022-00441-z
Shamim, M. A., Shamim, M. A., Arora, P., and Dwivedi, P. (2024). Artificial intelligence and big data for pharmacovigilance and patient safety. J. Med. Surg. Public Health 3, 100139. doi:10.1016/j.glmedi.2024.100139
Shukla, A. K., Jhaj, R., Misra, S., Ahmed, S. N., Nanda, M., and Chaudhary, D. (2021). Agreement between WHO-UMC causality scale and the Naranjo algorithm for causality assessment of adverse drug reactions. J. Fam. Med. Prim. Care 10 (9), 3303–3308. doi:10.4103/jfmpc.jfmpc_831_21
Song, H. B., Pei, X. J., Liu, Z. X., Shen, C. Y., Sun, J., Liu, Y. Q., et al. (2023). Pharmacovigilance in China: evolution and future challenges. Br. J. Clin. Pharmacol. 89 (2), 510–522. doi:10.1111/bcp.15277
Sun, J., Deng, X. Y., Chen, X. P., Huang, J. J., Huang, S. Q., Li, Y. F., et al. (2020). Incidence of adverse drug reactions in COVID-19 patients in China: an active monitoring study by hospital pharmacovigilance system. Clin. Pharmacol. Ther. 108 (4), 791–797. doi:10.1002/cpt.1866
The Uppsala Monitoring Centre (2013). The use of the WHO-UMC system for standardised case causality assessment. Available online at: https://www.who.int/publications/m/item/WHO-causality-assessment (Accessed October 15, 2024).
Therapeutic Goods Administration of Department of Health of Australian Government (2000). Note for guidance on clinical safety data management definitions and standards for expedited reporting (CPMP/ICH/377/95) annotated with TGA comments. Available online at: https://www.tga.gov.au/sites/default/files/ich37795.pdf (Accessed December 22, 2024).
Toni, E., and Ayatollahi, H. (2025). Applying machine learning techniques to predict drug-related side effect: a policy brief. Inquiry 62, 469580251335805. doi:10.1177/00469580251335805
Toni, E., Ayatollahi, H., Abbaszadeh, R., and Fotuhi-Siahpirani, A. (2024a). Adverse drug reactions in children with congenital heart disease: a scoping review. Paediatr. Drugs 26 (5), 519–553. doi:10.1007/s40272-024-00644-8
Toni, E., Ayatollahi, H., Abbaszadeh, R., and Fotuhi-Siahpirani, A. (2024b). Risk factors associated with drug-related side effects in children: a scoping review. Glob. Pediatr. Health 11, 2333794X241273171. doi:10.1177/2333794X241273171
Ward, I. R., Wang, L., Lu, J., Bennamoun, M., Dwivedi, G., and Sanfilippo, F. M. (2021). Explainable artificial intelligence for pharmacovigilance: what features are important when predicting adverse outcomes? Comput. Methods Programs Biomed. 212, 106415. doi:10.1016/j.cmpb.2021.106415
Wei, Z. M., Li, M. Q., Zhang, C. H., Miao, J., Wang, W. M., and Fan, H. (2024). Machine learning-based predictive model for post-stroke dementia. BMC Med. Inf. Decis. Mak. 24 (1), 334. doi:10.1186/s12911-024-02752-4
Xu, R. X., Chen, Y. X., Yao, Z. H., Wu, W., Cui, J. X., Wang, R. Q., et al. (2024). Application of machine learning algorithms to identify people with low bone density. Front. Public Health 12, 1347219. doi:10.3389/fpubh.2024.1347219
Zhang, W., Ji, L. C., Wang, X. J., Zhu, S. B., Luo, J. C., Zhang, Y., et al. (2022). Nomogram predicts risk and prognostic factors for bone metastasis of pancreatic cancer: a population-based analysis. Front. Endocrinol. (Lausanne) 12, 752176. doi:10.3389/fendo.2021.752176
Zhang, S. T., Zhang, Y. L., Ouyang, X. Y., Li, H., and Dai, R. P. (2025a). Random forest algorithm for predicting postoperative hypotension in oral cancer resection and free flap reconstruction surgery. Sci. Rep. 15 (1), 5452. doi:10.1038/s41598-025-89621-w
Zhang, Z. M., Zhu, M. Y., and Jiang, W. W. (2025b). Risk factors analysis of cutaneous adverse drug reactions caused by targeted therapy and immunotherapy drugs for oncology and establishment of a prediction model. Clin. Transl. Sci. 18 (1), e70118. doi:10.1111/cts.70118
Keywords: severe adverse drug reactions, adverse drug reactions, machine learning, nomogram, predictive model
Citation: Bu W, Wu X, Wang C and Cai Y (2025) Development and validation of a predictive nomogram for severe adverse drug reactions: a dual-center pharmacovigilance study. Front. Pharmacol. 16:1669995. doi: 10.3389/fphar.2025.1669995
Received: 21 July 2025; Accepted: 03 October 2025;
Published: 07 November 2025.
Edited by:
Eleonore Fröhlich, Medical University of Graz, AustriaReviewed by:
Esmaeel Toni, Iran University of Medical Sciences, IranViola Dsouza, Maastricht University, Netherlands
Copyright © 2025 Bu, Wu, Wang and Cai. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Yan Cai, Y2FpeWFuMDI5QDE2My5jb20=