AUTHOR=Zhou Chengyu, Qian Yali, Xue Yao, Rong Liucheng, Wan Yu, Leng Kaiqiang, Miao Hongjun, Chen Feng, Fang Yongjun, Ge Xuhua TITLE=Risk factor identification for delayed excretion in pediatric high-dose methotrexate therapy: a machine learning analysis of real-world data JOURNAL=Frontiers in Pharmacology VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2025.1662718 DOI=10.3389/fphar.2025.1662718 ISSN=1663-9812 ABSTRACT=Purpose: This study aimed to identify risk factors associated with delayed methotrexate (MTX) excretion in pediatric patients receiving high-dose MTX (HDMTX) therapy based on real-world data, and to develop and evaluate a predictive model. Methods: Clinical data were retrospectively collected from 1,485 pediatric HDMTX chemotherapy cycles at the Children’s Hospital affiliated with Nanjing Medical University between 2021 and 2023. Key predictive variables were identified by Least Absolute Shrinkage and Selection Operator (LASSO) regression, Random Forest (RF), and Support Vector Machine Recursive Feature Elimination (SVM-RFE), and then incorporated into predictive models for delayed MTX excretion using Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost). Bootstrap resampling was used to internally validate these models and identify the best-performing one, and SHapley Additive exPlanations (SHAP) values were then used to provide both global and local interpretations. Results: Among the 1,485 pediatric HDMTX chemotherapy cycles, 26.1% were associated with delayed MTX excretion. Serum creatinine (Scr), total drug dose (Dose), alkaline phosphatase (ALP), creatine kinase (CK), blood urea nitrogen (Urea), gamma-glutamyl transferase (GGT), hemoglobin (HB), and height were identified as key predictors of delayed excretion. 
Internal validation showed that the XGBoost model performed best, with an accuracy of 0.780, an F1 score of 0.669, an area under the Receiver Operating Characteristic curve (AUROC) of 0.842, and a Brier score of 0.136. Decision Curve Analysis (DCA) also demonstrated favorable clinical utility. SHAP analysis revealed that Scr was the most important risk factor for delayed MTX excretion in the XGBoost model. This XGBoost model has been translated into a convenient tool to facilitate its use in clinical settings. Conclusion: The XGBoost model demonstrated good predictive performance and clinical utility for delayed MTX excretion in pediatric patients.
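
For readers unfamiliar with the four performance measures reported above (accuracy, F1 score, AUROC, and Brier score), the following is a minimal sketch of how each can be computed from true labels and predicted probabilities. It is not the authors' code: the function names, the toy data, and the 0.5 classification threshold are all assumptions made for illustration, and the abstract does not state which threshold the study used.

```python
from typing import Sequence


def _classify(y_prob: Sequence[float], threshold: float) -> list:
    """Turn predicted probabilities into 0/1 labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in y_prob]


def accuracy(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cycles whose predicted label matches the observed label."""
    preds = _classify(y_prob, threshold)
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)


def f1_score(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Harmonic mean of precision and recall for the positive (delayed) class."""
    preds = _classify(y_prob, threshold)
    tp = sum(1 for p, t in zip(preds, y_true) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(preds, y_true) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(preds, y_true) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(y_prob, y_true)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive case is scored above a random negative case."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0 for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Toy data only (NOT from the study): 1 = delayed excretion, 0 = normal excretion.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3, 0.1, 0.5]

if __name__ == "__main__":
    print("accuracy:", accuracy(toy_labels, toy_probs))
    print("F1 score:", f1_score(toy_labels, toy_probs))
    print("AUROC:", auroc(toy_labels, toy_probs))
    print("Brier score:", brier_score(toy_labels, toy_probs))
```

Accuracy and F1 depend on the chosen threshold, whereas AUROC and the Brier score are computed directly from the predicted probabilities, which is one reason discrimination and calibration measures are usually reported alongside threshold-based ones.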