AUTHOR=Lu Qiyuan, Chen Mengmeng, Ling Houfu TITLE=Prediction of 1-year post-operative mortality in elderly patients with fragility hip fractures in China: evaluation of risk prediction models JOURNAL=Frontiers in Surgery VOLUME=Volume 12 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/surgery/articles/10.3389/fsurg.2025.1415680 DOI=10.3389/fsurg.2025.1415680 ISSN=2296-875X ABSTRACT=This study systematically evaluates the predictive capacity of seven risk stratification models for 12-month postoperative mortality in geriatric patients with fragility hip fractures, while concurrently assessing their risk classification accuracy to inform perioperative protocol formulation, rehabilitation strategies, and prognostic management. Introduction: Current clinical practice lacks standardized criteria for mortality risk prediction in elderly fragility hip fracture patients. This investigation conducts a comparative evaluation of seven prognostic models for mortality risk prediction in these patients: the Sernbo Score, the Jiang et al. model, the Nottingham Hip Fracture Score (NHFS), the Holt et al. algorithm, HEMA, the ASAgeCoGeCC Score, and SHiPS. Methods: In this retrospective cohort analysis, all consecutive elderly patients with fragility hip fracture between January 2018 and October 2022 were enrolled. Model-derived mortality predictions and risk categorizations were computed. Predictive performance was quantified through predictive validity, area under the receiver operating characteristic (ROC) curve (AUC) analysis, the DeLong test, Hosmer-Lemeshow goodness-of-fit testing, and calibration slope (95% CI), followed by precision assessment of risk stratification tiers. Results: The cohort demonstrated a 12-month mortality rate of 29.0%. Kaplan–Meier survival curves identified the first postoperative year as the highest mortality risk period.
The ASAgeCoGeCC Score was the only model in this study that simultaneously demonstrated balanced sensitivity (0.73) and specificity (0.82), excellent discrimination (AUC = 0.84), and good calibration (Hosmer-Lemeshow p = 0.36, calibration slope = 0.75). The DeLong test indicated that its performance was significantly superior to that of the other models (p < 0.01). The NHFS and the Holt et al. model performed next best. All models except the Sernbo Score achieved AUC values exceeding 0.70. Significant calibration deficiencies were observed in NHFS, HEMA, and SHiPS (Hosmer-Lemeshow p < 0.05). Risk stratification analysis revealed SHiPS as the most precise classification system. Conclusion: The ASAgeCoGeCC Score, NHFS, and the Holt et al. model showed acceptable predictive performance. The first two are suited to rapid clinical decision-making, and NHFS has been extensively externally validated. The Holt et al. model is more suitable for well-resourced medical systems. SHiPS displayed optimal risk categorization accuracy, suggesting potential for broader clinical implementation. These findings require verification through prospective multi-center studies.