AUTHOR=Yasin Parhat, Mardan Muradil, Xu Tao, Cai Xiaoyu, Abulizi Yakefu, Wang Ting, Sheng Weibin, Mamat Mardan TITLE=Development and validation of a diagnostic model for differentiating tuberculous spondylitis from brucellar spondylitis using machine learning: A retrospective cohort study JOURNAL=Frontiers in Surgery VOLUME=Volume 9 - 2022 YEAR=2023 URL=https://www.frontiersin.org/journals/surgery/articles/10.3389/fsurg.2022.955761 DOI=10.3389/fsurg.2022.955761 ISSN=2296-875X ABSTRACT=Background: Tuberculous spondylitis (TS) and Brucella spondylitis (BS) are common spinal infectious diseases that are initially caused by bacteremia. BS is easily misdiagnosed as TS, especially in underdeveloped regions of northwestern China with less sensitive medical equipment. A rapid and reliable diagnostic tool has yet to be developed, and a clinical diagnostic model that differentiates TS from BS using machine learning algorithms would be of great significance. Methods: A total of 410 patients were included in this study. Independent predictors of TS were selected using the least absolute shrinkage and selection operator (LASSO) regression model, permutation feature importance, and multivariate logistic regression analysis. The TS risk prediction model was developed with six different machine learning algorithms. We used several metrics to evaluate the accuracy, calibration, and predictability of these models. The performance of the model with the best predictability was further verified with the area under the curve (AUC) of the receiver operating characteristic (ROC) curve and the calibration curve. The clinical performance of the final model was evaluated by decision curve analysis. Results: Six variables were incorporated in the final model, namely pain severity, CRP, X-ray intervertebral disc height loss, X-ray endplate sclerosis, CT vertebral destruction, and MRI paravertebral abscess.
Comparison of the six models showed that the logistic regression model developed in the current study outperformed the other methods in sensitivity (0.88 ± 0.07) and accuracy (0.79 ± 0.07). The AUC of the logistic regression model for predicting TS was 0.86 (95% CI, 0.81–0.90) in the training set and 0.86 (95% CI, 0.78–0.92) in the validation set. Decision curve analysis showed that the logistic regression model had higher clinical efficiency in the differential diagnosis. Conclusions: The logistic regression model developed in this study outperformed the other methods. Presented as a calculator, the model shows good discrimination and calibration, and could be applied to differentiating TS from BS in primary health care diagnosis.
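
The abstract reports discrimination as an AUC and clinical usefulness through decision curve analysis. As a rough illustration of what those two quantities measure, the Python sketch below computes an AUC by pairwise comparison and the net benefit at a chosen risk threshold. The labels, predicted probabilities, and function names are invented for illustration; they are not the study's data or code, and the authors' actual implementation is not described in this record.

```python
# Illustrative sketch only: toy labels and probabilities, not data from the study.


def auc(y_true, y_prob):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case receives the
    higher predicted probability; ties count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit used in decision curve analysis:
    TP/n - FP/n * threshold / (1 - threshold),
    where a case is classified positive when its predicted
    probability is at least the threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical example: 1 = TS, 0 = BS.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.35, 0.2, 0.7, 0.6, 0.1]

print("AUC:", auc(y_true, y_prob))
for t in (0.2, 0.5):
    print("net benefit at threshold", t, ":", net_benefit(y_true, y_prob, t))
```

A decision curve plots this net benefit across a range of thresholds and compares it with the "treat all" and "treat none" strategies; the sketch evaluates only two thresholds to keep the arithmetic easy to follow.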