
ORIGINAL RESEARCH article

Front. Neurol.

Sec. Neuro-Otology

This article is part of the Research Topic: Vestibular Function and Mental Health During the Lifespan

Development and comparison of machine learning models for predicting moderate-to-severe tinnitus in patients with hearing loss

Provisionally accepted
Chenguang Zhang1,2, Tao Ran1,2, Yicong Wang3, Di Xiao4, Yuwen Wang5, Ying Zhang1, Ying Zhang2, Bin Guo2*
  • 1Qinghai University, Xining, China
  • 2Qinghai University Affiliated Hospital, Xining, China
  • 3Sun Yat-Sen University, Guangzhou, China
  • 4Dalian Medical University, Dalian, China
  • 5Zhejiang University School of Medicine, Hangzhou, China

The final, formatted version of the article will be published soon.

Objective: To analyze the psychological and clinical factors associated with clinically significant (moderate-to-severe) tinnitus, defined as a Tinnitus Handicap Inventory (THI) score ≥ 38, in patients with hearing loss; to construct predictive models based on four machine learning (ML) algorithms; and to compare the predictive performance of these models.

Methods: Patients with hearing loss who visited the Department of Otolaryngology at Qinghai University between August 2024 and May 2025 were enrolled in this study. Clinical data were retrieved from the hospital's electronic medical record system. The study outcome was the occurrence of clinically significant tinnitus. Predictive variables were screened using univariate analysis, least absolute shrinkage and selection operator (LASSO) regression, and the Boruta algorithm. Four ML algorithms, namely logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), and support vector machine (SVM), were applied to construct and validate predictive models. The areas under the receiver operating characteristic curve (AUC) of the models in the validation set were compared using the DeLong test. Additional performance metrics in the validation set were also compared to identify the optimal model. Finally, the Shapley additive explanations (SHAP) algorithm was employed to interpret the best-performing model.

Results: Nine key variables (age, hypertension, sleep disorder, anxiety, hearing loss severity, depression, noise exposure history, hearing side, and ototoxic drug use) were retained after LASSO and Boruta feature selection. Among the four ML models, the RF algorithm achieved the best predictive performance, with an AUC of 0.973 in the training set and 0.977 in the validation set, followed by XGBoost (AUC = 0.962 and 0.961, respectively). DeLong tests indicated that RF significantly outperformed the LR and SVM models (p < 0.001), while its difference from XGBoost was not statistically significant. In the validation set, the RF model yielded the highest accuracy (0.923), sensitivity (0.929), specificity (0.914), precision (0.945), and F1-score (0.937). SHAP analysis indicated that hearing loss severity, age, and sleep disorder were the most influential predictors, suggesting that both auditory and non-auditory factors contribute substantially to the risk of clinically significant tinnitus.

Conclusion: The RF model showed the best performance in predicting clinically significant tinnitus, with hearing loss severity, age, and sleep disorder identified as major predictors.
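The performance metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, confusion-matrix counts, and scores below are hypothetical and chosen only for illustration. The one figure taken from the abstract is the final check, which shows that the reported F1-score (0.937) agrees with the reported precision (0.945) and sensitivity (0.929). The AUC is computed here with the rank-based (Mann-Whitney) definition, which is equivalent to the area under the ROC curve.

```python
from typing import Dict, Sequence


def f1_score(precision: float, sensitivity: float) -> float:
    """F1 is the harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Threshold-based metrics from the four cells of a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)   # share of true tinnitus cases detected
    specificity = tn / (tn + fp)   # share of non-cases correctly ruled out
    precision = tp / (tp + fp)     # share of predicted cases that are real
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1_score(precision, sensitivity),
    }


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, NOT data from the study.
demo = classification_metrics(tp=9, fp=1, tn=8, fn=2)
demo_auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.7, 0.3, 0.6, 0.6])

# Consistency check using the precision and sensitivity reported in the abstract.
reported_f1 = round(f1_score(0.945, 0.929), 3)
print(demo, demo_auc, reported_f1)
```

The pairwise AUC above is quadratic in sample size, which is acceptable for a sketch; a sorted-rank implementation would be preferable for large validation sets.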

Keywords: hearing loss, machine learning, random forest, sleep disorder, tinnitus

Received: 22 Nov 2025; Accepted: 16 Dec 2025.

Copyright: © 2025 Zhang, Ran, Wang, Xiao, Wang, Zhang, Zhang and Guo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Bin Guo

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.