AUTHOR=Beverin Luka, Topalovic Marko, Halilovic Armin, Desbordes Paul, Janssens Wim, De Vos Maarten TITLE=Predicting total lung capacity from spirometry: a machine learning approach JOURNAL=Frontiers in Medicine VOLUME=Volume 10 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2023.1174631 DOI=10.3389/fmed.2023.1174631 ISSN=2296-858X ABSTRACT=Background and objective: Spirometry patterns can suggest that a patient has a restrictive ventilatory impairment; however, lung volume measurements such as total lung capacity (TLC) are required to confirm the diagnosis. The aim of the study was to train a supervised machine learning model that can accurately estimate TLC values from spirometry and subsequently identify which patients would benefit most from undergoing a complete pulmonary function test. Methods: We trained three tree-based machine learning models on 51,761 spirometry data points with corresponding TLC measurements. We then compared model performance using an independent test set consisting of 1,402 patients. The best-performing model was used to retrospectively identify restrictive ventilatory impairment in the same test set. The algorithm was compared against different spirometry patterns commonly used to predict restriction. Results: The prevalence of restrictive ventilatory impairment in the test set was 16.7% (234/1,402). CatBoost was the best-performing machine learning model. It predicted TLC with a mean squared error (MSE) of 560.1 ml. The sensitivity, specificity, and F1-score of the optimal algorithm for predicting restrictive ventilatory impairment were 83%, 92%, and 75%, respectively. Conclusion: A machine learning model trained on spirometry data can estimate TLC with a high degree of accuracy. This approach could be used to develop future smart home-based spirometry solutions, which could aid decision making and self-monitoring in patients with restrictive lung diseases.