AUTHOR=Loeloe Mohammad Sadegh, Sefidkar Reyhane, Tabatabaei Seyyed Mohammad, Mehrparvar Amir Houshang, Jambarsang Sara TITLE=Machine learning-based spirometry reference values for the Iranian population: a cross-sectional study from the Shahedieh PERSIAN cohort JOURNAL=Frontiers in Medicine VOLUME=Volume 12 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2025.1480931 DOI=10.3389/fmed.2025.1480931 ISSN=2296-858X ABSTRACT=Objective: This study aimed to determine spirometric norm values for the healthy Iranian adult population and compare them with established norm equations, specifically the GLI-Caucasian and Iranian equations. Methods: During the recruitment phase of the Shahedieh Prospective Epidemiological Research Studies in Iran (PERSIAN) in 2016, spirometric parameters of 998 participants were obtained. K-nearest neighbors (KNN) regression was used to extract reference values for the spirometric parameters FEV1, FVC, FEV1/FVC, and FEF25–75%, with height and age as features. The performance of KNN regression was compared with conventional models used in previous studies, namely the multiple linear regression (MLR) model and the Lambda-Mu-Sigma (LMS) model. The predicted values were also compared with those obtained from the GLI-Caucasian and Iranian equations. The validation criterion was the mean squared error (MSE) based on 5-fold cross-validation. Results: This study included 473 female participants and 525 male participants. KNN regression provided more accurate predictions than MLR and LMS for the four spirometric parameters. The MSE for predicting FVC in female participants was 0.159, 0.169, and 0.165 with KNN regression, MLR, and LMS, respectively. For the four indicators, the predictions of the present study were closer to the observed values in the reference population than the predictions from the two sets of reference equations.
The MSE of predicted FVC for female participants was 0.159 in the present study, lower than that of the Iranian (MSE = 0.344) and GLI-Caucasian (MSE = 0.397) equations. Conclusion: Using a flexible machine learning approach, this study established spirometry reference values specifically for the Iranian population. Because spirometry reference values vary among populations, the Excel calculator developed in this research can be a valuable tool in healthcare centers for assessing lung function in Iranian adults.
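
Illustrative sketch (not the authors' code): the abstract describes KNN regression on height and age, scored by MSE under 5-fold cross-validation. The Python below shows one minimal way such a procedure could look, using only the standard library. Everything specific in it is an assumption made for illustration: the function names (`knn_predict`, `standardize`, `cv_mse`, `make_synthetic`), the choice of Euclidean distance on standardized features, the values of `k`, and above all the synthetic data, whose ranges and coefficients are invented and are not taken from the Shahedieh cohort.

```python
import math
import random


def knn_predict(train_X, train_y, x, k):
    """Predict a target for x as the mean target of its k nearest training rows."""
    dists = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


def standardize(train_X, test_X):
    """Scale both sets with the mean and SD of the training fold only."""
    n_feat = len(train_X[0])
    n = len(train_X)
    means = [sum(r[j] for r in train_X) / n for j in range(n_feat)]
    # A zero SD (constant feature) falls back to 1.0 to avoid division by zero.
    sds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in train_X) / n) or 1.0
        for j in range(n_feat)
    ]

    def scale(rows):
        return [[(r[j] - means[j]) / sds[j] for j in range(n_feat)] for r in rows]

    return scale(train_X), scale(test_X)


def cv_mse(X, y, k, n_folds=5, seed=0):
    """Mean of per-fold mean squared errors for KNN regression."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    fold_mses = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        tr_X, te_X = standardize([X[i] for i in train_idx], [X[i] for i in fold])
        tr_y = [y[i] for i in train_idx]
        errs = [(knn_predict(tr_X, tr_y, x, k) - y[i]) ** 2 for x, i in zip(te_X, fold)]
        fold_mses.append(sum(errs) / len(errs))
    return sum(fold_mses) / n_folds


def make_synthetic(n=200, seed=1):
    """Generate INVENTED (height cm, age yr) -> FVC-like values; not real data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        height = rng.uniform(150, 190)  # arbitrary range, assumption
        age = rng.uniform(30, 70)       # arbitrary range, assumption
        # Made-up linear trend plus noise, purely for demonstration.
        y.append(0.05 * height - 0.025 * age - 3.5 + rng.gauss(0, 0.4))
        X.append([height, age])
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    for k in (5, 15, 30):  # candidate neighborhood sizes, chosen arbitrarily
        print(f"k={k}: 5-fold CV MSE = {cv_mse(X, y, k):.3f}")
```

In a sketch like this, features are standardized inside each fold so that height (in cm) and age (in years) contribute on comparable scales and no held-out information leaks into training; whether the study applied any scaling, and how it selected k, is not stated in the abstract.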