AUTHOR=Choo Bryan Peide, Mok Yingjuan, Oh Hong Choon, Patanaik Amiya, Kishan Kishan, Awasthi Animesh, Biju Siddharth, Bhattacharjee Soumya, Poh Yvonne, Wong Hang Siang TITLE=Benchmarking performance of an automatic polysomnography scoring system in a population with suspected sleep disorders JOURNAL=Frontiers in Neurology VOLUME=Volume 14 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2023.1123935 DOI=10.3389/fneur.2023.1123935 ISSN=1664-2295 ABSTRACT=Aims: The current gold standard for measuring sleep disorders is polysomnography (PSG), which is manually scored by a sleep technologist. Scoring a PSG is time-consuming and tedious, with substantial inter-rater variability. A deep learning-based sleep analysis software module can perform autoscoring of PSG. The primary objective of the study was to validate the accuracy and reliability of the autoscoring software. The secondary objective was to measure workflow improvements in terms of time and cost via a time-motion study. Methodology: The performance of a cloud-based automatic PSG scoring software was benchmarked against that of two independent sleep technologists on PSG data collected from patients with suspected sleep disorders. The technologists at the hospital clinic and a third-party scoring company scored the PSG records independently. The scores were then compared between the technologists and the automatic scoring system. An observational study was also performed in which the time taken for sleep technologists at the hospital clinic to manually score PSGs was tracked, along with the time taken by the autoscoring software, to assess potential time savings. Results: The Pearson's correlation between the manually scored apnea-hypopnea index (AHI) and the automatically scored AHI was 0.962, demonstrating near-perfect agreement. The autoscoring system demonstrated similar results in sleep staging.
The agreement between automatic staging and manual scoring was higher, in terms of both accuracy and Cohen's kappa, than the agreement between the experts. The autoscoring system took an average of 42.7 seconds to score each record, compared with 4,243 seconds for manual scoring. Following a manual review of the automatic scores, an average time saving of 38.6 minutes per PSG was observed, amounting to a saving of 0.25 full-time equivalent (FTE) per year. Conclusion: The findings indicate potential for a reduction in the burden of manual scoring of PSGs by sleep technologists and may be of operational significance for sleep laboratories in the healthcare setting.