AUTHOR=Duan Juntao , Li Hanmo , Ma Xiaoran , Zhang Hanjie , Lasky Rachel , Monaghan Caitlin K. , Chaudhuri Sheetal , Usvyat Len A. , Gu Mengyang , Guo Wensheng , Kotanko Peter , Wang Yuedong TITLE=Predicting SARS-CoV-2 infection among hemodialysis patients using multimodal data JOURNAL=Frontiers in Nephrology VOLUME=Volume 3 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/nephrology/articles/10.3389/fneph.2023.1179342 DOI=10.3389/fneph.2023.1179342 ISSN=2813-0626 ABSTRACT=Background The COVID-19 pandemic has been more devastating for dialysis patients than for the general population. Patient-level prediction models for SARS-CoV-2 infection are crucial for the early identification of patients to prevent and mitigate outbreaks within dialysis clinics. As the COVID-19 pandemic evolves, it is unclear whether previously built prediction models are still sufficiently effective. Methods We developed a machine learning (XGBoost) model to predict, during the incubation period, a SARS-CoV-2 infection that is subsequently diagnosed three or more days later. We used data from multiple sources, including demographic, clinical, treatment, laboratory, and vaccination information from a national network of hemodialysis clinics, socioeconomic information from the Census Bureau, and county-level COVID-19 infection and mortality information from state and local health agencies. We created prediction models and evaluated their performance on a rolling basis to investigate the evolution of prediction power and risk factors. Result From April 2020 to August 2020, our machine learning model achieved an AUC of 0.75, an improvement of more than 0.07 over a previously developed machine learning model published in Kidney360 in 2021. As the pandemic evolved, the prediction performance deteriorated and fluctuated more, with the lowest AUC of 0.6 in December 2021 and January 2022. 
Over the whole study period from April 2020 to February 2022, with the false positive rate fixed at 20%, our model detected 40% of the positive patients. We found that features derived from local infection information reported by the CDC were the most important predictors and that vaccination status was a useful predictor as well. Whether a patient lived in a nursing home was an effective predictor before vaccination but became less predictive after vaccination. Conclusion Our study shows that the dynamics of the prediction model change frequently as the pandemic evolves. County-level infection information and vaccination are crucial for the success of early COVID-19 prediction models. Our results show that the proposed model can effectively identify SARS-CoV-2 infections during the incubation period. Prospective studies are warranted to explore the application of such prediction models in daily clinical practice.