AUTHOR=BuHamra Sana S., Almutairi Abdullah N., Buhamrah Abdullah K., Almadani Sabah H., Alibrahim Yusuf A. TITLE=An NLP tool for data extraction from electronic health records: COVID-19 mortalities and comorbidities JOURNAL=Frontiers in Public Health VOLUME=Volume 10 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2022.1070870 DOI=10.3389/fpubh.2022.1070870 ISSN=2296-2565 ABSTRACT=Natural Language Processing (NLP) is increasingly used to automate the extraction of information from unstructured text data, such as physician notes, and healthcare providers can use it in numerous ways to enhance their services. Using NLP, we investigated COVID-19 mortality data from the intensive care unit (ICU) in Kuwait over the first 18 months of the pandemic. A key goal was to extract and classify the primary and intermediate causes of death from electronic health records (EHRs) in a timely way. Comorbid conditions, that is, concurrent diseases or medical disorders, were also retrieved and studied in relation to the various causes of mortality. Medical systems around the world have had difficulty dealing with the COVID-19 pandemic because of the mutating nature of the virus. The primary cause of death for 54.8% of the 1,691 ICU patients we investigated was septic shock or sepsis-related multiorgan failure. About three-quarters of the patients died from acute respiratory distress syndrome (ARDS), a common intermediate cause of death. Machine learning decision trees identified an arrhythmia (AF) disorder as the strongest predictor of whether the intermediate cause of death was ARDS or another cause (ARDS/Other).