AUTHOR=Chen Xiaoping, Zheng Lihui, Ye Shupei, Xu Mengxin, Li YanLing, Lv KeXin, Zhu Haipeng, Jie Yusheng, Chen Yao-Qing
TITLE=Research on Influencing Factors and Classification of Patients With Mild and Severe COVID-19 Symptoms
JOURNAL=Frontiers in Cellular and Infection Microbiology
VOLUME=Volume 11 - 2021
YEAR=2021
URL=https://www.frontiersin.org/journals/cellular-and-infection-microbiology/articles/10.3389/fcimb.2021.670823
DOI=10.3389/fcimb.2021.670823
ISSN=2235-2988
ABSTRACT=Objective: To analyze the epidemiological history, clinical symptoms, and laboratory parameters of patients with mild and severe COVID-19, and to provide a reference for timely assessment of changes in patients’ conditions and for the formulation of epidemic prevention and control strategies. Methods: In this retrospective study, a total of 90 patients with COVID-19 from January to March 2020 were selected as study subjects. We analyzed the clinical characteristics of these laboratory-confirmed patients, used the synthetic minority oversampling technique (SMOTE) to address class imbalance, and established LASSO-logistic regression and random forest models. Results: Among the 90 confirmed COVID-19 cases, 79 were mild and 11 were severe. The average time from illness onset to hospital admission was 4.1 days, and the average length of hospital stay was 18.7 days; both were longer for severe patients than for mild patients. Forty-eight (53.3%) of the 90 patients had family cluster infections, a proportion that was similar among mild and severe patients. Underlying comorbidities, including hypertension, diabetes, and other diseases, were more common in severe patients. Severe patients had a low level of creatine kinase (median 40.9) and a high level of D-dimer.
Logistic regression showed that age, creatine kinase, procalcitonin, time from onset to admission, lymphocyte count on admission, expectoration, fatigue, poor appetite, and dry throat were independent predictors of COVID-19 severity. The random forest model predicted the classification and reported the importance of each variable. Conclusion: The clinical symptoms of COVID-19 patients are non-specific and complicated. Age and the time from onset to admission are important factors that determine the severity of the patient's condition. Patients with mild illness should be closely monitored to identify those who may develop severe illness. Variables selected by logistic regression, such as age and creatine kinase, can serve as important indicators for assessing disease severity in COVID-19 patients. The variable importance from the random forest further complements this feature information.
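The Methods name SMOTE as the remedy for the 79 mild versus 11 severe imbalance. The following is a minimal standard-library sketch of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation. The function name `smote`, the choice `k=5`, the seeds, and the two-feature Gaussian toy data are all assumptions made for illustration; no patient data are used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class
    samples and their k nearest minority-class neighbours (illustrative)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For every minority sample, store the indices of its k nearest
    # neighbours by Euclidean distance, excluding the sample itself.
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # random minority sample
        j = rng.choice(neighbours[i])      # one of its neighbours
        gap = rng.random()                 # position along the segment i -> j
        x, nb = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Hypothetical toy data that only mirrors the 79 / 11 class sizes;
# the feature values are random numbers, not clinical measurements.
data_rng = random.Random(42)
mild = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(79)]
severe = [[data_rng.gauss(2.0, 1.0), data_rng.gauss(2.0, 1.0)] for _ in range(11)]

# Oversample the severe class until both classes have the same size.
new_severe = smote(severe, n_synthetic=len(mild) - len(severe))
balanced_severe = severe + new_severe
print(len(mild), len(balanced_severe))  # prints: 79 79
```

In a workflow like the one described, the balanced data would then be passed to the LASSO-logistic regression and random forest models. Oversampling is usually applied to the training split only, so that synthetic points do not leak into evaluation data; whether the study did so is not stated in the abstract.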