AUTHOR=Cheng Xiuwei , Wan Hongli , Yuan Heng , Zhou Lijun , Xiao Chongkun , Mao Suling , Li Zhirui , Hu Fengmiao , Yang Chuan , Zhu Wenhui , Zhou Jiushun , Zhang Tao TITLE=Symptom Clustering Patterns and Population Characteristics of COVID-19 Based on Text Clustering Method JOURNAL=Frontiers in Public Health VOLUME=Volume 10 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2022.795734 DOI=10.3389/fpubh.2022.795734 ISSN=2296-2565 ABSTRACT=Background: Descriptions of single clinical symptoms of COVID-19 have been widely reported. However, evidence on associations between symptoms remains limited. We sought to explore the potential symptom clustering patterns and high-frequency symptom combinations of COVID-19 to improve understanding of this disease. Methods: In this retrospective cohort study, a total of 1,067 COVID-19 cases were enrolled. Symptom clustering patterns were first explored using a text clustering method. A multinomial logistic regression was then applied to reveal the population characteristics of the different symptom groups. In addition, time intervals between symptom onset and the first visit were analyzed to account for the effect of a longer interval on the progression of symptoms. Results: Based on text clustering, the symptoms were summarized into four groups. Group 1: no obvious symptoms; Group 2: mainly fever and/or dry cough; Group 3: mainly upper respiratory tract infection symptoms; Group 4: mainly cardiopulmonary, systemic and/or gastrointestinal symptoms. Apart from Group 1 with no obvious symptoms, the most frequent symptom combinations were fever only (64 cases, 47.8%), followed by dry cough only (42 cases, 31.3%) in Group 2; expectoration only (21 cases, 19.8%), followed by expectoration accompanied by fever (10 cases, 9.4%) in Group 3; and fatigue accompanied by fever (12 cases, 4.2%), followed by headache accompanied by fever (11 cases, 3.8%) in Group 4. 
People aged 45-64 years were more likely to have Group 4 symptoms than those aged 65 years or older (OR = 2.66, 95% CI: 1.21-5.85) and also had longer time intervals between symptom onset and the first visit. Conclusions: Symptoms of COVID-19 could be divided into four clustering groups with different symptom combinations. The Group 4 symptoms (i.e., mainly cardiopulmonary, systemic and/or gastrointestinal symptoms) occurred more frequently in COVID-19 than in influenza. This distinction could help deepen the understanding of this disease. Middle-aged people had a longer time interval before their first medical visit and, from the perspective of medical delays, are a group that deserves more attention.
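
The odds ratio reported for the 45-64 age group can be related back to a logistic regression coefficient. The sketch below is illustrative only: it assumes the interval is a Wald-type interval, symmetric on the log-odds scale, which the abstract does not state. The function names are hypothetical, and the only inputs are the published figures (OR = 2.66, 95% CI: 1.21-5.85).

```python
import math

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = 1.959964


def log_odds_from_ci(or_low, or_high, z=Z_95):
    """Back out the log-odds coefficient and its standard error from an
    odds-ratio confidence interval.

    Assumption (not stated in the abstract): the interval is a Wald
    interval, exp(beta - z * se) to exp(beta + z * se).
    """
    log_low, log_high = math.log(or_low), math.log(or_high)
    beta = (log_low + log_high) / 2.0      # midpoint on the log scale
    se = (log_high - log_low) / (2.0 * z)  # half-width divided by z
    return beta, se


def odds_ratio_with_ci(beta, se, z=Z_95):
    """Convert a log-odds coefficient and standard error into an odds
    ratio with its Wald confidence interval."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Published interval for ages 45-64 vs. 65 or older, Group 4 symptoms.
beta, se = log_odds_from_ci(1.21, 5.85)
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta, se)

print(f"beta = {beta:.3f}, SE = {se:.3f}")
print(f"OR = {odds_ratio:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

Under this assumption, the midpoint of the interval on the log scale reproduces the reported point estimate to two decimal places, which is consistent with a Wald interval but does not confirm how the authors computed it.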