AUTHOR=Huang Ru , Wang Xiuli TITLE=Impact of COVID-19 on mental health in China: analysis based on sentiment knowledge enhanced pre-training and XGBoost algorithm JOURNAL=Frontiers in Public Health VOLUME=Volume 11 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2023.1170838 DOI=10.3389/fpubh.2023.1170838 ISSN=2296-2565 ABSTRACT=COVID-19 has had a great impact on societies around the world, and these impacts are worth studying, but few papers consider them from the perspective of social text data. In this paper, we use social text data about COVID-19 from Sina Weibo (the largest “tweet” platform in China; we refer to Weibo posts as tweets below) to explore the impact of COVID-19 on mental health in China. First, we filter the tweet data by selecting examples that contain COVID-19 and COVID-19-related keywords. We then segment the filtered tweets, extract meaningful words, and construct a sparse word-vector matrix that represents each tweet. Next, to obtain the model’s labels, we use the sentiment knowledge enhanced pre-training model (SKEP), a deep learning framework published by Baidu, to measure each user's mental state; SKEP gives the probability that the user's mental state is positive. Finally, we use the XGBoost algorithm to study the relationship between the sparse word-vector matrix and users’ mental health state. Our research shows that social text data can indeed reflect the mental health of users to a large extent, and that social data can be used to explore the impact of COVID-19 on mental health, which can help inform public health policy.
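
The first two steps described in the abstract (keyword filtering, then building a sparse word-count matrix per tweet) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the keyword list, the stopword set, and the function names `filter_tweets`, `tokenize`, and `build_sparse_matrix` are hypothetical, and the example assumes tweets are already segmented into space-separated tokens, whereas a real pipeline would run a Chinese word segmenter first. The SKEP labelling and XGBoost fitting steps are left out because they depend on external libraries.

```python
from collections import Counter

# Hypothetical keyword list; the abstract does not enumerate the actual keywords.
KEYWORDS = {"新冠", "疫情", "肺炎"}


def filter_tweets(tweets, keywords=KEYWORDS):
    """Keep only tweets whose text contains at least one keyword."""
    return [t for t in tweets if any(k in t for k in keywords)]


def tokenize(tweet, stopwords=frozenset()):
    """Stand-in for a real segmenter: assumes tokens are space-separated."""
    return [w for w in tweet.split() if w not in stopwords]


def build_sparse_matrix(token_lists):
    """Build a bag-of-words matrix in a sparse, dict-of-rows form.

    Returns (vocab, rows): vocab maps word -> column index, and rows[i]
    maps column index -> count for tweet i. Zero entries are not stored.
    """
    vocab = {}
    rows = []
    for tokens in token_lists:
        row = {}
        for word, count in Counter(tokens).items():
            col = vocab.setdefault(word, len(vocab))  # assign next free column
            row[col] = count
        rows.append(row)
    return vocab, rows


# Illustrative, made-up input (pre-segmented tweets).
tweets = [
    "疫情 让 我 很 焦虑",
    "今天 天气 很 好",
    "新冠 疫苗 接种 顺利 很 开心",
]
stopwords = {"我", "很", "让"}

kept = filter_tweets(tweets)
vocab, rows = build_sparse_matrix([tokenize(t, stopwords) for t in kept])
```

In a fuller pipeline, each row of this matrix would be paired with the positive-sentiment probability produced by the sentiment model and passed to a gradient-boosted tree learner; the dict-of-rows layout here is only a teaching device, and a compressed sparse format would be the usual choice at scale.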