Mapping the sociodemographic distribution and self-reported justifications for non-compliance with COVID-19 guidelines in the United Kingdom

Which population factors have predisposed people to disregard government safety guidelines during the COVID-19 pandemic, and what justifications do they give for this non-compliance? To address these questions, we analyse fixed-choice and free-text responses to survey questions about compliance and government handling of the pandemic, collected from tens of thousands of members of the UK public at three 6-monthly timepoints. We report that sceptical opinions about the government and mainstream-media narrative, especially as pertaining to justification for guidelines, significantly predict non-compliance. However, free-text topic modelling shows that such opinions are diverse, spanning from scepticism about government competence and self-interest to full-blown conspiracy theories, and covary in prevalence with sociodemographic variables. These results indicate that attempts to counter non-compliance through argument should account for this diversity in people’s underlying opinions, and inform conversations aimed at bridging the gap between the general public and bodies of authority accordingly.


This file includes:
Supplementary Text - Full questionnaire items, Further topic modelling explanations
Figs. S1 to S9
Tables S1, S6, S7
Other Supplementary Materials for this manuscript include the following separate files:

Part 2 - Mental health and wellbeing
These items were scored on a continuous scale from 0 to 6 and 0 to 7 respectively. The scores were grouped in three categories for analysis purposes: high wellbeing (below 20), medium wellbeing (between 20 and 40) and low wellbeing (above 40).
The GAD-7 Questionnaire items of choice

Part 5 - Questions about opinion and compliance during the COVID-19 pandemic

Compliance questions
Compliance questions were scored on a scale from 0 to 2. The item scores were summed, and the total represented the compliance score. The vaccine question was initially asked when vaccines were just starting to be rolled out.
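A minimal sketch of the scoring rule described above, using the option scores of the first two compliance items as listed in this part. The names `ITEM_SCORES` and `compliance_score`, and the item keys `mask` and `avoid_leaving_house`, are hypothetical and introduced only for illustration; the remaining items would be added in the same way.

```python
# Hypothetical answer-to-score mapping transcribed from the first two
# compliance items of the questionnaire. Higher scores mark less compliant answers.
ITEM_SCORES = {
    "mask": {
        "Yes": 0,
        "Yes, but only because I have to": 0,
        "No, I am exempt": 0,  # scored 0 because this is still complying
        "No, I don't think it makes a difference": 2,
    },
    "avoid_leaving_house": {
        "Yes, I am scared for my health": 0,
        "Yes, I am shielding": 0,
        "Yes, I don't trust others to be sensible": 0,
        "Yes, for other reasons": 0,
        "No, not anymore": 1,
        "No": 2,
    },
}


def compliance_score(answers):
    """Sum the per-item scores for one respondent.

    `answers` maps an item key to the chosen answer text. An unknown item
    or answer raises KeyError rather than being silently scored.
    """
    total = 0
    for item, answer in answers.items():
        total += ITEM_SCORES[item][answer]
    return total


respondent = {"mask": "Yes, but only because I have to", "avoid_leaving_house": "No"}
print(compliance_score(respondent))
```

Raising an error on unrecognised answers is a design choice of this sketch; it mirrors the fact that empty values were dropped before analysis rather than imputed.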
1. Do you wear a mask in enclosed public spaces?
   a. Yes = 0
   b. Yes, but only because I have to = 0
   c. No, I am exempt = 0 (because this is still complying)
   d. No, I don't think it makes a difference = 2
2. Do you avoid leaving the house due to Covid?
   a. Yes, I am scared for my health = 0
   b. Yes, I am shielding = 0
   c. Yes, I don't trust others to be sensible = 0
   d. Yes, for other reasons = 0
   e. No, not anymore = 1
   f. No = 2
3. Have you followed the government guidelines about social distancing?
   a. Yes = 0
   b. Yes, most of the time = 1
   c. I used to but I have given up = 1
   d.

We estimated the average coherence per number of topics, ranging from 1 to 30, for each question (Figure S3). Coherence measures assess whether topics are more or less internally consistent; therefore, the number of topics with the highest coherence (ranging from 4 to 7) was chosen for each question as the optimal number of topics.
Then, LDA was applied to each question using the optimal number of topics, shown as the peaks in Figure S3. The probabilistic graphical model of LDA is represented in Figure S4 (adapted from Syed and Spruit 2017). The topic mixture of each opinion was obtained, and opinions and words were ranked within each topic on this basis. Finally, the top 20 opinions associated with each topic were used to interpret and label that topic.
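The selection and ranking steps above can be sketched in plain Python, assuming the coherence values and the per-opinion topic mixtures have already been produced by a topic-modelling library. The function names and the toy numbers are hypothetical and purely illustrative; they are not the study's data or code.

```python
def best_num_topics(coherence_by_k):
    """Return the number of topics whose average coherence is highest.

    `coherence_by_k` maps a candidate number of topics to a list of
    coherence values (for example, from repeated model fits).
    """
    means = {k: sum(vals) / len(vals) for k, vals in coherence_by_k.items()}
    return max(means, key=means.get)


def top_documents(doc_topic, topic, n=20):
    """Return indices of the n opinions with the largest weight on `topic`.

    `doc_topic[i][t]` is the weight of topic t in the mixture of opinion i.
    """
    order = sorted(range(len(doc_topic)),
                   key=lambda i: doc_topic[i][topic],
                   reverse=True)
    return order[:n]


# Toy inputs (made up for illustration only).
coherence_by_k = {3: [0.40, 0.42], 5: [0.51, 0.49], 8: [0.45, 0.44]}
doc_topic = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]

print(best_num_topics(coherence_by_k))
print(top_documents(doc_topic, topic=0, n=2))
```

In the analysis itself, `n` would be 20 and the ranked opinions would then be read manually to label each topic.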
Predicting individual answers to opinion and compliance questions from sociodemographic and lifestyle factors.
We performed binomial and multinomial logistic regression on the data using the Python statsmodels library (Seabold et al., 2010) in order to identify questionnaire variables that predict a greater likelihood of answering a question in a distrustful/noncompliant way. The questionnaire variables investigated include sociodemographic features (age, gender, occupation, education, ethnicity and country of residence), levels of substance use, neuropsychiatric status, wellbeing, number of hours spent online per day and primary source of news about the pandemic. All empty values were dropped, and dummy variables were computed for all categorical data, including the predictors and the dependent variables. The predictor categories with the highest number of participants acted as reference categories and were removed from the models. We also checked for highly correlated variables. In addition, some questions, such as those asking about mask wearing and about social distancing, had very low numbers of responses in certain categories, so those categories were collapsed together and the questions entered binomial rather than multinomial regression models. For each model, odds ratios were calculated to reflect the odds relative to the removed reference dummy (the category with the most participants). Effect size (Cohen's d) was calculated by dividing the beta coefficients calculated in the linear regression by the product of the square root of N and the corresponding standard errors. Figure S2 illustrates the results.
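The pre- and post-processing around the regression can be sketched as follows. This is a standard-library illustration only, not the study's statsmodels code: it does not fit a model, the function names are hypothetical, and the effect-size formula reflects one reading of the description above, namely d = beta / (sqrt(N) × SE).

```python
import math
from collections import Counter


def reference_category(values):
    """Most frequent category; it is omitted from the model as the reference."""
    return Counter(values).most_common(1)[0][0]


def dummy_code(values):
    """Build one 0/1 indicator column per non-reference category."""
    ref = reference_category(values)
    levels = sorted(set(values) - {ref})
    columns = {lvl: [int(v == lvl) for v in values] for lvl in levels}
    return ref, columns


def odds_ratio(beta):
    """Odds relative to the reference category for a logistic coefficient."""
    return math.exp(beta)


def effect_size(beta, se, n):
    """Assumed reading of the text: beta divided by sqrt(N) times the SE."""
    return beta / (math.sqrt(n) * se)


# Toy data (made up for illustration only).
education = ["degree", "school", "degree", "none"]
ref, dummies = dummy_code(education)
print(ref, dummies)
print(odds_ratio(0.0), effect_size(0.5, 0.1, 100))
```

Dropping the most populated category keeps the remaining indicators from being perfectly collinear with the intercept, and makes each odds ratio a comparison against the largest group.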

Fig. S1.
Longitudinal distribution of beliefs, compliance and media sources. The data presented refer only to the N = 2797 respondents who completed all three timepoints. Panels A-E illustrate the proportions of various responses to belief-class questions. Panels F-I illustrate responses to questions regarding compliance with measures that attempt to reduce COVID-19 transmission. Panel J illustrates the distribution of the primary sources of information about the pandemic that people use. All x-axes show percentages. Each panel contains χ² and p values of tests for dissimilarity between the December 2020, June 2021 and January 2022 data. Answers are displayed in cascade order based on the most prevalent answers given in December 2020. Information regarding sources of information about the pandemic was not collected in January 2022. Significance is denoted as *, **, and *** for p<0.05, p<0.01, and p<0.001, respectively.

Probabilities of answering questions about beliefs, compliance and media sources based on sociodemographic factors at the different timepoints.