Health Stigma on Twitter: Investigating the Prevalence and Type of Stigma Communication in Tweets about Different Conditions and Disorders

Background: Health-related stigma can act as a barrier to seeking treatment and can negatively impact wellbeing. Comparing stigma communication across different conditions may generate insights previously lacking from condition-specific approaches and help to broaden our understanding of health stigma as a whole. Method: A sequential explanatory mixed-methods approach was used to investigate the prevalence and type of health-related stigma on Twitter by extracting 1.8 million tweets referring to five potentially stigmatised health conditions and disorders (PSHCDs): Human Immunodeficiency Virus (HIV) / Acquired Immunodeficiency Syndrome (AIDS), Diabetes, Eating Disorders, Alcoholism, and Substance Use Disorders (SUD). Firstly, 1,500 tweets were manually coded by stigma communication type, followed by a larger sentiment analysis (n = 250,000). Finally, the most prevalent category of tweets, ‘Anti-Stigma and Advice’ (n = 273), was thematically analysed to contextualise and explain its prevalence. Results: We found differences in stigma communication between PSHCDs. Tweets referring to substance use disorders were frequently accompanied by messages of societal peril, whereas HIV/AIDS-related tweets were most associated with potential labels of stigma communication. We found consistencies between automatic tools for sentiment analysis and manual coding of stigma communication. Finally, the themes identified by our thematic analysis of anti-stigma and advice were Social Understanding, Need for Change, Encouragement and Support, and Information and Advice. Conclusions: Despite one third of health-related tweets being manually coded as potentially stigmatising, the notable presence of anti-stigma suggests that efforts are being made by users to counter online health stigma.
The negative sentiment and societal peril associated with substance use disorders reflect recent suggestions that, though attitudes have improved towards physical diseases in recent years, stigma around addiction has seen little decline. Finally, consistencies between our manual coding and automatic tools for identifying language features of harmful content suggest that machine learning approaches may be a reasonable next step for identifying general health-related stigma online.


Introduction
Experiences of stigma can have a detrimental effect on the lives of people living with a range of health conditions and disorders. A recent study in the UK found that over half of participants living with a long-term health condition reported having experienced stigma associated with their condition (Brown et al., 2022a). Modern conceptualisations of stigma suggest that it is a social, cultural and moral phenomenon; a process typically involving labelling, negative stereotyping, linguistic separation (whereby the target of stigma is referred to as distinct from the person communicating the stigma), and an asymmetry of power between those communicating and receiving stigma (Major and O'Brien, 2005; Andersen et al., 2022; Kleinman and Hall-Clifford, 2009). Experiences of stigma can act as a barrier to sharing health information, making individuals less likely to seek treatment and advice, which may inhibit them from receiving an appropriate level of care (Earnshaw et al., 2011; Sheehan and Corrigan, 2020; Simpson et al., 2021; Brown et al., 2022b). Experiences of health stigma have also been found to negatively impact employment and income, and can have adverse economic effects (Sharac et al., 2010).
Stigma can accompany conditions or disorders that are associated with lifestyle behaviours.
The concept of 'lifestyle diseases' has been strongly criticised because it can incorrectly allocate both blame and choice to those who experience ill health (Whyte, 2016). Nevertheless, assumptions are frequently made about an individual's lifestyle in the presence of ill health, which can assign individual responsibility for the development of the condition or disorder (Seeberg and Meinert, 2015). The perceived origin of a condition is key to health stigma, and centres on the construct of onset controllability (Pachankis et al., 2018). For example, health conditions relating to alcohol and other substance use disorders may be considered stigmatising due to the association of their origin with chosen lifestyle behaviours around drinking. Human Immunodeficiency Virus (HIV) is a long-term health condition that remains highly stigmatised in society, as does Acquired Immunodeficiency Syndrome (AIDS). Pachankis et al. (2018), when evaluating an extensive range of potentially stigmatising conditions, found that alcoholism, drug dependency and HIV status were all rated high in controllable origin. In previous decades, public policy and media discourse cultivated a stigmatising narrative around HIV and AIDS as a sexually transmitted, fatal disease associated with lifestyle choices (Khan, 2020). Typically held stereotypes about the lifestyle of people living with HIV and AIDS were that they were likely to be homosexual men, prostitutes or drug users (Earnshaw et al., 2012). Eating disorders have also been considered potentially stigmatised due to their association with lifestyle behaviours. For example, Anorexia Nervosa has been linked to lifestyle factors such as dietary habits, pursuit of the 'thin ideal' (Zipfel et al., 2015), and a need for control (Branley-Bell et al., 2023). Extreme 'Pro-ana' (pro-anorexia) groups may have exacerbated stigma by suggesting that Anorexia is a lifestyle choice rather than an illness (Richardson and Cherry, 2011).
Bulimia and Binge Eating Disorder are also often associated with lifestyle choices such as intentional overeating and desired weight loss (Mehler and Rylander, 2015; Hutson et al., 2018). Finally, type 2 Diabetes may be considered potentially stigmatising as it is frequently classified as a lifestyle disease, often being associated with factors such as laziness and poor dietary choices (Browne et al., 2013).
Experiences of health stigma can be very hurtful and can have a damaging impact on long-term health and wellbeing (Clair et al., 2016; Lawrence et al., 2022; Entwistle, 2008).
Existing research into health stigma has typically adopted a siloed approach in which specific conditions and disorders are investigated separately (Stangl et al., 2019). This approach has stifled comparisons between stigmatised conditions, potentially limiting the broader understanding of health stigma as a whole (Pachankis et al., 2018; Stangl et al., 2019). It also means that the insights drawn from investigating condition-specific stigma are not always brought to the attention of academics and public health communicators working to address stigma in other areas (Millum et al., 2019). Therefore, jointly investigating and comparing potentially stigmatised conditions may offer additional insights previously lacking from condition-specific approaches. For example, such comparisons may provide knowledge concerning potential differences between physical, mental, and behavioural conditions, with respect to instances of stigma. This may help to develop interventions aimed at tackling health stigma with a more informed application across multiple conditions.
In order to investigate the true impact of health stigma, it is vital to understand how it is communicated. Smith's Model of Stigma Communication describes four types of content relevant to stigma communication: Marks, Labels, Responsibility, and Peril (Smith, 2014; Smith, 2007; Smith, 2011). There are parallels between the modern conceptualisation of stigma provided above and Smith's Model of Stigma Communication. For example, the use of labels to refer to a particular social group features in accounts of both what stigma is and how it is communicated. Stigma is described as a process involving the separation of those assigning and receiving stigma (such that the receivers of stigma are 'othered' from the dominant social group). Stigma communication conveys this separation by referring to the negative features of the 'othered' group. This is communicated by highlighting the societal peril caused by the stigmatised group and the responsibility they carry for belonging to this group. With respect to the Model of Stigma Communication, Marks describe potential ways to identify members of a stigmatised group. To be most effective, marks should be visible and unsightly features of health, so that they can be identified rapidly. These may include clearly visible facial marks or a notable physical movement or tic, such as those associated with Tourette's syndrome (Smith, 2007). Labels are terms used to refer to a group. Labels often present the danger associated with a group by arousing social cognitions, such as considering the identified persons as a distinct group and encouraging stereotypes. For example, instead of stating that a person has epilepsy, the use of a label may refer to an individual as an 'epileptic', denoting that the person is the disease and a member of a separate group (Smith, 2007). Responsibility is content that describes a person's own agency, and assigns choice and blame for belonging to a certain stigmatised group. Responsibility may even suggest that a person voluntarily decided to deviate from social norms to engage in taboo activities.
For example, someone living with a sexually transmitted infection may experience stigma from others who attribute individual responsibility for the causal origin of the infection (Yoo and Jang, 2012).
Finally, Peril is content that describes the physical or social threat to a community's functioning. Peril often highlights painful, fatal, or socially taboo consequences of belonging to a stigmatised group.
For example, HIV/AIDS may be stigmatised by portraying experiences of pain and death, associated with sexual promiscuity or injecting illegal drugs. Smith's model has been recently extended for the purpose of identifying health stigma on Twitter to highlight additional features such as experiences of self-stigma (i.e., feeling negative attitudes towards oneself, or about one's condition), wishing harm upon others, and generally seeking to devalue the lives of those living with a particular condition (Bacsu et al., 2022; Robinson et al., 2019).
Today, much of the public communication and discussion about health conditions and the associated stigma occurs on social media platforms. This article focusses on health messages posted on the social media platform Twitter (X). Twitter was recently rebranded as 'X'; however, in this article we refer to 'Twitter' and 'tweets' because data were collected between March and May 2022, prior to the rebranding in July 2023 (BBC News, 2023). Twitter is a widely used social media platform on which users can communicate their thoughts and opinions on almost any topic, including those associated with potentially stigmatising health conditions. Twitter is therefore a relevant communicative context for investigating health stigma, providing researchers with an ideal source of both quantitative and qualitative data (Kim et al., 2021). Health researchers have used Twitter to collect large quantities of potentially stigmatising messages associated with a number of health concerns, including mental health disorders (Robinson et al., 2019), dementia (Bacsu et al., 2022), HIV pre-exposure prophylaxis treatment (Schwartz and Grimm, 2017), and eating disorders (Arseniev-Koehler et al., 2016; Talbot and Branley-Bell, 2022). Many health stigma studies have chosen to qualitatively analyse samples of tweets to identify themes, categories and content features associated with specific types of stigma (Najafizada et al., 2022; Reavley and Pilkington, 2014; Bacsu et al., 2022). Using sentiment analysis and other forms of natural language processing can also provide insights into the patterns of health-related stigma by considering the language features of a tweet and whether or not content characteristics vary by condition. Social media platforms such as Twitter have come under heavy criticism for potentially incentivising the spread of provocative or polarising content (Rathje et al., 2021; Branley and Covey, 2017). Further investigation of tweets containing health stigma may help to understand the degree to which potentially stigmatising narratives are present online. This may help health communicators to 'cut through the noise' by providing accurate and effective health messaging to counter harmful and potentially stigmatising content.
The aim of this study is to investigate Twitter discourse around potentially stigmatising health conditions and disorders (PSHCDs). We will examine the prevalence of different types of stigma, and explore differences and/or similarities between PSHCDs. Furthermore, by conducting a sentiment analysis of tweets, we will compare natural language processing approaches to identifying potential features of health stigma communication online with manual coding conducted by subject experts. Finally, a deeper qualitative analysis of tweets will look to contextualise and explain the presence of stigma communication online.

Method
We conducted a mixed methods study into health stigma on Twitter by extracting tweets referring to five PSHCDs (Human Immunodeficiency Virus (HIV) / Acquired Immunodeficiency Syndrome (AIDS), Diabetes, Eating Disorders, Alcoholism, and Substance Use Disorders). These health conditions and disorders were selected because of their perceived associations with lifestyle behaviours (see above). People living with conditions associated with lifestyle behaviours may experience stigma due to external perceptions of the controllability of the origin of their condition or disorder (Pachankis et al., 2018). We manually coded a subset of tweets by stigma communication type, used natural language processing to analyse sentiment and other language features of tweets, and thematically analysed an additional subset of tweets to further contextualise and explain health stigma communication online. Our study was approved by the Department of Psychology Ethics Committee at Northumbria University (ethical approval number 52832). For a flowchart of study phases, see Figure 1.

Figure 1. Flowchart of study phases.
Phase 1: Extract English language tweets for each PSHCD via Twitter's API using 'Search Terms for Extracting Tweets' (see supplement).
Phase 2: Create random subsets of tweets for each PSHCD.
Phase 3: Manually code potentially stigmatising tweets from the manual coding sample and categorise stigma communication type using 'Stigma Communication Types Codebook' (see supplement).
Phase 4: Sentiment analysis of manually coded tweets.
Phase 5: Sentiment analysis of larger subset of tweets.
Exclusion steps: Exclude retweets, quote tweets, and non-English language tweets. Exclude irrelevant and uncategorisable tweets not referring to a target PSHCD.
Samples: Manual coding sample; Quantitative analysis sample.

Twitter Data
The R package 'academictwitteR' was used to extract tweets via Twitter's API (Barrie and Ho, 2021).
This package requires accredited access to Twitter's 'Academic Research Product Track v2' which allows academic researchers to search the full history of public Tweets (Twitter, 2022). Tweets associated with each PSHCD were extracted using a pre-determined list of search terms (see supplement, Table S1). This list of terms was created by first consulting the Medical Subject Headings (MeSH) thesaurus (National Library of Medicine, 2021) to find common terms related to each of the five target PSHCDs (HIV/AIDS, Diabetes, Eating Disorders, Alcoholism, and Substance Use Disorders). For each PSHCD, we consulted previous research and subject experts within the research team to refine our list of search terms. For example, in the UK, the four most common categories of eating disorder are Bulimia Nervosa, Anorexia Nervosa, Binge Eating Disorder, and Other Specified Feeding or Eating Disorder (OSFED) (Priory Group, 2022). Therefore, the names for these categories of eating disorder were included as additional search terms, along with related words and expressions identified by previous research (Branley and Covey, 2017). Group discussion and consultation with subject experts within the research team aided further refinement of search terms.
We extracted English language tweets published during March-May of 2022 without territory restrictions for the origin of tweets. This time period was chosen to avoid major international awareness days and campaigns for each of the five PSHCDs. Such dates were avoided to limit the effect of intermittent spikes in Twitter activity on the prevalence and content of health-related tweets.
We extracted 1,841,375 tweets in total: HIV/AIDS = 568,632 tweets, Diabetes = 496,614 tweets, Alcoholism = 339,391 tweets, Substance Use Disorders = 239,056 tweets, and Eating Disorders = 197,682 tweets. Due to limits to both researcher time and computational power, random subsets of the extracted tweets were created to enable further analysis. A subset of 1,500 tweets (300 per PSHCD) was created for manual coding by stigma communication type. A subset of 250,000 tweets (50,000 per PSHCD) was created for further analysis using natural language processing.
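The subsetting step can be illustrated with a minimal sketch. It assumes the extracted tweets are held in memory as a dictionary keyed by condition, which is a hypothetical layout; the source does not describe the sampling code, and the function and variable names below are illustrative only. The sketch draws a fixed number of tweets per condition at random, without replacement.

```python
import random


def sample_per_condition(tweets_by_condition, n_per_condition, seed=0):
    """Draw n_per_condition tweets per condition, without replacement."""
    rng = random.Random(seed)  # fixed seed so the subset can be reproduced
    subsets = {}
    for condition, tweets in tweets_by_condition.items():
        if len(tweets) < n_per_condition:
            raise ValueError(f"not enough tweets for {condition}")
        subsets[condition] = rng.sample(tweets, n_per_condition)
    return subsets


# Toy stand-in for the extracted corpus (hypothetical data, not study data).
corpus = {
    "HIV/AIDS": [f"hiv tweet {i}" for i in range(1000)],
    "Diabetes": [f"diabetes tweet {i}" for i in range(800)],
}
manual_coding_sample = sample_per_condition(corpus, 300)
```

Sampling an equal number per condition, rather than sampling from the pooled corpus, keeps the subsets balanced even though the conditions differ widely in tweet volume.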

Analysis
Mixed methods approach. We adopted a sequential explanatory mixed-methods approach. First, we manually coded tweets by stigma type. We then conducted a sentiment analysis of the coded tweets to enable a comparison between the manual coding of stigma communication types and automatic sentiment analysis. A further sentiment analysis of the larger subset of tweets was then conducted to compare differences in language features between conditions. Finally, thematic analysis was used to explain and contextualise earlier findings by qualitatively investigating the most prevalent category of coded tweets (Anti-Stigma & Advice; see section 'Manual Coding' below). This follows previous theoretical guidelines for mixed-methods research in which sequential explanatory designs allow researchers to utilise a qualitative approach to expound former findings (Bishop, 2015; Creswell and Clark, 2017). For example, a related mixed-methods approach has previously been used to study mental health stigma on Twitter (Pavlova and Berkers, 2020).
Manual coding. To manually code tweets by stigma communication type, we created a Stigma Communication Type Codebook (see supplement, Table S2). This codebook was informed by prior theoretical research. First, we included categories from Smith's Model of Stigma Communication: Marks, Labels, Responsibility, and Peril (Smith, 2014; Smith, 2007; Smith, 2011). After discussions among the full research team, additional categories were included to capture common features of Twitter communication, i.e., insults, entertainment and advice, as well as the role that stigma can play in devaluing the lives of those with a particular health condition or disorder (Bacsu et al., 2022; Robinson et al., 2019). A draft codebook was assessed by the research team by applying it to an initial subset of the extracted Twitter data (n = 100 tweets). During this initial assessment of tweets, researchers identified a notable presence of messages aimed at combatting online health stigma. This included content that referred to features of stigma as an attempt to counter existing narratives or to inform and advise the public on the relevant PSHCD. Due to the prevalence of this category, and its relevance to health stigma communication, the research team decided to add it as a stigma communication type. This additional category was defined as Anti-Stigma & Advice. The final codebook consisted of seven categories for coding tweets by stigma communication type (Labels, Marks, Responsibility, Peril, Insults, Entertainment, and Anti-Stigma & Advice).
Tweets were divided among five coders who applied the codebook to manually categorise tweets. Coders flagged tweets as irrelevant if they did not refer to a PSHCD and coded tweets as 'No Stigma' where no stigma was present. If stigma was present, coders indicated the type(s) of stigma communication within the tweet and selected the main type represented. Finally, coders were free to code tweets as 'other' and to define further categories. For each coder, 15% of their sample of tweets was also coded by a second researcher, pursuant to guidelines for ensuring inter-rater reliability among coders (Syed and Nelson, 2015). From this sample of twice-coded tweets, Krippendorff's alpha scores were calculated, α = .83, indicating an acceptable level of interrater reliability (Lombard et al., 2010).
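To make the reliability statistic concrete, the following is a minimal sketch of Krippendorff's alpha for nominal codes, computed from a coincidence matrix. It is an illustration under stated assumptions, not the software used in the study (which the source does not name): each tweet is represented as a list of the category labels assigned by its coders, and the function name is hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: one list per tweet, holding the code given by each coder.
    """
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # tweets coded only once carry no agreement information
        # Every ordered pair of codes from different coders counts 1 / (m - 1).
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1 - observed / expected


# Toy example with two coders and four tweets (hypothetical labels).
toy = [["Peril", "Peril"], ["Peril", "Labels"],
       ["Labels", "Labels"], ["Labels", "Labels"]]
alpha = krippendorff_alpha_nominal(toy)
```

Alpha equals 1 under perfect agreement and falls towards 0 as observed disagreement approaches the level expected by chance.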
An a priori power analysis indicated that a sample of 1,120 tweets would be sufficient to detect a small to medium effect size of .15 for a Chi-square test with power = .8, significance level = .05, df = 32, to investigate whether or not there is a statistically significant relationship between PSHCD and stigma communication type. After irrelevant and uncategorisable tweets were removed, our manually coded sample consisted of 1,288 tweets.
Wordclouds and n-grams were generated to identify common terms and language features among tweets referring to different PSHCDs. Sentiment scores were calculated to determine whether sentiment differs between tweets manually coded as stigmatised, not stigmatised, or anti-stigma.
Further analysis was conducted to investigate differences in sentiment between PSHCDs using the larger sample of tweets (see Figure 1). Sentiment scores were calculated for tweets using an existing [...]. Additionally, we generated language feature scores using Perspective API to compare our manual coding of health stigma with an automatic tool commonly used for identifying harmful content online. An increasing number of machine learning solutions are being offered to help monitor and combat harmful online content. Google's Perspective API processes online text and provides scores for a range of attributes potentially relevant to stigma communication, most notably 'Toxicity', 'Identity Attack', 'Threat', 'Sexually Explicit', 'Insult' and 'Profanity'. By automatically generating scores for these dimensions, researchers have used Perspective API to study hate speech and other harmful content, social use of language, and online behaviours on Twitter (Jiang and Vosoughi, 2020; Aleksandric et al., 2022; Narayanan, 2020). We compare these dimensions among tweets manually coded as stigmatised, not stigmatised, or anti-stigma to explore the suitability and effectiveness of automatic tools for identifying potential features of health stigma communication.
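As an illustration of how such attribute scores might be requested, the sketch below builds a Perspective API request for the six attributes named above and reads the summary score for each from a response. It is a hedged example rather than the authors' pipeline: the helper names are hypothetical, an API key is assumed, and the endpoint and attribute identifiers follow Perspective's public documentation as we understand it, so they should be checked against the current reference before use.

```python
import json
import urllib.request

# Endpoint and attribute identifiers as documented by Perspective (assumed
# current; some attributes are experimental and may not suit every language).
PERSPECTIVE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"
ATTRIBUTES = ["TOXICITY", "IDENTITY_ATTACK", "THREAT",
              "SEXUALLY_EXPLICIT", "INSULT", "PROFANITY"]


def build_request_body(tweet_text):
    """Build the JSON body asking for every attribute of interest."""
    return {
        "comment": {"text": tweet_text},
        "languages": ["en"],
        "requestedAttributes": {name: {} for name in ATTRIBUTES},
    }


def extract_scores(response):
    """Map each returned attribute to its summary score (0 to 1)."""
    return {
        name: detail["summaryScore"]["value"]
        for name, detail in response["attributeScores"].items()
    }


def score_tweet(tweet_text, api_key):
    """Send one tweet for scoring; requires network access and a valid key."""
    request = urllib.request.Request(
        f"{PERSPECTIVE_URL}?key={api_key}",
        data=json.dumps(build_request_body(tweet_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_scores(json.load(reply))
```

Scores returned this way could then be averaged within each manually coded group (stigmatised, not stigmatised, anti-stigma) for comparison.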
Qualitative analysis. Reflexive Thematic Analysis (Braun and Clarke, 2019) was used to analyse tweets manually categorised as 'Anti-Stigma & Advice' (n = 273). This category of stigma communication was included in the Stigma Communication Type Codebook due to its notable presence among health-related stigma content (see section 'Manual Coding' above). Our manual coding of tweets found Anti-Stigma & Advice to be the most prevalent category from our codebook.
Therefore, researchers decided to conduct a qualitative analysis of this category of tweets in order to contextualise and explain this finding, pursuant to a sequential explanatory mixed-methods approach (Bishop, 2015; Cresswell et al., 2019). Thematic analysis has previously been used to analyse tweets associated with a range of stigmatised health conditions such as diabetes (Blackwood et al., 2022) and mental health disorders (Jansli et al., 2022; Berry et al., 2017). In our study, one researcher examined and analysed tweets using NVivo 12 to create an initial thematic structure. This hierarchical structure of descriptive headings and subheadings was used to compare themes across all PSHCDs.
Initial themes and representative tweets were discussed among the full research team to create the final thematic framework collaboratively.
Gold (2020) notes the importance of recognising that Twitter data is unlike many datasets used for secondary analysis in that it is dynamic. Users may choose to delete previously posted tweets or remove their account entirely. Producing verbatim extracts of tweets in research makes it possible to connect tweets to individual users through internet search engines. Despite tweets being available in the public domain, we employed a process of anonymisation to avoid any potential unwanted identification of tweet authors. Twitter handles and potentially identifying information were removed from tweets. Furthermore, when citing representative tweets in our thematic analysis, message content was altered and reconstructed where necessary in order to 1) ensure that the user could not be identified, and 2) ensure that the message accurately represented the original tweet. This follows previous guidelines in qualitative research whereby direct quotations may be paraphrased to hide idiosyncratic speech patterns (Social Research Association, 2021). This protection of user identity is pursuant to the British Psychological Society's 'Ethics guidelines for internet-mediated research' which highlights the variety of expectations around data privacy online, and indicates the need to protect individuals posting or referred to in tweets (British Psychological Society, 2021).
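The first, mechanical part of this anonymisation (removing handles and links) could be sketched as follows. This is an assumption-laden illustration: the patterns and placeholder tokens are hypothetical choices, not the authors' procedure, and the later paraphrasing of quoted tweets is a manual step that no pattern can replace.

```python
import re

# Twitter handles are up to 15 word characters after '@'; the lookbehind
# avoids matching the '@' inside an email address.
HANDLE = re.compile(r"(?<!\w)@\w{1,15}")
URL = re.compile(r"https?://\S+")


def strip_identifiers(text):
    """Replace links and user handles with neutral placeholders."""
    text = URL.sub("[link]", text)
    text = HANDLE.sub("[user]", text)
    return text


cleaned = strip_identifiers("thanks @some_user see https://t.co/abc")
```

Names, locations and other identifying details in the message body would still need to be reviewed by hand.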

Results
Our results are divided by the three phases of our analytic procedure (see Figure 1). First, we report the findings from our manual coding of stigma communication types. Second, we report the results from our sentiment analysis of both our manually coded sample of tweets, and larger subset of tweets.
Finally, we present the findings from our thematic analysis of the most prevalent category from our manual coding of tweets, 'Anti-Stigma & Advice'.

Manual coding of tweets by stigma communication type
A chi-square test of independence was performed to examine the relationship between PSHCD and the main category of stigma identified for each tweet. The relation between these variables was significant, χ²(32, n = 1,288) = 192.15, p < .001. Of the 1,288 tweets included in our manual coding, 43% (n = 562) were coded as not containing potentially stigmatising content, 21% were considered to be 'Anti-Stigma & Advice', and the remaining 35% (n = 453) were divided among our defined categories of stigma communication (see Figure 2).
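For readers unfamiliar with the test, the statistic can be computed from a table of observed counts as in the minimal sketch below. The table is toy data, not the study's counts, and the function name is hypothetical. The reported 32 degrees of freedom would be consistent with five conditions crossed with nine coding categories, since (5 - 1) × (9 - 1) = 32, although the source does not list the categories entering the test.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom.

    table: list of rows of observed counts (e.g. conditions x categories).
    Every row and column must have a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Toy 2 x 2 table (hypothetical counts).
statistic, df = chi_square_independence([[10, 20], [20, 10]])
```

The p-value is then obtained by referring the statistic to a chi-square distribution with `df` degrees of freedom, which in practice is left to a statistics library.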

Comparison of sentiment scores across PSHCDs from the larger sample of tweets
A one-way ANOVA was performed to compare sentiment scores across the larger sample of tweets by PSHCD (n = 248,600). The ANOVA indicated a statistically significant difference in sentiment score between health conditions (F(4, 248595) = 5911.3, p < 0.001). Mean tweet sentiment scores were most positive for tweets referring to HIV/AIDS (M = 0.00, SD = 1.81), and least positive for tweets referring to substance use disorders (M = -1.69, SD = 2.07; see Figure 6 below). For an account of the frequency of the most common terms contained within tweets for each PSHCD, see supplement Table S3 and Figures S1-S5.
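The F statistic and its degrees of freedom follow from the between-group and within-group sums of squares, as in this minimal sketch. It uses toy scores rather than the study data, and the function name is hypothetical. With five conditions and 248,600 tweets, the degrees of freedom are 5 - 1 = 4 and 248,600 - 5 = 248,595, matching the values reported above.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy sentiment scores for three groups (hypothetical data).
f_value, df_between, df_within = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

With samples as large as those in this study, even small differences in mean sentiment yield very large F values, so the group means and standard deviations are the more informative figures.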

Qualitative Analysis
We report the findings from our reflexive thematic analysis of tweets manually coded as 'Anti-Stigma & Advice' (n = 273). An overview of each theme, accompanied by representative tweets from our sample, is presented in Table 1. The prevalence of each theme across the five PSHCDs is reported in Table 2. Each theme is presented, defined and explained in further detail below.
Social Understanding. This theme represents content aimed at combatting perceived stigma associated with PSHCDs by attempting to alter perspectives of ill health and personal responsibility for health. In particular, tweets challenged individual stereotypes associated with certain conditions, and looked to diminish the attribution of responsibility to those living with stigmatised conditions.
Messages that attempted to improve the public's understanding of the experiences of those living with a PSHCD were most common among tweets about eating disorders.Though subsets of tweets for each PSHCD included content looking to counter messages of blame, this was most frequent when referring to substance use disorders.These tweets highlighted tensions between substance use disorders as a medical issue and the legal and societal frameworks around drugs.This tension is also found in tweets that stressed a need for change with respect to laws and attitudes towards addiction.
Need for Change. This theme represents calls for change with respect to the degree of support given to people suffering from PSHCDs and the need for improved use of language when communicating about various conditions. Tweets about substance use disorders highlighted the myriad of causes that can lead to addiction and the importance of better provision and support.
"We need to do more to assist people whose lives are in danger from substance abuse. Many have chaotic lives and not the best start in life. We cannot turn our backs on them just because we don't agree with their lifestyle."
"I reckon this tweet does a hell of a lot more to tackle the stigma of addiction by highlighting the history of drug use, the total hypocrisy of societal attitude to use of any substances, & the discriminatory drug laws across the world. More debate instigated by humour required"
General calls for greater action to address structural causes of addiction were similar to the various rallying cries among the HIV/AIDS tweets seeking to limit the spread of the virus and improve treatment.
"Important we all pledge and commit to ending #stigma faced by people living with #HIV."
"[...] pointing out the hypocrisy of asking someone about one aspect of their medical history, apparently acceptable and necessary, when to ask that question about HIV would be inappropriate and insensitive."
"Diabetes guru says in order to provide the best healthcare it's important to use the right language. Great resource for #Diabetes care #languagemstters"
These tweets suggest the role that language can play in perpetuating stigma and highlight the importance of using terms mindfully, both for individuals living with a specific condition and for health communicators.
Encouragement & Support. This theme captures messages that may inspire or encourage others either through celebrating instances of recovery or offering words of support to those living with PSHCDs. A common category of tweets across PSHCDs was that of celebrating recovery or successful management of health. These tweets often highlighted the ongoing nature of recovery and self-management of health, and focussed on positive steps taken by the individual.

"I'm officially [number of days] self harm & emotional binge eating free!! it might not seem long but I've been struggling a lot & it takes a lot not to relapse! I'm v proud of myself !!"
"honestly I'm making so much progress here, still pretty much relapsing with anorexia but I'm working so much on myself"
"I set out using [...]"
"Years of managing my own T1d helped me focus. Emotionally, was in bits. Now know best to let it all out and ask for help"
"Here I show how helping #T2D patients cut sugar and starchy carbs helps reduce our use of drugs for diabetes in primary care"
"Something to glean from fab info on how to reduce type 2 diabetes with a focus on BMI & weight loss, an area for community pharmacy to play a big part in, come on pharmacy need to listen to these guys more!"
This may suggest that people with type 1 diabetes are more willing to specify the category of condition they experience, compared to those living with type 2 diabetes. Alternatively, this may reflect that greater efforts are being made by health communicators to raise awareness of type 2 diabetes given the increased behavioural component compared to type 1.

Prevalence of health stigma communication on Twitter
From our manually coded sample, almost half of all tweets were categorised by researchers as not containing any potentially stigmatising content nor any anti-stigma content. A third of tweets contained some form of potentially stigmatising content and the remainder (over one fifth of all tweets) comprised messages of anti-stigma and advice. It is notable that this category, 'Anti-Stigma & Advice', was more prevalent than any individual stigma communication type. By studying numerous health conditions within the same study, we were able to provide an indication of the prevalence of health-related stigma communication on Twitter. Best and Arseniev-Koehler (2022) have suggested that assessments of the prevalence of health-related stigma may have been obscured due to research typically only addressing those individual conditions and disorders that remain highly stigmatised (most notably mental illnesses such as schizophrenia). By investigating the prevalence of stigma across several conditions, we find that potentially stigmatising content is notably present on Twitter, but less common than non-stigmatising content. We also highlight the significant presence of tweets directed at countering health-related stigma on Twitter.

Commonalities and differences between PSHCDs in stigma communication types
Each of our defined stigma communication types was found across all PSHCDs, though the prevalence of each category varied by condition. This suggests that there are commonalities between PSHCDs in the communication of stigma. Recent research has suggested that 'health-related stigma' should be studied as a viable concept in its own right, due to similarities in features of stigma across various conditions and disorders (Van Brakel et al., 2019). Features of stigma communication are likely to be common across conditions because all forms of stigma share a common framework as a social phenomenon arising from shared perceptions and relationships throughout society (Pescosolido and Martin, 2015). Despite the expected commonalities, our manual coding revealed notable differences between PSHCDs in the prevalence of certain stigma communication types.
From those tweets manually coded by researchers as containing potentially stigmatising content, labels of stigma were most prevalent among HIV/AIDS tweets. Our analysis of the larger subset of HIV/AIDS tweets found that 'gay' was one of the terms most associated with this PSHCD. Previous research has found that gender, race, and sexual orientation often intersect with HIV-related stigma (Logie et al., 2011). Similarly, the percentage of tweets containing marks of stigma was highest among those referring to eating disorders. Further analysis of the larger subset of eating disorder tweets found 'weight', 'fat', and 'skinny' to be among the most frequently tweeted terms associated with this category of PSHCD. The close association between eating disorders and marks of stigma (physically identifying characteristics) found within our sample may reinforce commonly held stereotypes concerning disordered eating. This is problematic because perpetuating stereotypes associated with eating disorders can lead to disparities in treatment where the individual does not have the 'marks' most commonly associated with disordered eating (thin, white, female) (Head, 2019). Finally, over a third of potentially stigmatising tweets that referred to substance use disorders alluded to the 'peril' associated with this PSHCD. Further analysis of the larger subset of tweets referring to substance use disorder found that terms such as 'crime', 'criminal', 'homelessness' and references to money and family issues were common. Previous research has suggested that substance use disorders are typically discussed as moral and criminal issues, rather than as a health concern (Mattoo et al., 2015). Research has also connected substance use disorder with reports of 'peril' due to perceptions of societal danger, and suggestions of 'poor moral character' (Stringer and Baker, 2018). In addition to substance use disorder tweets being coded as containing greater peril and perceived danger, this category of tweets had a markedly lower average sentiment score compared to other PSHCDs. A key component of substance use-related stigma previously used to explain the negative connotations associated with this PSHCD is 'socially deviant' behaviour (Millum et al., 2019). Furthermore, in an analysis of changes in health-related stigma since the 1980s, Best and Arseniev-Koehler (2022) suggest that most physical diseases have experienced a marked decline in negative connotations, whereas mental illnesses, eating disorders, and addiction have seen little change in levels of stigma. Individual activism and informational campaigns are suggested to explain some, but not all, of the variation in condition-related stigma (Best and Arseniev-Koehler, 2022).

Anti-stigma and advice
To contextualise and explain the notable prevalence of 'Anti-Stigma and Advice' within our manually coded sample of tweets, we conducted a thematic analysis to identify common features within this category. We identified the themes Social Understanding, Need for Change, Encouragement & Support, and Information & Advice. We found that alcoholism and substance use disorders were the least represented disorders among tweets categorised as anti-stigma or advice. As described above, conditions and disorders often associated with addiction and 'socially deviant' behaviours have seen little decline in public stigma in recent years (Best and Arseniev-Koehler, 2022). In contrast, HIV/AIDS-related tweets contained the highest proportion of anti-stigma content and the most positive sentiment of all PSHCDs.
Tweets that represented attempts to improve public understanding of PSHCDs were most common when referring to eating disorders. Messages often highlighted that eating disorders are not restricted to those of a particular age bracket, sex or body shape and stressed the damage that can come from perpetuating such stereotypes. This message of anti-stigma appears to be in response to the reported 'marks' of stigma communication commonly associated with eating disorders. Though the prevalence of this category of anti-stigma among tweets referring to eating disorders is positive, our finding that stigma 'marks' are prevalent among tweets about this PSHCD suggests that continued efforts are required to combat stereotypes online. A previous evaluation of past and present approaches to stigma change divided strategies into attempts to 'protest', 'educate' and 'contact' (Corrigan, 2016; Corrigan and Penn, 1999). We consider our themes for defining anti-stigma and advice in light of this theoretical categorisation. Messages of encouragement and support featured across PSHCDs, often celebrating an individual's recovery or successful management of health. This echoes previous research into communication about eating disorders which highlighted the common presence of 'Pro-Recovery' content online that looks to share and inspire recovery (Branley and Covey, 2017). However, the most common theme from our sample of anti-stigma and advice was 'Information and Advice'. Van Brakel et al. (2019) previously suggested that information-based approaches are the most common strategy for counteracting public stigma associated with any condition. Public health campaigns and activism are suggested to be effective in contributing towards the decline of stigma associated with certain conditions (Best and Arseniev-Koehler, 2022).

Automatic tools for identifying potential features of stigma communication
To compare the effectiveness of automatic tools for identifying potential language features of stigmatising health content, we divided our manually coded sample into tweets categorised as stigmatising, not stigmatising, or anti-stigma and advice. We found that stigmatising tweets were more negative in sentiment and scored higher on five of the six Perspective API dimensions for identifying harmful speech. This provides an initial indication that automatic tools may offer a possible means for assisting in the identification of general health-related stigma at a larger scale. In a detailed comparison of manual and automatic approaches to analysing online text, Van Atteveldt et al. (2021) determined that the best performance for measuring the sentiment of text is still achieved by human coders, compared to lexicon approaches and machine learning (ML). However, numerous attempts have been made in recent years to identify health-related stigma associated with specific conditions online using ML approaches. Oscar et al. (2017) used a supervised ML approach to classify Alzheimer's disease stigma on Twitter. Similarly, Budenz et al. (2020) used an ML model to classify stigma on Twitter associated with bipolar disorder. Most recently, Jilka et al. (2022) provided a proof of principle supervised ML model for identifying schizophrenia stigma on Twitter. This suggests that ML approaches may be effective in identifying health stigma at a large scale, which may prove useful for measuring the success of attempts to reduce online stigma. To our knowledge, no attempts have been made to apply ML approaches to identifying general health-related stigma online. The consistency between our automatic sentiment analysis and manual coding suggests that ML models may be an appropriate next step for attempting to classify general health-related stigma online.
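The comparison described above, in which automatic scores are averaged within each manually coded group, might be sketched as follows. This is a minimal illustration rather than the analysis pipeline used in this study: the rows are invented placeholder values, not study data, and the field names (`sentiment`, `toxicity`) and the function `group_means` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Placeholder rows: (manual code, sentiment score, harmful-speech score).
# All values are invented for illustration and are NOT results from the study.
toy_rows = [
    ("stigmatising", -0.6, 0.42),
    ("stigmatising", -0.3, 0.35),
    ("not_stigmatising", 0.1, 0.08),
    ("not_stigmatising", 0.0, 0.12),
    ("anti_stigma", 0.4, 0.05),
    ("anti_stigma", 0.2, 0.10),
]


def group_means(rows):
    """Average each automatic score within each manually coded group."""
    grouped = defaultdict(lambda: {"sentiment": [], "toxicity": []})
    for code, sentiment, toxicity in rows:
        grouped[code]["sentiment"].append(sentiment)
        grouped[code]["toxicity"].append(toxicity)
    return {
        code: {name: mean(values) for name, values in scores.items()}
        for code, scores in grouped.items()
    }


means = group_means(toy_rows)
for code, scores in means.items():
    print(code, scores)
```

With real data, each row would hold one manually coded tweet together with its automatic scores, and the group means would usually be followed by an appropriate statistical test of the differences between groups.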

Limitations
The results of this study are not without limitation. Firstly, we extracted tweets published during March-May of 2022; however, this extraction was conducted in August 2022. The delay between the publication and extraction of tweets may have affected our data. For example, during this period of delay, tweets may have been censored by Twitter administrators, accounts removed, or content deleted by users in response to public comment. This may have prevented researchers from identifying certain aspects of health stigma communication present on Twitter in real time. Furthermore, we manually coded single tweets in isolation from replies and retweets. This may have limited researchers' ability to accurately interpret the context of tweets. However, users often passively scroll through their Twitter feeds when consuming social media (Song et al., 2021), suggesting that it is typical of Twitter users to read a tweet without fully understanding the surrounding context. Therefore, it is possible that an assessment of individual tweets in isolation reflects the standard interaction between users and content on Twitter. Our data were manually coded by researchers and subject experts relevant to our chosen PSHCDs. However, instances of stigma communication may be interpreted differently by those living with a particular health condition. Recent research has suggested the need for pools of specialised raters (consisting of members from marginalised communities) when annotating content used to design automatic tools for identifying harmful content online (Goyal et al., 2022). It is important that future research captures the perspectives of those living with a range of potentially stigmatised conditions in order to refine our understanding of health stigma communication. Finally, colloquial terms referring to health conditions, especially those related to alcohol and drug use, are diverse and rapidly evolving. While researchers attempted to include a variety of search terms to capture a comprehensive sample of tweets, some might have been overlooked. Future studies could benefit from in-depth consultations with individuals who have lived experience of specific conditions, ensuring a broader spectrum of search terms is used to extract online messages.

Conclusion
This study investigated the prevalence and type of stigma communication among health-related tweets. We found that each of our defined categories of stigma communication was present across all PSHCDs, though there were notable differences between conditions. From our sample of potentially stigmatising tweets, those referring to substance use disorders were frequently accompanied by messages of societal peril, whereas HIV/AIDS-related tweets were most associated with reference to potential labels of stigma communication (such as sexual orientation). Sentiment scores for substance use disorder tweets were more negative than those for any other PSHCD, reflecting recent suggestions that, though negative connotations associated with physical diseases have diminished in recent years, stigma around addiction has seen little decline. Despite one third of health-related tweets being manually coded as potentially stigmatising by researchers, we found a notable presence of content directed at counteracting online stigma. Our thematic analysis found that themes related to providing 'Information and Advice' and 'Social Understanding' were common across PSHCDs. Finally, the consistency between automatic tools for identifying features of harmful text online and our manual coding of stigma communication suggests that ML approaches may be a reasonable next step for identifying general health-related stigma online.

Declarations
Conflicting interests: The authors declare that there are no conflicts of interest.
Ethical approval: This study was approved by the Department of Psychology Ethics Committee at Northumbria University.
Guarantors: All authors shall act as guarantors, taking responsibility for the content of this article.

Figure 1. Flowchart describing each phase of the study.

Figure 2. Number of manually coded tweets for each stigma communication type by condition (n = 726 tweets).

Figure 3. Radar chart showing the percentage of tweets containing each stigma communication type for each condition's sample, excluding non-stigmatising tweets and anti-stigma (n = 453).

Figure 5. Bar chart showing differences in Perspective API dimension scores between groups of manually coded tweets (n = 1,288).

Figure 6. Mean sentiment scores for each PSHCD across the full quantitative sample of health tweets (n = 248,600).

Finally, of the Diabetes Anti-Stigma & Advice tweets (n = 49), only 17 indicated which type of diabetes they referred to (Type 1 = 12 tweets, Type 2 = 5 tweets). Though it was common among the Anti-Stigma & Advice tweets to contain references to personal experiences of PSHCDs, when specifying diabetes type, 9/12 references to type 1 diabetes were made by the individual living with the condition, whereas this was 1/5 for type 2 diabetes, with most references not indicating personal experience by the author of the tweet.
"They are two very different illnesses. I'm Type 1 and when I see a rise in diabetes due to bad diet, it frustrates me."

Table 1. Definition of themes and representative tweets.
"We need to do more to assist people whose lives are in danger from substance abuse. Many have chaotic lives and not the best start in life. We cannot turn our backs on them just because we don't agree with their lifestyle"
"in order to provide the best healthcare to patients it's important to use the right language."
"I set out using Twitter as a personal journey diary. Was going to post my sobriety progress monthly to prove to myself. Figured if I could make it a year then I'm out of the woods as an alcoholic! I succeeded. And no longer need to keep track. thank you for your love"

Table 2. Representation of themes (number of tweets) for each PSHCD.
Finally, tweets about HIV/AIDS and Diabetes stressed the need for careful use of language when referring to a PSHCD and communicating with individuals about their health.
Together, we can get to zero stigma and zero new hiv infections"
"No more stalling. No more shifting responsibility. No more totally preventable HIV cases. However, this is not our end goal. We must simultaneously work towards finding PrEP a proper home in healthcare. It is now time to put aside the challenges of the last few years and collaborate."
These tweets may contribute towards combatting health stigma by presenting cases where individuals have progressed towards a better state of health and improved their general wellbeing. These tweets frame the individual in a positive light and may dampen negative attitudes around conditions by highlighting the possibility of positive change. This positive sentiment was also present among tweets offering more general messages of support and encouragement to seek treatment.
This theme captures messages that aimed to increase awareness of PSHCD campaigns and events, signpost services, and offer advice. This theme is most represented by tweets that highlight specific events or services and provide information about accessing more information and support. This was most featured among HIV/AIDS and Diabetes tweets, but was also present among other PSHCDs.
Twitter as a personal journey diary. Was going to post my sobriety progress monthly to prove to myself. Figured if I could make it a year then I'm out of the woods as an alcoholic! I succeeded. And no longer need to keep track. thank you for your love"
"20 years ago today my life changed, my family's life changed. Diabetes, you may be here for the long haul. But you ain't stopping me. Has it made me who I am? Yes. Would I change it? Yes, but no. #NeverGiveUp #Type1Diabetes"
"It's been almost 21yrs, living with HIV. It's gone from, a terminal illness, to a manageable illness. I'm undetectable, which means, untransmittable."
"It is sad to find out young people died from OD… If you're in the same situation (i.e drug addict) please do go and get help. It's never too late x"
"Alcoholism is progressive but so is Recovery"
Information & Advice.
"Our last awareness session for diabetes awareness week so if your struggling to manage your diabetes join us to learn lifestyle management tips by health care professionals"
Our theme Social Understanding coincides with Corrigan's 'education' category: attempts to reduce stigmatising myths about ill health and to combat stereotypes by presenting facts. The category 'protest', which highlights calls to suppress thoughts of moralising health issues and disrespecting those that suffer from ill health, also overlaps with our theme 'Need for Change'. Despite lower reports of anti-stigma among substance use-related tweets, there were notable calls for a need for change with respect to attitudes towards this PSHCD. These calls highlighted the need for greater efforts to address the structural causes of ill health and to ensure that addiction is viewed as a medical, not moral, issue. Corrigan's final category 'contact' (attempts to eradicate stigma through interactions between people living with a specific condition and the broader public) differs somewhat from our remaining themes Encouragement & Support, and Information & Advice.
Oscar et al. (2017) used a supervised ML approach to classify Alzheimer's disease stigma on Twitter. Similarly, Budenz et al. (2020) used a ML model to classify stigma on Twitter associated with bipolar disorder. Most recently, Jilka et al. (2022) provided a proof