Thirty years of research on physical activity, mental health, and wellbeing: A scientometric analysis of hotspots and trends

The sheer volume of research publications on physical activity, mental health, and wellbeing is overwhelming. The aim of this study was to perform a broad-ranging scientometric analysis to evaluate key themes and trends over the past decades, informing future lines of research. We searched the Web of Science Core Collection from inception until December 7, 2021, using search terms such as “physical activity” or “mental health,” with no limitation of language or time. Eligible documents were articles, reviews, editorial material, and proceeding papers. We retrieved 55,353 documents published between 1905 and 2021. Annual scientific production is exponential, with a mean annual growth rate of 6.8% since 1989. The 1988–2021 co-cited reference network identified 50 distinct clusters and presented significant modularity and silhouette scores, indicating highly credible clusters (Q = 0.848, S = 0.939). This network identified 6 major research trends on physical activity, namely cardiovascular diseases; somatic disorders; cognitive decline/dementia; mental illness; athletes' performance, related health issues, and eating disorders; and the COVID-19 pandemic. A focus on the latest research trends found that greenness/urbanicity (2014), concussion/chronic traumatic encephalopathy (2015), and COVID-19 (2019) were the most active clusters of research. The USA research network was the most central, and the Chinese research network, although important in size, was relatively isolated. Our results strengthen and expand the central role of physical activity in public health, calling for the systematic involvement of physical activity professionals as stakeholders in the public health decision-making process.


Introduction
Physical activity can be considered medicine and has been used in both the treatment and prevention of a variety of chronic conditions (1). Longitudinal cohort studies demonstrate that low cardiorespiratory fitness constitutes the largest attributable fraction for all-cause mortality (2). There is also overwhelming evidence that low physical activity (i.e., not meeting physical activity recommendations) is an important risk factor for chronic conditions including some cancers, cardiovascular disease, diabetes, and dementia, in particular for patients with mental illness (schizophrenia, bipolar disorder, or major depressive disorder) (3-5). Patients with mental illness have poor physical health compared with the general population, with reduced life expectancy and a higher risk of premature death, beyond suicide, from natural causes (6). Among other factors, their poor physical health is at least partially due to higher sedentary behavior and lower physical activity compared with the general population (7,8). Physical activity, and its structured form of exercise, seem to affect the brain and mind, beyond physical health, both as a factor associated with poor mental health and quality of life and as a treatment for mental disorders (9). Indeed, exercise has been shown to be efficacious in a number of mental disorders, according to a previous umbrella review pooling 27 systematic reviews (10,11). Exercise is also now seen as a potential preventive or disease-modifying treatment of dementia and brain aging (12) or as a possible treatment for negative symptoms in schizophrenia (13).
Importantly, systematic reviews, meta-analyses, and umbrella reviews have offered a deep synthesis of specific research questions addressed within the exponentially growing physical activity literature related to mental health and wellbeing. However, such systematic methods may not be appropriate to encompass hundreds or thousands of new publications per year. In fact, systematic reviews have to be narrow in their inclusion criteria and offer a comprehensive view of a specific and restricted research or clinical question. For instance, a meta-analysis can inform whether an intervention is efficacious for a given population on an outcome of interest (14,15), and an umbrella review can assess the credibility of an association between a risk factor and an incident condition (16-19). Nevertheless, neither offers insight into the temporal trend of research or the complex network of topics, authors, publications, institutions, and their bibliometric performance. Gaining such an overarching view of how an entire field of research on a particular topic evolves is important and useful, in order to gauge how the academic literature is developing and to inform the next steps for the science to pursue.
The integration of developments in data visualization, text mining, and network analysis has permitted the emergence of a new framework and a new generation of research synthesis of both evidence and influence, named research weaving (20). This framework combines visual analytics and scientometrics to visualize and delineate the development of a field, its underlying intellectual structure, and the dynamics of scholarly communication over time (21). A comprehensive delineation of how scientometrics and bibliometrics overlap and differ can be found in Hood and Wilson (22).
To the best of our knowledge, no broad-ranging scientometric study of research trends and influence networks of physical activity, mental health and wellbeing has yet been conducted. Thus, in this article, we present one to bridge the gap.

Search strategy and data collection
We searched the Web of Science Core Collection (WOSCC) on December 7, 2021, using a combination of keywords and Medical Subject Headings such as "physical activity," "mental health," and "mental illness*." WOSCC provides full references and complete citations of articles published in major journals since 1900 and is one of the largest comprehensive sources for bibliometric studies (23). The full protocol with the search key is available on osf.io. The current study protocol is based on a first large-scale scientometric analysis (24). The database source was limited to the Web of Science Citation Index Expanded. The document types were limited to "article," "review," "editorial material," and "proceeding papers," without restrictions on language or time. The dataset was extracted from the WOSCC in tag-delimited plain text files.
In order to assess the quality of the reference filtering process and the homogeneity of the dataset, we independently inspected each of the most cited references (604 articles in total) and a randomly selected sample of 10% of included articles to allow a margin of error (i.e., inclusion of non-relevant papers) of 5% with a 95% confidence interval (Supplementary Table 1; Figure 1).

FIGURE
Co-citation reference network with cluster visualization. The units of measure are articles, which constitute the nodes. Nodes are organized according to year of publication. The size of a node (article) is proportional to the number of times the node has been co-cited. Colored shades indicate the passage of time, from the past (purplish) to the present (yellowish).

Objectives
The primary objective was to visualize research trends on physical activity related to mental health and wellbeing and to characterize the evolution of these trends using networks of co-cited references and networks of co-occurring keywords assigned to relevant publications.
The secondary objective was to provide clinicians, researchers, and policymakers with specific units of measure of the research network (countries, institutions, authors, and journals) and to identify emerging trends and limitations.

Data analysis
Two different software tools for constructing bibliometric networks were used: Bibliometrix R package (3.1.4) (25) and CiteSpace (version 5.8.R4) (21). Bibliometric outcomes included citation counts, co-citations, and co-occurrences. A co-citation count is defined as the frequency with which two published articles are cited together by subsequently published articles (26). Co-occurrence networks are based on how frequently two entities, such as keywords, appear in the same articles.
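The two counting schemes defined above can be illustrated with a minimal Python sketch. It is an illustration only and does not reproduce the procedures implemented in CiteSpace or Bibliometrix; the function name `pair_counts` and the toy reference lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_counts(item_lists):
    """Count how often each unordered pair of items appears in the same list."""
    counts = Counter()
    for items in item_lists:
        # De-duplicate and sort so that (A, B) and (B, A) map to the same key.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


# Hypothetical reference lists of three citing articles.
reference_lists = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C"],
]
cocitations = pair_counts(reference_lists)
print(cocitations[("A", "B")])  # number of citing articles that cite both A and B
```

Applied to the reference lists of citing articles, the function yields co-citation counts; applied to the keyword lists of articles, the same function yields keyword co-occurrence counts.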
The Bibliometrix R package was used for the analysis of publication outputs and the trend of growth. CiteSpace was used for the study of several types of networks, namely, networks of co-cited references, networks of co-cited authors, and co-occurrence networks of authors, keywords, institutions, and countries. For instance, the co-cited (authors') institutions network accounts for the cooperation between two or more institutions, which reflects the cooperation between authors and the influence networks.
CiteSpace produces a variety of metrics of significance: temporal metrics such as citation burstness; structural metrics such as betweenness centrality, modularity, and silhouette score; and a combination of both, namely, the sigma metric. The betweenness centrality of a node measures the fraction of shortest paths in an underlying network passing through the node (27). The burstness of the frequency of an entity over time indicates a specific duration of a surge of the frequency (28). The sigma indicator combines structural and temporal properties of a node, namely, its betweenness centrality and citation burst (29). Modularity (the Q score) measures the quality of dividing a network into clusters, and the silhouette score (the S score) of a cluster measures the quality of a clustering configuration (30). The Q score ranges from 0 to +1. The cluster structure is considered significant with a Q score >0.3, and higher values indicate a well-structured network. The S score ranges from −1 to +1. If the S score is >0.3, 0.5, or 0.7, the network is considered homogeneous, reasonable, or highly credible, respectively. In addition, we conducted a structural variation analysis that focuses on novel boundary-spanning connections to detect transformative papers ranked on their divergence modularity (31). These transformative papers can potentially change the existing structure of knowledge. We extracted cluster labels from keywords associated with articles that are responsible for the formation of a cluster, selected by the likelihood ratio test (p < 0.001). Each cluster was closely inspected, and cluster labels were eventually improved based on the authors' judgment.
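Two of the structural metrics described above, betweenness centrality and modularity, can be sketched on a toy network as follows. This is a minimal illustration under stated assumptions, not the CiteSpace implementation: the graph is unweighted and undirected, betweenness is computed with the standard Brandes algorithm, and Q uses the usual Newman formulation. The function names, the toy edges, and the community assignment are all hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Return an undirected adjacency mapping from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness_centrality(adj):
    """Normalized betweenness for an unweighted, undirected graph (Brandes)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search counting shortest paths from the source.
        n_paths = {v: 0 for v in adj}
        n_paths[source] = 1
        dist = {source: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to the source.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(adj)
    # Each unordered pair is visited from both ends, hence (n - 1)(n - 2).
    norm = (n - 1) * (n - 2)
    return {v: (s / norm if norm else 0.0) for v, s in score.items()}


def modularity(edges, community):
    """Newman modularity Q of a partition of an unweighted, undirected graph."""
    m = len(edges)
    inside, degree = {}, {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


# Hypothetical toy network: two triangles joined by a single bridge (c, d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}

centrality = betweenness_centrality(build_adjacency(edges))
q_score = modularity(edges, community)
print(centrality["c"], q_score)
```

In this toy network, the two bridge nodes carry all shortest paths between the triangles and therefore have the highest betweenness, and the two-triangle partition gives a Q score above the 0.3 threshold mentioned above.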
The second level of the data filtering process was applied during the generation of networks within each dataset (e.g., most cited reference) in order to detect duplicates, references without authors, or any non-relevant unit of measure that was excluded (e.g., DSM reference; CIM-10) or merged (e.g., author Motl RW and Motl W Robert).
The g-index was used for all calculations. This index gives credit to lowly cited or non-cited papers as well as to highly cited papers, thus partially alleviating the bias related to highly cited papers seen with the h-index (32). CiteSpace general parameters are reported in Supplementary Information 1.
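As a minimal sketch of the difference between the two indices, the following Python example computes both from a list of citation counts, using their standard definitions. It does not reproduce how CiteSpace applies the g-index internally; the function names and the citation counts are hypothetical.

```python
def g_index(citations):
    """Largest g such that the g most cited papers total at least g**2 citations."""
    total, g = 0, 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
    return h


# Hypothetical citation counts for five papers.
counts = [10, 5, 3, 1, 0]
print(g_index(counts), h_index(counts))
```

Because the g-index works on cumulative citations, the surplus citations of the most cited paper allow a lowly cited paper to be counted, which the h-index would ignore.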

Results
Analysis of publication outputs, major journals, and growth trend prediction
We report a flowchart detailing the 56,442 documents retrieved from the Web of Science Citation Index Expanded and the different steps of our scientometric study: identification and screening of studies, software analyses, and expert review and interpretation (Supplementary Figure 1).
Among the retrieved documents, 1,089 documents were excluded, and 55,353 documents encompassing 1,306,828 references were retained (47,105 articles; 6,671 reviews; 564 editorial material; 1,013 proceeding papers). The data filtering process consisted of the inspection of each of the 604 highly cited papers, editorial material, and proceeding papers and the inspection of 10% randomly selected titles of the retrieved documents. Only 4% (n = 224 articles) were not relevant (Supplementary Figure 1).
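The precision of the 4% estimate can be gauged with a simple confidence interval for a proportion. The sketch below uses a normal (Wald) approximation, which is an assumption of this illustration and is not stated as the method used in the study. The exact sample size is also not reported, so it is assumed here to be 10% of the 55,353 retained documents; the function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(n_events, n_sample, z=1.96):
    """Proportion with a normal-approximation confidence interval (z = 1.96 for 95%)."""
    p = n_events / n_sample
    half_width = z * math.sqrt(p * (1 - p) / n_sample)
    return p, max(0.0, p - half_width), p + half_width


# Assumption: the inspected sample is 10% of the 55,353 retained documents.
sample_size = round(0.10 * 55353)
# 224 non-relevant articles, as reported in the text.
proportion, lower, upper = wald_interval(224, sample_size)
print(f"{proportion:.3f} ({lower:.3f} to {upper:.3f})")
```

Under these assumptions, the upper bound of the interval stays below the 5% tolerance for non-relevant papers set in the methods.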
The retained 55,353 documents were published between 1905 and May 2022 in 24 different languages (95.1% of articles in English). Annual scientific production was still exponential in 2022, with a mean annual growth rate of 6.8% between 1989 (n = 17) and 2022 (n = 5,604) (Supplementary Figures 2, 3).
The first article identified was an article by Franz SI and Hamilton GV on "the effects of exercise upon the retardation in conditions of depression" published in the American Journal of Insanity (33).

Analysis of co-citation reference: Clusters of research and most cited papers
Clusters of research
We constructed a synthesized network of co-cited references based on articles published during the 1988-2021 time period, as suggested by CiteSpace after the removal of empty time intervals to optimize time slicing (Figure 1). In this network, each node represents a highly co-cited article. We further explored the latest research trends with the extraction of co-citation networks. The 1988-2021 network identified 50 different clusters, with a single constellation of 26 clusters that reveals six distinct major trends of research on physical activity, namely cardiovascular disease; somatic disorders; cognitive decline/dementia; mental illness; athletes' performance, related health issues, and eating disorders; and the COVID-19 pandemic.
The link walkthrough over time between clusters based on burstness dynamics for the 1988-2021 network is available as a video on osf.io.

Most cited papers
We report the top 10 most co-cited references for the 1988-2021 time period in Table 1.

Analysis of co-occurrence of keywords
The use of author keywords can help identify the latest trends of research and choose search keywords for future reviews. The co-occurrence author keywords network for 1988-2021 is shown in Supplementary Figure 5, and the 2016-2021 time period is shown in Figure 2. In this network, each node is a highly co-occurring keyword. Both networks presented significant modularity and silhouette scores indicating credible clusters (Q = 0.3327, S = 0.6823 and Q = 0.3971, S = 0.6614 respectively).

Analysis of influence and co-operation network
Co-cited countries and co-cited institutions network
We produced the co-cited countries and co-cited institutions network (Figures 3A,B). Units of measure were authors' countries and authors' institutions. Significant modularity and silhouette scores were found (Q = 0.5321; S = 0.785).

Co-authorship, co-cited authors, and co-cited journals network
Our dataset includes 1,306,827 citations with an average of 31.85 citations per document. About 175,508 different authors were found, with an average of 3.17 authors and 5.76 coauthors per document in 4,193 different sources (e.g., books and journals) (Supplementary Figure 1).
We produced the co-authorship networks, which are social networks of researchers that reflect collaboration among them, with each node representing a different highly cited co-author (Supplementary Tables 2G,H). We further produced the co-cited author network for the last 5 years (2016-2021 network), which makes it possible to visualize "who cites whom" (Supplementary Figure 9). The burstness analysis revealed that the most co-cited first authors according to our datasets were Brooks SK, Wang CY, Ogden CL, Holmes EA, and Kandola SA (Supplementary Figure 10). We constructed the co-cited journal network, which retained 2,879 journals and showed the highly cited journals with high betweenness centrality (Supplementary Figure 11).
The top five highly cited journals were Archives of General Psychiatry (JAMA), The Lancet, PLOS ONE, Medicine and Science in Sports and Exercise, and the New England Journal of Medicine (Table 2). The burstness analysis further revealed that the five journals with the most recent onset of a citation burst were Frontiers in Psychology, The Lancet Psychiatry, International Journal of Environmental Research and Public Health, Nutrients, and Frontiers in Psychiatry (Supplementary Tables 2E,F).

Summary of the main findings
To the best of our knowledge, this is the first broad scientometric study that provides a comprehensive overview of the development of research on physical activity, mental health, and wellbeing.
We retained 55,353 documents, revealing an exponential growth of scientific production since the 1990s. The USA has held the leading position in research for decades; however, China has been very active since 2020, with an important burst of citations mainly due to publications on COVID-19. King's College London and Harvard University were the most influential institutions in terms of citation count. As a complement to existing reviews, this scientometric study reveals the influence and collaboration network, which could help researchers to identify major scholarly communities and establish potential research collaborations.

Identification of research trends
The six distinct major trends of research identified expose the history and the latest developments of research on physical activity, mental health, and wellbeing. The first major trend concerns physical activity and cardiovascular disease, a reminder that past and present intertwine. Early research focused on cardiovascular disease (35). The large body of evidence-synthesis research of the last decades, which mainly focused on the role of physical activity in the prevention and treatment of cardiovascular disease, started with guidelines for exercise testing (37,78) and continues to date with the consideration of cardiometabolic risk factors (39).
The extension of physical activity as prevention and treatment to other somatic disorders constituted the second major trend, making levels of physical activity a public health priority (41), and it continues to date (79). Another trend, which emerged after 2000, is the potential of physical activity for the prevention and treatment of dementia, with increased importance of evidence-synthesis studies (51,80,81).
Physical activity has also been explored as a potential intervention for the prevention and treatment of dementia. With regard to prevention, it has been demonstrated that physical activity is a protective factor against Alzheimer's disease and other types of dementia (82,83). As a treatment, a recent umbrella review pooled evidence from as many as 27 systematic reviews, including 18 with meta-analyses, overall reporting on 28,205 participants with mild cognitive impairment or dementia (84). The authors showed that mind-body interventions and mixed physical activity interventions had a small effect on global cognition, whereas resistance training had a large effect on global cognition in those with mild cognitive impairment. In people affected by dementia, a small effect of physical activity/exercise emerged in improving global cognition in Alzheimer's disease and all types of dementia. Importantly, physical activity/exercise also improved other outcomes not strictly related to cognition, including the risk of falls and neuropsychiatric symptoms.
Adjacently, a massive body of evidence has organized an important trend of research on the benefits of physical activity for both prevention and treatment of severe mental disorders, in particular depression (4,85,86) and schizophrenia (71,87). More recently, the evidence has focused on evidence-synthesis (10,74) and mental health/wellbeing (9).
Other lesser, although highly relevant, trends were also uncovered, such as the importance of physical activity for athletes' performance (88,89). While most of the research efforts in that area have focused on how to optimize performance in the context of professional athletics (90), perfectionism and excessive physical activity can also be a symptom of mental disorders, and of eating disorders in particular (58). This research trend now focuses on concussion and its consequence (chronic traumatic encephalopathy) (60).
Finally, a large body of research has focused on physical activity and COVID-19. Physical activity is a protective factor for COVID-19 complications (91). During COVID-19, research has also focused on restrictions and physical activity (63). Physical activity's relevance has also been shown to extend beyond the clinical sciences and to enter into dialogue with greenness and urban planning (66,92,93). Although various trends of research have developed over the last decades, we can identify two important gaps: the role of physical activity in the prevention or treatment of substance-use disorders, and the socioeconomic inequalities in access to physical exercise (94). A meta-review covering this subject (10) concluded that exercise can improve multiple mental health outcomes in those with alcohol-use disorders and substance-use disorders; however, further research is needed in these conditions, notably with the use of mind-body practices (95,96).

Strengths and limitations
This work has strengths and weaknesses. Strengths are its novel evidence-synthesis approach, which complements systematic reviews and meta-analyses by providing information on the evolution of research trends over time and the visualization of networks of authors, countries, and institutions, and which goes beyond common measures of academic bibliometric performance (i.e., impact factor, h-index, number of papers or citations). This novel research framework permits repeatable, reproducible, and comparable analysis with less bias than conventional time-consuming reviews that are vulnerable to biased coverage/selection.
Limitations are that, despite the quality check procedures outlined in the methods, this is not a systematic review. Furthermore, data were obtained only from WOSCC, which can limit the publications retrieved (94,97). Also, centrality and number of citations are not necessarily indicative of the quality of a work, as faulty publications can also be highly cited because they are frequently criticized (98). Finally, no reporting guidance is available for scientometric studies yet, given their recent introduction in the literature.

Conclusion
In conclusion, researchers have consistently focused on the role of physical activity in cardiovascular disease, other somatic disorders, dementia, mental disorders, athletes' performance, and eating disorders, and more recently in the COVID-19 pandemic, which clearly shows the role of physical activity as medicine across physical and mental disorders. More recently, the literature has focused on green space, urban planning, and behavior change, further expanding the multidisciplinary reach of physical activity. Taken together, our results strengthen and expand the specific and central role of physical activity in public health, calling for the systematic involvement of physical activity professionals as stakeholders in the public health decision-making process.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions
MSa and MSo: conceptualization and writing-original draft preparation. MSa, CC, and MSo: methodology, formal analysis, and investigation. MSa, CC, OS, JD, DV, JF, LS, BS, SR, FS, and MSo: writing-review and editing. CC and MSo: supervision. All authors contributed to the article and approved the submitted version.

Funding
Open access funding was provided by the University of Geneva.

Conflict of interest
Author OS has received advisory board honoraria from Otsuka, Lilly, Lundbeck, Sandoz, and Janssen in an institutional account for research and teaching. Author JF has received consultancy fees from Parachute BH for a separate project. Author BS is on the Editorial Board of Ageing Research Reviews, Mental Health and Physical Activity, the Journal of Evidence Based Medicine, and the Brazilian Journal of Psychiatry. Author BS has received an honorarium for a co-edited book on exercise and mental illness and for advisory work from ASICS & ParachuteBH for unrelated work. Author MSo has received honoraria from or has been a consultant for Angelini, Lundbeck, and Otsuka.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.