

Front. Hum. Neurosci., 30 September 2021
Sec. Brain Health and Clinical Neuroscience
Volume 15 - 2021

Machine Learning for Subtyping Concussion Using a Clustering Approach

Cirelle K. Rosenblatt1,2* Alexandra Harriss1 Aliya-Nur Babul3 Samuel A. Rosenblatt1
  • 1Advance Concussion Clinic Inc., Vancouver, BC, Canada
  • 2Division of Sport & Exercise Medicine, Department of Family Practice, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
  • 3Department of Astronomy, Columbia University, New York, NY, United States

Background: Concussion subtypes are typically organized into commonly affected symptom areas or a combination of affected systems, an approach that may be flawed by bias in conceptualization or the inherent limitations of interdisciplinary expertise.

Objective: The purpose of this study was to determine whether a bottom-up, unsupervised machine learning approach could more accurately support concussion subtyping.

Methods: Initial patient intake data, as well as objective outcome measures including the Patient-Reported Outcomes Measurement Information System (PROMIS), Dizziness Handicap Inventory (DHI), Pain Catastrophizing Scale (PCS), and Immediate Post-Concussion Assessment and Cognitive Testing Tool (ImPACT), were retrospectively extracted from the Advance Concussion Clinic's database. A correlation matrix and principal component analysis (PCA) were used to reduce the dimensionality of the dataset. Sklearn's agglomerative clustering algorithm was then applied, and the optimal number of clusters within the patient database was determined. Between-group comparisons among the formed clusters were performed using a Mann-Whitney U test.

Results: Two hundred seventy-five patients within the clinic's database were analyzed. Five distinct clusters emerged from the data when maximizing the Silhouette score (0.36) and minimizing the Davies-Bouldin score (0.83). The derived concussion subtypes demonstrated clinically distinct profiles, with statistically significant differences (p < 0.05) between all five clusters.

Conclusion: This machine learning approach enabled the identification and characterization of five distinct concussion subtypes, which were best understood according to levels of complexity, ranging from Extremely Complex to Minimally Complex. Understanding concussion in terms of Complexity, with the utilization of artificial intelligence, could provide a more accurate concussion classification or subtype approach, one that better reflects the true heterogeneity and complex system disruptions associated with mild traumatic brain injury.


Mild traumatic brain injury (mTBI) is a growing public healthcare concern and presents a substantial burden to patients, families and health care systems (Langer et al., 2020). The incidence of this silent epidemic has increased significantly over the past decade, and mTBI accounts for the majority of all reported traumatic brain injury cases (Rao et al., 2017). While the majority of patients recover within 3 months, up to 30% of patients experience persistent concussion symptoms, affecting their ability to return to school, work, and activities of daily living (Dennis et al., 2019; Permenter et al., 2021). Given the significant increase in concussion and the economic burden to health care systems, there is a need for effective and efficient evaluation of these injuries by healthcare professionals to implement accurate and timely management strategies.

Concussion is reflected in a variety of affected systems and areas of co-occurring disruption that require interdisciplinary management, an approach that is universally recommended in consensus statements and best-practice clinical guidelines (Collins et al., 2016; McCrory et al., 2017; Schneider et al., 2019). It is the heterogeneous nature of concussion, from its etiology and pathophysiology to its individual clinical presentation and variable recovery trajectories, that has made its management especially challenging for the clinician or primary care physician. Moreover, in regions or health care systems where interdisciplinary care falls outside insured or core medical services, translating concussion best practices into the application of the interdisciplinary treatment model represents a further barrier toward effective concussion management and meaningful progress in addressing the concussion epidemic.

Growing evidence of the complexity of concussion has given rise to the development of clinical subtypes, which have steady empirical support (Collins et al., 2014; Ellis et al., 2016, 2018; Kontos and Collins, 2018; Lumba-Brown et al., 2020) and are considered a valuable tool toward informing clinical decision making, treatment planning and the conceptualization of targeted rehabilitation pathways. Similar to Collins et al.'s (2014) early delineation of six distinct sport-related concussion subtypes based on patient-reported symptoms one week post-injury, subsequent subtypes, or post-concussion disorders, have been organized according to symptoms, impairments or a combination of affected systems.

However, these subtypes may be vulnerable to bias in their conceptualization (Langdon et al., 2020), or inherently limited by the application of discipline-specific expertise. Certainly, there is a disproportionate body of research on sport-related concussion, with notable gaps in our understanding of mTBI associated with motor vehicle accidents and other causes. Moreover, the complexity of concussion and its multi-factorial nature does not align with approaches that reduce the dimensionality of concussion to its basic units of impairment. A reductionist approach, one that isolates a single factor or combination of factors and assumes them to be the cause of injury or impairment (Hulme and Finch, 2015), may be useful in understanding causal, linear relationships (Bittencourt et al., 2016); however, the non-linear, multifaceted entity that is concussion may require more sophisticated methods to capture the complex determinants that influence outcomes. As Langer et al. (2020) suggested, such a model requires “testing in an overview of empirical evidence focusing on data driven clustering of symptoms” into concussion subtypes. Kontos and Collins (2018) identified the need for mapping symptom clusters across various domains in a way that quantifies symptom clusters of concussion as objective deficits in functional outcome domain measures.

Analyses that integrate Artificial Intelligence (AI) and embrace a systems approach to concussion, with inherent non-linearity and complex, dynamic interactions, may improve our ability to identify patterns of system disruption in such multidimensional injuries as concussion. Recent AI work has used machine learning to predict symptom resolution following sport-related concussion (Bergeron et al., 2019). Another study used a clustering approach on vestibular and balance diagnostic data and demonstrated two clinically distinct groups: patients with prominent vestibular disorders and others with no clear vestibular or balance impairment (Visscher et al., 2019). Specifically in consideration of the heterogeneity of concussion, and noting the ways in which this complicates research efforts, Kenzie et al. (2018) utilized causal loop diagramming to visualize relationships between concussion injury factors, including pathophysiology, deficits, symptom persistence and recovery trajectories.

Advances in machine learning can provide a distinct practical advantage to healthcare providers (Davenport and Kalakota, 2019). The advent of using machine learning in addressing complex healthcare questions is already underway, demonstrating promise in automating and assisting in clinical diagnoses and treatment response (Garcia-Vidal et al., 2019; Nakata, 2019; Stevens et al., 2019). Compared to traditional statistics, machine learning can identify non-linear relationships and high-order interactions between multiple variables, where traditional statistics fall short (Bergeron et al., 2019). Thus, machine learning could more appropriately address multifaceted and complicated human health conditions, such as concussion.

Hierarchical agglomerative clustering is an unsupervised, bottom-up machine learning approach that can identify subgroups from complex data and provide an opportunity to classify clinical patterns as well as create novel representations of clinical profiles (Hassan et al., 2020). There are substantial implications not only for research on concussion subtypes, but also for the utilization of machine learning to help in interpreting overall assessment results and summarizing multiple parameters to identify which features or combination of features discriminate between clinical profiles. This approach aligns with the heterogeneity, complexity, and diversity of concussion.

The purpose of this study was to determine whether a bottom-up, unsupervised machine learning approach could provide insight into different concussion clinical profiles by using objective outcome measures, including the Patient-Reported Outcomes Measurement Information System (PROMIS), Dizziness Handicap Inventory (DHI), Pain Catastrophizing Scale (PCS), and Immediate Post-Concussion Assessment and Cognitive Testing Tool (ImPACT). Utilization of self-administered, objective outcome measures, without the costly barriers of interdisciplinary concussion assessment, was prioritized for this study.

Materials and Methods


In this retrospective study design, a cluster analysis was used on 275 patients in the initial intake database of the Advance Concussion Clinic in Vancouver, British Columbia, Canada. Patients who were 18 years of age or older and who completed the PROMIS (Version 2.1), DHI, PCS and ImPACT between January 2018 and December 2019 were included in the analysis. Patients were excluded from the analysis if >5% of the patient's data were missing from the database, if the patient did not allow the use of their data for research purposes, or if the patient did not complete the objective outcome measures at their initial assessment.


The Advance Concussion Clinic initial intake database consists of patient-reported outcome measures obtained from PROMIS, DHI, PCS, and ImPACT. PROMIS is a set of person-centered measures that evaluates and monitors domains of pain interference, fatigue, depression, anxiety, sleep disturbance, cognitive concerns and abilities, physical function and social function on a one to five numeric rating scale, as well as an average pain intensity score on a zero to 10 numeric rating scale (Cella et al., 2010). The DHI is a 25-item form that evaluates a patient's self-perceived handicapping effects imposed by vestibular dysfunction (Jacobson and Newman, 1990). The PCS is a 13-item scale that assesses three aspects of catastrophizing: helplessness, rumination and magnification (Sullivan et al., 1995). ImPACT is a computerized neurocognitive testing measure, which consists of six cognitive test modules. These six modules are utilized to generate four composite scores: verbal memory, visual memory, visual-motor processing speed, and reaction time. A number of studies have reported on the test's validity and utility in identifying subtle cognitive changes associated with concussion (Schatz et al., 2006; Van Kampen et al., 2006; Broglio et al., 2007; Alsalaheen et al., 2016). The Acute Concussion Evaluation (Gioia et al., 2008) and Concussion Grading Scale (CGS) were also extracted from the initial assessment. The CGS is a 21-item self-report measure that records symptom severity using a 7-point Likert scale. Studies demonstrate that the CGS is able to discriminate between concussed and non-concussed patients (Schatz et al., 2006; Broglio et al., 2007).

Data Analysis

Data extracted from the Acute Concussion Evaluation were summarized using descriptive statistics in SPSS statistical software (IBM Corp, 2017) and are reported as mean and standard deviation. For the cluster analysis, participant data were imported into Python (Python, Wilmington, DE: Python Software Foundation) and analyses were carried out using the scikit-learn toolkit (Abraham et al., 2014). Patient data were stored in a single matrix, where each row represented one patient and each column one variable. As a first step to reduce multicollinearity, a correlation matrix was used and redundant features were discarded. Groups of features with at least moderate correlation (r > 0.55) were reduced to the element with the highest inter-patient variability (Schober et al., 2018). A principal component analysis (PCA) was then applied to reduce the dimensionality of the dataset (Jolliffe and Cadima, 2016). A PCA extracts the information needed to explain the highest amount of variance within the dataset and in turn produces a set of new orthogonal variables called principal components (Feeny et al., 2020). Following this, sklearn's agglomerative clustering algorithm was used on the determined principal components.
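The feature-reduction steps described above can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the column names are hypothetical, and several details are assumptions made for illustration, namely that correlation is compared by absolute value, that "inter-patient variability" is measured by sample variance, and that features are standardized before PCA (the text does not state whether scaling was applied).

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def prune_correlated(df: pd.DataFrame, threshold: float = 0.55) -> pd.DataFrame:
    """For each pair of features with |r| above the threshold, drop the
    lower-variance feature (assumption: variance stands in for variability)."""
    corr = df.corr().abs()
    variances = df.var()
    dropped = set()
    cols = list(df.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if a in dropped or b in dropped:
                continue
            if corr.loc[a, b] > threshold:
                dropped.add(a if variances[a] < variances[b] else b)
    return df.drop(columns=sorted(dropped))


# Synthetic stand-in for the patient matrix: one row per patient,
# one column per variable. Column names are hypothetical.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
synthetic = pd.DataFrame({
    "fatigue": base[:, 0],
    "fatigue_copy": 2.0 * base[:, 0] + rng.normal(scale=0.1, size=200),
    "anxiety": base[:, 1],
    "reaction_time": base[:, 2],
})

reduced = prune_correlated(synthetic)

# Standardizing before PCA is an assumption of this sketch.
scaled = StandardScaler().fit_transform(reduced)
components = PCA(n_components=2).fit_transform(scaled)
```

Here `components` plays the role of the two-dimensional PCA space that the clustering step operates on.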

Agglomerative clustering is a hierarchical, bottom-up clustering approach which groups objects into clusters based on their similarity. It is particularly suited for datasets where clusters may be unevenly shaped, of unequal size and unequally distributed across parameter space (Hirano et al., 2004). In this hierarchical cluster analysis, the model is initialized by assuming that each datapoint is an individual cluster (Feeny et al., 2020; Hassan et al., 2020). Similarity and linkage are the two parameters of greatest importance for agglomerative clustering. Similarity was calculated using the Euclidean distance between two data points in PCA space, while linkage was measured as the variance of each cluster (Ward linkage).

Ward linkage was chosen to minimize the variance of each cluster, ensuring that assessment scores of patients in each cluster were maximally uniform (Hirano et al., 2004). For the remaining parameters we used the sklearn defaults. At each iteration of the algorithm, the pair of clusters with the smallest linkage distance is merged. Under Ward linkage, this distance reflects the increase in within-cluster variance that would result from merging cluster x and cluster y. The algorithm will continue merging clusters until stopped or until there is only one large cluster. Since agglomerative clustering begins by assigning each datapoint its own cluster, very few assumptions are made about the data. This is one of the strengths of agglomerative clustering compared to other clustering methods, such as k-means (Hassan et al., 2020).
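As an illustration of the clustering step, the following sketch applies scikit-learn's `AgglomerativeClustering` with Ward linkage to synthetic two-dimensional points. The data, the number of clusters, and the variable names are assumptions chosen for the example and do not come from the study; all other parameters are left at their library defaults, and Ward linkage in scikit-learn operates on Euclidean distances.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Three well-separated synthetic groups standing in for patients in PCA space.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in centers])

# Every point starts as its own cluster; clusters are merged bottom-up
# until the requested number of clusters remains.
model = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = model.fit_predict(points)
```

Each entry of `labels` is the cluster index assigned to the corresponding row of `points`.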

To determine the optimal number of clusters, the Silhouette and Davies-Bouldin scores were used (Feeny et al., 2020; Hassan et al., 2020). The Silhouette score is used to determine the separation distance between the resulting clusters (i.e., it measures how similar an object is to its own cluster compared to other clusters). In contrast, the Davies-Bouldin score is a measure of similarity between clusters. In this analysis, we aimed to minimize the Davies-Bouldin score to ensure each cluster is maximally different from the other clusters, and to maximize the Silhouette score to maintain maximal uniformity between values within a given cluster. Figure 1 shows the Silhouette score as a function of the number of clusters. To improve the stability of the cluster outcomes, the agglomerative clustering model started with 100 patients to capture the initial cluster properties. Following this, an additional 100 patients were added to ensure that both the properties and profiles of each cluster did not change. Finally, all remaining participants were added to confirm cluster stability, in that profiles and properties remained unchanged.
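The selection of the number of clusters described above can be sketched as a scan over candidate cluster counts, scoring each with scikit-learn's `silhouette_score` and `davies_bouldin_score`. This is an illustrative sketch only: the helper name `score_cluster_counts`, the candidate range, and the synthetic data are assumptions and are not taken from the study.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import davies_bouldin_score, silhouette_score


def score_cluster_counts(points, candidate_counts):
    """Return {k: (silhouette, davies_bouldin)} for Ward clustering at each k."""
    results = {}
    for k in candidate_counts:
        labels = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(points)
        results[k] = (
            silhouette_score(points, labels),      # higher is better
            davies_bouldin_score(points, labels),  # lower is better
        )
    return results


# Four synthetic, well-separated groups in a two-dimensional space.
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 8.0]])
points = np.vstack([c + rng.normal(scale=0.6, size=(50, 2)) for c in centers])

scores = score_cluster_counts(points, range(2, 9))
best_by_silhouette = max(scores, key=lambda k: scores[k][0])
best_by_davies_bouldin = min(scores, key=lambda k: scores[k][1])
```

On clean synthetic data such as this, the two criteria tend to agree; on clinical data they may not, in which case the choice of cluster count becomes a judgment informed by both scores.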


Figure 1. The Silhouette score as a function of the number of clusters.

A Mann-Whitney U test was used to determine significant differences between each assessment in each cluster against the same assessment in another cluster (i.e., pain interference in cluster one against pain interference in cluster two). A Bonferroni correction was applied for multiple comparisons (Ranstam, 2016). For all analyses, statistical significance was assessed using p < 0.05 and a confidence interval of 95% (Table 1).
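A minimal sketch of the between-cluster comparison for a single assessment follows, using SciPy's `mannwhitneyu`. It is illustrative rather than a reproduction of the analysis: the helper name `pairwise_mann_whitney` and the synthetic scores are hypothetical, and the Bonferroni correction is assumed here to divide the significance level by the number of cluster pairs for that one assessment, since the text does not specify the family of comparisons.

```python
from itertools import combinations

import numpy as np
from scipy.stats import mannwhitneyu


def pairwise_mann_whitney(values, labels, alpha=0.05):
    """Compare one assessment between every pair of clusters.

    Returns a list of (cluster_a, cluster_b, p_value, significant) tuples,
    where significance uses a Bonferroni-corrected threshold.
    """
    values = np.asarray(values)
    labels = np.asarray(labels)
    pairs = list(combinations(np.unique(labels), 2))
    corrected_alpha = alpha / len(pairs)  # assumption: one family per assessment
    results = []
    for a, b in pairs:
        _, p_value = mannwhitneyu(
            values[labels == a], values[labels == b], alternative="two-sided"
        )
        results.append((a.item(), b.item(), float(p_value), bool(p_value < corrected_alpha)))
    return results


# Synthetic scores for one assessment across three hypothetical clusters.
rng = np.random.default_rng(3)
cluster_labels = np.repeat([0, 1, 2], 40)
pain_scores = np.concatenate([
    rng.normal(loc=40.0, scale=5.0, size=40),
    rng.normal(loc=60.0, scale=5.0, size=40),
    rng.normal(loc=80.0, scale=5.0, size=40),
])

comparisons = pairwise_mann_whitney(pain_scores, cluster_labels)
```

Repeating this for every retained assessment would give the full set of between-cluster comparisons; a stricter correction could instead count all assessments and pairs in one family.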


Table 1. Means and standard deviations of the five determined clusters.


Participant Characteristics

The mean age of participants was 37.84 years (SD = 12.50). Females represented 51% (n = 139) of the study population, males 48% (n = 133), and 1% (n = 3) did not disclose this information. The majority of concussions occurred following a motor vehicle accident (n = 148, 54%), and the remaining mechanisms of injury were sport (n = 70, 25%), falls (n = 22, 8%), and other (n = 35, 13%). Loss of consciousness (LOC) was reported in 73 participants (27%), with 202 (73%) denying or not reporting LOC. The mean total for the CGS was 52.8 ± 29.8. Time since injury and litigation status were not specifically calculated.

Outcome of Clustering Procedure

The correlation matrix identified redundant features (r > 0.55), which were subsequently removed from the analysis (Figure 2). PROMIS parameters retained after this step were Pain Interference, Fatigue, Depression, Anxiety, Sleep Disturbance, Physical Function and Mobility, Ability to Participate in Social Roles, and Cognitive Function, as well as Pain Intensity on a numeric score. From the DHI, the total score as well as the DHI Functional subgroup were retained. Lastly, ImPACT scores for reaction time, visual motor processing speed, and verbal memory were also used.


Figure 2. Results of correlation matrix between patient baseline characteristics and objective outcome measures including PROMIS, CGS, DHI, PCS, and ImPACT.

Principal component analysis further reduced the dimensionality of the dataset from 14 features to two features. Using sklearn's agglomerative clustering algorithm on the two principal components, the maximum Silhouette score was 0.36 and the minimum Davies-Bouldin score was 0.83. Therefore, it was determined that the optimal number of clusters was five (Figure 3). Since all patients in the study presented with a concussion, the range of values for any given assessment was not very large. In addition, all patients suffered from pain and movement-related symptoms, so it was expected that the Davies-Bouldin score would lie closer to 1, indicating similarities between clusters. The five clusters were clinically distinct (Table 1; Figure 3). Mann-Whitney U tests determined statistically significant between-cluster differences across all assessments in each of the clusters (p < 0.05).


Figure 3. Scatterplot of the formed five clusters.


Concussion only seems to grow in complexity as research and evidence advance our understanding of its heterogeneous nature. From its etiology and pathophysiology to its individual clinical presentation and variable recovery trajectories, clinical management has been especially challenging for the clinician or primary care physician. Research has varied widely in how it has dealt with this increasing complexity, with some seeming to lean into the complexity by highlighting systems approaches and recursive modeling (Schneider et al., 2019), while others argue that its heterogeneity is associated with mutually reinforcing biopsychosocial symptoms rather than a single entity (Iverson, 2019).

Clinically, management of the concussion patient has continued to grow in complexity as well, with increasing challenge for the primary care doctor or individual clinician, particularly in the context of westernized medical frameworks and reimbursement systems. Multiple risk factors have been identified for prolonged recovery from concussion, such as previous concussion history, developmental delay, headache history and psychiatric history, all of which require consideration together with a multitude of other variables that can influence outcomes. Clinical assessment of the concussion patient requires consideration of the above as well as potential interactions between them, all of which form the unique patterns of presentation in the individual concussion patient. AI is an ideal, and perhaps even necessary, partner that can best support our ability to manage the complexity of concussion, toward gaining a better understanding of patterns that may improve our ability to diagnose and treat mTBI.

This study evaluated the use of an unsupervised, bottom-up, machine learning approach to analyze the clinical profiles of patients attending a private concussion clinic. Utilization of objective outcome measures provided a unique opportunity to engage machine learning with various concussion and non-concussion specific evidence-based metrics. Findings revealed five statistically significant and distinct clusters, each with unique patterns of system disruption.

Across each of these five clusters, 14 features were retained, which combined outcomes from PROMIS, DHI, and ImPACT. These 14 features were: Pain Interference, Pain Intensity, Physical Function and Mobility, Anxiety, Depression, Sleep Disturbance, Ability to Participate in Social Roles, Cognitive Function, Fatigue, DHI Total, DHI Functional, Visual Motor Speed, Reaction Time, and Verbal Memory. Reliable and validated self-administered outcome measures were specifically utilized to ensure the accessibility and affordability of this approach.

The authors considered each of the five clusters, or concussion subtypes, to be best understood in terms of Complexity, with the following classification system suggested for each concussion subtype: Minimally Complex (Cluster 2), Mildly Complex (Cluster 3), Moderately Complex (Cluster 1), Highly Complex (Cluster 0), and Extremely Complex (Cluster 4) (Figure 4). Level of complexity was ranked according to the number and relative severity of arenas affected. “Complexity” was considered to better align with the nature and particular features of concussive injury, informing specific and treatable features that support clinical decision making. The language of “complexity” was furthermore preferred to “severity” to promote perceived control, in both practitioner and patient, while avoiding the catastrophizing that might arise with a severity approach.


Figure 4. Concussion subtypes (clusters) according to complexity.

Historically, concussion classification approaches have grouped subtypes according to particular symptoms or a combination of affected systems. From the concussion profiles suggested by Collins et al. (2014) to the most recent subtypes as proposed by Lumba-Brown et al. (2020) and Kontos et al. (2020), the approach to subtyping has been grounded in heuristics derived from clinical experience and patient presentation or reporting.

The utilization of AI offered an opportunity to engage machine learning pattern recognition to explore what clinical concussion data can reveal about concussion profiles. Indeed, this analysis revealed five unique and distinct subtypes with various patterns of system disruption across multiple symptom areas. This machine learning approach may be most appropriate for capturing the complexity and heterogeneity of the dynamic injury that is concussion and, in doing so, maximizing the potential of these subtypes to support and improve clinical decision making in concussion.

Reliable and validated self-administered outcome measures were specifically utilized to ensure the accessibility and affordability of this approach, offering a data-driven method toward enhancing clinical judgement and decision making without requiring commonly utilized clinically administered measures. The utilization of self-administered objective measures is of notable value in countries and regions that are more dispersed, with less access to the recommended interdisciplinary concussion team otherwise needed to assess or screen the range of clinical areas affected in concussion.

Moreover, the Coronavirus pandemic emphasized the need for virtual tools to support and inform concussion care when in-person options are not available. The self-administered nature of the tools utilized in this analysis, as well as the interdisciplinary value of the information provided, may well reduce costs or make reliable screening affordable for those in a fee-for-service environment, as well as for the health systems that do not include interdisciplinary care among their insured services.

Limitations and Future Directions

This clustering approach was selected to maximize the value of clinically meaningful data points that were derived from a large concussion population data set comprised of evidence-based outcome measures. This novel representation of concussion subtypes may help to guide interdisciplinary management, though further study is needed to assess its value in estimating recovery timelines and response to specific treatments. Future work should focus on evaluating a wider variety of clustering algorithms to determine if they reveal further insights into the data; however, the uniformity in the clusters generated by the agglomerative clustering algorithm provides a first insight into the interconnectedness of systems affected by concussion. Further study is needed to understand the clinical application of these profiles and to explore the utility and feasibility of these subtypes within a clinical setting.

As the subtypes derived were based only on patients specifically presenting with symptoms associated with concussion, certainty regarding its diagnostic value cannot be ascertained. Extension of this cluster analysis to include healthy controls would support the validation of the current clinical profiles and aid in a diagnostic context.

Notably, many concussion subtypes have been developed based upon sport related concussions (SRC), while this research offers evidence of concussion subtyping more broadly applicable to concussions associated with other causes that are non-sports related. While the current sample represents a majority associated with motor vehicle accidents, results may be more generalizable than research that has focused on SRC alone. Further research would be needed to confirm the generalizability of results between these, and perhaps other, distinct groups.


Classification of concussion according to subtypes may become a useful, if not essential, tool to support clinical diagnosis and treatment planning, and to assist concussion practitioners in directing and coordinating care and in evaluating progress toward recovery. It stands to reason that as our knowledge and understanding of the complexity surrounding concussion grows, a more comprehensive approach is warranted.

This study demonstrated the novel opportunity of using AI to gain insight into the complex clinical profiles of concussion. By systematically analyzing evidence-based metrics, five clusters emerged that were not only clinically distinct, but could be used to develop a novel view of concussion complexity that better approximates the true heterogeneity of the injury. In turn, this could more readily inform and support clinical decision making and interdisciplinary involvement. Given the interdisciplinary nature of concussion assessment, and the importance of the interdisciplinary teams' findings in treatment planning and in providing clearance (both to learn and to sport), AI's work in automating, if not the diagnosis of concussion, then its management, would be useful to most primary care physicians and other clinicians involved in that management.

Furthermore, healthcare providers without specific training in concussion, and those in parts of the world where this expertise may be largely unavailable, would also benefit. With the right collaboration and balance, the integration of AI with concussion subtyping can optimize otherwise elusive or incomplete concussion recoveries.

Data Availability Statement

The datasets presented in this article are not readily available because these data are clinical information obtained from a private healthcare database. Requests to access the datasets should be directed to

Ethics Statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

CKR conceived the investigation with contribution by SAR. CKR and ANB designed the study. CKR, ANB, and SAR prepared the data. ANB trained and tested the algorithm with contribution by CKR. CKR and ANB interpreted the results. AH contributed to interpretation. AH and CKR drafted the paper and then substantially revised the paper with contributions by ANB. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: AH, ANB, and SAR are employed by Advance Concussion Clinic. CKR is the Founder and Clinical Director of Advance Concussion Clinic.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.


The authors would like to thank Julia Dahlby, Paddy O'Flaherty, Benny Freedman, and Danit Macklin, along with the entire clinical and administrative staff at Advance Concussion Clinic.


Abraham, A., Pedregosa, F., Eickenberg, M., Gervais, P., Mueller, A., Kossaifi, J., et al. (2014). Machine learning for neuroimaging with scikit-learn. Front. Neuroinform. 8:14. doi: 10.3389/fninf.2014.00014


Alsalaheen, B., Stockdale, K., Pechumer, D., and Broglio, S. P. (2016). Validity of the Immediate Post Concussion Assessment and Cognitive Testing (ImPACT). Sports Med. 46, 1487–1501. doi: 10.1007/s40279-016-0532-y


Bergeron, M. F., Landset, S., Maugans, T. A., Williams, V. B., Collins, C. L., Wasserman, E. B., et al. (2019). Machine learning in modeling high school sport concussion symptom resolve. Med. Sci. Sports Exerc. 51, 1362–1371. doi: 10.1249/MSS.0000000000001903


Bittencourt, N. F. N., Meeuwisse, W. H., Mendonça, L. D., Nettel-Aguirre, A., Ocarino, J. M., and Fonseca, S. T. (2016). Complex systems approach for sports injuries: moving from risk factor identification to injury pattern recognition—narrative review and new concept. Br. J. Sports Med. 50, 1309–1314. doi: 10.1136/bjsports-2015-095850


Broglio, S. P., Macciocchi, S. N., and Ferrara, M. S. (2007). Sensitivity of the concussion assessment battery. Neurosurgery 60, 1050–1058. doi: 10.1227/01.NEU.0000255479.90999.C0


Cella, D., Riley, W., Stone, A., Rothrock, N., Reeve, B., Yount, S., et al. (2010). The Patient-Reported Outcomes Measurement Information System (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005–2008. J. Clin. Epidemiol. 63, 1179–1194. doi: 10.1016/j.jclinepi.2010.04.011


Collins, M. W., Kontos, A. P., Okonkwo, D. O., Almquist, J., Bailes, J., Barisa, M., et al. (2016). Statements of Agreement From the Targeted Evaluation and Active Management (TEAM) approaches to treating concussion meeting held in Pittsburgh, October 15-16, 2015. Neurosurgery 79, 912–929. doi: 10.1227/NEU.0000000000001447


Collins, M. W., Kontos, A. P., Reynolds, E., Murawski, C. D., and Fu, F. H. (2014). A comprehensive, targeted approach to the clinical care of athletes following sport-related concussion. Knee Surg. Sports Traumatol. Arthrosc. 22, 235–246. doi: 10.1007/s00167-013-2791-6

PubMed Abstract | CrossRef Full Text | Google Scholar

Davenport, T., and Kalakota, R. (2019). The potential for artificial intelligence in healthcare. Future Healthc J. 6, 94–98. doi: 10.7861/futurehosp.6-2-94

PubMed Abstract | CrossRef Full Text | Google Scholar

Dennis, J., Yengo-Kahn, A. M., Kirby, P., Solomon, G. S., Cox, N. J., and Zuckerman, S. L. (2019). Diagnostic algorithms to study post-concussion syndrome using electronic health records: validating a method to capture an important patient population. J. Neurotrauma 36, 2167–2177. doi: 10.1089/neu.2018.5916

PubMed Abstract | CrossRef Full Text | Google Scholar

Ellis, M. J., Leddy, J., Cordingley, D., and Willer, B. (2018). A physiological approach to assessment and rehabilitation of acute concussion in collegiate and professional athletes. Front. Neurol. 9:1115. doi: 10.3389/fneur.2018.01115

PubMed Abstract | CrossRef Full Text | Google Scholar

Ellis, M. J., Leddy, J., and Willer, B. (2016). Multi-disciplinary management of athletes with post-concussion syndrome: an evolving pathophysiological approach. Front. Neurol. 7:136. doi: 10.3389/fneur.2016.00136

PubMed Abstract | CrossRef Full Text | Google Scholar

Feeny, A. K., Chung, M. K., Madabhushi, A., Attia, Z. I., Cikes, M., Firouznia, M., et al. (2020). Artificial intelligence and machine learning in arrhythmias and cardiac electrophysiology. Circ Arrhythm. Electrophysiol. 13:e007952. doi: 10.1161/CIRCEP.119.007952

PubMed Abstract | CrossRef Full Text | Google Scholar

Garcia-Vidal, C., Sanjuan, G., Puerta-Alcalde, P., Moreno-García, E., and Soriano, A. (2019). Artificial intelligence to support clinical decision-making processes. EBioMed. 46, 27–29. doi: 10.1016/j.ebiom.2019.07.019

PubMed Abstract | CrossRef Full Text | Google Scholar

Gioia, G. A., Collins, M., and Isquith, P. K. (2008). Improving identification and diagnosis of mild traumatic brain injury with evidence: psychometric support for the acute concussion evaluation. J. Head Trauma Rehabil. 23, 230–242. doi: 10.1097/

PubMed Abstract | CrossRef Full Text | Google Scholar

Hassan, S. I., Samad, A., Ahmad, O., and Alam, A. (2020). Partitioning and hierarchical based clustering: a comparative empirical assessment on internal and external indices, accuracy, and time. Int. J. Inf. Tecnol. 12, 1377–1384. doi: 10.1007/s41870-019-00406-7

CrossRef Full Text | Google Scholar

Hirano, S., Sun, X., and Tsumoto, S. (2004). Comparison of clustering methods for clinical databases. Inf. Sci. 159, 155–165. doi: 10.1016/j.ins.2003.03.011

CrossRef Full Text | Google Scholar

Hulme, A., and Finch, C. F. (2015). From monocausality to systems thinking: a complementary and alternative conceptual approach for better understanding the development and prevention of sports injury. Inj. Epidemiol. 2:31. doi: 10.1186/s40621-015-0064-1

PubMed Abstract | CrossRef Full Text | Google Scholar

IBM Corp (2017). IBM SPSS Statistics for Windows. Armonk, NY: IBM Corp.

Iverson, G. L. (2019). Network analysis and precision rehabilitation for the post-concussion syndrome. Front. Neurol. 10:489. doi: 10.3389/fneur.2019.00489

PubMed Abstract | CrossRef Full Text | Google Scholar

Jacobson, G. P., and Newman, C. W. (1990). The development of the dizziness handicap inventory. Arch. Otolaryngol. Head Neck Surg. 116, 424–427. doi: 10.1001/archotol.1990.01870040046011

PubMed Abstract | CrossRef Full Text | Google Scholar

Jolliffe, I. T., and Cadima, J. (2016). Principal component analysis: a review and recent developments. Phil. Trans. R. Soc. A. 374:20150202. doi: 10.1098/rsta.2015.0202

PubMed Abstract | CrossRef Full Text | Google Scholar

Kenzie, E. S., Parks, E. L., Bigler, E. D., Wright, D. W., Lim, M. M., Chestnutt, J. C., et al. (2018). The dynamics of concussion: mapping pathophysiology, persistence, and recovery with causal-loop diagramming. Front. Neurol. 9:203. doi: 10.3389/fneur.2018.00203

PubMed Abstract | CrossRef Full Text | Google Scholar

Kontos, A. P., and Collins, M. W. (2018). Concussion: A Clinical Profile Approach to Assessment and Treatment. Washington, DC: American Psychological Association, 23–58.

Google Scholar

Kontos, A. P., Elbin, R. J., Trbovich, A., Womble, M., Said, A., Sumrok, V. F., et al. (2020). Concussion Clinical Profiles Screening (CP Screen) tool: preliminary evidence to inform a multidisciplinary approach. Neurosurgery 87, 348–356. doi: 10.1093/neuros/nyz545

PubMed Abstract | CrossRef Full Text | Google Scholar

Langdon, S., Königs, M., Adang, E. a. M. C., Goedhart, E., and Oosterlaan, J. (2020). Subtypes of sport-related concussion: a systematic review and meta-cluster analysis. Sports Med. 50, 1829–1842. doi: 10.1007/s40279-020-01321-9

PubMed Abstract | CrossRef Full Text | Google Scholar

Langer, L., Levy, C., and Bayley, M. (2020). Increasing incidence of concussion: true epidemic or better recognition? J. Head Trauma Rehabil. 35, E60–E66. doi: 10.1097/HTR.0000000000000503

PubMed Abstract | CrossRef Full Text | Google Scholar

Lumba-Brown, A., Teramoto, M., Bloom, O. J., Brody, D., Chesnutt, J., Clugston, J. R., et al. (2020). Concussion guidelines step 2: evidence for subtype classification. Neurosurgery 86, 2–13. doi: 10.1093/neuros/nyz332

PubMed Abstract | CrossRef Full Text | Google Scholar

McCrory, P., Meeuwisse, W., Dvorak, J., Aubry, M., Bailes, J., Broglio, S., et al. (2017). Consensus statement on concussion in sport—the 5th international conference on concussion in sport held in Berlin, October 2016. Br. J. Sports Med. 51, 838–847. doi: 10.1136/bjsports-2017-097699

PubMed Abstract | CrossRef Full Text | Google Scholar

Nakata, N. (2019). Recent technical development of artificial intelligence for diagnostic medical imaging. Jpn. J. Radiol. 37, 103–108. doi: 10.1007/s11604-018-0804-6

PubMed Abstract | CrossRef Full Text | Google Scholar

Permenter, C. M., Fernández-de Thomas, R. J., and Sherman, A. (2021). “Postconcussive syndrome,” in StatPearls (Treasure Island, FL: StatPearls Publishing), 1–11. Available online at: (accessed May 25, 2021).

Google Scholar

Ranstam, J. (2016). Multiple P-values and Bonferroni correction. Osteoarthr. Cartil. 24, 763–764. doi: 10.1016/j.joca.2016.01.008

PubMed Abstract | CrossRef Full Text | Google Scholar

Rao, D. P., McFaull, S., Thompson, W., and Jayaraman, G. C. (2017). Trends in self-reported traumatic brain injury among Canadians, 2005-2014: a repeated cross-sectional analysis. CMAJ Open 5, E301–E307. doi: 10.9778/cmajo.20160115

CrossRef Full Text | Google Scholar

Schatz, P., Pardini, J. E., Lovell, M. R., Collins, M. W., and Podell, K. (2006). Sensitivity and specificity of the ImPACT Test Battery for concussion in athletes. Arch. Clin. Neuropsychol. 21, 91–99. doi: 10.1016/j.acn.2005.08.001

PubMed Abstract | CrossRef Full Text | Google Scholar

Schneider, K. J., Emery, C. A., Black, A., Yeates, K. O., Debert, C. T., Lun, V., et al. (2019). Adapting the dynamic, recursive model of sport injury to concussion: an individualized approach to concussion prevention, detection, assessment, and treatment. J. Orthop. Sports Phys. Ther. 49, 799–810. doi: 10.2519/jospt.2019.8926

PubMed Abstract | CrossRef Full Text | Google Scholar

Schober, P., Boer, C., and Schwarte, L. A. (2018). Correlation coefficients: appropriate use and interpretation. Anesth. Analg. 126, 1763–1768. doi: 10.1213/ANE.0000000000002864

PubMed Abstract | CrossRef Full Text | Google Scholar

Stevens, E., Dixon, D. R., Novack, M. N., Granpeesheh, D., Smith, T., and Linstead, E. (2019). Identification and analysis of behavioral phenotypes in autism spectrum disorder via unsupervised machine learning. Int. J. Med. Inform. 129, 29–36. doi: 10.1016/j.ijmedinf.2019.05.006

PubMed Abstract | CrossRef Full Text | Google Scholar

Sullivan, M. J. L., Bishop, S. R., and Pivik, J. (1995). The pain catastrophizing scale: development and validation. Psychol. Assess. 7, 524–532. doi: 10.1037/1040-3590.7.4.524

PubMed Abstract | CrossRef Full Text | Google Scholar

Van Kampen, D. A., Lovell, M. R., Pardini, J. E., Collins, M. W., and Fu, F. H. (2006). The “Value Added” of neurocognitive testing after sports-related concussion. Am. J. Sports Med. 34, 1630–1635. doi: 10.1177/0363546506288677

PubMed Abstract | CrossRef Full Text | Google Scholar

Visscher, R. M. S., Feddermann-Demont, N., Romano, F., Straumann, D., and Bertolini, G. (2019). Artificial intelligence for understanding concussion: Retrospective cluster analysis on the balance and vestibular diagnostic data of concussion patients. PLoS ONE 14:e0214525. doi: 10.1371/journal.pone.0214525

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: concussion, artificial intelligence, cluster analysis, interdisciplinary, rehabilitation, mild traumatic brain injury, complexity

Citation: Rosenblatt CK, Harriss A, Babul A-N and Rosenblatt SA (2021) Machine Learning for Subtyping Concussion Using a Clustering Approach. Front. Hum. Neurosci. 15:716643. doi: 10.3389/fnhum.2021.716643

Received: 28 May 2021; Accepted: 31 August 2021;
Published: 30 September 2021.

Edited by:

Carol A. DeMatteo, McMaster University, Canada

Reviewed by:

Thomas Edward Doyle, McMaster University, Canada
Robert L. Kane, Self-Employed, Washington, DC, United States

Copyright © 2021 Rosenblatt, Harriss, Babul and Rosenblatt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Cirelle K. Rosenblatt

These authors share first authorship

Senior author