Performance and healthcare analysis in elite sports teams using artificial intelligence: a scoping review

Introduction In competitive sports, teams are increasingly relying on advanced systems for improved performance and results. This study reviews the literature on the role of artificial intelligence (AI) in managing these complexities and encouraging a system thinking shift. It found various AI applications, including performance enhancement, healthcare, technical and tactical support, talent identification, game prediction, business growth, and AI testing innovations. The main goal of the study was to assess research supporting performance and healthcare. Methods Systematic searches were conducted on databases such as Pubmed, Web of Sciences, and Scopus to find articles using AI to understand or improve sports team performance. Thirty-two studies were selected for review. Results The analysis shows that, of the thirty-two articles reviewed, fifteen focused on performance and seventeen on healthcare. Football (Soccer) was the most researched sport, making up 67% of studies. The revised studies comprised 2,823 professional athletes, with a gender split of 65.36% male and 34.64% female. Identified AI and non-AI methods mainly included Tree-based techniques (36%), Ada/XGBoost (19%), Neural Networks (9%), K-Nearest Neighbours (9%), Classical Regression Techniques (9%), and Support Vector Machines (6%). Conclusions This study highlights the increasing use of AI in managing sports-related healthcare and performance complexities. These findings aim to assist researchers, practitioners, and policymakers in developing practical applications and exploring future complex systems dynamics.


Introduction
The inception of artificial intelligence (AI) can be traced back to the mid-20th century, marked by the aspiration to create machines capable of emulating human intelligence (1).Over the decades, advancements in computational power and algorithmic sophistication, as well as the availability of large data sets, have propelled AI from theoretical frameworks to practical applications in diverse fields (2).
In the context of sports, one of the first applications of AI was biomechanics modelling (3) and its integration began to gain traction in the early 21st century, evolving from basic statistical analyses to complex predictive modelling and real-time decision-making systems.This progression has enabled sports scientists and professionals to harness AI for performance analysis, biomechanics, sports technique, strategic planning, performance analysis, and improving competitive edge in team sports (4).
During these last years, there has been a great evolution and applications of AI and ML in sports, and some authors have defined the key challenges for AI usage in elite sports, including correct data collection, the process of connecting AI and elite sports communities, the need to keep control in the hands of practitioners, maintaining the explainability of AI results, developing robust predictive models, and closing the loop defined as the need to provide feedback to the AI system to develop quality and self-adaptation (5).
When considering the implementation of AI in elite sports teams, it is critical to successfully address the obstacles associated with human-centered activities such as privacy protection, ethical design, adherence to human principles, governance and oversight, and the preservation of human cognitive abilities (6).

Elite sports teams, performance, and healthcare
Defining elite sports teams is not a trivial task.Some authors claim that it is a concept that undermines the external validity of high-performance requirements (7).This study focusses on clarifying and proposing some indicators to develop a correct definition development, including: age, competition level, league status, gender, international ranking, nationality, province/state, sport and success/achievements.
Performance in elite sports teams is a wide concept because many different aspects should be considered, such as technicaltactical and those related to physical factors (8).From the perspective of this research, the strength and conditioning of athletes have been considered, as well as physical and physiological aspects.Some authors have previously developed the physical and physiological profiles in team sports such as male football (9) or female football players (10).
Regarding physical training, key performance goals could be considered such as the conceptualization of load (11), the effects of the accumulated training and match load (12) and the ways to monitor it (13).The integration of new devices such as global positioning systems (GPS) during practise and competition has been very important (14), Key identifying performance markers is challenging, but we can quantify repeated sprint ability (15) injury impact on the team (16) and availability (17).Also crucial is the effect of the competition's arrangement with concatenated matches (18) and fixtures congestion (19).As a performance tool, AI could help to develop recommendations to determine player wellness, trip planning, and sleep management (20).
Healthcare may be conceptualised as a system that combines health management and performance coaching to provide comprehensive care for elite athletes and, eventually, elite sports teams, as opposed to merely ensuring the absence of disease and injury (21).Thus, psychological concerns are an additional component of this approach (22).

Complex systems and artificial intelligence approach
Elite sports teams are complex systems with interconnected parts that must be understood and adapted to dynamic, unexpected, different environments and integrate interdisciplinary staff.A pragmatic approach to data set organization can generate innovative insights in sports sciences and sports teams, guiding practitioners in training, competition, and team member well-being (23).
Some authors describe the constant advancement of sports technology and how AI and Machine Learning (ML) may improve each of these characteristics, injury rates and injury risk (22), athlete health and injuries prevention (25), and maximise sporting performance and athlete well-being (26).Using system thinking, other authors explain how to understand complex systems and their patterns to better understand how they affect sports coaching (27).
Expanding upon a previous investigation conducted in professional and nonprofessional team sports (28), this study describes the present healthcare and performance approaches in elite team sports utilising artificial intelligence.In this update, new domains, applications, or forms of AI use have been identified, as well as new research openings.

Theoretical framework for AI and ML learning strategies
Artificial Intelligence (AI) and Machine Learning (ML) encompass a broad spectrum of computational techniques designed to emulate intelligent behaviour and facilitate databased autonomous decision-making.The core objective of ML, a subset of AI, is to enable machines to learn from data, thereby improving their performance on a given task without being explicitly programmed for every scenario.
As will be described below, the most common AI learning strategies in the revised studies include: Tree-based techniques, AdaBoost/XGBoost, Neural Networks, K-Nearest Neighbors (KNN), Classical Regression Techniques, and Support Vector Machines (SVM).In what follows we briefly describe their main characteristics; a more in-depth description can be found in Turing's work (2).
Tree-based techniques, including decision trees and random forests, are useful for their interpretability and ability to handle nonlinear relationships.They are particularly useful for classification and regression tasks.
AdaBoost/XGBoost are boosting algorithms that base their utility on their robustness and efficiency in improving the accuracy of weak learners.They are particularly useful in handling bias-variance trade-offs in predictive models.
The adaptability of neural networks to model complex nonlinear relationships makes them a cornerstone of AI research, particularly in deep learning applications for image and speech recognition, among others.
K-Nearest Neighbors (KNN) shows its simplicity and effectiveness in classification tasks, relying on distance metrics to determine the closest training examples.
Classical Regression techniques remain fundamental for predictive modeling, offering a straightforward approach to understanding relationships between variables.
Finally, Support Vector Machines (SVMs) are suitable in highdimensional spaces and show their effectiveness in classification problems, especially when data are not linearly separable.

Objectives
This scoping review aims to provide a global perspective on the current state of the use of artificial intelligence in professional sports teams by: (a) Identifying the current specific techniques of AI and ML used in sports teams.(b) Studying the distribution of the selected topics such as sports performance and healthcare in elite sports teams.(c) Unveiling the sports that are developing these techniques and in which parameters.

Procedures
Our work aligns with those of other authors in healthcare practise and research and uses a scoping approach (29).A scoping review will allow us to summarise research findings from the scientific literature, as well as to identify possible research gaps in a field that needs further research in many aspects.We will use a methodological approach inspired by systematic review methodologies, as suggested by some authors (30).However, there are differences in aims and methods between systematic and scoping reviews, so we will follow the recommendations stated in PRISMA-ScR (31).
A systematic scientific literature search was conducted for this review using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) (31).The screening process, extraction process and critical evaluation were done independently by two reviewers: AAMM and MJDM.JLSR supervised all the processes and resolved disagreements to make the last decisions.The same authors carried out the quality evaluation.A web-based repository specifies the initial protocol 1 .

Querying technique
The Internet databases: Web of Science, Scopus and Pubmed were thoroughly searched from 2019 until the end of 2023 (last 5 years).The various search strings are presented in Table 1.The aim is to analyse the literature of the last five years to ascertain the current state of artificial intelligence applications that attempt to appreciate the intricacies of team sports performance.

Criteria for eligibility and the selection process
Two reviewers, AAMM and MJDM, conducted the screening, extraction, and critical assessment processes independently.
We created a thorough selection process to determine eligibility for our study, setting clear inclusion and exclusion criteria with a strong emphasis on the research topic.We started by classifying the titles and abstracts of publications according to their topical significance.When there was uncertainty about a study's significance, author JLSR was asked to make the ultimate decision.Here, we outline these conditions: The inclusion criteria were created to ensure that the chosen studies were relevant and of good quality.We initially focused on articles written in English to guarantee a widely accessible database.The works must be published as original, comprehensive research articles in peer-reviewed journals between 2019 and the end of 2023 to include up-to-date and pertinent research.We restricted our study to participants in team sports only, omitting those centred on individual sports, to ensure uniformity in our research setting.
The papers must primarily focus on study performance or healthcare in the sports realm to allow us to focus on specific areas of interest.Our interest is on research carried out with elite, professional, or high-level sports teams since these settings provide unique insights into the requirements and results of high performance.It was crucial that the AI methods or algorithms in the studies be well explained and detailed, as this is vital to understand the relevance and efficiency of the suggested solutions.
Exclusion criteria were set to eliminate studies that did not fit our specifications.We excluded research that did not utilise machine learning-based solutions, as our focus was on the implementation of AI in sports.To preserve an emphasis on primary and original research in elite sports teams, studies including mainly non-professional and/or underage players, as well as reviews and meta-analyses, were removed from the research.We excluded study protocols and articles with unavailable full texts to ensure that our review was based on comprehensive and accessible data.

Quality evaluation
The studies incorporated in the research utilize different observational methodologies: cohort, case-and-control, and cross-sectional.The STROBE checklist (32) was employed to assess the quality of the identified research publications and mitigate any biases (Table 2).

The main research AI technique or method: classification Topic classification and selection
Sports performance, sports healthcare, technical-tactical domain, talent identification, and business domain, among others, were among the domain-specific categories of the application that emerged during the search (Figure 1).The classification and comparison of documents was facilitated by this arrangement.The themes Performance and Healthcare were chosen among others to focus on the aim of the scoping review.
In the Performance section, topics such as the use of GPS for tracking, profiling and decision making, wellness control respecting the load, predicting performance from anthropometric and testing data and key performance indicators (KPI) were explored because of the search and selection results.In the Healthcare section, topics such as injury prevention and prediction with an emphasis on muscle injury processes from injury to return to play, the external load and the injuries, and the integration of psychological, blood biochemical and genetical factors were discussed.

AI or ML identified
The key AI learning strategies and methodologies were examined for this extensive analysis and then categorised as follows, depending on their appearance in the studies: Tree-based methods, Ada/XGBoost, Neural Network, K-Nearest Neighbours, Classical Regression Techniques, Support Vector Machine, Naïve-Bayes, Subgroup Discovery and Clustering Analysis were ordered by frequency of appearance.Then, the investigations were categorised into this fundamental learning strategy or method and the comprehensive method as part of the investigation into the complexities and procedures of artificial intelligence.
The metrics used to evaluate the model depended on the AI technique used.As a rule, to order the models, we designated the one that exhibited the most optimal performance when various techniques were used in the same study.

Metrics for evaluation
The evaluation of these AI learning strategies employs several metrics to assess their performance comprehensively: True Positive (TP), False Positive (FP), True Negative (TN), False Negative (FN).
Accuracy: Measures the proportion of correct predictions among the total number of cases examined.Proportion of TP and TN in all evaluated cases (see Equation 1).
Sensitivity (or Recall): Assesses the ratio of true positive predictions to the actual positives.Proportion of TP in all the cases that belong to this class (see Equation 2).
Specificity: proportion of TN in all cases that don't belong to this class (see Equation 3).
Precision: Evaluates the ratio of true positive predictions to the total positive predictions.Proportion of TP in all cases that have been classified as it (see Equation 4).
F1-score: Provides a harmonic mean of precision and recall, balancing the two metrics.A measure of a test's accuracy.It considers both the precision and the sensitivity (recall) of the test to compute the score.It is the harmonic mean of both parameters (see Equation 5).

Web of science Scopus Pubmed
(Elite OR Pro* OR "High Level") (Topic) and ("Team Sports" OR "Team Sport" OR Football OR Soccer OR Basketball OR Handball OR Volleyball OR Rugby OR "Australian Football" OR Baseball OR "American Football" OR "Ice Hockey") (Topic) and ("Artificial Intelligence" OR "Computational Intelligence" OR "Machine Intelligence" OR "Computer Reasoning" OR AI OR "Computer Vision System" OR "Machine Learning" OR "Transfer Learning" OR Complexity OR "Complex Model" OR "Big Data" OR "Data Mining" "Agent-Based Modelling" OR "Systems Thinking" OR "System Theory" OR "System Analyses" OR "Artificial Neural Network" OR Bayes OR "Bayesian Logistic" OR "Bayesian Networks" OR "Decision (TITLE-ABS-KEY ((elite OR pro* OR "High Level")) AND TITLE-ABS-KEY (("Team Sports" OR "Team Sport" OR football OR soccer OR basketball OR handball OR volleyball OR rugby OR "Australian Football" OR baseball OR "American Football" OR "Ice Hockey")) AND TITLE-ABS-KEY (("Artificial Intelligence" OR "Computational Intelligence" OR "Machine Intelligence" OR "Computer Reasoning" OR ai OR "Computer Vision System" OR "Machine Learning" OR "Transfer Learning" OR complexity OR "Complex Model" OR "Big Data" OR "Data Mining" "Agent-Based Modelling" OR "Systems Thinking" OR "System Theory" OR "System Analyses" OR "Artificial Neural Network" OR bayes OR "Bayesian Logistic" OR "Bayesian Networks" OR "Decision Tree Classifier" OR "Deep Learning" OR "learning algorithms" OR "Neural Network" OR "Random Forest" OR "Support Vector Machine")) AND Area Under the Receiver Operating Characteristic Curve (AUC-ROC): This metric gauges the model's ability to distinguish between classes.
The root mean square (RMS) measure in artificial intelligence studies refers to the square root of the average of the squares of a set of values.In the AI context, it is used to calculate the square root of the average of the squared errors between the values predicted by a model and the actual values of the data.
Root Mean Square Error (RMSE): Similarly to RMS, RMSE is the square root of the average of the squared differences between predicted values and actual values.It gives a measure of the magnitude of the error.
Logarithmic Loss (Log Loss): This metric measures the performance of a classification model where the prediction is a probability value between 0 and 1. Log loss increases as the predicted probability diverges from the actual label.

Results
The search found 1,315 articles as the initial selection of sources of evidence (Figure 2).After deleting duplicate articles, we obtained 1,076 research papers.After applying the inclusion and exclusion criteria we obtained 32 studies that were considered for full-text evaluation.
During the selection process, a rather curious discovery was made: we found that some studies (n = 21) corresponded to elite team competitions involving robots and not people.They were not included in our review.
Afterwards, some articles remained, and a deep reading revealed that the AI technique or the algorithms were not described, and applying the risk of bias threshold to the studies some of them were discarded.
The results summarizing the population and the AI or ML used in the selected studies can be found in Table 3.
Finally, Table 4 gives the results of individual sources of evidence, and shows the synthesis of results, summarizing and presenting the charting results as they relate to the review questions and objectives of the research.

Studies characteristics
Of the 32 studies identified, the majority are on football (soccer) (67%) (Figure 3) which might be justified by the fact that it is the most developed area of sport and business and where the most resources are invested to improve performance and achieve results.
A population of 2,823 professional athletes were examined in the 32 papers that were screened, of which 1,845 (or 65.36%) were men and 978 (34.64%) were women.Thus, the actual distribution of papers is as follows: 26 pertain to studies on men and 3 to studies on women, with two papers containing a mixed population.Lastly, one study was identified as subject-free and match-centered, and it was not accounted for.
The sports levels seen are those filtered by the PRISMA-ScR search, and they are adults from high-level senior teams who develop in professional leagues.
An analysis of the application domains within the research teams reveals that the majority of studies focus on enhancing healthcare (n = 17) through the prevention of injuries and, secondarily, performance (n = 15) improvement.In this regard, AI is used or sought to assist in this endeavour.

Main artificial intelligence (AI) technique or method in team sports
In the 32 selected articles, it can be observed that the most frequently used AI-based and non-AI-methods for data processing are (Figure 4): Tree-based techniques (36%), Ada/ XGBoost (19%), Neural Networks (9%), K-Nearest Neighbours (9%) Classical Regression Techniques (9%) and Support Vector Machines (6%).A higher TL was reported after the games and during match day (MD)-5 and MD-4, while a higher WI was recorded on the following days (MD-6, MD-4, and MD-3, respectively).A significant correlation was reported between daily TL (TLMDi) and WI measured the day after (WIMDi + 1) (r = .72,P < .001).Additionally, a similar weekly pattern seems to be repeating itself throughout the season in both TL and WI.Nevertheless, the higher accuracy of ordinal regression (39% [2%) compared with the results obtained by baseline B1 (21% [1%) demonstrated that the machine learning approach used in this study can predict the WI according to the TL performed the day before (MD < i).Machine learning models used individualised normalised variables to predict whether players will perform at 100%, 95%, or 90% of their average physical performance in a match.Models were built using Random Forest and Decision Tree.Simple Naïve Bayes was used as a baseline model to demonstrate the superiority of tree-based algorithms.Machine learning method Random Forest and power category variable energy expenditure were most accurate.The combination of Random Forest and power category energy expenditure predicted performance and underperformance after 15 min in a match with precision of 0.91, 0.88, and 0.92 for 100%, 95%, and 90% thresholds, respectively.To conclude, early match predictions can be made about player physical performance.These findings can help elite soccer coaches make better substitution decisions.Various classification machine-learning algorithms that performed best on external and internal loads features were compared using standard performance metrics such as accuracy, precision, recall and the area under the receiver operator characteristic curve.In particular, tree-based algorithms based on non-linear models with an important interpretation aspect were privileged as they can help to understand internal and external load features impact on injury risk The results showed that the radial-basis function neural network outperformed the other models and was capable of predicting the studied types of athletic performance with R 2 scores between 0.86 and 0.97.Finally, significant factors influencing predicted performance were determined by retraining the superior model.This is one of the first studies using machine learning in sport sciences for handball players, and the results are encouraging for future studies.
( It should be noted that AI techniques are normally combined with classical statistical methods to complement and help reveal information due to the complexity of the relationships sought in the studies. The synthesis of results (Table 4), has been evaluated and showed the diverse methodologies utilized to assess each type of artificial intelligence applied, including the Gradient Boosted Regression Trees (GBRT) model, the Area Under the Receiver Operating Characteristic (AUC-ROC) curve, Key Performance Indicators (KPI), Uniform Manifold Approximation and Projection (UMAP), Kernel Principal Component Analysis (k-PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), Ordinary Least Squares (OLS) regression, Root Mean Square Error (RMSE), Alternating Decision Trees (ADTree), and Decision Trees (DT).
This comprehensive evaluation is critical in understanding the strengths and limitations of each AI methodology within the context of sports performance and healthcare.By scrutinizing how each AI technique contributes to the analysis and prediction of athletic performance and injury prevention, we can discern the most effective tools for specific applications.
For instance, the GBRT model, known for its predictive accuracy in regression and classification problems, is assessed for its efficacy in predicting injury risk and performance outcomes (60).The AUC-ROC curve, a measure of the ability of a classifier to distinguish between classes, is used to evaluate the performance of predictive models in injury severity and returnto-play predictions (45,52,64).
KPIs, on the other hand, are analyzed for their role in quantifying physical demands and wellness metrics, providing a tangible measure of athlete performance and health.Dimensionality reduction techniques such as UMAP, k-PCA, and t-SNE are evaluated for their effectiveness in visualizing highdimensional data, aiding in the identification of patterns and relationships that are not immediately apparent (36).
OLS regression and RMSE are scrutinized for their application in predictive modeling, assessing their accuracy in forecasting outcomes based on linear relationships (43).Lastly, the ADTree and DT methodologies are evaluated for their utility in classifying and predicting outcomes based on hierarchical decision-making processes (52,64).
By valuing the different methodologies for assessing each type of artificial intelligence applied, our synthesis of results not only highlights the versatility and potential of AI in sports science but also guides future research and application towards the most effective and efficient tools for enhancing athletic performance and healthcare.

Research topics
When reviewing the articles selected for our study, it became evident that the research topics could be broadly categorized into two principal domains: Performance and Healthcare.This bifurcation not only aids in a structured analysis but also aligns with the overarching aim of enhancing athletic performance while ensuring the well-being of the athletes.
In the domain of Performance, our investigation revealed intriguing insights into the utilization of GPS data for tracking, profiling, and decision-making purposes.The ability to monitor athletes' movements in real-time has revolutionized the way performance is analyzed, offering a granular view of their capabilities and areas for improvement.Furthermore, the aspect of Wellness Control emerged as a critical factor.By monitoring athletes' wellness, we can discern its direct impact on performance levels.Here, AIdriven insights have provided new lens through which wellness can be quantified and optimized.
Predicting future performance from anthropometric data and performance tests has shown promising results.This approach underscores the importance of individual physical characteristics and their correlation with performance outcomes.Additionally, the utilization of Key Performance Indicators (KPIs) through AI has provided a nuanced understanding of the physical demands placed on athletes during training and matches.This knowledge is pivotal in tailoring training programs that maximize performance while minimizing the risk of injury.
Turning our attention to Healthcare, the research delved into Injury Prevention and Prediction.Through a multifactorial analysis, we have begun to unravel the complex web of factors contributing to injury risk among elite football players.Predictive modeling has further enhanced our ability to forecast injury severity, classify muscle injuries, and predict return-to-play timelines post-hamstring injuries with greater accuracy.
The correlation between External Load Data and injury risk, along with fatigue assessment, has shed light on the delicate balance between training load and athlete well-being.By monitoring external load, we can better understand its relationship with injury risk and athlete fatigue, aiding in the development of more informed training and recovery protocols.
Moreover, the integration of Psychological and Physiological Factors for injury risk screening has highlighted the multifaceted Global Positioning System and 30 from accelerometers) and the internal load derived from ratings of perceived exertion were collected for each player and each session and match.

Football
The ML models predicted HCE severity with an average area under the receiver operating characteristic curve of 0.73 for male and 0.70 for mixed datasets.The injury mechanism and pre-injury incident were most predictive.The mechanisms "head-to-head" and "kneeto-head" were significantly linked with severity in the male dataset (P = 0.0244) but not in the mixed sample (P = 0.1113).
P injuries.Recognizing that both mental and physical states play a critical role in the occurrence of injuries has led to more holistic approaches to injury prevention.Lastly, the innovative integration of Blood Biochemical Markers and Genetics for personalized injury risk assessment marks a significant leap towards customized healthcare in sports.This approach enables a deeper understanding of each athlete's unique physiological makeup, paving the way for personalized injury prevention strategies.In summary, the thematic analysis of the selected articles has not only enriched our understanding of the intricacies involved in sports performance and healthcare but also highlighted the potential of technology and data analytics in pushing the boundaries of what is possible in sports science.

Discussion
The primary objective of this research is to provide a global perspective on the state of the art in terms of the use of artificial intelligence in various areas of advisory services.These aspects include topics such as sports performance and healthcare.As a brief methodology recap, it has to be emphasized how the thematic analysis led to the identification of the discussed topics.Some authors have previously described how team sports can benefit from these automated or artificial intelligence technologies (28).Historically, research has been carried out employing a scientific framework and methodology founded on statistical investigations.Nevertheless, these foundational statistical studies are being progressively helped and superseded by more intricate models and methodologies derived from artificial intelligence.This could improve the precision with which these expert models or their outputs may be implemented in the actual world.AI is normally integrated with statistical methods to enhance information, injury prediction, and player performance monitoring.
Given the current state of the sports framework, the complexity of the data processing methods involved, and the intrinsic properties of the data, it is critical to form horizontal and multidisciplinary teams to maximise the potential of artificial intelligence and machine learning technologies.
One area of investigation that could not be detected with our search strings is cardiological tests such as electrocardiograms utilised in team sports, probably due to the variations in the papers' formulations.Certainly, some authors have studied the gaps and exerted considerable effort in this domain (65).

Performance
There are numerous references in which AI has been applied to this facet of sports.An abundance of data has been collected since the implementation of tracking technologies like the Global Positioning System (GPS) or Global Navigation Satellite System (GNSS).Additional functionalities have been incorporated into

GPS data tracking, profiling and decision-making
GPS devices are extremely useful since they enable the investigation of a vast array of physical concepts in order to precisely define what, when, and where it occurs during matches and training.Certain authors employed this methodology to assess player performance during training and matches within the setting of Serie A. Their research utilised machine learning models such as XGBoost and Decision Trees to illustrate the progressive improvement in the accuracy of XGBoost as the season progressed.By employing realworld scenarios and cross-validation techniques, they gained valuable insights into the dynamics of performance (47).Similarly, additional researchers conducted a study of clustering and performance profiling utilising GPS data and 38 male football players, respectively, in order to identify unique performance profiles.The incorporation of machine learning techniques in this study enhances comprehension of the diverse performance attributes exhibited by players, hence facilitating the development of customised training and optimization approaches (48).In an effort to forecast individual acceleration-velocity profiles in real-world scenarios utilising GNSS readings, some researchers sought to further the individualization of acceleration-velocity profiling in the context of professional male footballers.The development of multivariate models, which incorporated long short-term memory neural networks and time series forecasting, highlighted the capacity of machine learning to provide individualised performance insights.A model description and multivariate approaches were built with the following techniques in mind: time series forecasting, ridge regularisation, and long short-term memory neural networks (46).Finally, match performance prediction and substitution decision support are critical in team sports.To this end, some authors applied machine learning models such as Random Forest and Decision Trees to in-match position tracking data from 302 competitive professional soccer matches to forecast player performance levels.The research, which utilised data acquired retrospectively, illustrated the capability of early match forecasts to assist coaches in making well-informed substitute judgments (53).

Wellness related to load control
Developing prediction models for the wellbeing of elite football players was a subject that was expounded upon by numerous authors.An association between well-being indicators and training load was determined by analysing 28 sub-elite football players over the course of the 2017-2018 season.By utilising a machine learning methodology, they successfully predicted wellbeing markers such as the training load completed the day before with a noteworthy ordinal regression accuracy of 39% (2%).This demonstrates how machine learning has the ability to predict the well-being of football players (51).In volleyball, some authors used XGBoost, random forest regression, and subgroup discovery to study male national volleyball teams' well-being, RPE, and readiness using questionnaires.The study examines defensive and attacking game phases and shows how machine learning can analyses volleyball performance patterns (44).
Professional football players' fatigue prediction utilising Random Forest, Lightgbm, and gradient Boosting Regressor machine learning models is intriguing.FatigueNet accurately predicted Rating of Perceived Exertion (RPE), demonstrating the importance of machine learning in player tiredness prediction (49).Other writers used FatigueNet and stressed the importance of perceived exertion rating in prediction.Machine learning algorithms like Random Forest and the "FatigueNet" model were used to study movement characteristics and well-being.They stressed the efficacy of machine learning in predicting well-being, providing valuable insights for player management and performance improvement (50).Gradient-boosted regression trees (GBRT) were used to create predictive models for individual wellness variables in 26 professional male football players for RPE prediction.The GBRT model outperformed baseline techniques in predicting future wellbeing using acute and cumulative load indicators (60).Following the 2015/2016 study of professional soccer players' RPE and training load, this study examined the effects of training workloads on perceived exertion and training load.The research used RPE, S-RPE, and GPS to show that the previous week workloads strongly affected subsequent perceived effort and training load.The study emphasises machine learning's usefulness in understanding the complex links between external tasks and subjective player experiences (61).
In summary, the use of AI and ML in football performance modelling has changed sports performance research.Multiple studies highlight the adaptability of these technologies, from forecasting sub-elite football players' well-being to studying soccer training practise effects.These improvements affect volleyball, handball, and football.

Predicting performance from anthropometric and testing
Other authors have used radial-basis function neural networks to predict athletic performance in women's handball using anthropometric metrics, power tests, and jump tests from 59 female players.A total of 23 anthropometric parameters, power tests (Wingate), and jump tests (CMJs) were combined.The study demonstrates the potential of machine learning to predict handball players' performance, with R 2 scores ranging from 0.86 to 0.97 (55).In a study of elite male futsal players, Bayesian networks were used to explore how neuromuscular performance affects dynamic postural control.The study showed that Bayesian networks using the Tabu search algorithm may capture the interaction of performance characteristics, providing futsal players with significant insights.Dynamic postural control (y-balance), isokinetic (concentric and eccentric) knee flexion and extension, isometric hip abduction and adduction, lower limb joint ROM, and core stability were measured as the features were chosen.The dominant (AUC = 0.899) and non-dominant (AUC = 0.879) legs of the BNs generated using correlation attribute evaluator and χ 2 had the highest assessment requirements (AUC) (64).

Key performance indicators (KPI): AI helps to understand physical demands on training and matches
In football, some authors used experimental randomised controlled trials to test the effects of various training drills on soccer performance during matches in different field sizes.They used decision tree induction to determine cut-off point values for different parameters.The research uses machine learning to understand drills' physical demands and develop customised training methods (57).In order to improve KPI in Australian football, some authors have developed spectral features and classification using Random Forest Models to evaluate segment similarity and classify spectral characteristics.The 37 Australian football players study showed that spectral entropy and skewness are important in distinguishing skilled output and that machine learning may be used to tailor training specificity to conditions.Offensive and defensive involvements were the lowest categorization features, suggesting match conditions affect skilful performance.The methodology may be used to compare training specificity to matches and build match rotation methods (62).Finally, additional authors used generalised estimating equations (GEE) and regression decision trees to evaluate performance indicators from the Australian Football League Women's (AFLW) seasons to reveal crucial aspects not found in conventional statistics.The study showed that team differentials and athlete percentiles can describe quarter outcomes, demonstrating the power of machine learning to analyse performance (63).

Healthcare
Sport is increasingly emphasising health, which is more than just the absence of disease or injury.Anyway, most AI-based solution research focuses on harm prevention.

Injury prevention and prediction
The majority of studies were focused on predicting non-contact lower-body injuries in male professional players.One of them employed three decision-making methods, with the XGBoost algorithm showing the most promising results, achieving high precision, recall, and F1-score (35).Blood samples were utilised to customise machine learning (ML) models for injury prediction in an intriguing study involving eighteen male professional soccer players.By including blood parameters and GPS-measured exterior workloads, the research study achieved an accuracy of 63%, outperforming models that exclusively relied on the features of the training workloads (33).An additional body of research investigated the prediction of injuries among professional football players, with a particular emphasis on the implementation of screening data within a gradient-boosted model.Incorporating 112 adult male football players, the research exhibited encouraging cross-validated performance and proposed applicability to novel situations through the implementation of gradient boosting with ROSE upsampling in a leave-one-out cross-validation scheme for data processing (34).
Other writers used dimension reduction to assess elite female soccer players' injury signs.They compared UMAP (Uniform Manifold Approximation and Projection) to non-linear kernel principal component analysis (k-PCA) and t-distributed stochastic neighbour embedding for damage pattern detection (t-SNE).UMAP identified damage indicators using grid search, suggesting it may be better in this situation (36).
In an interesting and broad analysis of 791 female elite handball and soccer players from 2007 to 2015, researchers investigated different machine learning algorithms for predicting anterior cruciate ligament (ACL) injuries.Using a robust technique that addressed chance and random changes, the study indicated that the ideal linear support vector machine classifier had a mean AUC-ROC of 0.63.However, AUC-ROC values varied from 0.51 to 0.69 among approaches and repetitions.Addressing class disparities did not improve prediction outcomes (45).
In tandem with the preceding authors and imbalance detection, other research examines muscular strength metrics to predict injuries in professional male football players.Preseason study uses subgroup identification technique to assess 77 athletes, 92 injury cases and 186 healthy cases.Countermovement leap, eccentric hamstring strength, and isometric hip adductor and abductor strength are examined.Subgroup Discovery data mining works well with tiny datasets and reveals patterns.The study found a higher injury risk for between-limb abduction imbalance three weeks before to the occurrence, exceeding a threshold of ≥0.97.In addition, a right leg adduction muscle strength threshold of ≤1.01 is a significant risk factor one week prior to injury.These findings demonstrate the importance of monitoring key strength metrics and their complex interaction in injury prediction, providing practical insights for professional football injury prevention strategies (38).
One study conducted a Subgroup Discovery machine learning analysis of 14 elite male volleyball players' injuries, illness, and perceived wellness during 24 weeks of the 2018 international season.The study monitors 1,112 professional player questionnaires from training and matches.Based on degree of complaints, injuries, or pain and their impact on performance and training volume, players are divided into three groups (Q1-Q3).The results shed light on how physical well-being, training load, and elite volleyball performance interact.This complete assessment helps elite male volleyball players optimise training and injury avoidance (41).

Multi-factorial analysis for predicting injury risk in elite football players
One study analyses 36 male elite football players and 22 independent variables, including player information, body composition, physical fitness, and season injuries, to predict injury risk.Net elastic analysis is pursued using traditional regression models (OLS), shrinkage regression, and stepwise regression.The data show that defensive and forward sectorial positions, body height, sit-and-reach performance, 1-min pushups, handgrip strength, and 35 m linear speed affect injury risk.The investigation showed that the most accurate predictive model predicts elite footballer injury risk with a root mean square error (RMSE) of 0.591.This holistic approach illuminates the multifaceted nature of injury vulnerability, aiding injury prevention and athlete well-being (43).

Predictive modelling of soccer player injury severity
In a remarkable study spanning the 2013/2014 season, some authors (59) conducted a comprehensive analysis of injury prediction methodologies in male professional football players.The costsensitive ADTree base classifier-based SmooteBoostM1 model was added in preseason tests.This model performed well with an area under the receiver operating characteristic curve score of 0.837, a true positive rate of 77.8%, and a true negative rate of 83.8 percent.The meticulous investigation of 18 injuries showed 55.6 percent in the dominant limb and 44.4 percent in the nondominant leg.This study sheds light on predictive modelling for injury risk assessment and the distribution and features of injuries among male professional football players during the season.

Muscle injury classification and expert predictions
Utilizing the British Athletic Muscle Injury Classification (BAMIC), researchers estimated muscle injuries among male football players.It was shown that XGBoost performed the best among the models that had their performance enhanced by including expert forecasts (40).

Predictive analysis of return-to-play in hamstring injuries
Hamstring injuries in male professional football players from February 2010 to February 2020 were examined in this retrospective observational cohort study.The study uses MLG-R, a comprehensive classification system, and linear regression, random forest, and eXtreme Gradient Boosting to assess returnto-play factors for 76 injuries involving 42 players.65.8% of injuries were grade 3rd, with the biceps femoris long head frequently injured.MLG-R shows strong predictive power with a mean absolute error of 9.8 days and an R 2 of 0.48.Location and grade of the injury at the free tendon of the biceps femoris long head determine return-to-play.The study confirms the MLG-R classification system's accuracy in predicting elite footballers' hamstring injury return-to-play times, providing valuable injury management and prevention insights (42).

External load data, injury risk and fatigue assessment
Some authors observed elite soccer players and compared machine learning models to predict high chronic exertional compartment (HCE) severity.The study highlighted the importance of injury mechanisms and pre-injury incidents, linking them with severity (37).Others assessed professional male football players using a subgroup discovery algorithm.The study revealed that injuries were more likely when there was an imbalance in abduction strength or right leg adduction muscular strength remained unchanged or deteriorated (38).A Research monitored elite soccer players, employing machine learning techniques like Extreme Gradient Boosting (XGBoost) and Random Forest Regression to analyze external load data.Random Forest Regression provided the best performance in assessing fatigue or neuromuscular readiness (39).
We can also develop models that enable us to comprehend soccer player injuries based on anthropometric or screening differences (43), as well as the control of the internal and external load in soccer (54) or in Australian Football Indeed in this last discipline some authors claim to have made contributions to injuries and their relationship with specific actions like deceleration and acceleration or impacts (58).
An intriguing study that integrated external and internal strain in professional soccer comprised 40 male players from a top French Ligue 2 club from June 2017 to May 2018.We compared nonlinear models, specifically interpretable tree-based classification machine-learning methods.Understanding injury risk from internal and exterior load features required these algorithms.Evaluations included algorithm accuracy, precision, recall, and receiver operator characteristic curve area.The study found that training burden, perceptive well-being, and machine learning avoid top soccer injuries (54).
Another study uses a predictive approach to examine the complex relationship between external load variables and professional soccer players' RPE ratings.The study examines 58 GPS external load factors and 30 accelerometer variables using linear regression, K-NN, decision trees, random forest, elastic net regression, and XGBoost.These factors and RPE-derived internal loads are studied throughout 151 training sessions and 44 matches in a season.The machine learning models predict high chronic external load (HCE) severity with an average area under the receiver operating characteristic curve of 0.73 for male and 0.70 for mixed datasets.Injury mechanisms like "head-to-head" and "knee-to-head," as well as match events like "corner kicks" and "throw-ins," are good predictors.This study highlights the importance of specific actions and their correlation with soccer player injury severity, advancing injury prediction (37).

Integrating psychological and physiological factors for injury risk screening
This study incorporates psychological and physiological aspects to create injury risk screening models for male and female indoor football players.Decision trees with C4.5 and ADTree, Support Vector Machines with SMO, and k-Nearest Neighbor were used in the analysis of 139 participants (KNN).The models used psychological risk variables, self-perceived chronic ankle instability, neuromuscular risk factors, and injury monitoring data to optimise each base classifier using MultiSearch.The screening models, based on range of motion (ROM) and dynamic postural control features, had moderate accuracy (AUC scores of 0.701-0.767).The "model of best fit," with two hip and two ankle ROM values and 10 classifiers, had an AUC of 0.767, 85 percent true positive, and 62% true negative.This complex approach emphasises the importance of psychological and physiological factors in indoor football injury risk assessment (52).

Integrating blood biochemical markers and genetic for personalized injury risk assessment
A pioneering study examined the complex link between blood factors and athlete injury risk assessment.The study carefully examined red cell data, ferritin, testosterone, and cortisol to build player profiles.A complete method using internal and exterior load data was recommended by the study.The combination of Decision Trees (DT), Dummy, and XGBoost (XGB) algorithms improved injury risk prediction.This novel approach shows how physiological markers can improve injury risk assessments, enabling more effective sports injury prevention strategies (33).
In a detailed study of male Australian football, twenty-six professional players were studied during one AFL season.The study used parametric and machine-learning analysis to reveal the complex relationship between physical load indices and muscle injury during intensive competition.The study found impacts exceeding 3 g and high-intensity running variables as robust predictors of muscle injury, as indicated by generalised estimating equations and random forest models.Athlete tracking data like deceleration, acceleration, impacts >3 g, and sprint distance were better predictors than traditional approaches.Impacts >3 g and game time were best predictors of post-match creatine kinase (CK) levels.Intriguingly, prematch (CK) only made up 11% of post-match (CK), showing data unpredictability.This study illuminates muscle injury patterns during AFL competition and emphasises the need of tracking measurements in forecasting and analysing post-match (CK) levels (58).
Using genetics, a 2008-2018 case-control study examined 363 elite soccer, futsal, basketball, handball, and roller hockey players from FC Barcelona, Spain.The study investigated the complex relationship between genetics and tendinopathy in high-performance athletes.A machine learning-based multivariate modelling technique using support vector machine and random forest algorithms was used to analyse the influence of SNPs on tendinopathy susceptibility.The study began with a hypothesis-free genome-wide association study of 495,837 SNPs, then targeted examination of 58 SNPs identified as risk factors in the literature.The predictive model included robust SNPs after synthetic variant imputation, with fibrillin 2 gene rs10477683 being influential.In the multidimensional modelling technique, fibrillin 2, a key component of connective tissue microfibrils involved in elastic fibre construction, showed significant connections.These findings illuminate the genetics of elite athlete tendinopathy, enabling tailored injury preventive measures (56).

Limitations
During our study, we ran into a few key limitations.Firstly, we focused mainly on strength and conditioning when talking about "performance," leaving out areas like technical skills, tactics, and talent spotting.We used the STROBE tool to help find and fix weaknesses in the studies we looked at, making our review stronger.
One big issue was how we set up our search, which missed some important health studies, especially those on heart health.This mistake shows we need a wider search strategy in the future to cover more health topics.We also didn't include enough about collective team sports like baseball, cricket, hockey or others, and we completely missed sports for paraathletes.This shows the need for future studies to cover a broader range of sports.
Another problem was that the studies we looked at didn't share their data publicly, making it hard to check their findings.This points to a need for more openness in sharing data for scientific research.
We also noticed not enough research on professional women's sports, which is a gap that needs filling.This lack of information is a missed opportunity to better understand and improve performance in a wider range of sports.

Conclusion
In conclusion, after conducting an extensive review and scoping analysis of the pertinent literature, we have determined that artificial intelligence and machine learning have the potential to bring about significant changes in numerous aspects of team sports participation and coaching, keeping track of the health and safety of individuals and teams, including injuries.Further clinical investigations utilising rigorous methodologies are required to extract more dependable conclusions from the vast amount of data that the teams are generating.By confronting the limitations of current research and embracing a more inclusive and comprehensive methodology, the field of sports science can continue to evolve, offering valuable insights and practical applications that enhance athletic performance and healthcare across the sporting spectrum.

Practical applications and future prospects
The issues we found point out where sports science can improve.By tackling these problems, future research can give us a fuller picture of athletic performance and health, helping athletes in many different sports.
We suggest making training programs that meet the specific needs of different sports, including those for para-athletes and women's teams.Also, sharing data more openly can speed up progress in sports science by making studies easier to verify.
Looking ahead, research should look at athletic performance in a more general sense, including technical, tactical, and talent aspects.Expanding studies to more sports and including paraathletic activities will give us a better understanding of performance.Finally, focusing more on women's sports is crucial to make sure research benefits all athletes equally.
age: 22.5 ± 5.2 years, stature: 1.75 ± 0.7 m, body mass: 72.9 ± 6.9 kg) males and 67 (age: 22.4 ± 5.5 years, stature: 1.64 ± 0.5 m, body mass: 59.4 ± 5.1 kg) females] elite futsal players from 12 different teams [56 players (24 males and 32 females) from six club engaged in the First (top) National Spanish Futsal division and 83 players (48 males and 35 females) from six clubs engaged in the Second National Futsal division] completed this study Four classifiers based on different paradigms, namely decision trees with C4.5 and ADTree, Support Vector Machines with SMO and the well-known k-Nearest Neighbor (KNN) as an Instance-Based Learning approach were selected.The configuration of each base classifier was optimized through the use of the metaclassifier MultiSearch.(53)Retrospectivelycollected data from the in-match position tracking data from 302 competitive professional soccer matches between 18 teams during the Dutch "Eredivisie" 2018-2019 season.

TABLE 1
Search strings used in each database and results returned.

TABLE 3
Item 17 PRISMA ScR protocol: results of individual sources of evidence.
(43)e different statistical and machine learning approaches (linear regression, random forest, and eXtreme Gradient Boosting) were used to assess the importance of each factor of the MLG-R classification system in determining the return to play, as well as to offer a prediction of the expected return to play(43)36 male elite football players who competed in the First Portuguese Soccer League in the 2020/2021 season This study considered multiple-input single-output regression-type models.The analysis showed that the most accurate model presented in this work generates an error of RMSE = 0.591.(44) 25 elite male volleyball players volunteered to participate in this study (mean ± SD age: 27.0 ± 3.0 years, weight: 91.2 ± 6.4 kg, height: 2.00 ± 0.10 m).All subjects competed on the international level Machine learning techniques XGBoost, random forest regression and subgroup discovery (45) 791 female elite handball and soccer players For the best classifier (linear support vector machine), the mean AUC-ROC was 0.63.AUC-ROC values varied substantially across repetitions and methods (0.51-0.69).Class imbalance handling did not improve the results.(46) 42 professional football players.FC Lucerne football club occurred during 1.5 seasons (2019-2021).Definition of models and multivariate approaches: Time-series forecasting, Ridge regularisation, Long short term memory neural network (47) 17 players competing in the Italian championship (Serie A) Cross-validation and real scenario approaches.The cross-validation approach randomly splits the dataset in train and test sets, while the real scenario approach continuously creates train and test as the season goes.Resulting in 157 predictor variables.Data were split into training and testing datasets.Random forests (RF) were built to reduce the dimensionality of the data, identify variables of importance and build classification models.To aid practical interpretation, conditional inference (CI) trees were built.Nine variables were identified as most important for backs, classifying between levels with 83% (RF) and 78% (CI tree) accuracy.(Continued) Munoz-Macho et al. 10.3389/fspor.2024.1383723Frontiers in Sports and Active Living 08 frontiersin.org

TABLE 4
Item 18PRISMA ScR protocol: synthesis of results.