- 1 Human Motion Analytics, Salzburg Research Forschungsgesellschaft mbH, Salzburg, Austria
- 2 Athlete Performance, Adidas AG, Herzogenaurach, Germany
This study investigates performance development and the relationship between subjective and objective training assessments in female youth soccer using wearable sensor technology. The aim of this study was to assess how subjective post-training ratings (intensity and happiness) relate to high-percentile performance outputs, and to identify longitudinal trends in female youth soccer players using IMU-based wearable data. Data were collected over a 14-month period from 46 players (U17 and U20 teams) equipped with foot-mounted inertial measurement units (IMUs) during regular training sessions. Objective performance metrics, including 95th percentile of ball speed, peak speed, and absolute distance, were derived using a multi-stage machine learning pipeline, while subjective metrics (intensity and happiness) were collected via post-session Likert-scale questionnaires using an app. Using the modified Mann-Kendall test, we found 30 significant longitudinal trends, with 14 positive and 16 negative trends across key performance metrics. Peak speed showed the highest number of trends (13), followed by absolute distance (10) and ball speed (7). Correlation analyses based on the Spearman coefficient (with False Discovery Rate correction) revealed meaningful associations between subjective self-assessments and high-percentile performance metrics, with notable differences across player positions and age groups. A robustness check confirmed these patterns also hold when analyzing the 99th percentile of performance outputs. Our findings underscore the value of combining wearable sensor data with subjective evaluations for individualized, role-specific performance monitoring and training optimization in youth soccer. However, as an exploratory study with a single cohort, findings require further validation in broader populations.
1 Introduction
1.1 General context
Wearable sensors, or wearables, can be unobtrusively worn on the body and are increasingly used in soccer training to collect data on players’ physiological and physical activity (1). For example, (2) presents a wearable GNSS platform that calculates a player’s total distance covered and the speed range in which they covered it (walking, jogging, fast running, sprinting), along with their top speed and other sprint statistics. Similar metrics can be calculated using the player’s inertial data (three-dimensional acceleration and angular velocity), measured by an inertial measurement unit (IMU). A wrist-worn inertial data-based system is presented in (3), where in addition to the aforementioned speed metrics, the authors estimate the player-ball interaction (pass, kick, dribble). Furthermore, they differentiate their analysis based on player position (defender, midfielder, striker). This added information can be used by athletes and their coaches for training load monitoring.
1.2 Specific problem and gaps in the literature
Training load (TL) has an internal and an external component (4). The internal component includes all physiological and psychological stressors that act upon the athlete. The external component is the work performed during training/competition. TL monitoring aims to quantify the relationship between TL and performance, as well as the relationship between TL and injury risk (4, 5). While it is believed necessary to overload athletes to some degree during training to improve their performance capacity (6), overdoing it increases injury risk (4) and, in extreme cases, leads to a permanent decrease of performance capacity due to overtraining (7).
While wearables are capable of directly or indirectly measuring physiological stressors (through e.g., a wearable heart rate monitor) and external TL measures like total distance and speed parameters, they are limited when it comes to measuring psychological stressors, i.e., subjective TL. The most common measure of subjective training load is the session rating of perceived exertion (sRPE) (5). It is important that an sRPE follows a non-linear scale with verbal anchors (5). Such a scale is the CR10 scale (8) where expressions such as “extremely weak,” “weak,” “moderate” and “very strong” map to a zero-to-ten number range in a non-equidistant manner. To provide a more fine-grained picture of training load, questionnaires may use a 0–100 range (CR100) and split the sRPE into separate scores for e.g., breathlessness, upper body exertion, and cognitive exertion (9). While most subjective training load measures focus on perceived exertion, emotional responses to performance, such as satisfaction or happiness, can also influence motivation, recovery behavior, and long-term engagement in sport. Positive affective states have been shown to modulate perceived readiness and performance, especially in young or developing athletes. A recent systematic review highlights that psychological well-being and pleasant emotions not only complement negative affect but can serve as performance-related motivational drivers in sport (10). Furthermore, empirical evidence from a large-scale study among collegiate athletes indicates that subjective happiness is significantly associated with concentration regulation and need satisfaction, both of which affect performance and are psychologically relevant (11). Therefore, we included a post-session “happiness” rating to capture players’ emotional responses to training, which may reflect self-perceived performance satisfaction, confidence, and overall psychological readiness.
Understanding the interplay between subjective perceptions of TL and objective performance metrics is crucial for the further development of athletes and the optimization of training strategies in elite sport (12). In their study of TL monitoring for youth soccer players, (13) conclude that both external and internal TL measures should be considered by trainers when designing training programs. Our study addresses this need by combining performance metrics derived from a wearable insole with an integrated IMU with post-session questionnaire ratings from female youth soccer players.
1.3 Study aim
The aim of this study is twofold: (1) to identify longitudinal trends in performance development among female youth soccer players, and (2) to investigate the relationship between subjective self-assessments (intensity and happiness) and objective high-percentile performance metrics derived from wearable IMU data.
This approach provides valuable insights into the development of athletes and contributes to more effective and individualized performance monitoring in sports science.
2 Materials and methods
2.1 Data
2.1.1 Data collection
Data were collected between November 7, 2023 and January 10, 2025 from 46 female youth soccer players (U17 and U20) (age: avg 16.0, sd = 1.4). Note that two players trained in both teams, and thus they appear twice in inter-team comparisons. This study was conducted in Germany and used secondary, anonymized data collected independently of this study as part of an ongoing project by Adidas AG. Participation in the data collection was voluntary and conducted in compliance with German and EU legal requirements. Prior to data collection, participants (and, for minors, their legal guardians) provided informed consent, including agreement to a privacy notice which explicitly allows the use of anonymized data for scientific research. To evaluate whether a formal ethics vote was required, we consulted the ethical guidelines published by the German Council for Social and Economic Data (RatSWD; https://doi.org/10.17620/02671.1) and concluded that none was required. Data processing was conducted in accordance with the EU General Data Protection Regulation (GDPR, Regulation (EU) 2016/679).
During a regular training session, the 3D acceleration and 3D angular velocity of the dominant foot were recorded at 200 Hz with an adidas TEAM-FX insole (Figure 1). The insole housed a Jacquard Tag IMU (Figure 2) in a small cavity under the arch of the dominant foot. This sensor position is recommended by (14) and guarantees that the sensor does not poke into the foot. The size of the sensor unit was 38.8 × 19.7 × 5.8 mm with a weight of 5.5 g. The recorded IMU data were buffered on the Jacquard Tag for further processing.
2.1.2 Sensor data processing and calculation of performance metrics
The data was processed with the Jacquard Tag’s inbuilt processor to calculate football-relevant metrics using implemented machine learning algorithms partially based on the work of (15, 16). To derive the objective performance metrics absolute distance, ball speed, and peak speed, raw sensor data from IMUs was processed using a multi-stage classification and segmentation approach illustrated in Figure 3. The system integrates machine learning algorithms to detect and classify ball interactions and player movement. The machine learning models used to derive performance metrics (e.g., ball speed, peak speed) were provided as part of the TEAM FX insole platform and were not developed or modified by the authors. Consequently, we did not have access to internal model architectures or training procedures, and no additional model validation or tuning was performed in the context of this study.
The data set for absolute distance and peak speed was structured to capture key performance metrics at 5 s intervals. The absolute distance covered by a player during each interval can therefore be derived directly from the average speed. In addition, whenever a ball was passed or kicked during the training, the ball speed at that moment of passing or kicking was recorded and stored separately in the dataset. This allowed precise analysis of ball-related actions in conjunction with player movement data. This data structure provides a comprehensive framework for analysing both continuous movement patterns and discrete ball interactions during the training. This system follows a structured pipeline consisting of:
• Kick detection and classification: A Kick Detector (small model with high recall) first identifies potential kicking events based on acceleration and gyroscope data. A Kick Classifier (larger model tuned for precision) then confirms these events and assigns a confidence score along with an estimated ball speed.
• Ball speed estimation: If a kick is classified, the Ball Speed Estimator is triggered to refine the ball speed estimate (km/h) based on the sensor data.
• Player motion estimation: The Player Motion Estimator continuously processes IMU-derived metrics, estimating speed and distance covered.
2.1.3 Performance metric calculations
• Absolute distance: Computed by integrating the estimated speed over time from the Player Motion Estimator for a 5 s interval of the respective training session.
• Peak speed: Defined as the maximum speed reached within a 5 s interval of the respective training session, derived from the Player Motion Estimator.
• Ball speed: Directly obtained from the Ball Speed Estimator after kick classification.
For the analysis we use the 95th percentile of ball speed, peak speed and distance covered during a training session. We decided not to use the maximum value as it would be prone to outliers. In Section 4.2 we also analyse the results for the 99th percentile to check the dependence of the results on the percentile used.
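The reason for preferring a high percentile over the maximum can be illustrated with a minimal Python sketch (Python is used here for illustration only). The function names, the linear-interpolation percentile rule, and the sample values are assumptions made for this example; the source does not state which percentile definition was applied.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation between order statistics.

    Assumption: the paper does not say which percentile definition was
    used; linear interpolation is only one common choice.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 100:
        raise ValueError("q must lie in [0, 100]")
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def interval_distance_m(mean_speed_kmh, interval_s=5.0):
    """Distance (m) covered in one interval, from its average speed (km/h)."""
    return mean_speed_kmh / 3.6 * interval_s


# Hypothetical session: 99 ordinary 5 s intervals and one spurious reading.
peak_speeds_kmh = [6.0] * 99 + [40.0]
session_p95 = percentile(peak_speeds_kmh, 95)  # unaffected by the single spike
session_max = max(peak_speeds_kmh)             # dominated by the single spike
```

In this made-up session the maximum is determined entirely by one reading, whereas the 95th percentile stays at the level of the remaining intervals.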
2.1.4 Subjective metrics: intensity and happiness
To evaluate players’ perceived intensity and happiness after training sessions, a 5-point Likert scale was used (Table 1). This format is widely used in sports and exercise psychology due to its simplicity, ease of interpretation, and demonstrated psychometric robustness. For instance, the Perceived Performance in Team Sports Questionnaire (PPTSQ) also employs a 5-point Likert scale to evaluate athletes’ perceptions of team performance (17).

Table 1. Likert scale used for assessing perceived intensity and happiness levels after training sessions.
Using an app, players were asked to rate:
1. Intensity: How would you rate the intensity of today’s effort? (1 = Very light, 5 = Very hard)
2. Happiness: How do you feel about your performance? (1 = Disaster, 5 = Perfect)
The intensity scale ranges from Very light (minimal effort, relaxed) to Very hard (exhausting, requiring recovery). The happiness scale captures emotional responses, from Disaster (a frustrating, poor performance) to Perfect (an outstanding session). This subjective assessment provides insight into individual player experiences and facilitates correlation analyses with objective performance metrics. Players were instructed to complete the questionnaire independently and immediately after each session. Responses were submitted digitally and individually, without observation or feedback from coaches or teammates. They were informed that their data would remain confidential and be used solely for research purposes, to encourage honest and unbiased reporting. The variable “happiness” was included as a single-item, post-session measure to capture athletes’ affective evaluation of their own performance. Although not based on a standardized psychological scale, the term was explicitly defined in the questionnaire and anchored to a 5-point Likert scale to ensure clarity and interpretability.
2.2 Data analysis
2.2.1 Modified Mann-Kendall-test
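To detect a monotonic trend in time series data, we decided to use the Mann-Kendall test (18, 19). This non-parametric statistical test is widely applied in hydrometeorological studies to analyze trends in variables such as streamflow, water quality, temperature, and precipitation (20). The null hypothesis ($H_0$) assumes that the data are independent and randomly distributed, while the alternative hypothesis ($H_1$) posits the existence of a monotonic trend (increasing or decreasing) in the data.
For a time series with $n$ observations, the test statistic $S$ is calculated as:
$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}(x_j - x_i),$$
where
$$\operatorname{sgn}(x_j - x_i) = \begin{cases} +1 & \text{if } x_j - x_i > 0, \\ 0 & \text{if } x_j - x_i = 0, \\ -1 & \text{if } x_j - x_i < 0, \end{cases}$$
and $x_i$ and $x_j$ are the values at time $i$ and $j$, respectively.
According to (18, 19), the test statistic $S$ approximates a normal distribution when the sample size $n$ is sufficiently large. The variance of $S$ under $H_0$ is defined as:
$$\operatorname{Var}(S) = \frac{n(n-1)(2n+5) - \sum_{k=1}^{m} t_k (t_k - 1)(2 t_k + 5)}{18},$$
where $t_k$ represents the number of tied (equal) values in the $k$-th group of ties, and $m$ is the total number of groups of tied values. The standardised test statistic $Z$ is computed as:
$$Z = \begin{cases} \dfrac{S - 1}{\sqrt{\operatorname{Var}(S)}} & \text{if } S > 0, \\ 0 & \text{if } S = 0, \\ \dfrac{S + 1}{\sqrt{\operatorname{Var}(S)}} & \text{if } S < 0. \end{cases}$$
To assess significance, the computed $Z$ value is compared against the critical value $Z_{1-\alpha/2}$ from the standard normal distribution for a given significance level $\alpha$. If $|Z| > Z_{1-\alpha/2}$, $H_0$ is rejected, indicating a statistically significant trend. The direction of the trend is determined by the sign of $Z$: if $Z > 0$, the trend is increasing; if $Z < 0$, the trend is decreasing.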
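The standard Mann-Kendall procedure described above can be sketched in a few lines of Python. This is an illustrative re-implementation under stated assumptions, not the code used in this study (which relied on R); the function name `mann_kendall` and the example series are hypothetical.

```python
import math
from collections import Counter


def mann_kendall(x):
    """Standard Mann-Kendall test; returns (S, Var(S), Z, two-sided p)."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")

    # S: sum of sgn(x_j - x_i) over all pairs with i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)

    # Variance under H0 with tie correction; groups of size 1 contribute 0.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Standardised statistic with continuity correction.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, var_s, z, p


# Hypothetical strictly increasing series of ten session values.
s, var_s, z, p = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

For the strictly increasing example, every pair contributes +1, so the statistic takes its largest possible value and the positive sign of the standardised statistic indicates an increasing trend.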
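Because some of the data in our study exhibit autocorrelation, we apply the modified Mann-Kendall test introduced by (21). This modification adjusts the variance of $S$ to account for serial correlation, which can cause the standard Mann-Kendall test to overestimate the significance of trends. While the test statistic $S$ remains unchanged, the variance is modified as follows:
$$\operatorname{Var}^{*}(S) = \operatorname{Var}(S) \cdot \frac{n}{n^{*}},$$
where $n^{*}$ is the effective sample size:
$$n^{*} = \frac{n}{1 + \dfrac{2}{n(n-1)(n-2)} \displaystyle\sum_{k=1}^{n-1} (n-k)(n-k-1)(n-k-2)\,\rho_k}.$$
Here, $\rho_k$ denotes the autocorrelation of the ranks at lag $k$. The adjusted standardised test statistic is then computed as:
$$Z^{*} = \begin{cases} \dfrac{S - 1}{\sqrt{\operatorname{Var}^{*}(S)}} & \text{if } S > 0, \\ 0 & \text{if } S = 0, \\ \dfrac{S + 1}{\sqrt{\operatorname{Var}^{*}(S)}} & \text{if } S < 0. \end{cases}$$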
This adjustment accounts for serial correlation, providing a more accurate assessment of trends in autocorrelated time series and reducing the likelihood of false positives.
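The effect of the variance adjustment on the test statistic can be illustrated with the following Python sketch. It is a simplified illustration under several assumptions: the function names are hypothetical, the rank autocorrelations are passed in directly, and the sketch omits any detrending or screening of non-significant lags that a full implementation may perform. It should therefore not be read as a description of the `mmkh` function.

```python
import math


def rank_autocorrelation(ranks, lag):
    """Sample autocorrelation of a sequence of ranks at the given lag."""
    n = len(ranks)
    mean = sum(ranks) / n
    denom = sum((r - mean) ** 2 for r in ranks)
    if denom == 0:
        return 0.0
    num = sum((ranks[t] - mean) * (ranks[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


def correction_factor(n, rho):
    """Variance inflation factor n / n*, with rho[k-1] the value at lag k."""
    if n < 3:
        raise ValueError("need at least three observations")
    total = sum((n - k) * (n - k - 1) * (n - k - 2) * r
                for k, r in enumerate(rho, start=1))
    return 1.0 + 2.0 / (n * (n - 1) * (n - 2)) * total


def adjusted_z(s, var_s, factor):
    """Standardised statistic computed from the adjusted variance."""
    var_star = var_s * factor
    if var_star <= 0:
        raise ValueError("adjusted variance must be positive")
    if s > 0:
        return (s - 1) / math.sqrt(var_star)
    if s < 0:
        return (s + 1) / math.sqrt(var_star)
    return 0.0


# Hypothetical case: n = 10, S = 45, Var(S) = 125, lag-1 autocorrelation 0.5.
factor = correction_factor(10, [0.5])
z_star = adjusted_z(45, 125.0, factor)
z_plain = adjusted_z(45, 125.0, 1.0)
```

With positive autocorrelation the factor exceeds one, so the adjusted statistic is smaller in magnitude than the unadjusted one, which is how the modification reduces false positives.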
2.2.2 Spearman correlation index
To assess the monotonic relationships between variables, we employed the Spearman correlation coefficient (22):
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)},$$
where $n$ is the number of paired observations and $d_i$ is the difference between the ranks of each pair of observations. The Spearman correlation is a non-parametric measure that evaluates the strength and direction of association between two ranked variables. Unlike Pearson’s correlation coefficient, Spearman’s method does not assume a linear relationship or normal distribution of the data, making it particularly suitable for non-linear or non-normally distributed datasets (23). The correlation coefficient ranges from $-1$ (perfect negative monotonic relationship) to $+1$ (perfect positive monotonic relationship). A value of $0$ indicates no monotonic relationship. We interpret all correlations within the context of their magnitude. Weak but statistically significant correlations are treated as exploratory and interpreted with caution. Given the large number of correlation tests across player positions, teams, subjective variables (intensity, happiness), and performance metrics, we applied the Benjamini-Hochberg procedure (24) to control the false discovery rate (FDR) at a level of 0.1. This correction was implemented across all Spearman tests per subjective variable. Only correlations with FDR-adjusted p-values below 0.1 were interpreted and visualized.
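As an illustration of the rank-based computation, the following Python sketch calculates the Spearman coefficient in two ways: as the Pearson correlation of average ranks, which also handles tied values such as repeated Likert ratings, and via the rank-difference formula, which is exact only when there are no ties. The function names and the example data are hypothetical and not taken from the study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient as the Pearson correlation of average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined if one variable is constant
    return cov / math.sqrt(vx * vy)


def spearman_rho_no_ties(x, y):
    """Rank-difference formula; exact only in the absence of ties."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical paired observations without ties.
rho_a = spearman_rho([1, 2, 3, 4, 5], [3, 1, 4, 5, 2])
rho_b = spearman_rho_no_ties([1, 2, 3, 4, 5], [3, 1, 4, 5, 2])
```

Both routes agree when no values are tied; with tied ratings only the first remains a valid way to compute the coefficient.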
Although we identified autocorrelation in the time series of some players, we decided against removing trends to calculate the Spearman correlation coefficient. This decision ensures comparability among all players, regardless of the presence of trends in their respective data. Furthermore, some players exhibit gaps in their time series due to illness or injuries, which could complicate or bias detrending methods such as differencing or fitting a trend line. Thus, we opted to retain the original structure of the data for a consistent and unbiased analysis.
2.3 Additional analytical considerations
In addition to the specific statistical procedures, several broader analytical decisions were made to ensure robustness and interpretability of the results.
We tested for normality using the Shapiro-Wilk test and found that none of the three performance metrics followed a normal distribution (p-value for all three metrics <0.05). As a result, we confirmed the appropriateness of our non-parametric approach, which included the modified Mann-Kendall trend test and Spearman rank correlation.
Regarding missing data, we decided to not impute missing data, as our focus was on identifying individual-level trends and associations using non-parametric methods. Imputation could introduce artificial stability or variability, especially in longitudinal data with individual-specific trajectories. Furthermore, missingness in our dataset primarily resulted from real-world events such as illness, injury, or absence, which are themselves meaningful in the context of athletic development and training load management. Retaining the original structure of the time series ensured that statistical tests (e.g., modified Mann-Kendall, Spearman correlation) reflected actual measurement patterns rather than imputation artifacts.
While age and experience may influence both subjective assessments and physical performance, these factors are implicitly accounted for by analyzing players separately within the U17 and U20 team groups. We chose not to include additional control variables or compute partial correlations, as our focus was on exploratory, subgroup-level insights and individual variability. This approach also allowed us to maintain comparability across players with varying levels of data completeness due to illness or injury.
All analyses were conducted using the software R (25), employing the mmkh function from the modifiedmk package (26) for the modified Mann–Kendall trend test. Spearman correlations and multiple testing corrections using the Benjamini–Hochberg procedure (24) were performed with functions from the base stats package, specifically cor.test and p.adjust. Visualizations were created using the ggplot2 package (27).
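For readers unfamiliar with the Benjamini-Hochberg adjustment, the following Python sketch shows how adjusted p-values of this kind are commonly computed. It is an illustrative re-implementation, not the code used in this study; the function name `bh_adjust` and the p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest to the smallest p-value so that a running
    # minimum enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = m - pos  # rank of this p-value in ascending order (1..m)
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four correlation tests.
raw = [0.01, 0.04, 0.03, 0.20]
adj = bh_adjust(raw)
kept = [i for i, p in enumerate(adj) if p < 0.1]  # tests retained at 0.1
```

In this example the first three tests would be retained at a threshold of 0.1, while the fourth would not.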
3 Results
3.1 Descriptive analysis
Table 2 presents the descriptive statistics for the key performance metrics—peak speed, absolute distance, and ball speed—across two teams (U17 and U20) over the analyzed period (November 7, 2023, to January 10, 2025). The statistics include the mean, standard deviation (sd), median, interquartile range (IQR), minimum (min), and maximum (max) values.

Table 2. Descriptive statistics of performance metrics; time range: 2023/11/07 to 2025/01/10, IQR = Q3 − Q1; ball speed is measured as an integer; peak speed and absolute distance are motion metrics and represent the respective value in a 5 s interval; sd stands for standard deviation; IQR stands for interquartile range; min stands for the minimum and max for the maximum measured value. The maximum value of the peak speed comes from the same player, as she has trained in both teams (U17 and U20).
Across both teams, the mean peak speed values reached in 5 s intervals are comparable (U17: 7.96 km/h; U20: 7.95 km/h), with similar dispersion as indicated by the standard deviations (3.53 and 3.63, respectively). The absolute distance covered in a 5 s interval shows slightly higher mean values in U20 (8.51 m) compared to U17 (8.49 m), although the median values remain similar (7.39 m vs. 7.37 m). The IQR for absolute distance suggests that the distribution of this variable is relatively consistent across both teams, with moderate variability.
Ball speed shows higher overall variability compared to the other metrics, as indicated by its larger standard deviations and IQRs. The mean ball speed is slightly higher in U20 (52.34 km/h) than in U17 (51.51 km/h), with a broader range of values in both teams, spanning from 30 km/h (below 30 km/h, the data is not taken into account as the movement is not reliable enough at extremely low speeds) to over 110 km/h.
These descriptive statistics provide a fundamental understanding of the distribution and variability of performance metrics within each team, laying the foundation for further correlation and trend analyses.
Figures 4, 5 complement the descriptive statistics presented in Table 2 by visually illustrating the distribution and variability of the three key performance metrics (peak speed, absolute distance, and ball speed) across the U17 and U20 teams during the analyzed period (November 7, 2023, to January 10, 2025).

Figure 5. Boxplots of performance metrics per team, time range: 2023/11/07 to 2025/01/10; ball speed is measured as an integer; peak speed and absolute distance are motion metrics and represent the respective value in a 5 s interval.
The histograms in Figure 4 provide a detailed view of the frequency distributions for each metric. For ball speed, both teams exhibit a unimodal distribution with a concentration around 50–60 km/h, though U20 displays slightly higher frequencies at higher speeds compared to U17. Peak speed follows a right-skewed distribution for both teams, with most values clustered below 10 km/h. Similarly, absolute distance shows a pronounced right-skewed pattern, with most observations below 15 m for both teams. These histograms highlight the overall trends and differences in performance metrics between the two age groups.
The boxplots in Figure 5 further emphasize the variability and central tendencies of these metrics. For ball speed, U20 demonstrates slightly higher median values and a broader range compared to U17, consistent with the descriptive statistics. Peak speed boxplots reveal similar medians for both teams but show slightly more outliers in U20, indicating occasional higher peak speeds. Absolute distance boxplots confirm comparable medians across teams but also illustrate greater variability in U20 through a larger spread and more outliers.
These visualizations corroborate the statistical findings and provide deeper insights into the distributional characteristics of performance metrics across age groups, enabling a comprehensive understanding of team-specific trends in soccer performance.
3.2 Longitudinal trend analysis
Figure 6 shows statistically significant trends in the 95th percentile of the performance metrics for individual players over the period from November 7, 2023, to January 10, 2025. The modified Mann-Kendall test (p < 0.05) was used to identify these trends. The data is grouped into two sets, labeled U17 and U20, with each subplot representing a specific performance metric, further divided into positive and negative trends. Each player’s data is color-coded by their ID for clarity.

Figure 6. Statistically significant trends per player for the 95th percentile of the performance metric; time range: 2023/11/07 to 2025/01/10; statistical test: modified Mann-Kendall test (p < 0.05).
Table 3 provides a summary of the results in Figure 6 and shows the significant trends across the three performance metrics: ball speed, peak speed, and absolute distance. For 21 out of 46 players, a total of 30 significant trends were identified, with 16 negative and 14 positive trends. The distribution reveals that peak speed has the highest number of significant trends (13), with a majority being negative (8). Ball speed exhibits the fewest trends (7), dominated by positive changes (5), while absolute distance shows more negative trends (6) than positive (4). In addition to trend direction and significance, we report Kendall’s tau and Sen’s slope estimates for players with significant trends to quantify trend strength and magnitude in Table 4.

Table 3. Statistically significant trends for the 95th percentile, summarised per performance metric; time range: 2023/11/07 to 2025/01/10; statistical test: modified Mann-Kendall test (p < 0.05).

Table 4. Summary of statistically significant trends in performance metrics per player based on the modified Mann-Kendall test. Reported values include the corrected p-value (after variance correction), Sen’s slope (indicating magnitude of trend per time unit), and Kendall’s tau (measuring monotonic trend strength). Direction indicates whether the trend was positive (increasing) or negative (decreasing).
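Sen’s slope, reported in Table 4 as the magnitude of a trend, is the median of all pairwise slopes of a series. The Python sketch below illustrates the idea; the function name, the time axis, and the example values are assumptions for this illustration, since the time unit underlying the reported slopes is not specified here.

```python
import statistics


def sens_slope(t, x):
    """Median of all pairwise slopes (x_j - x_i) / (t_j - t_i), i < j."""
    if len(t) != len(x) or len(x) < 2:
        raise ValueError("t and x must have equal length >= 2")
    slopes = [
        (x[j] - x[i]) / (t[j] - t[i])
        for i in range(len(x) - 1)
        for j in range(i + 1, len(x))
        if t[j] != t[i]  # skip pairs recorded at the same time point
    ]
    if not slopes:
        raise ValueError("need at least two distinct time points")
    return statistics.median(slopes)


# Hypothetical series: a steady increase, then the same with one outlier.
clean = sens_slope([0, 1, 2, 3], [1, 3, 5, 7])
with_outlier = sens_slope([0, 1, 2, 3], [1, 3, 50, 7])
```

Because the median is taken over all pairwise slopes, a single extreme session leaves the estimate unchanged in this example.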
3.3 Correlation analysis
Figures 7, 8 show correlation heatmaps based on FDR-adjusted p-values, illustrating the relationships between subjective assessments (intensity and happiness) and the 95th percentile of performance metrics (ball speed, absolute distance, and peak speed) for players grouped by team and position. The heatmaps are divided into two panels representing the two teams (U17 and U20), with player positions on the x-axis and performance metrics on the y-axis. Positive correlations are shown in red hues, negative correlations in blue hues, and only those combinations with FDR-adjusted p-values below 0.1 are displayed.

Figure 7. FDR-corrected correlation heatmap between subjective assessment of intensity and the 95th percentile of performance metrics, stratified by team and player position; time range: 2023/11/07 to 2025/01/10; Spearman correlation index with FDR correction (FDR-adjusted p < 0.1).

Figure 8. FDR-corrected correlation heatmap between subjective assessment of happiness and the 95th percentile of performance metrics, stratified by team and player position; time range: 2023/11/07 to 2025/01/10; Spearman correlation index with FDR correction (FDR-adjusted p < 0.1).
Figure 7 shows statistically significant (FDR-adjusted p < 0.1) negative correlations between perceived intensity and ball speed for goalkeepers (U17) and defenders (U17 and U20). Positive correlations were found between intensity and peak speed (U17 defenders) and between intensity and absolute distance (U17 and U20 defenders). No other combinations met the corrected significance threshold.
In Figure 8, only one statistically significant correlation remains after FDR correction: a negative association between happiness and absolute distance for defenders in the U20 team. All other correlations fell above the FDR-adjusted threshold.
4 Discussion
4.1 Main results
Our findings provide insights into the longitudinal trends and correlations of performance metrics in female youth soccer players, revealing key patterns and potential implications for training and performance evaluation.
The descriptive statistics and visualizations presented in Table 2 and Figures 4, 5 provide valuable insights into the performance metrics of U17 and U20 soccer teams over the analyzed period. These findings highlight both similarities and differences in physical performance characteristics between the two age groups.
The comparable mean peak speed values across U17 and U20 teams (7.96 km/h vs. 7.95 km/h) suggest that this metric may not significantly differentiate between age groups at these levels. However, the slightly higher number of outliers observed in U20 (Figure 5) indicates that older players occasionally achieve higher peak speeds, which could reflect differences in individual physical development.
The slightly higher mean absolute distance covered by U20 players (8.51 m vs. 8.49 m) aligns with expectations that older players may engage in more extensive movement patterns due to increased training intensity. The greater variability observed in U20 (Figure 5) further supports this hypothesis, suggesting that individual differences in endurance or playing style may become more pronounced with age.
The higher mean ball speed observed in U20 (52.34 km/h vs. 51.51 km/h) and its broader range (Figure 4) indicate that older players may possess superior technical skills or physical strength, enabling them to execute faster passes or shots. This finding is consistent with the notion that ball speed is a critical performance metric influenced by both technical proficiency and physical maturity.
The longitudinal trend analysis can be interpreted as modeling individual response trajectories over time, capturing athlete-specific performance development based on longitudinal insole-derived data. This approach aligns with current recommendations in athlete monitoring that emphasize intra-individual variation and personalized interpretation. The analysis revealed distinct patterns in player performance metrics, with peak speed showing the highest number of significant trends, predominantly negative. This decline could indicate cumulative fatigue over the study period, possibly linked to intensified match schedules or insufficient recovery time. Conversely, the positive trends observed in ball speed suggest targeted improvements in skill-specific training. The distribution of trends across metrics and teams underscores the need for tailored interventions, as team-specific factors, such as coaching strategies or player composition, likely influence outcomes.
As shown in Figure 7, the FDR-corrected correlation analysis revealed selective but interpretable relationships between subjective intensity assessments and performance metrics. Only a limited number of associations reached statistical significance after correction, but the observed patterns offer exploratory insights into how perceived exertion relates to objective performance in specific roles and teams. For example, significant negative correlations between intensity and ball speed were evident for U17 goalkeepers and for defenders in both teams, suggesting that higher perceived effort may relate to reduced ball handling efficiency, possibly due to fatigue or training context. A positive association was found between intensity and absolute distance among U17 and U20 defenders, indicating that higher self-reported intensity can reflect increased physical output in certain defensive roles.
Figure 8 shows FDR-corrected associations between post-session happiness and performance metrics. In the U20 team, defenders show a significant negative correlation between happiness and absolute distance. Additionally, negative correlations were found for defenders and midfielders when using the 99th percentile of peak speed and absolute distance. These findings suggest that elevated physical output may not always correspond to greater subjective satisfaction, especially in roles that emphasize tactical stability over physical explosiveness. Conversely, U17 midfielders showed a positive correlation between happiness and ball speed, suggesting that technical performance may influence perceived satisfaction in more offensively oriented roles. These position- and age-specific results highlight the nuanced relationship between affective responses and physical output in high-performance settings.
For practitioners, these findings offer practical implications for individualized training design. The descriptive differences between U17 and U20 players support the implementation of age-specific interventions to improve peak speed, endurance, and technical execution. Tactical and role-based conditioning may be especially relevant for defenders and goalkeepers, while offensively positioned players could benefit from dynamic, performance-driven drills.
The FDR-corrected results also emphasize the psychological dimension of training. Defenders who experience lower satisfaction following physically intense sessions may benefit from targeted exercises that promote engagement, communication, and perceived relevance. In contrast, players whose satisfaction is positively associated with performance outputs, such as U17 midfielders, may respond best to technically focused, feedback-rich environments.
In summary, although few correlations met statistical significance under FDR correction, the observed patterns offer role-specific insight into the interplay between subjective perception and physical performance. These findings reinforce the value of integrating wearable-derived metrics with psychological self-assessments to optimize training effectiveness and athlete development. However, we emphasize that these findings are exploratory and should be interpreted as preliminary indications of potential associations, which warrant further investigation in future research.
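To make the effect of the FDR correction concrete, the following is a minimal sketch of the Benjamini-Hochberg procedure (24) in plain Python. It is not the authors' code (the analysis was run in R); the function name `benjamini_hochberg`, the threshold `alpha=0.05`, and the raw p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted_p_values, rejected_flags) in the original input order."""
    m = len(p_values)
    if m == 0:
        return [], []
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


# Hypothetical raw p-values, e.g. one per position-by-metric cell of a heat map.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.27]
adjusted_p, significant = benjamini_hochberg(raw_p)
```

In this made-up example, the third and fourth p-values are below 0.05 before correction but no longer pass after adjustment, which mirrors why only a limited number of associations remain significant in the corrected heat maps.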
4.2 Robustness check
To assess the robustness of the observed statistical relationships, we first repeated the modified Mann-Kendall trend analysis using the 99th percentile of each performance metric. The results, summarized in Table 5 and visualized in Figure 9, show a total of 26 statistically significant trends, slightly fewer than the 30 identified at the 95th percentile (Table 3). Peak speed continues to exhibit the highest number of significant trends (12), though the distribution is now evenly split between positive and negative trends (6 each), suggesting more variability in extreme speed outputs over time. Ball speed trends remain predominantly positive (5 vs. 3 negative), while absolute distance shows a slight skew toward negative trends (4 vs. 2 positive), indicating a possible decline in maximal distance efforts across the period.

Table 5. Statistically significant trends at the 99th percentile, summarised per performance metric; time range: 2023/11/07 to 2025/01/10; statistical test: modified Mann-Kendall test (p < 0.05).

Figure 9. Statistically significant trends per player at the 99th percentile of the performance metrics; time range: 2023/11/07 to 2025/01/10; statistical test: modified Mann-Kendall test (p < 0.05).
The spatial and temporal distribution of these trends across both U17 and U20 teams remains consistent with the original analysis, reinforcing the validity of the observed individual development patterns. The replication of trend directionality, especially the dominance of peak speed trends, further supports the robustness of the original findings and confirms that the 95th percentile results are not overly sensitive to the specific percentile threshold used.
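For readers unfamiliar with the trend test, the sketch below implements the classic Mann-Kendall test (19) in plain Python. It is a simplified illustration only: the modified version used in this study (21) additionally rescales the variance of S to account for autocorrelation, which is not implemented here, and the authors' analysis relied on the R package modifiedmk (26). The function name and the per-session values are hypothetical.

```python
import math


def mann_kendall(series):
    """Classic Mann-Kendall test: returns (S, z, two-sided p-value)."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # S is the number of increasing pairs minus the number of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, with the usual tie correction.
    tie_counts = {}
    for value in series:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    var_s = n * (n - 1) * (2 * n + 5)
    for t in tie_counts.values():
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18.0
    if var_s == 0:
        return s, 0.0, 1.0  # all values identical, so no trend can be assessed
    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return s, z, p


# Hypothetical 99th-percentile peak speeds of one player across sessions.
peak_speed = [6.1, 6.3, 6.2, 6.5, 6.6, 6.8, 6.7, 7.0]
s_stat, z_score, p_value = mann_kendall(peak_speed)
```

The sign of `z_score` gives the trend direction (positive or negative), and the trend is counted as significant when `p_value` falls below the chosen threshold.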
In addition, we replicated the Spearman correlation analysis (applying FDR correction) from Section 3.3 using the 99th percentile of each performance metric, to examine whether the associations between subjective assessments (intensity and happiness) and physical performance remain consistent under more extreme conditions. As illustrated in Figures 10 and 11, the overall pattern of statistically significant relationships becomes more selective at the 99th percentile level.

Figure 10. FDR-corrected correlation heat map for the subjective assessment of intensity and the 99th percentile of performance metrics per player position and team; time range: 2023/11/07 to 2025/01/10; Spearman correlation index with FDR correction; GK, goalkeepers; B, defenders; M, midfielders; ST, strikers.

Figure 11. FDR-corrected correlation heat map for the subjective assessment of happiness and the 99th percentile of performance metrics per player position and team; time range: 2023/11/07 to 2025/01/10; Spearman correlation index with FDR correction; GK, goalkeepers; B, defenders; M, midfielders; ST, strikers.
For subjective intensity (Figure 10), negative correlations with ball speed are still evident for defenders (U17), midfielders (U20), and strikers (U20). A statistically significant positive correlation is observed between intensity and absolute distance for defenders in the U17 team. No other player positions or performance metrics met the corrected significance threshold.
For subjective happiness (Figure 11), more distinct effects emerged. Notably, defenders in the U20 team show negative correlations with absolute distance. Additionally, midfielders in the U20 team show negative correlations with peak speed, while midfielders in the U17 team show a positive correlation with ball speed.
Overall, the consistency of both longitudinal trend results and selected correlation structures at the 99th percentile underscores the robustness of our analytical framework. While fewer associations reached statistical significance, key relationships, such as the negative link between intensity and ball speed or the negative association between happiness and performance in U20 players, persisted. These findings indicate that the relationships between subjective perceptions and objective performance metrics are not artifacts of percentile selection but rather reflect role- and context-dependent mechanisms that become particularly evident under extreme performance conditions.
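As an illustration of the correlation measure behind these heat maps, the following plain-Python sketch computes Spearman's rank correlation (22) as the Pearson correlation of average ranks, which handles the ties that are common in Likert-scale ratings. The helper names and the session data are hypothetical, and the significance test applied in the study is omitted.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # average of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical sessions: Likert intensity rating and high-percentile ball speed.
intensity = [2, 3, 3, 4, 5, 5, 4, 1]
ball_speed = [78.0, 74.5, 75.0, 70.2, 66.1, 68.4, 71.3, 80.2]
rho = spearman_rho(intensity, ball_speed)
```

With these invented values `rho` is strongly negative, the same direction as the intensity and ball speed association discussed above. In practice one coefficient and p-value would be computed per team, position, and metric before applying the FDR correction.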
The use of the 95th and 99th percentiles in performance analysis represents a methodological trade-off between capturing high-performance output and maintaining robustness against variability. While the 95th percentile provides a stable estimate of upper performance under typical conditions, the 99th percentile emphasizes near-maximal efforts. In our study, the 99th percentile was used as a percentile-based summary statistic and not as a raw maximum value, which reduces the influence of individual outliers. Combined with FDR correction, this approach mitigates the risk of spurious associations. Nonetheless, the reduced number of statistically significant correlations at the 99th percentile highlights the inherent tension: higher specificity and sensitivity to peak performance may come at the cost of reduced statistical power and generalizability. As such, percentile selection should be guided by the intended analytical objective, whether to monitor consistent high-level output or to isolate peak expressions of individual capacity.
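The difference between a percentile-based summary and a raw maximum can be sketched as follows. The interpolation rule (linear interpolation between order statistics) is an assumption, since the exact percentile definition used in the pipeline is not stated here, and the ball speeds, their unit, and the outlier are invented for illustration.

```python
import math


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Hypothetical ball speeds (km/h) from one session: 199 plausible kicks
# plus one implausible reading such as a sensor glitch.
speeds = [40 + 0.2 * i for i in range(199)] + [250.0]

p95 = percentile(speeds, 95)
p99 = percentile(speeds, 99)
raw_max = max(speeds)
```

In this example both percentiles stay within the range of the plausible kicks, whereas `raw_max` is determined entirely by the single outlier. The 99th percentile nevertheless rests on far fewer observations than the 95th, which is one way to see the loss of stability described above.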
As an additional robustness check, we repeated the Spearman correlation analysis using only players who were uniquely assigned to either the U17 or U20 team, thereby excluding two players who participated in both datasets. One of these players was categorized as a defender or midfielder, and the other as a striker. The resulting correlation heatmap for subjective intensity is provided as Supplementary Figure S1. The main correlation patterns remained largely consistent in both direction and magnitude. Notably, the significant negative associations between intensity and ball speed persisted, as did the positive associations between intensity and absolute distance and peak speed. Interestingly, the previously significant correlation between intensity and ball speed for defenders disappeared, while two new significant correlations emerged for U17 midfielders (with peak speed and absolute distance), and one new correlation appeared for U20 strikers (with absolute distance).
No statistically significant correlations were observed for subjective happiness in this reduced dataset. Overall, these results support the robustness of the main findings and suggest that the inclusion of overlapping players did not introduce bias into the observed correlation structures.
4.3 Limitations and outlook
This study is limited by its focus on a specific timeframe and on only two teams, which restricts the generalisability of the findings. Training culture, coaching style, and organizational context may vary substantially across clubs or national systems. Consequently, the observed patterns should be seen as exploratory and context-dependent. Future research should aim to replicate this approach across broader samples, including different clubs and regions, competitive levels and training environments.
In addition, the main analysis relies on the 95th percentile of the performance metrics and may therefore miss trends at other percentiles. To address this, we added a robustness analysis based on the 99th percentile of the performance metrics (Section 4.2). Subjective intensity and happiness assessments may be influenced by psychological factors, introducing bias. Statistical methods such as the modified Mann-Kendall test and Spearman correlation, while appropriate, could be complemented by machine learning to uncover nonlinear patterns. Future research should explore longer timeframes, integrate objective and psychological metrics, assess external factors like weather or match difficulty, include injury data, and expand the sample to teams from diverse leagues or regions for broader insights.
Although our study focused on female youth soccer players, the underlying principles and methodology have potential relevance in other performance contexts. The combination of wearable-based motion tracking and post-session subjective assessment could be similarly applied in male football or other team sports such as basketball or handball, where movement patterns, psychological engagement, and role-specific demands are also critical. While our analysis focuses exclusively on training sessions, it is important to note that subjective perceptions such as intensity or happiness may still be influenced by role-based and contextual factors. For example, players who regularly start in matches may approach training with different goals, expectations, or mental loads compared to those who are primarily substitutes. In addition, variations in training focus (e.g., recovery vs. conditioning), individual readiness, or anticipation of selection could shape how players perceive and report exertion or affect. Although we did not systematically collect role-related metadata, these factors may contribute to within-group variation in subjective assessments and should be considered in future research.
A further limitation lies in the accuracy of the machine learning models embedded in the TEAM-FX system. Although partially validated in previous studies (15, 16), these models may still produce classification errors such as false positives or false negatives in kick detection, as well as minor estimation inaccuracies in derived metrics like ball speed. Such errors could introduce noise at the event level, especially in shorter time windows, and may attenuate the strength of correlations with subjective assessments. While we relied on validated output formats and summary statistics to mitigate this, these model-based uncertainties should be kept in mind when interpreting the results.
In addition, future research may benefit from applying longitudinal modeling techniques such as latent growth models. These models allow for the estimation of both fixed and random effects across time and could provide a more nuanced understanding of intra- and inter-individual variability in subjective and objective performance measures, particularly when data with regular time intervals and higher completeness become available.
Moreover, although we used non-parametric methods to capture individual-level variation without distributional assumptions, future studies may consider applying generalized linear mixed-effects models to more formally account for hierarchical data structures and repeated measures. This could offer complementary insights, especially when larger datasets become available.
In conclusion, this study underscores the importance of monitoring performance trends and their relationship with perceived intensity. By integrating these insights into training and match preparation, teams can better optimize player performance and reduce injury risks over the long term.
5 Conclusion
This study aimed to identify and understand trends in key performance metrics and their correlations with subjective intensity and happiness assessments among players in two football teams over a 14-month period. The detection of significant monotonic trends across multiple performance metrics highlights the potential of longitudinal monitoring to inform individual development. The observed correlations between subjective intensity and happiness ratings and objective metrics, as well as the positional differences in these relationships, suggest initial, context-sensitive connections between perceived effort and performance outcomes. The identified weak correlations offer exploratory insights into the interplay between perceived and measured performance. Building on these findings, future studies may apply this integrated approach to other sports disciplines, age groups, and competition levels to explore its broader applicability. The findings may support sports practice by encouraging the combined use of wearables and subjective self-assessments to help adjust training loads, monitor psychological responses, and guide the individualized development of athletes. This approach may also hold value for elite training programs seeking to individualize performance monitoring and optimize athlete readiness.
Data availability statement
The data analyzed in this study is subject to the following licenses/restrictions: Third party licensing restrictions apply. Requests to access these datasets should be directed to Burkhard Duemler, burkhard.duemler@adidas.com.
Ethics statement
Ethical approval was not required for the study involving human samples in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants' legal guardians/next of kin.
Author contributions
SK: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. CK: Writing – original draft, Writing – review & editing, Conceptualization, Formal analysis, Methodology. WK: Writing – original draft, Writing – review & editing, Conceptualization, Data curation, Visualization. BD: Conceptualization, Data curation, Writing – original draft, Writing – review & editing, Formal analysis.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was conducted within the COMET project DiMo-NEXT which is funded by the Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK), the Federal Ministry for Labour and Economy (BMAW), and the provinces of Salzburg, Upper Austria and Tyrol within the framework of COMET - Competence Centres for Excellent Technologies (Grant-No.: 48584933). COMET is processed by the Austrian Research Promotion Agency (FFG).
Conflict of interest
SK, CK, WK were employed by Salzburg Research Forschungsgesellschaft mbH. BD was employed by Adidas AG.
Generative AI statement
The author(s) declare that Generative AI was used in the creation of this manuscript. During the preparation of this work, the authors selectively used DeepL Translate and ChatGPT 4o for improving language and readability. The tools' output was reviewed before integration into the present article, and the authors take full responsibility for its contents.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fspor.2025.1627820/full#supplementary-material
References
1. Almulla J, Takiddin A, Househ M. The use of technology in tracking soccer players’ health performance: a scoping review. BMC Med Inform Decis Mak. (2020) 20:184. doi: 10.1186/s12911-020-01156-4
2. Liu Y. Research and development of GNSS wearable device for sports performance monitoring by example of soccer player analysis. In: Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering. New York, NY, USA: Association for Computing Machinery (2023). EITCE ’22; p. 901–6.
3. Hossain HMS, Khan MAAH, Roy N. SoccerMate: a personal soccer attribute profiler using wearables. In: 2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). Kona, HI: IEEE (2017). p. 164–9.
4. Bourdon PC, Cardinale M, Murray A, Gastin P, Kellmann M, Varley MC, et al. Monitoring athlete training loads: consensus statement. Int J Sports Physiol Perform. (2017) 12:S2-161–70. doi: 10.1123/IJSPP.2017-0208
5. Coyne JOC, Gregory Haff G, Coutts AJ, Newton RU, Nimphius S. The current state of subjective training load monitoring—a practical perspective and call to action. Sports Med Open. (2018) 4:58. doi: 10.1186/s40798-018-0172-x
6. Issurin VB. New horizons for the methodology and physiology of training periodization. Sports Med. (2010) 40:189–206. doi: 10.2165/11319770-000000000-00000
7. Meeusen R, Duclos M, Foster C, Fry A, Gleeson M, Nieman D, et al. Prevention, diagnosis and treatment of the overtraining syndrome: joint consensus statement of the European College of Sport Science (ECSS) and the American College of Sports Medicine (ACSM). Eur J Sport Sci. (2013) 13:1–24. doi: 10.1080/17461391.2012.730061
8. Borg G, Borg E. A new generation of scaling methods: level-anchored ratio scaling. Psychologica. (2001) 28:15–45.
9. McLaren SJ, Smith A, Spears IR, Weston M. A detailed quantification of differential ratings of perceived exertion during team-sport training. J Sci Med Sport. (2017) 20:290–5. doi: 10.1016/j.jsams.2016.06.011
10. Peris-Delcampo D, Núñez A, Ortiz-Marholz P, Olmedilla A, Cantón E, Ponseti J, et al. The bright side of sports: a systematic review on well-being, positive emotions and performance. BMC Psychol. (2024) 12:284. doi: 10.1186/s40359-024-01769-8
11. Zhang Z, Chen X, Xu L, Qin X, Veloo A. Sport anxiety and subjective happiness of college athletes: a self-determination theory perspective. Front Psychol. (2024) 15:1400094. doi: 10.3389/fpsyg.2024.1400094
12. Saw AE, Main LC, Gastin PB. Monitoring the athlete training response: subjective self-reported measures trump commonly used objective measures: a systematic review. Br J Sports Med. (2016) 50:281–91. doi: 10.1136/bjsports-2015-094758
13. Lechner S, Ammar A, Boukhris O, Trabelsi K, M Glenn J, Schwarz J, et al. Monitoring training load in youth soccer players: effects of a six-week preparatory training program and the associations between external and internal loads. Biol Sport. (2023) 40:63–75. doi: 10.5114/biolsport.2023.112094
14. Zrenner M, Küderle A, Roth N, Jensen U, Dümler B, Eskofier BM. Does the position of foot-mounted IMU sensors influence the accuracy of spatio-temporal parameters in endurance running? Sensors. (2020) 20:5705. doi: 10.3390/s20195705
15. Schuldhaus D. Human activity recognition in daily life and sports using inertial sensors (Doctoral dissertation). Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany (2019).
16. Stoeve M, Schuldhaus D, Gamp A, Zwick C, Eskofier BM. From the laboratory to the field: IMU-based shot and pass detection in football training and game scenarios using deep learning. Sensors. (2021) 21:3071. doi: 10.3390/s21093071
17. Gershgoren L, Blatt A, Sela T, Tenenbaum G. Does the league table lie? The development and validation of the perceived performance in team sports questionnaire (PPTSQ). Front Psychol. (2021) 11:615018. doi: 10.3389/fpsyg.2020.615018
19. Mann HB. Nonparametric tests against trend. Econometrica. (1945) 13:245–59. doi: 10.2307/1907187
20. Wang F, Shao W, Yu H, Kan G, He X, Zhang D, et al. Re-evaluation of the power of the Mann-Kendall test for detecting monotonic trends in hydrometeorological time series. Front Earth Sci. (2020) 8:14. doi: 10.3389/feart.2020.00014
21. Hamed KH, Rao AR. A modified Mann-Kendall trend test for autocorrelated data. J Hydrol (Amst). (1998) 204:182–96. doi: 10.1016/S0022-1694(97)00125-X
22. Spearman C. The proof and measurement of association between two things. Am J Psychol. (1904) 15(1):72–101. doi: 10.2307/1412159
23. Zar JH. Significance testing of the Spearman rank correlation coefficient. J Am Stat Assoc. (1972) 67:578–80. doi: 10.1080/01621459.1972.10481251
24. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodol). (1995) 57:289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x
25. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing (2023).
26. Patakamuri SK, O’Brien N. modifiedmk: modified versions of Mann Kendall and Spearman’s Rho Trend tests. R package version 1.6 (2021).
Keywords: youth soccer performance, wearable sensors, subjective training assessment, machine learning in sports, longitudinal monitoring
Citation: Kranzinger S, Kranzinger C, Kremser W and Duemler B (2025) Performance tracking in female youth soccer through wearables and subjective assessments. Front. Sports Act. Living 7:1627820. doi: 10.3389/fspor.2025.1627820
Received: 13 May 2025; Accepted: 16 June 2025;
Published: 30 June 2025.
Edited by:
Bruno Gonçalves, University of Evora, Portugal
Reviewed by:
Júlio A. Costa, Portugal Football School, Portugal
Gemma Martinez-Torremocha, University of Castilla La Mancha, Spain
Copyright: © 2025 Kranzinger, Kranzinger, Kremser and Duemler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Stefan Kranzinger, stefan.kranzinger@salzburgresearch.at