Biological Maturity Status in Elite Youth Soccer Players: A Comparison of Pragmatic Diagnostics With Magnetic Resonance Imaging

The influence of biological maturity status (BMS) on talent identification and development within elite youth soccer is critically debated. During adolescence, maturity-related performance differences within the same age group may give early maturing players a greater chance of being selected. Therefore, coaches need to consider players' BMS. While the standard methods for assessing BMS in adolescents are expensive and time-consuming imaging techniques (i.e., X-ray and MRI), more pragmatic procedures also exist. This study aimed to evaluate commonly used methods to assess BMS within a highly selected sample of youth soccer players. A total of N = 63 elite male soccer players (U12 and U14) within the German Soccer Association's talent promotion program completed a test battery assessing BMS outcomes. Utilizing MRI diagnostics, players' skeletal age (SA MRI) was determined by radiologists and served as the reference method. Further commonly used methods included skeletal age measured by an ultrasound device (SA US), the maturity offset (MO MIR), and the percentage of adult height (PAH KR). The relation of these alternative BMS outcomes to SA MRI was examined from three perspectives: performing bivariate correlation analyses (1), modeling BMS as a latent variable (BMS lat) based on the multiple alternative diagnostics (2), and investigating individual differences in agreement (3). (1) Correlations of SA MRI and the further BMS variables ranged from r = 0.80 to r = 0.84 for the total sample and were lower for U12 (0.56 ≤ r ≤ 0.66) and U14 (0.61 ≤ r ≤ 0.74). (2) The latent structural equation modeling (SEM) (R² = 51%) revealed a significant influence on BMS lat for MO MIR (β = 0.51, p < 0.05). The additional contribution of PAH KR (β = 0.27, p = 0.06) and SA US (β = −0.03, p = 0.90) was rather small. (3) The investigation of individual differences between the reference method and alternative diagnostics indicated a significant bias for MO MIR (p < 0.01).
The results support the use of economical and time-efficient methods for assessing BMS within elite youth soccer. Bivariate correlation analyses as well as the multivariate latent variable approach highlight the measures' usefulness. However, the observed individual level differences for some of the utilized procedures led to the recommendation for practitioners to use at least two alternative assessment methods in order to receive more reliable information about players' BMS within the talent promotion process.


INTRODUCTION
In the context of competitive sports, major sports organizations invest considerable financial resources and work in the promotion and development of youth players (Johnston et al., 2018), meaning a huge commitment for the organizations as well as for the players. For instance, upon being drafted (to a selection squad), promising players are removed from their familiar environment at an early stage. While this separation can be valuable from a performance-oriented perspective, it can also represent a serious interruption in the personality development of young players (Fraser-Thomas et al., 2008). Therefore, a fair (and optimal) selection process must take into consideration the viewpoints of both competitive sports and pedagogical development. However, several studies point out that such selection processes within talent development programs are challenging in youth soccer (Gouvêa et al., 2017; Cumming, 2018).
A major reason for this problem is separating youth players into chronological age (CA) groups (e.g., U10, U11, and U12) based on an annual cutoff date (Helsen et al., 2005; Deutscher Fußball-Bund, 2020). These classifications lead to CA differences of players within an age group of up to 1 year. Players born early in a given year (i.e., first birth quarter) generally have a physiological advantage in their development in contrast to their younger counterparts (i.e., those born in the fourth quarter). This leads to the well-known relative age effect (RAE), which occurs especially in soccer talent identification (Deprez et al., 2013). Votteler and Höner (2017) emphasize the importance of this effect by demonstrating that significantly more players are born in the first rather than the last quarter in German youth national teams. Moreover, this problem is further reinforced when a player's biological maturity status (BMS), regardless of his CA, is neglected. Johnson et al. (2017) point out that BMS has a stronger impact than RAE when selecting players. Especially in the pubertal stages (i.e., 11-16 years; Deprez et al., 2015), in which rigorous and important selection processes take place, the difference in players' BMS can reach up to 5 years (Malina et al., 2004). In practice, this leads to the phenomenon where coaches and talent scouts often prefer early maturing players due to a currently better performance level based on more developed physical attributes (Unnithan et al., 2012). At the same time, late maturing players show lower performance levels, especially in physiological predictors, and therefore are often overlooked (Cumming et al., 2017). Hence, those players often do not receive access to a comprehensive talent promotion program with more qualified coaches and better resources.
Furthermore, they also receive less playing time in competitive matches, less team responsibility, and less emotional support, which undermines their holistic soccer education (Malina et al., 2015). In the worst case, even highly talented players are deselected due to the time-delayed biological development in comparison with their on-time or early developing peers. Therefore, sports scientific research on talent has long shown that a player's individual BMS should be taken into account within talent promotion, particularly in selection processes (Romann and Javet, 2018).
As a result, there have been recent initial approaches that have tested the classification of players according to their biological maturity rather than CA. This classification strategy is currently referred to as bio-banding (Cumming et al., 2017). Bio-banding can be considered in different domains, such as conducting new competition formats and grouping players in strength and conditioning training to prevent injuries or in talent identification with respect to selecting players for promotion programs (see Cumming et al., 2018a,b). To date, only a few pilot studies have evaluated bio-banding, e.g., within tournaments initiated by the United States Soccer Federation or as a part of the Elite Player Performance Plan by the English F.A. (Bradley et al., 2019).
These studies indicate that bio-banding offers potential benefits for talent promotion programs. However, in order to obtain the benefits of implementing bio-banding in soccer practice and research, appropriate methods are needed that meet ethical and economically pragmatic criteria and undergo a sound psychometric evaluation.
To determine BMS in youth players, both in research and in practice, various methods are proposed that assess biological maturity either objectively, through different dimensions (e.g., skeletal age or somatic age; Lloyd et al., 2014), or qualitatively, through a morphological, subjective examination of the maturity status by experts (Romann et al., 2017). In the research literature, measuring skeletal age radiographically is currently regarded as the gold standard method (Lloyd et al., 2014). In various international talent studies, an image of the left hand is taken by using an X-ray (Gouvêa et al., 2017; Holienka et al., 2017). However, some researchers have found that even different assessment methods based on X-rays (i.e., Fels method vs. Tanner-Whitehouse 3 method; Malina et al., 2007a) do not reach a satisfying agreement in the determination of skeletal age. Furthermore, due to the radiation exposure, there are ethical concerns that make routine implementation very difficult, especially with youth and adolescent players (Focardi et al., 2014). In fact, in competitive sports in many European countries, there is currently no legal basis for using X-ray examinations to estimate the ages of healthy youth players (Timme et al., 2017). In this context, several studies mention that, in addition to the X-ray method as the gold standard, an image of the hand-wrist can also be reliably produced by radiation-free magnetic resonance imaging (MRI; Dvorak et al., 2007; Dvorak, 2009; George et al., 2012; Bolívar et al., 2015; Urschler et al., 2016). Going further, there is also evidence that MRI, which does not use ionizing radiation, is fundamentally more accurate than X-rays due to its high contrast resolution (Serinelli et al., 2015). Therefore, from the authors' point of view, MRI is currently the most established method for determining skeletal age and can be used without the ethical concerns associated with radiation exposure.
In general, the MRI method is associated with economic disadvantages (e.g., high costs and long acquisition time) for practical use. Therefore, there appears to be a need for less costly and less technical methods to measure biological maturity based on skeletal age. One promising alternative method involves a newly developed ultrasound diagnostic device (SonicBone, Rishon Lezion, Israel). Currently, however, this method has been sufficiently validated only for children, but not in a sport-specific context (e.g., Rachmiel et al., 2017; Utczas et al., 2017), and a validation of this method in the field of youth competitive sports is still pending. This validation is necessary given the fact that moderator variables have to be considered in talent research (e.g., performance level, age groups, gender; see Murr et al., 2018).
To date, practitioners from diverse talent promotion programs mainly use alternative assessments and determine biological maturity by somatic age (Cumming et al., 2018b). In this context, two commonly utilized methods are the calculation of the maturity offset (MO) based on the estimation of individual age at peak height velocity (APHV; Mirwald et al., 2002) and calculating the percentage of predicted final adult height (PAH; Khamis and Roche, 1994). More specifically, the Mirwald method estimates the players' MO based on various parameters (CA at the time of measurement, weight, height, sitting height, and leg length), while the Khamis-Roche approach predicts the adult height from weight and height of the individual as well as the height of the biological parents. However, considering the influence of moderator variables, researchers emphasize that these methods have inaccuracies depending on which age group was studied (Myburgh et al., 2019).
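As a rough illustration of the Mirwald approach, the Python sketch below computes a maturity offset for boys from the parameters listed above. The coefficients are those commonly cited for the boys' equation of Mirwald et al. (2002); they are not stated in this article and should be checked against the original publication before any use. The function and argument names are hypothetical.

```python
def maturity_offset_boys(age_years, height_cm, sitting_height_cm, weight_kg):
    """Maturity offset in years; negative values mean peak height velocity lies ahead.

    Coefficients are assumed from the commonly cited boys' equation of
    Mirwald et al. (2002); they do not come from this article.
    """
    # Leg length is derived as standing height minus sitting height.
    leg_length_cm = height_cm - sitting_height_cm
    return (
        -9.236
        + 0.0002708 * (leg_length_cm * sitting_height_cm)
        - 0.001663 * (age_years * leg_length_cm)
        + 0.007216 * (age_years * sitting_height_cm)
        + 0.02292 * (weight_kg / height_cm * 100)
    )


def predicted_aphv(age_years, maturity_offset):
    """Predicted age at peak height velocity: chronological age minus offset."""
    return age_years - maturity_offset


# Invented example values for a hypothetical U14 player.
offset = maturity_offset_boys(13.4, 160.0, 82.0, 48.0)
aphv = predicted_aphv(13.4, offset)
```

For the invented player, the offset is slightly negative, i.e., the sketch places him a few months before his predicted peak height velocity.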
Within the constraints of talent promotion programs, empirical knowledge of appropriate diagnostics for determining BMS is needed. However, in practice, such methods have to be both acceptable with regard to costs and ethical issues for bio-banding strategies and scientifically sound in terms of psychometric properties.
Indeed, in prior studies, researchers aimed to validate different diagnostics to assess BMS in youth players in different sports (e.g., Malina et al., 2007b in American football; Malina et al., 2012 and Romann et al., 2017 in soccer; Myburgh et al., 2019 in tennis). Malina et al. (2007b) found a correlation of r s = 0.52 between skeletal age measured by left hand-wrist radiographs and PAH in their study with 143 male American football players (9.27-14.24 years). Utilizing the same method as Malina et al. (2007b), Myburgh et al. (2019) detected similar results in an investigation with 40 male British junior tennis players (12.5 ± 1.8 years). Apart from PAH, this study also used predicted APHV (Mirwald et al., 2002) for BMS assessment and found lower correlations between various assessment methods (PAH: r s = 0.35; predicted APHV: r s = 0.37). In comparable studies in soccer, Romann et al. (2017) and Malina et al. (2012) found similar correlations (0.26 ≤ r s ≤ 0.47) between skeletal age and PAH, as well as predicted APHV, in 11- to 14-year-old male soccer players. However, these studies mainly used bivariate, correlative approaches to analyze the relationship between the gold standard and further BMS diagnostics.
Therefore, special focus should be given to an accurate and comprehensive investigation of diagnostics' reliability and validity by comparing alternative diagnostics with a well-established reference method (i.e., MRI) from different perspectives. Here, correlational analyses (perspective 1) give insight into the association between possible appropriate diagnostics and the reference method. Furthermore, a multivariate consideration of the alternative diagnostics' coherence with the theoretical construct of BMS (perspective 2) may facilitate a more comprehensive view of diagnostics' psychometric properties. To this end, a structural equation modeling (SEM) approach may be beneficial to define BMS as a latent variable by utilizing the reference method as the measurement model. This makes it possible to accurately analyze the degree of the theoretical construct's correspondence with further alternative and more pragmatic diagnostics (e.g., Bollen, 1989). More specifically, one can investigate whether the multiple alternative BMS diagnostics may be utilized in combination to represent BMS in a satisfying manner. Since the consideration of manifest variables as indicators implies neglecting potential measurement errors, SEM could take this problem into account and enable more exact calculations (e.g., Skrondal and Rabe-Hesketh, 2004). However, perspectives 1 and 2 fail to examine absolute differences between two measures (Bland and Altman, 2003). In order to go beyond such an examination of the relationship of the considered diagnostics for criterion validation, the absolute agreement (perspective 3) between the reference method and the alternative diagnostics should be taken into account by analyzing individual differences and systematic biases in agreement between the various methods (Bland and Altman, 1999; Giavarina, 2015).

THE PRESENT STUDY
The aim of the present study is to evaluate pragmatic diagnostics for assessing biological maturity in a representative sample of elite youth soccer players by comparing their applicability for the assessment of skeletal age, which is currently considered the gold standard method (Lloyd et al., 2014). In doing so, the reference method MRI was set as the criterion to determine skeletal age (e.g., Serinelli et al., 2015; Urschler et al., 2016). The MRI approach was chosen in order to avoid possible health risks for players due to unnecessary radiation exposure (i.e., X-ray method). In addition to fulfilling economic, ethical, and pragmatic criteria, the study focuses especially on the criterion validation of different diagnostics to assess BMS. Therefore, the study's main purpose was to investigate the agreement between the radiation-free MRI diagnostics and alternative methods of measuring BMS that are more economical and practical (e.g., in terms of setup and cost), namely (a) skeletal age using a quantitative ultrasound-based device and (b) somatic age utilizing estimates of MO (Mirwald et al., 2002) and PAH (Khamis and Roche, 1994).
These BMS outcomes were related to the reference method MRI using three perspectives of analyses:
• Bivariate correlation analyses of MRI with the alternative BMS diagnostics,
• Multivariate modeling of BMS as a latent variable (measured by MRI) based on alternative BMS diagnostics, and
• Investigation of individual differences and systematic bias in agreement between MRI and alternative BMS diagnostics.

Participants
The study sample consisted of male youth soccer players (N = 63) who were part of the German talent promotion program. Players were born between 2006 and 2008 and belonged to either the U12 (n = 32, 11.3 ± 0.3 years old) or U14 age group (n = 31, 13.4 ± 0.3 years old). For the estimation of an appropriate sample size in each age group, statistical a priori power analyses were performed utilizing G*Power Version 3.1.9.4 (α = 0.05, 1 − β = 0.85, two-tailed). In order to detect at least large effect sizes within the correlational analyses (i.e., r ≥ 0.50; Cohen, 1988), a sample size of at least 30 players in each age group (i.e., U12 and U14) was indicated.
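The order of magnitude of this power analysis can be checked with a short Python sketch. It uses the Fisher z approximation for a two-tailed test of a correlation, which is an assumption made here for illustration; G*Power uses its own routines, so the result need not match the reported figure exactly. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_n_for_correlation(r, alpha=0.05, power=0.85):
    """Approximate sample size to detect correlation r (two-tailed test).

    Uses the Fisher z approximation, not G*Power's own computation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    fisher_z = math.atanh(r)                       # Fisher transform of r
    return math.ceil(((z_alpha + z_power) / fisher_z) ** 2 + 3)


n_large_effect = approx_n_for_correlation(0.50)
```

Under this approximation, the required sample size for r = 0.50 lands in the low thirties, which is in the same range as the group sizes recruited for U12 and U14.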
As the talent promotion program comprises two important levels of promotion in early to middle adolescence (i.e., competence centers and youth academies), the sample included a balanced amount of competence center (U12: n = 16; U14: n = 16) as well as youth academy players (U12: n = 16; U14: n = 15). All players' legal guardian/next of kin provided informed written consent for the collection and scientific use of the data. With respect to the MRI diagnostics, players and their parents were informed about the examination in advance and had to sign a study participation agreement. The research was approved by the ethics committee of the Faculty of Medicine at the University of Frankfurt and the scientific board of the DFB Academy.

Measures
The entire investigation was conducted within 2 weeks at Frankfurt University Hospital and followed a strict, predetermined protocol. Testing for one player, including MRI, ultrasound, and anthropometric data, took about 25 min and took place between 12 and 4 p.m. on the day of assessment. Before every measurement, all players were informed about the detailed assessment procedure by the respective investigators.

Criterion
To assess the reference method for BMS, an MRI of each player's left hand was taken. A 3.0-Tesla MRI (MAGNETOM Prisma, Siemens, Erlangen, Germany) using a dedicated wrist coil was implemented for the native MRI examination. Players were examined in the prone position with the left arm extended (superman position). In the coil, the middle finger was positioned in the same axis as the radius to avoid ulnar or dorsal deviation. The MRI data of the bones of the left hand were evaluated independently by three certified clinical radiologists with different experience levels (1 = specific pediatric radiologist, 2 = more than 20 years, and 3 = more than 3 years of experience in clinical radiology). The conventional Tanner-Whitehouse 2 method (TW2; Tanner et al., 2001; Satoh, 2015) was used to determine the skeletal age to the nearest 0.1 years. Inter-rater reliabilities were found to be excellent for the total sample (ICC = 0.

Predictors
To determine the skeletal age based on the ultrasound examination, the BAUSport™ instrument was used (Rachmiel et al., 2017). The ultrasound device is a small, portable bone sonometer (SonicBone, Rishon Lezion, Israel). It analyzes three sites of the left hand [(1) the distal radius and ulna's secondary ossification centers of the epiphyses at the wrist; (2) the growth plate of the third metacarpal and the shaft of the proximal phalanx; and (3) the distal metacarpal epiphysis at the metacarpals]. The device measures the speed of propagation through bone of inaudible high-frequency waves of a short ultrasound pulse (m/s) and the distance attenuation factor (decay rate). With the use of these parameters, skeletal age was calculated (to the nearest 0.01 years) by an algorithm integrated into the software of BAUSport™ using the scoring method designed by Tanner and Whitehouse (TW2 method; Tanner et al., 2001; Rachmiel et al., 2017). All ultrasound examinations were conducted by a trained person according to the BAUSport™ user manual's instructions. All subjects underwent two measurements. Correlation analyses showed excellent retest reliability for the two measurements (r tt = 0.98). The mean of both measurements comprised players' skeletal age according to ultrasound (SA US).
For anthropometric data assessment, all players were barefoot and wore only shorts. Weight was measured with calibrated scales (seca 813 electronic flat scale) to the nearest 0.1 kg. Height and sitting height were determined to the nearest 0.1 cm with a fixed stadiometer (seca 213 portable stadiometer). For standing height, players had to stand with feet together and arms relaxed. For sitting height, the players sat on a table with an upright trunk and back against the stadiometer. Leg length was indirectly calculated as the difference between standing height and sitting height. In both measurements, the players' head was aligned with the Frankfurt horizontal plane (Malina and Koziel, 2014). Two measurements were taken for each anthropometric variable by the same trained research assistant. Retest reliabilities for all anthropometric measurements were excellent (r tt ≥ 0.99). If the results differed by more than 0.4 kg for weight, or 0.4 cm for height, or 0.4 cm for sitting height, a third measurement was conducted (Mirwald et al., 2002). The findings for each anthropometric measurement were averaged.
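The repeated-measurement rule and the indirect leg length calculation can be sketched in Python as follows. The text does not spell out how a third measurement enters the average, so averaging all available values is an assumption made here; the function names are hypothetical.

```python
def averaged_measurement(first, second, third=None, tolerance=0.4):
    """Average repeated measurements of one anthropometric variable.

    A third value is required when the first two differ by more than
    `tolerance` (0.4 kg or 0.4 cm). Averaging all three values in that
    case is an assumption, not a rule stated in the study.
    """
    if abs(first - second) > tolerance:
        if third is None:
            raise ValueError("first two values disagree; a third measurement is needed")
        values = [first, second, third]
    else:
        values = [first, second]
    return sum(values) / len(values)


def leg_length(height_cm, sitting_height_cm):
    """Leg length as the difference between standing and sitting height."""
    return height_cm - sitting_height_cm
```

For example, `averaged_measurement(150.0, 150.2)` averages the two readings directly, whereas readings of 150.0 and 151.0 would require a third value before an average is returned.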
By additionally recording the body heights of the biological parents (collected by a questionnaire), the somatic age was also estimated using the Khamis-Roche method (Khamis and Roche, 1994). The method enables prediction of players' adult height based on the regression formula:

$$\text{predicted adult height (in cm)} = \beta_0 + \beta_1 \cdot \text{height} + \beta_2 \cdot \text{weight} + \beta_3 \cdot \text{mid-parent height},$$

where $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$ represent age- and gender-specific regression coefficients defined by Khamis and Roche (1994) (for more details, see this original research). In order to control for a potential overestimation of the self-reported heights by parents (Maukonen et al., 2018) and in line with former research (Cumming et al., 2018b), parents' heights were adjusted according to the recommendations of Epstein et al. (1995) before calculating the mid-parent height parameter. By utilizing this adult height prediction, players' PAH (in %) (PAH KR) was calculated by the ratio (height/predicted adult height).
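The calculation of PAH KR can be sketched in Python, assuming the usual linear form of the Khamis-Roche regression with the player's height, weight, and mid-parent height as predictors. The coefficients in the example are invented placeholders, not the published age-specific values, which have to be taken from Khamis and Roche (1994) in the units that table uses. All function names are hypothetical.

```python
def predicted_adult_height(coefficients, height, weight, midparent_height):
    """Linear adult height prediction: b0 + b1*height + b2*weight + b3*midparent."""
    b0, b1, b2, b3 = coefficients
    return b0 + b1 * height + b2 * weight + b3 * midparent_height


def percent_adult_height(height, adult_height):
    """Current height expressed as a percentage of predicted adult height."""
    return 100.0 * height / adult_height


# Placeholder coefficients for illustration only; NOT the published values.
example_coefficients = (10.0, 0.8, 0.1, 0.25)
adult_height = predicted_adult_height(example_coefficients, 150.0, 40.0, 175.0)
pah = percent_adult_height(150.0, adult_height)
```

Keeping the coefficients as an explicit argument makes the age dependence visible: the same player measured a year later would be evaluated with a different coefficient row.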

Data Analysis
Data were analyzed utilizing IBM SPSS version 26 and Mplus Version 8.4. In order to compare the reference method (SA MRI) and the alternative diagnostics for assessing BMS (SA US, MO MIR, and PAH KR) according to the three perspectives of analyses, the following statistical procedures were applied.

Bivariate Correlation Analyses
Pearson's r served as the measure for the correlations between SA MRI and the alternative BMS diagnostics (SA US, MO MIR, and PAH KR) for the total sample as well as for each age group separately.
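A minimal Python sketch of this step is given below: Pearson's r is computed once for the total sample and once per age group. The record layout, key names, and data values are invented for illustration and do not come from the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlations_by_group(records, ref_key, alt_key, group_key="age_group"):
    """Return r for the total sample and for each group found in the records."""
    grouped = {}
    for rec in records:
        grouped.setdefault(rec[group_key], []).append(rec)
    result = {"total": pearson_r([r[ref_key] for r in records],
                                 [r[alt_key] for r in records])}
    for name, members in grouped.items():
        result[name] = pearson_r([r[ref_key] for r in members],
                                 [r[alt_key] for r in members])
    return result


# Invented toy records; "sa_mri" and "sa_us" are hypothetical key names.
players = [
    {"age_group": "U12", "sa_mri": 11.0, "sa_us": 10.8},
    {"age_group": "U12", "sa_mri": 11.5, "sa_us": 11.6},
    {"age_group": "U12", "sa_mri": 12.0, "sa_us": 11.9},
    {"age_group": "U14", "sa_mri": 13.0, "sa_us": 13.2},
    {"age_group": "U14", "sa_mri": 13.5, "sa_us": 13.3},
    {"age_group": "U14", "sa_mri": 14.0, "sa_us": 14.1},
]
r_values = correlations_by_group(players, "sa_mri", "sa_us")
```

Splitting by age group narrows the range of maturity values within each subsample, which is one reason within-group coefficients tend to be lower than the coefficient for the pooled sample.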

Multivariate Latent Structural Equation Modeling
An SEM approach was used to model BMS as a latent construct. Within the measurement model, three different evaluations of SA MRI by the independent clinical experts were defined to load on the latent variable BMS lat. The alternative diagnostics SA US, MO MIR, and PAH KR served as predictors for BMS lat. In accordance with Muthén and Muthén (2010), R² was examined to quantify the amount of variance within BMS lat explained by the utilized predictors within the latent regression model. As the sample sizes within each age group were too low to specify a model for U12 and U14 separately, the SEM was only computed for the total sample. However, in order to also adjust for the classification to an age group for this perspective, all variables were z-standardized within each age group before the model was run.
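The within-group standardization that precedes the model can be sketched in Python as follows. The sketch uses the sample standard deviation (n − 1 in the denominator), which is an assumption because the text does not state which variant was used; the function name and data are hypothetical. The SEM itself was fitted in Mplus and is not reproduced here.

```python
from statistics import mean, stdev


def z_standardize_within_groups(values, groups):
    """Z-standardize each value relative to the mean and SD of its own group."""
    standardized = [0.0] * len(values)
    for group in set(groups):
        idx = [i for i, g in enumerate(groups) if g == group]
        group_values = [values[i] for i in idx]
        group_mean = mean(group_values)
        group_sd = stdev(group_values)  # sample SD; an assumption of this sketch
        for i in idx:
            standardized[i] = (values[i] - group_mean) / group_sd
    return standardized


# Invented skeletal ages for three U12 and three U14 players.
z_scores = z_standardize_within_groups(
    [11.0, 12.0, 13.0, 13.0, 14.0, 15.0],
    ["U12", "U12", "U12", "U14", "U14", "U14"],
)
```

After this step, a 13.0-year skeletal age counts as advanced in the U12 group but as delayed in the U14 group, so age-group membership no longer drives the associations in the pooled model.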

Investigation of Individual Differences and Systematic Bias in Agreement
In addition to the correlative approaches in perspectives 1 and 2, Bland-Altman analyses (Bland and Altman, 1999) were utilized to investigate individual differences as well as systematic biases in agreement between SA MRI and each of the three alternative measures SA US, MO MIR, and PAH KR. Since a comparison between two methods is only reasonable when two measurements are of the same unit, some measurements had to be converted before the analysis. In particular, MO MIR was converted into skeletal age (i.e., SA MIR) based on the mean individual APHV for boys (i.e., 13.8 years; Malina et al., 2004) via the equation SA MIR = MO MIR + 13.8. With respect to PAH KR, there was no possibility for a transformation into skeletal age. For this reason, SA MRI was transformed into values of achieved percentage of adult height (i.e., PAH MRI) by a conversion tool (BoneXpert, 2020) validated by Thodberg et al. (2009). Finally, players' PAH MRI was determined as the ratio of their current height and their predicted adult height. As the BoneXpert conversion is restricted to individuals where the absolute difference between their skeletal age and CA is < 3.5 years, three players had to be excluded from this part of the analyses.
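The unit conversion for MO MIR and the eligibility rule for the BoneXpert-based conversion described above can be written as a short Python sketch. The function and constant names are hypothetical, and the BoneXpert conversion itself is not reproduced.

```python
MEAN_APHV_BOYS = 13.8  # mean age at peak height velocity for boys (Malina et al., 2004)


def offset_to_skeletal_age(maturity_offset):
    """SA MIR = MO MIR + 13.8, putting the offset on the skeletal age scale."""
    return maturity_offset + MEAN_APHV_BOYS


def eligible_for_height_conversion(skeletal_age, chronological_age, max_gap=3.5):
    """True if |skeletal age - CA| is below the 3.5-year limit of the conversion."""
    return abs(skeletal_age - chronological_age) < max_gap
```

For instance, a player with a maturity offset of −1.2 years is assigned an SA MIR of 12.6 years, and a player whose skeletal age exceeds his CA by 3.7 years would be dropped from the PAH comparison.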
In accordance with Bland and Altman (1999), the average of the two measures to be compared (i.e., SA MRI and SA US, SA MRI and SA MIR, and PAH MRI and PAH KR, respectively) constituted the x-axis, whereas the differences between the measures (SA MRI − SA US, SA MRI − SA MIR, and PAH MRI − PAH KR, respectively) were depicted on the y-axis of the plots. Additionally, the mean difference and the corresponding 95% limits of agreement were computed and marked within the graphs according to Bland and Altman (2003). Finally, one-sample t-tests were utilized in order to examine whether there was a significant systematic bias between two measurements, which was indicated if the average of the differences between measurements deviated significantly from zero.
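The quantities behind such a Bland-Altman analysis can be sketched in Python as shown below. The limits of agreement are taken as the mean difference ± 1.96 standard deviations of the differences, the conventional choice; the text does not state the multiplier, so this is an assumption. The sketch returns the t statistic of the one-sample test but no p-value, which would require a t distribution (available, for example, in SciPy). Names and data are invented.

```python
import math
from statistics import mean, stdev


def bland_altman(reference, alternative):
    """Bias, 95% limits of agreement, t statistic, and outlier indices."""
    diffs = [r - a for r, a in zip(reference, alternative)]         # y-axis values
    averages = [(r + a) / 2 for r, a in zip(reference, alternative)]  # x-axis values
    bias = mean(diffs)
    sd = stdev(diffs)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    # One-sample t statistic testing whether the mean difference is zero.
    t_stat = bias / (sd / math.sqrt(len(diffs)))
    # Individuals whose difference falls outside the limits of agreement.
    outliers = [i for i, d in enumerate(diffs) if d < lower or d > upper]
    return {
        "averages": averages, "differences": diffs, "bias": bias,
        "limits": (lower, upper), "t": t_stat, "df": len(diffs) - 1,
        "outliers": outliers,
    }


# Invented skeletal ages from a reference and an alternative method.
result = bland_altman([12.0, 13.0, 11.5, 14.0], [11.8, 12.5, 11.6, 13.4])
```

A positive bias in this sketch means the reference method yields higher values on average than the alternative method; the outlier list mirrors the individual-level inspection of players falling outside the limits of agreement.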

RESULTS
Descriptive statistics for all maturity-related outcomes for the total sample as well as separated by age group are displayed in Table 1.

Bivariate Correlation Analyses
The results of the correlation analyses with respect to various BMS outcomes are presented in Table 2. All correlations were found to be significant (p < 0.001). With respect to the total sample, correlation coefficients of SA MRI and the further BMS variables ranged from r = 0.80 (SA MRI, SA US) to r = 0.84 (SA MRI, MO MIR). When looking at the age groups U12 and U14 separately, correlations for U14 were higher for BMS variables (0.61 ≤ r ≤ 0.74) than those for U12 (0.56 ≤ r ≤ 0.67).

Multivariate Latent Structural Equation Modeling
The latent SEM revealed a significant influence on the latent factor BMS lat for the variable MO MIR (β = 0.51, p < 0.05). Due to high correlations between MO MIR and the other alternative diagnostics PAH KR (r = 0.78) and SA US (r = 0.87), the additional contribution of PAH KR (β = 0.27, p = 0.06) and, in particular, of SA US (β = −0.03, p = 0.90) within the regression model was rather small and not significant.

Investigation of Individual Differences and Systematic Biases in Agreement
The results from the Bland-Altman analyses are shown in Figure 2. The differences between SA MRI and the other diagnostics did not appear to vary systematically with the mean value of the two measurements. Furthermore, the investigation at an individual level showed that nearly all differences between the reference method (SA MRI) and each alternative BMS diagnostic were within the 95% limits of agreement. In total, five individuals were identified as outliers (i.e., players whose differences fell outside of the 95% limits of agreement) by at least one of the three comparisons between the reference method and the pragmatic BMS diagnostics. All three outliers identified by SA US were also detected by at least one further comparison. However, both SA MIR and PAH KR found one outlier each that was not recognized by another comparison. Moreover, while the average of the differences for the comparison of SA MRI

DISCUSSION
In addressing the demand of both practitioners and researchers to integrate information about players' BMS within the processes of talent promotion, the present study evaluated various BMS diagnostics within a representative setting. Highly talented adolescent players (i.e., U12 and U14) from the two main institutions of the German talent promotion program (i.e., competence centers and youth academies) underwent a test battery consisting of the (costly and time-intensive) reference method (SA MRI) as well as additional, more pragmatic diagnostics (SA US, MO MIR, and PAH KR) that could be applied, among other things, in an area-wide setting. Following the idea of a comprehensive evaluation, diagnostics were related to the reference method from three different perspectives (i.e., bivariate correlation analyses, multivariate latent SEM approach, and investigation of individual differences and systematic deviations). The comparison between the reference method and the alternative diagnostics (perspective 1) revealed strong correlations for the total sample (r ≥ 0.80) and, as expected due to a lower variance in age, slightly lower correlations regarding U12 (r ≥ 0.56) and U14 (r ≥ 0.61) separately. The multivariate SEM approach (perspective 2) allowed for an accurate investigation (adjusted for age group) of the alternative diagnostics' conformity with BMS as a latent construct free of measurement errors (BMS lat). BMS lat was measured by three evaluations of the reference method by independent experts. Overall, the three alternative BMS outcomes (in combination) significantly predicted BMS lat. Perspective 3 added value to the diagnostics' evaluation by considering differences and systematic deviations at the individual level. The procedures' average difference exposed a systematic bias for SA MIR. Furthermore, the comparisons revealed high standard deviations for the differences between the reference method and the pragmatic diagnostics.
With respect to detected outliers, a high degree of agreement was achieved among comparisons.

FIGURE 1 | Latent structural equation modeling (SEM): biological maturity status (BMS) as a latent construct predicted by the alternative BMS diagnostics. BMS lat, biological maturity status as a latent construct; SA MRI(i), evaluation of skeletal age based on magnetic resonance imaging by rater i; SA US, skeletal age determined by mobile ultrasound device; MO MIR, maturity offset according to Mirwald et al. (2002); PAH KR, percentage of adult height according to Khamis and Roche (1994).

Former research that aimed to validate different diagnostics to assess BMS in youth players mainly utilized correlative approaches, that is, perspective 1 (e.g., Malina et al., 2007b in American football; Malina et al., 2012 and Romann et al., 2017 in soccer; Myburgh et al., 2019 in tennis). While the players in those studies were of similar CA to the participants of the present study, all authors used maturity categories (early, on time, and late) instead of continuous outcomes to define players' BMS. This may limit the comparability to the present study, which utilized continuous BMS variables (e.g., skeletal age) in order to differentiate more precisely between diagnostics. Nevertheless, the results of categorical classifications were compared with the results of the current study to establish a reference to existing literature. In general, the correlations between diagnostics were found to be higher in the present study. In a study of 143 male American football players (9.27-14.24 years), Malina et al. (2007b) found a lower correlation (r s = 0.52) between the skeletal age measured by left hand-wrist radiographs (evaluated by the Fels method) and PAH KR compared with the result of this investigation (r = 0.84). Similar (even lower) results were found by Myburgh et al. (2019) in a study with 40 male British junior tennis players (12.5 ± 1.8 years). Utilizing the same method as Malina et al. (2007b), this study evaluated PAH KR as well as the predicted APHV (Mirwald et al., 2002) and found limited agreement between the various assessment methods (PAH KR : r s = 0.35; predicted APHV: r s = 0.37). With respect to comparable studies in soccer, Romann et al. (2017) found only moderate rank correlations (r s = 0.42) between skeletal age (measured by X-ray) and predicted APHV in male Swiss soccer players (13.9 ± 1.8 years), while SA MRI and MO MIR were highly correlated within this investigation (r = 0.81). 
A similar pattern holds true for a study of 11- to 12-year-old (n = 87) and 13- to 14-year-old (n = 93) male soccer players evaluating the relationship among indicators of BMS (i.e., skeletal age based on the Fels method, predicted APHV, and PAH KR ; Malina et al., 2012). While results showed small to moderate Spearman rank correlations for both age groups (11-12 years: 0.26 ≤ r s ≤ 0.43; 13-14 years: 0.29 ≤ r s ≤ 0.47), Pearson coefficients of the present study for the corresponding age groups (U12: 0.56 ≤ r ≤ 0.67; U14: 0.61 ≤ r ≤ 0.74) were large. These higher correlations could be explained by the more differentiated assessment of BMS outcomes (i.e., continuous rather than categorized), which might be indicated when evaluating potential alternative BMS diagnostics.
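The pattern reported for perspective 1 (strong correlations in the total sample, lower ones within U12 and U14) is what a restricted range of age within each group would be expected to produce. The following minimal Python sketch illustrates this with invented values; the data, the variable names, and the helper `pearson_r` are assumptions made for illustration only and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical skeletal ages in years (NOT study data): a reference method
# and one alternative diagnostic for two age groups.
u12_reference = [11.0, 11.5, 12.0, 12.5, 13.0]
u12_alternative = [11.6, 11.2, 12.4, 12.0, 12.8]
u14_reference = [13.0, 13.5, 14.0, 14.5, 15.0]
u14_alternative = [13.6, 13.2, 14.4, 14.0, 14.8]

# Correlation within each age group (narrow range of maturity).
r_u12 = pearson_r(u12_reference, u12_alternative)
r_u14 = pearson_r(u14_reference, u14_alternative)

# Correlation in the pooled sample (wider range of maturity).
r_total = pearson_r(
    u12_reference + u14_reference,
    u12_alternative + u14_alternative,
)

print(f"U12: r = {r_u12:.2f}, U14: r = {r_u14:.2f}, total: r = {r_total:.2f}")
```

In this constructed example the measurement disagreement is identical in both groups, yet the pooled coefficient exceeds both within-group coefficients simply because the pooled sample spans a wider range of maturity.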
Moreover, research analyzing the multivariate correspondence between diagnostics is scarce. An exception is the study of Malina et al. (2012), which evaluated the relationship among BMS indicators in young soccer players. The authors found that the chosen indicators (i.e., skeletal age based on radiographs, APHV, PAH KR , and stage of pubic hair) showed one principal factor (i.e., one dimension) within a principal component analysis for 13- to 14-year-old players. This finding is in line with the results of the present study, where the three alternative diagnostics significantly predicted BMS lat (perspective 2). This provides evidence that, in a realistic setting of highly selected male youth soccer players, alternative diagnostics such as SA US , MO MIR , and PAH KR may be used to assess BMS more pragmatically and efficiently in order to incorporate players' BMS as an important criterion within the talent promotion process (i.e., in terms of selections and bio-banding; Cumming, 2018). However, it was particularly MO MIR (β = 0.51) that predicted BMS lat within the regression model. The influences of SA US and PAH KR were rather small (β ≤ 0.27) because of the high correlations (r ≥ 0.78) among the three alternative measurements (i.e., collinearity). The use of similar information to compute BMS (i.e., CA and anthropometric measurements) within those measurements may, among other factors, have led to these high correlations. On the one hand, this may lead to the conclusion that the use of only one measurement may be sufficient. On the other hand, the use of the combination of the three alternative methods leads to a higher degree of explained variance without overly magnifying the efforts that come with the assessment.

FIGURE 2 | Bland-Altman plots: individual differences of SA MRI and the alternative BMS diagnostics. SA MRI , skeletal age determined by magnetic resonance imaging; SA US , skeletal age determined by mobile ultrasound device; SA MIR , maturity offset (Mirwald et al., 2002) transformed to skeletal age according to Thodberg et al. (2009); PAH KR , percentage of adult height according to Khamis and Roche (1994); PAH MRI , skeletal age determined by magnetic resonance imaging converted to percentage of adult height according to Thodberg et al. (2009).
A further benefit of such a combined approach could be obtained from the investigation of individual differences between the various diagnostics, which provides a more comprehensive view of players' BMS (e.g., to detect systematic bias between two diagnostics). As demonstrated in perspective 3, systematic bias was found between the reference method and the measurement SA MIR . Although these pragmatic diagnostics may be easily utilized within an area-wide setting (e.g., a large number of players within a nationwide promotion program), their use at the individual level must be considered with caution. The systematic bias for the comparison with SA MIR as well as the relatively high variability of the differences (e.g., SD = 0.74 years for SA MIR ) indicates considerable deviations between the alternative diagnostics and the reference method. These distinct differences seem problematic when using a single alternative diagnostic in practice in order to get reliable BMS information at an individual level. Instead, the use of at least two alternative diagnostics may be helpful in order to adjust for the deviations between the pragmatic diagnostics and the reference method.
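The quantities behind such a comparison (average difference, standard deviation of the differences, and limits of agreement) can be sketched in a few lines of Python. This is a minimal illustration of a standard Bland-Altman computation, not the analysis script of the present study: the skeletal ages are invented, the names `Agreement` and `bland_altman` are hypothetical, and the paired t statistic is only one common way to check whether the mean difference deviates from zero.

```python
from math import sqrt
from statistics import mean, stdev
from typing import NamedTuple, Sequence


class Agreement(NamedTuple):
    bias: float         # mean difference (alternative minus reference)
    sd: float           # sample standard deviation of the differences
    lower: float        # lower 95% limit of agreement
    upper: float        # upper 95% limit of agreement
    t_statistic: float  # paired t statistic for H0: bias = 0


def bland_altman(reference: Sequence[float],
                 alternative: Sequence[float]) -> Agreement:
    """Summarize agreement between two methods measured on the same players."""
    diffs = [a - r for r, a in zip(reference, alternative)]
    bias = mean(diffs)
    sd = stdev(diffs)
    half_width = 1.96 * sd  # conventional 95% limits of agreement
    t_statistic = bias / (sd / sqrt(len(diffs)))
    return Agreement(bias, sd, bias - half_width, bias + half_width, t_statistic)


# Hypothetical skeletal ages in years (NOT study data).
sa_reference = [12.0, 12.5, 13.0, 13.5, 14.0, 14.5]
sa_alternative = [12.6, 12.9, 13.9, 13.8, 15.0, 14.8]

result = bland_altman(sa_reference, sa_alternative)
print(f"bias = {result.bias:.2f} years, SD = {result.sd:.2f} years")
print(f"limits of agreement: {result.lower:.2f} to {result.upper:.2f} years")
```

A bias clearly different from zero points to a systematic deviation of the alternative diagnostic, whereas wide limits of agreement indicate that individual players may be misjudged even when the average agreement is good.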
Consequently, the findings from the present study may help practitioners aiming to integrate information about players' BMS within talent promotion. Perspective 1 showed that all considered alternative diagnostics correlate highly with the reference method and, therefore, may be used as more economical assessment methods for BMS. Similarly, perspective 2 revealed that a combined, multivariate use of the alternative measurements significantly predicted BMS and led to slightly higher explanatory power. Even though MO MIR had the highest impact on BMS, the strong correlations between the pragmatic diagnostics did not allow a conclusion as to which diagnostic should be preferred. In contrast to perspectives 1 and 2, perspective 3 was able to detect individual differences and systematic deviations that might be controlled for by using more than one pragmatic BMS diagnostic in practice.

Limitations and Perspectives
While the main focus of the present study was the investigation of BMS, further aspects of the maturation process, namely, "maturity timing" and "maturity tempo," may be considered to determine the biological maturity of youth players in sports. The maturity timing approach describes specific maturational events that occur at a certain point of time at a different CA for every player (Swain et al., 2018). Such events include the estimated APHV (te Wierike et al., 2015), menarche status (Lloyd et al., 2014), or the age of first ejaculation (Mattila et al., 2008). In addition to these objective diagnostics, further approaches in talent research exist that determine maturity timing morphologically by holistic, subjective expert judgments (Romann et al., 2017). In those assessments, responsible coaches evaluate players independently according to certain characteristics (e.g., morphology). From an economic perspective, such a method offers advantages; however, a certain level of experience is essential, and comprehensive evaluation of the reliability and validity of these expert judgments is still pending. For both objective and subjective approaches, individuals are categorized as early, on-time, or late maturing players (e.g., Romann et al., 2017; Myburgh et al., 2019). Maturity tempo examines how fast or slowly a child develops biologically (Mendle et al., 2019) and refers to the rate at which maturation progresses between (at least) two measurement points (Howard et al., 2016; Malina, 2017; Radnor et al., 2018). As with BMS and maturity timing, various approaches to determine maturity tempo exist in the literature (e.g., rate between beginning and end of the adolescent growth spurt; Wormhoudt et al., 2017).
However, the literature is inconsistent with regard to definitions (Cheng et al., 2020) and to which indicators are assigned to which approaches (BMS vs. maturity timing vs. maturity tempo). For example, several authors use APHV (Buchheit and Mendez-Villanueva, 2014; Deprez et al., 2015) as an indicator of BMS, despite the fact that the review by Swain et al. (2018) argues that APHV is an indicator of maturity timing. To the best of the authors' knowledge, both approaches are possible but investigate different aspects; a more precise consideration of this issue is needed. For instance, Mirwald's equation (Mirwald et al., 2002) yields indicators of both BMS and maturity timing. While MO MIR should be used as an indicator of BMS (as in the present study), the difference between a player's individual APHV and the general APHV for boys (i.e., 13.8 years; Malina et al., 2004) provides an indicator of maturity timing. Consequently, future research should carefully choose the right approach for determining an indicator that corresponds to the specific aspect of the maturation process to be investigated. While the present study analyzed BMS outcomes, maturity timing and maturity tempo outcomes (ideally in a longitudinal research design) would be of future interest.
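The distinction between the two indicators can be made concrete with a short Python sketch. It assumes the standard relation that the predicted APHV equals chronological age minus maturity offset, and it takes the maturity offset as an input rather than reproducing the regression of Mirwald et al. (2002). The function names and the example player are hypothetical; the reference value of 13.8 years is the one cited in the text.

```python
REFERENCE_APHV_BOYS = 13.8  # years; general APHV for boys as cited in the text


def predicted_aphv(chronological_age: float, maturity_offset: float) -> float:
    """Predicted age at peak height velocity in years.

    The maturity offset is the estimated time before (negative) or after
    (positive) peak height velocity, so subtracting it from chronological
    age yields the predicted APHV.
    """
    return chronological_age - maturity_offset


def maturity_timing(chronological_age: float,
                    maturity_offset: float,
                    reference_aphv: float = REFERENCE_APHV_BOYS) -> float:
    """Difference between a player's predicted APHV and the reference APHV.

    Negative values indicate an earlier, positive values a later predicted
    APHV than the reference.
    """
    return predicted_aphv(chronological_age, maturity_offset) - reference_aphv


# Hypothetical player (NOT study data): 13.2 years old, 0.9 years before PHV.
age, offset = 13.2, -0.9
aphv = predicted_aphv(age, offset)
timing = maturity_timing(age, offset)
print(f"maturity offset (BMS indicator): {offset:+.1f} years")
print(f"predicted APHV: {aphv:.1f} years, timing vs. reference: {timing:+.1f} years")
```

The same maturity offset therefore serves as a status indicator on its own and as a timing indicator only after it has been combined with chronological age and compared with a reference APHV.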
As a limitation of the present study and of the pragmatic assessment of indicators of somatic age in general, it has to be considered that both MO MIR and PAH KR appear to be very sensitive to parameters such as leg length and standing height. Therefore, in order to ensure precise measurement of these parameters, practitioners should, beyond the use of calibrated measurement devices, control for potential physiological confounding variables. For instance, height and weight might vary at different times of the day (e.g., in the morning/evening or before/after practice; Orsama et al., 2014). For this reason, practitioners should try to maintain a standardized measurement procedure by determining consistent time slots for measuring their players. In addition, concerning the PAH KR method, which includes a mid-parent height parameter, it has been remarked that people tend to overestimate their own height (Maukonen et al., 2018). This, in turn, may distort the PAH KR values for the respective player and indicates the need for an objective assessment by an independent observer. However, research investigating the measurement error in PAH KR values that arises from using self-reported rather than objectively assessed parental height is scarce. To the best of the authors' knowledge, only one equation exists in which the self-reported height is adjusted, developed by Epstein et al. (1995). While this equation was used in the present study as well as in some current studies (e.g., Cumming et al., 2018b), more research is needed to find an appropriate correction formula that reduces measurement errors caused by overestimated self-reported height.
Moreover, players' ethnicity was not taken into account in this study. A potential influence of ethnicity on skeletal age is controversially discussed among researchers. While Timme et al. (2017) emphasize that no impact of ethnicity exists, current studies found significant differences in skeletal age between African and European populations (e.g., Grgic et al., 2020). However, the focus of the present study lay in the comparison of different pragmatic methods with the MRI diagnostics, not least because of the effort in terms of time and costs associated with the MRI diagnostics, and the study's sample size was too small to examine the impact of ethnicity across different groups. Thus, comprehensive validation studies are needed to investigate potential differences when determining BMS for several ethnic groups. Future studies (ideally in a longitudinal design) should therefore control for a possible impact of ethnicity when examining BMS, and the use of an ethnicity-specific formula might be helpful for this issue. However, to date, there is no formula that could account for ethnicity-specific assessment of BMS.

CONCLUSION
The results suggest that the use of SA US , MO MIR , and PAH KR for measuring BMS is more pragmatic in terms of cost and time as compared with MRI diagnostics. Based on a general agreement between these pragmatic diagnostics and the reference method MRI in all three perspectives, the alternative methods can be used to determine BMS among (male) elite youth soccer players. Since caution is required with respect to the precision of the measurements at the individual level, the simultaneous use of at least two alternative diagnostics is recommended in order to get a more reliable BMS outcome. Further research is needed that evaluates both the implementation of BMS diagnostics in practice and their usefulness in terms of bio-banding in youth soccer (e.g., Romann et al., 2020).

DATA AVAILABILITY STATEMENT
A de-identified version of the raw data supporting the conclusions of this article will be made available by the authors upon reasonable request.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the Faculty of Medicine at the University of Frankfurt. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
OH, DLe, and DM: conceptualization and methodology. OH: data curation and supervision. DLe: formal analysis. OH and TH: funding acquisition. OH, DLe, DM, GS, MR, DLü, LB, and KE: investigation. OH, MR, and KE: project administration. DLe and DM: validation, visualization, and writing - original draft. DLe, DM, OH, LB, KE, TH, DLü, MR, and GS: writing - review and editing. All authors contributed to the article and approved the submitted version.

FUNDING
This study was part of the research project "Scientific support of the DFB's Talent Development Program," which was funded by the DFB (Deutscher Fußball-Bund, DFB, http://www.dfb.de). We acknowledge support by the Open Access Publishing Fund of the University of Tübingen.