Körperkoordinationstest Für Kinder (KTK) for Brazilian Children and Adolescents: Factor Analysis, Invariance and Factor Score

The decline in children's motor competence, with a consequent reduction in physical activity and fitness levels and a negative impact on health, has affected children across countries. Addressing it requires appropriate assessment instruments in addition to consistent intervention strategies. The Körperkoordinationstest Für Kinder (KTK) is a reliable and low-cost motor coordination (MC) test used in several countries, but it lacks psychometric evidence in the Brazilian population. The present study investigated the factor structure of the KTK in a Brazilian sample and compared four methods of calculating the factor score of the test: the sum of the raw scores, the sum of the standard scores, the weighted method, and the refined method. The participants were 565 volunteers (49.9% boys) from 5 to 10 years of age (7.93 ± 1.51), with a mean body mass index (BMI) of 17.04 (±2.81). The results showed that the KTK factor structure fit the data adequately for the total sample, by sex, and by age group. However, the results did not confirm invariance between sexes and age groups. In addition, our results showed that the sum of the raw scores of the subtests could be used as the factor score method for the KTK. We conclude that the KTK is a valid test to measure the MC of Brazilian children and adolescents, with features that qualify it as a useful instrument for both research and practice.


INTRODUCTION
The engagement of children and young people in physical activity has been decreasing in many countries (Dollman et al., 2005), and part of this population has adopted a predominantly sedentary lifestyle (Photiou et al., 2008). Consequently, there has been a considerable increase in the number of young people who are overweight and have low physical fitness, in addition to a rise in the incidence of diseases associated with physical inactivity, such as obesity (Booth et al., 2012), among children and adolescents.
Among the various strategies used is increasing physical activity levels through the development of children's motor competence (Stodden et al., 2008). Studies have pointed to positive associations between these two variables, showing that children with high levels of motor competence tend to engage more in physical activities (Wrotniak et al., 2006; Haga et al., 2008; Lubans et al., 2010). The expression motor competence, according to Cattuzzo et al. (2016), is used in a global perspective and covers all forms of goal-directed tasks that involve coordination and control of the human body. Thus, it is essential to stimulate the development of motor competence from childhood (Hoeboer et al., 2016). For this to occur, in addition to consistent and coherent work, systematic assessment is important to measure children's progress in motor competence levels (Fransen et al., 2014).
Among the reliable instruments to assess motor competence in children and adolescents, the Körperkoordinationstest Für Kinder (KTK) is one of the most commonly used (e.g., Bardid et al., 2015; Cattuzzo et al., 2016; Rudd et al., 2016). The KTK was developed in Germany to test children and adolescents from 5 to 14 years of age (Kiphard and Schilling, 1974). It is considered a relatively simple test, easy to perform, with objective measures and low operational cost (Cools et al., 2009), characteristics that may favor the expansion of its use for both research purposes and the daily activities of Physical Education teachers and sports coaches. The test encompasses components of motor coordination (MC) and consists of four tasks: (1) walking backward along a balance beam of decreasing width: 6.0, 4.5, and 3.0 cm (WB); (2) two-legged jumping from side to side for 15 s (JS); (3) moving sideways on wooden boards for 20 s (MS); and (4) hopping for height (HH), which consists of one-legged hopping over a foam obstacle with increasing height in consecutive steps of 5 cm. The scores obtained in each subtest are compared to the original normative data and transformed into motor quotients for each task. The sum of the four standardized item scores results in the overall motor quotient (MQ) of the KTK (Kiphard and Schilling, 1974).
To the best of our knowledge, the factorial structure of the KTK has not been tested in Brazilian children and adolescents. Thus, considering the importance of the KTK as a motor assessment tool, it seems relevant to test its factorial structure based on data obtained from evaluations performed with Brazilian children and adolescents. Moreover, the original normative values of the KTK were established more than 40 years ago in Germany, in economic, social, and cultural contexts (Robinson et al., 2015) that are very different from the Brazilian ones.
Another point that deserves investigation concerns the result provided by the KTK, the MQ, which can also be treated as a factor score (DiStefano et al., 2009). According to DiStefano et al. (2009), factor scores are used to summarize the results obtained in the various items of an instrument in one or more factors. In the KTK, the MQ summarizes in a single value the results obtained in the four subtests. To summarize MC in a single value, four different calculation methods can be used, each with specific strengths and limitations. When the sum of the raw scores of the subtests is adopted, the tasks with higher values weigh more in the calculation of the MC. For example, the maximum score that can be achieved in WB is 72 points, while in JS the maximum value predicted for girls is 110 points; that is, the tasks differ and are measured in different units. The sum of the standard scores leaves another important conceptual problem: it assumes that the subtests have the same importance in the calculation of the MC, which may not hold given the psychometric properties of the KTK (estimated with Confirmatory Factor Analysis). Thus, other methods could be more suitable for calculating the factor score, specifically the "Weighted Method" (for more details, see Albuquerque et al., 2017) and the "Refined Method" (for more details, see DiStefano et al., 2009). These procedures may be more robust than the sum of the raw scores and the sum of the standard scores.
Therefore, the present study investigates the factor structure of the KTK in a Brazilian sample and compares four methods of calculating the factor score of the test: the sum of the raw scores, the sum of the standard scores, the weighted method, and the refined method.

Participants
The participants were 565 volunteers from 5 to 10 years of age, with a mean age of 7.93 (±1.51) years and a mean body mass index (BMI) of 17.04 (±2.81). Of these, 282 (49.9%) were boys and 283 (50.1%) were girls, all regularly enrolled in public and private schools in Minas Gerais (a state of Brazil) and attending classes from the 1st to the 5th grade of elementary school.
The ethical committee of the Institutional Review Board of the Universidade Federal de Viçosa approved the study (26874614.1.0000.5153). The caregivers legally responsible for the children were informed about the goals and relevance of the research, as well as the procedures that would be adopted. In addition, they signed the consent form authorizing the children's participation.

Motor Coordination Assessment
The MC of the participants was assessed with the KTK (Körperkoordinationstest Für Kinder), developed by Kiphard and Schilling (1974). The test involves components of MC, such as balance, rhythm, strength, laterality, speed, and agility (Scordella et al., 2015). The test consists of four tasks: (1) walking backward (WB) along a balance beam of decreasing width, from 6.0 cm to 4.5 cm to 3.0 cm; (2) two-legged jumping from side to side for 15 s (JS); (3) moving sideways on wooden boards for 20 s (MS); and (4) hopping for height (HH), one-legged, over a foam obstacle with increasing height in consecutive steps of 5 cm (Rudd et al., 2016).

Procedure
Participants' full name, sex, and date of birth were obtained using a questionnaire. The KTK test was conducted at the participants' schools. The first task was WB, followed by the subtests JS, MS, and HH, following all guidelines established by the authors (Kiphard and Schilling, 1974). The assessors underwent training sessions both in groups (two times) and individually (more than four times). The scores of 50 participants (approximately 10% of the sample) were used to assess raters' agreement for all subtests. Raters' reliability was assessed with Cohen's kappa test, which indicated an agreement above 80% in all cases.
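As a rough illustration of the agreement statistic named above, the following sketch computes Cohen's kappa for two raters who label the same items. The function name and the example ratings are hypothetical; the source does not describe how the KTK scores were coded for this analysis.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical labels on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings, for illustration only.
kappa = cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1])
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, so it is lower than simple percent agreement whenever the raters disagree at all.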

Statistical Analysis
For the construct validity of the KTK, a confirmatory factor analysis (CFA) was conducted. The maximum likelihood (ML) estimator was used, since it is recommended when data are continuous. To assess the fit of the proposed model, we examined the χ² (chi-square), CFI (comparative fit index), TLI (Tucker-Lewis index), RMSEA (root mean square error of approximation), and SRMR (standardized root mean square residual), following the recommended literature (Hu and Bentler, 1999; Brown, 2006). Recognized values were adopted as criteria for a satisfactory model fit: CFI and TLI greater than 0.90; RMSEA close to or less than 0.06; and SRMR close to or less than 0.08 (Hu and Bentler, 1999; Brown, 2006). In addition, Cronbach's alpha was used to measure reliability.
A multigroup confirmatory factor analysis (MGCFA), using the ML estimation procedure, was used to test the assumption of KTK invariance across sex and age groups. Factorial invariance testing followed a series of hierarchical steps, each adding constraints across groups. An initial confirmatory analysis tested the proposed model in each sex separately. Next, it was tested whether the same parameters existed for both sexes (configural invariance). Then, factor loadings (metric invariance), item intercepts (scalar invariance), and residual variances (strict invariance) were investigated (Hirschfeld and von Brachel, 2014). As recommended by many authors (e.g., Brown, 2006; Kline, 2011), the model fit was evaluated using (a) χ² goodness-of-fit; (b) root mean square error of approximation (RMSEA; with values lower than 0.08 being indicative of acceptable fit to the data); and (c) comparative fit index (CFI; with values greater than 0.90). A change greater than 0.01 in CFI between the configural and metric invariance models, in addition to a change greater than 0.02 in RMSEA, indicated non-invariance, while a change greater than 0.01 and 0.02 for CFI and RMSEA, respectively, would refute scalar or strict invariance (Vandenberg and Lance, 2000).
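The fit criteria can be expressed as a small check. This is a minimal sketch rather than the authors' procedure: it reads the CFI and TLI criterion in the conventional direction (values above 0.90 are acceptable) and treats "close to or less than" as a plain upper bound, which is stricter than the wording. The function name and the example values are hypothetical.

```python
def check_fit(cfi, tli, rmsea, srmr):
    """Compare fit indices with the cut-offs used in this section."""
    checks = {
        "CFI": cfi > 0.90,      # higher is better
        "TLI": tli > 0.90,      # higher is better
        "RMSEA": rmsea <= 0.06,  # lower is better
        "SRMR": srmr <= 0.08,    # lower is better
    }
    # The model is flagged as satisfactory only if every index passes.
    checks["satisfactory"] = all(checks.values())
    return checks


# Hypothetical index values, for illustration only.
result = check_fit(cfi=0.97, tli=0.95, rmsea=0.05, srmr=0.03)
```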
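The comparison of nested invariance models can be sketched as follows. The sketch assumes the usual reading of the decision rule: a drop in CFI larger than 0.01 together with a rise in RMSEA larger than 0.02, when moving to the more constrained model, signals non-invariance. The function name, the dictionary keys, and the fit values are hypothetical.

```python
def compare_nested(less_constrained, more_constrained,
                   cfi_cut=0.01, rmsea_cut=0.02):
    """Compare two nested invariance models via changes in CFI and RMSEA."""
    # Positive deltas mean that fit worsened after adding constraints.
    delta_cfi = less_constrained["cfi"] - more_constrained["cfi"]
    delta_rmsea = more_constrained["rmsea"] - less_constrained["rmsea"]
    return {
        "delta_cfi": delta_cfi,
        "delta_rmsea": delta_rmsea,
        # Assumption: both changes must exceed their cut-offs.
        "non_invariant": delta_cfi > cfi_cut and delta_rmsea > rmsea_cut,
    }


# Hypothetical fit values for three nested models.
configural = {"cfi": 0.980, "rmsea": 0.040}
metric = {"cfi": 0.975, "rmsea": 0.045}
scalar = {"cfi": 0.930, "rmsea": 0.080}

metric_step = compare_nested(configural, metric)
scalar_step = compare_nested(metric, scalar)
```

Each step is compared with the model immediately before it in the hierarchy (configural, metric, scalar, strict), so a failure at one level is attributed to the constraints added at that level.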
The sum of the scores, for the whole sample and separately by sex, was calculated as the sum of the raw scores of the subtests (Equation 1): sum of raw scores = WB + JS + MS + HH, where WB, JS, MS, and HH represent the Walking Backward, Jumping Sideways, Moving Sideways, and Hopping for Height raw scores, respectively.
Regarding the sum of the standard scores, to use only positive numbers, we chose to transform each raw value with "min-max scaling" (Equation 2), which transforms the data so that the values fall within a specific range [0 to 1]: x' = (x − xmin) / (xmax − xmin), where x' is the normalized value, x is the raw value, xmin is the minimum value found in the sample, and xmax is the maximum value found in the sample. This procedure was conducted for each subtest separately. Subsequently, the transformed values of the subtests were summed to calculate the MQ standard (Equation 3): MQ standard = WB' + JS' + MS' + HH', where the prime denotes the value transformed with Equation 2.
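The min-max transformation and the sum of the standard scores can be sketched as below. The function names and the raw scores are hypothetical, and the scaling uses the minimum and maximum observed in the sample for each subtest, as the text describes.

```python
def min_max_scale(values):
    """Rescale values to [0, 1] using the sample minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def sum_of_standard_scores(subtests):
    """Sum the min-max scaled subtest scores for each participant.

    subtests maps a subtest name to a list of raw scores, with all
    lists in the same participant order.
    """
    scaled = {name: min_max_scale(vals) for name, vals in subtests.items()}
    n = len(next(iter(scaled.values())))
    return [sum(scaled[name][i] for name in scaled) for i in range(n)]


# Hypothetical raw scores for three participants.
raw = {
    "WB": [10, 20, 30],
    "JS": [40, 60, 80],
    "MS": [5, 10, 15],
    "HH": [0, 25, 50],
}
mq_standard = sum_of_standard_scores(raw)
```

Because each subtest is rescaled to the same range, the result lies between 0 and 4, and no subtest dominates simply because its raw scale is larger.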
Weighted method (MQ weighted): First, because the raw scores of the subtests are expressed in different units of measure, they were transformed (using Equation 2) as done for the sum of the standard scores. Second, the sum of the factor loadings of the subtests, extracted from the Confirmatory Factor Analysis, was calculated. Third, each item's factor loading was standardized by dividing it by the sum of the factor loadings. Finally, the factor score was computed by multiplying each transformed item score (Equation 2) by its standardized factor loading and summing the products. For instance, in a hypothetical KTK case with factor loadings WB = 0.80, JS = 0.90, MS = 0.40, and HH = 0.60, the sum of the factor loadings of all subtests is 2.70. The standardized factor loadings of the items are: WB, 0.80/2.70 = 0.30; JS, 0.90/2.70 = 0.33; MS, 0.40/2.70 = 0.15; HH, 0.60/2.70 = 0.22. Assuming in this hypothetical example that subject one achieved the highest score in all subtests (a transformed score of 1 in each), the weighted factor score of the subject is 1 [(1 × 0.30) + (1 × 0.33) + (1 × 0.15) + (1 × 0.22)].
The refined method was computed from the factor extraction of the Confirmatory Factor Analysis, using the predict() function of the lavaan package (Rosseel, 2012). The main purpose of predict() is to compute (or "predict") estimated values for the latent variables (MQ refined) in the model (factor scores).
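The hypothetical example above can be reproduced with a short sketch. The function names are illustrative, the loadings are the hypothetical values from the text, and the jumping sideways subtest is keyed as JS.

```python
def standardized_loadings(loadings):
    """Divide each factor loading by the sum of all loadings."""
    total = sum(loadings.values())
    return {name: value / total for name, value in loadings.items()}


def weighted_score(scaled_scores, loadings):
    """Weighted factor score from min-max scaled subtest scores."""
    weights = standardized_loadings(loadings)
    return sum(weights[name] * scaled_scores[name] for name in weights)


# Hypothetical factor loadings from the worked example in the text.
loadings = {"WB": 0.80, "JS": 0.90, "MS": 0.40, "HH": 0.60}
weights = standardized_loadings(loadings)

# A participant with the highest (scaled) score in every subtest.
top_scores = {"WB": 1.0, "JS": 1.0, "MS": 1.0, "HH": 1.0}
mq_weighted = weighted_score(top_scores, loadings)
```

Since the standardized loadings sum to 1, the weighted score stays in the range [0, 1] and reaches 1 only for a participant with the highest score in every subtest.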
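To give an idea of what a refined factor score looks like numerically, the sketch below implements the regression method for a model with one factor and uncorrelated residuals. This is an assumption made for illustration: the source only states that lavaan's predict() was used and does not name the scoring method. All parameter values are hypothetical. For a single factor, the score reduces to a closed form through the Sherman-Morrison identity, so no matrix inversion is needed.

```python
def regression_factor_score(x, means, loadings, residual_vars,
                            factor_var=1.0):
    """Regression-method factor score for a one-factor model.

    Equivalent to factor_var * lambda' * Sigma^-1 * (x - mu), where
    Sigma = factor_var * lambda * lambda' + diag(residual_vars).
    """
    # c = lambda' * Theta^-1 * lambda
    c = sum(l * l / t for l, t in zip(loadings, residual_vars))
    # lambda' * Theta^-1 * (x - mu)
    num = sum((l / t) * (xi - m)
              for xi, m, l, t in zip(x, means, loadings, residual_vars))
    return factor_var * num / (1.0 + factor_var * c)


# Hypothetical two-indicator model, for illustration only.
score = regression_factor_score(
    x=[3.0, 3.0], means=[0.0, 0.0],
    loadings=[1.0, 1.0], residual_vars=[1.0, 1.0],
)
```

Unlike the weighted method, each indicator's contribution here depends on both its loading and its residual variance, so indicators measured with less error count for more.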
Since the units of measurement of the factor scores obtained with the sum of the scores, the sum of the standard scores, the weighted method, and the refined method differ, we used Pearson's correlation to verify the association between the methods. Moreover, the participants were grouped according to their levels of MC, as defined by their performance rates. The groups were 0-20% (performance rates ≤20%), 20-40% (performance rates >20% and ≤40%), 40-60% (performance rates >40% and ≤60%), 60-80% (performance rates >60% and ≤80%), and 80-100% (performance rates >80%), using all methods. After that, a confusion matrix was generated using the caret package to compare the classification of the standard, refined, and weighted methods with that of the sum method.
Two-way ANOVAs were used to compare sex and age differences in each subtest of the KTK. In addition, as in other studies (e.g., Suppiah et al., 2016), partial eta-squared (ηp²) was used as a measure of effect size in the two-way ANOVA and classified using the following scale (small: 0.01; moderate: 0.09; large: 0.25). All analyses were conducted using α = 5%.
All analyses were performed in RStudio Version 1.1.463 for Windows, an integrated development environment (IDE) for R.
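The grouping and the comparison between methods can be sketched as follows. The sketch assumes that a participant's performance rate is the percentage of the sample scoring at or below that participant, which the source does not define. It uses plain Python rather than the caret package. The function names and scores are hypothetical.

```python
def band_labels(scores):
    """Assign each score to one of five bands (0 = 0-20%, ..., 4 = 80-100%)."""
    n = len(scores)
    labels = []
    for s in scores:
        at_or_below = sum(other <= s for other in scores)  # between 1 and n
        # Integer form of ceil(5 * at_or_below / n) - 1, so that a rate of
        # exactly 20%, 40%, ... falls in the lower band.
        labels.append((5 * at_or_below + n - 1) // n - 1)
    return labels


def confusion_matrix(reference, predicted, n_classes=5):
    """Rows index the reference band, columns the compared method's band."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for r, p in zip(reference, predicted):
        matrix[r][p] += 1
    return matrix


def accuracy(matrix):
    """Share of participants placed in the same band by both methods."""
    total = sum(map(sum, matrix))
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Hypothetical factor scores for ten participants under two methods.
sum_method = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
other_method = [1, 2, 3, 4, 5, 6, 7, 9, 8, 10]

reference = band_labels(sum_method)
compared = band_labels(other_method)
agreement = accuracy(confusion_matrix(reference, compared))
```

Banding each method on its own distribution makes the comparison independent of the methods' different units, which is why classification agreement can be reported alongside the correlations.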
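For reference, partial eta-squared is the effect's sum of squares divided by the sum of the effect and error sums of squares. The sketch below computes it and applies the scale quoted above, treating each value as the lower bound of its category, which is an assumption about how the scale was applied. The function names, the "below small" label, and the sums of squares are hypothetical.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


def effect_size_label(eta_p2):
    """Classify partial eta-squared with the scale used in this section."""
    if eta_p2 >= 0.25:
        return "large"
    if eta_p2 >= 0.09:
        return "moderate"
    if eta_p2 >= 0.01:
        return "small"
    return "below small"  # hypothetical label for values under 0.01


# Hypothetical sums of squares from an ANOVA table.
eta = partial_eta_squared(ss_effect=10.0, ss_error=90.0)
label = effect_size_label(eta)
```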

Descriptive Analysis
An overview of the raw scores of the subtests, separated by age, is presented in Tables 1-4.

Correlation Between Subtests
The correlations between all subtests of the KTK (Figure 1) were positive, weak to moderate (from 0.47 to 0.54), and significant (p < 0.0001).
Cronbach's alpha, calculated across all the subtests of the KTK, was 0.80 for the whole sample, 0.79 for boys, and 0.81 for girls.
In the invariance testing (Tables 5, 6), configural invariance showed that the number of latent variables and the pattern of loadings of latent variables on indicators are similar across the sexes and age groups. Weak invariance (also known as metric invariance) indicated that the magnitude of the loadings is similar across age groups, but not across sexes. Moreover, strong invariance (also known as scalar invariance) showed that item intercepts are statistically different across the sexes and age groups. Finally, strict invariance showed that residual variances are not similar across sexes and age groups.
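Cronbach's alpha can be computed from the item variances and the variance of the total score. The sketch below uses the standard formula with sample variances. The function name and the data are hypothetical, and the source does not state whether raw or transformed subtest scores were used.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items scored on the same participants.

    items is a list of k lists, each holding one item's scores in the
    same participant order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score per participant across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical data: four perfectly consistent items.
alpha = cronbach_alpha([[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]])
```

Alpha rises as the items covary more strongly with each other, reaching 1 when every item ranks and spaces the participants identically.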

Analysis of the Factor Scores
The analysis of the correlation between factor score methods showed that all correlations were statistically significant, positive, and large (Figure 4).
The confusion matrix comparing the classification by performance rates of the standard, refined, and weighted methods with that of the sum of raw scores method showed that these three methods produce classifications practically identical to those of the sum method, with a classification accuracy of 95.4% to 97.2% (Figure 5).

Interpretative Test Parameters
Based on the MQ calculated as the sum of the raw scores of the subtests, the interpretative parameters are shown in Table 7.

DISCUSSION
The present study aimed to investigate the factor structure of the KTK in a Brazilian sample and to compare four methods of calculating the factor score of the test: the sum of the raw scores, the sum of the standard scores, the weighted method, and the refined method.
In summary, the results show that the psychometric properties of the KTK in a sample of Brazilian children were well adjusted for the total sample and separately by sex and age group. The CFA made it possible to identify a single latent factor in the instrument, which can be named motor quotient, to assess the MC of children and adolescents. However, the results do not confirm invariance between the sexes. In addition, an interesting result was that the different methods of calculating factor scores are highly correlated with each other. Thus, our results indicate that the sum of the raw scores of the subtests, which is the simplest way of calculating the factor score of the KTK, can be used.
The factorial structure of the KTK was shown to be well adjusted to the model for the whole sample and separately by sex, which suggests that the instrument has appropriate capacity to assess the MC of children from this population aged between 5 and 10. The present study advances current knowledge by providing the first psychometric analysis of the KTK for Brazilian children. As pointed out by a previous study (Ribeiro et al., 2012), there was an emerging need for this validation, since the test has been used for research and clinical purposes in Brazil. Internationally, the study of Rudd et al. (2016) with Australian children also used factor analysis to investigate the factorial structure of the KTK and found a model that adequately fits the data, in which the four tasks had a strong effect on the latent MC variable. The results obtained by these authors support the findings of the present study for the Brazilian sample and confirm the KTK as an adequate tool to measure MC in different population contexts.
Although it is necessary to consider the specificities of each population, which undergo different environmental, social, and cultural influences (Robinson et al., 2015), the KTK seems to be capable of measuring levels of MC even in different contexts, despite results pointing to variations in performance for samples from different countries (Graf et al., 2005). The validity of its construct, combined with its practicality, makes the KTK a viable instrument for different contexts (research and professional practice, among others), as it is a simple and objective test, with low operational cost and low interference from physical fitness.
We conducted a multigroup confirmatory factor analysis to investigate the degree to which the measures are invariant across sex and age groups, and our results showed that the assumption of invariance across sex and age groups was not confirmed. As is well documented, measurement invariance has a very important implication for the interpretation of differences between groups (e.g., sex and age groups in our study). Because the invariance assumptions were not confirmed, we cannot assume a stable relationship between the construct and the test score. Thus, the observed mean differences between sexes and ages in the KTK may be due either to differences in the underlying constructs or to different relations between latent constructs and scores (Hirschfeld and von Brachel, 2014). Therefore, as invariance was not found for sex and age, further investigations using the KTK need to be adjusted by sex and age. One way to make this adjustment would be to use normative tables, as presented in the present study (e.g., Table 7).
Two important limitations arise when computing the factor score in the KTK: (1) the subtests have different measuring units; and (2) a weight must be given to each subtest in the factor score calculation. Thus, methods that control for such limitations may be of great value to the quality of the measure. On the other hand, more robust methods that control for these disadvantages usually have their own limitations, such as not being simple enough for wide practical use. Our results showed that the correlation between the different methods is statistically significant, positive, and strong. In addition, the confusion matrix showed that the three methods (standard, refined, and weighted) used to calculate the KTK factor score produce classifications practically identical to those of the sum of raw scores of the subtests, with an accuracy of 95.4% to 97.2%. Thus, it is possible to assume that the sum of the raw scores of the subtests of the KTK, which is a simple and easy method, can be used as a measure of the KTK factor score.
Regarding the limitations of the present study, it is important that the results are not seen as definitive, for several reasons. The first concerns the sample, which has regional characteristics. However, it is worth mentioning that, to the best of our knowledge, this is the first study that has sought to investigate the validity of the test in Brazilian children and adolescents. Another limiting aspect is that the KTK only tests MC in children, with few locomotion actions and almost no manipulation of objects. Considering the influence of general motor competence on the adoption of a more active lifestyle by children, it might be interesting to use a complementary test, such as the TGMD-2 (Ulrich, 2000). Another limitation is that no information was considered on whether the children practiced sports or had any level of physical activity. Children with more time dedicated to motor practices could have had a certain advantage in the test, influencing the task scores (Vandorpe et al., 2012). In future studies, objective methods for assessing the level of physical activity should be employed. The influence of the children's BMI on scores also needs to be better investigated, since a previous study (D'Hondt et al., 2011) found that overweight in childhood negatively influences KTK performance. Finally, the fact that the KTK has a good factorial structure does not imply that it conceptually assesses motor competence thoroughly. For this reason, it has been suggested that the KTK be combined with the TGMD-2 (Rudd et al., 2016) to make the motor competence construct theoretically more robust and, therefore, provide a better motor competence assessment.
In summary, the results of the present study extend current knowledge regarding the use of the KTK as a tool to measure MC in children and adolescents, especially concerning the Brazilian reality. Our results showed that the sum of raw scores method has a statistically significant, positive, and large correlation with other, more robust methods of calculating the factor score, and can therefore be regarded as a simple and adequate method for the interpretation of KTK results.

CONCLUSION
The results obtained in the present study establish parameters to extend the use of the test in Brazil and may support new research to establish validity across the entire country and normative values for the Brazilian population. Normative values are necessary to extend the use of the KTK in Brazil; reference scores should be produced according to geographic, cultural, and social realities. However, in addition to applications in the scientific field, it is essential that such information reaches physical education teachers and sports coaches, so that they can apply the test in the field and use the results to support the design and execution of programs aiming to develop motor skills in children and adolescents.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Universidade Federal de Viçosa (26874614.1.0000.5153). Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
JM, ML, and MJ collected the data. NV and GL participated with MA on the conception and design of the study. MA analyzed the data. All authors participated in the interpretation of the results, drafted and revised the manuscript, and approved the final revision of the manuscript.