Adaptation and Validation of a Test for the Evaluation of Tactical Knowledge in Soccer: Test de Conocimiento Táctico Ofensivo en Fútbol for the Brazilian Context (TCTOF-BRA)

Rechenchosky, Leandro; Menegassi, Vanessa Menezes; Jaime, Matheus de Oliveira; Borges, Paulo Henrique; Serra-Olivares, Jaime; Rinaldi, Wilson

doi:10.3389/fpsyg.2022.849255

ORIGINAL RESEARCH article

Front. Psychol., 14 July 2022

Sec. Quantitative Psychology and Measurement

Volume 13 - 2022 | https://doi.org/10.3389/fpsyg.2022.849255

Leandro Rechenchosky¹*

Vanessa Menezes Menegassi¹

Matheus de Oliveira Jaime¹

Paulo Henrique Borges²

Jaime Serra-Olivares³

Wilson Rinaldi¹

¹Group of Studies and Researches Applied in Soccer (GEPAFUT), Department of Physical Education, State University of Maringá, Maringá, Brazil
²Department of Physical Education, Federal University of Santa Catarina, Florianópolis, Brazil
³Pedagogy in Physical Education, Faculty of Education, Catholic University of Temuco, Temuco, Chile

Background: Studies and tests to assess the tactical domain of young soccer players are recent, and few instruments meet the majority of quality criteria.

Objective: To adapt and validate the Test de Conocimiento Táctico Ofensivo en Fútbol (TCTOF) for the Brazilian context (TCTOF-BRA).

Methods: The article consists of two studies. Study 1 (n = 111) included the translation, theoretical/semantic analysis, back translation, cross-cultural equivalence, and content and face validity (pre-test). In study 2 (n = 768), a theoretical and empirical item analysis was carried out, followed by construct validity [exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and the known-groups method] and reliability (internal consistency and repeatability).

Results: In the cross-cultural evaluation, the total coefficient of content validity (CCV_t) of the instrument was 0.96, and in the content validity evaluation it was 0.87. Face validity was confirmed (>95%). After theoretical and empirical analysis, 15 questions were included in the Teste de Conhecimento Tático Ofensivo no Futebol (TCTOF-BRA). The EFA showed a model with adequate fit (KMO = 0.69; Bartlett p < 0.001), with a factor structure considered very good, composed of four factors (decision making, operational tactical principles, collective tactical-technical elements, and rules). The CFA by the Asymptotically Distribution-Free estimation method demonstrated good and very good goodness-of-fit indices (χ²/df = 1.54, GFI = 0.99, CFI = 0.94, TLI = 0.92, PGFI = 0.71, PCFI = 0.76, RMSEA = 0.03, and ECVI = 0.26). The known-groups method showed significant differences (p < 0.01) and effect sizes varying from small-to-medium to large. With respect to reliability, coefficients of 0.89 (CR) and 0.74 (KR20) for internal consistency and 0.85 for repeatability were found.

Conclusion: The TCTOF-BRA presented satisfactory evidence, demonstrating it to be an instrument with valid and reliable measures for the evaluation of tactical knowledge (declarative and theoretical procedural), based on specific knowledge and decision making (cognitive domain), of Brazilian young soccer players from 12 to 17.9 years old.

Introduction

Soccer (football) is considered the most popular sport worldwide (Dvorak et al., 2004; FIFA, 2007; Shvili, 2020), and tactics, viewed either from an expanded and cognitive perspective (Abernethy et al., 1993; McPherson, 1994) or from a dichotomous and ecological dynamics perspective (Silva et al., 2013), are recognized as the central dimension of the teaching-learning and training process (Teoldo et al., 2015; Praça and Greco, 2020), since this dimension “gives meaning and consistency to all other dimensions” (Teoldo et al., 2015, p. 27). Systematic reviews have shown the relationship and influence of tactics, through the manipulation of small-sided games, on the technical, physical/physiological, and psychological dimensions of young soccer players (Sarmento et al., 2018; Bujalance-Moreno et al., 2019; Clemente and Sarmento, 2020). In this sense, the need for instruments that offer valid and reliable measures for the assessment of the tactical dimension is evident.

Studies and tests to assess the tactical domain of young soccer players are recent and few instruments meet the majority of quality criteria, as can be seen in the scoping review by Rechenchosky et al. (2021) and in the systematic review by Sánchez-López et al. (2021). Rechenchosky et al. (2021) further reveal that studies which developed and/or validated tests to assess the tactical dimension of young soccer players were mostly composed of young Europeans (75.0%), especially Spanish and Portuguese. Considering that “the population for which a test is intended should be clearly delimited” (American Educational Research Association [AERA] et al., 2014, p. 23), since the evidence of “validity and reliability are affected by the characteristics and composition of the sample” (Brink and Louw, 2012, p. 4), only five tests with participants from other continents were observed (Rechenchosky et al., 2021), including two with Brazilian samples, the TCTP-OE (Greco et al., 2015) and the TacticUP (Machado and da Costa, 2020).

The Teste de Conhecimento Tático Processual para Orientação Esportiva (TCTP-OE) is based on the “Game Test Situation” (Memmert and Roth, 2003) and evaluates tactical-technical behavior through a 3 × 3 small-sided game, being theoretically based on the general tactical principles (Garganta and Pinto, 1994). The TacticUP, on the other hand, is a test that assesses tactical knowledge through videos and is theoretically based on the core tactical principles of soccer (Costa et al., 2009). Therefore, the development, adaptation, or validation of a questionnaire-based instrument for assessing tactical knowledge in a Brazilian sample, with operational tactical principles (Bayer, 1994) as its theoretical basis, has not yet been reported.

Regarding the validation of instruments, according to Morales (2011, p. 3), “a test or a scale has as nature, the expression of the same trait” and the quality of the measure depends on the validation process. For Urbina (2014, p. 147), there is no consensus in the literature on which and how much validity evidence is needed for a given instrument to present a measure considered valid and reliable. For American Educational Research Association [AERA] et al. (2014, p. 11), “the process of validation involves accumulating relevant evidence to provide a sound scientific basis for the proposed score interpretations.”

In this sense, based on a series of references recognized in the scientific literature, Rechenchosky et al. (2021) proposed 13 criteria to be considered in instrument validation studies in the area of physical education and sport. One of the tests that meets a greater number of criteria (Rechenchosky et al., 2021; Sánchez-López et al., 2021), demonstrating more evidence in the validation process, is the Test de Conocimiento Táctico Ofensivo en Fútbol (TCTOF), created and validated in Spain by Serra-Olivares and García-López (2016). The TCTOF is a questionnaire that assesses “declarative tactical knowledge” and “procedural tactical knowledge” in the cognitive/theoretical domain, considering the tactical dimension from an expanded and cognitive perspective (Abernethy et al., 1993; McPherson, 1994). Theoretical procedural knowledge represents knowledge in the representational plan (knowledge-based paradigm), and is related to what the player would do when faced with a hypothetical situation presented to them, for example, through videos and questionnaires (Rechenchosky et al., 2021). This is in line with Abernethy et al. (1993, p. 324) and McPherson (1994), when stating that the knowledge of “how to do” in sports of high strategy, as is the case of soccer, can refer to both the selection (cognitive) and the execution (motor) of the movement. For McPherson (1994, p. 230) “in sport a successful response selection (decision) may not necessarily correlate with successful response execution (action),” since a failure can be committed “due to an unsuccessful response execution, not response selection.” Thus, “an individual’s sport tactical knowledge may be confounded by the need to carry out a response selection in a sport situation.”

Therefore, the decision to “validate the TCTOF for the Portuguese language (Brazilian population)” rested on four considerations: the central importance that the tactical dimension assumes in the training and match process; the scarcity of tests that assess the tactical dimension of young soccer players and were built or validated with Brazilian samples; the absence, in this same population, of instruments that are based on operational tactical principles and administered as questionnaires, a format that tends to facilitate and expand their use by professors, coaches, and researchers; and the fact that the TCTOF involves decision making (cognitive domain) in game contexts, which contributes to ecological validity. The underlying hypothesis was that the TCTOF can also offer valid and reliable measures for the assessment of tactical knowledge in young male Brazilian soccer players. For this, two studies were organized with the following objectives: “Translate, adapt, and validate the content of the TCTOF-BRA” (Study 1) and “Determine and present the evidence of construct validity and reliability of the TCTOF-BRA” (Study 2).

Study 1: Cross-Cultural Adaptation and Content/Face Validation

Methods

Participants

A committee formed by the main researcher (LR), a doctoral student (VM), and a master’s student and soccer coach (MJ) participated in the translation, and received the support of the instrument’s main author (JS-O) and a postdoctoral professor of the Spanish language (LB). The “semantic analysis” (Pasquali, 2018, p. 107) of the preliminary version of the translated instrument was carried out by focus groups, using the “brainstorming” technique among researchers LR, VM, and MJ and 20 participants who represented the target sample (Under 13/U13, Under 15/U15, and Under 17/U17). The back translation was performed independently by two university professors, one in Brazil (PG) and another in Spain (DT), who did not participate in the translation, are native Spanish speakers, and are proficient in Portuguese (Beaton et al., 2000; Cassepp-Borges et al., 2010; International Test Commission, 2017). The cross-cultural equivalence stage (semantic, idiomatic, experiential, and conceptual) between the translated/pre-final version (Portuguese) and the original version (Spanish) involved the researchers who participated in the translation and back translation (VM, MJ, JS-O, PG, and DT), except LR and LB. For content validity, a panel was formed of five university professors from the soccer area, each with at least 10 years of experience (Ericsson et al., 1993, p. 366), who did not participate in any previous part of the research (AS, HS, JM, PB, and RA). Finally, complementing the content validity (American Educational Research Association [AERA] et al., 2014), for the face validity, a convenience sample of 91 players aged 12.1 to 17.8 years (mean age ± SD = 15.1 ± 1.5 years), who competed in the state championship and a regional championship, participated in a pilot study (pre-test).

For both studies, it was decided to increase the age range of the sample in relation to that used in the development of the Spanish version of the TCTOF (8–14 years of age). The minimum age of 12 years for the TCTOF-BRA was chosen after the committee analyzed the Brazilian context and also following guidelines from the scientific literature (Brislin et al., 1973; Guillemin et al., 1993) regarding the age group for understanding translated questionnaires. It was also decided to apply the questionnaire to young people of 15, 16, and 17 years of age, since this is a basic category (U17), and to evaluate the behavior of the data/results in terms of difficulty and discrimination of each of the questions/items. Thus, the cross-cultural adaptation and the content and face validity involved 12 researchers, including post-graduate students, coaches, and university professors, and 111 young male Brazilian soccer players.

Instrument

The TCTOF is a questionnaire with multiple-choice questions, involving statements and game contexts through pictures. It was created and validated in Spain by Serra-Olivares and García-López (2016) with the participation of 465 children and young people between 8 and 14 years old from different contexts. According to the authors, the test aims to assess tactical knowledge from a more ecological view and through two dimensions, the declarative and the procedural. The first contains 36 multiple-choice questions involving six indicators related to knowledge about: roles and positions, offside rule, individual technical-tactical elements, operational tactical principles (OTP), relationship between individual technical-tactical elements and OTP, and collective technical-tactical elements. The second contains 16 questions in the form of figures in which the participant must first choose “what” to do and then “how” to do it, related to the “why” do it (OTP) and involves four indicators related to decision making in situations of keeping/maintaining ball possession, advancing/progressing, and attacking/trying to score the goal, in addition to knowledge about the offside rule. Each correct answer has a value of 1 point and a higher score represents more tactical knowledge in soccer.

Procedures

First, the main author of the instrument was contacted in order to present interest and formally request authorization for the translation and adaptation of the test from Spanish to Portuguese, which was promptly answered. Subsequently, the project was submitted for ethical review, in accordance with the Declaration of Helsinki, and approved in March 2019 (CAAE 08918619.3.0000.0104; Opinion 3.208.874). For both studies, consent was obtained from the participants, their legal representatives, and the clubs and everyone’s privacy was preserved.

The method adopted for the cross-cultural adaptation of the instrument was back translation, associated with the committee method (Vallerand, 1989). Initially, the committee (LR, VM, and MJ) met to carry out the translation of the instrument from Spanish to Portuguese, as directed by the International Test Commission (2017), which recommends that the committee be familiar with the test and take due care regarding literal translations. Thus, the procedure involved the reading, discussion, and understanding of terms and denominations in Spanish based on Spanish references and the translation supported by Brazilian and Portuguese references (conceptual analysis). Terms that could raise doubts with respect to interpretation were registered for clarification with the other members of the committee (JS-O and LB). After a conversation between the main researcher (LR) and LB, a videoconference meeting (Skype) was held between JS-O and LR, VM, and MJ, at which time all translated questions were presented. The main author of the instrument (JS-O) participated in the cross-cultural adaptation, clarifying doubts regarding the use of some terms, giving suggestions, and authorizing the changes proposed by the group.

Subsequently, theoretical/semantic analysis of the preliminary version, as an indicator of apparent validity, was performed by committee members and focus groups to verify if the items were clear and understandable. The literature suggests “3–4 participants per group” (Pasquali, 2018, p. 107). Thus, three groups (U13, n = 7; U15, n = 6; and U17, n = 7) with different levels of knowledge, according to their coaches, were constituted and independently asked if they understood the questions. Next, participants were asked about what each question sought to discover and what or how they would answer. In case of disagreement in the understanding between the participants and after suggestions, the question was rephrased and presented again to the youth players.

The next step was to send the version of the instrument translated to Portuguese to two professors who have Spanish as their native language to perform the back translation to Spanish. The two versions back translated to Spanish were sent to the main author of the instrument, who analyzed the questions and reported that they preserved the meaning of the original instrument. According to Pacico (2015, p. 68), “many researchers ask the author of the original scale to evaluate the back translation”; if this is “similar to the original version (the meaning of the items is preserved) and the adapted items are adequate, data collection with the pilot sample can be started.” Next, after discussion and consensus, the committee (LR, VM, and MJ) consolidated the preliminary version of the instrument in Portuguese (pre-final version).

Although it was already possible to start the pre-test stage with the pilot sample, the authors chose to send the translated version (pre-final version) to the translators VM and MJ, to the retranslators PG and DT, and to the main author of the instrument (JS-O), to determine cross-cultural equivalence (semantic, idiomatic, experiential, and conceptual), according to Guillemin et al. (1993) and Beaton et al. (2000). The individuals were asked to analyze each question and select one of the options: 1, very poor equivalence; 2, poor equivalence; 3, average equivalence; 4, good equivalence; and 5, very good equivalence.

To investigate content validity, the panel consisting of five judges (AS, HS, JM, PB, and RA) received the translated version (pre-final) of the instrument and a spreadsheet on which they were required to score all questions/items in relation to three criteria using a Likert scale from 1 to 5 (very poor to very much): (a) Clarity of language: evaluates the terms and language used in the questions/items of the questionnaire, considering the characteristics of the target population; example: Do you believe the terms and language of the question are clear, understandable, and adequate for young soccer players (~12–17 years old)? How much?; (b) Practical relevance: assesses the relevance of the question for the daily lives of the target population, considering whether each question is designed to investigate the concept of interest and whether the situation happens in practice; example: Do you believe that the questions/situations are relevant to the practice of young soccer players? How much?; and (c) Theoretical relevance: assesses the degree of association between the question/item and the theoretical basis, that is, whether the item is related to the construct that is intended to be measured; example: Do you believe that the content of this question/situation is relevant and representative of the knowledge you want to measure, or of one of its indicators, considering the construct in question (tactical knowledge)? How much?

Subsequently, a pre-test (pilot study) was conducted to determine face validity, which “refers to the subjective judgment that participants make about the test” (Pacico and Hutz, 2015, p. 76) and indicates whether procedures are adequate and if any item remains incomprehensible. If the participant did not understand a question they were asked to circle it. At the end of the test, the youth players answered the following questions: (1) Do you think the test questions and figures are clear and understandable?; (2) Do you think this test assesses (tactical) knowledge about soccer?; (3) Did you enjoy taking the test?; (4) Did you feel challenged when taking the test?; and (5) Would you take the test at another opportunity to find out about your (tactical) knowledge in soccer, if necessary? All questions were initially answered with a yes or no; in case of “yes,” there was a Likert scale from 1 (very little) to 5 (very much). The pre-test data also allowed the “empirical analysis of the items” based on traditional parameters suggested in the literature, such as “difficulty and discrimination” (Pasquali, 2018, pp. 108–109). The summary of the procedures adopted in the cross-cultural adaptation and content/face validation is presented in Figure 1.

Figure 1. Study 1 flowchart. Source: the authors.

Data Analysis

The agreement regarding cross-cultural equivalence and content validity was determined by the Coefficient of Content Validity (CCV) proposed by Hernández-Nieto (2002, pp. 131–132), which measures the degree of agreement between judges for the instrument as a whole, for each equivalence/parameter, and for each question (CCV_t), as well as for each item/question per type of equivalence/parameter (CCV_i). If any question is considered unsatisfactory, it must be adjusted before the questionnaire is applied to the target population. Hernández-Nieto recommends a cutoff value of 0.80. Balbinotti (2004), on the other hand, suggests that it is possible to consider a CCV between 0.70 and 0.79 as the threshold and less than 0.70 as unsatisfactory. Therefore, in this study, a value of 0.70 was adopted as the threshold for the assessment of cross-cultural equivalence and content validity.
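As an illustration only, the coefficient can be sketched in a few lines of Python. The sketch assumes the formulation in which the mean rating is divided by the maximum scale value and a chance-agreement term (1/J)^J, for J judges, is subtracted; the text above does not spell out the formula, so this should be checked against Hernández-Nieto (2002). The function names, the aggregation of CCV_t as a simple mean, and the example ratings are hypothetical and are not the study's data.

```python
def ccv_item(ratings, v_max=5):
    """CCV_i for one question on one criterion.

    Assumed formulation: (mean rating / maximum scale value) minus a
    chance-agreement term (1/J)**J, where J is the number of judges.
    """
    n_judges = len(ratings)
    mean_rating = sum(ratings) / n_judges
    chance_error = (1 / n_judges) ** n_judges
    return mean_rating / v_max - chance_error


def ccv_total(item_coefficients):
    """CCV_t taken here as the mean of the item coefficients (an assumption)."""
    return sum(item_coefficients) / len(item_coefficients)


def classify(value, threshold=0.70, recommended=0.80):
    """Label a coefficient using the cutoffs cited in the text."""
    if value >= recommended:
        return "satisfactory"
    if value >= threshold:
        return "threshold"
    return "unsatisfactory"


# Hypothetical ratings from five judges on a 1-5 scale (not the study's data).
example_ratings = {
    "question A": [5, 5, 4, 5, 5],
    "question B": [4, 4, 4, 4, 3],
    "question C": [3, 3, 3, 4, 4],
}
coefficients = {q: ccv_item(r) for q, r in example_ratings.items()}
for question, value in coefficients.items():
    print(question, round(value, 2), classify(value))
print("total", round(ccv_total(list(coefficients.values())), 2))
```

With five judges the chance-agreement term is very small (0.2 to the fifth power), so under this formulation the coefficient is driven almost entirely by the mean rating.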

Face validity and the number of questions not understood were obtained by relative frequency (%). The empirical analysis of the items involved the difficulty index (number of subjects who answered the item correctly/total number of subjects who answered the item), discrimination index D (Flanagan method), and item-total point-biserial correlation. Difficulty indices from 0.10 to 0.90 and discrimination indices D ≥ 20 (0.20) were sought (Thomas et al., 2015, p. 396) and item-total point-biserial correlation coefficients ≥0.30 (Field, 2009, p. 598; Pasquali, 2018, p. 136).
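A minimal Python sketch of the three item parameters is given here for illustration. The difficulty index follows the definition in the text. The discrimination index D is assumed to compare the proportion of correct answers in the upper and lower 27% of total scores, and the point-biserial coefficient is computed against the uncorrected total score; the text does not state these details, so both are assumptions. All function names and the example data are hypothetical.

```python
from statistics import mean, pstdev


def difficulty_index(item):
    """Proportion of respondents who answered the item correctly (1 = correct)."""
    return sum(item) / len(item)


def discrimination_d(item, totals, fraction=0.27):
    """D = proportion correct in the upper group minus the lower group.

    Groups are the top and bottom `fraction` of respondents by total score
    (27% assumed here). Ties in total score are broken by input order,
    which is a simplification.
    """
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    k = max(1, round(fraction * len(totals)))
    lower, upper = order[:k], order[-k:]
    return mean(item[i] for i in upper) - mean(item[i] for i in lower)


def point_biserial(item, totals):
    """Item-total point-biserial correlation (total score not corrected)."""
    p = mean(item)
    sd = pstdev(totals)
    if p in (0, 1) or sd == 0:
        return float("nan")  # undefined when there is no variation
    mean_correct = mean(t for x, t in zip(item, totals) if x == 1)
    mean_wrong = mean(t for x, t in zip(item, totals) if x == 0)
    return (mean_correct - mean_wrong) / sd * (p * (1 - p)) ** 0.5


def meets_criteria(di, d, r_pb):
    """Cutoffs sought in the text: 0.10 <= DI <= 0.90, D >= 0.20, r_pb >= 0.30."""
    return 0.10 <= di <= 0.90 and d >= 0.20 and r_pb >= 0.30


# Hypothetical data: one item (1 = correct) and total scores for ten players.
item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
totals = [14, 13, 12, 11, 10, 9, 8, 7, 6, 5]

di = difficulty_index(item)
d = discrimination_d(item, totals)
r_pb = point_biserial(item, totals)
print(round(di, 2), round(d, 2), round(r_pb, 2), meets_criteria(di, d, r_pb))
```

A corrected item-total correlation, in which the item is removed from the total before correlating, would give slightly lower values; which variant the study used is not stated in this section.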

Results

All questions from the pre-final version applied in the pilot study are presented in Supplementary Table 1 (Data Sheet 1).

In the cross-cultural evaluation, the CCV_t of the instrument was 0.96, with a semantic equivalence of 0.96, idiomatic equivalence of 0.95, experiential equivalence of 0.96, and conceptual equivalence of 0.96. The CCV_t per question (average of equivalences) ranged from 0.80 to 1.00 (see Supplementary Table 2 in the Data Sheet 1). Considering CCV by question and type of equivalence (CCV_i), all values were ≥ 0.80, with the exception of the conceptual equivalence of the questions about “offside position” and “permute,” which had a CCV_i considered as threshold (0.76). Therefore, the cross-cultural evaluation showed semantic, idiomatic, experiential, and conceptual equivalence for all questions. The Test de Conocimiento Táctico Ofensivo en Fútbol (TCTOF) was named in Portuguese as the Teste de Conhecimento Tático Ofensivo no Futebol (TCTOF-BRA).

In content validity, the CCV_t of the test was 0.87, with 0.87 for clarity of language (CL), 0.86 for practical relevance (PR), and 0.87 for theoretical relevance (TR). The CCV_t per question (average of criteria) ranged from 0.63 to 1.00. The CCV_i was below the threshold (0.70) for CL in questions 1 (player’s role while attacking) and 33 (permute), for PR in questions 2–5 (role and positions), and for TR in question 23 (controlling × OTP). Question 27 about “Give-and-go” or “wall pass” had CCV_i < 0.70 in CL, PR, and TR. Considering the relevance of reaching the threshold value in all criteria (CL, PR, and TR), eight questions (1, 2, 3, 4, 5, 23, 27, and 33) had at least one value below 0.70 (see content validity column in Supplementary Table 2 of the Data Sheet 1). Thus, 43 (84.3%) questions demonstrated content validity.
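The count of 43 questions can be reproduced from the question numbers reported above. The short Python sketch below uses only those numbers and the 51 questions of the pre-final version; the variable names are illustrative.

```python
# Questions with CCV_i below 0.70, by criterion, as listed in the text.
below_threshold = {
    "CL": {1, 27, 33},        # clarity of language
    "PR": {2, 3, 4, 5, 27},   # practical relevance
    "TR": {23, 27},           # theoretical relevance
}
total_questions = 51  # questions in the pre-final version

# A question is flagged if it falls below the threshold on any criterion.
flagged = set().union(*below_threshold.values())
valid = total_questions - len(flagged)
share = 100 * valid / total_questions

print(sorted(flagged))         # the eight flagged questions
print(valid, round(share, 1))  # questions with content validity and their share
```

Question 27 appears under all three criteria but is counted once, which is why the union contains eight questions rather than ten.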

Face validity showed that more than 95.5% of the sample found the questions clear and understandable, thought that the test assesses tactical knowledge in soccer, enjoyed taking the test, and would do it again, if necessary. Approximately 2 out of every 10 participants evaluated (21.3%) declared that they did not feel challenged when taking the test. In only four questions (1, 16, 27, and 33) did more than 10% of the participants declare that they did not understand the question or a part of it. One question (15) received 8.8% and all the others showed values below 5%. The average application time was 32.9 ± 8.4 min (mean ± SD). Finally, an empirical analysis of the difficulty (DI) and item discrimination (D and R_pb) was performed after applying the pilot instrument (see Supplementary Table 1 in the Data Sheet 1). It is possible to verify (in bold) that 26 questions met the criteria in all three parameters simultaneously (DI ≥ 0.10 and ≤ 0.90; D ≥ 20 (0.20); and R_pb ≥ 0.30).

Discussion

In order for an instrument to be used with subjects from a different country to the one for which it was created and validated, it must be submitted to a “systematic” process of translation and validation (Cassepp-Borges et al., 2010, p. 508).

Vallerand (1989) indicates that the first moment of translation, which in the case of the present study is transferring the Spanish version into Portuguese, can be performed by a single individual (traditional translation), while Cassepp-Borges et al. (2010) suggest one or more independent translations. Furthermore, Vallerand suggests the use of a committee to avoid the possible biases of a single individual and, if possible, the involvement of the person who created the instrument. Beaton et al. (2000) also argue for the participation of the instrument’s author, since he/she can assess more complex situations and suggest terms that demonstrate content validity in both languages. Sometimes translators with proven competence in the languages are hired who do not have knowledge of the object of study, which can compromise or hinder the next steps. In this sense, the International Test Commission (2017, p. 11) advises that the translators or team be familiar with the test. For these reasons, in some studies, the authors themselves carried out the translation and adaptation, as was the case in Maroco et al. (2008). Therefore, the translation in this study followed guidance from important references in the area, aiming to ensure quality in this and other stages.

After the translation stage, characterized in the current study by translation (Spanish to Portuguese), theoretical/semantic analysis (focus groups), and back translation (Portuguese to Spanish), the cross-cultural equivalence between the translated version and the original version was determined. Based on the CCV results, it is possible to state that the questions in the Spanish and Portuguese versions have the same meaning (semantic equivalence), the expressions are equivalent, that is, there is no change in cultural meaning (idiomatic equivalence), the content is present in both realities (experiential equivalence), and the questions are conceptually equivalent, that is, they assess the same aspect in different cultures (conceptual equivalence). Several authors have addressed the issue of cross-cultural adaptation, and the equivalences are well described in Guillemin et al. (1993) and in Beaton et al. (2000). It is important to highlight that substitution with another term to preserve the desired equivalence is allowed and, therefore, was performed when necessary. Additionally, sometimes a given situation may simply not occur (even if it is translatable) in another country or culture; in this case, the questionnaire item needs to be replaced by a similar item or even excluded (Guillemin et al., 1993). This situation occurred in the item about offside position, which had to be almost completely changed, and in item 23 of the original questionnaire, which was excluded because in Portuguese it referred to the same technical skill (shooting) already presented in another item of the instrument.

Finally, according to Guillemin et al. (1993), in the stage of cross-cultural analysis, one of the committee’s functions is to modify instructions or formats, modify or reject inappropriate items, and generate new items; ultimately, it is likely that the committee modify or eliminate irrelevant, inappropriate, or ambiguous items and may generate substitutes that better fit the target culture while maintaining the overall concept of the excluded items. Part of this committee’s role should also be to review the introduction and instructions for completing the questionnaire. Thus, adjustments were made to the questionnaire after focus groups, contact with the main author of the instrument, and discussion by the committee. Based on the results found, the cross-cultural assessment showed semantic, idiomatic, experiential, and conceptual equivalence between the Spanish instrument and the one translated to Brazilian Portuguese, that is, there was correspondence between the questions of the original TCTOF and the TCTOF-BRA.

Content validity involves evidence of the extent to which the test or instrument represents the content or behavior that is intended to be measured (Goodwin and Leech, 2003; Pasquali, 2010). For American Educational Research Association [AERA] et al. (2014, p. 14), the “evidence based on test content” involves an “analysis of the relationship between the content of a test and the construct.” The main method used is agreement among the members of an expert panel. Hernández-Nieto (2002, p. 119) recommends a “minimum of three and a maximum of five judges,” preferably per modality, and, according to Balbinotti et al. (2007, p. 32), these judges should be people who “did not participate in any stage of the study.” In this study, five university professor judges were involved, with experience ranging from initiation to high performance in soccer. Regarding the cutoff points, the value of 0.70 was chosen as the threshold, according to Balbinotti (2004) and following validation studies in the area (Silva, 2018) and outside it (Balbinotti et al., 2007). For Cassepp-Borges et al. (2010, p. 513) it is possible to “relativize the cutoff point” due to the different opinions among the judges. In addition, it is worth highlighting that seven experts had already validated the original version of the TCTOF in terms of content (Serra-Olivares and García-López, 2016).

As seen, the CCV_t results for the test showed satisfactory agreement and content validity (Hernández-Nieto, 2002). The clarity of language was confirmed, that is, the terms and language of the questions are clear, understandable, and adequate for young male soccer players between 12 and 17.9 years of age; the judges also considered that there is practical relevance, that is, the questions and game situations presented in the figures are related to the daily lives of young soccer players; and there is theoretical relevance, that is, the content of the questions is relevant and representative of the knowledge that is being measured or of one of its indicators (Cassepp-Borges et al., 2010).

After the judges returned the completed content validity evaluation form, the committee met to discuss the results and the observations and suggestions made, especially for the questions with the lowest scores. Thus, 14 questions were reformulated according to the experts’ assessment before applying the questionnaire to the pilot sample (pre-test). Some authors (Greco et al., 2015) have suggested excluding items that are assessed as having a practical relevance below the cutoff point adopted, as they would not be considered relevant to the reality of the target population. However, it was decided not to remove any question in the initial stages of the study and to analyze their behavior after application in the population of interest. In short, 43 (84.3%) questions showed content validity, and this was also the number of questions that had satisfactory results in both the cross-cultural adaptation and the content validity, since in the former all items reached the necessary equivalences.

Regarding face validity, this type of validity generally receives less attention from researchers, as seen in Rechenchosky et al. (2021). According to Bornstein (1996), this validity can impact other forms of validity, hence the need to include it as one of the stages in the instrument validation process, beginning with the pre-test or pilot study (Guillemin et al., 1993). Thus, in order to ensure that the translation was understandable, this stage intended to identify questions that were not clear to the participants. The subjects of the pilot sample were asked to circle the questions they did not understand as a whole or in part; this procedure was also adopted in the original TCTOF study. The literature suggests that language should be understood by children aged 10–12 years old (Brislin et al., 1973; Guillemin et al., 1993). For the authors of this study, the few questions (1, 16, 27, and 33) that received a higher percentage (10–20%) of not being understood were within the expected range; even so, they were adjusted for the application of the instrument to the target sample.

Although the literature indicates that the sample size in the pilot study (pre-test) does not need to be greater than 10% of the target sample (Canhota, 2008, p. 70), in this study it was approximately 12%. This quantity is insufficient for statistical analysis (Morales, 2011, p. 47); however, it is believed that the pilot study, in addition to supporting apparent validity, can provide valuable information regarding the difficulty of the questions, which in turn can affect discrimination indices and, later, other evidence of validity. Knowing the behavior of these variables enables immediate adjustments to the instrument for application in the target sample.

Regarding the cutoff points adopted to analyze the difficulty and discrimination of the questions, some are more conservative than those adopted in this study, as is the case of Nakano et al. (2015), who consider a good difficulty index between 0.30 and 0.70; others are less conservative and justify it by the sample size, that is, for the analysis of discrimination, for example, in the case of large samples, correlation coefficients lower than 0.30 would already be “acceptable” (Field, 2009, p. 598). Regarding the D index, fundamentally, what is sought is for it to be positive and distant from zero, since “a null or negative D demonstrates that the item is not discriminatory” (Pasquali, 2018, p. 133).

Based on the knowledge that “the pilot study is important as a last chance to detect and correct errors before carrying out the research” (Cassepp-Borges et al., 2010, p. 514), a new committee meeting was held to analyze and discuss the results, and review (Guillemin et al., 1993) and make final adjustments to the instrument, preserving the logic of the questions and answers.

Thus, considering the cross-cultural adaptation, the evidence of content and face validity, and the limitations that the sample size imposed on a more appropriate analysis of the behavior of the questions regarding difficulty and discrimination, it was decided to maintain the 51 questions for the final data collection (target sample), although satisfactory evidence for application in Brazil was obtained for 43 questions of the TCTOF. According to Cassepp-Borges et al. (2010, p. 513), when a question “is not considered relevant to the reality” that is being sought, it is possible for it to remain in the questionnaire, since “the researcher may insist on establishing some comparability.” Finally, in response to these authors (Cassepp-Borges et al., 2010, p. 519), Study 2 sought a balance between improving the structure of the instrument (necessary) and keeping it similar to the original.