Perceptual-Motor and Perceptual-Cognitive Skill Acquisition in Soccer: A Systematic Review on the Influence of Practice Design and Coaching Behavior

Facilitating players' skill acquisition is a major challenge within sport coaches' work which should be supported by evidence-based recommendations outlining the most effective practice and coaching methods. This systematic review aimed at accumulating empirical knowledge on the influence of practice design and coaching behavior on perceptual-motor and perceptual-cognitive skill acquisition in soccer. A systematic search was carried out according to the PRISMA guidelines across the databases SPORTDiscus, PsycInfo, MEDLINE, and Web of Science to identify soccer-specific intervention studies conducted in applied experimental settings (search date: 22nd November 2020). The systematic search yielded 8,295 distinct hits which underwent an independent screening process. Finally, 34 eligible articles, comprising of 35 individual studies, were identified and reviewed regarding their theoretical frameworks, methodological approaches and quality, as well as the interventions' effectiveness. These studies were classified into the following two groups: Eighteen studies investigated the theory-driven instructional approaches Differential Learning, Teaching Games for Understanding, and Non-linear Pedagogy. Another seventeen studies, most of them not grounded within a theoretical framework, examined specific aspects of practice task design or coaches' instructions. The Downs and Black checklist and the Template for Intervention Description and Replication were applied to assess the quality in reporting, risk of bias, and the quality of interventions' description. Based on these assessments, the included research was of moderate quality, however, with large differences across individual studies. The quantitative synthesis of results revealed empirical support for the effectiveness of coaching methodologies aiming at encouraging players' self-exploration within representative scenarios to promote technical and tactical skills. Nevertheless, “traditional” repetition-based approaches also achieved improvements with respect to players' technical outcomes, yet, their impact on match-play performance remains widely unexplored. In the light of the large methodological heterogeneity of the included studies (e.g., outcomes or control groups' practice activities), the presented results need to be interpreted by taking the respective intervention characteristics into account. Overall, the current evidence needs to be extended by theory-driven, high-quality studies within controlled experimental designs to allow more consolidated and evidence-based recommendations for coaches' work.

Facilitating players' skill acquisition is a major challenge within sport coaches' work which should be supported by evidence-based recommendations outlining the most effective practice and coaching methods. This systematic review aimed at accumulating empirical knowledge on the influence of practice design and coaching behavior on perceptual-motor and perceptual-cognitive skill acquisition in soccer. A systematic search was carried out according to the PRISMA guidelines across the databases SPORTDiscus, PsycInfo, MEDLINE, and Web of Science to identify soccer-specific intervention studies conducted in applied experimental settings (search date: 22 nd November 2020). The systematic search yielded 8,295 distinct hits which underwent an independent screening process. Finally, 34 eligible articles, comprising of 35 individual studies, were identified and reviewed regarding their theoretical frameworks, methodological approaches and quality, as well as the interventions' effectiveness. These studies were classified into the following two groups: Eighteen studies investigated the theory-driven instructional approaches Differential Learning, Teaching Games for Understanding, and Non-linear Pedagogy. Another seventeen studies, most of them not grounded within a theoretical framework, examined specific aspects of practice task design or coaches' instructions. The Downs and Black checklist and the Template for Intervention Description and Replication were applied to assess the quality in reporting, risk of bias, and the quality of interventions' description. Based on these assessments, the included research was of moderate quality, however, with large differences across individual studies. The quantitative synthesis of results revealed empirical support for the effectiveness of coaching methodologies aiming at encouraging players' self-exploration within representative scenarios to promote technical and tactical skills. Nevertheless, "traditional" repetition-based approaches also achieved improvements with respect to players' technical outcomes, yet, their impact on match-play performance remains widely unexplored. In the light of the large methodological heterogeneity of the included studies (e.g., outcomes or control groups' practice activities), the presented results need to be interpreted by taking the respective intervention characteristics into account.

INTRODUCTION
Sport coaches face multiple challenges, one of which is to facilitate athletes' skill acquisition to improve performance (Gould and Mallett, 2021). Due to the dynamic and interactive character of team sports games, a plethora of skills is required to act successfully during gameplay. In soccer, these skills specifically encompass perceptual-motor (e.g., technical) and perceptual-cognitive (e.g., tactical) components (Williams et al., 2020). Knowledge about these and further relevant performance factors is not only important to identify talented players, it is also essential for developing these performance factors systematically through effective practice and coaching. While recent systematic reviews document the growing body of knowledge about soccerspecific performance characteristics and talent predictors, less is known on how to promote these factors effectively, reinforcing the call for intervention research (Williams et al., 2020;O'Connor et al., 2021).
Most intervention research on the development of performance factors in soccer is concerned with the players' physical fitness and physiological capabilities (for overviews see Bujalance-Moreno et al., 2018;Zouhal et al., 2020). Considerably fewer studies investigated the promotion of soccer-specific skills based on psychological and motor learning theories . This smaller scope of scientific work is primarily attributed to methodological challenges in assessing behavioral changes in players, especially in highly dynamic and unpredictable match-play situations. For this reason, laboratorybased research on the acquisition of closed soccer skills has made the primary contribution to our general understanding of skill acquisition (e.g., Anderson and Sidaway, 1994;Hodges et al., 2005). Often conducted with novice participants, those studies seem to not only lack transferability to a pitch-based coaching context but also to superior skill-level players (Wulf and Shea, 2002;Farrow and Robertson, 2017). As a possible consequence, there is an emerging trend toward intervention studies anchored in more representative settings and with more experienced players (e.g., Práxedes et al., 2016;Roberts et al., 2020).

Theoretical Perspectives on Skill Acquisition
Besides methodological difficulties in designing intervention research, also the theoretical underpinnings of skill acquisition pose challenges for developing effective practice. There are different perspectives on the organization and development of skills, possibly leading to contradictory implications for coaches' work. Historically, substantial advances have been made with regards to understanding the processes linked to skill acquisition, but it remains a dynamic and disputed topic among researchers (Whitall et al., 2020a,b).
In current research, two influential theoretical perspectives can be distinguished that emerged from cognitive psychology, on the one hand, and ecological psychology/dynamical systems theory, on the other hand (Anson et al., 2005). 1 According to an understanding grounded in cognitive "information-processing" theories, there is an ideal way to perform a skill that is to be learned through a stage-linear process (Fitts and Posner, 1979). Based on the schema theory of motor learning (Schmidt, 1975), a skill consists of invariants that are stored through mental representations in so-called generalized motor programs (GMPs). For utilizing GMPs in dynamic game-related situations, players must learn to parameterize the skill, that means, learn how to adjust the skill to the requirements of any respective situation. Within this theoretical perspective, performing a game action relates to a subsequent three-stage process from stimulus identification (i.e., perception), response selection (i.e., decision), and response programming (i.e., action; Schmidt et al., 2019).
A different, opposing view to this often called "traditional" standpoint emerges from an ecological/dynamical systems perspective where skill acquisition is considered a process of exploration and self-organization. Based on Newell's (1986) constraints-based model, individuals interact with the environment and the tasks of the given situation by exploring individual movement solutions in a non-linear fashion. In this approach, there is no one ideal, "correct" technique for performing a skill. Rather, there is substantial variability both across and within individuals (i.e., "repetition without repetition"; Bernstein, 1967). According to this viewpoint, performing game actions depends on the direct perception of affordances from the environment, implying that perception, and action are considered inherently coupled (Gibson, 1979).

Practice and Coaching Methods to Promote Skill Acquisition in Soccer
In searching for the most supportive methods to improve players' skills, it is worthwhile to look at the early stages of playing soccer during childhood: In these stages, the process of skill acquisition can occur in an implicit and unstructured way within informal and child-led settings, such as street soccer (Uehara et al., 2018). There is evidence that a high amount of such self-organized soccer gameplays, as well as multi-sport practice in childhood, is positively associated with achieving excellence in adult soccer (Forsman et al., 2016;Güllich et al., 2021). However, when looking at the pathways of elite players, it is indisputable that formal soccer-specific and coach-led practice is indispensable to improve and sustain players' skill level, too (Sieghartsleitner et al., 2018;Hendry and Hodges, 2019). In this goal-directed process, coaches are challenged by a variety of methodological decisions relating to the skill itself but also contextual factors, such as the athlete's performance level, age, or available time to achieve an outcome (Côté and Gilbert, 2009). Therefore, coaches likely need a "blended tool kit" (Price et al., 2019, p. 126), equipped with practice and coaching methods in order to find effective context-specific solutions to facilitate player learning and performance.
To describe these "tools", the terms training form (i.e., decontextualized repetitive drills) and playing form (i.e., gamerelated, representative situations) are commonly used to classify practice activities in soccer (Ford et al., 2010;O'Connor et al., 2018). Yet, these terms only provide a broad picture of the accompanying task demands so that more sophisticated differentiation is needed. One of that is the degree of variability within (i.e., trial-to-trial) and between (i.e., contextual interference) practiced skills that can be scaled in drills and game-based tasks (Stratton et al., 2004;Magill and Anderson, 2020). Further, the specificity describes whether practice conditions reflect competitive demands in terms of motor-and sensory-related parameters (Proteau, 1992;Farrow and Robertson, 2017). Besides practice activities, there is also a wide array of coaching behaviors, such as demonstrations, instructions, and feedback which are used to accentuate the practice demands (Hendry et al., 2015;Otte et al., 2020). For instance, instructive behaviors may differ depending on whether explicit or implicit learning processes should be promoted in the players.
Insights to which extent coaches make use of the outlined tools during pitch-based work are given by systematic observations in "real-world" coaching contexts (Cushion et al., 2012a;Partington and Cushion, 2013;Partington et al., 2014). Many of these studies have displayed the "challenging tradition" of practice, instruction, and skill acquisition in terms of the predominant use of unrepresentative drill-based activities and a high amount of prescriptive instructions by the coaches . More recent studies show a trend regarding more game-realistic activities which are more closely aligned to competitive demands (O'Connor et al., 2018;Roca and Ford, 2020). Nonetheless, there still seems to be potential toward greater consideration of skill acquisition research within soccer coaches' work (O'Connor et al., 2017;Farrow, 2021).
In contrast to the outlined traditional approach, characterized by the predominant use of decontextualized drill-based activities and directive instructions, instructional approaches providing alternative strategies have gained increasing popularity in soccer. Grounded in an educative and constructivist perspective, Teaching Games for Understanding (TGFU; Bunker and Thorpe, 1982) is a game-centered approach that aims to improve tactical intelligence through simplified, game-related situations. The accompanying guided discovery approach to coaching is intended to explicitly enhance players in solving tactical problems. As another approach, Non-linear Pedagogy (NLP; Chow et al., 2011) provides key principles to practice and coaching and is shaped by an ecological dynamics viewpoint. According to NLP, functional variability of skills can be achieved through perception-action couplings during practice, by applying representative learning designs, and a "hands-off " facilitative approach to coaching. NLP aims to support players in finding individual movement solutions through a non-linear process of learning. Lastly, based on dynamical systems theory, Differential Learning (DL; Schöllhorn, 1999) assumes that athletes need to adapt to constant perturbations in dynamic competitive environments. Thus, practicing skills with additional random fluctuations ("noise") offers the opportunity to explore and selforganize individual functional movement patterns.
Considering the wide array of possible practice and coaching approaches, Williams et al. (2020) andO'Connor et al. (2021) call for intervention research on soccer players' skill acquisition to get a deeper insight into the effectiveness of different methods. Recent systematic reviews and meta-analyses within these and related fields pooled intervention research from educational settings (Abad Robles et al., 2020), focused on the effectiveness of one instructional approach (e.g., DL; Tassignon et al., 2021), or merely examined one specific outcome (e.g., decision-making; Silva et al., 2020). While these reviews included studies from various sports and often different experimental settings, there is no systematic review accumulating evidence on the effectiveness of practice and coaching methods grounded in different theoretical perspectives to skill acquisition from a soccer-specific viewpoint.

The Present Study
This systematic review aims to pool empirical knowledge from intervention research on the effectiveness of different practice and coaching methods on skill acquisition in soccer. In contrast to previous systematic reviews and meta-analyses, the current review is explicitly set out to focus on the following attributes: First, it focuses on intervention research conducted in applied ("pitch-based") experimental settings which included experienced soccer players as participants rather than novices. Such studies provide the ecological validity necessary to offer the most pertinent support for soccer coaches' actual work. Next, it was deemed vital to investigate studies grounded in different theoretical skill acquisition approaches to explore how different assumptions impact an interventions' design as well as to discuss the effectiveness of different approaches regarding the acquisition and learning of soccer-specific skills. Finally, considering both perceptual-motor and perceptual-cognitive skills as outcomes was perceived mandatory due to their interrelationship during matchplay. As there is no overview of intervention research in soccer considering these aspects, knowledge on the methodological characteristics and rigor is required to estimate the current potential for drawing evidencebased recommendations for coaches' work.
To this end, soccer-specific intervention studies conducted in applied experimental settings with soccer players were reviewed to answer the following research questions considering three overarching perspectives: I. Theoretical perspective: What were the interventions' underlying frameworks to skill acquisition? How did these impact the practice design and coaches' behavior? II. Methodological perspective: What study designs, participant samples, instruments, and statistical methods were applied? What was the quality in reporting and risk of bias within studies? III. Outcome-related perspective: To what degree did the interventions contribute to effective perceptual-motor and perceptual-cognitive skill acquisition in the players?

METHODS
A systematic review was conducted according to the guidelines of preferred reporting items for systematic review and metaanalyses (PRISMA; Moher et al., 2009). On November 4 th 2020, a PRISMA-Protocol (PRISMA-P; Moher et al., 2015) was pre-registered in the Open Science Framework, outlining the objectives of this systematic review, the systematic search strategy, methods for assessing the methodological quality of individual studies, as well as predetermined methods for data extraction and synthesis in detail (Bergmann et al., 2020).

Systematic Search and Eligibility Criteria
On November 22 nd 2020, a systematic search was conducted across the databases SPORTDiscus, PsycInfo, MEDLINE (via EBSCOHost, respectively), and Web of Science (considering the categories Sports Science and Psychology). Each database was searched for peer-reviewed academic articles in English language without limitations for publication year. The systematic search strategy was developed by the research group in consultation with two librarians who helped to optimize search terms and to identify the most appropriate databases to best address the objectives of this systematic review (Harari et al., 2020). The following terms and operators were searched in each database considering titles and keywords (further details of the systematic database search and the respective settings of each database are documented within Supplementary Material I): (football * OR soccer) AND (intervention OR train * OR program * OR approach OR pract * OR effect * OR impact OR improv * OR learn * OR perform * OR coach * OR "skill acquisition" OR cognit * OR ecologic * OR constraints OR "information processing") NOT (novice OR referee OR injur * OR pupil * OR class OR goalkeep * OR NFL OR "american football" OR "australian football") The inclusion criteria for study selection are presented in Table 1 according to the PICOS components. In accordance with Spittle's (2021) classification of intervention research, only studies in at least applied settings were included. Additionally, studies were excluded if they examined novices, goalkeepers, referees, or athletes with mental or physical disabilities. Besides that, interventions focusing on injury prevention or rehabilitation (including warm-up programs), as well as interventions that only included physical or physiological training exercises without a ball (i.e., fitness training), were not considered. This was also true for such studies which only assessed physical or physiological outcomes. Due to the lower ecological validity, laboratory research designs, but also imagery or virtual-reality interventions, were excluded.

Article Screening
The search results were exported to and managed with EndNote (Version X9.3). EndNote was also used to remove duplicates automatically. Additionally, the first author and a trained research assistant screened all titles independently to remove previously missed duplicates manually. Throughout an independent screening process following the PRISMA guidelines, potentially eligible articles were checked against the inclusion and exclusion criteria by both reviewers. The inter-rater agreement (IRA) was calculated using the percentage of agreement as well as Cohen's Kappa (κ;Hallgren, 2012). The PRISMA flowchart in Figure 1 displays the systematic search results. The initial database search yielded 13,318 hits. Three articles were additionally added through other sources (Hossner et al., 2016;Bozkurt, 2018;Ozuak and Çaglayan, 2019). After removing 5,026 duplicates, 8,295 titles were screened independently by the first author and a trained research assistant. The previously initiated reviewer training comprised an independent title screening of ∼10% of identified records and a subsequent discussion of all differently categorized articles based on the inclusion and exclusion criteria. After screening all titles independently, the IRA yielded a sufficient agreement between the two reviewers (97.58%; κ = 0.56). If at least one reviewer argued for inclusion at this stage, the article was moved into the next stage of abstract screening. At this point, 338 potentially relevant abstracts were examined and an excellent IRA was reached (94.67%; κ = 0.92). In case of initial disagreement, study records were discussed by the two reviewers and a decision for or against full-text screening was made. Lastly, 70 full texts were screened independently, the IRA was again found to be excellent (91.43%; κ = 0.87). Uncertainties were discussed with the whole project group. Within this process, 33 articles were identified and one more article was found through manual reference list screening (Schöllhorn et al., 2012). Finally, 34 articles were included for systematic review.

Data Extraction
The data of included studies were extracted using a Microsoft Excel spreadsheet. All accessible information about the study, participant characteristics, intervention characteristics, instruments, outcome variables, as well as statistical techniques used to assess outcome effects were extracted. To determine whether a quantitative synthesis is possible based on reported data, descriptive and inferential statistics were also extracted. Necessary missing data was requested from the corresponding authors.

Assessment of Quality in Reporting and Risk of Bias
To judge the quality in reporting and the risk of bias in individual studies, all studies underwent a detailed assessment using an augmented version of the Downs and Black Checklist (Downs and Black, 1998) as well as a slightly modified version of the Template for Intervention Description and Replication (TIDieR; Hoffmann et al., 2014). 2 The Downs and Black Checklist (Downs and Black, 1998) was developed for the rating of both randomized and nonrandomized studies and has been recently used within systematic reviews and meta-analyses in sports sciences (Davies et al., 2017;Grgic et al., 2018;Ramos et al., 2020). Conventions for the studies' quality scores used in previous reviews were adopted but transferred to percentage as not all items were applicable to all studies. Good methodological quality is represented by a score ≥ 70%. A score ≤ 35% represents a low methodological quality. Scores between these thresholds were considered as moderate.
The TIDieR (Hoffmann et al., 2014) consists of 12 items to assess the quality in reporting of interventions. With the objectives of this systematic review in mind, a detailed intervention description is essential to interpret the results, allow potential replications, and provide coaches with evidence-based recommendations for their work.
To provide a reliable assessment, a similar procedure as for the systematic screening process was applied. An excellent IRA for both the Downs

Narrative and Quantitative Synthesis
The study characteristics and results were synthesized narratively for all included studies to provide a descriptive overview of the existent research. Based on this narrative synthesis as well as the judgement of quality in reporting and risk of bias, it was discussed whether an additional quantitative synthesis is possible according to criteria outlined in the PRISMA-P (Bergmann et al., 2020).
Performing a meta-analysis was deemed inappropriate due to the heterogeneity of dependent variables and study designs (e.g., randomized/non-randomized studies; parallel-group/crossover designs), as well as differences in the studies' quality. More specifically, pooling the effects as done within a meta-analysis was thought to lead to a comparison of dissimilar studies, the inclusion of poorly designed research, and thus, to invalidate results (Tod, 2019;Deeks et al., 2021). However, to estimate the intervention effect sizes in controlled studies, the effect size d was calculated in two different ways based on the reported data in the original publications. First, if the effectiveness was analyzed by group × time interactions, d was calculated from the interaction effect according to Cohen (1988). Second, if differences between groups at pre-and post-test were used to analyze the effectiveness (i.e., acquisition effects), d ppc2 was computed (see Equations 1-3; 2 The modified checklists, including further specifications of the items, are accessible in the OSF project for this publication (https://osf.io/85tjg/). Morris, 2008;p. 369): where the pooled standard deviation is defined as and The d ppc2 was also applied when multivariate statistics were used to classify effects for single outcome variables. If a retentiontest was conducted and all relevant data was available, d ppc2 was calculated using pre-and retention-test data to estimate learning effects.
To consider the intervention duration as a potential moderator, studies were classified as short-(i.e., ≤ 6 sessions or 180 min), mid-(i.e., ≤ 24 sessions or 720 min), and longterm (i.e., ≥ 25 sessions or 2,250 min) interventions. Similar approaches to synthesize intervention effects quantitatively were recently used within systematic reviews in sports science research (e.g., Demetriou and Höner, 2012;Raabe et al., 2019). Results are reported by the range of significant and non-significant effects, the percentage of significant effects, as well as the median when three or more effect sizes were found. Recommendations by Cohen (1988) were used to classify small (d ≥ 0.20), medium (d ≥ 0.50), and large effects (d ≥ 0.80). Effects displayed as positive values represent improvements in the respective outcome in favor of the intervention groups (IGs), while negative values represent greater improvements in control groups (CGs).

RESULTS
From the 34 identified articles, 85.7% were published between 2010 and 2020, revealing an increase in the number of relevant publications in recent years. The publication by Schöllhorn et al. (2006) includes two studies so that in total 35 individual studies were reviewed. All 35 studies were analyzed regarding their theoretical frameworks and methodological approaches (objectives 1 and 2). To investigate and compare the effectiveness (objective 3), only the 27 controlled studies were considered, while 8 studies without a CG were excluded.

Theoretical Frameworks and Intervention Content (Perspective 1)
The studies can be grouped into two overarching categories (see Figure 1). The first group (n = 18) represents studies in which interventions were designed based on theory-driven instructional approaches to practice and coaching. The second group (n = 17) includes studies investigating specific aspects of Game-based DL, focusing on intertrial variability, was compared to game-based practice supported by specific instructions and error correction of a coach. The practice programs are aimed at improving the players' creativity. YA NLP, based on representative learning designs and perception-action couplings, was compared to a linear information-processing practice program regarding the promotion of attackers' individual learning objectives.
If authors did not sufficiently report the applied study design, the research group decided on an appropriate descriptive terminology. AL, Amateur Leagues; Av, average; CG, control group; Div., division; DL, differential learning; DLB, differential learning blocked; DLR, differential learning random; IG, intervention Group; LL, local level; NLP, non-linear pedagogy; n. r., not reported; PL, performance level; RL, regional level; TGFU, Teaching Games for Understanding; TL, traditional learning; YA, youth academy; YL, Spanish youth football league. a The 20 participants were randomly divided into two groups that practiced the same content, but the coaches changed between the groups to reduce clustering effects.
Frontiers in Psychology | www.frontiersin.org either the practice design or coaching behavior and interventions were mostly not grounded in theoretical frameworks. In the first group of studies, interventions were based on mainly three different theoretical and methodological underpinnings, namely DL, TGFU, and NLP (see Table 2). With nine studies, most utilized the DL approach. Of these nine studies, seven compared DL to traditional learning (TL) methods. Additionally,  compared an enrichment program of DL to a non-active CG, and Gaspar et al. (2019) investigated acute effects after DL and TL sessions in a single group design.
TGFU was investigated in five studies. Two of these compared TGFU with TL, primarily applying technical drill practices supported by direct instructions (Práxedes et al., 2016;Sierra-Ríos et al., 2020). The further TGFU-studies without CGs only investigated within-group changes.
The remaining four studies in this first group investigated NLP. Roberts et al. (2020) compared NLP to promote youth academy attackers' individual learning objectives to a practice program grounded in information-processing theory. Práxedes et al. (2018a) compared NLP to a drill-based and technically focused TL program. Two studies without CG investigated the effects of NLP in a single-group design (Práxedes et al., 2019) or by comparing the effects in different performance groups (Práxedes et al., 2018b).
Within the second group of studies, specific aspects of practice and coaching were investigated (see Table 3). Only four studies reported skill acquisition frameworks or discussed the results in the light of theoretical considerations (Weigelt et al., 2000;Haaland and Hoff, 2003;Raastad et al., 2016;Schwab et al., 2019).
Sixteen studies focused on the design of practice tasks, whereby a multitude of different aspects and outcomes were pursued. For instance, seven studies examined the effects of drill-based practices on technical outcomes. Within these seven studies, two interventions only included deliberate technical drill practices (Weigelt et al., 2000;Montesano and Mazzeo, 2019). Others applied drill-based practices with subsequent game-based situations (e.g., Holt et al., 2012;Miranda et al., 2013) or combined it with coordination exercises (Boraczyński et al., 2019;Kösal et al., 2020).
Lastly, as the only study on coaches' behavior, Schwab et al. (2019) compared the effects of internal and external focus feedback for learning the knuckle ball freekick technique.

Study Designs
Various study designs were used to investigate the influence of practice design and coaching behavior on soccer players' skill acquisition (see Tables 2, 3). Single-group (n = 7; 20.0%), as well as multi-groups designs with two (n = 19; 54.3%), three (n = 6; 17.1%), four (n = 2; 5.7%), or six groups (n = 1; 2.9%) were found. Across the 27 controlled studies, different practice activities of CGs were found that need to be considered for interpreting intervention effects (see in detail Supplementary Material III). For example, most CGs practiced according to different approaches compared to the IG (n = 17) or participated in their regular ("usual care") practice (n = 4). Another three studies compared the interventions to non-active CGs. Only three cases were identified in which different practice or coaching methods as well as usual care or non-active CGs were investigated (e.g., Witkowski et al., 2011).

Measurements and Statistical Analyses
In 19 studies (54.3%) two measurement points were assessed (i.e., mostly pre-and post-test). Twelve studies (34.3%) conducted three, and three studies (11.4%) conducted four measurements (e.g., through intermediate measurements). Only six studies (17.1%) conducted a retention test to assess learning effects. Holt et al. (2012) observed the players' performances during the intervention in every session. Five studies that used systematic in-game observations averaged the performances from different matches as values for the respective measurement point (e.g., Práxedes Pizarro et al., 2017).
Most studies investigated male participants (n = 17, 48.6%), two studies (5.7%) examined both males and females (Raastad et al., 2016;Barquero-Ruiz et al., 2020) and 16 studies (45.7%) did not specify participants' sex. Whereas, four studies (11.4%) The "non-preferred leg group" practiced 45 min in three out of five weekly sessions including drills and SSGs by only using the non-preferred leg. The "preferred-leg group" used both legs equally.

(Continued)
Frontiers in Psychology | www.frontiersin.org  Weigelt et al. (2000): the sample was reduced from 26 to 20 participants as two players did not participate in every session and goalkeepers were excluded from analyses. b Zago et al. (2016): the sample was reduced from 26 to 20 participants that were able to participate in every session and test. c Raastad et al. (2016): the sample was reduced from 22 to 17 participants that completed the practice sessions and were not injured.
Frontiers in Psychology | www.frontiersin.org FIGURE 2 | Quality in reporting and risk of bias in individual studies. D&B Score, Downs and Black score (all items). Notes: Results are presented as means with error bars that represent the standard deviation. The dashed lines show thresholds for low (≤ 35%) and high (≥ 70%) scores. It must be considered that the subscales external validity and power were calculated from only three and two items, respectively.
investigated adult players and two studies (5.7%) examined both youths and adults, most studies utilized youth soccer players (82.9%, n = 29). Thereby, most studies were conducted with participants between the age of 11 to 15 (n = 20). Twentynine studies reported the performance level of the sample (82.9%; see Tables 2, 3). A wide range of terminologies was found, in some cases corresponding to national league systems. Regional level players (incl. descriptions such as "moderate level") were most frequently examined (n = 11, 31.4%), followed by studies with youth academy or national level players (n = 6, 17.1%) and local (or "low") level players (n = 4; 11.4%). Two studies compared the effects of interventions with players on different performance levels (Harvey et al., 2010;Práxedes et al., 2018b). Studies with adult participants investigated players from regional levels (i.e., 5 th to 8 th divisions). The youth and adult players in Schwab et al.'s (2019) study were recruited from local levels, Haaland and Hoff (2003) investigated competitive players.

Instruments
To assess outcome effects, skill tests were used in 22 studies

Quality in Reporting and Risk of Bias
The results for the assessment of the quality in reporting and risk of bias in individual studies are presented in Figure 2 (see in detail Supplementary Material II). On average, the score of the Downs and Black Checklist reveals a moderate methodological quality (M = 55.65%, SD = 12.86%, Mdn = 57.69%). There are, however, large differences between studies ranging from 30.77% to 78.57% of the total score (2x low score, 5x good score). Most papers lacked a sufficient report of principal confounders (item 5), only four studies (11.4%) met this criterion. To assess outcome effects, 10 studies (28.6%) used assessments that were not evaluated regarding their reliability (item 20). Moreover, very few studies conducted a priori sample size estimations (n = 5, 14.3%), leading to an overall low statistical power.
The TIDieR displays large differences across studies with an average score of 59.11% (SD = 22.52%, Mdn = 66.67%, [11.11, 100]). Even the two studies that reached 100% did not differentiate for every single session but provided an overview of the intervention content across sessions (Zago et al., 2016;Roberts et al., 2020). Another 20 studies (57.1%) presented the procedures in a manner that permits the replication of at least parts of interventions. Lastly, 15 studies (42.9%) controlled as to whether the intervention was delivered as intended, but only nine studies reported underlying criteria of observation. Table 4. In total, 150 outcome variables were investigated. However, 123 different variants for operationalization (e.g., different skill tests) and different contexts of assessments (e.g., game formats for systematic observations) lead to a highly heterogeneous conglomeration of outcomes. Moreover, six studies employed overall performance scores which included perceptual-motor and perceptual-cognitive characteristics. Yet, none of these scores were calculated based on variables assessed in comparable contexts or with identical equations.

A list of examined outcome variables is provided in
On average, 4.29 (SD = 3.39, Mdn = 4, [1, 20]) outcome variables were investigated per study. Thirty four studies (97.2%) assessed perceptual-motor outcomes, such as the players' technical performance in skill tests or in-game technical executions. Eleven studies (31.4%) assessed perceptualcognitive outcomes, such as the divergent (i.e., creativity) or convergent (i.e., decision-making) tactical performance. Two studies examined tactical behavior in defense (Harvey et al., 2010;Barquero-Ruiz et al., 2020), while all other studies addressed aspects in the attack.
Perceptual-motor outcomes, such as shooting accuracy or dribbling were most often collected via skill tests (n = 22 studies). Technical skill execution, assessed with the GPET or GPAI, was the most frequently examined in-game variable (n = 8 studies). Five studies distinguished between different techniques (i.e., in-game passes or dribbles), while others reported cumulative variables.
Decision-making was the most often examined perceptualcognitive outcome (n = 9 studies). Práxedes et al. (2016, 2018a,b, 2019), Práxedes Pizarro et al. (2017, and Roberts et al. (2020) reported whether the decisions occurred in different contexts (i.e., passing, dribbling, or shooting) while others only provided accumulated performance values. Lastly, in-game creativity was examined in two studies .

Effectiveness of Interventions (Perspective 3)
Inferences on the effectiveness of interventions can only be drawn from the 27 controlled studies. The main results were narratively synthesized in Table 5. A summary of results for studies without a CG is provided in Supplementary Material IV. For the quantitative synthesis of controlled studies, 51 effect sizes (26 significant; 50.98%) for perceptual-motor outcomes and 32 effect sizes (11 significant; 34.38%) for perceptual-cognitive outcomes could be recalculated (see Table 6). Given the large differences across studies (e.g., practice activities of CGs) a detailed analysis of individual studies is necessary.

Effectiveness of Instructional Approaches Differential Learning
Comparisons of DL and TL led to ambivalent results across studies (see Table 6). Most studies assessed the  Preferred foot performance (n = 2; 1) Non-preferred foot performance (n = 2; 1) Results in brackets represent the n of investigated variables (first number) and the n of different operationalizations (second number). Performance scores that were calculated from both perceptual-motor and perceptual-cognitive skill domains (n = 6) were not displayed in this   compared DL, using SSGs and technical drills, to a non-active group. Non-clinical magnitude-based inferences displayed greater acquisition effects for DL in U15 players regarding in-game dribbles and shots, as well as creative components fluency and versatility. In U17, greater effectiveness was only found in the technical shooting performance.  found positive effects in favor of game-based DL compared to a game-based CG in few creative components. More The DL group achieved significantly greater improvements in creative speed and ball dribbling tests. No significant differences compared to the usual care group were found in juggling and passing. In-game: -Creativity (fails, attempts, fluency, versatility, and originality) in passes, dribbles, and shots DL led to significantly greater effects in few creative components in both ages compared to TL. A decrease in fails in both ages was found. Significant differences were also found in attempts, originality, and most stressed in versatility. More significant and higher effect sizes were found in the U13 age group. TGFU was found to be significantly more effective than TL in promoting decision-making (passing and dribbling). The only significant difference in the execution variables in favor of TGFU was found for passes.

(Continued)
Frontiers in Psychology | www.frontiersin.org A significantly greater reduction in the number of unsuccessful on-the-ball executions, a decrease in off-the-ball errors, and more successful off-the-ball actions after TGFU were present. No differences in the successful on-the-ball performance were found. No significant group × time interaction was found. However, at post-test, the NLP group significantly outperformed the TL group in passing decisions and executions. No differences were found in dribbles. Roberts et al. (2020) -NLP (n = 11) -IP (n = 11)

(60) 4 U-Test and Wilcoxon
Skill-test: -Strong foot finishing -Weak foot finishing−1v1 -Decision-making Significantly greater improvements in the NLP group compared to the IP group were found in 1v1 and decision-making skills. No significant differences were found in the technical shooting proficiency or the execution time. The coordination group improved in all variables and fewer within-group effects were found compared to the usual care group. The unstructured practice group did not improve in any variable. However, no interaction effects were reported. (2019) -Add. practice (n = 9) -Usual care (n = 9) n. r. (60-80) n. r.

Descriptive analyses
Skill tests: -Passing -Shooting Both groups descriptively improved in successful passes and goal shots. Descriptively greater improvements were found after add. practice. Weigelt et al. (2000) -Intervention (n = 10) -Non-active (n = 10)  In-game: -Preferred foot performance utilization -Non-preferred foot performance utilization The experimental practice program led to a significantly greater utilization rate of the non-preferred leg during match-play, while the use of the preferred leg significantly decreased.
Haaland and Hoff (2003)  Only in speed dribbling, the lateral asymmetry was significantly reduced from pre-to post-test in the non-preferred-leg group. In other variables, no significant differences between groups were found due to improvements in both the preferred and non-preferred-leg groups. Both groups improved in their technical performance from pre-to post-test. Higher within-group effects were found in the SSG group. No interaction effects were reported.

(Continued)
Frontiers in Psychology | www.frontiersin.org  The ball juggling performance of both groups increased from pre-to post-test, but no interaction effect regarding transfer effects was found.
Internal and external focus feedback (n = 1) External focus feedback led to a significantly greater reduction in the rotational ball velocity from pre-to post and pre-to ret-test. No effects on the linear ball velocity were found.
Add., additional; Adol., adolescent; DI, Direct Instruction; DL, Differential Learning; DL and FB, Differential Learning and Feedback; HIIT, High-Intensity Interval Training; LSST, Loughborough Shooting Skill Test; TL, Traditional Learning; TGFU, Teaching Games for Understanding; NPL, non-preferred leg; PL, preferred leg; SSG, small-sided games. a Further variables, that do not correspond to the perceptual-motor or perceptual-cognitive skill domains (e.g., physiological outcomes), were investigated in the study. b Arslan et al. (2020), Kösal et al. (2020), and Witkowski et al. (2011) did not conduct or sufficiently report interaction effects. Thus, the narrative synthesis is limited to the comparison within group time effects between groups.
Frontiers in Psychology | www.frontiersin.org Drill-based P. All effect sizes comparing intervention and control groups from pre-to post, but also from pre-to ret-test were included in this overview. CD, controlled designs; UCD, uncontrolled designs. P., Practice, sig., significant. Effect sizes from studies on instructional approaches are provided in the upper half of the table. Effect sizes from studies on specific aspects of practice or coaching are presented below. a Schöllhorn et al. (2012) compared random DL, blocked DL, and TL groups. Only the effects for the comparisons of the random and blocked DL groups compared to the TL group, but not for the comparisons of random and blocked DL groups, were considered. b Práxedes et al. (2018b) used a multivariate analysis of variance including both the perceptual-motor (i.e., skill execution in passing and dribbling) and perceptual-cognitive outcomes (i.e., decision making in passing and dribbling). The non-significant effect (d = 1.05) is once included in the perceptual-motor and perceptual-cognitive domain as no differentiation was possible based on the reported results. c Guilherme et al. (2015a,b) reported the effects of the utilization rate for both the dominant and non-dominant leg. As a reduction of lateral asymmetries was intended, the higher utilization rate of the non-dominant leg, but also the lower utilization rate of the dominant leg is displayed as a positive effect. d No effects for game-based studies could be recalculated due to missing data. To sum up, two studies provide support on the general effectiveness of DL in at least a few technical or tactical outcomes Ozuak and Çaglayan, 2019). Results on the relative effectiveness of DL compared to TL regarding the promotion of soccer-specific techniques are, however, ambivalent. Yet, no study found significantly greater improvements after TL. In terms of creativity,  indicate the potential of DL to encourage versatility and originality compared to other game-based models.

Teaching Games for Understanding
The TGFU approach was compared to TL, primarily including decontextualized drills (Práxedes et al., 2016;Sierra-Ríos et al., 2020). Práxedes et al. (2016) found strong effects in favor of TGFU in decision-making for passes and dribbles (0.90 ≤ d ≤ 1.40; two of two significant). A significant effect in the execution variables was present for passes (d = 1.10), but not for dribbles (d = 0.45). In the study by Sierra-Ríos et al. (2020), TGFU outperformed the direct instruction group regarding significantly more successful off-the-ball decisions, and executions (2.62 ≤ d ≤ 2.80; 2 of 2 significant), as well as less unsuccessful actions (2.37 ≤ d ≤ 2.48; 2 of 2 significant). Regarding on-the-ball variables, only significantly fewer inefficient technical actions support the greater effectiveness of TGFU (d = 2.48), but not more efficient technical executions were found. Práxedes et al. (2018a) compared NLP to TL group that prioritized technical components. NLP was altogether not significantly more effective than TL (d = 1.05). 3 However, at post-test, the NLP group significantly outperformed the TL group in passing decisions and executions. No significant differences in dribble variables were found. Roberts et al. (2020) compared NLP to a practice program based on information-processing theory and found greater improvements after NLP in 1v1 (d = 0.74) and decision-making skills (d = 0.60). No significant differences occurred in technical variables [Mdn(d) = 0.44; 0.17 ≤ d ≤ 0.58].

Effectiveness of Interventions on Specific Aspects of Practice or Coaching
Generally, the low number of effect sizes within this second group, as well as an often limited content-related comparability of studies impede an accumulated report. For game-based interventions, even no effect size could be recalculated.
Sixteen controlled studies investigated multiple aspects of the practice design to promote technical skills. Regarding technical drill practices (with subsequent SSGs), Zago et al. (2016) found that the use of tape-matrix structures as spatio temporal constraints for both, technical drills, and phases of play situations, led to a significantly faster passing execution time compared to a group that practiced without such constraints (d = 1.05). No interactions in other precision-or time-based technical outcomes occurred [Mdn(d) = 0.14; 0.12 ≤ d ≤ 0.94]. Weigelt et al. (2000) found a significant interaction of deliberate juggling practice compared to a non-active CG, resulting from improvements in knee-juggling, but also positive transfer effects to ball control performance (d = 1.20). Such positive transfer to ball control was not confirmed by Raastad et al. (2016), additionally, no differences occurred in deliberate juggling practice with smaller or larger balls (d = 0.36).
The non-dominant leg practice interventions consistently led to greater effects compared to CGs. Guilherme et al. (2015a,b) found that additional drill-based non-dominant leg practice led to a higher utilization rate in game-based situations compared to non-additional practice CGs (2.31 ≤ d ≤ 2.91; two of two significant). The decrease in the preferred leg utilization rate reveals fewer lateral asymmetries during match-play (1.40 ≤ d ≤ 2.31; two of two significant). Haaland and Hoff (2003) further showed that the predominant use of the non-dominant leg within team practice led to improvements in both the non-dominant [Mdn(d) = 0.93; 0.88 ≤ d ≤ 1.32; three of three significant] and dominant legs [Mdn(d) = 0.95; 0.82 ≤ d ≤ 0.99; three of three significant] compared to a usual care CG.
In the only study on coaches' instructions (Schwab et al., 2019), external focus feedback led to greater improvements compared to internal focus feedback in the knuckle-ball technique due to a reduction in the rotational ball velocity (d = 0.59). No differences were found in the linear ball velocity (d = 0.11).

DISCUSSION
Enhancing soccer players' skill acquisition is an essential part of coaches' work that should be supported by evidence-based knowledge to identify the most effective practice and coaching methods. Thirty-five studies were reviewed to pool the growing body of research investigating perceptual-motor and perceptualcognitive skills as outcomes of interventions anchored in pitchbased settings. Two groups of studies were identified within the present research. In the first group, theory-driven instructional approaches were investigated and compared to non-active CGs or active controls practicing according to differing methodologies.
In the second group, specific aspects of the practice design or coaches' instructions were examined, but interventions were often not explicitly embedded within skill acquisition paradigms. In both groups, methodological differences in terms of the study designs, practice activities of CGs, outcome variables, as well as research quality challenge the comparability of the respective study results. Thus, interpreting the potential of the investigated methodologies requires a detailed discussion of underlying theoretical frameworks, derived principles to practice and coaching, as well as studies' methodological characteristics and limitations.

Theoretical Perspectives
Regarding studies on instructional approaches, underlying frameworks of self-organization approaches DL (i.e., dynamical systems theory) and NLP (i.e., ecological dynamics), as well as corresponding principles for practice and coaching, were clearly stated. Allocating TGFU solely based on skill acquisition theory seems inappropriate as it emerged from "an educative perspective rather than approaching it from purely the field of sports science/skill acquisition" (Harvey et al., 2018;p. 175). In contrast to the underpinnings of IGs' practice, theoretical frameworks of CGs were only stated in a few studies (e.g., Schöllhorn et al., 2006;Roberts et al., 2020). In many studies on TGFU and NLP, the respective CGs' practice was not explicitly linked to a cognitive, information-processing approach to skill acquisition. Instead, CGs were defined on a practical rather than on a theoretical level by "prioritizing the technical component" (Práxedes Pizarro et al., 2017, p. 187) while IGs practiced both technical and tactical aspects of play. Similarly, not withstanding the outlined exceptions, the vast number of studies on specific aspects did not design their interventions based on skill acquisition theory. Thus, many of the reviewed studies only contribute to a practical discussion and do not allow inferences on the explanatory value of skill acquisition theories, and thereto derived methodological conclusions on how to effectively design and deliver practice and coaching.

Practical Implementation
Typically, technical practices following an informationprocessing viewpoint targeted the development of "ideal movement archetypes" (Schöllhorn et al., 2012, p. 104) mostly through decontextualized drills. When aspiring to tactical objectives, practices were sometimes enriched by subsequent game-based situations to apply the to-be-learned skills in game-realistic contexts. Coaches acted as "expert[s] leading participants to a series of pre-determined outcomes" (Roberts et al., 2020(Roberts et al., , p. 1456) by using a high number of verbal instructions and demonstrations. However, it was often not clearly outlined if, when, and how the decomposed technical practice was linked to tactical aspects within the sessions' microstructure and the interventions' periodization.
Regarding DL, a variety of implementations regarding both technically focused drill-based and game-based interventions were utilized. For instance, "superfluous exercises" (Schöllhorn et al., 2006, p. 191) were added to target drills (e.g., additional body movements) or random modifications of SSGs (e.g., equipment or rules) were applied to create noise throughout the learning process. Nevertheless, a hitherto unresolved problem is the lack of metrics for quantifying variability ("noise") limiting opportunities for replications and practical recommendations for coaches' work.
Both NLP and TGFU interventions primarily include gamebased activities and in some cases applied skill practice tasks aiming at a rather natural variability within and between skills. This was mostly achieved through simplified "contexts of play" (Práxedes et al., 2019, p. 335) such as numerical superiority SSGs. These methodological similarities are in line with those outlined in the literature on TGFU and the CLA that is underpinned by principles of NLP (Renshaw et al., 2016;Harvey et al., 2018). Nevertheless, specific differences grounded in the respective theoretical underpinnings were often difficult to identify. Few studies reported in detail how coaches guided players in TGFU in an explicit process through reflective questions (Barquero-Ruiz et al., 2020). Otherwise, more implicit strategies in manipulating constraints to promote "strong functional couplings of information and movement" (Roberts et al., 2020(Roberts et al., , p. 1456 were described for NLP interventions. Yet, the implementation of specific strategies within a defined context, the triggers for their use, as well as the dose were scarcely specified. Such details in the light of the targeted outcomes are, however, particularly important to achieve conceptual clarity and reduce misinterpretations (Cope and Cushion, 2020).

Critical Appraisal of Methodological Study Characteristics
The present review faces often outlined methodological challenges in systematic reviews and meta-analyses of intervention research, such as large methodological diversity across studies, outcome variables, and practice activities of CGs (Abad Robles et al., 2020;Silva et al., 2021;Tassignon et al., 2021). Besides that, the quality in reporting and risk of bias, as well as the quality of intervention description largely differed across studies and some general limitations in the present body of research were uncovered. These limitations need to be critically discussed to ensure a careful and reflected interpretation of the interventions' effectiveness.

Study Designs and Intervention Characteristics
Studies without a CG need to be necessarily excluded from analyses on the effectiveness as the outcomes could be biased by various confounding factors. Even within controlled studies, the research designs varied substantially, limiting the comparability of results. Caution is also required when comparing the effects as different types of CGs were identified (see in detail Supplementary Material III). While comparisons to non-active CGs only allow conclusions on the general effectiveness of interventions, usual-care CGs provide evidence on whether interventions could improve current practice (Smelt et al., 2010). Attention is advised when CGs mainly practice technical aspects while IGs focus on both technical and tactical content, in particular when both technical and tactical outcomes were assessed through systematic in-game observations. Such comparisons neither allow a theory-led discussion of results, nor enable conclusions on the effectiveness of investigated approaches in relation to other methodologies.
The only studies which allow for conclusions on the relative effectiveness of practice and coaching methods are those that explicitly state theoretical and methodological principles of CGs and by approaching similar practice objectives. However, most studies did not include additional nonactive CGs so that improvements in both groups could not be interpreted (e.g., Hossner et al., 2016). Yet, ensuring both sufficient statistical power and including additional no-training CGs could be particularly difficult in applied experimental settings. Nevertheless, in line with Vater et al. (2021), including at least a non-active CG can be seen as the minimum requirement to ensure sufficient quality of evidence. Potentially, randomized crossover designs-as chosen by Roberts et al. (2020)-provide solutions that every player can profit from interventions by simultaneously controlling intervention effects.
Primarily, studies focused on performance changes (i.e., temporary fluctuations in behavior), but not on learning effects (i.e., permanent changes over time; Soderstrom and Bjork, 2015). Such short-time effects may be of great importance within high-performance settings when the goal is to quickly achieve a specific outcome. From a talent development viewpoint, however, developing players' skills systematically would benefit from knowledge on learning effects, thus, requiring the inclusion of retention tests within intervention designs. Additionally, knowledge about methods resulting in positive transfer to other skills but also from decontextualized to game-realistic settings would provide valuable information on how to effectively prepare players for the dynamic nature of the game.
A further identified concern relates to the scarce description of interventions which limits the potential to replicate the reviewed studies. Besides that, little consideration was dedicated to the intervention adherence and fidelity. Systematic assessments based on specific and outlined criteria of observation provide transparent insights on the intervention delivery (e.g., Práxedes et al., 2019). Especially within the highly dynamic and partly unpredictable pitch-based work, systematic assessments would add valuable information on the interventions' protocol adherence.

Outcome Variables and Statistical Analysis
The multitude of dependent variables illustrates the variety of relevant outcomes even within the perceptual-motor and perceptual-cognitive skill domains. Yet, multiple ways for measuring the same skills underline that there is no consensus regarding standard assessment tools (Williams et al., 2020). Additionally, instruments lacking a scientifically sound investigation of reliability and validity were found. Those results need to be interpreted with caution. Besides outcome-related skill tests, also an increasing number of in-game assessments were applied. These provide higher ecological validity and the opportunity to assess the functional application of practiced skills (for a similar discussion see Koopmann et al., 2020). However, many interfering factors (e.g., performance of other players) need to be controlled. Consequently, combining skill tests and in-game assessments is recommended.
A further prevalent issue was the low statistical power due to small sample sizes. Unsurprisingly given this result-and in line with similar investigations in sports science research (Abt et al., 2020)-only five studies reported a priori sample size estimations. Further, correlations among repeated measures in individual studies were unknown and may have resulted in less precise post-hoc power calculations. Besides inferential statistics, two studies used magnitude-based inferences, however, it is intensively discussed within sports science whether the use of such parameters is adequate as they are often used to justify small sample sizes (Lohse et al., 2020). Additionally, insufficient reports or interpretations of statistical results, missing interaction effects, or unknown group differences challenged the recalculation and interpretation of effect sizes.
Overall, these limitations reinforced the decision to refrain from a meta-analysis and need to be recognized when interpreting and comparing the effectiveness of interventions. Due to the diverse approaches, many studies must be considered in the light of individual characteristics.

Effectiveness of DL, TGFU, and NLP Interventions
A fundamental question arising from theoretical camps is as to whether practice should aim to improve ideal "textbook" techniques or allow exploration for individual solutions. Seven studies on DL dealt with this question, but large methodological differences and a low statistical power must be considered.
Only a few significant effects were found that confirm the superiority of DL compared to TL methods. Nevertheless, no study reported significantly greater effectiveness of TL methods, permitting the conclusion that DL seems to be at least a viable alternative for practicing technical soccer skills. Supporting this notion, comparisons to non-active CGs provide evidence that DL could generally improve performance (e.g., Ozuak and Çaglayan, 2019). Yet, besides two exceptions, the performance was assessed within skill tests. Thus, investigations that DL supports "more effective and more stable movement patterns" (Schöllhorn et al., 2012, p. 102) within match-play have been widely neglected so far. Further, the greater effectiveness of DL in the retention phase compared to the acquisition phase as found in a meta-analysis by Tassignon et al. (2021) could not be proofed due to the absence of retention tests in most studies.
The two studies on game-based DL support its potential to promote divergent tactical behavior in early-to mid-adolescent regional level players (e.g., . Although generalizable conclusions seem premature due to only two gamebased studies on divergent tactical outcomes, the results provide support for a developmental framework on promoting creativity in team sports, recommending DL as one appropriate method during adolescence (Santos et al., 2016).
In contrast to DL, studies on TGFU and NLP consistently aimed at promoting technical and convergent tactical skills (i.e., decision-making). Generally, positive effects regarding the promotion of perceptual-motor and perceptual-cognitive skills indicate the potential of both approaches although only two controlled designs were found. Besides, many studies rather provide evidence on the general effectiveness as targeted practice objectives in IGs (technical and tactical content) and CGs (primarily technical content) seem hardly comparable.
The predominant use of SSGs with numerical superiority led to greater effects in passing variables (e.g., Práxedes et al., 2018a). Consequently, if improvements in dribbling should be targeted, other strategies to simplify game demands are required. As a general result, more significant effects in perceptual-cognitive compared to perceptual-motor variables were found, probably due to the predominant focus on technical skills in CGs. Further, intervention periods longer than eight sessions as used by Roberts et al. (2020) or greater consideration of the technical elements (e.g., applied technical practice) may achieve greater effectiveness in the perceptual-motor domain.

Effectiveness of Interventions Focusing on Specific Aspects of Practice or Coaching
Providing a clear image of the effectiveness of studies on specific aspects to practice and coaching is challenged as diverse topics within skill acquisition research, practice methods, and technical outcomes were found. Further, most technical outcomes were only investigated through skills tests limiting the inferences which can be drawn for match-play performances. Nevertheless, most studies support the effectiveness of deliberate technical practice although specific study objectives and particularities of applied methods often require individual analyses. Specifically, positive results were reported after technically focused drillbased interventions to promote the technical non-dominant leg performance. Thus, interventions that primarily include TL methods-often applied as supplemental to regular team practice-were found to achieve improvements. Again, none of these studies assessed its impact on match-play performance. Only two studies provide support that drill-based practices over several months can positively impact match-play behavior in terms of a higher utilization rate of the non-dominant leg (Guilherme et al., 2015a,b). In summary, the present research on specific aspects provides first insights on the potential of different practice and coaching methods, but evidence on whether such practices can improve match-play performance is rare.

Limitations of This Systematic Review
The limitations of the present review mainly relate to three characteristics. First, due to large heterogeneity and methodological weaknesses across studies, pooling effect sizes within a meta-analysis to increase the statistical power and precision of effects was not possible. Besides the limited number of controlled designs on comparable approaches, not all effect sizes could be recalculated due to missing data, as well as unknown interaction effects or group differences. Consequently, inferences on the effectiveness of interventions were limited, they could not be statistically investigated regarding potential moderators (e.g., the intervention duration) and, thus, must be often interpreted in the light of individual study characteristics. Along with this, the extent of potential publication biases could not be statistically assessed. Second, strict inclusion criteria were applied to ensure the highest quality of individual studies (e.g., peer-reviewed research). Nevertheless, relevant studies published within different outlets (e.g., book chapters) may have been omitted as a consequence. Third, improvements in players' perceptual-motor and perceptual-cognitive skills were chosen from a wide range of potentially relevant outcomes (Nichol et al., 2019). Thus, skill acquisition needs to be considered at the interface of various interrelating elements, such as motivational or physiological attributes.

Future Perspectives
The growing number of studies over the past years elucidated the increasing interest and relevance of the topic. To encourage and improve further work, directions and recommendations for future studies are outlined relating to three main features.
First, higher standards of intervention research should be applied, especially in terms of sufficiently powered and controlled designs. Although guidelines for intervention research from health science or medicine may need to be translated and adapted for sports coaching research (e.g., O'Cathain et al., 2019), they can be valuable benchmarks for future work. Further, scientifically grounded and clearly outlined hypotheses are required to better understand what interventions aim to achieve. Pre-registrations could improve the current practice by outlining the study design, hypotheses, statistical analyses, and a priori sample size estimation.
Second, the specific practice objectives of different practice and coaching methods should be critically juxtaposed to conduct theory-driven intervention research on the relative effectiveness of approaches (for an example see Gray, 2020). Here, the additional use of non-active or usual care CGs would allow interpreting improvements in different practice groups. The application of instruments with high specificity is recommended to draw sound conclusions on the respective intervention objectives. These can be both outcome-related skill tests, but also representative in-game assessments if potential confounders could be controlled. Additionally to hypothesis testing based on group means, looking at individual development curves would allow specific implications on how different methods impact acquisition and learning in the light of personal characteristics (Anderson et al., 2021).
Third, systematic monitoring of practice and coaching would help to better understand and compare the implementation of different approaches. This requires empirically grounded metrics to quantify practice (e.g., variability) and coaching (e.g., instructions). Although proper metrics are scarce, systematic observation instruments can provide valuable data (e.g., Cushion et al., 2012b). Potentially relevant variables may also be adapted from periodization frameworks on skill acquisition (Farrow and Robertson, 2017;Otte et al., 2019). Collaborations with coaches who transfer methodological principles to their "realworld" coaching contexts could show how differently grounded approaches can be applied in different settings. This could specifically profit from mixed-method designs focusing on both the players' outcomes, but also coaches' intentions, expectations, and experiences when applying different methodologies.

Practical Implications
The present findings underline the need for coaches to make numerous considerations before deciding for or against a specific practice and coaching methodology. These may include, for example, coaches' reflections on which competencies they aim to improve in their players (i.e., the practice objectives) as well as the situative circumstances in which practice is conducted (e.g., age groups or sport facilities). Within coaches' daily work, but beyond the scope of the present review, these considerations cannot be limited to perceptual-motor and perceptual-cognitive skills, but need to go further (see, for example, Alves et al., 2017 for reasons on the use of SSGs in soccer practice).
Regarding the promotion of outcomes addressing technical skills, the available results support that both decomposed, repetition-based approaches as well as selforganziation/variability-based approaches implemented within drills or games can lead to improvements depending on how performance is operationalized. These findings challenge the traditional idea that players must learn the "fundamentals" first (e.g., ball handling) before they can be put into the game context (Newell, 2020). Given the other benefits of self-organization/game-based practice activities (e.g., more opportunities for decision-making), this suggests that, at very least, coaches should reduce the amount of decomposed, isolated drills in practice. Here, the key challenge is to find the most supportive integration and periodization of such practices, as well as the optimal degree of self-exploration and variability by considering individuals' requirements and needs.
Regarding convergent tactical skills (i.e., decision-making), the present results allow the cautious conclusion that those approaches which provide players a greater opportunity to selfexplore tactical solutions within game-realistic settings provide a better foundation to facilitate convergent tactical thinking. Knowledge of how and to what degree this process can be most effectively guided through implicit or explicit coaching strategies is still scarce as comparisons of specific strategies are lacking. In terms of players' divergent tactical behavior, it seems worthwhile to confront players with highly dynamic and unpredictable match-play situations that encourage flexible adaptions to game demands. Nevertheless, specific recommendations on the most conducive degree of variability and unpredictability cannot be made based on the present knowledge.

CONCLUSION
The small number of studies investigating similar approaches, heterogeneity across studies and dependent variables, as well as methodological weaknesses, limit the generalizability of results. Although it was possible to outline the potential of different practice and coaching methods regarding a variety of outcomes, most effects need to be critically interpreted in the light of individual study characteristics and weaknesses. Thus, based on the current body of knowledge, drawing scientifically sound conclusions on the effectiveness or even superiority of specific approaches would be premature. Rather, the present review aims to encourage further theory-driven and high-quality studies to extend the growing but still limited body of research conducted in applied experimental contexts. Furthermore, the findings must be systematically enriched by coaches' experiential knowledge. This will contribute to a conscious and evidence-based use of the coaches' methodological "toolbox" for effectively enhancing player learning and performance.

DATA AVAILABILITY STATEMENT
Further data, as well as the PRISMA checklist, are provided in Supplementary Materials I-V. Datasets of the systematic search and the quantitative synthesis can be accessed in the OSF project of this publication (https://osf.io/85tjg/).

AUTHOR CONTRIBUTIONS
FB, RG, SW, and OH: conceptualization, review, and editing original draft. FB: methodology, investigation and formal analysis, and visualization and writing original draft. OH: project administration and supervision, and funding acquisition. All authors contributed to the article and approved the submitted version.

FUNDING
This study was part of the research project "Scientific Support of the DFB's Talent Promotion Program", which was granted by the German Football Association (Deutscher Fußball-Bund, DFB). The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication. We also acknowledge support by Open Access Publishing Fund of the Eberhard Karls University Tübingen. This funding supported the payment of the publication fee.