The Effectiveness of Phonological-Based Instruction in English As a Foreign Language Students at Primary School Level: A Research Synthesis

Phonological-based instruction, namely phonological awareness instruction (PA) and phonics instruction, has been shown to be effective for early literacy skills among young children in western countries. Children who learn English as a foreign language (EFL) learn to read English differently from children in English-dominant societies. The effectiveness of the instruction in the EFL context has been much less investigated. The present study systematically reviewed 15 experimental and quasi-experimental studies, published between 2000 and 2016, on the effectiveness of phonological-based instruction in the EFL context. Study characteristics and instructional features were described, and effect sizes were calculated. Phonological-based instruction was consistently found to be effective among primary school EFL students on skills underlying reading, including phonemic awareness and non-word reading. The median effect size was moderate. In contrast, the effectiveness on word recognition (lexical access and pronunciation) and reading comprehension was inconsistent across studies. The median effect size on word reading was small. This pattern suggests a limitation of phonological-based instruction, namely the difficulty of transferring the underlying phonological outcomes to real reading. We found that most studies, although meeting the minimum standards of evidence for effectiveness, suffer from methodological flaws and are therefore potentially biased. The positive effects reported in this study should thus be interpreted with caution. The implication for practice is that including phonological-based instruction in the current English curriculum may be beneficial for young EFL students, helping them learn to phonologically decode English words. However, not enough evidence has been found to support the instructional effectiveness on real word recognition and reading comprehension. Future research on this topic with rigorous designs is needed so that strong causal inferences can be made. The findings of this study provide novel insights into foreign language education of English for young learners.

INTRODUCTION
English has an alphabetic writing system, which means that print represents speech largely at the phonemic level. Therefore, phonological decoding is heavily involved in learning to read in English. Phonological-based instruction, which focuses on explicit teaching of phonological analysis of words and letter-sound correspondences, has been shown to be effective in improving literacy outcomes at an early stage (Bus and Van Ijzendoorn, 1999; Ehri et al., 2001a,b). Whether this approach is effective with children who learn English as a foreign language (EFL) has not yet been substantially investigated.
Learning to read in English is challenging for EFL students. Exposure to oral and written English is limited in most EFL contexts (Gunderson, 2014). Thus, the development of English oral language and literacy skills of EFL students is constrained. English literacy instruction for this group of students is important but far from being evidence-based. This paper presents a systematic review of experimental and quasi-experimental studies on phonological-based instruction in the EFL context.

Conceptualization of Reading and the Underlying Phonological Predictors
Reading is making meaning out of print. According to the simple view of reading, two cognitive components are involved in the reading process: comprehension and word decoding (Hoover and Gough, 1990). Comprehension in reading is underpinned by listening comprehension and develops with oral language proficiency. Decoding is to access a word's meaning from its print form. To become a successful reader, one has to decode effortlessly so that most of the cognitive resources can be dedicated to comprehension. The set of words one can recognize effortlessly from memory without further breaking them down into smaller units is called sight vocabulary (Ehri, 1987).
To build sight vocabulary, orthographic mapping of a large quantity of words needs to be attained (Ehri, 2014). Orthographic mapping is the formation of letter-sound connections that bond the pronunciation and spelling of a word. Orthographic mapping is acquired by phonological decoding, which refers to in-depth analysis of the relation between the pronunciation and spelling of the word. The unit of analysis could be syllables, phonemes, rimes, or morphemes (see Ehri, 2014 for a review).
The idea that phonological decoding is necessary for sight reading is also advocated in the lexical quality hypothesis (Perfetti and Hart, 2002). According to the hypothesis, a word retrieved reliably and efficiently by sight is one represented with good quality, meaning that the word is represented with redundancy and specificity in terms of semantics, phonology, and orthography. Phoneme-grapheme mapping of a word is redundant if the pronunciation and print form of the word are separately specified in the representation. The redundant cues of phoneme-grapheme correspondence can confirm the connection among a word's spelling, pronunciation, and meaning by avoiding confusion with words that are similarly spelled or pronounced. For example, one who memorizes the word "president" by rote and does not decode it phonologically may have difficulty distinguishing it from visually similar words such as "present," "precedent," and "precious." The ability to decode phonologically is powerful in kick-starting the self-teaching mechanism for learning new vocabulary through independent reading (Share, 1995). Phonological decoding is enabled by two underlying abilities. One is phonological awareness, which refers to the ability to detect and manipulate linguistic sounds in speech, for example by segmenting, deleting, and blending (Hoien et al., 1995). The other is the knowledge of letter-sound correspondence. The two skills are the most robust predictors of subsequent reading performance after oral proficiency has been controlled (Ehri, 1998).
On the other hand, some researchers argue that early acquisition of sight vocabulary depends on rudimentary phonological awareness and extensive exposure to print, rather than refined phonemic awareness and proficient knowledge of letter-sound correspondence (Stuart et al., 2000; Fletcher-Flinn and Thompson, 2004; Thompson et al., 2015). Systematic instruction on alphabetic principles should be based on students' knowledge of sight words and a certain level of oral language proficiency (Thompson et al., 1996).

Word Reading for EFL Students
One important distinction between EFL and English-native-speaking students in terms of word reading is the strength of association between the formal information (spelling and pronunciation specification) and the semantic information of words (Jiang, 2000). According to the stage theory of lexical acquisition (Jiang, 2000), formal information is weakly linked to semantic information at the initial stage of lexical acquisition due to the constrained input students receive in the EFL context. When an English-native-speaking student sounds out a high-frequency word, access to meaning is usually assumed. However, lexical access is less likely to be guaranteed for an EFL student. For EFL students, learning to crack the code of print-and-pronunciation correspondence does not guarantee access to lexical-semantic information. Students are likely to know the meaning of fewer words than they can pronounce. Therefore, decoding (pronunciation-print association) and lexical access (print-meaning association) are treated as separate outcomes of word reading in the present study.
The EFL learning environment is marked by constrained input of both written and oral English. The constrained environment leads to delayed development of word decoding for EFL students compared to English-native-speaking students. For example, the logographic stage of word reading is short-lived for native speakers, who develop beyond the initial stage soon after formal schooling starts (Frith, 1985; Ehri, 1987, 2005). This stage may last longer for EFL students. Yin et al. (2007) conducted a study in Beijing and found that 50% of Grade 2 and 34% of Grade 4 EFL students in their sample were designated as recognizing words at the pre-alphabetic stage, which means that they recognized printed words using visual features instead of phonological decoding.
Although at the beginning EFL students rely less on phonological decoding to identify words than their English-speaking counterparts, there is evidence suggesting that phonological decoding facilitates vocabulary acquisition of EFL students. Hu (2008) found that Chinese-speaking EFL students better associated a novel word's auditory form with its semantic referent when the word was presented in print form. The print effect was larger for the EFL students with better phonological awareness. This suggests that phonological processing enabled phonological decoding to form bonds between pronunciation and print, thus enhancing sight vocabulary learning. Therefore, explicit instruction on English alphabetic principles and phonological awareness may be beneficial for EFL students; it may boost their development of word decoding skills and further facilitate sight vocabulary acquisition.

Definition of Phonological-Based Instruction
Phonological-based instruction focuses on aurally analyzing words at the phonemic level and mapping linguistic units to print so that students can eventually learn to read. Phonological-based instruction includes two types of instructional programs, phonics instruction and phonemic awareness instruction (PA).
Phonics instruction focuses on explicit and direct teaching of alphabetic principles and grapheme-phoneme correspondence rules, and on applying this knowledge to word- and text-level reading. PA focuses on teaching phonological skills, such as rhyming, identifying, segmenting, and blending phoneme sounds. There is overlap between phonemic awareness instruction and phonics instruction (Ehri et al., 2001b). Both may include the grapheme-phoneme correspondences of the 26 English letters. Phonics instruction goes beyond teaching letter-sound knowledge to more complex spelling rules such as digraphs and diphthongs. Phonemic awareness instruction focuses on training students to manipulate speech sounds without the presence of written letters, whereas word-level reading and spelling are important outcomes of phonics instruction. Phonemic awareness instruction can serve as a precursor to systematic phonics instruction (Ehri et al., 2001a).
A substantial amount of evidence has suggested that phonemic awareness and knowledge of alphabetic principles are important to learning to read in alphabetic languages at the beginning stage of learning (see Bus and Van Ijzendoorn, 1999 for a review). Instruction targeting phonological skills and letter knowledge is effective for young learners' word-level reading, regardless of whether English is their first language (National Reading Panel (U.S.), & National Institute of Child Health and Human Development (U.S.), 2000; Angiulli et al., 2004; Lipka and Siegel, 2010). Findings from quantitative meta-analyses showed that phonological-based instruction has moderate to large effects on English literacy skills (Ehri et al., 2001a,b).

Phonological-Based Instruction in the EFL Context
Phonological-based instruction in EFL classes is attracting more and more attention from researchers, school administrators, and teachers. EFL students do not have English-language environments in which to develop literacy skills spontaneously. Thus, the foundational skills, such as phonemic awareness and alphabetic principles, need to be explicitly taught so that students are prepared to learn to read in English (Shen, 2003). Government-endorsed English curricula in many EFL countries, such as Malaysia (Johnson and Tweedie, 2010) and Taiwan (Lai et al., 2009), include phonics or PA as an instructional component. Phonological-based instruction, phonics instruction in particular, has become a trend in English classes. Qualitative studies have found that English teachers in EFL countries believe that the spelling rules (phonics) are essential in teaching English to young children (Kuo, 2011). For example, English teachers in Hong Kong reported that they found phonics instruction effective for their students' spelling and reading performance in early grades (Lau and Rao, 2013).
Meanwhile, some concerns have been raised about adopting phonological-based instruction in EFL classrooms in early grades. The primary goal of English education at an early stage is oral language development. Some are concerned that phonics instruction focuses too much on identifying letters and words, and that introducing it too early could lead to neglect of conversational skills. For example, some English teachers took phonics instruction as an easy way out, because they were not confident in conversing in English themselves (Zhou and Mcgride-Chang, 2009).

Research Reviews With English-Language Learners (ELL) in English-Dominant Societies
Several literature reviews have been conducted on the effectiveness of phonological-based instruction with English-language learners (ELL), who learn English as their second language in English-dominant societies (Thorius and Sullivan, 2013; Stephens, 2014; Richards-Tutor et al., 2015). The National Reading Panel reported that ELL students generally respond to phonological-based instruction as well as English-native-speaking students do (August et al., 2009). A synthetic review was conducted on studies with ELL students who were struggling readers (Richards-Tutor et al., 2015). The findings showed that the interventions, which included phonological-based instruction as one of the instructional components, had moderate to large effects on word reading. Stephens (2014) reviewed intervention studies with Spanish-speaking children who were struggling with English reading, and found that comprehensive programs that included phonics and phonemic awareness instruction had large effects on reading comprehension. However, the effect of the phonological components alone was not investigated in these studies.
Furthermore, Thorius and Sullivan (2013) synthesized studies with ELL students in the Response to Intervention setting and found that most of the studies were conducted at the Tier 2 level, where the instruction is targeted at struggling readers and is delivered relatively intensively and in small groups. In contrast, studies investigating the instructional effects in general educational settings are scarce.
The findings discussed above cannot be directly generalized to the EFL context, because EFL students are in a completely different situation from ELL students in English-speaking countries. EFL students are neither from language minority groups nor struggling readers who have fallen behind their English-native-speaking counterparts. EFL students learn English as a school subject in their native countries. They have extensive print exposure in their first language, and many of them have started to receive formal literacy instruction in their first language before they learn to read in English.

Purposes
We reviewed studies published from 2000 to 2016 on the effectiveness of phonological-based instruction in the EFL context. This systematic review aimed to answer the following questions:
• How is the instruction conducted in the EFL context, particularly which instructional components were covered and which adapting strategies were adopted?
• Is the instruction effective on the following outcomes, listed in ascending order of level of cognitive processing: phonological awareness and letter knowledge, phonological decoding (non-word reading), word reading (lexical access and pronunciation), and reading comprehension?
Specifically, we conducted the following analyses: examining the characteristics of the studies and selecting less biased studies for further analysis, summarizing features of the treatment and comparison instruction, and calculating the effect size of the instruction in each study on the outcomes.

MATERIALS AND METHODS
Search Procedures
Four commonly used databases, PsycINFO, ERIC, Web of Science, and ProQuest Dissertations and Theses Global, were searched for studies of interest published in English from 2000 to 2016, covering a large number of peer-reviewed articles and unpublished dissertations in this field. Various combinations of three sets of key words were used in the initial literature search. The first set is related to instructional components or outcomes, such as phonics, phonemic awareness, literacy skills, phonological awareness, and rhyming. The second set is related to setting, such as EFL, ELL, and foreign countries. The third set is related to study design, such as intervention, instruction, and training. A manual search was also conducted in the key journals (TESOL Quarterly, Language Learning, Reading and Writing, Journal of Educational Psychology, Second Language Research, Learning and Individual Differences, Reading in a Foreign Language) to find relevant studies. We also checked reference lists in the key articles to find additional studies. The initial search resulted in 116 studies. The first author of this article screened the titles and abstracts and selected 20 studies that met the following criteria.
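The way the three key word sets combine can be illustrated with a minimal sketch. It assumes that each query joins one term from each set with a Boolean AND. The source does not report the exact query syntax, and the term lists below contain only the examples named above, so the function name `build_queries` and the quoting style are hypothetical.

```python
from itertools import product

# Example terms named in the text; the full lists used in the review are not reported.
CONTENT_TERMS = ["phonics", "phonemic awareness", "literacy skills",
                 "phonological awareness", "rhyming"]
SETTING_TERMS = ["EFL", "ELL", "foreign countries"]
DESIGN_TERMS = ["intervention", "instruction", "training"]


def build_queries(*term_sets):
    """Return one AND-joined query per combination of one term from each set.

    Assumption: a plain '"a" AND "b" AND "c"' syntax; real databases differ.
    """
    return [" AND ".join(f'"{term}"' for term in combo)
            for combo in product(*term_sets)]


queries = build_queries(CONTENT_TERMS, SETTING_TERMS, DESIGN_TERMS)
print(len(queries))
print(queries[0])
```

With these example lists the sketch produces 5 × 3 × 3 = 45 candidate queries; an actual search would adapt the syntax to each database.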

Criteria for Inclusion
First, studies must focus on evaluating the effects of phonological-based instruction. The instruction can be related to either phonological awareness or phonics, which might include a component of phonological awareness. Studies that examined instruction of other types, such as the International Phonetic Alphabet, were excluded. In addition, instruction in the control condition should not include a phonological component. Studies that compared the effectiveness of two phonological skill-based programs were discarded. Second, studies were included if they were conducted in a context where English is learned as a foreign language. Studies with ELL students in English-native countries were removed. Third, the participants of the studies must be at the primary grade level, ranging from kindergarten to Grade 6. Studies with secondary school students and adults were excluded.

Study Quality
Twenty studies were included for in-depth review of quality after excluding studies that did not meet the above criteria. A study was included for further analysis only if it met the minimum quality standards adapted from the evidence standards published by What Works Clearinghouse (2014).
• Studies must adopt an RCT or quasi-experimental design.
• Group equality on outcomes at pretest should be reported. Any inequality should be addressed in data analysis using the ANCOVA technique. Studies not reporting pretest scores on any of the outcomes were excluded.
• A study was excluded if only one teacher was involved in each condition, unless evidence was provided that the confounding effect was minimal or controlled. For example, the instruction in both treatment and control conditions was monitored to ensure that teachers in both conditions followed the lesson plan and no obvious alteration was introduced. For this reason, the studies of Chu et al. (2007), Jamaludin et al. (2015), Li and Chen (2016), Lin and Cheng (2008), Yang (2009), and Bing et al. (2013) were included, although only one teacher was involved in each condition. However, their findings serve only as weak evidence for effectiveness, because the confounding factor was not eliminated and the results were potentially biased.
• Outcomes were clearly specified. Studies in which generic outcomes were reported, such as English-language proficiency or English literacy without a detailed breakdown of skills, were excluded.
The two authors of this article independently assessed the quality of each study and selected studies that met the above criteria. Then they compared notes and reached agreement. This yielded 15 studies that qualified for the current study. These studies were further coded on study characteristics, instructional context and features in both treatment and comparison conditions, and their reported statistics were extracted to calculate effect sizes.
Coding
The coding scheme for this study is shown in Table 1. The creation of the coding scheme was iterative. During the first reading, each study was described in general dimensions adopted in previous studies (Ehri et al., 2001a; Gersten et al., 2009; Li, 2010). Then, we created categories for the dimensions so that all the studies were covered. The two authors independently coded the studies and compared notes. Then, the dimensions and categories that were ambiguously defined and led to different interpretations were removed or corrected. The third round of the coding yielded the following scheme.

Study Characteristics
The characteristic dimensions include authors and year of publication, region and native language, study design, study type, sample size, the number of classes involved, the number of schools involved, students' grade and ability level, schools' socioeconomic status (SES), and English-learning context, including the language medium of daily instruction and the number of English lessons per week.

Treatment Instruction
The treatment instruction was described by the following features: delivering personnel, group size, intensity, instructional content, and adapting strategies. The coding scheme of instructional content was adapted from a summary of instructional components commonly seen in phonological-based programs of different approaches (Fry, 2010). We also summarized each study on the adapting strategies adopted in the treatment condition to tailor the instruction to young EFL students. Purposes of the strategies include enabling active learning, creating comprehensive input, facilitating transfer from knowledge of the first language, developing oral language, and facilitating memorization. Two researchers independently coded the above-mentioned aspects for each study. We compared notes and resolved all differences. One hundred percent agreement was reached.

Comparison Instruction
The comparison condition was described if the following instructional components were identified.

Whole Word Recognition
Students were taught to recognize a word as a whole without breaking it into smaller parts of letter groups. A word is acquired through repeated encounters. Specifically, the whole word approach was coded in the following aspects: (a) whether explicit instruction on letter knowledge was included, (b) whether writing exercise was involved, (c) how the repetition was delivered, through authentic reading, multiple demonstrations and examples provided by the teachers, or through out-of-context drills.

Text Reading
Students read text under teachers' guidance.

Oral Language Development
Instruction focused on oral language skills without explicit teaching of reading skills; activities included singing, chanting, conversing, and listening comprehension.

Miscellaneous
Students received a comprehensive English program covering various linguistic components.

Outcomes
We coded skills and knowledge measured in the assessment. Skills and knowledge that underlie word reading included rhyme awareness, phonemic awareness, letter knowledge, and phonological decoding (non-word reading); tasks of word level reading included measures of print-pronunciation association and print-meaning association. Reading comprehension, which requires higher level of processing, was also included as an outcome.

Effect Size
We used Hedges' g as the measure of effect size (Hedges, 1981). Hedges' g applies a correction factor for small-sample bias: with small samples, Cohen's d can overestimate effects (Hedges, 1981). Hedges' g was calculated from the means and SDs of the treatment and control groups at the immediate posttests. We adjusted effect sizes using pretest scores with the formula by Wortman and Bryant (1985). For studies in which means and SDs of outcomes were not reported, we contacted the authors and obtained their raw data. The first author extracted the data for calculation and the second author checked for accuracy. To evaluate the strength of effect sizes, Cohen's (Cohen, 1998) criteria were adopted. An effect size of 0.2 is interpreted as small, 0.5 as moderate, and 0.8 and above as large. The criteria were also adopted by Ehri et al. (2001a,b) to judge the effect of phonological-based instruction in English-native contexts.
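The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' script. It uses the standard pooled-SD form of Hedges' g with the common approximation 1 − 3/(4df − 1) for the small-sample correction. The pretest adjustment is assumed here to be the posttest effect size minus the pretest effect size, which is how the Wortman and Bryant (1985) adjustment is commonly described; the source does not spell out the formula. The function names and the "negligible" label for values below 0.2 are hypothetical.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference with Hedges' small-sample correction."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd      # Cohen's d
    correction = 1 - 3 / (4 * df - 1)      # approximate correction factor
    return correction * d


def pretest_adjusted_g(post, pre):
    """Assumed adjustment: posttest g minus pretest g.

    `post` and `pre` are (mean_t, sd_t, n_t, mean_c, sd_c, n_c) tuples.
    """
    return hedges_g(*post) - hedges_g(*pre)


def interpret(g):
    """Label an effect size using the 0.2 / 0.5 / 0.8 benchmarks."""
    size = abs(g)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"  # label for values below 0.2 is an assumption


# Hypothetical numbers for illustration only, not data from any reviewed study.
post = (10.0, 2.0, 20, 8.0, 2.0, 20)
pre = (8.0, 2.0, 20, 8.0, 2.0, 20)
g_adjusted = pretest_adjusted_g(post, pre)
print(round(g_adjusted, 3), interpret(g_adjusted))
```

In this hypothetical case the groups are equal at pretest, so the adjustment leaves the posttest effect size unchanged; when the treatment group already leads at pretest, the adjusted value is smaller than the raw posttest value.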

RESULTS
The comprehensive literature search yielded 17 comparisons from the 15 studies. Two studies each included two comparisons; for these, only the comparison with the larger effect size was included. Five studies are unpublished dissertations and the other 10 studies were published in peer-reviewed journals. We describe the characteristics of each study and the treatment instruction, followed by a report of the effectiveness on each outcome (see details related to each study in the Supplementary Material).
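The selection rule for studies with two comparisons, followed by a median summary, can be sketched as below. The study labels and effect sizes are hypothetical placeholders, and "larger" is assumed to mean the larger signed value of g; the source does not say whether absolute values were compared.

```python
from statistics import median


def larger_effect_per_study(comparisons):
    """Keep, for each study, only the comparison with the larger effect size."""
    best = {}
    for study, g in comparisons:
        if study not in best or g > best[study]:
            best[study] = g
    return best


# Hypothetical (study, effect size) pairs; two studies contribute two comparisons.
comparisons = [
    ("Study A", 0.62), ("Study A", 0.35),
    ("Study B", 0.48),
    ("Study C", 0.91), ("Study C", 0.20),
]

selected = larger_effect_per_study(comparisons)
median_g = median(selected.values())
print(selected, median_g)
```

Keeping one comparison per study avoids counting the same control group twice, though choosing the larger value may bias the summary upward.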

Research Context
Countries and regions where the reviewed studies were conducted include Nigeria (Shepherd, 2013; Eshiet, 2014), India (Dixon et al., 2011), Malaysia (Johnson and Tweedie, 2010; Jamaludin et al., 2015), Japan (Allen-Tamai, 2000), PR China (Ashmore et al., 2003; Bing et al., 2013), Hong Kong (Yeung, 2012;, and Taiwan (Chu et al., 2007; Lin and Cheng, 2008; Lai et al., 2009; Yang, 2009; Li and Chen, 2016). According to Kachru (1990), the use of English in the aforementioned areas can be summarized into two categories: the outer circle and the expanding circle. The outer circle refers to settings where English is not the native language but has been widely used in the chief institutions of the country and plays an important role in a multilingual setting (Crystal, 2003). These countries and regions include India, Malaysia, Nigeria, Hong Kong, and over 50 other territories. The expanding circle involves nations where the importance of English as an international language is recognized, though they do not have a history of colonization by English-native countries, nor have they given English any special role in government. In countries belonging to the expanding circle, English is taught as a foreign language and considered to be of high priority and great importance for academic success. Countries and regions in this category include Mainland China, Taiwan, and Japan.
However, the English learning situation of young learners in the two circles did not differ much with regard to the studies involved in the analysis. English was taught as a school subject, and the daily instruction was mediated by the native language in most of the studies, except for the study by Dixon (2010), where English was the medium of instruction. The number of English lessons students took each week ranged from two to eight, with a median of three. Each lesson lasted from 30 to 40 min. This is very limited exposure compared to that of their counterparts in English-dominant contexts.
Eleven studies reported the SES of schools or students. Five studies were conducted in low-SES schools in rural areas (Johnson and Tweedie, 2010; Dixon et al., 2011; Shepherd, 2013; Eshiet, 2014; Jamaludin et al., 2015). Two studies were performed in schools of mixed levels of SES (Yeung, 2012;. Four studies were conducted in schools of middle or high SES (Allen-Tamai, 2000; Ashmore et al., 2003; Bing et al., 2013; Li and Chen, 2016).

Study Design
Among the 15 studies, three adopted a randomized controlled trial design. The randomization was conducted at the student level in two studies (Lai et al., 2009;. Class served as the unit of randomization and was taken into account in data analysis in one study (Dixon et al., 2011). Two studies did not provide information with regard to group assignment (Johnson and Tweedie, 2010; Jamaludin et al., 2015). The remaining 10 studies adopted a quasi-experimental design, as the group assignment was not random. Intact classes were assigned to different groups in eight studies. In the study by Chu et al. (2007), students from different groups were matched on one of the outcome variables. In Yeung's (2012) study, the students were assigned by convenience.
None of the studies adopted blinding procedures in delivering the instruction. The teachers in the treatment group knew that they were in the treatment group, and in five studies they reported positive attitudes toward the experimental method (Ashmore et al., 2003; Yang, 2009; Yeung, 2012; Shepherd, 2013; Eshiet, 2014). This potentially introduced bias, as the teachers' positive attitudes could contribute to the effectiveness (Torgerson and Torgerson, 2013). In nine studies (Allen-Tamai, 2000; Ashmore et al., 2003; Chu et al., 2007; Lin and Cheng, 2008; Yang, 2009; Yeung, 2012; Shepherd, 2013; Jamaludin et al., 2015; Li and Chen, 2016), teachers in the treatment and control groups were from the same school. It is possible that the comparison group was contaminated, because the teachers in the treatment group could communicate the method to the teachers in the control group. Although the fidelity check ensured that the instructional activities in the treatment conditions did not occur in the comparison conditions, the contamination was minimized but cannot be ruled out.
In summary, although 15 studies met the minimum standards of quality, only 3 studies qualified for making causal inferences (Lai et al., 2009; Dixon et al., 2011;. Causal inference with respect to instructional effectiveness means that the improvement in the outcomes can be solely attributed to the instruction. The three studies eliminated unobserved confounding effects through group randomization. In addition, the confounding factor of teacher was also eliminated and the control groups were not contaminated in the three studies. Ten studies adopted a quasi-experimental design, which potentially introduced unobserved confounding factors that may have contributed to the outcome. The primary confounding factor, teacher, was identified in six studies.

Sample Characteristics
The sample size of the studies ranged from 40 to 1,030. Eight studies were conducted with kindergarten or Grade 1 students whose experience of learning English was less than 1 year (Ashmore et al., 2003; Johnson and Tweedie, 2010; Dixon et al., 2011; Yeung, 2012; Bing et al., 2013; Shepherd, 2013; Eshiet, 2014). Three studies involved Grade 3 students (Chu et al., 2007; Lai et al., 2009; Yang, 2009), and three studies involved Grade 5 students (Lin and Cheng, 2008; Jamaludin et al., 2015; Li and Chen, 2016). One study was conducted with students of mixed grades from Grade 1 to 6 (Allen-Tamai, 2000). Two studies were conducted with students identified as struggling readers of English (Chu et al., 2007; Jamaludin et al., 2015). Students' experience of learning English out of school was taken into account in four studies. One study recruited participants who had no out-of-school experience (Bing et al., 2013), and the remaining three studies showed equal distribution across the two groups on this variable (Allen-Tamai, 2000; Yang, 2009; Li and Chen, 2016). None of the studies reported students' level of oral or written proficiency in their native languages.

Comparison Conditions
Ten studies compared the phonological-based approach with an alternative approach of explicit reading instruction. Comparison conditions in six studies (Lin and Cheng, 2008; Dixon et al., 2011; Yeung, 2012; Shepherd, 2013; Eshiet, 2014) adopted the whole word approach to teach students to read, in which a word is taught through repeated encounters without phonological decoding. Letter names were taught along with or prior to whole word recognition in eight conditions. In four conditions, a word was repeatedly shown to students in various demonstrations and forms provided by the teachers, e.g., in different sentences and pictures (Yang, 2009; Yeung, 2012; Jamaludin et al., 2015). Word repetition was conducted via out-of-context drills in the same format in two studies. The students were repeatedly shown a word on flashcards and asked to repeat its pronunciation and its meaning in their first language after the teacher (Ashmore et al., 2003; Lin and Cheng, 2008). Two studies adopted mnemonics as a strategy to facilitate students' rote memorization of the association among a word's pronunciation, print form, and meaning (Lin and Cheng, 2008; Shepherd, 2013). Writing and copying exercises were featured in the whole word approach in one study. Three studies described the comparison conditions as the whole word approach but did not give implementation details (Lai et al., 2009; Dixon et al., 2011; Eshiet, 2014).
Besides the whole word approach, reading instruction in one comparison condition focused on text reading. The teacher explained the meaning of the text sentence by sentence and asked questions to check for comprehension (Yang, 2009).
Phonological-based instruction was compared with the status quo of English education in three studies (Ashmore et al., 2003;Johnson and Tweedie, 2010;Li and Chen, 2016). The students received regular English curriculum, which included various instructional components of both reading and oral language development. All the curricula were without systematic and explicit teaching of phonemic awareness and phonics. Incidental teaching of English alphabetic knowledge and phonological awareness was mentioned in two control conditions (Lai et al., 2009;Johnson and Tweedie, 2010).
Comparison conditions in two studies focused on oral/aural language development without introducing print (Allen-Tamai, 2000; Chu et al., 2007). Activities included watching videos, listening to stories, conversing, singing, and chanting.

Delivering Personnel
Instruction in the treatment conditions was delivered by school teachers in 13 studies. It was delivered by the researcher in the Bing et al. (2013) study and by a computer program in the Lai et al. (2009) study.

Instructional Intensity and Setting
All studies except for two were conducted in a classroom setting. The class size in the 12 studies ranged from 12 to 50, with a median of 25. Lai et al. (2009) developed a computer program to deliver the PA instruction at the individual level. Chu et al. (2007) conducted a study with struggling readers, and the instruction was delivered in groups of eight. The instruction reviewed was delivered regularly. Each instructional session lasted from 20 min to 1 h. The instruction was provided for different lengths of time, from 1 week to one academic year. The accumulated instruction time ranged from 120 min to 128 h, with a median of 560 min.

Instructional Components
Among the 15 studies, 6 studies provided phonological-based instruction that featured phonological awareness training targeting rhyme and phonemic awareness without introducing alphabetic knowledge (Allen-Tamai, 2000; Ashmore et al., 2003;Lai et al., 2009;Yeung, 2012;Bing et al., 2013;. The other nine studies adopted phonics programs, which varied in the instructional components covered. Six out of the eight phonics programs focused on synthetic phonics, which is typically organized by grapheme-phoneme correspondence (GPC) rules in the sequence of the English alphabet, followed by the units of digraphs and consonant blends. Students practiced utilizing a specific set of GPC rules to read and spell words. Johnson and Tweedie (2010) and Li and Chen (2016) adopted a simplified version of synthetic phonics, which included only letter-sound knowledge and phonemic awareness without extending to complicated spelling patterns. Synthetic phonics was taught in one study to enhance vocabulary acquisition (Lin and Cheng, 2008). Specifically, letter-sound correspondences related to the target words were taught to the students, who then used this knowledge to sound out the novel words.
It is worth noting that the treatment conditions in four studies (Dixon et al., 2011; Shepherd, 2013; Eshiet, 2014; Jamaludin et al., 2015) adopted Jolly Phonics, a commercially available teaching program developed in the UK (Lloyd, 2001). It is a synthetic phonics program consisting of 42 units, including 26 letter-sound correspondences and 16 digraph rules. The program features multiple components, including phonemic awareness activities, word decoding, dictation, and decodable text practice. It uses stories, songs, and body gestures to create mnemonics that help students remember letter-sound correspondences and to engage students in learning.
Three treatment conditions adopted analytic phonics. Letter-sound correspondences were taught using the words in textbooks, and word decoding was taught during oral text reading (Chu et al., 2007). Teaching materials in the Yang (2009) study were stories featuring rhyming pairs. Reading activities included rhyme detection, word families, and spelling patterns. Flashcard activities were implemented for students to practice recognizing sight words in all three conditions.

Adapting Strategies
Various strategies were employed to adapt phonological-based instruction to the characteristics of EFL students. Adapting strategies included introducing the meaning of words before phonologically analyzing them (Chu et al., 2007;Lin and Cheng, 2008;Yang, 2009;, using stories, songs, and games to engage students (Yang, 2009; Dixon et al., 2011; Shepherd, 2013; Eshiet, 2014; Jamaludin et al., 2015), using computer programs to provide intensive and individualized training on phonemic awareness (Lai et al., 2009), using the total physical response method to create comprehensible input (Johnson and Tweedie, 2010), using body movements to demonstrate sound segmentation and blending , and using mnemonics for students to remember letter-sound correspondences and word spellings (Lin and Cheng, 2008; Dixon et al., 2011; Shepherd, 2013; Eshiet, 2014; Jamaludin et al., 2015).

Effectiveness of Phonological-Based Instruction
The range and the median of effect sizes on each outcome are reported. A meta-analysis synthesizing multiple studies into a representative mean effect size was not conducted, because the studies reviewed varied in both the features of the instruction and the characteristics of the studies. Moreover, the analysis of the study designs showed that most of the studies were potentially biased; thus, a synthesized effect size could be misleading.
The presentation of the instructional outcomes follows the ascending order of the level of cognitive processing in reading. Rhyme detection, phonemic awareness, letter naming, letter-sound knowledge, and non-word reading are skills and knowledge that underpin the reading process; real word recognition, including pronunciation and lexical access, involves real-time activities of word reading; and text comprehension requires higher-level processing beyond word recognition.
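As a minimal sketch of the per-outcome summary described above, the following Python snippet computes a standardized mean difference for each comparison and then reports the range and median. It assumes Cohen's d with a pooled standard deviation, which this section does not confirm as the formula actually used; the function names and all numbers are hypothetical and are not data from the reviewed studies.

```python
from math import sqrt
from statistics import median


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (treatment minus control) with pooled SD.

    Assumption: this is one common effect-size formula, not necessarily
    the one used in the review.
    """
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / sqrt(pooled_var)


def summarize(effect_sizes):
    """Return the range and median of a list of effect sizes."""
    return {
        "min": min(effect_sizes),
        "max": max(effect_sizes),
        "median": median(effect_sizes),
    }


if __name__ == "__main__":
    # Hypothetical posttest statistics for one comparison.
    d = cohens_d(mean_t=12.0, sd_t=4.0, n_t=30, mean_c=10.0, sd_c=4.0, n_c=30)
    print(d)  # 0.5

    # Hypothetical effect sizes for several comparisons on one outcome.
    print(summarize([0.1, 0.5, 0.3]))
```

Reporting the median rather than a pooled mean matches the decision above not to meta-analyze heterogeneous and potentially biased studies.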

Rhyme Awareness
Four comparisons were made on rhyme detection. Tests of two comparisons (Yeung, 2012; showed significant effects, while the other two did not (Allen-Tamai, 2000; Bing et al., 2013). The effect size ranges from −0.09 to 0.81. The two significant effects have sizes of 0.81 and 0.37, respectively.

Phonemic Awareness
Nine comparisons were made on phonemic awareness tapped by tasks, including phonemic segmentation, deletion, blending, detection, and identification. Statistical tests of seven comparisons (Ashmore et al., 2003;Chu et al., 2007;Lai et al., 2009;Yang, 2009;Bing et al., 2013;Shepherd, 2013; showed significant effects and two (Yeung, 2012;Eshiet, 2014) showed insignificant effects. The effect size ranges from −0.05 to 1.69 with a median of 0.62. The median size of the significant effects is 0.62 which is considered to be moderate.

Letter Knowledge
Five comparisons were made on letter recognition. Two of them showed significant effects (Johnson and Tweedie, 2010; Dixon et al., 2011). The effect size ranges from 0.01 to 0.44 with a median of 0.30. The two significant effects have sizes of 0.29 and 0.27, respectively. One comparison was made on letter-sound correspondence (Dixon, 2010). The result showed an extremely large effect, because the students in the control group showed no knowledge of letter-sound correspondence in either the pretest or the posttest.

Non-word Reading
Six comparisons (Chu et al., 2007;Johnson and Tweedie, 2010;Dixon et al., 2011;Shepherd, 2013; were made on non-word reading and all showed significant results favoring phonological-based instruction. The effect size ranges from 0.32 to 1.20 with a median of 0.55 which is considered to be moderate.

Word Reading
Twelve comparisons were conducted on English real word reading. Word decoding, the association between print and pronunciation, was measured in 10 comparisons by the same task which was to read aloud a list of words untimed. Lexical access, the association between print and meaning, was measured in two studies. One measure was word-picture matching (Allen-Tamai, 2000) and the other was native-language translation (Lin and Cheng, 2008).
The words chosen for the word-reading task in five studies were supposed to be familiar to the students (Chu et al., 2007; Johnson and Tweedie, 2010; Yeung, 2012; Li and Chen, 2016). They were compiled from the English textbooks and were simple words without complex morphological or orthographic structures. Two studies chose words that were novel to the students, and the words were taught in the treatment instruction (Allen-Tamai, 2000; Lin and Cheng, 2008). Two studies chose standardized measures developed for assessing the reading ability of native English speakers (Ashmore et al., 2003; Eshiet, 2014). In the Eshiet (2014) study, the items were arranged in ascending order of difficulty; in the Ashmore et al. (2003) study, the items were of simple structure and could be decoded using simple correspondence rules.
Results of the comparisons on word pronunciation were not consistent. Effects of six out of nine comparisons (Lai et al., 2009;Johnson and Tweedie, 2010;Dixon et al., 2011;Shepherd, 2013; were significant. The effect size ranges from −0.05 to 0.65 with a median of 0.32. The median size of the significant effects is 0.33, which is small according to Cohen's (Cohen, 1998) criteria. The comparisons on lexical access of words showed no significant effects.
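The verbal labels attached to the effect sizes above can be sketched as a simple lookup. The snippet assumes the conventional cut-offs of 0.2, 0.5, and 0.8 commonly attributed to Cohen; this section does not state the exact thresholds the review applied, and the function name and the "negligible" label are illustrative choices.

```python
def label_effect_size(d):
    """Map a standardized mean difference to a conventional verbal label.

    Assumption: cut-offs of 0.2 (small), 0.5 (moderate), and 0.8 (large),
    applied to the absolute value of d.
    """
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "moderate"
    return "large"


if __name__ == "__main__":
    # Median effect sizes reported in this section for three outcomes.
    for outcome, d in [("phonemic awareness", 0.62),
                       ("non-word reading", 0.55),
                       ("word pronunciation", 0.33)]:
        print(outcome, label_effect_size(d))
```

Under these assumed cut-offs, the labels agree with the text: moderate for phonemic awareness and non-word reading, and small for word pronunciation.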

Discussion
In this review, we examined studies investigating effectiveness of phonological-based instruction with EFL students at primary school level. After screening literature based on our criteria, 15 studies were included in this review. We described the study characteristics and the instructional features in the treatment and comparison conditions, and calculated effect size on a variety of English literacy skills for each study.
Although the studies reviewed met the minimum standards of evidence, most of them had flawed design and the findings rendered weak evidence for effectiveness (What Works Clearinghouse, 2014). The flawed design could lead to biased results. First, the positive effects on the outcomes might not be solely attributed to the instruction in 12 studies due to the lack of group randomization. Second, the effects might be inflated in 12 studies, because the clustering of participants was not taken into account. Third, other factors that affect comparisons identified in this study include variation of control conditions and outcome measures. The comparison in which the control condition contains a reading component may yield a smaller effect on literacy skills than the comparison where the control condition focuses on oral/aural language without exposure to print. Instructional outcomes measured by the items which were simple and aligned to the instruction may yield larger instructional effects than the ones measured by the standardized items which were difficult and were not directly addressed in the instruction. Last, some other sources of bias cannot be assessed because of the absence of required descriptive information, e.g., students' proficiency level in English or in their native language.
Despite the limitation and variation of study designs, we identified some patterns consistent across the studies, which are discussed below together with suggestions and implications for future research and educational practice.

English Learning Context
As expected, the EFL students in the reviewed studies had limited exposure to the English language. The primary English input that EFL students received came from English classes at school. The duration of English exposure was less than 5 h a week as indicated in the reviewed studies, which was much more restricted than that of students from English-speaking areas. Moreover, the English reading materials students were exposed to were very limited. New words were usually learned through explanation from their teachers and drill practices, rather than through authentic reading. Even in countries where English is one of the official languages, the students in the reviewed studies did not have sufficient exposure to the oral and written language; their situation of learning English was similar to that of students in a completely foreign context.

Treatment Instructions
The treatment instructions reviewed were mostly in the realm of synthetic phonics, which focuses on the explicit instruction of alphabetic principles and applying the knowledge to sounding out novel words. Phonemic awareness was also commonly included as a component in synthetic phonics programs, or as an intact program which could serve as the precursor to synthetic phonics. Activities such as sight word recognition and word family analysis, which were often featured in analytic phonics, were rarely implemented in the studies reviewed. Therefore, the effectiveness of phonological-based instruction demonstrated in this review is more reflective of the effectiveness of synthetic phonics instruction than that of other approaches.
Which approach is the most effective one? While this issue is debated extensively in English-speaking countries (Wyse and Styles, 2007), it is less addressed in the EFL context. Analytic phonics approach focuses on phonetically analyzing words which are already familiar to students. Thus, a basic level of sight vocabulary is required in this method. In contrast, prior knowledge of literacy is not required in synthetic phonics because students are taught to sound out novel words. For this reason, synthetic phonics approach may be more feasible in the EFL context, especially for students in lower grades who have little or no prior experience with English. However, results of some studies reviewed here do not support the hypothesis. For example, Yang (2009) analyzed the effect of analytic phonics featuring activities of sight words and analogy phonics, and the results showed large effects in phonemic awareness and sentence comprehension. Wu (2005) conducted a study with Grade 3 students in Taiwan comparing analytic and synthetic phonics and found that the two approaches were equally effective when systematically delivered. Therefore, there is not enough evidence to draw any conclusion at this point.
Adapting strategies were adopted in the majority of the reviewed studies. Some strategies were aimed at engaging students and promoting active learning, such as playing games, while others were used to facilitate memorization, such as mnemonics and telling stories. All these strategies are helpful for young children, regardless of whether their first language is English. Meanwhile, drawing on skills and knowledge from students' native language was not seen as an adapting strategy in the reviewed studies. When starting to learn English, EFL students are usually well developed in speaking their native language and have been receiving native-language literacy instruction formally and intensively. Studies have found that knowledge in the first language can be transferred to English learning, and that instruction facilitating this transfer is effective on English literacy outcomes (Cummins, 2005). For example, Nishanimut et al. (2013) tested the instructional method of comparing the writing system of Kannada (the students' first language) with the English alphabet in synthetic phonics instruction and found large effects favoring this method.

Instructional Effectiveness
The reviewed studies consistently showed positive instructional effects on reading underlying skills including phonemic awareness and phonological decoding. Since some of the studies were methodologically flawed, the findings only weakly suggest the effectiveness of phonological-based instruction with primary school EFL students, in comparison to oral language teaching and whole word reading approach.
The phonological reading skills tap the process of aurally breaking a word into smaller units and applying the sound-print conversion rules to sound it out, which was directly addressed in the phonological-based instruction, as shown by the analysis of the treatment instructions. The positive instructional effects imply that explicit teaching of decoding skills in English may be independent of English oral language experience and proficiency. A certain level of oral language proficiency and sight vocabulary may not be a prerequisite for learning English alphabetic principles and phonological awareness. Further studies are needed to test this hypothesis.
The two studies that measured lexical access in word recognition both showed insignificant results. The words chosen for assessment were directly taught in the instruction. This suggests that the instruction focused on skills of phonological decoding was not effective on lexical retrieval of words via the print form. This contradicts the findings with native English-speaking children, which found that phonological decoding enhanced sight word vocabulary and students with good phonological skills learned sight words faster and more accurately (Ehri, 2014).
The outcomes of word recognition were measured by the task of pronouncing word items that were of high frequency and simple to decode in 10 studies. Seven out of the 10 studies showed significantly positive effects. The insignificant effects found in two of the three studies were attributed to lack of oral support and shortened length. The significant effects were dominantly small. In contrast, studies with English-native-speaking children found that phonological-based instruction had moderate to large effect on word recognition (Ehri et al., 2001a,b).
Although the quantitative difference of instructional effect was shown on word reading between English native children and EFL children, studies with the two populations revealed a similar pattern that the instruction is more effective on phonological decoding than on word recognition (Ehri et al., 2001a,b). The pattern was repeated in all the reviewed studies which reported results on both outcomes.
This pattern suggests a limitation of the instruction regardless of context, namely the difficulty of transferring phonological skills and letter knowledge to real word reading. Word recognition requires more than phonological processing. Explicit instruction on phonemic awareness and letter-sound conversion rules is not enough. Explicit instruction on applying the skills and knowledge to decode a novel word is also needed (Fielding-Barnsley, 1997).
Moreover, the effectiveness of phonological-based instruction may be constrained by the limited exposure to oral and written English in the EFL context. Semantic and syntactic information of words are not the focus of phonological-based instruction and are often gained from large exposure to print and oral language. Semantic and syntactic information of a word is also essential for word recognition. According to the lexical quality hypothesis, successful retrieval of words during reading and spelling depends on high-quality representation of the word (Perfetti and Hart, 2002). A high-quality word is represented with specified spelling (orthographic information), meaning (semantic information), as well as pronunciation at phonemic level. Lack of information in any of the three aspects will result in low-quality representation, thus leading to unsuccessful retrieval. Students are more proficient in reading words familiar to them despite the complicated orthographic and phonological structure (Ehri, 1987).
Another factor that might contribute to the difficulty of transferring phonological skills to word recognition is that phonological awareness and letter-sound knowledge might not be the most dominant skills underlying English word recognition for EFL students. Vocabulary and oral language proficiency are also significant predictors of word-level reading for young EFL students, as important as phonological awareness. In addition, the dominant method of literacy instruction students receive also influences the underpinning skills and knowledge of word recognition. Phonological awareness and letter-sound knowledge are more important to word reading in contexts where synthetic phonics is the major literacy teaching approach (McGeown and Medford, 2014). If the whole word approach is the dominant teaching method of reading in English and in students' native language, visual memory capacity might be more important than phonological decoding for word recognition (Bialystok et al., 2005; Wang et al., 2005).

Implications for Practice
Since most of the studies reviewed could be potentially biased, the positive effects reported should be interpreted with great caution. Thus, the implications provided here are only suggestive. Phonological-based instruction may be effective in improving phonological decoding abilities among EFL learners at primary school level. Therefore, allocating resources to this type of instruction may be beneficial.
However, transferring the instructional outcomes to improvement of word recognition is a challenge. Theories suggest that instruction on semantic and grammatical information and large language/print input is essential for improving the ability of word recognition. Synthetic and analytic phonics both provide methods that incorporate meaning in the instruction of decoding skills, such as using decodable text and word family analysis.
It should be noted that none of the studies took phonics/phonological awareness training as the only English program. The instruction in the studies reviewed is meant to be supplementary to daily English classes, and it is most effective when delivered regularly and discretely. The primary goal of early foreign language education should be language comprehension and communication (Canale and Swain, 1980; Krashen, 1985). Learning English phonological skills and alphabetic knowledge cannot replace whole language teaching.

Future Research
The number of studies investigating the effectiveness of phonological-based instruction among EFL students is very small compared to studies with ELLs in English-dominant societies and those with native English speakers. Studies on this topic with rigorous research designs are greatly needed. Quasi-experimental design is the dominant method, and studies that employ a randomized controlled trial design are scarce. In educational research, both class and student can be the unit of randomization. The match between the unit of randomization and the unit of analysis is important, because a mismatch could inflate the effect of the instruction considerably (Gersten et al., 2009). Future studies on this topic that assign classes of students should account for variance at both the class and the student level.
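One way to see why ignoring class-level clustering overstates the evidence is the standard design effect for equal-sized clusters, DEFF = 1 + (m − 1) × ICC, where m is the class size and ICC is the intraclass correlation. The sketch below is illustrative only: the function names are hypothetical, the ICC value is an assumed figure rather than an estimate from the reviewed studies, and the class size of 25 is borrowed from the median class size reported earlier.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical study: 500 students in classes of 25, assumed ICC of 0.10.
    deff = design_effect(cluster_size=25, icc=0.10)
    n_eff = effective_sample_size(n_total=500, cluster_size=25, icc=0.10)
    print(round(deff, 2), round(n_eff, 1))
```

With these assumed values, 500 students carry roughly the information of 147 independent observations, so a student-level analysis that ignores class membership would produce standard errors that are too small.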
Evaluating the instructional effectiveness of phonological-based instruction in the EFL context is complex. Besides factors that are generally considered to influence instructional effectiveness regardless of cultural context, such as school SES and the nature of the comparison, factors that are specific to the EFL context should also be considered and thoroughly reported in future studies. For example, literacy experience and reading abilities in students' native language are important for learning a foreign language, but were reported in none of the reviewed studies. Future studies that directly investigate the moderating effects of the above-mentioned factors are also needed.
Moreover, the assessment of English word reading of EFL students should be multifaceted rather than relying solely on word pronunciation. Print-pronunciation association cannot guarantee lexical access, and the task also produces unreliable results because the scoring of pronunciation is influenced by many factors, such as scorers' background (Fletcher-Flinn et al., 2014). In the future, both lexical access and pronunciation of words should be measured, so that the findings can be more reliable and valid.
Limitations
Despite the importance of the findings in this review, this study has the following limitations. First, many of the selected studies are not of the best quality. Thus, causal inference regarding instructional effectiveness cannot be drawn. Second, the number of studies reviewed here is relatively small. One reason is that we included only studies published in English. More studies on this topic may have been published in other languages to make them more accessible to native speakers of those languages. Last, some factors important to instructional effectiveness are not included in this review. For example, the training and proficiency of teachers were not analyzed, which could influence the fidelity of instruction. In addition, the performance of students was assessed only at immediate posttests. The instructional effectiveness at delayed posttests was not examined.
Conclusion
Results of this systematic review showed a consistent pattern that phonological-based instruction has positive effects on phonological decoding and phonemic awareness. However, the effectiveness found should be interpreted with great caution, because causal inference with respect to instructional effectiveness can be drawn from only three studies (Lai et al., 2009;Dixon et al., 2011;. The rest of the studies are potentially biased for reasons including failure to exclude confounding factors, mismatch between unit of assignment and unit of analysis, contamination and variation of the comparison conditions, inconsistency of outcome measures, and absence of required descriptive information. Furthermore, none of the three studies that qualified for making causal inference assessed reading with full validity; semantic access in word recognition was not measured, although they showed positive effects on word pronunciation. Therefore, this study provides limited evidence for the effectiveness of phonological-based instruction on reading in English among young EFL children.
Author Contributions
SH and SW shared the responsibilities of searching studies, developing the coding scheme, and coding the studies. SH wrote the manuscript.

Funding
Research funded by China Scholarship Council (201306040147).