ORIGINAL RESEARCH article

Front. Psychol., 29 July 2020

Sec. Quantitative Psychology and Measurement

Volume 11 - 2020 | https://doi.org/10.3389/fpsyg.2020.01725

Rasch Analysis of the Japanese Version of Teacher Efficacy for Inclusive Practices Scale: Scale Unidimensionality

  • 1. Department of Special Education, College of Education, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia

  • 2. Department of Education, Faculty of Education and Psychology, University of Jyväskylä, Jyväskylä, Finland


Abstract

This study aimed to examine the construct validity of the Japanese version of the Teacher Efficacy for Inclusive Practices (TEIP) scale. The sample consisted of 250 teachers in Japan. Rasch analysis was used to examine the psychometric properties of the scale. The results did not support the 18-item Japanese version of the TEIP scale as a unidimensional measure, but they did support the final 14-item version as unidimensional. Four items (items 12, 8, 5, and 3) were removed from the original 18-item scale for violating the local independence assumption. No item with differential item functioning (DIF) was detected, and only one item (item 18) was rescored to resolve a threshold disorder. Further studies with different samples are warranted to confirm these findings.

Introduction

The past 30 years have seen increasingly rapid advances toward inclusive education in educational policies and systems, reinforced by international policy documents (e.g., UNESCO, 1994; United Nations, 2006; United Nations General Assembly, 2015). Inclusive education can broadly be defined as including all children in mainstream classrooms regardless of their gender, their being from ethnic or linguistic minorities, or their having disabilities (UNESCO, 2009). Many countries have adopted new educational strategies toward inclusive education, and Japan is no exception. A drastic change occurred in 2007, when the traditional special education (tokushukyoiku) system was replaced by the current special needs education (tokubetsushienkyoiku) system, in which children with special educational needs can officially receive appropriate support even in regular classes (Central Council for Education, 2005). Because of this rapid policy change toward inclusive education, Japanese teachers’ readiness to implement it has been questioned by several researchers. For instance, surveys of Japanese regular classroom teachers by Ueno and Nakamura (2011) showed that the teachers expressed anxiety about, and difficulty with, including children with disabilities under the current support system, even though they agreed with the concept of inclusion. Similarly, Fujii (2014) studied teachers’ awareness of keywords related to inclusive education (e.g., “reasonable accommodation” and “Convention on the Rights of Persons with Disabilities”) and suggested that teachers’ knowledge of inclusive education is relatively low and that they therefore need more in-service training.

Numerous studies on inclusive education have documented the global trend toward inclusion. Investigating inclusive education from the teachers’ perspective has been one of the issues of greatest interest, because teachers play an important role in implementing inclusive education (Avramidis and Norwich, 2002; de Boer et al., 2011). Research has consistently shown that teachers’ self-efficacy for inclusive practices is one of the crucial factors associated with teachers’ attitudes toward inclusive education and their intention to include children with disabilities in mainstream classrooms (Savolainen et al., 2012; Sharma et al., 2018; Yada et al., 2018). The concept of self-efficacy was introduced by Bandura (1977), who defined it as a person’s belief in his/her capability to plan and execute specific performances that produce expected outcomes (Bandura, 1997). Teachers’ self-efficacy is specific to the teaching profession, and it has been found to be related not only to student achievement (Tschannen-Moran et al., 1998) but also to teacher outcomes such as willingness to apply new teaching strategies (Allinder, 1994) and job-related stress and burnout (Betoret, 2006). Previous research has demonstrated that teachers’ self-efficacy beliefs are context-specific and goal-oriented (Wyatt, 2014), and they have therefore been studied in several teaching domains (e.g., math, language, and physical education).

Teachers’ self-efficacy has been investigated in relation to many teaching contexts; however, there was no instrument specifically assessing teachers’ efficacy beliefs in implementing inclusive education (Sharma et al., 2012). Therefore, Sharma et al. (2012) developed the Teacher Efficacy for Inclusive Practices (TEIP) scale. The scale consists of 18 items, which can be divided into three sub-scales: (1) efficacy to use inclusive instruction, (2) efficacy in managing behavior, and (3) efficacy in collaboration (Sharma et al., 2012). The scale has been translated and used in many countries, such as China, Finland, Japan, Saudi Arabia, and South Africa, and the reliability and validity of the translated versions have been demonstrated (Savolainen et al., 2012; Malinen et al., 2013; Yada et al., 2018; Alnahdi, 2019b). Although there are a number of international studies focusing on teachers’ self-efficacy in inclusive practices, data about Japanese teachers’ self-efficacy are limited (Yada and Savolainen, 2017). There is therefore a pressing need for a valid and reliable instrument that can measure Japanese teachers’ self-efficacy in implementing inclusive education.

Although high reliability (Cronbach’s alpha = 0.93) and good construct validity (a good fit to a theoretical model in confirmatory factor analysis) have been reported for the Japanese version of the TEIP scale (Yada and Savolainen, 2017; Yada et al., 2018), these analyses were conducted within classical test theory (CTT). In CTT, the focus is on the observed total score rather than on each item, as it is in item response theory approaches such as Rasch analysis (De Ayala, 2013). Rasch analysis has the benefit of providing separate statistical parameters for each item, which allows one to understand the difficulty of the specific perceived ability each item taps. In the context of inclusive education, understanding item difficulty would help to target specific teaching skills for improvement (Lai et al., 2016). The item location in Rasch analysis represents how difficult to endorse participants found the respective item, which would allow future teacher training programs to focus on the most difficult aspects rather than assessing teachers’ perceived self-efficacy only as a whole. In addition, Rasch analysis makes it possible to fit the observed data to a unidimensional model, to examine the scoring structure in use and find the best scoring structure for the observed data, to identify misfitting persons and items for removal, to examine measurement invariance, and to transform ordinal data into interval data (Tennant and Conaghan, 2007a; Bond and Fox, 2015). It also offers a person-item (Wright) map for easy visual comparison of item difficulty and participants’ ability, which allows the scale’s targeting to be checked at a glance (Cheng et al., 2009; Lee et al., 2010; Cappelleri et al., 2014).
In sum, “Rasch analysis is a powerful tool for evaluating construct validity” (Baghaei, 2008b, p. 1146) that allows us to apply a unified approach to examine several measurement properties (Tennant and Conaghan, 2007a). Other reasons and advantages of using Rasch models have been documented in numerous studies (see, e.g., Andrich, 1995; Fisher, 1995, 1996; Linacre, 1996; Smith, 2000; Andrich and Marais, 2019).
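The logic of item locations described above can be illustrated with the dichotomous Rasch model. The sketch below is purely illustrative (the present study uses the polytomous partial credit model described in the Methods): the probability of endorsing an item depends only on the difference between person ability and item difficulty, both expressed in logits.

```python
import math

def rasch_prob(ability, difficulty):
    """Probability of endorsing a dichotomous item under the Rasch model:
    P(X = 1) = exp(theta - delta) / (1 + exp(theta - delta)),
    where theta is person ability and delta is item location (logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# A person whose ability equals the item's location has a 50% chance of
# endorsing it; items with higher locations are endorsed less often.
print(rasch_prob(0.0, 0.0))    # -> 0.5
print(rasch_prob(0.0, 1.673))  # a hard-to-endorse item (location 1.673)
print(rasch_prob(0.0, -1.271)) # an easy-to-endorse item (location -1.271)
```

This is why item locations can be read directly as “difficulty to endorse”: for a fixed person ability, endorsement probability falls monotonically as the location rises.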

Therefore, this study aims to examine the construct validity of the Japanese version of the TEIP scale using Rasch analysis. Since the total score of the TEIP scale is often used (e.g., Ahsan et al., 2012; Miesera et al., 2019), it is important to check the scale’s unidimensionality, as “the use of the total score as a summary of a person’s value on the variable implies that persons can be compared by their total scores, or the estimates of their true scores, and this implies a unidimensional construct” (Andrich and Marais, 2019, p. 50). According to Park et al. (2016) and Lai et al. (2016), the TEIP is a unidimensional scale, and Rasch model analysis would help to examine the unidimensionality (Wright and Stone, 1999) of the Japanese version of the TEIP.

Materials and Methods

Sample and Instrument

The sample consisted of 250 Japanese teachers who were working in primary or secondary level (grade 1–12) schools. The data were collected using a convenience sampling technique, in which schools and teachers who agreed to engage in the study were included in the sample. The sample was drawn from the Tokyo metropolis and eight prefectures, including Chiba, Fukui, Kagoshima, Kochi, Miyazaki, Saitama, and Yamaguchi. A sample of 250 is sufficient for definitive item calibration with 99% confidence (Linacre, 1994). The sample included 45% males and 55% females. As regards age, 36% were between 22 and 35 years old, 35% were between 35 and 50 years old, and around 29% were older than 60 years. The gender and age distributions were close to those of the overall Japanese teacher population (MEXT (Ministry of Education, Culture, Sports, Science and Technology), 2017).

The Japanese version of the TEIP scale was utilized to assess teachers’ self-efficacy beliefs in implementing inclusive education. The TEIP scale consists of 18 items, with 6 items allocated to each sub-scale. The participants were asked to evaluate to what extent they agree/disagree with the statements on a six-point Likert scale ranging from “1 = strongly disagree” to “6 = strongly agree.” The Japanese translation and the three-factor model were confirmed in previous studies (Yada and Savolainen, 2017; Yada et al., 2018): a good level of reliability was indicated (Cronbach’s alpha = 0.93), and confirmatory factor analysis showed a good fit to the hypothesized three-factor model.

Rasch Analysis

In this study, the Rasch analysis steps were based on Tennant and Conaghan’s recommendations on conducting and reporting the results of a Rasch analysis study (Tennant and Conaghan, 2007a). The Rasch Unidimensional Measurement Model (RUMM2030) software (Andrich et al., 2010) was used for the analysis. The default model in RUMM2030, the partial credit model (Masters, 1982), was used, as it is recommended when the likelihood ratio test is significant (Tennant et al., 2011; Vincent et al., 2015). In this model, the thresholds are estimated separately for each item (Andrich and Marais, 2019), whereas in the rating scale model the thresholds are constrained to be equal across all items (Andrich, 1978).
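The distinction between the two models can be made concrete with a small sketch of how category probabilities arise in the partial credit model. This is an illustrative implementation, not RUMM2030’s estimation routine, and the thresholds shown are hypothetical:

```python
import math

def pcm_probs(theta, thresholds):
    """Category probabilities for one polytomous item under the partial
    credit model. The thresholds tau_1..tau_m are item-specific; in the
    rating scale model the same threshold set would be shared by all items."""
    # Cumulative sums of (theta - tau_k) give the unnormalized log-weights
    # for categories 0..m.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - tau))
    weights = [math.exp(l) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical ordered thresholds for a six-category (0-5) item:
probs = pcm_probs(0.0, [-2.0, -1.0, 0.0, 1.0, 2.0])
print(probs)       # six probabilities, one per response category
print(sum(probs))  # -> 1.0 (up to rounding)
```

With ordered thresholds, each response category is the most probable somewhere along the ability continuum; disordered thresholds (as found for item 18 in the Results) break this property, which is why rescoring is applied.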

In this analysis, we looked for a non-significant item-trait interaction chi-square as an indicator of overall fit. Another indicator is normally distributed residuals, with an item residual mean close to zero and a standard deviation close to 1 (Alnahdi, 2018). The threshold map and item characteristic curves (ICCs) were checked for items with disordered thresholds, and any such item was rescored by combining adjacent categories to resolve the disorder (Tennant et al., 2004; Tennant and Conaghan, 2007a). Items with an item-fit residual outside the range ±2.5 and a significant p value were considered misfitting and were removed (Tennant and Conaghan, 2007a); likewise, persons with a person-fit residual outside ±2.5 were considered misfitting and were removed from the sample (Tennant and Conaghan, 2007a). Local dependency was checked by looking for high correlations between item residuals (Andrich and Marais, 2019). Because in a unidimensional scale the item residuals should not be highly correlated once the latent variable (self-efficacy) has been extracted, any item with a residual correlation 0.20 or more above the average was considered to indicate a violation of the local independence assumption (Hissbach et al., 2011; Makransky and Bilenberg, 2014; Christensen et al., 2017; Lundgren-Nilsson et al., 2019). This step is important because the traditional fit statistics used in Rasch analysis can be insensitive to violations of unidimensionality under certain circumstances (Hattie, 1985; Smith, 2002; Hagquist et al., 2009).

A principal components analysis of the residuals was conducted to examine the scale’s unidimensionality (Smith, 2002; Tennant and Conaghan, 2007a; Andrich and Marais, 2019). This analysis yielded two sets of items: those loading positively and those loading negatively on the first residual component. Two ability estimates were then calibrated, one from each set of items, and t-tests were conducted for each person to examine whether the two estimates differed significantly. The scale was considered unidimensional if 5% or fewer of the t-tests were significant, or if the lower bound of the binomial 95% confidence interval for the proportion of significant tests was at or below 5% (Smith, 2002; Tennant and Conaghan, 2007a; Alnahdi, 2018).
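The per-person t-test step of this procedure can be sketched as below. The estimates and standard errors are hypothetical inputs (in practice they come from the Rasch calibrations of the two item subsets), and the 1.96 cutoff assumes a two-sided 5% level:

```python
import math

def percent_significant(est_a, se_a, est_b, se_b):
    """Percentage of persons for whom the two ability estimates (from the
    positively and negatively loading item subsets) differ significantly.
    A scale passes the unidimensionality check when this percentage, or the
    lower bound of its binomial 95% CI, is at or below ~5%."""
    sig = 0
    for a, sa, b, sb in zip(est_a, se_a, est_b, se_b):
        t = (a - b) / math.sqrt(sa ** 2 + sb ** 2)
        if abs(t) > 1.96:
            sig += 1
    return 100.0 * sig / len(est_a)

# Two hypothetical persons: the two estimates agree for the first
# person and diverge for the second.
print(percent_significant([0.5, 2.0], [0.3, 0.3],
                          [0.4, 0.2], [0.3, 0.3]))  # -> 50.0
```

In the study, this percentage (and the lower CI bound reported in Table 1) is what dropped below the 5% criterion only after the four locally dependent items were removed.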

We examined differential item functioning (DIF) to ensure that the items function similarly for both genders and across participants’ ages (Tennant and Pallant, 2007b). We examined internal consistency, taking a person separation index (PSI) above 0.7 as a good indicator (Tennant and Conaghan, 2007a). Finally, we transformed the raw scores to interval scores using the formula: “Y = M + (S × logit score). S = range of interval-level scale [(60; for a 0–60 scale)] divided by the actual range of logit scores, and M = (minimum score of interval-level scale) – (minimum logit score × S)” (Alnahdi, 2018, p. 355). This step makes the scores much easier to interpret, because a change of one unit has the same weight across the whole scale (Alnahdi, 2018).
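The quoted transformation is a simple linear map and can be sketched directly. The logit range used below is hypothetical; the study’s actual calibration produced the raw-to-interval conversion reported in Table 4:

```python
def logit_to_interval(logit, logit_min, logit_max, scale_min=0.0, scale_max=60.0):
    """Linear transformation of Rasch logit scores to an interval-level
    scale: Y = M + S * logit, with S = (scale range) / (logit range) and
    M = scale_min - logit_min * S (Alnahdi, 2018)."""
    s = (scale_max - scale_min) / (logit_max - logit_min)
    m = scale_min - logit_min * s
    return m + s * logit

# With a hypothetical logit range of -5 to +5 mapped onto a 0-60 scale:
print(logit_to_interval(-5.0, -5.0, 5.0))  # -> 0.0  (scale minimum)
print(logit_to_interval(5.0, -5.0, 5.0))   # -> 60.0 (scale maximum)
print(logit_to_interval(0.0, -5.0, 5.0))   # -> 30.0 (midpoint)
```

Because the map is linear in the logit metric, a one-unit difference on the transformed scale means the same amount everywhere, which is exactly the interpretive advantage claimed for the interval scores.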

Results

In the first analysis, we examined the fit of the 18-item scale to the Rasch model. Table 1 shows that the chi-square for the item-trait interaction was non-significant [χ2(72) = 78.05, p > 0.05], which was a good indicator. However, the results did not support the scale’s unidimensionality, as 14% of the t-tests were significant, compared with the recommended limit of 5% (see Table 1).

TABLE 1

| M | Analysis | N | Item residual Mean (SD) | Person residual Mean (SD) | χ2 (df) | p | PSI | **% significant tests** | **Lower limit of 95% CIa** |
|---|---|---|---|---|---|---|---|---|---|
| A | Initial analysis | 250 | 0.25 (1.43) | −0.32 (1.44) | 78.05 (72) | 0.292 | 0.933 | 14% | 9.94% |
| | After removing 18 misfit persons | 232 | 0.39 (1.75) | −0.22 (1.27) | 97.63 (72) | 0.023 | 0.937 | 13.8% | 9.63% |
| | After rescoring one item (18) | 232 | 0.39 (1.75) | −0.22 (1.27) | 97.68 (72) | 0.023 | 0.936 | 13.8% | 9.63% |
| | 4 super items* | 232 | 0.45 (1.39) | −0.19 (1.20) | 88.08 (56) | 0.003 | 0.928 | 12.9% | 8.9% |
| | After removing item 12 | 232 | 0.37 (1.82) | −0.22 (1.25) | 93.07 (68) | 0.023 | 0.933 | 13.8% | 9.63% |
| | After removing item 8 | 232 | 0.37 (1.55) | −0.21 (1.21) | 92.15 (64) | 0.012 | 0.927 | 13.8% | 9.63% |
| | After removing item 5 | 232 | 0.30 (1.31) | −0.22 (1.19) | 96.34 (60) | 0.002 | 0.921 | 9.91% | 6.38% |
| | After removing item 3 | 232 | 0.29 (1.29) | −0.22 (1.18) | 67.51 (56) | 0.139 | 0.919 | 7.76% | 4.46% |
| B | EMB subscale | 232 | −0.12 (1.88) | −0.40 (1.12) | 47.78 (24) | 0.002 | 0.870 | 11.6% | 7.8% |
| | EII subscale | 232 | 0.09 (0.62) | −0.35 (1.06) | 17.84 (24) | 0.810 | 0.855 | 9.5% | 6% |
| | EC subscale | 232 | 0.24 (1.21) | −0.27 (0.98) | 22.78 (24) | 0.532 | 0.830 | 8.2% | 5% |

Rasch statistics at each run.

Bold = indicator of unidimensionality; a = lower limit of the 95% binomial proportion confidence interval; M = model; A = unidimensional; B = multi-dimensional; EMB = efficacy in managing behavior; EII = efficacy in inclusive instruction; EC = efficacy in collaboration; * = the four locally dependent item pairs (item 8 with item 7, item 12 with item 13, item 5 with item 6, and item 3 with item 9) were combined into four super items.

Next, in the second run, 18 misfitting persons with residuals outside the ±2.5 range were removed. As shown in Table 1, this modification did not solve the unidimensionality issue: the percentage of significant t-tests decreased only to 13.8%, still far above the recommended 5%. We then examined the threshold map for items with disordered thresholds. Only item 18 showed a threshold disorder (see Figure 1).

FIGURE 1

In the next run, item 18 was rescored from 012345 to 001234, which restored the threshold order (see Figure 2). However, there was no improvement with regard to the unidimensionality test. Next, we examined the local dependency of items by reviewing the correlations between item residuals, looking for correlations 0.20 or more above the average. Four items were found to have that level of correlation with other items.

FIGURE 2
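The rescoring of item 18 amounts to collapsing its two lowest response categories. A minimal sketch of the recode applied in the study:

```python
def rescore_item18(response):
    """Collapse the two lowest categories of item 18 to repair its
    disordered thresholds: original codes 0,1,2,3,4,5 become 0,0,1,2,3,4."""
    recode = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
    return recode[response]

print([rescore_item18(r) for r in range(6)])  # -> [0, 0, 1, 2, 3, 4]
```

After this recode item 18 has five ordered categories (maximum score 4) instead of six, which is why the raw-score maximum of the final 14-item scale is 69 rather than 70.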

Next, to overcome the local dependency, the four items whose residuals correlated highly with other item residuals were combined with their partners into four super items (Marais and Andrich, 2008) (item 8 with item 7, item 12 with item 13, item 5 with item 6, and item 3 with item 9) to check whether this improved the scale’s unidimensionality. This is a recommended solution, for “when a set of items are locally dependent they can be bundled into polytomous super-items” (Baghaei, 2008a, p. 1105), and it has been applied in several studies (e.g., Marais and Andrich, 2008; Brogårdh et al., 2013; Peter et al., 2014; Medvedev et al., 2016; Milinis et al., 2017; Finger et al., 2019). This step somewhat improved the percentage of significant t-tests in the unidimensionality test, to 12.9%, though this was still far from the recommended 5%.
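Forming a super item is simply a matter of summing the category scores of the bundled items person by person, so that the pair is treated as one polytomous item. A minimal sketch, with hypothetical responses:

```python
def make_super_item(responses_a, responses_b):
    """Bundle two locally dependent items into one polytomous super item
    by summing their category scores person by person (Marais and
    Andrich, 2008). Two 0-5 items yield one 0-10 super item."""
    return [a + b for a, b in zip(responses_a, responses_b)]

# e.g., items 12 and 13 for three hypothetical persons:
item12 = [5, 3, 0]
item13 = [4, 3, 1]
print(make_super_item(item12, item13))  # -> [9, 6, 1]
```

Because the dependent pair now contributes a single set of thresholds, the shared residual variance no longer inflates fit and reliability statistics, which is the rationale for the super-item run reported in Table 1.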

In the next four runs, we removed these items one by one and examined the improvement in the unidimensionality test. In the last run, after all four items had been removed, unidimensionality was supported: the lower limit of the 95% CI for the binomial test was less than 5% (4.46%). The removed items came from all three subscales: item 8 from efficacy in managing behavior, item 5 from efficacy in inclusive instruction, and items 3 and 12 from efficacy in collaboration. Since the scale showed unidimensionality and had no items with highly correlated residuals, the assumption of local independence of items was fulfilled (Tennant and Conaghan, 2007a) in the final 14-item Japanese TEIP scale.

In addition, as an alternative to removing the four items, three extra runs were conducted separately for each subscale to see whether the scale would show better fit as a multidimensional scale with three unidimensional subscales. The results showed that only the efficacy in collaboration subscale fit the Rasch model and the unidimensionality test supported a unidimensional subscale. However, the results did not support the unidimensionality of the other two subscales. Therefore, we continued with the 14 items as a unidimensional scale in further analysis.

After ensuring the unidimensionality of the 14-item scale, a DIF analysis was conducted to ensure that the 14 items function similarly regardless of the participants’ age or gender. No item showed indicators of DIF. Figure 3 shows item 9 as an example of an item that functioned similarly regardless of participants’ age or gender. Table 2 shows the item parameters, with items sorted by location from the most to the least difficult to endorse.

FIGURE 3

TABLE 2

| Item | Location | SE | Fit residual | χ2 | pa |
|---|---|---|---|---|---|
| 7 | 1.673 | 0.089 | 0.653 | 1.106 | 0.893 |
| 16 | 1.097 | 0.082 | 3.212 | 2.216 | 0.696 |
| 17 | 0.845 | 0.087 | −0.769 | 3.616 | 0.460 |
| 10 | 0.659 | 0.093 | −1.547 | 12.1 | 0.016 |
| 2 | 0.200 | 0.092 | −0.501 | 2.27 | 0.686 |
| 9 | 0.191 | 0.092 | 0.837 | 7.135 | 0.128 |
| 4 | 0.033 | 0.093 | −0.775 | 1.752 | 0.781 |
| 15 | −0.036 | 0.088 | 0.061 | 4.15 | 0.386 |
| 18 | −0.036 | 0.104 | −0.829 | 5.519 | 0.238 |
| 1 | −0.662 | 0.086 | 2.569 | 9.061 | 0.059 |
| 11 | −0.757 | 0.104 | 0.007 | 4.385 | 0.356 |
| 13 | −0.827 | 0.089 | 0.396 | 8.146 | 0.086 |
| 6 | −1.109 | 0.097 | 0.445 | 1.577 | 0.812 |
| 14 | −1.271 | 0.095 | 0.352 | 4.484 | 0.344 |

Item fit statistics sorted based on location, from the most to the least difficult item to endorse.

SE, standard error. a Bonferroni adjusted p = 0.00357 (0.05/14).

Table 3 shows the scale with the new scoring applied. One item (18) was rescored as 001234, while the scoring of the remaining 13 items was unchanged at 012345. Table 4 shows the transformation of raw scores to interval scores, which makes differences in scores easier to interpret because a change of one unit has the same weight across the scale (Alnahdi, 2018).

TABLE 3

| Item | Statement | Scoring (strongly disagree → strongly agree) |
|---|---|---|
| 1 | I can make my expectations clear about student behavior | 0 1 2 3 4 5 |
| 2 | I am able to calm a student who is disruptive or noisy | 0 1 2 3 4 5 |
| 4 | I can assist families in helping their children do well in school | 0 1 2 3 4 5 |
| 6 | I can provide appropriate challenges for very capable students | 0 1 2 3 4 5 |
| 7 | I am confident in my ability to prevent disruptive behavior in the classroom before it occurs | 0 1 2 3 4 5 |
| 9 | I am confident in my ability to get parents involved in the school activities of their children with disabilities | 0 1 2 3 4 5 |
| 10 | I am confident in designing learning tasks so that the individual needs of students with disabilities are accommodated | 0 1 2 3 4 5 |
| 11 | I am able to get children to follow classroom rules | 0 1 2 3 4 5 |
| 13 | I am able to work jointly with other professionals and staff (e.g., aides, other teachers) to teach students with disabilities in the classroom | 0 1 2 3 4 5 |
| 14 | I am confident in my ability to get students to work together in pairs or in small groups | 0 1 2 3 4 5 |
| 15 | I can use a variety of assessment strategies (e.g., portfolio assessment, modified tests, performance-based assessment) | 0 1 2 3 4 5 |
| 16 | I am confident in informing others who know little about laws and policies related to the inclusion of students with disabilities | 0 1 2 3 4 5 |
| 17 | I am confident when dealing with students who are physically aggressive | 0 1 2 3 4 5 |
| 18 | I am able to provide an alternative explanation or example when students are confused | **0 0 1 2 3 4** |

Removed items (below):

| 3 | I can make parents feel comfortable coming to school |
|---|---|
| 5 | I can accurately gauge student comprehension of what I have taught |
| 8 | I can control disruptive behavior in the classroom |
| 12 | I can collaborate with other professionals (e.g., itinerant teachers or speech pathologists) in designing educational plans for students with disabilities |

The final Rasch-validated scale with 14 items.

Boldface = rescored items.

TABLE 4

| Raw score | Interval-level score | Raw score | Interval-level score | Raw score | Interval-level score | Raw score | Interval-level score |
|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 18 | 17.1 | 36 | 24.6 | 54 | 33.7 |
| 1 | 3.4 | 19 | 17.5 | 37 | 25.0 | 55 | 34.3 |
| 2 | 5.7 | 20 | 17.9 | 38 | 25.5 | 56 | 34.8 |
| 3 | 7.4 | 21 | 18.3 | 39 | 26.0 | 57 | 35.5 |
| 4 | 8.6 | 22 | 18.7 | 40 | 26.4 | 58 | 36.1 |
| 5 | 9.7 | 23 | 19.1 | 41 | 26.9 | 59 | 36.8 |
| 6 | 10.5 | 24 | 19.5 | 42 | 27.4 | 60 | 37.5 |
| 7 | 11.3 | 25 | 19.9 | 43 | 27.9 | 61 | 38.2 |
| 8 | 12.0 | 26 | 20.3 | 44 | 28.4 | 62 | 39.1 |
| 9 | 12.6 | 27 | 20.7 | 45 | 28.9 | 63 | 40.0 |
| 10 | 13.2 | 28 | 21.1 | 46 | 29.4 | 64 | 41.1 |
| 11 | 13.8 | 29 | 21.5 | 47 | 29.9 | 65 | 42.3 |
| 12 | 14.3 | 30 | 22.0 | 48 | 30.5 | 66 | 43.9 |
| 13 | 14.8 | 31 | 22.4 | 49 | 31.0 | 67 | 46.0 |
| 14 | 15.3 | 32 | 22.8 | 50 | 31.5 | 68 | 50.5 |
| 15 | 15.7 | 33 | 23.2 | 51 | 32.0 | 69 | 60.0 |
| 16 | 16.2 | 34 | 23.7 | 52 | 32.6 | | |
| 17 | 16.6 | 35 | 24.1 | 53 | 33.1 | | |

Transformation table for the conversion of the 14-item Japanese TEIP Scale total raw ordinal-level score to interval-level score.

In addition, we examined the internal consistency of the 14-item scale; the person separation index (PSI) was 0.91, indicating a high level of internal consistency given that 0.70 is considered adequate (Tennant and Conaghan, 2007a). Figure 4 shows the person-item threshold plot, in which the item thresholds spread well across the person distribution, a good indicator of targeting.

FIGURE 4

Discussion

This study aimed to examine the construct validity of the Japanese version of the TEIP scale using Rasch analysis. The Rasch analysis of the 18-item scale did not support a unidimensional scale. This result is not consistent with the conclusion of Park et al. (2016), based on confirmatory factor analysis, that the 18-item TEIP is a unidimensional scale. However, Linacre argued that “there is no way of knowing from factor analysis alone whether each factor is a dimension or a slice of a shared dimension” (Linacre, 1998, p. 603). Our result is consistent with the findings of a Rasch study of the Arabic version of the scale (Alnahdi, 2019a), which also did not support the unidimensionality of the 18-item TEIP. This underlines the importance of combined studies with data from different countries to examine the cross-cultural psychometric properties of the scale.

Modifications were made to improve fit to the Rasch model: removing misfitting persons, rescoring one item (18) to resolve a threshold disorder, and removing four items for violating the local independence assumption. After these changes, the 14-item scale fit the Rasch model and was supported as a unidimensional scale.

The four removed items were items 3, 5, 8, and 12. Looking at these items and their highly correlated residuals helps to understand the source of the local dependency. The residual of item 12, “I can collaborate with other professionals (e.g., itinerant teachers or speech pathologists) in designing educational plans for students with disabilities,” was correlated with the residual of item 13, “I am able to work jointly with other professionals and staff (e.g., aides, other teachers) to teach students with disabilities in the classroom.” These two items repeat the same collaborative idea, which may cause local dependency, as redundant items can increase the dependence in the data (Marais and Andrich, 2008). A similar observation was made for item 8, “I can control disruptive behavior in the classroom,” whose residual correlated highly with that of item 7, “I am confident in my ability to prevent disruptive behavior in the classroom before it occurs,” and likewise for the residuals of items 5 and 6 and of items 3 and 9. It is important to deal with locally dependent items, because “dependency among items can inflate reliability and give a false impression of the precision and quality of the test” (Baghaei, 2008a, p. 1105). Hattie (1985) considered that “the most critical and fundamental assumption of the latent trait models is that of local independence” (p. 151), and Wang et al. (2005) likewise discussed the misspecification that can result from violating the local item independence assumption.

In addition, before removing the four items, we tested the super-item solution, which did not solve the local dependency problem underlying the unidimensionality of the 18-item TEIP scale. We therefore removed the four items and propose the 14-item scale as a unidimensional measure, since “violations of the unidimensionality requirement influence person measurement” (Smith, 2002, p. 205). In line with this, Clark and Watson discussed issues related to scale construct validity and stated that “in selecting scale items, the goal is unidimensionality rather than internal consistency” (Clark and Watson, 1995, p. 306). Two of the four items removed in this study, items 8 and 12, were also removed from the Arabic version of the scale to reach a unidimensional scale (Alnahdi, 2019a), for the same reason: violation of the local item independence assumption.

We found that item 7, “I am confident in my ability to prevent disruptive behavior in the classroom before it occurs,” was the most difficult item for participants to endorse, followed by items 16, 17, and 10. These results are consistent with those of Yoshitoshi (2014), who administered the TEIP scale to 59 high school teachers in Japan, found that Japanese teachers responded more negatively to items 7 and 17, and concluded that teachers lacked concrete strategies for working with children who show difficult behavior. Further, item 16 concerns knowledge about the laws and policies of inclusive education and was the second most difficult item for the Japanese teachers to endorse. This finding is somewhat in line with a previous study indicating that Japanese teachers showed less awareness of keywords related to inclusive education than of those related to special needs education (Fujii, 2014). These findings have important implications for developing in-service and pre-service teacher training in Japan, through which teachers can acquire the skills needed to deal with students’ problematic behavior as well as knowledge of inclusive educational laws and policies. The easiest item to endorse was item 14, followed by items 6, 13, and 11. Items 11 and 14 were among the consistently weakest indicators of self-efficacy both for this sample and for the Saudi sample studied earlier (Alnahdi, 2019a). Similarly, Chao et al. (2018) found item 11 (“I am able to get children to follow classroom rules”) easy to endorse in Hong Kong.

The final 14-item scale that fit the Rasch model differed from the 13-item scale that fit the model with a sample from Saudi Arabia (Alnahdi, 2019a). This shows that items are perceived somewhat differently in different populations for different reasons. Future studies should therefore combine data from different countries to examine the unidimensionality of the TEIP scale across populations and to examine whether all items function similarly in different countries. Combined data would also allow examination of the best scoring structure for the scale, which has been used with 5-, 6-, and 9-point Likert scales in different studies (Lai et al., 2016; Park et al., 2016; Yada et al., 2018; Alnahdi, 2019a).

Implications

As mentioned above, developing inclusive education is on the global educational agenda for realizing an inclusive society (United Nations General Assembly, 2015). Thus, there is an increasing need to examine continuously whether new inclusive educational policies and systems operate well in practice. One way to do so is to measure teachers’ self-efficacy for inclusive practices, which reveals teachers’ perspectives on inclusive education. Our findings confirmed that the 14-item Japanese version of the TEIP scale is valid for assessing Japanese teachers’ self-efficacy for inclusive practices and can be used not only by researchers but also by government and municipal administrators. In addition, this study provides researchers with a clear ordering of the items by their strength as indicators of teachers’ efficacy, which would be helpful when researchers need to use only a few items in their studies; in such cases, the items with the highest location values can be chosen. Understanding the order of the tasks (items) included in the TEIP by difficulty level, from the teachers’ point of view, is important for designing curricula and training programs for teachers that focus on the most difficult tasks. It will also be helpful for measuring improvements resulting from intervention studies: the score transformation table helps researchers easily interpret differences in scores and determine, for example, how large a change in score is after an intervention. The findings of this study support researchers in calculating a total score on the 14-item Japanese TEIP to represent a teacher’s efficacy for working in inclusive education. Finally, further research is required to confirm the findings of this study.

Statements

Data availability statement

The datasets generated for this study are available on request to AY.

Ethics statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.

Author contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Funding

This project was supported by the Deanship of Scientific Research at Prince Sattam Bin Abdulaziz University under Research Group # 2020/02/11930.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Ahsan, M. T., Sharma, U., and Deppeler, J. M. (2012). Exploring pre-service teachers’ perceived teaching-efficacy, attitudes and concerns about inclusive education in Bangladesh. Int. J. Whole Sch. 8, 1–20.

Allinder, R. M. (1994). The relationship between efficacy and the instructional practices of special education teachers and consultants. Teach. Educ. Spec. Educ. 17, 86–95. doi: 10.1177/088840649401700203

Alnahdi, A. H. (2018). Rasch validation of the Arabic version of the lower extremity functional scale. Disabil. Rehabil. 40, 353–359. doi: 10.1080/09638288.2016.1254285

Alnahdi, G. H. (2019a). Rasch validation of the Arabic version of the teacher efficacy for inclusive practices (TEIP) scale. Stud. Educ. Eval. 62, 104–110. doi: 10.1016/j.stueduc.2019.05.004

Alnahdi, G. H. (2019b). The Arabic version of the teacher efficacy for inclusive practices (TEIP-AR) scale: a construct validity study. Cogent Educ. 6:1618516.

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika 43, 561–573. doi: 10.1007/bf02293814

Andrich, D. (1995). Distinctive and incompatible properties of two common classes of IRT models for graded responses. Appl. Psychol. Meas. 19, 101–119. doi: 10.1177/014662169501900111

Andrich, D., and Marais, I. (2019). A Course in Rasch Measurement Theory: Measuring in the Educational, Social and Health Sciences. New York, NY: Springer.

Andrich, D., Sheridan, B., and Luo, G. (2010). RUMM2030. Perth: RUMM Laboratory.

Avramidis, E., and Norwich, B. (2002). Teachers’ attitudes towards integration/inclusion: a review of the literature. Eur. J. Spec. Needs Educ. 17, 129–147. doi: 10.1080/08856250210129056

Baghaei, P. (2008a). Local dependency and Rasch measures. Rasch Meas. Trans. 21, 1105–1106.

Baghaei, P. (2008b). The Rasch model as a construct validation tool. Rasch Meas. Trans. 22, 1145–1146.

Bandura, A. (1977). Self-efficacy: toward a unifying theory of behavioral change. Psychol. Rev. 84, 191–215. doi: 10.1037/0033-295x.84.2.191

Bandura, A. (1997). Self-Efficacy: The Exercise of Control. New York, NY: Freeman.

Betoret, F. D. (2006). Stressors, self-efficacy, coping resources, and burnout among secondary school teachers in Spain. Educ. Psychol. 26, 519–539. doi: 10.1080/01443410500342492

Bond, T., and Fox, C. M. (2015). Applying the Rasch Model: Fundamental Measurement in the Human Sciences. Abingdon: Routledge.

Brogårdh, C., Lexell, J., and Lundgren-Nilsson, Å. (2013). Construct validity of a new rating scale for self-reported impairments in persons with late effects of polio. PM&R 5, 176–181. doi: 10.1016/j.pmrj.2012.07.007

Cappelleri, J. C., Lundy, J. J., and Hays, R. D. (2014). Overview of classical test theory and item response theory for the quantitative assessment of items in developing patient-reported outcomes measures. Clin. Ther. 36, 648–662. doi: 10.1016/j.clinthera.2014.04.006

Central Council for Education (2005). Tokubetsushienkyoiku wo Suishin suru tame no Seido no Arikata ni Tsuite [On the System for Promoting Special Needs Education]. Available online at: https://www.mext.go.jp/b_menu/shingi/chukyo/chukyo0/toushin/05120801.htm (accessed February 4, 2020).

Chao, C. N. G., Lai, F. T. T., Ji, M., Lo, S. K., and Sin, K. F. (2018). Which inclusive teaching tasks represent the highest level of teacher efficacy in primary and secondary schools? Teach. Teach. Educ. 75, 164–173. doi: 10.1016/j.tate.2018.06.014

Cheng, Y. Y., Wang, W. C., and Ho, Y. H. (2009). Multidimensional Rasch analysis of a psychological test with multiple subtests: a statistical solution for the bandwidth–fidelity dilemma. Educ. Psychol. Meas. 69, 369–388. doi: 10.1177/0013164408323241

Christensen, K. B., Makransky, G., and Horton, M. (2017). Critical values for Yen’s Q3: identification of local dependence in the Rasch model using residual correlations. Appl. Psychol. Meas. 41, 178–194. doi: 10.1177/0146621616677520

Clark, L. A., and Watson, D. (1995). Constructing validity: basic issues in objective scale development. Psychol. Assess. 7, 309–319. doi: 10.1037/1040-3590.7.3.309

De Ayala, R. J. (2013). The Theory and Practice of Item Response Theory. New York, NY: Guilford Publications.

de Boer, A., Jan Pijl, S., and Minnaert, A. (2011). Regular primary schoolteachers’ attitudes towards inclusive education: a review of the literature. Int. J. Incl. Educ. 15, 331–353. doi: 10.1080/13603110903030089

Finger, M. E., Escorpizo, R., and Tennant, A. (2019). Measuring work-related functioning using the work rehabilitation questionnaire (WORQ). Int. J. Environ. Res. Public Health 16:2795. doi: 10.3390/ijerph16152795

Fisher, W. P. Jr. (1995). Opportunism, a first step to inevitability? Rasch Meas. Trans. 9:426.

Fisher, W. P. Jr. (1996). The Rasch alternative. Rasch Meas. Trans. 9, 466–467.

Fujii, N. (2014). Inkuru-shibukyoikushisutemukochiku no hokosei ni kansuru kento: Kyoshokuin ni taisuru ki-wa-do no ninchidochosa wo toshite [Study into the directions in building the inclusive education system: through teaching staff survey on keyword awareness]. Bull. Cent. Educ. Res. Pract. Akita Univ. 36, 89–98.

Hagquist, C., Bruce, M., and Gustavsson, J. P. (2009). Using the Rasch model in nursing research: an introduction and illustrative example. Int. J. Nurs. Stud. 46, 380–393. doi: 10.1016/j.ijnurstu.2008.10.007

Hattie, J. (1985). Methodology review: assessing unidimensionality of tests and items. Appl. Psychol. Meas. 9, 139–164. doi: 10.1177/014662168500900204

Hissbach, J. C., Klusmann, D., and Hampe, W. (2011). Dimensionality and predictive validity of the HAM-Nat, a test of natural sciences for medical school admission. BMC Med. Educ. 11:83.

Lai, F. T., Li, E. P., Ji, M., Wong, W. W., and Lo, S. K. (2016). What are the inclusive teaching tasks that require the highest self-efficacy? Teach. Teach. Educ. 59, 338–346. doi: 10.1016/j.tate.2016.07.006

Lee, M., Peterson, J. J., and Dixon, A. (2010). Rasch calibration of physical activity self-efficacy and social support scale for persons with intellectual disabilities. Res. Dev. Disabil. 31, 903–913. doi: 10.1016/j.ridd.2010.02.010

Linacre, J. M. (1994). Sample size and item calibration stability. Rasch Meas. Trans. 7:328.

Linacre, J. M. (1996). The Rasch model cannot be “disproved”! Rasch Meas. Trans. 10, 512–514.

Linacre, J. M. (1998). Rasch first or factor first. Rasch Meas. Trans. 11:603.

Lundgren-Nilsson, Å., Tennant, A., Jakobsson, S., Simrén, M., Taft, C., and Dencker, A. (2019). Validation of fatigue impact scale with various item sets: a Rasch analysis. Disabil. Rehabil. 41, 840–846. doi: 10.1080/09638288.2017.1411983

Makransky, G., and Bilenberg, N. (2014). Psychometric properties of the parent and teacher ADHD rating scale (ADHD-RS): measurement invariance across gender, age, and informant. Assessment 21, 694–705. doi: 10.1177/1073191114535242

Malinen, O.-P., Savolainen, H., Engelbrecht, P., Xu, J., Nel, M., Nel, N., et al. (2013). Exploring teacher self-efficacy for inclusive practices in three diverse countries. Teach. Teach. Educ. 33, 34–44. doi: 10.1016/j.tate.2013.02.004

Marais, I., and Andrich, D. (2008). Effects of varying magnitude and patterns of local dependence in the unidimensional Rasch model. J. Appl. Meas. 9, 1–20.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika 47, 149–174. doi: 10.1007/bf02296272

Medvedev, O. N., Siegert, R. J., Feng, X. J., Billington, D. R., Jang, J. Y., and Krägeloh, C. U. (2016). Measuring trait mindfulness: how to improve the precision of the mindful attention awareness scale using a Rasch model. Mindfulness 7, 384–395. doi: 10.1007/s12671-015-0454-z

MEXT (Ministry of Education, Culture, Sports, Science and Technology) (2017). Gakkokyointokeichosa: Heisei 28 Nendokekka no Gaiyo [Statistics in School Teachers: Outline of the Survey Results in 2016]. Available online at: https://warp.ndl.go.jp/info:ndljp/pid/11293659/www.mext.go.jp/b_menu/toukei/chousa01/kyouin/kekka/k_detail/1395309.htm (accessed March 12, 2018).

Miesera, S., DeVries, J. M., Jungjohann, J., and Gebhardt, M. (2019). Correlation between attitudes, concerns, self-efficacy and teaching intentions in inclusive education: evidence from German pre-service teachers using international scales. J. Res. Spec. Educ. Needs 19, 103–114. doi: 10.1111/1471-3802.12432

Milinis, K., Tennant, A., Young, C., and TONiC Study Group (2017). Rasch analysis of SF-Qualiveen in multiple sclerosis. Neurourol. Urodynam. 36, 1161–1166. doi: 10.1002/nau.23081

Park, M. H., Dimitrov, D. M., Das, A., and Gichuru, M. (2016). The teacher efficacy for inclusive practices (TEIP) scale: dimensionality and factor structure. J. Res. Spec. Educ. Needs 16, 2–12. doi: 10.1111/1471-3802.12047

Peter, C., Cieza, A., and Geyh, S. (2014). Rasch analysis of the general self-efficacy scale in spinal cord injury. J. Health Psychol. 19, 544–555. doi: 10.1177/1359105313475897

Savolainen, H., Engelbrecht, P., Nel, M., and Malinen, O.-P. (2012). Understanding teachers’ attitudes and self-efficacy in inclusive education: implications for pre-service and in-service teacher education. Eur. J. Spec. Needs Educ. 27, 51–68. doi: 10.1080/08856257.2011.613603

Sharma, U., Aiello, P., Pace, E. M., Round, P., and Subban, P. (2018). In-service teachers’ attitudes, concerns, efficacy and intentions to teach in inclusive classrooms: an international comparison of Australian and Italian teachers. Eur. J. Spec. Needs Educ. 33, 437–446. doi: 10.1080/08856257.2017.1361139

Sharma, U., Loreman, T., and Forlin, C. (2012). Measuring teacher efficacy to implement inclusive practices. J. Res. Spec. Educ. Needs 12, 12–21. doi: 10.1111/j.1471-3802.2011.01200.x

Smith, E. V. Jr. (2002). Understanding Rasch measurement: detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. J. Appl. Meas. 3, 205–231.

Smith, R. M. (2000). Fit analysis in latent trait measurement models. J. Appl. Meas. 1, 199–218.

Tennant, A., and Conaghan, P. G. (2007a). The Rasch measurement model in rheumatology: what is it and why use it? When should it be applied, and what should one look for in a Rasch paper? Arthritis Care Res. 57, 1358–1362. doi: 10.1002/art.23108

Tennant, A., Horton, M. C., and Pallant, J. F. (2011). Introductory Rasch Analysis: A Workbook. Leeds: Department of Rehabilitation Medicine, University of Leeds.

Tennant, A., and Pallant, J. F. (2007b). DIF matters: a practical approach to test if differential item functioning makes a difference. Rasch Meas. Trans. 20, 1082–1084.

Tennant, A., Penta, M., Tesio, L., Grimby, G., Thonnard, J. L., Slade, A., et al. (2004). Assessing and adjusting for cross-cultural validity of impairment and activity limitation scales through differential item functioning within the framework of the Rasch model: the PRO-ESOR project. Med. Care 42(1 Suppl.), I37–I48.

Tschannen-Moran, M., Woolfolk Hoy, A., and Hoy, W. K. (1998). Teacher efficacy: its meaning and measure. Rev. Educ. Res. 68, 202–248. doi: 10.3102/00346543068002202

Ueno, K., and Nakamura, K. (2011). Inkuru-jonkyoiku ni taisuru tsujogakkyukyoin no ishiki ni tsuite [A study of awareness of inclusion education among regular-class teachers in elementary schools]. J. Health Sports Sci. Juntendo Univ. 3, 112–117.

UNESCO (1994). The Salamanca Statement and Framework for Action on Special Needs Education. Available online at: http://www.unesco.org/education/pdf/SALAMA_E.PDF

UNESCO (2009). Policy Guidelines on Inclusion in Education. Paris: UNESCO.

United Nations (2006). Convention on the Rights of Persons with Disabilities. Available online at: http://www.un.org/disabilities/documents/convention/convoptprot-e.pdf (accessed October 29, 2017).

United Nations General Assembly (2015). Transforming Our World: The 2030 Agenda for Sustainable Development. Available online at: http://www.un.org/ga/search/view_doc.asp?symbol=A/70/L.1 (accessed September 18, 2015).

Vincent, J. I., MacDermid, J. C., King, G. J., and Grewal, R. (2015). Rasch analysis of the patient rated elbow evaluation questionnaire. Health Qual. Life Outcomes 13:84.

Wang, W. C., Cheng, Y. Y., and Wilson, M. (2005). Local item dependence for items across tests connected by common stimuli. Educ. Psychol. Meas. 65, 5–27. doi: 10.1177/0013164404268676

Wright, B. D., and Stone, M. (1999). Measurement Essentials. Karachi: Wide Range.

Wyatt, M. (2014). Towards a re-conceptualization of teachers’ self-efficacy beliefs: tackling enduring problems with the quantitative research and moving on. Int. J. Res. Method Educ. 37, 166–189. doi: 10.1080/1743727x.2012.742050

Yada, A., and Savolainen, H. (2017). Japanese in-service teachers’ attitudes towards inclusive education and self-efficacy for inclusive practices. Teach. Teach. Educ. 64, 222–229. doi: 10.1016/j.tate.2017.02.005

Yada, A., Tolvanen, A., and Savolainen, H. (2018). Teachers’ attitudes and self-efficacy on implementing inclusive education in Japan and Finland: a comparative study using multi-group structural equation modelling. Teach. Teach. Educ. 75, 343–355. doi: 10.1016/j.tate.2018.07.011

Yoshitoshi, M. (2014). Inkuru-shibukyoiku ni taisuru kotogakkokyoin no jikokoryokukan: Tokubetsushienkyoikuko-deine-ta- wo taisho toshita shitsumonshichosa no bunseki [A survey of Japanese high school teachers’ self-efficacy about inclusive education]. Bull. Cent. Teach. Educ. Dev. Okayama Univ. 4, 1–5.

Keywords

inclusive education, Rasch analysis, self-efficacy, Japan, TEIP

Citation

Alnahdi GH and Yada A (2020) Rasch Analysis of the Japanese Version of Teacher Efficacy for Inclusive Practices Scale: Scale Unidimensionality. Front. Psychol. 11:1725. doi: 10.3389/fpsyg.2020.01725

Received

21 April 2020

Accepted

23 June 2020

Published

29 July 2020

Volume

11 - 2020

Edited by

Andrea Bonanomi, Catholic University of the Sacred Heart, Italy

Reviewed by

Francisco T. T. Lai, The Chinese University of Hong Kong, China; Xin Luo, Uber, United States

Copyright

*Correspondence: Ghaleb H. Alnahdi

This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
