Psychometric properties of the reading self-efficacy scale in Peruvian students aged 10 to 16 years

Background: Reading self-efficacy is a key factor in students' academic performance and motivation. Given the low reading performance of Peruvian students compared to the OECD average, it is crucial to understand and improve their reading self-efficacy. Objective: To evaluate the psychometric properties of the Reading Self-Efficacy Scale in a Peruvian sample. Methodology: Using a sample of 560 students aged 10 to 16 (M = 13.5, SD = 1.93), confirmatory factor analysis (CFA) and gender invariance analysis were conducted. Results: A two-dimensional, second-order model was adopted. No significant differences were found in the gender invariance analysis, suggesting that the scale is comparable between the genders. Conclusion: The validation of the Reading Self-Efficacy Scale in the Peruvian context provides a useful tool to assess and develop Peruvian students' reading self-efficacy, with implications for professional practice and educational policy.


Introduction
In the field of educational research, reading self-efficacy emerges as a key construct, increasingly relevant due to its significant influence on students' academic performance and motivation. Self-efficacy is defined as an individual's belief in their own ability to execute tasks and achieve goals (Bandura et al., 1999). Specifically, in the context of reading, it focuses on the student's confidence in their ability to handle specific reading tasks (Shell et al., 1989). Recent research has demonstrated a positive relationship between reading self-efficacy and critical aspects such as reading performance and motivation to read (Prat-Sala and Redford, 2012; Chen et al., 2021; Hoesny et al., 2023; León-Gutiérrez et al., 2023). This relationship becomes even more crucial when examining educational contexts with particular challenges, such as Peru. According to the 2018 PISA report by the OECD, Peruvian students showed significantly lower reading performance than the OECD country average, a trend that persisted in the 2022 PISA assessment, highlighting issues not only in reading but also in science. This situation underscores deficiencies in basic education and reveals that factors like socioeconomic status and gender significantly influence educational outcomes, marking notable differences in performance in areas such as mathematics and reading (OECD, 2019, 2023). This landscape is further complicated by the findings of the Peruvian Ministry of Education. The performance gap between students with high and low reading self-efficacy is particularly wide in the country. The 2022 study, following the return to in-person education, showed a general decline in learning outcomes compared to 2019. Specifically, low performance was observed at primary levels, along with increasing disparities between public and private educational institutions, as well as between urban and rural areas, and a significant decrease in reading performance, especially in public schools (Ministerio de Educación [MINEDU], 2018, 2023). Therefore, reading self-efficacy not only directly affects students' ability to tackle reading tasks and persist in the face of difficulties but also plays a vital role in their capacity to regulate their learning (Klassen, 2010; Prat-Sala and Redford, 2010; Peura et al., 2019; Orellana et al., 2020; Cho et al., 2021).
The measurement of reading self-efficacy has been a crucial aspect of educational research, with various scales and tools developed and used in diverse cultural contexts. In Iran, a specific scale for self-efficacy in reading comprehension, consisting of 11 items, was developed and adapted to the linguistic and cultural needs of the Iranian context. This scale, focused on university students of English literature, highlights the relevance of self-efficacy in foreign language learning, demonstrating the adaptability of these tools to different languages and educational environments (Ghonsooly and Elahi, 2009). In Thailand, a questionnaire was developed to measure self-efficacy beliefs for reading among university students in English education, aged 18 to 20 years. This tool, which recorded an internal reliability of 0.90, underscores the importance of reading self-efficacy in higher education, specifically in reading English texts as an international language (EIL), in both Western and Asian styles (Kakaew and Damnet, 2017). In the United Kingdom, a reading self-efficacy questionnaire consisting of 20 items was developed for children aged 8 to 11 years. This questionnaire demonstrated excellent internal reliability with a Cronbach's alpha coefficient of 0.89, facilitating the assessment of reading self-efficacy from an early age, especially in primary education (Carroll and Fox, 2017). Also, in Finland, a questionnaire was designed to examine reading self-efficacy at three levels of specificity (general, intermediate, and specific) among primary school students from second to fifth grade. The scale included a total of 14 items (3 general, 3 intermediate, and 8 specific) and showed a solid factorial structure (Peura et al., 2019). In the case of Peru, a reading self-efficacy scale was developed as part of a broader study on Peruvian students' attitudes towards reading, writing, mathematics, and indigenous languages. A total of 45 items were developed for reading self-efficacy. The scale's reliability was acceptable, with Cronbach's alpha coefficients of 0.61 for sixth grade and 0.72 for fourth grade of secondary school, suggesting greater reliability at the secondary level (Cueto et al., 2003). However, there has been no report on the specificity of these items for measuring reading self-efficacy or on the factorial structure of the test.
Despite the diversity of research and the variety of scales developed to measure reading self-efficacy, there is a notable lack of instruments that coherently align with contemporary theoretical models of reading and that follow the guidelines established by Bandura for the design of self-efficacy scales (Bandura et al., 1999). Reading, a process that involves both the decoding of text and its interpretation in the context of prior and situational knowledge (Kintsch and Rawson, 2005), is intrinsically linked to motivation, including self-efficacy beliefs (Guthrie, 2000; Wang and Guthrie, 2004). However, many existing tools focus on particular aspects of reading, without comprehensively capturing the entirety of reading skills and strategies. For instance, it has been noted that existing scales might not adequately reflect the specificity and diversity of reading skills required in different contexts and for various types of text (Peura et al., 2019). Moreover, studies conducted in specific cultural contexts such as Iran (Ghonsooly and Elahi, 2009), Thailand (Kakaew and Damnet, 2017), and the United Kingdom (Carroll and Fox, 2017), while valuable within their respective areas, may not be directly applicable to broader contexts due to cultural and linguistic differences. These limitations underscore the need for developing more universal and holistic measurement instruments that can encompass the complex and multifaceted dimensions of reading self-efficacy, thus reflecting reading skills and strategies more completely and accurately.
In Spain, a Reading Self-Efficacy Scale was developed for students aged 10 to 16, comprising 15 items distributed across three factors: self-efficacy in the construction of the textual model, self-efficacy in decoding skills and fluency, and self-efficacy in constructing the situational model (Fidalgo et al., 2013). The Reading Self-Efficacy Scale includes items that reflect low-level processes or microprocesses, related to decoding skills and verbal fluency, and high-level processes or macroprocesses, related to text comprehension, following Bandura's (2006) guidelines. However, adapting and validating this scale in a different educational context may present challenges. Cultural differences can influence how students interpret and respond to scale items. Moreover, differences in educational systems and in the ways reading is taught may affect the relevance and applicability of the scale in different contexts.
To better understand the reading self-efficacy of Peruvian students, it is essential to have an instrument that measures it adequately. It is therefore crucial to verify whether the scale is appropriate and relevant for Peruvian students. Validating the 15-item Reading Self-Efficacy Scale in the Peruvian context can provide a useful tool for researchers and educators. Accordingly, the goal of this study is to analyze the psychometric properties of the Reading Self-Efficacy Scale among students aged 10 to 16 in Peru.

Design and participants
This study is of an instrumental type (Ato et al., 2013), employing a convenience sampling method. The minimum sample size was determined using an electronic calculator (Soper, 2023), which considered several factors: the number of observed and latent variables in the model, the anticipated effect size (λ = 0.20), the desired statistical significance (α = 0.05), and the level of statistical power (1 − β = 0.80). According to these parameters, the minimum sample required for the study was 296 participants. However, 560 students were recruited, with an almost equal distribution between males (50.5%) and females (49.5%), aged between 10 and 16 years (M = 13.5, SD = 1.93). The largest group of students was in the first year of secondary school (18.8% of the sample). Most participants came from Lima (68.21%), with the remainder (31.79%) from Ica (Table 1).

Instruments
Reading Self-efficacy. The Spanish version of the Reading Self-efficacy Scale (Fidalgo et al., 2013) was used, which consists of 15 items divided into three dimensions: Decoding Self-efficacy (items 6, 10, and 13), Textual Self-efficacy (items 1, 2, 4, 5, 7, 8, 11, 12, and 14), and Situation Model Self-efficacy (items 3, 9, and 15). The original instrument is scored on a scale of 0 to 100, with 0 indicating very sure of not being able to do it, 50 moderately sure of being able to do it, and 100 very sure of being able to do it. In this study, however, a 5-point Likert scale was adopted, ranging from 1 (very sure of not being able to do it) to 5 (very sure of being able to do it), as five points offer a balance between the ability to capture variation in responses and the ease of data interpretation (Preston and Colman, 2000).

Procedure
The study complied with the ethical standards of a Peruvian university (reference 2023-CEUPeU-023). Data collection began with a request for permission from the administrations of the educational institutions, followed by seeking informed consent from the parents through a Google form distributed by the teachers through the WhatsApp groups of each classroom. This process occurred a day before data collection. During the collection, the voluntary participation of the students was encouraged through a specific section in the form that contained the research instruments. The data were collected in December 2020, during the compulsory social isolation due to the COVID-19 pandemic. A pilot test was conducted with 20 students, who reflected the characteristics of the study population, during synchronous class hours to verify the face validity of the instrument. The instrument, a Google form applied individually, was completed in approximately 20 minutes, and the pilot confirmed that the students understood the items.

Analysis
In this study, a descriptive analysis of the items was carried out through the calculation of the mean, standard deviation, skewness, kurtosis, and the corrected item-test correlation. Skewness (g1) and kurtosis (g2) were considered adequate if the values were within ±1.5 (Pérez and Medrano, 2010). Also, the corrected item-test correlation was used to flag items for elimination when r(i-tc) was less than or equal to 0.2 or when there was evidence of multicollinearity (Kline, 2016).
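The item screening described above can be illustrated with a minimal sketch. It assumes the moment-based definitions of skewness and excess kurtosis (statistical packages often apply small-sample corrections, so values may differ slightly) and uses a small hypothetical response matrix, not the study data. The function names are illustrative only.

```python
import math


def central_moment(values, order):
    """Return the central moment of the given order (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Skewness g1 = m3 / m2**1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Excess kurtosis g2 = m4 / m2**2 - 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(responses, item):
    """Correlate one item with the sum of the remaining items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


# Hypothetical responses (6 students x 4 items, 1-5 scale); not study data.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [5, 5, 4, 5],
    [3, 4, 3, 3],
]

for i in range(len(responses[0])):
    column = [row[i] for row in responses]
    g1, g2 = skewness(column), excess_kurtosis(column)
    r_itc = corrected_item_total(responses, i)
    shape_ok = abs(g1) <= 1.5 and abs(g2) <= 1.5  # +/-1.5 criterion
    keep = r_itc > 0.2  # items at or below 0.2 would be flagged
    print(f"item {i + 1}: g1={g1:.2f} g2={g2:.2f} "
          f"r(i-tc)={r_itc:.2f} shape_ok={shape_ok} keep={keep}")
```

The correlation is "corrected" because the item itself is removed from the total before correlating, which avoids inflating the coefficient.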
A Confirmatory Factor Analysis (CFA) of the three-factor scale was carried out using the MLR estimator, considered robust against deviations from normality (Muthen and Muthen, 2017). The criteria to evaluate the model fit included the chi-square test (χ²), the Comparative Fit Index and Tucker-Lewis Index (CFI and TLI ≥ 0.95) (Schumacker and Lomax, 2016), and the Root Mean Square Error of Approximation and Standardized Root Mean Square Residual (RMSEA and SRMR ≤ 0.05) (Kline, 2016). Additionally, as evidence of convergent validity, the average variance extracted (AVE) per factor was calculated (AVE > 0.50). Interfactor correlations (φ) were also calculated according to conceptual affinity, since discriminant validity is evaluated by comparing the AVE with the square of the interfactor correlations (φ²), where the former is expected to be greater (AVE > φ²) (Fornell and Larcker, 1981).
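The convergent and discriminant validity criteria above reduce to simple arithmetic on standardized loadings. The following is a minimal sketch, assuming standardized loadings are available from a fitted CFA; the loadings and the correlation used here are hypothetical and do not come from the study.

```python
def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings of one factor."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def fornell_larcker_ok(ave_a, ave_b, phi):
    """Discriminant validity holds when both AVEs exceed phi squared."""
    return min(ave_a, ave_b) > phi ** 2


# Hypothetical standardized loadings for two factors (illustrative only).
factor_a = [0.80, 0.75, 0.85]
factor_b = [0.70, 0.72, 0.78]

ave_a = average_variance_extracted(factor_a)
ave_b = average_variance_extracted(factor_b)

for phi in (0.60, 0.90):  # hypothetical interfactor correlations
    print(f"phi={phi:.2f} phi^2={phi ** 2:.2f} "
          f"AVE_a={ave_a:.2f} AVE_b={ave_b:.2f} "
          f"discriminant_ok={fornell_larcker_ok(ave_a, ave_b, phi)}")
```

Because an AVE is at most 1 and reaches 1 only when every standardized loading equals 1, an interfactor correlation of 1 cannot satisfy the criterion in practice.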
A sequence of progressively more restrictive hierarchical invariance models was used. Initially, configurational invariance was analyzed as a reference model, which assesses whether the factorial structure is similar across groups. This analysis was followed by the evaluation of metric invariance, examining whether the factorial loadings are equivalent across genders. Subsequently, scalar invariance was considered, which adds the equality of intercepts to that of the factorial loadings. Finally, strict invariance was evaluated, including the equality of factorial loadings, intercepts, and residual variances. The comparison of these models was based on the change in the Comparative Fit Index (ΔCFI), where values less than 0.010 indicate invariance of the model between groups, according to Chen (2007) and Finch and French (2018). Additionally, the change in RMSEA (ΔRMSEA) was applied, with differences less than 0.015 taken to confirm the invariance of the model between groups (Chen, 2007; Finch and French, 2018).
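The decision rule for each step of the invariance sequence can be sketched as follows. This is an illustration under stated assumptions: it compares absolute changes between consecutive models against the 0.010 (ΔCFI) and 0.015 (ΔRMSEA) cut-offs, and the function name is hypothetical. The CFI and RMSEA values are those reported for the gender analysis in the Results.

```python
def invariance_step_holds(previous, current,
                          max_delta_cfi=0.010, max_delta_rmsea=0.015):
    """Compare two nested models given as (cfi, rmsea) tuples.

    Returns True when both absolute changes stay below the cut-offs.
    """
    delta_cfi = abs(current[0] - previous[0])
    delta_rmsea = abs(current[1] - previous[1])
    return delta_cfi < max_delta_cfi and delta_rmsea < max_delta_rmsea


# (model name, CFI, RMSEA) in order of increasing restriction.
models = [
    ("configurational", 0.960, 0.053),
    ("metric", 0.961, 0.051),
    ("scalar", 0.961, 0.049),
]

for (_, *fit_prev), (name, *fit_curr) in zip(models, models[1:]):
    holds = invariance_step_holds(tuple(fit_prev), tuple(fit_curr))
    print(f"{name} invariance supported: {holds}")
```

Each model is compared only with the one immediately before it, which mirrors the hierarchical logic of the sequence: a level of invariance is interpreted only if the previous level held.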
The reliability of the scale was assessed using Cronbach's alpha coefficient (α) and McDonald's omega coefficient (ω) (McDonald, 1999), both indicators of internal consistency, with values above 0.70 considered indicative of good reliability. All statistical analyses were carried out using R software version 4.1.1 (R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org).
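As a rough illustration of the two reliability coefficients, the sketch below computes Cronbach's alpha from raw item scores and McDonald's omega from standardized loadings. It assumes a single factor with uncorrelated residuals for omega, and the data and loadings are hypothetical, not taken from the study.

```python
import statistics


def cronbach_alpha(responses):
    """Alpha = k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = len(responses[0])
    item_vars = [statistics.variance([row[i] for row in responses])
                 for i in range(k)]
    total_var = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def mcdonald_omega(loadings):
    """Omega for one factor from standardized loadings.

    Assumes uncorrelated residuals, so each residual variance is 1 - l**2.
    """
    common = sum(loadings) ** 2
    unique = sum(1 - l ** 2 for l in loadings)
    return common / (common + unique)


# Hypothetical responses (6 students x 4 items, 1-5 scale); not study data.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [5, 5, 4, 5],
    [3, 4, 3, 3],
]
loadings = [0.80, 0.80, 0.80]  # hypothetical standardized loadings

alpha = cronbach_alpha(responses)
omega = mcdonald_omega(loadings)
print(f"alpha={alpha:.2f} (good: {alpha > 0.70})")
print(f"omega={omega:.2f} (good: {omega > 0.70})")
```

Alpha assumes that all items load equally on the factor, whereas omega weights items by their loadings, which is why the two are usually reported together.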

Results
Table 2 presents the descriptive statistics and reliability measures for the total sample and broken down by gender. The highest item mean was 3.05 in the total sample (3.08 for the female group and 3.03 for the male group), identifying the item with the highest perceived self-efficacy. In contrast, item 14 has the lowest mean in all three groups, with 2.45 in the total sample, suggesting it is the item with the lowest perceived self-efficacy. The measures of skewness (g1) and kurtosis (g2) for all items fall within the established range of ±1.5, implying an acceptably normal data distribution. Additionally, all item-total correlations (r.cor) exceed the threshold of 0.30, indicating that each item contributes adequately to the scale and none should be eliminated. The internal consistency of the scale is high, as all Cronbach's alpha values per item exceed the acceptability criterion of 0.70.

Internal structure and reliability
A Confirmatory Factor Analysis (CFA) was first performed on an initial three-factor model (M1), which showed good fit indices, better than those of the second model described below (χ² = 175.050, df = 88, p < 0.001, CFI = 0.99, TLI = 0.99, RMSEA = 0.04, SRMR = 0.03). However, a correlation of 1 between Factor 1 (decoding self-efficacy) and Factor 3 (situation model self-efficacy) suggests that the two factors are essentially measuring the same thing. Also, when considering discriminant validity, the average variance extracted (AVE) for Factors 1 and 3 did not exceed the square of their interfactor correlation (φ²). This suggests problems in discriminating between these factors: they may be so interrelated that students who feel self-efficacious in one probably also feel self-efficacious in the other. Given these considerations, a second model (M2) was proposed, in which Factors 1 and 3 were combined into a single factor, named "Self-Efficacy in Decoding and Situational Model." This model demonstrated adequate fit: χ² = 350.100, df = 89, p < 0.001, CFI = 0.98, TLI = 0.98, RMSEA = 0.07 (90% CI 0.06-0.08), SRMR = 0.03. However, the squared interfactor correlation (φ²) was greater than the AVE, which suggests a lack of discriminant validity, even though all factorial loadings were above 0.70, indicating that each item is well represented by its respective factor. Cronbach's alpha and McDonald's omega (ω) values for all factors were above 0.70, indicating good internal consistency of each factor. Therefore, a second-order model was chosen (see Table 3).

Second-order model
Given that discriminant validity was not achieved because the squared interfactor correlations surpassed the AVE in some cases, it was decided to proceed with a second-order model. This model showed adequate fit, as evidenced by the following indices: χ² = 175.05, df = 88, p < 0.001, CFI = 0.99, TLI = 0.99, RMSEA = 0.04 [90% CI = 0.03-0.05], and SRMR = 0.03. Therefore, the second-order model provides a structure better adjusted to the collected data and contributes to the conceptual interpretation of the results (Figure 1). The internal consistency, as measured by Cronbach's alpha and McDonald's omega (ω) coefficients, was adequate for Self-Efficacy in Decoding and Situational Model (α, ω = 0.94) and Textual Self-Efficacy (α, ω = 0.81).
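As a plausibility check, the RMSEA point estimates can be approximated from the reported chi-square statistics, degrees of freedom, and sample size (N = 560). The sketch below assumes the standard unscaled single-group formula with N − 1 in the denominator; robust (MLR) estimation applies a scaling correction and some software uses N instead, so small discrepancies from the published values are to be expected.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


N = 560  # sample size reported in the study

# Chi-square and df values as reported for the two models.
two_factor = rmsea(350.10, 89, N)
second_order = rmsea(175.05, 88, N)

print(f"two-factor model (M2): RMSEA ~ {two_factor:.3f}")
print(f"second-order model:    RMSEA ~ {second_order:.3f}")
```

Rounded to two decimals, both approximations agree with the RMSEA values reported for the corresponding models.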

Gender invariance
The scale analysis demonstrates invariance at various levels for both genders. Configurational invariance, which examines the factorial structure without equality constraints, showed a good fit with a CFI of 0.96 and an RMSEA of 0.053, indicating structural similarity between gender groups. Metric invariance, with equal factorial loadings in both groups, maintained consistency (CFI of 0.961 and RMSEA of 0.051), implying that the scale measures the same dimensions of reading self-efficacy in both sexes. For scalar invariance, which adds equality of intercepts, the results (CFI of 0.961 and RMSEA of 0.049) suggest a similar interpretation of the scale at the item level for both sexes. Lastly, strict invariance, which adds equality of residual variances, showed a slight change in CFI (ΔCFI = 0.004), still below the 0.010 threshold, reaffirming invariance. According to Chen's (2007) criteria, these results indicate that the scale is a reliable instrument for measuring reading self-efficacy across sexes, allowing for precise and unbiased comparisons between gender groups in Peruvian students (see Table 4).

Discussion
Reading self-efficacy has been established as a critical construct in educational research, influencing students' academic performance and motivation. As defined by Bandura et al. (1999), it is the individual's confidence in their ability to execute specific tasks and achieve goals, and in the context of reading, it refers to confidence in personal reading ability (Shell et al., 1989). In Peru, the situation is critical, as the OECD's PISA reports from 2018 and 2022 show that students' reading performance is significantly below the OECD average, with the performance gap widened by socioeconomic and gender differences (OECD, 2019, 2023). Additionally, the general decline in learning outcomes after the pandemic has highlighted the importance of reading self-efficacy (Ministerio de Educación [MINEDU], 2018, 2023). This study sought to evaluate the psychometric properties of the Reading Self-Efficacy Scale in the Peruvian context.
The confirmatory factor analysis applied to the reading self-efficacy scale reveals interesting and challenging aspects in its structure and validity. The first proposed model (M1), following an approach similar to Fidalgo et al. (2013), considered three distinct factors: self-efficacy in decoding, textual self-efficacy, and self-efficacy towards the situational model. Although this model showed acceptable fit indices, a perfect correlation between Factor 1 (self-efficacy in decoding) and Factor 3 (self-efficacy towards the situational model) poses a significant challenge. This correlation indicates that the two factors overlap to the point that they might be evaluating the same dimension of reading self-efficacy. This finding contrasts with the results of the study by Fidalgo et al. (2013), where a clear differentiation between the three factors was observed. To address this overlap, a second model (M2) was proposed, merging Factors 1 and 3 into a single factor named "Self-Efficacy in Decoding and Situational Model." This adjusted model showed adequate fit indices, though concerns about discriminant validity persist. Specifically, the lack of discriminant validity is inferred from the fact that the square of the interfactor correlation (φ²) is greater than the average variance extracted (AVE) of the factors. Despite these challenges, the factorial loadings of individual items on their respective factors are robust, all above 0.70, indicating an accurate representation of each item by its corresponding factor.
Therefore, it was decided to proceed with a second-order model, which showed adequate fit. This model provides a structure more suited to the collected data and contributes to the conceptual interpretation of the results. The internal consistency, measured through Cronbach's alpha and McDonald's omega (ω) coefficients, was adequate for Self-Efficacy in Decoding and Situational Model (α = 0.94) and Textual Self-Efficacy (α = 0.81), indicating that the scale is reliable for measuring reading self-efficacy in the Peruvian context.
We also examined the scale's gender invariance. Our findings demonstrated strict invariance, which suggests that latent means are comparable across genders. Nevertheless, despite the strict invariance that our scale exhibits, it is crucial to keep in mind that cultural and gender variations may affect how people understand and react to scale items. For example, in societies where reading is regarded more highly for women than for men, women may feel more confident in their ability to read effectively and therefore report higher reading self-efficacy. This is important because reading self-efficacy has been shown to have a strong impact on academic performance (Bandura, 1997; Schunk, 2003). Increasing reading self-efficacy, therefore, could be an effective means to improve academic outcomes. However, if significant differences in reading self-efficacy scores between genders are found, educators should consider the possible reasons behind these differences and adapt their interventions appropriately.

Implications
The Reading Self-Efficacy Scale has demonstrated its validity and reliability as a tool in Peru. The findings provide a solid basis for understanding and improving reading self-efficacy, a key factor for academic performance and student motivation. The implications of this study extend to professional practice and educational policy by providing educators with a valid and reliable tool to assess and develop students' reading self-efficacy. To acquire a fuller picture of students' abilities and attitudes toward reading, decision-makers in education may consider adding measures of reading self-efficacy to national examinations. Additionally, they may put in place educational policies and programs that support the growth of reading self-efficacy, including fostering encouraging reading settings, providing access to high-quality reading materials, and training teachers in effective methods to raise reading self-efficacy. Our results also provide evidence that reading self-efficacy is a multifaceted concept that encompasses elements like decoding, verbal fluency, and construction of the situation model, supporting Bandura's self-efficacy theory. This deepens our understanding of how students perceive and assess their reading competence and offers a sound theoretical foundation for developing instructional interventions aimed at raising reading self-efficacy. Finally, it is critical to remember that a variety of contextual and cultural elements might have an impact on a reader's self-efficacy. Therefore, it is important to take cultural particularities into account and adapt the Reading Self-Efficacy Scale as appropriate when applying it to other groups or circumstances. Additionally, reading self-efficacy is a construct that may vary over time and in response to students' experiences. For this reason, frequent evaluations are advised to track its development and adjust interventions as necessary.

Limitations
It is crucial to identify and address limitations that might have influenced the results. Firstly, a convenience sample of Peruvian students aged 10 to 16 years from Lima and Ica was used. It is important to note that the results may not be generalizable to other age groups or educational contexts. Further research involving more representative samples of students from various regions and educational levels in Peru and other countries is recommended. Additionally, it is important to highlight that our research was based on self-reported data. In future studies, self-reports could be complemented with additional assessment techniques, such as classroom reading observations or objective measures of reading performance. Regarding the expansion of validity studies, it would be beneficial to conduct research exploring other forms of validity, such as concurrent and predictive validity, with other scales. Predictive validity could be examined by correlating scale scores with future academic achievement, reading assessments, or standardized reading tests. These additional validity studies would provide a more comprehensive understanding of the effectiveness and applicability of the scale in various educational settings.

Conclusion
The Reading Self-Efficacy Scale has demonstrated its validity and reliability as a tool in Peru. The findings provide a solid basis for understanding and improving reading self-efficacy, a key factor for academic performance and student motivation. The implications of this study extend to professional practice and educational policy by providing educators with a valid and reliable tool to assess and develop students' reading self-efficacy. To acquire a fuller picture of students' abilities and attitudes toward reading, decision-makers in education may consider adding measures of reading self-efficacy to national examinations. Additionally, they may put in place educational policies and programs that support the growth of reading self-efficacy, including fostering encouraging reading settings, providing access to high-quality reading materials, and training teachers in effective methods to raise reading self-efficacy.

FIGURE 1
Factorial model.

García et al. 10.3389/feduc.2024.1234268. Frontiers in Education. frontiersin.org

TABLE 2
Descriptive statistics and reliability.

TABLE 3
Confirmatory factor analysis.