Academic emotions of eighth grade students in algebra classrooms using an artificial intelligence learning environment

Omar, Amal; Daher, Wajeeh; Bayaa, Nimer

doi:10.3389/feduc.2025.1669360

ORIGINAL RESEARCH article

Front. Educ., 30 September 2025

Sec. Psychology in Education

Volume 10 - 2025 | https://doi.org/10.3389/feduc.2025.1669360

Amal Omar¹*

Wajeeh Daher¹

Nimer Bayaa²

¹Faculty of Education, An-Najah National University, Nablus, Palestine
²Al-Qasemi Academic College of Education, Baqa al-Gharbiyye, Israel

Introduction: With the increasing use of artificial intelligence (AI) in education, there is a need to investigate the emotional impact of smart learning environments. This study explores the impact of using the AI-powered CK-12 Flexi (v2.0) application on the academic emotions of eighth-grade students while learning algebra.

Methods: A mixed-methods approach was applied to a sample of 92 male and female students from two schools in Nablus, selected via convenience sampling and distributed into two groups: an experimental group (learned using the Flexi application) and a control group (learned using the traditional method). Quantitative data were collected through pre- and post-questionnaires addressing academic emotions; qualitative data were collected through semi-structured interviews with a selected sample of students from the experimental group.

Results: Pretest-adjusted Quade nonparametric ANCOVA with Holm–Bonferroni control across six outcomes showed significant group differences favoring the experimental group for enjoyment (higher) and for anxiety and boredom (lower), p_adj ≤ 0.036; effects were small-to-moderate (partial η² = 0.074–0.116). In contrast, shame, pride, and anger did not differ significantly after correction. Qualitative results gave a fuller picture of students’ experience, revealing diverse positive emotions such as enthusiasm, comfort, and enjoyment, as well as negative emotions such as anxiety resulting from technical and interactive challenges, in addition to students’ dissatisfaction with their heavy reliance on the app.

Discussion: The study was primarily grounded in Control-Value Theory (CVT)—viewing academic emotions as functions of perceived control and value—while Self-Determination Theory (SDT) served as a complementary interpretive lens to explain how the intervention might support autonomy, competence, and relatedness.

1 Introduction

Educational literature suggests that academic emotions are a crucial component in shaping learning experiences, influencing attention, cognitive engagement, information processing, and, consequently, academic performance (Pekrun, 2006; Boekaerts, 2010). In mathematics specifically, these emotions are particularly prominent due to the abstract nature of the subject and its cognitive challenges. Students’ experiences range from positive emotions, such as pride and enjoyment, to negative emotions, including anger and anxiety (Daher, 2015; Pekrun et al., 2017; Schukajlow et al., 2023).

Educational research reveals persistent achievement gaps in mathematics among Palestinian students, particularly in algebraic concepts that require advanced abstract skills (TIMSS and PIRLS International Study Center, 2024). This highlights the need to adopt educational strategies that take into account the emotional dimension of the learning process. In this context, artificial intelligence technologies have contributed to providing new opportunities for effective academic emotion management through smart learning environments that adapt to students’ immediate responses and provide personalized feedback that helps transform negative emotions into positive motivators, enhancing the quality and sustainability of learning (Roda-Segarra et al., 2024).

Artificial intelligence plays a pivotal role in enhancing learner engagement on both emotional and cognitive levels. It manages academic emotions by customizing educational content to fit individual needs, adjusting the challenge level adaptively, and providing immediate feedback. Together, these elements help create a supportive learning environment that fosters a state of flow (Csikszentmihalyi, 1990), in which deep focus blends with total immersion in the learning process (Daher and Abu Thabet, 2025; Pekrun, 2021; Alvarez, 2024; CK-12, 2024). This work contributes mixed-methods evidence from an under-represented Arabic/Palestinian context and examines an AI-supported algebra intervention integrated into classroom practice.

In this context, the CK-12 Flexi application represents an advanced model for AI applications in mathematics education. It combines real-time interaction with personalized learning paths, which contributes to stimulating positive emotions and reducing negative emotions when dealing with abstract algebraic concepts (Schirmer, 2015; Damasio, 2004).

Recent work argues that understanding how AI mediates learners’ emotions is a necessary next step, particularly in school mathematics, where affect (e.g., interest, engagement, anxiety) is closely tied to achievement. A recent systematic review shows that AI is increasingly used to assess emotions in educational settings and highlights both the promise of affect-aware adaptation and open issues around validity and ethics (e.g., privacy, cross-cultural robustness) (Vistorte et al., 2024). Building on this, affective intelligent tutoring systems (ATS) explicitly detect and respond to students’ affective states; a 2024 scoping review synthesizes 27 ATSs and calls for stronger evidence on emotional outcomes and clearer reporting standards (Fernández-Herrero, 2024). By contrast, state-of-the-art syntheses of AI/ITS in K-12 mathematics still prioritize performance outcomes, with affective measures under-reported—indicating a gap our study addresses (Son, 2024; Létourneau et al., 2025; Wang et al., 2023). At the same time, adjacent K-12 evidence suggests AI can reduce math anxiety: an intervention with primary pupils using Gen-AI-assisted learning reported anxiety reductions alongside gains in interest and self-efficacy (Wang and Wei, 2025), and an AI-driven program for low-performing 7th-graders documented decreased math anxiety with concurrent achievement gains (Polydoros et al., 2025). Together, these literatures motivate examining whether classroom AI can be leveraged not only to improve performance but also to support students’ learning-related emotions in mathematics.

Building on this gap, the present study examines how the CK-12 Flexi (v2.0) application—an AI-enabled tool integrating adaptive feedback, multiple input modalities, and personalized learning paths—shapes students’ emotional trajectories in algebra lessons within an Arabic/Palestinian setting, a context underrepresented in prior research (TIMSS and PIRLS International Study Center, 2024). Using a mixed-methods design that combines quantitative analyses with qualitative interviews and a SAMR-informed lesson redesign, we specify when, how, and for whom AI-supported environments foster positive emotions (e.g., enjoyment, pride) and attenuate negative emotions (e.g., boredom, anxiety). Rather than assuming uniformly beneficial effects, we situate our approach within conceptual accounts that frame AI as a double-edged sword and emphasize maximizing benefits while mitigating risks (Chen and Lin, 2024). Unlike Alvarez (2024), who contrasted Flexi with MathGPT and primarily reported achievement gains, we center academic emotions and the mechanisms linking Flexi’s specific affordances to emotional change, while documenting tool-specific boundary conditions that qualify these effects.

Based on the above, the current study aims to investigate the impact of using the Flexi app in teaching algebra on the academic emotions of eighth-grade students by analyzing the emotional transformations students experience while learning in an interactive smart environment. It also seeks to identify differences in the app’s impact depending on mathematical ability level, which will help guide the design of learning environments that are more responsive to learners’ emotional and cognitive needs. The study was based on the following questions:

• Q1: After adjusting for pretest scores, do post-test levels of enjoyment, pride, anger, anxiety, shame, and boredom differ between the experimental (Flexi-supported) and control groups in eighth-grade algebra?

• Q2: What student–Flexi interaction patterns are associated with changes in academic emotions during eighth-grade algebra lessons?

2 Theoretical framework

According to Pekrun and Perry (2014), emotions represent complex mental and physical states that arise from a cognitive evaluation, which may be conscious or unconscious, of internal or external environmental stimuli. These emotions result in behavioral and physiological responses that directly impact individual performance (Pekrun et al., 2002). These responses are manifested in explicit ways, such as facial expressions, body language, and changes in voice tone, or they may be implicit, as seen in physiological changes. Specifically, in the educational context, academic emotions emerge as emotional responses directly related to scholarly activities and their outcomes, profoundly impacting attention, memory, and problem-solving (Meece et al., 1990).

Academic emotions are based on the Control-Value Theory (CVT), proposed by Pekrun (2006), which posits that these emotions arise from a student’s assessment of their ability to control an academic task and their perception of its value. The theory divides these emotions into three main dimensions: object focus, emotional valence, and physiological activation. These dimensions influence the formation of different types of emotions. Positive, activating emotions, such as pleasure and pride, enhance academic performance and effective cognitive strategies (Ellis and Ashbrook, 1989), while negative emotions, such as anxiety and frustration, can reduce cognitive efficiency but, under some circumstances, can motivate additional effort if properly managed (Muis et al., 2018; Middleton et al., 2023).

With technological advancements, artificial intelligence plays a pivotal role in detecting and managing these academic emotions. Advanced algorithms have enabled the analysis of emotional data with high accuracy, allowing intelligent educational systems to react to students’ emotional states in real time and provide personalized feedback that enhances positive emotions and mitigates the negative effects of negative emotions. Recent studies (D’Mello and Graesser, 2012; Mehigan and Pitt, 2019; Vistorte et al., 2024) indicate that the use of these technologies leads to significant improvements in students’ academic emotions and enhances levels of motivation and academic engagement, especially in complex cognitive subjects such as mathematics.

In this study, Control–Value Theory (CVT) serves as the primary framework for understanding academic emotions as functions of perceived control and value; CVT guided the research questions and the quantitative analysis. Self-Determination Theory (SDT) is used as a complementary interpretive lens to elucidate how components of the intervention may support the needs for autonomy, competence, and relatedness, thereby shaping control–value appraisals and, in turn, emotions. For operational transparency, the Methods section details how CK-12 Flexi features align with CVT appraisals and SDT needs as expected mechanisms of effect. Analytically, we treat academic emotions as influenced by control–value appraisals.

3 Methodology

This study utilized a mixed-methods approach that incorporated both quantitative and qualitative methods to analyze and explore students’ subjective emotional experiences (Creswell and Clark, 2017).

In the quantitative aspect, a quasi-experimental design based on pre-test and post-test measurements was implemented. The sample consisted of 92 eighth-grade students from two private schools in Nablus, Palestine: Modern English School and British Scientific School. Both schools follow the Cambridge curriculum and teach Grade 8 in English, with similar learning environments including an average class size of approximately 23 students and comparable assessment systems (short quizzes, homework, projects, classwork, and term examinations). Each school has two Grade 8 sections taught by the same teacher within the school, reducing within-school teacher variance while acknowledging a potential between-school teacher effect. Socioeconomic indicators are also similar across the two schools. Records from the previous year indicated a mean prior mathematics achievement of 88.84 at Modern English School and 87.91 at British Scientific School, demonstrating a high degree of baseline academic comparability. The teachers at Modern English School had an average of 5 years of experience, while those at British Scientific School had an average of 6 years.

The student questionnaire collected minimal demographic information (school, section, gender) and assigned a study ID to link pre- and post-responses. Ability level was not collected on the form; instead, the schools provided an administrative, de-identified binary ability flag (High vs. Other) based on prior-year mathematics placement, linked via the study ID. This flag was used for purposive sampling for the qualitative interviews and for baseline equivalence checks between groups. The questionnaire data were pseudonymized: only alphanumeric study IDs appeared on the forms, and no names or direct identifiers were stored with the quantitative dataset. Participant identity was known during the qualitative interviews, but identifying information was not retained or linked to survey responses.

The resulting sample (N = 92; experimental = 45, control = 47) constituted a near-census of all eligible, consenting Grade-8 students in the two participating schools during the study period. This reflects a pragmatic school-level sampling frame appropriate for the quasi-experimental design. With a 5% significance level, this sample provides approximately 80% statistical power to detect effects of medium magnitude in two-group comparisons. Smaller effects or interaction terms may be underpowered. Therefore, the study is adequately sensitive to its primary objective, which is between-group differences in emotions. All inferential results are reported with effect sizes and 95% confidence intervals to convey precision.

Assignment and Bias-Mitigation Procedures: The Modern English School served as the experimental group and the British Scientific School as the control group, for operational reasons such as computer lab availability, timetable alignment, and teacher readiness to implement the protocol. In the experimental school, the teacher taught both sections a redesigned algebra unit aligned with the SAMR model using CK-12 Flexi. In the control school, the teacher provided business-as-usual instruction to both sections without Flexi, with lesson plans harmonized across schools where feasible. The institutional and geographic separation of the schools and the absence of direct operational links between them minimized the risk of treatment contamination. This risk was further reduced by restricting CK-12 Flexi account access to students and teachers in the experimental school and instructing participants not to share links or screenshots. To assess baseline comparability and mitigate bias, we conducted χ² tests for categorical variables (gender, ability level) and Mann–Whitney U tests for AEQ-M pretests. Effect sizes (φ/Cramér’s V for χ²; r = |z|/√n for Mann–Whitney) and 95% confidence intervals were reported where applicable. Results are presented in Table 1.

Table 1. Baseline equivalence of experimental and control groups.

Chi-square tests showed no significant differences between the groups in terms of gender or ability level. Similarly, Mann–Whitney tests did not reveal any statistically significant differences in pretest academic-emotion scores. Effect sizes ranged from small to very small, and the confidence intervals included zero. These findings suggest that the two groups were equivalent at baseline both demographically and emotionally. All outcome analyses were adjusted for pretest using Quade nonparametric ANCOVA.
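As an illustration of the baseline effect sizes named above, the following minimal sketch computes φ from a χ² statistic and r = |z|/√n from a Mann–Whitney z. The function names are hypothetical, the z value is obtained from U by a plain normal approximation without tie or continuity correction (an assumption; statistical packages may apply corrections), and the numeric inputs are invented for illustration rather than taken from Table 1.

```python
import math


def phi_from_chi2(chi2: float, n: int) -> float:
    """Phi coefficient (equal to Cramer's V for a 2x2 table): sqrt(chi2 / n)."""
    return math.sqrt(chi2 / n)


def mann_whitney_z(u: float, n1: int, n2: int) -> float:
    """Normal-approximation z for a U statistic (no tie or continuity correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


def r_from_z(z: float, n: int) -> float:
    """Effect size for a Mann-Whitney comparison: r = |z| / sqrt(n)."""
    return abs(z) / math.sqrt(n)


# Hypothetical inputs for illustration only (not values from Table 1).
n_exp, n_ctl = 45, 47
z = mann_whitney_z(u=950.0, n1=n_exp, n2=n_ctl)
print(round(r_from_z(z, n_exp + n_ctl), 3))
print(round(phi_from_chi2(0.50, n_exp + n_ctl), 3))
```

Under the usual benchmarks, values of r or φ near 0.1 would be read as small, which is the range the baseline comparisons describe.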

The AEQ-M measures academic emotions across three contexts—class attendance, self-study, and test-taking (Pekrun et al., 2011; Bieleke et al., 2023). Since the present intervention occurred during classroom algebra lessons using CK-12 Flexi, measurement was deliberately restricted to the class-attendance context to align with the intervention, reduce irrelevant variance (e.g., homework or test conditions), and enhance ecological validity. Extensive validation work has established the AEQ-M’s construct validity and factorial structure alongside convergent/discriminant validity and reliability across multiple samples, languages, and educational stages (Pekrun et al., 2011; Bieleke et al., 2023). Translation in this study followed a rigorous forward-review procedure (Mallinckrodt and Wang, 2004). Seven educational experts evaluated semantic and conceptual equivalence of the Arabic items to ensure clarity and age appropriateness for Grade-8 students.

The study employed the Mathematics Achievement Emotions Questionnaire (AEQ-M; Pekrun et al., 2011; Bieleke et al., 2023) in a 30-item version adapted to Arabic for Grade-8 learners (Supplement S1). The instrument covers six emotions—enjoyment, pride, anxiety, anger, shame, and boredom—with five items per subscale, rated on a five-point Likert scale from 1 (lowest) to 5 (highest). Illustrative items include: “I smile and feel happy during the lesson when I understand math concepts” (enjoyment); “I feel proud when I overcome challenges in mathematics problems” (pride); “I feel afraid of attending mathematics class and prefer not to participate” (anxiety); “I feel angry while solving mathematics problems in class” (anger); “I feel ashamed when I cannot answer my teacher’s questions in mathematics correctly” (shame); and “I cannot concentrate because I feel very bored during mathematics class” (boredom). Adaptation note: prior to data collection, the AEQ-M Hopelessness subscale was excluded for three reasons: (i) expert reviewers cautioned that, in Grade-8 lessons, hopelessness is easily conflated with boredom, undermining discriminant validity; (ii) several items were ambiguously worded and not anchored to the during-class context targeted here (often referring to tests or general states); and (iii) to minimize respondent burden while retaining the emotions most germane to classroom learning.

Responses were coded 1–5, with no reverse-keyed items. For each student and each emotion, the subscale score was computed as the sum of its five items (range 5–25), where higher scores indicate greater intensity or frequency of the emotion during the lesson. The questionnaire was administered pre- and post-intervention in class. Pretest subscale sums were used to examine baseline equivalence between groups (Mann–Whitney U tests) and as covariates in the primary between-group comparisons via Quade nonparametric ANCOVA on posttest outcomes. For descriptive tables intended to aid interpretation, subscale scores may be rescaled to 1–5 (sum ÷ 5) without affecting inferential results. No item-level missing data were observed.
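The scoring rule described above can be sketched as follows. This is a minimal illustration, assuming that each student's responses are already grouped by emotion; the data layout, function names, and example responses are hypothetical and do not reproduce the actual item-to-subscale key of the instrument.

```python
from typing import Dict, List

EMOTIONS = ["enjoyment", "pride", "anxiety", "anger", "shame", "boredom"]
ITEMS_PER_SUBSCALE = 5


def score_subscales(responses: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the five 1-5 responses of each emotion subscale (range 5-25)."""
    scores = {}
    for emotion in EMOTIONS:
        items = responses[emotion]
        if len(items) != ITEMS_PER_SUBSCALE:
            raise ValueError(f"{emotion}: expected {ITEMS_PER_SUBSCALE} items")
        if any(not 1 <= value <= 5 for value in items):
            raise ValueError(f"{emotion}: responses must lie between 1 and 5")
        scores[emotion] = sum(items)  # no reverse-keyed items, so a plain sum
    return scores


def rescale_to_item_metric(sum_score: int) -> float:
    """Rescale a 5-25 sum to the 1-5 item metric for descriptive tables."""
    return sum_score / ITEMS_PER_SUBSCALE


# Hypothetical responses for one student (illustration only).
student = {emotion: [3, 3, 3, 3, 3] for emotion in EMOTIONS}
student["enjoyment"] = [4, 5, 4, 3, 5]
scores = score_subscales(student)
print(scores["enjoyment"], rescale_to_item_metric(scores["enjoyment"]))  # 21 4.2
```

Because the rescaling divides every score by the same constant, it leaves ranks unchanged, which is why rank-based tests give the same result on either metric.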

In our sample, internal consistency was acceptable to good. Table 2 presents the results.

Table 2. Cronbach’s alpha coefficients for the subscales of the Academic Emotions Questionnaire and the total scale.
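For readers who wish to reproduce the internal-consistency index, the standard formula for Cronbach's alpha, α = k/(k − 1) · (1 − Σ item variances / variance of the sum score), can be sketched as below. The function name and the small response matrix are hypothetical; the values reported in Table 2 were obtained by the authors and are not recomputed here.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for one subscale.

    rows[i][j] is respondent i's answer to item j; sample variances
    (denominator n - 1) are used throughout.
    """
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of six students to a five-item subscale.
example = [
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 4, 3],
    [5, 4, 5, 5, 4],
    [1, 2, 1, 2, 2],
    [4, 4, 3, 4, 4],
]
print(round(cronbach_alpha(example), 3))
```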

Construct validity was examined through intercorrelations among the AEQ-M subscales. The full correlation matrix is presented in Table 3.

Table 3. Pearson correlations among the subscales of the academic emotions questionnaire (AEQ-M).

The correlation matrix shows a pattern consistent with Control–Value Theory (CVT): the positive emotions enjoyment and pride were positively correlated (r = 0.441), and each was negatively correlated with the negative emotions (for enjoyment, correlations ranged from −0.33 to −0.45; for pride, from −0.66 to −0.83). The negative emotions (anger, anxiety, shame, boredom) were positively interrelated with medium-to-strong magnitudes (r = 0.393 to 0.651). Overall, absolute coefficients ranged from approximately 0.33 to 0.83, indicating general structural coherence among the emotion dimensions. However, the proximity of some pairs to |r| = 0.80, particularly pride–anxiety and pride–boredom, limits discriminant validity and supports testing a higher-order or bifactor model in future work.

Intervention and setting: Following baseline measurement, the experimental school conducted a four-week, in-class intervention in the computer labs. This intervention consisted of five sessions per week, each lasting 45 min. The teaching was done using the CK-12 Flexi v2.0 web platform on Windows desktops (Google Chrome, school LAN). The use of Flexi was limited to class time only, with no homework or out-of-class exposure assigned. The algebra unit was redesigned by the first author, externally reviewed, and included six lessons aligned with SAMR: Substituting into Expressions (Substitution), Constructing Expressions (Augmentation), Expressions and Indices (Modification), Expanding (ax+b) (Redefinition), and two consolidation lessons (Simplifying Algebraic Fractions; Deriving/Using Formulae), integrating S → A → M → R. The control school covered the same content using parallel plans without Flexi.

AI environment and guidance: Flexi was presented in English with in-app translation available when needed. Default platform settings, including attempt-gated solution/verification, were maintained. Classroom management was done through the Edubook LMS, which posted objectives and links to Flexi concept sets, shared warm-ups/mini-presentation rubrics, and collected non-Flexi artifacts. Edubook data were not utilized as outcomes. Flexi use focused on six key features: (1) immediate explanatory feedback at the step level (beyond right/wrong) that highlights the erroneous step and relevant rule; (2) adaptive pacing with graduated difficulty based on recent performance; (3) tiered hints/worked examples on demand; (4) multiple input modes (math keyboard, equation drawing, drag-and-drop, and photo capture); (5) attempt-gated verification as per platform default; and (6) a teacher dashboard logging time-on-task, completion status, and summary hint usage at student and class levels.

Pedagogy and task structures: Each 45-min lesson followed a stable pattern. It started with 5–15 min of teacher-led framing, including one or two worked examples to establish a common method. This was followed by 25–35 min of guided work in Flexi, with feedback and hints provided. The lesson concluded with a 5–10 min closure, including whole-class checks and/or brief student mini-presentations. Tasks alternated between individual, dyadic (peer checking or shared device during error clinics), and small-group work (3–4 students) as specified in Edubook. Flexi was introduced after the initial explanation to avoid over-reliance. Some warm-ups were completed without Flexi to maintain a balanced human–technology blend. Exemplar items included numerical substitution, constructing expressions from verbal descriptions, laws of indices (e.g., $x^{m} \cdot x^{n}$, $\frac{x^{a}}{x^{b}}$, $(x^{7})^{3}$, combining like terms), expansion of (ax + b)(cx + d), algebraic fractions, formula derivation/use, selected fill-in-the-blank prompts, and inquiry-oriented “Think like a Mathematician” items. Students were required to use at least two input modes (e.g., drawing and keyboard) across a set and to attach a one- to two-line rationale naming the applied rule and explaining how feedback changed their attempt.

Error-handling and fidelity: Error-handling followed a standard sequence. If misrecognition or a parsing failure occurred (e.g., a drawn × read as +, or a first photo that failed to parse), students re-captured with better lighting/cropping or switched input mode (drawing ↔ keyboard ↔ photo). Unresolved cases received a brief teacher check to restore progression and align with the class method. Known limitations with some fill-in items and selected “Think like a Mathematician” prompts were documented (screenshots in the Supplement), with teacher-facilitated closure where the tool did not provide a complete solution. Prior to implementation, the teacher completed two 30-min Zoom trainings (tool operations + SAMR-aligned orchestration), the lab technician prepared the machines, and students received a 45-min technical orientation on login and input options. Assignment was at the school level (experimental vs. control). Flexi credentials and deployment were restricted to the experimental school’s labs; the control school had no credentials and no Flexi access during the study window. Fidelity was monitored weekly via the dashboard (time-on-task, completion, hint usage) and through twice-weekly spot observations using a brief checklist.

The Supplement provides: (1) a weekly table summarizing topics, SAMR phase, features used, exemplar items, and orchestration pattern; and (2) annotated screenshots illustrating hint progression, correct/incorrect feedback, and representative error messages. In addition, representative task samples were included to demonstrate the nature of the activities carried out in Flexi, thereby enhancing clarity of the design and supporting replicability.

The AEQ-M was administered in class under the supervision of the mathematics teacher using a standardized script (instructions, timing, confidentiality) to ensure uniform administration. After data collection, responses were entered and scored in IBM SPSS Statistics v26. For each emotion (enjoyment, pride, anger, anxiety, shame, boredom), a subscale score was computed as the sum of its five items (range 5–25). Descriptives may be rescaled to 1–5 (sum ÷ 5) for readability without affecting inferential analyses. No item-level missing data were recorded (N = 92); hence, no case deletion or imputation was required. Because several subscales violated the normality and homogeneity-of-variance assumptions of parametric ANCOVA and the outcomes are Likert-type, between-group differences at post-test were analyzed with Quade’s nonparametric ANCOVA (Graphpad and Ghoodjani, 2023), with the corresponding pre-test score as the covariate. The analysis was run in SPSS via Analyze → Nonparametric Tests → Quade Nonparametric ANCOVA, with group (control vs. experimental) as the fixed factor and the pre-test subscale as the covariate. Results are reported as F(DFH = 1, DFE = 90), where DFH and DFE denote the hypothesis and error degrees of freedom, with two-tailed α = 0.05.

Because six primary outcomes were analyzed, the family-wise error rate was controlled using the Holm–Bonferroni step-down procedure (α = 0.05), and Holm-adjusted p-values are reported. Effect sizes are presented as partial η², calculated from the F statistic and its degrees of freedom (see Lakens, 2013).
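Both computations in this step are simple enough to sketch directly. Partial η² follows from F and its degrees of freedom as η²p = (F · DFH) / (F · DFH + DFE), and the Holm adjustment multiplies the ordered p-values by 6, 5, ..., 1 while enforcing monotonicity. The function names are hypothetical, and the raw p-values in the example are invented for illustration, since the unadjusted p-values are not given in the text; the F values are those reported in the Results.

```python
from typing import List, Sequence


def partial_eta_squared(f: float, df_h: int, df_e: int) -> float:
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f * df_h) / (f * df_h + df_e)


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep values monotone
        adjusted[idx] = running_max
    return adjusted


# F values reported in the Results, each with df = (1, 90).
for f_value in (11.844, 7.667, 7.149, 4.337):
    print(round(partial_eta_squared(f_value, 1, 90), 3))
# prints 0.116, 0.079, 0.074, 0.046

# Hypothetical raw p-values for six outcomes (illustration only).
raw = [0.001, 0.007, 0.009, 0.040, 0.300, 0.600]
print([round(p, 3) for p in holm_adjust(raw)])
# prints [0.006, 0.035, 0.036, 0.12, 0.6, 0.6]
```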

Because SPSS does not directly output confidence intervals for the Quade test, 95% confidence intervals for the adjusted difference between groups were obtained nonparametrically by (i) regressing the posttest on the pretest and saving the residuals, and then (ii) applying an independent-samples nonparametric estimator to the residuals to obtain a Hodges–Lehmann estimate of the between-group shift with a 95% confidence interval. These confidence intervals are reported alongside the Quade F tests to convey the precision of the adjusted differences.
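The two-step interval described above can be sketched as follows. The text does not state how the interval limits were computed, so the sketch assumes one common large-sample construction: the Hodges–Lehmann estimate is the median of all pairwise differences between the two groups' residuals, and the limits are order statistics of those differences chosen through the normal approximation to the Mann–Whitney distribution. The function names and the data are hypothetical.

```python
import math
from statistics import NormalDist, median
from typing import List, Sequence, Tuple


def ols_residuals(post: Sequence[float], pre: Sequence[float]) -> List[float]:
    """Residuals from a simple least-squares regression of post on pre."""
    n = len(post)
    mean_x, mean_y = sum(pre) / n, sum(post) / n
    sxx = sum((x - mean_x) ** 2 for x in pre)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(pre, post)) / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


def hodges_lehmann_shift(a: Sequence[float], b: Sequence[float],
                         confidence: float = 0.95) -> Tuple[float, float, float]:
    """Return (estimate, lower, upper) for the shift of a relative to b."""
    diffs = sorted(x - y for x in a for y in b)
    n1, n2 = len(a), len(b)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Large-sample order-statistic index, rounded down (slightly conservative).
    c = math.floor(n1 * n2 / 2 - z * math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12))
    c = max(c, 1)
    return median(diffs), diffs[c - 1], diffs[n1 * n2 - c]


# Hypothetical data: eight experimental and eight control students.
pre = [11, 14, 15, 13, 12, 16, 14, 15, 12, 13, 15, 14, 11, 16, 13, 12]
post = [18, 20, 21, 19, 19, 22, 21, 20, 14, 15, 17, 15, 13, 18, 16, 14]
resid = ols_residuals(post, pre)
experimental, control = resid[:8], resid[8:]
estimate, lower, upper = hodges_lehmann_shift(experimental, control)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

An interval that excludes zero corresponds to a between-group shift that is distinguishable from no difference at the chosen confidence level.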

Data collection and sampling: Qualitative data were collected through semi-structured, one-to-one interviews with a purposive sample of 14 students from the experimental group. All interviews took place in October 2024 via Microsoft Teams, following a standardized protocol to ensure consistency of prompts and probing (see Appendix 4). Theoretical saturation was reached after 12 interviews; two additional interviews were conducted to confirm that no new information emerged. Interviews lasted approximately 20–25 min and were scheduled in the evening after school under school oversight to minimize disruption. Sampling was designed to ensure gender balance and representation of two ability bands (High / Less-than-High), which were used solely for sampling and descriptive purposes. Translation workflow and de-identification: The interviews were conducted in Arabic. Audio recordings were transcribed verbatim in Arabic (manual transcription). English quotations presented in the manuscript were forward-translated by a bilingual researcher (First Author) and independently reviewed by a second bilingual colleague to maintain semantic equivalence. Minor surface edits (e.g., ellipses, bracketed clarifications) were made to enhance readability without altering meaning. All transcripts were de-identified before analysis: student names were replaced with pseudonyms, direct identifiers were removed or obfuscated, and potentially identifying contextual details were generalized.

Following periodic quality checks, transcripts were read repeatedly and coded in MAXQDA 2020. An inductive thematic analysis was conducted using Braun and Clarke’s (2006) six-phase framework. Initial line-by-line coding produced over 240 data-driven codes accompanied by analytic memos. Codes were grouped through constant comparison into higher-order categories (axes), and thematic maps were created to evaluate each theme’s internal coherence and external distinctiveness. A codebook (Appendix 2) was iteratively developed with operational definitions and anchor examples, refined through two structured supervisory consultations. Merge/split decisions were systematically documented in a MAXQDA audit trail. All themes were supported by representative verbatim quotations and triangulated with the quantitative findings to enhance analytical credibility.

The study received approval from the Graduate Studies Department at An-Najah National University on September 4, 2024. The Nablus Directorate of Education formally informed both schools of the study’s objectives, procedures, and instruments on September 10, 2024. Informed consent/assent was obtained from parents/guardians, students, and teachers after explaining the voluntary nature of participation and the right to withdraw or skip any question without academic or administrative consequences; no incentives were provided. Privacy and confidentiality were ensured by coding data, removing direct identifiers, and using pseudonyms in quotations; access was limited to Researcher 1 and the two supervisors. Files were stored on a password-protected device with an encrypted Google Drive backup; recordings were deleted from the platform immediately after secure local storage; data will be retained for 12 months after publication and then permanently deleted. When necessary, limited follow-up clarifications were requested via WhatsApp with prior consent, without altering the context of the original responses. Risks were assessed as minimal (possible mild discomfort or technical glitches), and participants could pause or reschedule at any time, with a teacher/counselor available. No conflicts of interest or funding with a direct stake in the results were declared.

4 Results

4.1 Q1: After adjusting for pretest scores, do post-test levels of enjoyment, pride, anger, anxiety, shame, and boredom differ between the experimental (Flexi-supported) and control groups in eighth-grade algebra?

To answer this question, the experimental group, which used the Flexi application, was compared with the control group, which learned in the traditional way, to explore the effect of the instructional method on students’ emotions in algebra classes. First, the means and standard deviations of the six emotions (enjoyment, pride, anger, anxiety, shame, and boredom) were calculated for both groups at pre- and post-test (Table 4). The descriptive statistics showed notable differences between the control and experimental groups across the six emotion variables. For enjoyment, both groups reported similar pre-test scores; at post-test, however, the experimental group showed a higher mean (M = 20.69, SD = 1.058) than the control group (M = 17.532, SD = 4.890). For pride, the experimental group’s post-test mean (M = 20.852, SD = 1.496) was only slightly higher than the control group’s (M = 20.426, SD = 4.745).

Table 4. Descriptive statistics (N, mean, standard deviation, and range) for the control and experimental groups at pre- and post-test across the emotion variables.

In contrast, post-test means for the negative emotions (anger, anxiety, shame, and boredom) were lower in the experimental group than in the control group. For example, the experimental group’s post-test mean for anger was M = 10.353 (SD = 3.206), while the control group’s post-test mean was higher at M = 11.113 (SD = 5.493). The experimental group’s post-test means for anxiety (M = 9.622, SD = 2.674), shame (M = 7.773, SD = 1.323), and boredom (M = 8.604, SD = 2.226) were likewise lower than the respective post-test means in the control group.

These descriptive results suggest that, following the intervention, the experimental group reported higher levels of positive emotions (enjoyment, pride) and lower levels of negative emotions (anger, anxiety, shame, boredom) than the control group. To determine whether these differences were statistically significant, a Quade nonparametric ANCOVA was conducted (Table 5).
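
For readers unfamiliar with the procedure, Quade's rank-based ANCOVA can be summarized in three steps: rank the pretest and post-test scores across all students, regress the centred post-test ranks on the centred pretest ranks, and compare the residuals between groups with a one-way ANOVA. The Python sketch below illustrates those steps under stated assumptions. The function names and the toy scores are hypothetical, it is not the software used in the study, and it returns only the F statistic with its degrees of freedom (no p-value).

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def quade_ancova(pre, post, group):
    """Return (F, df_between, df_within) for a Quade-style rank ANCOVA."""
    n = len(post)
    rank_pre = average_ranks(pre)
    rank_post = average_ranks(post)
    mean_pre, mean_post = mean(rank_pre), mean(rank_post)
    cx = [r - mean_pre for r in rank_pre]    # centred pretest ranks
    cy = [r - mean_post for r in rank_post]  # centred post-test ranks

    # Ordinary least-squares slope of post-test ranks on pretest ranks.
    slope = sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)
    resid = [b - slope * a for a, b in zip(cx, cy)]

    # One-way ANOVA on the residuals, with group as the factor.
    labels = sorted(set(group))
    grand = mean(resid)
    ss_between = 0.0
    ss_within = 0.0
    for label in labels:
        vals = [r for r, g in zip(resid, group) if g == label]
        m = mean(vals)
        ss_between += len(vals) * (m - grand) ** 2
        ss_within += sum((v - m) ** 2 for v in vals)
    df_between = len(labels) - 1
    df_within = n - len(labels)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical toy scores (not study data): four students per group.
pre = [10, 12, 14, 16, 11, 13, 15, 17]
post = [11, 12, 15, 16, 14, 17, 18, 21]
group = ["control"] * 4 + ["flexi"] * 4

f_value, df_between, df_within = quade_ancova(pre, post, group)
print(f"F({df_between}, {df_within}) = {f_value:.3f}")
```

With 92 students in two groups, the same calculation gives 1 and 90 degrees of freedom, which matches the values listed in the caption of Table 5.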

Table 5. Quade nonparametric ANCOVA results for differences in academic emotions between experimental and control groups (with 95% CIs) (DFE = 90, DFH = 1).

After controlling for pretest scores and adjusting p-values across the six outcomes (Holm–Bonferroni), the experimental group reported higher enjoyment than the control group, F(1, 90) = 11.844, p_adj = 0.006, η² = 0.116, with a 95% confidence interval for the adjusted group difference that excluded zero [0.57, 2.86]. The experimental group also showed lower anxiety and lower boredom, F(1, 90) = 7.667, p_adj = 0.035, η² = 0.079, CI [−2.427, −1.621], and F(1, 90) = 7.149, p_adj = 0.036, η² = 0.074, CI [−3.667, −0.222], respectively. In contrast, the difference in shame did not remain significant after adjustment, F(1, 90) = 4.337, p_adj = 0.120, η² = 0.046, with a confidence interval spanning zero, and pride and anger were non-significant (p_adj ≥ 0.57) with small effect sizes and confidence intervals crossing zero.
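
Two of the quantities reported above can be re-derived by hand, which may help readers check the table. The Python sketch below shows the Holm–Bonferroni step-down adjustment and the conversion of an F statistic into partial η². The function names are hypothetical. The F values are those reported in the text; of the raw p-values, only 0.009 and 0.040 are reported in the Discussion, and the other four are placeholders chosen for illustration, not study results.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni step-down adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values may never decrease along the ordered sequence.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Six raw p-values. ASSUMPTION: only 0.009 and 0.040 come from the article;
# 0.001, 0.005, 0.30 and 0.70 are illustrative placeholders.
raw_p = [0.001, 0.005, 0.009, 0.040, 0.30, 0.70]
adjusted_p = [round(p, 3) for p in holm_adjust(raw_p)]
print(adjusted_p)

# F statistics reported in the text, each with df = (1, 90).
reported_f = {"enjoyment": 11.844, "anxiety": 7.667, "boredom": 7.149}
for outcome, f_value in reported_f.items():
    print(outcome, round(partial_eta_squared(f_value, 1, 90), 3))
```

Under the assumption that 0.009 and 0.040 are the third and fourth smallest of the six raw p-values, they are multiplied by 4 and 3 respectively, giving 0.036 and 0.120, the adjusted values reported above. The three F statistics give partial η² values of 0.116, 0.079, and 0.074.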

4.2 Q2: What student–Flexi interaction patterns are associated with changes in academic emotions during eighth-grade algebra lessons?

Data were collected from interviews with eighth-grade students and analyzed to understand how using the AI-powered Flexi app related to students’ emotions while learning algebra. Thematic analysis was used to identify main themes and sub-themes from recurring patterns in students’ responses. Pseudonyms were used instead of actual names. The following themes emerged.

4.2.1 Topic 1: enjoyment

4.2.1.1 The impact of interactivity in the application on students’ enjoyment and reducing boredom

The results indicated that the Flexi app helped relieve the boredom that students associated with traditional classes. Zaid explained, “Before Flexi, classes were very long and we had to spend a lot of time solving problems.” After using the app, he felt more relaxed and found classes more enjoyable, adding, “It made classes much more enjoyable.” Salma echoed this: “We got out of the boring classroom environment, and studying math the way we were used to changing the atmosphere. I started to love math. I started to feel that math is fun.” These statements suggest that integrating the Flexi app made the learning environment more engaging and stimulating for these students and increased their desire to learn mathematics.

4.2.1.2 The role of innovative methods in Flexi in enhancing the enjoyment of learning

Students described the Flexi app as offering new and fun elements that made learning algebra more enjoyable and less tedious. For example, Karim recounted a playful exchange with the app (“I made a joke with him… and he replied, ‘I love you,’ which made me happy”), while Sandra and others emphasized that they “felt happy and entertained” using it. In explaining their reasons, the students highlighted six specific features they believed increased enjoyment and reduced boredom. First, the app provided immediate explanatory feedback, enabling them to correct mistakes on the spot and continue working; Sandra noted that she could “fix them right away.” Second, the app’s adaptive pacing and progressive difficulty maintained an optimal challenge level without overwhelming them, as Sarab explained: “The levels got progressively harder” but remained manageable. Third, the availability of step-by-step hints and worked examples supported progress when students got stuck, as Karim explained. Fourth, entering equations by graphing was frequently described as “fun” and “different,” adding variety to problem-solving; Joanna commented that she “enjoyed solving by graphing; it felt like playing.” Fifth, students appreciated being able to check their solution after a set number of attempts, which reduced frustration and promoted self-regulation; Wafa explained that checking after a few attempts “reduced my stress.” Finally, the presence of multiple input modes (keyboard input, interactive options, drag and drop, and drawing) allowed students to choose their preferred method. A clear majority of students expressed a preference for drawing, describing it as the most fun and expressive method; Sarab, Sandra, Mohammed, and Hamza, like Joanna, emphasized its playful and engaging nature.

4.2.2 Topic 2: comfort

4.2.2.1 The application and its role in enhancing comfort during learning and preparing for exams

Sarah confirmed that the app helped her prepare for tests with greater ease, saying, “The teacher told us the importance of this question and that a similar question would appear on the test, which made me feel more comfortable and helped me solve this problem.” Hamza explained, “I felt comfortable when he showed me the steps to solve it.” These accounts suggest that, for some students, the Flexi app enhanced comfort and test preparation. By providing clear, timely guidance and structured exercises, the app appeared to reduce anxiety surrounding difficult material and tests, making students feel more comfortable and better prepared to face academic challenges. However, a subset of students also reported anxiety about occasional app errors (see sections 4.2.7 and 4.2.8).

4.2.2.2 The effect of learning simplification and performance efficiency in enhancing the sense of comfort

Findings indicated that several students perceived Flexi as simpler and more time-efficient than tools they had previously used (e.g., ChatGPT), though this impression was not based on a systematic comparison. Students’ accounts described easier solution processes and reduced time/effort. Mohammed noted, “I feel comfortable when it explains difficult problems that I do not understand… I do not spend a lot of time solving them; I just enter them into Flexi and solve them quickly.” For these students, Flexi functioned as a supportive tool that streamlined problem solving and fit efficiently into their study routines.

4.2.2.3 The role of application flexibility and multiple options in supporting a sense of comfort

Two aspects of the app, its compatibility with multiple devices and the diversity of its input methods, contributed to students’ comfort. At the hardware level, Karim explained: “On a laptop, you can enter from the math keyboard, and on a mobile phone, you can draw. There are many nice options to help.” Sandra pointed out the variety of methods by saying, “There are many ways to input questions: draw, insert an image, and the math keyboard. When one method does not work, I use another, and I really liked that.” Students reported that this flexibility and the multiple input options allowed them to choose their preferred modes, which they felt increased comfort and engagement.

4.2.2.4 The impact of performance accuracy and its suitability to educational methods on students’ comfort

Students described greater comfort when Flexi’s worked steps and final answers aligned with the teacher’s taught method; perceived mismatches or inaccuracies were linked to brief hesitation. Hamza noted, “I wrote the problem using the method the teacher explained, and the app solved it the same way—it was good.” By contrast, Amal remarked, “I expected it to give all answers correctly, but when the app is wrong, I feel a bit hesitant.” For these students, comfort appeared contingent on perceived accuracy and method-alignment, with temporary dips when accuracy was in doubt.

4.2.3 Topic 3: satisfaction

4.2.3.1 The impact of application performance efficiency on enhancing student satisfaction

Student Karim pointed out the effectiveness of the application, saying: “It helps us solve problems quickly and easily.” Student Sandra also emphasized the organization and accuracy of the application, saying: “It works accurately, organized, and easily.” These accounts suggest that students experienced the app as clear, well-organized, and efficient, which appeared to facilitate learning and reduce perceived effort.

4.2.3.2 The role of Flexi’s technical capabilities in supporting learner satisfaction

Students expressed satisfaction with Flexi’s technical features, particularly the instant translation option, which improved their understanding and reduced stress when encountering language difficulties. Sarab explained: “If you don’t understand English, you can translate it, and the app will show it in Arabic.” This feature allowed students to overcome language barriers easily, enabling them to continue learning without anxiety. In turn, it enhanced their satisfaction by allowing more efficient use of time and smoother engagement with the lesson content.

4.2.3.3 Perceived satisfaction and sense of accomplishment when using Flexi

Students reported feeling more satisfied and accomplished when they were able to complete assigned tasks and reach a clear understanding of the concepts; these moments were linked, for them, to greater self-confidence and motivation to keep improving in mathematics. As Joanna put it, “Okay, I understand the question, and I understand the answer.” For these students, perceived progress and successful completion appeared to underpin satisfaction.

4.2.4 Topic 4: shyness

4.2.4.1 Shyness associated with technical input challenges

Students described episodes of shyness when they struggled with input mechanics during class (e.g., locating symbols or typing quickly). As Amal put it: “I’m not fast at typing on the keyboard, or I do not know where a certain symbol is, so I ask someone next to me… That’s where I’m shy.” Joanna similarly noted: “When I get confused about the placement of symbols or type slowly on the keyboard, I have to ask my classmate to help me, and at that moment, honestly, I feel embarrassed. Also, Flexi did not recognize my classmate’s handwriting, nor did it recognize mine.” Taken together, these accounts suggest two pathways for shyness: (a) a felt dip in competence when technical actions cannot be completed quickly and independently, and (b) a social-evaluative concern about being seen to need help in front of peers.

4.2.4.2 Flexi’s role in reducing shyness during learning

This section highlights the role of the Flexi app in reducing students’ feelings of shyness, whether when they encounter difficulties in understanding or when interacting directly with the teacher.

a. Avoiding admitting misunderstanding. Karim explicitly mentioned his reluctance to tell the teacher that he found the lesson difficult: “Once there was a homework assignment… I did not understand the lesson, so I was embarrassed to tell the teacher that I did not understand.” Similarly, Mohammed admitted: “I used to be shy about asking questions in front of my classmates.” These accounts suggest that students often preferred to seek alternative routes to learning rather than request immediate clarification from the teacher, in order to avoid shyness or public criticism.

b. Resorting to the application to alleviate shyness. Karim explained how the Flexi app helped reduce his shyness in front of the teacher: “When I went home, I opened Flexi and told it that I did not understand the lesson… He explained it to me in more than one way… as if he were a teacher, but I wasn’t ashamed of him.” Likewise, Mohammed emphasized how Flexi created a safe, judgment-free space for practice: “I used to be shy about asking questions in front of my classmates, but with Flexi I was able to practice on my own without anyone knowing.” Together, these accounts highlight the role of the Flexi app in providing a private learning environment that reduces shyness in class and encourages students to continue their learning independently and with greater confidence.

4.2.5 Topic 5: pride

4.2.5.1 Autonomy as an enhancer of pride

Several students reported feeling proud when they could grasp the lesson and solve problems independently with Flexi, without immediately relying on the teacher. Karim explained: “I felt proud and more confident because I could understand the lesson with Flexi without asking the teacher right away; if something was missing, I went back to Flexi and it explained it.” For these students, such autonomous progress functioned as a source of personal accomplishment and confidence.

4.2.5.2 Assisting others in enhancing the sense of pride

Zaid expressed pride in both his proficiency with the app and his ability to support classmates, explaining: “I understood the application quickly and helped my colleagues… I was proud because I knew how to use Flexi.” He added: “I was the one who helped the most, and whenever I explained something to someone, I felt proud.” These accounts illustrate how students derived pride not only from mastering Flexi themselves but also from guiding peers, which enhanced their sense of competence and reinforced their social standing in the classroom.

4.2.5.3 The role of classroom interaction in developing a sense of academic pride

Results showed that classroom participation, both individual and in groups, played a role in fostering students’ sense of pride when using the Flexi app in algebra classes. Students expressed a deep sense of pride when they were able to actively participate in class activities, drawing on the knowledge they had acquired through the app. Hamza described this by saying: “He also made me participate in the class… When I answered, I felt proud that I was getting my information from Flexi… I started participating in the class… This made me feel like I was a very important person, and that’s how I was proud of myself. Now I’m confident in my answer.” This statement reflects the importance of social recognition and appreciation from teachers and peers in enhancing a student’s self-esteem and confidence in his or her abilities.

4.2.6 Topic 6: enthusiasm

4.2.6.1 Novel classroom experiences and students’ enthusiasm

Students reported heightened enthusiasm when the intervention introduced novel, student-led activities (e.g., brief presentations and creative algebra discussions). Such tasks made them feel more agentic and involved in the lesson. As Hamza noted, “This was the first time we used PowerPoint in math class. We presented and discussed algebra as if we were the teachers—it was great.” For these students, the new formats appeared to boost autonomy and relatedness (SDT) and to create moments where challenge and skill felt balanced (flow), which they associated with increased engagement and enthusiasm.

4.2.6.2 The impact of Flexi’s interactive environment on stimulating students’ enthusiasm

Students described Flexi’s interactive environment as highly motivating, noting that it increased their enthusiasm for mathematics by encouraging interaction, inquiry, and the confident exploration of new concepts. Mohammed illustrated this experience: “I feel more enthusiastic because I can always ask Flexi questions and get answers—it feels as if I am in class with a teacher.” Such accounts suggest that the integration of Flexi not only created a supportive digital space but also transformed mathematics lessons into stimulating and engaging experiences, fostering a sense of active participation and curiosity.

4.2.7 Topic 7: anxiety

4.2.7.1 Error-related anxiety and perceived implications for performance

Some students reported heightened anxiety when the app produced repeated or unexpected errors, voicing worries that such mistakes might carry over to tests. As Amal explained: “Once I entered equations for it to solve. I did not know how to solve them, and it did not solve them correctly. This made me nervous… I started thinking, ‘What if I have an exam and I get an important question wrong?’” For these students, such breakdowns appeared to undermine perceived control and competence, co-occurring with stress and momentary doubt about their readiness. While these accounts do not establish causal effects on exam outcomes, they indicate a perceived risk that may merit teacher mediation (e.g., verification steps, clarifying correct methods) to contain anxiety.

4.2.7.2 The impact of anxiety caused by mistakes on self-confidence and app usage

The results showed that anxiety caused by repeated mistakes undermined students’ self-confidence and generated feelings of uncertainty and confusion during problem solving. Sandra expressed this by saying: “When there is a question that I do not know how to solve, because sometimes the app makes mistakes, I get confused and do not know if I solved it correctly or not,” which reflected the impact of this anxiety on her confidence and performance while using the application. Similarly, Wafa reported a loss of trust in the app: “Sometimes I get nervous when it makes a mistake. Sometimes I don’t trust it because it makes mistakes.” Taken together, these accounts indicate that frequent errors in Flexi not only left students feeling anxious and stressed but also negatively affected their confidence in themselves and in the app.

4.2.7.3 The effect of the student’s difficulty in identifying the source of errors on emotions of anxiety

The results also showed that errors made by the app left some students confused about the source of the error, whether it was from the app or themselves, which led to a feeling of temporary helplessness in solving the problem. Student Hamza expressed this struggle by saying: “Because of the errors, sometimes I felt that there was a mistake in the application’s programming, or that the mistake was mine, or that I was taking the wrong picture, or that I had entered the problem wrong. I did not know where the error came from.” Ambiguity about the source of errors (app vs. user) appeared to heighten students’ anxiety and briefly hinder problem solving.

4.2.7.4 Varied responses to Flexi errors: from worry to acceptance

Students differed in how they reacted to app errors. Some voiced concern that incorrect answers could spill over into exam performance; others treated glitches as an expected part of using a digital tool. As Joanna noted: “Honestly, I don’t get too upset because it’s like that… the website helps me a lot. If it gives me the wrong answer, for example, I ask again and it gives me the correct answer.” Analytically, these reactions span transient anxiety when perceived control is reduced and a more accepting stance that normalizes breakdowns, consistent with short-lived adjustments rather than sustained effects.

4.2.8 Topic 8: discomfort

4.2.8.1 The impact of technical difficulties on increasing feelings of discomfort

Students explained that the application sometimes struggled to distinguish between mathematical operations such as multiplication and addition, which increased their feelings of confusion and discomfort during problem solving. Zaid noted: “What I do not like is that sometimes I take pictures. I do not understand the pictures well, and he tells me to show him another picture. And sometimes, for example, I write multiplication to him while I’m drawing the equation. I put in multiplication, and he calculates it as addition for me.” Similarly, Sandra commented: “Sometimes when Flexi did not read the equation correctly, I felt uncomfortable and stressed.” These accounts suggest that technical challenges—particularly inaccuracies in image recognition and operation differentiation—were associated with heightened discomfort and, for some students, brief disruptions to their learning.

4.2.8.2 Shortfalls in solution completeness and resulting discomfort

Students reported that Flexi did not reliably handle certain item types—especially fill-in-the-blank expressions/equations and items that required precise formatting—sometimes failing to produce or verify a complete solution despite repeated attempts. As Amal described: “I made sure it wrote the equation exactly as I did; the answers appeared, but there was still an error. I tried logging out and in again and re-entering the same equations, and it still showed an error.” These episodes were associated with confusion and a brief dip in confidence, which students tried to manage by re-attempting entries or switching input methods.

4.2.8.2.1 Technical error notes

Several students reported recurring technical issues when using Flexi (v2.0). These included misrecognition in drawing-based input, where the app did not accurately capture symbols on the first attempt, leading to re-drawing or switching to the math keyboard. There were also problems with parsing photographed notebook/textbook items on the first capture, often requiring re-capture or better lighting/cropping. Additionally, there were clear limitations with fill-in-the-blank items, such as 2(12x + 16x²) = 24x + 32x², where the app would sometimes return a persistent error despite repeated entries and changes in input method. Furthermore, for the textbook’s “Think like a mathematician” prompts, the app typically offered only idea-level guidance (hints/outlines) without a complete worked solution or final verification. This was especially noticeable for division-of-powers items such as $x^{5} \div x^{5}$, $y^{4} \div y^{4}$, $a^{b} \div a^{b}$, and $c^{d} \div c^{d}$. Reactions to these issues varied among students: some saw them as temporary glitches, while others experienced brief spikes in anxiety/discomfort and momentary dips in trust in the tool (see 4.2.7–4.2.8).

4.2.9 Topic 9: discontent with technology due to a lack of personal accomplishment

The results showed that some students expressed dissatisfaction with relying heavily on the Flexi app to solve problems. Hamza, for example, believed that learning requires personal effort to achieve understanding and accomplishment. He said: “When I use it a lot to solve problems, I feel a negative feeling. I feel uncomfortable that it is not my effort, but rather the application that solves them while I write in the notebook.” The quote reflects his discomfort at contributing little to the problem-solving process, which made the experience less satisfying.

5 Discussion

This discussion examines the results of a study that looked at how the CK-12 Flexi AI-powered application influenced the emotions of eighth-grade students while learning algebra. The study utilized both quantitative and qualitative data. Due to the study’s nonrandomized and short-duration design, we only report pretest-adjusted emotional differences, not causal effects. Qualitative insights are provided as associative, context-dependent explanations. The study did not measure achievement, so no claims about learning gains are made.

After controlling for pretest differences and adjusting for multiplicity across six outcomes (Holm–Bonferroni), enjoyment was significantly higher in the experimental group, F(1, 90) = 11.844, p_adj = 0.006, partial η² = 0.116. The 95% CI for the adjusted group difference excluded zero, and the effect was of medium size. Qualitatively, students consistently attributed this gain to six affordances of Flexi: (1) immediate explanatory feedback enabling on-the-spot correction; (2) adaptive pacing with graduated difficulty; (3) step-by-step hints and worked examples; (4) equation-drawing input that felt “fun” and varied; (5) answer verification after several attempts, supporting self-regulation; and (6) multiple input modes (keyboard, interactive choices, drag-and-drop, drawing). Mechanistically, these features plausibly increase perceived control and task value (CVT; Pekrun, 2006) while supporting autonomy and competence (SDT; Deci and Ryan, 1985). Previous studies also show that such technical affordances can enhance engagement and enjoyment while reducing boredom (e.g., Létourneau et al., 2025; Polydoros et al., 2025; Zong and Yang, 2025). In our data, Flexi also facilitated brief collaboration and peer discussion, aligning with social learning perspectives (Vygotsky, 1978) and recent reports on AI-enabled classrooms (Gao, 2024; Vistorte et al., 2024), suggesting that beyond its interface, the tool created opportunities for relatedness through teacher- and peer-mediated interactions.

Although between-group differences in pride were not significant after controlling for pretest and adjusting for multiplicity (Holm–Bonferroni; F(1, 90) = 1.155, p_adj = 0.570, partial η² = 0.013; 95% CI [−1.115, 1.000], crossing zero), the interviews revealed recurrent pride episodes tied to independent problem-solving (autonomy/competence) and to classroom recognition through participation and peer support (relatedness). In CVT terms, these moments reflect heightened perceived control and ownership of success (Pekrun, 2006). Because Flexi primarily supports individualized work, our SAMR-informed orchestration (student-led mini-presentations, sharing Flexi-generated solutions, guided whole-class discussions) helped transform private accomplishments into socially recognized achievements. Where such orchestration was absent, pride tended to remain individual and less socially anchored. Thus, pride was most evident when autonomy-supportive tool use was combined with teacher-mediated opportunities for social validation, aligning with SDT’s competence, autonomy, and relatedness needs (Deci and Ryan, 1985, 2000).

The qualitative data also revealed recurrent technical errors in Flexi v2.0—such as misrecognition in drawing-based input, failure to parse captured images, persistent errors in fill-in-the-blank items, and the inability to provide complete solutions or verification for the “Think like a Mathematician” prompts. Importantly, the app failed to solve this type of question across all algebra lessons, despite these being among the unit’s most challenging tasks. This limitation amplified students’ discomfort and, for some, led to brief declines in trust in the tool. From a CVT perspective, such breakdowns diminished perceived control and left task value unresolved. From an SDT lens, they frustrated core needs: autonomy was undermined by malfunctioning input channels; competence was thwarted when correct answers were unrecognized, incorrect answers were returned, or closure was missing; and relatedness was weakened when the app failed to provide sufficient support without teacher or peer mediation.

Quantitatively, anxiety was lower in the experimental group after controlling for the pretest and adjusting for multiplicity using the Holm–Bonferroni method (F(1, 90) = 7.667, p_adj = 0.035, partial η² = 0.079). The Hodges–Lehmann 95% confidence interval for the adjusted location shift [−2.427, 1.621] spanned zero, indicating imprecision in the point estimate. Therefore, we interpret this small-to-moderate effect cautiously, considering qualitative evidence that students adapted by switching input modes, improving photo capture, and normalizing glitches as manageable classroom obstacles. These processes are likely to raise perceived control and competence, thereby reducing anxiety (Pekrun, 2006, 2021).
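
As an illustration of what a Hodges–Lehmann interval summarizes, the shift between two groups is estimated as the median of all pairwise differences between their scores, and the interval is read off the ordered differences. The Python sketch below uses the standard large-sample (normal) approximation for the interval bounds. It is a simplified two-sample version that ignores the pretest adjustment described above, the function name is hypothetical, and the scores are toy values, not study data.

```python
from math import floor, sqrt
from statistics import NormalDist, median


def hodges_lehmann(x, y, confidence=0.95):
    """Shift estimate (x minus y) with an approximate confidence interval."""
    n, m = len(x), len(y)
    # All n * m pairwise differences, sorted in ascending order.
    diffs = sorted(a - b for a in x for b in y)
    estimate = median(diffs)

    # Large-sample approximation to the lower critical index (1-based).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    k = floor(n * m / 2 - z * sqrt(n * m * (n + m + 1) / 12))
    k = max(k, 1)  # guard for very small samples

    lower = diffs[k - 1]      # k-th smallest difference
    upper = diffs[n * m - k]  # k-th largest difference
    return estimate, lower, upper


# Hypothetical toy scores (not study data): the second group sits 2 points higher.
experimental = [1, 2, 3, 4, 5, 6, 7, 8]
control = [3, 4, 5, 6, 7, 8, 9, 10]

estimate, lower, upper = hodges_lehmann(experimental, control)
print(estimate, lower, upper)
```

An interval that contains zero, as reported for anxiety above, means the data are compatible with no shift between groups at the chosen confidence level, even when a rank-based test is significant.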

This pattern aligns partly with productive failure accounts (Kapur, 2008), where momentary setbacks trigger inquiry and strategy refinement, while also illustrating unproductive failure when closure remains unattained. Pedagogically, teacher scaffolding (e.g., verification routines, method-alignment checks, and alternative input pathways) is essential to convert technical setbacks into learning opportunities and to protect students’ control–value appraisals and basic-need satisfaction (Deci and Ryan, 2000; Ryan and Deci, 2017). These quantitative trends are consistent with recent K–12 interventions showing that AI-assisted learning can reduce math anxiety (Wang and Wei, 2025; Polydoros et al., 2025), although those studies involve different grade levels and implementation contexts.

Qualitatively, students’ interaction with Flexi followed consistent patterns that help explain the observed emotional shifts. Many described self-regulatory cycles (retrying items, switching input modes among drawing, keyboard, and photo, and using step-by-step hints/verification) that increased perceived control and competence, aligning with CVT/SDT and co-occurring with greater enjoyment and comfort (Pekrun, 2006; Deci and Ryan, 2000). When misrecognition or ambiguity about the error source occurred, students reported brief spikes in anxiety/discomfort and momentary dips in trust; these were often contained by SAMR-aligned orchestration (teacher clarification, quick whole-class checks, peer explanation), which restored control/value appraisals and basic-need satisfaction (Son, 2024). Several students also used Flexi as a privacy-preserving space to rehearse understanding before speaking in class, which reduced shyness and sometimes translated into peer support/tutoring (Ryan and Deci, 2017). Taken together, interaction patterns suggest that feature-supported autonomy and competence are associated with positive emotions, whereas technical glitches delineate boundary conditions where supportive classroom routines are needed to sustain engagement.

Beyond enjoyment and pride, the findings revealed that two interrelated factors fostered students’ enthusiasm. First, novel, student-led activities (e.g., short presentations and creative algebra discussions) enhanced their sense of autonomy and agency, consistent with Self-Determination Theory (Deci and Ryan, 2000). Second, Flexi’s interactive environment encouraged continuous questioning and exploration, which strengthened perceived control and task value (Control–Value Theory; Pekrun, 2006, 2021) and created moments where challenge and skill were well balanced, aligning with flow theory (Csikszentmihalyi, 1990). These results are consistent with recent findings showing that innovative digital tools can heighten students’ enthusiasm by fostering interactive, emotionally supportive spaces for learning (e.g., Hanin and Gay, 2023; Zong and Yang, 2025). Together, these pedagogical and technological elements contributed to more stimulating experiences (as reported by students), fostering active participation, curiosity, and sustained enthusiasm. Overall, enthusiasm emerged as the product of pedagogical–technological integration. Practically, it can be sustained through diversified classroom activities, calibrated challenge levels, and staged scaffolds that link individual progress to classroom recognition.

The findings point to a dual pathway whereby technology can both elicit and relieve shyness. Input frictions (e.g., slow keyboarding, symbol placement, or occasional misrecognition) temporarily lowered students’ perceived competence and heightened concern about peer evaluation, consistent with Control–Value Theory, which links reduced control appraisals to social anxiety (Pekrun, 2006). Conversely, Flexi offered a private, judgment-free space to admit confusion and practice discreetly, supporting competence and autonomy in line with Self-Determination Theory (Deci and Ryan, 2000). Quantitatively, before multiplicity adjustment, the Quade ANCOVA indicated a small reduction in shame, the questionnaire outcome closest to the shyness students described, in the experimental group (F(1, 90) = 4.337, p = 0.040, η² = 0.046). However, after Holm–Bonferroni adjustment across the six emotion outcomes, this effect did not remain significant (p_adj = 0.120), and the 95% confidence interval for the adjusted location shift crossed zero, so the reduction should be interpreted as suggestive rather than definitive. Taken together with the qualitative reports (which showed both shyness triggered by input difficulties and relief via private practice), a cautious conclusion is warranted: AI tools may increase shyness when they impede technical performance, yet can reduce it when they create safe, low-visibility spaces for “safe failure.” Pedagogically, design should minimize public input hurdles (symbol finders, tolerant recognition, optional math keyboard) and maximize private, scaffolded practice to protect students’ social face while building competence.

Boredom was lower in the experimental group after controlling for pretest and multiplicity, F(1,90) = 7.15, p = 0.009, p_adj = 0.036, partial η² = 0.074. The pretest-adjusted Hodges–Lehmann 95% CI for the group difference (Control − Experimental) was [−3.67, 0.22]; it agrees in direction with the Quade result but is more conservative, because the interval includes zero. We therefore interpret the reduction as statistically reliable under the Quade test but modest in size. Qualitatively, students rarely labeled their experience as “less bored”; instead, they emphasized engaging features such as immediate explanatory feedback, adaptive pacing, step-by-step hints, and varied input modes (including drawing). In terms of Control–Value Theory and flow, these affordances plausibly increase perceived control and skill-challenge fit, which helps to mitigate boredom. Taken together, the quantitative analysis shows a decrease in boredom, while in interviews students tend to frame the same shift as an increase in enjoyment and engagement, consistent with CVT’s view that these are distinct constructs that are negatively related.

Despite the safeguards we implemented to discourage over-reliance on Flexi (e.g., attempt-before-hint routines and teacher-led consolidation), a small subset of students nevertheless reported dissatisfaction when the tool was perceived as “doing the work.” In SDT terms, such episodes reflect autonomy and competence frustration; in CVT terms, they lower perceived control and task value, thereby dampening enjoyment/pride and elevating dissatisfaction. This underscores the need to maintain a balanced human–technology integration consistent with constructivist orchestration (Jonassen, 1994), positioning Flexi explicitly as a scaffold rather than a solver. Practically, guardrails such as staged (graduated) feedback, brief self-explanations before verification, and method-alignment checks help restore control–value appraisals and support autonomy/competence (Deci and Ryan, 2000; Pekrun, 2021).

Anger did not differ significantly between groups after Holm–Bonferroni adjustment across the six primary outcomes, F(1,90) = 0.150, p_adj = 0.699, partial η² = 0.002 (negligible). This pattern suggests either that anger was infrequent or that any between-group difference was too small to detect, potentially reflecting features of the app such as private practice and non-evaluative feedback that limit escalation into overt anger.
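The effect sizes quoted for shyness, boredom, and anger can be cross-checked against their F statistics. The sketch below assumes the standard conversion from F to partial η² for a single effect (as described, for example, in Lakens, 2013); the function name is illustrative and is not taken from the authors' SPSS workflow.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df values positive")
    # partial eta^2 = SS_effect / (SS_effect + SS_error) = F*df1 / (F*df1 + df2)
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics as reported in the text, each with df = (1, 90).
reported_f = {"shyness": 4.337, "boredom": 7.15, "anger": 0.150}

effect_sizes = {
    name: round(partial_eta_squared(f, 1, 90), 3)
    for name, f in reported_f.items()
}
print(effect_sizes)
```

Rounded to three decimals, the recomputed values agree with the partial η² figures reported for these three outcomes.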

While the Flexi app created a generally supportive and engaging learning environment, the results also highlight considerations related to equity. Students who experienced difficulties with typing speed, symbol entry, or the app’s recognition of handwritten input expressed feelings of shame and shyness, suggesting that the level of emotional support was not equal for all learners. These technological barriers may have disproportionately affected students with weaker digital skills, limiting their opportunities for equal participation. However, features such as the real-time translation option reduced language-related barriers and fostered comfort without shyness, thereby enabling more equitable participation. Together, these findings suggest that while AI-based tools can enhance positive emotions and engagement, care must be taken to ensure that technological design does not unintentionally disadvantage some learners. Therefore, future developments should prioritize inclusive features that reduce technological barriers and support equal emotional engagement across diverse students.

Taken together, the findings reinforce CVT’s core proposition: positive emotions arise when students perceive high control and value, whereas negative emotions reflect diminished perceptions of these appraisals. This study extends prior work in three ways. First, it provides feature-level evidence that specific AI affordances—such as immediate explanatory feedback, adaptive pacing, and multiple input modes—are directly associated with increased enjoyment and reduced boredom and anxiety, linking technological design to psychological outcomes. Second, by triangulating quantitative ANCOVA analyses with inductive qualitative data interpreted through CVT, SDT, flow, broaden-and-build, and social learning lenses in a middle-school algebra setting in Palestine, it strengthens the empirical base in a context where little prior work exists. Third, it specifies boundary conditions: technical glitches and ambiguity about whether errors stem from the app or the user were associated with transient anxiety spikes and brief dips in trust. This highlights that the emotional benefits of AI-supported learning are context-dependent and contingent on both technological reliability and pedagogical orchestration. Collectively, these results extend accounts emphasizing competence-supportive, autonomy-enhancing feedback, and they nuance optimistic claims about AI in education by clarifying when, how, and for whom emotional benefits are most likely to materialize.

6 Conclusions and recommendations

The AI-powered Flexi application was associated with a positive emotional impact on eighth-grade students’ learning of algebra. The experimental group reported higher enjoyment and lower anxiety and boredom than the control group. The reduction in shyness did not remain significant after Holm–Bonferroni adjustment, and no significant differences were observed for pride or anger. Qualitatively, students expressed a wide range of positive emotions (happiness, relief, and enthusiasm) associated with improved understanding, increased autonomy, and enhanced group interaction. Limited negative emotions, such as anxiety and annoyance, were also evident and were associated with technical difficulties or poor self-engagement. Together, these findings support the potential of smart learning environments to foster a positive emotional climate, while emphasizing the need to consider individual differences and incorporate elements of emotional intelligence into app design to ensure balanced learning environments that accommodate diverse learners.

Based on these findings, we recommend designing interactive learning environments that foster positive emotions by developing smart applications capable of enhancing enjoyment and emotional interaction, thereby increasing student engagement in mathematics learning. A culture of productive failure should be promoted, encouraging students to learn from mistakes and transform them into opportunities to strengthen emotions of pride and accomplishment. We also recommend advancing emotional AI techniques that adapt content to students’ emotional states, thus reducing stress and supporting psychological well-being. Further research should analyze the impact of varying emotions on academic performance, especially in mathematics, and examine the relationship between positive and negative emotions and their cumulative effects. Finally, it is important to explore different learning patterns and their influence on motivation and positive emotions when using smart applications, and to investigate the long-term sustainability of emotions associated with continuous use of AI-based learning environments.

Within this broader agenda, the study emphasizes the importance of further developing the Flexi application itself through substantial enhancements. These enhancements include improving symbol-recognition algorithms in drawing-based input to minimize confusion between mathematical operations, strengthening visual-parsing capabilities with first-attempt reliability and image-quality alerts, resolving persistent errors in fill-in-the-blank items by offering alternative input pathways and clearer explanations, and tightening the answer-validation mechanism to ensure accurate recognition of students’ solutions while avoiding misleading outputs.

Additionally, the application should be upgraded to provide “Think like a Mathematician” items with fully worked solutions and final verification. These tasks represent high cognitive challenges, and the tool’s inability to solve them across all algebra lessons intensified students’ frustration. Implementing these improvements would not only reduce students’ momentary frustration but also better satisfy their psychological needs for autonomy, competence, and relatedness. This alignment with the principles of Self-Determination Theory would support more engaging and resilient learning experiences.

Conceptually, increasing accuracy and system transparency is expected to raise perceived control and task value, thereby lowering anxiety in line with Control–Value Theory. Simultaneously, this would strengthen competence and autonomy per Self-Determination Theory, clarifying how technical refinements translate into emotional gains.

Future research should replicate these findings in public schools and other Palestinian/Arabic and non-Arabic contexts, across additional grades and subjects, and over longer durations. Methodologically, randomized or stepped-wedge designs with multilevel modeling, pre-registered analyses, and adequate power are recommended. Feature-level experiments (A/B) are suggested to isolate which Flexi affordances drive emotional change. Triangulation of AEQ-M with learning-log analytics and brief ecological momentary assessment (EMA), tests of Arabic measurement invariance, moderator analyses (gender, ability, language proficiency, tech self-efficacy, teacher/class effects), and equity/feasibility work (access, professional development, cost, privacy) are also recommended. Head-to-head comparisons with alternative AI tools and integration frameworks beyond SAMR would further clarify when, how, and for whom emotional benefits are most likely to hold.

6.1 Limitations

Despite steps to enhance rigor, several factors qualify interpretation. The sample comprises two Grade-8 cohorts in private schools in Nablus (Cambridge curriculum). Device availability, English-medium instruction, and assessment practices may differ from public schools, other grades/subjects, or other regions. Because recruitment relies on convenience sampling, the sample may overrepresent students with greater device access or comfort with educational technology. Generalization is therefore limited to comparable school profiles. School-level quasi-experimental assignment leaves open unobserved school/teacher differences—even after pretest baseline equivalence—so residual confounding cannot be ruled out. To minimize bias, we (i) assign at the school level (avoiding student self-selection), (ii) use the same teacher for both sections within each school, (iii) standardize lesson objectives/pacing and assessment windows, (iv) standardize AEQ-M administration (timing/instructions/language), and (v) restrict access to Flexi in the experimental school to reduce contamination.

Psychometrics and construct/statistical-conclusion validity also pose constraints. Internal consistencies are acceptable (α = 0.70–0.81), but we do not conduct a CFA or formal cross-cultural measurement-invariance testing. The very high negative correlation between pride and anxiety (|r| = 0.826) suggests construct overlap in this context, which limits discriminant validity and motivates future work using CFA (six-factor vs. higher-order/bifactor), HTMT/Fornell–Larcker checks, and potential item/translation refinements with multi-method triangulation. Because multiple emotion outcomes are tested, we apply a Holm–Bonferroni adjustment across the six AEQ-M subscales and report both unadjusted and adjusted p values (p_adj) alongside effect sizes (partial η²) and 95% confidence intervals to convey precision. Temporal and model dependence further qualify interpretation. The four-week exposure captures short-term (potentially novelty-sensitive) shifts, and longer studies may reveal different trajectories (e.g., waning novelty, delayed pride/competence gains). Integration follows a SAMR-aligned redesign using CK-12 Flexi 2.0; portability to other tools/versions or to alternative integration frameworks remains uncertain and warrants multi-site, longer-duration replications and direct tool comparisons.
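The Holm–Bonferroni step-down adjustment mentioned above can be illustrated with a short sketch. It assumes the usual definition (sort the m unadjusted p values, multiply the k-th smallest by m − k + 1, enforce monotonicity, cap at 1). Only the boredom, shyness, and anger p values come from the text; the enjoyment, anxiety, and pride inputs are hypothetical placeholders chosen solely to fill the six slots, and they are not the study's values.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiplier is m for the smallest p, m - 1 for the next, ..., 1 for the largest.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # adjusted values must not decrease
        adjusted[idx] = running_max
    return adjusted


unadjusted = {
    "enjoyment": 0.001,  # HYPOTHETICAL placeholder, not reported in this section
    "anxiety": 0.002,    # HYPOTHETICAL placeholder, not reported in this section
    "boredom": 0.009,    # reported
    "shyness": 0.040,    # reported
    "pride": 0.300,      # HYPOTHETICAL placeholder, not reported in this section
    "anger": 0.699,      # reported
}

adjusted = dict(zip(unadjusted, holm_adjust(list(unadjusted.values()))))
for name, p_adj in adjusted.items():
    print(f"{name}: p_adj = {p_adj:.3f}")
```

Under these assumed ranks, the reported unadjusted values for boredom, shyness, and anger reproduce the adjusted values given in the discussion (multipliers of 4, 3, and 1, respectively).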

After data collection, we conduct a limited post hoc probe of item families that prove problematic in the trial (fill-in expressions/equations, “Think like a Mathematician,” and symbol disambiguation in drawing input). In the current Flexi release, symbol recognition (× vs. +) improves and several “Think like a Mathematician” prompts yield verifiable solutions; however, persistent failures remain for fill-in items (e.g., □(12x + □) = 24x + 32x²). These observations are descriptive and outside the original protocol; they do not alter the experimental findings but suggest some issues may be version-dependent. We recommend systematic re-evaluation of the updated release in future studies (see Appendix S2).

Qualitative coding is conducted by a single primary coder, so formal intercoder reliability (κ/α) cannot be assessed. Credibility is supported through supervisory reviews, an audit trail, and peer debriefing, though these are not substitutes for dual coding. Member checking is not feasible due to timetable and privacy constraints; triangulation with quantitative results helps mitigate this limitation. The interviews and thematic analysis are designed to elicit shared mechanisms of emotional change with Flexi and are not pre-planned for systematic comparisons by gender or achievement level. Accordingly, we do not perform differential coding matrices (by gender/ability), and the analysis focuses solely on patterns common across participants.

De-identified materials are available in an anonymized OSF repository for peer review: [https://osf.io/ywrtb/files/osfstorage/?view_only=3d0806fa8f6842729eade0309e008b1e]. The repository includes the AEQ-M items and scoring key, the interview protocol, and an SPSS analysis workflow (Quade ANCOVA, Holm–Bonferroni, Hodges–Lehmann 95% CIs). Raw individual-level data and audio files are not shared for privacy reasons.
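For readers without SPSS, the rank-based logic of Quade's nonparametric ANCOVA can be sketched as follows. This is a minimal illustration based on the commonly described procedure (rank the covariate and the outcome across all participants, regress the centered outcome ranks on the centered covariate ranks, then run a one-way ANOVA on the residuals); it is not the authors' workflow. The function names and the toy scores are hypothetical, and the p value is omitted because the Python standard library has no F distribution.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        tied_rank = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = tied_rank
        pos = end + 1
    return ranks


def quade_f(pre, post, groups):
    """Return (F, (df_between, df_within)) for a Quade-style rank ANCOVA."""
    rank_pre, rank_post = average_ranks(pre), average_ranks(post)
    x = [r - mean(rank_pre) for r in rank_pre]    # centered covariate ranks
    y = [r - mean(rank_post) for r in rank_post]  # centered outcome ranks
    sxx = sum(v * v for v in x)
    if sxx == 0:
        raise ValueError("covariate has no variation")
    slope = sum(a * b for a, b in zip(x, y)) / sxx
    residuals = [b - slope * a for a, b in zip(x, y)]

    # One-way ANOVA on the residuals, with group as the factor.
    labels = sorted(set(groups))
    grand = mean(residuals)
    ss_between = ss_within = 0.0
    for label in labels:
        members = [r for r, g in zip(residuals, groups) if g == label]
        group_mean = mean(members)
        ss_between += len(members) * (group_mean - grand) ** 2
        ss_within += sum((r - group_mean) ** 2 for r in members)
    df_between, df_within = len(labels) - 1, len(post) - len(labels)
    return (ss_between / df_between) / (ss_within / df_within), (df_between, df_within)


# HYPOTHETICAL toy scores for six students; not data from the study.
pre = [10, 12, 14, 11, 13, 15]
post = [11, 12, 15, 14, 16, 18]
groups = ["control"] * 3 + ["experimental"] * 3
f_stat, dfs = quade_f(pre, post, groups)
print(f"F{dfs} = {f_stat:.3f}")
```

With two groups and N participants the residual ANOVA has (1, N − 2) degrees of freedom, which matches the F(1,90) reported for the 92 students in this study.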

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the Institutional Review Board (IRB) of An-Najah National University. Parental/guardian consent and student assent were secured. Participation was voluntary with the right to withdraw. Data were de-identified and reported in aggregate, in accordance with institutional guidelines and the Declaration of Helsinki. No identifiable human images are included.

Author contributions

AO: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing. WD: Supervision, Writing – review & editing. NB: Supervision, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that Generative AI was used in the creation of this manuscript. Generative AI (ChatGPT, OpenAI) was used solely for language editing and limited paraphrasing of the authors’ own text. All content was reviewed and verified by the authors. No AI-generated analyses, data, images, or references were used, and no confidential or personally identifiable information was entered into the tool.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/feduc.2025.1669360/full#supplementary-material

References

Alvarez, J. (2024). Evaluating the impact of AI-powered tutors MathGPT and Flexi 2.0 in enhancing calculus learning. Jur. Ilmh. Ilm. Ter. Un. Ja. 8, 495–508. doi: 10.22437/jiituj.v8i2.34809

Bieleke, M., Goetz, T., Yanagida, T., Botes, E., Frenzel, A. C., and Pekrun, R. (2023). Measuring emotions in mathematics: the achievement emotions questionnaire—mathematics (AEQ-M). ZDM Math. Educ. 55, 269–284. doi: 10.1007/s11858-022-01425-8

Boekaerts, M. (2010). “The crucial role of motivation and emotion in classroom learning” in The nature of learning: using research to inspire practice. eds. H. Dumont, D. Istance, and F. Benavides (Paris: OECD Publishing), 91–111.

Braun, V., and Clarke, V. (2006). Using thematic analysis in psychology. Qual. Res. Psychol. 3, 77–101. doi: 10.1191/1478088706qp063oa

Chen, J. J., and Lin, J. C. (2024). Artificial intelligence as a double-edged sword: wielding the POWER principles to maximize its positive effects and minimize its negative effects. Contemp. Issues Early Child. 25, 146–153. doi: 10.1177/14639491231196238

CK-12. (2024). About us. Available online at: https://www.ck12info.org/about-us/ (accessed September 4, 2025).

Creswell, J. W., and Clark, V. L. P. (2017). Designing and conducting mixed methods research. New York: Sage Publications.

Csikszentmihalyi, M. (1990). Flow: the psychology of optimal experience. New York: Harper & Row.

D’Mello, S., and Graesser, A. (2012). Dynamics of affective states during complex learning. Learn. Instr. 22, 145–157. doi: 10.1016/j.learninstruc.2011.10.001

Daher, W. (2015). Discursive positionings and emotions in modelling activities. Int. J. Math. Educ. Sci. Technol. 46, 1149–1164. doi: 10.1080/0020739X.2015.1036833

Daher, W., and Abu Thabet, E. (2025). Students’ motivation in the artificial intelligence environment: a systematic review. Int. J. Interact. Mob. Technol. 19, 66–79. doi: 10.3991/ijim.v19i11.46281

Damasio, A. R. (2004). “Emotions and feelings” in Feelings and emotions: the Amsterdam symposium. eds. A. S. R. Manstead, N. H. Frijda, and A. H. Fischer (Cambridge: Cambridge University Press), 49–57.

Deci, E. L., and Ryan, R. M. (1985). The general causality orientations scale: self-determination in personality. J. Res. Pers. 19, 109–134. doi: 10.1016/0092-6566(85)90023-6

Deci, E. L., and Ryan, R. M. (2000). The ‘what’ and ‘why’ of goal pursuits: human needs and the determination of behavior. Psychol. Inq. 11, 227–268. doi: 10.1207/S15327965PLI1104_01

Ellis, H. C., and Ashbrook, P. W. (1989). The "state" of mood and memory research: a selective review. J. Soc. Behav. Pers. 4, 1–21.

Fernández-Herrero, J. (2024). Evaluating recent advances in affective intelligent tutoring systems: a scoping review of educational impacts and future prospects. Educ. Sci. 14:839. doi: 10.3390/educsci14080839

Gao, S. (2024). Can artificial intelligence give a hand to open and distributed learning? A probe into the state of undergraduate students’ academic emotions and test anxiety in learning via ChatGPT. Int. Rev. Res. Open Dis. Learn. 25, 199–218. doi: 10.19173/irrodl.v25i3.7654

Graphpad, I., and Ghoodjani, A. (2023). Quade Nonparametric ANCOVA. Available online at: https://www.researchgate.net/publication/376595667_Quade_Nonparametric_ANCOVA (accessed September 4, 2025).

Hanin, V., and Gay, P. (2023). Comparative analysis of students’ emotional and motivational profiles in mathematics in grades 1–6. Front. Educ. 8:1117676. doi: 10.3389/feduc.2023.1117676

Jonassen, D. H. (1994). Thinking technology: toward a constructivist design model. Educ. Technol. 34, 34–37.

Kapur, M. (2008). Productive failure. Cogn. Instr. 26, 379–424. doi: 10.1080/07370000802212669

Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front. Psychol. 4:863. doi: 10.3389/fpsyg.2013.00863

Létourneau, A., Deslandes Martineau, M., Charland, P., Karran, J. A., Boasen, J., and Léger, P. M. (2025). A systematic review of AI-driven intelligent tutoring systems (ITS) in K-12 education. NPJ Sci. Learn. 10:29. doi: 10.1038/s41539-025-00320-7

Mallinckrodt, B., and Wang, C. C. (2004). Quantitative methods for verifying semantic equivalence of translated research instruments: a Chinese version of the experiences in close relationships scale. J. Couns. Psychol. 51, 368–383. doi: 10.1037/0022-0167.51.3.368

Meece, J. L., Wigfield, A., and Eccles, J. S. (1990). Predictors of math anxiety and its influence on young adolescents' course enrollment intentions and performance in mathematics. J. Educ. Psychol. 82, 60–70. doi: 10.1037/0022-0663.82.1.60

Mehigan, T., and Pitt, I. (2019). “Engaging learners through emotion in artificially intelligent environments” in EDULEARN19 proceedings. ed. T. Mehigan (Spain: IATED), 5661–5668.

Middleton, J. A., Wiezel, A., Jansen, A., and Smith, E. P. (2023). Tracing mathematics engagement in the first year of high school: relationships between prior experience, observed support, and task-level emotion and motivation. ZDM Math. Educ. 55, 427–445. doi: 10.1007/s11858-023-01496-3

Muis, K. R., Chevrier, M., and Singh, C. A. (2018). The role of epistemic emotions in personal epistemology and self-regulated learning. Educ. Psychol. 53, 165–184. doi: 10.1080/00461520.2017.1421465

Pekrun, R. (2006). The control-value theory of achievement emotions: assumptions, corollaries, and implications for educational research and practice. Educ. Psychol. Rev. 18, 315–341. doi: 10.1007/s10648-006-9029-9

Pekrun, R. (2021). Teachers need more than knowledge: why motivation, emotion, and self-regulation are indispensable. Educ. Psychol. 56, 312–322. doi: 10.1080/00461520.2021.1991356

Pekrun, R., Goetz, T., Frenzel, A. C., Barchfeld, P., and Perry, R. P. (2011). Measuring emotions in students’ learning and performance: the achievement emotions questionnaire (AEQ). Contemp. Educ. Psychol. 36, 36–48. doi: 10.1016/j.cedpsych.2010.10.002

Pekrun, R., Goetz, T., Titz, W., and Perry, R. P. (2002). Academic emotions in students' self-regulated learning and achievement: a program of qualitative and quantitative research. Educ. Psychol. 37, 91–105. doi: 10.1207/S15326985EP3702_4

Pekrun, R., Lichtenfeld, S., Marsh, H., Murayama, K., and Goetz, T. (2017). Achievement emotions and academic performance: longitudinal models of reciprocal effects. Child Dev. 88, 1653–1670. doi: 10.1111/cdev.12704

Pekrun, R., and Perry, R. P. (2014). “Control-value theory of achievement emotions” in International handbook of emotions in education. eds. R. Pekrun and L. Linnenbrink-Garcia (New York: Routledge), 120–141.

Polydoros, G., Galitskaya, V., Pergantis, P., Drigas, A., Antoniou, A.-S., and Beazidou, E. (2025). Innovative AI-driven approaches to mitigate math anxiety and enhance resilience among students with persistently low performance in mathematics. Psychol. Int. 7:46. doi: 10.3390/psycholint7020046

Roda-Segarra, J., Mengual-Andrés, S., and Payà, R. A. (2024). Analysis of social metrics on scientific production in the field of emotion-aware education through artificial intelligence. Front. Artif. Intell. 7:1401162. doi: 10.3389/frai.2024.1401162

Ryan, R. M., and Deci, E. L. (2017). Self-determination theory: basic psychological needs in motivation, development, and wellness. New York, NY: Guilford Press.

Schirmer, A. (2015). Emotion. London: Sage Publications.

Schukajlow, S., Krawitz, J., Kanefke, J., Blum, W., and Rakoczy, K. (2023). Open modelling problems: cognitive barriers and instructional prompts. Educ. Stud. Math. 114, 417–438. doi: 10.1007/s10649-023-10196-y

Son, T. (2024). Intelligent tutoring systems in mathematics education: a systematic literature review using the substitution, augmentation, modification, redefinition model. Computers 13:270. doi: 10.3390/computers13100270

Tavakol, M., and Dennick, R. (2011). Post-examination analysis of objective tests. Med. Teach. 33, 447–458. doi: 10.3109/0142159X.2011.564682

TIMSS and PIRLS International Study Center. (2024). TIMSS 2023 encyclopedia: education policy and curriculum in mathematics and science – Palestine chapter. Available online at: https://timss2023.org/encyclopedia/palestine/ (accessed September 4, 2025).

Vistorte, A. O. R., Deroncele-Acosta, A., Martín Ayala, J. L., Barrasa, A., López-Granero, C., and Martí-González, M. (2024). Integrating artificial intelligence to assess emotions in learning environments: a systematic literature review. Front. Psychol. 15:1387089. doi: 10.3389/fpsyg.2024.1387089

Vygotsky, L. S. (1978). Mind in society: the development of higher psychological processes. Cambridge: Harvard University Press.

Wang, H., Tlili, A., Huang, R., Cai, Z., Li, M., Cheng, Z., et al. (2023). Examining the applications of intelligent tutoring systems in real educational contexts: a systematic literature review from the social experiment perspective. Educ. Inf. Technol. 28, 9113–9148. doi: 10.1007/s10639-022-11555-x

Wang, X., and Wei, Y. (2025). The influence of gen-AI-assisted learning on primary school students’ math anxiety: an intervention study. Appl. Cogn. Psychol. 39:e70088. doi: 10.1002/acp.70088

Zong, Y., and Yang, L. (2025). How AI-enhanced social–emotional learning framework transforms EFL students' engagement and emotional well-being. Eur. J. Educ. 60:e12925. doi: 10.1111/ejed.12925

Keywords: artificial intelligence in education, flexi application, academic emotions, teaching algebra, eighth grade (grade 8)

Citation: Omar A, Daher W and Bayaa N (2025) Academic emotions of eighth grade students in algebra classrooms using an artificial intelligence learning environment. Front. Educ. 10:1669360. doi: 10.3389/feduc.2025.1669360

Received: 19 July 2025; Accepted: 16 September 2025;
Published: 30 September 2025.

Edited by:

Michelle Finestone, University of Pretoria, South Africa

Reviewed by:

Mariacarla Martí-González, Complutense University of Madrid, Spain
Haifa Jammeli, NEOMA Business School, France

Copyright © 2025 Omar, Daher and Bayaa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Amal Omar
