
ORIGINAL RESEARCH article

Front. Psychol., 19 December 2025

Sec. Educational Psychology

Volume 16 - 2025 | https://doi.org/10.3389/fpsyg.2025.1710203

Enhancing peer teaching and psychological outcomes in medical education through structured formative assessment: a quasi-experimental study

Dawei Zhang1, Kuibo Zhang1, Junquan Chen1, Binfang Shang2, Zhongzhen Su3, Qiang Wu4, Lianjun Yang1, Lili Xie2, Hai Lv1*
  • 1Department of Spine Surgery, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
  • 2Department of Continuing Education, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
  • 3Department of Ultrasound, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
  • 4The First People's Hospital of Shaoguan City, Shaoguan, China

Background: Peer-assisted learning (PAL) is an established approach in medical education, yet variability in teaching quality persists when peer tutors lack structured pedagogical support. This study examined whether integrating a structured formative assessment framework could enhance peer tutors’ teaching performance and teaching self-efficacy, reduce their teaching anxiety, and improve first-year students’ knowledge, academic motivation, and self-efficacy.

Methods: A quasi-experimental, parallel-group study was conducted in three medical universities in Guangdong, China (2024–2025). Final-year medical students (n = 122) served as peer tutors and were allocated to an intervention group (n = 61) or a control group (n = 61), with each tutor supervising 6–8 first-year students (initial first-year enrollment = 850; final analytic sample = 820: intervention n = 411, control n = 409). The intervention integrated validated formative assessment tools—the Mini-Clinical Evaluation Exercise (Mini-CEX), Direct Observation of Procedural Skills (DOPS), and Reflective Teaching Journals—alongside faculty feedback and self-reflection. Data were analyzed using linear mixed-effects models (LMMs) accounting for student–tutor nesting, and false discovery rate (FDR) correction was applied to control for multiple testing.

Results: Significant group × time interactions favored the intervention group across all outcomes. Students taught by intervention-group tutors showed higher post-test knowledge (7.93 ± 0.31 vs. 5.28 ± 1.10; F = 54.9, p < 0.001, η2 = 0.18) and greater academic motivation and self-efficacy (both p < 0.001). Peer tutors demonstrated higher teaching self-efficacy (72.48 ± 5.73 vs. 69.13 ± 5.91; F = 22.3, p < 0.001, η2 = 0.15) and lower teaching anxiety (2.34 ± 0.33 vs. 2.65 ± 0.39; F = 17.1, p < 0.001, η2 = 0.14). Post-test performance measures (PES-TBL, Mini-CEX, DOPS) were consistently higher in the intervention group (all p < 0.001). Qualitative reflections revealed challenges in communication and confidence but documented progressive improvements in interaction and teaching clarity.

Conclusion: This study provides preliminary evidence that integrating structured formative assessment into peer-assisted learning enhances tutors’ instructional competence, strengthens self-efficacy, and reduces teaching anxiety, while simultaneously improving students’ motivation and learning outcomes. Embedding formative assessment within PAL may represent a feasible and scalable strategy to improve teaching quality in medical education.

1 Introduction

Peer-assisted learning (PAL) has become an integral component of contemporary medical education, offering senior students opportunities to develop instructional competencies, leadership, and communication skills while supporting the academic development of junior learners (Brierley et al., 2022). Evidence across medical institutions demonstrates that PAL can enhance academic achievement, conceptual understanding, and learner engagement (Friel et al., 2018; Slabbert, 2024). PAL may also strengthen tutors’ professional confidence and reflective teaching practice, contributing to both pedagogical and psychological development (Pierce et al., 2024). In recent years, peer teaching has expanded within Asian medical schools, including China, where student-centered and interactive learning approaches have been increasingly incorporated into curricula (Yao et al., 2025).

Despite these documented benefits, concerns remain regarding the consistency and overall quality of peer-delivered instruction. Teaching effectiveness in PAL often depends on tutors’ prior experience and their ability to explain scientific content clearly and interactively (Herrmann-Werner et al., 2017). Because most peer tutors lack formal pedagogical preparation and structured support systems, the quality of peer-led sessions can vary considerably (Larios-Jones et al., 2024). These challenges have stimulated interest in strategies that can enhance peer tutors’ instructional competence and ensure that peer teaching achieves its intended educational value.

One promising strategy is the integration of structured formative assessment within peer tutoring processes. Formative assessment—which includes ongoing feedback, guided reflection, and constructive review—has been shown to improve teaching quality and learner outcomes in faculty-led educational settings (Morris et al., 2021; Irons and Elkington, 2021). However, its potential for strengthening peer tutors’ instructional skills has been insufficiently explored.

Although peer teaching has gained traction in Chinese medical universities and contributes to students’ autonomy and collaborative learning (Zhu et al., 2024; Wang et al., 2025), limited empirical evidence exists on whether structured formative assessment can enhance peer tutors’ teaching performance and improve students’ learning experiences in this context. Addressing this gap is essential for developing evidence-informed frameworks that support high-quality peer instruction and sustained tutor development.

Guided by Bandura’s Social Cognitive Theory (SCT) and Self-Determination Theory (SDT) (Martin and Guerrero, 2020), the present study examines whether integrating structured formative assessment into PAL improves peer tutors’ teaching self-efficacy and reduces teaching anxiety, while also enhancing first-year students’ academic motivation and academic self-efficacy. SCT highlights the role of mastery experiences, feedback, and social modeling in shaping self-efficacy (Woreta et al., 2025), whereas SDT emphasizes the importance of competence and autonomy in fostering intrinsic motivation (Ryan and Deci, 2020). Accordingly, the intervention incorporated structured feedback and reflective components designed to promote tutors’ mastery and self-regulated teaching.

Although the advantages of PAL are well documented, little is known about how structured formative assessment influences teaching behaviors, psychological outcomes, and student learning within peer-teaching environments. This study addresses this gap by evaluating a structured formative assessment framework implemented across three medical universities in China. The findings aim to inform the development of scalable, evidence-based approaches that strengthen peer tutors’ instructional performance and enhance the quality of peer-assisted learning.

2 Methods and materials

2.1 Participants and study design

This study employed a quasi-experimental, parallel-group, mixed-methods design across three medical universities in Guangdong Province, China, during the second semester of the 2024–2025 academic year. Participants included final-year undergraduate medical students serving as peer tutors and first-year medical students who received peer teaching. All sessions took place in medical education units and clinical-skills classrooms at the participating universities.

Participants were recruited through convenience sampling in collaboration with teaching and learning offices at each institution. Although recruitment was voluntary and therefore non-random, peer tutors were allocated to the intervention or control group using a computer-generated randomization list created by an independent researcher. Allocation concealment was ensured using sealed, opaque envelopes opened only after enrollment.

First-year students were also recruited by convenience sampling from the same courses or departments where peer-assisted teaching sessions were scheduled. Each peer tutor was paired with six to eight first-year students based on class schedule to minimize disruption to routine teaching. All first-year students followed the same group allocation as their respective tutor. To minimize evaluation bias, faculty raters conducting Mini-CEX and DOPS assessments were blinded to group allocation and completed a two-hour calibration workshop prior to data collection. PES-TBL ratings were student-reported and therefore could not be blinded.

Inclusion criteria for peer tutors were: final-year medical student, at least one year of prior peer-teaching experience, willingness to participate in structured formative assessment training, and availability to deliver at least six peer teaching sessions. Peer tutors were excluded if they withdrew voluntarily, missed required training, or failed to participate in the Mini-CEX and DOPS assessments.

First-year medical students were eligible if they were enrolled in a relevant course where peer teaching occurred and provided written informed consent. Students were excluded if they missed more than one required session or withdrew voluntarily.

Sample size was calculated using G*Power 3.1 for a two-group repeated-measures design analyzed with linear mixed-effects models (LMM). Assuming a medium effect size (Cohen’s d = 0.5), 80% power, and α = 0.05, at least 51 tutors per group were required. Allowing 15% attrition, 60 tutors were recruited for each group. One additional volunteer was included, resulting in 61 tutors per group (N = 122). With each tutor supervising six to eight students, the estimated total number of first-year participants ranged from 720 to 960.
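The authors report the G*Power calculation only in summary. As an illustrative cross-check (a minimal sketch, not a reproduction of G*Power's repeated-measures procedure), the standard normal-approximation formula for a two-sided, two-sample comparison of means can be computed with the Python standard library. Note this independent-samples approximation is more conservative than the repeated-measures calculation, which is consistent with it yielding a larger per-group n than the 51 reported:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Normal-approximation sample size per group for a two-sided,
    two-sample comparison of means with standardized effect size d."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # ~0.84 for power = 0.80
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# Medium effect (d = 0.5), 80% power, alpha = 0.05
print(n_per_group(0.5))  # 63 per group under this approximation
```

Because repeated-measures designs exploit the correlation between pre and post scores, they generally require fewer participants than this two-sample bound.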

All participants were informed about the study purpose, procedures, potential benefits, and possible risks. Written informed consent was obtained prior to participation. The study followed the principles of the Declaration of Helsinki and was approved by the Ethics Committee of Sun Yat-sen University (Approval No. SYSU-20240304-28).

2.2 Assessments

All instruments demonstrated acceptable to excellent reliability in the current sample (Cronbach’s α range = 0.79–0.87; ICC range = 0.72–0.74; κ = 0.79). Validated Chinese versions were used for all self-report measures, and cross-cultural adaptation was ensured through pilot testing and expert review.

2.2.1 Demographic questionnaire

A short demographic questionnaire was developed specifically for this study to collect baseline information, including age, gender, academic year, GPA, and prior peer teaching experience.

2.2.2 Knowledge assessment

Knowledge of first-year students was assessed using a researcher-developed 10-item multiple-choice test administered before and after the peer-teaching program. The test was directly aligned with the course learning objectives and practical skills addressed during the peer teaching sessions. Each item had four response options with one correct answer (maximum score = 10). Content validity was ensured through independent review by two senior faculty experts in clinical medical education. Minor revisions were made following pilot testing with 10 non-participating students. Internal consistency in the current sample was acceptable (Cronbach’s α = 0.79).
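The internal-consistency coefficient reported above (Cronbach's α) can be computed directly from an item-by-respondent score matrix. The following is a minimal stdlib-only sketch with toy data (not the study data); the study itself presumably used SPSS for this calculation:

```python
from statistics import pvariance

def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha: items[i][j] = score of respondent j on item i."""
    k = len(items)
    item_var_sum = sum(pvariance(scores) for scores in items)
    totals = [sum(col) for col in zip(*items)]  # each respondent's total score
    return k / (k - 1) * (1 - item_var_sum / pvariance(totals))

# Toy data: 3 items, 4 respondents, perfectly consistent responses -> alpha = 1.0
toy = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(round(cronbach_alpha(toy), 2))  # 1.0
```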

2.2.3 Peer Evaluation Scale for Team-Based Learning (PES-TBL)

The PES-TBL, a validated 16-item instrument, was used to assess the perceived teaching quality of peer tutors from the student perspective. It measures four domains: clarity of explanation, content organization, learner interaction and engagement, and professional attitude. Each item is rated on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree). The total score ranges from 16 to 80, with higher scores indicating stronger teaching performance. The Chinese version used in this study had previously demonstrated good validity and reliability in medical education settings (He et al., 2025). Internal consistency in the present sample was high (Cronbach’s α = 0.87).

2.2.4 Mini Clinical Evaluation Exercise (Mini-CEX)

The Mini-CEX was used to assess peer tutors’ real-time teaching performance (Mortaz Hejri et al., 2017; Loerwald et al., 2018). It includes five key domains: teaching readiness, clarity of content presentation, use of practical examples, interaction with learners, and responsiveness to questions. Each item was rated on a 9-point scale (1 = unsatisfactory to 9 = excellent). Each tutor was observed at least twice by a senior faculty member. Inter-rater reliability, calculated using a two-way random-effects intraclass correlation coefficient (ICC), indicated good agreement (ICC = 0.72).

2.2.5 Direct Observation of Procedural Skills (DOPS)

The DOPS tool was used to assess peer tutors during sessions that involved procedural or hands-on skills (Tang et al., 2025; Hu et al., 2025). The evaluation covered four dimensions: demonstration of technique, clarity of explanation, level of supervision and guidance provided, and the quality of feedback given to students. Each item was scored on a 5-point scale (1 = poor to 5 = excellent). Each tutor received at least two DOPS assessments. Faculty raters participated in a two-hour calibration session prior to data collection to ensure scoring consistency. Inter-rater reliability was acceptable (ICC = 0.74).

2.2.6 Reflective Teaching Journal

To promote self-awareness and ongoing improvement, peer tutors completed a structured Reflective Teaching Journal after each teaching session (Xu et al., 2020). Tutors were instructed to briefly reflect on their strengths, identify aspects needing improvement, and outline specific goals for the next session. Additionally, tutors rated their own performance on a 3-point scale (1 = needs major improvement, 2 = acceptable, 3 = excellent). These journals were reviewed by faculty members who provided feedback to support growth. The format of the journal was adapted from existing educational literature (Xu et al., 2020; Ma et al., 2023). Qualitative entries were analyzed using conventional content analysis, with an inductive approach adopted to allow categories to emerge directly from the data. Two independent coders reviewed the entries and reconciled differences through discussion, achieving substantial inter-rater agreement (Cohen’s κ = 0.79). Data saturation was reached after iterative coding, and final themes were established through consensus meetings between the two coders.

2.2.7 Teaching self-efficacy

To measure peer tutors’ teaching self-efficacy, we used the short-form 20-item Teaching Self-Efficacy Scale developed and validated in a Chinese education context (Ma et al., 2023). This scale includes two subscales—Ethos (confidence in creating a positive, collaborative learning environment and engaging in professional development) and Teaching (tutors’ perceived ability to deliver clear explanations, facilitate student learning, and provide effective feedback during teaching sessions). Each item is rated on a 5-point Likert scale ranging from 1 = strongly disagree to 5 = strongly agree. The total score ranges from 20 to 100, with higher scores indicating greater perceived teaching self-efficacy. Construct validity of the scale was confirmed through exploratory and confirmatory factor analyses in the original study (Ma et al., 2023). In the current sample, internal consistency was high (Cronbach’s α = 0.87 for the total scale, and 0.79 and 0.81 for the Ethos and Teaching subscales, respectively).

2.2.8 Teaching anxiety

Teaching anxiety was assessed using the Teaching Anxiety Scale (TAS), originally developed by Parsons (1973) and adapted for Chinese higher education contexts (Liu and Yan, 2020). The scale includes 33 items covering common sources of teaching-related stress in higher education. Responses were recorded on a 5-point Likert scale (1 = never to 5 = always). Total scores above 3 indicate high teaching anxiety, between 2 and 3 moderate anxiety, and below 2 low anxiety. The scale demonstrated good internal consistency in the present sample (Cronbach’s α = 0.83).

2.2.9 Academic motivation

First-year students’ academic motivation was measured using the Academic Motivation Scale (AMS), grounded in Self-Determination Theory (Ten Cate et al., 2011). The validated Chinese version (Zhang et al., 2016) was used. It comprises 28 items assessing intrinsic motivation, extrinsic motivation, and amotivation, rated on a 7-point Likert scale (1 = not at all true to 7 = completely true). Total and subscale scores were calculated by averaging the relevant items, with higher scores indicating greater levels of the respective motivation type (Hu and Luo, 2021). Internal consistency in this study was high, with Cronbach’s α = 0.80.

2.2.10 Academic self-efficacy

Academic self-efficacy was assessed using the Chinese version of the Academic Self-Efficacy Scale (ASES-C), originally developed by McIlroy (2000) and later validated for Chinese university students (Zhao et al., 2024). This unidimensional 8-item scale is rated on a 7-point Likert scale (1 = not at all confident to 7 = extremely confident). Higher scores indicate stronger academic self-efficacy. The scale demonstrated good internal consistency in this study, with Cronbach’s α = 0.79.

2.3 Intervention

The peer tutors in the intervention group participated in a 2-h training workshop that taught the principles of effective peer teaching and how to use formative assessment tools such as Mini-CEX, DOPS, and the Reflective Teaching Journal. The workshop was facilitated by faculty members with experience in peer medical education.

During the 12-week peer teaching program, each peer tutor in both the intervention and control groups conducted six peer teaching sessions (each session lasting 60–90 min). The number and duration of sessions were standardized to ensure equal teaching exposure and comparability across groups. Session scheduling and oversight were coordinated by faculty coordinators at each university.

During this period, peer tutors in the intervention group received direct observation and structured feedback from faculty using the Mini-CEX and DOPS tools at least twice for each session type (theoretical and practical). After each session, tutors were required to complete a Reflective Teaching Journal entry, recording strengths, challenges, and plans for improvement for the next session.

In contrast, control group peer tutors conducted the same number and duration of peer teaching sessions with students, but received no formal training in formative assessment or structured feedback, and conducted their teaching activities according to standard peer teaching practice common at the participating universities.

To ensure methodological consistency across the three universities, all peer tutors followed an identical peer-teaching protocol. Teaching topics, learning objectives, session plans, lesson duration, teaching materials, and assessment rubrics (Mini-CEX, DOPS, and PES-TBL) were standardized and jointly developed by a committee of faculty representatives from all three institutions. Faculty observers also participated in a calibration workshop to ensure uniform scoring procedures. Therefore, the instructional content and assessment process were fully aligned across sites.

2.4 Statistical analyses

All data were analyzed using SPSS version 26.0 (IBM Corp., Armonk, NY). Descriptive statistics (mean, standard deviation, and frequency) were computed to summarize baseline characteristics. The normality of continuous variables was verified using the Shapiro–Wilk test and inspection of Q–Q plots prior to conducting parametric analyses. Pre-test group differences were analyzed using independent-samples t-tests.

Because first-year students were nested within peer tutors (approximately 6–8 students per tutor), a multilevel analytic framework was applied to account for the hierarchical data structure and non-independence of observations. LMM with random intercepts for tutor ID were used to analyze primary outcomes, including knowledge scores, academic motivation, and academic self-efficacy for students, as well as teaching self-efficacy and teaching anxiety for peer tutors. The proportion of variance attributable to tutor clustering (ICC) was 0.13 for knowledge scores, 0.09 for academic motivation, and 0.11 for academic self-efficacy, supporting the use of LMMs with random intercepts for tutor ID. Fixed effects included group (intervention vs. control), time (pre vs. post), and their interaction (group × time), while baseline scores were entered as covariates where applicable. Model assumptions were checked by examining residual plots for normality, homoscedasticity, and influential outliers. Restricted maximum likelihood (REML) estimation was used, and model fit was evaluated using Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). Effect sizes were calculated using partial eta-squared (η2) and interpreted according to Cohen’s benchmarks.
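The clustering diagnostic described above, the intraclass correlation coefficient ICC(1), can be obtained from a one-way ANOVA decomposition of student scores nested within tutors. A minimal stdlib-only sketch with toy data (equal group sizes assumed; the study's values of 0.09–0.13 were estimated from the fitted mixed models, not this toy):

```python
from statistics import mean

def icc1(groups: list[list[float]]) -> float:
    """One-way ANOVA ICC(1) for equal-sized groups (students nested in tutors)."""
    k = len(groups[0])                      # students per tutor
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msb = ssb / (len(groups) - 1)           # between-tutor mean square
    msw = ssw / (n - len(groups))           # within-tutor mean square
    return (msb - msw) / (msb + (k - 1) * msw)

# Toy data: 3 tutors x 4 students, tutor means differ markedly -> high clustering
scores = [[4, 6, 4, 6], [6, 8, 6, 8], [8, 10, 8, 10]]
print(round(icc1(scores), 3))  # 0.733
```

An ICC this large would make single-level t-tests anticonservative, which is why the authors' choice of random-intercept LMMs is appropriate even at ICCs of 0.09–0.13.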

For outcomes without pre–post measures (e.g., PES-TBL, Mini-CEX, DOPS), independent t-tests were conducted at the tutor level, and findings were cross-validated using LMMs including tutor as a random effect to confirm robustness. Statistical significance was set at p < 0.05 (two-tailed).

Qualitative data derived from the reflective teaching journals were analyzed using conventional inductive content analysis to allow themes to emerge directly from the data rather than from a pre-existing framework. Two researchers independently conducted open coding, grouped similar codes into subcategories, and iteratively refined them into broader themes. The coding framework was discussed and finalized through consensus meetings. Data saturation was reached when no new categories emerged after reviewing approximately 90% of the journals. Inter- coder reliability was assessed using Cohen’s kappa (κ = 0.79), indicating substantial agreement. Final themes were confirmed through peer debriefing with an experienced qualitative researcher to enhance trustworthiness. Four overarching themes were identified: (1) difficulty managing group discussions, (2) communication barriers with first-year students, (3) low teaching confidence during early sessions, and (4) uncertainty about content accuracy.
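The inter-coder agreement statistic used above (Cohen's κ) has a simple closed form: observed agreement corrected for chance agreement implied by each coder's marginal code frequencies. A minimal sketch with hypothetical theme labels (not the study's actual codes):

```python
from collections import Counter

def cohens_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    """Cohen's kappa for two coders' categorical codes over the same units."""
    n = len(coder_a)
    p_obs = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    ca, cb = Counter(coder_a), Counter(coder_b)
    # Chance agreement from the product of marginal proportions per category
    p_exp = sum(ca[c] / n * cb[c] / n for c in ca.keys() | cb.keys())
    return (p_obs - p_exp) / (1 - p_exp)

a = ["theme1", "theme1", "theme2", "theme2"]
b = ["theme1", "theme2", "theme2", "theme2"]
print(cohens_kappa(a, b))  # 0.5
```

By the conventional Landis and Koch benchmarks, the study's κ = 0.79 falls in the "substantial agreement" band (0.61–0.80), as the authors state.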

To control for potential inflation of Type I error due to multiple testing, false discovery rate (Benjamini–Hochberg) correction was applied across the five primary outcome models; all main findings remained significant after FDR adjustment. Sensitivity analyses using Bonferroni correction produced the same pattern of results.
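The Benjamini–Hochberg step-up procedure applied above can be expressed as adjusted p-values (q-values): each ordered p-value is scaled by m/rank and then made monotone from the largest rank downward. A minimal stdlib-only sketch with five hypothetical p-values (not the study's actual values):

```python
def bh_adjust(pvals: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, i in enumerate(reversed(order)):
        rank = m - offset                   # 1-based rank of this p-value
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min           # enforce monotonicity from the top
    return adjusted

# Five hypothetical primary-outcome p-values
p = [0.001, 0.003, 0.004, 0.02, 0.04]
q = bh_adjust(p)
print(all(v < 0.05 for v in q))  # True: all would remain significant after FDR
```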

3 Results

A total of 122 peer tutors (61 in the intervention group and 61 in the control group) participated in the study, and all completed the intervention as scheduled without dropout. Each peer tutor supervised between six and eight first-year students during the 12-week program, resulting in 850 first-year students (425 per group). During the intervention, 14 first-year students in the intervention group and 16 in the control group withdrew due to absenteeism or voluntary withdrawal. Consequently, data from 411 intervention and 409 control students were retained for the final analysis.

The mean age of peer tutors was 23.9 ± 0.4 years and 55.73% were female (n = 68). Their mean GPA was 3.44 ± 0.15 on a 4.0 scale, and all had at least 1 year of previous peer teaching experience. The mean age of first-year students was 18.91 ± 0.22 years, with 53.9% female (n = 442). There were no significant baseline differences between groups in demographic or outcome variables, indicating good initial comparability (Table 1).


Table 1. The baseline demographic characteristics of peer tutors and first-year students in the intervention and control groups.

Because students were nested within peer tutors (approximately 6–8 students per tutor), all inferential analyses were conducted using LMMs to account for the hierarchical data structure and non-independence of observations (Table 2). Each model included group (intervention vs. control), time (pre vs. post), and their interaction (group × time) as fixed effects, while tutor ID was included as a random intercept to account for within-tutor clustering.


Table 2. Linear mixed-effects model results comparing pre–post outcomes between intervention and control groups.

Significant group × time interaction effects were observed for all primary outcomes, indicating that the intervention group improved more over time than the control group. Specifically, knowledge scores showed a large improvement among students taught by intervention-group tutors (F = 54.9, p < 0.001, partial η2 = 0.18). Teaching self-efficacy among peer tutors also increased significantly compared with the control group (F = 22.3, p < 0.001, η2 = 0.15), while teaching anxiety decreased markedly (F = 17.1, p < 0.001, η2 = 0.14). Among first-year students, both academic motivation (F = 27.5, p < 0.001, η2 = 0.06) and academic self-efficacy (F = 37.1, p < 0.001, η2 = 0.07) increased significantly more in the intervention group than in the control group. Model diagnostics confirmed that residuals were normally distributed, variances were homogeneous, and no influential outliers were detected. Descriptive means (means ± SD) are reported in Table 2 for interpretability; inferential statistics (F, p, partial η2) are derived from linear mixed-effects models (LMM) adjusted for baseline scores and GPA.

For performance-based measures collected only at post-test, independent-samples t-tests were performed at the tutor level, and results were cross-validated using LMMs with tutor as a random effect (Table 3). Both analyses yielded consistent results, indicating significantly higher teaching performance among intervention-group tutors. The intervention group achieved higher mean scores on the Peer Evaluation Scale for Team-Based Learning (PES-TBL) (67.35 ± 5.77 vs. 59.32 ± 6.21; t (120) = 7.61, p < 0.001, Cohen’s d = 0.47), Mini-Clinical Evaluation Exercise (Mini-CEX) (7.76 ± 0.72 vs. 6.38 ± 0.88; t (120) = 8.15, p < 0.001, d = 0.42), and Direct Observation of Procedural Skills (DOPS) (4.45 ± 0.32 vs. 3.73 ± 0.41; t (120) = 8.13, p < 0.001, d = 0.82), corresponding to moderate-to-large effect sizes.


Table 3. Post-test comparisons for tutor performance indicators (independent-samples t-tests, confirmed with LMMs).
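The pooled-standard-deviation form of Cohen's d used for these post-test comparisons can be sketched as follows, with illustrative toy numbers rather than the study data:

```python
from math import sqrt

def cohens_d(m1: float, s1: float, n1: int,
             m2: float, s2: float, n2: int) -> float:
    """Cohen's d for two independent groups using the pooled standard deviation."""
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Toy numbers: equal-sized groups (n = 61 each) with a one-SD mean difference
print(cohens_d(10.0, 2.0, 61, 8.0, 2.0, 61))  # 1.0
```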

Of the 372 reflective teaching journals collected from peer tutors, six were excluded due to incomplete or missing entries, leaving 366 journals for analysis. Based on tutors’ self-ratings, 56.5% of reflections were classified as acceptable, 21.2% as excellent, and 22.3% as requiring major improvement. Content analysis revealed four recurring themes: communication barriers with first-year students, challenges in managing group discussions, uncertainty about content accuracy, and low teaching confidence during early sessions. Tutors frequently identified goals related to improving interaction, time management, and clarity of explanations. Representative excerpts included statements such as: “At first, I was nervous about whether my explanations were clear enough, but after feedback, I realized the importance of checking students’ understanding during class,” and “Some students were quiet at first, and I found it difficult to engage them. Gradually, I learned to use questions to make them participate more.” These reflections suggest that structured formative assessment, combined with guided self-reflection, facilitated the development of tutors’ confidence, communication, and pedagogical adaptability throughout the intervention.

All primary results remained statistically significant after Benjamini–Hochberg FDR correction (adjusted p < 0.05); the same pattern was observed with Bonferroni correction, confirming robustness.

4 Discussion

This quasi-experimental study evaluated whether integrating structured formative assessment into peer-assisted learning enhances teaching quality among peer tutors and learning outcomes among first-year students. The observed improvements in first-year students’ knowledge scores and peer tutors’ teaching indicators (PES-TBL, Mini-CEX, and DOPS) provide preliminary evidence that formative assessment is a practical approach to improving the educational impact of peer tutoring, consistent with previous findings in health professions education (Herrmann-Werner et al., 2017; Sabale et al., 2022; Feng et al., 2024).

Structured formative feedback appeared to benefit both tutors and learners. Students taught by tutors receiving targeted feedback experienced clearer explanations, better-organized sessions, and more interactive instruction, which are known to enhance knowledge retention (Morris et al., 2021). Although modest gains occurred in the control group, the larger improvements in the intervention group underscore that structured observation and feedback enable tutors to refine delivery, adapt explanations, and address misunderstandings more effectively, aligning with constructivist and experiential learning frameworks (O’Connor and McCurtin, 2021).

Mechanisms driving teaching performance improvement include the use of Mini-CEX and DOPS, which facilitated structured observation and immediate feedback, promoting deliberate practice and competency development among novice educators (Lörwald et al., 2019; Lee and Mori, 2021; Embo et al., 2010). The PES-TBL scale provided clear performance criteria, reducing ambiguity and guiding tutors to follow recognized best practices (Robertson et al., 2025). Reflective journal analysis reinforced these findings: over half of reflections rated teaching as “acceptable,” with about one-fifth rated as “excellent,” accompanied by recurrent notes on communication and content clarity. This pattern demonstrates how guided reflection fosters metacognitive engagement and adaptability, core elements of reflective professional practice (Ratminingsih et al., 2017; Silver et al., 2023). Together, structured feedback and reflection appear to form a cycle that accelerates tutors’ professional growth and self-regulation (Zlabkova et al., 2024).

Formative assessment also yielded psychological and motivational benefits. Peer tutors in the intervention group reported higher teaching self-efficacy and lower teaching anxiety. According to Bandura’s social cognitive theory, formative feedback serves as a structured source of mastery experiences—the most powerful determinant of self-efficacy (Morris et al., 2021; Granziera and Perera, 2019; Lent, 2016). Constructive feedback following teaching attempts reinforced confidence and framed errors as growth opportunities, likely explaining reduced anxiety and creating a positive feedback loop between competence and performance (Jones et al., 2021; Patra et al., 2022).

First-year students in the intervention group demonstrated higher academic motivation and self-efficacy, which can be interpreted through Self-Determination Theory (SDT). The structured environment likely enhanced competence via clear instruction, autonomy through active engagement, and relatedness through tutors’ responsiveness and confidence (Ten Cate et al., 2011; Patra et al., 2022; Luarn et al., 2023). Consequently, improved teaching quality and psychological safety likely enhanced intrinsic motivation and belief in academic capability.

In summary, formative assessment in peer-assisted learning offers dual pedagogical and psychological benefits, reinforcing tutors’ instructional competence and confidence while simultaneously promoting learners’ motivation and self-efficacy. By bridging social-cognitive and self-determination perspectives, formative feedback fosters a self-reinforcing cycle of mastery, reduced anxiety, and sustained motivation for both tutors and learners.

Several limitations should be considered when interpreting these findings. First, the quasi-experimental design and convenience sampling restrict causal inference and may introduce selection bias, as students who volunteered to participate could have been more motivated or teaching-oriented than the general student population. Second, the relatively short intervention period did not allow for long-term follow-up, making it unclear whether improvements in teaching performance, motivation, or self-efficacy are sustained over time. Third, although reflective journals provided valuable qualitative insight, the absence of interviews or focus groups limited the depth of qualitative interpretation.

Additionally, a potential Hawthorne effect cannot be ruled out; tutors who knew they were being observed or receiving structured feedback may have temporarily altered their behavior, contributing to short-term performance gains. Finally, because the study was conducted in three universities within a single province in China, cultural and contextual factors may limit the generalizability of findings to other regions or educational systems. Future research would benefit from multi-center randomized designs, longer follow-up periods, and mixed-methods approaches incorporating interviews to further explore the mechanisms linking formative assessment, self-efficacy, and learning outcomes.

5 Conclusion

This quasi-experimental study indicates that integrating structured formative assessment into peer-assisted learning can enhance both teaching quality and student learning in undergraduate medical education. Peer tutors who received structured feedback through Mini-CEX, DOPS, and reflective journals demonstrated higher teaching performance, greater teaching self-efficacy, and lower teaching anxiety. Correspondingly, first-year students taught by these tutors achieved higher knowledge gains, stronger academic motivation, and improved academic self-efficacy. These findings suggest that formative assessment exerts a dual influence: it strengthens tutors’ instructional competence and confidence while fostering a more engaging and motivating learning environment for students. The consistent improvements across cognitive, behavioral, and psychological domains underscore the value of formative assessment as both an educational strategy and a professional development tool. Given its low cost and feasibility, this approach provides medical schools with a practical means to enhance the quality of peer teaching. Future research should explore the long-term sustainability of these effects and investigate the applicability of formative assessment models across different disciplines and cultural contexts.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The study was conducted in accordance with the principles of the Declaration of Helsinki and with local legislation and institutional requirements, and was approved by the Ethics Committee of Sun Yat-sen University (Approval No. SYSU-20240304-28). The participants provided their written informed consent to participate in this study.

Author contributions

DZ: Conceptualization, Data curation, Investigation, Project administration, Writing – original draft. KZ: Conceptualization, Data curation, Investigation, Project administration, Writing – original draft. JC: Conceptualization, Data curation, Investigation, Project administration, Writing – original draft. BS: Methodology, Project administration, Supervision, Visualization, Writing – original draft. ZS: Methodology, Project administration, Supervision, Visualization, Writing – original draft. QW: Methodology, Project administration, Supervision, Visualization, Writing – original draft. LY: Methodology, Project administration, Supervision, Visualization, Writing – original draft. LX: Methodology, Project administration, Supervision, Visualization, Writing – original draft. HL: Conceptualization, Data curation, Investigation, Methodology, Project administration, Supervision, Visualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by the Guangdong Province Learning Society Construction (Continuing Education) Quality Enhancement Project [grant fund number: JXJYGC2024D160].

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no generative AI was used in the creation of this manuscript.


Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Brierley, C., Ellis, L., and Reid, E. R. (2022). Peer-assisted learning in medical education: a systematic review and meta-analysis. Med. Educ. 56, 365–373. doi: 10.1111/medu.14672

Embo, M. P., Driessen, E. W., Valcke, M., and Van der Vleuten, C. P. (2010). Assessment and feedback to facilitate self-directed learning in clinical practice of midwifery students. Med. Teach. 32, e263–e269. doi: 10.3109/0142159X.2010.490281

Feng, H., Luo, Z., Wu, Z., and Li, X. (2024). Effectiveness of peer-assisted learning in health professional education: a scoping review of systematic reviews. BMC Med. Educ. 24:1467. doi: 10.1186/s12909-024-06434-7

Friel, O., Kell, D., and Higgins, M. (2018). The evidence base for peer assisted learning in undergraduate medical education: a scoping study. MedEdPublish 7:44. doi: 10.15694/mep.2018.0000044.1

Granziera, H., and Perera, H. N. (2019). Relations among teachers’ self-efficacy beliefs, engagement, and work satisfaction: a social cognitive view. Contemp. Educ. Psychol. 58, 75–84. doi: 10.1016/j.cedpsych.2019.02.003

He, S., Guan, J., Xiong, C., Qiu, Y., Duan, Y., Zhang, Y., et al. (2025). Translation and psychometric validation of the peer evaluation scale for team-based learning (PES-TBL) for Chinese medical students. Nurse Educ. Pract. 83:104257. doi: 10.1016/j.nepr.2025.104257

Herrmann-Werner, A., Gramer, R., Erschens, R., Nikendei, C., Wosnik, A., Griewatz, J., et al. (2017). Peer-assisted learning (PAL) in undergraduate medical education: an overview. Z. Evid. Fortbild. Qual. Gesundhswes. 121, 74–81. doi: 10.1016/j.zefq.2017.01.001

Hu, H., and Luo, H. (2021). Academic motivation among senior students majoring in rehabilitation related professions in China. BMC Med. Educ. 21:582. doi: 10.1186/s12909-021-03016-9

Hu, Z., Zhang, W., Huang, M., and Liu, X. (2025). Application of directly observed procedural skills in hospital infection training: a randomized controlled trial. Front. Med. 12:1509238. doi: 10.3389/fmed.2025.1509238

Irons, A., and Elkington, S. (2021). Enhancing learning through formative assessment and feedback. London: Routledge.

Jones, D. L., Nelson, J. D., and Opitz, B. (2021). Increased anxiety is associated with better learning from negative feedback. Psychol. Learn. Teach. 20, 76–90. doi: 10.1177/1475725720965761

Larios-Jones, L., Richards, E., and Sollazzo, A. (2024). A peer-led approach to tutor training: implementation and outcomes. Proceedings of the 2024 Conference on United Kingdom & Ireland Computing Education Research. Manchester, United Kingdom: Association for Computing Machinery (ACM), 1–7.

Lee, H., and Mori, C. (2021). Reflective practices and self-directed learning competencies in second language university classes. Asia Pac. J. Educ. 41, 130–151. doi: 10.1080/02188791.2020.1772196

Lent, R. W. (2016). Self-efficacy in a relational world: social cognitive mechanisms of adaptation and development. Counsel. Psychol. 44, 573–594. doi: 10.1177/0011000016638742

Liu, M., and Yan, Y. (2020). Anxiety and stress in in-service Chinese university teachers of arts. Int. J. High. Educ. 9, 237–248. doi: 10.5430/ijhe.v9n1p237

Loerwald, A. C., Lahner, F.-M., Nouns, Z. M., Berendonk, C., Norcini, J., Greif, R., et al. (2018). The educational impact of mini-clinical evaluation exercise (Mini-CEX) and direct observation of procedural skills (DOPS) and its association with implementation: a systematic review and meta-analysis. PLoS One 13:e0198009. doi: 10.1371/journal.pone.0198009

Lörwald, A. C., Lahner, F.-M., Mooser, B., Perrig, M., Widmer, M. K., Greif, R., et al. (2019). Influences on the implementation of Mini-CEX and DOPS for postgraduate medical trainees’ learning: a grounded theory study. Med. Teach. 41, 448–456. doi: 10.1080/0142159X.2018.1497784

Luarn, P., Chen, C.-C., and Chiu, Y.-P. (2023). Enhancing intrinsic learning motivation through gamification: a self-determination theory perspective. Int. J. Inf. Learn. Technol. 40, 413–424. doi: 10.1108/IJILT-07-2022-0145

Ma, T., Li, Y., Yuan, H., Li, F., Yang, S., Zhan, Y., et al. (2023). Reflection on the teaching of student-centred formative assessment in medical curricula: an investigation from the perspective of medical students. BMC Med. Educ. 23:141. doi: 10.1186/s12909-023-04110-w

Ma, K., Luo, J., Cavanagh, M., Dong, J., and Sun, M. (2023). Measuring teacher self-efficacy: validating a new comprehensive scale among Chinese pre-service teachers. Front. Psychol. 13:1063830. doi: 10.3389/fpsyg.2022.1063830

Martin, J. J., and Guerrero, M. D. (2020). “Social cognitive theory” in Routledge handbook of adapted physical education (London: Routledge), 280–295.

McIlroy, D. (2000). An evaluation of the factor structure and predictive utility of a test anxiety scale with reference to students’ past performance and personality indices. Br. J. Educ. Psychol. 70, 17–32.

Morris, R., Perry, T., and Wardle, L. (2021). Formative assessment and feedback for learning in higher education: a systematic review. Rev. Educ. 9:e3292. doi: 10.1002/rev3.3292

Mortaz Hejri, S., Jalili, M., Shirazi, M., Masoomi, R., Nedjat, S., and Norcini, J. (2017). The utility of mini-clinical evaluation exercise (mini-CEX) in undergraduate and postgraduate medical education: protocol for a systematic review. Syst. Rev. 6:146. doi: 10.1186/s13643-017-0539-y

O’Connor, A., and McCurtin, A. (2021). A feedback journey: employing a constructivist approach to the development of feedback literacy among health professional learners. BMC Med. Educ. 21:486. doi: 10.1186/s12909-021-02914-2

Parsons, J. S. (1973). Assessment of anxiety about teaching using the Teaching Anxiety Scale: manual and research report. Austin, TX: University of Texas at Austin, Research and Development Center for Teacher Education. Paper presented at the Annual Meeting of the American Educational Research Association, New Orleans, LA, February 25–March 1, 1973.

Patra, I., Alazemi, A., Al-Jamal, D., and Gheisari, A. (2022). The effectiveness of teachers’ written and verbal corrective feedback (CF) during formative assessment (FA) on male language learners’ academic anxiety (AA), academic performance (AP), and attitude toward learning (ATL). Lang. Test. Asia 12:19. doi: 10.1186/s40468-022-00169-2

Pierce, B., van de Mortel, T., Allen, J., and Mitchell, C. (2024). The influence of near-peer teaching on undergraduate health professional students' self-efficacy beliefs: a systematic integrative review. Nurse Educ. Today 143:106377. doi: 10.1016/j.nedt.2024.106377

Ratminingsih, N. M., Artini, L. P., and Padmadewi, N. N. (2017). Incorporating self and peer assessment in reflective teaching practices. Int. J. Instr. 10, 165–184. doi: 10.12973/iji.2017.10410a

Robertson, K. A., Gunderman, D. J., and Byram, J. N. (2025). Formative peer evaluation instrument for a team-based learning course: content and construct validity. Med. Teach. 47, 828–834. doi: 10.1080/0142159X.2024.2374511

Ryan, R. M., and Deci, E. L. (2020). Intrinsic and extrinsic motivation from a self-determination theory perspective: definitions, theory, practices, and future directions. Contemp. Educ. Psychol. 61:101860. doi: 10.1016/j.cedpsych.2020.101860

Sabale, R., Manapuranth, R. M., Subrahmanya, S. U., and Pathak, B. (2022). “Written formative assessments with peer-assisted learning”: an innovative teaching program for postgraduate students in community medicine. Indian J. Community Med. 47, 34–38. doi: 10.4103/ijcm.IJCM_682_21

Silver, N., Kaplan, M., LaVaque-Manty, D., and Meizlish, D. (2023). Using reflection and metacognition to improve student learning: across the disciplines, across the academy. New York: Taylor & Francis.

Slabbert, R. (2024). Effects of same-year/level peer-assisted learning on academic performance of students in health sciences’ extended curriculum programmes at a University of Technology in South Africa. Perspect. Educ. 42, 46–59. doi: 10.38140/pie.v42i2.7311

Tang, K., Zhou, X., Ju, Z., Wang, F., Zhang, T., Hu, D., et al. (2025). Direct observation of procedural skills as an assessment tool in acupuncture skills training for international students. Med. Acupunct. 37, 73–79. doi: 10.1089/acu.2024.0119

Ten Cate, O. T. J., Kusurkar, R. A., and Williams, G. C. (2011). How self-determination theory can assist our understanding of the teaching and learning processes in medical education. AMEE guide no. 59. Med. Teach. 33, 961–973. doi: 10.3109/0142159X.2011.595435

Wang, L., Chen, P., Wang, X., Wei, S., Lin, J., and Jing, X. (2025). Integrating team-based and peer-teaching strategies for standardized dental residency: a path to active learning and professional growth. BMC Med. Educ. 25:618. doi: 10.1186/s12909-025-07023-y

Woreta, G. T., Zewude, G. T., and Józsa, K. (2025). The mediating role of self-efficacy and outcome expectations in the relationship between peer context and academic engagement: a social cognitive theory perspective. Behav. Sci. 15:681. doi: 10.3390/bs15050681

Xu, D., Atkinson, M., Yap, T., Yap, M., Hossain, R., Chong, F., et al. (2020). Reflecting on exchange students’ learning: structure, objectives and supervision. Med. Teach. 42, 278–284. doi: 10.1080/0142159X.2019.1676886

Yao, Q., Zhu, P., Yu, X., Cheng, Y., Cui, W., and Liu, Q. (2025). The effectiveness of the student-centered flipped classroom approach in medical anatomy teaching: a quasi-experimental study. Clin. Anat. 38, 496–504. doi: 10.1002/ca.24267

Zhang, B., Li, Y. M., Li, J., Li, Y., and Zhang, H. (2016). The revision and validation of the Academic Motivation Scale in China. J. Psychoeduc. Assess. 34, 15–27.

Zhao, M., Kuan, G., Chau, V. H., and Kueh, Y. C. (2024). Validation and measurement invariance of the Chinese version of the academic self-efficacy scale for university students. PeerJ 12:e17798. doi: 10.7717/peerj.17798

Zhu, C., Tian, H., Yan, F., Xue, J., and Li, W. (2024). Enhancing knowledge mastery in resident students through peer-teaching: a study in respiratory medicine. BMC Med. Educ. 24:350. doi: 10.1186/s12909-024-05130-w

Zlabkova, I., Petr, J., Stuchlikova, I., Rokos, L., and Hospesova, A. (2024). “Development of teachers' perspective on formative peer assessment” in Developing formative assessment in STEM classrooms (London: Routledge), 105–125.

Keywords: peer-assisted learning, formative assessment, teaching self-efficacy, teaching anxiety, academic motivation, medical education

Citation: Zhang D, Zhang K, Chen J, Shang B, Su Z, Wu Q, Yang L, Xie L and Lv H (2025) Enhancing peer teaching and psychological outcomes in medical education through structured formative assessment: a quasi-experimental study. Front. Psychol. 16:1710203. doi: 10.3389/fpsyg.2025.1710203

Received: 23 September 2025; Revised: 21 November 2025; Accepted: 25 November 2025;
Published: 19 December 2025.

Edited by:

Daniel H. Robinson, The University of Texas at Arlington College of Education, United States

Reviewed by:

Uzair Abbas, Dow University of Health Sciences, Pakistan
Jingyuan Ren, Radboud University, Netherlands
Ashfaque Ahmed Kanhar, Chulalongkorn University, Thailand

Copyright © 2025 Zhang, Zhang, Chen, Shang, Su, Wu, Yang, Xie and Lv. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Hai Lv, lvhai@mail.sysu.edu.cn
