
ORIGINAL RESEARCH article

Front. Educ., 05 January 2026

Sec. Psychology in Education

Volume 10 - 2025 | https://doi.org/10.3389/feduc.2025.1689514

“Yes we can?” – To what extent do can-judgments enhance self-regulated learning from text?

Martin Fifka1*, Nicolas Hübner2, Anique de Bruin3 and Anja Prinz-Weiß1
  • 1Department of Psychology, University of Education Karlsruhe, Karlsruhe, Germany
  • 2University of Bonn, Bonn, Germany
  • 3Department of Educational Development and Research, Maastricht University, Maastricht, Netherlands

Students worldwide struggle to learn in STEM subjects, highlighting the need for effective interventions to enhance learning in these areas. Evidence on whether judgments of learning (JOLs) combined with retrieval practice can support learning from text is mixed. Notably, previous studies have been carried out with adult learners. Thus, it remains unclear to what extent younger students in intermediate school tracks, who typically perform at low to medium levels, benefit from making JOLs in addition to engaging in retrieval practice. Moreover, a potential positive influence of JOLs and retrieval practice on the effectiveness of self-regulation has not yet been examined. Therefore, we investigated the impact of JOLs and retrieval practice on text comprehension, judgment accuracy, and improvements in text comprehension after restudying (as an indicator of self-regulation). The study involved N = 315 ninth-grade students (Mage = 15.10), who made can-judgments as a special form of JOLs, and employed a 2 (JOLs vs. no JOLs) × 2 (retrieval practice vs. no retrieval practice) factorial design. The results showed no significant effects of retrieval practice, can-judgments, or their combination on text comprehension, JOL accuracy, or improvements in text comprehension. This suggests that can-judgments and retrieval practice, without subsequent feedback or discussion, do not support self-regulated learning from text.

Introduction

Supporting students in self-regulated learning from text is relevant in nearly every school subject. This support is particularly important for students who struggle with self-regulating their learning, as low self-regulation skills impair their ability to plan, monitor, and control their own learning. As a result, they are more likely to achieve lower academic outcomes (e.g., Blume et al., 2022; de Bruin and Van Gog, 2012). One approach increasingly used to foster self-regulated learning in German high schools is the use of “can-lists.” Can-lists contain several target-specific judgments of learning (JOLs; e.g., “I can draw points into a Cartesian coordinate system,” Hoffmann, 2016), in which students self-assess their understanding of specific topics. The aim of can-lists is to help students identify knowledge gaps and, in turn, allocate their study time more effectively. However, despite their widespread use in German schools, the effectiveness of can-lists has not yet been systematically investigated.

A key factor in self-regulated learning is high JOL accuracy. More accurate JOLs lead to more effective self-regulation (e.g., restudy decisions), which in turn improves performance (e.g., Butler and Winne, 1995; Dunlosky and Rawson, 2012; Thiede et al., 2003). Consequently, it seems likely that students who provide more accurate can-judgments will self-regulate their learning more effectively. This raises the question of how to best support students in making accurate can-judgments.

One promising method in this regard is retrieval practice, which involves recalling information from memory rather than simply restudying it (e.g., Butler and Winne, 1995; Karpicke, 2017). While the primary reason teachers use retrieval practice is to enhance learning, it can also improve the accuracy of JOLs (e.g., Tauber et al., 2015). This is likely because retrieving information strengthens memory traces and provides students with a clearer sense of what they know. Therefore, combining can-lists with retrieval practice might be beneficial, as retrieval practice could enhance JOL accuracy, making can-lists a more effective tool for guiding students’ self-regulated learning.

Recent research suggests that making JOLs can support learning, possibly because learners engage in covert retrieval when trying to assess their knowledge (e.g., Li et al., 2022). However, the effects of JOLs on learning—known as the reactive effect of JOLs—appear to depend on the learning material. Specifically, a meta-analysis by Double et al. (2018) showed that the effect varies depending on whether students learn single words, word pairs, or longer texts. The evidence for a reactive effect when learning from text remains inconclusive. Ariel et al. (2021) found a reactive effect of JOLs on text comprehension only when combined with retrieval practice. However, Zhao et al. (2023) did not replicate this finding. To date, these two studies are the only ones that investigated the reactive effect when learning from text, and both focused on adult learners, leaving open the question of whether similar effects occur among younger students.

In this study, we focused on students in intermediate track schools, who are still developing their self-regulation skills (e.g., Yang et al., 2023; de Bruin and Van Gog, 2012) and often face challenges in both STEM subjects and reading. For example, PISA 2022 results showed that among 15-year-old students from all types of schools in Germany, 22.9% did not meet the minimum proficiency level in STEM subjects and 25.5% did not achieve the minimum proficiency level in reading.

By investigating self-regulated learning from text in physics, our study addresses these challenges; by focusing on young learners and on can-judgments as a tool already used in German schools, it is also of high practical relevance.

The following sections of the literature review expand on the reactive effect of JOLs, overt and covert retrieval, and can-judgments as a special form of JOLs, in order to provide a rationale for the research questions and to substantiate the hypotheses.

Before conducting the study, we preregistered our methods and hypotheses at https://osf.io/j58g3/?view_only=85b0b29645654562a8e4c720ec4a43fa.

The reactive effect of judgments of learning

Typically, JOLs are used to measure judgment accuracy (e.g., Rhodes, 2015). In this case, learners judge their learning, and this judgment is then compared to their actual performance to determine its accuracy. However, besides being used to measure judgment accuracy, their direct effect on learning, the so-called reactive effect of JOLs, has been addressed by numerous previous studies. A meta-analysis by Double et al. (2018) revealed no general reactive effect of JOLs. However, considering the type of learning material, they found a moderate positive effect when learning related word pairs or structured word lists. Consistent with these findings, subsequent research by Janes et al. (2018) indicated a positive reactive effect when learning related word pairs. Two recent studies by Zhao et al. (2022) and Li et al. (2022) also found a reactive effect of JOLs when learning single words instead of word pairs. While Li et al. (2022) focused on adults, Zhao et al. (2022) assessed primary school students in grades 1, 3, and 5, finding that the reactive effect increased with students’ age. To date, there are – to the best of our knowledge – no studies looking into the reactive effect of JOLs in high school students.

The mechanisms underlying the reactive effect of JOLs remain unclear, though several hypotheses have been proposed to explain this phenomenon. One possibility is that making a JOL involves a form of covert retrieval, where learners try to recall the target information to evaluate how well it can be remembered (e.g., Baker and Dunlosky, 2006; Jönsson et al., 2012). Another explanation is that the requirement to make JOLs alters learners’ study strategies. Specifically, learners might shift their focus from preparing for the final test to identifying specific cues in the material that help them provide accurate JOLs. This shift in attention could lead to more deliberate and effective engagement with the content, thereby enhancing learning outcomes (e.g., Li et al., 2022). A third hypothesis is that JOLs affect self-confidence, which in turn influences test performance. Specifically, making JOLs may heighten learners’ awareness of their perceived competence. Low-performing learners might become more aware of their difficulties, leading to increased anxiety and poorer performance, whereas high-performing learners could gain confidence, which may enhance their results (e.g., Double and Birney, 2017).

In the context of text-based learning, research on the reactive effect of JOLs remains limited. To date, only two studies have looked into this effect, both using adult participants with an average age of approximately 34 years, the majority of whom had college experience. Ariel et al. (2021) conducted a series of five experiments investigating the effects of JOLs and retrieval practice on text-based learning. In Experiments 1, 2a, 2b, and 3, participants were assigned to either a JOL or a no-JOL condition. The reading material covered the formation of minerals. In these four experiments, participants in the JOL group made various types of JOLs after reading each section, such as aggregate JOLs (overall confidence ratings) or term-specific JOLs (e.g., “How confident are you that you understand that minerals are made by geological processes?”). Across these four experiments, making JOLs did not improve performance on the final test compared to reading alone. In Experiment 4, Ariel et al. (2021) examined whether retrieval practice could produce an effect. The experiment employed a 2 × 2 design, manipulating the inclusion of JOLs (yes or no) and retrieval practice (yes or no). Participants in the retrieval practice condition answered two to three short-answer questions after each section, which were identical to those on the final test. They received feedback immediately after responding. Those in the combined JOL + retrieval practice condition additionally made JOLs after completing retrieval practice. Participants in the JOL condition made a JOL after reading each section. Participants in the control group neither made JOLs nor engaged in retrieval practice. The final test consisted of 12 factual short-answer questions about the text (e.g., “How are minerals made?”). The results showed no effect of JOLs alone on test performance. 
However, retrieval practice substantially improved performance compared to reading only, and combining JOLs with retrieval practice led to even greater improvements. Moreover, additional analyses showed that retrieval practice also enhanced the accuracy of participants’ JOLs.

Zhao et al. (2023) attempted to replicate the findings of Experiment 4 by Ariel et al. (2021) by testing the combined effect of JOLs and retrieval practice. Across three experiments conducted with different samples (students from Beijing Normal University, mTurk workers, and Prolific Academic participants), they found no significant difference in test performance between those who engaged in retrieval practice and JOLs and those who engaged in retrieval practice alone. This lack of replication suggests that benefits of combining JOLs with retrieval practice might not be robust.

Whereas prior research has demonstrated a reactive effect of JOLs in learning single words and word pairs, the evidence for this effect in text-based learning remains inconclusive. The findings by Ariel et al. (2021) suggested that JOLs alone do not enhance text comprehension but may be beneficial when combined with retrieval practice. However, Zhao et al. (2023) did not replicate this effect. In addition, to date, no research has examined whether the reactive effect of JOLs extends to younger learners in high school settings.

Overt and covert retrieval practice

Retrieval practice involves actively recalling information from memory through exercises designed to facilitate knowledge retrieval (Karpicke, 2017). In a typical retrieval practice experiment, learners first study a set of materials before engaging in retrieval-based activities, such as answering quiz questions or recalling key concepts (e.g., Blunt and Karpicke, 2014). Control conditions vary but often involve additional study time, elaborative learning tasks, or restudying instead of retrieval practice. Finally, all participants complete a criterial test, which assesses how well the previously learned material is retained or understood (e.g., Carpenter et al., 2009; Larsen and Dornan, 2013; Rowland and DeLosh, 2014). Research has shown that retrieval practice with closely related or identical items as used on a subsequent test significantly enhances test performance compared to simply restudying the learning materials (e.g., Butler and Winne, 1995; Karpicke, 2017; Kemp et al., 2023).

Making a JOL may involve covert retrieval, subtly triggering recall processes without overt retrieval practice. This subtle activation of memory could explain the reactive effect of JOLs on learning single words or related word pairs (e.g., Li et al., 2022). Ariel et al. (2021) suggested that the reason the reactive effect of making JOLs has not been found when learning from text (i.e., when only providing JOLs without overt retrieval practice) might be that learners prematurely terminate their retrieval attempts compared to the overt retrieval induced by explicit instructions. Short attempts to retrieve the target information might be sufficient in the case of learning single words or word pairs but insufficient when learning more complex materials like key definitions (Tauber et al., 2018) or textual information (Ariel et al., 2021). Contrary to this, Yang et al. (2023) questioned whether JOLs consistently rely on covert retrieval. Their meta-analysis showed that learners’ JOLs when studying text are often highly inaccurate (see also Prinz et al., 2020). Whereas this does not directly imply that covert retrieval does not occur, it raises doubts about whether JOLs alone consistently trigger meaningful recall processes that contribute to improved learning outcomes. Further supporting this idea, Yang et al. (2023) found that JOL accuracy significantly improved when learners first engaged in overt retrieval before making their JOLs. This suggests that retrieval attempts prior to making JOLs enhance metacognitive accuracy, as learners gain a clearer sense of what they truly know. Similarly, Ariel et al. (2021) found that JOL accuracy was higher when participants engaged in retrieval practice in addition to making JOLs compared to when they only made JOLs.

It is important to note that, in the studies by Ariel et al. (2021) and Zhao et al. (2023), the students received feedback after engaging in retrieval practice. A meta-analysis by Rowland (2014) showed that retrieval practice has a positive effect on learning even when learners are not given any feedback on the correctness of their answers. Feedback is clearly beneficial in educational settings as, depending on the type of feedback, it can provide learners with the correct answer, relevant information, or at least indicate whether their response was correct (Hattie and Timperley, 2007). However, giving feedback precludes observing direct mnemonic effects of the initial retrieval effort (cf. Karpicke, 2017).

Overall, retrieval practice plays a crucial role in learning. For one, studies have shown that retrieval practice can improve the accuracy of JOLs (e.g., Ariel et al., 2021; Dunlosky and Nelson, 1992; Tauber et al., 2015). Moreover, while JOLs may involve some level of covert retrieval, research suggests that JOLs alone may not consistently induce the retrieval effort required for learning, especially for complex materials. However, encouraging learners to engage in overt retrieval practice before making JOLs appears to improve both the accuracy of JOLs and learning outcomes (Yang et al., 2023). Nonetheless, these studies have primarily focused on adult learners and have not examined whether improvements in JOL accuracy actually lead to more effective regulation of study behavior. In addition, previous studies have not included measures of self-regulation, such as rereading behavior. Yet, research indicates that more accurate JOLs are linked to more effective regulation decisions, such as about what content to restudy (e.g., de Bruin et al., 2011; Dunlosky and Thiede, 2013; Thiede et al., 2003). This connection is especially relevant for younger learners in school settings, where developing the ability to regulate one’s own learning is an important educational goal. Our study aims to address this gap and investigates to what extent can-judgments as a special form of JOLs support self-regulated learning.

“Can-judgments” as a special form of JOLs

Research has shown that JOLs during text learning are often inaccurate, particularly among primary and secondary school students (e.g., Prinz et al., 2020). This highlights the need to actively foster JOL accuracy in educational settings, especially in schools, where its impact could be substantial (e.g., de Bruin and Van Gog, 2012).

One practical approach aimed at fostering monitoring and accurate JOLs in German schools is the use of “I-can-lists,” which contain so-called “can-judgments” (Schneider et al., 2012). These judgments typically begin with the phrase “I can...” followed by specific learning objectives, such as “... name the different kinds of polygons” or “... draw points into a Cartesian coordinate system” (Hoffmann, 2016). While can-judgments share similarities with JOLs, they represent a specific form that explicitly emphasizes perceived competence and may relate to self-efficacy. Unlike more traditional JOLs, which often focus on the likelihood of recalling specific information (e.g., term-specific JOLs in Ariel et al., 2021), can-judgments emphasize perceived competence in applying knowledge and are typically framed as broader learning goals.

Despite their widespread use, there seems to be no systematic research supporting the effectiveness of can-judgments in fostering learning. Moreover, the structure of can-lists varies between schools, for example, in whether they include accompanying exercises designed as retrieval practice (Schilderoth, 2016). This raises the questions of whether can-judgments are effective and, if so, whether their effectiveness stems from potential reactive effects, similar to those observed in JOL research (e.g., Zhao et al., 2022; Li et al., 2022), or whether their benefits primarily arise when combined with retrieval practice (e.g., Ariel et al., 2021).

Present study

This study explored the reactive effect of can-judgments, a specific form of JOLs, in combination with retrieval practice among German intermediate track students (i.e., students attending “Realschule”) when engaging in self-regulated learning from text. Examining this population is particularly relevant because prior research on this issue has focused on adult learners (e.g., Ariel et al., 2021; Zhao et al., 2023). However, middle school students, especially those with lower to intermediate achievement levels, may particularly benefit from structured metacognitive support, as they often struggle with accurately judging their learning (e.g., Prinz et al., 2020; de Bruin and Van Gog, 2012). Additionally, despite their widespread use in German middle schools, can-judgments seem to lack scientific validation regarding their effectiveness in improving self-regulated learning. Finally, it remains unclear whether more accurate judgments after retrieval practice could translate into greater improvements in text comprehension following a regulation phase, indicating more effective self-regulation.

The first goal of this study was to examine whether can-judgments and retrieval practice enhance learners’ text comprehension. Prior research has yielded mixed results: Ariel et al. (2021) found that making JOLs alone did not improve text comprehension, whereas retrieval practice did. Moreover, combining retrieval practice with JOLs led to a further improvement. Zhao et al. (2023) did not replicate this effect when examining the combination of retrieval practice and JOLs, though without assessing the individual effects of each component. This highlights the need for further investigation. Numerous studies, including Ariel et al. (2021), have demonstrated the effectiveness of retrieval practice in enhancing text comprehension (e.g., Hughes and Thomas, 2022; Rowland, 2014). Building on this evidence, we predicted that retrieval practice would enhance text comprehension compared to a control condition. Furthermore, based on Ariel et al. (2021), we expected that combining retrieval practice with can-judgments would lead to additional improvements in text comprehension (H1). The latter expectation diverges from Zhao et al. (2023) because we assumed that can-judgments, due to their structured “I can” format that emphasizes self-efficacy, may be particularly effective in combination with retrieval practice.

The second goal was to investigate the impact of retrieval practice on the accuracy of can-judgments. Research has shown that retrieval practice can enhance JOL accuracy (e.g., Ariel et al., 2021; Dunlosky et al., 2005; Dunlosky and Nelson, 1992). We assumed that this effect would also occur with can-judgments. Specifically, we predicted that retrieval practice prior to making can-judgments would result in more accurate can-judgments compared to only making can-judgments without prior retrieval practice (H2).

The third goal was to shed light on improvements in text comprehension after a regulation phase as an indicator of self-regulation. Prior research has shown that more accurate JOLs when learning from text contribute to more effective rereading decisions and in turn to improved comprehension (e.g., Thiede et al., 2003). Accordingly, because we assumed that retrieval practice in combination with can-judgments would lead to more accurate can-judgments (see H2), we predicted that engaging in both retrieval practice and making can-judgments would lead to a greater improvement in text comprehension after a regulation phase than making can-judgments only (H3).

Method

Sample and design

We conducted an a priori power analysis using G*Power. Based on previous research on the reactive effect of JOLs and retrieval practice when learning from text (Ariel et al., 2021; Zhao et al., 2022), we expected small to medium effect sizes. The power analysis yielded a required sample size of N = 259 participants (α = 0.05, β = 0.20, f = 0.175; Faul et al., 2007). To account for potential exclusions, we recruited 356 ninth-grade students from German “Realschulen.” In Germany, Realschulen are intermediate schools that offer education from fifth through tenth grade and are characterized by a practice-oriented educational pathway.
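For transparency, the reported power analysis can be approximated in code. A minimal sketch, assuming scipy is available and treating a main effect in the 2 × 2 design as an F test with one numerator degree of freedom and noncentrality λ = f²·N, as in G*Power's parameterization:

```python
# Sketch of the a priori power analysis (alpha = .05, power = .80, f = 0.175)
# for a main effect in the 2 x 2 between-subjects design; scipy is assumed.
from scipy.stats import f as f_dist, ncf

def power_main_effect(n_total, effect_f, alpha=0.05, n_groups=4):
    df1, df2 = 1, n_total - n_groups  # one numerator df for a 2 x 2 main effect
    critical_f = f_dist.ppf(1 - alpha, df1, df2)
    noncentrality = effect_f ** 2 * n_total
    return ncf.sf(critical_f, df1, df2, noncentrality)

# Smallest total N reaching 80% power; should land close to the reported 259.
n = 8
while power_main_effect(n, 0.175) < 0.80:
    n += 1
print(n)
```

Small discrepancies from G*Power's exact N can arise from rounding participants to equal group sizes.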

In these schools, challenges such as lower socioeconomic status contribute to low-to-medium academic performance, particularly in STEM subjects, where 22.9% of 15-year-old German students failed to reach the baseline proficiency level in PISA 2022. Moreover, German students’ science scores dropped from 503 points in PISA 2018 to 492 points in PISA 2022, reflecting a significant decline in performance (OECD, 2023). This underscores the importance of focusing research efforts on these students and STEM subjects.

We applied specific exclusion criteria: Participants were excluded if they spent an unreasonably short amount of time answering the questions (defined as more than two standard deviations below the mean), failed to complete all items within the time limits, lacked sufficient German language proficiency, or were diagnosed with dyslexia. As a result, 41 participants were excluded (28 due to dyslexia, 12 due to insufficient German language proficiency, and one due to excessively short task completion time). The final sample consisted of N = 315 participants, with an average age of 15.10 years (SD = 0.69), and 43.13% were female.
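The response-time criterion above (more than two standard deviations below the mean) might be implemented as in the following sketch; the completion times are invented for illustration.

```python
# Hypothetical illustration of the response-time exclusion rule (more than two
# standard deviations below the mean completion time); times are invented.
from statistics import mean, stdev

times = [24.1, 22.8, 26.5, 25.0, 9.3, 23.7, 27.2, 24.9]  # minutes per student
cutoff = mean(times) - 2 * stdev(times)
kept = [t for t in times if t >= cutoff]  # 9.3 min falls below the cutoff
```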

The study employed a 2 (JOLs vs. no JOLs) × 2 (retrieval practice vs. no retrieval practice) factorial design. Participants were randomly assigned to one of four groups. In the can-judgments + retrieval practice group (n = 79), participants engaged in retrieval practice followed by making can-judgments. In the can-judgments only group (n = 80), participants made can-judgments without engaging in prior retrieval practice. In the retrieval practice only group (n = 80), participants engaged in retrieval practice without making subsequent can-judgments. Finally, in the control group (n = 76), participants neither engaged in retrieval practice nor made can-judgments.

Materials

All materials and measures used are openly accessible at the Open Science Framework. All materials were specifically designed for this study and piloted with n = 37 ninth-grade students.

Text

The text was an expository physics text titled Ways of Heat Transfer and comprised 525 words. The text was specifically designed for ninth-grade students, with a Flesch Reading Ease score of 66. It described three mechanisms of heat transfer, namely, conduction, convection, and radiation, each in a dedicated section.
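The Flesch Reading Ease score combines average sentence length and average syllables per word. The sketch below assumes Amstad's German adaptation of the formula (the article does not state which variant was used); the sentence and syllable counts are invented, only the word count is taken from the text.

```python
# Amstad's German adaptation of the Flesch Reading Ease formula -- an
# assumption, since the article does not specify which variant was applied.
def flesch_reading_ease_de(n_words, n_sentences, n_syllables):
    avg_sentence_length = n_words / n_sentences
    avg_syllables_per_word = n_syllables / n_words
    return 180 - avg_sentence_length - 58.5 * avg_syllables_per_word

# 525 words (as reported); 35 sentences and 900 syllables are made up.
fre = flesch_reading_ease_de(525, 35, 900)
```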

Control task

Pilot testing revealed time-on-task differences between groups. Hence, to ensure that the time participants spent on the group-specific treatments would be comparable across groups, math problems were developed as filler tasks. These problems reflected typical ninth-grade tasks requiring basic arithmetic and were unrelated to the physics topic of the study (e.g., “Solve the equation 3y · 7 = 84 for y”).

Participants in the control group completed 10 math problems during the intervention phase. To equate time on task across groups, participants in the JOLs-only group additionally completed five of these math problems, and participants in the retrieval-practice-only group completed three of them after their respective tasks. Participants in the JOLs + retrieval practice group received no filler tasks.

Measures

Prior knowledge

Participants’ prior knowledge about the topic “ways of heat transfer” was assessed using an open-ended question asking them to write down everything they knew about heat transfer (i.e., heat conduction, heat radiation, and convection). Participants were given 90 s to respond, without the option to progress earlier. Participants received between 0 and 2 points based on the correctness and completeness of their responses. Zero points were assigned if no answer was provided or if the answer was completely incorrect. One point was assigned if either one mechanism of heat transfer (conduction, convection, or radiation) was correctly explained, or if correct examples or technical applications of heat transfer were provided. Two points were assigned if more than one mechanism of heat transfer was correctly explained. Two raters independently scored the participants’ answers to the prior knowledge question with substantial interrater agreement, Cohen’s κ = 0.62, 95% CI [0.53, 0.72].
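The reported interrater agreement can be illustrated with a small sketch of Cohen's kappa; the two raters' 0-2 point scores below are hypothetical, not the study's data.

```python
# A minimal sketch of Cohen's kappa for two raters' 0-2 point prior-knowledge
# scores; the ratings below are invented for illustration.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Kappa = (p_observed - p_expected) / (1 - p_expected)."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

scores_rater_1 = [0, 1, 2, 1, 0, 2, 1, 0, 1, 2]  # hypothetical
scores_rater_2 = [0, 1, 2, 0, 0, 2, 1, 1, 1, 2]  # hypothetical
print(round(cohens_kappa(scores_rater_1, scores_rater_2), 2))
```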

Retrieval practice

Retrieval practice consisted of 10 multiple-choice questions with four answer options, of which only one was correct. These questions focused on comprehension, requiring participants to draw inferences and apply the learned content rather than simply recalling factual information. For example, one retrieval practice question was: “The thermal conductivity of a frying pan describes …” with the answer options: (a) how much heat it can absorb, (b) how quickly thermal energy spreads within it, (c) how much thermal energy is released when it cools down, and (d) how much energy is needed to heat it to the desired temperature. In the retrieval practice only condition, all 10 items were presented together on a single scrollable page. In the can-judgments and retrieval practice condition, each retrieval practice item was presented together with a corresponding can-judgment referring to the specific competency targeted by the question.

Participants received one point for each correctly answered question and zero points for incorrect responses. No feedback was provided after completing retrieval practice.

Comprehension test

The comprehension test also consisted of 10 multiple-choice questions with four answer options, of which only one was correct. Each retrieval practice question had a corresponding parallel item in the comprehension test. For instance, a parallel item to the retrieval practice question presented above was: “A metal cube is heated in an oven. Its thermal conductivity determines… (a) how much thermal energy it can store, (b) how much energy is required to heat it, (c) how quickly thermal energy spreads within it, and (d) how much thermal energy it releases when cooling down.”

Each correct answer was awarded one point, while incorrect answers received zero points. As in the retrieval practice phase, no feedback was provided. Participants took this comprehension test twice: immediately after the intervention and again after learning from the text for a second time.

JOLs

JOLs were operationalized as target-specific can-judgments focusing on specific learning objectives. Each judgment began with the phrase “I can,” followed by a specific statement (e.g., “I can give examples of good and bad thermal conductors”). Participants provided 10 can-judgments, each corresponding to one specific section of the text and hence to one specific question of the retrieval and comprehension test. In the can-judgments only group, participants provided these judgments after reading, without answering any retrieval practice questions beforehand. In the can-judgments + retrieval practice group, the can-judgments were presented directly below each retrieval practice question, so that participants first had to answer a retrieval practice question and then immediately rate their perceived competence. The judgments were given on a 5-point Likert scale (0 = definitely cannot to 4 = definitely can).

JOL accuracy

JOL accuracy was operationalized as relative accuracy by calculating intraindividual Gamma correlations between can-judgments and corresponding test performance scores in the first comprehension test (Nelson, 1984). These Gamma correlations reflect the degree to which a participant’s judgments discriminate, at the item level, between comprehended and non-comprehended information. Gamma correlations could only be computed for the JOL groups (i.e., can-judgments only group and can-judgments + retrieval practice group), because the other two groups (i.e., retrieval practice only group and control group) made no judgments.
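As an illustration, the intraindividual Gamma correlation for one participant's ten can-judgments and item scores could be computed as sketched below; the data are hypothetical, and tied pairs are excluded, as in Goodman and Kruskal's definition.

```python
# Illustrative Goodman-Kruskal Gamma correlation between one hypothetical
# participant's ten can-judgments (0-4) and the corresponding item scores (0/1)
# on the first comprehension test.
def goodman_kruskal_gamma(judgments, scores):
    """Gamma = (C - D) / (C + D) over all item pairs; tied pairs are ignored."""
    concordant = discordant = 0
    for i in range(len(judgments)):
        for j in range(i + 1, len(judgments)):
            product = (judgments[i] - judgments[j]) * (scores[i] - scores[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    if concordant + discordant == 0:
        return float("nan")  # undefined when every pair is tied
    return (concordant - discordant) / (concordant + discordant)

jols = [4, 3, 0, 2, 4, 1, 0, 3, 3, 1]     # hypothetical can-judgments
correct = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]  # hypothetical item scores
print(round(goodman_kruskal_gamma(jols, correct), 2))  # → 0.91
```

A value near +1 indicates that higher can-judgments consistently went with correctly answered items; values near 0 indicate no discrimination.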

Piloting and item validation

The study was piloted with N = 37 ninth-grade students (Mage = 15.43, SDage = 0.60, 24% female) to test item difficulty and avoid potential ceiling or floor effects. The results showed that only 10 participants scored 100%, none scored 0%, and the mean score was 46.76%. Internal consistency of the retrieval practice and test questions was sufficient, with Cronbach’s alpha = 0.72.
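Cronbach's alpha for dichotomous item scores, as reported above, can be sketched as follows; the response matrix is invented and much smaller than the pilot data.

```python
# Illustrative Cronbach's alpha for dichotomous (0/1) item scores, as used for
# the ten retrieval/comprehension questions; the response matrix is invented.
def cronbachs_alpha(responses):
    """responses: one row of 0/1 item scores per participant."""
    n_items = len(responses[0])

    def sample_variance(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    item_variances = [
        sample_variance([row[i] for row in responses]) for i in range(n_items)
    ]
    total_variance = sample_variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: six pilot participants answering five items.
responses = [
    [1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
alpha = cronbachs_alpha(responses)
```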

Procedure

Data collection took place during a 45-min school lesson in the regular classrooms of the participating schools. One class was tested at a time, and testing lasted on average 24.34 min (SD = 5.21). Data were collected on laptops, using the software Inquisit Lab 6. Each laptop was randomly assigned to one of the four experimental conditions, with no visual differences, before students entered the classroom and were arbitrarily seated.

First, participants’ prior knowledge about the learning topic of the study (i.e., heat transfer) was assessed. Then, they read a physics text on heat transfer. The text was presented on a single page with a set reading time of 6 min, during which participants could not proceed to the next task.

After reading, participants engaged in group-specific tasks for a maximum of 10 min, with the option to proceed earlier if all tasks had been answered. In the can-judgments + retrieval practice group, participants first answered 10 multiple-choice retrieval practice questions about the text. After each retrieval practice question, they made a corresponding can-judgment, assessing how well they believed they could apply the relevant competency to similar tasks. Can-judgments were given on a Likert scale from 0 (definitely cannot) to 4 (definitely can). In the can-judgments only group, participants made 10 can-judgments about their comprehension without engaging in retrieval practice, followed by five filler tasks. In the retrieval practice only group, participants completed 10 retrieval practice questions, followed by three filler tasks, but did not make can-judgments. Finally, in the control group, participants did not engage in either retrieval practice or can-judgments but completed 10 filler tasks. The filler tasks were mathematical equations of typical difficulty for 9th-grade students: ten in the control condition, five in the can-judgments only group, and three in the retrieval practice only group.

Next, participants completed a comprehension test consisting of 10 multiple-choice questions. These questions closely aligned with those used in the retrieval practice phase, with each test question having a parallel counterpart in the retrieval practice question pool. The assignment of the questions to the retrieval practice or test phase was randomized. The time limit for completing the comprehension test was 10 min, with the option to proceed earlier if all questions had been answered.

Then, participants were given a second opportunity to study the text, again with a 6-min reading time limit, but this time with the option to proceed early. Finally, the comprehension test was repeated with the same questions as in the first test to measure improvement. At the end of the study, participants provided demographic information (Figures 1, 2).

Figure 1. Flowchart of study.

Figure 2. Example of a can-judgment paired with parallel items used for retrieval practice and test. Test item 1 and retrieval practice item 1 address the same underlying concept of thermal conductivity with different examples (a frying pan vs. a metal cube in an oven); color-coded answer options mark the parallels between the items, and the can-judgment scale appears below.

Results

The data analysis was performed in R (R Core Team, 2025), using RStudio 2024.09.0. Descriptive statistics for all study variables are shown in Table 1.

Table 1. Means and standard deviations of the variables in the four groups.

Preliminary analyses

To test for comparability and successful randomization of the four experimental groups, preliminary analyses were conducted on prior knowledge, time on task, and age.

A 2 × 2 ANOVA on prior knowledge with the factors JOLs (yes vs. no) and retrieval practice (yes vs. no) showed neither a significant main effect of JOLs, F(1, 311) < 0.01, p = 0.950, η2 < 0.01, 95% CI [0.00, 0.00] (no effect), nor of retrieval practice, F(1, 311) = 0.15, p = 0.696, η2 < 0.01, 95% CI [0.00, 0.02] (no effect), nor a significant interaction between JOLs and retrieval practice, F(1, 311) = 0.05, p = 0.827, η2 < 0.01, 95% CI [0.00, 0.01] (no effect). There were no significant correlations between prior knowledge and the dependent variables: test performance (r = 0.10, p = 0.070, 95% CI [−0.01, 0.21]), JOL accuracy (r = 0.05, p = 0.581, 95% CI [−0.11, 0.20]), and test improvement (r = 0.01, p = 0.821, 95% CI [−0.10, 0.12]).

A 2 × 2 ANOVA on average time spent on group-specific treatments with the factors JOLs (yes vs. no) and retrieval practice (yes vs. no) showed significant main effects of JOLs, F(1, 311) = 5.40, p = 0.021, η2 = 0.02, 95% CI [0.00, 0.06] (small effect), and retrieval practice, F(1, 311) = 45.43, p < 0.001, η2 = 0.13, 95% CI [0.07, 0.20] (medium effect), but no significant interaction effect between JOLs and retrieval practice, F(1, 311) = 0.57, p = 0.451, η2 < 0.01, 95% CI [0.00, 0.02] (no effect). Students in the retrieval practice conditions spent significantly more time on the treatment (M = 5.57, SD = 2.50) than students in the conditions without retrieval practice (M = 4.90, SD = 1.83). Students in the can-judgments conditions spent less time on average (M = 5.46, SD = 2.11) than students in the conditions without can-judgments (M = 6.03, SD = 2.54). There were no significant correlations between the time spent on treatment and the dependent variables: test performance (r = 0.04, p = 0.523, 95% CI [−0.07, 0.15]), JOL accuracy (r = −0.03, p = 0.716, 95% CI [−0.19, 0.13]), and test improvement (r = −0.04, p = 0.529, 95% CI [−0.15, 0.08]).
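For a one-degree-of-freedom effect, the reported η2 values can be recovered directly from the F statistic and its degrees of freedom via η2 = (df_effect · F) / (df_effect · F + df_error). A short Python sketch (illustrative; the analyses themselves were run in R) reproduces the two significant effects above:

```python
def eta_squared(f_stat, df_effect, df_error):
    """(Partial) eta-squared recovered from an F statistic and its degrees of freedom."""
    return (df_effect * f_stat) / (df_effect * f_stat + df_error)

# Time-on-task ANOVA reported above
print(round(eta_squared(45.43, 1, 311), 2))  # retrieval practice → 0.13
print(round(eta_squared(5.40, 1, 311), 2))   # JOLs → 0.02
```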

A 2 × 2 ANOVA on age with the factors JOLs (yes vs. no) and retrieval practice (yes vs. no) showed neither a significant main effect of JOLs, F(1, 310) = 1.22, p = 0.270, η2 < 0.01, 95% CI [0.00, 0.03] (no effect), nor of retrieval practice, F(1, 310) = 0.64, p = 0.423, η2 < 0.01, 95% CI [0.00, 0.02] (no effect), nor a significant interaction between JOLs and retrieval practice, F(1, 310) = 0.22, p = 0.642, η2 < 0.01, 95% CI [0.00, 0.02] (no effect). There was a significant, albeit small, negative correlation between participants’ age and test performance: r = −0.11, p = 0.048, 95% CI [−0.22, 0.00]. The correlations between age and JOL accuracy (r = −0.06, p = 0.437, 95% CI [−0.22, 0.10]) and test improvement (r = 0.01, p = 0.877, 95% CI [−0.10, 0.12]) were not significant.

Overall, these analyses showed that randomization of participants was successful and that group differences before the intervention were negligible or not significant. However, there were significant group differences in treatment time, suggesting that additional filler tasks in the conditions with shorter treatments would have helped equalize time on task; this is discussed in more detail in the Limitations and practical implications section. Moreover, as there was a significant correlation between age and test performance, the analyses on comprehension (Hypothesis 1) were also calculated with age as a covariate.

Hypothesis 1—the effects of JOLs and retrieval practice on text comprehension

A 2 × 2 ANOVA on text comprehension with the factors JOLs (yes vs. no) and retrieval practice (yes vs. no) was conducted to examine whether those factors, individually or in combination, affected students’ text comprehension. The results showed neither a significant main effect of JOLs, F(1, 311) = 0.06, p = 0.809, η2 < 0.01, 95% CI [0.00, 0.01] (no effect), nor of retrieval practice, F(1, 311) < 0.01, p = 0.981, η2 < 0.01, 95% CI [0.00, 0.00] (no effect), nor a significant interaction effect, F(1, 311) = 1.03, p = 0.311, η2 < 0.01, 95% CI [0.00, 0.03] (no effect). Because there was a significant correlation between age and test performance, we also ran this analysis with age as a covariate, but the results remained similar. Thus, our first hypothesis, that retrieval practice would improve text comprehension and that combining it with JOLs would provide an additional benefit, was not supported.

Hypothesis 2—the effects of JOLs and retrieval practice on JOL accuracy

To determine whether JOL accuracy differed between students who made can-judgments only and students who engaged in both can-judgments and retrieval practice, a t test was conducted. The analysis showed no significant difference between the two groups, t(150) = 0.16, p = 0.870, d = 0.03, 95% CI [−0.29, 0.34] (no effect). Thus, our second hypothesis, that retrieval practice would improve JOL accuracy, was not supported.
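The reported d is Cohen's d for two independent groups: the mean difference divided by the pooled standard deviation. A minimal sketch (Python for illustration; the gamma scores below are hypothetical, not the study's data):

```python
import math

def cohens_d(group1, group2):
    """Cohen's d for two independent groups: mean difference / pooled SD."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    var1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    var2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical gamma scores for two small groups
g1 = [0.2, 0.5, -0.1, 0.4, 0.3]
g2 = [0.1, 0.4, 0.0, 0.2, 0.3]
print(round(cohens_d(g1, g2), 2))  # → 0.3
```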

Hypothesis 3—the effects of JOLs and retrieval practice on improvements in text comprehension

To examine the extent to which text comprehension improved from the first to the second test in the can-judgments + retrieval practice group versus the can-judgments only group, a mixed ANOVA was conducted. The analysis included one between-subjects factor (group: can-judgments + retrieval practice group vs. can-judgments only group) and one within-subjects factor (comprehension test time: first test completion vs. second test completion). This analysis allowed us to examine to what extent students’ comprehension changed through rereading from the first to the second test, depending on group assignment. The results revealed no significant main effects of group, F(3, 311) = 0.16, p = 0.923, η2 < 0.01, 95% CI [0.00, 0.01] (no effect), and time, F(3, 311) = 2.93, p = 0.088, η2 < 0.01, 95% CI [0.00, 0.04] (no effect). The interaction effect between group and time was also not significant, F(3, 311) = 0.55, p = 0.651, η2 < 0.01, 95% CI [0.00, 0.02] (no effect). Since no significant effect was found, our third hypothesis, that retrieval practice in addition to can-judgments would foster improvements in text comprehension, was not supported.

Discussion

Our study investigated potential reactive effects of JOLs and retrieval practice on self-regulated learning from text. Unlike previous studies (e.g., Ariel et al., 2021; Zhao et al., 2023), we focused on German middle school students in intermediate tracks. This setting allowed us to examine whether JOLs and retrieval practice foster self-regulated learning under conditions that more closely resemble everyday classroom environments. Moreover, by using can-judgments as an authentic and widely applied form of JOLs, our study gains greater practical relevance.

Contrary to Ariel et al. (2021) and Zhao et al. (2023), we did not provide feedback after retrieval practice. This was done to ensure that any observed reactive effects would be attributable to JOLs and answering retrieval practice questions rather than to feedback. Prior research has shown that although retrieval practice has positive effects on learning even without feedback, it is typically stronger with feedback (e.g., Pashler et al., 2005; Rowland, 2014). Hence, giving no feedback may have reduced both the effectiveness of retrieval practice and the availability of cues for students to base their JOLs on.

Consistent with Zhao et al. (2023) and in contrast to Ariel et al. (2021), we found no significant effects of combining JOLs and retrieval practice on comprehension (H1). Both Ariel et al. (2021) and Zhao et al. (2023) tested adult participants (Mage ≈ 34.00 years), often with higher education levels, whereas our study focused on German middle school students (Mage = 15.10 years) with rather low to intermediate academic achievement. Differences in population characteristics could have contributed to these discrepancies, particularly in comparison to Ariel et al. (2021). Intermediate school students may engage in less effortful covert retrieval when making JOLs due to lower academic skills and motivation. For example, when providing their judgments, the students may have relied more on surface-level cues, such as ease of reading, rather than engaging in deep processing, leading to weaker effects of JOLs on comprehension.

The immediate timing of JOLs in our study, that is, making JOLs directly after reading, aligns with the procedure of Ariel et al. (2021). Delayed JOLs, that is, metacognitive judgments made after a temporal delay following initial study, have been found to be more effective in triggering long-term memory retrieval processes (e.g., Dunlosky and Nelson, 1992; Tauber et al., 2015). Thus, future research should explore whether delaying JOLs might enhance their efficiency, particularly for populations and contexts where immediate JOLs show limited reactive effects. Moreover, the relatively short duration of the study, and especially of the learning phase, may have restricted opportunities for deeper engagement with the material. As a consequence, both the potential benefits of retrieval practice and the reactive effects of JOLs might not have had sufficient time to develop, which could help explain the absence of significant effects.

Our study also revealed that retrieval practice without feedback did not improve comprehension, deviating from findings that consistently highlight its benefits when feedback is provided (e.g., Ariel et al., 2021; Dunlosky et al., 2005; Dunlosky and Nelson, 1992) and typically also when no feedback is given (e.g., Rowland, 2014). In contrast to Rowland (2014), our result suggests that younger learners might depend on the external validation of their responses through feedback. Feedback is also known to enhance metacognitive monitoring (Hattie and Timperley, 2007) and facilitate self-regulation (e.g., Butler and Winne, 1995; Pashler et al., 2005). Without feedback, retrieval practice may fail to highlight knowledge gaps, limiting its effectiveness for comprehension. Future research should investigate whether retrieval practice alone is less beneficial for younger learners than for adult learners, and to what extent feedback contributes to its effectiveness.

Retrieval practice did not enhance students’ judgment accuracy in our study (H2). This suggests that students did not, or only ineffectively, use cues arising from retrieval practice to inform their can-judgments. Prior research has shown that retrieval practice improves JOL accuracy when feedback is available (e.g., Ariel et al., 2021; Dunlosky and Thiede, 1998). Contrary to Ariel et al. (2021), we did not include feedback in order to isolate the effects of retrieval practice itself. Without feedback, learners might lack valid cues for assessing their knowledge, which can reduce the benefits of retrieval practice. Instead, they may rely on heuristic cues such as ease of recognition (Metcalfe and Finn, 2008) or familiarity with the material (Begg et al., 1992), which are less reliable indicators of actual comprehension. The measurement sensitivity of the 5-point Likert scale used for the can-judgments might also have been low, especially compared to Ariel et al. (2021), where participants gave JOLs on a scale from 0 to 100. This may have contributed to the absence of significant effects of can-judgments by limiting the statistical power to detect changes in JOL accuracy.

We found no benefit of engaging in retrieval practice in addition to making can-judgments on improvements in text comprehension after restudying (as an indicator of self-regulation; H3). Although the “I can” formulation of the judgments should prompt students to reflect more deeply on specific competencies, it is possible that students processed the can-judgments too superficially to elicit a motivational-cognitive effect. This could explain the absence of the expected synergistic effect with retrieval practice on regulation processes and, in turn, on comprehension improvement. Future studies could address this limitation by prompting students to provide justified can-judgments, such as “I can explain this section because I have understood the main argument,” to encourage deeper self-reflection. Additionally, providing concrete feedback on the accuracy of their can-judgments could help students better align their self-assessments with their actual comprehension, potentially increasing the effects of can-judgments and retrieval practice.

Contrary to our assumption (H3), combining JOLs with retrieval practice did not enhance self-regulation compared to JOLs alone. This result aligns with our finding concerning JOL accuracy, which was no better when can-judgments were combined with retrieval practice than when can-judgments were made alone. Since accurate judgments are critical for effective restudy decisions (e.g., Metcalfe and Finn, 2008; Thiede et al., 2003), this outcome is not surprising. As indicated above, the lack of feedback might have prevented students from making more accurate judgments and hence from effectively regulating their learning when rereading, diminishing the self-regulation benefits typically associated with retrieval practice.

Interestingly, an exploratory analysis showed a significant correlation between JOL accuracy and test performance improvement (r = 0.21, p = 0.010). This finding indicates that students who provided more accurate JOLs more strongly improved their comprehension from the first to the second test completion. This result corroborates prior research, which showed that more accurate judgments when learning from texts produce more effective regulation (e.g., Thiede et al., 2003). Hence, this result reinforces the importance of accurate JOLs for effective self-regulation across different learning formats (e.g., Dunlosky and Thiede, 1998; Thiede et al., 2003).

To conclude, our study extends the discussion on the reactive effect of JOLs. While prior research has demonstrated clear benefits of making JOLs for learning simple materials such as word pairs (e.g., Double et al., 2018), our findings, together with those of Ariel et al. (2021) and Zhao et al. (2023), suggest that these benefits may not always generalize to more complex materials like text. Moreover, the combination of can-judgments and retrieval practice does not automatically lead to enhanced learning outcomes when applied to complex materials like text, especially in the absence of feedback and among younger, less advanced learners. This highlights the need for further investigation into how the learning material influences the effect of making can-judgments and engaging in retrieval practice on learning outcomes. For example, future research may examine whether can-judgments and retrieval practice interact differently depending on material complexity. Long-term effects of can-judgments should also be investigated, given that can-lists are often promoted as a tool that supports learning and makes progress visible to students (e.g., Hoffmann, 2016; Schilderoth, 2016).

Limitations and practical implications

In this study, no feedback was provided in order to examine effects of retrieval practice per se. However, the absence of feedback might have constrained the effectiveness of retrieval practice. Feedback is known to enhance self-regulated learning (e.g., Pashler et al., 2005), suggesting that its inclusion could yield different results. In practical contexts, teachers might therefore integrate feedback following retrieval practice tasks to maximize their educational value. Nonetheless, it should be noted that feedback was not experimentally manipulated in the present study, limiting claims about its role.

Additionally, the immediate timing of the JOLs in the present study may have reduced the likelihood of covert retrieval processes. Exploring delayed JOLs could provide additional insights, as children as young as 6 years, as well as adults, have been found to make more accurate JOLs when these are delayed (de Bruin and Van Gog, 2012).

Although treatment duration differed significantly between groups, it is unlikely that this difference substantially affected the results. All treatments were designed to be cognitively manageable, and even the longer treatments were relatively short (approximately 6–7 min in the experimental conditions), minimizing the risk of fatigue or overload that could have influenced learning outcomes. Moreover, no significant correlations were found between treatment time and any of the dependent variables. Therefore, adding more filler tasks for the can-judgments only and control groups to equalize treatment duration would probably not have altered the findings.

Furthermore, it is possible that the use of multiple-choice questions for retrieval practice and testing limited the effectiveness of can-judgments and retrieval practice. This question format is rather untypical in STEM subjects, and students may not be accustomed to answering multiple-choice questions in a physics class. Consequently, both self-assessments and performance might have differed if open-ended questions, which are more typical in physics, had been used. Nonetheless, it should be noted that test difficulty was adequate, with an average test score of 4.67 (out of 10); only six students scored the maximum of 10 points, and no student scored zero points. Thus, there is no strong indication that students struggled with the multiple-choice format in this specific context.

In summary, our study suggests that reactive effects of JOLs and retrieval practice may depend on the learning context and population; notably, reactive effects have so far been found primarily when learning single words or word pairs (Double et al., 2018). Further research should explore the role of the learning materials, the kind of JOLs, and feedback to clarify the mechanisms underlying these effects and to optimize strategies for self-regulated learning.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by Ethikkommission der Pädagogischen Hochschule Karlsruhe. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants’ legal guardians/next of kin.

Author contributions

MF: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. NH: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. AB: Conceptualization, Methodology, Writing – review & editing. AP-W: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. MF was supported within the Doctoral and Postdoctoral Program “AQUA-d” (Forschungs- und Nachwuchskolleg “Aufgabenqualität im digital gestützten Unterricht”) of the Ministry of Science, Research and Arts Baden-Württemberg.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Ariel, R., Karpicke, J. D., Witherby, A. E., and Tauber, S. K. (2021). Do judgments of learning directly enhance learning of educational materials? Educ. Psychol. Rev. 33, 693–712. doi: 10.1007/s10648-020-09556-8

Crossref Full Text | Google Scholar

Baker, J. M. C., and Dunlosky, J. (2006). Does momentary accessibility influence metacomprehension judgments? The influence of study—judgment lags on accessibility effects. Psychon. Bull. Rev. 13, 60–65. doi: 10.3758/BF03193813,

PubMed Abstract | Crossref Full Text | Google Scholar

Begg, I., Anas, A., and Farinacci, S. (1992). Dissociation of processes in belief: source recollection, statement familiarity, and the illusion of truth. J. Exp. Psychol. Gen. 121, 446–458. doi: 10.1037/0096-3445.121.4.446

Crossref Full Text | Google Scholar

Blume, F., Irmer, A., Dirk, J., and Schmiedek, F. (2022). Day-to-day variation in students' academic success: the role of self-regulation, working memory, and achievement goals. Dev. Sci. e13301. doi: 10.1111/desc.13301

Crossref Full Text | Google Scholar

Blunt, J. R., and Karpicke, J. D. (2014). Learning with retrieval-based concept mapping. J. Educ. Psychol. 106, 849–858. doi: 10.1037/a0035934

Crossref Full Text | Google Scholar

Butler, D. L., and Winne, P. H. (1995). Feedback and self-regulated learning: a theoretical synthesis. Rev. Educ. Res. 65:245. doi: 10.2307/1170684

Crossref Full Text | Google Scholar

Carpenter, S. K., Pashler, H., and Cepeda, N. J. (2009). Using tests to enhance 8th grade students’ retention of U.S. history facts. Appl. Cogn. Psychol. 23, 760–771. doi: 10.1002/acp.1507

Crossref Full Text | Google Scholar

de Bruin, A. B. H., and Van Gog, T. (2012). Improving self-monitoring and self-regulation: from cognitive psychology to the classroom. Learn. Instr. 22, 245–252. doi: 10.1016/j.learninstruc.2012.01.003

Crossref Full Text | Google Scholar

de Bruin, A. B., Thiede, K. W., Camp, G., and Redford, J. (2011). Generating keywords improves metacomprehension and self-regulation in elementary and middle school children. Journal of experimental Child Psychology, 109, 294–310. doi: 10.1016/j.jecp.2011.02.005,

PubMed Abstract | Crossref Full Text | Google Scholar

Double, K. S., and Birney, D. P. (2017). Are you sure about that? Eliciting confidence ratings may influence performance on raven’s progressive matrices. Think. Reason. 23, 190–206. doi: 10.1080/13546783.2017.1289121

Crossref Full Text | Google Scholar

Double, K. S., Birney, D. P., and Walker, S. A. (2018). A meta-analysis and systematic review of reactivity to judgements of learning. Memory 26, 741–750. doi: 10.1080/09658211.2017.1404111,

PubMed Abstract | Crossref Full Text | Google Scholar

Dunlosky, J., Hertzog, C., Kennedy, M., and Thiede, K. (2005). The self-monitoring approach for effective learning. Cogn. Technol. 10, 4–11.

Google Scholar

Dunlosky, J., and Nelson, T. O. (1992). Importance of the kind of cue for judgments of learning (JOL) and the delayed-JOL effect. Mem. Cogn. 20, 374–380. doi: 10.3758/BF03210921,

PubMed Abstract | Crossref Full Text | Google Scholar

Dunlosky, J., and Rawson, K. A. (2012). Overconfidence produces underachievement: inaccurate self evaluations undermine students’ learning and retention. Learn. Instr. 22, 271–280. doi: 10.1016/j.learninstruc.2011.08.003

Crossref Full Text | Google Scholar

Dunlosky, J., and Thiede, K. W. (1998). What makes people study more? An evaluation of factors that affect self-paced study. Acta Psychol. 98, 37–56. doi: 10.1016/s0001-6918(97)00051-6,

PubMed Abstract | Crossref Full Text | Google Scholar

Dunlosky, J., and Thiede, K. W. (2013). Four cornerstones of calibration research: Why understanding students’ judgments can improve their achievement. Learning and Instruction, 24, 58–61. doi: 10.3390/jintelligence10040101,

PubMed Abstract | Crossref Full Text | Google Scholar

Faul, F., Erdfelder, E., Lang, A.-G., and Buchner, A. (2007). G*power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav. Res. Methods 39, 175–191. doi: 10.3758/BF03193146

Crossref Full Text | Google Scholar

Hattie, J., and Timperley, H. (2007). The power of feedback. Rev. Educ. Res. 77, 81–112. doi: 10.3102/003465430298487 (Original work published 2007)

Crossref Full Text | Google Scholar

Hoffmann, K. (2016). “Eigene Lernwege” in GGG Verband für Schulen des gemeinsamen Lernens & Debus Pädagogik Verlag (Hrsg.), Leistungen ermitteln – Lernen fördern, Frankfurt am Main, 36–39.

Google Scholar

Hughes, G. I., and Thomas, A. K. (2022). When memory and metamemory align: How processes at encoding influence delayed judgment-of-learning accuracy. Journal of Intelligence, 10:101. doi: 10.3390/jintelligence10040101,

PubMed Abstract | Crossref Full Text | Google Scholar

Janes, J., Rivers, M., and Dunlosky, J. (2018). The influence of making judgments of learning on memory performance: positive, negative, or both? Psychon. Bull. Rev. 25, 2356–2364. doi: 10.3758/s13423-018-1463-4,

PubMed Abstract | Crossref Full Text | Google Scholar

Jönsson, F. U., Hedner, M., and Olsson, M. J. (2012). The testing effect as a function of explicit testing instructions and judgments of learning. Exp. Psychol. 59, 251–257. doi: 10.1027/1618-3169/a000150,

PubMed Abstract | Crossref Full Text | Google Scholar

Karpicke, J. D. (2017). “Retrieval-based learning: a decade of progress” in Learning and memory: A comprehensive reference (Munich: Elsevier), 487–514.

Google Scholar

Kemp, P. L., Loaiza, V. M., and Wahlheim, C. N. (2023). Testing can enhance episodic memory updating in younger and older adults. Psychol. Aging 38, 656–669. doi: 10.1037/pag0000776,

PubMed Abstract | Crossref Full Text | Google Scholar

Larsen, D., and Dornan, T. (2013). Quizzes and conversations: exploring the role of retrieval in medical education. Med. Educ. 47, 1236–1241. doi: 10.1111/medu.12274,

PubMed Abstract | Crossref Full Text | Google Scholar

Li, B., Zhao, W., Zheng, J., Hu, X., Su, N., Fan, T., et al. (2022). Soliciting judgments of forgetting reactively enhances memory as well as making judgments of learning: empirical and meta-analytic tests. Mem. Cogn. 50, 1061–1077. doi: 10.3758/s13421-021-01258-y,

PubMed Abstract | Crossref Full Text | Google Scholar

Metcalfe, J., and Finn, B. (2008). Evidence that judgments of learning are causally related to study choice. Psychon. Bull. Rev. 15, 174–179. doi: 10.3758/PBR.15.1.174

Crossref Full Text | Google Scholar

Nelson, T. (1984). A comparison of current measures of feeling-of-knowing accuracy. Psychol. Bull. 95, 109–133. doi: 10.1037/0033-2909.95.1.109

Crossref Full Text | Google Scholar

OECD (2023). PISA 2022 results (volume I): The state of learning and equity in education, PISA, OECD Publishing, Paris. doi: 10.1787/53f23881-en

Crossref Full Text | Google Scholar

Pashler, H., Cepeda, N. J., Wixted, J. T., and Rohrer, D. (2005). When does feedback facilitate learning of words? J. Exp. Psychol. Learn. Mem. Cogn. 31, 3–8. doi: 10.1037/0278-7393.31.1.3,

PubMed Abstract | Crossref Full Text | Google Scholar

Prinz, A., Golke, S., and Wittwer, J. (2020). How accurately can learners discriminate their comprehension of texts? A comprehensive meta-analysis on relative metacomprehension accuracy and influencing factors. Educ. Res. Rev. 31:100358. doi: 10.1016/j.edurev.2020.100358

Crossref Full Text | Google Scholar

R Core Team. (2025). _R: A language and environment for statistical computing. R foundation for statistical computing, Vienna, Austria. Available online at: https://www.R-project.org

Google Scholar

Rhodes, M. G. (2015). Judgments of learning (Dunlosky, J. & S. (Uma K. Tauber), Hrsg.; Bd. 1). Oxford: Oxford University Press

Google Scholar

Rowland, C. A. (2014). The effect of testing versus restudy on retention: a meta-analytic review of the testing effect. Psychol. Bull. 140, 1432–1463. doi: 10.1037/a0037559,

Rowland, C., and DeLosh, E. (2014). Benefits of testing for nontested information: retrieval-induced facilitation of episodically bound material. Psychon. Bull. Rev. 21, 1516–1523. doi: 10.3758/s13423-014-0625-2

Schilderoth (2016). Selbstorganisiertes Lernen mit Unterstützung neuer Medien [Self-organized learning with the support of new media]. Max-Eyth Schule Alsfeld. Available online at: https://www.mes-alsfeld.de/fileadmin/user_upload/mes-alsfeld.de/FlyerInfoDownload/Hauptmenue/NeLe/Selbstorganisiertes_Lernen_mit_der_Unterstuetzung_neuer_Medien_im_Berufsschulunterricht.pdf (Accessed August 15, 2025).

Schneider, S., Graf, G., and Würth, T. (2012). Individuelle Förderung: Unterrichtssequenzen [Individual support: teaching sequences]. Available online at: https://lehrerfortbildung-bw.de/u_matnatech/bio/bs/2bfs/2bfs1/vorwort/ (Accessed August 15, 2025).

Tauber, S. K., Dunlosky, J., and Rawson, K. A. (2015). The influence of retrieval practice versus delayed judgments of learning on memory: resolving a memory-metamemory paradox. Exp. Psychol. 62, 254–263. doi: 10.1027/1618-3169/a000296

Tauber, S. K., Witherby, A. E., Dunlosky, J., Rawson, K. A., Putnam, A. L., and Roediger, H. L. (2018). Does covert retrieval benefit learning of key-term definitions? J. Appl. Res. Mem. Cogn. 7, 106–115. doi: 10.1016/j.jarmac.2016.10.004

Thiede, K. W., Anderson, M. C. M., and Therriault, D. (2003). Accuracy of metacognitive monitoring affects learning of texts. J. Educ. Psychol. 95, 66–73. doi: 10.1037/0022-0663.95.1.66

Yang, C., Zhao, W., Yuan, B., Luo, L., and Shanks, D. R. (2023). Mind the gap between comprehension and metacomprehension: meta-analysis of metacomprehension accuracy and intervention effectiveness. Rev. Educ. Res. 93, 143–194. doi: 10.3102/00346543221094083

Zhao, W., Li, B., Shanks, D. R., Zhao, W., Zheng, J., Hu, X., et al. (2022). When judging what you know changes what you really know: soliciting metamemory judgments reactively enhances children’s learning. Child Dev. 93, 405–417. doi: 10.1111/cdev.13689

Zhao, W., Xu, M., Xu, C., Li, B., Hu, X., Yang, C., et al. (2023). Judgments of learning following retrieval practice produce minimal reactivity effect on learning of education-related materials. J. Intell. 11:190. doi: 10.3390/jintelligence11100190

Keywords: can-judgments, judgment accuracy, reactive effect, retrieval practice, self-regulated learning, text comprehension

Citation: Fifka M, Hübner N, de Bruin A and Prinz-Weiß A (2026) “Yes we can?” – To what extent do can-judgments enhance self-regulated learning from text? Front. Educ. 10:1689514. doi: 10.3389/feduc.2025.1689514

Received: 20 August 2025; Revised: 03 December 2025; Accepted: 04 December 2025;
Published: 05 January 2026.

Edited by:

Evely Boruchovitch, State University of Campinas, Brazil

Reviewed by:

Bambang Subali, Universitas Negeri Semarang, Indonesia
Aeng Muhidin, Pamulang University, Indonesia

Copyright © 2026 Fifka, Hübner, de Bruin and Prinz-Weiß. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Martin Fifka, martin.fifka@ph-karlsruhe.de

ORCID: Martin Fifka, orcid.org/0009-0009-1011-0581
Nicolas Hübner, orcid.org/0000-0003-3528-8086
Anique de Bruin, orcid.org/0000-0001-5178-0287
Anja Prinz-Weiß, orcid.org/0000-0002-1097-3442

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.