Introduction
This article discusses the use of text-generating AI applications for providing feedback on students' texts and helping them revise their writing. While feedback from applications based on generative AI (for example, ChatGPT or dedicated tools such as Writing Coach, Writeable, and others) is often evaluated in terms of quality, even in comparison to feedback from human teachers (Mah et al., 2025; Seßler et al., 2025; Steiss et al., 2024), it is often overlooked that what matters most is that learners actually use and process the feedback to revise their texts (e.g., Lipnevich and Smith, 2022). In elementary and secondary schools, however, it appears that if text revision takes place at all, it rarely takes a form in which previously provided feedback guides the revision (Jansen et al., 2025; Rong et al., 2025). AI appears to have little or no impact on this initial situation, and it is this problem that the present article addresses. To this end, the article proceeds in three steps: the first presents the potential of genAI for providing feedback on learner texts and highlights the problem that many learners in elementary and secondary schools do not meaningfully engage with AI-generated feedback when revising. The second discusses how this may be due not only to the way AI works, but also to the unfavorable integration of feedback into teaching and learning processes. As an alternative, the third step outlines a teaching model that aims to achieve effective integration.
AI feedback for text revision: potentials and limitations
Feedback is widely recognized as one of the most powerful tools for supporting learner development in writing (Hattie and Timperley, 2007; MacArthur, 2016; Wisniewski et al., 2020). This holds true for learning to write in the context of L1 acquisition as well as in the context of L2 writing (Hyland, 2016; Hyland and Hyland, 2019). However, for feedback to be effective, it should follow an internal structure of feed up (Where am I going?), feed back (How am I going?), and feed forward (Where to next?), be relevant to the learners, and be given in a timely manner (Brandmo and Gamlem, 2025; Gibbs and Simpson, 2005). Yet providing feedback that is both timely and targeted remains a significant challenge, especially when teachers are faced with lengthy student texts and large, heterogeneous learning groups (Applebee and Langer, 2011). The public availability of AI could transform this area and take some of the burden of providing feedback off teachers' shoulders (e.g., Kolade et al., 2024; Nikolopoulou, 2025). Chatbots such as ChatGPT, LeChat, and Gemini, as well as specialized tools like Flint, Writeable, or Khan Academy's Writing Coach, now deliver instant feedback that numerous studies have shown to resemble human feedback in terms of how it is rated and assessed by researchers (Almegren et al., 2025; Steiss et al., 2024; Usher, 2025). GenAI thus seems to offer teachers a means of transforming feedback from a time-consuming burden into a more manageable and scalable practice.
However, AI-based feedback on texts has conceptual limitations from a writing education perspective. Firstly, the systems are hardly capable of providing feedback on the writing process: although some AI applications (e.g., Khan Academy's Writing Coach) can provide feedback on ideas and drafts, this feedback is only available after the texts or text fragments have been entered (and submitted) via the input field, not during the writing process itself. In other words, the systems cannot comment on a paragraph, sentence, or word that the writer has just begun. This is particularly problematic given that formative feedback during the writing process has been shown to be crucial for learner development (Graham et al., 2011, 2015). Mekheimer (2025) also notes that providing extensive feedback “all at once” at the end of the writing process is suboptimal from the perspective of cognitive load theory. Secondly, despite offering high levels of personalization, AI-based feedback applications may lack depth, logical coherence, and relevance with regard to the learner's text (Elmotri et al., 2025; Venter et al., 2025). It is not uncommon for such applications to focus on the structure of the text, particularly the introduction (e.g., Writing Coach), even when the actual underlying issue in the learner's text is a lack of reader orientation throughout. Individualization therefore often means measuring all learners in a personalized way while always applying “the same yardstick.” Thirdly, it is important to remember that AI-based feedback systems (and AI systems in general) only consider the text entered and offer direct solutions for improving this specific text. They do not provide any feedback on how writing in general could be improved. This contradicts the goals of learning to write, which also include, for example, identifying problem areas in one's own text and finding alternatives (by oneself) before implementing them (Bereiter and Scardamalia, 1987; Hayes and Flower, 1986). If AI applications take over the tasks of identifying problem areas and searching for alternatives, revising becomes a mindless process of working through AI suggestions, especially for inexperienced writers, which can have a long-term impact on their revision skills and motivation (Jantzen, 2003).
While these constraints are important, we would like to turn our attention to what is probably an even more pressing issue: the uptake of AI-generated feedback by learners when revising their writing. Feedback research has long pointed out that, in addition to the source of the feedback and the feedback message, the cognitive, behavioral, and affective processing of feedback are also relevant factors, as illustrated, for example, in the Student-Feedback Interaction Model (Lipnevich and Smith, 2022). According to the authors of this model, “feedback that is most conducive to improvement is feedback that is somehow processed” (Lipnevich and Smith, 2022: p. 2).
Regarding AI-based feedback applications, there is evidence that this processing by students often does not take place to a sufficient degree. Both quantitative (Jansen et al., 2025; Yu and Xie, 2025) and qualitative (Rong et al., 2025) analyses of how AI feedback is used in text revision reveal that the majority of primary and secondary school students do not draw on it in their revisions. In the context of L1 writing, Jansen et al. (2025) conducted a quantitative study examining the extent to which elementary and middle school students changed their texts after receiving feedback from the German FelloFish platform and before submitting them as “revised versions.” The results showed that 6,889 of the 14,236 learners surveyed (48%) made no changes at all in response to the AI feedback, while a further share of learners made only minor changes (Jansen et al., 2025: p. 833). In a detailed examination of the revision process with AI (the Chinese platform Unipus AIGC) among three students in an EFL class, Rong et al. (2025) observed that the highest-performing student in particular was able to efficiently incorporate the AI feedback into their own text, while the weakest learner hardly took the genAI feedback into account. Yu and Xie (2025) compared the uptake of AI feedback and teacher feedback during revision in high school and showed that AI feedback leads to high uptake (86%) for surface-level feedback (e.g., grammar), but only to low uptake (32%) at the meaning level. This suggests that the mere existence of AI feedback does not automatically result in its use. Notably, similar patterns were already observed in earlier AWE systems such as Criterion (Schroeder et al., 2008), indicating that this problem is not unique to AI but rather endemic to automated feedback in general.
Mekheimer (2025) examines how AI-based feedback applications targeting surface features of texts, such as grammar and sentence structure, are used by EFL writers at university, considering both the influence of AI feedback on writing quality and how frequently it is used. The results show that advanced university writers frequently draw on the feedback when revising their work. However, this requires “critical engagement” (Mekheimer, 2025) on the part of the writers, who must weigh which aspects of the feedback are actually useful for their revision. This suggests that such complex mental processes often exceed what can be expected of primary and secondary school writers, preventing them from using the feedback in the first place.
Implication: rethinking how feedback is integrated into instruction
Clearly, the mere availability of rapid, personalized feedback through genAI is not a panacea. The assumed benefit of relieving teachers of the burden of providing feedback (e.g., Steiss et al., 2024; Wrede et al., 2023) is immediately negated if learners do not engage with that feedback. In attempting to explain why learners do not take up AI-generated feedback during revision, it would be reasonable to consider potential differences between human and AI feedback. Mah et al. (2025), for example, observe that AI and human teachers may focus on different aspects of a text, providing feedback at different levels of the text (e.g., primarily at the sentence level in the case of AI). We suspect, however, that neither the (differences in) quality nor the (technical) functionality of AI-generated feedback alone is responsible for its lack of use by learners, but rather an unfavorable embedding of AI-generated feedback in teaching and learning processes. The reason for this assumption is that the lack of uptake of human feedback for text revision, when it is not properly embedded in teaching, was recognized long before the advent of AI (MacDonald, 1991; Sinclair and Cleland, 2007).
We suspect that teachers often present (genAI) feedback as isolated input rather than as a starting point for communication between learners or between learners and teachers. Feedback can be defined as a “dynamic interaction between teacher and student aimed at facilitating learning” (Brandmo and Gamlem, 2025: p. 2). The prerequisite for the effectiveness of any type of feedback is therefore that it is embedded in a context that actually allows for such dynamic interaction. It would be a mistake to assume that the interactivity of AI feedback by itself (the mere fact that the AI responds interactively) is sufficient for learning. The task of teachers is (probably now more than ever) to create spaces for dynamic interactions about feedback. Although the feedback as such might already be generated by generative AI, teachers should now focus on creating opportunities for discussion about it, i.e., on integrating it into social interaction.
A multi-stage framework for the implementation of feedback
We suggest approaching AI feedback not as an endpoint but as a catalyst for communication in the classroom. Multi-stage settings can encourage learners to interact with feedback at different levels (Figure 1).
After producing a first draft (phase 1), which generative AI can assist with if necessary, for example by generating ideas, learners receive feedback from the AI in phase 2. In the first crucial revision phase (phase 3), students revise their texts individually, using the AI feedback as a low-stakes starting point. Gero et al. (2023) demonstrate that one potential of AI-based feedback systems is that learners can seek feedback on aspects of their writing that are (still) too personal to share with others, so that the AI can act as an anonymous supporting actor. This seemingly private work phase helps them address obvious issues in their texts while lowering the threshold for sharing their work later on. Next (phase 4), peer collaboration, whether in pairs or small groups, creates opportunities to discuss both the feedback and its application. This fulfills the requirement that feedback is, by definition, a dialogical process and a co-construction in which those receiving feedback should never be reduced to the role of mere recipients (Brandmo and Gamlem, 2025; Busse et al., 2022). Finally, bringing these discussions back to the whole class (phase 5) enables collective reflection on the quality, appropriateness, and limitations of AI-generated comments. In such a model, the teacher takes on a new role: not as the ultimate judge of text quality, but as a facilitator who helps learners interpret feedback, identify mismatches, and refine their revisions accordingly. This shifts the teaching dynamic: the teacher is no longer responsible for providing the feedback themselves, but still maintains authority in that they can disagree with or question the AI feedback. After these stages, students should revise their texts once more to integrate the insights gained from the process.
It should be noted, first, that not every writing assignment lends itself to such a multi-stage approach; when the texts to be written have clearly defined (and traditional) structures (e.g., argumentative essays), the pattern-oriented approach of AI can provide particularly valuable support in the feedback process. Second, the age of the learners must be taken into account. Understanding AI-based feedback on one's own (phase 2) places considerable demands on reading skills; in the primary school context, teacher mediation would certainly be necessary.
Conclusion and implications
The position we have outlined in this article has clear implications for AI-oriented research. We propose that, in writing research, equal consideration must be given both to the quality of AI feedback and to how it is embedded in writing instruction. Rather than merely assessing differences in the quality and quantity of human and AI feedback, or the quality of AI feedback across models, tasks, or prompts, empirical studies should also be designed to identify characteristics of learning settings in which feedback on a written text, whether AI-based or not, leads to productive revision (uptake). Examining the prerequisites for feedback uptake, the goals that teachers and learners associate with it, and the extent to which these influence uptake is key to providing teachers with specific guidance on designing effective learning scenarios. So too is investigating the extent to which a tailor-made educational setting can increase uptake. These questions remain relevant even without AI-based feedback. However, they become even more pertinent when AI facilitates feedback, potentially reintroducing it into everyday teaching. Against this backdrop, we should also take advantage of the current focus on AI feedback to address important research questions independent of it (Jensen et al., 2024), particularly those concerning the conditions for effective writing feedback.
Author contributions
GH: Writing – review & editing, Writing – original draft. FH: Writing – original draft, Writing – review & editing.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Almegren, A., Mahdi, H. S., Hazaea, A. N., Ali, J. K., and Almegren, R. M. (2025). Evaluating the quality of AI feedback: a comparative study of AI and human essay grading. Innov. Educ. Teach. Int. 62, 1858–1873. doi: 10.1080/14703297.2024.2437122
Applebee, A. N., and Langer, J. A. (2011). EJ Extra: a snapshot of writing instruction in middle schools and high schools. English J. 100, 14–27. doi: 10.58680/ej201116413
Bereiter, C., and Scardamalia, M. (1987). The Psychology of Written Composition (Transferred to digital printing). London: Routledge.
Brandmo, C., and Gamlem, S. M. (2025). Students' perceptions and outcome of teacher feedback: a systematic review. Front. Educ. 10:1572950. doi: 10.3389/feduc.2025.1572950
Busse, V., Müller, N., and Siekmann, L. (2022). “Wirksame Schreibförderung durch diversitätssensibles formatives Feedback,” in Schreiben fachübergreifend fördern, eds V. Busse, N. Müller, and L. Siekmann (Hanover: Kallmeyer mit Klett), 114–133.
Elmotri, B., Harizi, R., Boujlida, A., Elyasa, Y. M., Garrouri, S., Amri, F., et al. (2025). The impact of AI-generated feedback explicitness (generic vs. specific) on EFL students' use of automated written corrective feedback. Arab World Engl. J. 16, 384–402. doi: 10.24093/awej/vol16no1.24
Gero, K. I., Long, T., and Chilton, L. B. (2023). “Social dynamics of AI support in creative writing,” in Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg: ACM), 1–15. doi: 10.1145/3544548.3580782
Gibbs, G., and Simpson, C. (2005). Conditions under which assessment supports students' learning. Learn. Teach. High. Educ. 1, 3–31.
Graham, S., Harris, K., and Hebert, M. (2011). Informing Writing. The Benefits of Formative Assessment. A Carnegie Corporation Time to Act report. Washington, DC: Alliance for Excellent Education. Available online at: https://files.eric.ed.gov/fulltext/ED537566.pdf
Graham, S., Hebert, M., and Harris, K. R. (2015). Formative assessment and writing: a meta-analysis. Elem. Sch. J. 115, 523–547. doi: 10.1086/681947
Hattie, J., and Timperley, H. (2007). The power of feedback. Rev. Educ. Res. 77, 81–112. doi: 10.3102/003465430298487
Hayes, J. R., and Flower, L. S. (1986). Writing research and the writer. Am. Psychol. 41, 1106–1113. doi: 10.1037/0003-066X.41.10.1106
Hyland, K. (2016). Teaching and Researching Writing, 3rd edn. New York, NY: Routledge. doi: 10.4324/9781315717203
Hyland, K., and Hyland, F. (2019). “Contexts and issues in feedback on L2 writing,” in Feedback in Second Language Writing, eds K. Hyland, and F. Hyland, 2 edn (Cambridge: Cambridge University Press), 1–22. doi: 10.1017/9781108635547.003
Jansen, T., Horbach, A., and Meyer, J. (2025). “Feedback from generative AI: correlates of student engagement in text revision from 655 classes from primary and secondary school,” in Proceedings of the 15th International Learning Analytics and Knowledge Conference (Dublin), 831–836. doi: 10.1145/3706468.3706494
Jantzen, C. (2003). “Eigene Texte in der Schule überarbeiten: Beobachten—Verstehen—Lernen,” in Kinder schreiben und lesen. Beobachten—Verstehen—Lehren, eds E. Brinkmann, N. Kruse, and C. Osburg (Breisgau: Fillibach Verlag), 111–126.
Jensen, L. X., Buhl, A., Sharma, A., and Bearman, M. (2024). Generative AI and higher education: a review of claims from the first months of ChatGPT. High. Educ. 89, 1145–1161. doi: 10.1007/s10734-024-01265-3
Kolade, O., Owoseni, A., and Egbetokun, A. (2024). Is AI changing learning and assessment as we know it? Evidence from a ChatGPT experiment and a conceptual framework. Heliyon 10:e25953. doi: 10.1016/j.heliyon.2024.e25953
Lipnevich, A. A., and Smith, J. K. (2022). Student-feedback interaction model: revised. Stud. Educ. Eval. 75:101208. doi: 10.1016/j.stueduc.2022.101208
MacArthur, C. (2016). “Instruction in evaluation and revision,” in Handbook of Writing Research, 2nd Edn, eds. C. MacArthur, S. Graham, and J. Fitzgerald (The Guilford Press), 272–287.
MacDonald, R. B. (1991). Developmental students' processing of teacher feedback in composition instruction. Rev. Res. Dev. Educ. 8.
Mah, C., Tan, M., Phalen, L., Sparks, A., and Dorottya, D. (2025). From sentence-corrections to deeper dialogue: qualitative insights from LLM and teacher feedback on student writing. SSRN. 25, 1193. doi: 10.2139/ssrn.5213040
Mekheimer, M. (2025). Generative AI-assisted feedback and EFL writing: a study on proficiency, revision frequency and writing quality. Discover Educ. 4:170. doi: 10.1007/s44217-025-00602-7
Nikolopoulou, K. (2025). Assessment redefined: educational assessment meets AI - ChatGPT challenges. Curr. Perspect. Educ. Res. 8, 17–30. doi: 10.46303/cuper.2025.2
Rong, M., Yao, Y., Li, Q., and Chen, X. (Winnie) (2025). Exploring student engagement with artificial intelligence-guided chatbot feedback in EFL writing: interactions and revisions. Comput. Assist. Lang. Learn. 1–30. doi: 10.1080/09588221.2025.2539979
Schroeder, J., Grohe, B., and Pogue, R. (2008). The impact of criterion writing evaluation technology on criminal justice student writing skills. J. Crimin. Justice Educ. 19, 432–445. doi: 10.1080/10511250802476269
Seßler, K., Bewersdorff, A., Nerdel, C., and Kasneci, E. (2025). Towards adaptive feedback with AI: comparing the feedback quality of LLMs and teachers on experimentation protocols. arXiv [preprint] arXiv:2502.12842. doi: 10.48550/arXiv.2502.12842
Sinclair, H. K., and Cleland, J. A. (2007). Undergraduate medical students: who seeks formative feedback? Med. Educ. 41, 580–582. doi: 10.1111/j.1365-2923.2007.02768.x
Steiss, J., Tate, T., Graham, S., Cruz, J., Hebert, M., Wang, J., et al. (2024). Comparing the quality of human and ChatGPT feedback of students' writing. Learn. Instruct. 91:101894. doi: 10.1016/j.learninstruc.2024.101894
Usher, M. (2025). Generative AI vs. instructor vs. peer assessments: a comparison of grading and feedback in higher education. Assess. Eval. High. Educ. 50, 912–927. doi: 10.1080/02602938.2025.2487495
Venter, J., Coetzee, S. A., and Schmulian, A. (2025). Exploring the use of artificial intelligence (AI) in the delivery of effective feedback. Assess. Eval. High. Educ. 50, 516–536. doi: 10.1080/02602938.2024.2415649
Wisniewski, B., Zierer, K., and Hattie, J. (2020). The power of feedback revisited: a meta-analysis of educational feedback research. Front. Psychol. 10:3087. doi: 10.3389/fpsyg.2019.03087
Wrede, S. E., Gloerfeld, C., and de Witt, C. (2023). “KI und Didaktik – Zur Qualität von Feedback durch Recommendersysteme,” in Künstliche Intelligenz in der Bildung, eds C. de Witt, C. Gloerfeld, and S. E. Wrede (Wiesbaden: Springer Fachmedien Wiesbaden), 133–154. doi: 10.1007/978-3-658-40079-8_7
Keywords: AI-generated feedback, automated writing evaluation (AWE), feedback pedagogy, learner uptake, writing instruction
Citation: Helm G and Hesse F (2026) Stop perfecting the feedback, start supporting the uptake: rethinking AI in writing instruction. Front. Educ. 11:1737037. doi: 10.3389/feduc.2026.1737037
Received: 31 October 2025; Revised: 30 December 2025;
Accepted: 06 January 2026; Published: 30 January 2026.
Edited by:
Steve Graham, Arizona State University, United States
Reviewed by:
Kleopatra Nikolopoulou, National and Kapodistrian University of Athens, Greece
Copyright © 2026 Helm and Hesse. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Gerrit Helm, gerrit.helm@uni-jena.de