MINI REVIEW article

Front. Educ., 09 February 2026

Sec. Assessment, Testing and Applied Measurement

Volume 11 - 2026 | https://doi.org/10.3389/feduc.2026.1706559

Assessing artificial intelligence literacy in foreign language teachers: a TPACK-based perspective

Tailin Ren, Qingfu Li*
  • College of Literature and Journalism, South-Central Minzu University, Wuhan, Hubei, China

The rapid integration of artificial intelligence (AI) into language education has heightened the need for foreign language (FL) teachers to develop strong artificial intelligence literacy (AIL) to navigate increasingly complex and diverse digital classrooms. However, current AIL assessment standards tend to be generic, often overlooking the cognitive, linguistic, and pedagogical challenges specific to FL contexts. This gap highlights the need for reliable, context-sensitive frameworks and practical assessment tools. Drawing on peer-reviewed research from major academic databases, this Mini Review synthesizes the literature using a transparent selection process informed by established reporting principles and adapted to the scope of the study. Key study parameters such as research design, participants, and item formats were extracted and organized for comparative analysis. The resulting multidimensional framework, grounded in an expanded TPACK-AI model, comprises six interrelated dimensions: conceptual understanding, technical proficiency, pedagogical integration, adaptive expertise, ethical reasoning, and self-efficacy. In addition, this Mini Review illustrates a set of assessment item types including multiple-choice, matching, ordering, cloze, short answer, and extended response, each mapped to specific AIL competencies. By integrating insights from existing scholarship with a context-specific framework design, this study synthesizes a structured, adaptable approach for defining and organizing the core competencies of FL teachers in AI-integrated language education. The framework can serve as a reference for teacher educators, policymakers, and institutions in developing targeted training initiatives, formulating policy guidelines, and supporting equitable technology integration in language education.

1 Introduction

The rapid integration of AI technologies is fundamentally reshaping the global educational landscape. This shift accelerates the transition from traditional instruction to personalized, data-driven learning environments (Chen et al., 2020b; Chan and Tsi, 2023; Su and Lee, 2024). Nowhere is this transformation more evident than in language education. AI-powered tools, such as intelligent tutoring systems and automated assessment platforms, offer significant potential for individualized learning, real-time feedback, and differentiated instruction. These benefits are especially relevant for linguistically diverse and FL learners (Briceño and Klein, 2019; Williyan et al., 2024). However, AI integration also introduces persistent challenges, including algorithmic bias, data privacy concerns, and digital inequity (Haderer and Ciolacu, 2022; Chan and Tsi, 2023). Importantly, the effectiveness and fairness of AI-supported instruction are largely determined by the AIL of teachers. This critical factor is often overlooked, yet it directly influences the quality and equity of learning outcomes (Salami and Alharthi, 2022; Erdem Coşgun, 2025).

Despite growing interest in AIL, there is still a lack of context-sensitive standards tailored to the unique realities of FL education. Existing research has developed foundational frameworks, such as the TPACK model. This model places teachers’ technical, pedagogical, and content expertise at the center of effective technology integration (Sperling et al., 2024). However, these models and related assessment instruments often overlook the cognitive, linguistic, and cultural complexities encountered by FL teachers as they work with multiple languages and digital tools (Briceño and Klein, 2019; Ojeda-Ramirez et al., 2024). Many existing AIL standards are overly generic or technical. As a result, they provide insufficient support for differentiated pedagogy or practical classroom assessment (Glusac and Milic, 2022; Hockly, 2023; Xue, 2024).

Addressing this gap, the present study builds on and extends the TPACK framework by explicitly integrating context-specific dimensions of AIL into its structure. While TPACK typically covers general technological, pedagogical, and content knowledge, this research adapts and expands the framework to better capture the cognitive, linguistic, and ethical demands of AI integration in foreign language teaching. Specifically, six dimensions (conceptual understanding, technical proficiency, pedagogical integration, adaptive expertise, ethical reasoning, and self-efficacy) are systematically aligned with TPACK’s hierarchical layers to outline a context-sensitive set of candidate dimensions and criteria (Chen et al., 2020a; Lintner, 2024; Tenberga and Daniela, 2024; Pan and Wang, 2025). This integration clarifies how each domain of knowledge translates into practical competencies for AI-enhanced language instruction. The resulting synthesis offers illustrative criteria and assessment item examples, advancing both theoretical coherence and practical application in AI-supported language education.

This study is guided by two core research questions: (1) what are the main standards and dimensions for assessing language teachers’ AI literacy, and (2) what assessment items can be used to measure these dimensions in practice? To address these questions, this Review synthesizes and articulates a candidate framework that reflects the cognitive, sociolinguistic, and pedagogical complexity of AI integration in FL teaching. Rather than treating AI literacy as a checklist of tool skills, we conceptualize FL teachers’ AIL as context-sensitive decision-making across planning, instruction, and assessment when AI is involved. Building on this stance, we provide clear benchmarks and illustrative item formats to support assessment in practice. Together, these contributions offer actionable guidance for teacher educators, policymakers, and institutional leaders to strengthen teacher preparation, professional development, and evidence-based policy (Zhao et al., 2022; Krüger, 2023; Pinski and Benlian, 2024). Advancing AIL standards is essential for ensuring educational quality, equity, and readiness amid ongoing technological change.

2 Method

We conducted a critical narrative review following Grant and Booth’s methodology for critical reviews (Grant and Booth, 2009). Our objective was to synthesize assessment- and measurement-relevant literature on FL teachers’ AIL, focusing on construct operationalization, item/scoring design, and validity/fairness evidence. Messick’s unified validity framework (content, internal structure, relations to other variables, and consequences) guided data extraction and appraisal (Messick, 1995).

Searches were conducted in Web of Science (WoS) and Scopus and restricted to 2021–2025 to balance currency and coverage in a rapidly evolving AI-in-education literature, minimizing historical drift while ensuring relevance to current assessment practice. We combined controlled and free-text terms for “artificial intelligence literacy,” foreign/second-language contexts, teachers, and assessment/measurement, employing fuzzy matching and near-synonyms. In total, 129 search terms were used. An example Boolean string was:

(“artificial intelligence literacy” OR “AI literacy” OR “Artificial intelligence in education”) AND (“language learning” OR “language education”) AND (“EFL teachers” OR “Teacher Education”) AND (“educational assessment” OR “Teaching quality” OR “assessment structure”)

Backward/forward citation chasing from recent reviews supplemented database results (Sperling et al., 2024; Chan and Tang, 2025; Ning et al., 2025). Searches returned 911 records; after de-duplication in Zotero, 170 duplicates were removed, leaving 741 unique records for title/abstract screening (final search: 17 August 2025). Eligibility criteria required peer-reviewed journal articles in English with explicit relevance to assessment/measurement of AIL in FL teacher education (e.g., operational definitions, instruments, item/scoring approaches, or validity/fairness evidence). We excluded books, book chapters, dissertations, and preprints. Title/abstract screening for measurement relevance resulted in 40 records for full-text review.
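The screening bookkeeping is straightforward to reproduce. As a minimal sketch (the record structure and de-duplication key are our assumptions, not the actual Zotero workflow used here), duplicates can be dropped by DOI where present and by normalized title otherwise:

```python
def dedupe(records):
    """Drop duplicate records, keying on DOI when present, else on normalized title."""
    seen, unique = set(), []
    for record in records:
        key = (record.get("DOI") or record.get("Title", "")).strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# In practice, records would come from csv.DictReader over WoS/Scopus exports;
# these toy entries stand in for the 911 records retrieved in the review.
records = [
    {"DOI": "10.1000/a", "Title": "AI literacy for EFL teachers"},
    {"DOI": "10.1000/a", "Title": "AI literacy for EFL teachers"},  # duplicate
    {"DOI": "", "Title": "Assessing teacher AI readiness"},
]
unique = dedupe(records)
print(f"{len(records)} retrieved, {len(records) - len(unique)} duplicates removed, "
      f"{len(unique)} unique for title/abstract screening")
```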

Two reviewers independently extracted key study characteristics and evidence related to teacher AI use, guided by Messick’s (1995) unified validity considerations (content, context, consequences). We first applied inductive coding to capture recurring constructs, indicators, and decision-making demands that teachers face when integrating AI in FL settings. Next, we mapped these codes to broader categories informed by the Technological Pedagogical Content Knowledge (TPACK) framework and established validity facets, ensuring that technological, pedagogical, content, and ethical aspects were all represented. During this deductive organization, overlapping codes were iteratively merged while preserving distinctions between different types of decisions, for example, distinguishing the act of selecting an AI tool from the act of justifying its use under specific constraints. Any discrepancies in coding were resolved through discussion until consensus was reached. The outcome of this synthesis was a framework of six distinct dimensions of AI literacy, each defined by characteristic teacher decisions or behaviors, with illustrative performance descriptors. We also developed a preliminary blueprint linking each dimension to potential item formats, which guided the creation of prototype assessment items (see Table 1).

Table 1. Publications reviewed in full text with reasons for inclusion.

3 Results

3.1 What are the main standards and dimensions for assessing language teachers’ AI literacy?

“…goes beyond description to include analysis and conceptual innovation… the product of a critical review should be a hypothesis or a model” (Grant and Booth, 2009, p.93). Guided by this expectation, we not only synthesize prior work but also outline an integrated account and provide dimension-linked illustrative item prototypes to operationalize the model. Language teachers’ AIL is conceptualized as a multidimensional construct—cognitive, technological, pedagogical, and ethical—refined in this study through an integrated TPACK-AI model (Mishra and Koehler, 2006). The TPACK framework effectively integrates technology, pedagogy, and content knowledge, making it suitable for foreign language AI education. By embedding AIL dimensions into the TPACK structure, we capture both operational knowledge and adaptive, ethical, reflective engagement with AI tools. This comprehensive model enables a nuanced evaluation of teachers’ competencies, including their understanding of AI, ability to apply it in practice, capacity for critical reflection, and responsiveness to sociocultural contexts in language teaching.

Expanding on previous research that highlights “AI foundations and applications, AI ethics, a human-centered mindset, AI pedagogy, and AI for professional development” (Liu, 2025), we derived six core dimensions of AIL for language teachers. These six dimensions emerged because they repeatedly appeared across our literature corpus as (a) important in existing teacher-knowledge frameworks and AI-in-education studies, (b) observable in foreign-language teaching practice, and (c) measurable with established assessment formats. The dimensions are:

(1) Conceptual understanding: foundational knowledge of AI principles, such as how AI and large language models (LLMs) work, basic algorithmic processes, and their limitations and implications. A strong conceptual base (e.g., knowing an LLM’s data constraints) provides definitional clarity and prevents important aspects of AI from being overlooked in literacy definitions (Chee et al., 2024; Pandey and Bhusal, 2024).

(2) Technical proficiency: practical skills in operating AI tools, including designing effective prompts and using AI to automate or assist with tasks like feedback generation. This skill set is necessary for integration (teachers must know how to use the tools), but on its own it is not sufficient for effective AI-enhanced teaching (Han and Li, 2024).

(3) Pedagogical integration: the ability to embed AI appropriately in classroom instruction and formative assessment to support language learning goals. This involves designing lessons and assessment activities where AI tools enhance, rather than distract from, language education (e.g., using an AI tool to personalize vocabulary practice or to generate quiz questions aligned with learning outcomes). Effective pedagogical integration translates technical know-how into educational practice (Erdem Coşgun, 2025).

(4) Adaptive expertise: the capacity to flexibly apply AI technologies in a context-sensitive manner to meet foreign-language teaching objectives. This meta-competence goes beyond following set procedures; it requires teachers to make judgments about when and how to use AI given specific learners, tasks, or sociolinguistic contexts (Erdem Coşgun, 2025). We include adaptive expertise as a distinct dimension because FL teaching conditions vary widely; teachers must transfer and tailor their AI use to different scenarios (Pan and Wang, 2025; Vo and Huynh, 2025).

(5) Ethical reasoning: awareness of and responsiveness to the ethical issues that arise with AI use in education. This includes recognizing biases in AI outputs, protecting data privacy, ensuring academic integrity, and respecting cultural appropriateness. Ethical reasoning addresses systematic risks and helps teachers use AI in ways that are fair, transparent, and aligned with educational values (Derakhshan and Ghiasvand, 2024; Tenberga and Daniela, 2024).

(6) Self-efficacy: teachers’ confidence in their ability to adopt and integrate AI tools in their teaching. This motivational dimension gauges whether educators feel capable of implementing what they know. High self-efficacy strongly predicts actual classroom use of AI (Hazzan-Bishara et al., 2025). Including self-efficacy acknowledges that even a knowledgeable teacher may not utilize AI unless they have confidence and intent to do so (Zhao et al., 2022; Pan and Wang, 2025).

To clarify how evidence from our corpus informed these dimensions, we note that each arose from recurring themes in the literature. For example, many studies emphasize that teachers need a basic understanding of AI concepts and how AI tools function, underscoring the importance of conceptual knowledge and technical skills as fundamental AIL components. Likewise, the corpus consistently highlights integrating AI into pedagogy and the need for contextual adaptation of AI use as critical competencies for educators. Frequent discussions of AI ethics in language education research confirmed the relevance of an ethics dimension. Finally, numerous studies point out that a teacher’s confidence and willingness to use AI greatly influence successful implementation. These converging insights directly shaped our six dimensions, ensuring the model is grounded in the prevailing concerns and priorities documented across the literature.

The TPACK-AI model maps these six dimensions onto the familiar TPACK domains in a layered structure. At the basic Knowledge Layer, each core knowledge type aligns with a dimension: Content Knowledge (CK) corresponds to conceptual understanding, Technological Knowledge (TK) corresponds to technical proficiency, and Pedagogical Knowledge (PK) corresponds to pedagogical integration. At the next Application Layer, the intersections of these knowledge areas (TPACK’s TCK, TPK, and PCK) underscore the role of adaptive expertise, the skill to connect technology with content and pedagogy in real teaching situations (Koehler et al., 2013). Finally, a Fusion Layer represents the holistic synthesis of content, pedagogy, and technology in AI-enhanced instruction, which is the ultimate goal of effective AIL integration.

Building on this mapping, our model extends the traditional TPACK framework in two important ways. First, we introduce adaptive expertise and ethical reasoning as critical-reflective dimensions not fully captured by the original TPACK intersections. These dimensions specify how teachers choose and justify their use of AI under pedagogical and sociocultural constraints, adding a layer of reflective decision-making beyond knowledge alone. Second, we foreground self-efficacy as an agentic dimension that links teachers’ knowledge to their action. In other words, beyond knowing what and how to teach with AI, teachers must also feel confident and motivated to do so. Together, these additions recast TPACK from a static knowledge framework into a dynamic, decision-in-context model for AI-enhanced foreign language teaching.
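To make this layered mapping concrete, the following schematic (a minimal Python sketch of our own, not a structure defined in the TPACK literature) encodes each dimension with the layer and role described above:

```python
# Schematic encoding of the TPACK-AI mapping described in Section 3.1.
# The data structure is illustrative; the labels follow the text.
TPACK_AI = {
    "conceptual understanding": ("Knowledge layer", "CK"),
    "technical proficiency": ("Knowledge layer", "TK"),
    "pedagogical integration": ("Knowledge layer", "PK"),
    "adaptive expertise": ("Application layer", "TCK/TPK/PCK intersections"),
    "ethical reasoning": ("extension", "critical-reflective, beyond the original intersections"),
    "self-efficacy": ("extension", "agentic link from knowledge to action"),
}

for dimension, (layer, role) in TPACK_AI.items():
    print(f"{dimension}: {layer} ({role})")
```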

In summary, our integrated TPACK-AI model provides a structured schema for evaluating language teachers’ AI literacy across multiple domains. Framing AIL as a comprehensive professional capability in this way allows for a nuanced assessment of teachers’ readiness and effectiveness in leveraging AI, while maintaining transparency and alignment with current research-based competencies.

3.2 What assessment items can be used to measure these dimensions in practice?

3.2.1 Synthesis of candidate evaluation standards

We synthesize and organize candidate evaluation standards for assessing language teachers’ AIL, structured across four domains: AI concepts, instructional application, student assessment, and ethical considerations. Each standard follows quality criteria from previous research, such as the use of observable behavioral verbs and clear progression, and is described at four performance levels (Bloom et al., 1984; Robertson et al., 2022; Yang and Whatman, 2025). These principles include: (1) using observable behavioral verbs to describe a progression of quality; (2) avoiding ambiguous language such as “good,” “appropriate,” and “clear”; (3) limiting the number of standards in a dimension to no more than four; (4) ensuring each standard focuses on a single idea; (5) using positive language; and (6) employing language that communicates clearly to the intended audience (Yang and Whatman, 2025).
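Several of these wording principles lend themselves to simple mechanical checks. The sketch below is an illustration only: the behavioral-verb list and the sample standard are our own, and it covers just principles 1–3 (observable verbs, ambiguous terms, and the four-standard cap).

```python
AMBIGUOUS = {"good", "appropriate", "clear"}  # principle 2
# Illustrative, non-exhaustive verb list for principle 1.
BEHAVIORAL_VERBS = {"identifies", "selects", "designs", "justifies", "evaluates"}

def check_standard(text):
    """Return warnings for one standard against principles 1 and 2."""
    words = {w.strip(".,;").lower() for w in text.split()}
    warnings = []
    if not words & BEHAVIORAL_VERBS:
        warnings.append("no observable behavioral verb found")
    if words & AMBIGUOUS:
        warnings.append(f"ambiguous terms: {sorted(words & AMBIGUOUS)}")
    return warnings

def check_dimension(standards):
    """Check one dimension's standards; principle 3 caps the list at four."""
    issues = {i: check_standard(s) for i, s in enumerate(standards)}
    if len(standards) > 4:
        issues["dimension"] = ["more than four standards in this dimension"]
    return issues

# Hypothetical standard; it deliberately trips principle 2 ("appropriate").
print(check_dimension(["Selects AI tools appropriate for the lesson goal"]))
```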

In addition, to minimize the impact of irrelevant structural variance and ensure reliable evaluation results, we adopted approaches regarding literature retrieval, item options, format selection, clarity, and discrimination, as recommended by previous studies (Wei et al., 2022).

This synthesis emphasizes a specific focus on language-teacher AIL and provides practical orientation for use. The standards emphasize actionable feedback and clear guidance. By using observable behavioral verbs and explicitly describing the progression of quality, these criteria help researchers and educators understand expectations at each proficiency level, supporting both peer and self-evaluation (Hattie and Timperley, 2007; Andrade, 2023). The indicative behavioral standards used for evaluation are summarized in Table 2.

Table 2. Indicative behavioral standards for assessing FL teachers’ AIL.

3.2.2 Development of assessment items

The development of assessment items for evaluating FL teachers’ AIL was guided by a construct representation perspective (Whitely, 1983), ensuring that each item reflected relevant cognitive operations associated with the targeted construct. Drawing on Whitely’s multicomponent latent trait model (MLTM), we illustrate representative item types corresponding to distinct components of AIL, including technical proficiency, conceptual understanding, pedagogical integration, adaptive expertise, ethical reasoning, and self-efficacy. This approach is consistent with prior research highlighting the layered and context-sensitive nature of AI-TPACK integration in FL education (Zhao et al., 2022; Pan and Wang, 2025).

In line with established principles for high-quality assessment (Fenderson et al., 1997), the items were classified into two main formats: selected response and constructed response. Selected response items include multiple-choice, matching, alternative choice, and ordering formats, while constructed response items encompass cloze, short answer, and extended response tasks (Center for the Study of Higher Education, 2023).

Selected response items offer objectivity, scoring reliability, and strong alignment with declarative knowledge domains (Downing and Haladyna, 2006). For example, multiple-choice items targeting AI conceptual understanding focus on knowledge of algorithmic limitations and data constraints in large language models (Chee et al., 2024; Pandey and Bhusal, 2024). Matching and ordering formats are used to assess AI tool familiarity and lesson design fluency, consistent with research on cognitive depth in instructional planning tasks (Han and Li, 2024; Erdem Coşgun, 2025).

Constructed response items address interpretive and generative dimensions of AIL. Cloze tasks (without prompts) allow for the measurement of reading comprehension and sociolinguistic evaluation in foreign language contexts (Taylor, 1956). Short answer items assess pragmatic appropriateness in AI feedback, which is especially important in culturally diverse classrooms (Xue, 2024; Vo and Huynh, 2025). Extended response items provide space to explore ethical dilemmas and critical reflections on fairness, authorship, and data use, aligning with emerging issues in AI-assisted language instruction (Derakhshan and Ghiasvand, 2024; Tenberga and Daniela, 2024). To strengthen construct validity, all item formats were mapped to specific AIL dimensions and Bloom’s taxonomy levels. The blended item architecture, comprising both selected and constructed response formats, ensures comprehensive coverage of both factual knowledge and higher-order reflective skills. This item development framework is consistent with current empirical models advocating for the integration of AI-TPACK in language teacher education (Al-Abdullatif, 2024; Xue, 2024; Vo and Huynh, 2025).
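A compact way to record the item–construct alignment described above is a blueprint table. The pairings below paraphrase the examples in this section; the Bloom labels are indicative rather than exhaustive, and self-efficacy (typically measured by self-report scales rather than test items) is noted as such.

```python
# Illustrative blueprint linking item formats to AIL dimensions and Bloom levels,
# paraphrasing the mappings discussed in the text.
BLUEPRINT = [
    ("multiple-choice", "conceptual understanding", "remember/understand"),
    ("matching", "technical proficiency", "understand/apply"),
    ("ordering", "pedagogical integration", "apply"),
    ("cloze", "adaptive expertise", "apply/analyze"),
    ("short answer", "adaptive expertise", "analyze"),
    ("extended response", "ethical reasoning", "evaluate/create"),
    ("self-report scale", "self-efficacy", "n/a (affective measure)"),
]

for item_format, dimension, bloom in BLUEPRINT:
    print(f"{item_format:18} -> {dimension:25} ({bloom})")
```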

3.2.2.1 Ordering items

Dimension: pedagogical integration (2.2).

Prompt: put the following steps in the correct order for using ChatGPT to co-develop a foreign language vocabulary task:

(1) Feed example sentence prompts.

(2) Define target vocabulary.

(3) Generate classroom task.

(4) Review appropriateness of outputs.

Correct order: 2 → 1 → 3 → 4.

Rationale: assesses procedural fluency in lesson planning with AI.

Mapped level: 2.2 – selects activities in foreign language classrooms appropriate for AI support.
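Scoring such an ordering item can be all-or-nothing or can award partial credit for correctly preserved adjacent steps. The sketch below shows both options; the scheme is our illustration, not one prescribed by the reviewed literature.

```python
CORRECT = [2, 1, 3, 4]  # define vocabulary -> feed prompts -> generate task -> review outputs

def score_exact(response):
    """All-or-nothing: 1 point for reproducing the full sequence."""
    return int(response == CORRECT)

def score_adjacent_pairs(response):
    """Partial credit: fraction of correct adjacent orderings preserved."""
    correct_pairs = list(zip(CORRECT, CORRECT[1:]))
    response_pairs = set(zip(response, response[1:]))
    return sum(p in response_pairs for p in correct_pairs) / len(correct_pairs)

print(score_exact([2, 1, 3, 4]))           # 1
print(score_adjacent_pairs([2, 1, 4, 3]))  # 0.33...: only the (2, 1) pair survives
```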

3.2.2.2 Extended response items

Dimension: ethical reasoning (4.3, 4.4).

Prompt: an AI tool recommends removing a student’s low-scoring writing from a class dataset because it “lowers the model’s accuracy.” What ethical concerns does this raise, and how should a teacher respond?

Sample response criteria: identifies fairness and data bias; recognizes student rights in data usage; suggests inclusive assessment policies.

Rationale: assesses the teacher’s ability to recognize and respond to ethical challenges, including fairness, data bias, and students’ rights, in real classroom applications of AI.

Mapped level: 4.3 – demonstrates practical experience in addressing ethical, privacy, or bias issues (including implemented solutions or preventive measures); 4.4 – synthesizes generalizable principles and articulates transferable strategies for ongoing or future AI integration.
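The sample response criteria above suggest an analytic rubric with one point per criterion met. A minimal scoring record under that assumption (the binary scheme and the rater data are hypothetical) could be:

```python
# Analytic scoring for the ethical-reasoning extended response:
# one point per criterion met, following the sample response criteria above.
CRITERIA = [
    "identifies fairness and data bias",
    "recognizes student rights in data usage",
    "suggests inclusive assessment policies",
]

def score_response(judgments):
    """judgments: dict mapping criterion -> bool, supplied by a trained rater."""
    return sum(judgments.get(criterion, False) for criterion in CRITERIA)

# Hypothetical judgments for one teacher's response.
rater = {CRITERIA[0]: True, CRITERIA[1]: True, CRITERIA[2]: False}
print(f"score: {score_response(rater)}/{len(CRITERIA)}")  # score: 2/3
```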

4 Discussion

This study addresses a critical gap in language teacher education by synthesizing evidence toward a multidimensional, theoretically grounded set of candidate dimensions for assessing AIL in foreign language settings. By integrating the TPACK model with structured performance-based descriptors, the study clarifies what constitutes AIL and offers guidance for measuring it with attention to validity and reliability (Mishra and Koehler, 2006). The resulting framework spans six dimensions (conceptual understanding, technical proficiency, pedagogical integration, adaptive expertise, ethical reasoning, and self-efficacy), capturing the complexity of AI-supported language teaching. To operationalize these dimensions, we align them with diverse item formats such as multiple choice, matching, ordering, cloze, short answer, and extended response. This yields a more holistic perspective than models that treat technical and pedagogical domains separately (Walters, 2010; Lo and Leung, 2022; Pascoal and Mattos, 2022; Xue, 2024).

Despite these advances, the literature on practice-based assessment of teachers’ AIL remains limited, although it is growing (Sperling et al., 2024). Implementation may face resistance due to limited resources, insufficient training, and skepticism toward AI tools (Salami and Alharthi, 2022; Hockly, 2023). Ethical concerns, including algorithmic bias, data privacy, and cultural appropriateness, also risk undermining equitable adoption (Derakhshan and Ghiasvand, 2024). Together, these constraints define the immediate challenges that any measurement approach and implementation agenda must address to be usable and trustworthy in real classrooms.

Theoretically, the study advances the TPACK paradigm by mapping assessable AIL domains to observable teacher behaviors, and the resulting item–construct alignment offers methodological guidance for building reliable instruments grounded in Bloom’s taxonomy and classroom practice (Walters, 2010; Pascoal and Mattos, 2022; Xue, 2024). However, key issues remain under-specified: the ethical implications of assessment (bias, privacy, cultural sensitivity), and the intersections of AIL with teachers’ pedagogical beliefs, institutional constraints, and broader sociocultural influences (Salami and Alharthi, 2022; Derakhshan and Ghiasvand, 2024; Pan and Wang, 2025). These issues set the agenda for targeted empirical work and theory refinement.

In teacher education, the six dimensions of the TPACK-AI model can structure curriculum modules and micro-credentials; programs can use the framework for diagnostic profiling at entry, dimension-specific formative feedback during coursework, and capstone evidence in practicum. For instrument development, the dimension–item map enables a staged pipeline: blueprinting with content validity, piloting selected- and constructed-response tasks, examining internal structure, testing relations to allied constructs and consequences for practice, and auditing fairness with sensitivity-to-change checks. For policy and program evaluation, dimension scores can be embedded in quality-assurance cycles and procurement decisions for classroom AI tools. These uses position TPACK-AI as a practical scaffold for improving teacher readiness and for generating comparable evidence across foreign-language contexts.

A priority for future research is empirical validation. The synthesized candidate dimensions are grounded in a literature review and require classroom trials and rigorous psychometric testing prior to wider use. Such testing should examine item difficulty, item discrimination, reliability, validity, and differential item functioning to check fairness. We recommend a staged pipeline that begins with small-sample pilots, then calibrates items using item response theory and classical test theory, followed by checks of measurement invariance across groups and the collection of process data when feasible, and finally iterative refinement of the item bank. Publishing technical appendices and sample items in an open format would support transparency, replication, and broader use across programs (Lo and Leung, 2022; Xue, 2024).
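To illustrate the classical-test-theory end of this pipeline, the following sketch computes item difficulty (proportion correct), corrected item-total discrimination, and Cronbach’s alpha for a small dichotomous response matrix; the response data are invented purely for demonstration.

```python
import numpy as np

def item_stats(X):
    """X: (persons x items) 0/1 matrix. Returns difficulty, discrimination, alpha."""
    X = np.asarray(X, dtype=float)
    n_items = X.shape[1]
    difficulty = X.mean(axis=0)  # proportion correct per item
    total = X.sum(axis=1)
    # Corrected item-total correlation: each item vs. the sum of the others.
    discrimination = np.array(
        [np.corrcoef(X[:, j], total - X[:, j])[0, 1] for j in range(n_items)]
    )
    # Cronbach's alpha from item variances and total-score variance.
    alpha = (n_items / (n_items - 1)) * (
        1 - X.var(axis=0, ddof=1).sum() / total.var(ddof=1)
    )
    return difficulty, discrimination, alpha

# Invented pilot data: 6 teachers x 4 dichotomously scored items.
X = [[1, 1, 0, 1], [1, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 1], [1, 1, 1, 0], [0, 1, 0, 0]]
difficulty, discrimination, alpha = item_stats(X)
print("difficulty:", difficulty.round(2))
print("discrimination:", discrimination.round(2))
print("alpha:", round(float(alpha), 2))
```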

5 Conclusion

Grounded in the TPACK framework, this Mini Review synthesizes and organizes candidate dimensions and illustrative item prototypes for assessing AIL among foreign-language teachers. These materials are intended to inform research, training, and program dialogue and to facilitate reflection and piloting toward effective and ethical integration of AI technologies into language education; they do not constitute a finalized instrument or accreditation standard.

As AI continues to reshape educational ecosystems, systematic assessment of teachers’ AIL becomes increasingly critical to ensure educational quality, equity, and teacher preparedness for future technological advances. Recognizing the complexity and variability inherent in educational contexts, future research should prioritize conducting cross-cultural and cross-regional comparative studies to explore how these standards operate in diverse teaching environments. The TPACK-AI framework offers a practice-ready scaffold for teacher preparation and a validity-aligned blueprint for measurement; future work should build and cross-validate item banks across contexts. Additionally, future studies need detailed empirical validation through psychometric evaluations of the developed assessment tools, including item difficulty, discrimination analyses, and reliability testing. It is equally important to involve diverse stakeholders, such as teachers, students, administrators, and policymakers, in participatory research designs to refine standards further, enhance practical feasibility, and ensure alignment with authentic educational practices and policies. Lastly, longitudinal studies are recommended to track the sustained impacts of AIL standards on teacher professional growth, instructional quality, and student learning outcomes.

Author contributions

TR: Writing – original draft, Writing – review & editing. QL: Conceptualization, Methodology, Writing – review & editing.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Acknowledgments

The authors would like to thank all teachers, colleagues, and classmates who provided valuable guidance, encouragement, and support throughout the writing of this paper and the submission process. It is hoped that this research will inspire further exploration of AI literacy in language education and contribute to ongoing discussions on equitable and ethical technology use in teaching.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was not used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Al-Abdullatif, A. M. (2024). Modeling teachers’ acceptance of generative artificial intelligence use in higher education: the role of AI literacy, intelligent TPACK, and perceived trust. Educ. Sci. 14:1209. doi: 10.3390/educsci14111209

Andrade, H. L. (2023). “What is next for rubrics? A reflection on where we are and where to go from here,” in Improving Learning Through Assessment Rubrics: Student Awareness of What and How They Learn (Hershey, PA: IGI Global), 314–326.

Bloom, B. S., Krathwohl, D. R., and Masia, B. B. (1984). Taxonomy of educational objectives: The classification of educational goals. New York: Longman.

Briceño, A., and Klein, A. (2019). A second lens on formative reading assessment with multilingual students. Read. Teach. 72, 611–621. doi: 10.1002/trtr.1774

Center for the Study of Higher Education (2023). Generic item writing manual. Available online at: https://figshare.unimelb.edu.au/ndownloader/files/36683046 (Accessed July 20, 2025).

Chan, K. K.-W., and Tang, W. K.-W. (2025). Evaluating English teachers’ artificial intelligence readiness and training needs with a TPACK-based model. World J. Engl. Lang. 15:129. doi: 10.5430/wjel.v15n1p129

Chan, C. K. Y., and Tsi, L. H. Y. (2023). The AI revolution in education: Will AI replace or assist teachers in higher education? arXiv [Preprint].

Chee, H., Ahn, S., and Lee, J. (2024). A competency framework for AI literacy: variations by different learner groups and an implied learning pathway. Br. J. Educ. Technol. 56, 2146–2182. doi: 10.1111/bjet.13556

Chen, L., Chen, P., and Lin, Z. (2020a). Artificial intelligence in education: a review. IEEE Access 8, 75264–75278. doi: 10.1109/ACCESS.2020.2988510

Chen, X., Xie, H., Zou, D., and Hwang, G.-J. (2020b). Application and theory gaps during the rise of artificial intelligence in education. Comput. Educ. Artif. Intell. 1:100002. doi: 10.1016/j.caeai.2020.100002

Derakhshan, A., and Ghiasvand, F. (2024). Is ChatGPT an evil or an angel for second language education and research? A phenomenographic study of research-active EFL teachers’ perceptions. Int. J. Appl. Linguist. 34, 1246–1264. doi: 10.1111/ijal.12561

Downing, S. M., and Haladyna, T. M. (2006). Handbook of test development. Mahwah, NJ: Erlbaum.

Erdem Coşgun, G. (2025). Artificial intelligence literacy in assessment: empowering pre-service teachers to design effective exam questions for language learning. Br. Educ. Res. J. 51, 2340–2357. doi: 10.1002/berj.4177

Fenderson, B. A., Damjanov, I., Robeson, M. R., Veloski, J. J., and Rubin, E. (1997). The virtues of extended matching and uncued tests as alternatives to multiple choice questions. Hum. Pathol. 28, 526–532. doi: 10.1016/S0046-8177(97)90073-3

Glusac, T., and Milic, M. (2022). Quality of written instructions in teacher-made tests of English as a foreign language. Engl. Teach. Learn. 46, 39–57. doi: 10.1007/s42321-021-00079-1

Grant, M. J., and Booth, A. (2009). A typology of reviews: an analysis of 14 review types and associated methodologies. Health Inf. Libr. J. 26, 91–108. doi: 10.1111/j.1471-1842.2009.00848.x

Haderer, B., and Ciolacu, M. (2022). Education 4.0: artificial intelligence assisted task- and time planning system. Procedia Comput. Sci. 200, 1328–1337. doi: 10.1016/j.procs.2022.01.334

Han, J., and Li, M. (2024). Exploring ChatGPT-supported teacher feedback in the EFL context. System 126, 1–11. doi: 10.1016/j.system.2024.103502

Hattie, J., and Timperley, H. (2007). The power of feedback. Rev. Educ. Res. 77, 81–112. doi: 10.3102/003465430298487

Hazzan-Bishara, A., Kol, O., and Levy, S. (2025). The factors affecting teachers’ adoption of AI technologies: a unified model of external and internal determinants. Educ. Inf. Technol. 30, 15043–15069. doi: 10.1007/s10639-025-13393-z

Hockly, N. (2023). Artificial intelligence in English language teaching: the good, the bad and the ugly. RELC J. 54, 445–451. doi: 10.1177/00336882231168504

Koehler, M., Mishra, P., and Cain, W. (2013). What is technological pedagogical content knowledge (TPACK)? J. Educ. 193, 13–19. doi: 10.1177/002205741319300303

Krüger, R. (2023). Artificial intelligence literacy for the language industry – with particular emphasis on recent large language models such as GPT-4. Lebende Sprachen 68, 283–330. doi: 10.1515/les-2023-0024

Lintner, T. (2024). A systematic review of AI literacy scales. NPJ Sci. Learn. 9:50. doi: 10.1038/s41539-024-00264-4

Liu, W. (2025). Language teacher AI literacy: insights from collaborations with ChatGPT. J. China Computer-Assisted Language Learn. 5, 287–316. doi: 10.1515/jccall-2024-0030

Lo, Y., and Leung, C. (2022). Conceptualising assessment literacy of teachers in content and language integrated learning programs. Int. J. Biling. Educ. Biling. 25, 3816–3834. doi: 10.1080/13670050.2022.2085028

Messick, S. (1995). Standards of validity and the validity of standards in performance assessment. Educ. Meas. Issues Pract. 14, 5–8. doi: 10.1111/j.1745-3992.1995.tb00881.x

Mishra, P., and Koehler, M. J. (2006). Technological pedagogical content knowledge: a framework for teacher knowledge. Teach. Coll. Rec. 108, 1017–1054. doi: 10.1111/j.1467-9620.2006.00684.x

Ning, Y., Zhang, W., Yao, D., Fang, B., Xu, B., and Wijaya, T. T. (2025). Development and validation of the artificial intelligence literacy scale for teachers (AILST). Educ. Inf. Technol. 30, 17751–17767. doi: 10.1007/s10639-025-13347-5

Ojeda-Ramirez, S., Ritchie, D., and Warschauer, M. (2024). AI literacy for multilingual learners: storytelling, role-playing, and programming. CATESOL J. 35, 1–12. doi: 10.5070/b5.35861

Pan, Z., and Wang, Y. (2025). From technology-challenged teachers to empowered digitalized citizens: exploring the profiles and antecedents of teacher AI literacy in the Chinese EFL context. Eur. J. Educ. 60. doi: 10.1111/ejed.70020

Pandey, H. L., and Bhusal, P. C. (2024). ChatGPT literacy for fostering language proficiency and writing skills in ESL/EFL classrooms. Nepal J. Multidiscip. Res. 7, 1–24. doi: 10.3126/njmr.v7i3.70859

Pascoal, L., and Mattos, M. (2022). Critical literacy in ELT classroom testing. Alicante J. Engl. Stud. 36, 29–53. doi: 10.14198/raei.2022.36.02

Pinski, M., and Benlian, A. (2024). AI literacy for users – a comprehensive review and future research directions of learning methods, components, and effects. Comput. Hum. Behav. Artif. Hum. 2:100062. doi: 10.1016/j.chbah.2024.100062

Robertson, P., Beswick, B., English, N., Kheang, T., Collins, M., Nguyen, C., et al. (2022). Writing objective and judgement-based assessment items. Available online at: https://findanexpert.unimelb.edu.au/scholarlywork/1775718-writing-objective-and-judgement-based-assessment-items (Accessed July 20, 2025).

Salami, F., and Alharthi, R. (2022). Improving language assessment literacy for in-service Saudi EFL teachers. Arab World Engl. J. 13, 536–554. doi: 10.24093/awej/vol13no3.35

Sperling, K., Stenberg, C.-J., McGrath, C., Åkerfeldt, A., Heintz, F., and Stenliden, L. (2024). In search of artificial intelligence (AI) literacy in teacher education: a scoping review. Comput. Educ. Open 6:100169. doi: 10.1016/j.caeo.2024.100169

Su, X., and Lee, I. (2024). Emotion regulation of EFL teachers in blended classroom assessment. Asia Pac. Educ. Res. 33, 649–658. doi: 10.1007/s40299-023-00761-x

Taylor, W. L. (1956). Recent developments in the use of “cloze procedure.” Journal. Q. 33, 42–99. doi: 10.1177/107769905603300106

Tenberga, I., and Daniela, L. (2024). Artificial intelligence literacy competencies for teachers through self-assessment tools. Sustainability 16, 1–25. doi: 10.3390/su162310386

Vo, L.-H., and Huynh, N.-T. (2025). Vietnamese EFL teachers’ perspectives on ChatGPT: a conceptual metaphor analysis. Arab World Engl. J. 16, 162–178. doi: 10.24093/awej/vol16no1.10

Walters, F. (2010). Cultivating assessment literacy: standards evaluation through language-test specification reverse engineering. Lang. Assess. Q. 7, 317–342. doi: 10.1080/15434303.2010.516042

Wei, M. T., Yang, Z., Bai, Y. J., Yu, N., Wang, C. X., Wang, N., et al. (2022). Shaping future directions for breakdance teaching. Front. Psychol. 13:952124. doi: 10.3389/fpsyg.2022.952124

Whitely, S. E. (1983). Construct validity: construct representation versus nomothetic span. Psychol. Bull. 93, 179–197. doi: 10.1037/0033-2909.93.1.179

Williyan, A., Fitriati, S. W., Pratama, H., and Sakhiyya, Z. (2024). Examining pedagogical strategies and teacher agency: a quantitative inquiry into AI integration in Indonesian EFL language teaching, in Proceeding Seminar Nasional IPA (Semarang: Universitas Negeri Semarang), 134–150.

Xue, L. (2024). Urgent, but how? Developing English foreign language teachers’ digital literacy in a professional learning community focusing on large language models. Eur. J. Educ. 60:e12899. doi: 10.1111/ejed.12899

Yang, Z., and Whatman, S. (2025). Development and validation of standards for evaluating the quality of qualitative research on Olympics breakdance. Humanit. Soc. Sci. Commun. 12, 1–14. doi: 10.1057/s41599-025-04792-1

Zhao, L., Wu, X., and Luo, H. (2022). Developing AI literacy for primary and middle school teachers in China: based on a structural equation modeling analysis. Sustainability 14, 1–16. doi: 10.3390/su142114549

Keywords: artificial intelligence literacy, assessment standards, foreign language teachers, teacher professional development, TPACK framework

Citation: Ren T and Li Q (2026) Assessing artificial intelligence literacy in foreign language teachers: a TPACK-based perspective. Front. Educ. 11:1706559. doi: 10.3389/feduc.2026.1706559

Received: 17 September 2025; Revised: 05 January 2026; Accepted: 12 January 2026;
Published: 09 February 2026.

Edited by:

Antonio Sarasa-Cabezuelo, Complutense University of Madrid, Spain

Reviewed by:

Mehmet Tunaz, Nevsehir University, Türkiye
Sive Makeleni, University of Fort Hare, South Africa

Copyright © 2026 Ren and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Qingfu Li, 2015307@mail.scuec.edu.cn
