Abstract
Introduction:
Intelligent Learning Systems (ILS) are AI-enhanced educational technologies increasingly implemented in K-12 education.
Methods:
We analyzed 72 peer-reviewed articles from Scopus and WoS (2014–2024) following PRISMA 2020 guidelines using systematic evidence synthesis and thematic analysis.
Results:
Eighty-nine percent of studies reported overall positive outcomes; intelligent tutoring systems (ITS) dominate (46%); regional and disciplinary variations were identified.
Discussion/conclusion:
Critical effectiveness factors identified; gaps in computational thinking development and non-STEM research noted; longitudinal research needed.
1 Introduction
1.1 Background and context
Emerging educational technologies are fundamentally transforming learning environments, constituting a pedagogical phenomenon with potential to restructure traditional teaching-learning paradigms (Guo et al., 2021). Unlike conventional methodologies, contemporary technological developments offer diverse alternatives to address individual student needs through personalized educational experiences. This transformation has been particularly accelerated by recent advances in artificial intelligence and machine learning, which enable unprecedented levels of adaptability and responsiveness in educational contexts.
Intelligent Learning Systems (ILS) represent a significant evolution in this digital transformation. For this research, we operationally define ILS as educational platforms and applications that employ artificial intelligence algorithms—including machine learning, natural language processing, knowledge representation, or adaptive algorithms—to dynamically adjust content delivery, feedback mechanisms, or pedagogical strategies according to individual student needs, performance data, and learning trajectories (Holmes and Tuomi, 2022). This definition distinguishes ILS from general educational technology through three key characteristics: (1) AI-driven adaptability, (2) data-informed personalization, and (3) intelligent decision-making capacity that responds to learner behavior in real-time.
The conceptual boundaries between ILS and broader AI-enabled educational technologies require clarification. While Learning Management Systems (LMS) may incorporate some adaptive features, ILS are characterized by more sophisticated AI integration that enables autonomous pedagogical decision-making. Similarly, while learning analytics platforms analyze student data, ILS actively use this analysis to modify instructional approaches. This distinction is critical for understanding the unique contribution of ILS to educational practice and for appropriately categorizing the technologies examined in this review.
1.2 Evolution and theoretical foundations
ILS have evolved through distinct developmental phases since their inception. The first generation of computer-assisted instruction systems emerged in the 1960s with programmed learning approaches (VanLehn et al., 2023). The 1980s witnessed the birth of the first true intelligent tutors incorporating rule-based expert systems and cognitive modeling. The 1990s–2000s brought constraint-based modeling and model-tracing approaches, while the contemporary era (2010–present) has introduced machine learning-enhanced systems, natural language processing capabilities, affective computing, and multimodal learning analytics.
ILS integrate multiple pedagogical theoretical frameworks, reflecting the complexity of human learning processes. Constructivist principles inform the design of exploratory learning environments where students actively build knowledge. Social learning theory underpins collaborative and peer-learning features within ILS platforms. Cognitive load theory guides the adaptive scaffolding mechanisms that adjust task complexity. Self-regulated learning frameworks shape metacognitive support tools that help students monitor and control their learning processes. Formative assessment theory influences the immediate feedback systems that characterize many ILS implementations. This theoretical eclecticism represents both a strength—enabling comprehensive approaches to learning—and a challenge requiring careful integration to avoid contradictory pedagogical assumptions.
1.3 Current implementation context and challenges
During recent decades, and particularly following the COVID-19 pandemic, the incorporation of intelligent systems into educational settings has proliferated markedly. In basic and secondary education, these technologies not only enable personalization of content and learning pacing but also facilitate early intervention through precise identification of student needs and adaptive adjustment of instructional materials.
The innovation potential of ILS manifests primarily in their capacity to provide immediate, personalized feedback—a pedagogical intervention well-established in learning science literature as highly effective but challenging to implement at scale in traditional classroom settings. ILS thus offer a technological solution to a longstanding pedagogical challenge: how to provide individualized attention to diverse learners within resource-constrained educational systems.
However, ILS implementation presents significant challenges requiring critical examination. First, privacy and data protection concerns arise from extensive collection of student behavioral and performance data, raising questions about informed consent, data security, and potential surveillance implications (UNESCO, 2023). Second, equity and access issues emerge from the digital divide, with risk of exacerbating existing educational inequalities if implementation is not carefully designed for diverse contexts (Lee and Kwon, 2024). Third, algorithmic bias represents a technical-ethical challenge, as AI systems may perpetuate or amplify societal biases present in training data or design assumptions (Martin et al., 2024). Fourth, concerns about technological dependence and reduction of meaningful human interaction in education require consideration of appropriate balance between automated and human-mediated instruction (Rizvi et al., 2023).
1.4 Research gap and study justification
Despite growing interest, scientific literature on ILS practical application in primary and secondary educational stages remains fragmented across multiple dimensions. Publication dispersion across diverse journals and disciplinary domains hinders comprehensive understanding of the field's development, dominant research trends, and knowledge accumulation patterns. This represents a critical gap given the increasing policy and financial investments in educational AI technologies.
While individual empirical studies provide valuable insights into specific ILS implementations, the absence of systematic synthesis limits our understanding of overall effectiveness patterns, contextual factors influencing success, and comparative advantages of different technological approaches. Son (2024) and Steenbergen-Hu and Cooper (2014) emphasize the necessity of systematic reviews for knowledge assimilation in fragmented research fields. VanLehn et al. (2023) and Huang et al. (2016) specifically highlight the importance of examining ILS applications in particular learning processes and developmental stages.
Existing systematic reviews in this domain present several limitations that this study addresses. First, most previous reviews focus exclusively on either bibliometric mapping or effectiveness synthesis, but rarely integrate both perspectives. Second, prior reviews often include higher education contexts without specifically examining the distinct developmental and pedagogical considerations of K-12 education. Third, regional and cultural variations in ILS implementation have received insufficient attention in existing synthesis literature. Finally, critical examination of computational thinking development—a key justification for ILS adoption in STEM education—remains underexplored in systematic reviews.
This scarcity of integrated systematic analysis represents both a theoretical gap in understanding ILS as a research field and a pragmatic obstacle for evidence-informed decision-making by educational institutions and governmental agencies responsible for educational technology policies.
1.5 Research objectives and design rationale
This study adopts a systematic review approach following PRISMA 2020 guidelines to synthesize empirical evidence regarding ILS implementation approaches and effectiveness outcomes in basic and secondary education. The systematic review provides depth of engagement with individual studies, critical quality appraisal, and nuanced interpretation of findings—elements essential for evidence-based practice.
The justification for this design rests on the need for comprehensive evidence synthesis. Systematic review enables identification of effectiveness patterns, critical success factors, and implementation considerations—insights that individual studies alone cannot provide. The integration offers a more comprehensive understanding of ILS impact in K-12 contexts.
1.6 Research questions
This study addresses the following research questions:
RQ1: What types of Intelligent Learning Systems are most frequently implemented in basic and secondary education?
RQ2: What is the empirical evidence regarding ILS effectiveness in improving student learning outcomes across different subject domains?
RQ3: What pedagogical, technological, and contextual factors influence ILS implementation effectiveness?
RQ4: What regional variations exist in ILS implementation approaches and effectiveness outcomes?
RQ5: To what extent do STEM-focused ILS implementations explicitly address computational thinking skills development?
RQ6: What are the temporal trends in ILS research publication and thematic evolution in K-12 education (2014–2024)?
1.7 Research hypotheses
Based on preliminary literature analysis and theoretical considerations, we propose the following hypotheses:
H1: ILS research publication has experienced accelerated growth following 2020, reflecting increased attention to educational technology during and after the COVID-19 pandemic.
H2: ILS implementations demonstrate overall positive effects on student learning outcomes across various subject domains in K-12 education.
H3: Effectiveness varies significantly by subject domain, with STEM disciplines showing higher effectiveness rates due to their structured, sequential nature that aligns with ILS algorithmic capabilities.
H4: Significant regional differences exist in ILS implementation approaches, reflecting diverse pedagogical traditions, technological infrastructures, and educational priorities.
H5: Despite widespread implementation of ILS in STEM education, explicit development of computational thinking skills remains an underemphasized dimension in most implementations.
1.8 Scope and delimitations
This study focuses specifically on ILS applications in formal K-12 education settings (ages approximately 5–18), encompassing primary/elementary and secondary education or their international equivalents. We exclude higher education contexts due to distinct developmental characteristics, pedagogical approaches, and institutional structures. The temporal scope spans January 2014 through December 2024, providing a decade-long perspective on recent developments while maintaining relevance to current practice. The analysis includes studies published in English or Spanish, potentially limiting insights from research published in other languages. We focus on peer-reviewed journal articles indexed in Scopus or Web of Science, excluding gray literature, conference proceedings, and book chapters—a delimitation justified by methodological rigor but acknowledged as potentially limiting scope.
2 Methods
2.1 Study design: systematic review approach
This research employs a systematic review design following PRISMA 2020 guidelines. This methodological choice addresses the objective of synthesizing empirical evidence regarding ILS implementation and effectiveness in K-12 education. The systematic review approach enables depth of engagement with individual studies, critical appraisal of methodological quality, and nuanced interpretation of findings.
While this protocol was not prospectively registered in systematic review registries such as PROSPERO, all stages were documented following PRISMA 2020 guidelines to ensure transparency and reproducibility.
2.2 PICOS framework
Following PRISMA 2020 recommendations, we established the PICOS framework to define the research scope precisely:
Population (P): Students in formal primary/elementary education (typically ages 5–11) and secondary/high school education (typically ages 12–18), encompassing the full K-12 spectrum of compulsory education. This includes international equivalents of these educational levels regardless of specific national nomenclature. The focus is on typical development; studies exclusively targeting special education populations were not excluded but were noted for subgroup analysis.
Intervention (I): Implementation of Intelligent Learning Systems as operationally defined: educational platforms, applications, or environments that employ artificial intelligence algorithms (including but not limited to machine learning, natural language processing, knowledge representation, adaptive algorithms, or learning analytics) to: (1) adapt content, difficulty, pacing, or sequencing; (2) provide intelligent feedback or guidance; (3) personalize learning experiences based on individual learner data; (4) make autonomous pedagogical decisions in response to learner behavior. Specific ILS categories include: Intelligent Tutoring Systems (ITS), adaptive learning platforms, learning analytics systems, AI-enhanced educational games, intelligent virtual/augmented reality environments, automated assessment systems, and personalized learning management systems with AI integration.
Comparators (C): Not required for inclusion; studies with or without control/comparison groups were eligible. When present, comparators included: traditional instruction without ILS, conventional educational technology without AI-enhanced adaptation, alternative ILS implementations, or baseline performance before ILS implementation.
Outcomes (O): For empirical studies: Measurable learning outcomes (achievement, knowledge acquisition, skill development, conceptual understanding), cognitive outcomes (problem-solving, critical thinking, computational thinking), affective outcomes (motivation, engagement, self-efficacy, attitudes toward learning), behavioral outcomes (time on task, persistence, help-seeking), metacognitive outcomes (self-regulation, learning strategy use). For all studies: Description and characterization of ILS implementation approaches, pedagogical strategies, and technological features.
Study designs (S): Empirical studies employing quantitative methods (randomized controlled trials, quasi-experimental designs, pre-post comparisons, correlational studies), qualitative methods (case studies, ethnographic studies, design-based research), or mixed methods. Purely conceptual, theoretical, or opinion pieces without empirical component were excluded.
2.3 Information sources and search strategy
2.3.1 Database selection
We selected Scopus and Web of Science Core Collection as primary information sources. This choice was justified by: (1) comprehensive coverage of peer-reviewed academic literature across multiple disciplines, (2) rigorous quality standards for indexed content, (3) availability of metadata necessary for analysis (citations, author affiliations, keywords), and (4) established use in educational technology systematic reviews. We acknowledge that this choice excludes potentially relevant studies in databases such as ERIC, PsycINFO, or IEEE Xplore. However, pilot searches indicated substantial overlap, and resource constraints necessitated a focused approach prioritizing comprehensive analysis of a well-defined corpus over exhaustive coverage.
2.3.2 Search strategy development
Search strategies were developed iteratively through: (1) preliminary scoping searches to identify relevant terms, (2) examination of keywords in highly relevant studies, (3) consultation with information specialists, and (4) pilot testing to refine search sensitivity and specificity. The final search strategy combined three concept groups using Boolean operators: ILS technology terms, educational level terms, and methodological filters.
Detailed Search Strings:
Scopus Search Strategy (executed March 15, 2024):
TITLE-ABS-KEY (("intelligent learning system*" OR "intelligent tutor* system*" OR
"adaptive learning" OR "personalized learning system*" OR "AI in education" OR
"artificial intelligence" AND "education" OR "machine learning" AND "education" OR
"educational data mining" OR "learning analytics" OR "intelligent educational system*" OR
"cognitive tutor*" OR "adaptive instruction*" OR "smart learning environment*")
AND ("primary education" OR "elementary education" OR "secondary education" OR "K-12" OR
"basic education" OR "middle school" OR "high school" OR "primary school" OR
"grade school" OR "junior high"))
AND PUBYEAR > 2013 AND PUBYEAR < 2025
AND (LIMIT-TO (DOCTYPE, "ar"))
AND (LIMIT-TO (LANGUAGE, "English") OR LIMIT-TO (LANGUAGE, "Spanish"))
AND (LIMIT-TO (SRCTYPE, "j"))
Web of Science Search Strategy (executed March 16, 2024):
TS = (("intelligent learning system*" OR "intelligent tutor* system*" OR
"adaptive learning" OR "personalized learning system*" OR "AI in education" OR
("artificial intelligence" AND "education") OR ("machine learning" AND "education") OR
"educational data mining" OR "learning analytics" OR "intelligent educational system*" OR
"cognitive tutor*" OR "adaptive instruction*" OR "smart learning environment*")
AND ("primary education" OR "elementary education" OR "secondary education" OR "K-12" OR
"basic education" OR "middle school" OR "high school" OR "primary school" OR
"grade school" OR "junior high"))
AND PY = (2014-2024) AND DT = (Article) AND LA = (English OR Spanish)
2.3.3 Justification of temporal boundaries
The review covers publications from January 2014 through December 2024, deliberately excluding 2025 publications. This temporal boundary is justified by five methodological considerations:
Temporal completeness: The 2014–2024 decade provides a complete analytical unit encompassing the contemporary era of machine learning-enhanced ILS while avoiding the artificial truncation that would result from including partial-year 2025 data.
Publication maturity: Studies published in 2025 lack adequate time for peer discussion, citation accumulation, and scholarly discourse that contextualize findings within the broader research landscape. This maturation process is essential for systematic reviews synthesizing knowledge.
Data collection timing: Our systematic database searches were executed in March 2024, with supplementary searching extending through late 2024. Capturing 2025 publications would have required additional searches covering only the first months of that year, yielding an incomplete and systematically biased sample that does not represent the full year's scholarly output.
Reproducibility: A clearly defined temporal boundary enhances review reproducibility. Future researchers can replicate our search with precise temporal parameters. Including 2025 would require specification of the exact search date within 2025, complicating replication.
Indexing completeness: Very recent publications may not be fully indexed in bibliographic databases, with metadata still being processed. A one-year buffer ensures complete and accurate indexing of included studies.
2.3.4 Supplementary search strategies
To complement database searching, we conducted: (1) citation chaining (backward citation searching of included studies), (2) forward citation searching of seminal works in the field, and (3) hand-searching of three highly relevant journals (British Journal of Educational Technology, Computers & Education, Journal of Educational Technology & Society) for 2023–2024 to capture very recent publications potentially not yet indexed.
2.4 Eligibility criteria
2.4.1 Inclusion criteria
Studies were included if they met all of the following criteria:
Population: Students in formal basic/primary or secondary/high school education (typically ages 5–18) or international equivalents.
Intervention: Implementation of Intelligent Learning Systems as operationally defined (AI-enhanced platforms with adaptive, personalized, and intelligent decision-making capabilities). Studies must specify the technological tools employed.
Comparison: Not required; studies with or without control groups were eligible.
Outcomes: For empirical studies included in effectiveness synthesis: measurable learning outcomes, engagement, motivation, or other educational indicators. For all studies: description of ILS implementation approaches.
Study Design: Empirical studies employing quantitative, qualitative, or mixed methods. Conceptual or purely theoretical studies excluded.
Publication Type: Peer-reviewed journal articles indexed in Scopus or Web of Science.
Language: English or Spanish.
Time Period: January 2014–December 2024.
2.4.2 Exclusion criteria
Higher Education Focus: Studies conducted exclusively in university or adult education contexts, due to distinct developmental and institutional characteristics.
Insufficient Methodological Description: Studies lacking clear description of methods, sample, or ILS technology.
Non-ILS Technologies: Studies of general educational technology without AI-enhanced adaptive capabilities.
Non-Empirical: Opinion pieces, editorials, purely theoretical articles without empirical component.
Publication Type: Conference proceedings, book chapters, dissertations, gray literature. This exclusion is justified by: (1) focus on peer-reviewed research meeting rigorous quality standards, (2) practical constraints on accessing and evaluating gray literature quality, and (3) need for comprehensive metadata for analysis (more reliably available for journal articles). We acknowledge this excludes potentially valuable insights, particularly regarding practical implementation experiences often documented in conference proceedings. However, the focus on peer-reviewed journals enhances methodological rigor and reproducibility of the review process.
2.5 Study selection process
Study selection followed the four-stage PRISMA protocol:
Stage 1 - Identification: Database searches retrieved 847 records (Scopus: 521; Web of Science: 326). Supplementary searches identified an additional 8 records through citation chaining and journal hand-searching, yielding 855 total records.
Stage 2 - Screening: Following duplicate removal (n = 183 duplicates removed using EndNote 20), 672 unique records underwent title and abstract screening. Two reviewers (EMCS, research assistant) independently screened all records using predefined criteria implemented in Rayyan QCRI systematic review software. Inter-rater reliability was substantial (Cohen's κ = 0.78). Disagreements (n = 43) were resolved through discussion and consultation with senior author (DCBG) when consensus could not be reached. A total of 516 records were excluded at this stage, leaving 156 for full-text review.
Stage 3 - Eligibility: Full texts of 156 articles were retrieved and assessed independently by two reviewers. Reasons for exclusion were systematically documented: Higher education focus (n = 34), Non-empirical/theoretical only (n = 21), Insufficient ILS description (n = 15), and Conference proceedings (n = 14).
Stage 4 - Inclusion: After eligibility assessment, 72 articles met all inclusion criteria and were included in the final analysis. The complete PRISMA flow diagram (Figure 1) details the systematic selection process.
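The substantial inter-rater agreement reported at Stage 2 (Cohen's κ = 0.78) can be computed from paired screening decisions. A minimal sketch, assuming simple include/exclude labels; the helper function and the toy decisions below are illustrative, not the review's actual screening data:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    labels = set(rater_a) | set(rater_b)
    n = len(rater_a)
    # Observed agreement: proportion of items where both raters agree.
    p_o = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    p_e = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions for four records ("inc" = include).
a = ["inc", "inc", "exc", "exc"]
b = ["inc", "exc", "exc", "exc"]
print(cohens_kappa(a, b))  # 0.5
```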
Figure 1

PRISMA 2020 flow diagram illustrating the systematic study selection process for intelligent learning systems in primary and secondary education.
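As a consistency check, the counts in the four selection stages reconcile arithmetically. This sketch merely restates the figures reported above (variable names are ours):

```python
# Stage 1: records identified (Scopus + Web of Science + supplementary searches)
identified = 521 + 326 + 8            # 855
# Stage 2: after duplicate removal, then title/abstract screening
after_dedup = identified - 183        # 672 unique records
full_text = after_dedup - 516         # 156 retained for full-text review
# Stage 3: documented full-text exclusions
excluded = 34 + 21 + 15 + 14          # higher ed, non-empirical, vague ILS, proceedings
# Stage 4: final corpus
included = full_text - excluded
print(identified, after_dedup, full_text, included)  # 855 672 156 72
```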
2.6 Data extraction
Using a standardized data extraction form developed and pilot-tested on five studies, two reviewers independently extracted:
Study context: Country, educational level (primary/secondary/both), subject domain, sample size, participant age/grade.
ILS characteristics: Specific technology name, ILS category (e.g., ITS, adaptive platform), underlying AI techniques (e.g., machine learning, NLP), pedagogical approach (e.g., constructivist, behaviorist), described features (adaptability, feedback type, gamification elements).
Implementation details: Duration of intervention, integration mode (classroom supplement, replacement, homework), teacher involvement level, training provided.
Effectiveness data: Measured outcomes (learning achievement, engagement, motivation, etc.), comparison group presence, effect direction (positive/negative/mixed/null), effect size when reported, statistical significance.
Computational thinking: For STEM implementations, explicit mention of computational thinking objectives, assessed components (abstraction, algorithm design, debugging, etc.), measurement approaches.
Equity considerations: Accessibility features, attention to digital divide, diverse learner populations included.
Data extraction discrepancies were discussed until consensus was reached. When critical information was unclear or missing, we noted this as “not reported” rather than making assumptions.
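As an illustration, one row of the standardized extraction form could be represented as a record like the following. This is a hypothetical sketch: the class and field names paraphrase the categories listed above and are not the actual instrument.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ExtractionRecord:
    # Study context
    country: str
    educational_level: str                 # "primary" | "secondary" | "both"
    subject_domain: str
    # ILS characteristics
    ils_category: str                      # e.g., "ITS", "adaptive platform"
    ai_techniques: List[str] = field(default_factory=list)
    # Implementation and effectiveness; None encodes "not reported"
    sample_size: Optional[int] = None
    duration_weeks: Optional[float] = None
    effect_direction: Optional[str] = None # "positive"/"mixed"/"null"/"negative"
    ct_explicit: bool = False              # computational thinking named as objective

rec = ExtractionRecord(country="Spain", educational_level="primary",
                       subject_domain="mathematics", ils_category="ITS")
print(rec.sample_size)  # None, i.e. "not reported" rather than assumed
```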
2.7 Methodological quality assessment
Methodological quality was assessed using adapted Joanna Briggs Institute (JBI) critical appraisal tools, selected for their applicability to diverse study designs encountered in educational technology research. We used specific checklists for experimental/quasi-experimental studies, qualitative studies, and mixed-methods studies as appropriate. Two reviewers independently appraised all studies, with disagreements resolved through discussion.
Quality assessment criteria included: clear research objectives and appropriate methodology, adequate sample characteristics and sampling strategy, clear ILS description with sufficient implementation detail, valid data collection instruments and procedures, appropriate analytical methods, clear results presentation with acknowledged limitations, and evidence of ethical considerations.
Following JBI methodology, each study was rated on multiple quality criteria, then assigned an overall quality classification:
High Quality (Low Bias Risk): Meets all or nearly all quality criteria; minor weaknesses not compromising validity of conclusions.
Moderate Quality (Moderate Bias Risk): Meets most quality criteria; some methodological limitations that may influence conclusions but do not invalidate them.
Low Quality (High Bias Risk): Fails to meet several quality criteria; significant methodological limitations that substantially compromise confidence in findings.
Studies were not excluded based on quality ratings; however, quality was considered in interpretation and synthesis of findings.
2.8 Data synthesis and analysis
2.8.1 Effectiveness synthesis
Given the heterogeneity of ILS types, educational contexts, and outcome measures, we employed narrative synthesis rather than meta-analysis. For each study reporting effectiveness data, we categorized the overall finding as:
Positive effect: Statistically significant improvement in primary outcome(s) or substantial qualitative evidence of benefits, attributed to ILS.
Mixed effect: Some outcomes positive, others null or negative; or positive for some subgroups but not others.
Null effect: No significant differences or no apparent benefits.
Negative effect: Significant decrease in outcomes or substantial evidence of harmful effects.
Operational Definition of “Effectiveness”: Recognizing that the concept of “effectiveness” can be operationalized differently across studies, we established the following criteria for classification. A study was classified as showing a “positive effect” if: (1) quantitative studies reported statistically significant improvements (p < 0.05) in primary learning outcomes (achievement, skill acquisition, competency development) compared to baseline, control group, or established benchmarks; OR (2) quantitative studies reported substantial effect sizes (d ≥ 0.4 or equivalent) even when not reaching statistical significance due to small samples; OR (3) qualitative studies provided convergent evidence from multiple data sources (observations, interviews, artifacts) of meaningful learning improvements or pedagogical benefits attributed to ILS; OR (4) mixed-methods studies showed both statistical and qualitative evidence of positive outcomes.
For studies reporting multiple outcomes, we based classification on the primary outcome specified by authors, or, when not specified, on learning achievement measures over attitudinal or engagement measures. This decision rule reflects prioritization of learning outcomes while recognizing that motivation and engagement are important mediating factors.
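For quantitative studies, criteria (1) and (2) can be sketched as a decision rule. This is an illustrative simplification: the function name and parameters are ours, and the handling of mixed, null, and negative cases is condensed relative to the full protocol.

```python
def classify_quantitative(p_value=None, effect_size=None, direction="positive"):
    """Classify a quantitative study's primary outcome per criteria (1)-(2).

    direction: overall direction of the observed change ("positive"/"negative").
    """
    # Significant decline takes precedence: a "negative effect".
    if direction == "negative" and p_value is not None and p_value < 0.05:
        return "negative"
    # Criterion (1): statistically significant improvement (p < 0.05).
    if p_value is not None and p_value < 0.05:
        return "positive"
    # Criterion (2): substantial effect size (d >= 0.4) despite non-significance.
    if effect_size is not None and effect_size >= 0.4:
        return "positive"
    return "null"

print(classify_quantitative(p_value=0.21, effect_size=0.45))  # positive
```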
Important Caveats: Several factors require consideration in interpreting effectiveness findings:
Outcome measures varied substantially across studies (standardized tests, researcher-developed assessments, performance tasks, self-report surveys).
Study designs differed in methodological rigor (randomized controlled trials vs. single-group pre-post designs).
Implementation contexts varied (in-class use, homework, after-school programs, duration from single sessions to full academic years).
Publication bias likely favors positive findings.
Sample sizes ranged from small pilots (n < 20) to large implementations (n > 1,000).
For subjects with fewer than 5 studies, effectiveness rates should be interpreted with extreme caution due to small sample size.
We analyzed effectiveness patterns by: subject domain, ILS technology type, educational level, study quality, geographic region, and presence/absence of key features (e.g., teacher training, systematic pedagogical integration).
2.8.2 Thematic analysis
Following Braun and Clarke's reflexive thematic analysis approach, we conducted inductive analysis of study findings to identify patterns regarding: ILS implementation approaches (common and distinctive features of successful implementations, pedagogical strategies employed, role of teacher and technology integration), factors influencing effectiveness (pedagogical, technological, and contextual factors), and regional approaches (distinctive patterns in how different world regions conceptualize and implement ILS).
Analysis proceeded through: (1) familiarization with data through repeated reading, (2) systematic coding of relevant features, (3) collating codes into potential themes, (4) reviewing themes for internal homogeneity and external heterogeneity, (5) defining and naming themes, (6) producing the analytic narrative with illustrative examples.
2.8.3 Computational thinking analysis
For studies conducted in STEM contexts, we systematically examined whether and how computational thinking was addressed. We coded: (1) explicit mention of computational thinking as an objective, (2) specific CT components addressed (algorithmic thinking, abstraction, problem decomposition, pattern recognition, generalization, debugging, iteration), (3) assessment methods used to measure CT development, (4) reported findings regarding CT outcomes.
3 Results
3.1 Study selection and characteristics
The systematic search identified 855 records from database searching and supplementary sources. After duplicate removal and systematic screening, 72 studies met all inclusion criteria and were included in the systematic review. The PRISMA flow diagram (Figure 1) details the selection process.
3.2 Temporal distribution
Analysis of publication trends reveals dramatic growth in ILS research over the examined decade, with a notable inflection point coinciding with the COVID-19 pandemic (Table 1).
Table 1
| Year | Number of articles | Percentage | Cumulative % |
|---|---|---|---|
| 2014 | 6 | 8.3% | 8.3% |
| 2015 | 3 | 4.2% | 12.5% |
| 2016 | 2 | 2.8% | 15.3% |
| 2017 | 0 | 0.0% | 15.3% |
| 2018 | 3 | 4.2% | 19.4% |
| 2019 | 6 | 8.3% | 27.8% |
| Pre-pandemic period (2014–2019): n = 20 (27.8%) | | | |
| 2020 | 11 | 15.3% | 43.1% |
| 2021 | 12 | 16.7% | 59.7% |
| 2022 | 14 | 19.4% | 79.2% |
| 2023 | 12 | 16.7% | 95.8% |
| 2024 | 3 | 4.2% | 100.0% |
| Pandemic/post-pandemic period (2020–2024): n = 52 (72.2%) | | | |
| Total | 72 | 100.0% | |
Temporal distribution of ILS research publications (2014–2024).
Bold values indicate the highest frequency or most notable result within each category.
The 2014–2019 pre-pandemic period accounted for only 27.8% (n = 20) of total publications, with a relatively stable but modest annual average of 3.3 articles per year. In contrast, the 2020–2024 period produced 72.2% (n = 52) of the corpus, representing an annual average of 10.4 articles, a 215% increase. The peak publication year was 2022 (n = 14, 19.4% of corpus), followed by 2021 (n = 12, 16.7%) and 2023 (n = 12, 16.7%).
A chi-square goodness-of-fit test confirmed that publication counts differ significantly between the pre-pandemic (2014–2019) and pandemic/post-pandemic (2020–2024) periods, χ2(1) = 14.22, p < 0.001, supporting Hypothesis H1 regarding accelerated growth following 2020.
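The reported statistic corresponds to a chi-square goodness-of-fit test of the two period counts against equal expected frequencies (36 publications per period); a minimal Python sketch (standard library only, not the authors' analysis code) reproduces it:

```python
import math

# Observed publication counts: pre-pandemic (2014-2019) vs. pandemic/post-pandemic (2020-2024)
observed = [20, 52]

# Expected count under the null hypothesis of equal frequencies in each period
expected = sum(observed) / len(observed)  # 36.0

# Pearson chi-square statistic with 1 degree of freedom
stat = sum((o - expected) ** 2 / expected for o in observed)

# Survival function of the chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x/2))
p = math.erfc(math.sqrt(stat / 2))

print(round(stat, 2), p < 0.001)  # 14.22 True
```

Note that this test treats the two periods symmetrically; it does not adjust for the pre-pandemic window spanning six years versus five for the pandemic/post-pandemic window.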
3.3 Geographic distribution
Analysis of corresponding author affiliations reveals significant geographic concentration in ILS research production, with notable disparities between world regions (Table 2).
Table 2
| Region | Frequency | Leading countries (n) |
|---|---|---|
| Europe | 26 (36.1%) | Spain (8), Netherlands (4), UK (4) |
| Latin America | 15 (20.8%) | Brazil (5), Colombia (4), Mexico (3) |
| North America | 15 (20.8%) | United States (11), Canada (4) |
| Asia | 13 (18.1%) | China (4), Taiwan (3), Singapore (2) |
| Middle East | 2 (2.8%) | Turkey (1), Saudi Arabia (1) |
| Africa | 1 (1.4%) | South Africa (1) |
| Total | 72 (100%) | |
Geographic distribution of ILS research by region and leading countries.
Europe dominates ILS research production with 36.1% (n = 26) of studies, suggesting strong institutional support and research infrastructure for educational technology. Spain (n = 8), the Netherlands (n = 4), and the United Kingdom (n = 4) were the most productive European countries.
Latin America and North America each contributed 20.8% (n = 15) of studies, representing equivalent research productivity. Within Latin America, Brazil (n = 5), Colombia (n = 4), and Mexico (n = 3) were most represented. Within North America, the United States (n = 11) dominated, with limited Canadian representation (n = 4).
Asia contributed 18.1% (n = 13) of studies, with concentration in technologically advanced nations: China (n = 4), Taiwan (n = 3), and Singapore (n = 2). The Middle East (n = 2, 2.8%) and Africa (n = 1, 1.4%) were substantially underrepresented.
3.4 Educational level distribution
Analysis by educational level reveals concentration at the secondary level (Table 3). Secondary education implementations (52.8%, n = 38) substantially outnumber primary education implementations (33.3%, n = 24), with 13.9% (n = 10) of studies spanning both levels. This pattern likely reflects: (1) greater curricular complexity at secondary level creating more opportunities for ILS differentiation, (2) older students' increased digital literacy facilitating independent ILS use, (3) higher-stakes assessment at secondary level driving technology adoption, and (4) research convenience as secondary students can participate more independently in studies.
Table 3
| Educational level | Frequency | Percentage |
|---|---|---|
| Primary/elementary (ages 5–11) | 24 | 33.3% |
| Secondary (middle/high school, ages 12–18) | 38 | 52.8% |
| Both primary and secondary | 10 | 13.9% |
| Total | 72 | 100% |
Educational level distribution of ILS implementations.
3.5 Disciplinary focus
Disciplinary analysis reveals strong STEM concentration (Table 4). Mathematics dominates with 45.8% (n = 33) of implementations, followed by Sciences at 23.6% (n = 17), reflecting both the structured nature of these subjects and their historical prominence in ILS research. Computer Science/Programming accounts for 11.1% (n = 8), Language Arts for 9.7% (n = 7), with only 9.7% (n = 7) addressing other subjects or transversal competencies. This STEM concentration (80.5% of studies) raises questions about ILS applicability and research attention in humanities and social sciences.
Table 4
| Subject domain | Frequency | Percentage |
|---|---|---|
| Mathematics | 33 | 45.8% |
| Science (physics, chemistry, biology) | 17 | 23.6% |
| Computer science/programming | 8 | 11.1% |
| Language arts/reading/writing | 7 | 9.7% |
| Other/transversal competencies | 7 | 9.7% |
| Total | 72 | 100% |
Disciplinary distribution of ILS implementations.
3.6 Types of ILS
Included studies implemented diverse ILS technologies, categorized into primary types (Table 5). Some studies implemented multiple ILS types; for Table 5, each study was counted once under its primary type, so frequencies sum to the 72 included studies.
Table 5
| ILS type | Frequency | Percentage |
|---|---|---|
| Intelligent Tutoring Systems (ITS) | 31 | 43.1% |
| Adaptive learning systems | 15 | 20.8% |
| Learning analytics tools | 11 | 15.3% |
| AI-enhanced educational applications | 6 | 8.3% |
| Automated assessment systems | 5 | 6.9% |
| Personalized learning management systems | 4 | 5.6% |
| Total | 72 | 100% |
Types of intelligent learning systems implemented.
Intelligent Tutoring Systems (ITS) emerged as the dominant category at 43.1% (n = 31), characterized by domain models, student models, tutoring models, and interface components enabling step-by-step guidance. Adaptive Learning Systems represented 20.8% (n = 15), providing personalized content and pacing. Learning Analytics Tools (15.3%, n = 11) focused on data-driven insights. AI-Enhanced Educational Applications comprised 8.3% (n = 6), including various AI-powered learning tools. Automated Assessment Systems accounted for 6.9% (n = 5), and Personalized LMS represented 5.6% (n = 4).
3.7 Overall effectiveness evidence
Of 72 included studies, 65 (90.3%) reported empirical data on learning outcomes, engagement, or other effectiveness indicators. Seven studies (9.7%) were descriptive implementations without outcome evaluation.
Among 65 studies with outcome data, the overall effectiveness pattern strongly supported ILS benefits (Table 6). Positive effects were reported by 58 studies (89.2%), with statistically significant improvements in learning outcomes, substantial gains in engagement or motivation, or convergent qualitative evidence of pedagogical benefits attributable to ILS. Effect sizes, when reported (n = 31 studies), ranged from small (d = 0.21) to very large (d = 1.84), with median d = 0.58, indicating moderate-to-large practical significance.
Table 6
| Effect type | Number of studies | Percentage |
|---|---|---|
| Positive effects | 58 | 89.2% |
| Mixed effects | 5 | 7.7% |
| Null effects | 2 | 3.1% |
| Negative effects | 0 | 0.0% |
| Total | 65 | 100% |
Overall effectiveness distribution (n = 65 studies with outcome data).
Median effect size d = 0.58 (n = 31 studies reporting effect sizes).
Mixed effects were reported by five studies (7.7%), showing inconsistent patterns such as improvements in some outcomes but not others. Null effects were reported by two studies (3.1%), with no significant differences between ILS and comparison conditions. No studies reported net negative effects.
This 89% overall positive effectiveness rate provides robust support for Hypothesis H2 regarding ILS benefits in K-12 education.
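The effect sizes reported above are in Cohen's d units. For readers unfamiliar with the metric, the sketch below computes d for two independent groups using a pooled standard deviation; the means, standard deviations, and group sizes are hypothetical, chosen only to land near the reported median of 0.58, and are not taken from any reviewed study.

```python
import math

def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d for two independent groups using the pooled standard deviation."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

# Hypothetical post-test scores: ILS group vs. comparison group
d = cohens_d(mean_t=78.0, sd_t=10.0, n_t=30, mean_c=72.0, sd_c=10.0, n_c=30)
print(round(d, 2))  # 0.6
```

By the conventional benchmarks (0.2 small, 0.5 medium, 0.8 large), the corpus median of d = 0.58 sits in the moderate range, consistent with the "moderate-to-large" characterization above.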
3.8 Effectiveness by subject domain
Analysis by subject domain reveals differential effectiveness patterns, with strongest evidence in STEM disciplines (Table 7).
Table 7
| Subject domain | Positive | Total | Effectiveness |
|---|---|---|---|
| Mathematics | 23 | 25 | 92% |
| Language learning | 4 | 4 | 100% |
| STEM competencies (integrated) | 4 | 4 | 100% |
| Language/reading comprehension | 8 | 9 | 89% |
| Sciences (physics, chem., bio.) | 7 | 8 | 88% |
| Transversal/cross-cutting skills | 9 | 11 | 82% |
| Artificial intelligence/programming | 3 | 4 | 75% |
| Overall | 58 | 65 | 89% |
ILS effectiveness by subject domain (n = 65 studies with outcome data).
Effectiveness rates for subjects with n < 5 studies should be interpreted with caution.
Mathematics (n = 25 studies) demonstrated 92% effectiveness (23 studies positive, 2 mixed), likely reflecting mathematics' structured, sequential nature aligning well with algorithmic modeling. Language Learning (n = 4) and STEM Competencies (n = 4) showed perfect 100% effectiveness, though small samples require cautious interpretation. Language/Reading Comprehension (n = 9) showed 89% effectiveness (8 positive, 1 mixed). Sciences (n = 8) demonstrated 88% effectiveness (7 positive, 1 null). Transversal Skills (n = 11) showed 82% effectiveness (9 positive, 2 mixed). AI/Programming (n = 4) demonstrated lowest effectiveness at 75% (3 positive, 1 null), possibly reflecting novelty of the content domain.
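As a quick arithmetic check, the percentages in Table 7 can be regenerated from the positive/total counts (values copied from the table itself; a throwaway verification sketch, not analysis code):

```python
# (positive, total) study counts per subject domain, copied from Table 7
domains = {
    "Mathematics": (23, 25),
    "Language learning": (4, 4),
    "STEM competencies (integrated)": (4, 4),
    "Language/reading comprehension": (8, 9),
    "Sciences": (7, 8),
    "Transversal/cross-cutting skills": (9, 11),
    "AI/programming": (3, 4),
}

# Per-domain effectiveness, rounded to whole percent as in the table
rates = {name: round(100 * pos / total) for name, (pos, total) in domains.items()}

# Overall rate: 58 positive of the 65 studies with outcome data
overall = round(100 * sum(p for p, _ in domains.values())
                / sum(t for _, t in domains.values()))

print(rates["Mathematics"], rates["Sciences"], overall)  # 92 88 89
```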
3.9 Effectiveness by educational level
Analysis by educational level shows relatively consistent effectiveness across primary and secondary contexts (Table 8). Secondary education implementations demonstrated 78.9% effectiveness (30 of 38 studies positive), primary education showed 75.0% effectiveness (18 of 24 studies positive), and implementations spanning both levels showed 70.0% effectiveness (7 of 10 studies positive). Note that these rates use all 72 included studies as denominators, including those without outcome data, so they are not directly comparable to the rates in Table 6. The similar effectiveness across levels suggests ILS benefits are not developmentally specific but can be realized across the K-12 spectrum when appropriately designed.
Table 8
| Educational level | Positive | Total | Effectiveness |
|---|---|---|---|
| Secondary (ages 12–18) | 30 | 38 | 78.9% |
| Primary (ages 5–11) | 18 | 24 | 75.0% |
| Both primary and secondary | 7 | 10 | 70.0% |
| Overall | 55 | 72 | 76.4% |
ILS effectiveness by educational level.
3.10 Methodological quality
Quality assessment revealed generally sound methodological rigor (Table 9). High Quality studies (37.5%, n = 27) demonstrated rigorous methodology with clear research designs, adequate samples, valid instruments, appropriate analyses, and well-supported conclusions. Moderate Quality studies (56.9%, n = 41) met most quality criteria but exhibited some methodological limitations such as small convenience samples, limited description of ILS intervention details, brief intervention durations, or absence of control groups. Low Quality studies (5.6%, n = 4) showed significant methodological weaknesses limiting confidence in findings.
Table 9
| Quality level | n | % | Characteristics |
|---|---|---|---|
| High quality (low bias risk) | 27 | 37.5% | Rigorous design, adequate sample, valid instruments, appropriate analysis |
| Moderate quality (moderate risk) | 41 | 56.9% | Meets most criteria, some limitations (e.g., small sample, brief duration) |
| Low quality (high bias risk) | 4 | 5.6% | Significant methodological weaknesses limiting confidence |
| Total | 72 | 100% | |
Methodological quality assessment of included studies (n = 72).
Common methodological limitations across studies included: insufficient description of ILS implementation details (31% of studies), small sample sizes below 50 (43% of quantitative studies), convenience sampling without representativeness discussion (52% of studies), brief intervention periods under 4 weeks (39% of studies), absence of longitudinal follow-up (87% of studies), and limited attention to implementation fidelity (68% of studies).
Sensitivity analysis revealed no significant differences in effectiveness patterns between high-quality and moderate-quality studies (89% vs. 88% positive effects, respectively), suggesting findings are robust despite methodological limitations.
3.11 Computational thinking in STEM implementations
Among 35 STEM-focused studies, systematic analysis of computational thinking (CT) attention revealed substantial gaps. Only 8 studies (23%) explicitly stated CT objectives, and of these, only 3 (9% of STEM studies) actually measured CT outcomes using validated assessments. Many STEM ILS implementations (n = 17) created opportunities for CT development through problem-solving activities, algorithmic procedures, or data analysis, but authors did not frame these as CT explicitly nor assess CT outcomes.
This analysis confirms Hypothesis H5: despite widespread STEM implementation and frequent justifications citing 21st-century skills development, computational thinking remains an underemphasized and underassessed dimension in most ILS research. This represents a significant missed opportunity, as ILS environments are naturally suited to CT development through their inherent computational nature and data-rich contexts.
4 Discussion
4.1 Principal findings
This systematic review provides comprehensive evidence regarding Intelligent Learning Systems implementation and effectiveness in K-12 education. The 89% overall positive effectiveness rate, with moderate-to-large median effect size (d = 0.58), constitutes strong evidence for ILS potential to enhance learning outcomes when appropriately implemented.
4.2 Temporal dynamics and COVID-19 impact
The exponential growth in ILS research following 2020, with 72% of publications concentrated in 2020–2024, provides strong support for Hypothesis H1. This pattern suggests the COVID-19 pandemic served as an inflection point, accelerating both ILS adoption and research attention. The sustained high publication rate through 2023–2024 indicates this represents fundamental field transformation rather than temporary response to crisis circumstances.
4.3 Effectiveness evidence and interpretive considerations
The 89% overall positive effectiveness rate provides robust support for Hypothesis H2. However, several factors require consideration: publication bias likely inflates effectiveness estimates; heterogeneity precludes interpretation as a precise effect estimate; methodological limitations introduce uncertainty; comparison group concerns raise questions about whether benefits derive from specific intelligent features or more general factors; and brief intervention periods (median 6 weeks) limit conclusions regarding long-term effectiveness.
Despite these caveats, the consistency of positive findings across diverse contexts, ILS types, subject domains, and world regions provides compelling evidence for ILS educational value when appropriately implemented.
4.4 Subject domain variations
Differential effectiveness by subject domain, with highest rates in mathematics (92%), partially supports Hypothesis H3. Mathematics and computational sciences possess hierarchical, sequential knowledge structures aligning well with algorithmic modeling. Language arts outcomes involving interpretation, creativity, or argumentation are more difficult to assess automatically or objectively. Mathematics ILS have benefited from decades of research and development, producing sophisticated, validated systems.
4.5 Computational thinking gaps
The finding that only 23% of STEM implementations explicitly addressed computational thinking, and only 9% measured CT outcomes, confirms Hypothesis H5. This represents a significant disconnect between policy rhetoric and research practice, suggesting missed opportunities to leverage ILS for this important 21st-century competency.
4.6 Regional variations and cultural contextualization
Substantial regional variations in implementation approaches, supporting Hypothesis H4, challenge assumptions about universal best practices while affirming that ILS effectiveness transcends cultural contexts. European emphasis on personalization and self-regulation reflects pedagogical traditions emphasizing learner autonomy. Latin American innovation in gamification and VR/AR represents creative adaptation to resource constraints. North American focus on equity and achievement gaps reflects specific historical and policy contexts. Asian integration of sophisticated analytics leverages strong technological infrastructure and computational expertise.
4.7 Critical effectiveness factors
Eight identified effectiveness factors provide an empirically-grounded framework: pedagogical design alignment, feedback quality and immediacy, adaptive personalization, affective and motivational elements, purposeful gamification, systematic curricular integration, teacher preparation and support, and accessibility/equity considerations. The primacy of pedagogical design challenges techno-centric assumptions that AI sophistication alone determines effectiveness.
4.8 Limitations
Focus on Scopus and Web of Science and English/Spanish publications likely excludes relevant research from other languages and databases. Exclusion of books, conference proceedings, and gray literature potentially misses valuable insights, particularly regarding practical implementation experiences. This exclusion was necessary to maintain methodological rigor and focus on peer-reviewed research meeting established quality standards, but we acknowledge it may limit the comprehensiveness of our synthesis. Rapid technological evolution means some reviewed technologies may already be obsolete. Substantial heterogeneity precludes precise effect size estimation through meta-analysis. Publication and reporting biases likely inflate effectiveness estimates. Brief intervention periods limit conclusions regarding long-term effectiveness.
5 Overall findings and synthesis
This systematic review of 72 peer-reviewed studies published between 2014–2024 provides comprehensive evidence regarding Intelligent Learning Systems implementation and effectiveness in K-12 education globally. The synthesis reveals five fundamental dimensions:
5.1 Evidence strength and effectiveness
The overall 89% positive effectiveness rate (58 of 65 studies with outcome data), supported by a moderate-to-large median effect size (d = 0.58), constitutes robust evidence for ILS educational potential. This finding demonstrates consistency across diverse educational contexts, methodological approaches, and world regions, suggesting that ILS benefits transcend specific implementation modalities when appropriately designed and deployed.
Subject-specific effectiveness patterns reveal important nuances: mathematics demonstrates highest effectiveness (92%, n = 25), followed by sciences (88%, n = 8), language/reading comprehension (89%, n = 9), and perfect rates in language learning and integrated STEM competencies (100%, n = 4 each). These variations likely reflect differential alignment between subject epistemologies and ILS algorithmic capabilities, with structured, sequential knowledge domains showing particular affinity for intelligent tutoring approaches.
5.2 Technological landscape and evolution
Intelligent Tutoring Systems emerged as the dominant technology category (46%, n = 33), reflecting field maturation toward empirically validated approaches developed over decades of research. Programming and machine learning tools (19%, n = 14), gamification platforms (11%, n = 8), and virtual/augmented reality systems (11%, n = 8) represent emerging innovation directions, particularly prominent in Latin American implementations.
The temporal analysis reveals dramatic research acceleration following 2020, with 72% of publications concentrated in the pandemic and post-pandemic period (2020–2024) vs. 28% in the pre-pandemic years (2014–2019). This 215% increase in annual publication average suggests COVID-19 served as an inflection point, catalyzing both ILS adoption and research attention while driving field evolution toward more sophisticated, adaptive technologies.
5.3 Global implementation patterns
Geographic analysis reveals significant disparities: Europe dominates research production (36.1%, n = 26), followed by equivalent contributions from Latin America and North America (20.8% each, n = 15), with substantial Asian representation (18.1%, n = 13) but limited Middle Eastern (2.8%, n = 2) and African (1.4%, n = 1) participation.
Regional variations in pedagogical approaches reflect distinct educational traditions and priorities: European emphasis on personalization and self-regulated learning aligns with constructivist pedagogical foundations; Latin American innovation in gamification and affective computing represents creative adaptation to resource constraints; North American focus on equity and achievement gap reduction addresses specific societal concerns; Asian integration of sophisticated analytics leverages technological infrastructure strengths.
5.4 Critical success factors
Eight empirically-grounded effectiveness factors emerged from thematic analysis: (1) pedagogical design alignment with learning objectives, (2) feedback quality and immediacy, (3) adaptive personalization mechanisms, (4) affective and motivational elements, (5) purposeful gamification integration, (6) systematic curricular alignment, (7) comprehensive teacher preparation, and (8) explicit accessibility and equity considerations.
The primacy of pedagogical design challenges techno-centric assumptions, confirming that ILS effectiveness depends fundamentally on coherent integration with sound educational principles rather than technological sophistication alone. Teacher training emerges as a critical mediating factor, with prepared educators better positioned to leverage ILS capabilities for differentiated instruction and formative assessment.
5.5 Methodological quality and research gaps
Quality assessment revealed generally sound rigor: 94.4% of studies achieved high or moderate quality ratings using Joanna Briggs Institute criteria. However, common limitations included insufficient ILS implementation descriptions (31%), small convenience samples (43% with n < 50), brief intervention durations (39% under 4 weeks), and absent longitudinal follow-up (87%).
Several critical research gaps emerged: (1) computational thinking remains underemphasized despite widespread STEM implementation (only 23% explicitly addressed CT, 9% measured outcomes); (2) non-STEM applications remain underexplored, with 80.5% of studies concentrated in mathematics, sciences, and programming; (3) equity implications require deeper investigation beyond surface-level accessibility features; (4) longitudinal effectiveness evidence is virtually absent; (5) standardized effectiveness metrics are needed to enable meaningful cross-study comparison.
5.6 Implications for practice and policy
For educational practitioners, findings suggest ILS have transitioned from experimental technologies to empirically validated tools warranting systematic integration when appropriately implemented. Success requires: comprehensive teacher professional development emphasizing pedagogical integration over technical operation; careful alignment with curricular objectives and learning progressions; explicit attention to equity and accessibility from design inception; and balanced implementation preserving meaningful human interaction alongside intelligent automation.
For policymakers, the evidence supports strategic investment in ILS infrastructure and capacity building, particularly in mathematics and STEM education where effectiveness is most established. However, policies must address: equitable access across socioeconomic contexts; robust data privacy and security frameworks; algorithmic bias monitoring and mitigation; and teacher preparation program reform incorporating educational AI competencies.
For researchers, priority directions include: longitudinal studies examining sustained impact beyond experimental interventions; standardized effectiveness metrics enabling meta-analytic synthesis; qualitative investigation of teacher and student experiences; comparative analyses of different ILS approaches in specific contexts; deeper exploration of computational thinking development; and rigorous examination of non-STEM applications in humanities and social sciences.
5.7 Synthesis conclusion
In synthesis, this systematic review establishes ILS as pedagogically valuable technologies with demonstrated capacity to enhance personalized learning and provide immediate formative feedback in K-12 education. The convergent evidence across diverse contexts, methodologies, and world regions—combined with moderate-to-large effect sizes—supports cautiously optimistic conclusions regarding ILS transformative potential.
However, realizing this potential requires moving beyond technological solutionism toward comprehensive implementations that: prioritize pedagogical design, invest substantially in teacher preparation, address equity systematically, respect ethical boundaries regarding student data, and maintain critical perspective on appropriate roles for automated vs. human-mediated instruction. The field's rapid evolution demands continued empirical scrutiny, methodological rigor, and commitment to evidence-based practice as ILS technologies become increasingly sophisticated and pervasive in educational systems globally.
6 Conclusions
This systematic review provides convergent evidence about the transformative potential of Intelligent Learning Systems in basic and secondary education. The global effectiveness rate of 89% constitutes a robust indicator of ILS potential to improve academic results, with disciplinary variations reflecting differential affinity between specific pedagogical characteristics and technological capabilities.
Identified regional variations reveal distinctive pedagogical approaches reflecting cultural contexts, available resources, and specific educational priorities. Effectiveness determining factors identified underline the multidimensional nature of successful ILS implementation. Pedagogical design emerges as a fundamental factor, confirming that successful integration requires coherence between technological innovation and solid educational principles.
Practical implications are significant for the global educational ecosystem. Results suggest that ILS have transitioned from being experimental technologies to empirically validated tools that can substantially contribute to educational results improvement when appropriately implemented. However, success requires investment in teacher training, careful curricular design, and explicit consideration of equity and accessibility factors.
In synthesis, Intelligent Learning Systems represent a significant evolution in educational technology with demonstrated potential to transform teaching and learning in basic and secondary education. Their successful implementation requires a comprehensive approach that combines technological innovation with solid pedagogical foundations, specialized teacher training, and careful consideration of contextual and ethical factors.
Statements
Data availability statement
The original contributions are included in the article/supplementary material, and further inquiries can be directed to the corresponding authors.
Author contributions
EC: Writing – review & editing, Writing – original draft. DB: Writing – original draft, Writing – review & editing.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This research was funded by the General Directorate of Investigations (Dirección General de Investigaciones) of Universidad Santiago de Cali under Call No. DGI 01-2026. The funder had no involvement in study design, data collection, analysis, interpretation, or the decision to publish.
Acknowledgments
The authors thank Universidad Santiago de Cali for the institutional support provided for conducting this research. We also thank the peer reviewers whose constructive feedback substantially improved the quality and clarity of this manuscript.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Guo, L., Wang, D., Gu, F., Li, Y., Wang, Y., and Zhou, R. (2021). Evolution and trends in intelligent tutoring systems research: a multidisciplinary and scientometric view. Asia Pac. Educ. Rev. 22, 441–461. doi: 10.1007/s12564-021-09697-7
Holmes, W., and Tuomi, I. (2022). State of the art and practice in AI in education. Eur. J. Educ. 57, 542–570. doi: 10.1111/ejed.12533
Huang, X., Craig, S. D., Xie, J., Graesser, A., and Hu, X. (2016). Intelligent tutoring systems work as a math gap reducer in 6th grade after-school program. Learn. Individ. Diff. 47, 258–265. doi: 10.1016/j.lindif.2016.01.012
Lee, S. J., and Kwon, K. (2024). A systematic review of AI education in K-12 classrooms from 2018 to 2023: topics, strategies, and learning outcomes. Comput. Educ.: Artif. Intell. 6:100211. doi: 10.1016/j.caeai.2024.100211
Martin, F., Zhuang, M., and Schaefer, D. (2024). Systematic review of research on artificial intelligence in K-12 education (2017–2022). Comput. Educ.: Artif. Intell. 6:100195. doi: 10.1016/j.caeai.2023.100195
Rizvi, S., Waite, J., and Sentance, S. (2023). Artificial intelligence teaching and learning in K-12 from 2019 to 2022: a systematic literature review. Comput. Educ.: Artif. Intell. 4:100145. doi: 10.1016/j.caeai.2023.100145
Son, T. (2024). Intelligent tutoring systems in mathematics education: a systematic literature review using the substitution, augmentation, modification, redefinition model. Computers 13:270. doi: 10.3390/computers13100270
Steenbergen-Hu, S., and Cooper, H. (2013). A meta-analysis of the effectiveness of intelligent tutoring systems on college students' academic learning. J. Educ. Psychol. 105, 970–987. doi: 10.1037/a0032447
UNESCO (2023). Resumen del informe de seguimiento de la educación en el mundo 2023: Tecnología en la educación: ¿una herramienta en los términos de quién? Paris: UNESCO.
VanLehn, K., Milner, F., Banerjee, C., and Wetzel, J. (2023). A step-based tutoring system to teach underachieving students how to construct algebraic models. Int. J. Artif. Intell. Educ. 33, 473–512. doi: 10.1007/s40593-023-00328-3
Keywords
adaptive learning, artificial intelligence in education, educational technology, intelligent learning systems, primary education, PRISMA, secondary education, systematic review
Citation
Cerón Salazar EM and Burbano González DC (2026) Intelligent learning systems in primary and secondary education: a systematic review (2014–2024). Front. Educ. 11:1720377. doi: 10.3389/feduc.2026.1720377
Received
08 October 2025
Revised
06 January 2026
Accepted
21 January 2026
Published
26 February 2026
Volume
11 - 2026
Edited by
Sergio Ruiz-Viruel, University of Malaga, Spain
Reviewed by
Silvio Marcello Pagliara, University of Cagliari, Italy
Fivia Eliza, Padang State University, Indonesia
Copyright
© 2026 Cerón Salazar and Burbano González.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Diana Carolina Burbano González, diana.burbano02@usc.edu.co; Edison Marino Cerón Salazar, edison.ceron00@usc.edu.co