Abstract
Introduction:
Intelligent Learning Systems (ILS) are AI-enhanced educational technologies increasingly implemented in K-12 education.
Methods:
We analyzed 72 peer-reviewed articles from Scopus and WoS (2014–2024) following PRISMA 2020 guidelines using systematic evidence synthesis and thematic analysis.
Results:
Eighty-nine percent of studies reported overall positive outcomes; intelligent tutoring systems (ITS) dominate (46%); regional and disciplinary variations were identified.
Discussion/conclusion:
Critical effectiveness factors identified; gaps in computational thinking development and non-STEM research noted; longitudinal research needed.
1 Introduction
1.1 Background and context
Emerging educational technologies are fundamentally transforming learning environments, constituting a pedagogical phenomenon with potential to restructure traditional teaching-learning paradigms (Guo et al., 2021). Unlike conventional methodologies, contemporary technological developments offer diverse alternatives to address individual student needs through personalized educational experiences. This transformation has been particularly accelerated by recent advances in artificial intelligence and machine learning, which enable unprecedented levels of adaptability and responsiveness in educational contexts.
Intelligent Learning Systems (ILS) represent a significant evolution in this digital transformation. For this research, we operationally define ILS as educational platforms and applications that employ artificial intelligence algorithms—including machine learning, natural language processing, knowledge representation, or adaptive algorithms—to dynamically adjust content delivery, feedback mechanisms, or pedagogical strategies according to individual student needs, performance data, and learning trajectories (Holmes and Tuomi, 2022). This definition distinguishes ILS from general educational technology through three key characteristics: (1) AI-driven adaptability, (2) data-informed personalization, and (3) intelligent decision-making capacity that responds to learner behavior in real-time.
The conceptual boundaries between ILS and broader AI-enabled educational technologies require clarification. While Learning Management Systems (LMS) may incorporate some adaptive features, ILS are characterized by more sophisticated AI integration that enables autonomous pedagogical decision-making. Similarly, while learning analytics platforms analyze student data, ILS actively use this analysis to modify instructional approaches. This distinction is critical for understanding the unique contribution of ILS to educational practice and for appropriately categorizing the technologies examined in this review.
1.2 Evolution and theoretical foundations
ILS have evolved through distinct developmental phases since their inception. The first generation of computer-assisted instruction systems emerged in the 1960s with programmed learning approaches (VanLehn et al., 2023). The 1980s witnessed the birth of the first true intelligent tutors incorporating rule-based expert systems and cognitive modeling. The 1990s–2000s brought constraint-based modeling and model-tracing approaches, while the contemporary era (2010–present) has introduced machine learning-enhanced systems, natural language processing capabilities, affective computing, and multimodal learning analytics.
ILS integrate multiple pedagogical theoretical frameworks, reflecting the complexity of human learning processes. Constructivist principles inform the design of exploratory learning environments where students actively build knowledge. Social learning theory underpins collaborative and peer-learning features within ILS platforms. Cognitive load theory guides the adaptive scaffolding mechanisms that adjust task complexity. Self-regulated learning frameworks shape metacognitive support tools that help students monitor and control their learning processes. Formative assessment theory influences the immediate feedback systems that characterize many ILS implementations. This theoretical eclecticism represents both a strength—enabling comprehensive approaches to learning—and a challenge requiring careful integration to avoid contradictory pedagogical assumptions.
1.3 Current implementation context and challenges
During recent decades, and particularly following the COVID-19 pandemic, the incorporation of intelligent systems into educational settings has proliferated markedly. In basic and secondary education, these technologies not only enable personalization of content and learning pacing but also facilitate early intervention through precise identification of student needs and adaptive adjustment of instructional materials.
The innovation potential of ILS manifests primarily in their capacity to provide immediate, personalized feedback—a pedagogical intervention well-established in learning science literature as highly effective but challenging to implement at scale in traditional classroom settings. ILS thus offer a technological solution to a longstanding pedagogical challenge: how to provide individualized attention to diverse learners within resource-constrained educational systems.
However, ILS implementation presents significant challenges requiring critical examination. First, privacy and data protection concerns arise from extensive collection of student behavioral and performance data, raising questions about informed consent, data security, and potential surveillance implications (UNESCO, 2023). Second, equity and access issues emerge from the digital divide, with risk of exacerbating existing educational inequalities if implementation is not carefully designed for diverse contexts (Lee and Kwon, 2024). Third, algorithmic bias represents a technical-ethical challenge, as AI systems may perpetuate or amplify societal biases present in training data or design assumptions (Martin et al., 2024). Fourth, concerns about technological dependence and reduction of meaningful human interaction in education require consideration of appropriate balance between automated and human-mediated instruction (Rizvi et al., 2023).
1.4 Research gap and study justification
Despite growing interest, scientific literature on ILS practical application in primary and secondary educational stages remains fragmented across multiple dimensions. Publication dispersion across diverse journals and disciplinary domains hinders comprehensive understanding of the field's development, dominant research trends, and knowledge accumulation patterns. This represents a critical gap given the increasing policy and financial investments in educational AI technologies.
While individual empirical studies provide valuable insights into specific ILS implementations, the absence of systematic synthesis limits our understanding of overall effectiveness patterns, contextual factors influencing success, and comparative advantages of different technological approaches. Son (2024) and Steenbergen-Hu and Cooper (2014) emphasize the necessity of systematic reviews for knowledge assimilation in fragmented research fields. VanLehn et al. (2023) and Huang et al. (2016) specifically highlight the importance of examining ILS applications in particular learning processes and developmental stages.
Existing systematic reviews in this domain present several limitations that this study addresses. First, most previous reviews focus exclusively on either bibliometric mapping or effectiveness synthesis, but rarely integrate both perspectives. Second, prior reviews often include higher education contexts without specifically examining the distinct developmental and pedagogical considerations of K-12 education. Third, regional and cultural variations in ILS implementation have received insufficient attention in existing synthesis literature. Finally, critical examination of computational thinking development—a key justification for ILS adoption in STEM education—remains underexplored in systematic reviews.
This scarcity of integrated systematic analysis represents both a theoretical gap in understanding ILS as a research field and a pragmatic obstacle for evidence-informed decision-making by educational institutions and governmental agencies responsible for educational technology policies.
1.5 Research objectives and design rationale
This study adopts a systematic review approach following PRISMA 2020 guidelines to synthesize empirical evidence regarding ILS implementation approaches and effectiveness outcomes in basic and secondary education. The systematic review provides depth of engagement with individual studies, critical quality appraisal, and nuanced interpretation of findings—elements essential for evidence-based practice.
The justification for this design rests on the need for comprehensive evidence synthesis. Systematic review enables identification of effectiveness patterns, critical success factors, and implementation considerations—insights that individual studies alone cannot provide. The integration offers a more comprehensive understanding of ILS impact in K-12 contexts.
1.6 Research questions
This study addresses the following research questions:
RQ1: What types of Intelligent Learning Systems are most frequently implemented in basic and secondary education?
RQ2: What is the empirical evidence regarding ILS effectiveness in improving student learning outcomes across different subject domains?
RQ3: What pedagogical, technological, and contextual factors influence ILS implementation effectiveness?
RQ4: What regional variations exist in ILS implementation approaches and effectiveness outcomes?
RQ5: To what extent do STEM-focused ILS implementations explicitly address computational thinking skills development?
RQ6: What are the temporal trends in ILS research publication and thematic evolution in K-12 education (2014–2024)?
1.7 Research hypotheses
Based on preliminary literature analysis and theoretical considerations, we propose the following hypotheses:
H1: ILS research publication has experienced accelerated growth following 2020, reflecting increased attention to educational technology during and after the COVID-19 pandemic.
H2: ILS implementations demonstrate overall positive effects on student learning outcomes across various subject domains in K-12 education.
H3: Effectiveness varies significantly by subject domain, with STEM disciplines showing higher effectiveness rates due to their structured, sequential nature that aligns with ILS algorithmic capabilities.
H4: Significant regional differences exist in ILS implementation approaches, reflecting diverse pedagogical traditions, technological infrastructures, and educational priorities.
H5: Despite widespread implementation of ILS in STEM education, explicit development of computational thinking skills remains an underemphasized dimension in most implementations.
1.8 Scope and delimitations
This study focuses specifically on ILS applications in formal K-12 education settings (ages approximately 5–18), encompassing primary/elementary and secondary education or their international equivalents. We exclude higher education contexts due to distinct developmental characteristics, pedagogical approaches, and institutional structures. The temporal scope spans January 2014 through December 2024, providing a decade-long perspective on recent developments while maintaining relevance to current practice. The analysis includes studies published in English or Spanish, potentially limiting insights from research published in other languages. We focus on peer-reviewed journal articles indexed in Scopus or Web of Science, excluding gray literature, conference proceedings, and book chapters—a delimitation justified by methodological rigor but acknowledged as potentially limiting scope.
2 Methods
2.1 Study design: systematic review approach
This research employs a systematic review design following PRISMA 2020 guidelines. This methodological choice addresses the objective of synthesizing empirical evidence regarding ILS implementation and effectiveness in K-12 education. The systematic review approach enables depth of engagement with individual studies, critical appraisal of methodological quality, and nuanced interpretation of findings.
While this protocol was not prospectively registered in systematic review registries such as PROSPERO, all stages were documented following PRISMA 2020 guidelines to ensure transparency and reproducibility.
2.2 PICOS framework
Following PRISMA 2020 recommendations, we established the PICOS framework to define the research scope precisely:
Population (P): Students in formal primary/elementary education (typically ages 5–11) and secondary/high school education (typically ages 12–18), encompassing the full K-12 spectrum of compulsory education. This includes international equivalents of these educational levels regardless of specific national nomenclature. The focus is on typical development; studies exclusively targeting special education populations were not excluded but were noted for subgroup analysis.
Intervention (I): Implementation of Intelligent Learning Systems as operationally defined: educational platforms, applications, or environments that employ artificial intelligence algorithms (including but not limited to machine learning, natural language processing, knowledge representation, adaptive algorithms, or learning analytics) to: (1) adapt content, difficulty, pacing, or sequencing; (2) provide intelligent feedback or guidance; (3) personalize learning experiences based on individual learner data; (4) make autonomous pedagogical decisions in response to learner behavior. Specific ILS categories include: Intelligent Tutoring Systems (ITS), adaptive learning platforms, learning analytics systems, AI-enhanced educational games, intelligent virtual/augmented reality environments, automated assessment systems, and personalized learning management systems with AI integration.
Comparators (C): Not required for inclusion; studies with or without control/comparison groups were eligible. When present, comparators included: traditional instruction without ILS, conventional educational technology without AI-enhanced adaptation, alternative ILS implementations, or baseline performance before ILS implementation.
Outcomes (O): For empirical studies: Measurable learning outcomes (achievement, knowledge acquisition, skill development, conceptual understanding), cognitive outcomes (problem-solving, critical thinking, computational thinking), affective outcomes (motivation, engagement, self-efficacy, attitudes toward learning), behavioral outcomes (time on task, persistence, help-seeking), metacognitive outcomes (self-regulation, learning strategy use). For all studies: Description and characterization of ILS implementation approaches, pedagogical strategies, and technological features.
Study designs (S): Empirical studies employing quantitative methods (randomized controlled trials, quasi-experimental designs, pre-post comparisons, correlational studies), qualitative methods (case studies, ethnographic studies, design-based research), or mixed methods. Purely conceptual, theoretical, or opinion pieces without empirical component were excluded.
2.3 Information sources and search strategy
2.3.1 Database selection
We selected Scopus and Web of Science Core Collection as primary information sources. This choice was justified by: (1) comprehensive coverage of peer-reviewed academic literature across multiple disciplines, (2) rigorous quality standards for indexed content, (3) availability of metadata necessary for analysis (citations, author affiliations, keywords), and (4) established use in educational technology systematic reviews. We acknowledge that this choice excludes potentially relevant studies in databases such as ERIC, PsycINFO, or IEEE Xplore. However, pilot searches indicated substantial overlap, and resource constraints necessitated a focused approach prioritizing comprehensive analysis of a well-defined corpus over exhaustive coverage.
2.3.2 Search strategy development
Search strategies were developed iteratively through: (1) preliminary scoping searches to identify relevant terms, (2) examination of keywords in highly relevant studies, (3) consultation with information specialists, and (4) pilot testing to refine search sensitivity and specificity. The final search strategy combined three concept groups using Boolean operators: ILS technology terms, educational level terms, and methodological filters.
Detailed Search Strings:
Scopus Search Strategy (executed March 15, 2024):
TITLE-ABS-KEY (("intelligent learning system*" OR "intelligent tutor* system*" OR
"adaptive learning" OR "personalized learning system*" OR "AI in education" OR
"artificial intelligence" AND "education" OR "machine learning" AND "education" OR
"educational data mining" OR "learning analytics" OR "intelligent educational system*" OR
"cognitive tutor*" OR "adaptive instruction*" OR "smart learning environment*")
AND ("primary education" OR "elementary education" OR "secondary education" OR "K-12" OR
"basic education" OR "middle school" OR "high school" OR "primary school" OR
"grade school" OR "junior high"))
AND PUBYEAR > 2013 AND PUBYEAR < 2025
AND (LIMIT-TO (DOCTYPE, "ar"))
AND (LIMIT-TO (LANGUAGE, "English") OR LIMIT-TO (LANGUAGE, "Spanish"))
AND (LIMIT-TO (SRCTYPE, "j"))
Web of Science Search Strategy (executed March 16, 2024):
TS = (("intelligent learning system*" OR "intelligent tutor* system*" OR
"adaptive learning" OR "personalized learning system*" OR "AI in education" OR
("artificial intelligence" AND "education") OR ("machine learning" AND "education") OR
"educational data mining" OR "learning analytics" OR "intelligent educational system*" OR
"cognitive tutor*" OR "adaptive instruction*" OR "smart learning environment*")
AND ("primary education" OR "elementary education" OR "secondary education" OR "K-12" OR
"basic education" OR "middle school" OR "high school" OR "primary school" OR
"grade school" OR "junior high"))
AND PY = (2014-2024) AND DT = (Article) AND LA = (English OR Spanish)
2.3.3 Justification of temporal boundaries
The review covers publications from January 2014 through December 2024, deliberately excluding 2025 publications. This temporal boundary is justified by five methodological considerations:
Temporal completeness: The 2014–2024 decade provides a complete analytical unit encompassing the contemporary era of machine learning-enhanced ILS while avoiding the artificial truncation that would result from including partial-year 2025 data.
Publication maturity: Studies published in 2025 lack adequate time for peer discussion, citation accumulation, and scholarly discourse that contextualize findings within the broader research landscape. This maturation process is essential for systematic reviews synthesizing knowledge.
Data collection timing: Our systematic database searches were executed in March 2024, with supplementary searching extending through late 2024. Capturing 2025 publications would have required additional searches covering only the first months of that year, yielding an incomplete and systematically biased sample that does not represent the full year's scholarly output.
Reproducibility: A clearly defined temporal boundary enhances review reproducibility. Future researchers can replicate our search with precise temporal parameters. Including 2025 would require specification of the exact search date within 2025, complicating replication.
Indexing completeness: Very recent publications may not be fully indexed in bibliographic databases, with metadata still being processed. A one-year buffer ensures complete and accurate indexing of included studies.
2.3.4 Supplementary search strategies
To complement database searching, we conducted: (1) citation chaining (backward citation searching of included studies), (2) forward citation searching of seminal works in the field, and (3) hand-searching of three highly relevant journals (British Journal of Educational Technology, Computers & Education, Journal of Educational Technology & Society) for 2023–2024 to capture very recent publications potentially not yet indexed.
2.4 Eligibility criteria
2.4.1 Inclusion criteria
Studies were included if they met all of the following criteria:
Population: Students in formal basic/primary or secondary/high school education (typically ages 5–18) or international equivalents.
Intervention: Implementation of Intelligent Learning Systems as operationally defined (AI-enhanced platforms with adaptive, personalized, and intelligent decision-making capabilities). Studies must specify the technological tools employed.
Comparison: Not required; studies with or without control groups were eligible.
Outcomes: For empirical studies included in effectiveness synthesis: measurable learning outcomes, engagement, motivation, or other educational indicators. For all studies: description of ILS implementation approaches.
Study Design: Empirical studies employing quantitative, qualitative, or mixed methods. Conceptual or purely theoretical studies excluded.
Publication Type: Peer-reviewed journal articles indexed in Scopus or Web of Science.
Language: English or Spanish.
Time Period: January 2014–December 2024.
2.4.2 Exclusion criteria
Higher Education Focus: Studies conducted exclusively in university or adult education contexts, due to distinct developmental and institutional characteristics.
Insufficient Methodological Description: Studies lacking clear description of methods, sample, or ILS technology.
Non-ILS Technologies: Studies of general educational technology without AI-enhanced adaptive capabilities.
Non-Empirical: Opinion pieces, editorials, purely theoretical articles without empirical component.
Publication Type: Conference proceedings, book chapters, dissertations, gray literature. This exclusion is justified by: (1) focus on peer-reviewed research meeting rigorous quality standards, (2) practical constraints on accessing and evaluating gray literature quality, and (3) need for comprehensive metadata for analysis (more reliably available for journal articles). We acknowledge this excludes potentially valuable insights, particularly regarding practical implementation experiences often documented in conference proceedings. However, the focus on peer-reviewed journals enhances methodological rigor and reproducibility of the review process.
2.5 Study selection process
Study selection followed the four-stage PRISMA protocol:
Stage 1 - Identification: Database searches retrieved 847 records (Scopus: 521; Web of Science: 326). Supplementary searches identified an additional 8 records through citation chaining and journal hand-searching, yielding 855 total records.
Stage 2 - Screening: Following duplicate removal (n = 183 duplicates removed using EndNote 20), 672 unique records underwent title and abstract screening. Two reviewers (EMCS, research assistant) independently screened all records using predefined criteria implemented in Rayyan QCRI systematic review software. Inter-rater reliability was substantial (Cohen's κ = 0.78). Disagreements (n = 43) were resolved through discussion and consultation with senior author (DCBG) when consensus could not be reached. A total of 516 records were excluded at this stage, leaving 156 for full-text review.
Stage 3 - Eligibility: Full texts of 156 articles were retrieved and assessed independently by two reviewers. Reasons for exclusion were systematically documented: Higher education focus (n = 34), Non-empirical/theoretical only (n = 21), Insufficient ILS description (n = 15), and Conference proceedings (n = 14).
Stage 4 - Inclusion: After eligibility assessment, 72 articles met all inclusion criteria and were included in the final analysis. The complete PRISMA flow diagram (Figure 1) details the systematic selection process.
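The substantial inter-rater agreement reported at Stage 2 (Cohen's κ = 0.78) can be computed from paired screening decisions. A minimal sketch, assuming simple include/exclude labels; the helper function and the toy decisions below are illustrative, not the review's actual screening data:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    labels = set(rater_a) | set(rater_b)
    n = len(rater_a)
    # Observed agreement: proportion of items where both raters agree.
    p_o = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    p_e = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions for four records ("inc" = include).
a = ["inc", "inc", "exc", "exc"]
b = ["inc", "exc", "exc", "exc"]
print(cohens_kappa(a, b))  # 0.5
```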
Figure 1

PRISMA 2020 flow diagram illustrating the systematic study selection process for intelligent learning systems in primary and secondary education.
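As a consistency check, the counts in the four selection stages reconcile arithmetically. This sketch merely restates the figures reported above (variable names are ours):

```python
# Stage 1: records identified (Scopus + Web of Science + supplementary searches)
identified = 521 + 326 + 8            # 855
# Stage 2: after duplicate removal, then title/abstract screening
after_dedup = identified - 183        # 672 unique records
full_text = after_dedup - 516         # 156 retained for full-text review
# Stage 3: documented full-text exclusions
excluded = 34 + 21 + 15 + 14          # higher ed, non-empirical, vague ILS, proceedings
# Stage 4: final corpus
included = full_text - excluded
print(identified, after_dedup, full_text, included)  # 855 672 156 72
```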
2.6 Data extraction
Using a standardized data extraction form developed and pilot-tested on five studies, two reviewers independently extracted:
Study context: Country, educational level (primary/secondary/both), subject domain, sample size, participant age/grade.
ILS characteristics: Specific technology name, ILS category (e.g., ITS, adaptive platform), underlying AI techniques (e.g., machine learning, NLP), pedagogical approach (e.g., constructivist, behaviorist), described features (adaptability, feedback type, gamification elements).
Implementation details: Duration of intervention, integration mode (classroom supplement, replacement, homework), teacher involvement level, training provided.
Effectiveness data: Measured outcomes (learning achievement, engagement, motivation, etc.), comparison group presence, effect direction (positive/negative/mixed/null), effect size when reported, statistical significance.
Computational thinking: For STEM implementations, explicit mention of computational thinking objectives, assessed components (abstraction, algorithm design, debugging, etc.), measurement approaches.
Equity considerations: Accessibility features, attention to digital divide, diverse learner populations included.
Data extraction discrepancies were discussed until consensus was reached. When critical information was unclear or missing, we noted this as “not reported” rather than making assumptions.
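As an illustration, one row of the standardized extraction form could be represented as a record like the following. This is a hypothetical sketch: the class and field names paraphrase the categories listed above and are not the actual instrument.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ExtractionRecord:
    # Study context
    country: str
    educational_level: str                 # "primary" | "secondary" | "both"
    subject_domain: str
    # ILS characteristics
    ils_category: str                      # e.g., "ITS", "adaptive platform"
    ai_techniques: List[str] = field(default_factory=list)
    # Implementation and effectiveness; None encodes "not reported"
    sample_size: Optional[int] = None
    duration_weeks: Optional[float] = None
    effect_direction: Optional[str] = None # "positive"/"mixed"/"null"/"negative"
    ct_explicit: bool = False              # computational thinking named as objective

rec = ExtractionRecord(country="Spain", educational_level="primary",
                       subject_domain="mathematics", ils_category="ITS")
print(rec.sample_size)  # None, i.e. "not reported" rather than assumed
```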
2.7 Methodological quality assessment
Methodological quality was assessed using adapted Joanna Briggs Institute (JBI) critical appraisal tools, selected for their applicability to diverse study designs encountered in educational technology research. We used specific checklists for experimental/quasi-experimental studies, qualitative studies, and mixed-methods studies as appropriate. Two reviewers independently appraised all studies, with disagreements resolved through discussion.
Quality assessment criteria included: clear research objectives and appropriate methodology, adequate sample characteristics and sampling strategy, clear ILS description with sufficient implementation detail, valid data collection instruments and procedures, appropriate analytical methods, clear results presentation with acknowledged limitations, and evidence of ethical considerations.
Following JBI methodology, each study was rated on multiple quality criteria, then assigned an overall quality classification:
High Quality (Low Bias Risk): Meets all or nearly all quality criteria; minor weaknesses not compromising validity of conclusions.
Moderate Quality (Moderate Bias Risk): Meets most quality criteria; some methodological limitations that may influence conclusions but do not invalidate them.
Low Quality (High Bias Risk): Fails to meet several quality criteria; significant methodological limitations that substantially compromise confidence in findings.
Studies were not excluded based on quality ratings; however, quality was considered in interpretation and synthesis of findings.
2.8 Data synthesis and analysis
2.8.1 Effectiveness synthesis
Given the heterogeneity of ILS types, educational contexts, and outcome measures, we employed narrative synthesis rather than meta-analysis. For each study reporting effectiveness data, we categorized the overall finding as:
Positive effect: Statistically significant improvement in primary outcome(s) or substantial qualitative evidence of benefits, attributed to ILS.
Mixed effect: Some outcomes positive, others null or negative; or positive for some subgroups but not others.
Null effect: No significant differences or no apparent benefits.
Negative effect: Significant decrease in outcomes or substantial evidence of harmful effects.
Operational Definition of “Effectiveness”: Recognizing that the concept of “effectiveness” can be operationalized differently across studies, we established the following criteria for classification. A study was classified as showing a “positive effect” if: (1) quantitative studies reported statistically significant improvements (p < 0.05) in primary learning outcomes (achievement, skill acquisition, competency development) compared to baseline, control group, or established benchmarks; OR (2) quantitative studies reported substantial effect sizes (d ≥ 0.4 or equivalent) even when not reaching statistical significance due to small samples; OR (3) qualitative studies provided convergent evidence from multiple data sources (observations, interviews, artifacts) of meaningful learning improvements or pedagogical benefits attributed to ILS; OR (4) mixed-methods studies showed both statistical and qualitative evidence of positive outcomes.
For studies reporting multiple outcomes, we based classification on the primary outcome specified by authors, or, when not specified, on learning achievement measures over attitudinal or engagement measures. This decision rule reflects prioritization of learning outcomes while recognizing that motivation and engagement are important mediating factors.
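For quantitative studies, criteria (1) and (2) can be sketched as a decision rule. This is an illustrative simplification: the function name and parameters are ours, and the handling of mixed, null, and negative cases is condensed relative to the full protocol.

```python
def classify_quantitative(p_value=None, effect_size=None, direction="positive"):
    """Classify a quantitative study's primary outcome per criteria (1)-(2).

    direction: overall direction of the observed change ("positive"/"negative").
    """
    # Significant decline takes precedence: a "negative effect".
    if direction == "negative" and p_value is not None and p_value < 0.05:
        return "negative"
    # Criterion (1): statistically significant improvement (p < 0.05).
    if p_value is not None and p_value < 0.05:
        return "positive"
    # Criterion (2): substantial effect size (d >= 0.4) despite non-significance.
    if effect_size is not None and effect_size >= 0.4:
        return "positive"
    return "null"

print(classify_quantitative(p_value=0.21, effect_size=0.45))  # positive
```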
Important Caveats: Several factors require consideration in interpreting effectiveness findings:
Outcome measures varied substantially across studies (standardized tests, researcher-developed assessments, performance tasks, self-report surveys).
Study designs differed in methodological rigor (randomized controlled trials vs. single-group pre-post designs).
Implementation contexts varied (in-class use, homework, after-school programs, duration from single sessions to full academic years).
Publication bias likely favors positive findings.
Sample sizes ranged from small pilots (n < 20) to large implementations (n > 1,000).
For subjects with fewer than 5 studies, effectiveness rates should be interpreted with extreme caution due to small sample size.
We analyzed effectiveness patterns by: subject domain, ILS technology type, educational level, study quality, geographic region, and presence/absence of key features (e.g., teacher training, systematic pedagogical integration).
2.8.2 Thematic analysis
Following Braun and Clarke's reflexive thematic analysis approach, we conducted inductive analysis of study findings to identify patterns regarding: ILS implementation approaches (common and distinctive features of successful implementations, pedagogical strategies employed, role of teacher and technology integration), factors influencing effectiveness (pedagogical, technological, and contextual factors), and regional approaches (distinctive patterns in how different world regions conceptualize and implement ILS).
Analysis proceeded through: (1) familiarization with data through repeated reading, (2) systematic coding of relevant features, (3) collating codes into potential themes, (4) reviewing themes for internal homogeneity and external heterogeneity, (5) defining and naming themes, (6) producing the analytic narrative with illustrative examples.
2.8.3 Computational thinking analysis
For studies conducted in STEM contexts, we systematically examined whether and how computational thinking was addressed. We coded: (1) explicit mention of computational thinking as an objective, (2) specific CT components addressed (algorithmic thinking, abstraction, problem decomposition, pattern recognition, generalization, debugging, iteration), (3) assessment methods used to measure CT development, (4) reported findings regarding CT outcomes.
3 Results
3.1 Study selection and characteristics
The systematic search identified 855 records from database searching and supplementary sources. After duplicate removal and systematic screening, 72 studies met all inclusion criteria and were included in the systematic review. The PRISMA flow diagram (Figure 1) details the selection process.
3.2 Temporal distribution
Analysis of publication trends reveals dramatic growth in ILS research over the examined decade, with a notable inflection point coinciding with the COVID-19 pandemic (Table 1).
Table 1
| Year | Number of articles | Percentage | Cumulative % |
|---|---|---|---|
| 2014 | 6 | 8.3% | 8.3% |
| 2015 | 3 | 4.2% | 12.5% |
| 2016 | 2 | 2.8% | 15.3% |
| 2017 | 0 | 0.0% | 15.3% |
| 2018 | 3 | 4.2% | 19.4% |
| 2019 | 6 | 8.3% | 27.8% |
| Pre-pandemic period (2014–2019): n = 20 (27.8%) | | | |
| 2020 | 11 | 15.3% | 43.1% |
| 2021 | 12 | 16.7% | 59.7% |
| 2022 | 14 | 19.4% | 79.2% |
| 2023 | 12 | 16.7% | 95.8% |
| 2024 | 3 | 4.2% | 100.0% |
| Pandemic/post-pandemic period (2020–2024): n = 52 (72.2%) | | | |
| Total | 72 | 100.0% | |
Temporal distribution of ILS research publications (2014–2024).
Bold values indicate the highest frequency or most notable result within each category.
The 2014–2019 pre-pandemic period accounted for only 27.8% (n = 20) of total publications, with a relatively stable but modest annual average of 3.3 articles per year. In contrast, the 2020–2024 period produced 72.2% (n = 52) of the corpus, representing an annual average of 10.4 articles, a 215% increase. The peak publication year was 2022 (n = 14, 19.4% of corpus), followed by 2021 (n = 12, 16.7%) and 2023 (n = 12, 16.7%).
A chi-square goodness-of-fit test confirmed that publication counts differ significantly between the pre-pandemic (2014–2019) and pandemic/post-pandemic (2020–2024) periods, χ2(1) = 14.22, p < 0.001, supporting Hypothesis H1 regarding accelerated growth following 2020.
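The reported statistic corresponds to a chi-square goodness-of-fit test of the two period counts against equal expected frequencies (36 publications per period); a minimal Python sketch (standard library only, not the authors' analysis code) reproduces it:

```python
import math

# Observed publication counts: pre-pandemic (2014-2019) vs. pandemic/post-pandemic (2020-2024)
observed = [20, 52]

# Expected count under the null hypothesis of equal frequencies in each period
expected = sum(observed) / len(observed)  # 36.0

# Pearson chi-square statistic with 1 degree of freedom
stat = sum((o - expected) ** 2 / expected for o in observed)

# Survival function of the chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x/2))
p = math.erfc(math.sqrt(stat / 2))

print(round(stat, 2), p < 0.001)  # 14.22 True
```

Note that this test treats the two periods symmetrically; it does not adjust for the pre-pandemic window spanning six years versus five for the pandemic/post-pandemic window.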
3.3 Geographic distribution
Analysis of corresponding author affiliations reveals significant geographic concentration in ILS research production, with notable disparities between world regions (Table 2).
Table 2
| Region | Frequency | Leading countries (n) |
|---|---|---|
| Europe | 26 (36.1%) | Spain (8), Netherlands (4), UK (4) |
| Latin America | 15 (20.8%) | Brazil (5), Colombia (4), Mexico (3) |
| North America | 15 (20.8%) | United States (11), Canada (4) |
| Asia | 13 (18.1%) | China (4), Taiwan (3), Singapore (2) |
| Middle East | 2 (2.8%) | Turkey (1), Saudi Arabia (1) |
| Africa | 1 (1.4%) | South Africa (1) |
| Total | 72 (100%) | |
Geographic distribution of ILS research by region and leading countries.
Europe dominates ILS research production with 36.1% (n = 26) of studies, suggesting strong institutional support and research infrastructure for educational technology. Spain (n = 8), the Netherlands (n = 4), and the United Kingdom (n = 4) were the most productive European countries.
Latin America and North America each contributed 20.8% (n = 15) of studies, representing equivalent research productivity. Within Latin America, Brazil (n = 5), Colombia (n = 4), and Mexico (n = 3) were most represented. Within North America, the United States (n = 11) dominated, with limited Canadian representation (n = 4).
Asia contributed 18.1% (n = 13) of studies, with concentration in technologically advanced nations: China (n = 4), Taiwan (n = 3), and Singapore (n = 2). The Middle East (n = 2, 2.8%) and Africa (n = 1, 1.4%) were substantially underrepresented.
3.4 Educational level distribution
Analysis by educational level reveals concentration at the secondary level (Table 3). Secondary education implementations (52.8%, n = 38) substantially outnumber primary education implementations (33.3%, n = 24), with 13.9% (n = 10) of studies spanning both levels. This pattern likely reflects: (1) greater curricular complexity at secondary level creating more opportunities for ILS differentiation, (2) older students' increased digital literacy facilitating independent ILS use, (3) higher-stakes assessment at secondary level driving technology adoption, and (4) research convenience as secondary students can participate more independently in studies.
Table 3
| Educational level | Frequency | Percentage |
|---|---|---|
| Primary/elementary (ages 5–11) | 24 | 33.3% |
| Secondary (middle/high school, ages 12–18) | 38 | 52.8% |
| Both primary and secondary | 10 | 13.9% |
| Total | 72 | 100% |
Educational level distribution of ILS implementations.
3.5 Disciplinary focus
Disciplinary analysis reveals strong STEM concentration (Table 4). Mathematics dominates with 45.8% (n = 33) of implementations, followed by Sciences at 23.6% (n = 17), reflecting both the structured nature of these subjects and their historical prominence in ILS research. Computer Science/Programming accounts for 11.1% (n = 8), Language Arts for 9.7% (n = 7), with only 9.7% (n = 7) addressing other subjects or transversal competencies. This STEM concentration (80.5% of studies) raises questions about ILS applicability and research attention in humanities and social sciences.
Table 4
| Subject domain | Frequency | Percentage |
|---|---|---|
| Mathematics | 33 | 45.8% |
| Science (physics, chemistry, biology) | 17 | 23.6% |
| Computer science/programming | 8 | 11.1% |
| Language arts/reading/writing | 7 | 9.7% |
| Other/transversal competencies | 7 | 9.7% |
| Total | 72 | 100% |
Disciplinary distribution of ILS implementations.
3.6 Types of ILS
Included studies implemented diverse ILS technologies, categorized into primary types (Table 5). Some studies implemented multiple ILS types; for Table 5, each study was counted once under its primary type, so frequencies sum to the 72 included studies.
Table 5
| ILS type | Frequency | Percentage |
|---|---|---|
| Intelligent Tutoring Systems (ITS) | 31 | 43.1% |
| Adaptive learning systems | 15 | 20.8% |
| Learning analytics tools | 11 | 15.3% |
| AI-enhanced educational applications | 6 | 8.3% |
| Automated assessment systems | 5 | 6.9% |
| Personalized learning management systems | 4 | 5.6% |
| Total | 72 | 100% |
Types of intelligent learning systems implemented.
Intelligent Tutoring Systems (ITS) emerged as the dominant category at 43.1% (n = 31), characterized by domain models, student models, tutoring models, and interface components enabling step-by-step guidance. Adaptive Learning Systems represented 20.8% (n = 15), providing personalized content and pacing. Learning Analytics Tools (15.3%, n = 11) focused on data-driven insights. AI-Enhanced Educational Applications comprised 8.3% (n = 6), including various AI-powered learning tools. Automated Assessment Systems accounted for 6.9% (n = 5), and Personalized LMS represented 5.6% (n = 4).
3.7 Overall effectiveness evidence
Of 72 included studies, 65 (90.3%) reported empirical data on learning outcomes, engagement, or other effectiveness indicators. Seven studies (9.7%) were descriptive implementations without outcome evaluation.
Among 65 studies with outcome data, the overall effectiveness pattern strongly supported ILS benefits (Table 6). Positive effects were reported by 58 studies (89.2%), with statistically significant improvements in learning outcomes, substantial gains in engagement or motivation, or convergent qualitative evidence of pedagogical benefits attributable to ILS. Effect sizes, when reported (n = 31 studies), ranged from small (d = 0.21) to very large (d = 1.84), with median d = 0.58, indicating moderate-to-large practical significance.
Table 6
| Effect type | Number of studies | Percentage |
|---|---|---|
| Positive effects | 58 | 89.2% |
| Mixed effects | 5 | 7.7% |
| Null effects | 2 | 3.1% |
| Negative effects | 0 | 0.0% |
| Total | 65 | 100% |
Overall effectiveness distribution (n = 65 studies with outcome data).
Median effect size d = 0.58 (n = 31 studies reporting effect sizes).
Mixed effects were reported by five studies (7.7%), showing inconsistent patterns such as improvements in some outcomes but not others. Null effects were reported by two studies (3.1%), with no significant differences between ILS and comparison conditions. No studies reported net negative effects.
This 89% overall positive effectiveness rate provides robust support for Hypothesis H2 regarding ILS benefits in K-12 education.
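The effect sizes reported above are in Cohen's d units. For readers unfamiliar with the metric, the sketch below computes d for two independent groups using a pooled standard deviation; the means, standard deviations, and group sizes are hypothetical, chosen only to land near the reported median of 0.58, and are not taken from any reviewed study.

```python
import math

def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d for two independent groups using the pooled standard deviation."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

# Hypothetical post-test scores: ILS group vs. comparison group
d = cohens_d(mean_t=78.0, sd_t=10.0, n_t=30, mean_c=72.0, sd_c=10.0, n_c=30)
print(round(d, 2))  # 0.6
```

By the conventional benchmarks (0.2 small, 0.5 medium, 0.8 large), the corpus median of d = 0.58 sits in the moderate range, consistent with the "moderate-to-large" characterization above.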
3.8 Effectiveness by subject domain
Analysis by subject domain reveals differential effectiveness patterns, with strongest evidence in STEM disciplines (Table 7).
Table 7
| Subject domain | Positive | Total | Effectiveness |
|---|---|---|---|
| Mathematics | 23 | 25 | 92% |
| Language learning | 4 | 4 | 100% |
| STEM competencies (integrated) | 4 | 4 | 100% |
| Language/reading comprehension | 8 | 9 | 89% |
| Sciences (physics, chem., bio.) | 7 | 8 | 88% |
| Transversal/cross-cutting skills | 9 | 11 | 82% |
| Artificial intelligence/programming | 3 | 4 | 75% |
| Overall | 58 | 65 | 89% |
ILS effectiveness by subject domain (n = 65 studies with outcome data).
Effectiveness rates for subjects with n < 5 studies should be interpreted with caution.
Mathematics (n = 25 studies) demonstrated 92% effectiveness (23 studies positive, 2 mixed), likely reflecting mathematics' structured, sequential nature aligning well with algorithmic modeling. Language Learning (n = 4) and STEM Competencies (n = 4) showed perfect 100% effectiveness, though small samples require cautious interpretation. Language/Reading Comprehension (n = 9) showed 89% effectiveness (8 positive, 1 mixed). Sciences (n = 8) demonstrated 88% effectiveness (7 positive, 1 null). Transversal Skills (n = 11) showed 82% effectiveness (9 positive, 2 mixed). AI/Programming (n = 4) demonstrated lowest effectiveness at 75% (3 positive, 1 null), possibly reflecting novelty of the content domain.
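As a quick arithmetic check, the percentages in Table 7 can be regenerated from the positive/total counts (values copied from the table itself; a throwaway verification sketch, not analysis code):

```python
# (positive, total) study counts per subject domain, copied from Table 7
domains = {
    "Mathematics": (23, 25),
    "Language learning": (4, 4),
    "STEM competencies (integrated)": (4, 4),
    "Language/reading comprehension": (8, 9),
    "Sciences": (7, 8),
    "Transversal/cross-cutting skills": (9, 11),
    "AI/programming": (3, 4),
}

# Per-domain effectiveness, rounded to whole percent as in the table
rates = {name: round(100 * pos / total) for name, (pos, total) in domains.items()}

# Overall rate: 58 positive of the 65 studies with outcome data
overall = round(100 * sum(p for p, _ in domains.values())
                / sum(t for _, t in domains.values()))

print(rates["Mathematics"], rates["Sciences"], overall)  # 92 88 89
```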
3.9 Effectiveness by educational level
Analysis by educational level shows relatively consistent effectiveness across primary and secondary contexts (Table 8). Secondary education implementations demonstrated 78.9% effectiveness (30 of 38 studies positive), primary education showed 75.0% effectiveness (18 of 24 studies positive), and implementations spanning both levels showed 70.0% effectiveness (7 of 10 studies positive). Note that these rates use all 72 included studies as denominators, including those without outcome data, so they are not directly comparable to the rates in Table 6. The similar effectiveness across levels suggests ILS benefits are not developmentally specific but can be realized across the K-12 spectrum when appropriately designed.
Table 8
| Educational level | Positive | Total | Effectiveness |
|---|---|---|---|
| Secondary (ages 12–18) | 30 | 38 | 78.9% |
| Primary (ages 5–11) | 18 | 24 | 75.0% |
| Both primary and secondary | 7 | 10 | 70.0% |
| Overall | 55 | 72 | 76.4% |
ILS effectiveness by educational level.
3.10 Methodological quality
Quality assessment revealed generally sound methodological rigor (Table 9). High Quality studies (37.5%, n = 27) demonstrated rigorous methodology with clear research designs, adequate samples, valid instruments, appropriate analyses, and well-supported conclusions. Moderate Quality studies (56.9%, n = 41) met most quality criteria but exhibited some methodological limitations such as small convenience samples, limited description of ILS intervention details, brief intervention durations, or absence of control groups. Low Quality studies (5.6%, n = 4) showed significant methodological weaknesses limiting confidence in findings.
Table 9
| Quality level | n | % | Characteristics |
|---|---|---|---|
| High quality (low bias risk) | 27 | 37.5% | Rigorous design, adequate sample, valid instruments, appropriate analysis |
| Moderate quality (moderate risk) | 41 | 56.9% | Meets most criteria, some limitations (e.g., small sample, brief duration) |
| Low quality (high bias risk) | 4 | 5.6% | Significant methodological weaknesses limiting confidence |
| Total | 72 | 100% | |
Methodological quality assessment of included studies (n = 72).
Common methodological limitations across studies included: insufficient description of ILS implementation details (31% of studies), small sample sizes below 50 (43% of quantitative studies), convenience sampling without representativeness discussion (52% of studies), brief intervention periods under 4 weeks (39% of studies), absence of longitudinal follow-up (87% of studies), and limited attention to implementation fidelity (68% of studies).
Sensitivity analysis revealed no significant differences in effectiveness patterns between high-quality and moderate-quality studies (89% vs. 88% positive effects, respectively), suggesting findings are robust despite methodological limitations.
3.11 Computational thinking in STEM implementations
Among 35 STEM-focused studies, systematic analysis of computational thinking (CT) attention revealed substantial gaps. Only 8 studies (23%) explicitly stated CT objectives, and of these, only 3 (9% of STEM studies) actually measured CT outcomes using validated assessments. Many STEM ILS implementations (n = 17) created opportunities for CT development through problem-solving activities, algorithmic procedures, or data analysis, but authors did not frame these as CT explicitly nor assess CT outcomes.
This analysis confirms Hypothesis H5: despite widespread STEM implementation and frequent justifications citing 21st-century skills development, computational thinking remains an underemphasized and underassessed dimension in most ILS research. This represents a significant missed opportunity, as ILS environments are naturally suited to CT development through their inherent computational nature and data-rich contexts.
4 Discussion
4.1 Principal findings
This systematic review provides comprehensive evidence regarding Intelligent Learning Systems implementation and effectiveness in K-12 education. The 89% overall positive effectiveness rate, with moderate-to-large median effect size (d = 0.58), constitutes strong evidence for ILS potential to enhance learning outcomes when appropriately implemented.
4.2 Temporal dynamics and COVID-19 impact
The exponential growth in ILS research following 2020, with 72% of publications concentrated in 2020–2024, provides strong support for Hypothesis H1. This pattern suggests the COVID-19 pandemic served as an inflection point, accelerating both ILS adoption and research attention. The sustained high publication rate through 2023–2024 indicates this represents fundamental field transformation rather than temporary response to crisis circumstances.
4.3 Effectiveness evidence and interpretive considerations
The 89% overall positive effectiveness rate provides robust support for Hypothesis H2. However, several factors require consideration: publication bias likely inflates effectiveness estimates; heterogeneity precludes interpretation as a precise effect estimate; methodological limitations introduce uncertainty; comparison group concerns raise questions about whether benefits derive from specific intelligent features or more general factors; and brief intervention periods (median 6 weeks) limit conclusions regarding long-term effectiveness.
Despite these caveats, the consistency of positive findings across diverse contexts, ILS types, subject domains, and world regions provides compelling evidence for ILS educational value when appropriately implemented.
4.4 Subject domain variations
Differential effectiveness by subject domain, with highest rates in mathematics (92%), partially supports Hypothesis H3. Mathematics and computational sciences possess hierarchical, sequential knowledge structures aligning well with algorithmic modeling. Language arts outcomes involving interpretation, creativity, or argumentation are more difficult to assess automatically or objectively. Mathematics ILS have benefited from decades of research and development, producing sophisticated, validated systems.
4.5 Computational thinking gaps
The finding that only 23% of STEM implementations explicitly addressed computational thinking, and only 9% measured CT outcomes, confirms Hypothesis H5. This represents a significant disconnect between policy rhetoric and research practice, suggesting missed opportunities to leverage ILS for this important 21st-century competency.
4.6 Regional variations and cultural contextualization
Substantial regional variations in implementation approaches, supporting Hypothesis H4, challenge assumptions about universal best practices while affirming that ILS effectiveness transcends cultural contexts. European emphasis on personalization and self-regulation reflects pedagogical traditions emphasizing learner autonomy. Latin American innovation in gamification and VR/AR represents creative adaptation to resource constraints. North American focus on equity and achievement gaps reflects specific historical and policy contexts. Asian integration of sophisticated analytics leverages strong technological infrastructure and computational expertise.
4.7 Critical effectiveness factors
Eight identified effectiveness factors provide an empirically-grounded framework: pedagogical design alignment, feedback quality and immediacy, adaptive personalization, affective and motivational elements, purposeful gamification, systematic curricular integration, teacher preparation and support, and accessibility/equity considerations. The primacy of pedagogical design challenges techno-centric assumptions that AI sophistication alone determines effectiveness.
4.8 Limitations
Focus on Scopus and Web of Science and English/Spanish publications likely excludes relevant research from other languages and databases. Exclusion of books, conference proceedings, and gray literature potentially misses valuable insights, particularly regarding practical implementation experiences. This exclusion was necessary to maintain methodological rigor and focus on peer-reviewed research meeting established quality standards, but we acknowledge it may limit the comprehensiveness of our synthesis. Rapid technological evolution means some reviewed technologies may already be obsolete. Substantial heterogeneity precludes precise effect size estimation through meta-analysis. Publication and reporting biases likely inflate effectiveness estimates. Brief intervention periods limit conclusions regarding long-term effectiveness.
5 Overall findings and synthesis
This systematic review of 72 peer-reviewed studies published between 2014–2024 provides comprehensive evidence regarding Intelligent Learning Systems implementation and effectiveness in K-12 education globally. The synthesis reveals five fundamental dimensions:
5.1 Evidence strength and effectiveness
The overall 89% positive effectiveness rate (58 of 65 studies with outcome data), supported by a moderate-to-large median effect size (d = 0.58), constitutes robust evidence for ILS educational potential. This finding demonstrates consistency across diverse educational contexts, methodological approaches, and world regions, suggesting that ILS benefits transcend specific implementation modalities when appropriately designed and deployed.
Subject-specific effectiveness patterns reveal important nuances: mathematics demonstrates highest effectiveness (92%, n = 25), followed by sciences (88%, n = 8), language/reading comprehension (89%, n = 9), and perfect rates in language learning and integrated STEM competencies (100%, n = 4 each). These variations likely reflect differential alignment between subject epistemologies and ILS algorithmic capabilities, with structured, sequential knowledge domains showing particular affinity for intelligent tutoring approaches.
5.2 Technological landscape and evolution
Intelligent Tutoring Systems emerged as the dominant technology category (46%, n = 33), reflecting field maturation toward empirically validated approaches developed over decades of research. Programming and machine learning tools (19%, n = 14), gamification platforms (11%, n = 8), and virtual/augmented reality systems (11%, n = 8) represent emerging innovation directions, particularly prominent in Latin American implementations.
The temporal analysis reveals dramatic research acceleration following 2020, with 72% of publications concentrated in the pandemic and post-pandemic period (2020–2024) vs. 28% in the pre-pandemic years (2014–2019). This 215% increase in annual publication average suggests COVID-19 served as an inflection point, catalyzing both ILS adoption and research attention while driving field evolution toward more sophisticated, adaptive technologies.
5.3 Global implementation patterns
Geographic analysis reveals significant disparities: Europe dominates research production (36.1%, n = 26), followed by equivalent contributions from Latin America and North America (20.8% each, n = 15), with substantial Asian representation (18.1%, n = 13) but limited Middle Eastern (2.8%, n = 2) and African (1.4%, n = 1) participation.
Regional variations in pedagogical approaches reflect distinct educational traditions and priorities: European emphasis on personalization and self-regulated learning aligns with constructivist pedagogical foundations; Latin American innovation in gamification and affective computing represents creative adaptation to resource constraints; North American focus on equity and achievement gap reduction addresses specific societal concerns; Asian integration of sophisticated analytics leverages technological infrastructure strengths.
5.4 Critical success factors
Eight empirically-grounded effectiveness factors emerged from thematic analysis: (1) pedagogical design alignment with learning objectives, (2) feedback quality and immediacy, (3) adaptive personalization mechanisms, (4) affective and motivational elements, (5) purposeful gamification integration, (6) systematic curricular alignment, (7) comprehensive teacher preparation, and (8) explicit accessibility and equity considerations.
The primacy of pedagogical design challenges techno-centric assumptions, confirming that ILS effectiveness depends fundamentally on coherent integration with sound educational principles rather than technological sophistication alone. Teacher training emerges as a critical mediating factor, with prepared educators better positioned to leverage ILS capabilities for differentiated instruction and formative assessment.
5.5 Methodological quality and research gaps
Quality assessment revealed generally sound rigor: 94.4% of studies achieved high or moderate quality ratings using Joanna Briggs Institute criteria. However, common limitations included insufficient ILS implementation descriptions (31%), small convenience samples (43% with n < 50), brief intervention durations (39% under 4 weeks), and absent longitudinal follow-up (87%).
Several critical research gaps emerged: (1) computational thinking remains underemphasized despite widespread STEM implementation (only 23% explicitly addressed CT, 9% measured outcomes); (2) non-STEM applications remain underexplored, with 80.5% of studies concentrated in mathematics, sciences, and programming; (3) equity implications require deeper investigation beyond surface-level accessibility features; (4) longitudinal effectiveness evidence is virtually absent; (5) standardized effectiveness metrics are needed to enable meaningful cross-study comparison.
5.6 Implications for practice and policy
For educational practitioners, findings suggest ILS have transitioned from experimental technologies to empirically validated tools warranting systematic integration when appropriately implemented. Success requires: comprehensive teacher professional development emphasizing pedagogical integration over technical operation; careful alignment with curricular objectives and learning progressions; explicit attention to equity and accessibility from design inception; and balanced implementation preserving meaningful human interaction alongside intelligent automation.
For policymakers, the evidence supports strategic investment in ILS infrastructure and capacity building, particularly in mathematics and STEM education where effectiveness is most established. However, policies must address: equitable access across socioeconomic contexts; robust data privacy and security frameworks; algorithmic bias monitoring and mitigation; and teacher preparation program reform incorporating educational AI competencies.
For researchers, priority directions include: longitudinal studies examining sustained impact beyond experimental interventions; standardized effectiveness metrics enabling meta-analytic synthesis; qualitative investigation of teacher and student experiences; comparative analyses of different ILS approaches in specific contexts; deeper exploration of computational thinking development; and rigorous examination of non-STEM applications in humanities and social sciences.
5.7 Synthesis conclusion
In synthesis, this systematic review establishes ILS as pedagogically valuable technologies with demonstrated capacity to enhance personalized learning and provide immediate formative feedback in K-12 education. The convergent evidence across diverse contexts, methodologies, and world regions—combined with moderate-to-large effect sizes—supports cautiously optimistic conclusions regarding ILS transformative potential.
However, realizing this potential requires moving beyond technological solutionism toward comprehensive implementations that: prioritize pedagogical design, invest substantially in teacher preparation, address equity systematically, respect ethical boundaries regarding student data, and maintain critical perspective on appropriate roles for automated vs. human-mediated instruction. The field's rapid evolution demands continued empirical scrutiny, methodological rigor, and commitment to evidence-based practice as ILS technologies become increasingly sophisticated and pervasive in educational systems globally.
6 Conclusions
This systematic review provides convergent evidence about the transformative potential of Intelligent Learning Systems in basic and secondary education. The global effectiveness rate of 89% constitutes a robust indicator of ILS potential to improve academic results, with disciplinary variations reflecting differential affinity between specific pedagogical characteristics and technological capabilities.
Identified regional variations reveal distinctive pedagogical approaches reflecting cultural contexts, available resources, and specific educational priorities. Effectiveness determining factors identified underline the multidimensional nature of successful ILS implementation. Pedagogical design emerges as a fundamental factor, confirming that successful integration requires coherence between technological innovation and solid educational principles.
Practical implications are significant for the global educational ecosystem. Results suggest that ILS have transitioned from being experimental technologies to empirically validated tools that can substantially contribute to educational results improvement when appropriately implemented. However, success requires investment in teacher training, careful curricular design, and explicit consideration of equity and accessibility factors.
In synthesis, Intelligent Learning Systems represent a significant evolution in educational technology with demonstrated potential to transform teaching and learning in basic and secondary education. Their successful implementation requires a comprehensive approach that combines technological innovation with solid pedagogical foundations, specialized teacher training, and careful consideration of contextual and ethical factors.
Statements
Data availability statement
The original contributions are included in the article/supplementary material, and further inquiries can be directed to the corresponding authors.
Author contributions
EC: Writing – review & editing, Writing – original draft. DB: Writing – original draft, Writing – review & editing.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This research was funded by the General Directorate of Investigations (Dirección General de Investigaciones) of Universidad Santiago de Cali under Call No. DGI 01-2026. The funder had no involvement in study design, data collection, analysis, interpretation, or the decision to publish.
Acknowledgments
The authors thank Universidad Santiago de Cali for the institutional support provided for conducting this research. We also thank the peer reviewers whose constructive feedback substantially improved the quality and clarity of this manuscript.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Guo, L., Wang, D., Gu, F., Li, Y., Wang, Y., and Zhou, R. (2021). Evolution and trends in intelligent tutoring systems research: a multidisciplinary and scientometric view. Asia Pac. Educ. Rev. 22, 441–461. doi: 10.1007/s12564-021-09697-7
Holmes, W., and Tuomi, I. (2022). State of the art and practice in AI in education. Eur. J. Educ. 57, 542–570. doi: 10.1111/ejed.12533
Huang, X., Craig, S. D., Xie, J., Graesser, A., and Hu, X. (2016). Intelligent tutoring systems work as a math gap reducer in 6th grade after-school program. Learn. Individ. Diff. 47, 258–265. doi: 10.1016/j.lindif.2016.01.012
Lee, S. J., and Kwon, K. (2024). A systematic review of AI education in K-12 classrooms from 2018 to 2023: topics, strategies, and learning outcomes. Comput. Educ.: Artif. Intell. 6:100211. doi: 10.1016/j.caeai.2024.100211
Martin, F., Zhuang, M., and Schaefer, D. (2024). Systematic review of research on artificial intelligence in K-12 education (2017–2022). Comput. Educ.: Artif. Intell. 6:100195. doi: 10.1016/j.caeai.2023.100195
Rizvi, S., Waite, J., and Sentance, S. (2023). Artificial intelligence teaching and learning in K-12 from 2019 to 2022: a systematic literature review. Comput. Educ.: Artif. Intell. 4:100145. doi: 10.1016/j.caeai.2023.100145
Son, T. (2024). Intelligent tutoring systems in mathematics education: a systematic literature review using the substitution, augmentation, modification, redefinition model. Computers 13:270. doi: 10.3390/computers13100270
Steenbergen-Hu, S., and Cooper, H. (2013). A meta-analysis of the effectiveness of intelligent tutoring systems on college students' academic learning. J. Educ. Psychol. 105, 970–987. doi: 10.1037/a0032447
UNESCO (2023). Resumen del informe de seguimiento de la educación en el mundo 2023: Tecnología en la educación: ¿una herramienta en los términos de quién? Paris: UNESCO.
VanLehn, K., Milner, F., Banerjee, C., and Wetzel, J. (2023). A step-based tutoring system to teach underachieving students how to construct algebraic models. Int. J. Artif. Intell. Educ. 33, 473–512. doi: 10.1007/s40593-023-00328-3
Keywords
adaptive learning, artificial intelligence in education, educational technology, intelligent learning systems, primary education, PRISMA, secondary education, systematic review
Citation
Cerón Salazar EM and Burbano González DC (2026) Intelligent learning systems in primary and secondary education: a systematic review (2014–2024). Front. Educ. 11:1720377. doi: 10.3389/feduc.2026.1720377
Received
08 October 2025
Revised
06 January 2026
Accepted
21 January 2026
Published
26 February 2026
Volume
11 - 2026
Edited by
Sergio Ruiz-Viruel, University of Malaga, Spain
Reviewed by
Silvio Marcello Pagliara, University of Cagliari, Italy
Fivia Eliza, Padang State University, Indonesia
Copyright
© 2026 Cerón Salazar and Burbano González.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Diana Carolina Burbano González, diana.burbano02@usc.edu.co; Edison Marino Cerón Salazar, edison.ceron00@usc.edu.co