- 1Facultad de Educación, Ciencia y Tecnología, Universidad Técnica del Norte, Ibarra, Ecuador
- 2Universidad Técnica del Norte, Ibarra, Ecuador
Digital internationalization has moved beyond “mobility-at-a-distance” toward course-embedded designs that can scale intercultural competence (IC) development. Using PRISMA, we mapped digital technologies and pedagogical strategies used to teach IC in higher education, outcome measures, direction of effects, and equity-related implementation conditions, and we added mechanism-oriented coding linking modality affordances to learning processes and IC outcomes. We searched Scopus, Web of Science, and ERIC and screened 133 records. Twenty-four studies (total n = 1,787) met eligibility criteria. We combined descriptive bibliometrics with an adapted JBI quality appraisal (median quality 78.6%, IQR 57.1–85.7), a SWiM synthesis of effect direction by strategy, and intervention-level mechanism coding. Evidence clusters around LMS/forums, videoconferencing and collaboration suites, immersive VR/360°, telepresence/telesimulation, MOOCs, and guided inquiry. Across modalities, the most consistent IC gains occur when interventions instantiate structured intercultural contact, facilitation and feedback, authentic co-production, and guided reflection/debriefing. Scenario-based and immersive formats are most effective when situated performance is paired with structured debriefing and feedback. By contrast, content-centric LMS/MOOC formats show mixed results unless instructors engineer dialogic tasks and active facilitation to convert exposure into interaction. Most studies rely on standardized self-report measures; performance or observational evidence is less common and concentrates in simulation and role-play contexts. Reported constraints include connectivity, language and time-zone barriers, and facilitation workload, underscoring the role of institutional support. 
Future research should strengthen comparative and longitudinal designs, report implementation fidelity and dosage, validate instruments across languages, and triangulate self-report with performance evidence and downstream academic/employability outcomes.
1 Introduction
Universities are redesigning internationalization through digital means (O’Dowd, 2021a; Borger, 2022; OECD, 2023a; UNESCO, 2024). Beyond the traditional reliance on physical mobility, technology-mediated formats—Collaborative Online International Learning/Virtual Exchange (COIL/VE), telepresence, and immersive media such as VR/360°—now promise scalable, potentially more equitable pathways for developing intercultural competence (IC). Yet adoption has outpaced synthesis: effectiveness varies by design, and outcomes are conditioned by institutional capacity, policy scaffolds, and access. These dynamics unfold against global agendas that foreground inclusive, high-quality education and “internationalization at home” under Sustainable Development Goal 4: “Quality Education” (SDG-4) (Gottschalk and Weise, 2023; OECD, 2023a; UNESCO, 2024).
Conceptually, IC/ICC is understood here as the ability to achieve one’s communicative goals appropriately and effectively across cultural identities—a multidimensional construct spanning attitudes/affect, knowledge/cognition, and skills/behaviors. Classic models remain foundational: Deardorff’s process/pyramid model of IC, and Byram’s intercultural communicative competence, which anchors “appropriate and effective” communication in mutually reinforcing attitudes, knowledge, and interpreting/relating skills. This multidimensional framing underlies contemporary operationalizations in higher education and language-in-education research (Deardorff, 2006; Gutiérrez-Santiuste and Ritacco-Real, 2023; Ricardo Barreto et al., 2022).
Empirically, the research trajectory has shifted from generic “infrastructure” terms to pedagogy-specific, collaboration-centered designs. A 2025 systematic review of online intercultural programs (OIPs) in higher education documents the post-2020 consolidation of COIL/VE, virtual mobility, and mixed modalities, alongside a diversification of instruments and tasks (Zuo, 2025). Parallel syntheses in educational technology map comparable growth in digitally enabled IC initiatives since the pandemic, with uneven implementation quality and measurement. Causal evidence is emerging: a quasi-experimental study shows COIL can significantly improve IC relative to controls, and definitional work has clarified virtual exchange as a family of structured, cross-institutional pedagogies (Zuo, 2025).
Immersive and simulation-based approaches add a complementary mechanism. The Cognitive–Affective Model of Immersive Learning (CAMIL) posits presence and agency as the key psychological affordances through which VR elicits cognitive–affective processes (interest, self-efficacy, embodiment) that can translate into learning; experimental and review evidence suggests VR can enhance empathy and communicative readiness, though effects depend on task design and debriefing. In higher education, single-exposure VR experiences tend to yield modest or domain-specific gains, whereas sequenced VR with structured reflection performs better—consistent with a broader pattern whereby designed interaction and guided reflection, rather than exposure alone, drive IC change (Lin et al., 2024; Makransky and Petersen, 2021; Wang J. et al., 2023).
At the same time, digital expansion has surfaced equity constraints that shape external validity: connectivity, device access, teacher facilitation capacity, language and time-zone coordination, and recognition/credit mechanisms. OECD and UNESCO analyses document persistent digital equity gaps and the need for institutional/sectoral investment in platforms, data governance, and professional development—guardrails that help explain why IC outcomes cluster where digital ecosystems and policy supports are strongest. These contextual factors inform both implementation and interpretation of effects in the present review (OECD, 2023b).
Within this landscape, recent studies in online collaborative contexts report gains in intercultural awareness, linguistic development, and interactional pragmatics through authentic tasks and netiquette practice; year-long mobility research traces movement toward ethnorelativism and links shifts in attitudes/knowledge/skills to more appropriate interactional performance, with host-language competence a critical mediator (Çiftçi et al., 2022). Program-level initiatives show that large-scale virtual mobility and MOOCs score highest on collaborative learning, autonomy, open-mindedness, and intercultural skills when facilitation is explicit (Poce, 2020). Diagnostic work with novice cohorts similarly indicates high openness but weaker communicative skills, motivating designs that convert attitudes and knowledge into goal-achieving behavior (Guo-Brennan, 2022; Oberste-Berghaus, 2024; Yang et al., 2025).
Despite this rapid growth, a clear synthesis gap remains. Existing reviews typically examine a single modality (e.g., COIL/virtual exchange or immersive VR) or provide descriptive catalogs of tools and activities, but they rarely connect technology families, pedagogical strategy families, and the IC/ICC domains and assessment choices those designs target. As a result, practitioners and institutions still lack actionable evidence to decide which digital designs work best for which IC outcomes and under which implementation and equity conditions those effects are most likely to materialize.
The present PRISMA-guided systematic review asks what digital technologies are used to teach IC in higher education, which pedagogical approaches/modalities accompany them, how IC is measured, what evidence of effectiveness exists across culturally diverse contexts, and under what equity and implementation conditions effects are realized. By integrating a structured effectiveness synthesis with a descriptive bibliometric lens, the review situates “what works” within the intellectual and policy geographies that make it work—linking designs (interaction + facilitation + debriefing), measures, and contexts to observed outcomes (sections 3–4) (Garrison et al., 1999).
1.1 Theoretical framework
Intercultural competence (IC) and intercultural communicative competence (ICC) are closely related but analytically distinct. IC typically denotes the broader capability to navigate cultural difference across settings (Figure 1), whereas ICC specifies that capability as it is enacted in communication. A widely adopted policy definition frames ICC as the ability to communicate effectively and appropriately in intercultural situations on the basis of knowledge, skills, and attitudes—a triad that anchors curricular and assessment choices in higher education (Barrett, 2012; Huber, 2014). A complementary communication-theoretic view further clarifies the two evaluation criteria—appropriateness (contextual fit, other-orientation) and effectiveness (goal attainment)—that underwrite judgments of competent intercultural action; this dual criterion has long guided empirical studies of ICC (Kabir and Sponseller, 2020). Contemporary frameworks also caution that competence is historically situated and power-laden: operationalizations should be ethically reflexive and context-sensitive, especially in designs that bridge the Global North and South or mobilize lingua-franca practices (Zhao et al., 2024).
Figure 1. Digital pathways to intercultural competence, integrating technological formats, theoretical foundations, and implementation conditions.
Against this backdrop, interculturality names the ongoing, co-constructed negotiation of meanings, identities, and norms across difference; it is not confined to cross-national pairings but includes any interaction in which cultural systems are drawn upon to accomplish social aims (Koester and Olebe, 1988). Within higher education, critical traditions differentiate functional interculturality (managing diversity), relational interculturality (dialogic engagement), and critical/decolonial interculturality (attention to power and epistemic justice)—distinctions that matter for the design and evaluation of digitally mediated programs (Davaei and Gunkel, 2024). These perspectives underpin policy architectures such as the Council of Europe’s Reference Framework of Competences for Democratic Culture (RFCDC) and UNESCO’s Intercultural Competences framework, both of which consolidate attitudes, knowledge, and skills into actionable curricular descriptors and tools (Barrett, 2020).
Converging evidence treats ICC as a multidimensional construct expressed through attitudes (openness, curiosity, tolerance of ambiguity), knowledge (self/other; culture-general and culture-specific; communicative awareness), and skills/behaviors (listening and observing; interpreting/relating; discovery/interaction; mediation). This attitudes–knowledge–skills (A–K–S) architecture helps align learning outcomes with evaluation strategies across models—compositional, co-orientational, developmental (e.g., DMIS), adaptational, and causal—used in higher education programs. For applied contexts such as global virtual teamwork and leadership development, operational taxonomies that distinguish perception management, relationship management, and self-management offer practical scaffolds for rubrics and task design (e.g., perspective-taking, interaction management, self-regulation) (Davaei and Gunkel, 2024). At the interface with interlanguage pragmatics, ICC also requires attention to pragmalinguistic and sociopragmatic resources so that learners can enact appropriate action within specific communicative ecologies, including digitally mediated ones (Sykes, 2017).
1.1.1 Competing IC/ICC models and implications for measurement in digitally mediated learning
Although many definitions of intercultural competence converge around an attitudes–knowledge–skills (A–K–S) architecture, the field remains theoretically plural, and the differences among leading models matter for how systematic reviews interpret effects and measurement quality. In language-education traditions, Byram’s Intercultural Communicative Competence (ICC) frames competence as the capacity to communicate and mediate meaning across cultures through a set of “savoirs”: attitudes (savoir être), knowledge (savoirs), skills of interpreting/relating (savoir comprendre), skills of discovery/interaction (savoir apprendre/faire), and critical cultural awareness (savoir s’engager) (Byram, 1997). This ICC framing places language-pragmatic mediation and criticality at the center, implying that interventions should be evaluated not only by attitudinal shifts but also by observable communicative and interpretive performance.
By contrast, Deardorff conceptualizes IC as a developmental process in which foundational attitudes (respect, openness, curiosity) enable the acquisition of knowledge and skills that lead to internal outcomes (e.g., adaptability, an ethnorelative perspective) and, ultimately, to external outcomes (effective and appropriate behavior and communication in intercultural situations) (Deardorff, 2006). This process orientation implies that short interventions may produce proximal attitudinal or awareness gains without necessarily demonstrating durable behavioral competence, and that assessments should ideally sample both internal and external outcomes.
Fantini’s model similarly emphasizes a multidimensional, developmental construct but makes “awareness” explicit as a central integrative element alongside attitudes, knowledge, and skills, and it commonly treats competence as evidenced through performance in intercultural interaction, often in relation to linguistic/discourse resources in authentic encounters (Fantini and Tirmizi, 2006). In communication research, Arasaratnam and colleagues propose an integrated view in which perceived ICC is shaped by components such as empathy, intercultural experience, motivation, listening/interaction involvement, and attitudes toward other cultures, derived from multicultural perspectives (Arasaratnam and Doerfel, 2005), and operationalized through a psychometric instrument intended for intercultural communication contexts (Arasaratnam, 2009). Compared with Byram and Deardorff, this tradition often foregrounds interpersonal communication competence variables and highlights the risk of treating IC as a stable trait rather than a context-sensitive capability.
These model differences map onto well-known tensions in the broader intercultural competence literature: (i) whether IC/ICC is best conceptualized as traits, attitudes/worldviews, capabilities, or a combined taxonomy; (ii) whether competence should be inferred from self-perceptions or demonstrated in performance; and (iii) whether instruments capture growth over time or stable individual differences (Spitzberg and Changnon, 2009; Leung et al., 2014). Consequently, IC measurement remains contested. Self-report scales dominate because they are efficient and scalable, but they are vulnerable to social desirability, reference-group effects, and construct underrepresentation when instruments prioritize attitudes over behavior or criticality (Deardorff, 2006; Spitzberg and Changnon, 2009). Performance-based assessments (e.g., observation, scenario-based tasks, rubric-scored products) can better align with competence-as-performance, yet they require resources, clear scoring criteria, and attention to reliability and cross-context comparability.
Digitally mediated internationalization adds an additional interpretive layer. Virtual exchange/telecollaboration engages learners in sustained online intercultural interaction and collaboration under educator guidance (O’Dowd, 2018), but digital mediation can both enable and constrain competence development: online environments may reduce nonverbal cues and increase pragmatic ambiguity, while also supporting reflection, revision, and documentation of interaction processes. Moreover, digital IC development is shaped by boundary conditions such as linguistic asymmetries, power differentials, time-zone/calendar misalignment, unequal access to infrastructure, and facilitation load, which can influence psychological safety and participation equity (Lewis and O’Dowd, 2016; O’Dowd, 2018). For these reasons, we adopt an integrative interpretation in this review: we use the above models to (a) clarify which IC/ICC dimensions are targeted, (b) evaluate the alignment between pedagogical design and the type of evidence used to claim outcomes, and (c) interpret heterogeneity in reported effects as partly driven by digitally mediated boundary conditions.
1.1.2 Information and communication technology mediated pedagogies and strategies
Information and communication technology (ICT)-mediated approaches show consistent benefits for IC/ICC development when designs make interaction and reflection consequential for learners. Syntheses of virtual exchange/COIL chart the field’s migration from infrastructure to collaboration-centered pedagogy and document positive effects on intercultural learning when tasks are authentic and facilitation is explicit (Guillén-Yparrea and Ramírez-Montoya, 2023; Hackett et al., 2023; O’Dowd, 2021b). Short-format, course-embedded exchanges (3–8 weeks) typically combine multimodal artifacts with guided discussions/interviews, naming ICC among core outcomes (van der Zee et al., 2013). Beyond CMC, immersive media contribute through presence-mediated mechanisms: the Cognitive–Affective Model of Immersive Learning (CAMIL) predicts gains where immersion triggers perspective-taking and emotion regulation, a mechanism supported by systematic reviews and meta-analyses of VR in education showing advantages over desktop or text-based conditions on several outcomes (Davaei and Gunkel, 2024; Koester and Olebe, 1988; Makransky and Petersen, 2021). In parallel, Community of Inquiry (CoI)–framed designs leverage the interplay of social, cognitive, and teaching presence to sustain purposeful discourse online; meta-analytic evidence links CoI presences to learning-related outcomes, and CoI increasingly structures VE facilitation and analytics (Brancu et al., 2022; Martin et al., 2022).
1.1.3 Assessment portfolios and instruments
Because instrument choice can moderate observed effects, defensible claims require multimethod evidence that matches the construct’s multidimensionality—triangulating standardized self-report with performance-based indicators (e.g., pragmatic appropriateness tasks, coded interaction transcripts, scenario-based judgments). Widely used tools include the IDI (developmental orientation), ISS/ISI (affective/sensitivity), CQS/CQ (metacognitive, cognitive, motivational, behavioral facets), MPQ (trait dispositions), and BASIC-style performance rubrics. Recent measurement work adds validated short forms and self-efficacy scales suitable for instructional settings (e.g., an eight-item ICC scale tested in Portugal; a teacher Self-Efficacy in Intercultural Communication short form with evidence of metric invariance) (Kabir and Sponseller, 2020). At the same time, cross-cultural invariance is not guaranteed—analyses of popular scales (e.g., CQS) show subgroup differences—so researchers are advised to test invariance and report reliability in multilingual deployments (Griffith et al., 2016).
1.1.4 Pedagogies by design (targets → mechanisms → measures)
Programs that successfully cultivate ICC make their targets explicit (e.g., openness and empathy; culture-general and culture-specific knowledge; interaction management, mediation, perspective-taking), align mechanisms that realize those targets (e.g., short-format VE/COIL sequences that move from information exchange to comparative analysis to collaborative production; global virtual team tasks with netiquette/pragmatics protocols and clear lingua-franca policies; VR/360° experiences followed by structured debrief; LMS case work that scaffolds argumentation and reflective writing), and select measures that evidence progress without overreliance on any single method (e.g., ISS/ISI for affect; CQS for capabilities; BASIC for observable behavior; IDI where worldview movement is germane; plus performance artifacts, scenario judgments, and coded transcripts). This target–mechanism–measure alignment, now common in VE/COIL handbooks and CoI-informed implementations, supports coherent evaluation across modalities and durations while accommodating contingencies such as bandwidth, linguistic resources, or North–South asymmetries (Brancu et al., 2022; van der Zee et al., 2013).
These strands position interculturality as the dynamic context of meaning making across difference and IC/ICC as the capacity to act within it in ways that are appropriate and effective. The A–K–S architecture and associated models clarify what must be developed; technology-mediated pedagogies translate those targets into feasible learning ecologies in higher education; and assessment portfolios—combining validated scales with performance evidence and attending to invariance—provide defensible claims about learner development. This integrated view motivates the empirical focus of the present review on which technologies are used, how they are implemented pedagogically, how outcomes are measured, and what effectiveness is observed across contexts—a line of inquiry that is consistent with, and extends, recent syntheses of online intercultural programs and virtual exchange (Davaei and Gunkel, 2024).
1.2 Purpose of the review study
Two converging developments make the evidence base difficult to use for design and policy decisions. First, digital internationalization has expanded beyond “mobility-at-a-distance” into course-embedded and program-level formats that aim to scale intercultural competence (IC) development. Second, the post-2020 growth of online intercultural programs has increased the number of empirical studies, but it has also increased heterogeneity in designs, measures, and reporting quality, which complicates defensible claims about effectiveness and equity (Zuo, 2025; Guo-Brennan, 2022).
This creates three specific gaps in the literature. (1) Linkage gap: studies frequently list technologies (e.g., videoconferencing, LMS, collaboration suites, immersive media) without specifying how particular technology families are coupled with particular pedagogical logics (e.g., guided dialogue, project-based collaboration, scenario-based simulation) to produce change in distinct IC/ICC domains (cognitive, affective, behavioral, communicative). (2) Evidence-use gap: the field continues to rely heavily on self-report instruments, often with incomplete reporting on reliability, translation procedures, or measurement invariance, while comparative and longitudinal designs remain relatively scarce—making it difficult to judge which approaches produce robust, transferable gains (Griffith et al., 2016; Kabir and Sponseller, 2020; Mitchell and Suransky, 2024; Zuo, 2025). (3) External-validity gap: equity-sensitive implementation conditions—connectivity and device access, language and time-zone coordination, facilitation workload, and institutional QA/IT support—are repeatedly mentioned as barriers, but they are seldom synthesized alongside effectiveness in a way that clarifies when and where effects are feasible (OECD, 2023b; Ricardo Barreto et al., 2022).
To address these gaps, we conducted a PRISMA-guided systematic review that makes the contribution explicit and usable. We (a) map technology families onto pedagogical strategy families and the IC/ICC domains they target; (b) synthesize effect direction transparently using SWiM categories (↑/ns/mixed/↓) by strategy to support design-relevant interpretation under heterogeneity; (c) integrate a quality- and equity-sensitive reading by appraising studies using an adapted JBI tool and extracting enabling conditions and constraints; and (d) add a descriptive bibliometric layer to situate where the evidence is produced and how research concentration aligns with infrastructure and policy context.
1.2.1 Objective and research questions
We aimed to synthesize which digital technologies and pedagogical strategies are used to develop intercultural competence (IC) in higher education, and to summarize the direction of reported effects using a SWiM (Synthesis Without Meta-analysis) approach. We also examined equity-sensitive constraints and enablers across contexts.
Research questions:
RQ1: Which digital technologies are used to teach IC in higher education?
RQ2: Which pedagogical/implementation strategies are employed, and how are they configured?
RQ3: Which IC outcomes and instruments are used, and with what reliability?
RQ4: What evidence of effectiveness is there (direction of effect by strategy)?
RQ5: What equity, access, and contextual factors constrain or enable implementation?
Scope. The review targets higher education learners engaged in digitally mediated IC interventions, with or without formal comparators. Outcomes span cognitive, affective, behavioral, and communicative IC domains, measured by validated instruments where available; blended implementations are included when digital mediation is central.
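The SWiM effect-direction synthesis referenced in the objectives (RQ4) reduces, in essence, to a transparent vote-count of effect directions per strategy family. A minimal sketch of that tally logic follows; the strategy labels and records below are purely illustrative, not data extracted in this review.

```python
from collections import Counter

# Illustrative (strategy family, SWiM effect direction) pairs; NOT review data.
# Directions mirror the SWiM categories used in the synthesis:
# "up" (improvement), "ns" (non-significant), "mixed", "down" (decline).
records = [
    ("guided_reflection", "up"),
    ("guided_reflection", "up"),
    ("guided_reflection", "mixed"),
    ("lms_mooc_content", "ns"),
    ("lms_mooc_content", "up"),
    ("immersive_vr_with_debrief", "up"),
]

def effect_direction_by_strategy(records):
    """Vote-count tally of reported effect directions per strategy family."""
    tallies = {}
    for strategy, direction in records:
        tallies.setdefault(strategy, Counter())[direction] += 1
    return tallies

tallies = effect_direction_by_strategy(records)
```

Vote counting of this kind summarizes direction only, not magnitude, which is why the review pairs it with quality appraisal and mechanism coding rather than treating the tallies as pooled effects.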
2 Methods
2.1 Study design and reporting
We conducted and reported this systematic review in accordance with PRISMA 2020. We prepared the protocol a priori and prespecified the review questions (RQ1–RQ5), eligibility criteria, information sources, screening and extraction procedures, risk-of-bias appraisal (JBI), bibliometric analyses, and the SWiM plan for effectiveness.
2.2 Eligibility criteria
• Population: Higher education learners (undergraduate, graduate, or professional programs).
• Intervention: Digitally mediated, course-embedded pedagogical interventions in which a digital modality is essential to delivery and/or intercultural learning activities (i.e., removing the digital component would fundamentally change the intervention). Eligible designs included (a) structured virtual exchange/COIL/telecollaboration, (b) scenario-based or simulation-oriented implementations delivered via digital media (e.g., telepresence/telesimulation, immersive VR/360°), and (c) LMS/MOOC or other online course formats only when they incorporated structured intercultural dialogue, collaborative co-production, and/or guided reflection as part of the learning design. We excluded studies where technology was peripheral (e.g., an LMS used only to post readings) or where “digital internationalization” was addressed only at the policy level without an evaluated teaching–learning intervention.
• Outcomes: Intercultural competence (IC) and adjacent constructs (e.g., intercultural communicative competence, intercultural sensitivity, cultural intelligence, global competence), provided that the construct was operationalized with (i) a published standardized instrument, or (ii) a performance/observational rubric aligned to IC/ICC dimensions, or (iii) a study-specific instrument with explicit validity evidence (e.g., citation to prior validation and/or reported psychometrics such as internal consistency) and a transparent scoring procedure. Qualitative-only studies were eligible when they applied a clearly described analytic protocol (e.g., coding of journals, interviews, or interaction transcripts) and explicitly mapped codes to IC/ICC dimensions.
• Study designs: Empirical studies—quantitative (experimental, quasi-experimental, pre–post), qualitative, or mixed-methods designs.
• Language/period: Studies published between 2015 and 2025; English, Spanish, and Portuguese full texts were included when extractable.
• Report status: Peer-reviewed journal articles only.
Exclusion criteria were: wrong population (no higher education context); wrong intervention (not digitally mediated); wrong outcome (IC or related constructs not measured or not reported); wrong study design (not primary research); wrong period (outside 2015–2025); wrong language (full text not in English, Spanish, or Portuguese); and wrong report type (not a journal article).
2.2.1 Operational definitions and decision rules
We treated an approach as “digital” when digital technology constituted a primary mediation layer for the learning activity (interaction, task execution, or scenario enactment), rather than merely supporting administration or passive content distribution. We treated IC/ICC outcomes as sufficiently validated for inclusion when authors used (a) published instruments with prior validation evidence, or (b) study-specific measures that reported at least minimal validity/reliability evidence and provided a clear mapping between items/codes and IC/ICC dimensions. During extraction, we recorded instrument name, any reported validation evidence, reliability indicators (when available), measurement timing, and the IC domain(s) targeted to support transparent cross-study interpretation.
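The extraction fields listed above can be represented as a simple structured record, which is how cross-study tables of this kind are typically assembled. The sketch below is hypothetical: the field names illustrate the items named in the text and are not the review's actual codebook.

```python
from dataclasses import dataclass, field
from typing import Optional, List

@dataclass
class MeasureRecord:
    """Hypothetical per-measure extraction record (illustrative field names)."""
    instrument: str               # e.g., "ISS" (Intercultural Sensitivity Scale)
    validation_evidence: str      # citation to prior validation, or psychometric summary
    reliability: Optional[float]  # e.g., Cronbach's alpha, when reported; None otherwise
    timing: str                   # measurement timing, e.g., "pre-post" or "post-only"
    ic_domains: List[str] = field(default_factory=list)
    # ic_domains draws on the four analytic domains tracked in this review:
    # cognitive, affective, behavioral, communicative

# Example record (values are illustrative, not extracted data).
rec = MeasureRecord(
    instrument="ISS",
    validation_evidence="Chen and Starosta (2000)",
    reliability=0.86,
    timing="pre-post",
    ic_domains=["affective"],
)
```

Recording reliability as optional makes missing psychometric reporting explicit in the table rather than silently conflating "not reported" with "not reliable".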
To avoid construct drift and to enable mapping across instruments, we adopt the following working distinctions.
• IC (intercultural competence): a multi domain capability enabling appropriate/effective interaction across difference; we track four analytic domains frequently operationalized in higher education studies: cognitive (knowledge/understanding), affective (dispositions/attitudes, e.g., openness, empathy), behavioral (observable strategies/adaptation), and communicative/ICC (pragmatic–linguistic performance in interaction).
• ICC (intercultural communicative competence): the communication-focused subset linking language use to intercultural effectiveness (pragmatics, mediation, interaction management).
• Cultural sensitivity/cultural awareness: constructs commonly used as affective–cognitive precursors (e.g., attitudes toward diversity, awareness of one’s own and others’ cultural frames) and often measured via self-report scales in HE contexts (Ricardo Barreto et al., 2022).
2.3 Bibliometric mapping (co-occurrence, thematic structure, and co-citation)
To contextualize the inclusion set beyond basic productivity counts, we conducted science-mapping analyses focused on (i) thematic structure and (ii) the intellectual base. Specifically, we generated: (a) an author-keyword co-occurrence network and an overlay visualization by average publication year (Figures 2, 3), (b) a thematic performance map (centrality × impact) and a correspondence analysis of descriptor/outcome terms (Figures 4, 5), (c) an alluvial visualization linking author countries, descriptors, and abstract lexemes (Figure 6), and (d) a co-citation network of the intellectual base from cited references (Figure 7). Co-occurrence and co-citation networks were produced in VOSviewer, while the thematic map, correspondence analysis, and alluvial visualization were produced using bibliometrix (R).
Figure 2. Keyword co-occurrence network (VOSviewer). Nodes represent author keywords (size = frequency); edges indicate co-occurrence (thickness = link strength); colors mark modularity-based clusters. Intercultural competence is located at the boundary between the digitalization/infrastructure cluster and the telecollaboration/pedagogy cluster, highlighting its bridging role across themes.
Figure 3. Overlay visualization by average publication year. The same co-occurrence network is colored by average year of appearance (blue = earlier, yellow = more recent), documenting a temporal shift from generic ICT/online instruction toward collaboration-centered pedagogies (COIL/virtual exchange, online internships, global virtual teams). The position of intercultural competence at the cluster interface indicates its bridging role across the evolving topical structure.
Figure 4. Thematic performance map (centrality × impact). Bubbles plot clusters by Callon centrality (x-axis; structural importance) and normalized local citations (y-axis; influence within the set); bubble size reflects cluster volume.
Figure 5. Correspondence analysis of descriptors and outcome terms. A two-dimension solution (percentage of inertia shown on axes) positions process-oriented digital contexts (e.g., online/course/environment) against outcome-focused constructs (e.g., competence/achievement) and distinguishes communicative/adaptation terms from delivery/environmental factors.
Figure 6. Alluvial diagram linking author countries, descriptors, and abstract terms. Flows connect contributing countries (left) to descriptor terms (center) and to salient abstract lexemes (right).
Figure 7. Co-citation network of the intellectual base. Nodes represent cited references (size = citation frequency within the set); edges represent co-citation strength; colors denote clusters.
We applied minimum-occurrence thresholds and standard association-strength normalization to balance interpretability with coverage; complete parameter settings (thresholds, counting options, and node lists) are provided in the Supplement to support reproducibility. Because the corpus is intentionally restricted to eligible, HE interventions (n = 24), these bibliometric outputs are interpreted as contextual mapping (i.e., structure and backbone within the inclusion set), not as population-level trend inference or forecasting.
2.4 Information sources and search strategy
We searched Scopus, Web of Science (Core Collection), and ERIC (final run: August 2025). We restricted the review to peer-reviewed journal articles and did not search gray literature (e.g., conference proceedings, theses/dissertations, reports, working papers, or preprints). Following the eligibility criteria (section 2.2), we applied filters for publication years, language, and document type (2015–2025; English, Spanish, and Portuguese; journal articles only). We exported records and deduplicated them in Rayyan.
Search strings were built as three Boolean concept blocks— (1) IC/ICC constructs, (2) digital/technology-mediated learning, and (3) higher education—and adapted to each database syntax while preserving the same conceptual logic. For transparency, the database-level logic is reported below:
Scopus (TITLE-ABS-KEY):
(“intercultural competenc*” OR “cultural competence*” OR “intercultural communication” OR “global competenc*” OR “cross-cultural” OR “intercultural skills” OR “intercultural learning”) AND (“digital technolog*” OR “educational technolog*” OR ICT OR “online learning” OR e-learning OR “virtual platform*” OR “learning management system*” OR MOOC* OR “virtual reality” OR “augmented reality” OR “mobile learning” OR gamification OR simulation* OR “AI in education”) AND (“higher education” OR universit* OR university OR college OR “tertiary education” OR postsecondary)
Web of Science Core Collection (TS):
TS = ((“intercultural competenc*” OR “cultural competence*” OR “intercultural communication” OR “global competenc*” OR “cross-cultural” OR “intercultural skills” OR “intercultural learning”) AND (“digital technolog*” OR “educational technolog*” OR ICT OR “online learning” OR e-learning OR “virtual platform*” OR “learning management system*” OR MOOC* OR “virtual reality” OR “augmented reality” OR “mobile learning” OR gamification OR simulation* OR “AI in education”) AND (“higher education” OR universit* OR university OR college OR “tertiary education” OR postsecondary))
ERIC (TX):
TX = ((“intercultural competenc*” OR “cultural competence*” OR “intercultural communication” OR “global competenc*” OR “cross-cultural” OR “intercultural skills” OR “intercultural learning”) AND (“digital technolog*” OR “educational technolog*” OR ICT OR “online learning” OR e-learning OR “virtual platform*” OR “learning management system*” OR MOOC* OR “virtual reality” OR “augmented reality” OR “mobile learning” OR gamification OR simulation* OR “AI in education”) AND (“higher education” OR universit* OR university OR college OR “tertiary education” OR postsecondary))
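The three-block composition logic behind these strings can also be expressed programmatically. The sketch below is illustrative and uses abbreviated term lists (the full lists appear in the strings above); the helper names are ours, not part of any database API:

```python
# Illustrative sketch: compose the three Boolean concept blocks.
# Term lists are abbreviated here; the executed strings above are authoritative.
IC_TERMS = ['"intercultural competenc*"', '"cultural competence*"', '"intercultural learning"']
TECH_TERMS = ['"digital technolog*"', '"online learning"', 'MOOC*', '"virtual reality"']
HE_TERMS = ['"higher education"', 'universit*', '"tertiary education"']

def block(terms):
    """Join the terms of one concept block with OR and wrap in parentheses."""
    return "(" + " OR ".join(terms) + ")"

def build_query(*blocks):
    """AND the concept blocks together; database-specific field tags
    (TITLE-ABS-KEY, TS, TX) are prepended per interface."""
    return " AND ".join(block(b) for b in blocks)

query = build_query(IC_TERMS, TECH_TERMS, HE_TERMS)
```

The same conceptual logic is preserved across databases; only the field tag and interface syntax change.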
Complete database-specific strings (as executed, including any interface-specific field tags) are provided in the Supplementary materials.
2.5 Selection process
Two reviewers independently screened each record (title and abstract) in Rayyan at every stage. Figure 8 reports the PRISMA counts: identified = 133, duplicates removed = 23, screened = 110, excluded at title/abstract = 56, full texts assessed = 64, full texts excluded with reasons = 40, and included in synthesis = 24. The Supplementary material tabulates full-text exclusion reasons.
Figure 8. PRISMA flow chart. Identification (n = 133), duplicates removed (n = 23), screened (n = 110), excluded at title/abstract (n = 56), full texts assessed (n = 64), full texts excluded with reasons (n = 40), and studies included (n = 24).
Two reviewers carried out data collection and built a structured extraction matrix. The matrix captured bibliographic data, country, discipline/program, sample size and level, study objectives, methodological design, technology (type, subtype, platform/hardware), and pedagogy/implementation (task design, synchronicity, duration, scaffolding/reflection, assessment). It also captured outcomes and instruments (instrument name, validation, reliability when reported, and measurement timing), the IC dimensions assessed, and main results (IC domains grouped into knowledge, skills, attitudes, and critical awareness).
2.6 Risk-of-bias/methodological quality appraisal
We appraised methodological quality with an adapted JBI checklist. The checklist includes eight criteria scored on a 0–2 scale: 0 = not met or poorly justified, 1 = partially met or unclear, and 2 = fully met and well justified. Two reviewers assessed each study. To avoid penalizing designs where a classical control group is infeasible, we redefined criterion #3 as “comparative/counterfactual design” and considered it fulfilled by control or comparison groups, matching/propensity adjustments, difference-in-differences, interrupted time series, or pre-defined benchmarks. When a criterion was not applicable to a study (e.g., purely qualitative designs), we removed it from the denominator and rescaled the quality score to the percentage of the applicable maximum.
Criteria (0–2 each): Clear participant inclusion criteria; Context and intervention adequately described; Comparative/counterfactual design (control/comparison if applicable; or equivalent quasi-experimental logic); Valid and reliable instrument for IC; Pre-post (or repeated) measurement performed; Adequate statistical analysis; Losses to follow-up reported and managed; and Limitations and biases discussed.
For each study, we summed item scores and divided by the maximum applicable points (2 × number of non-N/A items). This yielded a percentage of the applicable maximum. We then classified studies as High (≥75%), Moderate (50–74%), or Low (<50%).
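As a minimal sketch of the rescaling and classification logic (function names and the example item vector are ours; `None` marks a not-applicable criterion):

```python
def quality_score(item_scores):
    """Percentage of the applicable maximum (items scored 0-2; None = N/A)."""
    applicable = [s for s in item_scores if s is not None]
    max_points = 2 * len(applicable)  # 2 points per applicable item
    return 100 * sum(applicable) / max_points

def quality_label(pct):
    """Map a percentage score to the High/Moderate/Low bands."""
    if pct >= 75:
        return "High"
    if pct >= 50:
        return "Moderate"
    return "Low"

# A study where criterion #3 is not applicable (7 items, max = 14 points):
pct = quality_score([2, 2, None, 1, 2, 1, 1, 2])  # 11/14, about 78.6%
```

Dropping the N/A item from the denominator is what keeps single-group and qualitative designs comparable to the two studies scored on the full 16-point maximum.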
We used these labels descriptively and in sensitivity analyses. Overall, studies were Moderate–High. Excluding Low-quality studies in sensitivity analyses did not change the rank order of effect directions.
Appraisal labels are reported in Table 1 and used in sensitivity commentary (section 3.9).
Table 1. Summarizes, for each study, author (year), DOI, topic, brief description, methodology, and quality (H = high; M = moderate; L = low), i.e., the fields most informative for cross-study comparison in the main text.
2.7 Data synthesis
For synthesis, we grouped data into technology families, implementation and pedagogical strategy families, and IC domains and measurement instruments. Because designs, measures, and contexts were heterogeneous, we used a SWiM (Synthesis Without Meta-analysis) approach. We coded each study for direction of effect on IC using four mutually exclusive categories: positive/significant improvement (↑), non-significant (ns), mixed within-study effects, and adverse (↓). When studies reported multiple IC outcomes, we applied a prespecified hierarchy that prioritizes validated IC instruments and primary outcomes. Two reviewers coded independently and resolved discrepancies by consensus.
2.7.1 SWiM coding and synthesis procedures
We followed the SWiM reporting guideline (Campbell et al., 2020) and operationalized the synthesis as study-level vote-counting by direction of effect. The unit of analysis was the study intervention (one SWiM code per study).
For SWiM aggregation, each study was assigned to a dominant pedagogical strategy/implementation family based on the extracted intervention description (i.e., the primary learning activity and interaction structure driving intercultural engagement). The strategy families correspond to the rows reported in Table 2 (e.g., COIL/Global Virtual Teams; telecollaboration/virtual exchange; online study abroad/telepresence; dialogic online course; project-based/design thinking; accredited e-learning modules; MOOC service-learning; blended mobility; VR/360° immersive; VR/360° + virtual exchange; video-making + blended discussion; WebQuest/guided inquiry; and role-play telecollaboration). When a study combined multiple components, we applied a pre-specified rule: we assigned the family based on the component that accounted for the majority of learner time and that best matched the stated primary instructional objective (i.e., the “core” intercultural learning mechanism).
Table 2. Crosswalk of implementation logics, dominant mechanisms, boundary conditions, and assessment alignment (included interventions, n = 24).
For each study, we identified the eligible IC outcome(s) used for SWiM coding. We prioritized (in order): (1) the study’s pre-specified primary IC outcome (when stated); (2) validated IC instruments or validated subscales (as reported by authors); and (3) performance/observational measures supported by analytic rubrics or checklists. If multiple eligible outcomes remained, we selected the outcome most directly aligned with IC dimensions (attitudes, knowledge/cognition, skills/behaviors, critical awareness) at the main post-intervention timepoint (end of module/term).
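The prioritization hierarchy can be sketched as a simple tiered selection. This is an illustrative representation only; the tier tags and dictionary keys are ours, assuming each extracted outcome is annotated with its evidence type and timepoint:

```python
# Illustrative sketch of the outcome-prioritization hierarchy (tags are ours).
PRIORITY = ("primary", "validated_instrument", "performance_rubric")

def select_outcome(outcomes):
    """Pick the highest-priority eligible IC outcome at the main post timepoint.

    outcomes: iterable of dicts with keys 'type' (one of PRIORITY) and
    'timepoint' ('post' = end of module/term).
    """
    post = [o for o in outcomes if o.get("timepoint") == "post"]
    for tier in PRIORITY:
        for o in post:
            if o.get("type") == tier:
                return o
    return None  # no eligible outcome -> narrative synthesis only
```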
Direction of effect was coded using four mutually exclusive categories: (1) Positive (↑): statistically significant improvement in the prioritized IC outcome versus baseline (pre–post) and/or versus a comparison group, in the intended direction (p < 0.05, or as reported by the authors). When both within-group and between-group results were available, we prioritized the between-group estimate.
(2) Non-significant (ns): change/difference in the intended direction that was not statistically significant on the prioritized outcome. (3) Negative (↓): statistically significant deterioration (or significant between-group difference favoring the comparator) on the prioritized outcome.
(4) Mixed: within-study divergence, defined as at least one eligible IC domain/subscale showing ↑ while others show ns or ↓, or when different eligible instruments yielded conflicting directions.
Qualitative-only studies or descriptive studies lacking a comparable pre–post or contrast were retained in the review but were not assigned a SWiM direction code; these were synthesized narratively alongside the coded evidence. We summarized effectiveness by counting the number of studies coded ↑/ns/mixed/↓ within each strategy family (Table 2) and visualized these counts in the SWiM plot (Figure 9). The synthesis is descriptive and does not pool effect sizes. As a sensitivity check, we repeated the tabulation excluding studies appraised as low quality; this did not change the directional ranking across strategy families.
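The study-level coding and per-family tabulation described above can be sketched as follows. This is a simplified illustration (names are ours); one deliberate simplification is noted in the docstring, since in practice the prioritized outcome arbitrates:

```python
from collections import Counter

def swim_code(directions):
    """Collapse the directions observed on a study's eligible IC outcomes
    ('+' = significant improvement, 'ns', '-' = significant deterioration)
    into one SWiM category. Simplification: any '-' without a '+' is coded
    negative; in the review, the prioritized outcome decides such cases."""
    dirs = set(directions)
    if dirs == {"+"}:
        return "positive"
    if "+" in dirs and (dirs & {"ns", "-"}):
        return "mixed"  # within-study divergence
    if "-" in dirs:
        return "negative"
    return "ns"

def tally_by_family(studies):
    """studies: iterable of (strategy_family, swim_code) pairs.
    Returns per-family counts, i.e., the rows of the effect-direction plot."""
    counts = {}
    for family, code in studies:
        counts.setdefault(family, Counter())[code] += 1
    return counts
```

Because the synthesis is descriptive vote-counting by direction, no effect sizes enter this tabulation.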
Figure 9. Effect-direction plot (SWiM). Stacked horizontal bars show counts of ↑/ns/mixed/↓ per strategy family derived from the direction-coding in Table 2.
2.8 Mechanism-oriented coding and synthesis
In addition to SWiM, we conducted a mechanism-oriented synthesis to make the analytic step from “modality/type” to “how and why outcomes plausibly occur” explicit and auditable. We adopted a realist-informed, mechanism-focused logic in which each intervention was modeled as a configuration linking digital modality affordances → learning processes (mechanisms) → IC outcomes, conditional on contextual and implementation moderators (Pawson and Tilley, 1997; Wong et al., 2013). The unit of analysis was the study/intervention. Using the same extraction matrix already employed for pedagogical scaffolding, task design, facilitation, assessment, and limitations, we added a dedicated layer of mechanism coding to capture cross-study patterns that are observable in the included reports.
We developed an a priori codebook of eight mechanisms grounded in established learning and contact-based explanations of intercultural development and refined it inductively during pilot coding (Allport, 1954; Kolb, 1984; Pettigrew and Tropp, 2006). The final codebook includes: (M1) structured intercultural contact, (M2) guided reflection/debriefing, (M3) facilitation and feedback, (M4) authentic co-production, (M5) immersion/situational salience, (M6) perspective-taking/empathy activation, (M7) language/pragmatic mediation, and (M8) identity negotiation and psychological safety (Table 3). We also coded boundary conditions where reported (e.g., duration, synchrony, scheduling constraints, access/infrastructure, facilitation load, linguistic demands, and equity-related asymmetries) to clarify “what works, for whom, and under what conditions” (Pawson and Tilley, 1997; Wong et al., 2013). Two reviewers independently coded mechanisms for each included intervention and resolved disagreements through discussion to consensus; the resulting patterns were summarized by implementation logic and linked to assessment alignment (Table 3).
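The codebook and the cross-study frequency tally can be represented minimally as below. The mechanism IDs and labels are taken from the text; the coding sets in the example are illustrative, not actual study codes:

```python
# Mechanism codebook (M1-M8) as defined in the text.
MECHANISMS = {
    "M1": "structured intercultural contact",
    "M2": "guided reflection/debriefing",
    "M3": "facilitation and feedback",
    "M4": "authentic co-production",
    "M5": "immersion/situational salience",
    "M6": "perspective-taking/empathy activation",
    "M7": "language/pragmatic mediation",
    "M8": "identity negotiation and psychological safety",
}

def mechanism_frequencies(codings):
    """codings: one set of mechanism IDs per intervention (consensus codes).
    Returns, for each mechanism, how many interventions instantiate it."""
    return {m: sum(m in coded for coded in codings) for m in MECHANISMS}
```

Applied to the 24 consensus coding sets, this tally yields the per-mechanism counts reported in section 3.5.1.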
3 Results
3.1 Study selection
Our searches in Scopus, Web of Science, and ERIC returned 133 records. After automated and manual deduplication, we removed 23 duplicates and screened 110 titles and abstracts. We excluded 56 records that did not meet the prespecified criteria. We assessed 64 full texts and excluded 40 with documented reasons, most commonly because they lacked an eligible IC outcome, used a non-digital or off-scope intervention, targeted a non-higher-education population, were non-empirical reports, did not provide retrievable or adequate full text, or used unvalidated ad hoc outcomes without triangulation. The final corpus comprised 24 studies for qualitative synthesis and SWiM effect-direction coding. Supplement S-PRISMA-Reasons lists all exclusion categories and counts, and Figure 8 summarizes the flow.
3.2 Study characteristics
Across the 24 included studies (Table 1 and Supplementary Table 1), authors implemented interventions in higher-education programs in education, health professions, business/management, and language/communication, with contributions from North America, Europe, East Asia, Latin America, and the Middle East. Most studies used pre–post or mixed-method designs. A smaller subset used quasi-experimental or one-group quantitative designs, and several incorporated qualitative components (e.g., reflective journals and interviews) to explain or triangulate measured change. Samples were modest and classroom-typical (median n = 50, IQR 34–91, range 16–272). Across studies, total n = 1,787, reflecting delivery within for-credit courses and the logistical constraints of cross-institutional collaboration. Intervention duration ranged from short course modules to full-semester projects. Most interventions paired synchronous interaction for authentic dialogue with asynchronous work to manage time-zone differences, aligning with common COIL/virtual exchange practice and with the instructional logic that intercultural learning benefits from guided reflection after live interaction.
Methodological quality, appraised with design-appropriate checklists and rescaled to the percentage of the applicable maximum, was generally adequate: the median was 78.6% (IQR 57.1–85.7%), with category cut-points at ≥75% (High), 50–74% (Moderate), and <50% (Low). As shown in Figure 10, scores clustered in the moderate-to-high range; the lowest scores typically reflected non-comparative designs, heavy reliance on self-report without triangulation, or incomplete reporting of instrument properties. The comparative/counterfactual item (C3) applied to two studies (max = 16 points); for all others it was not applicable (max = 14), which explains the two different denominators used in the rescaling. This profile supports the decision to synthesize direction of effect rather than pool effect sizes, while lending credibility to the consistent positive patterns observed for interaction-centered designs in sections 3.4–3.6.
Figure 10. Risk-of-bias appraisal (% of applicable maximum). Horizontal bars show the percentage of checklist items met (rescaled by applicable maximum; dashed lines at 50 and 75%). Studies with a comparative/counterfactual design were scored on a 16-point maximum; all others on 14 points, which explains the two denominators.
3.2.1 Methodological quality profile and strength of inference
The adapted JBI appraisal indicates that the included evidence is, overall, moderate-to-strong in reporting and internal coherence (median quality 78.6%, IQR 57.1–85.7), with 13 studies classified as High and 11 as Moderate (no Low). Importantly, the item-level profile clarifies what the current evidence can and cannot support. Across studies, eligibility and participant inclusion were consistently reported (12/24 scored “2”; 12/24 scored “1”), and most papers explicitly discussed limitations and potential biases (18/24 scored “2”; 6/24 scored “1”). Pre–post or repeated measurement was common (18/24 scored “2”), but a non-trivial subset lacked a robust pre–post structure (5/24 scored “0”), which reduces inferential strength for change claims. A key constraint is the limited presence of counterfactual designs: only two studies implemented an explicit comparative/control logic, while this criterion was not applicable for the remaining descriptive or single-group intervention designs. Measurement validity is also heterogeneous: seven studies did not provide sufficient evidence that the IC instrument used was valid and reliable for the target population or context, which is consequential given longstanding debates on IC/ICC measurement and the predominance of self-report evidence (Deardorff, 2006; Spitzberg and Changnon, 2009). Finally, statistical analysis quality tended to be “adequate but not optimal” (17/24 scored “1”), most often due to limited reporting of effect magnitudes, incomplete handling of confounds, or insufficient detail on analytic decisions.
This quality profile supports a cautious interpretation of “direction of change” patterns (as synthesized via SWiM), while underscoring that the evidence base is less suited for precise causal attribution or pooled effect size estimation. This also motivates the sensitivity and boundary-condition analysis reported below: when study designs lack counterfactual structure and measurement is predominantly self-report, interpretation benefits from explicitly considering alignment among pedagogy, learning processes, and assessment, as well as contextual constraints that condition implementation (Deardorff, 2006; Spitzberg and Changnon, 2009).
3.3 Bibliometric landscape
This bibliometric component goes beyond descriptive counts by using science-mapping techniques to characterize (i) the thematic structure of the inclusion set and (ii) its intellectual base. We report keyword co-occurrence and temporal overlay (Figures 2, 3), geographic–semantic linkages via an alluvial map (Figure 6), outlet dispersion and corpus-internal visibility (Figures 11–13), thematic structure via centrality–impact mapping and correspondence analysis (Figures 4, 5), and the intellectual backbone through co-citation mapping (Figure 7). Together, these outputs clarify how the literature clusters, what bridges the clusters, and which foundational works anchor the field within the eligible intervention corpus.
Figure 11. Cumulative occurrences by source journal. Step lines show the number of included articles per journal over time, illustrating a decentralized publication landscape spanning education technology, distance education, applied linguistics, health/clinical training, and business/management outlets.
Figure 12. Geographic distribution of included studies (choropleth). Shading indicates the number of included articles by country of author affiliation. Higher densities appear in Anglophone/OECD systems and East Asia, with emerging contributions from Latin America and parts of Europe.
Figure 13. Country-level citation impact within the inclusion set. Bars display cumulative citations attributable to included studies by country, offering a corpus-internal indication of visibility.
In the author-keyword co-occurrence network (Figure 2), “intercultural competence” operates as a bridging node across three dominant thematic concentrations: (1) interaction-intensive virtual exchange/telecollaboration aligned with internationalization agendas, (2) online learning and course design/teaching presence terminology, and (3) immersive/simulation-oriented pedagogies (including role-play and scenario-based implementations). This structure supports the review’s interpretation that outcomes are not explained by tools per se, but by the pedagogical processes that different modalities afford and instructors organize. More finely, intercultural competence anchors three tightly coupled constellations—curricular digitalization in higher education (e-learning/ICT/online instruction/MOOCs/virtual mobility), internationally networked pedagogies (COIL/virtual exchange, collaborative learning, online internships, global virtual teams), and outcome-oriented constructs (cultural awareness, cross-cultural, intercultural competencies)—with the intercultural competence node positioned at the interface of the digitalization/infrastructure and telecollaboration/pedagogy clusters, reflecting its role as a bridging construct rather than a single-cluster term. This configuration mirrors syntheses showing IC increasingly cultivated through technology-mediated designs in universities (Zuo, 2025), with quasi-experimental evidence for COIL effectiveness (Hackett et al., 2023) and corroboration across sectors (Galan-Lominchar et al., 2024; Heymans et al., 2024). The employability logic is consistent with the rise of global virtual teams (Liu, 2025), while digitalization extends IC instruction and assessment online (Zhang et al., 2025).
Scalable routes via MOOCs/virtual mobility appear promising when inclusively designed and well facilitated (Rai et al., 2023), aligning with “internationalization at a distance” agendas (O’Dowd, 2021a).
The overlay by average publication year documents a shift from early, blue-tinted infrastructure terms (ICT, online instruction, blended learning) toward newer, yellow pedagogy-specific nodes centered on collaboration (COIL/virtual exchange, collaborative learning, online internships, global virtual teams) (Figure 3). Reviews covering 2016–2021 capture the initial phase (Guillén-Yparrea and Ramírez-Montoya, 2023), whereas later studies establish positive IC effects across disciplines (Hackett et al., 2023; O’Dowd, 2021b). COVID-19 accelerated internationalization-at-a-distance but also magnified inequality; policy scaffolds such as Erasmus+ Virtual Exchange helped institutionalize these models (Heymans et al., 2024; O’Dowd, 2021a).
The alluvial diagram linking author countries → descriptors → abstract lexemes shows broad convergence—led by the United Kingdom, Australia, the United States, India, Germany, and Poland—on higher education, online/e-learning, and virtual mobility/MOOCs, with outcome streams dominated by students, learning, intercultural, competence, and communication (Figure 6). Pandemic markers (COVID-19, distance learning) capture the global pivot and equity concerns (Hackett et al., 2023; Heymans et al., 2024). The central flow toward online/virtual terms coheres with the empirical turn to COIL/virtual exchange and the employability case reinforced by global virtual teams (Liu, 2025).
Cumulative occurrences by source journal reveal a decentralized landscape: most venues contribute a single article, and only a few reach two by 2024–2025, with pronounced step-ups after 2019 (Figure 11). This dispersion across distance-education, ed-tech, applied linguistics, health/clinical, and business outlets matches prior syntheses (Guillén-Yparrea and Ramírez-Montoya, 2023) and the field’s move toward scalable online interventions, including MOOCs and hybrids (Rai et al., 2023).
Geographically, a choropleth shows dense contributions from Anglophone/OECD systems and East Asia—notably the United States, the United Kingdom, Australia, China, and Canada—alongside emerging nodes in Latin America (Figure 12). This distribution reflects internationalization policy and digital-capacity patterns (Ghani et al., 2022) and EU instruments that fund online, people-to-people learning, while lighter coverage across parts of Africa and Latin America mirrors the documented digital divide; overall, the map signals a widening research front as connectivity and funding expand.
Within the inclusion set, country-level citation impact concentrates in the USA, India, China, and Russia, followed by the UK/Ukraine and Australia/Spain (Figure 13). The pattern is consistent with early institutionalization of COIL in the US and accumulating causal evidence (Hackett et al., 2023), while China and India are buoyed by scaled national MOOC platforms. Systems that pair strong internationalization agendas with national digital infrastructures tend to generate more frequently cited outputs.
The thematic performance map positions motor themes at high centrality/impact—most prominently the cluster joining intercultural competence, higher education, and internationalization—while online-learning/communication skills occupy the well-connected core and low-centrality topics remain peripheral (Figure 4). Centrality follows Callon and impact is proxied by normalized local citations (Alkhammash, 2023); the configuration aligns with “internationalization at home” guidance and with evidence that telecollaboration/virtual exchange fosters intercultural communicative competence (Gutiérrez-Santiuste and Ritacco-Real, 2023).
The correspondence analysis (Dim1 = 27.41%, Dim2 = 16.29%) separates process-oriented digital contexts from outcome-focused constructs and distinguishes socio-cultural adjustment/communication from delivery factors (Figure 5). The right-hand alignment of achievement with competence/education reflects a growing linkage between IC and academic outcomes (Novikova et al., 2022). Gains in intercultural communication via online interaction are repeatedly documented (Hackett et al., 2023), whereas mixed experiences in Chinese/African contexts underscore environmental constraints (Gutiérrez-Santiuste and Ritacco-Real, 2023). Taken together, the pattern corroborates a trajectory from infrastructure-focused analyses toward designs that test downstream achievement.
The co-citation network traces a backbone anchored by Deardorff (2006), whose definition/assessment of IC remains the primary reference. Psychometric/theoretical pillars such as Chen and Starosta (2000) and Earley and Ang (2003) provide the scaffolding that later work uses to operationalize IC in higher education and digital pedagogy; subsequent clusters connect to international-education practice and virtual mobility (e.g., Liu and Shirley, 2021), and recent nodes converge on causal evaluations of virtual exchange/COIL. Read together, Figure 7 charts a progression from foundational constructs to implementation and, ultimately, to rigorous tests of online collaboration’s effect on IC—an arc echoed in sectoral syntheses (Deardorff and Jones, 2012).
3.4 Technology layer and IC domains
Addressing RQ1, the technology stack in the included studies spans institutional e-learning infrastructure (LMS, forums, and web tools), synchronous communication (web/videoconferencing), collaborative workspaces and social platforms, immersive media (VR/360°), open online formats (MOOCs), clinical telepresence/simulation, guided inquiry (WebQuests), and analytics-supported activities. These tools are not used in isolation but as task vehicles inside credit-bearing courses and cross-site projects, typically paired with reflective prompts or interactional tasks that make cultural difference visible and discussable. As summarized in Table 4, videoconferencing and collaborative suites organize real-time dialogue and production; LMS/forums scaffold content exposure and reflection; immersive VR/360° adds situational salience for perspective-taking; MOOCs and blended mobility extend reach; and telepresence/simulation brings authenticity to professional training. Table 5 then maps technology families to intercultural competence (IC) domains targeted in the corpus—cognitive (knowledge), affective (attitudes), behavioral (skills/efficacy), and communicative ICC—clarifying that “Yes” denotes documented use (≥1 study) rather than prevalence.
Patterns across Tables 4, 5 indicate that interaction-intensive affordances (videoconferencing, virtual exchange/role-play, telesimulation) are most consistently associated with behavioral and communicative development, particularly when paired with facilitator-guided debriefing. By contrast, spaces that primarily structure content and reflection (LMS modules, WebQuests, portfolios) more readily elicit cognitive and affective gains than behavioral ones unless opportunities for authentic exchange are built in. Immersive VR/360° tends to catalyze perspective-taking across all domains, but its effects depend on access, integration with course outcomes, and post-experience reflection. These contrasts anticipate the implementation logics detailed in §3.5 and the measurement choices in §3.6.
3.5 Pedagogical approaches and implementation modes
Linking RQ1 to RQ2, three implementation logics recur across the corpus and help explain why certain technologies connect to particular IC outcomes. Collaborative, internationally networked designs—especially COIL/virtual exchange and adjacent telecollaboration—organize students into cross-site teams that complete authentic deliverables and engage in structured reflection; synchronicity is mixed (asynchronous production layered with scheduled meetings) to manage time-zone differences, and durations range from a module to a full semester. Immersive or scenario-based designs—VR/360° activities and telesimulation/telepresence in clinical/professional courses—foreground situated performance with explicit debriefing and targeted feedback, typically delivered in short, repeated synchronous sessions nested within a course. Finally, content-first online formats—MOOCs or LMS-based e-learning—deliver curated materials and forum prompts; their intercultural effects depend strongly on facilitation quality, reflective scaffolds, and opportunities for authentic exchange beyond content exposure. Table 6 synthesizes the approaches, specifying task design, synchronicity, duration, scaffolding/debriefing, assessment modes, and recurrent limitations.
The crosswalk clarifies the mechanism: designs that stage contact and conversation (COIL/VE, telesimulation) are those that most reliably foster behavioral and communicative growth, provided active facilitation is present; content-centric approaches remain valuable for cognitive and affective foundations but show variable outcomes without guided interaction. Immersive media add situational salience yet require equitable access and intentional debriefing to avoid novelty-driven or superficial gains. These regularities foreshadow the effectiveness patterns summarized in §3.7 (SWiM) and the equity constraints discussed in §3.8.
3.5.1 Mechanisms of impact: how digital modalities plausibly produce IC gains
To move beyond a taxonomic listing of tools and formats, we coded each intervention for the learning-process mechanisms that plausibly connect modality affordances to intercultural competence (IC) outcomes (Table 7). Across the 24 included interventions, the most frequently coded mechanisms were structured intercultural contact (19/24), facilitation and feedback (15/24), and language/pragmatic mediation (15/24), followed by authentic co-production (10/24). Guided reflection/debriefing (7/24) and explicit psychological safety/norm-setting (6/24) were reported less consistently, while immersion/situational salience (4/24) and explicit perspective-taking/empathy activation (2/24) were comparatively infrequent. This distribution is consequential because it indicates that many interventions rely on interaction and instructional presence, yet fewer make reflective consolidation and psychological safety fully explicit—both of which are central in contact- and experiential-learning explanations of intercultural development (Allport, 1954; Kolb, 1984; Pettigrew and Tropp, 2006).
Table 7. Mechanism codebook for the mechanism-oriented synthesis of digital interventions targeting intercultural competence (IC).
In networked exchange designs, the dominant configuration was structured intercultural contact plus facilitation/feedback, frequently strengthened by co-production and, when explicit, reflection prompts (e.g., West et al., 2024; Hackett et al., 2023; Sevilla-Pavón, 2019). In these studies, improvements were most plausibly explained by repeated opportunities for negotiated meaning, reciprocal collaboration, and instructor-supported feedback cycles rather than by the platform itself.
In scenario-based implementations, the mechanism is not “using VR/telepresence” per se; rather, outcomes are plausibly produced by situated performance under salient conditions (M5) combined with structured facilitation and feedback (M3) (e.g., DeWitt et al., 2022; Liu and Shirley, 2021; Min-Yu Lau et al., 2016). These formats often align naturally with performance-oriented evidence (e.g., checklists, rated interactions, or observed behaviors), which strengthens construct alignment when IC is framed as a behavioral/communicative capability rather than only a self-perceived disposition.
In content-first formats, outcomes depend on whether exposure to content is translated into interactive and guided learning processes. When content delivery was coupled with structured tasks, guidance, or service components, effects were more interpretable; when interaction and facilitation were minimal, effects were more variable and tended to rely primarily on self-report change (e.g., Li et al., 2024; Levey, 2020; Daugherty and Kearney, 2017). This pattern supports an explanatory claim: content-first interventions require deliberate design to “convert” information exposure into dialogic engagement and feedback if consistent IC gains are expected.
Table 2 summarizes these cross-cutting patterns by implementation logic, including the dominant mechanism configuration, typical boundary conditions reported in the corpus, and the assessment alignment most commonly used. Together, this mechanism-oriented layer provides an explanatory bridge from modality categories to plausible causal processes.
3.5.2 Design tensions and trade-offs (boundary conditions)
The mechanism patterns above also reveal practical tensions that shape what is feasible and what is likely to work in different settings. First, there is a persistent scalability–depth trade-off: content-first formats (e.g., LMS/MOOCs) can expand reach, yet they typically reduce the density of reciprocal contact unless interaction and facilitation are deliberately engineered (Li et al., 2024; Daugherty and Kearney, 2017). Second, a synchronous–asynchronous tension emerges in networked exchange. Synchronous contact can strengthen negotiated meaning and immediacy, but it increases coordination burdens, whereas asynchronous design can reduce scheduling friction while changing interaction quality and feedback timing (West et al., 2024; Hackett et al., 2023).
Third, there is an equity-of-access vs. technological sophistication tension. Scenario-based and immersive approaches can increase situational salience, yet they can introduce new access barriers and technical dependencies that are unevenly distributed across institutions and regions (DeWitt et al., 2022; Liu and Shirley, 2021). Finally, the evidence highlights a measurement trade-off: standardized self-reports are scalable and prevalent, but performance-aligned evidence is often needed to substantiate behavioral/communicative IC change, especially in scenario-based designs. This supports the methodological implication that future studies should more routinely triangulate self-report and performance/artifact-based evidence to strengthen interpretability and comparability across contexts (Pettigrew and Tropp, 2006; Kolb, 1984).
3.6 Measurement of intercultural competence
Across the included studies, outcomes were captured predominantly using standardized self-report instruments, most often the Intercultural Sensitivity Scale (ISS), the Intercultural Development Inventory (IDI), and the Cultural Intelligence Scale/Index (CQ/CQS). Some studies complemented these with the Intercultural Effectiveness Scale (IES), the Inventory for Assessing the Process of Cultural Competence Among Healthcare Professionals (IAPCC) and its student version (IAPCC-SV), and other intercultural competence or global citizenship measures, such as the Intercultural Communicative Competence Scale (ICCS), Intercultural Knowledge and Competence (IKC) measures, and the Global Citizenship Scale (GCS). Performance-oriented assessments appeared most frequently in scenario-based settings—clinical telesimulation, role-plays, and VR/telepresence—via Objective Structured Clinical Examination (OSCE)-style checklists, analytic rubrics, and evaluated artifacts/e-portfolios, typically embedded in guided debriefs. Most instructional studies used pre–post designs; a minority reported post-only snapshots. Reliability was unevenly reported but, when available, fell in acceptable–excellent ranges for standardized scales.
Consistent with the technology–pedagogy pattern documented in Sections 3.4–3.5, COIL/virtual exchange commonly paired ISS and ICCS with IDI alongside reflective journals; health/clinical courses favored analytic rubrics and OSCE-style checklists, sometimes supplemented with IAPCC; LMS/MOOC implementations relied mainly on self-reports unless instructors added structured reflection or interaction; and VR/360°/telepresence combined observer ratings with brief self-efficacy or attitude measures. Formal forward–back translation was rarely described, which raises validity and equity considerations for multilingual cohorts (see RQ5). Table 8 summarizes instruments and measurement characteristics used in the corpus.
Explicit theoretical anchoring was uneven but traceable. A subset of interventions operationalized IC/ICC through established models, including Byram’s ICC in language/telecollaboration-oriented designs (e.g., Hsu and Beasley, 2019; Sevilla-Pavón, 2019; Trinh and Dinh, 2024), Deardorff’s process framing in higher-education internationalization contexts (e.g., de Castro et al., 2019; Eliyahu-Levi, 2020), and an interpersonal ICC tradition aligned with Arasaratnam’s instrument-based approach (e.g., Swartz and Shrivastava, 2022). However, most studies inferred IC change primarily from standardized self-report measures that emphasize attitudes and perceived capability. This heterogeneity reinforces the need to interpret “IC gains” in relation to the underlying model assumptions and to the evidentiary standard implied by each instrument (self-perception vs. demonstrated performance).
Table 8. Outcome instruments and measurement characteristics (“Evidence level/Quality” reflects the study-level appraisal recorded in the dataset; counts refer to studies in the inclusion set.)
Measurement choices mirror instructional logic: interaction and performance-centered designs invite observational/rubric-based outcomes, whereas content-centered formats lean on self-report unless instructors embed structured interaction and debriefing.
3.7 Effectiveness synthesis (SWiM effect-direction)
Given outcome and design heterogeneity, effectiveness was synthesized using SWiM as study-level vote-counting by direction of effect. For each intervention, we identified the primary/validated IC outcome and coded direction as ↑ (positive/significant improvement), ns (non-significant), mixed (within-study divergence across IC domains/instruments), or ↓ (adverse), based on pre–post and/or between-group contrasts following the prespecified rules reported in Methods. We then aggregated these study-level codes by the dominant pedagogical strategy/implementation family (Table 3) and visualized the counts in Figure 9. The synthesis is descriptive and does not pool standardized effects.
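The vote-counting step is a descriptive cross-tabulation: study-level direction codes are grouped by strategy family and tallied, with no pooling of effect sizes. A minimal sketch follows; the strategy labels and direction assignments are hypothetical examples, not the counts reported in Table 3.

```python
from collections import defaultdict, Counter

# Illustrative study-level codes: (strategy family, effect direction).
# Directions mirror the coding scheme: "up", "ns", "mixed", "down".
study_codes = [
    ("COIL/VE", "up"), ("COIL/VE", "up"), ("COIL/VE", "ns"),
    ("VR/360", "up"), ("VR/360", "mixed"),
    ("MOOC/LMS", "ns"), ("MOOC/LMS", "up"),
]

def swim_vote_count(codes):
    """Aggregate study-level effect-direction codes by strategy family.

    Returns a per-strategy tally of directions; purely descriptive,
    no standardized effects are pooled.
    """
    table = defaultdict(Counter)
    for strategy, direction in codes:
        table[strategy][direction] += 1
    return {strategy: dict(tally) for strategy, tally in table.items()}

print(swim_vote_count(study_codes))
```

Keeping the output as raw counts per strategy (rather than, say, proportions or a "winner" per family) preserves the descriptive character of SWiM and leaves interpretation to the narrative synthesis.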
The pattern is consistent across the corpus: interaction-intensive designs show the most reliable gains. COIL/Global Virtual Teams and telecollaboration/virtual exchange report predominantly ↑ outcomes when coupled with guided reflection/debriefing, a result aligned with quasi-experimental evidence that COIL improves IC relative to controls (Hackett et al., 2023). Scenario-based formats (online study-abroad/telepresence; VR/360°; telesimulation) also trend ↑, particularly where debriefing and instructor facilitation are explicit. By contrast, content-centric formats (e.g., large-scale MOOC or LMS modules) are contingent on facilitation; the one MOOC+service-learning implementation in this set achieved ↑, whereas MOOC-only approaches in the broader literature frequently yield ns/mixed results when relying solely on self-report. A sensitivity check excluding low-quality/high-RoB studies did not change the directional ranking, and only one small-n telecollaboration role-play study registered ns. The full counts are shown in Table 3, and the corresponding SWiM plot appears in Figure 9.
The SWiM evidence indicates that designs combining authentic intercultural interaction with structured facilitation and debriefing—exemplified by COIL/VE and telesimulation—are the most dependable routes to measurable IC gains in digitally mediated higher education. VR/360° appears promising when embedded in reflective sequences rather than as stand-alone exposure, and MOOC/LMS formats require facilitation and authentic tasks to avoid the null or mixed patterns frequently associated with self-report-only evaluation.
3.8 Equity, access, and context
Equity-sensitive constraints and enablers were coded from the study-level “results/limitations” fields and synthesized across strategies. The resulting profile is coherent with the geographical footprint and citation gradient reported earlier: systems with strong policy scaffolds and digital capacity (Figure 12) tend to report fewer structural barriers and yield more visible outputs within the corpus (Figure 13), whereas cross-jurisdictional implementations most often flag language/time-zone coordination, facilitation load, and uneven connectivity as sources of friction. Table 9 consolidates the contextual dimensions most frequently cited by authors—digital divide/connectivity, language and time zones, facilitation workload, institutional supports (IT, quality assurance, professional development), micro-credentials/recognition, and public policy/funding—and links them to representative notes from the included reports. Read alongside the effectiveness synthesis (Section 3.7), these constraints explain why interaction-intensive designs require guided debriefing and institutional backing to translate opportunity into measurable intercultural-competence (IC) gains: without reliable access, time-zone solutions, and facilitator capacity, otherwise sound pedagogies underperform.
Table 9. Contextual/Equity constraints and enabling conditions by approach (“Yes” indicates the theme was explicitly noted in the study’s results/limitations; blank = not explicitly reported.)
Language and time-zone coordination recur precisely in the cross-institutional formats that otherwise deliver the strongest IC gains (COIL/VE, telepresence, blended mobility). Facilitator workload and role clarity surface where interaction and debriefing are central (COIL/VE, telesimulation, VR+VE). Connectivity and IT frictions remain a background constraint in modular e-learning and multisite collaborations. At the same time, enabling signals are visible: institutional training, cross-office coordination (international, IT, QA), and recognition mechanisms (micro-credentials, course badges) sustain student effort and improve assessment quality. Public policy platforms—national MOOC ecosystems or Erasmus-style virtual exchange support—help explain the concentration of activity and influence observed in Figures 12, 13.
Variation across contexts, learner groups, and digital formats. Beyond overall effect-direction patterns, the included studies show meaningful variation in where, for whom, and under what conditions digital IC interventions are implemented. Most interventions were explicitly cross-national or cross-cultural in design, typically involving partner institutions and coordination across academic calendars and time zones; consequently, implementation barriers were not peripheral but structurally linked to modality choice. In disciplinary terms, the corpus spans teacher education, language/EFL contexts, and health professions training, which matters because “IC gains” are operationalized differently across fields—ranging from communicative/pragmatic competence and perspective-taking to professional interaction and scenario-based performance—thereby shaping both task design and assessment alignment (Deardorff, 2006; Spitzberg and Changnon, 2009).
Format-level variation also maps onto different constraint profiles. Interaction-intensive designs (e.g., COIL/telecollaboration) often face coordination and facilitation workload as primary bottlenecks, whereas content-centric LMS/MOOC formats tend to reduce coordination costs at the expense of lower contact density unless dialogic tasks and instructor presence are deliberately engineered. Immersive and scenario-based formats (telepresence, telesimulation, VR/360°) more frequently raise equity-of-access concerns (hardware, bandwidth, institutional support) and rely on debriefing and feedback to translate situated experience into reflective learning; without that pedagogical layer, “technology novelty” can dominate the learner experience and blur interpretation of IC change. These context and format differences motivate treating equity, infrastructure, and facilitation capacity as boundary conditions rather than as generic limitations.
3.9 Sensitivity to study quality and design features
When we limited the SWiM synthesis to studies rated moderate or high quality (≥50% of the applicable maximum in Figure 10), the overall pattern did not change. COIL/virtual exchange and scenario-based formats (telepresence/VR/telesimulation) still showed mostly positive effects. This suggests that our directional conclusions are not driven by lower-quality reports. We also found that studies combining standardized IC scales with performance- or rubric-based evidence reported positive effects more often than studies relying only on post-course self-report measures. This reinforces the measurement–pedagogy alignment discussed in Section 3.6.
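The sensitivity check above amounts to recomputing the direction tallies on the subset of studies whose appraisal score meets the threshold, with quality expressed as a percentage of the applicable maximum (not-applicable items excluded from the denominator). A minimal sketch, assuming hypothetical study records rather than the actual appraisal data:

```python
# Illustrative appraisal records: raw JBI score, the maximum over applicable
# items only, and the study-level effect-direction code.
studies = [
    {"id": "s1", "score": 11, "applicable_max": 14, "direction": "up"},
    {"id": "s2", "score": 5,  "applicable_max": 12, "direction": "mixed"},
    {"id": "s3", "score": 8,  "applicable_max": 14, "direction": "up"},
]

def quality_pct(study):
    """Quality as a percentage of the applicable maximum score."""
    return 100.0 * study["score"] / study["applicable_max"]

def sensitivity_subset(studies, threshold=50.0):
    """Keep moderate/high-quality studies and recount effect directions."""
    kept = [s for s in studies if quality_pct(s) >= threshold]
    counts = {}
    for s in kept:
        counts[s["direction"]] = counts.get(s["direction"], 0) + 1
    return kept, counts

kept, counts = sensitivity_subset(studies)
print([s["id"] for s in kept], counts)
```

If the recounted tallies preserve the ranking of strategy families from the full-corpus synthesis, the directional conclusion is unlikely to be an artifact of lower-quality reports.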
3.10 Crosswalk insight (technology × pedagogy × outcome domain)
When we read Tables 4, 5 alongside the SWiM profile, a clear alignment emerges. Designs that embed authentic interaction (COIL/virtual exchange, telepresence, and audiovisual project-based learning) mostly target communicative and behavioral outcomes, and they most often report positive effects. Content-first infrastructures (MOOCs/LMS) tend to target cognitive and affective outcomes, and results are mixed unless instructors add facilitation and structured dialogue. Immersive approaches (VR/360°) can support situated learning, but they typically require a structured debrief to translate “presence” into changes in communication or behavior. This helps explain why some designs outperform others: they focus on outcome domains that can realistically shift within course timeframes and they use measures that match those domains.
4 Discussion
4.1 Synthesis and interpretation of findings
This review synthesized evidence on how digital modalities and course-embedded pedagogies support intercultural competence (IC) development in higher education. The central interpretive finding is that reported IC gains are explained more convincingly by mechanism configurations than by the presence of any specific technology. In the SWiM synthesis, interaction-intensive designs—especially COIL/virtual exchange and closely related telecollaboration formats—show the most consistent positive direction-of-effect signal (Figure 9; Table 3). Scenario-based formats (telepresence/telesimulation and VR/360°) also trend positive when they include structured debriefing and feedback. In contrast, content-first implementations (LMS modules and MOOCs) display more contingent outcomes unless instructors engineer sustained dialogue and facilitation that transforms exposure into coached intercultural practice (Rai et al., 2023). This pattern is consistent with prior syntheses emphasizing that scaffolding and task design, rather than tools alone, condition outcomes in digitally mediated intercultural learning (Guillén-Yparrea and Ramírez-Montoya, 2023; O’Dowd, 2021b), and it coheres with quasi-experimental evidence suggesting COIL can improve IC relative to controls (Hackett et al., 2023). Importantly, the technology–pedagogy–outcome alignment observed in the corpus helps interpret what changes are most plausible within course timeframes: communicative/behavioral domains are more likely to shift in interaction-intensive formats, whereas cognitive/affective facets are more typical targets of content-first environments (Tables 4, 5).
Interpreting these patterns through IC theory strengthens the explanatory logic and addresses measurement controversies that otherwise make “effectiveness” difficult to compare across studies. Across prominent IC models, developmental claims depend on the joint activation of interactional experience, guided meaning-making, and opportunities to rehearse communicative behavior: Byram’s ICC emphasizes skills of interpreting/relating and interaction alongside attitudes and critical cultural awareness; Deardorff’s process model positions attitudes as an entry point but treats internal shifts and external outcomes as emergent from iterative interaction and reflection; Fantini’s A–K–S architecture highlights that skills/behaviors require practice and feedback rather than content exposure alone; and Arasaratnam’s ICC foregrounds interaction, empathy, and communicative adaptation as central mechanisms through which competence becomes visible (Byram, 1997; Deardorff, 2006, 2009; Fantini, 2000; Fantini and Tirmizi, 2006; Arasaratnam, 2009). These theoretical differences matter because they imply different “targets” (e.g., disposition vs. performance) and therefore different threats to validity when studies rely primarily on self-report measures. In our corpus, standardized self-report instruments predominate, while performance/observational evidence appears more often in simulation and role-play contexts (Section 3.6). This measurement pattern can inflate apparent convergence when constructs are treated as self-perceived attitudes rather than observable communicative capability, and it helps explain why some content-first studies report mixed results: without structured interaction and coaching, learners may shift awareness without demonstrating communicative change. Accordingly, “effectiveness” should be interpreted as conditional on construct operationalization and assessment alignment, not as a simple property of the platform.
The added mechanism-oriented coding layer makes this interpretation auditable by specifying how modalities plausibly produce IC gains (Tables 2, 7). Across implementation logics, the most dependable configuration combines: (a) structured intercultural contact (reciprocal interaction and negotiation of meaning), (b) guided reflection/debriefing (prompted processing of experience), and (c) facilitation and feedback cycles (teacher presence, coaching, rubrics, and responsive support). Where these mechanisms co-occur, the direction-of-effect signal remains predominantly positive across disciplines and delivery modes; where they are absent—particularly in content-first designs—outcomes become contingent on whether instructional design successfully converts information exposure into dialogue, practice, and feedback. For scenario-based implementations, the mechanism is not “VR/telepresence” per se; rather, outcomes are more plausibly produced by situated performance under salient conditions combined with structured debriefing and feedback, which also supports stronger assessment alignment via rubrics/checklists rather than post-only self-report (Table 2). For networked exchange designs, repeated negotiated interaction and co-production under facilitation appears to provide the most consistent pathway to change. This “mechanisms-first” interpretation is consistent with contact-based and experiential learning explanations in which interaction alone is insufficient unless conditions support cooperation, reflection, and guided transformation (Allport, 1954; Kolb, 1984; Pettigrew and Tropp, 2006).
The mechanism lens clarifies why results vary across contexts even when modality labels look similar. Boundary conditions reported in the included studies—time-zone and scheduling coordination, language demands, connectivity and infrastructure, facilitation workload, and institutional recognition/support—operate as constraints on whether core mechanisms can be sustained long enough to produce measurable change (Table 2). These constraints are not ancillary; they affect the feasibility of reciprocity, psychological safety, and feedback cadence, and therefore help explain heterogeneity across settings and cohorts. Taken together, the evidence supports an explanatory synthesis: digital modalities contribute to IC development when they are used to organize a small set of theory-consistent learning processes—structured intercultural contact, guided reflection, and facilitation/feedback—under enabling institutional and equity conditions; absent those processes, technology adoption alone does not reliably translate into IC gains.
4.2 Pedagogical integration and affordances
Rather than treating digital internationalization as a set of interchangeable tools, our findings support an affordance-based interpretation: different modalities primarily contribute by enabling (or constraining) specific learning processes that IC theory identifies as necessary for competence development. This matters because it reframes “what to implement” from platform selection to process engineering. In practical terms, the corpus suggests three integration logics—networked exchange, scenario-based immersion, and content-first provision—but their effectiveness depends on whether they reliably instantiate structured intercultural contact, guided reflection, and facilitation/feedback (Allport, 1954; Kolb, 1984; Pettigrew and Tropp, 2006).
First, COIL/virtual exchange and related telecollaboration designs appear to be the most robust pathway when institutions seek course-embedded, scalable internationalization that remains faithful to competence-as-performance. Their comparative advantage is not videoconferencing itself, but the capacity to organize repeated reciprocal interaction, negotiated meaning, and co-produced deliverables under educator guidance. This aligns with ICC traditions that foreground interaction and mediation (e.g., Byram’s emphasis on interpreting/relating and interaction skills) and with process-oriented models in which external outcomes emerge from iterative cycles of experience and reflection (Deardorff, 2006; Byram, 1997). A critical implication is that successful COIL/VE is less about adding “international guests” and more about designing tasks that require interdependence, balancing synchronous encounters with asynchronous collaboration, and making reflection visible through structured prompts and assessment. This logic also maps to workplace demand for global virtual teamwork, which helps explain why COIL/VE can be positioned as both internationalization and employability development when implemented with clear performance expectations (Liu, 2025).
Second, telepresence/telesimulation and VR/360° formats are best interpreted as salience amplifiers: they can increase situational realism, emotional engagement, and the immediacy of culturally sensitive encounters. However, the evidence indicates a common contradiction: “high immersion” does not reliably translate into IC gains unless educators add structured debriefing, feedback, and opportunities to reframe experience. This pattern is theoretically coherent. In experiential learning terms, concrete experience alone is insufficient; learning depends on reflective observation and abstraction that can be iteratively tested (Kolb, 1984). Likewise, contact-based explanations predict that exposure without cooperative structure and guided processing may yield superficial attitude change or even reinforce stereotypes. Therefore, the strongest implementations treat immersive scenarios as inputs to coached reflection and performance-aligned assessment, rather than as stand-alone experiences.
Third, LMS modules and MOOCs are most defensible as reach-extending infrastructures, but they expose a recurring tension between scalability and depth. Content-centric delivery can improve awareness and provide broad access, yet it often reduces the density of reciprocal contact and the immediacy of feedback—precisely the processes that theory suggests are needed for competence development. This helps explain the mixed outcomes observed in the corpus: where instructors engineered dialogic tasks (e.g., structured discussion protocols, peer collaboration, reflective journals, role-play prompts) and maintained visible facilitation, outcomes were stronger; where participation was largely self-paced and facilitation minimal, effects were variable and more vulnerable to self-report inflation or disengagement. In other words, content-first provision is not inherently ineffective, but it requires deliberate design to move learners from exposure to interaction and coached practice (Heymans et al., 2024).
Across these integration logics, the practical implication is a “mechanisms-first” design stance: instructors can select lower- or higher-tech modalities based on feasibility, but should preserve the same core process architecture—structured reciprocal contact, explicit and assessed reflection, and facilitation/feedback that maintains expectations and psychological safety (Allport, 1954; Pettigrew and Tropp, 2006; Kolb, 1984). The mechanism tables in this review (Tables 2, 7) also indicate that interpretive confidence increases when assessment aligns with the IC domain being targeted. For example, claims about communicative/behavioral change are more credible when supported by performance evidence (rubrics, scenario checklists, observable products) rather than post-only self-report; conversely, self-report measures are more defensible for proximal attitudinal and awareness outcomes. This is not merely a measurement preference: it is part of the causal chain through which pedagogical design becomes visible as “effectiveness,” and it is a key reason why nominally similar formats can yield different results across contexts and cohorts.
4.3 Challenges, design considerations, and research gaps
The mechanism coding clarifies recurrent tensions that explain heterogeneity and, crucially, why nominally similar “digital” interventions yield different outcomes. First, scalability versus depth: content-first ecosystems (MOOCs/LMS) expand reach and reduce marginal costs, but they often thin the density of reciprocal contact unless designers deliberately engineer guided interaction and feedback loops. Second, synchronous versus asynchronous exchange: asynchronous work mitigates time-zone constraints yet can reduce real-time negotiation of meaning unless supported by structured prompts and responsive facilitation. Third, equity of access versus technological sophistication: immersive approaches can increase situational salience, but they intensify infrastructure and access requirements, creating selection effects and widening participation gaps. Fourth, self-report versus performance evidence: when assessment is misaligned with the targeted mechanism (e.g., expecting behavioral change while measuring post-course attitudes), studies may under-detect change or overstate it through response bias. These are not peripheral “limitations”; they function as boundary conditions that should be built into design and evaluation.
Three research gaps follow directly from these tensions. First, most evaluations rely on standardized self-report instruments (e.g., ISS, IDI, CQ/CQS); many studies do not report reliability for the study sample, few document translation/adaptation procedures, and cross-language measurement equivalence is rarely tested. Because IC models differ in what counts as competence (dispositions, internal shifts, or external performance), this measurement pattern limits interpretability across contexts. Performance/observational evidence appears mainly in clinical or role-play settings (Section 3.5), suggesting a tractable improvement: future studies should routinely triangulate validated self-report with rubric-based or OSCE-style assessments and report reliability indicators (e.g., alpha/omega) and rater agreement. Second, coordination demands (time zones, language pairing, facilitation workload) repeatedly determine whether learners experience sustained reciprocal interaction (Table 9), making “implementation capacity” a substantive explanatory variable rather than a logistical footnote. Third, limited reporting of fidelity and dosage (contact hours, number of sessions, adherence to scaffolds) constrains causal interpretation of null or mixed effects; comparative and longitudinal designs with explicit implementation reporting would substantially strengthen inference.
4.4 Theoretical and conceptual insights
The intellectual base visible in the co-citation structure supports a theory-grounded interpretation of these findings. Deardorff’s process model remains a central anchor for conceptualizing IC as developmental and assessable (Deardorff, 2006), while psychometric traditions (e.g., intercultural sensitivity and cultural intelligence) continue to shape operational measurement. The overlay view (Figure 3) aligns with a broader field shift from infrastructure-focused “online learning” framings toward collaboration-centered pedagogy—an arc also observed in syntheses of virtual exchange (Guillén-Yparrea and Ramírez-Montoya, 2023). This supports a key inference: digital modalities contribute most when they instantiate learning processes consistent with IC theory, particularly iterative interaction plus guided meaning-making and feedback, rather than when they merely increase exposure to cultural content.
Accordingly, the main conceptual contribution of this review is a transferable explanatory model: digital internationalization works when it activates a small set of mechanisms that plausibly drive IC development. Technologies should therefore be treated as affordance bundles whose value depends on whether they produce (i) structured intercultural contact, (ii) guided reflection, (iii) facilitation/feedback, and—when relevant—(iv) situated performance and perspective-taking (Table 3). This framing clarifies why “VR” or “MOOC” are not mechanisms in themselves: immersive tools mainly amplify situational salience, while content-first tools mainly amplify reach and exposure; both require pedagogical engineering to realize the same underlying processes that are more directly instantiated in COIL/virtual exchange designs. Making mechanisms and boundary conditions explicit offers an analytic bridge between theory and implementable design choices that can be compared across contexts without collapsing them into tool labels.
4.5 Contextual nuances and emerging patterns
Geographic and citation patterns (Figures 12, 13) suggest that systems with stronger policy scaffolds and digital capacity produce more outputs and accrue more citations (Ghani et al., 2022; Li et al., 2024). Conversely, studies from lower-resourced settings report connectivity and access constraints that temper results even when designs are pedagogically sound (Table 9), reinforcing the argument that institutional support is part of the causal pathway. The correspondence map (Figure 5) also points to an emerging linkage between IC and academic achievement (Novikova et al., 2022), motivating the next empirical step: tracing whether proximal intercultural gains translate into downstream academic and employability outcomes, a direction already visible in policy-linked discourse under SDG-4 (O’Dowd, 2021a).
4.6 Critical reflections on the review process
This review’s contribution is methodological as well as substantive, but its added value depends on how the strands were integrated rather than merely co-presented. The PRISMA-conformant synthesis establishes what the inclusion set supports regarding effectiveness—namely, that interaction-intensive designs with explicit debriefing and facilitation most consistently yield positive IC change (Figure 9; Table 3). The bibliometric layer then situates where and how this knowledge is produced by mapping venue dispersion, geography, temporal shifts, and the intellectual backbone (Figures 4–7, 8, 11–13). The connection is analytic rather than rhetorical: juxtaposing the SWiM profile with geographic production/visibility (Figures 12, 13) indicates that designs most often associated with positive change (COIL/VE, telepresence/telesimulation, and VR/360° with debriefing) are disproportionately studied in settings with stronger policy scaffolds and digital infrastructure, while content-centric approaches are more visible in certain outlet ecologies yet show variable outcomes unless facilitation is resourced. Likewise, the co-citation structure (Figure 7) helps interpret the measurement profile documented in Table 7: traditions anchored in Deardorff’s process framing and in widely used psychometric instruments (e.g., ISS, IDI, CQ/CQS) tend to operationalize IC via self-report unless pedagogy compels observable performance evidence. The mechanism coding and cross-walk (Tables 2, 7; Section 3.4.X) make this alignment auditable by linking modality affordances → learning processes → IC domains → assessment choices.
The choice of SWiM was not a convenience but a constraint-driven decision aligned with the evidence base. After attempting quantitative extraction from full texts, only a small subset reported commensurable statistics (means/SDs/Ns on comparable outcomes and timepoints). With too few eligible effects to support a defensible meta-analysis, a random-effects model would have been underpowered and potentially misleading. Direction-of-effect synthesis therefore provided an evidence-respectful aggregation while preserving design heterogeneity. Importantly, we strengthened reproducibility by specifying grouping criteria, effect-direction rules, and handling of mixed/qualitative-only studies (Methods, section 2.7). We also calibrated methodological quality as a percentage of the applicable maximum (Figure 10) and explicitly handled not-applicable items; the sensitivity check excluding lower-quality studies did not change the directional ranking of strategy families, increasing confidence that the pattern (structured contact + facilitation/feedback + debriefing) is not an appraisal artifact.
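The two quantitative conventions described above—quality expressed as a percentage of the applicable maximum with not-applicable items excluded from the denominator, and direction-of-effect tallies grouped by strategy family—can be sketched in a few lines. This is a minimal illustration with hypothetical ratings and strategy labels; the function names and data are ours, not part of the review protocol:

```python
# Illustrative sketch (hypothetical data): quality as a percentage of the
# applicable maximum (NA items excluded from the denominator), and a
# SWiM-style direction-of-effect tally per strategy family.

def quality_percent(item_ratings):
    """item_ratings: list of 1 (criterion met), 0 (not met), or None (not applicable)."""
    applicable = [r for r in item_ratings if r is not None]
    if not applicable:
        return None  # no applicable items; the score is undefined
    return 100.0 * sum(applicable) / len(applicable)

def effect_direction_tally(studies):
    """studies: list of (strategy_family, direction) pairs with direction in
    {'positive', 'mixed', 'null/negative'}. Returns counts per family."""
    tally = {}
    for family, direction in studies:
        tally.setdefault(family, {'positive': 0, 'mixed': 0, 'null/negative': 0})
        tally[family][direction] += 1
    return tally

# Hypothetical appraisal: 7 checklist items, one rated not applicable,
# so the score is computed over the 6 applicable items (5/6 met).
score = quality_percent([1, 1, 0, 1, None, 1, 1])

# Hypothetical corpus for the tally.
studies = [
    ('COIL/VE', 'positive'), ('COIL/VE', 'positive'),
    ('MOOC/LMS', 'mixed'), ('MOOC/LMS', 'positive'),
]
counts = effect_direction_tally(studies)
```

The sensitivity check mentioned above then amounts to re-running the tally on the subset of studies whose `quality_percent` exceeds a chosen threshold and comparing the directional ranking of families.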
Several limitations warrant precise specification because they shape interpretation and directly reflect methodological choices. First, heterogeneity is structural: included studies vary in design (often single-group), in populations and disciplines, and in outcomes ranging from developmental orientation (IDI) and sensitivity (ISS) to cultural intelligence (CQ/CQS) and performance rubrics. This matters because IC models differ in what counts as “competence,” and the predominance of self-report increases susceptibility to social desirability and reference-group effects. Moreover, reporting of reliability (α/ω) and rater agreement was inconsistent, and almost no study documented forward–back translation or measurement invariance, which constrains inference in multilingual cohorts and partly explains the equity/implementation issues logged in Table 9. Second, the evidence base is vulnerable to selection and capacity confounding: participation in COIL/VE or immersive formats may correlate with institutional resources, digital readiness, and instructor expertise. The bibliometric geography suggests that such capacity clusters in Anglophone/OECD systems, which could contribute both to higher realized effects and to higher citation visibility. Third, pandemic-era disruptions plausibly confounded participation, contact hours, and access, especially in multi-site projects; these disruptions are difficult to disentangle from modality effects in observational designs. Fourth, on the search side, restricting databases to Scopus, WoS, and ERIC maximized indexing quality but may have missed regional outlets and non-English scholarship; gray literature was not systematically included, so publication bias and small-study/file-drawer effects cannot be ruled out.
Finally, while bibliometric maps were pre-processed (synonym unification and thresholds reported in the Supplement), they remain descriptive; ecological inference from co-occurrence or co-citation structure to causal mechanisms would be unwarranted without triangulation. We mitigated this by cross-walking bibliometric patterns with intervention-level extraction, quality appraisal, and mechanism coding, but future work could strengthen this link through registered hypotheses relating network structure to measured outcomes.
These limits sharpen rather than dilute the take-home message by identifying where methodological investment will yield the greatest returns. Comparative and longitudinal designs—especially multi-site quasi-experiments in COIL/VE and telesimulation—would better adjudicate between pedagogy and context. Fidelity/dosage reporting (contact hours, number/length of interactions, adherence to debrief protocols) would convert promising designs into transportable models. Multi-method measurement that pairs validated self-reports with performance/rubric evidence, and that documents reliability and linguistic validation, would reduce construct ambiguity and improve assessment alignment. In short, the review connects outcomes to production contexts by design: it shows not only what tends to work in this corpus, but also where and under what enabling conditions it has been made to work—information essential for theory building and for equitable scaling.
4.7 Positioning the review within the broader discourse
This review advances a literature that has moved from describing digital infrastructures to theorizing and testing collaboration-centered pedagogies for intercultural competence (IC) in higher education. Prior syntheses document that shift and increasingly report positive, causal evidence for virtual exchange/COIL (e.g., quasi-experimental designs with controls) as opposed to exposure-only online formats (Hackett et al., 2023; Zhou et al., 2024).
What this review adds is a tightly coupled account of where and how the evidence is produced (bibliometric landscape) and why certain designs work (mechanism-aware synthesis), thereby connecting outcomes to the field’s production system.
It explains why interaction-intensive designs outperform content-centric ones. The effect-direction pattern in our corpus mirrors well-established mechanisms: structured, repeated contact with clear goals and institutional support—conditions typical of COIL/virtual exchange—produces perspective taking and reduces intergroup bias (Allport’s contact hypothesis and its meta-analytic replications). In contrast, one-way content exposure in MOOCs/LMS seldom satisfies these conditions and therefore yields weaker IC change unless supplemented with facilitation and dialogue (Pettigrew et al., 2011).
In parallel, the Community of Inquiry framework predicts that social presence and teaching presence—prominent in facilitated team projects—scaffold the cognitive presence needed for development, whereas minimally interactive courses under-provide these presences.
Experiential-learning theory clarifies why experience → guided reflection → conceptualization → experimentation cycles (e.g., scenario work plus debrief) outperform “read-and-quiz” designs; they align with the behavioral and communicative facets of IC that our map links to telecollaboration, telesimulation, and VR.
These mechanisms are consistent with recent discipline-specific reviews showing that virtual exchange measurably enhances intercultural (communicative) competence across contexts when interaction and reflection are engineered into the tasks (Borger, 2022; Emir and Yangın-Ekşi, 2024; Zhou et al., 2024).
This review also situates effectiveness within a granular technology–pedagogy map. Our results specify how particular affordances route to outcomes: videoconferencing + collaborative workspaces + structured reflection → communicative/behavioral IC gains; VR/telepresence + debrief → situated performance and perspective taking; content-first LMS/MOOCs → predominantly cognitive/affective change unless coupled with facilitation. This aligns with multi-site and review evidence synthesizing COIL/VE implementations across languages, health, business, and engineering, and showing advantages over “content-heavy, interaction-light” designs (Fernández-Cézar et al., 2024).
The review adds a structured equity lens and ties it to production patterns. The choropleth and citation gradient show concentration in systems with robust policy scaffolds (e.g., Internationalization at Home and Erasmus+ Virtual Exchange), national platforms, and QA/IT support; these same conditions reduce frictions (time zones, language pairing, connectivity) that authors repeatedly flag as threats to fidelity elsewhere. Thus, who publishes—and who is cited—tracks enabling infrastructures, clarifying external validity and pointing to where capacity investments will most affect outcomes (Chaisiri, 2025).
The study grounds contemporary practice in IC’s intellectual backbone. The co-citation structure locates today’s digital pedagogies in a lineage from Deardorff’s process model of IC to psychometric and theoretical pillars such as Chen and Starosta’s intercultural sensitivity and Earley and Ang’s cultural intelligence. This genealogy explains the instrument choices seen in our corpus and the emphasis on reflective, developmental trajectories over mere knowledge acquisition (Hackett et al., 2023).
This review also positions itself among the most recent systematic accounts. In language/teacher-education and HE more broadly, reviews converge on three design levers: sustained, reciprocal interaction; explicit reflection/debriefing; and assessment beyond self-report—precisely the levers illuminated by our SWiM synthesis and measurement audit. Together with causal tests of COIL and design frameworks for virtual exchange in professional programs, these sources corroborate the design stance we articulate: interaction + facilitation + debriefing, irrespective of platform or discipline (Kennedy et al., 2025; Liu, 2023).
4.8 Setting a research agenda
A focused agenda emerges from these findings. Future studies should (i) use comparative or longitudinal designs where feasible, including multi-site quasi-experiments for COIL/VE and VR/telepresence; (ii) triangulate self-report with performance evidence and report reliability (α/ω), inter-rater agreement, and measurement invariance across languages; (iii) document fidelity and dosage (contact hours, session counts, scaffold adherence); (iv) test downstream outcomes—academic achievement, program retention, and employability—in addition to proximal IC change (Figure 5; Novikova et al., 2022); (v) study equity by design, including time-zone solutions, language pairing, and access plans for high-bandwidth components; (vi) examine policy/institutional levers—micro-credentials, QA frameworks, and virtual-exchange offices—that convert pilots into sustained offerings; and (vii) explore emerging mediators such as analytics-informed facilitation and AI-supported translation/feedback while attending to ethics and bias. Pursued together, these directions would progress from promising case-based evidence to robust, generalizable knowledge about how digital pedagogies can equitably and reliably develop intercultural competence in higher education.
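Point (ii) of the agenda asks authors to report reliability for their own sample rather than citing instrument manuals. As one concrete example, Cronbach’s alpha follows directly from item and total-score variances: α = k/(k−1) · (1 − Σσ²ᵢ/σ²ₜₒₜₐₗ). A minimal sketch with hypothetical item scores (the data and function name are illustrative only):

```python
# Illustrative sketch (hypothetical data): Cronbach's alpha for a k-item scale,
# alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

def cronbach_alpha(item_scores):
    """item_scores: one inner list per item, all of equal length
    (one entry per respondent). Uses population variances."""
    k = len(item_scores)
    n = len(item_scores[0])

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Total score per respondent across the k items.
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

# Hypothetical 3-item scale answered by 4 respondents.
alpha = cronbach_alpha([[3, 4, 5, 2], [3, 5, 4, 2], [4, 4, 5, 3]])  # 0.9 here
```

Reporting such sample-specific coefficients alongside translation procedures and invariance tests is what would make multilingual comparisons in point (ii) credible.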
4.9 Strength of this review for practitioners and policymakers
For instructors and program designers, the actionable message is to prioritize authentic interaction with explicit facilitation and debriefing, ensure institutional IT/support structures are in place before scaling, and align assessment with targeted IC domains. For policymakers, the findings justify investment in virtual-exchange infrastructure and recognition schemes that reduce coordination costs and sustain participation. For researchers, the review delineates the methodological upgrades most likely to move the field toward theory-grounded, context-sensitive, and scalable practice.
5 Conclusion
Digital approaches develop intercultural competence (IC) most consistently when instructors design for sustained, reciprocal interaction and then make reflection explicit and assessable. In this corpus, Collaborative Online International Learning (COIL)/virtual exchange and closely related telecollaboration formats showed the most robust and repeatable improvements. Scenario-based implementations (telepresence, telesimulation, VR/360°) also supported IC development when courses included structured debriefing that helped students translate experience into communicative practice. By contrast, learning management system (LMS) and MOOC implementations expanded access but produced mixed results unless instructors embedded dialogic tasks and active facilitation rather than relying on content exposure alone.
The review also clarifies a recurring alignment between pedagogy and measurement. Studies in content-first settings relied mainly on standardized self-report scales, whereas interaction- and simulation-based designs more often incorporated performance, observational, or rubric-based evidence. Future evaluations should routinely combine these approaches, report reliability for the study sample, and address multilingual validity (including translation procedures and measurement invariance) to support credible cross-context interpretation.
The bibliometric layer indicates that much of the published evidence originates in settings with strong digital infrastructure and policy support. This concentration matters: equity, connectivity, facilitation capacity, and institutional recognition mechanisms shape what designs are feasible and which results appear in the literature. These findings support a practical design stance that generalizes across platforms and disciplines: prioritize structured intercultural contact, invest in facilitation and debriefing, and align assessment with the intended IC outcomes. The next research steps should strengthen comparative and longitudinal designs, document implementation fidelity and dosage, and examine whether gains extend beyond proximal IC outcomes to downstream academic and employability effects.
Data availability statement
The original contributions presented in this study are included in this article/Supplementary material; further inquiries can be directed to the corresponding author.
Author contributions
MN-T: Validation, Supervision, Data curation, Conceptualization, Formal analysis, Resources, Writing – review & editing. AB-A: Investigation, Conceptualization, Validation, Visualization, Writing – original draft. MM: Visualization, Validation, Writing – original draft, Investigation. JG: Software, Validation, Project administration, Writing – review & editing, Methodology.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Acknowledgments
We acknowledge using ChatGPT 5.0 only for grammatical corrections.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was used in the creation of this manuscript. The authors acknowledge using ChatGPT 5.0 only for grammatical corrections.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/feduc.2026.1725080/full#supplementary-material
References
Alkhammash, R. (2023). Bibliometric, network, and thematic mapping analyses of metaphor and discourse in COVID-19 publications from 2020 to 2022. Front. Psychol. 13:1062943. doi: 10.3389/fpsyg.2022.1062943
Arasaratnam, L. A. (2009). The development of a new instrument of intercultural communication competence. J. Intercult. Commun. 9, 1–8. doi: 10.36923/jicc.v9i2.478
Arasaratnam, L. A., and Doerfel, M. L. (2005). Intercultural communication competence: identifying key components from multicultural perspectives. Int. J. Intercult. Relat. 29, 137–163. doi: 10.1016/j.ijintrel.2004.04.001
Arias-Cifuentes, S., and Solano-Cahuana, I. (2025). Assessing the impact of language virtual exchange approaches on intercultural competence development among university students. Íkala Rev. Lenguaje Cult. 30:e355699. doi: 10.17533/udea.ikala.355699
Barrett, M. (2012). Intercultural Competence. EWC Statement Series, Vol. 2. Oslo, Norway: European Wergeland Centre.
Barrett, M. (2020). The council of Europe’s reference framework of competences for democratic culture: Policy context, content and impact. Lond. Rev. Educ. 18, 1–18. doi: 10.18546/LRE.18.1.01
Borger, J. G. (2022). Getting to the CoRe of collaborative online international learning (COIL). Front. Educ. 7:987289. doi: 10.3389/feduc.2022.987289
Borrelli, S., Konstantinidis, S., Fumagalli, S., Kärema, A., Mets-Oja, S., Nespoli, A., et al. (2024). TransfOrming transnational intercultural sensitivity for Midwifery students through an inclusive mobility model: A mixed-method evaluation of the TOTEMM project. Nurse Educ. Today 138:106186. doi: 10.1016/j.nedt.2024.106186
Brancu, L., Şahin, F., Guðmundsdóttir, S., and Çetin, F. (2022). Measurement invariance of the Cultural Intelligence Scale across three countries. Int. J. Intercult. Relat. 86, 145–157. doi: 10.1016/j.ijintrel.2021.12.002
Byram, M. (1997). Teaching and assessing intercultural communicative competence. Clevedon: Multilingual Matters.
Campbell, M., McKenzie, J. E., Sowden, A., Katikireddi, S. V., Brennan, S. E., Ellis, S., et al. (2020). Synthesis without meta-analysis (SWiM) in systematic reviews: reporting guideline. BMJ 368:l6890. doi: 10.1136/bmj.l6890
Chaisiri, S. (2025). Virtual exchanges in higher education: Advancing intercultural competence and language confidence. Res. Stud. Engl. Lang. Teach. Learn. 3, 448–468. doi: 10.62583/rseltl.v3i3.88
Chen, G. M., and Starosta, W. J. (2000). The development and validation of the intercultural sensitivity scale. Hum. Commun. 3, 1–15. doi: 10.1037/t61546-000
Çiftçi, E. Y., Karaman, A. C., and Daloğlu, A. (2022). ‘No one is superior to another’: Tracing intercultural development in a year-long study abroad programme. Lang. Learn. J. 50, 537–549. doi: 10.1080/09571736.2020.1802770
Daugherty, H. N., and Kearney, R. C. (2017). Measuring the impact of cultural competence training for dental hygiene students. J. Dent. Hyg. 91, 48–54.
Davaei, M., and Gunkel, M. (2024). The role of intelligences in teams: A systematic literature review. Rev. Manag. Sci. 18, 259–297. doi: 10.1007/s11846-023-00672-7
de Castro, A. B., Dyba, N., Cortez, E. D., and Pe Benito, G. G. (2019). Collaborative online international learning to prepare students for multicultural work environments. Nurse Educ. 44, E1–E5. doi: 10.1097/NNE.0000000000000609
Deardorff, D. K. (2006). Identification and assessment of intercultural competence as a student outcome of internationalization. J. Stud. Int. Educ. 10, 241–266. doi: 10.1177/1028315306287002
Deardorff, D. K., and Jones, E. (2012). “Intercultural competence: An emerging focus in international higher education,” in The SAGE Handbook of International Higher Education, eds D. Deardorff, H. de Wit, J. Heyl, and T. Adams (Thousand Oaks, CA: SAGE Publications, Inc), 283–304. doi: 10.4135/9781452218397.n16
DeWitt, D., Chan, S. F., and Loban, R. (2022). Virtual reality for developing intercultural communication competence in Mandarin as a Foreign language. Educ. Technol. Res. Dev. 70, 615–638. doi: 10.1007/s11423-021-10074-9
Earley, P. C., and Ang, S. (2003). Cultural Intelligence: Individual Interactions Across Cultures. Redwood, CA: Stanford University Press. doi: 10.1515/9780804766005
Eliyahu-Levi, D. (2020). Cross cultural online encounters with peers from different countries. Distance Educ. 41, 402–423. doi: 10.1080/01587919.2020.1766948
Emir, G., and Yangın-Ekşi, G. (2024). The role of telecollaboration in English language teacher education: A systematic review. Smart Learn. Environ. 11:3. doi: 10.1186/s40561-024-00290-0
Fantini, A. E. (2000). A central concern: developing intercultural competence. Brattleboro, VT: School for International Training.
Fantini, A., and Tirmizi, A. (2006). Exploring and assessing intercultural competence. World Learning Publications. Available online at: http://digitalcollections.sit.edu/worldlearning_publications/1
Fernández-Cézar, R., Prada-Núñez, R., and Pinto, N. S. (2024). Collaborative online international learning: Experiences in higher education. Educ. Process Int. J. 13, 7–24. doi: 10.22521/edupij.2024.134.1
Galan-Lominchar, M., Roque, I. M.-S., Cazallas, C. del C., Mcalpin, R., Fernández-Ayuso, D., et al. (2024). Nursing students’ internationalization: Virtual exchange and clinical simulation impact cultural intelligence. Nurs. Outlook 72:102137. doi: 10.1016/j.outlook.2024.102137
Garrison, D. R., Anderson, T., and Archer, W. (1999). Critical inquiry in a text-based environment: Computer conferencing in higher education. Internet High. Educ. 2, 87–105. doi: 10.1016/S1096-7516(00)00016-6
Ghani, N. A., Teo, P.-C., Ho, T. C. F., Choo, L. S., Kelana, B. W. Y., Adam, S., et al. (2022). Bibliometric analysis of global research trends on higher education internationalization using Scopus database: Towards sustainability of higher education institutions. Sustainability 14:8810. doi: 10.3390/su14148810
Gottschalk, F., and Weise, C. (2023). Digital Equity and Inclusion in Education. Paris: OECD. doi: 10.1787/7cb15030-en
Griffith, R. L., Wolfeld, L., Armon, B. K., Rios, J., and Liu, O. L. (2016). Assessing intercultural competence in higher education: Existing research and future directions. ETS Res. Rep. Ser. 2016, 1–44. doi: 10.1002/ets2.12112
Guillén-Yparrea, N., and Ramírez-Montoya, M. S. (2023). Intercultural competencies in higher education: A systematic review from 2016 to 2021. Cogent Educ. 10:2167360. doi: 10.1080/2331186X.2023.2167360
Guo-Brennan, L. (2022). Making virtual global learning transformative and inclusive: A critical reflective study on high-impact practices in higher education. J. Teach. Learn. 16, 28–49. doi: 10.22329/jtl.v16i2.6947
Gutiérrez-Santiuste, E., and Ritacco-Real, M. (2023). Intercultural communicative competence in higher education through telecollaboration: Typology and development. Educ. Inf. Technol. 28, 13885–13912. doi: 10.1007/s10639-023-11751-3
Hackett, S., Janssen, J., Beach, P., Perreault, M., Beelen, J., and van Tartwijk, J. (2023). The effectiveness of Collaborative Online International Learning (COIL) on intercultural competence development in higher education. Int. J. Educ. Technol. High. Educ. 20:5. doi: 10.1186/s41239-022-00373-3
Heymans, Y., Strosnider, C., Pool, J., and Jansen van Vuuren, M. (2024). Fostering intercultural competence through virtual exchange: Perspectives of undergraduate health students. Open Prax. 16, 119–129. doi: 10.55982/openpraxis.16.2.607
Howard, W., Perrotte, G., Lee, M., and Frisone, J. (2017). A formative case evaluation for the design of an online delivery model providing access to study abroad activities. Online Learn. 21, 115–134. doi: 10.24059/olj.v21i3.1234
Hsu, S.-Y. S., and Beasley, R. E. (2019). The effects of international email and Skype interactions on computer-mediated communication perceptions and attitudes and intercultural competence in Taiwanese students. Australas. J. Educ. Technol. 35, 149–162. doi: 10.14742/ajet.4209
Huber, J. (2014). Developing Intercultural Competence through Education. Strasbourg: Council of Europe.
Hutchins, D., and Goldstein Hode, M. (2021). Exploring faculty and staff development of cultural competence through communicative learning in an online diversity course. J. Divers. High. Educ. 14, 468–479. doi: 10.1037/dhe0000162
Kabir, R. S., and Sponseller, A. C. (2020). Interacting with competence: A validation study of the self-efficacy in intercultural communication scale-short form. Front. Psychol. 11:2086. doi: 10.3389/fpsyg.2020.02086
Kennedy, J., Dubreuil, J., Thibodeau, D., and Zachmeier, A. (2025). How to develop and deliver a virtual exchange based on Collaborative Online International Learning principles (COIL). Clin. Teach. 22:e70121. doi: 10.1111/tct.70121
Koester, J., and Olebe, M. (1988). The behavioral assessment scale for intercultural communication effectiveness. Int. J. Intercult. Relat. 12, 233–246. doi: 10.1016/0147-1767(88)90017-X
Kolb, D. A. (1984). Experiential learning: experience as the source of learning and development Vol. 1. Englewood Cliffs, NJ: Prentice-Hall.
Leung, K., Ang, S., and Tan, M. L. (2014). Intercultural competence. Ann. Rev. Org. Psychol. Org. Behav. 1, 489–519. doi: 10.1146/annurev-orgpsych-031413-091229
Levey, J. A. (2020). Teaching online graduate nursing students cultural diversity from an ethnic and nonethnic perspective. J. Transcult. Nurs. 31, 202–208. doi: 10.1177/1043659619868760
Lewis, T., and O’Dowd, R. (2016). “Online intercultural exchange and foreign language learning: A systematic review,” in Online intercultural exchange: Policy, pedagogy, practice, eds R. O’Dowd and T. Lewis (New York, NY: Routledge), 21–68.
Li, Y., Armstrong, A., and Krasny, M. (2024). MOOC teaching assistants’ global-engaged learning in the US and China. Online Learn. 28, 147–171. doi: 10.24059/olj.v28i4.4567
Lin, H.-L., Wang, Y.-C., Huang, M.-L., Yu, N.-W., Tang, I., Hsu, Y.-C., et al. (2024). Can virtual reality technology be used for empathy education in medical students: A randomized case-control study. BMC Med. Educ. 24:1254. doi: 10.1186/s12909-024-06009-6
Liu, X. (2025). Bridging cultures in virtual workplaces: A communication-focused review of global virtual teams. Bus. Prof. Commun. Q. 1–26. doi: 10.1177/23294906251327747
Liu, Y. (2023). Overview of the impact of collaborative online international learning on learners. SHS Web Conf. 157:04011. doi: 10.1051/shsconf/202315704011
Liu, Y., and Shirley, T. (2021). Without crossing a border: Exploring the impact of shifting study abroad online on students’ learning and intercultural competence development during the COVID-19 pandemic. Online Learn. 25, 182–194. doi: 10.24059/olj.v25i1.2471
Makransky, G., and Petersen, G. B. (2021). The Cognitive Affective Model of Immersive Learning (CAMIL): A theoretical research-based model of learning in immersive virtual reality. Educ. Psychol. Rev. 33, 937–958. doi: 10.1007/s10648-020-09586-2
Martin, F., Wu, T., Wan, L., and Xie, K. (2022). A meta-analysis on the community of inquiry presences and learning outcomes in online and blended learning environments. Online Learn. 26, 325–359. doi: 10.24059/olj.v26i1.2604
Min-Yu Lau, P., Woodward-Kron, R., Livesay, K., Elliott, K., and Nicholson, P. (2016). Cultural respect encompassing simulation training: Being heard about health through broadband. J. Public Health Res. 5:657. doi: 10.4081/jphr.2016.657
Mitchell, L.-M., and Suransky, C. (2024). Beyond assumptions: Rethinking intercultural competence development in North-South virtual exchanges. Perspect. Educ. 42, 115–133. doi: 10.38140/pie.v42i4.8100
Muñoz, K. E., Roma, M., and Almiñe-Catacutan, M. (2025). Enhancing tourism students’ intercultural skills with design thinking. Anatolia 36, 573–587. doi: 10.1080/13032917.2025.2451902
Novikova, I. A., Gridunova, M. V., Novikov, A. L., and Shlyakhta, D. A. (2022). Cognitive abilities and academic achievement as intercultural competence predictors in Russian school students. J. Intell. 10:25. doi: 10.3390/jintelligence10020025
Nowak, A., Gray, M., Omodara, D., and Gibson, L. (2023). Creating a sense of global community and belonging through collaborative online international learning. J. Teach. Learn. Technol. 12, 120–128. doi: 10.14434/jotlt.v12i1.36248
Oberste-Berghaus, N. (2024). The role of teaching foreign languages in developing intercultural competence. Rev. Rom. Pentru Educ. Multidimensionala 16, 1–15. doi: 10.18662/rrem/16.1/808
O’Dowd, R. (2018). From telecollaboration to virtual exchange: state-of-the-art and the role of UNICollaboration in moving forward. J. Virtual Exch. 1, 1–23. doi: 10.14705/rpnet.2018.jve.1
O’Dowd, R. (2021a). Virtual exchange: Moving forward into the next decade. Comput. Assist. Lang. Learn. 34, 209–224. doi: 10.1080/09588221.2021.1902201
O’Dowd, R. (2021b). What do students learn in virtual exchange? A qualitative content analysis of learning outcomes across multiple exchanges. Int. J. Educ. Res. 109:101804. doi: 10.1016/j.ijer.2021.101804
Pawson, R., and Tilley, N. (1997). “An introduction to scientific realist evaluation,” in Evaluation for the 21st century: A handbook, eds E. Chelimsky and W. R. Shadish (Thousand Oaks, CA: Sage Publications, Inc.), 405–418. doi: 10.4135/9781483348896.n29
Pettigrew, T. F., and Tropp, L. R. (2006). A meta-analytic test of intergroup contact theory. J. Pers. Soc. Psychol. 90, 751–783. doi: 10.1037/0022-3514.90.5.751
Pettigrew, T. F., Tropp, L. R., Wagner, U., and Christ, O. (2011). Recent advances in intergroup contact theory. Int. J. Intercult. Relat. 35, 271–280. doi: 10.1016/j.ijintrel.2011.03.001
Poce, A. (2020). A massive open online course designed to support the development of virtual mobility transversal skills: Preliminary evaluation results from European participants. Educ. Cult. Psychol. Stud. 21, 255–273. doi: 10.7358/ecps-2020-021-poce
Rai, L., Deng, C., Lin, S., and Fan, L. (2023). Massive open online courses and intercultural competence: Analysis of courses fostering soft skills through language learning. Front. Psychol. 14:1219478. doi: 10.3389/fpsyg.2023.1219478
Ricardo Barreto, C., Llinas Solano, H., Medina Rivilla, A., Cacheiro, M. L., Villegas Mendoza, A., Lafaurie, A., et al. (2022). Teachers’ perceptions of culturally appropriate pedagogical strategies in virtual learning environments: A study in Colombia. Turk. Online J. Distance Educ. 23, 113–130. doi: 10.17718/tojde.1050372
Sarzhanova, D., Shokparov, A., Akeshova, M., and Rizakhojayeva, G. (2025). Do web-quest technologies enhance socio-cultural competence in a language learning environment? Int. J. Innov. Res. Sci. Stud. 8, 2129–2138. doi: 10.53894/ijirss.v8i2.5642
Sevilla-Pavón, A. (2019). L1 versus L2 online intercultural exchanges for the development of 21st century competences: The students’ perspective. Br. J. Educ. Technol. 50, 779–805. doi: 10.1111/bjet.12602
Shadiev, R., Xuan, C., Wayan, S., Fahriye, A., Yan, L., Nurassyl, K., et al. (2025). Facilitating cross cultural competence of students in an interactive VR learning environment. Educ. Technol. Soc. 28, 78–108. doi: 10.30191/ets.202501_28(1).rp05
Spitzberg, B. H., and Changnon, G. (2009). “Conceptualizing intercultural competence,” in The SAGE Handbook of Intercultural Competence, Ed. D. K. Deardorff (Thousand Oaks, CA: Sage), 2–52.
Swartz, S., and Shrivastava, A. (2022). Stepping up the game–meeting the needs of global business through virtual team projects. High. Educ. Skills Work Based Learn. 12, 346–368. doi: 10.1108/HESWBL-02-2021-0037
Sykes, J. M. (2017). “Technologies for teaching and learning intercultural competence and interlanguage pragmatics,” in The Handbook of Technology and Second Language Teaching and Learning, eds C. A. Chapelle and S. Sauro (Hoboken, NJ: Wiley), 118–133. doi: 10.1002/9781118914069.ch9
Trinh, A.-H., and Dinh, H. (2024). Language and home-culture integrated online learning curriculum for developing intercultural communicative competence. J. Multicult. Educ. 18, 38–52. doi: 10.1108/JME-09-2023-0097
UNESCO (2024). Global Education Monitoring Report, 2024/5, Leadership in Education: Lead for Learning. Paris: GEM Report UNESCO. doi: 10.54676/EFLH5184
van der Zee, K., van Oudenhoven, J. P., Ponterotto, J. G., and Fietzer, A. W. (2013). Multicultural personality questionnaire: Development of a short form. J. Pers. Assess. 95, 118–124. doi: 10.1080/00223891.2012.718302
Versnik, N. (2023). Creating a sense of global community and belonging through collaborative online international learning. J. Teach. Learn. Technol. 12, 120–128. doi: 10.14434/jotlt.v12i1.36248
Wang, J., Li, Q., Cui, J., Tu, S., Deng, Z., Yang, R., et al. (2023). Effectiveness of virtual reality on the caregiving competence and empathy of caregivers for elderly with chronic diseases: A systematic review and meta-analysis. J. Nurs. Manag. 2023:5449955. doi: 10.1155/2023/5449955
Wang, M. S., Yang, L.-Z., and Chen, T. C. (2023). The effectiveness of ICT-enhanced learning on raising intercultural competencies and class interaction in a hospitality course. Interact. Learn. Environ. 31, 994–1006. doi: 10.1080/10494820.2020.1815223
West, H., Goto, K., Borja, S. A. N., Trechter, S., and Klobodu, S. (2024). Evaluation of a Collaborative Online International Learning (COIL): A food product analysis and development project. Food Cult. Soc. 27, 152–173. doi: 10.1080/15528014.2022.2069441
Wong, C. A., Cummings, G. G., and Ducharme, L. (2013). The relationship between nursing leadership and patient outcomes: A systematic review update. J. Nurs. Manag. 21, 709–724. doi: 10.1111/jonm.12116
Yang, C., Popov, V., Zhao, Y., and Biemans, H. J. A. (2025). Intercultural communication competence of Chinese students with limited intercultural experience. J. Cult. Cogn. Sci. 9, 373–391. doi: 10.1007/s41809-025-00182-w
Zhan, H., Cheng, K. M., Wijaya, L., and Zhang, S. (2024). Investigating the mediating role of self-efficacy between digital leadership capability, intercultural competence, and employability among working undergraduates. High. Educ. Skills Work Based Learn. 14, 796–820. doi: 10.1108/HESWBL-02-2024-0032
Zhang, Q., Mohammad Ismail, M. I. R., and Bin Zakaria, A. R. (2025). Enhancing intercultural competence in technical higher education through AI-driven frameworks. Sci. Rep. 15:22019. doi: 10.1038/s41598-025-03303-1
Zhao, M., Kuan, G., Chau, V. H., and Kueh, Y. C. (2024). Validation and measurement invariance of the Chinese version of the academic self-efficacy scale for university students. PeerJ 12:e17798. doi: 10.7717/peerj.17798
Zhou, R., Samad, A., and Perinpasingam, T. (2024). A systematic review of cross cultural communicative competence in EFL teaching: Insights from China. Humanit. Soc. Sci. Commun. 11:1750. doi: 10.1057/s41599-024-04071-5
Keywords: digital technologies, equity and access, higher education, intercultural competence, systematic review, virtual exchange
Citation: Naranjo-Toro M, Basantes-Andrade A, Meneses M and Guerra J (2026) Digital approaches to intercultural competence in higher education—a PRISMA systematic review with bibliometric and SWiM evidence. Front. Educ. 11:1725080. doi: 10.3389/feduc.2026.1725080
Received: 14 October 2025; Revised: 31 December 2025; Accepted: 05 January 2026;
Published: 30 January 2026.
Edited by:
Maria Laura Angelini, Catholic University of Valencia San Vicente Mártir, Spain
Reviewed by:
Arash Javadinejad, Catholic University of Valencia San Vicente Mártir, Spain
Jody Siker, Northeastern Illinois University, United States
Copyright © 2026 Naranjo-Toro, Basantes-Andrade, Meneses and Guerra. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Julio Guerra, jeguerra@utn.edu.ec