- 1 Institute of Pharmaceutical Technology, Oswaldo Cruz Foundation, Rio de Janeiro, Brazil
- 2 Institute for Science, Innovation and Technology in Industry 4.0/INCITE INDUSTRIA 4.0, SENAI CIMATEC University, Salvador, Brazil
- 3 Construction Department, Federal Institute of Bahia, Salvador, Brazil
- 4 PGGDDesign, Federal University of Paraná, Curitiba, Brazil
- 5 College of Engineering, University of Florida, Gainesville, FL, United States
- 6 Department of Mechanical and Industrial Engineering (DEMI), NOVA School of Science and Technology (FCT), NOVA University of Lisbon, Caparica, Portugal
- 7 Advanced Knowledge Center for Immersive Technologies/AKCIT, Salvador, Brazil
Evaluating industrial safety training in high-risk environments remains a methodological challenge, especially in sectors such as construction and mining, where the reliable measurement of knowledge transfer and behavioral change is limited. This study aims to identify the key attributes required for a model that leverages immersive technologies to evaluate safety training in high-risk scenarios, thereby advancing occupational safety research and informing the design of industrial training programs. A PRISMA-guided systematic review of 37 peer-reviewed studies (2021–2025) was conducted across Scopus, ScienceDirect and Web of Science, ensuring transparency and reproducibility. The analysis identified methodological patterns, technological features and research gaps. Most studies addressed immediate outcomes (reaction and learning, corresponding to Levels 1 and 2 of Kirkpatrick’s model), while evidence on behavioral change (Level 3) and organizational impact (Level 4) is scarce. Evaluation strategies are also fragmented, with a predominance of self-report questionnaires and limited use of biometric or performance-based metrics. Emerging approaches combine multimodal Human Action Recognition (HAR), biometric sensing (eye-tracking, EEG, heart rate), and behavioral analytics to enable real-time, performance-based assessment. Adaptive, AI-driven and gamified environments are also gaining relevance, by combining biometric feedback with behavioral data to detect and interpret user actions in real time. By consolidating these attributes, this review delineates the essential components of an immersive evaluation framework that advances methodological rigor and supports safer, human-centered industrial training aligned with Industry 4.0 and Society 5.0.
Systematic Review Registration: https://doi.org/10.17605/OSF.IO/EKWZ9.
1 Introduction
Global pressures for innovation and competitiveness have intensified the demand for a highly skilled workforce, particularly in sectors characterized by high investment, risk, regulation, and rapid technological transformation. Industrial training plays a crucial role in promoting health and safety and also in preventing the high costs that arise when workers fail to recall procedural tasks (Radhakrishnan et al., 2021). Consequently, effective training and rigorous evaluation are essential to ensure that industries achieve their strategic objectives, especially as work processes and environments evolve at an unprecedented pace. The survival and sustainability of industries increasingly depend on the continuous development of workforce competencies (Doolani et al., 2020a).
Within this context, the United Nations Sustainable Development Goals (SDGs), notably SDG 8, emphasize the need to balance economic development with social responsibility, promoting safe and healthy working environments for all, including vulnerable and precarious workers (International Labour Organization - ILO, 2023). Despite these global commitments, the statistics remain alarming: In 2019, work accidents resulted in 330,000 deaths worldwide (ILO, 2023). In Brazil, for example, a worker dies every 3 h and 47 min due to occupational accidents (TST, 2022), with more than 6.7 million workplace accidents and 25,492 deaths reported between 2012 and 2022 (SMARTLAB, 2023). The financial impact is equally severe, with billions lost annually in compensation and productivity, and over 46,000 workdays lost due to accident-related absences.
Multiple factors contribute to this scenario, including the inherent nature of industrial work and insufficient or inadequate training, which is an essential component for improving Occupational Health and Safety performance (Babalola et al., 2023). High injury rates in the construction industry, in particular, are strongly associated with inadequate hazard recognition, with workers failing to identify safety hazards in typical work environments. Accidents in construction activities have been attributed to multiple causes; in particular, they often result from shortcomings in proactive and preventive strategies, such as workforce training, systematic hazard identification and control, and initiatives to strengthen safety awareness (Li et al., 2018; Leite et al., 2022). This deficiency substantially increases the likelihood of unintended exposure, injuries, and severe incidents, particularly those involving falls, caught-in/between accidents, struck-by events, and electrocution (Albert et al., 2020). Addressing these critical gaps in hazard recognition and safety performance requires innovative approaches to training and evaluation that go beyond traditional methods.
In response to these evolving demands, emerging technologies have gained traction as powerful enablers of workforce development (Akdere et al., 2022). Immersive approaches, especially virtual reality (VR) and augmented reality (AR), are transforming industrial training by introducing interactive, scenario-based learning (Almeida et al., 2023; Stefan et al., 2023a). Extended reality (XR), encompassing VR, AR, and mixed reality, is reshaping occupational safety and health (OSH) by enabling close-to-reality training, enhancing hazard identification, and strengthening human-system interaction, thus representing a strategic investment in accident prevention (Soyka et al., 2025; Personeni and Savescu, 2023). Artificial intelligence (AI) further complements these tools by enhancing adaptability and assessment capabilities, enabling real-time feedback and personalized learning experiences (Pagano et al., 2022).
Additionally, beyond learning applications, immersive system architectures increasingly combine extended reality (XR) modalities with interconnected digital ecosystems (Yang et al., 2025; Fernández-Caramés and Fraga-Lamas, 2024). Current developments include multi-sensor VR and AR environments that integrate real-time data from wearables, motion-capture systems, and IoT devices into digital twin models of industrial assets (Tu et al., 2023; Fernández-Caramés and Fraga-Lamas, 2024). These architectures enable bidirectional data exchange between virtual and physical spaces (Yang et al., 2025), providing contextualized feedback for both trainees and instructors. Such integration forms the technological substrate that supports immersive evaluation, where digital replicas of workplace conditions facilitate continuous performance tracking and adaptive learning analytics (Tu et al., 2023).
Virtual reality, in particular, allows the creation of realistic and interactive environments where workers can face and respond to simulated hazards as if they were real (Jerald, 2015). Such experiences strengthen engagement and learning (Abich et al., 2021; Freina and Ott, 2015) and underscore the pressing need for standardized, evidence-based evaluation methods, especially in high-risk occupations where mistakes may entail severe human, environmental, and financial consequences.
This makes it particularly concerning that, despite their increasing adoption, immersive technologies are still frequently implemented without clear evaluation protocols or standardized training materials and methods, restricting the comparability of findings across studies and industrial sectors (Scorgie et al., 2024; Gürer et al., 2023). In other words, despite the increasing use of immersive technologies in safety training across high-risk industries, robust evidence on long-term behavioral change and organizational-level impact remains scarce.
For example, the literature shows that training programs for work at heights are largely concentrated within the construction sector and frequently lack a multidimensional structure that integrates knowledge, practical skills, and attitudinal components. The evaluation of their effectiveness is likewise often fragmented, with outcome criteria assessed in isolation rather than holistically. Only a limited number of studies adopt a more comprehensive approach, simultaneously addressing cognitive, instructional, and experiential dimensions within the context of virtual safety training (Rey-Becerra et al., 2021).
Ultimately, evaluation remains a critical challenge (Cordeiro et al., 2025). Inadequate measurement of training programs can lead workers to overestimate their abilities, undermining both safety outcomes and organizational objectives. This concern is echoed in international frameworks such as the ILO Guidelines on Occupational Safety and Health Management Systems (ILO-OSH, 2001), which highlight the importance of competence-based training and the systematic evaluation of comprehension, retention, and performance. Moreover, ILO Convention 187 underscores the need for a preventative safety culture grounded in education, consultation, and continual improvement.
In alignment with Sustainable Development Goal 8.8 (UN, 2015), the present research addresses the imperative to “promote safe and secure working environments for all workers,” particularly those in hazardous or precarious conditions. Industrial sectors with high investment and risk profiles are especially dependent on effective training systems to prevent accidents, protect worker rights, and ensure operational sustainability.
Building upon this context, this study is part of an ongoing research initiative that adopts Kirkpatrick’s four-level model (reaction, learning, behavior, and results) as its primary reference for training evaluation, with particular emphasis on the third level: behavioral change in the workplace (Kirkpatrick and Kirkpatrick, 2016). At this stage of the research, we aim to identify essential components for immersive assessment of safety training in high-risk environments, leveraging storytelling (Doolani et al., 2020b), gamification (Jacobsen et al., 2022; Ulmer et al., 2022), and technologies such as eye-tracking (Çomu et al., 2023), biofeedback (Zhang Z. et al., 2025), and scenario fidelity (Luo et al., 2023). These elements have the potential to enhance immersion and enable the capture and recognition of worker actions and reactions, providing multimodal evidence of how safety behaviors are enacted within immersive scenarios. This aligns with broader human action recognition (HAR) frameworks that integrate sensor, physiological, and behavioral data (Morshed et al., 2023).
Although some studies have explored immersive training in high-risk contexts such as fire safety (Kwegyir-Afful, 2022), work at heights (Rokooei et al., 2023), and confined space entry (Evangelista et al., 2025), they remain narrow in focus, with workplace behavioral change representing a dimension that is only beginning to be addressed.
Immersive technologies enhance learning experiences while creating new opportunities for evaluation through the capture of multimodal behavioral and physiological data during training (Evangelista et al., 2025; Lu et al., 2022; Jacobsen et al., 2022). Within virtual environments, user interactions such as gaze patterns, task completion accuracy, physiological responses, and reaction times can serve as objective indicators of engagement, comprehension, and skill application (Çomu et al., 2023; Jelonek et al., 2022; Stefan et al., 2023b). These measurable data streams enable the assessment of outcomes aligned with the four levels of Kirkpatrick’s model, connecting user experience (reaction), knowledge acquisition (learning), behavioral transfer (behavior), and potential organizational benefits (results). In parallel, researchers and instructors can observe and interact with participants during immersive sessions, documenting behavioral patterns, decision-making processes, and adaptive responses (Joshi et al., 2021; Pireddu et al., 2025; Gürer et al., 2023). Such observation complements sensor-based metrics, offering qualitative insights into performance and enabling a more comprehensive understanding of learning and behavioral change (Rey-Becerra et al., 2023). This perspective positions immersive systems as integrated evaluation platforms that generate quantifiable and contextualized evidence of safety performance and learning effectiveness, establishing the theoretical bridge between immersive learning and evaluative performance.
This study aims to identify the key attributes required for a model that leverages immersive technologies to evaluate safety training in high-risk scenarios, directly supporting SDG 8.8 and ILO OSH frameworks by presenting evidence to support research and development of guidelines for safety-critical industries. Moreover, this study will also inform both policy and practice, thereby supporting regulatory frameworks and organizational strategies aligned with the principles of Society 5.0, which emphasize placing human beings at the center of industrial processes. The findings are intended to be broadly applicable across advanced industries in both the public and private sectors.
The remainder of this paper is structured as follows: Section 2 outlines the materials and methods employed in the study; Section 3 presents the results alongside a comprehensive analysis; and Section 4 offers concluding remarks and proposes directions for future research.
2 Materials and methods
This systematic literature review employed a qualitative approach to synthesize the existing knowledge and highlight the core issues in the field (Creswell and Creswell, 2017). As an exploratory study, it addresses a topic with limited prior investigation: the lack of systematic evaluation methods for industrial safety training that incorporates immersive technologies. Since this concept requires deeper understanding and conceptual clarification, a qualitative approach is appropriate, especially when the relevant variables are not yet well defined (Creswell and Creswell, 2017).
This systematic literature review was conducted in accordance with the PRISMA 2020 guidelines (Page et al., 2021), ensuring methodological transparency and reproducibility. It followed a process comprising the following seven steps: planning, defining the scope, searching the published research, assessing the evidence base, synthesizing, analyzing, and writing (Booth et al., 2021).
Following the approach of Chamusca et al. (2023), the initial search strategy was defined by a senior professional in virtual reality-based industrial training, ensuring alignment with the practical needs and evaluation challenges commonly found in high-risk industrial environments. This preliminary strategy was then independently reviewed by two senior researchers on immersive technologies for industrial innovation and subsequently peer-debriefed by another senior researcher. These four experts independently evaluated the scope and inclusion/exclusion criteria, and participated in the stages of consensus building. Qualitative research is interpretative by nature (Creswell and Creswell, 2017); therefore, the relevance and rigor of this study are reinforced by the authors’ strong background in the field, which is described in Table 1.
To ensure methodological rigor, the study employed a Delphi-inspired validation process. The process comprised the following three stages:
1. Independent Assessment: each specialist evaluated the draft protocol, including the research questions, scope definition, search strings, inclusion/exclusion criteria, and screening process;
2. Synthesis and Reliability Analysis: the specialists were then provided with anonymized summaries of all feedback, highlighting convergences and divergences in interpretation, which allowed them to reflect on differing perspectives and prepare for consensus refinement;
3. Consensus Refinement: the specialists reviewed the revised protocol to confirm shared agreement, and only items on which all specialists clearly agreed were retained.
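The retention rule in the third stage (keep an item only when all specialists agree) can be illustrated with a minimal sketch. The ballot values and the item named "hypothetical extra criterion" are illustrative assumptions, not data from this review, and the function name `retain_unanimous` is introduced here only for the example.

```python
from typing import Dict, List


def retain_unanimous(votes: Dict[str, List[bool]]) -> List[str]:
    """Keep only protocol items that every specialist endorsed."""
    return [item for item, ballots in votes.items() if ballots and all(ballots)]


# Hypothetical ballots from four specialists (illustrative values only).
example_votes = {
    "research questions": [True, True, True, True],
    "search strings": [True, True, True, True],
    "hypothetical extra criterion": [True, False, True, True],
}

retained = retain_unanimous(example_votes)
print(retained)
```

Under this unanimity rule a single dissenting ballot is enough to drop an item, which is stricter than the percentage-agreement thresholds sometimes used in Delphi studies.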
The search strategy and resulting protocol are registered on the Open Science Framework (https://doi.org/10.17605/OSF.IO/EKWZ9, accessed 16 November 2025) and are detailed in the following sections.
2.1 Planning
The literature search was performed in three comprehensive scientific databases: Web of Science, ScienceDirect and Scopus. These platforms were chosen for their broad coverage and international peer-reviewed publications in engineering, technology, industry, and occupational health and safety, fields directly relevant to immersive technologies and industrial safety training.
2.1.1 Scope definition
The definition of the research scope is intrinsically connected to the formulation of clear and well-structured research questions, which guide the direction and boundaries of the investigation (Creswell and Creswell, 2017). For the current research two questions were formulated:
• Q1: What are the opportunities and challenges for applying immersive technologies to evaluate the effectiveness of safety training in high-risk environments?
• Q2: What attributes are required for a model that leverages immersive technologies to evaluate safety training in high-risk scenarios?
2.2 Literature research
The search strategy was carefully constructed to capture studies at the intersection of virtual reality, safety training, evaluation, and high-risk industrial environments. For Web of Science and Scopus, the following string was used: ((“virtual reality” OR VR) AND (“safety training” OR “occupational safety”) AND (evaluation OR effectiveness OR assessment OR impact OR Kirkpatrick) AND (“high-risk industry” OR “hazardous industry” OR “dangerous industry” OR “industrial accidents” OR accidents)). For ScienceDirect, the string was adapted to (“virtual reality”) AND (“safety training” OR “occupational safety”) AND (evaluation OR effectiveness OR Kirkpatrick) AND (“high-risk industry” OR “hazardous industry” OR “industrial accidents”), respecting the database’s Boolean operator constraints. Only articles published in English between 2021 and 2025 were considered. This approach ensured comprehensive coverage of recent empirical and theoretical contributions relevant to the intersection of VR, safety training, and evaluation.
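The structure of the Web of Science and Scopus string (synonyms joined by OR within each concept group, groups joined by AND) can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the helper `build_query` is hypothetical, straight quotation marks are used, and no database-specific field tags are added because the protocol text does not specify them.

```python
from typing import List


def build_query(groups: List[List[str]]) -> str:
    """Join synonyms with OR inside each group and join groups with AND."""

    def fmt(term: str) -> str:
        # Multi-word phrases are quoted so they are searched as exact phrases.
        return f'"{term}"' if " " in term else term

    clauses = ["(" + " OR ".join(fmt(t) for t in group) + ")" for group in groups]
    return "(" + " AND ".join(clauses) + ")"


# Concept groups taken from the search string reported above.
concept_groups = [
    ["virtual reality", "VR"],
    ["safety training", "occupational safety"],
    ["evaluation", "effectiveness", "assessment", "impact", "Kirkpatrick"],
    ["high-risk industry", "hazardous industry", "dangerous industry",
     "industrial accidents", "accidents"],
]

query = build_query(concept_groups)
print(query)
```

Keeping the concept groups as data makes it straightforward to derive a shortened variant, such as the ScienceDirect adaptation, by removing terms from the lists rather than editing the string by hand.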
2.2.1 Assessing the evidence base
To identify the most relevant publications, Inclusion Criteria (IC) and Exclusion Criteria (EC) were defined to guide the screening process:
• IC1: Immersive technologies.
• IC2: Industrial, risky, hazardous scenarios, processes or operations.
• IC3: Safety training.
• EC1: Exclusion of papers not written in English.
• EC2: Exclusion of papers published before 2021.
• EC3: Exclusion of papers not related to the industrial domains, such as Medicine or Education.
These inclusion criteria ensured the selection of studies directly relevant to immersive safety training in high-risk industrial contexts. The exclusion criteria focused the review on recent, domain-specific studies aligned with industrial safety training applications, supporting methodological transparency and alignment with the PRISMA 2020 guidelines (Page et al., 2021).
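As a minimal sketch, the six criteria can be expressed as explicit checks on a coded record. In the review itself screening was performed manually by reviewers; the `Record` fields, the `failed_criteria` helper, and the example values below are hypothetical and serve only to show how IC1 to IC3 and EC1 to EC3 combine.

```python
from dataclasses import dataclass
from typing import List

# Non-industrial domains named as examples in EC3.
NON_INDUSTRIAL = {"medicine", "education"}


@dataclass
class Record:
    immersive_tech: bool        # IC1: uses immersive technologies
    hazardous_industrial: bool  # IC2: industrial, risky or hazardous scenario
    safety_training: bool       # IC3: addresses safety training
    language: str               # EC1: must be English
    year: int                   # EC2: must not be earlier than 2021
    domain: str                 # EC3: must be an industrial domain


def failed_criteria(r: Record) -> List[str]:
    """Return the labels of all criteria the record does not satisfy."""
    checks = [
        ("IC1", r.immersive_tech),
        ("IC2", r.hazardous_industrial),
        ("IC3", r.safety_training),
        ("EC1", r.language.lower() == "english"),
        ("EC2", r.year >= 2021),
        ("EC3", r.domain.lower() not in NON_INDUSTRIAL),
    ]
    return [label for label, ok in checks if not ok]


# Hypothetical records, for illustration only.
eligible = Record(True, True, True, "English", 2023, "construction")
ineligible = Record(True, False, True, "English", 2019, "medicine")
print(failed_criteria(eligible), failed_criteria(ineligible))
```

Returning the list of failed criteria, rather than a single yes/no flag, keeps the reason for each exclusion traceable, which is what a PRISMA flow diagram reports.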
The literature search was conducted in July 2025, yielding an initial set of 218 records, which were exported to the Rayyan platform (Ouzzani et al., 2016) for management and screening. After automated identification and removal of 29 duplicates, 189 unique records remained. A filter was then applied to retain only journal articles, excluding four books and resulting in 185 records. Subsequently, a publication date filter was used, limiting the dataset to articles published from 2021 onward, which resulted in 157 articles for further screening.
Screening was conducted in several stages. Initially, two independent reviewers screened the titles of all records, excluding 49 articles that did not meet the inclusion criteria. The remaining 108 articles underwent abstract screening, again performed independently by both reviewers, leading to the exclusion of 51 articles. The 57 articles retained were then subjected to full-text review, resulting in the exclusion of 20 studies that did not fulfill the eligibility requirements. Ultimately, 37 articles were included in the final synthesis. Throughout the process, any disagreements between reviewers were resolved through discussion and consensus. Apart from the deduplication step in Rayyan, no automation or artificial intelligence tools were used in the screening or eligibility assessment.
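The record counts reported for identification and screening can be checked with a short arithmetic sketch. All figures are taken from the text above except the 28 records removed by the publication-date filter, which is not stated explicitly and is derived here as 185 − 157.

```python
identified = 218  # records returned by the database search

# (stage label, records removed at that stage)
steps = [
    ("duplicates removed", 29),
    ("books removed by the journal-article filter", 4),
    ("removed by the publication-date filter (derived: 185 - 157)", 28),
    ("excluded at title screening", 49),
    ("excluded at abstract screening", 51),
    ("excluded at full-text review", 20),
]

remaining = identified
trail = []
for label, removed in steps:
    remaining -= removed
    trail.append(remaining)
    print(f"{label}: -{removed} -> {remaining} remaining")

included = trail[-1]
```

The running totals reproduce the intermediate counts reported in the text (189, 185, 157, 108, 57) and the final corpus of 37 studies.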
2.2.2 Synthesizing and analyzing
Data extraction was performed using a standardized form, collecting information on authorship, year, publication source, study design, sample characteristics, evaluation methods, and behavioral metrics, especially those linked to Kirkpatrick’s Model (Kirkpatrick and Kirkpatrick, 2016). The methodological quality of each study was assessed using established criteria, considering reporting rigor, external and internal validity, and relevance (Downs and Black, 1998).
The findings will directly inform the subsequent research phases, including the design, demonstration, and evaluation of a model for immersive assessment of worker safety performance in high-risk industrial environments. The entire selection process is illustrated in the PRISMA flow diagram (Figure 1).
3 Results and discussion
Following the PRISMA 2020 protocol, this review identified 37 peer-reviewed studies that met the inclusion and exclusion criteria. This section presents the corpus with authorship, publication year, and source. Table 2 lists the studies chronologically and, within each year, alphabetically by first author. More comprehensive information, including the abstract and a brief summary of each paper, can be found in the Supplementary Material.
The analysis compares methodological patterns, sectoral emphases, and emerging evaluation trends. It also reviews how training outcomes are measured (self-report, performance, or physiological indicators) and how these measures align with frameworks such as Kirkpatrick’s model.
Combining comprehensive documentation with a thematic synthesis supports transparency, reproducibility, and critical interpretation of the evidence. The discussion maps the current knowledge base, highlights advances, and identifies research gaps and methodological opportunities for immersive evaluation of industrial safety training.
A review of the 37-article corpus reveals a range of methodological strategies employed in research on immersive safety training. Experimental and laboratory-based studies are prevalent, leveraging controlled environments to evaluate the effects of immersive interventions on hazard recognition, knowledge retention, and behavioral responses. For example, Zhang Z. et al. (2025) utilize VR simulations to assess cognitive processing and performance in hazard recognition. These experimental designs offer strong internal validity and are well suited for isolating the effects of specific variables, though their findings may not always fully capture the complexities of real-world industrial practice.
Fewer studies examine field or longitudinal contexts, yet these provide valuable insight into the long-term impact of immersive technologies in authentic settings. Studies such as Pireddu et al. (2025) highlight the benefits of real-world validation, including the monitoring of skill retention. Although only a small proportion of the literature adopts this approach, these contributions are important for understanding how immersive training translates into workplace safety outcomes. In the context of hazardous activities, effective skill transfer becomes critically important, as errors may stem from knowledge gaps or unsafe behavior.
Systematic and comprehensive reviews, including those by Babalola et al. (2023) and Dodoo et al. (2025), play a key role in synthesizing evidence and identifying research gaps. By employing structured protocols such as PRISMA, these reviews help contextualize individual findings within the broader field and point to areas where further research is warranted (Page et al., 2021).
Some studies focus on design science and the development of frameworks or guidelines. For instance, Longo et al. (2023) and Garcia Fracaro et al. (2021) propose conceptual models and evidence-based design principles for immersive training, contributing to the theoretical and practical advancement of the field. Comparative and user-centered studies, such as those by Çomu et al. (2023) and Rokooei et al. (2023), emphasize the importance of user experience and the relative effectiveness of different training modalities and immersive technologies as supplementary training tools, reflecting a growing interest in human factors.
A substantial proportion of the corpus is dedicated to the construction sector, which is among the areas with the most serious accidents worldwide, reinforcing this industry’s priority status in immersive safety research (e.g., Babalola et al., 2023; Man et al., 2024; Lopez et al., 2025; Shayesteh et al., 2023). Additional studies cover electrical safety (Stefan et al., 2024) and the precast concrete industry (Joshi et al., 2021), but these topics remain less commonly addressed.
Immersive virtual reality platforms constitute the primary technological approach throughout almost all studies in this corpus, reflecting a consistent focus on their application to workplace safety training (Babalola et al., 2023; Chan et al., 2023; Garcia Fracaro et al., 2022). While many interventions rely on VR as a standalone tool, a small subset of articles report the integration of additional elements such as haptic feedback and physiological sensing technologies, which aim to further enhance engagement and training realism (Lopez et al., 2025; Shayesteh et al., 2023). These advanced interfaces, however, remain relatively uncommon across the reviewed literature.
Additionally, a subset of articles maps technologies or assessment tools, providing descriptive overviews that are useful for practitioners and researchers alike (Damilos et al., 2024; Jelonek et al., 2022). While these studies may not always introduce new theoretical frameworks, they contribute to the practical understanding of available resources and their integration into safety training programs. This mapping effort is particularly valuable given the rapid pace of technological development and the absence of consolidated taxonomies or repositories for immersive safety training tools (Garcia Fracaro et al., 2022).
Regarding target populations, the studies span a broad range of worker profiles, including those operating in confined spaces, at heights, in electrical hazard environments, and in specialized construction contexts (Lu et al., 2022; Evangelista et al., 2025; Rey-Becerra et al., 2023; Pětvaldský et al., 2025; Joshi et al., 2021; Stefan et al., 2024). Collectively, these works demonstrate the field’s commitment to both technological innovation and diversity of application, even as immersive VR training scenarios remain the dominant paradigm.
The reviewed studies also adopted a wide range of evaluation strategies, tailored to their objectives. A strong concentration was observed in subjective evaluation methods such as Likert-type self-assessment questionnaires. Objective measures such as task performance, error rates, physiological responses (e.g., EEG), and knowledge retention were used less frequently. Table 3 illustrates the frequency of each evaluation type:
The frequencies reported in Table 3 exceed the total number of studies in the corpus because many articles employed more than one evaluation method. Thus, a single study may appear in multiple categories. Additionally, the categories and assignments in Tables 2–4 were established by the review authors for analytical purposes and may not correspond to the original intentions or frameworks of the primary studies. Nevertheless, this classification enables a structured analysis of methodological trends and supports a more granular discussion on methodological rigor and practical relevance in the field of study. Figure 2 represents the distribution visually.
Among the 37 studies reviewed, self-report questionnaires were the most frequently employed evaluation method, appearing in 24 articles (64.9%). These instruments included surveys, Likert scales, and structured interviews to capture participants’ perceptions of experience, satisfaction, and learning. This reliance on self-report measures underscores a prevailing trend in the literature, where user and trainee perspectives are prioritized as primary indicators of intervention impact. Moreover, qualitative approaches such as interviews and open-ended feedback remain valuable, as they can provide richer insights into participants’ experiences, contextual factors, and behavioral changes that may not be fully captured by standardized metrics. Together, these complementary methods highlight the importance of integrating both quantitative and qualitative evidence to strengthen the evaluation of immersive training interventions.
The assessment of task performance was reported in 15 studies (40.5%). In these cases, researchers evaluated participants’ skills or behavioral accuracy in either virtual or real-world safety tasks, providing more quantifiable evidence of learning transfer and applied competence. These metrics serve to complement subjective reports by directly examining trainees’ ability to enact prescribed safety behaviors in realistic or semi-realistic scenarios.
Knowledge retention was specifically addressed in eight studies (21.6%), each of which incorporated pre- and post-intervention testing or delayed follow-up evaluations to determine the durability of learning gains over time. These studies offer critical insight into the lasting effectiveness of immersive safety training, extending beyond immediate post-training impressions.
Finally, physiological metrics, including EEG, biometric sensors, and eye-tracking, were utilized in four studies (10.8%). Although less common, these methodologies represent a growing area of methodological innovation, providing objective physiological indices of participant engagement, attention, and stress responses. The current rarity of such approaches within the corpus highlights both the promise and the nascent state of this emerging evaluation paradigm in immersive safety training research. A notable finding is how few studies align with Kirkpatrick’s Level 3 by assessing behavioral change or adopt comprehensive multi-level frameworks, with only one study (2.7%) assessing long-term organizational impact (Level 4).
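The shares reported above for the four evaluation methods can be recomputed from the study counts, as in the following minimal sketch. The counts are those given in the text; the variable names are illustrative. Because a single study may use several methods, the counts are not mutually exclusive and their sum exceeds the corpus size.

```python
TOTAL_STUDIES = 37

# Number of studies using each evaluation method, as reported in the text.
counts = {
    "self-report questionnaires": 24,
    "task performance": 15,
    "knowledge retention": 8,
    "physiological metrics": 4,
}

# Share of the corpus using each method, in percent, rounded to one decimal.
shares = {method: round(100 * n / TOTAL_STUDIES, 1) for method, n in counts.items()}

for method, pct in shares.items():
    print(f"{method}: {counts[method]}/{TOTAL_STUDIES} = {pct}%")

# Studies can appear in more than one category, so the assignments sum to
# more than the number of studies.
total_assignments = sum(counts.values())
print(f"total category assignments: {total_assignments}")
```

The same calculation gives 2.7% for the single study that assessed organizational impact (1 of 37).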
This distribution highlights an opportunity for methodological enhancement. While many studies rely on self-reported data, there is growing potential to enrich evaluation strategies through integrated approaches that combine subjective experience with behavioral performance and physiological indicators. These findings reveal how evaluation strategies are distributed across the literature and provide an opportunity to examine why certain approaches prevail and how they shape the scope and reliability of current evidence.
The predominance of self-report questionnaires observed in this corpus can be understood in light of their accessibility, low cost, and procedural simplicity, which make them particularly appealing for exploratory or short-term studies. These instruments, however, rely on participants’ perceptions rather than objective indicators of performance, introducing potential bias and limiting the interpretability of training effectiveness. This trend mirrors a broader challenge in training evaluation: as emphasized by Kirkpatrick and Kirkpatrick (2016), most organizations remain concentrated at Levels 1 and 2 (reaction and learning), while evaluations at Levels 3 and 4 (behavior and results) are far less frequent because they require longitudinal observation, managerial access, and complex integration of behavioral and organizational data.
Consequently, the field continues to face a methodological gap between what is easiest to measure and what is most meaningful to assess: observable behavioral change and its translation into safer workplace practices. The present research explicitly seeks to bridge this gap by bringing the worker into the evaluation process while performing actual tasks within immersive environments. In virtual settings, trainees can act, make mistakes, and experience simulated consequences safely, allowing evaluators to observe authentic behavioral responses that would be too hazardous to study in physical workplaces. This immersive approach transforms evaluation from a post-training reflection into a real-time, performance-based process, aligning with the behavioral and results levels of the Kirkpatrick model. The sectoral predominance of construction within the reviewed corpus further reflects where such methods are most urgently needed: industries where safety outcomes are tightly coupled with the risk of severe accidents, human loss, and substantial financial impacts worldwide. While this concentration underscores the relevance of immersive approaches for high-risk contexts, it also highlights the need for cross-sectoral validation to ensure the generalizability of findings to other critical domains such as mining, manufacturing, and chemical processing.
Overall, the methodological landscape in immersive safety training research is marked by a healthy diversity of approaches, with each design offering unique strengths. While experimental studies remain dominant, the field is increasingly enriched by longitudinal, mixed-methods, and user-centered research. Continued efforts to standardize evaluation criteria and expand sectoral diversity will further strengthen the evidence base and support the effective implementation of immersive technologies in a range of high-risk industries. Nonetheless, only a small subset of studies adopted longitudinal or multi-level evaluation strategies capable of tracking behavior change or organizational safety outcomes over time. This gap limits the availability of evidence aligned with Kirkpatrick’s Levels 3 and 4, reinforcing the need for future research focused on sustained impact and return on training investment (Evangelista et al., 2025).
3.1 Thematic trends and methodological approaches
To address and better understand these gaps, this review introduces a preliminary classification developed by the authors to map thematic trends and methodological approaches. The resulting framework is presented in Table 4.
Table 4 presents the proposed classification of the 37 studies included in this review, structured into methodological and design-oriented categories developed to support analytical synthesis. Each article was examined individually and assigned to the category that best reflects its primary empirical or conceptual approach, including experimental and laboratory-based research, longitudinal prototypes conducted in the field, systematic reviews, design science outputs (e.g., models or guidelines), comparative or user-centered studies, and research focused on mapping assessment tools or immersive technologies. This framework was developed to capture the methodological diversity within the corpus, facilitate thematic mapping, and enable comparison across studies with different designs.
Figure 3 shows the distribution of studies, which reveals a predominance of experimental and laboratory-based research, underscoring the field’s reliance on controlled environments to evaluate immersive safety training interventions. Comparative and user-centered studies also form a substantial portion, reflecting growing interest in benchmarking virtual reality against traditional methods and understanding user experiences. The corpus further includes systematic reviews that synthesize existing evidence, design science outputs proposing guidelines or conceptual models, as well as studies focused on mapping assessment tools and technologies.
3.2 Analytic overview
To assess how immersive interventions are evaluated, we analyzed which studies applied structured models such as the Kirkpatrick framework (reaction, learning, behavior, and results).
The selected body of literature reveals a diverse and evolving landscape regarding the application of immersive technologies (primarily virtual reality and extended reality) for occupational safety training and management across various industrial sectors. While the construction sector remains predominant, the corpus also includes studies in mining, electrical safety, precast concrete, and chemical process industries, reflecting expanding interest across a broader range of industrial contexts and risk environments.
A significant portion of the research focuses on evaluating the effectiveness of immersive training interventions. Recent contributions have also begun to explore the integration of haptic feedback, sensor-based assessment, and wearable technologies, though these approaches remain relatively uncommon. For instance, Babalola et al. (2023) provide a systematic review of immersive technologies in occupational safety and health management, highlighting improvements in risk perception and knowledge retention. Similarly, Luo et al. (2023) and Al-Hamad and Gilányi (2025) investigate how facility management principles and the integration of structured methodologies within VR environments can enhance hazard recognition and compliance among trainees. Yoo et al. (2023) further explore the role of telepresence and risk perception in mediating the effectiveness of VR-based safety training. These studies, while innovative, often remain focused on isolated technical outcomes, and few of them apply structured evaluation models.
Another prominent theme is the development and empirical validation of serious games, simulators, and bespoke training systems. Gürer et al. (2023) present a comprehensive VR-based serious game for underground mining safety, while Longo et al. (2023) propose a digital twin-driven cognitive training approach for smart factories. Bakai et al. (2023) contribute with the design and evaluation of hazard identification systems and VR tools aimed at expanding the use of immersive technologies for fall protection in construction. Jacobsen et al. (2022) exemplify the integration of real-time data collection to personalize safety training in both physical and virtual environments.
In contrast to more technology-focused studies, cognitive, motivational, and engagement aspects are also widely investigated. Zhang Z. et al. (2025) combine VR with eye-tracking and EEG to analyze bottom-up and top-down attention in hazard recognition, while Chan et al. (2023) examine motivational factors and engagement in chemical laboratory safety training using VR serious games. Garcia Fracaro et al. (2021) offer design guidelines for VR training in the chemical industry, emphasizing the importance of interactive elements and user experience.
Çomu et al. (2023) compare safety awareness between virtual and real construction sites, providing insights into the transferability of virtual training outcomes, while Habibnezhad et al. (2021) assess the effectiveness of immersive biofeedback simulators for fall risk assessment relative to conventional systems.
Several studies undertake systematic reviews and state-of-the-art mappings to synthesize the current knowledge base. In addition to the work by Babalola et al. (2023), Sudiarno et al. (2024) and Garcia Fracaro et al. (2022) identify future research directions and summarize the application of immersive technologies for operator training in industry. Dodoo et al. (2025) and Damilos et al. (2024) offer comprehensive reviews of XR applications and the challenges associated with safety evaluation in Industry 4.0 contexts.
Across the corpus, studies consistently indicate that immersive virtual reality training delivers substantial benefits over traditional instructional approaches. Participants engaged in VR-based interventions frequently demonstrate greater motivation and engagement, as well as enhanced retention of safety protocols and procedures (Chan et al., 2023; Lu et al., 2022; Stefan et al., 2023b; Pětvaldský et al., 2025; Lopez et al., 2025). These improvements are reported across a variety of industrial contexts and participant profiles.
In particular, several studies have highlighted immediate positive outcomes in knowledge acquisition and user satisfaction following immersive training experiences (Rey-Becerra et al., 2023; Stefan et al., 2024; Joshi et al., 2021). Formal and comprehensive assessments are infrequent when evaluating changes at higher levels of training transfer, such as actual behavioral change or organizational impact. Only a minority of studies systematically apply established evaluation models like the Kirkpatrick framework, and those that do predominantly focus on the first two levels (reaction and learning), rather than on behavior or results (Evangelista et al., 2025; Rey-Becerra et al., 2023). This scarcity of systematic application of multi-level evaluation frameworks, combined with the ongoing emphasis on short-term or self-reported outcomes, highlights the need for longitudinal assessments of real-world impact.
Finally, a subset of articles addresses the practical challenges, limitations, and implementation guidelines for immersive safety training. Khan et al. (2023) discuss risk factors and emerging technologies for fall prevention at construction sites, and Zhang M. et al. (2025) evaluate proactive warning systems within immersive environments. Stefan et al. (2023b) and Rokooei et al. (2023) highlight both the preliminary effectiveness and the organizational barriers to adopting VR-based safety training in industrial settings. Addressing these practical and organizational challenges is essential for translating promising laboratory results into effective, scalable workplace interventions.
Collectively, this literature demonstrates a strong interdisciplinary approach, with contributions from engineering, psychology, computer science, and occupational health. The research underscores the prospect of immersive technologies in enhancing safety training and the need for standardized evaluation metrics, further investigation into real-world transferability, and strategies to overcome technological and organizational barriers to widespread adoption.
3.3 Key attributes, trends and findings
The systematic review of the selected articles reveals several converging trends in the use of immersive technologies for safety training in high-risk industries. For example, across diverse sectors, most notably construction, but also mining, chemical processing, and smart manufacturing, virtual reality (VR) and related tools have enabled the simulation of hazardous scenarios in a risk-free environment, supporting both procedural learning and hazard recognition. Evidence from studies such as Gürer et al. (2023) and Garcia Fracaro et al. (2022) demonstrates that these simulations foster error-based learning and improve immediate performance, with Habibnezhad et al. (2021) highlighting measurable gains in hazard identification compared to traditional methods. However, the durability of these learning outcomes remains uncertain, as only a minority of studies, such as Stefan et al. (2023b), include longitudinal follow-up, and results indicate that skills acquired in VR may decline over time. In addition, some recent studies have begun integrating haptic feedback or physiological measurement technologies (e.g., biosensors and eye-tracking) to enhance immersion and provide objective engagement metrics, though these approaches are still rare (Shayesteh et al., 2023; Zhang Z. et al., 2025).
The key trends are presented below, categorized by theme within the present study scope.
3.3.1 Training effectiveness and knowledge transfer
In high-risk industrial environments, safeguarding workers depends not only on the availability of safety training but critically on the effectiveness with which safety knowledge is transferred from the training context to actual workplace behavior. Without robust assurance of this transfer, employees may remain vulnerable to hazards despite formal instruction, with significant implications for both personal wellbeing and organizational safety performance.
The systematic review of 37 peer-reviewed articles confirms a growing adoption of immersive technologies, particularly virtual reality, within safety training programs across diverse sectors. A consistent finding throughout the literature is the presence of immediate learning gains following immersive training interventions. For example, Babalola et al. (2023) synthesize evidence that VR-based safety training generally facilitates more accurate hazard identification compared to traditional approaches, which commonly rely on presentations and video-based training and are cost-effective but often fail to foster employee engagement or long-term knowledge retention.
However, the magnitude of this benefit varies by context and assessment strategy. Similarly, Seo et al. (2024) highlight that interactive learning elements integrated into VR environments can enhance knowledge retention, though the literature does not converge on a precise estimate of this advantage, with gains differing by scenario, industry, and participant background.
Despite these promising short-term results, questions remain about the long-term durability and practical transferability of immersive training effects. Longitudinal validation is distinctly rare across the reviewed studies. Stefan et al. (2024), among the few to incorporate follow-up assessments, found that safety-related task performance declined significantly when measured four to six weeks after initial VR training. This suggests that benefits may diminish over time in the absence of reinforcement and raises concerns about the real-world effectiveness of standalone immersive interventions. Interestingly, the traditional training group exhibited a loss of declarative knowledge that was nearly four times greater than the loss of procedural knowledge. A comparable pattern was observed in the VR training group, as the decline in declarative knowledge also exceeded that of procedural knowledge, albeit to a lesser extent. Most research to date, however, is limited to immediate post-intervention outcomes, leaving the persistence of behavioral change largely unexplored.
Furthermore, the literature reveals unresolved challenges concerning the transfer of skills and knowledge from virtual to real-world environments. Çomu et al. (2023) show that despite strong performance within VR-based simulations, trainees often face persistent difficulties in hazard detection under actual field conditions. This finding underscores that simulation fidelity alone may not guarantee behavioral equivalence outside the laboratory. Similar concerns about the gap between virtual and real-world performance are echoed in studies such as Chan et al. (2023) and Joshi et al. (2021), indicating systemic challenges to reliable knowledge transfer, as represented visually in Figure 4.
Figure 4. Safety knowledge transfer elements. This image was created by the authors using Napkin.AI to visually represent findings.
Collectively, these findings underscore both the promise and the current limitations of immersive safety training: while VR-based programs effectively accelerate knowledge acquisition and short-term performance improvements, the achievement of sustainable behavioral change and consistent workplace safety outcomes remains an open challenge. Addressing these issues through more longitudinal research and rigorous behavioral evaluation will be critical to realizing the full potential of immersive training technologies for worker protection and organizational risk management.
3.3.2 Cognitive and behavioral metrics
The adoption of advanced assessment techniques (e.g., eye-tracking, EEG, and other physiological sensors) has begun to provide richer insights into the cognitive and perceptual mechanisms underpinning learning in immersive safety training. Notably, studies by Zhang Z. et al. (2025) and Huang et al. (2022) demonstrate that measures of visual attention and neural engagement are instrumental in understanding how trainees interact with virtual hazard scenarios. These works report that top-down attentional processes are especially important for effective hazard recognition in complex environments, and that physiological signals, including EEG and eye-movement data, can serve as accurate predictors of imminent errors during training exercises. In this sense, physiological signals and gaze patterns represent key modalities within a broader multimodal human action recognition (HAR) approach, bridging immersive safety training research with established HAR methods in computer vision and behavioral analytics. These techniques mark a methodological shift from perception-based to data-driven evaluation, integrating behavioral analytics within HAR frameworks.
Despite technical advances, discrepancies persist between virtual and real-world hazard detection. Çomu et al. (2023) found that, even when sophisticated metrics indicated high levels of engagement and task success in simulated contexts, trainees’ attention levels were significantly higher in the virtual environment than in the real construction site. This suggests that, while advanced cognitive and behavioral metrics (eye-tracking in particular) offer significant potential for evaluating and refining VR-based training, high-fidelity simulation alone may be insufficient to ensure full transferability of safety skills to operational settings. Nevertheless, these modalities provide critical inputs for HAR systems, which can be designed to detect gaps between simulated and real-world behaviors and guide adaptive interventions.
Recent studies (e.g., Shayesteh et al., 2023) explore multimodal physiological monitoring to predict training effectiveness and cognitive load, though such work remains limited. Overall, the integration of cognitive and behavioral metrics enhances methodological rigor and provides a nuanced understanding of learning processes, but further studies are needed to determine how best to leverage these insights for improving real-world safety outcomes.
From a technological standpoint, immersive safety systems increasingly rely on XR frameworks that merge hardware and software layers for synchronized data collection (Roy et al., 2024). Advances in sensor fusion, haptic interfaces, and spatial mapping allow the aggregation of multimodal inputs (gaze and hand tracking, biometric and environmental data, etc.) within digital twin architectures (Roy et al., 2024; Rakkolainen et al., 2021). This convergence supports scalable evaluation pipelines, where human-system interaction data can be processed through AI algorithms for better dynamic interaction and predictive modeling of training outcomes. Figure 5 presents these findings visually:
Figure 5. Cognitive and behaviour elements. This image was created by the authors using Napkin.AI to visually represent findings.
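The aggregation of multimodal inputs described above can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed systems: the function name `window_features`, the stream names, the one-second window, and all sample values are hypothetical, and the sketch assumes that every sensor already reports timestamps on a shared clock. It simply averages each stream per time window and keeps only the windows in which all streams contributed data, which is one simple way to obtain synchronized feature vectors for later analysis.

```python
from collections import defaultdict
from statistics import mean


def window_features(streams, window_s=1.0):
    """Average each stream's samples per fixed time window.

    streams: {name: [(timestamp_seconds, value), ...]} on a shared clock.
    Returns {window_index: {name: mean_value}}, keeping only windows in
    which every stream has at least one sample.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for name, samples in streams.items():
        for t, value in samples:
            buckets[int(t // window_s)][name].append(value)

    fused = {}
    for idx in sorted(buckets):
        if set(buckets[idx]) == set(streams):  # all modalities present
            fused[idx] = {name: mean(vals) for name, vals in buckets[idx].items()}
    return fused


# Hypothetical readings, for illustration only (not data from the corpus).
streams = {
    "gaze_on_hazard": [(0.1, 1), (0.6, 0), (1.2, 1), (1.7, 1)],
    "heart_rate_bpm": [(0.5, 80), (1.5, 90), (2.5, 95)],
}
fused = window_features(streams)
for idx, features in fused.items():
    print(idx, features)
```

Dropping incomplete windows is a deliberately conservative choice in this sketch; a real pipeline might instead interpolate missing samples or resample all streams to a common rate.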
3.3.3 Personalization and adaptive systems
The literature also points to the growing importance of personalization and adaptive systems in immersive safety training. Jacobsen et al. (2022) describe the integration of real-time data collection and personalized feedback, which are reported by both trainees and trainers as valuable tools for identifying and addressing individual skill gaps and learning trajectories. Studies highlight an emerging trend toward more flexible and context-aware training environments. Yan et al. (2025) describe an adaptive VR tool with automated unsafe behavior detection features, adjusted according to user performance, while Longo et al. (2023) introduce digital twin-based cognitive training to prepare workers for novel and unpredictable industrial conditions.
Adaptive training programs that are tailored to each trainee’s needs, with sessions that dynamically adjust to the trainee’s performance in VR, open possibilities to evaluate how the individual actually behaves in the virtual workplace. Such systems can rely on the recognition of user actions (whether safe or unsafe) as a prerequisite for adaptation, highlighting the direct role of HAR in enabling personalized safety training. Actions can be monitored either by the system or by supervisors and instructors, and the feedback can potentially be instantaneous.
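As a hedged illustration of how recognized actions could drive adaptation, the following Python sketch adjusts a scenario difficulty level from a rolling rate of unsafe actions. The class `AdaptiveScenario`, its thresholds, the window size, and the five difficulty levels are all assumptions made for this example; none of the reviewed studies is known to use this exact rule, and the safe/unsafe label is assumed to come from an upstream HAR component or an instructor.

```python
from collections import deque


class AdaptiveScenario:
    """Hypothetical controller that adapts difficulty to recognized actions."""

    def __init__(self, window=5, raise_below=0.2, lower_above=0.6, levels=(1, 5)):
        self.recent = deque(maxlen=window)  # 1 = unsafe action, 0 = safe action
        self.raise_below = raise_below      # unsafe rate below this: harder scenario
        self.lower_above = lower_above      # unsafe rate above this: easier scenario
        self.min_level, self.max_level = levels
        self.level = self.min_level

    def observe(self, action_is_unsafe):
        """Record one recognized action and return the current difficulty level."""
        self.recent.append(1 if action_is_unsafe else 0)
        if len(self.recent) < self.recent.maxlen:
            return self.level  # not enough evidence in the window yet
        unsafe_rate = sum(self.recent) / len(self.recent)
        if unsafe_rate < self.raise_below and self.level < self.max_level:
            self.level += 1
            self.recent.clear()  # start a fresh window at the new level
        elif unsafe_rate > self.lower_above and self.level > self.min_level:
            self.level -= 1
            self.recent.clear()
        return self.level


# Illustrative session: five safe actions, then mostly unsafe actions.
scenario = AdaptiveScenario(window=5)
for unsafe in [False] * 5 + [True, True, True, True, False]:
    print(unsafe, scenario.observe(unsafe))
```

Clearing the window after each change keeps a single streak of actions from triggering several consecutive adjustments, which is one simple way to avoid oscillation in such a rule.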
Despite the conceptual promise of these innovations, most are presented as frameworks, design prototypes, or early-stage pilot studies, which are valuable and show promising new pathways. As such, robust empirical evidence regarding their effectiveness, scalability, and impact in diverse real-world industrial contexts remains limited within the current corpus.
3.3.4 Implementation challenges
While cognitive metrics and adaptive systems offer promising avenues for improving safety training, their real-world implementation remains constrained by a range of practical barriers. Implementation challenges are a central and recurring theme across the reviewed literature, reflecting the complex interplay of technical, human, and organizational factors that shape the adoption of immersive safety training technologies. Common technical obstacles, including motion sickness, hardware limitations, financial restrictions, consumer/business reluctance to embrace VR, and issues with sensor integration, are frequently reported as initial deterrents to user engagement and large-scale deployment (Damilos et al., 2024; Rokooei et al., 2023). Addressing these issues remains a critical focus for ongoing research and system development.
Several studies describe varying levels of acceptance among workers, particularly those with limited technological literacy or prior exposure to digital environments, which can translate into reluctance or slower adaptation to new training modalities. Even the most advanced technology may prove ineffective if it is not accepted by its intended users (Sudiarno et al., 2024).
At the organizational level, cultural factors and established routines can influence the pace and extent of technology integration. The literature notes that factors such as leadership commitment, resource prioritization, and clear evidence of added value are crucial for successful scale-up. Immersive training initiatives benefit from pilot projects, stakeholder engagement, and demonstration of early successes as ways to foster acceptance and institutional learning. However, few studies provide systematic evaluations of these institutional strategies or their long-term impact on training culture and outcomes, and among these, the assessment is not the main focus.
Economic considerations are discussed, especially concerning the initial investment required for hardware, software development, and maintenance. While these costs are readily acknowledged, there remains a need for more systematic analyses of cost-effectiveness, which would equip decision-makers with the data necessary for informed planning (Damilos et al., 2024).
In addition, the effective evaluation of immersive training programs presents its own set of challenges. Only a minority of studies, such as Evangelista et al. (2025) and Rey-Becerra et al. (2023), employ structured assessment frameworks like Kirkpatrick; even then, the field is still developing familiarity and capacity to implement such models comprehensively. This suggests that methodological maturity is advancing, and additional guidance and training in evaluation techniques may further support the generation of robust, comparable evidence.
Taken together, these implementation challenges are not unique to the field of immersive safety training but rather reflect the natural evolution of emerging technologies. They also offer important opportunities: ongoing efforts to address technical, human, and organizational barriers through targeted interventions, supportive policies, and collaborative partnerships are likely to accelerate both adoption and the realization of safety benefits in diverse industrial settings (Sudiarno et al., 2024; Evangelista et al., 2025; Rey-Becerra et al., 2023).
3.3.5 Evaluation frameworks
A critical issue identified in this review is the absence of consensus regarding the evaluation of immersive safety training programs. While most studies employ some combination of pre/post-test assessments, self-report measures, or objective performance metrics, there is substantial diversity in the chosen indicators and procedures for assessing learning, behavioral change, or organizational impact. The Kirkpatrick model, widely recognized in broader training evaluation literature, is only explicitly and systematically applied in a small minority of studies within this corpus, most notably Evangelista et al. (2025) and Rey-Becerra et al. (2023), and then predominantly at the reaction and learning levels. For the majority, evaluation strategies remain study-specific or context-driven, which complicates direct comparability and inhibits the synthesis of outcomes across studies and sectors.
To enable a structured analysis, the present review authors classified all 37 studies according to the four levels of the Kirkpatrick model (reaction, learning, behavior, and organizational results), as summarized in Table 5. It should be emphasized that this framework reflects a deliberate analytical choice of the current review, enabling a standardized lens for synthesis and comparison, even though it was not directly adopted by the original study authors and the same paper may be classified in more than one level.
This distribution illustrates that most studies emphasize participant reactions and learning gains in the immediate aftermath of training, while behavioral transfer receives less frequent attention and, as yet, no robust empirical documentation has been presented on organizational-level outcomes, such as the contribution of training to business results.
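Because the same paper may be classified at more than one level, the counts per level need not sum to the number of studies. The short Python sketch below illustrates this multi-label tally and the percentage calculation used throughout this section. The study identifiers and their level labels are hypothetical placeholders, not the contents of Table 5, and the helper names `tally_levels` and `share` are introduced here only for illustration.

```python
from collections import Counter

KIRKPATRICK_LEVELS = {1: "reaction", 2: "learning", 3: "behavior", 4: "results"}


def tally_levels(classification):
    """Count how many studies address each level (a study may count at several)."""
    counts = Counter()
    for levels in classification.values():
        for level in set(levels):  # ignore duplicate labels within one study
            counts[level] += 1
    return {level: counts.get(level, 0) for level in KIRKPATRICK_LEVELS}


def share(count, total):
    """Percentage of the corpus, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical placeholder labels, NOT the classification reported in Table 5.
example = {
    "study_A": [1, 2],
    "study_B": [2],
    "study_C": [1, 2, 3],
    "study_D": [1],
}
counts = tally_levels(example)
for level, name in KIRKPATRICK_LEVELS.items():
    n = counts[level]
    print(f"Level {level} ({name}): {n} of {len(example)} studies ({share(n, len(example))}%)")

# The same helper applied to a count out of the 37-study corpus.
print(share(6, 37))
```

In this toy example the per-level counts add up to seven although only four studies are classified, which is why level shares in a multi-label scheme should be read independently rather than as parts of a whole.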
As shown in Figure 6, the assessment of behavioral change, corresponding to Level 3 in the Kirkpatrick framework, represents a crucial, yet comparatively underexplored, dimension in the evaluation of immersive safety training programs. Within the reviewed corpus, only a subset of six studies (16.2%) extends its analysis beyond reaction and learning to examine indicators of behavior. These studies predominantly utilize controlled or simulated environments to infer modifications in safety-related practices, with methodologies ranging from objective observation to validated proxies for behavioral intent.
For example, Jacobsen et al. (2022) implemented automated assessment of VR trainee data during immersive training sessions, enabling the evaluation of workers’ safety cognition and their specific training needs. This approach illustrates how multimodal human action recognition (HAR) techniques can be applied to detect improvements in task performance within simulated contexts, suggesting that adaptive real-time feedback can reinforce safer behavioral execution even within virtual environments.
Additional studies further highlight the breadth of behavioral evaluation approaches. Pireddu et al. (2025) documented reductions in procedural errors and improved task efficiency following repeated use of VR training modules, suggesting a positive effect on skill application that, while not yet tracked longitudinally in the workplace, points toward meaningful behavioral transfer. Evangelista et al. (2025) explicitly grounded their evaluation in the Kirkpatrick model and reported enhanced adherence to safety protocols and fewer critical mistakes among participants in follow-up simulations. Pětvaldský et al. (2025) examined how participants from different age and professional groups perceive VR-based Occupational Health and Safety training. Their findings indicate that interactive VR enhances engagement and improves hazard identification, with the most favorable responses observed among participants aged 30–45.
Rey-Becerra et al. (2023) evaluated the effectiveness of two types of safety training among construction workers in Colombia, addressing Levels 3 and 4 of the Kirkpatrick model. By attempting to examine behavioral outcomes, the research advances the discussion on how safety training effectiveness can be assessed beyond immediate reactions and knowledge acquisition. Although the initial plan to employ video surveillance was not feasible due to company policies, the authors adopted self-reports, using the Workplace Health and Safety instrument as an alternative measure. Importantly, the study acknowledges this methodological limitation and provides valuable directions for future research, suggesting the adoption of objective measures such as cue utilization to assess safety awareness, tracking systems or sensors to capture safety behavior, and accident reports to reinforce the robustness of training evaluation.
This methodological pattern highlights a clear opportunity for future research to advance beyond immediate reactions and declarative learning, systematically measuring changes in behavior and, critically, organizational performance. Continued progress in this direction will be essential for capturing the full value and impact of immersive safety training interventions in real industrial contexts.
However, these studies remain a clear minority within the current literature. The prevailing focus across the corpus is still directed at immediate reactions and learning outcomes, with robust assessment of behavioral change seldom pursued. This pattern underscores a persistent research and development gap in advancing immersive safety training toward systematic evaluation and demonstration of Levels 3 and 4 outcomes, namely, the reliable translation of training results into safe workplace behaviors and safety leading indicators. Advancing empirical strategies for solid, longitudinal measurement of workplace behavior remains a vital priority for the field, essential to fully realizing and documenting the long-term benefits of immersive safety interventions.
Table 6 presents a summary of the discussion on trends, attributes and findings.
3.4 Research gaps and opportunities
Despite growing interest in immersive safety training, the current evidence base reveals persistent gaps in how these interventions are evaluated, especially beyond short-term outcomes. Building on current gains in learner engagement and immediate knowledge acquisition, several avenues stand out for future research and methodological innovation.
A leading opportunity involves the systematic development of longitudinal studies. While short-term gains are well documented, only a handful of studies assess the retention and transfer of knowledge or behaviors to real workplace settings over extended periods (Stefan et al., 2023b). There is a clear need for empirical designs that track participants across time, enabling the field to identify which elements of immersive interventions produce lasting benefits in real-world safety culture and practice.
The standardization of evaluation frameworks remains another area primed for advancement. At present, most studies utilize study-specific or context-driven criteria, with limited adoption of structured, multi-level models such as Kirkpatrick, and those that do mostly restrict analysis to reaction and learning (Evangelista et al., 2025; Rey-Becerra et al., 2023; Babalola et al., 2023), which aligns with the well documented challenge of training evaluation.
In addition, there is significant potential in the exploration and validation of objective metrics made possible by immersive technologies and human action recognition (Morshed et al., 2023). The preliminary use of physiological sensors (e.g., EEG or eye-tracking) and behavioral analytics within VR environments (Zhang Z. et al., 2025; Huang et al., 2022; Shayesteh et al., 2023) point to new frontiers for measuring engagement, attention, and the precursors of safe behavior. Broadening and standardizing these methods can yield more actionable and comparable outcomes across studies and sectors.
Economic viability and cost-effectiveness are also critical dimensions where methodological innovation would add value, even though human life should not be measured in currency. Although various studies acknowledge the costs of immersive solutions, rigorous analyses that quantify the return on investment or compare costs and outcomes with conventional approaches are rarely their main focus. Integrating economic evaluation protocols into future assessment frameworks could better inform organizational decision-making and support scalable adoption of these technologies.
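As a minimal sketch of what such an economic evaluation protocol might compute, the Python snippet below calculates a simple return on investment (ROI) and an incremental cost-effectiveness ratio (ICER) comparing an immersive program with a conventional one. The class, function, and field names are hypothetical, and all figures are made up for illustration; none are taken from the reviewed studies.

```python
from dataclasses import dataclass


@dataclass
class TrainingProgram:
    """Hypothetical cost and outcome summary for one training program."""
    name: str
    total_cost: float  # total program cost, in a single currency
    outcome: float     # outcome score, e.g., mean hazard-recognition accuracy in %


def roi(benefit: float, cost: float) -> float:
    """Simple return on investment: net benefit divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


def icer(new: TrainingProgram, baseline: TrainingProgram) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra unit of outcome."""
    delta_outcome = new.outcome - baseline.outcome
    if delta_outcome == 0:
        raise ValueError("identical outcomes; ICER is undefined")
    return (new.total_cost - baseline.total_cost) / delta_outcome


# Illustrative, made-up numbers only (assumption, not study data).
conventional = TrainingProgram("classroom", total_cost=10_000.0, outcome=60.0)
immersive = TrainingProgram("vr", total_cost=16_000.0, outcome=75.0)

print(icer(immersive, conventional))         # 400.0 extra cost per outcome point
print(roi(benefit=24_000.0, cost=16_000.0))  # 0.5
```

In practice, the estimate of monetary benefit (for example, avoided incident costs) would be the hardest input to justify and would need to be reported transparently alongside any such ratio.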
Another prominent area relates to the systematic identification of actual training needs and contextual priorities among end-users. Across the reviewed articles, few studies undertake structured needs assessments, such as surveys, interviews, task analyses, or participatory design sessions with workers and trainees, prior to the development or evaluation of immersive interventions. Instead, most programs are designed around general, yet relevant, risk scenarios typical of their industrial sector, without explicitly explaining how trainee experiences or workplace-specific safety challenges inform scenario selection, instructional content, or accessibility requirements (Jacobsen et al., 2022; Babalola et al., 2023). Greater integration of user-centered and participatory approaches can help ensure that immersive training and assessment tools more accurately reflect the realities, barriers, and heterogeneity of the target workforce.
Furthermore, the literature identifies a marked absence of consolidated guidelines or best-practice frameworks for the design, implementation, and evaluation of immersive safety training. Existing studies tend to employ context-specific or individually developed procedures, which limits comparability and knowledge accumulation across the field. Although some articles call for unified evaluation models and clearer development protocols (Babalola et al., 2023), there is as yet no widely adopted standard guiding the full cycle, from organizational objectives and needs assessment, through instructional design, to multi-level evaluation and organizational integration.
Finally, inclusivity and accessibility should be integrated systematically into both research design and evaluation methods. Expanding participant samples to include workers with disabilities, those with limited technological literacy, or employees from underrepresented sectors could potentially enhance the generalizability and equity of immersive solutions, strengthening the contribution to the SDGs. Developing and validating adapted evaluation instruments for these groups represents an important opportunity to broaden the societal impact of immersive training. Figure 7 summarizes these research gaps visually.
Figure 7. Research gaps that should be addressed to develop immersive safety training evaluation. This diagram was created by the authors using Napkin.AI.
In summary, advancing the evaluation of immersive safety training calls for multidisciplinary collaboration and methodologically robust studies that capture human action to assess not only immediate reactions or knowledge gains but also long-term behavioral and organizational outcomes, objective physiological indicators, economic viability, and accessibility. Pursuing these directions will maximize the scientific and practical benefits of immersive technologies for occupational health and safety.
3.5 Addressing the research questions
Research Question 1: What are the opportunities and challenges for applying immersive technologies to evaluate the effectiveness of safety training in high-risk environments?
Immersive technologies offer transformative potential for evaluating safety training effectiveness in high-risk industrial settings, yet significant challenges impede their widespread adoption. In response to Q1, the review reveals both the functional opportunities and systemic limitations in how immersive technologies are currently leveraged for safety training evaluation. Realistic simulation capabilities enable trainees to experience hazardous scenarios, such as chemical leaks or structural collapses, without physical risk, leading to measurable improvements in hazard recognition accuracy (Zhang Z. et al., 2025). These technologies also facilitate personalized learning through real-time biometric feedback (e.g., eye-tracking, EEG), allowing trainers to adapt scenarios dynamically based on trainee performance (Jacobsen et al., 2022). Industries like mining and chemical processing particularly benefit from risk-free experiential learning, where traditional training methods pose inherent dangers (Gürer et al., 2023; Garcia Fracaro et al., 2021).
However, a critical and underexplored challenge relates to the limited use of structured methods for identifying the real-world needs, barriers, and preferences of workers or trainees before system development. Most training programs are based on common risk scenarios, with few studies documenting participatory or user-centered needs assessments that could ensure the relevance and inclusivity of immersive solutions (Jacobsen et al., 2022). This gap limits the contextualization of training scenarios and may reduce the practical effectiveness of the interventions for diverse worker profiles.
Additionally, longitudinal validation of skill transfer to real-world environments remains scarce, with only a minority of studies tracking outcomes beyond 90 days (Stefan et al., 2023b). Even among these studies, the absence of standardized evaluation metrics, such as consistent hazard-recognition indices or behavioral compliance measures, hampers cross-sector comparability. Technical limitations, including hardware costs, motion sickness, and inflexible scenario design, further restrict practical implementation (Rokooei et al., 2023; Bakai et al., 2023). Human and organizational factors, such as limited acceptance and readiness for technological change, also present persistent barriers (Sudiarno et al., 2024).
In summary, immersive technologies offer highly promising avenues for enhancing and assessing safety training, but to realize their full impact, future advances must include participatory needs assessment, standardization, long-term validation, and targeted strategies to overcome technical, human, and organizational barriers. Therefore, overcoming these barriers is a methodological necessity and a prerequisite for the reliable and scalable adoption of immersive evaluation protocols in safety-critical industries.
Research Question 2: What attributes should a model for the application of immersive technologies possess for immersive evaluation of safety training?
To answer Q2, the review identifies key features that are essential for designing evaluation models that are scientifically robust, implementable, and impactful in operational contexts. An effective immersive evaluation model in high-risk workplace settings must integrate multi-level and multi-modal assessment, technological adaptability, interoperability, and inclusive design.
A key requirement is the adoption of multi-level evaluation frameworks (e.g., a structured application of the Kirkpatrick model), which allow the assessment of outcomes beyond immediate reactions and learning, extending to observable behavioral changes and, ideally, organizational impacts (Evangelista et al., 2025; Rey-Becerra et al., 2023; Kirkpatrick and Kirkpatrick, 2016). This requires leveraging biometric and behavioral data (e.g., engagement via gaze tracking, performance metrics for hazard identification, and IoT-enabled compliance monitoring) to provide actionable evidence of training effectiveness.
Furthermore, it is essential that model design begins with a rigorous, user-centered process that takes into account the organizational objectives (e.g., accident prevention). Structured needs assessments, participatory co-design, or direct consultation with worker representatives can ensure that evaluation criteria, training scenarios, and accessibility features meaningfully reflect real workplace demands and constraints. This approach increases both the practical relevance and inclusiveness of the model across diverse worker populations.
AI-powered dynamic scenario generation could potentially enable real-time adjustment of task difficulty and realism, especially for hazard category recognition and classification. Standardization and interoperability, through unified outcome metrics (such as a Hazard Recognition Index) and shared data resources, enable meaningful cross-sector comparison and foster cumulative progress.
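The review names a Hazard Recognition Index only as an example of a unified metric and does not define it. One possible operationalization, offered purely as an assumption for illustration, is the share of hazards present in a scenario that the trainee correctly identified, reported alongside the number of false alarms. The function names and hazard labels below are hypothetical.

```python
def hazard_recognition_index(present: set[str], identified: set[str]) -> float:
    """Assumed definition: correctly identified hazards / hazards present."""
    if not present:
        raise ValueError("scenario must contain at least one hazard")
    return len(present & identified) / len(present)


def false_alarms(present: set[str], identified: set[str]) -> int:
    """Number of items the trainee flagged that were not actual hazards."""
    return len(identified - present)


# Hypothetical scenario: four hazards seeded in a virtual construction site.
scenario = {"unguarded_edge", "exposed_wiring", "missing_ppe", "unstable_scaffold"}
flagged = {"unguarded_edge", "missing_ppe", "unstable_scaffold", "wet_floor"}

print(hazard_recognition_index(scenario, flagged))  # 0.75
print(false_alarms(scenario, flagged))              # 1
```

A shared definition of this kind, whatever its exact form, is what would allow scores from different studies and sectors to be compared directly.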
Gamification is another promising attribute. The use of game-based mechanics such as points, badges, levels, challenges, and leaderboards has demonstrated significant potential to enhance engagement, motivation, and learning outcomes in training environments. By embedding these elements within immersive platforms, evaluations can become more interactive and motivating, prompting sustained participation and deeper involvement with safety protocols. Gamified systems can also provide real-time feedback and adaptive challenges, fostering a sense of achievement and encouraging trainees to progress through increasingly complex scenarios. Furthermore, the data generated through gamified interactions (e.g., achievement rates, challenge completion times, and repeated practice) can serve as valuable analytic indicators within performance assessment frameworks, contributing to both individualized feedback and organizational-level evaluation. Therefore, the deliberate incorporation of gamification strategies should be viewed as a key design principle for future immersive evaluation models, reinforcing both learning effectiveness and the scalability of safety training interventions across industry sectors.
Feedback is a central pillar of effective training evaluation and should be approached as an iterative process that systematically informs and optimizes both trainee performance and program design. Within frameworks such as Kirkpatrick’s model, feedback mechanisms operate at multiple levels: they can capture immediate reactions and learning outcomes and are indispensable for detecting behavioral changes and organizational impact over time. By integrating both quantitative and qualitative data, a robust feedback system enables timely adjustments to instructional content, scenario fidelity, or learner support, responding dynamically to the evolving needs of high-risk industrial environments. Critically, successful feedback loops require not only the assessment of outcomes but also the translation of insights into actionable revisions in training methods, scenario complexity, and technological features. This dynamic process ensures that immersive safety training remains aligned with organizational priorities and worker realities, ultimately supporting sustained competency development and risk mitigation.
The principle of human-centered design is a core attribute and is fundamentally aligned with the central tenets of Society 5.0, a framework that advocates the harmonious integration of advanced digital technologies to address pressing societal challenges while prioritizing human wellbeing. Society 5.0 proposes that technological innovation should not be viewed merely as an end in itself; rather, it must serve as a means to enhance quality of life, inclusiveness, and occupational safety for all stakeholders. Consequently, by prioritizing user participation, accessibility, and diversity throughout the conception and deployment of immersive safety evaluation systems, organizations move beyond a limited paradigm focused solely on efficiency and automation. Instead, such an approach fosters environments in which technological advancements genuinely serve human needs and aspirations. Furthermore, this orientation increases the effectiveness, acceptance, and sustainability of training solutions in high-risk industrial sectors. It thereby reinforces the paradigm shift advocated by Society 5.0, situating people at the heart of industrial transformation so that technology becomes a tool for empowerment, equity, and organizational resilience. Therefore, achieving this alignment is essential for translating the promise of immersive technologies into concrete, inclusive, and ethically responsible outcomes across the evolving landscape of work.
In summary, developing a robust model for immersive evaluation of safety training requires a foundation in thorough participatory needs assessments, the adoption of standardized and validated evaluation frameworks, and the establishment of clear, cross-sector guidelines. Prioritizing these elements ensures that immersive solutions are tailored to the realities of workers and organizational contexts, strengthening the scientific rigor and comparability of training outcomes. By integrating these principles, evaluation models can maximize both the immediate effectiveness and the long-term impact, scalability, and acceptance of immersive safety training in high-risk industrial settings.
In light of these findings, future research should prioritize three strategic directions for advancing immersive evaluation in safety training. First, standardizing multimodal and biometric metrics is critical to ensure comparability and reliability across studies, enabling consistent interpretation of physiological and behavioral data. Second, developing longitudinal and multi-level evaluation models is essential to link short-term engagement with sustained behavioral change and organizational safety outcomes. Third, integrating real-time biometric sensing and adaptive feedback mechanisms within immersive environments can transform training systems into continuous evaluation platforms. These priorities represent the most promising avenues for strengthening methodological rigor and practical impact, ensuring that immersive evaluation evolves from experimental applications into validated, scalable tools for safety-critical industries.
3.6 Preliminary conceptual framework
Building on the findings from this systematic review, we propose a preliminary conceptual framework that interconnects the main elements required for immersive evaluation of industrial safety training in high-risk environments. In the taxonomy proposed by Design Science Research, a framework represents a conceptual structure that articulates key constructs and their relationships (Hevner et al., 2004). This contribution therefore provides a foundation for subsequent phases of the research, in which empirical testing and iterative refinement will progressively transform this conceptual structure into a validated model for immersive evaluation (Peffers et al., 2007).
The framework (Figure 8) reflects the convergence of technological, physiological, and behavioral dimensions identified across the reviewed studies, integrating them into a coherent structure that supports both formative and summative assessment of training effectiveness. It is organized into three interrelated layers, which collectively represent the continuum from training design to behavioral outcomes and feedback for continuous improvement.
Inspired by the Kirkpatrick Model, the Design Layer encompasses the organizational and instructional components that define the scope and objectives of safety training programs. It includes the identification of desired results, leading indicators, and critical behaviors, as well as the definition of evaluation strategies.
At the core of the framework, the Human-Centered Immersive Execution Layer places the trainee at the literal and conceptual center of the evaluation process. Within this layer, the worker’s actions, decisions, and physiological responses are continuously captured during immersive task performance. To maximize experiential fidelity and foster a genuine sense of being there and actually doing the work, storytelling, AI-driven adaptive interactions, gamification, and simulated drivers (virtual translation of Kirkpatrick’s behavioral drivers into the immersive environment) are integrated to enhance engagement and learning effectiveness.
Two complementary data streams are synthesized: human action recognition, encompassing motion tracking, eye tracking, heart rate, EEG, audio/video, and activity logs; and human interaction, including instructor evaluations, observational data, and task checklists. This human-centered perspective underscores that technological measurement and analytics remain grounded in the lived experience of the trainee, reinforcing an ethical and methodological commitment to evaluating learning and behavior as inherently human processes within immersive systems.
The Evaluation Layer translates these multimodal data into meaningful indicators aligned with the Kirkpatrick model, particularly Levels 2 (learning) and 3 (behavior). Metrics such as task accuracy, hazard recognition, physiological engagement, and adherence to safety procedures provide quantifiable evidence of training outcomes. These results are then integrated into a feedback loop that informs the redesign of training programs, thus closing the cycle of continuous improvement.
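To make the Evaluation Layer more concrete, the sketch below groups session metrics by the Kirkpatrick level they are taken to inform and averages them per level. The metric names, the level assignments, and the 0 to 1 normalization are assumptions made for illustration; the framework itself does not prescribe them.

```python
from collections import defaultdict
from statistics import mean

# Assumed mapping of metrics to Kirkpatrick levels (2 = learning, 3 = behavior).
METRIC_LEVEL = {
    "task_accuracy": 2,
    "hazard_recognition": 2,
    "procedure_adherence": 3,
}


def summarize_by_level(session: dict[str, float]) -> dict[int, float]:
    """Average normalized (0-1) metric scores per Kirkpatrick level."""
    by_level: dict[int, list[float]] = defaultdict(list)
    for metric, score in session.items():
        level = METRIC_LEVEL.get(metric)
        if level is None:
            continue  # skip metrics without an assumed level assignment
        by_level[level].append(score)
    return {level: mean(scores) for level, scores in sorted(by_level.items())}


# Hypothetical scores from one immersive training session.
session = {"task_accuracy": 0.5, "hazard_recognition": 0.75, "procedure_adherence": 1.0}
print(summarize_by_level(session))  # {2: 0.625, 3: 1.0}
```

Per-level summaries of this kind are one way the feedback loop could flag, for example, a program that scores well on learning but poorly on behavioral adherence.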
This preliminary framework advances the field by linking immersive training design, multimodal sensing, and behavioral evaluation into a unified and human-centered structure that supports evidence-based improvement of safety training practices. It positions human action recognition as a methodological bridge between presence, performance, and behavioral transfer, transforming immersive systems into active evaluation platforms rather than mere instructional tools.
Technologically, the framework is grounded in immersive architectures that couple extended reality environments with digital twin representations of industrial processes. Within this configuration, the digital twin serves as a dynamic mirror of the virtual workspace, continuously updated through multimodal sensing and AI-driven analytics. This linkage ensures interoperability between the immersive environment and real-world safety systems, creating a feedback-rich ecosystem where behavioral data from VR or AR sessions inform organizational-level safety performance metrics.
As a conceptual framework, it is not intended as a definitive or causal model, but as an evolving structure that will be empirically tested and refined in the forthcoming phases of this research, consistent with the iterative nature of Design Science Research. Future studies will validate the framework through experimental and field applications, assessing its capacity to generate reliable, transferable, and scalable evidence of training effectiveness across diverse industrial contexts.
4 Conclusion
This systematic review provides a comprehensive synthesis of recent literature on the use of immersive technologies for evaluating safety training in high-risk industrial environments. While VR-based interventions show strong potential for enhancing short-term learning outcomes, persistent limitations, such as the lack of longitudinal studies, absence of standardized evaluation frameworks, and low attention to cost and inclusivity, undermine their broader applicability.
To address these challenges, this study identified a set of essential attributes for an effective evaluation model. These include integration with behavioral assessment frameworks like Kirkpatrick’s model, incorporation of real-time biometric feedback, AI-driven scenario personalization, and human-centered design principles to improve accessibility, usability and inclusion. These attributes are critical to ensure that immersive training contributes meaningfully to reducing occupational risks and supporting workforce preparedness in technologically evolving sectors.
Some limitations of this study must be acknowledged. As with any systematic literature review, the search strategy was constrained by the selected databases, keywords, and inclusion/exclusion criteria. Although efforts were made to design comprehensive search strings, relevant studies using different terminology, published in sources not indexed in the chosen databases, or appearing in grey literature may have been omitted. The scope was also limited to papers written in English and published within the last 5 years. Expanding it to include non-English publications, particularly given the growing relevance of Latin American and Asian contributions in this field, could yield more robust findings.
In addition, publication bias may have influenced the evidence base, since studies reporting significant or positive results are more likely to be published and accessible. Beyond this, the reviewed literature itself presents potential sectoral and methodological biases: most studies were conducted in the construction industry, with limited representation of other high-risk domains or non-Western contexts. Many relied on small convenience samples and subjective self-report measures, which constrain generalizability and may not fully capture behavioral or physiological dimensions of learning. Furthermore, the included studies exhibited heterogeneity in terms of sample sizes, research designs, training contexts, and outcome measures, which limited direct comparison and precluded meta-analytical synthesis.
Finally, the review reflects the state of the literature only up to the date of the last search, and newer studies may already have emerged, particularly given the rapid development of immersive technologies.
An evaluation model built on the attributes identified in this review could potentially advance training outcomes and align with policy goals that emphasize risk prevention and decent work (ILO, 2023; United Nations, 2015), highlighting the contribution of immersive technologies to safety and equity in industrial training practices.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Author contributions
AC: Conceptualization, Data curation, Formal Analysis, Investigation, Methodology, Project administration, Writing – original draft, Writing – review and editing. YF: Formal Analysis, Investigation, Validation, Writing – review and editing. RL: Investigation, Resources, Validation, Writing – review and editing, Conceptualization. LA: Investigation, Validation, Writing – review and editing. MC: Investigation, Validation, Writing – review and editing. AS: Investigation, Supervision, Validation, Writing – review and editing. TS: Investigation, Supervision, Validation, Writing – review and editing. IW: Formal Analysis, Funding acquisition, Investigation, Methodology, Resources, Supervision, Validation, Writing – review and editing.
Funding
The authors declare that financial support was received for the research and/or publication of this article. This research was funded by the Advanced Knowledge Center in Immersive Technologies (AKCIT), the Bahia State Research Support Foundation (FAPESB), and the National Council for Scientific and Technological Development, grant number 308783/2020-4. The article processing charge (APC) was funded by the Advanced Knowledge Center in Immersive Technologies (AKCIT).
Acknowledgements
The authors thank SENAI CIMATEC University and the Oswaldo Cruz Foundation. The authors acknowledge the use of Napkin.AI for supporting the development of Figures 2–4.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The authors declare that Generative AI was used in the creation of this manuscript. Generative AI was used to support the creation of the figures, which were then edited, colored and adjusted to conform to editorial style.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frvir.2025.1726058/full#supplementary-material
References
Abich, J., Parker, J., Murphy, J. S., and Morgan, E. (2021). A review of the evidence for training effectiveness with virtual reality technology. Virtual Real. 25, 919–933. doi:10.1007/s10055-020-00498-8
Akdere, M., Jiang, Y., and Lobo, F. (2022). Evaluation and assessment of virtual reality-based simulated training: exploring the human-technology frontier. Eur. J. Train. Dev. 46 (5-6), 434–449. doi:10.1108/EJTD-12-2020-0178
Al-Hamad, A., and Gilányi, A. (2025). Immersive VR safety training: enhancing hazard recognition and compliance through 5M and 5S integration. Results Eng. 26, 105135. doi:10.1016/j.rineng.2025.105135
Albert, A., Pandit, B., and Patil, Y. (2020). Focus on the fatal-four: implications for construction hazard recognition. Saf. Sci. 128, 104774. doi:10.1016/j.ssci.2020.104774
Almeida, L. G. G., Vasconcelos, N. V., Winkler, I., and Catapan, M. F. (2023). Innovating industrial training with immersive metaverses: a method for developing cross-platform virtual reality environments. Appl. Sci. 13 (15), 8915. doi:10.3390/app13158915
Babalola, A., Manu, P., Cheung, C., Yunusa-Kaltungo, A., and Bartolo, P. (2023). A systematic review of the application of immersive technologies for safety and health management in the construction sector. J. Saf. Res. 85, 66–85. doi:10.1016/j.jsr.2023.01.007
Bakai, N., Zagorácz, M. B., Rák, O., and Hillebrand, P. (2023). Development of tools for expanding the use of virtual reality (VR) technology in the field of construction site fall protection. Műsz. Tudományos Közl. 19, 1–5. doi:10.33894/mtk-2023.19.01
Booth, A., Martyn-St James, M., Clowes, M., and Sutton, A. (2021). Systematic approaches to a successful literature review. London, England: Sage Publications.
Chamusca, I. L., Ferreira, C. V., Murari, T. B., Apolinario, A. L., and Winkler, I. (2023). Towards sustainable virtual reality: gathering design guidelines for intuitive authoring tools. Sustainability 15, 2924. doi:10.3390/su15042924
Chan, P., Van Gerven, T., Dubois, J.-L., and Bernaerts, K. (2023). Study of motivation and engagement for chemical laboratory safety training with VR serious game. Saf. Sci. 167, 106278. doi:10.1016/j.ssci.2023.106278
Çomu, S., Yücel, B., and Ateş Kıral, I. (2023). Comparing the safety awareness of workers on the virtual and real construction site using eye-tracking technology. J. Constr. Eng. Manag. Innov. 6 (4), 266–284. doi:10.31462/jcemi.2023.04266284
Cordeiro, A., Leite, R., Almeida, L., Neves, C., Silva, T., Siqueira, A., et al. (2025). Preliminary design guidelines for evaluating immersive industrial safety training. Informatics 12 (3), 88. doi:10.3390/informatics12030088
Creswell, J. W., and Creswell, J. D. (2017). Research design: qualitative, quantitative, and mixed methods approaches. Thousand Oaks, CA: Sage Publications.
Damilos, S., Gkika, D., Damilos, I., and Papadopoulos, A. (2024). An overview of tools and challenges for safety evaluation and exposure assessment in industry 4.0. Appl. Sci. 14 (10), 4207. doi:10.3390/app14104207
Dodoo, J. E., Al-Samarraie, H., Alzahrani, A. I., and Tang, T. (2025). XR and workers’ safety in high-risk industries: a comprehensive review. Saf. Sci. 185, 106804. doi:10.1016/j.ssci.2025.106804
Doolani, S., Wessels, C., Kanal, V., Sevastopoulos, C., Jaiswal, A., Nambiappan, H., et al. (2020a). A review of extended reality (XR) technologies for manufacturing training. Technologies 8 (4), 77. doi:10.3390/technologies8040077
Doolani, S., Owens, L., Wessels, C., and Makedon, F. (2020b). vIS: an immersive virtual storytelling system for vocational training. Appl. Sci. 10 (22), 8143. doi:10.3390/app10228143
Downs, S. H., and Black, N. (1998). The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. J. Epidemiol. Community Health 52 (6), 377–384. doi:10.1136/jech.52.6.377
Evangelista, A., Manghisi, V. M., De Giglio, V., Mariconte, R., Giliberti, C., and Uva, A. E. (2025). From knowledge to action: assessing the effectiveness of immersive virtual reality training on safety behaviors in confined spaces using the Kirkpatrick model. Saf. Sci. 181, 106693. doi:10.1016/j.ssci.2024.106693
Fernández-Caramés, T. M., and Fraga-Lamas, P. (2024). Forging the industrial metaverse - where industry 5.0, augmented and mixed reality, IIoT, opportunistic edge computing and digital twins meet. IEEE Access. doi:10.1109/ACCESS.2024.3422109
Freina, L., and Ott, M. (2015). A literature review on immersive virtual reality in education: state of the art and perspectives. Int. Sci. Conf. eLearning Softw. Educ. 1, 133–141. doi:10.12753/2066-026X-15-020
Garcia Fracaro, S., Chan, P., Gallagher, T., Tehreem, Y., Toyoda, R., Bernaerts, K., et al. (2021). Towards design guidelines for virtual reality training for the chemical industry. Educ. Chem. Eng. 36, 12–23. doi:10.1016/j.ece.2021.01.014
Garcia Fracaro, S., Glassey, J., Bernaerts, K., and Wilk, M. (2022). Immersive technologies for the training of operators in the process industry: a systematic literature review. Comput. Chem. Eng. 160, 107691. doi:10.1016/j.compchemeng.2022.107691
Gürer, S., Surer, E., and Erkayaoğlu, M. (2023). MINING-VIRTUAL: a comprehensive virtual reality-based serious game for occupational health and safety training in underground mines. Saf. Sci. 166, 106226. doi:10.1016/j.ssci.2023.106226
Habibnezhad, M., Shayesteh, S., Jebelli, H., Puckett, J., and Stentz, T. (2021). Comparison of ironworker's fall risk assessment systems using an immersive biofeedback simulator. Autom. Constr. 122, 103471. doi:10.1016/j.autcon.2020.103471
Hevner, A. R., March, S. T., Park, J., and Ram, S. (2004). Design science in information systems research. MIS Q. 28 (1), 75–106. doi:10.2307/25148625
Huang, D., Wang, X., Liu, J., Li, J., and Tang, W. (2022). Virtual reality safety training using deep EEG-Net and physiology data. Vis. Comput. 38, 1195–1207. doi:10.1007/s00371-021-02140-3
International Labour Organization (ILO) (2001). Guidelines on occupational safety and health management systems (ILO-OSH 2001). Geneva: ILO. Available online at: https://www.ilo.org/public/libdoc/ilo/2001/101B09_15_engl.pdf.
International Labour Organization (ILO) (2023). World employment and social outlook: trends 2023. Available online at: https://www.ilo.org/publications/flagship-reports/world-employment-and-social-outlook-trends-2023 (Accessed August 10, 2025).
Jacobsen, E. L., Solberg, A., Golovina, O., and Teizer, J. (2022). Active personalized construction safety training using run-time data collection in physical and virtual reality work environments. Constr. Innov. 22 (3), 531–553. doi:10.1108/CI-06-2021-0113
Jelonek, M., Fiala, E., Herrmann, T., Teizer, J., Embers, S., König, M., et al. (2022). Evaluating virtual reality simulations for construction safety training: a user study exploring learning effects, usability and user experience. i-com 21 (2), 269–281. doi:10.1515/icom-2022-0006
Jerald, J. (2015). The VR book: human-centered design for virtual reality. San Rafael, CA: ACM Books.
Joshi, S., Hamilton, M., Warren, R., Faucett, D., Tian, W., Wang, Y., et al. (2021). Implementing virtual reality technology for safety training in the precast/prestressed concrete industry. Appl. Ergon. 90, 103286. doi:10.1016/j.apergo.2020.103286
Khan, M., Nnaji, C., Khan, M. S., Ibrahim, A., Lee, D., and Park, C. (2023). Risk factors and emerging technologies for preventing falls from heights at construction sites. Autom. Constr. 153, 104955. doi:10.1016/j.autcon.2023.104955
Kirkpatrick, J. D., and Kirkpatrick, W. K. (2016). Kirkpatrick’s four levels of training evaluation. Alexandria, VA: ATD Press.
Kwegyir-Afful, E. (2022). Effects of an engaging maintenance task on fire evacuation delays and presence in virtual reality. Int. J. Disaster Risk Reduct. 67, 102681. doi:10.1016/j.ijdrr.2021.102681
Leite, R. M. C., Winkler, I., and Alves, L. R. G. (2022). Visual management and gamification: an innovation for disseminating information about production to construction professionals. Appl. Sci. 12 (11), 5682. doi:10.3390/app12115682
Li, X., Yi, W., Chi, H.-L., Wang, X., and Chan, A. P. C. (2018). A critical review of virtual and augmented reality (VR/AR) applications in construction safety. Autom. Constr. 86, 150–162. doi:10.1016/j.autcon.2017.11.003
Longo, F., Padovano, A., De Felice, F., Petrillo, A., and Elbasheer, M. (2023). From “prepare for the unknown” to “train for what's coming”: a digital twin-driven and cognitive training approach for the workforce of the future in smart factories. J. Ind. Inf. Integr. 32, 100437. doi:10.1016/j.jii.2023.100437
Lopez, J., Bhandari, S., Perry, L., Ayer, S. K., Hallowell, M. R., and Jones, M. (2025). Analyzing the impact of virtual reality and haptic feedback on the safety skills of construction workers. J. Constr. Eng. Manag. 151 (8), 04025110. doi:10.1061/JCEMD4.COENG-15981
Lu, S., Wang, F., Li, X., and Shen, Q. (2022). Development and validation of a confined space rescue training prototype based on an immersive virtual reality serious game. Adv. Eng. Inf. 51, 101520. doi:10.1016/j.aei.2021.101520
Luo, Y., Ahn, S., Abbas, A., Seo, J. O., Cha, S. H., and Kim, J. I. (2023). Investigating the impact of scenario and interaction fidelity on training experience when designing immersive virtual reality-based construction safety training. Dev. Built Environ. 16, 100223. doi:10.1016/j.dibe.2023.100223
Man, S., Wen, H., and So, B. C. L. (2024). Are virtual reality applications effective for construction safety training and education? A systematic review and meta-analysis. J. Saf. Res. 88, 230–243. doi:10.1016/j.jsr.2023.11.011
Morshed, M. G., Sultana, T., Alam, A., and Lee, Y.-K. (2023). Human action recognition: a taxonomy-based survey, updates, and opportunities. Sensors 23 (4), 2182. doi:10.3390/s23042182
Ouzzani, M., Hammady, H., Fedorowicz, Z., and Elmagarmid, A. (2016). Rayyan - a web and mobile app for systematic reviews. Syst. Rev. 5 (1), 210. doi:10.1186/s13643-016-0384-4
Pagano, T. P., Santos, V. R., Bonfim, Y. d. S., Paranhos, J. V. D., Ortega, L. L., Sá, P. H. M., et al. (2022). Machine learning models and videos of facial regions for estimating heart rate: a review on patents, datasets, and literature. Electronics 11 (9), 1473. doi:10.3390/electronics11091473
Page, M. J., Moher, D., Bossuyt, P. M., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., et al. (2021). The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372, n71. doi:10.1136/bmj.n71
Peffers, K., Tuunanen, T., Rothenberger, M. A., and Chatterjee, S. (2007). A design science research methodology for information systems research. J. Manag. Inf. Syst. 24 (3), 45–77. doi:10.2753/MIS0742-1222240302
Personeni, G., and Savescu, A. (2023). Ecological validity of virtual reality simulations in workstation health and safety assessment. Front. Virtual Real. 4, 1058790. doi:10.3389/frvir.2023.1058790
Pětvaldský, T., Kočkár, S., Lepík, P., Hollá, K., and Kuricová, A. (2025). A comparative analysis of OSH training: evaluating traditional methods versus interactive and virtual reality approaches in the context of sustainability. Sustainability 17, 5570. doi:10.3390/su17125570
Pireddu, A., Innocenti, A., Lusuardi, L. M., Santalucia, V., and Simeoni, C. (2025). The impact and effectiveness of virtual reality applied to the safety training of workers in open-cast mining. Int. J. Environ. Res. Public Health 22 (2), 151. doi:10.3390/ijerph22020151
Radhakrishnan, U., Koumaditis, K., and Chinello, F. (2021). A systematic review of immersive virtual reality for industrial skills training. Behav. Inf. Technol. 40, 1310–1339. doi:10.1080/0144929X.2021.1954693
Rakkolainen, I., Sand, A., Hopf, M., Regenbrecht, H., Löschner, J., Ojala, T., et al. (2021). Technologies for multimodal interaction in extended reality—A scoping review. Multimodal Technol. Interact. 5, 81. doi:10.3390/mti5120081
Rey-Becerra, E., Barrero, L. H., Ellegast, R., and Kluge, A. (2021). The effectiveness of virtual safety training in work at heights: a literature review. Appl. Ergon. 94, 103419. doi:10.1016/j.apergo.2021.103419
Rey-Becerra, E., Barrero, L. H., Ellegast, R., and Kluge, A. (2023). Improvement of short-term outcomes with VR-based safety training for work at heights. Appl. Ergon. 112, 104077. doi:10.1016/j.apergo.2023.104077
Rokooei, S., Shojaei, A., Alvanchi, A., Azad, R., and Didehvar, N. (2023). Virtual reality application for construction safety training. Saf. Sci. 157, 105925. doi:10.1016/j.ssci.2022.105925
Roy, S., Singh, S., and Rizwan-uddin (2024). XR and digital twins, and their role in human factor studies. Front. Energy Res. 12, 1359688. doi:10.3389/fenrg.2024.1359688
Scorgie, D., Manu, P., Cheung, C., Yunusa-Kaltungo, A., and Bartolo, P. (2024). Applications of immersive technologies for occupational safety and health training and education: a systematic review. Saf. Sci. 166, 106214. doi:10.1016/j.ssci.2023.106214
Seo, S., Park, H., and Koo, C. (2024). Impact of interactive learning elements on personal learning performance in immersive virtual reality for construction safety training. Expert Syst. Appl. 251, 124099. doi:10.1016/j.eswa.2024.124099
Shayesteh, S., Ojha, A., Liu, Y., and Jebelli, H. (2023). Human-robot teaming in construction: evaluative safety training through the integration of immersive technologies and wearable physiological sensing. Saf. Sci. 159, 106019. doi:10.1016/j.ssci.2022.106019
SMARTLAB (2023). Painel de Informações e Estatísticas da Inspeção do Trabalho no Brasil. Available online at: https://smartlabbr.org/ (Accessed July 03, 2025).
Soyka, F., Nickel, P., Rebelo, F., Lux, A., and Grabowski, A. (2025). Editorial: use of AR/MR/VR in the context of occupational safety and health. Front. Virtual Real. 6, 1528804. doi:10.3389/frvir.2025.1528804
Stefan, H., Mortimer, M., and Horan, B. (2023a). Evaluating the effectiveness of virtual reality for safety-relevant training: a systematic review. Virtual Real. 27, 2839–2869. doi:10.1007/s10055-023-00843-7
Stefan, H., Mortimer, M., Horan, B., and Kenny, G. (2023b). Evaluating the preliminary effectiveness of industrial virtual reality safety training for ozone generator isolation procedure. Saf. Sci. 163, 106125. doi:10.1016/j.ssci.2023.106125
Stefan, H., Mortimer, M., Horan, B., and McMillan, S. (2024). How effective is virtual reality for electrical safety training? Evaluating trainees’ reactions, learning, and training duration. J. Saf. Res. 90, 48–61. doi:10.1016/j.jsr.2024.06.002
Sudiarno, A., Dewi, R. S., Widyaningrum, R., Ma'arij, A. M. D., and Supriatna, A. Y. (2024). Investigating the future study area on VR technology implementation in safety training: a systematic literature review. J. Saf. Sci. Resil. 5 (2), 235–248. doi:10.1016/j.jnlssr.2024.03.005
TST - Tribunal Superior do Trabalho (2022). Acidentes de trabalho: estatísticas oficiais. Available online at: https://www.tst.jus.br/ (Accessed May 15, 2025).
Tu, X., Autiosalo, J., Ala-Laurinaho, R., Yang, C., Salminen, P., and Tammi, K. (2023). TwinXR: method for using digital twin descriptions in industrial eXtended reality applications. Front. Virtual Real. 4, 1019080. doi:10.3389/frvir.2023.1019080
Ulmer, J., Braun, S., Cheng, C.-T., Dowey, S., and Wollert, J. (2022). Gamification of virtual reality assembly training: effects of a combined point and level system on motivation and training results. Int. J. Hum.-Comput. Stud. 165, 102854. doi:10.1016/j.ijhcs.2022.102854
United Nations (2015). Transforming our world: the 2030 Agenda for sustainable development. United Nations. Available online at: https://sdgs.un.org/goals/goal8 (Accessed July 04, 2025).
Xu, Z., and Zheng, N. (2021). Incorporating virtual reality technology in safety training solution for construction site of urban cities. Sustainability 13, 243. doi:10.3390/su13010243
Yan, M., Deng, C., Gao, J., and Wang, H. (2025). Development and empirical examination of the acceptance of a hazard identification and safety training system based on VR technology. Saf. Sci. 187, 106853. doi:10.1016/j.ssci.2025.106853
Yang, H., Aqlan, F., and Zhao, R. (2025). Towards smart manufacturing metaverse via digital twinning in extended reality. ASME J. Comput. Inf. Sci. Eng. doi:10.1115/1.4070437
Yoo, J. W., Park, J. S., and Park, H. J. (2023). Understanding VR-based construction safety training effectiveness: the role of telepresence, risk perception, and training satisfaction. Appl. Sci. 13 (2), 1135. doi:10.3390/app13021135
Zhang, M., Ma, S., Xu, R., Chen, T., Ding, Y., and Luo, X. (2025). Evaluating the impact of proactive warning systems on worker safety performance: an immersive virtual reality study. Saf. Sci. 186, 106774. doi:10.1016/j.ssci.2024.106774
Keywords: immersive technologies, safety training, high-risk industries, evaluation frameworks, virtual reality, systematic literature review
Citation: Cordeiro A, Ferreira Y, Leite R, Almeida L, Catapan M, Siqueira A, Silva T and Winkler I (2025) Immersive technologies for evaluating industrial safety training in high-risk environments: a review on opportunities and challenges. Front. Virtual Real. 6:1726058. doi: 10.3389/frvir.2025.1726058
Received: 16 October 2025; Accepted: 21 November 2025; Published: 17 December 2025.
Edited by:
Michalis Vrigkas, University of Western Macedonia, Greece
Reviewed by:
LydaCamila Gómez Gómez, Minuto de Dios University Corporation, Colombia
Rohit Kumar, Manipal School of Architecture and Planning, India
Lusi Susanti, Andalas University, Indonesia
Copyright © 2025 Cordeiro, Ferreira, Leite, Almeida, Catapan, Siqueira, Silva and Winkler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: André Cordeiro, andre.cordeiro@fiocruz.br; Ingrid Winkler, ingrid.winkler@doc.senaicimatec.edu.br