<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Educ.</journal-id>
<journal-title>Frontiers in Education</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Educ.</abbrev-journal-title>
<issn pub-type="epub">2504-284X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">705551</article-id>
<article-id pub-id-type="doi">10.3389/feduc.2021.705551</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Education</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Comparing Two Subjective Rating Scales Assessing Cognitive Load During Technology-Enhanced STEM Laboratory Courses</article-title>
<alt-title alt-title-type="left-running-head">Thees et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Assessing Load During Laboratory Courses</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Thees</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1047740/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kapp</surname>
<given-names>Sebastian</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1332697/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Altmeyer</surname>
<given-names>Kristin</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1042765/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Malone</surname>
<given-names>Sarah</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/432557/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Br&#xfc;nken</surname>
<given-names>Roland</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/262750/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kuhn</surname>
<given-names>Jochen</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1036280/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>Department of Physics, Physics Education Research Group, Technische Universit&#xe4;t Kaiserslautern, <addr-line>Kaiserslautern</addr-line>, <country>Germany</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>Department of Education, Saarland University, <addr-line>Saarbr&#xfc;cken</addr-line>, <country>Germany</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/954559/overview">Moritz Krell</ext-link>, Freie Universit&#xe4;t Berlin, Germany</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/504824/overview">Kim Ouwehand</ext-link>, Erasmus University Rotterdam, Netherlands</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/650106/overview">Jeroen Van Merrienboer</ext-link>, Maastricht University, Netherlands</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Michael Thees, <email>theesm@physik.uni-kl.de</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Assessment, Testing and Applied Measurement, a section of the journal Frontiers in Education</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>14</day>
<month>07</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>6</volume>
<elocation-id>705551</elocation-id>
<history>
<date date-type="received">
<day>05</day>
<month>05</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>14</day>
<month>06</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Thees, Kapp, Altmeyer, Malone, Br&#xfc;nken and Kuhn.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Thees, Kapp, Altmeyer, Malone, Br&#xfc;nken and Kuhn</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Cognitive load theory is considered universally applicable to all kinds of learning scenarios. However, instead of a universal method for measuring cognitive load that suits different learning contexts or target groups, there is a great variety of assessment approaches. Particularly common are subjective rating scales, which even allow for measuring the three assumed types of cognitive load in a differentiated way. Although these scales have been proven to be effective for various learning tasks, they might not be an optimal fit for the learning demands of specific complex environments such as technology-enhanced STEM laboratory courses. The aim of this research was therefore to examine and compare the existing rating scales in terms of validity for this learning context and to identify options for adaptation, if necessary. For the present study, the two most common subjective rating scales that are known to differentiate between load types (the cognitive load scale by Leppink et&#x20;al. and the na&#xef;ve rating scale by Klepsch et&#x20;al.) were slightly adapted to the context of learning through structured hands-on experimentation where elements such as measurement data, experimental setups, and experimental tasks affect knowledge acquisition. <italic>N</italic>&#x20;&#x3d; 95 engineering students performed six experiments examining basic electric circuits where they had to explore fundamental relationships between physical quantities based on the observed data. Immediately after the experimentation, the students answered both adapted scales. Various indicators of validity were analyzed, covering the scales&#x2019; internal structure and their relation to other variables such as group allocation, since participants were randomly assigned to two conditions with contrasting spatial arrangements of the measurement data.
For the given dataset, the intended three-factorial structure could not be confirmed, and most of the a priori-defined subscales showed insufficient internal consistency. A multitrait&#x2013;multimethod analysis suggested convergent and discriminant evidence between the scales, but this evidence could not be confirmed sufficiently. The two contrasted experimental conditions were expected to result in different ratings for the extraneous load, a difference that was detected by only one of the adapted scales. As a further step, two new scales were assembled based on the overall item pool and the given dataset. They revealed a three-factorial structure in accordance with the three types of load and seemed to be promising new tools, although their subscales for extraneous load still suffer from low reliability scores.</p>
</abstract>
<kwd-group>
<kwd>cognitive load</kwd>
<kwd>differential measurement</kwd>
<kwd>rating scale</kwd>
<kwd>validity</kwd>
<kwd>split-attention effect</kwd>
<kwd>STEM laboratories</kwd>
<kwd>multitrait&#x2013;multimethod analysis</kwd>
</kwd-group>
<contract-num rid="cn001">01JD1811B 16DHL1022</contract-num>
<contract-sponsor id="cn001">Bundesministerium f&#xfc;r Bildung und Forschung<named-content content-type="fundref-id">10.13039/501100002347</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Experimentation in laboratory-like environments is an integral aspect of higher science education (<xref ref-type="bibr" rid="B49">Trumper, 2003</xref>; <xref ref-type="bibr" rid="B18">Hofstein and Lunetta, 2004</xref>; <xref ref-type="bibr" rid="B36">Lunetta et&#x20;al., 2005</xref>). Guided by a predefined task, learners manipulate experimental setups and observe scientific phenomena in order to explore or verify functional relationships between specific quantities in interaction with their theoretical background (<xref ref-type="bibr" rid="B3">American Association of Physics Teachers, 2014</xref>; <xref ref-type="bibr" rid="B33">Lazonder and Harmsen, 2016</xref>). Although this inquiry-based format allows for unique hands-on learning experiences, various empirical studies revealed contrary results concerning the learning gain of laboratory courses (<xref ref-type="bibr" rid="B51">Volkwyn et&#x20;al., 2008</xref>; <xref ref-type="bibr" rid="B55">Zacharia and Olympiou, 2011</xref>; <xref ref-type="bibr" rid="B14">de Jong et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B53">Wilcox and Lewandowski, 2017</xref>; <xref ref-type="bibr" rid="B19">Husnaini and Chen, 2019</xref>; <xref ref-type="bibr" rid="B23">Kapici et&#x20;al., 2019</xref>). In response, technology-based approaches are applied to support students during experimentation and thereby ensure essential learning and raise the effectiveness of experimentation as a learning scenario (<xref ref-type="bibr" rid="B14">de Jong et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B54">Zacharia and de Jong, 2014</xref>; <xref ref-type="bibr" rid="B15">de Jong, 2019</xref>; <xref ref-type="bibr" rid="B8">Becker et&#x20;al., 2020</xref>).</p>
<p>The most common way to evaluate the effectiveness of new approaches is to apply conceptual knowledge tests to measure learning gains based on content-related knowledge (<xref ref-type="bibr" rid="B17">Etkina et&#x20;al., 2006</xref>; <xref ref-type="bibr" rid="B52">Vosniadou, 2008</xref>; <xref ref-type="bibr" rid="B15">de Jong, 2019</xref>). However, this procedure does not account for learning as a complex cognitive process. Since the focus of conceptual knowledge tests is merely on learning outcomes, it remains unclear whether and how the learning effects could be further increased and learning processes made more efficient. This gap is closed by considering cognitive load theory (CLT; <xref ref-type="bibr" rid="B47">Sweller et&#x20;al., 1998</xref>, <xref ref-type="bibr" rid="B46">2019</xref>; <xref ref-type="bibr" rid="B44">Sweller, 2020</xref>), which provides a useful framework to describe learning in terms of information processing and which respects human cognitive architecture as well as learners&#x2019; prior knowledge and the demands of the instruction. Hence, to evaluate the effects of a learning scenario, investigations should consider not only the effectiveness in terms of higher scores in knowledge tests but also the efficiency in terms of an optimal level of cognitive demands. This integration of cognitive processes as a key element of learning scenarios requires sensitive and valid measurement instruments to determine the cognitive&#x20;load.</p>
<p>CLT outlines the working memory and the long-term memory as those entities that are central for processing information and building up knowledge structures (<xref ref-type="bibr" rid="B47">Sweller et&#x20;al., 1998</xref>, <xref ref-type="bibr" rid="B46">Sweller et&#x20;al., 2019</xref>) called schemata (<xref ref-type="bibr" rid="B47">Sweller et&#x20;al., 1998</xref>). Already stored knowledge can be retrieved from long-term memory to support information processing in working memory. While the long-term memory is considered permanent and unlimited in terms of capacity, working memory is limited by the number of information elements that can be processed simultaneously (<xref ref-type="bibr" rid="B7">Baddeley, 1992</xref>; <xref ref-type="bibr" rid="B47">Sweller et&#x20;al., 1998</xref>, <xref ref-type="bibr" rid="B46">Sweller et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B13">Cowan, 2001</xref>). Consequently, learners cannot process information with any desired complexity, which means that to ensure successful learning, this limited capacity should be respected. Any processing of information requires mental processes that consume working memory capacity, which is called cognitive load. CLT distinguishes three types of cognitive load (<xref ref-type="bibr" rid="B45">Sweller, 2010</xref>; <xref ref-type="bibr" rid="B47">Sweller et&#x20;al., 1998</xref>, <xref ref-type="bibr" rid="B46">Sweller et&#x20;al., 2019</xref>): intrinsic cognitive load (ICL), extraneous cognitive load (ECL), and germane cognitive load (GCL). ICL is related to the complexity of the learning content and depends on the learner&#x2019;s prior knowledge as already built-up schemata reduce the number of elements that must be processed simultaneously in working memory. 
ECL refers to processes that are not essential and therefore hamper learning such as searching for relevant information within the environment or maintaining pieces of information in mind over a longer time (<xref ref-type="bibr" rid="B38">Mayer and Moreno, 2003</xref>). GCL represents the amount of cognitive resources devoted to processing information into knowledge structures. The amount of ECL imposed by a task affects the remaining resources that can be devoted to germane processing. Current theoretical considerations suggest that GCL cannot be essentially distinguished from ICL as both are closely related to processes of schema acquisition (<xref ref-type="bibr" rid="B21">Kalyuga, 2011</xref>; <xref ref-type="bibr" rid="B20">Jiang and Kalyuga, 2020</xref>). As a consequence, a reinterpretation of CLT as a two-factor model (ICL/ECL) is discussed. GCL is integrated into this model as a function of working memory resources needed to deal with the ICL of a task instead of representing an independent source of working memory load (<xref ref-type="bibr" rid="B45">Sweller, 2010</xref>; <xref ref-type="bibr" rid="B46">Sweller et&#x20;al., 2019</xref>).</p>
<p>One of the main goals of CLT is to derive design guidelines for learning materials and environments that ensure that learning processes can proceed efficiently and undisturbed by irrelevant processing steps (<xref ref-type="bibr" rid="B46">Sweller et&#x20;al., 2019</xref>). This can be achieved by removing unnecessary and distracting information as well as by a reasonable presentation format to avoid split-attention that consumes cognitive capacities and impairs essential learning (<xref ref-type="bibr" rid="B37">Mayer and Moreno, 1998</xref>; <xref ref-type="bibr" rid="B6">Ayres and Sweller, 2014</xref>). Therefore, elements of information that need to be associated with each other in learning should be presented without delay and in spatial proximity as described by the multimedia design principles of temporal and spatial contiguity (<xref ref-type="bibr" rid="B39">Mayer and Fiorella, 2014</xref>). These principles are empirically proven to reduce ECL and support learning in multimedia learning scenarios (<xref ref-type="bibr" rid="B42">Schroeder and Cenkci, 2018</xref>).</p>
<p>Scientific experimentation in STEM laboratory courses is assumed to be a highly complex learning scenario since learners are confronted with numerous sources of information such as experimental setups and measurement data which are presented in various representational forms. Although most of the given elements are typical features of the laboratory situation, not all of them are essential for the learning process. As CLT is considered universal and applicable to various learning scenarios, its framework can also be applied to laboratory courses (<xref ref-type="bibr" rid="B48">Thees et&#x20;al., 2020</xref>).</p>
<p>Since cognitive load has rarely been seen as a main variable to investigate the impact of hands-on laboratory courses, there existed no valid measurement instruments that 1) addressed the aforementioned characteristics of scientific hands-on experimentation including context-specific load-inducing sources and 2) provided results that allowed for a differentiated interpretation of the three load types. Former investigations by Kester et&#x20;al. (<xref ref-type="bibr" rid="B26">Kester et&#x20;al., 2005</xref>; <xref ref-type="bibr" rid="B27">Kester et&#x20;al., 2010</xref>) used the one-item scale by <xref ref-type="bibr" rid="B41">Paas (1992)</xref> in the context of virtual science experiments, i.e.,&#x20;screen-based electricity simulations, to rate mental effort as a measure of cognitive load. There, the authors revealed higher transfer performance for learning with integrated rather than split-source formats. However, no differences concerning mental effort were found, which could be due to the limitations of the one-item cognitive load measurement (<xref ref-type="bibr" rid="B27">Kester et&#x20;al., 2010</xref>). We intended to address this gap for real hands-on experiments by considering existing instruments that are known to differentiate load types and adapting them to fit the context of lab courses.</p>
<p>Even though current theoretical approaches integrate GCL in a dual intrinsic-extraneous load typology of cognitive load, <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al. (2017)</xref> argued that creating supportive learning scenarios requires a comprehensive understanding of task-related aspects of cognitive load (ICL/ECL) as well as of a learner&#x2019;s deliberately devoted germane resources (GCL) and their interactions. On these grounds, a differentiated measurement of cognitive load capturing its three-partite nature is still considered expedient.</p>
<p>The search for adequate instruments to measure the three types of cognitive load has a long history in cognitive load research. The most common approaches use subjective rating scales where participants rate their perceived cognitive load by evaluating their agreement with predefined statements (<xref ref-type="bibr" rid="B9">Br&#xfc;nken et&#x20;al., 2003</xref>; <xref ref-type="bibr" rid="B32">Krell, 2017</xref>; <xref ref-type="bibr" rid="B20">Jiang and Kalyuga, 2020</xref>). There exist essentially two different rating scales that are proven to differentially measure the three types of load. These are the cognitive load scale (CLS; 10-item questionnaire) developed by <xref ref-type="bibr" rid="B34">Leppink et&#x20;al. (2013)</xref> and the (second version of the) na&#xef;ve rating scale (NRS; 8-item questionnaire) by <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al. (2017)</xref>. Both scales were applied in various learning contexts (<xref ref-type="bibr" rid="B35">Leppink et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B2">Altmeyer et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B4">Andersen and Makransky, 2021a</xref>; <xref ref-type="bibr" rid="B5">Andersen and Makransky, 2021b</xref>; <xref ref-type="bibr" rid="B8">Becker et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B25">Kapp et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B30">Klepsch and Seufert, 2020</xref>, <xref ref-type="bibr" rid="B29">Klepsch and Seufert, 2021</xref>; <xref ref-type="bibr" rid="B43">Skulmowski and Rey, 2020</xref>; <xref ref-type="bibr" rid="B48">Thees et&#x20;al., 2020</xref>), while the reliability of the subscales and the valid measurement of the three load types were confirmed multiple times (<xref ref-type="bibr" rid="B28">Klepsch et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B8">Becker et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B30">Klepsch and Seufert, 2020</xref>; <xref ref-type="bibr" rid="B48">Thees et&#x20;al., 2020</xref>; 
<xref ref-type="bibr" rid="B4">Andersen and Makransky, 2021a</xref>; <xref ref-type="bibr" rid="B5">Andersen and Makransky, 2021b</xref>). However, their application in different contexts usually requires moderate adaptations.</p>
<p>With the objective of identifying an appropriate scale to measure the three types of cognitive load in the complex context of STEM laboratory courses, we adapted two existing cognitive load scales. We based our work on the original scales as presented in <xref ref-type="bibr" rid="B34">Leppink et&#x20;al. (2013)</xref> and <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al. (2017)</xref> as well as former adaptations of the CLS in the target context by <xref ref-type="bibr" rid="B48">Thees et&#x20;al. (2020)</xref>. In this process, both scales were adapted regarding terminology and partly extended to take various characteristics of the laboratory environment into account. Although these adaptations are highly plausible, they require empirical, evidence-based validation of the resulting scales in the intended learning context. Accordingly, the main research question of the present study was whether the adapted scales can be considered as valid measurement instruments of cognitive load for the context of STEM laboratory courses.</p>
<p>Validity is defined as the appropriateness of interpreting test scores in an intended manner (<xref ref-type="bibr" rid="B31">Kline, 2000</xref>; <xref ref-type="bibr" rid="B1">AERA et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B22">Kane, 2013</xref>). The presented analyses followed the concepts given by the <italic>Standards for Educational and Psychological Testing</italic> (<xref ref-type="bibr" rid="B1">AERA et&#x20;al., 2011</xref>) where the overall evidence for validity is based on considering multiple sources of evidence such as content, internal structure, relation to other variables, and response processes. As mentioned before, the main emphases of the application and interpretation of the scales are the suitability for the special context and the differentiated measurement of the three types of cognitive load. Based on this, the following sources of evidence were considered and evaluated during the presented analyses.</p>
<p>A prerequisite for interpreting test scores in the target context of STEM laboratory courses is that the items adequately represent the addressed constructs (ICL, ECL, and GCL) in terms of their formulation. In this sense, adequate items must match the sources of cognitive load that are part of STEM experiments as a learning environment. This <italic>evidence based on content</italic> (<xref ref-type="bibr" rid="B1">AERA et&#x20;al., 2011</xref>) was considered during the item development, i.e.,&#x20;the adaptation of the original items toward the target context. In order to successfully distinguish between the three types of cognitive load, each adapted scale is expected to show a three-partite internal structure that matches the structure inherited from the original scales. This <italic>evidence based on internal structure</italic> (<xref ref-type="bibr" rid="B1">AERA et&#x20;al., 2011</xref>) was considered during the analysis of the presented dataset. The simultaneous application of two adapted scales that are intended to measure the same constructs allowed for evaluating convergent and discriminant evidence to determine whether the same constructs were addressed by the respective subscales and whether different types of load could be clearly distinguished. Whether the intended constructs were properly addressed was further evaluated by inducing group-specific differences through an external factor. By varying the presentation format of crucial information that was relevant to the learning process, ECL was varied, and the analyses evaluated whether the adapted scales could detect these induced differences. In addition, the scales should not indicate any differences in ICL since the complexity of the content and the experimental tasks as well as the representational forms were equal for both groups.
Furthermore, a negative correlation between prior knowledge and ICL was expected, reflecting the reduction of perceived content-related complexity due to already built-up knowledge structures. These aspects relate to an external criterion and were considered <italic>evidence based on relations to other variables</italic> (<xref ref-type="bibr" rid="B1">AERA et&#x20;al., 2011</xref>). As both scales are applied as rating scales&#x20;and the individual process of rating each item is not considered part of the analyses, <italic>evidence based on response processes</italic> (<xref ref-type="bibr" rid="B1">AERA et&#x20;al., 2011</xref>) was not considered in the present analyses.</p>
<p>In the present study, both adapted scales were applied after learners had participated in a technology-enhanced laboratory course unit examining hands-on experiments in the context of electricity. The experimental tasks and the overall procedure followed the study design of <xref ref-type="bibr" rid="B2">Altmeyer et&#x20;al. (2020)</xref>. Participants had to explore basic physical quantities by setting up several electric circuits and observing automatically provided measurement data while manipulating fundamental parameters. To induce differences in ECL by an external factor, two experimental learning conditions were included to contrast the spatial arrangement of the learning-relevant measurement data as a between-subject factor. One group received a split-source format where the data were anchored as virtual displays to their corresponding component using augmented reality and therefore spread across the learning environment. The other group received an integrated format where the data were grouped together on a single display. Former studies in the context of hands-on electricity laboratory courses have emphasized that measurement values, which have to be compared and related to each other in order to learn successfully, should be presented in spatial proximity (<xref ref-type="bibr" rid="B2">Altmeyer et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B25">Kapp et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B48">Thees et&#x20;al., 2020</xref>) to avoid the well-known split-attention effect (<xref ref-type="bibr" rid="B42">Schroeder and Cenkci, 2018</xref>). Hence, the split-source format was expected to trigger unnecessary search processes, and the corresponding group was expected to rate higher ECL than the group with the integrated format. Both groups received the same experimental tasks and equal representational forms of the data to avoid differences in the complexity of the learning material. 
In terms of the evaluation of validity sources, this leads to the following hypotheses.</p>
<p>The hypothesis based on internal structure is as follows:<list list-type="simple">
<list-item>
<p>(H1) <italic>Since both adapted scales are intended to differentiate the three types of cognitive load, confirmatory factor analyses are expected to prove their three-partite internal structure.</italic>
</p>
</list-item>
</list>
</p>
<p>Hypotheses based on relation to other variables are as follows:<list list-type="simple">
<list-item>
<p>(H2) <italic>Since both adapted scales include subscales that are intended to measure the same latent variable, high correlations between corresponding subscales (convergent evidence) and low correlations between different subscales (discriminant evidence) are expected.</italic>
</p>
</list-item>
<list-item>
<p>(H3) <italic>The integrated presentation of measurement data reduces perceived ECL compared to the split-source format.</italic>
</p>
</list-item>
<list-item>
<p>(H4) <italic>Since the complexity of the learning material was not varied and participants were randomly assigned to the conditions, equal ratings for ICL are expected.</italic>
</p>
</list-item>
<list-item>
<p>(H5) <italic>Since ICL depends on learners&#x2019; prior knowledge, negative correlations between prior knowledge scores and ICL ratings are expected.</italic>
</p>
</list-item>
</list>
</p>
<p>Furthermore, insufficient evidence for the internal structure might cast doubt on the appropriateness of the respective adaptations and challenge validity evidence based on content or other variables. In reaction, the construction of a new scale based on the overall item pool is considered a useful procedure to contribute to scale development for the target context, leading to the following research question:<list list-type="simple">
<list-item>
<p>(RQ) <italic>Is it possible to merge both scales into a new scale that fulfills the intended three-partite structure as well as detects the induced differences in&#x20;ECL?</italic>
</p>
</list-item>
</list>
</p>
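<p>As an illustration of the correlation pattern expected under H2, the following minimal Python sketch uses simulated data only; it does not reproduce the analysis or the data of the present study. The function name mtmm_block, the noise level, and the simulated scores are hypothetical choices made for this example, which assumes that both scales deliver one score per participant for each load type.</p>
<preformat preformat-type="code">
```python
import numpy as np


def mtmm_block(scores_a, scores_b):
    """Correlate every subscale of scale A with every subscale of scale B.

    Both inputs have shape (participants, traits) with the traits in the
    same order, so the diagonal holds convergent correlations (same trait,
    different scale) and the off-diagonal holds discriminant ones.
    """
    n_traits = scores_a.shape[1]
    # With rowvar=False each column is a variable; the columns of both
    # inputs are stacked, giving a (2 * n_traits) square matrix.
    full = np.corrcoef(scores_a, scores_b, rowvar=False)
    return full[:n_traits, n_traits:]


# Synthetic illustration only: 95 simulated participants, three independent
# latent load types, and two noisy questionnaires measuring each of them.
rng = np.random.default_rng(0)
latent = rng.normal(size=(95, 3))  # columns: ICL, ECL, GCL
scale_a = latent + 0.5 * rng.normal(size=(95, 3))
scale_b = latent + 0.5 * rng.normal(size=(95, 3))

block = mtmm_block(scale_a, scale_b)
convergent = np.diag(block)                   # same trait, other scale
discriminant = block[~np.eye(3, dtype=bool)]  # different traits
print(np.round(block, 2))
```
</preformat>
<p>In this simulated case, support for H2 would correspond to diagonal entries that clearly exceed the off-diagonal entries in absolute value. With real questionnaire data, the latent load types need not be independent, so the discriminant correlations may be considerably larger than in this sketch.</p>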
</sec>
<sec sec-type="materials|methods" id="s2">
<title>Materials and Methods</title>
<sec id="s2-1">
<title>Item Development</title>
<p>While the NRS was already available in German (<xref ref-type="bibr" rid="B28">Klepsch et&#x20;al., 2017</xref>), the CLS had to be translated to implement it in German university courses. We translated the scale with an emphasis on maintaining the meaning of the original items while applying comprehensible and grammatically correct formulations. We have already implemented the translated scale in previous studies (<xref ref-type="bibr" rid="B2">Altmeyer et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B48">Thees et&#x20;al., 2020</xref>), where it has proven useful in principle, and we have further refined it for the present study. As neither scale was originally intended to be used in the context of STEM laboratory courses, all the items had to be adapted. The most important aspect was to emphasize the experiment itself consisting of the experimental tasks and procedures as well as all the components of the experimental setup and the learning environment, such as data displays and instruments. The adaptation was intended to make clear that the scales refer to the cognitive load induced by the experimental tasks and not to any accompanying activities such as pre- or posttests or preparation phases which are mandatory for graded laboratory courses. Hence, any formulations referring to general terms such as &#x201c;lecture,&#x201d; &#x201c;lesson,&#x201d; or &#x201c;activity&#x201d; were replaced by &#x201c;experiment&#x201d; or &#x201c;experimental task.&#x201d; The results can be found in <xref ref-type="table" rid="T1">Tables 1</xref>,&#x20;<xref ref-type="table" rid="T2">2</xref>.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Original and adapted NRS, based on the work of <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al. (2017)</xref>.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left">Type of load</th>
<th colspan="2" align="center">Original scale</th>
<th colspan="2" align="center">Adapted scale</th>
<th rowspan="2" align="center">&#x23;</th>
</tr>
<tr>
<th align="center">Item&#x2014;German</th>
<th align="center">Item&#x2014;English</th>
<th align="center">Item&#x2014;German</th>
<th align="center">Item&#x2014;English</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td rowspan="2" align="left">ICL</td>
<td>Bei der Aufgabe musste man viele Dinge gleichzeitig im Kopf bearbeiten</td>
<td>For this task, many things needed to be kept in mind simultaneously</td>
<td>Beim Experimentieren musste man viele Dinge gleichzeitig im Kopf bearbeiten</td>
<td>During experimentation, many things needed to be kept in mind simultaneously</td>
<td>NRS-1</td>
</tr>
<tr>
<td/>
<td>Diese Aufgabe war sehr komplex</td>
<td>This task was very complex</td>
<td>Das Experimentieren war sehr komplex</td>
<td>Experimentation was very complex</td>
<td>NRS-2</td>
</tr>
<tr>
<td rowspan="2" align="left">ECL</td>
<td>Bei dieser Aufgabe ist es m&#xfc;hsam, die wichtigsten Informationen zu erkennen</td>
<td>During this task, it was exhausting to find the important information</td>
<td>Beim Experimentieren war es m&#xfc;hsam, die wichtigsten Informationen zu erkennen</td>
<td>During experimentation, it was exhausting to find the important information</td>
<td>NRS-3</td>
</tr>
<tr>
<td>Die Darstellung bei dieser Aufgabe ist ung&#xfc;nstig, um wirklich etwas zu lernen</td>
<td>The design of this task was very inconvenient for learning</td>
<td>Die Darstellung der Messwerte beim Experimentieren war ung&#xfc;nstig um wirklich etwas zu lernen</td>
<td>The presentation of measurement data was very inconvenient for learning</td>
<td>NRS-4</td>
</tr>
<tr>
<td align="left"/>
<td>Bei dieser Aufgabe ist es schwer, die zentralen Inhalte miteinander in Verbindung zu bringen</td>
<td>During this task, it was difficult to recognize and link the crucial information</td>
<td>Beim Experimentieren war es schwierig, die richtigen Messwerte und Bauteile miteinander in Verbindung zu bringen</td>
<td>During experimentation, it was difficult to link appropriate data and components</td>
<td>NRS-5</td>
</tr>
<tr>
<td rowspan="2" align="left">GCL</td>
<td>Ich habe mich angestrengt, mir nicht nur einzelne Dinge zu merken, sondern auch den Gesamtzusammenhang zu verstehen</td>
<td>I made an effort, not only to understand several details but also to understand the overall context</td>
<td>Beim Experimentieren habe ich mich angestrengt, mir nicht nur einzelne Dinge zu merken, sondern auch den Gesamtzusammenhang zu verstehen</td>
<td>During experimentation, I made an effort, not only to understand several details but also to understand the overall context</td>
<td>NRS-6</td>
</tr>
<tr>
<td>Es ging mir beim Bearbeiten der Lerneinheit darum, alles richtig zu verstehen</td>
<td>My point while dealing with the task was to understand everything correct</td>
<td>Es ging mir beim Experimentieren darum, alles richtig zu verstehen</td>
<td>My point while experimenting was to understand everything correct</td>
<td>NRS-7</td>
</tr>
<tr>
<td align="left"/>
<td>Die Lerneinheit enthielt Elemente, die mich unterst&#xfc;tzten, den Lernstoff besser zu verstehen</td>
<td>The learning task consisted of elements supporting my comprehension of the task</td>
<td>Die Aufgaben, die ich w&#xe4;hrend dem Experimentieren bearbeiten musste, haben mich dabei unterst&#xfc;tzt, den Lernstoff besser zu verstehen</td>
<td>The experimental task supported my comprehension of the content</td>
<td>NRS-8</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Original and adapted CLS, based on the work of <xref ref-type="bibr" rid="B34">Leppink et&#x20;al. (2013)</xref>.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left">Type of load</th>
<th align="center">Original scale</th>
<th align="center">Translated</th>
<th colspan="2" align="center">Adapted scale</th>
<th rowspan="2" align="center">&#x23;</th>
</tr>
<tr>
<th align="center">Item&#x2014;English</th>
<th align="center">Item&#x2014;German</th>
<th align="center">Item&#x2014;English</th>
<th align="center">Item&#x2014;German</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td rowspan="4" align="left">ICL</td>
<td>The topic/topics covered in the activity was/were very complex</td>
<td>Die w&#xe4;hrend der Aktivit&#xe4;t behandelten Themen waren sehr komplex</td>
<td>The experiment covered topics that I perceived as complex</td>
<td>Die beim Experimentieren thematisierten Inhalte empfinde ich als komplex</td>
<td>CLS-1</td>
</tr>
<tr>
<td>The activity covered formulas that I perceived as very complex</td>
<td>Die Aktivit&#xe4;t behandelte Formeln, welche ich als sehr komplex empfand</td>
<td>I perceived the measurement procedure as complex</td>
<td>Das Aufnehmen der Messwerte habe ich als komplex empfunden</td>
<td>CLS-2</td>
</tr>
<tr>
<td align="left"/>
<td/>
<td>The experiment covered representations that I perceived as complex</td>
<td>Die beim Experimentieren verwendeten Darstellungen habe ich als komplex empfunden</td>
<td>CLS-3</td>
</tr>
<tr>
<td align="left"/>
<td/>
<td>I perceived the experimental setup as complex</td>
<td>Die experimentellen Aufbauten habe ich inhaltlich als komplex empfunden</td>
<td>CLS-4</td>
</tr>
<tr>
<td align="left"/>
<td>The activity covered concepts and definitions that I perceived as very complex</td>
<td>Die Aktivit&#xe4;t behandelte Konzepte und Definitionen, welche ich als sehr komplex empfand</td>
<td>The experiment covered physical laws that I perceived as complex</td>
<td>Die beim Experimentieren betrachteten physikalischen Zusammenh&#xe4;nge habe ich als komplex empfunden</td>
<td>CLS-5</td>
</tr>
<tr>
<td rowspan="3" align="left">ECL</td>
<td>The instructions and/or explanations during the activity were very unclear</td>
<td>Die Arbeitsauftr&#xe4;ge und/oder Erkl&#xe4;rungen zur Aktivit&#xe4;t waren sehr unklar</td>
<td>The instructions during the experiment were unclear</td>
<td>Die Arbeitsauftr&#xe4;ge zum Experimentieren waren unklar</td>
<td>CLS-6</td>
</tr>
<tr>
<td align="left"/>
<td/>
<td>The operation of the experimental setup was unclear</td>
<td>Das Bedienen des Experiments war unklar</td>
<td>CLS-7</td>
</tr>
<tr>
<td>The instructions and/or explanations were, in terms of learning, very ineffective</td>
<td>Die Arbeitsauftr&#xe4;ge und/oder Erkl&#xe4;rungen waren sehr ungeeignet f&#xfc;r den Lernfortschritt</td>
<td>The instruction during the experiment was, in terms of learning, ineffective</td>
<td>Die Arbeitsauftr&#xe4;ge zum Experimentieren waren f&#xfc;r meinen pers&#xf6;nlichen Lernfortschritt ungeeignet.</td>
<td>CLS-8</td>
</tr>
<tr>
<td align="left"/>
<td>The instructions and/or explanations were full of unclear language</td>
<td>Die Arbeitsauftr&#xe4;ge und/oder Erkl&#xe4;rungen enthielten viele sprachliche Unklarheiten</td>
<td>The work booklet was full of unclear language</td>
<td>Die Experimentieranleitung enthielt viele sprachliche Unklarheiten</td>
<td>CLS-9</td>
</tr>
<tr>
<td rowspan="4" align="left">GCL</td>
<td>The activity really enhanced my understanding of the topic(s) covered</td>
<td>Die Aktivit&#xe4;t hat mein Verst&#xe4;ndnis zu den betrachteten Themen wirklich gef&#xf6;rdert</td>
<td>The experiment enhanced my understanding of the topic covered</td>
<td>Das Experimentieren heute hat mein Verst&#xe4;ndnis zu dem betrachteten Themengebiet gef&#xf6;rdert</td>
<td>CLS-10</td>
</tr>
<tr>
<td>The activity really enhanced my knowledge and understanding of statistics</td>
<td>Die Aktivit&#xe4;t hat mein Wissen und Verst&#xe4;ndnis zu Statistik wirklich gef&#xf6;rdert</td>
<td>The experiment enhanced my understanding of the measurement procedures</td>
<td>Das Experimentieren heute hat mein Verst&#xe4;ndnis zur Aufnahme von Messwerten gef&#xf6;rdert</td>
<td>CLS-11</td>
</tr>
<tr>
<td>The activity really enhanced my understanding of the formulas covered</td>
<td>Die Aktivit&#xe4;t hat mein Verst&#xe4;ndnis zu den betrachteten Formeln wirklich gef&#xf6;rdert</td>
<td>The experiment enhanced my understanding of the physical laws covered</td>
<td>Das Experimentieren heute hat mein Wissen zu den betrachteten physikalischen Zusammenh&#xe4;ngen gef&#xf6;rdert</td>
<td>CLS-12</td>
</tr>
<tr>
<td align="left"/>
<td/>
<td>The experiment enhanced my understanding of the representations covered</td>
<td>Das Experimentieren heute hat mein Verst&#xe4;ndnis zu den verwendeten Darstellungen gef&#xf6;rdert</td>
<td>CLS-13</td>
</tr>
<tr>
<td align="left"/>
<td>The activity really enhanced my understanding of concepts and definitions</td>
<td>Die Aktivit&#xe4;t hat mein Verst&#xe4;ndnis zu Konzepten und Definitionen wirklich gef&#xf6;rdert</td>
<td>The experiment enhanced my general understanding of physical concepts and definitions</td>
<td>Das Experimentieren heute hat mein allgemeines Verst&#xe4;ndnis zu physikalischen Konzepten und Definitionen gef&#xf6;rdert</td>
<td>CLS-14</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Concerning the NRS (<xref ref-type="table" rid="T1">Table&#x20;1</xref>), the items of the ICL and GCL subscales were adapted by replacing general terms such as &#x201c;task&#x201d; with references to experimentation, as mentioned before. For the ECL subscale, the term &#x201c;information&#x201d; was specified as &#x201c;measurement data.&#x201d; These data are seen as the crucial information of the scientific context and the basis for any learning process because the information about the mutual dependencies between the physical quantities describing the behavior of the experimental components is represented solely by the data. The 7-point Likert scale was adopted from the original work by <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al. (2017)</xref>, including the labeling of the scale range as &#x201c;absolutely wrong&#x201d; (left endpoint; German: &#x201c;Stimme &#xfc;berhaupt nicht zu&#x201d;) and &#x201c;absolutely right&#x201d; (right endpoint; German: &#x201c;Stimme voll zu&#x201d;).</p>
<p>Concerning the CLS (<xref ref-type="table" rid="T2">Table&#x20;2</xref>), the references within the items were also adjusted to the &#x201c;experiment.&#x201d; Furthermore, for the ICL and GCL subscales, the contents of the learning scenario (formerly statistics and corresponding formulas) were replaced by &#x201c;measurement procedure,&#x201d; &#x201c;representations,&#x201d; and &#x201c;physical laws.&#x201d; This resulted in one additional item for each subscale (CLS-3 and CLS-13). Another item was added to the ICL subscale referring to the complexity of the experimental setup (CLS-4). For ECL, the term &#x201c;instructions&#x201d; was related to the &#x201c;experimental task&#x201d; and the &#x201c;work booklet.&#x201d; In addition, another item concerning the operation of the experimental setup was added to this subscale (CLS-7). Hence, the original 10-item scale was expanded to a 14-item scale in order to capture various facets of the context. Furthermore, the scale range was adjusted to a 6-point Likert scale. In this step, the term &#x201c;very&#x201d; was removed from each item. The labeling of the scale range was adopted from the original work by <xref ref-type="bibr" rid="B34">Leppink et&#x20;al. (2013)</xref>, ranging from &#x201c;not at all&#x201d; (left endpoint; German: &#x201c;Trifft gar nicht zu&#x201d;) to &#x201c;completely the case&#x201d; (right endpoint; German: &#x201c;Trifft voll und ganz zu&#x201d;).</p>
</sec>
<sec id="s2-2">
<title>Participants</title>
<p>The sample originally consisted of <italic>N</italic>&#x20;&#x3d; 117 engineering students from a medium-sized German university (approximately 14,000 students in total) who attended the same introductory physics lecture. Six of them had to be excluded due to language problems, and another 16 students had to be excluded due to missing values in the overall dataset. The remaining <italic>N</italic>&#x20;&#x3d; 95 students constitute the sample for all further analyses. Participants were randomly assigned to group 1, receiving an integrated presentation format (<italic>N</italic>&#x20;&#x3d; 48; 15% female, 81% male; age: <italic>M</italic>&#x20;&#x3d; 19.8, <italic>SD</italic> &#x3d; 1.3; semester: <italic>M</italic>&#x20;&#x3d; 1.9, <italic>SD</italic> &#x3d; 1.3), and group 2 (<italic>N</italic>&#x20;&#x3d; 47; 15% female, 74% male; age: <italic>M</italic>&#x20;&#x3d; 20.1, <italic>SD</italic> &#x3d; 1.5; semester: <italic>M</italic>&#x20;&#x3d; 2.3, <italic>SD</italic> &#x3d; 1.7), receiving a split-source presentation format. The investigation was conducted during the winter semester 2019. Participation was reimbursed with a bonus percentage of 5% for the final examination&#x20;score.</p>
</sec>
<sec id="s2-3">
<title>Materials</title>
<p>During the intervention, participants performed structured physics experiments, based on a former study by <xref ref-type="bibr" rid="B2">Altmeyer et&#x20;al. (2020)</xref>, in which they had to construct several electrical circuits and analyze measurement data to derive fundamental laws for voltage and current (known as Kirchhoff&#x2019;s laws). This inquiry process was guided by structured task descriptions in which six different circuits were examined. Learners had to set up these circuits with typical educational equipment (i.e.,&#x20;cables, a voltage source, and resistors) based on a given circuit diagram and answer a set of single-choice items concerning the relation of voltage or amperage at all components based on the observed data. To obtain a variety of data from which physical laws could be derived, learners were encouraged to manipulate fundamental parameters of the experiment, i.e.,&#x20;the source voltage (<xref ref-type="fig" rid="F1">Figure&#x20;1</xref>). The data were provided automatically via a technology-enhanced measuring system and were visualized in real time. Hence, every interaction with the experiment that led to a change in its physical properties could be immediately observed as a change in the displayed data. In terms of the complexity of the examined circuits and the required prior knowledge, the experimental tasks were comparable to experiments in the corresponding introductory physics laboratory courses, which are mandatory for university STEM programs. Hence, the learning content and the complexity of the laboratory work instructions matched the curriculum of university engineering students.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Example of the experimental task description (translated for this publication, corresponds to the circuits given in <xref ref-type="fig" rid="F2">Figures 2</xref>, <xref ref-type="fig" rid="F3">3</xref>).</p>
</caption>
<graphic xlink:href="feduc-06-705551-g001.tif"/>
</fig>
<p>The learning environment consisted of a work booklet that detailed the experimental tasks and circuit diagrams, the experimental components such as wires, a range of resistors, and a voltage source, as well as a device that virtually displayed the automatically gathered measurement data (<xref ref-type="fig" rid="F2">Figures 2</xref>, <xref ref-type="fig" rid="F3">3</xref>). For group 1, the measurement data were presented in a clearly arranged matrix on a tablet display (<xref ref-type="fig" rid="F2">Figure&#x20;2</xref>). For group 2, smartglasses (Microsoft HoloLens, first-generation developer edition) were used as a see-through head-mounted augmented reality device, and the measurement values were presented as virtual 3D components next to the corresponding real parts of the electric circuits within the visual field of the smartglasses using visual marker recognition (<xref ref-type="fig" rid="F3">Figure&#x20;3</xref>). Both groups received equal representational forms, i.e.,&#x20;numerical values and a virtual needle deflection. Accordingly, the only difference between the two groups was the spatial arrangement of the virtual real-time measurement displays. Further information on the technical implementation of the learning environment is provided by <xref ref-type="bibr" rid="B2">Altmeyer et&#x20;al. (2020)</xref> and <xref ref-type="bibr" rid="B25">Kapp et&#x20;al. (2020)</xref>.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>The learning environment as experienced by group 1 (presentation of the measurement data via separate display on a tablet).</p>
</caption>
<graphic xlink:href="feduc-06-705551-g002.tif"/>
</fig>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>
<bold>(a)</bold> Representation of the AR view as seen through the smartglasses by participants. <bold>(b)</bold> Researcher wearing smartglasses.</p>
</caption>
<graphic xlink:href="feduc-06-705551-g003.tif"/>
</fig>
<p>Both adapted subjective rating scales were applied as shown in <xref ref-type="table" rid="T1">Tables 1</xref>, <xref ref-type="table" rid="T2">2</xref> in order to measure cognitive load in a differentiated&#x20;way.</p>
<p>Prior knowledge was determined via a conceptual knowledge test consisting of 10&#x20;single-choice items, which were also used in a similar form by <xref ref-type="bibr" rid="B2">Altmeyer et&#x20;al. (2020)</xref>. These items were selected from a conceptual knowledge test originally developed by <xref ref-type="bibr" rid="B50">Urban-Woldron and Hopf (2012)</xref> and <xref ref-type="bibr" rid="B10">Burde (2018)</xref> based on their compatibility with the physical concepts (i.e.,&#x20;voltage and current in simple circuits, Kirchhoff&#x2019;s laws) and the complexity of the circuits (i.e.,&#x20;parallel and serial circuits with few components) addressed during the experimentation phase. Five of the items were directly related to circuits that were part of the experimental tasks and were therefore considered &#x201c;instruction-related&#x201d; in subsequent analyses. The items were already available in German, but to match the formal representation of the instructions from the experiment, we adapted the symbols of the circuit diagrams (symbols for resistors, voltage source, etc.). An example item can be found in <xref ref-type="fig" rid="F4">Figure&#x20;4</xref>.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Example of the conceptual knowledge items as presented to the participants (<xref ref-type="bibr" rid="B50">Urban-Woldron and Hopf, 2012</xref>, translated for this publication).</p>
</caption>
<graphic xlink:href="feduc-06-705551-g004.tif"/>
</fig>
<p>Furthermore, knowledge tests concerning concrete measurement data and a usability questionnaire were applied, but these were excluded from the presented analyses (Thees et&#x20;al., in preparation). Finally, students were asked for demographic data on a voluntary&#x20;basis.</p>
</sec>
<sec id="s2-4">
<title>Procedure</title>
<p>After receiving general information about the study and data protection as well as providing written consent for participation, the students completed the prior knowledge test (pretest). All the items were presented consecutively on a computer screen, and completion took approximately 10&#xa0;min.</p>
<p>Afterward, participants were introduced to the actual learning environment, i.e.,&#x20;the work booklet, the experimental components, and the operation of the displaying device (tablet or smartglasses). They were randomly assigned to one of the two intervention groups. Students using the smartglasses were able to wear their own glasses or contact lenses at the same time without any limitation.</p>
<p>The introduction was followed by the experimentation phase, in which students conducted the six experimental tasks as presented in the work booklet. After setting up each circuit, a supervisor checked and corrected the wiring in order to ensure safe experimentation. Students did not prepare for this experiment, and no further guidance or support was provided. The experimentation phase lasted approximately 30&#xa0;min.</p>
<p>Subsequently, participants completed the two subjective cognitive load rating scales one after the other as paper&#x2013;pencil tests, starting with the adapted CLS. The items were presented in a randomized order that was the same for every student, so that they were not grouped by their intended tripartite structure. Answering both questionnaires took less than 10&#xa0;min.</p>
<p>Finally, students answered questions concerning demographic data on a voluntary basis in a paper&#x2013;pencil format.</p>
</sec>
<sec id="s2-5">
<title>Data Analysis</title>
<p>For each subscale, the score was calculated as the mean of the corresponding item ratings and was afterward rescaled to the interval [0; 1].</p>
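<p>The scoring step can be illustrated with a minimal sketch. The text does not state the exact rescaling formula, so the linear min-max mapping below is an assumption, as is the coding of the ratings from 1 to 7 (NRS) and from 1 to 6 (CLS). The function name and the example ratings are hypothetical and are not data from the study.</p>
<preformat><![CDATA[
```python
from statistics import mean


def subscale_score(item_ratings, scale_min, scale_max):
    """Return the mean item rating rescaled linearly to the interval [0, 1]."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    if any(r < scale_min or r > scale_max for r in item_ratings):
        raise ValueError("rating outside the scale range")
    raw_mean = mean(item_ratings)
    # Min-max rescaling: scale_min maps to 0.0 and scale_max maps to 1.0.
    return (raw_mean - scale_min) / (scale_max - scale_min)


# Hypothetical answers of one student (assumed coding: 1..7 for NRS, 1..6 for CLS).
nrs_icl = subscale_score([2, 3], scale_min=1, scale_max=7)            # two NRS ICL items
cls_gcl = subscale_score([5, 4, 5, 6, 5], scale_min=1, scale_max=6)   # five CLS GCL items
```
]]></preformat>
<p>Under these assumptions, scores from the 7-point NRS and the 6-point CLS end up on the same [0; 1] range, which is what makes the group means of the two scales directly comparable.</p>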
<p>To provide evidence based on the internal structure, the reliability of each subscale for both scales was calculated as internal consistency (Cronbach&#x2019;s alpha; &#x3b1;<sub>c</sub>) with the conventional threshold of &#x3b1;<sub>c</sub> &#x3d; 0.70 for acceptable reliability (<xref ref-type="bibr" rid="B31">Kline, 2000</xref>). In addition, confirmatory factor analyses (CFA) were conducted for both scales, evaluating their intended three-factorial structure representing the three types of cognitive load (addressing H1). In these models, correlations between the factors (i.e.,&#x20;the subscales) were allowed.</p>
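<p>As a reminder of what the internal consistency coefficient measures, the following sketch computes Cronbach&#x2019;s alpha from the textbook formula, i.e., k/(k &#x2212; 1) times one minus the ratio of the summed item variances to the variance of the sum score. It is not the routine used for the reported values, and the function name and the example ratings are hypothetical.</p>
<preformat><![CDATA[
```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent needs a rating for every item")
    # Sample variance (n - 1 in the denominator) of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' sum scores.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of five students on a three-item subscale (not study data).
example_ratings = [[1, 2, 1], [2, 3, 3], [3, 3, 4], [4, 5, 4], [5, 5, 6]]
alpha_example = cronbach_alpha(example_ratings)
```
]]></preformat>
<p>Alpha approaches 1 when the items rise and fall together across respondents and drops toward 0 when they vary independently.</p>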
<p>To provide evidence based on relations to other variables, both scales were compared following the procedure of a traditional multitrait&#x2013;multimethod (MTMM) analysis (<xref ref-type="bibr" rid="B11">Campbell and Fiske, 1959</xref>) in order to search for convergent and discriminant evidence, as each method (scale) addresses each trait (type of load). For this purpose, the correlations between the subscale scores of the two methods as well as the reliability scores in terms of internal consistency were compared in a correlation table, the MTMM matrix (addressing H2). Although there are no clear guidelines concerning thresholds, strong evidence is indicated if the correlations between the same traits measured by different methods are higher than the correlations between different traits measured by different methods. The traditional evaluation of the correlation table was complemented with a subsequent confirmatory MTMM analysis, which was calculated as a correlated trait&#x2013;correlated method model via a CFA that allowed for correlations between all components (<xref ref-type="bibr" rid="B16">Eid, 2000</xref>). Furthermore, it was checked whether the scales could detect differences in the subscales between the two intervention types (grouping variable) during the study. To this end, group-specific ECL scores were compared using a two-sided independent-samples <italic>t</italic>-test (addressing H3). An equivalent <italic>t</italic>-test was conducted to compare group-specific ICL scores (addressing H4). In addition, the correlations between the ICL subscales and the pretest score were examined. Here, a negative correlation was expected, as higher prior knowledge is assumed to reduce the complexity of the content due to already existing knowledge schemata (addressing&#x20;H5).</p>
<p>Going one step further, we intended to combine both scales in order to merge them into a new scale with better model fit concerning the tripartite structure (addressing RQ). This was based on an exploratory factor analysis (EFA), which was conducted using all items of both scales together. In this instance, the Kaiser&#x2013;Meyer&#x2013;Olkin measure revealed a good sampling adequacy with an overall <italic>KMO</italic> &#x3d; 0.79. The individual <italic>KMO</italic>
<sub>
<italic>j</italic>
</sub> values were in the range of [0.65; 0.89]. Furthermore, Bartlett&#x2019;s test of sphericity, &#x3c7;<sup>2</sup>(231) &#x3d; 1,006.4, <italic>p</italic>&#x20;&#x3c; 0.001, revealed adequate item correlations. The scree plot and a parallel analysis were taken into account to determine the optimal number of factors, which was found to be three. Since the factors to be extracted were allowed to correlate with each other, an oblique factor rotation (&#x201c;oblimin&#x201d;) was applied. As the intention was to find a short and concise scale, we limited the number of items included for each subscale to three. Two new models were developed based on the factor loadings and the relation to the group variable in the presented study. Both new models were evaluated by conducting a confirmatory factor analysis with their intended three-factorial structure.</p>
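<p>The reported 231 degrees of freedom are what one would expect for Bartlett&#x2019;s test with all 22 items (8 NRS items plus 14 CLS items), since 22 &#xd7; 21 / 2 &#x3d; 231. The sketch below shows the standard form of the test statistic for a correlation matrix. It is a plain-Python illustration and not the psych routine used for the analysis; the function names and the 2 &#xd7; 2 example matrix are hypothetical.</p>
<preformat><![CDATA[
```python
import math


def log_determinant(matrix):
    """Natural log of the determinant of a positive definite matrix."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    log_det = 0.0
    for col in range(n):
        pivot = a[col][col]
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        log_det += math.log(pivot)
        # Eliminate the entries below the pivot (Gaussian elimination).
        for row in range(col + 1, n):
            factor = a[row][col] / pivot
            for j in range(col, n):
                a[row][j] -= factor * a[col][j]
    return log_det


def bartlett_sphericity(corr, n_obs):
    """Bartlett's chi-square statistic and degrees of freedom for a correlation matrix."""
    p = len(corr)
    df = p * (p - 1) // 2
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * log_determinant(corr)
    return chi2, df


# Degrees of freedom for 22 items, as in the combined item pool of both scales.
df_22_items = 22 * (22 - 1) // 2

# Hypothetical two-item correlation matrix with r = 0.5 and N = 95.
chi2_example, df_example = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], 95)
```
]]></preformat>
<p>The statistic grows as the determinant of the correlation matrix moves away from 1, that is, as the items become more strongly intercorrelated.</p>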
<p>In general, the significance level for type I errors was set to &#x3b1; &#x3d; 0.05. For each confirmatory analysis, the following indices were applied with their corresponding cutoff values indicating acceptable model fit: the comparative fit index (CFI) and the Tucker&#x2013;Lewis index (TLI), each &#x2265;0.95, as well as the root mean square error of approximation (RMSEA) and the standardized root mean square residual (SRMR), each &#x2264;&#x20;0.08.</p>
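<p>The cutoff rule can be written down compactly. The sketch below applies the four stated cutoffs to the fit indices reported in the Results section for the two adapted scales. The function and variable names are hypothetical, and the check is a simple threshold comparison rather than part of the original analysis code.</p>
<preformat><![CDATA[
```python
# Cutoffs as stated in the text: CFI and TLI at least 0.95, RMSEA and SRMR at most 0.08.
LOWER_BOUNDS = {"CFI": 0.95, "TLI": 0.95}
UPPER_BOUNDS = {"RMSEA": 0.08, "SRMR": 0.08}


def check_fit(indices):
    """Map each reported fit index to True (meets its cutoff) or False."""
    result = {}
    for name, value in indices.items():
        if name in LOWER_BOUNDS:
            result[name] = value >= LOWER_BOUNDS[name]
        elif name in UPPER_BOUNDS:
            result[name] = value <= UPPER_BOUNDS[name]
        else:
            raise KeyError(f"no cutoff defined for {name}")
    return result


# Values reported in the Results section for the three-factor CFA of each adapted scale.
nrs_fit = check_fit({"CFI": 0.83, "TLI": 0.72, "RMSEA": 0.11, "SRMR": 0.09})
cls_fit = check_fit({"CFI": 0.94, "TLI": 0.93, "RMSEA": 0.07, "SRMR": 0.09})
```
]]></preformat>
<p>With these inputs, none of the NRS indices meets its cutoff, and for the CLS only the RMSEA does.</p>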
<p>All the confirmatory analyses were conducted using the lavaan package (version 0.6-6) in the R programming language (version 3.6.0). For the EFA, the psych package (version 1.8.12) was&#x20;used.</p>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<sec id="s3-1">
<title>Validity Evidence Based on Internal Structure</title>
<p>The reliability analyses revealed insufficient values for the NRS, &#x3b1;<sub>c</sub>(ICL) &#x3d; 0.52, &#x3b1;<sub>c</sub>(ECL) &#x3d; 0.53, and &#x3b1;<sub>c</sub>(GCL) &#x3d; 0.62, and mixed results for the CLS, &#x3b1;<sub>c</sub>(ICL) &#x3d; 0.86, &#x3b1;<sub>c</sub>(ECL) &#x3d; 0.43, and &#x3b1;<sub>c</sub>(GCL) &#x3d; 0.90. None of the NRS subscales reached the common threshold of &#x3b1;<sub>c</sub> &#x3d; 0.70. In contrast, the CLS subscales for ICL and GCL showed satisfactory values, whereas the ECL subscale did&#x20;not.</p>
<p>The subsequent CFA also revealed no clear results. Concerning the NRS, the model fit indices did not reach the conventional thresholds, CFI &#x3d; 0.83, TLI &#x3d; 0.72, RMSEA &#x3d; 0.11, and SRMR &#x3d; 0.09. Concerning the CLS, RMSEA &#x3d; 0.07 indicated an acceptable model fit, while the other indices narrowly missed the range for acceptable values, CFI &#x3d; 0.94, TLI &#x3d; 0.93, and SRMR &#x3d; 0.09. In sum, there was no consistent indication of an acceptable model fit for either scale concerning the assumed structure with three inherent factors, which contradicts Hypothesis&#x20;1.</p>
</sec>
<sec id="s3-2">
<title>Validity Evidence Based on Relations to Other Variables</title>
<p>In order to compare the behavior of both adapted scales in terms of an MTMM approach, a correlation table based on Pearson&#x2019;s correlation was calculated (MTMM matrix; <xref ref-type="table" rid="T3">Table&#x20;3</xref>). Here, the correlations between the two methods concerning each trait (monotrait&#x2013;heteromethod coefficients) were significant (<italic>p</italic>&#x20;&#x3c; 0.05) with a range of <italic>r</italic>&#x20;&#x3d; 0.48 to <italic>r</italic>&#x20;&#x3d; 0.55 (<xref ref-type="bibr" rid="B12">Cohen, 1988</xref>), indicating convergent evidence between the two scales. These correlations were higher than the significant correlations between different traits measured by different methods (heterotrait&#x2013;heteromethod coefficients), supporting discriminant evidence. The same held for the correlations between different traits measured by the same method (heterotrait&#x2013;monomethod coefficients), which were also lower than the monotrait&#x2013;heteromethod coefficients. Furthermore, the patterns (ranks and signs of correlations) of the heterotrait&#x2013;monomethod blocks were comparable for both methods. In contrast, the reliability values (Cronbach&#x2019;s &#x3b1;<sub>c</sub>) varied considerably. In sum, based on the correlation table (<xref ref-type="table" rid="T3">Table&#x20;3</xref>), these findings supported convergent and discriminant evidence.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Correlation table for MTMM analysis (MTMM matrix; only correlations with <italic>p</italic>&#x20;&#x3c; 0.05 are displayed).</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th align="left"/>
<th colspan="3" align="center">Method A: NRS</th>
<th colspan="3" align="center">Method B: CLS</th>
</tr>
<tr>
<th align="left"/>
<th align="center">Trait</th>
<th align="center">ICL</th>
<th align="center">ECL</th>
<th align="center">GCL</th>
<th align="center">ICL</th>
<th align="center">ECL</th>
<th align="center">GCL</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td rowspan="3" align="left">NRS</td>
<td align="center">ICL</td>
<td align="center">(0.55)</td>
<td align="left"/>
<td align="left"/>
<td align="left"/>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="center">ECL</td>
<td align="center">0.30<xref ref-type="table-fn" rid="Tfn2">
<sup>2</sup>
</xref>
</td>
<td align="center">(0.53)</td>
<td align="left"/>
<td align="left"/>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="center">GCL</td>
<td align="center">0.26<xref ref-type="table-fn" rid="Tfn2">
<sup>2</sup>
</xref>
</td>
<td align="center">&#x2212;0.24<xref ref-type="table-fn" rid="Tfn2">
<sup>2</sup>
</xref>
</td>
<td align="center">(0.62)</td>
<td align="left"/>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td rowspan="3" align="left">CLS</td>
<td align="center">ICL</td>
<td align="center">0.53<xref ref-type="table-fn" rid="Tfn1">
<sup>1</sup>
</xref>
</td>
<td align="center">0.37<xref ref-type="table-fn" rid="Tfn3">
<sup>3</sup>
</xref>
</td>
<td align="center">
<italic>n.s.</italic>
<xref ref-type="table-fn" rid="Tfn3">
<sup>3</sup>
</xref>
</td>
<td align="center">(0.85)</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="center">ECL</td>
<td align="center">0.20<xref ref-type="table-fn" rid="Tfn3">
<sup>3</sup>
</xref>
</td>
<td align="center">0.55<xref ref-type="table-fn" rid="Tfn1">
<sup>1</sup>
</xref>
</td>
<td align="center">&#x2212;0.35<xref ref-type="table-fn" rid="Tfn3">
<sup>3</sup>
</xref>
</td>
<td align="center">0.36<xref ref-type="table-fn" rid="Tfn2">
<sup>2</sup>
</xref>
</td>
<td align="center">(0.43)</td>
<td align="left"/>
</tr>
<tr>
<td align="center">GCL</td>
<td align="center">
<italic>n.s.</italic>
<xref ref-type="table-fn" rid="Tfn3">
<sup>3</sup>
</xref>
</td>
<td align="center">
<italic>n.s.</italic>
<xref ref-type="table-fn" rid="Tfn3">
<sup>3</sup>
</xref>
</td>
<td align="center">0.48<xref ref-type="table-fn" rid="Tfn1">
<sup>1</sup>
</xref>
</td>
<td align="center">0.22<xref ref-type="table-fn" rid="Tfn2">
<sup>2</sup>
</xref>
</td>
<td align="center">&#x2212;0.22<xref ref-type="table-fn" rid="Tfn2">
<sup>2</sup>
</xref>
</td>
<td align="char" char="(0">(0.89)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>n.s. &#x3d; not significant (p&#x20;&#x3e; 0.05).</p>
</fn>
<fn>
<p>( ) : reliability (Cronbach&#x2019;s&#x20;alpha).</p>
</fn>
<fn id="Tfn1">
<label>1</label>
<p>Monotrait&#x2013;heteromethod coefficients.</p>
</fn>
<fn id="Tfn2">
<label>2</label>
<p>Heterotrait&#x2013;monomethod coefficients.</p>
</fn>
<fn id="Tfn3">
<label>3</label>
<p>Heterotrait&#x2013;heteromethod coefficients.</p>
</fn>
</table-wrap-foot>
</table-wrap>
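<p>The comparison underlying the traditional MTMM evaluation can be sketched with the coefficients from <xref ref-type="table" rid="T3">Table&#x20;3</xref>. The check below is a deliberately simplified, global version of the Campbell and Fiske criteria: it asks whether the smallest monotrait&#x2013;heteromethod coefficient exceeds every heterotrait coefficient in magnitude. Comparing absolute values and skipping the entries reported as not significant are assumptions of this sketch, and all names are hypothetical.</p>
<preformat><![CDATA[
```python
# Correlations copied from Table 3; None marks entries reported as not significant.
MONOTRAIT_HETEROMETHOD = {"ICL": 0.53, "ECL": 0.55, "GCL": 0.48}
HETEROTRAIT_HETEROMETHOD = [0.37, None, 0.20, -0.35, None, None]
HETEROTRAIT_MONOMETHOD = [0.30, 0.26, -0.24, 0.36, 0.22, -0.22]


def largest_magnitude(values):
    """Largest absolute correlation, ignoring entries that were not reported."""
    reported = [abs(v) for v in values if v is not None]
    return max(reported) if reported else 0.0


def validity_exceeds(validity, comparison):
    """True if every validity coefficient exceeds all comparison coefficients in magnitude."""
    return min(validity.values()) > largest_magnitude(comparison)


exceeds_heteromethod = validity_exceeds(MONOTRAIT_HETEROMETHOD, HETEROTRAIT_HETEROMETHOD)
exceeds_monomethod = validity_exceeds(MONOTRAIT_HETEROMETHOD, HETEROTRAIT_MONOMETHOD)
```
]]></preformat>
<p>Both comparisons hold for the tabulated values: the smallest monotrait&#x2013;heteromethod coefficient (0.48) is larger than the largest heterotrait coefficient in either block (0.37 and 0.36 in magnitude).</p>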
<p>The subsequent confirmatory MTMM analysis revealed acceptable values for RMSEA &#x3d; 0.06 and SRMR &#x3d; 0.08. In contrast, CFI &#x3d; 0.93 and TLI &#x3d; 0.91 were slightly below the range for acceptable model&#x20;fit.</p>
<p>
<xref ref-type="table" rid="T4">Table&#x20;4</xref> shows the group-dependent scores for each subscale. For the adapted NRS, the independent-samples <italic>t</italic>-test revealed a significant difference in ECL in favor of group 1 (lower ECL), in accordance with Hypothesis 3, whereas the CLS showed no group-specific differences. Both the NRS and the CLS indicated no differences between groups concerning ICL, in accordance with Hypothesis 4. Details of the test statistics can be found in <xref ref-type="table" rid="T4">Table&#x20;4</xref>.</p>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Group-specific results for both adapted scales.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left">Scale</th>
<th rowspan="2" align="center">Subscale</th>
<th rowspan="2" align="center">Group 1 (<italic>M(SD)</italic>)</th>
<th rowspan="2" align="center">Group 2 (<italic>M(SD)</italic>)</th>
<th colspan="3" align="center">
<italic>t</italic>-test</th>
</tr>
<tr>
<th align="center">
<italic>t</italic>
</th>
<th align="center">
<italic>df</italic>
</th>
<th align="center">
<italic>p</italic>
</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">NRS</td>
<td align="left">ICL</td>
<td align="char" char="(0">0.25 (0.15)</td>
<td align="char" char="(0">0.25 (0.15)</td>
<td align="char" char=".">0.00</td>
<td align="char" char=".">93.0</td>
<td align="char" char=".">1.00</td>
</tr>
<tr>
<td align="left"/>
<td align="left">ECL</td>
<td align="char" char="(0">0.14 (0.14)</td>
<td align="char" char="(0">0.21 (0.17)</td>
<td align="char" char=".">2.17</td>
<td align="char" char=".">89.3</td>
<td align="char" char=".">0.03</td>
</tr>
<tr>
<td align="left"/>
<td align="left">GCL</td>
<td align="char" char="(0">0.72 (0.19)</td>
<td align="char" char="(0">0.71 (0.16)</td>
<td align="char" char=".">&#x2212;0.43</td>
<td align="char" char=".">91.7</td>
<td align="char" char=".">0.67</td>
</tr>
<tr>
<td align="left">CLS</td>
<td align="left">ICL</td>
<td align="char" char="(0">0.20 (0.14)</td>
<td align="char" char="(0">0.21 (0.14)</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">92.9</td>
<td align="char" char=".">0.91</td>
</tr>
<tr>
<td align="left"/>
<td align="left">ECL</td>
<td align="char" char="(0">0.13 (0.09)</td>
<td align="char" char="(0">0.14 (0.11)</td>
<td align="char" char=".">0.44</td>
<td align="char" char=".">87.6</td>
<td align="char" char=".">0.66</td>
</tr>
<tr>
<td align="left"/>
<td align="left">GCL</td>
<td align="char" char="(0">0.68 (0.19)</td>
<td align="char" char="(0">0.67 (0.21)</td>
<td align="char" char=".">&#x2212;0.11</td>
<td align="char" char=".">91.0</td>
<td align="char" char=".">0.92</td>
</tr>
</tbody>
</table>
</table-wrap>
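<p>The non-integer degrees of freedom in Table&#x20;4 are consistent with a Welch-type correction for unequal variances. Assuming that reading, the following minimal sketch shows how a <italic>t</italic> value and its degrees of freedom could be computed from group means, standard deviations, and group sizes. The function name, the sign convention, and the example inputs are hypothetical and are not taken from the study.</p>
<preformat>
```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a Welch t-test computed from summary statistics.

    The sign convention (group 2 minus group 1) is an assumption chosen so
    that a higher mean in group 2 gives a positive t value.
    """
    # Squared standard error contributed by each group
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean2 - mean1) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical inputs, not values from the study
t_value, df_value = welch_t(0.0, 1.0, 10, 1.0, 1.0, 10)
print(round(t_value, 3), round(df_value, 1))
```
</preformat>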
<p>Furthermore, there were no significant correlations between the pretest results and the ICL-related subscales, either for the full pretest scores (NRS: <italic>r</italic>&#x20;&#x3d; &#x2212;0.12, <italic>p</italic>&#x20;&#x3d; 0.25; CLS: <italic>r</italic>&#x20;&#x3d; &#x2212;0.02, <italic>p</italic>&#x20;&#x3d; 0.82) or for the intervention-related items (NRS: <italic>r</italic>&#x20;&#x3d; &#x2212;0.07, <italic>p</italic>&#x20;&#x3d; 0.51; CLS: <italic>r</italic>&#x20;&#x3d; 0.01, <italic>p</italic>&#x20;&#x3d; 0.89). These results contradict Hypothesis&#x20;5.</p>
</sec>
<sec id="s3-3">
<title>Evaluation of Combined Scales</title>
<p>The items of both scales were pooled to derive a new combined scale. First, an exploratory factor analysis (EFA) was conducted to evaluate which items group together. Both the scree plot and a parallel analysis indicated a three-factorial structure. The items with the highest loadings conform to the types of load known from theory, although some items with lower loadings are not grouped in accordance with their intended subscale. <xref ref-type="table" rid="T5">Table&#x20;5</xref> displays the extracted factor loadings.</p>
<table-wrap id="T5" position="float">
<label>TABLE 5</label>
<caption>
<p>Results of the EFA for all items of both NRS and CLS.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Item</th>
<th align="center">Factor&#xa0;1&#xa0;interpreted&#xa0;as&#xa0;ICL</th>
<th align="center">Factor&#xa0;2&#xa0;interpreted&#xa0;as&#xa0;GCL</th>
<th align="center">Factor&#xa0;3&#xa0;interpreted&#xa0;as&#xa0;ECL</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">CLS-3</td>
<td align="char" char=".">
<bold>0.75</bold>
</td>
<td align="char" char=".">&#x2212;0.04</td>
<td align="char" char=".">0.03</td>
</tr>
<tr>
<td align="left">CLS-2</td>
<td align="char" char=".">
<bold>0.74</bold>
</td>
<td align="char" char=".">&#x2212;0.11</td>
<td align="char" char=".">0.05</td>
</tr>
<tr>
<td align="left">CLS-4</td>
<td align="char" char=".">
<bold>0.73</bold>
</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">&#x2212;0.01</td>
</tr>
<tr>
<td align="left">NRS-2</td>
<td align="char" char=".">
<bold>0.72</bold>
</td>
<td align="char" char=".">&#x2212;0.02</td>
<td align="char" char=".">&#x2212;0.16</td>
</tr>
<tr>
<td align="left">CLS-5</td>
<td align="char" char=".">
<bold>0.66</bold>
</td>
<td align="char" char=".">0.13</td>
<td align="char" char=".">0.14</td>
</tr>
<tr>
<td align="left">NRS-1</td>
<td align="char" char=".">
<bold>0.53</bold>
</td>
<td align="char" char=".">&#x2212;0.03</td>
<td align="char" char=".">&#x2212;0.16</td>
</tr>
<tr>
<td align="left">CLS-1</td>
<td align="char" char=".">
<bold>0.48</bold>
</td>
<td align="char" char=".">0.36</td>
<td align="char" char=".">0.25</td>
</tr>
<tr>
<td align="left">NRS-3</td>
<td align="char" char=".">
<bold>0.40</bold>
</td>
<td align="char" char=".">&#x2212;0.04</td>
<td align="char" char=".">0.22</td>
</tr>
<tr>
<td align="left">CLS-7</td>
<td align="char" char=".">
<bold>0.31</bold>
</td>
<td align="char" char=".">&#x2212;0.06</td>
<td align="char" char=".">0.27</td>
</tr>
<tr>
<td align="left">CLS-10</td>
<td align="char" char=".">&#x2212;0.05</td>
<td align="char" char=".">
<bold>0.95</bold>
</td>
<td align="char" char=".">0.02</td>
</tr>
<tr>
<td align="left">CLS-12</td>
<td align="char" char=".">0.02</td>
<td align="char" char=".">
<bold>0.82</bold>
</td>
<td align="char" char=".">&#x2212;0.11</td>
</tr>
<tr>
<td align="left">CLS-13</td>
<td align="char" char=".">&#x2212;0.09</td>
<td align="char" char=".">
<bold>0.79</bold>
</td>
<td align="char" char=".">0.13</td>
</tr>
<tr>
<td align="left">CLS-14</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">
<bold>0.72</bold>
</td>
<td align="char" char=".">&#x2212;0.01</td>
</tr>
<tr>
<td align="left">CLS-11</td>
<td align="char" char=".">0.10</td>
<td align="char" char=".">
<bold>0.62</bold>
</td>
<td align="char" char=".">&#x2212;0.16</td>
</tr>
<tr>
<td align="left">NRS-8</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">
<bold>0.48</bold>
</td>
<td align="char" char=".">&#x2212;0.21</td>
</tr>
<tr>
<td align="left">CLS-9</td>
<td align="char" char=".">0.10</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">
<bold>0.60</bold>
</td>
</tr>
<tr>
<td align="left">NRS-6</td>
<td align="char" char=".">0.27</td>
<td align="char" char=".">0.22</td>
<td align="char" char=".">
<bold>&#x2212;0.49</bold>
<xref ref-type="table-fn" rid="Tfn4">&#x2a;</xref>
</td>
</tr>
<tr>
<td align="left">NRS-7</td>
<td align="char" char=".">&#x2212;0.01</td>
<td align="char" char=".">0.23</td>
<td align="char" char=".">
<bold>&#x2212;0.48</bold>
<xref ref-type="table-fn" rid="Tfn4">&#x2a;</xref>
</td>
</tr>
<tr>
<td align="left">NRS-4</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">&#x2212;0.12</td>
<td align="char" char=".">
<bold>0.47</bold>
</td>
</tr>
<tr>
<td align="left">CLS-6</td>
<td align="char" char=".">0.31</td>
<td align="char" char=".">&#x2212;0.05</td>
<td align="char" char=".">
<bold>0.45</bold>
</td>
</tr>
<tr>
<td align="left">NRS-5</td>
<td align="char" char=".">0.23</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">
<bold>0.35</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Highest item loadings are given in&#x20;bold.</p>
</fn>
<fn id="Tfn4">
<label>&#x2a;</label>
<p>Negative loadings were not considered for combined scales.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>For the first new model (referred to as model 1), the three items with the highest (positive) loadings on each factor were included because they represent their respective factor most reliably. Hence, the ICL subscale consisted of the items CLS-2, CLS-3, and CLS-4, the ECL subscale consisted of CLS-9, NRS-4, and CLS-6, and the GCL subscale consisted of CLS-10, CLS-12, and CLS-13. The subsequent CFA revealed adequate to good model fit, CFI &#x3d; 0.98, TLI &#x3d; 0.98, RMSEA &#x3d; 0.05, and SRMR &#x3d;&#x20;0.06.</p>
<p>In this way, model 1 corresponds directly to the structure revealed by the EFA for the given dataset. In terms of validity, it therefore draws on evidence based on internal structure. The second model (referred to as model 2) aimed to integrate another source of evidence (evidence based on relation to other variables) by including those items in the ECL subscale that had proven to be sensitive toward the induced differences between the groups. Hence, for model 2, the same items as in model 1 were used to form the ICL and GCL subscales because of their high loadings. For the ECL subscale, we used the full ECL subscale of the NRS (NRS-3, NRS-4, and NRS-5) in order to incorporate the ability to detect a significant difference in terms of ECL. A subsequent confirmatory factor analysis also revealed adequate to good model fit, CFI &#x3d; 1.0, TLI &#x3d; 1.0, RMSEA &#x3d; 0.00, and SRMR &#x3d;&#x20;0.06.</p>
<p>Since both new models shared the same items for ICL and GCL, they reached the same (sufficient) level of reliability for these subscales, &#x3b1;<sub>c</sub>(ICL) &#x3d; 0.79 and &#x3b1;<sub>c</sub>(GCL) &#x3d; 0.90. They differed slightly concerning the reliability of their ECL subscales, &#x3b1;<sub>c</sub>(ECL, model 1) &#x3d; 0.54 and &#x3b1;<sub>c</sub>(ECL, model 2) &#x3d; 0.57, both of which remain below the desired cutoff value &#x3b1;<sub>c</sub> &#x3d; 0.70. Furthermore, the sensitivity toward group-specific differences in ECL appeared to be inherited from the source items: model 1 showed no significant difference, <italic>t</italic>(90.3) &#x3d; &#x2212;0.64 and <italic>p</italic>&#x20;&#x3d; 0.52, while model 2 reproduced the significant difference found with the full adapted NRS, <italic>t</italic>(89.3) &#x3d; 2.17 and <italic>p</italic>&#x20;&#x3d;&#x20;0.033.</p>
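<p>The reliability values reported here are Cronbach&#x2019;s alpha coefficients. As a minimal sketch of how such a coefficient can be computed from item-level ratings, the following example uses the standard formula based on item variances and the variance of the summed score. The function name and the example ratings are hypothetical and do not reproduce the analysis pipeline or the data of this study.</p>
<preformat>
```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(rows[0])  # number of items in the subscale
    # Sample variance of each item across participants
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed subscale score
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings (three participants, two items), not study data
print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 2))
```
</preformat>
<p>Because the coefficient depends on how strongly the items covary, a subscale whose items address different load-inducing elements can be expected to yield a lower value than a subscale with homogeneous items.</p>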
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<sec id="s4-1">
<title>Validity Based on Content</title>
<p>Both scales had to be adapted, and the CLS had to be expanded to fit the desired context. Since experimenting in STEM laboratory courses is commonly based on generating and interpreting measurement data, the measurement procedure and the corresponding quantities as well as their functional relationships and scientific laws are the main sources of the information that has to be processed in order to generate new knowledge structures. Especially concerning the adapted and expanded CLS, the item development included all those relevant sources of content-related complexity in the subscales dedicated to measuring ICL as well as GCL, whereas the items of the NRS merely consisted of general expressions. Hence, the adapted CLS appears to be slightly advantageous as a larger number of typical aspects of the learning scenario are directly addressed within the&#x20;items.</p>
<p>Following the concept of ECL as presented by CLT, processes that do not contribute to essential learning originate from irrelevant and distracting elements. These include language issues and presentation formats that demand unnecessary search processes and representational holding. While the CLS originally included text comprehension as a source of ECL, the adapted version was not expanded toward the presentation formats (e.g., by addressing distracting search processes in the items), though this was a specific part of the presented study. In this case, the adapted CLS could be limited in its ability to cover all relevant load-inducing aspects that learners face throughout the experimental procedure. In contrast, the NRS already addressed presentational aspects, which were retained for the adapted version.</p>
<p>In sum, all subscales covered relevant aspects of the learning environment, but each with a specific emphasis on particular instructional design aspects. Based on the item formulation, the adapted CLS seems to address ICL and GCL more precisely, while the adapted NRS seems to address ECL in a more sensitive way for the context of laboratory learning scenarios. Furthermore, this emphasizes a general need for developing and validating specific instruments that directly address the characteristics of learning scenarios and include all crucial load-inducing elements. A more general item formulation might be too abstract, which could result in participants not being able to relate the items to the given situation without being further introduced to the intention and the meaning of the respective scale (e.g., <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al., 2017</xref>).</p>
</sec>
<sec id="s4-2">
<title>Validity Evidence Based on Internal Structure</title>
<p>Concerning their internal consistency for the given dataset, the subscales of the adapted NRS cannot be seen as sufficiently reliable. Moreover, these low indices are far below those of the original work by <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al. (2017)</xref> and therefore challenge the benefits and appropriateness derived from the content analysis (<italic>Validity Based on Content</italic>). It is probable that the a priori specification of load-inducing content did not fit the subjective impressions of the learners during the experimentation phase. In contrast, the subscales for ICL and GCL of the adapted CLS show good internal consistency. Except for ECL, the values are in the range of the original work by <xref ref-type="bibr" rid="B34">Leppink et&#x20;al. (2013)</xref> or earlier adaptations of the scale (<xref ref-type="bibr" rid="B48">Thees et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B4">Andersen and Makransky, 2021a</xref>, <xref ref-type="bibr" rid="B5">Andersen and Makransky, 2021b</xref>). Here again, the insufficient reliability for ECL casts doubt on whether the items of this specific subscale are appropriate to measure the intended type of cognitive load. Especially in comparison with the findings of <xref ref-type="bibr" rid="B48">Thees et&#x20;al. (2020)</xref>, who used a very similar formulation of the ECL items in another scientific context (thermodynamics instead of electricity), these results challenge a broad applicability of a simple adaptation of the original CLS and raise the question of how to integrate context-specific sources of load while the overall pedagogical approach remains comparable (e.g., inquiry-based learning).</p>
<p>The results of the CFA also undermine the intended internal structure of each scale as the model fit indices do not provide sufficient formal evidence for the three assumed factors. Hence, the confirmatory analysis strengthens criticisms of the appropriateness of the three-factorial structure as intended during the item development. This might be a consequence of a rather small sample size because the conventional rule of thumb that the number of participants should be more than 10&#x20;times the number of items is only reached for the adapted NRS, but not for the CLS. Another limiting factor might be the reduction of the scale range from a 10-point to a six-point scale for the adapted&#x20;CLS.</p>
<p>In sum, these findings reveal that the intended internal structure of the instruments is not fully represented in the data, which constrains the interpretation of the single subscales. We must therefore reject the first hypothesis and question the appropriateness of the adapted scales to differentiate between three different types of load in the context of technology-enhanced laboratory courses. Although most fit indices of the CFAs only narrowly missed the acceptable range, which may be attributable to the small sample size, the low internal consistency of four out of six subscales remains the main issue for the internal structure.</p>
</sec>
<sec id="s4-3">
<title>Validity Based on Relations to Other Variables</title>
<p>Assuming the three-factorial structure of the scales as validated in various former studies, a traditional MTMM matrix based on a correlation table was analyzed. Although the reliability of all the subscales adapted from the NRS and of the ECL subscale adapted from the CLS was not sufficient, significant correlations and repeating patterns indicate convergent and discriminant validity between the two scales. This means that the corresponding subscales in both approaches show meaningful agreement and that each subscale can be distinguished from the others according to their interpretation as different types of cognitive load. These findings preliminarily emphasize the scales&#x2019; appropriateness as load-measuring instruments. However, the strength of evidence is limited due to missing cutoff values for the traditional interpretation of correlation patterns. Furthermore, the results could not be sufficiently reproduced by a confirmatory MTMM approach as not all indices indicate an acceptable model fit. Hence, although there are promising findings based on the traditional comparison of correlation patterns, we cannot provide sufficient formal evidence for convergent and discriminant validity, which means that the second hypothesis is not clearly supported by the data of the present study. Thus, the MTMM analysis does not confirm that both scales are directed at the same three latent variables.</p>
<p>Concerning the contrasted presentation formats, a sensitive scale was expected to reflect group-specific differences in ECL in favor of group 1. For the given dataset, only the adapted NRS revealed a significant difference between the two intervention groups. As expected, group 1 reported lower scores for ECL. Hence, the findings support the third hypothesis for the adapted NRS and emphasize it as the more sensitive scale toward the contrasted presentation formats and the accompanying load sources, i.e.,&#x20;the spatial split of related information elements. The missing sensitivity of the adapted CLS toward differences in ECL might be the consequence of a biased focus on language issues and an insufficient adaptation toward other load-inducing sources for this specific subscale. However, these findings are in accordance with a study conducted by <xref ref-type="bibr" rid="B43">Skulmowski and Rey (2020)</xref>, who also revealed that the NRS is more likely to detect differences in ECL than the CLS. In their research, the authors also argued that the original items of the CLS might focus too much on the verbal aspects of the learning scenario, while the NRS addresses information processing in a more generalized&#x20;way.</p>
<p>Neither adapted scale showed significant differences concerning ICL scores, which is in accordance with the intention to provide both groups with equal content, experimental setups, and representational forms of the measurement data. Hence, the fourth hypothesis is supported for both scales. However, there is no significant correlation between the scores of either ICL subscale and the specific or full prior knowledge scores, which contradicts the theory-based expectation that learners with lower prior knowledge will perceive a higher ICL. The missing correlation might indicate that learners&#x2019; prior knowledge was sufficient as a conceptual prerequisite to successfully conduct the experimental tasks. Nevertheless, the fifth hypothesis has to be rejected because this result does not support the compliance of the ICL subscales with the theoretical concept of ICL as defined by&#x20;CLT.</p>
<p>In sum, the direct comparison between the two adapted scales via the MTMM matrix suggests but does not establish convergent and discriminant validity because the confirmatory model fit provides insufficient support. The relation to the grouping variable for the given study highlights the adapted NRS as more sensitive toward differences in ECL, which is in accordance with previous findings. As expected, both scales reveal equal ICL ratings for both groups. However, the relation between ICL and prior knowledge could not be verified. Overall, the relation to other variables revealed mixed to rather unfavorable results as most of the underlying hypotheses had to be rejected.</p>
</sec>
<sec id="s4-4">
<title>Combined Scales</title>
<p>As the internal structure of each adapted scale remains challenged after considering evidence based on the reliability scores as well as after the CFAs and the MTMM approaches, we decided to construct a combined instrument based on the given item pool of both scales. As the first step, the EFA revealed a three-factorial structure for the combined dataset. In addition, the items with the highest factor loadings indicate accordance with the expected underlying latent variable so that the three factors can be interpreted as related to the three types of cognitive load (<xref ref-type="table" rid="T5">Table&#x20;5</xref>). Given the self-imposed restriction of using only three items per factor to obtain a concise scale, two models were derived that considered those items with the highest positive factor loadings and findings from validity evidence based on the relation to the grouping variable.</p>
<p>Both models showed acceptable to good model fit in subsequent CFAs concerning their three-factorial internal structure, emphasizing their capability to differentiate between the three types of load. While the subscales for ICL and GCL are equal in both models and consist of items from the adapted CLS, the models differ concerning the ECL subscale. While model 1 follows the ranking of factor loadings from the EFA, resulting in a mix of items from both the adapted NRS and the adapted CLS, model 2 inherits the full ECL subscale from the adapted NRS. This step is not based on the findings from the EFA, but respects the fact that this particular subscale was able to detect group-specific differences in ECL which are likely to exist in studies that contrast presentation formats to address well-known multimedia effects such as split-attention (<xref ref-type="bibr" rid="B42">Schroeder and Cenkci, 2018</xref>). Hence, model 2 constitutes a further development as it integrates validity evidence based on the relation to other variables.</p>
<p>However, both models still suffer from low internal consistency concerning ECL, which limits the weight that can be placed on the acceptable model fits. This issue might result from the fact that the items dedicated to measuring ECL cover different load-inducing elements such as data presentation or verbal components. Hence, they cannot be expected to contribute equally to the score, and so reaching a high internal consistency remains difficult. <xref ref-type="bibr" rid="B5">Andersen and Makransky (2021b)</xref> even considered ECL as a multidimensional variable, which presents a plausible reason for our low internal consistency findings. Ultimately, we follow the results of the preceding EFA by considering ECL as a unidimensional factor which addresses multiple learning-irrelevant elements.</p>
<p>In sum, model 2 is considered the best scale based on the given item pool and the given dataset. Concerning the content of the new subscales for ICL and ECL, the items refer to concrete aspects of the experimental tasks, i.e.,&#x20;those components that are determined a priori to be the basis of the learning process. Hence, the combination of both NRS and CLS showed that the most valuable items for the given dataset were taken from the CLS, but the NRS provided a meaningful supplement. Furthermore, the restriction to three items per subscale emphasizes the need to focus on those elements of the learning environment that learners necessarily have to deal with during the learning process.</p>
</sec>
<sec id="s4-5">
<title>Future Work</title>
<p>To address a wider range of technology-based learning scenarios, our adapted versions could be enhanced by integrating items from other adaptations. For example, <xref ref-type="bibr" rid="B4">Andersen and Makransky (2021a)</xref> included the term &#x201c;information display format&#x201d; as a source of load in their ECL subscale, which was based on the original CLS. This term would directly address the contrasted presentation formats in our study without any bias toward a certain technology. On the other hand, such general formulations require a clarification as to what they are referring to, such as by a short introduction prior to the subjective rating, where the term is specified for each intervention&#x20;group.</p>
<p>As most of the samples used in comparable studies consist of university students, studies validating the application of the considered scales in school contexts are missing. At the school level, learners are expected to have a different amount of prior knowledge and metacognitive skills. Hence, the measurement of cognitive load based on subjective experiences could be much more challenging (<xref ref-type="bibr" rid="B9">Br&#xfc;nken et&#x20;al., 2003</xref>; <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al., 2017</xref>). Therefore, the scales have to be adapted concerning the item formulation as well as the scale levels and the endpoint labeling. In addition, items on passive load (mental load) and active load (mental effort), developed by <xref ref-type="bibr" rid="B29">Klepsch and Seufert (2021)</xref>, could be added. The authors showed that the item on passive load related to the ICL factor of their scale and the item on active load related to the GCL factor. Klepsch and Seufert recommended the use of these additional items with children and tasks that require learners&#x2019; self-regulation (e.g., laboratory work). Such adaptations might demand further investigations toward validity evidence. Future work might consider expert ratings for the item content to strengthen the explanatory power of the content-related evidence (<xref ref-type="bibr" rid="B9">Br&#xfc;nken et&#x20;al., 2003</xref>; <xref ref-type="bibr" rid="B28">Klepsch et&#x20;al., 2017</xref>). Furthermore, it will be essential to validate the new and further developed scales with a large sample as well as to consider that further (back-) translations of the presented (German) scales might affect validity aspects.</p>
<p>In the present study, only some of all possible sources providing evidence for the validation of the cognitive load scales were examined. Future studies should not only experimentally manipulate the ECL but also systematically manipulate all three types of cognitive load and verify whether the developed scales can also reflect variations in the ICL and GCL. Manipulation of ICL could be achieved by contrasting laboratory tasks with different levels of complexity or by contrasting groups with different levels of prior knowledge (evidence based on the relation to other variables). However, previous research suggesting that a subject&#x2019;s ability to reliably differentiate between ICL and ECL depends on a sufficient level of prior knowledge (<xref ref-type="bibr" rid="B56">Zu et&#x20;al., 2021</xref>) should also be considered. GCL could be manipulated by providing or not providing self-regulation prompts during student experimentation.</p>
<p>Another option for analyzing validity evidence based on the relation to other variables could be a direct comparison between subjective ratings and objective measures such as eye-tracking data. Recent developments of mobile eye-tracking devices allow for collecting data in dynamic situations such as laboratory courses and might even be applied to augmented reality-based learning scenarios (<xref ref-type="bibr" rid="B24">Kapp et&#x20;al., 2021</xref>) so that various approaches to technology-enhanced learning scenarios can be accompanied by both the subjective rating scales and the objective gaze-based measures. Nevertheless, the interpretation should consider prior research indicating that there might be no linear relationship between objective and subjective measures but that they rather cover different facets of cognitive load (<xref ref-type="bibr" rid="B40">Minkley et&#x20;al., 2021</xref>).</p>
</sec>
<sec id="s4-6">
<title>Conclusion</title>
<p>In this article, we present supporting and critical points regarding the validity of two popular subjective cognitive load-rating scales in the context of technology-enhanced science experiments. Although the content of the adapted items seemed to be promising in terms of addressing various facets of the learning environment, the low internal consistency and the insufficient evidence for the intended three-factorial structure call into question the appropriateness of the adapted scales. However, based on the correlations between the subscales, there are various indications that the addressed latent variables (i.e.,&#x20;ICL, ECL, and GCL) are comparable in both scales and can be distinguished from each other. Again, these assumptions cannot be formally confirmed based on the given dataset. In sum, three of five deduced hypotheses toward different sources of evidence in terms of validity had to be rejected due to insufficient formal evidence. Hence, the results do not clearly favor either the adapted NRS or the adapted CLS, although both seem to be convincing regarding their content.</p>
<p>The interpretation of this conflict is twofold. First, for the learning context under investigation, we question the current state of the adapted scales as they are not appropriate to measure different types of cognitive load. This would explain the insufficient reliability and the insufficient model fits concerning the assumed internal structure. Second, one could alternatively assume that the items of both scales are capable of representing the real load-inducing elements, but each scale addresses some but not all facets of the learning environment. Hence, solely by combining both item pools, it was possible to reach an adequate scale (model 2). At this point, the advantages of both adapted scales were combined to form a promising new scale for the context of complex science learning scenarios (although this scale is not without its flaws). The internal consistency of the ECL subscales is not acceptable but can be made plausible via the inherent multiple aspects covered by the&#x20;items.</p>
<p>The presented study is an example of applying known and empirically validated scales to an essential and realistic learning scenario from STEM education. Since inquiry-based learning scenarios contain multiple information sources, researchers must develop new instruments to be able to correctly measure cognitive load. Moreover, the issues raised in the analyses show that it is necessary to seek validity evidence from different sources such as content, internal structure, and relation to other variables. In this sense, we want to encourage the community to contribute to the question of how to create valid and suitable questionnaires to determine cognitive load in specific complex learning scenarios.</p>
</sec>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="s6">
<title>Ethics Statement</title>
<p>Ethical review and approval were not required for the study on human participants in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>MT: conceptualization, methodology, formal analysis, investigation, writing, and supervision; SK: software, methodology, formal analysis, and investigation; KA: methodology and investigation; SM: conceptualization, methodology, and writing; RB: conceptualization, resources, and funding acquisition; JK: conceptualization, resources, writing, project administration, and funding acquisition.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>The dataset this paper draws upon was collected as part of the research projects GeAR (grant no. 01JD1811B) and gLabAssist (grant no. 16DHL1022), both funded by the German Federal Ministry of Education and Research (BMBF). The funding source had no involvement in preparing and conducting the study or in preparing the manuscript.</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book">
<collab>AERA</collab>
<collab>APA</collab>
<collab>NCME</collab> (<year>2011</year>). <source>Report and Recommendations for the Reauthorization of the Institute of Education Sciences</source>. <publisher-loc>Washington, D.C.</publisher-loc>: <publisher-name>American Educational Research Association</publisher-name>.</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altmeyer</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Kapp</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Thees</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Malone</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Kuhn</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Br&#xfc;nken</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>The Use of Augmented Reality to Foster Conceptual Knowledge Acquisition in STEM Laboratory Courses-Theoretical Background and Empirical Results</article-title>. <source>Br. J.&#x20;Educ. Technol.</source> <volume>51</volume>, <fpage>611</fpage>&#x2013;<lpage>628</lpage>. <pub-id pub-id-type="doi">10.1111/bjet.12900</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="book">
<collab>American Association of Physics Teachers</collab> (<year>2014</year>). <source>AAPT Recommendations for the&#x20;Undergraduate Physics Laboratory Curriculum,</source> <ext-link ext-link-type="uri" xlink:href="https://www.aapt.org/Resources/upload/LabGuidlinesDocument_EBendorsed_nov10.pdf">https://www.aapt.org/Resources/upload/LabGuidlinesDocument_EBendorsed_nov10.pdf</ext-link>.</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andersen</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Makransky</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021a</year>). <article-title>The Validation and Further Development of a Multidimensional Cognitive Load Scale for Virtual Environments</article-title>. <source>J.&#x20;Comput. Assist. Learn.</source> <volume>37</volume>, <fpage>183</fpage>&#x2013;<lpage>196</lpage>. <pub-id pub-id-type="doi">10.1111/jcal.12478</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andersen</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Makransky</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021b</year>). <article-title>The Validation and Further Development of the Multidimensional Cognitive Load Scale for Physical and Online Lectures (MCLS-POL)</article-title>. <source>Front. Psychol.</source> <volume>12</volume>, <fpage>642084</fpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2021.642084</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ayres</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sweller</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). &#x201c;<article-title>The Split-Attention Principle in Multimedia Learning</article-title>,&#x201d; in <source>The Cambridge Handbook of Multimedia Learning</source>. Editor <person-group person-group-type="editor">
<name>
<surname>Mayer</surname>
<given-names>R. E.</given-names>
</name>
</person-group>. <edition>Second edition</edition> (<publisher-loc>New York</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>206</fpage>&#x2013;<lpage>226</lpage>. </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baddeley</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>1992</year>). <article-title>Working Memory</article-title>. <source>Science</source> <volume>255</volume>, <fpage>556</fpage>&#x2013;<lpage>559</lpage>. <pub-id pub-id-type="doi">10.1126/science.1736359</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Becker</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Klein</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>G&#xf6;&#xdf;ling</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kuhn</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Using Mobile Devices to Enhance Inquiry-Based Learning Processes</article-title>. <source>Learn. Instruction</source> <volume>69</volume>, <fpage>101350</fpage>. <pub-id pub-id-type="doi">10.1016/j.learninstruc.2020.101350</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Br&#xfc;nken</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Plass</surname>
<given-names>J.&#x20;L.</given-names>
</name>
<name>
<surname>Leutner</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Direct Measurement of Cognitive Load in Multimedia Learning</article-title>. <source>Educ. Psychol.</source> <volume>38</volume>, <fpage>53</fpage>&#x2013;<lpage>61</lpage>. <pub-id pub-id-type="doi">10.1207/S15326985EP3801_7</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Burde</surname>
<given-names>J.-P.</given-names>
</name>
</person-group> (<year>2018</year>). <source>Konzeption und Evaluation eines Unterrichtskonzepts zu einfachen Stromkreisen auf Basis des Elektronengasmodells</source>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>Logos</publisher-name>. <pub-id pub-id-type="doi">10.30819/4726</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Campbell</surname>
<given-names>D. T.</given-names>
</name>
<name>
<surname>Fiske</surname>
<given-names>D. W.</given-names>
</name>
</person-group> (<year>1959</year>). <article-title>Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix</article-title>. <source>Psychol. Bull.</source> <volume>56</volume>, <fpage>81</fpage>&#x2013;<lpage>105</lpage>. <pub-id pub-id-type="doi">10.1037/h0046016</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Cohen</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>1988</year>). <source>Statistical Power Analysis for the Behavioral Sciences</source> <edition>2. ed.</edition> <publisher-loc>Hillsdale, NJ</publisher-loc>: <publisher-name>Erlbaum</publisher-name>.</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cowan</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>The Magical Number 4 in Short-Term Memory: A Reconsideration of Mental Storage Capacity</article-title>. <source>Behav. Brain Sci.</source> <volume>24</volume>, <fpage>87</fpage>&#x2013;<lpage>114</lpage>. <pub-id pub-id-type="doi">10.1017/s0140525x01003922</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>de Jong</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Linn</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Zacharia</surname>
<given-names>Z. C.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Physical and Virtual Laboratories in Science and Engineering Education</article-title>. <source>Science</source> <volume>340</volume>, <fpage>305</fpage>&#x2013;<lpage>308</lpage>. <pub-id pub-id-type="doi">10.1126/science.1230579</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>de Jong</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Moving towards Engaged Learning in STEM Domains; There Is No Simple Answer, but Clearly a Road Ahead</article-title>. <source>J.&#x20;Comput. Assist. Learn.</source> <volume>35</volume>, <fpage>153</fpage>&#x2013;<lpage>167</lpage>. <pub-id pub-id-type="doi">10.1111/jcal.12337</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eid</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>A Multitrait-Multimethod Model with Minimal Assumptions</article-title>. <source>Psychometrika</source> <volume>65</volume>, <fpage>241</fpage>&#x2013;<lpage>261</lpage>. <pub-id pub-id-type="doi">10.1007/bf02294377</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Etkina</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>van Heuvelen</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>White-Brahmia</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Brookes</surname>
<given-names>D. T.</given-names>
</name>
<name>
<surname>Gentile</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Murthy</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2006</year>). <article-title>Scientific Abilities and Their Assessment</article-title>. <source>Phys. Rev. ST Phys. Educ. Res.</source> <volume>2</volume>, <fpage>113</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevSTPER.2.020103</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hofstein</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lunetta</surname>
<given-names>V. N.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>The Laboratory in Science Education: Foundations for the Twenty-First Century</article-title>. <source>Sci. Ed.</source> <volume>88</volume>, <fpage>28</fpage>&#x2013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1002/sce.10106</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Husnaini</surname>
<given-names>S. J.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Effects of Guided Inquiry Virtual and Physical Laboratories on Conceptual Understanding, Inquiry Performance, Scientific Inquiry Self-Efficacy, and Enjoyment</article-title>. <source>Phys. Rev. Phys. Educ. Res.</source> <volume>15</volume>, <fpage>31</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevPhysEducRes.15.010119</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kalyuga</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Confirmatory Factor Analysis of Cognitive Load Ratings Supports a Two-Factor Model</article-title>. <source>TQMP</source> <volume>16</volume>, <fpage>216</fpage>&#x2013;<lpage>225</lpage>. <pub-id pub-id-type="doi">10.20982/tqmp.16.3.p216</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kalyuga</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Cognitive Load Theory: How Many Types of Load Does it Really Need?</article-title> <source>Educ. Psychol. Rev.</source> <volume>23</volume>, <fpage>1</fpage>&#x2013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.1007/s10648-010-9150-7</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kane</surname>
<given-names>M. T.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Validating the Interpretations and Uses of Test Scores</article-title>. <source>J.&#x20;Educ. Meas.</source> <volume>50</volume>, <fpage>1</fpage>&#x2013;<lpage>73</lpage>. <pub-id pub-id-type="doi">10.1111/jedm.12000</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kapici</surname>
<given-names>H. O.</given-names>
</name>
<name>
<surname>Akcay</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>de Jong</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Using Hands-On and Virtual Laboratories Alone or Together&#x2015;Which Works Better for Acquiring Knowledge and Skills?</article-title> <source>J.&#x20;Sci. Educ. Technol.</source> <volume>28</volume>, <fpage>231</fpage>&#x2013;<lpage>250</lpage>. <pub-id pub-id-type="doi">10.1007/s10956-018-9762-0</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kapp</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Barz</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Mukhametov</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sonntag</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kuhn</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>ARETT: Augmented Reality Eye Tracking Toolkit for Head Mounted Displays</article-title>. <source>Sensors</source> <volume>21</volume>, <fpage>2234</fpage>. <pub-id pub-id-type="doi">10.3390/s21062234</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Kapp</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Thees</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Beil</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Weatherby</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Burde</surname>
<given-names>J.-P.</given-names>
</name>
<name>
<surname>Wilhelm</surname>
<given-names>T.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). &#x201c;<article-title>The Effects of Augmented Reality: A Comparative Study in an Undergraduate Physics Laboratory Course</article-title>,&#x201d; in <conf-name>Proceedings of the 12th International Conference on&#x20;Computer Supported Education</conf-name>, <conf-date>May 2&#x2013;4, 2020</conf-date> (<publisher-name>SciTePress&#x20;- Science and Technology Publications</publisher-name>), <volume>Vol. 2</volume>, <fpage>197</fpage>&#x2013;<lpage>206</lpage>. <pub-id pub-id-type="doi">10.5220/0009793001970206</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kester</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Kirschner</surname>
<given-names>P. A.</given-names>
</name>
<name>
<surname>van Merri&#xeb;nboer</surname>
<given-names>J.&#x20;J.&#x20;G.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>The Management of Cognitive Load during Complex Cognitive Skill Acquisition by Means of Computer-Simulated Problem Solving</article-title>. <source>Br. J.&#x20;Educ. Psychol.</source> <volume>75</volume>, <fpage>71</fpage>&#x2013;<lpage>85</lpage>. <pub-id pub-id-type="doi">10.1348/000709904X19254</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kester</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Paas</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>van Merri&#xeb;nboer</surname>
<given-names>J.&#x20;J.&#x20;G.</given-names>
</name>
</person-group> (<year>2010</year>). &#x201c;<article-title>Instructional Control of Cognitive Load in the Design of Complex Learning Environments</article-title>,&#x201d; in <source>Cognitive Load Theory</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Plass</surname>
<given-names>J.&#x20;L.</given-names>
</name>
<name>
<surname>Moreno</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Br&#xfc;nken</surname>
<given-names>R.</given-names>
</name>
</person-group> (<publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>109</fpage>&#x2013;<lpage>130</lpage>. </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klepsch</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Schmitz</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Seufert</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Development and Validation of Two Instruments Measuring Intrinsic, Extraneous, and Germane Cognitive Load</article-title>. <source>Front. Psychol.</source> <volume>8</volume>, <fpage>1997</fpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2017.01997</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klepsch</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Seufert</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Making an Effort versus Experiencing Load</article-title>. <source>Front. Educ.</source> <volume>6</volume> (<issue>56</issue>). <pub-id pub-id-type="doi">10.3389/feduc.2021.645284</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klepsch</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Seufert</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Understanding Instructional Design Effects by Differentiated Measurement of Intrinsic, Extraneous, and Germane Cognitive Load</article-title>. <source>Instr. Sci.</source> <volume>48</volume>, <fpage>45</fpage>&#x2013;<lpage>77</lpage>. <pub-id pub-id-type="doi">10.1007/s11251-020-09502-9</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kline</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2000</year>). <source>The Handbook of Psychological Testing</source>. <edition>2. ed</edition>. <publisher-loc>London</publisher-loc>: <publisher-name>Routledge</publisher-name>.</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krell</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Evaluating an Instrument to Measure Mental Load and Mental Effort Considering Different Sources of Validity Evidence</article-title>. <source>Cogent Edu.</source> <volume>4</volume>, <fpage>1280256</fpage>. <pub-id pub-id-type="doi">10.1080/2331186X.2017.1280256</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lazonder</surname>
<given-names>A. W.</given-names>
</name>
<name>
<surname>Harmsen</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Meta-Analysis of Inquiry-Based Learning</article-title>. <source>Rev. Educ. Res.</source> <volume>86</volume>, <fpage>681</fpage>&#x2013;<lpage>718</lpage>. <pub-id pub-id-type="doi">10.3102/0034654315627366</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leppink</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Paas</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>van der Vleuten</surname>
<given-names>C. P. M.</given-names>
</name>
<name>
<surname>van Gog</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>van Merri&#xeb;nboer</surname>
<given-names>J.&#x20;J.&#x20;G.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Development of an Instrument for Measuring Different Types of Cognitive Load</article-title>. <source>Behav. Res.</source> <volume>45</volume>, <fpage>1058</fpage>&#x2013;<lpage>1072</lpage>. <pub-id pub-id-type="doi">10.3758/s13428-013-0334-1</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leppink</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Paas</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>van Gog</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>van der Vleuten</surname>
<given-names>C. P. M.</given-names>
</name>
<name>
<surname>van Merri&#xeb;nboer</surname>
<given-names>J.&#x20;J.&#x20;G.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Effects of Pairs of Problems and Examples on Task Performance and Different Types of Cognitive Load</article-title>. <source>Learn. Instruction</source> <volume>30</volume>, <fpage>32</fpage>&#x2013;<lpage>42</lpage>. <pub-id pub-id-type="doi">10.1016/j.learninstruc.2013.12.001</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lunetta</surname>
<given-names>V. N.</given-names>
</name>
<name>
<surname>Hofstein</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Clough</surname>
<given-names>M. P.</given-names>
</name>
</person-group> (<year>2005</year>). &#x201c;<article-title>Learning and Teaching in the School Science Laboratory: An Analysis of Research, Theory, and Practice</article-title>,&#x201d; in <source>Handbook of Research on Science Education</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Abell</surname>
<given-names>S. K.</given-names>
</name>
<name>
<surname>Lederman</surname>
<given-names>N. G.</given-names>
</name>
</person-group> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Lawrence Erlbaum; Routledge</publisher-name>), <fpage>393</fpage>&#x2013;<lpage>441</lpage>. </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mayer</surname>
<given-names>R. E.</given-names>
</name>
<name>
<surname>Moreno</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>A Split-Attention Effect in Multimedia Learning: Evidence for Dual Processing Systems in Working Memory</article-title>. <source>J.&#x20;Educ. Psychol.</source> <volume>90</volume>, <fpage>312</fpage>&#x2013;<lpage>320</lpage>. <pub-id pub-id-type="doi">10.1037/0022-0663.90.2.312</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mayer</surname>
<given-names>R. E.</given-names>
</name>
<name>
<surname>Moreno</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Nine Ways to Reduce Cognitive Load in Multimedia Learning</article-title>. <source>Educ. Psychol.</source> <volume>38</volume>, <fpage>43</fpage>&#x2013;<lpage>52</lpage>. <pub-id pub-id-type="doi">10.1207/s15326985ep3801_6</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Mayer</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Fiorella</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2014</year>). &#x201c;<article-title>Principles for Reducing Extraneous Processing in Multimedia Learning: Coherence, Signaling, Redundancy, Spatial Contiguity, and Temporal Contiguity Principles</article-title>,&#x201d; in <source>The Cambridge Handbook of Multimedia Learning</source>. Editor <person-group person-group-type="editor">
<name>
<surname>Mayer</surname>
<given-names>R. E.</given-names>
</name>
</person-group>. <edition>Second edition</edition> (<publisher-loc>New York</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>279</fpage>&#x2013;<lpage>315</lpage>. </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Minkley</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>K. M.</given-names>
</name>
<name>
<surname>Krell</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Analyzing Relationships between Causal and Assessment Factors of Cognitive Load: Associations between Objective and Subjective Measures of Cognitive Load, Stress, Interest, and Self-Concept</article-title>. <source>Front. Educ.</source> <volume>6</volume> (<issue>56</issue>). <pub-id pub-id-type="doi">10.3389/feduc.2021.632907</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paas</surname>
<given-names>F. G. W. C.</given-names>
</name>
</person-group> (<year>1992</year>). <article-title>Training Strategies for Attaining Transfer of Problem-Solving Skill in Statistics: A Cognitive-Load Approach</article-title>. <source>J.&#x20;Educ. Psychol.</source> <volume>84</volume>, <fpage>429</fpage>&#x2013;<lpage>434</lpage>. <pub-id pub-id-type="doi">10.1037/0022-0663.84.4.429</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schroeder</surname>
<given-names>N. L.</given-names>
</name>
<name>
<surname>Cenkci</surname>
<given-names>A. T.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Spatial Contiguity and Spatial Split-Attention Effects in Multimedia Learning Environments: A Meta-Analysis</article-title>. <source>Educ. Psychol. Rev.</source> <volume>30</volume>, <fpage>679</fpage>&#x2013;<lpage>701</lpage>. <pub-id pub-id-type="doi">10.1007/s10648-018-9435-9</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Skulmowski</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Rey</surname>
<given-names>G. D.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Subjective Cognitive Load Surveys Lead to Divergent Results for Interactive Learning Media</article-title>. <source>Hum. Behav. Emerg. Tech</source>. <volume>2</volume>, <fpage>149</fpage>&#x2013;<lpage>157</lpage>. <pub-id pub-id-type="doi">10.1002/hbe2.184</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sweller</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Cognitive Load Theory and Educational Technology</article-title>. <source>Education Tech. Res. Dev</source>. <volume>68</volume>, <fpage>1</fpage>&#x2013;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1007/s11423-019-09701-3</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sweller</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Element Interactivity and Intrinsic, Extraneous, and Germane Cognitive Load</article-title>. <source>Educ. Psychol. Rev.</source> <volume>22</volume>, <fpage>123</fpage>&#x2013;<lpage>138</lpage>. <pub-id pub-id-type="doi">10.1007/s10648-010-9128-5</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sweller</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>van Merri&#xeb;nboer</surname>
<given-names>J.&#x20;J.&#x20;G.</given-names>
</name>
<name>
<surname>Paas</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Cognitive Architecture and Instructional Design: 20&#x20;Years Later</article-title>. <source>Educ. Psychol. Rev.</source> <volume>31</volume>, <fpage>261</fpage>&#x2013;<lpage>292</lpage>. <pub-id pub-id-type="doi">10.1007/s10648-019-09465-5</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sweller</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>van Merri&#xeb;nboer</surname>
<given-names>J.&#x20;J.&#x20;G.</given-names>
</name>
<name>
<surname>Paas</surname>
<given-names>F. G. W. C.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Cognitive Architecture and Instructional Design</article-title>. <source>Educ. Psychol. Rev.</source> <volume>10</volume>, <fpage>251</fpage>&#x2013;<lpage>296</lpage>. <pub-id pub-id-type="doi">10.1023/a:1022193728205</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thees</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Kapp</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Strzys</surname>
<given-names>M. P.</given-names>
</name>
<name>
<surname>Beil</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Lukowicz</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Kuhn</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Effects of Augmented Reality on Learning and Cognitive Load in University Physics Laboratory Courses</article-title>. <source>Comput. Hum. Behav.</source> <volume>108</volume>, <fpage>106316</fpage>. <pub-id pub-id-type="doi">10.1016/j.chb.2020.106316</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Trumper</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>The Physics Laboratory - A Historical Overview and Future Perspectives</article-title>. <source>Sci. Edu.</source> <volume>12</volume>, <fpage>645</fpage>&#x2013;<lpage>670</lpage>. <pub-id pub-id-type="doi">10.1023/a:1025692409001</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Urban-Woldron</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Hopf</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Entwicklung eines Testinstruments zum Verst&#xe4;ndnis in der Elektrizit&#xe4;tslehre [Development of a diagnostic instrument for testing student understanding of basic electricity concepts]</article-title>. <source>Z. f&#xfc;r Didaktik der Naturwissenschaften</source> <volume>18</volume>, <fpage>201</fpage>&#x2013;<lpage>227</lpage>. </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Volkwyn</surname>
<given-names>T. S.</given-names>
</name>
<name>
<surname>Allie</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Buffler</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lubben</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Impact of a Conventional Introductory Laboratory Course on the Understanding of Measurement</article-title>. <source>Phys. Rev. ST Phys. Educ. Res.</source> <volume>4</volume>, <fpage>4</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevSTPER.4.010108</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Vosniadou</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2008</year>). <source>International Handbook of Research on Conceptual Change</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>Routledge</publisher-name>.</citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wilcox</surname>
<given-names>B. R.</given-names>
</name>
<name>
<surname>Lewandowski</surname>
<given-names>H. J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Developing Skills versus Reinforcing Concepts in Physics Labs: Insight from a Survey of Students&#x27; Beliefs about Experimental Physics</article-title>. <source>Phys. Rev. Phys. Educ. Res.</source> <volume>13</volume>, <fpage>65</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevPhysEducRes.13.010108</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zacharia</surname>
<given-names>Z. C.</given-names>
</name>
<name>
<surname>de Jong</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>The Effects on Students&#x27; Conceptual Understanding of Electric Circuits of Introducing Virtual Manipulatives within a Physical Manipulatives-Oriented Curriculum</article-title>. <source>Cogn. Instruction</source> <volume>32</volume>, <fpage>101</fpage>&#x2013;<lpage>158</lpage>. <pub-id pub-id-type="doi">10.1080/07370008.2014.887083</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zacharia</surname>
<given-names>Z. C.</given-names>
</name>
<name>
<surname>Olympiou</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Physical versus Virtual Manipulative Experimentation in Physics Learning</article-title>. <source>Learn. Instruction</source> <volume>21</volume>, <fpage>317</fpage>&#x2013;<lpage>331</lpage>. <pub-id pub-id-type="doi">10.1016/j.learninstruc.2010.03.001</pub-id> </citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zu</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Munsell</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Rebello</surname>
<given-names>N. S.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Subjective Measure of Cognitive Load Depends on Participants&#x27; Content Knowledge Level</article-title>. <source>Front. Educ.</source> <volume>6</volume> (<issue>56</issue>). <pub-id pub-id-type="doi">10.3389/feduc.2021.647097</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>