Edited by: Lisa Oakes, University of California, United States
Reviewed by: Jenny Saffran, University of Wisconsin-Madison, United States; Keith Apfelbaum, The University of Iowa, United States
†These authors have contributed equally to this work and share first authorship
This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
With increased public access to the Internet and digital tools, web-based research has gained prevalence over the past decades. However, digital adaptations for developmental research involving children have received relatively little attention. In 2020, as the COVID-19 pandemic led to reduced social contact, causing many university developmental research laboratories to close, the scientific community began to investigate online research methods that would allow continued work. Limited resources and documentation of factors that are essential for developmental research (e.g., caregiver involvement, informed assent, controlling environmental distractions at home for children) make the transition from in-person to online research especially difficult for developmental scientists. Recognizing this, we aim to contribute to the field by describing three separate moderated virtual behavioral assessments in children ranging from 4 to 13 years of age that were highly successful. The three studies encompass speech production, speech perception, and reading fluency. Despite the varied domains, the different age groups targeted by each study, and the different methodological approaches, the success of our virtual adaptations shared certain commonalities with regard to how to achieve informed consent, how to plan parental involvement, how to design studies that attract and hold children’s attention, and how to ensure valid data collection procedures. Our combined work suggests principles for future facilitation of online developmental work. Considerations derived from these studies can serve as documented points of departure that inform and encourage additional virtual adaptations in this field.
Over the past decades, technological advancements have expanded the scale and scope of academic research. A body of literature between 1995 and 2005 proposed a series of benefits and disadvantages associated with the initial wave of Internet-based research (
In 2020, as the COVID-19 pandemic led to reduced social contact, causing many research laboratories to close, the scientific community began to investigate online research methods that would allow continued work. Remote, digital modalities have been recognized as viable substitutions for in-person research settings (
However, shifting from in-person to remote modalities is not without challenges. For example,
Although solutions have been proposed to address some of the challenges (
The three studies included in this paper sought to adapt their original in-person task designs for remote facilitation with researcher moderation. While the moderated format was appropriate for these studies, both moderated and unmoderated designs have their pros and cons, and we encourage developmental scientists to make deliberate decisions about the degree of moderation when facilitating online child studies. Compared to moderated studies, unmoderated or fully automated studies are less work-intensive during the research appointments, but they may require more preparation work in task automation and involve additional steps of data processing. Elimination (or lessening) of researcher involvement is advantageous in bias removal, as it is often replaced by consistent machine-delivered instructions. This facilitates the comparison across replications of unmoderated studies (
Ethics of non-therapeutic research involving children are a delicate issue, as children are vulnerable and would likely not benefit directly from participation (
For many virtual studies, using online applications, such as REDCap, is an appealing way to collect e-consent and to build and manage online databases. Many web tools come with built-in privacy measures, allowing digital consent to be completed efficiently and stored securely. On platforms such as Pavlovia and Gorilla, documentation of major identifying information can stay detached from research data, and it is often possible to record the consent process and data collection separately (
Protecting participants’ privacy and data confidentiality is among the top priorities in human subject research. Remote consent processes in recent years have shown varying formats. Some researchers opt for digital acquisition of text-based consent
In addition, experimental stimuli and research data that are delivered and collected digitally are subject to additional ethical scrutiny, specifically regarding data security. Some study designs may require transportation of research equipment or digital transfer of data files. In these cases, encrypting the devices and data files (e.g., using passwords or proprietary software) can significantly lower security risks, and related considerations are growing in prominence as new technologies increasingly deliver utility in research methods. As our capabilities are being enhanced rapidly, the scientific community needs to continually assess the implications of technologically enabled advancements in human subject research.
Additionally, experimental control concerns are present in traditional research settings and are even more pronounced in virtual environments. For example, whereas it is fairly straightforward to manipulate the acoustic environment in a laboratory’s sound booth, it is impossible to obtain the same level of control in participants’ homes. A realistic attempt would be to instruct caregivers to prepare a “quiet room” for the research appointment. In addition to audible noises, families may have different levels of visual and tactile distractions at home (e.g., siblings or pets). Furthermore, unless experimental equipment is specified or provided for the participants, technical device differences (e.g., headphones, Internet connection stability, screen sizes) also need to be considered.
Probably one of the main reasons for the slow move to online research in developmental work is that experimental designs involving children are typically more complex than those involving adults. A major challenge for child development researchers is how to best engage participants, remove distractions, and motivate participation given age-specific attention spans.
Interactions between the participant and researcher may be helpful in maintaining the child’s interest level. Developmental research studies, especially ones targeting auditory or visual perception, can benefit from researcher observation even if the task itself is fully automated. In a moderated session, the researcher-observer would be able to note any circumstances or issues that might come up and adjust as needed, whether it be troubleshooting technical difficulties, regulating caregiver involvement, clarifying task instructions, or introducing necessary breaks.
Adapting developmental research for online environments inevitably introduces tangible changes to a study’s experimental design and setup, but perhaps equally important is its impact on a socio-psychological aspect of human subject research, the researcher-participant relationship. Traditionally in a laboratory environment, face-to-face interactions can often motivate participation. While social interactions through a screen are often perceived as “flattened” and cannot fully replace their in-person counterparts, it is still possible to enhance researcher-participant relationships and to foster participant engagement and motivation through researcher moderation of remote studies. Notably, in studies involving children who struggle with unfamiliar surroundings (e.g., children with autism), the introduction of a stranger (i.e., the researcher) and a new environment (i.e., the laboratory) can be intimidating at times and interfere with the validity in data collection. In these cases, virtual assessment is an especially advantageous alternative, as it allows for in-home research participation, and can reduce or remove the perception of stranger interaction (
Given the variety of developmental behavioral work and the limited resources for online adaptations available, questions arise regarding the validity of these adaptations. Several attempts have been made to compare in-person and remote work (
A spectrum of options is available for administering this assessment
As such, interpretation of these norms when moved online should be deliberated prior to implementation.
In this paper, three different virtual studies will be discussed. Each study was initially conceived and developed for in-person environments and subsequently moved online. The original laboratory-based research plans will be summarized, along with adaptations made to enable remote facilitation. The studies targeted different questions and distinct age groups, which led to different approaches. Although the results of these studies are very promising and will each contribute to their field independently, the focal point of this paper is the adaptations we made to the three studies (Section “Procedural Modifications for Online Studies”), our data regarding their success and validity (Section “Methods; Developing Remote-Friendly Measures for Moderated, Developmental Studies”), and our resulting perspective on future implementations of virtual studies (Section “Discussion”). Through this paper, our ultimate aim is to motivate a continuance of remote developmental research, post-pandemic.
To represent the vast array of developmental research in this paper, we selected three distinct studies that varied in research goals and participants’ demographics. An Imitation study (see Section “Assessment of Vocal Imitation of Native and Nonnative Vowels (Cai and Kuhl, in Prep.)”) focusing on speech acquisition (age 4), an Audiovisual (AV) study (see Section “Audiovisual Speech Processing in Relationship to Phonological and Vocabulary Skills (Gijbels et al., in Press.)”) focusing on speech perception (age 6–7), and a Reading study (see Section “A Symbolic Annotation of Vowel Sounds for Emerging Readers (
Vast differences have been observed in second language (L2) learners’ ability to imitate novel sounds – while the majority of learners exhibit and maintain a foreign accent throughout their lifetime, some are able to produce accurate L2 pronunciations to a near-native level. These individual differences have been previously characterized as largely innate and fixed (
In this study, we investigated four-year-old typically developing (TD) children’s (
A laboratory-based format of this study was carried out during the initial pilot phase. Upon arrival at the laboratory, parents were first asked to complete a questionnaire, which surveyed environmental factors such as socio-economic status and language background. Then, the speech imitation task involving child participants was administered
In response to the public health crisis posed by COVID-19, the study was adapted digitally to accommodate remote testing. We modified the parental survey format, the protocol for parental involvement, and the means of video and audio recording of experimental sessions. Parental questionnaires were conducted digitally using a secure online portal, and the speech imitation task took place over Zoom. In the modified, online version of the imitation task, instead of plush puppets, participants interacted with animal cartoon characters on the researcher’s computer screen (
Visualization of the Imitation Study. Digital cartoon animation adapted from in-lab puppet theater setup, used to deliver auditory stimuli remotely via Zoom during the imitation task. Speech data collected via LENA vests and recorders worn by child participants (Cai and Kuhl, in prep).
Online adaptations of the study were successfully implemented. Forty-six out of 57 participating subjects were included in the analysis, with a resulting total of over 7,000 utterances examined, and audio files retrieved from the LENA recorders provided adequate acoustic information for the purpose of vowel formant analysis (see “Validity of online adaptations”).
The benefits of audiovisual (AV) speech perception, more specifically, having access to the (
The specific aims of this study were to assess (1) whether TD children in first grade (
Visualization of the AV Study. Experimental setup of the AV Study in both laboratory and online settings. Audio-only, audiovisual, or visual-only stimulus presentation of one-syllable words in speech-weighted noise, followed by a 4AFC answer screen (
In a laboratory setting, we would measure individuals’ behavioral and psychophysical performance in a quiet, controlled environment (i.e., sound booth) to ensure the reliability of stimuli and response. Conducting the tasks in a quiet room in the laboratory provides the opportunity to assess baseline control of hearing thresholds and visual acuity, eliminates potential interference (e.g., background noise), avoids unintended asynchrony of auditory and visual stimuli, and maintains exact output levels and quality of all stimuli using a calibrated computer. It also allows interpretation of normed behavioral tests, as they can be assessed according to the manual. Interference from parents would be limited as they would wait in the waiting room and instructions and assessment would be provided by a trained research assistant.
For both the in-person and the online version of the experiment, the stimulus presentation followed by 4AFC answer options would look identical. Also, the number of breaks (stimulus blocks) and catch trials were kept consistent. However, to move the tasks to a virtual environment, the tools for stimulus presentation (i.e., assessment format), data interpretation methods, and parental involvement had to be re-envisioned. Participants would complete the tasks at home, in front of their personal computer in a varied environment (i.e., background noise). Parents were instructed before and during the moderated session to provide a “controlled” and consistent environment. They would act as technical support and report any technical hiccups that arose, but also take over tasks that the research assistant would normally perform in the laboratory (e.g., controlling the mouse when the child had insufficient computer skills). Parents would provide information about hearing and vision of the participant
Although there is an extensive market for educational technologies for literacy (
This study investigated the efficacy of an educational technology to support literacy in 8- to 13-year-old struggling readers (
Visualization of the Reading Study
The digital literacy app studied was aimed at supporting phonological decoding for both isolated word reading and connected text fluency. In the laboratory research setting, instruction for both child participants and caregivers occurred in-person, with shared attention to teaching materials and a blend of digital/hardcopy materials to maximize learning. Moreover, assessment involved the use of a standard device (tablet) that reduced variability and controlled for potential issues of screen size, resolution, font size, and Internet connectivity. During the first session, all participants (3 groups) completed baseline tests in an uncued condition (without the
By moving to a virtual setting, the methodology was amended with impacts to the training program, the approach to assessment, and investments in device distribution. Whereas we could provide the same tablet for all participants in the laboratory, we now offered children the use of their own tablet if preferred. Additionally, all tests were presented digitally, whereas they were on paper for the in-person version. This required extra measures to ensure digital consistency in the visual presentation of reading passages and test materials. These adjustments extended the online visits, with a prolonged start to ensure adequate assessment. Another time-intensive aspect was moving the training instructions online. Where initially, the child (and parent) would share a view of the tablet with the researcher who guided them through the app, visually and verbally, they now had to be guided verbally
In order to align with remote research modalities, critical adjustments were made to each study (See
To ensure participants’ understanding, in-person consent procedures are commonly guided by researchers, providing time and space to emphasize or clarify information on the informed consent, such as affirming the participant’s right to withdraw from the study at any time, as well as to address questions and concerns from participants. Comparable procedures can be carried out in virtual studies. Video conferencing platforms (e.g., Zoom, Microsoft Teams, Google Meet) have brought well-appreciated convenience in enabling researchers to moderate consent procedures and online tasks. However, certain privacy and security issues have also been exposed amid the soaring popularity of these platforms. While such issues are heavily dependent upon the individual software’s safety protocol, much responsibility in protecting research subjects lies with institutions and researchers. In our three studies, the research appointments were conducted over Zoom. For online security purposes, we assigned passcodes and enabled an online waiting room, and at the start of each appointment, we reviewed our video/audio protocol with the participant’s caregiver to ensure comprehension of informed consent.
In the Imitation and Reading studies, the majority of the caregivers signed the consent forms prior to the behavioral assessments. For those who were unable to, time was allocated at the beginning of the sessions to address questions and complete consent procedures. In the AV study, parental consent and child assent were both collected
For child studies in laboratory settings, caregiver involvement is often minimized. During the in-person pilot phase of the Imitation study, parents of the 4-year-old participants were invited to view the experimental process from an observation room. In the original designs of the AV and Reading studies involving older children, parents would be asked to stay in a neighboring waiting room or sit at a distance in the experimental room while the study was in session. These strategies removed possible confounds related to caregiver involvement during the task and allowed parents of younger children to monitor the task process and to attend to the children’s needs. When moving these studies online, the caregiver was advised to stay with or near the child during the appointments. Additionally, caregiver roles varied by participants’ age. Among younger children, physical assistance from a parent is often necessary for task completion. For instance, to enhance participant compliance, it is typically recommended for a toddler to sit on the parent’s lap or beside the parent in front of the computer, whereas older children tend to have sufficient self-control to perform tasks with less caregiver involvement.
Specifying the role of caregivers in our studies was not only critical to ensuring proper consent, privacy, and children’s comfort, but also helped control parental involvement across families. As such, it was crucial for caregivers to be briefed on research procedures prior to the appointment. In order to uncover the role of caregiver-supervised practice, the Reading study implemented two training/practice conditions: unsupervised, independent reading and supervised, dyadic reading with a caregiver. In the online implementation, this involved providing consistent instructions for caregivers both during laboratory visits and at-home practice sessions. It was also important for caregivers to know what
Typically, parental feedback and parent-guided responses are discouraged in child studies. However, the challenge caused by the unpredictability of caregiver involvement in remote environments can be blunted by deciding prior to the experimental data collection whether parents would assist the child. Because our studies were moderated, researchers could observe participants and parents and, as required, instruct parents regarding their participation. In addition to parents receiving instructions at the beginning of each appointment, built-in training phases (as in the Imitation and AV studies) allowed instructions to be repeated to ensure adherence. In addition to the detailed protocols that were verbally communicated to parents prior to the appointments, the research team of the Imitation study also mailed a hardcopy flowchart to help visualize the task procedures. Lastly, because parents are often tempted to help their child “succeed” when the child struggles with a task, particularly on the more complex items that occur in certain trials, reminder instructions regarding parental intervention were presented throughout the tasks as well.
An additional concern raised with parents is the timing of online appointments. Because they take place in participants’ homes, scheduling has to factor in families’ daily routines and the degree to which it is possible to participate without interruptions. When scheduling virtual appointments, our research teams recommended that parents consider potential distractions throughout a given day and highlighted the importance of creating a quiet environment. We also encouraged parents to schedule appointments when a second caregiver is available to attend to other family members (such as pets and other children), leaving the participant and one parent fully attentive during the appointment. Since home environments are inevitably more distracting (
Specifically, the Imitation and Reading studies involved providing electronic equipment for participants. To achieve excellent control of audio recordings across participants in the Imitation study, we mailed participants audio recorders, which enabled field recordings of speech production during virtual appointments. Similarly, inherent to the Reading study’s format as a longitudinal experiment with an in-home training component, ensuring access to similar equipment (i.e., touchscreen tablets) was particularly important to the study’s validity.
Both studies benefited from the high level of equipment control. However, equipment handling was a cumbersome process. It required meticulous planning such as schedule forecasting and inventory monitoring. Designated personnel prepared shipments (e.g., instructions/flow charts, equipment, small gifts, return label), sent packages at postal service locations according to the appointment schedules, and even personally delivered them to families when necessary. Despite the increased workload and logistical complexity caused by transporting research equipment to the families, we accepted this trade-off in order to enhance quality control of data collected in natural environments. Although we acknowledge that this is not feasible for every laboratory, sending equipment gave us the opportunity to reach a population that otherwise would not have access to these studies/interventions.
A major logistical benefit we encountered across all three studies was increased scheduling and rescheduling flexibility for both researchers and participants. The researchers’ schedule was not subject to shared laboratory venue availability. Likewise, owing to work-from-home conditions for many of the parents and school cancelations for children, most families reported increased daytime flexibility. Often, it was easier to fit a one-hour virtual appointment into their schedule compared to an in-person visit with commuting and parking difficulties. Similarly, rescheduling appointments and follow-ups with the families was easier compared to previous in-person experiences. Importantly, we could reach families who would have been unable to visit the laboratory (due to distance or availability), which increased the diversity of participants in our studies.
One disadvantage associated with online experiments, as noted by
Moving our studies online required substantial adjustments in stimulus presentation and experimental setup. For example, the online Imitation experiment involved cartoon animations that replaced the plush puppets. The end result was visual stimuli that portrayed four cartoon characters whose mouth movements corresponded to pre-recorded audio files. During the online experiment, participants were highly engaged as cartoon characters delivered auditory stimuli. The digital animation proved to be less distracting than the puppet theater setup in the original study design. The online presentation mode eliminated distractions from tangible objects while maintaining a convincing representation of a “talking animal” for children to imitate and interact with.
In the Imitation and AV studies, cartoon characters narrated task instructions, provided pre-programmed verbal feedback/encouragement, and indicated experimental progress to the participants. For example, the Imitation study provided “food” rewards (e.g., bananas for the monkey character) when children completed a trial, and in the AV study, a star was displayed for every block of trials. These “rewards” served as a progress bar and motivation for the children, and digital presentation offered reliable delivery and consistent timing of the instructions, stimuli, and rewards, which helped reduce unwanted influence from the researcher during facilitation of the tasks.
When presenting auditory stimuli, output levels are important. In laboratory environments, one often uses consistent and calibrated equipment and builds experiments in a virtual environment that provides certain levels of control (e.g., Python). Since there is currently no user-friendly way to run an experiment remotely in virtual environments, the AV study reimagined the experiment by using an online experiment builder. The changes following these adaptations were substantial, but not necessarily noticeable to participants. For example, the AV study required simultaneous presentation of audio and video. We wanted to ensure that potential delays caused by the participant’s computer or browser would not affect the results. Four measures were taken to ensure this. First, we pre-compiled the auditory stimuli, the noise files, and the visual part of the stimulus (photo or video). This was done using ffmpeg software (Python 3.7) on the researcher’s computer. Second, these files were then reduced in file size while preserving the quality of the sound and video.
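As an illustration of the pre-compilation step, the sketch below builds an ffmpeg command that mixes a speech file with a noise file and combines the result with a video into a single file, so that audio-video alignment is fixed before the file reaches the participant's browser. The article does not report the actual command: the function names, file names, the use of the `amix` filter, and the AAC audio codec are all assumptions made for illustration. Depending on the ffmpeg version, `amix` may rescale input levels, so output levels would need to be verified.

```python
import subprocess


def build_mux_command(video, speech, noise, output):
    """Return an ffmpeg argument list that bakes speech + noise into one video file.

    Hypothetical helper: all paths are placeholders, not the study's files.
    """
    return [
        "ffmpeg", "-y",
        "-i", video,    # input 0: visual part of the stimulus
        "-i", speech,   # input 1: recorded word
        "-i", noise,    # input 2: background noise
        # Mix the two audio inputs into one stream labelled [aout].
        "-filter_complex", "[1:a][2:a]amix=inputs=2:duration=longest[aout]",
        "-map", "0:v:0",   # keep the video stream from input 0
        "-map", "[aout]",  # use the mixed audio as the only audio track
        "-c:v", "copy",    # leave the video stream untouched
        "-c:a", "aac",     # encode the mixed audio (assumed codec)
        "-shortest",       # stop at the end of the shortest output stream
        output,
    ]


def compile_stimulus(video, speech, noise, output):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_mux_command(video, speech, noise, output), check=True)
```

Calling `compile_stimulus("word.mp4", "word.wav", "noise.wav", "word_av.mp4")` (placeholder names) would write one combined file per stimulus, which an online experiment builder could then present as a single media item.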
Another consideration for remote presentation of auditory stimuli is that exact loudness level on the participants’ end cannot be established. When working at a supra-threshold level, as in these studies, and/or when measuring differences in performance
A third aspect of presenting auditory stimuli is the use of headphones. Although over-ear headphones have been accepted as the gold standard for in-laboratory auditory experiments, we instructed families in all three online studies to use speakers, both to control for audio output variability (compared to using headphones) across devices and to allow easy incorporation of caregiver assistance.
Control of visual presentation is often encouraged. An aspect of this, when designing the experimental setup, is the positioning of the participant, which ideally should be consistent across participants to control for artifacts related to angle, distance, etc. Thus, preset age- and task-specific guidelines could be helpful in remote assessments. In our studies, the participants were asked to sit in a comfortable chair, or on a parent’s lap, with the computer/tablet positioned on a table in front of them. The Imitation and AV studies asked families, when possible, to choose a computer over a tablet and to control the size of the display to a certain degree. With these instructions, we expected the camera angle to remain steady throughout the appointments. The Reading study also had a pressing need to ensure that the presentation of text was appropriate and consistent for each study visit. Participants were tested using a tablet (either owned or provided) for study sessions in addition to practice. In doing so, we could control for font size and scroll speed, which would be adversely affected by use of a small screen (i.e., smartphone). In the case of technical glitches that prevented use of the tablets, stimuli were projected onto the participants’ computer screen with considerations made to ensure clear and legible text and visual cues.
Where a researcher would be sitting adjacent or opposed to the child in the in-laboratory version of all three experiments, a similar situation was created by administering these tasks
In our studies, Zoom video conferencing allowed researcher-participant communication, with the stimuli and the participant visible on screen. Similar to the in-person procedures, the Imitation experiment was video- and audio-recorded. The original setup of the study had separate cameras capture the child’s face as well as the puppet show from the child’s perspective. With the online setup, the video conferencing tool offered the convenience of being able to record both angles in the same screen share view field. In the AV experiment, disabling the researcher’s camera allowed the researcher to “hide” as an observer in the background and “appear” during necessary intervention.
Considering the type of measurement (i.e., formant frequencies) in the Imitation study, obtaining quality audio recording is critical to signal analysis. However, Zoom audio recordings are subject to input setting variability and participants’ choice of microphone. These software and hardware differences can result in incomparable speech signals or missing data. Therefore, in the absence of a highly controlled recording environment and a balanced-input microphone with exacting recording settings (as available in a laboratory booth), we sent each family a small, child-safe
Moreover, we acknowledge certain benefits of auditory recordings
For some studies, the format (in-person or remote) does not significantly change the implementation of audio/video recording, but recordings can be more efficient when using remote conferencing tools. In the Reading study, in-person sessions required the placement of a recording device (i.e., a handheld audio recorder) near the participant during reading activities. Not only did this introduce variability of recording quality, but perception of an explicit device tends to introduce more “performing” anxiety for child participants. In contrast, parents often reported that recording over the video conferencing platform helped relieve children’s self-consciousness because the recording device was more integrated. For the at-home training sessions, there was no recording, but it was important to log participant adherence to the practice protocol. To achieve this, we implemented online quizzes
As described by
All three studies focused on designing experiments attractive to children. The Imitation and AV studies were narrated by engaging cartoon characters that appeared throughout the tasks and/or were used in catch trials to stimulate attention. As confirmed by
In addition to having engaging study designs, motivation can be increased by paying/rewarding subjects (
Attention maintenance is also crucial in child studies. In all three studies, tasks were broken into sections, which allowed children to take breaks. Longer breaks were provided between tasks. Most children were sufficiently motivated to continue without many breaks, but the opportunities were explicitly offered and even encouraged for those showing waning motivation. In particular, the AV task had two attention mechanisms built in. First, the cartoon character would appear randomly as catch trials to measure cross-modal attention. Second, general attention was measured by including random answer options. In this 4AFC task, children picked from four answer options. All stimuli were consonant-vowel-consonant words. During the 4AFC presentation, children could pick from the goal stimulus (presented earlier in the audio-only, visual-only or audiovisual modality; e.g., sun), a minimal pair alternative (having one different consonant; e.g., run), an alternative with only the same vowel (e.g., gum), and one with no relationship to the stimulus in meaning or form (e.g., pink). We would not expect children to pick this random answer unless they did not pay attention to the trial or failed to comprehend task instructions. Because all children were trained to criterion, we believe that random errors could be attributed to a lack of attention. We have facilitated this AV task moderated (
Comparison of a moderated (N=37) and unmoderated (N=47) version of the AV task in 6- to 7-year-olds. The expected error pattern (minimal pair > vowel > random errors) is shown in both tasks but is more distinct in the moderated task. Random errors, and therefore lack of attention, are significantly higher in the unmoderated task. Thick horizontal lines represent medians, boxes represent interquartile ranges, and whiskers represent range, excluding outliers. Outliers are defined as values falling more than 1.5 × the interquartile range below the 25th or above the 75th percentile and are shown as circles. Significance: *
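The response categories compared in this figure can be tallied per child with a few lines of code. The following is a minimal sketch, not the scoring procedure used in the AV study: the example words (sun, run, gum, pink) come from the task description, whereas the `TRIALS` table layout and the function names `classify_response` and `response_proportions` are hypothetical and introduced only for illustration.

```python
from collections import Counter

# Hypothetical trial table: each target word maps to its three foils.
# The words are the examples given in the text; the layout is an assumption.
TRIALS = {
    "sun": {"minimal_pair": "run", "vowel": "gum", "random": "pink"},
}

CATEGORIES = ("correct", "minimal_pair", "vowel", "random")


def classify_response(target, response, trials=TRIALS):
    """Label one 4AFC response as correct, minimal_pair, vowel, or random."""
    if response == target:
        return "correct"
    for category, word in trials[target].items():
        if response == word:
            return category
    raise ValueError(f"{response!r} is not an answer option for {target!r}")


def response_proportions(responses, trials=TRIALS):
    """Proportion of each category over a list of (target, response) pairs."""
    if not responses:
        raise ValueError("no responses to score")
    counts = Counter(classify_response(t, r, trials) for t, r in responses)
    total = sum(counts.values())
    return {category: counts.get(category, 0) / total for category in CATEGORIES}


# Made-up session data for one child, for illustration only.
session = [("sun", "sun"), ("sun", "run"), ("sun", "sun"), ("sun", "pink")]
proportions = response_proportions(session)
print(proportions)
```

Under the reasoning given for the task, the proportion of `random` responses would serve as the index of inattention, while `minimal_pair` and `vowel` errors reflect perceptual confusions.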
As
Given the nature of virtual assessments, certain factors concerning unequal audio/visual display and environmental differences were beyond our control while facilitating tasks online. However, in order to validate our remote data collection procedures, we were able to establish in-person and online comparisons within several measures critical to each study.
As mentioned, the collection of speech data in the Imitation study benefited from LENA recorders’ compactness, usability, and security features. However, because LENA’s hardware and software systems were designed to capture day-long talk at a time, its recording quality is one 16-bit channel at a 16 kHz sample rate (
Quality comparison between in-laboratory and in-home audio recording systems in the Imitation study (Cai & Kuhl, in prep). Panel
Norm-referenced behavioral tasks such as vocabulary tasks (e.g., EVT) are extremely valuable in developmental research, especially when researchers are specifically interested in these skills for the target group of participants. Such tasks allow researchers to ensure that they have a representative group with which to test their specific hypothesis, and they also allow comparisons with a larger group of children of the same age or skill level. Since there is currently little information about implementing norm-referenced tests online, a comparison from the AV study of in-person versus moderated online assessment of the EVT is shown below.
Some adaptations needed to be made to move the Expressive Vocabulary task online. Verbal instructions were given (over Zoom) following the assessment manuals,
Since there are no data published to date confirming the use of norm-referenced scores for online assessments, we decided to interpret raw scores. This allowed us to compare results between children and tasks without overcomplicating data interpretation. Nonetheless, we began validating our results with a meta-analysis of in-person and online versions of the same measure. Participants from the online AV study had completed the same vocabulary task (EVT) as part of an in-laboratory study in the summer of 2019. The task was administered twice in person (different versions), with a 3- to 4-week separation. These children did the first version of the test again online in June 2020. The online assessment was facilitated by a trained research assistant and was conducted as similarly as possible to in-person testing. The child, caregiver, and researcher sat in front of their computers with cameras and microphones enabled, and digital scans of the materials were presented in the same way as instructed in the manual,
Pearson correlation coefficients between in-person and online testing of a normed expressive vocabulary test (EVT-3), as described in the AV study (N=47;
As previously discussed, a primary concern when the Reading study moved to a remote implementation was the ability of a virtual training program for
Contrary to this concern, the Reading study demonstrated comparable-to-enhanced response in comparison with the previous, in-person iteration. As depicted in
Comparing rate of change for in-person versus remote study. Line plots depict mean change at the group-level for the intervention groups (blue) and control groups (gray) on a composite of real-word and pseudo-word decoding performance. Lines are shown for both a previous, in-person implementation (dotted) and the remote (solid) version delivered in response to the pandemic. Error bars represent +/− 1 SEM.
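For readers less familiar with the error bars used here, the standard error of the mean (SEM) is the sample standard deviation divided by the square root of the sample size. The sketch below illustrates that arithmetic only; the scores are made up and the function name `mean_and_sem` is hypothetical, not taken from the Reading study’s analysis code.

```python
import math
import statistics


def mean_and_sem(scores):
    """Return the mean and the standard error of the mean for a list of scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed to estimate the SEM")
    mean = statistics.fmean(scores)
    # statistics.stdev uses the sample (n - 1) denominator.
    sem = statistics.stdev(scores) / math.sqrt(n)
    return mean, sem


# Made-up change scores for one group, for illustration only.
gains = [2.0, 4.0, 4.0, 6.0]
mean, sem = mean_and_sem(gains)
lower, upper = mean - sem, mean + sem  # the +/- 1 SEM error bar
print(mean, sem, lower, upper)
```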
As much as we anticipated and prepared for obstacles associated with remote testing (e.g., instructing families to charge or connect their devices to power, conducting A/V testing at the beginning of appointments), occasional issues surfaced in the studies. For example, instead of the recommended device types, one family from the Imitation study used a Kindle tablet and needed to troubleshoot sound settings throughout the appointment due to unstable audio projection. Rarely, but in all three studies, researchers encountered incidents in which participants were disconnected mid-session because of either connection instability or low battery levels.
Overall, adopting the recorder-in-vest setup (see Section “Audio Setup”) resulted in reliable formant analysis in the Imitation study. However, because a few of the participants were not in compliance with wearing the vest, parents had to hold the recorder near the child. In these rare cases, we noticed a few instances of clipping, which is a distortion to an auditory signal when it exceeds the sensor’s constraints on the measurable range of data. In other words, the recorder could have been too close to the child’s mouth, resulting in speech input being too loud for the device.
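Clipping of this kind can be screened for automatically before formant analysis. The following is a minimal sketch, assuming the recording has already been read into a list of signed 16-bit integer samples (consistent with the 16-bit format of the recorders). The function name `clipped_runs` and the `min_run` threshold of three consecutive samples are illustrative assumptions, not part of the Imitation study’s pipeline.

```python
INT16_MAX = 32767   # largest value a signed 16-bit sample can hold
INT16_MIN = -32768  # smallest value a signed 16-bit sample can hold


def clipped_runs(samples, min_run=3):
    """Return (start_index, length) for each run of at least `min_run`
    consecutive samples sitting at the 16-bit limits.

    Note: a run that mixes positive and negative full-scale samples is
    counted as a single run in this simple sketch.
    """
    runs = []
    start = None
    for i, sample in enumerate(samples):
        at_limit = sample >= INT16_MAX or sample <= INT16_MIN
        if at_limit and start is None:
            start = i  # a candidate clipped run begins here
        elif not at_limit and start is not None:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Close a run that extends to the end of the recording.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs


# Made-up samples: four consecutive full-scale values, then one isolated one.
signal = [0, 12000, 32767, 32767, 32767, 32767, 15000, -32768, -20000, 0]
print(clipped_runs(signal))
```

Requiring several consecutive full-scale samples, rather than flagging any single one, is a design choice that avoids treating an isolated loud peak as distortion; the appropriate threshold would need to be chosen for the recordings at hand.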
In addition, auditory filters and signal-to-noise adjustments on Zoom introduced further confounds to speech tasks. For example, in the Imitation study, LENA recorders helped the researcher discover rare incidents where caregivers violated our guidelines for caregiver involvement and assisted the child during the imitation task by whispering the sounds. Such knowledge is crucial for data analysis. However, this is often undetectable over Zoom due to its background noise suppression feature. Auditory misperceptions were also observed. For example, a very small number of participants produced /hi/ when /i/ stimuli were presented to them. Such misperception was not present in our in-person pilot work, and we suspect that it was caused by variability among audio devices and sound settings across participants. We note that the rare instances of misperception occurred only in trials containing the stimulus /i/, and vowel productions in a /h/-onset context have been shown to be virtually identical to those observed in isolation (
Another data collection-related surprise occurred in the AV study. Visual stimuli included both videos and images. Because these types of stimuli were among our measures of interest, we did not draw attention to them during instruction. Occasional feedback was received about online presentations “not working” because the video seemed to have frozen. We believe this was caused by the caregivers’ realization that technical issues such as choppy videos can occur with studies online, and we suspect participants would question these occurrences less in the laboratory.
In general, we observed that children were more comfortable working from home. Although we initially thought this would lead to more distractions, participants were often less distracted by their familiar home environment than by the “new” laboratory surroundings as experienced in previous studies or pilot phases. Furthermore, it was nice to share this “from home” experience with children we had been working with before in the laboratory – for example, children loved to show their new toys or pets, which created a positive and comfortable environment for the experiments.
In this section, we will first suggest some guiding principles derived from our implementations of the three online studies in order to aid developmental scientists seeking to carry out future online studies. Next, we will look deeper into the current limitations of online behavioral testing involving children as well as some resources and future improvements needed to move the field forward online.
The studies discussed in this paper differed in research questions explored and age groups involved. However, commonalities and differences among the studies lend themselves to suggesting the following guiding principles for future online developmental studies.
In general, remote consent procedures can take place over secure online portals. However, the downside of obtaining (electronic) signatures solely online is the lack of an explicit opportunity for participants/caregivers to raise questions and/or concerns. We recognize that it is important to consider consent acquisition as a process rather than a product (
The degree of caregiver involvement is typically determined by the age group and the complexity of equipment manipulation. Involving caregivers of younger participants in our studies required intentional efforts to ensure that they followed the research protocol closely to avoid introducing unwanted interference. Clear communication of research protocols prior to the appointment is crucial in establishing the desired caregiver involvement. Additionally, we found it helpful to provide families with visualizations of experimental procedures or scripts of approved caregiver encouragements. Therefore, in addition to a carefully designed protocol, we believe that these steps could help minimize the confounding risk of caregiver interference. Although the level of caregiver involvement differed by age, technical support was critical for all three studies. When children are required to actively manipulate technical devices (e.g., mouse clicks), it can be helpful to objectively assess the child’s technical proficiency during a training session and, based on the outcome, decide how much caregivers should assist with technical manipulations.
During data acquisition, it is crucial to generate and deliver consistent stimuli across subjects. However, in remote studies containing visual and auditory stimuli, it is more complicated to ensure this. Each of the three studies attempted to control for the quality of stimuli delivery in their own way, from screen sharing pre-recorded sets of cartoon animations, to providing participants with designated software. Generating and delivering testing materials using experiment builders would be a favorable option as the automation of stimulus delivery has been reported to reduce the workload of the researcher during the task, lowering the chance of human error (
Related to this, we encourage future studies to carefully evaluate the benefits and costs of providing research equipment to participants in light of the targeted research questions and data types. In our studies, we made logistical decisions based on task designs and available resources. In the Imitation study, mailing LENA recorders and vests to all participating families was a sensible and effective choice because consistency of speech recordings across participants was critical to the experiment. It was also to our unique logistical advantage that we could use existing resources (i.e., the LENA recorders) that happened to be participant-friendly, since the families had participated in our previous research using the same device. Since the Reading study required the use of a tablet-based app, there was a need to mail a tablet to participants who had no access to one. The goal of the study was to provide an intervention/aid for a population that needs help with literacy development. Had we included only families that own a tablet, a large portion of this population would have been excluded. For the AV study, it was not necessary to send equipment due to the type of data measured. This study took a different approach to experimental control: through the use of an experiment builder, general cross-subject consistency in participants’ visual and auditory perception was achieved.
Another helpful measure to ensure experimental control for online developmental studies is researcher moderation. Although most online behavioral procedures can be automated, it is beneficial to control for unexpected changes in the environment, allowing for impromptu adjustments and extra technical support. The findings of the AV study suggest that moderation could help improve participants’ attention. The researcher can notice any decline in participants’ attention and suggest a break or introduce adequate motivators. Additionally, researcher moderation allowed participants and their caregivers to ask questions during the consent procedure and ensured that no data would be lost due to invalid consent/assent procedures. Finally, we believe that the personal connection we established with the participants through moderation helped lower the attrition rate and sustain participants’ attention.
Last but not least, due to the variability and complexity of study designs in developmental research, validation of online methods in this field often stays specific to each study. We believe a potential solution may be to carry out a study design both in-person and remotely during the initial pilot phase and assess the validity of the online study design by comparing pilot results. Moreover, when designing an online study or converting an in-person study to a virtual environment, it is important to identify areas of adaptation and define the purpose of each adaptation. Meticulous deliberation and systematic documentation of such decisions would maximize the comparability between data collected in-person and remotely and could benefit future replications of the study within or between laboratories.
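One simple way to compare pilot results across formats, in the spirit of the in-person versus online EVT comparison reported for the AV study, is a Pearson correlation between paired scores. The sketch below assumes that the same children complete the measure in both formats; the scores are made up and the function name `pearson_r` is hypothetical. A high correlation alone does not establish equivalence (it is insensitive to a constant shift in scores), so it would ordinarily be read alongside a comparison of means.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired lists of scores."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of scores")
    n = len(x)
    if n < 2:
        raise ValueError("at least two paired scores are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up raw scores for five children tested in both formats.
in_person = [10, 12, 14, 16, 18]
online = [11, 13, 14, 17, 20]
r = pearson_r(in_person, online)
print(round(r, 3))
```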
Researchers desire highly controlled study designs and environments for accurate experimental measures, sometimes at the cost of the generalizability of results. Virtual settings promote a natural environmental variability, which could increase ecological validity and generalizability (
With regard to reproducibility of research findings, noise in measurement and contextual factors may compromise reproducibility (
Although
It has been reported that most in-laboratory developmental studies recruit children from areas surrounding universities (
Converting studies online can seem intimidating for many because of the adjustments that need to be made. However, the changes can be quite positive. At times, crises can force adaptation and encourage advancements. Even beyond the pandemic, we believe that online developmental research can be as valuable or even more valuable than in-person research when thoughtful adjustments and considerations are made.
Although we initially felt there was little support for online adaptations from the developmental science literature, we discovered platforms such as Lab.js, Gorilla, and Pavlovia, as well as task forces such as “The Acoustical Society of America’s Task Force on Remote Testing,” which are investing immensely in support systems for researchers interested in virtual studies. Furthermore, other researchers running into similar difficulties while developing online behavioral experiments are starting to report their experiences (e.g.;
Similar to diverse laboratory-based experimental designs, online methodologies are specific to individual research questions. The three studies mentioned in this paper employed different methods and encountered problems unique to their study design. We hope our experiences will be informative for future remote studies beyond the impact of the COVID-19 pandemic.
We believe that, by adjusting our developmental research methods from traditional in-person settings to an online format and by acknowledging all the changes that needed to be made, our developmental work is as valuable as it would have been in person. All children could participate from a familiar environment at a time that worked for both them and the researcher, without having to make concessions such as arriving at the laboratory after a long day of school, activities, and driving. Testing from home can positively impact general attention and comfort for children. In our observations, many of our participants wanted to share their world (e.g., toys, pets) with the researcher and were highly motivated to participate. Data collection procedures felt more natural and comfortable for them because they completed the tasks in their home environment. Additionally, we recognize that part of the reason for the ease of our recruitment and the high compliance from our participants could be that we had established strong rapport with most of the participants and their caregivers in previous studies.
All experimental control that would be routine in a laboratory environment had to be reevaluated and adjusted for online testing, which led to carefully considered and documented protocols. This, in combination with the automation of the research tasks, may make it easier for others to replicate our analyses and findings. As our observations (
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
The studies involving human participants were reviewed and approved by University of Washington Human Subjects Division. Written informed consent to participate in this study was provided by the participants’ legal guardian/next of kin. Written informed consent was obtained from the individual(s), and minor(s)’ legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.
LG, RC, PMD, and PKK contributed to conception and design of the studies. LG, RC, and PMD executed and analyzed the studies. LG and RC wrote the first draft of the manuscript. LG, RC, and PMD wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.
This work was funded by NSF BCS 1551330, NIH NICHD R01HD09586101, NICHD R21HD092771, Microsoft Research Grants and a Jacobs Foundation Research Fellowship to Jason D. Yeatman and by the Overdeck Family Foundation, the University of Washington Institute for Learning & Brain Sciences Ready Mind Project. The funders were not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
The Supplementary Material for this article can be found online at:
1 The catch trials were created by showing a presentation of the cartoon character on top of the stimulus video or image. The children had to yell the cartoon’s name, and this was noted by researchers and parents
2
3 Measuring difference of percentage correct performance between AV and audio-only stimuli presentations, rather than absolute thresholds
4 LENA recorders are child safe, meeting the United States and international safety standards for electronics and toys (see