- 1USC Tina and Rick Caruso Department of Otolaryngology-Head & Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- 2Department of Cognitive Science, University of California - San Diego, San Diego, CA, United States
- 3SPG Therapy and Education, Walnut Creek, CA, United States
- 4Independent Researcher, Los Angeles, CA, United States
- 5Department of Speech-Language Pathology & Audiology, Towson University, Towson, MD, United States
- 6Institute for Learning Sciences, University at Buffalo, Buffalo, NY, United States
- 7Department of Communication Sciences and Disorders, University of Redlands, Redlands, CA, United States
- 8Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, United States
- 9Department of Education, Juniata College, Huntingdon, PA, United States
- 10Department of Language Science and Technology, The Hong Kong Polytechnic University, Kowloon/Hong Kong, Hong Kong SAR, China
Introduction: The growing population of bilingual children and lack of bilingual clinicians have created an increased need for reliable and accessible bilingual language assessment to accurately detect language delays and disorders globally. To address this growing need, this study evaluated the Mandarin-English Receptive Language Screener (MERLS), a web-based receptive language assessment designed for bilingual Mandarin-English (ME) speaking children.
Methods: Using a citizen science approach, bilingual ME speaking parents based in the United States served as the test administrators. This two-phase study compared bilingual ME speaking children’s performance and parent-child interactions across in-person (n = 16) and telehealth (n = 43) settings. Participants in both phases were typically developing children aged 3–10 years who used Mandarin and English for at least 20% of their daily communication.
Results: In Phase I (in-person), despite variability in parent behaviors during administration, parent-administered assessments demonstrated comparable test-retest reliability (Pearson correlation: r = 0.95, p < 0.01) and item-by-item agreement (82%) to researcher-administered assessments. These reliability metrics are comparable to those of established standardized child language assessments (e.g., PPVT-5 and the QUILS). In Phase II (telehealth), platform improvements (e.g., educational quizzes and videos on proper test administration) significantly reduced interfering parent behaviors (Mandarin items: W = 485, p = 0.004; English items: W = 482, p = 0.003) without affecting children’s test performance.
Discussion: These results support the feasibility of using a citizen science approach and the MERLS digital assessment platform for parent-administered language assessments. Such an innovative assessment approach has great potential to increase access to accurate and reliable language assessment services for bilingual ME-speaking children in the United States. The findings offer clinical and technical insights for developing bilingual child language assessments across both in-person and telehealth settings.
1 Introduction
Bilingual children are often misdiagnosed with a language disorder when they are not appropriately and accurately assessed in both languages (Freeman and Schroeder, 2022; Boerma and Blom, 2017; Grimm and Schulz, 2014; Oetting, 2018; Samson and Lesaux, 2009). Approximately 7%–11% of bilingual children are at risk of developing a language impairment (Park et al., 2017; Norbury et al., 2016; Tomblin et al., 1997). The scarcity of appropriate bilingual language assessments makes it difficult to accurately assess bilingual children’s language skills (Westerveld, 2014; Du et al., 2020). In addition, bilingual children show different language development patterns than their monolingual peers (Pearson, 2013; Song et al., 2021, 2022). While bilingual children often know fewer words in each of their languages than monolingual learners, the difference disappears when bilingual children’s “conceptual vocabulary” is combined across both languages (Marchman et al., 2010; Hyter, 2021). Therefore, clinicians need to accurately assess a bilingual child’s language abilities in each language (Gillam et al., 2013; Castilla-Earls et al., 2020; Kritikos, 2003). Given these complex bilingual language profiles, standardized tests based on bilingual-specific norms are necessary for collecting accurate language assessment profiles (Jasso et al., 2020). Despite advances in bilingual assessment development (Patterson and Pearson, 2004; Peña et al., 2014; Golinkoff et al., 2017; Jasso et al., 2020; Peña and Sutherland, 2022; Caesar and Kohler, 2007), reliable and valid multilingual assessment tools remain scarce, and the linguistic diversity they cover remains limited (Kimble, 2013; Peña and Sutherland, 2022; Dollaghan and Horner, 2011; Kan et al., 2020).
English and Chinese are among the world’s most widely used languages, with an estimated 1.5 billion and 1.1 billion speakers, respectively (Dyvik, 2024). The rapid rise in immigration and globalization has led to a growing population of bilingual English-Chinese speakers in major English-speaking nations, including the United States, Canada, the United Kingdom, and Australia (Grenoble and Osipov, 2023; Gov.UK, 2020). In the U.S., Chinese is the most spoken Asian-Pacific Island language among individuals five years and older (Ryan, 2013; McLeod and Crowe, 2018), with Mandarin being the most prevalent dialect of Chinese, spoken by approximately 3.4 million people (Lesso, 2023; U.S. Census Bureau, 2024). Despite this significant population need, no standardized, comprehensive child language assessment has been developed over the past decade for detecting bilingual Mandarin-English (ME) speaking children at risk for language delays and disorders. A shortage of bilingual ME-speaking speech-language pathologists (SLPs) further exacerbates this gap: as of 2024, there were only 491 bilingual ME-speaking SLPs in the U.S., most of them located in coastal states such as California and New York (American Speech-Language-Hearing Association [ASHA], 2025b). ASHA continues to address this service gap in 2025 by defining competencies and providing resources for multilingual service delivery (American Speech-Language-Hearing Association [ASHA], 2025a). However, ASHA does not accredit or approve specialized training programs for multilingual service providers, which means the depth and breadth of training can vary significantly across institutions. This uneven distribution severely limits access to appropriate bilingual assessment services across much of the country.
As a result, the majority of ME-speaking children are assessed by monolingual English-speaking SLPs, who often rely on interpreters to translate or interpret assessment items from standardized English tests into Mandarin (Langdon and Quintanar-Sarellana, 2003). Such a practice may fail to capture linguistic constructs unique to Mandarin, and diagnosis can become less reliable as translation errors are introduced into the evaluation process (Sheng et al., 2021; Du et al., 2020). Consequently, bilingual ME-speaking children are prone to both over- and under-diagnosis of language disorders, ultimately impacting their development and well-being and increasing healthcare costs for society (Flores and Tomany-Korman, 2008; Dollaghan and Horner, 2011; Yu et al., 2021). This assessment gap has severe implications for bilingual ME children with language disorders, who require accurate identification to receive appropriate special education services and interventions.
Despite the urgent need for bilingual language assessments that improve the current standard of care, researchers face another practical challenge during assessment development: collecting a large-scale, nationally representative sample of bilingual ME-speaking children to establish bilingual norms for standardized assessment development (Sheng et al., 2021). Asian populations in the U.S. tend to concentrate in certain metropolitan areas (National Academies of Sciences, Engineering, and Medicine, 2018; Cooc, 2018), and researchers outside these areas do not have ready access to bilingual participants. Typical laboratory- or school-based testing requires extensive travel, time commitment, and trained multilingual personnel. These barriers hinder researchers from collecting large-scale data across developmental age groups and regions of the U.S. To address these data collection challenges, it is critical to explore alternative methods such as citizen science, a research approach in which members of the public contribute to data collection and scientific discovery (Bonney et al., 2016). Such approaches can give researchers better access to large samples of bilingual children and easier data collection methodologies.
In addition to assessment tool limitations, tester effects can also influence diagnostic outcomes. Prior research has shown that young children respond differently when interacting with parents versus unfamiliar testers, especially in tasks involving social-communicative cues. For example, Tang et al. (2023) found significant differences in infants’ attention-following responses to joint attention cues depending on whether the cue was provided by a caregiver in a home setting or a tester in a lab setting (Brown and Woods, 2015). When adopting a more accessible telehealth approach for assessment development, it is important to consider a variety of contextual factors in the transition from in-person to virtual, computer-administered assessments for young children (Paradis, 2011; Brandone et al., 2008; Werfel et al., 2021; Khoshsima and Toroujeni, 2017; Magasi et al., 2018; Solano-Flores et al., 2019).
Our study aims to address these issues by establishing the initial feasibility of a digital bilingual assessment tool for ME-speaking children across multiple service delivery modes (in-person and telehealth) to increase access to care (Ciccia et al., 2011). This paper tests the feasibility of parent-administered, telehealth-based assessments within a citizen science framework, with the goal of generating reliable bilingual language data while addressing barriers related to geographical limitations and the shortage of bilingual research personnel.
2 Related work
2.1 A citizen science approach for parent-administered assessment via telehealth
Citizen science, broadly defined as the involvement of the public in scientific research, has gained traction in various domains, particularly in environmental and ecological science (Fraisl et al., 2022; Schmitz et al., 2018; Bhattacharjee, 2005). Prior research has shown that citizen science samples are far more diverse than samples from lab-based studies (Gosling et al., 2004; Reinecke and Gajos, 2015). The involvement of citizen scientists can range from helping with labor-intensive data processing to serving directly as test administrators in the language assessment process. Involving parents from diverse racial and ethnic backgrounds is vital for collecting the large-scale, diverse language data needed to support norm development for a bilingual child language assessment.
Citizen science is also a promising approach for obtaining large, diverse samples across time and location. With the proliferation of web-based assessment and the increased adoption of telehealth as a service delivery method (Waite et al., 2010; Grillo, 2021; Perrin et al., 2020; Farmer et al., 2020; Lehner et al., 2021; McCrae et al., 2021; Schmitt et al., 2022; Shankar et al., 2022; Farmani et al., 2024), a citizen science approach further enables parents to serve as telehealth assistants, acting as test administrators through videoconferencing platforms (Klatte et al., 2020; Sutherland et al., 2021; Dekhtyar et al., 2020; Marhefka et al., 2020) to collect child language data in the home setting. Such an approach aligns with a core component of family-centered care: actively involving caregivers as part of the assessment process (Crais et al., 2006; Corona et al., 2021; Frigerio et al., 2021; Dodge-Chin et al., 2022).
Additionally, another benefit of the citizen science approach is that it reduces practical constraints (e.g., a limited research budget), because citizen science projects typically do not offer cash or course credit as compensation. Although some people are motivated to participate in studies with monetary rewards, nearly everyone is motivated to participate in a project that is intrinsically rewarding: birders help with bird surveys, and astronomy enthusiasts categorize images of galaxies (Raddick et al., 2009). This incentive structure is particularly relevant to language assessment data collection, as parents are intrinsically motivated to learn more about their children’s bilingual language abilities, which also makes them more likely to participate in citizen science research (Bonney et al., 2016). However, it is unknown whether relying on parents as citizen scientists to collect data on their own children at a large scale would yield meaningful, high-quality data (Li et al., 2024). The present study directly addresses this gap by examining whether parent-administered, telehealth-based assessments can generate reliable bilingual language data suitable for research and future test development.
2.2 Challenges of parent-administered online assessment
Several data collection barriers must be considered when enlisting parents as citizen scientists to assess their children’s bilingual language abilities. Though previous research has involved parents as assistants facilitating language service sessions, enlisting parents as independent test administrators for language assessment is uncommon due to their lack of professional training (Tomlinson et al., 2018; Talbott et al., 2020; Corona et al., 2021). After all, parents are not compensated as professional testers are, nor does their livelihood depend on adherence to professional standards. A key concern is that parents have varying language competencies and limited knowledge of language assessment principles. Because parents typically hold expectations or concerns about their children’s language skills, they may not be impartial, which can compromise the validity and reliability of parent-administered assessments (Sullivan, 2011). When administering bilingual assessments with their children, some parents have demonstrated limited ability to comprehend and follow proper test instructions; in addition, parental interference has been reported in the language in which the parent, but not the child, is proficient (Du et al., 2020, 2021). The absence of a trained professional (e.g., a researcher or clinician) during home-based assessments also raises concerns about the overall quality of the data collected. Together, these challenges underscore the need to better understand parental behaviors and to design parent-administered assessment protocols that provide clear guidance, support, and safeguards for data quality when implementing a citizen science approach.
Researchers have developed training programs to improve citizen scientists’ test administration skills and identified potential solutions to the challenges of engaging parents. Tomlinson et al. (2018) identified 20 applied behavior analysis studies that trained individuals (e.g., parents) for assessment, teaching, and intervention purposes, and concluded that citizen scientists with no prior experience in behavior analytic techniques can be trained to adhere to protocols and implement a range of behavior analytic techniques. All training in the reviewed studies was delivered via videoconferencing with a trainer, usually an experimenter or professional experienced in behavior analytic approaches. Training sessions typically lasted between 15 min and 3 h and involved strategies such as direct instruction, modeling, or role playing (Alnemary et al., 2015; Barkaia et al., 2017; Hay-Hansson and Eldevik, 2013). Online modules, written explanations of the techniques, and supplemental trainee manuals were also used to enhance trainees’ adherence (Scott et al., 2017; Radville et al., 2022). Therefore, for successful online data collection with parents via telehealth, it is critical to evaluate not only children’s performance but also parents’ behaviors during testing (Molini-Avejonas et al., 2015). The present study contributes to this effort by systematically examining parents’ behaviors under both in-person and remote testing conditions.
2.3 Study aims
This study aims to evaluate the feasibility of using parents as citizen scientists to test their own ME-speaking children via the web-based, telehealth-friendly Mandarin-English Receptive Language Screener (MERLS). We propose that this approach can make two contributions toward addressing current standard-of-care challenges in bilingual child language assessment: (1) training parents as test administrators for assessment data collection and development, in partnership with researchers, using an automated web-based language assessment; and (2) evaluating the assessment process and parent-child outcomes in the telehealth setting. We investigate whether parents can be trained to act as competent test administrators by adhering to the test protocols in person (Study 1) and virtually in a telehealth context (Study 2). Specifically, the present paper examines the following research questions:
Study 1:
(1) During in-person assessment, is children’s performance comparable between parent- and researcher-administered sessions with adequate test-retest reliability?
(2) What are the characteristics of different parent behaviors (e.g., behavioral types and frequency) during parent-administered in-person sessions?
Based on the parent behaviors observed in Study 1, we made adjustments to the assessment and training protocol. Following these improvements, we ask:
Study 2:
(1) With technical improvements to the assessment, what changes were found in the types and frequency of parent behaviors during the telehealth assessment?
(2) What are the verbal and nonverbal interaction patterns of parent-child dyads during the telehealth assessment?
(3) How did contextual factors (e.g., children’s age, frequency of digital device use, and test performance) influence the frequency of parent interference behaviors?
We hypothesize that children’s language assessment performance will be consistent across parent- and researcher-administered conditions in Study 1, indicating no significant differences between the two conditions and supporting the feasibility of utilizing parents as test administrators. With the improved system design, parents’ interference should decrease significantly in the telehealth context in Study 2, offering future directions for a citizen science approach toward developing the MERLS assessment via both in-person and telehealth delivery modalities.
3 Study 1: in-person evaluation of MERLS
3.1 Study 1 materials and methods
3.1.1 Participants
A total of 29 ME-speaking parent-child dyads (see Supplementary Table 1 for demographic information) were recruited through advertisements distributed via parent email lists affiliated with local Chinese language schools and bilingual SLP Facebook groups in North America. Participating children ranged in age from 3 to 10 years. This age range is consistent with standardized language assessments such as the Test of Early Language Development-4 (TELD-4, 3;0–7;11 years) and is narrower than widely used tools including the Preschool Language Scales-5 (PLS-5, 0;0–7;11 years), the Clinical Evaluation of Language Fundamentals-5 (CELF-5, 5;0–21;11 years), and the Peabody Picture Vocabulary Test-5 (PPVT-5, 2;6–90+ years), which assess language constructs across broad developmental periods. For detailed analysis of parental behaviors during testing, a subset of the sample (n = 16) was selected, comprising parents who consented to video recording for further analysis. For the purposes of this study, bilingualism was defined broadly to include both simultaneous bilinguals (exposed to both languages from infancy) and sequential bilinguals (who learned one language after the other). The primary inclusion criterion was that children used both Mandarin and English in at least 20% of their daily lives, as reported by their parents. This inclusive definition was chosen to capture a wide range of bilingual experiences representative of the community. The introduction video (Supplementary Video) included demonstrations of both prohibited parent interference behaviors and acceptable supportive behaviors (Supplementary Figure 3).
3.1.2 Materials
The Mandarin-English Receptive Language Screener (MERLS) 1.0 is an online receptive language screening assessment designed for bilingual ME-speaking children. The web interface plays pre-recorded audio prompts and children select the corresponding picture stimuli; Supplementary Figure 1 provides a visual representation of the test interface and an example test item in English. The test evaluates critical language components with 44 items in Mandarin and 36 items in English, assessing linguistic constructs including prepositions, classifiers (Mandarin) or plurals (English), quantifiers, passive sentences, and relative clauses. These language components were selected because they have been demonstrated to be particular linguistic weaknesses in children with language disorders and have been used in previous related studies (Golinkoff et al., 2017; Peña et al., 2014; Hu et al., 2016; Jia and Aaronson, 2003; Wang et al., 2022; Wong et al., 2004; Zhou and Crain, 2011; Sheng et al., 2011, 2016; Sheng, 2018). Prior work has provided preliminary evidence that MERLS is an effective bilingual screener: Du et al. (2021) reported high test-retest reliability and strong concurrent validity with established English and Mandarin comprehension measures, supporting its use for receptive language assessment in bilingual children.
The receptive language task employs a sentence-picture matching format. Participants select the appropriate picture from a set of four after listening to a pre-recorded sentence in either Mandarin or English. Assessment instructions are provided in both Mandarin and English through audio recordings, ensuring accessibility for parents with varying language proficiency levels. The assessment begins with a welcoming message in both English and Mandarin. Two practice items are then presented to acquaint children with the testing format. All children followed the instructions and made selections on the computer, with or without parental assistance. Each audio clip was played twice, with a 15-s response interval; if the child did not respond within 15 s after the second play, the web page automatically advanced to the next item (Supplementary Figure 1). The audio was played at an approximate volume of 65 dB through the computer’s built-in audio system. Once a selection was made, the child could not revisit previous items. Closed-set tasks of this type have been shown to be reliably administered even by monolingual clinicians who do not speak the test language (Cowan et al., 2022).
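As a rough illustration, the timing rules above (up to two audio plays, a 15-s response window, and automatic advancement) could be implemented along the following lines. This is a minimal sketch, not actual MERLS code: `play_audio` and `get_response` are hypothetical callbacks standing in for the web interface.

```python
import time

def present_item(play_audio, get_response, timeout_s=15, poll_s=0.1):
    """Sketch of the item flow described above (hypothetical helpers):
    play the sentence audio, wait up to `timeout_s` seconds for a picture
    selection, replay once, and return None (auto-advance) if the child
    never responds."""
    for _ in range(2):                      # the audio is played at most twice
        play_audio()
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            choice = get_response()         # picture index, or None if no tap yet
            if choice is not None:
                return choice               # selection locks in; no revisiting items
            time.sleep(poll_s)              # poll the interface for a selection
    return None                             # timed out after second play: advance
```

In the deployed web version this logic would run client-side (e.g., with timers in the browser); the sketch only makes the protocol's control flow explicit.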
3.1.3 Procedure
During parent administration of MERLS 1.0, all interactions between parents and children were video recorded. To ensure comprehensive recording, a video camera was positioned behind the dyads, capturing both the activity on the computer screen and the dyadic interactions; this setup protected participants’ privacy while minimizing potential distractions. MERLS 1.0 was first administered either at the children’s home or in a laboratory setting, with the assessment conducted by either a caregiver (n = 17) or a trained examiner (n = 12). Parents also completed a pre-assessment questionnaire covering participants’ demographic information (age, gender, and level of education), the parent’s English proficiency, and the child’s digital media and device use. Children then completed a second MERLS 1.0 assessment within 2–4 weeks of the first testing.
To examine the reliability of parent-administered test sessions, in addition to the initial 17 parent-child dyads who completed parent-first, researcher-next sessions, we also tested another 12 dyads with researcher-first, parent-next sessions and analyzed children’s test results for test-retest comparison. The second assessment was administered within 2–4 weeks of the initial testing by a different administrator; for example, children who completed the first MERLS 1.0 assessment with their parents underwent the second testing with a lab examiner, and vice versa. During both testing sessions, the items were displayed on a 15-inch laptop monitor positioned approximately two feet from the child. The laptop was equipped with a touch screen, allowing children to select answers by simply pointing and touching the screen. A brief instruction page was provided before the test started. Children were allowed to take unstructured breaks as needed throughout the Mandarin and English modules and were expected to complete all items. Additionally, parent questionnaires were administered by two trained bilingual (Mandarin-English) research assistants (one undergraduate and one graduate student in Communication Sciences and Disorders). All administrators completed a standardized training protocol covering questionnaire content and structure, questioning techniques to avoid leading responses, and data recording procedures, and followed a structured script to ensure consistency across all participants.
3.1.4 Data analysis
All children’s performance on the sentence comprehension task was automatically scored and recorded within the MERLS 1.0 system. Two trained bilingual ME-speaking research assistants watched the video recordings of 16 parent-child dyads, then independently transcribed children’s utterances and coded parental behaviors during the tests using a clinically informed codebook (Du et al., 2020, 2021). This codebook delineated four categories of interference behaviors (“repeating questions, answering questions, analyzing items, and judging of correctness”) and four categories of parent support behaviors (“encouragement, verbal or physical technical support, broadcasting, and miscellaneous”; Supplementary Table 2). Video coding included parent and child verbal and non-verbal behaviors, as well as environmental distractors (Du et al., 2020). An interobserver agreement (IOA) of 97% was reached between the two trained video analysts.
3.2 Study 1 results
3.2.1 Children’s performance across different administrators in Study 1
The reliability between children’s performance on the MERLS 1.0 administered by the parent and by the researcher was examined using item-by-item analysis and correlational analysis. The item-by-item analysis compared children’s accuracy (0 or 1) on the same item between the first and second testing sessions; reliability was calculated as the number of consistently scored items divided by the total number of items. Pearson correlations (Bishara and Hittner, 2012) were also computed between children’s overall performance in parent- and researcher-administered sessions to examine whether the two sessions yielded similar performance on the same tasks.
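The two reliability metrics described here, item-by-item consistency and the Pearson correlation between overall session scores, can be sketched in a few lines. The score lists below are fabricated purely for illustration and are not study data.

```python
def item_agreement(session1, session2):
    """Proportion of items scored identically (0/1) across two sessions."""
    same = sum(a == b for a, b in zip(session1, session2))
    return same / len(session1)

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of overall scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# One child's item-level accuracy (hypothetical) in the two sessions:
first  = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
second = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
print(item_agreement(first, second))                        # → 0.8

# Overall scores (hypothetical) across five children in the two sessions:
parent_admin     = [30, 25, 40, 18, 35]
researcher_admin = [31, 24, 41, 20, 33]
print(round(pearson_r(parent_admin, researcher_admin), 3))  # → 0.982
```

In practice such analyses would typically be run with a statistics package (e.g., `scipy.stats.pearsonr`, which also returns the p-value); the sketch is meant only to make the two computations concrete.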
All 29 children completed the English MERLS 1.0 in both sessions. Five children did not complete the Mandarin MERLS 1.0 in either session. Item-by-item analysis (Cohen et al., 2003) showed that overall item consistency was 82% (children scored the same on 82% of items across the first and second testing sessions), with similar consistency in the parent-first (n = 17, consistency = 82%) and researcher-first groups (n = 12, consistency = 82%). Item-by-item consistency was slightly higher for the English subtest (84%) than the Mandarin subtest (80%). Pearson correlations showed that children’s performance in the two administration sessions was significantly correlated for the overall group (r = 0.95, p < 0.01), and separately for the parent-first (English: r = 0.94, p < 0.01; Mandarin: r = 0.85, p < 0.01) and researcher-first groups (English: r = 0.97, p < 0.01; Mandarin: r = 0.91, p < 0.01). A correlation of 0.90 or above is considered excellent, one in the 0.80s good, and one in the 0.70s adequate (Cohen et al., 2003); therefore, parents were able to supervise their children in completing our task, eliciting performance similar to that obtained under trained researchers’ supervision.
3.2.2 Types and frequency of parent behaviors
The 16 selected parent-child dyads in Study 1 demonstrated a total of 677 behaviors, including 296 interference behaviors and 381 support behaviors, while administering MERLS 1.0 to their children (Supplementary Table 5). Eleven of the 16 dyads demonstrated adherence to the assessment protocol after viewing the introduction video, defined as fewer than 10 parent interference behaviors (Du et al., 2020; Kelders et al., 2011). Five of the 16 parents failed to adhere to the testing protocol, each demonstrating more than 10 interfering behaviors; these five parents accounted for 280 of the 296 (95%) interference behaviors across the 16 dyads. Furthermore, a cross-language variation was found in parent behaviors, with more support and interference behaviors in the Mandarin than the English modules: on average, parents interfered on approximately 10 items in Mandarin and five in English, and offered support on 11 items in Mandarin and six in English. The two most frequent interference behaviors were “Repeating Questions” and “Analyzing Items,” whereas the two most frequent support behaviors were “Technical Supports” and “Encouragement.”
3.3 Study 1 discussion
In-person evaluation of MERLS 1.0 showed that parents were able to administer language assessments to their children independently, offering additional insights for parent-administered, automated web assessments for collecting bilingual child language data. Children’s performance was consistent between parent- and researcher-administered sessions, suggesting that child language data collected by parents using MERLS 1.0 were consistent with data collected by researchers. Furthermore, the test-retest reliability of MERLS 1.0, an overall Pearson correlation coefficient of 0.95 (subgroup range = 0.85–0.97), is comparable to that of other standardized child language assessments, indicating high-quality assessment outcomes. For example, the Peabody Picture Vocabulary Test (PPVT-5) has a Pearson correlation coefficient of 0.93 (range = 0.92–0.96) from a 340-subject sample over a 4-week test-retest interval, and the Quick Interactive Language Screener (QUILS), which sampled 75 subjects over a 3–5 week test-retest interval, showed an overall test-retest correlation of 0.83. Furthermore, the item-by-item agreement for MERLS 1.0 was 82% (range = 80%–84%), consistent between parent- and researcher-administered sessions both for the 11 parents who interfered little in their children’s sessions and for the five parents who showed the most interference behaviors. This agreement indicates that parent behaviors did not impact children’s overall performance on individual test items, and that parent-administered sessions can be as reliable as researcher-administered ones. These findings offer initial support for the feasibility of a citizen science approach in which parents collect receptive language assessment data from their own children.
Although high test-retest reliability and item-by-item agreement were observed between sessions administered by parents and researchers, parent interference behaviors remained a potential concern when engaging parents as citizen scientists. Closer examination revealed that interference often arose from a combination of language- and culture-related factors. For example, parents were more likely to intervene in Mandarin modules than English ones, reflecting greater comfort with the home language and a desire to clarify tasks for their children (Du et al., 2020). Natural code-switching practices in bilingual households also contributed to parents repeating or translating questions across languages, inadvertently increasing children’s cognitive load. In addition, cultural expectations surrounding parental roles in education may have shaped parents’ tendency to confirm or encourage children’s answers, as many interpreted their role as co-administrators rather than passive observers. Finally, parental anxiety about their child’s performance and desire for success motivated them to repeat or analyze test items, even when explicitly instructed not to. Together, these findings suggest that parent interference behaviors were not random but stemmed from linguistic, psychological, and sociocultural motivations. This prompted us to revisit the parent instruction page for MERLS 1.0, which provided essential education on the parental interference behaviors prohibited during the assessment. Prior work by Du et al. (2020) suggested that parent behaviors during the administration of MERLS 1.0 might impact children’s performance, based on a subset of the participating dyads’ performance and behaviors in Study 1, indicating an ongoing need to evaluate parents’ adherence to the MERLS platform.
To resolve this issue of parent interference behaviors, we adjusted MERLS 1.0 by adding new interface features (e.g., break pages and animated pictures to better engage children) and parent education and assessment materials (e.g., an instructional video and quiz questions), and developed a new version, MERLS 2.0 (Supplementary Table 3).
4 Study 2: telehealth evaluation of MERLS
Building on the findings from Study 1, which demonstrated the feasibility of parent-administered assessments and highlighted the impact of interference behaviors, we made corresponding adjustments to the testing platform to improve its functionality. In Study 2, we explored how these design improvements affect parent behaviors and parent-child interactions in a telehealth setting. By shifting from in-person to virtual testing, Study 2 evaluates whether these interventions can reduce interference, increase support behaviors, and maintain data quality when parents independently administer the MERLS assessment at home.
4.1 Study 2 materials and methods
4.1.1 Participants
A total of 43 ME-speaking parent-child dyads (see Supplementary Table 4 for demographic information) in North America were recruited in Study 2 through advertisements on social media platforms such as WeChat. Participating children were aged 3 to 10 years, were typically developing, and had normal or corrected-to-normal vision with no known genetic, neurological, or psychiatric disorders. All children used Mandarin and English in at least 20% of their daily life. This 20% threshold aligns with established bilingual assessment protocols (Hoff et al., 2012; Peña et al., 2014) and ASHA clinical practice guidelines for identifying bilingual status in pediatric populations (De Lamo White and Jin, 2011). Dyads completed the task remotely from their homes via Zoom, without an in-person experimenter present (Pearson Education Inc., 2020). For a more in-depth analysis of parent-child interaction patterns, a subset of 36 bilingual ME-speaking parent-child dyads from this larger group was selected for detailed analysis of parent-child interaction modes, including verbal utterances and non-verbal behaviors. This subset was chosen because only these 36 videos contained observable parental behaviors or verbal utterances during the assessment; the remaining 7 of the 43 dyads showed none.
4.1.2 Materials
The MERLS 2.0 was developed as an updated version of MERLS 1.0, incorporating the redesign recommendations outlined in Du et al. (2020) (Supplementary Table 3). A major enhancement in MERLS 2.0 was the addition of a three-minute parent training video that provided a comprehensive orientation to the testing procedure (Supplementary Figure 2). The video included demonstrations of both prohibited parent interference behaviors and acceptable supportive behaviors (Supplementary Figure 3). Additional updates included a brief parent assessment quiz to reinforce understanding of the protocol, revised testing item order, and updated graphic designs to maintain child engagement.
4.1.3 Procedure
Mandarin-English Receptive Language Screener 2.0 was administered once at children’s homes by their caregiver. In cases where internet connectivity was insufficient to support the video conferencing platform (e.g., for P2, P8, P9), an experimenter provided support by screen sharing and granting the child remote control access to complete the task. Families received an online testing preparation sheet one day before the scheduled appointment, outlining the required equipment and environment setup. During the session, the task was presented on the screen of a computer or iPad positioned approximately two feet away from the child. The child was instructed to respond to the questions by selecting answers via the iPad touchscreen, a mouse, or a touchpad. Parents were allowed to help with technical difficulties, such as helping the child click responses. Audio instructions were played through headphones and/or speakers and also shared with researchers via Zoom. Parents were instructed to adjust the audio volume to a comfortable level during the newly added instructional video, which they viewed before the task began. To ensure comprehension of the test protocols, parents completed a quiz at the end of the instructional video before beginning the assessment. After watching the instructional video, the parent accessed the MERLS 2.0 website via a link shared in the Zoom chat. The testing process was recorded on Zoom, capturing both the shared screen (to document children’s testing progress) and the webcam video (to observe parent-child interactions during the assessment). Animated break pages were built in to give children breaks during the assessment. The experimenter remained muted throughout the testing process unless technical issues required intervention. Additionally, pre-assessment questionnaires were completed by parents independently via an online survey platform (REDCap) with built-in validation checks.
Two trained research assistants (1 graduate student, 1 undergraduate research assistant) reviewed all completed questionnaires for completeness and clarity. Follow-up clarification was conducted via email or brief Zoom calls when responses were unclear or incomplete.
4.1.4 Data analysis
All parent-child interactions were video-recorded using the Zoom recording function. Two ME-speaking research assistants transcribed the videos verbatim, capturing children’s and parents’ verbal communication as well as nonverbal actions visible in the Zoom recordings, following the coding categories presented in Supplementary Tables 2, 4. Two research assistants independently coded all 43 videos with an inter-observer agreement (IOA) of 86.1%. Transcription accuracy was further verified using nine randomly selected videos out of the 43. All children’s performance during the test was automatically recorded and collected online.
Additionally, of the 43 videos, 36 contained observable parent verbal or non-verbal behaviors and were transcribed for further analysis. Transcription focused on participating children’s verbal utterances (CU) and non-verbal behaviors (CB), as well as their parents’ verbal utterances (PU) and non-verbal behaviors (PB) in both Mandarin and English sessions. First, PB, PU, CB, and CU were coded and documented to record the occurrences of these interactions in a spreadsheet with the de-identified participant ID and timestamps. Parent-child interactions were further classified into four types of codes: PB2CB, PB2CU, PU2CB, and PU2CU (Supplementary Table 4). After all behaviors were coded, an inter-rater reliability (IRR) check was conducted by two research assistants, who re-watched and independently coded 20% of randomly selected videos. Then, another senior researcher compared the consistency of the codes between the two research assistants. Across the four videos reviewed, 172 instances of children’s behavior were identified, with 141 coded consistently. Thus, the IRR for the coding process was 82.0%.
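The percent-agreement IRR described above reduces to a simple ratio; a minimal sketch, using the reported counts (141 consistently coded instances out of 172):

```python
# Percent agreement between two independent coders, as used for the IRR check.
def percent_agreement(n_consistent, n_total):
    """Percent of coded instances on which both coders agreed."""
    return round(100 * n_consistent / n_total, 1)

irr = percent_agreement(141, 172)  # → 82.0
```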
We first ran descriptive statistics to generate an overall pattern of parent and child behaviors. Each occurrence of PB, PU, CB, and CU was counted as 1. Each PB or PU followed by one occurrence of CB or CU within two timestamps was counted as one parent-child interaction (i.e., PB2CB, PB2CU, PU2CB, PU2CU). Descriptive analysis was conducted for PB, PU, CB, and CU, as well as the four types of parent-child interaction in both Mandarin and English sessions. A paired t-test was used to examine differences in occurrences between the two language sessions. Additionally, a qualitative interaction analysis (Jordan and Henderson, 1995) was conducted by two authors, who analyzed the transcripts with the most parent behaviors and the parent behaviors that led to children’s utterances and interactions. This qualitative interaction analysis primarily focused on: (1) how parents supported young children, and (2) how parents encouraged young children to engage in the online assessment task.
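The interaction-counting rule above can be sketched in code. This is a hedged illustration only: it assumes “within two timestamps” means the child event (CB/CU) occurs within the next two coded events after the parent event (PB/PU), and the event stream is invented for demonstration.

```python
# Sketch of the parent->child interaction counting rule (PB2CB, PB2CU,
# PU2CB, PU2CU). Assumption: "within two timestamps" = the child event
# appears within the next two coded events. Event stream is illustrative.
from collections import Counter

PARENT = {"PB", "PU"}
CHILD = {"CB", "CU"}

def count_interactions(events, window=2):
    """Count interaction types from an ordered list of behavior codes."""
    counts = Counter()
    for i, code in enumerate(events):
        if code in PARENT:
            for j in range(i + 1, min(i + 1 + window, len(events))):
                if events[j] in CHILD:
                    counts[f"{code}2{events[j]}"] += 1
                    break  # pair each parent event with the first child event only
    return counts

events = ["PU", "CU", "PB", "PB", "CB", "PU", "PU", "CU"]
counts = count_interactions(events)  # e.g., Counter({'PU2CU': 3, 'PB2CB': 2})
```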
4.2 Study 2 results
4.2.1 Types and frequency of parent behaviors
The 43 parents in Study 2 demonstrated a total of 795 behaviors, including 50 interference behaviors and 745 support behaviors. A total of 42 of the 43 parent-child dyads adhered to the assessment protocol, demonstrating fewer than 10 parental interference behaviors (Kelders et al., 2011). Only one parent demonstrated more than 10 interference behaviors during the test (Mandarin: n = 13; English: n = 0). Four dyads experienced technical issues during the assessment, which led to an increase in verbal technical support behaviors. The different types and frequencies of parent behaviors in Study 2 are presented in Supplementary Table 5.
4.2.2 Overall parent behaviors across Study 1 and Study 2
To compare parent interference and support behaviors between Study 1 and Study 2, Shapiro-Wilk tests (Ghasemi and Zahediasl, 2012) were first conducted to check the normality of parent behaviors during the Mandarin and English modules. None of the variables was normally distributed. A Wilcoxon rank sum test was therefore performed to check for significant differences in parent behaviors between the two groups. Parents in Study 2 demonstrated greater adherence to the assessment protocol and displayed fewer parent interference behaviors (W = 485, p = 0.007). On average, each parent demonstrated 18.5 interference behaviors in Study 1 and 1.2 interference behaviors in Study 2.
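A minimal sketch of this analysis pipeline, using synthetic interference counts; note that scipy’s `mannwhitneyu` is equivalent to the Wilcoxon rank sum test (R’s `wilcox.test`, whose W statistic is reported above), and all numbers here are illustrative, not the study’s data.

```python
# Normality check followed by a rank-sum comparison of two independent
# groups of parent interference counts. Data are synthetic/illustrative.
from scipy.stats import shapiro, mannwhitneyu

# Hypothetical interference counts per parent in each study
study1 = [18, 25, 3, 40, 2, 1, 30, 22, 0, 5, 1, 2, 35, 4, 1, 60]
study2 = [0, 1, 0, 2, 0, 0, 13, 0, 1, 0, 0, 1, 0, 0, 2, 0]

# Shapiro-Wilk: a small p-value suggests a non-normal distribution,
# motivating a non-parametric group comparison
_, p_norm = shapiro(study1)

# Rank-sum test for a difference between the two (independent) groups
stat, p_diff = mannwhitneyu(study1, study2, alternative="two-sided")
```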
4.2.3 Parent behaviors across language modules and studies
To examine parent behaviors between the English and Mandarin modules, Wilcoxon rank sum tests were conducted on parent interference and support behaviors across Study 1 and Study 2. Initial analyses indicated that parent interference behaviors decreased significantly in Study 2 compared to Study 1 in the Mandarin modules (W = 485, p = 0.004) and English modules (W = 482, p = 0.003), especially for the “Repeating Questions” behavior (Mandarin: W = 509, p < 0.001; English: W = 445, p = 0.01) and “Analyzing Items” (Mandarin: W = 464, p = 0.002; English: W = 450, p < 0.001). Parents also displayed significantly fewer “Judging of Correctness” behaviors in Study 2 in the English modules (W = 507, p < 0.001). The decreases were not significant, however, for “Answering Questions” in either language module (Mandarin: W = 385, p = 0.17; English: W = 344, p = 0.81) or for “Judging of Correctness” (W = 432, p = 0.07) in the Mandarin modules.
To examine the consistency of parent behaviors across languages within each study, a Wilcoxon signed-rank test was conducted on parent behaviors between the Mandarin and English modules. Analyses indicated no significant differences in parent behaviors across language modules in Study 1 for MERLS 1.0 (interference behaviors: W = 154, p = 0.30; support behaviors: W = 159, p = 0.25). Similarly, there were no significant differences in parents’ behavior patterns across the two language modules in Study 2 (interference behaviors: W = 968, p = 1; support behaviors: W = 1,041, p = 0.68).
The descriptive statistics for different modes of parent-child dyads in the English and Mandarin sessions are presented in Supplementary Table 6. Paired t-tests were conducted to examine differences between language sessions. The results showed that children exhibited significantly more utterances (CU, t = −3.299, p < 0.01) in Mandarin sessions compared to English sessions. Additionally, more child behaviors following parent verbal utterances (PU2CB, t = −2.190, p < 0.05) were found in Mandarin than in English sessions. For other types of parent-child interactions, no significant differences were found between the two language sessions.
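The paired comparison above can be sketched as follows, with hypothetical per-dyad CU counts; `scipy.stats.ttest_rel` implements the paired t-test, and a negative t indicates fewer utterances in the first (English) condition, matching the direction reported.

```python
# Paired t-test on child utterance (CU) counts across the two language
# sessions for the same dyads. Counts are hypothetical/illustrative.
from scipy.stats import ttest_rel

cu_english  = [2, 5, 1, 0, 4, 3, 6, 2, 1, 3]
cu_mandarin = [5, 8, 4, 2, 7, 5, 9, 4, 3, 6]

# Negative t: the first condition (English) has fewer utterances
t, p = ttest_rel(cu_english, cu_mandarin)
```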
Building on the quantitative findings, particularly the increased child utterances and parent utterances leading to child behaviors observed in Mandarin sessions, an in-depth qualitative interaction analysis was conducted to examine the specific ways parents supported and encouraged their children. This analysis revealed three primary types of parental support: (1) technical guidance on using the MERLS platform, (2) encouragement and re-engagement strategies, and (3) clarification of meaning when children struggled with Mandarin vocabulary. For example, in one Mandarin session (P5, Supplementary Figure 4), the transcript illustrated parent-child interactions across four test items. When the child encountered the sentence “The calf is carrying a crocodile who is painting,” she turned to her mother for assistance (Line 1). The mother leaned in to read the sentence aloud and explained the clicking process (Line 3), providing the child with technical support that enabled her to make the correct selection. However, when the child heard a later Mandarin sentence she did not understand, she became anxious and repeatedly verbalized the item (Lines 6–9). The mother first offered emotional reassurance (“It’s okay,” Line 8) and encouraged the child to complete the task independently. When confusion persisted, the mother began translating specific Mandarin terms into English to help the child understand the item.
These patterns illustrate that parental support tended to increase from minimal assistance (e.g., guiding technical interaction) to more involved clarification when children showed signs of distress or disengagement. Notably, parents rarely gave direct answers unless the child became visibly frustrated. Instead, they employed prompts like “listen again,” “try it yourself,” or “calm down” to help children re-engage. In addition, parents offered affirmative feedback to maintain motivation. Common phrases included “good job,” “yes, that’s it,” and “you’re doing great,” which often prompted enthusiastic responses from children (e.g., “Oh! Yes!” or “I’m correct!”), suggesting that emotional support played a role in sustaining engagement. These findings highlight how parents in the Mandarin module not only adhered to the MERLS protocol but also actively scaffolded their children’s participation using verbal strategies that supported comprehension, emotional regulation, and task persistence, especially when the child encountered linguistic or attentional challenges.
4.2.4 Contextual variables of parent behaviors during MERLS 2.0
The above results indicate that caregivers can be effectively trained by a short instructional video to adhere to the test protocols for MERLS 2.0. The following analyses describe three contextual variables (children’s age, children’s device use, and children’s assessment performance) and consider how these factors influence parent interference behaviors while administering MERLS 2.0. Children’s age was reported by parents and recorded in months. Children’s performance was measured as the percentage accuracy that children obtained in the language tasks, which indicates language proficiency. Children’s device use was collected from the digital media questionnaire and reported on a Likert scale from 1 to 3, where 1 indicates the child almost never uses computers or electronic devices, 2 indicates use at least once a week, and 3 indicates use almost every day (see Supplementary Table 7).
A linear regression model showed that parent interference behaviors decreased with children’s age (βStd. = −0.388, 95% CI [−0.838, 0.062], p = 0.09), though the correlation was only marginally significant (Supplementary Figure 5). In addition, parent interference behaviors significantly decreased with children’s device use (βStd. = −1.54, 95% CI [−2.72, −0.36], p = 0.012) (Supplementary Figure 6). A linear regression model also showed that parent interference behaviors decreased with children’s performance on MERLS, but the correlation did not reach significance (βStd. = −2.90, 95% CI [−6.96, 1.17], p = 0.158) (Supplementary Figure 7). These findings suggest that as children grow older or demonstrate greater capability in completing the language tasks, parents tend to interfere less during the assessment. This pattern may reflect parents adapting their behaviors to match their children’s increasing language abilities.
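A standardized regression coefficient (βStd.) of the kind reported above is the slope obtained after z-scoring both predictor and outcome. A sketch with synthetic data (all variable names and values are illustrative, not the study’s data):

```python
# Simple linear regression with a standardized coefficient: z-score both
# variables, then the fitted slope equals the standardized beta.
# All data below are synthetic and for illustration only.
import numpy as np
from scipy.stats import linregress

def standardized_regression(x, y):
    """Regress z-scored y on z-scored x; slope = standardized beta."""
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    return linregress(zx, zy)

rng = np.random.default_rng(0)
age_months = rng.uniform(36, 120, size=40)                    # 3-10 years
interference = 10 - 0.05 * age_months + rng.normal(0, 2, 40)  # declines with age

result = standardized_regression(age_months, interference)
# result.slope is the standardized beta; result.pvalue tests its significance
```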
4.3 Study 2 discussion
In this study, by exploring parent-administered bilingual assessment using the MERLS, we aimed to investigate the feasibility of using parents as citizen scientists to support test administration with their bilingual children in lieu of a researcher or clinician. Correlation analyses of test-retest reliability and item-by-item agreement between parent-administered and researcher-administered sessions revealed good to excellent reliability, comparable to gold-standard clinical assessments, and align with prior research suggesting that parents can be trained to assist in developmental and language screenings under supervision (e.g., Crais et al., 2006; Roberts and Kaiser, 2011). The primary difference between Study 1 (in-person assessment) and Study 2 (telehealth assessment) was the addition of an instructional video in Study 2 to improve parental adherence. While Study 1 involved an in-person setting with researchers present, Study 2 adopted a telehealth format in which parents administered the MERLS assessment virtually. The language tasks and test objectives of both studies were identical, which allows a direct evaluation of the video intervention’s impact on parents’ behaviors and parent-child interaction patterns, especially in a telehealth context. However, we acknowledge that the observed reduction in interference behaviors cannot be attributed solely to the instructional video, as the change in modality (in-person vs. telehealth) and the passage of time between studies may have also influenced results. The study findings not only demonstrate the potential of the citizen science approach to gather large-scale speech assessment data for establishing robust bilingual language norms but also provide insights into the utility of telehealth service delivery to overcome geographical and access barriers.
While most previous studies explored the possibility of involving parents in in-person test administration, our work highlights the feasibility of telehealth formats, where assessment can occur remotely to circumvent the geographical barriers. This format could help parents monitor their children’s progress independently and seek professional support when necessary. However, it is important to clarify that in the current telehealth format, test administration remains supported by researchers or healthcare clinicians, ensuring adherence to protocols and addressing technical challenges. Future adaptations to MERLS could explore fully autonomous parent-administered assessments.
4.3.1 Improvement of parent adherence from Study 1 to Study 2
This study also demonstrated that brief instructional interventions can significantly improve parent adherence to test protocols, laying the foundation for recruiting caregivers as test administrators in telehealth language assessments. In Study 1, parents’ ability to administer the assessment was evaluated against professional examiners in a supervised, in-person context. In Study 2, parents administered the test virtually after watching an instructional video, an intervention that improved adherence while reducing interference behaviors. Moreover, our detailed qualitative analysis revealed specific patterns of verbal and nonverbal parent support during bilingual telehealth assessments.
Our results showed that all four types of parent interference behaviors (Repeating Questions, Answering Questions, Analyzing Items, Judging of Correctness) decreased significantly in Study 2 compared to Study 1, demonstrating the effectiveness of the instructional video intervention method. Notably, parent supporting behaviors, such as encouragement, remained stable, indicating that parents continued to provide motivational scaffolding without intruding on the child’s task performance. This balance, which preserves children’s task independence while allowing parents to offer appropriate affective support, is important for the validity of language assessment conducted in non-clinical settings. These findings provided robust statistical evidence for the feasibility of the citizen science approach, demonstrating that caregivers can be systematically trained to administer assessments and thereby contribute to large-scale collection of bilingual language data.
4.3.2 Scaffolding and language preferences in bilingual parent-child interaction
In-depth interaction analysis revealed how parent support behaviors shaped children’s participation during the bilingual assessment. Children’s increased verbal interactions in Mandarin often stemmed from requests for technical assistance or clarification of meaning, leading parents to provide tailored verbal cues, including technical guidance, encouragement to re-engage, and clarifications. These dynamics highlight the adaptive role parents play as facilitators, adjusting their strategies based on the child’s needs. Beyond the overall adherence findings, the analysis of parent-child interaction modes revealed a strong communication preference for Mandarin among ME-speaking children during the telehealth assessment, with children exhibiting significantly more verbal utterances (CU) in Mandarin sessions than in English sessions. This preference likely stems from the cognitive, cultural, and emotional connections children have with their heritage language (Cummins, 2001; Levey and Polirstok, 2010). Parents play a key role in reinforcing this dynamic by offering greater scaffolding and support in their shared primary language (Yeh, 2019), as evidenced by the qualitative findings of adaptive parental involvement during Mandarin sessions. This preference underscores a crucial consideration for designing technology-mediated parent-child interaction systems. Children’s linguistic comfort zones can significantly impact their engagement and performance during structured interactions (Puckett et al., 2009). For bilingual families, platforms should not only support multiple languages but also adapt dynamically to children’s language preferences and communication styles (Verhagen et al., 2022; Hoff et al., 2012).
Incorporating features like “awareness display” (Gao et al., 2015), which include culturally contextualized prompts, adaptive language scaffolding, and sensitivity to family communication styles, could help ensure the accuracy of telehealth assessments. These features would also enhance the effectiveness of child-focused technologies, supporting children’s development, especially for those from diverse linguistic backgrounds.
4.3.3 Contextual variables influencing parent interference behaviors
Children’s behaviors and performance often act as mediators of parent interference behaviors. Previous research has found that parent and children’s behaviors coregulate and reflect moment-to-moment coordination of goal-oriented behaviors (Calkins, 2010). To investigate the effects of children’s behaviors on parent interference behaviors during the test, we examined three variables: age, device use, and test performance. This study includes a wide age range (3–10 years), which encompasses several distinct developmental stages. Cognitive and behavioral characteristics vary significantly across this span (Diamond, 2002). For instance, younger children (e.g., 3–5 years old) typically have shorter attention spans and less experience with digital interfaces (McClelland et al., 2006; Mahone and Schneider, 2012), while older, school-aged children (e.g., 6–10 years old) generally possess greater task autonomy and digital literacy (Liu et al., 2024). As expected, parent interference behaviors decreased as children grew older, likely due to increased linguistic competence and independence. Interestingly, parent interference behaviors also significantly decreased when children spent more time on digital devices, further supported by our qualitative observations that familiarity with the assessment interface reduced the need for direct parental technical assistance. This finding aligns with studies on digital literacy, which emphasize the role of child familiarity in reducing reliance on parental assistance (Neumann and Neumann, 2017). Lastly, while children’s test performance was negatively associated with parent interference behaviors, the relationship did not reach statistical significance. Nonetheless, the trend suggests that parents might adapt their behaviors to children’s language capabilities, which is an important consideration for designing scalable, parent-led assessments.
5 Study limitations
This study has several limitations that could be addressed in future research. First, the data for the two groups were collected in different modalities: Study 1 was conducted in-person, and Study 2 was conducted virtually via Zoom. This change in testing modality introduces potential confounding factors. While our detailed video analysis in Study 2 aimed to capture both verbal and non-verbal interactions, the virtual setting and limited camera angle inherently constrained the complete observation of all non-verbal parent behaviors occurring outside the video frame. Therefore, it is important to acknowledge that some non-verbal interference behaviors in Study 2 might have gone uncaptured, potentially influencing the observed reduction in overall interference. Future studies should mitigate this limitation by using self-recording devices or multiple camera angles to capture a holistic view of the testing environment. Second, Study 1 had design and data collection shortcomings. Specifically, parents chose the order of completing the English and Mandarin modules based on their preferences, rather than through random assignment. A counterbalanced design, where module order is systematically alternated, would improve the study’s internal validity. Third, our sample primarily consisted of middle-class families with highly educated parents, which limits the generalizability of our findings to families from diverse socioeconomic and cultural backgrounds. Socioeconomic factors may influence parent digital literacy, access to reliable internet and devices, availability of quiet testing environments, and cultural beliefs about parent roles in formal assessment. Future studies should prioritize recruitment from diverse socioeconomic groups to evaluate whether parent adherence patterns vary across demographics. Community partnerships with Title I schools, community health centers, and immigrant service organizations may help achieve more representative sampling.
6 Conclusion
Traditional approaches to language assessment face several challenges, including a shortage of bilingual SLPs, limited availability of bilingual language assessment tools, and insufficient development of bilingual language norms. This study demonstrates how citizen science, an underutilized data collection method, can expand the current assessment paradigm by positioning parents as active contributors to research and service delivery. Beyond its immediate implications for ME assessments, this approach highlights how collaborative, family-centered methods can reshape the way child language data are collected, diversify research samples, and accelerate the development of more equitable assessment tools across languages. The online MERLS test is equipped with an automated scoring system based on prior research (Gale et al., 2021), simplifying the process for parent-child dyads to access it via standard telehealth equipment (e.g., laptops, videoconferencing software). Our findings from both studies yielded technical design insights for improving parental adherence, as well as qualitative insights regarding contextual factors observed in parent-child interactions (e.g., more supportive behaviors in Mandarin sessions, and more child utterances following parents’ behaviors), particularly in the dominant home language, offering recommendations for future language assessment development.
Additionally, this study evaluates the feasibility of telehealth assessment to increase access to bilingual SLPs for ME-speaking children, while also enabling a larger group of monolingual clinicians to administer tests to bilingual children, addressing the unique service needs of bilingual assessment for children from diverse sociocultural backgrounds in the speech-language field (Hyter and Salas-Provance, 2019; De Lamo White and Jin, 2011). By establishing the feasibility of remote, parent-administered bilingual assessment, this work provides a foundation for developing scalable approaches to identify language disorders in bilingual children, a population with significant special educational needs arising from systemic assessment barriers and limited access to bilingual clinical services. Future work should continue to examine the utility of the citizen science approach to accommodate a wider range of parent-child profiles across socioeconomic status, geographic location, and cultural backgrounds, and for more complex language assessment tasks, to ensure comprehensive, inclusive, and equitable child language assessment practices. Large-scale validation studies are needed to compare MERLS outcomes from parent-administered sessions with gold-standard diagnoses made by qualified bilingual speech-language pathologists in real-world clinical settings. Sensitivity, specificity, and positive and negative predictive values must be established across different age groups and language proficiency levels. Lastly, more implementation research should examine the scalability of this approach, including cost-effectiveness analyses, integration into clinical workflows, parent satisfaction and retention over time, and quality assurance mechanisms for maintaining data integrity at scale.
Only through such comprehensive validation can we move from a promising feasibility study to a clinically viable assessment tool that improves access to equitable language services for bilingual children.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by the University of Delaware Institutional Review Board. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants’ legal guardians/next of kin. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.
Author contributions
YD: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing. YT: Formal analysis, Visualization, Writing – original draft, Writing – review & editing. KF: Formal analysis, Writing – original draft, Writing – review & editing. YL: Formal analysis, Writing – original draft, Writing – review & editing. DW: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Writing – original draft, Writing – review & editing. XT: Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing. YL: Formal analysis, Writing – original draft, Writing – review & editing. QZ: Formal analysis, Writing – original draft, Writing – review & editing. SQ: Formal analysis, Writing – original draft, Writing – review & editing. JX: Supervision, Writing – review & editing, Validation. LS: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Software, Supervision, Writing – original draft, Writing – review & editing.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This study was supported by a Spencer Foundation Research Grant (Small) and an ASHA Multicultural Activities Grant (#015703-00001), which funded the development of the platform, research assistantships, and human subject compensation for the data collected.
Acknowledgments
We acknowledge the contributions of the following individuals to data collection and analysis for this project: Elena Hu, Sharon Hollenbach, Caila Walsh, and Ganya Luo. We especially appreciate the following individuals for their generous feedback on this manuscript: Joseph Hin Yan Lam, Yannan Li, and Gedeon Deák.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/feduc.2025.1696031/full#supplementary-material
References
Alnemary, F. M., Wallace, M., Symon, J. B., and Barry, L. M. (2015). Using international videoconferencing to provide staff training on functional behavioral assessment. Behav. Intervent. 30, 73–86.
American Speech-Language-Hearing Association [ASHA] (2025a). Multilingual service providers. Rockville, MD: ASHA.
American Speech-Language-Hearing Association [ASHA] (2025b). Profile of ASHA multilingual service providers, Year-End 2024. Rockville, MD: ASHA.
Barkaia, A., Stokes, T. F., and Mikiashvili, T. (2017). Intercontinental telehealth coaching of therapists to improve verbalizations by children with autism. J. Appl. Behav. Anal. 50, 582–589.
Bhattacharjee, Y. (2005). Citizen scientists supplement work of Cornell researchers. Science 308, 1402–1403. doi: 10.1126/science.308.5727.1402
Bishara, A. J., and Hittner, J. B. (2012). Testing the significance of a correlation with nonnormal data: Comparison of Pearson, Spearman, transformation, and resampling approaches. Psychol. Methods 17, 399–417. doi: 10.1037/a0028087
Boerma, T., and Blom, E. (2017). Assessment of bilingual children: What if testing both languages is not possible? J. Commun. Disord. 66, 65–76. doi: 10.1016/j.jcomdis.2017.04.001
Bonney, R., Phillips, T. B., Ballard, H. L., and Enck, J. W. (2016). Can citizen science enhance public understanding of science? Public Understand. Sci. 25, 2–16. doi: 10.1177/0963662515607406
Brandone, A. C., Golinkoff, R. M., and Hirsh-Pasek, K. (2008). Feasibility of computer-administered language assessment. Perspect. School-Based Issues 9, 57–65. doi: 10.1044/sbi9.2.57
Brown, J. A., and Woods, J. J. (2015). Effects of a triadic parent-implemented home-based communication intervention for toddlers. J. Early Intervent. 37, 44–68. doi: 10.1177/1053815115589350
Caesar, L. G., and Kohler, P. D. (2007). The state of school-based bilingual assessment: Actual practice versus recommended guidelines. Lang. Speech Hear. Services Schools 38, 190–200. doi: 10.1044/0161-1461(2007/020)
Calkins, S. D. (2010). “Caregiving as coregulation: Psychobiological processes and child functioning,” in Biosocial foundations of family processes, eds A. Booth, S. M. McHale, and N. S. Landale (New York, NY: Springer New York), 49–59.
Castilla-Earls, A., Bedore, L., Rojas, R., Fabiano-Smith, L., Pruitt-Lord, S., Restrepo, M. A., et al. (2020). Beyond scores: Using converging evidence to determine speech and language services eligibility for dual language learners. Am. J. Speech-Lang. Pathol. 29, 1116–1132. doi: 10.1044/2020_AJSLP-19-00179
Ciccia, A. H., Whitford, B., Krumm, M., and McNeal, K. (2011). Improving the access of young urban children to speech, language and hearing screening via telehealth. J. Telemed. Telecare 17, 240–244. doi: 10.1258/jtt.2011.100810
Cohen, J., Cohen, P., West, S. G., and Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences, 3rd Edn. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Cooc, N. (2018). Examining the underrepresentation of Asian Americans in special education: New trends from California school districts. Exceptionality 26, 1–19. doi: 10.1080/09362835.2016.1216847
Corona, L. L., Weitlauf, A. S., Hine, J., Berman, A., Miceli, A., Nicholson, A., et al. (2021). Parent perceptions of caregiver-mediated telemedicine tools for assessing autism risk in toddlers. J. Autism Dev. Disord. 51, 476–486. doi: 10.1007/s10803-020-04554-9
Cowan, T., Paroby, C., Leibold, L. J., Buss, E., Rodriguez, B., and Calandruccio, L. (2022). Masked-speech recognition for linguistically diverse populations: A focused review and suggestions for the future. J. Speech Lang. Hear. Res. 65, 3195–3216. doi: 10.1044/2022_JSLHR-22-00011
Crais, E. R., Roy, V. P., and Free, K. (2006). Parents’ and professionals’ perceptions of the implementation of family-centered practices in child assessments. Am. J. Speech Lang. Pathol. 15, 365–377. doi: 10.1044/1058-0360(2006/034)
Cummins, J. (2001). Bilingual children’s mother tongue: Why is it important for education? Sprogforum 7, 15–20.
De Lamo White, C., and Jin, L. (2011). Evaluation of speech and language assessment approaches with bilingual children. Intern. J. Lang. Commun. Disord. 46, 613–627. doi: 10.1111/j.1460-6984.2011.00049.x
Dekhtyar, M., Braun, E. J., Billot, A., Foo, L., and Kiran, S. (2020). Videoconference administration of the western aphasia battery-revised: Feasibility and validity. Am. J. Speech-Lang. Pathol. 29, 673–687. doi: 10.1044/2019_AJSLP-19-00023
Diamond, A. (2002). “Normal development of prefrontal cortex from birth to young adulthood: Cognitive functions, anatomy, and biochemistry,” in Principles of frontal lobe function, eds D. T. Stuss and R. T. Knight (Oxford: Oxford University Press), 466–503.
Dodge-Chin, C., Shigetomi-Toyama, S., and Quinn, E. D. (2022). Teaching parents read, ask, answer, prompt strategies via telepractice: Effects on parent strategy use and child communication. Lang. Speech Hear. Services Schools 53, 237–255. doi: 10.1044/2021_LSHSS-21-00075
Dollaghan, C. A., and Horner, E. A. (2011). Bilingual language assessment: A meta-analysis of diagnostic accuracy. J. Speech Lang. Hear. Res. 54, 1077–1088. doi: 10.1044/1092-4388(2010/10-0093)
Du, Y., Liu, Y., Tang, Y., Fong, K. K., and Sheng, L. (2021). “Parental adherence to test protocols: Exploratory study on a remote, web-based Mandarin-English receptive language assessment,” in Proceedings of the 30-minute technical research oral presentation accepted at the 2021 American Speech-Language and Hearing Association Annual Convention, (Washington, DC).
Du, Y., Sheng, L., and Tekinbas, K. S. (2020). “‘Try your best’: Parent behaviors during administration of an online language assessment tool for bilingual Mandarin-English children,” in Proceedings of the Interaction Design and Children Conference. Association for Computing Machinery, (New York, NY), 409–420. doi: 10.1145/3392063.3394441
Farmani, E., Fekar Gharamaleki, F., and Nazari, M. A. (2024). Challenges and opportunities of tele-speech therapy: Before and during the COVID-19 pandemic. J. Public Health Res. 13:22799036231222115. doi: 10.1177/22799036231222115
Farmer, R. L., McGill, R. J., Dombrowski, S. C., McClain, M. B., Harris, B., Lockwood, A. B., et al. (2020). Teleassessment with children and adolescents during the coronavirus (COVID-19) pandemic and beyond: Practice and policy implications. Prof. Psychol. Res. Pract. 51, 477–487. doi: 10.1037/pro0000349
Flores, G., and Tomany-Korman, S. C. (2008). The language spoken at home and disparities in medical and dental health, access to care, and use of services in US children. Pediatrics 121:e1703-14.
Fraisl, D., Hager, G., Bedessem, B., Gold, M., Hsing, P. Y., Danielsen, F., et al. (2022). Citizen science in environmental and ecological sciences. Nat. Rev. Methods Prim. 2:64. doi: 10.1038/s43586-022-00144-4
Freeman, M. R., and Schroeder, S. R. (2022). Assessing language skills in elementary-aged bilingual children: Current trends in research and practice. J. Child Sci. 12, e33–e46. doi: 10.1055/s-0042-1743575
Frigerio, P., Monte, L. D., Sotgiu, A., De Giacomo, C., and Vignoli, A. (2021). Parents’ satisfaction of tele-rehabilitation for children with neurodevelopmental disabilities during the COVID-19 pandemic. BMC Fam. Pract. 23:146. doi: 10.1186/s12875-022-01747-2
Gale, R., Bird, J., Wang, Y., van Santen, J., Prud’hommeaux, E., Dolata, J., et al. (2021). Automated scoring of tablet-administered expressive language tests. Front. Psychol. 12:668401. doi: 10.3389/fpsyg.2021.668401
Gao, G., Yamashita, N., Hautasaari, A. M., and Fussell, S. R. (2015). “Improving multilingual collaboration by displaying how non-native speakers use automated transcripts and bilingual dictionaries,” in Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, (New York, NY: ACM), 3463–3472.
Ghasemi, A., and Zahediasl, S. (2012). Normality tests for statistical analysis: A guide for non-statisticians. Intern. J. Endocrinol. Metab. 10, 486–489. doi: 10.5812/ijem.3505
Gillam, R. B., Peña, E. D., Bedore, L. M., Bohman, T. M., and Mendez-Perez, A. (2013). Identification of specific language impairment in bilingual children: Assessment in English. J. Speech Lang. Hear. Res. 56, 1813–1823. doi: 10.1044/1092-4388(2013/12-0056)
Golinkoff, R. M., De Villiers, J., Hirsh-Pasek, K., Iglesias, A., Wilson, M. S., Morini, G., et al. (2017). User’s manual for the quick interactive Language Screener (QUILS): A measure of vocabulary, syntax, and language acquisition skills in young children. Baltimore, MD: Paul H. Brookes Publishing Co.
Gosling, S. D., Vazire, S., Srivastava, S., and John, O. P. (2004). Should we trust web-based studies? A comparative analysis of six preconceptions about internet questionnaires. Am. Psychol. 59, 93–104. doi: 10.1037/0003-066X.59.2.93
Gov.UK (2020). English proficiency: Pupils with English as an additional language. London: Department for Education.
Grenoble, L. A., and Osipov, B. (2023). The dynamics of bilingualism in language shift ecologies. Ling. Approaches Biling. 13, 1–39. doi: 10.1075/lab.22035.gre
Grillo, E. U. (2021). Functional voice assessment and therapy methods supported by telepractice, Voiceevalu8, and Etill voice training. Sem. Speech Lang. 42, 41–53. doi: 10.1055/s-0040-1722753
Grimm, A., and Schulz, P. (2014). Specific language impairment and early second language acquisition: The risk of over- and underdiagnosis. Child Indicators Res. 7, 821–841. doi: 10.1007/s12187-013-9230-6
Hay-Hansson, A. W., and Eldevik, S. (2013). Training discrete trials teaching skills using videoconference. Res. Autism Spectrum Disord. 7, 1300–1309.
Hoff, E., Core, C., Place, S., Rumiche, R., Señor, M., and Parra, M. (2012). Dual language exposure and early bilingual development. J. Child Lang. 39, 1–27. doi: 10.1017/S0305000910000759
Hu, S., Gavarró, A., Vernice, M., and Guasti, M. T. (2016). The acquisition of Chinese relative clauses: Contrasting two theoretical approaches. J. Child Lang. 43, 1–21. doi: 10.1017/S0305000914000865
Hyter, Y. D. (2021). “The power of words: A preliminary critical analysis of concepts used in speech, language, and hearing sciences,” in Critical perspectives on social justice in speech-language pathology, ed. R. Horton (Hershey, PA: IGI).
Hyter, Y. D., and Salas-Provance, M. B. (2019). Culturally responsive practices in speech, language, and hearing sciences. San Diego, CA: Plural Publishing Inc.
Jasso, J., McMillen, S., Anaya, J. B., Bedore, L. M., and Peña, E. D. (2020). The utility of an English semantics measure for identifying developmental language disorder in Spanish–English bilinguals. Am. J. Speech-Lang. Pathol. 29, 776–788. doi: 10.1044/2020_AJSLP-19-00202
Jia, G., and Aaronson, D. (2003). A longitudinal study of Chinese children and adolescents learning English in the United States. Appl. Psychol. 24, 131–161. doi: 10.1017/S0142716403000079
Jordan, B., and Henderson, A. (1995). Interaction analysis: Foundations and practice. J. Learn. Sci. 4, 39–103. doi: 10.1207/s15327809jls0401_2
Kan, P. F., Huang, S., Winicour, E., and Yang, J. (2020). Vocabulary growth: Dual language learners at risk for language impairment. Am. J. Speech-Lang. Pathol. 29, 1178–1195. doi: 10.1044/2020_AJSLP-19-00160
Kelders, S., Kok, R., and Van Gemert-Pijnen, J. (2011). “Technology and adherence in web-based interventions for weight control: A systematic review,” in Proceedings of the ACM international conference proceeding series, (New York, NY: ACM), doi: 10.1145/2467803.2467806
Khoshsima, H., and Toroujeni, S. M. H. (2017). Transitioning to an alternative assessment: Computer-based testing and key factors related to testing mode. Eur. J. English Lang. Teach. 2, 54–73. doi: 10.5281/zenodo.268576
Kimble, C. (2013). Speech-language pathologists’ comfort levels in English language learner service delivery. Commun. Disord. Quar. 35, 21–27. doi: 10.1177/1525740113487404
Klatte, I. S., Lyons, R., Davies, K., Harding, S., Marshall, J., McKean, C., et al. (2020). Collaboration between parents and SLTs produces optimal outcomes for children attending speech and language therapy: Gathering the evidence. Intern. J. Lang. Commun. Disord. 55, 618–628. doi: 10.1111/1460-6984.12538
Kritikos, E. P. (2003). Speech-language pathologists’ beliefs about language assessment of bilingual/bicultural individuals. Am. J. Speech-Lang. Pathol. 12, 73–91. doi: 10.1044/1058-0360(2003/054)
Langdon, H. W., and Quintanar-Sarellana, R. (2003). Roles and responsibilities of the interpreter in interactions with speech-language pathologists, parents, and students. Sem. Speech Lang. 24, 235–244. doi: 10.1055/s-2003-42826
Lehner, K., Pfab, J., and Ziegler, W. (2021). Web-based assessment of communication-related parameters in dysarthria: Development and implementation of the Kommpas web app. Clin. Ling. Phonet. 36, 1093–1111. doi: 10.1080/02699206.2021.1989490
Levey, S., and Polirstok, S. (2010). Language development: Understanding language diversity in the classroom. Thousand Oaks, CA: Sage Publications.
Li, W., Germine, L. T., Mehr, S. A., Srinivasan, M., and Hartshorne, J. (2024). Developmental psychologists should adopt citizen science to improve generalization and reproducibility. Infant Child Dev. 33:e2348. doi: 10.1002/icd.2348
Liu, S., Reynolds, B. L., Thomas, N., and Soyoof, A. (2024). The use of digital technologies to develop young children’s language and literacy skills: A systematic review. SAGE Open 14:21582440241230850. doi: 10.1177/21582440241230850
Magasi, S., Harniss, M., and Heinemann, A. W. (2018). Interdisciplinary approach to the development of accessible computer-administered measurement instruments. Arch. Phys. Med. Rehabil. 99, 204–210. doi: 10.1016/j.apmr.2017.06.036
Mahone, E. M., and Schneider, H. E. (2012). Assessment of attention in preschoolers. Neuropsychol. Rev. 22, 361–383. doi: 10.1007/s11065-012-9217-y
Marchman, V. A., Fernald, A., and Hurtado, N. (2010). How vocabulary size in two languages relates to efficiency in spoken word recognition by young Spanish–English bilinguals. J. Child Lang. 37, 817–840. doi: 10.1017/S0305000909990055
Marhefka, S., Lockhart, E., and Turner, D. (2020). Achieve research continuity during social distancing by rapidly implementing individual and group videoconferencing with participants: Key considerations, best practices, and protocols. AIDS Behav. 24, 1983–1989. doi: 10.1007/s10461-020-02837-x
McClelland, M. M., Acock, A. C., and Morrison, F. J. (2006). The impact of kindergarten learning-related skills on academic trajectories at the end of elementary school. Early Childhood Res. Quar. 21, 471–490. doi: 10.1016/j.ecresq.2006.09.003
McCrae, C. S., Chan, W. S., Curtis, A. F., Nair, N., Deroche, C. B., Munoz, M., et al. (2021). Telehealth cognitive behavioral therapy for insomnia in children with autism spectrum disorder: A pilot examining feasibility, satisfaction, and preliminary findings. Autism: Intern. J. Res. Pract. 25, 667–680. doi: 10.1177/1362361320949078
McLeod, S., and Crowe, K. (2018). Children’s consonant acquisition in 27 languages: A cross-linguistic review. Am. J. Speech-Lang. Pathol. 27, 1546–1571. doi: 10.1044/2018_AJSLP-17-0100
Molini-Avejonas, D. R., Rondon-Melo, S., Amato, C. A., and Samelli, A. G. (2015). A systematic review of the use of telehealth in speech, language, and hearing sciences. J. Telemed. Telecare 21, 367–376. doi: 10.1177/1357633x15583215
National Academies of Sciences, Engineering, and Medicine (2018). Improving Health Research on Small Populations. Washington, D.C: National Academies Press, doi: 10.17226/25112
Neumann, M. M., and Neumann, D. L. (2017). The use of touch-screen tablets at home and pre-school to foster emergent literacy. J. Early Childhood Liter. 17, 203–220. doi: 10.1177/1468798415619773
Norbury, C. F., Gooch, D., Wray, C., Baird, G., Charman, T., Simonoff, E., et al. (2016). The impact of nonverbal ability on prevalence and clinical presentation of language disorder: Evidence from a population study. J. Child Psychol. Psychiatry 57, 1247–1257. doi: 10.1111/jcpp.12573
Oetting, J. B. (2018). Prologue: Toward accurate identification of developmental language disorder within linguistically diverse schools. Lang. Speech Hear. Services Schools 49, 213–217. doi: 10.1044/2018_LSHSS-CLSLD-17-0156
Paradis, J. (2011). Individual differences in child English second language acquisition comparing child-internal and child-external factors. Ling. Approaches Biling. 1, 213–237. doi: 10.1075/lab.1.3.01par
Park, M., O’Toole, A., and Katsiaficas, C. (2017). Dual language learners: A national demographic and policy profile. Washington, DC: Migration Policy Institute.
Patterson, J. L., and Pearson, B. Z. (2004). “Bilingual lexical development: Influences, contexts, and processes,” in Bilingual language development and disorders in Spanish-English speakers, ed. B. A. Goldstein (Baltimore MD: Paul Brookes), 77–104.
Pearson, B. Z. (2013). “Distinguishing the bilingual as a late talker from the later talker who is bilingual,” in Late talkers: Language development, interventions, and outcomes, eds L. Rescorla and P. Dale (Baltimore MD: Paul Brookes), 67–87.
Peña, E. D., and Sutherland, R. (2022). Can you see my screen? Virtual assessment in speech and language. Lang. Speech Hear. Services Schools 53, 329–334. doi: 10.1044/2022_LSHSS-22-00007
Peña, E. D., Gillam, R. B., and Bedore, L. M. (2014). Dynamic assessment of narrative ability in English accurately identifies language impairment in English language learners. J. Speech Lang. Hear. Res. 57, 2208–2220. doi: 10.1044/2014_JSLHR-L-13-0151
Perrin, P. B., Rybarczyk, B. D., Pierce, B. S., Jones, H. A., Shaffer, C., and Islam, L. (2020). Rapid telepsychology deployment during the COVID-19 pandemic: A special issue commentary and lessons from primary care psychology training. J. Clin. Psychol. 76, 1173–1185. doi: 10.1002/jclp.22969
Puckett, M. B., Black, J. K., Wittmer, D. S., and Peterson, S. H. (2009). The young child. Upper Saddle River, NJ: Pearson.
Raddick, M. J., Bracey, G., Gay, P. L., Lintott, C. J., Murray, P., Schawinski, K., et al. (2009). Galaxy zoo: Exploring the motivations of citizen science volunteers. arXiv preprint arXiv:0909.2925
Radville, K. M., Larrivee, E. C., Baron, L. S., Kelley-Nazzaro, P., and Christodoulou, J. A. (2022). Online training modules for teaching assessment skills to graduate student clinicians. Lang. Speech Hear. Services Schools 53, 417–430. doi: 10.1044/2021_LSHSS-21-00068
Reinecke, K., and Gajos, K. Z. (2015). “LabintheWild: Conducting large-scale online experiments with uncompensated samples,” in Proceedings of the 18th ACM conference on computer supported cooperative work & social computing, (New York, NY: ACM), 1364–1378.
Roberts, M. Y., and Kaiser, A. P. (2011). The effectiveness of parent-implemented language interventions: A meta-analysis. Am. J. Speech Lang. Pathol. 20, 180–199. doi: 10.1044/1058-0360(2011/10-0055)
Ryan, C. L. (2013). Language use in the United States: 2011. Washington, DC: U.S. Department of Commerce, Economics and Statistics Administration, U.S. Census Bureau.
Samson, J. F., and Lesaux, N. K. (2009). Language-minority learners in special education: Rates and predictors of identification for services. J. Learn. Disabil. 42, 148–162. doi: 10.1177/0022219408326221
Schmitt, M. B., Tambyraja, S., Thibodeaux, M., and Filipkowski, J. (2022). Feasibility of assessing expressive and receptive vocabulary via telepractice for early elementary-age children with language impairment. Lang. Speech Hear. Services Schools 53, 445–453. doi: 10.1044/2021_LSHSS-21-00057
Schmitz, H., Howe, C. L., Armstrong, D. G., and Subbian, V. (2018). Leveraging mobile health applications for biomedical research and citizen science: A scoping review. J. Am. Med. Inform. Assoc. 25, 1685–1695. doi: 10.1093/jamia/ocy130.
Scott, K., Chu, J., and Schulz, L. (2017). Lookit (Part 2): Assessing the viability of online developmental research, results from three case studies. Open Mind 1, 15–29. doi: 10.1162/OPMI_a_00001
Shankar, V., Ramkumar, V., and Kumar, S. (2022). Understanding the implementation of telepractice in speech and language services using a mixed-methods approach. Wellcome Open Res. 7:46. doi: 10.12688/wellcomeopenres.17622.2
Sheng, L. (2018). “Typical and atypical lexical development,” in Handbook of communication disorders: Theoretical, empirical, and applied linguistic perspectives, 101–116.
Sheng, L., Lam, B. P. W., Cruz, D., and Fulton, A. (2016). A robust demonstration of the cognate facilitation effect in first-language and second-language naming. J. Exp. Child Psychol. 141, 229–238.
Sheng, L. I., Lu, Y., and Kan, P. F. (2011). Lexical development in Mandarin-English bilingual children. Bilingualism Lang. Cogn. 14, 579–587.
Sheng, L., Wang, D., Walsh, C., Heisler, L., Li, X., and Su, P. L. (2021). The bilingual home language boost through the lens of the COVID-19 pandemic. Front. Psychol. 12:667836. doi: 10.3389/fpsyg.2021.667836
Solano-Flores, G., Chia, M., and Kachchaf, R. (2019). Design and use of pop-up illustration glossaries as accessibility resources for second language learners in computer-administered tests in a large-scale assessment system. Intern. Multiling. Res. J. 13, 277–293. doi: 10.1080/19313152.2019.1611338
Song, L., Luo, R., and Liang, E. (2021). Dual language development of Chinese 3- and 4-year-olds: Associations with the family context and teachers’ language use. Early Educ. Dev. 33, 219–242. doi: 10.1080/10409289.2020.1865746
Song, L., Sheng, L., and Luo, R. (2022). Comprehension skills of Chinese-English dual language learners: Relations across languages and associations with language richness at home. Intern. J. Biling. Educ. Biling. 27, 19–37. doi: 10.1080/13670050.2022.2137386
Sullivan, G. M. (2011). A primer on the validity of assessment instruments. J. Graduate Med. Educ. 3, 119–120. doi: 10.4300/JGME-D-11-00075.1
Sutherland, R., Hodge, A., Chan, E., and Silove, N. (2021). Barriers and facilitators: Clinicians’ opinions and experiences of telehealth before and after their use of a telehealth platform for child language assessment. Intern. J. Lang. Commun. Disord. 56, 1263–1277. doi: 10.1111/1460-6984.12666
Talbott, M. R., Dufek, S., Zwaigenbaum, L., Bryson, S., Brian, J., Smith, I. M., et al. (2020). Brief report: Preliminary feasibility of the TEDI: A novel parent-administered telehealth assessment for autism spectrum disorder symptoms in the first year of life. J. Autism Dev. Disord. 50, 3432–3439. doi: 10.1007/s10803-019-04314-4
Tang, Y., Triesch, J., and Deák, G. O. (2023). Variability in infant social responsiveness: Age and situational differences in attention-following. Dev. Cogn. Neurosci. 63:101283.
Tomblin, J. B., Records, N. L., Buckwalter, P., Zhang, X., Smith, E., and O’Brien, M. (1997). Prevalence of specific language impairment in kindergarten children. J. Speech Lang. Hear. Res. 40, 1245–1260. doi: 10.1044/jslhr.4006.1245
Tomlinson, S. R. L., Gore, N., and McGill, P. (2018). Training individuals to implement applied behavior analytic procedures via telehealth: A systematic review of the literature. J. Behav. Educ. 27, 172–222. doi: 10.1007/s10864-018-9292-0
U.S. Census Bureau (2024). Language spoken at home in S1601. Washington, DC: U.S. Department of Commerce.
Verhagen, J., Kuiken, F., and Andringa, S. (2022). Family language patterns in bilingual families and relationships with children’s language outcomes. Appl. Psycholing. 43, 1109–1139. doi: 10.1017/S0142716422000297
Waite, M. C., Theodoros, D. G., Russell, T. G., and Cahill, L. M. (2010). Internet-based telehealth assessment of language using the CELF-4. Lang. Speech Hear. Services Schools 41, 445–458. doi: 10.1044/0161-1461(2009/08-0131)
Wang, D., Zheng, L., Lin, Y., Zhang, Y., and Sheng, L. (2022). Sentence repetition as a clinical marker for Mandarin-speaking preschoolers with developmental language disorder. J. Speech Lang. Hear. Res. 65, 1543–1560. doi: 10.1044/2021_JSLHR-21-00401
Werfel, K. L., Grey, B., Johnson, M., Brooks, M., Cooper, E., Reynolds, G., et al. (2021). Transitioning speech-language assessment to a virtual environment: Lessons learned from the ELLA Study. Lang. Speech Hear. Services Schools 52, 769–775. doi: 10.1044/2021_LSHSS-20-00149
Westerveld, M. F. (2014). Emergent literacy performance across two languages: Assessing four-year-old bilingual children. Intern. J. Biling. Educ. Biling. 17, 526–543. doi: 10.1080/13670050.2013.835302
Wong, A. M. Y., Leonard, L. B., Fletcher, P., and Stokes, S. F. (2004). Questions without movement. J. Speech Lang. Hear. Res. 47, 1440–1453. doi: 10.1044/1092-4388(2004/107)
Yeh, E. (2019). Parent Matters: The impact of parental involvement on non-native English speakers’ postsecondary education enrollment. School Commun. J. 29, 39–62.
Yu, B., Epstein, L., and Tisi, V. (2021). “A DisCrit-informed critique of the difference vs. disorder approach in speech-language pathology,” in Critical Perspectives on Social Justice in Speech-Language Pathology, ed. R. Horton (Hershey, PA: IGI).
Keywords: citizen science, bilingual children, Mandarin-English, language assessment, telehealth
Citation: Du Y, Tang Y, Fong KK, Liu Y, Wang D, Tu-Shea X, Liu Y, Zheng Q, Quan S, Xiong J and Sheng L (2026) A citizen science approach toward parents-administered remote language assessment for bilingual Mandarin-English children: an evaluation of in-person and telehealth settings. Front. Educ. 10:1696031. doi: 10.3389/feduc.2025.1696031
Received: 10 December 2025; Revised: 14 November 2025; Accepted: 22 December 2025;
Published: 20 January 2026.
Edited by:
Mingshuang Li, California State University, Northridge, United States
Reviewed by:
Abel Toledano-González, University of Castilla-La Mancha, Spain
Ruth Crutchfield, University of Texas–Pan American, United States
Copyright © 2026 Du, Tang, Fong, Liu, Wang, Tu-Shea, Liu, Zheng, Quan, Xiong and Sheng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Yao Du, eWFvZHVAdXNjLmVkdQ==
Ka Kei Fong