Validation of factor structure of the Neurodevelopmental Parent Report for Outcome Monitoring in Down syndrome: confirmatory factor analysis

Introduction The Neurodevelopmental Parent Report for Outcome Monitoring (ND-PROM), initially developed to monitor developmental and behavioral functions in children with autism spectrum disorder (ASD), assesses symptoms across a wide range of domains relevant in Down syndrome (DS). Methods Psychometric properties of the ND-PROM were assessed in 385 individuals with DS and 52 with a combined diagnosis of DS and ASD (DS+ASD), whose caregivers completed the ND-PROM questionnaire for a clinical visit in a specialized Down syndrome program at a tertiary pediatric hospital. Confirmatory factor analysis was conducted to evaluate the internal structure validity of the ND-PROM. Measurement invariance was assessed with a comparison group of 246 individuals with ASD, and latent mean differences among the DS-only, ASD-only, and combined DS+ASD groups were assessed. Results Findings support the existence of the 12 clinically-derived factors in the DS population: Expressive Language, Receptive Language, Nonverbal Communication, Social Emotional Understanding, Social Interaction, Independent Play, Adaptive skills/Toileting, Restricted and Repetitive Behaviors and Interests, Sensory Processes, Challenging Behaviors, Impulse/ADHD, and Mental Health. Differences in response patterns of development and behaviors were observed between those with DS and those with ASD, including those with DS having higher abilities in nonverbal communication, social emotional understanding, and social interaction, and fewer restricted and repetitive behaviors and interests, impulsivity or ADHD symptoms, and mental health concerns compared to those with ASD. Individuals in the DS+ASD group had more difficulties with expressive and receptive language, nonverbal and social communication, social interaction, independent play, and adaptive skills than either the DS-only group or the ASD-only group.
Discussion The ND-PROM has a desirable factor structure and is a valid and clinically useful tool that captures a range of distinct and independent areas of developmental and behavioral functioning in DS, for individuals with and without an ASD diagnosis.


Introduction
Down syndrome (DS), caused by the presence of all or part of an extra chromosome 21, occurs in about 1/700 births (1). DS is the most common genetic cause of intellectual disability (ID), though cognitive, language, and adaptive abilities vary greatly (2)(3)(4)(5)(6)(7). DS is associated with a high prevalence of co-occurring neurodevelopmental, behavioral, and mental health conditions that can greatly impact overall functioning (8). Co-occurring ASD is particularly prevalent, occurring in up to 39% (9)(10)(11), and is typically associated with lower cognitive and language abilities, and higher rates of behavior problems (12). There is also an increasingly recognized phenomenon of unexplained regression in DS, now known as Down Syndrome Regression Disorder, which is associated with loss of skills and onset of autistic-like behaviors or catatonia (13-15). Thus, clinical care for individuals with DS requires clinicians to monitor developmental and behavioral progress across multiple domains, and to identify and manage any unexpected changes in behavior or deviations in development.
In order to enable efficient, patient-centered care, a standardized approach is needed that tracks a wide range of potential symptoms and identifies those at a heightened risk for co-occurring neurodevelopmental or mental health conditions. Parent-reported measures have been used in DS, including the Aberrant Behavior Checklist (12), Child Behavior Checklist (16), Social Responsiveness Scale (17), the Social Communication Questionnaire (18), and the Screen for Child Anxiety Related Disorders (19). However, many tools have not been validated in people with DS or are used only in specific age groups or to target particular symptom clusters. Evaluation of the breadth of symptoms and concerns that children and adolescents with DS may experience therefore necessitates the use of multiple scales. Thus, there remains a great need for a single tool that can be used to obtain information on developmental and behavioral domains applicable to people with DS, across a wide range of ages, developmental stages, and functional levels, that can ultimately be used to monitor clinically-relevant symptoms and skills.
The Neurodevelopmental Parent Report for Outcome Monitoring (ND-PROM; previously published as the Autism Spectrum Disorder Parent Report for Outcome Monitoring, ASD-PROM) is a freely-available tool initially developed to clinically monitor caregiver report of developmental and behavioral functions in children with ASD (20). The ND-PROM contains 128 Likert-scale items that cover a wide range of developmental skills and behaviors relevant to children with neurodevelopmental disabilities. Prior work using the ND-PROM in individuals ages 2-20 years old with ASD demonstrated clinical utility, test-retest reliability, and good convergent validity with the Vineland-II (20). Further development of the ND-PROM involved delineation of 93 of the individual items into 12 clinical domains, which were subsequently supported using confirmatory factor analysis in the ASD population: Expressive Language, Receptive Language, Nonverbal Communication, Social Communication, Social Interaction, Independent Play, Adaptive Skills, Restricted and Repetitive Behaviors, Sensory Processes, Challenging/Aggressive Behaviors, Impulse/ADHD, and Mental Health (21). Items related to sleep, possible emergence of epilepsy, and drug or alcohol use from the original survey were not included in the factors. This suggested that the ND-PROM has good potential for independently assessing these key functional domains and identifying domains of relative strengths and weaknesses (e.g., restricted and repetitive behaviors and interests, or impulsivity or ADHD symptoms), which can identify targeted areas for intervention (21).
Given the applicability of these skill and behavioral areas in the clinical management and treatment of Down syndrome, the Boston Children's Hospital Down Syndrome Program began implementation of the ND-PROM as part of standard clinical care to gather developmental and behavioral information from parents and caregivers about their children with DS prior to clinic visits. While previous confirmatory factor analysis in a population of children with ASD confirmed 12 separate clinical domains for which the questions were best represented (21), it is not clear that the same skills, symptoms and behaviors track similarly in a population of children with Down syndrome, which is a different neurodevelopmental condition with different neurocognitive and behavioral profiles. Therefore, the current study aims to assess psychometric properties of the ND-PROM in Down syndrome.
This paper describes the internal structure validity of the ND-PROM in a large clinical population of children and adolescents with DS, some of whom also have ASD (DS+ASD). Using confirmatory factor analysis, we examine the factor structure and internal consistency of the measure in a new clinical population, test the measurement invariance of the ND-PROM, and assess latent mean differences among groups.

Participants and procedures
Participants included 385 individuals with DS and no diagnosis of ASD, along with 52 individuals with a dual diagnosis of DS and ASD (DS+ASD) who were seen in a specialized Down Syndrome Program in a tertiary pediatric hospital from 2017-2021. Patients were assessed and followed by specialty providers who regularly evaluate for ASD as part of their clinical practice using Diagnostic and Statistical Manual, 5th Edition (DSM-5) criteria (22). A comparison group consisting of 246 patients with a diagnosis of ASD and not DS was used to test measurement invariance, and to assess latent mean differences among DS, ASD, and DS+ASD groups. Details about this group, including diagnostic determination and data collection methodology, are available in Levin et al. (20). Caregivers completed the ND-PROM as part of standardized clinical procedures prior to clinical visits, to streamline clinical history taking and developmental and behavioral monitoring as a quality improvement initiative in the programs. Participants completed the ND-PROM using either a web-based system in which parents received automated prescheduled emails with secure links to complete the questionnaire online, or by completing a PDF or paper copy of the ND-PROM. The study was approved by the Boston Children's Hospital Institutional Review Board.

Data analyses
All analyses were conducted using Mplus 8.9 (23). The level of significance was set to 5% for a two-tailed test.

Internal consistency
To measure internal consistency, which indicates how reliably the ND-PROM addresses the constructs it is meant to measure, two coefficients were used: Cronbach's alpha and McDonald's omega (24)(25)(26). The latter is appropriate for non-tau-equivalent instruments, in which the items of a factor are not assumed to have equal item-latent variable relations.
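As a rough illustration of the two coefficients named above, the following Python sketch computes Cronbach's alpha from raw item scores and McDonald's omega from standardized factor loadings. It is not the procedure used in the study (which relied on Mplus): the function names and the toy data are hypothetical, and the omega formula shown assumes a single-factor model with standardized loadings and uncorrelated residuals.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_columns = list(zip(*rows))  # one tuple of scores per item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_score_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum_item_variances / total_score_variance)


def mcdonald_omega(std_loadings):
    """Omega for one factor, assuming standardized loadings and
    uncorrelated residuals (residual variance = 1 - loading**2)."""
    common = sum(std_loadings) ** 2
    residual = sum(1 - l ** 2 for l in std_loadings)
    return common / (common + residual)


# Hypothetical toy data: 4 respondents, 3 Likert-type items.
toy_scores = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5]]
print(round(cronbach_alpha(toy_scores), 3))

# Hypothetical unequal loadings, the non-tau-equivalent case omega is meant for.
print(round(mcdonald_omega([0.9, 0.6, 0.4]), 3))
```

Alpha weights every item equally, whereas omega lets each item contribute according to its own loading, which is why omega is usually preferred when loadings differ within a factor.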

Confirmatory factor analysis to examine internal factor structure
Internal factor structure was examined within the Confirmatory Factor Analysis (CFA) framework. A 12-factor CFA model for ordered categorical indicators was estimated using the Weighted Least Squares Mean and Variance Adjusted Estimator (WLSMV), which adjusts for the expected non-normality of ordinal indicators, assuming the presence of 12 distinct, correlated dimensions. This covariance modeling approach estimates discrepancies between the population variance-covariance matrix and the sample-based matrix. Model fit is assessed using both absolute and relative criteria of exact fit, close fit, or not so close fit (27). Exact fit is based on the assumption that there is no discrepancy between hypothesized and estimated variance-covariance matrices, Σ(θ). It represents an extremely strict set of evaluative criteria in the measurement of real-life phenomena, as minimal discrepancies would result in large chi-square values, particularly in the presence of excessive power. Thus, evaluation of the magnitude of the chi-square values should be the last resort in evaluating model fit. The same logic also applies to evaluating residuals with root mean square error of approximation (RMSEA) values equal to zero. RMSEA values between 0.05 and 0.08 are suggestive of acceptable but still "not exact" fit.
In addition to relying on the RMSEA, a series of descriptive fit indices (28), relative or incremental (e.g., comparative fit index (CFI); Tucker-Lewis index (TLI)), were employed based on the discrepancy function, adjusting for model complexity (i.e., number of estimated parameters and degrees of freedom). A large number of simulation studies examining their strengths and weaknesses (29) have favored the CFI and TLI as being relatively unaffected by sample size and model complexity (30); these indices were therefore used in the present study. Evaluative criteria of proper model fit usually involve values greater than 0.900 or, more recently, greater than 0.950 on the descriptive fit indices (31), RMSEA values between 0.05 and 0.08 (i.e., between close and not-so-close fit), and non-significant chi-square values (a strict omnibus criterion).
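To make the fit criteria above concrete, the sketch below computes RMSEA, CFI, and TLI from chi-square statistics using their common textbook definitions. This is an illustration only: the function names and the numeric inputs are hypothetical, and software such as Mplus applies estimator-specific scaling under WLSMV, so values from this sketch need not match program output.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Root mean square error of approximation (single-group, N - 1 form)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index, which penalizes model complexity through chi2/df."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# Hypothetical values chosen only to show the arithmetic.
print(rmsea(200.0, 100, 401))        # about 0.05
print(cfi(200.0, 100, 2120.0, 120))  # about 0.95
print(tli(200.0, 100, 2120.0, 120))  # about 0.94
```

Because RMSEA in this form depends only on the chi-square/df ratio and N, a ratio of 1.68 with N = 385 (an assumption about which sample size enters the formula) gives `rmsea(1.68 * df, df, 385)` of roughly 0.04 for any `df`.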

Measurement invariance
Among the available restrictive models to test for measurement invariance, typically three levels of restriction are utilized (32). These are termed configural, metric, and scalar, and contain the necessary restrictions to conduct tests of significance at the latent means level, presuming they are all satisfied. Consequently, across the two populations of DS and ASD, the three levels were examined, with the configural model testing the equivalence of the factor model's simple structure, the metric model imposing the equivalence of factor loadings linking the items to the latent construct, and last, the scalar or "strong invariance" model testing, in addition to the factor loadings, the equivalence of the intercept terms (or thresholds in categorical indicators). Chi-square difference tests for nested models are used to test whether the additional constraints are justified.
In cases where the classic protocol of strong measurement invariance was not met, we deferred to the alignment procedure developed by Muthén & Asparouhov (33). This methodology utilizes the configural model of no invariance and identifies the largest number of invariant parameters by allowing factor means and variances to vary freely across groups. The model utilizes a simplicity function to identify as many approximately invariant parameters as possible with few non-invariant parameters, thus attaining the goal of validly comparing latent means and variances between groups. To conclude the presence of measurement invariance, the proportion of parameters that differ significantly between groups needs to be minimal, or less than 25%. If measurement invariance is achieved, a latent means comparison can be conducted between the DS, ASD, and combined DS+ASD groups.

Effect size indicator
When latent means were contrasted, the latent Cohen's d effect size statistic was used, which presents results in the standard deviation metric. Conventions for effect size are small (SD=0.2), medium (SD=0.5), and large (SD=0.8) (34). Meaningful differences are considered to be those in excess of 0.5 standard deviations.
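The conventions above can be expressed as a small Python sketch. The function names and the "negligible" label for values below 0.2 are assumptions introduced for illustration; only the 0.2/0.5/0.8 cut-offs and the 0.5 SD threshold for meaningful differences come from the text.

```python
def classify_effect(d):
    """Label a standardized latent mean difference using the 0.2/0.5/0.8 cut-offs."""
    size = abs(d)  # the sign only indicates which group scored higher
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # label assumed here; not named in the conventions


def is_meaningful(d, threshold=0.5):
    """True when the difference exceeds the 0.5 SD threshold used in this study."""
    return abs(d) > threshold


# Hypothetical standardized differences, for illustration only.
for d in (1.10, -0.45, 0.10):
    print(d, classify_effect(d), is_meaningful(d))
```

Under this reading, a difference can be statistically significant yet fall below the 0.5 SD threshold, in which case it would be reported but not treated as meaningful.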

Construct validity and internal consistency reliability of ND-PROM in DS
Table 2 contains information about Cronbach's alpha and McDonald's omega across domains and groups. Estimates of alpha ranged between 0.578 and 0.925 in the DS group, between 0.632 and 0.932 in the ASD group, and between 0.538 and 0.924 in the DS+ASD group. The CFA model posited 12 latent factors as with the original ND-PROM in the ASD population (Levin 2022 vs SDBP Abstract). Global fit as judged by the chi-square test was significant, likely reflecting excessive levels of power. Use of descriptive fit indices and residuals pointed to acceptable model fit [CFI=0.90; TLI=0.90; RMSEA=0.04; chi-square/df=1.68; SRMR=0.10], indicating appropriateness of using this same 12-factor model in the DS cohort. These results followed Bartlett's correction for sample size (see Supplementary Materials for the R function developed for that purpose).

Measurement invariance across DS, ASD, and DS+ASD groups
Table 3 displays results from the "exact" measurement invariance protocol when testing the 12-factor simple structure across the three groups. As expected, the chi-square statistical tests were significant for all three models (configural, metric, and scalar) with increased numbers of constraints. However, of interest were the comparisons between (a) configural and metric models, and (b) metric and scalar models. As shown in Table 3, neither the equivalence of factor loadings nor the equivalence of thresholds was supported when contrasting the DS and ASD samples. Consequently, the alignment procedure outlined above was implemented to target partial measurement invariance so that tests of latent means would be possible. Following alignment (33), results indicated that all but two of the factor loadings and all but eight of the intercepts were equivalent between the DS, ASD, and DS+ASD groups. Thus, the model was able to converge on a simplicity function where estimates of factor loadings and intercepts were largely invariant between groups. Table 4 displays factor loadings and intercepts between groups and the decision of equivalence based on alignment. The proportion of significantly non-invariant parameters was 5.4%, well below the 25% guideline and close to the proportion of significant tests expected by chance at the nominal level of significance.
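The 5.4% figure can be checked with simple arithmetic. The sketch below assumes, as a reading of the text rather than a detail stated by the authors, that the denominator is one loading plus one intercept for each of the 93 factored items (186 parameters); the function name is hypothetical.

```python
def noninvariance_share(n_items, noninvariant_loadings, noninvariant_intercepts):
    """Share of non-invariant parameters, assuming one loading and one
    intercept per item (an assumption about how the total was counted)."""
    total_parameters = 2 * n_items
    flagged = noninvariant_loadings + noninvariant_intercepts
    return flagged / total_parameters


share = noninvariance_share(93, 2, 8)
print(f"{share:.1%}")  # 5.4%
print(share < 0.25)    # True: below the 25% guideline
```

Under that assumption, 10 of 186 parameters are flagged, which reproduces the reported 5.4% and sits close to the 5% of tests expected to be significant by chance.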

Latent mean differences across DS, ASD, and DS+ASD groups
Table 5 and Figure 1 display latent means and between-group comparisons using both inferential statistical criteria and effect size indicators. As shown in Table 5, there were significant differences between the DS and ASD groups on nonverbal communication and social interaction, with the DS group having significantly higher means (nonverbal communication difference=1.021 SD; social interaction difference=0.933 SD). Higher means in these areas are indicative of higher skill levels. Similarly, significantly lower scores in the ASD group were observed in RRBs (-1.078 SD), sensory processes (-0.881 SD), challenging behaviors (-0.614 SD), impulse/ADHD (-1.376 SD), and mental health (-1.050 SD). Last, there were significant differences between the DS and ASD groups in adaptive skills/toileting (-0.458 SD), with the DS group having significantly lower mean levels compared to the ASD group.
When contrasting the DS to the DS+ASD groups, results indicated significantly higher functioning for the DS group in expressive language (1.558 SD), receptive language (1.262 SD), nonverbal communication (2.216 SD), social communication (1.489 SD), social interaction (2.079 SD), independent play (1.390 SD), and adaptive skills (1.103 SD). Significantly lower functioning of the combined group was observed in restricted and repetitive behaviors and interests (-1.064 SD), sensory processes (-1.672 SD), challenging behaviors (-0.699 SD), and impulse/ADHD (-0.841 SD). There were no differences between the DS and combined DS+ASD groups for the mental health domain (-0.128 SD).
Comparisons between the ASD and the DS+ASD groups indicated that the ASD group was significantly higher in expressive language (1.579 SD), receptive language (1.261 SD), nonverbal communication (0.904 SD), social communication (1.032 SD), social interaction (0.909 SD), independent play (1.275 SD), and adaptive skills (1.593 SD), all areas where higher means indicate higher skill or functioning levels. However, the ASD group also had a higher mean for impulse/ADHD (0.535 SD). Significantly lower functioning was observed for the combined group on sensory processes (-0.620 SD).

Discussion
The present study evaluated the ND-PROM as a clinical monitoring tool to assess skills and behaviors in the DS population. The study evaluated internal structure validity of the ND-PROM with measurement invariance across DS, ASD, and DS+ASD samples. Our findings support the existence of the 12 clinically derived factors in the DS population: Expressive Language, Receptive Language, Nonverbal Communication, Social Emotional Understanding, Social Interaction, Independent Play, Adaptive skills/Toileting, Restricted and Repetitive Behaviors and Interests, Sensory Processes, Challenging Behaviors, Impulse/ADHD, and Mental Health. This 12-factor model was previously confirmed through confirmatory factor analysis in the ASD population, and here we have shown that the ND-PROM tool works equally well and can capture the range of distinct and independent areas of developmental and behavioral functioning present in those with DS. Assessment of measurement invariance across DS, ASD, and DS+ASD groups showed that the ND-PROM was able to psychometrically distinguish between these three groups, indicating the specificity of the tool in elucidating different patterns of development and behaviors between children with DS, ASD, and DS+ASD.
Assessment of latent mean differences between DS and ASD revealed differences in patterns of development, skills, and behaviors in those with DS compared to those with ASD. Expressive and receptive language, independent play skills, sensory processes, and challenging behaviors were similar in the sample populations of ASD and DS in the study. However, those with DS were found to have higher abilities in nonverbal communication, social emotional understanding, and social interaction, and fewer reported restricted and repetitive behaviors and interests, impulsivity or ADHD symptoms, and mental health concerns compared to children with ASD in this study. These findings are largely in line with prior reports describing profiles for children with DS. Children with DS have been described to have relative strengths in nonverbal communication and social skills (2,7,35,36), whereas these are defining core areas of impairment for children with ASD (22). Adaptive skills have previously been reported to be higher for children with DS compared to those with ASD (37,38); however, in this study, which focused on toileting skills, this pattern was not found. Additionally, while challenging behaviors and mental health concerns, as well as restricted and repetitive behaviors and impulse control/ADHD symptoms, are commonly reported in DS, they may occur less than in other populations of children with ID (4,39). Compared to the DS-only group, the DS+ASD group shows areas of concern in language and communication domains, as well as in social interaction, independent play, and adaptive skills. The combined group also had more issues with restricted and repetitive behaviors and interests, sensory processing, and challenging behaviors. This is consistent with previous research showing vulnerabilities in children and adolescents with co-occurring DS and ASD (12,40,41). When compared to the ASD-only group, the DS+ASD group had decreased functioning in communication and language, including social communication and social interaction, as well as adaptive skills. This differs from previous research which showed fewer issues with social interaction in a combined DS+ASD group compared to an ASD-only group (42). Interestingly, the ASD-only group had higher levels of impulse/ADHD symptomatology reported.
We recognize several limitations of the current work. The study population included primarily White, non-Hispanic respondents, and thus may not be representative. However, the ND-PROM is now available in additional languages, so future analyses can include a more diverse population. Though parents previously reported that use of the ND-PROM had a positive impact on their child's care (20), the length of the ND-PROM may be a limitation for its widespread application, especially in primary care clinics or other non-tertiary care environments. In this study, the DS and ASD cohorts were not matched by cognitive level; thus, more pronounced group differences may have been observed if the DS group were directly compared to a group with ASD and Intellectual Disability. Additionally, this study did not assess the stability of responses across repeated tests or the sensitivity of the measure to change over time, and therefore it is not clear how clinically meaningful changes in function might be represented on the ND-PROM. Additionally, though convergent validity was previously assessed in the ASD population (20), it was not repeated in the DS population as a part of this study. Finally, the ND-PROM responses were collected as part of clinical care visits to a specialty clinic within a tertiary pediatric care center. Therefore, patients included in both the DS and ASD samples might be more severely impacted than their peers in the general population, hence their desire to seek out specialized care. However, this scale was designed to facilitate clinical visits in the medical setting and this study has shown that it is usable within that setting. A last cautionary note relates to the relatively low internal consistency reliability estimates for some of the ND-PROM scales. In these instances we suggest caution about using the scores from individual domains for diagnostic and classification purposes. Instead, future studies may consider revising the content of some of these scales so that levels of internal consistency reliability will increase.
Future directions will explore the use of the ND-PROM longitudinally to assess the ability to capture change over time, which will have implications when assessing response to intervention. Additionally, as the ND-PROM is now available in additional languages, future studies will include a more diverse population. Subsequent studies may also include an examination of a Computerized Adaptive Testing (CAT) framework using the computerized version of the ND-PROM, such that larger item pools can be created and an adaptive algorithm can be implemented to assess competency in each domain through defining a minimum tolerated error of measurement. CAT methodologies engage approximately 10-15% of the total number of items; thus, the gains in time and efficiency, with no sacrifice to the validity of the measure, would enable increased use and scalability of the ND-PROM.
The ND-PROM is a clinically useful tool for assessing children and adolescents with DS, ASD, and DS+ASD, which captures a range of distinct and independent areas of developmental and behavioral functioning between and among these three groups.

FIGURE 1 Latent mean differences between DS, ASD and DS+ASD groups on ND-PROM factors. Mean differences for each of the 12 factors of the Neurodevelopmental Parent Report for Outcome Monitoring (ND-PROM) survey are shown for three groups: Down syndrome (DS; n=385), autism spectrum disorder (ASD; n=246), and a dual diagnosis of DS and ASD (DS+ASD; n=53).

TABLE 1
Demographic characteristics of study participants.

Sociodemographic and responder education level for participants. Median and interquartile range (IQR) are shown for age, primary communication type, and maximum length of communicative units; all other factors are presented with n and percent reporting. Race, ethnicity, and responder education were taken from the medical record and were not available for all participants. *"Other" includes people who self-identified as Multiracial.

TABLE 2
Internal consistency reliability of ND-PROM constructs across DS and ASD groups.
Internal consistency reliability measures for Down syndrome (DS) and autism spectrum disorder (ASD) groups for each of the 12 domains of the ND-PROM. N.E., not estimable; ADHD, attention deficit/hyperactivity disorder.

TABLE 4
Standardized factor loadings from a confirmatory factor analysis model for the 12 latent variables of the ND-PROM across DS, ASD and DS+ASD groups.

TABLE 3
Measurement invariance across DS, ASD, and DS+ASD groups using an exact-fit protocol. Exact fit protocol for Down syndrome (DS), autism spectrum disorder (ASD), and dual diagnosis DS and ASD (DS+ASD) groups. *** denotes significance at p<0.001. Npar, number of freely estimated parameters; D.F., degrees of freedom. a Indicates difference in degrees of freedom across competing models.

TABLE 5
Latent mean differences between DS, ASD, and DS+ASD groups using standardized point estimates.