ORIGINAL RESEARCH article

Front. Psychol., 03 June 2021
Sec. Quantitative Psychology and Measurement

The Decomposition of Between and Within Effects in Contextual Models

Siwen Guo1*, Richard T. Houang2 and William H. Schmidt2
  • 1Department of Psychology, Renmin University of China, Beijing, China
  • 2Center for the Study of Curriculum Policy, Department of Counseling, Educational Psychology & Special Education, Michigan State University, East Lansing, MI, United States

In contextual studies, group compositions are often extracted from individual data in the sample, in order to estimate the group compositional effects [e.g., school socioeconomic status (SES) effect] controlling for interindividual differences in multilevel models. As the same variable is used at both group level and individual level, an appropriate decomposition of between and within effects is a key to providing a clearer picture of these organizational and individual processes. The current study developed a new approach with within-group finite population correction (fpc). Its performances were compared with the manifest and latent aggregation approaches in the decomposition of between and within effects. Under a moderate within-group sampling ratio, the between effect estimates from the new approach had a lesser degree of bias and higher observed coverage rates compared with those from the manifest and latent aggregation approaches. A real data application was also used to illustrate the three analysis approaches.

Introduction

In contextual models, the group compositional effects on individual development or outcomes and their underlying organizational processes have attracted considerable attention (Mayer et al., 2014). Individual-level constructs and their aggregated group compositions often show different effects on individual outcomes, which reflect different theoretical meanings (Lau and Nie, 2008; Marsh et al., 2012). The big-fish-little-pond effect is one example: student academic self-concept was found to be positively associated with individual achievement but negatively associated with school average achievement. The school-level effect of achievement on student academic self-concept reflected the way schools were structured and their effects on individuals (Marsh et al., 2009). Group compositional effects, or the effects of aggregated individual characteristics such as socioeconomic status (SES), gender, and ethnicity, have drawn attention in contextual studies. Research on student and school SES effects is another example, which examines the between-group effect of group compositions and the within-group effect of individual characteristics (Raudenbush and Bryk, 2002; Lüdtke et al., 2008). To explore the between-group and within-group effects, a two-level random intercept model (referred to as the MLM model for simplicity) is often used.

The models that include the same variable at both individual- and group levels are called contextual models or compositional models (Lüdtke et al., 2008). In these models, the central question is whether the aggregated group compositions have any effect on individual outcomes controlling for interindividual differences (Marsh et al., 2009). If individuals are randomly selected from the entire population with error-free measurements of their characteristics as well as their group compositions, a single level model would work to separate and describe the effects of group compositions and individual characteristics.

Challenges arise, however, in a two-stage cluster sampling design, which is often used in data collection in these contextual studies, as individuals are naturally nested in groups. Meanwhile, the group compositions are usually unknown and need to be extracted from individual data in the sample, which generally brings sampling errors into the aggregated group compositions. As the same variable is used at both group level and individual level, an appropriate decomposition of the between-group and within-group effects¹ is a key to providing a clearer picture of these organizational and individual processes (Zhang et al., 2009). The current study aims at assessing the performances of different analysis approaches in the decomposition of between and within effects in contextual models.

Previous contextual studies have investigated not only the between effects of group compositions and the within effects of individual characteristics on individual outcomes but also their roles as mediators and their indirect effects through other variables. If mediation occurs not only at the individual level but also at the group level, both the within and between indirect effects should be identified in multilevel models. For a 2-1-1 mediation model (i.e., the treatment is measured at the group level, and the mediator and outcome are measured at the individual level; referred to as the 2-1-1 mediation model in the current study) in a cluster randomized design, Pituch and Stapleton (2012) showed that the power to detect the mediation effects was reduced when the mediation was unnecessarily restricted to the group level, and recommended testing the cross-level indirect effects in empirical studies. Tofighi and Thoemmes (2014) discussed how to specify, estimate, and interpret the results of single-level and multilevel mediation analyses for different research questions in the 2-1-1 settings. Talloen et al. (2016) further discussed the assumptions under which the within and between indirect effects could be identified and proposed a sensitivity analysis to assess the potential impact of unmeasured confounders on the within and between indirect effects.

A study on the school-based tobacco prevention programs, which aimed at lowering youth initiation of smoking via norms, is an empirical example, exploring the mediating effect of a group-level aggregated construct on the relationship between group-level treatment and individual outcomes (Pituch et al., 2006). The group norm is aggregated from individual norms, and its mediating effect is the research focus. This mediating effect is usually modeled in a 2-1-1 mediation model. Schmidt et al. (2015) examined the between and within indirect effects of SES on student mathematics achievement through the opportunity to learn (OTL). This example explores the indirect effects of aggregated group compositions and individual characteristics. The between and within indirect effects are usually modeled in a 1-1-1 mediation model (i.e., the predictor, mediator, and outcome are measured at an individual level, referred to as the 1-1-1 mediation model in the current study).

Following the research trends adopted in other contextual studies, this study discusses the decomposition of between and within effects for the MLM, 2-1-1 mediation and 1-1-1 mediation models.

Manifest Aggregation Approach

Two modeling approaches, the manifest and latent aggregation approaches, were proposed in previous studies to decompose the between and within effects. Traditionally, the between and within effects are assessed as the effects of the manifest group means and individual deviations from the group means (i.e., group mean centering) in multilevel models (Raudenbush and Bryk, 2002). This approach was referred to as the manifest aggregation approach by Marsh et al. (2009) and Lüdtke et al. (2011), which uses manifest aggregation to construct group means. Following the manifest aggregation approach, the MLM (Raudenbush and Bryk, 2002), 2-1-1 mediation, and 1-1-1 mediation (Zhang et al., 2009) models are shown in Table 1.


Table 1. MLM, 2-1-1 mediation, and 1-1-1 mediation¹ models in the manifest and latent aggregation approaches.

By decomposing Xij into the uncorrelated group mean X̄.j and the individual deviation from the group mean (Xij − X̄.j), the variance of Xij seems to be separated into the between-group variance in X̄.j and the within-group variance in (Xij − X̄.j), and the between and within effects seem to be set apart. However, the manifest group mean X̄.j is generally not a perfect or error-free measurement of the group composition². Specifically, with a random sample from group j, the sample mean X̄.j is not the “true” population mean of the jth group; it contains sampling error (Lüdtke et al., 2008, 2011). Lüdtke et al. (2008) showed that the expectation of the between-effect estimator of X on Y in the MLM model was

$$E(\hat{\beta}_{xb}) = \frac{\beta_{xb}\,ICC_X + \beta_{xw}\,\frac{1}{n}(1 - ICC_X)}{ICC_X + \frac{1}{n}(1 - ICC_X)}, \tag{1}$$

where $\hat{\beta}_{xb}$ is the estimator of the between effect of X on Y in the manifest aggregation approach, $ICC_X$ is the intraclass correlation coefficient (ICC) of X, and n is the common group size. Preacher et al. (2010) showed that the expectation of the group-level indirect effect estimator of X on Y via M in the 2-1-1 mediation model was

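$$E(\hat{\alpha}_b\hat{\beta}_{mb}) = \alpha_b\,\frac{\beta_{mb}\left(\tau_M^2 - \frac{\tau_{XM}^2}{\tau_X^2}\right) + \beta_{mw}\,\frac{1}{n}\sigma_M^2}{\left(\tau_M^2 - \frac{\tau_{XM}^2}{\tau_X^2}\right) + \frac{1}{n}\sigma_M^2}, \tag{2}$$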

where $\hat{\alpha}_b$ and $\hat{\beta}_{mb}$ are the estimators of the between effects of X on M and of M on Y in the manifest aggregation approach; $\tau_M^2$ and $\sigma_M^2$ are the group-level and individual-level variances of M; $\tau_X^2$ is the group-level variance of X; and $\tau_{XM}$ is the group-level covariance between X and M. Following the logic and assumptions made by Lüdtke et al. (2008) and Preacher et al. (2010), in the 1-1-1 mediation model, the expectation of the group-level indirect effect estimator is

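$$E(\hat{\alpha}_b\hat{\beta}_{mb}) = \left[\frac{\alpha_b\tau_X^2 + \alpha_w\frac{1}{n}\sigma_X^2}{\tau_X^2 + \frac{1}{n}\sigma_X^2}\right]\left[\frac{\tau_{MY} + \frac{1}{n}\sigma_{MY} - \frac{\left(\tau_{XY} + \frac{1}{n}\sigma_{XY}\right)\left(\tau_{XM} + \frac{1}{n}\sigma_{XM}\right)}{\tau_X^2 + \frac{1}{n}\sigma_X^2}}{\tau_M^2 + \frac{1}{n}\sigma_M^2 - \frac{\left(\tau_{XM} + \frac{1}{n}\sigma_{XM}\right)^2}{\tau_X^2 + \frac{1}{n}\sigma_X^2}}\right], \tag{3}$$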

where $\sigma_X^2$ is the individual-level variance of X, $\tau_{MY}$ and $\sigma_{MY}$ are the group-level and individual-level covariances between M and Y, $\tau_{XY}$ and $\sigma_{XY}$ are the group-level and individual-level covariances between X and Y, and $\sigma_{XM}$ is the individual-level covariance between X and M.

In the three models, the between-effect estimators are biased unless the within effects are the same as the between effects for all paths (i.e., βxw = βxb, βmw = βmb, and αw = αb), the individual-level variances and covariances of X and M are equal to zero, or the group size n is infinite. If the within effect is enough to answer the research question, the manifest aggregation approach will provide unbiased estimators. When the between effect is of theoretical interest, as it is in the current study, the manifest aggregation approach may not be a good choice. The bias due to sampling error not only affects the estimation of the between effect of the decomposed predictor but also affects the estimation of other group-level effects (Lüdtke et al., 2011; Mayer et al., 2014). The sampling error in the aggregated group means and the resulting bias in the between-effect estimators are the main criticisms of the manifest aggregation approach raised in the literature on the latent aggregation approach.
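To make the size of this bias concrete, the expectation in Equation (1) can be evaluated directly. The following Python sketch is illustrative only: the function name `expected_manifest_between` is hypothetical, and the parameter values (a within effect of 0.2, a between effect ten times as large, and an ICC of 0.05) are borrowed from the simulation design described later purely as an example.

```python
def expected_manifest_between(beta_b, beta_w, icc_x, n):
    """Expected manifest between-effect estimator for the MLM model, Equation (1).

    beta_b, beta_w: true between and within effects of X on Y
    icc_x: intraclass correlation of X (Var(X) is taken to be 1)
    n: common group size in the sample
    """
    sampling_error = (1.0 - icc_x) / n  # within variance carried into the group mean
    return (beta_b * icc_x + beta_w * sampling_error) / (icc_x + sampling_error)


# Illustrative values (assumption): beta_w = 0.2, beta_b = 10 * beta_w, ICC_X = 0.05
true_between = 2.0
for n in (5, 20, 100, 10_000):
    expected = expected_manifest_between(true_between, 0.2, 0.05, n)
    relative_bias = (expected - true_between) / true_between
    print(f"n = {n:>6}: expected estimate = {expected:.3f}, relative bias = {relative_bias:.3f}")
```

The expected estimate is a weighted average of the between and within effects, so it is pulled toward the within effect when groups are small and approaches the between effect only as n grows or when the two effects coincide.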

Latent Aggregation Approach

To correct for the bias in the between-effect estimator due to sampling error, there is a new trend to decompose the between and within effects in a latent aggregation approach (Lüdtke et al., 2008, 2011; Marsh et al., 2009, 2012; Preacher et al., 2010, 2016; Preacher, 2011; Mayer et al., 2014; Ryu, 2015b). This approach models the group-level and individual-level variance–covariance matrices explicitly with multilevel structural equation modeling (MSEM; Muthén, 1990). The group-level and individual-level latent (or random) components are directly modeled to examine the between and within effects. In previous studies, the latent aggregation approach was discussed for the MLM (Lüdtke et al., 2008, 2011), 2-1-1 mediation (Preacher et al., 2010, 2011), and 1-1-1 mediation models (Preacher et al., 2010).

When (1) the model is correctly specified, (2) error terms follow a multivariate normal distribution³ with means of zero and constant variances, (3) error terms are uncorrelated with each other as well as the group-level and individual-level latent components of the predictors, (4) group-level latent components are uncorrelated with individual-level latent components, and (5) each group has an infinite population, the latent aggregation approach provides approximately unbiased within and between effects for the MLM (Lüdtke et al., 2008, 2011), 2-1-1 mediation (Preacher et al., 2010, 2011), and 1-1-1 mediation models (Preacher et al., 2010).

The latent aggregation approach outperforms the manifest aggregation approach in the estimation of between effects with sampling error correction under the appropriate assumptions. It has been shown that the between effect estimators in the latent aggregation approach had smaller biases and root mean square error (RMSE) than those in the manifest aggregation approach, when the assumptions in the latent aggregation approach were satisfied (Lüdtke et al., 2008, 2011; Preacher et al., 2011).

However, the latent aggregation approach did not always perform better than the manifest aggregation approach. Intuitively, when the sampling ratio within each group is 100%, the manifest group mean is the population group mean and free from sampling error. The manifest aggregation approach should provide approximately unbiased between effects under this condition. Lüdtke et al. (2008) found that when the within-group sampling ratio approached 100%, the manifest aggregation approach outperformed the latent aggregation approach in the estimation of between effects in the MLM model. Lüdtke et al. (2011) indicated that when only limited information on the group-level construct was available (e.g., a low ICC of predictors, a small number of groups, and a small group size), the manifest aggregation approach could outperform the latent aggregation approach in the RMSE of between-effect estimators in the MLM model. In the 2-1-1 mediation model, McNeish (2017) found that the manifest aggregation approach outperformed the latent aggregation approach with a small number of groups. When the group-level variance components were close to zero, to deal with the unstable between-effect estimators in the latent aggregation approach, Zitzmann et al. (2015) introduced a Bayesian estimation method with a small amount of information in the prior distribution. It was found that the Bayesian estimation provided more accurate between effect estimates in the MLM model than the maximum likelihood (ML) estimation for the latent aggregation approach under the problematic conditions with small group sizes and small ICCs of the predictors. For the doubly latent approach with multiple level-1 indicators, the Bayesian estimation had fewer problems and provided more accurate between-effect estimates than the ML estimation under the conditions with small group sizes and low ICCs (Zitzmann et al., 2016). 
With challengingly small groups and low ICCs of the predictors, consistent with previous studies, the doubly manifest approach provided more accurate between-effect estimates than the doubly latent approach no matter whether the ML estimation or the Bayesian estimation was used. Under these conditions, the between effect estimates from the doubly latent approach using the Bayesian estimation were between those from the doubly manifest approach and the doubly latent approach using the ML estimation (Zitzmann et al., 2016).

In these previous studies, the latent aggregation approach sometimes performed better in the estimation of between effects, and the manifest aggregation approach sometimes did. The contradiction comes from the different assumptions made by the two approaches: in the manifest aggregation approach, the entire group is assumed to be sampled (i.e., the within-group sampling ratio is assumed to be 100%), while in the latent aggregation approach, the population in each group is assumed to be infinite⁴ (i.e., the within-group sampling ratio is assumed to be close to 0). In practice, entire groups are rarely sampled, sampling with replacement is rarely conducted, and the number of individuals per group is not infinite. When the groups are naturally of small or moderate size, like classrooms and schools, the number of individuals per group is not even “large enough,” let alone “infinite.”

This problem was mentioned in some previous studies. Lüdtke et al. (2008) and Preacher et al. (2010) limited their discussion of the latent aggregation approach to the situations where the within-group sampling ratio was low (e.g., lower than 5%). In the cases where the within-group sampling ratio approached 100%, Lüdtke et al. (2008, 2011), Marsh et al. (2009, 2012), and Preacher et al. (2010) suggested that the manifest aggregation approach might be a natural choice. When the within-group sampling ratio was moderate, Lüdtke et al. (2008) and Marsh et al. (2009) suggested that the “best” between-effect estimate was between the estimates in the manifest and latent aggregation approaches.

In addition to the manifest and latent aggregation approaches, Shin and Raudenbush (2010) proposed an alternative approach to estimate the contextual effects with the latent “true” cluster means of covariates for an MLM model and a two-level random slope model. As this alternative approach used an idea similar to that of the latent aggregation approach and assumed an infinite population within each group, it was not discussed further in the current study.

Within-Group Finite Population Selection and the New Approach

With a probability sample, it is possible to quantify the sampling error with a consideration of the within-group sampling ratio and correct it in the decomposition of between and within effects. When the sampling ratio exceeds 5% of the population, the selection cannot be treated as if it comes from an infinite population (Cochran, 1977), and a correction is needed. The correction made for the finite population selection is called the finite population correction (fpc). It is calculated as (1 − n/N), where n is the sample size and N is the population size. Intuitively, a larger sampling ratio provides more information and less uncertainty about the population mean, so the variance of the mean estimator should be smaller than it is with a smaller sampling ratio (Lohr, 2009). For example, with simple random sampling (SRS) of n individuals from a population of N, if the population variance of Y is S², the variance of the sample mean Ȳ is (1 − n/N)S²/n, which is corrected with the fpc.
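The effect of the fpc on the variance of a sample mean can be illustrated with a few lines of Python. This is a minimal sketch of the SRS formula quoted above; the function names and the example values (a group of 100 individuals with unit variance) are assumptions made for illustration.

```python
def finite_population_correction(n, N):
    """fpc = 1 - n/N for a sample of n units drawn from a population of N."""
    return 1.0 - n / N


def variance_of_sample_mean(s2, n, N):
    """Variance of the SRS sample mean with the fpc applied: (1 - n/N) * S^2 / n."""
    return finite_population_correction(n, N) * s2 / n


# Illustrative values (assumption): a group of N = 100 with S^2 = 1
N = 100
for ratio in (0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    n = round(N * ratio)
    with_fpc = variance_of_sample_mean(1.0, n, N)
    without_fpc = 1.0 / n  # infinite-population formula S^2 / n
    print(f"ratio = {ratio:.1f}: with fpc = {with_fpc:.5f}, without fpc = {without_fpc:.5f}")
```

At a sampling ratio of 100% the corrected variance is zero, whereas the infinite-population formula still reports S²/n; the gap between the two widens as the sampling ratio increases.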

For clustered data collected following a complex sampling scheme, traditional hierarchical linear modeling (HLM) or multilevel linear modeling techniques often assume an infinite population of both level-1 and level-2 units. This assumption might not hold under most conditions. Some research has explored the incorporation of the fpc into multilevel techniques when level-2 units come from a finite population. For example, Lai et al. (2018) proposed a method to obtain the finite-population-adjusted standard errors of level-1 and level-2 predictors in two-level models. Their simulation results showed that the bias in the unadjusted standard errors was substantial when the level-2 sample size exceeded 10% of the population size, and increased with a larger ICC, a larger number of groups, and a larger average group size. The proposed fpc-adjusted method provided acceptable standard errors when the number of groups was at least 30 and the average group size was at least 10. Svoboda (2020) evaluated an fpc method in two-level models with both continuous and binary predictors. The fpc method generally provided acceptable levels of relative bias in standard error estimates for the continuous predictors. For an unbalanced level-2 binary predictor, however, the fpc-adjusted standard errors were only acceptable when at least 60 groups were sampled.

In contrast to these studies, which applied the fpc to a finite level-2 population in general multilevel modeling, the decomposition of between and within effects involves a within-group population that is more likely to be finite, like the population in a school or an organization. As the group-level and individual-level constructs are extracted from individual data, a within-group fpc is needed to correct for the sampling errors in the estimation of variances and covariances of the aggregated group constructs when decomposing the between and within effects. However, neither the manifest nor the latent aggregation approach incorporates the within-group fpc into its sampling error correction. The previous fpc approaches in general multilevel modeling did not deal with the issue of within-group finite population selection either.

The MLM, 2-1-1 mediation, and 1-1-1 mediation models can be formulated as MSEM, which are usually estimated by the ML estimation method in the manifest and latent aggregation approaches (Longford and Muthén, 1992; Muthén, 1994, 1997). To incorporate the within-group fpc in the estimation of MSEM for the decomposition of between and within effects in contextual studies, Muthén’s ML-based estimator (MUML) might provide some ideas. As the ML estimation is computationally intensive for MSEM, Muthén (1989, 1990) suggested an ad hoc estimator, which treated the within and between data in a multiple-group fashion with a fitting function of

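$$F = J\left\{\log\left|\Sigma_W + n_0\Sigma_B\right| + \mathrm{tr}\left(\left[\Sigma_W + n_0\Sigma_B\right]^{-1}\left[S_B + n_0(\bar{Z} - \mu)(\bar{Z} - \mu)'\right]\right)\right\} + \left(\sum_{j}^{J} n_j - J\right)\left\{\log\left|\Sigma_W\right| + \mathrm{tr}\left(\Sigma_W^{-1}S_{PW}\right)\right\}. \tag{4}$$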

The common cluster size n0 is

$$n_0 = \frac{\left(\sum_{j}^{J} n_j\right)^2 - \sum_{j}^{J} n_j^2}{\left(\sum_{j}^{J} n_j\right)(J - 1)}, \tag{5}$$

and the pooled within variance–covariance matrix and variance–covariance matrix of group means are

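$$S_{PW} = \frac{\sum_{j}^{J}\sum_{i}^{n_j}(Z_{ij} - \bar{Z}_j)(Z_{ij} - \bar{Z}_j)'}{\sum_{j}^{J}(n_j - 1)}, \tag{6}$$

$$S_B = \frac{\sum_{j}^{J}\sum_{i}^{n_j}(\bar{Z}_j - \bar{Z})(\bar{Z}_j - \bar{Z})'}{J - 1}, \tag{7}$$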

where ΣW and ΣB are the within and between variance–covariance matrices in the model, nj is the number of individuals in the jth group, J is the total number of groups in the sample, Zij is a k × 1 vector of k variables used in the model observed from individual i in group j, Z¯j is a k × 1 vector of the jth group means of k variables used in the model, Z¯ is a k × 1 vector of the grand means of k variables used in the model, and μ is a k × 1 parameter vector of k variables’ means.
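As a minimal sketch of Equations (5) to (7), the following Python code computes the common cluster size n0 and the sample matrices SPW and SB from grouped data. The function name `muml_statistics`, the helper functions, and the list-of-lists data layout are illustrative assumptions rather than part of the original study; the grand mean is assumed here to be taken over all sampled individuals.

```python
def outer(u, v):
    """Outer product u v' of two vectors given as lists."""
    return [[a * b for b in v] for a in u]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_scale(A, c):
    return [[c * a for a in row] for row in A]


def col_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def muml_statistics(groups):
    """groups: list of groups; each group is a list of k-length observation vectors Z_ij.

    Returns (n0, S_PW, S_B) following Equations (5), (6), and (7).
    """
    J = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    k = len(groups[0][0])
    grand_mean = col_means([z for g in groups for z in g])  # assumption: mean over individuals
    ss_within = [[0.0] * k for _ in range(k)]
    ss_between = [[0.0] * k for _ in range(k)]
    for g in groups:
        group_mean = col_means(g)
        for z in g:
            dev = [zi - mi for zi, mi in zip(z, group_mean)]
            ss_within = mat_add(ss_within, outer(dev, dev))
        bdev = [mi - gi for mi, gi in zip(group_mean, grand_mean)]
        # The inner sum over i in Equation (7) repeats the same term n_j times.
        ss_between = mat_add(ss_between, mat_scale(outer(bdev, bdev), len(g)))
    n0 = (total ** 2 - sum(n * n for n in sizes)) / (total * (J - 1))
    s_pw = mat_scale(ss_within, 1.0 / (total - J))
    s_b = mat_scale(ss_between, 1.0 / (J - 1))
    return n0, s_pw, s_b


# Tiny illustrative data set: two groups, two individuals each, one variable
example = [[[1.0], [3.0]], [[5.0], [7.0]]]
print(muml_statistics(example))
```

In a balanced design n0 equals the common group size, which is why the MUML fitting function coincides with the ML fitting function in that case.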

This simpler estimator is called MUML (Muthén, 2004), limited information ML estimator, or pseudo-balanced ML estimator (Hox and Maas, 2001). This approach ends up with the same fitting function as ML estimation under a balanced design (Muthén, 1989, 1990). Under an unbalanced design, the common group size is used to “pseudo-balance” the data. The statistical inference of MUML estimators has been derived, and their performance under different sample sizes and ICCs has been examined (Hox and Maas, 2001; Yuan and Hayashi, 2005; Hox et al., 2010; Ryu, 2015a).

With a similar idea as the MUML, some previous research introduced the expected a posteriori (EAP)-based estimation, which used ANOVA to get the EAP estimates, and then estimated the between and within effects with the EAP estimates in a stepwise manner (Croon and van Veldhoven, 2007; Zitzmann, 2018; Zitzmann and Helm, 2021). Croon and van Veldhoven (2007) found this stepwise procedure resulted in unbiased estimates of between effects on the group-level outcomes, although it did not maximize the complete likelihood function. In a contextual model, when there was limited information about the level-2 constructs (e.g., a small number of groups and low ICCs), the between effects from the ML estimation tended to be inaccurate. To deal with this problem, Zitzmann (2018) applied a stabilization procedure in the EAP-based estimation, and found that the EAP-based estimation with stabilization was more accurate than the ML estimation for the between effects (Zitzmann, 2018). In addition, Zitzmann and Helm (2021) showed that the EAP-based estimation could also be used for the estimation of complex MSEM with mediation, moderation, and nonlinear effects, and found that the EAP-based estimation had a smaller relative RMSE than the ML estimation, especially when the sample was of small to medium sizes (Zitzmann and Helm, 2021).

Since the MUML estimation showed only minor problems in estimation and could be easily implemented using standard SEM software packages (Ryu, 2015a), it has been used for MSEM in empirical studies (Cheung and Au, 2005; Stapleton, 2006; Ryu, 2014, 2015a; Wu et al., 2018). In the MUML, the between and within effects are estimated by treating SPW and SB in a multiple-group fashion. Stapleton (2002) adopted this estimator to incorporate sampling weights into the estimation of MSEM, using the weighted SPW and SB. Following this logic, it is possible to apply the within-group fpc in the estimation of SPW and SB, and to estimate the between and within effects via MUML based on the fpc-adjusted SPW and SB. Specifically, based on the method of moments (MOM), with a moderate within-group sampling ratio, the adjusted SPW and SB are $S_{PW\_fpc} = \frac{n-1}{n-fpc}S_{PW}$ and $S_{B\_fpc} = S_B + \left(\frac{n-1}{n-fpc}\right)(1-fpc)S_{PW}$, which can be used in the MUML estimation for the new approach with within-group fpc in the current study.
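The two adjustment formulas can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `fpc_adjust` is hypothetical, a common sample size n and population size N per group are assumed, and the matrices are passed as lists of lists.

```python
def fpc_adjust(s_pw, s_b, n, N):
    """Apply the within-group fpc to the MUML sample matrices.

    s_pw, s_b: pooled within and between sample matrices (lists of lists)
    n: common within-group sample size; N: common within-group population size
    Returns (S_PW_fpc, S_B_fpc) as given by the method-of-moments formulas in the text.
    """
    fpc = 1.0 - n / N
    c = (n - 1.0) / (n - fpc)
    s_pw_fpc = [[c * w for w in row] for row in s_pw]
    s_b_fpc = [
        [b + c * (1.0 - fpc) * w for b, w in zip(row_b, row_w)]
        for row_b, row_w in zip(s_b, s_pw)
    ]
    return s_pw_fpc, s_b_fpc


# Illustrative values (assumption): one variable, n = 10 sampled from N = 20
adjusted_pw, adjusted_b = fpc_adjust([[1.0]], [[2.0]], n=10, N=20)
print(adjusted_pw, adjusted_b)
```

When the sampling ratio n/N is close to 0, fpc is close to 1 and both matrices are left essentially unchanged; as the ratio approaches 1, fpc approaches 0 and SB receives the largest addition from SPW.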

The Current Study

The literature review showed that, in the decomposition of between and within effects in contextual models, the manifest and latent aggregation approaches made different assumptions about the within-group sampling in the sampling error correction for the aggregated group constructs. When the entire population was selected within each sampled group, the aggregated group mean was free from sampling error and the manifest aggregation approach was suitable. When the within-group sampling ratio was extremely small (e.g., smaller than 5%), the within-group finite population selection was not a major problem in sampling error correction and the latent aggregation approach was appropriate.

However, when the within-group sampling ratio was moderate, which is commonly seen in contextual studies, the within-group finite population selection was of concern in sampling error correction for the decomposition of between and within effects. The between effect estimators from the manifest aggregation approach may be biased as the sampling error in aggregation is not corrected at all. The between-effect estimators from the latent aggregation approach may also be biased as the sampling error is overcorrected by assuming an infinite group size.

As there was no available approach dealing with the within-group finite population selection in sampling error correction in aggregation, the current study first discussed the within-group fpc in the decomposition of between and within effects. The new approach with within-group fpc using MUML estimation based on the adjusted SPW and SB was compared with the manifest and latent aggregation approaches with ML estimation in a Monte Carlo simulation study. An empirical example using the dataset from the Programme for International Student Assessment (PISA) 2012 was also used to illustrate and compare the three analysis approaches.

Simulation Study

Methods

A Monte Carlo simulation study was first conducted to compare the performances of the manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc in the decomposition of between and within effects for the MLM, 2-1-1 mediation, and 1-1-1 mediation models. To resemble the data structure typically found in contextual studies, an extremely large number of groups with small to moderate group sizes was assumed in the population, and a two-stage cluster sampling design was assumed to be used for data collection in the current simulation study. The conditions manipulated were balanced or unbalanced design (BAL), average group size in the population (N, 20 and 100), ICC of the predictor X or mediator M (ICCX/ICCM, 0.05 and 0.25), the ratio of between to within effects of the predictor X and/or mediator M (RX/RM, 0.10 and 10), number of groups in the sample (g, 50 and 200), and within-group sampling ratio (r, 0.1, 0.3, 0.5, 0.7, and 0.9).

Population

The average group size (N) in the population varied at 20 and 100 in the current study. In the balanced case, all groups consisted of N individuals; in the unbalanced case, half of the groups consisted of (3/2)N individuals, and the other half consisted of (1/2)N individuals.

Population model

The MLM, 2-1-1 mediation, and 1-1-1 mediation models were considered in this simulation, with their ICCs of the predictor X or mediator M, and ratios of between to within effects manipulated (see Table 1 for the models).

Intraclass correlation coefficient

In the MSEM, the ICCs of the decomposed predictors or mediators were of importance (Muthén and Satorra, 1995; Kim et al., 2012; Lachowicz et al., 2015; Hsu et al., 2016), and ranged from 0.05 to 0.50 in previous simulations (Muthén, 1994; Hox and Maas, 2001; Lüdtke et al., 2008, 2011; Hox et al., 2010; Kim et al., 2012; Hsu et al., 2016; Pham, 2018). Considering the ICCs found in the previous simulation and empirical studies, the ICC was set as 0.05 and 0.25 for X in the MLM model and the 1-1-1 mediation model, and the ICC was set as 0.05 and 0.25 for M in the 2-1-1 mediation model in the current study. The ICC of Y was equal to 0.25 across all conditions in the three models, and the ICC of M in the 1-1-1 mediation model ranged from 0.20 to 0.25.

Ratio of between to within effects

As discussed in the literature review, when the between and within effects were the same, the between effect estimators in the manifest and latent aggregation approaches were approximately unbiased. The research interest in contextual models focused on the between effects which were different from the within effects. In the current simulation study, the within effects were fixed at certain values, with the ratio of between to within effects of X (RX) and M (RM) being varied at 0.10 and 10.

Population model

In the MLM model, μY = μX⁵ = 0, β00 = 0, βxw = 0.2, βxb = RXβxw, and Var(Xij) = 1. In the 2-1-1 mediation model, μY = μX = μM = 0, β00 = 0, βmw = 0.1, βmb = RMβmw, βxb = 0.2, αb = 0.2, and Var(Xj) = 1. In the 1-1-1 mediation model, μY = μX = μM = 0, β00 = 0, βmw = 0.05, βmb = RMβmw, βxw = 0.1, βxb = RXβxw, αw = 0.2, αb = RXαw, and Var(Xij) = 1. The individual-level and group-level error terms were assumed to follow multivariate normal distributions in the three models. The variances of individual-level and group-level error terms were set at different values for different RX/RM and ICCX/ICCM. Please see Table 2 for the distributions of the error terms in the three models.


Table 2. The population distributions of error terms in the multilevel modeling (MLM), 2-1-1 mediation, and 1-1-1 mediation models.

Data Generation

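In the MLM model, group components $(X_{Bj}, \upsilon_{0j})'$ were generated from $MVN\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}ICC_X & 0\\ 0 & Var(\upsilon_{0j})\end{pmatrix}\right)$, and Nj individual components $(X_{Wij}, \varepsilon_{ij})'$ were generated from $MVN\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}1 - ICC_X & 0\\ 0 & Var(\varepsilon_{ij})\end{pmatrix}\right)$ for each j. The mean of the Nj values of XWij was reset to 0 by applying a scale parameter to each XWij in group j, which guaranteed that the mean of XWij in group j was equal to 0.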
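The generation of one population group in the MLM case might look roughly like the following Python sketch. Several elements are assumptions made for illustration: the outcome equation Yij = βxw·XWij + βxb·XBj + υ0j + εij (with β00 = 0) is the usual random intercept specification rather than a quotation from Table 1, the error variances `var_u` and `var_e` are placeholders because the Table 2 values are not reproduced here, and the reset of the within-group mean is implemented by subtracting the group mean of XWij, which is one reading of the adjustment described above.

```python
import random


def generate_mlm_group(N_j, icc_x, beta_xw, beta_xb, var_u, var_e, rng):
    """Generate one population group of size N_j; returns (X_Bj, [(X_ij, Y_ij), ...])."""
    x_between = rng.gauss(0.0, icc_x ** 0.5)  # group component of X
    u0 = rng.gauss(0.0, var_u ** 0.5)         # group-level error term
    x_within = [rng.gauss(0.0, (1.0 - icc_x) ** 0.5) for _ in range(N_j)]
    # Assumption: recentre so the within components average exactly 0 in the group
    shift = sum(x_within) / N_j
    x_within = [xw - shift for xw in x_within]
    rows = []
    for xw in x_within:
        e = rng.gauss(0.0, var_e ** 0.5)      # individual-level error term
        y = beta_xw * xw + beta_xb * x_between + u0 + e
        rows.append((x_between + xw, y))
    return x_between, rows


# Placeholder settings (assumption): ICC_X = 0.25, beta_xw = 0.2, R_X = 10
rng = random.Random(2021)
x_b, group = generate_mlm_group(20, 0.25, 0.2, 2.0, var_u=0.1, var_e=0.5, rng=rng)
print(x_b, sum(x for x, _ in group) / len(group))
```

Because the within components are recentred, the population mean of X in the group equals the group component XBj, so any discrepancy in a sampled group mean is due to within-group sampling alone.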

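In the 2-1-1 mediation model, group components $(X_{Bj}, \omega_{0j}, \upsilon_{0j})'$ were generated from $MVN\left(\begin{pmatrix}0\\0\\0\end{pmatrix}, \begin{pmatrix}1 & 0 & 0\\ 0 & Var(\omega_{0j}) & 0\\ 0 & 0 & Var(\upsilon_{0j})\end{pmatrix}\right)$, and Nj individual components $(M_{Wij}, \varepsilon_{ij})'$ were generated from $MVN\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}\frac{1 - ICC_M}{ICC_M} \times 1.04 & 0\\ 0 & Var(\varepsilon_{ij})\end{pmatrix}\right)$ for each j. The mean of the Nj values of MWij was reset to 0 by applying a scale parameter to each MWij in group j, which guaranteed that the mean of MWij in group j was equal to 0.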

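In the 1-1-1 mediation model, group components $(X_{Bj}, \omega_{0j}, \upsilon_{0j})'$ were generated from $MVN\left(\begin{pmatrix}0\\0\\0\end{pmatrix}, \begin{pmatrix}ICC_X & 0 & 0\\ 0 & Var(\omega_{0j}) & 0\\ 0 & 0 & Var(\upsilon_{0j})\end{pmatrix}\right)$, and Nj individual components $(X_{Wij}, \delta_{ij}, \varepsilon_{ij})'$ were generated from $MVN\left(\begin{pmatrix}0\\0\\0\end{pmatrix}, \begin{pmatrix}1 - ICC_X & 0 & 0\\ 0 & Var(\delta_{ij}) & 0\\ 0 & 0 & Var(\varepsilon_{ij})\end{pmatrix}\right)$ for each j. The mean of the Nj values of XWij was reset to 0 by applying a scale parameter to each XWij in group j, which guaranteed that the mean of XWij in group j was equal to 0.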

Sample

Number of groups

In the multilevel analysis, a sufficient number of groups was needed for the admissible solutions and asymptotic properties of the between estimators (Kim et al., 2012). In a multilevel factor analysis with MUML, 50 groups were considered as a “small number of groups,” and at least 100 groups were suggested as sufficient for the model test and confidence interval (CI) estimates (Hox and Maas, 2001; Hox et al., 2010). The number of sampled groups (J) was set as 50 and 200, and the groups were randomly drawn with equal probability of selection from an infinite population of groups in this study.

Within-group sampling ratio

The latent aggregation approach showed an unacceptable bias when the group size was small (e.g., 5), and its efficiency increased with increasing group size. For a small bias, a group size of 20 was recommended (Preacher et al., 2011). To compare the new approach with within-group fpc to the manifest and latent aggregation approaches, the within-group sampling ratio (r) was varied across five levels: 0.1, 0.3, 0.5, 0.7, and 0.9. For the jth group, nj individuals were randomly drawn from the Nj members of the group, with nj equal to the product of Nj and the within-group sampling ratio r.
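The within-group draw described above amounts to simple random sampling without replacement. A minimal sketch, assuming a hypothetical helper `draw_within_group_sample` and assuming that nj is rounded to the nearest integer when Nj × r is not whole (the source does not say how such cases were handled):

```python
import random


def draw_within_group_sample(members, r, rng):
    """Draw n_j = N_j * r members of one group without replacement."""
    n_j = round(len(members) * r)   # rounding rule is an assumption
    return rng.sample(members, n_j)
```

For example, a population group of 20 members sampled at r = 0.3 yields 6 sampled individuals, so 14 members of the finite group remain unobserved.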

Estimation Method

The simulation was conducted under 2 × 2 × 2 × 2 × 2 × 5 = 160 conditions for each model. Under each condition, the manifest and latent aggregation approaches, as well as the new approach with within-group fpc were applied. The ML was used for the manifest and latent aggregation approaches, and the MUML was used for the new approach. Under each condition, 1,000 replications were conducted.

Evaluation Criteria

The parameters of interest in the current study were the between and within effects of the decomposed predictors and/or mediators. The performances of the three analysis approaches were evaluated in terms of model convergence, accuracy of the parameter estimates, variability of the estimator, and accuracy of the standard errors. The model convergence rate across 1,000 replications was used to evaluate the model convergence for each analysis approach under each simulation condition. The accuracy of the estimator was evaluated by relative bias, which is the average difference between the estimate and the population parameter, relative to the population parameter, over 1,000 replications under each condition. RMSE was used to evaluate the variability of the estimator; it is the square root of the mean squared difference between the estimate and the parameter over 1,000 replications under each condition. The observed coverage rate reflects the accuracy of the standard errors in each analysis approach. It is the proportion of replications in which the true parameter falls within the estimated 95% CI under each condition.
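The three numerical criteria defined above can be written compactly. The sketch below is illustrative only: the function names are hypothetical, and the 95% CI is assumed to be a symmetric Wald interval (estimate ± 1.96 × standard error), which the source does not state explicitly.

```python
import math


def relative_bias(estimates, theta):
    """Mean of (estimate - theta) over replications, divided by theta."""
    return sum(e - theta for e in estimates) / len(estimates) / theta


def rmse(estimates, theta):
    """Square root of the mean squared difference between estimate and theta."""
    return math.sqrt(sum((e - theta) ** 2 for e in estimates) / len(estimates))


def coverage_rate(estimates, std_errors, theta, z=1.96):
    """Share of replications whose interval estimate +/- z * SE contains theta."""
    hits = sum(1 for e, se in zip(estimates, std_errors)
               if e - z * se <= theta <= e + z * se)
    return hits / len(estimates)
```

Relative bias is scale-free, which is why it can be compared across parameters with different population values, whereas RMSE remains in the units of the parameter.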

Results

To evaluate the performances of the manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc in the decomposition of between and within effects, the model convergence rate, relative bias, RMSE, and observed coverage rate for the within and between effects were first obtained across the 1,000 replications under each simulation condition for each analysis approach.

As there were 160 simulation conditions for each model using each analysis approach, instead of providing the raw evaluation estimates under each simulation condition, the means and standard deviations of the convergence rate, relative bias, RMSE, and coverage rate across the 160 simulation conditions for each analysis approach were first provided for each parameter. Then, an ANOVA was conducted to examine the contributions of the seven design factors (i.e., analysis approach, RX/RM, ICCX/ICCM, g, BAL, N, and r) in explaining the variances of model convergence rate, relative bias, RMSE, and observed coverage rate under different simulation conditions for each parameter. All main and interaction effects were estimated in the ANOVA, and their effect sizes (η2) were calculated.

Convergence Rate

The three analysis approaches generally showed good model convergence rates for the MLM, 2-1-1 mediation, and 1-1-1 mediation models under most simulation conditions. For the manifest aggregation approach, the convergence rate was close to 100% (M = 100.00%, SD = 0.01%) for the MLM model, ranged from 96.10 to 100% (M = 99.97%, SD = 0.31%) for the 2-1-1 mediation model, and from 88.10 to 100% (M = 99.69%, SD = 1.50%) for the 1-1-1 mediation model across the 160 simulation conditions. For the latent aggregation approach, the convergence rate ranged from 91.80 to 100% (M = 99.52%, SD = 1.32%) for the MLM model, from 85.20 to 100% (M = 98.26%, SD = 3.44%) for the 2-1-1 mediation model, and from 87.90 to 100% (M = 98.59%, SD = 2.68%) for the 1-1-1 mediation model. For the new approach with within-group fpc, the convergence rate ranged from 95.20 to 100% (M = 99.75%, SD = 0.84%) for the MLM model, from 87.50 to 100% (M = 99.44%, SD = 1.88%) for the 2-1-1 mediation model, and from 84.60 to 100% (M = 99.12%, SD = 2.77%) for the 1-1-1 mediation model.

The main and interaction effects of the analysis approach showed no significant or consistent pattern that explained a large share of the variance in the convergence rate for the MLM, 2-1-1 mediation, and 1-1-1 mediation models. The non-convergence problems with the latent aggregation approach and the new approach were caused by non-positive definite estimates of the between variance-covariance matrices, from which the sampling errors in the aggregation had been removed either without or with the within-group fpc. In the manifest aggregation approach, the between variance-covariance matrices were estimated from the raw group means, which yielded positive definite variance-covariance estimates across all conditions.

Bias

As different values were used for different between and within effect parameters in the three models, relative bias was used to evaluate the accuracy in the between and within effect estimates from the three analysis approaches.

Within effects

The manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc showed only small or negligible differences in the relative biases of the within effect estimates in the three models (see Table 3). The main and interaction effects of the analysis approach accounted for less than 5% of the variance in the relative biases of these within effect estimates. In other words, the analysis approach showed no significant or consistent pattern that explained much of the variance in the relative bias of any within effect estimate.

Table 3. Relative bias in within and between effect estimates in the MLM, 2-1-1 mediation, and 1-1-1 mediation models.

Between effects

As expected, large differences in the between effect estimates were found among the manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc (see Table 3). For most between effect estimates in the three models, the new approach showed the smallest relative biases. On average, the manifest aggregation approach overestimated the between effects, and the latent aggregation approach underestimated the between effects. Unlike previous studies on the latent aggregation approach, which assumed that the within-group population was infinite (Lüdtke et al., 2008, 2011; Preacher et al., 2011), the current study simulated moderate to large within-group sampling ratios with small to moderate group sizes in the population, a design that favored the manifest aggregation approach. For example, when the group size was 20 and the within-group sampling ratio was 0.90, previous studies would lead one to expect the manifest aggregation approach to perform better than the latent aggregation approach (Lüdtke et al., 2008, 2011; Marsh et al., 2009, 2012; Preacher et al., 2010). This was reflected in the current results, in which the relative biases in between effect estimates from the manifest aggregation approach were generally smaller than those from the latent aggregation approach.

The main effects and interactions of the analysis approach and between-to-within-effect ratio had at least medium effect sizes (η2s > 0.059) for the relative biases in all between effect estimates in the three models. In addition, the three-way interactions among analysis approach, between-to-within-effect ratio, and ICCX/ICCM had at least medium effect sizes (η2s > 0.059) in explaining the variation in relative biases of βxb in the MLM model, and of βxb and αb in the 1-1-1 model. The cell means of relative biases in between effect estimates by analysis approach, between-to-within-effect ratio, and ICCX/ICCM are shown in Table 4 and Figure 1. The new approach with within-group fpc generally produced the smallest relative biases among the three analysis approaches under different between-to-within-effect ratios and ICCX/ICCM for most between effect estimates. In general, the differences in relative biases among the three analysis approaches shrank with a larger between-to-within-effect ratio and a larger ICCX/ICCM. The relative biases decreased when the between-to-within-effect ratio went up from 0.10 to 10, no matter which analysis approach was used. When the between-to-within-effect ratio was 10, the three analysis approaches provided similar relative biases in these between effect estimates. As discussed, the biases in between effect estimates from the manifest and latent aggregation approaches came from the additional parts containing within effects. When the between-to-within-effect ratio was large, i.e., RX/RM = 10 in the current study, the additional parts containing within effects were relatively small compared to the between effects. Under this condition, the between effect estimates were only slightly affected. The relative biases also decreased when the ICCX/ICCM went up from 0.05 to 0.25, no matter which analysis approach was used. When the ICCX/ICCM was 0.25, the relative biases in these between effect estimates from the three analysis approaches were more similar.

Table 4. Relative bias in between effect estimates by analysis approach, between-to-within-effect ratio, and ICCX/ICCM.

Figure 1. Relative bias in between-effect estimates by analysis approach, between-to-within-effect ratio, and ICCX/ICCM. Manifest, manifest aggregation approach; Latent, latent aggregation approach; FPC Latent, the new approach with within-group fpc; RX/RM, between-to-within-effect ratio; ICC, intraclass correlation coefficient of the decomposed predictor or mediator.

Root Mean Square Error

Within effects

For the within effect estimates in the three models, the manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc provided similar RMSE (in Table 5). From ANOVA, the main effect of the analysis approach and its interaction effects with other design factors explained trivial proportions of variances in RMSE for those within effect estimates. The within-group sampling ratio explained large proportions of variances in RMSE for the within effect estimates in the three models, which ranged from 31 to 44%. The number of groups in the sample, group size, and the interaction between group size and within-group sampling ratio also contributed medium to large proportions to the variances in the RMSE of within effect estimates (η2s > 0.059).

Table 5. Root mean square error of within and between effect estimates in the MLM, 2-1-1 mediation, and 1-1-1 mediation models.

Between effects

For the RMSE in between effect estimates, the three analysis approaches showed large differences. The means and standard deviations of RMSE of between effect estimates across the 160 simulation conditions from the three analysis approaches are presented in Table 5. From the seven-way ANOVA, the analysis approach accounted for medium to large proportions of variances (η2s > 0.059) of RMSE in the between effect estimates. For all between effect estimates in the three models, the manifest aggregation approach had the smallest RMSE among the three analysis approaches, while the latent aggregation approach gave the largest ones. The RMSE of between effect estimates from the new approach with within-group fpc were between the statistics from the manifest and latent aggregation approaches.

From the ANOVA results, the two-way interactions between analysis approach and ICCX/ICCM, and between analysis approach and group size, and the three-way interactions among analysis approach, ICCX/ICCM, and group size, explained medium to large proportions of variance (η2s > 0.059) in the RMSE of between effect estimates. As shown in Table 6 and Figure 2, the differences in RMSE among the three analysis approaches shrank with a larger group size and a larger ICCX/ICCM. The RMSE from the latent aggregation approach was more sensitive to group size and ICCX/ICCM than the RMSE from the other two analysis approaches. No matter which analysis approach was used, the RMSE of between effect estimates was inversely related to the group size and ICCX/ICCM.

Table 6. Root mean square error of between effect estimates by analysis approach, ICCX/ICCM, and group size.

Figure 2. Root mean square error (RMSE) of between effect estimates by analysis approach, ICCX/ICCM, and group size. Manifest, manifest aggregation approach; Latent, latent aggregation approach; FPC Latent, the new approach with within-group fpc; ICC, intraclass correlation coefficient of the decomposed predictor or mediator; N, average group size.

Coverage

As shown in Table 7, the average coverage rates for within effects from the manifest and latent aggregation approaches were close to the nominal level, i.e., 0.95, in the three models. In contrast, the new approach with within-group fpc provided higher coverage rates for the within effects than the nominal level. Its average coverage rates for the within effects were 100%. The differences in coverage rates for within effects among the three analysis approaches were also reflected in the ANOVA results: the largest proportions of variances in coverage rates for within effects were explained by the analysis approach, which were 94, 87, 94, 94, and 67% for the five within effect estimates.

Table 7. Observed coverage rate for within and between effects in the MLM, 2-1-1 mediation, and 1-1-1 mediation models.

Different from the results on within effects, the observed coverage rates for the between effects in the new approach with within-group fpc were closer to the nominal level, i.e., 0.95, compared with those from the manifest and latent aggregation approaches (in Table 7). The manifest aggregation approach performed the worst in terms of observed coverage rates for the between effects among the three analysis approaches, with an average observed coverage rate lower than 0.90.

The main effects of analysis approach and between-to-within-effect ratio, the two-way interactions between analysis approach and between-to-within-effect ratio, between analysis approach and within-group sampling ratio, and the three-way interactions between analysis approach, between-to-within-effect ratio, and within-group sampling ratio explained medium to large proportions of variances (η2s > 0.059) in coverage rates for these between effects. The cell means of coverage rates for between effects are presented in Table 8 and plotted in Figure 3 by analysis approach, between-to-within-effect ratio, and within-group sampling ratio.

Table 8. Observed coverage rates for between effects by analysis approach, RX/RM, and within-group sampling ratio.

Figure 3. Coverage rate for between effects by analysis approach, RX/RM, and within-group sampling ratio. Manifest, manifest aggregation approach; Latent, latent aggregation approach; FPC Latent, the new approach with within-group fpc; RX/RM, between-to-within-effect ratio; r, within-group sampling ratio.

The coverage rates for the between effects from the new approach with within-group fpc were above 0.95 in most conditions, and they were much closer to the nominal coverage rate, i.e., 0.95, than those from the manifest and latent aggregation approaches. The coverage rates for the between effects from the manifest and latent aggregation approaches were affected by the between-to-within-effect ratio and the within-group sampling ratio. The coverage rates for the between effects from the manifest aggregation approach improved with an increasing within-group sampling ratio, which was consistent with established findings (Lüdtke et al., 2008, 2011; Marsh et al., 2009, 2012; Preacher et al., 2010). As expected, the coverage rates for the between effects from the latent aggregation approach decreased with an increasing within-group sampling ratio. The differences in coverage rates for the between effects among the three analysis approaches decreased with a decreasing between-to-within-effect ratio. When the between-to-within-effect ratio was 0.10, the differences in coverage rates among the three analysis approaches (or across levels of the within-group sampling ratio) were trivial. When the between-to-within-effect ratio was 10, the differences in coverage rates among the three analysis approaches (or across levels of the within-group sampling ratio) were clearer. For instance, when the between-to-within-effect ratio was 10 and the within-group sampling ratio was 0.10 or 0.30, the coverage rates for the between effects from the manifest aggregation approach were lower than 80%, which was unacceptable. When the between-to-within-effect ratio was 10 and the within-group sampling ratio was 0.70 or 0.90, the coverage rates for the between effects from the latent aggregation approach were unfavorable. Under these same conditions, the coverage rates from the new approach were around 95%. The coverage rates for the between effects from the new approach were not greatly affected by the between-to-within-effect ratio or the within-group sampling ratio, and were better than those from the manifest and latent aggregation approaches.

Empirical Example

Background

Since the publication of the Coleman Report (Coleman et al., 1966), continuous efforts have been made to explore the role of schooling in alleviating the student SES gap in mathematics performance. Some researchers look at this problem through OTL, which describes students’ content exposure in mathematics and is a key factor in understanding schooling. Previous studies showed that OTL had a significant impact on student mathematics achievement, regardless of students’ parental education and income (Cogan et al., 2001; Lleras, 2008; Schmidt et al., 2015). However, between-school and within-school SES gaps in OTL were found, which exacerbated rather than alleviated SES gaps in student mathematics performance (Schmidt et al., 2015). In other words, high SES schools were better able to provide advanced mathematics courses to their students, which benefited their students' performance on average (Schmidt et al., 2015); within schools, high SES students had more opportunities to attend demanding courses (Roscigno, 1998; Milner, 2012; Reeves, 2012; Kalogrides and Loeb, 2013; Burger, 2016), which further translated into advantages in mathematics (Schmidt et al., 2015).

The direct and indirect effects of SES were highly likely to occur at both school-level and student-level, which reflected the institutional-level and individual-level mechanisms. It was necessary to decompose the between and within direct and indirect effects of SES on student mathematics performance via OTL. The main purpose of the current empirical illustration was to show the between and within direct and indirect effects of SES on student mathematics performance through OTL in a 1-1-1 mediation model for different countries using the data from PISA 2012 with the three analysis approaches.

Methods

The Programme for International Student Assessment (PISA) is an international standardized assessment that measures how well-prepared 15-year-old students are for their future lives (OECD, 2014a, b). In each country (i.e., 34 OECD countries and 31 partner countries in PISA 2012), a stratified two-stage sampling design was used, where schools were sampled using probability proportional to size sampling (PPS), and students were sampled with equal probabilities within sampled schools. About 150 schools were drawn from each country, with around 30 sampled students within each sampled school. Student weights and school weights were created for the sampled students and schools, which reflected how many other students (or schools) they could represent in the population (OECD, 2014a, b). Based on the student weights and school weights, the school size (of 15-year-olds) and within-school sampling ratio can be calculated as

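$$\mathrm{size}_j = \sum_{i}^{N_j} \frac{\mathrm{W\_FSTUWT}_{ij}}{\mathrm{W\_FSCHWT}_{j}}, \tag{8}$$

and

$$\mathrm{ratio}_j = \frac{n_j}{\mathrm{size}_j}, \tag{9}$$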

where sizej is the school size (of 15-year-olds) of the jth school; Nj is the total number of sampled students in the jth school; W_FSTUWTij is the student weight (i.e., the product of the inverse of the school’s probability of selection and the inverse of the student’s probability of selection within that school) for the ith student in the jth school; W_FSCHWTj is the school weight (i.e., the inverse of the school’s probability of selection) for the jth school; ratioj is the within-school sampling ratio for school j; and nj is the actual number of students in the jth school used for analyses (see footnote 6).
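Equations 8 and 9 translate directly into a few lines of code. The sketch below is illustrative: the function names and the example weights are hypothetical, and only the variable names W_FSTUWT and W_FSCHWT come from the PISA data described above.

```python
def school_size(student_weights, school_weight):
    """Equation 8: sum over sampled students of W_FSTUWT_ij / W_FSCHWT_j.

    Dividing each student weight by the school weight leaves the inverse of
    the student's within-school selection probability.
    """
    return sum(w / school_weight for w in student_weights)


def within_school_sampling_ratio(n_used, size):
    """Equation 9: students used in the analysis divided by school size."""
    return n_used / size


# Hypothetical school: school weight 10, 30 sampled students with weight 40 each
size_j = school_size([40.0] * 30, 10.0)              # 30 * (40 / 10) = 120 students
ratio_j = within_school_sampling_ratio(20, size_j)   # 20 students used in the analysis
```

In this made-up case, each sampled student stands for 4 students in the school, so the estimated school size is 120, and using 20 students in the analysis gives a within-school sampling ratio of 1/6.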

In PISA, the mean mathematics performance across OECD countries is 494 and the standard deviation is 92. OTL is constructed based on the students’ exposures to 13 mathematics topics. The response categories vary from “never heard of it” to “knew it well.” Student socioeconomic background is represented by the economic, social, and cultural status (ESCS) in PISA. ESCS is computed as a weighted score of students’ home possessions, parents’ occupations, and parents’ education levels. This variable has an average score of 0 and a standard deviation of 1 across OECD countries (OECD, 2014a, b).

In the current study, only students with no missing data on ESCS, OTL, and mathematics performance were included, and only the schools with at least two students were used. After excluding two countries (i.e., Albania and Norway) which did not collect ESCS or OTL information, about two thirds of students in each of the 63 countries were used, as the PISA student survey used a random rotated block design and one third of students had missing data on OTL by design. The number of schools ranged from 12 to 1,421 in these 63 countries, the average within-school sample size ranged from 12 to 81, and the within-school sampling ratio ranged from 0.09 to 0.65.

Results

In this empirical study, the between and within direct and indirect effects of ESCS on student mathematics performance through OTL were estimated in the 1-1-1 mediation model using the manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc for each of the 63 countries in PISA 2012. Results for only five countries are shown in Tables 9 and 10 (see Supplementary Materials for the full results).

Table 9. Within effects from the manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc in the 1-1-1 mediation model.

Table 10. Between effects from the manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc in the 1-1-1 mediation model.

In most countries, ESCS showed significant direct and indirect within effects via OTL on student mathematics achievement, which was consistent with previous studies (Schmidt et al., 2015). Consistent with the results in the simulation study, the within effect estimates and their standard errors from the three analysis approaches did not differ much from each other in the 63 countries, regardless of the degrees of ICCs of ESCS, OTL, and mathematics performance.

At the school level, significant direct and indirect between effects of ESCS on mathematics performance were also found in most countries, where the magnitudes of the between effects were larger than the corresponding within effects. Obvious differences in between effect estimates and their standard errors were found among the three analysis approaches. As expected, the between effect estimates and their standard errors from the new approach generally fell between those from the manifest and latent aggregation approaches. Unlike in the simulation study, the between effect estimates from the new approach with within-group fpc were generally closer to those from the latent aggregation approach than to those from the manifest aggregation approach. This difference was related to the within-group sampling ratio, which ranged from 0.09 to 0.61, with a mean of 0.31, across countries in the empirical dataset, whereas it was set to 0.10, 0.30, 0.50, 0.70, and 0.90 in the simulation. The fpc was calculated as one minus the within-group sampling ratio. With smaller within-group sampling ratios, the between variance-covariance matrices after correction from the new approach were closer to the ones used in the latent aggregation approach.
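A short worked illustration may help show why small sampling ratios push the fpc-based results toward the latent aggregation results. The second line below is an assumption: it uses the textbook sampling variance of a mean under simple random sampling without replacement and a balanced design, and it may differ in detail from the estimator used in the study. Here $S_B$ denotes the covariance matrix of the observed group means, $S_W$ the pooled within-group covariance matrix, and $n$ the within-group sample size.

```latex
% fpc as stated in the text
\mathrm{fpc} = 1 - r, \qquad r = \frac{n_j}{N_j}

% Assumed form of the corrected between covariance matrix
\widehat{\Sigma}_B(r) = S_B - \frac{(1 - r)}{n}\, S_W

% Worked values
% r = 0.31 (empirical mean):            fpc = 1 - 0.31 = 0.69
% r = 0.90 (largest simulated ratio):   fpc = 1 - 0.90 = 0.10

% Limiting cases
% r \to 0:  \widehat{\Sigma}_B \to S_B - S_W / n   (latent-style correction, no fpc)
% r \to 1:  \widehat{\Sigma}_B \to S_B             (raw group means, manifest)
```

With the empirical mean ratio of 0.31, roughly 69% of the latent-style correction is retained, whereas at a simulated ratio of 0.90 only 10% is retained.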

Discussion

Contextual effects or compositional effects are of interest in psychological research. The effects of organizational SES, the percentage of students eligible for free or reduced-price lunch (FRPL), the percentage of female students, and the percentage of minority students on individual outcomes or development have previously been studied. These group-level compositions are often aggregated from individual data in the sample. To examine their effects on individual outcomes controlling for interindividual differences, the manifest aggregation approach has traditionally been used to separate the between and within effects. Recently, there has been a trend toward adopting the latent aggregation approach in the decomposition of between and within effects, in which the sampling error in aggregation is corrected. Both the manifest and latent aggregation approaches make statistical assumptions about the constructs to be decomposed, the population of interest, and the sampling procedures used for data collection. However, little attention has been given to these assumptions when choosing the analysis approach in applications.

The current study focused on the decomposition of group compositional effects and individual effects based on the individual data in the sample. To resemble the data structure typically found in empirical research, an extremely large number of groups with small to moderate group sizes was assumed in the population, and a two-stage cluster sampling design with equal selection probability at each sampling stage was assumed. A new approach was proposed to deal with the within-group finite population selection problem in the sampling error correction in aggregation. The performances of the manifest aggregation approach, the latent aggregation approach, and the new approach with within-group fpc were compared in terms of the decomposition of between and within effects in the MLM, 2-1-1 mediation, and 1-1-1 mediation models. An empirical illustration was also used to compare the three analysis approaches in a 1-1-1 mediation model with PISA 2012 dataset.

The results from the simulation study and empirical study were generally in line with the expectations. The three analysis approaches showed acceptable model convergence rates for the three models and little difference in relative bias and RMSE of the within effects. The new approach provided higher coverage rates for the within effects compared to those from the manifest and latent aggregation approaches. For the between effects, large differences were found among the three analysis approaches. From the simulation study, the new approach with within-group fpc outperformed the manifest and latent aggregation approaches in terms of the relative biases and observed coverage rates for the between effects. The manifest aggregation approach had a better performance than the other two approaches in the RMSE for the between effects.

For the point estimates of between effects, the new approach with within-group fpc generally performed the best. The manifest aggregation approach underestimated the between effects, while the latent aggregation approach overestimated the between effects. The empirical study also found similar results, i.e., the between effect estimates from the latent aggregation approach were the largest among the three analysis approaches for most countries, those from the manifest aggregation approach were the smallest, while the between effect estimates from the new approach were in the middle.

On average, the between effect estimators from the manifest aggregation approach were the least variable among the three analysis approaches in the current study. Similarly, Lüdtke et al. (2008) found that, even with a population and sampling design that favored the latent aggregation approach, the estimates from the manifest aggregation approach were less variable, especially under conditions with small ICCs, a small number of groups, and a small average group size. However, the accuracy of the between effect estimates and their standard error estimates from the manifest aggregation approach was not ideal, as reflected in its unacceptably low observed coverage rates under a low within-group sample size condition. For example, when the between-to-within-effect ratio was 10 and the group size was 20, the coverage rates for the between effects from the manifest aggregation approach were around 50% with a within-group sampling ratio of 0.10, and all below 80% with a within-group sampling ratio of 0.30.

Based on these results, the new approach using within-group fpc was preferred, as it provided more accurate parameter and standard error estimates for the between effects than the other two approaches, although the between effect estimators were more variable than those of the manifest aggregation approach.

Previous studies (Lüdtke et al., 2008, 2011; Preacher et al., 2011) paid attention to the reflective group-level constructs, and formative group-level constructs under the conditions with large group sizes and extremely small within-group sampling ratios, and found that the latent aggregation approach outperformed the manifest aggregation approach in terms of the bias of between effect estimates. As discussed before, the current study applied moderate to large within-group sampling ratios with small to moderate group sizes in the simulation (or for the formative group-level constructs), which better represented the designs in contextual studies. This population assumption and sampling design did not fit the assumptions made by the latent aggregation approach. It is no surprise that the latent aggregation approach did not show any advantage over the new approach with within-group fpc, and even performed worse than the manifest aggregation approach in the decomposition of between and within effects in the current study, although this result seemed to be completely opposed to the conclusions made before (Lüdtke et al., 2008, 2011; Preacher et al., 2011). Lüdtke et al. (2011) also mentioned that when the within-group sampling ratio was 100%, the manifest aggregation approach would perform better than the latent aggregation approach, and that the finite sampling correction was needed when a moderate sampling ratio was used in their study. An additional simulation was conducted for the MLM model in the current study. A moderate group size (i.e., N = 100) and an extremely small within-group sampling ratio (i.e., r = 0.02) were used for data generation, which generally fit the assumptions made by the latent aggregation approach. Similar to the results from previous studies (Lüdtke et al., 2008, 2011; Preacher et al., 2011), the latent aggregation approach outperformed the manifest aggregation approach in terms of biases in between effect estimates under this condition, and the new approach provided between effect estimates similar to those from the latent aggregation approach.

The instability of the between effect estimates from the latent aggregation approach was not related only to a moderate to high within-group sampling ratio. Even with an infinite within-group population, it was found that the between effect estimates from the latent aggregation approach using ML estimation might be unstable when the group-level variance components were close to zero. To improve the estimation accuracy of the between effects for the latent aggregation approach under conditions with a small number of groups and low ICCs, Bayesian estimation and EAP-based estimation were proposed in previous studies. The between effect estimates from the latent aggregation approach using Bayesian estimation fell between those from the manifest aggregation approach using ML estimation and those from the latent aggregation approach using ML estimation (Zitzmann et al., 2016). Under challenging conditions with a small number of groups and low ICCs of predictors, EAP-based estimation worked better for the between effects than ML estimation in the latent aggregation approach, and it worked similarly to ML estimation in other conditions (Croon and van Veldhoven, 2007; Zitzmann, 2018; Zitzmann and Helm, 2021).

EAP-based estimation rests on an idea similar to that of MUML estimation: both separate the between and within effects in a stepwise manner. In the current study, following the MUML idea, the new approach with within-group fpc used a stepwise procedure. In the first step, the between and within variance-covariance estimates were separated using the ANOVA method. In the second step, these variance-covariance estimates were used to estimate the between and within effects. The major difference between the new approach with within-group fpc in the current study and the EAP-based estimation in previous studies (Croon and van Veldhoven, 2007; Zitzmann, 2018; Zitzmann and Helm, 2021) was whether they dealt with the within-group finite population selection issue. In the current study, the within-group fpc was incorporated in the ANOVA procedure to obtain the adjusted between and within variance-covariance estimates in the new approach. Considering the good performance of EAP-based estimation for the latent aggregation approach (Croon and van Veldhoven, 2007; Zitzmann, 2018; Zitzmann and Helm, 2021) and the within-group fpc for the sampling errors, the new approach with within-group fpc was expected to work better for the between effects than the latent aggregation approach using ML estimation, especially when there was limited information on the group-level constructs. In the current results, the between effect estimates from the new approach with within-group fpc showed smaller relative biases and smaller RMSE than those from the latent aggregation approach using ML estimation. The differences in between effect estimates between the new approach and the latent aggregation approach were larger when the ICCs of the predictor and/or mediator were smaller and the group size was smaller. 
The current results did not mean the latent aggregation approach was unfavorable in general but suggested that under the same or a similar population and sampling scenario, the latent aggregation approach might not be a good choice to separate the individual effects and group compositional effects. Furthermore, it would be interesting to compare the between effect estimates from the Bayesian estimation and the EAP-based estimation in the latent aggregation approach proposed before, and those from the new approach with within-group fpc in the current study under different populations and sampling conditions in future studies.
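The two-step logic described above can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes balanced groups, a single predictor and outcome, simple random sampling without replacement within groups, and the textbook two-stage sampling result that the expected between mean square equals n times the between (co)variance plus (1 - f) times the within (co)variance, where f is the within-group sampling ratio. The function name `decompose_between_within` and its arguments are hypothetical.

```python
from collections import defaultdict


def decompose_between_within(x, y, groups, sampling_ratio):
    """Return (between_effect, within_effect) of x on y.

    Illustrative assumptions: equal sample size n in every group and a
    common within-group sampling ratio f = n / N (sampling_ratio).
    """
    if not 0.0 <= sampling_ratio <= 1.0:
        raise ValueError("sampling_ratio must lie in [0, 1]")

    # Collect the sampled (x, y) pairs of each group.
    by_group = defaultdict(list)
    for xi, yi, g in zip(x, y, groups):
        by_group[g].append((float(xi), float(yi)))

    sizes = {len(rows) for rows in by_group.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    n = sizes.pop()
    J = len(by_group)
    if n < 2 or J < 2:
        raise ValueError("need at least 2 groups with 2 members each")

    total = J * n
    grand_x = sum(r[0] for rows in by_group.values() for r in rows) / total
    grand_y = sum(r[1] for rows in by_group.values() for r in rows) / total

    # Step 1: ANOVA sums of squares and cross-products.
    ssb_xx = ssb_xy = ssw_xx = ssw_xy = 0.0
    for rows in by_group.values():
        mx = sum(r[0] for r in rows) / n
        my = sum(r[1] for r in rows) / n
        ssb_xx += n * (mx - grand_x) ** 2
        ssb_xy += n * (mx - grand_x) * (my - grand_y)
        for xi, yi in rows:
            ssw_xx += (xi - mx) ** 2
            ssw_xy += (xi - mx) * (yi - my)
    msb_xx, msb_xy = ssb_xx / (J - 1), ssb_xy / (J - 1)
    msw_xx, msw_xy = ssw_xx / (J * (n - 1)), ssw_xy / (J * (n - 1))

    # Finite population correction (assumed form): only the unsampled
    # share (1 - f) of within-group variation contaminates the between
    # mean squares, so only that share is subtracted.
    unsampled = 1.0 - sampling_ratio
    tau_xx = (msb_xx - unsampled * msw_xx) / n
    tau_xy = (msb_xy - unsampled * msw_xy) / n
    if tau_xx <= 0.0 or msw_xx <= 0.0:
        raise ValueError("non-positive variance estimate for the predictor")

    # Step 2: effects as ratios of the separated (co)variance estimates.
    return tau_xy / tau_xx, msw_xy / msw_xx
```

Under these assumptions, a sampling ratio of 1 removes the correction, so the between effect equals the slope of the observed group means on each other, as in manifest aggregation. A sampling ratio of 0 subtracts the full within mean square, which corresponds to the infinite-population correction underlying latent aggregation.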

In summary, the results from both the simulation and the empirical illustration reflect the necessity of considering assumptions about the population and sampling design, as well as the nature of the group-level constructs, in the decomposition of between and within effects in contextual models. When the within-group sampling ratio is extremely small (e.g., smaller than 5%) or the within-group population can be assumed to be infinite, the latent aggregation approach is a good choice. When the within-group sampling ratio is extremely large (e.g., close to 100%), the manifest aggregation approach can be used to separate the between and within effects. For contextual studies in which the within-group sampling ratio is usually moderate, the finite population selection needs to be considered in the sampling error correction. Under this condition, the new approach with within-group fpc provides an additional choice for estimating the group compositional effects, with a lesser degree of bias and higher observed coverage rates of between effect estimates than the manifest and latent aggregation approaches.

Limitations and Future Study

In the current study, the between and within effects of the decomposed predictors and/or mediators were examined for three particular two-level models, i.e., the MLM, 2-1-1 mediation, and 1-1-1 mediation models, under certain assumptions about the constructs of research interest, populations of subjects, sampling procedures, and the variables used in these models. The simulation results can therefore be generalized only to the same or similar conditions.

First, the current study did not cover the conditions that theoretically favor either the manifest or the latent aggregation approach. As discussed before, for formative group-level constructs, when entire groups are drawn, the manifest aggregation approach is assumed to perform better than the other two approaches. For reflective group-level constructs under conditions in which the within-group population can be assumed to be infinite, or for formative group-level constructs under conditions with an extremely small within-group sampling ratio (e.g., smaller than 5%), the latent aggregation approach is assumed to be a good choice based on previous studies (Lüdtke et al., 2008, 2011; Preacher et al., 2011). Further simulations are needed to examine the performance of the new approach under these conditions and to compare it with the manifest and latent aggregation approaches.

Second, all models were correctly specified in the current simulation study. It is unclear whether the three analysis approaches are sensitive to model misspecifications under different settings, or how they will perform under different types of model misspecification. For example, it was assumed that no omitted confounder influenced either the mediators or the outcomes in the current study. However, in empirical studies, it would be unrealistic or impossible to include and model all the relevant variables. In our empirical illustration, for instance, student characteristics (e.g., gender and motivation), teacher characteristics (e.g., teacher’s degree and major), and school characteristics (e.g., school type and location) were highly likely to be related to student-level and school-level OTL and the outcome, which would confound both direct and indirect effects at the student level and the school level. It would be interesting to develop and conduct a sensitivity analysis for contextual models in future studies in order to understand the potential influences of confounders on the between and within effects. Furthermore, no random slope or cross-level interaction was considered in the current study. For the manifest aggregation approach, random slopes and cross-level interactions can be included in the models following the traditional multilevel modeling strategy (Raudenbush and Bryk, 2002). Marsh et al. (2009) and Preacher et al. (2010, 2011) showed that random slopes and cross-level interactions can be included in the latent aggregation approach. However, when the within effects of the decomposed variables have random slopes, or when cross-level interactions are involved, the new approach with within-group fpc in the current study cannot be applied. Further work is needed to incorporate random slopes and cross-level interactions into the new approach. 
In addition, when the between and within effects of the decomposed variables, as well as the random slopes and cross-level interactions of the within components are all included in the models, it is necessary to reconsider the meaning of these estimates in applications, and whether these analyses and estimation approaches are reasonable for the research questions.

Third, all variables used in the simulation were generated from multivariate normal distributions. With other distributions, different considerations may apply and different results may emerge. Future studies also need to consider how to decompose the between and within effects for variables following distributions other than the multivariate normal distribution, with a correction for the sampling errors in the aggregation.

Moreover, the within-group sampling ratio is required for the application of the new approach with within-group fpc. As shown in the empirical study, many large-scale datasets make it possible to calculate the within-group sampling ratios from the available sampling design information. However, unlike the current study, which assumed a two-stage cluster sampling design with equal selection probability at each stage, these large-scale datasets do not have the same selection probability across groups or across individuals. Sampling weights need to be incorporated into estimation for all three analysis approaches. Previous studies indicated that multilevel pseudo-maximum likelihood (MPML) estimation provides one way to incorporate the group-level and individual-level sampling weights into multilevel models (Asparouhov, 2004, 2006; Rabe-Hesketh and Skrondal, 2006). This estimation method can be adapted and applied to the manifest and latent aggregation approaches. For the new approach with within-group fpc, further studies can work on incorporating sampling weights into estimation, with the adjustment for finite population selection in the sampling error correction in aggregation. Comparisons of the three analysis approaches with different weighting procedures in the decomposition of between and within effects are also of research interest.
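As an illustration of how a within-group sampling ratio might be recovered from design information, the following sketch assumes that each sampled individual's final weight is the product of a group-level weight and a within-group weight, and that the within-group weights sum to an estimate of the group's population size. This is one plausible reading, not the procedure documented for any particular dataset, and the function name `within_group_sampling_ratio` and its arguments are hypothetical.

```python
def within_group_sampling_ratio(individual_weights, group_weight):
    """Estimate n / N for one group from sampling weights.

    Assumption: individual weight = group_weight * within-group weight,
    so dividing by group_weight leaves the within-group weight, whose
    sum over the sampled members estimates the group population size N.
    """
    if group_weight <= 0:
        raise ValueError("group_weight must be positive")
    within_weights = [w / group_weight for w in individual_weights]
    if not within_weights or any(w <= 0 for w in within_weights):
        raise ValueError("need at least one positive individual weight")

    estimated_population = sum(within_weights)
    # A ratio above 1 can only arise from weight adjustments, so cap it.
    return min(1.0, len(within_weights) / estimated_population)
```

For example, three sampled individuals with weights 20, 20, and 30 in a group whose weight is 10 have within-group weights of 2, 2, and 3, giving an estimated group size of 7 and a sampling ratio of 3/7. Weight adjustments for nonresponse or trimming would need separate handling in practice.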

Furthermore, for the PISA design, balanced repeated replication (BRR) weights should be used to calculate sampling variances in an empirical study. The raw school weights and student weights were used only to estimate the within-school sampling ratio, and all results in the current empirical example were unweighted. Because the purpose of the current empirical study was to compare the performances of the manifest aggregation approach, the latent aggregation approach, and the new approach under a finite within-group population condition using real data, unweighted results were reported. Beyond the weighting methods for the three analysis approaches, which need more work, resampling methods for estimating the sampling variances under complex sampling designs in the decomposition of between and within effects also deserve further exploration, so both weighting methods and replication weights should be considered for the three analysis approaches in future work.
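For readers unfamiliar with replication-based variances, the final aggregation step of BRR can be sketched as below. The sketch assumes the estimate of interest (e.g., a between effect) has already been recomputed once per set of replicate weights. The function name `brr_variance` and the parameter `fay_k` are hypothetical. Setting `fay_k` to 0 gives classical BRR, and a value between 0 and 1 gives Fay's variant; the value appropriate for a given survey should be taken from its technical documentation rather than from this example.

```python
def brr_variance(full_estimate, replicate_estimates, fay_k=0.0):
    """Sampling variance of an estimate from replicate estimates.

    Computes the mean squared deviation of the replicate estimates from
    the full-sample estimate, rescaled by 1 / (1 - fay_k)^2.
    """
    if not 0.0 <= fay_k < 1.0:
        raise ValueError("fay_k must lie in [0, 1)")
    replicates = list(replicate_estimates)
    if not replicates:
        raise ValueError("need at least one replicate estimate")

    squared_deviations = sum((r - full_estimate) ** 2 for r in replicates)
    return squared_deviations / (len(replicates) * (1.0 - fay_k) ** 2)
```

How the replicate weights would enter each of the three analysis approaches is the open question raised above; this function covers only the step that turns replicate estimates into a variance.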

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: http://www.oecd.org/pisa/data/.

Author Contributions

SG designed the study, conducted the simulation and empirical studies, and took a leading role in writing the manuscript. RH and WS contributed to the simulation study and manuscript writing. All authors contributed to the article and approved the submitted version.

Funding

This research was supported by the Fundamental Research Funds for the Central Universities and the Research Funds of the Renmin University of China granted to SG.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

Parts of this paper are based on the dissertation of SG and were submitted to, and accepted by, the American Educational Research Association (AERA) 2020 Annual Meeting.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2021.541803/full#supplementary-material

Footnotes

  1. ^ For simplicity, the between-group effect and within-group effect are referred to as “between effect” and “within effect” in the current study.
  2. ^ The focus in the current study is on the sampling error issue in the aggregation of group compositions, and measurement error is not discussed. Xij is assumed free from measurement error in this study. This assumption is reasonable for some individual characteristics used in this study, like gender and ethnicity. However, for student SES, this assumption is not generally satisfied. When the measurement error is considered in the model, multiple indicators of Xij are needed. Please see Lüdtke et al. (2011) for further information.
  3. ^ The traditional assumption of multivariate normality is not crucial for the asymptotic results. In fact, the MSEM framework is general enough to accommodate variables following non-normal distributions, and some software, like Mplus, has built-in estimation methods for non-normal data. We state the “multivariate normal distribution assumption” here because it was assumed in the papers we cited by Preacher et al. (2010, 2011) and Lüdtke et al. (2011), who also noted that the assumption of multivariate normality was not crucial.
  4. ^ Lüdtke et al. (2008) discussed the assumptions of finite and infinite populations in each group and distinguished formative from reflective group-level constructs. When the group is the referent in aggregation, the group-level construct is a reflective measure, and the assumption of an infinite population in each group is reasonable. One example is school climate, in which students within a school rate the climate of the target school. Following domain sampling theory, students are exchangeable and can be assumed infinite in that school. However, this does not fit contextual studies in which the group construct is a composition of individual characteristics. When the referent is the individual, and individual characteristics are used for aggregation, the group construct is a formative measure, and the population in each group cannot be assumed infinite. One example is school gender ratio, in which students are not exchangeable in terms of their own gender. When the sampling ratio in a school is 100%, the manifest gender ratio in each school is free from sampling error. In the current study, group constructs are group compositions, which are aggregated from the individual characteristics. The individual is the referent, and the within-group population cannot be assumed infinite.
  5. ^ μY, μX, and μM are the population means of Y, X, and M, respectively.
  6. ^ Based on the sampling design in PISA, BRR should be used for calculating sampling variances. As the purpose of the current example is to compare the three analysis approaches under a finite within-group population condition, unweighted results were reported in this empirical example and BRR weights were not used. Further consideration should be given to the replication weights in future studies.

References

Asparouhov, T. (2004). Weighting for unequal probability of selection in multilevel modeling. Los Angeles, CA: Muthen & Muthen.

Asparouhov, T. (2006). General multi-level modeling with sampling weights. Commun. Statist. Theory Methods 35, 439–460. doi: 10.1080/03610920500476598

Burger, K. (2016). Intergenerational transmission of education in Europe: Do more comprehensive education systems reduce social gradients in student achievement? Res. Soc. Stratificat. Mobil. 44, 54–67. doi: 10.1016/j.rssm.2016.02.002

Cheung, M. W. L., and Au, K. (2005). Applications of multilevel structural equation modeling to cross-cultural research. Struct. Equat. Model. Multidiscipl. J. 12, 598–619. doi: 10.1207/s15328007sem1204_5

Cochran, W. G. (1977). Sampling techniques. New York: Wiley.

Cogan, L. S., Schmidt, W. H., and Wiley, D. E. (2001). Who takes what math and in which track? Using TIMSS to characterize US students’ eighth-grade mathematics learning opportunities. Educat. Evaluat. Policy Anal. 23, 323–341. doi: 10.3102/01623737023004323

Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood, A. M., Weinfeld, F. D., et al. (1966). Equality of educational opportunity. Washington, D.C: U.S. Government Printing Office.

Croon, M. A., and van Veldhoven, M. J. P. M. (2007). Predicting group-level outcome variables from variables measured at the individual level: A latent variable multilevel model. Psychol. Methods 12, 45–57. doi: 10.1037/1082-989x.12.1.45

Hox, J. J., and Maas, C. J. M. (2001). The accuracy of multilevel structural equation modeling with pseudobalanced groups and small samples. Struct. Equat. Model. Multidiscipl. J. 8, 157–174. doi: 10.1207/s15328007sem0802_1

Hox, J. J., Maas, C. J. M., and Brinkhuis, M. J. S. (2010). The effect of estimation method and sample size in multilevel structural equation modeling. Statist. Neerlandica 64, 157–170. doi: 10.1111/j.1467-9574.2009.00445.x

Hsu, H.-Y., Lin, J.-H., Kwok, O.-M., Acosta, S., and Willson, V. (2016). The impact of intraclass correlation on the effectiveness of level-specific fit indices in multilevel structural equation modeling: A Monte Carlo study. Educat. Psychol. Measurem. 77, 5–31. doi: 10.1177/0013164416642823

Kalogrides, D., and Loeb, S. (2013). Different teachers, different peers: The magnitude of student sorting within schools. Educat. Res. 42, 304–316. doi: 10.3102/0013189x13495087

Keenan, A. P., and Laura, M. S. (2012). Distinguishing between cross- and cluster-level mediation processes in the cluster randomized trial. Sociol. Methods Res. 41, 630–670. doi: 10.1177/0049124112460380

Kim, E. S., Kwok, O.-M., and Yoon, M. (2012). Testing factorial invariance in multilevel data: A Monte Carlo study. Struct. Equat. Model. Multidiscipl. J. 19, 250–267. doi: 10.1080/10705511.2012.659623

Lachowicz, M. J., Sterba, S. K., and Preacher, K. J. (2015). Investigating multilevel mediation with fully or partially nested data. Group Proces. Intergr. Relat. 18, 274–289. doi: 10.1177/1368430214550343

Lai, M. H. C., Kwok, O.-M., Hsiao, Y.-Y., and Cao, Q. (2018). Finite population correction for two-level hierarchical linear models. Psychol. Methods 23, 94–112. doi: 10.1037/met0000137

Lau, S., and Nie, Y. (2008). Interplay between personal goals and classroom goal structures in predicting student outcomes: A multilevel analysis of person-context interactions. J. Educat. Psychol. 100, 15–29. doi: 10.1037/0022-0663.100.1.15

Lleras, C. (2008). Race, racial concentration, and the dynamics of educational inequality across urban and suburban schools. Am. Educat. Res. J. 45, 886–912. doi: 10.3102/0002831208316323

Lohr, S. (2009). Sampling: Design and analysis, 2nd Edn. Boston, MA: Cengage Learning.

Longford, N. T., and Muthén, B. O. (1992). Factor analysis for clustered observations. Psychometrika 57, 581–597. doi: 10.1007/bf02294421

Lüdtke, O., Marsh, H. W., Robitzsch, A., and Trautwein, U. (2011). A 2 x 2 taxonomy of multilevel latent contextual models: Accuracy-bias trade-offs in full and partial error correction models. Psychol. Methods 16, 444–467. doi: 10.1037/a0024376

Lüdtke, O., Marsh, H. W., Robitzsch, A., Trautwein, U., Asparouhov, T., and Muthén, B. O. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychol. Methods 13, 203–229. doi: 10.1037/a0012869

Marsh, H. W., Lüdtke, O., Nagengast, B., Trautwein, U., Morin, A. J. S., Abduljabbar, A. S., et al. (2012). Classroom climate and contextual effects: Conceptual and methodological issues in the evaluation of group-level effects. Educat. Psychol. 47, 106–124. doi: 10.1080/00461520.2012.670488

Marsh, H. W., Lüdtke, O., Robitzsch, A., Trautwein, U., Asparouhov, T., Muthén, B. O., et al. (2009). Doubly-latent models of school contextual effects: Integrating multilevel and structural equation approaches to control measurement and sampling error. Multivar. Behav. Res. 44, 764–802. doi: 10.1080/00273170903333665

Mayer, A., Nagengast, B., Fletcher, J., and Steyer, R. (2014). Analyzing average and conditional effects with multigroup multilevel structural equation models. Front. Psychol. 5:304. doi: 10.3389/fpsyg.2014.00304

McNeish, D. (2017). Multilevel mediation with small samples: A cautionary note on the multilevel structural equation modeling framework. Struct. Equat. Model. Multidiscipl. J. 24, 609–625. doi: 10.1080/10705511.2017.1280797

Milner, H. R. (2012). Beyond a test score: Explaining opportunity gaps in educational practice. J. Black Stud. 43, 693–718. doi: 10.1177/0021934712442539

Muthén, B. O. (1989). Latent variable modeling in heterogeneous populations. Psychometrika 54, 557–585. doi: 10.1007/bf02296397

Muthén, B. O. (1990). “Mean and covariance structure analysis of hierarchical data,” in Paper presented at the Psychometric Society meeting, (Princeton, NJ: Psychometric Society).

Muthén, B. O. (1994). Multilevel covariance structure analysis. Sociol. Methods Res. 22, 376–398. doi: 10.1177/0049124194022003006

Muthén, B. O. (1997). Latent variable modeling of longitudinal and multilevel data. Sociol. Methodol. 27, 453–480. doi: 10.1111/1467-9531.271034

Muthén, B. O. (2004). Mplus technical appendices. Los Angeles, CA: Muthén & Muthén.

Muthén, B. O., and Satorra, A. (1995). Complex sample data in structural equation modeling. Sociol. Methodol. 25, 267–316. doi: 10.2307/271070

OECD (2014a). PISA 2012 results: What students know and can do, Vol. I. Paris: OECD Publishing.

OECD (2014b). PISA 2012 technical report. Paris: OECD Publishing.

Pham, T. V. (2018). The performance of multilevel structural equation modeling (MSEM) in comparison to multilevel modeling (MLM) in multilevel mediation analysis with non-normal data. Ph.D. thesis. Tampa, FL: University of South Florida.

Pituch, K. A., Stapleton, L. M., and Kang, J. Y. (2006). A comparison of single sample and bootstrap methods to assess mediation in cluster randomized trials. Multivar. Behav. Res. 41, 367–400. doi: 10.1207/s15327906mbr4103_5

Preacher, K. J. (2011). Multilevel SEM strategies for evaluating mediation in three-level data. Multivar. Behav. Res. 46, 691–731. doi: 10.1080/00273171.2011.589280

Preacher, K. J., Zhang, Z., and Zyphur, M. J. (2011). Alternative methods for assessing mediation in multilevel data: The advantages of multilevel SEM. Struct. Equat. Model. Multidiscipl. J. 18, 161–182. doi: 10.1080/10705511.2011.557329

Preacher, K. J., Zhang, Z., and Zyphur, M. J. (2016). Multilevel structural equation models for assessing moderation within and across levels of analysis. Psychol. Methods 21, 189–205. doi: 10.1037/met0000052

Preacher, K. J., Zyphur, M. J., and Zhang, Z. (2010). A general multilevel SEM framework for assessing multilevel mediation. Psychol. Methods 15, 209–233. doi: 10.1037/a0020141

Rabe-Hesketh, S., and Skrondal, A. (2006). Multilevel modelling of complex survey data. J. R. Statist. Soc. Ser. A Statist. Soc. 169, 805–827. doi: 10.1111/j.1467-985X.2006.00426.x

Raudenbush, S. W., and Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. London: SAGE Publications.

Reeves, E. B. (2012). The effects of opportunity to learn, family socioeconomic status, and friends on the rural math achievement gap in high school. Am. Behav. Sci. 56, 887–907. doi: 10.1177/0002764212442357

Roscigno, V. J. (1998). Race and the reproduction of educational disadvantage. Soc. Forces 76, 1033–1061. doi: 10.2307/3005702

Ryu, E. (2014). Factorial invariance in multilevel confirmatory factor analysis. Br. J. Mathemat. Statist. Psychol. 67, 172–194. doi: 10.1111/bmsp.12014

Ryu, E. (2015a). Multiple group analysis in multilevel structural equation model across level 1 groups. Multivar. Behav. Res. 50, 300–315. doi: 10.1080/00273171.2014.1003769

Ryu, E. (2015b). The role of centering for interaction of level 1 variables in multilevel structural equation models. Struct. Equat. Model. Multidiscipl. J. 22, 617–630. doi: 10.1080/10705511.2014.936491

Schmidt, W. H., Burroughs, N. A., Zoido, P., and Houang, R. T. (2015). The role of schooling in perpetuating educational inequality: An international perspective. Educat. Res. 44, 371–386. doi: 10.3102/0013189x15603982

Shin, Y., and Raudenbush, S. W. (2010). A latent cluster-mean approach to the contextual effects model with missing data. J. Educat. Behav. Statist. 35, 26–53. doi: 10.3102/1076998609345252

Stapleton, L. M. (2002). The incorporation of sample weights into multilevel structural equation models. Structur. Equat. Model. 9, 475–502. doi: 10.1207/s15328007sem0904_2

Stapleton, L. M. (2006). “Using multilevel structural equation modeling techniques with complex sample data,” in Structural Equation Modeling: A Second Course, eds R. O. Mueller and G. R. Hancock (Conn: Information Age Publishing).

Svoboda, S. J. (2020). Finite population corrections for two-level hierarchical linear models with binary predictors. Ph.D. thesis. Ann Arbor: The University of Nebraska - Lincoln.

Talloen, W., Moerkerke, B., Loeys, T., De Naeghel, J., Van Keer, H., and Vansteelandt, S. (2016). Estimation of indirect effects in the presence of unmeasured confounding for the mediator-outcome relationship in a multilevel 2-1-1 mediation model. J. Educat. Behav. Statist. 41, 359–391. doi: 10.3102/1076998616636855

Tofighi, D., and Thoemmes, F. (2014). Single-level and multilevel mediation analysis. J. Early Adolesc. 34, 93–119. doi: 10.1177/0272431613511331

Wu, J.-Y., Lee, Y.-H., and Lin, J. J. H. (2018). Using iMCFA to perform the CFA, multilevel CFA, and maximum model for analyzing complex survey data. Front. Psychol. 9:251. doi: 10.3389/fpsyg.2018.00251

Yuan, K. H., and Hayashi, K. (2005). On Muthen’s maximum likelihood for two-level covariance structure models. Psychometrika 70, 147–167. doi: 10.1007/s11336-003-1070-8

Zhang, Z., Zyphur, M. J., and Preacher, K. J. (2009). Testing multilevel mediation using hierarchical linear models: Problems and solutions. Organizat. Res. Methods 12, 695–719. doi: 10.1177/1094428108327450

Zitzmann, S. (2018). A computationally more efficient and more accurate stepwise approach for correcting for sampling error and measurement error. Multivar. Behav. Res. 53, 612–632. doi: 10.1080/00273171.2018.1469086

Zitzmann, S., and Helm, C. (2021). Multilevel analysis of mediation, moderation, and nonlinear effects in small samples, using expected a posteriori estimates of factor scores. Struct. Equat. Model. Multidiscipl. J. 2021:1855076. doi: 10.1080/10705511.2020.1855076

Zitzmann, S., Lüdtke, O., and Robitzsch, A. (2015). A Bayesian approach to more stable estimates of group-level effects in contextual studies. Multivar. Behav. Res. 50, 688–705. doi: 10.1080/00273171.2015.1090899

Zitzmann, S., Lüdtke, O., Robitzsch, A., and Marsh, H. W. (2016). A Bayesian approach for estimating multilevel latent contextual models. Struct. Equat. Model. Multidiscipl. J. 23, 661–679. doi: 10.1080/10705511.2016.1207179

Keywords: contextual model, multilevel modeling, structural equation modeling, multilevel SEM, mediation, finite population correction

Citation: Guo S, Houang RT and Schmidt WH (2021) The Decomposition of Between and Within Effects in Contextual Models. Front. Psychol. 12:541803. doi: 10.3389/fpsyg.2021.541803

Received: 10 March 2020; Accepted: 06 April 2021;
Published: 03 June 2021.

Edited by:

Alessandro Giuliani, National Institute of Health (ISS), Italy

Reviewed by:

Alexander Robitzsch, IPN – Leibniz Institute for Science and Mathematics Education, Germany
Steffen Zitzmann, University of Tübingen, Germany

Copyright © 2021 Guo, Houang and Schmidt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Siwen Guo, guosiwen@ruc.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.