<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2020.564403</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Score-Guided Structural Equation Model Trees</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Arnold</surname> <given-names>Manuel</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/984016/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Voelkle</surname> <given-names>Manuel C.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/105202/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Brandmaier</surname> <given-names>Andreas M.</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/205061/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Psychological Research Methods, Department of Psychology, Humboldt-Universit&#x00E4;t zu Berlin</institution>, <addr-line>Berlin</addr-line>, <country>Germany</country></aff>
<aff id="aff2"><sup>2</sup><institution>Max Planck UCL Centre for Computational Psychiatry and Ageing Research</institution>, <addr-line>Berlin</addr-line>, <country>Germany</country></aff>
<aff id="aff3"><sup>3</sup><institution>Center for Lifespan Psychology, Max Planck Institute for Human Development</institution>, <addr-line>Berlin</addr-line>, <country>Germany</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Jin Eun Yoo, Korea National University of Education, South Korea</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Carolin Strobl, University of Zurich, Switzerland; Achim Zeileis, University of Innsbruck, Austria</p></fn>
<corresp id="c001">&#x002A;Correspondence: Manuel Arnold, <email>arnoldmz@hu-berlin.de</email></corresp>
<fn fn-type="other" id="fn004"><p>This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>28</day>
<month>01</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2020</year>
</pub-date>
<volume>11</volume>
<elocation-id>564403</elocation-id>
<history>
<date date-type="received">
<day>21</day>
<month>05</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>12</month>
<year>2020</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2021 Arnold, Voelkle and Brandmaier.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Arnold, Voelkle and Brandmaier</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Structural equation model (SEM) trees are data-driven tools for finding variables that predict group differences in SEM parameters. SEM trees build upon the decision tree paradigm by growing tree structures that divide a data set recursively into homogeneous subsets. In past research, SEM trees have been estimated predominantly with the R package <monospace>semtree</monospace>. The original algorithm in the <monospace>semtree</monospace> package selects split variables among covariates by calculating a likelihood ratio for each possible split of each covariate. Obtaining these likelihood ratios is computationally demanding. As a remedy, we propose to guide the construction of SEM trees by a family of score-based tests that have recently been popularized in psychometrics (<xref ref-type="bibr" rid="B31">Merkle and Zeileis, 2013</xref>; <xref ref-type="bibr" rid="B30">Merkle et al., 2014</xref>). These score-based tests monitor fluctuations in case-wise derivatives of the likelihood function to detect parameter differences between groups. Compared to the likelihood-ratio approach, score-based tests are computationally efficient because they do not require refitting the model for every possible split. In this paper, we introduce score-guided SEM trees, implement them in <monospace>semtree</monospace>, and evaluate their performance by means of a Monte Carlo simulation.</p>
</abstract>
<kwd-group>
<kwd>exploratory data analysis</kwd>
<kwd>heterogeneity</kwd>
<kwd>model-based recursive partitioning</kwd>
<kwd>parameter stability</kwd>
<kwd>structural change tests</kwd>
<kwd>structural equation modeling</kwd>
</kwd-group>
<contract-sponsor id="cn001">Humboldt-Universit&#x00E4;t zu Berlin<named-content content-type="fundref-id">10.13039/501100006211</named-content></contract-sponsor>
<counts>
<fig-count count="4"/>
<table-count count="8"/>
<equation-count count="12"/>
<ref-count count="53"/>
<page-count count="18"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1">
<title>Introduction</title>
<p>Structural equation models (SEMs; <xref ref-type="bibr" rid="B4">Bollen, 1989</xref>; <xref ref-type="bibr" rid="B23">Kline, 2016</xref>) are a widely applied technique in social and psychological research to model the relationships between multiple variables. SEMs are especially useful when some of the variables under investigation are latent (not directly observable) or contain measurement errors. Various statistical procedures such as factor analysis, ANOVA, linear regression, mediation models, growth curve models, and dynamic panel models can be specified within the SEM framework.</p>
<p>A major challenge that complicates the specification and interpretation of SEMs is the potential presence of differences between subgroups of the sample. Group differences can pertain to various aspects of a SEM. For instance, in a latent growth curve model, we may find differences in how people change over time, or in a factor analysis model, the factor structure may vary across groups. If such sample heterogeneity is neglected, SEM parameter estimates may not represent any individual in the sample, and researchers risk drawing incorrect conclusions from their data (e.g., <xref ref-type="bibr" rid="B22">Kievit et al., 2013</xref>). This makes identifying group differences in SEM parameters an important task.</p>
<p>One popular strategy is to detect heterogeneity in SEMs with the help of covariates. Multi-group structural equation models (MGSEMs; <xref ref-type="bibr" rid="B40">S&#x00F6;rbom, 1974</xref>) allow estimating different parameter values for the levels of a grouping variable, such as males and females or treated versus non-treated. By comparing the fit of a single-group SEM to the fit of a MGSEM, equality constraints on parameters across groups can be tested with the likelihood-ratio test. Multi-group structural equation modeling excels as a confirmatory tool to test a limited number of hypotheses about group differences. As part of exploratory data analysis, however, the method can often become tedious in large data sets. With many potentially important grouping variables, many MGSEMs need to be specified and estimated. Moreover, since MGSEMs require discrete grouping variables, numeric and ordinal covariates such as age or socioeconomic status need to be discretized, which often leads to a loss of information (but see <xref ref-type="bibr" rid="B14">Hildebrandt et al., 2016</xref>).</p>
<p>SEM trees, as first presented by <xref ref-type="bibr" rid="B7">Brandmaier et al. (2013b)</xref>, can be seen as an extension of MGSEMs for exploring parameter heterogeneity in SEMs. SEM trees are a data-driven approach that automatically searches through all available covariates to identify partitions of the full sample that differ with respect to SEM parameter estimates. SEM trees build upon the model-based recursive partitioning paradigm (for an overview, see <xref ref-type="bibr" rid="B51">Zeileis et al., 2008</xref>; <xref ref-type="bibr" rid="B42">Strobl et al., 2009</xref>). One key feature of SEM trees is their interpretability: SEM trees provide a graphical representation of how covariates and covariate interactions predict non-linear differences in SEM parameters. The building blocks of SEM trees are called nodes, each containing a SEM fitted to a distinct subsample. The SEM tree algorithm forms a binary tree structure by hierarchically splitting these nodes. Each node of the SEM tree has either two successors (daughter nodes) and is called an <italic>inner node</italic> or no successors and is called a <italic>leaf</italic> (or terminal node). The first node of the tree is called the <italic>root</italic> and has no parent nodes. The inner nodes of the tree represent split decisions. Each split decision involves a covariate (e.g., age of the observed individuals) and a cut point in the covariate (e.g., divide the sample into individuals younger and older than 45 years). A leaf of a tree contains a partition of the sample that is best described with a set of SEM parameters. All leaves taken together exhaustively partition the original sample and can be thought of as a MGSEM with potentially many groups. An important difference to conventional multi-group structural equation modeling is that the group membership in a SEM tree is not pre-specified but learned from the data.</p>
<p>There are currently two software packages for the statistical programming language R that allow fitting SEM trees. One is the <monospace>semtree</monospace> package (<xref ref-type="bibr" rid="B7">Brandmaier et al., 2013b</xref>) that has been widely applied in the literature (<xref ref-type="bibr" rid="B6">Brandmaier et al., 2013a</xref>, <xref ref-type="bibr" rid="B8">2016</xref>, <xref ref-type="bibr" rid="B9">2017</xref>, <xref ref-type="bibr" rid="B5">2018</xref>; <xref ref-type="bibr" rid="B18">Jacobucci et al., 2017</xref>; <xref ref-type="bibr" rid="B44">Usami et al., 2017</xref>, <xref ref-type="bibr" rid="B45">2019</xref>; <xref ref-type="bibr" rid="B11">de Mooij et al., 2018</xref>; <xref ref-type="bibr" rid="B1">Ammerman et al., 2019</xref>; <xref ref-type="bibr" rid="B37">Serang et al., 2020</xref>; <xref ref-type="bibr" rid="B39">Simpson-Kent et al., 2020</xref>). The other software implementation is the <monospace>partykit</monospace> package (<xref ref-type="bibr" rid="B16">Hothorn and Zeileis, 2015</xref>). Unlike <monospace>semtree</monospace>, <monospace>partykit</monospace> is not limited to a specific model class such as SEMs but provides the infrastructure for general recursive partitioning across various model classes. 
Among other features, <monospace>partykit</monospace> provides the generic MOB algorithm for model-based recursive partitioning that has been used to study heterogeneity in M-estimators (<xref ref-type="bibr" rid="B51">Zeileis et al., 2008</xref>), Bradley-Terry models (<xref ref-type="bibr" rid="B43">Strobl et al., 2011</xref>), Rasch models (<xref ref-type="bibr" rid="B41">Strobl et al., 2015</xref>; <xref ref-type="bibr" rid="B24">Komboz et al., 2018</xref>), multinomial processing trees (<xref ref-type="bibr" rid="B48">Wickelmaier and Zeileis, 2018</xref>), generalized linear mixed-effects models (<xref ref-type="bibr" rid="B12">Fokkema et al., 2018</xref>), network models (<xref ref-type="bibr" rid="B21">Jones et al., 2020</xref>), and circular regression models (<xref ref-type="bibr" rid="B25">Lang et al., 2020</xref>). Moreover, MOB is also used in more specialized recursive partitioning packages such as <monospace>psychotree</monospace> (<xref ref-type="bibr" rid="B53">Zeileis et al., 2020</xref>). Recently, <xref ref-type="bibr" rid="B49">Zeileis (2020)</xref> demonstrated on his blog how MOB can be coupled with the SEM software <monospace>lavaan</monospace> (<xref ref-type="bibr" rid="B36">Rosseel, 2012</xref>) to estimate SEM trees. Outside of the R ecosystem, SEM trees have also been fitted in M<italic>plus</italic> (<xref ref-type="bibr" rid="B37">Serang et al., 2020</xref>).</p>
<p>SEM trees are estimated by recursively selecting the covariate that best partitions the sample into different subgroups. Thus, the evaluation of potential splits is the central aspect of the algorithm. The <monospace>semtree</monospace> package uses a procedure that transforms all non-dichotomous covariates (that is, covariates with more than two values) into a set of dichotomous split candidates. Then, the tree growing algorithm computes the likelihood ratio between a single SEM (fitted on the complete sample of the current node) and MGSEMs (representing the model after the split) for every split candidate and selects the candidate associated with the largest likelihood ratio. The number of MGSEMs needed to calculate these likelihood ratios is directly related to the number of possible splits of the covariate. For instance, evaluating a numeric covariate such as age with many different values will require more MGSEMs to be estimated than evaluating a discrete covariate such as handedness. The reliance of the <monospace>semtree</monospace> package on likelihood ratios has the drawback that the computational burden can become excessive if there are many covariates or if the covariates have many unique values. Another problem of the current <monospace>semtree</monospace> package is that the standard approach to split evaluation (called the <italic>na&#x00EF;ve</italic> selection approach in <monospace>semtree</monospace>) is biased: it favors the selection of covariates with many unique values over covariates with few unique values (<xref ref-type="bibr" rid="B7">Brandmaier et al., 2013b</xref>). The <monospace>semtree</monospace> package offers a correction procedure (<italic>fair</italic> selection approach) for this selection bias (also known as attribute selection error; <xref ref-type="bibr" rid="B20">Jensen and Cohen, 2000</xref>). 
However, this correction procedure is heuristic and comes at the price of decreased statistical power to detect group differences. To solve this problem, we suggest using a well-known method for likelihood-ratio-guided covariate selection that does not suffer from selection bias while retaining full statistical power, and we have implemented this method in the <monospace>semtree</monospace> package.</p>
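The relationship between the number of unique covariate values and the number of split candidates (and hence MGSEM fits) can be made concrete. The following Python sketch enumerates dichotomous split candidates in the usual recursive-partitioning fashion; the function names are ours, and this is an illustration of the counting argument, not the <monospace>semtree</monospace> implementation. An ordered covariate with <italic>k</italic> unique values yields <italic>k</italic> &#x2212; 1 candidate cut points, whereas an unordered covariate with <italic>k</italic> levels yields 2<sup><italic>k</italic>&#x2212;1</sup> &#x2212; 1 two-group partitions, each of which would require fitting a separate MGSEM under the likelihood-ratio approach.

```python
from itertools import combinations

def ordered_split_candidates(values):
    """All 'x <= cut' splits of an ordered covariate: one candidate cut
    point (here, the midpoint) per gap between adjacent unique values."""
    uniq = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(uniq, uniq[1:])]

def nominal_split_candidates(levels):
    """All partitions of an unordered covariate's levels into two
    non-empty groups (2^(k-1) - 1 candidates for k levels)."""
    levels = sorted(set(levels))
    first, rest = levels[0], levels[1:]
    candidates = []
    # Each split is determined by which of the remaining levels
    # join the first level's group; fixing `first` avoids mirrored duplicates.
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            left = (first,) + extra
            right = tuple(l for l in rest if l not in extra)
            if right:  # both groups must be non-empty
                candidates.append((left, right))
    return candidates
```

For example, a numeric covariate observed at ages 20, 30, and 40 produces two candidate cut points, while a nominal covariate with four levels already produces seven candidate partitions.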
<p>In contrast to the <monospace>semtree</monospace> package, model-based recursive partitioning in the <monospace>partykit</monospace> package uses so-called score-based or structural change tests (e.g., <xref ref-type="bibr" rid="B50">Zeileis and Hornik, 2007</xref>) for assessing whether the values of one or more parameters depend on a covariate. Score-based tests are obtained by cumulating the case-wise gradients of the log-likelihood function evaluated at the parameter estimates. Unlike the likelihood-ratio test, score-based tests do not require the estimation of group-specific models for the evaluation of each split. This property leads to two advantages that make score-based tests highly attractive for model-based recursive partitioning. First, they are computationally efficient, as the pre-split model needs to be estimated only once. Second, when subgroups become small, fitting multi-group models to obtain likelihood ratios may become unstable. To exploit these advantages, we have added SEM trees guided by score-based tests to the <monospace>semtree</monospace> package. Our implementation of score-guided trees differs in several respects from the generic MOB algorithm in the <monospace>partykit</monospace> package. MOB uses score-based tests to select a covariate, and it locates the optimal cut point in this covariate by comparing likelihood ratios. In contrast, our <monospace>semtree</monospace> implementation uses a score-based cut point localization, which is computationally more efficient. 
Moreover, MOB is currently limited to a single score-based test statistic, whereas <monospace>semtree</monospace> offers a broader selection of different test statistics that recently became popular in the exploration of measurement invariance in SEMs (<xref ref-type="bibr" rid="B31">Merkle and Zeileis, 2013</xref>; <xref ref-type="bibr" rid="B30">Merkle et al., 2014</xref>; <xref ref-type="bibr" rid="B46">Wang et al., 2014</xref>, <xref ref-type="bibr" rid="B47">2018</xref>).</p>
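The idea of cumulating case-wise gradients can be illustrated with a deliberately simple one-parameter analogue: the mean of a normal distribution. The sketch below (our own simplification, not the <monospace>semtree</monospace> or <monospace>partykit</monospace> implementation) orders the cases by a covariate and accumulates the case-wise scores evaluated at the full-sample estimate, scaled by the Fisher information. Under the null hypothesis of parameter constancy, the resulting process fluctuates around zero and returns to zero at the last case; a parameter shift along the covariate produces a pronounced peak, which statistics such as the double maximum pick up.

```python
import numpy as np

def cumulative_score_process(y, covariate):
    """Scaled cumulative sum of case-wise scores for the mean of a
    normal model, with cases ordered by a covariate (a simplified
    one-parameter analogue of score-based structural change tests)."""
    order = np.argsort(covariate)
    y = np.asarray(y, dtype=float)[order]
    n = len(y)
    mu_hat = y.mean()
    sigma2_hat = y.var()                   # ML estimate (divides by n)
    scores = (y - mu_hat) / sigma2_hat     # case-wise d log L / d mu at mu_hat
    info = 1.0 / sigma2_hat                # Fisher information per case
    # Empirical fluctuation process: information-scaled partial score sums.
    return np.cumsum(scores) / np.sqrt(n * info)

def double_max_statistic(process):
    """Maximum absolute fluctuation of the cumulative score process."""
    return float(np.max(np.abs(process)))
```

Because the scores sum to zero by construction, the process always ends at zero; only its excursion in between is informative about where along the covariate the parameter changes.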
<p>The present study assesses a wide range of variable selection techniques in a Monte Carlo simulation study using the <monospace>semtree</monospace> package. We implemented an optimal likelihood-ratio-based method to improve the statistical properties of the likelihood-ratio-based split selection in <monospace>semtree</monospace> and added a family of score-based tests as a computationally efficient alternative. We evaluated the performance of these new methods next to the classical <italic>na&#x00EF;ve</italic> and <italic>fair</italic> methods. Moreover, we explored two techniques offered by <monospace>semtree</monospace> that allow testing specific hypotheses and incorporating <italic>a priori</italic> knowledge about group differences. The remainder of this manuscript is organized as follows: first, we reiterate the basic principles of SEM trees. Second, the existing likelihood-ratio-based implementation is outlined in detail and complemented with an unbiased method for selecting covariates. Third, we recapitulate a family of score-based tests and show how they can be used to guide the split decision of SEM trees. Fourth, the simulation setup and results are shown. The study concludes with a discussion of the simulation results and recommendations for future research.</p>
</sec>
<sec id="S2">
<title>Introductory Example</title>
<p>In the following, we illustrate the rationale behind SEM trees with an instructive example. Readers familiar with SEM trees may skip this section.</p>
<p>Let us assume a researcher estimated a confirmatory factor analysis (CFA; <xref ref-type="bibr" rid="B10">Brown, 2015</xref>) model that explains the scores of three ability tests of 600 male and female test takers of different ages with a single common latent factor and test-specific error terms. The data were collected at two different testing facilities. The researcher wonders if the parameter values of her CFA model differ with respect to the sites, the test takers&#x2019; age, and gender. She investigates this question with the help of a SEM tree.</p>
<p>The data for this fictional example were simulated such that the factor loading of the first ability test for individuals older than 45 years was smaller (0.6) than for younger individuals (0.8). This represents a violation of measurement invariance; that is, differences among individuals&#x2019; responses to an item are not only due to differences in the latent factor but also due to the item functioning differently across groups and being measured with different precision. Further, we lowered all factor loadings of older individuals tested at the second site by 0.1, imposing another form of violation of metric invariance. The covariate gender had no impact on the parameters of the CFA model and served as a noise variable.</p>
<p><xref ref-type="fig" rid="F1">Figure 1</xref> shows the resulting SEM tree for the simulated data set. The SEM tree consists of five nodes depicted as ovals, each of them containing a CFA model. Node 1 is the root node of the SEM tree and contains the CFA model fitted on the full data set with <italic>N</italic> = 600 individuals. In this illustrative example, the SEM tree algorithm concluded that the fit of the model in the root node could be improved most by splitting the data into a group of 300 individuals younger than 45 years (Node 2) and a group of 300 individuals older than 45 years (Node 3). Nodes 2 and 3 are the daughters of Node 1. After splitting the sample associated with Node 1, the algorithm proceeds recursively with Nodes 2 and 3. Whereas the fit of the model for younger individuals (Node 2) could not be improved any further, the SEM tree algorithm split the group of older individuals (Node 3) into two subgroups with 150 older individuals tested at site 1 (Node 4) and 150 older individuals tested at site 2 (Node 5). After this split, the SEM tree algorithm terminated as no further split would have significantly improved any of the submodels&#x2019; fit. Nodes 2, 4, and 5 are the leaves of the SEM tree, and individuals within these nodes were found to be homogeneous with respect to the covariates. As expected, the SEM tree algorithm did not select the covariate gender for splitting because this covariate was not associated with any group differences in the simulated data set.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Illustrative example of a SEM tree. The SEM tree recursively partitioned a CFA model with respect to individuals&#x2019; age and study site.</p></caption>
<graphic xlink:href="fpsyg-11-564403-g001.tif"/>
</fig>
<p>It is important to note that the structure of the SEM tree shown in <xref ref-type="fig" rid="F1">Figure 1</xref> is not specified <italic>a priori</italic> but learned top-down in an exploratory way. The algorithm only requires a pre-specified template SEM (in the example, the CFA model) and a data set including covariates that serve as split candidates to identify homogeneous groups. The selection of covariates and the identification of optimal cut points are then learned from the data. Throughout the tree, the structure of the template SEM remains the same, and only the values of the parameter estimates change as the model is fitted recursively on different subsamples.</p>
</sec>
<sec id="S3">
<title>Structural Equation Model Trees</title>
<p>The generic SEM tree algorithm can be described in four steps:</p>
<list list-type="simple">
<list-item>
<label>1.</label>
<p>Specify a template SEM.</p>
</list-item>
<list-item>
<label>2.</label>
<p>Fit the template SEM to all observations in the current node.</p>
</list-item>
<list-item>
<label>3.</label>
<p>Assess whether the SEM parameter estimates are constant or vary with respect to each covariate.</p>
</list-item>
<list-item>
<label>4.</label>
<p>Choose the covariate that is associated with the largest group differences. If the group difference exceeds a threshold, split the node into two daughter nodes, and repeat the procedure with Step 2 for both daughter nodes. Otherwise terminate.</p>
</list-item>
</list>
<p>Likelihood-ratio-guided and score-guided SEM trees differ in how Step 3 of the general SEM tree algorithm is implemented. In other words, the procedures use different approaches to evaluate heterogeneity and to search for optimal split points in covariates. The following section outlines Steps 1&#x2013;4 for likelihood-ratio-guided SEM trees before introducing score-guided SEM trees afterward.</p>
<sec id="S3.SS1">
<title>Step 1: Specification of the Template Model</title>
<p>The starting point for growing a SEM tree is the specification of a template SEM. A template model reflects hypotheses about the data by specifying relations among observed variables and latent constructs and is determined by the research question. The template model is fitted on all subsamples associated with the nodes of the SEM tree. It is important to note that the structure of the template model stays the same in the entire SEM tree (but see <xref ref-type="bibr" rid="B6">Brandmaier et al., 2013a</xref> for trees with multiple models). Hence, parameters fixed to a constant (e.g., zero or one) in the template model are fixed to the same constant in all submodels of the tree. Only parameters freely estimated in the template model are allowed to differ across groups and contribute to the assessments of splits.</p>
<p>Fixing many parameters of the template model to constants can hinder the SEM tree algorithm from identifying group differences. Usually, some parameters are fixed to ensure the identification of the SEM. In some model classes, additional constraints are specified to model specific relationships or trajectories. For instance, in latent growth curve models (see <xref ref-type="bibr" rid="B28">McArdle, 2012</xref> for an overview), the factor loadings of a latent random slope variable are often fixed to model a specific growth pattern such as linear or quadratic growth. By fixing these loadings, a SEM tree will not be able to estimate different growth patterns between groups and, as a result, may overlook heterogeneity. In this case, estimating the factor loadings as free parameters may improve the SEM tree&#x2019;s flexibility to adapt to subgroup-specific trajectories.</p>
<p>By default, SEM trees estimate all non-fixed parameters freely in each submodel, and every parameter contributes to the evaluation of split candidates. This behavior is suboptimal if there is a clear set of target parameters that are of interest to investigate a given theory. As a solution, the <monospace>semtree</monospace> package offers the option to specify a set of so-called focus parameters. By declaring focus parameters, the SEM tree will only consider heterogeneity in these parameters when assessing split candidates. Thus, focus parameters are useful for testing parameter-specific hypotheses about group differences. For instance, if one wants to test measurement invariance, one could specify the measurement model&#x2019;s parameters as focus parameters and disregard heterogeneity in the structural model. Besides focus parameters, the <monospace>semtree</monospace> package allows constraining specific parameters to be equal across all submodels of the tree. This is done by estimating these parameters in the full sample once and using the resulting values throughout the tree. Such equality constraints allow incorporating prior knowledge about homogeneous parameters and can increase the power to detect heterogeneity in the remaining parameters. We will later demonstrate the use of focus parameters and equality constraints in two short simulation studies.</p>
</sec>
<sec id="S3.SS2">
<title>Step 2: Model Estimation</title>
<p>Various estimation techniques for SEMs have been discussed in the literature. In principle, SEM trees can operate with any estimation method that provides a fit statistic and are not necessarily limited to a multivariate normal distribution. At present, however, only maximum likelihood estimation for multivariate normal data is implemented in the <monospace>semtree</monospace> package. Therefore, <monospace>semtree</monospace> is currently less suited for investigating models fitted to non-normal data such as SEMs with categorical outcomes. In the following, we will focus on maximum likelihood estimation for multivariate normal data.</p>
<p>SEMs are usually specified by expressing the structure of a mean vector and a covariance matrix as a function of a <italic>q</italic>-variate vector &#x03B8; with model parameters. These parameters are estimated by minimizing a fitting function <italic>F</italic> that measures the discrepancy between the observed means <inline-formula><mml:math id="INEQ2"><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> and the model-implied means <italic>&#x03BC;</italic>(<italic>&#x03B8;</italic>) as well as the discrepancy between the observed covariance matrix <italic><bold>S</bold></italic> and the model-implied covariance matrix <italic><bold>&#x03A3;</bold>(&#x03B8;)</italic>. Several fitting functions have been proposed. The following maximum likelihood fitting function is widely used as it yields efficient parameter estimates under the assumption of multivariate normally distributed data:</p>
<disp-formula id="S3.E1">
<label>(1)</label>
<mml:math id="eq1">
<mml:mrow>
<mml:msub>
<mml:mi>F</mml:mi>
<mml:mrow>
<mml:mi>M</mml:mi><mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:mover accent='true'>
<mml:mi>y</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>,</mml:mo><mml:mi>S</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x03A3;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:mover accent='true'>
<mml:mi>y</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>&#x2212;</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow>
</mml:mrow>
<mml:mtext>T</mml:mtext>
</mml:msup>
<mml:mi>&#x03A3;</mml:mi><mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:mover accent='true'>
<mml:mi>y</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>&#x2212;</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mtext>tr</mml:mtext><mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:mi>S</mml:mi><mml:mi>&#x03A3;</mml:mi><mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mtext>&#x2009;ln</mml:mtext><mml:mrow><mml:mo>{</mml:mo> <mml:mrow>
<mml:mtext>det</mml:mtext><mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:mi>S</mml:mi><mml:mi>&#x03A3;</mml:mi><mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>}</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<p>In the equation above, <italic>p</italic> denotes the number of observed variables in the SEM. A fitting function also provides a test of overall model fit. Evaluated at the parameter estimates <inline-formula><mml:math id="INEQ4"><mml:mover accent='true'><mml:mi>&#x03B8;</mml:mi><mml:mo>&#x005E;</mml:mo></mml:mover></mml:math></inline-formula>, (<italic>N</italic>&#x2212;1)<italic>F</italic> asymptotically follows a <italic>&#x03C7;</italic><sup>2</sup> distribution with <italic>p</italic>(<italic>p</italic> + 3)/2 &#x2212; <italic>q</italic> degrees of freedom (the number of non-redundant elements of the observed mean vector and covariance matrix minus the number of free parameters) under the null hypothesis of a correctly specified model, where <italic>N</italic> refers to the sample size. A detailed account of SEM estimation can be found in the textbooks by <xref ref-type="bibr" rid="B4">Bollen (1989)</xref> and <xref ref-type="bibr" rid="B23">Kline (2016)</xref>.</p>
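Equation (1) can be checked numerically. The following sketch (our own notation, not package code) evaluates <italic>F</italic><sub><italic>ML</italic></sub> for given observed and model-implied moments and confirms two defining properties: the discrepancy is zero for a saturated model, where <italic>&#x03BC;</italic>(<italic>&#x03B8;</italic>) equals the observed means and <italic>&#x03A3;</italic>(<italic>&#x03B8;</italic>) equals the observed covariance matrix, and positive for a misspecified model.

```python
import numpy as np

def f_ml(ybar, S, mu, Sigma):
    """Maximum likelihood discrepancy of Equation (1) between observed
    moments (ybar, S) and model-implied moments (mu, Sigma)."""
    diff = ybar - mu
    Sigma_inv = np.linalg.inv(Sigma)
    A = S @ Sigma_inv
    p = len(ybar)
    # Mean-structure misfit + covariance-structure misfit - saturated baseline.
    return float(diff @ Sigma_inv @ diff
                 + np.trace(A) - np.log(np.linalg.det(A)) - p)
```

For instance, with observed moments of two standardized variables correlating 0.5, plugging the observed moments back in yields a discrepancy of zero, whereas an identity model-implied covariance matrix yields a strictly positive value.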
</sec>
<sec id="S3.SS3">
<title>Step 3: Split Evaluation</title>
<p>The original SEM tree algorithm suggested by <xref ref-type="bibr" rid="B7">Brandmaier et al. (2013b)</xref> compares the fit of a single-group model to the fit of a MGSEM, which consists of all submodels in the current leaves, to decide whether to split a node according to a covariate. For the sake of simplicity, we assume that all covariates are dichotomous and discuss non-dichotomous covariates afterward.</p>
<p>Let <italic>M</italic><sub><italic>F</italic></sub> represent the model associated with the root node (that contains the full data set) and let <inline-formula><mml:math id="INEQ6"><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>&#x03B8;</mml:mi><mml:mo>&#x005E;</mml:mo></mml:mover><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denote the corresponding parameter estimates. Further, we mark the observed mean vector of the full data set as <inline-formula><mml:math id="INEQ7"><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and the observed covariance matrix as <italic><bold>S</bold><sub><italic>F</italic></sub></italic>. To evaluate a candidate covariate for a specific node, we split the node into two daughter nodes according to the covariate. Then, group-specific SEM parameters <italic>&#x03B8;<sub><italic>j</italic></sub></italic>, <italic>j</italic> = 1, &#x2026;, <italic>J</italic>, are estimated for all subsamples associated with the <italic>J</italic> current leaf nodes. Since the subsamples associated with the current leaves are non-overlapping, the submodels can be joined into a MGSEM, which we from now on refer to as <italic>M</italic><sub><italic>SUB</italic></sub>. As <italic>M</italic><sub><italic>F</italic></sub> is nested within <italic>M</italic><sub><italic>SUB</italic></sub>, we can test the following null hypothesis of parameter homogeneity with respect to the covariate under evaluation:</p>
<disp-formula id="S3.E2">
<label>(2)</label>
<mml:math id="eq2">
<mml:mrow>
<mml:msub>
<mml:mtext>H</mml:mtext>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>:</mml:mo>
<mml:mo mathvariant="italic" separator="true">&#x2003;</mml:mo>
<mml:msub>
<mml:mi>&#x03B8;</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>&#x03B8;</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo rspace="12.5pt">,</mml:mo>
<mml:mo>&#x2200;</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>J</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Rejecting Equation 2 implies that the model parameters vary with respect to the covariate. <xref ref-type="bibr" rid="B7">Brandmaier et al. (2013b)</xref> suggested using the following log-likelihood ratio between <italic>M</italic><sub><italic>F</italic></sub> and <italic>M</italic><sub><italic>SUB</italic></sub> as a test statistic for Equation 2:</p>
<disp-formula id="S3.E3">
<label>(3)</label>
<mml:math id="eq3">
<mml:mrow>
<mml:mi>L</mml:mi><mml:mi>R</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>{</mml:mo> <mml:mrow>
<mml:msub>
<mml:mi>F</mml:mi>
<mml:mrow>
<mml:mi>M</mml:mi><mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:msub>
<mml:mover accent='true'>
<mml:mi>y</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mi>F</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo><mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>F</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mi>F</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x03A3;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mi>F</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:munderover>
<mml:mstyle mathsize='140%' displaystyle='true'><mml:mo>&#x2211;</mml:mo></mml:mstyle>
<mml:mrow>
<mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>J</mml:mi>
</mml:munderover >
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:msub>
<mml:mi>F</mml:mi>
<mml:mrow>
<mml:mi>M</mml:mi><mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:msub>
<mml:mover accent='true'>
<mml:mi>y</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo><mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>&#x03A3;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>}</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Under the null hypothesis that there is no influence of the covariate under scrutiny, <italic>LR</italic> asymptotically follows a <italic>&#x03C7;</italic><sup>2</sup> distribution with (<italic>J</italic>&#x2212;1)<italic>q</italic> degrees of freedom.</p>
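<p>Under these definitions, the split decision based on Equation 3 can be sketched as follows. The function and its inputs are our own illustration; the fitting-function values are assumed to be precomputed rather than obtained from actual SEM fits:</p>

```python
from scipy.stats import chi2

def lr_split_test(F_full, F_groups, n_groups, q):
    """Likelihood-ratio split test of Equation 3 (sketch).

    F_full: fitting-function value of the single-group model
    F_groups: per-leaf fitting-function values of the MGSEM
    n_groups: per-leaf sample sizes
    q: number of freely estimated parameters per submodel
    """
    n = sum(n_groups)
    # Sample-size weighted sum of the per-leaf fitting-function values.
    F_sub = sum(nj / n * Fj for nj, Fj in zip(n_groups, F_groups))
    LR = (n - 1) * (F_full - F_sub)
    df = (len(F_groups) - 1) * q  # (J - 1) * q degrees of freedom
    return LR, chi2.sf(LR, df)    # upper-tail chi-squared p-value
```

<p>With hypothetical values such as <monospace>lr_split_test(0.10, [0.02, 0.03], [100, 100], q=3)</monospace>, a small <italic>p</italic>-value indicates that splitting improves fit beyond chance.</p>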
<p>This testing procedure provides a powerful and efficient solution for dichotomous covariates. However, evaluating a categorical, an ordinal, or a continuous covariate with more than two unique values requires the additional step of locating the optimal cut point. <xref ref-type="bibr" rid="B7">Brandmaier et al. (2013b)</xref> suggested computing the likelihood ratio in Equation 3 for every meaningful partition of the covariate and then selecting the cut point associated with the maximum likelihood ratio. For categorical covariates, the best partition is found by splitting them into a set of dichotomous variables, applying a one-against-the-rest scheme to all possible combinations of categories. For ordinal and continuous covariates, the ordering inherent to these covariates allows applying a procedure known as exhaustive split search (<xref ref-type="bibr" rid="B35">Quinlan, 1993</xref>) to find the optimal cut point. Given a covariate with <italic>m</italic> unique values, this procedure tests <italic>m</italic>&#x2212;1 potential partitions to locate the maximum of the likelihood ratios. For continuous covariates, it is also necessary to omit a certain fraction of the data associated with the smallest and largest values of the covariate so that both partitions retain samples large enough to estimate the SEMs. From the above, it is clear that the computational demand of SEM trees grows with the number of covariates with many unique values, as every potential cut point requires the estimation of SEMs.</p>
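<p>The exhaustive split search can be sketched as follows. For illustration, a squared standardized mean difference serves as a stand-in for the SEM likelihood ratio of Equation 3; in an actual SEM tree, two SEMs would be fitted at every candidate cut point. All names are our own:</p>

```python
import numpy as np

def exhaustive_split(y, covariate, min_leaf=10):
    """Exhaustive split search over an ordered covariate (sketch)."""
    order = np.argsort(covariate)
    y = np.asarray(y, dtype=float)[order]
    cov = np.asarray(covariate)[order]
    best_cut, best_stat = None, -np.inf
    for c in np.unique(cov)[:-1]:      # m - 1 candidate partitions
        left, right = y[cov <= c], y[cov > c]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue                   # omit extreme cut points
        # Stand-in fit statistic: squared standardized mean difference.
        stat = (left.mean() - right.mean()) ** 2 / (
            y.var() / len(left) + y.var() / len(right))
        if stat > best_stat:
            best_cut, best_stat = c, stat
    return best_cut, best_stat
```

<p>Splitting at the cut point with the maximal statistic recovers a true change point when the outcome distribution differs between the two induced groups.</p>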
<p>Locating the optimal cut point in categorical, ordinal, and continuous covariates via the maximum of the likelihood ratios has important implications for the test statistic shown in Equation 3. By choosing the maximum of a set of statistics (one for each possible partition), the resulting distribution is no longer the same as the distribution of the individual statistics. Thus, a maximally selected likelihood-ratio test statistic does not follow a <italic>&#x03C7;</italic><sup>2</sup> distribution under the null hypothesis of parameter homogeneity. The deviation from the <italic>&#x03C7;</italic><sup>2</sup> distribution is directly related to the number of potential cut points: as the number of possible cut points grows, the maximum of the likelihood-ratio values increases purely through random fluctuation. Consequently, using the <italic>&#x03C7;</italic><sup>2</sup> test for the evaluation of covariates artificially inflates the probability of type I errors and favors the selection of covariates with many potential cut points over covariates with few.</p>
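<p>A small simulation illustrates this inflation. As a simplification, the statistics at 20 candidate cut points are treated as independent <italic>&#x03C7;</italic><sup>2</sup>-distributed draws (in reality they are correlated); the probability that their maximum exceeds the nominal 5% critical value is then 1&#x2212;0.95<sup>20</sup>&#x2248;0.64:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
reps, n_cuts = 2000, 20
crit = 3.841  # 5% critical value of the chi-squared distribution with 1 df
# Simplification: independent chi2(1) statistics, one per candidate cut point.
stats = rng.chisquare(df=1, size=(reps, n_cuts))
# Rejection rate of the naive test applied to the maximally selected statistic.
naive_rate = float((stats.max(axis=1) > crit).mean())
```

<p>The resulting rejection rate is far above the nominal 5% level, even though every individual statistic is well behaved under the null hypothesis.</p>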
<p><xref ref-type="bibr" rid="B7">Brandmaier et al. (2013b)</xref> discuss different correction procedures for this selection bias that are used in the <monospace>semtree</monospace> package. The default method, labeled <italic>na&#x00EF;ve</italic> in <monospace>semtree</monospace>, uses the <italic>&#x03C7;</italic><sup>2</sup> distribution for evaluating covariates and simply ignores the resulting selection bias. To reduce this bias, <monospace>semtree</monospace> offers the option to use the <italic>na&#x00EF;ve</italic> method in combination with a Bonferroni correction for multiple testing within the same covariate by multiplying the <italic>p</italic>-value obtained from the likelihood-ratio test in Equation 3 by the number of potential cut points (which is equivalent to dividing the significance level by that number). However, this Bonferroni adjustment can overcorrect, decreasing the probability of selecting covariates with many possible cut points, as demonstrated by <xref ref-type="bibr" rid="B7">Brandmaier et al. (2013b)</xref>. We will refer to this Bonferroni-adjusted <italic>na&#x00EF;ve</italic> method simply as the <italic>na&#x00EF;ve</italic> method from now on. Besides the Bonferroni correction, different cross-validation methods are implemented in the <monospace>semtree</monospace> package. Cross-validation separates the estimation of SEMs from the testing of a potential cut point (e.g., <xref ref-type="bibr" rid="B20">Jensen and Cohen, 2000</xref>). SEM trees can be grown with a two-stage approach (<xref ref-type="bibr" rid="B26">Loh and Shih, 1997</xref>; <xref ref-type="bibr" rid="B38">Shih, 2004</xref>; <xref ref-type="bibr" rid="B7">Brandmaier et al., 2013b</xref>) that splits the sample associated with a node in half. One half of the sample is used to find the optimal cut point for every covariate. The other half is used to evaluate only the best cut points via the likelihood-ratio test. This method is called <italic>fair</italic> in the <monospace>semtree</monospace> package. 
Since the <italic>fair</italic> method uses only half of the sample for split selection, its power to detect heterogeneity can be expected to be considerably lower than that of methods that employ the whole sample. A much simpler and more elegant way of avoiding the selection bias and correction procedures altogether is to use the correct distribution of the maximally selected likelihood-ratio test statistic (max<italic>LR</italic>). <xref ref-type="bibr" rid="B2">Andrews (1993)</xref> showed that the asymptotic distribution of max<italic>LR</italic> is the supremum of a certain tied-down Bessel process, from which <italic>p</italic>-values can be obtained (see <xref ref-type="bibr" rid="B51">Zeileis et al., 2008</xref>; <xref ref-type="bibr" rid="B31">Merkle and Zeileis, 2013</xref>). We have now implemented the max<italic>LR</italic> statistic in the <monospace>semtree</monospace> package to provide a more efficient and robust likelihood-ratio-based covariate selection.</p>
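<p>The within-covariate Bonferroni adjustment amounts to scaling the <italic>p</italic>-value by the number of candidate cut points, which is equivalent to dividing the significance level by that number. A one-line sketch (the function name is our own):</p>

```python
def bonferroni_within(p_value, n_cut_points):
    """Within-covariate Bonferroni correction (sketch).

    Multiplies the smallest per-cut-point p-value by the number of
    candidate cut points, capped at 1.
    """
    return min(1.0, p_value * n_cut_points)
```

<p>For example, a raw <italic>p</italic>-value of 0.01 with 20 candidate cut points is adjusted to 0.2 and would no longer be significant at the 5% level.</p>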
</sec>
<sec id="S3.SS4">
<title>Step 4: Covariate Selection</title>
<p>To select a single covariate from a set of candidate covariates, the likelihood ratio for the optimal cut point is computed for every covariate, and the covariate associated with the smallest <italic>p</italic>-value is chosen. If the <italic>p</italic>-value falls below a pre-specified threshold, determined by the desired probability of a type I error, splitting continues recursively. One should keep in mind that testing several covariates will artificially inflate the type I error probability. One of several solutions to this problem is the use of Bonferroni-adjusted <italic>p</italic>-values. Given a large number of covariates, however, the Bonferroni correction will drastically reduce the power of the SEM tree and produce sparse trees. In such cases, one may resort to unadjusted <italic>p</italic>-values for the selection of covariates and, if needed, limit the size of the SEM tree with additional stopping criteria, such as a minimum number of individuals per node.</p>
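<p>The selection step can be sketched as follows; the function, its inputs, and the across-covariate Bonferroni option are our own illustration, not the <monospace>semtree</monospace> API:</p>

```python
def select_covariate(p_values, alpha=0.05, bonferroni=True):
    """Pick the covariate with the smallest p-value at its best cut point.

    p_values: dict mapping covariate name -> p-value of its best split
    Returns (name, adjusted p-value) if the split is significant, else None
    (in which case the node becomes a leaf and splitting stops).
    """
    name, p = min(p_values.items(), key=lambda kv: kv[1])
    # Bonferroni adjustment across the candidate covariates.
    p_adj = min(1.0, p * len(p_values)) if bonferroni else p
    return (name, p_adj) if p_adj < alpha else None
```

<p>With many candidate covariates, the adjustment term <monospace>len(p_values)</monospace> grows, which illustrates the power loss discussed above.</p>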
</sec>
</sec>
<sec id="S4">
<title>Score-Guided SEM Trees</title>
<p>Using likelihood-ratio tests to grow SEM trees can become computationally burdensome, if not infeasible, as the evaluation of a covariate requires the estimation of MGSEMs for every potential cut point. Furthermore, when subgroups become small, fitting MGSEMs may become unstable. Alternatively, SEM trees can be guided by score-based tests, which do not require the estimation of any MGSEMs to evaluate a split. This makes score-based tests computationally efficient and often more stable than likelihood-ratio tests. In the following, we first introduce the general idea behind score-based tests and then a family of score-based test statistics for covariates with different levels of measurement.</p>
<sec id="S4.SS1">
<title>Score-Based Tests</title>
<p>Score-based tests originated in econometrics, where they are primarily employed to detect parameter instability in time series models (e.g., <xref ref-type="bibr" rid="B13">Hansen, 1992</xref>; <xref ref-type="bibr" rid="B2">Andrews, 1993</xref>). Score-based tests can be summarized in three steps: First, the case-wise derivatives of the log-likelihood function with respect to the model parameters are computed. These case-wise derivatives, also called scores, indicate how well the model parameters represent an individual: the larger the score, the larger the misfit of a given model parameter for a given individual. Second, the scores are sorted with respect to a covariate for which we want to test parameter homogeneity. Third, the scores are aggregated into a test statistic that allows testing the null hypothesis of homogeneous parameters (see Equation 2).</p>
<p>Score-based tests have been derived for general M-estimators that encompass popular estimation techniques such as least-squares methods and maximum likelihood as special cases (<xref ref-type="bibr" rid="B50">Zeileis and Hornik, 2007</xref>). For the sake of simplicity, we limit ourselves to maximum likelihood estimation for multivariate normally distributed data. The associated log-likelihood function for a single individual <italic>i</italic> is given by</p>
<disp-formula id="S4.E4">
<label>(4)</label>
<mml:math id="eq4">
<mml:mrow>
<mml:mi>ln</mml:mi><mml:mi>L</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>;</mml:mo><mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>&#x2212;</mml:mo><mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:mrow><mml:mo>{</mml:mo> <mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow>
</mml:mrow>
<mml:mtext>T</mml:mtext>
</mml:msup>
<mml:mi>&#x03A3;</mml:mi><mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo><mml:mi>&#x03BC;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>ln</mml:mi><mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:mi>det</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x03A3;</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mi>p</mml:mi><mml:mi>ln</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>2</mml:mn><mml:mtext>&#x03C0;</mml:mtext><mml:mo stretchy='false'>)</mml:mo>
</mml:mrow> <mml:mo>}</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Equation 4 is the normal theory log-likelihood function for a single individual <italic>i</italic>; summed over individuals and maximized, it yields the same parameter estimates as minimizing <italic>F</italic><sub><italic>ML</italic></sub> in Equation 1.</p>
<p>The individual scores are calculated by taking the partial derivative of the log-likelihood function with respect to the parameters and evaluating the expression at the estimates:</p>
<disp-formula id="S4.E5">
<label>(5)</label>
<mml:math id="eq5">
<mml:mrow>
<mml:mi>S</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mo>;</mml:mo><mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>[</mml:mo> <mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x2202;</mml:mo><mml:mi>ln</mml:mi><mml:mi>L</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>;</mml:mo><mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2202;</mml:mo><mml:msub>
<mml:mi>&#x03B8;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:msub>
<mml:mo>&#x2502;</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>=</mml:mo><mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>&#x22EF;</mml:mo>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x2202;</mml:mo><mml:mi>ln</mml:mi><mml:mi>L</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>;</mml:mo><mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2202;</mml:mo><mml:msub>
<mml:mi>&#x03B8;</mml:mi>
<mml:mi>q</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:msub>
<mml:mo>&#x2502;</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi><mml:mo>=</mml:mo><mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow> <mml:mo>]</mml:mo></mml:mrow>
</mml:mrow>
<mml:mtext>T</mml:mtext>
</mml:msup>
</mml:mrow>
</mml:math>
</disp-formula>
<p>The scores quantify, for each of the <italic>q</italic> parameters, how far an individual&#x2019;s log-likelihood is from being maximized at the estimate. Values close to zero indicate a good fit between model and individual, whereas large absolute scores point toward a strong misfit. Note that by definition, the scores evaluated at the maximum likelihood estimates <inline-formula><mml:math id="INEQ9"><mml:mover accent="true"><mml:mi>&#x03B8;</mml:mi><mml:mo stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> sum up to zero; that is, <inline-formula><mml:math id="INEQ10"><mml:mrow><mml:munderover><mml:mstyle mathsize='140%' displaystyle='true'><mml:mo>&#x2211;</mml:mo></mml:mstyle><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover ><mml:mi>S</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mover accent='true'><mml:mi>&#x03B8;</mml:mi><mml:mo>&#x005E;</mml:mo></mml:mover><mml:mo>;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>.</p>
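<p>The sum-to-zero property is easy to verify numerically. The toy example below (our own, not from the article) uses the case-wise scores of a univariate normal model with mean and variance parameters, evaluated at the maximum likelihood estimates:</p>

```python
import numpy as np

y = np.array([1.2, -0.4, 0.7, 2.1, -1.6])
mu_hat = y.mean()   # ML estimate of the mean
var_hat = y.var()   # ML estimate of the variance (divisor n)
# Case-wise scores of the normal log-likelihood at the ML estimates:
# d lnL / d mu       = (y_i - mu) / sigma^2
# d lnL / d sigma^2  = ((y_i - mu)^2 - sigma^2) / (2 * sigma^4)
score_mu = (y - mu_hat) / var_hat
score_var = ((y - mu_hat) ** 2 - var_hat) / (2 * var_hat ** 2)
```

<p>Both score columns sum to zero up to floating-point error, as required at the maximum of the likelihood.</p>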
<p>For the construction of a test statistic, the scores are cumulated according to the order induced by a covariate under scrutiny. For instance, if parameter homogeneity is assessed with respect to age, the first row consists of scores from the youngest individual. For the second row, scores of the youngest and second youngest individuals are summed up, and so forth. More formally, the cumulative score process is defined as</p>
<disp-formula id="S4.E6">
<label>(6)</label>
<mml:math id="eq6">
<mml:mrow>
<mml:mi>C</mml:mi><mml:mi>S</mml:mi><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mo>;</mml:mo><mml:mi>s</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msqrt>
<mml:mi>n</mml:mi>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x00A0;</mml:mo><mml:mi>I</mml:mi><mml:msup>
<mml:mrow>
<mml:mrow><mml:mo>(</mml:mo>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo><mml:mfrac bevelled='true'>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mrow>
</mml:msup>
<mml:munderover>
<mml:mstyle mathsize='140%' displaystyle='true'><mml:mo>&#x2211;</mml:mo></mml:mstyle>
<mml:mrow>
<mml:mi>h</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>s</mml:mi>
</mml:munderover >
<mml:mi mathvariant='script'>S</mml:mi><mml:mrow><mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent='true'>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x005E;</mml:mo>
</mml:mover>
<mml:mo>;</mml:mo><mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>h</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo></mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where the index <italic>s</italic> denotes the number of sorted individuals entering the equation, and the index <italic>h</italic> runs over the sorted individuals up to <italic>h</italic> = <italic>s</italic>. Furthermore, <inline-formula><mml:math id="INEQ12"><mml:mrow><mml:mi>I</mml:mi><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mover accent='true'><mml:mi>&#x03B8;</mml:mi><mml:mo>&#x005E;</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mfrac bevelled='true'><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the inverse square root of the estimated Fisher information matrix. Pre-multiplying with <inline-formula><mml:math id="INEQ13"><mml:mrow><mml:mi>I</mml:mi><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mover accent='true'><mml:mi>&#x03B8;</mml:mi><mml:mo>&#x005E;</mml:mo></mml:mover><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mfrac bevelled='true'><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> decorrelates the scores so that the <italic>q</italic> cumulative score processes are unrelated to each other. In the following, we place the values of the cumulative score process row-wise into an <italic>N</italic> &#x00D7; <italic>q</italic> matrix that we denote by <italic><bold>CSP</bold></italic> and refer to the cumulative sum over the first <italic>s</italic> individuals for the <italic>k</italic>-th parameter as <italic>CSP</italic><sub><italic>s,k</italic></sub>. Panels (A&#x2013;C) in <xref ref-type="fig" rid="F2">Figure 2</xref> illustrate how sorting and cumulating scores make parameter heterogeneity visible.</p>
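<p>Equation 6 can be sketched as follows. As assumptions of this illustration, the Fisher information is estimated by the outer product of the scores, and its inverse square root is obtained from an eigendecomposition; the function name is our own:</p>

```python
import numpy as np

def cumulative_score_process(scores, covariate):
    """Decorrelated cumulative score process of Equation 6 (sketch).

    scores: (n, q) matrix of case-wise scores at the ML estimates
    covariate: length-n vector inducing the sort order
    """
    n = scores.shape[0]
    sorted_scores = scores[np.argsort(covariate, kind="stable")]
    # Outer-product estimate of the Fisher information matrix.
    info = scores.T @ scores / n
    # Inverse square root via eigendecomposition (info is symmetric PD).
    w, V = np.linalg.eigh(info)
    info_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    return np.cumsum(sorted_scores, axis=0) @ info_inv_sqrt / np.sqrt(n)
```

<p>Because the scores sum to zero at the maximum likelihood estimates, the last row of the resulting matrix is numerically zero, mirroring the pinned end point of the Brownian bridge discussed below.</p>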
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Artificial example to visualize the effect of sorting and cumulating. 100 observations were sampled from two Poisson distributions with different rate parameters. 50 observations were generated with a rate parameter of 2 and 50 observations with a rate parameter of 5. Panel <bold>(A)</bold> shows the scores of the 100 observations in random order. Panel <bold>(B)</bold> displays the 50 scores of observations generated with a rate parameter of 2 first, followed by the 50 scores sampled with a rate parameter of 5. After sorting the scores according to the two groups, a clear pattern emerges as the first 50 scores are mostly negative, and the remaining scores are mostly positive. Panel <bold>(C)</bold> shows the cumulative score process. The negative and positive scores are cumulated, and the change point is noticeable from the negative peak in the cumulative score process. Panel <bold>(D)</bold> depicts five randomly generated Brownian bridges. Under the null hypothesis of a constant rate parameter, the cumulative score process would have behaved similarly to the 5 Brownian bridges in Panel <bold>(D)</bold>.</p></caption>
<graphic xlink:href="fpsyg-11-564403-g002.tif"/>
</fig>
<p><xref ref-type="bibr" rid="B15">Hjort and Koning (2002)</xref> showed that under mild conditions and constant parameters, each column of the cumulative score process matrix <italic><bold>CSP</bold></italic> converges in distribution to a univariate Brownian bridge. A Brownian bridge is a stochastic process that is pinned to zero at its start and end and exhibits the most variability in the middle. Thus, the null hypothesis of parameter homogeneity in Equation 2 can be tested by comparing the observed cumulative score process with the analogous statistic of a Brownian bridge. Panels (C) and (D) in <xref ref-type="fig" rid="F2">Figure 2</xref> illustrate the difference between the cumulative score process of a heterogeneous parameter and the Brownian bridge.</p>
<p>Test statistics can be obtained by aggregating the cumulative score process matrix into a single scalar. Critical values and <italic>p</italic>-values for these test statistics can be found by applying the same aggregation to the asymptotic Brownian bridge (<xref ref-type="bibr" rid="B50">Zeileis and Hornik, 2007</xref>). Different ways of aggregating the cumulative scores will produce test statistics that will be sensitive to different patterns of parameter heterogeneity. The choice of a test statistic also depends on the level of measurement of the covariate.</p>
<p><xref ref-type="bibr" rid="B31">Merkle and Zeileis (2013)</xref> proposed three different test statistics for continuous covariates:</p>
<disp-formula id="S4.E7">
<label>(7)</label>
<mml:math id="eq7">
<mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>M</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mo movablelimits="false">max</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mtext>max</mml:mtext>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S4.E8">
<label>(8)</label>
<mml:math id="eq8">
<mml:mrow>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>M</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msubsup>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S4.E9">
<label>(9)</label>
<mml:math id="eq9">
<mml:mrow>
<mml:mrow>
<mml:mtext>max</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>M</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mo movablelimits="false">max</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munder accentunder="true">
<mml:mi>s</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:munder>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mover accent="true">
<mml:mi>s</mml:mi>
<mml:mo stretchy="false">&#x00AF;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mi>s</mml:mi>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:mi>s</mml:mi>
<mml:mi>N</mml:mi>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msubsup>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Equations 7&#x2013;9 show the double maximum (<italic>DM</italic>), Cram&#x00E9;r-von-Mises (<italic>CvM</italic>), and maximum Lagrange multiplier (max<italic>LM</italic>) test statistics. <italic>DM</italic> is the simplest test statistic and rejects the null hypothesis if, at any point, any of the <italic>q</italic> processes strays too far from zero. However, <xref ref-type="bibr" rid="B31">Merkle and Zeileis (2013)</xref> note that considering only the maximum across the <italic>q</italic> processes wastes power because the <italic>DM</italic> statistic ignores heterogeneity in the remaining parameters. Furthermore, even for the same parameter, smaller peaks before and after the maximum are not considered, which may lead to a loss of power if the parameter changes its value across more than two groups. Using sums instead of maxima solves these problems. The <italic>CvM</italic> statistic sums the squared values over all parameters and individuals and is therefore well suited for detecting multiple group differences in several parameters. If one suspects a single change point that manifests in several parameters, the max<italic>LM</italic> statistic, which sums over all parameters at each potential change point and then takes the maximum over change points, is more appropriate. Unlike the other test statistics for continuous covariates, the max<italic>LM</italic> statistic contains a scaling term <inline-formula><mml:math id="INEQ14"><mml:mrow><mml:mfrac><mml:mi>s</mml:mi><mml:mi>N</mml:mi></mml:mfrac><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mfrac><mml:mi>s</mml:mi><mml:mi>N</mml:mi></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, which increases sensitivity to peaks before and after the middle of the process. A disadvantage of this scaling is that individuals with very small and very large values of the covariate need to be omitted to stabilize the test statistic. 
Therefore, one has to specify an interval <inline-formula><mml:math id="INEQ15"><mml:mrow><mml:mo>[</mml:mo><mml:munder accentunder="true"><mml:mi>s</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x2026;</mml:mi><mml:mo>,</mml:mo><mml:mover accent="true"><mml:mi>s</mml:mi><mml:mo stretchy="false">&#x00AF;</mml:mo></mml:mover><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> with a lower and upper threshold of the covariate. Parameter shifts outside of these boundaries are not considered. The max<italic>LM</italic> statistic is asymptotically equivalent to the max<italic>LR</italic> statistic from the previous section (<xref ref-type="bibr" rid="B2">Andrews, 1993</xref>).</p>
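<p>For illustration, the three statistics for continuous covariates can be sketched in a few lines of Python. This is a hypothetical toy computation, not the <monospace>strucchange</monospace> implementation; the function names and the decorrelation via the empirical score covariance are our own simplifications.</p>

```python
import numpy as np

def cumulative_score_process(scores):
    """Decorrelated, 1/sqrt(N)-scaled cumulative sum of an (N x q) score
    matrix whose rows are ordered by the covariate. Decorrelation uses the
    inverse square root of the empirical covariance of the scores."""
    n = scores.shape[0]
    c = scores - scores.mean(axis=0)
    vals, vecs = np.linalg.eigh(c.T @ c / n)          # eigendecomposition
    c_dec = c @ vecs @ np.diag(vals ** -0.5) @ vecs.T
    return np.cumsum(c_dec, axis=0) / np.sqrt(n)

def dm_statistic(csp):
    # double maximum: largest absolute excursion over parameters and points
    return np.abs(csp).max()

def cvm_statistic(csp):
    # Cramer-von Mises: squared values summed over parameters,
    # averaged over individuals
    return (csp ** 2).sum(axis=1).mean()

def max_lm_statistic(csp, trim=0.1):
    # maximum Lagrange multiplier with scaling (s/N)(1 - s/N); the trimming
    # interval drops very small and very large covariate values
    n = csp.shape[0]
    t = np.arange(1, n + 1) / n
    keep = (t >= trim) & (t <= 1 - trim)
    return ((csp[keep] ** 2).sum(axis=1) / (t[keep] * (1 - t[keep]))).max()

rng = np.random.default_rng(1)
csp = cumulative_score_process(rng.normal(size=(200, 3)))
print(dm_statistic(csp), cvm_statistic(csp), max_lm_statistic(csp))
```

<p>Note how widening the trimming interval can only decrease the max<italic>LM</italic> statistic, since the maximum is then taken over fewer points.</p>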
<p>For ordinal and categorical covariates, <xref ref-type="bibr" rid="B30">Merkle et al. (2014)</xref> suggested test statistics that focus on bins of individuals at each level of the covariates:</p>
<disp-formula id="S4.E10">
<label>(10)</label>
<mml:math id="eq10">
<mml:mrow>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>D</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>M</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mo movablelimits="false">max</mml:mo>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo rspace="5.3pt">,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
<mml:mi>N</mml:mi>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mo movablelimits="false">max</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S4.E11">
<label>(11)</label>
<mml:math id="eq11">
<mml:mrow>
<mml:mrow>
<mml:mtext>max</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>M</mml:mi>
<mml:mi>O</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mo movablelimits="false">max</mml:mo>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo rspace="5.3pt">,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
<mml:mi>N</mml:mi>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msubsup>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Equations 10 and 11 present the weighted double maximum (<italic>WDM</italic>) and the maximum Lagrange multiplier statistics for ordinal covariates (max<italic>LM</italic><sub><italic>O</italic></sub>). For both test statistics, we first group the individuals into <italic>m</italic>&#x2212;1 bins associated with the first <italic>m</italic>&#x2212;1 levels of the covariate. Then, we sum the scores in each bin and cumulate the sums, yielding an (<italic>m</italic>&#x2212;1) &#x00D7; <italic>q</italic> matrix <italic><bold>CBSP</bold></italic> of cumulative bins of scores. In the equations above, we denote the cumulative bin of scores associated with the <italic>l</italic>-th level of the covariate and the <italic>k</italic>-th parameter with <italic>CBSP</italic><sub><italic>l,k</italic></sub>. Both statistics are scaled by <inline-formula><mml:math id="INEQ16"><mml:mrow><mml:mfrac><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mi>N</mml:mi></mml:mfrac><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mfrac><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mi>N</mml:mi></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, where <italic>n</italic><sub><italic>l</italic></sub> denotes the cumulative number of individuals up to and including the <italic>l</italic>-th bin. The main difference is that the max<italic>LM</italic><sub><italic>O</italic></sub> statistic considers heterogeneity in all parameters, whereas the <italic>WDM</italic> only considers the most heterogeneous parameter.</p>
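<p>A minimal sketch of Equations 10 and 11 in Python may clarify the binning. This is our own illustration, not the <monospace>semtree</monospace> code; the helper for the decorrelated cumulative score process and all names are hypothetical.</p>

```python
import numpy as np

def decorrelated_csp(scores):
    # cumulative score process, decorrelated and scaled by 1/sqrt(N)
    n = scores.shape[0]
    c = scores - scores.mean(axis=0)
    vals, vecs = np.linalg.eigh(c.T @ c / n)
    dec = c @ vecs @ np.diag(vals ** -0.5) @ vecs.T
    return np.cumsum(dec, axis=0) / np.sqrt(n)

def wdm_and_maxlm_o(scores, levels):
    """WDM and maxLM_O for an ordinal covariate coded 1, ..., m.
    `scores` (N x q) and `levels` (N,) must be sorted by `levels`."""
    csp = decorrelated_csp(scores)
    n = len(levels)
    uniq = np.unique(levels)
    # index of the last individual belonging to each of the first m-1 levels;
    # the rows of csp at these indices form the (m-1) x q matrix CBSP
    last = np.searchsorted(levels, uniq[:-1], side="right") - 1
    cbsp = csp[last]
    prop = (last + 1) / n                      # cumulative proportions n_l / N
    scale = prop * (1 - prop)
    wdm = (np.abs(cbsp) / np.sqrt(scale)[:, None]).max()
    maxlm_o = ((cbsp ** 2).sum(axis=1) / scale).max()
    return wdm, maxlm_o

rng = np.random.default_rng(2)
levels = np.sort(rng.integers(1, 7, size=300))   # ordinal, 6 levels
scores = rng.normal(size=(300, 4))
print(wdm_and_maxlm_o(scores, levels))
```

<p>By construction, max<italic>LM</italic><sub><italic>O</italic></sub> is never smaller than the square of <italic>WDM</italic>, because the sum over all <italic>q</italic> parameters dominates the single largest squared entry in every bin.</p>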
<p>Categorical covariates do not possess a natural ordering that could be used to construct a test statistic. Instead, a test statistic can be obtained by summing the squared differences between the sums of scores of bins of individuals associated with adjacent levels of the covariate (<xref ref-type="bibr" rid="B15">Hjort and Koning, 2002</xref>). In the following Lagrange multiplier (<italic>LM</italic>) statistic</p>
<disp-formula id="S4.E12">
<label>(12)</label>
<mml:math id="eq12">
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>M</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p><italic>BSP</italic><sub><italic>l,k</italic></sub> denotes the sum of the scores of the <italic>k</italic>-th parameter from individuals associated with the <italic>l</italic>-th level of the covariate. <italic>BSP</italic><sub>0,<italic>k</italic></sub>, <italic>k</italic> = 1, &#x2026;, <italic>q</italic>, is not associated with any of the 1, &#x2026;, <italic>m</italic> levels of the covariate and is set to zero.</p>
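<p>Implemented directly as written, Equation 12 reduces to a few lines. This sketch assumes the scores have already been decorrelated and scaled; the function name and the toy data are our own.</p>

```python
import numpy as np

def lm_statistic(scores, groups):
    """Equation 12: sum over levels and parameters of the squared
    differences between the per-level score sums BSP_{l,k} of adjacent
    levels, with BSP_{0,k} defined as zero."""
    q = scores.shape[1]
    bsp = np.array([scores[groups == g].sum(axis=0)
                    for g in np.unique(groups)])      # (m x q) matrix BSP
    prev = np.vstack([np.zeros(q), bsp[:-1]])         # BSP_{l-1,k}
    return ((bsp - prev) ** 2).sum()

rng = np.random.default_rng(4)
groups = rng.integers(0, 3, size=120)                 # 3 unordered levels
scores = rng.normal(size=(120, 2)) / np.sqrt(120)     # toy scaled scores
print(lm_statistic(scores, groups))
```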
<p>To apply the test statistics outlined above in practice, critical values and <italic>p</italic>-values are needed to compare split points across covariates. Analytic solutions are available for the <italic>DM</italic>, max<italic>LR</italic>, <italic>WDM</italic>, and <italic>LM</italic> statistics. For the remaining test statistics, critical values and <italic>p</italic>-values can be obtained through repeated simulation of Brownian bridges. Different strategies for obtaining critical values and <italic>p</italic>-values for the <italic>DM</italic>, max<italic>LR</italic>, and <italic>CvM</italic> statistics are discussed by <xref ref-type="bibr" rid="B31">Merkle and Zeileis (2013)</xref> and for the <italic>WDM</italic>, max<italic>LM<sub><italic>O</italic></sub></italic>, and <italic>LM</italic> statistics by <xref ref-type="bibr" rid="B30">Merkle et al. (2014)</xref>.</p>
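<p>To give a rough idea of the simulation approach, the limiting distribution of the <italic>CvM</italic> statistic can be approximated by simulating <italic>q</italic> independent Brownian bridges. This is an illustration only, not the <monospace>strucchange</monospace> procedure; the grid size, replication count, and seed are arbitrary choices.</p>

```python
import numpy as np

def cvm_critical_value(q, alpha=0.05, reps=2000, grid=200, seed=0):
    """Approximate the (1 - alpha) quantile of the CvM limiting
    distribution: the time average of the squared Euclidean norm of
    q independent Brownian bridges."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, grid + 1)[:, None] / grid
    stats = np.empty(reps)
    for r in range(reps):
        # Brownian motion on the grid, then pinned to zero at t = 1
        w = (rng.normal(size=(grid, q)) / np.sqrt(grid)).cumsum(axis=0)
        bridge = w - t * w[-1]
        stats[r] = (bridge ** 2).sum(axis=1).mean()
    return np.quantile(stats, 1 - alpha)

print(cvm_critical_value(q=1))
```

<p>For <italic>q</italic> = 1 the result should land near the classical Cram&#x00E9;r-von-Mises critical value of roughly 0.46, up to Monte Carlo error.</p>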
</sec>
<sec id="S4.SS2">
<title>SEM Trees Guided by Score-Based Tests</title>
<p>Score-guided SEM trees can be obtained by evaluating the covariates in Step 3 of the general SEM tree algorithm with score-based tests instead of the likelihood-ratio test. Because score-based tests operate like an omnibus test for all possible cut points in a covariate, a single best cut point needs to be located after the selection of a covariate. Cut points can be obtained by identifying which of the unique values of the covariate maximizes the respective score-based test statistic. Omitting the outer sums or maxima in Equations 7&#x2013;11 pairs every unique value of the covariate with a specific value of the partially summed test statistic. Then, a cut point can be determined by splitting the sample after the observation associated with the maximum of these partially summed test statistics. Due to its scaling term, the respective max<italic>LM</italic> statistic for ordinal and continuous covariates appears to be particularly well suited for identifying the optimal cut points. We implemented this fully score-based cut point localization procedure in the <monospace>semtree</monospace> package. Alternatively, the optimal cut point can be determined by maximizing the partitioned log-likelihood (that is, the sum of the log-likelihood for all observations to the left and the sum for all observations to the right of the cut point) over all conceivable values of the covariate. Since this approach requires the estimation of a sequence of SEMs, it will be slower than a purely score-based cut point identification. However, it will still be faster than a SEM tree purely guided by likelihood ratios because only the localization of a cut point, but not the selection of the covariate, requires the estimation of additional SEMs. 
This hybrid strategy is currently applied by the generic MOB algorithm from the <monospace>partykit</monospace> package, which uses the max<italic>LM</italic> statistic for selecting covariates and the max<italic>LR</italic> statistic for locating cut points.</p>
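<p>The fully score-based cut point localization described above can be sketched as follows. This is our own simplified illustration, not the <monospace>semtree</monospace> code; it drops the outer maximum of the max<italic>LM</italic> statistic and splits at the covariate value with the largest scaled partial statistic.</p>

```python
import numpy as np

def best_cut_point(scores, covariate, trim=0.1):
    """Return the covariate value after which to split: the candidate cut
    maximizing the scaled squared norm of the cumulative score process."""
    order = np.argsort(covariate)
    x = covariate[order]
    n = scores.shape[0]
    c = scores[order] - scores.mean(axis=0)
    vals, vecs = np.linalg.eigh(c.T @ c / n)
    csp = np.cumsum(c @ vecs @ np.diag(vals ** -0.5) @ vecs.T,
                    axis=0) / np.sqrt(n)
    t = np.arange(1, n + 1) / n
    # candidate cuts: inside the trimming interval and only after the
    # last observation sharing a given covariate value
    cand = np.flatnonzero((t >= trim) & (t <= 1 - trim)
                          & np.r_[x[1:] != x[:-1], False])
    stat = (csp[cand] ** 2).sum(axis=1) / (t[cand] * (1 - t[cand]))
    return x[cand[stat.argmax()]]       # split: covariate <= cut vs > cut

# toy example: the first parameter's scores shift at covariate 0.5
rng = np.random.default_rng(5)
x = rng.uniform(size=500)
scores = rng.normal(size=(500, 2))
scores[:, 0] += (x > 0.5)               # hypothetical parameter shift
print(best_cut_point(scores, x))
```

<p>With a clear shift in one parameter, the partial statistic peaks near the true change point, so the returned cut should lie close to 0.5 in this toy example.</p>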
</sec>
</sec>
<sec id="S5">
<title>Simulation Study</title>
<p>We conducted four Monte Carlo simulations to evaluate SEM trees in different settings. The first two simulations compare the original SEM tree split selection methods with the newly proposed SEM trees guided by the max<italic>LR</italic> statistic and score-based tests. The first simulation aims at illustrating the performance of the different SEM trees under the null hypothesis of parameter homogeneity. The second simulation investigates power, the precision of cut point estimation, and group recovery for a heterogeneous population consisting of two groups. The third and fourth simulations demonstrate the use and common pitfalls of SEM trees with focus parameters and equality constraints.</p>
<p>All simulations were carried out with the statistical programming language R. SEM trees were fitted with the <monospace>semtree</monospace> package. <monospace>semtree</monospace> interfaces the <monospace>OpenMx</monospace> package (<xref ref-type="bibr" rid="B34">Neale et al., 2016</xref>) for the estimation of SEMs. To grow score-guided SEM trees, we linked <monospace>semtree</monospace> to the <monospace>strucchange</monospace> package (<xref ref-type="bibr" rid="B52">Zeileis et al., 2002</xref>). <monospace>strucchange</monospace> offers a unified framework for implementing score-based tests for a wide range of models. All features used in this simulation are available in the <monospace>semtree</monospace> package. Our simulations were performed using R 4.0.2, <monospace>OpenMx</monospace> 2.18.1, <monospace>strucchange</monospace> 1.5&#x2013;2, and a developmental snapshot of the <monospace>semtree</monospace> package<sup><xref ref-type="fn" rid="footnote1">1</xref></sup>. The simulation scripts and results are provided as Online Supplemental Material<sup><xref ref-type="fn" rid="footnote2">2</xref></sup>.</p>
<p>In all simulations, we aimed at ensuring an optimal type I error rate; that is, we tried limiting the proportions of false-positive splits to the significance level of 5%. To achieve this, we adjusted the <italic>p</italic>-values of the likelihood-ratio and score-based tests with the Bonferroni procedure to correct for the multiple testing of several covariates. Besides the Bonferroni correction, we used the default settings of the <monospace>semtree</monospace> package throughout our simulation studies. The score-based tests were performed by applying the default settings of the <monospace>strucchange</monospace> package. The data used to fit the SEMs were drawn from a multivariate normal distribution. All experimental conditions were replicated 10,000 times.</p>
<sec id="S5.SS1">
<title>Simulation I: Type I Error Rate and Runtime</title>
<p>Simulation I assessed the type I error rate under the null hypothesis of constant parameters and the runtime for different numbers of noise variables and sample sizes. The simulated data were homogeneous, without any group differences.</p>
<p><xref ref-type="fig" rid="F3">Figure 3</xref> shows the linear latent growth curve model used in Simulation I and II. Model specification and parameter values were taken from <xref ref-type="bibr" rid="B29">McArdle and Epstein (1987)</xref>, who modeled the scores of 204 young children from the Wechsler Intelligence Scale for Children over four repeated occasions of measurement at 6, 7, 9, and 11 years of age (see <xref ref-type="bibr" rid="B7">Brandmaier et al., 2013b</xref> for a SEM tree analysis of these data). In both simulation studies, we generated multivariate normal data, using the mean vector and covariance matrix implied by the model presented in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Path diagram of the linear latent growth curve model used for data generation in Simulation I and II. The parameter values were obtained by fitting the model on scores from the longitudinal Wechsler Intelligence Scale for Children data set.</p></caption>
<graphic xlink:href="fpsyg-11-564403-g003.tif"/>
</fig>
<p>After generating the data, the linear latent growth curve model presented in <xref ref-type="fig" rid="F3">Figure 3</xref> was estimated, serving as a template model for the SEM trees. The model was defined by six free parameters: the mean and the variance of the random intercept <italic>f</italic><sub><italic>I</italic></sub>, the mean and the variance of the random slope <italic>f</italic><sub><italic>S</italic></sub>, the covariance between the random intercept and the random slope, and the residual error variance that was constrained to be equal for all four measurements of the observed variable <italic>y</italic>.</p>
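<p>The data generation from the model-implied moments can be sketched as follows. The factor loadings encode the measurement ages as offsets, and the parameter values below are made-up placeholders, not the estimates from <xref ref-type="bibr" rid="B29">McArdle and Epstein (1987)</xref>.</p>

```python
import numpy as np

# Linear latent growth curve: y_t = f_I + loading_t * f_S + e_t,
# with measurements at ages 6, 7, 9, and 11 coded as offsets 0, 1, 3, 5
loadings = np.array([0.0, 1.0, 3.0, 5.0])
Lam = np.column_stack([np.ones(4), loadings])       # factor loadings

# hypothetical parameter values: mean vector and covariance matrix of
# (f_I, f_S), and one residual variance shared by all four occasions
mu_f = np.array([100.0, 5.0])
Psi = np.array([[80.0, 10.0],
                [10.0,  4.0]])
theta = 25.0

# model-implied mean vector and covariance matrix
mu = Lam @ mu_f
Sigma = Lam @ Psi @ Lam.T + theta * np.eye(4)

# draw a multivariate normal sample of the smaller simulated size
rng = np.random.default_rng(3)
data = rng.multivariate_normal(mu, Sigma, size=504)
print(data.shape)
```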
<p>The following experimental factors were varied:</p>
<list list-type="simple">
<list-item>
<label>&#x2022;</label>
<p><italic>Level of measurement of the noise variables</italic>: We provided the SEM trees with randomly generated noise variables. The noise variables were either continuous (standard normal), ordinal with 6 levels (with an equal number of observations per level), or dichotomous (with an equal number of observations in both classes). For a given condition, all noise variables had the same level of measurement.</p>
</list-item>
<list-item>
<label>&#x2022;</label>
<p><italic>Number of noise variables</italic>: Either 1, 3, or 5 noise variables were generated.</p>
</list-item>
<list-item>
<label>&#x2022;</label>
<p><italic>Sample size (N)</italic>: The simulated samples contained either 504 or 1,008 observations. These somewhat unusual sample sizes were chosen because they are divisible by 6, which allows for an equal number of observations per level of the ordinal noise variables.</p>
</list-item>
</list>
<p>First, we will inspect the type I error rates of the different SEM tree approaches and compare their computation time afterward.</p>
<sec id="S5.SS1.SSS1">
<title>Percentage of Type I Errors</title>
<p>Every tree consisting of more than one node was counted as a type I error. Ideally, the proportion of type I errors should approach 5%. <xref ref-type="table" rid="T1">Table 1</xref> shows the empirical type I error rates of the different SEM tree approaches. The results are sorted with respect to the level of measurement of the noise variables. To get a better understanding of the simulated error rates, error rates that fell inside the 95% confidence interval around the optimal rate of 5% for 10,000 replications (CI: [4.573; 5.427]) are printed in bold. For ordinal and dichotomous noise variables, all SEM tree implementations yielded error rates mostly close to the desired 5%. For continuous covariates, however, only <italic>fair</italic>, max<italic>LR</italic>, <italic>CvM</italic>, and max<italic>LM</italic> trees had satisfactory type I error rates. <italic>DM</italic> trees exhibited slightly too few type I errors. As predicted by <xref ref-type="bibr" rid="B7">Brandmaier et al. (2013b)</xref>, <italic>na&#x00EF;ve</italic> trees that were provided with continuous noise variables over-adjusted and produced error rates that were too small by a factor of 10. Increasing the sample sizes amplified this overcorrection. For the remaining methods, varying the number of noise variables and the sample size did not systematically influence the error rates.</p>
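<p>The confidence interval used for the bold entries follows from the normal approximation to a binomial rejection rate of 5% over 10,000 replications:</p>

```python
import math

# 95% normal-approximation interval for an empirical rejection rate of
# p = 5% estimated from 10,000 Monte Carlo replications
p, reps = 0.05, 10_000
half = 1.959964 * math.sqrt(p * (1 - p) / reps)
lo, hi = 100 * (p - half), 100 * (p + half)
print(f"[{lo:.3f}; {hi:.3f}]")   # reproduces the reported [4.573; 5.427]
```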
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Empirical type I error rates.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left" colspan="2"></td>
<td valign="top" align="center" colspan="6">Continuous<hr/></td>
<td valign="top" align="center" colspan="5">Ordinal<hr/></td>
<td valign="top" align="center" colspan="3">Dichotomous<hr/></td>
</tr>
<tr>
<td valign="top" align="left">Nr. noise</td>
<td valign="top" align="center"><italic>N</italic></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center"><italic>DM</italic></td>
<td valign="top" align="center"><italic>CvM</italic></td>
<td valign="top" align="center">max<italic>LM</italic></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center"><italic>WDM</italic></td>
<td valign="top" align="center">max<italic>LM</italic><sub><italic>O</italic></sub></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center"><italic>LM</italic></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">504</td>
<td valign="top" align="center">0.56</td>
<td valign="top" align="center"><bold>5.15</bold></td>
<td valign="top" align="center"><bold>5.27</bold></td>
<td valign="top" align="center">3.85</td>
<td valign="top" align="center"><bold>5.01</bold></td>
<td valign="top" align="center"><bold>5.05</bold></td>
<td valign="top" align="center">4.50</td>
<td valign="top" align="center">5.71</td>
<td valign="top" align="center"><bold>5.31</bold></td>
<td valign="top" align="center"><bold>5.08</bold></td>
<td valign="top" align="center"><bold>5.16</bold></td>
<td valign="top" align="center"><bold>5.15</bold></td>
<td valign="top" align="center"><bold>5.21</bold></td>
<td valign="top" align="center"><bold>4.89</bold></td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">504</td>
<td valign="top" align="center">0.60</td>
<td valign="top" align="center"><bold>5.01</bold></td>
<td valign="top" align="center">5.43</td>
<td valign="top" align="center">4.17</td>
<td valign="top" align="center"><bold>5.05</bold></td>
<td valign="top" align="center">5.57</td>
<td valign="top" align="center"><bold>4.99</bold></td>
<td valign="top" align="center">5.78</td>
<td valign="top" align="center"><bold>5.41</bold></td>
<td valign="top" align="center"><bold>5.38</bold></td>
<td valign="top" align="center">5.59</td>
<td valign="top" align="center"><bold>5.18</bold></td>
<td valign="top" align="center">5.45</td>
<td valign="top" align="center"><bold>4.76</bold></td>
</tr>
<tr>
<td valign="top" align="left">5</td>
<td valign="top" align="center">504</td>
<td valign="top" align="center">0.51</td>
<td valign="top" align="center"><bold>5.26</bold></td>
<td valign="top" align="center"><bold>5.25</bold></td>
<td valign="top" align="center">4.18</td>
<td valign="top" align="center"><bold>5.11</bold></td>
<td valign="top" align="center">5.62</td>
<td valign="top" align="center"><bold>5.10</bold></td>
<td valign="top" align="center">5.73</td>
<td valign="top" align="center">5.66</td>
<td valign="top" align="center">5.59</td>
<td valign="top" align="center">5.69</td>
<td valign="top" align="center"><bold>5.07</bold></td>
<td valign="top" align="center"><bold>5.39</bold></td>
<td valign="top" align="center">4.55</td>
</tr>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">1,008</td>
<td valign="top" align="center">0.25</td>
<td valign="top" align="center"><bold>4.71</bold></td>
<td valign="top" align="center"><bold>5.39</bold></td>
<td valign="top" align="center">3.93</td>
<td valign="top" align="center"><bold>4.86</bold></td>
<td valign="top" align="center"><bold>5.17</bold></td>
<td valign="top" align="center">3.93</td>
<td valign="top" align="center"><bold>5.27</bold></td>
<td valign="top" align="center"><bold>4.71</bold></td>
<td valign="top" align="center"><bold>4.95</bold></td>
<td valign="top" align="center"><bold>4.76</bold></td>
<td valign="top" align="center"><bold>5.17</bold></td>
<td valign="top" align="center"><bold>5.31</bold></td>
<td valign="top" align="center"><bold>4.99</bold></td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">1,008</td>
<td valign="top" align="center">0.35</td>
<td valign="top" align="center"><bold>4.95</bold></td>
<td valign="top" align="center"><bold>5.04</bold></td>
<td valign="top" align="center">3.91</td>
<td valign="top" align="center">4.57</td>
<td valign="top" align="center"><bold>5.01</bold></td>
<td valign="top" align="center"><bold>4.80</bold></td>
<td valign="top" align="center">5.67</td>
<td valign="top" align="center"><bold>5.23</bold></td>
<td valign="top" align="center">5.60</td>
<td valign="top" align="center"><bold>5.24</bold></td>
<td valign="top" align="center"><bold>5.13</bold></td>
<td valign="top" align="center"><bold>4.97</bold></td>
<td valign="top" align="center"><bold>4.92</bold></td>
</tr>
<tr>
<td valign="top" align="left">5</td>
<td valign="top" align="center">1,008</td>
<td valign="top" align="center">0.35</td>
<td valign="top" align="center"><bold>5.23</bold></td>
<td valign="top" align="center"><bold>5.37</bold></td>
<td valign="top" align="center"><bold>4.61</bold></td>
<td valign="top" align="center"><bold>5.13</bold></td>
<td valign="top" align="center">5.48</td>
<td valign="top" align="center"><bold>4.95</bold></td>
<td valign="top" align="center">5.60</td>
<td valign="top" align="center"><bold>5.27</bold></td>
<td valign="top" align="center"><bold>5.25</bold></td>
<td valign="top" align="center">5.68</td>
<td valign="top" align="center"><bold>4.81</bold></td>
<td valign="top" align="center"><bold>5.04</bold></td>
<td valign="top" align="center"><bold>4.61</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>Nr. noise = number of noise variables, <italic>N</italic> = sample size. Error rates within the 95% confidence interval around the optimal rate of 5% are printed in bold.</italic></attrib>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S5.SS1.SSS2">
<title>Runtime</title>
<p>We recorded the computation time of the different SEM trees in seconds. Naturally, runtime varies widely across computing platforms, and the absolute estimates will soon be outdated. Nevertheless, the results allow for relative comparisons between methods and provide rough reference values for current standard computing platforms. The simulation was conducted with an Intel<sup>&#x00AE;</sup> Xeon<sup>&#x00AE;</sup> CPU E5-2670 processor using a single core.</p>
<p><xref ref-type="table" rid="T2">Table 2</xref> presents the median computation time in seconds. The median runtime for ordinal and dichotomous noise variables was small. The score-guided trees (<italic>WDM</italic>, max<italic>LM</italic><sub><italic>O</italic></sub>, and <italic>LM</italic>) showed a minor speed advantage over the likelihood-ratio-guided trees (<italic>na&#x00EF;ve</italic>, <italic>fair</italic>, and max<italic>LR</italic>). For continuous noise variables, however, for which many possible cut points needed to be evaluated, the runtime of likelihood-ratio-guided SEM trees exceeded that of score-guided SEM trees by orders of magnitude. For instance, given the larger sample size and five noise variables, likelihood-ratio-guided SEM trees needed several minutes to compute, whereas score-guided SEM trees (<italic>DM</italic>, <italic>CvM</italic>, and max<italic>LM</italic>) finished in fractions of a second. The runtime of the <italic>fair</italic> trees was roughly half as long as the runtime of <italic>na&#x00EF;ve</italic> and max<italic>LR</italic> trees, most likely because the <italic>fair</italic> method tests only half of the possible cut points for continuous variables. As expected, a larger sample size and more noise variables led to an increase in computation time of the likelihood-ratio-guided SEM trees. In contrast, the runtime of score-guided SEM trees remained virtually the same. This implies that even for samples consisting of large numbers of individuals and many covariates, score-guided SEM trees can be computed quickly.</p>
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>Median runtime in seconds.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left" colspan="2"></td>
<td valign="top" align="center" colspan="6">Continuous<hr/></td>
<td valign="top" align="center" colspan="5">Ordinal<hr/></td>
<td valign="top" align="center" colspan="3">Dichotomous<hr/></td>
</tr>
<tr>
<td valign="top" align="left">Nr. noise</td>
<td valign="top" align="center"><italic>N</italic></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center"><italic>DM</italic></td>
<td valign="top" align="center"><italic>CvM</italic></td>
<td valign="top" align="center">max<italic>LM</italic></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center"><italic>WDM</italic></td>
<td valign="top" align="center">max<italic>LM</italic><sub><italic>O</italic></sub></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center"><italic>LM</italic></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">504</td>
<td valign="top" align="center">35.0</td>
<td valign="top" align="center">15.8</td>
<td valign="top" align="center">34.7</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.6</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">1.4</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">0.2</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">504</td>
<td valign="top" align="center">105.6</td>
<td valign="top" align="center">48.1</td>
<td valign="top" align="center">105.7</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">1.3</td>
<td valign="top" align="center">1.5</td>
<td valign="top" align="center">1.9</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">0.2</td>
</tr>
<tr>
<td valign="top" align="left">5</td>
<td valign="top" align="center">504</td>
<td valign="top" align="center">179.3</td>
<td valign="top" align="center">81.0</td>
<td valign="top" align="center">179.3</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">2.1</td>
<td valign="top" align="center">2.5</td>
<td valign="top" align="center">2.6</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">1.0</td>
<td valign="top" align="center">0.2</td>
</tr>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">1,008</td>
<td valign="top" align="center">72.7</td>
<td valign="top" align="center">34.5</td>
<td valign="top" align="center">72.7</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.5</td>
<td valign="top" align="center">0.6</td>
<td valign="top" align="center">1.3</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.3</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">0.2</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">1,008</td>
<td valign="top" align="center">222.7</td>
<td valign="top" align="center">105.0</td>
<td valign="top" align="center">222.7</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">1.3</td>
<td valign="top" align="center">1.5</td>
<td valign="top" align="center">1.9</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.5</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">0.2</td>
</tr>
<tr>
<td valign="top" align="left">5</td>
<td valign="top" align="center">1,008</td>
<td valign="top" align="center">374.7</td>
<td valign="top" align="center">175.4</td>
<td valign="top" align="center">366.7</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">2.1</td>
<td valign="top" align="center">2.5</td>
<td valign="top" align="center">2.7</td>
<td valign="top" align="center">0.3</td>
<td valign="top" align="center">0.3</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">1.1</td>
<td valign="top" align="center">0.2</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>Nr. noise = number of noise variables, <italic>N</italic> = sample size.</italic></attrib>
</table-wrap-foot>
</table-wrap>
</sec>
</sec>
<sec id="S5.SS2">
<title>Simulation II: Power, Cut Point Estimation, and Group Recovery</title>
<p>Simulation II evaluated the performance of likelihood-ratio and score-guided SEM trees in heterogeneous samples consisting of two subgroups.</p>
<p>We varied the following experimental factors:</p>
<list list-type="simple">
<list-item>
<label>&#x2022;</label>
<p><italic>Level of measurement of the covariate:</italic> The SEM tree was provided with a single covariate that was either a continuous variable (standard normal), an ordinal variable with 6 levels, or a dichotomous variable.</p>
</list-item>
<list-item>
<label>&#x2022;</label>
<p><italic>Group differences:</italic> We tested two types of group differences. Either the fixed slope of the linear latent growth curve model shown in <xref ref-type="fig" rid="F3">Figure 3</xref> or all random effects varied between groups. <xref ref-type="table" rid="T3">Table 3</xref> presents the values used for the heterogeneous parameters. Note that in the fixed slope condition, only a single parameter varied between groups, whereas in the random effects condition, three parameters varied. The values of the remaining homogeneous parameters are shown in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
</list-item>
</list>
<table-wrap position="float" id="T3">
<label>TABLE 3</label>
<caption><p>Parameter differences used in Simulation II.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left"></td>
<td valign="top" align="center" colspan="2">Fixed effects<hr/></td>
<td/>
<td valign="top" align="center" colspan="2">Random effects<hr/></td>
</tr>
<tr>
<td valign="top" align="left">Parameter</td>
<td valign="top" align="center">Group 1</td>
<td valign="top" align="center">Group 2</td>
<td valign="top" align="center">Parameter</td>
<td valign="top" align="center">Group 1</td>
<td valign="top" align="center">Group 2</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">E(<italic>f</italic><sub><italic>I</italic></sub>)</td>
<td valign="top" align="center">5.389</td>
<td valign="top" align="center">5.695</td>
<td valign="top" align="center">Var(<italic>f</italic><sub><italic>I</italic></sub>)</td>
<td valign="top" align="center">25.137</td>
<td valign="top" align="center">38.023</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td valign="top" align="center">Var(<italic>f</italic><sub><italic>S</italic></sub>)</td>
<td valign="top" align="center">2.808</td>
<td valign="top" align="center">4.247</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td valign="top" align="center">Cov(<italic>f</italic><sub><italic>I</italic></sub>,<italic>f</italic><sub><italic>S</italic></sub>)</td>
<td valign="top" align="center">0.745</td>
<td valign="top" align="center">1.127</td>
</tr>
</tbody>
</table></table-wrap>
<list list-type="simple">
<list-item>
<label>&#x2022;</label>
<p><italic>Noise variable</italic>: In the noise condition, the SEM tree algorithm was provided with a noise variable in addition to the informative covariate. In the no-noise condition, only the informative covariate was given to the tree. The noise variable was independent of the group differences and randomly selected to be a continuous variable (standard normal), an ordinal variable with 6 levels, or a dichotomous variable.</p>
</list-item>
<list-item>
<label>&#x2022;</label>
<p><italic>Cut point location:</italic> We tested three positions of the optimal cut point in the informative covariate. The cut point was either central, partitioning the sample into two groups of equal size; moderately non-central, resulting in a larger subgroup with 66.67% of the observations and a smaller subgroup with 33.33% of the observations; or strongly non-central, with 83.33% of the observations in the larger subgroup and 16.67% in the smaller subgroup. We counterbalanced the non-central cut points so that moderately non-central cut points occurred either after the <inline-formula><mml:math id="INEQ17"><mml:mrow><mml:mfrac bevelled='true'><mml:mn>1</mml:mn><mml:mn>3</mml:mn></mml:mfrac></mml:mrow></mml:math></inline-formula>- or after the <inline-formula><mml:math id="INEQ18"><mml:mrow><mml:mfrac bevelled='true'><mml:mn>2</mml:mn><mml:mn>3</mml:mn></mml:mfrac></mml:mrow></mml:math></inline-formula>-quantile of the covariate and strongly non-central cut points either after the <inline-formula><mml:math id="INEQ19"><mml:mrow><mml:mfrac bevelled='true'><mml:mn>1</mml:mn><mml:mn>6</mml:mn></mml:mfrac></mml:mrow></mml:math></inline-formula>- or the <inline-formula><mml:math id="INEQ20"><mml:mrow><mml:mfrac bevelled='true'><mml:mn>5</mml:mn><mml:mn>6</mml:mn></mml:mfrac></mml:mrow></mml:math></inline-formula>-quantile.</p>
</list-item>
<list-item>
<label>&#x2022;</label>
<p><italic>Sample size (N)</italic>: The sample consisted either of 504 or 1,008 observations.</p>
</list-item>
</list>
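<p>As a minimal illustration of the cut point location factor (a sketch in Python, not the authors' simulation code; all names are ours), the following shows how cut points at the 1/2-, 1/3-, and 1/6-quantiles of a standard-normal covariate produce the stated subgroup proportions:</p>

```python
# Illustrative sketch (not the authors' simulation code): how cut points at the
# 1/2-, 1/3-, and 1/6-quantiles of a continuous covariate split the sample.
import numpy as np

def assign_groups(covariate, quantile):
    """Dichotomize observations at the given quantile of the covariate."""
    cut = np.quantile(covariate, quantile)
    return (covariate > cut).astype(int), cut

rng = np.random.default_rng(1)
z = rng.standard_normal(1008)          # standard-normal covariate, N = 1,008

for q in (1 / 2, 1 / 3, 1 / 6):        # central, moderately, strongly non-central
    groups, cut = assign_groups(z, q)
    n0, n1 = np.bincount(groups)
    print(f"quantile {q:.3f}: cut {cut:+.2f}, sizes {n0}/{n1}")
```

<p>With <italic>N</italic> = 1,008, this yields subgroup sizes of 504/504, 336/672, and 168/840, matching the 50.00/50.00, 33.33/66.67, and 16.67/83.33 splits described above.</p>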
<p>We evaluated each method in terms of statistical power to detect heterogeneity, the precision of the estimated cut points, group recovery, and runtime. For each condition, the results of the best-performing method are printed in bold in the following tables. Due to space constraints, we report only the most important simulation results. The complete simulation results are provided as Online Supplemental Material<sup>2</sup>.</p>
<sec id="S5.SS2.SSS1">
<title>Power</title>
<p>We define statistical power as the percentage of SEM trees that correctly selected the covariate as a split at any cut point and any level of the tree.</p>
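<p>Under this definition, power reduces to a simple proportion over simulation replications. A minimal sketch (our illustration, not the authors' code; <monospace>selected_flags</monospace> is an assumed name):</p>

```python
# Hedged sketch (our illustration, not the authors' code): power estimated as
# the percentage of replications in which a tree split on the covariate
# at any cut point and any level.
def estimated_power(selected_flags):
    """selected_flags: one boolean per replication (True = covariate selected)."""
    return 100.0 * sum(selected_flags) / len(selected_flags)

# e.g., the covariate was selected in 417 of 1,000 replications
flags = [True] * 417 + [False] * 583
print(f"power = {estimated_power(flags):.1f}%")
```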
<p><xref ref-type="table" rid="T4">Table 4</xref> shows the estimated power of the different SEM trees. We first compare the overall performance of the original <italic>na&#x00EF;ve</italic> and <italic>fair</italic> trees with the newly implemented max<italic>LR</italic> and score-guided trees. With respect to power, <italic>na&#x00EF;ve</italic> trees performed roughly as well as the newly implemented methods for ordinal and dichotomous covariates but poorly for continuous covariates. The other classical method, <italic>fair</italic> trees, showed the lowest power of all methods under investigation. As expected, the likelihood-ratio-guided max<italic>LR</italic> trees yielded results similar to those of the score-guided max<italic>LM</italic> trees but were consistently slightly more powerful. Among the experimental conditions, the type of group differences and the cut point location affected the rank order of the methods the most. <italic>DM</italic> and <italic>WDM</italic> trees were the most powerful methods for detecting heterogeneity in the fixed slope parameter. In contrast, max<italic>LR</italic>, <italic>CvM</italic>, max<italic>LM</italic>, and max<italic>LM</italic><sub><italic>O</italic></sub> trees proved more powerful for detecting heterogeneity in the random effects. We expected this behavior because the <italic>DM</italic> and <italic>WDM</italic> test statistics target heterogeneity in a single parameter, whereas all other methods monitor group differences in multiple parameters. Overall, the likelihood-ratio-based test statistic max<italic>LR</italic> and the score-based test statistics with a scaling term (that is, max<italic>LM</italic>, <italic>WDM</italic>, and max<italic>LM</italic><sub><italic>O</italic></sub>) were more sensitive to non-central cut points, but less sensitive to central cut points, than the unscaled <italic>DM</italic> and <italic>CvM</italic> statistics for continuous covariates. As an optimal baseline, we compared the power of the SEM trees with that of MGSEMs, denoted as MG in <xref ref-type="table" rid="T4">Table 4</xref>. Like the SEM trees, the MGSEMs were specified by letting all parameters vary between groups. In contrast to SEM trees, MGSEMs were unaffected by noise variables and were informed about the true cut point. Therefore, the MGSEMs represent the upper limit achievable in terms of statistical power. Not surprisingly, MGSEMs were more powerful than all SEM tree methods given continuous and ordinal covariates, but equally powerful in conditions with dichotomous covariates and without noise variables, where cut points did not need to be learned from the data.</p>
<table-wrap position="float" id="T4">
<label>TABLE 4</label>
<caption><p>Power to detect group differences.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left" colspan="2"></td>
<td valign="top" align="center" colspan="7">Continuous<hr/></td>
<td valign="top" align="center" colspan="6">Ordinal<hr/></td>
<td valign="top" align="center" colspan="4">Dichotomous<hr/></td>
</tr>
<tr>
<td valign="top" align="left"><italic>N</italic></td>
<td valign="top" align="center">CL</td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center"><italic>DM</italic></td>
<td valign="top" align="center"><italic>CvM</italic></td>
<td valign="top" align="center">max<italic>LM</italic></td>
<td valign="top" align="center"><italic>MG</italic></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center"><italic>WDM</italic></td>
<td valign="top" align="center">max<italic>LM</italic><sub><italic>O</italic></sub></td>
<td valign="top" align="center"><italic>MG</italic></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center"><italic>LM</italic></td>
<td valign="top" align="center"><italic>MG</italic></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="19"><bold><italic>Group difference in the fixed slope</italic></bold></td>
</tr>
<tr>
<td valign="top" align="left">504</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">10.0</td>
<td valign="top" align="center">14.5</td>
<td valign="top" align="center">36.8</td>
<td valign="top" align="center"><bold>46.7</bold></td>
<td valign="top" align="center">41.7</td>
<td valign="top" align="center">35.4</td>
<td valign="top" align="center">52.8</td>
<td valign="top" align="center">35.8</td>
<td valign="top" align="center">17.4</td>
<td valign="top" align="center">38.4</td>
<td valign="top" align="center"><bold>44.2</bold></td>
<td valign="top" align="center">37.0</td>
<td valign="top" align="center">52.4</td>
<td valign="top" align="center"><bold>52.8</bold></td>
<td valign="top" align="center">27.0</td>
<td valign="top" align="center">51.8</td>
<td valign="top" align="center">52.8</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">7.7</td>
<td valign="top" align="center">13.1</td>
<td valign="top" align="center">31.6</td>
<td valign="top" align="center"><bold>35.4</bold></td>
<td valign="top" align="center">32.4</td>
<td valign="top" align="center">30.5</td>
<td valign="top" align="center">47.1</td>
<td valign="top" align="center">30.4</td>
<td valign="top" align="center">15.7</td>
<td valign="top" align="center">32.6</td>
<td valign="top" align="center"><bold>37.7</bold></td>
<td valign="top" align="center">31.6</td>
<td valign="top" align="center">46.1</td>
<td valign="top" align="center"><bold>45.9</bold></td>
<td valign="top" align="center">24.0</td>
<td valign="top" align="center">45.2</td>
<td valign="top" align="center">45.9</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">3.2</td>
<td valign="top" align="center">8.1</td>
<td valign="top" align="center"><bold>16.8</bold></td>
<td valign="top" align="center">9.8</td>
<td valign="top" align="center">13.0</td>
<td valign="top" align="center">16.4</td>
<td valign="top" align="center">29.7</td>
<td valign="top" align="center">17.8</td>
<td valign="top" align="center">10.4</td>
<td valign="top" align="center">19.4</td>
<td valign="top" align="center"><bold>21.2</bold></td>
<td valign="top" align="center">19.0</td>
<td valign="top" align="center">29.8</td>
<td valign="top" align="center"><bold>30.1</bold></td>
<td valign="top" align="center">17.0</td>
<td valign="top" align="center">29.3</td>
<td valign="top" align="center">30.1</td>
</tr>
<tr>
<td valign="top" align="left">1,008</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">30.6</td>
<td valign="top" align="center">29.8</td>
<td valign="top" align="center">72.9</td>
<td valign="top" align="center"><bold>84.9</bold></td>
<td valign="top" align="center">76.3</td>
<td valign="top" align="center">72.1</td>
<td valign="top" align="center">87.2</td>
<td valign="top" align="center">73.8</td>
<td valign="top" align="center">37.6</td>
<td valign="top" align="center">75.7</td>
<td valign="top" align="center"><bold>83.3</bold></td>
<td valign="top" align="center">75.0</td>
<td valign="top" align="center">86.4</td>
<td valign="top" align="center"><bold>85.5</bold></td>
<td valign="top" align="center">51.4</td>
<td valign="top" align="center">85.2</td>
<td valign="top" align="center">85.5</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">24.3</td>
<td valign="top" align="center">26.1</td>
<td valign="top" align="center">66.1</td>
<td valign="top" align="center"><bold>74.9</bold></td>
<td valign="top" align="center">64.8</td>
<td valign="top" align="center">65.6</td>
<td valign="top" align="center">82.0</td>
<td valign="top" align="center">65.7</td>
<td valign="top" align="center">32.3</td>
<td valign="top" align="center">68.0</td>
<td valign="top" align="center"><bold>77.2</bold></td>
<td valign="top" align="center">67.4</td>
<td valign="top" align="center">81.3</td>
<td valign="top" align="center"><bold>81.5</bold></td>
<td valign="top" align="center">47.6</td>
<td valign="top" align="center">81.2</td>
<td valign="top" align="center">81.5</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">8.7</td>
<td valign="top" align="center">15.6</td>
<td valign="top" align="center"><bold>38.5</bold></td>
<td valign="top" align="center">25.9</td>
<td valign="top" align="center">26.2</td>
<td valign="top" align="center"><bold>38.5</bold></td>
<td valign="top" align="center">58.2</td>
<td valign="top" align="center">39.5</td>
<td valign="top" align="center">19.9</td>
<td valign="top" align="center">42.0</td>
<td valign="top" align="center"><bold>48.7</bold></td>
<td valign="top" align="center">41.4</td>
<td valign="top" align="center">57.9</td>
<td valign="top" align="center">58.6</td>
<td valign="top" align="center">30.4</td>
<td valign="top" align="center"><bold>58.7</bold></td>
<td valign="top" align="center">58.6</td>
</tr>
<tr>
<td valign="top" align="left" colspan="19"><bold><italic>Group differences in the random effects</italic></bold></td>
</tr>
<tr>
<td valign="top" align="left">504</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">18.2</td>
<td valign="top" align="center">19.4</td>
<td valign="top" align="center">51.8</td>
<td valign="top" align="center">48.3</td>
<td valign="top" align="center"><bold>56.5</bold></td>
<td valign="top" align="center">49.9</td>
<td valign="top" align="center">69.4</td>
<td valign="top" align="center">52.1</td>
<td valign="top" align="center">24.5</td>
<td valign="top" align="center"><bold>54.2</bold></td>
<td valign="top" align="center">46.4</td>
<td valign="top" align="center">53.2</td>
<td valign="top" align="center">68.5</td>
<td valign="top" align="center"><bold>68.8</bold></td>
<td valign="top" align="center">36.0</td>
<td valign="top" align="center">67.4</td>
<td valign="top" align="center">68.8</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">13.7</td>
<td valign="top" align="center">17.1</td>
<td valign="top" align="center">44.1</td>
<td valign="top" align="center">35.6</td>
<td valign="top" align="center"><bold>44.6</bold></td>
<td valign="top" align="center">42.8</td>
<td valign="top" align="center">62.8</td>
<td valign="top" align="center">45.0</td>
<td valign="top" align="center">21.0</td>
<td valign="top" align="center"><bold>47.5</bold></td>
<td valign="top" align="center">39.9</td>
<td valign="top" align="center">45.9</td>
<td valign="top" align="center">63.1</td>
<td valign="top" align="center"><bold>62.9</bold></td>
<td valign="top" align="center">32.8</td>
<td valign="top" align="center">61.0</td>
<td valign="top" align="center">62.9</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">5.3</td>
<td valign="top" align="center">10.4</td>
<td valign="top" align="center">22.9</td>
<td valign="top" align="center">12.0</td>
<td valign="top" align="center">18.1</td>
<td valign="top" align="center"><bold>23.8</bold></td>
<td valign="top" align="center">40.6</td>
<td valign="top" align="center">24.5</td>
<td valign="top" align="center">13.4</td>
<td valign="top" align="center">26.4</td>
<td valign="top" align="center">22.8</td>
<td valign="top" align="center"><bold>26.8</bold></td>
<td valign="top" align="center">40.9</td>
<td valign="top" align="center"><bold>41.1</bold></td>
<td valign="top" align="center">21.0</td>
<td valign="top" align="center">38.0</td>
<td valign="top" align="center">41.1</td>
</tr>
<tr>
<td valign="top" align="left">1,008</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">53.3</td>
<td valign="top" align="center">45.1</td>
<td valign="top" align="center">88.0</td>
<td valign="top" align="center">85.7</td>
<td valign="top" align="center"><bold>89.6</bold></td>
<td valign="top" align="center">87.6</td>
<td valign="top" align="center">95.9</td>
<td valign="top" align="center">89.4</td>
<td valign="top" align="center">55.0</td>
<td valign="top" align="center"><bold>90.4</bold></td>
<td valign="top" align="center">84.2</td>
<td valign="top" align="center">90.1</td>
<td valign="top" align="center">95.7</td>
<td valign="top" align="center"><bold>95.7</bold></td>
<td valign="top" align="center">69.1</td>
<td valign="top" align="center">95.5</td>
<td valign="top" align="center">95.7</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">43.9</td>
<td valign="top" align="center">37.5</td>
<td valign="top" align="center"><bold>82.7</bold></td>
<td valign="top" align="center">74.4</td>
<td valign="top" align="center">80.0</td>
<td valign="top" align="center">80.9</td>
<td valign="top" align="center">93.3</td>
<td valign="top" align="center">84.2</td>
<td valign="top" align="center">47.7</td>
<td valign="top" align="center"><bold>85.6</bold></td>
<td valign="top" align="center">77.5</td>
<td valign="top" align="center">84.3</td>
<td valign="top" align="center">93.1</td>
<td valign="top" align="center"><bold>93.4</bold></td>
<td valign="top" align="center">63.3</td>
<td valign="top" align="center">93.0</td>
<td valign="top" align="center">93.4</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">17.3</td>
<td valign="top" align="center">21.0</td>
<td valign="top" align="center"><bold>52.8</bold></td>
<td valign="top" align="center">26.6</td>
<td valign="top" align="center">36.4</td>
<td valign="top" align="center">49.9</td>
<td valign="top" align="center">73.4</td>
<td valign="top" align="center">55.3</td>
<td valign="top" align="center">28.3</td>
<td valign="top" align="center"><bold>57.4</bold></td>
<td valign="top" align="center">47.3</td>
<td valign="top" align="center">53.6</td>
<td valign="top" align="center">73.4</td>
<td valign="top" align="center"><bold>73.3</bold></td>
<td valign="top" align="center">39.8</td>
<td valign="top" align="center">69.8</td>
<td valign="top" align="center">73.3</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic><italic>N</italic> = sample size, CL = cut point location, 1/2 = central cut point location, 1/3 = moderately non-central, 1/6 = strongly non-central, <italic>MG</italic> = MGSEM. Best-performing methods are printed in bold.</italic></attrib>
</table-wrap-foot>
</table-wrap>
<p>The presence of a noise variable (not shown in <xref ref-type="table" rid="T4">Table 4</xref>) approximately halved the power of all tree methods but affected <italic>na&#x00EF;ve</italic> trees most severely. The pronounced effect of noise variables on <italic>na&#x00EF;ve</italic> trees was mainly driven by continuous noise variables, which led to severely over-adjusted <italic>p</italic>-values. Providing <italic>na&#x00EF;ve</italic> trees with ordinal or dichotomous noise variables led to a decrease in power comparable to that of the other methods. Increasing the sample size had an approximately uniform effect and raised the power of all methods substantially.</p>
</sec>
<sec id="S5.SS2.SSS2">
<title>Precision of Estimated Cut Points</title>
<p>The estimation of the optimal cut point in the covariate is crucial for recovering the true grouping of individuals. The approaches for locating cut points differed between likelihood-ratio- and score-guided SEM trees: likelihood-ratio-guided trees found cut points by maximizing a partitioned log-likelihood, whereas score-guided SEM trees determined cut points by searching through a disaggregated max<italic>LM</italic> statistic. In the following, we limit our discussion to cut points estimated by max<italic>LR</italic> and max<italic>LM</italic> trees. We used only trees that selected the covariate for the initial split of the data, ignoring possible further splits. Since noise variables had no visible effect, we discuss only simulation trials without additional noise variables. Also, we did not evaluate dichotomous covariates because they afford only a single, trivial cut point.</p>
<p><xref ref-type="table" rid="T5">Table 5</xref> presents bias, standard deviation, and root mean squared error (RMSE) of the estimated cut points. Both approaches produced similar cut points that were nearly unbiased. Overall, cut points estimated by max<italic>LR</italic> were slightly more precise in terms of RMSE. Interestingly, group differences in the random effects led to slightly biased cut point estimates provided by max<italic>LM</italic> trees, which was not observed for max<italic>LR</italic> trees. Estimates for non-central cut points showed more variability than for central cut points in both methods. A larger sample size of 1,008 observations increased both methods&#x2019; precision and reduced the bias of cut points estimated by max<italic>LM</italic>.</p>
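<p>For reference, the three accuracy measures in <xref ref-type="table" rid="T5">Table 5</xref> can be computed with the standard formulas (a hedged sketch; the function name is ours, not from the paper):</p>

```python
# Hedged sketch (standard formulas; function name is ours, not from the paper):
# bias, standard deviation, and RMSE of estimated cut points.
import numpy as np

def cut_point_accuracy(estimates, true_cut):
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_cut                     # systematic deviation
    sd = est.std(ddof=1)                             # sampling variability
    rmse = np.sqrt(np.mean((est - true_cut) ** 2))   # combines bias and spread
    return bias, sd, rmse
```

<p>Note that the squared RMSE equals the squared bias plus the (population) variance of the estimates, so RMSE and SD nearly coincide whenever the bias is close to zero, as in most cells of <xref ref-type="table" rid="T5">Table 5</xref>.</p>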
<table-wrap position="float" id="T5">
<label>TABLE 5</label>
<caption><p>Estimated cut points.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left"></td>
<td/>
<td valign="top" align="center" colspan="6">Continuous<hr/></td>
<td valign="top" align="center" colspan="6">Ordinal<hr/></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center" colspan="3">max<italic>LR</italic><hr/></td>
<td valign="top" align="center" colspan="3">max<italic>LM</italic><hr/></td>
<td valign="top" align="center" colspan="3">max<italic>LR</italic><hr/></td>
<td valign="top" align="center" colspan="3">max<italic>LM</italic><sub><italic>O</italic></sub><hr/></td>
</tr>
<tr>
<td valign="top" align="left"><italic>N</italic></td>
<td valign="top" align="center">CL</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">SD</td>
<td valign="top" align="center">RMSE</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">SD</td>
<td valign="top" align="center">RMSE</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">SD</td>
<td valign="top" align="center">RMSE</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">SD</td>
<td valign="top" align="center">RMSE</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="14"><bold><italic>Group difference in the fixed slope</italic></bold></td>
</tr>
<tr>
<td valign="top" align="left">504</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.388</td>
<td valign="top" align="center">0.389</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.408</td>
<td valign="top" align="center">0.408</td>
<td valign="top" align="center">&#x2212;0.003</td>
<td valign="top" align="center">0.798</td>
<td valign="top" align="center">0.797</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.750</td>
<td valign="top" align="center">0.750</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.430</td>
<td valign="top" align="center">0.430</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.440</td>
<td valign="top" align="center">0.440</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.955</td>
<td valign="top" align="center">0.956</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.878</td>
<td valign="top" align="center">0.878</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.694</td>
<td valign="top" align="center">0.694</td>
<td valign="top" align="center">&#x2212;0.022</td>
<td valign="top" align="center">0.686</td>
<td valign="top" align="center">0.686</td>
<td valign="top" align="center">0.010</td>
<td valign="top" align="center">1.347</td>
<td valign="top" align="center">1.347</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">1.312</td>
<td valign="top" align="center">1.312</td>
</tr>
<tr>
<td valign="top" align="left">1,008</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">&#x2212;0.002</td>
<td valign="top" align="center">0.273</td>
<td valign="top" align="center">0.273</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.282</td>
<td valign="top" align="center">0.282</td>
<td valign="top" align="center">&#x2212;0.004</td>
<td valign="top" align="center">0.546</td>
<td valign="top" align="center">0.545</td>
<td valign="top" align="center">&#x2212;0.010</td>
<td valign="top" align="center">0.459</td>
<td valign="top" align="center">0.459</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">&#x2212;0.006</td>
<td valign="top" align="center">0.310</td>
<td valign="top" align="center">0.310</td>
<td valign="top" align="center">&#x2212;0.007</td>
<td valign="top" align="center">0.311</td>
<td valign="top" align="center">0.311</td>
<td valign="top" align="center">&#x2212;0.004</td>
<td valign="top" align="center">0.603</td>
<td valign="top" align="center">0.603</td>
<td valign="top" align="center">&#x2212;0.006</td>
<td valign="top" align="center">0.494</td>
<td valign="top" align="center">0.494</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">0.004</td>
<td valign="top" align="center">0.491</td>
<td valign="top" align="center">0.491</td>
<td valign="top" align="center">&#x2212;0.001</td>
<td valign="top" align="center">0.499</td>
<td valign="top" align="center">0.499</td>
<td valign="top" align="center">0.004</td>
<td valign="top" align="center">0.933</td>
<td valign="top" align="center">0.933</td>
<td valign="top" align="center">0.007</td>
<td valign="top" align="center">0.836</td>
<td valign="top" align="center">0.836</td>
</tr>
<tr>
<td valign="top" align="left" colspan="14"><bold><italic>Group differences in the random effects</italic></bold></td>
</tr>
<tr>
<td valign="top" align="left">504</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.345</td>
<td valign="top" align="center">0.345</td>
<td valign="top" align="center">0.183</td>
<td valign="top" align="center">0.348</td>
<td valign="top" align="center">0.393</td>
<td valign="top" align="center">0.003</td>
<td valign="top" align="center">0.699</td>
<td valign="top" align="center">0.699</td>
<td valign="top" align="center">0.255</td>
<td valign="top" align="center">0.790</td>
<td valign="top" align="center">0.831</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.381</td>
<td valign="top" align="center">0.381</td>
<td valign="top" align="center">0.177</td>
<td valign="top" align="center">0.401</td>
<td valign="top" align="center">0.439</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.769</td>
<td valign="top" align="center">0.769</td>
<td valign="top" align="center">0.246</td>
<td valign="top" align="center">0.882</td>
<td valign="top" align="center">0.916</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.611</td>
<td valign="top" align="center">0.611</td>
<td valign="top" align="center">0.098</td>
<td valign="top" align="center">0.602</td>
<td valign="top" align="center">0.610</td>
<td valign="top" align="center">0.003</td>
<td valign="top" align="center">1.187</td>
<td valign="top" align="center">1.187</td>
<td valign="top" align="center">0.138</td>
<td valign="top" align="center">1.260</td>
<td valign="top" align="center">1.267</td>
</tr>
<tr>
<td valign="top" align="left">1,008</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.224</td>
<td valign="top" align="center">0.225</td>
<td valign="top" align="center">0.085</td>
<td valign="top" align="center">0.240</td>
<td valign="top" align="center">0.255</td>
<td valign="top" align="center">0.001</td>
<td valign="top" align="center">0.415</td>
<td valign="top" align="center">0.415</td>
<td valign="top" align="center">0.118</td>
<td valign="top" align="center">0.542</td>
<td valign="top" align="center">0.555</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.245</td>
<td valign="top" align="center">0.245</td>
<td valign="top" align="center">0.092</td>
<td valign="top" align="center">0.268</td>
<td valign="top" align="center">0.283</td>
<td valign="top" align="center">0.010</td>
<td valign="top" align="center">0.457</td>
<td valign="top" align="center">0.457</td>
<td valign="top" align="center">0.117</td>
<td valign="top" align="center">0.600</td>
<td valign="top" align="center">0.611</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">0.004</td>
<td valign="top" align="center">0.387</td>
<td valign="top" align="center">0.387</td>
<td valign="top" align="center">0.076</td>
<td valign="top" align="center">0.421</td>
<td valign="top" align="center">0.428</td>
<td valign="top" align="center">&#x2212;0.001</td>
<td valign="top" align="center">0.728</td>
<td valign="top" align="center">0.728</td>
<td valign="top" align="center">0.102</td>
<td valign="top" align="center">0.943</td>
<td valign="top" align="center">0.949</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic><italic>N</italic> = sample size, CL = cut point location, 1/2 = central cut point location, 1/3 = moderately non-central, 1/6 = strongly non-central, B = bias, SD = standard deviation, RMSE = root mean squared error.</italic></attrib>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S5.SS2.SSS3">
<title>Group Recovery</title>
<p>We used the adjusted Rand index (ARI; <xref ref-type="bibr" rid="B17">Hubert and Arabie, 1985</xref>; <xref ref-type="bibr" rid="B32">Milligan and Cooper, 1986</xref>) to measure how well each SEM tree method recovered the true groups. The ARI is widely used to quantify the similarity between two partitions and is corrected for agreement by chance. A large ARI value, up to the maximum of 1, indicates high agreement between the partitioning estimated by a tree and the true partitioning, whereas smaller values imply a lower degree of similarity. In particular, an ARI of 0 is obtained if a tree fails to detect any group differences and does not split the sample.</p>
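To make the metric concrete, the following is a minimal, self-contained sketch of the ARI computed from the contingency table of two label vectors. This is a hypothetical helper for illustration, not part of the <monospace>semtree</monospace> package; it returns 1 for identical partitions and 0, for example, when a tree does not split the sample at all:

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same sample."""
    n = len(labels_true)
    if n < 2:
        return 1.0
    # Contingency counts of jointly assigned label pairs.
    contingency = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    index = sum(comb(c, 2) for c in contingency.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)   # chance-level agreement
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:                     # both partitions trivial
        return 1.0
    return (index - expected) / (max_index - expected)
```

For instance, `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` equals 1 because the ARI is invariant to relabeling of the groups, while comparing against an unsplit sample, `adjusted_rand_index([0, 0, 1, 1], [0, 0, 0, 0])`, yields 0.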
<p>The ARI of the different tree methods is shown in <xref ref-type="table" rid="T6">Table 6</xref>. In our simulation setup, the ARI of a SEM tree method appeared to be determined mainly by its power to detect heterogeneity, as a rank order similar to that of the statistical power emerged. Given a continuous covariate, score-guided <italic>DM</italic> and <italic>CvM</italic> trees showed the largest ARI for central cut points, max<italic>LM</italic> and max<italic>LR</italic> trees showed the largest ARI for non-central cut points, and the original likelihood-ratio-guided <italic>na&#x00EF;ve</italic> and <italic>fair</italic> trees performed poorly. As with power, <italic>DM</italic> trees had a higher ARI for a difference in the slope, whereas the ARI of the other score-guided and max<italic>LR</italic> trees was higher for differences in the random effects. <italic>Na&#x00EF;ve</italic> trees performed better when provided with an ordinal or dichotomous covariate. For ordinal covariates, <italic>WDM</italic> trees exhibited the largest ARI if the fixed slope differed between groups, whereas the ARI of max<italic>LR</italic> and max<italic>LM</italic><sub><italic>O</italic></sub> trees was higher for differences in the random effects. For dichotomous covariates, <italic>na&#x00EF;ve</italic> trees showed a slightly higher ARI than score-guided <italic>LM</italic> trees. However, when provided with an additional noise variable, <italic>na&#x00EF;ve</italic> trees showed a more pronounced decrease in the ARI than <italic>LM</italic> trees (not shown in <xref ref-type="table" rid="T6">Table 6</xref>). This effect was mainly driven by continuous noise variables, which led to overcorrected <italic>p</italic>-values of <italic>na&#x00EF;ve</italic> trees. Non-central cut points generally reduced the ARI of all trees, affecting <italic>DM</italic> and <italic>CvM</italic> trees without a scaling term the most. For larger samples with 1,008 simulated individuals, the ARI of all tree methods improved substantially without drastically changing the rank order.</p>
<table-wrap position="float" id="T6">
<label>TABLE 6</label>
<caption><p>Adjusted Rand index.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left" colspan="2"></td>
<td valign="top" align="center" colspan="6">Continuous<hr/></td>
<td valign="top" align="center" colspan="5">Ordinal<hr/></td>
<td valign="top" align="center" colspan="3">Dichotomous<hr/></td>
</tr>
<tr>
<td valign="top" align="left"><italic>N</italic></td>
<td valign="top" align="center">CL</td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center"><italic>DM</italic></td>
<td valign="top" align="center"><italic>CvM</italic></td>
<td valign="top" align="center">max<italic>LM</italic></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center"><italic>WDM</italic></td>
<td valign="top" align="center">max<italic>LM</italic><sub><italic>O</italic></sub></td>
<td valign="top" align="center"><italic>Na&#x00EF;ve</italic></td>
<td valign="top" align="center"><italic>Fair</italic></td>
<td valign="top" align="center"><italic>LM</italic></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="16"><bold><italic>Group difference in the fixed slope</italic></bold></td>
</tr>
<tr>
<td valign="top" align="left">504</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">0.069</td>
<td valign="top" align="center">0.085</td>
<td valign="top" align="center">0.243</td>
<td valign="top" align="center"><bold>0.377</bold></td>
<td valign="top" align="center">0.323</td>
<td valign="top" align="center">0.234</td>
<td valign="top" align="center">0.277</td>
<td valign="top" align="center">0.120</td>
<td valign="top" align="center">0.296</td>
<td valign="top" align="center"><bold>0.361</bold></td>
<td valign="top" align="center">0.294</td>
<td valign="top" align="center"><bold>0.527</bold></td>
<td valign="top" align="center">0.270</td>
<td valign="top" align="center">0.518</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">0.055</td>
<td valign="top" align="center">0.075</td>
<td valign="top" align="center">0.208</td>
<td valign="top" align="center"><bold>0.261</bold></td>
<td valign="top" align="center">0.215</td>
<td valign="top" align="center">0.204</td>
<td valign="top" align="center">0.230</td>
<td valign="top" align="center">0.108</td>
<td valign="top" align="center">0.246</td>
<td valign="top" align="center"><bold>0.306</bold></td>
<td valign="top" align="center">0.248</td>
<td valign="top" align="center"><bold>0.459</bold></td>
<td valign="top" align="center">0.240</td>
<td valign="top" align="center">0.452</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.044</td>
<td valign="top" align="center">0.101</td>
<td valign="top" align="center">0.036</td>
<td valign="top" align="center">0.036</td>
<td valign="top" align="center"><bold>0.105</bold></td>
<td valign="top" align="center">0.131</td>
<td valign="top" align="center">0.065</td>
<td valign="top" align="center">0.141</td>
<td valign="top" align="center"><bold>0.159</bold></td>
<td valign="top" align="center">0.142</td>
<td valign="top" align="center"><bold>0.301</bold></td>
<td valign="top" align="center">0.170</td>
<td valign="top" align="center">0.293</td>
</tr>
<tr>
<td valign="top" align="left">1,008</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">0.248</td>
<td valign="top" align="center">0.209</td>
<td valign="top" align="center">0.557</td>
<td valign="top" align="center"><bold>0.726</bold></td>
<td valign="top" align="center">0.636</td>
<td valign="top" align="center">0.556</td>
<td valign="top" align="center">0.647</td>
<td valign="top" align="center">0.313</td>
<td valign="top" align="center">0.662</td>
<td valign="top" align="center"><bold>0.751</bold></td>
<td valign="top" align="center">0.675</td>
<td valign="top" align="center"><bold>0.855</bold></td>
<td valign="top" align="center">0.514</td>
<td valign="top" align="center">0.852</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">0.196</td>
<td valign="top" align="center">0.179</td>
<td valign="top" align="center">0.500</td>
<td valign="top" align="center"><bold>0.604</bold></td>
<td valign="top" align="center">0.490</td>
<td valign="top" align="center">0.504</td>
<td valign="top" align="center">0.575</td>
<td valign="top" align="center">0.267</td>
<td valign="top" align="center">0.593</td>
<td valign="top" align="center"><bold>0.697</bold></td>
<td valign="top" align="center">0.607</td>
<td valign="top" align="center"><bold>0.815</bold></td>
<td valign="top" align="center">0.476</td>
<td valign="top" align="center">0.812</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">0.067</td>
<td valign="top" align="center">0.098</td>
<td valign="top" align="center">0.275</td>
<td valign="top" align="center">0.137</td>
<td valign="top" align="center">0.097</td>
<td valign="top" align="center"><bold>0.290</bold></td>
<td valign="top" align="center">0.335</td>
<td valign="top" align="center">0.159</td>
<td valign="top" align="center">0.353</td>
<td valign="top" align="center"><bold>0.430</bold></td>
<td valign="top" align="center">0.358</td>
<td valign="top" align="center">0.586</td>
<td valign="top" align="center">0.304</td>
<td valign="top" align="center"><bold>0.587</bold></td>
</tr>
<tr>
<td valign="top" align="left" colspan="16"><bold><italic>Group differences in the random effects</italic></bold></td>
</tr>
<tr>
<td valign="top" align="left">504</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">0.136</td>
<td valign="top" align="center">0.125</td>
<td valign="top" align="center">0.361</td>
<td valign="top" align="center">0.378</td>
<td valign="top" align="center"><bold>0.441</bold></td>
<td valign="top" align="center">0.341</td>
<td valign="top" align="center">0.428</td>
<td valign="top" align="center">0.187</td>
<td valign="top" align="center"><bold>0.444</bold></td>
<td valign="top" align="center">0.361</td>
<td valign="top" align="center">0.413</td>
<td valign="top" align="center"><bold>0.688</bold></td>
<td valign="top" align="center">0.360</td>
<td valign="top" align="center">0.674</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">0.102</td>
<td valign="top" align="center">0.109</td>
<td valign="top" align="center">0.309</td>
<td valign="top" align="center">0.252</td>
<td valign="top" align="center"><bold>0.312</bold></td>
<td valign="top" align="center">0.293</td>
<td valign="top" align="center">0.371</td>
<td valign="top" align="center">0.160</td>
<td valign="top" align="center"><bold>0.388</bold></td>
<td valign="top" align="center">0.308</td>
<td valign="top" align="center">0.356</td>
<td valign="top" align="center"><bold>0.629</bold></td>
<td valign="top" align="center">0.328</td>
<td valign="top" align="center">0.610</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">0.038</td>
<td valign="top" align="center">0.063</td>
<td valign="top" align="center">0.150</td>
<td valign="top" align="center">0.049</td>
<td valign="top" align="center">0.066</td>
<td valign="top" align="center"><bold>0.170</bold></td>
<td valign="top" align="center">0.192</td>
<td valign="top" align="center">0.091</td>
<td valign="top" align="center">0.205</td>
<td valign="top" align="center">0.175</td>
<td valign="top" align="center"><bold>0.206</bold></td>
<td valign="top" align="center"><bold>0.411</bold></td>
<td valign="top" align="center">0.210</td>
<td valign="top" align="center">0.380</td>
</tr>
<tr>
<td valign="top" align="left">1,008</td>
<td valign="top" align="center">1/2</td>
<td valign="top" align="center">0.449</td>
<td valign="top" align="center">0.339</td>
<td valign="top" align="center">0.711</td>
<td valign="top" align="center">0.714</td>
<td valign="top" align="center"><bold>0.763</bold></td>
<td valign="top" align="center">0.701</td>
<td valign="top" align="center">0.818</td>
<td valign="top" align="center">0.479</td>
<td valign="top" align="center"><bold>0.827</bold></td>
<td valign="top" align="center">0.747</td>
<td valign="top" align="center">0.800</td>
<td valign="top" align="center"><bold>0.957</bold></td>
<td valign="top" align="center">0.691</td>
<td valign="top" align="center">0.955</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/3</td>
<td valign="top" align="center">0.370</td>
<td valign="top" align="center">0.281</td>
<td valign="top" align="center"><bold>0.664</bold></td>
<td valign="top" align="center">0.585</td>
<td valign="top" align="center">0.632</td>
<td valign="top" align="center">0.645</td>
<td valign="top" align="center">0.769</td>
<td valign="top" align="center">0.415</td>
<td valign="top" align="center"><bold>0.780</bold></td>
<td valign="top" align="center">0.686</td>
<td valign="top" align="center">0.744</td>
<td valign="top" align="center"><bold>0.934</bold></td>
<td valign="top" align="center">0.633</td>
<td valign="top" align="center">0.929</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">1/6</td>
<td valign="top" align="center">0.142</td>
<td valign="top" align="center">0.149</td>
<td valign="top" align="center"><bold>0.406</bold></td>
<td valign="top" align="center">0.141</td>
<td valign="top" align="center">0.171</td>
<td valign="top" align="center">0.398</td>
<td valign="top" align="center">0.494</td>
<td valign="top" align="center">0.243</td>
<td valign="top" align="center"><bold>0.509</bold></td>
<td valign="top" align="center">0.404</td>
<td valign="top" align="center">0.457</td>
<td valign="top" align="center"><bold>0.733</bold></td>
<td valign="top" align="center">0.398</td>
<td valign="top" align="center">0.698</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic><italic>N</italic> = sample size, CL = cut point location, 1/2 = central cut point location, 1/3 = moderately non-central, 1/6 = strongly non-central. Best-performing methods are printed in bold.</italic></attrib>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S5.SS2.SSS4">
<title>Runtime</title>
<p>The computation time of the SEM trees in Simulation II was in line with the runtime observed in Simulation I. Overall, the median computation time of score-guided SEM trees was 0.50 s, with little variability across the simulation conditions. In contrast, the runtime of likelihood-ratio-guided trees varied considerably with the level of measurement of the covariate and noise variable. The overall median computation time for simulation conditions with an ordinal or dichotomous covariate and noise variable was 0.88 s for <italic>na&#x00EF;ve</italic> trees, 0.89 s for <italic>fair</italic> trees, and 1.40 s for max<italic>LR</italic> trees. However, if either the covariate or the noise variable was continuous, the overall median runtime increased drastically. For instance, for conditions with a continuous covariate, a continuous noise variable, and a sample size of 504, the computation time of <italic>na&#x00EF;ve</italic>, <italic>fair</italic>, and max<italic>LR</italic> trees increased to 70.18, 32.85, and 71.05 s, respectively. For samples with 1,008 individuals, the runtime increased further to 149.91, 72.23, and 280.58 s, respectively.</p>
</sec>
</sec>
<sec id="S5.SS3">
<title>Simulation III: Focus Parameters</title>
<p>The goal of Simulation III was to demonstrate how specific hypotheses about parameter heterogeneity, such as certain types of measurement invariance, can be tested with SEM trees with focus parameters. By default, SEM trees split with respect to differences in any model parameter. At times, researchers may be interested in finding group differences only in a subset of parameters, referred to as focus parameters in the <monospace>semtree</monospace> package. When focus parameters are specified, SEM trees assess heterogeneity only in these parameters and ignore group differences in the remaining parameters when evaluating a split.</p>
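The mechanics can be illustrated with a small, hypothetical sketch of a max-LM-type statistic computed from casewise model scores, ordered by the covariate: restricting the statistic to the score columns of the focus parameters makes the test sensitive only to instability in those parameters. This is a simplified illustration that assumes independent, standardized scores (the decorrelation by the information matrix used by the actual score-based tests is omitted); it is not the <monospace>semtree</monospace> implementation:

```python
import math

def max_lm_statistic(scores, columns=None):
    """Sketch of a max-LM statistic over all candidate cut points.

    scores: list of per-individual score vectors, ordered by the covariate.
    columns: indices of focus parameters; None means use all columns.
    """
    n = len(scores)
    cols = list(range(len(scores[0]))) if columns is None else list(columns)
    # Center and scale each selected column (decorrelation omitted).
    means = [sum(s[j] for s in scores) / n for j in cols]
    sds = [math.sqrt(sum((s[j] - means[k]) ** 2 for s in scores) / n) or 1.0
           for k, j in enumerate(cols)]
    best = 0.0
    cum = [0.0] * len(cols)            # cumulative score process
    for i in range(n - 1):             # candidate cut after individual i
        for k, j in enumerate(cols):
            cum[k] += (scores[i][j] - means[k]) / sds[k]
        t = (i + 1) / n
        stat = sum(c * c for c in cum) / (n * t * (1.0 - t))
        best = max(best, stat)
    return best
```

With a toy score matrix whose first column shifts between the two halves of the sample and whose second column fluctuates around zero, focusing on the first column yields a much larger statistic than focusing on the second, mirroring how focus parameters concentrate the test on the heterogeneous parameters.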
<p><xref ref-type="fig" rid="F4">Figure 4</xref> shows the population model used to generate data in Simulations III and IV. Depicted is a confirmatory factor model with two correlated factors, each measured by three indicators. A two-group population was simulated, in which both factors were uncorrelated in the first group (<italic>&#x03D5;</italic><sub><italic>g</italic>1</sub> = 0) and covaried with <italic>&#x03D5;</italic><sub><italic>g</italic>2</ssub> = 0.471 in the second group. The factor loading <italic>&#x03BB;</italic> did not vary in Simulation III and was set to 0.837 in both groups, corresponding to a latent factor accounting for 70% of the variance in the observed variables. In each simulation replication, we generated 250 individuals per group, resulting in a total sample size of 500. The model shown in <xref ref-type="fig" rid="F4">Figure 4</xref> was then estimated with a common identification constraint; specifically, the variances of the two factors <italic>f</italic><sub>1</sub> and <italic>f</italic><sub>2</sub> were fixed to one, and the latent covariance, all factor loadings, and all residual variances were estimated freely. We did not specify a mean structure. To recover the group difference, we provided max<italic>LR</italic> and max<italic>LM</italic> trees with a standard normally distributed informative covariate. The true cut point in the covariate was central.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Path diagram of the CFA model used in Simulation III and Simulation IV. The model consists of two correlated latent factors, each measured by three indicators. The latent covariance <italic>&#x03D5;</italic> differed between the two groups in Simulation III. In Simulation IV, there were also differences in the factor loading <italic>&#x03BB;</italic>.</p></caption>
<graphic xlink:href="fpsyg-11-564403-g004.tif"/>
</fig>
<p>We explored the following three scenarios:</p>
<list list-type="simple">
<list-item>
<label>&#x2022;</label>
<p><italic>Testing measurement invariance:</italic> We ignored heterogeneity in the latent covariance <italic>&#x03D5;</italic> and tested only the (homogeneous) measurement part of the model by treating all factor loadings and residual variances as focus parameters. This scenario can be seen as an exploration of which covariates predict violations of strict measurement invariance.</p>
</list-item>
<list-item>
<label>&#x2022;</label>
<p><italic>Testing the latent covariance:</italic> Only the (heterogeneous) latent covariance <italic>&#x03D5;</italic> was treated as a focus parameter, and the remaining (homogeneous) parameters did not contribute to the assessment of potential splits. This scenario is akin to ignoring the covariates&#x2019; information on violations of measurement invariance and, instead, investigating differences on the latent level only.</p>
</list-item>
<list-item>
<label>&#x2022;</label>
<p><italic>No focus parameters:</italic> No focus parameters were specified, and all parameters contributed to the evaluation of a potential split. This scenario served as a baseline.</p>
</list-item>
</list>
<p><xref ref-type="table" rid="T7">Table 7</xref> shows the percentage of max<italic>LR</italic> and max<italic>LM</italic> trees that split the sample and thereby rejected the respective null hypothesis at a significance level of 5%. In the measurement invariance scenario, the SEM trees tested the null hypothesis that all parameters of the measurement model are homogeneous. Both max<italic>LR</italic> and max<italic>LM</italic> trees yielded error rates close to the nominal rate of 5%. In other words, the SEM tree methods successfully ignored the group difference in the covariance structure of the latent variables. Without focus parameters, the SEM trees tested the standard null hypothesis of complete parameter equivalence across groups. The max<italic>LR</italic> and max<italic>LM</italic> trees rejected this null hypothesis in over 80% of the replications. If only the latent covariance <italic>&#x03D5;</italic> was declared a focus parameter, the power of both SEM trees to detect the group difference rose substantially, approaching one. This finding highlights that the sensitivity of SEM trees to heterogeneity in a specific set of target parameters can be enhanced considerably by specifying focus parameters, provided that differences in the non-focus parameters can be safely ignored.</p>
<table-wrap position="float" id="T7">
<label>TABLE 7</label>
<caption><p>Type I error and power to detect group differences.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Scenario</td>
<td valign="top" align="center">max<italic>LR</italic></td>
<td valign="top" align="center">max<italic>LM</italic></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Testing measurement invariance</td>
<td valign="top" align="center">4.86</td>
<td valign="top" align="center">4.85</td>
</tr>
<tr>
<td valign="top" align="left">Testing the latent covariance</td>
<td valign="top" align="center">99.10</td>
<td valign="top" align="center">99.18</td>
</tr>
<tr>
<td valign="top" align="left">No focus parameters</td>
<td valign="top" align="center">82.97</td>
<td valign="top" align="center">81.21</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>The first row shows the type I error rate; the second and third rows show the statistical power to detect group differences.</italic></attrib>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S5.SS4">
<title>Simulation IV: Global Equality Constraints</title>
<p>Simulation IV aimed at investigating the utility of SEM trees with equality constraints and at pointing out common pitfalls. Equality constraints are useful for incorporating prior knowledge about the homogeneity of specific parameters into a SEM tree. With a so-called global constraint, a parameter is estimated once in the full sample, and the resulting estimate is used in all submodels. Constraining parameters increases a SEM tree&#x2019;s sensitivity to group differences in the remaining parameters and might stabilize estimation. However, by erroneously constraining parameters to equality that actually differ between certain groups, a SEM tree can be severely misspecified.</p>
<p>We investigated the following conditions:</p>
<list list-type="simple">
<list-item>
<label>&#x2022;</label>
<p><italic>Group differences:</italic> We tested two types of group differences. Either the latent covariance differed between groups (<italic>&#x03D5;</italic><sub><italic>g</italic>1</sub> = 0, <italic>&#x03D5;</italic><sub><italic>g</italic>2</sub> = 0.471) and the factor loading was homogeneous (<italic>&#x03BB;</italic><sub><italic>g</italic>1/<italic>g</italic>2</sub> = 0.837), or the latent covariance was homogeneous (<italic>&#x03D5;</italic><sub><italic>g</italic>1/<italic>g</italic>2</sub> = 0.471) and the factor loading differed (<italic>&#x03BB;</italic><sub><italic>g</italic>1</sub> = 0.837, <italic>&#x03BB;</italic><sub><italic>g</italic>2</sub> = 0.640). All other values were as shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. We generated 250 individuals per subgroup and provided the SEM trees with a standard normal covariate with a central cut point.</p>
</list-item>
<list-item>
<label>&#x2022;</label>
<p><italic>Equality constraints:</italic> Either the heterogeneous parameter (the latent covariance <italic>&#x03D5;</italic> or the factor loading <italic>&#x03BB;</italic>), all factor loadings and residual variances of the factor <italic>f</italic><sub>2</sub>, or no parameters were constrained to equality.</p>
</list-item>
</list>
<p>The empirical power of max<italic>LR</italic> and max<italic>LM</italic> trees to detect heterogeneity at a significance level of 5% is shown in <xref ref-type="table" rid="T8">Table 8</xref>. Without equality constraints, both SEM tree methods showed a power slightly above 80%. As expected, constraining a heterogeneous parameter reduced the power considerably, but the exact effect depended on the specific parameter. After constraining the heterogeneous latent covariance <italic>&#x03D5;</italic>, there was no way for the trees to detect a difference between groups: the latent covariance <italic>&#x03D5;</italic> was the only parameter associated with the correlation between the factors, leaving the group difference no other parameter in which to manifest. As a result, the power of max<italic>LR</italic> and max<italic>LM</italic> trees dropped to the type I error level. However, after constraining the heterogeneous factor loading <italic>&#x03BB;</italic>, the trees still found significant group differences in 24.89% (max<italic>LR</italic>) and 22.65% (max<italic>LM</italic>) of the replications. This finding implies that the group difference in the factor loading <italic>&#x03BB;</italic> was picked up by other, unconstrained parameters of the measurement model. Finally, constraining the homogeneous parameters of the measurement model of factor <italic>f</italic><sub>2</sub> increased the power to detect differences in the latent covariance or the factor loading by roughly 10 percentage points.</p>
<table-wrap position="float" id="T8">
<label>TABLE 8</label>
<caption><p>Power to detect group differences.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left"><bold>Equality constraints</bold></td>
<td valign="top" align="center"><bold>max<italic>LR</italic></bold></td>
<td valign="top" align="center"><bold>max<italic>LM</italic></bold></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="3"><bold><italic>Group difference in the latent covariance &#x03D5;</italic></bold></td>
</tr>
<tr>
<td valign="top" align="left">None</td>
<td valign="top" align="center">82.52</td>
<td valign="top" align="center">81.13</td>
</tr>
<tr>
<td valign="top" align="left">Latent covariance <italic>&#x03D5;</italic></td>
<td valign="top" align="center">5.71</td>
<td valign="top" align="center">5.52</td>
</tr>
<tr>
<td valign="top" align="left">Measurement model of <italic>f</italic><sub>2</sub></td>
<td valign="top" align="center">91.25</td>
<td valign="top" align="center">90.40</td>
</tr>
<tr>
<td valign="top" align="left" colspan="3"><bold><italic>Group difference in the factor loading</italic> &#x03BB;</bold></td>
</tr>
<tr>
<td valign="top" align="left">None</td>
<td valign="top" align="center">83.20</td>
<td valign="top" align="center">81.44</td>
</tr>
<tr>
<td valign="top" align="left">Factor loading &#x03BB;</td>
<td valign="top" align="center">24.89</td>
<td valign="top" align="center">22.65</td>
</tr>
<tr>
<td valign="top" align="left">Measurement model of <italic>f</italic><sub>2</sub></td>
<td valign="top" align="center">91.52</td>
<td valign="top" align="center">90.92</td>
</tr>
</tbody>
</table></table-wrap>
<p>The results of Simulation IV suggest that constraining homogeneous parameters to equality can increase the power of SEM trees but also carries the risk of introducing severe misspecification. Constraining a heterogeneous parameter leads to a distorted picture of group differences, as the SEM tree may or may not find significant heterogeneity in other parameters that are actually homogeneous across subgroups. Thus, it seems generally advisable to fully explore differences in all parameters or to use focus parameters rather than to risk misspecifying trees through inadequate equality constraints.</p>
</sec>
<sec id="S5.SS5">
<title>Summary</title>
<p>In our simulation studies, the likelihood-ratio-guided <italic>na&#x00EF;ve</italic> trees showed mixed results, confirming the known weaknesses of the approach. When provided with ordinal or dichotomous covariates, <italic>na&#x00EF;ve</italic> trees showed adequate control of type I errors and were among the best-performing methods in terms of power to detect heterogeneity and group recovery. However, with continuous covariates, <italic>na&#x00EF;ve</italic> trees were overly conservative, resulting in too few type I errors and low power. The likelihood-ratio-guided <italic>fair</italic> trees showed the lowest power of all methods overall, resulting in poor group recovery. In contrast to <italic>na&#x00EF;ve</italic> trees, the type I error rate of <italic>fair</italic> trees was close to optimal, regardless of the measurement level of the provided covariates. Therefore, <italic>fair</italic> trees may be useful in very large samples where low power is less of an issue. The likelihood-ratio-guided max<italic>LR</italic> trees, which we implemented in the <monospace>semtree</monospace> package, resolved many of the weaknesses of the classical SEM tree methods <italic>na&#x00EF;ve</italic> and <italic>fair</italic> and positioned themselves slightly above the score-guided max<italic>LM</italic> trees in terms of power, group recovery, and cut point precision. max<italic>LR</italic> trees and the score-guided max<italic>LM</italic> trees (for continuous covariates) and max<italic>LM</italic><sub><italic>O</italic></sub> trees (for ordinal covariates) exceeded the other split selection approaches in conditions with group differences in multiple parameters associated with the random effects and with non-central cut points. SEM trees guided by the score-based <italic>DM</italic> (for continuous covariates) and <italic>WDM</italic> (for ordinal covariates) test statistics outperformed the other methods in terms of power and group recovery when the group difference was confined to a single parameter, the fixed slope. Unlike <italic>DM</italic> trees, <italic>WDM</italic> trees also remained sensitive under non-central cut points. Score-guided <italic>CvM</italic> trees performed better than the other methods in detecting heterogeneity in the random effects when the cut point was central. Finally, the score-based <italic>LM</italic> trees for categorical covariates were slightly less powerful than the <italic>na&#x00EF;ve</italic> method. Although max<italic>LM</italic><sub><italic>O</italic></sub> and <italic>LM</italic> trees were roughly on par with <italic>na&#x00EF;ve</italic> trees, the score-based methods clearly outperformed <italic>na&#x00EF;ve</italic> trees when provided with an additional continuous noise variable. Overall, all score-guided trees and the newly implemented max<italic>LR</italic> trees showed satisfactory control of type I errors. The most striking difference between likelihood-ratio-guided and score-guided SEM trees was the runtime: whereas the runtime of all likelihood-ratio-based methods was excessive if one of the covariates under evaluation was continuous, score-guided trees were computed quickly. In summary, all newly implemented methods (max<italic>LR</italic> and the score-based methods) outperformed the original <italic>na&#x00EF;ve</italic> and <italic>fair</italic> methods. Moreover, no single method performed best across all situations, and each of the new methods had unique advantages that may justify its use under certain conditions.</p>
<p>Regarding focus parameters and equality constraints, we found that both can be applied successfully to increase the power of SEM trees to detect group differences if either a clear set of target parameters or prior knowledge about homogeneous parameters is available. Still, we discourage the use of equality constraints in favor of focus parameters, which allow exploring the effects of selected parameters without incurring misspecifications during split evaluation.</p>
</sec>
</sec>
<sec id="S6">
<title>Discussion</title>
<p>In the present study, we introduced score-guided SEM trees as a fast and efficient way of growing SEM trees. Along with score-guided SEM trees, we also implemented a new likelihood-ratio-guided split selection based on the max<italic>LR</italic> statistic that resolved many of the shortcomings of the original likelihood-ratio-guided SEM trees (<xref ref-type="bibr" rid="B7">Brandmaier et al., 2013b</xref>). We evaluated and compared the newly implemented and the original SEM tree approaches in a Monte Carlo simulation study, focusing on cases in which users want to adjust the type I error rate for the multiple testing of covariates. Overall, we conclude that the new split selection procedures are superior to the original ones because they have higher statistical power and are unbiased in the selection of covariates that predict group differences in SEM parameters. Among the SEM tree methods, score-guided trees stand out due to their computational efficiency, which makes the use of SEM trees feasible in large data sets.</p>
<p>Our simulation studies evaluated three different likelihood-ratio-guided SEM tree approaches and five different score-guided SEM trees. The score-guided SEM trees were based on test statistics recently popularized in psychometrics by <xref ref-type="bibr" rid="B31">Merkle and Zeileis (2013)</xref> and <xref ref-type="bibr" rid="B30">Merkle et al. (2014)</xref> for studying heterogeneity in SEM parameters. When provided with continuous covariates, we found that guiding SEM trees with score-based tests reduced the runtime of the trees substantially. If solely provided with ordinal and categorical covariates, score-guided SEM trees performed as well as likelihood-ratio-guided SEM trees. The large difference in runtime between the two approaches for continuous covariates can be attributed to the fact that the evaluation of a covariate by a likelihood-ratio-guided SEM tree requires the estimation of an MGSEM for every unique value of the covariate. This leads to a large number of MGSEMs because most values of a continuous covariate are unique. Score-guided SEM trees, in contrast, do not require the estimation of any MGSEMs for the evaluation of a covariate and can therefore be computed in little time.</p>
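<p>To make the score-based evaluation concrete, the following sketch illustrates the core computation for a single parameter and a continuous covariate: per-observation scores are ordered by the covariate, accumulated into a scaled cumulative score process, and the largest fluctuation over candidate cut points yields a sup-<italic>LM</italic>-type statistic in the spirit of the max<italic>LM</italic> test. This is our own simplified, univariate illustration, not code from the <monospace>semtree</monospace> package; in particular, the information matrix is replaced by a crude variance estimate.</p>

```python
import numpy as np

def max_lm_statistic(scores, covariate, trim=0.1):
    """Univariate sketch of a sup-LM-type statistic over candidate cut points.

    scores:    per-observation model scores for a single parameter
    covariate: continuous covariate used to order the observations
    trim:      boundary fraction excluded from the supremum, as is customary
    """
    order = np.argsort(covariate)
    s = scores[order] - scores.mean()           # order by covariate, center
    n = len(s)
    info = np.sum(s ** 2) / n                   # crude information estimate
    process = np.cumsum(s) / np.sqrt(n * info)  # scaled cumulative score process
    t = np.arange(1, n) / n                     # candidate split proportions
    lm = process[:-1] ** 2 / (t * (1 - t))      # LM statistic at each cut point
    keep = (t >= trim) & (t <= 1 - trim)
    return lm[keep].max()

rng = np.random.default_rng(1)
x = rng.normal(size=200)
s_null = rng.normal(size=200)      # scores unrelated to the covariate
s_shift = s_null + 1.0 * (x > 0)   # mean shift in the scores for x > 0
stat_null = max_lm_statistic(s_null, x)
stat_shift = max_lm_statistic(s_shift, x)
```

<p>Note that no multiple-group model is fitted at any point, which is why this evaluation remains fast even when every covariate value is unique.</p>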
<p>Score-guided SEM trees and likelihood-ratio-guided trees based on the newly implemented max<italic>LR</italic> statistic also proved to be more powerful in detecting group differences than the original SEM tree methods if one of the covariates provided to the SEM tree was continuous. The low statistical power of the original SEM tree implementation is a side effect of a suboptimal correction of the selection bias. For the <italic>na&#x00EF;ve</italic> method, the low power with continuous covariates can be explained by the overcorrection of the Bonferroni-adjusted <italic>p</italic>-values due to the large number of possible cut points in continuous variables. The consistently low power of the <italic>fair</italic> method across all simulation conditions results from its use of only half of the sample for selecting the best covariate.</p>
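<p>The overcorrection can be seen with a back-of-the-envelope calculation. In the <italic>na&#x00EF;ve</italic> approach, the best cut point's <italic>p</italic>-value is Bonferroni-adjusted for the number of candidate cut points, and a continuous covariate with <italic>n</italic> distinct values offers <italic>n</italic> &#x2212; 1 of them. The numbers below are illustrative, not taken from our simulations:</p>

```python
def bonferroni_adjust(p, n_tests):
    """Bonferroni-adjusted p-value: raw p times the number of tests, capped at 1."""
    return min(1.0, p * n_tests)

p_best = 0.01  # raw p-value at the best cut point
# dichotomous covariate: one candidate cut point, the adjustment changes nothing
p_dicho = bonferroni_adjust(p_best, 1)    # 0.01
# continuous covariate with 500 distinct values: 499 candidate cut points
p_cont = bonferroni_adjust(p_best, 499)   # 1.0, no split at alpha = 0.05
```

<p>The same raw evidence thus survives the adjustment for a dichotomous covariate but is wiped out for a continuous one, which matches the overly conservative behavior of the <italic>na&#x00EF;ve</italic> trees we observed.</p>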
<p>Besides evaluating the original and newly proposed SEM tree methods, we also demonstrated the utility and pitfalls of trees with focus parameters and equality constraints. We showed that focus parameters are well suited to investigating specific hypotheses about parameter heterogeneity, such as different types of measurement invariance. Specifying equality constraints for homogeneous parameters increased the power of SEM trees to detect group differences in the remaining parameters. However, we also demonstrated that misspecified equality constraints can obscure group differences, and we therefore discourage this approach. As the effect of misspecified equality constraints can be hard to predict, we recommend exploring differences in all parameters or using focus parameters rather than risking misspecified trees through inadequate equality constraints.</p>
<p>The faster runtime of score-based tests is a major advantage for practical use and enables the wider adoption of SEM trees. The slow runtime of likelihood-ratio tests had made SEM trees unattractive, if not impossible, to run with large data sets on desktop computers. The runtime improvement becomes even more important if one wishes to complement SEM tree inferences with resampling methods such as SEM forests (<xref ref-type="bibr" rid="B8">Brandmaier et al., 2016</xref>). SEM forests are a more robust alternative to single SEM trees if the overall importance of variables is of primary interest, because small variations in the sample often lead to different trees. As SEM forests are based on hundreds if not thousands of trees, they profit dramatically from the score-guided strategy.</p>
<p>The question remains which of the newly implemented methods should be used to estimate SEM trees. Our simulation results imply that each of the new methods has its unique strengths. In practice, however, it is usually unknown how many of the model parameters are heterogeneous or whether the subgroups are roughly equal in size, so the advantages of the <italic>DM</italic>, <italic>WDM</italic>, and <italic>CvM</italic> statistics seem hard to exploit. Instead, the max<italic>LR</italic> (if computationally feasible), max<italic>LM</italic>, max<italic>LM</italic><sub><italic>O</italic></sub>, and <italic>LM</italic> statistics are best suited for situations without <italic>a priori</italic> knowledge about potential group differences. Moreover, if one is only interested in changes in a specific parameter, specifying a focus parameter may represent an excellent alternative to the <italic>DM</italic> and <italic>WDM</italic> statistics.</p>
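<p>The effect of a focus parameter can also be sketched in the score-based framework: instead of aggregating the score fluctuations over all model parameters, the statistic is computed only from the score columns of the parameters of interest, so the signal is not diluted by homogeneous parameters. The following is a hypothetical illustration with a fixed split and a crude diagonal information estimate, not the <monospace>semtree</monospace> implementation:</p>

```python
import numpy as np

def lm_statistic(scores, split):
    """LM-type statistic for a fixed split, aggregated over all score columns."""
    s = scores - scores.mean(axis=0)      # center each parameter's scores
    n = s.shape[0]
    g = s[:split].sum(axis=0)             # cumulative scores at the split
    info = (s ** 2).sum(axis=0) / n       # crude per-parameter information
    t = split / n
    return float(np.sum(g ** 2 / (n * info) / (t * (1 - t))))

def lm_statistic_focus(scores, focus_cols, split):
    """The same statistic restricted to the focus parameters' score columns."""
    return lm_statistic(scores[:, focus_cols], split)

rng = np.random.default_rng(7)
scores = rng.normal(size=(200, 3))   # per-observation scores for 3 parameters
scores[100:, 0] += 1.0               # heterogeneity only in the first parameter
full = lm_statistic(scores, 100)
focused = lm_statistic_focus(scores, [0], 100)
```

<p>With a diagonal information estimate, the full statistic decomposes into a sum of non-negative per-parameter terms; restricting the test to the genuinely heterogeneous parameter concentrates the same signal on fewer degrees of freedom.</p>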
<p>Although SEM trees are a powerful and flexible method for investigating heterogeneity in SEMs, we want to stress that they are not always the most appropriate method. It is important to note that the performance of SEM trees depends on the covariates available. If none of these covariates is in any way related to group differences, SEM trees will fail to detect any heterogeneity. In situations without informative covariates, researchers may resort to latent class or finite mixture models (<xref ref-type="bibr" rid="B19">Jedidi et al., 1997</xref>; <xref ref-type="bibr" rid="B33">Muth&#x00E9;n and Shedden, 1999</xref>; <xref ref-type="bibr" rid="B27">Lubke and Muth&#x00E9;n, 2005</xref>) for detecting heterogeneity. Latent class approaches automatically test for differences between all possible groups of individuals without requiring covariates; a disadvantage of these methods is that the number of subgroups needs to be pre-specified by the user. Another limitation of SEM trees is that they provide only sparse information about how a parameter changes with respect to a covariate. Recently, <xref ref-type="bibr" rid="B3">Arnold et al. (2019)</xref> proposed a framework called individual parameter contribution regression that allows modeling SEM parameter estimates as a linear function of covariates.</p>
<p>Our study has several limitations. First, we focused narrowly on the <monospace>semtree</monospace> package for growing SEM trees and did not evaluate SEM trees estimated by the generic MOB algorithm from the <monospace>partykit</monospace> package. Ideally, a future study should aim to replicate our findings using MOB. Second, most of our simulations were performed using a linear latent growth curve model with only two types of group differences. Different types of SEMs or parameter differences could plausibly have changed the performance of some of the methods under investigation; however, we would expect the general pattern of results to hold for other models as well. Third, for the sake of simplicity, we tested only a small number of uncorrelated covariates and did not test any covariate interactions. Fourth, we did not assess the influence of non-normally distributed data and model misspecification on SEM trees. These remain topics for future research.</p>
<p>In summary, we found score-guided SEM trees to be fast, flexible, and powerful tools for investigating heterogeneity in SEM parameters. Based on our work, we suggest that score-guided split selection should become the new standard for estimating SEM trees and forests.</p>
</sec>
<sec id="S7">
<title>Data Availability Statement</title>
<p>The simulated data presented in this study can be found in the following online repository: <ext-link ext-link-type="uri" xlink:href="https://osf.io/k82y3/">https://osf.io/k82y3/</ext-link>.</p>
</sec>
<sec id="S8">
<title>Author Contributions</title>
<p>MA programmed the score-guided SEM tree implementation with the support of AB. MA designed and carried out the simulation study and wrote the manuscript with the support of MV and AB. MV revised the presentation of score-based tests and supervised the project. AB conceived the original idea and provided crucial code to link the score-guided SEM tree implementation to the existing <monospace>semtree</monospace> R package. All authors discussed the results and commented on the manuscript.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ack>
<p>We acknowledge support by the German Research Foundation (DFG) and the Open Access Publication Fund of Humboldt-Universit&#x00E4;t zu Berlin.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ammerman</surname> <given-names>B. A.</given-names></name> <name><surname>Jacobucci</surname> <given-names>R.</given-names></name> <name><surname>McCloskey</surname> <given-names>M. S.</given-names></name></person-group> (<year>2019</year>). <article-title>Reconsidering important outcomes of the nonsuicidal self-injury disorder diagnostic criterion A.</article-title> <source><italic>J. Clin. Psychol.</italic></source> <volume>75</volume> <fpage>1084</fpage>&#x2013;<lpage>1097</lpage>. <pub-id pub-id-type="doi">10.1002/jclp.22754</pub-id> <pub-id pub-id-type="pmid">30735571</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Andrews</surname> <given-names>D. W. K.</given-names></name></person-group> (<year>1993</year>). <article-title>Tests for parameter instability and structural change with unknown change point.</article-title> <source><italic>Econometrica</italic></source> <volume>61</volume> <fpage>821</fpage>&#x2013;<lpage>856</lpage>. <pub-id pub-id-type="doi">10.2307/2951764</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arnold</surname> <given-names>M.</given-names></name> <name><surname>Oberski</surname> <given-names>D. L.</given-names></name> <name><surname>Brandmaier</surname> <given-names>A. M.</given-names></name> <name><surname>Voelkle</surname> <given-names>M. C.</given-names></name></person-group> (<year>2019</year>). <article-title>Identifying heterogeneity in dynamic panel models with individual parameter contribution regression.</article-title> <source><italic>Struct. Equ. Modeling</italic></source> <volume>27</volume> <fpage>613</fpage>&#x2013;<lpage>628</lpage>. <pub-id pub-id-type="doi">10.1080/10705511.2019.1667240</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bollen</surname> <given-names>K. A.</given-names></name></person-group> (<year>1989</year>). <source><italic>Structural Equations with Latent Variables.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Wiley</publisher-name>, <pub-id pub-id-type="doi">10.1002/9781118619179</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brandmaier</surname> <given-names>A. M.</given-names></name> <name><surname>Driver</surname> <given-names>C. C.</given-names></name> <name><surname>Voelkle</surname> <given-names>M. C.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>Recursive partitioning in continuous time analysis</article-title>,&#x201D; in <source><italic>Continuous Time Modeling in the Behavioral and Related Sciences</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>van Montfort</surname> <given-names>K.</given-names></name> <name><surname>Oud</surname> <given-names>J. H. L.</given-names></name> <name><surname>Voelkle</surname> <given-names>M. C.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>259</fpage>&#x2013;<lpage>282</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-77219-6_11</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brandmaier</surname> <given-names>A. M.</given-names></name> <name><surname>von Oertzen</surname> <given-names>T.</given-names></name> <name><surname>McArdle</surname> <given-names>J. J.</given-names></name> <name><surname>Lindenberger</surname> <given-names>U.</given-names></name></person-group> (<year>2013a</year>). &#x201C;<article-title>Exploratory data mining with structural equation model trees</article-title>,&#x201D; in <source><italic>Quantitative Methodology Series. Contemporary Issues in Exploratory Data Mining in the Behavioral Sciences</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>McArdle</surname> <given-names>J. J.</given-names></name> <name><surname>Ritschard</surname> <given-names>G.</given-names></name></person-group> (<publisher-loc>London</publisher-loc>: <publisher-name>Routledge</publisher-name>), <fpage>96</fpage>&#x2013;<lpage>127</lpage>.</citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brandmaier</surname> <given-names>A. M.</given-names></name> <name><surname>von Oertzen</surname> <given-names>T.</given-names></name> <name><surname>McArdle</surname> <given-names>J. J.</given-names></name> <name><surname>Lindenberger</surname> <given-names>U.</given-names></name></person-group> (<year>2013b</year>). <article-title>Structural equation model trees.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>18</volume> <fpage>71</fpage>&#x2013;<lpage>86</lpage>. <pub-id pub-id-type="doi">10.1037/a0030001</pub-id> <pub-id pub-id-type="pmid">22984789</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brandmaier</surname> <given-names>A. M.</given-names></name> <name><surname>Prindle</surname> <given-names>J. J.</given-names></name> <name><surname>McArdle</surname> <given-names>J. J.</given-names></name> <name><surname>Lindenberger</surname> <given-names>U.</given-names></name></person-group> (<year>2016</year>). <article-title>Theory-guided exploration with structural equation model forests.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>21</volume> <fpage>566</fpage>&#x2013;<lpage>582</lpage>. <pub-id pub-id-type="doi">10.1037/met0000090</pub-id> <pub-id pub-id-type="pmid">27918182</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brandmaier</surname> <given-names>A. M.</given-names></name> <name><surname>Ram</surname> <given-names>N.</given-names></name> <name><surname>Wagner</surname> <given-names>G. G.</given-names></name> <name><surname>Gerstorf</surname> <given-names>D.</given-names></name></person-group> (<year>2017</year>). <article-title>Terminal decline in well-being: the role of multi-indicator constellations of physical health and psychosocial correlates.</article-title> <source><italic>Dev. Psychol.</italic></source> <volume>53</volume> <fpage>996</fpage>&#x2013;<lpage>1012</lpage>. <pub-id pub-id-type="doi">10.1037/dev0000274</pub-id> <pub-id pub-id-type="pmid">28459278</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brown</surname> <given-names>T. A.</given-names></name></person-group> (<year>2015</year>). <source><italic>Confirmatory Factor Analysis for Applied Researchers</italic></source>, <edition>2nd Edn</edition>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Guilford</publisher-name>.</citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Mooij</surname> <given-names>S. M. M.</given-names></name> <name><surname>Henson</surname> <given-names>R. N. A.</given-names></name> <name><surname>Waldorp</surname> <given-names>L. J.</given-names></name> <name><surname>Kievit</surname> <given-names>R. A.</given-names></name></person-group> (<year>2018</year>). <article-title>Age differentiation within gray matter, white matter, and between memory and white matter in an adult life span cohort.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>38</volume> <fpage>5826</fpage>&#x2013;<lpage>5836</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.1627-17.2018</pub-id> <pub-id pub-id-type="pmid">29848485</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fokkema</surname> <given-names>M.</given-names></name> <name><surname>Smits</surname> <given-names>N.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name> <name><surname>Hothorn</surname> <given-names>T.</given-names></name> <name><surname>Kelderman</surname> <given-names>H.</given-names></name></person-group> (<year>2018</year>). <article-title>Detecting treatment-subgroup interactions in clustered data with generalized linear mixed-effects model trees.</article-title> <source><italic>Behav. Res. Methods</italic></source> <volume>50</volume> <fpage>2016</fpage>&#x2013;<lpage>2034</lpage>. <pub-id pub-id-type="doi">10.3758/s13428-017-0971-x</pub-id> <pub-id pub-id-type="pmid">29071652</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hansen</surname> <given-names>B. E.</given-names></name></person-group> (<year>1992</year>). <article-title>Testing for parameter instability in linear models.</article-title> <source><italic>J. Policy Model.</italic></source> <volume>14</volume> <fpage>517</fpage>&#x2013;<lpage>533</lpage>. <pub-id pub-id-type="doi">10.1016/0161-8938(92)90019-9</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hildebrandt</surname> <given-names>A.</given-names></name> <name><surname>L&#x00FC;dtke</surname> <given-names>O.</given-names></name> <name><surname>Robitzsch</surname> <given-names>A.</given-names></name> <name><surname>Sommer</surname> <given-names>C.</given-names></name> <name><surname>Wilhelm</surname> <given-names>O.</given-names></name></person-group> (<year>2016</year>). <article-title>Exploring factor model parameters across continuous variables with local structural equation models.</article-title> <source><italic>Multivar. Behav. Res.</italic></source> <volume>51</volume> <fpage>257</fpage>&#x2013;<lpage>258</lpage>. <pub-id pub-id-type="doi">10.1080/00273171.2016.1142856</pub-id> <pub-id pub-id-type="pmid">27049892</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hjort</surname> <given-names>N. L.</given-names></name> <name><surname>Koning</surname> <given-names>A.</given-names></name></person-group> (<year>2002</year>). <article-title>Tests for constancy of model parameters over time.</article-title> <source><italic>J. Nonparametr. Stat.</italic></source> <volume>14</volume> <fpage>113</fpage>&#x2013;<lpage>132</lpage>. <pub-id pub-id-type="doi">10.1080/10485250211394</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hothorn</surname> <given-names>T.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>partykit: a modular toolkit for recursive partytioning in R.</article-title> <source><italic>J. Mach. Learn. Res.</italic></source> <volume>16</volume> <fpage>3905</fpage>&#x2013;<lpage>3909</lpage>.</citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hubert</surname> <given-names>L.</given-names></name> <name><surname>Arabie</surname> <given-names>P.</given-names></name></person-group> (<year>1985</year>). <article-title>Comparing partitions.</article-title> <source><italic>J. Classif.</italic></source> <volume>2</volume> <fpage>193</fpage>&#x2013;<lpage>218</lpage>. <pub-id pub-id-type="doi">10.1007/BF01908075</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jacobucci</surname> <given-names>R.</given-names></name> <name><surname>Grimm</surname> <given-names>K. J.</given-names></name> <name><surname>McArdle</surname> <given-names>J. J.</given-names></name></person-group> (<year>2017</year>). <article-title>A comparison of methods for uncovering sample heterogeneity: structural equation model trees and finite mixture models.</article-title> <source><italic>Struct. Equ. Modeling</italic></source> <volume>24</volume> <fpage>270</fpage>&#x2013;<lpage>282</lpage>. <pub-id pub-id-type="doi">10.1080/10705511.2016.1250637</pub-id> <pub-id pub-id-type="pmid">29225453</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jedidi</surname> <given-names>K.</given-names></name> <name><surname>Jagpal</surname> <given-names>H. S.</given-names></name> <name><surname>DeSarbo</surname> <given-names>W. S.</given-names></name></person-group> (<year>1997</year>). <article-title>Finite-mixture structural equation models for response-based segmentation and unobserved heterogeneity.</article-title> <source><italic>Mark. Sci.</italic></source> <volume>16</volume> <fpage>39</fpage>&#x2013;<lpage>59</lpage>. <pub-id pub-id-type="doi">10.1287/mksc.16.1.39</pub-id> <pub-id pub-id-type="pmid">19642375</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jensen</surname> <given-names>D. D.</given-names></name> <name><surname>Cohen</surname> <given-names>P. R.</given-names></name></person-group> (<year>2000</year>). <article-title>Multiple comparison in induction algorithms.</article-title> <source><italic>Mach. Learn.</italic></source> <volume>38</volume> <fpage>309</fpage>&#x2013;<lpage>338</lpage>. <pub-id pub-id-type="doi">10.1023/A:1007631014630</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jones</surname> <given-names>P. J.</given-names></name> <name><surname>Mair</surname> <given-names>P.</given-names></name> <name><surname>Simon</surname> <given-names>T.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>Network trees: a method for recursively partitioning covariance structures.</article-title> <source><italic>Psychometrika</italic></source> <pub-id pub-id-type="doi">10.1007/s11336-020-09731-4</pub-id> Online ahead of print <pub-id pub-id-type="pmid">33146786</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kievit</surname> <given-names>R. A.</given-names></name> <name><surname>Frankenhuis</surname> <given-names>W. E.</given-names></name> <name><surname>Waldorp</surname> <given-names>L. J.</given-names></name> <name><surname>Borsboom</surname> <given-names>D.</given-names></name></person-group> (<year>2013</year>). <article-title>Simpson&#x2019;s paradox in psychological science: a practical guide.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>4</volume>:<issue>513</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2013.00513</pub-id> <pub-id pub-id-type="pmid">23964259</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kline</surname> <given-names>R. B.</given-names></name></person-group> (<year>2016</year>). <source><italic>Principles and Practice of Structural Equation Modeling</italic></source>, <edition>4th Edn</edition>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Guilford</publisher-name>.</citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Komboz</surname> <given-names>B.</given-names></name> <name><surname>Strobl</surname> <given-names>C.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2018</year>). <article-title>Tree-based global model tests for polytomous Rasch models.</article-title> <source><italic>Educ. Psychol. Meas.</italic></source> <volume>78</volume> <fpage>128</fpage>&#x2013;<lpage>166</lpage>. <pub-id pub-id-type="doi">10.1177/0013164416664394</pub-id> <pub-id pub-id-type="pmid">29795950</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lang</surname> <given-names>M. N.</given-names></name> <name><surname>Schlosser</surname> <given-names>L.</given-names></name> <name><surname>Hothorn</surname> <given-names>T.</given-names></name> <name><surname>Mayr</surname> <given-names>G. J.</given-names></name> <name><surname>Stauffer</surname> <given-names>R.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>Circular regression trees and forests with an application to probabilistic wind direction forecasting.</article-title> <source><italic>J. R. Stat. Soc. C</italic></source> <volume>69</volume> <fpage>1357</fpage>&#x2013;<lpage>1374</lpage>. <pub-id pub-id-type="doi">10.1111/rssc.12437</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Loh</surname> <given-names>W.-Y.</given-names></name> <name><surname>Shih</surname> <given-names>Y.-S.</given-names></name></person-group> (<year>1997</year>). <article-title>Split selection methods for classification trees.</article-title> <source><italic>Stat. Sin.</italic></source> <volume>7</volume> <fpage>815</fpage>&#x2013;<lpage>840</lpage>.</citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lubke</surname> <given-names>G. H.</given-names></name> <name><surname>Muth&#x00E9;n</surname> <given-names>B.</given-names></name></person-group> (<year>2005</year>). <article-title>Investigating population heterogeneity with factor mixture models.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>10</volume> <fpage>21</fpage>&#x2013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1037/1082-989X.10.1.21</pub-id> <pub-id pub-id-type="pmid">15810867</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McArdle</surname> <given-names>J. J.</given-names></name></person-group> (<year>2012</year>). &#x201C;<article-title>Latent curve modeling of longitudinal growth data</article-title>,&#x201D; in <source><italic>Handbook of Structural Equation Modeling</italic></source>, <role>ed.</role> <person-group person-group-type="editor"><name><surname>Hoyle</surname> <given-names>R. H.</given-names></name></person-group> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Guilford Press</publisher-name>), <fpage>547</fpage>&#x2013;<lpage>570</lpage>.</citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McArdle</surname> <given-names>J. J.</given-names></name> <name><surname>Epstein</surname> <given-names>D.</given-names></name></person-group> (<year>1987</year>). <article-title>Latent growth curves within developmental structural equation models.</article-title> <source><italic>Child Dev.</italic></source> <volume>58</volume>:<issue>110</issue>. <pub-id pub-id-type="doi">10.2307/1130295</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Merkle</surname> <given-names>E. C.</given-names></name> <name><surname>Fan</surname> <given-names>J.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <article-title>Testing for measurement invariance with respect to an ordinal variable.</article-title> <source><italic>Psychometrika</italic></source> <volume>79</volume> <fpage>569</fpage>&#x2013;<lpage>584</lpage>. <pub-id pub-id-type="doi">10.1007/S11336-013-9376-7</pub-id> <pub-id pub-id-type="pmid">24282129</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Merkle</surname> <given-names>E. C.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2013</year>). <article-title>Tests of measurement invariance without subgroups: a generalization of classical methods.</article-title> <source><italic>Psychometrika</italic></source> <volume>78</volume> <fpage>59</fpage>&#x2013;<lpage>82</lpage>. <pub-id pub-id-type="doi">10.1007/S11336-012-9302-4</pub-id> <pub-id pub-id-type="pmid">25107518</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Milligan</surname> <given-names>G. W.</given-names></name> <name><surname>Cooper</surname> <given-names>M. C.</given-names></name></person-group> (<year>1986</year>). <article-title>A study of the comparability of external criteria for hierarchical cluster analysis.</article-title> <source><italic>Multiv. Behav. Res.</italic></source> <volume>21</volume> <fpage>441</fpage>&#x2013;<lpage>458</lpage>. <pub-id pub-id-type="doi">10.1207/s15327906mbr2104_5</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Muth&#x00E9;n</surname> <given-names>B.</given-names></name> <name><surname>Shedden</surname> <given-names>K.</given-names></name></person-group> (<year>1999</year>). <article-title>Finite mixture modeling with mixture outcomes using the EM algorithm.</article-title> <source><italic>Biometrics</italic></source> <volume>55</volume> <fpage>463</fpage>&#x2013;<lpage>469</lpage>. <pub-id pub-id-type="doi">10.1111/j.0006-341X.1999.00463.x</pub-id> <pub-id pub-id-type="pmid">11318201</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Neale</surname> <given-names>M. C.</given-names></name> <name><surname>Hunter</surname> <given-names>M. D.</given-names></name> <name><surname>Pritikin</surname> <given-names>J. N.</given-names></name> <name><surname>Zahery</surname> <given-names>M.</given-names></name> <name><surname>Brick</surname> <given-names>T. R.</given-names></name> <name><surname>Kirkpatrick</surname> <given-names>R. M.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>OpenMx 2.0: extended structural equation and statistical modeling.</article-title> <source><italic>Psychometrika</italic></source> <volume>81</volume> <fpage>535</fpage>&#x2013;<lpage>549</lpage>. <pub-id pub-id-type="doi">10.1007/s11336-014-9435-8</pub-id> <pub-id pub-id-type="pmid">25622929</pub-id></citation></ref>
<ref id="B35"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Quinlan</surname> <given-names>J. R.</given-names></name></person-group> (<year>1993</year>). <source><italic>C4.5: Programs for Machine Learning.</italic></source> <publisher-loc>San Francisco, CA</publisher-loc>: <publisher-name>Morgan Kaufmann</publisher-name>.</citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosseel</surname> <given-names>Y.</given-names></name></person-group> (<year>2012</year>). <article-title>lavaan: An R package for structural equation modeling.</article-title> <source><italic>J. Stat. Softw.</italic></source> <volume>48</volume> <fpage>1</fpage>&#x2013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v048.i02</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serang</surname> <given-names>S.</given-names></name> <name><surname>Jacobucci</surname> <given-names>R.</given-names></name> <name><surname>Stegmann</surname> <given-names>G.</given-names></name> <name><surname>Brandmaier</surname> <given-names>A. M.</given-names></name> <name><surname>Culianos</surname> <given-names>D.</given-names></name> <name><surname>Grimm</surname> <given-names>K. J.</given-names></name></person-group> (<year>2020</year>). <article-title>Mplus trees: structural equation model trees using Mplus.</article-title> <source><italic>Struct. Equ. Modeling</italic></source> <pub-id pub-id-type="doi">10.1080/10705511.2020.1726179</pub-id> <comment>[Epub ahead of print]</comment>.</citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shih</surname> <given-names>Y.-S.</given-names></name></person-group> (<year>2004</year>). <article-title>A note on split selection bias in classification trees.</article-title> <source><italic>Comput. Stat. Data Anal.</italic></source> <volume>45</volume> <fpage>457</fpage>&#x2013;<lpage>466</lpage>. <pub-id pub-id-type="doi">10.1016/S0167-9473(03)00064-1</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Simpson-Kent</surname> <given-names>I. L.</given-names></name> <name><surname>Fuhrmann</surname> <given-names>D.</given-names></name> <name><surname>Bathelt</surname> <given-names>J.</given-names></name> <name><surname>Achterberg</surname> <given-names>J.</given-names></name> <name><surname>Borgeest</surname> <given-names>G. S.</given-names></name> <name><surname>Kievit</surname> <given-names>R. A.</given-names></name></person-group> (<year>2020</year>). <article-title>Neurocognitive reorganization between crystallized intelligence, fluid intelligence and white matter microstructure in two age-heterogeneous developmental cohorts.</article-title> <source><italic>Dev. Cogn. Neurosci.</italic></source> <volume>41</volume>:<issue>100743</issue>. <pub-id pub-id-type="doi">10.1016/j.dcn.2019.100743</pub-id> <pub-id pub-id-type="pmid">31999564</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>S&#x00F6;rbom</surname> <given-names>D.</given-names></name></person-group> (<year>1974</year>). <article-title>A general method for studying differences in factor means and factor structure between groups.</article-title> <source><italic>Br. J. Math. Stat. Psychol.</italic></source> <volume>27</volume> <fpage>229</fpage>&#x2013;<lpage>239</lpage>. <pub-id pub-id-type="doi">10.1111/j.2044-8317.1974.tb00543.x</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Strobl</surname> <given-names>C.</given-names></name> <name><surname>Kopf</surname> <given-names>J.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>Rasch trees: a new method for detecting differential item functioning in the Rasch model.</article-title> <source><italic>Psychometrika</italic></source> <volume>80</volume> <fpage>289</fpage>&#x2013;<lpage>316</lpage>. <pub-id pub-id-type="doi">10.1007/s11336-013-9388-3</pub-id> <pub-id pub-id-type="pmid">24352514</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Strobl</surname> <given-names>C.</given-names></name> <name><surname>Malley</surname> <given-names>J.</given-names></name> <name><surname>Tutz</surname> <given-names>G.</given-names></name></person-group> (<year>2009</year>). <article-title>An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>14</volume> <fpage>323</fpage>&#x2013;<lpage>348</lpage>. <pub-id pub-id-type="doi">10.1037/a0016973</pub-id> <pub-id pub-id-type="pmid">19968396</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Strobl</surname> <given-names>C.</given-names></name> <name><surname>Wickelmaier</surname> <given-names>F.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2011</year>). <article-title>Accounting for individual differences in Bradley-Terry models by means of recursive partitioning.</article-title> <source><italic>J. Educ. Behav. Stat.</italic></source> <volume>36</volume> <fpage>135</fpage>&#x2013;<lpage>153</lpage>. <pub-id pub-id-type="doi">10.3102/1076998609359791</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Usami</surname> <given-names>S.</given-names></name> <name><surname>Hayes</surname> <given-names>T.</given-names></name> <name><surname>McArdle</surname> <given-names>J.</given-names></name></person-group> (<year>2017</year>). <article-title>Fitting structural equation model trees and latent growth curve mixture models in longitudinal designs: the influence of model misspecification.</article-title> <source><italic>Struct. Equ. Modeling</italic></source> <volume>24</volume> <fpage>585</fpage>&#x2013;<lpage>598</lpage>. <pub-id pub-id-type="doi">10.1080/10705511.2016.1266267</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Usami</surname> <given-names>S.</given-names></name> <name><surname>Jacobucci</surname> <given-names>R.</given-names></name> <name><surname>Hayes</surname> <given-names>T.</given-names></name></person-group> (<year>2019</year>). <article-title>The performance of latent growth curve model-based structural equation model trees to uncover population heterogeneity in growth trajectories.</article-title> <source><italic>Comput. Stat.</italic></source> <volume>34</volume> <fpage>1</fpage>&#x2013;<lpage>22</lpage>. <pub-id pub-id-type="doi">10.1007/s00180-018-0815-x</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>T.</given-names></name> <name><surname>Merkle</surname> <given-names>E. C.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <article-title>Score-based tests of measurement invariance: use in practice.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>5</volume>:<issue>438</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2014.00438</pub-id> <pub-id pub-id-type="pmid">24936190</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>T.</given-names></name> <name><surname>Strobl</surname> <given-names>C.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name> <name><surname>Merkle</surname> <given-names>E. C.</given-names></name></person-group> (<year>2018</year>). <article-title>Score-based tests of differential item functioning via pairwise maximum likelihood estimation.</article-title> <source><italic>Psychometrika</italic></source> <volume>83</volume> <fpage>132</fpage>&#x2013;<lpage>155</lpage>. <pub-id pub-id-type="doi">10.1007/s11336-017-9591-8</pub-id> <pub-id pub-id-type="pmid">29150815</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wickelmaier</surname> <given-names>F.</given-names></name> <name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2018</year>). <article-title>Using recursive partitioning to account for parameter heterogeneity in multinomial processing tree models.</article-title> <source><italic>Behav. Res. Methods</italic></source> <volume>50</volume> <fpage>1217</fpage>&#x2013;<lpage>1233</lpage>. <pub-id pub-id-type="doi">10.3758/s13428-017-0937-z</pub-id> <pub-id pub-id-type="pmid">28779459</pub-id></citation></ref>
<ref id="B49"><citation citation-type="web"><person-group person-group-type="author"><name><surname>Zeileis</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <source><italic>Structural equation model trees with partykit and lavaan.</italic></source> Available online at: <ext-link ext-link-type="uri" xlink:href="https://eeecon.uibk.ac.at/~zeileis/news/lavaantree/">https://eeecon.uibk.ac.at/~zeileis/news/lavaantree/</ext-link> <comment>(accessed December 17, 2020)</comment>.</citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zeileis</surname> <given-names>A.</given-names></name> <name><surname>Hornik</surname> <given-names>K.</given-names></name></person-group> (<year>2007</year>). <article-title>Generalized M-fluctuation tests for parameter instability.</article-title> <source><italic>Stat. Neerl.</italic></source> <volume>61</volume> <fpage>488</fpage>&#x2013;<lpage>508</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9574.2007.00371.x</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zeileis</surname> <given-names>A.</given-names></name> <name><surname>Hothorn</surname> <given-names>T.</given-names></name> <name><surname>Hornik</surname> <given-names>K.</given-names></name></person-group> (<year>2008</year>). <article-title>Model-based recursive partitioning.</article-title> <source><italic>J. Comput. Graph. Stat.</italic></source> <volume>17</volume> <fpage>492</fpage>&#x2013;<lpage>514</lpage>. <pub-id pub-id-type="doi">10.1198/106186008X319331</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zeileis</surname> <given-names>A.</given-names></name> <name><surname>Leisch</surname> <given-names>F.</given-names></name> <name><surname>Hornik</surname> <given-names>K.</given-names></name> <name><surname>Kleiber</surname> <given-names>C.</given-names></name></person-group> (<year>2002</year>). <article-title>strucchange: an R package for testing for structural change in linear regression models.</article-title> <source><italic>J. Stat. Softw.</italic></source> <volume>7</volume> <fpage>1</fpage>&#x2013;<lpage>38</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v007.i02</pub-id></citation></ref>
<ref id="B53"><citation citation-type="web"><person-group person-group-type="author"><name><surname>Zeileis</surname> <given-names>A.</given-names></name> <name><surname>Strobl</surname> <given-names>C.</given-names></name> <name><surname>Wickelmaier</surname> <given-names>F.</given-names></name> <name><surname>Komboz</surname> <given-names>B.</given-names></name> <name><surname>Kopf</surname> <given-names>J.</given-names></name></person-group> (<year>2020</year>). <source><italic>psychotree: Recursive Partitioning Based on Psychometric Models (Version 0.15-3) [Computer software].</italic></source> Available online at: <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/psychotree/index.html">https://cran.r-project.org/web/packages/psychotree/index.html</ext-link> <comment>(accessed December 17, 2020)</comment>.</citation></ref>
</ref-list><fn-group>
<fn id="footnote1">
<label>1</label>
<p><ext-link ext-link-type="uri" xlink:href="https://github.com/brandmaier/semtree/commit/30ca7500e43ca99975dfe6b8917ef8f293beaeb3">https://github.com/brandmaier/semtree/commit/30ca7500e43ca99975dfe6b8917ef8f293beaeb3</ext-link></p></fn>
<fn id="footnote2">
<label>2</label>
<p><ext-link ext-link-type="uri" xlink:href="https://osf.io/k82y3/">https://osf.io/k82y3/</ext-link></p></fn>
</fn-group>
</back>
</article>
