<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="review-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2023.1266447</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Conceptual Analysis</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>In models we trust: preregistration, large samples, and replication may not suffice</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Spiess</surname> <given-names>Martin</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2385094/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/visualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Jordan</surname> <given-names>Pascal</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/2380128/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>
<aff><institution>Institute of Psychology, Department of Psychology and Human Movement Science, University of Hamburg</institution>, <addr-line>Hamburg</addr-line>, <country>Germany</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Tomer Fekete, Ben-Gurion University of the Negev, Israel</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Marco Biella, University of Basel, Switzerland; Steffen Zitzmann, University of T&#x000FC;bingen, Germany</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Martin Spiess <email>martin.spiess&#x00040;uni-hamburg.de</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>21</day>
<month>09</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>14</volume>
<elocation-id>1266447</elocation-id>
<history>
<date date-type="received">
<day>25</day>
<month>07</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>09</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Spiess and Jordan.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Spiess and Jordan</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>Despite discussions about the replicability of findings in psychological research, two issues have been largely ignored: selection mechanisms and model assumptions. Both topics address the same fundamental question: Does the chosen statistical analysis tool adequately model the data generation process? In this article, we address both issues and show, in a first step, that in the face of selective samples and contrary to common practice, the validity of inferences, even when based on experimental designs, can be claimed without further justification and adaptation of standard methods only in very specific situations. We then broaden our perspective to discuss consequences of violated assumptions in linear models in the context of psychological research in general and in generalized linear mixed models as used in item response theory. These types of misspecification are oftentimes ignored in the psychological research literature. It is emphasized that the above problems cannot be overcome by strategies such as preregistration, large samples, replications, or a ban on testing null hypotheses. To avoid biased conclusions, we briefly discuss tools such as model diagnostics, statistical methods to compensate for selectivity and semi- or non-parametric estimation. At a more fundamental level, however, a twofold strategy seems indispensable: (1) iterative, cumulative theory development based on statistical methods with theoretically justified assumptions, and (2) empirical research on variables that affect (self-) selection into the observed part of the sample and the use of this information to compensate for selectivity.</p></abstract>
<kwd-group>
<kwd>population</kwd>
<kwd>sampling design</kwd>
<kwd>non-response</kwd>
<kwd>selectivity</kwd>
<kwd>misspecification</kwd>
<kwd>biased inference</kwd>
<kwd>diagnostics</kwd>
<kwd>robust methods</kwd>
</kwd-group>
<counts>
<fig-count count="4"/>
<table-count count="0"/>
<equation-count count="16"/>
<ref-count count="65"/>
<page-count count="13"/>
<word-count count="9963"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Theoretical and Philosophical Psychology</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>The debate around the replication crisis is not the only consequence of methodological deficiencies discussed in the psychological literature, but certainly one that has attracted a large amount of attention in recent years (e.g., Open Science Collaboration, <xref ref-type="bibr" rid="B38">2012</xref>, <xref ref-type="bibr" rid="B39">2015</xref>; Klein et al., <xref ref-type="bibr" rid="B25">2014</xref>, <xref ref-type="bibr" rid="B26">2018</xref>; Shrout and Rodgers, <xref ref-type="bibr" rid="B54">2018</xref>). In fact, criticism of methodological practice has addressed a wide range of aspects, from science policy and human bias (e.g., Sterling, <xref ref-type="bibr" rid="B60">1959</xref>; Rosenthal, <xref ref-type="bibr" rid="B46">1979</xref>; Sterling et al., <xref ref-type="bibr" rid="B59">1995</xref>; Pratkanis, <xref ref-type="bibr" rid="B41">2017</xref>) through rather general methodological approaches (e.g., Meehl, <xref ref-type="bibr" rid="B34">1967</xref>, <xref ref-type="bibr" rid="B35">1990</xref>; Hahn, <xref ref-type="bibr" rid="B19">2011</xref>; Button et al., <xref ref-type="bibr" rid="B5">2013</xref>; Fiedler, <xref ref-type="bibr" rid="B14">2017</xref>) to more specific topics, like automated null hypothesis testing or underpowered studies (e.g., Rozeboom, <xref ref-type="bibr" rid="B47">1960</xref>; Cohen, <xref ref-type="bibr" rid="B6">1962</xref>; Sedlmeier and Gigerenzer, <xref ref-type="bibr" rid="B53">1989</xref>; Gigerenzer, <xref ref-type="bibr" rid="B17">2018</xref>).</p>
<p>The wide range of aspects criticized over a long time span suggests that most of them may be symptoms of an underlying disease rather than several isolated problems: a lack of appreciation for the close interconnection between theory and the methods used to analyze empirical data in psychological research. One explicit indication of an underlying nonchalant attitude is provided by Rozeboom (<xref ref-type="bibr" rid="B47">1960</xref>), according to whom researchers are consumers of statistical methods with the legitimate demand that the available statistical techniques meet their respective needs; they are not required to have a deeper understanding of the instruments. Rozeboom (<xref ref-type="bibr" rid="B47">1960</xref>) warned, however, that this position makes researchers vulnerable to misusing the tools. As discussions over time have shown, it is not enough to have a toolbox of instruments available; it must be of vital interest to researchers to know which instrument provides the relevant information under which conditions, and how to interpret the results of those instruments in order to derive valid conclusions. And although greater responsibility of researchers for the methods they adopt has been demanded (e.g., Hahn, <xref ref-type="bibr" rid="B19">2011</xref>), this seems not to have had a strong impact on the care with which statistical methods are applied and statistical results are interpreted (e.g., Gigerenzer, <xref ref-type="bibr" rid="B17">2018</xref>; Fricker et al., <xref ref-type="bibr" rid="B15">2019</xref>).</p>
<p>In this paper we consider in more detail two methodological aspects and their possible consequences that, although their importance has been hinted at from time to time, have neither received much attention nor been treated thoroughly in the discussion of theoretical and methodological issues in psychological research: the selection of samples and the handling of model assumptions (e.g., Arnett, <xref ref-type="bibr" rid="B3">2008</xref>; Fernald, <xref ref-type="bibr" rid="B12">2010</xref>; Henrich et al., <xref ref-type="bibr" rid="B22">2010</xref>; Asendorpf et al., <xref ref-type="bibr" rid="B4">2013</xref>; Falk et al., <xref ref-type="bibr" rid="B10">2013</xref>; Kline, <xref ref-type="bibr" rid="B27">2015</xref>; Scholtz et al., <xref ref-type="bibr" rid="B52">2020</xref>).</p></sec>
<sec id="s2">
<title>2. The methodological framework</title>
<p>The general steps from a population to the observed sample (and back), as schematically displayed in <xref ref-type="fig" rid="F1">Figure 1</xref>, are not new, but the graphic highlights the steps considered more closely in the subsequent sections: selecting units from the population into the observed sample and drawing inferences from the observed sample about the assumed data generating process (DGP).</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Population (<inline-formula><mml:math id="M1"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mo>&#x025B3;</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula>), sample and inference. DGP, Data generating process; MDM, missing data mechanism; <italic>t</italic>, time point; &#x00394;, time interval.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-14-1266447-g0001.tif"/>
</fig>
<p>The alpha and omega of psychological research is a population of biological subsystems or, more precisely, of phenomena mostly, but not exclusively, related to the nervous system located in humans. The elements of the population, i.e., humans or, more generally, units, are defined by and reduced to possibly high-dimensional vectors of variables. For example, variables characterizing the subsystems of interest in psychological research can be indicators like socio-demographic variables, age, gender or biomarkers, but also reactions evoked by some stimuli under (non-)experimental conditions. In general, however, these variables neither describe the subsystems exhaustively nor do the subsystems exist in isolation. Furthermore, it is not the variables themselves but the process that leads to realizations of at least some of these variables, i.e., the true DGP of some variables, usually given covariates or explanatory variables, that is of scientific interest. However, since the units are the carriers of the scientifically interesting variables, among a huge number of other variables, it is these units that have to be selected.</p>
<p>Inferences are usually intended about a DGP inevitably linked to units in a population of humans within a certain time period &#x00394;, denoted as DGP&#x000A0;&#x00394; and <inline-formula><mml:math id="M2"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mo>&#x025B3;</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula>, respectively, in <xref ref-type="fig" rid="F1">Figure 1</xref>. An important criterion for evaluating the maturity of a theory is the precision with which the units and their environments can be defined. Thus, the set of humans and the time period about which inferences are intended have to be defined as clearly as possible at each step of the iterative development of a theory. Are inferences intended about homo sapiens in general, or about homo sapiens living in the first half of the 21st century in western, educated, industrialized, rich and democratic countries (Arnett, <xref ref-type="bibr" rid="B3">2008</xref>; Henrich et al., <xref ref-type="bibr" rid="B22">2010</xref>)? The answer certainly depends on the psychological subfield. For example, the intended population may be wider in general psychology than in social psychology. Often, however, populations are either not defined at all or only vaguely.</p>
<p>In contrast to, for example, official statistics, the target population in psychological research is abstract: inferences are made about systems linked to units that do not necessarily exist at the time the research is conducted, either because the carriers are already deceased or because they have not yet come into existence. However, units can only be selected from an observed part of <inline-formula><mml:math id="M3"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mo>&#x025B3;</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula>. It therefore remains part of the theory to justify that the observable subpopulation of carriers at time point <italic>t</italic>, <inline-formula><mml:math id="M4"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, is not selective with respect to the true DGP<sub>&#x000A0;&#x00394;</sub> of interest.</p>
<p>The gross sample is the set of units selected from <inline-formula><mml:math id="M5"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> by some mechanism. In official statistics this is straightforward: Select a sample of units, typically according to a predefined sampling plan, from the well-defined finite (sub)population of interest, e.g., from the residents in a given country at a defined time point. Thus, the sampling mechanism is known and is usually such that the selected or gross sample is not selective or can be corrected for its selectivity. Note that in this case <inline-formula><mml:math id="M6"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mo>&#x025B3;</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> is often assumed to be approximately equal to <inline-formula><mml:math id="M7"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. If the selection mechanism is unknown, then it is usually (implicitly) assumed that the selection step can be ignored in order to proceed with the analysis.</p>
<p>Unfortunately, there is a further selection step from the gross sample to the observed or net sample, with units dropping out according to an underlying mechanism that is in most cases unknown. For example, people belonging to <inline-formula><mml:math id="M8"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> see a notice inviting them to participate in an experiment, but decide (not) to take part. This step is governed by a missing data mechanism (MDM) which at best is partly known. If enough information explaining response behavior is available for all the units in the gross sample, then it is possible to compensate for missing units. Otherwise, again, this missing information has to be replaced by the assumption that this process is not selective.</p>
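When response probabilities can be estimated for all units in the gross sample from variables observed for respondents and non-respondents alike, one standard way to compensate for missing units is inverse-probability weighting. The following is a minimal simulated sketch (our own illustration in Python, not part of the article; all variable names and parameter values are hypothetical), in which response depends on a covariate z that also drives the outcome:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Hypothetical gross sample: z is observed for all units,
# y only for the responding (net-sample) units.
z = rng.normal(size=N)
y = 2.0 + z + rng.normal(size=N)   # true population mean of y is 2.0

# Response probability depends on z and is assumed known
# (or estimable) for every unit in the gross sample.
p_resp = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * z)))
resp = rng.uniform(size=N) < p_resp

# Ignoring the MDM: mean over respondents only (biased upward here,
# because responding units have systematically larger z).
naive = y[resp].mean()

# Compensating for the MDM: inverse-probability (Hajek-type) weighting.
wt = 1.0 / p_resp[resp]
ipw = np.sum(wt * y[resp]) / np.sum(wt)

print(f"naive: {naive:.3f}, weighted: {ipw:.3f}")
```

The unweighted respondent mean overestimates the population mean of y, while weighting each respondent by the inverse of its response probability restores an approximately unbiased estimate.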
<p>The assumed DGP&#x000A0;&#x00394; and, usually to a lesser extent, the assumed MDM at the item level at time <italic>t</italic> affect how the data are collected through the study design and measurement instruments, resulting in the observed data. This observed data set is then analyzed with statistical methods, i.e., information relevant to the research question contained in the observed data set is summarized in graphics, descriptive statistics, estimates, confidence intervals or <italic>p</italic>-values (&#x0201C;condensed information&#x0201D; in <xref ref-type="fig" rid="F1">Figure 1</xref>).</p>
<p>Estimates of parameters and variances of parameters, confidence intervals or <italic>p</italic>-values are used to draw inferences about <inline-formula><mml:math id="M9"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and finally <inline-formula><mml:math id="M10"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mo>&#x025B3;</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula>. These inferences will be valid if the assumed DGP&#x000A0;&#x00394; (approximately) correctly models how the observed data values have been generated. This requires modeling not only the true DGP&#x000A0;&#x00394; in <inline-formula><mml:math id="M11"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mo>&#x025B3;</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> but also all (selection) processes from <inline-formula><mml:math id="M12"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mo>&#x025B3;</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> to the observed data. Ignoring any of these processes is equivalent to (implicitly) assuming that they are irrelevant for valid inferences in subsequent analyses, and that, therefore, statistical methods designed for valid inferences in simple random samples can be applied. This is hence a modeling assumption, as is, e.g., the assumption that variables are independent of each other, that relationships are linear, or that variables are normally distributed. And, of course, unjustified assumptions can easily be wrong.</p>
<p>Our subsequent analysis can be embedded in the different stages depicted in <xref ref-type="fig" rid="F1">Figure 1</xref>. The following section will concentrate on the selection part and the missing data mechanism (<italic>MDM</italic><sub><italic>t</italic></sub>) at the unit level, whereas Section 4 will predominantly deal with misspecifications of the (assumed) DGP. For technical details on the examples used in the text (see the <xref ref-type="supplementary-material" rid="SM1">Supplementary material</xref>).</p>
</sec>
<sec id="s3">
<title>3. Sample selection and unit response</title>
<p>In psychological research, samples are often selected in such a way that both the sample selection and the unit response process are unknown and cannot be separated. An example is a convenience sample for which there is no information on units that chose not to participate. We therefore integrate both processes into one selection mechanism. Note that the selection process can easily be generalized to cover other selection phenomena such as the file drawer problem, outlier deletion, or item non-response.</p>
<sec>
<title>3.1. The general framework</title>
<p>Prominent examples of estimated models at the analysis stage are regression and analysis of variance models. Estimation of these models amounts to assuming a distributional model for the outcome <italic>y</italic> given covariates, including a 1 for the constant, collected in a vector <italic>x</italic>. Throughout Section 3 we presuppose that the assumed model including the required assumptions approximates the true DGP&#x000A0;&#x00394; sufficiently well for valid inferences.</p>
<p>After having selected a sample of units, it is common practice to estimate the model of DGP&#x000A0;&#x00394; adopting a classical model-based frequentist statistical approach, using only those units whose values have been observed, with the number of observed units, <italic>n</italic><sub>obs</sub>, and <italic>x</italic> fixed at their observed values. What actually should be modeled, however, is the distribution of the <italic>y</italic>-variables whose values have been observed, given the <italic>x</italic>-values and the pattern of observed and not observed units from <inline-formula><mml:math id="M13"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (cf., Rubin, <xref ref-type="bibr" rid="B48">1976</xref>, <xref ref-type="bibr" rid="B49">1987</xref>). By conditioning on the pattern of observed and unobserved units, the selection mechanism is explicitly taken into account. Common practice is to ignore the selection mechanism, thereby implicitly assuming that it is not informative for <italic>y</italic> given <italic>x</italic>.</p>
<p>In regression models with independent units, it can be shown that inferences based on a model that ignores the selection mechanism will be valid if the probability of observing the actually observed units given the observed <italic>x</italic>-values is the same for all possible values of the observed <italic>y</italic> variables. See Rubin (<xref ref-type="bibr" rid="B48">1976</xref>) for the corresponding theory in the case of missing items. For specific models, it has also been shown that inferences about effects of covariates on the outcome ignoring the selection process are valid if the probability of the observed pattern of observed and unobserved units changes with unobserved components in <italic>y</italic> which are independent of <italic>x</italic>, but is the same for all possible values of <italic>x</italic> (e.g., Heckman, <xref ref-type="bibr" rid="B21">1979</xref>; Terza, <xref ref-type="bibr" rid="B62">1998</xref>; McCulloch et al., <xref ref-type="bibr" rid="B31">2016</xref>).</p>
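An ignorable case of this kind can be checked by simulation. In the following sketch (our own illustration with hypothetical parameter values, not from the article), the probability of being observed depends only on the covariate x, not on y; the naive two-group comparison on the net sample then remains approximately unbiased:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Correctly specified DGP: y_i = beta0 + beta1 * x_i + eps_i
x = rng.integers(0, 2, size=N)           # two experimental groups
beta0, beta1 = 0.0, 2.0
y = beta0 + beta1 * x + rng.normal(size=N)

# Selection depends only on x: units in group 1 respond less often.
p_obs = np.where(x == 1, 0.3, 0.8)
obs = rng.uniform(size=N) < p_obs

# Naive group comparison on the net sample is still (approximately) unbiased.
effect = y[obs & (x == 1)].mean() - y[obs & (x == 0)].mean()
print(f"true effect: {beta1:.1f}, estimate: {effect:.3f}")
```

Here the net sample is selective with respect to x, but not with respect to y given x, so the estimated group difference is not distorted.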
<p>On the other hand, the selection process cannot be ignored in general if, for a given pattern of observed and unobserved units, the probability of observing this pattern changes with <italic>x</italic> and <italic>y</italic>, even if the model correctly specifies the true DGP&#x000A0;&#x00394;. In this case, inferences will be systematically biased. Similar arguments hold if a Bayesian approach is adopted.</p>
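The non-ignorable case is equally easy to demonstrate. In the sketch below (our own illustration with hypothetical parameter values), the probability of being observed decreases with y itself; although the outcome model is correctly specified, the naive group comparison on the net sample now systematically underestimates the true effect:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200_000

# Correctly specified two-group DGP: y_i = beta0 + beta1 * x_i + eps_i
x = rng.integers(0, 2, size=N)
beta0, beta1 = 0.0, 2.0
y = beta0 + beta1 * x + rng.normal(size=N)

# Selection depends on y itself: units with large y respond less often.
p_obs = 1.0 / (1.0 + np.exp(y - 2.0))
obs = rng.uniform(size=N) < p_obs

# The naive group comparison on the net sample is attenuated,
# because selection truncates the high-y group more strongly.
effect_naive = y[obs & (x == 1)].mean() - y[obs & (x == 0)].mean()
print(f"true effect: {beta1:.1f}, naive estimate: {effect_naive:.3f}")
```

Increasing the sample size does not help here: the bias is a property of the selection mechanism, not of sampling error.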
<p>Given that the selection mechanism can be ignored in certain cases without biasing inferences, the question arises whether this is also true in experimental contexts, which are considered the gold standard for unbiased causal inference.</p>
</sec>
<sec>
<title>3.2. Selectivity in experimental designs</title>
<p>One way to model the selection process is through a threshold model,</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M14"><mml:mrow><mml:msubsup><mml:mi>v</mml:mi><mml:mi>i</mml:mi><mml:mo>*</mml:mo></mml:msubsup><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msubsup><mml:mi>z</mml:mi><mml:mi>i</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mi>&#x003B3;</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>w</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext>&#x000A0;</mml:mtext><mml:mo>,</mml:mo><mml:mtext>&#x02003;</mml:mtext><mml:msub><mml:mi>w</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>~</mml:mo><mml:mi>N</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mi>w</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x02003;and&#x02003;</mml:mtext><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mn>1</mml:mn></mml:mtd><mml:mtd columnalign='left'><mml:mrow><mml:mtext>if&#x000A0;</mml:mtext><mml:msubsup><mml:mi>v</mml:mi><mml:mi>i</mml:mi><mml:mo>*</mml:mo></mml:msubsup><mml:mo>&#x02264;</mml:mo><mml:mi>c</mml:mi></mml:mrow></mml:mtd></mml:mtr><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign='left'><mml:mrow><mml:mtext>otherwise,</mml:mtext></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>wherein <italic>z</italic><sub><italic>i</italic></sub> is a vector of covariates including a 1 for the constant and possibly (elements of) <italic>x</italic><sub><italic>i</italic></sub> or <inline-formula><mml:math id="M15"><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub></mml:math></inline-formula> (<italic>i</italic>&#x02260;<italic>i</italic>&#x02032;), <inline-formula><mml:math id="M16"><mml:msubsup><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mspace width="0.3em" class="thinspace"/></mml:math></inline-formula> , and <inline-formula><mml:math id="M17"><mml:msubsup><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>*</mml:mo></mml:mrow></mml:msubsup></mml:math></inline-formula> is an unobserved tendency to observe unit <italic>i</italic>, such that <italic>y</italic><sub><italic>i</italic></sub> and <italic>x</italic><sub><italic>i</italic></sub> are only completely observed if <inline-formula><mml:math 
id="M18"><mml:msubsup><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>*</mml:mo></mml:mrow></mml:msubsup><mml:mo>&#x02264;</mml:mo><mml:mi>c</mml:mi></mml:math></inline-formula>, <italic>i</italic> &#x0003D; 1, &#x02026;, <italic>N</italic>, in which case the response indicator <italic>v</italic><sub><italic>i</italic></sub> takes on the value one. Otherwise, if the unit is not observed, <italic>v</italic><sub><italic>i</italic></sub> &#x0003D; 0. Large values of &#x003B3; model strong impacts of the covariates in <italic>z</italic><sub><italic>i</italic></sub> on the probability of (not) observing unit <italic>i</italic> in the sample. The unknown threshold <italic>c</italic> regulates the fraction of observed units: High values of <italic>c</italic> lead to high percentages of observed units and low values to small fractions. For simplicity, we assume <inline-formula><mml:math id="M19"><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, i.e., that the error term <italic>w</italic><sub><italic>i</italic></sub> is normally distributed with mean zero and variance <inline-formula><mml:math id="M20"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>, not depending on <italic>z</italic><sub><italic>i</italic></sub> or <inline-formula><mml:math 
id="M21"><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
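The threshold model in Equation (1) is straightforward to simulate. The sketch below (our own illustration, not part of the article) fixes a single scalar covariate value with c = 0 and γ = σ_w = 1, and checks that the empirical response rate matches the probability implied by the model, P(v = 1 | z) = Φ((c − zγ)/σ_w):

```python
import numpy as np
from math import erf, sqrt

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

rng = np.random.default_rng(3)
N = 500_000
gamma, sigma_w, c = 1.0, 1.0, 0.0     # scalar case

z = 0.5                               # a fixed value of the (scalar) covariate
w = rng.normal(0.0, sigma_w, size=N)  # selection-equation error w_i
v_star = z * gamma + w                # latent tendency v*_i (Equation 1)
v = (v_star <= c)                     # response indicator: True = unit observed

# P(v = 1 | z) = P(z*gamma + w <= c) = Phi((c - z*gamma) / sigma_w)
print(f"empirical: {v.mean():.4f}, implied: {Phi((c - z * gamma) / sigma_w):.4f}")
```

Larger zγ relative to c pushes the latent tendency above the threshold and thus lowers the probability that the unit is observed.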
<p>Based on these assumptions, let</p>
<disp-formula id="E2"><mml:math id="M22"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>c</mml:mi><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>w</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mtext>&#x02003;</mml:mtext><mml:mtext class="textrm" mathvariant="normal">and</mml:mtext><mml:mtext>&#x02003;</mml:mtext><mml:msub><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x003D5;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x003A6;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>wherein &#x003D5;(&#x000B7;) and &#x003A6;(&#x000B7;) are the standard normal density and distribution functions, respectively. The term &#x003C8;<sub><italic>i</italic></sub> can be interpreted as the expected tendency to be selected into the sample and to respond, &#x003A6;(&#x003C8;<sub><italic>i</italic></sub>) models the probability that unit <italic>i</italic> is observed, and &#x003BB;<sub><italic>i</italic></sub>, the inverse Mills ratio, is a term that corrects for the selection mechanism in the model of scientific interest (cf. Heckman, <xref ref-type="bibr" rid="B21">1979</xref>; Amemiya, <xref ref-type="bibr" rid="B1">1985</xref>). <xref ref-type="fig" rid="F2">Figure 2</xref> illustrates the effect of &#x003C8; on &#x003D5;(&#x000B7;), &#x003A6;(&#x000B7;), and &#x003BB;.</p>
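These quantities are simple to compute with any statistics library. The following sketch (our own illustration using SciPy) evaluates ψ, Φ(ψ), and λ = ϕ(ψ)/Φ(ψ) at a few covariate values, with c = 0 and γ = σ_w = 1 as in Figure 2:

```python
import numpy as np
from scipy.stats import norm

c, gamma, sigma_w = 0.0, 1.0, 1.0      # as in Figure 2
z = np.array([-2.0, 0.0, 2.0])         # a few covariate values

psi = (c - z * gamma) / sigma_w        # expected tendency to be selected and respond
p_obs = norm.cdf(psi)                  # Phi(psi): probability the unit is observed
lam = norm.pdf(psi) / norm.cdf(psi)    # lambda = phi(psi)/Phi(psi), the correction term

for zi, p, l in zip(z, p_obs, lam):
    print(f"z = {zi:+.1f}:  P(observed) = {p:.3f},  lambda = {l:.3f}")
```

As ψ decreases (i.e., as a unit becomes less likely to be observed), λ grows, so the correction term is largest exactly where selection is most severe.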
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Illustration of effects of <inline-formula><mml:math id="M23"><mml:mi>&#x003C8;</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>c</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>w</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, wherein <italic>c</italic> &#x0003D; 0, <italic>z</italic>, and &#x003B3; are both scalars and &#x003B3; &#x0003D; &#x003C3;<sub><italic>w</italic></sub> &#x0003D; 1, on &#x003D5;(&#x003C8;), &#x003A6;(&#x003C8;), and &#x003BB; &#x0003D; &#x003D5;(&#x003C8;)/&#x003A6;(&#x003C8;).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-14-1266447-g0002.tif"/>
</fig>
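<p>The correction term &#x003BB;(&#x003C8;) &#x0003D; &#x003D5;(&#x003C8;)/&#x003A6;(&#x003C8;) shown in Figure 2 (the inverse Mills ratio) is simple to compute. The following minimal sketch is our illustration, not part of the original article; it uses only the Python standard library, and the function name <monospace>inv_mills</monospace> is our own choice.</p>

```python
import math

def inv_mills(psi):
    """Inverse Mills ratio lambda(psi) = phi(psi) / Phi(psi)."""
    phi = math.exp(-0.5 * psi ** 2) / math.sqrt(2 * math.pi)   # standard normal density
    Phi = 0.5 * (1 + math.erf(psi / math.sqrt(2)))             # standard normal CDF
    return phi / Phi

# lambda is close to 0 for large psi (selection almost certain) and grows
# steeply for negative psi (selection into the sample is rare), as in Figure 2.
for psi in (-2, 0, 2):
    print(psi, round(inv_mills(psi), 4))
```
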
<p>To illustrate the consequences of ignoring the selection mechanism for inference in experimental settings, we consider three examples in two scenarios that amount to a comparison of means in two groups.</p>
<sec>
<title>3.2.1. Scenario 1: one measurement per unit</title>
<p>Assume that the correctly specified model for the true DGP&#x000A0;&#x00394; is</p>
<disp-formula id="E3"><label>(2)</label><mml:math id="M24"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B2;</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x02003;</mml:mtext><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mtext>&#x02003;</mml:mtext><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:mi>N</mml:mi><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>wherein &#x003B2; is the parameter of interest, the errors &#x003F5;<sub><italic>i</italic></sub> are independent across all units and all assumptions for valid inferences are met in <inline-formula><mml:math id="M25"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, and let &#x003F5;<sub><italic>i</italic></sub> and <italic>w</italic><sub><italic>i</italic></sub> follow a bivariate normal distribution with correlation 0 &#x02264; &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub> &#x0003C; 1. Hence, we may write <inline-formula><mml:math id="M26"><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B6;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, wherein &#x003B6;<sub><italic>i</italic></sub> is normally distributed with mean zero and variance <inline-formula><mml:math id="M27"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> (e.g., Mardia et al., <xref ref-type="bibr" rid="B30">1979</xref>).</p>
<p>Taking the selection process into account, the model of scientific interest that would have to be estimated based on the observed sample is a model for <italic>y</italic><sub><italic>i</italic></sub> conditional on <italic>x</italic><sub><italic>i</italic></sub> as a function of <italic>w</italic><sub><italic>i</italic></sub> which, in the observed sample, i.e., for units with <inline-formula><mml:math id="M28"><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02264;</mml:mo><mml:mi>c</mml:mi><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi></mml:math></inline-formula>, is truncated above at <inline-formula><mml:math id="M29"><mml:mi>c</mml:mi><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi></mml:math></inline-formula> and thus follows a truncated normal distribution. Let <italic>i</italic> &#x0003D; 1, &#x02026;, <italic>n</italic><sub>obs</sub> index the units in this subsample. Following Heckman (<xref ref-type="bibr" rid="B21">1979</xref>), for <italic>i</italic> &#x0003D; 1, &#x02026;, <italic>n</italic><sub>obs</sub>, the model to be estimated is</p>
<disp-formula id="E4"><label>(3)</label><mml:math id="M30"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B2;</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>wherein <inline-formula><mml:math id="M31"><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math></inline-formula> and the term &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub>&#x003C3;<sub>&#x003F5;</sub>&#x003BB;<sub><italic>i</italic></sub> corrects for a possible bias due to the selection process.</p>
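<p>To see how the correction term &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub>&#x003C3;<sub>&#x003F5;</sub>&#x003BB;<sub><italic>i</italic></sub> operates, a small simulation can, within each condition, regress the observed outcomes on the computable &#x003BB;<sub><italic>i</italic></sub>: per Equation (3), the intercept of that regression recovers the group mean, while the naive sample mean stays biased. This sketch is our own illustration, not the authors' code; all numbers (&#x003C1; &#x0003D; 0.7, group cutoffs &#x000B1;0.8, &#x003B3; &#x0003D; 1, &#x003C3;<sub><italic>w</italic></sub> &#x0003D; 1, true means zero) are arbitrary choices.</p>

```python
import math
import random

random.seed(4)
rho, n = 0.7, 100_000
cut = {0: 0.8, 1: -0.8}   # illustrative: selection cutoff c differs by condition x

def lam(psi):
    """Inverse Mills ratio phi(psi) / Phi(psi)."""
    phi = math.exp(-0.5 * psi ** 2) / math.sqrt(2 * math.pi)
    return phi / (0.5 * (1 + math.erf(psi / math.sqrt(2))))

naive, corrected = {}, {}
for x, c in cut.items():
    lams, ys = [], []
    for _ in range(n):
        z = random.gauss(0, 1)
        w = random.gauss(0, 1)
        t = c - z                     # threshold c - z^T gamma with gamma = 1
        if w <= t:                    # unit is observed
            eps = rho * w + random.gauss(0, math.sqrt(1 - rho ** 2))
            lams.append(lam(t))       # psi_i = t since sigma_w = 1
            ys.append(eps)            # y_i = mu_x + eps_i with mu_x = 0
    m_l = sum(lams) / len(lams)
    m_y = sum(ys) / len(ys)
    # simple OLS of y on lambda_i; slope estimates -rho * sigma_eps
    b = sum((l - m_l) * (y - m_y) for l, y in zip(lams, ys)) \
        / sum((l - m_l) ** 2 for l in lams)
    naive[x] = m_y                    # biased estimate of mu_x
    corrected[x] = m_y - b * m_l      # regression intercept, estimates mu_x

print(round(naive[1] - naive[0], 3), round(corrected[1] - corrected[0], 3))
```

The naive difference is far from the true value zero, while the intercept-based difference is approximately unbiased, mirroring the role of the &#x003BB;<sub><italic>i</italic></sub> term in Equation (3).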
<p>Let <inline-formula><mml:math id="M32"><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> where <italic>x</italic><sub><italic>i</italic></sub> is a binary variable, resulting in a comparison of the means of two independent groups defined by <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 0 and <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 1, respectively. Ignoring the selection mechanism, which is equivalent to ignoring the term &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub>&#x003C3;<sub>&#x003F5;</sub>&#x003BB;<sub><italic>i</italic></sub>, leads to the estimator of the difference in the means of the two groups,</p>
<disp-formula id="E5"><label>(4)</label><mml:math id="M33"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mover accent='true'><mml:mstyle mathvariant='bold-italic'><mml:mi>&#x003B2;</mml:mi></mml:mstyle><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mn>0</mml:mn></mml:msub><mml:mtext>&#x000A0;&#x000A0;and</mml:mtext></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x1D53C;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mover accent='true'><mml:mstyle mathvariant='bold-italic'><mml:mi>&#x003B2;</mml:mi></mml:mstyle><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic'><mml:mi>x</mml:mi></mml:mstyle><mml:mrow><mml:mtext>obs</mml:mtext></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic'><mml:mi>v</mml:mi></mml:mstyle><mml:mrow><mml:mtext>obs</mml:mtext></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mo 
stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C1;</mml:mi><mml:mrow><mml:mi>&#x003F5;</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>&#x003C3;</mml:mi><mml:mi>&#x003F5;</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003BB;</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003BB;</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>wherein &#x00233;<sub>0</sub> and &#x00233;<sub>1</sub> are the sample means and &#x003BC;<sub>0</sub> and &#x003BC;<sub>1</sub> are the true population means of <italic>y</italic>-values for which <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 0 and <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 1, respectively, <inline-formula><mml:math id="M34"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is the sample mean of &#x003BB;<sub><italic>i</italic></sub> values if <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 0 and <inline-formula><mml:math id="M35"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is the sample mean of &#x003BB;<sub><italic>i</italic></sub> values if <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 1. 
Thus, the bias of the estimator for the difference between the two groups, &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub>, is <inline-formula><mml:math id="M36"><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
<p>The estimator of &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> will be biased if &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub>&#x02260;0, i.e., if there is at least one variable, independent of <italic>x</italic><sub><italic>i</italic></sub> and <italic>z</italic><sub><italic>i</italic></sub>, that has an effect on the selection process and is linearly related to the outcome in the model of scientific interest, and if the difference <inline-formula><mml:math id="M37"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is not zero. This latter difference is not zero if the tendencies to be observed in the sample differ systematically between the subsamples defined by <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 0 and <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 1, respectively. Any bias will be amplified by a decreasing fit of the model of scientific interest, since the bias is proportional to the error standard deviation &#x003C3;<sub>&#x003F5;</sub>. Note that even if the two population means &#x003BC;<sub>0</sub> and &#x003BC;<sub>1</sub> are equal, the estimator of their difference may systematically be different from zero.</p>
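<p>The bias expression &#x02212;&#x003C1;<sub>&#x003F5;, <italic>w</italic></sub>&#x003C3;<sub>&#x003F5;</sub>(&#x003BB;&#x00304;<sub>1</sub>&#x02212;&#x003BB;&#x00304;<sub>0</sub>) can be checked numerically. In this Monte Carlo sketch (our illustration, not from the article, with arbitrary parameter values) the true mean difference is zero, the selection cutoffs differ by condition, and the naive difference of observed group means approaches the predicted bias.</p>

```python
import math
import random

random.seed(1)
rho, n = 0.8, 200_000
thresholds = {0: 1.0, 1: -1.0}   # illustrative cutoffs c - z^T gamma per group

def lam(psi):
    """Inverse Mills ratio phi(psi) / Phi(psi)."""
    phi = math.exp(-0.5 * psi ** 2) / math.sqrt(2 * math.pi)
    return phi / (0.5 * (1 + math.erf(psi / math.sqrt(2))))

means = {}
for x, c in thresholds.items():
    obs = []
    for _ in range(n):
        w = random.gauss(0, 1)
        if w <= c:                       # unit is selected into the sample
            zeta = random.gauss(0, math.sqrt(1 - rho ** 2))
            obs.append(rho * w + zeta)   # y = mu_x + eps, mu_x = 0, sigma_eps = 1
    means[x] = sum(obs) / len(obs)

naive = means[1] - means[0]              # naive estimate of mu1 - mu0 = 0
# here lambda_bar per group equals lam(c) since the cutoff is constant within group
predicted_bias = -rho * (lam(-1.0) - lam(1.0))
print(round(naive, 3), round(predicted_bias, 3))
```
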
<p>If the assignment of each unit to one and only one condition is random and independent from <italic>x</italic><sub><italic>i</italic></sub>, <inline-formula><mml:math id="M38"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> will usually (approximately) be zero and thus the estimator of the mean difference between the groups will (approximately) be unbiased. In this case, the selection mechanism can be ignored even if selection into the observed part of the sample depends on variables that have an effect on the outcome.</p>
<p>However, if the selection mechanism depends on <italic>x</italic><sub><italic>i</italic></sub>, then <inline-formula><mml:math id="M39"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> will not (approximately) be zero because the bounds <inline-formula><mml:math id="M40"><mml:mi>c</mml:mi><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi></mml:math></inline-formula> will systematically be different in the two groups. For example, suppose that the two levels of <italic>x</italic><sub><italic>i</italic></sub> represent two clinical groups that differ in their willingness to participate in a study, e.g., because of a decreased level of physical activity in one of the two groups, that affects the outcome only through <italic>x</italic><sub><italic>i</italic></sub>. If in addition there are variables, like general openness, independent from <italic>x</italic><sub><italic>i</italic></sub> and <italic>z</italic><sub><italic>i</italic></sub> that affect both, the outcome of interest and the tendency to be observed in the sample, so that &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub>&#x02260;0, then the estimator for the difference in the means in <inline-formula><mml:math id="M41"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> will be biased and inferences will be invalid. 
Ignoring the &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub>&#x003C3;<sub>&#x003F5;</sub>&#x003BB;<sub><italic>i</italic></sub> part is equivalent to estimating a misspecified model, although the model would be correctly specified in <inline-formula><mml:math id="M42"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
<p><xref ref-type="fig" rid="F3">Figure 3</xref> illustrates the effect of &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub> on the coverage rate of the true values &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> &#x0003D; 0 based on 0.95-confidence intervals if <inline-formula><mml:math id="M43"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>, <italic>x</italic><sub><italic>i</italic></sub> and a scalar binary <italic>z</italic><sub><italic>i</italic></sub> are correlated with &#x003C1;<sub><italic>z, x</italic></sub> &#x0003D; 0.5, and for different values of <inline-formula><mml:math id="M44"><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula>. 
If the correlation &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub> or the difference of the means of <inline-formula><mml:math id="M45"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="M46"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is close to zero, then the actual coverage rate of the true difference of the two means is close to the nominal level 0.95. The actual coverage rate decreases, however, with increasing values of &#x003B4;<sub>&#x003BB;</sub> or &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub> if both are not zero. The actual coverage rate of the 0.95-confidence interval can drop even below 0.5, leading to rejection rates of the true null hypothesis that are far too high. Thus, a non-existing effect may be &#x0201C;found&#x0201D; far too often.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>One measurement per unit. Coverage rate (coverage) of true value &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> &#x0003D; 0 by 0.95-confidence intervals as a function of <bold>(A)</bold> &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub> and different values of <inline-formula><mml:math id="M47"><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> if &#x003C1;<sub><italic>z, x</italic></sub> &#x0003D; 0.5 and &#x003C3;<sub>&#x003F5;</sub> &#x0003D; 1, and <bold>(B)</bold> <inline-formula><mml:math id="M48"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> and different values of <inline-formula><mml:math id="M49"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> if &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub> &#x0003D; 0.4, &#x003B3; &#x0003D; 0 and &#x003B4;<sub>&#x003BB;</sub> &#x0003D; 0 (see text for details). 
In both scenarios <inline-formula><mml:math id="M50"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-14-1266447-g0003.tif"/>
</fig>
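<p>The undercoverage visible in Figure 3 can be reproduced in miniature. The sketch below is our own illustration with arbitrary parameter choices (&#x003C1; &#x0003D; 0.5, group-specific cutoffs &#x000B1;0.5, 400 candidate units per group): each replication builds a normal-approximation 0.95-confidence interval for the mean difference from the selected units only and checks whether it covers the true value zero.</p>

```python
import math
import random

random.seed(2)
rho = 0.5
cut = {0: 0.5, 1: -0.5}   # group-specific cutoffs, so delta_lambda != 0
reps, n_cand = 1000, 400

covered = 0
for _ in range(reps):
    stats = {}
    for x, c in cut.items():
        ys = []
        for _ in range(n_cand):
            w = random.gauss(0, 1)
            if w <= c:  # unit selected into the observed sample
                ys.append(rho * w + random.gauss(0, math.sqrt(1 - rho ** 2)))
        m = sum(ys) / len(ys)
        v = sum((y - m) ** 2 for y in ys) / (len(ys) - 1)
        stats[x] = (m, v, len(ys))
    diff = stats[1][0] - stats[0][0]
    se = math.sqrt(stats[1][1] / stats[1][2] + stats[0][1] / stats[0][2])
    if abs(diff) <= 1.96 * se:   # does the CI cover the true mu1 - mu0 = 0?
        covered += 1

print(covered / reps)   # far below the nominal 0.95
```
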
<p>For the second example, we introduce a minor change: Assume possibly different error variances under the two conditions in DGP&#x000A0;&#x00394;, i.e., <inline-formula><mml:math id="M51"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> if <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 0 and <inline-formula><mml:math id="M52"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> if <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 1. For simplicity we assume &#x003C1;<sub>&#x003F5;<sub>0</sub>, <italic>w</italic></sub> &#x0003D; &#x003C1;<sub>&#x003F5;<sub>1</sub>, <italic>w</italic></sub>. Then, the expected value of <inline-formula><mml:math id="M53"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msub></mml:math></inline-formula> ignoring the selection mechanism is</p>
<disp-formula id="E6"><label>(5)</label><mml:math id="M54"><mml:mrow><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:mtext>&#x1D53C;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mover accent='true'><mml:mi>&#x003B2;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover></mml:mstyle><mml:mrow><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mrow><mml:mtext>obs</mml:mtext></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>v</mml:mi></mml:mstyle><mml:mrow><mml:mtext>obs</mml:mtext></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C1;</mml:mi><mml:mrow><mml:mi>&#x003F5;</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003C3;</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>&#x003BB;</mml:mi><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C3;</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>&#x003BB;</mml:mi><mml:mo 
stretchy='true'>&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:math></disp-formula>
<p>Thus, the estimator for the difference &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> will generally be biased if there is any variable independent of <italic>z</italic><sub><italic>i</italic></sub> and <italic>x</italic><sub><italic>i</italic></sub> that has an effect on selection and <italic>y</italic><sub><italic>i</italic></sub>, and if <inline-formula><mml:math id="M55"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x02260;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> even if assignment to the two conditions is random and does not depend on <italic>x</italic><sub><italic>i</italic></sub>.</p>
<p><xref ref-type="fig" rid="F3">Figure 3</xref> illustrates the coverage rates of the true value &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> &#x0003D; 0 by 0.95-confidence intervals under this more general scenario. Now &#x003C1;<sub>&#x003F5;, <italic>w</italic></sub> &#x0003D; 0.4, &#x003B4;<sub>&#x003BB;</sub> &#x0003D; 0 and selection does not depend on <italic>z</italic><sub><italic>i</italic></sub> because &#x003B3; &#x0003D; 0. What varies are the error variances <inline-formula><mml:math id="M56"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M57"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>. The actual coverage rates are equal to the nominal 0.95 level if both error variances are equal, but fall far below it when the two variances differ greatly. Again, effects may be &#x0201C;found&#x0201D; much too often even if &#x003BC;<sub>1</sub> &#x0003D; &#x003BC;<sub>0</sub>.</p>
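<p>Equation (5) implies that with &#x003B3; &#x0003D; 0 both groups share the same &#x003BB;&#x00304;, yet unequal error standard deviations still leave a bias of &#x02212;&#x003C1;<sub>&#x003F5;, <italic>w</italic></sub>&#x003BB;&#x00304;(&#x003C3;<sub>&#x003F5;<sub>1</sub></sub>&#x02212;&#x003C3;<sub>&#x003F5;<sub>0</sub></sub>). A quick closed-form check with illustrative values of our own choosing:</p>

```python
import math

def lam(psi):
    """Inverse Mills ratio phi(psi) / Phi(psi)."""
    phi = math.exp(-0.5 * psi ** 2) / math.sqrt(2 * math.pi)
    return phi / (0.5 * (1 + math.erf(psi / math.sqrt(2))))

# gamma = 0: both groups share psi = c / sigma_w (sigma_w = 1), hence
# lambda_bar_1 = lambda_bar_0 = lam(c), yet the bias in Equation (5),
# -rho * (sigma_1 * lam(c) - sigma_0 * lam(c)), survives when sigma_0 != sigma_1.
rho, c, sigma0, sigma1 = 0.4, 0.0, 1.0, 2.0   # illustrative assumptions
bias = -rho * lam(c) * (sigma1 - sigma0)
print(round(bias, 4))
```
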
</sec>
<sec>
<title>3.2.2. Scenario 2: two measurements per unit</title>
<p>Consider a repeated measurement design, where each unit is observed under each of two conditions, <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 0 and <italic>x</italic><sub><italic>i</italic></sub> &#x0003D; 1, but the selection mechanism is given by Equation (1). We further assume that there are no systematic position effects. Using the same notation and estimator for the difference &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> as in the last section, its expected value ignoring the selection process is</p>
<disp-formula id="E7"><mml:math id="M58"><mml:mrow><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:mtext>&#x1D53C;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mover accent='true'><mml:mi>&#x003B2;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover></mml:mstyle><mml:mrow><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mrow><mml:mtext>obs</mml:mtext></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>v</mml:mi></mml:mstyle><mml:mrow><mml:mtext>obs</mml:mtext></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mover accent='true'><mml:mi>&#x003BB;</mml:mi><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover><mml:mo 
stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003C1;</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>&#x003C3;</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C1;</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>&#x003C3;</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:math></disp-formula>
<p>wherein &#x003C1;<sub>&#x003F5;<sub>0</sub>, <italic>w</italic></sub> and &#x003C1;<sub>&#x003F5;<sub>1</sub>, <italic>w</italic></sub> are the correlations of the errors in the model of scientific interest with <italic>w</italic><sub><italic>i</italic></sub> in Equation (1), respectively, and &#x003C3;<sub>&#x003F5;<sub>0</sub></sub> and &#x003C3;<sub>&#x003F5;<sub>1</sub></sub> are the corresponding standard deviations. Because <inline-formula><mml:math id="M59"><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:math></inline-formula> is not zero if there are unobserved units, the estimator of &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> is biased if &#x003C1;<sub>&#x003F5;<sub>0</sub>, <italic>w</italic></sub>&#x003C3;<sub>&#x003F5;<sub>0</sub></sub>&#x02260;&#x003C1;<sub>&#x003F5;<sub>1</sub>, <italic>w</italic></sub>&#x003C3;<sub>&#x003F5;<sub>1</sub></sub>. Hence, if there is any variable independent of <italic>x</italic> and <italic>z</italic> that is not included in the model of scientific interest but has different effects on <italic>y</italic><sub>0</sub> and <italic>y</italic><sub>1</sub> and is relevant in the selection and response mechanism, then the estimator of the difference &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> will be biased and corresponding inferences will not be valid. The amount of bias will be amplified by decreasing values of <italic>c</italic> or, for positive &#x003B3;, by increasing values of <italic>z</italic> and thus by larger values of <inline-formula><mml:math id="M60"><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:math></inline-formula>.</p>
<p><xref ref-type="fig" rid="F4">Figure 4</xref> shows, for different values of <inline-formula><mml:math id="M61"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>, the effect of &#x003C1;<sub>&#x003F5;<sub>1</sub>, <italic>w</italic></sub> on the actual coverage rate of 0.95-confidence intervals. Again, there is only one <italic>z</italic><sub><italic>i</italic></sub>-variable, whose corresponding parameter is zero, i.e., &#x003B3; &#x0003D; 0. For simplicity, &#x003C1;<sub>&#x003F5;<sub>0</sub>, <italic>w</italic></sub> is zero and &#x003C3;<sub>&#x003F5;<sub>0</sub></sub> &#x0003D; 1. The mean over all &#x003BB;<sub><italic>i</italic></sub>-values in the observed sample is <inline-formula><mml:math id="M62"><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>9294</mml:mn></mml:math></inline-formula> and the covariance of &#x003F5;<sub>0</sub> and &#x003F5;<sub>1</sub>, <inline-formula><mml:math id="M63"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>, is 0.2.
Thus, the bias is not zero and increases with increasing (absolute) values of &#x003C1;<sub>&#x003F5;<sub>1</sub>, <italic>w</italic></sub> and <inline-formula><mml:math id="M64"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>. Consequently, the actual coverage rate may dramatically decline with increasing (absolute) values of &#x003C1;<sub>&#x003F5;<sub>1</sub>, <italic>w</italic></sub> and <inline-formula><mml:math id="M65"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>. If &#x003C1;<sub>&#x003F5;<sub>1</sub>, <italic>w</italic></sub> is zero, then the actual coverage rates are equal to the nominal 95%-coverage rate.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Two measurements per unit. Coverage rate (coverage) of true value &#x003BC;<sub>1</sub>&#x02212;&#x003BC;<sub>0</sub> &#x0003D; 0 by 0.95-confidence intervals as a function of &#x003C1;<sub>&#x003F5;<sub>1</sub>, <italic>w</italic></sub> and different values of <inline-formula><mml:math id="M66"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>. Covariance of &#x003F5;<sub>0</sub> and &#x003F5;<sub>1</sub> is <inline-formula><mml:math id="M67"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>2</mml:mn></mml:math></inline-formula> (see text for details).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-14-1266447-g0004.tif"/>
</fig>
<p>As an example, consider a simple reaction time experiment with two conditions, and suppose students at a university are invited to participate. If age is an indicator of the developmental stage of a subsystem related to reaction time, and if the disregarded age affects the outcome variable reaction time differently under the two conditions via the corresponding subsystem (e.g., Dykiert et al., <xref ref-type="bibr" rid="B7">2012</xref>), then ignoring the selection mechanism will lead to biased inferences. In this simplified example, the disregarded variable age would be part of <italic>w</italic> and would be correlated with &#x003F5;<sub>0</sub> and &#x003F5;<sub>1</sub>.</p>
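The mechanism in this example can be sketched in a small simulation. The setup below is illustrative and not taken from the article: an omitted variable w (standing in for age) is correlated with the errors under condition 1 and also drives selection into the observed sample, so the naive estimator of the condition difference is biased although the true difference is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# True means under the two conditions: the true difference is zero
mu0, mu1 = 0.0, 0.0
rho1, sigma1 = 0.5, 1.0      # corr(eps_1, w) and sd of eps_1; rho_0 = 0

w = rng.normal(size=n)       # omitted variable (e.g., age)
eps0 = rng.normal(size=n)
eps1 = rho1 * sigma1 * w + np.sqrt(1 - rho1**2) * sigma1 * rng.normal(size=n)
y0, y1 = mu0 + eps0, mu1 + eps1

# Units are observed only if w exceeds a threshold c (selection on w)
c = 0.0
obs = w > c

# Naive estimate of mu1 - mu0 from the observed units only
est = y1[obs].mean() - y0[obs].mean()
print(f"estimated difference: {est:.3f} (true difference: 0)")
```

Because E(&#x003F5;<sub>1</sub> | w &gt; c) is proportional to rho1 here, the naive estimate is shifted upward; setting rho1 = 0 in the sketch removes the bias, mirroring the coverage results in Figure 4.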
</sec>
</sec>
</sec>
<sec id="s4">
<title>4. Violations of model assumptions</title>
<p>In this section we assume that the selection of units can be ignored. Instead, we discuss the consequences of model misspecifications in more general models commonly used in applications, but without further detailed examples.</p>
<sec>
<title>4.1. Ordinary linear regression models</title>
<p>Suppose that different studies addressing the same research topic possibly differ in the (implicit) subpopulation they are referring to and that our (perhaps meta-analytical) aim might be to infer effects in an appropriately defined mixture population. To sketch the possible inconsistency issues that might result, we assume the following: The aim is to infer the effect of some predictor variable <italic>x</italic> on some outcome <italic>y</italic> in <inline-formula><mml:math id="M68"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, which can, for the sake of simplicity, be subdivided into two subpopulations, <italic>k</italic> &#x0003D; 1, 2. Assuming that the modeling assumptions hold in each subpopulation, we will analyze under what conditions they also hold in the mixture.</p>
<p>We thus take a sample (<italic>y</italic><sub><italic>i</italic></sub>, <italic>x</italic><sub><italic>i</italic></sub>), <italic>i</italic> &#x0003D; 1, &#x02026;, <italic>n</italic>, from <inline-formula><mml:math id="M69"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and ask whether the standard assumptions (see the <xref ref-type="supplementary-material" rid="SM1">Supplementary material</xref>) along with the normality assumption also hold within the mixture. To this end, let <italic>z</italic><sub><italic>i</italic></sub> now denote the random variable indicating the subpopulation to which the <italic>i</italic>-th unit belongs. According to our assumptions, we have</p>
<disp-formula id="E8"><label>(6)</label><mml:math id="M70"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>wherein the intercept &#x003B2;<sub>0</sub>(<italic>z</italic>) and slope &#x003B2;<sub>1</sub>(<italic>z</italic>) may depend on the subpopulation <italic>z</italic>. Equation (6) entails a linear regression of <italic>y</italic> on <italic>x</italic> within each subpopulation whereby the regression lines might differ across the subpopulations. If they differ, then there is an interaction between <italic>z</italic> and <italic>x</italic> with respect to the outcome.</p>
<p>According to the law of iterated expectation, it follows that our key term of interest&#x02014;the conditional expectation in the mixture population&#x02014;is given by</p>
<disp-formula id="E9"><label>(7)</label><mml:math id="M71"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtext>&#x1D53C;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:mi>y</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msub><mml:mtext>&#x1D53C;</mml:mtext><mml:mrow><mml:mi>z</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mtext>&#x1D53C;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:mi>y</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mtext>&#x1D53C;</mml:mtext><mml:mrow><mml:mi>z</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>z</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>z</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:msub><mml:mtext>&#x1D53C;</mml:mtext><mml:mrow><mml:mi>z</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>z</mml:mi><mml:mo 
stretchy='false'>)</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mtext>&#x1D53C;</mml:mtext><mml:mrow><mml:mi>z</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>z</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mi>g</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>wherein we use <italic>E</italic><sub><italic>z</italic>|<italic>x</italic></sub> as a shorthand notation to indicate the conditional distribution of <italic>z</italic> given <italic>x</italic> with respect to which the expectation has to be taken.</p>
<p>We may now distinguish between three cases: Firstly, independence of <italic>x</italic> and <italic>z</italic>. Here, the conditional expectations with respect to <italic>E</italic><sub><italic>z</italic>|<italic>x</italic></sub> resolve to unconditional expectations and we arrive at</p>
<disp-formula id="E10"><mml:math id="M72"><mml:mrow><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:msub><mml:mtext>&#x1D53C;</mml:mtext><mml:mrow><mml:mi>z</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>z</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mtext>&#x1D53C;</mml:mtext><mml:mrow><mml:mi>z</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>z</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:math></disp-formula>
<p>wherein both <inline-formula><mml:math id="M73"><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:math></inline-formula> parameters are weighted averages of the subpopulation-specific intercept and slope terms. Therefore, although the regression parameters differ from those in the subpopulations, the presumed functional form in the modeling of <italic>E</italic>(<italic>y</italic>|<italic>x</italic>) remains identical to the form that was assumed within each subpopulation.</p>
<p>Secondly, lack of interaction. In this case, the intercept and slope terms do not depend on <italic>z</italic> and Equation (7) reduces to:</p>
<disp-formula id="E11"><mml:math id="M74"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x1D53C;</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>|</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mrow><mml:mi>&#x1D53C;</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>|</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mi>x</mml:mi><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Again, the functional form is preserved and in this case also the parameters.</p>
<p>Thirdly, interaction or dependency. In this case the weights in Equation (7) depend on <italic>x</italic>, so the conditional expectation is likely to include nonlinear terms, even though linearity holds within each subpopulation. Stated differently: Suppose we are given two publications on the impact of the predictor <italic>x</italic> on the outcome <italic>y</italic>. Assuming the validity of the assumptions in each study, we infer the impact of the predictor via the regression coefficient &#x003B2;<sub>1</sub>. However, if a third researcher conducts a study in the mixture population, which would be a natural setup for drawing meta-analytical conclusions, then to ensure the validity of a linear regression model the researcher would have to deviate from the model used in the publications. In addition, the report of the impact of the predictor would have to focus on different coefficients.</p>
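The third case can be made concrete with a short numerical sketch (all parameter values are invented for illustration): membership in subpopulation z depends on x, the slopes differ across subpopulations, and the mixture expectation E(y | x) obtained from Equation (7) is no longer linear in x.

```python
import numpy as np

# Two subpopulations, each with its own linear regression (interaction:
# the slopes differ), and membership probability depending on x (dependency)
b0 = {1: 0.0, 2: 2.0}   # subpopulation intercepts
b1 = {1: 1.0, 2: -1.0}  # subpopulation slopes

def p_z1(x):
    """P(z = 1 | x): membership depends on x."""
    return 1.0 / (1.0 + np.exp(-x))

def cond_mean(x):
    """E(y | x) in the mixture, via the law of iterated expectation."""
    p = p_z1(x)
    return p * (b0[1] + b1[1] * x) + (1 - p) * (b0[2] + b1[2] * x)

x = np.linspace(-3, 3, 7)
m = cond_mean(x)

# Nonzero second differences reveal that E(y | x) is not linear in x,
# although the regression is exactly linear within each subpopulation
second_diff = np.diff(m, 2)
print(np.round(second_diff, 3))
```

For a linear function all second differences would vanish; here they do not, so a straight-line model fitted to the mixture is misspecified.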
<p>The described dependence of the modeling assumptions on the population as well as on the sampling scheme was highlighted in terms of the ordinary linear regression model, which merely served as a mathematically convenient example to demonstrate these effects. The described phenomena occur in more complicated model classes as well, as illustrated in the following section.</p>
</sec>
<sec>
<title>4.2. Generalized linear mixed model (GLMM)</title>
<p>The class of GLMMs has many applications in psychology, most notably in Item Response Theory (IRT). As virtually every construct of interest in psychology requires an adequate measurement device, it is hardly an overstatement to say that IRT models, alongside their older classical test theory counterparts,<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref> are omnipresent in applied research. For the sake of clarity, we will therefore limit the statement of the model to the most relevant case of IRT and refer to Jiang (<xref ref-type="bibr" rid="B24">2007</xref>) for a general formulation of the GLMM. We will also exclude any covariates in order to focus on the random effects part of the model, which goes beyond the ordinary regression setup.</p>
<p>Let <italic>y</italic><sub><italic>i, j</italic></sub> denote the response of the <italic>i</italic>-th test taker to item <italic>j</italic> (<italic>j</italic> &#x0003D; 1, &#x02026;, <italic>J</italic>) of a test that is supposed to measure a single construct&#x02014;say numerical IQ, denoted by &#x003B8;<sub><italic>i</italic></sub>. The response <italic>y</italic><sub><italic>i, j</italic></sub> is binary, encoding as to whether the item was solved correctly or not. The postulate of a single underlying construct when combined with the local independence assumption<xref ref-type="fn" rid="fn0002"><sup>2</sup></xref> provides us with a formula for the probability of any particular response pattern on the test, for example:</p>
<disp-formula id="E12"><label>(8)</label><mml:math id="M75"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>|</mml:mo><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x022EF;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>wherein <italic>f</italic><sub><italic>j</italic></sub>(&#x000B7;) denotes the item response function (IRF) of the <italic>j</italic>-th item. The latter is defined as the conditional probability that a test taker with latent ability &#x003B8;<sub><italic>i</italic></sub> &#x0003D; &#x003B8; solves the <italic>j</italic>-th item, i.e., <italic>f</italic><sub><italic>j</italic></sub>(&#x003B8;) &#x0003D; <italic>P</italic>(<italic>y</italic><sub><italic>i, j</italic></sub> &#x0003D; 1|&#x003B8;).</p>
<p>There are two key parts wherein restrictive modeling assumptions emerge: Firstly, the IRF must be specified, leading to particular assumptions such as imposing a logistic shape on <italic>f</italic><sub><italic>j</italic></sub>. Thus, <italic>f</italic><sub><italic>j</italic></sub> has a given shape but may depend on a few remaining parameters&#x02014;such as item difficulty and discrimination parameter(s)&#x02014;which are suppressed in our notation. Secondly, apart from the special case of a Rasch model (Andersen, <xref ref-type="bibr" rid="B2">1970</xref>), one needs to specify a distribution for the latent variable. This is necessary because the conditional probabilities in Equation (8), being functions of unknown latent abilities, cannot be evaluated directly.</p>
<p>However, by using the law of total probability in conjunction with the specification of a distribution function for &#x003B8;, Equation (8) resolves to an empirically testable statement referring to observable quantities, with no hidden quantities involved,</p>
<disp-formula id="E13"><label>(9)</label><mml:math id="M76"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x022EF;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In the latter equation, <italic>P</italic>(&#x003B8;) denotes the probability of sampling a test taker with numerical IQ &#x003B8;. Equation (9) assumes a discrete latent variable. In most applications, however, the latent variable is modeled as continuous. In these cases the above summation has to be replaced by an integral with respect to <italic>G</italic>, the cumulative distribution function of &#x003B8;. Nearly all applications specify a normal distribution for the latter.</p>
<p>Note that Equation (9) provides us with a frequency statement: In a sample of size <italic>n</italic> of test takers from <inline-formula><mml:math id="M77"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, we expect <italic>n</italic>&#x000D7;<italic>P</italic>(<italic>y</italic><sub><italic>i</italic>, 1</sub> &#x0003D; 1, <italic>y</italic><sub><italic>i</italic>, 2</sub> &#x0003D; 0, &#x02026;<italic>y</italic><sub><italic>i, k</italic></sub> &#x0003D; 0) test takers to show this particular response pattern according to our specified model. Stated differently, given estimates for the unknown parameters, e.g., item difficulties, item discrimination and variance of the latent variable, which enter Equation (9) via Equation (8), we can plug these estimates into the right-hand side and evaluate the model fit via some discrepancy measure between the observed frequency count and the expected count according to the model.<xref ref-type="fn" rid="fn0003"><sup>3</sup></xref></p>
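As a concrete illustration of this frequency statement, the sketch below evaluates Equation (9) for a hypothetical three-item test with logistic (2PL) item response functions and a standard normal latent variable; all item parameters are invented for the example.

```python
import numpy as np

# Hypothetical 2PL item parameters (invented for illustration)
a = np.array([1.0, 1.5, 0.8])    # discrimination
b = np.array([-0.5, 0.0, 1.0])   # difficulty

def irf(theta, j):
    """f_j(theta) = P(y_j = 1 | theta) with a logistic (2PL) shape."""
    return 1.0 / (1.0 + np.exp(-a[j] * (theta - b[j])))

def pattern_prob(pattern):
    """Marginal probability of a response pattern: Equation (9) with the
    sum replaced by numerical integration over a standard normal theta."""
    theta = np.linspace(-6.0, 6.0, 2001)
    dens = np.exp(-theta**2 / 2) / np.sqrt(2 * np.pi)
    lik = np.ones_like(theta)
    for j, y in enumerate(pattern):
        p = irf(theta, j)
        lik = lik * (p if y == 1 else 1 - p)
    return float(np.sum(lik * dens) * (theta[1] - theta[0]))

# The probabilities of all 2^3 patterns must sum to one
total = sum(pattern_prob((u, v, t))
            for u in (0, 1) for v in (0, 1) for t in (0, 1))

# Expected frequency of the pattern (1, 0, 0) among n = 500 test takers
expected = 500 * pattern_prob((1, 0, 0))
print(round(total, 4), round(expected, 1))
```

Comparing such expected counts with observed counts yields the discrepancy measure mentioned above; replacing the normal density `dens` with a different distribution shows how misfit can arise solely from the latent distribution, even with correctly specified IRFs.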
<p>From the above outline, the following may be deduced. Firstly, as the computation of the marginals in (9) also involves <italic>G</italic>, an IRT model can show misfit despite a correct specification of the dimensionality of &#x003B8; and of each IRF. This misfit is then solely caused by an incorrectly specified distribution function of the latent variable (i.e., of the random effect). Secondly, it is difficult to construct test statistics which allow for a detailed analysis of the cause of misspecification. That is, although we may observe a practically meaningful deviation of the observed and expected counts, we may not know whether this deviation results from a misspecification of the IRFs or of the distribution function. And thirdly, it follows from the first aspect that the model fit is highly dependent on subpopulations. That is, given two populations which only differ in the distribution of the latent ability, the appropriateness of the IRT model will be evaluated differently. In essence, this is already highlighted in Equation (9): according to the law of total probability, (marginal) probabilities are always affected by the marginal distribution of the partitioning random variable (&#x003B8; in this case) and differ from each other&#x02014;even if all conditional distributions are identical.</p>
<p>We may further elaborate on the latter point: Assuming a validly constructed numerical IQ scale in accordance with the usual assumptions (entailing the normality of &#x003B8;), it follows that we are likely to encounter nonnormality in subpopulations. This happens, for example, if we have a mixture of two subpopulations which differ in the location or variance of the latent ability (reasoning analogous to Section 4.1 applies). Likewise, it happens if there is a variance restriction, as when the scale is used for job selection, wherein the applicants are expected to show less variation in IQ due to the requirements of the job profile (e.g., engineers; cf. Section 3). Both cases depict a simple, practically relevant mechanism which destroys any previously existing normality. In conjunction with the second example, it follows that two researchers who examine the same scale in different (sub)populations are likely to disagree on the fit of the model solely due to a strong assumption on the distribution of &#x003B8;.</p>
<p>Importantly, the outlined results also appear in other GLMM-type models: every GLMM requires the specification of the distribution of an unobservable latent quantity.</p></sec></sec>
<sec id="s5">
<title>5. Minimizing the risk of biased inferences</title>
<p>The path to valid conclusions is narrow, and any violation of an assumption along the way can lead to biased conclusions. However, there are also strategies for dealing with the potential problems discussed in this paper. A scientifically sound approach to empirical research is, first, to be aware of the assumptions underlying the selection and analysis steps and, second, to explicitly state the assumptions and justify their validity. Both aspects require the following triad: A sufficiently developed theory, appropriate methods to generate and analyze the data, and a reliable body of relevant empirical studies. Appropriateness of the methods in turn implies availability and good knowledge of the adopted statistical techniques. All three components are necessarily interdependent and ideally evolve iteratively as knowledge is accumulated.</p>
<p>It follows from the foregoing sections that the maturity of a theory determines how precisely <inline-formula><mml:math id="M78"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mo>&#x025B3;</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> and DGP&#x000A0;&#x00394; can be defined. The more developed a theory, the better informed a possible sampling design, the better justified the statistical analysis tools, and consequently the fewer untestable assumptions required. This in turn increases the credibility of inferences and helps to build better theories. Therefore, at any point in the process, available knowledge should be used to challenge, sharpen and develop a theory. In general, however, the systematic development of theories does not seem to have been given a high priority in psychological research (e.g., Meehl, <xref ref-type="bibr" rid="B33">1978</xref>; Fiedler, <xref ref-type="bibr" rid="B13">2014</xref>; Eronen and Bringmann, <xref ref-type="bibr" rid="B8">2021</xref>; McPhetres et al., <xref ref-type="bibr" rid="B32">2021</xref>; Szollosi and Donkin, <xref ref-type="bibr" rid="B61">2021</xref>). In the usual case, where the definition of the target population is vague at best, conclusions should be interpreted with great caution and perhaps limited to a smaller, defensible subpopulation, such as a group of students in a particular subject and age group.</p>
<p>A crucial condition for ignoring the selection mechanism resembles the fundamental condition in experimental settings to avoid systematic effects of confounding variables: Selection into the &#x0201C;observed&#x0201D; vs. &#x0201C;not observed&#x0201D; conditions may depend on observed covariates but not additionally on the outcome. In many psychological studies, this is implicitly assumed without further justification, but in order to allow compensation of a possible selectivity of an observed subsample and thus to justify statements about a broader subpopulation or even <inline-formula><mml:math id="M79"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, the selection mechanism, the relevant variables and their relationships with the DGP&#x000A0;&#x00394; must be known. Thus, in addition to the theory of interest, at least a rudimentary auxiliary theory of response behavior must be available.</p>
<p>Based on replications of a study, which need not be exact, knowledge of response behavior can be built up iteratively by collecting variables informative of non-response. This can consist of individual information about non-respondents such as age or cohort membership in terms of age groups, field of study if units are students, or residential area (e.g., Groves et al., <xref ref-type="bibr" rid="B18">2001</xref>). Although trying to collect this additional information requires more expensive data collection methods, it would allow researchers to adopt a weighting strategy, to include a correction term in the estimated model, to apply a (full information) maximum likelihood method or to generate multiple imputations to compensate for missing units (e.g., Rubin, <xref ref-type="bibr" rid="B49">1987</xref>; Robins et al., <xref ref-type="bibr" rid="B44">1995</xref>; Schafer and Graham, <xref ref-type="bibr" rid="B51">2002</xref>; Wooldridge, <xref ref-type="bibr" rid="B63">2002</xref>, <xref ref-type="bibr" rid="B64">2007</xref>, <xref ref-type="bibr" rid="B65">2010</xref>). To allow valid inference, all these techniques require, in addition to more or less strong modeling assumptions, that all variables relevant to the non-response process are included in the analysis.</p>
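A minimal sketch of the weighting strategy, under assumed rather than estimated response probabilities: response depends only on an observed auxiliary variable, here called age, and respondents are reweighted by the inverse of their response probability.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
age = rng.uniform(20, 60, size=n)
y = 0.05 * age + rng.normal(size=n)     # outcome depends on age

# Response depends on age only (selection on an observed covariate,
# not additionally on the outcome)
p_resp = 1.0 / (1.0 + np.exp(-(-3.0 + 0.08 * age)))
resp = rng.uniform(size=n) < p_resp

# Naive respondent mean vs. inverse-probability-weighted mean; in
# practice p_resp would be estimated from the auxiliary variables
naive = y[resp].mean()
ipw = np.average(y[resp], weights=1.0 / p_resp[resp])
print(f"true mean {y.mean():.3f}, naive {naive:.3f}, weighted {ipw:.3f}")
```

The naive mean over-represents older respondents and is biased, while the weighted mean recovers the population mean; the correction works only because every variable driving non-response (here, age) is observed and included.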
<p>In addition to variables directly related to DGP&#x000A0;&#x00394; or response behavior, variables could be collected for explanatory purposes to help build an increasingly strong foundation by sharpening the definition of <inline-formula><mml:math id="M80"><mml:mrow><mml:msub><mml:mtext>&#x1D4AB;</mml:mtext><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, helping to learn about possible mixture populations, and thus increasing knowledge about DGP&#x000A0;&#x00394;. The necessary exploratory analyses should be incentivized by publishing these as independent, citable articles. Similarly, research on the reasons for non-response should be encouraged to provide the research community with information on variables to compensate for unobserved units in related contexts.</p>
<p>If the theory underlying a research question of interest does not justify the assumptions necessary for the adopted analysis method, or if empirical results raise doubts about whether they are met, then a sensitivity analysis, a multiverse analysis (Steegen et al., <xref ref-type="bibr" rid="B58">2016</xref>), or the adoption of a robust or non-parametric estimation method may be an appropriate choice. The basic idea of sensitivity analyses is to analyze the data set at hand under a range of plausible assumptions. If inferences do not change substantially, they are robust with respect to this set of plausible assumptions (e.g., Rosenbaum and Rubin, <xref ref-type="bibr" rid="B45">1983</xref>; in the context of missing values, see Rubin, <xref ref-type="bibr" rid="B49">1987</xref>). This strategy, although not new, has not received much attention in applied research.</p>
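The basic idea of such a sensitivity analysis can be sketched in a few lines (simulated data; the shift parameter delta and all names are hypothetical choices for illustration): the data are re-analyzed under a grid of assumptions about how far non-respondents deviate, on average, from respondents.

```python
import numpy as np

rng = np.random.default_rng(7)
y_obs = rng.normal(0.3, 1.0, 500)    # observed outcomes
n_mis = 200                          # number of unobserved units

# Delta-adjustment: non-respondents are assumed to differ from respondents
# by a shift delta on average; the estimate is recomputed over a grid.
estimates = {}
for delta in (-0.5, -0.25, 0.0, 0.25, 0.5):
    y_imp = np.full(n_mis, y_obs.mean() + delta)
    estimates[delta] = np.concatenate([y_obs, y_imp]).mean()

for delta, est in estimates.items():
    print(f"delta={delta:+.2f}  estimate={est:.3f}")
```

If the substantive conclusion (e.g., the sign of the estimate) is stable over the range of plausible deltas, the inference is robust with respect to this set of assumptions; otherwise the tipping point itself is informative.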
<p>However, there is a way around parametric models based on strong assumptions. Semi- or non-parametric methods require larger, although not necessarily much larger, samples (e.g., Spiess and Hamerle, <xref ref-type="bibr" rid="B57">2000</xref>), but also less fully specified theories, which is helpful at an earlier stage of theory development. If a random sample is then selected from a clearly defined subpopulation according to a known sampling design, and auxiliary variables are surveyed to compensate for possible selectivity due to non-response, the results may cautiously be interpreted with respect to the addressed subpopulation, provided that model diagnostics following the analyses do not indicate serious violations of assumptions. Of course, the whole procedure, including all the variables surveyed, should be described in detail, and the data should be made publicly available to allow replications and evaluation of the results.</p>
<p>Semiparametric approaches requiring less strong assumptions have been proposed, e.g., in biometrics and econometrics. Hansen (<xref ref-type="bibr" rid="B20">1982</xref>) proposed a generalized method of moments (GMM) approach and Liang and Zeger (<xref ref-type="bibr" rid="B28">1986</xref>) a generalized estimating equations (GEE) approach. For valid inferences in (non-)linear (panel or repeated measurement) regression models, both approaches require only correct specification of the fixed part of a model, whereas the covariance structure may be misspecified. GMM is more flexible, as it allows the estimation of more general models than GEE, but the latter is easier to use. Both approaches have been adapted or generalized since the 1980s to deal with many different situations, e.g., high-dimensional data (Fan and Liao, <xref ref-type="bibr" rid="B11">2014</xref>), panel or repeated measurement models with mixed continuous and ordinal outcomes (Spiess, <xref ref-type="bibr" rid="B56">2006</xref>), or ordered stereotype models (Spiess et al., <xref ref-type="bibr" rid="B55">2020</xref>). Another approach, which allows modeling linear or much more general, smooth non-linear effects of covariates on the mean and further shape parameters of the (conditional) outcome distribution, is described in Rigby and Stasinopoulos (<xref ref-type="bibr" rid="B43">2005</xref>). This approach is helpful when the effects of some covariates cannot be assumed to be linear but need to be controlled for.</p>
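The robustness property underlying GEE-type estimators can be illustrated with a minimal working-independence estimator combined with a cluster-robust (&#x0201C;sandwich&#x0201D;) covariance; this is a sketch on simulated repeated-measurement data, not the full GEE machinery with non-trivial working correlation structures. The working covariance (independence) is deliberately misspecified here, since the data contain a cluster-specific random effect, yet the mean parameters and their sandwich standard errors remain valid:

```python
import numpy as np

rng = np.random.default_rng(3)
n_clusters, t = 300, 4                 # panel: 300 units, 4 repeated measures
beta = np.array([1.0, 0.5])            # true intercept and slope

x = rng.normal(size=(n_clusters, t))
u = rng.normal(scale=1.0, size=(n_clusters, 1))   # cluster random effect
y = beta[0] + beta[1] * x + u + rng.normal(size=(n_clusters, t))

# Working-independence estimation of the mean (ordinary least squares) ...
X = np.stack([np.ones_like(x), x], axis=-1)       # (n, t, 2) design
Xf, yf = X.reshape(-1, 2), y.reshape(-1)
bread = np.linalg.inv(Xf.T @ Xf)
b_hat = bread @ Xf.T @ yf

# ... plus a cluster-robust "sandwich" covariance, which stays valid even
# though the working covariance (independence) is misspecified here.
resid = y - X @ b_hat                             # (n, t) residuals
meat = sum(Xi.T @ np.outer(ri, ri) @ Xi for Xi, ri in zip(X, resid))
cov = bread @ meat @ bread
se = np.sqrt(np.diag(cov))
print(f"estimates={b_hat.round(3)}  robust se={se.round(3)}")
```

The estimated slope recovers the true value even though the within-cluster dependence was ignored in the estimating equations; only the standard errors need the correction.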
<p>For the non-parametric modeling approach, we limit ourselves to an example from IRT to illustrate that these arguably more robust approaches have long been available but have not been adopted by researchers in psychology: The theoretical underpinnings of some non-parametric approaches were established as early as the 1960s (Esary et al., <xref ref-type="bibr" rid="B9">1967</xref>). One of the first practical outlines of a non-parametric approach to IRT was then given in the early 1970s by Mokken (<xref ref-type="bibr" rid="B36">1971</xref>), and some important generalizations of the latter&#x02014;both in practical and theoretical aspects&#x02014;were established in the 1980s (e.g., Holland and Rosenbaum, <xref ref-type="bibr" rid="B23">1986</xref>) and 1990s (e.g., Ramsay, <xref ref-type="bibr" rid="B42">1991</xref>). These results generally provide robustness against misspecification of the distribution of &#x003B8; as well as misspecification of the IRFs. In many cases, <italic>G</italic> does not need to be specified at all, and the only relevant property of the IRF is monotonicity. Of course, this comes at a price, e.g., inference about the latent variable is based on simple sum scores. However, since sum scores are already dominant in practical applications, this does not seem to be a severe restriction in practice.</p>
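As a sketch of such a non-parametric check (simulated item responses; in the spirit of manifest monotonicity checks in Mokken scaling, not a reproduction of any specific published procedure), one can inspect whether the proportion of persons solving an item is non-decreasing in the rest score, i.e., the sum score over the remaining items. Nothing about the distribution of the latent trait or the parametric form of the IRFs enters the check:

```python
import numpy as np

rng = np.random.default_rng(11)
n, j = 5000, 10
theta = rng.normal(size=(n, 1))              # latent trait; G is never used below
diff = np.linspace(-1.5, 1.5, j)             # item locations
irf = 1.0 / (1.0 + np.exp(-(theta - diff)))  # monotone IRFs
resp = (rng.random((n, j)) < irf).astype(int)

# Manifest monotonicity check for item 0: the proportion solving the item
# should be non-decreasing in the rest score (sum of the other items).
item, rest = resp[:, 0], resp[:, 1:].sum(axis=1)
levels = [s for s in range(j) if (rest == s).sum() >= 100]  # skip sparse groups
prop = np.array([item[rest == s].mean() for s in levels])
print(dict(zip(levels, prop.round(2))))
```

A clearly increasing pattern across rest-score groups is consistent with monotone IRFs; systematic decreases would flag a violation, again without specifying <italic>G</italic> or the shape of the IRF.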
<p>Obviously, semi- or non-parametric approaches make less strong assumptions than fully parametric approaches by allowing certain aspects of the statistical models to be mis- or unspecified. Besides the fact that they usually require more observations than fully parametric approaches, inferences about the unspecified or possibly misspecified aspects, e.g., a working correlation matrix, are either not possible or should be drawn very cautiously. If no theory is available to justify a statistical model, including its assumptions, a better strategy, if possible, would be to use a simpler design in conjunction with a simple and robust evaluation method (e.g., Peterson, <xref ref-type="bibr" rid="B40">2009</xref>). A notable side-effect of relying on simple designs and analysis steps is the availability of sufficiently elaborated tools for model diagnostics.</p>
<sec id="s6">
<title>6. Discussion and conclusions</title>
<p>The methodological framework presented in Section 2 highlights the close linkage between scientific theory, sampling and data collection design, and the statistical methods and models adopted to empirically test the theory. Since few resources are devoted to the proper sampling of subjects from a well-defined population, and since missing data are oftentimes ignored or assumed to follow a convenient missing-data mechanism, it is likely that the assumptions of commonly used parametric models are often violated. As shown in Section 3, the consequences can range from marginal biases to, e.g., in the case of confidence intervals, actual coverage rates of true values close to zero, even in the analysis of experimental data. It should also be noted that the outlined methodological problems cannot be prevented by preregistration or a ban on null hypothesis testing, nor can they be uncovered by mere replications within the same or very similar subpopulations. Increasing sample sizes, e.g., via online data collection, makes things even worse: the biases in the estimators do not vanish, but the standard errors tend to zero, further lowering the actual coverage rates of confidence intervals in the case of biased estimators.</p>
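The last point can be made concrete with a small simulation (purely illustrative numbers, not tied to any study discussed here): a fixed bias in the estimator, combined with a standard error shrinking in the sample size, drives the actual coverage of a nominal 95% confidence interval toward zero as n grows.

```python
import numpy as np

rng = np.random.default_rng(5)
truth, bias = 0.0, 0.2     # selective sampling shifts the estimand by 0.2

coverage = []
for n in (50, 500, 5000):
    hits = 0
    for _ in range(1000):
        x = rng.normal(truth + bias, 1.0, n)          # biased "sample"
        m, se = x.mean(), x.std(ddof=1) / np.sqrt(n)  # bias stays, se shrinks
        hits += (m - 1.96 * se) <= truth <= (m + 1.96 * se)
    coverage.append(hits / 1000)
print(dict(zip((50, 500, 5000), coverage)))
```

The bias of 0.2 is harmless relative to the interval width at n = 50 but dwarfs it at n = 5000, so collecting more data of the same selective kind only makes the inference more confidently wrong.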
<p>Interestingly, although the approaches described in Section 5 circumvent severe problems in the estimation of general regression and IRT models, they seem to have largely been ignored. Instead, applied research seems to stick to convenience samples and highly specific (and fragile) parametric models. Among other reasons, such as publication policies, part of the problem may be that statistical training in psychology largely neglects sampling theory beyond sample size determination (e.g., S&#x000E4;rndal et al., <xref ref-type="bibr" rid="B50">1992</xref>), strategies for avoiding or compensating for non-response (e.g., Rubin, <xref ref-type="bibr" rid="B49">1987</xref>; Wooldridge, <xref ref-type="bibr" rid="B65">2010</xref>), and problems of model misspecification.</p>
<p>However, the problem of missing reported model checks seems to be mainly caused by two factors. Firstly, in many modeling classes there is no uniquely defined and accepted way of testing the modeling assumptions. In fact, the number of potentially applicable statistics can be arbitrarily large. For example, assessing unidimensionality in an IRT model with <italic>J</italic> items can entail more than 10<sup><italic>J</italic></sup> potential statistics (Ligtvoet, <xref ref-type="bibr" rid="B29">2022</xref>), and there is no universal way to check unidimensionality. In conjunction with the dominance of parametric models, this contributes to the fragility of the analysis.</p>
<p>Secondly, there is also an important connection between the lack of model checking and the so-called &#x0201C;garden of forking paths&#x0201D; (Gelman and Loken, <xref ref-type="bibr" rid="B16">2014</xref>). The latter describes a sequence of data-dependent choices a researcher makes in order to arrive at his/her final analysis result. At each step, another decision could have been made, with potential consequences for the outcome of the analysis. The mere fact that these decisions are not set a priori but are made in a data-dependent manner contributes to the inflated effect sizes reported in the literature. Now suppose a researcher has arrived at a final result that seems to make sense in terms of content. In this case, we would argue that carrying out additional model checks has become highly unlikely: not only could they &#x0201C;ruin&#x0201D; the result, they also carry the implication of going back to the drawing board and starting from scratch.</p>
<p>A potential way to resolve the problem of forking paths is given by preregistration of the study and by specifying the analysis protocol before looking at the data. However, if we were humble with respect to the validity of our proposed model in the preregistration step, our plan would need to entail the possibility of misspecification. In some cases this could very well be incorporated in the preregistration step (e.g., Nosek et al., <xref ref-type="bibr" rid="B37">2018</xref>). However, for complex types of analysis, the potential ways a model can fail and the number of alternative models grow very quickly, so that preregistration is unlikely to cover all potential paths of analysis. Furthermore, if it is suspected that the observed sample is selective and model diagnostics are considered an important part of the analysis, we must be open to sometimes unforeseen changes in the analysis plan, for otherwise we put too much trust in our models. This reveals that some proposals, such as preregistration, that aim to increase the trustworthiness of scientific research face additional major challenges, as the data dependence of the analysis may require switching to alternative models or procedures.</p>
<p>A longer-term strategy to overcome the shortcomings discussed above would be to increase students&#x00027; appreciation of statistics by emphasizing the close interaction of theory, methods, and empirical information. A simple example would be to ask students to define the population of humans about which inferences are being made, to compare this definition with the observed samples described in research papers, and to verbalize as clearly as possible the rationale and necessary assumptions for the inferences from the latter to the former. This exercise may also demonstrate that the validity of inferences depends on the weakest link in the chain. In addition, rather than teaching statistics as a clickable toolbox with many different models and techniques, and in addition to topics such as sample selection and missing-data compensation, it may be beneficial to treat in depth the consequences of violated assumptions of standard techniques and models. These consequences could be illustrated by simulating data sets following a real example, varying the assumptions being violated, and discussing the consequences for the inferences; in effect, this amounts to running small simulation experiments. Students should learn that the violation of some assumptions may have only mild consequences, whereas inferences can be very misleading if other assumptions are violated. The application of robust methods could be illustrated by applying semi- or non-parametric methods to a real problem for which the data set is available and comparing the results with those reported in the corresponding research paper. Although the described problem-oriented strategy relies on practical examples and illustrations (or simulations), the corresponding theoretical concepts should also be treated at a mathematical level that allows the key ideas to be understood. Generalizations to more complex models should then be possible for students even without recourse to simple but often superficial recipes.</p></sec>
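The contrast between mild and severe consequences of violated assumptions, as suggested for such simulation exercises, might be sketched as follows (illustrative choices, not drawn from any cited study: a one-sample t-test under skewed but independent errors vs. under serially correlated errors):

```python
import numpy as np

rng = np.random.default_rng(9)
reps, n = 2000, 50

def rejects(y):
    # two-sided one-sample t-test of H0: mean = 0 at the 5% level (df = 49)
    t = y.mean() / (y.std(ddof=1) / np.sqrt(len(y)))
    return abs(t) > 2.009

# Mild violation: skewed (centered exponential) but independent errors.
skewed = np.mean([rejects(rng.exponential(1.0, n) - 1.0) for _ in range(reps)])

# Severe violation: positively autocorrelated errors (AR(1) with rho = 0.5),
# analyzed as if the observations were independent.
def ar1(n, rho=0.5):
    e = rng.normal(size=n)
    y = np.empty(n)
    y[0] = e[0]
    for i in range(1, n):
        y[i] = rho * y[i - 1] + e[i]
    return y

dependent = np.mean([rejects(ar1(n)) for _ in range(reps)])
print(f"type I error: skewed={skewed:.3f}  dependent={dependent:.3f}  nominal=0.050")
```

Under skewness, the empirical type I error stays reasonably close to the nominal 5%, whereas ignored positive serial correlation inflates it several-fold, which is exactly the kind of differentiated lesson the exercise is meant to convey.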
<sec sec-type="author-contributions" id="s7">
<title>Author contributions</title>
<p>MS: Conceptualization, Formal analysis, Visualization, Writing&#x02014;original draft, Writing&#x02014;review and editing. PJ: Conceptualization, Formal analysis, Writing&#x02014;original draft, Writing&#x02014;review and editing.</p></sec>
</body>
<back>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.</p>
</sec>
<ack><p>We would like to thank the two referees for their helpful comments and suggestions.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec sec-type="supplementary-material" id="s10">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1266447/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1266447/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.PDF" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/></sec>
<fn-group>
<fn id="fn0001"><p><sup>1</sup>In fact, the CTT counterparts can be subsumed under the linear mixed model formulation, so that most of the following discussion applies also to the CTT framework.</p></fn>
<fn id="fn0002"><p><sup>2</sup>The local independence assumption is the formal manifestation of the statement that once the numerical IQ is fixed, the items show no statistical dependency anymore.</p></fn>
<fn id="fn0003"><p><sup>3</sup>There are some complications regarding the proper asymptotic behavior if the resulting table is sparse. Hence, our description is somewhat imprecise, as the correct setup would involve a properly defined likelihood function.</p></fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Amemiya</surname> <given-names>T.</given-names></name></person-group> (<year>1985</year>). <source>Advanced Econometrics</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Harvard University Press</publisher-name>.</citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Andersen</surname> <given-names>E. B.</given-names></name></person-group> (<year>1970</year>). <article-title>Asymptotic properties of conditional maximum-likelihood estimators</article-title>. <source>J. R. Stat. Soc. Ser. B Stat. Methodol.</source> <volume>32</volume>, <fpage>283</fpage>&#x02013;<lpage>301</lpage>. <pub-id pub-id-type="doi">10.1111/j.2517-6161.1970.tb00842.x</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arnett</surname> <given-names>J. J.</given-names></name></person-group> (<year>2008</year>). <article-title>The neglected 95%. Why American psychology needs to become less American</article-title>. <source>Am. Psychol.</source> <volume>63</volume>, <fpage>602</fpage>&#x02013;<lpage>614</lpage>. <pub-id pub-id-type="doi">10.1037/0003-066X.63.7.602</pub-id><pub-id pub-id-type="pmid">18855491</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Asendorpf</surname> <given-names>J. B.</given-names></name> <name><surname>Conner</surname> <given-names>M.</given-names></name> <name><surname>De Fruyt</surname> <given-names>F.</given-names></name> <name><surname>De Houwer</surname> <given-names>J.</given-names></name> <name><surname>Denissen</surname> <given-names>J. J. A.</given-names></name> <name><surname>Fiedler</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Recommendations for increasing replicability in psychology</article-title>. <source>Eur. J. Pers.</source> <volume>27</volume>, <fpage>108</fpage>&#x02013;<lpage>119</lpage>. <pub-id pub-id-type="doi">10.1002/per.1919</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Button</surname> <given-names>K. S.</given-names></name> <name><surname>Ioannidis</surname> <given-names>J. P. A.</given-names></name> <name><surname>Mokrysz</surname> <given-names>C.</given-names></name> <name><surname>Nosek</surname> <given-names>B. A.</given-names></name> <name><surname>Flint</surname> <given-names>J.</given-names></name> <name><surname>Robinson</surname> <given-names>E. S. J.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Power failure: why small sample size undermines the reliability of neuroscience</article-title>. <source>Nat. Rev. Neurosci.</source> <volume>14</volume>, <fpage>365</fpage>&#x02013;<lpage>376</lpage>. <pub-id pub-id-type="doi">10.1038/nrn3475</pub-id><pub-id pub-id-type="pmid">23571845</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen</surname> <given-names>J.</given-names></name></person-group> (<year>1962</year>). <article-title>The statistical power of abnormal-social psychological research: a review</article-title>. <source>J. Abnorm. Soc. Psychol.</source> <volume>65</volume>, <fpage>145</fpage>&#x02013;<lpage>153</lpage>. <pub-id pub-id-type="doi">10.1037/h0045186</pub-id><pub-id pub-id-type="pmid">13880271</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dykiert</surname> <given-names>D.</given-names></name> <name><surname>Der</surname> <given-names>G.</given-names></name> <name><surname>Starr</surname> <given-names>J. M.</given-names></name> <name><surname>Deary</surname> <given-names>I. J.</given-names></name></person-group> (<year>2012</year>). <article-title>Age differences in intra-individual variability in simple and choice reaction time: systematic review and meta-analysis</article-title>. <source>PLoS ONE</source> <volume>7</volume>, <fpage>e45759</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0045759</pub-id><pub-id pub-id-type="pmid">23071524</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eronen</surname> <given-names>M. I.</given-names></name> <name><surname>Bringmann</surname> <given-names>L. F.</given-names></name></person-group> (<year>2021</year>). <article-title>The theory crisis in psychology: how to move forward</article-title>. <source>Perspect. Psychol. Sci.</source> <volume>16</volume>, <fpage>779</fpage>&#x02013;<lpage>788</lpage>. <pub-id pub-id-type="doi">10.1177/1745691620970586</pub-id><pub-id pub-id-type="pmid">33513314</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Esary</surname> <given-names>J. D.</given-names></name> <name><surname>Proschan</surname> <given-names>F.</given-names></name> <name><surname>Walkup</surname> <given-names>D. W.</given-names></name></person-group> (<year>1967</year>). <article-title>Association of random variables, with applications</article-title>. <source>Ann. Math. Stat.</source> <volume>38</volume>, <fpage>1466</fpage>&#x02013;<lpage>1474</lpage>. <pub-id pub-id-type="doi">10.1214/aoms/1177698701</pub-id></citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Falk</surname> <given-names>E. B.</given-names></name> <name><surname>Hyde</surname> <given-names>L. W.</given-names></name> <name><surname>Mitchell</surname> <given-names>C.</given-names></name> <name><surname>Faul</surname> <given-names>J.</given-names></name> <name><surname>Gonzalez</surname> <given-names>R.</given-names></name> <name><surname>Heitzeg</surname> <given-names>M. M.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>What is a representative brain? Neuroscience meets population science</article-title>. <source>PNAS</source> <volume>110</volume>, <fpage>17615</fpage>&#x02013;<lpage>17622</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1310134110</pub-id><pub-id pub-id-type="pmid">24151336</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fan</surname> <given-names>J.</given-names></name> <name><surname>Liao</surname> <given-names>Y.</given-names></name></person-group> (<year>2014</year>). <article-title>Endogeneity in high dimensions</article-title>. <source>Ann. Stat.</source> <volume>42</volume>, <fpage>872</fpage>&#x02013;<lpage>917</lpage>. <pub-id pub-id-type="doi">10.1214/13-AOS1202</pub-id><pub-id pub-id-type="pmid">25580040</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fernald</surname> <given-names>A.</given-names></name></person-group> (<year>2010</year>). <article-title>Getting beyond the &#x0201C;convenience sample&#x0201D; in research on early cognitive development</article-title>. <source>Behav. Brain. Sci.</source> <volume>33</volume>, <fpage>91</fpage>&#x02013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1017/S0140525X10000294</pub-id><pub-id pub-id-type="pmid">20546649</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fiedler</surname> <given-names>K.</given-names></name></person-group> (<year>2014</year>). <article-title>From intrapsychic to ecological theories in social psychology: outlines of a functional theory approach</article-title>. <source>Eur. J. Soc. Psychol.</source> <volume>44</volume>, <fpage>657</fpage>&#x02013;<lpage>670</lpage>. <pub-id pub-id-type="doi">10.1002/ejsp.2069</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fiedler</surname> <given-names>K.</given-names></name></person-group> (<year>2017</year>). <article-title>What constitutes strong psychological science? The (neglected) role of diagnosticity and a priori theorizing</article-title>. <source>Perspect. Psychol. Sci.</source> <volume>12</volume>, <fpage>46</fpage>&#x02013;<lpage>61</lpage>. <pub-id pub-id-type="doi">10.1177/1745691616654458</pub-id><pub-id pub-id-type="pmid">28073328</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fricker</surname> <given-names>R. D.</given-names> <suffix>Jr.</suffix></name> <name><surname>Burke</surname> <given-names>K.</given-names></name> <name><surname>Han</surname> <given-names>X.</given-names></name> <name><surname>Woodall</surname> <given-names>W. H.</given-names></name></person-group> (<year>2019</year>). <article-title>Assessing the statistical analyses used in basic and applied social psychology after their <italic>p</italic>-value ban</article-title>. <source>Am. Stat.</source> <volume>73</volume>, <fpage>374</fpage>&#x02013;<lpage>384</lpage>. <pub-id pub-id-type="doi">10.1080/00031305.2018.1537892</pub-id></citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gelman</surname> <given-names>A.</given-names></name> <name><surname>Loken</surname> <given-names>E.</given-names></name></person-group> (<year>2014</year>). <article-title>The statistical crisis in science</article-title>. <source>Am. Sci.</source> <volume>102</volume>, <fpage>460</fpage>. <pub-id pub-id-type="doi">10.1511/2014.111.460</pub-id></citation>
</ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gigerenzer</surname> <given-names>G.</given-names></name></person-group> (<year>2018</year>). <article-title>Statistical rituals: the replication delusion and how we got there</article-title>. <source>Adv. Methods Pract. Psychol. Sci.</source> <volume>1</volume>, <fpage>198</fpage>&#x02013;<lpage>218</lpage>. <pub-id pub-id-type="doi">10.1177/2515245918771329</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Groves</surname> <given-names>R. M.</given-names></name> <name><surname>Dillman</surname> <given-names>D. A.</given-names></name> <name><surname>Eltinge</surname> <given-names>J. L.</given-names></name> <name><surname>Little</surname> <given-names>R. J. A.</given-names></name></person-group> (<year>2001</year>). <source>Survey Nonresponse.</source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>John Wiley &#x00026; Sons</publisher-name>.</citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hahn</surname> <given-names>U.</given-names></name></person-group> (<year>2011</year>). <article-title>The problem of circularity in evidence, argument, and explanation</article-title>. <source>Perspect. Psychol. Sci.</source> <volume>6</volume>, <fpage>172</fpage>&#x02013;<lpage>182</lpage>. <pub-id pub-id-type="doi">10.1177/1745691611400240</pub-id><pub-id pub-id-type="pmid">26162136</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hansen</surname> <given-names>L. P.</given-names></name></person-group> (<year>1982</year>). <article-title>Large sample properties of generalized method of moments estimators</article-title>. <source>Econometrica</source> <volume>50</volume>, <fpage>1029</fpage>&#x02013;<lpage>1054</lpage>. <pub-id pub-id-type="doi">10.2307/1912775</pub-id></citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Heckman</surname> <given-names>J. J.</given-names></name></person-group> (<year>1979</year>). <article-title>Sample selection bias as a specification error</article-title>. <source>Econometrica</source> <volume>47</volume>, <fpage>153</fpage>&#x02013;<lpage>161</lpage>. <pub-id pub-id-type="doi">10.2307/1912352</pub-id></citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Henrich</surname> <given-names>J.</given-names></name> <name><surname>Heine</surname> <given-names>S. J.</given-names></name> <name><surname>Norenzayan</surname> <given-names>A.</given-names></name></person-group> (<year>2010</year>). <article-title>The weirdest people in the world?</article-title> <source>Behav. Brain. Sci.</source> <volume>33</volume>, <fpage>61</fpage>&#x02013;<lpage>83</lpage>. <pub-id pub-id-type="doi">10.1017/S0140525X0999152X</pub-id><pub-id pub-id-type="pmid">20550733</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holland</surname> <given-names>P. W.</given-names></name> <name><surname>Rosenbaum</surname> <given-names>P. R.</given-names></name></person-group> (<year>1986</year>). <article-title>Conditional association and unidimensionality in monotone latent variable models</article-title>. <source>Ann. Stat.</source> <volume>14</volume>, <fpage>1523</fpage>&#x02013;<lpage>1543</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1176350174</pub-id><pub-id pub-id-type="pmid">36933110</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>J.</given-names></name></person-group> (<year>2007</year>). <source>Linear and Generalized Linear Mixed Models and Their Applications.</source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer Science &#x00026; Business Media</publisher-name>.<pub-id pub-id-type="pmid">34823466</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Klein</surname> <given-names>R. A.</given-names></name> <name><surname>Ratliff</surname> <given-names>K. A.</given-names></name> <name><surname>Vianello</surname> <given-names>M.</given-names></name> <name><surname>Adams</surname> <given-names>R. B.</given-names> <suffix>Jr.</suffix></name> <name><surname>Bahn&#x000ED;k</surname> <given-names>&#x00160;.</given-names></name> <name><surname>Bernstein</surname> <given-names>M. J.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Investigating variation in replicability a &#x0201C;Many Labs&#x0201D; replication project</article-title>. <source>Soc. Psychol.</source> <volume>45</volume>, <fpage>142</fpage>&#x02013;<lpage>152</lpage>. <pub-id pub-id-type="doi">10.1027/1864-9335/a000178</pub-id><pub-id pub-id-type="pmid">36690972</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Klein</surname> <given-names>R. A.</given-names></name> <name><surname>Vianello</surname> <given-names>M.</given-names></name> <name><surname>Hasselman</surname> <given-names>F.</given-names></name> <name><surname>Adams</surname> <given-names>B. G.</given-names></name> <name><surname>Adams</surname> <given-names>R. B.</given-names> <suffix>Jr.</suffix></name> <name><surname>Alper</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Many labs 2: investigating variation in replicability across samples and settings</article-title>. <source>Adv. Methods. Pract. Psychol. Sci.</source> <volume>1</volume>, <fpage>443</fpage>&#x02013;<lpage>490</lpage>. <pub-id pub-id-type="doi">10.1177/2515245918810225</pub-id></citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kline</surname> <given-names>B.</given-names></name></person-group> (<year>2015</year>). <article-title>The mediation myth</article-title>. <source>Basic Appl. Soc. Psychol.</source> <volume>37</volume>, <fpage>202</fpage>&#x02013;<lpage>213</lpage>. <pub-id pub-id-type="doi">10.1080/01973533.2015.1049349</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liang</surname> <given-names>K.-Y.</given-names></name> <name><surname>Zeger</surname> <given-names>S. L.</given-names></name></person-group> (<year>1986</year>). <article-title>Longitudinal data analysis using generalized linear models</article-title>. <source>Biometrika</source> <volume>73</volume>, <fpage>13</fpage>&#x02013;<lpage>22</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/73.1.13</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ligtvoet</surname> <given-names>R.</given-names></name></person-group> (<year>2022</year>). <article-title>Incomplete tests of conditional association for the assessment of model assumptions</article-title>. <source>Psychometrika</source> <volume>87</volume>, <fpage>1214</fpage>&#x02013;<lpage>1237</lpage>. <pub-id pub-id-type="doi">10.1007/s11336-022-09841-1</pub-id><pub-id pub-id-type="pmid">35124767</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Mardia</surname> <given-names>K. V.</given-names></name> <name><surname>Kent</surname> <given-names>J. T.</given-names></name> <name><surname>Bibby</surname> <given-names>J. M.</given-names></name></person-group> (<year>1979</year>). <source>Multivariate Analysis.</source> <publisher-loc>London</publisher-loc>: <publisher-name>Academic Press</publisher-name>.</citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McCulloch</surname> <given-names>C. E.</given-names></name> <name><surname>Neuhaus</surname> <given-names>J. M.</given-names></name> <name><surname>Olin</surname> <given-names>R. L.</given-names></name></person-group> (<year>2016</year>). <article-title>Biased and unbiased estimation in longitudinal studies with informative visit processes</article-title>. <source>Biometrics</source> <volume>72</volume>, <fpage>1315</fpage>&#x02013;<lpage>1324</lpage>. <pub-id pub-id-type="doi">10.1111/biom.12501</pub-id><pub-id pub-id-type="pmid">26990830</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McPhetres</surname> <given-names>J.</given-names></name> <name><surname>Albayrak-Aydemir</surname> <given-names>N.</given-names></name> <name><surname>Barbosa Mendes</surname> <given-names>A.</given-names></name> <name><surname>Chow</surname> <given-names>E. C.</given-names></name> <name><surname>Gonzalez-Marquez</surname> <given-names>P.</given-names></name> <name><surname>Loukras</surname> <given-names>E.</given-names></name> <etal/></person-group> (<year>2021</year>). <article-title>A decade of theory as reflected in Psychological Science (2009&#x02013;2019)</article-title>. <source>PLOS ONE</source> <volume>16</volume>, <fpage>e0247986</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0247986</pub-id><pub-id pub-id-type="pmid">33667242</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meehl</surname> <given-names>P. E.</given-names></name></person-group> (<year>1978</year>). <article-title>Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology</article-title>. <source>J. Consult. Clin. Psychol.</source> <volume>46</volume>, <fpage>806</fpage>&#x02013;<lpage>834</lpage>. <pub-id pub-id-type="doi">10.1037/0022-006X.46.4.806</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meehl</surname> <given-names>P. E.</given-names></name></person-group> (<year>1967</year>). <article-title>Theory-testing in psychology and physics: a methodological paradox</article-title>. <source>Philos. Sci.</source> <volume>34</volume>, <fpage>103</fpage>&#x02013;<lpage>115</lpage>.</citation>
</ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meehl</surname> <given-names>P. E.</given-names></name></person-group> (<year>1990</year>). <article-title>Appraising and amending theories: the strategy of Lakatosian defense and two principles that warrant it</article-title>. <source>Psychol. Inq.</source> <volume>1</volume>, <fpage>108</fpage>&#x02013;<lpage>141</lpage>.</citation>
</ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Mokken</surname> <given-names>R. J.</given-names></name></person-group> (<year>1971</year>). <source>A Theory and Procedure of Scale Analysis.</source> <publisher-loc>The Hague</publisher-loc>: <publisher-name>Mouton</publisher-name>.</citation>
</ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nosek</surname> <given-names>B. A.</given-names></name> <name><surname>Ebersole</surname> <given-names>C. R.</given-names></name> <name><surname>DeHaven</surname> <given-names>A. C.</given-names></name> <name><surname>Mellor</surname> <given-names>D. T.</given-names></name></person-group> (<year>2018</year>). <article-title>The preregistration revolution</article-title>. <source>PNAS</source> <volume>115</volume>, <fpage>2600</fpage>&#x02013;<lpage>2606</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1708274114</pub-id><pub-id pub-id-type="pmid">29531091</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><collab>Open Science Collaboration</collab></person-group> (<year>2012</year>). <article-title>An open, large-scale, collaborative effort to estimate the reproducibility of psychological science</article-title>. <source>Perspect. Psychol. Sci.</source> <volume>7</volume>, <fpage>657</fpage>&#x02013;<lpage>660</lpage>. <pub-id pub-id-type="doi">10.1177/1745691612462588</pub-id><pub-id pub-id-type="pmid">26168127</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><collab>Open Science Collaboration</collab></person-group> (<year>2015</year>). <article-title>Estimating the reproducibility of psychological science</article-title>. <source>Science</source> <volume>349</volume>, <fpage>aac4716</fpage>. <pub-id pub-id-type="doi">10.1126/science.aac4716</pub-id><pub-id pub-id-type="pmid">26315443</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peterson</surname> <given-names>C.</given-names></name></person-group> (<year>2009</year>). <article-title>Minimally sufficient research</article-title>. <source>Perspect. Psychol. Sci.</source> <volume>4</volume>, <fpage>7</fpage>&#x02013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1111/j.1745-6924.2009.01089.x</pub-id><pub-id pub-id-type="pmid">26158822</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Pratkanis</surname> <given-names>A. R.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;The (partial but) real crisis in social psychology: a social influence analysis of the causes and solutions,&#x0201D;</article-title> in <source>Psychological Science Under Scrutiny: Recent Challenges and Proposed Solutions</source>, eds S. O. Lilienfeld, and I. D. Waldman (Chapter 9) (<publisher-loc>Hoboken, NJ</publisher-loc>: <publisher-name>Wiley</publisher-name>), <fpage>141</fpage>&#x02013;<lpage>163</lpage>.</citation>
</ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ramsay</surname> <given-names>J. O.</given-names></name></person-group> (<year>1991</year>). <article-title>Kernel smoothing approaches to nonparametric item characteristic curve estimation</article-title>. <source>Psychometrika</source> <volume>56</volume>, <fpage>611</fpage>&#x02013;<lpage>630</lpage>. <pub-id pub-id-type="doi">10.1007/BF02294494</pub-id></citation>
</ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rigby</surname> <given-names>R. A.</given-names></name> <name><surname>Stasinopoulos</surname> <given-names>D. M.</given-names></name></person-group> (<year>2005</year>). <article-title>Generalized additive models for location, scale and shape (with discussion)</article-title>. <source>J. R. Stat. Soc. Ser. C Appl. Stat.</source> <volume>54</volume>, <fpage>507</fpage>&#x02013;<lpage>554</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9876.2005.00510.x</pub-id></citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Robins</surname> <given-names>J. M.</given-names></name> <name><surname>Rotnitzky</surname> <given-names>A.</given-names></name> <name><surname>Zhao</surname> <given-names>L. P.</given-names></name></person-group> (<year>1995</year>). <article-title>Analysis of semiparametric regression models for repeated outcomes in the presence of missing data</article-title>. <source>J. Am. Stat. Assoc.</source> <volume>90</volume>, <fpage>106</fpage>&#x02013;<lpage>121</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1995.10476493</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosenbaum</surname> <given-names>P. R.</given-names></name> <name><surname>Rubin</surname> <given-names>D. B.</given-names></name></person-group> (<year>1983</year>). <article-title>Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome</article-title>. <source>J. R. Stat. Soc. Ser. B Stat. Methodol.</source> <volume>45</volume>, <fpage>212</fpage>&#x02013;<lpage>218</lpage>.</citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosenthal</surname> <given-names>R.</given-names></name></person-group> (<year>1979</year>). <article-title>The &#x0201C;File Drawer Problem&#x0201D; and tolerance for null results</article-title>. <source>Psychol. Bull.</source> <volume>86</volume>, <fpage>638</fpage>&#x02013;<lpage>641</lpage>. <pub-id pub-id-type="doi">10.1037/0033-2909.86.3.638</pub-id></citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rozeboom</surname> <given-names>W. W.</given-names></name></person-group> (<year>1960</year>). <article-title>The fallacy of the null-hypothesis significance test</article-title>. <source>Psychol. Bull.</source> <volume>57</volume>, <fpage>416</fpage>&#x02013;<lpage>428</lpage>. <pub-id pub-id-type="doi">10.1037/h0042040</pub-id><pub-id pub-id-type="pmid">13744252</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>D. B.</given-names></name></person-group> (<year>1976</year>). <article-title>Inference and missing data</article-title>. <source>Biometrika</source> <volume>63</volume>, <fpage>581</fpage>&#x02013;<lpage>590</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/63.3.581</pub-id></citation>
</ref>
<ref id="B49">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>D. B.</given-names></name></person-group> (<year>1987</year>). <source>Multiple Imputation for Nonresponse in Surveys.</source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>John Wiley &#x00026; Sons</publisher-name>.</citation>
</ref>
<ref id="B50">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>S&#x000E4;rndal</surname> <given-names>C.-E.</given-names></name> <name><surname>Swensson</surname> <given-names>B.</given-names></name> <name><surname>Wretman</surname> <given-names>J.</given-names></name></person-group> (<year>1992</year>). <source>Model Assisted Survey Sampling.</source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer</publisher-name>.</citation>
</ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schafer</surname> <given-names>J. L.</given-names></name> <name><surname>Graham</surname> <given-names>J. W.</given-names></name></person-group> (<year>2002</year>). <article-title>Missing data: our view of the state of the art</article-title>. <source>Psychol. Methods</source> <volume>7</volume>, <fpage>147</fpage>&#x02013;<lpage>177</lpage>. <pub-id pub-id-type="doi">10.1037/1082-989X.7.2.147</pub-id><pub-id pub-id-type="pmid">12090408</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scholtz</surname> <given-names>S. E.</given-names></name> <name><surname>de Klerk</surname> <given-names>W.</given-names></name> <name><surname>de Beer</surname> <given-names>L. T.</given-names></name></person-group> (<year>2020</year>). <article-title>The use of research methods in psychological research: a systematised review</article-title>. <source>Front. Res. Metr. Anal.</source> <volume>5</volume>, <fpage>1</fpage>. <pub-id pub-id-type="doi">10.3389/frma.2020.00001</pub-id><pub-id pub-id-type="pmid">33870039</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sedlmeier</surname> <given-names>P.</given-names></name> <name><surname>Gigerenzer</surname> <given-names>G.</given-names></name></person-group> (<year>1989</year>). <article-title>Do studies of statistical power have an effect on the power of studies?</article-title> <source>Psychol. Bull.</source> <volume>105</volume>, <fpage>309</fpage>&#x02013;<lpage>316</lpage>. <pub-id pub-id-type="doi">10.1037/0033-2909.105.2.309</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shrout</surname> <given-names>P. E.</given-names></name> <name><surname>Rodgers</surname> <given-names>J. L.</given-names></name></person-group> (<year>2018</year>). <article-title>Psychology, science, and knowledge construction: broadening perspectives from the replication crisis</article-title>. <source>Annu. Rev. Psychol.</source> <volume>69</volume>, <fpage>487</fpage>&#x02013;<lpage>510</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-psych-122216-011845</pub-id><pub-id pub-id-type="pmid">29300688</pub-id></citation></ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Spiess</surname> <given-names>M.</given-names></name> <name><surname>Fern&#x000E1;ndez</surname> <given-names>D.</given-names></name> <name><surname>Nguyen</surname> <given-names>T.</given-names></name> <name><surname>Liu</surname> <given-names>I.</given-names></name></person-group> (<year>2020</year>). <article-title>Generalized estimating equations to estimate the ordered stereotype logit model for panel data</article-title>. <source>Stat. Med.</source> <volume>39</volume>, <fpage>1919</fpage>&#x02013;<lpage>1940</lpage>. <pub-id pub-id-type="doi">10.1002/sim.8520</pub-id><pub-id pub-id-type="pmid">32227517</pub-id></citation></ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Spiess</surname> <given-names>M.</given-names></name></person-group> (<year>2006</year>). <article-title>Estimation of a two-equation panel model with mixed continuous and ordered categorical outcomes and missing data</article-title>. <source>J. R. Stat. Soc. Ser. C Appl. Stat.</source> <volume>55</volume>, <fpage>525</fpage>&#x02013;<lpage>538</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9876.2006.00551.x</pub-id></citation>
</ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Spiess</surname> <given-names>M.</given-names></name> <name><surname>Hamerle</surname> <given-names>A.</given-names></name></person-group> (<year>2000</year>). <article-title>Regression models with correlated binary responses: a comparison of different methods in finite samples</article-title>. <source>Comput. Stat. Data Anal.</source> <volume>33</volume>, <fpage>439</fpage>&#x02013;<lpage>455</lpage>. <pub-id pub-id-type="doi">10.1016/S0167-9473(99)00065-1</pub-id></citation></ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steegen</surname> <given-names>S.</given-names></name> <name><surname>Tuerlinckx</surname> <given-names>F.</given-names></name> <name><surname>Gelman</surname> <given-names>A.</given-names></name> <name><surname>Vanpaemel</surname> <given-names>W.</given-names></name></person-group> (<year>2016</year>). <article-title>Increasing transparency through a multiverse analysis</article-title>. <source>Perspect. Psychol. Sci.</source> <volume>11</volume>, <fpage>702</fpage>&#x02013;<lpage>712</lpage>. <pub-id pub-id-type="doi">10.1177/1745691616658637</pub-id><pub-id pub-id-type="pmid">27694465</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sterling</surname> <given-names>T. D.</given-names></name> <name><surname>Rosenbaum</surname> <given-names>W. L.</given-names></name> <name><surname>Weinkam</surname> <given-names>J. J.</given-names></name></person-group> (<year>1995</year>). <article-title>Publication decisions revisited: the effect of the outcome of statistical tests on the decision to publish and vice versa</article-title>. <source>Am. Stat.</source> <volume>49</volume>, <fpage>108</fpage>&#x02013;<lpage>112</lpage>.</citation>
</ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sterling</surname> <given-names>T. D.</given-names></name></person-group> (<year>1959</year>). <article-title>Publication decisions and their possible effects on inferences drawn from tests of significance &#x02013; or vice versa</article-title>. <source>J. Am. Stat. Assoc.</source> <volume>54</volume>, <fpage>30</fpage>&#x02013;<lpage>34</lpage>. <pub-id pub-id-type="doi">10.2307/2282137</pub-id></citation>
</ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szollosi</surname> <given-names>A.</given-names></name> <name><surname>Donkin</surname> <given-names>C.</given-names></name></person-group> (<year>2021</year>). <article-title>Arrested theory development: the misguided distinction between exploratory and confirmatory research</article-title>. <source>Perspect. Psychol. Sci.</source> <volume>16</volume>, <fpage>717</fpage>&#x02013;<lpage>724</lpage>. <pub-id pub-id-type="doi">10.1177/1745691620966796</pub-id><pub-id pub-id-type="pmid">33593151</pub-id></citation></ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Terza</surname> <given-names>J. V.</given-names></name></person-group> (<year>1998</year>). <article-title>Estimating count data models with endogenous switching: sample selection and endogenous treatment effects</article-title>. <source>J. Econom.</source> <volume>84</volume>, <fpage>129</fpage>&#x02013;<lpage>154</lpage>. <pub-id pub-id-type="doi">10.1016/S0304-4076(97)00082-1</pub-id></citation>
</ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wooldridge</surname> <given-names>J. M.</given-names></name></person-group> (<year>2002</year>). <article-title>Inverse probability weighted M-estimators for sample selection, attrition, and stratification</article-title>. <source>Port. Econ. J.</source> <volume>1</volume>, <fpage>117</fpage>&#x02013;<lpage>139</lpage>. <pub-id pub-id-type="doi">10.1007/s10258-002-0008-x</pub-id></citation></ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wooldridge</surname> <given-names>J. M.</given-names></name></person-group> (<year>2007</year>). <article-title>Inverse probability weighted estimation for general missing data problems</article-title>. <source>J. Econom.</source> <volume>141</volume>, <fpage>1281</fpage>&#x02013;<lpage>1301</lpage>. <pub-id pub-id-type="doi">10.1016/j.jeconom.2007.02.002</pub-id></citation>
</ref>
<ref id="B65">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wooldridge</surname> <given-names>J. M.</given-names></name></person-group> (<year>2010</year>). <source>Econometric Analysis of Cross Section and Panel Data, 2nd Edn</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name>.</citation>
</ref>
</ref-list>
</back>
</article>