Edited by: Holmes Finch, Ball State University, USA

Reviewed by: Mike W.-L. Cheung, National University of Singapore, Singapore; Stanley E. Lazic, Novartis Institutes for Biomedical Research, Switzerland

*Correspondence: David A. Magezi, Neurology Unit, Laboratory for Cognitive and Neurological Sciences, Department of Medicine, Faculty of Science, University of Fribourg, Chemin du Musée 5, 1700 Fribourg, Switzerland e-mail:

This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Linear mixed-effects models (LMMs) are increasingly being used for data analysis in cognitive neuroscience and experimental psychology, where within-participant designs are common. The current article provides an introductory review of the use of LMMs for within-participant data analysis and describes a free, simple, graphical user interface (LMMgui). LMMgui uses the package

Linear mixed-effects models (LMMs) provide a versatile approach to data analysis and have been shown to be very useful in several branches of neuroscience (Gueorguieva and Krystal,

Let us consider a hypothetical experiment where a researcher is interested in how quickly human listeners can detect a telephone ringing in the presence of concurrent speech. The response variable collected is the average reaction time (RT), and at first, only one explanatory variable is available: language. Measurements of RT are available for concurrent speech in French, German, and English, and thus language can be described as a categorical factor with three levels. RTs may have been measured from three different groups of monolingual listeners. Importantly, each measurement would be from a different listener. Such data is grouped by listener and by language, and since each listener can only belong to one language group, the grouping factors of listener and language are said to be nested. In this case, language can also be described as a “between-participants” factor, and the data may be analyzed with a standard analysis of variance (ANOVA). This method assumes that the response variable comes from a normally distributed population and shows homogeneity of variance.

Now it may be that the measurements were obtained in a very different manner. If the measurements came from a single group of multilingual listeners who all performed the task in each language, then language would be described as a “within-participants” factor. These measurements cannot be considered as independent because three measurements (“repeated measures”) were collected per listener. This phenomenon, which is known as pseudoreplication, is common in neuroscience experiments and leads to the use of repeated-measures (rm) ANOVAs. rmANOVAs require two additional assumptions (for example, see Maxwell and Delaney,

In stark contrast to rmANOVAs, LMMs do not depend on limited assumptions about the variance-covariance matrix and can accommodate missing data. Furthermore, LMMs provide the ability to include various configurations of grouping hierarchies: multiple, nested groups such as street, town, country, and continent; partially crossed groups, such as student and teacher in a large school where not all students interact with all teachers; and fully crossed groups. This flexibility explains social scientists' increasing use of LMMs, also known as “multilevel” or hierarchical linear models. However, it is important to realize that LMMs are by no means restricted to complex grouping designs, and can also be used for experimental psychology studies with a single grouping factor of participant or subject. Importantly for the experimental psychologist, LMMs also allow one to explicitly model the effect of stimulus tokens. For example, in our hypothetical experiment the concurrent speech may have been provided by different multilingual speakers. If each speaker was presented to each listener under all experimental conditions, speaker can be considered a fully crossed, within-participant random factor. A further advantage is that, in some situations, LMM results provide better interpretability in terms of physiological phenomena and a superior fit to the data (Kristensen and Hansen,

Like many statistical models, an LMM describes the relationship between a response variable and other explanatory variables that have been obtained along with the response. In an LMM, at least one of the explanatory variables must be a categorical grouping variable that represents an experimental “unit.” In the above example, that would be an individual listener.

When using LMMs, it is important to classify explanatory variables either as “fixed factors” or “random factors.” Fixed factors are those where all levels of interest are actually included in the experiment. For example, in studies interested in the difference between males and females, the factor of gender with two levels would be a fixed factor. In contrast, random factors, also commonly referred to as “grouping variables,” include only a sample of all possible levels. Although researchers are often interested in studying a large population, such as adult humans, psychology experiments typically only include a very small subset of that population, so that participant is a random factor. Classification of a factor is not always a trivial task. For example, consider the factor language in our hypothetical experiment. Do the researchers have theoretical or practical reasons to be only interested in the differences between French, German, and English specifically, or would they like to generalize their findings to all languages? In the former case, language would be a fixed factor and in the latter, a random factor. In fact, to generalize to other stimuli within a language, one should also treat the individual stimulus tokens, in our example the speaker, as a random factor (Baayen et al.,

LMMs comprise two types of terms: “fixed-effects” and “random-effects,” hence the label “mixed-effects.” The fixed-effects terms comprise exclusively fixed factors, and the fixed-effects part of an LMM can vary in complexity depending on which terms are included. The “full” LMM includes the highest-order interaction between the fixed factors, as well as lower-order interaction terms and main effects, whereas other LMMs would include only some of these terms. Note that for data analysis, it is also important to distinguish between categorical fixed factors such as language or gender, which are sampled from a population of discrete levels, and continuous fixed covariates (numeric variables). An example of the latter is the sound level of the telephone in our hypothetical experiment: RTs were measured with the telephone ringing at different sound levels (60, 70, and 80 decibels sound pressure level, dB SPL), while the sound level of the concurrent speech was fixed.
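To make the notion of a “full” fixed-effects part concrete, the following minimal Python sketch enumerates every main effect and interaction for a given set of fixed factors. The function name `full_model_terms` is hypothetical and introduced only for illustration; interactions are written with a colon between factor names, and the third factor (“Gender”) is added purely to show how the number of terms grows.

```python
from itertools import combinations


def full_model_terms(factors):
    """List all main effects and interactions for the given fixed factors.

    Terms are ordered from main effects (order 1) up to the single
    highest-order interaction, which involves every factor.
    """
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            # An interaction is written by joining factor names with ":".
            terms.append(":".join(combo))
    return terms


# Two fixed factors from the hypothetical telephone-ringing experiment.
two_factor_terms = full_model_terms(["Language", "Level"])

# Illustrative assumption: a third fixed factor is added.
three_factor_terms = full_model_terms(["Language", "Level", "Gender"])

print(two_factor_terms)
print(len(three_factor_terms))
```

With k fixed factors the full model contains 2^k − 1 fixed-effects terms, which is one reason why the full model becomes harder to estimate and interpret as factors are added.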

The random-effects terms of LMMs are all the terms that include random factors; interactions between fixed and random factors are considered in the random-effects terms. For example, in the hypothetical telephone-ringing experiment, the random factor listener and its interaction with the fixed covariate sound level can be modeled using a number of different random-effects terms, which differ in their complexity (number of parameters). The simplest random-effects term, known as “random intercept only,” ignores the interaction: it only considers how RT at zero sound level (0 dB SPL) varies between listeners. This is analogous to the assumption of compound symmetry. However, RT may vary as a function of sound level; for example, RT could decrease with increasing sound level. The slope of this function may vary between participants, and to account for this interaction between participant and sound level, we would also need to include a “random slope” term. In the full LMM, the random-effects part would also include parameters that allow for the intercept and slope to be correlated: for example, if as shown in Figure

One approach to using LMMs is to systematically compare the full LMM to other models which are the same except for one term missing. The comparison is done using a likelihood-ratio test (LRT), and the test statistic χ^{2}, degrees of freedom and ^{2} test can be conservative (for further discussion, see Pinheiro and Bates,

Although LMMs are useful for both confirmatory hypothesis tests and exploratory analyses, it is important to distinguish between these two when reporting results. The former are tests based on hypotheses, which were posited before data collection, and motivated the study design (Tukey,

LMMs are available in commercial programs such as SPSS (“mixed”), SAS (“proc mixed”), S-PLUS, MLwiN, or ASReml. LMMgui is a free graphical user interface that uses ^{2}, degrees of freedom and

For the hypothetical data shown in Figures

Where “RT” is the response variable and the model terms are to the right of the tilde character (“~”). The first terms are fixed-effects: “Language” and “Level.” An interaction term would include a colon (“:”). The random-effects terms are those which include a bar symbol (“|”). To the right of the bar is the random factor “Listener.” The expression to the left of the bar indicates that this random term includes correlated intercepts and slopes for the fixed factors. “(Language + Level|Listener)” implicitly includes the random intercept and is equivalent to “(1 + Language + Level|Listener).” In contrast, a random-intercept only term would be “(1|Listener),” and the term for uncorrelated random intercept and slope would be “(Language + Level || Listener).” Further examples and alternative syntax for model terms are given by Bates et al. (

In order to evaluate the main effect of level, the above model can be compared to a model without the term of interest, that is:

Note that during evaluation of fixed-effects, it is recommended that the random-effects part of the models always includes slopes for all fixed factors because this has been shown to be important for confirmatory hypothesis testing in experimental psychology (Barr,

As with most statistical analyses, an important computational step is estimating the parameters of the LMM. Although the details of this are beyond the scope of this mini-review, the reader should be aware of standard maximum likelihood (ML) and restricted ML (REML) criteria. Although the default REML may provide a better estimate of random-effects standard deviation, it does so by averaging over some of the uncertainty in the fixed-effects parameters. For this reason, the ML criterion is used when comparing LMMs with different fixed-effects structures.
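The arithmetic behind comparing two ML-fitted models with an LRT can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used by any particular program: the helper names `chi2_sf` and `likelihood_ratio_test` are hypothetical, the two log-likelihoods and parameter counts are made-up values standing in for the output of real ML fits, and the reference distribution is taken to be χ² with degrees of freedom equal to the difference in the number of parameters.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2.0))
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    normal_pdf = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * normal_pdf * total


def likelihood_ratio_test(loglik_full, loglik_reduced, k_full, k_reduced):
    """Return (chi-square statistic, degrees of freedom, p-value)."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    df = k_full - k_reduced
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical ML log-likelihoods: the reduced model omits one
# fixed-effects parameter that the full model includes.
statistic, df, p_value = likelihood_ratio_test(
    loglik_full=-512.3, loglik_reduced=-516.9, k_full=10, k_reduced=9
)
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.4f}")
```

Both log-likelihoods must come from fits using the ML criterion, and the reduced model must be nested in the full model; otherwise the difference in log-likelihoods does not have the interpretation assumed here.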

A significant LRT would indicate that the missing fixed-effects term (interaction or main effect) is important. For example, the hypothetical data (Figures

Although at present LMMgui is only available for continuous response variables from a normally distributed population, mixed-effects models can also be used for categorical response variables (Dixon,

In order to promote simplicity of use, LMMgui is not as comprehensive as the command-line options. A program that provides such a simple interface is likely to attract some criticism; Barr et al. (

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author would like to thank Laurent Donzé, Marina Laganaro, Kyle Nakamoto, and Olivier Renaud for their helpful comments on an earlier draft of this manuscript. This work was supported by the University of Fribourg, Switzerland.

Data need to be prepared in long format, with the first row being the variable names and each subsequent row representing a separate measurement. An example data file (“example.csv”) is available with LMMgui. Variable names should begin with a letter and comprise standard alphanumeric characters (a–z and 0–9), with no spaces or special characters. Implicitly coded nested factors need to be explicitly recoded. For example, consider that in the hypothetical experiment, there was an additional random factor of “Town” because the participants were sampled from different towns. If the listeners from town A are labeled L1, L2, etc., but a different set of listeners from town B are also labeled L1, L2, etc., then the factor listener is implicitly nested in town, and would need to be explicitly recoded, for example as AL1, AL2, …, BL1, BL2, etc. Next, the data should be saved in a text file using the comma (,) or semi-colon (;) as delimiter. In many spreadsheet programs this is achieved by saving in the “.csv” format. The file name should also begin with a letter and comprise standard alphanumeric characters.
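The preparation steps above can be sketched in Python using only the standard library. The helpers `check_names` and `recode_nested` are hypothetical names introduced for illustration, the four data rows are invented, and the name check assumes that upper-case letters are acceptable (as in “RT” and “Listener”).

```python
import csv
import io
import re

# A name must start with a letter and contain only letters and digits.
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9]*$")


def check_names(names):
    """Return the variable names that break the naming rule."""
    return [name for name in names if not NAME_PATTERN.match(name)]


def recode_nested(rows, outer, inner):
    """Prefix each inner label with its outer label (town A, L1 -> AL1)."""
    recoded = []
    for row in rows:
        new_row = dict(row)
        new_row[inner] = f"{row[outer]}{row[inner]}"
        recoded.append(new_row)
    return recoded


# Hypothetical long-format data: one row per measurement. Listener "L1"
# appears in both towns but refers to two different people.
raw = """Town,Listener,Language,Level,RT
A,L1,French,60,512
A,L1,German,60,498
B,L1,French,60,530
B,L2,English,70,471
"""

rows = list(csv.DictReader(io.StringIO(raw)))
bad_names = check_names(rows[0].keys())
recoded = recode_nested(rows, outer="Town", inner="Listener")

# Write the recoded data as comma-delimited text; io.StringIO stands in
# for a file opened with open("mydata.csv", "w", newline="").
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
writer.writeheader()
writer.writerows(recoded)
csv_text = out.getvalue()
print(csv_text)
```

After recoding, each listener label identifies exactly one person, so the grouping factor no longer depends on the town column to be unambiguous.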

Users need to have already installed R, which is available at (

Once analysis is complete, a number of text files will be written to the directory of the prepared data. These files allow the user to inspect all the stages of the analysis, including intended analysis steps in R (“.R” files), the actual steps carried out (“.Rout” file), diagnostic plots (“.pdf”) and details about any warnings, if present (“.Warning.txt”).