Challenges to Ego-Depletion Research Go beyond the Replication Crisis: A Need for Tackling the Conceptual Crisis

One important line of self-control research concerns the phenomenon known as ego-depletion, the negative effect of performing a self-control task (Task 1) on performance on a subsequent self-control task (Task 2). Although a 2010 meta-analysis reported a moderate effect size (d = 0.62) for this phenomenon (Hagger et al., 2010), its replicability has since come under scrutiny with the publication of some replication failures (Xu et al., 2014; Lurquin et al., 2016), including a high-profile study involving 23 laboratories (Hagger et al., 2016). Some researchers even suggest that the ego-depletion effect might not be real and that the reported results primarily reflect publication bias (Carter and McCullough, 2014). This replication crisis has prompted a call for additional replication attempts involving large sample sizes and preregistration (Carter et al., 2015). Although such replication efforts are undoubtedly important, we submit that, unless some fundamental conceptual (and related methodological) issues are more satisfactorily addressed, attempts to evaluate the ego-depletion effect are unlikely to succeed. In this article, we outline what we call the conceptual crisis for the ego-depletion literature, explain how these limitations undermine replication attempts, and suggest possible ways to alleviate these problems. We do so by noting some parallel problems that have faced cognitive psychologists studying attention, working memory (WM), and executive functions (EFs), in the hope that such insights might contribute to theoretical and empirical development in ego-depletion research.


THE CONCEPTUAL CRISIS SURROUNDING THE EGO-DEPLETION EFFECT
We propose that compellingly resolving the controversy surrounding the ego-depletion effect requires concerted efforts to address three interrelated conceptual problems, which jointly make it difficult to derive unequivocal and testable predictions for any ego-depletion study. Below, we illustrate these problems by referring to the strength model of self-control because this influential model has provided the basis for most of the existing ego-depletion research. We emphasize, however, that these problems are general enough to be also applicable to other models (e.g., Inzlicht and Schmeichel, 2012) and, hence, that field-wide efforts are needed to satisfactorily address them.

LACK OF CLEAR OPERATIONAL DEFINITIONS OF SELF-CONTROL
Problem
The field lacks clearly articulated and generally agreed-upon operational definitions of self-control that can guide ego-depletion research. Although some studies refer to inhibitory control of some sort as their operational definition (e.g., Muraven et al., 2006; Tice et al., 2007), the term "inhibition" is typically used in a rather generic sense, without being specific as to different types of inhibitory processes postulated in the EF literature (Nigg, 2000; Friedman and Miyake, 2004). More problematic, some studies define self-control too broadly as the ability to control thoughts, emotions, and behavior (Segerstrom and Nes, 2007) or any monitoring and modification of behavior (Vohs et al., 2005). The justifications used for selecting self-control tasks are equally unsatisfactory: Many studies use circular logic to justify task selection by noting that the task was used before and had a depleting effect. Even when some independent justifications are provided, the attributes used to justify a task vary greatly, including being intellectually demanding (Fennis et al., 2009), requiring effort (Boucher and Kofos, 2012), and simply being difficult (Webb and Sheeran, 2003). Consequently, wide-ranging tasks like taking standardized tests (e.g., Converse and Deshon, 2009) or even balancing on one leg (Tyler and Burns, 2008) count as self-control tasks.
Given this confusing state, it is hardly surprising that the same task (e.g., 3-digit by 3-digit multiplication) has been used as both the self-control (depletion) task (Stillman et al., 2009) and the control (nondepletion) task (Burkley, 2008). If one cannot unambiguously determine whether or not a particular task implicates self-control, it is impossible to determine whether one should expect a significant ego-depletion effect.

Ways Forward
Each researcher should explicitly articulate the operational definition of self-control used in their study and justify task selection with regard to that operational definition. To facilitate progress, however, more needs to be done by the field as a whole. Parallel conceptual problems that have faced research on EF, another elusive and multifaceted concept, may be relevant here. Although EF research is still far from achieving a field-wide consensus (Baggetta and Alexander, 2016), attempts to systematically classify and operationally define different facets of EFs (e.g., updating, shifting, and inhibition; Miyake et al., 2000) have contributed to developing some initial consensus, which has helped researchers judge whether a task implicates EF processes. Analogous attempts would be helpful for self-control research, especially if such efforts can help systematically examine which facets of self-control are linked to the ego-depletion phenomenon (e.g., Fujita, 2011; Heller et al., 2017).

LACK OF INDEPENDENT EMPIRICAL VALIDATION FOR SELF-CONTROL TASKS
Problem
Various tasks used in ego-depletion research (such as watching a video while ignoring words appearing onscreen and writing essays without using certain letters) have not been independently validated as effective measures of self-control. Some such tasks have not been used outside ego-depletion research, and some (e.g., the video-viewing task) even lack objective measures of task performance that could be used as indices of self-control.
This lack of independent validation of self-control tasks is problematic, because it makes it difficult to derive an unambiguous prediction for any ego-depletion study. For example, according to the strength model, the ego-depletion effect should be observed only when Tasks 1 and 2 (a) both implicate self-control and (b) draw from the same self-control resources. It is unclear, however, whether various task combinations used in ego-depletion research actually meet these necessary conditions.
Concerning (a), a negative consequence of this problem is illustrated by recent exchanges (Baumeister and Vohs, 2016b) regarding the appropriateness, as a self-control task, of the specific e-crossing task used in the multilab replication study (Hagger et al., 2016). A focal issue was the necessity of an initial habit-forming block to make the e-crossing task sufficiently demanding, but, tellingly, this exchange did not reference any independent (non-ego-depletion) research validating different versions of the e-crossing task as effective (or not-so-effective) indices of self-control. Without such independent evidence, any replication failures would be open to alternative explanations based on task-selection problems.
Concerning (b), we do not know of any independent evidence for this crucial domain-generality assumption. Although there has been rigorous theoretical debate about, and empirical investigation into, the domain generality/specificity of attention (e.g., Wickens, 1984) and WM (e.g., Kane et al., 2004), little consideration has been given to this important issue in ego-depletion research, despite some prior evidence for domain/process-specific ego-depletion effects (Persson et al., 2007; Healey et al., 2011). Moreover, this domain-generality assumption is built on circular logic: Domain-general self-control resources must be present because the ego-depletion effect is observed. This criticism is reminiscent of those raised against resource theories in cognitive psychology, most notably Kahneman's (1973) seminal capacity theory of attention, which, like the strength model, postulated a single pool of general-purpose attentional resources fueling various mental activities.
We find it justifiable to initially develop laboratory self-control tasks on the basis of the experimenter's intuition (Baumeister, 2016). We expect, however, that subsequent research would validate their appropriateness as self-control indicators and offer independent evidence that these tasks indeed draw on the same pool of domain-general self-control resources. Without knowing whether a particular task combination used in a study meets these conditions, it is impossible to predict whether one should expect a significant ego-depletion effect in that study.

Ways Forward
One way to alleviate these problems is to conduct carefully designed correlational research (e.g., latent-variable analysis) and/or experimental studies using the simultaneous dual-task interference paradigm to establish that various commonly used tasks in ego-depletion research share some underlying commonality, namely self-control resources. Tests of ego-depletion would be more effective when the specific combination of tasks used has already been shown to demonstrate a clear overlap between them. In this regard, relying more on cognitive (attention, WM, and EF) tasks for which such evidence of overlap already exists might be helpful.
It is also important to provide more objective measures of task performance to quantify the self-control demands associated with Task 1 performance. One such possibility is to use pupillometry (Beatty, 1982) as an index of the degree of effort or attentional demands associated with the task performance (e.g., Hopstaken et al., 2015; Rondeel et al., 2015).

LACK OF WELL-SPECIFIED MODELS THAT MAKE UNAMBIGUOUS, FALSIFIABLE PREDICTIONS
Problem
The existing models purported to explain the ego-depletion effect are currently too underspecified to allow other researchers to unambiguously derive testable (falsifiable) predictions. For example, the strength model does not specify how the self-control resources are consumed by Tasks 1 and 2 and when the available remaining resources are low enough to start impairing subsequent performance on Task 2. Such key resource-consumption parameters must be more formally specified before one can determine whether an experiment should produce the ego-depletion effect.
This theoretical issue has been neglected in ego-depletion research, despite some relevant historical precedent. In an influential critique, Navon (1984) articulated various problems with resource theories (e.g., Kahneman, 1973), including the aforementioned circularity problem and the ambiguity surrounding the hypothesized resource-performance functions. This critique led some theorists to abandon the resource concept altogether (Neuman, 1987) and others to attempt to better specify the nature of resources and their consumption functions in the form of computational models (e.g., Just and Carpenter, 1992; Lovett et al., 1999). Models of ego-depletion phenomena are in need of such formalization.
Such theoretical development is urgently needed following the recent updates made to the strength model (Baumeister and Vohs, 2016a) that, in our view, make the model flexible enough to fit any data and, hence, unfalsifiable. In particular, this revised model incorporates the notion of the "central governor" (adopted from Evans et al., 2016), whose role is to determine whether to expend or conserve the available self-control resources. This addition seems to us a step backwards, considering that WM theories, which have long featured the "central executive" (Baddeley and Hitch, 1974; Baddeley, 1996), have been trying to replace this vague, homunculus-like construct with something more precise. Without better specifying how this central governor determines whether and when to consume or conserve self-control resources, one cannot unambiguously determine whether one should observe a significant ego-depletion effect (for a more detailed critique of the central governor model, see Inzlicht and Marcora, 2016).

Ways Forward
We do not know of any formal attempts to mechanically specify how self-control resources are consumed when two tasks are performed consecutively in the sequential-task paradigm. It seems necessary not only to better specify the underlying resource-consumption functions (preferably via mathematical or computational modeling) but also to be more explicit about critical moderating variables (e.g., when to conserve or consume resources). To gain insights into the underlying resource-performance functions, it might also be helpful to systematically (parametrically) manipulate task durations or attentional demands for Task 1 (Lee et al., 2016), which unfortunately has rarely been done in ego-depletion research.
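To make concrete what such a formal specification might involve, the following is a deliberately simplistic sketch, not a proposed model. It assumes linear resource consumption during Task 1 and a threshold below which Task 2 performance declines; the function names, the functional forms, and all parameter values are hypothetical and are not taken from the strength model or from any published formalization.

```python
# Toy illustration only: every function and parameter below is hypothetical
# and is not part of the strength model or any published formalization.

def remaining_resources(task1_minutes, rate_per_minute, initial=1.0):
    """Resources left after Task 1, assuming linear consumption floored at 0."""
    return max(0.0, initial - rate_per_minute * task1_minutes)


def task2_performance(resources, threshold):
    """Task 2 performance in [0, 1].

    Assumed rule: performance is unimpaired (1.0) while resources stay at or
    above `threshold` (which must be > 0), and falls off linearly below it.
    """
    if resources >= threshold:
        return 1.0
    return resources / threshold


def depletion_effect(task1_minutes, rate_per_minute, threshold):
    """Control-minus-depletion difference in Task 2 performance.

    The control condition is modeled as a Task 1 that consumes no resources.
    """
    control = task2_performance(
        remaining_resources(0, rate_per_minute), threshold
    )
    depleted = task2_performance(
        remaining_resources(task1_minutes, rate_per_minute), threshold
    )
    return control - depleted


if __name__ == "__main__":
    # Same design (a 5-minute Task 1), two equally arbitrary consumption rates.
    for rate, threshold in [(0.05, 0.5), (0.12, 0.5)]:
        effect = depletion_effect(5, rate, threshold)
        print(f"rate={rate}, threshold={threshold}: effect={effect:.2f}")

    # Parametric manipulation of Task 1 duration under one parameter setting.
    for minutes in (0, 2, 4, 6, 8):
        effect = depletion_effect(minutes, 0.1, 0.5)
        print(f"Task 1 duration={minutes} min: effect={effect:.2f}")
```

Under these made-up values, the identical 5-minute design predicts no effect with the lower consumption rate and an effect of 0.20 with the higher one, and the duration sweep predicts no effect until Task 1 is long enough to push resources below the threshold. The sketch is meant only to show that, once consumption rates and thresholds are stated explicitly, a given design yields a definite prediction that data can contradict.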

CONCLUSION
The recent replication efforts have succeeded in promoting preregistration, open data, and large sample sizes, all of which improve the reproducibility of scientific work. To resolve the issue of whether ego-depletion is a real phenomenon, however, it is also crucial to address the severe conceptual problems that impede the derivation and testing of specific, falsifiable predictions. Although tackling these issues is not easy, we believe that effectively addressing them is a necessary step to resolve the current controversy surrounding the ego-depletion effect in a manner that satisfies its proponents and skeptics alike.

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

FUNDING
Publication of this article was funded by the University of Colorado Boulder Libraries Open Access Fund.