Edited by: Esperanza Navarro-Pardo, University of Valencia, Spain
Reviewed by: Poppy Watson, University of New South Wales, Australia; Corinna Ross, Texas Biomedical Research Institute, United States
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Everyday decision-making is supported by a dual system of control composed of parallel goal-directed and habitual systems. Over the past decade, the two-stage Markov decision task has become popular for its ability to dissociate goal-directed from habitual decision-making. While a handful of studies have implemented decision-making tasks online, only one study has validated the task by comparing in-person and web-based performance on the two-stage task in children and young adults. To date, no study has validated the dissociation of goal-directed and habitual behaviors in older adults online. Here, we implemented and validated a web-based version of the two-stage Markov task using parameter simulation and recovery, and compared behavioral results from online and in-person participation on the two-stage task in both young and healthy older adults. We found no differences in estimated free parameters between online and in-person participation on the two-stage task. Further, we replicated previous findings that young adults are more goal-directed than older adults, both in person and online. Overall, this work demonstrates that implementing and using the two-stage Markov decision task for remote participation is feasible in the older adult demographic, which would allow decision-making to be studied with larger and more diverse samples.
Since its conception, the two-stage Markov decision task has been widely used across many studies to investigate decision-making behavior. One reason for its widespread popularity is that the two-stage task allows the dissociation of model-based from model-free reinforcement learning (RL) algorithms, which traditionally have been considered the computational proxy of goal-directed and habitual decision making, respectively (
Previous research has shown that we typically use both strategies in parallel, but there is some individual variability in the propensity toward one decision making strategy over another. Further, the balance between habitual and goal-directed strategies has been shown to shift with various factors: with age, greater stress, and compulsivity, the balance shifts toward more habitual decision-making strategies, whereas greater working memory capacity has been associated with more goal-directed strategy and in fact protects goal-directed strategies from the effects of stress (
While the original two-stage Markov tasks were conducted in-person (
Given that remote study participation requires some measure of technological proficiency on the part of the participant, examining whether decision-making data collected online is comparable to in-person data in the older adult population is especially important as older adults tend to use the internet less than younger age groups and require more time to learn computer programs (
A total of 42 healthy young adults (YA) and 41 healthy older adults (OA) participated in the study. Of these, 12 YA and 11 OA participated in the study in-person prior to the onset of the COVID-19 pandemic, and 30 YA and 30 OA participated in the study online. Participants were recruited through convenience sampling and through social media. All participants were recruited from the United States only, and had to score ≥ 19 on the telephone-Montreal Cognitive Assessment (t-MoCA). Young adult participants had to be between 18 and 49 years of age; older adult participants had to be between 50 and 80 years of age. Individuals were excluded if they were left-handed or ambidextrous, did not have normal or corrected-to-normal visual acuity, did not speak English proficiently, or if they had any history of neurological conditions. Informed consent was obtained from all participants in accordance with procedures approved by the University of Southern California Institutional Review Board (IRB), and participants received monetary compensation, either in cash (in-person participants) or in the form of electronic gift cards (online participants).
The task was implemented in MATLAB R2019b (MathWorks Inc., MA, United States) for both the in-person and online participants. Due to the difficult nature of the traditional two-stage Markov task, we used a modified version, in which we included only one decision per trial, rather than two, for simplicity (similar to that presented in
There were 201 trials in this task, divided into three blocks of 67 trials each. We implemented a 1-min break between blocks, during which participants were reminded that the goal of the task was to earn as much money as possible, up to $10.
Each trial consisted of two successive stages followed by a reward outcome state (
Two-stage Markov Decision Task. Participants were given a choice between one of two start states on each trial, a forest and a desert. One location more commonly (70%) led to one of the second-stage states (the blue and purple cartoons), and rarely (30%) led to the other. Each second-stage state was associated with slowly changing reward probabilities.
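The task structure in this caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' MATLAB implementation. The 70% common-transition probability and the 201 trials come from the text. The Gaussian random walk on reward probabilities, its step size and bounds, the starting probabilities, and all function names are assumptions made for illustration.

```python
import random

COMMON_PROB = 0.7  # from the task description: common transition probability


def transition(first_stage_choice, rng):
    """Return (second_stage_state, is_common) for a first-stage choice of 0 or 1."""
    is_common = rng.random() < COMMON_PROB
    state = first_stage_choice if is_common else 1 - first_stage_choice
    return state, is_common


def drift(probs, rng, sd=0.025, lo=0.25, hi=0.75):
    """Gaussian random walk on reward probabilities, clipped to [lo, hi].

    The step size and bounds are assumed values, not taken from the article.
    """
    return [min(hi, max(lo, p + rng.gauss(0.0, sd))) for p in probs]


def run_trials(n_trials, choose, seed=0):
    """Simulate n_trials; `choose(t, history)` returns the first-stage choice (0 or 1)."""
    rng = random.Random(seed)
    reward_probs = [0.6, 0.4]  # hypothetical starting values
    history = []
    for t in range(n_trials):
        choice = choose(t, history)
        state, is_common = transition(choice, rng)
        rewarded = rng.random() < reward_probs[state]
        history.append((choice, state, is_common, rewarded))
        reward_probs = drift(reward_probs, rng)
    return history


# Example: an agent that simply alternates between the two start states.
history = run_trials(201, lambda t, h: t % 2)
common_rate = sum(1 for _, _, c, _ in history if c) / len(history)
```

With 201 trials, `common_rate` should land close to 0.7, which is a quick sanity check that the transition structure is implemented as described.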
The logic of the task is that a habitual, or model-free, decision-maker who is rewarded would be more likely to stay with the same first-stage choice (i.e., forest or desert) on the next trial, even if that choice had led to the second-stage cartoon with which it was less commonly associated (i.e., 30%). On the other hand, the goal-directed or model-based decision-maker would exhibit a
The online task was administered through the use of Google Chrome Remote Desktop
Participant demographics between in-person and online groups in both young and older adults were compared using two-sample
A standard hybrid reinforcement learning model originally proposed by
We first simulated decisions for pure model-free and model-based agents in MATLAB, as well as decisions by the hybrid reinforcement learning model. For each model, we randomly drew 100 samples from a normal distribution with mean μ and standard deviation 0.2, where μ is the transformed parameter value as reported in
Demographics.
Characteristic | In-person OA ( | Online OA ( | In-person YA ( | Online YA (
Age mean ± SD, range | 62.81 ± 9 (51–76) | 60.70 ± 7 (50–73) | 24.17 ± 3 (19–29) | 31.70 ± 6 (20–47)
Gender | 5 Male, 6 Female | 7 Male, 23 Female | 1 Male, 11 Female | 8 Male, 22 Female
Education | 6 Graduate degree, 3 Bachelor’s degree, 1 Associate degree, 1 Some college/no degree | 13 Graduate degree, 13 Bachelor’s degree, 1 Associate degree, 3 Some college/no degree | 3 Graduate degree, 7 Bachelor’s degree, 2 Some college/no degree | 20 Graduate degree, 8 Bachelor’s degree, 2 Some college/no degree
 | χ2 = 1.30, | χ2 = 7.51,
Marital status | 4 Married or domestic partnership, 3 Divorced, 1 Widowed, 3 Single, never married | 21 Married or domestic partnership, 4 Divorced, 2 Widowed, 2 Single, never married, 1 Separated | 12 Single, never married | 11 Married or domestic partnership, 18 Single, never married, 1 Divorced
 | χ2 = 5.64, | χ2 = 7.70,
Race | 6 White, 4 Asian, 1 from multiple races | 13 White, 14 Asian, 3 Black or African-American | 1 White, 2 Black or African-American, 6 Asian, 2 Hispanic or Latino, 1 N/A | 5 White, 25 Asian
 | χ2 = 4.24, | χ2 = 14.21,
Generated (input) and estimated parameters for simulations.
α (0–1) | β (0–20) | Weight (0–1) | λ (0–1) | Perseveration (−1 to 1)
Input: 0.54 | Input: 5.24 | Input: 0.39 | Input: 0.57 | Input: 0.12
Estimated: 0.55 | Estimated: 5.22 | Estimated: 0.36 | Estimated: 0.54 | Estimated: 0.11
Difference: 0.01 | Difference: 0.02 | Difference: 0.03 | Difference: 0.03 | Difference: 0.01
Pure Model Based (
Input: 0.55 | Input: 5.18 | Input: 1 | Input: 0.49 | Input: 0
Estimated: 0.50 | Estimated: 4.04 | Estimated: 0.83 | Estimated: 0.49 | Estimated: −0.05
Difference: 0.05 | Difference: 1.14 | Difference: 0.17 | Difference: <0.001 | Difference: 0.05
Pure Model Free (
Input: 0.49 | Input: 5.16 | Input: 0 | Input: 0.49 | Input: 0
Estimated: 0.49 | Estimated: 6.24 | Estimated: 0.22 | Estimated: 0.49 | Estimated: 0.3
Difference: <0.01 | Difference: 1.07 | Difference: 0.22 | Difference: <0.01 | Difference: 0.3
Hybrid (
Input: 0.51 | Input: 5.12 | Input: 0.51 | Input: 0.50 | Input: 0.01
Estimated: 0.54 | Estimated: 4.29 | Estimated: 0.45 | Estimated: 0.52 | Estimated: 0.13
Difference: 0.03 | Difference: 0.83 | Difference: 0.06 | Difference: 0.02 | Difference: 0.12
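The sampling step behind these simulations (100 draws from a normal distribution with standard deviation 0.2, centered on a transformed parameter value) can be illustrated as follows. This is a hedged sketch rather than the authors' code. The parameter ranges are taken from the table header. The scaled-logit transform, the use of the first "Input" row as centers, and all function names are assumptions. Boundary values such as a weight of exactly 0 or 1 (the pure agents) cannot be logit-transformed and are not handled here.

```python
import math
import random

# Parameter ranges from the table header.
RANGES = {
    "alpha": (0.0, 1.0),
    "beta": (0.0, 20.0),
    "weight": (0.0, 1.0),
    "lambda": (0.0, 1.0),
    "perseveration": (-1.0, 1.0),
}
# Hypothetical centers, borrowed from the first "Input" row for illustration.
CENTERS = {
    "alpha": 0.54,
    "beta": 5.24,
    "weight": 0.39,
    "lambda": 0.57,
    "perseveration": 0.12,
}


def to_unbounded(x, lo, hi):
    """Scaled logit: map a value in (lo, hi) to the real line (assumed transform)."""
    u = (x - lo) / (hi - lo)
    return math.log(u / (1.0 - u))


def to_bounded(z, lo, hi):
    """Inverse of to_unbounded: map a real number back into (lo, hi)."""
    return lo + (hi - lo) / (1.0 + math.exp(-z))


def draw_parameters(n, sd=0.2, seed=0):
    """Draw n parameter sets, adding Gaussian noise in the transformed space."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        sample = {}
        for name, (lo, hi) in RANGES.items():
            mu = to_unbounded(CENTERS[name], lo, hi)
            sample[name] = to_bounded(rng.gauss(mu, sd), lo, hi)
        draws.append(sample)
    return draws


draws = draw_parameters(100)
mean_alpha = sum(d["alpha"] for d in draws) / len(draws)
```

Sampling in the transformed space guarantees that every draw respects its bounds, so no draws need to be rejected or clipped.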
The simulated choices were then fit to a mixed logistic regression model using the lme4 package in the R programming language, version 4.0.3
For each group, we conducted a mixed logistic regression analysis on the behavioral data, using the model specified in section “Simulation and Validation of Model Implementation,” to examine trial-by-trial adjustments in choice preferences during the task. Specifically, the specification for the regression was choice ~ reward * transition + (1 + reward * transition | subject).
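A descriptive counterpart to this regression is the proportion of "stay" choices in each reward-by-transition cell, which is what stay-switch plots display. The sketch below is a minimal illustration, not the authors' analysis code. It assumes the dependent variable is coded as repeating the previous first-stage choice; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def stay_probabilities(trials):
    """Proportion of stays following each (rewarded, is_common) condition.

    trials: ordered list of (choice, is_common, rewarded) tuples for one participant.
    Returns a dict keyed by (rewarded, is_common) of the previous trial.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [stays, total]
    for prev, curr in zip(trials, trials[1:]):
        key = (prev[2], prev[1])
        counts[key][0] += int(curr[0] == prev[0])
        counts[key][1] += 1
    return {key: stays / total for key, (stays, total) in counts.items()}


# Hypothetical toy sequence of six trials.
toy = [
    (0, True, True),
    (0, False, True),
    (1, True, False),
    (0, True, True),
    (0, True, False),
    (1, False, False),
]
probs = stay_probabilities(toy)
```

In this framing, a model-free pattern shows up as more staying after reward regardless of transition type, whereas a model-based pattern shows up as a reward-by-transition interaction.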
Next, the observed sequences of choices and rewards were used to estimate free parameters of the hybrid model (α, β, λ, weight, and perseveration; refer to section “Simulation and Validation of Model Implementation”) for each individual participant, using Markov chain Monte Carlo sampling for Bayesian modeling, implemented in Stan. The posterior median was used as the parameter estimate for each parameter.
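To make the roles of the five free parameters concrete, the sketch below shows one common formulation of a hybrid model for a task with a single decision per trial. It is an assumption-laden illustration, not the authors' Stan model. In particular, the placement of the perseveration term inside the β-scaled softmax, the eligibility-trace update, and all function names are assumptions.

```python
import math


def choice_probs(q_mb, q_mf, w, beta, persev, prev_choice):
    """Softmax over first-stage actions from weighted model-based/model-free values."""
    q_net = [w * mb + (1.0 - w) * mf for mb, mf in zip(q_mb, q_mf)]
    logits = [
        beta * (q + (persev if a == prev_choice else 0.0))
        for a, q in enumerate(q_net)
    ]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def update(q_mf, v, choice, state, reward, alpha, lam):
    """One-trial update of model-free action values and second-stage state values."""
    delta1 = v[state] - q_mf[choice]  # first-stage prediction error
    delta2 = reward - v[state]        # outcome prediction error
    q_mf, v = list(q_mf), list(v)
    q_mf[choice] += alpha * delta1 + alpha * lam * delta2
    v[state] += alpha * delta2
    return q_mf, v


def model_based_values(v, common=0.7):
    """Expected second-stage value of each first-stage action under the transition model."""
    rare = 1.0 - common
    return [common * v[0] + rare * v[1], common * v[1] + rare * v[0]]


# Worked example: one rewarded trial starting from zero values.
q_mf, v = update([0.0, 0.0], [0.0, 0.0], choice=0, state=0, reward=1.0, alpha=0.5, lam=1.0)
q_mb = model_based_values(v)
random_probs = choice_probs(q_mb, q_mf, w=0.5, beta=0.0, persev=0.0, prev_choice=0)
greedy_probs = choice_probs(q_mb, q_mf, w=0.5, beta=20.0, persev=0.0, prev_choice=0)
```

With β = 0 the two actions are chosen with equal probability, and with a large β the higher-valued action dominates, which matches the interpretation of β as controlling choice stochasticity.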
Our primary goal was to compare performance between the two participation mediums (in-person and online) on the two-stage task in both young and older adults. However, as mentioned above, previous studies have also shown that age shifts the balance between goal-directed and habitual decision-making. Based on this, we hypothesized that the older adult group would have a lower weight parameter, indicating more habitual decision-making, than the young adult group. Thus, we conducted a two-way ANOVA with age group and participation setting as factors, to examine the effects of each and their interaction effect on the weight parameter. We also performed two-way ANOVAs on the other parameters to examine the effects of study participation setting as well as age on each of the parameters.
Because the in-person groups were relatively small, we performed a follow-up analysis of covariance (ANCOVA) on each of the estimated parameters, combining the two age groups into a single group by using age as a continuous variable. We first tested for an interaction effect between age and participation setting, and then re-ran the ANCOVA without the interaction term with the following specification in R: parameter ~ age + factor(participation setting).
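The no-interaction ANCOVA specification is an ordinary linear model with a continuous age term and a dummy-coded setting term. The sketch below estimates its coefficients by least squares in plain Python. It is illustrative only: it returns point estimates without the F-tests an ANCOVA reports, and the function names and data are hypothetical (the data are generated with no setting effect).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ancova(ages, online, values):
    """Least-squares fit of value = b0 + b1 * age + b2 * online, with online coded 0/1."""
    rows = [[1.0, float(a), float(o)] for a, o in zip(ages, online)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    return solve(xtx, xty)


# Hypothetical data: the parameter declines with age and has no setting effect.
ages = [20, 25, 30, 55, 60, 70]
online = [0, 1, 1, 0, 1, 0]
values = [0.8 - 0.005 * a for a in ages]
b0, b1, b2 = fit_ancova(ages, online, values)
```

Here `b1` recovers the age slope and `b2` is the adjusted difference between online and in-person participation, which is the quantity of interest when asking whether setting matters after controlling for age.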
Participant demographics are reported in
Consistent with previous studies (
Stay-switch plots of simulated behavior. Graphs depicting purely model-based (goal-directed), purely model-free (habitual), and hybrid behaviors. Purely model-based behavior is predicted by an interaction between reward and transition, whereas purely model-free behavior is predicted solely by reinforcement history. Hybrid behavior represents a mix of model-based and model-free behavior.
For young adults, both the main effect of reward (in-person β
Stay-switch plots of behavior. Both in-person (
For both older adult groups, the main effect of reward was also significant (
Two-way ANOVAs were conducted to examine the effects of participation medium and age on each of the estimated parameters (
Estimated free parameters by group. We performed 2-way ANOVAs assessing main effects of age group (young adults vs. older adults) and participation medium (in-person vs. online) for each parameter. There were no main effects of participation medium, but we found main effects of age for weight (w) and β. Bolded lines represent significant main effects, brackets represent significant pairwise comparisons (**
We also performed ANCOVAs to examine the effect of participation medium, controlling for age. There were no significant interaction effects between participation medium and age (i.e., there was homogeneity of regression slopes), so the ANCOVA model was rerun excluding the interaction term. Age was a significant predictor of weight [β
Conducting a study online has many advantages, including larger sample sizes and the ability to continue research even during a pandemic that restricts in-person activities. While the two-stage task has been conducted online in previous studies, to our knowledge, none of these have compared in-person to online performance on the two-stage task in older adults. This is critical because it is currently unclear whether high-quality decision-making data can be reliably collected via online task participation in older adults, as previous findings have shown that older adults have more difficulty learning computer programs (
In this study, we validated a web-based version of the two-stage decision task by simulating behavior on the models and successfully recovering the parameters. We also replicated behavioral results between in-person and online participation in young and older adults and found no differences across the estimated free parameters between in-person and online participation within the young adult group and the older adult groups. Most importantly, despite having a small sample of in-person participants, we replicated the primary effect of interest: we found more goal-directed decision making in young adults than older adults across both the in-person group (
Interestingly, we found a significant difference in the parameter β between the young adult and older adult groups that participated in the study online, but we did not find this difference between the in-person groups. In the two-stage task, the β parameter governs the stochasticity, or randomness, of choices, where β = 0 corresponds to completely random choices, and choices become more deterministic as β increases. One possible explanation for the disparate results between the online and in-person groups for the β parameter is the technological demand of setting up the task at home. The older adults may have had a more difficult time with the online task because it required more setup on their part than in-person participation did. Although all the older adults in this group were able to successfully complete the task, they may have already been more tired when they started the task as a result of having to navigate technology to set up remote desktop. However, in neither of the online groups was there a relationship between participants’ self-rating of computer usage at home and β (see
Although the primary goal of our study was to determine whether estimated parameters in the online groups of both young and older adults were comparable to those obtained in person, we also want to highlight some benefits and drawbacks of conducting a web-based study.
The advantages of running an online study are quite obvious: convenience and access to larger sample sizes. In a lab-based study, the pool of potential participants is limited by geographical constraints. An online study is limited insofar as the guidelines set by the IRB and/or funding sources allow. This can result in larger and more diverse, representative samples (
There are also a number of noteworthy drawbacks to running an online study that should be taken into consideration for future studies. First and perhaps most importantly, even though an online study removes the geographical barriers to participating in a study, participation is still constrained to those who have a computer and reliable internet access. In our study, because we used remote desktop to support the two-stage task, which had time constraints on responses, our study pool was even more limited, as it required fairly high-speed internet. This limitation should not be downplayed: it highlights disparities in both access to participation and representation in research. Moving forward, it is important to think more deeply about methods to increase access, such as lending out equipment with limited data plans. Related to this issue of access is the environment in which online participants take part in the study. Whereas a lab environment is generally quiet (and admittedly lacking in ecological validity), some online participants may not be able to find a quiet space in their home to limit distractions. Additionally, even though our online study was moderated by an experimenter on the phone, there was no way to fully ensure that online participants were paying attention throughout the study without an experimenter physically present. Both the diversity of environments among online participants and the lack of physical presence of an experimenter could potentially result in noisier data.
Yet another important consideration when conducting an online study is whether the participants have the computer proficiency to set up and complete the study. As mentioned above, an online experiment requires more setup on the part of the participant compared to an in-person study. We originally planned on instructing participants to download an app-based version of the two-stage task using MATLAB Runtime (MathWorks, MA, United States). However, we switched to using Google Remote Desktop to reduce the onus on the participants to set up the task. This unfortunately came at the cost of requiring high-speed internet for the study (≥60 Mbps) and of not being able to accurately measure participant response latencies due to variability in internet connection speeds (see section “Limitations and Future Directions”).
Finally, we would like to acknowledge a few limitations specific to our study. First, there were significant demographic differences between the in-person and online groups in young adults, and we did not have a diverse sample. This was due to the mixed use of convenience sampling and recruitment through social media as a result of making quick adaptations in this study in response to the COVID-19 pandemic. Furthermore, all of the in-person participants completed the study before the onset of the pandemic, whereas the online participants participated online as a direct result of restrictions due to the pandemic. Related to this, we also had a larger sample of online participants than in-person participants. Despite these differences between the in-person and online groups, however, we did not find differences across the estimated free parameters between in-person and online participation within the young adult group and the older adult groups, demonstrating the feasibility of conducting data collection on the two-stage task online for both groups. Another potential limitation of this study is that online two-stage task performance may be biased toward participants who have greater computer proficiency. As mentioned above, we found no relationship between participants’ self-rating of computer usage at home and response stochasticity (β). However, our measure of computer usage was likely lacking in sensitivity, and the use of a standardized and more sensitive measure, as opposed to a self-rated percentage, may have revealed a bias effect of computer proficiency. Finally, as discussed above, we were unable to accurately measure response latencies on the online version of the two-stage task due to the use of Google Remote Desktop. In the future, the two-stage task could be implemented on a web server to more accurately measure choice response times, which could provide further insight into disparities in reaction times between habitual and goal-directed choices (
Overall, despite some limitations to online studies that require careful consideration, conducting a research study online has many advantages. Here, we found online performance on the two-stage task was comparable to performing the task in the lab for both young and older adults and also replicated previous findings that young adults are generally more goal-directed than older adults. Our results suggest that, despite being a fairly lengthy study requiring focus and attention, online administration of the two-stage task is feasible across both young and older adults.
The original contributions presented in the study are included in the article/
The studies involving human participants were reviewed and approved by IRB Administrator: Andrea Tlaseca, University of Southern California Institutional Review Board. The patients/participants provided their written informed consent to participate in this study.
KLI carried out the experiment, analyzed the data, and wrote the manuscript. LC implemented and tested the computational modeling parts of the study. RR recruited and ran participants for the study. BK implemented an early version of the task and computational modeling scripts. JM provided feedback on the conceptualization of an earlier version of the study and revised the manuscript. NS supervised computational modeling aspects of the study and revised the manuscript. S-LL supervised the project and revised the manuscript. All authors contributed to the article and approved the submitted version.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We would like to thank the study participants for their contribution to this study.
The Supplementary Material for this article can be found online at: