ORIGINAL RESEARCH article

Front. Aging, 08 January 2026

Sec. Interventions in Aging

Volume 6 - 2025 | https://doi.org/10.3389/fragi.2025.1715756

The Timed Up and Go dual-task test’s cognitive and motor outcomes show promising test-retest reliability in older adults with perceived memory impairment

Niklas Löfgren1,2*, Vilmantas Giedraitis1,3, Kjartan Halvorsen1, Erik Rosendahl4, Anna Cristina Åberg1,2,3
  • 1School of Health and Welfare, Dalarna University, Falun, Sweden
  • 2CIRCLE – Complex Intervention Research in Health and Care, Department of Women’s and Children’s Health, Uppsala University, Uppsala, Sweden
  • 3Department of Public Health and Caring Sciences, Geriatrics, Uppsala University, Uppsala, Sweden
  • 4Department of Community Medicine and Rehabilitation, Umeå University, Umeå, Sweden

Background: It is of utmost importance to identify older adults at risk of cognitive impairment at the earliest possible stage. Previous research supports the potential of investigating step parameters and turn duration during Timed Up and Go (TUG) during single and dual-task (TUGdt) conditions to detect subtle impairment. The aim of this study was therefore to investigate the test-retest reliability and measurement error of novel outcomes related to TUG and two TUGdt tests, TUGdt-NA (naming animals) and TUGdt-MB (reciting months in reverse order), in older adults with perceived memory impairment.

Methods: Thirty-four participants (18 women, mean age 76) were included and assessed with TUG, TUGdt-NA and TUGdt-MB on two different occasions, 5–10 days apart. Tests were video recorded for data extraction of spatiotemporal step parameters and turn duration. Reliability of motor and cognitive outcomes was analyzed with intraclass correlations (ICC2.1), standard errors of measurement and minimal detectable change (MDC). The proportional measurement error was presented as MDC%.

Results: The results showed very good reliability (ICC2.1 ≥ 0.85) regarding total completion times, although the measurement error and proportional measurement error (MDC%) were higher during TUGdt conditions than during TUG. The reliability of cognitive outcomes during TUGdt favored TUGdt-MB (ICC2.1 ≥ 0.77, MDC% ≤ 39.8). Step length was the step parameter with the highest reliability (ICC2.1 ≥ 0.86) and lowest proportional measurement error (MDC% ≤ 21.4) across conditions, whereas turn duration showed good reliability during TUG and TUGdt-MB (ICC2.1 ≥ 0.74, MDC% ≤ 38.9).

Conclusion: The results support the potential of including TUG and TUGdt outcomes in cognitive risk evaluations among older adults.

Trial Registration Number: Uppsala-Dalarna Dementia and Gait Project | ClinicalTrials.gov, identifier NCT05893524.

1 Introduction

Older age is the strongest risk factor for dementia disorders, which constitute significant global health problems (WHO, 2025). Indeed, the detrimental effects of dementia are manifold: they impact not only the affected individuals and their families but also strain national healthcare systems (Burks et al., 2021). As there is currently no cure for the disease, early detection of dementia risk is of utmost importance in order to initiate preventive treatment and health promotion. While mild cognitive impairment (MCI) is widely considered a potential pre-dementia syndrome, it has also been found that individuals with subjective cognitive decline (SCD) have an elevated risk of developing MCI and dementia (Mitchell et al., 2014; Mendonca et al., 2016). SCD refers to a self-perceived decline in cognitive capacity that is not associated with underlying disease and can emerge up to 15 years before cognitive decline is identified through objective measures (Molinuevo et al., 2017).

Growing evidence shows that cognitive impairment coexists with motor impairment (Leroy et al., 2023; Montero-Odasso et al., 2012; Mullin et al., 2022; Verghese et al., 2013), and deviant gait has even been found to precede cognitive decline by several years (Skillback et al., 2022). Although previous research has primarily investigated potential links between motor and cognitive impairment in older adults with dementia or MCI (Zhong et al., 2025), emerging findings have also identified impaired motor performance among older adults with SCD in comparison with cognitively healthy counterparts (Knapstad et al., 2019; Åhman et al., 2021). Hence, the dual-task paradigm, which entails the concurrent performance of two tasks with distinct objectives (McIsaac et al., 2015) and challenges executive functions (Yogev-Seligmann et al., 2008), may be relevant in the assessment of SCD. In addition, to further investigate the added challenge of dual tasking in relation to either task alone (i.e., single task), dual-task cost (or dual-task interference) can be approximated by dividing the difference between single- and dual-task performance by single-task performance. Both decreased dual-task performance and increased dual-task cost have previously been shown to differentiate between older adults without cognitive impairment and those with mild cognitive impairment (Ramirez and et, 2021; Yang et al., 2020), whereas no differences in dual-task gait outcomes were found between older adults with SCD and controls in a recent review and meta-analysis (Salzman et al., 2025). However, the majority of the included studies were conducted during straight overground walking, and only one study (Åhman et al., 2021) investigated motor-cognitive dual-tasking with the more complex Timed Up and Go (TUG) mobility task (Podsiadlo, 1991); in that study, the results discriminated between individuals with and without SCD. This result may indicate that more complex motor tasks are needed to detect subtle changes in cognitive ability, even under dual-task conditions. Indeed, TUG performance entails a combination of movement segments: rising from a seated position, walking 3 m, performing a 180-degree turn, walking back, and sitting down again. This combination of movement sequences requires movement adaptations and challenges executive functions, which have been shown to be reduced among older adults and even more so among those with cognitive impairment (Pott et al., 2022; Herman and Hausdorff, 2011).

Within an ongoing longitudinal Dementia and Gait project, two TUG dual-task (TUGdt) conditions have been developed and evaluated with regard to their potential for identifying older adults at risk of cognitive decline and dementia (Cedervall et al., 2020; Åberg et al., 2023). The test procedure was designed in close collaboration with memory clinic specialists to enable its implementation into clinical practice, and the assessment methodology has been acknowledged in a systematic review (Ramirez and et, 2021). Previous findings have been encouraging, particularly with regard to the novel TUGdt outcome words/10 s, which has not only been associated with neurodegeneration (Åhman et al., 2019) but has also been found to discriminate between controls and older adults with SCD, MCI, and dementia (Löfgren et al., 2025), as well as to predict conversion to dementia (Åberg et al., 2023). In addition, the use of an innovative method involving marker-free video recordings during the TUG test conditions enabled extraction of specific step parameters (Åberg et al., 2021), which may contribute to the investigation of subtle outcomes such as step length and turn duration. Accordingly, increasing research interest has been directed toward investigating TUG performance in a more detailed manner, including specific step performance and turn duration (Bottinger et al., 2024; Ortega-Bastidas et al., 2023). Initial results from our group indicate that shorter step length (Löfgren et al., 2025) and longer turn duration (unpublished results) during TUG with a cognitive dual task significantly discriminate between individuals with SCD and controls, whereas others have found indications that the 180-degree turn entails an added capacity for identifying a history of falls (Brauner et al., 2022). However, despite the potential importance of investigating specific step parameters and segments of TUG, few studies have investigated the reliability of these outcomes. In addition, although the reliability of the full TUG has been investigated in various populations, research on the reliability of the cognitive task, and on populations with subtle cognitive impairment, is lacking.

Individuals with perceived memory impairment—a population similar to individuals with SCD—are likely to undergo cognitive assessment. Hence, it is vital to investigate the reliability and measurement error of TUG and TUGdt outcomes in this population. The aim of this study was therefore to investigate the test-retest reliability and measurement error of the performance of the TUG conditions with regard to total completion time, performance of the cognitive tasks, the specific step parameters, and the duration of the 180-degree turn in older adults with perceived memory impairment.

2 Materials and methods

This was an observational cohort study with a test-retest design, conducted as part of the longitudinal UDDGait project (Trial registration number: NCT05893524). The Regional Ethical Review Board in Uppsala and the Swedish Ethical Review Authority approved this study, and informed consent was obtained from all participants prior to study commencement.

2.1 Participants

Older adults with perceived memory impairment were recruited. Initial recruitment was halted midway due to the COVID-19 pandemic; participants were therefore recruited and assessed in two geographical cohorts in central Sweden. During the period 2019–2020, 16 individuals were recruited via a specialist memory clinic in connection with a healthcare visit and, in 2024, 18 individuals were recruited through collaboration with a housing facility for older people. Inclusion criteria were: age ≥50 years, perceived (self-reported) memory impairment, and ability to rise from a chair and walk 3 m back and forth without the use of walking aids. Two physiotherapists (one for each cohort) with extensive experience in instructing and assessing TUG in various populations acted as test administrators and followed a standard protocol for instructing and administering the tests.

2.2 Data collection

All participants were assessed under similar conditions during two test occasions, with an interval of 5–10 days between the tests, since it has been found that a time interval ranging from 2 days to 2 weeks is adequate for the test-retest assessment of health status instruments (Marx et al., 2003). Data collection was carried out in line with the study protocol (Cedervall et al., 2020). At the first visit, and prior to the TUG assessments, a brief interview (following a standardized questionnaire) regarding perceived cognitive and physical status was conducted and demographic data were collected, including memory assessment, screening for depressive symptoms (Almeida and Almeida, 1999), and general physical ability. For the TUG data collection, the participants were instructed to rise from sitting on a standard chair with armrests, walk 3 m, turn 180°, walk back to the chair and sit down again (Podsiadlo, 1991; Almeida and Almeida, 1999). The instructions were given verbally, and the test administrator also demonstrated the test to the participants. Three conditions, recorded with two synchronized high-definition video cameras, were assessed in the following order: (1) TUG as a single-task test, (2) TUGdt-NA (TUG while simultaneously naming animals), and (3) TUGdt-MB (TUG while reciting months in reverse order, starting with December). Prior to each TUG condition, participants were instructed to complete the task at a comfortable speed, and time was registered with a stopwatch. The timing started when the participant's back left the backrest and stopped when their backside touched the chair again. During the TUGdt conditions, the test administrator recorded the number of animals/months correctly recited by the participants, which was subsequently verified when reviewing the videos.

At the beginning of the retest occasion, the participants were again interviewed regarding their cognitive and physical status and any adverse events that may have occurred between the test occasions. Following this, they underwent the TUG assessments under conditions identical to the prior test occasion.

2.3 Data preparation

For TUGdt-NA and TUGdt-MB, the number of correctly recited animals/months per 10 s during the test performance was calculated (recited animals or months / completion time × 10) and documented as TUGdt-NA or TUGdt-MB words/10 s. Registration of correct words recited during TUGdt-NA and TUGdt-MB was performed by reviewing the video recordings and followed the procedures used in establishing norms for such tests. Dual-task cost (DTC) was calculated as: (TUGdt time − TUG time) / TUG time × 100 (McIsaac et al., 2015).
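The two derived outcomes above can be illustrated with a minimal Python sketch. The function names and the example values are hypothetical and chosen only to show the arithmetic; they are not study data.

```python
def words_per_10s(correct_words: int, completion_time_s: float) -> float:
    """Correctly recited words (animals or months) per 10 s of test time."""
    return correct_words / completion_time_s * 10


def dual_task_cost(tug_time_s: float, tugdt_time_s: float) -> float:
    """Dual-task cost in percent: (TUGdt time - TUG time) / TUG time * 100."""
    return (tugdt_time_s - tug_time_s) / tug_time_s * 100


if __name__ == "__main__":
    # Hypothetical participant: 8 correct months in a 20 s TUGdt-MB trial,
    # and a 12 s single-task TUG versus an 18 s dual-task TUG.
    print(round(words_per_10s(8, 20.0), 2))
    print(round(dual_task_cost(12.0, 18.0), 1))
```

Because DTC is a ratio of a difference to the single-task time, small absolute changes in either completion time can shift it considerably.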

Data processing for the step parameters was based on the documentation from two synchronized high-definition video cameras using a semi-automatic method aided by a technique for human 2D pose estimation of events of contact points of the heel with the ground (i.e., heel strike), based on a deep learning procedure, described in more detail elsewhere (Åberg et al., 2021), see Supplementary Material. Based on determined heel strike events, steps were quantified during gait, in two segments of straight walking: 1) during gait toward the 3-m line, starting with the second heel strike and ending with the last heel strike for which no part of the foot had passed the line, and 2) during gait back to the chair, starting with the first heel strike for which the whole foot had passed the 3-m line and ending with the last heel strike that did not appear to be part of preparations for sitting down, as indicated by a foot twist or an atypical short step. For each step parameter, the mean of all analyzed steps for each participant was used in the analysis. The time to turn (between 1 and 2) was measured from the last heel strike of gait before the 3-m line and ending at the first heel strike of gait back to the chair, as described above.
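As a rough illustration of how heel-strike events translate into the temporal outcomes, the sketch below assumes that the heel-strike times of each straight-walking segment are already available as lists of timestamps in seconds. It also assumes that step duration is the interval between consecutive heel strikes, which is a common convention but is not spelled out in the text. All names and values are hypothetical, and the pose-estimation step itself is not reproduced.

```python
def step_durations(heel_strikes_s):
    """Intervals between consecutive heel strikes within one walking segment."""
    return [later - earlier
            for earlier, later in zip(heel_strikes_s, heel_strikes_s[1:])]


def mean_step_duration(outbound_s, return_s):
    """Participant-level mean over all analyzed steps in both segments."""
    durations = step_durations(outbound_s) + step_durations(return_s)
    return sum(durations) / len(durations)


def turn_duration(outbound_s, return_s):
    """Time from the last outbound heel strike before the 3-m line
    to the first heel strike of gait back to the chair."""
    return return_s[0] - outbound_s[-1]


if __name__ == "__main__":
    # Hypothetical heel-strike timestamps (seconds from test start).
    outbound = [1.0, 1.5, 2.0, 2.5]   # segment 1: gait toward the 3-m line
    returning = [4.5, 5.0, 5.5]       # segment 2: gait back to the chair
    print(mean_step_duration(outbound, returning))
    print(turn_duration(outbound, returning))
```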

2.4 Analysis

Statistical analyses were conducted with SPSS (version 29, IBM Inc., Armonk, NY). Descriptive characteristics were presented as means and standard deviations. Reliability for each outcome was investigated with intraclass correlations (ICCagreement, 2-way random effects = ICC 2.1) with 95% confidence intervals. To categorize the level of ICCagreement, Altman's classification was used: <0.20 = poor; 0.21–0.40 = fair; 0.41–0.60 = moderate; 0.61–0.80 = good; 0.81–1.00 = very good (Altman, 1991). Measurement errors were calculated in two steps. First, SEMagreement was calculated as follows: SEM = √(within-subject error variance) (Bland and Altman, 1996). Second, to investigate the measurement error at the individual level, the minimal detectable change (MDC) with 95% confidence interval (CI95) was calculated with the formula: MDC = 1.96 × √2 × SEMagreement (Beckerman et al., 2001). To illustrate the magnitude of the measurement error, the proportional measurement error (MDC%) was calculated by dividing the MDC value by the mean result of both tests and expressing it as a percentage (Flansbjer et al., 2005), whereby MDC% ≤30 has been proposed to be acceptable (Para et al., 2022). Additionally, to analyze systematic differences between the test occasions, Bland-Altman plots were constructed.
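The reliability statistics described above can be sketched in Python with NumPy. This is an illustrative re-implementation under stated assumptions, not the SPSS procedure used in the study: ICC(2,1) follows the standard two-way random-effects, absolute-agreement, single-measurement formula, and the within-subject error variance for two occasions is taken as the sum of squared test-retest differences divided by 2n. Function names and data are hypothetical.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1) from an (n_subjects, k_occasions) array of scores."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between occasions
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def sem_mdc(test, retest):
    """Return (SEM, MDC, MDC%) for paired test and retest scores."""
    t = np.asarray(test, dtype=float)
    r = np.asarray(retest, dtype=float)
    d = t - r
    within_var = np.sum(d ** 2) / (2 * d.size)   # assumed within-subject variance
    sem = np.sqrt(within_var)
    mdc = 1.96 * np.sqrt(2) * sem
    mdc_pct = 100 * mdc / np.mean(np.concatenate([t, r]))
    return sem, mdc, mdc_pct


if __name__ == "__main__":
    # Hypothetical completion times (s) for three participants.
    test = [10.0, 12.0, 14.0]
    retest = [12.0, 14.0, 16.0]
    print(icc_2_1(np.column_stack([test, retest])))
    print(sem_mdc(test, retest))
```

In this toy data set every participant is 2 s slower at retest. The systematic shift lowers the agreement ICC even though the rank order is preserved, which is why an agreement-type ICC is used alongside SEM and MDC.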

3 Results

Thirty-five older adults with perceived cognitive impairment, deriving from two geographical cohorts due to the COVID-19 pandemic, were enrolled in this study. One participant reported sickness between test sessions 1 and 2 and was therefore excluded from the analyses. All remaining participants reported stable conditions between the sessions, resulting in 34 participants included in the analyses of completion times for the TUG and TUGdt conditions. However, due to technical problems with the video recordings, the extraction of step parameters was not possible for three participants during TUG and TUGdt-NA, and for four participants during TUGdt-MB. In addition, one participant made several long stops during all TUG conditions, whereby steady-state gait was not achieved, and was therefore excluded from the analyses of specific step parameters and turn duration.

The participants were 52–91 years old (CI95: 73.1–79.3), 18 were women, 17 (50%) had participated in higher education, and 18 (52%) lived with someone (see Table 1 for participant characteristics).

Table 1. Participant demographics.

Compared with the later cohort, the participants in the initial cohort (recruited 2019–2020) were younger (mean age 73 vs. 79 years), included a lower proportion of women (37.5% vs. 67%), a higher proportion of participants living together with someone (69% vs. 39%), and a smaller proportion of participants with higher education (44% vs. 56%). However, the participants from both cohorts showed similar results regarding the MMSE-SR and physical function (10-m walk speed).

3.1 Test-retest results

For the performance of the different TUG conditions (TUG, TUGdt-NA, TUGdt-MB, and dual-task cost for TUGdt-NA and TUGdt-MB, respectively), test-retest reliability was categorized as very good (ICC 0.85–0.90). However, the proportional measurement error was higher during TUGdt-NA (MDC% = 35.5) and TUGdt-MB (MDC% = 39.1) than during TUG (MDC% = 23.0), and markedly higher for DTC (125.8%–133.7%), as presented in Table 2.

Table 2. Test-retest reliability and measurement error of the different TUG conditions (N = 34).

The results for the performance of the cognitive tasks were varied. While the reliability of both the number of correctly named animals and recited months (in reverse order) were categorized as good, the results were higher for correctly recited months (ICC = 0.77) than for naming animals (ICC = 0.63). An even more distinct pattern was shown for the outcomes animals/10 s and months/10 s, where the reliability of the latter was categorized as very good (ICC = 0.89) and the former as good (ICC = 0.66). The results of measurement error were similar, with markedly higher proportional measurement errors for both the number of correctly named animals (MDC% = 56.1) and animals/10 s (MDC% = 66.9) than for correctly recited months (MDC% = 39.8) and recited months/10 s (MDC% = 34.9).

Regarding step parameters during TUG, the reliability was very good for step duration, SS duration, and step length (ICCagreement = 0.81–0.89), with proportional measurement errors ranging from 10.8% to 15.5% (Table 3). For DS duration and turn duration, the reliability was good (ICCagreement = 0.77 and 0.74, respectively), with similar proportional measurement errors (MDC% = 41.7 and 38.9). Step width was found to be of moderate reliability (ICCagreement = 0.56), with a proportional measurement error of 38.9%.

Table 3. Test-retest reliability and measurement error of specific step parameters and turn duration during the TUG conditions.

For step parameters during TUGdt-NA, step length was found to be of very good reliability (ICCagreement = 0.87), with a proportional measurement error of 21.1%. The step parameters DS duration and step width showed similar (good) reliability (ICCagreement = 0.62 and 0.63) with proportional measurement errors of 66.4% and 43.0%, respectively. Finally, the parameters step duration, SS duration, and turn duration were all of moderate reliability (ICCagreement = 0.45–0.54), with proportional measurement errors ranging between 31.3% and 58.6%.

The investigation of step parameters during TUGdt-MB found the reliability of step length to be very good (ICCagreement = 0.86), with a proportional measurement error of 21.4%. For all other parameters, the reliability was found to be good (ICCagreement = 0.62–0.75), with proportional measurement errors ranging from 35.5% to 70.7%.

Bland-Altman plots (Figures 1A–C) indicate tendencies of systematic differences, where the participants' performance during the retest assessments was generally slightly improved in comparison with the prior assessment. However, we did not identify any clear tendencies of heteroscedasticity.

Figure 1. (A–C) Bland-Altman plots, illustrating the test-retest reliability of (A) the number of correctly recited months per 10 s during TUGdt-MB, (B) step length (centimeters) during TUGdt-MB, and (C) turn duration (seconds) during TUGdt-MB. The difference between the observations is plotted against the mean of the observations. The solid line represents the mean difference between the two observations and the dotted lines represent the upper and lower limits of agreement.
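The quantities shown in a Bland-Altman plot can be computed with a few lines of Python. The sketch below assumes the conventional definition of the limits of agreement (mean difference ± 1.96 standard deviations of the differences) and defines the difference as retest minus test. Neither choice is stated explicitly in the text, and the data are hypothetical.

```python
import numpy as np


def bland_altman(test, retest):
    """Return (bias, lower_limit, upper_limit) for paired observations.

    bias: mean of the differences (retest - test)
    limits of agreement: bias +/- 1.96 * SD of the differences (ddof=1)
    """
    t = np.asarray(test, dtype=float)
    r = np.asarray(retest, dtype=float)
    d = r - t
    bias = d.mean()
    sd = d.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical months/10 s at test and retest for four participants.
    test = [10.0, 12.0, 14.0, 16.0]
    retest = [11.0, 12.0, 15.0, 16.0]
    bias, lower, upper = bland_altman(test, retest)
    print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias above zero with this sign convention corresponds to higher values at retest. Heteroscedasticity would appear as differences that widen or narrow systematically along the axis of means.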

4 Discussion

This is the first study to investigate the test-retest reliability of TUG and TUGdt conditions, including the cognitive task and specific step parameters, in older adults with perceived cognitive impairment. The results show very good reliability regarding completion times during the different TUG conditions, although both the measurement error and proportional measurement error were higher during the TUGdt conditions, particularly for DTC. Among the cognitive outcomes during TUGdt, the highest reliability (good to very good) and lowest measurement error were found for the outcomes related to TUGdt-MB (correctly recited months and correctly recited months/10 s). There were slight tendencies of systematic differences, as the performance at the retest occasions was slightly improved (Figure 1). However, given the small magnitude of these differences, they were not considered to indicate learning bias.

The assessment procedures were designed to be adaptable to clinical practice, and the robustness of the reliability findings for completion times of the different TUG conditions is supported by their alignment with previous results in various populations, using different assessment procedures (Rodrigues et al., 2023; Hofheinz and Schusterschitz, 2010; Smith et al., 2016). To minimize time consumption, an important factor for potential implementation (Wang et al., 2023), our procedure included a single trial for each condition. This procedure differs from those of other studies, which vary in including or excluding practice trials, in using multiple or single trials, and in whether data collection was conducted in motion laboratory settings or in the participants’ homes. In addition, although the measurement errors regarding total completion times were higher than in similar studies on older adults (Donoghue et al., 2019), they were lower than in studies on individuals with cognitive impairment (Braun et al., 2019), which may be considered reasonable given the variability expected also among individuals with lower levels of cognitive impairment (Kwak et al., 2023). Further, while the proportional measurement errors during the investigated TUGdt conditions in our study were slightly higher than recommended levels, there is generally a lack of studies presenting this outcome, making direct comparisons challenging. The results for dual-task cost, for both TUGdt-NA and TUGdt-MB, showed markedly higher reliability (very good) compared with what has been presented in a recent review on the reliability of dual-task cost metrics (Pike et al., 2023), as well as in similar studies investigating TUG dual-task costs in populations with intact and impaired cognition (Åhman et al., 2021; Venema et al., 2019). On the other hand, the measurement error was similar to the findings in these studies. Considering also that the proportional measurement error was well beyond recommended levels (≤30%), measures need to be taken to control for variability in order to optimize measurement robustness for this outcome during TUG. However, few studies have presented MDC% for TUG with and without a dual task, making direct comparisons challenging. Nevertheless, one previous study (Venema et al., 2019) that included 50 individuals found higher MDC results for dual-task cost than we found, both among cognitively impaired individuals (MDC = 129) and in cognitively intact older adults (MDC = 41). In addition, a previous review that investigated the reliability of gait parameters derived from instrumented walkways (Para et al., 2022) found the MDC% for gait speed to exceed 30% (MDC% = 42) in individuals with cognitive impairment. Considering this, it is somewhat unsurprising that the MDC% results were generally higher than optimal in this study (despite the absolute MDCs being at levels similar to those of corresponding studies). However, more fragile groups may be expected to show higher degrees of variability, which reasonably affects MDC%. Therefore, as MDC% is an important outcome for evaluating the magnitude of the measurement error, future studies need to present MDC% to a greater extent, not least to enable comparisons of assessment instruments in populations where a larger degree of variability can be expected.

The two dual-task conditions used in the UDDGait project can be categorized as mental tracking tasks according to a proposed taxonomy of cognitive tasks during motor-cognitive dual-task investigations (Wollesen et al., 2019). Despite this, the results varied, with higher reliability and lower measurement error for the TUGdt-MB outcomes than for TUGdt-NA. In addition, few studies have investigated the reliability, and even fewer the measurement error, of the cognitive task during dual-task assessments, particularly during TUG. However, results presented in a previous systematic review (including straight overground walking and TUG) indicate large variations in reliability between different tasks (Yang et al., 2015), where the tasks with the highest reliability showed results similar to what we found for TUGdt-MB. Bearing in mind that more complex tasks such as TUG may impose a larger cognitive load when conducted as a dual task, the finding of very good reliability (combined with a proportional measurement error close to 30%) for the TUGdt-MB outcome months/10 s may be of particular importance. Indeed, this novel outcome has been found to predict conversion to dementia over a period of up to 6 years among individuals with SCD or MCI, and its potential has previously received particular attention (Ramirez and et, 2021). Therefore, the indication of the robustness of the outcome months/10 s may be argued to support its potential for implementation into clinical practice.

The findings of our study regarding step parameters and turn duration during TUG and TUGdt-MB performance showed good to very good reliability, with even better reliability for TUG as a single task. In relation to one of the few comparable studies (Smith et al., 2016), our results showed similar or higher reliability for most outcomes (whereas TUGdt-NA showed slightly worse results). In addition, our findings of very good reliability and acceptable proportional measurement errors for step length during all conditions align with the findings of a recent review (Para et al., 2022) on the reliability of step parameters assessed via instrumented walkways. This finding may be of particular importance since we have previously found step length to discriminate between groups of people with different cognitive ability (Löfgren et al., 2025). Another important finding concerns turn duration, a parameter that has been suggested to have potential for identifying fall risk (Brauner et al., 2022). Our results showed similar reliability (good) and measurement errors during TUG and TUGdt-MB, whereas the reliability was moderate for TUGdt-NA. Other studies investigating the reliability of the duration of the 180-degree turn during TUG in older adults have generally found mixed results. However, the definition of the start and end of the turn differs between the studies (McGrath et al., 2011; Salarian et al., 2010). Whereas others have registered the turn with gyroscopes and based turn duration on angular velocity, our method was based on perceived clinical applicability. Nevertheless, to enable credible replications of results, future research needs to establish a consensus on how to register turn duration during TUG.

This study was not without limitations. First, although the sample size was similar to that of related studies investigating test-retest reliability in fragile populations (Salarian et al., 2010; van Lummel et al., 2016), it was lower than what has been recommended for reliability studies (Ter et al., 2007). Therefore, the results need to be interpreted with caution, and the MDC% results in particular highlight the importance of recruiting larger samples in future studies that investigate subtler outcomes during challenging conditions. In addition, the assessments entailed performing the TUG and TUGdt conditions once each, without a practice session, as recommended by Podsiadlo (1991), which may have caused larger measurement errors than if more than one trial of each condition had taken place. For example, it has previously been suggested that nine strides are required for the assessment of dual-task gait (Hollman et al., 2011), which is rarely achievable during TUG. However, in the current UDDGait project, the assessment procedure was developed in close collaboration with expert clinicians in order to optimize the potential for implementation, whereby time efficiency and minimized patient burden are crucial factors (Wang et al., 2023). Therefore, certain trade-offs may be necessary. Considering that previous research findings have shown that, particularly during TUGdt-MB, the outcome of correct words/10 s can predict conversion to dementia (Åberg et al., 2023), whereas step length can discriminate between people with different levels of cognitive function (Löfgren et al., 2025), the test-retest reliability of these outcomes supports the potential of implementing this assessment into clinical practice. Nevertheless, more research is needed on the psychometric properties of specific step parameters and segments of TUG under single- and dual-task conditions, particularly in populations with perceived symptoms of cognitive impairment.

5 Conclusion

The results of this study show good to very good reliability for completion times of TUG, TUGdt, and the cognitive tasks, with acceptable measurement errors, particularly during TUG. For novel and potentially important outcomes such as words/10 s, step parameters, and turn duration, the reliability was generally good to very good during TUG and good during TUGdt-MB, whereas the magnitude of measurement errors varied. In particular, step length showed very good reliability with acceptable measurement error during all TUG conditions, and turn duration showed good reliability during TUG and TUGdt-MB. Although more research is needed, these results support the robustness of outcomes that have previously been found to predict dementia and discriminate between different levels of cognitive function among older adults, and may therefore support their potential for implementation into clinical practice.

Data availability statement

The material analyzed during the current study is not publicly available due to its content of sensitive personal data. Datasets generated can be available from the principal investigator Anna Cristina Åberg upon reasonable request, after ethical considerations. Requests to access the datasets should be directed to anna.cristina.aberg@uu.se.

Ethics statement

The studies involving humans were approved by The Regional Ethical Review Board in Uppsala, and the Swedish Ethical Review Authority. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

NL: Data curation, Formal Analysis, Visualization, Writing – original draft, Writing – review and editing. VG: Data curation, Methodology, Writing – review and editing. KH: Data curation, Methodology, Software, Writing – review and editing. ER: Writing – review and editing. AÅ: Conceptualization, Formal Analysis, Funding acquisition, Methodology, Writing – original draft, Writing – review and editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. Open access funding was provided by Dalarna University. This work was supported by grants from the Swedish Research Council (2017-1259 and 2020-01056), the Promobilia Foundation, the Dementia Foundation, Sweden and Konung Gustaf V:s och Drottning Victorias frimurarestiftelse. The funding bodies had no role in the design, methods, data collection, analysis, or preparation of the manuscript.

Acknowledgements

The authors thank the older adults who participated in this research.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fragi.2025.1715756/full#supplementary-material

Footnotes

Abbreviations: MCI, Mild cognitive impairment; SCD, Subjective cognitive decline; TUG, Timed Up and Go; TUGdt, Timed Up and Go as a dual-task; DTC, Dual-task cost; ICC, Intraclass correlation; SEM, Standard error of measurement; MDC, Minimal detectable change; MDC%, Proportional measurement error.

References

Åberg, A. C., Olsson, F., Åhman, H. B., Tarassova, O., Arndt, A., Giedraitis, V., et al. (2021). Extraction of gait parameters from marker-free video recordings of timed up-and-go tests: validity, inter- and intra-rater reliability. Gait Posture 90, 489–495. doi:10.1016/j.gaitpost.2021.08.004

Åberg, A. C., Petersson, J. R., Giedraitis, V., McKee, K. J., Rosendahl, E., Halvorsen, K., et al. (2023). Prediction of conversion to dementia disorders based on timed up and go dual-task test verbal and motor outcomes: a five-year prospective memory-clinic-based study. BMC Geriatr. 23 (1), 535. doi:10.1186/s12877-023-04262-w

Åhman, H. B., Giedraitis, V., Cedervall, Y., Lennhed, B., Berglund, L., McKee, K., et al. (2019). Dual-task performance and neurodegeneration: correlations between timed up-and-go dual-task test outcomes and alzheimer's disease cerebrospinal fluid biomarkers. J. Alzheimers Dis. 71 (s1), S75–S83. doi:10.3233/JAD-181265

Åhman, H. B., Berglund, L., Cedervall, Y., Giedraitis, V., McKee, K. J., Rosendahl, E., et al. (2021). Timed “Up and Go” dual-task tests: age- and sex-specific reference values and test-retest reliability in cognitively healthy controls. Phys. Ther. 101 (10). doi:10.1093/ptj/pzab179

Almeida, O. P., and Almeida, S. A. (1999). Reliability of the Brazilian version of the abbreviated form of geriatric depression scale (GDS) short form. Arq. Neuropsiquiatr. 57 (2B), 421–426. doi:10.1590/s0004-282x1999000300013

Altman, D. G. (1991). Practical statistics for medical research. London: Chapman and Hall/CRC.

Beckerman, H., Roebroeck, M. E., Lankhorst, G. J., Becher, J. G., Bezemer, P. D., and Verbeek, A. L. (2001). Smallest real difference, a link between reproducibility and responsiveness. Qual. Life Res. 10 (7), 571–578. doi:10.1023/a:1013138911638

Bland, J. M., and Altman, D. G. (1996). Measurement error. BMJ 312 (7047), 1654. doi:10.1136/bmj.312.7047.1654

Bottinger, M. J., Labudek, S., Schoene, D., Jansen, C. P., Stefanakis, M. E., Litz, E., et al. (2024). TiC-TUG: technology in clinical practice using the instrumented timed up and go test-a scoping review. Aging Clin. Exp. Res. 36 (1), 100. doi:10.1007/s40520-024-02733-7

Braun, T., Thiel, C., Schulz, R. J., and Gruneberg, C. (2019). Reliability of mobility measures in older medical patients with cognitive impairment. BMC Geriatr. 19 (1), 20. doi:10.1186/s12877-019-1036-z

Brauner, F. O., Figueiredo, A. I., Urbanetto, M. S., Baptista, R. R., Schiavo, A., and Mestriner, R. G. (2022). The 180 degrees turn phase of the timed up and go test better predicts history of falls in the oldest-old when compared with the full test: a case-control study. J. Aging Phys. Act. 31 (2), 303–310. doi:10.1123/japa.2022-0091

Burks, H. B., des Bordes, J. K. A., Chadha, R., Holmes, H. M., and Rianon, N. J. (2021). Quality of life assessment in older adults with dementia: a systematic review. Dement. Geriatr. Cogn. Disord. 50 (2), 103–110. doi:10.1159/000515317

Cedervall, Y., Stenberg, A. M., Åhman, H. B., Giedraitis, V., Tinmark, F., Berglund, L., et al. (2020). Timed up-and-go dual-task testing in the assessment of cognitive function: a mixed methods observational study for development of the UDDGait protocol. Int. J. Environ. Res. Public Health 17 (5), 1715. doi:10.3390/ijerph17051715

Donoghue, O. A., Savva, G. M., Borsch-Supan, A., and Kenny, R. A. (2019). Reliability, measurement error and minimum detectable change in mobility measures: a cohort study of community-dwelling adults aged 50 years and over in Ireland. BMJ Open 9 (11), e030475. doi:10.1136/bmjopen-2019-030475

Flansbjer, U. B., Holmback, A. M., Downham, D., Patten, C., and Lexell, J. (2005). Reliability of gait performance tests in men and women with hemiparesis after stroke. J. Rehabil. Med. 37 (2), 75–82. doi:10.1080/16501970410017215

Herman, T., and Hausdorff, J. M. (2011). Properties of the 'timed up and go' test: more than meets the eye. Gerontology 57 (3), 203–210. doi:10.1159/000314963

Hofheinz, M., and Schusterschitz, C. (2010). Dual task interference in estimating the risk of falls and measuring change: a comparative, psychometric study of four measurements. Clin. Rehabil. 24 (9), 831–842. doi:10.1177/0269215510367993

Hollman, J. H., McDade, E. M., and Petersen, R. C. (2011). Normative spatiotemporal gait parameters in older adults. Gait Posture 34 (1), 111–118. doi:10.1016/j.gaitpost.2011.03.024

Knapstad, M. K., Steihaug, O. M., Aaslund, M. K., Nakling, A., Naterstad, I. F., Fladby, T., et al. (2019). Reduced walking speed in subjective and mild cognitive impairment: a cross-sectional study. J. Geriatr. Phys. Ther. 42 (3), E122–E128. doi:10.1519/JPT.0000000000000157

Kwak, K., Kostic, E., and Kim, D. (2023). Gait variability-based classification of the stages of the cognitive decline using partial least squares-discriminant analysis. Sci. Prog. 106 (4), 368504231218604. doi:10.1177/00368504231218604

Leroy, V., Nunkessore, O., Dentel, C., Durand, H., Mockler, D., et al. (2023). The nebulous association between cognitive impairment and falls in older adults: a systematic review of the literature. Int. J. Environ. Res. Public Health 20 (3), 2628. doi:10.3390/ijerph20032628

Löfgren, N., Berglund, L., Giedraitis, V., Halvorsen, K., Rosendahl, E., McKee, K. J., et al. (2025). Extracted step parameters during the timed up and go test discriminate between groups with different levels of cognitive ability-a cross-sectional study. BMC Geriatr. 25 (1), 182. doi:10.1186/s12877-025-05828-6

Marx, R. G., Menezes, A., Horovitz, L., Jones, E. C., and Warren, R. F. (2003). A comparison of two time intervals for test-retest reliability of health status instruments. J. Clin. Epidemiol. 56 (8), 730–735. doi:10.1016/s0895-4356(03)00084-2

McGrath, D., Greene, B. R., Doheny, E. P., McKeown, D. J., De Vito, G., and Caulfield, B. (2011). Reliability of quantitative TUG measures of mobility for use in falls risk assessment. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2011, 466–469. doi:10.1109/IEMBS.2011.6090066

McIsaac, T. L., Lamberg, E. M., and Muratori, L. M. (2015). Building a framework for a dual task taxonomy. BioMed Res. Int. 2015, 591475. doi:10.1155/2015/591475

Mendonca, M. D., Alves, L., and Bugalho, P. (2016). From subjective cognitive complaints to dementia: who is at risk? a systematic review. Am. J. Alzheimers Dis. Other Demen. 31 (2), 105–114. doi:10.1177/1533317515592331

Mitchell, A. J., Beaumont, H., Ferguson, D., Yadegarfar, M., and Stubbs, B. (2014). Risk of dementia and mild cognitive impairment in older people with subjective memory complaints: meta-analysis. Acta Psychiatr. Scand. 130 (6), 439–451. doi:10.1111/acps.12336

Molinuevo, J. L., Rabin, L. A., Amariglio, R., Buckley, R., Dubois, B., Ellis, K. A., et al. (2017). Implementation of subjective cognitive decline criteria in research studies. Alzheimers Dement. 13 (3), 296–311. doi:10.1016/j.jalz.2016.09.012

Montero-Odasso, M., Beauchet, O., and Hausdorff, J. M. (2012). Gait and cognition: a complementary approach to understanding brain function and the risk of falling. J. Am. Geriatr. Soc. 60 (11), 2127–2136. doi:10.1111/j.1532-5415.2012.04209.x

Mullin, D. S., Cockburn, A., Welstead, M., Luciano, M., Russ, T. C., and Muniz-Terrera, G. (2022). Mechanisms of motoric cognitive risk-Hypotheses based on a systematic review and meta-analysis of longitudinal cohort studies of older adults. Alzheimers Dement. 18 (12), 2413–2427. doi:10.1002/alz.12547

Ortega-Bastidas, P., Gomez, B., Aqueveque, P., Luarte-Martinez, S., and Cano-de-la-Cuerda, R. (2023). Instrumented timed up and Go test (iTUG)-More than assessing time to predict falls: a systematic review. Sensors (Basel) 23 (7), 3426. doi:10.3390/s23073426

Parati, M., Ambrosini, E., De Maria, B., Gallotta, M., Dalla Vecchia, L. A., Ferriero, G., et al. (2022). The reliability of gait parameters captured via instrumented walkways: a systematic review and meta-analysis. Eur. J. Phys. Rehabil. Med. 58 (3), 363–377. doi:10.23736/S1973-9087.22.07037-X

Pike, A., McGuckian, T. B., Steenbergen, B., Cole, M. H., and Wilson, P. H. (2023). How reliable and valid are dual-task cost metrics? A meta-analysis of locomotor-cognitive dual-task paradigms. Arch. Phys. Med. Rehabil. 104 (2), 302–314. doi:10.1016/j.apmr.2022.07.014

Podsiadlo, D. (1991). The timed “Up and Go”: a test of basic functional mobility for frail elderly persons. J. Am. Geriatr. Soc. 39 (2), 142–148. doi:10.1111/j.1532-5415.1991.tb01616.x

Pottorf, T. S., Nocera, J. R., Eicholtz, S. P., and Kesar, T. M. (2022). Locomotor adaptation deficits in older individuals with cognitive impairments: a pilot study. Front. Neurol. 13, 800338. doi:10.3389/fneur.2022.800338

Ramirez, F., et al. (2021). Dual-task gait as a predictive tool for cognitive impairment in older adults: a systematic review. Front. Aging Neurosci. 13, 769462. doi:10.3389/fnagi.2021.769462

Rodrigues, F., Teixeira, J. E., and Forte, P. (2023). The reliability of the timed Up and Go test among Portuguese elderly. Healthc. (Basel) 11 (7), 928. doi:10.3390/healthcare11070928

Salarian, A., Horak, F. B., Zampieri, C., Carlson-Kuhta, P., Nutt, J. G., and Aminian, K. (2010). iTUG, a sensitive and reliable measure of mobility. IEEE Trans. Neural Syst. Rehabil. Eng. 18 (3), 303–310. doi:10.1109/TNSRE.2010.2047606

Salzman, T., Laurin, E., Thibault, C., Farrell, P., and Fraser, S. (2025). A systematic review and meta-analysis of dual-task outcomes in subjective cognitive decline. Alzheimers Dement. (Amst). 17 (1), e70054. doi:10.1002/dad2.70054

Skillback, T., Zetterberg, H., Skoog, J., Rydén, L., Wetterberg, H., et al. (2022). Slowing gait speed precedes cognitive decline by several years. Alzheimers Dement. 18 (9), 1667–1676. doi:10.1002/alz.12537

Smith, E., Walsh, L., Doyle, J., Greene, B., and Blake, C. (2016). The reliability of the quantitative timed up and go test (QTUG) measured over five consecutive days under single and dual-task conditions in community dwelling older adults. Gait Posture 43, 239–244. doi:10.1016/j.gaitpost.2015.10.004

Terwee, C. B., Bot, S. D., de Boer, M. R., van der Windt, D. A., Knol, D. L., Dekker, J., et al. (2007). Quality criteria were proposed for measurement properties of health status questionnaires. J. Clin. Epidemiol. 60 (1), 34–42. doi:10.1016/j.jclinepi.2006.03.012

van Lummel, R. C., Walgaard, S., Hobert, M. A., Maetzler, W., van Dieen, J. H., Galindo-Garre, F., et al. (2016). Intra-rater, inter-rater and test-retest reliability of an instrumented timed up and Go (iTUG) test in patients with parkinson's disease. PLoS One 11 (3), e0151881. doi:10.1371/journal.pone.0151881

Venema, D. M., Hansen, H., High, R., Goetsch, T., and Siu, K. C. (2019). Minimal detectable change in dual-task cost for older adults with and without cognitive impairment. J. Geriatr. Phys. Ther. 42 (4), E32–E38. doi:10.1519/JPT.0000000000000194

Verghese, J., Lipton, R. B., and Holtzer, R. (2013). Motoric cognitive risk syndrome and the risk of dementia. J. Gerontol. A Biol. Sci. Med. Sci. 68 (4), 412–418. doi:10.1093/gerona/gls191

Wang, T., Tan, J. B., Liu, X. L., and Zhao, I. (2023). Barriers and enablers to implementing clinical practice guidelines in primary care: an overview of systematic reviews. BMJ Open 13 (1), e062158. doi:10.1136/bmjopen-2022-062158

WHO (2025). Global status report on the public health response to dementia. Available online at: https://www.who.int/publications/i/item/9789240033245.

Wollesen, B., Wanstrath, M., van Schooten, K. S., and Delbaere, K. (2019). A taxonomy of cognitive tasks to evaluate cognitive-motor interference on spatiotemporal gait parameters in older people: a systematic review and meta-analysis. Eur. Rev. Aging Phys. Act. 16, 12. doi:10.1186/s11556-019-0218-1

Yang, L., Liao, L. R., Lam, F. M., He, C. Q., and Pang, M. Y. (2015). Psychometric properties of dual-task balance assessments for older adults: a systematic review. Maturitas 80 (4), 359–369. doi:10.1016/j.maturitas.2015.01.001

Yang, Q., Tian, C., Tseng, B., Zhang, B., Huang, S., Jin, S., et al. (2020). Gait change in dual task as a behavioral marker to detect mild cognitive impairment in elderly persons: a systematic review and meta-analysis. Arch. Phys. Med. Rehabil. 101 (10), 1813–1821. doi:10.1016/j.apmr.2020.05.020

Yogev-Seligmann, G., Hausdorff, J. M., and Giladi, N. (2008). The role of executive function and attention in gait. Mov. Disord. 23 (3), 329–342. doi:10.1002/mds.21720

Zhong, Y., Huang, S., Zou, M., Chen, Y., Shen, P., He, Y., et al. (2025). Gait analysis in older adults with mild cognitive impairment: a bibliometric analysis of global trends, hotspots, and emerging frontiers. Front. Aging 6, 1592464. doi:10.3389/fragi.2025.1592464

Keywords: dementia, gait, mild cognitive impairment, motor-cognitive dual-task, psychometrics, subjective cognitive decline

Citation: Löfgren N, Giedraitis V, Halvorsen K, Rosendahl E and Åberg AC (2026) The Timed Up and Go dual-task test’s cognitive and motor outcomes show promising test-retest reliability in older adults with perceived memory impairment. Front. Aging 6:1715756. doi: 10.3389/fragi.2025.1715756

Received: 29 September 2025; Accepted: 19 December 2025;
Published: 08 January 2026.

Edited by:

Kuan-yi Li, Chang Gung University, Taiwan

Reviewed by:

Lien Van Laer, University of Antwerp, Belgium
I-Chen Chen, Dayeh University, Taiwan

Copyright © 2026 Löfgren, Giedraitis, Halvorsen, Rosendahl and Åberg. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Niklas Löfgren, nlg@du.se
