The relationship between response dynamics and the formation of confidence varies across the lifespan

Accurate metacognitive judgments, such as forming a confidence judgment, are crucial for goal-directed behavior but decline with older age. Besides changes in the sensory processing of stimulus features, there might also be changes in the motoric aspects of giving responses that account for age-related changes in confidence. In order to assess the association between confidence and response parameters across the adult lifespan, we measured response times and peak forces in a four-choice flanker task with subsequent confidence judgments. In 65 healthy adults from 20 to 76 years of age, we showed divergent associations of each measure with confidence, depending on decision accuracy. Participants indicated higher confidence after faster responses in correct but not incorrect trials. They also indicated higher confidence after less forceful responses in errors but not in correct trials. Notably, these associations were age-dependent as the relationship between confidence and response time was more pronounced in older participants, while the relationship between confidence and response force decayed with age. Our results add to the notion that confidence is related to response parameters and demonstrate noteworthy changes in the observed associations across the adult lifespan. These changes potentially constitute an expression of general age-related deficits in performance monitoring or, alternatively, index a failing mechanism in the computation of confidence in older adults.

Accurate metacognitive judgments, such as forming a confidence judgment, are crucial for goal-directed behavior but decline with older age. Besides changes in the sensory processing of stimulus features, there might also be changes in the motoric aspects of giving responses that account for agerelated changes in confidence. In order to assess the association between confidence and response parameters across the adult lifespan, we measured response times and peak forces in a four-choice flanker task with subsequent confidence judgments. In 65 healthy adults from 20 to 76 years of age, we showed divergent associations of each measure with confidence, depending on decision accuracy. Participants indicated higher confidence after faster responses in correct but not incorrect trials. They also indicated higher confidence after less forceful responses in errors but not in correct trials. Notably, these associations were age-dependent as the relationship between confidence and response time was more pronounced in older participants, while the relationship between confidence and response force decayed with age. Our results add to the notion that confidence is related to response parameters and demonstrate noteworthy changes in the observed associations across the adult lifespan. These changes potentially constitute an expression of general age-related deficits in performance monitoring or, alternatively, index a failing mechanism in the computation of confidence in older adults.

Introduction
Humans can report a subjective sense of confidence that is closely related to the accuracy of their actions. This ongoing monitoring of decisions and their execution is called metacognition and includes the evaluation of behavior and the detection of occurring errors (Fleming and Dolan, 2012). Accurate metacognitive judgments should lead to adaptive behavior adjustments and are thus crucial for all activities. Undetected errors (i.e., an example of incorrect metacognitive judgments) might have severe implications for real-life scenarios because they may not trigger the required adjustments for future actions and decisions (Wessel et al., 2018). Research on perceptual metacognition (i.e., metacognitive evaluations of perceptual decisions) has evolved largely separate in the fields of decision confidence and error detection, which were suggested to share the same underlying mechanisms (Boldt and Yeung, 2015;Charles and Yeung, 2019). From this point of view, low confidence in the accuracy of a decision (e.g., expressed in a confidence rating as "certainly wrong") would be equivalent to reporting the detection of an error (e.g., signaled by a designated key press).
Metacognitive performance across the adult lifespan seems to depend on the cognitive domain being assessed. In the memory domain, evidence points toward largely preserved metacognitive abilities with older age (Dodson et al., 2007;Hertzog and Curley, 2018;Zakrzewski et al., 2021; although for aspects of memory metacognition that declined with age, see Chua et al., 2009;Tauber and Dunlosky, 2012). In the perceptual domain, in contrast, tasks require fundamentally different cognitive processes and research showed a rather robust decline in metacognitive performance across age (Rabbitt, 1990;Palmer et al., 2014;Niessen et al., 2017;Wessel et al., 2018). This study is intended to assess perceptual metacognition only. When participants were asked to report committed errors in an easy choice-reaction task, the detection rates declined with age, even when task performance was comparable (Harty et al., 2017;Niessen et al., 2017). In our previous publication, using the same dataset described in this study (Overhoff et al., 2021), we asked participants to rate their confidence after each decision on a four-point scale instead of reporting detected errors, and we concordantly revealed a decline in metacognitive performance across the lifespan. The accuracy of these ratings (i.e., the degree to which they matched the observed performance) decreased gradually with higher age, reflecting that older adults were less aware of their errors and rated correct responses with lower confidence compared to younger adults. Previous studies investigating age-related changes in metacognitive performance using confidence ratings are rare and yielded contradicting results. A recent large online study investigated metacognitive performance in 304 participants across the lifespan in a gamified setting using a visual perception task (McWilliams et al., 2022). Despite lower mean confidence with older age, the results revealed no effect of age on the accuracy of metacognitive judgments. In contrast, a laboratory based study with 60 participants using a similar task observed a significant decline in metacognitive accuracy with older age (Palmer et al., 2014; see also Filippi et al., 2020). Thus, while findings regarding age-related changes in perceptual metacognition are mixed, the question of which factors are related to this selective declinewhen it is found -remains open.
In order to understand the age-related decline in metacognitive performance, it is essential to understand the basic mechanisms underlying the computation of confidence. It is still unclear which information is used to compute confidence, and how input from different sources is weighted (Charles and Yeung, 2019;Feuerriegel et al., 2021). For instance, confidence has been related to the strength of stimulus evidence, stimulus discriminability (Yeung and Summerfield, 2012;Charles and Yeung, 2019;Turner et al., 2021b), or instructed time pressure (Vickers and Packer, 1982). Furthermore, growing evidence suggests that the interoceptive feedback of a motor action while giving a response might be another source of information contributing to the formation of confidence about the decision (Kiani et al., 2014;Fleming et al., 2015;Palser et al., 2018;Gajdos et al., 2019;Siedlecka et al., 2021;Turner et al., 2021a). Fleming and colleagues (2015) investigated the interaction of confidence and motor-related activity by delivering single-pulse transcranial magnetic stimulation (TMS) to the dorsal premotor cortex. This perturbation did not affect task performance, but crucially, it did affect the accuracy of subsequent confidence judgments. This finding indicates that action-specific cortical activations might contribute to confidence. In line with this assumption, confidence ratings have been shown to be more accurate if the preceding decision required a motor action (Pereira et al., 2020;Siedlecka et al., 2021). For instance, Siedlecka et al. (2021) recently showed that metacognitive accuracy was higher after decisions requiring a key press than decisions which were indicated without a motor action. Taken together, these findings suggest that features of the motor response indicating a given decision might influence the confidence ratings about this decision. Therefore, further investigations of how confidence is reflected in different response parameters are warranted.
A response can be characterized by different dimensions. The most commonly used output variable is time, usually response time or movement time. A robust finding across studies is a negative relationship between response times for the initial decision and subsequent confidence ratings (Fleming et al., 2010;Kiani et al., 2014;Rahnev et al., 2020). Intuitively, one might assume that the degree of confidence is expressed in the time taken to make the decision, i.e., the less confident we are about a decision, the longer it should take to respond. However, another possible explanation is that the monitoring system uses the interoceptive signal of a movement produced by the response as an informative cue about the difficulty of the decision (Kiani et al., 2014;Fleming and Daw, 2017). Accordingly, if an easy decision led to a fast response, the internal read-out could boost subjective confidence. A recent study provided evidence for the directional effect of movement time (i.e., the time from lifting to dropping a marble) on confidence (Palser et al., 2018). In this study, movement speed was experimentally manipulated by instructing participants to move faster than they naturally would, and this manipulation resulted in declined metacognitive accuracy.
Nevertheless, temporal parameters do not capture all aspects of a movement. For instance, subthreshold motor activity (i.e., partial responses) cannot be detected by classical RT recordings but rather by recording muscle activity. However, partial responses have also been shown to affect reported confidence (Ficarella et al., 2019;Gajdos et al., 2019). An informative motor parameter of a response is the applied force, which is often measured in its peak force, i.e., the maximum exerted force during a response action. Notably, peak force and response time index distinct processes as they show divergent behavioral patterns (i.e., small to no correlation) across experimental manipulations (Franz and Miller, 2002;Stahl and Rammsayer, 2005;Cohen and van Gaal, 2014). Therefore, the measurement of peak force might provide unique information about cognitive processes, which in turn might enrich our understanding of agerelated impairments in metacognition because previous studies have already suggested a close relationship between peak force and metacognitive judgments (Stahl et al., 2020;Turner et al., 2021a).
Recently, Turner et al. (2021a) examined the relationship between confidence and response force by explicitly manipulating the degree of physical effort that had to be exerted to give a response. When participants were prompted to submit their response to a perceptual decision with varying force levels, participants reported higher confidence in their decisions when their response peak force was higher. Notably, requiring participants to produce a specific (and comparably high) degree of force (as mandated in the experiment) is fundamentally different from measuring naturally occurring force patterns of a response (in terms of a dependent measure). The latter was done, for example, in a study by Bode and Stahl (2014), who found that naturally occurring peak force was lower in errors compared to correct responses. It was suggested that this might indicate a process in which low force in error trials signifies an unsuccessful attempt to stop the already initiated response, which requires early and fast error detection (Ko et al., 2012;Bode and Stahl, 2014;Stahl et al., 2020). However, error detection or confidence was not directly assessed, rendering comparison between these two studies difficult.
The relationship between different response parameters and confidence has not been systematically assessed in the context of healthy aging. While response and movement times are slower and more variable with older age, findings of age-related changes in response force are inconsistent (Salthouse, 2000;Bunce et al., 2004;Dully et al., 2018). Some studies showed delayed and altered electrophysiological signatures of motor processing in older age [e.g., lateralized readiness potential (LRP)/movement-related potential (MRP) and mu/beta desynchronization; Sailer et al., 2000;Falkenstein et al., 2006;Quandt et al., 2016]. When required to produce visually cued levels of force, older participants' force output was found to be more variable (Vaillancourt and Newell, 2003), which may be restricted to very old age though (Sosnoff and Newell, 2006). In contrast, electromyographic or force recordings of motor responses revealed similar patterns in younger and older adults (Van Der Lubbe et al., 2002;Yordanova et al., 2004;Falkenstein et al., 2006;Dully et al., 2018). Notably, these studies did not assess error detection or confidence. Therefore, it is warranted to specifically examine the associations between confidence and response time and between confidence and response force and to investigate whether these associations change across the adult lifespan.
The present study constitutes the first comprehensive assessment of the association between metacognitive accuracy and two main response parameters across the adult lifespan. We intended to answer the following questions: First, what are the relationships between decision confidence and response time on the one hand, and peak force of a response (as it naturally occurs, i.e., without specific instruction or experimental manipulation) on the other hand? Second, do these relationships between confidence and response parameters change with age? Additionally, we were interested in investigating the potential moderating effect of accuracy because many studies on decision confidence only assessed the relationship between a given response parameter and confidence in correct responses. However, evidence suggests that meaningful differences in confidence levels within correct responses and within errors exist (Charles et al., 2013). Concerning response time, for instance, the well-known negative relationship with confidence is inverted for errors when the confidence rating allows to indicate error detection (i.e., a rating scale was used that ranged from certainty in being correct to certainty in being wrong; Pereira et al., 2020). In order to answer these questions, we employed a conflict task, which is often used in studies of error monitoring (Vissers et al., 2018;Maier et al., 2019), while studies of decision confidence typically use signal detection tasks where stimuli are difficult to discriminate and errors are rarely detected (Resulaj et al., 2009;Fleming et al., 2016).
We expected significant associations between confidence judgments and parameters of the response (Fleming et al., 2015;Pereira et al., 2020;Rahnev et al., 2020;Turner et al., 2021a). In particular, response time was expected to decrease with higher confidence for correct trials (Kiani et al., 2014;Dotan et al., 2018;Rahnev et al., 2020) and to increase with higher confidence for errors (Pereira et al., 2020). We tentatively hypothesized a positive relationship between response force and confidence for errors and correct responses (Ko et al., 2012;Bode and Stahl, 2014;Turner et al., 2021a).
Most importantly, we intended to explore age-related changes in the associations between response parameters and confidence without having a priori hypotheses about the direction of possible effects due to a lack of previous studies on this topic. If we find divergent patterns across the lifespan, this might encourage research on the causal relationship between response parameters and confidence.

Materials and methods Participants
Eighty-two participants were recruited and received monetary compensation for their participation in the experiment. Inclusion criteria were, amongst others, no history of or current neurological or psychiatric disease, which excluded participants with diagnosed Parkinson's disease, a condition characterized by altered patterns of response force (Franz and Miller, 2002). Data from seventeen participants had to be discarded due to: symptoms of depression (N = 1, Beck's Depression Inventory score higher than 17; BDI; Hautzinger, 1991), poor behavioral performance (N = 8, more than 30% invalid trials, error rate higher than chance, here 25%), or a behavioral pattern that was indicative of an insufficient understanding or implementation of task demands (N = 8, inspection of individual datasets for a combination of errors in the color discrimination test described below, near chance task performance, frequent invalid trials, and biased use of single response keys). This resulted in a final sample for analysis of sixty-five healthy, right-handed adults [age = 45.5 ± 2.0 years (all results are indicated as mean ± standard error of the mean; SEM); age range = 20 to 76 years with a near equal distribution of participants across decades; <40 years: 28, 40-59 years: 20, >60 years: 13; 26 female, 39 male] with (corrected to) normal visual accuracy, no color-blindness, no signs of cognitive impairment (Mini-Mental-State Examination score lower than the cut-off of 24; MMSE; Folstein et al., 1975) and no history of psychiatric or neurological diseases.
The current study's data have been used previously (Overhoff et al., 2021). The same exclusion criteria regarding the neuropsychological assessment and the task performance were applied, resulting in the same subsample included in the analyses. In the previous publication, we thoroughly examined the metacognitive performance and its relation to behavioral parameters (response accuracy, response time, behavioral adjustments) as well as two electrophysiological potentials (i.e., the error/correct negativity, N e , and the error/correct positivity, P e ; for detailed results and discussion thereof, see Overhoff et al., 2021). We did not report or analyze any response force measures in the previous publication.
The experiment was approved by the ethics committee of the German Psychological Society (DGPs). All participants gave written informed consent, and the study followed the Declaration of Helsinki.

Stimuli
The experiment consisted of a color version of the Flanker task (Eriksen and Eriksen, 1974) with four response options intended to increase conflict and thereby the number of errors while ensuring feasibility for participants of all ages. Four target colors were mapped onto both hands' index and middle fingers. In each trial, we presented one central, colored target square flanked by two squares on the left and right side, respectively. Participants had to respond to the central target by pressing the corresponding finger. The flankers were presented slightly before the target appeared to increase their distracting effect. Flankers could be of the same color as the target (congruent condition), of one of three additional neutral colors that were not mapped to any response (neutral condition), or of another target color (incongruent condition).

Experimental paradigm
Each trial started with the presentation of a white fixation cross on black background for 500 ms. The fixation cross was replaced by the two flankers, followed by the target after 50 ms and the two flankers and the target remained on screen for another 100 ms. Participants pressed their left or right index or middle finger to indicate their decision (see Figure 1B). Participants were instructed to respond as fast and accurately as possible. A black screen was presented until a response was registered (max. 1200 ms) and an additional 800 ms before presenting the confidence rating. For this rating, participants indicated their confidence in the decision on a four-point scale comprising the options "surely wrong, " "maybe wrong, " "maybe correct, " and "surely correct" (max. 2000 ms). A jittered intertrial interval of 400 to 600 ms preceded the subsequent trial. If no response was registered in the decision task, the participants received feedback about being too slow, and the trial was terminated. The sequence of an experimental trial is depicted in Figure 1A.

Procedures
Prior to testing, we collected demographic details, and the participants conducted a brief color discrimination test without any time pressure or cognitive load to ensure that they were capable of correctly discriminating the stimulus colors used in the experiment. The neuropsychological tests for assessing (A) Trial structure (here, incongruent condition illustrated). Each trial commenced with the presentation of a fixation cross. Subsequently, two colored squares (flankers) were presented, and a third square (of the same or a different color; target) was added shortly after. The stimuli disappeared after 100 ms and the ensuing black screen, where participants were instructed to make a response by pressing one of four response keys mapped onto one color each, remained until a response was registered (max. 1200 ms). If no response was given, the German words for "too slow" were shown, and the trial was terminated. Otherwise, after another black screen, the confidence rating scale was presented, which remained on the screen until a judgment (the four fingers were mapped onto the squares according to their spatial location) was made (max. 2000 ms). The next trial started after another black screen of random duration between 400 and 600 ms. (B) Force-sensitive response keys. Left and right index and middle fingers (red circles) were placed on adjustable finger rests.
Participants first performed 18 practice trials without confidence rating, receiving feedback about their accuracy, which could be repeated once if necessary. Two practice blocks of 72 trials without feedback or the option to repeat followed. Another practice block of 18 trials then introduced the confidence rating. The actual experiment consisted of five blocks with 72 trials each, with optional breaks after each block. The electroencephalogram (EEG) was recorded throughout the testing session. Note that the EEG results have been reported in our previous publication (Overhoff et al., 2021). The main experiment lasted approximately 40 min.

Apparatus
The participants were seated in a noise-insulated and dimly lit testing booth at a viewing distance of 70 cm to the screen (LCD monitor, 60 Hz). A chin rest minimized non-task related movements.
For response recording, we used force sensitive keys with a sampling rate of 1024 Hz and a high temporal resolution that is superior to standard keyboards ( Figure 1B; Stahl et al., 2020). The keys were calibrated to the fingers' weight before and during the experiment. The keys could be adjusted to the hand size, and a comfortable hand position was ensured by a wrist rest. An applied force was registered as a response when it exceeded a threshold of 40 cN.
The color discrimination test was programmed using Presentation software (Neurobehavioral Systems, version 14.5) and the main task using uVariotest software (version 1.978).

Analysis
Response time (RT) was defined as the time from target stimulus presentation to the initial crossing of the response force threshold of 40 cN by any response key. Peak force (PF) was defined as the maximum of a force pulse following the crossing of the threshold. Additionally, we measured the time from response onset to the time of the PF (only used for the exclusion of trials).
We excluded from the analysis: invalid trials, which were too slow, responses without confidence rating, responses with an RT below 200 ms (indicating premature responding), a PF below 40 cN (indicating incomplete or aborted responding), a time to PF of more than three standard deviations above the mean (indicating that the response was not of the expected ballistic nature), and recording artifacts (implausible time between response onset and time of the PF, incorrect identification of response key in case of multiple responses).
As a first step, to characterize the distribution of the behavioral parameters of interest independent of confidence, we computed paired samples t-tests at the group level to compare RT, PF, and their dispersion between correct and incorrect trials. For the investigation of age-related effects, we used a series of linear regressions with the predictor age for each of the following variables: error rates (ER; the proportion of valid responses that were incorrect), mean confidence ratings, mean RT and PF, and standard deviation of RT and PF. The latter analyses were performed separately for errors and correct responses.
Next, data were analyzed using generalized linear mixedeffects models (GLMMs) with a beta distribution using the glmmTMB package (version 1.0.2.1; Brooks et al., 2017) in R (version 4.0.5; R Core Team, 2021). We chose this modeling approach because the beta distribution is assumed to better account for data that are not normally distributed and doubly bounded (i.e., having an upper and a lower bound; here: 1, "surely wrong, " and 4, "surely correct"), which applies to our confidence data (Verkuilen and Smithson, 2012). All continuous predictor variables were mean centered and scaled for model fitting, and confidence was scaled to the open interval (0,1; i.e., the range is slightly compressed to avoid boundary observations; Verkuilen and Smithson, 2012). Analyses were again conducted separately for correct responses and errors.
We examined the effects of age and the two parameters (RT, PF) of the response on confidence ratings using the following regression model structures (separately for the subsets of errors and correct responses): Response time (RT) and PF were used as fixed effects, and age was included as a covariate due to its documented negative effect on metacognitive accuracy (i.e., a negative effect on confidence for correct responses and a positive effect on confidence for errors; Palmer et al., 2014;Overhoff et al., 2021).
For the most complex model, we considered an interaction term between all three factors, as RT and PF are known to vary across age (Dully et al., 2018), and previous work suggests potential interactions between RT and PF (Bode and Stahl, 2014;Gajdos et al., 2019). We fitted random intercepts for participants, allowing their mean confidence ratings to differ. If possible and the models converged, random slopes by participant were added for the predictors of interest to account for individual differences in the degree to which these were related to the confidence ratings (Barr et al., 2013). Models were checked for singularity and multicollinearity by calculating the variance inflation factor (VIF) using the performance package (version 0.7.2; Lüdecke et al., 2021).
We compared model fits including all effects of interest (model 4) to models including only one (models 2, 3) or no effect of interest (model 1) using likelihood ratio tests, and computed Wald z-tests to determine the significance of each coefficient. This means that, if a model including one predictor of interest (e.g., RT) fits the data better than a model including no effect of interest, this predictor has a relevant effect on confidence, and its inclusion in the model allows for a better prediction of participants' ratings.
To follow up on significant interaction effects between age and the predictors of interest, we calculated slopes for three values of age (the mean and one standard deviation above and below the mean). Additionally, for statistical analysis of the transitions between these values, we computed Johnson-Neyman intervals using an adapted version of the johnson_neyman function of the interactions package (version 1.1.0; Long, 2019). This analysis reveals whether the statistical effect of the response parameters on confidence is conditional on the entire range of the moderator age, or just a sub-range, thus providing bounds for where the observed interaction effect is significant.

Effect of age on response parameters
We have already reported the relationship between age and error rate, RT, and confidence in our previous publication (Overhoff et al., 2021) based on a slightly different subset of trials to the one used here (due to additional force-related exclusions of trials in this study). Our initial results were confirmed using a series of linear regression analyses, each using age as the predictor for one of the following variables (for an overview, see Supplementary Table 2) The mean confidence (in the decision being correct, on a scale from 1 to 4) for correct responses (3.82 ± 0.02 for the entire sample) decreased with age [F(1,63) = 22.42, p < 0.001, β = -0.007, SE = 0.002, t = -4.74]. This finding indicates that the older participants were, the less confident they were in being correct when responding correctly. Contrarily, the mean confidence for errors (2.35 ± 0.08 for the entire sample) increased with age [F(1,63) = 21.96, p < 0.001, β = 0.019, SE = 0.004, t = 4.69; see Supplementary Figure 2]. Hence, the older the participants were, the less sure they were that the decision was wrong when making an error. We have recently described this phenomenon as an age-related tendency to use the middle of the confidence scale, pointing toward increased uncertainty in older adults (Overhoff et al., 2021).

Modeling of confidence
Variance inflation factors across all models with interactions were <2.03, indicating low collinearity (<5; James et al., 2013) between the predictors, and the models were not overfitted, as the fits proved not to be singular.

Confidence in correct decisions
We computed likelihood ratio tests to compare the model fit of the winning model to the three other models. These tests revealed that for correct decisions, model 2 [i.e., Confidence ∼ RT * Age + (RT | Participant)], which included the interaction between RT and age, fitted the data best. It was superior to model 1 (the null model), which included only the fixed effect of age [χ 2 (2) = 38.15, p < 0.001], and model 3, which included only the interaction between PF and age [χ 2 (2) = 33.71, p < 0.001]. Moreover, model 4, which included the full interaction between PF, RT and age, did not improve the fit further [χ 2 (4) = 8.64, The best fitting model showed significant negative effects of age [β = -0.147, SE = 0.039, z = -3.75, p < 0.001] and RT [β = -0.109, SE = 0.018, z = -6.13, p < 0.001] on confidence and a significant interaction between the two factors [β = -0.050, SE = 0.017, z = -2.96, p = 0.003; Table 1]. Given that we found a significant interaction between RT and age, we computed simple slopes for three values of age (the mean and one SD above and below the mean). The analysis revealed that the negative effect of RT on confidence (i.e., higher confidence for faster responses) increased with older age [ Figure 2A; The Johnson-Neyman technique revealed that the effect of RT on confidence became significant from around 24 years of age onward (higher bound of insignificant interaction effect: 24.41; Figure 2A).

Confidence in erroneous decisions
For errors, the best fitting model was model 3 [i.e., Confidence ∼ PF * Age + (RT | Participant)], which included the interaction between PF and age. The likelihood ratio tests revealed that this model fitted the data better than model 1 (the null model) [χ 2 (2) = 8.38, p = 0.015] and model 2, which included the interaction between RT and age [χ 2 (2) = 5.36, p < 0.001], and model 4, which included the interaction between all three factors, did not show an improved model fit, either [χ 2 (4) = 4.47, p = 0.347].
The winning model showed a significant effect of age [β = 0.288, SE = 0.066, z = 4.40, p < 0.001] and a significant interaction between PF and age [β = 0.080, SE = 0.030, z = 2.67, p = 0.008], but no main effect of PF [β = -0.018, SE = 0.029, z = -0.65, p = 0.519; Table 2]. To further unpack the interaction effect, we ran a simple slope analysis. This analysis showed a negative relationship between confidence and PF only for younger adults, while with increasing age, the slope was not significantly different from zero [ Figure 2B: -1SD (younger adults): β = -0.098, SE = 0.038, z = -2.60, p = 0.009; mean (middle-aged adults): β = -0.018, SE = 0.029, z = -0.645, p = 0.519; +1SD (older adults): β = 0.061, SE = 0.045, z = 1.37, p = 0.170]. Computation of the Johnson-Neyman interval showed that above an age of about 44 years (lower bound of significant interaction effect: 43.501), PF was no longer significantly associated with confidence ( Figure 2B). Interaction plots including the predictors of the models predicting confidence best for errors and correct responses. (A) Regression of age on confidence in correct trials with the moderator RT. (B) Regression of age on confidence in error trials with the moderator PF. Regressions are shown for the moderator fixed on the mean (dashed line) and one standard deviation above (solid line) and below (dotted line) the mean. Blue shaded areas indicate confidence intervals, and gray shaded areas indicate the age range in which a significant effect of RT (in correct trials) or PF (in error trials) on confidence is observed, resulting in the significant interaction effect.

Discussion
This study investigated age-related changes in the relationship between the temporal and motor response parameters RT and PF with decision confidence in a perceptual conflict task. Overall, higher confidence was related to faster and less forceful responses. We could further show that, across the entire sample, confidence was associated with both parameters, and these effects were moderated by performance accuracy: While RT was related to confidence in correct responses, peak force was related to confidence in error trials. Finally, age interacted with the response parameters so that with higher age, the effect of RT on confidence was more pronounced, while the effect of PF on confidence was diminished.
We will first focus on the observed age-related associations between confidence and response parameters and subsequently discuss possible interpretations within two different theoretical frameworks.

Behavioral correlates of confidence
In a complex conflict task, we replicated one of the most robust findings on decision confidence, namely a negative relationship between RT and confidence (Fleming et al., 2010;Rahnev et al., 2020). In correct trials, the higher participants rated their confidence in a decision, the faster they had made the decision. In addition, we found a negative relationship between PF and confidence for errors (higher PF was related to lower confidence for the younger participants), which has not been reported before. Observing the latter association is interesting per se because participants' attention was not directed to the applied force in any way (i.e., participants were not aware of the PF assessment), whilst the relevance of speed had been stressed in the instructions. A recent study (Turner et al., 2021a) showed, in a sample of young participants, that when higher levels of force had to be produced to report the (correct) decision, participants' confidence ratings were higher. While these results do not mirror ours, it should be noted that these findings also cannot be directly compared as their study was conceptually different to our study design and explicitly required participants to produce different force ranges. However, these studies together highlight the added value of assessing response force. In our study, the differential effect of RT and PF for correct and error trials, respectively, further highlights that these are dissociable parameters of a response, supporting a model of Ulrich and Wing (1991); see Jaśkowski et al. (2000) and Armbrecht et al. (2013) that RT and RF do not reflect just two sides of the same coin. Further, our findings stress the significance of including incorrect responses as a distinct response type in the corresponding analyses. Interestingly, the observed associations between the response parameters and confidence differed across the studied age range. While age-related changes in metacognitive performance, and error detection in particular, have been studied across tasks and domains (Palmer et al., 2014;Harty et al., 2017;Niessen et al., 2017;McWilliams et al., 2022), specific characteristics of confidence judgments have rarely been investigated in the context of healthy aging. The current study revealed a stronger association with increasing age between confidence and RT in correct trials and a weaker association between confidence and PF in errors. Importantly, control analyses (see Results and for illustration, Supplementary  Figure 1) revealed increased mean and dispersion of RTs with older age, which may have caused the pronounced negative relationship between confidence and RT, but no effect of age on the force output. This may appear counterintuitive, but it is in in line with previous studies showing similar response force for younger and older adults (except for very old adults above 70 years of age; Van Der Lubbe et al., 2002;Sosnoff and Newell, 2006). It was suggested that older adults may have difficulties in visuo-motor integration in tasks that require to meet a visually cued target force rather than in the pure motor execution (Sailer et al., 2000;Slifkin et al., 2000;Mattay et al., 2002). This implies that age-related changes in the relationship between confidence and PF in our study were -most likely -not due to generally altered patterns of PF.
In sum, this study provides the first comprehensive quantification of the interrelation between confidence and two behavioral response parameters across age. Additionally, it bridges the gap between two related research fields by combining approaches from error detection and decision confidence studies (Rabbitt, 1966;Sosnoff and Newell, 2006;Cohen and van Gaal, 2014;Kiani et al., 2014;Palmer et al., 2014;Stahl et al., 2020;Thurm et al., 2020;Turner et al., 2021a). In our previous study, we showed that age-related changes in metacognitive accuracy were associated with altered patterns of electrophysiological correlates of confidence (Overhoff et al., 2021), which is complemented by the present description of age-related changes in behavioral correlates of confidence. This characterization in a sample of participants covering a broad age range constitutes a further step in identifying and understanding age-related changes in metacognitive performance.
We will present two complementary but not exclusive interpretations of the observed age-related variations in the following. In the first part, we attempt to explain our findings under the assumption that response characteristics simply cooccur with the build-up of confidence. In contrast, in the second part, we assume that response parameters comprise additional information about the decision accuracy that is integrated into confidence during its formation process.

Response parameters as the expression of confidence
One possible framework for explaining the experimental findings is to assume that the level of decision confidence is expressed in the RT or PF of the response indicating this decision, either because confidence defines the response parameters or because a common process drives both confidence and the two parameters. In other words, if a participant is highly confident in a decision, this will affect the speed and the force with which they report this decision. Research has identified multiple stimulus-related characteristics that alter the accuracy of confidence judgments, like relative and absolute evidence strength (Peters et al., 2017;Ko et al., 2022) or evidence reliability (Boldt et al., 2017). If sensory evidence is unambiguous, an easy decision will accordingly lead to high certainty of having made a correct response. In turn, if the participant nevertheless responds incorrectly but changes their mind and detects this error, the certainty of having made an error will be high (i.e., resulting in a low confidence rating). It is intuitive to imagine that high certainty of having made a correct or incorrect response (which is identical to very high or very low confidence, respectively) will lead to fast and more forceful responses.
Notably, neither RT nor PF showed the expected pattern of change as would be expected if one or both parameters simply mirrored a decline in confidence with age. Arguably, it might still be possible to explain the differential interactions with age by assuming that multiple other sources (e.g., perception, attention, response selection, motor processes) cause the observed relationships between confidence and the two response parameters. If aging impacts (some of) these sources differentially, this might result in altered associations between confidence, RT and PF, as observed here. For instance, a cognitive process that is differentially susceptible in older compared to younger adults might affect the RTconfidence relationship but spare the relationship between PF and confidence. However, as we did not systematically investigate these other processes in the present study, we can neither support nor rule out these assumptions.

Modulation of confidence by response parameters
Alternatively, our findings could also be interpreted in line with recent studies postulating that parameters of a response indicating a decision may serve as an additional source of evidence that is integrated into confidence judgments about this decision -especially in ambiguous situations (Gajdos et al., 2019;Filevich et al., 2020;Pereira et al., 2020;Wokke et al., 2020;Turner et al., 2021a). These studies showed that confidence could be altered, for instance, by applying TMS to the dorsal premotor cortex or by instructing participants to move faster (Fleming et al., 2015;Palser et al., 2018). Using very different methodological approaches, these studies mutually indicate that the post-decisional evidence accumulation might incorporate response characteristics of the initial decision into the subsequent confidence rating. Although our study design assessing the relationship of confidence with RT and naturally occurring PF precludes any conclusions regarding the causal direction of effects, it is nevertheless interesting to reflect on our results within this framework.
Looking at the overall relationship between confidence and the two response parameters, our differential findings for errors and correct responses suggest serial processing. First, the RTrelated information might be "read out" by the monitoring system and serve as an interoceptive cue about the difficulty of a decision. This assumption is in line with previous work (Fleming et al., 2010;Kiani et al., 2014;Dotan et al., 2018;Gajdos et al., 2019;Rahnev et al., 2020). This interpretation would suggest that the decision-makers arrive at a higher confidence judgment because they also register having responded faster [e.g., via the efference copy (Latash, 2021) or the later representation of their action].
However, this proposed mechanism might exclusively operate in correct trials to refine confidence judgments. For the relationship between confidence and RT in error trials, which were on average slower than correct trials, it must be considered that a variety of aspects can cause errors (e.g., lack of attention, perceptual lapse), and the response profiles of errors are similarly heterogeneous. Therefore, in case of conflict (which is present in error trials), RT might no longer yield reliable information about the task requirements, and the monitoring system might probe PF instead as an alternative response parameter to compensate for the lack of reliable RT when computing confidence. In support, recent work indicated that within a similar speed range for responses, the PF in error trials was related to decision confidence (Stahl et al., 2020). Hence, while RT might not differentiate confidence levels in errors, variations in PF may well capture this information and could therefore be integrated into the final confidence judgment.
Given the frequently described decline of metacognitive abilities with older age, which was also shown in our previous analysis of the current data set [by using the Phi correlation coefficient (Nelson, 1984) for the analysis of metacognitive accuracy (Overhoff et al., 2021)], it seems likely that the older adults were lacking relevant input for the computation of confidence, making it harder for them to accurately rate their decisions. Consequently, one possibility is that the stronger association between confidence and RT in older adults might reflect a compensation mechanism. To explain, while our study does not allow for firm conclusions as to why metacognitive accuracy declined in older adults, it appears that the input to the performance monitoring system was diminished (or not adequate anymore) and did not allow for computing confidence with the same level of accuracy as in younger adults. Therefore, it is possible that the stronger reliance on RT (in correct trials) might reflect the attempt to compensate for this by relying more on other sources of input, like the interoceptive feedback about the response speed (Fleming et al., 2010;Palser et al., 2018). However, it remains unclear whether this compensation fails, as, despite more substantial reliance on RT information, confidence judgments were still poorer compared to younger adults. This could be plausible, for example, if the monitoring of response parameters itself might also become poorer with increasing age. Alternatively, it is also possible that this compensation was indeed (somewhat) successful, and without incorporating RT information more strongly, confidence judgments would be even worse. Ultimately, our study cannot resolve this question.
For error trials, we observed that the relationship between confidence and response force diminished with age. One explanation might be related to the finding that healthy aging has been associated with diminished neural specificity for errors (Park et al., 2010;Endrass et al., 2012;Harty et al., 2017;Overhoff et al., 2021), meaning that older adults might have generally been worse at detecting the errors in the first place. A recent fMRI study has extended these findings by showing that the activity related to error awareness was specifically reduced in older adults (Sim et al., 2020). Based on these findings, our results could be interpreted as another instance of an age-related errorspecific processing deficit. This functional processing deficit might also extend to the sensorimotor feedback of the produced force. The read-out of the response force -which might be used to infer confidence in case of errors -might thus not be readily accessible by older adults and potentially contribute to the demonstrated deficits in metacognitive accuracy.

Limitations
While the simultaneous recording of two response parameters for each response constitutes a strength of the present study, treating RT and PF as equivalent may be problematic. We have discussed RT and PF as separate but comparable features of motor activity, even though their apparent relevance differed largely. Force was produced without constraints, while the time to report the decision was limited to 1200 ms and exerted considerable time pressure on the participants. Therefore, it would be interesting to examine changes in the modulation of confidence by RT and PF without limiting the time to respond. Moreover, since the RT in a given trial represents the sum of the time for stimulus-related processes (between stimulus onset and the start of the response movement) and the time for motor-related processes (e.g., movement time -the time between starting and terminating a response movement), future studies should additionally assess movement time and its relation to confidence.
As mentioned above, this study cannot resolve the question of causality of the observed associations. Based on the described literature, it is reasonable to speculate that our findings can be explained within the framework of response dynamics informing confidence judgments and the discussion of this interpretation should be comprehended as a proposal of avenues for future research. Having established the associations between confidence and naturally occurring response parameters in this study, future work could directly manipulate RT and/or PF, for example, by providing visual feedback about the required speed/force, or via instruction (Palser et al., 2018;Turner et al., 2021a). We have carefully outlined that alternative explanations are equally plausible and acknowledge that both lines of interpretation may be valid in part.
Finally, a thorough characterization of age-related changes in behavioral and neural correlates of confidence is needed to explain normal and abnormal impairments related to aging and might proof valuable to predict the onset of cognitive decline or age-related (neurodegenerative) diseases such as dementia (Wilson et al., 2015).

Conclusion
Corroborating recent evidence, we revealed significant associations between decision confidence and the parameters of the responses indicating this decision. Furthermore, we extended these findings by showing that confidence was associated with fine-grained changes in the time taken to report a decision and the force invested in this response. These relationships were moderated by the accuracy of the response, and, most importantly, changed markedly across the adult life span. This notion should encourage the recording of response force in behavioral experiments whenever possible, as it might uncover specific effects that cannot be revealed by measuring other response parameters, like response times. While a causal explanation of these findings was beyond the scope of this study, one possible interpretation is that the observed age-related changes in the pattern of associations reflect a mechanism in the computation of confidence and may even constitute one aspect of the frequently observed decline in metacognitive ability with older age.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
The studies involving human participants were reviewed and approved by German Psychological Society (DGPs). The patients/participants provided their written informed consent to participate in this study.