Consistent long-term practice leads to consistent improvement: Benefits of self-managed therapy for language and cognitive deficits using a digital therapeutic

Background Although speech-language therapy (SLT) is proven to be beneficial to recovery of post-stroke aphasia, delivering a sufficiently high dosage remains a problem in real-world clinical practice. Self-managed SLT was introduced to solve the problem. Previous research showed that, over a 10-week period, increased dosage frequency could lead to better performance; however, it is uncertain whether dosage still affects performance over a longer period of practice and whether gains can be seen following practice over several months. Objective This study aims to evaluate data from a health app (Constant Therapy) to investigate the relationship between dosage amount and improvements following a 30-week treatment period. Two cohorts of users were analyzed: one comprised patients with a consistent average weekly dosage amount, and the other comprised users whose practice had higher variability. Methods We conducted two analyses with two cohorts of post-stroke patients who used Constant Therapy. The first cohort contains 537 “consistent” users, while the second cohort contains 2,159 users. The 30-week practice period was split into three consecutive 10-week practice windows to calculate average dosage amount. In each 10-week practice period, patients were grouped by their average dosage into low (0–15 min/week), medium (15–40 min/week) and moderate dosage (greater than 40 min/week) groups. Linear mixed-effects models were employed to evaluate whether dosage amount was a significant factor affecting performance. Pairwise comparisons were also applied to evaluate the slope differences between groups. Results For the consistent cohort, the medium (β = .002, t(17,700) = 7.64, P < .001) and moderate (β = .003, t(9,297) = 7.94, P < .001) dosage groups showed significant improvement compared to the low dosage group. The moderate group also showed greater improvement compared to the medium group.
For the variable cohort in analysis 2, the same trend was shown in the first two 10-week windows; however, in weeks 21–30, the difference between the low and medium groups was not significant (β = .001, t = 1.76, P = .078). Conclusions This study showed that a higher dosage amount is related to greater therapy outcomes in over 6 months of digital self-managed therapy. It also showed that regardless of the exact pattern of practice, self-managed SLT leads to significant and sustained performance gains.


Introduction
Stroke is the most common disease that causes serious neurological disorders (1). Every year, over 795,000 people in the United States have a stroke, and aphasia or other communication disorders develop in approximately one-third of cases (2,3). Compared to other patients, patients with aphasia face higher mortality and a higher degree of functional limitation, communication limitation, and social isolation (4,5), making the need for effective rehabilitative approaches especially acute.
Previous research has shown that speech-language therapy (SLT) benefits functional language, language comprehension (listening and reading), and language production (speaking and writing) (6-14). Results also indicated that therapy at high intensity, high dosage, or over a longer period might be more beneficial compared to lower-intensity therapy (6). Moreover, high-intensity SLT over a short period appeared to help participants' language use in daily life and reduced the severity of their aphasia. However, high-intensity treatments might be less acceptable to patients than less intensive therapy schedules, as indicated by a significantly greater drop-out rate for higher-intensity regimens (6). Besides acceptability, there was also the problem of delivering sufficiently high therapy doses to patients in the real world, where practical realities (e.g., reimbursement caps, difficulties with mobility and travel, geographic isolation) placed severe limits on the amount of therapy actually received. National statistics available from the American Speech-Language-Hearing Association (ASHA) demonstrated a substantial reduction in the frequency and amount of SLT by the time patients had been discharged from acute or inpatient settings to community-based outpatient settings (15-17). A recent study of dosage amounts in a U.S.-based outpatient setting reported a median total therapy dosage of just 7.5 h for individuals with post-stroke aphasia (18). Similarly, another study of access to outpatient post-stroke rehabilitation services found that the average total dosage of outpatient SLT was 8 h in the year following an individual's stroke (19). These averages were far from the number of hours of therapy recommended for high-intensity SLT.
In fact, meta-analytic reviews have characterized high-intensity SLT protocols as providing total therapy dosages between 27 and 208 h, with positive effect studies tending to provide at least 50 total hours of therapy (6,20).
Enabling patients to engage in in-home practice through computerized or app-based therapeutic programs could help patients to get sufficient amounts of therapy and meet the dosage requirements of high-intensity SLT (13). Digital SLT interventions have been used as part of a treatment protocol in the form of smartphone, tablet, or computer-based programs. Some of these programs are entirely self-managed, meaning that patients can determine their own therapy schedule (14, 21-23). By giving patients the freedom to determine their practice schedule, researchers can access a wide range of practice frequencies, amounts and overall practice patterns from patient to patient. This variability provides a unique opportunity to probe practice-response relationships in SLT via dose articulation studies, which are a necessary first step toward the ultimate goal of establishing optimal dosage recommendations for SLT interventions (23, 24).
Recent efforts by Cordella et al. analyzed retrospectively collected data to evaluate the optimal dosage of interventions. In this study, the authors directly compared different dosage amounts of the same intervention in the context of self-managed digital therapies (23). This study focused on the relationship between the varied dosage frequency and the performance outcome across 13 different skill domains following a 10-week period of self-managed digital SLT. The results showed that higher dosage frequency groups (e.g., four or five times per week) achieved greater improvement vs. lower ones (e.g., once or twice per week) across all domains and also within a majority of individual subdomains. However, the definition of dosage in the Cordella et al. study is primarily the median number of days in a week patients practice, which is only one parameter to evaluate overall dosage (25). Other ways to calculate dosage have included session duration, total intervention duration, and total number of sessions administered (24-32). Moreover, it is not clear that 10 weeks is a sufficient duration of language therapy, especially in chronic survivors. Consequently, it is useful to evaluate improvements over a longer time period than 10 weeks, by which it would be possible to discover potentially more nuanced relationships between dosage and performance.
The goal of this study was to examine real-world therapy data to investigate the relationship between dosage amount and both midpoint and cumulative improvements following a 30-week treatment period using the Constant Therapy app. There were two main objectives of the current study. First, we investigated whether greater average weekly dosage (defined as number of minutes per week) led to greater performance gains over a 30-week period in a cohort of consistent users who practiced approximately the same average amount week to week. Second, in a larger cohort of more variable users, we investigated the effect of weekly therapy dosage on performance outcomes across three consecutive 10-week intervals for a total of 30 weeks (i.e., 6 months). The two cohorts were denoted the consistent cohort and the variable cohort. We hypothesized that in both analyses, greater practice amount would lead to better performance outcomes. Prior work has shown that during the first 10-week period of therapy, higher dosage frequency groups improved more compared to lower ones across all domains (23). Therefore, we hypothesized that this trend would persist in longer-term therapy that was practiced beyond 10 weeks.

Participants
Data used in this study are from patients who used the Constant Therapy app between March 2016 and July 2020. A total of 30,129 unique users who reported having had a stroke with resultant speech, language, and cognitive deficits, and who consented to the use of their exercise and performance outcome data for research purposes, formed the initial pool for the analyses. To evaluate performance under longer-term therapy, a smaller number of users was selected using the criteria described in detail below. Overall, all users had to be engaged with the app for more than 10 weeks for their data to be included in the analyses. As described above, in the first cohort, 537 users practiced 30 weeks of consistent therapy (i.e., consistent cohort). In the second, variable cohort, the number of users differed among time periods: 2,159 patients were considered in the first 10 weeks, 1,314 in the second 10 weeks, and 812 in the last 10 weeks. The flowchart in Figure 1 describes how users were selected from the whole population in the database. Note that all selected sessions are self-managed sessions, meaning that no other individual, including clinicians or the Constant Therapy support team, intervened. Demographic details regarding participants are provided in Table 1 after the filtering criteria are described.

Constant therapy program
Constant Therapy (CT) is an app-based, evidence-based digital therapeutic designed to improve multiple domains of language simultaneously using a self-managed approach (www.constanttherapy.com) (33). Figure 2 depicts the CT therapy program using a tripartite schema (i.e., therapy target(s), ingredients, and mechanisms of action) following the Rehabilitation Treatment Specification System (RTSS) (34). There are several unique ingredients of the program, including (1) task variety with 266 different task types spanning speech, language and cognitive domains and functional daily activities that encompass them (e.g., listening to a voicemail, reading a map); (2) personalized goal setting enabling patients and their clinicians to identify high-priority, functionally relevant therapy goals across multiple domains; (3) adaptive difficulty that enables self-paced progression from easier to harder tasks within each targeted domain using an algorithm based on performance accuracy and consistency, allowing for therapy scaffolding in a way that mirrors in-person therapy techniques employed by skilled clinicians; (4) consistent feedback that is provided to the patient after every item, therapy goal and session; (5) ease of access that allows patients to log in and practice therapy at their convenience and progress at their own pace; and (6) the recommended therapy regimen that can be self-managed, reducing the need for regular face-to-face interaction with a clinician. Preliminary studies of CT have indicated that it is effective in inducing improvements in language outcomes in chronic post-stroke aphasia (22, 35, 36).
FIGURE 1 Flow chart of the data filtering procedure that results in the two cohorts for which analysis 1 and 2 are conducted, respectively.
FIGURE 2 Ingredients, mechanisms, and targets of the Constant Therapy program, conceptualized within the Rehabilitation Treatment Specification System (RTSS) framework.
For this study, we aggregated data across 13 different skill domains: (1) analytical, (2) arithmetic, (3) attention, (4) auditory comprehension, (5) auditory memory, (6) naming, (7) phonological processing, (8) production, (9) quantitative, (10) reading, (11) visual memory, (12) visuospatial skills and (13) writing. When using the Constant Therapy program, users select skill domains they wish to improve and are assigned tasks based on that selection by the algorithm. Task difficulty is adjusted per individual user using an adaptive algorithm, with more difficult tasks assigned once patients have demonstrated mastery of previously assigned tasks with high accuracy. The order in which more difficult tasks are assigned follows a universal task progression order per domain. The progression order is thus a serial ranking of tasks from least to most difficult. Determination of each domain's progression order was based on research evidence in consultation with speech-language pathologists (37). Patient progress is subdomain specific, so improvement in one domain does not affect the progression order of other domains the patient is practicing simultaneously. In this way, during a session, patients practice tasks in order of increasing progression order. Additionally, if a patient fails to improve at one progression order, a lower-level task will be assigned to the patient in addition to the original task. For this study, the Constant Therapy app recorded data for each session, including accuracy per trial, latency per trial, the progression order, timestamp, total exercises, and session duration. Because users practice different task types at different levels of difficulty, it is not enough to evaluate the performance outcome using an accuracy metric alone.
Instead, we derived a summative metric of performance accuracy that allows for comparison across different skill domains and task difficulty levels, called domain score. In a specific session, the highest progression order of the task passed or worked on and the lowest progression order of the task failed are recorded. Here passing a task indicates accuracy of the task is equal to or greater than 90%, working on means more than 40% and less than 90%, while failure means accuracy is lower than 40%. The domain score of the session is calculated by averaging the two progression numbers, which is an estimate of the session's difficulty level. After that, the domain score is normalized by dividing it by the total number of progression orders in the specific domain. Normalization is required because the numbers of progression orders vary from domain to domain, and the original number alone cannot be used to compare directly across different domains. More details of domain score and its calculation have been previously described (23). By averaging the domain score across sessions in a week (only if there are multiple sessions in a single week), it is possible to evaluate the improvement or deterioration of patients' performance over time in a single domain.
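The domain score calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the app's actual implementation: the function names are hypothetical, an accuracy of exactly 40% is treated as "working on" (the text does not say which side it falls on), and a session in which only one of the two progression numbers exists uses that number alone.

```python
from typing import List, Optional, Sequence, Tuple

PASS_THRESHOLD = 0.90  # accuracy >= 90%: task passed (from the text)
FAIL_THRESHOLD = 0.40  # accuracy < 40%: task failed (from the text)


def classify_task(accuracy: float) -> str:
    """Label one task attempt as 'passed', 'working', or 'failed'."""
    if accuracy >= PASS_THRESHOLD:
        return "passed"
    # Assumption: exactly 40% counts as "working on"; the text leaves this open.
    if accuracy >= FAIL_THRESHOLD:
        return "working"
    return "failed"


def session_domain_score(
    tasks: Sequence[Tuple[int, float]],
    n_progression_orders: int,
) -> Optional[float]:
    """Normalized domain score for one session in one domain.

    tasks: (progression_order, accuracy) pairs, with accuracy in [0, 1].
    n_progression_orders: total number of progression orders in the domain.
    """
    not_failed = [order for order, acc in tasks if classify_task(acc) != "failed"]
    failed = [order for order, acc in tasks if classify_task(acc) == "failed"]

    anchors: List[int] = []
    if not_failed:
        anchors.append(max(not_failed))  # highest order passed or worked on
    if failed:
        anchors.append(min(failed))      # lowest order failed
    if not anchors:
        return None                      # no tasks recorded in this session

    # Assumption: when only one anchor exists, it is used on its own.
    raw_score = sum(anchors) / len(anchors)
    # Normalize so that scores are comparable across domains.
    return raw_score / n_progression_orders


def weekly_domain_score(session_scores: Sequence[float]) -> float:
    """Average the session-level domain scores recorded in one week."""
    return sum(session_scores) / len(session_scores)
```

For example, a session with tasks at progression orders 10 (95% accuracy), 12 (60%), and 14 (20%) in a hypothetical domain with 40 progression orders averages orders 12 and 14 to give 13, which normalizes to 13/40 = 0.325.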

Determination of the different dosage groups
Prior to discussing the data analyses, it is important to describe the determination of the different dosage groups. For a specific patient, the term exercise week indicates a week in which the patient has exercise records; unless explicitly noted, week is defined as exercise week in this study. In an n-week time period, the average dosage amount is calculated by summing the dosage amount in the n exercise weeks and dividing it by the total number of calendar weeks the patient spent to complete n weeks of practice, which may include some additional weeks that do not have exercise records. Patients were then binned into the following three groups based on their average dosage amount over a period spanning 10 exercise weeks: 0-15 min per week (low dosage group), 15-40 min per week (medium dosage group), and more than 40 min per week (moderate dosage group). It should be noted here that users practicing more than 40 min per week on average demonstrated a large dosage range (up to 1,736 min per week in a 10-week period).
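As a worked illustration of the averaging and binning rule above, the following Python sketch computes an average weekly dosage and assigns a dosage group. The function names are hypothetical, and the handling of the bin edges is an assumption: the text states only that the moderate group practiced more than 40 min per week, so an average of exactly 15 min is placed in the low group here by analogy.

```python
from typing import Sequence


def average_weekly_dosage(
    minutes_per_exercise_week: Sequence[float],
    calendar_weeks_spanned: int,
) -> float:
    """Total minutes over the exercise weeks divided by calendar weeks spanned.

    calendar_weeks_spanned counts every calendar week needed to complete the
    exercise weeks, including weeks without any exercise records.
    """
    if calendar_weeks_spanned < len(minutes_per_exercise_week):
        raise ValueError("calendar weeks cannot be fewer than exercise weeks")
    return sum(minutes_per_exercise_week) / calendar_weeks_spanned


def dosage_group(avg_minutes_per_week: float) -> str:
    """Bin an average weekly dosage into low, medium, or moderate."""
    # Assumption: upper bin edges are inclusive (15 -> low, 40 -> medium).
    if avg_minutes_per_week <= 15:
        return "low"
    if avg_minutes_per_week <= 40:
        return "medium"
    return "moderate"


# Ten exercise weeks of 30 min each, completed over 12 calendar weeks.
avg = average_weekly_dosage([30.0] * 10, calendar_weeks_spanned=12)
group = dosage_group(avg)  # 300 / 12 = 25 min/week, so "medium"
```

Dividing by calendar weeks rather than exercise weeks means that gaps in practice lower the average, so two patients with identical exercise-week totals can fall into different groups.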
We considered 30 (exercise) weeks of time in total to evaluate the relationship between dosage amount and performance outcome. The 30-week period was split into three 10-week periods, and dosage amounts were averaged separately in the three periods. Patients were considered consistent (Analysis 1) only if (1) they had at least 30 exercise weeks on record and (2) for each of the 10-week time periods, they stayed within the same dosage group. Since this dataset is relatively small and not reflective of the more variable practice patterns that characterize the majority of app users, we also wanted to include an analysis of patients with more variable usage habits (Analysis 2). In the three 10-week periods, patients were included if they had practice records in each of the 10 weeks. Crucially for this analysis, a specific patient could appear in different groups in different time periods (e.g., 0-15 min/week group in the first 10 weeks vs. 15-40 min/week group in the second 10 weeks), so it is not possible to compare the same dosage amount group across multiple 10-week periods. Data in the three time periods were therefore analyzed separately.
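The cohort rule above can be illustrated with a short Python sketch that splits a patient's exercise weeks into 10-week windows, bins each window, and checks for consistency. All names are hypothetical, the bin edges repeat the same assumption about inclusive upper bounds, and the per-window calendar spans are assumed to be supplied by the caller.

```python
from typing import List, Sequence


def bin_dosage(avg_minutes_per_week: float) -> str:
    """Assumed bin edges: <=15 low, <=40 medium, otherwise moderate."""
    if avg_minutes_per_week <= 15:
        return "low"
    if avg_minutes_per_week <= 40:
        return "medium"
    return "moderate"


def window_groups(
    weekly_minutes: Sequence[float],
    calendar_weeks_per_window: Sequence[int],
    window: int = 10,
) -> List[str]:
    """Dosage group for each complete window of `window` exercise weeks."""
    groups: List[str] = []
    for i, span in enumerate(calendar_weeks_per_window):
        chunk = weekly_minutes[i * window:(i + 1) * window]
        if len(chunk) < window:
            break  # window not completed; the patient drops out of later periods
        groups.append(bin_dosage(sum(chunk) / span))
    return groups


def is_consistent(groups: Sequence[str], n_windows: int = 3) -> bool:
    """Consistent: all windows completed and the same group in every window."""
    return len(groups) == n_windows and len(set(groups)) == 1


# A patient with 30 exercise weeks; the second window took 11 calendar weeks.
minutes = [50.0] * 10 + [45.0] * 10 + [60.0] * 10
groups = window_groups(minutes, calendar_weeks_per_window=[10, 11, 10])
consistent = is_consistent(groups)
```

Under this sketch, a patient who completes only the first two windows, or who changes group between windows, would not enter the consistent cohort but could still contribute each completed window to the corresponding period of the variable cohort analysis.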

Statistical analyses
For all statistical analyses, the first week of the therapy within a 10-week period of exercises was indicated as the baseline week, and a comparison of domain scores between later weeks and the baseline week was made to address the performance outcome over this 10-week period. Because we were primarily interested in the effect of dosage amount on performance outcome, we began by grouping patients according to their average weekly dosage amount, measured by calculating the mean minutes per week of therapy. Patients were then binned into one of the three groups introduced above: 0-15 min per week, 15-40 min per week, and more than 40 min per week.
Linear mixed-effect models (LMMs) were run in order to examine domain score changes over time as a function of dosage amount group. The weekly domain score served as the dependent variable in the model, with fixed effects of time (week number), dosage amount group, cumulative practice amount (i.e., total hours spent completing therapy tasks), time × dosage amount group, and time × cumulative practice amount. Covariates of age, time since stroke (≤6 and >6 months), sex, and baseline domain scores were also included as fixed effects in the model. The model included random effects of patients and domains.
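For readers who prefer a written-out form, the model described above can be sketched as an equation. This is an illustrative specification only: the symbols are introduced here and do not appear in the original analysis, random intercepts (rather than random slopes) are assumed for patients and domains, and the cumulative practice amount is assumed to vary by patient and week.

```latex
\begin{aligned}
y_{pdw} ={}& \beta_0 + \beta_1 t_w
  + \sum_{g \in \{\mathrm{medium},\,\mathrm{moderate}\}} \beta_{2g}\, G_{pg}
  + \beta_3\, C_{pw} \\
&+ \sum_{g \in \{\mathrm{medium},\,\mathrm{moderate}\}} \beta_{4g}\, (t_w \times G_{pg})
  + \beta_5\, (t_w \times C_{pw})
  + \boldsymbol{\gamma}^{\top} \mathbf{x}_{pd}
  + u_p + v_d + \varepsilon_{pdw}, \\
u_p \sim{}& \mathcal{N}(0, \sigma_u^2), \qquad
v_d \sim \mathcal{N}(0, \sigma_v^2), \qquad
\varepsilon_{pdw} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Here y_{pdw} is the weekly domain score of patient p in domain d at week w, t_w is the week number, G_{pg} indicates membership in dosage group g (with the low dosage group as the reference level), C_{pw} is the cumulative practice amount, and x_{pd} collects the covariates (age, time since stroke, sex, and baseline domain score). In this notation, the time × dosage amount group coefficients β_{4g} express how much faster the weekly domain score rises in the medium or moderate group than in the low group.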
All statistical analyses were conducted in R (version 4.1.2; R Foundation for Statistical Computing) using lme4, lmerTest, emmeans, and sjPlot packages.

Analysis 1: consistent users
A total of 537 patients and 1,448 records in different domains were selected as consistent practice patients by the criteria mentioned previously. As we are considering records of different domains from one specific patient separately, this can yield multiple records per patient. The statistical analysis is based on the total number of 1,448 records. Among these records 820 are from male patients while 628 records are from female patients. The average age of patients is 63.13 (SD, 13.68) years old with 48.7% (705) in the acute recovery stage (less than 6 months prior to therapy initiation). The summary statistics for the entire cohort and for each dosage amount group are presented in Table 1. In general, age, sex, and chronicity did not differ among dosage groups (P > .05 in all comparisons).
Analysis 1 asked whether greater average weekly dosage (defined as number of minutes per week) leads to greater performance gains over a 30-week period in a cohort of consistent users who practice approximately the same average amount week to week. The overall change in domain score (collapsed across domains) for the consistent group over 30 weeks is plotted in Figure 3A. The plot shows that, while all patients show improvements in the overall domain score, the 40+ min/week group shows greater changes in the domain score than the 0-15 min/week and 15-40 min/week groups over the 30-week time period. The statistical results for the consistent cohort are shown in Tables 2, 3. Specifically, a higher weekly domain score was associated with an increase in the number of weeks of therapy (β = .004; t = 6.09; P < .001), higher baseline domain score (β = .378; t = 67.74; P < .001), and greater practice amount (15-40 min/week: β = .034, t = 7.49, P < .001; 40+ min/week: β = .091, t = 14.70, P < .001). In addition, age (β = −.001; t = −2.80; P = .005) and time since stroke (β = .019; t = 2.08; P = .038) were also significant predictors of domain score, with younger age and acute chronicity associated with a higher weekly domain score. Sex was not a significant predictor of domain score.
Most importantly given our study objectives, the time × dosage amount group interaction was significant (F = 38.78, P < .001). From this result, we note that although all groups of consistent app users improved over the 30-week therapy period, the rate of improvement was driven by the weekly dosage amount. Compared to the group practicing 0-15 min per week, the 15-40 min per week group (β = .002, t(17,700) = 7.64, P < .001) and the group practicing more than 40 min per week (β = .003, t(9,297) = 7.94, P < .001) showed significantly higher weekly domain scores [...] (Table 4).

Analysis 2: variable users
Analysis 1 took a conservative approach to evaluating the effects of practicing long-term therapy: only users who consistently practiced for 30 weeks were included in the analyses. Consequently, the number of such users was relatively low, with only 537 individual users. A perusal of the database of users indicated that users were more likely to be variable in their practice patterns, sometimes practicing more often and sometimes practicing less often. To evaluate whether this variable practice pattern influenced the extent of domain score change, we conducted Analysis 2.
Crucial to our question of interest, the interaction of time × dosage amount group was significant across each of the three 10-week analysis periods. Compared to the 0-15 min/week group, the 15-40 min/week (week 1-10: β = .003, t = 8.60, P < .001; week 11-20: β = .002, t = 4.48, P < .001) and 40+ min/week groups (week 1-10: β = .006, t = 13.02, P < .001; week 11-20: β = .3, t = 7.27, P < .001) showed greater rates of performance improvement in the first and second 10-week analysis intervals. Post hoc comparisons of slopes (Table 8) also demonstrated a significantly greater rate of improvement for the 40+ min/week group compared to the 15-40 min/week group in both the first and second 10-week intervals. For the final 10-week analysis interval (i.e., weeks 20-29 of therapy), a similar pattern of significance emerged, with the rate of improvement being significantly greater for the 40+ min/week vs. 0-15 min/week group (β = .002, t = 4.85, P < .001), but with no significant difference in rates of improvement between the 15-40 and 0-15 min/week groups (β = .001, t = 1.76, P = .078). Post hoc tests revealed there was also a significantly greater rate of improvement for the moderate vs. medium dosage group.

Discussion
This study aimed to examine whether self-managed therapy could be sustained over a long period of time, and whether greater average amounts of therapy were associated with greater therapy outcomes. To address these broad questions, therapy practice over the course of a 30-week treatment period (i.e., 6 months) was evaluated for different dosage amounts. Specifically, we evaluated whether greater average weekly dosage (defined as number of minutes per week) led to greater performance gains over a 30-week period in a cohort of consistent users who practice approximately the same average amount week to week. A second analysis examined a larger cohort of variable users, also over the course of a 30-week period, to see whether performance outcomes in each 10-week period showed relatively greater gains for higher than for lower practice frequencies.
There were several main conclusions to be drawn from our study results. Firstly, patients were able to practice consistently for 30 weeks of self-managed therapy and this practice was associated with concurrent improvements in domain scores. Not surprisingly, in this context, users who practiced more than 40 min per week showed greater improvements in the average domain score than users who practiced less than 15 min per week. These results suggest that consistent and sustained practice can result in therapy improvements and that these gains are maintained 20-30 weeks from the therapy onset time. Notably, patients who practiced more variably over a 30-week treatment period likewise demonstrated that greater weekly average dosage amounts were associated with greater improvements in overall domain score. In particular, users who practiced more than 40 min per week showed significantly greater performance gains than users who followed a medium (15-40 min) or low (0-15 min) dosage practice regimen. This was the case in each of the three 10-week intervals of interest, demonstrating that dosage amount matters for therapy outcomes not just in the beginning but also throughout the course of treatment. It should also be noted that, as Figure 4 shows, users who practiced more than 40 min per week (Figure 4C) also tended to practice more frequently, with 65.1% practicing more than 5 days per week and 27.8% practicing every day, compared to less frequent, more massed practice patterns in the medium (15-40 min) (Figure 4B) and low (0-15 min) (Figure 4A) dosage groups.
One notable observation is that by the 21-30 week period, the proportion of chronic patients (greater than 6 months post injury) was higher than in the first 1-10 week period, when more of the patients were acute (less than 6 months post injury). This observation was true for both the consistent and variable group analyses. These results suggest that chronic survivors are able to sustain practice over long periods of time (>20 weeks) and demonstrate noticeable improvements on the domain score within the Constant Therapy program.
Results from both consistent and variable user cohorts demonstrated significant gains in domain score across the entire 30-week period of interest in our analyses. In both cohorts, the greatest rates of improvement occurred in the early weeks of therapy but crucially, all users were able to maintain performance gains during later weeks of therapy (e.g., weeks 10-20; 20-30). Moreover, for users following a relatively higher dosage practice regimen, these additional weeks of therapy resulted not only in maintenance of initial gains but in significant additional gains. This result underscores the promise of higher dose therapy to induce gains over a much longer time period than has previously been shown.
In line with prior research, our results show that relatively higher dosage therapy regimens are associated with greater gains in performance as compared to medium or low dosage regimens (6,38). This study is among a relatively small number of studies to directly compare the effects of varied dose of the same behavioral intervention. The small number of these dose articulation studies has been identified as a major barrier to the [...] (1) high-intensity, self-managed SLT leads to significant performance gains over a much more extended therapy time than previously shown (30 vs. 10 weeks) and (2) performance gains are greater for users who practice a greater average amount, regardless of whether they are consistent or more variable in their usage pattern. The current study also contributes to existing literature through its use of a real-world, ecologically valid dataset. Although the efficacy of high dose speech-language therapy has been established in the literature, there is a gap in translating these research findings to clinical practice. Translation of research findings is complicated by several barriers that include, among others, a large discrepancy in the amount of therapy recommended in research compared to the amount of therapy that is realistically attainable in the clinical setting (15, 18). By analyzing data from two cohorts of patient users who showed natural divergence in the pattern and amount of app-based practice logged over the 30-week time period, we were able to investigate effects of different dosage amounts taking into account the actual amount and types of practice of a large number of real-world users. This ensures greater generalizability of results to the clinical and real-world settings.
FIGURE 4 Practice frequency distribution of users in different dosage amount groups.
Our results are encouraging because they not only show that higher intensity (40 min or more per week in our study) is feasible for a sizable group of real-world users but they also show that regardless of the exact pattern of practice (consistent vs. variable; moderate vs. medium vs. low), self-managed SLT leads to significant and sustained performance gains. Our results also demonstrate that higher-intensity therapy may look different in self-managed settings compared to highly controlled laboratory or clinical trial settings. In the latter, weekly dosage prescriptions are very high but total intervention duration is relatively short, whereas in our data, weekly dosage amounts are comparatively more modest but users instead practice for many more weeks (30+ weeks), resulting in cumulative dosage amounts that are comparable to high-intensity regimens as reported in the literature (38). Also important to consider is that CT or other app-based, at-home therapy can be used as an adjuvant to other SLT within the context of patients' longer-term trajectory of recovery; patients may for instance receive direct SLT in early post-acute recovery stages but turn to at-home, self-managed therapy after exhausting options for insurance-covered direct SLT. Finally, we note that the data analyzed in this study are the result of entirely self-managed practice, meaning that users were not given explicit instructions on the amount or frequency with which to practice. It is likely that dosage amounts, and possibly also the resultant therapy gains, could be augmented if users were advised on a specific practice regimen.
Importantly, the current study focused on measuring improvement via an in-app task improvement measure (i.e., domain score). Though outside the scope of the current study, it will be essential in future work to evaluate the generalizability of in-app domain score improvements to standardized measures of global language severity (e.g., WAB-R Aphasia Quotient) and to real-world communication settings, in studies conducted with large numbers of users. Prior clinical studies of the Constant Therapy app have reported clinically significant gains in both global language severity measures and quality of life scores following in-app practice (35,36). Des Roches et al. found significant pre-post improvements on the WAB-R Aphasia Quotient and composite severity score on the Cognitive Linguistic Quick Test among an experimental group of patients using the CT program as an adjuvant to traditional SLT; no such changes were seen among control participants receiving only traditional SLT (36). Most recently, Braley and colleagues conducted a randomized clinical trial comparing language-based outcomes following digital-only CT therapy compared to traditional SLT. Participants receiving digital-only CT therapy improved 6.75 points on the WAB-R AQ and also demonstrated significant improvement in overall quality of life, as measured by the Stroke and Aphasia Quality of Life Scale 39 (SAQOL-39) (35). Taken together, these findings lend encouraging evidence in support of treatment generalization for CT app users. It is also worthwhile to note that unlike rote paper-and-pencil therapy exercises, CT tasks are functional in nature (e.g., reading a museum map to determine where a given exhibit is), which may make it more likely for in-app improvements to generalize to out-of-app settings.

Limitations
We note several limitations of the current study. The first is the lack of standardized performance metrics to characterize baseline severity and, relatedly, the reliance on patient self-report for demographic and etiological details. The Constant Therapy app makes it possible to collect a large amount of real-world data about users and their daily performance patterns, but because practice is entirely self-managed, our dataset did not include standardized assessment metrics that might typically be collected in a clinic setting (e.g., Western Aphasia Battery-Revised aphasia quotient). Likewise, we did not have access to detailed information about concurrent medical and cognitive comorbidities, motivation levels, or personality types, all of which have the potential to influence therapy outcomes. Our analysis models do take into account basic demographic information such as age, sex, and chronicity, and we also include random effects of patients in all analysis models. For baseline severity, we use the baseline domain score as a proxy measure. Nonetheless, future models with more detailed patient factors would likely lead to more robust and generalizable results.
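For orientation, one plausible form of such a model is sketched below. The exact specification is not given in this section, so the structure shown (a time-by-dosage-group interaction, a patient-level random intercept, and all symbol names) is an assumption made for illustration rather than the fitted model.

```latex
% Illustrative sketch only: an assumed mixed-effects specification.
% y_{it}: domain score of patient i at week t
% g_i:    dosage-group indicator(s) (low as reference; medium, moderate)
% x_i:    covariates (age, sex, chronicity, baseline domain score)
% u_i:    random intercept for patient i
\begin{equation*}
  y_{it} = \beta_0 + \beta_1 t + \beta_2^{\top} g_i
         + \beta_3^{\top} (t \cdot g_i)
         + \gamma^{\top} x_i + u_i + \varepsilon_{it},
  \qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\;
  \varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)
\end{equation*}
```

Under this assumed form, the interaction coefficients in \beta_3 would capture how the rate of improvement in each dosage group differs from that of the reference group, while \gamma absorbs the effect of baseline severity and demographics.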
A second limitation of the current study relates to the way in which users were assigned to their respective dosage groups and the way in which we chose to bin these groups. Users were binned into one of the three dosage amount groups according to their usage patterns and not by random assignment, leading to the possibility of some degree of self-selection into these groups (e.g., more severe users practicing less). To account for this potential effect of severity on results, we included baseline domain score (our proxy for starting severity) as a covariate in all statistical models. We also acknowledge that the current study employs data-informed but clinically arbitrary cutoffs to determine grouping into low, medium and moderate dosage groups. We are therefore careful to interpret results as providing support for higher vs. lower dose therapy rather than for a specific therapy prescription in minutes (e.g., 40 min/week).
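The binning described above can be illustrated with a minimal Python sketch. It uses the cutoffs stated in this article (0-15, 15-40, and more than 40 min/week) and the 10-week averaging windows; the function names, the example user, and the handling of values that fall exactly on a cutoff are assumptions for illustration, not the study's actual code.

```python
from statistics import mean

# Cutoffs in minutes per week, taken from the group definitions in the text.
LOW_MAX = 15.0
MEDIUM_MAX = 40.0
WINDOW_WEEKS = 10  # the 30-week period is split into three 10-week windows


def dosage_group(avg_minutes_per_week: float) -> str:
    """Map an average weekly dosage to a group label.

    Assumption: values exactly on a cutoff (15 or 40 min/week) fall into
    the lower group; the text does not say how such ties were handled.
    """
    if avg_minutes_per_week < 0:
        raise ValueError("dosage cannot be negative")
    if avg_minutes_per_week <= LOW_MAX:
        return "low"
    if avg_minutes_per_week <= MEDIUM_MAX:
        return "medium"
    return "moderate"


def window_groups(weekly_minutes: list[float]) -> list[str]:
    """Average weekly minutes over consecutive 10-week windows, then label each."""
    if not weekly_minutes or len(weekly_minutes) % WINDOW_WEEKS != 0:
        raise ValueError("expected a whole number of 10-week windows")
    return [
        dosage_group(mean(weekly_minutes[i:i + WINDOW_WEEKS]))
        for i in range(0, len(weekly_minutes), WINDOW_WEEKS)
    ]


# Hypothetical user: 10 min/week, then 30 min/week, then 50 min/week.
example_user = [10.0] * 10 + [30.0] * 10 + [50.0] * 10
print(window_groups(example_user))  # ['low', 'medium', 'moderate']
```

Because a user is labeled separately in each window, the same person can move between groups over the 30 weeks, which is one reason the group labels reflect usage patterns rather than assigned prescriptions.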
A final limitation is that there is insufficient information available on whether users had access to other direct therapy services. It is possible that some users used the app-based regimen in combination with traditional, in-person SLT, while others relied solely on the app. Differences in the amount of outside (i.e., non-app-based) therapy received by users across the dosage groups could potentially affect the results. Users with higher in-app dosage may also be receiving more outside therapy, making it difficult to attribute any improvement in performance solely to increased in-app practice.

Conclusion
This study explored the relationship between the weekly dosage amount that stroke patients practiced in an app-based, self-managed therapeutic program and their performance improvement over a 30-week period. The results showed that across all users, the moderate dosage group (more than 40 min per week) achieved greater performance gains than the medium (15-40 min per week) and low (0-15 min per week) dosage groups. A similar trend was noted between the medium and low dosage groups. Thus, our results show that performance gains are greater for users who practice more on average each week. One possible direction for future research is to develop an evaluation metric that links in-app performance gains with real-world improvement.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Materials; further inquiries can be directed to the corresponding author.

Ethics statement
This project was considered an institutional review board-exempt retrospective analysis by Pearl Institutional Review Board (#17-LNCO-101) under 45 Code of Federal Regulations 46.101(b) category 2.