Comparison of Peak Oxygen Uptake Between Upper-Body Exercise Modes: A Systematic Literature Review and Meta-Analysis

Purpose: To compare peak oxygen uptake (VO2peak) between the asynchronous arm crank ergometry (ACE), and synchronous wheelchair ergometry (WERG), wheelchair treadmill (WTR), and upper-body poling (UBP) mode. Methods: PubMed, Scopus, CINAHL, and SPORTDiscus™ were systematically searched, and identified studies screened based on title, abstract, and thereafter full-text. Studies comparing VO2peak between ≥2 of the modes were included. A meta-analysis was performed by pooling the differences in VO2peak between upper-body exercise modes. The quality of the included studies was assessed and the level of evidence (LoE) established for each mode comparison. Meta-regression analyses investigated the effect of total body mass and participant-related characteristics (% of able-bodied participants, % of participants with tetraplegia and % of participants who are wheelchair athletes) on differences in VO2peak between modes. Results: Of the 19 studies included in this review, 14 studies investigated the difference in absolute and body-mass normalized VO2peak between ACE and WERG, and 5 studies examined the differences between ACE and WTR. No significant difference in absolute or body-mass normalized VO2peak was found between ACE and WERG (overall effect ±95% CI: 0.01 ± 0.06 L·min−1 and 0.06 ± 1.2 ml·kg−1·min−1, both p > 0.75; LoE: strong). No significant difference in absolute or body-mass normalized VO2peak was found between ACE and WTR (overall effect ±95% CI: −0.10 ± 0.18 L·min−1 and −1.8 ± 2.5 ml·kg−1·min−1, both p > 0.14; LoE: moderate). Absolute and/or body-mass normalized VO2peak did not differ between WERG and WTR in one study with 13 participants (LoE: limited) and between ACE and UBP in one study with 18 participants (LoE: moderate). In the meta-regression analyses, there was no significant effect of the investigated factors on differences in VO2peak. 
Conclusions: The differences between the asynchronous ACE and synchronous WERG propulsion, including possible differences in trunk involvement, do not seem to influence VO2peak. Therefore, ACE and WERG can be used interchangeably to test VO2peak. Possible differences in VO2peak in all other mode comparisons remain unclear due to the wide CIs and limited to moderate LoE.



INTRODUCTION
In individuals who primarily use their upper body during exercise, peak oxygen uptake (VO2peak) attained during maximal-effort exercise is commonly used as an indicator of cardiorespiratory fitness. VO2peak depends on genetic predisposition, the level of physical activity, and the type of disability of the individual tested. Particularly low VO2peak values have been found among non-active individuals with a spinal cord injury (SCI) with high lesion levels and a correspondingly reduced muscle mass (Janssen et al., 2002; de Groot et al., 2016). In comparison, Paralympic (Para) athletes who compete in various sitting endurance sports display relatively high VO2peak (Bernardi et al., 2012; Baumgart et al., 2018a). Arm crank ergometry (ACE) is the most commonly used test mode to assess upper-body VO2peak. However, in a sports context, the specificity of the test mode is important for obtaining a VO2peak that is relevant to the respective sport (McCafferty and Horvath, 1977). In wheelchair athletes, wheelchair ergometry (WERG) (i.e., employing a wheelchair on rollers) or the wheelchair treadmill (WTR) mode may provide a more sport-specific alternative to the ACE mode. In Paralympic ice hockey players, sitting Para cross-country skiers and sitting Para biathletes, the upper-body poling (UBP) mode may be more sport-specific than the ACE mode for assessing VO2peak.
The ACE mode is more efficient than the WERG, WTR, and UBP modes, partly due to the continuous rather than discontinuous power production. At the same time, during WERG, WTR, and UBP, the synchronous movement of the arms allows more displacement of the trunk than during ACE, where the asynchronous movement of the arms limits trunk displacement. Consequently, a higher VO2peak might be expected in WERG, WTR, and UBP compared to the ACE mode, speculatively due to recruitment of more active muscle mass during the synchronous movement. Some studies have shown higher values in the WERG or WTR compared to the ACE mode (Wicks et al., 1983; Gass et al., 1995; Bloemen et al., 2015), whereas others show no differences between the WERG, WTR, or UBP and the ACE mode (Gayle et al., 1990; Martel et al., 1991; Arabi et al., 1997; Baumgart et al., 2018b). Furthermore, in a previous study, we conducted a pooled regression analysis based on 22 studies in 169 wheelchair athletes, in which the WERG/WTR mode resulted in 5 mL·kg−1·min−1 higher VO2peak compared to ACE (Baumgart et al., 2018a). The higher VO2peak in WERG/WTR compared to ACE in that analysis might be explained by the inclusion of only wheelchair athletes, for whom the WERG/WTR mode is more sport-specific than the ACE mode. However, the results of the review of Baumgart et al. (2018a) are based on regression analyses and need to be interpreted with caution, since VO2peak was not compared directly between modes in a repeated-measures design within the same studies. In this context, where there is large heterogeneity in the participants tested, a meta-analysis based solely on studies comparing VO2peak between different modes in a repeated-measures design is a more valid approach because between-participant variance has less effect on the overall outcome. Furthermore, meta-regression analyses can provide information on whether wheelchair athletes achieve a higher VO2peak in modes using wheelchair propulsion than in the ACE mode.
In addition to being specifically trained for a certain test mode, other participant-related factors might explain why VO2peak is higher in the WERG and WTR than in the ACE mode in some studies, while there are no differences between modes in other studies. Speculatively, persons with complete tetraplegia, who have reduced sitting balance, might be able to exhaust themselves more in modes with less displacement of the upper body (i.e., the ACE), thereby reducing the differences in VO2peak between the ACE and the other modes. In addition, differences between the ACE and the other modes might be influenced by the % of able-bodied participants often included in these types of studies, since able-bodied participants are generally less familiar with using a wheelchair than wheelchair users. Furthermore, the influence of total body mass on the difference in VO2peak between upper-body exercise modes has not yet been investigated. Overall, investigating these factors would contribute to the understanding of the variability in VO2peak differences between modes across studies.
Information on whether VO2peak differs between upper-body test modes provides important knowledge in both clinical and sport settings and indicates to what extent test modes can be used interchangeably. Therefore, the aim of this systematic literature review and meta-analysis was to compare VO2peak between the ACE, WERG, WTR, and UBP modes. Furthermore, the influence of participant characteristics (i.e., body mass, % of able-bodied participants, % of participants with tetraplegia, and % of participants who are wheelchair athletes) was investigated in meta-regression analyses. It is hypothesized that VO2peak is higher in WERG, WTR, and UBP compared to ACE. In addition, it is hypothesized that VO2peak is higher in the ACE compared to other modes in studies with a higher % of able-bodied participants and a higher % of participants with tetraplegia. Furthermore, VO2peak is expected to be higher in WERG and WTR compared to ACE in studies with a higher % of participants who are wheelchair athletes.

METHODS
This review was conducted in accordance with the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P) guidelines [11] (see Supplementary Material S1 for the PRISMA checklist). Additionally, the study protocol was registered a priori in the International Prospective Register of Systematic Literature Reviews (PROSPERO) under the following registration number: CRD42019025063.

Data Sources and Search Strategy
PubMed, CINAHL (through EBSCOhost), SPORTDiscus™ (through EBSCOhost) and Scopus® were systematically searched in November 2018 using relevant keywords and a Boolean search string (see Supplementary Material S2 for the Boolean search string).
References of the included studies were searched manually for further identification of studies relevant to the research question.

Inclusion Criteria
Studies were included if they compared absolute and/or body-mass normalized VO2peak between at least two of the following upper-body exercise modes: ACE, WERG, WTR, and UBP. Only studies in which the same participants were tested in two or more modes in a repeated-measures design were included, i.e., studies were excluded if the participants were split into groups and each group was tested in one mode only. No restrictions were placed on the participants tested, i.e., included were articles comparing VO2peak between upper-body exercise modes in able-bodied participants as well as in participants ranging from untrained individuals who ambulate in a wheelchair for various reasons to Paralympic athletes in a variety of sports. Studies were included if they tested absolute and/or body-mass normalized VO2peak in a standardized laboratory setting and the same ergospirometer was used within each study. Field studies were excluded due to a lack of standardization. Furthermore, studies that included an intervention between the VO2peak tests in the two or more modes were excluded. Only full-text, original research published in peer-reviewed journals in English, Dutch, German or French was considered. Abstracts and conference proceedings were not eligible because they do not report methods and results in detail. No restrictions were placed on the publication date of the studies.

Study Selection
After eliminating duplicate articles, the titles were screened by JB. Studies that did not directly mention VO2peak or a synonym in the study title, but were likely to have included it as a secondary measure, were also included. In a second step, the abstracts of the studies deemed relevant by title were read by JB. Articles considered relevant by abstract were then read in full-text by JB.

Assessment of Methodological Quality
The quality of the included studies was assessed by JB and BB with a modified version of the Downs and Black checklist (Downs and Black, 1998) (see Supplementary Material S3 for the modified Downs and Black checklist). Modified versions of this checklist have been employed in several reviews in the field of sports science, which also mainly used cross-sectional studies for their data retrieval (Hebert-Losier et al., 2014; Baumgart et al., 2018a). The original checklist comprises 27 items, which are distributed over five sub-scales: reporting (items 1-10), external validity (items 11-13), bias (items 14-20), confounding (items 21-26), and power (item 27) [13]. For the purpose of this review, items 8, 9, 11-16, 19, and 22-26 were excluded since our review did not focus on interventions. The term "patient" was replaced by "participant" and "treatment" was interpreted in the context of testing (Hebert-Losier et al., 2014; Baumgart et al., 2018a). All items, except items 5 and 27, were rated as "Yes" (1 point), "No" (0 points), or "Unknown" (0 points). For item 5, sex, age, body mass, type of disability, physical activity level, and test protocol (i.e., increment duration) were considered to be core confounders. Item 5 was scored with 2 points if all core confounders were mentioned and with 1 point if 5 of the 6 core confounders were mentioned. Item 27 was scored with 3 points for studies with above 21 participants, 2 points with 18-21 participants, 1 point with 15-17 participants and 0 points with 15 or fewer participants. Quality cut-off points were decided on retrospectively, and studies were ranked to be of low (0-6 points), moderate (7-11 points), or good methodological quality (12-18 points) based on the score of the Downs and Black checklist. The level of evidence (LoE) for each mode comparison was ranked from unknown to strong by combining the quality scores of each of the studies included in the respective mode comparison (see Table 1).
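The scoring rules described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function names are hypothetical, a study with exactly 15 participants is assumed to fall in the 15-17 band (the text assigns 15 to both the 1-point and the 0-point band), and item 5 is assumed to score 0 when fewer than 5 core confounders are reported.

```python
def score_item_5(n_core_confounders_reported):
    """Item 5: 2 points if all 6 core confounders are reported, 1 point if 5 are."""
    if n_core_confounders_reported == 6:
        return 2
    if n_core_confounders_reported == 5:
        return 1
    return 0  # assumption: fewer than 5 reported scores 0


def score_item_27(n_participants):
    """Item 27 (power): points awarded by sample size."""
    if n_participants > 21:
        return 3
    if n_participants >= 18:
        return 2
    if n_participants >= 15:  # assumption: 15 is placed in the 15-17 band
        return 1
    return 0


def quality_rank(total_points):
    """Map a total checklist score to the quality category used in the review."""
    if not 0 <= total_points <= 18:
        raise ValueError("total score must lie between 0 and 18")
    if total_points <= 6:
        return "low"
    if total_points <= 11:
        return "moderate"
    return "good"


# Hypothetical study: 9 points on the yes/no items, 5 of 6 confounders, 18 participants
total = 9 + score_item_5(5) + score_item_27(18)
print(total, quality_rank(total))
```

For the hypothetical study in the usage lines, the yes/no items, item 5 and item 27 are summed before the cut-offs are applied.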

Data Extraction
Data on VO2peak in the respective mode and the characteristics of the participants (number of participants, sex, number of able-bodied participants, number of participants who are wheelchair athletes, age, body mass, type of disability and training status), as well as the starting workload and the duration and workload increase of the increments used during the test protocols, were extracted from the included studies by JB, with BB cross-checking all the data.

Statistics
All data are presented as mean ± standard deviation (SD) or 95% confidence intervals (CI) unless specified otherwise. A meta-analysis was performed in Stata 14.2 (StataCorp LLC, Texas, USA) using the random-effects approach described by DerSimonian and Laird (1986) to investigate whether there were differences in VO2peak between upper-body exercise modes (see Supplementary Material S4 for the input for the meta-analyses).
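As an illustration of the DerSimonian and Laird random-effects approach named above, the following minimal Python sketch pools per-study mean differences. It is not the Stata routine used in the review: the function name, the input format (one mean difference and one sampling variance per study) and the example numbers are assumptions made purely for illustration.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects estimator.

    effects   -- per-study differences in VO2peak between two modes
    variances -- sampling variance (SE squared) of each difference
    Returns (pooled effect, standard error, tau squared, Cochran's Q).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]               # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, q


# Hypothetical input: two studies with differences of 0.0 and 1.0 (variance 0.1 each)
pooled, se, tau2, q = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
half_width = 1.96 * se  # half-width of the 95% CI under a normal approximation
print(pooled, se, tau2, q, half_width)
```

The returned standard error, multiplied by 1.96, gives a half-width of the kind reported as "overall effect ± 95% CI" in this review, assuming a normal approximation.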
Participants of the included studies were heterogeneous with respect to body mass, being specifically trained for a certain test mode, whether or not they had a disability, and, if they had a disability, the type of disability. Therefore, random-effects meta-regression analyses with REML estimates of heterogeneity and a Knapp and Hartung modification (Knapp and Hartung, 2003) were performed to look into the separate effect of each of the following factors on VO2peak: body mass (kg), % of able-bodied participants, % of participants with tetraplegia, and % of participants who are wheelchair athletes (see Supplementary Material S4 for the input for the meta-regression analyses). The % of able-bodied participants ranged from 100 in studies that solely tested able-bodied participants to 0 for studies that solely tested participants with a disability. The % of participants with tetraplegia ranged from 100 in studies that solely tested participants with tetraplegia to 0 for studies that solely tested participants without tetraplegia. The % of participants who are wheelchair athletes ranged from 100 in studies that solely tested wheelchair athletes to 0 for studies that solely tested participants who are not wheelchair athletes. An alpha level of 0.05 was used to indicate statistical significance.
FIGURE 1 | Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart depicting the study identification, screening, eligibility and inclusion process. VO2peak, Peak oxygen uptake; ACE, arm crank ergometry; WERG, wheelchair ergometry; WTR, wheelchair treadmill; UBP, upper-body poling.
FIGURE 2 | Quality scores of the 19 included studies. Green dots are for items scored Yes, red dots for items scored No and yellow dots for "Partial Yes" (i.e., scoring 1 of the 2 points possible for item 5).
Frontiers in Physiology | www.frontiersin.org
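To make the single-moderator meta-regression with a Knapp and Hartung modification more concrete, a simplified Python sketch follows. It departs from the analysis described in this section in one important way: the between-study variance tau2 is passed in as a known value rather than estimated by REML. The function name and inputs are hypothetical, and the untruncated form of the Knapp and Hartung scaling factor is used, whereas some software truncates that factor at 1.

```python
import math


def kh_meta_regression(y, v, x, tau2):
    """Weighted least-squares meta-regression of y on one moderator x.

    y    -- per-study (or per-subgroup) differences in VO2peak between modes
    v    -- sampling variances of those differences
    x    -- moderator values (e.g., % of wheelchair athletes)
    tau2 -- between-study variance, supplied by the caller (assumption:
            not estimated here; the review used REML)
    Returns (slope, intercept, Knapp-Hartung SE of the slope, degrees of freedom).
    """
    k = len(y)
    w = [1.0 / (vi + tau2) for vi in v]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    slope = (sw * swxy - swx * swy) / det
    intercept = (swy - slope * swx) / sw
    # Knapp-Hartung: scale the usual variance by the weighted mean squared residual
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    df = k - 2
    scale = rss / df
    se_slope = math.sqrt(scale * sw / det)
    return slope, intercept, se_slope, df


# Hypothetical data: four subgroups, moderator coded 0 to 3
slope, intercept, se_slope, df = kh_meta_regression(
    [0.0, 1.0, 0.0, 1.0], [0.1, 0.1, 0.1, 0.1], [0.0, 1.0, 2.0, 3.0], tau2=0.1
)
print(slope, intercept, se_slope, df)
```

The slope divided by its standard error would then be compared against a t distribution with the returned degrees of freedom, which is what distinguishes this modification from a standard normal-based test.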

Study Selection
Of the 2119 studies initially screened on their title, 19 full-text studies were included in this systematic literature review (Figure 1). Of these, 12 and 9 studies compared absolute and body-mass normalized VO2peak values, respectively, in 239 and 200 participants, between the ACE and the WERG mode. Four and five studies compared absolute and body-mass normalized VO2peak, respectively, in 43 and 51 participants, between the ACE and the WTR mode. VO2peak was compared between the WERG and WTR mode in one study including 13 participants (Arabi et al., 1997) and between the ACE and UBP mode in one study including 18 participants (Baumgart et al., 2018b).

Methodological Quality
There was an 84% agreement between JB and BB in the items rated initially, with full agreement reached when re-checking details in the appraised studies. Two studies were ranked as having good, 13 as having moderate, and 4 as having low methodological quality (Figure 2). The quality of the studies included in each of the mode comparisons determines the LoE of the respective comparison.

Meta-Regression Analyses: Influence of Participant and Test Characteristics on VO 2peak
We were only able to investigate the influence of body mass and participant-related characteristics on differences in VO2peak between modes in the comparison of the ACE to the WERG mode, since only this comparison included a sufficient number of studies. The meta-regression analyses were based on 12 studies that provided data of 18 subgroups for the prediction of absolute VO2peak, and on 9 studies that provided data of 14 subgroups for the prediction of body-mass normalized VO2peak (Tables 2, 3). Note that there are more subgroups than studies, since Wicks et al. (1983) presented their data in seven subgroups. None of the investigated factors significantly predicted absolute or body-mass normalized VO2peak differences between the ACE and WERG mode (Table 5).

DISCUSSION
The main aim of this systematic literature review and meta-analysis was to compare VO2peak between the ACE, WERG, WTR, and UBP modes in a variety of participants. In brief, no difference in absolute or body-mass normalized VO2peak was found either between the ACE and WERG (LoE: strong) or between the ACE and WTR mode (LoE: moderate), while the single studies comparing VO2peak between the WERG and WTR mode (LoE: limited) and between the ACE and the UBP mode (LoE: moderate) found no significant differences. In the meta-regression analyses, none of the investigated factors significantly predicted differences in absolute or body-mass normalized VO2peak.
In the current meta-analysis, we found no difference in VO2peak between the ACE and WERG mode, with a strong LoE. This is contrary to our hypothesis and indicates that more displacement of the trunk in the WERG mode does not necessarily lead to a higher active muscle mass and a consequently higher VO2peak. The finding is also in contrast to a previous systematic literature review, showing that the WERG/WTR mode resulted in 5 mL·kg−1·min−1 higher VO2peak values compared to ACE in wheelchair athletes (Baumgart et al., 2018a). However, the higher VO2peak in WERG/WTR compared to ACE in our previous review (Baumgart et al., 2018a) might be due to the inclusion of only wheelchair athletes, for whom the WERG/WTR mode is more sport-specific than the ACE mode.
This explanation is in line with a review on able-bodied runners and cyclists, in which sport specificity of the test mode was suggested to be important for achieving VO2max (Millet et al., 2009), as well as with a study on kayakers that showed higher VO2peak values in the kayak than in the arm crank ergometer mode (Forbes and Chilibeck, 2007). The results of our previous review (Baumgart et al., 2018a) are based on regression analyses and need to be interpreted with caution, though, since VO2peak in the different modes was not directly compared within the included studies, as was done in the current analyses. The contrasting finding in the current review, with no difference in VO2peak between ACE and WERG, suggests a smaller effect of sport specificity of the test mode on VO2peak, since non-athlete participants with a disability and able-bodied participants are also included. However, the meta-regression analyses revealed that even in studies with a higher % of participants who are wheelchair athletes, there was no difference in VO2peak between the WERG and ACE mode. This indicates that during both the supposedly more sport-specific WERG and the less sport-specific ACE, a sufficient amount of active muscle mass is recruited to elicit VO2peak. The results of the meta-regression analyses need to be interpreted with caution, though, since only a few studies were included in each prediction model. Overall, these data indicate that the ACE and WERG modes might be used interchangeably to test VO2peak in persons who are not specifically trained for either of these two modes, and possibly also in athletes who are specifically trained for the WERG mode.
Even though the included studies show a small but consistently lower mean VO2peak in ACE compared to WTR, there was no significant overall effect of the test mode on VO2peak. This is contrary to our hypothesis and might suggest that trunk oscillations and shifts in center of gravity, which contribute more to propulsion in the WTR than in the ACE mode (Vanlandewijck et al., 2001), do not necessarily lead to a larger active muscle mass with a consequently higher VO2peak. In line with the other mode comparisons in the current review, Arabi et al. (1997) found no difference in VO2peak between WERG and WTR, and Baumgart et al. (2018b) found no difference in VO2peak between ACE and UBP. However, our ability to conclude with certainty is limited due to the wide CIs of the overall effect, the limited to moderate LoE and the limited number of studies included for these comparisons.
The meta-regression analyses showed that none of the participant-related characteristics (i.e., body mass, % of able-bodied participants, % of participants with tetraplegia and % of participants who are wheelchair athletes) influenced the difference in VO2peak between the ACE and the WERG modes. This is in contrast to our hypotheses of VO2peak being higher in the ACE compared to other modes in studies with participants with higher body mass, studies with a higher % of participants without disability and studies with a higher % of participants with tetraplegia; and of VO2peak being higher in the WERG/WTR compared to the ACE mode in studies with a higher % of participants who are wheelchair athletes. However, as stated above, the results of the meta-regression analyses need to be interpreted with caution since only a few studies were included in each prediction model. Furthermore, it remains to be investigated to what extent other participant-related and test protocol factors explain the higher VO2peak in the WERG compared to the ACE mode in some studies, whereas there are no differences in other studies.

CONCLUSION
No difference in VO2peak between the ACE and WERG mode was found in the present meta-analyses, indicating that ACE and WERG may be used interchangeably to test VO2peak in persons who are not specifically trained for a certain mode, and possibly also in athletes who are specifically trained for the WERG mode. In addition, it remains unclear whether VO2peak differs between ACE and WTR due to the wide CIs, moderate LoE and the limited number of studies comparing VO2peak between these two modes. Furthermore, we are not able to draw conclusions on the comparison of VO2peak between WERG and WTR, or between ACE and UBP, since only one study was included for each comparison.