Clinical Utility of the 6-Item CTS, Boston-CTS, and Hand-Diagram for Carpal Tunnel Syndrome

Background: Self-reported measures are often used in research and clinical practice to diagnose carpal tunnel syndrome (CTS) and guide therapeutic choices. We aimed to assess the clinical utility of the Norwegian versions of two self-reported outcome measures for symptom severity assessment, the 6-item CTS (CTS-6), and Boston-CTS (BCTQ), and of one diagnostic measure, the hand-diagram, by evaluating measurement properties including discriminative ability for severity assessment (CTS-6, BCTQ), and diagnosis of CTS (hand-diagram). Methods: We performed forward and backward translation and cultural adaptation of the Norwegian CTS-6 and BCTQ. Following COSMIN guidelines, we investigated internal consistency, reliability, construct validity, and discriminative ability for distinguishing between severity levels of CTS in patients with confirmed CTS for the CTS-6 and BCTQ and reliability and discriminative ability for diagnosing CTS for the hand-diagram. Results: Two hundred and fifty-one patients referred for diagnostic work-up for CTS with nerve conduction studies (NCS) participated. The CTS-6 and BCTQ had acceptable internal consistency (Cronbach's α = 0.82 and 0.86, respectively), reliability (ICC = 0.86 and 0.90; SEM = 0.24 and 0.20; SDC95% = 0.68 and 0.55, respectively), construct validity (all eight pre-defined hypotheses confirmed) and discriminative ability to distinguish between severity levels of CTS [Area under the curve (AUC) = 0.75, 95% CI 0.64–0.85]. The hand-diagram had acceptable reliability (Cohen's kappa = 0.69) and discriminative ability to diagnose CTS (sensitivity = 0.72, specificity = 0.90). Conclusion: Our findings support the clinical utility of the CTS-6 and BCTQ for symptom severity assessment and of the hand-diagram for diagnostic screening.


INTRODUCTION
Carpal tunnel syndrome (CTS) is characterized by paresthesia, numbness and pain in the median nerve distribution (1). Nerve conduction studies (NCS) contribute to diagnosing CTS by demonstrating reduced conduction velocity of the median nerve (2). Treatment is often based on clinical and NCS severity (3,4). Due to a CTS prevalence of ca. 3–5% (5) and limited access to NCS, several measures (6,7) have been developed for diagnostic screening and for severity assessment. The 6-item CTS (CTS-6) (8,9) and the Boston carpal tunnel questionnaire (BCTQ) (10) are widely used for severity assessment, and the hand-diagram (11,12) for diagnostic screening. These instruments are not currently available in Norwegian. Single measurement properties have been described for other language versions of these instruments (13,14), but few studies follow systematic guidelines, such as those provided by the COSMIN group (15). Thus, it can be challenging to choose from among the available instruments (16,17), and measures may not be used as designed or for their intended purpose, which greatly hampers their utility. For example, several studies have used the BCTQ for diagnostic purposes (18). A study of the measurement properties of these instruments according to the COSMIN guidelines, together with a delineation of their respective discriminative abilities for diagnosis (hand-diagram) and severity assessment (CTS-6 and BCTQ), could therefore give a good estimate of their utility and help clinicians and researchers to choose the appropriate instrument (19).
The main objective of this study was to systematically investigate internal consistency, reliability, construct validity and discriminative ability for severity assessment of the Norwegian versions of the CTS-6, BCTQ, and reliability and discriminative ability of the hand-diagram for diagnosing CTS according to the COSMIN guidelines (20).

Design
The study was carried out in two stages. First, we translated and cross-culturally adapted the CTS-6 and BCTQ. Then, we used a cross-sectional design to test the measurement properties of the CTS-6, BCTQ, and hand-diagram against NCS as external criteria. Additionally, we performed a test-retest assessment with an interval of 4 days.
Following COSMIN recommendations (21), we aimed for a sample size of 200 patients, consisting of 100 patients with confirmed CTS and 100 patients for whom CTS could not be confirmed. Internal consistency and the discriminative ability to distinguish severity levels of CTS (both analyses applied to the CTS-6 and BCTQ) were tested in the sample with confirmed CTS, while construct validity (applied to the CTS-6 and BCTQ) and discriminative ability for diagnosis of CTS (applied to the hand-diagram) were tested in the entire sample. For test-retest reliability, we invited the first fifty included patients with confirmed CTS to complete a retest questionnaire 4 days after the baseline questionnaire. According to Norwegian law, this study was categorized as a quality improvement project and not a medical research project. The study was accordingly approved by the local Data Protection Official (PVO 2015/14753) and not the Regional Ethical Committee. All participants provided written consent.
Abbreviations: CTS, carpal tunnel syndrome; NCS, nerve conduction studies; AUC, area under the curve; ROC, receiver operating characteristics.

Participants
We recruited patients who had been referred for diagnostic workup of suspected CTS with NCS to the clinical neurophysiology lab at Oslo University Hospital. We included patients referred from both primary and secondary health services. Patients > 18 years of age with sufficient knowledge of the Norwegian language were eligible. Exclusion criteria were patient withdrawal or more than 50% missing items in the questionnaire. Written informed consent was obtained from all patients.

Procedures and Measures
The participants filled out the CTS-6, BCTQ, hand-diagram, and sociodemographic background variables 2 days before the consultation. Those participating in the test-retest reliability assessment were asked to fill out the CTS-6, BCTQ, and hand-diagram again 2 days after the consultation and return them by mail. We chose an interval of 4 days in order to minimize bias from memory or from a change in clinical symptoms. For each questionnaire, patients were asked to state for which hand they had answered.

Translation and Cross-Cultural Adaptation
The translation and cross-cultural adaptation were conducted according to international guidelines (22), including forward translation performed by a native speaker of Norwegian and backward translations performed by native speakers of English and Swedish. Based on the translations, an expert committee developed a pre-final version. Following a review by 10 patients with musculoskeletal disorders recruited from our out-patient clinic, a final version was developed. The Norwegian versions of the CTS-6 and BCTQ are provided in Supplementary Materials 1, 2.

Measures
The CTS-6 (8) was developed from the BCTQ as a brief symptom scale for patients with confirmed CTS. It was designed to measure severity of the cardinal symptoms of CTS. The questionnaire consists of six items covering presence and intensity of numbness, paresthesia, and pain during day and night, whether the patient is woken by these symptoms, and, if so, how often. Each question is answered on a scale ranging from "1" (best) to "5" (worst). The total score is calculated as the mean of all answers, with a minimum score of "1" and a maximum score of "5." One missing item is allowed.
The BCTQ (10) was designed as a self-reported measure of symptom severity and functional impairment of CTS, with two respective subscales. Only the symptom severity scale was used in the present study. This scale consists of 11 items covering presence, frequency, duration and severity of numbness, paresthesia and pain, both day and night, and impairment of fine motor skills, with five possible answers ranging from "1" (best) to "5" (worst). The total score is calculated as the mean of all answers, with a minimum score of "1" and a maximum score of "5." One missing item is allowed.
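The scoring rule described for both questionnaires (mean of the answered items, at most one missing item) can be sketched as follows. This is a minimal illustration, not the authors' scoring software; the function name `total_score` and the use of `None` for an unanswered item are assumptions made for the example.

```python
from typing import Optional, Sequence


def total_score(answers: Sequence[Optional[int]], max_missing: int = 1) -> Optional[float]:
    """Mean of the answered items (each scored 1-5).

    Returns None when more than `max_missing` items are unanswered.
    `None` marks an unanswered item (an assumption of this sketch).
    """
    answered = [a for a in answers if a is not None]
    if not answered or len(answers) - len(answered) > max_missing:
        return None
    if any(a < 1 or a > 5 for a in answered):
        raise ValueError("item answers must lie between 1 and 5")
    return sum(answered) / len(answered)


# Hypothetical CTS-6 response with one unanswered item
print(total_score([3, 4, None, 2, 5, 3]))  # 3.4
```

The same function applies to the 11-item BCTQ symptom severity scale, since both total scores are defined as the mean of all answers.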
The hand-diagram (11) was developed as a rapid screening tool for CTS in the general population. It consists of a diagram of the palmar and dorsal hand, and patients record in which areas of the hand and arm they experience numbness, tingling, pain and loss of sensation. There are four different probability scores according to presence and distribution of symptoms:
(1) Classic CTS is defined by tingling, numbness, or decreased sensation, with or without pain, in at least two of the three radial digits. Symptoms in the palm and dorsum of the hand are not allowed; wrist pain or radiation proximal to the wrist is allowed.
(2) Probable CTS is defined by the same pattern as classic CTS, except that palmar symptoms are allowed unless confined to the ulnar aspect.
(3) Possible CTS is defined by tingling, numbness, decreased sensation, and/or pain in at least one of the three radial digits.
(4) Unlikely CTS is defined by the absence of symptoms in the three radial digits.
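The four probability scores can be expressed as a small decision rule. The sketch below is one possible reading of the definitions above, not a validated scoring tool: the `HandDiagram` fields are a hypothetical encoding of a completed diagram, and it is assumed that any pattern with digit symptoms that fails the classic and probable criteria falls back to "possible".

```python
from dataclasses import dataclass


@dataclass
class HandDiagram:
    """Hypothetical encoding of one completed hand-diagram."""
    digits_sensory: int    # radial digits (0-3) with tingling, numbness or decreased sensation
    digits_pain_only: int  # radial digits (0-3) marked with pain alone
    palm: bool             # any palmar symptoms
    palm_ulnar_only: bool  # palmar symptoms confined to the ulnar aspect
    dorsum: bool           # any symptoms on the dorsum of the hand


def classify(d: HandDiagram) -> str:
    """Return 'classic', 'probable', 'possible' or 'unlikely'."""
    if d.digits_sensory >= 2 and not d.dorsum:
        if not d.palm:
            return "classic"
        if not d.palm_ulnar_only:
            return "probable"
    # Assumption: remaining patterns with any digit symptom are 'possible'.
    if d.digits_sensory + d.digits_pain_only >= 1:
        return "possible"
    return "unlikely"


print(classify(HandDiagram(2, 0, False, False, False)))  # classic
print(classify(HandDiagram(3, 0, True, False, False)))   # probable
```

Wrist pain and proximal radiation are not encoded because, according to the definitions, they do not change the classification.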

Nerve Conduction Studies
We performed bilateral motor and sensory NCS of the median and ulnar nerves on a Natus key-point classic EMG machine (Alpine Bio-med, Denmark). For both motor and sensory NCS, we used pre-gelled, disposable surface electrodes (Alpine biomed, Skovlunde, Denmark) with a 3 mm inter-electrode distance. For stimulation we used a handheld stimulation bar with felt tips (diameter 7.5 mm) soaked in saline solution and with fixed inter-electrode distance. Electrodes were placed on predefined anatomic landmarks, and distances between stimulation and recording sites were measured prior to performing NCS. Before conducting the NCS, we measured skin temperature with a handheld infrared thermographic scanner (Exergen Corporation, Watertown, MA, USA). Skin temperature was kept above 30°C at all times. All amplitudes were recorded using supramaximal stimulation. Motor and sensory amplitudes were measured from baseline to peak. Sensory latencies were calculated based on the peak of the negative deflection, motor latencies at onset of the negative peak. We performed orthodromic sensory NCS of the median nerve (branches to the palm and to the second, third, and fourth fingers), ulnar nerve (branches to the palm and to the fourth and fifth fingers) and radial nerve (superficial branch at the laterodorsal side of the hand). Distances between stimulation and recording electrodes in the fourth finger were equal in the median and ulnar nerves to allow for comparison of the sensory latency. NCS findings were classified according to the scale suggested by Padua (23).
Minimal CTS is characterized by a significant difference between sensory latency in the median/ulnar nerves at two sites (≥0.5 ms in the fourth digit and in the palm); mild CTS by sensory conduction velocities of the median nerve below the lower normal limit; moderate CTS by motor distal latency above the normal limit in addition to sensory involvement; severe CTS by additionally absent sensory responses (amplitude < 0.2 mV); and extreme CTS by absence of both motor and sensory responses.

Diagnosing CTS
In order to provide a CTS diagnosis certain clinical (1, 2) and NCS criteria had to be met: A diagnosis of CTS was present if NCS showed at least minimal CTS according to Padua (23) and two of the following classical symptoms of CTS (2) were present: numbness and/or paresthesia in the median nerve distribution, alleviation of symptoms by shaking the limb, weakness in the hand and presence of symptoms during the night. Further, there should be no alternative or more plausible explanation for these symptoms. In cases of bilateral disease, analyses were applied to the most symptomatic side.
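The combined reference standard can be written as a simple rule. This is a hedged sketch: the grade label "normal", the symptom identifiers, and the reading of "two of the following" as "at least two" are assumptions introduced for illustration.

```python
# Padua grades in increasing severity; "normal" is a label added for this sketch.
PADUA_GRADES = ["normal", "minimal", "mild", "moderate", "severe", "extreme"]

# Hypothetical identifiers for the four classical symptoms listed above.
CLASSICAL_SYMPTOMS = {
    "median_numbness_or_paresthesia",
    "relief_by_shaking",
    "hand_weakness",
    "night_symptoms",
}


def has_cts(ncs_grade: str, symptoms: set, alternative_explanation: bool = False) -> bool:
    """CTS is diagnosed when NCS shows at least minimal CTS, at least two
    classical symptoms are present, and no alternative explanation exists."""
    if ncs_grade not in PADUA_GRADES:
        raise ValueError(f"unknown NCS grade: {ncs_grade}")
    ncs_positive = PADUA_GRADES.index(ncs_grade) >= PADUA_GRADES.index("minimal")
    clinical_positive = len(CLASSICAL_SYMPTOMS & set(symptoms)) >= 2
    return ncs_positive and clinical_positive and not alternative_explanation


print(has_cts("mild", {"night_symptoms", "relief_by_shaking"}))  # True
print(has_cts("normal", {"night_symptoms", "hand_weakness"}))    # False
```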

Statistical Analysis
The analysis was performed with SPSS V24 software (SPSS Inc., Chicago, IL). P-values < 0.05 were considered significant. We substituted up to 50% missing items with the mean of the remaining answers. Floor and ceiling effects were defined as >15% of patients reporting the lowest or highest possible total score, and end-effects as >15% of patients reporting the highest or lowest possible score for a single item.
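The missing-item substitution and the 15% threshold can be illustrated with a short sketch. The study itself used SPSS; the Python functions below, their names, and the `None` marker for a missing item are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def impute_with_person_mean(answers: Sequence[Optional[int]],
                            max_missing_fraction: float = 0.5) -> Optional[List[float]]:
    """Replace missing items with the mean of the respondent's remaining answers.

    Returns None when more than `max_missing_fraction` of the items are missing.
    """
    answered = [a for a in answers if a is not None]
    missing = len(answers) - len(answered)
    if not answered or missing / len(answers) > max_missing_fraction:
        return None
    person_mean = sum(answered) / len(answered)
    return [person_mean if a is None else a for a in answers]


def extreme_score_shares(scores: Sequence[float], lowest: float = 1,
                         highest: float = 5) -> Tuple[float, float]:
    """Share of respondents at the lowest and at the highest possible score."""
    n = len(scores)
    return (sum(s == lowest for s in scores) / n,
            sum(s == highest for s in scores) / n)


def exceeds_threshold(share: float, threshold: float = 0.15) -> bool:
    """True when the share counts as a floor, ceiling or end effect (>15%)."""
    return share > threshold


# Hypothetical answers of ten patients to a single item
floor_share, ceiling_share = extreme_score_shares([1, 1, 3, 5, 2, 2, 2, 2, 2, 2])
print(exceeds_threshold(floor_share), exceeds_threshold(ceiling_share))  # True False
```

Applied to total scores the shares describe floor and ceiling effects; applied to single items they describe end effects.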

Internal Consistency
Internal consistency was assessed by Cronbach's α. COSMIN regards values of >0.70 and <0.95 as acceptable.
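For readers who want to reproduce this step outside SPSS, Cronbach's α can be computed from the item variances and the variance of the sum score. The sketch below uses only the Python standard library; the function names and the toy data are illustrative assumptions, not study data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete data; each row holds one respondent's item answers."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def is_acceptable(alpha: float) -> bool:
    """Acceptable range as stated above: greater than 0.70 and less than 0.95."""
    return 0.70 < alpha < 0.95


# Toy data: three respondents, two items
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
print(round(alpha, 2), is_acceptable(alpha))  # 0.67 False
```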

Test-Retest Reliability
Test-retest reliability was tested with relative and absolute reliability measures. Relative reliability was assessed with intraclass correlation (ICC 2.1, two-way random, absolute agreement) (24), with values of ≥0.70 regarded as acceptable. Absolute reliability was tested with the standard error of measurement (SEM, calculated as standard deviation of the difference/√2).
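The reliability measures can be sketched as follows. The ICC function implements the usual two-way random-effects, absolute-agreement, single-measurement formula from the ANOVA mean squares, and the SEM follows the definition given above. The formula for the smallest detectable change, SDC95% = 1.96 × √2 × SEM, is the conventional one and is an assumption here, since this section does not state it; all function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def icc_2_1(test: Sequence[float], retest: Sequence[float]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measurement."""
    rows = list(zip(test, retest))
    n, k = len(rows), 2
    grand = mean([x for row in rows for x in row])
    ss_rows = k * sum((mean(row) - grand) ** 2 for row in rows)
    ss_cols = n * sum((mean(col) - grand) ** 2 for col in (test, retest))
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def sem(test: Sequence[float], retest: Sequence[float]) -> float:
    """Standard error of measurement: SD of the test-retest differences / sqrt(2)."""
    return stdev([b - a for a, b in zip(test, retest)]) / sqrt(2)


def sdc95(sem_value: float) -> float:
    """Smallest detectable change (conventional formula, assumed here)."""
    return 1.96 * sqrt(2) * sem_value


print(round(sdc95(0.20), 2))  # 0.55
```

With SEM = 0.20 this formula gives about 0.55, which agrees with the SDC95% reported for the BCTQ in the abstract.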

Construct Validity
Construct validity was assessed for the CTS-6 and BCTQ by testing eight pre-defined hypotheses concerning the CTS-6, the BCTQ, and concurrent measurements based on existing literature (2,8,10,11,28–32). Hypotheses one, three, four, five, six, and seven were tested in the sample with confirmed CTS, and hypotheses two and eight in the whole sample. Since the CTS-6 and BCTQ measure the same construct, parallel hypotheses were created.

Discriminative Ability
Discriminative ability of the CTS-6 and BCTQ was assessed by receiver operating characteristics (ROC) curves for the ability to distinguish between minimal to mild and moderate to severe CTS grades. An area under the curve (AUC) of at least 0.70 is considered adequate (21). We used the coordinates of the ROC curve to calculate the scores with optimal sensitivity and specificity and positive and negative likelihood ratios. We assessed the discriminative ability of the hand-diagram by calculating sensitivity, specificity, and positive and negative likelihood ratios directly from 2 × 2 tables for the diagnostic ability to discriminate between patients who had or had not been diagnosed with CTS.
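The quantities named in this section can be computed with a few lines of code. The sketch below derives sensitivity, specificity and likelihood ratios from a 2 × 2 table, estimates the AUC as the probability that a randomly chosen patient from the more severe group scores higher than one from the milder group, and picks a cut-off by maximizing sensitivity + specificity − 1 (Youden's index). The use of Youden's index is an assumption, as the text only refers to optimal sensitivity and specificity, and the counts in the example are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity and likelihood ratios from a 2 x 2 table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_positive": sens / (1 - spec) if spec < 1 else float("inf"),
        "lr_negative": (1 - sens) / spec if spec > 0 else float("inf"),
    }


def auc(positives: Sequence[float], negatives: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def best_cutoff(positives: Sequence[float],
                negatives: Sequence[float]) -> Tuple[float, float, float]:
    """Cut-off (score >= cut-off counts as positive) with the highest Youden's index."""
    def sens_spec(cut: float) -> Tuple[float, float]:
        sens = sum(p >= cut for p in positives) / len(positives)
        spec = sum(n < cut for n in negatives) / len(negatives)
        return sens, spec

    candidates = sorted(set(positives) | set(negatives))
    cut = max(candidates, key=lambda c: sum(sens_spec(c)) - 1)
    return (cut, *sens_spec(cut))


# Hypothetical 2 x 2 counts chosen for illustration only
m = diagnostic_metrics(tp=72, fp=10, fn=28, tn=90)
print(round(m["sensitivity"], 2), round(m["specificity"], 2))  # 0.72 0.9
```

An AUC computed this way can be compared with the adequacy threshold of 0.70 mentioned above.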

RESULTS
We collected data from April 2016 to January 2018. Out of 293 invited patients, 42 either declined to participate or did not speak Norwegian. Of the 251 remaining patients, 128 were diagnosed with CTS (based on predefined clinical symptoms and NCS findings). Of the 123 patients who were not diagnosed with CTS, 10 did not fulfill the clinical criteria, 15 did not fulfill the NCS criteria and 98 did not fulfill either of these criteria. These patients were diagnosed with one of the following conditions: polyneuropathy, ulnar nerve neuropathy, cervical disk herniation, tendinitis, or radial nerve neuropathy. Internal consistency and the discriminative ability to distinguish between severity levels of CTS (applied to the CTS-6 and BCTQ) were tested in the sample with confirmed CTS, while construct validity and discriminative ability for diagnosis of CTS (applied to the hand-diagram) were tested in the entire sample. Fifty-four patients with confirmed CTS participated in test-retest analysis. Sample characteristics are given in Table 1. There were no significant differences between the samples regarding age, sex distribution, educational level, or proportion of patients with Norwegian as their mother tongue.

Data Quality
The total scores were normally distributed in the CTS-6 and BCTQ. There was little missing data for the CTS-6 and BCTQ (Tables 2A,B). Neither of the outcome measures showed floor or ceiling effects, but there were end effects in four items of the CTS-6 and in eight of the 11 items in the BCTQ (defined as >15% of patients reporting the lowest or highest possible score for a single item). Ten patients had missing data in the hand-diagram. The distribution of hand-diagram scores is presented in Table 1.

Internal Consistency and Test-Retest Reliability
Cronbach's α was 0.82 for the CTS-6 and 0.86 for the BCTQ. There was no difference between the test and the re-test total scores for the CTS-6 and BCTQ (Table 3). Both the SEM and the SDC were lower for the BCTQ than for the CTS-6. The agreement and test-retest reliability were acceptable for all three instruments.

Construct Validity
All pre-defined hypotheses for the correlation between the CTS-6, the BCTQ and the external criteria were confirmed (Table 4).
As expected, the total CTS-6 and BCTQ scores (Figures 1A,B) were significantly different between NCS severity groups. Table 5 shows the results of the discriminative ability testing. Using a score of "probable" as the cut-off for CTS, the hand-diagram showed a high specificity and positive likelihood ratio and good sensitivity in detecting CTS. For the CTS-6 and BCTQ, scores of 2.55 and 2.47, respectively, yielded the highest combination of sensitivity and specificity for detecting moderate to severe CTS. Both showed an acceptable to good ability to discriminate between severity levels of CTS.

TABLE 4 | Pre-defined hypotheses for construct validity (excerpt; only hypotheses 1–3 are recoverable).
1. In the subgroup of patients with confirmed CTS, we expected a moderate level of correlation between their NCS severity and the total scores in the CTS-6 and the BCTQ. [† (both) Yes]
2. In the whole sample (comprised of patients with and without CTS), we expected a significant, but weak, correlation between the total scores of the CTS-6 and BCTQ and severity of NCS. [† (both) Yes]
3. We expected a significant difference between the NCS severity groups in terms of the severity of pain in the CTS-6 and BCTQ (item #1 in the CTS-6 and item #1 in the BCTQ).

DISCUSSION
The Norwegian versions of the CTS-6, BCTQ and hand-diagram showed good measurement properties when assessed in a sample of patients referred for diagnostic work-up for CTS with NCS. The results support the utility of the CTS-6 and BCTQ for symptom severity assessment and of the hand-diagram for diagnostic screening.
The age and gender distribution, as well as the distribution of NCS severity levels, in the present study correspond to previous reports (5,23,33). The response rate was high and there were generally few missing items, corresponding to previous studies (29,34). We did not find floor or ceiling effects in the total scores. However, we found end effects for items concerning numbness and tingling in the CTS-6 and BCTQ, especially in patients with moderate to severe NCS findings (see Figure 1). This group is often considered for surgery (3,4). End effects can indicate that single items are not differentiated enough, which may make it difficult to measure changes in these items. In turn, this might reduce the utility of the two outcome measures for post-operative follow-up (21).
According to the COSMIN criteria, test-retest reliability of the CTS-6 and BCTQ was very good for both absolute and relative reliability measures, comparable (21,35) to other translated versions of the CTS-6 (9, 36) and BCTQ (34,37). Internal consistency was very good for the CTS-6 and BCTQ and comparable to the original version and Spanish versions of the CTS-6. An artificially high Cronbach's α can be found in questionnaires with a large number of items, which is unlikely in our study (21).
All pre-defined hypotheses were confirmed, which supports good construct validity of the CTS-6 and BCTQ. We chose to use NCS as external criteria as they are an objective measurement of nerve function and frequently used to diagnose CTS (38). We found a moderate level of correlation between the NCS severity and the total scores of the CTS-6 and BCTQ in the sample with confirmed CTS, corresponding with previous reports (39). In contrast, the correlation between NCS severity and symptom severity in the whole population was significant, but rather low. This result confirms that the two outcome measures are not specific for CTS and is in accordance with the literature (28,31,40). In addition, the correlation between NCS severity and clinical symptom severity in CTS varies in different studies (29,39,41–43). This variation might be due to the use of different methods in the studies. For instance, the result would be different if correlation to clinical sum scores or to severity of single symptoms was assessed (28,30). In our study, pain and numbness intensities, as measured by the CTS-6 and BCTQ, differed between the NCS severity grades, in keeping with previous reports (41). These findings support the notion that NCS and clinical severity assessment are complementary (28,44) and, consequently, indicate that the CTS-6 and BCTQ should only be used in patients with confirmed CTS.
The high level of correlation between the CTS-6 and BCTQ was expected, as the CTS-6 is derived from the BCTQ. Likewise, the low level of correlation between the CTS-6 and the hand-diagram was expected, as the hand-diagram is designed for diagnostic purposes and not for symptom severity assessment, in contrast to the CTS-6 and BCTQ. This suggests that the BCTQ would not perform well when used for diagnostic purposes (18) when compared to diagnostic measures.
The hand-diagram had a very good ability to distinguish between patients with and without CTS. This ability is likely due to its measurement of how closely pain and numbness follow the median nerve distribution, which is a classic sign of CTS (1,31,32,45). The CTS-6 and BCTQ showed good sensitivity and acceptable specificity for distinguishing moderate to severe CTS from minimal and mild CTS, as previously reported (29). This finding highlights the clinical utility of the CTS-6 and BCTQ in guiding treatment decisions, as patients with moderate to severe NCS findings often benefit from surgery, and patients with milder severity grades may benefit from conservative treatment (4,46). A limitation of this study is that we only assessed the symptom-severity subscale of the BCTQ, and not the functional-impairment subscale. We made this decision because the CTS-6 does not contain a functional-impairment subscale. The functional-impairment subscale of the BCTQ should be scored independently from the symptom-severity subscale (47). Also, we did not test the responsiveness of the CTS-6 and BCTQ, i.e., their ability to detect clinically important changes over time. This knowledge would be necessary to assess the interpretability of the two outcome measures and their utility for follow-up analysis. Because the clinical criteria for CTS partially overlapped with single items in the measures, incorporation bias cannot be ruled out. Incorporation bias occurs when the test being evaluated is part of the reference standard. In this case, it may have led to an overestimation of the sensitivity of the hand-diagram. Incorporation bias is difficult to avoid and is nearly always present in studies evaluating clinical diagnostic methods; nevertheless, its potential effect should be addressed (48). The items in question are central symptoms of CTS, and it is hard to diagnose CTS without asking about paresthesia in the median nerve distribution. We tried to minimize the effect of this type of bias by using a reference standard comprised not only of clinical criteria, but of a combination of clinical criteria and NCS findings.
Further, we addressed the problem of bilateral disease by applying the analyses to the most symptomatic side. A limitation is that a Martin-Gruber anastomosis might have been present in some individuals in the sample. Presence of a Martin-Gruber anastomosis in patients with CTS might lead to confusing NCS findings (49,50). Due to the anastomosing fibers bypassing the carpal tunnel, the proximal motor latency (measured at the cubital fossa) and the motor conduction velocity of the median nerve in the forearm can be mistakenly interpreted as normal. As the distal motor latency is usually not impacted by a Martin-Gruber anastomosis, a mismatch between pathological distal motor latency and seemingly normal proximal motor latency can be observed (51,52). The effect of a Martin-Gruber anastomosis on the NCS findings in CTS can be subtle and is not always easy to recognize. This could potentially lead to underestimation of pathology in the median nerve motor conduction velocity in the forearm and in the proximal motor latency in patients with CTS.
FIGURE 1 | (A,B) Total scores of the 6-item CTS (CTS-6) and Boston-CTS (BCTQ), respectively, in the different nerve conduction study (NCS) severity groups of carpal tunnel syndrome (CTS) graded according to Padua. Minimal CTS, ≥0.5 ms difference between median/ulnar nerves sensory latency; mild CTS, sensory conduction velocities of the median nerve below the lower normal limit; moderate CTS, motor distal latency above the normal limit in addition to sensory conduction velocities of the median nerve below lower normal limit; severe CTS, absent sensory amplitudes in addition to motor distal latency above the normal limit; and extreme CTS, absence of sensory and motor responses.
It is important to note that the findings of this study are valid in the context of a population referred for NCS with suspected CTS, which is a somewhat select group. However, from a pragmatic standpoint, this sample represents the population in which the studied instruments are likely to be used. It is, for instance, possible to use the hand-diagram to assess the pre-test probability before performing NCS and to adapt the scheduled NCS protocol accordingly.
Some major strengths of the present study are that we examined all quality criteria proposed by COSMIN within our design and that we recruited more patients than recommended (21).

CONCLUSION
The Norwegian versions of the CTS-6 and BCTQ as well as the hand-diagram showed acceptable to good measurement properties when applied to patients referred to NCS. The hand-diagram provided a clear estimate of pretest probability prior to performing NCS, and the NCS protocol may be adjusted accordingly (53). The CTS-6 and BCTQ provided complementary information to NCS in severity assessment and can be used to guide the therapeutic approach in patients with diagnosed CTS.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because of local data protection restrictions. Requests to access the anonymized datasets should be directed to Daniel Gregor Schulze (d.g.schulze@studmed.uio.no).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the local Data Protection Official of Oslo University Hospital (PVO 2015/14753). The patients/participants provided their written informed consent to participate in this study.