Reliability of a telephone interview for the classification of headache disorders

Objective The study aimed to test the reliability of a semi-structured telephone interview for the classification of headache disorders according to the ICHD-3. Background Questionnaire-based screening tools are often optimized for single primary headache diagnoses [e.g., migraine (MIG) and tension-type headache (TTH)] and therefore insufficiently represent the diagnostic precision of the ICHD-3, which limits epidemiological research on rare headache disorders. Brief semi-structured telephone interviews could be an effective alternative to improve classification. Methods A patient population representative of different primary and secondary headache disorders (n = 60) was recruited from the outpatient clinic (HSA) of a tertiary care headache center. These patients completed an established population-based questionnaire for the classification of MIG, TTH, or trigeminal autonomic cephalalgia (TAC). In addition, each patient was interviewed by telephone, separately, by three blinded headache specialists using a semi-structured interview. The agreement of the diagnoses made using either the questionnaire or the interview with the HSA diagnoses was evaluated. Results Of the 59 patients (n = 1 dropout), 24% had a second-order and 5% had a third-order headache disorder. The main diagnoses were as follows: frequent primary headaches with 61% MIG, 10% TAC, 9% TTH, and 5% rare primary and 16% secondary headaches. The second-order diagnosis was chronic migraine throughout, and third-order diagnoses were medication overuse headache and TTH. Agreement with the main headache diagnoses from the HSA was significantly better for the telephone interview than for the questionnaire (questionnaire: κ = 0.330; interview: κ = 0.822; p < 0.001). Second-order diagnoses were not adequately captured by questionnaires, while there was a trend for good agreement with the telephone interview (κ = 0.433; p = 0.074). Headache frequency and psychiatric comorbidities were independent predictors of agreement between the HSA and the telephone interview. Male sex, headache frequency, severity, and depressive disorders were independently predictive for agreement between the questionnaire and HSA. The telephone interview showed high sensitivity (≥71%) and specificity (≥92%) for all primary headache disorders, whereas the questionnaire was below 50% in either sensitivity or specificity. Conclusion The semi-structured telephone interview appears to be a more reliable tool for accurate diagnosis of headache disorders than self-report questionnaires. This offers the potential to improve epidemiological headache research and care even in underserved areas.


Introduction
Headache disorders are a great burden on the general population, resulting in reduced quality of life and job performance. Effective treatment options exist, but they have to be individualized on the basis of the correct diagnosis. Making the correct diagnosis can be challenging for physicians not specialized in headache care, as more than 200 distinct headache disorders are defined by the International Classification of Headache Disorders (ICHD-3) (1). Moreover, especially in rural areas, headache care must be provided primarily by non-headache specialists (mostly primary care physicians), who are often not adequately trained (2, 3). Therefore, questionnaires have been developed to screen for the main primary headache disorders such as migraine (MIG), tension-type headache (TTH), or trigeminal autonomic cephalalgias (TACs) for both clinical routine and research. Although a few headache questionnaires were validated for more than one disorder, they show poor detection rates and diagnostic accuracy when a combination of different headache disorders is present (e.g., classification of MIG with trigeminal autonomic cephalalgia symptoms as TAC) (4). Many epidemiologic headache studies were conducted before the publication of the ICHD-3, so that heterogeneous data exist, particularly on the prevalence of rare primary and secondary headaches (5, 6). Short semi-structured interviews via telephone might be an alternative option to improve detection rates. In this study, we investigated the reliability of a semi-structured telephone interview in identifying different headache disorders in comparison to a questionnaire validated and used for epidemiological headache research (4), with the diagnosis from our outpatient headache clinic as the gold standard.

Methods
The study was performed as a blinded observational study in our outpatient headache clinic and approved by the local ethics committee (BB 085/21). Known patients diagnosed with one or more headache disorders in an outpatient consultation and classified according to the ICHD-3 criteria were identified through a chart review. Care was taken to include both primary and secondary headaches as well as frequent and rare headache disorders, to keep the interviewers unaware of an a priori probability for certain diagnoses. For this purpose, we screened the database starting with the least common headache diagnoses and proceeding to more common diagnoses (i.e., headache disorders were sorted by frequency in the database). We then contacted the identified patients and asked if they were willing to participate in the study. Since migraine is by far the most prevalent diagnosis, the remaining places according to the power analysis were filled with patients suffering from episodic/chronic migraine with or without medication overuse headache (MOH), which yielded the final study sample.
After inclusion, the patients prospectively completed a questionnaire that was validated and used for epidemiological headache research (4). This questionnaire was chosen because, to the best of our knowledge, there was no other questionnaire validated for the detection of more than one headache disorder in the German and English languages. Briefly, after explaining the principles and general rules for answering, the questionnaire continues with specific questions regarding MIG (seven items), TTH (seven items), and TAC (six items). The questions were to be answered with "yes" or "no." There are additional questions on the number of days per month on which acute pain or migraine drugs are taken. Questions and analysis algorithms are based on the classification criteria of the ICHD-2 (4, 7, 8). Questionnaires were sent to the patients' home addresses with an instruction to complete them and return them in an envelope provided along with the letter.
Later, the patients were called separately by three different headache specialists, each performing a semi-structured telephone interview of at most 10 min (flow chart in Figure 1). The interview starts by exploring facial pain, secondary headache, and rare primary headache disorders, which are characterized by situational triggers and specific features. The interview then continues with pain intensity and frequency of headaches and specific phenomenological characteristics. Finally, a headache diagnosis was determined. Re-evaluation during the diagnostic interview was possible at any time if the patient provided new information. There were no predefined specific questions. In the case of several headache disorders, the diagnoses were sorted according to two criteria: (1) the amount of impairment caused by the disorder, which was generally the reason for consultation in the first place (i.e., migraine > TTH; TAC > TTH; if migraine and TAC co-exist, the one with more impairment was considered primary); (2) in the case of diagnoses that are not independent, causality was used to sort them (i.e., migraine is a prerequisite for developing chronic migraine, and medication overuse headache is often a consequence of chronic migraine, although disentanglement may be difficult if both co-exist for quite some time). In this case, migraine would be the first-order, chronic migraine the second-order, and MOH the third-order headache. Upon study inclusion, patients were instructed to remain anonymous and neither to tell nor to hint at their headache diagnosis. The interview resulted in one or more headache diagnoses according to the ICHD-3.

Abbreviations: MIG, migraine; TTH, tension-type headache; TAC, trigeminal autonomic cephalalgia; HSA, headache outpatient clinic; HA, headache; ICHD, International Classification of Headache Disorders.

Sample size considerations and statistics
Migraine is among the most frequent and bothersome headache disorders, and the sample size was thus chosen to detect migraine patients among the sample population (9). Assuming an alpha error of 5% and a power of 80% (beta error of 20%), the McNemar test for paired observations of a headache disorder (i.e., outpatient clinic as gold standard vs. telephone interview), based on the expected 75% probability of detecting migraine patients in the outpatient clinic, yielded a sample size of 60 to detect at least 20% discrepancies in diagnosis, which would indicate non-superiority to the questionnaire.
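As an illustration of the paired comparison named above, the following minimal sketch computes an exact two-sided McNemar p-value from the two discordant cell counts of a paired 2x2 table. It does not reproduce the authors' sample size calculation; the function name `mcnemar_exact` and the counts are hypothetical and are not study data.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: pairs in which only the first method matches the gold standard
    c: pairs in which only the second method matches the gold standard
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, hence no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair falls into either
    # cell with probability 0.5, so the smaller count is Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts (NOT taken from the study)
p_value = mcnemar_exact(b=2, c=12)
print(f"exact McNemar p = {p_value:.4f}")
```

Only the discordant pairs enter the test; pairs on which both methods agree with the gold standard, or both disagree, carry no information about which method is better.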
Cohen's kappa was used to assess agreement between the outpatient headache clinic (gold standard) and the questionnaire or telephone interview; differences between raters in the telephone interview were analyzed by the chi-squared test; stepwise binary logistic regression analysis was used to identify predictors of agreement between the gold standard and the questionnaire or telephone interview. A predictor was kept in the model if the p-value was lower than or equal to 0.157, which is the cutoff for an optimization of the model based on the Akaike information criterion (10). There were no missing values or outliers. Only patients with complete outpatient clinic data were selected.
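The link between the 0.157 cutoff and the Akaike information criterion can be checked numerically. The AIC penalizes each additional parameter by 2, so a single predictor improves the AIC only if its likelihood-ratio statistic exceeds 2; for a chi-squared distribution with one degree of freedom this corresponds to a p-value of about 0.157. The sketch below uses only the Python standard library; the helper name `chi2_sf_1df` is an illustrative assumption.

```python
from math import erfc, sqrt


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    # If Z is standard normal, Z**2 is chi-squared with 1 df, so
    # P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return erfc(sqrt(x / 2.0))


aic_penalty_per_parameter = 2.0
threshold = chi2_sf_1df(aic_penalty_per_parameter)
print(f"p-value cutoff implied by the AIC: {threshold:.3f}")
```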

Results
Sixty patients were recruited between 1 March and 4 April 2022, and one patient dropped out (did not respond to phone calls), leaving 59 patients for analysis. Telephone interviews were performed within 2 weeks after inclusion. All patients were interviewed by the headache specialists. The majority of patients were women (68%), and the median age was 50 years [interquartile range (IQR) 33-39]. Median headache frequency was 8 days per month (IQR 4-13), and patients were moderately to severely affected according to the Headache Impact Test-6 (HIT-6) (median: 60; IQR 53-64) and the Migraine Disability Assessment (MIDAS) (median: 29; IQR 10-49). A total of 24% of patients had more than one headache disorder, and 5% had a third-order headache disorder (Table 1).
The frequency of first-order headache diagnoses was 61% MIG, 10% TAC, and 9% TTH; others were rare primary (5%) or secondary headaches (16%). The second-order disorder was chronic migraine throughout (n = 14), and the third-order disorders were medication overuse headache (n = 2) and TTH (n = 1). Agreement with first-order headache disorders from the headache outpatient clinic (HSA) was significantly better for the telephone interview than for the questionnaire (questionnaire: κ = 0.330; interview: κ = 0.822; p < 0.001). The second-order headache diagnosis was not adequately captured by questionnaires, while there was a trend for good agreement with the telephone interview (κ = 0.433; p = 0.074). There was no agreement in the third-order diagnosis for either the questionnaire or the telephone interview.
MIG, TTH, and TAC showed fair-to-moderate agreement between questionnaires and HSA (kappa < 0.57; p = 0.037). There was no agreement for any other headache disorder. However, there was substantial-to-almost perfect agreement between the telephone interview and HSA in all first-order headache diagnoses (kappa ≥ 0.66; p < 0.001) (Table 2). There were no significant differences between the three telephone raters in determining the first-order headache diagnosis (p = 0.72).
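For readers unfamiliar with the statistic, the sketch below shows how Cohen's kappa is computed from two lists of diagnoses and how a kappa value maps to the verbal labels used in this section. The label bands follow the commonly used Landis and Koch convention, which is assumed here because the terms match; the function names and the example diagnoses are hypothetical and are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of ratings")
    n = len(rater_a)
    # Observed agreement: share of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every case
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Verbal agreement label (assumed Landis and Koch bands)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical diagnoses (NOT study data)
clinic = ["MIG", "MIG", "TTH", "TAC", "MIG", "TTH"]
interview = ["MIG", "MIG", "TTH", "MIG", "MIG", "TAC"]
kappa = cohens_kappa(clinic, interview)
print(f"kappa = {kappa:.3f} ({landis_koch_label(kappa)})")
```

Under these bands, the first-order kappa values reported above (0.330 for the questionnaire and 0.822 for the interview) fall into the "fair" and "almost perfect" categories, respectively.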
The agreement between HSA and the questionnaire was independently influenced by male sex, headache frequency, headache intensity, and depressive disorders (Table 3), whereas agreement with the telephone interview was only influenced by headache frequency and psychiatric comorbidity (Table 4). The performance scores show that the telephone interview performs significantly better than the questionnaire in detecting primary headache, with high sensitivity (>88%) and specificity (>92%), especially in rare primary headache syndromes that are virtually impossible to detect with the questionnaire. The questionnaire has a moderate positive and negative predictive value for MIG and a high negative predictive value for TTH and TAC (Table 5). Performance scores for all headache disorders in our cohort can be found in the Appendix.
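The performance scores mentioned above are derived per diagnosis by treating the gold standard as the reference and each target diagnosis as a one-vs-rest problem. The following minimal sketch shows that calculation; the function name `diagnostic_performance` and the example labels are hypothetical and are not study data.

```python
def diagnostic_performance(reference, test, target):
    """Sensitivity, specificity, PPV and NPV for one target diagnosis."""
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, test):
        if ref == target and pred == target:
            tp += 1  # target correctly detected
        elif ref != target and pred == target:
            fp += 1  # target assigned although reference differs
        elif ref == target and pred != target:
            fn += 1  # target missed
        else:
            tn += 1  # target correctly not assigned

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical diagnoses (NOT study data)
clinic = ["MIG", "MIG", "MIG", "TTH", "TAC", "MIG", "TTH", "other"]
questionnaire = ["MIG", "MIG", "TTH", "TTH", "TAC", "MIG", "MIG", "other"]
scores = diagnostic_performance(clinic, questionnaire, target="MIG")
print(scores)
```

Unlike sensitivity and specificity, the predictive values depend on how common the target diagnosis is in the sample, which matters in a cohort in which most patients have migraine.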

Discussion
The results of our study showed that the semi-structured telephone interview performed more reliably in the classification of headache disorders than the self-report questionnaire. In addition, the results showed a high overall agreement with the clinical diagnosis made in a headache clinic, making the interview an effective and valid screening tool.
The diagnosis-specific agreement between HSA and the questionnaire for the first headache diagnosis in our study was comparable to results from the validation study (4). The agreement was best in the diagnosis of migraine, followed by TTH and TAC. As in our study, agreement decreased in the presence of multiple headache diagnoses when the questionnaire was used (4, 8). Compared with a diagnosis-specific headache questionnaire such as the Migraine ID, the questionnaire used here yields lower agreement and lower sensitivity (11-13). However, when using a diagnosis-specific headache questionnaire, the pretest probability must be high, which requires some diagnostic headache skills in the first place. The telephone interview showed significantly better agreement than the questionnaire in the diagnosis of the first headache entity, especially for MIG, TTH, and TAC, and opens the possibility of identifying rare primary headache disorders. Consequently, performance scores for the telephone interview are significantly better.
To the best of our knowledge, this is the first study of a physician-based semi-structured telephone interview for the diagnosis of headache disorders. Generally, telephone interviews have been used more frequently in headache care (14) and also in the classification of primary headache disorders (15). Potter et al. developed a semi-structured telephone interview aimed at excluding rare primary and secondary headaches and differentiating between chronic TTH and MIG as well as medication overuse headache (MOH), which was conducted by untrained nurses (16). It was not a diagnostic interview, as patients not having chronic TTH and MIG were sent to their GP for further evaluation. The overall agreement between headache specialists and nurses was only moderate. The German Robert Koch Institute designed a structured telephone interview detecting TTH, MIG, and MOH using the ICHD-3 criteria, which was applied by lay personnel in a German nationwide survey of 5,000 subjects (17). The semi-structured interview combines the advantages of the structured interview (its structure allows a time-effective classification of the headache disorder) with the advantages of an unstructured interview (flexible response to the patient's answers). At the same time, this examiner-dependent variability opens the possibility of biases that are not present in the self-report questionnaire (18, 19). Possible biases include question-order and context effects, response-order effects, limited validity of retrospective reports, and socially desirable responses (20). We do not believe that these effects are crucial here; after all, there was no significant difference between investigators in the first-order diagnosis. Designing self-report questionnaires is based on the operationalization of the ICHD criteria into layperson-understandable questions and on the use of possible filter questions.
The depth of the desired classification has a significant influence on the scope of the questions to be asked and the complexity of the questionnaire (21). Coi et al. identified 48 possible causes of bias in designing and administering a questionnaire (22). Furthermore, questionnaires are prone to subjective assessment and thus subject to individual influencing factors. Accordingly, we identified several independent factors affecting agreement between the questionnaire and HSA diagnosis in our study. Male sex and more severe depressive symptoms were clearly associated with increased odds of agreement between diagnoses. In contrast, higher headache frequency led to inferior agreement, which might be explained by tension-type phenotypes in patients with chronic migraine and/or MOH. The telephone interview done by headache specialists offers the advantage of the entity-independent recording of headache disorders by an interactive review of the ICHD-3 criteria. In our study, agreement between the telephone interview and the gold standard was much less confounded by headache characteristics and comorbidities than agreement with the questionnaire, rendering its results more robust. In addition, this approach offers the possibility of recording rare headache entities for which no validated questionnaires are available. Another advantage is the easy access that might allow large-scale use in underserved regions. For these reasons, it is of interest for use in epidemiological studies, especially to clarify how common rare headache syndromes really are. For routine clinical practice, the disadvantage that the interview requires headache specialists, whose availability in rural areas is limited, can be compensated for by telemedicine approaches (23, 24). However, questionnaires are still an important tool in primary care, and more effort is needed to impart knowledge and skills about administering them and interpreting their results; in our study, the questionnaire showed moderate positive and negative predictive values for MIG and high negative predictive values for TTH and TAC.
This study has some limitations that need to be addressed. It is a monocentric study; despite blinding, it was therefore possible in individual cases that patients named their diagnoses or that the patient was known to the investigator by voice. Furthermore, a significant proportion of patients were affected by migraine, rendering its detection more likely. However, investigators were unaware of the distribution of diagnoses, and secondary and rare primary headache disorders were equally well identified, which argues against a selection bias in this study. Nonetheless, validation of the results in an independent cohort is desirable.

Conclusion
The semi-structured telephone interview appears to be a more reliable and accurate tool for the classification of headache disorders than self-report questionnaires. Main headache diagnoses were comparable to personal consultations in this study, a finding that requires confirmation in different settings. Nonetheless, our findings offer future potential to improve headache care even in previously underserved areas.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
TABLE Stepwise backward multivariate analysis of factors influencing agreement between HSA diagnosis and questionnaire.