Validity and reliability of the Musicians’ Health Literacy Questionnaire, MHL-Q19

High prevalence of musicians’ physical and mental performance-related health issues (PRHI) has been demonstrated over the last 30 years. To address this, health promotion strategies have been implemented at some post-secondary music institutions around the world, yet the high prevalence of PRHI has persisted. In 2018, an international group of researchers formed the Musicians’ Health Literacy Consortium to determine how best to decrease PRHI, and to examine the relationship between PRHI and health literacy. An outcome of the Consortium was the development of a new health literacy tool for musicians, the MHL-Q19, which drew from the theoretical framework of the European health literacy suite of tools, HLS-EU. The aim of the current study was to evaluate the validity and reliability of the MHL-Q19. Participants completed a battery of questionnaires, including the HLS-EU-Q16 for the assessment of general health literacy; the Musculoskeletal Pain Intensity and Interference Questionnaire for Musicians (MPIIQM); the RAND-12 quality of life questionnaire; and the General Self-Efficacy scale (GSE). We hypothesized that the MHL-Q19 would have a weak correlation with the HLS-EU-Q16; moderate correlation with the physical component scale and weak correlation with the mental component scale of the RAND-12; moderate correlation with the GSE; and finally, moderate correlation with pain interference and weak correlation with pain intensity of the MPIIQM. A total of 549 post-secondary music students from six English-speaking countries completed the battery of questionnaires, and 328 of these participants provided valid responses to the MHL-Q19 alone 2 weeks later. The tool showed acceptable internal consistency and test–retest reliability. Three of our hypotheses were supported, although the strength of the correlations varied from what we had predicted. 
The fourth hypothesis was not supported; our findings indicate that lower health literacy scores were weakly related to higher MPIIQM pain intensity and interference scores. The results of this study support the notion that musicians’ health literacy is a distinct construct that cannot be fully evaluated with existing health literacy tools. Given that this is a new instrument, the evidence presented is positive and promising. Further studies will be needed to refine the tool.

The importance of health literacy for improving personal health behaviors and community health has been recognized by the World Health Organization (WHO) and in public health research (WHO, 1998; see also Nutbeam, 1998; Okan et al., 2019). Sørensen et al. (2012) define health literacy as follows: "Health literacy is linked to literacy and entails people's knowledge, motivation and competences to access, understand, appraise, and apply health information in order to make judgments and take decisions in everyday life concerning healthcare, disease prevention and health promotion to maintain or improve quality of life during the life course." (p. 3). Research has increasingly recognized the importance of context and relational aspects in determining individuals' health literacy (McKenna et al., 2017; Geboers et al., 2018; Sørensen et al., 2021). Occupational health literacy models that account for the environmental and social determinants of health in workplaces have been developed (Rauscher and Myers, 2014; Jørgensen and Larsen, 2019), as have population-specific health literacy tools designed to address the variability of these contextual and relational aspects in different occupational settings (Shannon and Parker, 2020; Suthakorn et al., 2020).
In 2018, the Musicians' Health Literacy Consortium (MHLC) was formed, bringing together a panel of international experts in musicians' health. Our intention was to provide a global perspective on how health education can address the persistence of performance-related health issues (PRHI) among musicians, and the role health literacy may play in influencing their health behaviors (Baadjou et al., 2019). An outcome of the MHLC collaboration was the development of an occupational health literacy tool specifically for musicians, the Musicians' Health Literacy Questionnaire (MHL-Q19), which measures their abilities to access, understand, appraise, and apply health information concerning their performance health (Wijsman et al., forthcoming; see footnote 1). The aim of the current study was to evaluate the validity and reliability of the MHL-Q19 among post-secondary music students.

Materials and methods
The research team
The research team included tertiary music educators (RL, BR-S, PV, SW) and health professionals specializing in musicians' health (CG, VB, BA), all of whom, like JG, are academic researchers in musicians' health; a senior graduate student research coordinator with expertise in music education and musicians' health research (TS); and an academic researcher with expertise in applied measurement (MRR).

Developing the items
The MHL-Q19 was modeled on the theoretical framework and health literacy matrix of the European Health Literacy Survey, the HLS-EU-Q (Sørensen et al., 2012, 2013, 2015), designed to measure health literacy, in relation to general health, in European populations. The HLS-EU health literacy matrix outlines the interaction of four dimensions of health literacy (accessing, understanding, appraising and applying health information) with three domains of health (Healthcare, Disease Prevention and Health Promotion; Sørensen et al., 2012; Pelikan et al., 2019; see Figures 1, 2). These three domains can be further understood through their relationship with the four dimensions of health in the theoretical framework (Figure 1). In other words, the Healthcare domain can be understood as a person's ability to access, understand, appraise or interpret, and apply information relating to medical or clinical issues and advice. The Disease Prevention domain is concerned with accessing, understanding, appraising or interpreting, and applying information on risk factors for health. Finally, the Health Promotion domain is concerned with accessing, understanding, appraising and applying information related to determinants of health (Sørensen et al., 2012).
1 Wijsman, S., Baadjou, V., de Lisle, R., Ginsborg, J., Guptill, C., Rennie-Salonen, B., et
The HLS-EU-Q framework was chosen as a model, first, because it incorporates the assessment of health literacy competencies discussed in the published research literature on musicians' health behaviors that the team deemed essential to musicians' health literacy: decision-making, evaluation, responsibility, and self-efficacy (Daykin, 2005; Williamon and Thompson, 2006; Kreutz et al., 2008; Ginsborg et al., 2009; Araújo et al., 2017). Second, the HLS-EU-Q was designed for multi-lingual administration; therefore, deriving items for the MHL-Q19 from the HLS-EU-Q enhances the ability to adapt our new tool for different cultures, which is desirable in a tool designed for use with the global population of musicians.
Figure caption: Four dimensions of health literacy across three domains of health: original HLS-EU health literacy matrix (Sørensen et al., 2012; used with permission) and adapted MHL-Q19 matrix (Wijsman et al., forthcoming; see footnote 1).
Frontiers in Psychology 04 frontiersin.org
The MHLC undertook a consensus development process over a 6-month period, adapting the wording of the subscales of the HLS-EU matrix to make it more suitable for the musician population, and generating 19 musician-specific items based on questions in the HLS-EU-Q47 (Wijsman et al., forthcoming; see footnote 1). Items specific to musicians' performance-related health were modeled on questions in the HLS-EU-Q, avoiding the use of medical terminology and focusing on context-specific aspects of musicians' health literacy. In addition, questions were worded to be inclusive of singers (in future; the current sample consisted of instrumental musicians only); to be inclusive of all musical genres; and to be easily translated into other languages. For example, Question 17 of the HLS-EU-Q47 reads: "On a scale from very easy to very difficult, how easy would you say it is to: find information about how to manage unhealthy behavior such as smoking, low physical activity and drinking too much?" Question 1 of the MHL-Q19 frames this in terms of musician-specific health concerns, asking: "On a scale from very easy to very difficult, how easy would you say it is to: find information about healthy performance habits?" (Wijsman et al., forthcoming; see footnote 1).

Face validity
The preliminary version of the tool was distributed to a stakeholder network developed by the lead author. This group comprised 12 field experts, including administrators of music programs at three Canadian post-secondary institutions; senior leaders of five national- and provincial-level music organizations; one healthcare practitioner specializing in musicians' health; and three musicians' health researchers. These experts assessed the face validity of the questionnaire items and gave open-ended feedback on the content of the tool via email and during a teleconference, as described in Wijsman et al. (forthcoming; see footnote 1). Based on the generally positive feedback from the stakeholder network, no changes were made to the tool at this stage.

Construct validity
To assess the construct validity of the new MHL-Q19, we administered it along with four additional validated questionnaires assessing relevant constructs. We also sought to determine whether there were, indeed, elements of health literacy for musicians that were not sufficiently addressed by the other questionnaire tools. The tools employed were as follows:
a. The HLS-EU-Q16, a short form of the HLS-EU-Q47, assesses general health literacy (Pelikan and Ganahl, 2017). This tool consists of 16 questions. The overarching question is: "On a scale from very easy to very difficult, how easy would you say it is to:" with possible answers of very easy, easy, difficult, very difficult, and don't know. Very easy and easy responses are assigned a score of 1, and difficult and very difficult are assigned a score of 0. Don't know is scored as missing data. Total scores of 0-8 are defined as representing inadequate health literacy, 9-12 problematic, and 13-16 adequate. The tool takes about 3 min to complete. Concurrent validity was confirmed by correlations of 76% with the HLS-EU-Q47 and the Newest Vital Sign test (Pelikan and Ganahl, 2017; Bas-Sarmiento et al., 2020). It also demonstrates high internal consistency (Bas-Sarmiento et al., 2020). Reliability was reported to be high in several studies in which the tool was translated into other languages (Bas-Sarmiento et al., 2020; Niedorys et al., 2020; Nolasco et al., 2020).
b. The Musculoskeletal Pain Intensity and Interference Questionnaire for Musicians (MPIIQM; Berque, 2014; Berque et al., 2014; Schmidt, 2017) is one of two validated tools purporting to measure musculoskeletal pain in musicians, the other being the Musculoskeletal Pain Questionnaire for Musicians (MPQM; Lamontagne and Bélanger, 2012). This questionnaire asks participants to map the location of symptoms of playing-related musculoskeletal disorders (based on a definition related to that developed by Zaza, 1998). Participants use a Likert-type scale to answer a series of questions aiming to determine their level of pain and the degree to which it interferes with their ability to play music at the level to which they are accustomed. We chose the MPIIQM because, at the time that this study was first proposed, the MPQM had only been validated in French on a sample of 31 professional musicians. The MPIIQM validation, by comparison, was conducted with a sample of 37 musicians in English. The two-factor structure (pain intensity and pain interference) was supported by confirmatory factor analysis. The tool had high internal consistency and good test-retest reliability (Berque, 2014).
c. The RAND-12 (Johnson and Maddigan, 2004; Cheak-Zamora et al., 2009) is equivalent to the SF-12, derived from the 116-item Medical Outcomes Study designed to measure physical and mental quality of life (QOL). Both tools have been used for decades in health research around the world and have been shown to have good reliability and validity in adults with musculoskeletal (Mani et al., 2017) and chronic health conditions (Johnson and Maddigan, 2004), and mental illness (Huo et al., 2018). Two summary scores can be calculated using these tools: the physical component score (PCS) and the mental component score (MCS). Unlike the SF-12, the RAND-12 does not require a license or the use of the proprietary scoring algorithm, which assumes that the PCS and MCS are unrelated. Instead, we employed the algorithm developed by Johnson and Maddigan (2004) for use in research on individuals with chronic health conditions.
d. The General Self-Efficacy (GSE) scale (Schwarzer and Jerusalem, 1995; Schwarzer, 2014) was selected because the literature indicates that health literacy and self-efficacy are related (Xu et al., 2018; Beasley et al., 2020; Wayment et al., 2020). In addition, we chose a measure of general self-efficacy rather than specific music-related self-efficacy, as we felt that the latter would be more related to efficacy in music performance activities, and less related to health literacy. The GSE was first developed in 1979 and reduced to 10 items in 1981; the 10-item version was used in this study (Schwarzer and Jerusalem, 1995). While information does not appear to be available for psychometrics of this tool in its initial development, subsequent studies have established that the GSE has high internal consistency (α = 0.86; Scholz et al., 2002) and good test-retest reliability (ranging from 0.47 to 0.75 in studies cited by Scholz et al., 2002). With respect to validity, the tool performed as expected in relation to Bandura's (1997) theory of behavior change, correlating positively with self-esteem and optimism, and negatively with anxiety, depression and physical symptoms (Schwarzer and Born, 1997; Schwarzer et al., 1999). Finally, the scale was demonstrated to be unidimensional, supporting its use as a single measure of general self-efficacy.
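The HLS-EU-Q16 scoring rule described in item a can be illustrated with a minimal Python sketch. The function names are hypothetical, and the handling of missing data in the total score is an assumption: the text states only that don't know is scored as missing, so this sketch simply returns no total when any item is missing.

```python
from typing import Optional, Sequence

# Dichotomous scoring as described in the text; "don't know" has no entry
# here and is therefore treated as missing.
ITEM_SCORES = {"very easy": 1, "easy": 1, "difficult": 0, "very difficult": 0}


def score_hls_eu_q16(responses: Sequence[str]) -> Optional[int]:
    """Return the 0-16 total, or None if any item is missing.

    Returning None for incomplete questionnaires is an assumption made for
    this sketch, not a rule stated in the article.
    """
    if len(responses) != 16:
        raise ValueError("the HLS-EU-Q16 has exactly 16 items")
    scores = [ITEM_SCORES.get(r.strip().lower()) for r in responses]
    if any(s is None for s in scores):
        return None
    return sum(scores)


def classify_hls_eu_q16(total: int) -> str:
    """Map a total score to the categories given in the text."""
    if not 0 <= total <= 16:
        raise ValueError("total must be between 0 and 16")
    if total <= 8:
        return "inadequate"
    if total <= 12:
        return "problematic"
    return "adequate"
```

For example, a respondent answering easy to 13 items and difficult to the remaining 3 would receive a total of 13 and be classified as adequate.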
We hypothesized that these constructs would be related to musicians' health literacy and, therefore, that scores on the MHL-Q19 would be positively correlated with those on the four questionnaires. Our hypotheses were as follows:
1. There would be a weak correlation between MHL-Q19 and HLS-EU-Q16 scores because the MHL-Q19 is designed to measure context-specific health literacy in musicians related to their performance-related health, using questions that are similar to those in the HLS-EU-Q47 from which the HLS-EU-Q16 is derived, but that capture a more specific subset of health literacy concerns not addressed in the general health literacy questionnaire (please see Methods: Developing the items);
2. There would be a moderate correlation between MHL-Q19 and MPIIQM pain interference scores but a weak correlation between MHL-Q19 and MPIIQM pain intensity scores. While there is insufficient evidence about the relationship between health literacy and pain intensity and interference, in the research team's expert opinion, informed by combined clinical, pedagogical and applied research experience, musicians whose symptoms interfere with performance are more motivated to seek care; thus, we postulated that they would also be more motivated to seek information, thereby increasing their music performance-specific health literacy;
3. There would be a moderate correlation between total MHL-Q19 scores and the physical component scores, and a weak correlation between MHL-Q19 scores and mental component scores of the RAND-12. This is because the research literature suggests (Douglas, 2019) that information about physical health is more accessible to music students than information about mental health;
4. There would be a moderate correlation between MHL-Q19 and GSE scores because the literature has demonstrated a relationship between health literacy and self-efficacy (Xu et al., 2018; Beasley et al., 2020; Wayment et al., 2020).

Participants and recruitment
The MHLC panel agreed with other researchers that post-secondary training is an ideal time to deliver health promotion and injury prevention education (Chesky et al., 2006; Ginsborg et al., 2012; Norton, 2016; Salonen, 2018), because music students are at significant risk of PRHI, are highly engaged in educational activities, and are motivated to take actions to safeguard their future careers as musicians. Therefore, this initial validation study was completed with post-secondary music students from 13 institutions worldwide. We excluded singers from the sample, because the MPIIQM has only been validated with instrumental musicians. All recruitment was conducted electronically, by email to students directly and through social media using avenues frequented by post-secondary music students.
To calculate the required sample size, we looked at the total number of music students from available data: in the US, 332,297 post-secondary students in 2011, or ~0.1% of the current US population (College Music Society, 2015), and in Australia, 3,500 students in 2011, or ~0.01% of the current Australian population (Global Access Partners, 2011). We used 0.05% as the estimated percentage of the population in the other countries where data were collected (Canada, New Zealand, South Africa and the United Kingdom), for a total of 421,297 students. The estimated sample size required to achieve 95% confidence with 5% margin of error is 384 students (Qualtrics sample size calculator, https://www.qualtrics.com/blog/calculating-sample-size/).
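The figure of 384 can be reproduced with the standard sample-size formula for a proportion with a finite population correction. The sketch below is an illustration only: the internals of the Qualtrics calculator are not stated in the text, and the choice of p = 0.5 (the most conservative proportion) and z = 1.96 are assumptions, as is the function name.

```python
import math


def required_sample_size(population: int,
                         z: float = 1.96,      # z-score for 95% confidence (assumed)
                         margin: float = 0.05,  # 5% margin of error
                         p: float = 0.5) -> int:
    """Sample size for estimating a proportion in a finite population."""
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# Estimated total number of post-secondary music students given in the text.
print(required_sample_size(421_297))
```

Under these assumptions, the uncorrected value is about 384.16 and the corrected value about 383.8, which rounds up to the 384 students reported.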
A total of 549 post-secondary music students completed the battery of questionnaires, including participants from Canada (n = 253), the United States (n = 138), South Africa (n = 43), Australia (n = 57), New Zealand (n = 31), and the United Kingdom (n = 27). See Table 1 for a summary of the demographic characteristics of this sample. Participants were largely in the early-20s age bracket, which was only slightly above our minimum age of inclusion in the research (18 years old). For this reason, the age data were not normally distributed and the measure of central tendency for this variable in Table 1 is expressed as median (interquartile range). The majority of the students at these schools were at the Bachelor's level, and studying Western art (classical) music, although some were also in contemporary or jazz programs. Students could be studying in any musical discipline (e.g., performance, composition and theory).
To ensure that data from different nations could be compared to each other, we conducted a test equivalence analysis (Abubakar et al., 2013). Cronbach's alpha was acceptable in all countries involved in the study, with the highest alpha levels among Australia (0.915), Canada (0.932), and the United States (0.917), and only slightly lower levels among New Zealand (0.809), South Africa (0.872), and the United Kingdom (0.893). These

Procedure
Research procedures were reviewed and approved by the Office of Research Ethics and Integrity at the University of Alberta, which was the institutional affiliation of the first author when the study began, as well as the local research ethics board at each of the universities from which participants were recruited. Recruitment emails and social media materials included a link to the complete survey, including the MHL-Q19 and four additional questionnaires, hosted online using the Qualtrics platform (see footnote 2). When they had completed the survey, participants could provide their email address for follow-up to assess test-retest reliability. Those who provided their email addresses were sent a new link to a survey consisting only of the MHL-Q19, which took ~5 min to complete. Participants who completed the second survey were offered a gift card for $5 Canadian or a similar value in a local currency. In the majority of cases, the gift cards were digital and two options were offered (e.g., Starbucks and Amazon). Local contacts were consulted as to the appropriateness of the incentives at each site, and if electronic gift cards were not available, arrangements were made with a suitable vendor to distribute gift cards to participants in accordance with the policies of the local research ethics board.
The second survey (MHL-Q19 only) was sent ~2 weeks after participants had responded to the first survey. This is consistent with the relevant literature (Nunnally and Bernstein, 1994; Marx et al., 2003; Bardhoshi and Erford, 2017).

Data analysis plan
Data were cleaned and collated. Frequencies, distribution of data and ranges of scores were calculated. Frequencies of missing items per question were evaluated. Non-parametric statistical tests were used where normality of data distribution was not evident. Normality of data was assessed by inspecting histograms, skewness, and kurtosis statistics. Responses of don't know on the MHL-Q19 were considered missing data and, where there were more than 20% missing responses for any participant, that participant's responses were deemed invalid. We did not identify any outliers. We conducted an inter-item correlation analysis of responses to the MHL-Q19 to assess potential relationships between items. Internal consistency was evaluated by calculating Cronbach's alpha. Values of Cronbach's α between 0.70 and 0.95 are considered good (Terwee et al., 2007).
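The missing-data rule and the internal consistency statistic described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, responses are assumed to be numerically coded with None standing for don't know, and the alpha calculation assumes complete rows (listwise deletion is an assumption of this sketch).

```python
from statistics import variance
from typing import Optional, Sequence


def is_valid_participant(responses: Sequence[Optional[int]],
                         max_missing: float = 0.20) -> bool:
    """Apply the rule from the text: more than 20% missing is invalid."""
    missing = sum(1 for r in responses if r is None)
    return missing / len(responses) <= max_missing


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete rows (participants x items)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose to items x participants
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical data: three participants, three perfectly consistent items.
data = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
complete = [row for row in data if all(r is not None for r in row)]
print(cronbach_alpha(complete))
```

With 19 items, a participant with three don't know responses (about 16% missing) would be retained under this rule, whereas one with four (about 21% missing) would not.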
To assess construct validity, we calculated correlations between the scores on the different questionnaires. A correlation lower than 0.30 was considered weak, 0.30-0.60 moderate, and higher than 0.60 strong (Terwee et al., 2007). As responses in the MHL-Q19 are scored using a Likert-style scale, the data were considered ordinal and nonparametric correlation statistics (Spearman rho) were used for hypothesis testing.
2 www.qualtrics.com
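As an illustration of the correlation approach, the sketch below computes Spearman's rho as the Pearson correlation of average ranks and labels its strength using the cut-offs given above. The function names are hypothetical, and applying the cut-offs to the absolute value of rho (so that negative correlations are labeled by magnitude) is an assumption. Significance testing is not included.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def correlation_strength(rho: float) -> str:
    """Label |rho| with the cut-offs cited in the text."""
    r = abs(rho)
    if r < 0.30:
        return "weak"
    if r <= 0.60:
        return "moderate"
    return "strong"
```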

Preliminary analysis
Data were cleaned and checked for missing values. As with previous HLS-EU tools assessing health literacy, any response of don't know was considered a missing value. The responses very easy, easy, difficult and very difficult were considered valid responses. Participants who gave fewer than 80% valid responses across the whole survey were excluded from analysis, in line with previous tool development research in health literacy (Pelikan et al., 2019). A total of 73 participants who responded to the first survey were excluded from the factor analysis for this reason, leaving 439. A total of 37 participants who responded to the second survey were excluded for the same reason, and 126 participants did not complete the follow-up survey. Valid responses to both the first and second surveys were provided by 350 participants, of whom 22 were eliminated because they reported voice as their primary instrument. As shown in Table 1, this left 328 valid responses for the test-retest reliability analysis. Table 2 shows descriptive statistics for responses to the MHL-Q19 included in the factor analysis and hypothesis testing (column 2). A slight floor effect was evident for responses to Questions 13-17, with 17%-25% of responses in the lowest scale option. At the second administration, only questions 15 and 16 demonstrated this floor effect. Frequencies of responses to each of the questions in the MHL-Q19 are provided as Supplementary material.
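The floor-effect check can be illustrated as the share of valid responses to an item that fall in the lowest scale option. In this sketch the function name is hypothetical, and the numeric coding (1 as the lowest option, None for don't know) is an assumption, since the item coding is not given in this section.

```python
from typing import Optional, Sequence


def lowest_option_share(item_responses: Sequence[Optional[int]],
                        lowest: int = 1) -> Optional[float]:
    """Proportion of valid (non-missing) responses in the lowest option."""
    valid = [r for r in item_responses if r is not None]
    if not valid:
        return None  # no valid responses to this item
    return sum(1 for r in valid if r == lowest) / len(valid)


# Hypothetical responses to one item; None is a don't know response.
print(lowest_option_share([1, 2, 3, 4, None]))
```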
We also conducted an inter-item correlation analysis of MHL-Q19 responses to assess potential relationships between items. Inter-item correlation was moderate, other than for Question 19, which was weakly correlated with 14 of the 18 other items (r < 0.3, p < 0.001).

Exploratory factor analysis
As described above, the items comprising the MHL-Q19 were developed using an adapted version of the HLS-EU Health Literacy Matrix (Figures 1, 2). This matrix provided a number of theoretically sound options for a factor solution for the present data. For this reason, Exploratory Factor Analysis was selected as the best choice of analysis for dimension reduction.
Principal axis factoring was conducted on the 19 items, using oblique rotation (direct oblimin). Oblique factor rotation was used because it would be theoretically coherent to assume that factors of health literacy are related to one another (Sørensen et al., 2012). The Kaiser-Meyer-Olkin measure verified the sampling adequacy for the analysis (KMO = 0.91) and all KMO values for individual items were >0.857, indicating that the sample size was sufficiently large for the factor analysis. Initial analysis found four factors with eigenvalues >1. However, the mean communality was 0.456, suggesting that Kaiser's criterion of extracting factors with eigenvalues >1 may be inappropriate. The scree plot was ambiguous, with grounds for extracting one, two, or three factors. Ultimately, we chose to retain three factors because of the convergence of support from the inflection of the scree plot and the theoretical congruence of the factors extracted.
This initial solution gave us three factors that were congruent with the theoretical foundation of the tool, in that they correspond roughly to the horizontal axis of the HLS-EU Health Literacy matrix displayed in Figure 1. The pattern matrix for this solution is displayed in Table 3 and the correlation between the factors is shown in Table 4.

Naming the factors
The factors extracted were largely in line with the theoretical constructs of Health Promotion, Disease Prevention, and Healthcare from the HLS-EU Health Literacy matrix. Closer examination of the factor loadings for each question revealed a slightly different pattern in our findings than in the design of the questionnaire. Question 18 ("Understand treatment advice if you have performance health issues") and Question 19 ("Follow treatment advice if you have performance health issues") were notable in that they both loaded onto a factor with the Disease Prevention items, rather than with Healthcare as we had anticipated when developing these items.

Hypothesis testing
We tested our hypotheses using summary scores for each of the subscales of the MHL-Q19 determined by the factor analysis. Table 5 shows the correlations between scores for each of the subscales of the MHL-Q19 and for the other four comparator tools. The data were ordinal, so Spearman's rho was used as the correlation statistic. We also conducted the hypothesis testing using an overall summary score for the MHL-Q19, and this scoring structure was likewise supported by our factor analysis.
Hypothesis 1 was supported in that scores on the MHL-Q19 and HLS-EU-Q16 were significantly moderately (rather than weakly) correlated.
Hypothesis 2 was not supported in that scores on the MHL-Q19 were weakly but significantly negatively correlated with scores for both MPIIQM pain interference and intensity; we had predicted a moderate positive correlation with the former, and a weak positive correlation with the latter.
Hypothesis 3 was partially supported in that scores on the MHL-Q19 were weakly positively correlated with both the physical and mental component scores of the RAND-12; we had predicted a stronger correlation with the former than the latter.
Hypothesis 4 was partially supported in that scores on the MHL-Q19 were weakly but significantly correlated with scores on the GSE; we had predicted a moderate correlation between them, on the basis of the previous literature reporting a relationship between health literacy and self-efficacy.

Reliability
Each subscale demonstrated high reliability using Cronbach's alpha (Health Promotion, α = 0.802; Risks to Performance Health, α = 0.854; Healthcare, α = 0.826). When the same analyses were run using the whole scale, Question 19 demonstrated a corrected item-total correlation (r = 0.408, p < 0.01) that was considerably lower than those of the other items (r values between 0.503 and 0.67). Cronbach's alpha for the entire scale was also high (α = 0.919).
Each of the subscale scores was moderately and significantly correlated between the first and second administrations of the new tool. Health Promotion (rho = 0.651, p < 0.001), Risks to Performance Health (rho = 0.692, p < 0.001), and Healthcare (rho = 0.629, p < 0.001) all demonstrated acceptable levels of test-retest reliability. Test-retest reliability of an overall summary score was higher (rho = 0.779, p < 0.001).

Discussion
This is the first study evaluating the validity and reliability of a new instrument for measuring musicians' health literacy. The findings support the need for the development of such a tool and show promising psychometric features with great potential for measuring this construct. Some variation in Cronbach's alpha levels was present across the countries where participants lived; note that in some countries multiple schools participated in the study, and participation varied between schools. Nevertheless, these small differences in Cronbach's alpha do suggest a need for future research to assess the validity of the tool in different geographical and/or cultural settings. With respect to construct validity, some of our findings were consistent with our hypotheses, while others were not. Participants' responses to the HLS-EU-Q16 were moderately correlated with their responses to the MHL-Q19, whereas we had predicted a weak correlation. This indicates that musicians' health literacy and general health literacy are related constructs; however, the overlap between the two is not strong. This finding further justifies the need for this musician-specific health literacy assessment tool as argued by Wijsman et al. (forthcoming; see footnote 1).
Responses to the GSE were less strongly correlated with responses to the MHL-Q19 than we had predicted. As described in the Introduction, the research literature suggests that health literacy and self-efficacy are related constructs. However, these conclusions have been drawn from research using tools that are designed to measure general health literacy in large populations. Our results, by contrast, indicate that self-efficacy is less important in determining postsecondary music students' ability to access, understand, appraise, and apply health information related to music performance. This might suggest that other factors may have more influence on music students' health literacy, and setting the expectation that they will take responsibility for their own health and well-being may be unfounded.
While we had predicted a stronger correlation between the RAND-12 physical component scores and MHL-Q19 scores than between the mental component scores and MHL-Q19 scores, correlations between both physical and mental health component scores and the MHL-Q19 scores were weak. This suggests that music students' health literacy is weakly related to, but not entirely predicted by, their individual health status alone.
Our results did not support the hypothesis that participants scoring higher on the MPIIQM for pain and its interference with performance would also score higher for musicians' health literacy. While the correlation was relatively weak, these participants in fact scored lower on the MHL-Q19. In other words, lower health literacy in these student musicians was associated with more pain and interference symptoms. It is possible that students who have had, and possibly struggled with, performance-related health issues may have worse health outcomes and, therefore, less confidence in being able to cope with these issues, leading to lower health literacy scores. However, research has also shown that music students do not always consider PRHIs to be health issues (Guptill et al., 2000; Waters, 2019, 2020). Therefore, when they are faced with PRHIs, they may not try to access health information at all, nor consult healthcare professionals or identify health resources to assist them in addressing these concerns. The relationship between lower scores on the MHL-Q19 and higher scores on the MPIIQM therefore may reflect a lower level of health literacy in these music students, specifically in relation to their occupational health. It is also possible that lower health literacy may lead to more PRHIs, a primary rationale for the development of the MHL-Q19 (Wijsman et al., forthcoming; see footnote 1). The relationship between pain intensity/interference and musicians' health literacy appears to be complex and could prove to be an important focus of future research on musicians' health.
Figure 2 shows the questions from the MHL-Q19 and their corresponding domains from the adapted matrix created during the development of this new measurement tool. Comparing this figure with Table 3, it can be seen that the results of the factor analysis are a good but not perfect fit with the theoretical design of our questionnaire.
As indicated above, Q18 and Q19 mapped onto domains that were not those we had intended when we created the tool. As a result, the construct represented by this domain, which was called Disease Prevention in the HLS-EU tools, might not represent the way participants in our survey understood these particular items. Based on the grouping provided by their responses, we have proposed a new name for this domain: Risk to Performance Health. Further evaluation of the tool could include using methods such as cognitive interviews (Willis and Artino, 2013; Willis, 2018), which ask participants to complete questionnaires while describing their thought processes as they complete the tool. Such evaluation may shed more light on how musicians relate questions in these three domains to one another.
While the internal consistency of the MHL-Q19 is high, its test-retest reliability was somewhat lower than we would have expected. Streiner et al. (2008) suggest three possible explanations for low test-retest reliability: 1. The scale itself is unreliable; 2. The construct changes over time; or 3. The participants' perception of the construct is changed by filling out the questionnaire. The latter explanation is the most likely. In retrospect, the majority of the institutions from which we recruited participants were those with which we had previous relationships, or where we had contacts; and these institutions were more likely to be sympathetic to the need for health promotion and injury prevention education, or indeed, already offered some health education to their students. It is possible, therefore, that participants were, in fact, learning about health promotion and injury prevention over the course of the 2 weeks between the first and second administration of the MHL-Q19. It is also possible that their health literacy was quite low to begin with, and that completing the questionnaire caused them to consider these issues for the first time or to take them more seriously, and thus, this changed the construct for them. For example, some of the Health Promotion items asked participants to consider what they would do if they had a PRHI. After having been prompted to consider this possibility, their awareness and confidence in accessing, understanding, appraising and applying information related to PRHI may have increased. If this is indeed the case, it would suggest that initial interventions that were focused on raising awareness of PRHI and health literacy might have an impact. However, we cannot rule out the possibility that the wording of the questions led to lower test-retest reliability. Given that the wording of MHL-Q19 items was chosen to be simple and relatable for musicians, and to resemble the items in the well-established HLS-EU-Q tools that have been used in many countries around the world (Wijsman et al., forthcoming; see footnote 1), we think that this is unlikely.
During the factor analysis, Questions 6 and 18 were found to load onto more than one factor. In addition, as mentioned above, Questions 18 and 19 loaded onto factors that were not the ones we initially assigned to them during the design of the questionnaire, namely Disease Prevention (Risks to Performance Health) rather than Healthcare. This caused us to reflect on the questions themselves. Questions 18 and 19 would be difficult for participants to answer if they had never experienced a PRHI, and both questions also introduce an "if," thereby requiring the respondent first to decide if they had had a PRHI, and second to decide how difficult it would be to understand or follow treatment advice they might not have received. The loading of Questions 18 and 19 onto the Risks to Performance Health factor rather than Healthcare suggests that our thinking as developers when we designed these items might be somewhat different from that of participants when answering these questions. Future research employing cognitive interviews with participants about their thought processes as they complete the questionnaire, as described previously, could also help us to change the wording of the questions so that participants' responses are better aligned with our intended outcomes.
Overall, the psychometric properties of this new tool are promising. We anticipate that this tool could be used by healthcare professionals, educators, and educational programs to measure students' health literacy at baseline (e.g., upon entry to the program or at the beginning of treatment), to determine the effectiveness of educational interventions or healthcare. It could also be used in conjunction with measures of health to further our understanding of the relationship between musicians' health literacy and their health and well-being.

Limitations
Limitations of this validation study include the fact that the questionnaire was administered in English only, in English-speaking, primarily higher-income countries. We intend to translate this tool into other languages and test non-English versions in a variety of countries in future.
As with the development of any tool to measure a newly identified construct, it is possible that the process we followed to develop our tool from the pre-existing HLS-EU family of questionnaires resulted in a tool that did not fully address health literacy for musicians. There are other ways that the development of such a tool could progress, such as beginning the process from qualitative interviews with musicians. Other health literacy tools have been developed using such a process (Osborne et al., 2013). However, when considering the selection of a model for the development of our tool, we rejected Osborne et al.'s health literacy questionnaire because it only addressed one of four health literacy competencies (decision-making/critical thinking, evaluation, responsibility, and confidence/self-efficacy) that were suggested in the literature to be enablers of positive health outcomes for musicians (Wijsman et al., forthcoming; see footnote 1). At this time, our tool is the only one to address health literacy in musicians, and its utility in practice and research remains to be demonstrated in future research.
During the development of the questionnaire, the MHLC chose to maintain the scoring structure of the HLS-EU, including the four valid responses (very difficult, difficult, easy, and very easy), the optional response don't know, scored as missing data, and the determination that 20% or more don't know responses rendered the participant's responses invalid. This decision was made to maximize the comparability with the HLS-EU suite of tools. It should be noted that the HLS-EU tools were initially designed to be administered verbally, either in person or over the telephone. It seems reasonable to assume that don't know might not be selected as often when the questionnaire is administered verbally as when don't know appears as an available response option in print or online, and this may explain the lower invalidity rate found for HLS-EU questionnaires than for the MHL-Q19. The developers of the HLS-EU questionnaires did, however, intend their tools to be delivered online in future applications, although the literature on the HLS-EU-Q (Sørensen et al., 2013, 2015; Pelikan et al., 2019) does not include any visual representation of how it would appear. Thus, while we acknowledge that there might be differences between the rates of invalid responses if the MHL-Q19 were to be administered verbally as well as online, we think it unlikely that these differences would be significant.
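As an illustration only, the response-handling rule described above could be sketched as follows. The numeric codes 1 to 4 and the function names are assumptions made for this sketch; the text specifies only the four response labels, the treatment of don't know as missing data, and the 20% threshold.

```python
from typing import List, Optional, Sequence

# Assumed numeric coding; the source names the labels but not their values.
RESPONSE_CODES = {"very difficult": 1, "difficult": 2, "easy": 3, "very easy": 4}
DONT_KNOW = "don't know"
INVALID_THRESHOLD = 0.20  # 20% or more "don't know" responses invalidates a respondent


def code_responses(responses: Sequence[str]) -> List[Optional[int]]:
    """Map response labels to numbers; "don't know" becomes None (missing data)."""
    return [None if r == DONT_KNOW else RESPONSE_CODES[r] for r in responses]


def is_valid(responses: Sequence[str]) -> bool:
    """Return False when the share of "don't know" responses is 20% or more."""
    share = sum(r == DONT_KNOW for r in responses) / len(responses)
    return share < INVALID_THRESHOLD
```

Under this rule, a respondent to a 19-item questionnaire could select don't know up to three times (3/19, about 16%) and remain valid, whereas four such responses (4/19, about 21%) would render the response set invalid.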

Conclusion
In this paper, we have evaluated the validity and reliability of a new health literacy questionnaire for musicians, the MHL-Q19. The tool showed acceptable reliability. Questions that arose from the reliability evaluation point to the intriguing possibility that music students' health literacy may be changed by completing this questionnaire, which should be explored in future research. Factor analysis indicates that MHL-Q19 questions generally map onto the domains of the health literacy conceptual framework as intended, with a few exceptions. Our hypotheses about how the new tool would perform in comparison with other validated instruments were partially supported, with some unexpected results. These can be partly explained by reflection on additional literature, and some point to the need for more research to further investigate the utility of this new questionnaire. In addition, because musicians' health literacy is a developing construct, the partial support for the hypotheses described above may provide useful insight into the construct of musicians' health literacy itself.
In summary, given that this is a new instrument, the validity evidence presented is positive and promising. Further studies are needed to refine the tool. The results from this study support the view of the MHLC that musicians' health literacy is a distinct construct that cannot be fully evaluated with existing health literacy tools. The MHL-Q19 has demonstrated great potential for measuring this construct and we anticipate that future research will further strengthen both the tool itself and our understanding of musicians' health literacy.

Data availability statement
The data in this article are potentially sensitive health information. We do not have ethics approval to share them outside of the research team. Requests to access the datasets should be directed to cguptill@uottawa.ca.

Ethics statement
The studies involving human participants were reviewed and approved by the University of Alberta Research Ethics Office. A letter of information was presented as the first page of the online surveys and consent was inferred when participants completed the questionnaires.

Author contributions
CG was the lead researcher and primary author for the manuscript, with significant contributions from TS and SW. CG, TS, VB, MRR, and SW were the primary researchers designing and implementing this study. TS, with assistance from MRR, analyzed the data. RL, BR-S, JG, and SW assisted in obtaining research ethics board approvals and recruiting participants. CG, VB, RL, BR-S, BA, JG, PV, and SW participated in the design of the MHL-Q19 questionnaire. SW is the Lead Academic for the Musicians' Health Literacy Consortium. All authors contributed to the article and approved the submitted version.