Comparison of Psychometric Characteristics for Five Versions of the Interpersonal Needs Questionnaire in Teenagers Sample

The Interpersonal Needs Questionnaire (INQ) is a self-report measure of perceived burdensomeness and thwarted belongingness, and five versions of it have been used in recent studies. However, results from studies using different versions differ considerably. Suicidal behavior among teenagers has attracted much attention, yet it remains uncertain which version is most suitable for teenage samples. It is therefore important to compare the versions of the INQ and identify the one with the best psychometric properties for predicting teenagers' acquired capability for suicide, so that timely help can be provided and teenage suicide rates reduced. This study compared the construct validity, internal consistency, validity, and average test information of each version in a sample of teenagers. Results showed that the 10-item version provided the most average test information in both the thwarted belongingness subscale and the perceived burdensomeness subscale, suggesting that the INQ-10 is the most suitable version for teenage samples.


INTRODUCTION
Suicide is a major social and public health problem facing the world. Nearly 800,000 people commit suicide each year, and the number of suicide attempts is many times the number of suicides. Suicide occurs throughout the lifespan and was the third leading cause of death among 15-19-year-olds worldwide in 2016 (World Health Organization, 2019). In the general population, attempted suicide is the biggest risk factor for suicide. Suicide attempts peak in mid-puberty (Carballo et al., 2019). Suicide and serious self-harm not only seriously endanger the lives and health of young people, but also cause serious losses to individuals, families, and society.
The interpersonal-psychological theory of suicide (IPTS) was first proposed by Joiner (2005) and further expanded by Van Orden et al. (2010). This theory goes beyond previous theories of suicide in that it explains why the vast majority of people with suicidal ideation do not attempt suicide. IPTS proposes that suicidal behavior occurs only when an individual has both the desire to die and the acquired capability to engage in suicidal behavior. The desire to die is an individual's desire to end her/his life, which roughly corresponds to the common definition of suicidal ideation (Van Orden et al., 2008a). The acquired capability to engage in suicidal behavior is learned: through repeated exposure to painful and provocative events, the fear of death is reduced and the tolerance of physical pain is enhanced. According to the IPTS (Joiner, 2005; Van Orden et al., 2010), whether an individual has suicidal ideation depends on whether the individual's need to belong is unmet (thwarted belongingness, TB) and whether the individual considers himself/herself a burden to others (perceived burdensomeness, PB), and suicidal ideation will not turn into suicidal behavior until the acquired capability is large enough. Therefore, the interpersonal-psychological theory of suicide is regarded as an ideation-to-action framework (Klonsky and May, 2014; Klonsky et al., 2016).
Since the interpersonal-psychological theory of suicide was proposed, it has inspired many empirical studies on the causes of suicidal ideation, attempts, and fatalities. Research on the theory has been conducted in different samples, such as undergraduates (Hagan et al., 2015; Suh et al., 2017), prison inmates (Mandracchia and Smith, 2015), physicians (Fink-Miller, 2015), older adults (Cukrowicz et al., 2013), psychiatric inpatients and outpatients (Monteith et al., 2013), military service members (Bryan et al., 2010), sexual minorities (Silva et al., 2015), and firefighters (Chu et al., 2016). The theory was also validated cross-culturally across Korean and US undergraduate students (Suh et al., 2017). Moreover, based on the IPTS, Van Orden (2009) further confirmed and extended the theory by combining perceived burdensomeness and thwarted belongingness into interpersonal needs, and constructed a corresponding 25-item Interpersonal Needs Questionnaire (INQ) to reflect whether the current interpersonal needs of the individual were met. On account of the multicollinearity between thwarted belongingness and perceived burdensomeness, the 12-item INQ was later developed (Van Orden et al., 2008a), and the original authors also proposed a 15-item version (Van Orden et al., 2012). An 18-item version was validated in a book on the interpersonal theory (Joiner et al., 2009) and a 10-item INQ was validated for use in military samples (Bryan et al., 2010). Each of the shorter versions of the INQ is a subset of the original 25-item version. The 18-item version has been used primarily in older adult, veteran, and college student samples in the US and college student samples in China (e.g., Davidson et al., 2011; Rasmussen and Wingate, 2011; Wong et al., 2011; Monteith et al., 2013; Zhang et al., 2013; Suh et al., 2017).
The 15-item version, introduced as an empirically derived refinement of the INQ-25, has been used in college student samples in the US, Singapore, China, and Switzerland (INQ-15; e.g., Van Orden et al., 2012; Hill and Pettit, 2013; Li et al., 2015; Baertschi et al., 2017; Teo et al., 2018), and Hallensleben et al. (2016) administered the INQ-15 to a sample of the German general population aged 14-75 years. The 12-item version has been used in a variety of samples in the US (e.g., Van Orden et al., 2008a; Davidson et al., 2009; Freedenthal et al., 2011; Hill and Pettit, 2012; Lamis and Lester, 2012). The 10-item version has been used primarily in military samples (e.g., Bryan et al., 2010, 2012, 2013; Bryan, 2011). According to previous studies (e.g., Bryan, 2011; Davidson et al., 2011; Freedenthal et al., 2011; Baertschi et al., 2017), the original version and its four shorter versions had acceptable internal consistency and validity.
Although previous studies showed that the INQ significantly predicted suicidal behaviors (e.g., Van Orden et al., 2008a,b), many studies have found that different versions of the INQ differ in how they predict suicidal behaviors. For example, studies using the INQ-25 (Anestis and Joiner, 2011), the INQ-18 (Wong et al., 2011), the INQ-12 (Hill and Pettit, 2012), and the INQ-10 (Bryan et al., 2012) found that perceived burdensomeness was a significant predictor, but thwarted belongingness was not. In contrast, both perceived burdensomeness and thwarted belongingness had adequate predictive validity in the 12-item version (Lamis and Malone, 2011) and the 15-item version (Van Orden et al., 2012). These differences in predictive validity leave researchers uncertain about which version of the INQ should be used. Furthermore, the previous literature documents how items were selected for the INQ-15 (Van Orden et al., 2012), but for the other versions it is unclear how items were selected from the original 25 items. Therefore, it is necessary to investigate and compare the psychometric characteristics of the five versions of the INQ in the same sample.
In the past, most self-report measures of psychological constructs have been assessed through classical test theory (CTT), which focuses on construct validity, internal consistency, and test-retest stability (Hunsley and Mash, 2008). However, CTT methods for assessing the interpersonal needs of an individual rely on the total score or a transformed total score and fail to offer more direct information about where the individual falls on the range of interpersonal needs. This goal can be realized through the application of item response theory (IRT). As the basis of the latest psychometric techniques, IRT methods can provide analyses of individual latent traits (e.g., interpersonal needs) and item characteristics.
Based on IRT, item and test information functions can be calculated from the estimated parameters of IRT models to show graphically which regions of the latent trait continuum are measured most precisely. Item and test information functions for instruments that assess the same latent trait are comparable across measures (Fayers, 2004). Therefore, with IRT methods, multiple inventories can be compared on a single, common metric. What is more, IRT methods can indicate which item or inventory provides the most information at different levels of the latent trait (Olino et al., 2012).
Until now, no study has compared the psychometric characteristics of different INQ versions based on IRT in teenage samples, and no study has investigated which version of the INQ is most suitable for teenage samples. But the issue of teenage suicide cannot be ignored. According to the Centers for Disease Control and Prevention, the incidence of suicide attempts peaks in mid-puberty, and the suicide mortality rate increases steadily with age throughout the teenage period. Suicide is the third leading cause of death among young people aged 10-24 (Centers for Disease Control and Prevention, 2017). The suicidal characteristics of teenagers are different from those of adults (Parellada et al., 2008). Hence, an effective tool is needed to assess the range of interpersonal needs of young people and to identify those at higher risk of suicidal behaviors. Predicting which individuals are likely to commit suicide will help establish strategies for youth suicide prevention and intervention. It is important to explore the potential differences among versions of the INQ to identify the version with the best psychometric properties for assessing the range of interpersonal needs. Moreover, many empirical studies have found that the relationship between thwarted belongingness and suicidal behaviors is generally weaker than that for perceived burdensomeness (Ma et al., 2016; Chu et al., 2017). It is necessary to verify whether this phenomenon exists in teenagers.
In this study, we investigated and compared the psychometric properties of five versions of the INQ based on an IRT model, to identify the version (or versions) most suitable for assessing interpersonal needs in teenage samples. Since the INQ-15 is a refinement of the INQ-25 (see Van Orden et al., 2012) and it is unknown how items were selected from the original 25-item version for the other versions, we hypothesized that the 15-item version would have adequate psychometric characteristics concerning factor structure, internal consistency, and validity. Following Hill et al. (2015), we hypothesized that the 15-item version and the 10-item version would show more satisfactory psychometric characteristics than the other versions. Based on the IPTS and previous studies, we also hypothesized that in the INQ-12, INQ-18, and INQ-25, perceived burdensomeness, but not thwarted belongingness, would significantly predict capability for suicide, and that both PB and TB would significantly predict capability for suicide in the INQ-15 and the INQ-10. Since no research has tested the average test information or the differential item functioning caused by gender in the five versions, no specific hypotheses were made about them. The software R (Version 3.3.21) and the R package mirt (Version 1.24; Chalmers, 2012) were employed to estimate item parameters. Moreover, we compared which version could provide greater average test information over a larger range of the latent trait. In addition, a conversion table was provided to obtain the transformed scores of each version. Finally, this study also aimed to test the hypotheses of the IPTS and guide its refinement.

Participants
Complete data were available for 905 individuals after cases with missing responses were deleted. Participants were Chinese teenagers from four middle schools in two provinces of China. The mean age of the participants was 15.03 years (SD = 1.70; range 12 to 18 years). Participants were predominantly male (60.4%), only children (77.8%), and urban (78.6%). Participants were from six grades: Junior One (10.2%), Junior Two (22.9%), Junior Three (10.5%), Senior One (16.6%), Senior Two (16.8%), and Senior Three (23.1%). Both written and verbal consent were obtained from the parents of the participants before the participants took part in the study. This study was approved by the Ethics Committee of Jiangxi Normal University and was conducted following the ethical principles of the Declaration of Helsinki.

Measures
The Interpersonal Needs Questionnaire (INQ; Van Orden, 2009) is a 25-item self-report measure used to assess thwarted belongingness and perceived burdensomeness. Each of the 18-, 15-, 12-, and 10-item versions is a subset of the original 25-item version (see Table 1). Each item is rated on a 7-point Likert scale ranging from 1 (not at all true for me) to 7 (very true for me). Higher scores indicate greater thwarted belongingness and perceived burdensomeness. Cronbach's alpha coefficients of the five versions ranged from 0.91 to 0.95 in the current study.
The revised UCLA Loneliness Scale (Russell et al., 1980) is a 20-item self-report measure of loneliness. Participants are asked to rate the frequency of satisfaction and dissatisfaction with social relationships. All items are on a 4-point Likert scale ranging from 1 (never) to 4 (always). The higher scores represent higher levels of loneliness. Russell et al. (1980) reported a high internal consistency for the scale (Cronbach's alpha = 0.94), as well as support for the validity of the scale. In the current study, the scale had a high internal consistency (Cronbach's alpha = 0.91).
The Perceived Social Support Scale (PSSS; Blumenthal et al., 1987) is a 12-item self-report measure of social support. Participants rate their degree of agreement with statements describing people who can give them support on a 7-point Likert scale, ranging from "strongly agree" to "strongly disagree." Higher scores represent more social support. Blumenthal et al. (1987) reported good internal consistency (Cronbach's alpha = 0.88). In the current study, the scale had high internal consistency (Cronbach's alpha = 0.94).
The Acquired Capability for Suicide Scale-Chinese Version (ACSS-CV; Yang et al., 2019) is a 14-item self-report measure of acquired capability for suicide. Participants rate their degree of agreement with statements that describe their fearlessness about lethal self-injury on a 5-point Likert scale, ranging from "I do not agree at all" to "I fully agree." Higher scores represent greater fearlessness about lethal self-injury. Yang et al. (2019) reported acceptable internal consistency (Cronbach's alpha = 0.78). In this study, the scale had good internal consistency (Cronbach's alpha = 0.80).

Construct validity
Confirmatory factor analysis (CFA) was used to examine the fit of the factor structure of the five versions of the INQ, and several global fit indices were used to evaluate model fit, including the root mean square error of approximation (RMSEA), the standardized root mean square residual (SRMR), the comparative fit index (CFI), and the Tucker-Lewis index (TLI). RMSEA and SRMR values < 0.08 (Browne and Cudeck, 1992; Hu and Bentler, 1999) and CFI and TLI values close to 0.95 or greater (Brown, 2006) were considered adequate.
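The cutoffs above can be applied mechanically once the fit indices of a model are available. The following is a minimal Python sketch, not the analysis code used in this study; the function name `check_fit` and the example values are hypothetical, and "close to 0.95 or greater" is simplified here to a strict threshold of 0.95, which is an assumption.

```python
def check_fit(rmsea, srmr, cfi, tli, error_cutoff=0.08, incremental_cutoff=0.95):
    """Return, for each global fit index, whether it meets its cutoff.

    Assumption: "close to 0.95 or greater" is simplified to >= 0.95.
    """
    return {
        "RMSEA": rmsea < error_cutoff,
        "SRMR": srmr < error_cutoff,
        "CFI": cfi >= incremental_cutoff,
        "TLI": tli >= incremental_cutoff,
    }


# Hypothetical fit indices for illustration only (not values from Table 2).
result = check_fit(rmsea=0.07, srmr=0.05, cfi=0.96, tli=0.94)
all_adequate = all(result.values())
```

In practice a TLI of 0.94 might still be judged "close to 0.95", so a strict threshold is the conservative reading of the criterion.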

Internal consistency and validity
Cronbach's alpha coefficients were used to examine the internal consistency of each version of the scale and its subscales. Regression equations were constructed to test whether PB and TB in each of the five versions would significantly predict acquired capability for suicide (measured by the ACSS-CV).
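For readers unfamiliar with the coefficient, Cronbach's alpha for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The Python sketch below illustrates this standard formula; the function name `cronbach_alpha` and the toy responses are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the total (sum) score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five people to three 7-point items.
toy_responses = [
    [1, 2, 2],
    [3, 3, 4],
    [5, 5, 4],
    [6, 7, 7],
    [2, 1, 3],
]
alpha = cronbach_alpha(toy_responses)
```

The same function can be applied to a full scale or to the items of a single subscale.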

Differential item functioning
If respondents from different groups (e.g., gender) with the same ability or proficiency level have different probabilities of choosing the same option for a certain item, then the item is flagged for differential item functioning (DIF; Kim, 2001). In this study, DIF analysis was used to identify systematic bias caused by gender. The analysis was carried out in the IRTPRO program, which performs DIF analysis within the IRT framework according to Lord's IRT parameter comparison technique (Lord, 1977).
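The idea behind a parameter comparison test can be illustrated for a single item parameter: the difference between the estimates in the two groups is squared and divided by the sum of their squared standard errors, and the result is referred to a chi-square distribution with one degree of freedom. The Python sketch below is a deliberately simplified, single-parameter illustration and not the IRTPRO procedure; the full test compares all parameters of an item jointly using their covariance matrices. The function name and the example numbers are hypothetical, and the estimates are assumed to be on a common metric and independent across groups.

```python
import math


def wald_chi_square_1df(est_ref, se_ref, est_focal, se_focal):
    """Wald-type comparison of one item parameter between two groups.

    Assumes both estimates are on a common metric and come from
    independent groups (simplifying assumptions for illustration).
    """
    diff = est_ref - est_focal
    chi2 = diff ** 2 / (se_ref ** 2 + se_focal ** 2)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical discrimination estimates for males and females on one item.
chi2, p_value = wald_chi_square_1df(est_ref=1.5, se_ref=0.3, est_focal=1.0, se_focal=0.4)
flagged = p_value < 0.01
```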

Average test information
Test information is the sum of the information of each item.
When the test provides more information for participants with a certain latent trait value ($\theta$), the standard error of measurement for these participants will be smaller. In other words, the measurement will be more accurate. Based on the IRT model, we calculated the total test information curve of each subscale separately. The average test information was the total test information divided by the corresponding test length. The equations of item and test information in the graded response model (GRM; Samejima, 1969) are given, respectively, as

$$I_j(\theta) = D^2 a_j^2 \sum_{t=0}^{mf_j} \frac{\left[P^*_{jt}(\theta)\left(1 - P^*_{jt}(\theta)\right) - P^*_{j,t+1}(\theta)\left(1 - P^*_{j,t+1}(\theta)\right)\right]^2}{P^*_{jt}(\theta) - P^*_{j,t+1}(\theta)}$$

and

$$I(\theta) = \sum_{j=1}^{n} I_j(\theta),$$

where

$$P^*_{jt}(\theta_i) = \frac{\exp\left[D a_j (\theta_i - b_{jt})\right]}{1 + \exp\left[D a_j (\theta_i - b_{jt})\right]},$$

with $P^*_{j0}(\theta_i) = 1$ and $P^*_{j,mf_j+1}(\theta_i) = 0$. Here $a_j$ and $b_{jt}$ denote the discrimination parameter and the location parameter of item $j$ in the GRM, respectively. $b_{jt}$ is the $t$th location parameter for item $j$, which satisfies $b_{j1} < b_{j2} < \cdots < b_{j,mf_j}$; $mf_j$ represents the maximum score of item $j$. $\theta_i$ refers to the latent trait value of participant $i$. $P^*_{jt}$ denotes the cumulative probability of participant $i$ obtaining at least a score of $t$ on item $j$. $D$ is a constant with a value of 1.7, $I_j(\theta)$ refers to the information provided by item $j$ for participants whose latent trait value is $\theta$, $n$ is the test length, and $I(\theta)$ denotes the total test information.
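The computation described above can be sketched in Python as follows. This is an illustrative implementation of the standard GRM information function with the scaling constant D = 1.7, not the mirt code used in this study; the function names and the item parameters in the example are hypothetical, and item scores are assumed to run from 0 to the number of location parameters.

```python
import math

D = 1.7  # scaling constant


def cumulative_probs(theta, a, b):
    """Cumulative probabilities P*_t for t = 0..m+1 (P*_0 = 1, P*_{m+1} = 0)."""
    inner = [1.0 / (1.0 + math.exp(-D * a * (theta - bt))) for bt in b]
    return [1.0] + inner + [0.0]


def item_information(theta, a, b):
    """GRM item information at theta for discrimination a and ordered locations b."""
    p_star = cumulative_probs(theta, a, b)
    total = 0.0
    for t in range(len(p_star) - 1):
        p_cat = p_star[t] - p_star[t + 1]  # probability of scoring exactly t
        slope = p_star[t] * (1 - p_star[t]) - p_star[t + 1] * (1 - p_star[t + 1])
        if p_cat > 0:
            total += slope ** 2 / p_cat
    return (D * a) ** 2 * total


def average_test_information(theta, items):
    """Total test information divided by the number of items."""
    return sum(item_information(theta, a, b) for a, b in items) / len(items)


# Hypothetical parameters: (discrimination, ordered location parameters).
toy_items = [(1.2, [-1.0, 0.0, 1.0]), (0.8, [-0.5, 0.5, 1.5])]
curve = [(th / 2, average_test_information(th / 2, toy_items)) for th in range(-6, 7)]
```

Dividing by the test length, as in `average_test_information`, is what allows versions with different numbers of items to be compared on a per-item basis.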

Expected Scores Conversion
Based on the graded response model (GRM), we estimated the item parameters and transformed the latent trait value ($\theta$) into the expected scores of the five versions of the INQ, and then created a conversion table so that scores on the different versions can be compared.
To calculate the expected scores of the subscales, the individual's response probabilities were calculated based on the GRM. The expected score for $\theta_i$ can be calculated as

$$E(\theta_i) = \sum_{j=1}^{n} \sum_{t=0}^{mf_j} t \, P_{jt}(\theta_i),$$

where $P_{jt}(\theta_i) = P^*_{jt}(\theta_i) - P^*_{j,t+1}(\theta_i)$ is the probability of participant $i$ obtaining a score of $t$ on item $j$.
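A conversion table of this kind can be sketched by evaluating the expected score of each version on a common grid of latent trait values. The Python code below is an illustration under stated assumptions, not the procedure used to build Table 6: the function names, the two toy "versions", and their item parameters are hypothetical, and items are assumed to be scored from 0 upward (for items scored from 1, the number of items would be added to each expected score).

```python
import math

D = 1.7  # scaling constant


def category_probs(theta, a, b):
    """Probability of each score 0..m on a GRM item with m ordered locations b."""
    p_star = [1.0] + [1.0 / (1.0 + math.exp(-D * a * (theta - bt))) for bt in b] + [0.0]
    return [p_star[t] - p_star[t + 1] for t in range(len(b) + 1)]


def expected_score(theta, items):
    """Expected total score at theta: sum over items and scores of t * P_jt(theta)."""
    return sum(
        t * p
        for a, b in items
        for t, p in enumerate(category_probs(theta, a, b))
    )


# Hypothetical item parameters for a longer and a shorter version.
long_version = [(1.2, [-1.0, 0.0, 1.0]), (0.8, [-0.5, 0.5, 1.5]), (1.0, [-1.5, 0.0, 1.5])]
short_version = long_version[:2]

# Each row pairs the expected scores of the two versions at the same theta.
conversion_table = [
    (th / 2, expected_score(th / 2, long_version), expected_score(th / 2, short_version))
    for th in range(-6, 7)
]
```

Because both columns are functions of the same latent trait value, a score on one version can be mapped to the corresponding score on the other by reading across a row.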

Confirmatory Factor Analyses
Consistent with the literature to date, the items of each of the five versions were set to load on their hypothesized factor, perceived burdensomeness or thwarted belongingness, and the two factors were allowed to correlate. Similar to the work of Van Orden et al. (2012), and due to consistently high modification indices across different versions, each subscale had a pair of correlated residuals (items 1 and 3 on the perceived burdensomeness subscales and items 20 and 21 on the thwarted belongingness subscales). Since the INQ-12 did not include both item 20 and item 21, this pair of residuals was not included in the corresponding thwarted belongingness subscale. Fit indices for the models are presented in Table 2. As can be seen in Table 2, the INQ-15, INQ-12, and INQ-10 demonstrated adequate fit for the two-factor model on all global fit indices, whereas the two longer versions did not.

Internal Consistency and Validity
To examine the internal consistency and criterion validity of each subscale, Cronbach's alpha coefficients and correlation coefficients were computed for each subscale (see Table 3). Both the perceived burdensomeness and thwarted belongingness subscales demonstrated good internal consistency (Cronbach's alphas ranged from 0.81 to 0.93). In every version, the internal consistency coefficient of the thwarted belongingness subscale was smaller than that of the perceived burdensomeness subscale. Both subscales were significantly correlated with the criterion measures, the PSSS and the UCLA Loneliness Scale. This result supports the view of Van Orden (2009) that interpersonal interactions characterized by low closeness or low frequency cannot fully satisfy the sense of belonging, and might lead to feelings of loneliness and perceptions of insufficient social support. The criterion validities of the two subscales were similar across versions. To examine concurrent predictive validity, regression analyses were conducted. Both perceived burdensomeness and thwarted belongingness in each version were significant predictors of acquired capability for suicide (see Table 4).

Differential Item Functioning
To examine the differential item functioning of the 25 items, gender (male vs. female) was used as the grouping variable. According to the parameter comparison, the first item (at the 0.01 significance level) and the 18th item (at the 0.001 significance level) showed DIF (see Table 5). All versions contained the first item. Therefore, the versions without the 18th item, such as the 10-item version and the 12-item version, are suggested.

Average Test Information Curve
To examine the average test information of each subscale, we calculated the total test information curves (Figures 1, 2) and the average test information curves (Figures 3, 4) of each subscale, which indicate the information provided at each point along the θ scale. A measure that provides more information has higher reliability and greater measurement precision. In Figure 3, the INQ-10 provided the most average test information among the five versions over the range from approximately −0.8 to 2 standard deviations of perceived burdensomeness. At high values of θ, the five versions provided a similar amount of information, while over the remaining range of θ the INQ-10 provided the least of the five versions. On the whole, the five versions provided adequate average test information. The results suggested that the INQ-10 could provide more measurement precision for varying degrees of perceived burdensomeness, and that the INQ-10 might be more useful in teenage samples for measuring perceived burdensomeness in clinical trials and as an index of treatment response. In Figure 4, above −1 standard deviations of thwarted belongingness, the INQ-10 provided the most average test information, while over the remaining range of θ the INQ-10 provided the least of the five versions, although it was close to the other versions. Similar to the result for the perceived burdensomeness subscales, the thwarted belongingness subscale of the INQ-10 could provide higher reliability and more measurement precision for varying degrees of thwarted belongingness.

Expected Scores
The expected scores of the five versions were calculated by transferring θ values based on GRM and presented in Table 6.
The score conversion means that scores on the five versions, which measure the same psychological trait, can be compared with each other. The conversion table helps translate a score on one version into the corresponding score on another and is useful for future research and applications in which scores from different versions need to be converted.

DISCUSSION
To our knowledge, the present study provides the first simultaneous comparison of five versions of the INQ within a teenage sample. Construct validity, internal consistency, validity, and average test information were compared for the five versions to (a) identify the version with the optimal overall psychometric characteristics in teenage samples to encourage future use, and (b) test the hypotheses of the IPTS and guide refinement of the IPTS. Concerning construct validity, the INQ-15, INQ-12, and INQ-10 demonstrated adequate fit for a two-factor model on all global indices of fit, while the two longer versions did not. Thus, the INQ-15, INQ-12, and INQ-10 most consistently demonstrated construct validity, providing evidence in support of their continued use in teenage samples. The results for internal consistency and criterion validity showed that the five versions of the INQ and their subscales all had good internal consistency. However, the internal consistency coefficients of the thwarted belongingness subscales were smaller than those of the perceived burdensomeness subscales, which is similar to most previous research (e.g., Bryan et al., 2010; Hill and Pettit, 2012; Monteith et al., 2013; Teo et al., 2018). The correlation coefficients of the perceived burdensomeness and thwarted belongingness subscales were close to each other. As for concurrent predictive validity, both perceived burdensomeness and thwarted belongingness in each version were significant predictors of acquired capability for suicide. The internal consistency, correlation coefficients, and concurrent validity of the different versions therefore did not provide any basis for recommending a specific version of the INQ. According to the DIF analysis, the first item (at the 0.01 significance level) and the 18th item (at the 0.001 significance level) showed DIF. All versions contained the first item. Therefore, the versions without the 18th item, such as the 10-item version and the 12-item version, are suggested.
As for average test information, the INQ-10 had clear advantages in both the perceived burdensomeness subscale and the thwarted belongingness subscale. Among the perceived burdensomeness subscales, versions that included more items provided less average test information. Among the thwarted belongingness subscales, that of the INQ-10 clearly provided the most average test information, and that of the INQ-25 provided the second most. The average test information curves of the thwarted belongingness subscales of the other three versions almost coincided. The thwarted belongingness subscales of the INQ-18 and INQ-15 contained the same items, and the result demonstrated that the 5-item thwarted belongingness subscale of the INQ-12 provided the same average test information as the 9-item thwarted belongingness subscales of the INQ-15 and the INQ-18. Concerning the expected scores, the conversion table for the five versions (see Table 6) allows a score on one version to be converted into the corresponding score on another, which can help future studies and applications in which scores from different versions need to be compared.
Overall, the results for average test information suggested that the INQ-10 provided higher reliability and more measurement precision, and the 10-item version, with adequate reliability and validity, also demonstrated adequate fit for a two-factor model. Hence, the 10-item version of the INQ is the most suitable version for future use in teenage samples. In addition, the results above showed that perceived burdensomeness performed better than thwarted belongingness on multiple indicators. If these different results are a consequence of measurement, it is necessary to take into consideration both the theoretical and the operational definitions of thwarted belongingness and perceived burdensomeness. It is also possible that the thwarted belongingness subscale of the INQ does not adequately measure thwarted belongingness and that a new self-report scale for thwarted belongingness needs to be developed.
The results of this study should be viewed within the context of its limitations. First, the present study used data from a teenage sample in China, which limits the generalizability of the results. Based on this study, we cannot decide which version of the INQ is best for elderly and clinical samples. Second, previous studies demonstrated that the INQ could predict the suicidal ideation of an individual (Joiner, 2005; Van Orden et al., 2010), whereas this study demonstrated that the INQ could predict the acquired capability for suicide. The study was cross-sectional and did not examine differences among the five versions in predictive validity for suicidal ideation, which can be done in the future. In addition, scores for the five versions of the INQ were all derived from responses to the INQ-25. Thus, the current data did not take into account the possible influence of question order effects (e.g., consecutive questions might be answered more similarly than non-consecutive questions). Furthermore, the internal consistency coefficients of the thwarted belongingness subscales were smaller than those of the perceived burdensomeness subscales in this study, and previous studies found that the relationship between thwarted belongingness and suicidal ideation was weaker than that for perceived burdensomeness (Ma et al., 2016; Chu et al., 2017). In future studies, a new self-report scale for thwarted belongingness (TB) can be developed to expand the availability of valid measurement approaches for interpersonal risk.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the School of Psychology, Jiangxi Normal University. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.