Quality of Smartphone Apps Related to Panic Disorder

Smartphone apps have a growing role in health care. This study assessed the quality of English-language apps for panic disorder (PD) and compared paid and free apps. Keywords related to PD were entered into the Google Play Store search engine. Apps were assessed using the following quality indicators: accountability, interactivity, self-help score (the potential of smartphone apps to help users in daily life), and evidence-based content quality. The Brief DISCERN score and the criteria of the “Health on the Net” label were also used as content quality indicators, as was the number of downloads. Of 247 apps identified, 52 met all inclusion criteria. The content quality and self-help scores of these PD apps were poor. None of the assessed indicators were associated with payment status or number of downloads. Multiple linear regressions showed that the Brief DISCERN score significantly predicted the content quality and self-help scores. The poor content quality and self-help scores of PD smartphone apps highlight the gap between their technological potential and the overall quality of available products.


Introduction
Panic disorder (PD) is a common anxiety disorder associated with an important social and economic burden (1,2). Available treatments include pharmacotherapy and cognitive-behavioral therapy (CBT) (3). Such treatments have, however, been insufficiently disseminated in clinical settings (4).
Smartphones are widely used worldwide (5,6) and have a growing role in health care (7)(8)(9)(10). Concerns about the regulation of medical smartphone apps are, at the same time, rising among public administrations and the scientific community (9,11,12). The U.S. Food and Drug Administration regulates some health-related apps, as it does medical devices. Smartphone apps related to psychiatric conditions do not yet fall under this regulation (13).
A number of recent studies have assessed the quality of medically oriented apps in various fields, such as smoking cessation, weight management, sleep, cancer, and diabetes. While acknowledging the potential opportunity offered by app-related technologies, these studies concluded that the apps available from different stores, with few exceptions, were of overall poor quality. A gap was furthermore found between the considerable number of apps related to medical conditions available in stores and the low number of peer-reviewed papers about them (37). In particular, despite their potential to improve health care, mental health apps currently available in stores lack scientific evidence about their efficacy (38). With few exceptions (39)(40)(41), preliminary findings reported for health apps were similar to previous findings on the poor quality of health information websites (42)(43)(44)(45)(46).
One may hypothesize that persons with PD could benefit from the development of well-conceived apps. Indeed, Internet-based CBT has previously been shown to offer some efficacy in PD (47)(48)(49)(50), and several studies are under way to evaluate apps designed specifically for the treatment of PD (51,52). Nonetheless, developers have not waited for scientific evidence to create apps for PD, and many are already available in smartphone stores. To our knowledge, no studies have yet been performed to rate these apps.
In the present study, we aimed to assess the quality of English-language PD-related apps available on the Google Play Store, as any layperson searching the store for a PD-related app would find them. It is a descriptive and exploratory study of what can be found. The study also aimed to compare free and paid apps. Finally, we assessed the factors associated with the main quality indicators, as well as the links between the quality indicators and users' ratings (star ratings, as reported on the Google Play page) and downloads.

Selection of Apps
A keyword search was performed between February and March 2014 to produce a comprehensive list of PD-related apps that were accessible in English on the Google Play Store. The Google Play account was set to English (United Kingdom) and linked to a mobile phone registered on a Swiss mobile network.
Google is the developer of the Android operating system, the most widely used smartphone operating system in the world (53). The following queries were entered into the Google Play search engine: "stop panic," "stop panic attack," "panic attack," "PD," "anxiety attacks," and "anxiety disorder." Studies of Internet users have shown that most people rarely search beyond the first 20 retrieved results (54). However, we extended the coverage of the present study to the first 50 free apps and to the first 30 paid apps for each query to obtain the most comprehensive list of apps. Apps were included if they were related to PD. Exclusion criteria were as follows: the app could not be downloaded after more than three attempts, the app was not in English, or the app was a book or an article.
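The search coverage described above can be sketched as a small calculation. This is an illustrative sketch only: the per-query caps and the query strings come from the text, whereas the function names and the app identifiers in `example_hits` are hypothetical and are not the study's data.

```python
# Query strings and per-query caps as stated in the text.
QUERIES = [
    "stop panic",
    "stop panic attack",
    "panic attack",
    "PD",
    "anxiety attacks",
    "anxiety disorder",
]
MAX_FREE_PER_QUERY = 50
MAX_PAID_PER_QUERY = 30


def max_retrieved(queries, free_cap, paid_cap):
    """Upper bound on retrieved listings before duplicates are removed."""
    return len(queries) * (free_cap + paid_cap)


def unique_apps(hits):
    """Keep the first occurrence of each app ID across queries, preserving order."""
    seen = set()
    unique = []
    for _query, app_id in hits:
        if app_id not in seen:
            seen.add(app_id)
            unique.append(app_id)
    return unique


# Hypothetical hits: the same app returned by two different queries.
example_hits = [
    ("stop panic", "com.example.calm"),
    ("panic attack", "com.example.calm"),
    ("panic attack", "com.example.breathe"),
]

print(max_retrieved(QUERIES, MAX_FREE_PER_QUERY, MAX_PAID_PER_QUERY))  # 480
print(unique_apps(example_hits))
```

Because the same app can be returned by several queries, the number of unique apps is expected to be well below this upper bound.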

Evaluation of Apps
Apps were reviewed on an HTC One running Android 4.3. They were assessed by using tools reported in previous studies, tools adapted from quality evaluation studies of websites (55)(56)(57)(58)(59), and tools described in other studies on the quality of smartphone apps (15,16,23,30,31). The assessment instruments are described below.

Google Play's Page and Functionalities of Apps
As reported by other investigators (15,30), we extracted a number of items from the Google Play page, such as number of downloads and ratings of the apps.

Self-help Model
A self-help model assessment tool for PD was used (Table 1). The model was based on the potential of smartphone apps to help users in daily life and on the second edition of the Practice Guideline for the Treatment of Patients with Panic Disorder by the American Psychiatric Association (3).

Content Quality
As in other studies on Internet websites related to mental health disorders (39)(40)(41)(42), evidence-based content quality was assessed according to the availability of information related to the following questions that a patient could search for:
Answers found on the apps were assessed on the basis of the American Psychiatric Association practice guideline (3). For every question, the coverage (the extent to which the question was addressed) and correctness (the extent to which the answer was right) of the answer were scored together on a 3-point scale (0 = absent; 1 = partially incorrect or incomplete; 2 = correct and complete). A total content quality score, ranging from 0 to 14, was calculated by summing the scores.

Interactivity
Interactivity (Table 2) was measured with an adaptation of the Abbott scale (58). Three items were added to the scale: presence of a gamification module, possibility of personalizing the user's profile (avatar, color, sound), and tailoring of the app upon use.

Health on the Net Code
The Health on the Net (HON) code label (55) was created for websites that focus on ethical standards in online publishing. Usually, a website requests evaluation, after which the label is awarded. Because this label is not commonly used by apps, we assessed whether the apps respected the HON criteria (Table 2).

Brief DISCERN
The Brief DISCERN (56) is a six-item assessment tool (Table 2) adapted from the DISCERN instrument (60). It is used as a potential indicator to estimate the quality of the information about the choice of treatment in websites. Each item is rated on a five-point scale (1 = not at all; 5 = completely). The first two items identify the transparency of information sources; the other four items estimate the quality of the information regarding treatment. A cutoff score of ≥16 has previously been associated with good content quality scores of health-related websites (56).

Accountability
Accountability was estimated with the Silberg scale (57), which includes authorship (names of authors, affiliation, and references), attribution (sources and references), disclosure (property of site, sponsorship, and advertising), and currency (date of creation, modification of site, and updating in the last 6 months). A total score ranging from 0 to 9 (1 point for each item if present) was calculated for each app (Table 2).
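The additive scores described in this section can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the count of seven content questions is inferred from the 0 to 14 range with 0 to 2 points per question, and the Brief DISCERN total is assumed to be the plain sum of its six items.

```python
BRIEF_DISCERN_CUTOFF = 16  # cutoff reported in the text for health-related websites


def _checked_sum(scores, n_items, allowed):
    """Validate the item count and values, then return the sum."""
    scores = list(scores)
    if len(scores) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(scores)}")
    if any(s not in allowed for s in scores):
        raise ValueError(f"item scores must be in {sorted(allowed)}")
    return sum(scores)


def content_quality_total(scores):
    """Seven questions (assumed), each scored 0, 1, or 2: total 0 to 14."""
    return _checked_sum(scores, 7, {0, 1, 2})


def brief_discern_total(scores):
    """Six items, each rated 1 to 5: total 6 to 30 (plain sum assumed)."""
    return _checked_sum(scores, 6, {1, 2, 3, 4, 5})


def meets_brief_discern_cutoff(total):
    """True when the total reaches the cutoff of 16."""
    return total >= BRIEF_DISCERN_CUTOFF


def silberg_total(items_present):
    """Nine yes/no accountability items, 1 point per item present: total 0 to 9."""
    return _checked_sum((int(bool(p)) for p in items_present), 9, {0, 1})


# Hypothetical ratings for a single app, for illustration only.
example_discern = [2, 2, 3, 3, 3, 3]
total = brief_discern_total(example_discern)
print(total, meets_brief_discern_cutoff(total))
```

Validating the item count and the allowed values before summing guards against silently scoring an incomplete rating sheet.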

Statistical Analyses
Statistical analyses were performed with SPSS software (version 18.0, Chicago, IL, USA). An initial exploratory analysis involved the calculation of proportions, as well as means and SDs, of the above-mentioned outcome measures. Next, we compared paid apps with free apps in bivariate analyses by using parametric tests (t-test, chi-square, or Fisher's exact test) or non-parametric tests (median test) when appropriate. Finally, we computed prediction models by using multiple linear regressions for two variables of interest.
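For readers unfamiliar with the bivariate tests mentioned above, the following is a minimal sketch of a two-sided Fisher's exact test for a 2x2 table, such as a feature being present or absent in free versus paid apps. It is not the SPSS procedure used in the study; the function name and the counts in the usage example are hypothetical placeholders, not the study's data.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), denom)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    total = sum(
        (prob(x) for x in range(lowest, highest + 1) if prob(x) <= observed),
        Fraction(0),
    )
    return float(total)


# Hypothetical counts: feature present in 12 of 20 free apps and 1 of 15 paid apps.
p_value = fisher_exact_two_sided(12, 8, 1, 14)
print(f"two-sided p = {p_value:.4f}")
```

Exact fractions are used so that the comparison with the observed table's probability is not affected by floating-point rounding.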
Applying the principle of model parsimony and statistical relevance, the selection of the independent variables was driven by the objective of finding the simplest model (i.e., one containing a small number of variables) that (1) adequately fits the data and (2) explains most of the variance in the dependent variable (the highest value of R²).
Beforehand, the relevance of the independent variables for the prediction of each dependent variable was discussed among the authors. Taking multicollinearity into account and using the "Enter" method, we regressed several subsets of these independent variables and retained the model with the fewest variables that still explained a percentage of variance in the dependent variable comparable to that explained with all the variables in the equation.
On the one hand, the prediction of content quality was fitted by using the Abbott interactivity scale, the Brief DISCERN score, the number of installs, and the link to paid content (yes vs. no) as the independent variables, and by controlling for payment status (free apps vs. paid apps). On the other hand, the prediction of self-help was assessed with the Brief DISCERN score, whether the app was recommended (yes vs. no), the link to paid content (yes vs. no), and the Silberg accountability scale as the independent variables, controlling for payment status (free apps vs. paid apps). For all analyses, a significance level of p ≤ 0.05 was used.
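The two regression models described above can be written out as follows. The notation is ours and is only a sketch: the variable labels, the coefficient symbols, and the 0/1 coding of the yes/no variables and of payment status are assumptions, with i indexing the apps.

```latex
\begin{align}
\text{ContentQuality}_i &= \beta_0 + \beta_1\,\text{Interactivity}_i + \beta_2\,\text{BriefDISCERN}_i
  + \beta_3\,\text{Installs}_i + \beta_4\,\text{PaidLink}_i + \beta_5\,\text{Paid}_i + \varepsilon_i \\
\text{SelfHelp}_i &= \gamma_0 + \gamma_1\,\text{BriefDISCERN}_i + \gamma_2\,\text{Recommended}_i
  + \gamma_3\,\text{PaidLink}_i + \gamma_4\,\text{Silberg}_i + \gamma_5\,\text{Paid}_i + \delta_i
\end{align}
```

In each model the last regressor is the control for payment status, so checking for confounding amounts to comparing the other coefficients with and without that term.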
Payment status was hypothesized to matter as a potential confounder on the ground that it could moderate the impact of the

Results
A total of 480 apps were found (50 free apps and 30 paid apps for each of the six keywords). The search with these keywords returned several duplicates among the apps identified. After their removal, of the 247 remaining unique apps, 52 were retained for analysis and 195 were excluded (Figure 1). One app offered only a self-assessment tool aiming to help users screen for possible PD. The other apps were designed to help users manage their symptoms (via information, assessments, and techniques to deal with PD). Most of the apps (58.1% of free apps and 61.9% of paid apps) recommended that the user consult a medical doctor if suffering from symptoms of PD.
There was furthermore wide variability in the content of the apps. For example, one of the most downloaded apps offers features such as psychoeducation, audio tracks for relaxation and mindfulness, and a diary tool to record panic events. Some apps offer interactive modules to face panic attacks. For example, one app tells users to control their breathing while giving instructions on the screen with pictograms. Other apps offer a discussion forum. Some apps sell various products, such as books or medicinal herbs.
The characteristics of paid and free unique apps are reported in Table 3.
The two subgroups, paid and free apps, were similar, although we found several differences. For instance, publicity was present in 61.3% of the free apps but in none of the paid apps (p < 0.0005). In addition, the HON criteria were fulfilled to a greater extent in the paid apps group than in the free apps group (p = 0.002). As shown in Table 3, the overall adherence to HON criteria was low: none of the assessed apps fulfilled all eight criteria. Significantly more apps that were designed to be stand-alone (no additional app content) were found in the paid apps group than in the free apps group (p = 0.001). Both paid and free apps had low content quality and low self-help scores, with no differences between the two groups.

Regression Results
Content quality was regressed on a set of independent variables, namely the Abbott interactivity scale, the Brief DISCERN score, the number of downloads, and the link to paid content, controlling for payment status. We found that the Abbott interactivity scale and the Brief DISCERN score significantly predicted the content quality (p = 0.01 and p < 0.0005, respectively), but the number of downloads (p = 0.9) and the link to paid content (p = 0.1) did not. After careful examination of the regression coefficients without and with payment status, no confounding effect could be imputed to this variable. The full model performed well, with an adjusted R² of about 70%.
Another regression model predicted the self-help score with the Silberg accountability scale, the Brief DISCERN score, whether the app was recommended, and the link to paid content as the independent variables, controlling for payment status. This model performed less well than the preceding model, as shown by an adjusted R² of 54.4%. We found that the Silberg accountability scale (p < 0.0005) and the Brief DISCERN score (p = 0.03) significantly predicted the self-help score, but whether the app was recommended (p = 0.4) and the link to paid content (p = 0.5) did not. After due consideration of the regression coefficients without and with payment status, we did not detect a confounding effect of this variable.
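The adjusted R² values reported above penalize the ordinary R² for the number of predictors. As a minimal sketch of the standard formula, the helpers below convert between the two; the function names are hypothetical, and the predictor count of five (four independent variables plus payment status) is our assumption about how the models were specified.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


def r2_from_adjusted(adj_r2, n_obs, n_predictors):
    """Invert the formula above to recover the unadjusted R^2."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - adj_r2) * dof / (n_obs - 1)


# 52 apps were analyzed; five predictors is an assumption (see lead-in).
N_APPS = 52
N_PREDICTORS = 5
implied_r2 = r2_from_adjusted(0.544, N_APPS, N_PREDICTORS)
print(f"implied unadjusted R^2 for the self-help model: {implied_r2:.3f}")
```

With 52 apps and a handful of predictors the penalty is modest, so the adjusted and unadjusted values stay within a few percentage points of each other.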

Discussion
In this study, we aimed to assess the quality of English-language smartphone apps for PD. In particular, we evaluated the content quality and self-help scores with instruments that were adapted from other studies on the content quality of medical websites and from previous assessment studies on medical smartphone apps.
Given the lack of studies with a purpose similar to ours, and the potentially high interactivity of apps in daily life, we adapted some assessment tools for the present study. Abbott's scale of interactivity (58) was adapted to match the specificity of smartphone apps, as described in the "Materials and Methods" section. Another important adaptation was the self-help tool. It may be a helpful indicator for app assessment and development, especially for apps based on CBT treatment models. The low scores obtained with the tool may underline the gap between the clinical potential offered by app technology and its rather low level of clinical development. The self-help model was based on CBT for two reasons: first, its easy translation into eHealth, as shown by important developments related to Internet-based therapy for mental health disorders (48,61,62,63,64); second, its validity for the treatment of PD (3). The self-help model score is probably also useful for other CBT treatment apps. Further studies on apps for mental health and on the instruments proposed here are warranted.
Similar to the results reported in other studies on health-related apps, the mean content quality (14-16, 19, 23, 30, 31, 34, 35) and self-help scores were low in our study. Most apps were insufficiently evidence based; furthermore, the technological capabilities were underused in most of the available PD-related apps. As shown in website studies (40,41,56), the Brief DISCERN score (56) is linked to content quality scores, as well as to the self-help score specifically developed for CBT-based app assessment. In the present study, measures such as accountability and interactivity were associated with the main quality indicators, such as content quality and self-help scores, as was previously found in some (39,41,65), but not all (66), studies on health-related websites.
Factors related to the community success of a given app, such as the number of downloads and whether the app was recommended, as well as factors linked to the economic model, such as payment status or a link to paid content, were not associated with content quality or self-help scores. This is somewhat surprising, particularly in regard to the number of downloads. One might expect better quality for the most downloaded apps. The results are possibly limited by the assessment of apps found only on the Google Play Store as well as by the small number of apps with a high number of downloads (only three apps with more than 5000 downloads).
The number of active users (unavailable on the Google Play Store) would, however, probably be more informative for the sustained success of a given app after the initial download.
Payment status was not associated with the quality indicators assessed. Further studies may assess in more detail the commercial strategy linked to the development model of health-related apps.
The link found between payment status and publicity reflects some differences in the commercial model. Other aspects should, however, be included in further assessments (e.g., marketing strategies and interaction with users).
Our study has several limitations. We assessed only apps from the Google Play Store and not from the Apple App Store or other stores. This limits the generalizability of the study findings.
In addition, the keywords used for the search in this study might differ from those used by people with PD, and we may have missed PD-related apps on the Google Play Store. Furthermore, the results may differ depending on the country and the language setting of the Google Play Store. Nonetheless, the study suggests ways of adapting medical eHealth assessment tools to apps. Despite expectations about the potential of PD apps to improve treatments (51,52), the apps available to users from stores to date need to be improved and to include more evidence-based information, more interactive assessments, such as ecological momentary assessments (67), and more self-help options.