Despite the proliferation of apps that support the prevention and self-management of a number of health conditions, including apps that support health behavior change, there is little evidence for the efficacy of most of these apps [1]. Resources such as PsyberGuide have assisted with important advances in this area by synthesizing existing research on app testing and conducting credibility and transparency ratings of commercial apps for mental health [2]. However, much less evidence exists about the efficacy of apps for other areas of health promotion (e.g., healthy eating, weight loss), as well as for new mental health apps, which enter the market at a rapid pace.

In a recently published paper [1], we laid out a number of steps researchers can take to evaluate commercial apps, including methods to examine the content, usability, and efficacy of the apps. Methods for content analysis include comparing the information in the app to a set of comparator information, such as clinical guidelines, evidence-based treatment strategies, or behavior change techniques. Methods for usability assessment include laboratory studies (i.e., a set protocol of actions in a supervised setting), field testing (i.e., natural use “in the wild”), and feedback collected via user ratings. Methods for efficacy testing include various study designs, such as randomized trials, as well as observational studies that explore app performance (i.e., collecting data from large groups of existing app users). In addition to examples of these various methods, we also outlined research steps and procedures that will aid in systematically growing the scientific literature about how apps function across these diverse domains, extending beyond just an overview of the app content (where many published investigations have ended thus far). The app evaluations we suggest could contribute substantially to the PsyberGuide domain of overall app credibility (i.e., the strength of the evidence for using the app). However, we also suggest that usability and efficacy testing will be critical for the advancement of this field, and encourage researchers to think broadly about the type of evidence needed to demonstrate that an app is ready to be widely recommended for use by consumers.

While the science of testing and validating apps lags behind their commercial availability, consumers may need to decide which app to select without extensive scientific evidence to guide them. In this case, individuals may need to use methods similar to those suggested for researchers, or consult with their medical provider to evaluate their options. The American Psychiatric Association (APA) has released a toolkit for patients and providers evaluating a mobile app for mental health care [3]. The toolkit walks through questions in five domains that have been found to be important to an app's effectiveness and safety: background information (i.e., who developed the app, what does it cost); the risk to privacy and security of using the app; the evidence for its efficacy; the potential ease of using the app; and the interoperability (i.e., how well it integrates with other systems).

Taken together, the three app evaluation frameworks (our researcher-oriented suggestions, the APA suggestions for patients and providers, and PsyberGuide) provide a thorough overview of the app selection process from a variety of perspectives. All three emphasize that finding the best app for use in a care plan requires balancing many aspects of a mobile app, extending beyond scientific content to include the usability of the app as well as privacy and security concerns. The APA framework uses a more stepwise process to guide the selection of an app (i.e., if it meets criteria for background, consider the privacy settings, then evidence), while our framework and PsyberGuide suggest evaluating all available evidence holistically. The choice among these frameworks will largely be driven by the specific use case for the app, be it as part of a research program (our framework), a provider-initiated care plan (APA framework), or user-initiated behavior change (PsyberGuide).

Regardless of the use case for the app or other digital therapeutic, as reliance on these tools continues to grow, having sound methods to evaluate them, both as a scientific community and at the individual consumer level, will continue to be of critical importance. We hope to see more research- and practice-level evidence generated in this important area and shared back with the community through tools like PsyberGuide, so that these data can inform the development of even more effective apps in the future.



  1. Jake-Schoffman DE, Silfee VJ, Waring ME, et al. Methods for Evaluating the Content, Usability, and Efficacy of Commercial Mobile Health Apps. Eysenbach G, ed. JMIR mHealth and uHealth. 2017;5(12):e190. doi:10.2196/mhealth.8758.
  2. PsyberGuide. App Guide. URL: [accessed 2018-9-10]
  3. Psychiatry. App Evaluation Model. URL: [accessed 2018-9-10]