Intensive Monitoring Studies for Assessing Medicines: A Systematic Review

Introduction: Intensive monitoring (IM) is a method of post-marketing active surveillance based on event monitoring that has received growing interest in the current medicines regulatory landscape. For a specific period of time, IM involves primary data collection and actively gathers longitudinal information, mainly on safety, from the first day of drug use. Objectives: To describe IM systems and the studies published over an 11-year period (2006–2016). Specifically, we reviewed study population/event surveillance, methodological approaches, limitations, and applications of IM in generating real-world evidence. Methods: We conducted a systematic search of MEDLINE and EMBASE to identify studies published from 2006 to 2016 that used IM methodology. We extracted data using a standardized form and analyzed the results descriptively. The methodological quality of the selected studies was assessed using the modified Downs and Black checklist. Results: From 1,400 screened citations, we identified 86 papers, corresponding to 69 different studies. Seventy percent of the reviewed studies corresponded to established IM systems, of which more than half were prescription event monitoring (PEM) and modified PEM (M-PEM). Among non-established IM systems, vaccines were the most commonly studied drugs (n = 14). The median cohort size ranged from 488 (hospitals) to 10,479 (PEM) patients. Patients and caregivers were the event data source in 39.1% of studies. The mean overall quality score was similar between established and non-established IM. Conclusions: Over the study period, IM studies were implemented in 26 countries with different maturity levels of post-marketing surveillance systems. We identified two major limitations. First, only 20% of studies were conducted at hospital level, which is a matter of concern insofar as healthcare systems are facing a lack of access to new medicines at ambulatory care level. Second, IM access to data on drug exposure cohorts, at either the identification or the follow-up stage, could constitute a barrier, given the complexity of managerial, data linkage, and data privacy issues.


INTRODUCTION
Bridging the gap between the information generated by randomized clinical trials (RCT) and the interpretation of other evidence sources to better understand real-world drug usage is of great importance, since drugs often do not perform as well in routine clinical practice as in RCT, the former being characterized by a variety of sociocultural behaviors and clinical settings (1,2). Over time, this lesson was clearly learned, and nowadays society, including payers, demands an integrated assessment of benefits and risks under real-life conditions as the next logical step after RCT (3,4). The adoption and use of real-world evidence (RWE), defined as clinical evidence regarding the usage and potential benefits or risks of a medical product derived from the analysis of routine care data, is increasingly important for regulatory decision-making (5,6). RWE can provide insights into key evidentiary needs of regulators, which include: (1) monitoring of medication performance in routine care, including effectiveness, safety (e.g., labeling changes, withdrawals), and value; (2) identifying new patient strata in which a drug may have added value or unacceptable harms; and (3) monitoring targeted utilization (7).
In the last decades, a series of drug withdrawals (8-10) has boosted interest in pharmacovigilance. In response, regulators have started to reform their systems, shifting from a largely reactive approach that relied mainly on spontaneous reporting (SR) to a more proactive approach to drug safety issues (11). Specifically, in late 2005, the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA) issued guidance documents on therapeutic risk management planning aimed at strengthening proactive post-marketing surveillance (12). More recently, the European Union implemented new pharmacovigilance legislation under which regulatory agencies have extended powers to demand post-authorization efficacy studies (PAES) in addition to post-authorization safety studies (PASS) (13). Overall, it has been recognized that knowledge of a drug is no longer restricted to a binary decision at the time of marketing authorization, and the prevailing paradigm has changed from a risk-centered approach to a benefit/risk assessment throughout the medicine's entire lifecycle (1,14).
Within the scope of these regulatory changes, intensive methods of post-marketing surveillance based on drug event monitoring (15), known as intensive monitoring (IM) methodology, have been of interest (16-18). Established IM systems were launched in New Zealand [Intensive Medicines Monitoring Program (IMMP)] (19) and in the UK [Prescription Event Monitoring (PEM)] (20,21), in the late 1970s and early 1980s, respectively. Since then, these systems and their underlying methodology have evolved and been implemented in several regions worldwide, such as the Netherlands [Lareb Intensive Monitoring (LIM)] (11), Japan (22), and some African countries (23).
Compared with SR systems, which passively monitor all drugs during their whole life cycle and cover the entire population (24,25), IM combines the strengths of pharmacoepidemiological and clinical pharmacovigilance approaches and focuses on specific drugs. For a specific period of time, IM involves primary data collection and is defined as an observational inception cohort of subjects exposed to the drug(s) of interest (26). IM cohorts of drug exposures are identified through prescribers (e.g., PEM), pharmacies (e.g., IMMP), or national pharmacovigilance systems (e.g., LIM) and are followed systematically and prospectively through a wide variety of sources (e.g., patients, prescribers, and hospitals).
Although IM systems were developed more than 30 years ago, there has been no comprehensive global synthesis of drug event monitoring research studies to date. The purpose of this systematic review is to describe IM systems and the studies published in the decade following the paradigm shift in medicines regulatory assessment, which was largely characterized by a more proactive approach to drug safety issues. For the period from 2006 to 2016, we reviewed study population/event surveillance, methodological approaches (including data collection sources and analysis), limitations, main outcomes of interest, and IM applications in generating real-world evidence.

MATERIALS AND METHODS
This study followed current guidance on conducting and reporting systematic reviews, including guidance for undertaking reviews in health care on public health intervention reviews by the Centre for Reviews and Dissemination of the University of York (27) and the recommendations of the PRISMA-P statement regarding reporting items (28). The protocol for this review was registered at PROSPERO (CRD42017069309) and is available at https://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42017069309.
For inclusion in the review, papers had to report data on an IM study/system as defined above. RCT, studies conducted through automated databases (e.g., claims or electronic health/medical records), registries, SR schemes, and case reports/series were excluded. No restriction on study population, intervention, outcomes, or comparator was imposed for study selection, although we only included studies published in English, Portuguese, Spanish, Italian, or French. Letters to the editor and conference proceedings were also excluded, as these materials often reflect preliminary analyses and are less likely to describe methods and results in the necessary detail.
Reports were identified through electronic searches of MEDLINE and EMBASE via the OVID SP interface from inception to 20 April 2017, to include studies published within the time frame of interest: January 2006 to December 2016. Complementary searches (reference checking and hand-searching) were made to identify potential additional articles. The search strategy was developed after several iterations and is presented in Additional File 1.
References located and potentially eligible for inclusion were exported to an Excel® file, where the authors recorded the eligibility criteria of selected abstracts and full-paper references. The abstracts were independently checked against the inclusion criteria by CT, MC, and PB and classified as include, unclear, or exclude. The full reports of all articles classified as include or unclear were retrieved, and two authors (CT, MC) independently evaluated their eligibility for inclusion. All disagreements were resolved by discussion or, if necessary, by arbitration by a third review author (AM). The main reasons for exclusion, at either the title/abstract or the full-text screening phase, were recorded.
One review author (FB) assessed the risk of bias of the included studies using the modified Downs and Black checklist (30), which assesses the risk of bias and the quality of both randomized and non-randomized studies. The assessments were validated by another reviewer (CT), and the rationale behind them was documented. The Downs and Black checklist was selected for the following reasons: (1) in an evaluation by Deeks et al. (31), it was one of the six instruments considered most suitable for use in systematic reviews of non-randomized studies, out of 182 tools identified; (2) it was recommended as one of the most useful tools for assessing risk of bias in non-randomized studies by both the Cochrane Collaboration and the Agency for Healthcare Research and Quality (32). As some items of the Downs and Black checklist are only applicable to randomized studies, and since the majority of published IM studies have a single-arm design, the checklist was adapted for the purpose of this review, as provided in Additional File 2. Our modified checklist included 13 of the 27 items of the original version. Consequently, the overall quality score of each study ranged between 0 and 13.
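The scoring rule described above can be sketched in a few lines. This is a minimal illustration only: it assumes that each of the 13 retained items is scored 0 or 1 (which is consistent with an overall range of 0 to 13 but is not stated item by item in the text), and the function name and the example ratings are hypothetical rather than taken from Additional File 2.

```python
from typing import Sequence

N_ITEMS = 13  # number of items retained in the modified checklist (from the text)


def overall_quality_score(item_scores: Sequence[int]) -> int:
    """Return the overall quality score as the sum of the item scores.

    Assumption: every item is scored 0 (criterion not met) or 1 (criterion met).
    """
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    if any(score not in (0, 1) for score in item_scores):
        raise ValueError("each item score must be 0 or 1")
    return sum(item_scores)


# Hypothetical ratings for one study, for illustration only.
example_study = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
score = overall_quality_score(example_study)
print(f"Overall quality score: {score} out of {N_ITEMS}")
```

Under this assumption, a study meeting every retained criterion scores 13 and a study meeting none scores 0, matching the range given above.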
The data synthesis was descriptive, as the main aim of this systematic review was to identify methods, not to quantify any effect. Data from the included studies were described and presented in text, tables, and figures. When multiple papers were retrieved from the same IM study (e.g., results at different follow-up periods or reports on different outcomes/drug study domains), they were treated as a single study.
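The rule of treating several papers from one IM study as a single study amounts to grouping papers by a study identifier. The sketch below illustrates that step under stated assumptions: the paper and study identifiers are hypothetical placeholders, and the review itself recorded this information in a spreadsheet rather than in code.

```python
from collections import defaultdict

# Hypothetical (paper_id, study_key) pairs; a real study key might combine
# the IM system, the monitored drug, and the country.
papers = [
    ("paper-01", "study-A"),
    ("paper-02", "study-A"),  # later follow-up of the same cohort
    ("paper-03", "study-B"),
    ("paper-04", "study-C"),
    ("paper-05", "study-C"),  # different outcome domain, same study
]


def group_papers_by_study(records):
    """Map each study key to the list of papers that report on it."""
    grouped = defaultdict(list)
    for paper_id, study_key in records:
        grouped[study_key].append(paper_id)
    return dict(grouped)


studies = group_papers_by_study(papers)
print(f"{len(papers)} papers correspond to {len(studies)} studies")
```

Counting the keys rather than the papers is what yields a number of studies smaller than the number of papers.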

Overview of Studies
The included studies were conducted in 26 countries. Overall, 70% of studies corresponded to established IM systems: PEM (n = 18), modified PEM (M-PEM; n = 8), cohort event monitoring (CEM; n = 12), LIM (n = 6), and IMMP (n = 4). The remaining studies (n = 21) were single studies conducted within the IM methodology framework that were not part of any established IM system. These studies were grouped into three categories: Vaccines (n = 14), Hospital setting based (n = 5), and Others (n = 2). Tables 1 and 2 summarize the main characteristics (ATC group of the drugs monitored, drug domains studied, event data source, methods of data collection, and countries where the studies were conducted) of established and non-established IM systems. Data extracted from all included studies are presented in Table 3.
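As a consistency check on the counts reported above, the per-system numbers can be summed and the share of established systems recomputed. The figures are those given in the text; the variable names are illustrative.

```python
# Study counts per category, as reported in the text.
established = {"PEM": 18, "M-PEM": 8, "CEM": 12, "LIM": 6, "IMMP": 4}
non_established = {"Vaccines": 14, "Hospital setting based": 5, "Others": 2}

n_established = sum(established.values())
n_non_established = sum(non_established.values())
total_studies = n_established + n_non_established

# Share of all included studies that belong to an established IM system.
share_established = 100 * n_established / total_studies

print(f"Established: {n_established}, non-established: {n_non_established}")
print(f"Total studies: {total_studies}")
print(f"Share of established systems: {share_established:.1f}%")
```

The totals agree with the 69 studies identified, and the share of established systems rounds to the 70% reported.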

Established IM Systems
PEM and M-PEM accounted for the largest share of the included studies (n = 26). Concerning PEM studies, the median study duration was 35.5 months, and the duration of patient follow-up varied between 2 and 12 months (median: 6.0). Similar results were found for M-PEM studies. The median number of patients per study was 10,479.5 (range: 1,728-28,357) for PEM and 7,419.5 (range: 551-26,877) for M-PEM studies. For both schemes, all studies were reported to have been conducted with unconditional funding from the pharmaceutical industry. The common limitations pointed out by the authors were the non-return of questionnaires by general practitioners (GP) (which might result in non-response bias if the characteristics of patients at responding GP practices differ from those at non-responding GP practices), under-reporting, and the restriction to the primary care setting. Furthermore, the lack of a concurrent control (single-group cohort design) was also noted as a limitation, leading to a knowledge gap on the true background incidence of events. Unlike PEM, the M-PEM methodology offered greater scope to collect information on confounding variables, since a more detailed study-specific questionnaire was used.
For CEM studies, the median study duration and the median patient follow-up duration were 10.0 months (range: 0.5-109) and 0.7 months (range: 0.2-12), respectively. The median cohort size was 4,789 (range: 228-23,988) patients. Five of the 12 studies were conducted with no source of funding, 6 were financially supported by governmental institutions (n = 3), non-governmental institutions (n = 2), or both (n = 1), and one was financed by the pharmaceutical industry. The limitations of concern were lack of generalizability (selection bias in patient enrolment and high cohort drop-out rates), baseline events reported as "true" adverse drug events (ADE) (e.g., antimalarial studies with no collection of events before vs. after treatment), and costly, labor-intensive data collection and management.
LIM studies reported the smallest cohort sizes among the established IM systems. Overall, a median of 1,462.5 (range: 398-3,569) patients were enrolled. The median study duration for the 5 of 6 studies where this information was available was 24 months (range: 7-63), and patient follow-up duration varied between 1 and 12 months (median: 5.0). Half of the LIM studies (n = 3) did not report the source of funding, 2 studies were conducted with financial support from governmental institutions, and one was implemented without any source of funding. The limitations raised were in line with those of other established IM systems. LIM studies reported event rates rather than true incidence rates, and no information was provided about the patients who declined to participate (e.g., older people might be underrepresented since they may not have access to, or be familiar with, the internet). Furthermore, since patients were the source of event information, those who experienced an adverse drug reaction (ADR) might be more motivated to fill in a questionnaire than those who did not (reporting bias). The difficulty of obtaining information about serious and fatal outcomes was also stated as a limitation.
The median number of patients in IMMP studies was 6,891 (range: 420-17,298). The median study duration was similar to that of PEM studies; however, a longer follow-up period (median: 15 months; range: 2-20) was observed. All studies received funding from governmental institutions, and 2 studies were unconditionally co-funded by the pharmaceutical industry. Not all IMMP studies reported limitations. Among the studies where this information was available, the absence of a comparator group, underestimation of ADE rates, and limited detailed clinical information were the issues pointed out. Further, in the study of varenicline (92), the "effectiveness assessment" was based on information provided by the reporting doctor, and for many patients it was unknown whether varenicline was effective.

Non-established IM Systems
Two-thirds of non-established IM studies reported the IM of vaccines, half of which were related to the influenza H1N1 2009 pandemic vaccine. Almost all vaccine studies (13 out of 14) targeted vulnerable populations (e.g., children, pregnant women). These studies were carried out using different methods of data collection (healthcare professional (HCP) face-to-face contact, web-based tools, telephone, or mobile text messages). The median follow-up time was 4.5 months (range: 0.2-10) and the median study duration was 14 months (range: 1-27). The main limitations were non-response bias, non-representativeness, the lack of a control group, sample sizes too small to detect rare outcomes (e.g., autoimmune diseases), and information bias [e.g., recall bias, adverse events following immunization (AEFI) not clinically confirmed]. Non-established IM studies classified as "Others" covered only drugs from the cardiovascular system ATC main group. In hospital-based studies, a wide range of drugs was monitored, although the median number of patients included (488) was the lowest of all reviewed studies. Regarding funding sources, 8 of the 21 studies did not mention the source of funding, 7 were supported by governmental institutions, 3 by the pharmaceutical industry, and 1 by a non-governmental organization, and 2 reported no source of funding.

DISCUSSION
In the decade following the paradigm shift in medicines regulatory systems from a largely reactive response to a more proactive approach to drug safety issues (2006-2016), we thoroughly examined IM methodological features for data collection and analysis, the populations surveilled, limitations, and applications of IM in daily practice. The IM studies reviewed were implemented in 26 countries with different maturity levels of post-marketing surveillance systems. IM systems operated both in countries with non-existent or weak SR schemes, such as sub-Saharan African countries (23,139), and in countries that have the most widely used record-linkage databases in the world for drug research, such as the UK (e.g., Clinical Practice Research Datalink) (140) or the Netherlands (e.g., PHARMO) (141), illustrating the contribution of IM systems to the generation of real-world evidence. Regardless of the differences among the methodologies used, these schemes were developed to fill the gap between RCT (high internal validity and low external validity) (142,143), SR data (limited by under-reporting and selective reporting) (25,144), and automated database studies (whose large size, longer follow-up times, and representativeness make it possible to study real-world effectiveness and safety, but which are usually poor in detailed covariate data) (145,146). Based on event monitoring and by tracking patients and drug use in a life-cycle-based fashion, the results of IM studies encompass the identification and quantification of factors that may negatively affect the benefit/risk balance, including (new) adverse events (identification and strengthening of signals), increased knowledge of drug utilization patterns, and identification of off-label use, among others. Moreover, by collecting longitudinal data from the first day of drug use, IM makes it possible to follow the time course (latency time and duration), outcome, and management of ADE (helping clinicians and patients to anticipate and handle ADE, improve adherence, and avoid early discontinuation), information that very few post-authorization methods can provide.
At the beginning of the century, Waller and Evans (147) argued that pharmacovigilance should be less focused on finding harm and more focused on extending knowledge of safety. Since then, the regulatory landscape has evolved and, in parallel, post-marketing active surveillance schemes have endeavored to meet the new regulatory challenges. IM systems were no exception. For example, in the UK, PEM moved toward a more targeted form of surveillance: M-PEM. In the latter, efforts are made to better understand known or partially known drug risks (e.g., targeted analysis of events requiring special monitoring, more detailed characterization of drug usage, adherence to prescribing guidelines), and alignment with regulatory requirements [e.g., PASS as part of the risk management plan (RMP)] is explicitly described as an application of this scheme. Further, the target sample size of 10,000 patients in conventional PEM studies, which was driven by a sensitivity assumption for detecting rare and uncommon events, was abandoned in M-PEM studies, where a specific sample size is calculated depending on the research question of interest (18). Some authors argue that IM is not an efficient way to detect events of such low frequency and that other methods should be considered for that purpose. For example, SR followed by an analytical study to confirm the signal would probably be a more suitable method (85). Likewise, the limited follow-up duration does not allow for the detection of long-term events (e.g., cancer).
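To give a sense of why cohort size matters for rare events, the sketch below computes the probability of observing at least one event in a cohort of a given size. It is an illustration under a simple binomial assumption (independent patients, each with the same event probability) and is not the sensitivity assumption actually used to set the PEM target; the function name and the incidence values are hypothetical.

```python
def prob_at_least_one_event(incidence: float, cohort_size: int) -> float:
    """Probability of observing one or more events in the cohort.

    Assumption: events occur independently across patients with the same
    per-patient probability `incidence` (simple binomial model).
    """
    if not 0.0 <= incidence <= 1.0:
        raise ValueError("incidence must be between 0 and 1")
    if cohort_size < 0:
        raise ValueError("cohort_size must be non-negative")
    return 1.0 - (1.0 - incidence) ** cohort_size


COHORT_SIZE = 10_000  # conventional PEM target mentioned in the text

# Illustrative incidences: 1 in 1,000, 1 in 10,000, and 1 in 100,000 patients.
for one_in in (1_000, 10_000, 100_000):
    probability = prob_at_least_one_event(1 / one_in, COHORT_SIZE)
    print(f"1 in {one_in:>7,}: P(at least one event) = {probability:.3f}")
```

Under this model, the chance of seeing even a single case falls quickly as the incidence drops below about 1 in the cohort size, which is in line with the argument that other methods may be better suited to very rare events.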
On the whole, the drugs monitored in the reviewed studies were in the early post-marketing phase or were characterized by uncertainties concerning specific safety issues, namely those identified in the RMP (safety concerns raised from RCT, post-marketing experience, and/or suspicion of inappropriate drug use). This was generally in line with the IM drug entry decision criteria previously described by Coulter (19) and more recently by Harrison-Woolrych (148). It is also noteworthy that older drugs can be studied with this methodology. This was the case for metformin, marketed 60 years ago, for which relevant information from the daily practice perspective, such as the outcome, management, and time course of metformin-related ADE, was lacking (77). We also observed that two-thirds of CEM studies were launched in resource-constrained settings and developed for monitoring artemisinin-based combination therapy for malaria treatment, aiming to complement information from RCT. In recent years, CEM has been adapted to cover other drugs, such as antiretrovirals (126) and vaccines (76), among others. Over time, practical handbooks have been issued by the WHO to support the implementation of specific programs [malaria (149), HIV/AIDS (150), and tuberculosis (151)]. The experiences of countries that have implemented CEM indicate that this was a key opportunity to raise awareness and to build pharmacovigilance capacity in these settings, which can be expected to have a positive effect on SR activities in the long run (23). The latter is important, since there is a need to strengthen ADR reporting rates in low-income countries and IM studies could be used in national pharmacovigilance systems (152).
Although IM systems are found worldwide, the majority of monitored drugs were prescribed at the primary care level, highlighting the limited research in hospital and other secondary care settings, among both established and non-established IM studies. At hospital level, where the drug market is rapidly changing, with more and more new drugs being introduced (e.g., for cancer, autoimmune diseases, and infectious diseases) (153), it seems that automated databases or, often, registries (drug registries or, frequently, disease registries) supplant IM systems. This might be partially due to efficiency reasons tied to decisions taken in early-stage dialogue with regulatory agencies. A recent study (154) revealed that one third of drugs approved in Europe (2007-2010) were coupled with a requirement for a registry, mainly for the purpose of gathering additional safety data. Most of the registries involved were derived from existing disease registries, i.e., designed for other purposes. This feature is seen as an advantage of this source for efficiency reasons. However, it could also represent a weakness, since the multipurpose nature of registries means that they are often organized around broader questions and are therefore limited by heterogeneity in safety data collection and reporting (155). In other words, they may lack a focused hypothesis, since they are viewed as a data collection structure within which studies can be performed rather than as a study aimed at answering a specific research question (16,17,153). It is also important to cover drugs prescribed by specialists, whose patients are frequently more complex in terms of underlying disease and co-morbidities. This drawback did not apply to LIM studies, where the inclusion point was commonly the community pharmacy, but it did apply to PEM/M-PEM. In the UK, to overcome this, a new IM system is being developed: Specialist Cohort Event Monitoring (SCEM). A few SCEM studies are ongoing: OBSERVA (Observational Safety Evaluation of Asenapine) and ROSE (Rivaroxaban Observational Safety Evaluation), both in response to post-authorization commitments requested by the European Medicines Agency (156).
Over the study period, the reviewed IM studies were not restricted to safety data collection. Other domains of drug outcomes, such as drug utilization patterns (in terms of both prescriber characteristics and patient population) and, to a lesser extent, effectiveness ("therapeutic response"), were studied. Concerning safety, our review revealed a high degree of variability and a lack of standardization. Regardless of causality assessment, terms such as "adverse event" and "adverse reaction" were often used interchangeably, without explicit definitions to ensure consistency of use. In the PEM and IMMP methodologies, the reported information was treated as adverse events. In LIM studies, however, although a causality assessment was not performed, the term ADR was used for the reactions reported, as the authors argued that patients were asked to report only symptoms that they believed to be associated with the use of the monitored drug. In this review, we used the terms reported by the authors, but we encourage the development of methodological and guidance standards for safety reporting, for example through scientific and collaborative working groups at international level (e.g., International Society of Pharmacovigilance and International Society for Pharmacoepidemiology).
Patients and caregivers (PCG) were the event data source in 39.1% of the studies. Over time, the evolving regulatory landscape has heightened the recognition of patients as important players in clinical practice (157). Since 2012, patients in the European Union have been able to report ADE directly to competent authorities. Nevertheless, the concept of patient reporting schemes is far from new: it has been around for more than 50 years (158). Studies on patient reporting have demonstrated its ability to identify new safety signals early and to strengthen potential ones (159-161). Moreover, reports of symptomatic non-serious ADE from PCG are of great importance, since these events are often systematically downgraded by healthcare professionals, although they negatively affect patients' quality of life and adherence to treatment, and ultimately the benefit-risk balance of a drug. In contrast, PCG reports could be less valuable for detecting asymptomatic, serious, or fatal events (162-164).
Like any other primary data collection study, IM schemes are costly and labor-intensive. In a recent survey documenting the experiences of four African countries with CEM programmes (23), limited or inadequate funding was often cited as a challenge. This constraint was also reported in New Zealand, where IMMP was disestablished in 2013 after funding ceased (148). It also seems that Japan-PEM (J-PEM) is no longer operational, since no published study from this scheme was found within the time frame of our study. J-PEM was launched in 1997 (165) and at least two pilot studies were conducted: troglitazone (166) and losartan (167). Although J-PEM employed a concurrent control, which represented an advantage over the majority of the reviewed IM studies, it appeared to be rather complex with regard to data protection and managerial issues (22).
Low response rate and/or non-response bias was frequently mentioned as a limitation of both established and non-established IM system studies. A postal survey aiming to identify reasons for non-response in PEM studies (168) found workload and lack of payment to be the main reasons. In M-PEM studies, GP were offered a modest reimbursement for completing questionnaires, which had a positive impact on the response rate (the median response rate increased from 50% in PEM to 64% in M-PEM) (18). Moreover, unforeseen challenges arose when conducting CEM studies, namely socio-cultural factors that led to selective participation or non-participation (e.g., in Kenya some women could not give informed consent without permission from their husbands) (23). In LIM studies, non-response bias was also investigated (169). The major reason for non-response given by patients was that they had not been (properly) informed about the study in the pharmacy. Further reasons, such as the time required, lack of internet access, or being too ill to participate, were also pointed out (170). For external validity purposes, it is important to know whether the IM population is comparable to the whole population using the monitored drug. Härmark et al. (171) found that the LIM population was more often male, younger, and healthier (higher percentage of de novo treated patients, shorter disease treatment duration, and less co-medication) than the reference population. The authors concluded that these differences might lead to an underestimation of events; however, it was not clear whether they influenced the time course of events.
Our systematic review is subject to some limitations. Firstly, unpublished research (gray literature, reports) was not captured by our search strategy and therefore not included in this study. Secondly, we acknowledge that our review is limited by what authors have reported or presented in their studies. However, an assessment of quality was performed for all reviewed studies. Despite these limitations, we believe that our results are relevant and represent the first systematic review with the most comprehensive information available of IM systems implemented worldwide.

CONCLUSIONS
Over the study period, IM studies were implemented in 26 countries with different maturity levels of post-marketing surveillance systems, illustrating the contribution of IM schemes to the generation of real-world evidence. Based on event monitoring and by tracking patients and drug use in a lifecycle-based fashion, the reviewed studies had the following specific applications: increasing knowledge of the drug safety profile (outcome, time course, and management of ADE), identifying potential unrecognized and unsuspected ADE (a tool for signal generation), gathering ADE data in resource-limited settings from populations frequently excluded from RCT (pregnant women, children, and the elderly), increasing knowledge of drug utilization patterns, and identifying off-label use. Over time, an alignment with regulatory requirements was observed, with some studies undertaken to address specific questions related to safety concerns and drug utilization patterns (e.g., phase IV assessment as part of the RMP).
Within the scope of the criteria for implementing IM systems, we identified two major limitations. Unexpectedly, only 20% of reviewed studies were conducted at hospital level, which is a matter of concern insofar as healthcare systems are facing a lack of access to new medicines at ambulatory care level (e.g., issues concerning pricing/reimbursement) and the introduction of new drugs has shifted to the hospital setting. Additionally, IM access to data on (new) drug exposure cohorts, at either the identification or the follow-up stage, could constitute a barrier, given the complexity of managerial, data linkage, and data privacy issues.

DATA AVAILABILITY
All datasets analyzed for this study are included in the manuscript and/or the Supplementary Files.

AUTHOR CONTRIBUTIONS
MC and CT were the guarantors. All authors contributed to the study protocol, the development of the selection criteria, the risk of bias assessment strategy, and data extraction criteria. CT, MC, and JA developed the search strategy. CT, MC, and PF examined compliance of studies with eligibility criteria, with a fourth author acting as an arbiter (AM). CT, MC, and PF extracted data from reports of all included studies which was validated by a third author (FB). FB performed quality assessments which were validated by CT, with a third reviewer serving as the final arbitrator (MC). AM, HL, JA, and JC contributed to the result interpretation and discussion of results. All authors read, provided feedback and approved the final manuscript. All authors had full access to all data in the study and take responsibility for its integrity and the accuracy of the data analysis.

FUNDING
The publication fee was supported by the Center for Health Evaluation and Research (CEFAR), National Association of Pharmacies, Lisbon, Portugal.