Comparative efficacy and acceptability of pharmacotherapies for postpartum depression: A systematic review and network meta-analysis

Purpose: To evaluate the efficacy and tolerability of pharmacotherapies for postpartum depression (PPD). Method: We performed a computerized search of MEDLINE (Ovid and PubMed), Embase, Cochrane Library, Web of Science, and Google Scholar to identify eligible randomized controlled trials (RCTs) before 31 March 2022. We calculated standardized mean differences (SMDs) for continuous outcomes and odds ratios (ORs) for dichotomous outcomes with the random-effects model. The tolerability of antidepressants in terms of early dropouts was investigated. The surface under the cumulative ranking curve (SUCRA) was used to rank the treatments for each outcome. Quality assessment of the included studies was performed using the Cochrane Collaboration's tool. Results: A total of 11 studies with 944 participants were included in this network meta-analysis, involving nine antidepressants. With respect to efficacy, only estradiol and brexanolone were significantly more effective than the placebo (p < 0.05), and the calculated SUCRA indicated that estradiol (94.3%) had the highest probability of ranking first for reducing depressive symptoms, followed by paroxetine (64.3%) and zuranolone (58.8%). Regarding tolerability, a greater percentage of patients treated with brexanolone dropped out early compared with those treated with most other antidepressants. Conclusion: Only estradiol and brexanolone showed significantly higher efficacy than the placebo. According to the SUCRA ranking, estradiol, paroxetine, and zuranolone were the three best antidepressants. Concerning acceptability in terms of early dropouts, brexanolone was less well-tolerated than other antidepressants.


Introduction
Postpartum depression is one of the most common complications of childbirth, with an estimated prevalence of 10-20% worldwide (Howard et al., 2014; Ko et al., 2017). According to the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), PPD is a major depressive episode "with peripartum onset" and is defined as "the onset of mood symptoms occurring during pregnancy or in 4 weeks following delivery" (American Psychiatric Association, 2013). Nevertheless, PPD is variably defined in clinical practice, occurring from four weeks to 12 months after childbirth (Stewart and Vigod, 2016). Most women suffering from PPD are underdiagnosed and undertreated, and the condition can persist for years (Vliegen et al., 2014; Netsi et al., 2018). Some patients with PPD recover spontaneously within weeks; however, it is estimated that 20% of women with this disorder continue to suffer from depression during the first year and 13% after two years (Stewart and Vigod, 2016). The common symptoms of PPD include sleep disturbance, anxiety, irritability, and a feeling of being overwhelmed, as well as an obsessional preoccupation with the baby's health and feeding (Wisner et al., 2013; Stewart and Vigod, 2016). Severe psychiatric disorders with onset in the immediate postpartum period are called postpartum psychosis, which is rare, with an estimated prevalence of 1-2 cases per 1,000 births, and is often a manifestation of bipolar disorder (Meltzer-Brody et al., 2018a). PPD leads to impaired maternal emotions, loss of work, and negative effects on infant development (Pearson et al., 2013; Verkuijl et al., 2014; Valla et al., 2016) and is associated with an increased risk of both suicide and infanticide (Bergink et al., 2016).
Although the mechanism of PPD is not clearly understood, studies have demonstrated that one of the strongest risk factors is a history of mood and anxiety problems, especially depression during pregnancy (Wisner et al., 2013). Some evidence suggests that the hypothalamic-pituitary-adrenal (HPA) axis and γ-aminobutyric acid (GABA) signaling may play a role in the pathophysiology of postpartum depression (Meltzer-Brody et al., 2018a). In addition, the rapid decline in plasma concentrations of allopregnanolone, which is a potent positive allosteric modulator of synaptic and extra-synaptic GABA type A (GABAA) receptors, points to a relationship between peripartum hormonal fluctuations and GABA regulation (Maguire and Mody, 2008; Mody and Maguire, 2011).
The first-line antidepressants for PPD treatment are selective serotonin reuptake inhibitors (SSRIs); however, some studies have reported that the ability of SSRIs to treat and prevent postpartum depression is limited. In recent years, several new antidepressants have been developed for treating PPD and have shown promising efficacy and acceptability (Kanes et al., 2017; Deligiannidis et al., 2021). In this study, we aimed to compare the efficacy and acceptability of currently available pharmacological treatments for PPD in women.

Materials and methods
This systematic review and meta-analysis was performed and reported in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) extension statement for network meta-analyses (Hutton et al., 2015). A research question was established based on the population, intervention, comparator, outcome, and study design (PICOS) criteria as follows: what are the differences between currently available antidepressant drugs for the treatment of women with PPD? To compare these antidepressant drugs, we calculated the SMD for continuous outcomes and the OR for dichotomous outcomes. The primary outcome of this study was the change from baseline in depression severity between different antidepressant drugs, measured using the 17-item Hamilton rating scale for depression (HAMD-17) score (Hamilton, 1960). The secondary outcomes were the clinical remission rate (HAMD-17 score ≤ 7) and response rate (≥50% reduction of score from baseline), as well as acceptability (treatment dropout, measured by the proportion of patients who dropped out prematurely).
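The outcome definitions above can be illustrated with a short Python sketch. It is not the authors' analysis code (the analyses were run in Stata); the function names and the example counts in the usage note are hypothetical. It applies the stated remission and response thresholds to a single patient and computes an odds ratio with a Wald-type 95% CI from a 2×2 table, assuming no cell is zero.

```python
import math


def classify_hamd17(baseline, endpoint):
    """Return (remission, response) for one patient.

    Thresholds follow the text: remission is an endpoint HAMD-17 score <= 7,
    response is a reduction of at least 50% from baseline.
    Assumes baseline > 0.
    """
    remission = endpoint <= 7
    response = (baseline - endpoint) / baseline >= 0.5
    return remission, response


def odds_ratio_ci(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio of arm A versus arm B with a Wald CI on the log scale.

    Hypothetical helper for illustration; assumes all four cells are non-zero.
    """
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, with made-up counts of 20 responders out of 40 on an active drug and 10 out of 40 on placebo, `odds_ratio_ci(20, 40, 10, 40)` gives an odds ratio of 3.0 with a confidence interval that excludes 1.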

Search strategy
A systematic search of MEDLINE (Ovid and PubMed), Embase, Cochrane Library, Web of Science, and Google Scholar was performed to identify eligible studies before 30 April 2022, with no language restriction. We included double-blind, randomized controlled trials comparing antidepressants with the placebo or another active antidepressant for the treatment of PPD. Studies using psychotherapy were also included if it was applied equally to all intervention groups. The full reports of studies published in peer-reviewed journals were identified. Additionally, we searched ClinicalTrials.gov and the World Health Organization International Clinical Trials Registry Platform for ongoing and completed clinical trials with results (ClinicalTrials, 2021; WHO, 2021). The references listed in eligible articles and reviews were inspected to expand the scope of the searches. The combinations of terms used for the literature search are as follows: {[("antidepressant" OR "antidepressive") AND ("agent*" OR "medication*" OR "drug*")]} OR {("nortriptyline" OR "estradiol" OR "paroxetine" OR "sertraline" OR "fluoxetine" OR "brexanolone" OR "saffron" OR "zuranolone" OR "SSRI" OR "serotonin reuptake inhibitors" OR "SSRI" OR "serotonin norepinephrine reuptake inhibitor" OR "SNRI") AND [("PPD" OR "postnatal" OR "postpartum") AND ("depression" OR "disorder")]}.

Inclusion criteria
We included studies that met all of the following criteria: 1) randomized controlled parallel-group trials involving more than 10 patients; 2) studies that investigated the efficacy and tolerability of antidepressant agents for the treatment of women diagnosed with PPD according to standard operationalized diagnostic criteria (Feighner Criteria, Research Diagnostic Criteria, DSM-III, DSM-III-R, DSM-IV, DSM-5, or ICD-10); 3) studies reporting detailed data on changes in the severity of depressive symptoms measured with the HAMD-17 score or the Montgomery-Åsberg depression rating scale (MADRS) to assess efficacy (Montgomery and Åsberg, 1979); and 4) original full-length articles.

Exclusion criteria
We excluded studies if they met any of the following criteria: 1) studies that used psychotherapy or other treatments rather than pharmacotherapy; 2) studies that did not report detailed data to assess efficacy and tolerability; and 3) conference abstracts, guidelines, editorials, letters, and reviews.

Data extraction and quality assessment
We used a standardized protocol to extract the following relevant data and results from the included studies: authors, the country in which the trial was conducted, year of publication, the sample size of each treatment group, patient age, dosage, follow-up period, outcome measurement, and corresponding data. The Cochrane Collaboration's risk of bias tool was used to assess the risk of bias for the included studies (Higgins et al., 2011), and the following seven domains were evaluated as low, high, or unclear risk of bias for individual studies: 1) random sequence generation, 2) allocation sequence concealment, 3) blinding of participants and personnel, 4) blinding of outcome assessment, 5) completeness of outcome data, 6) selective reporting, and 7) other sources of bias. Data extraction and quality assessment were performed by two reviewers (Z.Q. and D.X.L.) independently, and disagreements were resolved through discussion or arbitration by the third reviewer (L.W.).

Data synthesis and analysis
We conducted a network meta-analysis using the frequentist model to compare the efficacy and tolerability between antidepressants of interest in all RCTs. For individual studies, we calculated the SMD and corresponding 95% confidence intervals (95% CIs) for continuous outcomes at each follow-up time point, whereas for dichotomous outcomes of clinical response and remission rate as well as dropout rate, we calculated the OR and corresponding 95% CIs using a random-effects network meta-analysis model. We created network plots to visually present the comparison between different treatments, in which the node sizes correspond to the number of study participants and connection widths correspond to the number of studies. We performed a network meta-analysis of the comparative effectiveness or tolerability using contrast-based network meta-analysis methods. Inconsistency, that is, disagreement between direct and indirect evidence, was evaluated using the overall inconsistency test by fitting a design-by-treatment interaction model. In the case of different intervention doses in a single RCT, we combined them into a single intervention group. The SUCRA and the mean ranks were used to rank the treatments; they reflect the probability of a given treatment being the best (or worst) option (Salanti et al., 2011). For SUCRA values, 0% indicates that the treatment has no chance of being the most efficacious, and 100% indicates that the treatment is certainly the most efficacious. As there is no established methodology to evaluate publication bias between studies in network meta-analysis, we used comparison-adjusted funnel plots to assess publication bias among treatment comparisons (Chaimani and Salanti, 2012). Comparison-adjusted funnel plots are scatter plots of effect size versus precision, in which substantial asymmetry around the effect estimate suggests the likelihood of publication bias.
Heterogeneity across included RCTs was determined with the Cochran Q test and measured with the I² statistic: 0%-40%, slight; 30%-60%, moderate; 50%-90%, substantial; and 75%-100%, considerable. All analyses were conducted using Stata version 15.2 (StataCorp, TX, United States), with statistical significance at p < 0.05.
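As a rough illustration of how SUCRA and the mean rank relate to a treatment's ranking probabilities, the following Python sketch computes both from a vector of rank probabilities. It is a minimal example under stated assumptions, not the Stata routine used in this study; the function names and the example probabilities in the usage note are hypothetical.

```python
def sucra(rank_probs):
    """SUCRA for one treatment, as a fraction between 0 and 1.

    rank_probs[j] is the probability that the treatment takes rank j + 1,
    where rank 1 is the best. SUCRA is the average of the cumulative
    ranking probabilities over the first (n - 1) ranks.
    """
    n = len(rank_probs)
    if n < 2:
        raise ValueError("at least two treatments are needed")
    if abs(sum(rank_probs) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:
        cumulative += p      # probability of being ranked at this rank or better
        total += cumulative
    return total / (n - 1)


def mean_rank(rank_probs):
    """Expected rank of the treatment (1 = best)."""
    return sum((j + 1) * p for j, p in enumerate(rank_probs))
```

With three treatments and hypothetical rank probabilities of 0.5, 0.3, and 0.2, `sucra([0.5, 0.3, 0.2])` is 0.65 and `mean_rank([0.5, 0.3, 0.2])` is 1.7; the two are linked by SUCRA = (n − mean rank) / (n − 1).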
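The relationship between Cochran's Q and I² can be sketched in Python for the simple pairwise case. This is an illustrative assumption rather than the global network heterogeneity estimate produced by Stata; the function name and the inputs in the usage note are hypothetical.

```python
def cochran_q_i2(effects, variances):
    """Return (Q, I2 in percent) for study effect sizes and their variances.

    Uses inverse-variance weights and the fixed-effect pooled estimate;
    I2 = max(0, (Q - df) / Q) * 100 with df = number of studies - 1.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists with at least two studies")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2
```

For two hypothetical studies with effects 0.0 and 1.0 and variance 0.25 each, Q is 2.0 on one degree of freedom, so I² is 50%, which falls in the moderate-to-substantial overlap of the bands listed above.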

Literature search
Our initial literature search identified 1861 citations, of which 978 were excluded as duplicates. Among the remaining 883 results, most concerned psychotherapies (i.e., cognitive behavioral therapy and interpersonal therapy) rather than pharmacological therapies (n = 437). Other studies were excluded for the following reasons: prevention instead of treatment of PPD (n = 51); treatment of depression during the prenatal and pregnancy period (n = 89); review, systematic review, and meta-analysis (n = 196); and retrospective and observational studies (n = 42). We conducted a full-text review of the 68 potentially eligible studies, and finally 11 studies including a total of 944 participants were included in this network meta-analysis (Appleby et al., 1997; Wisner et al., 2006; Yonkers et al., 2008; Bloch et al., 2012; O'Hara and McCabe, 2013; Hantsoo et al., 2014; Kanes et al., 2017; Kashani et al., 2017). Table 1 summarizes the demographic characteristics of the included studies. The study sample ranged from 12 to 150, with a mean age of 25.2-32.1 years. In total, 582 participants were randomly assigned to active antidepressant medications, whereas 362 participants were randomly assigned to the placebo. In seven RCTs, the patients had moderate-to-severe postpartum depressive disorder, with baseline severity on the HAMD-17 score of 21-29, whereas in three RCTs patients had lower HAMD-17 scores of 16-18 (Bloch et al., 2012; Kashani et al., 2017; Li et al., 2020). In one trial, however, the HAMD-17 score at baseline was 13.67 because it used the Edinburgh postnatal depression scale (EPDS) for screening participants instead of the HAMD-17 (Appleby et al., 1997). The treatment period for the included studies ranged from 4 to 12 weeks, and in most RCTs the period lasted for eight weeks. In five RCTs, the antidepressant medications were SSRIs, including sertraline, fluoxetine, and paroxetine.
For the remaining studies, brexanolone was used in three RCTs (Kanes et al., 2017; Meltzer-Brody et al., 2018b), zuranolone was used in one RCT (Deligiannidis et al., 2021), estradiol was used in one RCT (Li et al., 2020), saffron was used in one RCT (Kashani et al., 2017), and nortriptyline was used in one RCT (Wisner et al., 2006). Figure 2 shows the risk of bias for the included RCTs. Of the 11 RCTs included in the network meta-analysis, four were rated at low risk of bias (Appleby et al., 1997; Kanes et al., 2017; Meltzer-Brody et al., 2018b; Deligiannidis et al., 2021), and six were considered at unclear risk of bias due to attrition bias (Yonkers et al., 2008; Bloch et al., 2012; O'Hara and McCabe, 2013; Hantsoo et al., 2014; Kashani et al., 2017; Li et al., 2020). Only one RCT was rated at high risk of bias because the blinding was not explicitly reported (Yonkers et al., 2008). The comparison-adjusted funnel plots showed no evidence of apparent asymmetry, indicating no significant publication bias (Figure 3).

Comparison of efficacy between antidepressant medications
For the primary outcome of the mean change in HAMD-17 score, the network plot consists of nine nodes (Figure 4A), and Table 2 shows the results of the network meta-analysis. As the majority of RCTs were comparisons between antidepressant medications and the placebo, it was not feasible to check for inconsistency between direct and indirect evidence. There was moderate statistical heterogeneity between studies (I² = 54.9%) regarding the efficacy of changes in the HAMD-17 score. In all RCTs, the active antidepressant drugs were numerically superior to the placebo for reducing depression, with SMD ranging from −1.70 (95% CI −4.72 to 1.33) to −9.59 (95% CI −15.78 to −3.4). However, our analysis demonstrated that only estradiol and brexanolone were significantly superior to the placebo (p < 0.01 and p = 0.02, respectively), whereas for all the remaining seven antidepressant medications the results suggested there was no substantial difference between them and the placebo (with p-values ranging from 0.08 to 0.54). Among the active antidepressant medications, only estradiol was significantly more effective than sertraline, with an SMD of −7.89 (95% CI −14.78 to −1.01), p = 0.03. The SUCRA suggested that estradiol (94.3%), paroxetine (64.3%), and zuranolone (58.8%) were superior to other antidepressants in the ranking probability, and the details are presented in Figure 5A.
In addition to mean changes from baseline, the remission rate was reported in six studies involving five antidepressant drugs (Wisner et al., 2006; Yonkers et al., 2008; Hantsoo et al., 2014; Kanes et al., 2017; Meltzer-Brody et al., 2018b; Deligiannidis et al., 2021), and the corresponding network plot is shown in Figure 4B. Although no significant differences were found between active antidepressant medications and the placebo (p = 0.18-0.48), the SUCRA ranking suggested that sertraline (68.0%), nortriptyline (67.9%), and paroxetine (62.8%) had a higher remission rate than others (Figure 5B). The relative efficacy in terms of the responder rate was reported in five studies, including five antidepressants (Figure 4C) (Wisner et al., 2006; Hantsoo et al., 2014; Kanes et al., 2017; Meltzer-Brody et al., 2018b; Deligiannidis et al., 2021). According to SUCRA, nortriptyline had the greatest likelihood of ranking first (80.0%), followed by sertraline (64.5%), as shown in Figure 5C. However, neither of them was significantly superior to the placebo (OR = 6.74, 95% CI 0.46-99.39 and OR = 4.00, 95% CI 0.5-31.83, respectively). No global statistical heterogeneity was noted among studies with respect to either remission (I² = 0.0%) or responder rates (I² = 0.0%). Because several RCTs reported outcomes measured with the MADRS, we performed a network analysis of efficacy among these RCTs, and the results are presented in Supplementary Table S1.
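As a plausibility check on how a 95% CI relates to a two-sided p-value, the following Python sketch back-calculates the standard error, z statistic, and p-value from a point estimate and its interval. It assumes a symmetric, normal-approximation (Wald) interval, which may not match exactly how the Stata model derived its p-values; the function name is hypothetical.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Back-calculate (SE, z, two-sided p) from an estimate and its 95% CI.

    Assumes the interval is estimate +/- z_crit * SE on the analysis scale.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Estradiol versus sertraline as reported above
se, z, p = p_from_ci(-7.89, -14.78, -1.01)
```

Under this approximation the estradiol versus sertraline comparison yields a p-value between 0.02 and 0.03, in line with the reported significance at p < 0.05.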

Tolerability and side effects
Regarding comparative acceptability, a total of nine RCTs reported the total number of early dropouts for any reason. No antidepressant medication was found to be significantly inferior to the placebo. Similarly, there was no significant difference between active antidepressants. When treatments were ranked according to SUCRA, brexanolone was less well-tolerated than most other drugs (4.5%), followed by two SSRIs, fluoxetine (28.6%) and sertraline (29.0%; Supplementary Figure S1). We did not observe statistical heterogeneity between RCTs (I² = 0.0%). As few studies provided details on the most common and serious side effects, it was not feasible to pool data for this endpoint (Supplementary Table S2). However, we carried out the network meta-analysis in studies reporting dropout due to side effects and lack of efficacy, and the results were similar to those for dropout for any reason (Supplementary Figure S2).

Discussion
To the best of our knowledge, this is the first systematic review and network meta-analysis of RCTs of antidepressant medications for women with PPD. Using the change in the HAMD-17 score from baseline as the endpoint, only estradiol and brexanolone were found to be superior to the placebo after 4-12 weeks of treatment. Among the active antidepressants, we found that only estradiol was significantly more effective than sertraline. The SUCRA indicated that estradiol, paroxetine, and zuranolone were superior to other antidepressant drugs. Using the remission rate as the endpoint of interest, the SUCRA ranking demonstrated that nortriptyline, sertraline, and paroxetine had a higher remission rate than other antidepressants. Again, no significant difference was found among these drugs. In terms of responder rate, despite observing no significant differences in efficacy between treatments through indirect comparisons among studies, the SUCRA indicated that nortriptyline had the greatest likelihood of ranking first. Cooper et al. performed an indirect comparison between brexanolone and SSRIs with EPDS and HAMD-17 as the measurements. According to their analyses, brexanolone was more effective within 60 h and can potentially lead to better treatment and symptom reduction for mothers with PPD (Cooper et al., 2019). In our study, the EPDS was not used because of insufficient data; moreover, the EPDS was developed as a screening tool and may not capture all relevant aspects of changing PPD symptoms.
Concerning the tolerability profile, the SUCRA ranking revealed that brexanolone has a higher dropout rate for any reason than most other antidepressant medications. On the other hand, two SSRIs, fluoxetine and sertraline, were better tolerated than other antidepressants; however, considering that many dropouts were lost to follow-up, the results should be regarded with caution. Because of insufficient data, it was difficult to perform a comparison regarding the dropout rate for the most common and serious side effects. Another point that should be noted is that brexanolone is administered by injection and not orally, which may be inconvenient in outpatient settings.
Depending on the severity and functional status of women diagnosed with PPD, treatment options vary widely. For mild disorder, psychosocial interventions by trained health professionals are recommended as the first-line interventions, whereas for women with moderate disorder, formal psychotherapy over 3-4 months should be considered (Antenatal and postnatal mental health: clinical management and service guidance, 2014; Dennis and Dowswell, 2013). When PPD is severe or cannot be resolved with psychological treatment, antidepressant drugs are required. At present, the majority of trials for the treatment of PPD have focused on non-pharmacological treatments. By comparison, far fewer trials have focused on pharmacotherapy for the treatment of PPD, and most of these studies were placebo-controlled trials (Lam et al., 2013). In the current network meta-analysis, we observed moderate heterogeneity, which is commonly seen in psychological treatment research. Nevertheless, because of insufficient data, we were not able to explore potentially important clinical and demographic information at the individual patient level such as age, duration of illness, and especially the severity of symptoms (ranging from 16.5 to 29.1 at baseline with the HAMD-17 scale). Although the absence of these clinical subgroup analyses might limit the applicability of our results, it was intended as a methodological strength to assure transitivity in the network. For non-postpartum populations, there is evidence of some benefits of combined psychotherapy and pharmacotherapy for functional outcomes. Nevertheless, trials of combination therapy for postpartum depression are scarce. Considering that several newer RCTs reported the MADRS scale, we compared different antidepressants in RCTs providing MADRS, and the results are consistent with those measured with the HAMD-17.
Because there are few head-to-head comparison studies between different antidepressants, the results of the current network meta-analysis were derived from indirect comparisons of treatments. Compared with pairwise comparison, network meta-analysis can supply a more precise estimate of the relative efficacy and tolerability and allow treatments to be ranked to assist clinical decisions (Salanti et al., 2008; Salanti, 2012). Nevertheless, due to the high risk of bias of included RCTs, the quality of the evidence from this study is low. Therefore, the results of the current network meta-analysis should be regarded with caution. However, in the absence of a direct comparison between antidepressants, our findings represent the best currently available evidence for patients and clinicians to inform first-line and second-line treatment decisions for PPD.
There are some limitations to our study which should be considered. First, in more than half of the RCTs, the antidepressant medications were tested in fewer than 100 patients, and small trials usually result in larger treatment effects. Second, many RCTs were conducted more than 10 years ago and did not report adequate information about allocation concealment and sequence generation. Furthermore, the endpoints used to judge treatment efficacy were less stringent in earlier RCTs than in newer ones. Third, some RCTs reported detailed efficacy at varied time points such as 4, 24, and 48 h; however, it was not feasible to pool these data because of insufficient details.

Conclusion
All active antidepressant medications demonstrated numerically higher efficacy relative to the placebo when using changes in the HAMD-17 score from baseline as the endpoint; however, only estradiol and brexanolone were significantly superior to the placebo. The SUCRA ranking revealed that estradiol, paroxetine, and zuranolone were superior to other antidepressants. Concerning acceptability in terms of early dropouts, the SUCRA ranking indicated that brexanolone is less well-tolerated than most other antidepressants.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.