An Observational Retrospective Cohort Trial on 4,828 IVF Cycles Evaluating Different Low Prognosis Patients Following the POSEIDON Criteria

Objective: To study actual controlled ovarian stimulation (COS) management in women with suboptimal response, comparing clinical outcomes with gonadotropin consumption and considering the potential role of adding luteinizing hormone (LH) to follicle-stimulating hormone (FSH). Design: Monocentric, observational, retrospective, real-world clinical trial on fresh intra-cytoplasmic sperm injection (ICSI) cycles retrieving 1 to 9 oocytes, performed at Humanitas Fertility Center from January 1st, 2012 to December 31st, 2015. Methods: COS protocols included the gonadotropin-releasing hormone (GnRH) agonist long, flare-up and short protocols and the GnRH antagonist protocol. Both recombinant and urinary FSH were used for COS, and LH was added according to clinical practice. The ICSI outcomes considered were: gonadotropin dosages; total, mature, injected and frozen oocytes; cumulative, transferred and frozen embryos; implantation rate; and pregnancy, delivery and miscarriage rates. Outcomes were compared according to the gonadotropin regimen used during COS. Results: Our cohort comprised 20.8% low responders, defined as 1–3 oocytes retrieved, and 79.2% “suboptimal” responders, defined as 4–9 oocytes retrieved. According to the recent POSEIDON stratification, cycles were divided into group 1 (6.9%), 2 (19.8%), 3 (11.7%), and 4 (61.5%). The cohort was also divided into 3 groups according to the gonadotropin regimen. Women treated with FSH plus LH showed the worst prognostic factors in terms of age, basal FSH, AMH, and AFC. This difference was evident in suboptimal responders, whereas only AMH and AFC differed among treatment groups in low responders. Despite differences in the numbers of oocytes and embryos obtained, major ICSI outcomes (i.e., pregnancy and delivery rates) were similar among COS treatment groups. Outcomes were significantly different among POSEIDON groups.
Implantation, pregnancy and delivery rates were significantly higher in POSEIDON group 1 and progressively declined in the other POSEIDON groups, reaching the lowest values in group 4. Conclusions: In clinical practice, women with the worst prognostic factors are generally treated with a combination of LH and FSH. Although low-prognosis women had fewer oocytes retrieved, the final ICSI outcome, in terms of pregnancy, was similar among treatment groups. This result suggests that adding LH to FSH during COS could improve the quality of the oocytes retrieved, balancing the differences that are evident at baseline. Clinical Trial Registration: www.ClinicalTrials.gov, identifier: NCT03290911


INTRODUCTION
The number of couples seeking help from assisted reproductive technologies (ART) is progressively increasing, and about 1.5 million cycles are currently performed every year (1). ART starts with a controlled ovarian stimulation (COS) phase, in which the ovary is stimulated with exogenous gonadotropins to produce the largest possible number of oocytes for embryo development. During COS, follicle stimulating hormone (FSH), human menopausal gonadotropin (hMG) and luteinizing hormone (LH) are variably used. However, substantial inter- and intra-individual variability in the response to COS has been widely demonstrated, and research is focused on how to optimize ovarian response. In this process, two main challenges remain. First, how can women who respond poorly to COS be identified? The first definition of poor responders dates back to 1983 (2), although the first realistic attempt to define them was made only in 2011 by the European Society of Human Reproduction and Embryology (ESHRE) (3). Under this definition, poor responders show at least two of the following criteria: (i) advanced age (>40 years), (ii) previous poor ovarian response (<3 oocytes retrieved), (iii) abnormal ovarian reserve test, detected as antral follicle count (AFC) < 5-7 or anti-Müllerian hormone (AMH) <0.5-1.1 ng/mL (3). However, this classification fails to capture all patients who experience a poor response to ovarian stimulation. Recently, a new classification of low ovarian response has been proposed. Four subgroups have been identified considering quantitative and qualitative parameters, such as age, AFC, AMH and, where a previous stimulation cycle was performed, the ovarian response to it, defined as fewer than 9 oocytes retrieved (4). These criteria could assist the clinician in classifying women, although they are still not useful for defining the best COS treatment.
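The "at least two of three" rule of the ESHRE definition described above can be sketched as a small function. This is only an illustration: the function name and its parameters are hypothetical, the text gives ranges rather than single cut-offs for AFC and AMH, so the default cut-offs below (upper ends of those ranges) are assumptions, and the thresholds follow the wording of this paper rather than any other source.

```python
from typing import Optional

def is_eshre_poor_responder(
    age: float,
    previous_oocytes: Optional[int],
    afc: Optional[float],
    amh: Optional[float],
    afc_cutoff: float = 7.0,   # assumption: upper end of the "5-7" range in the text
    amh_cutoff: float = 1.1,   # assumption: upper end of the "0.5-1.1 ng/mL" range
) -> bool:
    """Return True when at least two of the three criteria in the text are met."""
    advanced_age = age > 40
    # Criterion (ii) only applies when a previous cycle exists.
    previous_poor_response = previous_oocytes is not None and previous_oocytes < 3
    # Criterion (iii): abnormal reserve test by AFC or AMH.
    abnormal_reserve = (afc is not None and afc < afc_cutoff) or (
        amh is not None and amh < amh_cutoff
    )
    return sum([advanced_age, previous_poor_response, abnormal_reserve]) >= 2
```

For example, a 42-year-old woman with 2 oocytes retrieved in a previous cycle meets two criteria regardless of her reserve tests, whereas a 30-year-old with low AMH and no previous cycle meets only one.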
The second main challenge in ART is the choice of the appropriate gonadotropin stimulation (5,6). All COS protocols involve exogenous FSH administration. Is FSH alone enough to induce multiple follicle development, or could a gonadotropin combination improve the final outcome? Although there is wide consensus on the need for tailored protocols, there is currently no consensus on the gold-standard gonadotropin combination for COS. Indeed, COS schemes remain mainly empirical, and personalized medicine is still advocated in this setting (7). Best practices suggest the use of 150-300 IU of gonadotropins daily, but no more than 450 IU daily, for a total of 9-10 days (8). These doses are generally chosen according to the woman's expected response, using an average dosage of 150 IU for younger women and higher doses for women who are older or who are expected to have a poorer response. However, such approaches show poor clinical results and the ideal COS protocol is still under debate.
In the literature, a large number of clinical trials have evaluated the efficacy of gonadotropin combinations in COS. However, strong evidence for or against gonadotropin combinations has not been reached so far (9). The analysis of large databases of infertile populations could be useful to update the specification of the "best clinical practice" in ART. With this in mind, and considering that large population-based cohorts demonstrated that a yield of 10-15 oocytes resulted in the highest live birth rate in fresh cycles in all age groups (10), we decided to focus on women with low ovarian response according to the recent diagnostic stratification (i.e., ≤9 oocytes retrieved). Thus, a real-world trial based on a large database was designed to collect data from standard clinical practice in an ART center in which all gonadotropin preparations were used alone or in combination. The main aim of this study is to analyse actual COS management in women with a non-optimal response, comparing clinical outcomes with gonadotropin consumption.

Study Design
A monocentric, observational, retrospective, real-world clinical trial was performed evaluating all ART cycles performed at the Humanitas Fertility Center from January 1st, 2012 to December 31st, 2015. All fresh intra-cytoplasmic sperm injection (ICSI) cycles retrieving 1-9 oocytes were enrolled. Overall, this study enrolled women with low ovarian response according to the Patient-Oriented Strategies Encompassing IndividualizeD Oocyte Number (POSEIDON) stratification system (4, 11-13). Only ICSI cycles were included in the analyses, because information on oocyte quality was recorded only for this ART methodology. Moreover, inclusion criteria required an age ≤44 years and a body mass index (BMI) between 18 and 27 kg/m². The following exclusion criteria were considered: (i) age > 44 years, (ii) number of oocytes retrieved > 10 or <1, or cycles suspended before retrieval, (iii) abnormal uterine cavity, (iv) stage III-IV endometriosis or adenomyosis, (v) testicular sperm, and (vi) preimplantation genetic testing (PGT). Only fresh cycles were considered, and pregnancies from frozen cycles were excluded from the analysis.
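The eligibility rules above can be expressed as a simple filter. The following is a minimal sketch, not the authors' selection code: the `Cycle` record and its field names are hypothetical, the clinical exclusions (uterine cavity, endometriosis, testicular sperm, PGT) are collapsed into one flag, and the BMI bounds are assumed to be inclusive. The oocyte range uses the 1-9 enrolment criterion stated in the text.

```python
from dataclasses import dataclass

@dataclass
class Cycle:
    # Hypothetical record layout, for illustration only.
    age: float
    bmi: float
    oocytes_retrieved: int
    is_fresh_icsi: bool
    has_clinical_exclusion: bool  # any of exclusion criteria (iii)-(vi)

def is_eligible(cycle: Cycle) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    return (
        cycle.is_fresh_icsi
        and cycle.age <= 44
        and 18 <= cycle.bmi <= 27           # assumption: bounds are inclusive
        and 1 <= cycle.oocytes_retrieved <= 9
        and not cycle.has_clinical_exclusion
    )

def select_cohort(cycles: list[Cycle]) -> list[Cycle]:
    """Keep only the cycles that satisfy every criterion."""
    return [c for c in cycles if is_eligible(c)]
```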
The study was approved by the Independent Ethical Committee of the Humanitas Institutional Clinic (Milan, Italy) (Trial registration number: NCT03290911). Informed and written consent was obtained from each patient after full explanation of the purpose and nature of all procedures used.

COS Protocol
The COS protocol involved the use of recombinant FSH (rFSH), hMG or rFSH + recombinant LH (rFSH + rLH). The gonadotropin starting dose was determined according to ovarian reserve parameters (AMH and AFC) and BMI. COS was performed using four different protocols: GnRH agonist long protocol; GnRH agonist short protocol; GnRH antagonist protocol; flare-up GnRH agonist protocol. Most antagonist cycles started with combined oral contraceptive pretreatment.
The long GnRH agonist protocol was based on the administration of daily leuprorelin (Enantone die, Takeda, Italy) or triptorelin depot (3.75 mg IM, Decapeptyl®, Ipsen, Milan, Italy) starting on day 21 of the cycle preceding stimulation (luteal phase). When pituitary desensitization was achieved (14 days after the initiation of the GnRH agonist), as evidenced by the absence of ovarian follicles >5 mm and endometrial thickness <5.4 mm on transvaginal ultrasound examination, gonadotropin stimulation was initiated. In the short GnRH agonist protocol, the agonist (leuprorelin 0.1 mg/day) was administered from day 21 of the previous cycle; stimulation began on day 1 or 2 of the cycle (day 1 being the start of the menstrual bleed), at which point the agonist dose was reduced to 0.05 mg/day, and continued until the day of hCG administration. In the GnRH antagonist protocol, gonadotropin stimulation was initiated on the first day of the woman's spontaneous menstrual cycle or of a withdrawal bleed after a low-dose oral contraceptive; when the leading follicle reached 13-14 mm in mean diameter and/or plasma E2 exceeded 400 pg/ml, 0.25 mg of GnRH antagonist (Cetrotide®, Merck Serono S.p.A., Rome, Italy; Orgalutran®, Organon, MSD-Italy) was administered SC daily until the day of ovulation trigger.
Finally, in the flare-up GnRH agonist protocol, daily triptorelin (0.1 mg/day) was started on cycle day 1 and gonadotropins were started on cycle day 2, at a dose based on ovarian reserve parameters. A variable starting dose of gonadotropin, either hMG (Meropur®, Ferring, Milan, Italy) or rFSH (Puregon®, MSD-Italy; Gonal-F, Merck Serono S.p.A., Rome, Italy) with or without the addition of rLH, was given for the first 4 days; an individualized dose was then administered until the day of ovulation trigger, according to transvaginal ultrasound findings and estradiol and progesterone levels. The induction protocol and the dose of gonadotropins administered were tailored on an individual basis according to the patient's age, serum hormone levels, and AFC. Transvaginal ultrasonography and estradiol and progesterone determinations were performed during COS. When at least three follicles with a mean diameter >18 mm were observed, 250 mcg of recombinant hCG (Ovitrelle; Merck Serono S.p.A.) was administered subcutaneously. Oocyte retrieval was performed transvaginally 36 h after hCG injection. Embryo transfer was performed on day 3 to day 5 after oocyte collection. The luteal phase was supported in all patients with vaginal progesterone (Crinone 8%; Merck Serono S.p.A. or Prometrium; Rottapharm). Serum hCG was assessed 2 weeks after embryo transfer and then every 48 h until a value over 1,000 mIU was detected, and a vaginal ultrasound was scheduled 4 weeks after the embryo transfer.

Parameters Detected
All anamnestic information was collected, with attention to the years of infertility and the indication for ART. Baseline characteristics of the women were collected after the last menstrual cycle before ART: female age, AFC, and FSH, LH, estradiol, TSH, AMH and inhibin B serum levels. AMH and AFC were evaluated according to previous statements (14). The ART protocol applied was recorded, including the GnRH analog used, the FSH and LH doses used and the duration of the COS protocol. Finally, the following ART outcomes were considered: oocytes retrieved, oocyte nuclear maturity stage, injected, frozen and fertilized oocytes, and transferred and frozen embryos. Implantation, pregnancy, live birth, and miscarriage/ectopic rates were also collected. The implantation rate was calculated as the ratio between the number of gestational sacs identified at ultrasound and the number of embryos transferred. Clinical pregnancy was defined as a pregnancy with visualization of one or more gestational sacs or definitive clinical signs of pregnancy. It includes ectopic pregnancy as defined by the International Committee for Monitoring Assisted Reproductive Technology (ICMART) and the World Health Organization (WHO) Revised Glossary on ART Terminology, 2009 [21]. Miscarriage and ectopic pregnancy rates, per clinical pregnancy, were defined as the proportion of clinical pregnancies that failed to continue development before 20 weeks of gestation. Live birth was defined as the delivery of a fetus with signs of life after 20 completed weeks of gestational age.
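The rate definitions above are simple ratios and can be sketched as follows. The function and argument names are hypothetical and chosen only for illustration; the formulas follow the definitions given in the text.

```python
def implantation_rate(gestational_sacs: int, embryos_transferred: int) -> float:
    """Gestational sacs seen at ultrasound divided by embryos transferred."""
    if embryos_transferred <= 0:
        raise ValueError("at least one embryo must have been transferred")
    return gestational_sacs / embryos_transferred

def miscarriage_rate(losses_before_20_weeks: int, clinical_pregnancies: int) -> float:
    """Pregnancies lost before 20 weeks divided by all clinical pregnancies."""
    if clinical_pregnancies <= 0:
        raise ValueError("at least one clinical pregnancy is required")
    return losses_before_20_weeks / clinical_pregnancies
```

For instance, one gestational sac after the transfer of two embryos gives an implantation rate of 0.5 (50%).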

Statistical Analysis
The entire dataset was first evaluated to select each single couple treated with ART. These couples represented the entire cohort of patients evaluated by the study.
Descriptive analyses were performed on the entire cohort of patients. The distribution of continuous variables was evaluated with the Kolmogorov-Smirnov test. Because the distribution was not normal, continuous variables were compared with the Kruskal-Wallis non-parametric test. Post-hoc analyses were performed with the Tukey test. Continuous variables are expressed as mean ± standard deviation. Categorical variables were compared using the Fisher exact test and are expressed as number (percentage). Multiple stepwise linear regression analyses were performed with implantation rate as the dependent variable and the other parameters as independent variables. These analyses were repeated considering the following POSEIDON groups: (1) age <35 years with adequate ovarian reserve (AFC>5 and AMH>1.2 ng/mL) and ≤9 oocytes retrieved; (2) age >35 years with adequate ovarian reserve (AFC>5 and AMH>1.2 ng/mL) and ≤9 oocytes retrieved; (3) age <35 years with poor ovarian reserve (AFC<5 and AMH<1.2 ng/mL); (4) age >35 years with poor ovarian reserve (AFC<5 and AMH<1.2 ng/mL).
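The four-group stratification listed above can be sketched as a classifier. This is an illustrative reading of the thresholds as written in this section, not the authors' code: the function name is hypothetical, an age of exactly 35 years is assigned to the older groups as an assumption (the text does not cover that boundary), the 1-9 oocyte range of the cohort is applied to every group, and cases the text does not cover (discordant AFC and AMH, or values exactly at the cut-offs) return `None`.

```python
from typing import Optional

def poseidon_group(
    age: float, afc: float, amh: float, oocytes_retrieved: int
) -> Optional[int]:
    """Return the POSEIDON group (1-4) as described in the text, or None."""
    # Cohort-level criterion: only cycles retrieving 1-9 oocytes were studied.
    if not 1 <= oocytes_retrieved <= 9:
        return None

    adequate_reserve = afc > 5 and amh > 1.2
    poor_reserve = afc < 5 and amh < 1.2
    younger = age < 35  # assumption: age == 35 falls into the older groups

    if adequate_reserve:
        return 1 if younger else 2
    if poor_reserve:
        return 3 if younger else 4
    return None  # discordant or boundary reserve markers: not covered by the text
```

Returning `None` rather than forcing a group keeps the unclassifiable cases visible, which seems safer for a retrospective dataset where AFC and AMH may disagree.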
Statistical analysis was performed using the "Statistical Package for the Social Sciences" software for Macintosh (version 21.0; SPSS Inc., Chicago, IL). Because multiple hypotheses were tested together and multiple analyses were performed, statistical significance was evaluated after Bonferroni correction. Thus, nine endpoints were consecutively evaluated, and a p-value was considered statistically significant when p < 0.0005. Moreover, because real-world data registration includes both repeated data (i.e., multiple cycles performed for the same couple) and missing data, the missing not at random (MNAR) approach was used to adjust the analyses (15). For this purpose, the Expectation-Maximization method was applied, creating a new dataset in which all missing values are estimated by maximum likelihood methods (16).
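As a reminder of how a Bonferroni correction works, the per-test threshold is the family-wise significance level divided by the number of tests. The sketch below assumes a family-wise level of 0.05, which the text does not state, and the function names are hypothetical. Under that assumption nine endpoints would give a per-test threshold of about 0.0056, so the cut-off quoted above is stricter than this simple calculation and is reported here as given.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

def bonferroni_significant(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Flag each p-value as significant after correcting for len(p_values) tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]

# Assumed family-wise alpha of 0.05 over the nine endpoints mentioned in the text.
per_test = bonferroni_threshold(0.05, 9)
```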

RESULTS
Twelve thousand five hundred and forty-three ART cycles were performed from January 1st, 2012 to December 31st, 2015 on 9,928 infertile couples. Of these, 4,828 cycles fulfilled the inclusion and exclusion criteria and represented the final cohort evaluated. The study flow chart and the reasons for exclusion are shown in Figure 1.

Comparison Among Gonadotropins Groups
At baseline, significant differences among the gonadotropin groups (group 1, FSH alone; group 2, hMG; group 3, FSH plus LH) were detected. Female age was significantly higher in group 3 compared to groups 1 and 2 (p < 0.001 and p < 0.001, respectively) (Table 1). Moreover, female age was significantly higher in the hMG group compared to the FSH group (p < 0.001; Figure 2). Basal FSH serum levels were significantly higher in group 3 compared to groups 1 and 2 (p < 0.001 and p < 0.001, respectively), and lower levels were detected in group 1 compared to group 2 (p = 0.001; Figure 3). A similar trend was detected for AMH and AFC, which were highest in group 1 and progressively decreased to group 3 (Figures 4, 5). In contrast, basal LH did not differ among groups (Table 1). Taken together, these results suggest that the clinical approach differs according to the expected response, with LH added to FSH preferentially when a low response is expected.
The GnRH analog used during COS differed among groups. The GnRH antagonist was generally chosen when FSH or hMG was used. In contrast, when the FSH and LH combination was selected, agonist flare-up and antagonist protocols were used equally (Table 1). The duration of gonadotropin administration was similar among groups (p = 0.123; Table 1), whereas a clear difference in gonadotropin dosages was detected. FSH doses were significantly higher in group 2 compared to groups 1 and 3 (p < 0.001 and p < 0.001, respectively), whereas no difference in FSH dosage was seen between groups 1 and 3 (p = 0.107). This difference translated into a significantly lower FSH dose per oocyte retrieved (Table 1) in group 3 compared with group 2 (p < 0.001). The number of oocytes retrieved was higher in group 1 compared to groups 2 and 3 (p < 0.001 and p < 0.001, respectively) (Table 1 and Figure 6). However, no significant difference was seen between groups 2 and 3 (p = 0.319), suggesting that, despite the baseline differences between these two groups, the addition of LH to FSH yielded a relatively higher number of oocytes compared to hMG. Similarly, the number of MII oocytes reflected the total number of oocytes, with more retrieved in group 1 compared to groups 2 and 3 (p < 0.001 and p < 0.001) and a similar number in groups 2 and 3 (p = 0.620). Interestingly, the ratio of MII oocytes to total oocytes retrieved did not differ among groups (Table 1). A similar trend was observed for injected and fertilized oocytes (Table 1). Despite the differences between group 1 and the other groups, the number of cumulative, transferred and frozen embryos did not differ among groups (Table 1), suggesting that the addition of LH improves oocyte quality and the capability to develop embryos. This hypothesis was further supported by the similar results obtained among groups for implantation, pregnancy and delivery rates (Table 1).

Comparison Among Patients Stratification
Considering only women with low response (1-3 oocytes retrieved), none of the ART outcomes evaluated differed among groups, despite the different baseline characteristics (Table 2). In contrast, among women with a "suboptimal" response (4 to 9 oocytes retrieved), the three groups showed significant differences (Table 3). First, baseline characteristics (i.e., age, FSH, AMH and AFC) were significantly worse in group 3 compared to groups 1 and 2, and in group 2 compared to group 1. Regarding ART outcomes, the numbers of total, MII, injected and fertilized oocytes were significantly higher when FSH was used alone (group 1) than with hMG (group 2) or FSH plus LH (group 3), while no differences were seen between groups 2 and 3. Moreover, the other endpoints (i.e., embryos, implantation, pregnancy and delivery rates) did not differ among groups (Table 3). These results suggest that the addition of LH to FSH reduces the baseline differences among women, yielding outcomes similar to those of women with a better prognosis. These findings were maintained after statistical adjustment for women's age, AFC, AMH and the GnRH protocol used. Finally, multivariate analyses did not identify any independent variable able to modify the dependent one.
All ART outcomes were significantly different among POSEIDON groups after patient stratification (Table 4). Post-hoc analyses showed better ART outcomes in group 1 compared to the other three groups, and lower ART outcomes in group 4 compared to the others. Accordingly, implantation, pregnancy and delivery rates were significantly higher in group 1 and progressively declined among POSEIDON groups, reaching the lowest values in group 4 (Table 4). However, COS in these groups was not homogeneous, with higher FSH and lower LH doses in women belonging to group 1 (Table 4), confirming the previous suggestion that clinicians usually adapt the gonadotropin stimulation to the woman's ovarian reserve.

DISCUSSION
Our study supports a substantial effect of adding LH to FSH during COS in a large real-world ART setting. Here, 4,828 ICSI cycles performed in women with 9 or fewer oocytes retrieved have been fully evaluated. Although these women are known to respond poorly to COS, baseline clinical characteristics guide the clinician's decision about the best COS approach. Indeed, women with higher age, higher basal FSH levels, lower AMH serum levels and lower AFC (representing 30.9% of the entire cohort) are generally treated with a gonadotropin combination. In particular, women with the worst baseline clinical picture, 16.8% of the cohort, are preferentially treated with FSH plus LH instead of hMG (i.e., FSH plus hCG). In contrast, women with the best baseline clinical picture (69.1% of the entire cohort) are treated with FSH alone. This clinical approach is further confirmed when patients are stratified according to the more recent classification, which divides women according to the number of oocytes retrieved and the ovarian reserve (i.e., AFC and AMH serum levels) (4). However, despite the imbalance in patients' baseline characteristics, the final ART outcomes are similar among the three COS approaches evaluated. Indeed, FSH alone, hMG and FSH plus LH reached the same result in terms of embryo number and implantation, pregnancy and delivery rates. This is extremely important considering that the number of oocytes retrieved is an important prognostic variable for ART success (17,18). The addition of LH to FSH balances the final ART outcome among groups, despite the baseline clinical differences among women. Interestingly, hMG is usually chosen for women whose expected results lie between those of FSH alone and the FSH-LH combination (14.0%). Accordingly, hMG obtains results similar to the other groups, suggesting that the hCG-driven LH activity added to FSH during COS can improve the final ART outcome. However, hMG shows a higher ratio of FSH dose to oocytes retrieved, suggesting that this approach requires higher FSH consumption to obtain a stimulation similar to that of FSH alone or FSH plus LH.
The beneficial action of LH on COS is particularly evident when the women enrolled are divided into two clinical groups: low and suboptimal responders. Low responders are women with 1-3 oocytes retrieved, whereas suboptimal responders are those with 4-9 oocytes. Indeed, Sunkara et al. established that women with 4-9 oocytes retrieved achieve acceptable live birth rates, ranging from 15 to 36%, although their level of response to stimulation is not ideal (18). Indeed, age-matched women with 10-15 oocytes retrieved obtain live birth rates that are 20-30% higher (18). When suboptimal responders are considered, the beneficial action of LH is confirmed. In this subgroup, women are treated with FSH alone when a better outcome is expected, whereas the FSH-LH combination is preferred when an impaired baseline picture is evident, such as higher age and basal FSH levels and lower AMH and AFC. Despite these differences, the addition of LH improves the ART outcome, yielding similar results in terms of embryo number and implantation, pregnancy and delivery rates. On the other hand, low responders, although fewer in number (1,096 women), show only a slight baseline difference among COS groups, limited to AMH and AFC. In line with this smaller baseline difference, the final ART outcome remains similar among groups. However, no differences are observed in oocyte number and quality either. Thus, it is probable that the addition of LH improves the entire COS process, reducing the differences among women related to baseline characteristics. In this context, a further evaluation of the potential effect of single nucleotide polymorphisms (SNPs) on ART outcomes must be considered. Indeed, several trials have suggested that the gonadotropin effect during ART could be modulated by SNPs in gonadotropin and gonadotropin-receptor genes (19-25).
It is now well known that COS schemes in ART should be personalized to each woman to ensure the highest chance of live birth (7,26). Several models have been developed with the aim of predicting outcomes for the infertile patient and tailoring gonadotropin stimulation (27-29). These models involve similar, well-established predictors, such as female age, duration of infertility, number of previous successful or unsuccessful ART cycles, pregnancy history and whether infertility was caused by tubal pathology. However, they still need external validation and are not routinely applied in clinical practice (30,31). These models are designed for poor responders, although a well-standardized approach is still far from established (32,33). However, no specific large datasets are available to build predictive models for poor-responder patients. This is the first real-world evaluation of more than twelve thousand ART cycles. This paper offers a snapshot of the population that typically attends ART centers: the average age of the women is above 37 years, and 38.5% of the entire cohort showed a non-optimal response to ovarian stimulation (≤9 oocytes retrieved). Thus, alongside the large number of women enrolled, our trial provides an interesting focus on the actual COS approach in clinical practice.
A large number of clinical trials and meta-analyses have compared different gonadotropin combinations in terms of COS outcome (34-40). These publications covered a wide range of heterogeneous studies, evaluating different endpoints, settings and patient characteristics. Recently, a more comprehensive meta-analysis on FSH plus LH during COS has been performed, including 70 clinical trials and detecting a clear effect of LH on the final ART outcomes (41). Whereas FSH alone obtains a higher oocyte number, the FSH-LH combination leads to a higher pregnancy rate (41). Thus, the addition of LH seems to increase the selective pressure on follicular selection exerted by the two gonadotropins together, improving oocyte quality.
Here, we support these previous results in a real-world setting. Indeed, our results confirm in a clinical setting the molecular action of LH widely demonstrated in the literature. Through binding to its specific receptor, LH leads to the highest activation of the ERK1/2 and AKT pathways and to a final proliferative and antiapoptotic signal (42,43). LH thus exerts a proliferative action at the molecular level, leading to a better ART outcome.
This study has three main limitations. First, it was retrospectively designed, so possible selection and performance biases should be considered. Indeed, clinical trials should provide an a priori study design, which helps in the selection of patients and limits inter- and intra-individual differences. Second, the treatment groups show significant baseline differences in parameters widely associated with the final ART outcomes. Thus, the groups are not completely comparable at baseline, and consequently a clear advantage of one treatment over another is not demonstrated. Third, gonadotropin administration was not standardized within each group; each woman was treated with a tailored therapy.
In conclusion, our work provides a detailed description of the ovarian response to COS in a large population of women undergoing ART. Here, more than 4,000 cycles in women with a sub-optimal response, defined as 1 to 9 oocytes retrieved, are analyzed. Using this large database, a beneficial effect of adding LH to FSH during COS emerges from clinical practice for the first time. In particular, the gonadotropin combination is usually preferred when impaired clinical features are evident at baseline. This combined approach can reduce these differences, reaching similar ART outcomes. The results have been analyzed according to both the COS approach and the POSEIDON stratification. The latter clearly shows differences among POSEIDON groups.

DATA AVAILABILITY
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The study was approved by the Independent Ethical Committee of the Humanitas Institutional Clinic (Milan, Italy). Consent was obtained from each patient after full explanation of the purpose and nature of all procedures used.

AUTHOR CONTRIBUTIONS
PL-S designed the project, collected data, and drafted the manuscript. IZ, AB, EZ, LS, AS, EM, RD, and AD collected clinical data. DS analyzed clinical data and drafted the manuscript. All authors contributed to the final manuscript.