Integrated analysis of randomized controlled trials evaluating bortezomib + lenalidomide + dexamethasone or bortezomib + thalidomide + dexamethasone induction in transplant-eligible newly diagnosed multiple myeloma

Objective Providing the most efficacious frontline treatment for newly diagnosed multiple myeloma (NDMM) is critical for patient outcomes. No direct comparisons have been made between bortezomib + lenalidomide + dexamethasone (VRD) and bortezomib + thalidomide + dexamethasone (VTD) induction regimens in transplant-eligible NDMM. Methods An integrated analysis was performed using patient data from four trials meeting prespecified eligibility criteria: two using VRD (PETHEMA GEM2012 and IFM 2009) and two using VTD (PETHEMA GEM2005 and IFM 2013-04). Results The primary endpoint was met, with VRD demonstrating a noninferior rate of at least very good partial response (≥ VGPR) after induction vs VTD. GEM comparison demonstrated improvement in the ≥ VGPR rate after induction for VRD vs VTD (66.3% vs 51.2%; P = .00281) that increased after transplant (74.4% vs 53.5%). Undetectable minimal residual disease rates post induction (46.7% vs 34.9%) and post transplant (62.4% vs 47.3%) support the benefit of VRD vs VTD. Treatment-emergent adverse events leading to study and/or treatment discontinuation were less frequent with VRD (3%, GEM2012; 6%, IFM 2009) vs VTD (11%, IFM 2013-04). Conclusion These results supported the benefit of VRD over VTD for induction in transplant-eligible patients with NDMM. The trials included are registered with ClinicalTrials.gov (NCT01916252, NCT01191060, NCT00461747, and NCT01971658).


Introduction
Multiple myeloma remains an incurable disease with relatively high mortality rates despite availability of multiple treatment options (1)(2)(3). Several treatment regimens are recommended for induction therapy in patients with transplant-eligible (TE) newly diagnosed multiple myeloma (NDMM) (4)(5)(6). Selection of the optimal frontline therapy is important, as 60% to 70% of patients receive fewer than three lines of therapy (7)(8)(9)(10). Therefore, providing the most efficacious frontline therapy is critical to minimizing disease burden and optimizing survival outcomes (7)(8)(9)(11). Studies show that achieving at least very good partial response (≥ VGPR) during induction is associated with improved long-term outcomes (12-15). Other goals of induction therapy include achievement of rapid disease control and undetectable minimal residual disease (MRD) without negatively impacting stem cell harvest. Furthermore, low rates of toxicity would enable patients to complete induction, which helps optimize treatment responses.
National and international TE NDMM treatment guidelines, including European Society for Medical Oncology (ESMO), European Myeloma Network (EMN), and American Society of Clinical Oncology/Cancer Care Ontario (ASCO/CCO) guidelines, recommend triplet regimens such as bortezomib + lenalidomide + dexamethasone (VRD) and bortezomib + thalidomide + dexamethasone (VTD) (4)(5)(6). In addition, VRD and VTD are both currently being used as backbone therapy in modern quadruplet induction regimens with anti-CD38 monoclonal antibodies. However, in contrast to lenalidomide, the use of thalidomide is limited by peripheral neuropathy (16,17). Bortezomib has also been associated with peripheral neuropathy, and the combination with thalidomide in VTD led to higher rates of this adverse event vs thalidomide + dexamethasone in phase 3 studies (18)(19)(20). The tolerability of bortezomib has improved with subcutaneous administration, which demonstrated noninferior efficacy and a lower incidence of peripheral neuropathy vs intravenous administration (21)(22)(23). Thus, VRD with subcutaneous bortezomib offers a treatment option with reduced rates of peripheral neuropathy. In phase 3 studies, VRD provided deep and durable responses during frontline therapy without limiting a patient's ability to receive further therapy (24)(25)(26)(27)(28). Given these results, VRD has been integrated into clinical practice and is a preferred regimen for primary therapy for transplant candidates (4)(5)(6).
While both VRD and VTD are included as options for frontline therapy in international guidelines, no direct comparison of the safety and efficacy of VRD vs VTD has been done to date. In the absence of a randomized controlled trial (RCT), an integrated analysis can be performed using propensity score (PS)-based statistical methods (29,30). This strategy minimizes the effects of observed baseline factors that could confound analysis to improve the comparison between different treatment cohorts. This method was previously used in a cross-trial comparison and regulatory submission to evaluate bortezomib ± dexamethasone in relapsed MM (31).
The goal of this integrated analysis was to compare VRD and VTD induction therapy in patients with TE NDMM. A literature search for phase 3 RCTs that met prespecified eligibility criteria identified two trials using VRD (PETHEMA GEM2012 and IFM 2009) and two trials using VTD (PETHEMA GEM2005 and IFM 2013-04). These four RCTs were included in the integrated analysis.

Study identification
A comprehensive review of published literature and ongoing clinical studies was performed to identify studies that met the following eligibility criteria: (1) study was a phase 3 RCT evaluating a full-dose VRD or VTD induction regimen (every 3 or 4 weeks) in patients with TE NDMM before autologous stem cell transplant (ASCT), (2) study had reached the primary endpoint for the purpose of the integrated analysis before data transfer, and (3) an agreement was in place for access to patient-level data adequate to conduct an integrated analysis by 31 December 2016. Search details are provided in the Supplement.

Endpoints
The primary endpoint of the integrated analysis was noninferiority of the post induction ≥ VGPR rate based on International Myeloma Working Group criteria. Secondary endpoints were safety, post ASCT ≥ VGPR rate, and ≥ VGPR rate over time during induction. Exploratory endpoints were progression-free survival (PFS), overall survival (OS), and undetectable MRD.

Statistical analysis
Statistical methods were based on the PS, which is the conditional probability of being treated given observed covariates and can be used to balance the covariates in two treatment cohorts and reduce bias (30). A logistic regression model in which treatment group was regressed on 11 baseline variables (age, sex, height, weight, performance status, International Staging System disease stage, hemoglobin level, creatinine clearance, albumin level, β2-microglobulin level, and lactate dehydrogenase level) was used to estimate PS (32). Patients with missing baseline values for any of these variables were excluded. Patients were stratified into five equal-sized groups using the quintiles of the estimated PS. Additional details on the PS model and noninferiority hypothesis are provided in the Supplement.
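The PS workflow described above (logistic regression of treatment on baseline covariates, then stratification by PS quintile) can be illustrated with a minimal sketch. This is not the authors' model: the two standardized covariates, the synthetic cohort, the gradient-descent fit, and the function names `fit_logistic`, `propensity_scores`, and `quintile_strata` are all assumptions made for illustration, whereas the actual analysis used 11 baseline variables and is specified in the Supplement.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent.
    Returns weights as [intercept, coef_1, ..., coef_p]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def propensity_scores(X, w):
    """Estimated probability of receiving treatment given covariates."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def quintile_strata(scores):
    """Assign each patient to one of five equal-sized strata (0..4) by PS rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * 5 // len(scores)
    return strata

# Synthetic cohort (hypothetical): two standardized covariates per patient,
# with treatment assignment depending on both.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(100)]
treated = [1 if random.random() < sigmoid(0.8 * x[0] - 0.5 * x[1]) else 0 for x in X]

w = fit_logistic(X, treated)
ps = propensity_scores(X, w)
strata = quintile_strata(ps)
```

Within each stratum, patients from the two cohorts have similar estimated PS, so outcome comparisons made stratum by stratum are less affected by the observed baseline covariates.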
To demonstrate noninferiority of VRD vs VTD, a margin of noninferiority was prespecified using a two-sided 95% confidence interval. For the primary endpoint, post-initial treatment response rate of ≥ VGPR, the noninferiority margin (11.3%) was selected using historical data; a margin of 10% did not represent a substantial difference in treatment effect and was within normal variance between two treatment regimens in similar patient populations.
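One common way to operationalize such a noninferiority test is to compute a stratified difference in response rates and check whether the lower bound of its two-sided 95% confidence interval lies above the negative margin. The sketch below assumes Cochran-Mantel-Haenszel-type stratum weights and a normal approximation; the exact method used in the analysis is described in the Supplement and may differ. The per-stratum counts are hypothetical, and the function names are illustrative.

```python
import math

def stratified_risk_difference(strata, z=1.959964):
    """strata: list of (responders_a, n_a, responders_b, n_b), one tuple per PS stratum.
    Returns (difference, lower, upper) for the weighted rate difference a - b,
    using weights n_a * n_b / (n_a + n_b) and a normal-approximation CI."""
    num = var_num = wsum = 0.0
    for ra, na, rb, nb in strata:
        pa, pb = ra / na, rb / nb
        w = na * nb / (na + nb)
        num += w * (pa - pb)
        var_num += w * w * (pa * (1 - pa) / na + pb * (1 - pb) / nb)
        wsum += w
    diff = num / wsum
    se = math.sqrt(var_num) / wsum
    return diff, diff - z * se, diff + z * se

def is_noninferior(lower_bound, margin):
    """Noninferiority holds if the CI lower bound excludes a deficit as large as the margin."""
    return lower_bound > -margin

# Hypothetical counts for five PS quintiles (not study data).
example_strata = [
    (30, 50, 25, 50),
    (33, 50, 27, 50),
    (35, 50, 26, 50),
    (31, 50, 24, 50),
    (36, 50, 28, 50),
]
diff, lower, upper = stratified_risk_difference(example_strata)
noninferior = is_noninferior(lower, margin=0.113)  # 11.3% margin from the text
```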
For PFS and OS, Kaplan-Meier methodology was used for the descriptive statistics. The stratified log-rank test, with the stratum based on the quintiles of the PS, was used to assess statistical significance of treatment effects. The stratified Cox proportional hazards model was used to estimate the hazard ratio and 95% confidence interval (CI) for VRD vs VTD.
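For readers less familiar with the Kaplan-Meier estimator, the following minimal sketch shows how an event-free rate at a fixed landmark (for example, 24 months) is obtained from follow-up times and event indicators. The data are hypothetical, the function names are illustrative, and the stratified log-rank test and Cox model are not implemented here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.
    times: follow-up times; events: 1 if progression/death observed, 0 if censored.
    Returns a list of (time, survival_probability) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        at_t = [e for tt, e in data if tt == t]  # everyone leaving the risk set at t
        d = sum(at_t)                            # observed events at t
        if d:
            surv *= 1 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= len(at_t)
        i += len(at_t)
    return curve

def survival_at(curve, t):
    """Estimated event-free probability at time t (step function, right-continuous)."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s

# Hypothetical follow-up in months (not study data).
months = [6, 12, 18, 24, 30, 36]
event = [1, 0, 1, 0, 1, 0]
curve = kaplan_meier(months, event)
two_year_rate = survival_at(curve, 24)
```

Censored patients leave the risk set without lowering the curve, which is why a landmark rate can remain interpretable when median follow-up differs between studies.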
In GEM2005, MRD was assessed using four-color multiparameter flow cytometry with undetectable MRD defined as < 20 clonal plasma cells after measuring ≥ 200,000 nucleated cells at a sensitivity level of 10⁻⁴. GEM2012 assessed MRD at sensitivity levels of 10⁻⁴ and 10⁻⁶ using next-generation flow following EuroFlow protocols per International Myeloma Working Group, using an optimized eight-color, two-tube antibody panel (33).
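The relationship between the GEM2005 cell-count definition and the stated sensitivity level is simple arithmetic: a cutoff of 20 clonal plasma cells among 200,000 nucleated cells corresponds to 1 in 10,000. The short sketch below makes this explicit; the function names are illustrative and not taken from the study protocols.

```python
def mrd_sensitivity(cutoff_cells, cells_measured):
    """Fraction of clonal cells corresponding to the detection cutoff."""
    return cutoff_cells / cells_measured

def is_mrd_undetectable(clonal_cells, cutoff_cells=20):
    """GEM2005-style definition: fewer clonal plasma cells than the cutoff."""
    return clonal_cells < cutoff_cells

# 20 clonal cells among 200,000 nucleated cells, as described for GEM2005.
limit = mrd_sensitivity(20, 200_000)
```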

Safety analysis
The safety analysis was primarily based on the induction phase of treatment, and the safety population included all randomized patients who received at least one dose of study drug. Adverse events were coded using the Medical Dictionary for Regulatory Activities version 15.1 and summarized by system organ class and preferred terms. Adverse events were graded using the National Cancer Institute Common Terminology Criteria for Adverse Events versions 4.03 (PETHEMA GEM2012) and 4.0 (IFM studies). Due to limited access to some data from the PETHEMA GEM2005 clinical trial database, safety data from that study are not included here. Treatment-emergent peripheral neuropathy was summarized using the grouped term (see Supplement).

Study identification
Sixteen prospective phase 3 RCTs evaluating VRD or VTD induction (every 3 or 4 weeks) in patients with TE NDMM before ASCT were identified. A brief overview of the study details and the criteria not met for inclusion for 12 trials are provided in Table S1. Four studies met the eligibility criteria and were included: PETHEMA GEM2012 and IFM 2009 for VRD and PETHEMA GEM2005 and IFM 2013-04 for VTD (Figure 1A) (16,18,24,25).
Similarities and differences in the induction phase (length, number, dose, schedule, and route of bortezomib administration) are summarized for the included studies in Figure 1B. For example, GEM studies used 28-day cycles, whereas IFM studies used 21-day cycles. Additionally, although the bortezomib dose was the same for all four studies, bortezomib was administered subcutaneously in GEM2012 and IFM 2013-04 vs intravenously in GEM2005 and IFM 2009. Lenalidomide was given on days 1 to 21 of the 28-day cycle in GEM2012 vs days 1 to 14 of the 21-day cycle in IFM 2009. In GEM2005, the thalidomide dose was escalated from 50 mg/day to 100 mg/day in cycle 1 and 200 mg/day thereafter, whereas in IFM 2013-04, it was 100 mg/day throughout induction. Furthermore, differences between the induction regimens affected how data were included in the integrated analysis. For example, GEM2012 (VRD) and GEM2005 (VTD) were considered the main studies for the efficacy analysis due to their symmetrical induction regimens (six 4-week cycles of induction followed by ASCT). IFM 2009 (VRD) and IFM 2013-04 (VTD) were considered supportive studies due to differences in design, meaning that these studies were included in the efficacy analysis to support the main efficacy comparisons made for the GEM studies. While both used 3-week cycles, IFM 2009 arm A used eight cycles, arm B used three cycles followed by ASCT and two cycles of consolidation, and IFM 2013-04 used four cycles followed by ASCT. Based on these differences in the IFM studies, only IFM 2009 arm A was used to compare with IFM 2013-04. Only data through four cycles of induction were included. This comparison of studies was possible because of the research agreement granting patient-level data access for each study.
Eligibility criteria were similar between the included studies. The four studies were all conducted in patients < 65 years of age with newly diagnosed, untreated MM with measurable M-protein concentrations (16,18,24,25). Patients in these studies all had an Eastern Cooperative Oncology Group (ECOG) performance status of ≤ 3. GEM2012, GEM2005, and IFM 2009 studies included patients with platelet counts of ≥ 50 to 100 × 10⁹/L and neutrophil counts of ≥ 1 × 10⁹/L. Additionally, GEM2012, IFM 2009, and IFM 2013-04 all excluded patients with grade ≥ 2 peripheral neuropathy.

PS-stratified population and baseline characteristics
In the GEM studies, the intent-to-treat populations included 458 patients who received VRD (GEM2012) and 130 patients who received VTD (GEM2005). Due to missing data for at least one baseline variable, 51 and 1 patients were excluded from the respective PS-stratified cohorts, leaving 407 (GEM2012) and 129 (GEM2005) patients in the integrated analysis. Similarly, intent-to-treat populations of the IFM studies had 19 and 15 patients, respectively, excluded from the respective PS-stratified cohorts due to missing data for at least one baseline variable. Thus, as 350  S2). Baseline characteristics were similar between the VRD and VTD PS-stratified cohorts (Table 1) and the respective overall intent-to-treat populations (Table S3).
In the IFM studies, the ≥ VGPR rate by four cycles (12 weeks) was noninferior with VRD (57.1%) vs VTD (56.5%) in the PS-stratified population, with similar results in patient subgroups (Figure S2).
In the GEM studies, the ≥ VGPR rate increased over time with both regimens during induction. Among patients in the PS-stratified cohort who initiated cycle six, the ≥ VGPR rate improved from 54.5% at cycle three to 70.1% at cycle six with VRD (n = 378) and from 35.1% to 55.9% with VTD (n = 111) (Figure 3).
In the overall PS-stratified cohort in the GEM studies, the improvement in ≥ VGPR rate seen for VRD vs VTD post induction was maintained after ASCT (74.4% vs 53.5%). Of note, the ≥ VGPR rate improved more for VRD than for VTD from post induction to post ASCT (8% vs 2%). The undetectable MRD rates (10⁻⁴ threshold) post induction (46.7% vs 34.9%) and post ASCT (62.4% vs 47.3%) supported the benefit with VRD vs VTD. Data with a threshold of 10⁻⁶ were available for GEM2012, showing an undetectable MRD rate of 28.5% post induction and 41.8% post ASCT. Similar comparisons could not be performed with the IFM studies, as response over time and MRD were not assessed in IFM 2013-04.
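The reported 8% vs 2% gains from post induction to post ASCT follow directly from the ≥ VGPR rates given for the GEM studies (66.3% and 51.2% post induction; 74.4% and 53.5% post ASCT). A minimal arithmetic check, with variable names chosen only for illustration:

```python
# ≥ VGPR rates (%) in the GEM PS-stratified cohorts, as reported in the text.
post_induction = {"VRD": 66.3, "VTD": 51.2}
post_asct = {"VRD": 74.4, "VTD": 53.5}

# Absolute gain in percentage points from post induction to post ASCT.
gain = {arm: round(post_asct[arm] - post_induction[arm], 1) for arm in post_induction}
```

The gains are absolute differences in percentage points (8.1 and 2.3), which round to the 8% and 2% quoted above.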

Survival
Due to differences in median follow-up times and numbers of patients who experienced progression or died between studies, the 2-year event-free rate is a better comparison for the cohorts than median PFS or OS. In the GEM studies, the 2-year PFS rates (ie, those patients who had not experienced progression or died) were 82% and 69%, and the 2-year OS rates (ie, patients who had not died) were 90% and 87% with VRD and VTD, respectively.
In the IFM studies, the 2-year PFS rates were 67% and 71%, and the 2-year OS rates were 93% and 93% with VRD and VTD, respectively. However, differences in the number of cycles of induction received, inclusion of ASCT following induction, and treatment received post ASCT may limit interpretation of these data.

Safety
A summary of treatment-emergent adverse events (TEAEs) is provided in Table 3. The most common grade 3/4 TEAE in the GEM2012 VRD study was neutropenia (13%) and, in the IFM studies, lymphopenia (49% with VRD and 22% with VTD; Table 4). In GEM2012, the rate of peripheral neuropathy was 21% for grade ≥ 2 events and 5% for grade 3/4 events. In the IFM studies, in which intravenous bortezomib was used in VRD and subcutaneous bortezomib was used in VTD, rates of grade ≥ 2 peripheral neuropathy events (30% vs 27% with VRD vs VTD, respectively) and grade 3/4 events (6% vs 8%, respectively) were similar.

Dose reductions and discontinuations
In GEM2012, 22% of patients had at least one TEAE leading to dose reduction, most commonly peripheral neuropathy (17%). At least one TEAE leading to study or treatment discontinuation occurred in 3% of patients with VRD; the most frequent were infection (1%), septic shock (< 1%), and disease progression (< 1%). In the safety population of the GEM studies, a higher percentage of patients receiving VTD than VRD received fewer cycles of therapy. For example, 4.4% vs 6.2% of patients received three or fewer cycles of VRD vs VTD, respectively. This trend continued, with 6.1% vs 10.8% of patients receiving four or fewer cycles, and 7.0% vs 13.8% of patients receiving five or fewer cycles of VRD vs VTD, respectively. Thus, more patients receiving VRD vs VTD initiated the protocol-defined sixth cycle of induction.
In the IFM studies, 33% and 18% of patients had at least one TEAE leading to dose reduction with VRD and VTD, respectively, most commonly peripheral neuropathy (26% vs 14%, respectively). At least one TEAE leading to treatment discontinuation occurred in 6% and 11% of patients treated with VRD and VTD, respectively. The most common TEAEs leading to discontinuation were peripheral neuropathy (3%) with VRD and peripheral neuropathy (7%) and pulmonary embolism (2%) with VTD. The percentage of patients in the safety population of the IFM studies who received three or fewer cycles was 5.3% for both VRD and VTD.
In the absence of a prospective RCT comparing VRD and VTD, the established methodology of an integrated analysis was used to compare these regimens in TE NDMM. Using patient-level data from the GEM and IFM studies, the analysis met its primary endpoint and demonstrated the noninferiority of VRD vs VTD. Furthermore, VRD induction showed superiority by achieving a statistically significant and clinically relevant improvement in ≥ VGPR rate over VTD when six treatment cycles were compared in the GEM studies (66.3% vs 51.2%; P = .00281). Additionally, the improvement in ≥ VGPR rate from post induction to post ASCT was more notable with VRD than with VTD (rising to 74.4% vs 53.5%). Increasing ≥ VGPR rates over the course of treatment, undetectable MRD, and 2-year PFS rates further supported the benefit of VRD over VTD. The difference in ≥ VGPR rates was more notable in the GEM studies, which featured longer cycles than the IFM studies. It is also important to note that the differences in cycle length and overall treatment duration in the GEM studies may have contributed to a greater ≥ VGPR rate than what might be expected in the clinic. These considerations further highlight the importance of cycle length and treatment duration for achieving a clinically meaningful response during induction. While length of induction therapy in GEM2005 and GEM2012 (6 cycles/24 weeks) was longer than that of some other recent phase 3 trials, such as CASSIOPEIA (34), studies incorporating 6 cycles of induction with VTD have produced similar improvements in CR rates from pre- to posttransplant compared with those using 3 cycles of induction with VTD (18,19). Further, differences in the length of induction therapy among these trials may reflect variations in regional practices. The current ESMO guidelines recommend 4 to 6 cycles of induction with or without consolidation (4).
Although safety results from the GEM2005 study were not included in this analysis due to limited access, safety data reported in the GEM2005 primary manuscript and safety data reported here for GEM2012 and the IFM studies showed that TEAEs were generally consistent with the known safety profiles of lenalidomide, bortezomib, thalidomide, and dexamethasone. Differences in the most common TEAEs between the regimens were largely consistent with the toxicities of lenalidomide and thalidomide. Overall, TEAEs with the VRD regimen were manageable, and the tolerability profile compared well with that of VTD, with fewer TEAEs leading to discontinuation. Peripheral neuropathy due to thalidomide often increases in frequency with long-term use (16,17,35,36). In addition to the effects of lenalidomide vs thalidomide in the VRD and VTD regimens (including the dose of thalidomide used), TEAEs should be considered in the context of the different bortezomib routes and frequencies of administration used in these studies. Bortezomib was given subcutaneously in GEM2012 and IFM 2013-04 and intravenously in GEM2005 and IFM 2009. Additionally, bortezomib dose intensity was higher when administered in 3-week cycles in the IFM studies vs 4-week cycles in the GEM studies. The efforts made to improve the tolerability of induction regimens were important, as they may have allowed delivery of the full number of planned cycles. This could increase depth of response and lead to improved survival outcomes. One can hypothesize that a weekly subcutaneous bortezomib schedule could further improve tolerability compared with a twice-weekly schedule.
The survival data (PFS and OS) should be interpreted with caution considering the key study design differences. While GEM VRD and VTD trials had largely parallel designs (with six cycles of induction followed by ASCT), post-ASCT treatment differed between the trials. For example, patients in the VRD trial received VRD consolidation (and could enroll in a maintenance study of   Another point to consider is that different MRD sensitivity thresholds were used across these studies. In GEM2005, 10⁻⁴ was used since the technology for 10⁻⁶ was not available at the time the study was conducted. Compared with MRD 10⁻⁵ and 10⁻⁶, MRD 10⁻⁴ is a threshold at which more MRD-positive disease may go undetected, and MRD positivity has been associated with less-durable responses (37). While current response criteria have established an MRD threshold of 10⁻⁵, evidence has emerged that 10⁻⁶ may be more clinically relevant. Further, achievement of MRD negativity at 10⁻⁶ has been associated with longer PFS compared with 10⁻⁵ (37).
This analysis has several limitations, the most notable of which is the cross-trial comparison. Although we used PS-based statistical methods, it is possible that differences in baseline factors among the trials could have confounded the comparison between treatment cohorts. In addition, the analyzed trials had relatively short follow-up durations and are older, having been published between 2012 and 2019 with a database closure of 2017. Since this time, regimens based on the anti-CD38 monoclonal antibodies daratumumab and isatuximab have emerged. However, both VRD and VTD are being used in current practice as backbone therapy in modern quadruplet induction regimens incorporating daratumumab or isatuximab. Further, the comparison of VRD and VTD remains relevant as both of these triplet regimens are recommended in current TE NDMM treatment guidelines, including ESMO, EMN, and ASCO/CCO guidelines. VRD is also designated as a category 1 preferred regimen for primary therapy for TE patients by the National Comprehensive Cancer Network Guidelines ® , while VTD has a category 1 designation for use in certain circumstances (38).
As with any systematic analysis, the applicability of these results to the real-world setting is of interest, and different practice patterns in Europe and the US should be acknowledged. In Europe, VTD is a standard of care for patients with TE NDMM (34), whereas in the US, VRD is considered the optimal induction regimen for these patients (39). Therefore, the results of this analysis will have different implications for each respective setting. The inclusion of younger, healthier patients (age < 65 years with ECOG performance status of ≤ 3) may not be generally reflective of a real-world patient population. Lastly, the percentage of patients with undetermined cytogenetic risk included in each trial was unbalanced, ranging from 8% to 36%; therefore, an analysis of real-world patients would likely have a more equitable distribution of risk groups.
Although some studies did not meet the criteria for inclusion in the integrated analysis, their results further support the use of VRD. For example, the DSMM XIV, GMMG-HD6, and SWOG S0777 studies showed VRD to be active and well tolerated (26,27,40), with the latter supporting the recent European approval of VRD for transplant-ineligible patients. In Myeloma XI, responses to induction were deeper with cyclophosphamide + lenalidomide + dexamethasone vs cyclophosphamide + thalidomide + dexamethasone (≥ VGPR rates, 60.4% vs 52.9%, respectively) (41). Responses also deepened with additional cycles, highlighting the importance of a tolerable regimen to maximize the number of cycles that can be given (42). Of note, although induction was stopped due to toxicity at a similar rate for cyclophosphamide + lenalidomide + dexamethasone vs cyclophosphamide + thalidomide + dexamethasone (5.0% vs 6.7%, respectively), dose modifications of lenalidomide were less frequent than of thalidomide (38.3% vs 73.6%, respectively) (41). Together, these results supported the advantage of lenalidomide- vs thalidomide-containing regimens. Moreover, results of several other studies suggest that there is a role for VRD as consolidation therapy in NDMM (25,43). For example, in the EMN02/HOVON95 trial, consolidation therapy with VRD followed by maintenance with lenalidomide improved PFS in patients with NDMM vs maintenance alone (43). However, the benefit observed in this trial is possibly due to use of a suboptimal induction regimen; some other trials have not demonstrated the same effect (44).
Given the incurable nature of MM and the fact that patients with MM will ultimately experience relapse, selecting the ideal induction regimen is critical for minimizing disease burden and promoting durable survival outcomes. In lieu of a direct comparison between VRD and VTD, it is our hope that this analysis can be used to help inform these treatment decisions and address clinical questions that are of particular importance to clinicians who treat patients with NDMM. Overall, the results of this integrated analysis provide evidence demonstrating the benefit of VRD over VTD as induction treatment in TE patients with NDMM. These results support the inclusion of VRD as a preferred regimen for primary therapy for transplant candidates (4-6).

FIGURE 3 ≥ VGPR rate throughout induction and post ASCT in the GEM studies. Data are based on the 378 patients taking VRD and 111 patients taking VTD who started cycle 6 in the GEM2012 and GEM2005 studies, respectively. ASCT, autologous stem cell transplant; PS, propensity score; VGPR, very good partial response; VRD, bortezomib, lenalidomide, and dexamethasone; VTD, bortezomib, thalidomide, and dexamethasone.

TABLE 2
Dichotomized response; patients without any MRD assessment data were not included in the detectable category. ASCT, autologous stem cell transplant; MRD, minimal residual disease; NA, not assessed; VGPR, very good partial response; VRD, bortezomib, lenalidomide, and dexamethasone; VTD, bortezomib, thalidomide, and dexamethasone.

TABLE 3
Summary of TEAEs †

TABLE 4
Grade 3/4 TEAEs in ≥ 5% of patients †
† Safety was assessed using NCI CTCAE v4.03 (PETHEMA GEM2012) and v4.0 (IFM 2009 and IFM 2013-04). Peripheral neuropathy of any percentage is included. Due to limited access to some data from the PETHEMA GEM2005 clinical trial database, safety data from that study are not included here.
‡ Grouped term used to capture events related to peripheral neuropathy.
NCI CTCAE, National Cancer Institute Common Terminology Criteria for Adverse Events; TEAE, treatment-emergent adverse event; VRD, bortezomib, lenalidomide, and dexamethasone; VTD, bortezomib, thalidomide, and dexamethasone.