Multi-Institutional Dosimetric Evaluation of Modern Day Stereotactic Radiosurgery (SRS) Treatment Options for Multiple Brain Metastases

Purpose/Objectives: There are several popular treatment options currently available for stereotactic radiosurgery (SRS) of multiple brain metastases: 60Co sources and cone collimators around a spherical geometry (GammaKnife), multi-aperture dynamic conformal arcs on a linac (BrainLab Elements™ v1.5), and volumetric modulated arc therapy on a linac (VMAT) calculated with either the conventional optimizer or with the Varian HyperArc™ solution. This study aimed to dosimetrically compare and evaluate the differences among these treatment options in terms of dose conformity to the tumor as well as dose sparing of the surrounding normal tissues.
Methods and Materials: Sixteen patients and a total of 112 metastases were analyzed. Five plans were generated per patient: GammaKnife, Elements, HyperArc-VMAT, and two Manual-VMAT plans to evaluate different treatment planning styles. Manual-VMAT plans were generated by different institutions according to their own clinical planning standards. The following dosimetric parameters were extracted: RTOG and Paddick conformity indices, gradient index, total volume of brain receiving 12Gy, 6Gy, and 3Gy, and maximum doses to surrounding organs. The Wilcoxon signed rank test was applied to evaluate statistically significant differences (p < 0.05).
Results: For targets ≤ 1 cm, GammaKnife, HyperArc-VMAT, and both Manual-VMAT plans achieved comparable conformity indices, all superior to Elements. However, GammaKnife resulted in the lowest gradient indices at these target sizes. HyperArc-VMAT performed similarly to GammaKnife for V12Gy parameters. For targets > 1 cm, HyperArc-VMAT and Manual-VMAT plans resulted in superior conformity vs. GammaKnife and Elements. All SRS plans achieved clinically acceptable organs-at-risk dose constraints. Beam-on times were significantly longer for GammaKnife. Manual-VMAT A and Elements resulted in shorter delivery times relative to Manual-VMAT B and HyperArc-VMAT.
Conclusion: The study revealed that Manual-VMAT and HyperArc-VMAT are capable of achieving low dose brain spillage and conformity similar to GammaKnife, while significantly reducing beam-on time. For targets smaller than 1 cm in diameter, GammaKnife still resulted in superior gradient indices. The quality of the two sets of Manual-VMAT plans varied greatly based on planner and optimization constraint settings, whereas HyperArc-VMAT was dosimetrically superior to both Manual-VMAT plans.


INTRODUCTION
Stereotactic radiosurgery (SRS) was first conceptually introduced by the neurosurgeon Lars Leksell in 1951 (1, 2). The evolution of this technology, alongside advances in image guidance, has enabled the Gamma Knife to serve as the leading workhorse for treating cranial malignancies with hypofractionation. Although it was the first of its kind to perform SRS, the Gamma Knife has not been the only player, with other accelerator modalities adapting to offer solutions for patients requiring SRS (3, 4). Advancements in hardware and software design have since propelled linacs to become a popular and more widely available technology for stereotactic treatment. This is particularly pertinent for the treatment of multiple brain metastases, which were traditionally treated with surgery and/or whole brain radiation therapy (WBRT).
With more studies promoting the benefits of SRS for multiple brain metastases, such as improved local control when adding SRS to WBRT (5-8), similar survival (WBRT+SRS vs. SRS only) (8-17), and less cognitive deterioration (SRS only) (18-21), the proportion of patients receiving SRS treatments annually increased 15.8 percentage points from 2004 to 2014, and the number of facilities offering SRS annually increased 19.2 percentage points (22). Supporting evidence for SRS of a large number of brain metastases has further contributed to this effect (14, 20, 23-29). This growing demand for SRS, coupled with the ease of access to conventional linacs, has stimulated the development of a number of new technologies to facilitate the implementation of linac-based SRS for the treatment of multiple metastases. The common goal of all these linac SRS techniques is to use a single isocenter to treat all of the metastases simultaneously, in order to avoid prohibitively long treatments with multiple isocenters and thereby improve patient comfort and throughput. The most current single isocenter linac-based SRS options include multi-aperture dynamic conformal arcs on a linac (30-32) (BrainLab Elements™ v1.5, Munich, Germany), volumetric modulated arc therapy (VMAT) calculated with the conventional optimizer (33-43) (Varian Medical Systems, Palo Alto, CA), or VMAT delivery calculated with the newer Varian HyperArc solution (44-47).
With this large variety of commercially available SRS treatment techniques, it is important to be aware of the strengths and weaknesses of the options available to patients seeking treatment for multiple metastases. As the different technologies have emerged, a number of studies have compared some of the techniques against each other. Thomas et al. (48), Liu et al. (49), and Potrebko et al. (50) each compared VMAT to GammaKnife for 28 patients with 2-9 targets, 6 patients with 3-4 metastases, and 12 patients with at least 7 metastases, respectively. Mori et al. compared Elements to GammaKnife for two patients, each with 9 metastases (32). Ohira et al. (44) compared HyperArc to conventional VMAT for 23 patients with 1-4 metastases, while Slosarek et al. (46) most recently compared CyberKnife, VMAT, and HyperArc for a set of 15 patients with 3-8 metastases each. Overall, these studies have found that VMAT is generally comparable to GammaKnife (with some minor differences, such as improved conformity indices at the cost of potentially increased low dose spread), as is Elements to GammaKnife, and HyperArc to VMAT. However, most of the published studies have only compared two technologies to each other, with the exception of Slosarek et al. (46), which added CyberKnife to the mix. This makes it difficult to assess whether one technique may truly be superior to another for a certain patient scenario, because there is a lack of comparison data on the same subset of patients for the multiple SRS techniques available. It is therefore the aim of this work to provide a more rigorous and comprehensive evaluation of the dosimetric differences between the following state-of-the-art SRS modalities: GammaKnife, Elements, Manual-VMAT, and HyperArc-VMAT.

METHODS AND MATERIALS
Sixteen patients with a range of 4-10 metastases each, for a total of 112 metastases, were included in this study. Patient ages ranged from 36 to 81 years, and the primary cancers were: renal cell carcinoma, esophageal, oropharyngeal, melanoma, breast, colon, and non-small cell lung carcinoma (adenocarcinoma and large cell). Five of the 16 patients had received prior radiation treatment: SRS alone, WBRT alone, or both SRS and WBRT. The target volumes and prescribed doses (Gy) are detailed in Table 1 for each of the 16 patients.
Details on each of the SRS modalities utilized in this comparison study are described as follows. The most up-to-date commercially available GammaKnife product is the Leksell GammaKnife Icon (Elekta, Stockholm, Sweden), containing 192 60Co sources and 4, 8, and 16 mm cone collimator options. It is an upgrade of the Perfexion unit in that it allows frameless treatments with the addition of on-board cone-beam computed tomography (CBCT) imaging and a real-time motion tracking device. BrainLab Elements™ v1.5 is a commercial treatment planning system that automatically optimizes a dedicated group of dynamic conformal arcs to treat each of the lesions within the brain (via a single arc or a composition of multiple arcs) with a single common isocenter. Volumetric modulated arc therapy enables intensity-modulated dose delivery by varying MLC positions and dose rate simultaneously with gantry rotation speed, thus significantly increasing the degrees of freedom for the optimization algorithm. There is no physical difference in delivery between conventional VMAT and HyperArc. The major difference lies on the planning side for HyperArc, where the software assists the user by automatically selecting an optimal mono-isocenter, collimator angles, and non-coplanar arc setup with the intent of delivering the most conformal plan while minimizing low dose spillage into the surrounding normal brain structures. With conventional VMAT optimization, the planner is responsible for selecting and manipulating all of these variables. For every patient, a treatment plan was generated according to each of the four SRS techniques: GammaKnife, Elements, Manual-VMAT, and HyperArc-VMAT. Note that all patients were treated clinically with Elements, and all other modalities were retrospectively planned for comparison in this study. A total of three different planners were included in this study.
A single planner created all of the treatment plans across all patients per specified SRS modality to remove planner variability within each SRS modality. A single SRS planner with 8-10 years of experience generated all of the Elements plans used to treat the 16 patients in this study. A second SRS planner with 1-3 years of experience generated all GammaKnife plans and one set of Manual-VMAT plans (Manual-VMAT A) across all patients. Finally, a third SRS planner with 3-5 years of experience generated all Manual-VMAT B plans for the 16 patients. All HyperArc-VMAT plans were generated by the same planner as the Manual-VMAT B plans, but only after all manual plans were completed, i.e., Manual-VMAT B plans were not influenced by HyperArc plans. The second Manual-VMAT plan was created for every patient following another institution's planning standard, in order to also evaluate the potential differences that may arise between two different treatment planners' styles. The differences in planning technique between the two VMAT plans are summarized as follows: for VMAT A, an upper and a lower constraint were used for all targets and no aggressive objective on low dose spread was applied, whereas VMAT B applied only lower constraints to target volumes but added objectives to control low dose spread.
Beam arrangements for the Elements plans were selected from a set of six predefined templates with a range of 5-6 couch angles with 28, 32, 35,  All linac plans were normalized such that the 100% isodose line covered 99% of the target volume. The GammaKnife plans were normalized with the same goal of covering 99% of the target volume with the prescription dose. This resulted in a range of 49-73% prescribed isodose lines with a median of 54%. All of the plan doses were imported into the same treatment planning system platform and version of Varian Eclipse (Varian Medical Systems, Palo Alto, CA) for dosimetric evaluation at a calculation grid size of 1 mm. Note, target normalization was entirely performed in each plan's native treatment planning system and no differences in target coverage were discovered after importing into Eclipse during dosimetric evaluation. The target volume metastasis for all patients in this study was defined as the planning target volume (PTV), already incorporating setup margins. All of the extracted and calculated dosimetric parameters described below are compared equivalently across all SRS techniques in terms of PTV. Thus, there are no inherent biases in comparing conformity indices for GTV vs. PTV when comparing GammaKnife vs. linac-based SRS.
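The normalization rule described above (the prescription dose covering 99% of the target volume) and the resulting prescription isodose level can be illustrated with a minimal Python sketch. It assumes the target is represented by a list of per-voxel doses on a uniform grid, so that each voxel represents the same volume; the function names and example values are hypothetical, and this is not the algorithm used by any of the treatment planning systems in this study.

```python
import math


def coverage_dose(target_voxel_doses, coverage=0.99):
    """Dose (Gy) received by at least `coverage` of the target voxels.

    Assumes equal-volume voxels (uniform dose grid); hypothetical helper.
    """
    if not target_voxel_doses:
        raise ValueError("target has no voxels")
    doses = sorted(target_voxel_doses, reverse=True)
    # Number of hottest voxels needed to reach the coverage fraction.
    # The small tolerance guards against floating-point round-up.
    n_cover = max(1, math.ceil(coverage * len(doses) - 1e-9))
    return doses[n_cover - 1]


def normalization_factor(target_voxel_doses, prescription_gy, coverage=0.99):
    """Scale factor that makes the prescription dose cover `coverage` of the target."""
    return prescription_gy / coverage_dose(target_voxel_doses, coverage)


def prescription_isodose_percent(prescription_gy, max_dose_gy):
    """Prescription isodose line as a percentage of the plan maximum dose."""
    return 100.0 * prescription_gy / max_dose_gy


# Illustrative values only (not study data): 100 target voxels, 20 Gy prescription.
voxel_doses = [25.0] * 99 + [5.0]
factor = normalization_factor(voxel_doses, prescription_gy=20.0)
scaled_max = max(voxel_doses) * factor
print(factor, prescription_isodose_percent(20.0, scaled_max))
```

In this picture, a GammaKnife prescription isodose of 54% simply means that the prescription dose is 54% of the maximum dose in the plan.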
The following dosimetric parameters were extracted per patient target volume across all SRS treatment plans: RTOG conformity index (CI-RTOG), defined as the ratio of the 100% isodose volume to the target volume; Paddick conformity index (CI-Paddick), defined as the ratio of the square of the volume of the target enclosed by the 100% isodose volume to the product of the target volume and the 100% isodose volume; Gradient Index (GI), defined as the ratio of the 50% isodose volume to the 100% isodose volume; the volume of 12Gy delivered to the surrounding brain tissue contributed only from that individual target (V12Gy); and the volume of 12Gy delivered to the surrounding brain tissue per individual target after subtracting that individual target volume (V12Gy-TV). Additionally, the following dosimetric parameters were extracted per patient across pertinent organs-at-risk (OARs): the total volume of brain receiving 12Gy, 6Gy, and 3Gy (V12Gy, V6Gy, V3Gy), the mean dose to the brain excluding the targets (Brain mean dose), the maximum dose to the brainstem (Dmax Brainstem), the maximum dose to the left eye and optic nerve (Dmax L Eye and Dmax L ON), the maximum dose to the right eye and optic nerve (Dmax R Eye and Dmax R ON), and the maximum dose to the optic chiasm (Dmax OC). Lastly, the total treatment time for each plan was also extracted for comparison (linac plan times were calculated assuming a dose rate of 1,400 MU/min).
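As an illustration of the index definitions above, the following minimal Python sketch computes CI-RTOG, CI-Paddick, GI, and V12Gy-TV from a handful of volumes. The function names, argument names, and example volumes are hypothetical and are not taken from the study data; in practice the volumes would be exported from the treatment planning system.

```python
def conformity_metrics(tv, piv, tv_piv, piv_half):
    """Return CI-RTOG, CI-Paddick, and GI for one target (volumes in cc).

    tv       : target volume
    piv      : volume enclosed by the 100% (prescription) isodose
    tv_piv   : volume of the target enclosed by the 100% isodose
    piv_half : volume enclosed by the 50% isodose
    """
    if tv <= 0 or piv <= 0:
        raise ValueError("target and isodose volumes must be positive")
    if tv_piv > min(tv, piv):
        raise ValueError("covered target volume cannot exceed tv or piv")
    return {
        # 100% isodose volume divided by target volume.
        "CI-RTOG": piv / tv,
        # Square of the covered target volume over (target volume x 100% isodose volume).
        "CI-Paddick": tv_piv ** 2 / (tv * piv),
        # 50% isodose volume divided by 100% isodose volume.
        "GI": piv_half / piv,
    }


def v12_minus_target(v12_per_target, tv):
    """V12Gy-TV: per-target 12Gy volume after subtracting the target volume."""
    return v12_per_target - tv


# Illustrative numbers only (not study data).
metrics = conformity_metrics(tv=1.00, piv=1.20, tv_piv=0.99, piv_half=3.60)
print(metrics)
print(v12_minus_target(v12_per_target=2.5, tv=1.0))
```

A perfectly conformal plan would give CI-RTOG = CI-Paddick = 1. CI-RTOG can equal 1 even when the isodose volume misses the target, which is why the Paddick index, which accounts for the overlap, is reported alongside it.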
Statistical evaluation of the extracted parameters was performed with JMP Pro v14 (SAS, Cary, NC). The Wilcoxon signed rank test was applied in the format of matched pairs to compare each of the plans against each other per extracted dosimetric parameter. Differences were considered statistically significant at p < 0.05.

RESULTS
Figure 1 graphically compares both types of conformity indices across all five SRS plans, grouped according to target size. It is evident that for very small target sizes (<1 cm), GammaKnife, HyperArc-VMAT, and both Manual-VMAT plans perform similarly well across both conformity indices. All are superior to the Elements conformity results. However, for target diameters above 1 cm, HyperArc-VMAT and both Manual-VMAT plans result in superior conformity as compared to GammaKnife and Elements. Figure 2 also graphically divides the results per target size bin for GI and both V12Gy dose metrics. The GI results show that GammaKnife is superior for small target diameters (<1 cm), but above that GI is similar amongst all techniques, with the exception of VMAT A, which showed the largest range. For the two V12Gy parameters, it is apparent that HyperArc-VMAT is only slightly inferior to GammaKnife for the small targets (<1 cm) and even outperforms GammaKnife for large targets above 1 cm in diameter. When comparing total V12Gy per patient, i.e., combining all per-target V12Gy values, HyperArc-VMAT is slightly lower than GammaKnife by a median difference of 1.3 cc, which is statistically significant but clinically equivalent. Not surprisingly, the data in Figure 2 demonstrate an increase in both V12Gy metrics as the target size increases. Also noteworthy are the widely variable results between the two Manual-VMAT planning techniques, where VMAT B consistently provides lower V12Gy and V12Gy-TV volumes of the brain than VMAT A. Yet, neither Manual-VMAT plan performed as well as HyperArc-VMAT for these parameters.
Combining all of the data, rather than dividing by target size bin, Figure 3 displays the trends observed amongst the remaining extracted parameters representative of low dose spread: brain mean dose, V12Gy, V6Gy, and V3Gy. Here it is again evident that HyperArc-VMAT is comparable with GammaKnife in terms of low dose spillage into the brain when looking at the entire dataset of target sizes. Elements performs similarly to the Manual-VMAT plans, but is inferior to GammaKnife and HyperArc-VMAT in this respect. The visually evident differences amongst the plans in Figures 1-3 are further detailed in Table 2, which lists the median difference as well as the Wilcoxon signed rank results per extracted parameter for every potential matched pair of plan comparisons amongst the five options. The median differences are displayed as the row plan subtracted from the column-listed plan. Because a majority of the table displays statistically significant differences with p < 0.05, only the 6 (of a total of 70) non-significant p-values are bolded and underlined in the table to stand out. The purpose of this table is to serve as a more detailed reference for the magnitude of the differences between two specific SRS plans per extracted dosimetric parameter.
FIGURE 2 | Gradient Index (GI), V12Gy per target (defined as the volume of 12Gy delivered to the surrounding brain tissue contributed only from that individual target), and V12Gy-TV (defined as the total volume of brain receiving 12Gy per target excluding the target volume) results displayed as box plots per SRS plan type, divided into five separate target size diameter bins.
FIGURE 3 | Box plot results per SRS plan type for the following dosimetric parameters across all patients: the total volume of brain receiving 12Gy, 6Gy, and 3Gy (V12Gy, V6Gy, V3Gy) and the mean dose to the brain excluding the targets (Brain mean dose).
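The kind of matched-pair comparison summarized in Table 2 (a median difference plus a Wilcoxon signed rank p-value for each pair of plans) can be sketched in plain Python as follows. This is a minimal illustration using the normal approximation with tie correction, which is an assumption on our part; JMP Pro may use a different (e.g., exact) procedure, so p-values can differ slightly. The function name and the example values are hypothetical and are not study data.

```python
import math
from statistics import median


def wilcoxon_signed_rank(column_plan, row_plan):
    """Two-sided Wilcoxon signed rank test on matched pairs.

    Differences are taken as column plan minus row plan.
    Returns (median_difference, w_plus, p_value); the p-value uses the
    normal approximation with tie correction (an assumption of this sketch).
    """
    diffs = [c - r for c, r in zip(column_plan, row_plan)]
    nonzero = [d for d in diffs if d != 0]  # zero differences are discarded
    n = len(nonzero)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, assigning average ranks to ties.
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    # Test statistic: sum of ranks of the positive differences.
    w_plus = sum(rank for rank, d in zip(ranks, nonzero) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return median(diffs), w_plus, p_value


# Illustrative per-patient V12Gy values in cc (hypothetical, not study data).
plan_x = [12, 15, 9, 20, 14, 11, 18, 16]
plan_y = [10, 11, 8, 17, 9, 5, 11, 8]
med, w_plus, p = wilcoxon_signed_rank(plan_x, plan_y)
print(med, w_plus, p, p < 0.05)
```

With 16 patients per comparison, an exact test is also feasible; the approximation is used here only to keep the sketch short and free of external dependencies.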
Figure 4 compares the plan results for all of the studied OARs: maximum dose to the brainstem, optic chiasm, left and right eyes, and left and right optic nerves. It is important to note that each of the plans satisfied normal tissue constraints for all of the patients. Overall, few patterns or striking differences between the SRS techniques were observed when it came to sparing OARs, and in general they all performed similarly well. The large range observed in Dmax Brainstem for GammaKnife planning is a result of target location coupled with source geometry and an inability to optimize the beam's trajectory, as is possible with Elements and VMAT treatment planning software.
As a visual comparison of the dosimetric results, Figure 5 displays axial, coronal, and sagittal views of the five different SRS plans for patient case #15, with a total of 10 metastases. This patient was selected due to the presence of multiple small metastases as well as a larger, more irregularly shaped target volume, all treated within the same plan. The slice locations were selected so as to showcase as many of the treated metastases as possible.
Lastly, the treatment delivery times listed in Table 3 were extracted from the GammaKnife treatment plans and approximately calculated for the Elements and Manual-VMAT plans based on the total MUs required (since the dose rate and gantry rotation speed can vary), assuming a dose rate of 1,400 MU/min with 6X flattening-filter-free energy. Unsurprisingly, GammaKnife plans took hours longer to deliver than any linac-based radiosurgery plan. Elements and Manual-VMAT A had similar beam-on times, but HyperArc-VMAT and Manual-VMAT B were longer for almost every single case. The higher MU is a result of increased modulation, which often happens when more stringent constraints are applied during the optimization process. This is consistent with the brain V12Gy results exhibited in Figure 2 and the median differences listed in Table 2, where HyperArc-VMAT and Manual-VMAT B result in the least low dose spillage across all target size groups.
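The approximate beam-on time estimate described above amounts to dividing the total MU by the assumed dose rate. A minimal Python sketch follows; the function name and the example MU values are hypothetical and are not the plan values from Table 3.

```python
def beam_on_minutes(total_mu, dose_rate_mu_per_min=1400.0):
    """Approximate beam-on time in minutes from total monitor units.

    Assumes delivery at a constant dose rate (1,400 MU/min by default).
    Real deliveries may take longer because dose rate and gantry
    rotation speed can vary during an arc.
    """
    if total_mu < 0:
        raise ValueError("total MU cannot be negative")
    if dose_rate_mu_per_min <= 0:
        raise ValueError("dose rate must be positive")
    return total_mu / dose_rate_mu_per_min


# Hypothetical MU totals for two plans of the same patient (illustration only).
for label, mu in [("plan with lower modulation", 7000), ("plan with higher modulation", 14000)]:
    print(f"{label}: {beam_on_minutes(mu):.1f} min")
```

Because the estimate scales linearly with MU, a plan that needs twice the MU for the same prescription is estimated to take twice as long, which is how increased modulation translates into longer beam-on times.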

DISCUSSIONS
The overall findings of this comparison study have demonstrated that, as would be expected, all of the commercially available options for SRS are able to achieve acceptable conformity and OAR dose sparing limits. However, looking more closely at each dosimetric parameter has revealed interesting information. While it was not surprising to find improved conformity for the linac-based SRS techniques over GammaKnife for larger and more irregular volumes (due to the more advanced inverse optimization features as well as the ability of MLC shaping), it was certainly unexpected to see HyperArc-VMAT compete with GammaKnife in terms of V12Gy. Also expected was GammaKnife's superior GI for small targets. However, for the larger target sizes, GammaKnife resulted in similar GIs to HyperArc-VMAT and Manual-VMAT B. This information, coupled with the results from Table 3 of total beam-on times of minutes vs. hours, suggests that linac-VMAT radiosurgery is a valuable contender to GammaKnife for patients seeking treatment of multiple brain metastases, particularly for large and irregularly shaped target volumes.
Another rather interesting finding was the large deviation in the results between the Manual-VMAT A and VMAT B plans, where the optimization objective setting was the main difference between the two techniques, with one having applied upper constraints (VMAT A) and the other avoiding upper constraints entirely (VMAT B) but with a more stringent control on low dose spread. VMAT B outperformed VMAT A across essentially all of the studied parameters (CI, GI, V12Gy, V6Gy, OAR doses, etc.), but at the cost of longer beam-on times. This large variation in plan quality indicates that the quality of care using VMAT for the treatment of multiple brain metastases is largely dependent on planner experience and institutional standards. Thus, in order to improve the standardization of quality of care, planning procedures and optimization objective settings need to be carefully standardized across our community, even at this level of detail. Furthermore, it can be seen from the results that even though Manual-VMAT B had in general the longest beam-on time, i.e., the highest modulation complexity, its plan quality was still mostly inferior to that of HyperArc-VMAT. This indicates that the objective settings used in VMAT B are suboptimal and do not provide as good a balance (relative to HyperArc-VMAT) between modulation complexity and plan quality. To this end, HyperArc-VMAT could help improve both the optimization efficiency and plan quality standardization for SRS treatment of multiple brain metastases using a VMAT delivery technique.
As a quick and straightforward summary of our findings, a spider plot was generated in Figure 6 to serve as a qualitative description of the data. The categories spanned not only dosimetric results, but also efficiency and skill in terms of the staff and time resources required: conformity, low dose fall-off, inter-planner variability and skill, delivery efficiency, and patient-specific QA effort. The SRS techniques (GammaKnife, Elements, HyperArc-VMAT, and Manual-VMAT) were ranked relative to each other in each category. Across the different target size bins, Figure 1 demonstrated that HyperArc-VMAT resulted in comparable or superior CI amongst the SRS techniques, thus earning a ranking of 1. GammaKnife had excellent conformity at the smaller target size bins, but that deteriorated with increasing size (compared to VMAT), thus earning it a ranking of 3, after VMAT with a rank of 2. Elements was consistently inferior to the other SRS modalities in terms of CI and thus was ranked last at 4. Regarding the category of dose fall-off, GammaKnife was consistently superior according to Figures 2, 3, and thus was ranked the highest (1), followed by HyperArc-VMAT (2), Elements (3), and then Manual-VMAT (4), due to its dependence on planning strategy and skill. In terms of required planning skill and inter-planner variability, Elements and HyperArc-VMAT are less dependent on the planner, in that the programmed presets require only minimal planner interaction, thus earning both a ranking of 1. GammaKnife would then rank lower (at 3), given that each target is typically forward-planned by the user. (Note, however, that the forward-planning of multiple metastases in GammaKnife allows the user to fine-tune the coverage of each target, whereas in VMAT planning the software only allows normalization to a single target at the highest dose level when prescribing different doses to different size metastases.) Manual-VMAT ranked the lowest at 4, due to the potential for the greatest variability amongst different planners given the large degree of customizable plan settings (compared to GammaKnife), which can result in varying plan quality as seen in plans A vs. B. The beam-on times in Table 3, and thus the delivery efficiency rankings, are straightforward: Elements had the lowest average beam-on time (rank 1), followed by the comparable beam-on times of HyperArc-VMAT and Manual-VMAT (both ranked at 2), with GammaKnife coming in last (rank 4) with the longest beam-on times. Furthermore, GammaKnife treatment requires the presence of an authorized medical physicist as well as a physician trained in emergency procedures for the entirety of the treatment, which may pose an additional burden on staff resources (as compared with linac-based radiosurgery). Lastly, when it comes to required patient-specific QA, GammaKnife does not require any and thus would be ranked the highest at 1, followed by Elements at 2 (whether to perform dose verification for 3D-DCA SRS plans varies according to institutional policies), and then both VMAT techniques (both ranked at 3), which require additional resources, i.e., physics staff to perform the time-consuming QA involving plan preparation, device setup, beam delivery, and plan analysis. The overall purpose of Figure 6 is to allow the reader to qualitatively evaluate the differences in focus amongst the SRS techniques per category of interest, in the context of multiple metastases treatment.
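For readers who wish to query the qualitative rankings described above, they can be encoded as a simple lookup table. The following Python sketch transcribes the ranks stated in the text (1 = best); the dictionary layout, category labels, and function name are our own illustrative choices, and no overall score is computed because the categories are not weighted in this study.

```python
# Ranks per category as stated in the text (1 = best). Tied techniques share a rank.
RANKS = {
    "conformity": {
        "GammaKnife": 3, "Elements": 4, "HyperArc-VMAT": 1, "Manual-VMAT": 2,
    },
    "low dose fall-off": {
        "GammaKnife": 1, "Elements": 3, "HyperArc-VMAT": 2, "Manual-VMAT": 4,
    },
    "inter-planner variability and skill": {
        "GammaKnife": 3, "Elements": 1, "HyperArc-VMAT": 1, "Manual-VMAT": 4,
    },
    "delivery efficiency": {
        "GammaKnife": 4, "Elements": 1, "HyperArc-VMAT": 2, "Manual-VMAT": 2,
    },
    "patient-specific QA effort": {
        "GammaKnife": 1, "Elements": 2, "HyperArc-VMAT": 3, "Manual-VMAT": 3,
    },
}


def techniques_by_rank(category):
    """Return [(rank, [techniques]), ...] for one category, best rank first."""
    grouped = {}
    for technique, rank in RANKS[category].items():
        grouped.setdefault(rank, []).append(technique)
    return [(rank, sorted(names)) for rank, names in sorted(grouped.items())]


for category in RANKS:
    print(category, techniques_by_rank(category))
```

A clinic could, for example, look up only the categories that matter most for its own patient population and staffing, which is the intended use of the spider plot.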
Another practical aspect to consider when interpreting the differences seen in the results is the accuracy and precision of these treatment machines, and how capable they truly are of delivering exactly what is displayed to the user in the treatment planning software. Inevitably, uncertainties exist throughout the entire treatment process, from simulation to on-board imaging and patient setup, all the way through to radiation delivery. Although it is beyond the scope of this paper, it is important to be aware of the potential geometric uncertainties arising not only from the hardware (imaging and radiation isocenter coincidence, gantry rotation and sag, couch positional accuracy, MLC positional accuracy, etc.), but also from patient immobilization (frameless mask treatments for linac and GammaKnife), which can alter the expected conformity indices as calculated by the planning software. This type of data analysis will be the goal of our future studies.
Upon evaluation of the dosimetric and logistical differences of these currently available SRS treatment techniques, the question arises whether any of these differences actually have a clinically tangible impact. The clinical implications of the disparities in low dose spillage or conformity indices, in terms of local control or quality of life, are a much broader and more complicated matter that is ultimately very difficult to determine. Doing so would require multi-institutional prospective clinical trials with long-term follow-up, which may be rather difficult to obtain given the average length of survival of patients with multiple brain metastases. However, for the purposes of this comparison study, we have analyzed and presented the data in such a manner as to provide the community with a tool for selecting an SRS modality for a specific patient scenario when more than one option is available, or for selecting which type of SRS modality best fits one's clinical needs based on the specific patient population.

CONCLUSIONS
HyperArc-VMAT and Manual-VMAT plans resulted in superior CI when compared with GammaKnife and Elements for target diameters > 1 cm, albeit at the expense of more MUs (relative to Elements). For targets < 1 cm, GammaKnife, HyperArc-VMAT, and both Manual-VMAT plans achieved similar CI, all superior to Elements. In the smaller target size bins, GammaKnife resulted in superior GI. In terms of low dose spread into the brain, HyperArc-VMAT achieved V12Gy values comparable to GammaKnife for target sizes < 1 cm and slightly better than GammaKnife for target sizes > 1 cm. All five SRS plans were able to meet the surrounding normal tissue limits, and overall resulted in similar doses to the pertinent OARs. Beam-on times were hours longer for GammaKnife vs. each of the linac-based SRS plans, with VMAT A and Elements resulting in shorter times relative to VMAT B and HyperArc-VMAT. Manual-VMAT plan quality varied greatly between the two institutional planning strategies employed.
In summary, this study demonstrated that HyperArc-VMAT is capable of achieving low dose spread into the brain similar to or slightly better than GammaKnife, while maintaining excellent conformity and minimizing inter-planner variability and beam-on time for patients seeking treatment of multiple metastases. GammaKnife remains superior in terms of gradient index and eliminates the need for patient-specific QA. The strengths of Elements include delivery/QA efficiency and inter-planner consistency due to automated optimization of pre-defined templates. Manual-VMAT is subject to larger inter-planner variability as compared to HyperArc-VMAT.

DATA AVAILABILITY
All datasets generated for this study are included in the manuscript and/or the supplementary files.