# Data-Based Radiation Oncology – Design of Clinical Trials

Edited by: Kerstin A. Kessel, Anne W. Lee, Søren M. Bentzen, Bhadrasain Vikram, Fridtjof Nuesslin and Stephanie E. Combs

Published in: Frontiers in Oncology


ISSN 1664-8714 ISBN 978-2-88945-438-9 DOI 10.3389/978-2-88945-438-9



Topic Editors:

**Kerstin A. Kessel,** Klinikum rechts der Isar, Technische Universität München, Helmholtz Zentrum München, Germany

**Anne W. Lee,** The University of Hong Kong Shenzhen Hospital, China

**Søren M. Bentzen,** University of Maryland, United States

**Bhadrasain Vikram,** National Cancer Institute (NIH), United States

**Fridtjof Nuesslin,** Klinikum rechts der Isar, Technische Universität München, Germany

**Stephanie E. Combs,** Klinikum rechts der Isar, Technische Universität München, Helmholtz Zentrum München, Germany


**Citation:** Kessel, K. A., Lee, A. W., Bentzen, S. M., Vikram, B., Nuesslin, F., Combs, S. E., eds. (2018). Data-Based Radiation Oncology – Design of Clinical Trials. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-438-9

# Table of Contents

*Editorial: Data Based Radiation Oncology—Design of Clinical Trials*

Kerstin Anne Kessel, Anne W. M. Lee, Søren M. Bentzen, Bhadrasain Vikram, Fridtjof Nüsslin and Stephanie E. Combs

*Big Data in Designing Clinical Trials: Opportunities and Challenges*

Charles S. Mayo, Martha M. Matuszak, Matthew J. Schipper, Shruti Jolly, James A. Hayman and Randall K. Ten Haken

*mHealth and Application Technology Supporting Clinical Trials: Today's Limitations and Future Perspective of smartRCTs*

Marco M. E. Vogel, Stephanie E. Combs and Kerstin A. Kessel

*Which Obstacles Prevent Us from Recruiting into Clinical Trials: A Survey about the Environment for Clinical Studies at a German University Hospital in a Comprehensive Cancer Center*

Christoph Straube, Peter Herschbach and Stephanie E. Combs

*Use of Multicenter Data in a Large Cancer Registry for Evaluation of Outcome and Implementation of Novel Concepts*

Gabriele Schubert-Fritschle, Stephanie E. Combs, Thomas Kirchner, Volkmar Nüssler and Jutta Engel

*Data-Based Radiation Oncology: Design of Clinical Trials in the Toxicity Biomarkers Era*

David Azria, Ariane Lapierre, Sophie Gourgou, Dirk De Ruysscher, Jacques Colinge, Philippe Lambin, Muriel Brengues, Tim Ward, Søren M. Bentzen, Hubert Thierens, Tiziana Rancati, Christopher J. Talbot, Ana Vega, Sarah L. Kerns, Christian Nicolaj Andreassen, Jenny Chang-Claude, Catharine M. L. West, Corey M. Gill and Barry S. Rosenstein


Sandra Rutzner, Rainer Fietkau, Thomas Ganslandt, Hans-Ulrich Prokosch and Dorota Lubgan

*Integrating Hyperthermia into Modern Radiation Oncology: What Evidence Is Necessary?*

Jan C. Peeken, Peter Vaupel and Stephanie E. Combs

*Parenchymal and Functional Lung Changes after Stereotactic Body Radiotherapy for Early-Stage Non-Small Cell Lung Cancer—Experiences from a Single Institution*

Juliane Hörner-Rieber, Julian Dern, Denise Bernhardt, Laila König, Sebastian Adeberg, Vivek Verma, Angela Paul, Jutta Kappes, Hans Hoffmann, Juergen Debus, Claus P. Heussel and Stefan Rieken

*Relationships between Regional Radiation Doses and Cognitive Decline in Children Treated with Cranio-Spinal Irradiation for Posterior Fossa Tumors*

Elodie Doger de Speville, Charlotte Robert, Martin Perez-Guevara, Antoine Grigis, Stephanie Bolle, Clemence Pinaud, Christelle Dufour, Anne Beaudré, Virginie Kieffer, Audrey Longaud, Jacques Grill, Dominique Valteau-Couanet, Eric Deutsch, Dimitri Lefkopoulos, Catherine Chiron, Lucie Hertz-Pannier and Marion Noulhiane

*Tangential Field Radiotherapy for Breast Cancer—The Dose to the Heart and Heart Subvolumes: What Structures Must Be Contoured in Future Clinical Trials?*

Marciana Nona Duma, Anne-Claire Herr, Kai Joachim Borm, Klaus Rüdiger Trott, Michael Molls, Markus Oechsner and Stephanie Elisabeth Combs

# Editorial: Data Based Radiation Oncology—Design of Clinical Trials

*Kerstin Anne Kessel1,2\*, Anne W. M. Lee3, Søren M. Bentzen4, Bhadrasain Vikram5, Fridtjof Nüsslin1 and Stephanie E. Combs1,2*

*1Department of Radiation Oncology, Klinikum rechts der Isar, Technische Universität München, Munich, Germany, 2Institute for Innovative Radiotherapy (iRT), Helmholtz Zentrum München, Munich, Germany, 3Department of Clinical Oncology, The University of Hong Kong Shenzhen Hospital, Shenzhen, China, 4Division of Biostatistics and Bioinformatics, Department of Epidemiology and Public Health, The Greenebaum Cancer Center, School of Medicine, University of Maryland, Baltimore, MD, United States, 5National Cancer Institute (NIH), Rockville, MD, United States*

Keywords: clinical trials, data collection, radiation oncology, clinical study design, study management

#### **Editorial on the Research Topic**

#### **Data Based Radiation Oncology—Design of Clinical Trials**

In radiation oncology, as in many other specialties, clinical trials are essential for investigating new therapeutic approaches. Preparing a prospective clinical trial is usually time-consuming, and considerable time passes before ethics approval is obtained. Many years can then pass before a new treatment has been tested and can be implemented in routine care. During that time, new interventions emerge, new drugs appear on the market, technical and physical innovations are implemented, and novel biology-driven concepts are translated into clinical approaches, while we are still investigating those from years ago.

Another problem is associated with molecular diagnostics and the growing number of tumor-specific biomarkers, which allow for better stratification of patient subgroups. On the other hand, finer stratification may result in much longer patient recruitment times and consequently in larger multicenter trials. Moreover, all relevant data must be readily available for treatment decision making, for treatment and follow-up, and ultimately for trial evaluation. This makes agreed standards for data acquisition, quality, and management even more important.

How could we change the way clinical trials are currently performed so that they remain safe and ethically justifiable while the initiation process is accelerated, allowing us to provide new and better treatments to our patients faster?

Furthermore, because we rely on a wide variety of quantitative information, efficient handling of large, distributed, and heterogeneous amounts of data is very important. Thus, data management becomes a strong focus. A good infrastructure helps to plan, tailor, and conduct clinical trials so that they can be analyzed easily and quickly.

In this research topic, we want to discuss new ideas for intelligent trial designs and concepts for data management.

#### AUTHOR CONTRIBUTIONS

All authors wrote and revised the editorial.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2018 Kessel, Lee, Bentzen, Vikram, Nüsslin and Combs. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

#### *Edited and Reviewed by:*

*Timothy James Kinsella, Warren Alpert Medical School of Brown University, United States*

#### *\*Correspondence:*

*Kerstin Anne Kessel kerstin.kessel@tum.de*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 26 January 2018; Accepted: 01 February 2018; Published: 16 February 2018*

#### *Citation:*

*Kessel KA, Lee AWM, Bentzen SM, Vikram B, Nüsslin F and Combs SE (2018) Editorial: Data Based Radiation Oncology—Design of Clinical Trials. Front. Oncol. 8:34. doi: 10.3389/fonc.2018.00034*

# Big Data in Designing Clinical Trials: Opportunities and Challenges

*Charles S. Mayo\*, Martha M. Matuszak, Matthew J. Schipper, Shruti Jolly, James A. Hayman and Randall K. Ten Haken*

*Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, United States*

Emergence of big data analytics resource systems (BDARSs) as a part of routine practice in Radiation Oncology is on the horizon. Gradually, individual researchers, vendors, and professional societies are leading initiatives to create and demonstrate use of automated systems. What are the implications for design of clinical trials as these systems emerge? Gold-standard randomized controlled trials (RCTs) have high internal validity for the patients and settings fitting the constraints of the trial, but also have limitations, including reproducibility, generalizability to routine practice, infrequent external validation, selection bias, characterization of confounding factors, ethics, and use for rare events. BDARSs present opportunities to augment and extend RCTs. Preliminary modeling using single- and multi-institutional BDARSs may lead to better design and lower cost. Standardizations in data elements, clinical processes, and nomenclatures, used to decrease variability and increase the veracity needed for automation and multi-institutional data pooling in BDARSs, also support the ability to add clinical validation phases to clinical trial design and increase participation. However, volume and variety in BDARSs present other technical, policy, and conceptual challenges, including applicable statistical concepts and cloud-based technologies. In this summary, we will examine both the opportunities and the challenges for use of big data in design of clinical trials.

#### *Edited by:*

*Bhadrasain Vikram, National Cancer Institute (NIH), United States*

#### *Reviewed by:*

*Niloy Ranjan Datta, Kantonsspital Aarau, Switzerland; Torunn I Yock, Massachusetts General Hospital, United States*

#### *\*Correspondence:*

*Charles S. Mayo cmayo@med.umich.edu*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 31 May 2017; Accepted: 09 August 2017; Published: 31 August 2017*

#### *Citation:*

*Mayo CS, Matuszak MM, Schipper MJ, Jolly S, Hayman JA and Ten Haken RK (2017) Big Data in Designing Clinical Trials: Opportunities and Challenges. Front. Oncol. 7:187. doi: 10.3389/fonc.2017.00187*

Keywords: big data, trial design, randomized controlled trials, informatics, analytics

### INTRODUCTION

A primary objective of clinical research is gaining knowledge from studying a subset of patients which can then be applied to a much wider group of patients to improve care. In routine practice, patient care is delivered within a rich background of intrinsic and endemic confounding factors and biases associated with practices and patients. Clinical research methodologies are challenged to accurately delineate specific relationships and be relevant to routine practice.

Optimal trial design methodologies have a long history of debate within the medical field (1–15). Recently, there has been substantial growth in the number of academic groups investing in development of big data analytics resource systems (BDARSs) to support practice quality improvement (PQI) and translational research (TR) applications in radiation oncology (16, 17). BDARSs aggregate clinical data from multiple systems, including electronic health records (EHRs), Radiation Oncology information systems (ROISs), treatment planning systems (TPSs), and others, into a common location designed to support analysis of these data to improve patient care. Our objective in this presentation is to explore how these big data efforts might intersect with trial design methodologies to augment or extend these approaches.

### RANDOMIZED CLINICAL TRIALS

Randomized controlled trials (RCTs) provide the highest ranked level of evidence for delineation of causal relationships between treatment results and outcomes. Using a design methodology that meticulously minimizes and controls variation encountered in routine practice, RCTs are designed for statistical rigor. They have high internal validity for selected constraints and treatment delivery conditions specified in the trial design. RCTs are well incorporated into clinical and research systems. Systems for funding, management, and infrastructure supporting collaborative trials research are oriented to RCTs. However, RCTs also have challenges, including reproducibility, generalizability, cost, external validation, and delay (1, 2, 14). Meta-analysis of individual patient data addresses some of the challenges of any single trial. In particular, results of a meta-analysis of multiple clinical trials will generally be more reproducible and generalizable and have greater external validity. However, meta-analyses also involve greater delay and cost than any single trial. Additionally, they are still based on the population of patients who actually enroll in clinical trials, which may not be fully representative of a broader patient population.

#### Reproducibility

Multiple, independent measurements demonstrating reproducibility of results are strong evidence for the validity of a result. Difficulty in reproducing results of RCTs is a concern in the community and for the National Institutes of Health (3). Observational studies are ranked lower than RCTs in level of evidence, but frequently include a larger number of patients. Some researchers have demonstrated greater consistency among observational studies than among RCTs (2, 4, 5). In an analysis comparing results of independent RCTs (45) to independent, well-designed observational studies (44) spanning five clinical research topics, Concato demonstrated more inconsistency among the RCTs and much tighter confidence intervals for the observational studies, which included a larger number of subjects (2). In an early meta-analysis, Horwitz examined 200 RCTs spanning 36 topics in cardiology and gastroenterology, highlighting conflicting results. He found that complex design and inconsistencies in clinical execution and therapeutic evaluation undermined reproducibility (4). In radiation oncology, complex single-institution trials may require significant redesign to reduce complexity, as in the case of translating the University of Michigan's PET adaptive lung cancer trial to a cooperative group trial run through RTOG (18, 19). Additionally, compared with pharmacologic interventions, technique-based interventions in Radiation Oncology, as in Surgery, introduce added complexities that are sensitive to the skill of individual practitioners and to the evolution of technique over the period of the trial as experience is acquired.

#### Cost

Effort required for collection and aggregation of data frequently falls outside the range of routine clinical practice. Interfaces to EHRs, ROISs, and TPSs typically require manual inspection of each system to synthesize, extract, and report required trial data.

#### Generalizability

Complexity and cost of implementing trials work against recruitment of large numbers of patients and introduce selection bias toward patient cohorts with geographic, insurance, and medical history profiles commensurate with treatment at medical centers that also have sufficient resources to participate in trials. This selection bias can become dangerous when the RCT result is applied to a group of patients that was underrepresented in trial enrollment and whose disease may not respond to the experimental treatment. In addition, RCTs are typically designed to test a drug or specific intervention in a patient cohort with strict eligibility criteria. In many cases, RCTs test these interventions in a small subset of patients within larger disease sites. So, even after a positive trial, the number of patients to whom the results of an RCT apply could be relatively small. However, this does not prevent the community from applying the intervention to a larger cohort of patients, so that future observational studies are potentially washed out or negative because of inappropriate use of the trial results.

As more data on genomic variations across patients and tumors become available, it is also possible that the results of certain positive trials will prove to have been driven by a strong positive result in a previously unknown subset of the population. Without further study and patient classification by BDARSs, the ability to further analyze these trials is lost.

#### Infrequent External Validation

If an objective of funding RCTs is to improve care for a broader segment of the population, then demonstrations of external validation are needed. Due to a variety of factors, RCTs suffer from low rates of external validation. Larger RCT series with multiple studies testing similar regimens, such as accelerated whole breast irradiation (6, 7), are the exceptional case in which RCTs can lead to sweeping practice changes and updated national guidelines. However, smaller RCTs, especially those run in a single-institution setting, are rarely validated in an external cohort because of complex design, cost, and loss of equipoise after the initial trial is published.

One reason for this may be that testing a trial concept for extensibility to and validity in the "real world" of routine clinical practice is rarely a priority in trial design. Therefore, RCTs continue to include a far smaller number of patients and less variable clinical practices than are represented by the majority of patients treated.

As more and more biomarker- and image-driven treatment selection is incorporated into trials, this lack of external validation will only become worse. Not only may validation studies be impossible because of a lack of knowledge and resources to run the trial, but specific nuances of image analysis and bio-specimen testing and handling may also be unavailable or irreproducible. National clinical trial resources and core facilities will assist in this area for larger cooperative group studies, but this remains an issue for single-institution studies.

#### Delay

Clinical trial infrastructure, both at individual institutions and in cooperative groups, is organized in such a way that trials go through a number of steps to ensure that they are of sufficient potential benefit to the patient or population, can be funded appropriately, and are designed properly. While these steps are essential, they also mean that the start of a trial can be delayed, sometimes by years.

Almost one-fifth of clinical trials, even at large centers, are "slow-accruing" (14). Thus, once a trial opens, the study question may no longer be as relevant as it was when the concept was first proposed. The expense of tests and staff needed to carry out the RCT may limit resources needed for accrual into the trial. Use of manual rather than standardized electronic means at the point of care-point of data entry impedes aggregation from multiple institutions. Managing logistics of clinical process flows and mechanisms for data aggregation for RCTs that differ from those used for the majority of off-protocol patients adds to cost and slows accrual.

### SYNERGIES IN CONSTRUCTING BIG DATA SYSTEMS AND SUPPORTING CLINICAL TRIALS

Rather than replacing RCTs, we posit that BDARSs will present resources and methodologies that can be incorporated into the design of RCTs to augment and extend them and to address the issues outlined above. Assuring that data elements needed for BDARSs are routinely aggregated using methodologies that assure accurate electronic extraction is also synergistic with objectives for clinical trials and observational studies. Construction of effective BDARSs includes development and use of standardizations that can be practically fitted into clinical practice. Coordination with multi-disciplinary groups to clean up point of care-point of data entry processes in support of BDARSs is extensible to these groups for entry of data elements necessary for clinical trials. Standardizations in designation of key data elements, in nomenclatures supporting exchange, and in clinical processes improving accuracy are vital to these efforts.

#### EHR Templates

For example, our BDARS, the University of Michigan Radiation Oncology Analytics Resource (M-ROAR), requires accurate data on provider-reported toxicities, recurrence, performance status, etc. (18). An examination of care providers' workflows showed that the most consistent point of entry is provider notes in the electronic health record (EHR). Our EHR, EPIC, does not provide quantified fields for these key data elements. However, with development of M-ROAR to enable use of the full text of encounter notes, standardizing text entry to enable accurate, automated electronic extraction became a viable solution.

The EHR does provide a means to create templates that regularize text entry of information. In that EHR system, these are known as Smart List and Smart Phrase objects. Smart List objects allow definition of a tab-activated drop-down list of serializable options to be inserted in the text field of a clinical note. Smart Phrase objects are used to assemble sets of Smart Lists embedded with other standardized text.

We developed a standardized schema for representation of key data elements in text fields utilizing these smart objects to regularize data entry across providers. With this schema standardization, software tools known as regular expressions can be used to accurately extract key data elements from the text of clinical encounter notes. This is carried out in high volume for all patients.

The schema developed for demarcating key data elements is illustrated below. Highlighted text indicates characters with specific interpretations. Italicized text indicates placeholders for specific information types.

**Figure 1** illustrates creation of smart list objects using this schema. The |> and <| character combinations delineate the beginning and the end of a key data element. The text to the left of the = sign following |> is a standardized name for the key data element; the text to the right indicates the value assigned to the data element. Parentheses, (), are used to delineate optional commentary information. The bar symbols, |, demarcate entry of optional supplemental item/value pairs related to the key data element. Four examples of schema-valid text fields are listed below.

```
```



The standardized schema assures accurate identification of key data elements and component information elements. Together with definition of a standardized data dictionary of key elements, supplemental information items and allowed values, the standardized schema provides a flexible but fully defined means to accurately and electronically extract information needed for BDARSs.
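As an illustration of the extraction step, the following minimal sketch parses schema-formatted text with Python regular expressions. It is not the M-ROAR implementation: the function name `parse_elements`, the element names (Toxicity, Grade, Attribution, KPS), and the sample note are hypothetical, and the sketch assumes that names and values contain no parentheses or bar characters.

```python
import re

# One key data element: everything between "|>" and "<|".
ELEMENT_RE = re.compile(r"\|>(.*?)<\|", re.DOTALL)
# One "name = value" pair with an optional "(comment)" after the value.
PAIR_RE = re.compile(r"^\s*([^=()|]+?)\s*=\s*([^()|]*?)\s*(?:\(([^)]*)\))?\s*$")


def parse_elements(note_text):
    """Extract key data elements from free text written in the |> ... <| schema."""
    elements = []
    for body in ELEMENT_RE.findall(note_text):
        # The first segment is the key element; later segments, separated
        # by "|", are optional supplemental item/value pairs.
        segments = body.split("|")
        primary = PAIR_RE.match(segments[0])
        if primary is None:
            continue  # not schema-valid; skipped here, could be logged instead
        supplemental = {}
        for segment in segments[1:]:
            pair = PAIR_RE.match(segment)
            if pair is not None:
                supplemental[pair.group(1)] = pair.group(2)
        elements.append({
            "name": primary.group(1),
            "value": primary.group(2),
            "comment": primary.group(3),  # None when no commentary is given
            "supplemental": supplemental,
        })
    return elements


# Hypothetical note text, for illustration only.
note = (
    "Patient seen in follow-up. "
    "|>Toxicity = Dysphagia (improving) | Grade = 2 | Attribution = Radiation<| "
    "Performance status: |>KPS = 90<|"
)
for element in parse_elements(note):
    print(element)
```

Because the delimiters are fixed, text outside the |> and <| markers is ignored, so providers can keep writing free narrative around the structured elements.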

When a clinical trial is implemented, additional key data elements may be needed. If the EHR is the optimal point of care-point of data entry mechanism, then the data dictionary is extended, and new smart list/smart phrase objects are constructed using the standardized schema developed to support extractions for the BDARS.
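Extending the data dictionary for a trial can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of M-ROAR or of any EHR vendor's tools: the dictionary contents, the `TrialArm` element, and the function names are hypothetical, and the dictionary is modeled simply as a mapping from element name to a set of allowed values.

```python
# Hypothetical data dictionary: element name -> allowed values (None = any text).
BASE_DICTIONARY = {
    "Toxicity": None,
    "Grade": {"0", "1", "2", "3", "4", "5"},
    "Recurrence": {"None", "Local", "Regional", "Distant"},
}


def extend_dictionary(base, trial_elements):
    """Return a new dictionary with trial-specific elements added.

    Existing elements are never redefined, so routine-care extractions
    keep their meaning while the trial is running.
    """
    clashes = set(base) & set(trial_elements)
    if clashes:
        raise ValueError(f"elements already defined: {sorted(clashes)}")
    merged = dict(base)
    merged.update(trial_elements)
    return merged


def is_valid(dictionary, name, value):
    """Check an extracted name/value pair against the data dictionary."""
    if name not in dictionary:
        return False
    allowed = dictionary[name]
    return allowed is None or value in allowed


def format_element(name, value, comment=None, supplemental=None):
    """Render one key data element as schema-valid text."""
    text = f"{name} = {value}"
    if comment:
        text += f" ({comment})"
    for item, item_value in (supplemental or {}).items():
        text += f" | {item} = {item_value}"
    return f"|>{text}<|"


# A hypothetical trial adds one element recording the randomization arm.
trial_dictionary = extend_dictionary(BASE_DICTIONARY, {"TrialArm": {"A", "B"}})
print(is_valid(trial_dictionary, "TrialArm", "B"))
print(format_element("TrialArm", "B"))
```

Keeping the trial elements in the same dictionary and the same text schema means the extraction tooling built for routine care needs no trial-specific changes.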

Note that while access to TPS and ROIS data is routine in most Radiation Oncology clinics, access to EHR data varies widely among institutions. Considerable cooperation between the EHR vendor and the institutional IT groups controlling access with end users is required. Introduction of standardizations, like that defined above, increases the value of the enterprise data stores for both vendors and IT groups as well as for end users. However, these standardizations only arise and become incorporated into routine practice if end users are enabled to access and use the data. This is especially important for community clinics, where the majority of patients are treated.

#### Optimized Clinical Process Flow Using Existing Systems

For several key data element categories, ROISs or TPSs may be the optimal point of care-point of data entry systems. Optimizing clinical processes to assure availability of these elements for all patients supporting the BDARSs also eliminates extra effort to acquire these elements when needed for clinical trials.

For example, modifying clinical process flows to implement a standardized approach for entry of diagnosis and staging information, along with explicit linkages to treatment course, supports both the BDARS and clinical trials. In another example supporting the BDARS, we modified our clinical process to assure routine creation of as-treated plan sums, enabling automated extraction of course-cumulative dose volume histogram (DVH) curves that reflect cumulative doses for the plans and the actual number of fractions treated. In addition, the standardized nomenclature recommendations of AAPM TG-263 for targets and normal structures were adopted to assure correct identification of structures in extract, transform, and load (ETL) operations for DVH curves.
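The idea of an as-treated plan sum and its cumulative DVH can be sketched in a few lines. The sketch is a simplified illustration, not the treatment planning system's algorithm: it assumes that all plans are sampled on one common grid of equal-volume voxels and that dose scales linearly with the number of fractions delivered, and the function names and dose values are hypothetical.

```python
def as_treated_sum(plans):
    """Sum per-voxel dose over plans, scaled by the fractions actually delivered.

    plans: list of (planned_dose_per_voxel_gy, fractions_planned, fractions_delivered).
    Assumes one shared voxel grid and dose proportional to delivered fractions.
    """
    n_voxels = len(plans[0][0])
    total = [0.0] * n_voxels
    for doses, planned, delivered in plans:
        if len(doses) != n_voxels:
            raise ValueError("all plans must be sampled on the same voxel grid")
        scale = delivered / planned
        for i, dose in enumerate(doses):
            total[i] += dose * scale
    return total


def cumulative_dvh(voxel_doses_gy, bin_width_gy=1.0):
    """Return (dose, fraction of volume receiving at least that dose) pairs."""
    if not voxel_doses_gy:
        raise ValueError("structure has no voxels")
    n_voxels = len(voxel_doses_gy)
    n_bins = int(max(voxel_doses_gy) // bin_width_gy) + 1
    curve = []
    for k in range(n_bins + 1):
        dose = k * bin_width_gy
        covered = sum(1 for d in voxel_doses_gy if d >= dose)
        curve.append((dose, covered / n_voxels))
    return curve


# Hypothetical structure with four voxels: an initial plan delivered in full
# (30 of 30 fractions) and a boost stopped early (4 of 5 fractions).
plans = [
    ([60.0, 30.0, 0.0, 15.0], 30, 30),
    ([10.0, 10.0, 0.0, 5.0], 5, 4),
]
course_dose = as_treated_sum(plans)
for dose, volume_fraction in cumulative_dvh(course_dose, bin_width_gy=10.0):
    print(f"{dose:5.1f} Gy  {volume_fraction:.2f}")
```

Scaling by delivered rather than planned fractions is what distinguishes the as-treated curve from the planned one when a course is interrupted.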

Aggregation of patient-reported outcomes (PROs) required modification of clinical process flows and staffing as well as collection technology. With subsequent completion of the informatics circle to ETL PRO data into M-ROAR, the PRO data became available for large-volume analysis. With that step, the mechanisms used for gathering PROs for M-ROAR could plausibly be extended to support gathering analogous information for patients on RCTs.

#### Multiple Institutions

Ability to aggregate key data elements, including survival, recurrence, and toxicity, is challenged when patients do not return for follow-up or shift away from the academic center delivering specialized care back to their local community hospital for ER and continuing care visits. Fully understanding therapeutic outcomes requires longitudinal follow-up data over many years. Scalable, automated solutions are technically feasible, but requisite contractual relationships and PHI protection compliance mechanisms are not. Health care policy efforts to improve continuity of care will in the long run benefit both BDARSs and RCTs.

The regulatory and institutional compliance office constraints arising from the Health Insurance Portability and Accountability Act (HIPAA) are important for protecting sensitive, personal information of patients from misuse. However, HIPAA can be a double-edged sword. The ability to utilize information gained from prior patients at multiple institutions to improve treatments of future patients is a desirable use. Current views of how to implement the intent of HIPAA often prevent reaching this potential. Finding a middle ground that affords needed protections while also enabling the benefits of multi-institutional datasets is a vital area of collaboration between patient advocacy groups, legislators, regulatory groups, and researchers.

### USING BIG DATA TO AUGMENT TRIAL DESIGN

As BDARSs emerge, are integrated with EHRs, ROISs, and TPSs and applied to all patients treated, they present resources for improving trial design. Successfully carrying out this integration requires navigating multi-disciplinary, multi-stakeholder clinical processes needed to achieve access, and implement standardizations (20, 21). Building standardizations and automations into systems reduces the amount of manual effort required to enter and extract data, lowering cost. In addition, wider adoption of standardizations and templates and applications supporting BDARSs lowers resource thresholds for participation in RCTs. This should translate to increasing participation in RCTs.

By proactively identifying and incorporating BDARS supporting standardizations, researchers designing trials can improve curation and reproducibility. Standardizations reduce complexity introduced by variability and increase reliability of consistency checks on inputs and outputs. Use of these standards in routine clinical care and in RCTs makes possible development of sharable automated curation algorithms to flag outliers or longitudinal variation in data entry that may signal errors.
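As an illustration of the kind of sharable curation check described above, the following minimal sketch flags values that deviate strongly from the rest of a data element's distribution. It assumes a robust z-score based on the median and the median absolute deviation (MAD); the function name `flag_outliers`, the threshold of 3.5, and the example dose values are hypothetical and are not taken from any BDARS.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return indices of values whose robust z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are less
    affected by the outliers themselves than the mean and standard deviation.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the values equal the median: flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    # 0.6745 rescales the MAD so that the score is comparable to a standard
    # z-score when the data are normally distributed.
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - med) / mad) > threshold]

# Hypothetical prescribed doses in Gy; 6000 looks like a cGy value typed into a Gy field.
doses_gy = [60.0, 60.0, 66.0, 70.0, 60.0, 6000.0, 66.0, 70.0]
suspect = flag_outliers(doses_gy)
```

A check of this kind only marks entries for human review; whether a flagged value is truly an error still has to be decided by someone who knows the clinical context.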

For example, AAPM's Task Group 263 on Standardization of Nomenclature for Radiation Therapy defined standards for naming of target and normal structures as well as a schema for representing DVH metrics. The task group comprised 57 members representing a broad range of roles (e.g., physician, physicist, vendor), professional societies (e.g., AAPM, ASTRO, ESTRO), clinic types (e.g., academic, community practice), and specialty groups (e.g., IHE-RO, DICOM, NRG), brought together to meet common needs of RCTs and routine practice (22). This standard has been adopted by NRG in designing new trials (23). Adopting this standardization in routine practice reduces the effort needed to prepare data for RCT aggregation sites or for use in local PQI and TR.

By designing trials to utilize BDARSs as the optimal aggregation system rather than manual one-by-one extraction from EHRs, ROISs, and TPSs, the ability to extend trial results to routine practice and later to carry out validation studies is improved. When BDARS aggregations are set up at the outset, while there are resources for introducing the RCT, the infrastructure for follow-on efforts is largely in place. In addition, identifying and fixing "pinch points" in clinical processes to support the BDARS highlights practice-sensitive data elements affecting RCTs and improves the ability to design trials with intent to incorporate external validation.

Further, with automated aggregation of multiple data elements, the range of confounding factors that can be tested in the trial increases. In addition, standardization and automation extended across multiple centers increase the ability to aggregate enough patients to examine rare events.

### CONSIDERATIONS IN OBSERVATIONAL STUDIES

One of the main challenges to learning from BDARSs is the potential for confounding. In RCTs, the randomization ensures that patients receiving each of the randomized treatments will, on average, be similar with respect to any baseline variable. In observational datasets, there often exist selection biases such that patients receiving two different treatments have different distributions of a variable that may be related to an outcome of interest.

There are a number of statistical approaches to assessing and accounting for confounders. A simple approach is to use multivariable regression models in which potential confounders are included as covariates in addition to treatment. A generally preferable approach is to use propensity scores as weights (inverse probability of treatment), strata, or matching variables (24). Using propensity scores as weights creates a "synthetic" population of outcomes in which both treatment groups have similar distributions of any measured confounders. In this sense, it mirrors an RCT. Both multivariable regression models and propensity methods account only for measured confounders. In some settings, there may be unmeasured confounders.
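To make the weighting idea concrete, here is a minimal sketch of inverse probability of treatment weighting with a single categorical confounder. It assumes the propensity score can be estimated as the observed treated fraction within each confounder stratum (in practice a regression model over many covariates is more typical); the function name `iptw_effect` and the eight patient records are hypothetical.

```python
from collections import defaultdict

def iptw_effect(records):
    """Estimate a treatment effect with inverse probability of treatment weights.

    records: list of (confounder, treated, outcome) tuples, treated being 0 or 1.
    The propensity score is estimated as the treated fraction within each
    confounder stratum (an assumption that keeps the sketch dependency-free).
    Every stratum must contain both treated and untreated patients.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_total, n_treated]
    for x, t, _ in records:
        counts[x][0] += 1
        counts[x][1] += t

    weighted_sum = {0: 0.0, 1: 0.0}  # weighted outcome sum per arm
    weight_total = {0: 0.0, 1: 0.0}  # sum of weights per arm
    for x, t, y in records:
        p = counts[x][1] / counts[x][0]  # estimated propensity of treatment
        w = 1.0 / p if t == 1 else 1.0 / (1.0 - p)
        weighted_sum[t] += w * y
        weight_total[t] += w
    # Difference of weighted mean outcomes (treated minus untreated).
    return weighted_sum[1] / weight_total[1] - weighted_sum[0] / weight_total[0]

# Hypothetical data: fit patients score 10 at baseline, frail patients 5,
# treatment adds 1, and fit patients are treated more often.
records = (
    [("fit", 1, 11.0)] * 3 + [("fit", 0, 10.0)]
    + [("frail", 1, 6.0)] + [("frail", 0, 5.0)] * 3
)
treated = [y for _, t, y in records if t == 1]
control = [y for _, t, y in records if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
weighted = iptw_effect(records)
```

In this constructed example the unweighted comparison suggests an effect of 3.5, while the weighted comparison recovers the effect of 1 that was built into the data, because weighting balances the fit and frail patients across the two arms.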

Instrumental variable analysis (IVA) (25) represents an approach which can provide valid treatment effect estimates in the presence of unmeasured confounding if certain assumptions are met. IVA analyses rely on the selection of an "instrumental" variable that is correlated with treatment and meets other conditions. Importantly, these conditions cannot be verified empirically from the data so that selection of an instrument must be based on subject-matter knowledge.
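One of the simplest forms of IVA can be sketched as follows. This example assumes a binary instrument and uses the Wald ratio estimator, which is one common choice and not necessarily the method used in reference (25); the function name `wald_iv_estimate` and the data are hypothetical, and the instrument is simply assumed to satisfy the required conditions.

```python
def wald_iv_estimate(records):
    """Wald ratio estimator for a binary instrument.

    records: list of (z, treated, outcome) tuples with z and treated in {0, 1}.
    Returns the difference in mean outcome between the two instrument groups
    divided by the difference in treatment rate between them.
    """
    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    outcome_diff = (mean(y for z, _, y in records if z == 1)
                    - mean(y for z, _, y in records if z == 0))
    uptake_diff = (mean(t for z, t, _ in records if z == 1)
                   - mean(t for z, t, _ in records if z == 0))
    if uptake_diff == 0:
        raise ValueError("instrument is unrelated to treatment in these data")
    return outcome_diff / uptake_diff

# Hypothetical data generated as outcome = 3 + 2*treated + 4*u, where u is an
# unmeasured factor that also makes treatment more likely. The built-in effect is 2.
records = [
    # (z, treated, outcome)
    (0, 1, 9), (0, 0, 7), (0, 0, 3), (0, 0, 3),
    (1, 1, 9), (1, 1, 9), (1, 1, 5), (1, 0, 3),
]
treated = [y for _, t, y in records if t == 1]
untreated = [y for _, t, y in records if t == 0]
naive = sum(treated) / len(treated) - sum(untreated) / len(untreated)
iv = wald_iv_estimate(records)
```

Here the direct comparison of treated and untreated patients gives 4, whereas the instrument-based estimate gives 2, the value used to construct the data. The result is only as trustworthy as the untestable assumptions about the instrument.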

### USING BIG DATA TO EXTEND TRIAL DESIGN

Increased availability of BDARS also presents several opportunities to extend clinical trial design methodologies or to generate RCT hypotheses fueled by large, preliminary observational studies. BDARS make distributions for a wide range of treatment and diagnostic parameters readily available. These distributions can be utilized to carry out "virtual design trials" ahead of designing the RCT (**Figure 2**).

For example, in designing a trial aimed at investigating the co-dependence of a chemotherapy regimen used in conjunction with an SBRT dose escalation strategy for lung cancer patients, historic data could be used to examine distributions and cross-correlations of demographics; radiation and chemotherapy treatment parameters; dosimetric and laboratory values; survival; recurrence; provider-reported toxicities; and patient-reported outcomes. With the distributions and inter-relationships characterized, variations as anticipated from the proposed trial can be simulated with Monte Carlo and Bayesian methods to better anticipate confounding interactions and to optimize design decisions. Machine learning approaches can be used to leverage the wide range of data element categories contained in BDARS to identify unanticipated interactions and dependencies that should be considered in the RCT design. When the BDARS contains data on charges and procedure codes, budget projections for the trial can also be improved. This approach puts examination of the confidence intervals of key parameters and implications for the study up front, using actual data rather than hypothetical projections, and avoids having to adjust the RCT after it has started.
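A very small Monte Carlo "virtual design trial" might look like the sketch below. It assumes that historic outcome values can be resampled with replacement to stand in for future patients and that a two-sided z-test on the difference of means is an adequate analysis; the function name `simulated_power`, the historic values, and the hypothesized effect are all hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev

def simulated_power(historic, effect, n_per_arm, n_sims=500, alpha=0.05, seed=0):
    """Estimate power by resampling historic outcomes (bootstrap Monte Carlo).

    Each simulated trial draws both arms from the historic values, adds the
    hypothesized effect to the experimental arm, and applies a two-sided
    z-test on the difference of means (a normal approximation).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        control = rng.choices(historic, k=n_per_arm)
        treated = [v + effect for v in rng.choices(historic, k=n_per_arm)]
        se = ((stdev(control) ** 2 + stdev(treated) ** 2) / n_per_arm) ** 0.5
        if se == 0:
            continue  # resample without any variation: counted as no rejection
        if abs(mean(treated) - mean(control)) / se > z_crit:
            rejections += 1
    return rejections / n_sims

# Hypothetical historic outcome scores pulled from a BDARS query.
historic = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]
power_if_effect = simulated_power(historic, effect=5.0, n_per_arm=30)
power_if_null = simulated_power(historic, effect=0.0, n_per_arm=30)
```

Running the function over a grid of arm sizes and effect sizes shows how sensitive the design is to those choices before any patient is enrolled. A real virtual design trial would also model accrual, dropout, and correlated covariates, which this sketch leaves out.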

Prior to conducting an RCT, investigators could utilize BDARSs to more precisely understand characteristics of patients with a particular type of cancer or of patients being treated with a certain treatment. This knowledge could then be translated into the design of the RCT to ensure that the patients enrolled in the RCT are reflective of the intended population. This could mean, for example, that enrollment would be stratified by subgroups. A key step in designing RCTs is selection of sample size. Key drivers of sample size include effect size estimates as well as estimates of variance. There is much room for improvement in how these parameters are selected in the design stage, and BDARSs could be utilized to estimate them more precisely and accurately. BDARSs could also be used to accurately estimate the number of eligible patients and hence the likelihood of completing accrual in a timely fashion.
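The dependence of sample size on effect size and variance can be sketched with the standard normal-approximation formula for comparing two means, n per arm = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². The example below assumes a continuous endpoint with equal variance in both arms; the function name `n_per_arm` and the input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(delta, sigma, alpha=0.05, power=0.80):
    """Patients per arm needed to detect a mean difference `delta`.

    Normal-approximation formula for a two-sided, two-sample comparison of
    means with common standard deviation `sigma`.
    """
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 * sigma ** 2 / delta ** 2)

# Same target difference, two different variance estimates.
n_assumed = n_per_arm(delta=0.5, sigma=1.0)   # variance taken from a guess
n_observed = n_per_arm(delta=0.5, sigma=1.2)  # variance estimated from registry data
```

With these inputs the required size grows from 63 to 91 patients per arm when the standard deviation estimate rises from 1.0 to 1.2, which illustrates why a better variance estimate from a BDARS can materially change a design.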

After an RCT is completed, BDARSs could be utilized to assess uptake of the "winning" treatment and, importantly, whether the results in actual clinical practice are similar to those observed in the RCT. One reason for discrepancy has to do with how the treatments are implemented. Treatments such as IMRT are complex and can vary substantially in important details such as normal tissue constraints. If these variables are captured as part of the BDARS, then the source of discrepant results can be sought in discrepant implementations.

In addition, ability for a site proposing an RCT to carry out this analysis demonstrating the potential of the proposed RCT, either as a single- or multi-institutional effort, provides a low cost means of testing the potential value of the RCT and focuses funding on efforts with significant likelihood of success. Publication of these virtual trial results ahead of implementing the actual RCT would place specific and focused discussion of the trial design and potential weaknesses ahead of implementation.

## CONCLUSION

The recent surge in big data initiatives in health care is expected to have a positive impact on clinical trials. Increased standardization of common data elements and nomenclature should assist in streamlined trial design and exchange of data. Standardization between trials will allow easier multi-study analysis. Standardization and quality improvement efforts go hand in hand with a maturing big data infrastructure providing collateral benefits to data curation for RCTs (24, 26).

The quality and power of observational studies will increase tremendously as use of BDARS increases. Addition of standard outcomes measurements and patient reported outcomes to clinical databases will widen the range for which observational studies are deemed high quality evidence. While BDARS-based observational studies will not eliminate need for RCTs, they can be anticipated to raise expectations for level of evidence thresholds required from RCTs and prompt more frequent validation studies.

Granting agencies may note dividends from BDARS-supporting standardizations and ETLs for lowering cost and improving RCT design. Funding for virtual design trials using Bayesian and machine learning methodologies will promote standardizations and growth of BDARS that will ultimately support and improve the quality of RCTs.

#### AUTHOR CONTRIBUTIONS

The authors have participated in discussion of concepts presented in the manuscript, writing, and/or review of the manuscript.

### FUNDING

Work was funded in part by a grant from Varian Medical Systems.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Mayo, Matuszak, Schipper, Jolly, Hayman and Ten Haken. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# mHealth and Application Technology Supporting Clinical Trials: Today's Limitations and Future Perspective of smartRCTs

#### *Marco M. E. Vogel1,2\*, Stephanie E. Combs1,2 and Kerstin A. Kessel1,2*

*1Department of Radiation Oncology, Technische Universität München (TUM), Munich, Germany, 2 Institute for Innovative Radiotherapy, Helmholtz Zentrum München, Neuherberg, Germany*

Nowadays, applications (apps) for smartphones and tablets have become indispensable, especially for young generations. The estimated number of mobile devices will exceed 2.16 billion in 2016. Over 2.2 million apps are available in the Google Play store®, and about 1.8 million apps are available in the Apple App Store®. Google and Apple distribute nearly 70,000 apps each in the category Health and Fitness, and about 33,000 and 46,000 each in medical apps. The willingness to use mHealth apps appears to be high, and the intention to share data for health research exists. This leads to one conclusion: the time for app-accompanied clinical trials (smartRCTs) has come. In this perspective article, we would like to point out the obstacles encountered when trying to implement apps in clinical research. Further, we try to offer a glimpse of what the future of smartRCT research may hold.

Keywords: clinical trials, app, smartRCT, eHealth, mHealth

#### INTRODUCTION

In the twenty-first century, digitalization in day-to-day life is ubiquitous, and besides conventional computers, laptops, and mobile phones, the use of smartphones is continuously increasing and extends far beyond writing messages or making phone calls. Applications (apps) for smartphones and tablets have become indispensable, especially for young generations; however, increasing use of apps in the middle-aged and elderly population is observed, thus arguing for a common use across generation borders (1). The estimated number of mobile devices will exceed 2.16 billion in 2016 (2). Over 2.2 million apps are available in the Google Play store®, and about 1.8 million apps are available in the Apple App Store®. Google and Apple distribute nearly 70,000 apps each in the category Health and Fitness, and about 33,000 and 46,000 each in medical apps (3, 4). The WHO defines these tools under the label "mHealth" or "eHealth" as "medical and public health practice supported by mobile devices, such as mobile phones, patient monitoring devices, personal digital assistants, and other wireless devices" (5). The willingness to use mHealth apps or devices seems high. In a recent study, Chen et al. (6) showed a high willingness (77%) to share data for health research, which leads to the natural conclusion: the time for app-accompanied clinical trials has come. In addition, we are living in the era of Big Data, where large amounts of data are generated and analysis strategies need to be found. Big Data is defined as "high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization" (7).

#### *Edited by:*

*Daniel Grant Petereit, Rapid City Regional Hospital, USA*

#### *Reviewed by:*

*John E. Mignano, Tufts University School of Medicine, USA Joseph Kamel Salama, Duke University, USA*

> *\*Correspondence: Marco M. E. Vogel marco.vogel@tum.de*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 05 January 2017 Accepted: 27 February 2017 Published: 13 March 2017*

#### *Citation:*

*Vogel MME, Combs SE and Kessel KA (2017) mHealth and Application Technology Supporting Clinical Trials: Today's Limitations and Future Perspective of smartRCTs. Front. Oncol. 7:37. doi: 10.3389/fonc.2017.00037*

Currently, the medical field shows an evolving trend in developing apps used as tools for behavior change therapy or lifestyle intervention, for example in diabetes, weight loss, or exercise performance (8–10). But to date, only a few researchers, like Volkova et al. (11), have used mobile apps as supporting tools in trials. In their research, they launched the app-based Food Label Trial investigating the impact of labels on consumer behavior. They coined the very apt term "smartRCTs" (app-accompanied randomized controlled trials).

In this perspective article, we would like to examine the rocky path to implementing apps in clinical research. Further, we try to offer a glimpse of what the future of smartRCT research may hold.

### LIMITATIONS AND BARRIERS OF smartRCTs

### Legal Limitations—Do Researchers Need a Law Degree?

When implementing apps in clinical research, one of the highest barriers to overcome is the legal limitations. The requirements differ in each country. We describe the situation in Germany, as it has one of the strictest data privacy regulations [ranked 7th in the Privacy and Human Rights Report 2007 (12)], and we are experienced with the legal situation. Clinical trials need to be approved by the ethics committee (preventing unethical practice) and the data protection officer (enforcing data privacy laws). Usually, the ethics commission supports smartRCTs if all the data privacy regulations are met (13). However, it is difficult to meet the requirements of the data privacy officer. Informed consent by patients must be obtained before entering the trial. This means that, besides the usual trial information, patients have to be educated about the app technology, secure data transfer (anonymous or pseudonymous), data storage, time of storage, and the possibility to delete data if they discontinue the study. When recruiting patients prospectively during treatment this is less problematic, but gaining patient consent by phone or letter is difficult and time consuming.

The type of data transfer needs to be chosen carefully. Anonymous data transfer (no patient-related data are transferred) will not conflict with data privacy regulations, whereas pseudonymous data transfer (patient data are tagged with a pseudonym) requires proper planning. Pseudonyms must be generated so that they are untraceable, and they must be stored securely. The best approach is to set up two different servers: server A stores the patients' data, while server B contains the pseudonym and identifier. The two servers are not connected; hence, only authorized trial personnel with access to both servers are able to retrieve sensitive data (14).
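The two-server separation can be sketched in a few lines. The example below uses two in-memory Python objects as stand-ins for servers A and B and random tokens as pseudonyms; the class names `PseudonymService` and `TrialDataStore`, the identifier format, and the stored fields are hypothetical, and a real deployment would add authentication, encryption, and persistent storage.

```python
import secrets

class PseudonymService:
    """Stand-in for server B: holds only the identifier-to-pseudonym table."""

    def __init__(self):
        self._by_identifier = {}

    def pseudonym_for(self, identifier):
        if identifier not in self._by_identifier:
            # Random token: it carries no information about the identifier,
            # so it cannot be traced back without access to this table.
            self._by_identifier[identifier] = secrets.token_hex(16)
        return self._by_identifier[identifier]

    def forget(self, identifier):
        """Remove the mapping and return the pseudonym it pointed to, if any."""
        return self._by_identifier.pop(identifier, None)

class TrialDataStore:
    """Stand-in for server A: holds trial data keyed by pseudonym only."""

    def __init__(self):
        self._records = {}

    def add(self, pseudonym, entry):
        self._records.setdefault(pseudonym, []).append(entry)

    def get(self, pseudonym):
        return list(self._records.get(pseudonym, []))

    def delete(self, pseudonym):
        self._records.pop(pseudonym, None)

server_b = PseudonymService()
server_a = TrialDataStore()

pseudonym = server_b.pseudonym_for("patient-001")
server_a.add(pseudonym, {"week": 1, "fatigue_grade": 2})
stored = server_a.get(pseudonym)

# A patient withdraws: drop the mapping first, then the data it pointed to.
withdrawn = server_b.forget("patient-001")
server_a.delete(withdrawn)
```

Because server A never sees the identifier and server B never sees the clinical data, re-identification requires access to both, and deleting a patient's data on request reduces to two well-defined operations.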

Further, it is important to be in full control of patient data, as the right to informational self-determination is fundamental. Data deletion at the patient's request needs to be possible. However, services from third-party providers often store data on remote servers outside the respective legal system. Complete deletion of data is then impossible and, therefore, such services do not comply with the law. Furthermore, secure data storage must be ensured for at least 10 years, which might lead to data overload (described below). It is important to stay in close contact with the ethics commission and the data privacy officer from the trial planning phase onward to meet all regulations and adapt the study protocol, if necessary. It would be beneficial if the German data privacy regulations and laws were adjusted to the new technology to reduce bureaucracy and allow for high-tech research.

Medical apps demand a different quality standard. It is crucial that apps in smartRCTs meet standardized criteria, as they are involved in patient care. In Germany, apps that are considered medical products need to comply with the Medizinproduktegesetz (Law for Medical Products) and thus must meet stricter criteria than common apps. Medical products need clinical testing and certification. Further, compliance with the Telemedizingesetz (Law for Telemedicine) needs to be ensured. The U.S. Food and Drug Administration published similar guidelines (15). Hence, it is vital to ensure the legal status of the app before starting the programming phase.

### Patients—An App a Day Keeps the Doctor Away?

Limitations of smartRCTs are also set by the patients themselves. In earlier publications, we investigated the attitude toward apps for therapeutic and scientific purposes. We saw a dependency on age and gender: men and participants <60 years are more likely to use an app (16). The reluctance of female patients needs to be further evaluated. Age appears to be a limiting factor in app use. In 2014, only 18% of elderly people (>65 years) owned smartphones or tablets, and 77% of the elderly would request help when learning to use this new technology (17). Consequently, this may lead to a lack of compliance when using apps in smartRCTs. In contrast, Smith et al. (18) showed a growing trend in smartphone use by people >65 years in 2015: already 27% in the U.S. owned smartphones, which means older citizens are increasingly adapting to the new technology. Either way, smartRCTs must be planned with the involved patient cohort in mind. Trials with elderly participants need careful selection of the mobile tools and possibly intensive pretrial participant teaching. After all, Denis et al. showed that patients using mHealth apps feel closer to their treating doctor due to better communication (19).

Another problem could be that patients do not always own an adequate mobile device. Although in industrialized countries like Germany (60%) or the United States (72%) a majority of citizens possess a mobile device (20), the variety of devices and operating systems is huge. Certainly, participation in smartRCTs can be made conditional on smartphone possession (maybe even with certain operating systems) by adding it to the inclusion criteria, but this leads to preselection and therefore to biased results. One approach is to hand out a mobile device with a specified operating system to the participants for the duration of the trial. This reduces the costs of developing apps for different operating systems and avoids this selection bias, but it requires funding to purchase devices. In summary, patients might be a limiting factor when launching a smartRCT, but proper preparation and careful planning can lead to exceptional trial compliance.

### Staff—An Important Cog in the Wheel?

As is the case with patients, the age of clinicians and principal investigators is an obstacle when implementing smartRCTs. In an earlier series, we investigated the attitude of health-care professionals toward app use by patients during treatment and aftercare. The idea of implementing apps was supported by 84.3%; 64.8% even preferred to be alerted if patients enter severe side effects that require action. The majority (93.5%) supported scientific evaluation of the collected data. The arguments named against app use were legal uncertainty regarding medical responsibility; the wish for solely personal contact between health-care professionals and patients; missing technical skills; and lack of time (14).

Legal uncertainty should be minimal within smartRCTs, as the study protocol describes precise treatment algorithms, which are audited by the ethics commission. Principal investigators and staff need to be technically trained. Without proper skills, they are able neither to work with the tools used nor to teach patients. Certainly, the fear of additional work exists. However, Denis et al. (19) showed that treating doctors needed less than 15 min per week for data analysis and phone calls for the entire cohort of 42 patients. With the appropriate development of automated data analyses, the staff's work time can be minimized. Time-consuming clinical visits might be reduced, as some are replaced by app-based follow-ups. The delegation of tasks to specifically trained medical support staff like study nurses could reduce high workloads for principal investigators.

As always when introducing new technologies, there are supporters and critics within the medical profession. However, a smartRCT can only be successful if all principal investigators and the participating staff are technically educated and ready to support the project. Otherwise, failure is unavoidable.

### Technical Realization—Let Us Start Programming in the Garage?

Once all other barriers are overcome, technical realization of smartRCTs is a minor obstacle. Programming from scratch needs qualified staff, which increases the costs of trials. Projects with apps also require medical computer scientists who not only provide technical skills but also understand how medical trials are conducted and have basic knowledge of the physicians' daily work and patients' needs. However, there are open source development kits offered by providers (e.g., Apple ResearchKit® and Google Study kit®) to compose an app based on the respective operating system. For instance, Bot et al. (21) used the Apple ResearchKit® in their "mPower" Parkinson trial.

As stated above, it is important to match the legal requirements for data transfer, security, and privacy when developing apps. Data transfer to a cloud or server needs to be encrypted to ensure the best possible protection of highly sensitive patient information. He et al. (22) showed that few Android mHealth apps match these criteria, although technical capabilities are already established: Thilakanathan et al. (23) developed a secure protocol for sharing patient data in clouds, and Silva et al. (24) presented the DE4MHA algorithm for secure encryption. Moreover, data storage needs to be encrypted to protect patient data from unauthorized access. Long-term storage on secure servers is needed in the interest of transparency and plausibility of trial data. This may lead to higher costs and data overload. Today, public health studies already need data storage capacity of 10 trillion bytes (10 TB) and more. This would equal tens of millions of floppy disks (25). Ultimately, this is a future challenge for data transfer, storage, and management systems, but not an insurmountable one.

## FUTURE PERSPECTIVE OF smartRCTs

### Apps Supporting Clinical Trials—Time Is Money

Despite all the barriers, smartRCTs hold numerous benefits in supporting trials. Khan et al. (26) observed the work time of three individual clinical trial managers and showed that tasks such as documentation (24%), administrative work (20%), and recruitment (16%) are time consuming. Moreover, activities such as filling out case report forms (12%), data entry (10%), and recruiting eligible patients (9%) are exhausting. The most commonly used tool was paper (24%). Data collection with apps can reduce the effort for all those tasks and thus trial duration. Neuer et al. (27) suggest a 30% decrease in duration by using electronic data capture. Therefore, data analyses can be completed faster, and results are implemented more quickly into clinical routine (28).

The whole process of documentation could be simplified by asking patients to document trial parameters, quality of life scores, or other information *via* an app. Consequently, a new dimension of information is added to the usual trial data: the patients' view. Going even a step further, mHealth devices such as activity trackers, blood pressure monitors, blood glucose meters, or personal scales can be connected with apps. Hence, the completeness and timeliness of the data are increased, because the course of the disease is monitored longitudinally and not only cross-sectionally, as is the case with classical periodical visits. Highly compliant patients could even enter blood test or imaging results obtained by other physicians (see **Figure 1**).

Needless to say, it is possible to develop apps to be used two sided, and physicians or study nurses could also enter data into the app. Standardized entry is guaranteed, and paperwork is diminished. Reduced paperwork protects the environment and leads to a more secure archiving of patient data. Subsequent changes of data are more difficult; hence, data transparency is improved. Errors during processing and digitizing data are prevented. Dependencies can be used to check entries for plausibility, and automated algorithms can verify inputs already before storing.
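A plausibility check of the kind mentioned above, run before an entry is stored, could look like the following sketch. The field names, the accepted ranges, and the function name `check_entry` are hypothetical and would have to be defined by the trial protocol.

```python
def check_entry(entry):
    """Return a list of problems found in one app entry; empty means it passes."""
    problems = []
    systolic = entry.get("systolic_mmhg")
    diastolic = entry.get("diastolic_mmhg")
    weight = entry.get("weight_kg")

    if systolic is None or diastolic is None:
        problems.append("blood pressure reading is incomplete")
    else:
        # Range checks on single fields (limits are illustrative only).
        if not 50 <= systolic <= 300:
            problems.append("systolic pressure outside 50-300 mmHg")
        if not 30 <= diastolic <= 200:
            problems.append("diastolic pressure outside 30-200 mmHg")
        # Dependency between fields: systolic must be the larger value.
        if systolic <= diastolic:
            problems.append("systolic pressure must exceed diastolic pressure")
    if weight is not None and not 20 <= weight <= 400:
        problems.append("weight outside 20-400 kg")
    return problems

ok = check_entry({"systolic_mmhg": 120, "diastolic_mmhg": 80, "weight_kg": 72.5})
swapped = check_entry({"systolic_mmhg": 80, "diastolic_mmhg": 120})
typo = check_entry({"systolic_mmhg": 120, "diastolic_mmhg": 80, "weight_kg": 725})
```

Returning the list of problems to the app lets the patient or study nurse correct the entry immediately, instead of a data manager discovering it months later.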

The time-consuming recruitment process can be simplified by using apps. It is difficult to find eligible patients for trials using the traditional form of recruitment. With mHealth tools, a wide range of people can be approached, and, therefore, it is easier, faster, and more cost-effective. Laws et al. (29) showed that online recruitment compared to a practitioner and face-to-face recruitment is the quickest and cheapest form with average costs of AUD\$ 14 per participant.

With all named benefits, a higher cost-effectiveness can be achieved. Sertkaya et al. (30) applied a media data study and showed total costs of \$78.6 million for phase 1–4 trials investigating drugs in oncology. Administrative staff costs (phase 3: 20.40%), physician costs (phase 3: 7.08%), source data verification costs (phase 3: 3.52%), patient recruitment costs (phase 3: 2.71%), and data management costs (phase 3: 0.34%) (30) can be reduced by using app technology where appropriate. The prerequisite is that apps operate as safely as humans and cause no more patient discomfort. Sertkaya et al. suggest a total cost reduction of 7.91% (=\$6.2 million) if mobile technology is used. Eisenstein et al. (31) calculated a reduction of 9.8% by using electronic data capture.

A huge problem within trials is the inevitable subjectivity of research personnel. Various studies by Hróbjartsson et al. (32–34) showed that bias occurs, especially if blinding is not feasible. App-based data collection is objective as well as standardized; for example, pre-validated questionnaires can be used. Furthermore, smartRCTs simplify multicenter trials as data transfer is easier and all investigators are closely connected. It is possible to centralize the data storage and prevent data loss (28). Moreover, app-based study procedures are identical and objective across all centers.

The advantages mentioned can be of great scientific value in radiation oncology, as it is a highly technical discipline. In contrast to classical clinical visits, apps can be used for trial documentation of side effects caused by radiotherapy or concomitant radiochemotherapy. Trial compliance is enhanced as apps can be used to remind patients of radiation dates or drug intake. Continuous aftercare over years plays an important role in radiation oncology. App-based research could enhance documentation and therefore simplify trials on long-term toxicity. The implementation of smartRCTs would be of great benefit as it marks the departure into the new era of radiation oncology 4.0 (35).

### Apps and Big Data—I Have a Dream …

smartRCTs and app-based research generate a huge amount of data—Big Data. Only a fraction of collected trial data are used and published. The data could, however, be utilized to perform sub-studies or can be merged to acquire new insights for clinical day-to-day life. Epidemiological researchers could identify patterns, causes, and effects of diseases without high costs and workload. In return, it is important to improve the level of technical skills within epidemiological studies (36).

Moreover, Big Data can be used to perform clinical trials *in silico*, which means computer simulations are run instead of classical studies (37). Less animal testing and patient recruitment in pharmaceutical or other trials would be needed (38). Research in the area of very rare conditions would benefit in particular. Instead of taking the hard road of recruiting a huge number of patients to gather valid results, computer simulation can accompany trials in rare tumors or diseases. Furthermore, Big Data shows potential in the evolving field of genomics research. A variety of recently published studies used electronic health data to show relationships between genetic variations and clinical conditions (39–41). Bowton et al. (42) showed this approach to be cost-effective and quick. The Omics movement led to further research disciplines: pharmacogenomics investigates effects of genetics on an individual's drug response. Omics can be used to predict disease probability, a great tool for preventive medicine (43). Hence, Big Data and Omics will be a major step toward personalized and precise medicine.

### CONCLUSION

smartRCTs and app-based studies are the future of medical research, and of radiation oncology in particular. While there are certain barriers, especially the data privacy laws, the advantages outweigh the limitations. It would be desirable if politicians and lawmakers established better opportunities and adjusted the regulations to the new technology. This is possible without undermining the right to informational self-determination and data privacy. Further, all parties involved (data privacy officers, ethics commission, patients, and researchers) need to support a planned smartRCT. Necessarily, apps for research need to meet certain criteria concerning patient safety and good clinical practice; therefore, generally accepted standards for trial apps need to be established. If so, apps can reduce trial costs, study duration, and subjectivity bias as well as collect a wider range of data. One thing is clear: smartRCT is not a question of whether or not, but of when and how.

### AUTHOR CONTRIBUTIONS

MV wrote the manuscript. SC advised and edited this manuscript. KK advised and edited the manuscript and proposed the initial concept. All authors approved the manuscript.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Vogel, Combs and Kessel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Which Obstacles Prevent Us from Recruiting into Clinical Trials: A Survey about the Environment for Clinical Studies at a German University Hospital in a Comprehensive Cancer Center

#### *Christoph Straube1,2,3\*, Peter Herschbach2,3,4 and Stephanie E. Combs1,2,3,5*

*1Department of Radiation Oncology, Klinikum rechts der Isar, Technische Universität München (TUM), Munich, Germany, 2Deutsches Konsortium für Translationale Krebsforschung (DKTK) – Partner Site München, Munich, Germany, 3Roman Herzog Comprehensive Cancer Center (RHCCC), Munich, Germany, 4Klinik für psychosomatische Medizin und Psychotherapie, Klinikum rechts der Isar, Technische Universität München (TUM), Munich, Germany, 5Department of Radiation Sciences (DRS), Institute for Innovative Radiotherapy (iRT), Helmholtz Zentrum München, Oberschleißheim, Germany*

#### *Edited by:*

*Issam El Naqa, University of Michigan, United States*

#### *Reviewed by:*

*Charles B. Simone, University of Maryland Medical Center, United States Valdir Carlos Colussi, University Hospitals Seidman Case Medical Center, United States*

> *\*Correspondence: Christoph Straube christoph.straube@tum.de*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 30 May 2017 Accepted: 07 August 2017 Published: 28 August 2017*

#### *Citation:*

*Straube C, Herschbach P and Combs SE (2017) Which Obstacles Prevent Us from Recruiting into Clinical Trials: A Survey about the Environment for Clinical Studies at a German University Hospital in a Comprehensive Cancer Center. Front. Oncol. 7:181. doi: 10.3389/fonc.2017.00181*

Background: Prospective clinical studies are the most important tool in modern medicine. The standard of good clinical practice in clinical trials has constantly improved, leading to more sophisticated protocols. Moreover, translational questions are increasingly addressed in clinical trials. Such trials must follow elaborate rules and regulations. This is accompanied by a significant increase in documentation requirements, which demand substantial manpower. Furthermore, university-based clinical centers are interested in increasing the number of patients treated within clinical trials, and this number has evolved into a key quality criterion. The present study was initiated to elucidate the obstacles that limit clinical scientists in screening and recruiting for clinical trials.

Methods: A specific questionnaire with 28 questions was developed, focusing on all aspects of clinical trial design as well as trial management. This included questions on organizational issues, medical topics, and potential patients' preferences and physicians' goals. The questionnaire was set up to collect data anonymously on a web-based platform. The survey was conducted within the Klinikum rechts der Isar, Faculty of Medicine, Technical University of Munich; clinical scientists of all levels (department chairs, attending physicians, residents, as well as study nurses and other study-related staff) were addressed. The answers were analyzed using the Survio analyzing tool (http://www.survio.com/de/).

Results: We collected 42 complete sets of answers; in total, 28 physicians, 11 study nurses, and 3 persons with positions in administration answered our survey. The study centers reported participating in 3–160 clinical trials, with a recruitment rate of 1–80%. The main obstacles were identified: 31/42 (74%) complained about limited human resources, and 22/42 (52%) also reported a lack of technical resources. 30/42 (71%) agreed with the statement that the documentation effort of clinical trials is too large. An increase of the patients' study participation rate up to over 20% was deemed possible if the described limitations could be overcome.

Discussion: The increasing documentation effort in clinical trials has led to a strong increase in the workload of scientific personnel. The recruitment of patients into clinical trials is therefore limited not only by patient issues but also by the infrastructure of the centers. The lack of study nurses in particular is likely to be a major limitation. Furthermore, technical resources for time-efficient and safe documentation within clinical routine as well as in clinical trials are required. By optimizing these factors, a significant increase in the number of patients treated in clinical trials seems possible.

Keywords: barriers to participation, clinical trials as topic, survey, physicians, management

#### INTRODUCTION

Prospective clinical trials, especially randomized controlled trials, are accepted as the most important source of evidence in most subdisciplines of modern medicine; data from clinical trials constantly shape our guidelines for clinical practice and thus contribute essentially to up-to-date patient care (1, 2). However, gaining this evidence is often hampered by low patient accrual rates, which can subsequently lead to the failure of important trials because they do not reach the preplanned sample sizes in adequate time intervals (3). Many studies have been conducted to investigate the role of patients in this recruitment dilemma. It has been shown that, besides other factors, patients' concerns about the possibility of being randomized to a placebo treatment or to the standard treatment arm can lead to hesitation to give informed consent to participate in a trial (4). The consent process itself has also been identified as a barrier for patients as well as for clinicians to participate in clinical trials (5).

Focusing only on the unwillingness of patients to participate in clinical trials, however, might oversimplify the problem. Trial recruitment depends on several factors, which can also be organizational, financial, or otherwise related to trial management. Therefore, focusing mainly on the patients themselves may underestimate the difficulties. Surprisingly, these factors have only been studied marginally so far, although time shortages and a lack of study staff have been blamed as barriers for clinical trials (5, 6).

In order to optimize the recruitment of patients into clinical trials in our hospital, we conducted informative interviews, developed and conducted a quantitative survey, and discussed the results at an open conference at our center. We hypothesized that recruitment into clinical trials is also limited by infrastructural shortcomings. This was confirmed by the responses to our questionnaire. The quantitative results of this survey gave us strong arguments to free up financial resources and to push for changes in the administrative framework. We present the results of our survey here, as similar obstacles might be present in other clinical trial centers, too.

#### MATERIALS AND METHODS

#### Questionnaire Development

After a series of informative interviews with department chairs and leading senior professionals from 17 departments of the University Hospital Klinikum rechts der Isar, Technical University of Munich (TUM), Germany, a specific questionnaire was developed, focusing on all aspects of clinical trial design as well as trial management. The questionnaire included 28 questions. We included demographic questions, namely two multiple choice questions about the position in the clinic and the predominant medical field (i.e., medical vs. surgical treatments), two dichotomous questions about the sex of the participant and whether the participant belongs to the TUM, and seven open-ended questions asking for the age of the participant, the years of experience in clinical trials, the number of clinical trials and study nurses within the clinical center, and the number of patients treated within different subtypes of clinical trials. Subsequently, 16 rating-scale questions asked whether patient-related complaints, trial factors, structural aspects and complaints, or administrative obstacles had an influence on the rate of patients recruited to clinical trials (**Table 1**). Lastly, an open-ended question asked to what extent the number of patients within clinical trials could be increased if the obstacles stated in the rating part were eliminated. Additionally, the responders could leave comments at the end of the survey.

Furthermore, questions on the medical background and the organizational structure of the participants' centers were included. The questionnaire was piloted within a small cohort of physicians. Participants of the pilot run were instructed not to participate in the final run of the survey; however, as we performed an anonymous web-based survey, it cannot be ruled out that some participants from the pilot run also answered the final questionnaire.

The questionnaire was set up to collect data anonymously on a web-based platform (the entire questionnaire in German can be found in Table S1 in Supplementary Material). The survey was conducted within the Klinikum rechts der Isar, Faculty of Medicine, TUM; clinical scientists of all levels (department chairs, attending physicians, residents, as well as study nurses and other study-related staff) were addressed. The survey was performed within the quality assurance program of the Comprehensive Cancer Center of our hospital and was therefore in line with the institutional guidelines of the local ethics committee. The answers were analyzed using the web-based Survio analyzing tool (http://www.survio.com/de/) and Microsoft© Excel 2016. The frequencies of the answers given by physicians and study nurses were compared with each other using the χ<sup>2</sup>-test function of SPSS v. 18 (IBM).

#### RESULTS

The web-based survey counted 120 visits, resulting in 44 completed questionnaires (37%). Twenty-nine physicians, most of them senior physicians (20), 12 study nurses, and 3 persons with administrative areas of responsibility completed the questionnaire. Eighteen physicians were from non-surgical medical specialties, eight physicians had a surgical background, and three physicians had a surgical as well as a medical background. All physicians reported long-term experience in conducting prospective medical trials (median 10 years, range 4–27 years).

Study nurses were employed in surgical disciplines in one case (8%), in medical disciplines (e.g., internal medicine or radiation oncology) in seven cases (58%), and in disciplines with both medical and surgical treatments in four cases (33%). Their level of experience was comparable with that of the physicians (median 10 years, range 1–15 years). Given the composition of this cohort, the sample cannot be considered representative, as department chairs and residents in particular are underrepresented.

The participants also answered questions about their study centers. Overall, the centers reported 0–160 active trials (median 15 trials). Compared with the large number of trials, centers employed only a relatively small number of study nurses (median 3, range 0–6), indicating that this might limit the maximum number of patients recruited into clinical trials; on average, every study nurse cared for 10.5 clinical trials. Thirty-two of 44 persons (73%; 7 totally agreed, 25 mostly agreed; **Table 1**) agreed with the statement that the documentation effort of clinical trials is too large (10 of 12 study nurses, 20 of 29 physicians). Focusing on the clinical routine, 14 physicians (48%) affirmed the statement that a large burden of documentation is an important obstacle in recruiting patients to clinical trials. Deficits in information technology resources were described by 52% of the responders.

Consistent with these findings, limited human resources were reported by 31 persons (70%; 19 totally agreed, 12 mostly agreed). Additionally, eight responders highlighted this topic in the free-text answers. Limited resources in information technology were also reported by the participants, although by a smaller number (23 of 44 answers, 52%).
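The percentages quoted in this section follow directly from the reported counts. As a quick plausibility check, the following minimal sketch recomputes them in plain Python. The dictionary labels and helper names are illustrative assumptions and not part of the original analysis; only counts stated in the Results text are used.

```python
# Counts as stated in the Results text: (numerator, denominator, reported percent).
# The labels are illustrative shorthand, not the wording of the questionnaire.
reported = {
    "completed questionnaires per visit": (44, 120, 37),
    "documentation effort too large": (32, 44, 73),
    "physicians: documentation burden is an obstacle": (14, 29, 48),
    "limited human resources": (31, 44, 70),
    "limited IT resources": (23, 44, 52),
}


def percent(numerator, denominator):
    """Return the share as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


def check_reported(table):
    """Map each label to True when the recomputed percentage equals the reported one."""
    return {label: percent(n, d) == p for label, (n, d, p) in table.items()}


if __name__ == "__main__":
    for label, consistent in check_reported(reported).items():
        print(f"{label}: {'consistent' if consistent else 'needs checking'}")
```

A check of this kind only confirms internal consistency of the rounded figures; it says nothing about how the individual answers were classified.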

Trial-related factors as well as patient-related factors were deemed to have less influence on the recruitment rates; only two physicians (7%) answered that there are not enough trials available for patients treated at their department. Refusal of patients to participate in prospective clinical trials also seems to be a minor issue in our center (one physician mostly agreed with that statement).

*The results represent the complete or predominant agreement to the given statements. The frequency of the answers from physicians and study nurses were compared with the* χ*<sup>2</sup> test.*

Altogether, obstacles in infrastructure as well as limited human resources were deemed to limit the number of recruited patients by all but six participants. Conversely, in 39 cases (89%), the participants expected to be able to substantially increase the recruitment of patients to clinical trials if the obstacles were removed.

We grouped the answers of the responders according to their job type (physicians vs. study nurses) and compared the frequency of the answers. There were no significant differences in the frequency of affirmations to questions asking about patient- or organization-related factors. However, physicians agreed significantly more often with the statements "Important trials could not be established at our center" (*p* = 0.039) and "Concerning the conduction of clinical trials, I am discouraged by legal regulations […]" (*p* = 0.019).
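To illustrate the kind of group comparison described here, the sketch below applies a Pearson χ² test to a 2 × 2 table built from the documentation-effort counts given in the Results (20 of 29 physicians and 10 of 12 study nurses agreed). This is a hedged illustration only: the original analysis was run in SPSS, the exact test variant is not stated, and we assume a test without continuity correction. The function name is hypothetical.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With one degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: physicians, study nurses. Columns: agreed, did not agree.
# Counts are taken from the Results text (20 of 29 and 10 of 12).
statistic, p_value = chi_squared_2x2(20, 9, 10, 2)

if __name__ == "__main__":
    print(f"chi-squared = {statistic:.3f}, p = {p_value:.3f}")
```

Under these assumptions the resulting p-value lies well above 0.05, which is in line with the statement that organization-related items did not differ significantly between the groups. With expected cell counts this small, an exact test would arguably be the more cautious choice.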

### DISCUSSION

In the present analysis, we sought to determine major obstacles for prospective clinical trials. Using a detailed questionnaire, we identified documentation tasks as a key factor, as well as the difficulty of recruiting well-trained staff. Limited personnel resources may be a main obstacle to the effective conduct of clinical trials. Furthermore, the majority of participants expected more efficient patient recruitment if these obstacles were eased.

Clinical trials are the most important sources of valuable evidence in modern medicine. Unfortunately, a large proportion of trials undergo early closure due to poor accrual. While patient-related obstacles for the recruitment into clinical trials have already been studied, literature on institutional obstacles for trial recruitment is scarce. Rising quality requirements and increasing legal regulations are reasons for the growing documentation effort in clinical trials, which increases the workload of the scientific personnel in university hospitals (6–8). Apart from the lack of evidence that this continuous increase in regulatory effort really improves the quality of scientific results, the conduct of clinical trials is becoming increasingly complex. Consequently, the vast majority of our participants agreed with the statement that the documentation effort of clinical trials is too large. As many responders complained about large documentation efforts within the routine treatment of patients and about deficits in the information technology infrastructure, one could summarize that infrastructural shortcomings currently limit the time of physicians and study nurses to an extent that precludes the recruitment of more patients. While this already seems to be a significant problem at a German university hospital, centers within the developing world suffer even more from this development (6, 7). Therefore, the discussion of this topic should be continued, although the recently updated GCP guidelines do allow "more efficient approaches to clinical trial design […], recording [… and] reporting" (9). It is the responsibility of future sponsors and investigators to optimize their protocols so as to reduce the requested information to the necessary minimum.

Current trials, however, still suffer from the large burden of necessary documentation, which can only be handled with the help of an adequate number of support personnel (5). This was confirmed by one key finding from our survey: a shortage of human resources is one of the most important obstacles to increasing the rate of patients recruited to clinical trials. This was also reported in a survey by Kaanoi et al., who found that 22 of 27 oncologists in Hawaii did not have enough support staff to recruit more patients to clinical trials; they explained that the low number of study nurses limited the number of patients in clinical trials for whom all quality requirements could be fulfilled (10). Furthermore, a systematic review by Fisher and colleagues concluded that a lack of time for the screening, treatment, and follow-up of clinical trial patients is a major barrier for oncologists to participate in clinical trials (5). Notably, the increasing documentation duties within the clinical routine raise this barrier further, a finding that was already described in the early 1990s in the United Kingdom (11). As investigator-initiated trials in particular are often underfinanced, a political debate on the necessity of supportive scientific personnel in clinical centers is needed. Furthermore, the additional personnel effort should be taken into consideration when contracts for company-initiated trials are made. Agreed lists of the costs of study personnel as well as medical measures, as are already common for pharmacists in dispensaries in Germany, could become an important tool for the planning, contracting, and conduct of clinical trials.

The results of our survey cannot be generalized to the level of individuals at our center, as department chairs and residents in particular are underrepresented. However, the survey was answered mostly by experienced physicians involved in the conception of clinical trials as well as in the management of the scientific centers. The second largest group consisted of the supportive scientific staff. These two large groups are likely to give valuable information about their centers, since they are the two main groups in charge of day-to-day issues in clinical trial management. Therefore, the obstacles reported by the participants are likely to represent the most important institutional barriers for patient recruitment to clinical trials at our center. Whether the results of our survey can be generalized to other centers can hardly be answered, as key values needed for a comparison, i.e., the number of study nurses or the number of active trials, are not available for other centers. However, since university hospitals, at least within Germany, are characterized by similar organizational structures, most arguments most likely hold true for other sites. Further investigations into an ideal balance between the number and complexity of clinical trials at one center, the number of study nurses, and the number of patients within clinical trials are therefore highly recommended.

Until then, a continuous review of the study process on all levels of a scientific clinical center seems to be a sufficient tool to identify barriers to the conduct of clinical trials. Importantly, patient-related as well as structural factors need to be analyzed to improve the process of clinical trials. Results of quantitative surveys can help to rank administrative hurdles by importance and can serve as arguments for freeing up financial or personnel resources for their solution. Coming back to our experiences, the results from the interviews as well as from the survey allowed us to build an interdisciplinary consensus about the most important issues, and some of the most important issues have already been solved. Additionally, clinical scientists should take the personnel limitations of clinical centers into account when new protocols for clinical trials are generated. A lower burden of documentation, achieved in part by focusing on the main objectives of the trial, can help to increase the efficiency of the trial centers, which can subsequently handle more patients within clinical trials.


### AUTHOR CONTRIBUTIONS

CS drafted and designed the survey, approved the survey, analyzed the results, and wrote the manuscript. PH and SC designed the survey, approved the survey, critically discussed the results, gave important intellectual input, and wrote the manuscript. All authors approved the final version of the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/fonc.2017.00181/full#supplementary-material.


**Conflict of Interest Statement:** All authors declare to have neither financial, commercial nor any author conflict of interest that could affect the results or the discussion of the content of the manuscript.

*Copyright © 2017 Straube, Herschbach and Combs. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Use of Multicenter Data in a Large Cancer Registry for Evaluation of Outcome and Implementation of Novel Concepts

*Gabriele Schubert-Fritschle1\*, Stephanie E. Combs2,3,4,5, Thomas Kirchner2,5,6, Volkmar Nüssler2 and Jutta Engel1,2*

*1Munich Cancer Registry (MCR) of the Munich Tumour Centre (TZM), Institute for Medical Information Processing, Biometry and Epidemiology (IBE), University Hospital of Munich, Ludwig-Maximilians-University (LMU), Munich, Germany, 2Munich Tumour Centre (TZM), Medical Faculties, Ludwig-Maximilians-University (LMU) and the Technical University of Munich (TUM), Munich, Germany, 3Department of Radiation Oncology, Technische Universität Munich (TUM), Klinikum rechts der Isar, Munich, Germany, 4Department of Radiation Sciences (DRS), Institute for Innovative Radiotherapy (iRT), Helmholtz Zentrum Munich, Oberschleißheim, Germany, 5Deutsches Konsortium für Translationale Krebsforschung (DKTK), Partner Site Munich, Munich, Germany, 6Institute for Pathology, Ludwig-Maximilians-University (LMU), Munich, Germany*

#### *Edited by:*

*Christopher Schultz, Medical College of Wisconsin, United States*

#### *Reviewed by:*

*Geraldine Vink, University Medical Center Utrecht, Netherlands Vivek Verma, University of Nebraska Medical Center, United States*

#### *\*Correspondence:*

*Gabriele Schubert-Fritschle gabriele.schubert-fritschle@med.uni-muenchen.de*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 16 May 2017 Accepted: 11 September 2017 Published: 29 September 2017*

#### *Citation:*

*Schubert-Fritschle G, Combs SE, Kirchner T, Nüssler V and Engel J (2017) Use of Multicenter Data in a Large Cancer Registry for Evaluation of Outcome and Implementation of Novel Concepts. Front. Oncol. 7:234. doi: 10.3389/fonc.2017.00234*

Large clinical cancer registries (CCRs) in Germany are to be strengthened by the German Social Code Book V (SGB V) and implemented by the end of 2017. There are currently several large cancer registries that provide clinical data for outcome analysis and knowledge acquisition. The various examples from the Munich Cancer Registry outlined in this paper illustrate the many possibilities for using and analyzing registry data. The main objective of population-based cancer registration within a defined area and of outcomes research is to provide feedback regarding the results to the broad public, the reporting doctors, and the scientific community. These tasks determine the principles of operation and data usage of CCRs. Each clinical department delivers its own findings and applied therapies. The compilation of these data in CCRs provides information on patient progress through the regional network of medical care and delivers meaningful information on the course of oncological diseases. Successful implementation of CCRs allows for presenting the statistical outcomes of health-care delivery, improving the quality of care within the region, accelerating the process of implementing innovative therapies, and generating new hypotheses as a stimulus for research activities.

Keywords: cancer incidence, cancer mortality, survival, trends, data analysis, quality assurance, comparative effectiveness research

### REGIONAL CCRs—INSTRUMENTS FOR CLINICAL AND EPIDEMIOLOGICAL RESEARCH

According to the 1995 German law regulating cancer registration (Cancer registry law, Krebsregistergesetz, KRG 1995), German states were required to establish cancer registries by January 1999. All German states complied with this regulation and established comprehensive epidemiological cancer registration. Over time, there has been increasing precision in the estimation of cancer incidence and mortality by the German Society for Epidemiological Cancer Registries in Germany (GEKID) (1) and the Centre for Cancer Registry Data (ZFKD) at the Robert Koch-Institute (RKI) (2). In addition, data from nine German regions are currently published in the WHO publication Cancer Incidence in Five Continents, Vol. X (3), and 23% of EUROCARE-5 (4) data originate from Germany.

Population-based cancer registries are important instruments for epidemiological reference. Epidemiology involves the analysis of health and illness or, more generally, the dynamics, causes, and consequences of the health status of a defined population (5). Cancer will affect more than 40% of all people globally. In Germany alone, approximately 477,000 persons per year are diagnosed with cancer, and 221,000 persons die each year from cancer (6).

The few indices, however, that may be estimated by epidemiological cancer registries are not sufficient to describe the complex structures, care, and outcomes of cancer diseases. Therefore, it is necessary to obtain evidence on cancer subtypes, various outcomes of cancer care, clinical experiences, and knowledge origination, all of which require population-based clinical data collected by way of CCRs and then analyzed and published by the registries in cooperation with their clinical partners.

Therefore, a national law, the Krebsfrüherkennungs- und registergesetz (KFRG), which created the basic conditions for population-based CCRs in all regions of Germany (SGB V §65c), was enacted in April 2013. The KFRG requires that cancer data be recorded in CCRs countrywide, but in small regions (a federal state or parts of it), in accordance with standardized definitions [see the common German oncological basic dataset and its supplementary organ-specific modules edited by ADT (Arbeitsgemeinschaft Deutscher Tumorzentren, working group of German cancer centers) and GEKID]. Data acquisition must be complete for cases with defined ICD-10 codes and for the items of the ADT oncological data sets within a defined region. Each federal state has to legislate specific details (e.g., data protection) on its own.

Clinical cancer registries may provide useful data on cancer diseases and cancer care if all doctors and hospitals within a defined region and within defined fields, such as surgery, pathology, radiotherapy, and systemic oncology, prepare and submit all independent and cross-sectoral findings and therapies from the primary diagnosis through the course of the disease. Data are checked, coded, and compiled in CCRs, and data management is completed using follow-up information, including date and cause of death. Based on this structure, feedback may be realized in the manner postulated by law.

#### THE MUNICH CANCER REGISTRY (MCR)—ORGANIZATION AND STRUCTURE OF A MULTICENTER DATA POOL

The MCR is the population-based clinical cancer registry of Upper Bavaria and one region of Lower Bavaria (Southern Germany) (7). Since 1978, the registry's catchment area has been enlarged twice. In 2002, it was increased to 2.3 million inhabitants, and in 2007, it was increased to 4.5 million inhabitants. It currently includes more than 4.8 million inhabitants (**Figure 1**).

Pathology reports of solid tumors from all pathology laboratories in this catchment area are available. From these reports, the total number of cancer patients in the region is systematically assessed and the main prognostic factors are ascertained. In parallel, clinicians complete standardized forms concerning patients' domicile, age, primary disease characteristics such as TNM stage, histology, and grade, as well as therapies, or submit these data online to the MCR.

The life status of patients diagnosed with cancer is maintained by clinicians and is systematically updated by the MCR through death certificates. **Figure 2** displays the interdisciplinary and cross-sectoral documentation of the course of the cancer disease.

All data and clinical findings during the course of the disease (e.g., local or regional recurrence, metastases, and death) are coded according to the guidelines of the International Agency for Research on Cancer. Tumors are classified in compliance with the staging criteria of the TNM Classification of Malignant Tumors (8). There are approximately 350 departments in approximately 70 hospitals in the cooperating network of the MCR. Currently, more than 25,000 new cases are registered each year. Within the smaller catchment area, there were approximately 11,000 new cases reported annually in 1998, which increased to approximately 25,000 annually by 2015. Accordingly, there were more than 360,000 cancer cases registered in the MCR from 1998 to 2015 (**Table 1**).

Table 1 | Malignancies by year of diagnosis from 1998 to 2015 defined by KFRG.

*Without DCO-cases (portion from sum of columns total, DCO and children).*

*Without children* <*18 years (portion from sum of columns total* + *children).*

*Without non-melanotic skin cancer (C44, D04), secondary malignancies (C77-C79).*


All patients are followed actively and prospectively to make the database as complete as possible. Over time, approximately 50% of the patients have died. The course of the disease in the other 50% is continuously updated with information regarding disease progression and life status. Furthermore, the percentage of death certificate-only cases decreased from 10 to 7% from 1998 to 2015.

The diagnoses of cancers with high incidence rates, such as breast or colorectal cancer, range from 3,500 to 4,000 per year. Although the catchment area is large, only a small number of patients with rare cancers, such as vulvar cancer or Hodgkin's lymphoma, is expected. This quantity structure must be considered in the data analysis and study design. Even given the cooperation of several institutions and the aggregation of time periods, there remains an insufficient number of patients with rare cancers to conduct reliable survival analyses or to apply multivariate statistical methods.

### FEEDBACK REGARDING RESULTS

One of the elementary tasks of CCRs is to provide information regarding patients' cancers to the cooperating partners, doctors, hospitals, public, and most especially, to the patients and their relatives. The MCR developed four levels of data presentation, all of which can be accessed *via* the Internet (**Figure 3**).

Level 1 provides open access information on age distribution, incidence, mortality, and survival, ordered by ICD-10 (C- and D-diagnoses), for all interested persons (9). Access to level 2 information is restricted by password to cooperators and authorized persons. Level 2 provides special analyses of the whole catchment area and is limited to the main ICD-10 cancer diagnoses. The statistics and results presented in level 3 are restricted to the cohort of patients of a single hospital and may be accessed using a unique hospital password. Levels 1 to 3 present aggregated statistics with a varying degree of detail. Finally, doctors with special personal identification are allowed online access to the MCR database in level 4. Basic queries may be performed on single patients or patient groups with defined characteristics, and it is possible to perform online documentation of cancer patients.
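The four levels can be read as a simple access model. The sketch below is a hypothetical illustration of that model and not the MCR's actual implementation: the class names, field names, and hospital identifiers are assumptions made for the example, and each level is checked independently because the text does not state whether higher levels include lower ones.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """The four levels of data presentation described in the text."""
    PUBLIC = 1        # open access statistics by ICD-10
    NETWORK = 2       # password for cooperators and authorized persons
    HOSPITAL = 3      # statistics on a single hospital's own cohort
    PATIENT_DATA = 4  # online database access with personal identification


@dataclass(frozen=True)
class User:
    """Hypothetical credential holder; all field names are illustrative."""
    network_password: bool = False
    hospital_id: Optional[str] = None
    personal_id: bool = False


def can_access(user: User, level: Level, hospital_id: Optional[str] = None) -> bool:
    """Return True if the user may view the requested level.

    Each level is evaluated on its own credential (an assumption of this sketch).
    """
    if level == Level.PUBLIC:
        return True
    if level == Level.NETWORK:
        return user.network_password
    if level == Level.HOSPITAL:
        # A hospital password only unlocks that hospital's own cohort.
        return user.hospital_id is not None and user.hospital_id == hospital_id
    if level == Level.PATIENT_DATA:
        return user.personal_id
    return False
```

For example, `can_access(User(hospital_id="hospital-A"), Level.HOSPITAL, hospital_id="hospital-B")` evaluates to `False` in this sketch, reflecting that level 3 results are restricted to the hospital's own patients.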

### USE OF MULTICENTER DATA TO EVALUATE OUTCOMES AND IMPLEMENT NOVEL CONCEPTS

Cancer registry data may be used in various clinical and scientific applications. Clinicians must have access to case histories for daily patient care or for evaluation of the course of disease for single patients. A precondition for clinical QA measures is to ensure data correction and completion of documents that gather information such as disease parameters and types of therapies. Certain questions are intended for patient cohorts with comparable diagnostics or treatments. The demands on an individual level are described in the upper portion of **Figure 4**.

Descriptive and analytical statistics are essential for certain scientific applications. For example, the analyses of hospital variations are useful for benchmarking, providing information feedback required for QA and reproducing published results using epidemiological data from the CCR for CER. These and other examples are outlined in the lower portion of **Figure 4**. The duties and responsibilities of CCRs can be defined through comparisons of hospitals, analyses of certification audits, comparisons to cancer-specific guidelines, assessments of regional and time trends, and individual benchmark results of clinical study outcomes.

Caution should be taken when interpreting the results, as data from single cooperating hospitals have various types of bias and lack representation. Thus, for proper comparisons, CCRs should provide statistics that include averages of epidemiological results from general clinical data and data on the course of the disease that are aggregated by different characteristics. In this way, comparisons can be made and single hospital results can be appropriately interpreted.

### OUTCOME EVALUATION

Additional clinical data are required for QA and CER in oncology to increase the explanatory power attained by representing a defined population. With relevant positive and negative deviations found through multivariate data analyses, the following examples illustrate the various uses of CCR data to evaluate outcomes.

#### Estimation of Prognosis

A cancer prognosis is important not only for evaluating outcomes but also for the patients' information. The main criterion of prognosis is survival, which is calculated as overall survival (OS), which includes all deceased individuals, and relative survival (RS). RS is the ratio of the observed survival rate to the expected survival rate. RS may be interpreted as cancer survival after correcting for other causes of death; therefore, it is used to estimate cancer-specific survival. The expected survival time of age-matched individuals is calculated according to the Ederer II method using life tables of the German population (10).
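As a minimal sketch of the ratio defined above, the following Python snippet computes RS from an observed and an expected survival proportion. The function name and the numerical inputs are hypothetical and chosen for illustration only; the expected survival is taken as given here rather than derived from life tables with the Ederer II method.

```python
def relative_survival(observed: float, expected: float) -> float:
    """Return relative survival as observed / expected survival proportion."""
    if not 0.0 < expected <= 1.0:
        raise ValueError("expected survival must lie in (0, 1]")
    if not 0.0 <= observed <= 1.0:
        raise ValueError("observed survival must lie in [0, 1]")
    return observed / expected


# Hypothetical 5-year survival proportions (illustrative, not registry data).
rs_below = relative_survival(observed=0.68, expected=0.85)
rs_above = relative_survival(observed=0.87, expected=0.85)

print(f"Cohort surviving worse than the reference population: RS = {rs_below:.1%}")
print(f"Cohort surviving better than the reference population: RS = {rs_above:.1%}")
```

An RS above 100% simply means that the observed survival of the cohort exceeds the survival expected for an age-matched general population.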

The prognosis is good for people with prostate cancer, a disease of older men with a median age of 69 years, especially in the T1 and T2 categories. **Figure 5** shows OS (**Figure 5A**) and RS (**Figure 5B**) as an estimator of cancer-specific survival. The RS rate for patients with T2 tumors exceeds 100% during the first 7 years after diagnosis, that is, their survival is better than the mean survival of the German male population. The relative 5-year survival rates are 101.1 and 95.1% for the T2 and T1 categories, respectively, primarily due to incidental carcinomas. **Figure 5** thus shows the effect of calculating RS, which accounts for the mean life expectancy of the German male population.

The morphological verification of tumors delivers important information for treatment planning and prognosis estimation. **Figure 6** presents the spectrum of morphology and the frequency of morphological types of gastric cancer. As RS largely depends on morphology, there are better results for GIST and neuroendocrine neoplasms. The relative 5-year survival rates for patients with stomach adenocarcinoma, signet ring cell carcinoma, and GIST/sarcoma are 35.6, 29.9, and 88.6%, respectively.

#### Benchmarking

There may be selection bias within single hospitals that influences outcomes. Therefore, the results of single institutions must be compared to each other using measures based on summary population-based data. Accordingly, these results may be interpreted as the mean epidemiological values.

**Figure 7** presents two diagrams of the percentages of UICC stage III and IV colorectal cancer, where one red bar indicates one co-operating hospital. For UICC stage III colorectal cancer, the upper diagram reveals a variation between 20.9 and 36.7%, with an epidemiological mean of 29.2%. For UICC stage IV colorectal cancer, the lower diagram shows a variation between 12.0 and 34.7%, with an epidemiological mean of 23.5%.

This example of clinic-specific variation emphasizes the use of multivariate statistical methods, such as proportional hazard models, to adjust not only for multiple prognostic parameters but also for clinical variations.
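The comparison of a single hospital with the epidemiological mean shown in **Figure 7** can be sketched as follows. This is a simplified illustration, assuming a Wilson score interval for the hospital's proportion; the hospital names and patient counts are hypothetical, and only the epidemiological mean of 29.2% is taken from the text. The sketch does not adjust for case mix, which is why multivariate methods remain necessary.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def deviates_from_mean(events: int, n: int, population_mean: float) -> bool:
    """True if the population mean lies outside the hospital's interval."""
    low, high = wilson_interval(events, n)
    return not (low <= population_mean <= high)


EPIDEMIOLOGICAL_MEAN_STAGE_III = 0.292  # UICC stage III mean reported in the text

# Hypothetical (stage III cases, all colorectal cases) per hospital.
hospitals = {"Hospital A": (20, 200), "Hospital B": (58, 200)}

for name, (stage_iii, total) in hospitals.items():
    low, high = wilson_interval(stage_iii, total)
    flag = deviates_from_mean(stage_iii, total, EPIDEMIOLOGICAL_MEAN_STAGE_III)
    print(f"{name}: {stage_iii / total:.1%} (interval {low:.1%} to {high:.1%}), deviates: {flag}")
```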

### IMPLEMENTATION OF NOVEL CONCEPTS

The implementation of novel concepts in cancer care is a continuous process that relies on the adoption of results from research and randomized clinical studies as well as on reliable evidence from observational data (e.g., provided by population-based CCRs). Oncological guidelines compile and periodically update the state of the art of diagnostics and treatment. CER assesses the degree of implementation and the effectiveness of the applied measures.

#### Breast Cancer

Since 2008, the German S3 guideline for breast cancer has recommended a sentinel lymph node biopsy (SLNB) (11). **Figure 8** reveals the implementation of the SLNB parallel to the trends of other axilla operations within the catchment area of the MCR. When the guideline was published in 2008, SLNBs were being practiced in more than 50% of all axilla operations. In 2015, SLNBs were performed in 74.8% of all axilla operations.

During the observation period from 1998 to 2015, the proportion of lymphadenectomies decreased from 88.5 to 4.5%.

Additionally, during this time, adjuvant and neoadjuvant therapy was refined and intensified. Accordingly, the RS rate for breast cancer patients within the MCR region increased from 1998 to 2015 (**Figure 9**).

#### Vulvar Carcinoma

An article published about a less invasive local lymph node surgery for squamous cell vulvar carcinoma (12) reported that in an analysis of 1,133 patients diagnosed between 1998 and 2013, there were significant decreases in complete vulvectomies and inguinal lymph node surgeries. Moreover, the change in therapy to less radical procedures did not negatively affect the time to local and lymph node recurrence, OS, or RS.

This publication is an example of the limits of evidence-based medicine, but it also indicates that population-based CCRs with a large catchment area of about five million inhabitants and a multicenter cooperation structure such as the MCR can still deliver results for small cohorts of rare cancers such as vulvar carcinoma. An important advantage of larger CCRs is their ability to deal with rare cancers.

#### Lung Cancer

The implementation of new therapeutic concepts in patient care requires the evaluation of their effects on outcomes. For therapy planning and predicting the prognosis of UICC stage IV non-small cell lung cancer (NSCLC), the role of the EGFR mutation (EGFR mutated) was examined in 536 patients of the MCR (**Figure 10**) because the mutation of EGFR is a good predictor of the effectiveness of tyrosine kinase inhibitors (13).

While there is no validation of the type of therapy applied, the median RS for patients with EGFR mutation is 23.5 months, which is more than twice the 11.2 months observed for patients without EGFR mutation (EGFR wild type).

#### Rectal Cancer

The therapy for rectal cancer has changed within the past 30 years as it concerns surgery, radiotherapy, and systemic therapy.

In recent decades, total mesorectal excision of rectal cancer has been implemented on a population basis, along with a quality assessment using the MERCURY classification. Adjuvant radiotherapy was implemented initially, followed by neoadjuvant radiochemotherapy for UICC stages II and III. The effectiveness of therapeutic innovations may be demonstrated by data from population-based CCRs, as presented in **Figure 11** for rectal carcinoma treated within the catchment area of the MCR.

### KNOWLEDGE ACQUISITION

#### Incidence of Second Malignancies

The incidence of second malignancies is not well known. The network of different medical departments for different tumor entities within a cancer registry makes it possible to capture all malignancies. Thus, multiple malignancies can be compiled, which is difficult for a single department of a particular discipline. In addition, the results of risk estimations depend on the calculation method. The probability calculated using the Kaplan–Meier method accounts for cases lost to follow-up through censoring, and the inverse rate (1-KM) quantifies the percentage of secondary primaries occurring per year and cumulated over years (**Figure 12**, upper diagram).

The calculation of the cumulative incidence function (CI) considering competing risks (14), e.g., the risk of dying before a second malignancy is diagnosed, leads to lower probabilities of the occurrence of second malignancies (**Figure 12**, lower diagram).
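The difference between the two calculation methods can be illustrated with a small sketch. It assumes a toy cohort with hypothetical follow-up times and status codes (0 = censored, 1 = second malignancy, 2 = death without second malignancy) and uses the Aalen-Johansen form of the cumulative incidence estimator; the function names are illustrative and the estimator is not necessarily the implementation used for **Figure 12**.

```python
def one_minus_km(times, status, event=1):
    """1 - Kaplan-Meier for `event`, treating every other status as censoring."""
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        here = [s for ti, s in zip(times, status) if ti == t]
        d_event = sum(1 for s in here if s == event)
        if d_event:
            surv *= 1 - d_event / at_risk
        curve.append((t, 1 - surv))
        at_risk -= len(here)  # everyone observed at t leaves the risk set
    return curve


def cumulative_incidence(times, status, event=1):
    """Aalen-Johansen cumulative incidence of `event` with competing events."""
    at_risk = len(times)
    event_free = 1.0  # probability of no event of any type before t
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        here = [s for ti, s in zip(times, status) if ti == t]
        d_event = sum(1 for s in here if s == event)
        d_any = sum(1 for s in here if s != 0)
        cif += event_free * d_event / at_risk
        event_free *= 1 - d_any / at_risk
        curve.append((t, cif))
        at_risk -= len(here)
    return curve


# Hypothetical follow-up in years and status codes for eight patients.
times = [1, 2, 3, 4, 5, 6, 7, 8]
status = [2, 1, 0, 2, 1, 2, 1, 0]

km_final = one_minus_km(times, status)[-1][1]
cif_final = cumulative_incidence(times, status)[-1][1]
print(f"1-KM at end of follow-up: {km_final:.3f}")
print(f"Cumulative incidence at end of follow-up: {cif_final:.3f}")
```

In this toy cohort, the cumulative incidence is lower than 1-KM because patients who die without a second malignancy are no longer treated as if they could still develop one.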

#### Translational Research

Translational medicine describes the effort to transfer research results into routine patient care. One aspect of this field is the investigation of the molecular characteristics and functioning of cancer cells and their metabolites. The cooperation of the MCR with pathological institutes has led to a series of publications (15–23). Moreover, the connection between *in vitro* results and clinical data from CCRs validates laboratory insights regarding the course of the disease, the quality of life, and the prognoses.

#### Hypothesis: Lymph Nodes do not Metastasize

The influence of positive lymph nodes on the process of metastasization is not yet fully understood and is one subject of research at the MCR. Although the presence of positive lymph nodes is a key prognostic factor, there is little evidence as to whether tumor cells in positive lymph nodes infiltrate other lymph nodes or distant organs. Moreover, there is no evidence in the registry data of increased survival resulting from lymph node dissection. The success of the sentinel lymph node concept for some solid tumors and the fact that lymph node recurrence is rare in the course of disease of many solid tumors support the hypothesis that "positive lymph nodes do not metastasize" (24).

### LIMITATIONS AND CHANCES

Clinical cancer registries with their network for information processing and feedback provide a powerful infrastructure for optimizing patient care and initiating research projects. Though there are promising activities in the use of CCRs, the data are observational and thus contain various types of bias that must be considered in statistical analyses. Furthermore, the results should be interpreted with caution and knowledge of the state of the art, and the limitations and risks associated with using observational data must be evaluated separately according to the specific application.

Clinical cancer registries meet the demands of a defined catchment area, including population-based data attained due to the thoroughness of the registration and the appropriateness of the form and content of the incoming documents. Involvement of clinicians and scientists in cancer registries is necessary to keep registries up to date and to support the analysis of open questions. Therefore, the catchment area for a population-based CCR should not cover far more than five million residents. Under these conditions, it is highly likely that meaningful cancer data will be gathered and available for analyses by clinicians, scientists, and epidemiologists.

The main aims with respect to oncology and public health are to create transparency in patient care, to develop state-of-the-art updates in diagnostics and therapy, to quantify the outcome of procedures subject to the guidelines, and, if necessary, to define starting points for improvement. The results from CCRs may be compared with those of other hospitals, with the results of randomized controlled trials, and with the results published in the national and international literature. Positive and negative deviations must be noticeable and considered for drawing relevant conclusions. Until now, all these aspects have been handled only partly or only in some regions, but not for Germany as a whole. Thus, a main task for CCRs will be to create an infrastructure and a valid database to address these questions in a regional but comparable way.

While, in the past, many centers built up single databases without network and communication, efforts such as those described in the present manuscript offer significant short- and long-term benefits for all participants, generate large multicenter datasets, and provide a comprehensive platform for scientific work as well as quality-related evaluations (25, 26).

The current work discusses only some of the applications and multiple aspects of the use of CCRs. The legislative and financial support provided by the KFRG should be used for further activities in health-care delivery research. Cancer control and patient care may benefit. In the future, such data will become even more important and will be an indispensable key element of all cancer centers.

### AUTHOR CONTRIBUTIONS

GS-F: conception and design, collection and assembly of data, data analysis, and interpretation, manuscript writing, final approval of manuscript. SC and TK: conception and design, collection and assembly of data, final approval of manuscript. VN: collection and assembly of data, final approval of manuscript. JE: conception and design, collection and assembly of data, data analysis and interpretation, manuscript writing, final approval of manuscript.

#### ACKNOWLEDGMENTS

We thank all hospitals, departments, and practitioners who participated in the documentation of data.

#### FUNDING

The Munich Cancer Registry (MCR) is part of the Munich Tumour Centre (TZM) at the Institute for Medical Information Processing, Biometry, and Epidemiology (IBE) of the Ludwig-Maximilians-Universität (LMU) at the University Hospital of Munich.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Schubert-Fritschle, Combs, Kirchner, Nüssler and Engel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Data-Based Radiation Oncology: Design of Clinical Trials in the Toxicity Biomarkers Era

*David Azria<sup>1\*†</sup>, Ariane Lapierre<sup>1†</sup>, Sophie Gourgou<sup>1</sup>, Dirk De Ruysscher<sup>2,3</sup>, Jacques Colinge<sup>1</sup>, Philippe Lambin<sup>2</sup>, Muriel Brengues<sup>1</sup>, Tim Ward<sup>4</sup>, Søren M. Bentzen<sup>5</sup>, Hubert Thierens<sup>6</sup>, Tiziana Rancati<sup>7</sup>, Christopher J. Talbot<sup>8</sup>, Ana Vega<sup>9</sup>, Sarah L. Kerns<sup>10</sup>, Christian Nicolaj Andreassen<sup>11</sup>, Jenny Chang-Claude<sup>12,13</sup>, Catharine M. L. West<sup>14</sup>, Corey M. Gill<sup>15,16</sup> and Barry S. Rosenstein<sup>15,16</sup>*

#### *Edited by:*

*Sean P. Collins, Georgetown University School of Medicine, USA*

#### *Reviewed by:*

*Vinay Sharma, University of the Witwatersrand, South Africa; Marianne Aznar, Rigshospitalet, Denmark*

#### *\*Correspondence:*

*David Azria david.azria@icm.unicancer.fr*

*† These authors have contributed equally to this work and are joint first co-authors.*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 03 February 2017; Accepted: 13 April 2017; Published: 27 April 2017*

#### *Citation:*

*Azria D, Lapierre A, Gourgou S, De Ruysscher D, Colinge J, Lambin P, Brengues M, Ward T, Bentzen SM, Thierens H, Rancati T, Talbot CJ, Vega A, Kerns SL, Andreassen CN, Chang-Claude J, West CML, Gill CM and Rosenstein BS (2017) Data-Based Radiation Oncology: Design of Clinical Trials in the Toxicity Biomarkers Era. Front. Oncol. 7:83. doi: 10.3389/fonc.2017.00083*

*<sup>1</sup>Department of Radiation Oncology, Radiobiology Unit, Biometric and Bio-informatic Divisions, Montpellier Cancer Institute (ICM), IRCM, INSERM U1194, Montpellier, France, <sup>2</sup>Department of Radiation Oncology, Maastricht University Medical Centre, MAASTRO Clinic, Maastricht, Netherlands, <sup>3</sup>Radiation Oncology, KU Leuven, Leuven, Belgium, <sup>4</sup>Patient Advocate, Manchester, UK, <sup>5</sup>Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, MD, USA, <sup>6</sup>Department of Basic Medical Sciences, Ghent University, Ghent, Belgium, <sup>7</sup>Prostate Cancer Program, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy, <sup>8</sup>Department of Genetics, University of Leicester, Leicester, UK, <sup>9</sup>Fundacion Publica Galega de Medicina Xenomica-SERGAS, Grupo de Medicina Xenomica-USC, IDIS, CIBERER, Santiago de Compostela, Spain, <sup>10</sup>Department of Radiation Oncology, University of Rochester Medical Center, Rochester, NY, USA, <sup>11</sup>Department of Experimental Clinical Oncology, Aarhus University Hospital, Aarhus, Denmark, <sup>12</sup>Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany, <sup>13</sup>University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf, Hamburg, Germany, <sup>14</sup>Division of Cancer Sciences, University of Manchester, Manchester Academic Health Science Centre, Christie Hospital NHS Trust, Manchester, UK, <sup>15</sup>Department of Radiation Oncology, Icahn School of Medicine at Mount Sinai, New York, NY, USA, <sup>16</sup>Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA*

The ability to stratify patients using a set of biomarkers that predict toxicity risk would allow for radiotherapy (RT) modulation and serve as a valuable tool for precision medicine and personalized RT. For patients presenting with tumors with a low risk of recurrence, modifying RT schedules to avoid toxicity would be clinically advantageous. Indeed, for the patient at low risk of developing radiation-associated toxicity, use of a hypofractionated protocol could be proposed, leading to treatment time reduction and a cost–utility advantage. Conversely, for patients predicted to be at high risk for toxicity, either a more conformal form or a new technique of RT, or a multidisciplinary approach employing surgery could be included in the trial design to avoid or mitigate RT when the potential toxicity risk may be higher than the risk of disease recurrence. In addition, for patients at high risk of recurrence and low risk of toxicity, dose escalation, such as a greater boost dose, or irradiation field extensions could be considered to improve local control without severe toxicities, providing enhanced clinical benefit. In cases of high risk of toxicity, tumor control should be prioritized. In this review, toxicity biomarkers with sufficient evidence for clinical testing are presented. In addition, clinical trial designs and predictive models are described for different clinical situations.

Keywords: trial design, patient selection, biomarkers, radiotherapy, toxicity tests

## INTRODUCTION

Radiotherapy (RT) is one of the leading treatment modalities in oncology, and over 50% of patients diagnosed with cancer undergo RT during their course of treatment. Although RT is primarily a local treatment, patients are exposed to a risk of toxicities in the treatment field and surrounding tissues, which may develop early or late. Early toxicities are defined as side effects occurring during treatment or in the first 3 months after treatment completion. Late toxicities are defined as those occurring more than 3 months following RT and can increase over time for a period of many months to years. Late toxicities often persist and can have a significant negative impact on quality of life among cancer survivors. A sequential effect between early and late toxicity is often reported.

A total of 5–10% of patients will eventually develop severe side effects with a significant impact on treatment outcome or quality of life. Based on this observation and dependent upon the prognosis that reflects the type of tumor and its stage at time of treatment, dose–volume constraints to organs-at-risk are usually chosen in order to keep the risk of developing grade 3 or higher side effects below 5% (1, 2). Due to considerable progress in cancer management in recent decades, the number of cancer survivors has dramatically increased, raising new challenges in the various phases of survivorship. Thus, posttreatment morbidity and quality of life have become a critical concern in the growing patient population (3). However, there is large patient-to-patient variability for the development of adverse outcomes following RT, in terms of both prevalence and severity. While most patients will develop toxicities within the normal range, some patients demonstrate a hypersensitive phenotype and develop severe toxicities even at standard radiotherapeutic doses.

The first example of individual variation in degree of response was described by Holthusen in 1936 (4). Numerous normal tissue complication probability (NTCP) models have since been developed, and variation in normal tissue response has been shown to follow a normal distribution with 5% of patients considered radiosensitive (5). Identification of these patients beforehand is critical to avoid morbidity because severe toxicities in a minority of patients limit the dose that can be safely delivered to the majority of patients (6). In addition, individualized risk estimation for even mild or moderate effects would provide patients with information as to their risk for complications following treatment and could be used to select patients for interventions designed to prevent or mitigate toxicities. Thus, understanding individual variation is crucial to individualization of RT treatment planning and improved therapeutic outcomes (7).
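To make the 5% figure concrete, the following sketch computes where such a cutoff would lie on a standard normal distribution. It assumes, for illustration only, that "radiosensitive" corresponds to the upper 5% tail of a standardized response score; the source does not specify how the cutoff is defined, and the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def radiosensitivity_cutoff(fraction_sensitive: float = 0.05) -> float:
    """z-score above which the given upper-tail fraction of patients lies."""
    if not 0.0 < fraction_sensitive < 1.0:
        raise ValueError("fraction_sensitive must lie in (0, 1)")
    return STANDARD_NORMAL.inv_cdf(1.0 - fraction_sensitive)


def fraction_above(z: float) -> float:
    """Fraction of a standard normal population with a score above z."""
    return 1.0 - STANDARD_NORMAL.cdf(z)


z_cut = radiosensitivity_cutoff(0.05)
print(f"Upper 5% tail starts at about {z_cut:.3f} standard deviations above the mean")
print(f"Fraction above the cutoff: {fraction_above(z_cut):.3f}")
```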

While early toxicities might compromise treatment completion, they can usually be managed with adequate care. In contrast, late toxicities can significantly affect quality of life in survivors and may require extensive treatments to alleviate symptoms. However, acute radiation reactions are not necessarily an indicator of a predisposition for late toxicity (8). Therefore, there is a need to measure individual radiosensitivity and predict the risk of toxicity before treatment. Even though many external factors such as age, concomitant medications, or recent surgery affect the risk of toxicity, the main determinant seems to be genetic factors (i.e., intrinsic radiosensitivity). However, it is unlikely that intrinsic radiosensitivity is the product of a single genetic alteration, and as such, it should be regarded as a complex polygenic trait (7). If a link can be found between underlying genetic variation and normal tissue susceptibility to developing toxicity, then patients could benefit from genomically guided individualization of their treatment: patients identified early as being at high risk for radiation-induced toxicities may benefit from either RT dose reduction or hyperfractionation. Conversely, identification of patients who are at low risk of toxicity could allow for (i) hypofractionation of the treatment plan, thereby shortening treatment time or (ii) dose escalation, which could improve tumor control (9).

Several observations support the hypothesis that clinical normal tissue radiosensitivity is influenced by genomics. However, very little is known about the genetic architecture of radiosensitivity or the specific genomic variants underlying interindividual differences in normal tissue reactions to RT in unselected cancer patients. It is considered that intrinsic radiosensitivity of a patient should be regarded as a complex trait depending on the combined effect of multiple genomic alterations (9). However, factors other than intrinsic radiosensitivity (i.e., genetically determined) will influence the risk of toxicity (e.g., radiation dose, age, and comorbidities), which highlights the need to collect and include multiple variables in studies.

Several genes involved in response to radiation injury were identified because homozygous mutations resulted in unusually severe reactions to RT (e.g., ATM). Other genes studied were known to be involved in the DNA damage response to ionizing radiation or the development of fibrosis. Most studies to date investigated single nucleotide polymorphisms (SNPs) because of their high prevalence in a population. With the rise of next-generation sequencing and genome-wide assays, genomic studies have been immensely facilitated (10). SNPs associated with radiation injury have been identified using high-throughput genotyping, in genome-wide association studies (GWASs) as well as candidate gene studies (11, 12). However, SNP discovery through GWAS requires a large number of patients to reach statistical significance, and the number of patients who exhibit severe toxicity is relatively low in clinical studies (6). In addition, careful clinical consideration is required when designing radiogenomic studies. While radiation dose is the main factor influencing toxicity, additional factors, including genomic alterations (e.g., SNPs) and treatment volume, may be effect modifiers of the dose–toxicity relationship. Other clinical factors such as age, smoking habits (13, 14), or preexisting conditions (autoimmune diseases such as collagen vascular diseases) (15) may influence toxicity independently of genetic background, and so it is important that risk prediction models are not restricted to only genetic or only non-genetic factors.

It is also important to consider the future development of a test for clinical application. Rigorous methodology in choice of hypothesis, methods, and result reporting is required to allow generalization of the results (16) (see also Cancer Research UK predictive biomarker roadmap: http://www.cancerresearchuk.org/prod\_consump/groups/cr\_common/@fre/@fun/documents/generalcontent/cr\_027486.pdf). In addition, the methodology developed for reporting tumor markers could be used for evaluating the level of evidence of prognostic normal tissue radiosensitivity markers (17–20). Based on these works, our consortium developed an 18-item checklist for reporting radiogenomic studies called STROGAR (**Table 1**), which should stand as reference for any new predictive biomarker development (21).

This study aims to review currently available radiogenomic assays based on level of evidence and clinical relevancy and to

#### Table 1 | STROGAR 18-item checklist for reporting radiogenomic studies from Kerns et al. (21). Item number Recommendations Title and abstract Title and abstract 1 Include the primary outcome(s) and type of study [whether genome-wide association studies (GWASs) or gene-specific]; provide an informative summary of the study including study design, whether discovery or validation, sample size, main end points, and major results. Introduction Background/rationale 2 Note if the study is a GWAS or a candidate gene/SNP study and, if candidate gene study, rationale for choice of genes/SNPs; give a general description of the study setting. Objectives 3 Define the primary/main outcome(s) of interest; describe the overall/long-term goal of the study; note if it is a discovery, validation, or multistage study. Use terminology and definitions from National Cancer Institute biomarker study guidelines (22), where applicable. Methods Study design 4 Specify the study design (case–control, cohort); whether data were collected under a controlled trial setting; whether data were collected retrospectively or prospectively. Report power and sample size considerations. Patient population 5 Specify the source(s) of the patients and, if multiple sources, whether they are pooled or treated as separate cohorts; define inclusion/exclusion criteria; report whether comorbidities and medications were assessed by self-report or medical records; define methods/system used for tumor staging; describe the larger patient population from which the study sample was drawn; define how major changes in treatment protocol were handled in the analysis. Radiation exposure 6 Specify details of radiation treatment parameters including: organ(s)-at-risk, dose–time fractionation; dose rate, target volume selection (e.g., breast + boost), dose to critical substructures, dose–volume metric used, the type of treatment and treatment setting, radiation modality (e.g., external beam vs. 
brachytherapy), whether single or combined treatment modalities were used, whether primary treatment or salvage therapy, imaging and planning details, ICRU recommendations followed and note relaxation of criteria, note any changes in dose or treatment protocol over the time course of enrollment and whether there were any interruptions in treatment. Phenotype(s) 7 Specify how intrapatient or pretreatment assessment was made and whether it is accounted for in defining phenotype(s); note whether patient-reported outcomes or physician-assessed outcomes are being used to define phenotype(s); note which toxicity scoring system was used (if using a common/ standard system); define the grading scales used and whether the phenotype(s) is/are defined as continuous, dichotomous or categorical; describe frequency of follow-up scheduling and diagnostic intensity; define the posttreatment time frame for assessment of toxicity outcomes; describe whether outcome(s) is/are based on a single time point or the maximum/worst time point out of a series of followup assessments; note if/how competing risks were handled (such as non-radiation-related manifestation of the phenotype); note any medical intervention that may influence study outcome(s). 
Genotyping strategy and quality control (QC) 8 Specify DNA source and isolation methods; note the methods/platform used for genotyping; specify whether genotyping was done in one stage or multiple stages; note whether genotyping was done in more than one lab or batch, and if so, how batch effects were handled; describe methods for genotype calling and cite the algorithm used; note whether genotype calling was done for the whole study sample together or in batches; describe QC methods including concordance between duplicates, control samples, and checks for cryptic relatedness; describe methods for assessing population structure; describe SNP/CNP filtering methods including filtering on per-sample call rate, per-SNP call rate, minor allele frequency, and Hardy–Weinberg equilibrium; note whether imputation was used and, if so, describe methods. Data analysis and statistical methods 9 Define the statistical methods and models used for association testing; cite the software and settings used; describe how censoring was handled; define model selection methods used for multivariable models; describe whether all samples are analyzed together or sequentially if the study involves multiple cohorts; for multistage studies, define methods for selecting variants to follow-up in subsequent stages; describe how missing data were handled; if multiple cohorts were included, describe data harmonization methods; note whether gene–gene interaction or gene–environment interaction was investigated; describe methods used to adjust for population structure; describe methods used to correct for multiple comparisons and/or control for risk of false-positive findings.


#### TABLE 1 | Continued


evaluate potential ways in which these assays might be implemented in routine clinical practice.

### AVAILABLE RADIOGENOMIC BIOMARKERS AND THEIR RESPECTIVE LEVELS OF EVIDENCE

#### SNP Association Studies

The initial research performed in radiogenomics involved candidate gene studies, which focused on genes encoding proteins with known associations with pathways involved in responses to radiation, such as DNA repair processes and cell cycle checkpoint control. Although a number of positive associations were reported, these studies often did not adequately correct for multiple hypothesis testing and generally were not validated in subsequent studies, with several notable exceptions described below. More recent advances in radiogenomics research have been achieved through use of SNP microarrays and the performance of GWASs in which large numbers of SNPs across the genome have been evaluated. Using both of these approaches, several large studies have rigorously tested associations between particular SNPs and toxicity outcomes while following the STROGAR guidelines for reporting radiogenomic studies (**Table 1**) (21).

The most progress has probably been made in identifying specific SNPs associated with late toxicity following RT for prostate cancer. The first radiogenomics GWAS aimed to identify SNPs associated with erectile dysfunction in African-American men treated with RT for prostate cancer (23). Through this study, a SNP (rs2268363) in the FSHR gene, which encodes the follicle-stimulating hormone receptor, was identified (unadjusted *p*-value = 5.46 × 10<sup>−8</sup>; Bonferroni *p*-value = 0.028). In another prostate cancer study, a three-stage GWAS was conducted using discovery and replication cohorts, with the Standardized Total Average Toxicity (STAT) score (24) as a measure of overall toxicity, combining urinary and rectal end points. A locus encompassing the TANC1 gene was associated with STAT score for overall late toxicity (25) with an odds ratio (OR) of ~6 (combined *p*-value = 4.64 × 10<sup>−11</sup>). More recently, a GWAS meta-analysis was performed using data from four cohorts of men treated for prostate cancer for whom toxicity was measured at 2 years post-RT (26). Two SNPs identified in this study met genome-wide significance. One was rs17599026, which resides on chromosome 5q31.2 and was associated with urinary frequency, with an OR of 3.1 (95% confidence interval [CI] 2.1–4.7, *p* = 4.2 × 10<sup>−8</sup>). The other, rs7720298, which is situated on chromosome 5p15.2, was associated with decreased urine stream with an OR of 2.7 (95% CI 1.9–3.9, *p* = 3.2 × 10<sup>−8</sup>). This SNP is located in an intronic region downstream of DNAH5 exon 30. Using a candidate gene approach, a study of more than 5,000 patients who underwent RT for either prostate or breast cancer reported an association between overall toxicity and rs1801516 in the ATM gene with ORs of 1.5 for acute and 1.2 for late toxicity (27).
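The relationship between the unadjusted and Bonferroni-corrected *p*-values, and between an OR, its confidence interval, and its *p*-value, can be checked with simple arithmetic. The following is a minimal sketch under stated assumptions: the number of tests is inferred from the two reported *p*-values rather than given in the text, and the Wald approximation on the log-OR scale is assumed here, not confirmed as the method used in the cited studies. The function names are illustrative.

```python
import math

def bonferroni(p_raw, n_tests):
    """Bonferroni-adjusted p-value: raw p times the number of tests, capped at 1."""
    return min(1.0, p_raw * n_tests)

def implied_n_tests(p_raw, p_adjusted):
    """Number of tests implied by a raw/adjusted pair (valid only if not capped at 1)."""
    return p_adjusted / p_raw

def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), z, and a two-sided p from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log-odds scale.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return se, z, p

# Values quoted in the text for rs2268363 and rs17599026.
n_tests = implied_n_tests(5.46e-8, 0.028)
se, z, p = wald_from_or_ci(3.1, 2.1, 4.7)
print(f"implied number of tests: {n_tests:,.0f}")
print(f"rs17599026: SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

Because the published OR and interval are rounded, a recomputed *p*-value can be expected to match the reported one only in order of magnitude.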

Several other studies have been successful in identification of SNPs associated with the development of adverse normal tissue outcomes following RT for breast cancer. For example, a study comprising four SNPs related to the TGFβ pathway reported associations with several outcomes, including breast induration, telangiectasia, and overall toxicity (28). Significant and replicated associations with adverse outcomes following breast RT were reported for the TNF SNP rs1800629 and rs2857595, which is located 25.7 kb from rs1800629 and resides in the intergenic region between NCR3 and AIF1. Another validated study of breast cancer patients identified SNP rs1139793 in TXNRD2 associated with subcutaneous fibrosis following RT (29). A separate study used a two-stage design to investigate associations between SNPs in genes whose products are involved with responses to oxidative stress with toxicities following radiation treatment of women diagnosed with breast cancer. The rs2682585 SNP in XRCC1 (30) was found to be associated with reduced risk for skin toxicities (OR 0.77, 95% CI 0.61–0.96, *p* = 0.02) and decreased STAT scores (−0.08, 95% CI −0.15 to −0.02, *p* = 0.016).

Several candidate gene SNP studies have successfully identified and validated SNPs associated with late RT toxicity in lung cancer. It was reported in studies of patients treated with RT for non-small-cell lung cancer (NSCLC) that the HSPB1 rs2868371 SNP was associated with grade 3 or greater radiation pneumonitis (31) in both the training (*p* = 0.031) and validation sets (*p* = 0.025) and that this SNP was also associated with the development of grade 3 or greater radiation-induced esophagitis (32) in both the training (*p* = 0.045) and validation cohorts (*p* = 0.031). In addition, it was reported that the TGFB1 rs1800469 SNP was associated with a higher risk of radiation esophagitis in both the training (*p* = 0.045) and validation (*p* = 0.023) sets of NSCLC patients (32).

While much work remains to be done in order to identify the many radiosensitivity SNPs that likely remain undiscovered, the studies published to date represent an important step toward development of polygenic risk models. Furthermore, the GWASs have contributed to uncovering novel radiation biology genes and pathways. Functional studies of these genes will provide important information for development of pharmacological interventions to prevent or mitigate the toxic effects of radiation on normal tissues.

#### Fibroblast-Based Assays

Fibroblasts have traditionally been considered the gold-standard model of normal tissue for RT studies, given the importance of fibrosis in late effects and the large role these cells play in the supporting cellular networks that surround tumors outside of the central nervous system. The first study of this model was conducted by Burnet et al. in 1992 (33). Since then, several studies suggested that fibroblast radiosensitivity *in vitro* could predict early toxicity risk. This association was studied in the clinical setting, in breast and head and neck cancer, where fibroblast clonogenic survival after irradiation was associated with radiation-induced toxicity in patients (34). However, to date, no prospective study has been able to demonstrate a significant association between fibroblast radiosensitivity and radiation-induced toxicity in patients (35, 36).

#### Radiation-Induced Lymphocyte Apoptosis (RILA) Assay

In response to the limited success of fibroblast-based tests, lymphocyte-based assays were developed in their stead. While clonogenic assays showed promise in a prospective setting and in multivariable analysis (37), the 2-week assay time was considered a barrier to clinical implementation. Therefore, Ozsahin et al. developed an assay based on CD8+ T-lymphocyte apoptosis after *in vitro* irradiation with a single 8-Gy dose (38). While no association between lymphocyte apoptosis and early toxicities was found in multivariate analyses, CD8+ T-lymphocyte apoptosis was significantly associated with late effects in various cancers in a single-center prospective trial (39) and recently confirmed in a prospective multicenter study for late breast fibrosis (40). Furthermore, this assay has been shown to be reproducible between laboratories, making it a robust test to assess individual radiosensitivity (39, 41). As such, several prospective trials are currently assessing the clinical validity of the RILA assay in different cancer settings, such as prostate or lung cancer (42).

#### Other Lymphocyte-Based Assays

As lymphocytes are a convenient model for radiation response, several other lymphocyte-based assays have been used to assess individual radiosensitivity. Of these, the most common is the γ-H2AX residual foci assay. H2AX is a histone protein that is phosphorylated (to form γ-H2AX) upon double-strand break formation; this phosphorylation is one of the earliest events that can be detected after cell irradiation. The number of γ-H2AX foci after cell damage has been extensively used to evaluate response to chemotherapy and RT (43–45). However, the association between the number of residual foci and clinical response to radiation at the patient level (measured by either toxicity or tumor response) has not been prospectively validated.

G2 metaphase and G0 micronuclei assays were initially used to assess chromosomal radiosensitivity and predisposition to breast cancer. Along with the γ-H2AX assay, many studies have sought to find an association between G2 metaphase and G0 micronuclei assays and radiation-induced toxicity (46). However, the G2 metaphase assay has exhibited low reproducibility. As new techniques have improved this assay, its use warrants prospective validation (47, 48). Similarly, the G0 micronuclei assay has been compared to other lymphocyte-based assays, but failed to be prospectively validated for prediction of either early or late radiation-associated toxicity (49–51).

**Table 2** rates these tests according to their respective level of evidence, based on the STROGAR items and adapted from the levels of evidence proposed by Simon et al. (19).

Table 2 | Available assays for radiosensitivity assessment with their respective level of evidence adapted from Simon et al. (19).


*SNP, single nucleotide polymorphism; RILA, radiation-induced lymphocyte apoptosis. Level of evidence based on REMARK guidelines (19).*

### CLINICAL IMPLEMENTATION

For a radiosensitivity test to have utility in the clinic, a valid alternative treatment option that permits modification of the proposed treatment based on the results of a test must be available. These interventions could be dose or fractionation alterations, addition or omission of concomitant treatments (such as chemotherapy or RT mitigators), or complete exclusion of RT in hypersensitive patients if the predicted risk of toxicity exceeds the expected benefit of RT. For these individuals, treatment with surgery and/or chemotherapy may be considered. Overall, these interventions can be divided into four situations, based on a patient's tumor control probability (TCP) and normal tissue complication probability (NTCP).

#### High TCP, Low NTCP

A low risk of tumor recurrence combined with a low risk of radiation-induced toxicity is the ideal clinical presentation. In this situation, quality of life improvement during radiation treatment should be the main goal of any intervention.

There is no need to increase total tumor dose since local control is high with standard treatment. However, alternate fractionation, such as hypofractionation, could offer a shorter treatment course with a substantial increase in quality of life. Hypofractionation has been shown to be a valid alternative for early breast cancer radiation, with schedules decreasing from 33 to 15–16 and finally 5 fractions yielding similar results in well-selected patients (52–54). In this case, hypofractionation could cut the treatment duration by half and have a significant impact on quality of life and treatment cost (55). Furthermore, a combined analysis of the START trials for breast cancer suggests that overall treatment time might be a significant determinant of local cancer control after adjuvant whole breast RT with a lower relapse rate in the accelerated arms (56).

Similarly, several hypofractionated schedules have shown promising results in prostate cancer (most recently the CHHiP and HYPRO trials), with only a moderate increase in rectal toxicities (57, 58). Furthermore, when analyzed from a medico-economic point of view, hypofractionated regimens could result in improved health gains at lower cost (59).

#### High TCP, High NTCP

In this case, the patient would be at increased risk for developing severe toxicity following RT, but at a low risk of tumor recurrence.

This scenario is when alternate treatment plans may be most appropriate, such as a strictly surgical treatment. For example, in low-risk prostate cancer, treatment with either surgery or RT, or even active surveillance in appropriately selected patients has demonstrated similar survival outcomes (60). However, toxicity profiles differ significantly; there is a higher risk for urinary toxicity and erectile dysfunction after surgery, but a greater incidence of rectal bleeding and fecal incontinence after RT (61). Therefore, these treatment options could be offered to the patient, who could take all of these factors into careful consideration when deciding upon the type of treatment. In addition, focal therapies could be considered for appropriately selected patients.

In the case of postoperative prostate cancer, adjuvant RT has been shown to reduce the risk of biochemical failure but without overall survival improvement (62). Thus, in highly radiosensitive patients, RT could be postponed until disease recurrence or omitted altogether.

Considering early breast cancer, postoperative RT has been shown to decrease the risk of local recurrence by 15% (63). However, mastectomy with immediate reconstructive surgery could be an alternative to breast-conserving surgery plus RT for patients at high risk for development of radiation-induced toxicity (64, 65). Of course, in this case, as in any treatment change, the patient's opinion should be taken into account in the decision-making process, as a more invasive surgery might be proposed.

Alternatively, in low-risk breast cancer and elderly patients, cosmetic results could be improved by reducing radiation treatment volumes with intraoperative RT or partial breast irradiation (66, 67), while maintaining excellent tumor control (68, 69).

#### Low TCP, Low NTCP

Increasing total treatment dose would be the easiest intervention for a high risk of tumor recurrence in a patient with low risk of radiation-induced toxicity.

Dose escalation has been shown to improve local control in several tumor types, such as prostate or rectal cancer, where an increase in total dose could yield a higher rate of pathological complete response after surgery (70). Several dose escalation trials (in prostate, rectum, cervix or lung cancer for example) are currently recruiting, and these patients could be ideal candidates for radiogenomic trials.

Alternatively, chemotherapy or radiosensitizers could be used to increase radiation efficacy without increasing the physical dose or overall treatment time. In head and neck cancers, for example, the hypoxic modifier nimorazole could be added to the treatment regimen to overcome tumor hypoxia in patients with low risk of radiation-induced toxicity (71, 72). Gemcitabine also has radiosensitizing effects when used in locally advanced bladder cancers (73).

#### Low TCP, High NTCP

This presentation is the worst-case scenario with a highly radiosensitive patient and a high risk of tumor recurrence or progression.

Since the main goal of RT is to ensure tumor control, dose de-escalation cannot be offered to these patients: the need for tumor control outweighs the risk of radiation-induced toxicity.

In this case, alternate fractionation could be considered, such as a hyperfractionated regimen, which may maintain the same therapeutic ratio with decreased risk of toxicity (74). When available, stereotactic body radiation therapy could also offer a decreased risk of normal tissue complications with excellent tumor control rates (75). The use of proton or carbon ion RT could also be considered if these modalities are available. Prediction models including clinical and dosimetric parameters are currently under development (76). Individual radiosensitivity measured using the aforementioned tests should be incorporated into these predictive models (77).

When alternate fractionation schedules are not applicable, radioprotectors may reduce the risk of normal tissue toxicity while maintaining comparable tumor control rates (78). Amifostine is the only Food and Drug Administration (FDA)-approved radioprotector (79), but severe side effects (nausea, hypotension) limit its widespread clinical use. However, patients predicted to be at high risk for development of adverse outcomes following RT could be good candidates for this treatment, whose pharmacologic side effects might prove more easily manageable than severe radiation-induced toxicity.

**Table 3** summarizes the different clinical situations stratified by type of cancer and the suggested interventions.
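The four situations described above amount to a two-by-two decision grid. The sketch below illustrates that grid only; the probability cutoffs are hypothetical placeholders (the text does not define numerical thresholds), and the intervention lists are shortened paraphrases of the options discussed in each subsection, not clinical recommendations.

```python
# Candidate interventions per situation, paraphrased from the subsections above.
# Keys are (high_tcp, high_ntcp).
INTERVENTIONS = {
    (True, False): ("High TCP, Low NTCP",
                    ["hypofractionation (shorter treatment course)"]),
    (True, True): ("High TCP, High NTCP",
                   ["non-RT alternative (e.g., surgery alone)",
                    "postpone or omit RT",
                    "reduced treatment volume (IORT, partial breast irradiation)"]),
    (False, False): ("Low TCP, Low NTCP",
                     ["dose escalation",
                      "chemotherapy or radiosensitizer"]),
    (False, True): ("Low TCP, High NTCP",
                    ["hyperfractionation",
                     "SBRT, proton or carbon ion RT",
                     "radioprotector (e.g., amifostine)"]),
}

def classify(tcp, ntcp, tcp_cutoff=0.8, ntcp_cutoff=0.2):
    """Map predicted TCP and NTCP to one of the four situations.

    The cutoffs are hypothetical; in practice they would depend on tumor
    site, end point, and the predictive model used.
    """
    high_tcp = tcp >= tcp_cutoff
    high_ntcp = ntcp >= ntcp_cutoff
    return INTERVENTIONS[(high_tcp, high_ntcp)]

situation, options = classify(tcp=0.60, ntcp=0.35)
print(situation, "->", "; ".join(options))
```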

### STUDY DESIGN AND MEDICO-ECONOMIC CONSIDERATIONS

Randomized prospective clinical trials are the gold standard for interventional studies (19). There are 10 theoretically possible designs for testing clinical utility of radiogenomics models (80). However, of these, four are most applicable to randomized trials: randomize-all, interaction or risk factor-stratified design, targeted or selection design, and the individual profile design (81).

*TCP, tumor control probability; NTCP, normal tissue complication probability; IORT, intraoperative radiotherapy; RT, radiotherapy; IMC, internal mammary chain; SBRT, stereotactic body radiation therapy; ENI, elective nodal irradiation; CNS, central nervous system.*

Randomize-all is the simplest design: all patients are randomized between the treatments regardless of their prognostic group, and the prognostic groups are then studied within each treatment arm. It is the most robust design to assess an intervention, regardless of patient profile. The risk factor-stratified design enables hierarchical statistical tests, by stratifying patients according to their risk level before intervention. In the targeted design, only subjects identified as high-risk patients are randomized for intervention. This model allows studies to target a specific population with a higher statistical power, even if the accuracy of the model is low. Finally, the individual profile design enables parallel therapeutic strategies to be tested in various patient profiles with patients randomized between standard treatment and a risk profile-based strategy (81).
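The practical difference between the four designs is who gets randomized and within which group. The following sketch illustrates this with a simple balanced allocation; the patient records, arm labels, and the `allocate` helper are hypothetical and are not taken from the cited references, and a real trial would use a validated randomization procedure.

```python
import random

ARMS = ("standard", "experimental")

def balanced_split(ids, rng):
    """Shuffle ids and alternate arms so the two arm sizes differ by at most one."""
    ids = list(ids)
    rng.shuffle(ids)
    return {pid: ARMS[i % 2] for i, pid in enumerate(ids)}

def allocate(patients, design, seed=0):
    """Return {patient id: allocation} for one of the four designs.

    patients: list of dicts with keys 'id' and 'high_risk' (bool),
    where 'high_risk' stands for the result of the radiosensitivity test.
    """
    rng = random.Random(seed)
    high = [p["id"] for p in patients if p["high_risk"]]
    low = [p["id"] for p in patients if not p["high_risk"]]

    if design == "randomize_all":
        # Everyone is randomized; risk groups are compared afterwards.
        return balanced_split(high + low, rng)
    if design == "risk_stratified":
        # Randomization is carried out separately within each risk stratum.
        arms = balanced_split(high, rng)
        arms.update(balanced_split(low, rng))
        return arms
    if design == "targeted":
        # Only test-positive (high-risk) patients are randomized.
        arms = balanced_split(high, rng)
        arms.update({pid: "not randomized" for pid in low})
        return arms
    if design == "individual_profile":
        # Standard treatment versus a strategy that adapts treatment to the profile.
        arms = balanced_split(high + low, rng)
        high_set = set(high)
        for pid, arm in list(arms.items()):
            if arm == "experimental":
                arms[pid] = ("strategy: adapted treatment" if pid in high_set
                             else "strategy: standard treatment")
        return arms
    raise ValueError(f"unknown design: {design}")

cohort = [{"id": i, "high_risk": i % 2 == 0} for i in range(8)]
for name in ("randomize_all", "risk_stratified", "targeted", "individual_profile"):
    print(name, allocate(cohort, name))
```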

Nevertheless, trials of radiogenomics models should carefully follow appropriate reporting guidelines, such as STROGAR, CONSORT and REMARK in order to make large-scale validation of the results easier (18, 21, 82). Development of these tests for clinical implementation should theoretically follow region-specific guidelines, such as FDA or European Medicines Agency (16). However, not all tests can comply with every item in these guidelines, such as availability of randomized interventional studies. We consider retrospective and large prospective multicenter cohorts to be a required minimum in these cases.

As normal tissue response to radiation is a polygenic trait also affected by clinical, demographic and health behavior factors, multiparametric models should be the gold standard for predictive assays. Furthermore, given that radiosensitivity assays are predictive factors, they cannot be interpreted in an independent manner (83). For example, the RILA assay has been shown to be biased by numerous factors in breast fibrosis prediction, such as smoking habits or hormone therapy (41). A nomogram has thus been developed to incorporate effect modifiers and confounding parameters when predicting risk of radiation-induced breast fibrosis. Similar considerations apply to SNP-based predictive assays. For example, the SNP tagging the TANC1 risk locus for late toxicity in prostate cancer was shown to interact with radiation dose (25). There are likely other gene-by-environment interactions that remain to be uncovered, and inclusion of interaction terms is expected to improve performance of predictive models.
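To make the role of an interaction term concrete, the sketch below evaluates a logistic risk model containing a genotype-by-dose product term. The logistic form, the choice of covariates, and every coefficient value are hypothetical placeholders chosen for illustration; they are not estimates from the cited studies or from the published nomogram.

```python
import math

# Hypothetical coefficients on the log-odds scale (illustration only).
COEF = {
    "intercept": -4.0,
    "genotype": 0.10,         # per risk allele
    "dose": 0.03,             # per Gy
    "smoker": 0.40,           # clinical effect modifier
    "genotype_x_dose": 0.01,  # gene-by-dose interaction
}

def log_odds(risk_alleles, dose_gy, smoker, coef=COEF):
    """Linear predictor including the genotype-by-dose interaction term."""
    return (coef["intercept"]
            + coef["genotype"] * risk_alleles
            + coef["dose"] * dose_gy
            + coef["smoker"] * smoker
            + coef["genotype_x_dose"] * risk_alleles * dose_gy)

def predicted_risk(risk_alleles, dose_gy, smoker, coef=COEF):
    """Predicted probability of toxicity from the logistic model."""
    return 1.0 / (1.0 + math.exp(-log_odds(risk_alleles, dose_gy, smoker, coef)))

for dose in (60.0, 78.0):
    r0 = predicted_risk(0, dose, smoker=0)
    r1 = predicted_risk(1, dose, smoker=0)
    print(f"dose {dose:.0f} Gy: risk without allele {r0:.3f}, with one allele {r1:.3f}")
```

With a positive interaction coefficient, the per-allele effect on the log-odds scale grows with dose, which is the behavior a purely additive model cannot represent.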

From a health-economic perspective, identification of hypersensitive patients could significantly decrease the cost of treating radiation-associated toxicity, or even the cost of treatment in low NTCP, high TCP patients eligible for an accelerated regimen. There are approximately 15.5 million cancer survivors in the US, and there may be substantial costs to clinically manage the toxicities that could result from treatment of their disease (84). The costs of cardiac complications that can develop after RT to the chest area (in breast cancer or lung cancer), for example, can be substantial. Costs associated with adverse outcomes following RT are often hard to specify, because they represent a small part of a complex disease management protocol (85). However, decreasing the rate of late toxicities will undeniably lower long-term costs of cancer survivorship.

In order to clearly quantify the economic gain from radiogenomics tests, several factors need to be considered. First, the cost of the actual test needs to be taken into account. For instance, the costs of the RILA assay or targeted SNP genotyping or gene sequencing for a limited panel are generally less than 2,000€, and the price for clinical whole genome sequencing continues to drop. Treatment costs must also be considered. In this case, treatment adaptation to the NTCP of the patient could result in significant savings: total health-care expenditures for breast cancer can be decreased by 10% with hypofractionated RT (86). The cost–utility of intervention must be assessed by comparing these costs to quality-adjusted life years (QALYs), in all patient groups.
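One common way to express such a cost–utility comparison is the incremental cost-effectiveness ratio (ICER), the extra cost per QALY gained. The text does not specify which summary measure should be used, so the ICER is an assumption here, and all cost and QALY figures below are hypothetical except the 2,000€ upper bound for the test quoted above.

```python
def icer(cost_new, qaly_new, cost_std, qaly_std):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    delta_cost = cost_new - cost_std
    delta_qaly = qaly_new - qaly_std
    if delta_qaly == 0:
        raise ValueError("no QALY difference: ICER is undefined")
    return delta_cost / delta_qaly

def compare(cost_new, qaly_new, cost_std, qaly_std):
    """Describe the new strategy relative to standard care."""
    delta_cost = cost_new - cost_std
    delta_qaly = qaly_new - qaly_std
    if delta_cost <= 0 and delta_qaly >= 0:
        return "dominant (no more costly, no less effective)"
    if delta_cost >= 0 and delta_qaly <= 0:
        return "dominated (no less costly, no more effective)"
    return f"ICER = {icer(cost_new, qaly_new, cost_std, qaly_std):,.0f} EUR per QALY"

# Hypothetical per-patient figures (EUR, QALYs); only the test cost bound is from the text.
TEST_COST = 2000.0
standard = {"cost": 10000.0, "qaly": 8.0}
test_guided = {"cost": 10000.0 + TEST_COST, "qaly": 8.2}

print(compare(test_guided["cost"], test_guided["qaly"],
              standard["cost"], standard["qaly"]))
```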

Once a test has shown sufficient clinical validity, it can be commercialized through a dedicated company, such as NovaGray® for the RILA assay, to promote and market the test.

### CONCLUSION AND PERSPECTIVES

A large number of tests for radiosensitivity have been investigated over the last three decades, and some have proven their validity in multicenter prospective settings. Of the many tests developed over the years, only several SNP assays and the RILA assay have shown replicated performance in the development phase.

The next step that should be undertaken is the large-scale study of these models to implement clinical use and assess cost–utility. This is being carried out in Europe through the ongoing REQUITE project, using the RILA assay, as well as other validated biomarkers (42). The RILA assay incorporated in a nomogram with the other independent factors has already proven its validity in a multicenter study on breast cancer and is currently under evaluation for other cancer types (40).

Similarly, the Radiogenomics Consortium has developed the TAILORED project to validate the concept of stratification to identify cancer patients with increased individual radiosensitivity and provide cost-effective therapeutic interventions to reduce the side effects of RT for cancer. This would allow for a personalized risk-adapted approach to provide more effective treatments.

#### AUTHOR CONTRIBUTIONS

DA participated in the design of the trial, wrote the manuscript, and coordinated the corrections of all the consortium participants. AL participated in the design of the trial, wrote the manuscript, and coordinated the corrections of all the consortium participants. SG, DD, JC, PL, MB, TW, SB, HT, TR, CT, AV, SK, CA, JC-C, CW, CG, and BR participated in writing the manuscript.

#### FUNDING

This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 601826 (REQUITE). This study was also supported by the SIRIC Montpellier Cancer (Grant INCa-DGOS-Inserm 6045), grants and contracts to BSR from the United States National Institutes of Health (1R01CA134444 and HHSN261201500043C), the American Cancer Society (RSGT-05-200-01-CCE), the United States Department of Defense (PC074201 and PC140371), K07CA187546 from the United States National Institutes of Health (SLK), Associazione Italiana Ricerca sul Cancro (AIRC-IG16087), grants to AV from Instituto de Salud Carlos III (FIS PI13/02030 and PI16/00046), and Fondo Europeo de Desarrollo Regional (FEDER 2007-2013).

### REFERENCES


and late adverse effects of earlier breast radiotherapy. *Radiother Oncol* (2016) 119:244–9. doi:10.1016/j.radonc.2016.04.012


consortium statement. *Radiother Oncol* (2016) 121(3):440–6. doi:10.1016/j. radonc.2016.11.003


**Conflict of Interest Statement:** DA participated in the NovaGray start-up creation. The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Azria, Lapierre, Gourgou, De Ruysscher, Colinge, Lambin, Brengues, Ward, Bentzen, Thierens, Rancati, Talbot, Vega, Kerns, Andreassen, Chang-Claude, West, Gill and Rosenstein. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Challenges for Quality Assurance of Target Volume Delineation in Clinical Trials

#### *Amy Tien Yee Chang<sup>1,2</sup>\*, Li Tee Tan<sup>3</sup>, Simon Duke<sup>3</sup> and Wai-Tong Ng<sup>1</sup>*

*<sup>1</sup>Department of Clinical Oncology, Pamela Youde Nethersole Eastern Hospital, Hong Kong, Hong Kong, <sup>2</sup>Department of Clinical Oncology, University of Hong Kong, Hong Kong, <sup>3</sup>Department of Oncology, Cambridge University Hospitals NHS Trust, Cambridge, United Kingdom*

In recent years, new radiotherapy techniques have emerged that aim to improve treatment outcome and reduce toxicity. The standard method of evaluating such techniques is to conduct large-scale multicenter clinical trials, often across continents. A major challenge for such trials is quality assurance to ensure consistency of treatment across all participating centers. Analyses from previous studies have shown that poor compliance and protocol violation have a significant adverse effect on treatment outcomes. The results of the clinical trials may, therefore, be confounded by poor quality radiotherapy. Target volume delineation (TVD) is one of the most critical steps in the radiotherapy process. Many studies have shown large inter-observer variations in contouring, both within and outside of clinical trials. High precision techniques, such as intensity-modulated radiotherapy, image-guided brachytherapy, and stereotactic radiotherapy have steep dose gradients, and errors in contouring may lead to inadequate dose to the tumor and, consequently, reduce the chance of cure. Similarly, variation in organ at risk delineation will make it difficult to evaluate dose response for toxicity. This article reviews the literature on TVD variability and its impact on dosimetry and clinical outcomes. The implications for quality assurance in clinical trials are discussed.

Keywords: target volume delineation variability, contouring guidelines, peer review, education program, clinical trial

#### INTRODUCTION

The last 20 years have seen the emergence of novel anticancer treatments which have the potential to improve clinical outcomes for patients. The standard method of evaluating such treatments is to conduct large-scale multicenter clinical trials, often across continents. Radiotherapy is indicated for more than 50% of all cancer patients (1). Many oncology clinical trials, therefore, include radiotherapy within their treatment protocol even if the radiotherapy technique itself is not the subject of evaluation. Poor radiotherapy technique has been shown to be associated with inferior overall survival in many clinical trials; the benefit of any intervention in a clinical trial may, therefore, be compromised by suboptimal radiotherapy.

The radiotherapy quality assurance (RTQA) program was introduced to standardize radiotherapy across participating centers within a clinical trial. The RTQA program covers all aspects of the radiotherapy process including volume delineation, planning and delivery as well as infrastructure, equipment, personnel, and procedures. Several trial groups have reported that the implementation of RTQA procedures has enhanced protocol compliance and improved clinical trial outcome (2). However, the RTQA procedures in different clinical trials vary considerably, making analysis and inter-trial comparisons to identify the most effective procedures difficult. Moreover, the cost of running a trial RTQA program is substantial, even more so with the introduction of advanced radiotherapy techniques.

#### *Edited by:*

*Stephanie E. Combs, Technische Universität München, Germany*

#### *Reviewed by:*

*Stefan Rieken, University Hospital Heidelberg, Germany Paul Stephen Rava, UMass Memorial Medical Center, United States*

> *\*Correspondence: Amy Tien Yee Chang tienyee.chang@gmail.com*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 30 April 2017 Accepted: 01 September 2017 Published: 25 September 2017*

#### *Citation:*

*Chang AT, Tan LT, Duke S and Ng W-T (2017) Challenges for Quality Assurance of Target Volume Delineation in Clinical Trials. Front. Oncol. 7:221. doi: 10.3389/fonc.2017.00221*

Advanced radiotherapy techniques improve local tumor control and reduce treatment toxicity by delivering higher radiation doses to tumors while sparing adjacent normal tissue. Examples include intensity-modulated radiotherapy (IMRT), which allows the radiotherapy dose to be conformed to the target volume while sparing nearby organs at risk (OAR), and image-guided radiotherapy, which improves the precision of treatment delivery and allows smaller margins to be added to the target volume for delivery uncertainty (3). The benefit of these and other high precision techniques is critically dependent on optimal target volume delineation (TVD) by radiation oncologists as the steep dose gradients and reduced margins leave little room for error. There are numerous reports in the literature of suboptimal TVD, which can lead to fatal marginal recurrences due to geographical miss (4–8).

This article reviews the literature on TVD variability and its impact on dosimetry and clinical outcomes. The current methods for reducing TVD variability within and outside clinical trials and their limitations are discussed.

#### MAGNITUDE OF TVD VARIABILITY

The delivery of radiotherapy treatment has long been subject to careful measurement and evaluation of the causes and magnitude of systematic and random errors. As a result, evidence-based strategies have been developed and universally adopted which have enabled radiotherapy delivery to approach millimeter precision.

In contrast, variability in TVD has not been evaluated with the same rigor. In 2016, Vinod et al. (9) published a systematic review of publications on uncertainties in TVD in radiation oncology. They identified 119 papers on TVD variability published between 2000 and 2014 covering the following clinical topics—breast, bladder, prostate, lung, esophagus, stomach, pancreas, liver, rectum, head and neck, brain, cervix, uterus, lymphoma, sarcoma, palliative radiotherapy, and OAR contouring. A number of studies focused on specific advanced radiotherapy techniques including image-guided brachytherapy (IGBT) for cervical cancer, stereotactic ablative body radiotherapy for lung cancer, and stereotactic radiosurgery for brain metastases.

All the studies showed considerable TVD variability between observers, often measured in centimeters. TVD variability was evident in all the volumes pertinent to radiotherapy planning as specified in ICRU Report 50 (10) published in 1993, i.e., the gross tumor volume (GTV), clinical target volume (CTV), and planning target volume (PTV).

Target volume delineation variability was seen among experienced radiation oncologists as well as trainees. There were also differences between different specialists [diagnostic radiologists, positron emission tomography (PET) physicians, neurosurgeons, orthopedic surgeons, gynecologic oncologists, medical oncologists, hematologists, respiratory physicians] and disciplines (medical physicists and radiation therapists/radiographers). In one highly cited French study of GTV delineation in lung cancer (11), nine radiologists and eight radiation oncologists working in five different centers, classified as either "junior" or "senior" according to their professional experience, were asked to delineate the primary tumor and involved lymph nodes on the computed tomography (CT) images of 10 patients. The study showed that compared to radiation oncologists, radiologists tended to delineate smaller volumes and encountered fewer difficulties in delineating "difficult" cases. Junior doctors also tended to delineate smaller and more homogeneous volumes than their senior colleagues, regardless of their specialty, especially for "difficult" cases.

## CAUSES OF TVD VARIABILITY

Despite the numerous papers on TVD variability within and outside clinical trials, very few have attempted to evaluate the causes of TVD variability in a systematic fashion.

Several studies have reported the impact of imaging modality on TVD variability. For example, a number of studies (12–14) showed that the GTV in lung cancer is defined more consistently when the CT images are co-registered with 2-[18F]-fluoro-2-deoxy-d-glucose PET images. Similarly, studies have shown more consistent definition of the GTV and CTV of brain tumors on CT images co-registered with magnetic resonance imaging (MRI) (15). Image co-registration is now standard practice for both these tumor sites.

It is important to appreciate that reduced TVD variability on one imaging modality does not necessarily mean that it is the superior modality. In a study on IGBT for cervical cancer (16), 23 gynecologic radiation oncology experts were asked to delineate the CTV on CT and MRI. There was a higher level of agreement of contours on CT despite MRI being universally recognized as the superior imaging modality. This probably reflects clinicians' unfamiliarity with interpreting MRI for IGBT cervix planning, where post-radiation changes can be a confounding factor.

It is commonly assumed that the major cause of inter-observer TVD variability is suboptimal image interpretation (17). However, other factors such as conceptual understanding of patterns of tumor spread and organ motion are equally important. In a study on definitive radiotherapy for cervical carcinoma (18), five radiation oncologists and two gynecologists independently contoured the CTVs for three patients. The study showed good consistency in outlined anatomical structures, suggesting that image interpretation was not an issue. However, there was large inter-observer variability in CTV delineation, with the ratio between largest and smallest volumes ranging between 3.6 and 4.9 for all observers. The ratio of common volumes to encompassing volumes ranged between 0.11 and 0.13 for the radiation oncologists, and between 0.30 and 0.57 for the gynecologists.

The TVD variability between gynecologists and radiation oncologists probably reflects different conceptual understanding of areas at risk of microscopic disease between the two specialties. The core skill for gynecologists is to remove the tumor with a small margin (usually 5 mm) with minimal disruption of surrounding tissue. In contrast, radiation oncologists irradiate large volumes of tissue to a relatively homogeneous dose to minimize the risk of in-field and edge recurrences. The concepts of microscopic disease for these two specialties are, therefore, likely to be very different. This explanation could also account for the TVD variability between radiologists and radiation oncologists in the lung cancer study. Cancer radiologists are required to define the tumor accurately (avoiding both under- and overestimation) to predict surgical resectability, whereas the prime concern of radiation oncologists is to avoid missing the tumor. It is, therefore, easy to see why in difficult cases, some radiation oncologists would err on the side of caution and include areas of uncertainty in the GTV. Similarly, it is well recognized that junior doctors are less able to appreciate uncertainties than their senior colleagues, a phenomenon known as the Dunning–Kruger effect, which echoes Charles Darwin's observation that "Ignorance more frequently begets confidence than does knowledge."

Consistency and clarity of conceptual understanding is particularly important when new concepts are introduced. An example is the internal target volume (ITV), a concept first introduced in ICRU Report 62 published in 1999 (19). The ITV is defined as the CTV plus a margin taking into account uncertainties in size, shape, and position of the CTV within the patient. The margin for the ITV (called the internal margin) is distinct from the setup margin used for the PTV. However, in a survey of 50 radiation oncologists at a pelvic IMRT workshop (unpublished), 38% did not use the concept of the ITV in their daily practice, 30% incorporated the internal margin into the CTV, 26% incorporated the internal margin into the PTV, and only 8% contoured the ITV as a separate structure.

## ASSESSMENT OF TVD VARIABILITY

The Vinod et al. review (9) reported that the number of imaging datasets in the studies on TVD variability varied from 1 to 132 with a median of 9, while the number of participants contouring ranged from 3 to 50 with a median of 7. Unlike the literature on setup accuracy, no studies have systematically analyzed the impact of the number of imaging datasets or the number of participants on TVD variability. In those studies where more than one case was used, the magnitude and direction of TVD variability varied considerably between cases, reflecting the variation in patient anatomy and tumor topography.

There was also a wide range of methods used to assess TVD variability. A volume metric (volume measurements, volume ratios) was the most consistently reported measure across studies. Measures of overlap (concordance index, discordance index, Dice similarity coefficient) were also frequently reported. Comparisons were usually made against a reference contour. The definition of a reference contour varied from the contour of a recognized expert to a consensus contour from multiple observers or a Simultaneous Truth and Performance Level Estimation (STAPLE) contour (20) (STAPLE is the probabilistic estimate of the "true" volume generated from all observers). All these methods have an inherent deficiency in that they do not provide any information on the location of any discrepancies or their clinical significance.
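As an illustration of the volume and overlap metrics listed above, the following minimal sketch computes a volume ratio, the Dice similarity coefficient, and a Jaccard-type concordance index for two contours. It assumes that each contour has been rasterized to a set of voxel indices on a common grid; the function names, the `box` helper, and the toy contours are hypothetical and are not taken from any of the cited studies.

```python
from itertools import product


def volume_ratio(observer, reference):
    """Ratio of observer volume to reference volume (voxel counts)."""
    return len(observer) / len(reference)


def dice(observer, reference):
    """Dice similarity coefficient: twice the overlap over the summed volumes."""
    return 2 * len(observer & reference) / (len(observer) + len(reference))


def concordance_index(observer, reference):
    """Jaccard-type concordance index: overlap divided by encompassing volume."""
    return len(observer & reference) / len(observer | reference)


def box(x_range, y_range, z_range):
    """Hypothetical helper: all voxel indices inside an axis-aligned box."""
    return set(product(range(*x_range), range(*y_range), range(*z_range)))


# Toy example: the observer contour is the reference shifted by 2 voxels in x.
reference = box((0, 10), (0, 10), (0, 10))  # 1000 voxels
observer = box((2, 12), (0, 10), (0, 10))   # 1000 voxels, 800 of them shared

print(volume_ratio(observer, reference))       # 1.0
print(dice(observer, reference))               # 0.8
print(concordance_index(observer, reference))  # 800 / 1200
```

In this toy case the volume ratio is 1.0 even though the two contours are displaced, which is one way to see why a volume metric alone says nothing about where a discrepancy lies.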

## DOSIMETRIC IMPACT OF TVD VARIABILITY

Vinod identified only 25 (21%) studies which evaluated the impact of variability in target and OAR contouring on dosimetry (9). Thirteen studies evaluated the dosimetric impact of target volume variability; it was interesting that three of these studies found no significant impact on PTV dose coverage. Ten studies also evaluated the impact of target volume variability on OAR doses; of these, eight studies found a significant impact on OAR dose–volume histograms (DVH). Twelve studies examined the impact of variability in OAR volume delineation; eight of these studies found statistically significant differences in OAR doses.

Vinod classified the analysis of the dosimetric impact of TVD variability into three broad methods. The first method involved a reference plan (usually the treatment plan or a plan optimized to a reference or expert contour) being applied to the volumes of many observers. This technique was used by Hellebust et al. (4) to study the dosimetric impact of contouring variations on a group of patients treated with IGBT for cervix cancer. They found that the dose to the GTV and high-risk CTV (HR-CTV) had the smallest variation compared to the dose to the intermediate risk CTV (IR-CTV). This is perhaps not surprising as the IR-CTV is a new and complex concept, first introduced in 2005, which requires the clinician to integrate the CTV at the time of brachytherapy (BT) with the GTV at diagnosis. For OAR, the dose effect was largest for the sigmoid colon, which again illustrates the greater uncertainty in defining this organ compared to the rectum and bladder. Overall, TVD variability resulted in a deviation of up to 5 Gy to the HR-CTV and up to 3 Gy for OAR.

The same method was used by Loo et al. (5) to investigate the dosimetric impact of variability in OAR contouring for head and neck IMRT. Four radiation oncologists and three radiologists delineated the parotid gland on the CT datasets of 10 patients with oropharyngeal carcinoma treated with parotid-sparing IMRT. The DVH for each study contour was calculated using the IMRT plan actually delivered for that patient and was compared with the original DVH obtained when the plan was used clinically. The mean parotid dose achieved during actual treatment was within 10% of 24 Gy for all patients. However, using the study contours, the mean parotid dose was within 10% of 24 Gy for only 53% of volumes by radiation oncologists and 55% of volumes by radiologists. The parotid DVH of 46% of the study contours were sufficiently different from the clinical DVH, such that a different IMRT plan would have been produced.

The second method as identified by Vinod is the converse of method one. In this method, the plans generated from many observer volumes are assessed for resultant dosimetry on a reference volume. This method was used in the INTERLACE study on IMRT for cervix cancer (6). No plan generated from the observer volumes was found to achieve the optimal gold standard PTV (GS-PTV) coverage; on average, the resultant dose (V95%, D95%) was 10–20% lower. The GS-PTV volume outside the 95% isodose ranged from 83 to 458 cc. A qualitative assessment showed the most common anatomical areas not covered by the 95% isodose were vagina, obturator, and nodal regions such as external iliac nodes.
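The coverage metrics quoted above can be made concrete with a small sketch. It assumes that the dose to each equal-sized voxel of a target volume is available as a list of values in Gy; V95% is taken here as the fraction of the volume receiving at least 95% of the prescribed dose, and D95% as the minimum dose received by the best-covered 95% of the volume. The function names and the toy dose values are illustrative assumptions, not the INTERLACE analysis.

```python
def v95(voxel_doses, prescription):
    """Fraction of voxels receiving at least 95% of the prescribed dose."""
    threshold = 0.95 * prescription
    return sum(d >= threshold for d in voxel_doses) / len(voxel_doses)


def d95(voxel_doses):
    """Minimum dose received by the best-covered 95% of voxels."""
    ranked = sorted(voxel_doses, reverse=True)
    # Integer ceiling of 0.95 * N, avoiding floating-point rounding.
    n_covered = (95 * len(ranked) + 99) // 100
    return ranked[n_covered - 1]


# Toy target of 20 equal-sized voxels, hypothetical prescription of 50 Gy.
doses = [50.0] * 16 + [46.0] * 3 + [40.0]

print(v95(doses, prescription=50.0))  # 0.8
print(d95(doses))                     # 46.0
```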

In the first two methods, there is an assumption that the reference plan is "correct" and based on a "gold standard" volume which is again correct. If the reference plan is based on a volume that is an outlier compared to the contours being analyzed, the systematic differences measured may be amplified. In contrast, the third method involves a comparison of all plans applied to all contours without a reference. A plan is optimized to a particular delineated volume and then applied to all other volumes to assess dosimetry. This is then repeated for each observer's volume. This allows for the most in-depth comparison of dosimetry relating to TVD variability but is also the most resource-intensive.
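A minimal sketch of the third method follows. It assumes that each observer's contour, and the high-dose region of the plan optimized to that contour (for example, the region enclosed by the 95% isodose), are both available as sets of voxel indices. Coverage of contour j by plan i is then the fraction of contour j lying inside the high-dose region of plan i. All names and the one-dimensional toy data are hypothetical.

```python
def coverage(high_dose_region, contour):
    """Fraction of a contour's voxels inside a plan's high-dose region."""
    return len(high_dose_region & contour) / len(contour)


def cross_coverage(plans, contours):
    """Apply every plan to every contour; rows are plans, columns are contours."""
    return [[coverage(plan, contour) for contour in contours] for plan in plans]


def mean_matched_unmatched(matrix):
    """Mean coverage for matched (diagonal) and unmatched (off-diagonal) pairs."""
    n = len(matrix)
    matched = [matrix[i][i] for i in range(n)]
    unmatched = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(matched) / len(matched), sum(unmatched) / len(unmatched)


# One-dimensional toy data: three observers' contours as voxel index ranges.
contours = [set(range(0, 10)), set(range(2, 12)), set(range(0, 20))]
# Hypothetical plans: each contour expanded by a 1-voxel margin on both sides.
plans = [{v + dv for v in c for dv in (-1, 0, 1)} for c in contours]

matrix = cross_coverage(plans, contours)
matched, unmatched = mean_matched_unmatched(matrix)
print(matched)    # 1.0: every plan fully covers the contour it was built on
print(unmatched)  # lower: plans applied to other observers' contours
```

Because no contour is treated as the reference, the full matrix has to be computed, which is why this approach scales with the square of the number of observers.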

The third method was used in a lung cancer study by Van de Steene et al. (21) in which five clinicians were asked to define the GTV (tumor and lymph node) on the planning CT scans of eight patients. For each volume, a standard conformal treatment plan comprising two pairs of opposed antero-posterior and lateral beams was created. The study reported inter-observer variation in the dimensions of the primary tumor of up to 4.2 cm (transverse), 7.9 cm (cranio-caudal), and 5.4 cm (antero-posterior). The variation in the extreme extensions of the GTV (tumor and lymph nodes) ranged from 2.8 to 7.3 cm. After common review, only 63% of involved lymph node regions were delineated by the clinicians (i.e., 37% were false negative). The probability (in the population of all conformal plans) of irradiating at least 95% of the GTV with at least 95% of the nominal treatment dose decreased from 96% for a matched plan (i.e., a plan created for that GTV volume) to 88% for an unmatched plan.

The authors suggested four possible causes for the large inter-observer variation: problems with methodology including definitions and concepts (e.g., definition of GTV to exclude atelectasis, definition of involved lymph nodes based on size, contouring of individual lymph nodes, or lymph node regions), difficulty differentiating between tumor and benign pathology (e.g., atelectasis), difficulty differentiating between tumor and normal structures, and lack of knowledge of anatomy. Interestingly, they also concluded that only a minority of the issues could be resolved objectively.

## CLINICAL IMPACT OF TVD VARIABILITY

There are no studies which have assessed the direct impact of TVD variability on clinical outcome.

Peters et al. (8) retrospectively analyzed 780 patients in the Trans-Tasman Radiation Oncology Group 02.02 (TROG 02.02) HeadSTART trial in head and neck cancer and found that patients whose radiotherapy plans failed trial quality assurance (12% overall) had poorer survival and loco-regional control compared to those with protocol-compliant plans [2-year overall survival (OS) 50 vs. 70%, *p* < 0.001, 2-year loco-regional control 54 vs. 78%, *p* < 0.001]. However, incorrect volume delineation was a feature in only 25% (24/97) of non-compliant plans.

A number of studies have modeled the potential impact of TVD variability. Van de Steene et al. (21) estimated the impact of GTV delineation variability on tumor control probability (TCP). Across all plans, the mean TCP decreased from 51% for a matched plan (i.e., a plan created for that GTV volume) to 42% for an unmatched plan (i.e., a plan created for another GTV), a difference of 9 percentage points. The mean range in TCP across the eight patients was 2% (maximum range 5%) for matched plans compared to 14% (maximum 31%) for unmatched plans. They also estimated the normal tissue complication probabilities for different OAR, but this analysis was of limited value as the plans used were 4-field boxes, which would not have been used clinically.

Jameson et al. (7) also modeled the impact of GTV delineation variability on TCP and equivalent uniform dose (EUD) in lung cancer. Three radiation oncologists contoured the GTV on the planning CT, the diagnostic PET–CT and the radiotherapy planning PET–CT for seven patients. An optimized plan with 3–5 conformal beams was created for each volume. The SD of the volumes across all seven patients ranged from 39 to 419 cc. However, the SD of the EUD was ≤1 Gy in four of the seven patients (range 0.09–21.2 Gy). Similarly, the SD of the TCP was negligible (0–1%) in four of the seven patients (range 0–22%). Contouring variations in the lateral dimensions had the greatest impact on EUD and TCP.
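To show how a dose distribution can be condensed into an EUD and a TCP estimate, the sketch below uses the generalized EUD (a power mean of the voxel doses with exponent a) and a logistic dose-response curve. These model choices and the parameter values (`A`, `TCD50`, `GAMMA50`) are assumptions for illustration only; the text above does not state which models or parameters were used in the cited study.

```python
def generalized_eud(voxel_doses, a):
    """Generalized EUD: power mean of voxel doses with exponent a (a != 0)."""
    n = len(voxel_doses)
    return (sum(d ** a for d in voxel_doses) / n) ** (1.0 / a)


def logistic_tcp(eud, tcd50, gamma50):
    """Logistic dose-response: TCP = 1 / (1 + (TCD50 / EUD) ** (4 * gamma50))."""
    return 1.0 / (1.0 + (tcd50 / eud) ** (4.0 * gamma50))


# Hypothetical parameters: a negative exponent makes cold spots dominate.
A, TCD50, GAMMA50 = -10.0, 60.0, 2.0

uniform = [60.0] * 100                 # whole GTV receives 60 Gy
cold_spot = [60.0] * 90 + [40.0] * 10  # 10% of the GTV is underdosed

for name, doses in (("uniform", uniform), ("cold spot", cold_spot)):
    eud = generalized_eud(doses, A)
    print(name, round(eud, 1), round(logistic_tcp(eud, TCD50, GAMMA50), 2))
```

With these assumed parameters, a uniform dose equal to TCD50 gives a TCP of 0.5 by construction, while the plan with an underdosed region gives a lower EUD and a lower TCP.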

## MINIMIZING TVD VARIABILITY IN ROUTINE PRACTICE

Several interventions have been developed to reduce interobserver TVD variability. These have been reviewed in another publication by Vinod et al. (21).

### Contouring Guidelines and Atlases

The most common method for reducing TVD variability within and outside clinical trials is probably the use of consensus contouring guidelines and/or atlases (22, 23). Lobefalo et al. (24) evaluated the benefit of a contouring guideline on consistency of TVD in a study of rectal cancer. Four radiation oncologists contoured the CTV on 10 patients before and after the introduction of shared guidelines. The Agreement Index improved from 0.57 (pre-guideline) to 0.69 (post-guideline). The unmatched PTV coverage improved from 93.7 ± 9.2 to 96.6 ± 4.9% for 3D conformal radiotherapy and 86.5 ± 13.8 to 94.5 ± 7.5% for a volumetric modulated arc radiotherapy (VMAT) technique. This suggests that the dosimetric impact of inter-observer variation is more pronounced for advanced radiotherapy techniques.

Eminowicz et al. (22) from the INTERLACE trial reported the reduction of inter-observer contouring variation and increased protocol adherence after introduction of an atlas. They analyzed seven key guidelines for target volume contouring in cervical cancer and identified 11 common areas of variation. A pictorial atlas was then derived to illustrate a consistent delineation method for these areas. The average proportion of outlines (of 4; primary CTV, nodal CTV, bladder, rectum) complying with the protocol improved from 1.8/4 to 2.7/4 with atlas use.

While contouring guidelines are undoubtedly invaluable in making TVD more consistent, they can also be a source of variability if different groups produce conflicting guidelines for the same tumor site or anatomical region. For example, the GYN consortium consensus guidelines for CTV delineation for IMRT for cervix cancer defines the lateral border of the parametrium as the medial edge of internal obturator muscle/ischial ramus (i.e., lateral to the pelvic vessels) whereas the EMBRACE-II guidelines define this border as the medial edge of internal iliac and obturator vessels. Similarly, the inferior border of the presacral nodes has been defined as S2 in gynecological guidelines (23, 25), S3 in prostate guidelines (26, 27) and bottom of the coccyx in anal guidelines (28, 29). It is easy to see how a clinician used to contouring in a particular way will continue to do so in a clinical trial regardless of the protocol specification.

### Multi-Modality Imaging

Improved imaging, e.g., use of intravenous contrast, optimal window settings, and multi-modality imaging, is an intuitive way to improve TVD consistency. In the Vinod et al. review (9), there were more published studies using this method than all other methods combined. However, results have been mixed and 9 of the 31 studies reviewed did not demonstrate a statistically significant reduction in TVD variability. It appears that interpretation of the additional imaging modality and image co-registration are sources of error in themselves.

### Auto-Contour Provision

A few studies have reported improved TVD consistency from clinicians editing an auto-contour compared to manual delineation (21). However, if the auto-contour contains an error, then this is more likely to be transmitted through the manual editing process as a systematic error. Most auto-contouring software in clinical use relies on atlas-based segmentation, which always requires manual review and adjustment due to the wide variation in normal and post-treatment anatomy. Machine learning techniques hold promise for increasing accuracy and reducing the burden of user editing, as discussed in a review by Sharp et al. (30).

### Contouring Workshops and Educational Programs

Several publications have reported the benefit of contouring workshops on reducing TVD variability. An example is an International Atomic Energy Agency study over a 1-year period involving 11 pairs of clinicians comprising a radiation oncologist and a nuclear medicine physician (31). Training consisted of lectures, contouring practice, and group and individualized feedback. Following the first training, overall concordance indices for three repeated cases increased from 0.57 ± 0.07 to 0.66 ± 0.07. After further training, overall concordance indices for another three repeated cases further increased from 0.64 ± 0.06 to 0.80 ± 0.05 (*p* = 0.01).

Contouring workshops are a popular method for teaching TVD but they have several limitations. In most cases, improvement is measured by re-contouring on the same cases and it is difficult to ascertain whether learning is transferred to different cases with different patient anatomy and tumor topography. The number of participants is limited by logistics and cost.

Recent advances in technology such as web-enabled video conferencing and interactive software have enabled both live and offline educational interventions to reach across geographical boundaries. An example is the FALCON program (Fellowship in Anatomic delineation and Contouring), offered by the European Society for Radiotherapy & Oncology (32). However, online workshops will face the same pedagogical issues as live ones.

A few contouring tools have been developed to support self-learning TVD programs. These tools offer delineation practice, often with provision of a reference volume and/or automated feedback. These programs are in their infancy and their utility remains to be established. Issues include difficulty in defining a reference volume given the extent of disagreement in TVD among experts, challenges for user engagement, and outdated internet access, particularly in hospitals.

### Peer Review

Peer review involves the review of aspects of radiotherapy treatment by two or more radiation oncologists, or another specialist such as a radiologist. It may cover indications for treatment, treatment approach, volume delineation, planning directives, evaluation of plan quality and/or treatment verification. The American Society for Radiation Oncology has identified TVD as the first priority for peer review due to the heterogeneity in contouring and its impact on the rest of the radiotherapy process (33).

Multiple audits of peer review have identified that a proportion of radiotherapy treatments require significant alteration. In an early study (34), 3,052 cases were reviewed over 8 years of which 4.1% were "not approved." More recently, Mackenzie et al. (35) presented a prospective audit of peer review meetings in breast, head and neck, and lung cancer. Overall 9% of treatments required alteration before the first or next fraction of radiotherapy, although this varied significantly across the tumor sites (1–16%). A study by Dimigen et al. (36) reported that involving a radiologist in weekly QA meetings resulted in a significant change in management in 6% of cases.

Multiple professional organizations now advocate peer review as an important component of safe and effective radiotherapy. However, there are significant barriers to its implementation including a lack of personnel, dedicated time and facilities, and a reluctance of clinicians to invite scrutiny, especially across institutions. Given its cost and resource implications, rigorous research to evaluate its benefit is urgently needed. Technologies which allow large scale remote assessment of contours would be hugely advantageous.

## MINIMIZING TVD VARIABILITY IN CLINICAL TRIALS

The process for RTQA of TVD in clinical trials may involve one or more of the following (37):


Most of the reports on RTQA for TVD have used benchmark cases. An example is the INTERLACE study on IMRT for cervix cancer. The principal investigators (PIs) of participating centers were asked to contour the CTV on two cases with different FIGO stages. Twenty-one outlines were compared for case 1 and 22 for case 2. The delineated volumes ranged from 340 to 676 cc for case 1 and 458 to 806 cc for case 2. The direction of the maximum variation was different in the two cases.

The EMBRACE-I study on IGBT for cervix cancer is an example of RTQA based on a dummy run (38). Each center was asked to upload a "good response" case and a "poor response" case for central review. The review was qualitative with one physician reviewing all the external beam radiotherapy (EBRT) contours and three other physicians reviewing the BT contours. Out of 30 submitting centers, 13 had major inconsistencies in BT contouring while 11 had major inconsistencies in EBRT contouring. Centers with experience in IGBT (>30 cases) performed better than those with limited experience.

Retrospective individual case review was reported by the SCALOP trial in pancreatic cancer (39). The chief investigator and a radiologist contoured the GTV for 60 of the 74 patients who received radiotherapy in the study (12 patients had planning CTs which were deemed to be of insufficient quality for re-contouring) and compared their gold standard contours with the treating clinicians' contours using the Jaccard conformity index and geographical miss index. The median geometric indices for GTV and PTV seen in on-trial patients were better than in the pre-trial benchmark case, suggesting that, overall, the quality of tumor delineation was acceptable and that the pre-trial RTQA may have enhanced the quality of tumor delineation within the main trial. However, tumor was completely missed in one patient, and ≥50% of the tumor was missed in three cases. The authors reported that patients with a Jaccard conformity index for GTV ≥ 0.7 had 7.12-fold (95% CI: 1.83–27.67, *p* = 0.005) higher odds of progressing by 9 months in multivariate analysis, which is counter-intuitive.

## DISCUSSION

Our review has found that although there are numerous publications reporting considerable TVD variability within and outside clinical trials, there are very few which have investigated the causes of the variability or its impact on actual clinical outcomes. The limited data on outcomes are conflicting, with modeling papers suggesting different impact on TCP in different patterns, which is perhaps not surprising. The one paper which correlated TVD variability with outcomes found that higher concordance with the gold standard contours was associated with worse outcome. All the data to date suggest that the relationship between TVD variability and outcome is not straightforward and further research is required. Similarly, several educational strategies have been put forward to minimize TVD variability, but there is little systematic research into the effectiveness of the strategies and, more importantly, whether learning is retained.

The problem is particularly acute for clinical trials due to the requirement to assess clinicians from many participating centers, in dispersed locations. The logistics are such that most clinical trials limit their RTQA process to the PIs who are probably the most likely to contour correctly. Similarly, most RTQA is based on 1 or 2 carefully chosen benchmark cases which does not take into account patient anatomy and difficult topography. The assessment process is usually subjective and there may be a conflict of interest for the central review team to "pass" centers in order to increase trial recruitment.

In 2010, the Global Clinical Trials RTQA Harmonization Group (GHG) (40) was established to

The aim is to increase cooperation between trial groups internationally and facilitate the exchange and interpretation of RTQA data.

Perhaps a neglected opportunity in clinical trials is the potential to use RTQA content for systematic education. This strategy has been adopted in the EMBRACE-II study of IMRT and IGBT in cervix cancer (www.embracestudy.dk). In addition to workshops and annual update meetings, the study has set up an online continuous education program for all study participants. The program includes a number of educational resources not commonly available in clinical trials such as training contouring cases and quizzes. The quizzes in particular have been popular with participants and have identified gaps in knowledge and participant comprehension of the protocol. This has enabled the trial management group to develop targeted learning resources which should hopefully improve protocol compliance. The aim is to eventually make these resources available to non-trial participants as well.

## CONCLUSION

Target volume delineation variability is a significant problem in radiotherapy both within and outside clinical trials. More research is required to evaluate the causes of variability and its impact on dosimetry and clinical outcome.

## AUTHOR CONTRIBUTIONS

AC: draft outline and final manuscript. LT and SD: revision of some sections of the manuscript and organization of references. W-TN: revision and review of the manuscript.

## REFERENCES

1. Delaney G, Jacob S, Featherstone C, Barton M. The role of radiotherapy in cancer treatment. *Cancer* (2005) 104:1129–37. doi:10.1002/cncr.21324

2. Ohri N, Shen X, Dicker AP, Doyle LA, Harrison AS, Showalter TN. Radiotherapy protocol deviations and clinical outcomes: a meta-analysis of cooperative group clinical trials. *J Natl Cancer Inst* (2013) 105:387–93. doi:10.1093/jnci/djt001

segmentation. *IEEE Trans Med Imaging* (2004) 23(7):903–21. doi:10.1109/TMI.2004.828354

brachytherapy for cervical cancer: final results of the EMBRACE study dummy run. *Radiother Oncol* (2015) 117(3):548–54. doi:10.1016/j.radonc.2015.08.001


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Chang, Tan, Duke and Ng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Electronic Support for Retrospective Analysis in the Field of Radiation Oncology: Proof of Principle Using an Example of Fractionated Stereotactic Radiotherapy of 251 Meningioma Patients

#### *Edited by:*

*Kerstin Anne Kessel, Technische Universität München, Germany*

#### *Reviewed by:*

*John E. Mignano, Tufts University School of Medicine, USA; John C. Roeske, Loyola University Medical Center, USA*

#### *\*Correspondence:*

*Dorota Lubgan dorota.lubgan@uk-erlangen.de*

*† These authors contributed equally to this work.*

*‡ The present work was performed in (partial) fulfillment of the requirements for obtaining the degree "Dr. rer. biol. hum."*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 07 November 2016 Accepted: 24 January 2017 Published: 09 February 2017*

#### *Citation:*

*Rutzner S, Fietkau R, Ganslandt T, Prokosch H-U and Lubgan D (2017) Electronic Support for Retrospective Analysis in the Field of Radiation Oncology: Proof of Principle Using an Example of Fractionated Stereotactic Radiotherapy of 251 Meningioma Patients. Front. Oncol. 7:16. doi: 10.3389/fonc.2017.00016*

*Sandra Rutzner¹†‡, Rainer Fietkau¹, Thomas Ganslandt², Hans-Ulrich Prokosch² and Dorota Lubgan¹\*†*

*¹Department of Radiation Oncology, Erlangen University Hospital, Erlangen, Germany, ²Chair of Medical Informatics, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen, Germany*

Introduction: The purpose of this study is to verify the possible benefit of a clinical data warehouse (DWH) for retrospective analysis in the field of radiation oncology.

Material and methods: We manually and electronically (using DWH) evaluated demographic, radiotherapy, and outcome data from 251 meningioma patients, who were irradiated from January 2002 to January 2015 at the Department of Radiation Oncology of the Erlangen University Hospital. Furthermore, we linked the Oncology Information System (OIS) MOSAIQ® to the DWH in order to gain access to irradiation data. We compared the manual and electronic data retrieval methods in terms of congruence of data, corresponding time, and personnel requirements (physician, physicist, scientific associate).

Results: The electronically supported data retrieval (DWH) yielded an average of 93.9% correct data, a significantly (*p* = 0.009) better result than manual data retrieval (91.2%). Utilizing a DWH enables the user to replace large amounts of manual activities (668 h) and offers the ability to significantly reduce data collection time and labor demand (35 h), while simultaneously improving data quality. In our case, work time for manual data retrieval was 637 h for the scientific assistant, 26 h for the medical physicist, and 5 h for the physician (total 668 h).

Conclusion: Our study shows that a DWH is particularly useful for retrospective analysis in the radiation oncology field. Routine clinical data for a large patient group can be provided ready for analysis to the scientist and data collection time can be significantly reduced. Furthermore, linking multiple data sources in a DWH offers the ability to improve data quality for retrospective analysis, and future research can be simplified.

Keywords: clinical data warehouse, MOSAIQ®, routine clinical data, secondary use of data, data retrieval, stereotactic radiotherapy, meningioma

## INTRODUCTION

Routinely documented clinical data are of great importance for patient care as well as for research purposes (1, 2).

So far, the retrospective analyses in medical research have been predominantly performed manually, meaning that clinical data are often transferred by hand from routine clinical reports into a separate research database (3) and stored in standard office tools (e.g., Microsoft Excel spreadsheets), which are not validated for clinical research. The continuously increasing expansion of electronic documentation in the clinical treatment process creates a large amount of various databases (4); thus, manual retrospective analysis is currently quite ambitious and time consuming.

In the field of radiation oncology, data sets are large and heterogeneous (5). Electronic information systems contain patients' data for imaging in the Radiology Information System and Picture Archiving and Communication System, for irradiation in the Clinical Information System (CIS), e.g., Oncology Information System (OIS, MOSAIQ®) and data of the current course of the patients' disease in the electronic health record (EHR, e.g., Soarian® Clinicals).

With the increasing amount of patient information captured in EHRs and CISs, more opportunities should be established to facilitate clinical research by obtaining routine clinical data from distributed databases for secondary use, though providing access to routine clinical data for secondary use is challenging in practice (6). One of the greatest challenges in clinical research is to define and implement health data standards for integration between routinely used subsystems (7, 8). Medical data are frequently distributed across multiple electronic information systems of several departments in different documentation styles (9). Although most university hospitals have already implemented commercial hospital information systems and started to develop comprehensive EHRs, there is still a gap between clinical care and the use of these data for medical research that needs to be filled (10, 11). Recent studies have focused on providing routine clinical data for research purposes, e.g., by using a single-source tumor documentation or supporting systems for patient recruitment into clinical trials in the field of radiation oncology (12) and intensive care (13).

Data warehouses (DWHs) are central repositories of integrated data from one or more disparate sources. They store current and historical data and are used for creating analytical reports for knowledge workers throughout the enterprise (14). The purpose of this study is to verify the possible benefit of a DWH for retrospective analysis and reflect differences in manual and automated data retrieval.

Using meningioma patients as an example, we performed a therapy evaluation by utilizing the integrated electronic research database system (clinical DWH) of the Erlangen University Hospital (UKER) to make routine radiotherapy data available from various operational subsystems. This is one of the largest populations of meningioma patients treated with stereotactic radiotherapy (SRT) in a single institution with a comprehensive database due to a high overall survival rate and a long observation period of meningioma patients after SRT.<sup>1</sup>

We manually and electronically collected basic information (patient characteristics), radiotherapy, and outcome data of 251 meningioma patients, who were irradiated from January 2002 to January 2015 at the Department of Radiation Oncology of the UKER (see text footnote 1). Currently, manual data collection represents the "gold standard." In our study, we compared the results of both the electronic and manual data retrieval process and determined the congruence of data. Moreover, we measured the corresponding time requirements for both retrieval methods and the involvement of personnel (physician, physicist, scientific associate).

#### MATERIALS AND METHODS

#### Environment

Erlangen University Hospital (UKER) is a tertiary care hospital that has 1,368 beds and combines 24 departments, 18 independent divisions, 7 institutes, and 25 interdisciplinary centers. In 2015, over 60,000 inpatient and nearly 475,000 outpatient cases were treated (15). At the Department of Radiation Oncology, 130–150 patients with many different tumor entities are irradiated daily. Approximately 32 patients with meningioma are irradiated annually.

For our study, an agreement for the usage of routine clinical data was signed by those departments of the UKER that were involved in the patients' treatment (Neurosurgery, Neurology, Neuropathology, and Radiology). These regulatory requirements and institutional policies need to be reconciled to use clinical routine data for clinical research activities.

### Principles of Radiotherapy of Intracranial Meningioma

During the past two decades, SRT has become increasingly well known as a treatment option for meningiomas (16, 17). Adjuvant SRT is offered to all grade II and III meningioma patients, whereas symptomatic grade I meningioma patients receive SRT only after incomplete resection. Inoperable grade I (symptomatic only) and grade II and III meningiomas are treated with primary SRT. SRT was performed using the stereotactic radiosurgery system Novalis™ (BrainLAB, Feldkirchen, Germany). Patients were treated on consecutive workdays, with one fraction per day (see text footnote 1). SRT was mostly given in 28, 30, or 25 fractions to a median reference dose of 54.0 Gy.

### Scientific Objective of the Retrospective Analysis

Based on the example of 251 patients with 275 intracranial meningiomas treated with SRT between January 2002 and January 2015 at the Department of Radiation Oncology of the UKER, we have illustrated the workflow of manual and electronically supported data retrieval for this analysis. To determine the efficacy of SRT on long-term outcome (e.g., overall survival, local control), the relevant parameters (age, gender, tumor localization, WHO grading, and current disorders after radiotherapy), data of the computed tomography (CT) or magnetic resonance (MR) imaging (to determine the tumor status after therapy), and temporal dose distribution [fractionation, target volume (PTV), dose distribution of risk organs] were evaluated.

<sup>1</sup>Lubgan D, Rutzner S, Semrau S, Lambrecht U, Roessler K, Buchfelder M, et al. Effective long-term local results and prognostic factors after fractionated stereotactic radiotherapy of 257 intracranial meningeoma. *J Neuro Oncol* (submitted for publication).

#### Workflow of Manual and Electronically Supported Data Collection for Retrospective Analysis

#### Manual Data Retrieval

For the purpose of retrospective analysis, the Department of Radiation Oncology begins with specifying the research question and defining the patient collective. Here, the patient collective was identified by multiple reference sources (e.g., outdated medical records and databases, institutional statistics) and manually summarized in a separate chart (Microsoft Excel 2010). All medical data in the routine CISs and necessary data elements for each patient were manually and separately noted in an electronic document using Microsoft Excel 2010. The systems used for manual analysis are listed in **Table 1**.

To evaluate the time required for manual data retrieval, we documented the time needed to collect all necessary data elements from clinical source systems and manually transcribed them into an Excel spreadsheet.

#### Electronically Supported Data Retrieval

In order to simplify retrospective analysis, we decided to use a tool that obtains routine clinical data from multiple CISs for secondary use. Since 2003, the UKER has provided the clinical DWH research platform to scientists for numerous analyses. It has the ability to combine data from multiple clinical source systems and to provide them to the hospital users. The DWH stores clinical and administrative data from 22 different data sources (e.g., Accounting, Pharmacy, Surgery, Anesthesia, Pathology, and Radiology). For transformation of routine clinical data, it utilizes the open enterprise-class platform Cognos Data Manager. The database language Structured Query Language (SQL) is used for defining data structures, editing, and querying the databases.

We used the DWH for defining a patient collective and obtaining routine clinical data from multiple CISs. The workflow of manual and electronically supported data retrieval for retrospective analysis is illustrated in **Figure 1**.

A database query based on routine clinical data from patient care was initiated to design a core data set for retrospective analysis (date of the last contact, date of the last imaging, life-status, beginning and end of the radiotherapy, fractionation, and dose). Selected data elements and the related data source system are shown in **Table 2**.
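The idea of a core data set assembled from several source systems can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the patient identifiers, the example values, and the assignment of fields to the three sources (`ehr`, `radiology`, `ois`) are hypothetical and are not taken from the UKER systems or from Table 2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CoreRecord:
    """One row of the core data set; fields mirror the data elements listed above."""
    patient_id: str
    last_contact: Optional[str] = None
    last_imaging: Optional[str] = None
    life_status: Optional[str] = None
    rt_start: Optional[str] = None
    rt_end: Optional[str] = None
    fractions: Optional[int] = None
    dose_gy: Optional[float] = None


def build_core_data_set(*sources):
    """Merge per-source dictionaries (patient_id -> fields) into one record per patient."""
    records = {}
    for source in sources:
        for patient_id, fields in source.items():
            record = records.setdefault(patient_id, CoreRecord(patient_id))
            for name, value in fields.items():
                if not hasattr(record, name):
                    raise KeyError(f"unknown data element: {name}")
                setattr(record, name, value)
    return records


# Hypothetical example values, one dictionary per source system.
ehr = {"P1": {"last_contact": "2014-11-03", "life_status": "alive"}}
radiology = {"P1": {"last_imaging": "2014-10-20"}}
ois = {
    "P1": {"rt_start": "2013-02-04", "rt_end": "2013-03-15", "fractions": 30, "dose_gy": 54.0},
    "P2": {"rt_start": "2014-05-05", "rt_end": "2014-06-13", "fractions": 28, "dose_gy": 50.4},
}

core = build_core_data_set(ehr, radiology, ois)
for record in core.values():
    print(record)
```

Fields that no source delivers stay `None`, which makes gaps in the routine documentation visible instead of silently dropping the patient.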

The official system used for coding diagnoses is the German Modification of the 10th Revision of the International Statistical Classification of Diseases (ICD-10-GM); procedures are coded with the German "Operationen- und Prozedurenschlüssel Version 2015."

Currently, not all listed data elements or source systems are accessible for the DWH (e.g., tumor as cause of death in the GTDS, the minimum or maximum dose, PTV-volume, coverage PTV, dose distribution on risk organs documented in the treatment planning software) or there were no suitable methods available for the extraction of the data elements (e.g., tumor localization, WHO grading, or several radiotherapy data elements documented in Soarian® Clinicals) at the time of analysis (**Table 2**). Therefore, they are not included in the electronic analysis.

### Integrating OIS MOSAIQ**®** into the Clinical DWH of the UKER: Reusing Data from the OIS MOSAIQ**®** for Retrospective Analysis

Since 2012, the Department of Radiation Oncology has used the OIS MOSAIQ® developed by Elekta (Hamburg, Germany). It provides medical oncology data (e.g., demographic data, diagnoses, beginning and end of the radiotherapy, planned and administered fractionation and doses), regulates the respective linear accelerator, and is linked to imaging, planning, and therapy systems.

TABLE 1 | Summary of clinical information systems (CISs) for manual data retrieval to evaluate routine medical data for retrospective analysis of patients with meningioma treated with stereotactic radiotherapy.

In order to make irradiation data available for retrospective analysis, we analyzed the table structure of the clinical system and transferred a copy of the relevant data tables as a read-only user during the non-productive clinical stage of radiotherapy (after 5 p.m.) into the staging area of the DWH. This process is called "*extraction*." As a next step, we queried the DWH to select patients with a diagnosis of meningioma (ICD-10-GM codes D32.0, D32.9, C70.0, C70.9) and to identify the data elements *beginning and end of the radiotherapy*, *planned and administered fractionation and dose distribution*. Subsequently, we compared the results of the database query and the manual data retrieval.

In addition, unnecessary or inconsistent data can be corrected or removed in the staging area. This process is called "*transformation*." The entire process is called *ETL* (*extraction*, *transformation*, *loading*) (18). The structure of the DWH and technical implementation of the clinical source system MOSAIQ® is illustrated in **Figure 2**.
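The three ETL steps and the subsequent diagnosis filter can be illustrated with a minimal, self-contained Python sketch that uses in-memory SQLite databases as stand-ins for the source system and the warehouse. The table and column names (`treatment`, `dwh_treatment`, `icd_code`, and so on), the example rows, and the specific cleaning rules are assumptions made for illustration; they do not describe the MOSAIQ® schema or the Cognos Data Manager configuration. For brevity the staging area is collapsed into the `transform` function.

```python
import sqlite3

# ICD-10-GM codes for meningioma as named in the text.
MENINGIOMA_CODES = ("D32.0", "D32.9", "C70.0", "C70.9")


def extract(source):
    """Extraction: read a copy of the relevant source rows."""
    return source.execute(
        "SELECT patient_id, icd_code, rt_start, rt_end FROM treatment"
    ).fetchall()


def transform(rows):
    """Transformation: drop incomplete or inconsistent rows, normalize codes."""
    cleaned = []
    for patient_id, icd_code, rt_start, rt_end in rows:
        if not (patient_id and icd_code and rt_start and rt_end):
            continue  # incomplete record
        if rt_end < rt_start:
            continue  # end before start (ISO dates compare correctly as text)
        cleaned.append((patient_id, icd_code.strip().upper(), rt_start, rt_end))
    return cleaned


def load(warehouse, rows):
    """Loading: write the cleaned rows into the warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS dwh_treatment "
        "(patient_id TEXT, icd_code TEXT, rt_start TEXT, rt_end TEXT)"
    )
    warehouse.executemany("INSERT INTO dwh_treatment VALUES (?, ?, ?, ?)", rows)


# Hypothetical stand-in for the clinical source system.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE treatment (patient_id TEXT, icd_code TEXT, rt_start TEXT, rt_end TEXT)"
)
source.executemany(
    "INSERT INTO treatment VALUES (?, ?, ?, ?)",
    [
        ("P1", "D32.0", "2013-02-04", "2013-03-15"),
        ("P2", "c70.9 ", "2014-05-05", "2014-06-13"),  # needs normalizing
        ("P3", "C71.1", "2013-09-02", "2013-10-11"),   # not a meningioma code
        ("P4", "D32.9", "2014-08-01", "2014-07-01"),   # inconsistent dates
    ],
)

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(source)))

# Query the warehouse for meningioma patients only.
placeholders = ", ".join("?" for _ in MENINGIOMA_CODES)
meningioma = warehouse.execute(
    f"SELECT patient_id FROM dwh_treatment "
    f"WHERE icd_code IN ({placeholders}) ORDER BY patient_id",
    MENINGIOMA_CODES,
).fetchall()
print(meningioma)
```

Keeping the three steps as separate functions mirrors the separation described above: the source is only read, cleaning happens before anything reaches the warehouse, and the analysis query runs against the warehouse rather than the productive system.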

#### Statistical Analysis and Ethics Committee Vote

Standard summary statistics and two-tailed 95% confidence intervals were calculated as appropriate. All statistical analyses were performed using the Statistical Package for the Social Sciences version 21 (IBM Corp., Armonk, NY, USA). The level of significance for all analyses was set at α = 0.05 (two-tailed).

Our institution obtained a positive ethics committee vote from the ethical review board for our research (reference number 347\_16 Bc). All data used for the retrospective analysis was in anonymized form.

### RESULTS

#### Effectiveness of Patient Data Collection—DWH

A total of 275 data sets (case IDs) from 251 patients (patient IDs) were manually collected and stored in a Microsoft Excel spreadsheet. We counted 275 data sets because some patients had more than one lesion and thus were irradiated multiple times.

Two hundred seventy-four data sets (100%) from 250 patients were electronically collected because one patient's data were not available for data protection reasons. The congruence of the data elements "*beginning and end of the radiotherapy*, *date of the last contact*, *date of the last imaging and life-status (alive, dead)*" was evaluated on the basis of manual data retrieval compared with the results of the DWH report.

### Manual Data Retrieval Compared with the Results of the DWH Report

The summary of selected data elements determined by manual and electronically supported data retrieval is shown in **Table 3**.

### Data Element "Beginning of the Radiotherapy" and "End of Radiotherapy"

*<sup>a</sup>Not included in the electronic [data warehouse (DWH)] analysis (currently not all listed data elements are accessible for the DWH or there were no suitable methods available for the extraction of the data elements).*

Of the 274 (100%) data elements "beginning of the radiotherapy" and "end of the radiotherapy," 252 (92.0%) were identical for the manual method and 257 (93.8%) for the electronic method. Thirty-nine (22 manual, 17 electronic) data elements were not identical.
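The congruence percentages follow directly from the reported counts, and the same counts can be fed into a 2×2 χ² test. The sketch below uses only the Python standard library and assumes a Pearson χ² test without continuity correction; whether this matches the exact procedure run in SPSS is an assumption. The *p* = 0.009 reported with Table 3 appears to refer to the overall comparison of both methods and is not reproduced by this single-element calculation.

```python
import math


def congruence(identical, total):
    """Share of data elements that match the reference, in percent."""
    return 100.0 * identical / total


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


TOTAL = 274
manual_identical, dwh_identical = 252, 257  # counts reported in the text

print(f"manual: {congruence(manual_identical, TOTAL):.1f}%")
print(f"DWH:    {congruence(dwh_identical, TOTAL):.1f}%")

statistic, p_value = chi_square_2x2(
    manual_identical, TOTAL - manual_identical,
    dwh_identical, TOTAL - dwh_identical,
)
print(f"chi-square = {statistic:.2f}, p = {p_value:.2f}")
```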

Deviating results are more often generated by the manual than by the electronic data retrieval method. Manual data retrieval produced 22/274 (8%) deviating results: in these 22 cases, the treatment date of radiation was incorrectly documented in the discharge letter, and the incorrect dates were transferred into the Microsoft Excel spreadsheet.

The DWH determined the correct treatment date for these 22 patients. However, the DWH query produced 17/274 (6.2%) deviating results due to an error in the database query: the query was carried out patient-based (patient ID) instead of case-based (case ID). If a patient was treated multiple times over several years (several case IDs), only the latest "date of beginning and the end" was identified. For a flawless determination of the treatment date per case (case ID), the SQL statement of the database query has to be adjusted for future data exports.
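The difference between a patient-based and a case-based query can be reproduced with a small, self-contained example. The table `rt_course`, its columns, and the three example rows are hypothetical; the sketch only illustrates the grouping issue described above and is not the SQL statement used for the DWH report.

```python
import sqlite3

# Hypothetical, simplified table: one row per treatment course (case).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rt_course (patient_id TEXT, case_id TEXT, rt_start TEXT, rt_end TEXT)"
)
conn.executemany(
    "INSERT INTO rt_course VALUES (?, ?, ?, ?)",
    [
        ("P1", "C1", "2004-03-01", "2004-04-09"),  # first lesion
        ("P1", "C2", "2011-06-06", "2011-07-15"),  # second lesion, years later
        ("P2", "C3", "2009-01-12", "2009-02-20"),
    ],
)

# Patient-based grouping: one row per patient, so only the latest course survives.
patient_based = conn.execute(
    "SELECT patient_id, MAX(rt_start), MAX(rt_end) FROM rt_course "
    "GROUP BY patient_id ORDER BY patient_id"
).fetchall()

# Case-based grouping: one row per case, so every course keeps its own dates.
case_based = conn.execute(
    "SELECT patient_id, case_id, MIN(rt_start), MAX(rt_end) FROM rt_course "
    "GROUP BY patient_id, case_id ORDER BY case_id"
).fetchall()

print(len(patient_based), "rows with patient-based grouping")
print(len(case_based), "rows with case-based grouping")
```

In the patient-based result the first course of patient `P1` disappears, which is the kind of deviation the text attributes to the query; grouping by case ID keeps both courses.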

#### Data Element "Date of the Last Imaging"

Of the 274 data elements, 248 (90.5%) were identical for manual retrieval and 236 (86.1%) for electronic retrieval.

TABLE 3 | The comparison of manual data retrieval and the result of the data warehouse report.

*χ<sup>2</sup> test, manual data source compared with electronic data source (MOSAIQ® included): p = 0.009.*

*Data are number of data elements (%) unless otherwise stated. p-value: analysis of covariance, χ<sup>2</sup> test in case of categorical data.*

Differing results are more often generated by the electronic (38/274) than by the manual (26/274) data retrieval method. Manual data retrieval produced 9.5% inconsistent data: over the course of the manual data retrieval, an additional imaging was performed for 26 patients; thus, the manually collected data were already outdated.

The DWH report determined 38 cases (13.9%) of diverging data: for 38 patients an imaging was performed at an external hospital. The information about external imaging is not accessible by a database query as it is based on the documented procedure code in the source system of the UKER.

#### Data Element "Date of the Last Contact"

All data elements collected electronically were identical. Deviating results were caused only by the manual data retrieval method (42/274, 15.3%). There were two reasons for this: first, for 18 patients the date of the last contact was incorrectly transferred from the source system into the Excel spreadsheet. Second, during the time of analysis, 24 patients were being treated again in another department at the UKER, and consequently the manually collected data were already outdated.

### Data Element "Life-Status"

Overall, 14 (5.6%) of 251 evaluated patients died. For all 14 patients, the day of death was manually collected. Seven patients (50.0% of all deceased) were overlooked by the DWH report because no information about their death was documented in the EHR (Soarian® Clinicals), as the date of death is only documented for patients who died at the UKER.

### Effectiveness of Patient Data Collection—OIS

Fractionated SRT has been documented in the OIS MOSAIQ® since June 2012. We identified 110 suitable values for 74 patients (74 stereotactic irradiation + 36 data values for boost irradiation) since the system went into operation at the Department of Radiation Oncology and transferred them into the DWH. We collected the data elements "beginning and end of radiotherapy, distributed dose and fractionation" by querying the DWH and compared the results with the manual data retrieval.

### Manual Data Retrieval Compared with the Results of the MOSAIQ**®** Report

#### Data Element "Beginning of the Radiotherapy" and "End of Radiotherapy"

Differing results were caused only by the manual data collection method (22/110): owing to an incorrect date in the medical discharge letter, the manually retrieved data deviated for the beginning and for the end of radiotherapy.

There were no deviating results by querying the source system MOSAIQ® (DWH report) because the linear accelerator is regulated by the OIS that uses validation rules for data entry for every single fractionation in the primary source system.

### Data Element "Administered Dose and Fractionation"

In all, 94.6% (70/74) of the data elements were identical. The manual data retrieval method led to 4 (5.4%) deviating results because a medical physicist determined 4 false data elements of administered dose and fractionation on the basis of the paper-based health record, OIS MOSAIQ® and the treatment planning systems I-plan RT® or Pinnacle3®. There were no deviating results by querying the source system MOSAIQ® (DWH report).

### Time Invested in Manual Data Retrieval

To evaluate the time required for manual data retrieval, we documented the time needed to collect all necessary data elements from clinical source systems and manually transmit them into a Microsoft Excel spreadsheet. The manual data retrieval required 668 h (**Figure 3**). The collection of all data elements took place over an extended period of time of about 24 weeks.

The scientific assistant required the largest amount of time, manually collecting routine clinical data in 637 h (95.4%). The support of a physician (5 h, 0.7%) and a medical physicist (26 h, 3.9%) was also required (**Figure 3**). The physician analyzed actual MR or CT imaging (to determine localization, relapse, and progression of the tumor) and the medical physicist evaluated necessary data elements (PTV volume, fractionation, doses, minimum/maximum dose, coverage PTV, dose distribution of risk organs) on the basis of the paper-based health record and the treatment planning systems I-plan RT® or Pinnacle3®.

### Time Consumption for Electronic Data Retrieval

The DWH report was developed in collaboration with a computer scientist of the Department of Medical Informatics and two scientific assistants of the Department of Radiation Oncology of the UKER. Implementing the DWH query took 30 h, comprising the definition, adjustment, and execution of the database query. For administrative activities (e.g., obtaining permission for data access from those departments of the UKER that were involved in the patients' treatment), we needed an additional 5 h.

The support of a medical physicist was not required to evaluate data elements (beginning and end of radiotherapy, administered fractionation, and dose) on the basis of the paper-based health record and the treatment planning systems I-plan RT® or Pinnacle3®. For evaluating the data elements (PTV volume, minimum/maximum dose, coverage PTV, dose distribution of risk organs), the support of the medical physicist (approximately 20 h) and a physician (5 h) to analyze actual MR or CT imaging is still required.
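The reported shares and totals can be checked with a few lines of arithmetic. The sketch below uses only the hours stated in the text; the dictionary keys are descriptive labels chosen for this illustration.

```python
# Hours reported for manual data retrieval, by professional group.
manual_hours = {"scientific assistant": 637, "medical physicist": 26, "physician": 5}

# Hours reported for the DWH report.
dwh_hours = {"query definition, adjustment, execution": 30, "administrative activities": 5}

manual_total = sum(manual_hours.values())
dwh_total = sum(dwh_hours.values())

for role, hours in manual_hours.items():
    print(f"{role}: {hours} h ({100 * hours / manual_total:.1f}%)")

print(f"manual total: {manual_total} h")
print(f"DWH total: {dwh_total} h")
print(f"ratio manual/DWH: {manual_total / dwh_total:.1f}")
```

The 35 h cover only the data elements that were accessible through the DWH; the roughly 20 h of physicist time and 5 h of physician time that remain necessary for the other data elements are not part of that figure.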

### DISCUSSION

The purpose of this study is to verify possible benefits of a clinical DWH for retrospective analysis in the field of radiation oncology.

We compared two different methods of collecting routine clinical data for secondary use in a scientific retrospective analysis: manually and electronically using a DWH.

In summary, our results indicated that the electronically supported data retrieval (DWH) yielded an average of 93.9% correct data, a significantly better (*p* = 0.009) result than manual data retrieval (91.2%). Using a research database (DWH) replaces manual activities and offers the ability to significantly reduce data collection time and labor while improving data quality. However, data integrity depends on the quality of a structured routine clinical documentation as well as the system requirements to get access to medical data in the clinical source systems. Furthermore, expert knowledge for the transformation of routine clinical data is necessary in practice.

In our study, manual data retrieval needed significantly more overall workload time (668 h) of all involved professional groups compared to implementing the DWH query (30 h). We needed the support of a physician (5 h) to manually analyze CT or MR imaging and a medical physicist (26 h) for evaluating necessary irradiation data elements (fractionation, dose distribution, coverage/PTV volume, minimum/maximum dose, dose distribution at risk organs). Up to now, the support of a physician (5 h) to analyze actual MR or CT imaging is still required. To fully automate the tasks of the medical physicist for retrospective analysis (evaluating the data elements coverage/PTV-volume, minimum/maximum dose, dose distribution at risk organs), the departmental planning systems I-plan® RT and Pinnacle3® need to be made accessible for the DWH.

In addition, the long period of time necessary for retrieving data manually produced outdated databases and caused errors when transmitting data into an electronic format such as Microsoft Excel, which became evident in some cases of our study. Furthermore, data retrieval errors can easily be introduced because medical record data are not guaranteed to be accurate (e.g., incorrectly documented treatment date of radiation in the discharge letter of radiotherapy) and depend on the care and knowledge level of the scientific assistant. A related study by Roelofs et al. (19) that examined the benefit of a clinical DWH combined with tools for extraction of relevant parameters data for a radiotherapy trial supports this point of view. A DWH is beneficial for data collection time in addition to offering the ability to improve data quality.

Besides the benefits in data collection time and data quality, the strength of a DWH is its ability to combine data from multiple clinical source systems and make them easily accessible for researchers. However, before using routine data for research purposes, it is important to carefully verify these data and determine data integrity. In this context, Galster (20) reviewed existing barriers to reusing routine data and concluded that clinical data are often not available when or where they are needed; even when data are present, the usage of the existing source may be prohibited, or the data cannot be routinely used in their available form. In our study, regulatory requirements and institutional agreements had to be reconciled among the departments of the UKER that are involved in the patients' treatment in order to use clinical routine data for clinical research activities.

Next to the challenges of gaining access to multiple data sources, another major barrier for data reuse is the fact that routine data cannot be used in their available form. Usually, clinical data are distributed across several tables in a generic form with coded values (21). In our analysis, some data (e.g., tumor localization, histology/pathology) are semi-structured values (mostly free-text format) and therefore cannot be used for automated analysis. Data recorded in structured fields are more readily extracted from an EHR than data recorded in free-text notes. Therefore, expert knowledge for the transformation of these data is necessary, and the accuracy of database queries mainly depends on a specific SQL statement. In addition, EHR data are frequently recorded inconsistently in a variety of formats that are complex, inaccurate, and often incomplete (22). For our study, it is a necessary condition that medical data are recorded completely in a specific data schema in order to automatically capture as much information as possible for retrospective analysis.

Furthermore, EHRs often do not tell a complete patient story, whether they are those of a single institution or those aggregated across institutions (23). An example of this problem in our study is the date of death, which is only documented in the clinical source system (EHR) for patients who died at the UKER. Moreover, the information about an external imaging is not routinely documented in a coded form in the EHR and is therefore not accessible for database queries. Consequently, medical details from external sources (e.g., life status in the GTDS®, imaging at an external hospital) must be requested or made available for automated data abstraction. This would be worthwhile in order to determine a patient's life status, as an electronic life-status comparison with the residents' registration offices is prohibited due to privacy policy since 2008 and an amendment to the Bavarian Cancer Registry is made for provision in 2016 (24). To keep the medical routine data up to date, we send a specially designed questionnaire to the patients, who complete it themselves, in order to assess the health-related outcome.

Additionally, routine clinical documentation in the primary source systems affects the research outcome: data quality for retrospective analysis is only as good as the routine clinical documentation in the primary source systems, e.g., the EHR. Therefore, Kessel et al. (5) have developed a professional data-based documentation system for analysis purposes where information about radiation therapy, diagnostic images, and dose distributions has been imported into a web-based system. They showed that the central storage of data outside of the EHR leads to benefits of digital management, data analysis, and reusability of the results. In this context, Kirrmann et al. (9) developed and described a flexible browser-based reporting and visualization system for clinical and scientific use by linking web-services/MOSAIQ®, the physician letter system MEDATEC, and the central server MiraPlus (laboratory, pathology and radiology). They reported that all relevant data were available at all times in a simple manner, which improved their effectiveness and resulted in considerable time savings.

TABLE 4 | Limitations for using a data warehouse (DWH) for retrospective analysis in the radiation oncology field.

In this context, one benefit of our retrospective analysis was the gain of access to radiotherapy data from the clinical source system MOSAIQ®. Besides the data sets "beginning and the end of radiotherapy" for evaluating treatment outcomes of patients with meningioma, we also extracted the irradiation parameters "planned and effectively implemented fractionation and dosage distribution" from the existing primary source (OIS). Because the linear accelerator and the OIS both use validation rules for data entry in the primary source system, original routine data are not subsequently changed. As we have shown in our analysis, using original and unprepared data leads to a higher percentage of accurate data.

A summary of the described limitations and potential solutions for using a DWH is shown in **Table 4**.

Although only a selected data set of the evaluation of patients with meningioma was examined and not all data were directly available in a DWH, our present study highlights the benefit of electronically supported data retrieval for secondary use. Thus, our goal is to adapt our approach to other types of tumors in radiation oncology and extract more parameters from the existing routine care documentation systems.

#### CONCLUSION

Our present study shows that a DWH is particularly beneficial for retrospective analysis in the field of radiation oncology. Routine clinical data for a large patient group can be provided ready for analysis to the scientific operator, and data collection time can be reduced significantly. Furthermore, using a DWH provides the ability to improve data quality for retrospective analysis; thus, future research can be simplified. However, expert knowledge for the transformation of routine clinical data is still necessary and the quality of a structured routine clinical documentation in the CISs as well as the system requirements allowing access to medical data also affect the outcome.

#### REFERENCES

1. Dentler K, ten Teije A, de Keizer N, Cornet R. Barriers to the reuse of routinely recorded clinical data: a field report. *Stud Health Technol Inform* (2013) 192:313–7. doi:10.3233/978-1-61499-289-9-313

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the ethical review board of the Friedrich-Alexander-University of Erlangen-Nuremberg (FAU) with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the ethical review board of the Friedrich-Alexander-University of Erlangen-Nuremberg (FAU) (reference number 347\_16 Bc).

### AUTHOR CONTRIBUTIONS

SR: conducted data analysis, described throughout the manuscript, and major contributor to the writing of the manuscript and literature search. RF: clinical oncologist and principal of the research organization, involved in the design of the study, and reviewed the manuscript. TG: made substantial contributions to the acquisition of data, developed the data warehouse report, and reviewed the manuscript. H-UP: was involved in the design of the study and reviewed the manuscript. DL: major contributor to the writing of the manuscript, supervised the study, and major contributor to organization of the data analysis and manuscript. All authors read and approved the manuscript.


aufeinanderfolgenden Jahren. *Der Notarzt* (2003) 19(03):114–9. doi:10.1055/s-2003-39534


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Rutzner, Fietkau, Ganslandt, Prokosch and Lubgan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Integrating Hyperthermia into Modern Radiation Oncology: What Evidence Is Necessary?

*Jan C. Peeken<sup>1</sup>, Peter Vaupel<sup>1</sup> and Stephanie E. Combs<sup>1,2</sup>\**

*<sup>1</sup>Department of Radiation Oncology, Klinikum rechts der Isar, Technische Universität München, München, Germany, <sup>2</sup>Department of Radiation Sciences (DRS), Institute of Innovative Radiotherapy (iRT), Helmholtz Zentrum München, Neuherberg, Germany*

Hyperthermia (HT) is one of the hot topics that have been discussed over decades. However, it never made its way into primetime. The basic biological rationale of heat to enhance the effect of radiation, chemotherapeutic agents, and immunotherapy is evident. Preclinical work has confirmed this effect. HT may trigger changes in perfusion and oxygenation as well as inhibition of DNA repair mechanisms. Moreover, there is evidence for immune stimulation and the induction of systemic immune responses. Despite the increasing number of solid clinical studies, only a few centers have included this adjuvant treatment in their repertoire. Over the years, abundant prospective and randomized clinical data have emerged demonstrating a clear benefit of combined HT and radiotherapy for multiple entities such as superficial breast cancer recurrences, cervix carcinoma, or cancers of the head and neck. Regarding less investigated indications, the existing data are promising and more clinical trials are currently recruiting patients. How do we proceed from here? Preclinical evidence is present. Multiple indications benefit from additional HT in the clinical setting. This article summarizes the present evidence and develops ideas for future research.

Keywords: hyperthermia, radiation oncology, reirradiation, infrared-A, thermoradiotherapy

## INTRODUCTION

Hyperthermia (HT) is defined as an exogenous, supraphysiological elevation of tissue/body temperature. The beginning of modern HT dates back to the 1700s when remissions of malignant tumors were repeatedly associated with concomitant bacterial infections. This effect was first systematically investigated toward the end of the 19th century by Coley (1). Patients with unresectable sarcomas received injections of bacterial vaccines for fever induction. In total, a cure rate of 20% was achieved (2). It took several decades of technological developments in local/locoregional heat application until HT alone became available for clinical application.

#### *Edited by:*

*William Small, Jr., Stritch School of Medicine, United States*

#### *Reviewed by:*

*Mark Hurwitz, Thomas Jefferson University, United States; Aaron Howard Wolfson, University of Miami, United States; Eric D. Donnelly, Northwestern Memorial Hospital, United States*

#### *\*Correspondence:*

*Stephanie E. Combs, stephanie.combs@tum.de*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 28 February 2017; Accepted: 06 June 2017; Published: 30 June 2017*

#### *Citation:*

*Peeken JC, Vaupel P and Combs SE (2017) Integrating Hyperthermia into Modern Radiation Oncology: What Evidence Is Necessary? Front. Oncol. 7:132. doi: 10.3389/fonc.2017.00132*

**Abbreviations:** BT, brachytherapy; CT, chemotherapy; DFS, disease-free survival; EBRT, external beam radiotherapy; HPV, human papilloma virus; HR, hazard ratio; HSP, heat shock protein; HT, hyperthermia; HTP, hyperthermia treatment planning; LC, local control; NK, natural killer; NMA, network meta-analysis; NSCLC, non-small cell lung cancer; OR, odds ratio; pCR, pathologic complete response; PFS, progression-free survival; RR, risk ratio; RT, radiation therapy; RTHT, thermoradiotherapy; RTCT, chemoradiotherapy; RTHTCT, thermochemoradiotherapy; SAR, specific absorption rate; TER, thermal enhancement ratio; TRTP, thermoradiotherapy planning; TTP, time to progression; WBHT, whole-body hyperthermia; wIRA, water-filtered infrared-A; y, year.

Nowadays, HT is either administered independently or, more often, in combination with radiotherapy (RT) or chemotherapy (CT). HT alone is being used for direct ablation of single tumor lesions with temperatures exceeding 50°C. Multiple techniques are being used to obtain necessary temperature coverage such as high-intensity focused ultrasound and radiofrequency-, microwave-, or infrared laser-based heating *via* ablation catheters directly inserted into the tumor (3).

In bimodal treatment schemes such as thermoradiotherapy (RTHT) and chemoradiotherapy (RTCT) as well as in trimodal thermochemoradiotherapy (RTHTCT), HT is utilized to augment the treatment effects of the concomitant oncological therapy. The necessary tissue temperatures are significantly lower, ranging from 39 to 43°C (4, 5).

In this literature-based review, a brief introduction to HT physiology, cell biology, and immune response is given to examine the underlying modes of action of HT. Currently used HT techniques for heat delivery and temperature control are described. The clinical evidence for combining RT with HT is summarized and sorted per tumor entity. To this end, a PubMed search was conducted for the term "hyperthermia" in combination with tumor entities treatable by RT and with terms describing technical aspects such as "biology," "physiology," "chemotherapy," and "radiation therapy." Special emphasis is given to recent meta-analyses and published prospective trials.

### PRECLINICAL EVIDENCE

#### Changes in Perfusion and Oxygenation

Data on HT-induced changes in perfusion and oxygenation, and their interpretation, remain controversial and are briefly described in the following. A comprehensive review of this topic has been published by Vaupel and Kelleher (6). There is evidence that mild HT can increase blood perfusion of the heated tissue, preferentially at the beginning of tumor heating (7, 8). It has been reported that this can lead to increased oxygen delivery *via* an improvement of microcirculation (9). This is especially true in cases when the oxygen demand of the tissue is reduced. It has been proposed that direct heat-dependent cell killing and loss of mitochondrial membrane potential contribute to this phenomenon (10, 11). In contrast, other studies showed increased oxygen consumption at elevated tissue temperature (van't Hoff's law), counteracting the oxygenating effect of increased perfusion (12). An increase in oxygen availability may favor oxygenation of hypoxic cells (7). The effect appears to occur preferentially in diffusion-limited, chronic hypoxia (13, 14). Whether the radiosensitizing effect outlasts the time frame of increased perfusion remains unclear so far. Some studies have reported increased perfusion extending over 24 h after HT, which would benefit subsequent RT/CT sessions (15, 16). Other studies could not reproduce this result (17). As hypoxia is a central causative factor for radioresistance, a decrease in hypoxia by HT may be responsible for the observed radiosensitization.
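The temperature dependence of oxygen consumption invoked above via van't Hoff's law is often expressed with a Q10 coefficient, i.e., the factor by which a rate changes per 10°C. The following is a minimal sketch of that relationship only; the Q10 value of 2 and the reference temperature of 37°C are illustrative assumptions and are not taken from the cited studies.

```python
def relative_rate(temp_c, ref_temp_c=37.0, q10=2.0):
    """Metabolic rate at temp_c relative to the rate at ref_temp_c.

    Uses the Q10 form of van't Hoff's rule. q10=2.0 is an assumed,
    illustrative coefficient, not a measured tissue value.
    """
    return q10 ** ((temp_c - ref_temp_c) / 10.0)


# Relative oxygen consumption at a mild-HT temperature of 42°C
rate_42 = relative_rate(42.0)
print(f"Relative rate at 42°C vs 37°C: {rate_42:.2f}")
```

Under these assumptions, a 5°C increase would raise consumption by roughly 40%, which illustrates how increased demand could offset the gain from improved perfusion.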

#### Induction of Cell Death

Hyperthermia has been shown to induce cell death by apoptosis or mitotic catastrophe (18, 19). It has been reported that HT triggers unfolding of especially heat-labile non-histone nuclear proteins. Exposure of hydrophobic groups leads to aggregation with surrounding proteins and subsequent association with the nuclear matrix. As a consequence, basic nuclear matrix-dependent functions such as transcription, replication, or DNA repair are impaired (20, 21). Malfunction of DNA replication finally causes chromosome aberrations, genome instability, and cell death by mitotic catastrophe (22). Apoptosis may be mediated by cell death membrane receptor activation and subsequent caspase 3 activation (23). The extent of apoptosis appears to differ among tumor types (24). In addition, the permeability of the cellular and mitochondrial membranes is altered, leading to cellular Ca<sup>2+</sup> spikes as well as mitochondrial depolarization with resulting bursts of reactive oxygen species. Both mechanisms may further enhance protein instability and apoptosis (25–27).

#### Inhibition of DNA Repair Mechanisms

As mentioned above, there is sufficient evidence showing inhibition of DNA repair mechanisms upon HT. Krawczyk et al. have demonstrated inhibition of homologous recombination at clinically achievable mild HT temperatures (41–42.5°C) associated with BRCA2 degradation and its reduced accumulation at double-strand break sites (28). Furthermore, HT impairs the function of the Ku heterodimer by reducing its DNA-binding capacity and preventing the initiation of non-homologous end joining at DNA double-strand break sites (29). In addition, base excision repair after irradiation of cells has been shown to be reduced upon heat administration (30). In summary, HT acts on multiple levels including excision repair, non-homologous end joining, and homologous recombination, influencing the repair of DNA lesions as well as single-strand and double-strand breaks (29–31). As a consequence, the effects of DNA-damaging treatments such as CT or RT are enhanced. A more detailed review discussing the existing evidence was recently published (32).

#### Immune Stimulation

Besides direct effects on cell metabolism, HT appears to trigger multiple immune responses on local and systemic levels. Toraya-Brown and Fiering published a thorough review covering this aspect (33). In summary, HT increases expression of immunogenic surface receptors such as MICA and MHC-I, enhancing the effectiveness and function of natural killer (NK) cells and of CD8<sup>+</sup> cells, respectively (34, 35). The expression of heat shock proteins (HSPs) such as HSP70 is increased. After binding intracellular proteins, HSPs are secreted and stimulate the activity of NK cells and antigen-presenting dendritic cells (36, 37). Presentation of these tumor antigens can cause specific antitumor immune responses effected by CD8<sup>+</sup> cells (38). Tumor antigens are also provided by increased release of exosomes (39). Direct enhancement of the immunogenic activity of leukocytes is mediated by increased lytic activity of NK cells, activation of macrophages, maturation of dendritic cells, and increased IFN-γ production as well as cytotoxicity of CD8<sup>+</sup> cells (34, 40–42). In addition, immune cell trafficking is enhanced by increased perfusion and permeability (43). Following elevated intratumoral IL-6 signaling, it may further be facilitated by increased expression of cell adhesion molecules such as ICAM-1 (44).

### HEAT DELIVERY AND TEMPERATURE CONTROL

#### Heating Techniques

Heating techniques can be divided by the size, penetration depth, and region of energy deposition. Local or regional HT is mostly used to enhance local therapy such as RT or CT. Alternatively, hyperthermic isolated limb perfusion to administer CT agents is performed. Whole-body hyperthermia (WBHT) has been applied either alone or in combination with CT for the treatment of metastatic disease. Different approaches including capacitive, radiative, infrared-A, or ultrasound have been used for clinical HT treatments (45). The clinically most relevant methods are described in the following.

#### Capacitive Heating Systems

Capacitive heating systems work with two electrodes positioned on both sides of the body with direct body contact using a water bolus. Heat is induced by the resulting currents and is directed toward the smaller electrode (46). Capacitive heating tends to create high power densities around the edges of the bolus but provides good heat coverage of targets inside the fat layer (47). In obese patients, however, therapy-limiting local hot spots can occur, causing painful subcutaneous burns (48).

#### Radiative Heating Systems

Radiative heating systems work with frequencies ranging from 75 to 915 MHz (spectrum of radiowaves and microwaves) and use a water bolus for electromagnetic coupling. Compared to capacitive heating, radiative systems appear to yield better power deposition and temperature distribution, leading to better target coverage (47). The applicable temperature is sometimes limited due to local temperature hot spots. The accuracy of such systems depends on construction details such as the number, positioning, and design of antennas or the properties of the water bolus (49). In recent years, increasingly complex systems have been introduced comprising multiple antennas, such as the commercially available Sigma applicators, built in a circular arrangement, or the AMC-8 phased array HT system (50, 51). Site-specific systems such as the HYPERcollar3D system, which was developed for the treatment of carcinomas of the head and neck, take into account local characteristics of target areas to optimize temperature coverage (52). Alternatively, antennas can be interstitially implanted or used for endocavitary HT in direct proximity to tumors (45). Superficial or interstitial HT is performed with heating systems applying higher frequencies such as 915 MHz (53).

#### Water-Filtered Infrared-A (wIRA)-Based Systems

Water-filtered infrared-A-based systems have successfully been used for the treatment of superficial tumors (54, 55). Infrared-A radiation is generated by a halogen lamp and passes through a water filter. Therapeutically relevant temperatures are limited to a depth of 15–20 mm, with good therapeutic temperature coverage within this range (56). Due to the technical setup, a very short interval between HT and RT combined with heat isolation procedures enables quasi-simultaneous RTHT, optimizing the synergistic effect (see below).

Pretherapeutic hyperthermia treatment planning can be used to optimize tumor temperatures. It uses dielectric models created on the basis of segmented CT or MRI data, assigning literature-based dielectric properties to distinct tissue types. In this way, the specific absorption rate (SAR) of the respective tissue can be calculated and used for a finite element-based prediction of temperature distributions. In clinical studies, calculated SAR values correlated well with measured SAR values, relative temperature increase, and clinical data regarding hot spots in patients with pelvic tumors (57–59).
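To make the SAR quantity concrete, the sketch below evaluates the standard point-wise relation SAR = σ|E|²/(2ρ) for a time-harmonic field with peak amplitude |E|, and the resulting initial heating rate SAR/c when perfusion and heat conduction are neglected. It is not the finite element planning workflow described above. All tissue values (conductivity, density, specific heat) and the field strength are hypothetical placeholders chosen for illustration, not data from the cited studies.

```python
def specific_absorption_rate(sigma_s_per_m, e_peak_v_per_m, density_kg_per_m3):
    """Point-wise SAR in W/kg for a time-harmonic field.

    sigma_s_per_m: electrical conductivity of the tissue (S/m)
    e_peak_v_per_m: peak electric field amplitude (V/m)
    density_kg_per_m3: tissue mass density (kg/m^3)
    """
    return sigma_s_per_m * e_peak_v_per_m ** 2 / (2.0 * density_kg_per_m3)


def initial_heating_rate(sar_w_per_kg, specific_heat_j_per_kg_k):
    """Initial temperature rise in K/s, ignoring perfusion and conduction."""
    return sar_w_per_kg / specific_heat_j_per_kg_k


# Hypothetical muscle-like tissue values (assumptions for illustration only)
sar = specific_absorption_rate(sigma_s_per_m=0.7,
                               e_peak_v_per_m=100.0,
                               density_kg_per_m3=1050.0)
rate_k_per_min = 60.0 * initial_heating_rate(sar, specific_heat_j_per_kg_k=3600.0)
print(f"SAR: {sar:.2f} W/kg, initial heating rate: {rate_k_per_min:.3f} K/min")
```

In an actual planning system, the field would be computed per voxel from the segmented patient model, and the temperature prediction would additionally account for perfusion and conduction.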

#### Treatment Temperatures

Preclinical experiments were initially performed with relatively high temperatures ranging from 43 to 45°C, focusing on direct cytotoxic effects. However, in the clinical setting, temperatures above 42.5°C were only achieved in small tumor subvolumes. While trying to reach targeted temperatures, therapy-limiting hot spots occurred, causing substantial side effects. In these cases, this led to a reduction of target temperatures or early termination of HT (60, 61). This was regarded as a failure to deliver adequate thermal doses, which led to a rapid decline in HT usage in the mid-to-late 1990s. It took several more years until the beneficial effects of mild HT (39.5–43°C), as described above, became known. Nowadays, mild HT has become the standard in modern clinical trials and daily clinical usage (5). Modern HT technology has been developed and optimized with minimal hot spot occurrence as a main focus (51, 52). As a consequence, therapy limitation due to focal hot spots has not been an issue in many recent HT trials (62–64).

#### Temperature Control

Measurement of the actual tissue temperature distribution is as important as heat generation for effective heating of tumors. A homogeneous temperature distribution is necessary for optimal treatment effects. Local dose-limiting hot spots have to be avoided. Originally, temperature assessment was restricted to single-point measurements. It can be performed either invasively by insertion of intratumoral catheters or, equally efficiently, by endoluminal catheters in tumors in close proximity to natural cavities such as rectal, cervical, vaginal, urethral, or vesical tumors. Insertion of intratumoral catheters carries the risk of complications such as pain, inflammation, or abscess formation (65). Thus, the latter option should be used if possible. In superficial tumors, surface skin measurements by contact electrodes constitute a further alternative. In addition, by using infrared thermography cameras, two-dimensional data can be obtained for superficial tumors, even though calibration with contact electrodes is necessary for absolute temperature assessment (56). A promising method for deep temperature monitoring is MRI-guided thermometry, which is capable of measuring three-dimensional temperature distributions noninvasively. Temperature can be measured by exploiting either T1w imaging, diffusion-weighted imaging, or proton resonance frequency shift imaging (66). Proton resonance frequency shift imaging appears to be the most accurate method in the clinical setting. By combining HT with online MRI thermometry, temperature delivery can be adjusted directly to optimize temperature distribution and suppress hot spots (67). To this end, an adaptive iterative algorithm has been developed (68). First studies have confirmed its applicability in the clinical setting (69).
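The principle behind proton resonance frequency (PRF) shift thermometry can be sketched with the commonly used relation Δφ = 2π·γ·α·B0·TE·ΔT between the phase difference of two gradient-echo images and the temperature change. The example below is a simplified illustration, not the method of the cited studies: the thermal coefficient of about −0.01 ppm/°C, the field strength of 1.5 T, and the echo time of 20 ms are assumptions, and practical corrections (field drift, motion, fat signal) are ignored.

```python
import math

GAMMA_HZ_PER_T = 42.58e6   # proton gyromagnetic ratio divided by 2*pi (Hz/T)
ALPHA_PER_C = -0.01e-6     # assumed PRF thermal coefficient (about -0.01 ppm/°C)


def phase_shift_rad(delta_t_c, b0_t, te_s):
    """Phase difference (rad) produced by a temperature change delta_t_c."""
    return 2.0 * math.pi * GAMMA_HZ_PER_T * ALPHA_PER_C * b0_t * te_s * delta_t_c


def temperature_change_c(delta_phi_rad, b0_t, te_s):
    """Temperature change (°C) recovered from a measured phase difference."""
    return delta_phi_rad / (2.0 * math.pi * GAMMA_HZ_PER_T * ALPHA_PER_C * b0_t * te_s)


# Round trip for an assumed 5°C rise at 1.5 T with an echo time of 20 ms
phi = phase_shift_rad(5.0, b0_t=1.5, te_s=0.020)
recovered = temperature_change_c(phi, b0_t=1.5, te_s=0.020)
print(f"Phase shift: {phi:.3f} rad, recovered temperature change: {recovered:.1f} °C")
```

Because the relation yields temperature differences relative to a baseline image, absolute temperatures still require a known reference temperature.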

#### Interval of Administration

In general, a dose–effect relationship of HT has been shown (70). The interval between HT and RT has a great influence on the effectiveness of the combination. The actual effect is quantified by the thermal enhancement ratio (TER), defined as the radiation dose of RT alone divided by the radiation dose of RT + HT necessary to achieve the same survival level (4). However, the individual biological effects of HT depend differently on the sequence and timing of heating. In the clinical setting, the maximal achievable TER in tumor tissue should be combined with the lowest possible TER in healthy tissue to retain a tumor-specific effect and reduce toxicity.
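The TER definition can be illustrated with a short calculation. In the sketch below, the isoeffective doses are hypothetical numbers chosen for illustration, and the ratio of tumor TER to normal-tissue TER (often called the therapeutic gain factor) is added as an assumption to express the selectivity argument; neither is taken from the cited data.

```python
def thermal_enhancement_ratio(dose_rt_alone_gy, dose_rt_with_ht_gy):
    """TER: RT-alone dose divided by the RT + HT dose giving the same effect."""
    if dose_rt_with_ht_gy <= 0:
        raise ValueError("Isoeffective RT + HT dose must be positive")
    return dose_rt_alone_gy / dose_rt_with_ht_gy


def therapeutic_gain_factor(ter_tumor, ter_normal_tissue):
    """Ratio of tumor TER to normal-tissue TER; values > 1 favor the tumor."""
    return ter_tumor / ter_normal_tissue


# Hypothetical isoeffective doses (illustrative assumptions only)
ter_tumor = thermal_enhancement_ratio(60.0, 40.0)    # tumor tissue
ter_normal = thermal_enhancement_ratio(60.0, 50.0)   # healthy tissue
gain = therapeutic_gain_factor(ter_tumor, ter_normal)
print(f"TER tumor: {ter_tumor:.2f}, TER normal: {ter_normal:.2f}, gain: {gain:.2f}")
```

A gain above 1 corresponds to the situation described above, in which enhancement in the tumor exceeds enhancement in healthy tissue.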

The inhibition of DNA repair has its highest effect when HT is given simultaneously to RT. The effect declines with the end of DNA repair mechanisms approximately 4 h after RT. However, inhibition of DNA repair mechanisms is not tumor specific since it is also present in normal tissues (71, 72). In contrast, direct cell killing is specific to malignant tissue at target temperatures. The respective TER is estimated at 1.5 deriving most likely from direct radiation-independent cell damage to radioresistant hypoxic cells (73). To summarize, optimal effects can be achieved by simultaneous RTHT treatment with no tumor-directed specificity. Treatment selectiveness would completely depend on accuracy of radiation delivery. In the time frame of 1–4 h before or after radiation, a maximal selective TER can be achieved by adding up DNA repair inhibition and direct cell damage. For optimization of the oxygenation-effect, RT should be applied shortly after HT (72). So far, the shortest interval between HT and RT has been described for wIRA followed by RT (56).

When considering time schedules and fractionation schemes, one should also take into account two phenomena termed cellular and vascular thermotolerance. The underlying mechanisms remain incompletely understood (74–77). For this reason, detailed description of these phenomena is not performed in the context of this review. Most probably, thermotolerance in currently used clinical HT schedules (i.e., once or twice/week) seems to play no limiting role.

### CLINICAL EVIDENCE OF COMBINED THERMORADIOTHERAPY

In the following, key studies providing evidence for a clinical benefit of a combined treatment with HT and RT are presented, sorted by tumor entity and the extent of existing evidence (see **Tables 1**–**3** for a detailed overview). In the following paragraph, existing clinical evidence of combined HT with CT is summarized (see **Table 4**).

#### Breast Cancer

Breast cancer constitutes the most widely investigated malignant entity. Vernon et al. published the results of five randomized trials conducted between 1981 and 1991 that were combined due to insufficient patient accrual. The pooled analysis of 306 patients with inoperable primary or recurrent disease yielded a significantly better complete response (CR) [RTHT: 59%, RT alone: 41%, odds ratio (OR): 2.3, *p* < 0.001] but no survival benefit. However, 50% of patients had metastases at the time of randomization. The effect was most prominent in preirradiated recurrent lesions. Skin toxicities such as blisters, ulceration, and necrosis were more frequent in the HT group, but had a low impact on the patients' well-being and were generally treatable with conservative measures (60).

In cases of locoregional recurrence, breast surgery is recommended if possible. However, radical resection only appears to be feasible in 65% of patients and may cause significant treatment-related morbidity (78). Thus, RT constitutes an important alternative and, depending on the time interval between first and second RT and other pretreatment characteristics, offers substantial clinical benefit. Since nowadays most patients receive RT in the primary situation, HT may help to enhance the effects of reirradiation in the recurrence scenario, especially since in some clinical situations only reduced radiation doses may be prescribed. A recent meta-analysis by Datta et al. included 8 two-arm studies and 24 single-arm studies involving 2,110 patients with locoregional recurrent disease (79). CR after RTHT was similar in single-arm studies (63.4%) and two-arm studies and, in the latter, significantly higher than after RT alone (RTHT: 60.2%, RT: 38.1%, OR 2.6, *p* < 0.001). In preirradiated patients (779 patients), a CR of 66.4% was achieved (mean re-RT dose: 36.7 Gy). Treatment-related toxicity was overall not increased, with mean acute grade 3/4 toxicities of 14.4%. Among the analyzed studies, there was great heterogeneity in RT dose, HT fraction schedules, total number of HT fractions, HT duration, and average achieved temperatures. However, no prognostic treatment variables could be identified in a subgroup analysis and meta-regression. Similar results were recently published by Linthorst et al. In a retrospective study encompassing 248 patients with unresectable recurrences, reirradiation + HT yielded a CR of 70% (80). In resectable cases, a regimen including surgery, RT, HT, and partly CT or hormone therapy achieved a local control (LC) rate of 78% after 5 years (81). Notter et al. have treated patients with locally recurrent breast cancer with RTHT using a hypofractionation scheme of 5 × 4 Gy with one fraction per week. A wIRA system was used for superficial HT. A CR rate of 61% was achieved without any treatment-related toxicities (56). In summary, there is sustained evidence demonstrating a value of adjunct HT in locoregional, recurrent breast cancer as definitive or adjuvant treatment.
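For readers less familiar with the effect measure, the following sketch shows how a crude odds ratio follows from two response proportions, using the CR rates quoted above for the two-arm studies (RTHT: 60.2%, RT: 38.1%). This is only an illustration of the arithmetic: a meta-analytic OR pools study-level estimates with weights, so the crude value computed here is not expected to reproduce the reported OR of 2.6 exactly.

```python
def odds(proportion):
    """Convert a response proportion (between 0 and 1) into odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("Proportion must lie strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def crude_odds_ratio(p_treatment, p_control):
    """Unweighted odds ratio of response, treatment versus control."""
    return odds(p_treatment) / odds(p_control)


# CR rates quoted in the text for the two-arm studies
crude_or = crude_odds_ratio(0.602, 0.381)
print(f"Crude OR for CR, RTHT vs RT alone: {crude_or:.2f}")
```

The crude value falls in the same range as the pooled estimate, which is consistent with a substantial increase in the odds of CR with adjunct HT.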

#### Cervical Cancer

Several randomized trials have been conducted to test HT in combination with RT. A Cochrane database meta-analysis was performed analyzing six trials involving a total of 487 patients (82). The studies included mostly patients with locally advanced disease (74% FIGO stage IIIB). Except in one study, RT was delivered as a combination of external beam therapy (EBRT) and brachytherapy (BT). The pooled data analysis showed a significantly higher CR [CR relative risk (RR): 0.56, *p* < 0.001] and OS [hazard ratio (HR): 0.67, *p* = 0.05] in favor of RTHT as well as a reduced local recurrence rate (HR: 0.48, *p* < 0.001). No difference was found regarding acute and late toxicity rates. The authors criticized a lack of uniformity among the trials concerning HT delivery, HT schedules, as well as RT treatment protocols, RT dose, and RT techniques. Therefore, the authors concluded that the results do not suffice for a definitive recommendation to apply HT alongside standard treatment regimens. The included Dutch deep HT trial was updated for long-term results after 12 years of follow-up, showing sustained improved LC (RT: 37%, RTHT: 57%, *p* = 0.01) and survival (RT: 20%, RTHT: 37%, *p* = 0.03) (83). All studies have in common that concurrent CT was not included in the treatment regimen.

*RT, CT, and HT schemes, p-value or significance status (s/ns) are described as mentioned in the original publications.*

*a.s., arm study; BT, brachytherapy; CT, chemotherapy; CR, complete response; EFS, event-free survival; HR, hazard ratio; HT, hyperthermia; m, mean; ns, not significant; LR, local recurrence; MA, meta-analysis; OS, overall survival; OR, odds ratio; PRFS, pelvic recurrent-free survival; P #, patient number; p, phase; PA, patients alive; w, week; r, randomized; RT, radiotherapy; s, significant; RTHT, thermoradiotherapy; RTCT, chemoradiotherapy; EBRT, external beam therapy; à, with a single dose of; x/w, times per week.*

*aIncluding 4 two-arm studies reported in the study by Vernon et al.*

*bSubgroup analysis.*

In recent years, many trials have discussed the value of combined treatment regimens such as RTHT, RTCT, or trimodal RTHTCT. A recently published meta-analysis reviewed all relevant publications, including 23 articles with a total patient number of 1,160 (84). In a network meta-analysis (NMA), all four possible treatment modalities (RT, RTHT, RTCT, RTHTCT) were compared. All studies but one included patients with locally advanced disease. RT always comprised EBRT + BT. The comparison of RTHT and RT yielded similar results to the previously described Cochrane analysis. The same six trials with minor updates were included. For the direct comparison of RTHTCT and RTCT, only one study including 68 patients was available (85). It showed a significantly better CR for the trimodal treatment arm (RTCT: 46.7%, RTHTCT: 83.3%, risk difference 36.7%, *p* = 0.0001). No significantly increased grade 3/4 toxicities were found. The NMA showed a significant advantage of RTHTCT over all other treatment combinations in direct and indirect comparisons as well as in the SUCRA value-based ranking for CR and patients alive. RTCT and RTHT had similar performance even though RTHT had a small advantage over RTCT regarding CR. Consistent with this, a recent randomized phase III trial, which was closed early due to poor accrual with only 87 of 376 planned patients, did not show any significant difference in event-free survival and pelvic recurrence-free survival between RTHT and RTCT (86). In summary, by adding radiosensitizing treatments such as CT or HT to RT, better treatment outcomes are achievable. In locally advanced cervical cancers, RTHT appears to be a valuable substitute for RTCT if CT is not applicable. By combining all three modalities, the best treatment effect may be possible. Additional phase III trials directly comparing RTHT, RTCT, and RTHTCT are necessary for optimal treatment stratification.

Table 2 | Summary of cited meta-analyses and randomized trials for head and neck cancer and rectal cancer.

*RT, CT, and HT schemes, p-value are described as mentioned in the original publications.*

*a.s., arm study; BT, brachytherapy; CT, chemotherapy; CR, complete response; DFS, disease-free survival; HR, hazard ratio; HT, hyperthermia; m, mean; mo, months; nr, nonrandomized; MA, meta-analysis; OS, overall survival; OR, odds ratio; P #, patient number; p, phase; w, week; PFS, progression-free survival; r, randomized; RR, relative risk; RT, radiotherapy; y, years; RTHT, thermoradiotherapy; RTCT, chemoradiotherapy; à, with a single dose of; x/w, times per week.*

#### Head and Neck Cancers

To evaluate the effect of HT in head and neck carcinomas, a meta-analysis was recently performed including six 2-armed studies encompassing 451 patients. Five of the six studies were randomized trials. The CR rate appeared to be significantly better in patients treated with combined RTHT compared to RT alone (RT alone: 39.6%, RTHT: 62.5%, OR 2.92, *p* < 0.0001). Acute and late grade 3/4 toxicities were not significantly different (87). One study included exclusively nasopharyngeal carcinomas, whereas all other studies considered all cancer sites of the head and neck. However, all studies involving surgery or concurrent CT were excluded.

Three other randomized trials with a total of 417 patients recently analyzed the effects of trimodal treatment combining RT, CT, and HT in patients with nasopharyngeal carcinomas (88–90). Two studies reporting CR showed a significant advantage for RTHTCT treatment. OS was also increased in the RTHTCT group in two of the studies. Progression-free survival (PFS) or disease-free survival (DFS) was significantly better in the RTHTCT group in all three studies. Patients with higher tumor temperatures and higher HT fraction numbers showed a better outcome (88). In all three studies, no difference in treatment-related toxicity has been described. Patients receiving HT even showed better quality of life scores after completion of therapy (90). These studies demonstrate that trimodal therapy including RT, CT with different agents, and HT constitutes an effective and safe treatment alternative. To our knowledge, no other randomized studies investigating trimodal therapy have been published for other head and neck sites. As shown in other malignancies, re-treatment may constitute a further clinical situation in which HT may be a valuable treatment option. In a small cohort receiving reirradiation combined with HT, a CR of 46% was achieved, showing the feasibility of such an approach (91). To conclude, HT constitutes a valuable treatment option in cancers of the head and neck.


Table 3 | Summary of cited randomized trials for bladder cancer, melanoma, NSCLC, glioblastoma, and sarcoma.

*RT, CT, and HT schemes, p-value or significance status (s/ns) are described as mentioned in the original publications.*

*a.s., arm study; BT, brachytherapy; CT, chemotherapy; CR, complete response; HT, hyperthermia; LC, local control; m, mean; mo, months; ns, not significant; OS, overall survival; P #, patient number; p, phase; w, week; PFS, progression-free survival; r, randomized; RT, radiotherapy; RTHT, thermoradiotherapy; NSCLC, non-small cell lung cancer; TTP, time to progression; STS, soft tissue sarcomas; à, with a single dose of; x/w, times per week.*

*a As part of the Dutch hyperthermia trial.*

However, due to relatively high perfusion rates and fast adaptation to local temperature changes, HT delivery in the head and neck region appears to be especially challenging. By using a site-tailored radiative heating device, treatment outcomes may be better in the future (92). HT could be used in multimodal treatment schemes, further improving treatment outcome. Alternatively, it may constitute a toxicity-sparing alternative to concurrent CT in elderly or multimorbid patients. Further clinical studies are necessary to evaluate the true clinical value.

#### Rectal Cancer

In 2009, a Cochrane analysis of six phase II and III randomized controlled trials including 520 patients with locally advanced rectal carcinomas was performed. Patients received neoadjuvant RT with or without HT. Increased CR (RR 2.81, *p* = 0.01) as well as increased OS at 2y follow-up (HR 2.06, *p* = 0.001) could be observed. The survival benefit, however, could not be measured for any later time point. No difference in acute toxicity was found in the two studies reporting on this side effect (93). A positive impact on pathologic complete response (pCR) was also shown in a retrospective study of 106 patients. The rate of sphincter-sparing surgery was higher for tumors in close proximity to the anal verge (94). In a further retrospective study encompassing 235 patients, HT appeared to confer better downstaging of the primary tumor and involved lymph nodes (95). Two small studies evaluated hypofractionated RTHTCT schemes, showing principal efficacy and safety (96, 97). Additional HT appears to be well tolerated without increased impairment of quality of life (98). To conclude, the Cochrane analysis demonstrated that adjunct HT can, in principle, increase response. However, further randomized prospective trials are necessary to evaluate the true value of neoadjuvant RTHTCT as well as of treatment of recurrent disease.

#### Bladder Cancer

For the treatment of bladder carcinomas, HT has been predominantly applied in combination with intravesical CT. However, a few trials have evaluated RTHT. In an early study, 56 patients with bladder carcinomas were treated with intravesical CT (bleomycin) simultaneously with RTHT with reduced total dose (40 Gy) or RT alone with higher dose prescription (50–70 Gy). HT was delivered by intravesical infusion of warmed saline solution containing bleomycin. The RTHT group had a higher response rate (RTHT: 84%, RT: 56%, *p* < 0.001) with decreased toxicity rates (less bladder capacity reduction) (99). In a different approach, 49 patients with nodal-negative disease of all T stages were treated with a hypofractionated RT scheme (24 Gy, 4 Gy/fraction) with or without HT. The HT group was split into a high (Tmean > 41.5°C) and a low temperature cohort (Tmean < 41.5°C). The high temperature cohort showed significantly better downstaging compared to both other groups, indicating the importance of adequate temperature delivery (100). In a more recent German trial, high-risk T1 and T2 cancers were treated with transurethral resection followed by RTHTCT (50.4 Gy + 5.4–9 Gy; cisplatin and 5-FU). At six weeks follow-up, a pCR of 96% was achieved. After a median follow-up of 34 months, OS was 89% with 80% of the patients being satisfied with their bladder function (63). The Dutch deep HT trial also included bladder carcinomas besides cervical and rectal carcinomas. In a randomized protocol, 101 patients with T2–T4 N0 M0 bladder carcinoma were treated with either RT or RTHT. RTHT yielded a significantly better CR (RTHT: 73%, RT: 51%, *p* = 0.01). However, at 3y, LC and OS were not significantly different. There was no difference in toxicity (61). Taken together, some studies exist demonstrating a clinical benefit for adjunct HT with no additional toxicity. However, the patient cohorts in the different studies appeared to be quite heterogeneous, mixing locally restricted and advanced tumors. The treatment regimens used differed among studies, impairing adequate comparisons. Randomized studies are necessary with clearly defined risk profiles and adequate direct comparisons with guideline-based treatment regimens.

Table 4 | Cited studies for RTHT indications with limited data.

*RT, CT, and HT schemes, p-value or significance status (s/ns) are described as mentioned in the original publications.*

*CR, complete response; PR, partial response; BT, brachytherapy; CT, chemotherapy; HT, hyperthermia; m, mean; mo, months; nr, non-randomized; ns, not significant; OS, overall survival; OR, odds ratio; P #, patient number; p, phase; w, week; PFS, progression-free survival; r, randomized; RT, radiotherapy; s, significant; RTHT, thermoradiotherapy; RTCT, chemoradiotherapy; à, with a single dose of; x/w, times per week.*

#### Melanoma

One multicentric randomized trial analyzed the benefit of adjunct HT in melanomas treated with RT. Patients received RT (24 Gy or 27 Gy in three fractions) either alone or with HT (43°C for 60 min). There was no significant difference in toxicity. CR (RT alone: 35%, RTHT: 62%, *p* < 0.05) and LC after 2y (RT alone: 28%, RTHT: 46%, *p* = 0.008) were significantly increased (101). As these results are very promising, more randomized trials would help to establish a distinct role for RTHT in the treatment of melanomas, for example in combination with less hypofractionated treatment schemes.

#### Non-Small Cell Lung Cancer (NSCLC)

Only a few studies have investigated the role of HT in the treatment of NSCLC. A multi-institutional prospective randomized trial investigated the role of HT in addition to primary RT for locally advanced NSCLC. No significant difference in OS, local response, or treatment-related toxicity could be observed. However, with a significantly higher 1y-PFS, a certain benefit was apparent (67.5%, 29%, *p* = 0.036) (102). A small case–control study encompassing 13 patients with direct bone invasion treated with RTHT (60–70 Gy) showed a possibly high efficacy in LC and survival under this unfavorable condition (103). The benefit of HT in addition to reirradiation for recurrent NSCLC after primary RT was investigated in a small retrospective study involving 33 patients. Median doses used initially and for reirradiation were 70 and 50 Gy, respectively. Toxicity was moderate and limited to a maximum toxicity of grade 3 in 9% of patients. In patients with smaller tumors (<4 cm) and no distant metastases, long-term survival was achieved in some cases (104). All three studies employed radiofrequency capacitive heating systems. So far, only limited evidence exists showing a true benefit of HT in NSCLC treatment. More studies are necessary to explore potential areas of application.

#### Prostate Cancer

Feasibility of adjuvant HT treatment for prostate carcinomas was first shown in two phase I/II studies involving locally advanced disease or recurrences after radical prostatectomy. Toxicity was limited to grades 2 and 3, respectively. Quality of life was not significantly changed by the addition of HT to RT treatment (105). An HT-dependent burn occurred in one patient, indicating critical temperature delivery (106, 107). Similar results were obtained in a larger phase II study involving 144 patients with high-risk disease (T2 + serum-PSA > 10 ng/ml or Gleason score ≥ 7) or locally advanced disease (T3/4) treated with RTHT and antihormonal therapy (108). A 5y-OS of 87% and a 5y-biochemical PFS of 49% were observed with limited toxicity (maximum grade 2). Hurwitz et al. combined radiation with or without hormone deprivation therapy and transrectal ultrasound HT in locally advanced disease, showing promising results with a 2y-DFS of 84% compared to the historical 2y-DFS of 64% observed in patients in the 4-month androgen suppression cohort of the RTOG 92-02 trial (62). Currently, a phase II study is examining the safety of combining HT and dose-escalated external salvage RT for recurrent prostate cancer (109). Another study is examining salvage BT combined with interstitial HT (110). In a retrospective analysis of 146 patients, no significant difference was found between patients receiving HT or not. The authors discussed that this might be due to insufficient heat delivery, since a significant difference was apparent for patients receiving a high thermal dose (111). To summarize, a set of phase II studies shows promising results. However, randomized phase III trials are necessary to evaluate the actual value of adjuvant HT in the treatment of prostate carcinomas.

#### Glioblastoma Multiforme (GBM)

In 1998, Sneed et al. investigated the impact of adjuvant interstitial HT after a BT boost for patients with newly diagnosed, supratentorial GBM smaller than 5 cm treated with postoperative RT and concomitant hydroxyurea. After patient exclusions, 68 patients were randomized to BT with or without HT (112). HT was administered 30 min before and after a BT boost *via* placement of helical-coil microwave antennas. The HT group showed significantly increased survival (2y-survival 31% vs. 15%, *p* = 0.02) and time to progression (TTP) (median TTP 49 vs. 33 months, *p* = 0.045); however, this treatment was accompanied by increased grade 3 toxicity rates. In recent years, efforts were made to optimize HT delivery by improving interstitial catheters or applying focused ultrasound (113, 114). No data exist so far validating these new techniques or the combination with temozolomide.

#### Sarcomas

Randomized trials have shown a significant benefit of HT in addition to CT (115). However, evidence for combined RTHT in sarcoma treatment remains scarce. In an early phase II study involving 17 patients with soft tissue sarcomas (STS) treated with neoadjuvant RTHT, the twice-weekly HT group showed significantly more extensive changes in histopathological examinations than the once-weekly HT group (116). In another study, 16 patients received irradiation with concurrent HT for the treatment of radiation-associated sarcomas (predominantly angiosarcoma), showing a total response rate of 75%. Toxicity was mild except for one grade 4 adverse event (117). First clinical results show the general feasibility of applying RTHT to sarcoma treatment. However, randomized trials are necessary to assess whether a benefit exists similar to that shown for neoadjuvant HTCT.

#### Esophageal Cancer

Several phase II studies have investigated the feasibility of trimodal neoadjuvant RTHTCT treatment. Nakajima et al. treated 24 patients with neoadjuvant RTHTCT using docetaxel. A general response rate of 41.7% with a pCR rate of 17.6% was observed (118). Described toxicities were limited to grade 2 and grade 3. In a further study, 28 patients received RTHTCT with carboplatin and paclitaxel. R0 resection was possible in all patients with mild toxicity. Pathologic evaluation yielded a pCR rate of 19%. Treatment was tolerated well with mild toxicity rates (maximum grade 2) (119). A third study treated 35 patients with advanced disease with RTHTCT (bleomycin/cisplatin and 5FU) yielding a good CR rate of 33.3% (120). In addition, multiple retrospective analyses have shown a significantly increased survival benefit in favor of RTHTCT over RTCT (121–123). In summary, there is existing evidence promising a substantial clinical value of RTHTCT. However, randomized trials are necessary directly comparing RTHTCT to standardized treatment protocols involving RTCT.

#### RTHT in Indications with Limited Data

Apart from the described entities, in multiple trials, RTHT has been applied to rather rare RT indications with only low levels of evidence (see **Table 4** for a detailed summary of the cited trials).

A Dutch retrospective trial analyzed the efficacy of hypofractionated (28/32 Gy with a single dose of 4 Gy) reirradiation with concurrent HT for the palliation of painful unresectable recurrent rectal cancer, with good to complete response in 72% of 47 patients (124). Similar results (70% pain relief) were reproduced in a prospective phase II trial by Milani et al. with normofractionated RTCTHT (125). In a case series, Klaver et al. proposed a novel treatment strategy for locally advanced rectal cancer with concurrent peritoneal carcinomatosis by combining hyperthermic intraperitoneal chemotherapy with intraoperative RT, showing general feasibility (126).

Thermoradiotherapy was evaluated for the palliation of symptomatic locally advanced or recurrent hormone-refractory prostate cancer in a small phase I/II study. All patients demonstrated partial response or complete response as well as complete symptom relief (127). However, two of eight preirradiated patients developed grade IV toxicities.

Primary or recurrent locally advanced pancreatic cancer was the subject of an open-label study comparing RTCT with RTHTCT (gemcitabine ± 5FU/cisplatin/oxaliplatin), showing a significantly increased survival benefit without increased toxicity (mean OS: RTHTCT: 15 months, RTCT: 11 months, *p* = 0.025) (128). Moreover, the influence of adjunct HT on the treatment of liver lesions has been assessed. In a randomized Chinese trial of hepatocellular carcinoma patients, 1y-recurrence (RTHT: 10%, RT: 15%, *p* < 0.001) and mortality rates (RTHT: 12.5%, RT: 20%, *p* < 0.001) were significantly lower for combined RTHT compared to RT alone (129). Chemorefractory colorectal cancer metastases were treated with whole-liver RT and concomitant HT, showing partial response and pain relief in 30% of treated patients, respectively (130). Vaginal cancers were chosen as a target for HT in a small prospective Dutch trial. Patients with vaginal carcinomas larger than 4 cm were treated with RTHT, whereas smaller tumors were primarily treated with RT, showing no significant difference in 5y-survival (131).

In a more general approach, Jones et al. performed a randomized prospective trial pooling superficial tumors of various entities (breast carcinoma, melanoma, head and neck cancer, and others). Only tumors that appeared to be heatable on a pretest were randomized. Addition of HT to RT led to a significantly increased CR (RTHT: 66.1%, RT: 42.3%, OR 2.7, *p* = 0.02). Consistent with many other studies, the largest difference was achieved in preirradiated patients (CR: RTHT: 68.2%, RT: 23.5%) (64).
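The relation between the two CR rates and an odds ratio can be checked with a few lines of arithmetic. The sketch below computes the crude (unadjusted) odds ratio implied by two response proportions; the function name is illustrative, and a published estimate may come from a fitted model and need not match the crude value exactly.

```python
def crude_odds_ratio(p_treated: float, p_control: float) -> float:
    """Crude odds ratio implied by two response proportions (each strictly between 0 and 1)."""
    odds_treated = p_treated / (1.0 - p_treated)
    odds_control = p_control / (1.0 - p_control)
    return odds_treated / odds_control


# CR rates quoted in the text for the pooled superficial-tumor trial
all_patients = crude_odds_ratio(0.661, 0.423)   # RTHT vs. RT, all randomized patients
preirradiated = crude_odds_ratio(0.682, 0.235)  # RTHT vs. RT, preirradiated subgroup

print(f"all patients:  crude OR = {all_patients:.2f}")
print(f"preirradiated: crude OR = {preirradiated:.2f}")
```

The crude value for the whole cohort comes out close to 2.7, and the preirradiated subgroup gives a considerably larger ratio, which matches the qualitative statement that this subgroup gained most.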

Despite the limited amount of evidence, substantial benefits of RTHT, especially for preirradiated, locally advanced, and recurrent tumors, became apparent. Since there is a lack of randomized trials, definite recommendations for treatment cannot be made. In situations of recurrent or metastatic disease, RTHT may be justifiable as an "individual treatment approach" on the basis of the existing evidence. In order to reach the necessary patient numbers in potential future randomized trials, multiple entities with a similar condition (such as "recurrent," "preirradiated," or "locally advanced") could be combined, following the approach of Jones et al. (64). In addition, HT centers should work more closely together to establish multicenter trials capable of gathering critical patient numbers.

### Combined Thermochemotherapy (CTHT)

Thermochemotherapy has been evaluated in multiple clinical trials. In contrast to RTHT, CTHT is also being combined with WBHT. A limited number of phase II studies have shown the feasibility of applying WBHT to CTHT treatment of various entities such as recurrent ovarian cancer, malignant pleural mesothelioma, metastatic STS, melanoma, and pretreated metastatic colorectal cancer (132–136). To date, no phase III trials exist. In contrast, regional HT has been evaluated in a large phase III trial (341 patients) of STS performed as a joint effort by the European Organization for the Research and Treatment of Cancer and the European Society for Hyperthermic Oncology. It showed a substantially and significantly improved local PFS, DFS, and OS after adding HT to EIA (etoposide, ifosfamide, doxorubicin) CT (HR for local progression/death: 0.58, *p* = 0.003, local 2y-PFS: HTCT: 76%, CT: 61%; 2y-DFS: CTHT: 58%, CT: 44%, *p* = 0.011; per-protocol OS: HR 0.66, *p* = 0.038), changing daily clinical practice in HT treatment centers (115). Further randomized trials have evaluated neoadjuvant CTHT in esophageal carcinoma (40 patients; histologic effectiveness: CTHT 58.3, CT: 14.3, *p* < 0.05) and adjuvant CTHT after transurethral resection of bladder carcinomas (83 patients; 10y-DFS: CTHT: 53%, CT: 15%, *p* < 0.001), demonstrating increased treatment efficacy (137, 138). A randomized trial in NSCLC showed a small benefit regarding "clinical benefit response" (80 patients, CTHT: 82.5%, CT: 47.5%, *p* < 0.05) (139). General feasibility of local CTHT has also been shown in several phase II studies in other entities such as refractory or recurrent non-testicular germ cell carcinomas, recurrent or persistent ovarian cancer, breast carcinoma, or peritoneal carcinomatosis (140–143). As an alternative to regional HT, hyperthermic isolated limb perfusion has been established for the treatment of STS and unresectable melanomas, showing favorable results (144, 145). In summary, the few existing randomized trials suggest a substantial benefit from adding HT to CT. More randomized trials are necessary to broaden the spectrum of CTHT.

## OUTLOOK

Review of the current literature has shown various retrospective and prospective trials exploring the value of adding HT to RT or RTCT regimens in multiple tumor entities. While the real benefit of HT remains elusive in some entities, sustained evidence has been acquired in other malignancies. When considering the currently existing studies, a substantial part of the evidence was gathered between the 1980s and early 2000s. Apart from technical aspects of HT, RT techniques have evolved dramatically since then. Nowadays, even better results with regard to treatment outcome as well as toxicity could be expected. Still, no widespread use of HT has been established in the last decades. Several reasons, such as reimbursement issues in certain countries, technical complexities, and the challenges of homogeneous heating and temperature monitoring, may have contributed to this fact. As more and more clinical trials are being published, the willingness and momentum to establish HT in a rising number of institutions have increased.

The development of novel techniques with more exact heat delivery and temperature monitoring capacities may help to gain higher acceptance among physicians. HT may as well be further improved by better planning techniques. Mathematical modeling of the earlier mentioned HT effects on cell biology has paved the way to actual thermoradiotherapy planning (4). By integrating the biological HT effect into the LQ-model, a more exact RTHT treatment planning becomes possible. By adding online temperature control, as it can be achieved by MRI thermometry, temperature distribution could be optimized even further by real-time adjustments.
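For orientation, the standard linear-quadratic (LQ) expression for the surviving fraction after a dose $D$ is given below. Writing the radiosensitivity parameters as functions of the achieved temperature $T$ and heating time $t$ is only an illustrative assumption about how an HT effect might enter the model; the specific parameterization used in the cited planning work (4) is not reproduced here.

$$S(D) = \exp\left(-\alpha D - \beta D^{2}\right) \quad\longrightarrow\quad S(D;\,T,t) = \exp\left(-\alpha(T,t)\,D - \beta(T,t)\,D^{2}\right)$$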

In the trials performed so far, HT delivery specifications were highly heterogeneous. HT frequency varied from once weekly to daily applications. Mean achieved temperatures differed widely, ranging between 39 and 43°C. In trials using multiple HT specifications, higher temperatures or HT frequencies were associated with better outcome (88, 100, 116). This underlines the necessity of optimizing HT schedules to maximize the treatment effect. To this end, larger randomized studies directly comparing distinct HT specifications are necessary.

Even though there are still a lot of open questions, basic research has revealed many ways of action of HT. Regarding the biological mechanisms of HT, combining HT with drugs exploiting underlying mechanisms may further increase radiosensitization. As an example, HT has been combined with antiangiogenesis agents. HT itself appears to directly impair angiogenesis at least in part by plasminogen activator inhibitor 1 induction (146). By combining HT with VEGFR2-inhibitor treatment, a synergistic antiangiogenesis effect *in vivo* has been shown to inhibit tumor growth (147). Regarding the immunogenic effects of HT, adjunct immunotherapy, such as checkpoint inhibition, constitutes a further interesting field of research. Before clinical trials can be designed, more basic research is necessary to evaluate the effects of thermo-immunotherapy in preclinical models.

All studies mentioned above have used HT in combination with photon-based irradiation. However, all over the world, an increasing number of particle beam facilities are being installed. Due to technical improvements, smaller and low-cost proton facilities may become available in the future, possibly enabling more widespread use. In contrast, facilities capable of delivering heavy ion-based irradiation with 12C remain scarce. In brief, protons and 12C-ions share similar favorable dose distributions with a low entry dose and a high dose in the "Bragg peak" followed by a more or less steep dose decline (148). In contrast to protons, 12C-ions exhibit a significantly higher linear energy transfer and relative biological effectiveness (149). Several biological factors have been identified, such as a low oxygen enhancement ratio (OER), less cell-cycle-dependent cell killing, inhibition of non-homologous DNA repair, cluster damages to the DNA, and more efficient cell killing of tumor stem cells (150). HT triggers killing of hypoxic, acidic, as well as energy-deprived tumor cells, decreases OER, and confers direct killing of S-phase cells. Therefore, it has been proposed that combined therapy of proton irradiation with HT may have similar effectiveness to 12C beam therapy alone (151, 152). To the best of our knowledge, no data on the clinical use of simultaneous thermo-particle therapy have been published. The HYPROSAR phase I/II study is currently recruiting patients with unresectable STS at the Paul Scherrer Institut in Switzerland. Weekly HT is combined with proton beam therapy to achieve tumor downstaging with subsequent resection. There has only been one randomized trial, which treated 151 patients with uveal melanoma with or without adjuvant transpupillary thermotherapy months after the end of proton irradiation (153). Indeed, the rate of secondary enucleation was significantly lower. Since the therapy was not conducted in direct temporal proximity, the abovementioned factors would not have taken effect. Hence, clinical trials are necessary to explore the benefit of combinational therapy. Direct comparison of thermo-particle therapy with 12C beam therapy should be performed in randomized trials.

Different technical solutions for external HT delivery have been discussed. Advances in the field of nanomedicine have introduced a novel approach for targeted HT through the development of magnetic or superparamagnetic nanoparticles, as recently reviewed by Datta et al. (154). Particles with cores of iron oxide or gold shells have already made their way into clinical phase I trials. Bergs et al. recently reviewed the existing particle constructs (155). Tumor-specific targeting might be achieved by passive accumulation in the tumor due to the aberrant vasculature with increased leakage and simultaneous impairment of lymphatic drainage (156). Alternatively, active targeting could be achieved by coating with tumor-specific antibodies or ligands (157). Heat can then be generated by applying external magnetic fields with rapid field alternations (158). In clinical phase I trials, dispersed nanoparticles were directly deposited at the tumor site either by percutaneous injection or intraoperatively. Only mild toxicities and quality of life impairments were observed. A maximum temperature of up to 55°C was achieved, but target coverage remained insufficient (159–161). The advantage of nanoparticle-based HT may be a more selective heat delivery to the tumor with possibly higher temperatures and lower toxicity to adjacent normal tissues. By carrying chemotherapeutic agents, antibodies, or gene silencing RNA residues, nanoparticles may open completely new therapeutic opportunities (154). On the other hand, tumor volume coverage is still far from optimal. Poorly perfused regions of tumors, in which HT has its greatest potential for radiosensitization, tend to "collect" lower particle numbers. Accumulation of particles in non-malignant tissues such as the reticuloendothelial system or by the glomerular filter of the kidney carries the risk of side effects (162). Microscopic disease, e.g., in lymphatic tissue, may not be reached by sufficiently high temperatures. Until safe application in the clinic becomes possible, more research is necessary to assess the biological risks and to optimize particle distribution and targeting. If these problems can be addressed, nanoparticles may be a valuable alternative to external HT.

Besides enhancement of RT effects, HT may also be used for radiation dose reduction. As described above, Notter et al. used a hypofractionation treatment scheme (5 × 4 Gy, one fraction per week) to treat patients with locally recurrent breast cancer. A similar CR rate of 61% was achieved compared to other studies using mean doses of 38.2 Gy (79). The relatively low dose enabled the authors to perform re-reirradiations. 17 patients with re-recurrences of lymphangiosis carcinomatosa received re-reirradiation following the same therapy scheme. A CR rate of 31% could be achieved (56).

Infections with human papilloma virus (HPV) are commonly found in carcinomas of the head and neck or cervix. HPV-positive tumors appear to be more radiosensitive than HPV-negative tumors. HT has been shown to trigger the degradation of the oncogenic HPV-derived E6 protein. As functional E6 binds and inactivates p53, HT may indirectly renew p53 activity favoring p53-dependent apoptosis. Thus, a combination of RT and HT may be particularly effective. To evaluate this effect, future trials should include HPV status for risk stratification (87). As a next step, clinical trials should test further RT dose de-escalation with concurrent HT in HPV-positive patients as it is already performed for RT alone (163, 164).

### CONCLUSION

There is abundant evidence demonstrating that HT constitutes a valuable supplement to currently performed RT or RTCT schemes, improving tumor response in tumor entities such as head and neck or cervix. It also holds great potential to enhance RT-based therapy of recurrent disease, as has been shown for breast cancer. Through continuous improvement of HT delivery, planning, and monitoring techniques, treatment effects may further improve. Novel combinations with targeted therapy agents, immunotherapy, nanomedicine, or particle therapy constitute promising fields of further research and interesting areas for future clinical application.

#### AUTHOR CONTRIBUTIONS

JP: reviewed literature and wrote the manuscript. PV: reviewed literature and wrote the manuscript. SC: reviewed literature and wrote the manuscript.

#### REFERENCES

tumors upon localized infrared-A-hyperthermia at 42°C. *Adv Exp Med Biol* (2003) 530:237–47. doi:10.1007/978-1-4615-0075-9\_23


randomised, multicentre trial. Dutch Deep Hyperthermia Group. *Lancet* (2000) 355(9210):1119–25. doi:10.1016/S0140-6736(00)02059-6


chemorefractory liver metastases from colorectal cancer. *Radiat Oncol J* (2016) 34(1):34–44. doi:10.3857/roj.2016.34.1.34


assessment of reported trials. *J Surg Oncol* (2012) 106(8):921–8. doi:10.1002/jso.23200


photothermal therapy in the treatment of prostate disease. *Int J Toxicol* (2016) 35(1):38–46. doi:10.1177/1091581815600170


radiation and weekly cetuximab in patients with HPV-associated resectable squamous cell carcinoma of the oropharynx-ECOG-ACRIN Cancer Research Group. *J Clin Oncol* (2017) 35(5):490–7. doi:10.1200/JCO.2016.68.3300

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Peeken, Vaupel and Combs. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Parenchymal and Functional Lung Changes after Stereotactic Body Radiotherapy for Early-Stage Non-Small Cell Lung Cancer—Experiences from a Single Institution

*Juliane Hörner-Rieber1,2, Julian Dern1,2, Denise Bernhardt1,2, Laila König1,2, Sebastian Adeberg1,2, Vivek Verma3 , Angela Paul1,2, Jutta Kappes4 , Hans Hoffmann5,6, Juergen Debus1,2, Claus P. Heussel5,7,8 and Stefan Rieken1,2\**

*1Department of Radiation Oncology, University Hospital Heidelberg, Heidelberg, Germany, 2 Heidelberg Institute of Radiation Oncology, Heidelberg, Germany, 3University of Nebraska Medical Center, Department of Radiation Oncology, Nebraska Medical Center, Omaha, NE, United States, 4Department of Pneumology, Thoraxklinik, Heidelberg University, Heidelberg, Germany, 5 Translational Research Unit, Thoraxklinik, Heidelberg University, Germany Translational Lung Research Centre Heidelberg (TLRC-H), German Centre for Lung Research (DZL), Heidelberg, Germany, 6Department of Thoracic Surgery, Thoraxklinik, Heidelberg University, Heidelberg, Germany, 7Department of Diagnostic and Interventional Radiology, University-Hospital, Heidelberg, Germany, 8Department of Diagnostic and Interventional Radiology with Nuclear Medicine, Thoraxklinik at University-Hospital, Heidelberg, Germany*

#### *Edited by:*

*Stephanie E. Combs, Technische Universität München, Germany*

#### *Reviewed by:*

*Chandan Guha, Albert Einstein College of Medicine, United States Michael Buckstein, Mount Sinai Hospital, United States*

*\*Correspondence: Stefan Rieken stefan.rieken@med.uni-heidelberg.de*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 25 April 2017 Accepted: 29 August 2017 Published: 19 September 2017*

#### *Citation:*

*Hörner-Rieber J, Dern J, Bernhardt D, König L, Adeberg S, Verma V, Paul A, Kappes J, Hoffmann H, Debus J, Heussel CP and Rieken S (2017) Parenchymal and Functional Lung Changes after Stereotactic Body Radiotherapy for Early-Stage Non-Small Cell Lung Cancer—Experiences from a Single Institution. Front. Oncol. 7:215. doi: 10.3389/fonc.2017.00215*

Introduction: This study aimed to evaluate parenchymal and functional lung changes following stereotactic body radiotherapy (SBRT) for early-stage non-small cell lung cancer (NSCLC) patients and to correlate radiological and functional findings with patient and treatment characteristics as well as survival.

Materials and methods: Seventy patients with early-stage NSCLC treated with SBRT from 2004 to 2015 with more than 1 year of CT follow-up scans were analyzed. Incidence, morphology, severity of acute and late lung abnormalities as well as pulmonary function changes were evaluated and correlated with outcome.

Results: Median follow-up time was 32.2 months with 2-year overall survival (OS) of 83% and local progression-free survival of 88%, respectively. Regarding parenchymal changes, most patients only developed mild to moderate CT abnormalities. Mean ipsilateral lung dose (MLD) in biological effective dose and planning target volume size were significantly associated with maximum severity score of parenchymal changes (*p* = 0.014, *p* < 0.001). Furthermore, both maximum severity score and MLD were significantly connected with OS in univariate analysis (*p* = 0.043, *p* = 0.025). For functional lung changes, we detected significantly reduced total lung capacity, forced expiratory volume in 1 s, and forced vital capacity (FVC) parameters after SBRT (*p* ≤ 0.001). Multivariate analyses revealed SBRT with an MLD ≥ 9.72 Gy and FVC reduction ≥0.54 L as independent prognostic factors for inferior OS (*p* = 0.029, *p* = 0.004).

Conclusion: SBRT was generally tolerated well with only mild toxicity. For evaluating the possible prognostic impact of MLD and FVC reduction on survival detected in this analysis, larger prospective studies are truly needed.

Keywords: non-small cell lung cancer, stereotactic body radiotherapy, radiation pneumonitis, radiation fibrosis, pulmonary function, lung injury

## INTRODUCTION

Stereotactic body radiotherapy (SBRT) is the standard of care for medically inoperable, early-stage non-small cell lung cancer (NSCLC) patients (1–3). After the introduction of SBRT, substantially higher overall survival (OS) rates were reported for this patient group by three large population-based analyses from the Netherlands and the US (4–6). Several prospective studies then demonstrated excellent 3-year local control of around 90% and survival rates of more than 50% in a highly comorbid population (7–9). SBRT has hence become the optimal treatment option for patients with highly reduced pulmonary function [forced expiratory volume in 1 s (FEV1) < 30%] suffering from severe chronic obstructive pulmonary disease (COPD) (GOLD III–IV) (10, 11).

However, despite the high conformality of SBRT, toxicities are a non-trivial result of SBRT, especially in a population with poor pulmonary function. Depending on the study, symptomatic radiation pneumonitis occurs in about 10–30% of patients, which can impair patient quality of life (12–15). Fatal (grade 5) radiation pneumonitis following SBRT is only reported in very rare cases (15–17). An understudied aspect of adverse effects is the association with parenchymal remodeling following SBRT, which is detected to some degree in nearly all patients (18). Fibrosis in the high-dose regions is found in about 80% of patients and can make for a challenging differentiation between benign radiologic changes and local recurrence (18, 19).

Whether post-SBRT lung scarring correlates with significant clinical changes in pulmonary function is controversial (20, 21). Stone et al. recently showed significant impairment in pulmonary function after SBRT in a prospective trial, although this was not associated with worse OS (22). On the basis of these results, we conducted a toxicity analysis examining post-SBRT parenchymal and functional changes and their influence on outcome in patients who were deemed inoperable due to medical comorbidities.

### MATERIALS AND METHODS

#### Patient Population

Seventy consecutive patients treated between February 2004 and May 2015 were chosen for this analysis. Inclusion criteria were as follows: (1) receipt of SBRT for medically inoperable early-stage NSCLC (cT1-3cN0cM0) and (2) regular follow-up CT scans for at least 1 year at the Department of Radiation Oncology at the University Hospital Heidelberg, the Thoraxklinik Heidelberg, or at the German Cancer Research Center. The study and the study protocol were reviewed and approved by the Ethics committee of the University Hospital Heidelberg (S-140/2016). According to the decision of the Ethics committee, obtaining of written informed consent was not necessary due to the retrospective character of the study. Patients included in the analysis were identified from our cancer database. Anonymized patient data were used for analysis.

#### Treatment Characteristics

Patient selection, imaging protocols, and detailed treatment techniques have been reported previously (23–25). Risk-adapted fractionation schemes were used, meaning that dose and fractionation schemes were adjusted based on tumor size and location (peripheral vs. central). Until 2011, patients were generally treated with a single fraction of 20–24 Gy prescribed to the 80% isodose line, depending on proximity to critical structures (*n* = 32). Thereafter, peripheral lesions were irradiated with three fractions of 15–18 Gy, prescribed to the conformally enclosing 65% isodose line, while central lesions received eight fractions of 7.5 Gy prescribed to the 80% isodose line (*n* = 38). Delivery techniques were 3-D (*n* = 49), helical TomoTherapy® (*n* = 11), and volumetric-modulated arc therapy (*n* = 10).

#### Outcome Evaluation

Routine follow-up visits involved a contrast-enhanced CT scan of the thorax around the 3-, 6-, and 12-month intervals following SBRT. If no tumor recurrence was detected in the CT scan after 12 months, CTs and X-rays were done alternately every 6–12 months thereafter. Patients with reduced performance scores often only received X-rays after 12 months. Local progression referred to progression of the tumor within the high-dose volume. Differentiation between local progression and benign fibrosis in the high-dose volume is known to be challenging. Positron emission tomography (PET)-CT scans or biopsy was used to distinguish between benign lesions and tumor recurrence.

To correlate irradiated doses with clinical results, biological effective doses (BEDs) were calculated: an α/β ratio of 10 and 3 Gy was assumed for the tumor and lung tissue, respectively. BED was calculated using the linear-quadratic model:

$$\text{BED (Gy)} = \text{fractional dose} \times \text{number of fractions} \times \left( 1 + \frac{\text{fractional dose}}{\alpha/\beta} \right).$$
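As a worked illustration of the linear-quadratic formula above, the following minimal sketch computes the nominal tumor BED (α/β = 10 Gy) for three of the prescription schemes named under Treatment Characteristics. The function name `bed` and the dictionary layout are illustrative assumptions, not part of the original analysis, and the sketch ignores the isodose line to which each dose was prescribed.

```python
def bed(dose_per_fraction: float, n_fractions: int, alpha_beta: float) -> float:
    """Biological effective dose (Gy) from the linear-quadratic model."""
    if alpha_beta <= 0:
        raise ValueError("alpha/beta must be positive")
    total_dose = dose_per_fraction * n_fractions
    return total_dose * (1.0 + dose_per_fraction / alpha_beta)


# Prescription schemes taken from the text: (dose per fraction in Gy, fractions).
schemes = {
    "1 x 24 Gy": (24.0, 1),
    "3 x 15 Gy": (15.0, 3),
    "8 x 7.5 Gy": (7.5, 8),
}

# Tumor BED with alpha/beta = 10 Gy, as assumed in the text.
tumor_bed = {name: bed(d, n, 10.0) for name, (d, n) in schemes.items()}

for name, value in tumor_bed.items():
    print(f"{name}: BED10 = {value:.1f} Gy")
```

Passing `alpha_beta=3.0` instead gives the corresponding value for lung tissue under the same model.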

Pulmonary function tests (PFTs) performed before SBRT (median 44 days; range 3–70 days) and at a median of 9.3 months after SBRT (range 5.8–18.1 months) were analyzed. These included the following: FEV1, forced vital capacity (FVC), total lung capacity (TLC), residual volume, and airway resistance (R).

Radiologic changes were defined as acute changes when they were registered within the first 6 months following SBRT, and late changes when they occurred at or after 6 months. The applied classification system was initially described by Trovo et al. and later specified by Dahele et al. (18, 26). Herein, acute findings were grouped into five categories: no parenchymal abnormalities (NPA), patchy ground-glass opacity (PGGO), diffuse GGO (DGGO), patchy consolidation (PCO), and diffuse consolidation (DCO) (**Figure 1A**). Late CT changes were classified into four different categories: NPA, scar-like fibrosis (SLF), mass-like fibrosis (MLF), and modified conventional pattern of fibrosis (MCPF) (**Figure 1B**) (18, 26). Radiologic changes were categorized by two experienced radiation oncologists with the support of an experienced pulmonary radiologist.

For general scoring, the severity score introduced by Dahele et al. was applied. Radiographic changes were classified as "severe" (massive changes), "moderate" (extensive, but commonly expected changes), "mild/minor" (rare changes only), or "none" (**Figure 1C**).

Figure 1 | Classification of radiologic changes following stereotactic body radiotherapy (SBRT). (A1–4) Acute parenchymal changes within the first 6 months after SBRT. One category [no parenchymal abnormalities (NPA)] is not shown. (B1–3) Late parenchymal changes after 6 months following SBRT. One category (NPA) is not shown. (C1–2) Severity score for radiologic changes with (C1) classified as no/mild changes and (C2) classified as moderate/severe changes. GGO, ground-glass opacity.

#### Statistical Analysis

Overall survival, local progression-free survival (LPFS), and distant progression-free survival (DPFS) were calculated using the Kaplan–Meier method. Survival curves were compared between groups in univariate analysis using the log-rank test or Cox regression analysis. Multivariable Cox models were fitted including all variables with *p* ≤ 0.05 in univariate analysis. Correlations between baseline factors as well as irradiation doses and severity of CT changes were assessed using Spearman's or Pearson's correlation coefficients. McNemar's test was applied to assess the association between early and late severity scores. Groups were compared using Mann–Whitney *U* tests or χ<sup>2</sup> tests for continuous or categorical data, respectively. The non-parametric Wilcoxon signed-rank test was applied for assessing pulmonary function changes. Receiver operating characteristic (ROC) curves and Youden's index were used to determine the optimal cutoff for FVC reduction or mean ipsilateral lung dose in BED (MLD) in predicting OS after 2 years. A *p*-value ≤ 0.05 was considered statistically significant. All statistical analyses were performed with SPSS software (version 20.0).
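To make the cutoff selection concrete, the sketch below shows one common way of choosing a threshold with Youden's index, J = sensitivity + specificity − 1, in plain Python. It is not the SPSS procedure used in the study; the function name `youden_cutoff`, the rule "predicted positive when value ≥ cutoff", and the toy data are all assumptions made for illustration and are not study data.

```python
from typing import List, Tuple


def youden_cutoff(values: List[float], events: List[bool]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    A case is predicted positive when its value is >= cutoff.
    On ties in J, the smallest cutoff is kept.
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")

    best_cutoff, best_j = min(values), -1.0
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, e in zip(values, events) if e and v >= cutoff)
        true_neg = sum(1 for v, e in zip(values, events) if not e and v < cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical toy data (NOT study data): a dose-like predictor and
# whether the event of interest occurred within 2 years.
predictor = [4.0, 6.5, 8.0, 9.0, 10.5, 12.0, 13.5, 15.0]
event = [False, False, False, True, True, False, True, True]

cutoff, j = youden_cutoff(predictor, event)
print(f"optimal cutoff = {cutoff}, Youden's J = {j:.2f}")
```

Scanning only the observed values as candidate thresholds is sufficient, because sensitivity and specificity change only at those points.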

#### RESULTS

#### Survival and Local Control

Patient and treatment characteristics are displayed in **Table 1**. With a median follow-up time of 32.2 months (range 14.6–104.3 months), 2- and 3-year OS was 83% and 60%, respectively (**Figure 2A**). Two- and three-year LPFS was 88% and 80% (**Figure 2B**), while 2- and 3-year DPFS was 84% and 74%, respectively. OS, LPFS, and DPFS were not significantly affected by any potential risk factor investigated (**Table 2**).

#### Table 1 | Patient and treatment characteristics.

*SBRT, stereotactic body radiotherapy; FDG-PET, fluoro-deoxy-glucose positron emission tomography; BED, biological effective dose; PTV, planning target volume.*

#### Parenchymal Lung Changes after SBRT

In total, 463 CT scans of 70 patients were reviewed for parenchymal lung changes. A median of five CT scans (range 3–17) could be evaluated per patient, covering a median time frame of 20.0 months after SBRT (range 12.2–78.8 months). The median time to onset of CT changes was 2.5 months (range 1.6–8.8 months).

Acute radiologic changes within the first 6 months (113 CT scans available) following SBRT were assessed for each patient: NPA were detected in 10% of the cases, while 63 patients (90%) displayed acute parenchymal changes. Of the whole cohort, 11% showed PGGO, 25% DGGO, 25% PCO, and 29% DCO (**Figure 1A**).

Late parenchymal changes were detected to some degree in all CT scans available (**Figure 1B**). After 6 months following SBRT (60 patients with CT scans available), 10% of the cases showed SLF, 7% MLF, and in 83% of the patients MCPF was detected. Parenchymal changes slightly decreased 12 months post-SBRT with 14% SLF, 9% MLF, and 77% MCPF (64 patients with CT scans available). After 18 months, a further reduction in parenchymal changes was registered (156 CT scans in 41 patients): 20% SLF, 9% MLF, and 71% MCPF.

Most of the tumors had an acute severity score of 0 (none, *n* = 10, 14%), 1 (mild, *n* = 43, 62%), or 2 (moderate, *n* = 16, 23%). Only one patient suffered from acute severe changes (score = 3) after SBRT. The pattern for chronic severity score was as follows: mild (score 1): 66%, moderate (score 2): 33%, and severe (score 3): 1%. The two patients with severe radiologic changes developed radiation pneumonitis CTCAE grade III requiring corticosteroids and oxygen support until resolution of symptoms. Two additional patients developed CTCAE grade II radiation pneumonitis. In total, 5.7% of the patients suffered from grade ≥II radiation pneumonitis.

*FDG-PET, fluoro-deoxy-glucose positron emission tomography; BED, biological effective dose; PTV, planning target volume.*

*The variables sex, staging FDG-PET, histology, TNM stage, tumor location, smoking status, and PTV-encompassing biological effective total dose were analyzed as categorical variables, while the other variables were taken as continuous variables for analysis.*

The severity of acute CT changes predicted the severity of late changes (*p* = 0.027). We did not detect any significant correlations between maximum severity score for each tumor and gender (*p* = 0.085), patient age (*p* = 0.366), Karnofsky performance score (*p* = 0.426), tumor histology (*p* = 0.333), TNM stage (*p* = 0.190), tumor location (*p* = 0.329), smoking status (*p* = 0.502), smoking history in number of pack-years (*p* = 0.473), total dose in BED (*p* = 0.705), single dose (*p* = 0.643), or number of fractions (*p* = 0.625). However, both planning target volume (PTV) size and MLD in BED were predictive for parenchymal lung changes measured as the maximum severity score (*p* < 0.001 and *p* = 0.014, respectively).

Furthermore, OS was significantly reduced if scans showed moderate or severe parenchymal lung changes (*p* = 0.043, HR 1.928 [CI 1.020–3.644]) (**Figure 3A**). Specifically, patients with a maximum severity score of 0–1 (none/mild) showed 2- and 3-year OS of 83 and 65%, while patients with a maximum severity score of 2–3 (moderate/severe) experienced 2- and 3-year OS of 78 and 51%, respectively. In addition, OS was significantly influenced by MLD but not by PTV size (*p* = 0.025, HR 1.046 [CI 1.002–1.092]; *p* = 0.408, HR 1.004 [CI 0.995–1.013]) (**Figure 3B**). A cutoff MLD of 9.72 Gy was calculated in ROC analysis. Patients treated with an MLD < 9.72 Gy showed 2- and 3-year OS of 89.2% and 67.7%, while patients with an MLD ≥ 9.72 Gy only had 2- and 3-year OS rates of 73.6% and 48.6%, respectively (*p* = 0.042, HR 1.904 [CI 1.017–3.563]). Neither LPFS nor DPFS was significantly affected by maximum severity score, MLD, or PTV size (*p* ≥ 0.05).

#### Functional Lung Changes after SBRT

In total, paired PFTs were available for 57 of the analyzed 70 patients before and after SBRT. PFTs were obtained at a median of 44 days before SBRT (range 3–70 days) and 9.3 months (range 5.8–18.1 months) after SBRT. Detailed PFT data are illustrated in **Table 3**.

None of the analyzed pre- and posttreatment PFT parameters significantly affected OS, LPFS, or DPFS (*p* > 0.05). However, lung function was significantly reduced after SBRT: TLC (−0.52 L, *p* = 0.001), FVC (−0.45 L, *p* < 0.001), FEV1 (−0.17 L, *p* < 0.001), FEV1% (−5.2%, *p* < 0.001), and airway resistance (+0.09 kPa s/L, *p* = 0.003) (**Table 3**).

As a next step, we evaluated whether absolute differences between pre- and post-interventional PFT parameters could predict outcome. While we did not detect a significant effect of TLC, FEV1, FEV1%, or resistance, treatment-related reduction in FVC significantly affected survival (*p* = 0.007, HR 3.910 [CI 1.445–10.575]). A cutoff FVC reduction of 0.54 L was calculated in ROC analysis. Patients with a reduction in FVC ≥ 0.54 L showed significantly worse 2- and 3-year OS of 71% and 35%, while patients with an FVC reduction < 0.54 L had 2- and 3-year OS rates of 93% and 73%, respectively (*p* = 0.011, HR 2.439 [CI 1.227–4.849]) (**Figure 3C**). Absolute reductions in FVC did not significantly correlate with MLD (*p* = 0.913), PTV size (*p* = 0.334), or maximum severity score of parenchymal changes (*p* = 0.546).

Finally, multivariate analysis revealed MLD ≥ 9.72 Gy and FVC reduction ≥ 0.54 L to be statistically significant independent prognostic factors for OS (*p* = 0.029, HR 1.037 [CI 1.011–1.089]; *p* = 0.004, HR 2.347 [CI 1.167–4.723]). Maximum severity score of parenchymal changes was not identified as an independent prognostic factor for OS (*p* = 0.140, HR 1.289 [CI 0.601–2.766]).

#### DISCUSSION

Pulmonary SBRT is believed to be a milder treatment with fewer side effects than surgery involving lobectomy and systematic lymphadenectomy, as it is primarily offered to patients with reduced performance score who are classified medically inoperable (2, 11, 27). In this study, we investigated early and late radiographic lung injury as well as pulmonary function changes following SBRT. In general, most patients only showed mild to moderate parenchymal and functional lung alterations that did not translate into reduced clinical performance in the majority of cases.

Figure 3 | (A) Overall survival (OS) was significantly reduced if patients showed moderate/severe radiologic changes following stereotactic body radiotherapy (SBRT) compared to patients with only none/mild parenchymal changes (*p* = 0.043). (B) Patients treated with an MLD ≥ 9.72 Gy suffered from worse OS (*p* = 0.042). (C) OS was significantly impaired if patients had an absolute reduction in FVC ≥ 0.54 L following SBRT (*p* = 0.007). FVC, forced vital capacity; MLD, mean ipsilateral lung dose in biological effective dose.

*SBRT, stereotactic body radiotherapy; TLC, total lung capacity; FVC, forced vital capacity; FEV1, forced expiratory volume in 1 s; FEV1/FVC, forced expiratory volume in 1 s divided by forced vital capacity.*

*\*p* ≤ 0.05

Regarding parenchymal lung changes in follow-up imaging studies, nearly all patients only developed minor changes, while severe changes were only noticed in two patients (2.9%) and were transient. All patients were diagnosed with radiological changes following SBRT at some time of follow-up, which is known to impair diagnosis of local recurrence (19, 28). Similar acute and chronic patterns of CT changes were reported by Trovo et al. and Dahele et al. (18, 26).

However, to the best of our knowledge, this is the first investigation describing a significant association between MLD and survival following SBRT (**Figures 3A,B**).

Several groups have shown a dose–response relationship for radiation-induced pneumonitis following SBRT (29, 30). Furthermore, a recent pooled analysis of 88 studies investigating lung toxicity after SBRT reported MLD as well as large tumor size to be significant adverse risk factors for pneumonitis and lung fibrosis (15). Indeed, we also detected a significant correlation between both MLD as well as PTV size and maximum severity score of radiological CT changes. In comparison, when regarding radiotherapy for locally advanced NSCLC patients, an association between radiation exposure to normal lung and severe pneumonitis is also well known (31). Furthermore, development of radiation pneumonitis and generalized radiological changes after radiotherapy in locally advanced NSCLC patients were shown to be independent negative prognostic factors for survival (32). A recent study even underlined the predictive impact of lung dose and especially MLD on survival analyzing prognostic factors in 468 patients with stage IIIA– IIIB NSCLC (33). Our study now shows that SBRT with an MLD ≥ 9.72 Gy was associated with significantly worse survival. However, due to the low number of patients and the limited number of events recorded in this analysis, this finding has to be interpreted with caution. All patients included in this study were classified medically inoperable and suffered from severe pulmonary comorbidities, which probably highly impaired survival. Nevertheless, a recent study showed that dose to heart substructures was associated with non-cancer death after SBRT in stage I–II NSCLC patients (34). Hence, dose spillage to the heart and healthy lung tissue should be kept as low as reasonably possible when performing SBRT.

In a second step, we analyzed functional lung changes and detected a significant decline following SBRT for TLC, FVC, FEV1, and FEV1% (*p* ≤ 0.001). In contrast to our results, Stanic et al. and Stephans et al. did not show any significant change in pulmonary function examining lung function in 55 and 92 patients after SBRT (21, 35). However, a recent study by Stone et al. also reported a significant decline for FEV1, diffusion capacity, FVC, and TLC following SBRT (22). Nevertheless, most studies did not show any association of lower baseline or post-SBRT pulmonary function with worse survival (21, 22, 36, 37). Guckenberger et al. only described a significant impact of pretreatment, and not posttreatment, diffusion capacity of carbon monoxide on survival (thus not treatment-related) (10). Notably, absolute reduction in FVC was shown to be an independent prognostic factor for OS in this analysis (**Figure 3C**), indicating a possible influence of radiation-induced restrictive lung disease upon survival. This might be due to the fact that this analysis is the only study investigating the prognostic impact of absolute loss in PFTs and not only pre- and posttreatment pulmonary function parameters.

Similar to other studies, reduction in FVC did not significantly correlate with prognostic factors for lung toxicity such as MLD and PTV size in this analysis (20, 21). There was also no significant correlation between absolute reduction in FVC and maximum severity score of parenchymal changes. This finding is supported by reports about conventionally fractionated radiotherapy in which a dose–effect relationship for posttreatment PFT changes is also missing (38). This might be explained by the fact that irradiation of lung tumors may even improve PFT by tumor shrinkage or reopening of atelectasis (39). Furthermore, the vast majority of these patients had severe COPD with decline in pulmonary function on the basis of natural disease progression (40, 41). Hence, the detected loss in pulmonary function has to be interpreted with caution and might also be caused by the natural progression of preexisting COPD (40, 41). Larger, multicenter studies are needed to evaluate the possible prognostic impact of MLD and lung function changes following SBRT on survival. In this study, we did not identify distinct factors that reliably predict lung toxicity following SBRT. Nevertheless, other factors such as pretreatment immune status are reported to predict toxicity after SBRT (42).

Despite the reported parenchymal and functional lung changes, survival and local control rates detected in this analysis were comparable to other studies and still much higher in comparison to conventionally fractionated radiotherapy (7–9, 16, 43). In detail, 3-year OS, progression-free survival (PFS), LPFS, and DPFS rates were 60, 65, 80, and 74%, respectively. According to current guidelines, the benchmark 3-year local control rate following SBRT is 90% or higher when a BED > 100 Gy is applied (2). As this study included data from 2004 to 2015, several patients were treated with lower doses, which might have led to the slightly reduced local control rates in this analysis.

The higher PFS compared to the detected OS in this analysis might raise the question whether some patients were overtreated, although other studies reported similar results (42, 44). Furthermore, not all patients received histopathological confirmation of disease due to reduced performance score but had fluoro-deoxy-glucose positron emission tomography positive tumors. The severity of pulmonary comorbidities is known to be an important predictor for survival for lung cancer patients—not only after SBRT (45, 46). However, two recent reports stated that withholding SBRT in patients with severe COPD is not justified (47, 48). Hence, in some patients SBRT might transfer the cause of death from tumor disease to pulmonary comorbidities.

As this study analyzed pulmonary SBRT for early-stage NSCLC delivered between 2004 and 2015, patients were mainly treated with less advanced radiation techniques; in return, survival data were available. For example, regular performance of 4-D-CT scans to account for tumor motion started in 2009 in our department. Hence, larger safety margins leading to larger PTV sizes and higher MLD were needed. Today, advances in radiation planning and delivery techniques such as intensity-modulated radiotherapy (IMRT) and elaborate image guidance including gating and tracking help to further minimize the dose to normal tissue and therefore reduce side effects. Some limitations of this study deserve mention. First, aside from the small sample size and retrospective nature, paired PFTs were not available for all patients. Second, further lung dose parameters such as *V*5Gy and *V*20Gy were not accessible. Third, requiring follow-up CT scans for more than 1 year was not possible, as several patients only received X-ray scans for follow-up imaging after 1 year due to their poor performance status. Fourth, due to the retrospective character of this study, detailed analysis of cardiopulmonary comorbidities and their potential impact on survival was not possible.

Analyzing parenchymal and functional lung injury following SBRT, we detected only mild radiological changes and tolerable reduction in pulmonary function for most patients. However, this study showed a significant association between SBRT with a higher MLD and inferior survival. Furthermore, higher absolute reduction in FVC significantly impaired survival in this analysis. Nevertheless, these results have to be interpreted with caution due to the limited number of patients and the retrospective character of this study. Natural progression of pulmonary comorbidities including COPD surely also led to reduced survival in this patient group. If toxicity of SBRT had an impact on survival in this study, this was potentially caused by the interaction with preexisting pulmonary comorbidities.

Based on this study, modern radiotherapy methods including delivery techniques such as IMRT and daily image guidance should be applied to minimize PTV sizes and keep MLD as low as reasonably possible. Furthermore, larger prospective and multicenter studies are needed to evaluate the potential prognostic impact of parenchymal and functional lung changes on survival.

#### ETHICS STATEMENT

This retrospective study was carried out in accordance with the recommendations of the Ethics committee of the University Hospital Heidelberg. According to the committee's decision based on the retrospective character of this analysis, no additional written informed consent of all patients was needed.

### AUTHOR CONTRIBUTIONS

JH-R carried out data collection as well as statistical analysis and drafted the manuscript. JDern helped with data collection. DB, SA, and AP assisted with patient care. LK helped with figure and table preparation. VV assisted with manuscript preparation. JK and HH were responsible for pulmonary patient treatment. CH supported assessment of radiologic changes on CT scans and classifying of severity. SR and JDebus conceived of the study, and participated in its design and coordination and helped to draft the manuscript. All the authors were responsible for data interpretation, participated in manuscript revisions, and approved the final manuscript.

### FUNDING

This work was supported by the Medical Faculty of Heidelberg University providing a research grant for JH-R. No additional funding was received.

#### REFERENCES

after stereotactic radiation therapy for lung tumors. *Radiat Oncol* (2007) 2:21. doi:10.1186/1748-717X-2-21


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Hörner-Rieber, Dern, Bernhardt, König, Adeberg, Verma, Paul, Kappes, Hoffmann, Debus, Heussel and Rieken. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*


# Relationships between Regional Radiation Doses and Cognitive Decline in Children Treated with Cranio-Spinal Irradiation for Posterior Fossa Tumors

*Elodie Doger de Speville<sup>1,2,3</sup>, Charlotte Robert<sup>4,5,6,7</sup>, Martin Perez-Guevara<sup>8</sup>, Antoine Grigis<sup>9</sup>, Stephanie Bolle<sup>4</sup>, Clemence Pinaud<sup>1,2</sup>, Christelle Dufour<sup>3</sup>, Anne Beaudré<sup>4,7</sup>, Virginie Kieffer<sup>3,10</sup>, Audrey Longaud<sup>3,11</sup>, Jacques Grill<sup>3,11</sup>, Dominique Valteau-Couanet<sup>3,11</sup>, Eric Deutsch<sup>4,5,6,7</sup>, Dimitri Lefkopoulos<sup>4,7</sup>, Catherine Chiron<sup>1,2</sup>, Lucie Hertz-Pannier<sup>1,2</sup> and Marion Noulhiane<sup>1,2</sup>\**

#### *Edited by:*

*Bhadrasain Vikram, National Cancer Institute (NIH), United States*

#### *Reviewed by:*

*John Austin Vargo, West Virginia University Hospitals, United States Jeff Buchsbaum, National Institutes of Health (NIH), United States*

#### *\*Correspondence:*

*Marion Noulhiane marion.noulhiane@parisdescartes.fr*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 11 May 2017 Accepted: 25 July 2017 Published: 18 August 2017*

#### *Citation:*

*Doger de Speville E, Robert C, Perez-Guevara M, Grigis A, Bolle S, Pinaud C, Dufour C, Beaudré A, Kieffer V, Longaud A, Grill J, Valteau-Couanet D, Deutsch E, Lefkopoulos D, Chiron C, Hertz-Pannier L and Noulhiane M (2017) Relationships between Regional Radiation Doses and Cognitive Decline in Children Treated with Cranio-Spinal Irradiation for Posterior Fossa Tumors. Front. Oncol. 7:166. doi: 10.3389/fonc.2017.00166*

*<sup>1</sup>INSERM U1129, CEA, Paris Descartes University, Paris, France, <sup>2</sup>UNIACT, Institut Joliot, DRF, Neurospin, CEA, Paris Saclay University, Gif-sur-Yvette, France, <sup>3</sup>Department of Pediatric and Adolescent Oncology, Gustave Roussy, Villejuif, France, <sup>4</sup>Radiation Oncology Department, Gustave Roussy Cancer Campus, Villejuif, France, <sup>5</sup>INSERM, U1030, Villejuif, France, <sup>6</sup>Paris Sud University, Paris-Saclay University, Villejuif, France, <sup>7</sup>Gustave Roussy, Paris-Saclay University, Department of Medical Physics, Villejuif, France, <sup>8</sup>INSERM U992 Unicog CEA, Neurospin, Paris Descartes University, Paris, France, <sup>9</sup>Institut Joliot, Neurospin, CEA, Paris-Saclay University, Gif-sur-Yvette, France, <sup>10</sup>CSI Department for Children with Acquired Brain Injury, Hopitaux de Saint Maurice, Saint-Maurice, France, <sup>11</sup>Paris Sud University, Orsay, France*

Pediatric posterior fossa tumor (PFT) survivors who have been treated with cranial radiation therapy often suffer from cognitive impairments that might relate to IQ decline. Radiotherapy (RT) distinctly affects brain regions involved in different cognitive functions. However, the relative contribution of regional irradiation to the different cognitive impairments still remains unclear. We investigated the relationships between the changes in different cognitive scores and radiation dose distribution in 30 children treated for a PFT. Our exploratory analysis was based on a principal component analysis (PCA) and an ordinary least squares regression approach. The use of a PCA was an innovative way to cluster irradiated regions that were correlated because of similar radiation therapy protocols across patients. Our results suggest an association between working memory decline and a high dose (equivalent uniform dose, EUD) delivered to the orbitofrontal regions, whereas the decline of processing speed seemed more related to EUD in the temporal lobes and posterior fossa. Identifying regional effects of RT on cognitive functions may help in proposing a rehabilitation program adapted to the risk of cognitive impairment.

#### Keywords: pediatric, posterior fossa, radiotherapy, cognitive impairments, radiation effects

## INTRODUCTION

Posterior fossa tumors (PFTs) account for two-thirds of all pediatric brain tumors (1). The most common malignant PFT is medulloblastoma (40%), followed by ependymoma (10%) (2). As a result of improved treatment, event-free survival has significantly increased (3). However, these children suffer from varied cognitive impairments, the most frequently described being decreased sustained attention, working memory, and information processing speed (4). This latter impairment seems to appear first in PFT patients treated with cranio-spinal irradiation (CSI) (4). These cognitive impairments might relate to the decline of global intellectual functioning (full scale IQ, FSIQ) reported to be between two and four points per year (5–9). Several factors have been shown to predict cognitive impairments in PFT patients. Radiotherapy (RT) has been considered to be the major one, especially in young children (6, 8). Three RT-associated risk factors have been highlighted as predictors of cognitive impairments: (i) CSI (6, 7, 10), (ii) the volume receiving the boost [i.e., to the posterior fossa (PF)] (11), and (iii) the dose per fraction (12, 13). Grill et al. (10) observed that PFT survivors with low CSI (25 Gy) showed better cognitive outcomes than those receiving high CSI (36 Gy). Nonetheless, the reduction of CSI dose (14) did not prevent IQ decline (9). An alternative way to decrease cognitive impairments has been to reduce the volume of the PF irradiated, in addition to the reduced CSI. While the PF received the highest dose, the boost dose also contributed to higher doses in other regions such as the temporal lobes, the brainstem, and the hypothalamus (11). Moxon-Emre et al. (15) showed that medulloblastoma survivors for whom the CSI was reduced, and the boost volume was reduced from the entire PF to the tumor bed, had preserved IQ over time. 
Nonetheless, medulloblastoma survivors treated *via* either a CSI dose reduction or a diminution of PF volume irradiated (tumor bed boost) still experienced a decline of IQ.

Recent studies reported a higher contribution of specific brain regions to the development of RT-induced cognitive decline. Jalali et al. (16) observed that more than 43.2 Gy to >13% of the left temporal lobe predicted IQ decline in patients treated for a benign tumor with stereotactic conformal RT. Merchant et al. (6) assessed the impact on IQ change of different mean dose values in distinct regions (whole brain, temporal lobe, hippocampus, infratentorial, and supratentorial spaces) in patients treated for a medulloblastoma, and suggested that the supratentorial space was the most sensitive across the brain. Using a neurocognitive questionnaire, Armstrong et al. (17) pointed out a strong association between maximum radiation dose to the temporal lobe and poor performance in *Task efficiency* (i.e., attention and processing speed) and *Organization.* These subscores were measured with the Childhood Cancer Survivor Study Neurocognitive Questionnaire. While these studies did not identify a relationship between radiation dose to the PF and changes in cognitive scores, such associations have been reported in children with ependymoma (18).

Despite marked progress, the regional effect of RT on cognitive impairment still remains unclear. So far, research on this question has been mainly carried out on either single (19) or large (6) brain regions, limiting the analysis to specific anatomical structures. In this study, we implemented a whole-brain analysis to investigate the relation between regional biological dose and changes over time in different cognitive scores (IQ, processing speed, and working memory) in 30 patients treated for a PFT. The use of a principal component analysis (PCA) was an innovative way to cluster irradiated regions that were correlated because of similar radiation therapy protocols across patients. We aimed to describe the relationships between regional radiation dose and declines in specific cognitive functions.

### PATIENTS AND METHODS

#### Patients' Characteristics

Inclusion criteria were (1) PFT patients treated at Gustave Roussy Cancer Campus between 2000 and 2014; (2) 17 years of age or under at diagnosis; (3) multiple (>2) IQ assessments after treatment onset; and (4) for the PFT patients treated with radiation therapy, availability of the computed tomography (CT) scan, T1-weighted MRI, and dosimetric maps. Thirty patients (14 males and 16 females) matched these criteria. Information was gathered from medical files about the history of the illness (i.e., age at diagnosis) and the type of treatment (i.e., surgery, chemotherapy, and radiation therapy protocol). The underlying malignancy of the 30 patients studied was medulloblastoma, ependymoma, astrocytoma, and embryonal tumor in 25, 3, 1, and 1 patients, respectively. Twenty patients had localized disease and 10 had metastatic disease. Complete tumor resection was achieved in 20 patients. Post-operative complications occurred in 10 patients. No patient relapsed between the two evaluations; the patient with an astrocytoma, who relapsed before the first evaluation, was treated with chemotherapy alone. The mean age at diagnosis was 4.62 years (SD = 3.05; range 0.49–12.24). The mean delay between treatment and the last assessment was 4.60 years (SD = 4.60; range 1.28–14.24). Pre-operative hydrocephalus was present in 19 patients (63%). Seventeen patients were treated with RT alone (*N* = 7) or RT and chemotherapy (*N* = 10). The remaining patients were treated with chemotherapy alone and were used as controls. All patients with RT received a CSI and a boost in the PF and were treated with three-dimensional conformal radiation therapy (**Table 1**). This study was approved by an ethical committee (CPP no. 14973, Gustave Roussy, Villejuif, France).

Table 1 | Absorbed dose and type of fractionation [conformational fractionation (CF) vs. hyperfractionated radiotherapy (HFRT)] prescribed to the cranio-spinal irradiation (CSI) and posterior fossa (PF) for the 17 patients.


*The number of fractions per day and the dose per fraction varied from one patient to another. Some patients received HFRT, i.e., two fractions of 1 Gy per day with an interfraction interval of 8 h, whereas other patients were treated by CF, i.e., one fraction of 1.82 Gy per day.*

### Neuropsychological Assessments

Three cognitive indices were estimated from the age-appropriate Wechsler Intelligence Scales (20, 21): FSIQ in all patients and the processing speed index (PSI) and working memory index (WMI) when available. Neuropsychological assessments were done by formally trained neuropsychologists from the pediatric department. Because of age or time constraints, not all participants were administered all the tests. Thus, the number of patients assessed varied across cognitive scores [*N* (ΔFSIQ) = 30, *N* (ΔPSI) = 23, *N* (ΔWMI) = 14]. Patients were evaluated at variable time points after treatment onset. Thus, the delay (ΔT) between the first and last neuropsychological assessments varied from one patient to the other (**Table 2**). The change in cognitive scores (ΔFSIQ, ΔPSI, and ΔWMI) of each patient was calculated as the difference between the first and last scores. We did not consider intermediate scores. Changes in cognitive scores (ΔFSIQ, ΔWMI, and ΔPSI) were compared using two-tailed Student's *t*-tests.

### NEUROIMAGING DATA

To study regional dose effects on changes in cognitive scores, 3D-T1 MRI, CT scan, and absorbed dose maps of patients treated with RT (*n* = 17) were collected and processed to create individual dose distribution maps into selected brain regions of interest (ROI) covering the whole brain.

#### Image Collection

3D T1-MR images were acquired on a 3-T scanner using a 32-channel head coil (General Electric, Milwaukee, WI, USA). In this clinical retrospective study, two types of T1-weighted images were collected: 3D T1-weighted sagittal slices (matrix: 256 × 256, pixel size: 0.5 mm, slice thickness: 1 mm, FOV: 240 mm) and 3D T1-weighted axial slices (matrix: 224 × 288, pixel size: 0.5 mm, slice thickness: 1 mm).

Computed tomography scans were acquired on a SIEMENS Sensation Open scanner located in the Gustave Roussy RT department (matrix: 512 × 512, pixel size: 0.8 mm, slice thickness: 3 mm). Radiation dose maps (RD maps) were computed with the ISOgray™ Treatment Planning System (DOSIsoft, version 4.1, Cachan, France). The Clarkson–Cunningham model was used for dose calculation. The dose map resolution was 3 mm × 3 mm × 1 mm.

### Image Analysis

#### Image Preprocessing

We designed a five-step preprocessing pipeline to identify anatomical ROI on dose maps (**Figure 1**).

Table 2 | Changes in the three measured cognitive scores [Delta (Δ)] with the corresponding number of evaluated patients: mean score change (±SD, range) and mean test interval ΔT (±SD, range).


*Step 1*: We chose three MRI templates specific to ages 0–2, 2–5, and 5–9 years (151 × 192 × 152 voxels, 1 mm × 1 mm × 1 mm voxel size) and three corresponding anatomical atlases (151 × 192 × 152 voxels, 1 mm × 1 mm × 1 mm) from the Neurodevelopmental MRI DataBase (22). The atlases contained 56 ROIs extracted from the LPBA40 atlas (23) that were adapted to the selected ages using label propagation and decision fusion methods (24). For each child, we selected both the atlas and the associated template according to the age at which the child received radiation therapy, to be as close as possible to the individual anatomy, which varies significantly during development (22, 25). Since the atlases did not include some particular regions (i.e., corpus callosum, a part of the internal capsule, and ventricles), we created a supplementary label that encompassed these regions, resulting in 57 ROIs.

*Step 2*: The selected template was warped to the individual patient 3D-T1 image using a non-linear registration tool [Advanced Normalization Tools, SyN (26), and ANTS (27)].

*Step 3*: Individual MR images were also registered to the corresponding individual CT scan by applying a linear transformation with FSL [FLIRT (28)].

*Step 4*: Each CT scan was then downsampled to match the voxel sampling of the corresponding RD map.

*Step 5*: Finally, we combined the computed transformations into a single concatenated transformation from the template space to the individual dose map coordinate system. This enabled us to perform statistical radiation dose analyses over the group in each ROI extracted from the template.
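The concatenation in Step 5 can be illustrated for the linear parts of the chain only. The sketch below is not the authors' implementation: the non-linear template-to-MRI warp of Step 2 cannot be written as a single matrix and is left out, and the translation, voxel sizes, and helper names (`scaling_affine`, `apply_affine`) are hypothetical values chosen for illustration. It shows how voxel-to-millimetre scalings and a rigid MRI-to-CT transform combine into one 4 × 4 homogeneous matrix.

```python
import numpy as np

def scaling_affine(voxel_size_mm):
    """Voxel-index -> millimetre affine for an axis-aligned grid at the origin."""
    return np.diag([*voxel_size_mm, 1.0])

def apply_affine(affine, xyz):
    """Apply a 4x4 homogeneous transform to a single 3D point."""
    return (affine @ np.append(xyz, 1.0))[:3]

# Hypothetical rigid MRI -> CT transform in millimetres (translation only here).
mri_to_ct_mm = np.eye(4)
mri_to_ct_mm[:3, 3] = [2.0, -3.0, 1.5]

# Assumed grids: 1 mm isotropic MRI, 3 x 3 x 1 mm dose map.
mri_vox_to_mm = scaling_affine((1.0, 1.0, 1.0))
dose_vox_to_mm = scaling_affine((3.0, 3.0, 1.0))

# Concatenate: MRI voxel -> mm -> CT/dose mm -> dose-map voxel.
mri_vox_to_dose_vox = np.linalg.inv(dose_vox_to_mm) @ mri_to_ct_mm @ mri_vox_to_mm

mri_voxel = np.array([30.0, 60.0, 10.0])
dose_index = apply_affine(mri_vox_to_dose_vox, mri_voxel)
```

Composing the matrices once and applying the result means that each atlas label is interpolated a single time rather than once per step, which is the usual motivation for concatenating transformations.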

Individual registrations were assessed qualitatively by two experimenters, first independently and then by consensus. Following this check, four subjects were excluded from the study. In the majority of cases, registrations were adjusted manually to optimize intersubject comparisons.

#### Data Analysis

We designed a four-step analysis pipeline to determine the associations between both clinical variables and ROI dose distribution with changes in cognitive scores (**Figure 2**).

#### *Equating Dose Maps across Patients: EQD2 Computation*

*Step 1*: Fractionation parameters (dose per fraction and number of fractions per day) varied from one patient to another, so even at equal total doses, the biological effectiveness of the two types of irradiation differs (**Figure 2**, step 1). However, using the linear quadratic model (29), it is possible to calculate the total dose equivalent in terms of biological effects for two different fractionations (dose per fraction, time interval between two fractions) and a given tissue (EQD2). Using this equation, all treatments are reduced to the biologically equivalent dose delivered in fractions of 2 Gy, which is the standard fractionation scheme. Therefore, we corrected the dose of all fractionation types in a uniform way by calculating in each voxel the equivalent dose with the EQD2 formula (30) (Eq. 1; **Figure 2**, step 1). The EQD2 was calculated taking into account a function (*Hm*) depending on the number of equally spaced fractions per day, the dose per fraction (*d*), and the sensitivity of the tissue (α/β). *D* (the total delivered dose in Gy) and *d* varied across patients. Based on the current literature, α/β was fixed to 2 Gy and the repair half-time T1/2 to 3 h (31):

$$\text{EQD2} = D \cdot \frac{d\,(1+H_m) + \alpha/\beta}{2+\alpha/\beta} \tag{1}$$
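As a worked illustration of Eq. 1, the following minimal sketch evaluates EQD2 for the two fractionation schemes described under Table 1. The text does not give the expression for *Hm*; the sketch assumes the usual mono-exponential incomplete-repair form, which reduces to *Hm* = 0 for one fraction per day. The function names and the total dose of 36 Gy are hypothetical.

```python
import math

def incomplete_repair_factor(m, interval_h, half_time_h):
    """H_m for m equally spaced fractions per day.

    Assumption: mono-exponential repair with half-time `half_time_h`;
    with a single daily fraction, repair is taken as complete (H_m = 0).
    """
    if m <= 1:
        return 0.0
    phi = math.exp(-math.log(2) * interval_h / half_time_h)
    return (2.0 / m) * (phi / (1.0 - phi)) * (m - (1.0 - phi**m) / (1.0 - phi))

def eqd2(total_dose, dose_per_fraction, m=1, interval_h=8.0,
         half_time_h=3.0, alpha_beta=2.0):
    """Eq. 1: equivalent dose in 2-Gy fractions (all doses in Gy)."""
    h_m = incomplete_repair_factor(m, interval_h, half_time_h)
    return total_dose * (dose_per_fraction * (1.0 + h_m) + alpha_beta) / (2.0 + alpha_beta)

# Hypothetical total dose of 36 Gy delivered with each scheme.
eqd2_cf = eqd2(36.0, 1.82, m=1)                     # one fraction of 1.82 Gy per day
eqd2_hfrt = eqd2(36.0, 1.0, m=2, interval_h=8.0)    # two fractions of 1 Gy, 8 h apart
```

Under these assumptions, the same physical dose yields a lower EQD2 with the hyperfractionated scheme than with one daily fraction, which is why the dose maps need to be converted before patients are compared.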

#### *Calculation of Dose Index: Equivalent Uniform Dose (EUD) Computation*

*Step 2*: After calculating the biological dose map of each patient, for all subjects and ROIs we computed the EUD that accounts for heterogeneity of dose distribution, as follows (Eq. 2) (**Figure 2**, step 2):

$$\text{EUD} = \left(\sum_{j} \nu_{j} D_{j}^{k}\right)^{\frac{1}{k}} \tag{2}$$

In Eq. 2, *νj* is the fraction of the ROI volume receiving dose *Dj*. The equivalent uniform dose corresponds to the value of a homogeneous dose that would cause the same clinical effect as the corresponding heterogeneous dose distribution (30). *k* was fixed at 5 according to the work of Emami et al. (32). We standardized EUD across the 17 subjects for each of the 57 ROIs.
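Eq. 2 can be sketched in a few lines. The example below is a minimal illustration rather than the authors' code: it assumes that every voxel of an ROI carries the same volume fraction unless explicit volumes are supplied, and the dose values and the function name `equivalent_uniform_dose` are made up.

```python
def equivalent_uniform_dose(doses, k=5.0, volumes=None):
    """Eq. 2: EUD = (sum_j v_j * D_j**k) ** (1/k).

    `doses` holds the (EQD2-corrected) dose of each voxel or dose bin in Gy.
    Assumption: equal volume fractions when `volumes` is not given.
    """
    n = len(doses)
    if volumes is None:
        fractions = [1.0 / n] * n
    else:
        total = float(sum(volumes))
        fractions = [v / total for v in volumes]
    return sum(v * d**k for v, d in zip(fractions, doses)) ** (1.0 / k)

# Hypothetical voxel doses (Gy) in one ROI: mostly CSI dose, one voxel in the boost.
roi_doses = [20.0, 25.0, 30.0, 54.0]
roi_eud = equivalent_uniform_dose(roi_doses)           # k = 5 as in the text
roi_mean = equivalent_uniform_dose(roi_doses, k=1.0)   # k = 1 gives the mean dose
```

With *k* > 1, the EUD lies between the mean and the maximum dose of the ROI, so high-dose voxels weigh more than they would in a simple average.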

#### *Taking into Account the Spatial Correlation of Radiation Doses across ROIs: PCA Approach*

*Step 3*: Because of the radiation therapy protocol (i.e., CSI and boost in the PF with three-dimensional conformal radiation therapy), EUD was highly correlated across brain regions (**Figure 2**, step 3). Therefore, it was not possible to assess the effect of irradiation on cognitive scores in each region with an ordinary least squares regression, as regression weights would be highly unstable. Thus, we ran a PCA, a data-driven method that clusters correlated variables into common factors named principal components (PCs). In this approach, highly correlated variables share higher weights within each factor/component, but components are uncorrelated. The PCA enabled us (1) to obtain uncorrelated components representative of the radiation dose distribution variability across subjects, each component revealing a brain network with a particular radiation pattern, and (2) to reduce the number of variables in our model, as the sample size was limited. We performed a PCA taking the normalized EUD of the ROIs as variables (**Figure 2**, step 3). Then, we selected the *n* < 57 PCs accounting for 90% of the variance (33). Due to the high correlation between regions, we retained only three components (PCs). To determine the spatial contribution of the ROIs to each PC-EUD, we computed the correlations between EUD in each ROI and each PC-EUD, and projected the correlation coefficients onto a glass brain.
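The component selection described in Step 3 can be sketched as follows. The analyses in this study relied on Scikit-learn; the example below instead uses a plain NumPy singular value decomposition as a stand-in, and the EUD matrix is synthetic (one shared dose factor plus regional noise), so the number of retained components will not match the three reported here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_rois = 17, 57

# Synthetic EUD matrix (subjects x ROIs): shared global dose factor plus noise.
global_dose = rng.normal(size=(n_subjects, 1))
eud = 30.0 + 5.0 * global_dose + rng.normal(scale=2.0, size=(n_subjects, n_rois))

# Standardize each ROI across subjects.
z = (eud - eud.mean(axis=0)) / eud.std(axis=0, ddof=1)

# PCA via SVD of the standardized matrix.
u, s, vt = np.linalg.svd(z, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Keep the smallest number of components reaching 90% of the variance.
n_keep = int(np.searchsorted(np.cumsum(explained), 0.90) + 1)
scores = u[:, :n_keep] * s[:n_keep]   # PC-EUD value of each subject

# Correlation between the EUD of each ROI and each retained component.
roi_pc_corr = np.array([[np.corrcoef(z[:, j], scores[:, c])[0, 1]
                         for c in range(n_keep)]
                        for j in range(n_rois)])
```

The rows of `roi_pc_corr` correspond to the per-ROI correlation coefficients that are projected onto the glass brain, and the columns of `scores` are mutually uncorrelated by construction.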

#### Highlighting the Respective Contribution of Clinical Variables and EUD-PCs on Clinical Score Changes

*Step 4*: We then entered the computed PCs-EUD and the clinical variables (chemotherapy, time since diagnosis, age at diagnosis, and ΔT) in a least squares regression (**Figure 2**, step 4). We first checked for multicollinearity, which could induce a biased estimation and a loss of power (34), using the variance inflation factor, which summarizes how well an independent variable is explained by the other variables. We removed regressors with a variance inflation factor >10 (35). In each regression, we examined *t*-scores to determine which variable had the most important effect on the cognitive scores of these 30 patients.
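The multicollinearity check in Step 4 can be illustrated with a short sketch. It computes the variance inflation factor of each regressor as 1/(1 − R²), where R² comes from regressing that column on all the others. The data are synthetic and the function name is hypothetical; Statsmodels offers a comparable routine, but plain NumPy is used here to keep the example self-contained.

```python
import numpy as np

def variance_inflation_factors(x):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the other columns."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    vifs = np.empty(p)
    for j in range(p):
        y = x[:, j]
        others = np.column_stack([np.ones(n), np.delete(x, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

rng = np.random.default_rng(1)
n = 30
delta_t = rng.uniform(1.0, 8.0, size=n)                  # interval between tests
time_since_dx = delta_t + rng.normal(scale=0.1, size=n)  # nearly duplicates delta_t
age_at_dx = rng.uniform(0.5, 12.0, size=n)

design = np.column_stack([delta_t, time_since_dx, age_at_dx])
vif = variance_inflation_factors(design)
flagged = vif > 10   # both collinear columns are flagged

# Dropping one of the two collinear regressors resolves the problem.
vif_after = variance_inflation_factors(design[:, [0, 2]])
```

In this synthetic setting, the two nearly identical time variables both exceed the threshold of 10, and removing one of them brings the remaining factors back to values close to 1.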

All analyses and plots were computed using the Python libraries *Nilearn*, *Scikit-learn*, and *Statsmodels* (36, 37).

### RESULTS

#### Neuropsychological Performance

At the time of the first neuropsychological assessment, the mean estimated IQ over the whole population was 87.5 (SD = 18.4; [45–130]). A declining performance over time was observed in 67, 64, and 48% of the patients for ΔFSIQ, ΔWMI, and ΔPSI, respectively. The remaining patients showed either preserved or better performance over time. However, there were no statistically significant differences between cognitive scores [ΔFSIQ vs. ΔWMI: *t*(32) = −0.64, *p* = 0.52; ΔPSI vs. ΔWMI: *t*(35) = −0.81, *p* = 0.42; ΔFSIQ vs. ΔPSI: *t*(51) = −0.37, *p* = 0.70] (**Table 2**). Moreover, ANOVAs were conducted to compare the three treatment groups (chemotherapy alone vs. RT and chemotherapy vs. RT alone) on their cognitive scores (ΔFSIQ, ΔWMI, and ΔPSI). There was no statistically significant difference between treatment groups in ΔFSIQ [*F*(2,30) = 2.36; *p* = 0.11; RT alone: M = −10.5 (±9.11), chemotherapy alone: M = 1.57 (±11.28), RT and chemotherapy: M = −2.0 (±11.03)], ΔWMI [*F*(2,14) = 1.17; *p* = 0.34; RT alone: M = −10 (±9.90), chemotherapy alone: M = −0.29 (±5.90), RT and chemotherapy: M = −4.60 (±10.15)], or ΔPSI [*F*(2,24) = 2.28; *p* = 0.12; RT alone: M = −10 (±9.90), chemotherapy alone: M = −0.29 (±5.90), RT and chemotherapy: M = −4.60 (±10.15)].

#### PCs Extracted from EUD of Anatomical ROIs

*PC1-EUD*, which explained 67% of the variance of original data, was strongly correlated (>0.50) with the dose (EUD) in all regions, especially in the supratentorial space. *PC2-EUD* explained 19% of the variance and was positively correlated with 16 regions in the PF, inferior occipital and temporal regions (see Table S1 in Supplementary Material). Meanwhile, three regions in the left superior occipital and parietal regions correlated negatively and moderately (>0.40) with *PC2-EUD* (see Table S1 in Supplementary Material). *PC3-EUD* explained 5% of the variance and had a moderate positive correlation (>0.40) with the EUD in the left orbitofrontal area. By contrast, the precuneus and the right cuneus negatively correlated with *PC3-EUD*. Values of all correlation coefficients are shown in Table S2 in Supplementary Material.

#### Effects of Clinical Variables and EUD Components on Cognitive Score Changes

Our final regression models included the three *PCs-EUD*, chemotherapy, age at diagnosis, and delay between assessments (ΔT). Indeed, in all models, time since diagnosis (variance inflation factor >10) was highly correlated with ΔT (variance inflation factor >10), while it was not the case between PCs (variance inflation factor <10), chemotherapy (variance inflation factor <10), and age at diagnosis (variance inflation factor <10). We thus removed time since diagnosis from the analysis and checked that all remaining variance inflation factor indices were below 10.

#### Clinical Variables and Cognitive Score Changes

ΔFSIQ was significantly negatively affected by age at diagnosis and interval between assessments (ΔT), and positively influenced by chemotherapy. These variables had no significant effect on the other scores (**Table 3**).

#### EUD Components and Cognitive Score Changes

ΔWMI was clearly negatively associated with both *PC1-EUD* and *PC3-EUD* and marginally with *PC2-EUD* (**Table 3**). *PC3-EUD* had the largest effect on ΔWMI, followed by *PC1-EUD* and *PC2-EUD* (**Figure 3**). The decline of WMI was first associated with an increase of EUD in the left orbitofrontal area (*PC3-EUD*) and then with an increase of EUD in all regions, especially in the supratentorial space (*PC1-EUD*). By contrast, an EUD increase in the precuneus and right cuneus was positively associated with ΔWMI (**Figure 3**).

Only *PC2-EUD* and *PC3-EUD* were found to have a negative and significant effect on ΔPSI, with a seemingly larger effect of *PC2-EUD* than of *PC3-EUD*, in contrast to ΔWMI (**Table 3**). The decline of PSI was first associated with an EUD increase in the PF, inferior occipital and temporal regions (*PC2-EUD*), followed by an increase in the left orbitofrontal area (*PC3-EUD*). By contrast, *PC2-EUD* and *PC3-EUD* were positively associated with ΔPSI in superior occipital and parietal regions (**Figure 3**).

Finally, *PC2-EUD* and *PC3-EUD* had similar and nearly significant negative effects on ΔFSIQ (**Table 3**). The decline of FSIQ was similarly associated with the increase of EUD in the PF, inferior occipital, temporal regions, and left orbitofrontal areas (*PC3-EUD* and *PC2-EUD*). By contrast, EUD in superior occipital and parietal regions was positively associated with ΔFSIQ (*PC2-EUD* and *PC3-EUD*) (**Figure 3**).

### DISCUSSION

Our main results suggest different regional associations between radiation dose (EUD) and changes in cognitive scores in patients treated for PFTs. In particular, we highlighted a link between working memory decline and radiation dose in the orbitofrontal region, whereas the decline in processing speed seemed more related to irradiation of the temporal lobes and the PF.

#### Effect of Clinical Variables on Cognitive Score Changes

Consistent with previous studies (5, 6), the FSIQ decline depended on the delay between the two IQ tests. As shown in previous studies (5, 38), chemotherapy does not seem to have a significant negative impact on PSI and WMI functioning. The surprising positive effect of chemotherapy on FSIQ change might be linked to the positive impact of repeated measurements, also known as the carry-over effect (or IQ test–retest effect) (39). Children gain familiarity with neuropsychological tasks over repeated testing sessions, which improves their performance. In such cases, the change in cognitive scores calculated from the difference between the first and last scores is positive. A large portion of children treated with chemotherapy alone showed an IQ improvement, which confirms the absence of a cognitive effect of chemotherapy (5, 38).

Table 3 | Effects on changes of cognitive scores (ΔFSIQ, ΔPSI, and ΔWMI) of the clinical variables and the components of the principal component analysis, according to our models (see Patients and Methods).


*For each variable, the t score and p value (in parentheses) are given, and for each model, the adjusted R2 indicates the total proportion of the score variance that was predicted from the variables.*

This also explains the unexpected negative effect of age at diagnosis on FSIQ change, as children treated with chemotherapy alone are usually young (below 5 years) at diagnosis.

### ROIs EUD and Cognitive Score Changes

All components seem to have specific impacts on changes of the working memory score (WMI). The radiation distribution pattern involving the left orbitofrontal regions (*PC3-EUD*) had the most negative impact on working memory. Interestingly, this result could be in line with Mabbott et al.'s findings (40). They observed that working memory performance over time differed according to the tumor location in children treated for a central nervous system germ cell tumor. Patients with pineal tumors showed an early, but stable, working memory deficit, whereas patients with suprasellar tumors experienced a significant working memory decline over time. Mabbott et al. suggested the observed decline was related to the radiation field rather than to the tumor location (40). In addition, this observation fits well with the compelling neuroimaging evidence of orbitofrontal implication in tasks relying on working memory [for meta-analyses, see Ref. (41, 42)]. *PC1-EUD*, however, corresponds to a distributed radiation pattern across the whole brain, suggesting that a global increase of radiation dose (EUD) impacts working memory negatively. From its pattern of spatial radiation distribution, this last component could be interpreted as CSI dose variability across subjects. However, such an overall radiation effect does not allow us to distinguish specifically irradiated brain networks that could be particularly involved in working memory impairment.

More specific brain network radiation patterns were found to influence processing speed. The large impact of *PC2-EUD* is strongly related to radiation to the temporal lobes and the PF. Previous studies have shown significant associations between radiation dose to the temporal lobe and processing speed impairments (16, 17, 43). The cerebellum has also been shown to play a role in processing speed capabilities (44). Importantly, temporal lobe regions are close to the PF, to which the dose was escalated. Thus, the impact of *PC2-EUD* could also reflect the trajectory of the radiation boost field to the PF across subjects. This would support the hypothesis that the volume receiving the highest dose has the greatest impact on cognitive functions. Accordingly, these findings would support current volume-reduction efforts.

Finally, *PC2-EUD* and *PC3-EUD* carry the same negative effect on IQ change. As reported earlier, *PC2-EUD*, which includes the temporal lobes and the PF, showed the most significant impact on processing speed changes. As for processing speed, previous studies have found associations between radiation dose to the temporal lobe and PF and IQ impairments (16, 18). In the same way, the role of *PC3-EUD*, which strongly involves the left orbitofrontal cortex, in IQ impairment is somewhat expected, as many VBM studies in adults and adolescents have shown a link between IQ and gray matter density in this region (45–47). Alternatively, the equal contribution of these two components to ΔFSIQ might be the expression of an averaging effect, as FSIQ is a composite index encompassing both working memory and processing speed.

Higher EUD in the superior occipital and parietal regions did not seem to be associated with lower cognitive scores. We may note that Armstrong et al. (17) did not find any significant association between occipito-parietal radiation dose and cognitive or social problems either. In addition, with the same amount of radiation dose, the parietal lobe white matter was shown to be less affected compared to frontal lobe in medulloblastoma (48, 49). Thus, it would be interesting to test whether the parietal lobe is less susceptible to radiation than other regions.

There are limitations in this study, and results should be interpreted with caution. First, the small size and heterogeneity of the patient population make it difficult to control for other variables that could affect the scores (e.g., hydrocephalus shunt, education, rehabilitation, surgical approach, and molecular group). Moreover, considering only PFT patients prevented us from taking into account several potentially confounding variables such as type and localization of the tumor. However, this was a disadvantage regarding the large spatial correlations between close irradiated regions induced by similar radiation protocols. We could not separately assess specific regions that are known to play an important role in working memory [e.g., dorsolateral area (50)] or processing speed [e.g., left middle frontal gyrus (51)]. Second, noise could be induced by intersubject variability of the brain morphology, even if we minimized possible segmentation errors by using atlases specific to age groups. Finally, we have to acknowledge that seven patients (who received hyperfractionated radiotherapy, HFRT) had the same total dose and could be considered as a subgroup that could influence the results (see Figure S1 in Supplementary Material). We recognize the possibility that the smaller variance of the HFRT subgroup might influence the result in other, less crucial ways (see Figure S2 in Supplementary Material).

### CONCLUSION

This study confirms two cases for which there is a relationship between the radiation dose in particular brain areas and specific cognitive decline. The first case shows a correlation between orbitofrontal radiation and working memory decline, whereas the second case portrays a correlation between temporal lobe and PF radiation and slower processing speed. As this study is exploratory, it does not aim to provide information regarding brain regions to avoid, but to describe relationships between radiation and cognitive function. The relationship between the cognitive profiles and the irradiation of these regions should be further confirmed in a prospective randomized study with both a larger cohort and different radiation protocols.

### ETHICS STATEMENT

This retrospective study was approved by the Comite de Protection des Personnes CPP no. 14973 (Ile de France, France). All patients' parents gave written informed consent in accordance with the Declaration of Helsinki.

## AUTHOR CONTRIBUTIONS

All authors made a substantial contribution to the article and approved the final version of the manuscript. None of them has any conflict of interest. Guarantor of integrity of entire study: EDS, MN, and LH-P. Study concepts and design or data collection or data analysis and interpretation: all authors. Data preprocessing: EDS, CR, AG, and CP. Statistical analysis: EDS and MP-G. Drafting the work or revising it critically for important intellectual content: all authors.

### ACKNOWLEDGMENTS

This study was supported by a PhD fellowship from La Ligue Contre le Cancer. The authors gratefully acknowledge Edouard Duchesnay, Andres Hoyos-Idrobo, Pierre Maroun, Mehdi Rahim and Elvis Dohmatob for their advice regarding their respective domain of expertise. We thank Lucy Airs for her help in English. We also thank Sandrine Lopes Da Silva and Imène Hezam for their help in data collection.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/fonc.2017.00166/full#supplementary-material.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, JB, and the handling editor declared their shared affiliation, and the handling editor states that the process nevertheless met the standards of a fair and objective review.

*Copyright © 2017 Doger de Speville, Robert, Perez-Guevara, Grigis, Bolle, Pinaud, Dufour, Beaudré, Kieffer, Longaud, Grill, Valteau-Couanet, Deutsch, Lefkopoulos, Chiron, Hertz-Pannier and Noulhiane. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Tangential Field Radiotherapy for Breast Cancer—The Dose to the Heart and Heart Subvolumes: What Structures Must Be Contoured in Future Clinical Trials?

*Marciana Nona Duma1,2,3\*, Anne-Claire Herr1,4, Kai Joachim Borm1 , Klaus Rüdiger Trott1,5, Michael Molls1,6, Markus Oechsner1,2 and Stephanie Elisabeth Combs1,2,3*

*1Department of Radiation Oncology, Technical University of Munich (TUM), Munich, Germany, 2Center for Stereotactic and Highprecision Radiation Therapy (StereotakTUM), Technische Universität München (TUM), Munich, Germany, 3Department of Radiation Sciences (DRS), Institute of Innovative Radiotherapy (iRT), Helmholtz Zentrum München, Munich, Germany, 4Medical School, Technische Universität München, Munich, Germany, 5Cancer Institute, University College of London, London, United Kingdom, 6 Technische Universität München, Munich, Germany*

#### *Edited by:*

*William Small Jr., Stritch School of Medicine, United States*

#### *Reviewed by:*

*Stephan Bodis, Kantonsspital Aarau, Switzerland Andrea Riccardo Filippi, University of Turin, Italy*

*\*Correspondence:*

*Marciana Nona Duma marciana.duma@mri.tum.de*

#### *Specialty section:*

*This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology*

*Received: 23 March 2017 Accepted: 06 June 2017 Published: 19 June 2017*

#### *Citation:*

*Duma MN, Herr A-C, Borm KJ, Trott KR, Molls M, Oechsner M and Combs SE (2017) Tangential Field Radiotherapy for Breast Cancer—The Dose to the Heart and Heart Subvolumes: What Structures Must Be Contoured in Future Clinical Trials? Front. Oncol. 7:130. doi: 10.3389/fonc.2017.00130*

Background and purpose: The aim of the present study was to evaluate whether experienced radiation oncologists can visually identify patients with a large dose to the heart, which would facilitate large retrospective data evaluations. If visual assessment proved insufficient, a further aim was to define, for future clinical trials, which structures should be contoured and which can be skipped because their dose can be derived from other, easily contoured structures.

Material and methods: Planning CTs of left-sided breast cancer patients treated with 3D-conformal radiotherapy by tangential fields were visually divided into two groups: with an estimated high dose (HiD) and with an estimated low dose (LoD) to the heart. For 46 patients (22 HiD and 24 LoD), the heart, the left ventricle, the left anterior descending artery (LAD), the right coronary artery, and the ramus circumflexus were contoured. A helper structure (HS) around the LAD was generated in order to consider if contouring uncertainties of the LAD could be acceptable. We analyzed the mean dose (Dmean), the maximum dose, the V10, V20, V30, V40, and the length of the LAD that received 20 and 40 Gy.

Results: The two groups had a significantly different Dmean of the heart (*p* < 0.001). The average Dmean to the heart was 4.0 ± 1.3 Gy (HiD) and 2.3 ± 0.8 Gy (LoD). The average Dmean to the LAD was 26.2 ± 7.4 Gy (HiD) and 13.0 ± 7.5 Gy (LoD), with a very strong positive correlation between Dmean LAD and Dmean HS in both groups. The Dmean heart is not a good surrogate parameter for the dose to the LAD, since it might underestimate clinically significant doses in 1/3 of the patients in the LoD group.

Conclusion: A visual assessment of the dose to the heart could be reliable if performed by experienced radiation oncologists. However, the Dmean heart is not always a good surrogate parameter for the dose to the LAD or for the Dmean to the left ventricle. Thus, if specific late toxicities are evaluated, we strongly recommend contouring of the specific heart substructures as a heart Dmean is not highly specific.

Keywords: breast cancer, heart, tangential field, left anterior descending artery, radiotherapy

## INTRODUCTION

The heart is probably the most radiosensitive organ in the human body. Long-term follow-up of the Japanese A-bomb survivors demonstrated that a mean body dose (and thus mean heart dose) of 1 Gy increased the mortality from heart diseases by 14% (1). Follow-up studies in patients treated for various malignant and non-malignant diseases yielded similar risk values (2). Careful analysis of the pathologies of radiation-induced heart diseases after mantle field radiotherapy of patients for Hodgkin's disease (thus receiving a near-homogeneous dose to their hearts) demonstrated that five different radiation-induced heart diseases were diagnosed, namely, pericarditis, myocardial fibrosis, coronary atherosclerosis leading to myocardial infarction, conduction defects such as bundle branch blocks and valvular insufficiency. Each of these manifestations of cardiac radiation injury occurs in different substructures of the heart, follows different pathogenic pathways, and may have different dose dependence. This means, however, that different manifestations of cardiac radiation damage would be expected to occur after different anatomical dose distributions, such as from adjuvant radiotherapy of breast cancer patients and that mean heart doses may not be a relevant dose criterion for estimating cardiac complications from particular treatment plans.

Notwithstanding this argument, large retrospective data have demonstrated a relationship between the delivered heart dose and major coronary events. A recent study by Darby et al. (3) analyzed the risk of ischemic heart disease in women after radiotherapy for breast cancer. They have found that the average mean dose (Dmean) to the heart of patients treated between 1958 and 2001 was 4.9 Gy with a significant correlation between the mean heart dose and major coronary events. However, no individual dosimetric data were available for this retrospective study. In order to assess the mean heart doses and the Dmean to the anterior descending coronary artery, the 2D-plans were recalculated on a "typical" patient in the Darby et al. study. Studies have also shown a direct link between radiation dose in the coronary arteries and the location of coronary stenosis (4).

Although heart dose from breast cancer radiotherapy has been significantly reduced over the past decades, parts of the heart may still be located in the radiation field in modern 3D-conformal radiotherapy (3D-CRT) (5–7). Hence, it is essential to identify all patients who, with conventional techniques, could receive a significant dose to critical structures of the heart and to offer them cardiac-sparing radiotherapy.

However, contouring of all the heart subvolumes is time consuming. Moreover, it has to be considered that there may be clinically and dosimetrically significant interobserver variations in heart and heart subvolume delineations (8). Lorenzen et al. found substantial interobserver variation in the estimated dose of the left anterior descending artery (LAD), which even guidelines could not reduce. The coefficients of variation in the estimated doses to the LADCA were for Dmean 27% without and 29% with guidelines. For the heart, variation was little, especially when guidelines were used (9). Thus, it is essential to understand the dosimetric impact of contouring uncertainties in the LAD.

The aim of the present study was to evaluate whether experienced radiation oncologists can visually identify patients with a large dose to the heart, which would facilitate large retrospective data evaluations. If visual assessment proved insufficient, a further aim was to define which structures should be contoured and which can be skipped because their dose can be derived from other, easily contoured structures. More specifically, two questions were addressed: (1) is the visual evaluation a reliable indicator of mean heart dose and (2) is the mean heart dose a reliable indicator of the radiation exposure of the left anterior descending coronary artery/left ventricle?

#### MATERIALS AND METHODS

201 consecutive patients with left-sided breast cancer treated in our institution between March 2009 and November 2010 were identified.

These patients were all treated with 3D-CRT by tangential fields, half beam technique. Patients were placed on a breast board with the left arm above the head. The treatment planning was performed with the Eclipse Treatment Planning System (Varian Medical Systems, Palo Alto, CA, USA). All patients underwent a planning kVCT scan (Siemens Inc., Erlangen, Germany) with an axial slice thickness of 5 mm before treatment. The CT scans were not contrast enhanced. The treatment plans consisted of two opposing tangential wedged beams. Additional segments (1–2) were used to improve target dose homogeneity, if necessary. Both medial and lateral beams were wedged. The PTV prescribed dose was 50 Gy for the whole breast (ICRU reference point), followed by an electron boost of 10–16 Gy. All treatments were performed with daily single doses of 2 Gy.

The planning CTs of the 201 patients were visually reviewed. The CTs with the calculated dose distributions for the whole breast radiotherapy (50 Gy) were presented to a radiation oncologist who was asked to assess whether the heart Dmean would be high or low. **Figure 1** shows two example CTs with the isodoses (10–105%) used in this study. No structures were superimposed on the CT scan. Assessment was performed visually. For example, because a large part of the heart of the patient in **Figure 1A** lies within the 10% isodose, the radiation oncologist estimated that this patient would receive a high dose (HiD) to the heart. Thus, two groups were generated: one with a visually estimated HiD to the heart (86 patients) and the other with an estimated low dose (LoD) to the heart (115 patients).

The treatment records of the 201 patients were reviewed, and the following were excluded from this analysis: patients who had previously undergone systemic therapy, patients who were treated for breast cancer relapse, patients who received irradiation to the locoregional lymph nodes, patients with mastectomy, and patients with concomitant bilateral breast cancer radiotherapy. This was done because, in an ongoing study, we perform functional imaging to assess correlations between heart toxicities and dose distributions. From the remaining patients, the first 46 consecutive patients were chosen (24 from the LoD group and 22 from the HiD group) for this dosimetric study. The left ventricle, the LAD, the ramus circumflexus (RCX), and the right coronary artery (RCA) were retrospectively contoured according to the Feng et al. heart atlas (10).

In order to test whether contouring uncertainties could be acceptable for the LAD, a helper structure (HS) with a width of 0.5 cm anterior–posterior and 1 cm left–right around the LAD was generated (**Figure 1**). This was done to test whether a very small region of interest can safely be contoured with a significantly larger margin.

For all contoured structures, dose volume histograms were analyzed. We assessed the minimum dose, maximum dose (Dmax), Dmean, absolute volume in cubic centimeters, V10 (the relative volume that receives 10 Gy or more), V20, V30, and V40. Additionally, the absolute volume V10, V20, V30, and V40 of the heart in cubic centimeters was assessed.
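The dose-volume parameters listed above can be illustrated with a short sketch. It assumes that a structure is represented by the doses of equally sized voxels, which is a simplification of what a treatment planning system does; the function name `dvh_metrics`, the voxel size, and the toy dose values are hypothetical and are not patient data.

```python
from typing import Dict, Sequence


def dvh_metrics(voxel_doses_gy: Sequence[float],
                voxel_volume_cm3: float,
                thresholds_gy: Sequence[int] = (10, 20, 30, 40)) -> Dict[str, float]:
    """Summarize a structure's dose distribution from per-voxel doses."""
    n = len(voxel_doses_gy)
    if n == 0:
        raise ValueError("structure contains no voxels")
    metrics = {
        "Dmin": min(voxel_doses_gy),
        "Dmax": max(voxel_doses_gy),
        "Dmean": sum(voxel_doses_gy) / n,
        "volume_cm3": n * voxel_volume_cm3,
    }
    for t in thresholds_gy:
        # Vx: part of the structure receiving x Gy or more.
        count = sum(1 for d in voxel_doses_gy if d >= t)
        metrics[f"V{t}_pct"] = 100.0 * count / n          # relative volume
        metrics[f"V{t}_cm3"] = count * voxel_volume_cm3   # absolute volume
    return metrics


# Hypothetical toy structure of 8 equal voxels of 0.5 cm3 each.
toy = dvh_metrics([1.0, 2.0, 5.0, 12.0, 22.0, 31.0, 45.0, 50.0],
                  voxel_volume_cm3=0.5)
```

In this toy case, five of the eight voxels receive at least 10 Gy, so the relative V10 is 62.5% and the absolute V10 is 2.5 cm³.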

Since neither clinical nor radiobiological data provide reliable information on the dose/volume dependence of radiation-induced atherosclerosis, different criteria of dose specification in the LAD were determined, which would, in a second step, permit the determination of the anatomical relationship between local dose and local tissue injury. Therefore, in addition to V10 etc., the absolute LAD length and the length of the LAD lying within the 20 Gy and 40 Gy isodoses were evaluated. The statistical analyses were performed using SPSS Software for Windows version 20.0 (SPSS Inc., Chicago, IL, USA). All statistical tests were two-sided, and a *p*-value <0.05 was considered to indicate statistical significance. Mean values are reported with SD, median values with range. Pearson correlations are presented.
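How the LAD length within an isodose was measured is not specified in the text. One possible approach is sketched below, assuming the LAD is approximated by an ordered list of centerline points with a dose value at each point, and that a segment counts as inside the isodose when the mean of its two endpoint doses reaches the threshold. The function name, the segment rule, and all coordinates and doses are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in cm


def length_within_isodose(points_cm: Sequence[Point],
                          doses_gy: Sequence[float],
                          threshold_gy: float) -> Tuple[float, float]:
    """Return (total length, length at or above threshold) in cm."""
    if len(points_cm) != len(doses_gy):
        raise ValueError("one dose value is required per centerline point")
    total = 0.0
    inside = 0.0
    for i in range(len(points_cm) - 1):
        segment = math.dist(points_cm[i], points_cm[i + 1])
        total += segment
        # Assumption: a segment is inside the isodose if the mean of
        # its endpoint doses is at least the threshold.
        if (doses_gy[i] + doses_gy[i + 1]) / 2.0 >= threshold_gy:
            inside += segment
    return total, inside


# Hypothetical straight 4 cm centerline sampled every 1 cm.
centerline = [(0.0, 0.0, float(z)) for z in range(5)]
doses = [45.0, 42.0, 25.0, 15.0, 5.0]
total_cm, in_20gy_cm = length_within_isodose(centerline, doses, 20.0)
_, in_40gy_cm = length_within_isodose(centerline, doses, 40.0)
```

With a 5 mm slice thickness, as used for the planning CTs, a slice-based length estimate of this kind would be coarse, so results should be read as approximate.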

#### RESULTS

In the 46 patients, the volume of the heart ranged between 471 and 1,013 cm³, the volume of the left ventricle between 141.2 and 275.3 cm³, the volume of the LAD between 1.1 and 2.6 cm³, the HS volume between 11.2 and 20.8 cm³, the RCX volume between 0.3 and 1.0 cm³, and the RCA volume between 0.7 and 1.8 cm³.

The median (range) Dmean/Dmax to the whole heart was 3.6 Gy (2.6–8.9 Gy)/49.3 Gy (47.7–51.6 Gy) for the HiD group and 2.6 Gy (0.8–3.5 Gy)/44.6 Gy (6.1–57.3 Gy) for the LoD group, respectively. The median (range) Dmean/Dmax to the left ventricle was 6.3 Gy (3.8–15.5 Gy)/49.2 Gy (46.1–51.4 Gy) for the HiD group and 4.0 Gy (1.0–6.0 Gy)/48.1 Gy (4.8–57.2 Gy) for the LoD group, respectively. Doses for the RCA or RCX were <1.0 Gy.

The two groups had a significantly different Dmean of the heart (*p* < 0.001). Thus, overall, the clinical assessment of whether the heart would receive a HiD or LoD was good (**Figure 1**); yet there was considerable overlap when considering individual patients. In the HiD group, 3 out of 22 patients were wrongly estimated, with a Dmean within 0.5 Gy of the average heart Dmean of the LoD group. In the LoD group, 3 out of 24 patients were within 0.5 Gy of the average heart Dmean of the HiD group.
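The overlap criterion described above (a patient's heart Dmean lying within 0.5 Gy of the other group's average) can be expressed as a simple check. The sketch below assumes the 0.5 Gy limit is inclusive; the function name and the dose values are hypothetical and do not reproduce the study data.

```python
from typing import List, Sequence


def near_other_group(dmean_gy: Sequence[float],
                     other_group_mean_gy: float,
                     tolerance_gy: float = 0.5) -> List[int]:
    """Return indices of patients whose heart Dmean lies within
    tolerance_gy of the other group's average Dmean."""
    return [i for i, d in enumerate(dmean_gy)
            if abs(d - other_group_mean_gy) <= tolerance_gy]


# Hypothetical heart Dmean values (Gy) for four patients rated as HiD,
# compared with a hypothetical LoD group average of 2.5 Gy.
hid_dmean_gy = [2.9, 4.2, 6.5, 2.4]
flagged = near_other_group(hid_dmean_gy, other_group_mean_gy=2.5)
```

Here the first and last patients would be flagged as closer to the LoD average than their visual grouping suggests.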

The overall average length of the LAD was 8.4 ± 0.8 cm (Mean ± SD). The length of the LAD that received 20 Gy/40 Gy was 4.5 ± 1.8 cm/2.9 ± 2.3 cm for the HiD group and 1.9 ± 1.7 cm/0.7 ± 1.1 cm for the LoD group, respectively.

The average Dmean to the LAD/HS was 26.2 ± 7.4 Gy/23.3 ± 6.9 Gy for the HiD group and 13.0 ± 7.5 Gy/13.0 ± 7.2 Gy for the LoD group. In both groups, there were very strong positive correlations between the Dmean LAD and the Dmean HS (*r* ≥ 0.964; *p* < 0.001) (**Figure 2**).
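The correlations reported in this section are Pearson correlation coefficients, which the authors computed in SPSS. For readers who want to reproduce such a value on their own data, a minimal pure-Python sketch follows; the function name and the paired dose values are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain >= 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired Dmean values (Gy): LAD versus helper structure.
dmean_lad = [10.0, 15.0, 20.0, 25.0]
dmean_hs = [9.0, 13.0, 17.0, 21.0]   # exactly linear in dmean_lad
r = pearson_r(dmean_lad, dmean_hs)
```

Because the toy helper-structure doses are an exact linear function of the LAD doses, the coefficient is 1 up to rounding; real data would give a value below 1.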

For both groups, there was a very strong positive correlation between the Dmean ventricle and the Dmean heart (*r* ≥ 0.902; *p* < 0.001) and a strong (HiD) to very strong (LoD) positive correlation between the Dmean heart and the Dmean LAD/HS (HiD: *r* = 0.731/*r* = 0.724, *p* < 0.001; LoD: *r* = 0.834/*r* = 0.849, *p* < 0.001) (**Figure 3**). The correlation between Dmean ventricle and Dmean LAD was strong but not as strong as the one between Dmean heart and Dmean LAD (HiD: *r* = 0.642; LoD: *r* = 0.605; *p* ≤ 0.001). We found strong and very strong positive correlations between heart (relative volume) V10, V20, V30, and V40 and LAD V10, V20, V30, and V40 in the HiD as well as in the LoD group. **Table 1** presents the absolute and relative V10, V20, V30, and V40 of the heart. **Figure 4** depicts exemplary scatterplots of the V30 of the LAD and the Dmean heart, which highlight the clinical problem. Despite the significant correlation, the predictive value of the mean heart dose for high doses to critical volumes of the LAD may not be good enough.

#### DISCUSSION

As breast cancer is the most common cancer in women and long-term survivorship is nowadays the rule, morbidity and mortality from radiation-induced heart disease have become a major concern in treatment planning. There is evidence that the risk of the various potential late cardiac radiation injuries depends on local radiation dose, which opens the possibility of reducing the risk by optimizing the dose distribution in the heart of the individual patient. This does, however, require detailed cardiac dosimetry for the individual patient as a basis for treatment plan optimization. Studies are available on modern heart dosimetry in breast cancer patients, revealing that even with contemporary treatment and planning techniques, some patients still receive substantial doses to the heart or to its substructures such as the LAD.

The aim of this study was to assess which structures should be contoured and which structures could be skipped as the dose could be derived from correlations with other structures.

The result of our study is that a visual assessment by experienced radiation oncologists often gives a reliable estimate of the Dmean to the heart. There is a statistically significant (*p* < 0.001) difference in the Dmean to the heart between the two chosen groups. If the heart is not contoured due to workload, a retrospective visual examination of planning CTs from the department database with a predefined range of isodoses (e.g., 10–105% isodose) would be very informative and the Dmean to the heart could be estimated for future patients.

Table 1 | The V10, V20, V30, and V40 of the heart.

6/24 patients in the LoD group who receive a dose of more than 40 Gy to more than 20% of the LAD and 19/22 in the HiD group (not depicted).

However, some patients in our study were not perfectly matched to their group. We reviewed each patient's CTs individually and found two main confounding factors.

First, our contouring, performed according to the Feng et al. (10) atlas, included the pericardium. Our patients, however, varied in the amount of epicardial fatty tissue. Simply scrolling through CT slices without defining the heart boundaries might visually group patients with more epicardial fatty tissue into the LoD group, as the pericardium is not always easily seen on every single CT slice. A large amount of epicardial fatty tissue translates into a higher Dmean (since structures of the heart that were not visually considered would now lie within the HiD region). These patients do not receive a large dose to the myocardium (left ventricle), but a significant dose to the LAD. Thus, either contouring of the heart including the pericardium or contouring of all heart structures is necessary in these patients in order to estimate specific late toxicities.

Second, patients with a large dose to the heart on only a few CT slices were wrongly categorized into the HiD group. The HiD levels (40–50 Gy isodoses) extended into the heart on only very few CT slices (2–3 slices). Visually this can be misleading.

Focusing on mean heart dose solely, our findings suggest that visual grouping into the low heart dose category might be an acceptable way to eliminate detailed contouring in about half the patients with an error margin of 10%. In the other half of patients, detailed contouring should be recommended.

Yet, which structures should be contoured? Lorenzen et al. (9) found substantial interobserver variation in the estimated dose to the LAD, which even guidelines could not reduce. The spatial distance variation between the delineations was up to 7–8 mm. Thus, a structure like the HS might cover the whole region of uncertainty. Overall, in our study, we found a very strong positive correlation between the Dmean LAD and the Dmean HS (*r* ≥ 0.964; *p* < 0.001). As the HS represents the region in which we can assume that the LAD lies, we can conclude that contouring uncertainties might be acceptable. Even a rough contouring will help the clinician to assess the magnitude of the dose to the LAD (i.e., 13 vs. 23 Gy Dmean in the two groups, respectively).

However, even significant correlations between values of dose specification are of limited use in the practice of treatment planning for the individual patient. Although the dose to the LAD correlates very strongly with the dose to the heart, even in the LoD group there are still patients who will receive a significant dose to the LAD (>30 Gy). To stress this point, in **Figure 4**, patients with the same Dmean heart (e.g., ≈2.7 Gy) had a LAD V30 value that ranged between 0% and approx. 50% in the LoD group, and in the HiD group the LAD V30 even ranged from 10 to 70%. Thus, despite the strong positive correlation of the Dmean of the heart with the Dmean to the LAD, even in the LoD group, one-third of the patients will receive over 30 Gy to one-third of their LAD. In order not to miss any patients with a HiD to the LAD, we therefore recommend contouring the LAD/HS in all patients with left-sided breast cancer, independent of the estimated or calculated mean heart dose.

### CONCLUSION

A visual assessment by experienced radiation oncologists could be reliable for judging whether a patient receives a higher or a lower Dmean to the heart. Even a rough contouring of the LAD region (i.e., the HS) provides clinically valuable information on the magnitude of the LAD dose. The Dmean heart is not always a good surrogate parameter for the dose to the LAD as it might underestimate clinically significant doses in one-third of the patients with a LoD. The Dmean heart is a good surrogate for the Dmean to the left ventricle, except for patients with a large amount of epicardial fatty tissue. Thus, if specific late toxicities are evaluated, we strongly recommend contouring of the specific heart substructures as a heart Dmean is not highly specific.

### CONSENT PROCEDURES

All patients gave written informed consent to CT-based radiotherapy treatment planning before starting radiotherapy. Data from the CT radiotherapy treatment planning were retrospectively analyzed.

#### ETHICS STATEMENT

This study was approved by the ethics committee of Klinikum rechts der Isar, Technical University Munich.

### AUTHOR CONTRIBUTIONS

MD, A-CH, KB, KT, MM, and MO participated in the study design, contributed to the data collection, and drafted the manuscript. SC made important contributions in revising the content. All authors read and approved the final manuscript.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Duma, Herr, Borm, Trott, Molls, Oechsner and Combs. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*