Prediction of Radiation Pneumonitis Using Genome-Scale Flux Analysis of RNA-Seq Derived From Peripheral Blood

Purpose: Radiation pneumonitis (RP) frequently occurs during a treatment course of chest radiotherapy, which significantly reduces the clinical outcome and efficacy of radiotherapy. The ability to easily predict RP before radiotherapy would allow this disease to be avoided. Methods and Materials: This study recruited 48 lung cancer patients requiring chest radiotherapy. For each participant, RNA sequencing (RNA-Seq) was performed on a peripheral blood sample before radiotherapy. The RNA-Seq data was then integrated into a genome-scale flux analysis to develop an RP scoring system for predicting the probability of occurrence of RP. Meanwhile, the clinical information and radiation dosimetric parameters of this cohort were collected for analysis of any statistical associations between these parameters and RP. A non-parametric rank sum test showed no significant difference between the predicted results from the RP score system and the clinically observed occurrence of RP in this cohort. Results: The results of the univariant analysis suggested that the tumor stage, exposure dose, and bilateral lung dose of V5 and V20 were significantly associated with the occurrence of RP. The results of the multivariant analysis suggested that the exposure doses of V5 and V20 were independent risk factors associated with RP and a level of RP ≥ 2, respectively. Thus, our results indicate that our RP scoring system could be applied to accurately predict the risk of RP before radiotherapy because the scores were highly consistent with the clinically observed occurrence of RP. Conclusion: Compared with the standard statistical methods, this genome-scale flux-based scoring system is more accurate, straightforward, and economical, and could therefore be of great significance when making clinical decisions for chest radiotherapy.

Purpose: Radiation pneumonitis (RP) frequently occurs during a treatment course of chest radiotherapy, which significantly reduces the clinical outcome and efficacy of radiotherapy. The ability to easily predict RP before radiotherapy would allow this disease to be avoided.
Methods and Materials: This study recruited 48 lung cancer patients requiring chest radiotherapy. For each participant, RNA sequencing (RNA-Seq) was performed on a peripheral blood sample before radiotherapy. The RNA-Seq data was then integrated into a genome-scale flux analysis to develop an RP scoring system for predicting the probability of occurrence of RP. Meanwhile, the clinical information and radiation dosimetric parameters of this cohort were collected for analysis of any statistical associations between these parameters and RP. A non-parametric rank sum test showed no significant difference between the predicted results from the RP score system and the clinically observed occurrence of RP in this cohort.

INTRODUCTION
Radiation pneumonitis (RP) is a common treatment complication in cancer patients who receive chest radiotherapy, usually occurring within 6 weeks to 3 months of the end of radiation therapy (1). Severe RP can even lead to death (2). Therefore, RP not only limits the implementation of radiotherapy, but also significantly reduces the treatment efficacy and quality of life of cancer patients. To date, a variety of risk factors related to RP have been reported in diverse clinical studies, such as the radiation dose, the exposure lung volume, the average lung exposure dose, sex, age, chemotherapy, and smoking history (3). Although those risk factors contain relevant information related to RP, prediction models based on those risk factors alone have demonstrated limited performance with low degrees of sensitivity and specificity for predicting the occurrence of RP.
From a pathophysiological perspective, the alveolar epithelial and pulmonary interstitial cells are injured during or after the exposure of lung tissue to ionizing radiation. These injured cells continuously release a variety of inflammatory cytokines that induce inflammation in local lung tissue (4). The released inflammatory cytokines accumulate, causing severe damage to lung tissue, and enter the blood circulation, which indirectly induces systemic inflammatory reactions (5).
Messenger RNA (mRNA) is an important part of the synthesis of cytokines and therefore plays an essential role in the occurrence and development of RP (6). This suggests that the base sequence and expression level of mRNAs may be related to the occurrence and severity of RP. In recent RNA-sequencingbased projects, we developed several computational prediction models integrated with RNA-Seq data that had potential for predicting the efficacy of targeted therapies for lung cancer and the diagnosis and prognosis of rectal cancer (7-9). Because mRNA is an important information carrier in the pathogenesis of RP, we integrated RNA-Seq data from individual patients into a prediction model and evaluated its predictive ability for the occurrence of RP in chest radiotherapy, to establish a reliable reference for optimizing the clinical efficacy of radiotherapy.

Study Design and Patients
Patients who were diagnosed with lung cancer and treated with chest radiotherapy from October 2018 to December 2020 in Renmin Hospital of Wuhan University and the First Affiliated Hospital of Zhengzhou University were recruited for this study. The clinical information collected from the patients during treatment included sex, age, smoking history, history of chemotherapy, pathological type, tumor stage, target exposure dose, whole-lung-exposure dosage parameters (V5, V10, V20, and V30), and single-lung-exposure dosage parameters (V5, V10, V20, and V30). Clinical observation of the patients included clinical symptoms, lung imaging, and signs after radiotherapy. When RP occurred, the time from the beginning of radiotherapy to the diagnosis (days) was recorded, and the severity of RP was graded according to the RTOG diagnostic criteria (10). For each participant, a peripheral blood sample was collected before radiotherapy, and RNA sequencing of these blood samples was conducted. The RNA-Seq data were imported into a genomescale molecular model for pathway flux analysis. Subsequently, an RP scoring system was established to predict the occurrence of RP for individual patients, and the results were compared with the clinically observed occurrence of RP during the course of radiotherapy [ Figure 1; formula (1, 2)]. Written informed consent was obtained for all participants. This study was approved by the ethics committee of Renmin Hospital of Wuhan University.

RNA Sequencing Study and Computational Prediction of RP Occurrence
For each participant, 10 ml of intravenous blood was collected 3-5 days prior to chest radiotherapy and stored in a blood RNA storage tube (TempusTMBlood RNA Tube) at −80 • C. The collected blood samples were then thawed at room temperature (18-25 • C) and the sample solution was transferred to a clean 50 ml tapered test tube. A total RNA extraction kit (TempusTMSpinRNA Isolation Kit) was used to extract the total RNA in the blood sample, following the manufacturer's instructions. The extracted RNA was then stored at −80 • C. The globin mRNA was removed from the total RNA using the corresponding kit (GLOBINclearTMKit) following the manufacturer's instructions, and the isolated and purified RNA was then prepared for sequencing. The SMARTer Stranded Total RNA-Seq Kit v2 (Takara Bio Inc, Japan) was used to complete genomic cDNA synthesis, add connectors and indexes, use R-Probes to remove cDNA fragments from rRNA in the reverse transcription, delete interference from RNA from the library data in the sample, and finally expand the final RNA-Seq library. The captured RNA samples were then sequenced in a paired end manner using an Illumina NovaSeq 6000 platform. Afterward, the processed RNA-Seq data were imported into our computer prediction model [molecular signaling map, MSM (11)] for pathway flux calculation F(p). F(p) = I i (reaction, role)/N(P)) − flux(Crosstalk(P))), (1) where N(P) is the number of reactions of the pathway P, and crosstalk(P) is the crosstalk pathways related to the pathway P. Through the flux comparative analysis (12), the significant differences (S) of pathway flux derived from the RNA-Seq data of each participant before (RP control ) and after (RP treat ) radiotherapy were calculated.
A negative or zero value of S indicates a lower probability of RP occurrence, whereas a positive value indicates a higher probability of RP. The severity level of RP can be directly related to the value of S.

Follow-Up Study and Assessment of RP Severity
After starting chest radiotherapy, clinical observations were continuously conducted to record the symptoms and signs of individual participants, including physical status score, oxygen saturation, breathing rate, pulse, and heart rate. For those who completed radiotherapy, chest CT and other related tests were carried out 1 and 3 months after the end of treatment, in strict accordance with the clinical treatment regulation. For patients resting at home, regular telephone follow-ups were performed. If patients experienced a cough, shortness of breath, a low fever, chest tightness and/or other symptoms, they were notified to carry out relevant examinations at any time. The occurrence of RP was determined through a chest CT image and the clinical symptoms and signs. The severity of RP was graded according to the RTOG standard classification.

Statistical Analysis
A non-parametric Mann-Whitney U-test was applied to analyze the difference between the clinically observed RP occurrences and the predicted results of the RP scoring system. Univariant analysis (SPSS version 10.0) was applied to identify potential RP-related risk factors, including age, sex, smoking history, chemotherapy history, tumor stage, pathological type, and radiation exposure dose (full dose, V5, V10, V20, and V30). The identified potential risk factors were further analyzed by a multivariant approach in R (4.0.2). The prediction results were evaluated through ROC curve analysis between the clinical diagnostic group and the predicted RP score group.

RESULTS
A total of 48 lung cancer patients (median age, 62 years; age range, 35-80 years; 83.4% male) met the inclusion criteria and were included in this study. The demographic, clinical, and ± 15.96%, the mean bilateral lung V20 was 17.34 ± 7.38%, the mean bilateral lung V10 was 30.10 ± 12.22%, and the mean bilateral lung V30 was 11.19 ± 5.50%. During this study, a clinical occurrence of RP of varying severity was observed in 24 patients (50.0%). Specifically, 17 cases of RP were evaluated as low grade (level 1 or 2), while seven cases of RP were evaluated as high grade (level 3 or 4). The average exposure dose in the target area of patients with an RP occurrence was 58.71 Gy, with the bilateral lung V5, V10, V20, and V30 being 47.04, 33.87, 19.90, and 12.76%, respectively. The average exposure dose in the target area of patients who did not experience RP was 50.68 Gy, with the bilateral lung V5, V10, V20, and V30 being 37.17, 26.33, 15.11, and 9.62%, respectively. The average exposure dose in the target area of patients with an RP level <2 was 50.68 Gy, with the bilateral lung V5, V10, V20, and V30 being 39.7, 27.50, 15.53, and 9.94%, respectively. The average exposure dose in the target area of patients with an RP level <3 was 51.77 Gy, with the bilateral lung V5, V10, V20, and V30 being 40.93, 28.38, 16.20, and 10.36%, respectively.
The univariant analysis results showed that an increased target exposure dose (p = 0.005), a later tumor stage (p = 0.024), and an increased bilateral lung V5 (p = 0.028) and V20 (p = 0.036) were significantly associated with the occurrence of RP (Figure 2; Table 1). An increased bilateral lung V5 (p = 0.000) and V20 (p = 0.003), and a history of smoking (p = 0.016) were significantly associated with the occurrence of RP at a level ≥ 2 ( Table 2). Of note, patients with a bilateral lung V5 ≥ 39.70% or V20 ≥ 15.53%, or a history of smoking had a greater increase in risk for the occurrence of RP at a level ≥ 2 ( Table 2).
A history of smoking (p = 0.012), a later tumor stage (p = 0.009), and an increased bilateral lung V5 (p = 0.027) and V20 (p = 0.016) were significantly associated with the occurrence of RP at a level ≥ 3 (Table 3). Of note, patients with a bilateral lung V5 ≥ 40.93%, V20 ≥ 16.20%, tumor stage III or later, or a history of smoking faced a significantly increased risk for the occurrence of RP at a level ≥ 3 (Table 3). However, the radiation exposure doses, including the single-lung V5, V10, V20, and V30, were all shown to be unrelated to the occurrence of RP. The Cox proportional-hazards model analyzed all the significant risk factors obtained by the univariant analysis, and the results showed that the radiation exposure dose of the bilateral lung V5 was an independent risk factor for the occurrence of RP (p < 0.05), while the bilateral lung V10 was an independent risk factor for the occurrence of RP at a level ≥ 2 (p < 0.05). The cumulative incidence rates related to different radiation exposure parameters are shown in Figure 3, indicating the incidence rates of the occurrence of RP within the 3-month period of clinical observation under different exposure doses. The curves of the cumulative incidence rate of different dosimetric parameters, as shown in Figure 3, suggest that the frequency of RP occurs within a certain time of observation (3 months) under different dosimetric parameters.
From peripheral blood samples in this cohort, RNA-Seq libraries were successfully extracted and checked with the quality control procedure stated in the Methods section. Then, RNA-Seq data from each patient were integrated into the computational prediction model, MSM. Subsequently, genome-scale pathway flux analysis was performed to generate S values (as detailed in the Methods section). Afterward, each S value was compared to the clinically observed outcome of the corresponding patient, as shown in Figure 4. A Spearman rank sum test showed a positive significant correlation between both groups (rho = 0.915; p < 0.0001), with an overall consistency of 81.25% (39/48; 95% CI = 77.91-99.15%). The median of the S values was 0.537 (range, 0.0-1.2). In this cohort, 8 (16.7%) and 17 (35.4%) patients were predicted with a high probability to experience RP at a level ≥ 3 and ≤ 2, respectively, during radiotherapy. The remaining 23 patients (47.9%) were predicted to receive radiotherapy without experiencing RP. The non-parametric rank sum test indicated no significant difference between the predicted results from the RP scoring system and the clinically observed occurrence of RP in this cohort (p = 0.405), regardless of the total incidence of RP or the occurrence data of the different levels of RP. Moreover, ROC analysis was performed to compare the predictions of the RP scoring system with those of the classical RTOG radiation physical parameters (Figure 5). The comparison result showed that the area under the curve (AUC) of the prediction from the RP scoring system and the ROTG classification were 0.933 (95% CI = 0.89-1.00) and 0.604 (95% CI = 0.44-0.77), respectively. This strongly indicated that the predictive ability of the RP scoring system was superior to that of the ROTG classification for the diagnosis of RP occurrence in this cohort, with a significantly higher sensitivity and specificity.

DISCUSSION
Generally speaking, the efficacy of radiotherapy for most tumors is positively correlated with the dose of irradiation (13). The general principle for developing radiotherapy strategies is to maximize the dose to tumor-affected tissue while protecting healthy tissue (14). However, under the guidance of evidence-based medicine, the dose received by most patients is determined by the best irradiation dose for the group of patients. Individualized determination of the optimal dose for radiotherapy is a highly challenging task, because slight change of dose for individuals may directly lead to damage of healthy tissues or dramatical reduction of therapeutic efficacy during the course of radiotherapy (15)(16)(17). Thus, precise screening of patients with a high risk of developing RP before radiotherapy becomes an essential and important prerequisite for individualized radiotherapy.
Currently, the predictive biomarkers commonly used in a clinical prognosis for RP include clinical phenotypes (including sex, age, smoking history, chemotherapy history, types of tumor pathology, and clinical stages), biological biomarkers (such as levels of IL-6, IL-8, TGF-β, and other cytokines in the serum), and radiophysical properties (including V5, V20, V10, V30, and mean lung dose). Nevertheless, the predictive ability of these biomarkers varies greatly between studies. This suggests that it may be insufficient to only apply information related to clinical phenotypes and/or tumor characteristics of individual patients for the prognosis of RP during radiotherapy. Furthermore, the development of RP is a dynamic process, involving multiple stages and factors, including tissues, cells, and molecules (18,19). The influence of radiomics, biomics, and clinically relevant risks also needs to be considered comprehensively to accurately assess the risk of developing RP for patients receiving radiotherapy, and to achieve the final goal of individualized treatment.
Wide use of omics data and machine learning has largely promoted advances in medicine and drug development, as well as providing new opportunities. In the present study, we used a genome-scale flux analysis approach integrating a molecular signaling map based on patient-individualized RNA sequencing data. Previous studies have demonstrated that this approach, based on transcriptomic data, was able to quickly predict the efficacy of different treatments received by patients in ∼1 h (20). Compared with observational results, a computer assisted prediction approach can be much easier, more precise, and have a higher efficacy than traditional strategies. To determine the predictive efficacy of this computational molecular model for the development of RP in patients receiving radiotherapy, we compared the predicted outcomes before radiotherapy of 48 patients with the observed clinical outcomes of these patients after radiotherapy. The results of our comparison showed that there was extremely high consistency (81.25%) between the predicted results and the observational outcomes, and ROC analysis suggested that our computational prediction provides a higher accuracy (AUC, 0.933; range, 0.910-0.958; p < 0.001) for the prediction of RP compared with other biomarkers such as clinical phenotypes.
Our study also determined the predictive efficacy of common clinical phenotypes for RP. We evaluated the associations between the development of RP and sex, age, smoking history, chemotherapy history, pathological type, irradiation dose of target area, tumor stage, and the V5, V10, V20, and V30 values of the bilateral lung and the affected lung tissue. The results of the univariate analysis indicated that the irradiation dose of the target area, the tumor stage, and the bilateral V5 and V20 values were significantly associated with the development of RP (p < 0.05) (see Table 1). In contrast, the V5, V10, V20, and V30 values of the affected lung tissue were not correlated with the development of RP. A Cox multivariate analysis showed that the bilateral V5 value was an independent risk factor for the development of RP, while the V20 value was an independent risk factor for the development of RP at grade 2 and above (p < 0.05). Consistent with our results, Bryan et al. (21) observed that the V20 value and the mean lung dose were the main factors affecting the incidence of RP from a study including 281 patients with non-small cell lung cancer. The incidence of RP increased from 4.3 to 16.4% when V20 ≥ 4.3%. Moreover, Kong et al. (22) found that the incidence of RP at grade 2 and above was 16% when V20 ≤ 27%, while the incidence increased to 48% when V20 > 27%.
The present study had several limitations. First, a statistical bias may exist given the small sample size of this cohort, limited follow-up data, and missing patient data. Second, some of our results contradict previous reports, for instance, the V20 threshold was predictive for the development of RP at grade 2 or higher, possibly due to the small sample size and the different clinical stages of the patients recruited. Third, the predictive model in our study was determined according to the results of molecular signaling maps (MSM), which has demonstrated its potential in predicting the efficacy of targeted therapy in patient-derived xenograft models. However, RP is a pathophysiological process involving multiple factors, including the immune response, the release of substances after tumor necrosis, the status of lung tissue, and radiophysical properties. Future studies are warranted to further investigate whether it remains the best strategy to predict the risk of developing RP based on transcriptomic data alone, or whether more comprehensive and accurate predictive models for RP can be established by utilization of other high-throughput data, such as metabolomics, clinical parameters, or even with single-cell data.
In conclusion, we performed an initial exploration of the utility of a computer-assisted predictive model to predict the development of RP. Our study demonstrated that the computer-assisted predictive model based on RNA sequencing data was able to precisely predict the risk of developing RP in patients undergoing thoracic radiotherapy. Our strategy showed a significantly higher predictive efficacy compared with traditional strategies, providing a useful tool to precisely predict the risk of RP and thus promoting the development of individualized radiotherapy.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available on request from the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committee of Wuhan Renmin Hospital. The patients/participants provided their written informed consent to participate in this study.