Predictors of Lung Adenocarcinoma With Leptomeningeal Metastases: A 2022 Targeted-Therapy-Assisted molGPA Model

Objective: To explore prognostic indicators of lung adenocarcinoma with leptomeningeal metastases (LM) and provide an updated graded prognostic assessment model integrated with molecular alterations (molGPA). Methods: A cohort of 162 patients was enrolled from 202 patients with lung adenocarcinoma and LM. The data were randomly split into training (80%) and validation (20%) sets, and Cox regression and random survival forest methods were applied to the training set to identify statistically significant variables and construct a prognostic model. The C-index of the model was calculated and compared with that of previous molGPA models. Results: The Cox regression and random forest models both identified four variables as statistically significant predictors: KPS, LANO neurological assessment, TKI therapy line, and controlled primary tumor. A novel targeted-therapy-assisted molGPA model (2022) using these four prognostic factors was developed to predict survival in lung adenocarcinoma with LM. The C-indices of this prognostic model in the training and validation sets were higher than those of the lung-molGPA (2017) and molGPA (2019) models. Conclusions: The 2022 molGPA model, a substantial update of previous molGPA models with better prediction performance, may be useful in clinical decision making and stratification of future clinical trials.


INTRODUCTION
Leptomeningeal metastases (LM) refer to the seeding of tumor cells within the subarachnoid space and leptomeninges. They occur in up to 10% of adult patients with solid tumors, especially melanoma, breast cancer, and non-small cell lung cancer (NSCLC) (1,2). The incidence of LM as a devastating complication of NSCLC is increasing, especially in patients with targetable driver mutations (3,4). Lung adenocarcinoma, which is the main component of NSCLC, is more likely to be complicated by LM. Molecular targeted therapy has shown antitumor activity in central nervous system metastases, with median overall survival ranging from 1 to 3 months for historical treatments and 3 to 11 months for new treatments (4,5). Because these patients now survive longer, those with lung adenocarcinoma have a greater risk of developing sequelae of advanced disease, such as brain metastasis (BM) and LM. These trends, coupled with the wide application of magnetic resonance imaging (MRI), indicate that an increasing number of patients will be diagnosed with LM in the next few years.
Existing studies have focused on predicting outcomes in the heterogeneous population of patients with BM. The Radiation Therapy Oncology Group (RTOG) database was used to generate the recursive partitioning analysis (RPA) classes, which were modified in 2012 (modified RPA) (6-8). RPA is a prognostic index that is divided into three classes based on age, Karnofsky performance status (KPS), control of primary tumor, and extracranial metastases (ECM). The graded prognostic assessment (GPA) index was developed in 2007 and revised in 2017 to form a lung-molGPA model using age, KPS, ECM, number of BM, and gene status to define four disease classes, with median survival ranging from 3.0 to 14.8 months (9-12). In 2019, another molGPA model was developed for LM using factors such as KPS, ECM, and gene status (13).
In both the lung-molGPA (2017) and molGPA (2019) models, gene mutation status was identified as a significant prognostic factor (11,12). From a clinical perspective, gene mutation status, which indicates eligibility for molecular targeted therapy, also has a significant impact on the treatment of BM and LM. However, third-generation targeted drugs are markedly more effective than first- or second-generation targeted therapeutic approaches (2-5, 14, 15). According to the BLOOM and AURA studies (5,14,15), the third-generation epidermal growth factor receptor (EGFR) tyrosine kinase inhibitor (TKI) resulted in a significantly improved median overall survival (OS) of 11.0-18.8 months, compared with a median OS of 3.1-6.2 months for even higher doses of first- or second-generation EGFR TKIs (2). The differences in efficacy between generations of targeted therapy may affect the prediction efficiency of the molGPA models. Therefore, in this study, we compared the effects of gene mutation status and targeted therapy on survival, and developed a novel 2022 molGPA model for patients with lung adenocarcinoma and LM.
To the best of our knowledge, no study has used targeted therapy as a predictor of survival in lung adenocarcinoma with LM, and machine learning methods such as random forests have not been applied to this problem. Therefore, this study aimed to fill this research gap and examine the role of targeted therapy in the prognosis of lung adenocarcinoma with LM using both conventional molGPA and random forest models.

Study Design and Samples
The study was conducted in accordance with the principles of the Declaration of Helsinki, and the protocol was approved by the Medical Ethics Committee of Henan Provincial People's Hospital (approval number: 2017-28). All study participants provided written informed consent for the research and publication.
We collected data from 202 lung adenocarcinoma patients with LM, enrolled between April 2017 and January 2022, at Henan Provincial People's Hospital, Zhengzhou, China. The inclusion criteria were as follows: (i) age ≥ 18 years; (ii) diagnosis of lung adenocarcinoma confirmed by histopathology; and (iii) LM diagnosis ascertained according to the NCCN guidelines and the European Association of Neuro-Oncology-European Society for Medical Oncology (EANO-ESMO) guidelines (16). All patients underwent a complete work-up during hospitalization, including a standardized neurological examination according to the Leptomeningeal Assessment in Neuro-Oncology (LANO) neurological assessment in LM (Supplemental Table 1) (17), brain and spine MRI, and cerebrospinal fluid (CSF) analysis. Patients with insufficient clinical information (n = 29) or missing follow-up data (n = 11) were excluded. Finally, 162 patients were included in the study cohort and randomly assigned to the training (80%, n = 130) and validation (20%, n = 32) sets (Figure 1).
Baseline clinicopathological characteristics of each patient were obtained from their medical records; they included age, sex, smoking status, ECM, controlled primary tumor, clinical presentations, KPS, gene profiles of EGFR mutation and ALK alteration, ThinPrep cytologic test (TCT), and brain and spine MRI. Treatments, including TKI therapy, chemotherapy, bevacizumab, surgery, radiotherapy, intrathecal chemotherapy, and immune checkpoint inhibitors, were also recorded. Controlled primary tumor was defined as remission or stable disease, without any clinical, radiologic, or laboratory findings suggestive of tumor progression at 2 months (6,7,18). OS was defined as the time from diagnosis of LM to death.
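The 80/20 random assignment can be illustrated with a short sketch. The authors' analysis was carried out in R; the Python below is only an illustration, and the function name, the fixed seed, and the rounding rule are assumptions rather than details taken from the study.

```python
import random

def split_cohort(patient_ids, train_fraction=0.8, seed=0):
    """Randomly split patient identifiers into training and validation sets.

    Hypothetical helper: the seed and the use of round() are assumptions.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

# 202 enrolled - 29 (insufficient information) - 11 (missing follow-up) = 162
cohort = range(202 - 29 - 11)
train_ids, valid_ids = split_cohort(cohort)
print(len(train_ids), len(valid_ids))
```

With 162 patients, rounding 0.8 × 162 = 129.6 gives the 130/32 split reported above.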

Statistical Analysis
Missing values were imputed for variables with a small proportion of missing data. The continuous CSF variables (white blood cells, protein, and glucose) were log-transformed. Other continuous variables were categorized based on clinical reasoning and statistical methods. KPS was divided into 3 groups: < 60 (high-risk group), 60-70 (moderate-risk group), and 80-100 (low-risk group). Age was dichotomized using a 65-year cutoff. Univariate Cox models covering baseline characteristics, clinical symptoms, brain and spinal MRI, CSF analysis, and treatment were fitted to the training set (n = 130) to identify statistically significant variables. Using the variables that were significant in the univariate analysis, a multivariate Cox model was fitted to the training set to select significant predictors for constructing the prognostic model.
We further used the random survival forest method to validate the predictors selected by the Cox model. Beyond clinical prediction, for which it offers a favorable bias-variance trade-off, the random survival forest method (19,20) is commonly used to select the variables most strongly associated with the time-to-event outcome (i.e., OS). Given these advantages, we first included all variables in the model to identify those with positive importance values. With the top variables, we ran the random survival forest method again to select significant variables, and compared them with those from the Cox model. Furthermore, the C-index of the prognostic model constructed using the top variables was calculated.
We constructed a novel molGPA model (2022) using statistically significant variables. The model was then used to predict the OS of patients with lung adenocarcinoma and LM. The C-index of the prognostic model was calculated and compared with those of the lung-molGPA (2017, Supplemental Table 2) and molGPA (2019, Supplemental Table 3) models by averaging the C-index values over 100 random splits into training and validation sets. Missing values were imputed using the R package mice with default settings (e.g., five imputed datasets) (21). All analyses were conducted in R using the mice package (21) for multiple imputation, the survival package (22) for the Cox model and C-index, and the randomForestSRC package (19,20) for random forests. The R code for the analysis is available on GitHub: https://github.com/Penncil/A-2022-Targetedtherapy-assisted-molGPA-.
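The variable coding described in this paragraph can be sketched as follows. This is an illustrative Python sketch, not the authors' R code. The text does not state which side of the 65-year cutoff includes age 65, the base of the logarithm, or how zero values were handled, so the choices below (65 in the older group, natural log, an offset of 1) are assumptions, and all function names are hypothetical.

```python
import math

def kps_risk_group(kps):
    """Map a KPS value to the three groups named in the text."""
    if kps < 60:
        return "high"       # KPS < 60
    if kps <= 70:
        return "moderate"   # KPS 60-70
    return "low"            # KPS 80-100

def age_group(age_years, cutoff=65):
    # Assumption: a patient exactly at the cutoff falls in the older group.
    return f">={cutoff}" if age_years >= cutoff else f"<{cutoff}"

def log_transform(value, offset=1.0):
    # Assumption: natural log with a +1 offset so that a zero cell count
    # stays finite; the source only says "taking the logarithm".
    return math.log(value + offset)

print(kps_risk_group(70), age_group(58), log_transform(0.0))
```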

Survival Analysis via Cox Regression Model
As shown in Table 2, the univariate Cox proportional hazard regression models showed that age, KPS, controlled primary tumor, gene mutation status, CSF chloride, LANO neurological assessment, and TKI therapy line were significantly associated with OS (all with p < 0.05). ECM, BM, MRI findings, and CSF white blood cells and protein were not significantly associated with OS. When the multivariate Cox model was refitted including gene mutation status only (Supplemental Table 4), the p-value of the gene mutation status was 0.07.

Random Survival Forest Model
A random survival forest model for predicting survival of patients with lung adenocarcinoma with LM was fitted to validate the results of the Cox model. As shown in Figure 2, candidate predictor variables were ranked according to their importance in terms of prognostic accuracy. Among these variables, the top four variables, which included KPS, LANO neurological assessment, TKI therapy line, and controlled primary tumor with p-values less than 0.05, were consistent with those identified by the multivariate Cox proportional hazard regression model.

Establishment and Internal Validation of the 2022 molGPA Model
By selecting statistically significant variables with the multivariate Cox and random forest models, we developed a novel molGPA model (2022) for LM of lung adenocarcinoma using four parameters: controlled primary tumor, KPS, LANO neurological assessment, and TKI therapy line (Table 3). Factors with larger effect sizes were given a maximum score of 1.0, including KPS from 80 to 100 (HR, 0.47 vs KPS < 60), LANO neurological assessment ≤ 2 (HR, 1.12), and third-generation TKI therapy line (HR, 0.42 vs no TKI therapy), with higher scores corresponding to better prognosis. The controlled primary tumor had a smaller effect size (HR, 0.66), with a maximum score of 0.5. The model had a maximum score of 3.5; the higher the score, the lower the risk. The targeted-therapy-assisted molGPA score was calculated for each patient and categorized into three groups: molGPA 0 (group 1, high risk), 0.5-1.0 (group 2, intermediate risk), and ≥ 1.5 (group 3, low risk). The survival of the three groups is shown in Figure 3, which demonstrates significant separation among the three groups.
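The scoring logic can be sketched in a few lines. This is an illustrative Python sketch that encodes only what is stated in the text: the maximum points per factor (1.0, 1.0, 1.0, and 0.5) and the three risk groups (0; 0.5-1.0; ≥ 1.5). The points assigned to intermediate levels of each factor are defined in Table 3 and are not reproduced here, so the caller supplies them; the dictionary keys and function names are hypothetical.

```python
# Maximum points per factor as stated in the text (sum = 3.5).
MAX_POINTS = {
    "kps": 1.0,                 # KPS 80-100 scores the maximum
    "lano": 1.0,                # LANO neurological assessment <= 2
    "tki_line": 1.0,            # third-generation TKI therapy line
    "controlled_primary": 0.5,  # controlled primary tumor
}

def total_score(points):
    """Sum per-factor points after checking them against the stated maxima."""
    for factor, value in points.items():
        if factor not in MAX_POINTS:
            raise KeyError(f"unknown factor: {factor}")
        if not 0.0 <= value <= MAX_POINTS[factor]:
            raise ValueError(f"{factor} points out of range: {value}")
    return sum(points.get(factor, 0.0) for factor in MAX_POINTS)

def risk_group(score):
    """Map a molGPA score to the three groups described in the text."""
    if score == 0:
        return "high risk"            # molGPA 0
    if score <= 1.0:
        return "intermediate risk"    # molGPA 0.5-1.0
    return "low risk"                 # molGPA >= 1.5 (scores move in 0.5 steps)

best_case = {"kps": 1.0, "lano": 1.0, "tki_line": 1.0, "controlled_primary": 0.5}
print(total_score(best_case), risk_group(total_score(best_case)))
```

A patient scoring the maximum on every factor reaches 3.5 points and falls in the low-risk group; a patient with no points on any factor falls in the high-risk group.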

Model Evaluation
The previously reported lung-molGPA model (2017) (12) and molGPA model for LM (2019) (13) were tested in all patients. The C-index of each of the three models was calculated by averaging the C-index values from 100 randomly split training and validation sets. For each split, molGPA scores and concordance values were calculated. The higher the C-index, the better the model discriminates between patients with longer and shorter survival. The concordance results are shown in Table 4: the average C-index of this model on the training set was 0.710 (95% CI [0.69, 0.73]), which was 7.0% higher than that of the lung-molGPA (2017) and 5.5% higher than that of the molGPA (2019) models. The C-index of the model on the validation set was 0.714 (95% CI [0.63, 0.80]), which was 8.3% higher than that of the lung-molGPA (2017) and 5.9% higher than that of the molGPA (2019) models.
We also calculated the C-indices of the random-survival-forest-derived prognostic model. The C-index was 0.722 (95% CI [0.69, 0.74]) for the training set (80% of the cohort) and 0.714 (95% CI [0.60, 0.84]) for the validation set (20% of the cohort). The C-index of the training set was slightly larger (1.7%) than that of the Cox-based prognostic model. This is because the prognostic model with the random survival forest method included all variables listed in Figure 2 rather than only the top four variables. The C-indices of the validation set of these two prognostic models (i.e., Cox-based and random-survival-forest-based) were the same (i.e., C-index = 0.714).
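The evaluation procedure can be illustrated with a simplified sketch. The authors computed concordance with the R survival package; the Python below is a plain implementation of Harrell's C-index for a score in which higher values mean better prognosis, averaged over repeated random splits. The synthetic data, the seeds, the omission of tied survival times, and the function names are all assumptions made for illustration and do not reproduce the study results.

```python
import random

def harrell_c_index(times, events, scores):
    """Harrell's C-index for a score where higher means longer expected survival.

    A pair is comparable when the patient with the shorter time died.
    Simplification (assumption): pairs with tied survival times are skipped.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # censored patients cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] < scores[j]:
                    concordant += 1.0     # earlier death had the lower score
                elif scores[i] == scores[j]:
                    concordant += 0.5     # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def mean_c_index_over_splits(times, events, scores,
                             n_splits=100, train_fraction=0.8, seed=0):
    """Average the C-index on the training and validation parts of random splits."""
    rng = random.Random(seed)
    idx = list(range(len(times)))
    n_train = round(train_fraction * len(idx))
    sums = [0.0, 0.0]
    for _ in range(n_splits):
        rng.shuffle(idx)
        for k, subset in enumerate((idx[:n_train], idx[n_train:])):
            sums[k] += harrell_c_index([times[m] for m in subset],
                                       [events[m] for m in subset],
                                       [scores[m] for m in subset])
    return sums[0] / n_splits, sums[1] / n_splits

# Synthetic cohort of 162 patients (illustrative only, not study data).
rng = random.Random(1)
scores = [rng.choice([0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]) for _ in range(162)]
times = [rng.expovariate(1.0 / (2.0 + 4.0 * s)) for s in scores]  # mean OS grows with score
events = [rng.random() < 0.8 for _ in scores]                     # about 20% censored
train_c, valid_c = mean_c_index_over_splits(times, events, scores)
print(round(train_c, 3), round(valid_c, 3))
```

Because fixed scoring rules such as the 2017 and 2019 models need no fitting, the same routine can be applied to each model's scores on identical splits, which keeps the comparison between models paired.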

DISCUSSION
To the best of our knowledge, this is the first attempt to construct a 2022 targeted-therapy-assisted molGPA for LM of lung adenocarcinoma using a multivariate Cox proportional hazard regression model and the random survival forest method. The molGPA model considered the following four variables: controlled primary tumor, KPS, LANO neurological assessment, and TKI therapy line. According to the molGPA model scores, patients were divided into three groups: 0 for high risk, 0.5-1.0 for intermediate risk, and ≥ 1.5 for low risk. In both the training and validation sets, patients with an LM molGPA score ≥ 1.5 (low risk) were more likely to have a better OS than the other two groups. The C-index values of the proposed prognostic model for the training and validation sets were higher than those of the lung-molGPA (2017) and molGPA (2019) models (12,13).
Our 2022 targeted-therapy-assisted molGPA for LM has several advantages. First, TKI therapy was used instead of gene mutations. The recent revolution in the treatment of patients with prognostic biomarkers has resulted in significant improvements in survival outcomes. As mentioned earlier, molecular markers were included as important factors in the lung-molGPA (2017) and molGPA (2019) models, and their prognostic value has been validated by several studies in real-world cohorts (12,13,23,24). However, in this study, gene mutation status was not statistically significant in the multivariate Cox model. Considering the correlation between gene mutation status and TKI therapy line, we refitted the multivariate Cox model including the gene mutation status only (Supplemental Table 4). The result was a borderline p-value of 0.07 for gene mutation status, which suggests a possible prognostic value of mutation status in real-life cohorts. We further found that the TKI therapy line was a significant positive prognostic factor for LM, identified by both the multivariate Cox and random forest models. The efficacy of first-generation EGFR-TKIs for EGFR+ NSCLC remains poor because of low CSF penetration (25,26). Although second-generation EGFR-TKIs, such as afatinib, can partially penetrate the blood-brain barrier, they exhibit no obvious advantages as treatment for LM (27). Osimertinib, an irreversible third-generation EGFR TKI, is highly effective in both untreated and previously treated patients with EGFR-mutant NSCLC, according to several encouraging international clinical trials (13-15, 28). For ALK+ NSCLC, lorlatinib is a novel, highly potent, brain-penetrant, third-generation ALK TKI with broad-spectrum potency against most known resistance mutations that can develop during treatment with existing first- and second-generation ALK TKIs; its efficacy is significant in BM and LM (29).
Guttmann (30) also noted that lung-molGPA is a critical first step in accurately defining the prognosis of patients with gene mutations, but highlighted the need for a prognostic index incorporating the utilization and timing of targeted therapy. Therefore, we considered that the TKI therapy line could be used as a significant positive prognostic factor in the prediction of survival in LM.
The second advantage of our proposed molGPA is the use of the LANO assessment, a significant factor commonly used in clinical practice, which has never been considered by other prediction models. The LANO scorecard was formed by the Response Assessment in Neuro-Oncology (RANO) Leptomeningeal Metastasis Working Group, an international multidisciplinary group with the goal of improving response criteria and defining endpoints for neuro-oncology trials (17,31). Although the LANO neurological assessment in LM has not yet been validated, the LANO scorecard generated a proposal for the response assessment in LM and has been widely used in international randomized clinical trials, including the BLOOM and AURA studies (5,14,15,31,32). Patients with LM from lung adenocarcinoma are treated in different departments, including neurology, oncology, and respiratory medicine. The LANO assessment (Supplemental Table 1) is a standardized assessment for neurological examination in the prediction model and is easily utilized by neurologists, oncologists, nurses, and physician assistants.
Third, KPS and controlled primary tumor, two clinically important prognostic factors, were considered in our molGPA model. Patients with a KPS score of 80-100 had better OS than those with KPS of 60-70 and KPS < 60. KPS was significantly associated with survival and was included in all the prediction models for BM and LM (6-10, 12, 13). A controlled primary tumor, which requires an estimate of the control of systemic disease, was included in the RPA and basic score for BM (BSBM) models (6,7,18). In this study, controlled primary tumor had a p-value of 0.09 in the multivariate Cox model, and a borderline p-value between 0.05 and 0.1 indicates weak evidence or a trend (33,34). In addition, the random forest model fitted to the full dataset confirmed that controlled primary tumor was significant, with p = 0.04. For these two reasons, we considered controlled primary tumor a significant factor and incorporated it into the proposed 2022 molGPA model. The controlled primary tumor was assigned a maximum of 0.5, based on its HR and statistical significance in the molGPA model for LM. Extracranial metastases were included in the lung-molGPA (2017) and molGPA (2019) models (12,13). However, in this study, extracranial metastases showed no statistical significance in the Cox proportional hazard regression model or the random forest analysis, which may be related to sample bias and requires further analysis and verification in a larger sample of patients.
Our study had several limitations. First, it was a retrospective study from a single center and a single ethnic population, which led to incompleteness of some variables. For example, 48 patients did not undergo lumbar puncture and had no available information on variables such as protein and white blood cells. However, the sensitivity analysis showed that excluding variables with missing data did not change our conclusions. Second, third-generation TKIs comprise different EGFR- and ALK-related drugs, which may affect the prognostic effect of the TKI therapy line. Third, this study evaluated only lung cancer, not other solid cancers, such as melanoma and breast cancer, which are also common causes of LM. We intend to validate the 2022 molGPA model for LM with lung cancer and extend the model to other solid tumors in future studies.

CONCLUSIONS
We developed a novel targeted-therapy-assisted 2022 molGPA model for predicting survival in lung adenocarcinoma with LM by incorporating TKI therapy line in addition to controlled primary tumor, KPS, and LANO neurological assessment. The 2022 molGPA model has a better prediction performance and is a substantial update of previous molGPA models (11,12). The 2022 molGPA model provides a user-friendly tool for estimating survival of lung adenocarcinoma patients with LM and may be useful in clinical decision-making and stratification of future clinical trials.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: the code for this study is available in the git repository at https://github.com/Penncil/2022_molGPA.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Ethics Committee of Henan Provincial People's Hospital (ethics number: 2017-28). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
WL, YJ, JZ, FP and YC contributed to the conception and design of this study. WM, MZ, HL, YJ, LQ, XW and LY collected and organized the data. JT, CL, YC and MZ analyzed the data. MZ, JT, YJ, CL and WL drafted the manuscript. All the authors read and approved the final manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by a grant to WL from the Medical Science and Technology Project of Henan Province (No. SBGJ2018077). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.