Development and Validation of a Predictive Model to Evaluate the Risk of Bone Metastasis in Kidney Cancer

Background: Bone is a common site of metastasis in kidney cancer, and accurately predicting the risk of bone metastases (BMs) facilitates risk stratification and precision medicine. Methods: Patients diagnosed with kidney cancer from 2010 to 2017 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database to form the training group, and the validation group was drawn from our academic medical center. Univariate and multivariate logistic regression analyses explored the statistical relationships between the included variables and BMs. Statistically significant risk factors were used to develop a nomogram. Calibration plots, receiver operating characteristic (ROC) curves, probability density functions (PDF), and clinical utility curves (CUC) were used to verify the predictive performance. Kaplan-Meier (KM) curves demonstrated survival differences between kidney cancer patients with and without BMs. A web calculator was provided for users via the “shiny” package. Results: A total of 43,503 patients were included in this study, of which 42,650 were in the training group and 853 in the validation group. The variables included in the nomogram were sex, pathological grade, T-stage, N-stage, sequence number, brain metastasis, liver metastasis, pulmonary metastasis, histological type, primary site, and laterality. The calibration plots confirmed good agreement between the predictions and the actual results. The area under the curve (AUC) values in the training and validation groups were 0.952 (95% CI, 0.950–0.954) and 0.836 (95% CI, 0.809–0.860), respectively. Based on the CUC, we recommend a threshold probability of 5% to guide the diagnosis of BMs. Conclusions: The comprehensive predictive tool, consisting of a nomogram and a web calculator, contributes to risk stratification, helping clinicians identify high-risk cases and provide personalized treatment options.


INTRODUCTION
Kidney cancer is one of the 10 most common cancers in the USA (1,2). Kidney cancer has historically been considered in general terms as a single disease. Continued exploration at the genetic level has shown that it comprises several different types of cancer, characterized by different mutated genes corresponding to different histologies, clinical courses, and responses to treatment (3). Renal cell carcinoma (RCC) accounts for 90% of kidney cancer, with approximately 400,000 new cases and 175,000 deaths worldwide each year (4,5). RCC is a highly heterogeneous tumor with a marked tendency to metastasize distantly. Given this feature, 30% of patients will be diagnosed with metastasis even after aggressive treatment of the primary tumor (6). Bone metastases (BMs), with a 40% incidence rate, are a common form of metastasis and have been identified as a major prognostic risk factor associated with poor survival in patients with metastatic RCC (mRCC) (6–8).
The most frequent sites of colonization by mRCC cells are the proximal extremities and pelvis, manifesting as bone pain, pathological fractures, and hypercalcemia. When the spine is involved, devastating paraplegia can occur due to spinal cord compression. It has been shown previously that metastatic tumors affect bone turnover differently, and radiographic images of bone metastatic tumors from mRCC are osteolytic or osteoclastic due to an imbalance between osteoclasts and osteoblasts mediated by tumor cells. Osteoclasts often increase their activity because of the upregulation of receptor activator of nuclear factor kappa-B ligand (RANKL) induced by mRCC (7). As a result, bisphosphonates, which inhibit osteoclasts, and the RANKL blocker denosumab are widely used to treat BMs from mRCC.
However, some treatment outcomes remain unsatisfactory. Standard treatment protocols or guidelines are still lacking, and most treatments focus solely on alleviating skeletal adverse events in mRCC with BMs (9). Only a few retrospective studies and case reports have confirmed that early diagnosis and timely wide resection are critical in the clinical management of patients with mRCC (6,10–13). Therefore, developing a predictive model to assess the risk of BMs in mRCC is an important step toward precision medicine, which includes more aggressive surgical timing, enhanced surveillance, and regular bone scans.
Artificial intelligence methods demand considerable background knowledge from clinicians, whereas a nomogram is simple and intuitive and can provide comparably insightful analysis to support clinical treatment decisions. Several studies have developed nomograms to predict the prognosis of BMs from mRCC (14,15). However, the current study constructs the first predictive model of the risk of BMs, helping clinicians make individualized clinical decisions and assess patients' long-term prognosis. We extracted kidney cancer patient data from the Surveillance, Epidemiology, and End Results (SEER) database and validated the model with an independent dataset to mitigate the regional limitations of this study as far as possible.

Study Design and Participants
Using SEER*STAT software (version 8.3.5), we extracted patients diagnosed with kidney cancer from 2010 to 2017 from the SEER database as the training group. The validation group consisted of patients from a large academic medical center over a matching time span.
The following inclusion criteria were applied to the training group: (1) patients with primary kidney cancer (International Classification of Diseases for Oncology, ICD-O, codes 8120/3, 8130/3, 8260/3, 8310/3, 8312/3, 8317/3) diagnosed between January 1, 2010 and December 31, 2017 and (2) diagnosis made in living patients. Included histological subtypes were clear-cell RCC, papillary, chromophobe, and any kidney cancer. Cancer-specific survival (CSS) was defined based on the SEER mortality codes. Further inclusion criteria applied to the validation group were as follows: (1) the patient was older than 18 years, (2) radiological and pathological biopsy results during follow-up were sufficient to determine whether metastases had developed, and (3) follow-up data were available until December 31, 2020.
Cases meeting any of the following criteria were excluded: (1) patients younger than 18 years, (2) multiple primary tumors, (3) unavailable demographic characteristics (age and sex), (4) unavailable tumor information (histological type, pathological grade, laterality, TNM stage, and sequence number), (5) diagnosis made from cadavers, (6) missing or unknown BM status or survival time, and (7) cause of death unrelated to kidney cancer or unknown.
Histological subtype was determined according to the International Classification of Diseases for Oncology code. Tumor stage was determined according to the 7th edition of the TNM classification of the American Joint Committee on Cancer.
According to the standard NAACCR terms, patients were assigned to two groups: one group was diagnosed with only one primary tumor and the other group was diagnosed with more than one tumor (16).
This study was approved by the institutional ethics committee.

Data Collection
All data for the training group were obtained from the SEER database, including the year of diagnosis, age at diagnosis, sex, pathological grade, TNM stage, histological type, primary site, laterality, and metastasis. Diagnosis in the independent validation group was made separately by two pathologists in a blinded manner, and a senior pathologist reviewed controversial cases and made the final diagnosis. If any abnormalities were found, patients were advised to undergo a whole-body CT scan to help identify metastatic lesions. A radionuclide bone scan was used to evaluate the presence of bone metastasis, and PET-CT was also used to exclude occult tumor metastasis. Diagnosis of suspected metastasis relied on pathological biopsy of the metastatic site. Follow-up consisted mainly of remote follow-up and outpatient review. The primary endpoint was the presence of BMs, and the secondary endpoint was survival time (until death due to kidney cancer or last follow-up). All validation data were obtained from our electronic medical records.

Statistical Analysis
Independent-samples t-tests and chi-square tests were used to compare the characteristics of all included patients.
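As an illustration of the categorical comparison mentioned above, the following Python sketch computes a Pearson chi-square statistic and its p-value for a 2×2 table. The counts are hypothetical and are not taken from Table 1, the function name is our own, and no continuity correction is applied; the study's analysis was run in SPSS, so this is only a minimal stand-alone sketch.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p_value). With 1 degree of freedom the upper-tail
    probability can be written with the complementary error function.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: rows = training / validation, columns = BM / no BM.
    chi2, p = chi_square_2x2(10, 20, 20, 10)
    print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

For tables larger than 2×2 (for example, histological type by group), the same statistic generalizes, but the p-value then needs the chi-square distribution with more degrees of freedom.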
In the training group, variables that were statistically significant in univariate logistic regression were entered into the multivariate analysis. The independent risk factors confirmed by multivariate analysis were then used to construct a nomogram to assess the odds of BMs in patients with RCC. The predictive performance of this nomogram was assessed with receiver operating characteristic (ROC) curves, calibration plots, probability density functions (PDF), and clinical utility curves (CUC) (17). The OS of mRCC patients with BMs was illustrated with Kaplan-Meier curves. Statistical analysis was performed using SPSS (version 20.0, Chicago, IL, USA). p-values <0.05 were considered statistically significant. R software (version 4.0.5, https://www.r-project.org/) was used to develop the predictive model, with the "rms" package for the nomogram and the "shiny" package for the web calculator.
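To make the step from a logistic regression model to a nomogram more concrete, the sketch below converts log-odds coefficients into nomogram-style points and into a predicted probability. All coefficients, variable names, and the intercept are hypothetical placeholders, not the fitted values of this study, and the scaling (the variable with the widest effect range spans 0 to 100 points) is a common nomogram convention assumed here rather than a reproduction of the "rms" output.

```python
import math

# Hypothetical log-odds coefficients per variable level (reference level = 0.0).
# These are illustrative values only, not estimates from this study.
INTERCEPT = -4.0
COEFS = {
    "sex": {"female": 0.0, "male": 0.3},
    "n_stage": {"N0": 0.0, "N1": 1.0},
    "lung_metastasis": {"no": 0.0, "yes": 2.0},
}


def points_table(coefs):
    """Scale each level to points so the widest-ranging variable spans 0-100."""
    ranges = {var: max(lv.values()) - min(lv.values()) for var, lv in coefs.items()}
    widest = max(ranges.values())
    table = {}
    for var, levels in coefs.items():
        lowest = min(levels.values())
        table[var] = {lvl: 100.0 * (beta - lowest) / widest
                      for lvl, beta in levels.items()}
    return table


def predicted_risk(patient, coefs=COEFS, intercept=INTERCEPT):
    """Logistic probability for a patient given as {variable: level}."""
    linear_predictor = intercept + sum(coefs[var][patient[var]] for var in coefs)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    patient = {"sex": "male", "n_stage": "N1", "lung_metastasis": "yes"}
    pts = points_table(COEFS)
    total = sum(pts[var][patient[var]] for var in COEFS)
    print(f"total points = {total:.1f}, risk = {predicted_risk(patient):.3f}")
```

The points are only a rescaling of the linear predictor, so reading a probability off the total-points axis of a nomogram is equivalent to evaluating the logistic function directly, which is presumably what a web calculator of this kind does behind the interface.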

Included Patients
A total of 43,503 patients with kidney cancer were included in the present study. The SEER database provided 42,650 eligible patients for the training group. After screening out 279 patients (86 with multiple tumors, 75 lacking survival time due to loss to follow-up, 65 dying from nontumor causes, 33 lacking records of metastasis, and 20 with unavailable pathological diagnoses), 853 patients from the Second Affiliated Hospital of Dalian Medical University formed the independent validation group. Older males were the predominant patient population. The training and validation groups differed significantly in race (p < 0.001), marital status (p < 0.001), primary site (p < 0.001), laterality (p = 0.001), grade (p < 0.001), histology (p = 0.003), T-stage (p = 0.015), N-stage (p = 0.016), surgery (p < 0.001), chemotherapy (p = 0.005), bone metastasis (p < 0.001), and liver metastasis (p < 0.026). These differences may be due to demographic differences and healthcare disparities between the USA and China. The detailed clinical characteristics of the patients are shown in Table 1. Table 2 compares the BM and non-BM groups.

Correlation of Variable With BMs
Independent risk factors for BMs were obtained by univariate and multivariate logistic regression analyses. Univariate analysis showed that age, sex, race, marital status, sequence number, primary site, laterality, grade, histology, T-stage, N-stage, tumor size, and brain/liver/pulmonary metastasis were associated with BMs. Multivariate analysis indicated that sex, primary site, laterality, pathological grade, histology, T-stage, N-stage, sequence number, and brain/liver/pulmonary metastasis influenced the endpoint (Table 3).

Develop and Validate Predictive Models
Statistically significant variables identified by regression analysis were used to develop the nomogram. Figure 1 illustrates the nomogram for the risk of BMs in mRCC, incorporating multiple clinical factors. In our nomogram, the effect of each variable on the endpoint is reflected in its line length and corresponding score. Each patient receives an individualized score, and the total score summed across all variables corresponds to the probability that the patient will develop BMs. The ROC curve (Figure 2) was used to assess the predictive performance of the nomogram, and the AUC of the training group (AUC = 0.952) and the validation group (AUC = 0.836) indicated strong discriminative ability (Table 4). Meanwhile, the calibration plots of the training and validation groups were used to evaluate the agreement between the nomogram predictions and the actual occurrence. Ideally, the calibration curve is a diagonal line, on which the predicted probability equals the true probability. The calibration curves of our nomogram confirm the good agreement between the actual and predicted values (Figure 3). As shown in Figure 4, the PDF for nonmetastatic patients is concentrated in the portion representing 0%-10% risk of metastasis, while the distribution of the curve for metastatic patients is relatively flat. Figure 5 shows, at each probability threshold, the percentage of patients whose metastases would go undetected and the percentage of biopsies that would be spared, and suggests 5% as the threshold probability for making a clinical decision. In addition, the Kaplan-Meier curve for cases that underwent risk stratification confirmed a significant survival advantage for patients without BMs (Figure 6). This finding supports the clinical relevance of our study.
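The AUC values above can be read as the probability that a randomly chosen patient with BMs receives a higher predicted risk than a randomly chosen patient without BMs. The Python sketch below implements that rank-based definition directly; the labels and scores are toy values invented for illustration, and the function name is our own rather than part of any package used in this study.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = bone metastasis present).
    scores: predicted risks in the same order. Tied pairs count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data: two patients without BMs followed by two patients with BMs.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.10, 0.40, 0.35, 0.80]
    print(auc_from_scores(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of patients, which is fine for a demonstration; on a cohort the size of the training group a sort-based rank computation would be the more practical choice.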
With the proposed web calculator, the corresponding BM risk can be obtained by selecting the values of the risk factors confirmed in this study (https://liwenle0910.shinyapps.io/RCCapp/).

DISCUSSION
As in other solid tumors, bone metastasis is often associated with the progression of metastatic kidney cancer. Interestingly, Becerra and colleagues (18) found that the mutated genes were not consistent between the primary tumor and metastatic samples, suggesting that metastases may possess genetic characteristics distinct from those of the primary tumor. Meanwhile, previous reports confirmed that BMs were an independent risk factor for mRCC prognosis (5,8,14,15). Thus, identifying patients at high risk for BMs provides a sound basis for guiding treatment (timing and procedures of surgery, chemotherapy, and radiation therapy).
In this study, a nomogram of risk factors for BMs was developed. Eleven risk factors were identified, including sex, pathological grade, T-stage, N-stage, sequence number, brain metastasis, liver metastasis, pulmonary metastasis, histological type, primary site, and laterality. In addition, the ROC curves and calibration curves demonstrated favorable discrimination and calibration. The use of combinatorial lines simplified the patient status and provided a visual assessment on the nomogram as a total score (15,19). Sex-associated genetic specificity in mRCC has also been reported previously, involving multiple risk genes including 14q24.2 (DPF3) and 2p21 (EPAS1) (20). Meanwhile, the impact of sex on RCC-specific mortality is inconsistent, as lower RCC-specific mortality was detected in premenopausal women than in men of the same age, but the difference diminished after menopause. Chen et al. (22) reported a distinct sex bias for RCC, with significantly higher prevalence (62.6%) and BMs (76.3%) in males, but they were unable to confirm the correlation between sex and BMs. Interestingly, another paper based on the SEER database concluded that male patients were more likely to develop BMs (23). Reasons for the disagreement may include the fact that the former is a single-center retrospective study. In Chen's study, the study period spanned 16 years in order to recruit a sufficient number of subjects, so the diagnostic criteria varied over time, given the progress in early and precise diagnosis of RCC. Our study confirmed a strong correlation between BMs and male sex in RCC patients, which may reflect the high risk of RCC invasion in men. Women use the healthcare system more frequently, including scheduled abdominal radiological examinations, resulting in earlier detection, which may further contribute to the sex difference in BMs in RCC (24).
Variables including pathological grade, T-stage, and N-stage, which are associated with patient overall survival, were also shown to be related to BMs (19,24–26). A possible explanation for the higher risk of BMs in mRCC patients with high pathological grade and advanced T-stage and N-stage is that such tumors are more aggressive. The skeleton-specific microenvironment has been found to be a suitable "soil" for the growth of mRCC. The invasion of cancer cells requires roughly three processes: "escape" (malignant cells leave the kidney), "metastasis" (reaching the skeletal microenvironment suitable for mRCC growth via blood vessels), and "colonization" (formation of new lesions in the involved sites) (6). Metastatic tumor cells invade the blood vessels and colonize the target bone through a characteristic preference (27,28). Thus, predictors suggesting high malignancy of RCC also had value in assessing BMs. Pulmonary, brain, and liver metastases were also predictors of BMs, compared with patients without multiple metastases. As mentioned above for the metastatic mechanism, the renal vein and inferior vena cava are a critical route of distant metastasis. In patients developing pulmonary, cerebral, or liver metastasis, tumor cells have already escaped, and the risk of BMs is significantly elevated (29–32). The KM curves presented in this study also confirmed that patients with BMs have a worse prognosis than patients without.
As demonstrated in our nomogram, of all the histologic types available from the SEER database, the subtypes most and least frequently associated with BMs are transitional cell carcinoma (TCC) and chromophobe RCC, respectively. TCC is a relatively rare renal malignancy that accounts for approximately 10% of all genitourinary cancers (33, 34). Its histologic features have been shown to resemble bladder cancer, with a 5-year survival rate of 77%-80% in T1 patients and the highest risk of BMs (33, 35). By comparison, the low metastatic potential of chromophobe RCC is consistent with previous studies. Since Thoenes separated chromophobe RCC from RCC three decades ago, substantial evidence has demonstrated that its 10-year OS is 80%-90% and its metastasis rate is only 5%, which supports the definition of chromophobe RCC as a tumor of low malignancy (36–38).
The role of laterality remains debated. The left renal vein has more collateral circulation, draining multiple veins of the lumbar region, which may promote metastasis. Therefore, left-sided tumors may develop more metastases than those on the right side (39). This idea is also reflected in the report of Morri et al. (40). Owing to this concern about laterality, surgeons usually remove left-sided lesions as thoroughly as possible, which leads to more negative margins being observed in left-sided cases. However, we did not find that the difference between the left and right sides significantly affected BMs. Instead, patients with bilateral/other types were more likely to develop BMs. This finding may be associated with hereditary RCC, which is primarily characterized by bilateral or multifocal masses. Hereditary RCC frequently presents with perirenal fatty infiltration and renal vein infiltration; retroperitoneal and mediastinal lymph nodes, liver, and bone are common targets of metastasis (41,42). Another risk factor reflecting the relationship between tumor location and BMs is the primary site. Renal pelvic RCC has a lower risk of BMs. In this regard, we propose the following explanation. First, the clinical symptoms of renal pelvis tumors are obvious: 80% of patients develop hematuria, and early medical consultation can effectively control carcinoma progression. Second, the surgical criteria for renal pelvic RCC require a greater extent of resection compared with kidney cancer, which may further protect patients (43). However, previous studies have reported that renal vein or inferior vena cava (IVC) involvement is associated with early onset of metastasis in renal pelvis RCC (29). The mechanisms linking the site of carcinoma origin and BMs require further exploration.
Furthermore, the extensive overlap between risk factors for prognosis and risk factors for BMs often leads to confusion among urologists and orthopedic surgeons. A prognostic factor typically describes an association between patient status and survival that does not depend on the treatment regimen received (44). Based on the findings of this study, severely ill patients cannot simply be equated with patients with metastasis unless BMs were the direct cause of the patient's death. "Severe" is often used to define a patient's overall health status rather than simply describing the tumor, especially for patients with BMs, who are vulnerable to adverse skeletal events and an increasing financial burden (45,46).
Our study demonstrated for the first time that sequence number was associated with BMs in patients with RCC. Using the criteria described above, we found that patients with >1 primary tumor were less likely to develop BMs, possibly because the poorer prognosis and shorter survival of patients with multiple tumors leave insufficient time for BMs to form (45).
Compared with other tumors, RCC is characterized by high vascularity, which poses a serious challenge for treatment. Vascular endothelial growth factor (VEGF) plays an important role in promoting angiogenesis in RCC. As a result, targeted therapies represented by VEGF inhibitors (bevacizumab, sunitinib, axitinib) have been developed as first-line treatment for patients with advanced disease, showing impressive clinical outcomes (4,47,48). Bone-targeted treatment (bisphosphonate) and surgical intervention are also effective in treating mRCC with BMs. Early radical resection of the primary tumor and/or skeletal lesions has been shown to help prolong patient survival. Accordingly, earlier detection and higher surgical rates led to a better prognosis for patients with mRCC metastasizing to the long bones (8). Notably, the high vascularity can result in devastating bleeding during the procedure without proper pretreatment. Several studies have reported that preoperative embolization significantly reduces perioperative blood loss in mRCC patients (49–51). Thus, the developed predictive model will be useful in risk stratification, case surveillance, treatment decisions, prolongation of survival, and cost control.
Unlike conventional logistic regression analysis that simply suggests parameters affecting BMs in mRCC (23), our proposed prediction model, presented as a nomogram, was able to quantify these predictors by scoring each risk factor. Higher scores indicated an increased risk of developing BMs. In addition, we offered clinicians an online web calculator suited to digital practice. By opening the following link and entering the patient's individual information, users can quickly obtain the BM risk (https://liwenle0910.shinyapps.io/RCCapp/). Nevertheless, there were several limitations in our study. First, this is a retrospective cohort study, and selection bias may have an adverse effect on the conclusions. Second, given that the variables recorded in the SEER database are fixed, some valuable clinical predictors were not included in this study, including common tumor markers such as AFP, CE-199, molecular susceptibility, and the Fuhrman classification (22,52). Meanwhile, we extracted the cases according to ICD-O codes, not the latest published WHO histological types. Notably, this population-based study included an adequate number of patients, which supports the credibility of the conclusions. Future studies need to incorporate further tumor characteristics, laboratory results, and treatment regimens to establish a higher-dimensional predictive model.

CONCLUSION
We retrospectively investigated the risk factors for the development of BMs in kidney cancer. Combining the SEER database and an independent external validation dataset, this study proposed and validated a prediction model that incorporated sex, pathological grade, T-stage, N-stage, sequence number, brain metastasis, liver metastasis, pulmonary metastasis, histological type, primary site, and laterality. The adverse prognosis of BM patients was confirmed by KM curves. The composite predictive tool, consisting of a nomogram and a web calculator, provides an important reference for multidisciplinary management.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
This study is based on the SEER database and does not require ethical approval.

AUTHOR CONTRIBUTIONS
SD and HY completed the study design. SD, HY, Z-RT, and YK performed the study and collected and analyzed the data. SD and HY drafted the manuscript. HW, KT, and WL provided expert consultation and suggestions. SD, KT, and WL conceived of the study, participated in its design and coordination, and helped to polish the language. All authors contributed to the article and approved the submitted version.