Development and Validation of a Nomogram Based on 18F-FDG PET/CT Radiomics to Predict the Overall Survival in Adult Hemophagocytic Lymphohistiocytosis

Purpose: Hemophagocytic lymphohistiocytosis (HLH) is a rare and severe disease with a poor prognosis. We aimed to determine whether 18F-fluorodeoxyglucose (18F-FDG) PET/CT-derived radiomic features, alone or in combination with clinical parameters, could predict survival in adult HLH. Methods: This study included 70 adults with HLH (training cohort, n = 50; validation cohort, n = 20) who underwent pretherapeutic 18F-FDG PET/CT scans between August 2016 and June 2020. Radiomic features were extracted from the liver and spleen on CT and PET images. For evaluation of 6-month survival, the features with p < 0.1 in the univariate analysis between non-survivors and survivors were selected. The least absolute shrinkage and selection operator (LASSO) regression analysis was used to develop a radiomics score (Rad-score). A nomogram was built from the multivariate regression analysis to visualize the predictive model for 3-month, 6-month, and 1-year survival, and the performance and usefulness of the model were evaluated by calibration curves, receiver operating characteristic (ROC) curves, and decision curves. Results: The Rad-score was able to predict 6-month survival in adult HLH, with areas under the ROC curve (AUCs) of 0.927 (95% CI: 0.878–0.974) and 0.869 (95% CI: 0.697–1.000) in the training and validation cohorts, respectively. The radiomics nomogram combining the Rad-score with the clinical parameters performed better for predicting 6-month survival than the clinical model or the Rad-score alone. Moreover, the nomogram displayed superior discrimination, calibration, and clinical usefulness in both cohorts. Conclusion: The newly developed Rad-score is a powerful predictor of overall survival (OS) in adults with HLH. The nomogram has great potential for predicting 3-month, 6-month, and 1-year survival, which may help guide timely, personalized treatment for adult HLH.


INTRODUCTION
Hemophagocytic lymphohistiocytosis (HLH) is a syndrome of severe immune activation and dysregulation characterized by hyperactive cytotoxic T lymphocytes, natural killer (NK) cells, and macrophages, leading to cytokine storm and immune-mediated multiple organ failure (1,2). Historically, HLH has been classified as primary or familial HLH, driven by underlying genetic defects in cytotoxic immune function, or as secondary or reactive HLH, caused by infections [e.g., Epstein-Barr virus (EBV), cytomegalovirus (CMV), HIV, and coronavirus disease 2019 (COVID-19)], malignancies (e.g., hematologic malignancies), and autoimmune diseases (e.g., macrophage activation syndrome) (1). Emerging evidence has demonstrated that HLH may occur in patients of any age and is most often driven by a combination of genetic defects and acquired exposures (3,4). Primary HLH occurs in 1/50,000-1/100,000 live-born children, while secondary HLH occurs in older children and adults (1). The precise incidence of adult HLH is still unknown, but it accounts for ∼40% of all HLH (1,5). The frequent manifestations are intermittent fever, hepatosplenomegaly, lymphadenopathy, liver injury, cytopenia, hypertriglyceridemia, hyperferritinemia, and hemophagocytosis (1). Because data are scarce and prospective studies are lacking for adult HLH, pediatric data are often generalized to guide diagnostic, therapeutic, and prognostic decision-making in adults (1,6). In general, adults have poorer outcomes than children even with aggressive therapy, with a median survival of 4 months (1). The principal causes of mortality are multiorgan failure, hemorrhage, and sepsis, which can be treated properly if diagnosed early (7). Therefore, identifying poor prognosis in adult HLH is crucial for risk stratification and therapeutic decision-making.
Recently, several clinical and laboratory markers have been reported to correlate with survival in adult HLH, including age, platelet count, fibrinogen, albumin, serum ferritin, alanine aminotransferase (ALT), and malignancy (1,2,8,9). However, none of them is an effective prognostic factor on its own because of poor sensitivity and/or specificity. Hence, it would be of great utility to build a predictive model that precisely evaluates the prognosis in adult HLH based on multiple indicators. 18F-fluorodeoxyglucose (18F-FDG) PET/CT has been employed for detecting underlying malignancy and predicting the prognosis of adult HLH (10-12). One of the most common PET/CT findings is hepatosplenomegaly with diffusely increased FDG uptake, which contains a great deal of information reflecting disease status in adult HLH (13,14). Radiomics converts medical images into quantitative data by high-throughput computing and subsequently analyzes these data for prognosis prediction. PET/CT radiomic features have been explored to predict outcome in malignancies such as lymphoma and lung cancer (15-17). It has been suggested that the quantitative PET parameters of the spleen are independent prognostic factors (11,12), but whether PET/CT radiomic features extracted from the liver and spleen can be applied for outcome prognostication in adult HLH remains unclear. Therefore, the first aim of this study was to establish a PET/CT radiomics score (Rad-score) for predicting 6-month survival in adult HLH, and the second aim was to combine the Rad-score with clinical parameters in order to develop a nomogram for predicting individual prognosis accurately and reliably.

Patients
This retrospective study was approved by the Institutional Review Board of Beijing Friendship Hospital of Capital Medical University, and the requirement for written informed consent was waived. The medical records of 185 consecutive adult patients (age ≥ 18 years) diagnosed with HLH between August 2016 and June 2020 were reviewed. The diagnostic criteria of HLH were in accordance with the HLH-2004 protocol, which requires five of the following eight criteria: (1) fever; (2) splenomegaly; (3) cytopenia affecting ≥ 2 lineages (hemoglobin (HGB) < 9 g/dl, platelets < 100 × 10^9/L, neutrophils < 1.0 × 10^9/L); (4) serum triglyceride ≥ 265 mg/dl and/or fibrinogen ≤ 150 mg/dl; (5) hemophagocytosis in bone marrow, spleen, lymph nodes, or liver; (6) low or absent NK cell activity; (7) ferritin ≥ 500 µg/l; and (8) soluble interleukin-2 receptor (soluble CD25) ≥ 2,400 U/ml (18). Patients were excluded if they had received chemotherapy before the 18F-FDG PET/CT scan (n = 114) or had incomplete follow-up (n = 1). Consequently, a total of 70 patients were included in this study. All the patients received personalized treatments in the Department of Hematology and were followed up for at least 180 days, with a median of 353 days. These cases were randomly divided into the training (n = 50) and validation (n = 20) cohorts at a ratio of 5:2.
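The five-of-eight rule above can be written down as a short counting function. The following is only a minimal sketch: the class and field names are hypothetical, the thresholds are copied from the criteria listed above, and the handling of missing measurements (not addressed here) is left out.

```python
from dataclasses import dataclass


@dataclass
class Hlh2004Findings:
    """Hypothetical container; field names are illustrative only."""
    fever: bool
    splenomegaly: bool
    hemoglobin_g_dl: float
    platelets_e9_l: float
    neutrophils_e9_l: float
    triglyceride_mg_dl: float
    fibrinogen_mg_dl: float
    hemophagocytosis: bool
    low_nk_activity: bool
    ferritin_ug_l: float
    scd25_u_ml: float


def count_criteria(f: Hlh2004Findings) -> int:
    """Count how many of the eight HLH-2004 criteria are fulfilled."""
    # Cytopenia counts only when at least two of the three lineages are affected.
    affected_lineages = sum([
        f.hemoglobin_g_dl < 9.0,
        f.platelets_e9_l < 100.0,
        f.neutrophils_e9_l < 1.0,
    ])
    criteria = [
        f.fever,
        f.splenomegaly,
        affected_lineages >= 2,
        f.triglyceride_mg_dl >= 265.0 or f.fibrinogen_mg_dl <= 150.0,
        f.hemophagocytosis,
        f.low_nk_activity,
        f.ferritin_ug_l >= 500.0,
        f.scd25_u_ml >= 2400.0,
    ]
    return sum(criteria)


def meets_hlh2004(f: Hlh2004Findings) -> bool:
    """The diagnosis requires at least five of the eight criteria."""
    return count_criteria(f) >= 5


# Made-up example values, not patient data from this study.
example = Hlh2004Findings(
    fever=True, splenomegaly=True,
    hemoglobin_g_dl=8.2, platelets_e9_l=60.0, neutrophils_e9_l=1.5,
    triglyceride_mg_dl=300.0, fibrinogen_mg_dl=200.0,
    hemophagocytosis=False, low_nk_activity=False,
    ferritin_ug_l=1200.0, scd25_u_ml=1000.0,
)
```

In this made-up example, fever, splenomegaly, bilineage cytopenia, hypertriglyceridemia, and hyperferritinemia are present, so exactly five criteria are met.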

Clinical Data Collection
Clinical parameters including age, gender, malignancy, EBV infection, hemophagocytosis, and laboratory variables [white blood cell, absolute neutrophil, hemoglobin, platelet, C-reactive protein (CRP), ALT, aspartate aminotransferase (AST), triglycerides, serum ferritin, fibrinogen, erythrocyte sedimentation rate, and lactate dehydrogenase] were obtained from medical records (Table 1). All the laboratory and radiological data were collected before initial HLH-specific therapy. The most likely trigger of secondary HLH (malignancy, infection, autoimmune, or idiopathic) was determined by the treating physician based on clinical assessment and medical evidence.

18F-Fluorodeoxyglucose PET/CT Imaging Acquisition, Segmentation, and Feature Extraction
18F-fluorodeoxyglucose PET/CT was performed on a Siemens Biograph mCT PET/CT scanner (Siemens Healthineers, Erlangen, Germany). Patients were instructed to fast for at least 6 h and were required to have a blood glucose level <11.1 mmol/l. Then, 18F-FDG (4.4 MBq/kg) was injected intravenously. After a 60-min uptake time, a low-dose CT scan was acquired for visualization of anatomic structures and attenuation correction, with 140 kV, automatic mAs, and a slice thickness of 3 mm. The whole-body PET scan was carried out at 2.5 min per bed position in three-dimensional (3D) mode immediately after the whole-body CT scan. Images were reconstructed with an iterative reconstruction algorithm. The entire liver and spleen on CT images were defined as the regions of interest (ROIs), which were delineated by two experienced nuclear radiologists with a validated semi-automatic approach using 3D Slicer software (version 4.10.0, http://www.slicer.org; Boston, Massachusetts, United States) (Figure 1) (19). The ROIs were then resampled with B-spline interpolation so that they could be mapped onto the PET images; as a result, the ROIs had the same pixel spacing as the PET images.

Radiomic Feature Selection and the Rad-Score Construction
Our workflow is shown in Figure 1. First, univariate analysis (t-test for normally distributed variables or the Mann-Whitney U test for variables with skewed distributions) was used to compare radiomic features between non-survivors and survivors at 180 days in the training set. A total of 384 features with p-values < 0.1 were retained for further analysis. Next, the least absolute shrinkage and selection operator (LASSO) algorithm, which adds an L1 regularization term to a least-squares objective for dimensionality reduction, was applied to select the optimal features among these 384 features in the training set. Because the dataset was imbalanced, the synthetic minority oversampling technique (SMOTE), an improvement over random oversampling, was applied to the training set. An individualized Rad-score was calculated as a linear combination of the selected features weighted by their respective coefficients. The receiver operating characteristic (ROC) curve was employed to evaluate the prediction accuracy and determine the optimal threshold of the Rad-score. All the patients were divided into high- and low-risk groups according to the maximum Youden index of the ROC curve. The association of the Rad-score with overall survival (OS) was evaluated by the Kaplan-Meier survival analysis and the log-rank test in the training and validation cohorts.
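The last two steps of this workflow, scoring a patient and choosing a threshold, can be illustrated in a few lines. This is a sketch under stated assumptions, not the study's code: the function names and the convention that a score at or above the cutoff counts as high-risk are assumptions, and the coefficients would come from the fitted LASSO model (Table 2 lists the selected features).

```python
from typing import Dict, Sequence, Tuple


def rad_score(features: Dict[str, float],
              coefs: Dict[str, float],
              intercept: float = 0.0) -> float:
    """Linear combination of the selected features weighted by their coefficients."""
    return intercept + sum(coefs[name] * features[name] for name in coefs)


def youden_cutoff(scores: Sequence[float],
                  died: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumption: a patient is classified as high-risk when score >= cutoff,
    and died[i] is 1 for non-survivors and 0 for survivors.
    """
    n_pos = sum(died)
    n_neg = len(died) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")
    best_cut, best_j = scores[0], -1.0
    # Every observed score is a candidate threshold on the ROC curve.
    for cut in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d == 1 and s >= cut)
        tn = sum(1 for s, d in zip(scores, died) if d == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Made-up scores and outcomes for illustration only.
toy_scores = [-1.5, -0.9, -0.4, 0.2, 1.1, 1.8]
toy_died = [0, 0, 0, 1, 0, 1]
toy_cutoff, toy_j = youden_cutoff(toy_scores, toy_died)
```

On the toy data the best threshold is 0.2, where both non-survivors and three of the four survivors are classified correctly (J = 0.75).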

Clinical Variables Selection and Nomogram Creation
To build a powerful model and a robust nomogram for survival prediction, the clinical prognostic factors were chosen by univariate Cox regression analyses (p < 0.05). Then, the Rad-score and the strong clinical indicators were incorporated into a multivariate Cox regression model, which was visualized as a nomogram. Harrell's concordance index (C-index) was employed to assess the model performance, and calibration curves were plotted to assess the predictive precision of the nomogram. Similarly, a clinical model was established with clinical information alone by the multivariate Cox regression analysis. Three types of predictive models (clinical variables, the Rad-score, and their combination) were evaluated by the C-index. Decision curve analysis (DCA) was utilized to assess the clinical usefulness of the models.
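For readers unfamiliar with the C-index, the sketch below shows how it is computed from follow-up times, event indicators, and model risk scores. It is a simplified pure-Python illustration (pairs with tied times are ignored, and the function name is hypothetical), not the routine used in this study.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risk: Sequence[float]) -> float:
    """Fraction of comparable patient pairs ranked correctly by the risk score.

    A pair (i, j) is comparable when patient i died (events[i] == 1) strictly
    before the observed time of patient j. It is concordant when the patient
    who died earlier has the higher predicted risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up follow-up data: times in days, 1 = death, 0 = censored.
toy_c = harrell_c_index(
    times=[30, 90, 200, 365],
    events=[1, 1, 0, 0],
    risk=[2.0, 0.5, 1.0, -1.0],
)
```

Here five pairs are comparable and four are ranked correctly, giving a C-index of 0.8; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.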

Statistical Analyses
Continuous variables are presented as medians with interquartile ranges, and categorical variables are presented as frequencies and percentages. OS was defined as the time from the initial diagnosis of HLH to the date of death from any cause or the last follow-up. All the p-values were two-sided, with a significance level of <0.05. Statistical analyses were performed with Python (version 3.7.8, www.python.org) and R (version 4.0.3, www.r-project.org). The Python packages "sklearn," "numpy," and "pandas" were used for the LASSO binary logistic regression and the ROC curve; "scipy" was used for the statistical tests; and "imblearn" was used to apply SMOTE. The R package "rms" was employed to create nomograms.
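The OS definition above implies right-censoring: patients alive at the last follow-up contribute time but no event. A minimal sketch of how such data feed a Kaplan-Meier estimate is given below; the helper names are hypothetical, and the study itself relied on the packages listed above rather than on this code.

```python
from datetime import date
from typing import List, Optional, Sequence, Tuple


def os_days(diagnosis: date,
            death: Optional[date],
            last_follow_up: date) -> Tuple[int, int]:
    """Return (OS in days, event flag); event is 1 for death, 0 for censoring."""
    if death is not None:
        return (death - diagnosis).days, 1
    return (last_follow_up - diagnosis).days, 0


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one death."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= removed
    return curve


# Made-up follow-up data (days); 1 = death, 0 = censored.
toy_curve = kaplan_meier([20, 45, 45, 100, 180, 353], [1, 1, 0, 1, 0, 0])
```

With these made-up data the survival estimate steps down to 5/6, 2/3, and 4/9 at days 20, 45, and 100, and the two patients censored later leave the curve unchanged.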

Baseline Clinical Characteristics of Patients
A total of 70 adults with HLH who fulfilled the inclusion criteria were included in this study. There were 36 males and 34 females, and the median age at diagnosis was 38 years (range: 18-79 years). The baseline characteristics of all the patients are shown in Table 1.
The baseline characteristics in the training and validation cohorts are also given in Table 1. The clinical variables did not differ significantly between the two cohorts (p > 0.05). After a
The median and the interquartile range for the selected radiomic features in the training cohort are shown in Table 2. The Rad-scores in the training and validation cohorts are shown in Table 3. The Rad-score differed significantly between non-survivors and survivors in the training (p < 0.001) and validation cohorts (p = 0.011). In particular, non-survivors had a higher Rad-score than survivors in the training (Rad-score = 1.6386 vs. −1.0608) and validation cohorts (Rad-score = 1.3763 vs. −1.3595). The Rad-score for individuals in both cohorts is shown in Figures 3A,B. In addition, the Rad-score had good predictive power for survival at 180 days, and its areas under the ROC curve (AUCs) for distinguishing high-risk status were 0.927 (95% CI: 0.879-0.975) in the training set and 0.869 (95% CI: 0.684-1.000) in the validation set (Figure 3C). The best cutoff with the maximum Youden index was −0.3; patients in both cohorts were therefore divided into high- and low-risk groups according to the Rad-score. The Kaplan-Meier curves and the log-rank test showed that patients in the low-risk category had a better prognosis than those in the high-risk category in the training and validation cohorts (p < 0.05) (Figures 3D,E).
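To make the two quantities reported here concrete, the sketch below computes an AUC as the probability that a non-survivor receives a higher score than a survivor, and assigns risk groups with the −0.3 cutoff. The function names are hypothetical, and treating a score exactly equal to the cutoff as high-risk is an assumption that the text does not specify.

```python
from typing import Sequence


def auc_rank(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (non-survivor, survivor) pairs ranked correctly.

    labels[i] is 1 for non-survivors and 0 for survivors; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def risk_group(score: float, cutoff: float = -0.3) -> str:
    """Assumption: scores at or above the cutoff are labeled high-risk."""
    return "high" if score >= cutoff else "low"


# Made-up scores and outcomes for illustration only.
toy_auc = auc_rank([-1.5, -0.9, -0.4, 0.2, 1.1, 1.8], [0, 0, 0, 1, 0, 1])
```

On the toy data, seven of the eight non-survivor/survivor pairs are ordered correctly, so the AUC is 0.875.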

Strong Predictor Selection and Model Establishment and Assessment
The univariate Cox regression analysis showed that six parameters were significantly associated with OS: the Rad-score, T-cell neoplasms, white blood cell, hemoglobin, platelet count, and CRP (p < 0.05; Table 4). The multivariate analysis showed that the Rad-score, white blood cell, and CRP remained strong predictors (Table 4), and these were used to build the combined model. When the Rad-score was excluded, three variables (T-cell neoplasms, hemoglobin, and platelet count) were independent prognostic factors among the clinical parameters (Table 5). Likewise, these three prognostic factors were used to build the clinical model. To assess the performance of the models in predicting prognosis, the C-indices of the three types of models are shown in Table 6.

Personalized Nomogram Establishment and Validation
Given that the combined model possessed synergetic power for survival prediction, the personalized nomogram was constructed by incorporating all three independent prognostic factors (the Rad-score, white blood cell, and CRP) (Figure 5A); the nomogram visualizes the predicted outcome and the contribution of each factor. The calibration curves demonstrated good agreement between the predicted and observed values in the training and validation cohorts, indicating that the nomogram was able to precisely predict 6-month survival (Figures 5B,C).

DISCUSSION
Timely diagnosis and prognostication are critical for HLH, considering that the early and proper administration of an efficacious therapy can improve survival. In this study, novel prognostic factors and predictive models associated with 6-month survival in adult patients with HLH are reported on the basis of pretherapeutic 18F-FDG PET/CT radiomics analysis. The Rad-score and the combined prediction model (the Rad-score combined with clinical variables) were developed in 70 patients for quantitative identification of adults with HLH at high risk of death within 6 months. 18F-fluorodeoxyglucose PET/CT is a whole-body scan containing both metabolic and anatomical information, and it has been recommended for identifying possible triggers and suitable biopsy sites in secondary HLH (6). However, 18F-FDG PET/CT findings are non-specific, since inflammatory responses and malignant lesions share the same manifestation, namely hypermetabolism. In HLH, 18F-FDG PET/CT often shows diffusely increased FDG uptake in the spleen, liver, and bone marrow, with or without focal lesions and hypermetabolic lymph nodes. Increasing evidence has demonstrated that these non-specific presentations can be used to assess the systemic inflammatory response and have potential for prognosis prediction in HLH (21). For instance, the FDG uptake of the spleen and bone marrow has been considered a prognostic factor in adult patients with HLH (11,12,22). More importantly, the spleen and liver, components of the reticuloendothelial system, are the organs most frequently showing abnormalities in HLH (23). Our data showed that six radiomic features from the spleen and liver were linked with the prognosis of adult HLH, and these features were used to establish the Rad-score. Among the six radiomic features, two-thirds (4/6) were derived from the spleen: the GLSZM size zone non-uniformity normalized feature and kurtosis of spleen PET, and the GLSZM gray level non-uniformity normalized feature and NGTDM contrast of spleen CT.
It is well known that the spleen is the largest secondary lymphoid organ and a site where immune responses can be controlled by activated immune cells. Splenomegaly is one of the diagnostic criteria of HLH, and the hypermetabolic spleen has been found to correlate with a high inflammatory response and cytokine activity (24,25). One recent report suggested that spleen FDG uptake may provide useful information for predicting in-hospital mortality in autoimmune diseases including HLH (26). Another study of 43 patients with secondary HLH found that the spleen-to-mediastinum ratio of the average standardized uptake value (SUV) was an independent predictor of survival (12). Consistent with these reports, our findings indicated that radiomic features of the spleen possessed a powerful predictive ability for 6-month survival in adult patients with HLH. The remaining two radiomic features were extracted from the liver: GLCM IMC2 (Informational Measure of Correlation) of liver CT and GLSZM small area emphasis of liver PET. Radiomics has shown great value in the characterization of diffuse liver diseases such as non-alcoholic steatohepatitis and chronic hepatitis B (27). In addition, the well-known liver enzymes AST and ALT are indicators of various diseases including HLH. High ALT and a high AST/ALT ratio have been found to act as adverse prognostic factors in adult HLH (8,28). Furthermore, hepatic involvement and hepatomegaly are poor prognostic indicators and predictors of early death in HLH (8,28). In line with these studies, our results illustrated that the radiomic features of the spleen and liver had great prognostic value for adult HLH.
Radiomics is the high-throughput extraction of quantitative information from medical images, including subtle manifestations that are difficult to recognize or quantify by the human eye. Compared with traditional PET/CT metrics, radiomic features may reflect the pathological process in the spleen and liver of patients with HLH more fully. In this study, the majority of the selected radiomic features (5/6) were derived from wavelet decomposition images, which emphasize image details. It is very likely that wavelet decomposition images contain inconspicuous prognostic information (29,30). GLSZM quantifies the number of groups of interconnected neighboring voxels with the same gray level intensity. NGTDM represents contrast, quantifying the difference between the gray level of a voxel and the average gray level of its neighbors within a given distance. GLCM captures the spatial relationships of pairs of voxels, while kurtosis is a first-order feature expressing the peakedness of the distribution of values in the ROI. All the selected features describe the texture of the spleen and liver quantitatively, reflecting uniformity or heterogeneity in both organs (31). Previous studies found that intratumor heterogeneity was associated with poor outcome in various malignancies (15,17,30,32,33). HLH is a heterogeneous disease with various etiological components and complex underlying genetic variant types (3). Each possible etiology has distinct clinical characteristics and prognosis. Even in lymphoma-associated HLH, which has the worst prognosis, the treatment response is diverse (34). HLH can occur in EBV-associated T-/NK-cell lymphoproliferative disorders, which form a spectrum of disease from infection to malignancy. The histological features and immunophenotype are markedly heterogeneous.
As in children, multiple gene mutations are linked to the development of HLH in adults, especially in EBV-driven lymphoma, which requires hematopoietic cell transplant (6,35,36). On 18F-FDG PET/CT, the liver and spleen varied in size, density, and metabolism, reflecting the heterogeneity of HLH, and radiomics quantified their spatial complexity. This study suggested that the heterogeneity of the spleen and liver may reflect overproliferation of immune cells accompanied by inflammatory infiltration triggered by EBV infection (37-40). On the other hand, the heterogeneous distribution of metabolism or density may also suggest the involvement of tumor cells (41-43). Both malignancy and EBV infection appear to be linked to an inferior prognosis in adult HLH (9,34,44,45). Additionally, the radiomic features may be associated with genetic signatures (3,46); however, the underlying biological significance of these radiomic features has not been fully studied, and the relationships among radiomic features, genetic signatures, and prognosis need further exploration in HLH.
Recent studies pointed out that a number of clinical parameters play a role in the prognosis of HLH, such as lymphoid malignancy, hemoglobin, platelets, CRP, and cytopenia (8,9,50,51). It is well documented that lymphoid malignancy is negatively associated with survival. T-cell lymphoma is generally acknowledged to carry poorer survival than B-cell lymphoma because of its poor response to chemotherapy (47). In a large-scale Japanese study, the 5-year OS was the worst in T-/NK-cell lymphoma-associated HLH compared with other types of HLH, including primary HLH, B-cell lymphoma-associated HLH, and infection-associated HLH (48). Lower hemoglobin and platelet counts have been reported to be the more consistent negative prognostic biomarkers in HLH (8,49). Consistent with these reports, all three of these clinical parameters were retained in our clinical prediction model. However, the two clinical variables incorporated in the nomogram were white blood cell and CRP. Cytopenia is one of the major presentations of HLH. Serious cytopenia may mark the severity of a cytokine storm and lead to hemorrhage and sepsis, suggesting that it is an adverse prognostic factor (7). CRP is the prototypical acute-phase serum protein, increasing rapidly during inflammation (50). It has been highlighted that CRP is markedly elevated in patients with secondary HLH compared with those with primary HLH. High CRP levels have been correlated with an increased risk of infection and overall mortality in HLH and have been suggested as indices of disease severity (51). CRP probably also serves as a predictor of 18F-FDG PET/CT effectiveness, because the diagnostic accuracy of PET/CT is positively linked with CRP > 60 mg/l in HLH (12).
Interestingly, T-cell neoplasms were not retained in the predictive model when the Rad-score was incorporated. A possible explanation was that the Rad-score contained partial pathological information. The inclusion of the Rad-score not only improved the prognostic performance, but also simplified the prediction model. DCA demonstrated that the nomogram with the Rad-score and two clinical parameters was superior to the clinical model in terms of clinical application. Overall, the nomogram was successfully built to predict 3-month, 6-month, and 1-year survival of adults with HLH and the accuracy and clinical applicability of the model were verified through C-index, calibration curve, and DCA.
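The comparison made by DCA rests on the net benefit of a model at a chosen risk threshold. The following is a minimal sketch of that standard quantity, with hypothetical function names and made-up predictions; it does not reproduce the curves of this study.

```python
from typing import Sequence


def net_benefit(probs: Sequence[float],
                labels: Sequence[int],
                threshold: float) -> float:
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt.

    probs are predicted probabilities of death; labels are 1 for death, 0 otherwise.
    Patients with a predicted probability at or above pt are treated as high-risk.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels: Sequence[int], threshold: float) -> float:
    """Reference strategy in which every patient is treated as high-risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Made-up predictions and outcomes for illustration only.
toy_probs = [0.9, 0.7, 0.4, 0.2]
toy_labels = [1, 0, 1, 0]
toy_nb = net_benefit(toy_probs, toy_labels, threshold=0.2)
```

A decision curve plots this value over a range of thresholds for each model, alongside the treat-all and treat-none (net benefit of zero) strategies; a model is considered more useful where its curve lies above the alternatives.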
This study has several limitations. First, eligible patients may have been missed in a single-center study, and selection bias may have occurred because of the retrospective design. Second, the heterogeneity of the patients and treatments may have affected our results. Third, gene, transcript, and protein signatures are becoming increasingly important for the prognosis of adult HLH (3), but these data were not collected. Finally, the Rad-score was calculated using ROIs that were manually delineated in 3D Slicer. This process was time-consuming and inconvenient for clinical practice, so automatic or semi-automatic image segmentation will be needed. A multicenter, prospective study with a larger cohort will be required to validate our findings in the future.

CONCLUSION
This preliminary study indicated that the pretherapeutic 18F-FDG PET/CT radiomic features of the spleen and liver are independent prognostic factors in adult HLH, with the heterogeneity of the spleen and liver associated with inferior prognosis. Integrating radiomic features with clinical parameters shows synergetic power for 6-month survival prediction compared with models based on radiomic features or clinical parameters alone. The nomogram has great potential for predicting individualized 3-month, 6-month, and 1-year survival, which may help guide timely, personalized treatment for adult HLH.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Institutional Review Board of Beijing Friendship Hospital of Capital Medical University. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements. No potentially identifiable human images or data are presented in the manuscript.

AUTHOR CONTRIBUTIONS
JY, YK, and HZ contributed to the study design, decision-making, and coordination of the study. XY, JLiu, XL, WW, and SZ contributed to the management of registration of cases and collected PET/CT image data. XY, JLiu, XL, WW, and YK contributed to the image quality control, analysis, and data interpretation. LL and HZ contributed to the statistical analysis. XY, JLi, and JY contributed to drafting and revising the manuscript. All the authors read, revised, and approved the final version of the manuscript.