Prediction of MYCN Amplification, 1p and 11q Aberrations in Pediatric Neuroblastoma via Pre-therapy 18F-FDG PET/CT Radiomics

Purpose This study aimed to assess the predictive ability of 18F-FDG PET/CT radiomic features for MYCN, 1p and 11q abnormalities in neuroblastoma (NB). Method One hundred and twenty-two pediatric patients (median age 3.2 years; range, 0.2–9.8 years) with NB were retrospectively enrolled. Significant features by multivariable logistic regression were retained to establish a clinical model (C_model), which included clinical characteristics. 18F-FDG PET/CT radiomic features were extracted by Computational Environment for Radiological Research. The least absolute shrinkage and selection operator (LASSO) regression was used to select radiomic features and build models (R_model). The predictive performance of models constructed from clinical characteristics (C_model), the radiomic signature (R_model), and their combination (CR_model) was compared using receiver operating characteristic (ROC) curves. Nomograms based on the radiomic score (rad-score) and clinical parameters were developed. Results The patients were divided into a training set (n = 86) and a test set (n = 36). Accordingly, 6, 8, and 7 radiomic features were selected to establish R_models for predicting MYCN, 1p and 11q status, respectively. The R_models showed strong power for identifying these aberrations, with areas under the ROC curve (AUCs) of 0.96, 0.89, and 0.89 in the training set and 0.92, 0.85, and 0.84 in the test set. When clinical characteristics and the radiomic signature were combined, the AUCs increased to 0.98, 0.91, and 0.93 in the training set and 0.96, 0.88, and 0.89 in the test set. The CR_models had the best performance for MYCN, 1p and 11q predictions (P < 0.05). Conclusions Pre-therapy 18F-FDG PET/CT radiomics is able to predict MYCN amplification and 1p and 11q aberrations in pediatric NB, thus aiding tumor staging, risk stratification and disease management in clinical practice.


INTRODUCTION
Neuroblastoma (NB), the most common extracranial solid pediatric tumor, accounts for about 8-10% of all childhood cancers and 12-15% of childhood cancer mortality (1). Using selected clinical, pathologic, and genetic factors, patients diagnosed with NB can be classified into different risk groups for treatment (2). Previous studies have shown that patient outcomes in NB are highly correlated with risk stratification, with a cure rate of more than 90% in non-high-risk patients and an event-free survival rate of <50% in high-risk patients (3). It is therefore very important to obtain a better understanding of risk factors so that treatment strategies for children with NB can be tailored accordingly. Previous studies have demonstrated the value of prognostic factors such as patient age, tumor stage using the International Neuroblastoma Staging System (INSS), tumor histopathology using the International Neuroblastoma Pathology Classification (INPC) system, DNA ploidy, and cytogenetics such as MYCN amplification status and chromosome aberrations of 1p and 11q (1,4,5). In addition, CT or MR image-defined risk factors (IDRFs) have been used to distinguish low-risk tumors from high-risk tumors (6,7). However, the predictive value of nuclear medicine functional imaging techniques for tumor biology has been less studied.
Nuclear medicine functional imaging plays an important role in the assessment of NB. Currently, 123I-metaiodobenzylguanidine (123I-MIBG) scintigraphy is standard practice in the diagnosis of NB (6), with ∼90% of patients having MIBG-avid tumors. However, in some countries, including China, 123I-MIBG has not been approved for clinical use and cannot be included in the standard clinical protocols for NB patients. In our practice, we have utilized 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) in the diagnosis and follow-up of NB patients. 18F-FDG PET imaging has been reported to be equal or superior to 123I-MIBG scan for delineating NB disease extent in the chest, abdomen, and pelvis (8). In case the tumor is not MIBG avid, 18F-FDG PET is also recommended as a complementary option to 123I-MIBG scintigraphy (9).
This study aims to evaluate whether diagnostic 18F-FDG PET/CT imaging plays a role in risk stratification prediction in children with NB. The relationship between diagnostic 18F-FDG PET/CT image features and the tumor biology of NB was investigated to answer this question. Specifically, the cytogenetic factors MYCN amplification status and chromosome aberrations of 1p and 11q were chosen as representative indicators of tumor biology. It is well documented that MYCN amplification and chromosome aberrations of 1p and 11q are powerful prognostic markers and have a strong association with worse outcome in NB (5). Amplification of MYCN can be detected in 20% of NB cases and is closely linked with high-risk disease and poorer outcome (10). Loss of heterozygosity on chromosomes 1p and 11q is correlated with increased disease severity (2,11). For the PET/CT image analysis method, radiomic analysis was chosen in this study. In contrast to conventional visual image features, radiomics is expected to provide a more comprehensive description of tissues, with the potential to aid clinical care in several aspects including diagnosis, prognosis and treatment selection (12,13). A number of studies have demonstrated the value of 18F-FDG PET/CT-based radiomics in predicting the histological subtypes of lung cancer (14) and distinguishing breast carcinoma from breast lymphoma (15). So far, few studies have investigated the predictive value of 18F-FDG PET/CT for the status of MYCN, 1p and 11q in pediatric NB. Therefore, this study was designed to evaluate whether 18F-FDG PET/CT-based radiomics can predict the status of MYCN, 1p and 11q, which, in turn, can be used in risk stratification prediction in children with NB.

Patients
The records of 139 pediatric patients with newly diagnosed NB were reviewed retrospectively between March 2018 and November 2019 in our hospital. The inclusion criteria were as follows: (1) pathologically confirmed NB; (2) age ≤ 18 years at diagnosis; (3) complete PET/CT imaging data; (4) complete clinical information; (5) no cancer therapy before PET/CT imaging; (6) complete MYCN amplification and 1p and 11q aberration data. Subsequently, 17 cases were excluded because MYCN, 1p and 11q information was unavailable, and 122 patients were included in this study. These patients were randomly divided into a training set and a test set at a ratio of 7:3. This retrospective study was approved by the Institutional Review Board of our hospital, and the requirement for written informed consent was waived.
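The random 7:3 division described above can be illustrated with a minimal sketch. The set sizes (86 and 36) are taken from the Results; the function name `split_patients`, the seed, and the use of consecutive integers as patient identifiers are hypothetical and do not reflect the authors' actual procedure.

```python
import random


def split_patients(patient_ids, n_train, seed):
    """Shuffle patient identifiers and cut them into training and test sets."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers 1..122; 86 training cases as reported in the Results.
train_ids, test_ids = split_patients(range(1, 123), n_train=86, seed=0)
```

Fixing the training-set size explicitly avoids rounding ambiguity, since 70% of 122 is not a whole number.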

Determination of MYCN Amplification and 1p and 11q Aberrations by FISH
MYCN amplification and 1p and 11q aberrations were determined using fluorescence in situ hybridization (FISH) on paraffin-embedded tissue obtained by biopsy or surgery at initial diagnosis, according to a previously published method (16). According to the recommendations of the European Neuroblastoma Quality Assessment group (17,18), MYCN amplification was defined as a more than four-fold increase of signals.
All patients underwent a whole-body scan on a PET/CT scanner (Biograph mCT-64 PET/CT; Siemens, Knoxville, Tenn) in accordance with EANM guidelines (19,20), and a biopsy or surgery for pathological diagnosis of NB was performed within 3 months. The PET scan was carried out with 3 min per bed position immediately after the whole-body CT scan. PET images were reconstructed using the ordered subsets-expectation maximization algorithm with time-of-flight. The regions-of-interest (ROIs) of the primary tumor were manually drawn by an experienced nuclear medicine physician using the longitudinal PET/CT module in 3D Slicer (version 4.10.1). ROIs were delineated along the edge of the NB on CT images and included the entire tumor, metastatic lesions and unclear demarcations between the primary tumor and its surrounding metastasis. To map the ROIs to the PET image, they were resampled using B-spline interpolation so that they had the same pixel spacing as the PET image.

Feature Extraction and Selection and Model Construction
Univariate analysis was performed to compare the differences in clinical characteristics. Based on the selected characteristics, a clinical model (C_model) was established.
Radiomic features from CT and PET images were computed separately using pyradiomics, an open-source Python package for the extraction of radiomic features from medical imaging (21). First order features (n = 18), shape features (n = 14), gray level co-occurrence matrix (GLCM) features (n = 24), gray level run length matrix (GLRLM) features (n = 16), gray level size zone matrix (GLSZM) features (n = 16), neighboring gray tone difference matrix (NGTDM) features (n = 5), and gray level dependence matrix (GLDM) features (n = 14) were extracted from the original and the pre-processed images. The following methods were used for image pre-processing: wavelet filtering, square, square root, logarithm, exponential and gradient filtering (Figure 1).
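As a plausibility check, the per-class feature counts listed above can be reconciled with the total of 2,632 features reported in the Results. The sketch below rests on two assumptions that are not stated in the text: the wavelet filter yields 8 sub-bands (consistent with the three-letter sub-band names such as LLH and HHH among the selected features), and shape features are computed once per modality rather than once per filtered image.

```python
# Feature counts per class, as listed in the Methods.
per_class = {
    "firstorder": 18, "shape": 14, "glcm": 24, "glrlm": 16,
    "glszm": 16, "ngtdm": 5, "gldm": 14,
}

# Assumption: original image + 8 wavelet sub-bands + 5 other filters
# (square, square root, logarithm, exponential, gradient).
n_image_types = 1 + 8 + 5

# Assumption: shape features depend only on the ROI, so they are counted once.
non_shape = sum(n for name, n in per_class.items() if name != "shape")
per_modality = non_shape * n_image_types + per_class["shape"]

# Two modalities: PET and CT.
total_features = 2 * per_modality
print(non_shape, per_modality, total_features)
```

Under these assumptions the arithmetic is 93 × 14 + 14 = 1,316 features per modality and 2,632 for PET and CT together, which matches the reported total.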
Intraclass correlation coefficients (ICCs) were calculated to assess the reliability of the features, using two sets of ROIs delineated independently by two nuclear medicine physicians in 24 of the 122 patients with NB after 2 months. Because the datasets were imbalanced, the synthetic minority oversampling technique (SMOTE), an improvement on random oversampling, was applied to the training set. The least absolute shrinkage and selection operator (LASSO) was applied for variable selection and regularization in the training set. Predictive R_models were built by logistic regression, and the radiomic score (rad-score) for each patient was computed based on the selected radiomic features. Additionally, the selected clinical characteristics were combined with radiomic features to construct the combination model (CR_model). All models were built and trained in the training set, and the prediction performance was evaluated in the training and test sets. Ten-fold cross-validation was applied to prevent model overfitting during training. The receiver operating characteristic (ROC) curve and area under the curve (AUC) were employed to evaluate diagnostic performance in the training and test sets.
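Two of the steps above can be sketched in a few lines: the interpolation at the core of SMOTE, and a rad-score computed as a linear combination of the selected features. This is an illustrative sketch only. The feature names, coefficients and intercept are hypothetical, the neighbor is passed in directly rather than found by a nearest-neighbor search, and the linear form of the rad-score is the conventional one for LASSO logistic models rather than a formula taken from the study.

```python
import math
import random


def smote_like_sample(x, neighbor, rng):
    """Create one synthetic minority-class sample on the line segment
    between a minority sample x and one of its minority-class neighbors."""
    gap = rng.random()  # uniform in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


def rad_score(features, coefficients, intercept):
    """Intercept plus the weighted sum of the selected (non-zero) features."""
    return intercept + sum(w * features[name] for name, w in coefficients.items())


def predicted_probability(score):
    """Logistic link mapping a score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical example values, for illustration only.
rng = random.Random(0)
synthetic = smote_like_sample([0.0, 2.0], [1.0, 4.0], rng)
score = rad_score({"f1": 2.0, "f2": -1.0}, {"f1": 0.5, "f2": 1.5}, intercept=0.25)
probability = predicted_probability(score)
```

Because the synthetic sample is drawn between existing minority cases rather than duplicated, oversampling of this kind should be restricted to the training set, as described above, so that the test set contains only real patients.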

Statistical Analysis
Statistical analyses were performed with Python (ver. 3.7.8, www.python.org) and R (ver. 4.0.3, www.r-project.org). The Python packages "sklearn," "numpy," and "pandas" were used for LASSO binary logistic regression and ROC curves; "scipy" was used for statistical testing; and "imblearn" was used for SMOTE. The R package "rms" was employed to create nomograms. The t-test or Mann-Whitney U-test was applied for univariate analysis, and p < 0.05 was considered statistically significant. The area under the ROC curve (AUC) was calculated to evaluate the diagnostic performance of the models. The AUC, which ranges from 0.5 to 1.0, is commonly used as a measure of classifier performance: a value of 0.5 is equivalent to random guessing, while 1.0 indicates a perfect classifier.
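The interpretation of the AUC given above follows from its definition as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following minimal sketch computes it by direct pairwise comparison; the function name `auc_from_scores` and the example data are hypothetical, and in practice a library routine would normally be used instead.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.
    Ties between a positive and a negative score count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: 3 of the 4 positive/negative pairs are ordered correctly.
example_auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A classifier that assigns the same score to every case wins half of every tie and therefore obtains 0.5, which corresponds to the random-guessing baseline.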

Clinical Characteristics of Patients
According to the inclusion criteria, 122 out of 139 patients with NB were enrolled in this study. Eighty-six patients were assigned to the training set and 36 patients were assigned to the test set. All clinical characteristics are summarized in Table 1.

Predictive Model Construction
A total of 2,632 radiomic features were extracted from PET/CT images using pyradiomics. After assessing the robustness,  (Tables 1, 3). Eleven features were selected for 1p prediction, comprising 5 clinical characteristics (NSE, LDH, VMA, MTD in ultrasound and MTD in CT/MRI) and 6 radiomic features (5 PET and 1 CT) (Tables 1, 3). Eleven features were selected for 11q prediction, comprising 5 clinical characteristics (age, SF, LDH, VMA, and HVA) and 6 radiomic features (1 PET and 5 CT) (Tables 1, 3). The p-values of radiomic features are shown in Table 3.

Rad
Rad-scores differed significantly between the positive and negative groups in both the training and test sets (p < 0.001). NB cases positive for MYCN, 1p and 11q abnormalities had higher rad-scores than negative cases in both the training and test sets. The nomogram score (Nomo_score) was calculated by the following formula (Figure 2). The nomogram was created based on the training set and provided individualized prediction and a visualization of the contribution of each factor (Figure 3).

Model Performance
To evaluate the performance in predicting MYCN, 1p and 11q status, the C_model, R_model and CR_model were compared. The predictive abilities of the models (sensitivity, specificity, and AUC) are shown in Table 4, and ROC curves are displayed in Figure 4. The CR_models were the best predictive models for MYCN, 1p and 11q abnormalities.

DISCUSSION
Considering the well-established role of MYCN, 1p and 11q abnormalities in the prognosis of NB, identifying these events is crucial for risk stratification. This study provided three distinct forms of predictive models (clinical variables,    (22,23). In the present study, LDH and SF were also predictors of MYCN, 1p and 11q abnormalities. The radiomics models were able to predict these aberrations, but models integrating PET and CT features with clinical variables achieved higher predictive performance in the training and test cohorts than models with radiomic features or clinical parameters alone (Table 2). In line with other studies (24), the integration of radiomic features with clinical parameters has a complementary and added impact on the prediction of genetic and/or molecular abnormalities.
In this study, the following radiomic features were selected to construct the CR_models for predicting MYCN, 1p and 11q abnormalities: PET_wavelet-LLH_glszm_GrayLevelNonUniformity, PET_wavelet-HHH_glszm_SizeZoneNonUniformity, CT_exponential_glrlm_LongRunEmphasis, CT_wavelet-HHL_firstorder_Maximum, PET_squareroot_ngtdm_Contrast, PET_logarithm_firstorder_Minimum, PET_wavelet-LLH_glrlm_LongRunLowGrayLevelEmphasis, PET_wavelet-HHH_glszm_SmallAreaHighGrayLevelEmphasis, PET_wavelet-HHH_glszm_LowGrayLevelZoneEmphasis, CT_exponential_glszm_SmallAreaEmphasis, PET_wavelet-LHL_gldm_DependenceNonUniformityNormalized, CT_wavelet-LLL_glrlm_RunVariance, CT_wavelet-LHL_firstorder_Median, CT_wavelet-LHL_glcm_Imc1, CT_wavelet-HLL_glrlm_LowGrayLevelRunEmphasis, and CT_wavelet-HHH_firstorder_Entropy. The majority of these features (12/16) were derived not from the original image but from wavelet decomposition images, possibly because wavelet-transformed features contain high-order information that may be more helpful for MYCN, 1p and 11q prediction. Previous studies have revealed the potential value of wavelet features in histologic subtype prediction and prognostic assessment (25,26). In agreement with those findings, our data indicated that wavelet features contribute substantially to MYCN, 1p and 11q prediction models. In addition, approximately half of the selected features were extracted from GLRLM (4/16) and GLSZM (5/16). Long run emphasis (LRE) in GLRLM quantifies the distribution of long run lengths, with a larger value representing longer run lengths and coarser structural textures. Size-zone non-uniformity (SZN) in GLSZM quantifies the variability of size zone volumes in the image, with a smaller value representing more homogeneity in size zone volumes. Our results showed that greater values of LRE or SZN were correlated with a higher probability of MYCN amplification and 1p and 11q aberrations.
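To make the meaning of LRE concrete, the sketch below builds a run-length matrix for a small 2D image along the horizontal direction only and evaluates LRE as the sum of P(i, j)·j² divided by the number of runs, where P(i, j) counts runs of gray level i with length j. This is a simplified illustration, not the pyradiomics implementation, which works on 3D volumes and aggregates several directions; the function names and toy images are hypothetical, and gray levels are assumed to be integers from 0 to levels - 1.

```python
def run_length_matrix(image, levels):
    """Count horizontal runs: matrix[i][j - 1] is the number of runs
    of gray level i that have length j."""
    width = max(len(row) for row in image)
    matrix = [[0] * width for _ in range(levels)]
    for row in image:
        start = 0
        for k in range(1, len(row) + 1):
            # A run ends at the end of the row or when the gray level changes.
            if k == len(row) or row[k] != row[start]:
                matrix[row[start]][k - start - 1] += 1
                start = k
    return matrix


def long_run_emphasis(matrix):
    """LRE: run counts weighted by squared run length, divided by total runs."""
    n_runs = sum(sum(row) for row in matrix)
    weighted = sum(count * (j + 1) ** 2
                   for row in matrix for j, count in enumerate(row))
    return weighted / n_runs


# Toy images: one contains a long homogeneous run, the other alternates.
coarse = [[0, 0, 0, 0], [0, 1, 0, 1]]
fine = [[0, 1, 0, 1], [1, 0, 1, 0]]
lre_coarse = long_run_emphasis(run_length_matrix(coarse, levels=2))
lre_fine = long_run_emphasis(run_length_matrix(fine, levels=2))
```

The image with the long run yields the larger LRE, in line with the interpretation that higher values reflect coarser texture.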
Currently, 123I-MIBG scan is the most frequently used imaging modality and is regarded as the standard of care in patients with NB. In comparison with 18F-FDG PET/CT, 123I-MIBG scan is carried out over 2 days, and its image quality is less ideal, which could pose a challenge to inexperienced physicians (27). At many centers, planar I-MIBG imaging scans are performed, but radiomics studies based on these images are very limited. Moreover, false-negative MIBG scans were reported as early as 1990, which may result in incorrect down-staging (9). In about 8% of NB patients, false-negative scans at diagnosis occurred despite solid evidence of disease. 18   patterns. In NB patients, 18F-FDG PET/CT had higher sensitivity and specificity for the detection of lesions (9), and showed more extensive primary and/or residual lesions in stage 1 and 2 (8). Overall, 18F-FDG PET/CT was superior in depicting NB, although 123I-MIBG might be needed to exclude higher stage (8). Interestingly, FDG-avid but MIBG-negative and MIBG-avid but FDG-negative NB can coexist in the same tumor (28).
The potential clinical significance of the present study is as follows: (1) radiomics based on pre-therapy 18F-FDG PET/CT provides a relatively accurate, non-invasive method for predicting MYCN, 1p and 11q status that is applicable to pediatric NB patients; (2) the status of MYCN, 1p and 11q can be used for risk stratification, therapy selection, therapy response monitoring and prognosis prediction.
This study had limitations. The small cohort from a single center may limit the generalizability, sensitivity and specificity of the predictive models. Therefore, a larger prospective multi-center cohort is necessary to validate the results and improve the reliability of the models for MYCN, 1p and 11q prediction in NB.

CONCLUSION
The models developed from the pre-therapy 18F-FDG PET/CT radiomic signature and clinical parameters are able to predict MYCN amplification and 1p and 11q aberrations in pediatric NB, thus aiding risk stratification, disease management and personalized therapy in clinical practice.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Materials; further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Beijing Friendship Hospital, Capital Medical University. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
LQ, SY, and SZ made substantial contributions to study design, image acquisition, data analysis and interpretation, and new software creation in this work. SZ, HQ, WW, YK, LL, JL, and HZ contributed to writing and/or revising the manuscript. JY and JL approved all versions to be published and were responsible for all aspects of this study. All authors contributed to the article and approved the submitted version.