Updated Trends in Imaging Practices for Pancreatic Neuroendocrine Tumors (PNETs): A Systematic Review and Meta-Analysis to Pave the Way for Standardization in the New Era of Big Data and Artificial Intelligence

Purpose: Medical imaging plays a central and decisive role in guiding the management of patients with pancreatic neuroendocrine tumors (PNETs). Our aim was to synthesize the recent PNET literature to enable a comparison of imaging practices. Methods: Using a systematic review and meta-analysis approach, we collected all imaging-based studies published from December 2014 to December 2019 from the MEDLINE, EMBASE, and Cochrane Library databases. Study quality was assessed with the QUADAS-2 and MINORS tools. Results: A total of 161 studies comprising 19852 patients were included. There were 63 ‘imaging’ studies evaluating the accuracy of medical imaging and 98 ‘clinical’ studies using medical imaging as a tool for response assessment. Practices were widely heterogeneous. The imaging modalities used were CT (57.1%, n=92), MRI (42.9%, n=69), PET/CT (19.3%, n=31), and SPECT/CT (9.3%, n=15). International imaging guidelines were mentioned in 2.5% (n=4/161) of studies. Among clinical studies, 30.6% (n=30/98) did not mention the imaging protocol at all and 63.3% (n=62/98) mentioned only the imaging modality without further information, compared with 1.6% (n=1/63) of imaging studies (p<0.001). The QUADAS-2 and MINORS tools identified biases in the current literature. Conclusion: We provide an overview of current trends in the use of medical imaging for diagnosis and response assessment in PNETs. The most commonly used imaging modalities are anatomical (CT and MRI), followed by PET/CT and SPECT/CT. Standardization and homogenization of PNET imaging practices are needed to aggregate data and leverage a big data approach for artificial intelligence purposes.


INTRODUCTION
Pancreatic neuroendocrine tumors (PNETs) are relatively uncommon tumors, with an increasing incidence due to widespread use of cross-sectional imaging (1,2). PNETs represent a heterogeneous entity, characterized by a wide variation in clinical presentation and prognosis due to tumor functional status, possible genetic context, and variable aggressiveness, making the management of PNETs highly challenging (3-6).
Medical imaging plays a critical role in guiding the management of patients with PNETs (7,8). Computed tomography (CT) is often the initial modality used to evaluate pancreatic lesions, mostly because of its high spatial and temporal resolution (9) and its correlation with histological prognostic factors (10). Magnetic resonance imaging (MRI) also plays a major role in pancreatic tumor characterization (11,12) and demonstrates imaging features that can be correlated with tumor aggressiveness and grade (13,14). A wide range of molecular imaging techniques is also used in PNET patients, such as somatostatin receptor (SSTR) imaging with single photon emission computed tomography/CT (SPECT/CT) and positron emission tomography/CT (PET/CT) (15), as well as 18F-DOPA (16) and 18F-FDG PET/CT (17). Molecular imaging techniques have shown a high association with tumor grade and are critical for theranostic approaches (18-21). Increasingly, a multimodal imaging strategy combining anatomical and molecular techniques is leveraged for imaging-guided approaches to personalize and optimize patient management (22,23).
PNETs present four characteristics that make imaging evaluation challenging. First, PNETs are hypervascular, slow-growing tumors, which limits the value of the Response Evaluation Criteria in Solid Tumors (RECIST) because tumor burden remains stable rather than decreasing in the patients with the best survival (24,25). Second, tumor size measurements may vary with contrast medium injection protocols on either CT or MRI (26). Third, new targeted cytostatic agents are used in PNET treatment, and alternative imaging criteria are needed, such as tumor density change on perfusion CT (25). Fourth, immune-checkpoint modulators (ICMs) are currently being evaluated in several PNET clinical trials (27,28). Because of the mechanisms of action of these agents, radiologists should be aware of new patterns of response and progression under immune therapies, as well as immune-related adverse events (irAEs) (29-34). In addition, treatment beyond progression is allowed, and immune RECIST (iRECIST) criteria should be used (35). This new era of immunotherapy makes tumor response assessment in PNETs even more difficult.
One of the key concepts unique to medical imaging is that the relevance and clinical utility of information derived from imaging depend on technical imaging parameters and acquisition. Therefore, using poor-quality imaging techniques in clinical routine or in scientific studies may lead to inaccurate and biased results. Imaging examinations need to be technically adequate, uniform, and homogeneous, which is even more salient in imaging PNETs since a majority of PNETs are hypervascular and up to 20% of PNETs measure 2 cm or less. Consequently, acquiring CT or MRI scans without an arterial phase or thin slices drastically reduces the sensitivity of the examination (36). Molecular imaging is also sensitive to technical parameters, with optimal patient preparation, administered radiotracer activity, and acquisition time being essential elements for high-quality molecular examinations (37,38). Thus, imaging standardization is critically important in both clinical practice and medical research, which encompasses clinical therapeutic trials and imaging research studies (e.g., diagnostic accuracy studies and comparisons of imaging techniques). In clinical trials, survival assessment, therapeutic response, and the prognostic value of a therapeutic effect depend largely on tumor imaging response, which is mostly based on tumor size variations assessed by medical imaging. In an effort to harmonize and standardize clinical practice, the European Neuroendocrine Tumor Society (ENETS) published consensus guidelines for the standards of care in 2009 (38), which were updated in 2017 (37) and emphasized the importance of PNET diagnostic procedures and the technical quality of imaging methods.
In order to unravel the potential "imaging databases" that exist at the international level, we have conducted an updated review on the current imaging trends in clinical practice and research, based on a systematic review and meta-analysis approach, evaluating standardization of medical imaging in international PNETs studies during the last five years. The aim of this study was to evaluate the methodology and level of standardization of imaging in the recent literature of PNETs. We have focused this review on recent literature in order to reflect updated and current practices in imaging of PNETs, especially given the growth of literature in newer imaging techniques and theranostics.

MATERIALS AND METHODS
A preliminary step was conducted before starting this study, in which we reviewed all available literature on the different international guidelines in this area. This search is summarized in Supplementary Table 1.

Literature Search Strategy and Study Selection
The study protocol was developed and previously registered in PROSPERO (study number: CRD42020168542).
To review the entire literature published on PNETs during the last five years, a systematic search of the major reference databases MEDLINE (PubMed), CENTRAL (Cochrane Central Register of Controlled Trials), and EMBASE was undertaken in December 2019, according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines (39). The PRISMA checklist is shown in Supplementary Table 2.
Key search terms included "pancreatic neuroendocrine tumor/ tumor/neoplasm/carcinoma", and "islet cell adenoma/tumor". Study selection focused on recent literature, from December 1, 2014 to December 1, 2019. Details of the search terms used for each database are reported in Supplementary Table 3.
After removal of duplicate studies, publications including only an abstract, non-English studies, and non-human studies were automatically excluded from the study selection, as were case reports, systematic or non-systematic reviews, and meta-analyses. References were managed with the commercial bibliographic software EndNote X9.3.1.

Inclusion and Exclusion Criteria
Titles and abstracts of studies were initially screened to select eligible publications by removing studies dealing with the following topics: (1) NETs other than PNETs or non-neuroendocrine neoplasms (i.e., not exclusively about PNETs); (2) inherited syndromes; (3) ultrasound (US); (4) basic science; and (5) unrelated subjects. Then, the full texts of eligible publications were reviewed to select all imaging-based publications, which were separated into 2 subgroups: 1) imaging diagnostic accuracy studies and related studies (comparison of two imaging techniques, evaluation of the prognostic value of a subtype of imaging technique); and 2) clinical studies, including observational and cohort studies, experimental studies, and clinical trials, conducted for therapeutic or prognostic purposes, whose results are based on therapeutic responses and survival endpoints that are fully dependent on imaging (Progression-Free Survival PFS, Disease-Free Survival DFS, Disease Control Rate DCR, Recurrence-Free Survival RFS, Objective Response Rate ORR).
All studies identified by the search were screened for eligibility by two independent authors (E.P and F.Z.M), blinded to each other's decisions. In case of disagreement, a consensus was reached after review with a third reviewer (L.D).

Data Extraction
The two reviewers (E.P and F.Z.M) who performed the initial literature search independently extracted relevant data for each selected imaging-based publication, using a standardized form. The form covered: (a) general publication data (title, authors, journal and year of publication, country of origin); (b) study design characteristics; (c) demographic, clinical, and pathological variables; (d) type of imaging-based survival endpoint assessed (PFS, RFS, DFS, DCR, ORR); (e) any available imaging data (imaging modality used, response evaluation criteria used); (f) technical characteristics and acquisition parameters of each imaging modality; and (g) reference to, or mention of, an imaging technical guideline from international NET societies. Table 1 summarizes all extracted data. The two investigators (E.P, F.Z.M) assessed all studies independently and in duplicate. Discordances were discussed with a third reviewer (L.D) and resolved by consensus.

Methodological Quality: Risk of Bias and Quality of Evidence
We assessed the risk of bias for all included studies. First, the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool was used without modification, as provided by the QUADAS-2 group (40). This tool evaluates methodological quality across four "risk of bias" domains and three "concerns regarding applicability" domains (a total of 7 items to assess). Second, we used the Methodological Index for Non-Randomized Studies (MINORS) grading score for clinical studies (41). The MINORS score is a validated tool that uses eight graded questions for non-comparative studies. Each QUADAS-2 domain was judged as presenting low, high, or unclear risk of bias, whereas each MINORS item was scored numerically as 0 (not reported), 1 (reported but inadequate), or 2 (reported and adequate). The ideal global MINORS score is 16 for non-comparative studies and 24 for comparative ones.
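The MINORS arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `minors_total` and the example scores are hypothetical, and the assumption that comparative studies are graded on 12 items follows from the ideal total of 24 and the four additional criteria mentioned in the Results.

```python
def minors_total(item_scores, comparative=False):
    """Sum MINORS item scores and return (total, ideal_total).

    Assumes 8 items for non-comparative studies and 12 for comparative
    ones, each scored 0, 1, or 2, as described in the text.
    """
    expected_items = 12 if comparative else 8
    if len(item_scores) != expected_items:
        raise ValueError(
            f"expected {expected_items} item scores, got {len(item_scores)}"
        )
    if any(score not in (0, 1, 2) for score in item_scores):
        raise ValueError("each MINORS item must be scored 0, 1, or 2")
    return sum(item_scores), 2 * expected_items


# Hypothetical non-comparative study (illustrative scores only)
example_scores = [2, 2, 0, 2, 1, 2, 1, 2]
total, ideal = minors_total(example_scores)
print(f"MINORS score: {total}/{ideal}")
```

Returning the ideal total alongside the observed score keeps comparative and non-comparative studies on clearly separate scales when the results are tabulated.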

Statistical Analysis
Analyses were conducted using Microsoft Excel (v2019, Microsoft, USA, 2019) and open-source R software (version 3.6.2; R Foundation for Statistical Computing, Vienna, Austria). A p-value less than 0.05 was considered to indicate statistical significance (α = 0.05).

RESULTS

Identification and Selection of Studies
The literature search yielded 9982 studies. Following the removal of duplicates, 6509 studies remained. Among them, 4189 records were automatically excluded from the study selection: publications including only an abstract (n=149), non-English studies (n=293), non-human studies (n=2115), case reports (n=1058), and systematic or non-systematic reviews and meta-analyses (n=574). The remaining 2320 studies were screened on the basis of title and abstract. Among them, 1846 were excluded: studies not exclusively dealing with PNETs (n=951), inherited syndromes (n=133), studies evaluating ultrasound (US) (n=250), basic science studies (n=317), and unrelated studies (n=195).
A total of 474 publications were fully reviewed, of which 161 were identified as imaging-based studies and included in the final analysis: 63 imaging studies (diagnostic accuracy and related studies) and 98 clinical studies based on therapeutic responses and survival endpoints that are fully dependent on imaging. The PRISMA flow chart of the literature search and study selection process is shown in Figure 1.
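The selection counts reported above can be checked for internal consistency with a short Python sketch. All numbers come from the text; the variable names and dictionary labels are illustrative only.

```python
# Counts reported in the study selection text
identified = 9982
after_duplicates = 6509

auto_excluded = {
    "abstract only": 149,
    "non-English": 293,
    "non-human": 2115,
    "case reports": 1058,
    "reviews and meta-analyses": 574,
}
screening_excluded = {
    "not exclusively PNETs": 951,
    "inherited syndromes": 133,
    "ultrasound studies": 250,
    "basic science": 317,
    "unrelated": 195,
}
included = {"imaging studies": 63, "clinical studies": 98}

# Each stage is the previous stage minus the exclusions at that stage
screened = after_duplicates - sum(auto_excluded.values())
full_text = screened - sum(screening_excluded.values())

print("duplicates removed:", identified - after_duplicates)
print("screened on title/abstract:", screened)
print("full-text reviewed:", full_text)
print("included in final analysis:", sum(included.values()))
```

The exclusion categories sum to the stated totals (4189 and 1846), so each stage of the flow follows from the previous one.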

Demographic, Clinical and Pathological Variables
In this study, 19852 patients were included, with 15728 patients in "clinical studies" and 4124 patients in "imaging studies" (see Table 3).

Recent Trends in PNET Imaging
The two imaging modalities most frequently used in the recent PNET literature are CT and MRI, representing 57.1% (n=92/161) and 42.9% (n=69/161) of studies, respectively. Nuclear medicine imaging was less frequently used, with 19.3% (n=31/161) of studies utilizing PET/CT and 9.3% (n=15/161) utilizing planar scintigraphy and/or SPECT/CT. No significant difference was observed between imaging and clinical studies. The detailed distribution of imaging modalities per subgroup is illustrated in Figure 2F.
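As a quick check of the proportions above, the following Python sketch recomputes each percentage from the reported counts. The helper `percent` is a hypothetical name; the counts are taken from the text.

```python
TOTAL_STUDIES = 161
modality_counts = {
    "CT": 92,
    "MRI": 69,
    "PET/CT": 31,
    "planar scintigraphy and/or SPECT/CT": 15,
}


def percent(count, total=TOTAL_STUDIES):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


for modality, count in modality_counts.items():
    print(f"{modality}: {percent(count)}% (n={count}/{TOTAL_STUDIES})")

# The modality counts add up to more than the number of studies
total_mentions = sum(modality_counts.values())
print("modality mentions:", total_mentions, "for", TOTAL_STUDIES, "studies")
```

The counts add up to more than 161, which is consistent with some studies using more than one modality, so the percentages are not expected to sum to 100%.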

Standardization of Practice: Reporting of Imaging Technical Parameters
In imaging studies, the available information on imaging methods, examination protocols, and technical details for each imaging modality was collected and is summarized in Table 4. With respect to CT, most studies reported a detailed imaging acquisition protocol (93.8%), with almost all studies using multiphase contrast-enhanced acquisition and one reporting single-phase acquisition. CT slice thickness parameters were
For MRI, most studies reported an imaging acquisition protocol (95.2%, n=20/21), with most acquiring multiphase contrast-enhanced images (80%, n=16/20) and T1- and T2-weighted images (90.5%), and a majority obtaining DWI images (76.2%). MRI slice thickness parameters were available in 81.0% of studies, with a mean slice thickness of 3.1 mm. Details of contrast administration were reported in 66.7% of studies. Among studies in which MRI scanner details were reported (n=20/21), half were performed on a single MRI scanner, while the other half were performed on 2 or more different scanners. Regarding magnetic field strength, 33.3% were performed on a 1.5 T system, 28.6% on a 3.0 T system, and 33.3% on both 1.5 and 3.0 T systems.
There was significant heterogeneity in the reporting of imaging modalities and imaging acquisition protocols (Figure 2G). For example, 30.6% (n=30/98) of clinical studies did not describe which imaging modalities were used, whereas imaging studies specified the imaging modality in 100% of cases (n=63/63) (p<0.001). Additionally, 63.3% of clinical studies (n=62/98) reported the imaging modalities used but gave no details on the imaging protocol, while only one imaging study did not report the specific imaging protocol used (1.6%, p<0.001). The injection phase was specified in only 11.2% (n=11) of clinical studies, and a multiphase injection protocol was clearly specified in 4.1% (n=4).
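The comparisons between clinical and imaging studies reported above can be reproduced approximately with a two-proportion test. The text does not state which test was used, so the Pearson chi-square test below (without continuity correction) is an assumption, and the function name `chi_square_2x2` is hypothetical. The sketch uses only the Python standard library.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the 2x2
    table [[a, b], [c, d]]; returns (statistic, p_value) with 1 degree
    of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Modality not described: 30 of 98 clinical studies vs 0 of 63 imaging studies
stat_modality, p_modality = chi_square_2x2(30, 68, 0, 63)

# Modality reported without protocol details (62 of 98 clinical studies)
# vs protocol not reported (1 of 63 imaging studies)
stat_protocol, p_protocol = chi_square_2x2(62, 36, 1, 62)

print(f"modality not described: chi2={stat_modality:.2f}, p={p_modality:.2e}")
print(f"protocol not detailed:  chi2={stat_protocol:.2f}, p={p_protocol:.2e}")
```

Under this assumed test, both comparisons fall below the 0.001 threshold reported in the text. When expected cell counts are small, an exact test such as Fisher's would be the more conservative choice.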

Imaging Studies
For imaging studies (QUADAS-2: Figure 2H), no study was considered to be at low risk of bias for all domains. In 6.3% of included studies, a high risk of bias for patient selection was due to enrollment that was neither consecutive nor random. Regarding patient flow and timing, 20.6% of imaging studies used a combination of histopathologic findings and clinical follow-up in reference

Clinical Studies
The MINORS index for clinical studies (Figure 2I) showed that the main source of bias was the lack of prospective data collection, found in 56.1% of studies. The second major source of bias was the lack of information on whether patients were included consecutively (16.3%). Of note, only a small proportion of the clinical studies were comparative (13.3%, n=15/98), which precluded evaluation of the four additional criteria (adequate control group, contemporary groups, baseline equivalence of groups, adequate statistical analyses).

DISCUSSION
Medical imaging plays a decisive role in the management of PNETs, a highly challenging disease (22), and offers physicians a large panel of imaging tools. With the purpose of standardizing practices, the ENETS 2017 guidelines emphasize the need for technical quality of imaging methods (37,38,45). To optimize treatment strategies, it is often necessary to combine data from several centers in clinical therapeutic trials. In the new era of big data and artificial intelligence, harmonization of imaging practices is especially important for finding relevant imaging biomarkers. This also explains the importance of assessing practice heterogeneity in order to unravel the potential "imaging databases" that exist in this field. Based on this unmet need, the first objective of this systematic review was to assess the level of standardization of imaging practices in the recent PNET literature.
In this study, we demonstrated the existence of a significant lack of standardization and homogenization of methodological imaging practices in the recent PNETs literature. Study selection resulted in 161 imaging-based manuscripts and allowed the creation of two different sub-groups of publications in the final analysis: 63 imaging studies and 98 clinical studies.
We chose to study each subgroup separately because we assumed that the two types of studies are conducted differently. In the imaging studies subgroup, we expected to find all the necessary technical details because the purpose of these studies is to evaluate diagnostic accuracy. We wanted to assess the degree of homogeneity and compare this information with the international guidelines. For clinical studies, in which the therapeutic evaluation is obtained by radiological assessment, we hypothesized that very few technical details would be reported since clinical outcomes were the primary endpoints.
Our study is the first to evaluate imaging standardization in PNETs. Beyond the overall lack of methodological standardization and homogenization, six key concepts were identified in this study.
First, overall methodological quality remains suboptimal. Indeed, the vast majority of the studies were retrospective (n=139/161; 86.3%) and non-randomized (n=155/161; 96.3%). However, there was a significant difference between the two subgroups in terms of institutional design, with 38.8% of clinical studies being multicentric versus less than 5% of imaging studies (p<0.001). This may indicate that clinical studies are generally of higher quality from a methodological point of view. Conversely, clinical studies were mostly non-blinded, compared with only 4.8% of imaging studies (p<0.001), which supports the methodological validity of the evaluation in imaging diagnostic accuracy studies. While clinical studies are more often prospective and multicentric, the imaging acquisition in these studies is poorly reported and not standardized, which may lead to heterogeneous image quality. Imaging studies have more homogeneous and better described imaging techniques, but their level of evidence is limited by the fact that they are monocentric and retrospective.
Second, there is a mismatch between the types of data used for clinical and imaging studies. While 71.4% of clinical studies started collecting data before 2005 (n=70/98), 53.9% (n=34/63) of imaging studies used data collected after 2005. This highlights a possible difference between the results obtained in clinical studies and those obtained in imaging studies. Indeed, molecular imaging in the field of NETs has been extensively developed over the last decade, vastly improving the performance of imaging techniques through more accurate methods, such as Ga-68 DOTATATE PET/CT imaging (46,47). The two most common imaging modalities reported in the recent PNET literature are CT and MRI, despite significant progress in nuclear medicine imaging, with the advent of newer high-performance PET radiotracers and their increased availability. Therefore, we predict a future rebalancing in the distribution of the different PNET imaging modalities.
Third, the geographical distribution of populations in the current literature shows a lack of representation of patients from South America, Africa, and Oceania, although the prevalence of PNETs in these parts of the world is not different (48). International societies therefore need to encourage research in these regions in order to obtain worldwide results and to better homogenize the management of PNET patients, both in clinical routine and for research purposes.
Fourth, imaging procedures are described more frequently and in better detail in imaging studies than in clinical studies, even in large multicentric international clinical studies. Moreover, the radiological assessment is also of better quality in imaging studies, with a significantly higher rate of blinded assessors. In 90.8% of clinical studies, the imaging assessment was not clearly stated. Paradoxically, in this study, multicentric international studies, which are supposed to be methodologically better, showed lower quality in terms of radiological methodology. This may be explained by a lack of standardization between centers. For instance, RADIANT-3, a large prospective, randomized, phase 3 clinical trial published in 2011, demonstrated that everolimus improved progression-free survival (PFS) compared with placebo in patients with advanced PNETs (49). In contrast with the vast majority of the recent clinical studies analyzed, imaging technical details were fully described in its supplementary materials. Since PFS is in part an imaging-based clinical endpoint, this supports the methodological soundness of that trial in terms of technical quality and imaging protocol, and its high evidence-based value.
Fifth, adherence to international guidelines is very low in the included studies, as shown by the low rate of reference to international imaging technical guidelines (2.5% (n=4/161) of all selected studies). In these 4 cases (published between 2015 and 2018), the ENETS 2009 technical guidelines were mentioned. We noticed that the ENETS 2017 technical guidelines were never mentioned in the 102 selected articles published since 2017, although they were the most recent and detailed guidelines.
Last, there was a lack of imaging quality assessment tools. Many tools and indexes are available for evaluating the methodological quality of studies and assessing the risk of bias. We chose MINORS and QUADAS-2 because of their simplicity and their reliability, as demonstrated by the rigorous and evidence-based process used to develop them. However, neither of these tools was specifically designed to assess how standardized imaging procedures are performed, which can be essential in some areas. Therefore, it seems important that future work focus on a methodological quality assessment tool that incorporates an evaluation of how imaging techniques are performed.
Using a systematic review and meta-analysis approach with strict inclusion criteria, we applied state-of-the-art methodology in this research. We chose to restrict our search strategy to the last five years because we focused on imaging technical parameters, a field of medicine and technology that is constantly evolving. For example, thin CT sections were not routine before the 2009 ENETS technical guidelines.
These results show the difficulty of pooling all data for a big data approach. A qualitative assessment of the potential "imaging databases" in the recent PNET literature, which are theoretically accessible to data mining using AI, shows excessive data heterogeneity. This is exacerbated by the use of many different machines and equipment, which increases input data variability. Initiatives like the EARL FDG PET/CT accreditation program provide a way to limit data heterogeneity and facilitate multicenter research projects with accurate and reproducible imaging data.
The results of this study might have major implications for clinicians, researchers, and guideline committees. Clinical decisions should be based on the best available imaging technique, rigorously applying the recommended technical parameters for each technique. A non-optimal imaging acquisition or reconstruction should be repeated before any clinical decision is taken.
Similarly, precise reporting should be required in imaging-based studies. In addition, as a guarantee of quality, any future publication should at least state that imaging examinations were performed in accordance with reference guidelines. Another approach to improving practices would be to modify the prospective databases from which a majority of studies collect their information. Technical imaging data, radiological protocols, and acquisition methods should be recorded, and only patients who have undergone appropriate imaging examinations in accordance with international guidelines should be included. Expanding these databases to include imaging technical information would also allow better selection of patients with technically correct imaging.
In conclusion, this systematic review of the recent literature on PNETs, with a special emphasis on imaging, demonstrates the lack of rigorous reporting and standardization of imaging techniques in clinical practice and research. A clear gap in imaging information was demonstrated in clinical studies, particularly regarding the types of modalities used, the radiological protocol applied, and the imaging assessment. This lack of information is all the more striking in clinical studies whose results are mainly based on radiological evaluation. Even when technical details were available in imaging studies, there was significant heterogeneity of practices and a lack of reference to established international guidelines. This non-uniformity makes it difficult to envision a pooled use of data for AI data mining and big data purposes, since AI requires homogeneous and standardized clinical practices. Such standardization may allow the identification of new biomarkers of treatment effectiveness and thus better optimization of the management of PNET patients.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.