Front. Oncol., 29 November 2023
Sec. Cancer Epidemiology and Prevention
This article is part of the Research Topic Advances in CNS Tumors Treatment and Diagnosis: Obstacles, Challenges, and Opportunities

Identifying brain tumor patients’ subtypes based on pre-diagnostic history and clinical characteristics: a pilot hierarchical clustering and association analysis

  • 1Department of Epidemiology and Prevention, IRCCS Neuromed, Pozzilli, Italy
  • 2Mediterranea Cardiocentro, Napoli, Italy
  • 3Department of Neurosurgery, IRCCS Neuromed, Pozzilli, Italy
  • 4Libera Università Mediterranea (LUM) “Giuseppe Degennaro”, Casamassima (Bari), Italy
  • 5Department of Medicine and Surgery, LUM University, Bari, Italy

Introduction: Central nervous system (CNS) tumors are severe health conditions whose incidence has been increasing in recent years. Different biological, environmental and clinical factors are thought to play an important role in their epidemiology, which however remains unclear.

Objective: The aim of this pilot study was to identify CNS tumor patients’ subtypes based on this information and to test associations with tumor malignancy.

Methods: 90 patients with suspected diagnosis of CNS tumor were recruited by the Neurosurgery Unit of IRCCS Neuromed. Patients underwent anamnestic and clinical assessment, to ascertain known or suspected risk factors including lifestyle, socioeconomic, clinical and psychometric characteristics. We applied a hierarchical clustering analysis to these exposures to identify potential groups of patients with a similar risk pattern and tested whether these clusters associated with brain tumor malignancy.

Results: Out of 67 patients with a confirmed CNS tumor diagnosis, we identified 28 non-malignant and 39 malignant tumor cases. These subtypes showed significant differences in terms of gender (with men more frequently presenting a diagnosis of cancer; p = 6.0 × 10⁻³) and yearly household income (with non-malignant tumor patients more frequently earning ≥25k Euros/year; p = 3.4 × 10⁻³). Cluster analysis revealed two clusters of patients: one (N=41) with more professionally active, educated, wealthier and healthier patients, and the other with mostly retired and less healthy men, with a higher frequency of smokers, personal history of cardiovascular disease and family history of cancer, a mostly sedentary lifestyle and generally lower income, education and cognitive performance. The former cluster showed a protective association with the malignancy of the disease, with a 74% (14-93%) reduction in the prevalent risk of CNS malignant tumors, compared to the other cluster (p=0.026).

Discussion: These preliminary data suggest that profiling patients through unsupervised machine learning approaches may help predict the risk of being affected by a malignant form. If confirmed by further analyses in larger independent cohorts, these findings may be useful to create intelligent ranking systems for treatment priority, overcoming the lack of histopathological information and molecular diagnosis of the tumor, which are typically not available until the time of surgery.

1 Introduction

Central nervous system (CNS) tumors are quite rare, representing about 1.3% of all cancers. They are hypothesized to have distinct cellular origins, which can be discriminated on the basis of anatomical location, expression of cellular markers, and morphological resemblance to normal brain cells (1). According to the World Health Organization (WHO), there are over 120 different types of brain tumors and data suggest that their incidence is further increasing (2). It is estimated that about 1,000 people receive a new cancer diagnosis every day in Italy (3) and, according to estimates by the National Cancer Registry, approximately 5,700 cases of CNS tumors are diagnosed in the country each year (4).

CNS tumors are linked with a number of risk and protective factors, including both genetic and environmental factors. The main risk factors include family history of the disease, age, and exposure to chemical compounds and radiation (5–7).

Levin and colleagues carried out a large case-control study of more than 400 cases and controls to investigate whether sensitivity to γ radiation was associated with the risk of CNS tumors, and observed that this sensitivity, and the consequent inability to repair radiation-induced DNA damage, can increase the risk of such tumors (8). A growing number of studies support the importance of healthy eating in cancer prevention. In particular, a high adherence to the Mediterranean Diet (MD) reduces the risk of mortality and the incidence of many types of tumors (7, 9). The protective effects of the MD could be attributed to the high concentration of polyphenols contained in olive oil, wine and vegetables, all foods known for their antioxidant and anti-inflammatory capacity (10, 11). Similarly, omega-3 fatty acids, which are abundant in fish, help slow down cell proliferation, angiogenesis, inflammation and metastasis (12).

A large number of epidemiological studies have also analyzed the relationship between mobile phone use and the incidence of tumors in the CNS (13, 14), but a meta-analysis of these studies did not reveal any robust statistical evidence of an increased risk of malignant or benign neoplasms with prolonged use of the mobile phone (>10 years) (15). Another potential risk factor is cigarette smoking, which represents a major source of exposure to multiple chemical carcinogens, including polycyclic aromatic hydrocarbons (PAHs) and N-nitroso compounds (16). These carcinogens, along with nicotine, are associated with permeability of the blood-brain barrier in animal models (16). As for obesity, the relative risk of all CNS cancers – and especially meningiomas – increases with increasing body mass index (BMI) (17).

CNS tumors have also been associated with several socioeconomic factors and with occupational and environmental exposures. Inskip et al. found a significant positive association with education and income for low-grade glioma, but not for high-grade glioma (18). Exposure to agricultural chemicals such as pesticides, insecticides and herbicides is also among the most frequently reported environmental risk factors (19).

Moreover, studies have indicated that psychological and cognitive manifestations can be considered not only symptoms of CNS tumors but also early warning signs (20, 21), or even risk factors. In fact, a systematic review of case reports on brain tumors and psychiatric symptoms, conducted by Ghandour and colleagues, revealed that in some cases psychiatric and minor neurological symptoms can emerge months or even years before the onset of noticeable neurological signs (22).

Overall, the association of these risk factors with CNS tumors has been scarcely investigated, especially through machine learning techniques, which can potentially identify subtypes of disease by also taking into account more complex and non-linear relationships among risk factors. This would provide a notable contribution to current knowledge in the field, in light of the modern view that each disease - and even more prominently cancer - has different clinical and biological subtypes, and that each patient is a unique combination of biological, clinical, cultural and psychological characteristics (23, 24).

The aim of this study was to preliminarily investigate the link of different known and suspected risk factors with CNS tumor malignancy, in a cohort of patients eligible for neurosurgical treatment. This was accomplished by analyzing associations between diverse exposures which could influence the risk of CNS tumors and their diagnosis - including occupational, socioeconomic, psychometric, nutritional and anthropometric variables, family history of cancer and history of chronic health conditions - and the different types of tumors, including malignant and non-malignant ones. The ultimate purpose of this approach is that, should we identify clusters of patients associated with a higher risk of malignancy, this information may prove useful in future clinical practice, e.g. for prioritizing patients for treatment, overcoming the lack of histopathological information and molecular diagnosis of the tumor, which are typically not available until the time of surgery.

2 Subjects and methods

2.1 Study design

Between October 2018 and March 2020, 90 consecutive patients were enrolled in the MEDICEA (adherence to the MEditerranean DIet in relation to CancEr of brAin) study. Recruited patients (≥ 18 years) had a suspected diagnosis of CNS tumors based on neuroimaging scan and were eligible for surgery at the Neurosurgery Department of the IRCCS Neuromed. Subjects with metastatic and/or recurrent brain tumors were excluded, as well as subjects with confirmed diagnosis of conditions other than brain tumor or with missing diagnosis (see below). Anthropometric measurements and administration of questionnaires were completed before surgery.

The pilot study, conducted according to the principles of the Declaration of Helsinki, was approved by the Ethical Committee at the IRCCS Neuromed, Pozzilli, Italy (Protocol number: 01262017). All patients provided written informed consent to be enrolled in the study.

2.2 Study population

Trained research personnel from the Department of Epidemiology and Prevention at the IRCCS Neuromed carried out recruitment (between 8.00 and 11.00 a.m. in the Neuromed clinical center) and anthropometric measurements, using methods that had been standardized beforehand during preliminary training sessions. Primary CNS tumors were validated through medical records and confirmed by histological reports. Patients without histopathological confirmation or with a diagnosis of brain cysts, secondary tumors or other expansive cerebral processes (n = 22) were excluded. Similarly, one participant who did not complete any questionnaire was filtered out before analysis. Histological information was used to identify the main CNS tumor types (i.e. meningiomas 29.5%, glioblastomas 18.2%, adenoma 13.6%, astrocytomas 13.6%, other types 25.1%; Supplementary Table 1). Other types of CNS tumors included oligoastrocytoma, chordoma, epidermoid cyst, rolandic tumor, oligodendroglioma, angioma, schwannoma, pituitary adenoma and hemangioblastoma. Additionally, CNS tumors were categorized as malignant (behavior code = 3) or non-malignant (behavior code = 0 or 1) (25).

2.3 Definition of variables analyzed

Education was based on the highest qualification attained and was categorized as up to secondary (≤8 y), upper secondary (≥9 y and ≤13 y) and post-secondary (>13 y). Occupational social class was classified as non-manual occupation, manual occupation, retired, housewife and unemployed/unclassified. Marital status was assessed and classified into married, separated/divorced, single and widowed. Household income, expressed as Euros per year, was classified as a four-level variable: <10,000; 10,000-25,000; ≥25,000 Euros/year; and a non-respondent category collecting missing values. Smoking status of participants was classified as never-smoker, current smoker or former smoker (i.e. having quit smoking at least 1 year before enrollment). For clustering purposes, these classes were condensed into never vs ever smokers. Physical activity level was classified as sedentary, mildly active or physically active lifestyle.

The study sample was also stratified as living in an urban or rural environment on the basis of the urbanization level of the city of residence, as defined by the European Institute of Statistics (EUROSTAT definition) and obtained by the tool “Atlante Statistico dei Comuni” provided by the Italian National Institute of Statistics (26).

Height and weight were measured, and BMI was calculated as the ratio of weight to squared height (kg/m²). Waist circumference was measured according to the National Institutes of Health, Heart, Lung, and Blood Guidelines (27), then waist-to-hip ratio was computed as the ratio between waist and hip circumference, both measured in centimeters. Diastolic and systolic blood pressure were also measured during the visit, through three repeated assessments, and the average of the last two measurements was taken as the final measure. Diagnoses of hypertension, hypercholesterolemia and diabetes were defined by the current pharmacological treatments reported, while history of cardiovascular (angina, stroke and myocardial infarction) and peripheral artery disease was based on self-reported diagnosis.
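The derived anthropometric measures described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's own code: the function names and the example values are hypothetical, and the only rules taken from the text are the BMI formula, the waist-to-hip ratio, and the averaging of the last two of three blood pressure readings.

```python
from statistics import mean
from typing import Sequence


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight divided by squared height (kg/m^2)."""
    return weight_kg / height_m ** 2


def waist_to_hip_ratio(waist_cm: float, hip_cm: float) -> float:
    """Ratio between waist and hip circumference, both in centimeters."""
    return waist_cm / hip_cm


def final_blood_pressure(readings: Sequence[float]) -> float:
    """Average of the last two of three repeated readings."""
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    return mean(readings[-2:])


# Hypothetical patient, for illustration only.
patient_bmi = bmi(70.0, 1.75)
patient_whr = waist_to_hip_ratio(90.0, 100.0)
patient_sbp = final_blood_pressure([130.0, 124.0, 122.0])
```

Discarding the first reading is a common way to limit the effect of an initial alerting reaction on the recorded blood pressure.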

Patients were also asked about family history of tumor disease among their first-degree relatives (Yes/No). Furthermore, they were asked whether they lived or worked in proximity to industries, signal relays/repeaters/antennas, sources of asbestos or landfills. The use of mobile phones was also investigated, both by asking whether patients usually slept with the mobile phone nearby (Yes/No) and by asking how many hours per day they used the phone, with the following potential answers: <2h/day, 2-4h/day and ≥ 4h/day. Finally, patients were asked if they had ever been hospitalized following a head injury due to an accident, a strong bump or a bruise, and if they had undergone previous surgery (Yes/No).

2.4 Dietary assessment

Data on food intake during the year before enrollment were collected using the validated Italian version of the EPIC food frequency questionnaire (28), which includes 188 food items, classified into 75 predefined food groups on the basis of similar nutrient characteristics or culinary usage. Adherence to the traditional MD was evaluated by the Mediterranean Diet Score (MDS) developed by Trichopoulou et al. (29), which ranges from 0 to 9 (the latter reflecting maximal adherence).

2.5 Psychometric assessment

Quality of life of the patients was assessed through a self-administered Functional Assessment of Cancer Therapy - Brain cancer (FACT-Br) questionnaire before the surgery. This includes five subscales that evaluate physical, social life and family, emotional and functional wellbeing, and additional conditions. The total score ranged from 0 to 184 (the latter indicating higher quality of life) (30).

Psychological resilience was tested in the patients through the Connor-Davidson Resilience Scale (CD-RISC), a self-rated assessment based on 25 items and assessing domains of personal competence, trust/tolerance/strengthening effects of stress, acceptance of change, secure relationships, control, humor, patience, and spiritual influences. Since each item is rated on a 5-point scale (0–4), the total score ranges from 0 to 100, with higher scores reflecting greater psychological resilience (31).

Global cognitive function was assessed via the Montreal Cognitive Assessment (MoCA). The MoCA is a widely used screening tool that assesses cognitive ability through brief evaluation of various cognitive domains, including visuospatial/executive, naming, memory, attention, language, abstraction, delayed recall and orientation (to time and place) (32). This test incorporates an adjustment for participants with ≤12 years of education, by the addition of 1 point to the final score (33). A total score out of 30 is given, with scores <18 indicating dementia, scores from 18 to 25 indicating mild cognitive impairment and scores ≥26 being classified as cognitively normal. This tool is administered in person and takes ~10 minutes to complete (33).

Depressive symptoms were assessed through the Patient Health Questionnaire 9 (PHQ‐9) self‐administered scale, which assesses the nine symptoms most often affected in major depression, namely anhedonia, low mood, alteration of sleeping pattern, altered appetite or eating behavior, feeling of failure/low self-esteem, fatigue, troubles in mental concentration, hypo/hyperactivity behaviors, and suicidal ideation. Each item can receive a score from 0 to 3, depending on how often the relevant domain is affected, with the total PHQ-9 score ranging from 0 (indicating no depressive symptoms at all) to 27 (suggestive of severe depression) (34).
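The scoring rules for the MoCA and the PHQ-9 lend themselves to a short illustration. The Python sketch below is hypothetical and is not the study's scoring code: the function names are invented, the cut-offs are those stated in the text, and capping the education-adjusted MoCA score at the 30-point maximum is an assumption made here.

```python
from typing import Sequence


def moca_adjusted(raw_score: int, education_years: int) -> int:
    """Add 1 point for participants with <=12 years of education.

    Capping at 30 (the maximum total score) is an assumption of this sketch.
    """
    bonus = 1 if education_years <= 12 else 0
    return min(raw_score + bonus, 30)


def moca_category(score: int) -> str:
    """Map a total MoCA score to the categories described in the text."""
    if score < 18:
        return "dementia"
    if score < 26:
        return "mild cognitive impairment"
    return "cognitively normal"


def phq9_total(items: Sequence[int]) -> int:
    """Sum the nine PHQ-9 items, each scored 0 to 3 (total range 0 to 27)."""
    if len(items) != 9 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("PHQ-9 needs nine items scored 0-3")
    return sum(items)


# Hypothetical participant: raw MoCA of 25 with 10 years of education.
example_moca = moca_adjusted(25, 10)
example_category = moca_category(example_moca)
example_phq9 = phq9_total([1, 0, 2, 1, 0, 0, 1, 0, 0])
```

In this example the one-point education adjustment moves the participant from the mild cognitive impairment range into the cognitively normal range, which shows why the adjustment matters near the cut-off.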

2.6 Statistical analysis

Malignant and non-malignant subtypes were compared for a number of variables, which included demographic (age, gender), socioeconomic (education, annual income, occupation), anthropometric (weight, height, BMI, diastolic and systolic blood pressure) and lifestyle variables (smoking habit, physical activity, adherence to MD, daily alcohol and energy intake), as well as psychometric variables (CD-RISC, MoCA, FACT-Br and PHQ-9 scores), professional and other environmental exposures (proximity to industries, exposure to pesticides, insecticides and herbicides). Descriptive analysis of continuous data included the mean and standard deviation (SD) for each group, while the frequency of each class was compared across groups for categorical variables. Fisher Exact tests were applied on the resulting contingency tables for all categorical variables, while unpaired t-test was used for analyzing continuous variables (Table 1).


Table 1 Characteristics of the sample according to type of central nervous system tumors.

Statistical association analyses were carried out at the Department of Epidemiology and Prevention of IRCCS Neuromed, through SAS/STAT software, Version 9.4 of the SAS System for Windows©2009.
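As an illustration of the categorical comparisons described in this section, the following Python sketch computes a two-sided Fisher exact p-value for a 2×2 table using only the standard library. It is not the SAS procedure used in the study. The cell counts are reconstructed here from the percentages reported in the Results (men: 64.1% of 39 malignant cases and 28.6% of 28 non-malignant cases), so they should be read as an assumption rather than as values copied from Table 1.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Probability of observing x in the top-left cell given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Rows: malignant, non-malignant; columns: men, women (reconstructed counts).
p_value = fisher_exact_two_sided(25, 14, 8, 20)
print(f"Fisher exact p = {p_value:.4f}")
```

The paper reports p = 0.006 for this comparison; a result of the same order of magnitude is expected from the sketch, although small differences may arise from rounding in the reconstructed counts.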

2.7 Hierarchical clustering

Pre-diagnostic history and clinical data also underwent a hierarchical clustering analysis among all the patients with clear diagnosis and definition of malignancy (N=67), in R (37). This analysis, which was aimed at identifying subtypes of brain tumor patients in an agnostic way within the analyzed dataset - based only on anthropometric, socioeconomic, psychometric, lifestyle and other environmental information - was carried out as described in the Supplementary Methods, using both a divisive (top-down) and an agglomerative (bottom-up) approach. Briefly, we selected the variables to be included in the analysis, removing collinear features, imputed missing data through a k-nearest neighbor algorithm (see Supplementary Methods) and then computed a pairwise (Gower distance) dissimilarity matrix across the 67 patients (Supplementary Figure 1). Through the Average Silhouette method (Supplementary Figure 2), we determined the optimal number of clusters to classify patients based on their clinical and pre-diagnostic characteristics, then carried out the actual cluster analysis, through which each patient was assigned to one of the clusters. Since divisive clustering has been reported to be more accurate and robust than agglomerative clustering (35) and the two classification methods were significantly concordant (Fisher Exact Test p = 0.004; Supplementary Table 2; Supplementary Figure 3), we took the divisive cluster classification as the main exposure, as in (35). The two resulting clusters of patients, hereafter called Cluster 1 (green, N = 26) and Cluster 2 (red, N = 41), were then compared for all variables mentioned above, through Fisher’s Exact Test (for categorical variables) and through Student’s t test (for continuous variables).

Moreover, a Fisher Exact Test was applied to compare the distribution of the two clusters of patients for each subtype of brain tumor identified a priori, and an Odds Ratio with 95% Confidence Interval (OR [CI]) was computed, so as to detect potential associations between the two classifications and determine whether the agnostic clustering reflected tumor diagnosis.
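Two building blocks of the pipeline above, the Gower dissimilarity for mixed data and the average silhouette width used to judge a partition, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the R analysis run in the study: it omits feature selection, k-nearest neighbor imputation and the divisive/agglomerative clustering step itself, and the toy patients, function names and labels are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def gower_matrix(rows: Sequence[Tuple], numeric: Set[int]) -> List[List[float]]:
    """Pairwise Gower dissimilarity for complete mixed-type records.

    Numeric columns contribute |x - y| / range; categorical columns
    contribute 0 if equal and 1 otherwise; contributions are averaged.
    """
    n_cols = len(rows[0])
    ranges = {}
    for j in numeric:
        col = [r[j] for r in rows]
        ranges[j] = (max(col) - min(col)) or 1.0  # guard against zero range

    def dist(a: Tuple, b: Tuple) -> float:
        total = 0.0
        for j in range(n_cols):
            if j in numeric:
                total += abs(a[j] - b[j]) / ranges[j]
            else:
                total += 0.0 if a[j] == b[j] else 1.0
        return total / n_cols

    return [[dist(a, b) for b in rows] for a in rows]


def average_silhouette(d: List[List[float]], labels: List[int]) -> float:
    """Mean silhouette width of a partition, given a dissimilarity matrix."""
    n = len(labels)
    widths = []
    for i in range(n):
        own = [d[i][j] for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance to own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(d[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / n


# Hypothetical toy records: (age, sex, smoking status).
patients = [
    (62, "M", "ever"), (65, "M", "ever"), (70, "M", "ever"),
    (45, "F", "never"), (50, "F", "never"), (48, "F", "never"),
]
dissim = gower_matrix(patients, numeric={0})
sil_by_profile = average_silhouette(dissim, [1, 1, 1, 2, 2, 2])
sil_arbitrary = average_silhouette(dissim, [1, 2, 1, 2, 1, 2])
```

A partition that follows the patients' profiles yields a much higher average silhouette than an arbitrary split, which is the criterion the Average Silhouette method uses when comparing candidate numbers of clusters.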

3 Results

Basic characteristics of the 67 patients involved in the analyses are reported in Table 1. Comparing non-malignant vs malignant CNS tumor cases, we observed a difference in gender distribution across the two groups, with men (representing 49.2% of the total sample) being more prevalent in malignant cases (64.1%) than in non-malignant ones (28.6%; p = 0.006). Age (mean ± SD = 56.3 ± 14.1 y in the total sample) did not differ across the two categories, nor did educational attainment, occupational class and marital status. However, among socioeconomic and demographic variables, household income showed a differential distribution (Fisher Exact Test p = 0.003), with non-malignant tumors showing the highest percentage of subjects in the average income class (10,000-25,000 Euros; 35.7%), and malignant tumors showing a higher prevalence of people declaring ≥25,000 Euros (41.0%) and presenting many non-responders (35.9%).

Hierarchical clustering analysis identified two clusters of patients based on pre-diagnostic history and clinical data (Figure 1), which were compared to analyze their characteristics (Table 2). This comparison revealed differences in several characteristics between the two clusters. Patients in Cluster 1 (N=26) were more frequently men (100% vs 17% in Cluster 2; p<0.0001) and smokers (80.8% vs 24.4%; p<0.0001), generally less educated (up to secondary education level: 46.1% vs 34.1%; p = 0.032) and mostly inactive workers (retired: 53.8% vs 17.1%; p<0.0001), with a lower income (≥25,000 Euros/year: 50.0% vs 26.8%; p = 0.002) and a marginal trend toward an older age (mean (SD) age: 60.7 (13.3) vs 54.1 (14.8); p = 0.061). Likewise, subjects of Cluster 1 more frequently reported a sedentary lifestyle (65.4% vs 56.1%; p=0.049), a previous diagnosis of cardiovascular disease (19.2% vs 2.4%; p=0.018), and a family history of cancer (30.8% vs 9.8%; p=0.012).

From a psychometric perspective, Cluster 1 showed worse cognitive performance compared to Cluster 2 (mean (SD) MoCA score: 22.9 (2.4) vs 24.7 (3.3); p=0.023) (Table 2). No other difference was detected, except for self-reported proximity to potential pollution sources, such as industries, signal relays/repeaters/antennas, sources of asbestos or landfills (23.1% in Cluster 1 vs 56.1% in Cluster 2; p=0.006).

When we compared the classification of tumor cases based on their malignancy vs the agnostic classification of patients obtained by applying hierarchical clustering to pre-diagnostic history and clinical data, we observed an association of Cluster 2 with a lower risk of malignant tumor (OR [95% CI] = 0.26 [0.07-0.86], Fisher Exact Test p = 0.026; Table 3).
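The odds ratio reported above maps directly onto the percent reduction quoted elsewhere in the paper. The short Python sketch below shows the arithmetic; the function name is hypothetical, and strictly speaking the quantity obtained is a reduction in the odds of malignancy.

```python
from typing import Tuple


def percent_reduction(odds_ratio: float, ci_low: float, ci_high: float) -> Tuple[int, int, int]:
    """Convert an OR below 1 and its CI into a percent reduction with bounds.

    The upper OR bound gives the smallest reduction and the lower OR bound
    gives the largest, so the interval limits swap places.
    """
    def to_pct(x: float) -> int:
        return round((1 - x) * 100)

    return to_pct(odds_ratio), to_pct(ci_high), to_pct(ci_low)


# OR [95% CI] = 0.26 [0.07-0.86], as reported for Cluster 2.
reduction, lower, upper = percent_reduction(0.26, 0.07, 0.86)
print(f"{reduction}% ({lower}-{upper}%) reduction")
```

This reproduces the 74% (14-93%) figure given in the abstract and in the Discussion.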


Figure 1 Hierarchical divisive clustering of brain tumor patients, based on the collected features. The dendrogram shows the clusters identified through divisive hierarchical clustering. Gower distance is reported on the y axis and each single unit analyzed (i.e. patients) on the x axis. Vertical lines correspond to groups (or clusters) of units, while connecting (horizontal) lines identify the distance level at which clusters merge. Legend: red = cluster 1; green = cluster 2.


Table 2 Characteristics of the sample according to the two clusters identified.


Table 3 Contingency table showing tumor subtype by cluster distribution.

4 Discussion

In this preliminary study, we aimed to investigate the relationship between environmental and biological risk factors and CNS tumor malignancy. We did this through a cross-sectional association analysis between CNS tumor subtypes - divided into non-malignant and malignant CNS tumor cases - and patient clusters derived from a wealth of pre-diagnostic history and clinical data collected within the MEDICEA study. These included not only classical sociodemographic and anthropometric measures, but also environmental exposures, socioeconomic and lifestyle factors, and clinical and psychometric features of the patients. We observed two distinct subtypes of patients: one with more professionally active, educated, wealthier and healthier patients, and the other with mostly retired and less healthy men, with a family history of the disease and lower cognitive performance. Of note, the former cluster showed a protective association with the malignancy of the disease, showing a 74% (14-93%) reduction in the prevalent risk of CNS malignant tumors, compared to the other cluster (Figure 2).

Since we cannot formally compare our findings with previous evidence, due to the lack of studies applying a cluster approach to pre-diagnostic history and clinical characteristics of brain cancer patients, we focus below on comparing the evidence derived from this analysis with that produced by classical association studies in the field. Indeed, most of the associations and discrepancies observed between the two risk clusters followed the same trend reported by other studies. As for gender, our results revealed a significantly higher proportion of men among patients with malignant tumors, and the totality of our putative risk cluster was made up of men. Previous literature reports a clear predominance of some types of brain tumors in males, such as astrocytomas, glioblastoma multiforme, medulloblastomas, ependymomas and oligodendrogliomas (25), while meningiomas occur more commonly in females than in males, a trend thought to be related to hormonal components (36).

Another significant association was detected between self-reported yearly household income and tumor malignancy, with a higher income being associated with the putative risk cluster. Moreover, non-malignant tumors showed the highest percentage of subjects in the average income class (10,000-25,000 Euros/year), while patients with malignant tumors showed a higher prevalence of people declaring ≥25,000 Euros/year and presented many non-responders (37). Some of these non-responders may be people who feel ashamed to self-report a low income (which may counteract the imbalance between clusters); non-responders are usually treated as a separate class. In a study including a total of 11,892 patients with meningiomas, low-grade gliomas, and high-grade gliomas, no clear association was observed between income and the risk of developing brain tumors (38).


Figure 2 Characteristics of the two patients’ clusters identified.

While no association was observed between family history of cancer and tumor malignancy, a positive family history was significantly more frequent in the putative risk cluster, in line with previous studies, such as (39), which however reported a significant excess of relatedness for astrocytomas, but not for glioblastomas. However, this link is still debated and needs to be clarified, especially with regard to the contribution of shared environmental and genetic factors to the clustering of cases within families (39). Similarly, we observed a higher frequency of ever smokers in the putative risk cluster, despite no significant direct association between malignancy and smoking status. This is in agreement with previous evidence that active cigarette smoking was associated with an increased risk of CNS tumors in men, but with a reduced risk in women (40), and with a gender-differential association between smoking and CNS tumor diagnosis in China (41).

From a more psychological perspective, although none of the psychometric scales assessed revealed direct associations with malignancy degree, the putative risk cluster identified showed a slightly worse cognitive performance. This is in line with the evidence that cognitive deficits are commonly observed in patients with brain tumors (42), and that this could even delay the diagnosis of brain tumor, because these symptoms are often linked to psychiatric diseases (22). However, it remains unclear whether this represents an early marker of brain tumor or a risk factor, a hypothesis which requires long-term longitudinal studies to be tested.

Our pilot study revealed no association of other potential or known risk factors, like obesity and use of mobile phone, with either tumor malignancy or the risk/protective clusters. While the link with the use of mobile phone is still uncertain and debated (43–45), the lack of association with obesity is in contrast with previous evidence reported by (17), although this association may be stronger in adolescence than in adulthood (46). Moreover, participants assigned to the putative protective cluster more often reported living or working in proximity to potential pollution sources like industries, signal relays/repeaters/antennas, sources of asbestos or landfills, which is not in line with a recent review in the field (47). However, this may be partly due to subjects from Cluster 2 more often reporting that they live in an urban setting, where there is a higher density of such potential sources of pollution, or may simply be a false positive finding.

Overall, we observed here a clear link between patients’ clinical, lifestyle, psychometric, environmental and socioeconomic profiling and the risk of malignancies, as well as different associations of potential risk factors with the putative risk cluster – in line with previous literature – which we could not always observe when comparing malignant vs non-malignant tumors. This supports the application of machine learning algorithms in stratifying patients based on a combination of risk and protective factors, clinical and biological characteristics, in line with the modern view of cancer epidemiology (23, 24), which represents the essence of personalized medicine and prevention. Should our findings be confirmed by larger independent studies, this information may be useful in the future to create intelligent ranking systems for treatment priority, overcoming the lack of histopathological information and molecular diagnosis of the tumor, which are typically not available until the time of surgery. This may ultimately have beneficial implications for timely cancer diagnosis, prognosis and outcomes, possibly increasing survival for patients.

4.1 Strengths and limitations

This preliminary study has some strengths, but also limitations. Strengths include the originality and novelty of the approach. Although clustering techniques have already been used in brain tumor classification, these were applied to segment brain tumors (48) and identify transcriptomic/immune subtypes useful for prognosis prediction (49) rather than to classify patients’ profiles (49). To our knowledge, the present work is the first report of a cluster analysis based on data other than histological, neuroimaging and molecular characteristics from CNS tumor cases. This may notably improve the power to identify subtypes of disease, by also taking into account potentially complex and non-linear relationships among risk and protective factors. A further novelty consists in comparing CNS tumors based on their malignancy, while they are usually analyzed based on the tissue and cell type affected. The main limitation is represented by the cross-sectional/retrospective approach of the study, due to the current lack of longitudinal prospective data. Indeed, we are still collecting follow-up data after neurosurgery. Also, additional clinical variables like latency, dose-response and tumor localization may have been useful in patient profiling, but were not available at the time of the study due to the limitations imposed on clinical research activity by the COVID-19 pandemic emergency, which forced us to interrupt recruitment, data collection and assessment. Due to this and to the rarity of the disease, the sample size is also relatively small (<100), which may limit statistical power and clustering accuracy. For this reason, these findings warrant further replication in future independent studies on larger sample sizes, possibly including longitudinal data and a wider range of clinical features.

Author’s note

MEDICEA Study investigators are listed in Supplementary Materials.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories can be found below: The password for accessing raw data will be provided upon reasonable request to the corresponding author.

Ethics statement

The studies involving humans were approved by the Ethical Committee at the IRCCS Neuromed, Pozzilli, Italy. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

SE: Conceptualization, Visualization, Writing – original draft, Data curation, Investigation, Project administration. ER: Conceptualization, Data curation, Formal analysis, Investigation, Project administration, Writing – original draft. ADC: Methodology, Writing – review & editing. SC: Data curation, Writing – review & editing. MB: Funding acquisition, Writing – review & editing. FB: Methodology, Writing – review & editing. VE: Data curation, Project administration, Resources, Writing – review & editing. GI: Data curation, Project administration, Resources, Writing – review & editing. SP: Data curation, Project administration, Resources, Writing – review & editing. CC: Conceptualization, Supervision, Writing – review & editing. MD: Conceptualization, Supervision, Writing – review & editing. GdG: Conceptualization, Supervision, Writing – review & editing. LI: Conceptualization, Funding acquisition, Supervision, Writing – review & editing. AG: Conceptualization, Formal analysis, Visualization, Writing – original draft.


Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. The present work has been performed in the context of the Fondazione Umberto Veronesi - IRCCS Neuromed framework agreement. The data analyses were partially supported by the Italian Ministry of Health (Ricerca Corrente 2022-2024). ER was supported by the Fondazione Umberto Veronesi, which is gratefully acknowledged. Funders had no role in study design, collection, analysis, and interpretation of data, nor in the writing of the manuscript or in the decision to submit the article for publication. All Authors were and are independent from funders.


Acknowledgments

We are grateful to the participants of the MEDICEA Study. The present study has been performed in the context of the Fondazione Umberto Veronesi - IRCCS Neuromed framework agreement. ER was supported by a fellowship of the Fondazione Umberto Veronesi, which is gratefully acknowledged. We are grateful to all the Clinical Network Big Data and Personalised Health Project participants who enthusiastically joined the study and to all clinicians who contributed to recruitment, collection, assessment and elaboration of samples.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author(s) declared that ER was a Review Editor; MB, MD, GdG and LI were Guest Associate Editors; VE was an Associate Editor; and AG was a Guest Associate Editor and Review Editor. They were all editorial board members of Frontiers at the time of submission. This had no impact on the peer review process and the final decision.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at:


References

1. Bondy ML, Scheurer ME, Malmer B, Barnholtz-Sloan JS, Davis FG, Il’yasova D, et al. Brain Tumor Epidemiology Consortium. Brain tumor epidemiology: consensus from the Brain Tumor Epidemiology Consortium. Cancer (2008) 113(7 Suppl):1953–68. doi: 10.1002/cncr.23741

2. Louis DN, Perry A, Reifenberger G, von Deimling A, Figarella-Branger D, Cavenee WK, et al. The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathol (2016) 131(6):803–20. doi: 10.1007/s00401-016-1545-1

3. Busco S, Buzzoni C, Mallone S, Trama A, Castaing M, Bella F, et al. Italian cancer figures–Report 2015: The burden of rare cancers in Italy. Epidemiol Prev (2016) 40(1 Suppl 2):1–120. doi: 10.19191/EP16.1S2.P001.035

4. Connelly JM, Malkin MG. Environmental risk factors for brain tumours. Curr Neurol Neurosci Rep (2007) 7(3):208–14. doi: 10.1007/s11910-007-0032-4

5. Barnholtz-Sloan JS, Ostrom QT, Cote D. Epidemiology of brain tumours. Neurol Clin (2018) 36(3):395–419. doi: 10.1016/j.ncl.2018.04.001

6. Schwingshackl L, Schwedhelm C, Galbete C, Hoffmann G. Adherence to mediterranean diet and risk of cancer: an updated systematic review and meta-analysis. Nutrients (2017) 9(10):E1063. doi: 10.3390/nu9101063

7. Levin VA, Leibel SA, Gutin PH. Neoplasms of the central nervous system. In: DeVita VTJ, Hellman S, Rosenberg SA, editors. Cancer: Principles and Practice of Oncology. Lippincott-Raven: Philadelphia (2001). p. 2100–60.

8. Praud D, Bertuccio P, Bosetti C, Turati F, Ferraroni M, La Vecchia C. Adherence to the mediterranean diet and gastric cancer risk in Italy. Int J Cancer (2014) 134:2935–41. doi: 10.1002/ijc.28620

9. Machowetz A, Poulsen HE, Gruendel S, Weimann A, Fitó M, Marrugat J, et al. Effect of olive oils on biomarkers of oxidative DNA stress in Northern and Southern Europeans. FASEB J (2007) 21(1):45–52. doi: 10.1096/fj.06-6328com

10. Castelló A, Boldo E, Pérez-Gómez B, Lope V, Altzibar JM, Martín V, et al. Adherence to the western, prudent and mediterranean dietary patterns and breast cancer risk: MCC-Spain study. Maturitas (2017) 103:8–15. doi: 10.1016/j.maturitas.2017.06.020

11. Esposito S, Bonaccio M, Ruggiero E, Costanzo S, Di Castelnuovo A, Gialluisi A, et al. Food processing and risk of central nervous system tumours: A preliminary case-control analysis from the MEditerranean DIet in relation to CancEr of brAin (MEDICEA) study. Clin Nutr (2022) 42(2):93–101. doi: 10.1016/j.clnu.2022.11.016

12. Lian W, Wang R, Xing B, Yao Y. Fish intake and the risk of brain tumor: a meta-analysis with systematic review. Nutr J (2017) 16(1):1. doi: 10.1186/s12937-016-0223-4

13. Carlberg M, Söderqvist F, Hansson Mild K, Hardell L. Meningioma patients diagnosed 2007–2009 and the association with use of mobile and cordless phones: a case–control study. Environ Health (2013) 12(1):60. doi: 10.1186/1476-069X-12-60

14. Carlberg M, Hardell L. Decreased survival of glioma patients with astrocytoma grade IV (glioblastoma multiforme) associated with long-term use of mobile and cordless phones. Int J Environ Res Public Health (2014) 11(10):10790–805. doi: 10.3390/ijerph111010790

15. Leng L. The relationship between mobile phone use and risk of brain tumour: a systematic review and meta-analysis of trails in the last decade. Chin Neurosurg J (2016) 2:38. doi: 10.1186/s41016-016-0059-y

16. Bogovski P, Bogovski S. Animal Species in which N-nitroso compounds induce cancer. Int J Cancer (1981) 27(4):471–4. doi: 10.1002/ijc.2910270408

17. Benson VS, Pirie K, Green J, Casabonne D, Beral V, for the Million Women Study Collaborators. Lifestyle factors and primary glioma and meningioma tumours in the Million Women Study cohort. Br J Cancer (2008) 99:185–90. doi: 10.1038/sj.bjc.6604445

18. Inskip PD, Tarone RE, Hatch EE, Wilcosky TC, Fine HA, Black PM, et al. Sociodemographic indicators and risk of brain tumours. Int J Epidemiol (2003) 32(2):225–33. doi: 10.1093/ije/dyg051

19. Provost D, Cantagrel A, Lebailly P, Jaffré A, Loyant V, Loiseau H, et al. Brain tumours and exposure to pesticides: a case-control study in southwestern France. Occup Environ Med (2007) 64(8):509–14. doi: 10.1136/oem.2006.028100

20. Giovagnoli AR. Investigation of cognitive impairments in people with brain tumors. J Neurooncol (2012) 108(2):277–83. doi: 10.1007/s11060-012-0815-6

21. Kahana MJ. The cognitive correlates of human brain oscillations. J Neurosci (2006) 26(6):1669–72. doi: 10.1523/JNEUROSCI.3737-05c.2006

22. Ghandour F, Squassina A, Karaky R, Diab-Assaf M, Fadda P, Pisanu C. Presenting psychiatric and neurological symptoms and signs of brain tumors before diagnosis: A systematic review. Brain Sci (2021) 11(3):301. doi: 10.3390/brainsci11030301

23. Dagogo-Jack I, Shaw AT. Tumour heterogeneity and resistance to cancer therapies. Nat Rev Clin Oncol (2018) 15(2):81–94. doi: 10.1038/nrclinonc.2017.166

24. Garattini S, Fuso Nerini I, D’Incalci M. Not only tumor but also therapy heterogeneity. Ann Oncol (2018) 29(1):13–9. doi: 10.1093/annonc/mdx646

25. Ostrom QT, Cioffi G, Gittleman H, Patil N, Waite K, Kruchko C, et al. CBTRUS statistical report: primary brain and other central nervous system tumors diagnosed in the United States in 2012-2016. Neuro Oncol (2019) 21(Suppl 5):v1–v100. doi: 10.1093/neuonc/noz150

26. Luca S. Proposal for a statistical-economic assessment of soil degradation: A national scale approach. RIV (2014) 56:312–43. doi: 10.4081/ija.2020.1770

27. Clinical guidelines on the identification, evaluation, and treatment of overweight and obesity in adults–the evidence report. National institutes of health. Obes Res (1998) 6 Suppl 2:51S–209S. doi: 10.1002/j.1550-8528.1998.tb00690.x

28. Pisani P, Faggiano F, Krogh V, Palli D, Vineis P, Berrino F. Relative validity and reproducibility of a food frequency dietary questionnaire for use in the Italian EPIC centres. Int J Epidemiol (1997) 26 Suppl 1:S152–60. doi: 10.1093/ije/26.suppl_1.s152

29. Trichopoulou A, Costacou T, Bamia C, Trichopoulos D. Adherence to a Mediterranean diet and survival in a Greek population. N Engl J Med (2003) 348(26):2599–608. doi: 10.1056/NEJMoa025039

31. Connor KM, Davidson JR. Development of a new resilience scale: the Connor-Davidson Resilience Scale (CD-RISC). Depress Anxiety (2003) 18(2):76–82. doi: 10.1002/da.10113

32. Hobson J. The montreal cognitive assessment (MoCA). Occup Med (Lond) (2015) 65(9):764–5. doi: 10.1093/occmed/kqv078

33. Nasreddine ZS, Phillips NA, Bédirian V, Charbonneau S, Whitehead V, Collin I, et al. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. J Am Geriatr Soc (2005) 53(4):695–9. doi: 10.1111/j.1532-5415.2005.53221.x

34. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med (2001) 16(9):606–13. doi: 10.1046/j.1525-1497.2001.016009606.x

35. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing (2021). Available at:

36. Cowppli-Bony A, Bouvier G, Rué M, Loiseau H, Vital A, Lebailly P, et al. Brain tumors and hormonal factors: review of the epidemiological literature. Cancer Causes Control (2011) 22(5):697–714. doi: 10.1007/s10552-011-9742-7

37. Bonaccio M, Di Castelnuovo A, Costanzo S, De Curtis A, Persichillo M, Cerletti C, et al. Impact of combined healthy lifestyle factors on survival in an adult general population and in high-risk groups: prospective results from the Moli-sani Study. J Intern Med (2019) 286(2):207–20. doi: 10.1111/joim.12907

38. Nilsson J, Holgersson G, Järås J, Bergström S, Bergqvist M. The role of income in brain tumour patients: a descriptive register-based study: No correlation between patients’ income and development of brain cancer. Med Oncol (2018) 35(4):52. doi: 10.1007/s12032-018-1108-5

39. Blumenthal DT, Cannon-Albright LA. Familiarity in brain tumors. Neurology (2008) 71(13):1015–20. doi: 10.1212/01.wnl.0000326597.60605.27

40. Claus EB, Walsh KM, Calvocoressi L, Bondy ML, Schildkraut JM, Wrensch M, et al. Cigarette smoking and risk of meningioma: the effect of gender. Cancer Epidemiol Biomarkers Prev (2012) 21(6):943–50. doi: 10.1158/1055-9965.EPI-11-1059

41. Lei H, Jiang J, Liu B, Han W, Wu Y, Zou X, et al. Smoking and adult glioma: a population-based case-control study in China. Neuro Oncol (2016) 18(1):105–13. doi: 10.1093/neuonc/nov146

42. Van Kessel E, Baumfalk AE, van Zandvoort MJE, Robe PA, Snijders TJ. Tumor-related neurocognitive dysfunction in patients with diffuse glioma: a systematic review of neurocognitive functioning prior to anti-tumor treatment. J Neurooncol (2017) 134(1):9–18. doi: 10.1007/s11060-017-2503-z

43. Inskip PD, Hoover RN, Devesa SS. Brain cancer incidence trends in relation to cellular telephone use in the United States. Neuro Oncol (2010) 12(11):1147–51. doi: 10.1093/neuonc/noq077

44. Hardell L, Carlberg M. Mobile phones, cordless phones and rates of brain tumors in different age groups in the Swedish National Inpatient Register and the Swedish Cancer Register during 1998-2015. PloS One (2017) 12(10):e0185461. doi: 10.1371/journal.pone.0185461

45. Hours M, Bernard M, Montestrucq L, Arslan M, Bergeret A, Deltour I, et al. Téléphone mobile, risque de tumeurs cérébrales et du nerf vestibuloacoustique: l’étude cas-témoins INTERPHONE en France [Cell Phones and Risk of brain and acoustic nerve tumours: the French INTERPHONE case-control study]. Rev Epidemiol Sante Publique (2007) 55(5):321–32. doi: 10.1016/j.respe.2007.06.002

46. Moore SC, Rajaraman P, Dubrow R, Darefsky AS, Koebnick C, Hollenbeck A, et al. Height, body mass index, and physical activity in relation to glioma risk. Cancer Res (2009) 69(21):8349–55. doi: 10.1158/0008-5472.CAN-09-1669

47. Pagano C, Navarra G, Coppola L, Savarese B, Avilia G, Giarra A, et al. Impacts of environmental pollution on brain tumorigenesis. Int J Mol Sci (2023) 24(5):5045. doi: 10.3390/ijms24055045

48. Khan AR, Khan S, Harouni M, Abbasi R, Iqbal S, Mehmood Z. Brain tumor segmentation using K-means clustering and deep learning with synthetic data augmentation for classification. Microsc Res Tech (2021) 84(7):1389–99. doi: 10.1002/jemt.23694

49. Shen X, Wang X, Shen H, Feng M, Wu D, Yang Y, et al. Transcriptomic analysis identified two subtypes of brain tumor characterized by distinct immune infiltration and prognosis. Front Oncol (2021) 11:734407. doi: 10.3389/fonc.2021.734407

Keywords: central nervous system tumors, cluster analysis, pre-diagnostic history, clinical characteristics, cognitive performance, cancer diagnosis, risk and protective factors, malignancy

Citation: Esposito S, Ruggiero E, Di Castelnuovo A, Costanzo S, Bonaccio M, Bracone F, Esposito V, Innocenzi G, Paolini S, Cerletti C, Donati MB, de Gaetano G, Iacoviello L and Gialluisi A (2023) Identifying brain tumor patients’ subtypes based on pre-diagnostic history and clinical characteristics: a pilot hierarchical clustering and association analysis. Front. Oncol. 13:1276253. doi: 10.3389/fonc.2023.1276253

Received: 04 September 2023; Accepted: 30 October 2023;
Published: 29 November 2023.

Edited by:

Sheng Zhong, Sun Yat-sen University Cancer Center, China

Reviewed by:

Felix Mircea Brehar, Carol Davila University of Medicine and Pharmacy, Romania
Alessandro Consales, Giannina Gaslini Institute (IRCCS), Italy
Judith Ann Schwartzbaum, Independent Researcher, United States
Andrea Botturi, IRCCS Carlo Besta Neurological Institute Foundation, Italy

Copyright © 2023 Esposito, Ruggiero, Di Castelnuovo, Costanzo, Bonaccio, Bracone, Esposito, Innocenzi, Paolini, Cerletti, Donati, de Gaetano, Iacoviello and Gialluisi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Licia Iacoviello,

These authors have contributed equally to this work
