Deep Phenotyping and Genetic Characterization of a Cohort of 70 Individuals With 5p Minus Syndrome

Chromosome 5p minus syndrome (5p-Sd, OMIM #123450), formerly known as Cri du Chat syndrome, results from the loss of genetic material at the distal region of the short arm of chromosome 5. It is a neurodevelopmental disorder of genetic cause. So far, about 400 patients have been reported worldwide. Individuals affected by this syndrome show large phenotypic heterogeneity. However, a specific phenotype has emerged, including global developmental delay, microcephaly, delayed speech, some dysmorphic features, and a characteristic monochromatic, high-pitched voice resembling a cat’s cry. We here describe a cohort of 70 patients with clinical features of 5p-Sd characterized by means of deep phenotyping, SNP arrays, and other genetic approaches. Individuals show great clinical and molecular heterogeneity, which can be partially explained by the existence of additional significant genomic rearrangements in around 39% of cases. Accordingly, our data showed statistically significant differences between subpopulations of the cohort (simple 5p deletions versus 5p deletions plus additional rearrangements). We also found significant “functional” differences between male and female individuals.


INTRODUCTION
5p minus syndrome (5p-Sd) is caused by partial deletion of the short arm of chromosome 5. The size of the deletion is variable, ranging from 500 kb or less to 45 Mb (Simmons et al., 1995; Gu et al., 2013; Elmakky et al., 2014). This syndrome is a rare chromosomal disease, with an incidence between 1 in 15,000 and 1 in 50,000 live births (Niebuhr, 1978; Higurashi et al., 1990). The prevalence is higher among females (66%) than males, but the reason is unclear. No differences in prevalence between races or geographical areas have been found, and no relation to prenatal events or parental age has been established. In Spain, it is estimated that there are around 500-700 patients (Rodríguez-Caballero et al., 2012, and unpublished data from patients' associations). It has been suggested that the great phenotypic variability observed among individuals with this syndrome is related to both the size and the location of the deletion (between bands 5p15.3 and 5p15.2), since it is a chromosomal region with a large gene content (Nguyen et al., 2015; Correa et al., 2019).
Ninety percent of cases are de novo, and 10% are inherited due to a rearrangement in the parents (unbalanced segregation of translocations, recombination involving a pericentric inversion, or, rarely, parental mosaicism or an inherited terminal deletion). In de novo cases, between 80% and 90% are of paternal origin, possibly due to chromosome breakage during the formation of male gametes (Cerruti Mainardi et al., 2001). Prenatal diagnoses of 5p-Sd (at 12-16 weeks of gestation) are common because fetuses frequently show abnormal ultrasound signs (∼65-90%), including cerebral abnormalities, cerebellar hypoplasia, absent/hypoplastic nasal bone, hydrops fetalis, ascites or encephalocele, hypospadias, lung dysplasia, intrauterine growth retardation (IUGR), microcephaly, and micro/retrognathia (Mak et al., 2019; Su et al., 2019; Peng et al., 2020).
Over the past decade, more accurate genetic diagnosis and advances in analytical techniques have expanded the genetic information associated with the short arm of chromosome 5. However, a full map of the genes involved in this syndrome has not been completely established, nor have the consequences of their haploinsufficiency for subjects with 5p-Sd. In this sense, Nguyen et al. (2015) established a role for 11 dose-sensitive genes within the 5p arm. For five of them, losses may lead to haploinsufficiency (TERT, SEMA5A, MARCH6, CTNND2, and NPR3), and for the remaining six, haploinsufficiency is conditioned by an additional environmental factor (SLC6A3, CDH18, CDH12, CDH10, CDH9, and CDH6). In addition, two further genes have been suggested to have haplolethal effects (RICTOR and DAB2).
We here describe the clinical and molecular data of a cohort of 70 unrelated patients with a cytogenetic and/or molecular diagnosis of 5p-Sd. High-resolution single-nucleotide polymorphism (SNP) array, cytogenetic, fluorescence in situ hybridization (FISH), and multiplex ligation-probe amplification (MLPA) techniques were applied to most patients in order to delineate the size, extent, and gene content of the deletions and any additional rearrangements. Genotype-phenotype relationships were also analyzed. Finally, we compare the clinical features with those of patients published in the literature and discuss the relevant findings shared by all patients in this series.

MATERIALS AND METHODS
Subjects
Between 2017 and 2020, around 100 patients with 5p minus syndrome, formerly called Cri du chat syndrome (CDCS), were recruited for this study at our center. Around 30 cases had incomplete clinical or molecular data and were ultimately not included. The final cohort comprises 70 individuals (see Figure 1 and Supplementary Data). Most of the DNA samples from these patients were extracted and analyzed by SNP arrays at INGEMM (Madrid, Spain), and standard cytogenetic studies were performed at the Spanish Collaborative Study of Congenital Malformations Centre (ECEMC) and INGEMM. Clinical information was obtained from the referring physicians by means of two standardized questionnaires (INGEMM and ECEMC), completed with data from the medical reports and by interviewing most of the parents. Parents or guardians provided informed consent, and the Institutional Review Board of our hospital approved the study (HULP, Madrid, Spain).

Multiplex Ligation-Probe Amplification (MLPA)
We used MLPA Salsa kits P036 and P070 (subtelomeric probes for all chromosomes) and/or P096 and P358 (specific telomeric probes for the 5p arm) to characterize patients with 5p-Sd (MRC Holland, Amsterdam, Netherlands). Data analyses were performed according to the protocols supplied by the provider: relative probe signals were defined by dividing each measured peak area by the sum of all peak areas of the control probes of that sample. The relative probe signal of each peak was then compared with that of a DNA control sample (Promega, United Kingdom) using Coffalyser v.9.4 (MRC Holland).
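The normalization described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Coffalyser implementation: the probe names and peak areas are hypothetical, and the function names are our own. It divides each peak area by the summed control-probe areas of the same sample and then takes the ratio of patient to reference signal, so that a heterozygous loss is expected to give a ratio near 0.5 and a normal copy number a ratio near 1.0.

```python
def relative_signals(peak_areas, control_probes):
    """Divide each peak area by the sum of the control-probe peak areas of the same sample."""
    control_sum = sum(peak_areas[probe] for probe in control_probes)
    return {probe: area / control_sum for probe, area in peak_areas.items()}


def dosage_ratios(patient_areas, reference_areas, control_probes):
    """Ratio of the patient's relative probe signal to that of the reference DNA."""
    patient_rel = relative_signals(patient_areas, control_probes)
    reference_rel = relative_signals(reference_areas, control_probes)
    return {probe: patient_rel[probe] / reference_rel[probe] for probe in patient_rel}


# Hypothetical peak areas (arbitrary units); probe names are placeholders.
control_probes = ["ctrl_1", "ctrl_2"]
reference = {"ctrl_1": 1000.0, "ctrl_2": 1200.0, "5p_probe_A": 800.0, "5p_probe_B": 900.0}
patient = {"ctrl_1": 900.0, "ctrl_2": 1080.0, "5p_probe_A": 360.0, "5p_probe_B": 810.0}

ratios = dosage_ratios(patient, reference, control_probes)
for probe, ratio in ratios.items():
    print(f"{probe}: {ratio:.2f}")
```

In this made-up example, `5p_probe_A` gives a ratio of 0.5 (compatible with a one-copy loss) and `5p_probe_B` a ratio of 1.0.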

SNP-Array Analysis
A genome-wide scan of 850,000 tag SNPs (Infinium CytoSNP-850k BeadChip, Illumina, San Diego, CA, United States) was performed at INGEMM in all but three patients, who were analyzed by array-CGH at ECEMC. SNP-array data were analyzed using the Chromosome Viewer tool contained in the Genome Studio package (Illumina). In Chromosome Viewer, gene call scores <0.15 at any locus were considered "no calls." In addition, an allele frequency analysis was applied to all SNPs. All genomic positions were established according to the 2009 human genome build 19 (GRCh37/NCBI build 37.1). Deletion sizes were plotted (Figure 2 and Supplementary Data) using the University of California at Santa Cruz Genome Browser.

Validation of Global Functional Assessment of the Patients (GFAP)
We estimated individual functional assessment in our cohort using different features taken from the questionnaires, weighting them by Human Phenotype Ontology (HPO) term frequencies on a numerical scale of five "nuclear" items of the syndrome, based on our clinical experience. A final patient assessment (GFAP) was constructed as the sum of items "(i) to (v)," as indicated in Supplementary Table 1; its validation is explained in the "Results" section.

Statistical Analysis
Statistical analysis was performed with SPSS version 25 (IBM Corporation, United States). Descriptive analysis included mean ± SD for continuous variables and frequency tables for categorical variables. These categorical variables were coded as 1 or 0, that is, grouped as "ever" having a given condition versus "never" having the condition, as taken from the two questionnaires and curated from medical records. Correlations were calculated using Pearson's linear correlation coefficient (continuous variables) or Spearman's rho and Kendall's tau_b (categorical variables). Comparisons between two groups (defined by sex or by the presence of additional rearrangements) were performed either by Student's t-test (for continuous variables) or by chi-square tests (for categorical ones). For more than two groups, ANOVA (with Bonferroni's post hoc tests) was run for continuous variables, and z-tests between column proportions for categorical variables. PCA (principal component analysis) was used to validate our GFAP construct, together with the Kaiser-Meyer-Olkin measure and Bartlett's test. Ward's minimum variance method was the criterion used in hierarchical cluster analysis, and the number of clusters was selected using the Bayesian information criterion (BIC) or Akaike information criterion (AIC). A P-value (observed significance level) lower than 0.05 or 0.01 was considered to indicate a statistically significant or very significant difference, respectively.
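As an illustration of the two-group comparison of "ever"/"never" variables, the following Python sketch computes a Pearson chi-square statistic for a 2 × 2 table. It is not the SPSS procedure itself: it applies no continuity correction, the function name is ours, and the feature counts are hypothetical (only the group sizes, 47 and 23, are taken from the cohort). For one degree of freedom, the P-value can be obtained from the complementary error function.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are the two groups; columns are "ever" and "never" having the condition.
    Returns the statistic and the P-value for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 df, chi2 is the square of a standard normal deviate.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: a feature present in 10 of 47 simple deletions
# and in 12 of 23 deletions with additional rearrangements.
chi2, p_value = chi_square_2x2(10, 37, 12, 11)
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

With these invented counts, the statistic is about 6.84, which would fall below the 0.05 threshold used in this study.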

RESULTS
Clinical Findings
We evaluated 70 previously unreported individuals. All but three were from Spain (see Figure 1 and Supplementary Data). The female/male ratio (2.04:1; 47/23) was very similar to that of previously described cohorts, and ages ranged from birth to 45 years (see Supplementary Table 2). Most individuals with 5p-Sd in our cohort were of pediatric age (between 0 and 12 years, 77.23%). Descriptive statistics (for continuous variables) and frequencies (for categorical items) are shown in Tables 1 and 2, respectively. The mean and median ages at evaluation were 8 years and 9 months and 7 years, respectively (Table 1).

Perinatal and Neonatal Data
Regarding neonatal data, the average gestational age of our cohort was 38.28 ± 2.59 weeks (Table 1). Roughly 53% (37 subjects) were born between weeks 39 and 40. Nineteen individuals were born before the 38th week of gestation, three of them before week 32, and 27 after week 40. The average birth weight was 2602.13 ± 677.50 g (below the 5th centile; Marinescu et al., 2000), which corresponds to the average weight of a neonate of 35-36 weeks (50th centile), and the average length was 45.89 ± 3.90 cm (below the 5th centile; Marinescu et al., 2000). Finally, the mean occipitofrontal circumference (OFC) at birth was 32.20 ± 2.42 cm (below the 5th centile; Marinescu et al., 2000). More than one-third of subjects were hospitalized at birth. The main causes were prematurity, low weight, and suspected chromosomal abnormality. During the first months of life, several individuals also had feeding difficulties.

Postnatal Clinical Findings
The frequencies of clinical features observed in this cohort were recorded using the HPO terms and are listed in Table 3.
In Table 3, we also list data from previously published series of 5p-Sd individuals (Cerruti Van Buggenhout et al., 2000; Rodríguez-Caballero et al., 2012; Espirito Santo et al., 2016; Rodrigues de Medeiros, 2017; Honjo et al., 2018; Chehimi et al., 2020). Figure 1A shows that facial features are not always typical of the syndrome and that a specific gestalt is not always present. Nonetheless, microcephaly, large nose bridge, epicanthal folds, hypertelorism, high arched palate, downturned corners of the mouth, round face, ear anomalies (Figure 1B), dental alterations (Figure 1C), short philtrum, micrognathia, and feeding difficulties were present in around 60% of patients or more. These should be considered, in addition to hypotonia, typical cry/acute voice, breathing problems, and behavior anomalies, as the commonest features of this syndrome (Table 3). On the other hand, alterations of the hands or feet (see Figures 1D,E), hyperlaxity, divergent/convergent strabismus, down-slanting palpebral fissures, stereotypies, gastrointestinal anomalies, short neck, scoliosis, cardiac anomalies, and speech delay were present in 25-59% of the cases and should be considered frequent findings in the syndrome (Table 3). It is remarkable that many of the so-called "nuclear clinical features" (the most frequent findings) seemed to be interrelated. Indeed, they showed significant positive correlations with one another when a Kendall's tau_b analysis was performed. For example, microcephaly, present in more than 65% of the cases, correlated with epicanthus, narrow nasal bridge, or ear alterations. Figure 2 summarizes some of these very significant intra-correlations (P ≤ 0.01), e.g., for microcephaly. The expanded analyses of these correlations are summarized in Supplementary Table 3.
"1" means "ever" having a given condition compared to "0," "never" having the condition, taken from either of our two questionnaires and curated from medical records.
Brain MRI studies were performed in almost 75% of the individuals, though only 28.6% showed some kind of alteration (Table 2), including cerebellar tonsillar herniation, abnormalities of the corpus callosum (ranging from thinning to agenesis), frontal horn ectasia, brainstem hypoplasia, dilated ventricular system, cysts, or hydrocephalus. Electroencephalograms showed normal results in only a small number of individuals (12/70, 22.90%) (Table 2).
Speech abilities (evaluated only in patients aged ≥3 years; n = 56) showed severe abnormalities in the majority of patients (40/56; ∼71.50%). In fact, 30.35% of patients (17/56) had no speech at all, 41.07% (23/56) had an elementary vocabulary of 10 words or less, and 28.57% (16/56) were reported to have a mild vocabulary and the ability to use limited phrases for a short and comprehensible conversation (Tables 2, 3).
As examples of comorbidities, almost 33% of the cohort underwent at least one surgery (ranging from one to seven, Table 1); indications included ventricular septal defect (VSD), percutaneous closure of patent ductus arteriosus, closure of patent foramen ovale, duodenal atresia, strabismus, and inguinal hernia (the most frequent).

Genetic Findings
Breakpoint Data Analysis
SNP-array analysis was performed in all cases except three patients, who were studied by comparative genomic hybridization (CGH) array. Genomic coordinates for microdeletions affecting the short arm of chromosome 5 and other genomic rearrangements are listed in Table 4. A graphic representation of the deletions is shown in Supplementary Figure 2. Briefly, the average size of the losses was 20.21 ± 9.28 Mb (range 0.62-35.01 Mb). SNP arrays established the existence of other clinically significant genomic rearrangements in almost 39% of the patients (Table 4); most of them had not been previously detected by cytogenetic studies (see Table 4 and Supplementary Data). Most subjects had terminal deletions (65/70, 92.85%), and five individuals carried interstitial deletions (represented in Supplementary Figure 2B). Among the terminal deletions, 16 (22.85%) had an additional terminal duplication on another chromosome, which could result from a translocation (de novo or inherited). Cytogenetic analysis of the parents allowed us to establish whether the rearrangements were familial (6 cases) or de novo (10 cases). In one case, the 5p deletion was inherited from a maternal mosaicism (6.5% of cells in blood) that was unknown until the child was diagnosed. We found patients with additional terminal deletions on other chromosomes (two cases, 2.85%) and additional rearrangements on chromosome 5 near the deletions (seven cases, 10%). Finally, three children inherited a simple, isolated terminal deletion from their mothers.

Individual GFAP
The great heterogeneity observed in patients with this syndrome, together with the high number of other significant genomic rearrangements (besides the 5p deletions), raised the question of whether these additional rearrangements functionally modulate the clinical features of the syndrome and could explain the high intra-cohort variability. We proposed a grading of the individual global functional assessment (GFAP), constructing one continuous variable based on the frequency of the different "nuclear" clinical items (i to v, see section "Materials and Methods") and on our clinical experience with the syndrome.
To verify the construction of this GFAP scale, a combined statistical analysis (Kaiser-Meyer-Olkin measure, Bartlett's test, and principal component analysis, PCA) was performed to find the best association between these grouped clinical features. The first principal component (PCA1) accounted for the largest share of the variance, supporting that PCA1 can be written as a weighted average of the five original variables. Finally, Pearson correlation analysis showed that PCA1 and GFAP are very significantly correlated (Pearson correlation value = 0.846; P = 0.001). The scatter plot shows the strong linear correlation between them and therefore supports GFAP as a valid construct (Figure 3). Table 5 shows the median and mean ± SD values for GFAP and its intermediate "functional" components for the whole cohort and for both subpopulations: simple deletions (47 cases) and patients with deletions and additional rearrangements (mainly duplications, 23 cases).
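The logic of this validation (extract the first principal component of the five items, then correlate it with the summed score) can be sketched as follows. This is a minimal illustration under several assumptions: the item scores are invented, the first component is obtained by power iteration on the covariance matrix (SPSS may instead work on the correlation matrix), and the function names are ours rather than part of the study's pipeline.

```python
import math


def pearson(x, y):
    """Pearson's linear correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_principal_component(rows, iterations=500):
    """Return (loadings, scores) of the first principal component.

    The loadings are found by power iteration on the sample covariance matrix.
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(k)) for c in centered]
    return v, scores


# Hypothetical scores for items (i) to (v) in eight patients.
item_scores = [
    [1, 1, 0, 1, 1],
    [2, 2, 1, 2, 1],
    [3, 2, 2, 3, 2],
    [0, 1, 0, 0, 1],
    [4, 3, 3, 4, 3],
    [2, 1, 1, 2, 2],
    [3, 3, 2, 2, 3],
    [1, 0, 1, 1, 0],
]

loadings, pc1_scores = first_principal_component(item_scores)
summed_score = [sum(row) for row in item_scores]  # GFAP-like total
r = pearson(pc1_scores, summed_score)
print(f"Correlation between PC1 and the summed score: {r:.3f}")
```

When all items are positively correlated, as in this toy data set, the first component has positive loadings on every item and therefore tracks the simple sum closely, which is the property the validation relies on.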

Among Subpopulations With and Without Additional Rearrangements
A chi-square test was performed to compare categorical variables in both groups: simple (isolated) 5p deletions and 5p deletions with additional rearrangements (mainly duplications). Interestingly, the presence of additional rearrangements was associated with significant differences in prenatal and postnatal growth delay, cardiac anomalies, and expressive language abilities (Table 6, P ≤ 0.05, at CI 95%).
Remarkably, other findings became significant at a CI of 90%, such as cleft lip/palate, renal anomalies, autistic spectrum disorders (ASD), or breathing difficulties (Table 6). Ward cluster analysis allowed us to compare the frequencies of these variables in both subpopulations. We noted that better figures (lower percentages) were more common in the simple 5p deletion group, although for motor items they were only slightly better than in the group with additional rearrangements (Table 5 and Supplementary Data). Although the simple deletion group had larger 5p deletions on average (see Supplementary Table 5), no statistically significant differences could be observed between the two subpopulations (Student t-test, see Figure 4).
We also performed an association analysis among categoric variables in both subpopulations by Kendall's tau_b analysis (expanded analysis for the whole cohort and subgroups is presented in Supplementary Table 3). Interestingly, some of the observed correlations in the simple 5p deletion group disappeared in the group with additional rearrangements (Figure 5). A more specific example for three of these categoric variables is presented in Supplementary Figure 3.
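For readers unfamiliar with Kendall's tau_b on "ever"/"never" variables, the following Python sketch computes it by direct pair counting. The two binary vectors are hypothetical and the function name is ours. For two 0/1 variables, tau_b coincides with the phi coefficient of the corresponding 2 × 2 table.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau_b by pair counting, with the usual correction for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes to neither term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        raise ValueError("tau_b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical "ever" (1) / "never" (0) codes for two features in ten patients.
feature_a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
feature_b = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

tau = kendall_tau_b(feature_a, feature_b)
print(f"tau_b = {tau:.3f}")
```

In this toy example, 15 pairs are concordant and 1 is discordant, giving tau_b = 14/24 ≈ 0.583.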

Genotype-Phenotype Correlations
We performed Ward's hierarchical cluster analysis using "size of deletion" as the only variable, in order to verify how individuals (initially, from the whole cohort) group according to their deletion size. Individuals were finally grouped into four clusters (the number was established by BIC and AIC algorithms), as follows: 4.97 ± 1.83 Mb, 14.64 ± 2.31 Mb, 24.01 ± 1.38 Mb, and 29.95 ± 2.93 Mb (Figures 6A,B). ANOVA ruled out a significant association between the size of the deletion and the functional item, GFAP, or any of its intermediates (P = 0.07 at CI 95%, data not shown). However, ANOVA for continuous variables and chi-square tests for categorical variables showed significant differences between clusters in a few variables, mostly related to perinatal parameters, some dysmorphic features, behavior, and cognitive features (Figure 6C). Further, Bonferroni's and z-tests for the previously significant variables revealed that cluster 3 (size 24.01 ± 1.38 Mb; 5p15.1-p14.1) is the most represented among the cluster pairs with significant differences (Figure 6C). When Ward's clusters were dissected by item frequencies (in percentages), higher percentages (normally associated with a worse prognosis) also seemed to map to cluster 3 (Table 6 and Supplementary Data). However, expressive language (followed by the item "ability to make short sentences") or the ability to write or read was associated preferentially with clusters 1 and 2 (11/16 individuals, 69%). Figure 7 shows how the four clusters integrate into the functional areas of chromosome 5p suggested by several previously published studies. We observed some relevant mapping findings, such as the item speech delay, which mapped to the most telomeric region.
TABLE 5 | Mean (±SD) and median of GFAP (Global Functional Assessment of the Patient) and its intermediates (items "i" to "v") from the whole cohort and subpopulations of 5p-individuals.
We further analyzed these possible differences among clusters (by size of deletion) in the two subpopulations of 5p-Sd individuals (simple, isolated 5p deletions vs. 5p deletions plus additional rearrangements), using the same statistical approach presented above. Supplementary Figure 4A showed a similar result for simple deletions. We also found intra-cluster significant differences for some variables, with cluster 3 again as the most representative cluster for significant differences in the pair of cluster comparisons (Figure 4A and Supplementary Data). Remarkably, one of these variables that showed differences among clusters was GFAP. For analysis of the group with additional rearrangements, we generated only two clusters for comparison (due to the number of individuals) but also denoted significant differences between clusters "A" and "B" (now, cluster "A" aggregates clusters 1 and 2 and "B" clusters 3 and 4, Supplementary Figure 4B).
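The clustering step used in this section can be illustrated with a small Python sketch of Ward's minimum variance method applied to a single variable. It is a simplified, assumption-laden example: the deletion sizes are invented, the number of clusters is passed in directly (whereas in this study it was selected by BIC and AIC), and the function name is ours. At each step, the pair of clusters whose merge produces the smallest increase in within-cluster sum of squares is joined.

```python
def ward_clusters_1d(values, n_clusters):
    """Agglomerative clustering of one variable with Ward's criterion.

    The cost of merging clusters A and B is
    |A| * |B| / (|A| + |B|) * (mean(A) - mean(B)) ** 2,
    i.e., the increase in the within-cluster sum of squares.
    """
    clusters = [[v] for v in sorted(values)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
                cost = len(a) * len(b) / (len(a) + len(b)) * (mean_a - mean_b) ** 2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(clusters, key=lambda c: sum(c) / len(c))


# Hypothetical deletion sizes in Mb (not the cohort data).
deletion_sizes = [4.0, 5.0, 6.0, 14.0, 15.0, 16.0, 23.5, 24.0, 24.5, 30.0, 31.0, 32.0]

clusters = ward_clusters_1d(deletion_sizes, n_clusters=4)
for index, cluster in enumerate(clusters, start=1):
    print(f"cluster {index}: mean = {sum(cluster) / len(cluster):.2f} Mb, n = {len(cluster)}")
```

The exhaustive pair search is quadratic at every step, which is acceptable for a cohort of this size; a dedicated statistics package would be preferable for larger data sets.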
Finally, Pearson correlation analysis established that the size of the deletion correlated inversely with some neonatal parameters, such as weight and OFC (P ≤ 0.001), and approached significance for birth length (P = 0.061). However, the most significant genotype/phenotype correlation was observed between deletion size and gender (males, 15.79 ± 8.79 Mb vs. females, 22.38 ± 8.84 Mb; Student t-test, P = 0.004).
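The sex comparison of deletion size can be approximately reproduced from the summary statistics alone. The sketch below computes a pooled-variance Student t statistic; it assumes group sizes of 47 females and 23 males (the cohort's sex distribution, under the assumption that deletion size was available for every individual), and the function name is ours.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic (equal variances assumed) from means, SDs, and group sizes."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Deletion size in Mb: females vs. males (group sizes are an assumption, see above).
t_stat, df = pooled_t_from_summary(22.38, 8.84, 47, 15.79, 8.79, 23)
print(f"t = {t_stat:.2f}, df = {df}")
```

Under these assumptions, the statistic is approximately 2.93 with 68 degrees of freedom, which appears broadly consistent with the reported P = 0.004.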

Male vs. Female Comparative Analysis
A chi-square test was performed for the whole cohort and for the two subpopulations. Table 7i shows the statistically significant differences between males and females in the whole cohort. These differences were mostly related to growth delay (prenatal and postnatal), dysmorphic features, some spinal comorbidities, and behavioral and cognitive aspects. In addition, Ward cluster analysis between males and females showed the worst frequencies (in percentages) in females (Table 7 and Supplementary Data). As expected, neonatal data also showed significant differences between sexes in weight and OFC at birth (P ≤ 0.01 and P ≤ 0.05, respectively, Student t-test) and a trend for length at birth (P = 0.074, Student t-test). Most remarkably, there were also significant differences in the functional GFAP (P = 0.05, Student t-test). These differences showed higher frequency values (mainly, a worse prognosis) in females. Similarly, we compared male vs. female differences for all categorical variables in both isolated deletions and deletions plus additional rearrangements (Tables 7ii,iii). Significant differences between sexes were found independently of the group. The only significant difference in common was intrauterine growth retardation (IUGR). Again, the most remarkable finding with significant differences in the simple 5p deletion group was GFAP (Table 7ii), which was not significant in the group with additional rearrangements (Table 7iii).
FIGURE 5 | Schematic representation of the comparison between subpopulations in 5p-individuals (5p deletions vs. 5p deletion plus additional rearrangements) in categoric variables using Kendall's tau_b statistical analysis. Very significant differences (P ≤ 0.01) were denoted in bold. Circles denote significant differences among variables observed in the 5p deletion group and absent in the 5p deletion + additional rearrangement group.
On the other hand, patients with additional rearrangements also showed significant differences in neonatal data, such as weight or OFC at birth, again as the whole cohort, showing better numbers in males than in females. Expanded Student t-test analysis is shown in Supplementary Table 8.

DISCUSSION
In this work, we describe the largest cohort of Spanish patients with 5p-Sd and one of the largest series of these patients so far, characterized by means of chromosomal microarray analysis (CMA) and other genetic approaches, such as cytogenetics, MLPA, and FISH. Although its prevalence is still unknown, it has been estimated at around 1:15,000-50,000 (Niebuhr, 1978; Higurashi et al., 1990). Our data showed that 5p-individuals may have a high clinical variability that is also accompanied by a high genetic heterogeneity. In fact, individuals with 5p-syndrome do not always carry a single rearrangement. In our cohort, around 39% of the individuals presented an additional clinically significant genomic rearrangement, mainly a duplication on another chromosome. In other cases, additional deletions and duplications can be observed near the main 5p deletion (seven cases), probably as a result of a complex rearrangement, as has been previously suggested (Gu et al., 2013). These findings raised the question of whether additional genomic rearrangements play a role in the syndrome and may thus explain part of its variability, and of whether individuals with additional rearrangements should be considered as having 5p minus syndrome.
We described and compared our cohort with other previously reported series in terms of clinical features. Some limitations of this study stem from the fact that part of the information was taken from questionnaires filled in by parents or caregivers. This could explain part of the differences among subjects. We strongly recommend systematic codification of clinical features using the HPO system.
We think that frequency-weighted HPO terms grouped into five main nuclear features of the syndrome will help clinicians to describe 5p-Sd patients (Table 3). We built a scoring scale called GFAP (see sections "Materials and Methods" and "Results"). We compared this "functional" GFAP and its intermediate components in order to establish putative significant differences between both subpopulations: simple, isolated 5p deletions and 5p deletions with additional rearrangements. However, no statistically significant differences (Student t-test) could be observed between the two subpopulations for the GFAP variable, although several significant differences were noted among other clinical features. The most relevant were the associations of cardiac anomalies and speech delay with the presence of additional rearrangements. Regarding behavioral aspects, there were significant differences between subpopulations in sleeping problems, stereotypical or aggressive behavior, and number of behavioral problems, which were more common in the group with an additional genomic rearrangement. Thus, the latter showed better numbers in some cognitive items than simple 5p deletions.
FIGURE 6 | (A) Ward's hierarchical cluster of the whole cohort by size of the deletions. BIC and AIC determined to be grouped by four clusters. (B) Plot segregation ordered by deletion size in Mb. Every cluster showed the GFAP value. (C) ANOVA analysis for continuous variables or by chi-square test for categoric variables was performed to establish the existence of significant differences between clusters. Further, Bonferroni's test and z-test for previous significant variables revealed cluster pairs with significant differences among them.
Altogether, and based on the statistical analysis, the presence of additional duplications did not have a significant impact on the overall phenotype of the 5p-patient, but it might make specific contributions to some clinical findings, such as growth delay (either prenatal or postnatal) and cardiac anomalies.

Genotype-Phenotype Correlations
Some authors have previously stated that the severity of the phenotype and the cognitive delay of 5p-Sd were associated with an increased size of the deletion at chromosome 5p (Wilkins et al., 1983; Cornish et al., 1999; Cerruti Mainardi et al., 2001). However, this was not confirmed by others (Marinescu et al., 1999; Espirito Santo et al., 2016). Thus, this aspect is still controversial. We used our "functional" construct GFAP to test this hypothesis. Our data supported these genotype-phenotype correlations only in simple deletions. Since few publications on this syndrome incorporate microarray data, it cannot be determined whether this also holds in other cohorts. For instance, Cerruti analyzed genotype-phenotype correlations but only in patients with isolated 5p terminal deletions (151/185 cases).
Furthermore, if part of the huge phenotypic variability observed among 5p-individuals is not related to the size of the deletion, the other possibility is that it depends on the location of the deletion, since this is a chromosomal region with an important gene content. Our data supported that specific regions of chromosome 5p may have more significant roles in the syndrome than others. Our analysis of clusters (by size of the deletion) showed that cluster 3 was the most relevant among the cluster pairs with statistically significant differences, both in the whole cohort and in the subpopulation groups. In fact, the worst frequencies of most categorical items, as well as of GFAP and its intermediates, in cluster 3 seem to support this observation. Cluster 3 mapped at 18-25 Mb from the telomere (chromosomal bands 5p15.1-5p14.1). Among the genes mapping in this area is the cadherin (CDH) cluster, including CDH10, CDH9, CDH12, CDH18, and CDH6, strongly associated with this syndrome. This CDH cluster has been described to be conditionally haploinsufficient, depending on other genetic or environmental factors to produce an abnormal phenotype. This is an interesting fact and could also explain part of the variability observed in 5p-Sd. Other genes in this region are FBXL7, MARCH11, FAM134B, MYO10, DROSHA, PDZD2, GOLPH3, MTMR12, ZFR, SUB1, NPR3, and TARS. All these genes have a significant level of haploinsufficiency (see Supplementary Table 9). However, we cannot rule out a role for other genes such as CTNND2, TERT, and MED10, commonly deleted in 5p-Sd and associated with neuronal development/function and cellular death. The smallest region of overlap among patients with interstitial deletions pointed to two potential regions, one mapping at these genes and the other in the cadherin cluster. Additional interstitial cases and functional assays are needed to unravel the role of all these genes.
FIGURE 7 | Integrative map for clusters shown in Figure 6. Comparisons with previously reported critical regions for phenotype signs at 5p minus syndrome (references refer to clinical symptoms reported in individual families with interstitial deletions in different reported works). Circles represent the areas mapped for the different clusters obtained in Ward's analysis. Major findings observed in our study are in italic text. Chromosome bands are reported according to ISCN. ID, intellectual disability.

Gender as a Differentiating Factor: Correlations Depending on Gender
Parents, caregivers, and several clinical specialists have repeatedly suggested to us that there may be cognitive and "functional" differences between male and female patients. This is the first report showing "functional" differences between males and females in 5p-Sd individuals.
We found that some of the clinical features analyzed showed statistically significant differences between males and females, for instance in the GFAP variable. We noted worse functional scores and larger deletion sizes in females than in males using Ward's cluster analysis. Additional efforts with systematic cognitive-behavioral evaluations of the patients are needed in order to define these differences more precisely. The reason why the female/male ratio is 2:1 is still unknown. One of the most relevant differences between genders is the mean size of the deletions. Interestingly, Ward's cluster analysis allowed us to observe how the female/male ratio was modulated by the different sizes of the deletions in the clusters (Tables 5, 7 and Supplementary Data). The number of males in these clusters decreased drastically when the size of the deletion exceeded 15 Mb. This may suggest a different, possibly lethal, effect of deletions over 15 Mb in males and might explain the skewed female/male ratio in this syndrome. In fact, miscarriages are frequent in this syndrome. Such an effect would not be unusual, because other genes at 5p13.1, such as RICTOR and DAB2, have been suggested to be haplolethal (Peng et al., 2020), which may explain why deletions do not extend beyond 39 Mb from the telomere. However, we cannot rule out any other additional genetic or epigenetic effect in males affecting chromosomal bands 5p15.1-p13.2. In fact, aberrant DNA methylation in Cri du chat syndrome related to developmental conditions has already been suggested (Naumonova et al., 2018).

CONCLUSION
In summary, we here report a large series of patients with 5p minus syndrome, emphasizing some phenotype-genotype correlations. Remarkably, we found statistically significant "functional" differences between males and females. We also dissected subpopulations in 5p-Sd based on the presence or absence of clinically significant additional rearrangements, besides losses at the 5p arm. The presence of these additional rearrangements may play a role in modulating part of the phenotype of the syndrome.
Finally, we recommend combining typical karyotyping with CMA as the definitive method for a precise diagnosis of 5p-Sd, in order to provide a more accurate genetic counseling for these families.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are publicly available. This data can be found in the DECIPHER database (https://www.deciphergenomics.org/) with the following accession numbers: 436269 to 436336 corresponding to cases 5pIMG01-5pIMG70.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Institutional Review Board Hospital la Paz. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
JN conceived the presented idea, completed the data analysis, and wrote the manuscript. JN and PL designed the study. JN coordinated the data acquisition and collected the data from the second questionnaires. AS-T created and managed the final 5p database. PL, AH, CB-L, and BS-R assisted with the data management and statistics. CB-L analyzed the behavioral and cognitive profiles of the patients. JN, PB, and MM-Á managed the SNP and CGH microarrays at INGEMM. EM, IV, and FG-S provided the FISH and karyotyping studies at INGEMM. FS-S and PL provided and examined several patients at INGEMM. MM-Fr and MM-Fe created the first questionnaire and managed some of the patients' cytogenetic analyses. All authors contributed to the article and approved the submitted version.