The Hydropathy Index of the HCDR3 Region of the B-Cell Receptor Identifies Two Subgroups of IGHV-Mutated Chronic Lymphocytic Leukemia Patients With Distinct Outcome

The HCDR3 sequences of the B-cell receptor (BCR) undergo constraints in length, amino acid use, and charge during maturation of B-cell precursors and after antigen encounter, leading to BCRs and antibodies with high affinity for specific antigens. Chronic lymphocytic leukemia (CLL) consists of an expansion of B-cells with a mixed immature and “antigen-experienced” phenotype, with either a mutated (M-CLL) or unmutated (U-CLL) tumor BCR, associated with distinct patient outcomes. Here, we investigated the hydropathy index of the BCR of 138 CLL patients and its association with the IGHV mutational status and patient outcome. Overall, two clearly distinct subgroups of M-CLL patients emerged, based on a neutral (mean hydropathy index of -0.1) vs. negatively charged BCR (mean hydropathy index of -1.1), with molecular features closer to those of B-cell precursors and peripheral/mature B-cells, respectively. Although M-CLL with neutral HCDR3 did not show traits associated with a mature B-cell repertoire, important differences in IGHV gene usage of tumor cells and patient outcome were observed in this subgroup of patients when compared to both U-CLL and M-CLL with negatively charged HCDR3 sequences. Compared to M-CLL with negatively charged HCDR3 sequences, M-CLL with neutral HCDR3 sequences showed predominance of men, more advanced stages of the disease, and a greater frequency of genetic alterations, e.g., del(17p), together with a higher rate of disease progression and shorter time to therapy (TTT), independently of other prognostic factors. Our data suggest that the hydropathy index of the HCDR3 sequences of CLL cells allows the identification of a subgroup of M-CLL with intermediate prognostic features between U-CLL and the more favorable subgroup of M-CLL with a negatively charged BCR.


INTRODUCTION
Chronic lymphocytic leukemia (CLL) is the most prevalent leukemia in adults in the Western world. It is characterized by an expansion of mature-appearing CD5+ CD20lo B-cells showing an antigen-experienced CD27+, IgM+, and/or IgD+ unswitched phenotype, in association with either an unmutated (U-CLL) or mutated (M-CLL) B-cell receptor (BCR) (1). At diagnosis, most CLL patients show stable disease with a variable number of tumor B-cells in blood (always >5,000 cells/µl) and bone marrow (BM), in the absence of organomegalies, and they do not require active therapy (2). Despite this, a significant fraction of patients show more advanced disease already at diagnosis or experience disease progression during follow-up, which translates into the need for active cytotoxic therapy (3).
In the last decades, the mutational status of the immunoglobulin (IG) heavy-chain variable (IGHV) genes that code for the BCR has emerged, together with disease stage and tumor cytogenetics, among other variables, as a relevant prognostic factor in CLL (4). Thus, U-CLL patients show a significantly poorer outcome compared to M-CLL (4). Therefore, analysis of the IGHV status is currently part of the core variables investigated in the diagnostic workup of this disease (3,5). Despite this, M-CLL patients have a heterogeneous outcome (6,7).
From a pathogenic point of view, U-CLL cells resemble "pre-germinal center" (pre-GC) B-cells, whereas M-CLL cells mimic "post-GC" B-lymphocytes (8,9). However, tumor cells from both CLL groups typically display a mixed immature (CD5+ CD23+) and "antigen-experienced" (CD27+) B-cell phenotype (10), suggesting they might represent the leukemic counterpart of B-lymphocytes that might have undergone BCR stimulation in the GC (M-CLL) vs. peripheral tissues, following selection of B-cell precursors in BM (U-CLL). In line with this hypothesis, the IGHV1-69/IGHJ6 genes, which show highly similar junctional regions to those of normal peripheral blood (PB) CD5+ GC B-cells, are more frequently represented among U-CLL, supporting a close relationship between U-CLL cells and the B-cells responsible for the natural antibody repertoire (11). This potential relationship is further supported by the fact that most normal CD5+ B-cells isolated from blood correspond to immature and (early) naïve B-cells that express unmutated VH gene regions (12). In turn, B-cell activation via T-cell-dependent antigens leads to the expansion of hypermutated germinal center (GC)-derived B-cells (13), suggesting that M-CLL might be associated with the "classical unswitched memory B-cell" compartment, even though some M-CLL also show BCR features that overlap with those of natural antibodies (14).
Another important biological feature of CLL is the usage of a biased IGHV-D-J repertoire (the so-called "stereotyped" BCR) (9) in around one-third of cases, particularly in U-CLL patients, with important pathogenic and prognostic implications (15,16). In contrast to U-CLL, the higher load of somatic mutations in the BCR of M-CLL cases makes recognition of common amino acid (aa) patterns in the HCDR3 region more difficult (17). However, other HCDR3 characteristics, such as its overall charge and hydropathy index, might also contribute to a better understanding of the ontogeny of tumor B-cells in CLL, the affinity and specificity profile of their BCR, and its relationship with antigen-driven B-cell responses, even at earlier stages of B-cell maturation (18). In fact, the HCDR3 sequence of the BCR undergoes constraints in length, amino acid use, and charge during B-cell development and maturation (19). Consequently, the BCR repertoire of early B-cell progenitors is first focused into what appears to be a preferred range for functional antigen recognition by mature B-cells, and subsequently modified after antigen recognition in order to generate high-affinity antigen-specific antibodies and memory B-cells (19,20). Interestingly, receptor prototypes based on HCDR3 charge and its association with certain V gene characteristics have been defined in CLL cells, with the possibility that such receptor restrictions could reflect selections of the BCR repertoire that have occurred among both antigen-experienced and naive B-cells (21). Despite this, the hydropathy features of HCDR3 and their association with the IGHV mutational status and other clinical and biological features of the disease have not been systematically explored in large series of CLL patients or related to patient outcome.
Here we investigated the hydropathy index of the BCR of tumor cells from 138 CLL patients, and its potential association with other features of the disease, including the BCR mutational status and patient outcome.

Patients and Samples
A total of 138 untreated CLL patients (81 males and 57 females; median age at diagnosis of 63 years (y), range 33-84 y) diagnosed at the University Hospital of Salamanca (Salamanca, Spain) were studied. Most cases (95/138) had Binet stage A CLL, and 43 had more advanced CLL (Binet B, 22; and Binet C, 21 patients). Median follow-up at the time of study closure was 8 y; at that time, 65 patients (47%) had progressed and required therapy and 24 (17%) had died (Table 1). In every patient, genomic DNA (gDNA) from purified CLL B-cells was obtained for molecular investigations. The study was approved by the local institutional Ethics Committee (approval code: CEIC-PI4705/2017). All patients gave their written informed consent to participate in the study in agreement with the Declaration of Helsinki.

IGHV-D-J Gene Rearrangement Studies
Analysis of the tumor IGHV-D-J gene rearrangements was performed by polymerase chain reaction (PCR) of gDNA from fluorescence-activated cell sorting (FACS)-purified tumor CLL cells according to the ERIC protocols (22), as previously described in detail (23,24). For IGHV sequencing, PCR amplicons were subjected to direct sequencing on both strands. Sequence data were analyzed using the IMGT databases and the IMGT/V-QUEST tool (http://www.imgt.org). Classification into the U-CLL vs. M-CLL categories was based on the well-established 98% cutoff for identity to the germline sequence (U-CLL: 98%-100%; M-CLL: <98%) (21).
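The 98% identity cutoff described above can be expressed as a simple rule. The following is only an illustrative sketch: the function name and the input (a percentage of identity to the germline IGHV sequence, such as the value reported by IMGT/V-QUEST) are assumptions made for this example and are not part of the study's workflow.

```python
def classify_ighv_status(germline_identity_pct: float) -> str:
    """Classify a case as U-CLL or M-CLL from IGHV germline identity (%).

    Hypothetical helper: applies the 98% cutoff stated in the text
    (U-CLL: 98%-100% identity; M-CLL: <98% identity).
    """
    if not 0.0 <= germline_identity_pct <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    return "U-CLL" if germline_identity_pct >= 98.0 else "M-CLL"
```

Note that a case with exactly 98% identity falls into the U-CLL category under this rule, matching the ranges given in the text.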

Calculation of the Hydropathy Index of HCDR3 Protein Sequences
To determine the hydropathy index (grand average of hydropathy, GRAVY, score) of the HCDR3 protein sequences, the ProtScale Tool from the ExPASy Bioinformatics Resource Portal (https://web.expasy.org/protscale/) and the amino acid (aa) scale values, as defined by Kyte and Doolittle (25), were used (Supplementary Figures 1A, B). The GRAVY score (GS) was calculated for each HCDR3 sequence by summing the hydropathy index values of all amino acid residues in the individual HCDR3 sequence and dividing the sum obtained by the number of amino acids in that sequence (23) (Supplementary Table 1 and Supplementary Figures 1B, C). Since the HCDR3 hydropathy index in humans follows a Gaussian distribution centered in the neutral/hydrophilic range (average charge: -0.5) (19), each HCDR3 sequence was classified into the neutral HCDR3 (GS ≥ -0.5) or negatively charged HCDR3 (GS < -0.5) categories.
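The calculation described above (sum of the Kyte and Doolittle values divided by sequence length, followed by the -0.5 cutoff) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ProtScale implementation used in the study: the function names are hypothetical, the input is assumed to be an HCDR3 amino acid sequence in one-letter code, and the sequence in the usage note is made up for the example.

```python
# Kyte and Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy_score(hcdr3: str) -> float:
    """Return the GRAVY score: mean Kyte-Doolittle value over all residues."""
    seq = hcdr3.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = sorted(aa for aa in set(seq) if aa not in KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {unknown}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def classify_hcdr3(gs: float, cutoff: float = -0.5) -> str:
    """Apply the cutoff from the text: GS >= -0.5 neutral, GS < -0.5 negatively charged."""
    return "neutral" if gs >= cutoff else "negatively charged"
```

For instance, `classify_hcdr3(gravy_score("ARDYYGSGSYYFDY"))` places this made-up sequence in the negatively charged category, because its tyrosine, serine, aspartate, and arginine residues outweigh the single alanine and phenylalanine.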

HCDR3 Hydropathy Index in M-CLL vs. U-CLL and Its Association With Other Disease Features
Based on the above findings, we subdivided M-CLL patients into cases with neutral HCDR3 (mean GS of -0.1) and patients with negatively charged HCDR3 sequences (mean GS of -1.1) and [...] (Table 2). Thus, Rai stage 0 (p = 0.007) predominated in the two M-CLL patient subgroups vs. U-CLL (Table 2). In contrast, greater median hemoglobin levels (150 vs. 130 g/L, p = 0.004) were found in M-CLL patients with a neutral HCDR3 (but not within those with a negatively charged HCDR3) vs. U-CLL.
Overall, the numbers of PB leukocytes, total T-cells, CD8+ T-cells, and tumor CLL cells in blood were all significantly increased in U-CLL compared with the two M-CLL patient groups, in the absence of significant differences between the latter M-CLL groups (Table 2). In addition, M-CLL with neutral HCDR3 showed an intermediate frequency of cytogenetically altered cases between U-CLL and M-CLL with negatively charged HCDR3 (80% vs. 92% and 74%, respectively), together with a significantly greater proportion of del(17p)+ patients (11% vs. 0% and 2%, respectively; p = 0.02) (Table 2).
Regarding outcome, M-CLL with neutral HCDR3 sequences showed an intermediate rate of disease progression (37%) compared to both U-CLL patients (75%) (p < 0.001) and M-CLL with negatively charged HCDR3 sequences (19%), after a similar median follow-up (Table 2). This was associated with a significantly lower percentage of M-CLL cases with negatively charged HCDR3 sequences that required therapy during the first 2 years after diagnosis (2%) compared to U-CLL (37%, p < 0.001) and M-CLL with a neutral HCDR3 (17%, p = 0.03) (Table 2). This translated into a significantly prolonged TTT among M-CLL with negatively charged HCDR3 sequences compared to both M-CLL patients with a neutral HCDR3 sequence and U-CLL patients (75th percentile TTT (95% confidence interval): not reached vs. 4.2 and 0.9 y, respectively; p < 0.001) (Figure 2B).
Based on the results above, we specifically investigated the prognostic impact of the hydropathy index of the HCDR3 sequence of the tumor cell BCR compared to other clinical and laboratory variables in patients with M-CLL. Among all variables analyzed, Binet stage (p < 0.001), the number of total T-cells (p < 0.001), CD4+ T-cells (p = 0.002), CD8+ T-cells (p = 0.001), basophils (p = 0.04), the size of the tumor B-cell clone in blood (p = 0.003), del(11q) and/or del(17p) (p = 0.04), and the number of cytogenetically altered CLL cells (p = 0.001), in addition to the hydropathy index of the HCDR3 sequences of the tumor B-cell clone (p < 0.001), all showed a prognostic impact in the univariate analysis (Table 3). Multivariate analysis confirmed the independent adverse prognostic impact of a neutral HCDR3 sequence of the BCR (hazard ratio (HR), 12; 95% confidence interval (CI), 1.8 to 81; p = 0.01) together with an advanced Binet stage B/C (HR, 42.8; 95% CI, 1.7 to 1,073; p = 0.02) (Table 3).

Distinctive Molecular Features of the BCR of M-CLL With Neutral vs. Negatively Charged HCDR3 Sequences
Interestingly, no significant differences were found between the two groups of M-CLL patients defined by having a neutral vs. negatively charged HCDR3 as regards the frequency of V(H) gene families used (Table 4). Despite this, both groups of M-CLL patients (with neutral and negatively charged HCDR3 sequences) more frequently used the VH3 gene family, at the expense of a lower frequency of VH1 gene usage, compared to U-CLL patients: 53% and 52% vs. 32% (p = 0.05) and 17% and 8% vs. 47% (p < 0.001), respectively (Table 4). In more detail, usage of the VH1-69 gene was significantly associated with U-CLL (27% vs. 0% and 4%, p < 0.001), while VH4-34 was more frequently used in the two groups of (neutral and negatively charged HCDR3) M-CLL patients vs. U-CLL (17% and 21% vs. 3%, p = 0.02). Interestingly, VH3-7 was significantly associated with M-CLL with neutral HCDR3 sequences (17%) while rarely found in U-CLL (2%) (p = 0.01) (Table 3). Likewise, usage of the D(H)2 genes was more frequently observed in M-CLL with neutral HCDR3 sequences (43%) than in M-CLL with negatively charged HCDR3 (19%, p = 0.02) and U-CLL (22%, p = 0.03) patients (Table 4).
Regarding J(H) gene usage, M-CLL with negatively charged HCDR3 sequences showed a significantly higher frequency of J(H)4 genes (60%) than both M-CLL with a neutral HCDR3 (23%) (p = 0.001) and U-CLL (33%) (p = 0.004) patients (Table 4). Likewise, a significantly lower percentage of M-CLL with negatively charged HCDR3 sequences showed J(H)6 gene usage (21%) compared to U-CLL (48%, p = 0.003) (Table 4). Interestingly, the lower use of JH6 and DH2 genes in M-CLL with negatively charged HCDR3 sequences was associated with different charges of the specific HCDR3 amino acids encoded by these genes vs. U-CLL patients (p = 0.010 and p = 0.003, respectively) (Table 4). In contrast, the charge of the HCDR3 amino acids encoded by JH4 and DH3 genes did not show differences between U-CLL and both M-CLL groups (Table 4). It should be noted, however, that the HCDR3 fraction encoded by nucleotides distinct from those included in the above-referred JH and DH gene sequences always showed a significantly lower charge in M-CLL patients with negatively charged sequences compared to M-CLL with neutral HCDR3 sequences (Table 4).

DISCUSSION
B-cells are a key component of the adaptive immune system (28). Their function is typically triggered through BCR-mediated recognition of specific antigens (28). Specific binding of the BCR to antigens is mostly mediated through unique HCDR3 (and also LCDR3) regions capable of identifying and attaching to complementary epitopes in the recognized antigen (28). For adequate binding to the antigen, electrostatic links with the BCR are required (29). Therefore, the HCDR3 charge plays a critical role in antigen binding to the BCR and recognition by B-cells (30). Importantly, during antigen-driven maturation, B-cells modify their HCDR3 sequences to enhance their affinity for specific antigen triggers (30). This includes acquisition of somatic mutations involving the HCDR3 region, which progressively confer more negatively charged amino acid sequences for higher affinity antigen binding by both the BCR and the future B-cell derived (higher-affinity) antibodies (31). In addition, due to its key role in antigen recognition, the interaction of the BCR with the BM microenvironment also plays a critical role at an earlier stage, during lymphopoiesis, in selecting B-cell precursors that carry a functional BCR (31).
For decades now, studies have accumulated which support an important role for BCR-mediated expansion of tumor cells in CLL (32,33) in the absence of a common genetic driver (6). Thus, CLL cells show biased usage of specific IGHV(D)J gene families, with overrepresentation of some genes such as IGHV1-69, IGHV4-34, and IGHV3-21 (34). Of note, these genes are differentially distributed among the two major prognostic subgroups of CLL defined according to the mutational status of the BCR (U-CLL and M-CLL) (35). Accordingly, U-CLL cells have polyreactive BCRs that may respond to a wide spectrum of epitopes (36,37), as typically required during selection of recently produced immature B-lymphocytes in BM (38), whereas M-CLL cells are more mature B-cells that have undergone somatic hypermutation, whose BCRs are (potentially) less responsive to external signals, while more specific for a given epitope (39)(40)(41). Among other factors, this might also contribute to explain the more aggressive clinical course (42,43) and the shortened survival of U-CLL vs. M-CLL (44). As a consequence, the IGHV gene mutational status currently represents one of the most relevant prognostic determinants in CLL (44). Similarly to normal B-cells (31), here we show that CLL cells also display a Gaussian distribution according to the hydropathy index of their BCR, slightly skewed toward negatively charged HCDR3 amino acid sequences. Interestingly, when we divided our patients into cases with neutral (mean GS of -0.1) vs. more negatively charged (mean GS of -1.1) HCDR3 sequences, two subgroups of CLL patients with clearly distinct clinical and biological features emerged. Thus, CLL patients with neutral HCDR3 sequences showed a clear predominance of men, U-CLL with longer HCDR3 sequences, a lower frequency of IGHV gene mutation, and higher frequency of more advanced stages of the disease, in association with a higher rate of disease progression and shorter TTT. 
Interestingly, shortening of HCDR3 sequences with a trend to negatively charged BCRs is a typical feature of selection of B-cell precursors in BM required for the survival of B-cells that will enter the mature B-cell repertoire (31). Based on these findings, our results suggest that expanded CLL cells in patients with neutral and longer HCDR3 sequences might reflect an earlier tumor cell origin in BM (45)(46)(47). Among other factors, this might also contribute to explain the greater frequency of more advanced stages of disease at diagnosis (48), together with an increased rate of disease progression vs. patients with negatively charged HCDR3 sequences. Nevertheless, these differences could be potentially due to the fact that CLL cases with neutral HCDR3 sequences included a higher fraction of U-CLL vs. M-CLL patients.
To investigate the potential independent value of both variables (the BCR mutational status and its hydropathy index), we separately studied the features of CLL patients with neutral vs. negatively charged HCDR3 sequences among U-CLL and M-CLL cases. Thus, U-CLL cases with a neutral and negatively charged HCDR3 showed similar clinical and biological features associated with a uniformly poorer outcome, in line with previous observations (49)(50)(51). In contrast, the HCDR3 hydropathy index identified two different prognostic subgroups of M-CLL. These included a subgroup of M-CLL with neutral HCDR3 who displayed intermediate clinical, genetic, and prognostic features between the classical U-CLL and M-CLL patients with a negatively charged BCR. Thus, M-CLL with neutral HCDR3 showed predominance of men, similar to that found in U-CLL but with significantly higher hemoglobin levels, in association with a higher frequency of thrombocytopenia and an intermediate frequency of cytogenetically altered cases between U-CLL and the other M-CLL patients, at the expense of a greater frequency of del(17p). At present, it is well established that progression of MBL toward CLL is associated with a more prominent male predominance and greater frequency of U-CLL (52). Male predominance among M-CLL cases with neutral HCDR3 might also help explain the greater hemoglobin levels observed in these patients, which contrast with the higher frequency of thrombocytopenia compared to M-CLL with negatively charged HCDR3 sequences. This, together with the greater frequency of more advanced stages of the disease among M-CLL with a neutral vs. negatively charged HCDR3, would support a poorer outcome within M-CLL for the former patient group, as confirmed here via an adverse impact on the time elapsed from diagnosis to first therapy among M-CLL patients with neutral vs. negatively charged HCDR3 sequences.
From the molecular point of view, M-CLL with neutral HCDR3 showed DJ footprints compatible with a more immature BCR repertoire, associated with preferential usage of D(H)2 gene segments (53) and the absence of the biased use of JH4 gene segments found in M-CLL cases with negatively charged HCDR3 sequences, which is a typical feature of more mature PB B-lymphocytes (54). In addition, we also observed longer HCDR3 sequences in older (>65 y) patients who had U-CLL and M-CLL with neutral HCDR3 vs. M-CLL with negatively charged HCDR3 sequences, in line with what might be expected among older subjects (27). Although U-CLL and M-CLL with neutral HCDR3 shared HCDR3 sequences which typically had no traits associated with a mature B-cell repertoire, important differences were still observed in the IGHV repertoire of CLL cells of both patient groups as regards the usage of the VH1 and VH3 gene segments, further emphasizing the biological differences between them.
Altogether, our findings show that, based on the hydropathy index of HCDR3 sequences, two clearly distinct subgroups of M-CLL patients with different clinical, genetic, and prognostic features can be identified. These subgroups are characterized by neutral vs. negatively charged BCRs, associated with molecular features of precursor vs. peripheral/mature B-cells, respectively. Further studies are needed to elucidate the precise mechanisms involved in determining the role of these different BCR profiles (compared to other prognostic factors such as ZAP70) in the distinct clinical behavior and outcome of both groups of M-CLL patients, and to facilitate implementation of assays for routine assessment of the HCDR3 hydropathy index in M-CLL in clinical settings.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the local Institutional Ethics Committee, University Hospital of Salamanca (code of approval: CEIC-PI4705/2017). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
AR-C, JA, and AO contributed to conception and design of the study. BF, GO, IC, and MA organized the database. CP contributed to bioinformatics calculations. MP and AG-M contributed to genomic collection, storage, and quality control. AG-M performed part of the statistical analysis and critical review of manuscript. MGD, FF and AS-R contributed in the clinical part of the manuscript and critical review of manuscript. AR-C wrote the first draft of the manuscript. All authors contributed to the manuscript revision and read and approved the submitted version.