Original Research ARTICLE
Maturation of the Human Immunoglobulin Heavy Chain Repertoire With Age
- 1Division of Immunology, University Children's Hospital, University of Zurich, Zurich, Switzerland
- 2Children's Research Center, University of Zurich, Zurich, Switzerland
- 3Alchemab Therapeutics Ltd, London, United Kingdom
- 4Department of Statistics, University of Oxford, Oxford, United Kingdom
- 5Immunodeficiency Laboratory, Department of Biomedicine, University and University Hospital of Basel, Basel, Switzerland
- 6Institute of Medical Engineering and Medical Informatics, University of Applied Sciences and Arts Northwestern Switzerland FHNW, Muttenz, Switzerland
- 7SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- 8aiNET GmbH, Basel, Switzerland
- 9Oxford Vaccine Group, Department of Paediatrics, University of Oxford, Oxford, United Kingdom
- 10Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom
B cells play a central role in adaptive immune processes, mainly through the production of antibodies. The maturation of the B cell system with age is poorly studied. We extensively investigated age-related alterations of naïve and antigen-experienced immunoglobulin heavy chain (IgH) repertoires. The most significant changes were observed in the first 10 years of life, and were characterized by altered immunoglobulin gene usage and an increased frequency of mutated antibodies structurally diverging from their germline precursors. Older age was associated with an increased usage of downstream IgH constant region genes and fewer antibodies with self-reactive properties. As mutations accumulated with age, the frequency of germline-encoded self-reactive antibodies decreased, indicating a possible beneficial role of self-reactive B cells in the developing immune system. Our results suggest a continuous process of change through childhood across a broad range of parameters characterizing IgH repertoires and stress the importance of using well-selected, age-appropriate controls in IgH studies.
B cells play a central role in physiological adaptive immune processes and exert their main effector function through production of antibodies (1). B cells also contribute to the pathogenesis of autoimmune disease via generation of auto-reactive antibodies and modulation of T cell responses (2, 3). The heavy and light chains of the B cell receptor (BCR) are generated in the bone marrow by recombining individual variable (V), diversity (D), and joining (J) genes through a process called VDJ recombination. Upon antigen recognition, immunoglobulin heavy (IgH) and light chains of a BCR are further diversified through rounds of somatic hypermutation (SHM) leading to affinity maturation whereby B cells with improved antigen-binding properties are selected in the germinal center. Class switch recombination (CSR) is also initiated following antigen encounter, causing a change in the IgH constant region of the BCR and in its effector function.
Detailed characterization of B cells and their respective BCR sequences offers important information on B cell generation and selection as well as immune competence in health and disease. High-throughput sequencing of antibody genes (AIRR-seq) has become a widely used tool in human translational research (4, 5). Abnormal B cell responses can be explored by investigating IgH repertoires from patients and comparing their characteristics to those of healthy controls. The limited data already available suggest that significant changes occur in the properties of IgH repertoires with age (6). It is therefore important to establish robust data on normal IgH repertoires within sufficiently narrow age-bands to fully understand the process of IgH maturation. This will facilitate the use of AIRR-seq to understand changes of relevance to childhood disease. Given the high burden of infectious diseases in childhood and the importance of effective immune response to vaccines to prevent infection, pediatric individuals constitute an important group from which to have normative data. There are very few studies that have used AIRR-seq to investigate the healthy IgH repertoire, and these studies include a limited age range of participants (7–10). In a more detailed study, Ijspeert et al. reported on the antigen-experienced (i.e., IgA and IgG) IgH repertoires of 38 healthy control (HC) samples with their ages ranging from newborn to 74 years (11). The authors found several characteristics of the studied IgH repertoire varying with age and identified patterns that are specific for isotype subclasses. However, their study was limited by the number of samples from children, the low depth of sequencing, and the small number of B cell subsets analyzed.
We aimed to assess in detail the naïve and antigen-experienced IgH repertoires in children and young adults using isotype-resolved barcoded RNA-based AIRR-seq technology and extensive bioinformatic analysis. This approach allowed us to comprehensively address the age effect on the IgH repertoire in healthy individuals and also provides a robust data set that can serve as a future reference for studying IgH repertoires in children as well as young adults with disease.
Study Participants and Cell Isolation
Healthy individuals who did not have an immunologically relevant disease or a current infection were recruited to the study. Written informed consent was obtained from study participants or their legal guardians including any potentially identifiable data included in this article under ethical approval (KEK-ZH 2015-0555 and EKNZ 2015-187). Blood samples (5–9 mL) were collected at a single time point from 53 healthy participants aged 6 months to 50 years (Supplementary Table 1). Peripheral blood mononuclear cells (PBMC) were isolated by centrifugation of PBS-diluted blood over Ficoll-Paque Plus (Sigma-Aldrich). Either PBMC or B cells magnetically sorted using the AutoMACS Pro cell separator and CD19+ microbeads (both Miltenyi Biotec), were lysed in RLT buffer (Qiagen), snap frozen on dry ice and then stored at −80°C prior to use. Cells were counted using an optical microscope and an improved Neubauer chamber. The B cell number was recorded based on actual counts or estimated using PBMC counts and either B cell frequencies from flow cytometry performed on the same blood sample or the median percentage of age-dependent reference values (12) if the former was not available. No B cell subpopulations were isolated.
RNA Isolation and Library Preparation
RNA was extracted from stored samples using the RNeasy Mini Kit (Qiagen). Reverse transcription was performed using SuperScript III/IV (Invitrogen) according to the manufacturer's instructions and IgH constant region primers that included 14 nt unique molecular identifiers (UMI), and partial p7 adaptors. Two reverse transcription reactions were carried out for each sample: one with a mix of IgM and IgD-specific reverse primers and another with a mix of IgA, IgG, and IgE-specific reverse primers. For 6 samples, one mix with all constant region primers was used in a single reaction. Primer sequences with concentrations are included in Supplementary Table 2. IgH gene rearrangements were amplified in a two-round multiplex PCR; the first round using a mix of FR1 V family specific forward primers with partial p5 adaptors, and the second round to complete the adaptor sequences. PCR conditions for the first round were 95°C for 5 min, either 8 cycles (IgD/IgM) or 12 cycles (IgA/E/G) of 98°C for 20 s, 60°C for 45 s and 72°C for 1 min, and 72°C for 5 min. The PCR conditions for the second round were 95°C for 5 min, 22 cycles of 98°C for 20 s, 69°C for 20 s and 72°C for 15 s, and 72°C for 5 min. PCR amplicons were gel-extracted, purified and quantified using the Illumina qPCR library quantification protocol. Individual libraries were normalized based on concentration and then multiplexed in batches of 24 for sequencing on the Illumina MiSeq platform (2 × 300 bp paired-end chemistry), offering a read length to sequence far enough into the IgH constant region to allow accurate distinction between isotype subclasses.
Sequence Processing, Annotation, and Somatic Hypermutation
Samples were demultiplexed via their Illumina indices, and initially processed using the Immcantation toolkit (13, 14). Briefly, raw fastq files were filtered based on a quality score threshold of 20. Paired reads were joined if they had a minimum length of 10 nt, maximum error rate of 0.3 and a significance threshold of 0.0001. Reads with identical UMI (i.e., originating from the same mRNA molecule) were collapsed to a consensus sequence. Reads with identical full-length sequence and identical constant primer but differing UMI were further collapsed resulting in a dataset containing a set of unique sequences per sample and isotype. Sequences were then submitted to IgBlast (15) for VDJ assignment and sequence annotation using the IMGT germline database as a reference, and unproductive sequences were removed. Constant region sequences were mapped to germline using Stampy (16) for isotype (subclass) annotation, and only sequences with a defined constant region were kept for further analysis. The number and type of V gene mutations was calculated using the shazam R package (14). Levels of somatic hypermutation (SHM) were determined by calculating V gene mutations in individual sequences, and mean values were calculated across samples and cell subsets.
Sequence Clustering, Clonal Lineages, and Antigen-Driven Selection
Sequences were independently clustered for each sample to group together those arising from clonally related B cells. The clustering required identical V and J gene use, identical complementary-determining region (CDR) 3 length, and allowing a 1 in 15 nucleotides mismatch in the CDR3 as previously determined (7). Lineages were constructed from clusters using the alakazam R package (17). To account for read depth variation, lineage trees were constructed on subsamples of the original data. Specifically, we randomly sampled 25,609 unique collapsed sequences (corresponding to the lowest number of reads available for a sample) from every HC sample. For calculation of selection pressure of samples, individual sequences within clusters are not independent events, so an effective representative sequence of each clonal group was determined using the default settings of shazam. Selection pressure was calculated using BASELINe (18) implemented within shazam. The statistical framework used to test for selection was CDR_R/(CDR_R + CDR_S), which normalizes for the observed increase in the total number of mutations with age. The replacement/silent (R/S) mutation ratio was measured separately in framework regions (FWRs) and CDRs. In sequences with replacement but no silent mutations, the number of silent mutations was set to 1.
From Sequence to Structure
The SAAB+ pipeline was employed to annotate IgH repertoires with structural information (19). Briefly, IgH repertoires were numbered with the IMGT scheme (20) and filtered for structural viability using “ANARCI parsing” (21) as per the first steps of the ABOSS algorithm (22). Sequences were filtered out that (i) could not be aligned to the human Hidden Markov Model (HMM) profile of an IMGT germline (ii) had a J gene sequence identity of <50% to a human IMGT germline or (iii) contained non-amino acid entries in CDRs. Since the primer masking step in pRESTO (13) can remove the first framework region and positions 127 and 128 of some sequences, ANARCI parsing was customized to account for these exceptions. To retain as many sequences as possible for structural annotation, we substituted undetermined residues in the framework region with the residues from their respective parent germline genes.
To annotate the numbered sequences with canonical loop class information, SAAB+ employs SCALOP (23) with the IMGT CDR definition (20). The expected coverage of canonical loop class sequences with SCALOP is 93%, where 89% of predicted templates will have root-mean-square deviation (RMSD) values for the backbone atoms within 1.5 Å of the correct structure. The SCALOP database dated July 2018 was used in this study.
SAAB+ employs FREAD (24) to annotate CDR-H3 loops with the Protein Data Bank (PDB) code (25) of the closest crystallographically-solved CDR-H3 structure (template). Only CDR-H3 sequences with loop lengths between 5 and 16 were investigated. The expected average RMSD of CDR-H3 template prediction for the human IgH repertoire data is 2.8 Å, with an expected coverage of 48% (19). PDB templates within a 0.6 Å RMSD radius were clustered together (19), reducing 2,943 PDB templates to 1,169 CDR-H3 PDB clusters.
Statistical Analysis and Graphing
To test for statistical significance, age groups were defined with the following ranges: 0–3, 4–8y, 9–16y, 17–25y, 26–39y and older than 40y. The number of individuals by age group, the age range and mean number of sequences are detailed in Supplementary Table 3. Statistical analysis and plotting were performed using R (26); all plots were produced using the ggplot2 and ggpubr packages (27, 28). Heatmaps were visualized using the ComplexHeatmap R package (29). PCA plots were created using the R package factoextra (30). The Wilcoxon test was used to evaluate the differences between older age group and the youngest 0–3y age group. When comparing variables among the groups, the Kruskal–Wallis test was used. The significance level was set at 0.05. In cases where a model was fitted to the data, the R squared of the model and the p-value of the chi-squared goodness-of-fit test are shown in the bottom right of the graphs. Other specific tests used are detailed in the figure legends.
Classification of Sequences Into Cell Subsets Using Isotype and Number of Mutations
Since no B cell populations were isolated, we used constant region annotation and mutation number to group individual sequences into biologically different subsets based on known B cell subpopulations. Based on the frequency distribution of mutations for IgD and IgM sequences, those with up to 2 nt mutations across the entire V gene were considered “unmutated” (naïve) to account for allelic variance (31) and remaining PCR and sequencing bias (Supplementary Figure 1). All class-switched sequences were defined as antigen-experienced irrespective of their V gene mutation count. Because of very low sequence numbers, IgE and IgG4 transcripts were excluded from most analysis. The number of sequences of the different subsets among total transcripts by individual are found in Supplementary Table 1.
Raw sequence data used for analysis in this study are available at the NCBI Sequencing Read Archive (www.ncbi.nlm.nih.gov/sra) under BioProject number PRJNA527941 including metadata meeting MiAIRR standards (32). The processed and annotated final dataset is available in Zenodo (https://doi.org/10.5281/zenodo.3585046) along with the protocol describing the exact processing steps with the software tools and version numbers.
We obtained 78,702,939 raw sequences from samples of 53 healthy study participants. Processing, filtering and collapsing resulted in a final dataset of 8,341,669 unique IgH sequences used for downstream analysis. The numbers of unique sequences were significantly reduced after UMI-based collapsing resulting in a stronger correlation with the B cell numbers per sample (Supplementary Figure 2).
V Family and J Gene Usages Change With Age
Although previous work has observed common patterns of gene usage and has suggested a strong dependence on an individual's germline genetic background (33, 34), the relative contributions to variance from age remained unclear. Proportions of sequences assigned to the different V gene families and J genes were calculated for each sample and B cell subset. The overall distribution of V family and J gene usage were different in older individuals compared with younger age groups. In particular, frequencies of V1 family sequences significantly decreased with age in naïve and mutated IgD and IgM sequences. This decrease was also observed in IgG and IgA transcripts although with higher individual variation in older age groups (Figure 1A). No clear pattern was found in the usage of the other V families by age (Supplementary Figure 3A). Such changes in V1 family genes were due to age-related alterations in several V genes, particularly VH1-8 (Supplementary Figure 4). In order to assess whether the decreased usage of V1 genes with age was due to a lower number of copies of these genes in older individuals, we calculated the copy numbers of V1 genes and looked for deleted genes using Rabhit (35). Double chromosome deletions in V1 genes were found to be equally distributed across age groups (Supplementary Figure 5A). Haplotyping was inferred for 30 heterozygous individuals for J6 and/or D2-21 genes and copy numbers of V1 genes were not significantly different across age groups (Supplementary Figure 5B).
Figure 1. V family and J gene usage changes in early childhood. (A) V1 family usage was significantly reduced in older compared with younger individuals in all IgH repertoires. (B) J6 gene usage significantly decreased during the first 10 years of life mostly in IgG subsets. (C) Mean junction length significantly decreased in the first 10 years of life exclusively in IgG subsets. Comparison of each age group to the 0–3y group was performed using the Wilcoxon test. *p < 0.05, **p < 0.01, ***p < 0.001.
There were also changes in the overall J gene usage over the first 10 years of life marked by a significant decrease in the frequencies of sequences assigned to J6 in IgG transcripts (Figure 1B). Frequencies of the other J genes by age group are shown in Supplementary Figure 3B. In line with previous work (36, 37), we find that IgH sequences with rearranged J6 gene have longer junctions (Supplementary Figure 3C). Along with a declining J6 usage with age, a significant decrease in junction length was observed in IgG subsets of older individuals (Figure 1C). However, even within IgG J6 transcripts, junction length significantly decreased with age indicating that shorter junctions in older individuals are not solely the result of altered J gene usage (Supplementary Figure 6).
Somatic Hypermutation Exponentially Increases in the First 10 Years of Life
There was a significant increase in SHM in all mutated subsets with age, which was most prominent in the first 10 years of life (Figure 2A). Substantial changes in mutation counts were found in all IgA and IgG subsets with exponential increases in children under 10 years and more linear progression between 10 and 50 years. IgD and IgM memory showed the smallest change of all subsets with some increase in children and a plateau from the 2nd decade. However, the proportion of mutated IgD and IgM transcripts per sample increased from 0.1 in 0–3 year olds to an average of 0.4 in older individuals (Figure 2B). An age-related increase in the proportion of mutated sequences was also seen for IgA and IgG although at a higher level (Figure 2B).
Figure 2. Age-related changes in somatic hypermutation and predicted antibody structure. (A) Mean number of V gene mutations by individual and B cell subset with fitted logarithmic curves. Somatic hypermutation increased mainly in the first 10 years of life with some differences between cell subsets. (B) The proportion of memory IgD/IgM out of all IgD/IgM transcripts and the proportion of mutated IgG and IgA transcripts within repertoires showed significant increases in the first 10 years of life. Statistical differences between groups were tested using the Kruskal–Wallis test. (C) The proportion of sequences structurally different from germline increased in early childhood in all B cell subsets.
Sequences With Predicted Antibody Structures Diverging From Germline Increase With Age
Crystallographic studies have shown that antibody CDR-H1 and CDR-H2 loops can adopt a very limited number of structural conformations, known as canonical loop classes (38, 39). These canonical classes are considered to be separate and distinct structures of the CDRs and can be rapidly and accurately annotated by SCALOP (23). The proportion of sequences in which either CDR-H1 and CDR-H2 had switched from the canonical class of their germline increased with age for most mutated subsets, similar to the increasing mutation number with age (Figure 2C).
Structures of CDR3 were predicted by mapping sequences to antibody structures in the PDB and annotated with a PDB code identifier. The proportion of every PDB cluster within individual and repertoire was calculated and normalized to zero mean and unit variance across individuals. PDB cluster usages were similar across individuals and age with a small number of positive outliers (frequent usage) that were private to each individual (Supplementary Figure 7).
Structural but Not Sequence Clusters Are Commonly Shared Between Individuals
For each of the 53 individuals in this study, we calculated the frequency of sequence clusters (i.e., clonally related sequences) that are unique to the individual, the frequency of clusters that are shared with two, three or more subjects. Overlap with n subjects was quantified as the number of clusters shared with only n individuals divided by the total number of clusters in an individual's repertoire. We found that on average, 84.1% of clusters were unique to the individual, while 2.5, 0.2, 0.04, and 0.02% of clusters were shared with 2, 10, 25, and 45 or more other individuals, respectively (Figure 3A). Sharing of structural clusters, however, was much more frequent with the majority of clusters (57%) shared by all 53 individuals and on average only 0.03% of clusters unique to the individual (Figure 3B). Neither sequence nor structural cluster sharing showed age-related changes.
Figure 3. Sharing of sequence and structural clusters among the 53 healthy participants of different ages. (A) Percentage of sequence clusters shared by n individuals. (B) Percentage of structural clusters shared by n individuals. For structural clusters, zeros were replaced by 0.01% to be displayed on a logarithmic scale but labeled as 0%.
Older Individuals Display More Mature Clonal Lineages and Transcripts With Antigen-Driven Selection
Lineage trees were constructed from clusters of clonally related sequences and used to determine the evolutionary relationship within clusters (Figure 4A). The mean trunk length, representing the distance between the most recent common ancestor and germline sequence as a measure of the maturity of a lineage (40), greatly increased with age (Figure 4B). There was no relationship between age and the Gini index, which predicts whether lineages are dominated by a single clone (high index) or has a broad branching structure (low index) (Figure 4C). To account for differences in read depth, these characteristics were calculated on subsampled data so that the numbers of sequences were similar between individuals.
Figure 4. Age-related changes in clonal expansions. (A) Example lineage tree with each node representing a sequence and the size of the node indicating the number of identical sequences. The number of mutations between the sequences (nodes) is shown on top of the connecting lines. (B) Correlation between age and mean trunk length with a fitted logarithmic curve. (C) Correlation between mean Gini index and age with a fitted linear model.
Insights into the process of antigen-driven selection can be gained by analyzing the mutational pattern in antigen-experienced repertoires. The R/S ratio in CDRs showed a marked increase in all mutated subsets between 0 and 10 years of life (Figure 5A). In samples from study participants older than 10 years, the R/S ratio was largely constant with values of around 3–3.5 in all B cell subsets. In contrast, the R/S ratio was less variable and lower in FWRs compared with CDRs and no association with age was found (Supplementary Figure 8). Next, we determined selection pressure using a Bayesian estimation of antigen-driven selection (BASELINe), which calculates selection by comparing the observed mutations to expected mutations derived from an underlying SHM targeting model (18). In CDRs, there was a general trend toward an age-associated decrease in selection strength for IgA and IgG1 transcripts whereas this was constant across age for IgD or IgM sequences (Figure 5B). The statistical framework used to test for selection was CDR_R/(CDR_R + CDR_S), which normalizes for the observed increase in the total number of mutations with age.
Figure 5. Age-related changes in antigen-driven selection. (A) Mean R/S ratio in V gene CDRs as a measure of selection pressure showed an increase in early childhood in all mutated B cell subsets. For sequences with replacement but no silent mutations, the number of silent mutations was set to 1. (B) Mean selection strength in CDRs calculated using BASELINe decreases with age in class switched subsets.
Usage of IgG2 and IgA2 Subclasses Increase With Age
Subclass usages were calculated within IgA and IgG repertoires to explore age-dependent class-switching patterns. In most age groups, IgG1 sequences were the most commonly detected, followed by IgG2, IgG3, and IgG4 sequences. However, the proportion of IgG2 sequences increased with age (p=0.0140, Kruskal-Wallis by age group) at the expense of lower usage of IgG1 (p = 0.0086, Kruskal–Wallis) and IgG3 (p = 0.1900, Kruskal–Wallis) sequences in older individuals. Similarly, IgA1 was most commonly used in all age groups and there was a non-significant trend toward a higher proportion of IgA2 sequences with age (p = 0.0960, Kruskal–Wallis) (Figure 6).
Figure 6. Usage of IgG and IgA subclasses by age group. The IgG and IgA isotype subclass usage changes with age. Error bars represent standard error of the mean.
Repertoires From Older Individuals Contain More Self-Tolerant Sequences
Although self-reactivity can't be predicted with certainty from AIRR-seq data yet, sequence characteristics known to be associated with self-reactivity can be explored. These include an increased usage of certain V genes, mainly VH4-34, and usage of longer CDR3 with positively charged or hydrophobic residues (41–43). We investigated how these metrics vary with age in healthy individuals. Apart from the decreasing junction length in IgG subsets (Figure 1C), we found that age has no impact on charge or hydrophobicity of IgH repertoires (Supplementary Figure 9). VH4-34 usage was also unrelated to age whereas a more detailed SHM analysis including self-reactive motifs of VH4-34 sequences revealed an age-specific pattern. The VH4-34 germline contains an Ala-Val-Tyr (AVY) hydrophobic patch in FWR1 that is not present in other V genes and is thought to contribute to the self-reactive property of this gene (44, 45). Another feature of the VH4-34 germline associated with autoimmunity is the presence of an Asn-X-Ser N-glycosylation sequon (NHS) in CDR2 that modulates antibody avidity (46). Previous research has shown that mutating one or both of these motifs drives specificity of these sequences away from self, thereby contributing to peripheral tolerance. Lower frequencies of both unmutated AVY and NHS were present in healthy older individuals while there was a relative accumulation of single and double-mutated motifs in VH4-34 with age (Figure 7). This pattern was observed across all mutated subsets but was only statistically significant for IgA and IgG transcripts (p = 0.0110 and p = 0.0036, respectively; p = 0.1800 for IgM/IgD memory; Kruskal–Wallis test).
Figure 7. VH4-34 motifs by age group. Bar plots represent the proportion of sequences with mutated AVY and/or NHS motifs in IgD/IgM, IgG, and IgA. Error bars indicate standard error of the mean. Proportion of sequences with both unmutated motifs decreases with age.
Combining Age-Related Repertoire Features Distinguishes Between Children and Adults
Principal component analysis (PCA) based on the age-driven variables including mutation, R/S ratio, junction length, gene usage and proportion of sequences structurally divergent from germline clearly showed distinct grouping of children younger than 9 years old and individuals older than 10 years old in mutated repertoires (Figure 8). This distinction was most clearly observed in the class-switched IgG and IgA repertoires. In IgD/IgM mutated sequences, children <3 years old were separate from other individuals whereas the repertoire characteristics in older age categories overlapped.
Figure 8. Stratification of IgH repertoires by age group. Principal component analysis by age category including mutation number, R/S ratio, V1 gene family usage, J6 gene usage, junction length and proportion of sequences structurally divergent from germline as variables. For class-switched IgG and IgA, the proportion of IgG2 and IgA1 are included, respectively. Areas are the convex hulls of the age group and the largest point of one color represents the center of that hull.
In this study, we found an extensive maturation of B cell responses in the first 10 years of life consistent with what would be expected with cumulative antigen exposure and a generally more developed and stable B cell compartment in older individuals. Further antibody repertoire alterations continue to be made thereafter, although at a lower rate. Although sample sizes in our study were relatively small in some age groups they were adequate for detecting age-related changes and providing insight into the developing IgH repertoire. The age distribution of our participants, number of samples, number of sequences analyzed and the results presented here constitute the most in-depth evaluation of the IgH repertoire with age. This study also provides a detailed reference data set of isotype and subclass-specific IgH repertoires of healthy individuals across a relevant age range and stresses the importance of using well-selected, age-appropriate controls in future studies.
Development and maturation of B cell repertoires throughout human life have been the subject of prior research. Studies of the immune system in ontogeny have shown that fetal repertoires are characterized by short CDR3 junctions, decreased diversity and a lack of advanced maturation features (47, 48). Evidence of fetal B cells undergoing class-switch recombination and somatic hypermutation has been found in cord blood samples taken as early as 12 weeks' gestation (49, 50). It has been shown that B cell repertoires develop throughout gestation to reach a maturity equivalent of those of postnatal infants by the end of the second trimester (51). However, the B cell repertoire at birth remains considerably different compared with those of adults. Our results reveal the underlying mechanisms of B cell maturation and how the repertoire continues to be shaped from childhood to adulthood. The oldest participant in our cohort was 50 years old, therefore our data does not inform about immunological changes that occur later in life. However, according to previous studies in B cell repertoire aging and immunosenescence, no evidence of an altered mechanism of somatic hypermutation is seen in the elderly. Furthermore, aged repertoires are skewed toward an increased usage of longer CDR3 junctions and a general trend toward the IgG2/IgM memory repertoire is observed (52–55).
Previous studies have suggested that immunoglobulin gene usage is strongly genetically determined as it was conserved between monozygotic twins and across multiple time points within a given individual (7, 33). Both heritable and stochastic mechanisms have been suggested to be involved in shaping the antibody repertoire V gene usage (56). We found age-dependent alterations in both V family and J gene usage in antigen-experienced repertoires suggesting either polyclonal negative selection of V1- and J6-containing B cells or positive selection of non-V1/J6-bearing B cells during maturation of the adaptive immune system. However, here we also saw that V family gene usage changed in naïve repertoires that are supposedly unaffected by antigen exposure and not subject to antigen-driven selection pressure, indicating preferential development and/or survival of V1-bearing B cells in young children. However, without longitudinal data it remains unclear whether this observed change is a result of genetics, differences in antigen exposure or a variation in the V gene usage over time. The potential benefit and mechanism behind these age-related V family gene alterations also remain uncertain.
In line with earlier findings (11, 57, 58), we observed extensive maturation of antigen-experienced repertoires characterized by accumulation of somatically hypermutated B cell antibody with evidence of strong positive selection in older individuals. IgH repertoire sequencing in a longitudinal birth cohort showed that IgM and IgD transcripts reach adult SHM frequencies by 2 years of age and class-switched IgA and IgG reach about 60–75% of adult SHM frequencies by the age of three (59). We showed that the mutation rate progressively increases in the first years of life and reaches the adult level at the age of 10 consistently in mutated IgM/D and class-switched transcripts. The observed decrease in selection pressure in some class-switched subsets indicates that young individuals show accelerated dynamics to achieve highly selected sequences compared with older individuals. Of note, detailed analysis allowed to investigate characteristics of mutated IgM/D transcripts separately, which were observed at a higher frequency and with a greater number of mutations in older individuals. These findings indicate that the pool of circulating peripheral blood naïve B cells is continuously diminishing with age, possibly contributing to a decreasing capacity to effectively respond to novel antigens in older individuals (60). We also observed a substantially higher proportion of unmutated IgA/G transcripts in young children compared with adults (61), in agreement with previous findings of unmutated IgG transcripts in intrauterine development and in neonates (62). These results are also in line with previous in vitro studies (63) demonstrating that class-switch recombination and somatic hypermutation can occur independently and suggest class-switching to be an important element of B cell responses in young children.
Along with other characteristics indicative of antigen-driven maturation we found that the proportion of sequences with structures differing from germline increased with age, which was most pronounced for IgG1 and IgA1 subsets. To date, there is limited information on predicted antibody structures derived from high-throughput adaptive immune receptor repertoire sequencing data (64, 65). In line with measures of antigen-driven selection, there was a positive linear relationship between number of mutations and structural alterations of antigen-experienced sequences indicating that alteration of the three-dimensional structure is important to achieve high specificity and affinity of the antibody. By annotating individual sequences with PDB codes, we were able to investigate commonalities of CDR3 structures between individuals. In particular, in contrast to sharing on the sequence level, the majority of PDB clusters were public while only a very small percentage of PDB clusters were private to the individual. Although this comparison is influenced by the much smaller number of potential PDB clusters, the use of common PDB clusters indicates that a large number of different sequences can underlie similar antibody structures. Future work, such as the investigation of PDB usage in patients with immune disorders, will help determine how antibody structures can be used to assess global immune responses.
We found an increase in the usage of IgA2/IgG2 transcripts with age, similar to what has been seen in a recent study on the isotype subclasses surface expression of peripheral blood B cells (66). While human IgG subclasses have been extensively studied (67), there is limited information on the functional difference between the two IgA subclasses, whose structures mainly differ in the length of the hinge region (68). IgG2 has been implicated in the immune responses to capsular polysaccharides of bacteria such as S. pneumoniae that are commonly colonizing the oropharynx of young children and thereby induce polysaccharide-specific serum antibody (69). Our findings also match the sequential model proposed for CSR: with age, and after multiple encounter with the same antigen, class-switched memory B cells re-enter the germinal center to undergo a second round of CSR and switch toward more downstream constant region genes (70).
The majority of early immature human B cells display self-reactivity and although most of these are removed during B cell development, a substantial proportion of mature B cells may still be directed against autoantigens (41). Antibodies encoded by germline VH4-34 are intrinsically self-reactive antibodies mediated by a hydrophobic patch and a glycosylation sequon (44, 46). Unmutated VH4-34 antibody are more common in naïve than antigen-experienced repertoires as receptor editing of these antibodies drives specificity away from self (45, 71). In contrast to adults, we found that a substantial proportion of VH4-34 IgG and IgA transcripts from children are unmutated, with frequencies gradually decreasing with age. Previous work has shown that germline VH4-34-expressing IgG B cells recognized antigens from commensal gut bacterial (71) and hence, the higher frequency of these cells in children may be related to ongoing immune responses against gut pathogens in this age group.
This study used AIRR-seq technology coupled with bioinformatic methods to study in detail the IgH repertoires of healthy individuals and investigate the effect of age on repertoire characteristics. We chose a cross-sectional study design and—although unlikely—can therefore not exclude that longitudinal assessment of maturation on an individual basis may differ from the presented findings. We performed bulk sequencing on total PBMC or B cells with constant region primers that allow isotype subclass resolution. Although individual B cell subpopulations were not isolated prior to downstream processing, we used careful bioinformatic analysis to still inform about features of B cell subgroups in this healthy control cohort. For practical reasons, the number of input cells was variable between study participants, which resulted in variable sequence numbers per sample. For analysis where sequence number variability was considered to be of major relevance, such as constructing lineage trees, subsampling to an equal number of sequences per individual was performed.
We were able to map in detail the characteristics, magnitude and rate of age-dependent maturation of IgH repertoires. Combining age-related variables using a PCA allowed clear separation of individuals younger than 10 years from older study participants, which was most pronounced in IgG repertoires. Our analysis now allows comparisons to be made in the IgH repertoires of healthy individuals to patients with altered immune states such as primary or secondary immunodeficiency (4) or infectious disease (72, 73). By elucidating patterns that are associated with cumulative antigen exposure and an evolving immune system, this research offers important insight into adaptive immune system responses in humans. The mechanisms behind the development of clinically relevant autoimmunity is still poorly understood and the findings in this study show a substantial intrinsic capacity to produce self-reactive B cells, which may be essential to achieve the diversity needed for the defense against commensal pathogens in early life.
In summary, by studying the maturation of the healthy IgH repertoire with age, we found characteristics indicative of a maturing B cell system consisting of alterations in immunoglobulin gene usage, increased levels of SHM associated with strong positive selection, and canonical class usage that differed considerably from germline structures. Repertoires from older individuals more frequently contained antibody using more downstream constant region genes that are involved in the immune response to polysaccharide antigens. With accumulating mutations, germline-encoded self-reactive antibody were seen less with advancing age indicating a possible beneficial role of self-reactive B cells in the developing immune system. Finally, this study provides a reference data set of isotype subclass-specific IgH repertoires and stresses the importance of using well-selected, age-appropriate controls in future studies.
Data Availability Statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
The studies involving human participants were reviewed and approved by Zurich ethics committee (KEK-ZH 2015-0555) and the Basel ethics committee (EKNZ 2015-187). Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.
JT designed and supervised the study, oversaw analyses, had full access to all the data in the study and takes responsibility for the integrity of the data, and the accuracy of the data analysis. The first draft was written by JT and MG. VN, JG, and MG processed samples and prepared sequencing libraries. MG, JG, AK, and JT performed bioinformatic analysis, revised the manuscript, and approved the final version. JP, MR, AJ, EM, DK, and CD contributed to manuscript revision and approved the final version. All authors contributed to the article and approved the submitted version.
This work was supported by Swiss National Science Foundation (Ambizione-SCORE: PZ00P3_161147 and PZ00P3_183777) (JT); Gottfried und Julia Bangerter-Rhyner-Stiftung (JT); Olga Mayenfisch Stiftung (JT); Palatin-Stiftung (JT); Investment fund of the University of Zurich (JT) and Swiss National Science Foundation (Professorship: PP00P3_181038) (MR). DK receives salary support from the NIHR Oxford Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care.
Conflict of Interest
JG is an employee of Alchemab Therapeutics Limited.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This manuscript has been released as a preprint at bioRxiv, Ghraichy et al. (74).
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2020.01734/full#supplementary-material
6. Jiang N, He J, Weinstein JA, Penland L, Sasaki S, He XS, et al. Lineage structure of the human antibody repertoire in response to influenza vaccination. Sci Transl Med. (2013) 5:171ra19. doi: 10.1126/scitranslmed.3004794
7. Galson JD, Trück J, Clutterbuck EA, Fowler A, Cerundolo V, Pollard AJ, et al. B-cell repertoire dynamics after sequential hepatitis B vaccination and evidence for cross-reactive B-cell activation. Genome Med. (2016) 8:68. doi: 10.1186/s13073-016-0322-z
8. Galson JD, Trück J, Fowler A, Münz M, Cerundolo V, Pollard AJ, et al. In-depth assessment of within-individual and inter-individual variation in the B cell receptor repertoire. Front Immunol. (2015) 6:531. doi: 10.3389/fimmu.2015.00531
9. Galson JD, Clutterbuck E, Trück J, Ramasamy MN, Münz M, Fowler A, et al. BCR repertoire sequencing: different patterns of B-cell activation after two meningococcal vaccines. Immunol Cell Biol. (2015) 93:885–95. doi: 10.1038/icb.2015.57
10. Kovaltsuk A, Leem J, Kelm S, Snowden J, Deane CM, Krawczyk K. Observed antibody space: a resource for data mining next-generation sequencing of antibody repertoires. J Immunol. (2018) 201:2502–9. doi: 10.4049/jimmunol.1800708
11. IJspeert H, van Schouwenburg PA, van Zessen D, Pico-Knijnenburg I, Driessen GJ, Stubbs AP, et al. Evaluation of the antigen-experienced B-cell receptor repertoire in healthy children and adults. Front Immunol. (2016) 7:410. doi: 10.3389/fimmu.2016.00410
12. Comans-Bitter WM, De Groot R, Van den Beemd R, Neijens HJ, Hop WCJ, Groeneveld K, et al. Immunophenotyping of blood lymphocytes in childhood: reference values for lymphocyte subpopulations. J Pediatr. (1997) 130:388–93. doi: 10.1016/S0022-3476(97)70200-2
13. Vander Heiden JA, Yaari G, Uduman M, Stern JNH, O'Connor KC, Hafler DA, et al. PRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics. (2014) 30:1930–2. doi: 10.1093/bioinformatics/btu138
14. Gupta NT, Vander Heiden JA, Uduman M, Gadala-Maria D, Yaari G, Kleinstein SH. Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics. (2015) 31:3356–8. doi: 10.1093/bioinformatics/btv359
17. Stern JNH, Yaari G, Vander Heiden JA, Church G, Donahue WF, Hintzen RQ, et al. B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Sci Transl Med. (2014) 6:248ra107. doi: 10.1126/scitranslmed.3008879
19. Kovaltsuk A, Raybould MIJ, Wong WK, Marks C, Kelm S, Snowden J, et al. Structural diversity of B-cell receptor repertoires along the B-cell differentiation axis in humans and mice. bioRxiv. (2019). doi: 10.1101/762880
20. Lefranc MP, Pommié C, Ruiz M, Giudicelli V, Foulquier E, Truong L, et al. IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains. Dev Comp Immunol. (2003) 27:55–77. doi: 10.1016/S0145-305X(02)00039-3
22. Kovaltsuk A, Krawczyk K, Kelm S, Snowden J, Deane CM. Filtering next-generation sequencing of the ig gene repertoire data using antibody structural information. J Immunol. (2018) 201:3694–704. doi: 10.4049/jimmunol.1800669
23. Wong WK, Georges G, Ros F, Kelm S, Lewis AP, Taddese B, et al. SCALOP: sequence-based antibody canonical loop structure annotation. Bioinformatics. (2019) 35:1774–6. doi: 10.1093/bioinformatics/bty877
26. R Project for Statistical Computing. A Language and Environment for Statistical Computing. R Found. Stat. Comput. 2. (2018) Available online at: https://www.R-project.org (Accessed on 6 April 2020)
31. Ohlin M, Scheepers C, Corcoran M, Lees WD, Busse CE, Bagnara D, et al. Inferred allelic variants of immunoglobulin receptor genes: a system for their evaluation, documentation, and naming. Front Immunol. (2019) 10:435. doi: 10.3389/fimmu.2019.00435
32. Rubelt F, Busse CE, Bukhari SAC, Bürckert J-P, Mariotti-Ferrandiz E, Cowell LG, et al. Adaptive immune receptor repertoire community recommendations for sharing immune-repertoire sequencing data. Nat Immunol. (2017) 18:1274–8. doi: 10.1038/ni.3873
33. Glanville J, Kuo TC, von Budingen H.-C., Guey L, Berka J, Sundar PD, et al. Naive antibody gene-segment frequencies are heritable and unaltered by chronic lymphocyte ablation. Proc Natl Acad Sci USA. (2011) 108:20066–71. doi: 10.1073/pnas.1107498108
36. Donisi PM, Di Lorenzo N, Riccardi M, Paparella A, Sarpellon C, Zupo S, et al. Pattern and distribution of immunoglobulin VH gene usage in a cohort of B-CLL patients from a northeastern region of italy. Diagnostic Mol Pathol. (2006) 15:206–15. doi: 10.1097/01.pdm.0000213469.85301.d6
37. Widhopf GF, Rassenti LZ, Toy TL, Gribben JG, Wierda WG, Kipps TJ. Chronic lymphocytic leukemia B cells of more than 1% of patients express virtually identical immunoglobulins. Blood. (2004) 104:2499–504. doi: 10.1182/blood-2004-03-0818
40. Tsioris K, Gupta NT, Ogunniyi AO, Zimnisky RM, Qian F, Yao Y, et al. Neutralizing antibodies against west nile virus identified directly from human B cells by single-cell analysis and next generation sequencing. Integr Biol. (2015) 7:1587–97. doi: 10.1039/C5IB00169B
41. Wardemann H, Yurasov S, Schaefer A, Young JW, Meffre E, Nussenzweig MC. Predominant autoantibody production by early human B cell precursors. Science. (2003) 301:1374–7. doi: 10.1126/science.1086907
43. Pugh-Bernard AE, Silverman GJ, Cappione AJ, Villano ME, Ryan DH, Insel RA, et al. Regulation of inherently autoreactive VH4-34 B cells in the maintenance of human B cell tolerance. J Clin Invest. (2001) 108:1061–70. doi: 10.1172/JCI200112462
44. Potter KN, Hobby P, Klijn S, Stevenson FK, Sutton BJ. Evidence for involvement of a hydrophobic patch in framework region 1 of human V4-34-encoded Igs in recognition of the red blood cell i antigen. J Immunol. (2002) 169:3777–82. doi: 10.4049/jimmunol.169.7.3777
45. Reed JH, Jackson J, Christ D, Goodnow CC. Clonal redemption of autoantibodies by somatic hypermutation away from self-reactivity during human immunization. J Exp Med. (2016) 213:1255–65. doi: 10.1084/jem.20151978
46. Sabouri Z, Schofield P, Horikawa K, Spierings E, Kipling D, Randall KL, et al. Redemption of autoantibodies on anergic B cells by variable-region glycosylation and mutation away from self-reactivity. Proc Natl Acad Sci USA. (2014) 111:E2567–75. doi: 10.1073/pnas.1406974111
48. Rogosch T, Kerzel S, Hoß K, Hoersch G, Zemlin C, Heckmann M, et al. IgA response in preterm neonates shows little evidence of antigen-driven selection. J Immunol. (2012) 189:5449–56. doi: 10.4049/jimmunol.1103347
49. Zemlin M, Hoersch G, Zemlin C, Pohl-Schickinger A, Hummel M, Berek C, et al. The postnatal maturation of the immunoglobulin heavy chain IgG repertoire in human preterm neonates is slower than in term neonates. J Immunol. (2007) 178:1180–8. doi: 10.4049/jimmunol.178.2.1180
50. Rechavi E, Lev A, Lee YN, Simon AJ, Yinon Y, Lipitz S, et al. Timely and spatially regulated maturation of B and T cell repertoire during human fetal development. Sci Transl Med. (2015) 7:276ra25. doi: 10.1126/scitranslmed.aaa0072
54. Wang C, Liu Y, Xu LT, Jackson KJL, Roskin KM, Pham TD, et al. Effects of aging, cytomegalovirus infection, and EBV infection on human B cell repertoires. J Immunol. (2014) 192:603–11. doi: 10.4049/jimmunol.1301384
55. de Bourcy CFA, Angel CJL, Vollmers C, Dekker CL, Davis MM, Quake SR. Phylogenetic analysis of the human antibody repertoire reveals quantitative signatures of immune senescence and aging. Proc Natl Acad Sci USA. (2017) 114:1105–10. doi: 10.1073/pnas.1617959114
56. Greiff V, Menzel U, Miho E, Weber C, Riedel R, Cook S, et al. Systems analysis reveals high genetic and antigen-driven predetermination of antibody repertoires throughout B cell development. Cell Rep. (2017) 19:1467–78. doi: 10.1016/j.celrep.2017.04.054
57. Tabibian-Keissar H, Hazanov L, Schiby G, Rosenthal N, Rakovsky A, Michaeli M, et al. Aging affects B-cell antigen receptor repertoire diversity in primary and secondary lymphoid tissues. Eur J Immunol. (2016) 46:480–92. doi: 10.1002/eji.201545586
58. Schatorjé EJ, Driessen GJ, van Hout RW, van der Burg M, de Vries E. Levels of somatic hypermutations in B cell receptors increase during childhood. Clin. Exp. Immunol. (2014) 178:394–8. doi: 10.1111/cei.12419
59. Nielsen SCA, Roskin KM, Jackson KJL, Joshi SA, Nejad P, Lee J-Y, et al. Shaping of infant B cell receptor repertoires by environmental factors and infectious disease. Sci Transl Med. (2019) 11:eaat2004. doi: 10.1126/scitranslmed.aat2004
61. Fecteau JF, Côté G, Néron S. A new memory CD27 – IgG + B cell population in peripheral blood expressing V H genes with low frequency of somatic mutation. J Immunol. (2006) 177:3728–36. doi: 10.4049/jimmunol.177.6.3728
62. Bauer K, Zemlin M, Hummel M, Pfeiffer S, Karstaedt J, Steinhauser G, et al. Diversification of Ig heavy chain genes in human preterm neonates prematurely exposed to environmental antigens. J Immunol. (2002) 169:1349–56. doi: 10.4049/jimmunol.169.3.1349
63. Nagumo H, Agematsu K, Kobayashi N, Shinozaki K, Hokibara S, Nagase H, et al. The different process of class switching and somatic hypermutation; a novel analysis by CD27- naive B cells. Blood. (2002) 99:567–75. doi: 10.1182/blood.V99.2.567
64. Kovaltsuk A, Krawczyk K, Galson JD, Kelly DF, Deane CM, Trück J. How B-cell receptor repertoire sequencing can be enriched with structural antibody data. Front Immunol. (2017) 8:1753. doi: 10.3389/fimmu.2017.01753
66. Blanco E, Pérez-Andrés M, Arriba-Méndez S, Contreras-Sanfeliciano T, Criado I, Pelak O, et al. Age-associated distribution of normal B-cell and plasma cell subsets in peripheral blood. J Allergy Clin Immunol. (2018) 141:2208–19.e16. doi: 10.1016/j.jaci.2018.02.017
69. Turner P, Turner C, Green N, Ashton L, Lwe E, Jankhot A, et al. Serum antibody responses to pneumococcal colonization in the first 2 years of life: results from an SE Asian longitudinal cohort study. Clin Microbiol Infect. (2013) 19:1–8. doi: 10.1111/1469-0691.12286
70. Xiong H, Dolpady J, Wabl M, Curotto de Lafaille MA, Lafaille JJ. Sequential class switching is required for the generation of high affinity IgE antibodies. J Exp Med. (2012) 209:353–64. doi: 10.1084/jem.20111941
71. Schickel JN, Glauzy S, Ng YS, Chamberlain N, Massad C, Isnardi I, et al. Self-reactive VH4-34-expressing IgG B cells recognizecommensal bacteria. J Exp Med. (2017) 214:1991–2003. doi: 10.1084/jem.20160201
73. Burkholder WF, Newell EW, Poidinger M, Chen S, Fink K. Deep sequencing in infectious diseases: immune and pathogen repertoires for the improvement of patient outcomes. Front Immunol. (2017) 8:593. doi: 10.3389/fimmu.2017.00593
Keywords: antibody, B cells, children, heavy chain, immunoglobulin, maturation, repertoire, high-throughput sequencing
Citation: Ghraichy M, Galson JD, Kovaltsuk A, von Niederhäusern V, Pachlopnik Schmid J, Recher M, Jauch AJ, Miho E, Kelly DF, Deane CM and Trück J (2020) Maturation of the Human Immunoglobulin Heavy Chain Repertoire With Age. Front. Immunol. 11:1734. doi: 10.3389/fimmu.2020.01734
Received: 06 April 2020; Accepted: 29 June 2020;
Published: 06 August 2020.
Edited by:Deborah K. Dunn-Walters, University of Surrey, United Kingdom
Reviewed by:Corey T. Watson, University of Louisville, United States
Michael Zemlin, Saarland University Hospital, Germany
Michael Reth, University of Freiburg, Germany
Copyright © 2020 Ghraichy, Galson, Kovaltsuk, von Niederhäusern, Pachlopnik Schmid, Recher, Jauch, Miho, Kelly, Deane and Trück. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Johannes Trück, firstname.lastname@example.org