In-Depth Assessment of Within-Individual and Inter-Individual Variation in the B Cell Receptor Repertoire

High-throughput sequencing of the B cell receptor (BCR) repertoire can provide rapid characterization of the B cell response in a wide variety of applications in health, after vaccination and in infectious, inflammatory and immune-driven disease, and is starting to yield clinical applications. However, the interpretation of repertoire data is compromised by a lack of studies to assess the intra and inter-individual variation in the BCR repertoire over time in healthy individuals. We applied a standardized isotype-specific BCR repertoire deep sequencing protocol to a single highly sampled participant, and then evaluated the method in 9 further participants to comprehensively describe such variation. We assessed total repertoire metrics of mutation, diversity, VJ gene usage and isotype subclass usage as well as tracking specific BCR sequence clusters. There was good assay reproducibility (both in PCR amplification and biological replicates), but we detected striking fluctuations in the repertoire over time that we hypothesize may be due to subclinical immune activation. Repertoire properties were unique for each individual, which could partly be explained by a decrease in IgG2 with age, and genetic differences at the immunoglobulin locus. There was a small repertoire of public clusters (0.5, 0.3, and 1.4% of total IgA, IgG, and IgM clusters, respectively), which was enriched for expanded clusters containing sequences with suspected specificity toward antigens that should have been historically encountered by all participants through prior immunization or infection. We thus provide baseline BCR repertoire information that can be used to inform future study design, and aid in interpretation of results from these studies. Furthermore, our results indicate that BCR repertoire studies could be used to track changes in the public repertoire in and between populations that might relate to population immunity against infectious diseases, and identify the characteristics of inflammatory and immunological diseases.

inTrODUcTiOn An effective humoral immune response is in part dependent on having a diversity of B cells with different B cell receptors (BCRs) capable of recognizing and binding to many different antigens. The sum of all B cells with distinct BCRs is termed the BCR repertoire, and in humans has the theoretical potential to reach a size of up to 10 11 unique variants (1). BCRs consist of paired heavy and light chains, with primary diversity generated by the somatic recombination of V, D (heavy chain only), and J gene segments during B cell development to form the functional genes (2). Further diversity is introduced by the random addition and deletion of nucleotides at the junctions of the gene segments. Upon B cell activation and proliferation, there is a secondary diversification step mediated by somatic hypermutation, and selection of B cells with increasing affinity for the antigen (3). Although the BCR repertoire is often considered as a single entity, it is actually a mixture of different B cell subpopulations, each of which may have a distinct repertoire structure (4,5).
The vast diversity of the BCR repertoire has made it difficult to study, but next-generation sequencing (NGS) technology now makes it possible to capture a high-resolution snapshot of the circulating BCR repertoire in humans. Obtaining information on paired heavy and light chains is technically challenging (6), so studies of the BCR repertoire generally focus on the heavy chain, which is the most variable, and most important for determining antigen-binding specificity (7). BCR repertoire analysis has been used to increase understanding of the fundamental properties of B cells, including developmental processes (8), and responses to antigen (4,(9)(10)(11)(12)(13). In addition, a number of clinical applications are beginning to emerge, including identification of autoimmune irregularities (14), monitoring of B cell lymphoma minimal disease residue (15,16), disease diagnostics (17), and the rapid identification of monoclonal antibody sequences (18). These studies generally monitor global features of the BCR repertoire, such as diversity, mutation levels, isotype subclass usage, and VDJ segment usage frequency as well as identifying specific B cell clones. Identifying B cells arising from the same clonal origin was initially conducted based on identifying common VDJ segment usage (15), but there is now a move toward incorporating complementarity-determining region (CDR) three amino acid (AA) sequence identity into the definition (8,17), which forms at the junction of VDJ joining and is the most important region for determining antigen-binding properties (19). B cell samples for repertoire studies in humans are usually obtained from peripheral blood as this is an easy to sample compartment, but it is not possible to extract all peripheral blood from humans, and some B cells may be present in different compartments. This means that the entire repertoire cannot be sampled, so instead a representative sample is taken. It is, therefore, important to quantify exactly what proportion of the repertoire is being sampled, and how repeatable the sampling is. This is particularly pertinent for clinical applications where it is necessary to have a highly repeatable measure to detect the clones of interest. As BCR repertoire sequencing is a young technology, such repeatability studies have been done in mice (20), but not exhaustively in humans, who have a much larger repertoire (1). Many studies of the BCR repertoire also assess changes in various features of the repertoire over time in response to certain interventions (9,(11)(12)(13). However, there is relatively little known about how the much the repertoire naturally fluctuates over time in the absence of any specific intervention, making it difficult to discern natural fluctuations from intervention-induced fluctuations. Furthermore, as the total number of healthy individuals who have had their repertoire sequenced remains small, there is not a clear consensus of what can be construed as a "normal" repertoire. It is uncertain to what extent similar B cell clones can be found in multiple individuals (the public repertoire) not undergoing a similar immune stimulus, and whether sharing is just due to chance, or due to historic expansion of similar B cells in different individuals from a common antigen and thus have some clinical significance (4,11,17,21).
We sought to shed light on these questions by carrying out repeat sampling from a single individual to assess the robustness of a BCR repertoire sequencing protocol, and also to determine how the repertoire changed over time in the absence of any intervention (Figure 1). Furthermore, sequencing of the BCR repertoire from nine additional individuals was conducted to determine how variable the repertoire was between them, and to interrogate properties of the public repertoire.

MaTerials anD MeThODs sample collection
Subjects were recruited with informed consent in accordance with the Declaration of Helsinki, and under approval from the Northampton Research Ethics Committee (13/EM/0036). For the assessment of inter-individual variation, 10 participants had blood sampled at a single timepoint. In addition, to assess withinindividual variation, and repeatability of the protocol, there was a single highly sampled participant, who had blood sampled on five consecutive weeks (days 0, 7, 14, 21, and 28) to give temporal replicates (Figure 1). Blood was transferred to a heparinized tube for processing within 4 h of collection.

B cell isolation
PBMC's were first isolated from heparinized blood by density-gradient centrifugation over lymphoprep (Axis-Shield Diagnostics). Magnetic-activated cell sorting was then used to isolate B cells, using the AutoMACS Pro cell separator (Miltenyi Biotec), and CD19 + microbead kit. B cells were counted using a hemocytometer (Neubauer), and split into aliquots each containing 500,000 cells. Sorted B cells were resuspended in RLT buffer (Qiagen) and frozen at −80°C prior to repertoire sequencing. For each participant, a single 500,000 B cell aliquot was used for repertoire sequencing. An additional 500,000 B cell aliquot was also taken for the highly sampled participant to give biological replicates.
repertoire sequencing RNA was extracted from sorted cells using the RNeasy Mini Kit (Qiagen), and reverse transcription performed using SuperScript III (Invitrogen), and random hexamer primers (42°C for 60 min, 95°C for 10 min). BCR heavy chain genes were amplified using the Multiplex PCR kit (Qiagen) with V family specific forward primers, and in separate reactions with either IgA, IgG, or IgM specific reverse primers (22) (94°C for 15 min, 30 cycles of 94°C for 30 s, 58°C for 90 s and 72°C for 30 s, and 72°C for 10 min). For the biological replicates, only IgG reactions were performed, but for all other samples, IgA, IgG, and IgM reactions were all carried out. In addition, for one of the aliquots at each timepoint from the highly sampled participant, the IgG PCR was repeated to give PCR replicates. PCR amplicons were gel-extracted, purified, and quantified using a Qubit fluorometer (Invitrogen). Samples were then A-tailed and adaptor ligated prior to size selection and amplification for sequencing on the MiSeq using the 2 × 300 bp paired-end chemistry (Illumina). Samples were multiplexed in batches of 50 for sequencing, and indexed using the Illumina tags.

raw sequence Processing
Paired-end reads were joined to give a continuous sequence spanning from framework region 1 to within the constant region using fastq-join (ea-utils), and default settings. Initial quality filtering was performed to remove any sequences containing unknown nucleotides, or with a Phred quality <30 over more than 15% of bases. Sequences were then submitted to IMGT/ HighV-Quest (23) for annotation, and unproductive sequences (as defined by IMGT) removed. To account for differences in the number of resulting sequences in different samples, all samples were randomly subsampled without replacement using the sample function in R (24) to give 100,000 sequences per sample. For assessment of within-individual variation, one participant (participant AF01) had blood sampled at five timepoints 1 week apart (temporal replicates). For this participant, two aliquots of B cells were taken at each timepoint to give biological replicates. IgA, IgG, and IgM isotype-specific PCRs were all conducted for one of the biological replicates, but only IgG for the second biological replicate (IgG 3). Additionally, for this participant, IgG PCR replicates were also conducted at each timepoint for one of the biological replicates (IgG 2). For assessment of inter-individual variation, nine additional participants were sampled at a single timepoint, and had IgA, IgG, and IgM PCRs conducted.

sequence-level annotation
Germline V and J gene usage, CDR3 AA sequence, number of V gene mutations from germline, and constant region nucleotide sequence was determined for each sequence by IMGT. Constant region sequences were then mapped to germline using Stampy (25) to determine isotype subclass, and number of nucleotide mismatches. Number of nucleotide mismatches in the constant region was used for error rate estimation, as this region will not be subject to somatic hypermutation. For each CDR3 AA sequence, the distance to its nearest neighbor in the same sample was determined. The nearest neighbor was defined as the sequence with the same CDR3 AA length and the fewest mismatches, with the nearest neighbor distance being the count of these mismatches.

cluster-level annotation
Clustering was performed using a previously described method (4), to form groups of sequences which are sufficiently similar that they are likely to be clonally related, or differ due to PCR and sequencing error. To be considered part of the same cluster, sequences were required to have the same length CDR3, the same V and J gene annotation, and a similar CDR3 AA sequence. Different thresholds for CDR3 similarity were trialed, ranging from one AA mismatch allowed per four AA's (≥75% similarity) to one AA mismatch per 26 AA's (≥96% similarity). D gene annotation was not considered, as this can not be carried out with a high degree of certainty (26), and the underlying D gene sequence is anyway included in the CDR3 AA sequence. Sequences from all samples were clustered together to allow easy comparison of clusters between participants and over time. The number of participants each cluster was present in was determined, along with specific comparisons of clusters present in different replicates. Each cluster was annotated with its dominant "cluster center" CDR3 AA sequence, the total number of unique sequences within it from a particular sample, in addition to the mean V gene mutation of these sequences, and their isotype subclass usage. Nearest neighbor distances were also calculated for cluster center sequences in the same way they were calculated for the individual sequences. Clusters were annotated for having potential binding specificity toward either influenza or tetanus toxoid (TT) antigens based on comparison to previously published sequences known to be specific for these antigens (27)(28)(29)(30)(31)(32)(33)(34). The comparison of clusters to these known sequences was based on CDR3 AA sequence identity only, and was based on whether the known sequence would have fallen into that cluster during the clustering process (i.e., allowing one AA mismatch per 12 AAs).

repertoire-level annotation
The mean CDR3 AA length, and number of V gene mutations were calculated for each sample using all sequences in the repertoire. In addition, the relative proportion of the repertoire comprised by sequences using different V and J genes (and VJ combinations), and of different isotype subclasses was calculated for each sample.
Repertoire diversity was calculated using three different single diversity metrics, which have previously been applied to repertoire studies. The Shannon entropy index and Simpson's concentration index are derived from ecology and take into account species richness and abundance in a single sample; for the purpose of studying the BCR repertoire, each cluster is considered a distinct species. The Shannon index gives more weight to rare species while the Simpson index gives more weight to abundant species. In addition, a clonality index derived from cryptanalysis (the study of text-based ciphers) was used which measures the probability that sequences selected from different PCR replicate samples belong to the same cluster (35). As well as these single diversity indices, Hill-based diversity profiles of each repertoire were generated using the method described by Greiff et al. (36). These profiles are based on a continuum of diversity measures with different weighting (alpha values) ascribed to the abundant and rare species. Alpha values between 0 and 10 were used with a step value of 0.5.

statistical analysis and graphing
Statistical analysis was conducted in R (24), using ggplot2 (37) for constructing graphs, gplots for constructing heatmaps (38), and Circos (39) used for constructing circular plots. Rarefaction analysis was conducted using the Vegan R package (40), with individual clusters representing species, and sampling done without replacement. Extrapolation of the rarefaction curves was conducted based on Chao's estimates (41), using the iNEXT package in R (42) with a q value of 0. Simpson's concentration, Shannon entropy, and diversity profiles were all calculated using the Vegan R package (40). Diversity profiles were compared based on Euclidean distance and clustered using the complete linkage algorithm with the hclust function in the Stats R package (24). Principal component analysis was conducted using the prcomp R function in the Stats R package (24). Capture-recapture analysis was used to estimate the effective repertoire size using the Chapman-Estimator formula, which has previously been applied to BCR repertoires (43). Genotyping was carried out using TIgGER (44).

resUlTs sequencing Output and clustering
Repertoire data were successfully obtained for all 52 samples (Table S1 in Supplementary Material). The mean number of raw sequences per sample was 367,634 (216,878-1,516,275). Quality filtering removed on average 31% of raw sequences, leaving at least 100,000 sequences per sample for subsequent analysis. Error rate estimates differed depending on the isotype of the sequence, and were 0.0021, 0.0079, and 0.0019 errors per nucleotide for IgA, IgG, and IgM sequences, respectively ( Figure S1 in Supplementary Material). To mitigate the effect of the error on subsequent analyses, data were clustered to group together closely related sequences. Analyzing the AA distance between CDR3 sequences prior to clustering revealed a bimodal distribution: the first peak of sequences had a close neighbor 0-2 AAs away, and the second peak of sequences had a more distant neighbor 3-15 AAs away (Figure 2A). The first peak was higher for IgA and IgG compared to IgM sequences, and the position of the second peak was shifted 1 AA toward the y-axis for the IgM sequences. As the first peak likely contains sequences whose nearest neighbor is either clonally related or differs due to error and the second peak likely contains sequences whose nearest neighbor arises from a distinct B cell clone, clustering should ensure that sequences with a neighbor in the first peak are clustered together, but sequences with a neighbor in the second peak are not. To find the clustering threshold that best achieved this, different thresholds allowing between one AA mismatch per every four AA's to one AA mismatch per every 26 AA's (i.e., from ≥75% to ≥96% similarity) in the CDR3 AA sequence were trialed. Above a threshold of one AA mismatch per 12 AA's there is a failure to cluster some sequences differing by two AA's ( Figure S2A in Supplementary Material), and below a threshold of one AA mismatch per eight AA's, there is a sharp drop in the number of clusters formed ( Figure S2B in Supplementary Material), indicating that sequences from unrelated cells start to be grouped. A threshold of one AA mismatch per 12 AA's was, therefore, chosen for the final analysis; however, it should be noted that using a threshold anywhere between one AA mismatch per eight AA's and one AA mismatch per 12 AA's gave negligible difference to the conclusions presented here.
Following final clustering with the threshold of one AA mismatch per 12 AA's, the mean number of clusters differed for the different isotypes, and was 9,972 for IgA (1,658-22,940), 15,080 for IgG (1,599-25,000), and 53,150 for IgM (19,000-75,420) ( Table S1 in Supplementary Material). Following clustering, the amplitude of the first peak of the nearest neighbor distribution of cluster center CDR3 AA sequences was greatly reduced for IgA and IgG sequences, and completely removed for IgM sequences ( Figure 2B). Cluster sequences with a distance of 0 to their neighbor represent clusters with the same CDR3, but different V and/or J gene annotations. Although such sequences could be FigUre 2 | effect of clustering on the dataset. Nearest neighbor distributions of sequences both before (a) and after (B) clustering. Before clustering, CDR3 AA sequences are used, and after clustering, the cluster center AA sequence is used. The nearest neighbor of each sequence in each dataset is determined by comparing it to every other sequence of the same length in the dataset to find the closest match -that is its nearest neighbor. The distance is then the number of AA difference between the sequence and its nearest neighbor. Distributions were calculated from samples from all 10 participants, and mean ± SEM values plotted. (c) Clusters are ordered according to size, and the size of each cluster plotted. Representative data from one sample is shown (participant AF01, day 0). Horizontal dotted line intersects the y-axis at 10 sequences (0.01% of the total sequenced repertoire), and represents the cutoff between rare and abundant clusters.
Within and inter-individual variation in the BCR repertoire Frontiers in Immunology | www.frontiersin.org chimeras formed during the PCR reaction, they are not present in the IgM dataset, making this unlikely, and were, thus, retained for analysis; it has also previously been reported that such sequences can form during convergent evolution in response to antigen (12). The size distribution of clusters was highly uneven, with a small number of abundant clusters, and a large number of rare clusters ( Figure 2C). Based on this, we defined clusters as abundant if they contained at least 10 sequences (i.e., comprised at least 0.01% of the total sequenced repertoire

Quantitative assessment of sequencing Depth
Two methods were used to assess the adequacy of the sequencing depth used: rarefaction analysis, and comparison of the PCR replicates. Rarefaction is a technique used in ecology to estimate species richness; in the context of the BCR repertoire, each cluster is defined as a unique species. Random samples of increasing size were taken from the total dataset to determine the number of clusters that were represented at increasing sequencing depths, and a curve was drawn to show the number of clusters represented as a function of sequencing depth ( Figure 3A). As the curve plateaus, it indicates that sequencing depth is sufficient, and only the very rare clusters remain to be identified. The curves for IgA and IgG do show the beginning of a plateau, so we can infer that a sequencing depth of 100,000 is sufficient to capture most of the abundant clusters for these samples. For IgM samples, which had a much larger number of rare clusters (Figure 2C), the curve does not begin to plateau, so we are unlikely to be capturing the full diversity of this population. If the curve does not plateau, it can be extrapolated to estimate where the plateau would occur, and give an estimate of the effective population size (a lower bound for the total number of clusters) of the population. However, these extrapolated values should be treated as rough values only, as they assume that all clusters will be present at the same frequency, and become unreliable once extrapolated to twice the interpolated number (41 PCR replicates were only conducted for IgG samples, so comparison of these can only be used to assess sequencing depth for this isotype. Although simply re-sequencing the same library can also be used to assess sequencing depth, conducting the PCR again prior to sequencing is a more stringent measure that will also take into account differences in amplification efficiency of the template cDNA. Across the five samples where PCR replicates were available, the mean overlap of clusters present in both PCR replicates was 58%, but this increased to 95% when just considering the abundant clusters ( Figure 3B; Figure S3 in Supplementary Material). In addition to identification of specific sequences, many repertoire studies also assess relative proportions of different VJ combinations. To determine the reproducibility of this, the proportional representation of each VJ clone in the repertoire was determined for each sample, and correlated. Strong correlations were seen, with Pearson's r > 0.997 for all PCR replicate samples ( Figure 3C).

Quantitative assessment of sampling Depth
To assess the adequacy of sampling 500,000 B cells for assessment of the total BCR repertoire, we used the biological replicates to estimate a lower bound for the total IgG BCR cluster repertoire size using capture-recapture analysis. Across the five samples where biological replicates were available, the mean IgG effective repertoire size was estimated as 142,576 unique clusters; as we obtained a mean of 15,128 total IgG clusters from these samples, we estimate that ~11% of the total repertoire was sampled at each timepoint ( Table 1). Taking just the abundant clusters, the mean estimated BCR repertoire size of abundant clusters was 4,644, indicating that we sample ~25% of the abundant repertoire at each timepoint. Estimates of repertoire size varied on the different days of sampling, and this variation in size estimates was more pronounced for the abundant repertoire (up to 8.9× size difference between days) compared to the total repertoire (up to 1.5× size difference between days).
As with the PCR replicates, the overlap in clusters present in both of the biological replicates was also calculated (Figure 3D; Extrapolated rarefaction curves for clusters of each isotype. Rarefaction analysis (interpolated) was conducted by subsampling data without replacement at 1,000 sequence increments, and determining the number of clusters represented by these sequences. Cluster richness estimation of the sample (extrapolation) was based on Chao's estimator formula, and conducted up to a sequencing depth of 500,000 sequences. The curve shows the number of clusters identified as a function of sampling depth -as the curves start to plateau it indicates that increased sampling depth will yield few additional clusters, and sampling depth is sufficient. Results were calculated from samples from all 10 participants, and mean ± SEM (gray curve) plotted. The three horizontal gray lines show the mean ± SEM for the Chao estimate of total clusters in the sample.  Figure S3 in Supplementary Material). The mean overlap of total clusters present in both biological replicates was 14%, although this increased to 44% when considering the abundant clusters only. The reproducibility of VJ usage frequency remained strongly correlated between the biological replicates, with Pearson's r > 0.995 for all samples (Figure 3E).

Fluctuations in the repertoire over Time
For a single participant, individual clusters were tracked across the samples collected at different times to see if they could be detected on multiple days. Although most clusters were present on just a single day, 5, 14, and 7% were detected on more than 1 day for IgA, IgG, and IgM, respectively ( Figure 4A). In addition, there were a small number of clusters detected at all timepoints. Circos plots were constructed to show the relationship between the clusters present on different days, and show that it is primarily the abundant clusters that are present on more than 1 day ( Figure 4B).
In addition to monitoring individual clusters, global repertoire metrics of VJ segment usage (Figure 5A), mutation ( Figure 5B), diversity (Figure 5C), and isotype subclass ( Figure 5D) were determined at each day. Different single measurements for diversity were used, but they all gave the same trend ( Figure S4 in Supplementary Material). Although most clusters do not persist over time, VJ segment usage frequency remained highly correlated between all days. Although the other repertoire metrics were highly conserved between both the PCR and biological replicates, on day 7, there was a large fluctuation in the repertoire, causing an increase in mutation and a decrease in diversity (most prominently in the IgG datasets), and a relative decrease in IgG1, and increase in IgG3 usage. Although there were actually fewer abundant clusters on day 7 than on the other days, the size of these abundant clusters tended to be greater ( Figure  S5 in Supplementary Material), suggestive of clonal expansions occurring at this timepoint. The use of diversity profiles has recently been reported as a more accurate method for determining immunological status from BCR data than the use of single diversity measures (36), and these results are shown in Figure 5E. The difference between the diversity profiles of different samples can then be calculated, and used for hierarchical clustering of the samples based on diversity. This revealed that all day 7 samples clustered together regardless of isotype. All IgM samples also clustered together, but the distinction between IgA and IgG samples was not as clear.

inter-individual Variation in the repertoire
As well as the single highly sampled participant, nine additional participants were sampled at a single timepoint to give insight into inter-individual variation in the repertoire. Although at the cluster level, there was very little overlap between the repertoires of different participants ( Figure S3 in Supplementary Material), the repertoires of different participants were more comparable at the level of the general repertoire properties. Different V and J gene segments are not used in even proportions, and these biases in gene use are conserved between different participants ( Figure S6 in Supplementary Material). To more specifically assess any differences in VJ usage between participants, principle component analysis was used. This is a dimensional reduction technique that takes into account independent correlations in usage frequencies between variables (VJ combinations) to give components that can explain the largest proportion of the total variability in the data. The first two principle components account for the largest proportion of variability, and in the case of VJ usage, account for ~40% of the variability. Plotting the samples according to the first two principle components shows that the samples from different participants cluster apart from each other compared to the samples on different days from participant AF01 which cluster closely together, indicating that VJ usage frequency is able to uniquely identify the different participants, and remains steady in participant AF01 despite apparent immune activation ( Figure 6A). Furthermore, the V gene genotype was inferred for each participant, indicating that each participant had a unique V gene genotype (Table S2 in Supplementary Material). Ten putative novel alleles were also found, that were not contained within the IMGT database; two of these have previously been reported, giving good evidence for them being real alleles (44). The global repertoire properties of mutation ( Figure 6B), diversity (Figure 6C), and isotype subclass usage ( Figure 6D) were also determined for each participant. There was considerable variation in each of these properties between the participants. Participant 1368 appeared to consistently be an outlier in nearly all measures for all isotypes, and notably for IgG had a highly mutated repertoire with low diversity, which is indicative of clonal expansions. The study was not powered to detect age-related differences in the repertoire, but we had an even mix of participants between the ages of 22 and 59. We correlated each of the global repertoire properties with age, and found a significant decrease in IgG2 usage with age, giving a relative increase in IgG1 and IgG3 use ( Figure S7 in Supplementary Material). Although not significant, other global repertoire metrics also appeared to change with age (increase in mutation, and decrease in diversity with age), and warrant exploration with appropriately powered studies.

The Public repertoire
The public repertoire is defined as set of clusters within the repertoire that are common to multiple participants, whereas the private repertoire is the set of clusters that are unique to a particular participant. In an unstimulated setting, the public repertoire comprises a very minor part of the total repertoire, but is greater for IgM than IgG or IgA (1.4, 0.3, and 0.5% of total clusters, respectively). Considering the number of clusters present in different numbers of participants, there was a sharp reduction when considering clusters shared by increasing numbers of participants, and there was only a single cluster (IgM) that was present in all participants (Figure 7A). Compared to the private repertoire, the public repertoire comprised larger clusters that had shorter CDR3's (Figures 7B,D). For IgG, the public clusters were more mutated than the private clusters, but for IgM, the public clusters were actually less mutated than the private clusters, with no significant difference for IgA ( Figure 7C).
Comparing previously described sequences specific for TT or influenza to out dataset revealed 50 TT-specific (of 1,093), and 19 influenza-specific (of 321) sequences that mapped to clusters in our dataset (Table S3 in Supplementary Material). These clusters were, therefore, labeled as potentially containing sequences with specificity toward TT and influenza, respectively. The potential TT and influenza-specific clusters were present in samples of at least one isotype from each participant, but only in small numbers. The mean number of potential TT-specific and influenzaspecific clusters was 1.1 and 0.5, respectively for IgA, 1.0 and 0.6 for IgG, and 10.5 and 9.4 for IgM. Despite the low numbers, where they were present, potential TT and influenza-specific clusters comprised a greater percentage of the public compared to the private repertoire ( Figure 7E).

DiscUssiOn
By collecting both repeat samples from a single participant as well as samples from multiple different participants, we have been able to perform in-depth assessment of both the within-individual, and inter-individual variation in the BCR repertoire. We show high reproducibility in the methods used to sample the BCR repertoire, and although not exhaustively sampling the entire repertoire at the individual B cell clone level, calculate comparable global repertoire metrics, and routinely detect abundant clusters in repeat samples. In the absence of an immune stimulus given during the study, there was considerable variation in the BCR repertoire over time in a single individual, highlighting that it is a highly dynamic system that is constantly subject to immune stimulus and selective pressures. Nevertheless, there are certain features that remain steady, which enable unique identification of different individuals, and may be a contributing factor to genetic causes of variation in the immune response. Finally, we show that there is a small public repertoire, and that this has a distinct structure, and appears enriched for specificity toward commonly encountered antigens. Although there is no standardized BCR repertoire sequencing protocol, different protocols give comparable results (45), so the data generated here are still informative for laboratories using alternate protocols. We favored use of a simple and cost-effective protocol, so that it could potentially be used on a large scale in the context of clinical trials, or for routine diagnostic tests. As input, we used 500,000 B cells -this number can normally be obtained from 2 to 8 ml of peripheral blood, so is a feasible quantity for routine sampling, including samples obtained from the youngest infants. To account for sequencing and PCR error in the dataset, a clustering-based approach was used that groups together related sequences. This approach is intended to just group together sequences arising from the same cell that only differ due to sequencing or PCR error; however, it is likely that sequences arising from clonally related cells, or cells with similar specificities will also be grouped together. Indeed, other laboratories have used similar approaches to identify what they term to be clones, or clonotypes (9,46), but this could be misleading, as B cells from distinct clonal origins could converge toward a similar sequence (12). For this reason, we instead consider the clusters to be groups of sequences that likely have a similar antigenic specificity. Using clustering for error correction is cheaper than using unique molecular identifier tags (43), which require increased depth of sequencing and may have other associated errors (47). We detected a small number of clusters that had identical CDR3 AA cluster center sequences, which were combined with different V and J gene segments. Such sequences could be PCR chimeras, or form during convergent evolution of different B cells toward  October 2015 | Volume 6 | Article 531 9 the same antigen (12). We favor the latter explanation, as these clusters were only present in the IgG and IgA datasets, and not the IgM dataset, which is mainly comprised of naïve B cells that have not undergone any selection. There was a large size distribution of clusters ranging from those just containing a single sequence (0.001% of the total sequenced repertoire) to 19,662 sequences (nearly 20% of the repertoire). This could in part be due to different PCR amplification efficiencies of different templates, the presence of different numbers of clonally related cells from which sequences are being grouped together, or different RNA quantities in different B cell subsets. Larger clusters are, therefore, more likely to represent plasma cells (which have a large amount of RNA), or proliferating cells (of which there will be multiple similar cells).
Obtaining both PCR and biological replicates allowed us to assess the adequacy of our sequencing depth for representing the true diversity of the 500,000 B cells in the sample, and the adequacy of our sampling depth for representing the true diversity of total B cells in a human. Sequencing depth and sampling depth are related in that there is no point collecting a large sample if sequencing depth is insufficient to capture the diversity of the sample. Under-sequencing results in a loss of information, whilst over-sequencing represents an unnecessary cost and increases the number of erroneous sequences in the dataset. Rarefaction analysis indicated that whilst we are not capturing the entire diversity of the 500,000 B cells, we are reliably capturing the most abundant clusters. So, for most applications, our sequencing depth should be sufficient, but for rare cluster identification, a greater sequencing depth would be recommended, especially for the IgM population, which forms the greatest proportion of the total B cells. For isotype-specific BCR repertoire sequencing from a pool of total B cells, in order to reliably sequence 90% of cells in the sample, we recommend aiming for at least a 0.5:1 ratio of sequences to cells for IgA and IgG, and a 1:1 ratio of sequences to cells for IgM populations. Using the biological replicates, the estimates of total IgG repertoire size we determined were steady over time, and in line with previous studies (43,48). Considering just the abundant clusters, estimates varied by nearly a factor of 10 at the different times, indicating that the repertoire of these clusters is more likely to be affected by immune fluctuations. It is worth emphasizing though, that our repertoire size estimates should be considered a lower bound for the total number of circulating clones. Sequences from similar clones may be clustered together, and the data violate the capture-recapture analysis assumption of equal abundance of all clusters, which will cause the size estimates to be deflated. At day 7, the dramatic changes in mean mutation levels (increase), diversity (decrease), and isotype subclass usage are indicative of potential B cell activation (4). We, therefore, hypothesize that these changes were due to a subclinical immune activation in this participant at day 7. In addition, the abundant repertoire had the smallest estimated repertoire size at day 7, potentially due to a restriction toward more limited antigenic specificity. Despite their reduction in number, the abundant clusters were large and mutated, so likely composed of rapidly proliferating B cells (49). The day 7 suspected immune stimulus highlights how dynamic the repertoire is over time, so for studies of the BCR repertoire following immune stimulation, it must, therefore, be considered that there could be natural stimulations of the repertoire that could affect the results. Further evidence for this comes from studying the differences between participants, where one had high mutation, and low repertoire diversity, indicative of immune stimulation in this participant as well. A larger study of more participants at multiple timepoints is necessary to determine exactly how common such immune stimulations are in normal healthy individuals, and how long they tend to last. In addition, further exploration of the optimal statistical approach for measuring BCR repertoire diversity is required (50). We used three different single diversity measures that have previously been applied to BCR repertoire studies (8,35). These all gave similar results, and were adequate for detecting the change at day 7. However, it has been reported that the use of different single diversity measures could give different results when comparing BCR repertoires, depending on the weight they give to abundant vs. rare species (36). For this reason, we also used the diversity profile method for comparing repertoires as suggested by Greiff et al. (36), which was also able to distinguish the day 7 samples from the other days.
Despite the day 7 changes in the repertoire, the proportion of usage of different VJ gene combinations remained steady over time in a single individual, compared to the relatively large differences between individuals. As VJ usage, therefore, appears unique to an individual, and certain V genes are preferentially used in a protective response to certain antigens (51,52), it could potentially be a cause of variation in disease and vaccine responses between individuals. As well as VJ usage frequency, V gene genotype is also unique for each individual in this study, and as certain V gene alleles may have different binding abilities to certain antigens (53), this may also affect disease and vaccine response. Although isotype subclass usage, mutation and diversity are more affected by immune stimulation, they also display considerable inter-individual variation. We hypothesized that one cause of this could be the different ages of the individuals in the study, as it has previously been observed that the repertoire in more elderly participants is less diverse and more mutated (9,35). We also saw a decrease in diversity, and increase in mutation with age, although this was not a statistically significant finding. Most striking, however, was a significant decrease in IgG2 levels with age, coinciding with an increase in IgG1 and IgG3. Such an observation is of potential importance, as the different IgG subclasses have different activities in different antigenic contexts (54), and change in their relative abundance could be a contributing factor to immunosenescence (55). For example, IgG2 is important for mounting immune responses against polysaccharide-encapsulated pathogens (such as Neisseria meningitides and Streptococcus pneumonia) (56). Such responses are reduced in older individuals (57), so this could be due to decreased IgG2 levels.
It appears that when different individuals are exposed to a common antigenic stimulus, there is a degree of similarity in the response (a public repertoire) at the BCR sequence level, and that this could be used to identify antigen-specific BCR sequences (4,11,17,21). However, here we also observe the presence of a public repertoire in the absence of any common immune stimulation. The presence of such a public repertoire could have three possible causes: laboratory contamination of different samples, random overlap by chance, or historical common antigenic stimuli. Laboratory work was conducted under stringent conditions to minimize cross-sample contamination, and there are no clusters shared across all samples, making this an unlikely contributor to the public repertoire. If sharing was due to chance, it is expected that the public and private repertoires would have similar properties, but this is not the case. The public IgG repertoire comprises larger, more mutated clusters, with shorter CDR3s than the private repertoire; this is consistent with these clusters arising from more differentiated B cell subsets (4). In addition, considering the presumed antigenic specificity of the clusters, a greater proportion of the public repertoire comprised presumed TT or influenza-specific clusters compared to the private repertoire. These are antigens to which all participants in the study are likely to have been exposed through either vaccination or infection, and thus provide support for using public repertoire analysis for identification of antigen-specific clusters following common antigen stimulation. However, it must be considered that this technique could then also enrich for sequences specific to antigens that are commonly encountered by the population. The public repertoire in the IgM dataset is approximately three times larger than that of the IgG or IgA datasets. This may be due to the presence of natural IgM antibodies, which target conserved microbial determinants and autoantigens (58); the most abundant of these is anti-gal, which constitutes ~1% of all human antibodies (59). This could explain why the public repertoire is less mutated than the private repertoire for IgM sequences, while the converse is true for IgG sequences. Unfortunately, to our knowledge there are currently no large sequence datasets of natural antibodies available to search in our dataset to confirm this hypothesis.
To summarize, we present a robust BCR repertoire sequencing method, and demonstrate that a sample of 500,000 B cells, and sequencing depth of 100,000 sequences per B cell isotype should be sufficient for most applications. The BCR repertoire varies substantially within a single individual over time, and between multiple different individuals. Assessing the complete range of such variation would benefit from further characterization in larger study cohorts. Nevertheless, we show certain conserved features of the repertoire within individuals that could be predictive of immune function. Finally, we investigated the public repertoire, and show that this is likely to be enriched for sequences with specificity toward antigens commonly encountered by the population.

acKnOWleDgMenTs
The authors are grateful to all the study participants and to the doctors and nurses at the Oxford Vaccine Group for assisting with sample collection. The authors also acknowledge support of the National Institute for Health Research Clinical Research Network and the UK Medical Research Council. Study funding was provided by the BBSRC and GlaxoSmithKline in the form of an iCASE studentship awarded to JG. Sequence data were generated by the High Throughput Genomics Group at the Wellcome Trust Centre for Human Genetics, which is subsidized by Wellcome trust grant reference 090532/Z/09/Z. AP is a Jenner Investigator and James Martin Senior Fellow. DK receives salary support from the NIHR Oxford Biomedical Research Centre.