Discordance Between the Predicted Versus the Actually Recognized CD8+ T Cell Epitopes of HCMV pp65 Antigen and Aleatory Epitope Dominance

CD8+ T cell immune monitoring aims at measuring the size and functions of antigen-specific CD8+ T cell populations, thereby providing insights into cell-mediated immunity operational in a test subject. The selection of peptides for ex vivo CD8+ T cell detection is critical because within a complex antigen exists a multitude of potential epitopes that can be presented by HLA class I molecules. Further complicating this task, there is HLA class I polygenism and polymorphism which predisposes CD8+ T cell responses towards individualized epitope recognition profiles. In this study, we compare the actual CD8+ T cell recognition of a well-characterized model antigen, human cytomegalovirus (HCMV) pp65 protein, with its anticipated epitope coverage. Due to the abundance of experimentally defined HLA-A*02:01-restricted pp65 epitopes, and because in silico epitope predictions are most advanced for HLA-A*02:01, we elected to focus on subjects expressing this allele. In each test subject, every possible CD8+ T cell epitope was systematically covered testing 553 individual peptides that walk the sequence of pp65 in steps of single amino acids. Highly individualized CD8+ T cell response profiles with aleatory epitope recognition patterns were observed. No correlation was found between epitopes’ ranking on the prediction scale and their actual immune dominance. Collectively, these data suggest that accurate CD8+ T cell immune monitoring may necessitate reliance on agnostic mega peptide pools, or brute force mapping, rather than electing individual peptides as representative epitopes for tetramer and other multimer labeling of surface antigen receptors.


INTRODUCTION
T cell immune monitoring has a long and successful track record in murine models in which defined experimental conditions, small model antigens, and work with inbred mouse strains expressing few restriction elements (MHC molecules) simplified the task (1). The magnitude of scope is entirely different when the outbred human population is to be studied, largely due to the immense diversity in complexity restriction elements (human leukocyte antigens, HLA) and the of the antigenic systems, such as viruses. To comprehensively monitor T cell immunity to SARS-CoV-2, for example, this virus' entire proteome, 9,871 amino acids long (2), would need to be considered. In this report, we confined ourselves to a single protein of HCMV, pp65, which is "only" 561 amino acid long, and to subjects who shared a common HLA-A*02:01 restriction element, but differed in the remaining HLA class I alleles.
Monitoring CD4+ T cell immunity is relatively simple. When the test antigen of interest is added as a protein to peripheral blood mononuclear cells (PBMC), the antigen presenting cells (APC) contained in the PBMC will acquire, process, and present the antigen (3). Unfortunately, this is not the case for ex vivo CD8+ T cell detection. CD8+ T cells evolved to recognize antigens actively bio-synthetized within host cells, as opposed to antigens that APC acquire from their surroundings. Thereby CD8+ T cells can survey ongoing protein synthesis in the cells of the body, permitting them to identify virally-infected or malignant cells, so as to kill them. During protein synthesis, defective byproducts also arise that are degraded by the proteasome into peptide fragments. Such peptides are loaded onto HLA class I molecules, and transported to the cell surface to be displayed to CD8+ T cells (4). Protein antigens are not suited to recall in vivo-primed CD8+ T cells within PBMC because exogenously added proteins are not efficiently presented to CD8+ T cells in the context of class I molecules. Instead, the CD8+ T cell epitopes need to be added as 8-11 amino acid long peptides that they can bind directly to cell surface expressed HLA class I molecules. From this requirement the need arises to select the "right" peptides for ex vivo CD8+ T cell immune monitoring: those very same epitopes that have induced a CD8+ T cell response in vivo. Missing the "right" peptides, or only partially covering them, has the consequence that the antigen-specific CD8+ T cell repertoire could go partially or entirely undetected.
Selecting the "right" peptides for CD8+ T cell immune monitoring is an inherently intricate task. HLA class I molecules are encoded by three genetic loci, HLA-A, HLA-B, and HLA-C, for which a multitude of alleles exist in the human population (5). Each allelic HLA class I molecule has a unique peptide binding specificity (6). As there are barely two humans with an identical HLA-type, there should be barely two humans who present the same array of epitopes. Protecting the species, T cell epitope recognition evolved to be highly individualized (7). Peptide selection for comprehensive CD8+ T cell immune monitoring must therefore account for the unique HLA allele composition in each test subject.
A mainstream effort for identifying the "right" peptides for CD8+ T cell monitoring is reliant upon in silico epitope predictions. As the peptide binding motifs for most HLA alleles are well-defined, predictions can be made as far as which peptide sequence of an antigen can bind to a given HLA allele, thus constituting a potential T cell epitope. Search engines have been made available to the scientific community to rank peptide sequences for their predicted binding strength to most HLA alleles, thus narrowing in on a finite set of epitopes. A critical assumption for epitope predictions is that peptides that rank high in their respective HLA allele binding score will be those that are being targeted most by CD8+ T cells. The data presented in this study challenge this hypothesis supporting the conclusions reached by Mei et al. (8).
Beyond doubt, a peptide needs to be able to bind to an HLA allele to be a candidate for T cell recognition. However, whether a peptide sequence of a protein antigen that has HLA-binding potential indeed becomes an epitope recognized by T cells is defined by many additional factors (9). Limitations exist on the level of antigen presentation, including whether that exact peptide is indeed generated through natural antigen processing, and whether it is produced in quantities that can outcompete other peptides, including self-peptides, for binding to the respective HLA molecules. It has been shown that different class I alleles present in an individual can compete with each other for epitope dominance (10). Limitations also exist on the level of the pre-immune T cell repertoire available to engage in antigen recognition. The duration and abundance of epitope presentation will also affect the ensuing CD8+ T cell response, being regulated both by a virus' replication biology, and the host's ability to control the virus. The CD8+ T cell response is dynamic (11). Therefore, it can be expected that only a fraction of peptides with HLA class I binding properties will elicit strong CD8+ T cell responses, becoming dominant epitopes. Other presented peptides might induce a weaker, subdominant, barely detectable, cryptic, or no CD8+ T cell responses at all. As all antigen-specific CD8+ T cells can be expected to contribute equally to the host's defense, irrespective of their fine specificity, comprehensive immune monitoring must not focus on a single or few epitopes, but should instead accommodate all epitopes of an antigen targeted by CD8+ T cells in an individual in order to assess the entire antigen-specific T cell pool.
Next to predictions in silico, experimentally verified epitopes have been used as a guide to select peptides for CD8+ T cell immune monitoring. Over the years, T cell lines and clones specific for many viral antigens have been isolated and their epitope specificity compiled in databases (12,13). Selection of such previously verified epitopes for immune monitoring is based on the assumption that epitope recognition, including epitope hierarchy, is constant in subjects who express the corresponding HLA allele. In other words, if an HLA-Xrestricted peptide Y has been identified as an immune dominant epitope in an HLA-X positive subject Z, this peptide Y will also be immune dominant in HLA-X positive subjects V and W. Such predictable immune dominance prevails in simple murine models when inbred mice are studied that express minimal restriction element diversity (14). However, predictable epitope dominance is lost as soon as restriction element diversity rises through interbreeding these inbred mouse strains. In such F1 mice, aleatory epitope recognition prevails (15): T cells in each F1 mouse respond in an unpredictable, dice-like fashion (alea means dice in Latin) to epitopes to which the parental strains responded predictably. Aleatory epitope dominance may also apply to humans due to their diverse restriction element makeup (16). Therefore, in the present study of HCMV pp65 epitope recognition in HLA-A*02:01-positive individuals, we also compare the peptides that the CD8+ T cells actually target in our cohort with previously verified epitopes.
The third approach for CD8+ T cell immune monitoring is not to select peptides at all, but to systematically test all possible peptides of the antigen. This can be done by using mega peptide pools consisting of hundreds of peptides that cover entire proteins of a virus. By necessity, this approach has become standard recently in the first real world challenge on clinical T cell immune monitoring: trying to study T cell immunity induced by SARS-CoV-2 infection. This crude approach is simple and practical, yet permits comprehensive assessment of the entire expressed antigen-specific T cell repertoire in outbred populations, without requiring customization to HLA types of individuals, but it does not reveal the epitope specificity of the antigen-reactive T cells.
In this study, we applied an agnostic approach in which all possible peptides were tested individually on each subject in a "brute force" high-throughput manner (17). The ability to test hundreds, even thousands of peptides individually on a subject is a recent technological advancement. The hurdles that needed to be overcome included limitations in PBMC numbers available from a subject, access to extensive custom peptide libraries, highthroughput-capable T cell assay platforms, and automated data analysis. We have developed and report here large-scale epitope mapping strategies that can be readily adopted even in small academic laboratories operating on tight budgets. Empowered by the ability to experimentally verify CD8+ T cell epitope utilization at the highest possible resolution in the well-studied HCMV pp65 T cell immune monitoring model, we set out to compare the epitopes actually recognized with those that are predicted, or assumed to be recognized based on existing data.
A protein of HCMV, pp65 is as far as CD8+ T cell recognition goes arguably one of the best studied model antigens: over the decades, 31 epitopes have been experimentally verified for the HLA-A2 allele alone (these are listed in Supplementary  Table 1, including the corresponding references). Moreover, while the understanding of the rules for peptide binding to most HLA alleles is overall advanced, they are by far best established for HLA-A2. In this study, therefore we focus on the actual CD8+ T cell recognition of pp65 epitopes in HLA-A2 positive subjects, asking the question whether previously defined epitopes or binding predictions suffice to guide the selection of individual peptides for immune monitoring purposes. We assume that lessons learned from the best studied model antigen have implications for still less characterized foreign antigens, such as those of the SARS-CoV-2 virus, or tumor/ self-antigens. We draw attention to how incomplete our appreciation of an individual's expressed epitope space currently is even when it comes to a well-studied foreign antigen. Epitope utilization in anti-self/cancer antigen recognition can be expected to underly the same rules, plus T cell repertoire limitations caused by negative selection by abundantly presented self-peptides. Our data suggest that neither epitope predictions, nor reliance on known epitopes suffice, but rather that the agnostic route is best suited for comprehensive CD8+ T cell immune monitoring for foreign antigens, and by inference, tumor antigens as well.
HLA multimers (tetramers, pentamers, dextramers) are frequently used for CD8+ T cell monitoring. Consisting of HLA molecules loaded with a peptide epitope, multimers constitute the T cell-receptor (TCR) ligand that can be used to selectively stain antigen-specific T cells (18). This technique is invaluable for the in-depth phenotypic analysis of the antigenspecific T cells via flow cytometry (19), but its limitation is that it requires epitope utilization to be predictable, either based on previously defined, or on in silico predicted epitopes. The data communicated in the following call into question whether such predictions are accurate even in the case of the arguably bestdefined model antigen, pp65. We argue that the agnostic use of peptide libraries is needed for reliable CD8+ T cell monitoring, along with the utilization of techniques that are suited for work with such mega peptide pools, including ELISPOT, ICS, and the AID tests. . The ten subjects for this study were selected according to their HCMV-positive status. The frozen cells were thawed following an optimized protocol (20) resulting in viability > 90% for all samples. The PBMC were resuspended in CTL-Test ™ Medium (from CTL), developed for low background and high signal performance in ELISPOT assays. The number of PBMC plated into the ImmunoSpot ® (ELISPOT) experiments was 3 × 10 5 PBMC per well.

Peptides and Antigens
Five hundred fifty-three nonamer peptides, spanning the entire amino acid (a.a.) sequence of the HCMV pp65 protein in steps of single a.a. were purchased from JPT (Berlin, Germany) as a FastTrack CD8+ T cell epitope library. These peptides were not further purified following their synthesis; however, individual peptides were analyzed by JPT using LC-MS. The average purity of these peptides was 56%. These peptides were delivered as lyophilized powder with each peptide present in a dedicated well of a 96-well plate, distributed across a total of six 96-well plates. Individual peptides were first dissolved in 50 ml DMSO, followed by addition of 200 ml of CTL-Test ™ Medium so as to generate a "primary peptide stock solution" at 100 mg peptide/ml with 20% v/v DMSO. From each of these wells, a "secondary, 10X peptide stock solution" was prepared using a 96-well multichannel pipette, in which peptides were at a concentration of 2 mg/ml, with DMSO diluted to 0.4%. On the day of testing, 20 ml from each well was transferred "en block", with a 96-well multichannel pipette into pre-coated ImmunoSpot ® assay plates containing 80 ml CTL-Test ™ Medium. Finally, 100 ml of PBMC (containing 3 × 10 5 cells) in CTL-Test ™ Medium was added resulting in a test peptide concentration of 0.2 mg/ml with DMSO present at 0.04% v/v. UV-inactivated entire HCMV virions (HCMV Grade 2 antigen from CTL) at 10 mg/ml was used to recall HCMVspecific CD4 cells. CPI (from CTL) was used as a positive control because, unlike CEF peptides, CPI elicits T cell recall responses in all healthy donors (21). CPI is a combination of protein antigens derived from CMV, influenza and parainfluenza viruses, and was used at a final concentration of 6 mg/ml in ImmunoSpot ® assays.

Human IFN-g ImmunoSpot ® Assays
Single-color enzymatic ImmunoSpot ® kits from CTL were used for the detection of antigen-induced IFNg-producing CD8+ T cells. Peptides or pp65 were plated at the above specified concentrations into capture antibody-precoated assay plates in a volume of 100 ml per well. These plates with the antigen were stored in a CO 2 incubator for less than 1 h until the PBMC were thawed and ready for plating. The PBMC were added at 3 × 10 5 cells/well in 100 ml CTL-Test ™ Medium followed by a 24 h activation culture at 37°C and 9% CO 2 . Thereafter the cells were removed, IFNg detection antibody was added, and the platebound cytokine was visualized by enzyme-catalyzed substrate precipitation. After washing, the plates were air-dried prior to scanning and counting of spot forming units (SFU). ELISPOT plates were analyzed using an ImmunoSpot ® S6 Reader, by CTL. For each well, SFU were automatically calculated by the ImmunoSpot ® Software using its Autogate ™ function (22). The data are expressed as SFU per 3 × 10 5 PBMC, whereby each SFU corresponds to the cytokine footprint of an individual IFNg-producing T cell (23).

Statistical Analysis
As SFU counts follow Gaussian (normal) distribution among replicate wells, the use of parametric statistics is appropriate to identify positive and negative responses, respectively (24). The 553 individual peptides of the pp65 nonamer peptide library were tested in single wells. For these peptides, the threshold for a positive response was set at SFU counts exceeding 3 SD of the mean SFU count detected in 18 replicate media control wells, the latter defining the background noise of the test system. This cut off criterion for weak (cryptic) responses renders the likelihood for false positive results at 0.3%. Dominant responses were defined by exceeding 10 SD, and subdominant responses between 5 and 10 SD of the negative control. Simple linear regressions were preformed to examine relationships between variables.

HLA-Binding Predictions
We assessed peptide-HLA I presentation by predicting peptide-HLA I binding using HLA I allele-specific profile motif matrices (25). We considered that a given peptide binds to a specific HLA I molecule when its binding score ranks within the top 3% percentile of the binding scores computed for 1,000 random 9-mer peptides (average amino acid composition of proteins in the SwissProt database). Peptide binding to experimentally defined HLA-A*02:01 restricted epitopes was predicted using netMHCIpan (25) an IEDB analysis resource (12), reporting percentile binding score. The lower the percentile binding score the better the binding. We selected netMHCIpan because it is the NIH's official recommended site that was created based on the consensus of earlier epitope prediction algorithms. Moreover, netMHCIpan allows to target more MHC I molecules for peptide binding predictions than any other method.

Previously Defined Epitopes
Epitope data for HLA-A2-restricted CD8+ T cell recognition was obtained from IEDB (12) with the following search settings: positive response only; host human; MHC I allele, HLA-A*0201, source species HCVM, source antigen: pp65. Only peptides 9 a.a. long were considered.

Experimental Design
Systematic CD8+ T cell epitope mapping for the HCMV pp65 protein requires 553 nonamer peptides to be tested individually on PBMC of single subjects. Due to the magnitude of this scope, such data have not been reported so far except for a recent feasibility study from our own group (17). We took advantage of the fact that ImmunoSpot ® assays require as few as 300,000 PBMC per antigen stimulation condition, and that these assays lend themselves to high-throughput testing and analysis. Utilizing only 173 million PBMC per subject, we therefore could test individually 553 nonamer HCMV pp65 peptides, along with 18 negative control replicate wells to establish the background noise, and the CPI positive control in triplicate.
Nonamer peptides were selected because the peptide binding groove of HLA class I molecules accommodates peptides 8-11 amino acid (aa) in length, with the most common peptide size being 9 aa residues (3). In contrast, HLA class II molecules present longer peptides (26). As such, the usage of nonamer peptides in our assays permitted preferential activation of CD8+ T cells (Some nonamer peptides can bind to HLA II molecules but there are only few CD4 T cell epitopes known that comprise of nonamers). Moreover, because the peptide binding groove of HLA class I molecules is closed on both ends (3), it is intolerant for frame shifts. The peptide library was designed therefore to walk the pp65 protein in steps of single a.a. with each nonamer peptide overlapping by 8 a.a. with the previous one (Supplementary Figure 1). Importantly, this approach enabled systematic coverage of every possible CD8+ T cell epitope within the pp65 antigen. By utilizing similar peptide libraries that are one or two amino acids shorter or longer than nine we possibly may have gained even more high-resolution data on epitope-reactive CD8+ T cells but for feasibility reasons such comparisons have been deferred. To reduce assay variables, all peptides used in this study were from the same vendor, and were synthetized, stored, dissolved, and tested in the same way. Moreover, all the peptides were tested on each PBMC donor in a single experiment, which rendered the peptides the only assay variable.
Standard IFNg ImmunoSpot ® assays with 24 h antigen exposure of PBMC were performed; a time period required for blast transformation and CD8+ T cell activation-driven IFNg secretion to occur, but too short to permit CD8+ T cell proliferation or differentiation during the cell culture. Thus, we measured at single-cell resolution the frequencies of antigen-specific IFNg-producing CD8+ T cells as they occurred in vivo at isolation of the PBMC. This approach, therefore permitted us to firmly measure within each PBMC sample the number of CD8+ T cells responding to each peptide, and thus to compare the frequencies of peptidereactive CD8+ T cells to establish epitope hierarchies for each donor. Moreover, adding up all peptide-induced IFNg SFU permits one to assess the cumulative magnitude of the antigen-specific CD8+ T cell population in each donor, in turn allowing for determination of the relative percentage of antigen-specific CD8+ T cells targeting individual epitopes in each test subject.
Such actual measurements of the epitope-specific CD8+ T cells were then compared to a) the recognition of published HLA-A*02:01-restricted epitopes, and b) epitope predictions not only for HLA-A*02:01, an allele that all 10 test subjects shared by design, but also for all other class I molecules expressed by the test subjects.
Epitope scans of this type do not permit to define the activation/differentiation states/lineages of the CD8+ T cells that recognize them. As most functional CD8+ memory/ effector T cells secrete IFN-g, however, screening for IFN-g should be well suited for detecting epitopes that are targeted by CD8+ T cells (19). Once the number of possible peptides has been narrowed from hundreds to a couple per donor, it becomes possible to narrow in on such peptide-reactive CD8+ T cell populations, e.g.by studying their phenotype by multimers, or by testing their (co-) secretion of other cytokines and effector molecules by multicolor FluoroSpot (27).

Highly Variable HCMV pp65 Epitope Recognition Patterns in HLA-A*02:01 Positive Subjects
Eighteen replicate wells containing media alone were included for all 10 individuals in our HCMV positive, HLA-A*02:01 positive cohort in order to firmly establish the background noise of the respective PBMC. The mean and standard deviation (SD) of this negative control is shown for all subjects in Table 1, also specifying the cut off values used for analyzing the peptide-induced SFU counts. The 533 individual pp65 nonamer peptides were also tested on each subjects' PBMC, and the resulting peptide-induced SFUcounts graded: peptides triggering SFU counts larger than 3 and less or equal to 5 SD over the medium control were considered weakly positive or cryptic (highlighted in beige in Table 1). Of note, with the mean plus 3 SD definition utilized in this study, the chance for a datapoint being false positive was negligible, less than 0.3%. Peptides triggering SFU counts more than 5 and less than or equal 10 SD over the medium b a c k g r o u n d ( h i g h l i g h t e d i n y e l l o w ) w e r e l a b e l e d subdominant, and peptides eliciting SFU counts exceeding 10 SD over the medium control were called dominant (and are labeled in orange in Table 1). We also introduced a fourth category for peptides that recalled CD8+ T cells in frequencies exceeding 100 SFU/300,000 PBMC, calling them superdominant epitopes (shown in red in Table 1). Table 1 lists peptides that induced at least one dominant or super-dominant recall response in at least one of the ten test subjects in our cohort. Only for these 56 select peptides of the 553 tested are SFU counts shown for the ten donors. Additionally, a color-coding system was utilized in Table 1 to delineate whether the peptide recalled a super-, dominant, subdominant, cryptic, or no response in the test subject.
As revealed by the color code at a glance, epitope recognition followed highly individual patterns, that are closer dissected below.
Multiple HCMV pp65 Epitopes Are Recognized in Each HLA-A*02:01 Positive Subject As Table 1 lists only super-and dominant recall responses (>10 SD over background), in Supplementary Table 3 we list 58 additional peptides that induced subdominant recall responses (5-10 SD over the background) in at least one of the ten test subjects in our cohort. At a glance, the color code reveals that these peptides are also recognized in a highly individualized pattern. Peptides that recalled cryptic responses (3-5 SD over background) are not listed individually, but their number is specified for each test subject in Table 2, along with the number of subdominant, and dominant and super-dominant epitopes recognized in each donor. Adding up the number of epitopes in all four categories permits one to establish the cumulative number of CD8+ T cell epitopes recognized in each subject, which varied between 5 and 47 HCMV pp65-derived peptides in this cohort ( x = 29 ± 17). Thus, of the 553 peptides covering the 561 amino acid long pp65 protein, between 1% and 8% ( x = 5 % ) of the peptides constituted a CD8+ T cell epitope in each individual, but for the entire cohort 114 peptides (21% of 553 peptides tested) were needed to recall all dominant (56 peptides) and subdominant (58 peptides) CD8+ T cell epitopes. These data draw attention to how critical it is for immune monitoring to hit the right peptides-those few super-dominant aleatory epitopes that the majority of CD8+ T cells target. The number of CD8+ T cells recognizing cryptic, subdominant and dominant epitopes, in spite of the numbers of such epitopes, does not add up in most subjects to the repertoire that is directed against the few superdominant epitopes.
In the feasibility study for this paper (Tables 1 and 2 of our publication (17) we studied HCMV negative subjects, finding no significant HCMV peptide-triggered IFN-g spot formation.
Confirming this notion, in a detailed study dedicated to chance cross-reactivity in various antigenic systems, we also did not find evidence for such (28). These data suggest that chance crossreactivity does not play a role in the dominant and superdominant HCMV responses we report here.
The Majority of the pp65-Specific CD8+ T Cell Repertoire Targets

Super-Dominant Epitopes
As T cells recognize processed peptides of antigens there is no reason to assume that a T cell specific for one peptide of the antigen will contribute differently to host defense than T cells recognizing >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU Ten subjects' PBMC at 300,000 cells per well were challenged with a library of 553 nonamer peptides that systematically covered all possible CD8+ T cell epitopes of the HCMV ppp65 antigen. An IFN-g ImmunoSpot assay was performed with the spot forming units (SFU) elicited by each peptide recorded. The mean and SD for 18 negative control media wells, and the cut-off values for the color-coded response categories are specified on the bottom of this  another. Immune monitoring therefore needs to assess all antigenspecific CD8+ T cells irrespective of their fine specificity. In our systematic assessment of CD8+ T cell immunity to pp65, we defined this number as the sum of all SFU counts elicited by the individual epitopes in a subject. This cumulative number of pp65specific CD8+ T cells is shown for each subject in Table 2 as "Cum. Spec. SFU". From this number, one can calculate what percentage of the pp65-specific CD8 + T cells occurs in each of the four response categories. As seen in Table 2, although the number of super-dominant epitopes was low in each subject (between 4 and 1), in eight of ten donors the majority of pp65-specific CD8+ T cells targeted these super-dominant epitopes. The percentage of CD8+ T cells specific for individual dominant and super-dominant epitopes is shown in S. Table 4.   >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU >100 SFU All 553 nonamer pp65 peptides in our library were run on the netMHCIpan search engine of the IEDB Analysis Resource for predicting their binding to the HLA-A*02:01 allele, resulting in "pp65 Rank" shown, with the top binder peptide ranked No. 1. The 30 highest ranking peptides are listed. Additionally, an Percentile Binding Score that compares a peptide's binding relative to the binding scores computed for 1,000 random nonamer peptides, reporting the percentile binding score, is also listed. A lower percentile binding score denotes better peptide binding to the HLA-A*02:01 allele. Peptides that have been described as HLA-A*02:01 restricted nonamer epitopes in the literature are highlighted in green and the color-coded response categories defined in Table 1 were applied.

CD8+ T Cells Target pp65 Epitopes in an Aleatory Manner
The data in Table 1 and Supplementary Table 3 show that each subject in our cohort displayed a unique CD8+ T cell epitope recognition pattern. This might come as a surprise, as all these subjects were HLA-A*02:01 positive, and one might have expected that among the epitopes recognized there should be at least a shared subset, those restricted by the HLA-A*02:01 allele. To narrow our investigation on such HLA-A*02:01restricted epitopes, we searched the IEDB database for HLA-A*02:01-restricted nonamer epitopes identifying 31 that have been experimentally verified so far: these are listed in Supplementary Importantly, donors who did not respond strongly or at all to these previously defined epitopes responded vigorously to other peptides of pp65 ( Table 1). These previously defined HLA-A*02:01-restricted peptides were therefore also targeted in a dice like, aleatory manner in HLA-A*02:01 positive subjects. Only one previously defined HLA-A*02:01-restricted epitope, pp65 495-503 , induced a dominant, or super-dominant recall response in eight of 10 subjects in our cohort ( Table 1 and  Supplementary Table 1). However, this peptide was not targeted in two donors (ID3# and ID#9) who exhibited responses to other pp65-derived peptides in a super-dominant fashion. Intrigued by this finding, we tested 42 additional (52 in total) HCMV positive, HLA-A*02:01 positive subjects for their recall response to the pp65 495-503 peptide. As shown in Supplementary Figure 2, the numbers of CD8+ T cells responding to the pp65 495-503 peptide did not correlate (r 2 = 0.01) to the numbers of T cells recalled by inactivated HCMV virus; which primarily activates HCMVspecific CD4+ T cells. Even though all these subjects have developed T cell immunity to HCMV, about one fourth of them either did not respond to the pp65 495-503 peptide, or displayed a low frequency of pp65 495-503 -specific CD8+ T cells. This finding is consistent with the notion that the CD8+ T cell response to pp65 495-503 peptide is also aleatory. Interestingly, although the HLA-A*02:01 restriction element was shared by all test subjects in our cohort, and despite the pp65 495-503 peptide being displayed in vivo via natural processing and presentation, in some individuals this epitope triggered a super-dominant CD8+ T cell response while in other subjects it did not induce a detectable respond at all. Moreover, in yet other donors, the magnitude of the CD8+T cell response induced by this peptide was anywhere between these two extremes. However, the higher prevalence of recognition for this epitope (pp65 495-503 ) compared to other previously defined HLA-A*02:01-restricted epitopes might have an unexpected reason: in addition to HLA-A*02:01, pp65 495-503 received a top binding score for several additional HLA-class I alleles (see next).

HLA Binding Scores Are Unreliable Predictors of Actual CD8+ T Cell Epitope Utilization
The participation of the other HLA class I alleles, beyond the shared HLA-A*02:01 restriction element, might explain the highly individual CD8+ T cell response pattern observed in our cohort. Based on extensive knowledge on the peptide binding properties of individual HLA alleles, reference search engines have been established that permit in silico predictions of which peptides fit the binding criteria of a given allele, and moreover, the strength of peptide binding can also be ranked. It has been widely anticipated that such in silico models will suffice to predict epitope utilization. In particular, when there is the need to select one or a few candidate epitopes, e.g. for multimer analysis, it is tempting to pick peptides that have the highest predicted binding score for the HLA allele of interest. In the following we address the validity of such an approach from three angles.
In the first two approaches we focused on predictions for HLA-A*02:01, the most studied HLA allele that is shared by all subjects in our cohort. We introduced into the netMHCIpan (25) search engine of the IEDB analysis resource (12) the individual sequences of our pp65 nonamer peptide library, resulting in the predicted pp65 ranking shown in Table 3 (in which only the top 30 predicted peptides of 553 are shown). In Approach 1, we compared this in silico predicted epitope hierarchy for pp65 with the actual peptide recognition we detected in our cohort. As can be seen in Table 3, pp65 495-503 ranked as the top binder, and indeed induced a CD8+ T cell response in the majority of our HLA-A*02:01 positive cohort (albeit in an aleatory manner, see above). Most of the other predicted peptides with high HLA-A*02:01 binding scores recalled CD8+T cells in low frequencies, and in an aleatory manner with each of these predicted peptides eliciting SFU in only one or two of the 10 test subjects. Except for the pp65 495-503 peptide, none of the super-dominant, and few of the dominant responses were recalled by these top 30 HLA-A*02:01 binding peptides (compare with Table 1). One might rightfully argue that this is because those dominant peptides were restricted by, and are binders of alternative class I molecules present in our cohort. We will address this hypothesis below, in Approach 3.
In Approach 2, we looked up the predicted HLA-A*02:01 binding scores for those peptides that have been identified experimentally as HLA-A*02:01-restricted pp65 epitopes. In Supplementary Table 1 these peptides have been listed according to their predicted HLA-A*02:01 binding ranking along with CD8+ T cell recall responses they induced in our HLA-A*02:01 positive cohort. With the exception of the pp65 495-503 peptide, none of these peptides were among the predicted top 20 binders. Seeking for a correlation between the predicted HLA-A*02:01 binding of these peptides, and their actual immune dominance, these data are also represented graphically in Supplementary Figure 3. No significant correlation was seen. The fact, however, that these peptides were targeted by CD8+ T cells in HLA-A*02:01 positive subjects establishes that immune dominant epitopes do not need to rank high in peptide binding score. The score apparently needs to be just high enough to enable stable HLA allele binding.
Proteasome cleavage and TAP binding predictions can enhance CD8+ T cell epitope discrimination in silico as compared with peptide-MHC I binding predictions alone (29). These data suggest, however, that such refinements to epitope predictions might not suffice to improve the ability to foretell actually recognized epitopes. All 31 peptides in Supplementary  Table 1 are previously experimentally defined HLA-A*02:01restricted epitopes. All of these peptides therefore passed proteasome and TAP selection. As there are no major known polymorphisms at the level of the proteasome or TAP binding, such are unlikely to contribute to the aleatory epitope recognition pattern observed for previously defined peptides. Therefore, rather than differences in antigen presentation, T cell repertoires and downstream repertoire selection processes are likely to explain the highly individualized epitope hierarchies seen in individuals.
In Approach 3, we matched binding predictions for all super-dominant and dominant epitopes detected in each of the 10 subjects with all HLA class I alleles expressed in the subject. The results shown for Subject ID 7 in Figure 1 are fully representative for all other subjects in our cohort (see Supplementary Figure 4). For donor ID 7 three superdominant epitopes were identified; these are represented by the red symbols. One is peptide 251-259 (the red triangle) that does not rank as a strong binder for any of the class I alleles present in this individual (a low "percentile binding score" on the Y axis of the graph means strong predicted binding). Based on the binding score the super-dominant status of this peptide could not have been predicted, and in this case the binding score also does not suggest what the restriction element might be. Peptide 495-503 (the red square) is also super dominant in this donor. It shows strong binding (a low score) for all class I alleles expressed in this individual, likely explaining its immunogenicity, but also suggesting that multiple restriction elements are involved (and that picking just one of them for a multimer is likely to underrepresent the 495-503-specific CD8 + T cell repertoire in this subject). Peptide 188-196 (the red dot) is also a super dominant epitope in donor ID 7. This peptide shows a high predicted binding score for the B*51:01 allele suggesting that as the restriction element. When designing B*51:01 multimers for immune monitoring for this donor, the 188-196 peptide would have been a hit-but why would one select one B allele over another B allele, neglecting all other loci and alleles, and if one did select top binders for each, most would be a miss. This erratic pattern carries through for all other dominant epitopes in this subject (the black symbols in Figure 1) and in all the other nine subjects we studied (Supplementary Figure 4). Few of the actually targeted CD8+ T cell epitopes ranked amongst the top binders for the class I alleles expressed in these respective subjects, and many super-dominant peptides ranked low. A binding score oriented in silico model would not have sufficed to predict the hierarchy of actual epitope recognition.
All three of our above approaches suggest that, at least in the case of CD8+ T cell immunity induced by HCMV infection against its pp65 antigen, in silico predicted high binding scores for a specific HLA class I allele neither predict whether those peptides will indeed induce a CD8+ T cell response, nor the magnitude of it. This finding raises the question how generalizable it is. Mei et al.'s recent report (8) suggests that it may be generalizable as they came to the same conclusion  Table 1. The corresponding Percentile Binding Score as established by the netMHCIpan search engine is shown comparing a peptide's binding relative to the binding scores computed for 1,000 random nonamer peptides. A lower percentile binding score denotes better peptide binding to the specified HLA allele.

CONCLUDING REMARKS
The scope of this study was to experimentally query whether CD8+ T cell epitope recognition for a prototypic foreign antigen follows immune dominance patterns that permit to predict the peptides recognized so immune monitoring can focus on them. We studied HCMV pp65 antigen recognition by CD8+ T cells in HCMV infected subjects at the highest possible resolution, testing every potential epitope and measuring the numbers of all epitope-specific CD8+ T cells.
Our data show that fixed epitope hierarchies do not exist even in an HLA-A*02:01 allele matched cohort. Instead, different super-dominant and dominant epitopes were targeted by the individual test subjects ( Table 1). Previously defined epitopes, and peptides predicted to be high HLA-A*02:01 binders also were also targeted in some, but not other individuals, if at all (Supplementary Table 1 and Table 3). If generalizable, the notion of such unpredictable, aleatory epitope recognition patterns in individuals makes it obsolete for CD8+ T cell immune monitoring to rely on testing a select few predicted or previously defined peptides. Rather, comprehensive CD8+T cell immune monitoring must be all inclusive, accommodating all potential epitopes on all restriction elements of each test subject. Such can be accomplished by brute force epitope mapping, as we did here, or by the use of mega peptide pools. Brute force epitope mapping might be required as the first step towards high definition CD8+ T cell monitoring, permitting the personalization od multimers. It identifies the few individually variable super dominant epitopes in an individual against which the vast majority of CD8+ T cells are directed ( Table 2). In a second step, the effector lineages/ differentiation states of these CD8+ T cells then can be closer characterized either by studying either their phenotype (19), and/or functions (27), whereby both steps can be pursued sequentially testing aliquots of cryopreserved PBMC. For the first step in this study we used 300,000 PBMC per well so as gain a high-resolution picture of epitope utilization detecting even low frequency CD8+ T cells. For testing 553 individual peptides at 300,000 PBMC we needed 173 million PBMC that can be readily obtained from healthy donors, but less so from diseased individuals and children. When epitope mapping aims at detecting super-dominant and dominant peptides only, this can be accomplished with substantially less PBMC. As the numbers of peptide-triggered SFU counts vs. the number of PBMC plated per well show a linear relationship between one million and 100,000 PBMC per well in regular 96well plates (30), the results provided here could have been obtained with 58 million PBMC testing each peptide individually at 100,000 PBMC/well (however, no longer reliably detecting all subdominant and cryptic epitopes, however), and with e.g. one-fifth of that (12 million PBMC) if the peptides are tested in matrices (31). ELISPOT assay can be further miniaturized to 384-well format requiring one third of PBMC per well compared to the 96-well format (the membrane surface of the 384-well plate is one-third that of the 96-well plate), thus, only four million PBMC could suffice for the agnostic mapping of super dominant and dominant epitopes for an antigen the size of pp65 (30).
The highly individualized nature of CD8+ T cell epitope recognition might also be accommodated by the agnostic use of mega peptide pools. Those presently available consist of 15mer peptides that walk the protein sequence with gaps of four amino acids and contain up to 200 peptides per pool. In a feasibility study towards this publication (17) we tested such a pp65 peptide pool (15-mers, 4 a.a. gaps, 138 peptides) along with the 9-mer epiScan. The number of peptide pool-triggered IFN-g producing (CD4+ and CD8+) T cells approximated the number of all 9-mer peptide-induced CD8+ T cells when the latter were added up. However, for several theoretical reasons we are reluctant to propose the use of such 15-mer peptide pools for CD8+ T cell immune monitoring. First, 15-mer peptides are ideal for binding to HLA class II molecules stimulating CD4+ T cells, but they cannot directly bind to class I molecules whose peptide-binding grove is closed on both ends thus not only prevents the direct accommodation of peptides this long. One possibility for a 15-mer peptide to provide a CD8+ T cell epitope is that the peptide is cross presented -a process that is dependent on a subtype of dendritic cells that is too rare in PBMC (32) to be a prevalent APC type in in vitro recall assays. Another possibility is that peptidases present in the PBMC culture trim the 15-mer peptide to a length that can be accommodated by class I molecules, or that there are shorter byproducts of the 15-mer peptide synthesis present that can bind directly, or both. Thus, to the extent CD8+ T cells are recalled by 15-mer peptide pools, such recall can be expected to occur under highly suboptimal conditions. In addition, covering the protein sequence in steps of 11 a.a. leaves considerable gaps in CD8+ T cell epitope coverage which is of additional concern as the closed peptide binding grove of class I molecules renders peptide binding intolerant to frame shifts in the anchor residues of an epitope. Mega peptide pools ideal for CD8+ T cell monitoring would consist of 9-mer peptides that cover the protein sequence in steps of single amino acids.
While in silico epitope ranking may have limited value in predicting immune dominant peptides, it should be helpful for narrowing in on the subset of peptides on an antigen that has sufficient HLA-allele binding affinity to constitute an epitope. As it is impractical to tailor a multitude of variable peptides to each individual's HLA-type, it might be more realistic for immune monitoring to develop rules for identifying peptides that do not bind to any HLA class I allele, so as to exclude those peptides from testing. Being able to narrow in on peptides should be helpful, as the ultimate goal of immune monitoring is to assess the CD8+ T cell response to the entire proteome of complex antigenic systems, such as all protein antigens of viruses and tumors.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
Ethical review and approval were not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
Experiments were designed by AL, PL and PR. Experimental data were generated by AL with TZ contributing. This publication serves as part of AL's doctoral thesis to be submitted to the Universidad Complutense de Madrid, Madrid, Spain. All authors contributed to the article and approved the submitted version.

FUNDING
This study was funded by the R&D budget of Cellular Technology Limited.
Supplementary Table 2 | HLA class I allotypes and other characteristics of human subjects tested in this study.
Supplementary Table 3 | Subdominant pp65 epitopes recognized in our cohort of ten HCMV positive, HLA-A*02:01 positive subjects. Peptides that elicited CD8+ T cell recall responses between 5 and 10 SD over the media background in at least one of the test subjects are highlighted in yellow. Cryptic recall responses (SD 3-5 over background) triggered by these peptides are highlighted in beige. Peptides that elicited SFU counts exceeding 10 SD over background are not shown in this table, as they are listed in Table 1. Peptides that only elicited cryptic recall responses (mean plus 3-5 SD) are not shown here, but are summarized in Table 2.
Otherwise the legend to Table 1 applies.
Supplementary Table 4 | Percentage of pp65-specific CD8+ T cells targeting individual epitopes. The total number of pp65-specific CD8+ T cells was calculated from the sum of SFU triggered by all epitopes in each subject as detailed in Table 2. The percentages of Cumulative Specific SFU counts elicited by individual peptides in each test subjects are shown. For the corresponding absolute SFU counts see Table 1. Otherwise, the legend to Table 1 applies.
Supplementary Figure 1 | Schematic representation of brute force CD8+ T cell epitope mapping. The amino acid sequence of the protein, illustrated on the top, is covered with nonamer peptides that walk the sequence in steps of single amino acids.
Supplementary Figure 2 | Frequency of HCMV pp65 495-503 peptide-specific CD8+ T cells vs. HCMV grade 2 antigen-reactive T cells. Fifty-two subjects were selected from the ePBMC database for being HLA-A*02:01 positive and responding to HCMV Grade 2 antigen with more than 100 SFU/300,000 PBMC. Each of these subjects' PBMC (represented by a dot) were re-tested in an ImmunoSpot assay for the numbers of SFU triggered by HCMV Grade 2 antigen (shown on the X axis), and the numbers of SFU elicited by the pp65 495-503 peptide (shown on the Y axis). No significant relationship was found by analysis through a simple linear regression.
Supplementary Figure 3 | HLA-A*02:01 binding ranking of previously defined HLA-A*02:01 restricted nonamer pp65 peptides vs. the SFU counts they induced in our cohort of HCMV positive, HLA-A*02:01 positive subjects. The numeric SFU data shown in S. Table 1 are plotted relative to their Percentile Binding Score as established run on the netMHCIpan search engine for predicting their binding to the HLA-A*02:01 allele. No significant relationship was found by analysis through a simple linear regression.
Supplementary Figure 4 | Predicted vs. actual pp65 epitope recognition by CD8+ T cells. Data are shown for the subjects specified in each panel. For each subject his/ her HLA class I alleles are specified (in the case of homozygosity the allele is listed once). Peptides that induced super-dominant responses in that individual are shown as red data points, dominant responses in black and weaker responses are not represented. The raw data for the peptide-induced SFU counts are listed in Table 1. The IEBD Rank shown for each peptide and allele was established using the netMHCIpan search engine predicting the peptides' binding score to the respective HLA allele, whereby a lower Percentile Binding Score binding score denotes better peptide binding.