Kinetics of HIV-Specific CTL Responses Plays a Minimal Role in Determining HIV Escape Dynamics

Cytotoxic T lymphocytes (CTLs) have been suggested to play an important role in controlling human immunodeficiency virus (HIV-1 or simply HIV) infection. HIV, due to its high mutation rate, can evade recognition of T cell responses by generating escape variants that cannot be recognized by HIV-specific CTLs. Although HIV escape from CTL responses has been well documented, factors contributing to the timing and the rate of viral escape from T cells have not been fully elucidated. Fitness costs associated with escape and magnitude of the epitope-specific T cell response are generally considered to be the key in determining timing of HIV escape. Several previous analyses generally ignored the kinetics of T cell responses in predicting viral escape by either considering constant or maximal T cell response; several studies also considered escape from different T cell responses to be independent. Here, we focus our analysis on data from two patients from a recent study with relatively frequent measurements of both virus sequences and HIV-specific T cell response to determine impact of CTL kinetics on viral escape. In contrast with our expectation, we found that including temporal dynamics of epitope-specific T cell response did not improve the quality of fit of different models to escape data. We also found that for well-sampled escape data, the estimates of the model parameters including T cell killing efficacy did not strongly depend on the underlying model for escapes: models assuming independent, sequential, or concurrent escapes from multiple CTL responses gave similar estimates for CTL killing efficacy. Interestingly, the model assuming sequential escapes (i.e., escapes occurring along a defined pathway) was unable to accurately describe data on escapes occurring rapidly within a short-time window, suggesting that some of model assumptions must be violated for such escapes. Our results thus suggest that the current sparse measurements of temporal CTL dynamics in blood bear little quantitative information to improve predictions of HIV escape kinetics. More frequent measurements using more sensitive techniques and sampling in secondary lymphoid tissues may allow to better understand whether and how CTL kinetics impacts viral escape.


INTRODUCTION
In 2014, the number of people living with human immunodeficiency virus 1 (HIV-1 or simply HIV) was estimated as 36.9 million (1), with roughly 2 million new HIV infections and 1.2 million people dead of HIV-induced diseases (AIDS) (2). Cytotoxic CD8 + T lymphocyte (CTL) responses play an important role in control of virus replication (3,4) by modulating some important predictors of disease progression (e.g., viral set-point and the rate of CD4 + T cell loss (5)). Generation of HIV-specific CD8 + T cells by vaccination is one of the current approaches in developing HIV vaccines (6,7). However, HIV is able to generate mutants (termed "CTL escape mutants") that are not recognized by HIV-specific T cells, which may be one of the reasons for failure of T cell based vaccines (8)(9)(10). Better understanding of mechanisms of viral escape and principles governing CD8 + T cell responses to HIV may allow us to evaluate in silico a potential efficacy of T cell-based HIV vaccines.
Viral escape from CTL responses follows a somewhat predictive pattern with more dominant (larger magnitude) CTL responses leading to earlier viral escape (11,12). However, not every CTL response elicits an escape and sometimes viral mutations occur in regions predicted to be recognized by CTLs but in the absence of detectable response (13). To understand the timing and kinetics of CTL escape in HIV/SIV infection, mathematical models have been proposed previously on the dynamics of viral escape from a single CTL response (e.g., Ref. (14)(15)(16)(17)(18)(19)(20)). These initial models made a strong assumption of independent viral escape-i.e., it was assumed that viruses escaping from different CTL responses do not compete. Recent work, however, suggested presence of clonal interference and genetic hitchhiking among immune escape variants through reconstruction of HIV whole genome haplotypes (21), and similar concurrent CTL escapes were observed in four HIV-infected patients (22). Clonal interference was suggested to impact the estimates of the escape rates (23,24). Even though several models have been developed to describe the dynamics of escapes from multiple CTL responses (e.g., Ref. (17,18,(23)(24)(25)(26)), many of these studies involved only model simulations and did not use information on the actual kinetics of HIV-specific CTL responses in predicting viral escape.
Here, we explored whether including experimentally measured CTL kinetics improves description of the viral escape data. In doing so, we compared predictions of three alternative models of viral escape from CTL responses such as independent escapes, sequential escapes, and concurrent escapes. In the first model (independent escapes), we assumed that escape from any given CTL response occurs independently of other escapes and directly from the wild-type, i.e., we ignored the effects of clonal interference-in essence assuming high effective population size and/or high recombination rate. Of note, several recent experimental papers also assumed independent escapes (11)(12)(13). In the second model (sequential escape), we assumed that escapes from different CTL responses occur along a defined pathway, generally set by the sequences of escape occurrence in the data. This model assumes strong clonal interference, which may arise at low effective population size or when recombination rate is low. Finally, in the third model (concurrent escape), we tracked all escape variants simultaneously, thus allowing for co-existence of multiple escape variants (i.e., escapes could occur along multiple alternative pathways). Interestingly, we found that for well-sampled data on virus evolution, the estimated CTL killing efficacies were independent of the model for viral escape. Some escape data could not be well described by the sequential escape model for biologically reasonable parameters. Furthermore, explicitly taking CTL kinetics into account did not improve the quality of fit of different models to escape data. Our results suggest that CTL kinetics in the blood as it is currently available may bear limited information relevant to improve description of kinetics of HIV escape from CTL responses.

Experimental Data
Experimental details of patient enrollment and data collection were described in detail previously (12,13). In short, data from 17 patients in the Center for HIV/AIDS Vaccine Immunology (CHAVI) infected acutely with HIV-1 (subtypes B or C) were analyzed in great detail. All patients were infected with a single transmitted/founder (T/F) virus as determined by the single genome amplification and sequencing (SGA/S), and there were enough samples to accurately quantify CTL response to the whole viral proteome. In each patient, the kinetics of virus-specific CTL (CD8 + T cell) responses were measured using peptidestimulated IFN-γ ELISPOT assay and/or intracellular cytokine staining (ICS) 6 months after enrollment using peptides matched to the founder virus sequence (12,13). For CTL responses measured by ELISPOT, the reported magnitude of the response was the number of cells, producing IFN-γ, per 10 6 peripheral blood mononuclear cells (PBMC). Multiple viruses were sequenced by SGA/S, and all sequences were compared at cites coding for CTL epitopes, and changes in the percentage of transmitted (wild-type) sequences were followed over time (12). The dynamics of the HIVspecific CTL responses and viral escape from epitope-specific CTL responses were measured longitudinally. Escape mutants were identified as viral variants with mutations in regions recognized by patient's CTL responses with a reduced (or fully abrogated) production of IFN-γ following T cell stimulation. In many cases, mutation in a single position was responsible for the escape. In our analysis, all viral variants, which did not have the wild-type amino acid in the epitope region, were considered as escape variants.
Review of the virus evolution and CTL dynamics data in all 17 patients revealed some data limitations. In particular, data for many patients lacked adequate temporal resolution to accurately estimate virus escape rates. In the vast majority of viral escape variants, escapes often occurred rapidly between two sequential time points with the frequency of the escape variant jumping from 0 to 1. While previously it was suggested that such data may be modified to provide an estimate of the escape rate (14,15,17), such approaches may lead to biased parameter estimates (25). While development of a method for unbiased estimation of escape rate from sparse data was recently proposed (27), for this analysis, we focused on patients CH131 and CH159 in which viral escape rates could potentially be accurately estimated due to sufficiently frequent sampling. While data from these patients were presented before (12), linking of escape and CTL response dynamics was not yet performed.

Model of Viral Escape from a Single CTL Response
Models describing the dynamics of viral escape from a single cytotoxic T lymphocyte (CTL) response have been developed and adopted by different researchers (e.g., Ref. (14)(15)(16)(17)(18)). Here, we start with the basic model formulated earlier (18) and extend it to viral escape dynamics from multiple CTL responses. The model of viral escape from a single CTL response can be extended from the basic viral dynamics model (28) in the following way: where T(t) is the density of uninfected target cells; I w (t) and I m (t) is the density of target cells infected by the wild-type or escape variant viruses, respectively; V w (t) and V m (t) is the density of wild-type or escape variant viruses, respectively; s is the turnover rate of uninfected target cells; T 0 is the preinfection level of uninfected target cells; β w and β m is infection rate of wild-type or escape variant viruses, respectively; µ is the probability of mutation from wild-type to escape mutant during reverse transcription of viral RNA into proviral DNA; δ is the death rate of infected cells due to viral pathogenicity; k is the killing rate of wild-type virus infected cell due to CTL response; p w and p m is the rate at which cells infected by wild-type or escape mutant viruses produce viruses; and c v is the clearance rate of free viral particles. In this model (equation (1)), we assume that target cells infected by wild-type (V w (t)) and escape viruses (V m (t)) differ by two factors: viral infectivity (β w and β m ) and the rate of virus production (p w and p m ). Given that in vivo viral particles are short-lived (29,30), to a good approximation, we may assume a quasi steady state for the virus particle concentration leading to V * We define a fitness cost c = 1 − β m p m β w p w , where c can be positive or negative. Positive c means true fitness cost of escape mutations, which is escape variant and has a lower replication rate (β m p m ≤ β w p w ) (31), and negative c implies fitness advantage of escape virus (31,32). By straightforward calculation, the system (equation (1)) can be written as For convenience, we replace V * w (t) and V * m (t) by w(t) or m(t), respectively, and assume that the wild-type and escape viruses differ only in the rate of infectivity (that is β w ≥ β m and p w = p m ) (13), the system (2) can be simplified as where r(t) = β w p w c v T(t) is the replication rate of cells infected by wild-type virus, and c = 1 − β m β w is the cost of the escape mutation defined as a selection coefficient. The frequency of the escape variant in the whole population is given by . This is perhaps the simplest model for a viral escape from a single CTL response. This is denoted as model 1 in the paper.

Models of Viral Escapes from Multiple CTL Responses
Mathematical model given in equation (3) tracks changes in densities of wild-type virus and a single variant that has escaped recognition from a single epitope-specific CTL response. In acute HIV infection, the virus can escape from recognition of multiple CTL responses, which are specific to several viral epitopes (13,33). Several models have been developed to describe the dynamics of escapes from multiple CTL responses (e.g., Ref., (17,18,26)). Our model is an extension of previous models (17,18) incorporating mutations from wild-type virus to different viral escapes. In contrast with previous studies, in our analyses, here, we used experimentally measured time courses of different CTL responses (12).
To track the dynamics of viral escape from multiple responses, we assume that there are in total n CTL responses that control viral growth, and virus can potentially escape from all n responses. We use m i to denote the density of variants where i is a vector i = (i 1 , i 2 , . . ., i n ) denoting the positions of n epitopes, and we define i j = 0 if there is no mutation in the jth CTL epitope and i j = 1 if there is a mutation leading to an escape from the jth (1 ≤ j ≤ n) CTL response. We denote the set of escape variant as I, which is i ∈ I. The wild-type variant is then denoted as (0, 0, . . . 0).
For our analysis, we neglect recombination and backward mutation from mutant to wild-type. We use k i , c i , and µ i to denote killing rate due to ith CTL response, cost of escape mutation from the ith CTL response and mutation rate for the ith epitope, respectively. Due to a small rate of double mutation (34), we assume that escape virus is generated with only one mutation in a single generation. That is, for two escape variants m i = m (i 1 ,i 2 ,...,i n ) and m j = m ( j 1 ,j 2 ,...,j n ) , we define the mutation rate M i,j from m i to m j as µ k , if and only if m j has only one more mutation at position k than m i and all other positions are exactly same. For example, when there are 3 CTL responses, the mutation rate from m (1,0,0) to m (1,1,0) is µ 2 , and the mutation rate from m (0,0,0) to m (1,0,1) is 0. Assuming multiplicative fitness (detailed deviation is given in Section S2 in Supplementary Material), that is, the fitness cost of a variant i = (i 1 , i 2 , . . ., i n ) is The death rate of the escape variant i = (i 1 , i 2 , . . ., i n ) due to remaining CTL responses is given by K i = ∑ n j=1 k j (1 − i j ), where we assume that killing of infected cells by different CTL responses is additive. Similar to equation (3), the dynamics of the wild-type and escape variants are given by We define M(t) = ∑ i∈I m i as the total density of all variants in the population, and f j (t) (j = 1, . . ., n) is the fraction of viral variants that have escaped recognition from the jth CTL response. The frequency of a viral variant escaping from the jth response is given by Based on previous work (22,25,35), we assume that there are two alternative ways to generate escape mutants (Figure 1). The first way can be called "sequential" escape (model 2), that is escape mutants are generated sequentially along a defined path from wild-type viruses. This is likely to happen when the effective population size of HIV is small and when the rate of recombination is negligible. The second way can be described as "concurrent" escape (model 3), in which the virus can escape from n CTL responses simultaneously along multiple different pathways. This is likely to happen when the HIV effective population size is large. With n CTL responses, there are n escape variants for "sequential" escape and 2 n − 1 escape variants for "concurrent" escape in addition to the wild-type variant. For example, with n = 3 CTL responses, for "sequential" escape, there are 3 escape variants: m (1,0,0) , m (1,1,0) , and m (1,1,1) with m (0,0,0) being the wildtype virus. For "concurrent" escape, there are 7 escape variants: m (1,0,0) , m (0,1,0) , m (0,0,1) , m (1,1,0) , m (1,0,1) , m (0,1,1) , and m (1,1,1) with m (0,0,0) being the wild-type virus (Figure 1). Detailed equations for both models with n = 3 CTL responses can be found in Supplement (Section S2 in Supplementary Material). It is interesting to note that "sequential" escape is a simplification of "concurrent" escape when the effective population size is small. Previous work did not fully resolve whether CTL escapes in HIV infection occur sequentially or concurrently (22,25); most likely the type of escape varies by patient.

Models for CTL Response
The killing rate k i of the CTL response specific to the ith epitope in all three models is composed of two parts: the per-cell killing efficacy of CTLs (k ′ i ) and the number of epitope-specific CTLs (E i ) (16). Previously the killing rates k i were often set to a constant (e.g., Ref. (16,18)), or were set to a certain form k ′ i g(E i (t)) where g(E i (t)) is a function of epitope-specific CTL responses E i (t) (e.g., Ref. (24,36)). With the measured epitope-specific CTL response dynamics (13), we adopted two forms of killing rate: constant k i (termed as "constant response") or time-dependent killing rate k ′ i E i (t) (termed as "interpolated/fitted response"). We used the "mass-action" killing term to describe effect of CTLs on virus dynamics because it is the simplest form, it involves minimum parameters, and it is supported by some experimental data (37).
Based on the available time course information of epitopespecific T cell response E i (t), we used the first-order interpolation function (termed as "interpolated response") or the fitted response function (termed as "fitted response") by the T on -T off model (38) to quantify the kinetics of HIV-specific CTL responses. The T on -T off model assumes that the response starts with E 0 epitopespecific CD8 + T cells that become activated at time T on . Activated T cells start proliferating at a rate ρ and reach the peak at time T off . After the peak, epitopes-specific CD8 + T cells decline at a rate α. The dynamics of the CD8 + T cell response E(t) is given thus by the following differential equation: (6) with E(0) = E 0 . Here the "precursor frequency" E 0 is a generalized recruitment parameter, which combines the true precursor frequency and the recruitment rate/time (38,39). Our recent work showed that this model (equation (6)) reasonably well describes kinetics of HIV-specific CTL responses in acute HIV infection (40). When fitting the model (equation (6)) to experimental data of CTL dynamics, we changed all initial undetected response values from 0 to 1; the latter was the detection limit in the data.

Statistics
Previously, under the assumption that some mutants are present initially, researchers (e.g., Ref. (16,36)) fit a logistic model to data on viral escape kinetics by the method of nonlinear least squares (41). In essence, this is a maximum likelihood method, which assumes normally distributed residuals. While this standard statistical method provides reasonable parameter estimates, it assumes equal weights to different data points independently of how many viral sequences were measured at every time point, which is likely to be unrealistic for most experimental studies. Here, we follow the method proposed recently (18) to use binomial distribution (and thus different weights for different measurements/time points) in the likelihood of the model given the escape data. For HIV escape from a single CTL response, the log-likelihood function is given by where a j is the number of escape variant sequences in a sample of N j sequences at the sample time t j , T i is the number of measured time points for a ith specific viral escape trajectory, and f (t j ) is the predicted frequency of a specific viral escape variant at time t j . Model parameters were thus found by maximizing the log-likelihood function (equation (7)).
To discriminate between alternative models under different parameter constrains, we used corrected Akaike information criterion (AIC) scores (42). The model fit with the minimum AIC score among tested models was treated as the best model; however, a difference of less than 3 AIC units is generally viewed as not significant (42). To test the statistical significance of the differences between parameters found by fitting different models, we used a bootstrap approach (43). In this approach, we resampled the data 1,000 times using the Random routine in Mathematica assuming beta distribution for sequencing data (44), fitted models to bootstrap samples, and recorded all estimated parameters. For the same parameter, we use either paired and unpaired t-test to compare the parameter averages for different models.
Both fitness costs of escape mutations and the killing efficacy of the CTL response determine the kinetics of viral escape from T cells (14)(15)(16), and that viral escape (sequence) data in most cases are not sufficient to estimate both rates (16). Therefore, in our analyses, to avoid overfitting, we set fitness cost of escape to 0 c i = 0. In all fits, we assumed that the rate of virus replication r = 1.5/day (28).
While multiple models may be able to describe accurately experimental data, some models may do so at biologically unreasonable parameters. For example, estimated rate of mutation at different epitopes may be unrealistically large. Thus, in our analysis, we assume that mutation rates, which are above 10 −3 are likely to be unrealistic given that currently estimated HIV mutation rate is about 3.2 × 10 −5 per bp per generation (34) and size of a CTL epitope is 8-10 amino acids (3 × 10 × 3.2 × 10 −5 ≈ 10 −3 ).
To fit the T on -T off model [equation (6)] to experimental data using non-linear least squares, we log-transformed the model predictions and the data.
When interpolating CTL response kinetics, there was often not enough information on the starting point (day 0). In such situations, we set the initial CTL density as 1 (the detection level for this data set) for simplicity. Other starting points (e.g., intersection point of the CTL response axis and the reverse extension line of the interpolation function) were also tested and led to similar results (not shown). This was largely due to the fact that, in our models, CTLs at low densities are not expected to exert large selective pressure on the virus population due to assumed mass-action killing term.

Statistical Model Impacts Estimation of the Escape (Killing) Rate
Given virus evolution data, we may be often interested in quantifying selecting pressures driving specific changes in the virus population. Following HIV-1 infection, the virus escapes from several cytotoxic T lymphocyte (CTL) responses (45), and multiple studies used mathematical models of various levels of complexity to estimate the predicted efficacy at which CTLs recognize and eliminate cells, infected with the wild-type (unescaped) virus (14)(15)(16)(17)(18)25). Many of these previous studies estimated the rate of HIV escape from immunity using nonlinear least squares, which explicitly assumes normal distribution of the deviations between model predictions and data (14)(15)(16)(17). However, the assumption of normally distributed residuals is likely to be violated for data when only a handful of viral genomes are sequenced-which is common in many studies involving single genome amplification and sequencing techniques (SGA/S). We have recently proposed to use a likelihood approach, which assumes virus genome sampling to follow a binomial distribution (18). This binomial distributionbased likelihood approach showed to impact the estimates of the CTL killing rate (escape rate can be proportional to the killing rate under an assumption of constant CTL response) when compared to normal distribution-based likelihood approach (least squares) (18). However, this previous comparison was done on data, which were fairly sparse and comparison involved modifications of data to allow for non-zero and non-one frequencies of the escape variant (14,15), and thus, it remained unclear if estimates of escape rates are truly dependent on the statistical model for better sampled data.
Unfortunately, in our cohort of 17 patients (12), very few patients were sampled frequently enough to observe gradual accumulation of escape variants in the population (i.e., data with two sequential time points with mutant frequency in the range 0 < f < 1 were rare). For the analysis, we, therefore, used the escape data from two patients, CH131 and CH159, where CTL and HIV sequence measurements were sufficiently frequent to address our modeling questions. We fitted a simple mathematical model describing escape of the virus from a single constant (non-changing) CTL response (equation (3)) to the data from one patient CH159 (Figure 2) assuming two different statistical models: with normally distributed residuals (least squares) or binomial distribution-based likelihood (equation (7)). Consistent with our previous observation, we found that the type of statistical model impacts the estimate of the escape rate (k in Figure 2) with difference being nearly twofold (k = 0.27/day vs. k = 0.51/day). It is interesting to note that, visually, the least squares method appear to describe the data better by accurately fitting the points with intermediate frequency of the escape variant in 20-30 days A B FIGURE 2 | Statistical model has a strong impact on the estimated killing rate. We fit model in equation (7) to the same data for HIV escape in the protein region DREVLIWKFDSSLARRHL of Nef (Nef 177-194) in patient CH159, assuming normal distribution-based likelihood (normally distributed residuals or nonlinear least squares (A)) or binomial distribution-based likelihood method (B). Data are shown as dots and bars represent the 95% confidence intervals calculated using beta distribution (Jefferey's intervals (44) after the symptoms (but missing the another intermediate data point (12,0.08)). However, this visually better fit is not supported by the statistics: likelihood of the model for these data is −12.64 or −10.53 for normal (Figure 2A) or binomial ( Figure 2B) distribution, respectively (and AIC scores being 31.0 vs. 26.8, respectively). Interestingly, the main difference in the estimated escape rates was driven by just one data point ((t, f ) = (12, 0.08)); removing this data point from the data led to identical estimates of the escape rate, k = 0.51/day, from two statistical models (results not shown). This is not surprising because with this data point removed, the information on escape rate is only coming from two data points when the frequency of the escape variant is intermediate (0 < f < 1). As discussed before, least squares may not allow to estimate escape rates, e.g., in cases when mutant frequency jumps from 0 to 1 between two subsequent time points unless data are modified (14,15). Similarly, models assuming normally distributed residuals may not be able to fit other types of data, in which frequency of the mutant has an intermediate value (0 < f < 1) at one time point only. In particular, in our analysis of another escape in patient CH159 (Rev GRPTEPVPFQLPPLERLC, see Figure 3), we could not obtain finite estimates of the escape rate using normally distributed residuals (results not shown). Rather, the model fits tended to describe accurately two data points (t = 22 days and t = 29 days) and ignore another data point (t = 56 days) leading to extremely high predicted escape rates (results not shown). Interestingly, using binomial distribution-based likelihood allowed for an accurate fit of the model to data and the fit compromised between describing early and late data points (Figure 4A). The reason for the compromise is that a fit predicting fast escape and nearly 100% escape variant by 56 days since symptoms is highly disfavored by the binomial distribution-based likelihood because some wild-type variants were still present at day 56 (thus, the weight for missing this point by the model fit was very high in binomial distribution-based likelihood but not in the normal distribution-based likelihood). Taken together, these results suggest that the type of the statistical model used to estimate HIV escape rates influences the final estimates. Therefore, many previous studies on HIV escape assuming normally distributed residuals may need to be re-evaluated for the robustness of their conclusions.

CTL Response Kinetics Do Not Improve Description of the Escape Data
As CTL responses drive HIV escape from epitope-specific T cells, it is expected that the magnitude of the CTL response should naturally impact escape kinetics. Previous studies provided some evidence that the relative magnitude of a given CTL response in the total HIV-specific CTL response early in infection (% immunodominance) predicts the timing of viral escape (11,12). Immune response was also shown to impact escape of simian immunodeficiency virus (SIV) from T cell responses (19,46,47). Immune response magnitude, and as a consequence, the overall CTL killing efficacy is important in determining both timing and speed of viral escape with the rate of viral escape being directly related to the immune response efficacy (16,17). In contrast, both initial mutant frequency, virus mutation rate, and CTL killing efficacy determine timing of viral escape (17). Whether inclusion of the experimentally measured CTL dynamics impacts ability of mathematical models to accurately describe viral escape data has not been tested.
To test the benefits of using longitudinally measured CTL responses in describing viral escape data, we considered several alternative models for the CTL dynamics and viral escape. Our model 1 describes the dynamics of viral escape from each CTL response independently. Models 2 and 3 describe escape from multiple CTL response that occurs sequentially or concurrently, respectively (see Materials and Methods for more details). CTL dynamics was either considered to be unimportant (i.e., killing rate k i was set constant over time), or when killing rate was proportional to the experimentally measured CTL frequency (k ′ i E i (t)), respectively. To describe CTL dynamics, we either used the first order interpolation function or the T on -T off model (equation (6) and see Materials and Methods for more detail).
In patient CH159, four CTL responses were detected (Figure 3B), and three of these responses were escaped within nearly 4 years of infection. Interestingly, the response specific to Gag TPQDLNTML was dominant ( Figure 3B), but the corresponding escape mutant Gag TPQDLNTMLNTVGGHQAA did not appear up to 1,132 days since onset of symptoms ( Figure 3A). Patient CH159 had two escape mutants in regions Rev GRPTEPVPFQLPPLERLC (Rev 65-82) and Nef DREVLIWKFDSSLARRHL (Nef 177-194) satisfying our selection criteria (Figure 3C). Despite a relative small magnitude of CTL responses specific to Rev65 and Nef177 early in infection (up to 29 days since onset of symptoms), escape mutants appeared early and their frequencies arose rapidly.
We fitted three alternative mathematical models for viral escape and three alternative models for the CTL dynamics to the data on viral escape ( Figure 3C) using binomial distribution-based likelihood method (see Materials and Methods for more detail). Surprisingly, we found that the models 1 and 3 with a constant immune response described the data with best quality as judged by the AIC (or likelihood). Parameter estimates in the model 1, which assumes independent escape were nearly identical to the parameters in the model 3, which assumed concurrent escape (Figure 4; Table 1). Importantly, adding experimentally measured CTL response dynamics (as interpolated function or by using parameterized T on -T off model) did not improve the quality of the model fit to escape data ( Table 1). Even worse, for models 1 and 3, the fits with a fitted response were of lower quality as judged by the large increase in AIC ( Table 1). Models that included an interpolated CTL response provided better fits than models with a fitted response ( Table 1).
The exact reasons of why including experimentally measured CTL response dynamics led to worse fits of the escape data are unclear but perhaps rapid change in magnitude of CTL responses in this patient-if response directly impacts killing of infected cells-was simply not reflected in the kinetics of viral escape (Figures 4D,G). Specifically, CTL kinetics-driven escape would predict non-monotonic rise in the escape variant frequency, which was not observed in the data, thus, favoring a model with a constant killing rate by CTLs. Interestingly, the model 2 fits of the data resulted in unphysiologically large estimates for the mutation rate µ 2 ( Table 1). As we elaborate later (see below), this failure of the model to describe these data stems from the fact that escapes in the data occur nearly at the same time and assuming that escapes are sequential led to an unrealistic mutation rate in the second epitope. This suggests that the observed dynamics of viral escape in patient CH159 is not consistent with sequential escape. Models 1 and 3 also predicted slightly higher than expected mutation rate µ 1 (bigger than 10 −3 ) for the peptide Rev 65-82. Constraining this parameter to remain µ 1 ≤ 10 −3 led to fits of Material, panels (C,F,I)) to escape data in patient CH159 with different response inputs (constant, interpolated, or fitted response, see Materials and Methods for more detail). Adding direct time-dependent response (interpolated or fitted response) did not improve the quality of the model fit to data (see Table 1 for parameter estimates). Model 2 was not able to accurately describe these data for biologically reasonable mutation rates (see Table 1).
significantly lower quality (likelihood ratio test, p < 0.05). Due to large length of the peptide, the overall mutation rate in this region could indeed be slightly higher than our calculated high bound for the mutation rate (see Materials and Methods for more detail). Furthermore, since peptide Rev 65-82 is the epitope in which first escape occurred, it was possible that the high estimate of the mutation rate could be due to late sampling of viral sequences. In these, data sampling was done after patients were diagnosed with infection; however, viral escape could have started earlier and for escapes starting earlier, it may be possible to describe the data with a lower mutation rate (18,48). Therefore, to test whether the timing of the start of the escape influences the estimate of the mutation rate we did the following. We shifted the data for two escapes forward by adding some initial zeroes to data and reverse extended the predicted CTL response curves. Then we refitted models 1 and 3 to the data under the constrain µ ≤ 10 −3 . We found shifting the data did not improve the quality of the model fits as compared to unmodified data when CTL dynamics is explicitly taken into account as interpolated or fitted response (results not shown). However, assuming a constant response allowed to obtain lower, more physiological estimates of the mutation rate. These results suggest that inability of the models, which explicitly incorporate CTL dynamics to explain kinetics of first escape with physiologically reasonable mutation rate is due to late appearance of the CTL response. Indeed, escape can only accumulate when CTL response is present and extending the time window for virus evolution but not having CTL response active will not significantly impact estimates of the mutation rate.
Given our results for one patient, we next sought to investigate whether our conclusions will remain robust when looking at     data from another patient. Patient CH131 had 6 CTL responses, and there was escape from at least 5 of these responses in 2 years since symptoms (Figure 5). One escape, Nef EEVGF-PVKPQV (Nef 64-74), occurred very early in infection, and two escapes, Env RQGYSPLSFQTLIPNPRG (Env 709-726) and Gag VKVIEEKAFSPEVIPMFT (Gag 156-173), occurred late ( Figure 5). In this patient, the pattern of escape followed the ranking of immunodominance of CTL responses (12): Nef64specific CTLs were dominant at symptoms and drove earlier escape, while Env 709-and Gag156-specific CTLs arose later with escapes occurring later in infection (Figures 5A,B). However, there were apparently discrepancies such as two escapes in Tat epitopes (Tat DPWNHPGSQPKTACNNCY, that is Tat 9-26 and Tat FQKKGLGISY, that is Tat 38-47) occurred at the same time while CTL responses specific to these different epitopes were of different sizes (Figures 5A,B). Because escapes in these two Tat epitopes occurred rapidly and did not have two intermediate measurements of the mutant frequency, our following analysis was only restricted to escapes in three CTL epitopes: Nef64, Env709, Gag156 (Figures 5C,D).
We thus fitted 3 different models of viral escape combined with 3 different models for the CTL dynamics to the data on viral escape (Figure 6). Importantly, as with the analysis of data from patient CH159, we found that including the data-driven CTL dynamics in the escape models did not improve the quality of the model fit to the escape data ( Table 2). In contrast with the previous results, though, the assumption of the constant and time-variable killing efficacy (i.e., due to variation in the immune response magnitude) did not strongly impact the quality of the model fit as judged by the AIC or likelihood ( Table 2). Importantly, however, models 1 and 3 gave nearly identical estimates of the CTL killing efficacy, suggesting that for data with good temporal resolution model estimates of the CTL killing efficacy (or by inference, escape rates) are not strongly dependent on the specific mechanisms used to describe escape (independent vs. concurrent escape). This observation also suggests that exclusion of the data on escape occurring at intermediate times after symptoms in Tat should not influence the accuracy of estimation of the killing rates of CTLs specific to other epitopes in CH131.
Extending the observation made with the patient CH159 data, we found that model assuming sequential escape (model 2) could not accurately describe the dynamics of viral escape for biologically reasonable parameter values specifically for the third escape in Gag156 although this inability was significant only for a constant killing efficacy ( Table 2). Allowing time-dependent killing efficacy resulted in small yet larger values for the mutation rate than that expected from basic calculations. Forcing the mutation rate µ 3 to be constrained (µ 3 ≤ 10 −3 ) significantly reduced the quality of the model fit to data (likelihood ratio test, p ≪ 0.001). Furthermore, estimates for the CTL killing efficacy differed between model 2 and models 1 and 3 suggesting that model choice (sequential vs. concurrent) may indeed influence estimates of the killing efficacy.

No Difference in Predicted Killing Efficacy of CTLs, Specific to Different Epitopes
Our analyses, so far, demonstrated that several different mathematical models were capable of accurately describing the escape data, but this ability was dependent on the specific pathway of how escape mutants were generated and the assumption on whether data-driven CTL dynamics was included in the model. In cases, when a model was able to accurately describe the data, we generally observed different estimates for the parameters for HIV escape in different epitopes; for example, for the data in patient CH131 estimated CTL killing rate in the model 1 (independent escapes) with interpolated response different nearly 100-fold between k ′ 1 and k ′ 3 ( Table 2). Knowing which immune responses may be more efficient on a per cell basis in killing virus-infected cells may be beneficial for inducing such responses by vaccination. We, therefore, investigated how robust these differences in estimated per capita killing rates are. For that, we fitted mathematical models assuming equal killing efficacies to the data on escape. As expected, reducing the number of fitted parameters led to fits of lower quality (as judged by the log-likelihood); however, this reduction in complexity of the model was favored by the AIC and in most cases by the likelihood ratio test (Tables S2 and S4 in Supplementary Material). Visually, the reduction in the quality of the model fit to data was also relatively small ( Figures S2 and  S4 in Supplementary Material). Thus, for these data, we found no strong evidence in the difference in the estimated per capita killing efficacy of the CTL response specific to different viral epitopes.

Identifying Conditions When the Model 2 (Sequential Escapes) Fails
In analysis of data from both patients, we found that model 2, describing sequential escape from CTL responses, was not able to accurately describe experimental data for biologically reasonable parameter values; these model fits predicted extremely high mutation rates (e.g., see Tables 1 and 2). Additional analyses demonstrated that fitting the models with constrained mutation rates, µ i ≤ 10 −3 led to fits of significantly lower quality (based on increased AIC, results not shown).
A closer look at the experimental data for which model 2 provided unreasonably high mutation rates revealed that the trajectories of two subsequent escapes in the model 2 were too close to each other, which naturally required a high mutation rate from one variant to another. Therefore, only when trajectories are separated in time mutation rate µ 2 is expected to be biologically reasonable. Indeed, by simulating virus dynamics using model for sequential escapes by varying model parameters, we found that CTL killing rate has the major impact on the time delay between two escapes (Figure 7). This analysis thus suggested that for the model 2 (sequential escape) to be consistent with the data, escapes from 2 responses must be separated in time by about 20-50 days.

DISCUSSION
CTL responses play a major role in HIV within-host evolution (45,49). Recent studies suggested that a relative magnitude of the CTL response (relative immunodominance) plays an important role in determining the time of viral escape from T cell responses (11,12). These previous studies, however, only utilized a maximum value of the CTL response early in infection, in general, within 50 days since the onset of symptoms, and thus impact of the kinetics of CTL response on the rate of virus escape remained undetermined. Furthermore, the pathways of HIV escape from CTL responses were not fully resolved as escapes occurring sequentially and | Including CTL response dynamics did not improve model fits of HIV escape data in patient CH131. We fitted model 1 (independent escapes, panels (A,D,G)), model 2 (sequential escape, panels (B,E,H)), and model 3 (concurrent escape, panels (C,F,I)) to escape data in patient CH131 with different CTL response inputs (constant, interpolated, or fitted response). Adding data-derived time-dependent CTL response (interpolated or fitted response) does not improve the fitting results in most cases ( Table 2). Notably, model 2 was unable to accurately describe late escape for biologically reasonable mutation rate µ 3 . Model parameters providing the best fit are given in Table 2.
concurrently have been proposed (21,22,25), and several previous studies assumed that escapes occur independently from each other (14,15,17). Here, by using experimental data on evolution of HIV sequences from acute infection into chronic phase and temporally resolved dynamics of HIV-specific CTL responses, we tested the hypothesis that CTL dynamics plays an important role in virus escape. Perhaps, in contrast with our initial expectations (e.g., due to Ref. (11,50)), we found that including experimentally measured dynamics of epitope-specific CTL responses did not lead to a better description of the kinetics of viral escape from T cells (e.g., in patient CH131, Table 2), or even reduced the quality of the model for viral escape fit to data (e.g., in patient CH159, Table 1). This was not because we assumed that killing of virus-infected cells was dependent on the absolute magnitude of epitope-specific CTL responses; assuming frequency-dependent killing, that is, when killing of infected cells expressing ith epitope was given by , led to similar conclusions (results not shown). Because previous work suggested that kinetics of escape was independent of the specific mechanism of how CTLs suppress wild-type virus (e.g., killing of infected cells or virus production by infected cells) (16), we did not investigate nonlytic control of HIV by T cells. It is interesting that the lack of correlation between the rate of viral escape and CTL response magnitude was highlighted previously (17).
Reasons of why a model with time-variable CTL response did not describe experimental data better than a model with a constant response remain unclear but several hypotheses could be generated. First, frequency of sampling of the viral sequences may not be high enough to detect change in the speed at which mutant viruses accumulate in the population. Indeed, in mathematical models, CTL dynamics has a direct impact on the rate of escape (e.g., see equation (3)), and the observed changes in CTL densities may not be reflected in escape data if measurements are infrequent. Second, virus sequence data could simply be noisy. Because only handful of viral sequences were analyzed by the SGA/S, measurements of frequencies of viral variants have in general large expected error (e.g., Figure 2). Third, CTL dynamics TABLE 2 | Parameters estimated by fitting different models of viral escape to escape data in patient CH131 assuming constant killing rates k i (panels A-C), or time-varying killing rates due to interpolated CTL response (panels D-E) or CTL response in the Ton-T off model (panels G-I).   A B FIGURE 7 | Model, assuming sequential escape (model 2), can be consistent with escape data when the trajectories for two sequential viral escape are separated in time. We illustrate that separation of trajectories by ∆t 50 = 409.8 − 344.2 ≃ 66 days is sufficient for the mutation rate to be realistically small (A). Here, t i 50 is the time by which the ith variant reaches 50% of the viral population, so, ∆t 50 = t 2 50 − t 1 50 . Parameters used in simulations are µ 1 = µ 2 = 10 −5A , k 1 = k 2 = 0.02 day −1A , r = 1.5 day −1 , δ = 1 day −1 . The distance between trajectories needed for small predicted mutation rates is reduced for higher CTL killing rates (B) and the time is only weakly dependent on the mutation rate assumed in simulations.

Peptide
in the blood may not reflect CTL dynamics in tissues such as secondary lymphoid organs (lymph nodes and spleen). While it is well known that T cells recirculate in the body (51), how quickly CTLs in the tissues migrate into the blood and then back to the tissues during HIV infection is not known. Finally, it is possible that the measured CTL responses were not the drivers of escape. While the ability of CTLs to recognize the wild-type virus and inability of the same CTLs to recognize mutant viruses is generally interpreted as evidence that these CTLs drove viral escape, such observations are correlational in nature, and thus cannot fully establish the causality of escape, at least in humans.
Our results may be interpreted as contradictory to several previous studies that found a strong correlation between the time of viral escape (time when an escape variant reaches frequency of 50% in the viral population) and a relative magnitude of CTL response (relative or "vertical" immunodominance) (11,12). However, our studies are not directly compatible because this previous work focused on the timing of escape while we primarily focused on the rate of viral escape. These two parameters are differently impacted by the CTL response (17) and may have different clinical importance. In our simple mathematical model (e.g., equation (3)), CTL response magnitude is expected to directly impact the rate at which an escape mutant accumulates in the population, independently of when this escape may occur. In contrast, timing of viral escape also depends on the mutation rate. Biologically, however, timing of escape may be more important than the rate because it may be more beneficial to the patient if viral escape occurs 5 years after infection but rapidly as compared to slow escape in just 1 year. This conjecture clearly depends on the premise that HIV escapes from CTL responses are detrimental to patients.
In our analysis, we generally found that for well sampled data, the pathway of generation of escape mutants played a minor role in predicting overall CTL killing efficacy; assuming escapes that occur independently (model 1) or concurrently (model 3) gave nearly identical estimates of the CTL killing efficacy (e.g. ,  Tables 1 and 2). In contrast, the model assuming sequential escape (model 2) often failed to accurately explain experimental data; this was due to some escapes co-occurring at nearly the same time, which obviously violated the model assumption of sequential escape. This inability of the sequential escape model to describe the data may be the result of the way we compared models to data: by using deterministic model approach and by ignoring recombination. Using deterministic model may be justified because, in acute infection, the effective population size of HIV may be sufficiently large and ignoring recombination may again be appropriate because very few cells in HIV infection are generally infected by 2 or more viruses (52,53). However, further work is needed to demonstrate whether our conclusions regarding inability of sequential escape model to accurately explain some escape data is due to some of the assumptions made in the model by running stochastic simulations and by allowing some degree of recombination.
Many of our model fits predicted a high mutation rate for the first epitope to be escaped by the virus (e.g., Table 2). This model prediction could not be changed by shifting the experimental data to allow for more time to generate escape mutant; in part, this test failed because in the absence of epitope-specific T cells escape variants accumulate rather slowly mainly driven by mutations. It may indicate that immune pressure on the virus population starts much earlier than it is reflected in the blood, echoing our concerns of whether CTL dynamics in the blood is an accurate reflection of T cell response in lymphoid tissues. Currently, it is believed that lymphoid tissues and not the blood are the major places of interactions between the virus and CTLs (50,54).
Our analysis further highlights the importance of choosing the appropriate statistical model for the analysis of the escape data-assuming normally distributed residuals, and therefore, using least squares approach, may not be appropriate for some escape data with very few sequences analyzed. Importantly, we confirm that the type of statistical model has an impact on the estimate of the escape rate (18).
We found that experimental data on HIV escape can be explained well if we assume identical per capital killing efficacy of CTLs, specific to different viral epitopes. This suggests that individual per capita killing rates not accurately estimated from these data. While it is possible that this result was the consequence of assuming additive killing of virus-infected cells by different CTL responses, we currently do not have any in vivo data to support more complex killing terms.
Overall, analyses of data from two patients suggested that models assuming independent escape of HIV from different CTL responses (model 1) or models assuming concurrent escape from multiple CTL responses (model 3) fit the data well and provide very similar (often nearly identical) estimates for the killing efficacy of CTL response. Thus, for well sampled data, assumption of independent escapes may be sufficient to accurately estimate HIV escape rates. Also, the model with datadriven time-dependent CTL response (interpolated or fitted response input) did not improve the quality of the model fit to data, so, at present, it appears to be unnecessary to incorporate the experimentally measured CTL response dynamics in the model describing viral escapes. Yet, because our results were found only for two patients, whether similar conclusions will be reached in other studies/patients remains to be determined. Our analysis nevertheless demonstrates how mathematical modeling may help to quantify HIV evolution in presence of CTL responses and to highlight potential limitations with experimental measurements.

ETHICS STATEMENT
This paper uses experimental data obtained previously, and no new observations requiring patient consent or institutional review board approval have been performed.

AUTHOR CONTRIBUTIONS
YY and VG designed the study, contributed to analysis, interpolation of data, and simulation results, and wrote the paper. YY performed the simulations.