- 1Microsoft Research, Redmond, WA, United States
- 2Adaptive Biotechnologies, Seattle, WA, United States
Introduction: T cell receptor (TCR) diversity is essential for immune defense, yet the mechanisms underlying its decline with age and its variation among individuals remain poorly understood. These patterns are typically attributed to passive processes such as thymic atrophy and cumulative immune exposures. However, this view does not account for the systematic and highly structured variation in TCR diversity observed across large populations.
Methods: We analyze TCRβ repertoires from approximately 30,000 adults using high-throughput sequencing. We quantify repertoire size and the contribution of the most expanded clones, and evaluate their ability to predict TCRβ diversity across age, sex and Cytomegalovirus exposure using machine learning and linear modeling approaches.
Results: We show that TCRβ diversity is almost entirely determined by two measurable repertoire features: repertoire size and the frequency of the 1,000 most abundant clones. Together, these features explain 96% of the variance in TCRβ diversity, capture its dependence on age and sex and define a robust relationship that persists under strong immune perturbations such as Cytomegalovirus infection. This relationship arises because the frequency of abundant clones, which represent less than one percent of TCRβ diversity, tracks a repertoire-wide pattern of coordinated clonal expansion, which we term intrinsic clonality.
Discussion: We propose that intrinsic clonality reflects a fundamental, previously unrecognized property of the immune system, which challenges the view that TCR diversity declines primarily through passive erosion. Rather, TCR diversity emerges as a system-level property mediated by repertoire size and intrinsic clonality, both of which are likely subject to homeostatic regulation. These findings offer a new conceptual framework for understanding TCR diversity within immune homeostasis, which may guide therapies aimed at restoring immune function.
Introduction
The theory of clonal selection is a cornerstone of modern immunology, providing the foundation for understanding how TCR diversity is broadly shaped and maintained by the immune system (1). T cells play an essential role in immune defense by targeting antigens from infections and cancer, with their specificity determined by T cell receptors (TCRs) (2, 3). The ability to respond to a broad range of pathogens is enabled by the diversity of the TCR repertoire (4–13). A large pool of naive T cells with diverse, randomly rearranged TCRs is primarily generated during childhood and adolescence via V(D)J recombination and is maintained throughout adulthood predominantly through homeostatic proliferation (14, 15). Mechanisms of immune tolerance eliminate or regulate self-reactive T cells, thereby restricting immune responses to non-self antigens (16). The T cell repertoire is further shaped by selection processes, including clonal expansion upon antigen encounter and the subsequent preferential retention of activated T cells in the memory compartment. A key feature of immune homeostasis is the long-term balance between naive and memory T cell compartments, which enables rapid responses to previously encountered antigens while preserving the capacity to recognize new ones (17, 18).
Despite its essential role, the mechanisms governing TCR diversity remain poorly understood. Diversity declines by about a factor of two between the ages of 20 and 80 years and is systematically lower in males than females (19, 20). Moreover, variation between individuals exceeds that explained by age and sex alone (20). Changes in TCR diversity are commonly attributed to passive, cumulative processes such as thymic atrophy, stochastic cell loss and chronic immune activation (21–29). Under this view, diversity passively erodes over time independent of any intrinsic homeostatic regulatory mechanisms. However, these factors fail to explain the systematic, population-wide patterns observed in large datasets, nor do they identify specific mechanisms that mediate and constrain diversity.
Understanding what determines TCR diversity is critical because it directly impacts the immune system’s ability to recognize novel antigens. Reduced TCR diversity is linked to poor health outcomes including a greater risk of infectious disease and cancer (20, 21, 24, 26, 30, 31). This association with human health highlights the urgent need to understand TCR diversity within the broader context of immune homeostasis. Identifying factors that determine TCR diversity, which may themselves be intrinsically regulated, can provide new mechanistic insight into how TCR diversity is maintained and inform interventions aimed at restoring immune competence after its decline.
Here we analyze TCRβ repertoires from ∼30,000 individuals and show that diversity is almost entirely determined by two measurable features of the repertoire: the total number of T cells sequenced (repertoire size) and the frequency of the 1,000 most abundant clones. These two quantities independently correlate with TCRβ diversity and together account for 96% of its variation across individuals, including its systematic dependence on age and sex, as well as its response to Cytomegalovirus (CMV) exposure. The key finding of our analysis is that the predictive power of the frequency of abundant clones arises from the apparently coordinated nature of clonal expansion across the repertoire. We interpret this coordination as the manifestation of a fundamental, system-level property that governs the amplitude of clonal expansion across T cells, which we term intrinsic clonality. Intrinsic clonality may be a previously unrecognized feature of the immune system subject to homeostatic control, helping to explain how TCR diversity is mediated within immune homeostasis, with potentially important implications for translational research.
Results
We conduct a cross-sectional analysis of 30,430 TCRβ repertoires processed under standardized protocols in a CLIA-certified laboratory. 95% of subjects in this cohort are aged 20–74 years (median 50) with 47% males and 53% females. The cohort was sequenced as part of the T-Detect COVID test, which was granted Emergency Use Authorization by the Food and Drug Administration, and is the same cohort analyzed by Zahid et al. (20). We measure the total number of T cells sequenced, S, and the number of unique clonotypes (i.e., richness), D, and refer to these quantities as the repertoire size and TCR diversity, respectively; we interpret them as relative measures of the true underlying size and diversity of the peripheral repertoire. Both quantities follow a log-normal distribution across individuals (all logarithmic values refer to base 10). We further define S1000 as the total number of T cells derived from the 1,000 most abundant clones and P1000 as the percentage of the repertoire they comprise (i.e., P1000 = 100 × S1000/S). To facilitate straightforward interpretation, we use P1000 when visualizing the data. For all modeling applications, S1000 is used to avoid covariance between P1000 and S, resulting in more robust and interpretable models.
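The definitions above can be illustrated with a minimal sketch. It assumes a hypothetical input format, a list of per-clonotype T cell counts for one repertoire; the function name and the toy numbers are illustrative and are not taken from the study's pipeline.

```python
from typing import Dict, Iterable


def repertoire_metrics(clone_counts: Iterable[int], top_n: int = 1000) -> Dict[str, float]:
    """Summarize one repertoire from per-clonotype T cell counts.

    clone_counts is a hypothetical input: one positive integer per unique
    TCRβ clonotype, giving the number of sequenced T cells carrying it.
    """
    counts = sorted((c for c in clone_counts if c > 0), reverse=True)
    size = sum(counts)            # S: total number of T cells sequenced
    diversity = len(counts)       # D: number of unique clonotypes (richness)
    s_top = sum(counts[:top_n])   # S1000 when top_n == 1000
    p_top = 100.0 * s_top / size if size else 0.0  # P1000 = 100 * S1000 / S
    return {"S": size, "D": diversity, "S_top": s_top, "P_top": p_top}


# Toy repertoire: 3 expanded clones plus 7 singletons (top_n=3 keeps it small).
toy = [50, 30, 10] + [1] * 7
metrics = repertoire_metrics(toy, top_n=3)
print(metrics)
```

In this toy case the three expanded clones hold most of the cells but contribute only 3 of the 10 clonotypes, which mirrors the point that abundant clones can dominate repertoire size while representing a small share of diversity.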
Repertoire size is influenced by both biological factors and technical variables such as input volume and measurement uncertainty. We account for measurement error in our analysis and note that repertoire size strongly correlates with the T cell fraction (i.e., the proportion of peripheral blood mononuclear cells that are T cells; Spearman ρ = 0.84). Analyses using either repertoire size or T cell fraction yield consistent results, indicating that our findings are robust to the chosen metric. This supports the conclusion that the relationships we observe reflect intrinsic immune properties rather than technical artifacts.
TCRβ diversity, repertoire size and clonal expansion
TCRβ diversity declines with age. This decline occurs ∼10 years earlier in males than in females, leading to pronounced differences emerging in middle age (Figure 1A). After accounting for measurement uncertainty, the peak-to-peak intrinsic biological variance in D for the central 90% of subjects increases by a factor of 2 to 5 (0.3 dex to 0.7 dex) between the ages of 20 and 80 years, respectively (20). Notably, inter-individual variation in D exceeds the systematic effects of age, sex and CMV exposure, particularly among older individuals. Repertoire size (S) declines with age similarly to D (Figure 1B), while clonal expansion (P1000) increases from 10% to 30% between ages 20 and 80 and is consistently lower in females (Figure 1C). Furthermore, CMV-positive individuals exhibit slightly lower TCRβ diversity, substantially larger repertoire size and higher clonal expansion (Figures 1D–F). These findings demonstrate that repertoire size, TCRβ diversity and clonal expansion all depend on age, sex and CMV exposure status.
Figure 1. The dependence of T cell receptor diversity, repertoire size and clonal expansion on age, sex and CMV exposure status. (A) Log of TCRβ diversity (Log D) as a function of age stratified by sex. Blue and orange curves are the median diversity in decade wide age bins for males and females, respectively. Error bars are bootstrapped and blue and orange shaded regions indicate the distribution of the central 50% of the data. Gray shading represents the central 90% of the subjects. (B) Log of the total number of productive TCRβs sequenced (Log S) in each repertoire as a function of age stratified by sex. The binning procedure and shading definitions are the same as in (A). (C) P1000 as a function of age stratified by sex. The binning procedure and shading definitions are the same as in (A). (D–F) are the same as in (A–C), respectively, but data are stratified by CMV exposure status rather than sex.
TCRβ diversity is independently correlated with both repertoire size and clonal expansion. We calculate the median D as a function of age, binned by S and P1000, respectively (Figures 2A, B). At all ages, individuals with high D tend to have low P1000 and high S, and vice versa. In CMV-negative individuals, larger repertoire size is accompanied by a more balanced clonal distribution, with both factors contributing to higher TCRβ diversity (Figure 2C). Conversely, in CMV-positive subjects, repertoire size and clonal expansion are decoupled, such that large repertoires coexist with high levels of clonal expansion. Larger repertoire sizes compensate for increased clonality in CMV-positive individuals, thereby mitigating the impact of CMV exposure on TCRβ diversity (32).
Figure 2. Relationship between T cell receptor diversity, repertoire size and clonal expansion. (A) Log D as a function of age with colored curves indicating median Log D in bins of Log S. Gray shading represents the central 90% of subjects. TCRβ diversity increases with repertoire size. (B) Same as in (A) but with color bars indicating median Log D as a function of age in bins of P1000. TCRβ diversity decreases with increasing clonal expansion. (C) Relationship between repertoire size and P1000 stratified by CMV exposure status. For CMV-positive subjects, there is a marginal anti-correlation between P1000 and S (Spearman ρ = −0.04, p = 7 × 10⁻⁵). For CMV-negative subjects, the anti-correlation is significantly stronger (Spearman ρ = −0.25, p = 1 × 10⁻²⁶⁴). (D) Log D as a function of Log S with colored curves indicating median Log D in bins of P1000. The solid and dashed curves are for males and females, respectively. (E) Same as (D) but solid and dashed curves are for CMV-negative and CMV-positive subjects, respectively. (F) Same as in (D) but solid and dashed curves are subjects that are younger and older than the median age of the cohort (50 years), respectively. After accounting for repertoire size and clonal expansion, TCRβ diversity is independent of sex, CMV exposure status and age.
At a fixed S, D systematically declines with increasing P1000, demonstrating that clonal expansion reduces diversity independent of total repertoire size (Figures 2D–F). Remarkably, after controlling for S and P1000, the residual dependence of D on age, sex or CMV exposure status is minimal, indicating that these biological factors influence TCR diversity through their effects on repertoire size and clonal expansion. In other words, repertoire size and clonal expansion strongly mediate the observed dependence of TCR diversity on age, sex and CMV exposure, which suggests that the relationship between these repertoire measures and TCR diversity is itself largely independent of age, sex and CMV exposure.
Variations in repertoire size and clonal expansion almost fully account for the observed variance in TCRβ diversity at any given age (Figures 2A, B). P1000 increases systematically with age (Figure 1C) but subjects with low P1000 and high D are present at all ages. Notably, the 1,000 most abundant clones may occupy a sizable fraction of the repertoire but they represent a very small fraction of the diversity (< 1%). Moreover, the 1,000 most abundant clones in any repertoire are dominated by CD8+ memory T cells (Supplementary Figure S1), but we find that TCRβ diversity (D) is strongly correlated with the average clonal expansion of all clones in the repertoire (Spearman ρ = 0.92), which includes CD4+ and naive T cells. This correlation remains significant even when the 1,000 most abundant clones are excluded (Spearman ρ = 0.52), indicating that P1000 serves as a proxy for the repertoire-wide property we refer to as intrinsic clonality.
Modeling TCRβ diversity
To better understand the factors shaping TCR diversity, we develop predictive models using biological and repertoire-derived features. We first use XGBoost, a gradient-boosted decision tree algorithm (33), to model TCRβ diversity as a function of age, sex and CMV exposure status.
All three features are predictive (Figure 3A) and the model broadly captures the systematic dependence of TCRβ diversity on these variables (Figure 3B). Next, we include repertoire size (S) and clonal expansion (S1000). Adding these features substantially improves model performance (Figures 3C, D): S and S1000 together explain approximately 96% of the intrinsic variance in TCRβ diversity (see Materials & Methods). Furthermore, including S and S1000 eliminates the predictive value of sex, age and CMV status, confirming that their effects on TCR diversity are mediated through repertoire size and intrinsic clonality.
Figure 3. Modeling TCRβ diversity using XGBoost. (A) Feature importance of sex, age and CMV status in predicting TCRβ diversity. (B) Solid lines show the median Log D as a function of age stratified by sex. The dashed lines show the model predictions generated via five-fold cross-validation. (C) Feature importance when including Log S and S1000 as features in the model. The negligible contribution of age, sex and CMV status demonstrates that repertoire size and S1000 account for the dependence of TCRβ diversity on these factors. (D) Same as (B) but for a model including repertoire size and S1000 as parameters. Repertoire size and S1000 robustly predict TCRβ diversity and are better predictors of its systematic dependence on age and sex than the model shown in (A, B).
The relationship between TCRβ diversity, repertoire size and intrinsic clonality is independent of age, sex and CMV exposure status. We fit a linear model to explicitly quantify this relationship:
Here $\hat{D}$ represents the predicted TCRβ diversity as a linear function of S and S1000. We note that all variables in the model are expressed in their original, non–log-transformed values. Equation 1 describes D as increasing with S and decreasing with S1000. Notably, the coefficient of S1000 is close to unity, indicating that TCRβ diversity decreases nearly one-to-one with increasing S1000. The model yields an R² of 0.96 (Figure 4), consistent with the ∼4% residual intrinsic scatter estimated from our XGBoost model, further supporting the robustness and completeness of the relationship. These results reinforce the idea that S1000 and P1000 serve as quantitative proxies for intrinsic clonality, a systemic repertoire property that mediates TCR diversity. Despite CMV-positive and CMV-negative individuals exhibiting distinct relationships between repertoire size and intrinsic clonality (Figure 2C), both groups follow a consistent relationship linking these variables to TCR diversity, underscoring the universal and fundamental role of this relationship in characterizing immune homeostasis.
Figure 4. A linear model of TCRβ diversity. We model D as a linear function of S and S1000 and fit the data in five-fold cross-validation. We plot the predicted TCRβ diversity as a function of the measured TCRβ diversity. The red dashed line shows one-to-one correspondence. The linear model accurately describes the measured TCRβ diversity (R² = 0.96).
We note that the choice of the 1,000 most abundant clones is not inherently special. For instance, when using the 10 or 100 most abundant clones (S10 or S100, respectively) as features (Supplementary Figure S3), the model performance is only slightly degraded, likely due to greater statistical uncertainty in these measures. The fitted coefficients differ modestly from those obtained using S1000, reflecting the smaller dynamic range of these quantities, but the overall linear relationship remains intact. Importantly, even with S10 or S100, the model continues to account for the dependence of TCRβ diversity on age and sex, demonstrating that the form of the relationship is robust to the precise definition of our proxy for intrinsic clonality. This result reinforces that the relationship reflects the heavy-tailed structure of the repertoire rather than a privileged cutoff. Additionally, we find that low TCRβ diversity does not appear to be linked to any specific or specialized set of immune exposures, and HLA genotypes are not predictive (see Supplementary Material). Taken together, these results strongly suggest that repertoire size and intrinsic clonality are the primary determinants of TCR diversity.
Discussion
TCR diversity is essential for immune competence, enabling the recognition and elimination of diverse threats. We show that repertoire size (S) and the abundance of the most expanded clones (S1000) explain nearly all the variation in TCRβ diversity (D) across individuals, including its systematic dependence on age, sex and CMV exposure. We emphasize that this relationship is not a mathematical artifact or tautology. Although S1000 is derived from a subset of highly expanded clones that represent <1% of TCRβ diversity, it robustly predicts D, a global property of the repertoire. This unexpected result indicates that a small fraction of clones encodes information about the entire distribution. We interpret this coordinated pattern of clonal expansion as the measurable manifestation of a previously unrecognized property of the immune system we term intrinsic clonality. Our proxy of intrinsic clonality does not depend on selecting the 1,000 most abundant clones. Using as few as 10 or 100 most abundant clones yields consistent results, indicating that intrinsic clonality captures a key biological property of the immune system. Thus, our findings reveal a simple but powerful organizing principle: overall TCR diversity is an emergent property of the immune system, arising from a fundamental relationship between repertoire size, intrinsic clonality and TCR diversity itself.
CMV exposure underscores the generality and resilience of the relationship between D, S and S1000, demonstrating that it holds even under strong immune perturbations. Unlike acute infections such as SARS-CoV-2 or other chronic herpes viruses like Epstein-Barr Virus, which have significantly smaller and more transient effects on the repertoire, CMV strongly perturbs homeostasis by increasing both repertoire size and intrinsic clonality. The chronic nature of CMV alone does not fully explain its outsized impact on repertoire structure and the biological reasons for its influence remain incompletely understood (34). Nevertheless, the consistency of the relationship between D, S, and S1000 across CMV-exposed and -unexposed individuals underscores its fundamental and robust nature.
We find no evidence that shared immune exposures or unmodeled host factors explain the relationship between D, S, and S1000. HLA genotype does not predict D and both the most abundant clones and their co-occurrence patterns vary widely across individuals, consistent with their origin from disparate immune exposures (Supplementary Figure S4). Additionally, a targeted search for TCRβs associated with low-diversity repertoires identified no strong candidates, further suggesting that unrecognized shared exposures are not the primary driver of diversity loss (see Supplementary Methods). However, these findings are not central to our conclusions. Rather, the key result is that S and S1000 together explain 96% of the variation in TCRβ diversity, leaving little room for additional contributors. While other variables may correlate with these quantities, repertoire size and clonality are fundamental properties of the T cell repertoire. The predictive power of S1000 reflects a coordinated pattern of clonal expansion across compartments and is robust to the specific number of clones included in the calculation (Supplementary Figure S3). Notably, S1000 is dominated by memory CD8+ T cells (Supplementary Figure S1), while D is shaped primarily by naive CD4+ T cells (35). The near one-to-one inverse relationship between S1000 and D (Equation 1) supports systemic coordination, possibly explaining the observed stability of clonal hierarchy over time (36). Together, these findings point to intrinsic, homeostatic regulation of the T cell repertoire rather than extrinsic factors.
Although the precise mechanisms remain uncertain, we propose that an immunometabolic regulatory network comprising two interdependent homeostatic processes may plausibly underlie our observations. The first regulates repertoire size through T cell competition for soluble IL-7 and IL-15 and access to stromal niches, which together determine how many total T cells can be sustained (17, 37–40), effectively setting a molecular homeostatic point for T cell carrying capacity. The second governs the overall amplitude of clonal expansion through an integrated network of cytokine, metabolic and costimulatory signals that integrate through the mTOR pathway (41, 42). mTOR links the cytokine milieu to cellular metabolic state, effectively coordinating the strength of clonal expansion across the repertoire, giving rise to the system-wide property we term intrinsic clonality. These two homeostatic processes are coupled through cytokine availability, which naturally accounts for the negative correlation between repertoire size and intrinsic clonality observed in CMV-negative subjects. Increased mTOR activity promotes greater clonal expansion but also likely enhances metabolic and cytokine dependence of each cell (43, 44), intensifying competition for limited resources and resulting in a smaller overall repertoire. In contrast, the same constraints do not apply to CMV-specific T cells which are dominated by late-differentiated CD45RA+ CCR7− (TEMRA) cells. TEMRAs are largely maintained through IL-15 trans-presentation and do not rely on the soluble cytokine–mediated homeostasis that governs the rest of the repertoire (34, 45). This lack of dependence on cytokine-mediated feedback explains why in CMV-positive individuals the expansion of CMV-specific T cells is balanced by an increase in repertoire size that preserves TCR diversity and why repertoire size and intrinsic clonality become uncoupled. 
While our findings robustly identify repertoire size and intrinsic clonality as key regulatory parameters, the specific biological mechanisms underlying intrinsic clonality remain speculative and may involve multiple, potentially overlapping pathways beyond mTOR.
There is growing evidence that many aspects of immune repertoire organization are genetically controlled, consistent with heritable regulation of the cytokine and metabolic pathways that govern T cell homeostasis. Cytokine levels and T cell counts vary systematically with age, sex and genetic background (46–53), indicating genetically encoded set points that shape T cell homeostasis. Genetic variation beyond HLA influences cytokine signaling and broader immune traits (47, 54–57), and twin studies demonstrate heritability in responses to homeostatic cytokines such as IL-7 and IL-2 (58) as well as in global immune parameters (46). Variation in genes encoding components and regulators of the mTOR pathway also modulates pathway activity and immune function, indicating that the metabolic arm of this regulatory network is likewise under genetic control (47, 58, 59). Together, these findings suggest that the mechanisms of immune homeostasis are at least partly genetically encoded, while environmental factors likely act through these intrinsic pathways (60–63).
Our analysis captures population-level trends in a cross-sectional manner. Small studies suggest TCR diversity is stable over short periods but declines with age (64, 65). Because our study focuses on adults, it primarily captures homeostatic regulation of established repertoires. In children, thymic production and developmental selection dominate repertoire dynamics. Applying this framework to pediatric cohorts, such as those described by Mitchell et al. (66), could help identify how homeostasis emerges. While these studies are consistent with our findings, their small sizes underscore the need for large-scale, longitudinal studies to help establish intrinsic clonality as a repertoire-wide feature. Investigating links between our findings and immunosenescence and inflammaging (67, 68) may offer further insights, as the mechanisms we propose may help explain key aspects of these age-associated phenomena. Future work could assess whether interindividual variation in mTOR pathway activity predicts intrinsic clonality and repertoire organization at the population level. These efforts could be integrated with studies leveraging high-throughput proteomics in large cohorts to elucidate how systemic cytokine levels influence TCR diversity and to identify molecular mediators of immune homeostasis. Continued integration of immune repertoire data with genomic profiling may help clarify how genetic variation modulates repertoire structure through the regulatory mechanisms we describe. Our findings provide a conceptual framework for investigating how intrinsic and extrinsic forces jointly regulate immune homeostasis, with implications for aging, disease susceptibility and therapeutic intervention.
T cells are essential for maintaining human health and their dysregulation contributes to a wide range of diseases, motivating therapeutic efforts to restore immune function (69, 70). A consistent feature of immune dysfunction is the loss of TCR diversity which is linked to poor clinical outcomes (20, 21, 24, 26, 30, 31, 71). Our findings suggest that TCR diversity is not directly regulated but instead emerges from clonal dynamics governed by repertoire size and intrinsic clonality, two properties that are likely directly regulated within immune homeostasis. By identifying these core determinants, our work provides guidance for therapeutic efforts aimed at preserving TCR diversity and restoring immune balance.
Materials and methods
Sequencing of human samples
Details of the sequencing data and IRB information are provided in Zahid et al. (20); here we highlight the most salient information. The CDR3 of TCRβ chains of T cells is sequenced with a multiplexed PCR typically using 18 µg of genomic DNA (72–75). The median sequencing depth is 518,618 TCRβs with 95% of subjects having a sequencing depth between 222,082 and 853,647. The median TCRβ diversity is 319,802 and 95% of subjects have values between 120,576 and 597,890. 95% of subjects have ages ranging between 20 and 74 years with a median age of 50 years. Sex is self-identified with males comprising 47.2% and females comprising 52.5% of subjects.
T cell based CMV diagnostic
We use a sensitive and specific T cell based diagnostic on the T-Detect COVID cohort to identify subjects exposed to CMV. We use a method previously described in (76–78), which statistically identifies disease-associated TCRβs based on serologically labeled cases and controls. A total of 2,181 labeled samples were used to build a CMV classifier with an area under the receiver operating characteristic curve (AUROC) of 0.96, measured on the same holdout set used in Emerson et al. (76). The performance of the T cell based test is comparable to serology and is limited by the accuracy of the serological labels. This diagnostic test allows us to identify subjects who are exposed to CMV using only their sequenced repertoire. Zahid et al. (20) demonstrate that CMV exposure primarily impacts TCR repertoire size.
Fitting TCRβ diversity
We first fit TCRβ diversity using the XGBRegressor routine implemented in version 2.1.6 of the XGBoost algorithm (33). We select XGBoost because of its ability to capture non-linear relationships, its strong out-of-the-box performance and its flexibility in handling categorical variables like sex and CMV exposure status. We adopt the default hyperparameters of the algorithm and use its default squared error loss function. We derive predictions of TCRβ diversity using a five-fold cross-validation scheme implemented in the routine cross_val_predict from the scikit-learn (79) package version 1.2.0. We fit the model to a random 80% of the data and predict on the remaining 20%. This process is repeated across five distinct, randomly generated 80/20 splits of the data, ensuring that every data point is predicted without being used for model fitting. We determine feature importance by fitting all the data simultaneously.
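The out-of-fold prediction scheme described above can be sketched without external dependencies. This is only an illustration of the fold logic: the helper names are hypothetical, the toy Log D values are invented, and a training-set mean stands in for XGBRegressor, so it does not reproduce the study's model.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence


def out_of_fold_predictions(
    y: Sequence[float],
    fit_predict: Callable[[List[int], List[int]], List[float]],
    n_folds: int = 5,
    seed: int = 0,
) -> List[float]:
    """Predict every sample using a model fit only on the other folds."""
    indices = list(range(len(y)))
    random.Random(seed).shuffle(indices)
    # Disjoint test sets of roughly 20% each when n_folds == 5.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    predictions: List[float] = [float("nan")] * len(y)
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]  # remaining ~80%
        for i, pred in zip(test_idx, fit_predict(train_idx, test_idx)):
            predictions[i] = pred
    return predictions


# Invented toy values of Log D, used only to exercise the fold logic.
log_d = [5.1, 5.4, 5.6, 5.3, 5.5, 5.2, 5.7, 5.0, 5.45, 5.35]


def mean_model(train_idx: List[int], test_idx: List[int]) -> List[float]:
    """Placeholder model (stand-in for XGBRegressor): predict the training mean."""
    train_mean = mean(log_d[i] for i in train_idx)
    return [train_mean for _ in test_idx]


oof = out_of_fold_predictions(log_d, mean_model)
print(oof)
```

Because each sample belongs to exactly one held-out fold, every prediction comes from a model that never saw that sample, which is the property the five-fold scheme is meant to guarantee.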
We next fit TCRβ diversity using the linear model described in Equation 1. We optimize the two parameters using the optimize.curve_fit function in version 1.15.2 of the SciPy package (80). The reported coefficients are the best-fit values obtained by fitting the full dataset, and their uncertainties are 1σ error estimates generated by bootstrap resampling. Model evaluation is performed using five-fold cross-validation, and Figure 4 shows predicted versus observed TCRβ diversity values under this validation framework.
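As a rough illustration of this fitting procedure, the sketch below fits a two-parameter linear model to synthetic data and bootstraps the parameter uncertainties. Several things here are assumptions rather than the study's implementation: the no-intercept form D ≈ a·S − b·S1000 is inferred from the description of Equation 1 (two parameters, D increasing with S and decreasing with S1000), a closed-form least-squares solve replaces SciPy's curve_fit to keep the example dependency-free, and the data and the values of TRUE_A and TRUE_B are arbitrary.

```python
import random
from statistics import stdev
from typing import List, Sequence, Tuple


def fit_linear(S: Sequence[float], T: Sequence[float], D: Sequence[float]) -> Tuple[float, float]:
    """Least-squares (a, b) for the assumed form D ≈ a*S - b*T, with T = S1000."""
    sss = sum(s * s for s in S)
    stt = sum(t * t for t in T)
    sst = sum(s * t for s, t in zip(S, T))
    ssd = sum(s * d for s, d in zip(S, D))
    std = sum(t * d for t, d in zip(T, D))
    det = sst * sst - sss * stt  # determinant of the 2x2 normal equations
    a = (sst * std - ssd * stt) / det
    b = (sss * std - sst * ssd) / det
    return a, b


def bootstrap_errors(
    S: Sequence[float], T: Sequence[float], D: Sequence[float],
    n_boot: int = 200, seed: int = 1,
) -> Tuple[float, float]:
    """1-sigma uncertainties on (a, b) from resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(D)
    a_fits: List[float] = []
    b_fits: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a, b = fit_linear([S[i] for i in idx], [T[i] for i in idx], [D[i] for i in idx])
        a_fits.append(a)
        b_fits.append(b)
    return stdev(a_fits), stdev(b_fits)


# Synthetic cohort; TRUE_A and TRUE_B are arbitrary, not the paper's coefficients.
rng = random.Random(0)
TRUE_A, TRUE_B = 0.8, 1.0
S = [rng.uniform(2e5, 9e5) for _ in range(500)]
T = [rng.uniform(0.1, 0.3) * s for s in S]  # top-clone cells as 10-30% of S
D = [TRUE_A * s - TRUE_B * t + rng.gauss(0.0, 5e3) for s, t in zip(S, T)]

a_hat, b_hat = fit_linear(S, T, D)
a_err, b_err = bootstrap_errors(S, T, D)
print(f"a = {a_hat:.3f} +/- {a_err:.3f}, b = {b_hat:.3f} +/- {b_err:.3f}")
```

For a model that is linear in its parameters, the closed-form solve and an iterative optimizer such as curve_fit should converge to the same least-squares solution, so the swap changes the mechanics but not the estimate.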
Estimating residual intrinsic scatter
To quantify the residual intrinsic (biological) scatter in the XGBoost model’s prediction of TCRβ diversity, we estimate and subtract the contribution of measurement error in quadrature as:
$$\sigma_i^2 = \sigma_r^2 - \sigma_m^2$$
Here $\sigma_i$ is the intrinsic biological scatter, $\sigma_r$ is the model uncertainty and $\sigma_m$ is the measurement uncertainty. The rationale is that for a perfect model the residual intrinsic scatter would be $\sigma_i = 0$, meaning the model’s uncertainty would be entirely limited by the measurement error.
Given that measurement errors in repertoire size and TCRβ diversity are correlated, we estimate the minimum achievable error (MAE) based on variability in D/S, the ratio of TCR diversity to repertoire size. Using repeat independent measurements from the same subjects, we calculate the MAE as the standard deviation of differences in Log D/S, yielding 0.027 dex (see Supplementary Material; Supplementary Figure S2). We fit TCRβ diversity using only S and S1000 as features and find the standard deviation of the fit residuals to be 0.031 dex. Subtracting the MAE from model uncertainty in quadrature yields a residual scatter of 0.015 dex, indicating that approximately 4% of the intrinsic variability in TCRβ diversity remains unexplained by S and S1000.
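The quadrature subtraction described above can be checked with a few lines of arithmetic. The function name is illustrative; the two inputs are the values quoted in the text (fit residual scatter of 0.031 dex and MAE of 0.027 dex).

```python
import math


def intrinsic_scatter(sigma_r: float, sigma_m: float) -> float:
    """Residual intrinsic scatter after removing measurement error in quadrature."""
    if sigma_r < sigma_m:
        raise ValueError("model scatter is below the measurement floor")
    return math.sqrt(sigma_r**2 - sigma_m**2)


sigma_i = intrinsic_scatter(sigma_r=0.031, sigma_m=0.027)
print(f"{sigma_i:.3f} dex")  # prints "0.015 dex"
```

Because the two terms combine in quadrature, a fit scatter only slightly above the measurement floor still leaves a residual that is a sizable fraction of either input, which is why the residual is 0.015 dex rather than the simple difference of 0.004 dex.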
Data availability statement
Data tables with TCR repertoire metrics are available at https://doi.org/10.5281/zenodo.14976210 and https://doi.org/10.5281/zenodo.13993996.
Ethics statement
The studies involving humans were approved by WIRB Copernicus Group Institutional Review Board. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
HZ: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. DM: Writing – original draft, Writing – review & editing. HR: Funding acquisition, Project administration, Resources, Writing – original draft, Writing – review & editing. JG: Investigation, Project administration, Resources, Writing – original draft, Writing – review & editing.
Funding
The author(s) declare financial support was received for the research and/or publication of this article. The work was funded by Microsoft Corporation and Adaptive Biotechnologies. The author(s) declared that this work received funding from Microsoft and Adaptive Biotechnologies. The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication.
Acknowledgments
We thank the reviewers whose comments improved the manuscript. We also thank Ruth Taniguchi for discussion and feedback.
Conflict of interest
HZ and JG are employed by the company Microsoft. DM and HR are employed by Adaptive Biotechnologies.
Generative AI statement
The author(s) declared that Generative AI was used in the creation of this manuscript. The author(s) declare that ChatGPT was used for editing.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2025.1707727/full#supplementary-material
Supplementary Figure 1 | Distribution of clone frequencies for various compartments. (A) and (B) show clonal frequencies versus clone ranks derived from the sorted repertoires of 45 subjects split by CMV-negative and CMV-positive subjects, respectively. T cells are sorted on memory/naive and CD4/CD8 markers prior to sequencing. Details of the sorting procedure are provided in Zahid et al. (81). The 1000 most abundant clones are dominated by memory CD8+ T cells.
Supplementary Figure 2 | Measurement uncertainty and XGBoost model residuals. (A) Distribution of the difference in the quantity Log D/S determined from 396 repeat samples. To avoid overestimating measurement uncertainties due to a small number of outliers, we estimate the standard deviation of the distribution by fitting a Gaussian. We take this measure, derived from repeat samples of the same subjects, as an estimate of the measurement uncertainties and adopt the standard deviation as the minimum achievable error for any model that is not overfitting to the data. (B) Distribution of residuals of the XGBoost model fit of TCR diversity using S and S1000 as features. The model error is only slightly larger than the measurement uncertainty, indicating that the model accounts for nearly all the intrinsic biological scatter in the data.
Supplementary Figure 3 | Modeling TCR diversity with S100 and S10 instead of S1000. (A) and (B) are the same as Figures 2A, B, respectively, but for a model fit using the fraction of the repertoire comprised of the 100 (not 1000) most abundant clones, i.e., S100. (C) Distribution of residuals of the model fit with S and S100. (D), (E) and (F) are the same as (A), (B) and (C), respectively, for a model using S10 instead of S100. Model performance is only slightly degraded when using S100 or S10 instead of S1000.
Supplementary Figure 4 | Distribution of the number of TCRβs in the 1000 most abundant clones that are HLA-associated and their cluster membership. (A) Histogram of the number of TCRβs that are one of the 1000 most abundant which are also HLA-associated public TCRβs for all subjects. (B) Histogram of the number of distinct HLA-associated TCR clusters represented by the TCRβs in (A) for all subjects. Each cluster is interpreted as mapping to a distinct immune exposure.
Footnotes
- ^ Clinical Laboratory Improvement Amendments of 1988
- ^ https://www.fda.gov/media/146481/download
- ^ Here dex refers to scatter measured on a log scale such that a value of x dex indicates a relative difference of 10^x.
References
1. Burnet FM. A modification of jerne’s theory of antibody production using the concept of clonal selection. CA: A Cancer J Clin. (1976) 26:119–21. doi: 10.3322/canjclin.26.2.119
2. Hedrick SM, Cohen DI, Nielsen EA, and Davis MM. Isolation of cDNA clones encoding T cell-specific membrane-associated proteins. Nature. (1984) 308:149–53. doi: 10.1038/308149a0
3. Yanagi Y, Yoshikai Y, Leggett K, Clark SP, Aleksander I, and Mak TW. A human T cell-specific cDNA clone encodes a protein having extensive homology to immunoglobulin chains. Nature. (1984) 308:145–9. doi: 10.1038/308145a0
4. Davis MM, Boniface JJ, Reich Z, Lyons D, Hampl J, Arden B, et al. Ligand recognition by (alpha)(beta) T cell receptors. Annu Rev Immunol. (1998) 16:523. doi: 10.1146/annurev.immunol.16.1.523
5. Nikolich-Žugich J, Slifka MK, and Messaoudi I. The many important facets of T-cell repertoire diversity. Nat Rev Immunol. (2004) 4:123–32. doi: 10.1038/nri1292
6. Foster AD, Sivarapatna A, and Gress RE. The aging immune system and its relationship with cancer. Aging Health. (2011) 7:707–18. doi: 10.2217/ahe.11.56
7. Martin MP and Carrington M. Immunogenetics of HIV disease. Immunol Rev. (2013) 254:245–64. doi: 10.1111/imr.12071
9. Montgomery RA, Tatapudi VS, Leffell MS, and Zachary AA. HLA in transplantation. Nat Rev Nephrol. (2018) 14:558–70. doi: 10.1038/s41581-018-0039-x
10. Kovacs AA, Kono N, Wang CH, Wang D, Frederick T, Operskalski E, et al. Association of HLA genotype with T-cell activation in human immunodeficiency virus (HIV) and HIV/hepatitis C virus–coinfected women. J Infect Dis. (2020) 221:1156–66. doi: 10.1093/infdis/jiz589
11. Francis JM, Leistritz-Edwards D, Dunn A, Tarr C, Lehman J, Dempsey C, et al. Allelic variation in class I HLA determines CD8+ T cell repertoire shape and cross-reactive memory responses to SARS-CoV-2. Sci Immunol. 7:eabk3070.
12. Granadier D, Iovino L, Kinsella S, and Dudakov JA. Dynamics of thymus function and T cell receptor repertoire breadth in health and disease. Semin immunopathology. (2021) 43:119–34. doi: 10.1007/s00281-021-00840-5
13. Olafsdottir TA, Bjarnadottir K, Norddahl GL, Halldorsson GH, Melsted P, Gunnarsdottir K, et al. HLA alleles, disease severity, and age associate with T-cell responses following infection with SARS-CoV-2. Commun Biol. (2022) 5:914. doi: 10.1038/s42003-022-03893-w
14. Tonegawa S. Somatic generation of antibody diversity. Nature. (1983) 302:575–81. doi: 10.1038/302575a0
15. Jameson SC. Maintaining the norm: T-cell homeostasis. Nat Rev Immunol. (2002) 2:547–56. doi: 10.1038/nri853
16. Janeway C, Travers P, Walport M, Shlomchik M, et al. Immunobiology: the immune system in health and disease. Vol. 2. New York: Garland Pub (2001).
17. Surh CD and Sprent J. Homeostasis of naive and memory T cells. Immunity. (2008) 29:848–62. doi: 10.1016/j.immuni.2008.11.002
18. Kumar BV, Connors TJ, and Farber DL. Human T cell development, localization, and function throughout life. Immunity. (2018) 48:202–13. doi: 10.1016/j.immuni.2018.01.007
19. Britanova OV, Putintseva EV, Shugay M, Merzlyak EM, Turchaninova MA, Staroverov DB, et al. Age-related decrease in TCR repertoire diversity measured with deep and normalized sequence profiling. J Immunol. (2014) 192:2689–98. doi: 10.4049/jimmunol.1302064
20. Zahid HJ, Taniguchi R, Noceda MG, Robbins H, and Greissl J. T cell receptor diversity, cancer and sex: insights from 30,000 TCR β Repertoires. bioRxiv. (2024), 2024–10.
21. Khan N, Shariff N, Cobbold M, Bruton R, Ainsworth JA, Sinclair AJ, et al. Cytomegalovirus seropositivity drives the CD8 T cell repertoire toward greater clonality in healthy elderly individuals. J Immunol. (2002) 169:1984–92. doi: 10.4049/jimmunol.169.4.1984
22. Naylor K, Li G, Vallejo AN, Lee WW, Koetz K, Bryl E, et al. The influence of age on T cell generation and TCR diversity. J Immunol. (2005) 174:7446–52. doi: 10.4049/jimmunol.174.11.7446
23. Goronzy JJ and Weyand CM. T cell development and receptor diversity during aging. Curr Opin Immunol. (2005) 17:468–75. doi: 10.1016/j.coi.2005.07.020
24. Messaoudi I, LeMaoult J, Guevara-Patino JA, Metzner BM, and Nikolich-Žugich J. Age-related CD8 T cell clonal expansions constrict CD8 T cell repertoire and have the potential to impair immune defense. J Exp Med. (2004) 200:1347–58. doi: 10.1084/jem.20040437
25. Qi Q, Liu Y, Cheng Y, Glanville J, Zhang D, Lee JY, et al. Diversity and clonal selection in the human T-cell repertoire. Proc Natl Acad Sci. (2014) 111:13139–44. doi: 10.1073/pnas.1409155111
26. Palmer S, Albergante L, Blackburn CC, and Newman T. Thymic involution and rising disease incidence with age. Proc Natl Acad Sci. (2018) 115:1883–8. doi: 10.1073/pnas.1714478115
27. Krishna C, Chowell D, Gönen M, Elhanati Y, and Chan TA. Genetic and environmental determinants of human TCR repertoire diversity. Immun Ageing. (2020) 17:1–7. doi: 10.1186/s12979-020-00195-9
28. Cardinale A, De Luca CD, Locatelli F, and Velardi E. Thymic function and T-cell receptor repertoire diversity: implications for patient response to checkpoint blockade immunotherapy. Front Immunol. (2021) 12:752042. doi: 10.3389/fimmu.2021.752042
29. Brown AJ, White J, Shaw L, Gross J, Slabodkin A, Kushner E, et al. MHC heterozygosity limits T cell receptor variability in CD4 T cells. Sci Immunol. (2024) 9:eado5295. doi: 10.1126/sciimmunol.ado5295
30. Turner SJ, La Gruta NL, Kedzierska K, Thomas PG, and Doherty PC. Functional implications of T cell receptor diversity. Curr Opin Immunol. (2009) 21:286–90. doi: 10.1016/j.coi.2009.05.004
31. Gleason L, Porcu P, and Nikbakht N. Reduced overall T-cell receptor diversity as an indicator of aggressive cutaneous T-cell lymphoma. Blood. (2022) 140:3539–40. doi: 10.1182/blood-2022-170357
32. Lindau P, Porcu P, and Nikbakht N. Cytomegalovirus exposure in the elderly does not reduce CD8 T cell repertoire diversity. J Immunol. (2019) 202:476–83. doi: 10.4049/jimmunol.1800217
33. Chen T and Guestrin C. (2016). XGBoost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 785–94.
34. Klenerman P and Oxenius A. T cell responses to cytomegalovirus. Nat Rev Immunol. (2016) 16:367–77. doi: 10.1038/nri.2016.38
35. Li HM, et al. TCRβ repertoire of CD4+ and CD8+ T cells is distinct in richness, distribution, and CDR3 amino acid composition. J Leucocyte Biol. (2016) 99:505–13. doi: 10.1189/jlb.6A0215-071RR
36. Gaimann MU, Nguyen M, Desponds J, and Mayer A. Early life imprints the hierarchy of T cell clone sizes. Elife. (2020) 9:e61639. doi: 10.7554/eLife.61639
37. Tan JT, et al. IL-7 is critical for homeostatic proliferation and survival of naive T cells. Proc Natl Acad Sci. (2001) 98:8732–7. doi: 10.1073/pnas.161126098
38. Moses CT, Thorstenson KM, Jameson SC, and Khoruts A. Competition for self ligands restrains homeostatic proliferation of naive CD4 T cells. Proc Natl Acad Sci. (2003) 100:1185–90. doi: 10.1073/pnas.0334572100
39. Becker TC, et al. Interleukin 15 is required for proliferative renewal of virus-specific memory CD8 T cells. J Exp Med. (2002) 195:1541–8. doi: 10.1084/jem.20020369
40. Boyman O, Purton JF, Surh CD, and Sprent J. Cytokines and T-cell homeostasis. Curr Opin Immunol. (2007) 19:320–6. doi: 10.1016/j.coi.2007.04.015
41. Thomson AW, Turnquist HR, and Raimondi G. Immunoregulatory functions of mTOR inhibition. Nat Rev Immunol. (2009) 9:324–37. doi: 10.1038/nri2546
42. Powell JD, Pollizzi KN, Heikamp EB, and Horton MR. Regulation of immune responses by mTOR. Annu Rev Immunol. (2012) 30:39–68. doi: 10.1146/annurev-immunol-020711-075024
43. Delgoffe GM, et al. The mTOR kinase differentially regulates effector and regulatory T cell lineage commitment. Immunity. (2009) 30:832–44. doi: 10.1016/j.immuni.2009.04.014
44. Yang K, Neale G, Green DR, He W, and Chi H. The tumor suppressor Tsc1 enforces quiescence of naive T cells to promote immune homeostasis and function. Nat Immunol. (2011) 12:888–97. doi: 10.1038/ni.2068
45. van den Berg SP, Pardieck IN, Lanfermeijer J, Sauce D, Klenerman P, Baarle D van, et al. The hallmarks of CMV-specific CD8 T-cell differentiation. Med Microbiol Immunol. (2019) 208:365–73. doi: 10.1007/s00430-019-00608-7
46. Roederer M, Quaye L, Mangino M, Beddall MH, Mahnke Y, Chattopadhyay P, et al. The genetic architecture of the human immune system: a bioresource for autoimmunity and disease pathogenesis. Cell. (2015) 161:387–403. doi: 10.1016/j.cell.2015.02.046
47. Aguirre-Gamboa R, Joosten I, Urbano PC, Molen RG van der, Rijssen E van, Cranenbroek B van, et al. Differential effects of environmental and genetic factors on T and B cell immune traits. Cell Rep. (2016) 17:2474–87. doi: 10.1016/j.celrep.2016.10.053
48. Li Y, Oosting M, Smeekens SP, Jaeger M, Aguirre-Gamboa R, Le KT, et al. A functional genomics approach to understand variation in cytokine production in humans. Cell. (2016) 167:1099–110. doi: 10.1016/j.cell.2016.10.017
49. De Craen A, Posthuma D, Remarque E, Van Den Biggelaar A, Westendorp R, and Boomsma D. Heritability estimates of innate immunity: an extended twin study. Genes Immun. (2005) 6:167–70. doi: 10.1038/sj.gene.6364162
50. Bakker OB, Aguirre-Gamboa R, Sanna S, Oosting M, Smeekens SP, Jaeger M, et al. Integration of multi-omics data and deep phenotyping enables prediction of cytokine responses. Nat Immunol. (2018) 19:776–86. doi: 10.1038/s41590-018-0121-3
51. Goetzl EJ, Huang MC, Kon J, Patel K, Schwartz JB, Fast K, et al. Gender specificity of altered human immune cytokine profiles in aging. FASEB J. (2010) 24:3580. doi: 10.1096/fj.10-160911
52. Ter Horst R, Jaeger M, Smeekens SP, Oosting M, Swertz MA, Li Y, et al. Host and environmental factors influencing individual human cytokine responses. Cell. (2016) 167:1111–24. doi: 10.1016/j.cell.2016.10.018
53. Bernardi S, Toffoli B, Tonon F, Francica M, Campagnolo E, Ferretti T, et al. Sex differences in proatherogenic cytokine levels. Int J Mol Sci. (2020) 21:3861. doi: 10.3390/ijms21113861
54. Piasecka B, Duffy D, Urrutia A, Quach H, Patin E, Posseme C, et al. Distinctive roles of age, sex, and genetics in shaping transcriptional variation of human immune responses to microbial challenges. Proc Natl Acad Sci. (2018) 115:E488–97. doi: 10.1073/pnas.1714765115
55. Orrù V, et al. Complex genetic signatures in immune cells underlie autoimmunity and inform therapy. Nat Genet. (2020) 52:1036–45. doi: 10.1038/s41588-020-0684-4
56. Liston A, Humblet-Baron S, Duffy D, and Goris A. Human immune diversity: from evolution to modernity. Nat Immunol. (2021) 22:1479–89. doi: 10.1038/s41590-021-01058-1
57. Poisner H, Faucon A, Cox N, and Bick AG. Genetic determinants and phenotypic consequences of blood T-cell proportions in 207,000 diverse individuals. Nat Commun. (2024) 15:6732. doi: 10.1038/s41467-024-51095-1
58. Brodin P, et al. Variation in the human immune system is largely driven by non-heritable influences. Cell. (2015) 160:37–47. doi: 10.1016/j.cell.2014.12.020
59. Saxton RA and Sabatini DM. mTOR signaling in growth, metabolism, and disease. Cell. (2017) 168:960–76. doi: 10.1016/j.cell.2017.02.004
60. Klein SL and Flanagan KL. Sex differences in immune responses. Nat Rev Immunol. (2016) 16:626–38. doi: 10.1038/nri.2016.90
61. Westergaard D, Moseley P, Sørup FKH, Baldi P, and Brunak S. Population-wide analysis of differences in disease progression patterns in men and women. Nat Commun. (2019) 10:666. doi: 10.1038/s41467-019-08475-9
62. Patwardhan V, Gil GF, Arrieta A, Cagney J, DeGraw E, Herbert ME, et al. Differences across the lifespan between females and males in the top 20 causes of disease burden globally: a systematic analysis of the Global Burden of Disease Study 2021. Lancet Public Health. (2024) 9:e282–94. doi: 10.1016/S2468-2667(24)00053-7
63. Stankiewicz LN, Salim K, Flaschner EA, Wang YX, Edgar JM, Durland LJ, et al. Sex-biased human thymic architecture guides T cell development through spatially defined niches. Dev Cell. (2025) 60:152–69. doi: 10.1016/j.devcel.2024.09.011
64. Yoshida K, Cologne JB, Cordova K, Misumi M, Yamaoka M, Kyoizumi S, et al. Aging-related changes in human T-cell repertoire over 20 years delineated by deep sequencing of peripheral T-cell receptors. Exp Gerontology. (2017) 96:29–37. doi: 10.1016/j.exger.2017.05.015
65. Chu ND, Bi HS, Emerson RO, Sherwood AM, Birnbaum ME, Robins HS, et al. Longitudinal immunosequencing in healthy people reveals persistent T cell receptors rich in highly public receptors. BMC Immunol. (2019) 20:1–12. doi: 10.1186/s12865-019-0300-5
66. Mitchell AM, Baschal EE, McDaniel KA, Simmons KM, Pyle L, Waugh K, et al. Temporal development of T cell receptor repertoires during childhood in health and disease. JCI Insight. (2022) 7:e161885. doi: 10.1172/jci.insight.161885
67. Xia S, Zhang X, Zheng S, Khanabdali R, Kalionis B, Wu J, et al. An update on inflamm-aging: mechanisms, prevention, and treatment. J Immunol Res. (2016) 2016:8426874. doi: 10.1155/2016/8426874
68. Fulop T, Larbi A, Dupuis G, Le Page A, Frost EH, Cohen AA, et al. Immunosenescence and inflamm-aging as two sides of the same coin: friends or foes? Front Immunol. (2018) 8:1960.
69. Iriguchi S, et al. A clinically applicable and scalable method to regenerate T-cells from iPSCs for off-the-shelf T-cell immunotherapy. Nat Commun. (2021) 12:430. doi: 10.1038/s41467-020-20658-3
70. Stankiewicz LN, Rossi FM, and Zandstra PW. Rebuilding and rebooting immunity with stem cells. Cell Stem Cell. (2024) 31:597–616. doi: 10.1016/j.stem.2024.03.012
71. Wang GC, Dash P, McCullers JA, Doherty PC, and Thomas PG. T cell receptor αβ diversity inversely correlates with pathogen-specific antibody levels in human cytomegalovirus infection. Sci Trans Med. (2012) 4:128ra42–128ra42. doi: 10.1126/scitranslmed.3003647
72. Robins HS, Campregher PV, Srivastava SK, Wacher A, Turtle CJ, Kahsai O, et al. Comprehensive assessment of T-cell receptor β-chain diversity in αβ T cells. Blood J Am Soc Hematol. (2009) 114:4099–107. doi: 10.1182/blood-2009-04-217604
73. Robins H, Desmarais C, Matthis J, Livingston R, Andriesen J, Reijonen H, et al. Ultra-sensitive detection of rare T cell clones. J Immunol Methods. (2012) 375:14–9. doi: 10.1016/j.jim.2011.09.001
74. Carlson CS, Emerson RO, Sherwood AM, Desmarais C, Chung MW, Parsons JM, et al. Using synthetic templates to design an unbiased multiplex PCR assay. Nat Commun. (2013) 4:1–9. doi: 10.1038/ncomms3680
75. Dalai SC, Dines JN, Snyder TM, Gittelman RM, Eerkes T, Vaney P, et al. Clinical validation of a novel T-cell receptor sequencing assay for identification of recent or prior severe acute respiratory syndrome coronavirus 2 infection. Clin Infect Dis. (2022) 75:2079–87. doi: 10.1093/cid/ciac353
76. Emerson RO, DeWitt WS, Vignali M, Gravley J, Hu JK, Osborne EJ, et al. Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nat Genet. (2017) 49:659–65. doi: 10.1038/ng.3822
77. Greissl J, Pesesky M, Dalai SC, Rebman AW, Soloski MJ, Horn EJ, et al. Immunosequencing of the T-cell receptor repertoire reveals signatures specific for diagnosis and characterization of early Lyme disease. medRxiv. (2021). doi: 10.1101/2021.07.30.21261353
78. Elyanow R, Snyder TM, Dalai SC, Gittelman RM, Boonyaratanakornkit J, Wald A, et al. T cell receptor sequencing identifies prior SARS-CoV-2 infection and correlates with neutralizing antibodies and disease severity. JCI Insight. (2022) 7. doi: 10.1172/jci.insight.150070
79. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in python. J Mach Learn Res. (2011) 12:2825–30.
80. Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in python. Nat Methods. (2020) 17:261–72. doi: 10.1038/s41592-019-0686-2
81. Zahid HJ, Taniguchi R, Ebert P, Chow IT, Gooley C, Lv J, et al. Large-scale statistical mapping of T-cell receptor β sequences to Human Leukocyte Antigens. BioRxiv. (2025), 2024–4. Available online at: https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2025.1603730.
Keywords: intrinsic clonality, T cell receptor diversity, immune repertoires, systems immunology, immune homeostasis
Citation: Zahid HJ, May D, Robins H and Greissl J (2026) A fundamental relationship between TCR diversity, repertoire size and systemic clonal expansion: insights from 30,000 TCRβ repertoires. Front. Immunol. 16:1707727. doi: 10.3389/fimmu.2025.1707727
Received: 17 September 2025; Accepted: 21 November 2025; Revised: 21 November 2025;
Published: 08 January 2026.
Edited by:
Peter S Linsley, Benaroya Research Institute, United States
Reviewed by:
Isha Monga, Weill Cornell Medicine, United States
Aaron Michels, University of Colorado, United States
Copyright © 2026 Zahid, May, Robins and Greissl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: H. Jabran Zahid, aHphaGlkQG1pY3Jvc29mdC5jb20=
Damon May2