Towards a sustainable, comprehensive and community-accepted nomenclature and naming standard of antibody and T cell receptor germline genes and alleles

Peres, Ayelet; Ohlin, Mats

doi:10.3389/fimmu.2025.1689673

OPINION article

Front. Immunol., 24 November 2025

Sec. T Cell Biology

Volume 16 - 2025 | https://doi.org/10.3389/fimmu.2025.1689673

Ayelet Peres¹

Mats Ohlin²*

¹Department of Pathology, Yale University, New Haven, CT, United States
²Department of Immunotechnology and SciLifeLab, Lund University, Lund, Sweden

Technological advances have brought us to a point where we can now observe, interpret, and understand both the capabilities and limitations of the adaptive immune system—particularly its role in defending against external threats and internal challenges such as oncogenic transformation and hostile environments. These developments also enable us to examine how pathological conditions may arise from erroneous actions of the adaptive immune system itself.

Central to these processes is the generation of B cell receptors (BCRs) and antibodies, and T cell receptors (TCRs) by B and T cells, respectively. Together, these molecules form the Adaptive Immune Receptor Repertoire (AIRR), which enables the immune system to recognize and respond to a vast array of antigens. The immense diversity of these receptors, driven by gene diversity, gene rearrangements and, in the case of antibodies, affinity maturation through somatic hypermutation and selection, allows the adaptive immune system to respond to any encountered antigens, provided appropriate co-stimulatory signals are present.

Due to their importance in health and disease, adaptive immunity is studied extensively, and these studies generate vast datasets. Proper interpretation of these data relies on standardized computational tools, as well as a consistent language for analysis and visualization. Numerous tools have been developed and are in widespread use. Recognizing the need for standardization, efforts to catalogue and define the genes involved in generating BCR/antibody and TCR-encoding sequences began more than three decades ago, engaging both the international ImMunoGeneTics information system (IMGT) (1) and others (2). These standardized frameworks have greatly facilitated data analysis, interpretation, and communication within the research community. Developments in sequencing technology, in particular long-read sequencing and associated bioinformatics tools, now enable valid reconstruction even of complex genomic loci such as those encoding AIRRs (3–7). However, as our understanding of the antibody and TCR loci has deepened, and as research has expanded to include previously understudied human populations and additional species, the challenges of developing complete, valid reference sets of germline genes participating in the generation of BCRs/antibodies and TCRs, while maintaining consistent nomenclature and naming standards, have become obvious (8). Some of these include:

● Multiple Numbering and Naming Schemes: Various systems for defining the most diverse, commonly antigen-binding parts of BCRs, antibodies and TCRs, the complementarity determining regions (CDRs), and for gene and sequence numbering have been developed [e.g. (9–15)], but their standard definitions have also been volatile. Several of these definitions are still in concurrent use. Multiple systems for gene naming have been implemented over the years. Although naming authority ought to reside with the International Union of Immunological Societies (IUIS) and its Nomenclature Committee, this seems not to be universally accepted. This creates confusion and hinders the consistent interpretation of results. A single naming body ought to be universally accepted.

● Position-Based Gene Naming: Gene names have traditionally been based on genomic position, aiding interpretation. However, growing evidence shows that genomic structures are far more complex than initially believed. Duplicated, inserted, and deleted genes and inverted segments (5, 16–18), absent from early maps, now challenge this approach. Moreover, AIRR loci in many species are significantly more diverse than in humans (19). Experimentally important model systems, like different mouse strains, differ substantially in their germline gene repertoire (20, 21), and the germline loci of individual rhesus and cynomolgus macaques are very different (22, 23). Such complexity further complicates position-based naming, especially when the diversity of the genomic locus structure is incompletely known. A cautious position-based practice of gene naming must be applied if we do not fully comprehend the diversity of these loci. It is even conceivable that a purely position-based naming scheme must be avoided for some or even most species, as it might imply relations and proximity that do not exist.

● Evolving Nomenclature: Changes in gene names and numbering over time have not always been clearly traceable, which makes analysis outcomes harder to understand. In some cases, identical names have been reassigned to different sequences, introducing ambiguity. Such practices must be avoided; once assigned, a name should never be reused for a different sequence of a given species.

● Unreliable Sequence Naming: Sequence names have sometimes been assigned based on presumed gene location, without solid genomic support. For example, the human allele originally named IGHV4-59*08 recently had to be renamed as an allele of germline gene IGHV4-61: the original assignment, based on conventional sequencing data, was challenged first by additional analysis (24) and subsequently by an extended genomic analysis.

● Reference sets are in some cases not open-source: Costly licensing agreements may prevent the use of appropriate reference sets in commercial products intended for analysis of AIRR sequence data. Although it is recognised that there is a need to recover costs associated with the generation and management of such reference sets, licensing fees that prevent the universal use of such sets for valid analysis negatively impact the quality and reproducibility of AIRR research.
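
The name reassignment and renaming problems listed above can be detected mechanically when two releases of a reference set are available. The following is a minimal sketch and not a method described in this article: the function name, the allele names and the toy sequences are hypothetical placeholders, and it assumes that each sequence occurs only once per release.

```python
def compare_reference_sets(old, new):
    """Compare two {allele_name: sequence} mappings from different releases.

    Returns (reassigned, renamed):
      reassigned - names present in both releases with different sequences
      renamed    - (old_name, new_name) pairs that share the same sequence
    """
    # A name that now points at a different sequence has been reused.
    reassigned = sorted(
        name for name in old.keys() & new.keys()
        if old[name].upper() != new[name].upper()
    )
    # Index the old release by sequence (assumes sequences are unique).
    old_by_seq = {seq.upper(): name for name, seq in old.items()}
    # A sequence that now carries a different name has been renamed.
    renamed = sorted(
        (old_by_seq[seq.upper()], name)
        for name, seq in new.items()
        if seq.upper() in old_by_seq and old_by_seq[seq.upper()] != name
    )
    return reassigned, renamed


# Hypothetical placeholder names and toy sequences, not real germline alleles.
old_release = {"GENE-A*01": "ACGTAC", "GENE-A*02": "ACGTAT", "GENE-B*01": "TTGACC"}
new_release = {"GENE-A*01": "ACGTAC", "GENE-A*02": "ACGGAT", "GENE-C*01": "TTGACC"}

reassigned, renamed = compare_reference_sets(old_release, new_release)
print("Names reused for a different sequence:", reassigned)
print("Sequences that changed name:", renamed)
```

A check of this kind only reports that a change happened. Whether a renaming is justified still has to be documented by the curators of the reference set.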

To address these issues, we urge all stakeholders in the AIRR research community—including researchers, tool developers, database curators, journal editors, and standard-setting organizations—to agree on and adhere to a common set of principles. These may include:

● A universally recognized nomenclature and naming authority, most likely the T-cell Receptor and Immunoglobulin (TR-IG) Nomenclature Sub-Committee of the IUIS Nomenclature Committee, should govern naming decisions. Names approved by this body should be used; unapproved nomenclature should be clearly flagged in any publication or database. The sub-committee’s TR-IG Nomenclature Review Committee has recently released criteria for validating and naming new immunoglobulin and TCR gene/allele sequences (https://wp-iuis.s3.eu-west-1.amazonaws.com/app/uploads/2024/10/04120932/Data.Review.and_.Naming.Dec2023.pdf) as well as an overview of the challenges and likely best approaches to address them (25).

● Universal adoption of a standard sequence numbering system—such as that developed by IMGT (14)—for representing AIRR sequence data. Databases containing AIRR data, including antibody and TCR protein structures, should convert existing entries to conform with this standard, if necessary.

● Only well-documented genes and alleles should be recognised. These should be full-length sequences, primarily those supported by long-read genomic data of sufficient coverage. Discovery by traditional Sanger sequencing, by computational inference of germline allele sequences from transcriptome data (26–28), and by high-throughput gene amplicon sequencing (29) may well be largely replaced by modern long-read sequencing technologies, while inference may still help to determine expression levels of inferred alleles. Independently of the methodology used to identify novel genes and alleles, the supporting data should be made fully available to the research community so that the relevance of the discovery can be judged. Inferred sequences from short-read assemblies must not be added to reference sets due to inherent inaccuracies of such analytical processes (30, 31).

● Positional naming of germline genes should only be applied when the structure and diversity of the relevant immunoglobulin or TCR locus in the species under investigation are well understood. Position-based naming must not be applied until a sufficiently representative number of individuals have been genomically assessed to capture population diversity. In many species, diversity may be too extensive for reliable position-based naming to be implemented.

● Germline alleles without defined genomic locations should receive permanent and unique names (32) that do not associate the sequences with specific genes. A variety of such naming systems has been used in the past, but we propose the use of a single unified permanent unique naming structure (32). Importantly, these labels will be retained as easily accessible metadata even when in-depth understanding of the locus in question allows us to assign a permanent, gene-associated name.

● Reference sets of germline alleles and related data used across downstream applications should be made freely available under open-source licenses to ensure their broad usability in tools and pipelines.

● All reports of studies of AIRR data must include version numbers not only of tools but also of reference sets used for the analysis. Such versioned reference sets must be stored indefinitely in publicly available repositories.
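
Several of the principles above (names that are never reused, permanent labels retained as metadata, and versioned reference sets) can be illustrated with a small data structure. This is a sketch under stated assumptions and not the naming structure proposed in (32): the class `AlleleRegistry`, its methods, and the labels and names used are hypothetical, and the SHA-256 fingerprint is just one possible way to pin down the exact content of a reference set alongside its version number.

```python
import hashlib


class AlleleRegistry:
    """Toy registry in which labels and names are permanent and never reused."""

    def __init__(self):
        self._by_label = {}       # permanent label -> record
        self._issued = set()      # every label or gene name ever issued

    def _claim(self, identifier):
        # Refuse any identifier that has been issued before, even if retired.
        if identifier in self._issued:
            raise ValueError(f"{identifier!r} has already been issued")
        self._issued.add(identifier)

    def register(self, label, sequence):
        """Add a documented allele under a permanent, gene-independent label."""
        self._claim(label)
        self._by_label[label] = {
            "label": label,
            "sequence": sequence.upper(),
            "gene_name": None,
            "former_names": [],
        }

    def assign_gene_name(self, label, gene_name):
        """Attach a gene-associated name; the label is kept as metadata."""
        self._claim(gene_name)
        record = self._by_label[label]
        if record["gene_name"] is not None:
            record["former_names"].append(record["gene_name"])
        record["gene_name"] = gene_name

    def lookup(self, label):
        """Return a copy of the record stored under a permanent label."""
        return dict(self._by_label[label])

    def fingerprint(self):
        """SHA-256 over the sorted records, to report next to a version number."""
        digest = hashlib.sha256()
        for label in sorted(self._by_label):
            rec = self._by_label[label]
            digest.update(f"{label}\t{rec['gene_name']}\t{rec['sequence']}\n".encode())
        return digest.hexdigest()


# Hypothetical label, gene name and toy sequence.
registry = AlleleRegistry()
registry.register("LABEL-0001", "acgtac")
fingerprint_before = registry.fingerprint()
registry.assign_gene_name("LABEL-0001", "GENE-A*01")
record = registry.lookup("LABEL-0001")

try:
    registry.register("LABEL-0001", "TTGACC")  # attempt to reuse a label
    reuse_rejected = False
except ValueError:
    reuse_rejected = True

print(record["label"], record["gene_name"], reuse_rejected)
```

Because the fingerprint changes whenever a record changes, reporting it together with the reference set version lets a reader confirm that an archived copy is the one actually used in an analysis.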

We urge all stakeholders in AIRR research to engage in debate, for instance through the processes established within the AIRR Community. This Community within the Antibody Society (https://www.antibodysociety.org/) is a grassroots science community that organises and coordinates researchers and stakeholders engaged in studies of immunoglobulin and TCR repertoires (33). Such open debate will enable the IUIS Nomenclature Committee and the research community to jointly establish and implement collectively agreed-upon principles for reference set management as well as gene naming and nomenclature principles, such as those defined by the TR-IG Nomenclature Sub-Committee. The role of IUIS, the home of the Committee, as a core meeting-point for the scientific community is well established. It engages multiple national and regional societies and associated members, and it is a recognised partner of the World Health Organization and a member of the International Science Council. Its Nomenclature Committee and its subcommittees openly engage the immunology community to define universal and consistent nomenclatures, promoting community-wide acceptance of common principles. Their continued engagement in this process will greatly benefit the research community, our ability to rely on the quality of reference sequences, and the continued relevance of these sequences for future data analysis and interpretation.

Author contributions

AP: Conceptualization, Writing – original draft, Writing – review & editing. MO: Conceptualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that Generative AI was used in the creation of this manuscript. During the preparation of this text the authors used ChatGPT to rephrase parts of the original text. After using the tool, the authors reviewed and revised the content as needed and they take full responsibility for the content of the article.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Author Disclaimer

The authors are Co-Chairs of the Adaptive Immune Receptor Repertoire Community’s Inferred Allele Review Committee.

References

1. Lefranc MP, Giudicelli V, Duroux P, Jabado-Michaloud J, Folch G, Aouinti S, et al. IMGT®, the international ImMunoGeneTics information system® 25 years on. Nucleic Acids Res. (2015) 43:D413–22. doi: 10.1093/nar/gku1056

2. Cook GP, Tomlinson IM, Walter G, Riethman H, Carter NP, Buluwela L, et al. A map of the human immunoglobulin VH locus completed by analysis of the telomeric region of chromosome 14q. Nat Genet. (1994) 7:162–8. doi: 10.1038/ng0694-162

3. Papadaki A, Georga M, Jabado-Michaloud J, Folch G, Zeitoun G, Duroux P, et al. IMGT® analysis of the human IGH locus: unveiling novel polymorphisms and copy number variations in 15 genome assemblies from diverse ancestral backgrounds. NAR Genom Bioinform. (2025) 7:lqaf127. doi: 10.1093/nargab/lqaf127

4. Rodriguez OL, Gibson WS, Parks T, Emery M, Powell J, Strahl M, et al. A novel framework for characterizing genomic haplotype diversity in the human immunoglobulin heavy chain locus. Front Immunol. (2020) 11:2136. doi: 10.3389/fimmu.2020.02136

5. Rodriguez OL, Safonova Y, Silver CA, Shields K, Gibson WS, Kos JT, et al. Genetic variation in the immunoglobulin heavy chain locus shapes the human antibody repertoire. Nat Commun. (2023) 14:4419. doi: 10.1038/s41467-023-40070-x

6. Zhang JY, Roberts H, Flores DSC, Cutler AJ, Brown AC, Whalley JP, et al. Using de novo assembly to identify structural variation of eight complex immune system gene regions. PloS Comput Biol. (2021) 17:e1009254. doi: 10.1371/journal.pcbi.1009254

7. Rodriguez OL, Silver CA, Shields K, Smith ML, and Watson CT. Targeted long-read sequencing facilitates phased diploid assembly and genotyping of the human T cell receptor alpha, delta, and beta loci. Cell Genom. (2022) 2:100228. doi: 10.1016/j.xgen.2022.100228

8. Watson CT, Collins AM, Ohlin M, Heather JM, Peres A, Lees WD, et al. Building immunoglobulin and T cell receptor gene databases for the future. ImmunoInformatics. (2025) 19:100059. doi: 10.1016/j.immuno.2025.100059

9. Abhinandan KR and Martin AC. Analysis and improvements to Kabat and structurally correct numbering of antibody variable domains. Mol Immunol. (2008) 45:3832–9. doi: 10.1016/j.molimm.2008.05.022

10. Al-Lazikani B, Lesk AM, and Chothia C. Standard conformations for the canonical structures of immunoglobulins. J Mol Biol. (1997) 273:927–48. doi: 10.1006/jmbi.1997.1354

11. Chothia C and Lesk AM. Canonical structures for the hypervariable regions of immunoglobulins. J Mol Biol. (1987) 196:901–17. doi: 10.1016/0022-2836(87)90412-8

12. Honegger A and Plückthun A. Yet another numbering scheme for immunoglobulin variable domains: an automatic modeling and analysis tool. J Mol Biol. (2001) 309:657–70. doi: 10.1006/jmbi.2001.4662

13. Kabat EA, Wu TT, Perry HM, Gottesman KS, and Foeller C. Sequences of proteins of immunological interest. 5th edition. Bethesda, MD: U.S. Department of Health and Human Services (1991).

14. Lefranc MP. IMGT unique numbering for the variable (V), constant (C), and groove (G) domains of IG, TR, MH, IgSF, and MhSF. Cold Spring Harb Protoc. (2011) 2011:633–42. doi: 10.1101/pdb.ip85

15. Lefranc MP, Pommie C, Ruiz M, Giudicelli V, Foulquier E, Truong L, et al. IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains. Dev Comp Immunol. (2003) 27:55–77. doi: 10.1016/s0145-305x(02)00039-3

16. Engelbrecht E, Rodriguez OL, Shields K, Schultze S, Tieri D, Jana U, et al. Resolving haplotype variation and complex genetic architecture in the human immunoglobulin kappa chain locus in individuals of diverse ancestry. Genes Immun. (2024) 25:297–306. doi: 10.1038/s41435-024-00279-2

17. Gibson WS, Rodriguez OL, Shields K, Silver CA, Dorgham A, Emery M, et al. Characterization of the immunoglobulin lambda chain locus from diverse populations reveals extensive genetic variation. Genes Immun. (2023) 24:21–31. doi: 10.1038/s41435-022-00188-2

18. Watson CT, Steinberg KM, Huddleston J, Warren RL, Malig M, Schein J, et al. Complete haplotype sequence of the human immunoglobulin heavy-chain variable, diversity, and joining genes and characterization of allelic and copy-number variation. Am J Hum Genet. (2013) 92:530–46. doi: 10.1016/j.ajhg.2013.03.004

19. Pospelova M, Voss K, Zamyatin A, Watson CT, Koepfli KP, Bankevich A, et al. Comparative analysis of mammalian adaptive immune loci revealed spectacular divergence and common genetic patterns. Mol Biol Evol. (2025) 42:msaf152. doi: 10.1093/molbev/msaf152

20. Collins AM, Wang Y, Roskin KM, Marquis CP, and Jackson KJ. The mouse antibody heavy chain repertoire is germline-focused and highly variable between inbred strains. Philos Trans R Soc Lond B Biol Sci. (2015) 370:20140236. doi: 10.1098/rstb.2014.0236

21. Jackson KJL, Kos JT, Lees W, Gibson WS, Smith ML, Peres A, et al. A BALB/c IGHV reference set, defined by haplotype analysis of long-read VDJ-C sequences from F1 (BALB/c x C57BL/6) mice. Front Immunol. (2022) 13:888555. doi: 10.3389/fimmu.2022.888555

22. Peres A, Upadhyay AA, Klein VH, Saha S, Rodriguez OL, Vanwinkle ZM, et al. A broad survey and functional analysis of immunoglobulin loci variation in rhesus macaques. bioRxiv. (2025) 2025.05.28.656470. doi: 10.1101/2025.01.07.631319

23. Vazquez Bernat N, Corcoran M, Nowak I, Kaduk M, Castro Dopico X, Narang S, et al. Rhesus and cynomolgus macaque immunoglobulin heavy-chain genotyping yields comprehensive databases of germline VDJ alleles. Immunity. (2021) 54:355–66.e4. doi: 10.1016/j.immuni.2020.12.018

24. Parks T, Mirabel MM, Kado J, Auckland K, Nowak J, Rautanen A, et al. Association between a common immunoglobulin heavy chain allele and rheumatic heart disease risk in Oceania. Nat Commun. (2017) 8:14946. doi: 10.1038/ncomms14946

25. Collins AM, Watson CT, van der Ham H-J, Teyton L, Rosati E, and Safonova Y. Challenges for the immunoglobulin and T cell receptor gene nomenclatures in the modern genomics era. Immunoinformatics. (2025) 19:100053. doi: 10.1016/j.immuno.2025.100053

26. Corcoran MM, Phad GE, Vazquez Bernat N, Stahl-Hennig C, Sumida N, Persson MA, et al. Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity. Nat Commun. (2016) 7:13642. doi: 10.1038/ncomms13642

27. Gadala-Maria D, Yaari G, Uduman M, and Kleinstein SH. Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles. Proc Natl Acad Sci U.S.A. (2015) 112:E862–70. doi: 10.1073/pnas.1417683112

28. Ohlin M, Scheepers C, Corcoran M, Lees WD, Busse CE, Bagnara D, et al. Inferred allelic variants of immunoglobulin receptor genes: A system for their evaluation, documentation, and naming. Front Immunol. (2019) 10:435. doi: 10.3389/fimmu.2019.00435

29. Corcoran M, Narang S, Kaduk M, Chernyshev M, Färnert A, Sundling C, et al. Human IGH germline gene diversity and allele frequencies in 2486 individuals from 25 global populations delineated by ultra-high throughput genotyping. bioRxiv. (2025). 2025.08.06.668935. doi: 10.1101/2025.08.06.668935

30. Collins AM, Peres A, Corcoran MM, Watson CT, Yaari G, Lees WD, et al. Commentary on Population matched (pm) germline allelic variants of immunoglobulin (IG) loci: relevance in infectious diseases and vaccination studies in human populations. Genes Immun. (2021) 22:335–8. doi: 10.1038/s41435-021-00152-6

31. Watson CT, Matsen F, Jackson KJL, Bashir A, Smith ML, Glanville J, et al. Comment on “A database of human immune receptor alleles recovered from population sequencing data”. J Immunol. (2017) 198:3371–3. doi: 10.4049/jimmunol.1700306

32. Lees WD, Christley S, Peres A, Kos JT, Corrie B, Ralph D, et al. AIRR community curation and standardised representation for immunoglobulin and T cell receptor germline sets. Immunoinform (Amst). (2023) 10:100025. doi: 10.1016/j.immuno.2023.100025

33. Breden F, Luning Prak ET, Peters B, Rubelt F, Schramm CA, Busse CE, et al. Reproducibility and reuse of adaptive immune receptor repertoire data. Front Immunol. (2017) 8:1418. doi: 10.3389/fimmu.2017.01418

Keywords: adaptive immune receptor repertoire (AIRR), germline gene, gene nomenclature, reference sets, nomenclature

Citation: Peres A and Ohlin M (2025) Towards a sustainable, comprehensive and community-accepted nomenclature and naming standard of antibody and T cell receptor germline genes and alleles. Front. Immunol. 16:1689673. doi: 10.3389/fimmu.2025.1689673

Received: 20 August 2025; Revised: 06 November 2025; Accepted: 11 November 2025; Published: 24 November 2025.

Edited by:

Jonathan S Duke-Cohan, Dana–Farber Cancer Institute, United States

Reviewed by:

Yuta Nagano, University College London, United Kingdom

Copyright © 2025 Peres and Ohlin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Mats Ohlin, mats.ohlin@immun.lth.se
