ORIGINAL RESEARCH article

Front. Immunol., 12 November 2019

Sec. B Cell Biology

Volume 10 - 2019 | https://doi.org/10.3389/fimmu.2019.02541

Standardized IMGT® Nomenclature of Salmonidae IGH Genes, the Paradigm of Atlantic Salmon and Rainbow Trout: From Genomics to Repertoires

  • 1. Immunology Laboratory, Biomedical Research Center, University of Vigo, Vigo, Spain

  • 2. Department of Biology, Center of Evolutionary and Theoretical Immunology, University of New Mexico, Albuquerque, NM, United States

  • 3. Nofima AS, Norwegian Institute of Food, Fisheries and Aquaculture Research, Tromsø, Norway

  • 4. IMGT®, The International ImMunoGeneTics Information System® (IMGT), Laboratoire d'ImmunoGénétique Moléculaire (LIGM), Institut de Génétique Humaine (IGH), CNRS, University of Montpellier, Montpellier, France

  • 5. Sechenov Institute of Evolutionary Physiology and Biochemistry, Saint Petersburg, Russia

  • 6. MICALIS, Institut National de la Recherche Agronomique, Université Paris-Saclay, Jouy-en-Josas, France

  • 7. Génétique Animale et Biologie Intégrative (GABI), INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France

  • 8. Virologie et Immunologie Moléculaires (VIM), Institut National de la Recherche Agronomique (INRA), Université Paris-Saclay, Jouy-en-Josas, France

  • 9. Pathobiology Department, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA, United States

  • 10. Western Fisheries Research Center, U.S. Geological Survey, Seattle, WA, United States

  • 11. Department of Biology, University of Victoria, Victoria, BC, Canada

Abstract

In teleost fish as in mammals, humoral adaptive immunity is based on B lymphocytes expressing highly diverse immunoglobulins (IG). During B cell differentiation, IG loci are subjected to genomic rearrangements of V, D, and J genes, producing a unique antigen receptor expressed on the surface of each lymphocyte. During the course of an immune response to infections or immunizations, B cell clones specific for epitopes from the immunogen are expanded and activated, leading to production of specific antibodies. Among teleost fish, salmonids comprise key species for aquaculture. Rainbow trout (Oncorhynchus mykiss) and Atlantic salmon (Salmo salar) are especially important from a commercial point of view and have emerged as critical models for fish immunology. The growing interest in capturing accurate and comprehensive antibody responses against common pathogens and vaccines has resulted in recent efforts to sequence the IG repertoire in these species. In this context, a unified and standardized nomenclature of salmonid IG heavy chain (IGH) genes is urgently required, to improve the accuracy of annotation of adaptive immune receptor repertoire datasets generated by high-throughput sequencing (AIRRseq) and to facilitate comparisons between studies and species. Interestingly, the assembly of salmonid IGH genomic sequences is challenging due to the presence of two large duplicated IGH loci and high numbers of IG genes and pseudogenes. We used data available for Atlantic salmon to establish an IMGT standardized nomenclature of IGH genes in this species and then applied the IMGT rules to the rainbow trout IGH loci to set up a nomenclature, which takes into account the specificities of Salmonid loci. This unique, consistent nomenclature for Salmonid IGH genes was then used to construct IMGT sequence reference directories allowing accurate annotation of AIRRseq data. The complex issues raised by the genetic diversity of salmon and trout strains are discussed in the context of IG repertoire annotation.

Introduction

Vertebrate species with jaws (Gnathostomata) that appeared more than 400 million years ago are all characterized by an adaptive immune system based on B and T cells along with the huge diversity and specificity of their antigen receptors, the immunoglobulins (IG) or antibodies and the T cell receptors (TR), respectively (1, 2). The analysis of the germline IGH locus defines the genomic repertoire with the identification of the functional variable (V), diversity (D), and joining (J) genes that participate in the synthesis of VH domains. It also allows the identification of the functional constant (C) genes that encode the constant regions of the heavy chains and define their isotypes (3–7).

In teleost fish, B cell clonal responses are induced by infection or immunization, as described in humans or mice. Antibodies constitute a key factor for fish specific immunity and for the protection afforded by vaccines. As key species in aquaculture, Salmonids (family Salmonidae) including rainbow trout (Oncorhynchus mykiss; Oncmyk) and Atlantic salmon (Salmo salar; Salsal) constitute important models for the study of antibodies and B cell responses in fish.

Several groups started to clone and sequence IGH cDNA from rainbow trout in the early 1990s (8–12). Comparison of VH domains (V-D-J-REGION) expressed in trout stocks from Sweden, France, and the US revealed differences in IGHV subgroup usage: subgroups named 8, 9, 10, and 11 were found only in Swedish stocks while subgroups 4 and 7 were only found in French stocks and subgroup 5 (now part of IGHV1) was found in Swedish, French, and US stocks. These observations suggested genetic differences between the IGHV gene germline repertoires of different populations, but this was not fully clear due to the very small numbers of sampled individuals. In 1996, expressed VH domain sequences were classified into a set of 11 IGHV subgroups, defining a first unified nomenclature for rainbow trout (13). A more extensive study performed in 2006 on American trout by the group of Steve Kaattari found all these subgroups expressed, indicating that IGHV subgroups may have a wider distribution than previously suggested. Two additional subgroups expressed at low frequency were also discovered in this survey (14), leading to a repertoire of 13 IGHV subgroups. These subgroups were used for an IMGT gene table created in 2009, with a provisional gene nomenclature (letter S) for rainbow trout IGHV [path to access: IMGT Repertoire (IG and TR) >1. Locus and genes > Gene tables > IGHV > Rainbow trout (O. mykiss)]1.

In Atlantic salmon, Solem et al. described in 2001 nine IGHV subgroups (15), seven of which corresponded to IGHV subgroups defined in rainbow trout (1, 2, 3, 6, 8, 9, and 11). Southern blot experiments suggested that the number of genes per subgroup could vary between 1 and 7–10. This work also clearly established that Atlantic salmon IGHV genes were rearranged and transcribed from both of the two Atlantic salmon IGH loci (IGH locus A on chromosome 6 and IGH locus B on chromosome 3), which were most likely produced by the salmonid whole genome duplication. These data actually suggested that genes from some subgroups could be expressed only from a single locus, while genes from other subgroups were expressed from both A and B loci. This analysis was later extended and refined in 2010 by Yasuike et al. from a complete assembly of the Atlantic salmon IGH A and B loci based on sequences of 24 bacterial artificial chromosomes (BAC) (16). This study provided a first map of the organization of the duplicated IGH loci of a salmonid species. Ninety-nine IGHV genes were found in locus A, and 103 in locus B; 23 IGHV genes are functional in locus A, and 32 in locus B. Using the IMGT threshold of 75% identity for the V-REGION, 18 IGHV subgroups were defined in this work (16). Subgroups that did fit with the IGHV subgroups established in rainbow trout were given a subgroup number consistent with the online 2009 IMGT gene table [IMGT Repertoire (IG and TR) > 1. Locus and genes > Gene tables > IGHV > Atlantic salmon (Salmo salar)]1.

As new genome assemblies of Atlantic salmon and rainbow trout have been recently made available, we decided to annotate the IGH locus of these species and to establish a common nomenclature of IGH genes based on IMGT rules. We used data previously published for Atlantic salmon (16) to develop a prototype for the Salmonid IMGT standardized nomenclature. We also applied the IMGT rules to the rainbow trout IGH loci as a novel example of IMGT genomic annotation. The objective was to take into account the specificities of the Salmonid loci and to develop a unique, consistent nomenclature, while respecting the IMGT Scientific chart rules and standards. These standards are based on the concepts of identification (keywords), classification (gene and allele nomenclature), description (labels), and numbering (IMGT unique numbering and IMGT Collier de Perles) (3). It is important to note that a consistent nomenclature is crucial to build IMGT reference directory sets that are constituted by the V-REGION, D-REGION, and J-REGION of each IMGT reference allele from IMGT/LIGM-DB (same accession numbers as GenBank, ENA, and DDBJ) (17). These reference directory sets are the fundamental basis for annotation of repertoire datasets produced by high-throughput AIRRseq approaches for the analysis of expressed repertoires, in particular to define expressed clonotypes (18–20). The IMGT reference directories are built following the classification of the V, D, J, and C genes and alleles according to the IMGT rules and the assignment of the IMGT functionality: functional (F), open reading frame (ORF), or pseudogene (P) (IMGT Scientific chart > IMGT functionality)1 (3). These rules ensure that the nomenclature is consistent within and between species, and can be updated when more sequence data become available. Reference directory sets are used by IMGT/V-QUEST and IMGT/JunctionAnalysis (21, 22) for detailed analysis of nucleotide (nt) sequences of V domains [V-(D)-J-REGION]; by IMGT/DomainGapAlign, which provides alignments of amino acid (AA) sequences with the closest V and J regions for V domains and the closest C exons for C domains (23); by IMGT Collier de Perles based on the IMGT unique numbering for V and C domains (24, 25); and by IMGT/HighV-QUEST (26, 27) for high-throughput sequence analysis of expressed IGH repertoires and clonotype definition (18–20). Importantly, IMGT reference directory sets are freely available for the academic community and can be used by other programs developed for repertoire analysis.

In this work, we produced reference directory sets for IGH loci of Atlantic salmon and rainbow trout, based on a unique nomenclature developed for salmonids and following IMGT rules. We show how the particularities of salmonid IGH loci (duplicated loci in each haplotype, large number of genes and pseudogenes) were taken into account and how reference directory sets can be used for annotation of IGH expression datasets. We also discuss how the nomenclature and reference directories can be updated with new data and extended to other salmonid species.

Materials and Methods

GU129139 and GU129140 from GenBank, ENA, and DDBJ, entered in IMGT/LIGM-DB (Rel. 201839-1) and IMGT annotated (GU129139 in Rel. 201923-5, Last updated, Version 11 and GU129140 in Rel. 201930-1, Last updated, Version 10), were selected as S. salar (Salsal) IMGT IGH locus prototypes. Sequences from these entries are from Atlantic salmon BAC library (CHORI-214), constructed from a Norwegian aquaculture strain male, from BACPAC Resources, Children's Hospital Oakland Research Institute (CHORI) (16). GU129139 (931200 bp) (Salsal locus A, ssa06, IMGT locus ID: Salsal_IGH_1) is in reverse (REV) orientation on chromosome 6 whereas GU129140 (1063283 bp) (Salsal IGH locus B, ssa03, IMGT locus ID: Salsal_IGH_2) is in forward (FWD) orientation on chromosome 3.

To obtain IMGT gene names, newly identified Atlantic salmon and rainbow trout IGH genes and alleles from genome assemblies were submitted to the IG, T cell receptors (TR), and major histocompatibility (MH) Nomenclature Sub-Committee (IMGT-NC) of the International Union of Immunological Societies (IUIS) Nomenclature Committee2,3. Two IMGT-NC reports #2019-5-0131 and #2019-7-02202 comprise the submission of 75 Atlantic salmon IGHV sequences from two accession numbers NC_027305.1 and NC_027302.1. These reports concern 75 different genes [35 Atlantic salmon IGHV on NC_027305.1 (Salsal locus A, ssa06) and 40 Atlantic salmon IGHV genes on NC_027302.1 (Salsal locus B, ssa03)] and correspond to 75 new alleles (61 of them are *01 and 14 are *02).

Two new entries were created in IMGT/LIGM-DB: IMGT000028 for Salsal locus A [S. salar (Atlantic salmon), taxon:8030, breed: double haploid, assembly GCF_000233375.1, GenBank assembly ID: GCA_000233375.4, chromosome 6, CM003284.1 (20520824–22238370, complement), IGH locus A] [this entry includes IMGT annotated genes from NC_027305.1 (Salsal ssa06)] and IMGT000029 for Salsal locus B [S. salar (Atlantic salmon), taxon:8030, breed: double haploid, assembly GCF_000233375.1, GenBank assembly ID: GCA_000233375.4, chromosome 3, CM003281.1 (77578187–79383607), IGH locus B] [this entry includes IMGT annotated genes from NC_027302.1 (Salsal ssa03)].
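The "complement" qualifier on the locus A coordinates means that positions counted along the locus run opposite to the chromosome coordinates. The following minimal sketch illustrates that conversion for the two Atlantic salmon entries. It assumes 1-based, inclusive coordinates; the class and method names (`LocusRegion`, `to_chromosome`) are hypothetical and are not part of any IMGT tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocusRegion:
    """A locus entry described by its chromosome interval (assumed 1-based, inclusive)."""
    name: str
    start: int
    end: int
    complement: bool  # True when the locus is in reverse (REV) orientation

    @property
    def length(self) -> int:
        return self.end - self.start + 1

    def to_chromosome(self, local_pos: int) -> int:
        """Map a 1-based position within the locus entry to a chromosome coordinate."""
        if not 1 <= local_pos <= self.length:
            raise ValueError(f"position {local_pos} outside 1..{self.length}")
        if self.complement:
            # Position 1 of a reverse-oriented locus is the highest chromosome coordinate.
            return self.end - local_pos + 1
        return self.start + local_pos - 1


# Intervals quoted in the text for CM003284.1 (locus A) and CM003281.1 (locus B).
SALSAL_A = LocusRegion("IMGT000028", 20520824, 22238370, complement=True)
SALSAL_B = LocusRegion("IMGT000029", 77578187, 79383607, complement=False)
```

With these definitions, `SALSAL_A.to_chromosome(1)` returns the upper bound of the interval, whereas `SALSAL_B.to_chromosome(1)` returns the lower bound.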

The rainbow trout genome (assembly: Omyk_1.0, June 2017; GenBank assembly accession GCA_002163495.1) obtained from the homozygous Swanson clonal line was examined to locate the IGH loci. Two IGH loci were identified, locus A on chromosome 13 (Oncmyk chr13) and locus B on chromosome 12 (Oncmyk chr12), both in forward (FWD) orientation. The IMGT-NC Report #2019-10-0402 comprises the submission of 181 rainbow trout IGH gene sequences from NC_035089.1 (Oncmyk Omy13) and NC_035088.1 (Oncmyk Omy12). This IMGT-NC report concerns 181 different genes: 74 genes in locus A on Oncmyk chr 13 (49 IGHV, 11 IGHD, 10 IGHJ, and 4 IGHC on NC_035089.1) and 107 genes in locus B on Oncmyk chr 12 (80 IGHV, 13 IGHD, 9 IGHJ, and 5 IGHC on NC_035088.1) and corresponds to 181 new alleles *01. Two new entries were created in IMGT/LIGM-DB: IMGT000043 for Oncmyk locus A [O. mykiss (rainbow trout), taxon:8022, isolate: Swanson, assembly Omyk_1.0, GenBank assembly ID: GCF_002163495.1, chromosome 13: CM007947.1 (48012355–48422510), IGH locus A] [this entry includes IMGT annotated genes from NC_035089.1 (Oncmyk Omy13)] and IMGT000044 for Oncmyk locus B [O. mykiss (rainbow trout), taxon:8022, isolate: Swanson, assembly Omyk_1.0, GenBank assembly ID: GCF_002163495.1, chromosome 12: CM007946.1 (81302817–81805590), IGH locus B] [this entry includes IMGT annotated genes from NC_035088.1 (Oncmyk Omy12)].

Results

The complete and correct assembly of the Salmonidae IGH loci is a significant challenge owing to (i) the existence of two duplicated loci due to the tetraploidization (named locus A and locus B), (ii) the large size of each locus, (iii) the high number of different IGHV subgroups compared to mammals, (iv) the internal amplification and potential gene conversion that occurred inside each locus during their evolution, and (v) the very high number of pseudogenes, many of them partial, relative to the functional genes.

We therefore explored how the standardized IMGT nomenclature could allow the identification and classification of genes and alleles in incomplete or not yet fully annotated genome assemblies. The IGH data published for Atlantic salmon (16), largely based on BAC sequencing, were used as a prototype for establishing the standardized IMGT nomenclature for salmonids and for dealing, by comparison, with newly identified IGH genes from both Atlantic salmon and rainbow trout genome assemblies. The particularities of these IGH loci (in particular the tetraploidization) were taken into consideration for consistency between salmonid species.

From IG Classes to IMGT Constant (C) Gene Names

Three antibody classes have been identified in fish, namely, IgM, IgD, and IgT, while IgG, IgA, and IgE are absent (28). IgM and IgD are generally co-expressed at the cell surface of the same B cells through alternative splicing, as in mammals. Soluble IgM are tetrameric and constitute the main antibody class in serum. A third class, IgT, is expressed in most fish groups including salmonids. Interestingly, the IG-Heavy-Tau chains of IgT have a VH domain that results from independent V-D-J rearrangements, and is not obtained by a switch process (29). IgT has been found only in bony fish and is particularly involved in mucosal immunity and protection (30). IGHD was cloned and characterized in rainbow trout and Atlantic salmon, in parallel to the discovery of IGHT encoding the third fish IG-Heavy-Tau isotype (28, 29) and then in Atlantic salmon (31).

By convention, IMGT groups are designated by the locus and gene type. Based on the four gene types, V (variable), D (diversity), J (joining), and C (constant), the IGH genes belong to four groups: IGHV, IGHD, IGHJ, and IGHC. For the IGH locus, the constant genes are designated by the letter (and, if relevant, number) corresponding to the encoded isotype (IGHT, IGHM, and IGHD), instead of using the letter C.

The salmonid IGHC genes belong to three subgroups IGHM, IGHD, and IGHT and encode, when functional, the C-REGION of the heavy chain defining these three isotypes, IG-Heavy-Mu (heavy chain of the IgM class), IG-Heavy-Delta (heavy chain of the IgD class), and IG-Heavy-Tau (heavy chain of the IgT class) (Table 1). Salmonid locus A and locus B were assigned based on the literature, with the letter D (for “duplicated”) added to the conventional gene names for locus B.

Table 1

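IG receptor class | IG heavy chain type | IG C-gene group | IG C-gene subgroup | IGHC gene names, Salmo salar locus A | IGHC gene names, Salmo salar locus B | IGHC gene names, Oncorhynchus mykiss locus A | IGHC gene names, Oncorhynchus mykiss locus B
IgM | IG-Heavy-Mu | IGHC | IGHM | IGHM | IGHMD | IGHM | IGHMD
IgD | IG-Heavy-Delta | IGHC | IGHD | IGHD | IGHDD | IGHD | IGHDD
IgT | IG-Heavy-Tau | IGHC | IGHT | IGHT1, IGHT2, IGHT3, IGHT4, IGHT5 | IGHT1D, IGHT2D, IGHT3D | IGHT1, IGHT2 | IGHT1D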

Salmonid IG receptor classes, heavy chain types, and IGHC gene names.
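The naming convention summarized in Table 1 is regular enough to decode mechanically. The sketch below is only an illustration: the regular expression is inferred from the gene names listed in Table 1, and the function and type names (`parse_ighc`, `ConstantGene`) are hypothetical rather than part of IMGT software.

```python
import re
from typing import NamedTuple, Optional


class ConstantGene(NamedTuple):
    name: str
    ig_class: str          # IgM, IgD, or IgT
    heavy_chain: str       # IG-Heavy-Mu, IG-Heavy-Delta, or IG-Heavy-Tau
    number: Optional[int]  # gene number within the subgroup, if any (e.g., 2 in IGHT2D)
    locus: str             # "A", or "B" when the trailing D ("duplicated") is present


_ISOTYPES = {
    "M": ("IgM", "IG-Heavy-Mu"),
    "D": ("IgD", "IG-Heavy-Delta"),
    "T": ("IgT", "IG-Heavy-Tau"),
}

# IGH + isotype letter + optional number + optional trailing D for locus B.
# Pattern inferred from the names in Table 1 (assumption).
_IGHC_PATTERN = re.compile(r"^IGH([MDT])(\d*)(D?)$")


def parse_ighc(name: str) -> ConstantGene:
    """Decode a salmonid IGHC gene name such as IGHMD or IGHT2D."""
    match = _IGHC_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized IGHC gene name: {name!r}")
    letter, number, duplicated = match.groups()
    ig_class, heavy_chain = _ISOTYPES[letter]
    return ConstantGene(
        name=name,
        ig_class=ig_class,
        heavy_chain=heavy_chain,
        number=int(number) if number else None,
        locus="B" if duplicated else "A",
    )
```

Note that IGHD (delta constant gene, locus A) and IGHDD (locus B) are distinguished only by the trailing D, and that names of diversity genes such as IGHD1T2 are deliberately rejected by this pattern.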

Atlantic Salmon IGH Constant Genes and Associated D and J Genes

The Atlantic salmon IGH locus A, which is in a reverse (REV) orientation on chromosome 6 and spans 660 kilobases (kb) (with the V genes encompassing 600 kb) (Figure 1) includes 7 IGHC genes with 17 associated IGHD genes and 13 IGHJ genes. The Atlantic salmon IGH locus B, which is in forward (FWD) orientation on chromosome 3 and spans 720 kb (with the V genes encompassing 670 kb) (Figure 2) includes 5 IGHC genes with 11 associated IGHD genes and 8 IGHJ genes. The constant region of the IG-Heavy-Mu chain and of the IG-Heavy-Delta are encoded by a unique gene per locus (IGHM and IGHD for locus A and IGHMD and IGHDD for locus B) preceded by a D-J cluster. There are several IG-Heavy-Tau genes (IGHT), but the associated D-J cluster may be incomplete (lacking D and/or J genes). In Atlantic salmon, there is only one IGHT functional (F) gene per locus, IGHT4 for locus A and IGHT2D for locus B, each one having a complete D-J cluster (Table 2).

Figure 1

Figure 2

Table 2

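Salmo salar locus A is on chromosome 6 (Salsal ssa06); Salmo salar locus B is on chromosome 3 (Salsal ssa03). Each gene name is followed by its functionality; "-" marks an empty cell.

Locus A IGHD genes | Locus A IGHJ genes | Locus A IGHC genes | Locus B IGHD genes | Locus B IGHJ genes | Locus B IGHC genes
- | - | IGHT1 P | - | IGHJ1T1D P | IGHT1D P
IGHD1T2 F | IGHJ1T2 F | IGHT2 P | IGHD1T2D F | IGHJ1T2D F | IGHT2D F
IGHD2T2 F | IGHJ2T2 F | - | IGHD2T2D F | IGHJ2T2D F | -
- | IGHJ1T3 F | IGHT3 P | IGHD1T3D F | - | IGHT3D P
- | IGHJ2T3 F | - | IGHD2T3D F | - | -
IGHD1T4 F | IGHJ1T4 F | IGHT4 F | IGHD3T3D F | - | -
IGHD2T4 F | IGHJ2T4 F | - | - | - | -
IGHD3T4 F | - | - | - | - | -
IGHD4T4 F | - | - | - | - | -
IGHD5T4 F | - | - | - | - | -
IGHD1T5 F | IGHJ1T5 P | IGHT5 P | - | - | -
- | IGHJ2T5 F | - | - | - | -
IGHD1 F | IGHJ1 F | IGHM F | IGHD1D F | IGHJ1D F | IGHMD F
IGHD2 F | IGHJ2 ORF | - | IGHD2D F | IGHJ2D F | -
IGHD3 F | IGHJ3 F | - | IGHD3D F | IGHJ3D F | -
IGHD4 F | IGHJ4 F | - | IGHD4D F | IGHJ4D F | -
IGHD5 F | IGHJ5 F, ORF | - | IGHD5D F | IGHJ5D F, ORF | -
IGHD6 F | - | - | IGHD6D F | - | -
IGHD7 F | - | - | - | - | -
IGHD8 F | - | - | - | - | -
IGHD9 F | - | - | - | - | -
- | - | IGHD F | - | - | IGHDD F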

Atlantic salmon (Salmo salar) IGH constant C genes and associated D and J genes.

F, functional; ORF, open reading frame; P, pseudogene. The functionality is according to IMGT functionality (IMGT Scientific chart > IMGT functionality)1 (3).

In the Atlantic salmon locus A, the D and J genes associated to IGHT genes comprise two D (IGHD1T2 and IGHD2T2) and two J (IGHJ1T2 and IGHJ2T2) upstream of the pseudogene (P) IGHT2, two J (IGHJ1T3 and IGHJ2T3) upstream of IGHT3 (P), five D (IGHD1T4 to IGHD5T4) and two J (IGHJ1T4 and IGHJ2T4) genes, all of them functional, upstream of IGHT4 (F) and one D (IGHD1T5) and two J (IGHJ1T5 and IGHJ2T5) upstream of IGHT5 (P). There is no IGHD or IGHJ upstream of IGHT1 (P) (Table 2). The D and J associated to IGHM and IGHD comprise nine D (IGHD1 to IGHD9), all of them functional and five J genes, three of them functional (IGHJ1, IGHJ3, and IGHJ4), one with ORF, the IGHJ2, and one with alleles F or ORF (IGHJ5). They are located upstream of IGHM (F) and shared with the IGHD constant gene (F) (Table 2 and Figures S1, S2). Eleven IGHD not directly associated to constant genes are dispersed in locus A (IGHD-1 to IGHD-11).

In the Atlantic salmon locus B, the D and J genes associated to IGHT genes comprise one J (IGHJ1T1D) upstream of IGHT1D (P), two D (IGHD1T2D and IGHD2T2D), and two J (IGHJ1T2D and IGHJ2T2D) all functional upstream of IGHT2D (F) and three D (IGHD1T3D, IGHD2T3D, and IGHD3T3D) downstream of IGHT3D (P) (Table 2 and Figures S1, S2). The D and J genes associated to IGHMD and IGHDD comprise six D (IGHD1D to IGHD6D, all functional) and five J genes (four functional, IGHJ1D to IGHJ4D) and one with alleles F or ORF (IGHJ5D). They are located upstream of IGHMD (F) and shared with the IGHDD constant gene (F) (Table 2 and Figures S1, S2). Six IGHD not directly associated to constant genes are dispersed in locus B (IGHD-1D to IGHD-6D).

IGHD, IGHJ, and IGHC genes are reported in IMGT Gene tables [IMGT Repertoire (IG and TR) > 1. Locus and genes > Gene tables > IGHD > Atlantic salmon (S. salar); ibid., IGHJ > Atlantic salmon (S. salar); ibid., IGHC > Atlantic salmon (S. salar)]1.

Rainbow Trout IGH Constant Genes and Associated D and J Genes

Similar to the Atlantic salmon, the rainbow trout has one functional gene per IGH locus encoding the constant region of the IG-Heavy-Mu (IGHM gene in locus A and IGHMD gene in locus B), the constant region of the IG-Heavy-Delta (IGHD gene in locus A and IGHDD gene in locus B), and the constant region of the IG-Heavy-Tau (IGHT2 gene in locus A and IGHT1D gene in locus B).

The rainbow trout IGH locus A, which spans 360 kb and is in a forward (FWD) orientation on chromosome 13, includes 11 IGHD genes, 10 IGHJ genes, and 4 IGHC genes (Table 3). There are three D and two J genes upstream of IGHT1 (P), two D and two J genes upstream of IGHT2 (F), and six D and six J genes (all of them F) upstream of IGHM (F) and shared with the IGHD (F) constant gene (Figures S1, S2).

Table 3

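Oncorhynchus mykiss locus A is on chromosome 13 (Oncmyk Omy13); Oncorhynchus mykiss locus B is on chromosome 12 (Oncmyk Omy12). Each gene name is followed by its functionality; "-" marks an empty cell.

Locus A IGHD genes | Locus A IGHJ genes | Locus A IGHC genes | Locus B IGHD genes | Locus B IGHJ genes | Locus B IGHC genes
IGHD1T1 F | IGHJ1T1 F | IGHT1 P | IGHD1T1D F | IGHJ1T1D F | IGHT1D F
IGHD2T1 F | IGHJ2T1 F | - | IGHD2T1D F | IGHJ2T1D F | -
IGHD3T1 F | - | - | IGHD3T1D ORF | - | -
- | - | - | IGHD4T1D F | - | -
IGHD1T2 F | IGHJ1T2 F | IGHT2 F | - | - | -
IGHD2T2 F | IGHJ2T2 F | - | - | - | -
IGHD1 F | IGHJ1 F | IGHM F | IGHD1D F | IGHJ1D F | IGHMD F
IGHD2 F | IGHJ2 F | - | IGHD2D F | IGHJ2D F | -
IGHD3 F | IGHJ3 F | - | IGHD3D F | IGHJ3D F | -
IGHD4 F | IGHJ4 F | - | IGHD4D F | IGHJ4D F | -
IGHD5 F | IGHJ5 F | - | IGHD5D F | IGHJ5D F | -
IGHD6 F | IGHJ6 F | - | IGHD6D F | IGHJ6D F | -
- | - | - | - | IGHJ7D F | -
- | - | IGHD (F) | - | - | IGHDD (F)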

Rainbow trout (Oncorhynchus mykiss) IGH constant C genes and associated D and J genes.

F, functional; ORF, open reading frame; P, pseudogene. The functionality is according to IMGT functionality (IMGT Scientific chart > IMGT functionality)1 (3).

The rainbow trout IGH locus B, which spans 485 kb and is in a forward (FWD) orientation on chromosome 12, includes 13 IGHD genes, 9 IGHJ genes, and 3 IGHC genes (Table 3). There are four D genes (1 ORF and 3 F) and two J genes (both F) upstream of IGHT1D (F), and six D and seven J genes (all of them F) upstream of IGHMD (F) and shared with the IGHDD (F) constant gene (Figures S1, S2).

Sequences of rainbow trout IGHD and IGHJ genes and alleles are available in the downloadable IMGT reference directory sets from IMGT/GENE-DB (/download/GENE-DB)1 and from IMGT/V-QUEST (/download/V-QUEST/IMGT_V-QUEST_reference_directory/Oncorhynchus_mykiss/IG/IGHD.fasta; ibid., /IGHJ.fasta)1. IGHD and IGHJ genes and alleles are reported in the IMGT Gene tables [IMGT Repertoire (IG and TR) > 1. Locus and genes > Gene tables > IGHD > Rainbow trout (O. mykiss); ibid., IGHJ > Rainbow trout (O. mykiss)]1.

The demonstration that there is only one rainbow trout IG-Heavy-Delta complete gene per locus, IGHD in locus A and IGHDD in locus B, respectively, and that these two genes are functional, results from the analysis derived from applying the nomenclature of the Atlantic salmon IGH loci as well as the interpretation of expression data and published references (15, 16, 29, 31). The anomalies (partial IGHD and IGHDD genes with exons in aberrant localizations or in reverse-complementary orientation) are likely artifacts of the current genome assembly. For that reason, the functionality of the IGHD and IGHDD, deduced from literature data and supported by sequences external to the genome assembly, is shown in parentheses in Table 3.

Atlantic Salmon IGH Variable Genes

The Atlantic salmon IGH locus comprises a total of 303 IGH variable (IGHV) genes (145 IGHV in locus A on Salsal chromosome 6, spanning 600 kb, and 158 IGHV in locus B on Salsal chromosome 3, spanning 670 kb) (Figures 1, 2). There are a total of 67–69 functional genes, 12 ORF, and 222–224 pseudogenes (Table 4).

Table 4

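Locus A is on Salsal chromosome 6 and locus B on Salsal chromosome 3. F, functional; P, pseudogene.

IMGT group | IMGT subgroup | Locus A F | Locus A ORF | Locus A P | Locus A Total | Locus B F | Locus B ORF | Locus B P | Locus B Total | A+B F | A+B ORF | A+B P | A+B Total
IGHV | IGHV1 | 7 | 1 | 24 | 32 | 12 | 2 | 23 | 37 | 19 | 3 | 47 | 69
IGHV | IGHV2 | 2 | 0 | 0 | 2 | 2(+1)* | 0 | 5(+1)* | 8 | 4(+1)* | 0 | 5(+1)* | 10
IGHV | IGHV3 | 1 | 0 | 4 | 5 | 1 | 0 | 5 | 6 | 2 | 0 | 9 | 11
IGHV | IGHV4 | 2 | 2 | 12 | 16 | 3 | 1 | 14 | 18 | 5 | 3 | 26 | 34
IGHV | IGHV5 | 0 | 2 | 2 | 4 | 0 | 0 | 4 | 4 | 0 | 2 | 6 | 8
IGHV | IGHV6 | 1 | 1 | 16 | 18 | 6 | 0 | 20 | 26 | 7 | 1 | 36 | 44
IGHV | IGHV7 | 1 | 0 | 2 | 3 | 0 | 0 | 3 | 3 | 1 | 0 | 5 | 6
IGHV | IGHV8 | 10(+1)* | 1 | 9(+1)* | 21 | 3 | 0 | 5 | 8 | 13(+1)* | 1 | 14(+1)* | 29
IGHV | IGHV9 | 1 | 0 | 4 | 5 | 3 | 0 | 5 | 8 | 4 | 0 | 9 | 13
IGHV | IGHV10 | 0 | 0 | 14 | 14 | 0 | 1 | 8 | 9 | 0 | 1 | 22 | 23
IGHV | IGHV11 | 2 | 0 | 6 | 8 | 0 | 0 | 6 | 6 | 2 | 0 | 12 | 14
IGHV | IGHV12 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 2 | 1 | 1 | 1 | 3
IGHV | IGHV13 | 0 | 0 | 1 | 1 | 0 | 0 | 2 | 2 | 0 | 0 | 3 | 3
IGHV | IGHV14 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 2 | 2
IGHV | IGHV15 | 1 | 0 | 5 | 6 | 2 | 0 | 3 | 5 | 3 | 0 | 8 | 11
IGHV | IGHV16 | 1 | 0 | 7 | 8 | 5 | 0 | 10 | 15 | 6 | 0 | 17 | 23
Total | | 29(+1)* | 7 | 108(+1)* | 145 | 38(+1)* | 5 | 114(+1)* | 158 | 67(+2)* | 12 | 222(+2)* | 303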

Atlantic salmon IGH variable genes.

Number of IGHV genes are given per subgroup and per locus A or B, and per IMGT functionality (functional, ORF, pseudogene) (3).

* An asterisk indicates that the following genes have alleles with different functionalities: Functional or Pseudogene (IGHV2D-12 and IGHV8-58).
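The ranges quoted for functional genes (67–69) and pseudogenes (222–224) follow directly from the two genes whose alleles are either F or P. The short calculation below reproduces that arithmetic from the Table 4 totals; the variable names are illustrative only.

```python
# Totals from Table 4 (locus A + locus B), excluding the two genes marked with an asterisk.
functional_fixed = 67
orf = 12
pseudogene_fixed = 222

# Genes whose alleles are either functional or pseudogene (IGHV2D-12 and IGHV8-58).
ambiguous = 2

# Every gene is counted exactly once, whichever way the ambiguous genes are classified.
total_genes = functional_fixed + orf + pseudogene_fixed + ambiguous

# Each ambiguous gene can fall on either side, which yields the published ranges.
functional_range = (functional_fixed, functional_fixed + ambiguous)
pseudogene_range = (pseudogene_fixed, pseudogene_fixed + ambiguous)

print(total_genes, functional_range, pseudogene_range)
```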

Based on the percentage of identity between nucleotide sequences of the V-REGION (threshold 75%), the Atlantic salmon 303 IGHV genes can be classified into 16 IGHV subgroups. IGHV genes are reported in IMGT Gene tables [IMGT Repertoire (IG and TR) > 1. Locus and genes > Gene tables > IGHV > Atlantic salmon (S. salar)]1. Correspondence with previous gene names is indicated.
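To make the 75% criterion concrete, the sketch below groups pre-aligned nucleotide sequences whenever their pairwise identity reaches the threshold. It is a simplified illustration, not the IMGT procedure: it assumes the sequences are already aligned to equal length, ignores gapped positions, and uses single-linkage grouping, which is a choice made here for brevity. The sequences and names are toy placeholders, not salmonid V-REGIONs.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two aligned sequences, skipping gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-." or y in "-.":
            continue  # gap in either sequence: position not compared
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def group_by_identity(seqs: dict, threshold: float = 75.0) -> list:
    """Single-linkage grouping: two sequences share a group if identity >= threshold."""
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy aligned sequences (hypothetical).
TOY = {
    "toy_a": "ACGTACGTACGTACGTACGT",
    "toy_b": "ACGTACGTACGTACGTACAA",  # 18 of 20 positions identical to toy_a
    "toy_c": "TTTTTTTTTTGGGGGGGGGG",
}
toy_groups = group_by_identity(TOY)
```

In this toy case, `toy_a` and `toy_b` fall in one group and `toy_c` forms its own.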

Translation of alleles *01 of F, ORF, and in-frame P are aligned according to the IMGT unique numbering in IMGT Protein display allowing the visualization of the FR-IMGT and CDR-IMGT [IMGT Repertoire (IG and TR) > 2. Proteins and alleles > Protein displays > IGHV > Atlantic salmon (S. salar)]1 and the comparison of the CDR-IMGT lengths per subgroup (3) {IMGT Repertoire (IG and TR) > 3. 2D and 3D structures > FR-IMGT and CDR-IMGT lengths (V-REGION and V-DOMAIN) > [CDR1-IMGT.CDR2-IMGT.] length per subgroup > IGHV > Atlantic salmon (S. salar)}1 (3).

Rainbow Trout IGH Variable Genes

A total of 129 IGHV genes were identified in the rainbow trout genome, of which 57 can be considered fully functional or with an ORF without stop codon. A number of other sequences were identified as IGHV fragments in the assembly and were not included in the annotation. On chromosome 13 (locus A), 44 IGHV genes were found upstream of the functional IGHT2 gene, as well as 5 IGHV genes between the D-J-IGHT2 cluster and the D-J-IGHM-IGHD cluster. Eighty IGHV genes were found on chromosome 12 (locus B): 70 IGHV were located upstream of the functional IGHT1D gene and 10 IGHV were found between the D-J-IGHT1D cluster and the D-J-IGHMD-IGHDD cluster. The 129 rainbow trout IGHV genes could be classified into the same 16 subgroups defined for the Atlantic salmon IGHV genes, containing from only 1 pseudogene (i.e., IGHV5, IGHV13, and IGHV14 subgroups) to 35 genes, i.e., IGHV1 subgroup, which includes 12 F, 2 ORF, and 21 P IGHV genes. Figure 3 shows a phylogenetic tree based on nucleotide sequences of IGHV genes (F and ORF) present in Atlantic salmon and rainbow trout IGH loci. While some IGHV subgroups are not represented in both species, as far as we know, this tree illustrates how rainbow trout IGHV genes nicely cluster with their Atlantic salmon counterparts.

Figure 3

Expressed Repertoire Analysis

IMGT/V-QUEST and its high-throughput version, IMGT/HighV-QUEST, can perform analysis of nucleotide sequences of the IG and TR variable domains (21, 22, 26, 27). These tools run against the IMGT/V-QUEST reference directory database that includes several sets (per group and per species) and are built based on the IMGT standards (3) (annotation in IMGT/LIGM-DB, Gene tables, Alignments of alleles, Protein display, entry in IMGT/GENE-DB). The IMGT/V-QUEST sets comprise IMGT reference sequences from all functional (F) and ORF genes and alleles (in Advanced parameters, Selection of IMGT reference directory set “F + ORF”). The sets also include IMGT reference sequences from pseudogenes (P) and alleles with an in-frame V-REGION for versatile genomic analysis (proposed by default, in Advanced parameters IMGT reference directory set “F + ORF + in-frame P”).
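For users who download a reference directory FASTA file and want to reproduce the "F + ORF" selection outside IMGT/V-QUEST, a filter along the following lines may be sufficient. This is a sketch under stated assumptions: it assumes pipe-delimited headers in which the second field is the allele name and the fourth field is the IMGT functionality, which should be checked against the downloaded file. The records shown are toy placeholders, and the helper names are hypothetical.

```python
from typing import Iterator

# Toy records: accession, allele names, species, and sequences are placeholders (assumption).
TOY_FASTA = """\
>XX000001|IGHV0-1*01|Toy species|F|V-REGION
acgtacgtacgt
>XX000001|IGHV0-2*01|Toy species|ORF|V-REGION
acgtacgtaaaa
>XX000001|IGHV0-3*01|Toy species|P|V-REGION
acgtacgtcccc
"""


def read_fasta(text: str) -> Iterator[tuple]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_alleles(text: str, keep: set) -> dict:
    """Return {allele name: sequence} for records whose functionality is in `keep`."""
    selected = {}
    for header, sequence in read_fasta(text):
        fields = header.split("|")
        allele = fields[1]
        # Strip brackets or parentheses that may surround the functionality code.
        functionality = fields[3].strip("()[] ")
        if functionality in keep:
            selected[allele] = sequence
    return selected


f_orf_set = select_alleles(TOY_FASTA, {"F", "ORF"})
full_set = select_alleles(TOY_FASTA, {"F", "ORF", "P"})
```

Because the downloadable directory is described as containing only in-frame pseudogenes, keeping all three codes corresponds to the default "F + ORF + in-frame P" set, and dropping P corresponds to "F + ORF".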

Altogether, IMGT/V-QUEST reference directory for Atlantic salmon IGHV contains 150 alleles that include 76 F, 15 ORF, and 59 P in-frame (release: 201931-4, 1st August 2019) (Table 5). The 76 F comprise, in addition to the 67 F alleles *01 (28 from locus A and 39 from locus B), 8 alleles *02 and 1 allele *03. The 15 ORF comprise, in addition to the 12 ORF alleles *01 (7 from locus A and 5 from locus B), 3 alleles *02. The 59 in-frame P comprise, in addition to the 54 P alleles *01 (26 from locus A and 28 from locus B), 5 alleles *02. Alleles of closely related duplicated genes are managed in the same Alignments of alleles, as shown, for example, for IGHV1-64*01 F and IGHV1-100*01 F, which have identical V-REGION nucleotide sequences [IMGT Repertoire (IG and TR) > 2. Proteins and alleles > Alignments of alleles > IGHV > Atlantic salmon (S. salar)]1.

Table 5

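Values are numbers of genes, with the number of alleles in parentheses when a gene has more than one allele.

IGHV subgroup | Atlantic salmon Nb of genes | Atlantic salmon Nb of alleles | Atlantic salmon locus A F* | Atlantic salmon locus A ORF* | Atlantic salmon locus A P* | Atlantic salmon locus B F* | Atlantic salmon locus B ORF* | Atlantic salmon locus B P* | Rainbow trout Nb of genes | Rainbow trout Nb of alleles | Rainbow trout locus A F* | Rainbow trout locus A ORF* | Rainbow trout locus A P* | Rainbow trout locus B F* | Rainbow trout locus B ORF* | Rainbow trout locus B P*
IGHV1 | 33 | 38 | 7 | 1(2) | 5(6) | 12(14) | 2 | 6(7) | 17 | 17 | 6 | 0 | 2 | 5 | 2 | 2
IGHV2 | 5 | 5 | 2 | 0 | 0 | 3 | 0 | 0 | 8 | 8 | 2 | 0 | 1 | 3 | 0 | 2
IGHV3 | 3 | 4 | 1(2) | 0 | 0 | 1 | 0 | 1 | 2 | 2 | 1 | 0 | 0 | 0 | 1 | 0
IGHV4 | 14 | 14 | 2 | 2 | 4 | 3 | 1 | 2 | 9 | 9 | 0 | 0 | 1 | 4 | 0 | 4
IGHV5 | 2 | 3 | 0 | 2(3) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
IGHV6 | 14 | 16 | 1(2) | 1 | 3 | 6 | 0 | 3(4) | 8 | 8 | 2 | 1 | 0 | 3 | 0 | 2
IGHV7 | 3 | 3 | 1 | 0 | 1 | 0 | 0 | 1 | 4 | 4 | 0 | 0 | 0 | 0 | 4 | 0
IGHV8 | 16 | 20 | 11(14) | 1(2) | 1 | 3 | 0 | 0 | 6 | 6 | 4 | 1 | 1 | 0 | 0 | 0
IGHV9 | 8 | 8 | 1 | 0 | 1 | 3 | 0 | 3 | 5 | 5 | 3 | 0 | 1 | 1 | 0 | 0
IGHV10 | 9 | 9 | 0 | 0 | 5 | 0 | 1 | 3 | 4 | 4 | 1 | 0 | 0 | 3 | 0 | 0
IGHV11 | 3 | 4 | 2 | 0 | 1(2) | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0
IGHV12 | 2 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 4 | 4 | 0 | 0 | 0 | 4 | 0 | 0
IGHV13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
IGHV14 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0
IGHV15 | 6 | 7 | 1 | 0 | 1 | 2 | 0 | 2(3) | 3 | 3 | 0 | 0 | 0 | 1 | 0 | 2
IGHV16 | 16 | 16 | 1 | 0 | 3 | 5 | 0 | 7 | 5 | 5 | 0 | 2 | 2 | 0 | 0 | 1
Total | 135 | 150 | 30(35) | 7(10) | 26(28) | 39(41) | 5 | 28(31) | 77 | 77 | 20 | 4 | 9 | 24 | 7 | 13

Locus A + Locus B, Atlantic salmon: 135 genes, 150 alleles, 69(76) F + 12(15) ORF + 54(59) P.
Locus A + Locus B, rainbow trout: 77 genes, 77 alleles, 44 F + 11 ORF + 22 P.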

Atlantic salmon (S. salar) and rainbow trout (O. mykiss) IGHV alleles included in the IMGT/V-QUEST reference directory sets (release 201931-4, 1st August 2019).

* F, functional; ORF, open reading frame; P, pseudogene. Number of genes included in the IMGT reference directory, per subgroup and per functionality and total, are shown with, if relevant (more than one allele per gene), the corresponding number of alleles within brackets. The functionality is according to IMGT functionality (IMGT Scientific chart > IMGT functionality)1 (3).

IMGT/V-QUEST reference directory for rainbow trout IGHV contains sequences of 77 alleles that include 44 F, 11 ORF, and 22 P in-frame (/download/V-QUEST/IMGT_V-QUEST_reference_directory/Oncorhynchus_mykiss/IG/IGHV.fasta)1. All the alleles are *01 (release: 201931-4, 1st August 2019) (Table 5). All IGHV genes and alleles (including the P out-of-frame) are reported in IMGT Gene tables [IMGT Repertoire (IG and TR) > 1. Locus and genes > Gene tables > IGHV > Rainbow trout (O. mykiss)]1.

We then investigated the functionality and expression level of IGHV genes from the two species using the standardized nomenclature based on genomic annotation. To do so, adaptive immune receptor repertoire datasets generated by high-throughput sequencing (AIRRseq) were submitted to IMGT/HighV-QUEST analysis.

Atlantic Salmon

AIRRseq data from head kidney of Atlantic salmon were generated based on 5′RACE and specific primers for IGHM constant region [data from reference (32)]. Using the Atlantic salmon reference dataset updated in 2019, a total of 50 IGHV genes (42 functional “F,” 4 “ORF” and 4 pseudogenes “P”) were expressed in the dataset (Figure 4A). More than 80% of submitted sequences presented IGHV F genes. Interestingly, the majority of expressed V genes were from locus B (chromosome 3). This difference was reflected in the abundance of rearrangements (~66% from locus B) and in the diversity of IGHV genes expressed: 25 IGHV from locus B vs. 17 IGHV from locus A (Figure 4A). On average, IGHV1D-25*01, IGHV6D-18*01, IGHV6D-16*01, and IGHV1-73*01 were the most abundant IGHV functional genes, accounting for 30% of the expressed repertoire.

Figure 4

Rainbow Trout

In this species, we analyzed AIRRseq datasets from fish intraperitoneally immunized with a killed bacterial pathogen, Yersinia ruckeri [data from reference (33)]. 5′RACE PCR products were produced from spleen of immunized fish, using specific primers for IGHM constant region and with unique molecular identifiers (UIDs) for better data normalization (33). Only in-frame productive rearrangements (CDR3-IMGT without stop codons) were analyzed. Trout used in this study belonged to the isogenic line derived from Swanson strain that was selected for the rainbow trout genome project. Hence, these AIRRseq data express IGH genes from the very same repertoire, which was annotated in the current IMGT reference directories. These data therefore provided a quantitative assessment of the expression of IGHV genes in the spleen of three genetically similar individuals responding to a pathogen.

In this dataset, IMGT/HighV-QUEST unambiguously identified the IGHV gene in 94% of submitted sequences, 90% of them with at least 99% sequence identity (52% with 100% identity). A total of 55 IGHV genes (35 functional “F,” 9 “ORF,” and 7 pseudogenes “P”) were expressed. Interestingly, these rearrangements are from both IGH loci (A and B) in relatively similar proportions.

In each trout sample, about 17% of sequences corresponded to IGHV ORF genes and 1.7–4.7% corresponded to IGHV pseudogenes (most of them IGHV1D-12*01 P or IGHV1-21*01 P) involved in in-frame junction rearrangements. This feature could be detected because we selected the IMGT/HighV-QUEST directory sets “F + ORF + in-frame P,” which also include pseudogenes with an in-frame stop codon in the V region or a defect in the leader or recombination signal (RS) sequences (3). Although IGH transcripts with stop codons are generally rare in mammals, they are typically much more frequent in fish, perhaps because nonsense-mediated mRNA decay (NMD) works differently (28, 32, 34, 35).

Hence, about 80% of submitted rainbow trout sequences presented functional IGHV genes (Figure 4B). IGHV4D-24*01 F, IGHV6D-40*01 F, IGHV1-18*01 F, and IGHV11-25*01 F were the most expressed on average, with limited interindividual variation, as expected from the genetic constitution of the fish analyzed. In this dataset, for about 6% of submitted sequences, IMGT/HighV-QUEST provided two results assigned to distinct duplicated germline IGHV genes with alleles having identical or close sequences (for example, IGHV12D-56*01/IGHV12D-57*01, or IGHV8-30*01/IGHV8-40*01), owing to gene duplication in salmonids.
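Because the standardized names encode the locus (genes of locus B carry the letter D after the subgroup number), locus usage can be derived directly from gene assignments. The sketch below is a minimal illustration under that naming convention; the regular expression, the function names, and the "gene1/gene2" representation of double assignments are assumptions for the example, not part of the IMGT tools.

```python
import re
from collections import Counter

# IGHV<subgroup>[D]-<position>[*<allele>], e.g. IGHV12D-56*01 or IGHV8-30*01.
GENE_RE = re.compile(
    r"^IGHV(?P<subgroup>\d+)(?P<dup>D?)-(?P<position>\d+)(?:\*(?P<allele>\d+))?$"
)


def parse_ighv(name: str) -> dict:
    """Decompose a salmonid IGHV name into subgroup, locus, position, allele."""
    match = GENE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognized IGHV name: {name!r}")
    return {
        "subgroup": int(match["subgroup"]),
        "locus": "B" if match["dup"] else "A",  # "D" marks locus B genes
        "position": int(match["position"]),
        "allele": match["allele"],
    }


def locus_usage(calls: list) -> Counter:
    """Count assignments per locus; calls naming genes of both loci are 'ambiguous'."""
    usage = Counter()
    for call in calls:
        loci = {parse_ighv(name)["locus"] for name in call.split("/")}
        usage[loci.pop() if len(loci) == 1 else "ambiguous"] += 1
    return usage


# Gene names taken from the text; the list itself is only an example input.
calls = [
    "IGHV4D-24*01",
    "IGHV6D-40*01",
    "IGHV1-18*01",
    "IGHV11-25*01",
    "IGHV12D-56*01/IGHV12D-57*01",
    "IGHV8-30*01/IGHV8-40*01",
]
usage = locus_usage(calls)
print(usage)
```

In this example both double assignments stay within one locus, so they can still be counted toward locus B and locus A, respectively, even though the exact gene is not resolved.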

Although the datasets analyzed here for salmon and trout were not selected for direct comparison, the results suggest that these two species (at least the fish strains analyzed here) do not use the two loci in the same way (see above). A rigorous and comprehensive comparison of expressed repertoires between rainbow trout and Atlantic salmon will require a systematic comparison of AIRRseq data from multiple strains.

Genetic Variability of IG Genes in Salmonids

Making available a full annotation and versatile nomenclature also offers the possibility to better integrate new data about variability of IG (or TR) genes. This issue is of particular interest in Salmonids for two main reasons: (1) variations of IG gene sequences may affect the repertoire of specificities targeted by Abs, in turn impacting the quality and efficiency of responses against pathogens, and (2) salmonid IG loci are particularly complex with high numbers of functional genes and pseudogenes located in two regions; therefore, they constitute interesting models to understand mechanisms of short-term evolution of such loci and the potential importance of homogenization vs. diversification of IG sequences.

To get preliminary data about IGHV variation in a salmonid species, we took advantage of the full genome sequencing of 19 isogenic lines of rainbow trout. These lines were produced using a mitogynogenesis-based strategy by Quillet et al. (36, 37). They represent 19 haplotypes randomly picked from the so-called INRA-SY “synthetic” population. This population was created about 35 years ago by a planned random mating (i.e., panmictic) mixture of French, Danish, and American domestic populations, and has been maintained since without any voluntary selection. The 19 isogenic lines analyzed here do not appear to be closely related to the Swanson trout generated at Washington State University using androgenesis, which has been sequenced and constitutes the reference genome (38, 39).

The numbers of indels and SNPs detected within IGHV genes and pseudogenes are indicated in Table 6. Genetic variation between isogenic lines appears overall to be relatively modest at this level. It seems to be more frequent in the locus located on chromosome 13 (67 SNPs and 1 indel for 29 functional genes, 41 SNPs and 3 indels for 20 pseudogenes) than in the locus on chromosome 12 (23 SNPs and no indel for 29 functional genes, and 53 SNPs and 10 indels for 51 pseudogenes). The proportion of silent vs. non-silent (NS) mutations was not significantly different between the two regions (40 NS/67 SNPs for chromosome 13 and 13 NS/23 SNPs for chromosome 12), suggesting that these genes did not evolve under strong positive selection. Indels and SNPs were not significantly more frequent in pseudogenes. Variants were filtered to eliminate all assembly artifacts, but these data will have to be fully validated by resequencing, and the impact of variation on the gene status evaluated. We have indications that several new genes are present in productive and expressed rearrangements. This might be due to the absence of such genes in the genome of the Swanson strain or to gaps in the current reference genome assembly. In this context, it is of interest to evaluate the variability of IGHV gene numbers between the different haplotypes. Future assemblies will allow a more accurate description of the IGH diversity and variability. Incompleteness of the annotated repertoire may constitute a problem for repertoire analysis (for example, when a missing gene is used by a clonotype that is clonally selected in a response). Hence, sequences of genes that are not localized in the current assembly may be added to the IMGT reference directory sets, provided that sufficient evidence is available to demonstrate their existence and expression. These sequences will be given a provisional name (with S) until their location and presence in the germline genomic sequence are validated. If new genes appear that do not belong to any of the IGHV subgroups identified and described in this work, a new subgroup may have to be defined. This is not impossible, but seems unlikely, since we believe that the large set of IGHV sequences analyzed from Atlantic salmon and rainbow trout probably contains at least one representative of all subgroups. Such additions will be validated by the IG, TR, and MH Nomenclature Sub-Committee (IMGT-NC) (6, 7) of the IUIS Nomenclature Committee2,3, following a procedure analogous to the one used, for example, for inferred alleles in human.
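The comparison of non-silent proportions between the two chromosomes can be illustrated with a 2x2 test on the counts quoted above. The text does not state which statistical test was used, so the two-sided Fisher exact test below is an assumption chosen for the sketch, implemented with the standard library only; the function and variable names are illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Counts from the text: non-silent (NS) vs. silent SNPs in functional genes.
chr13_ns, chr13_snp = 40, 67
chr12_ns, chr12_snp = 13, 23

p_value = fisher_exact_two_sided(
    chr13_ns, chr13_snp - chr13_ns, chr12_ns, chr12_snp - chr12_ns
)
ns_fraction_chr13 = chr13_ns / chr13_snp
ns_fraction_chr12 = chr12_ns / chr12_snp
snp_per_gene_chr13 = chr13_snp / 29  # 29 functional genes on chromosome 13
snp_per_gene_chr12 = chr12_snp / 29  # 29 functional genes on chromosome 12

print(f"NS fraction chr13: {ns_fraction_chr13:.2f}, chr12: {ns_fraction_chr12:.2f}")
print(f"SNPs per functional gene chr13: {snp_per_gene_chr13:.2f}, chr12: {snp_per_gene_chr12:.2f}")
print(f"Fisher exact p-value: {p_value:.3f}")
```

With non-silent fractions of roughly 0.60 and 0.57, the test gives no indication of a difference between the two regions, in line with the statement above, while the number of SNPs per functional gene is clearly higher on chromosome 13.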

Table 6

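(A) Chr12, functional genes

| Chrom | Start | Stop | Name | SNPs number (NS) | Indel number |
|---|---|---|---|---|---|
| Chr12 | 81 322 385 | 81 322 680 | IGHV6D-76 | 2(1) | 0 |
| Chr12 | 81 335 727 | 81 336 024 | IGHV1D-73 | 2(2) | 0 |
| Chr12 | 81 339 680 | 81 339 363 | IGHV12D-71 | 1(0) | 0 |
| Chr12 | 81 365 089 | 81 365 411 | IGHV15D-69 | 0 | 0 |
| Chr12 | 81 365 950 | 81 366 255 | IGHV2D-68 | 0 | 0 |
| Chr12 | 81 395 848 | 81 396 168 | IGHV7D-62 | 0 | 0 |
| Chr12 | 81 397 765 | 81 398 073 | IGHV4D-60 | 0 | 0 |
| Chr12 | 81 422 492 | 81 422 803 | IGHV15D-54 | 0 | 0 |
| Chr12 | 81 436 861 | 81 437 166 | IGHV2D-50 | 0 | 0 |
| Chr12 | 81 438 847 | 81 439 169 | IGHV15D-49 | 0 | 0 |
| Chr12 | 81 464 500 | 81 464 820 | IGHV7D-45 | 0 | 0 |
| Chr12 | 81 465 762 | 81 466 082 | IGHV7D-44 | 0 | 0 |
| Chr12 | 81 466 761 | 81 467 069 | IGHV4D-43 | 0 | 0 |
| Chr12 | 81 493 829 | 81 494 124 | IGHV6D-40 | 0 | 0 |
| Chr12 | 81 529 173 | 81 529 478 | IGHV1D-35 | 0 | 0 |
| Chr12 | 81 561 752 | 81 562 063 | IGHV3D-30 | 0 | 0 |
| Chr12 | 81 568 866 | 81 569 171 | IGHV2D-28 | 0 | 0 |
| Chr12 | 81 595 836 | 81 596 138 | IGHV10D-26 | 2(0) | 0 |
| Chr12 | 81 605 108 | 81 605 419 | IGHV4D-24 | 1(0) | 0 |
| Chr12 | 81 618 381 | 81 618 689 | IGHV4D-23 | 2(2) | 0 |
| Chr12 | 81 649 810 | 81 650 108 | IGHV1D-17 | 5(3) | 0 |
| Chr12 | 81 653 372 | 81 653 678 | IGHV1D-16 | 0 | 0 |
| Chr12 | 81 661 001 | 81 661 301 | IGHV1D-15 | 0 | 0 |
| Chr12 | 81 675 466 | 81 675 769 | IGHV1D-12 | 2(2) | 0 |
| Chr12 | 81 696 523 | 81 696 828 | IGHV2D-11 | 1(0) | 0 |
| Chr12 | 81 700 731 | 81 701 033 | IGHV10D-9 | 0 | 0 |
| Chr12 | 81 705 764 | 81 706 066 | IGHV10D-7 | 2(1) | 0 |
| Chr12 | 81 717 198 | 81 717 494 | IGHV6D-5 | 3(2) | 0 |
| Chr12 | 81 737 673 | 81 737 973 | IGHV1D-4 | 0 | 0 |
| Total | | | | 23(13) | 0 |

(A) Chr12, pseudogenes

| Start | Stop | Name | SNPs number | Indel number |
|---|---|---|---|---|
| 81 312 817 | 81 313 128 | IGHV16D-79 | 3 | 0 |
| 81 318 367 | 81 318 661 | IGHV15D-78 | 5 | 1 |
| 81 320 821 | 81 321 111 | IGHV1D-77 | 1 | 0 |
| 81 332 711 | 81 333 058 | IGHV3D-75 | 2 | 2 |
| 81 333 678 | 81 333 986 | IGHV1D-74 | 1 | 0 |
| 81 336 977 | 81 337 276 | IGHV4D-72 | 0 | 0 |
| 81 383 887 | 81 384 159 | IGHV1D-65 | 0 | 0 |
| 81 384 684 | 81 384 968 | IGHV6D-64 | 0 | 0 |
| 81 388 213 | 81 388 533 | IGHV1D-63 | 1 | 1 |
| 81 396 766 | 81 397 086 | IGHV7D-61 | 0 | 0 |
| 81 398 323 | 81 398 601 | IGHV1D-59 | 0 | 0 |
| 81 399 513 | 81 399 812 | IGHV4D-58 | 0 | 0 |
| 81 401 688 | 81 402 032 | IGHV12D-57 | 0 | 0 |
| 81 402 044 | 81 401 727 | IGHV12D-56 | 0 | 0 |
| 81 421 382 | 81 421 689 | IGHV16D-55 | 11 | 0 |
| 81 434 530 | 81 434 810 | IGHV2D-52 | 0 | 0 |
| 81 435 683 | 81 435 988 | IGHV2D-51 | 0 | 0 |
| 81 447 972 | 81 448 244 | IGHV1D-48 | 0 | 0 |
| 81 448 773 | 81 449 068 | IGHV6D-47 | 0 | 0 |
| 81 453 306 | 81 453 632 | IGHV1D-46 | 0 | 0 |
| 81 467 316 | 81 467 588 | IGHV1D-42 | 0 | 0 |
| 81 481 949 | 81 482 241 | IGHV16D-41 | 0 | 0 |
| 81 511 235 | 81 511 555 | IGHV1D-39 | 2 | 0 |
| 81 514 353 | 81 514 683 | IGHV1D-38 | 0 | 0 |
| 81 516 973 | 81 517 272 | IGHV4D-37 | 0 | 0 |
| 81 519 248 | 81 519 565 | IGHV12D-36 | 1 | 1 |
| 81 548 116 | 81 548 425 | IGHV16D-34 | 0 | 0 |
| 81 550 863 | 81 551 152 | IGHV16D-33 | 0 | 0 |
| 81 558 855 | 81 559 137 | IGHV15D-32 | 0 | 0 |
| 81 560 796 | 81 561 107 | IGHV12D-31 | 0 | 0 |
| 81 567 432 | 81 567 712 | IGHV2D-29 | 0 | 0 |
| 81 594 924 | 81 595 202 | IGHV6D-27 | 1 | 0 |
| 81 598 873 | 81 599 174 | IGHV1D-25 | 0 | 0 |
| 81 618 939 | 81 619 217 | IGHV1D-22 | 11 | 2 |
| 81 628 800 | 81 629 081 | IGHV1D-21 | 2 | 0 |
| 81 629 835 | 81 630 118 | IGHV8D-20 | 0 | 0 |
| 81 630 920 | 81 631 246 | IGHV6D-19 | 0 | 0 |
| 81 637 541 | 81 637 838 | IGHV6D-18 | 0 | 0 |
| 81 674 597 | 81 674 901 | IGHV4D-13 | 0 | 0 |
| 81 699 366 | 81 699 678 | IGHV6D-10 | 0 | 0 |
| 81 704 399 | 81 704 711 | IGHV6D-8 | 0 | 0 |
| 81 713 243 | 81 713 531 | IGHV6D-6 | 3 | 1 |
| 81 745 526 | 81 745 784 | IGHV1D-3 | 2 | 0 |
| 81 746 308 | 81 746 602 | IGHV9D-2 | 1 | 0 |
| 81 750 741 | 81 751 044 | IGHV16D-1 | 6 | 2 |
| 81 671 030 | 81 671 329 | IGHV1D-14 | 0 | 0 |
| 81 367 086 | 81 367 437 | IGHV1D-67 | 0 | 0 |
| 81 367 599 | 81 367 922 | IGHV15D-66 | 0 | 0 |
| 81 430 263 | 81 430 605 | IGHV15D-53 | 0 | 0 |
| 81 359 137 | 81 359 442 | IGHV1D-70 | 0 | 0 |
| 81 676 641 | 81 676 923 | IGHV5D-11 | 0 | 0 |
| Total | | | 53 | 10 |

(B) Chr13, functional genes

| Chrom | Start | Stop | Name | SNPs number (NS) | Indel number |
|---|---|---|---|---|---|
| Chr13 | 48 030 797 | 48 031 104 | IGHV10-47 | 2(1) | 0 |
| Chr13 | 48 034 515 | 48 034 814 | IGHV8-46 | 1(0) | 0 |
| Chr13 | 48 054 181 | 48 054 484 | IGHV1-42 | 0 | 0 |
| Chr13 | 48 073 234 | 48 073 536 | IGHV8-40 | 0 | 0 |
| Chr13 | 48 077 115 | 48 077 414 | IGHV1-39 | 1(1) | 0 |
| Chr13 | 48 082 080 | 48 082 391 | IGHV16-37 | 3(3) | 0 |
| Chr13 | 48 093 298 | 48 093 597 | IGHV1-36 | 0 | 0 |
| Chr13 | 48 104 897 | 48 105 217 | IGHV6-35 | 0 | 0 |
| Chr13 | 48 109 683 | 48 109 994 | IGHV14-33 | 0 | 0 |
| Chr13 | 48 122 499 | 48 122 783 | IGHV6-32 | 0 | 0 |
| Chr13 | 48 127 168 | 48 127 466 | IGHV6-31 | 0 | 0 |
| Chr13 | 48 135 329 | 48 135 631 | IGHV8-30 | 0 | 0 |
| Chr13 | 48 145 742 | 48 146 047 | IGHV2-28 | 7(3) | 0 |
| Chr13 | 48 148 981 | 48 149 284 | IGHV11-25 | 0 | 0 |
| Chr13 | 48 164 983 | 48 165 317 | IGHV9-23 | 4(3) | 0 |
| Chr13 | 48 166 898 | 48 167 189 | IGHV4-22 | 9(7) | 0 |
| Chr13 | 48 168 162 | 48 168 465 | IGHV1-21 | 5(3) | 0 |
| Chr13 | 48 174 816 | 48 175 127 | IGHV3-20 | 4(2) | 0 |
| Chr13 | 48 191 970 | 48 192 272 | IGHV8-19 | 6(5) | 0 |
| Chr13 | 48 201 668 | 48 201 973 | IGHV1-18 | 2(0) | 0 |
| Chr13 | 48 222 441 | 48 222 126 | IGHV9-16 | 1(1) | 1 |
| Chr13 | 48 223 844 | 48 223 544 | IGHV9-15 | 4(2) | 0 |
| Chr13 | 48 237 588 | 48 237 899 | IGHV16-14 | 6(6) | 0 |
| Chr13 | 48 243 688 | 48 243 987 | IGHV1-13 | 3(0) | 0 |
| Chr13 | 48 250 487 | 48 250 789 | IGHV1-10 | 0 | 0 |
| Chr13 | 48 257 525 | 48 257 828 | IGHV2-8 | 2(1) | 0 |
| Chr13 | 48 307 972 | 48 307 670 | IGHV8-5 | 7(2) | 0 |
| Chr13 | 48 312 160 | 48 311 865 | IGHV6-4 | 0 | 0 |
| Chr13 | 48 327 417 | 48 327 761 | IGHV1-2 | 0 | 0 |
| Total | | | | 67(40) | 1 |

(B) Chr13, pseudogenes

| Start | Stop | Name | SNPs number | Indel number |
|---|---|---|---|---|
| 48 138 071 | 48 138 427 | IGHV8-29 | 0 | 0 |
| 48 027 352 | 48 027 666 | IGHV15-48 | 7 | 0 |
| 48 046 874 | 48 047 207 | IGHV9-45 | 1 | 0 |
| 48 048 080 | 48 048 362 | IGHV4-44 | 0 | 0 |
| 48 051 342 | 48 051 671 | IGHV1-43 | 2 | 0 |
| 48 068 027 | 48 068 332 | IGHV1-41 | 0 | 0 |
| 48 079 966 | 48 080 277 | IGHV16-38 | 1 | 0 |
| 48 108 554 | 48 108 889 | IGHV9-34 | 0 | 0 |
| 48 146 928 | 48 147 231 | IGHV2-27 | 3 | 0 |
| 48 147 810 | 48 148 106 | IGHV6-26 | 3 | 0 |
| 48 157 816 | 48 158 168 | IGHV9-24 | 5 | 2 |
| 48 211 869 | 48 212 178 | IGHV7-17 | 3 | 0 |
| 48 244 719 | 48 245 034 | IGHV10-12 | 3 | 0 |
| 48 245 867 | 48 246 172 | IGHV8-11 | 2 | 0 |
| 48 254 879 | 48 254 570 | IGHV16-9 | 0 | 0 |
| 48 279 548 | 48 279 841 | IGHV1-7 | 0 | 0 |
| 48 280 387 | 48 280 681 | IGHV4-6 | 4 | 1 |
| 48 316 059 | 48 315 761 | IGHV6-3 | 3 | 0 |
| 48 339 577 | 48 339 885 | IGHV4-1 | 0 | 0 |
| 48 076 216 | 48 076 475 | IGHV13-39 | 4 | 0 |
| Total | | | 41 | 3 |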

Number of SNPs and indels in IGHV genes and pseudogenes across 19 isogenic rainbow trout lines. For functional genes, the number of non-silent (NS) SNPs is given in parentheses.

Conclusion

Genome assemblies are available for both Atlantic salmon and rainbow trout, representing the two main genera of Salmonids (Salmo and Oncorhynchus). More genomic (and transcriptome) data are coming from a number of genomic backgrounds, which will provide a rich source of knowledge about variations of potential antibody repertoires in these species. We therefore revisited the description and annotation of the two IGH loci present in these two species, until now derived from cDNA and BAC clone sequences, applying the IMGT biocuration and nomenclature for Salmonid IGH genes, which will facilitate the analysis of AIRRseq data.

IG or antibody repertoire sequencing has started to develop both in rainbow trout and in Atlantic salmon, reflecting a growing interest in an accurate and comprehensive description of the response against common pathogens and vaccines. As full genome assemblies are now available for several salmonid species (Atlantic salmon, rainbow trout, coho salmon, and chinook salmon), comparative analysis of the IGH locus structure in these closely related tetraploidized species is of great interest. It also appears very important to investigate the level of variation between germline repertoires of IG genes across commercial and wild salmonid stocks. This variation may have significant implications for practical issues in aquaculture and conservation; it will also be of significant interest for the basic comparative immunology community, in particular to address accurately the mechanisms of gene conversion, somatic hypermutation, and memory in these species and during vertebrate evolution.

Statements

Data availability statement

The datasets generated for this study can be found at www.imgt.org; accession numbers can be found within the manuscript. Any other data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

Author contributions

SMa, AK, M-PL, and PB conceived the project and wrote the manuscript. SMa, AK, IS, and PB designed experiments. SMa, AK, SH-S, SA, SMo, DL, RC, IS, OS, JH, BK, M-PL, and PB performed data analysis. SMa, AK, DL, and IS provided resources. All authors contributed to manuscript revision, and read and approved the submitted version.

Acknowledgments

We are thankful to the entire IMGT® team for developing the databases and tools. This work was granted access to the HPC@LR and to the High-Performance Computing (HPC) resources of the Centre Informatique National de l'Enseignement Supérieur (CINES) and to Très Grand Centre de Calcul (TGCC) of the Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA) under the allocation [036029] (2010–2019) made by GENCI (Grand Equipement National de Calcul Intensif). Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2019.02541/full#supplementary-material

Figure S1

Alignment of the D-GENE-UNIT sequences of the IGHD (diversity) genes located upstream of the IGHM (locus A) and IGHMD (locus B) genes of Salmo salar (Salsal) and Oncorhynchus mykiss (Oncmyk) (A) and located upstream of IGHT genes (B). Genes of locus B are identified by the letter D which follows the gene number. Labels are according to the D-GENE prototype (IMGT Scientific chart > 1. Sequence and 3D structure identification and description > IMGT prototypes table > D-GENE)1.

Figure S2

Alignment of the J-REGION amino acid sequences of the IGHJ (joining) genes located upstream of the IGHM or IGHT (locus A) and IGHMD or IGHTD (locus B) genes of Salmo salar (Salsal) and Oncorhynchus mykiss (Oncmyk). Genes of locus B are identified by the letter D which follows the gene number. Labels are according to the J-GENE prototype (IMGT Scientific chart > 1. Sequence and 3D structure identification and description > IMGT prototypes table > J-GENE)1. The highly conserved FDYWGKGTXVT motif is highlighted in pink, and residues that deviate from it are in red.

References

1. Lefranc M-P, Lefranc G. The Immunoglobulin FactsBook. London, UK: Academic Press (2001).
2. Lefranc M-P, Lefranc G. The T Cell Receptor FactsBook. London, UK: Academic Press (2001).
3. Lefranc M-P. Immunoglobulin and T cell receptor genes: IMGT® and the birth and rise of immunoinformatics. Front Immunol. (2014) 5:22. doi: 10.3389/fimmu.2014.00022
4. Lefranc M-P. Nomenclature of the human immunoglobulin genes. Curr Protoc Immunol. (2001) Appendix 1:Appendix 1P.1-A1P37.
5. Lefranc M-P. Nomenclature of the human T cell receptor genes. Curr Protoc Immunol. (2001) Appendix 1:Appendix 1O. 1-A.1O.23.
6. Lefranc M-P. WHO-IUIS Nomenclature Subcommittee for immunoglobulins and T cell receptors report. Immunogenetics. (2007) 59:899–902. doi: 10.1007/s00251-007-0260-4
7. Lefranc M-P. WHO-IUIS nomenclature subcommittee for immunoglobulins and T cell receptors report August 2007, 13th International Congress of Immunology, Rio de Janeiro, Brazil. Dev Comp Immunol. (2008) 32:461–3. doi: 10.1016/j.dci.2007.09.008
8. Matsunaga T, Chen T, Törmänen V. Characterization of a complete immunoglobulin heavy-chain variable region germ-line gene of rainbow trout. Proc Natl Acad Sci USA. (1990) 87:7767–71. doi: 10.1073/pnas.87.19.7767
9. Andersson E, Törmänen V, Matsunaga T. Evolution of a VH gene family in low vertebrates. Int Immunol. (1991) 3:527–33. doi: 10.1093/intimm/3.6.527
10. Lee MA, Bengtén E, Daggfeldt A, Rytting AS, Pilström L. Characterisation of rainbow trout cDNAs encoding a secreted and membrane-bound Ig heavy chain and the genomic intron upstream of the first constant exon. Mol Immunol. (1993) 30:641–8. doi: 10.1016/0161-5890(93)90075-M
11. Andersson E, Matsunaga T. Complete cDNA sequence of a rainbow trout IgM gene and evolution of vertebrate IgM constant domains. Immunogenetics. (1993) 38:243–50. doi: 10.1007/BF00188800
12. Roman T, Charlemagne J. The immunoglobulin repertoire of the rainbow trout (Oncorhynchus mykiss): definition of nine Igh-V families. Immunogenetics. (1994) 40:210–6. doi: 10.1007/BF00167081
13. Roman T, Andersson E, Bengtén E, Hansen J, Kaattari S, Pilström L, et al. Unified nomenclature of Ig VH genes in rainbow trout (Oncorhynchus mykiss): definition of eleven VH families. Immunogenetics. (1996) 43:325–6. doi: 10.1007/s002510050072
14. Brown GD, Kaattari IM, Kaattari SL. Two new Ig VH gene families in Oncorhynchus mykiss. Immunogenetics. (2006) 58:933–6. doi: 10.1007/s00251-006-0149-7
15. Solem ST, Hordvik I, Killie JA, Warr GW, Jørgensen TO. Diversity of the immunoglobulin heavy chain in the Atlantic salmon (Salmo salar L.) is contributed by genes from two parallel IgH isoloci. Dev Comp Immunol. (2001) 25:403–17. doi: 10.1016/S0145-305X(01)00008-8
16. Yasuike M, de Boer J, von Schalburg KR, Cooper GA, McKinnel L, Messmer A, et al. Evolution of duplicated IgH loci in Atlantic salmon, Salmo salar. BMC Genom. (2010) 11:486. doi: 10.1186/1471-2164-11-486
17. Lefranc M-P, Giudicelli V, Duroux P, Jabado-Michaloud J, Folch G, Aouinti S, et al. IMGT®, the international ImMunoGeneTics information system® 25 years on. Nucleic Acids Res. (2015) 43:D413–22. doi: 10.1093/nar/gku1056
18. Li S, Lefranc M-P, Miles JJ, Alamyar E, Giudicelli V, Duroux P, et al. IMGT/HighV QUEST paradigm for T cell receptor IMGT clonotype diversity and next generation repertoire immunoprofiling. Nat Commun. (2013) 4:2333. doi: 10.1038/ncomms3333
19. Aouinti S, Malouche D, Giudicelli V, Kossida S, Lefranc M-P. IMGT/HighV-QUEST statistical significance of IMGT clonotype (AA) diversity per gene for standardized comparisons of next generation sequencing immunoprofiles of immunoglobulins and T cell receptors. PLoS ONE. (2015) 10:e0142353. doi: 10.1371/journal.pone.0142353
20. Aouinti S, Giudicelli V, Duroux P, Malouche D, Kossida S, Lefranc M-P. IMGT/statclonotype for pairwise evaluation and visualization of NGS IG and TR IMGT clonotype (AA) diversity or expression from IMGT/HighV-QUEST. Front Immunol. (2016) 7:339. doi: 10.3389/fimmu.2016.00339
21. Brochet X, Lefranc M-P, Giudicelli V. IMGT/V-QUEST: the highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis. Nucleic Acids Res. (2008) 36:W503–8. doi: 10.1093/nar/gkn316
22. Monod MY, Giudicelli V, Chaume D, Lefranc M-P. IMGT/JunctionAnalysis: the first tool for the analysis of the immunoglobulin and T cell receptor complex V-J and V-D-J JUNCTIONs. Bioinformatics. (2004) 20(Suppl_1):i379–85. doi: 10.1093/bioinformatics/bth945
23. Ehrenmann F, Lefranc M-P. IMGT/DomainGapAlign: the IMGT® tool for the analysis of IG, TR, MH, IgSF, and MhSF domain amino acid polymorphism. Methods Mol Biol. (2012) 882:605–33. doi: 10.1007/978-1-61779-842-9_33
24. Lefranc M-P, Pommié C, Ruiz M, Giudicelli V, Foulquier E, Truong L, et al. IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains. Dev Comp Immunol. (2003) 27:55–77. doi: 10.1016/S0145-305X(02)00039-3
25. Lefranc M-P, Pommié C, Kaas Q, Duprat E, Bosc N, Guiraudou D, et al. IMGT unique numbering for immunoglobulin and T cell receptor constant domains and Ig superfamily C-like domains. Dev Comp Immunol. (2005) 29:185–203. doi: 10.1016/j.dci.2004.07.003
26. Alamyar E, Duroux P, Lefranc M-P, Giudicelli V. IMGT® tools for the nucleotide analysis of immunoglobulin (IG) and T cell receptor (TR) V-(D)-J repertoires, polymorphisms, and IG mutations: IMGT/V-QUEST and IMGT/HighV-QUEST for NGS. Methods Mol Biol. (2012) 882:569–604. doi: 10.1007/978-1-61779-842-9_32
27. Alamyar E, Giudicelli V, Shuo L, Duroux P, Lefranc M-P. IMGT/HighV-QUEST: the IMGT® web portal for immunoglobulin (IG) or antibody and T cell receptor (TR) analysis from NGS high throughput and deep sequencing. Immunome Res. (2012) 8:26.
28. Fillatreau S, Six A, Magadan S, Castro R, Sunyer JO, Boudinot P. The astonishing diversity of Ig classes and B cell repertoires in teleost fish. Front Immunol. (2013) 4:28. doi: 10.3389/fimmu.2013.00028
29. Hansen JD, Landis ED, Phillips RB. Discovery of a unique Ig heavy-chain isotype (IgT) in rainbow trout: implications for a distinctive B cell developmental pathway in teleost fish. Proc Natl Acad Sci USA. (2005) 102:6919–24. doi: 10.1073/pnas.0500027102
30. Zhang Y-A, Salinas I, Li J, Parra D, Bjork S, Xu Z, et al. IgT, a primitive immunoglobulin class specialized in mucosal immunity. Nat Immunol. (2010) 11:827–35. doi: 10.1038/ni.1913
31. Hordvik I. Identification of a novel immunoglobulin delta transcript and comparative analysis of the genes encoding IgD in Atlantic salmon and Atlantic halibut. Mol Immunol. (2002) 39:85–91. doi: 10.1016/S0161-5890(02)00043-3
32. Krasnov A, Jørgensen SM, Afanasyev S. Ig-seq: deep sequencing of the variable region of Atlantic salmon IgM heavy chain transcripts. Mol Immunol. (2017) 88:99–105. doi: 10.1016/j.molimm.2017.06.022
33. Magadan S, Jouneau L, Boudinot P, Salinas I. Nasal vaccination drives modifications of nasal and systemic antibody repertoires in rainbow trout. J Immunol. (2019) 203:1480–92. doi: 10.4049/jimmunol.1900157
34. Castro R, Jouneau L, Pham H-P, Bouchez O, Giudicelli V, Lefranc M-P, et al. Teleost fish mount complex clonal IgM and IgT responses in spleen upon systemic viral infection. PLoS Pathog. (2013) 9:e1003098. doi: 10.1371/journal.ppat.1003098
35. Magadán-Mompó S, Sánchez-Espinel C, Gambón-Deza F. Immunoglobulin heavy chains in medaka (Oryzias latipes). BMC Evol Biol. (2011) 11:165. doi: 10.1186/1471-2148-11-165
36. Quillet E, Dorson M, Leguillou S, Benmansour A, Boudinot P. Wide range of susceptibility to rhabdoviruses in homozygous clones of rainbow trout. Fish Shellfish Immunol. (2007) 22:510–9. doi: 10.1016/j.fsi.2006.07.002
37. Diter A, Quillet E, Chourrout D. Suppression of first egg mitosis induced by heat shocks in the rainbow trout. J Fish Biol. (1993) 42:777–86. doi: 10.1111/j.1095-8649.1993.tb00383.x
38. Palti Y, Genet C, Luo M-C, Charlet A, Gao G, Hu Y, et al. A first generation integrated map of the rainbow trout genome. BMC Genom. (2011) 12:180. doi: 10.1186/1471-2164-12-180
39. Palti Y, Gao G, Miller MR, Vallejo RL, Wheeler PA, Quillet E, et al. A resource of single-nucleotide polymorphisms for rainbow trout generated by restriction-site associated DNA sequencing of doubled haploids. Mol Ecol Resour. (2014) 14:588–96. doi: 10.1111/1755-0998.12204

Summary

Keywords

immunoglobulin, antibody repertoire, salmonid fish, VDJ annotation, comparative immunology

Citation

Magadan S, Krasnov A, Hadi-Saljoqi S, Afanasyev S, Mondot S, Lallias D, Castro R, Salinas I, Sunyer O, Hansen J, Koop BF, Lefranc M-P and Boudinot P (2019) Standardized IMGT® Nomenclature of Salmonidae IGH Genes, the Paradigm of Atlantic Salmon and Rainbow Trout: From Genomics to Repertoires. Front. Immunol. 10:2541. doi: 10.3389/fimmu.2019.02541

Received

30 August 2019

Accepted

14 October 2019

Published

12 November 2019

Volume

10 - 2019

Edited by

Andrew M. Collins, University of New South Wales, Australia

Reviewed by

Peter Daniel Burrows, University of Alabama at Birmingham, United States; Michael Zemlin, Saarland University Hospital, Germany

Copyright

*Correspondence: Susana Magadan

This article was submitted to B Cell Biology, a section of the journal Frontiers in Immunology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
