<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="review-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Immunol.</journal-id>
<journal-title>Frontiers in Immunology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Immunol.</abbrev-journal-title>
<issn pub-type="epub">1664-3224</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fimmu.2022.858057</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Immunology</subject>
<subj-group>
<subject>Review</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Machine Learning Approaches to TCR Repertoire Analysis</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Katayama</surname>
<given-names>Yotaro</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1521230"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yokota</surname>
<given-names>Ryo</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/434776"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Akiyama</surname>
<given-names>Taishin</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/798801"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kobayashi</surname>
<given-names>Tetsuya J.</given-names>
</name>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/30726"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Graduate School of Engineering, The University of Tokyo</institution>, <addr-line>Tokyo</addr-line>, <country>Japan</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>National Research Institute of Police Science</institution>, <addr-line>Kashiwa, Chiba</addr-line>, <country>Japan</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Laboratory for Immune Homeostasis, RIKEN Center for Integrative Medical Sciences</institution>, <addr-line>Yokohama</addr-line>, <country>Japan</country>
</aff>
<aff id="aff4">
<sup>4</sup>
<institution>Graduate School of Medical Life Science, Yokohama City University</institution>, <addr-line>Yokohama</addr-line>, <country>Japan</country>
</aff>
<aff id="aff5">
<sup>5</sup>
<institution>Institute of Industrial Science, The University of Tokyo</institution>, <addr-line>Tokyo</addr-line>, <country>Japan</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Loretta Tuosto, Sapienza University of Rome, Italy</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Gur Yaari, Bar-Ilan University, Israel; Pirooz Zareie, Monash University, Australia; Steven M. Abel, The University of Tennessee, Knoxville, United States; David Olivieri, University of Vigo, Spain; Victor Greiff, University of Oslo, Norway</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Yotaro Katayama, <email xlink:href="mailto:yotaro.katayama@gmail.com">yotaro.katayama@gmail.com</email>
</p>
</fn>
<fn fn-type="other" id="fn002">
<p>This article was submitted to T Cell Biology, a section of the journal Frontiers in Immunology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>15</day>
<month>07</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>858057</elocation-id>
<history>
<date date-type="received">
<day>19</day>
<month>01</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>07</day>
<month>06</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Katayama, Yokota, Akiyama and Kobayashi</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Katayama, Yokota, Akiyama and Kobayashi</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Sparked by the development of genome sequencing technology, the quantity and quality of data handled in immunological research have been changing dramatically. Various data and database platforms are now driving the rapid progress of machine learning for immunological data analysis. Of various topics in immunology, T cell receptor repertoire analysis is one of the most important targets of machine learning for assessing the state and abnormalities of immune systems. In this paper, we review recent repertoire analysis methods based on machine learning and deep learning and discuss their prospects.</p>
</abstract>
<kwd-group>
<kwd>machine learning</kwd>
<kwd>deep learning</kwd>
<kwd>T cell</kwd>
<kwd>T cell receptor</kwd>
<kwd>immunoinformatics</kwd>
</kwd-group>
<contract-num rid="cn001">19H05799, 20H03441</contract-num>
<contract-num rid="cn002">JPMJCR2011</contract-num>
<contract-sponsor id="cn001">Japan Society for the Promotion of Science<named-content content-type="fundref-id">10.13039/501100001691</named-content>
</contract-sponsor>
<contract-sponsor id="cn002">Core Research for Evolutional Science and Technology<named-content content-type="fundref-id">10.13039/501100003382</named-content>
</contract-sponsor>
<counts>
<fig-count count="3"/>
<table-count count="1"/>
<equation-count count="0"/>
<ref-count count="184"/>
<page-count count="20"/>
<word-count count="11633"/>
</counts>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<title>Introduction</title>
<p>Our bodies are constantly exposed to threats from various pathogenic bacteria, viruses, and cancer cells. The immune system is central to maintaining our body in a healthy state by detecting and evicting those pathogens. Among the different types of immune cells, T cells play various roles in the recognition, memory, and eviction of such threats (<xref ref-type="bibr" rid="B1">1</xref>). The peptides derived from those pathogens provide information to T cells when they are presented on the major histocompatibility complex (MHC) as antigens. T cells recognize an antigen if their T cell receptors (TCRs) can bind to the antigen-MHC complex. As antigens are diverse and MHC genes are highly polymorphic, TCRs also must be diverse to recognize a wide range of antigens. TCR diversity is generated by V(D)J recombination (<xref ref-type="bibr" rid="B2">2</xref>), one of the somatic recombination processes in our body. This process can potentially yield more than 10<sup>13</sup> patterns of TCR (<xref ref-type="bibr" rid="B3">3</xref>). This diversity of TCRs ensures that, even if unknown antigens enter the body, there will be T cells with TCRs that can recognize them with a high probability. Furthermore, the recognition of such antigens by T cells, i.e., the binding of antigens to their TCRs, activates the T cells, inducing their proliferation and/or phenotypic changes (<xref ref-type="bibr" rid="B1">1</xref>). These dynamics alter the diversity of TCRs (TCR repertoire) in a T cell population and modulate its collective recognition of antigens. Therefore, quantitative evaluation of the TCR repertoire in individuals enables us to capture the individual&#x2019;s past and present immunological status. It may also be possible to predict its future. 
Specifically, quantitative measurement of TCR repertoires may contribute to the quantification of abnormalities in the immune status of patients with specific diseases, the identification of their causes, and the prediction of the risk of developing immune-related diseases in the future. For example, a diagnostic test for a type of leukemia has already been approved by the U.S. Food and Drug Administration (FDA) and is in clinical use. Quantitative measurement of the TCR repertoire is performed by sequencing the recombinant genes encoding the TCRs of T cells in blood or other specimens. Since the mainstream of DNA sequencing technology shifted from the low-throughput Sanger method to high-throughput next-generation sequencers (NGSs), the cost and time required for sequencing TCR repertoires have been dramatically reduced, which makes it feasible to exploit TCR repertoires in practical applications. The recent advent of new techniques such as single-cell sequencing further provides ways to characterize different aspects of T cell repertoires (<xref ref-type="bibr" rid="B4">4</xref>).</p>
<p>In parallel with the development of TCR repertoire sequencing technology, bioinformatic and machine learning (ML) based data analysis, including deep learning (DL), is pervading the field of immunology. As we will see in more detail later, this is because a typical sample of repertoire data from a single person consists of several hundred thousand sequences, and ML is an effective tool for extracting information from such a large amount of data. ML is already indispensable to repertoire sequencing analysis. It has also enabled new applications based on repertoire sequencing, such as the design of personalized cancer vaccines (neoantigen vaccines) (<xref ref-type="bibr" rid="B5">5</xref>) and new testing methods for infections such as COVID-19 (<xref ref-type="bibr" rid="B6">6</xref>). ML-based analysis methods assist not only clinical applications but also basic research. The impact of ML on repertoire sequencing is growing rapidly.</p>
<p>In this paper, we will outline the rapidly developing TCR repertoire analysis methods based on ML with useful tools and databases. We also discuss possible directions for further development of TCR repertoire analysis.</p>
<sec id="s1_1">
<title>Diversification of TCR</title>
<p>T cell progenitors are generated from hematopoietic stem cells in the bone marrow and undergo differentiation for maturation in the thymus before being exported to the periphery (<xref ref-type="bibr" rid="B1">1</xref>). A TCR is a heterodimer of <italic>&#x3b1;</italic>- and <italic>&#x3b2;</italic>-chains (certain TCRs consist of <italic>&#x3b3;</italic>- and <italic>&#x3b4;</italic>-chains, but these are omitted here for simplicity). In V(D)J recombination, one gene is selected from each of the V, D, and J gene groups of the pre-recombinant genes of each chain (in the <italic>&#x3b1;</italic>-chain, the D gene group does not exist), and the selected genes are combined with random insertions and deletions. Because of the randomness in the gene selection, insertions, and deletions, a variety of TCRs are generated. For example, in the case of the human <italic>&#x3b2;</italic>-chain, there are 64-67 V genes, 2 D genes, and 14 J genes according to the IMGT database<xref ref-type="fn" rid="fn1"><sup>1</sup></xref>. It should be noted that there are two loci for each TCR chain as humans are diploid. 30% of T cells (dual TCR) have two different productive TCR <italic>&#x3b1;</italic>-chain mRNAs despite the allelic exclusion mechanisms (<xref ref-type="bibr" rid="B7">7</xref>).</p>
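<p>The recombination step described above can be illustrated with a minimal sketch. The gene segments below are short hypothetical toy sequences rather than germline sequences from IMGT, and the caps on deletions and insertions are arbitrary assumptions; the sketch only shows how random gene choice, trimming, and insertion together generate many distinct sequences from a few segments.</p>
<preformat><![CDATA[
```python
import random

# Hypothetical toy gene segments; real germline sequences come from IMGT.
V_GENES = {"V1": "TGTGCCAGCAGT", "V2": "TGTGCCAGCTCA"}
D_GENES = {"D1": "GGGACAGGG", "D2": "GGGACTAGC"}
J_GENES = {"J1": "AATGAGCAGTTC", "J2": "TACGAGCAGTAC"}
MAX_DELETION = 3   # assumed cap on nucleotides trimmed at each junction
MAX_INSERTION = 4  # assumed cap on random nucleotides added at each junction


def trim(seq, rng, left=False, right=False):
    """Delete a random number of nucleotides from the chosen end(s)."""
    if left:
        seq = seq[rng.randint(0, MAX_DELETION):]
    if right:
        seq = seq[:len(seq) - rng.randint(0, MAX_DELETION)]
    return seq


def random_insertion(rng):
    """Draw a short random nucleotide stretch inserted at a junction."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, MAX_INSERTION)))


def recombine_beta(rng):
    """Return (v_name, d_name, j_name, sequence) for one toy beta-chain rearrangement."""
    v, d, j = (rng.choice(sorted(g)) for g in (V_GENES, D_GENES, J_GENES))
    seq = (
        trim(V_GENES[v], rng, right=True)
        + random_insertion(rng)
        + trim(D_GENES[d], rng, left=True, right=True)
        + random_insertion(rng)
        + trim(J_GENES[j], rng, left=True)
    )
    return v, d, j, seq


rng = random.Random(0)
rearrangements = [recombine_beta(rng) for _ in range(1000)]
distinct = {seq for _, _, _, seq in rearrangements}
print(len(distinct), "distinct sequences from", len(rearrangements), "rearrangements")
```
]]></preformat>
<p>Even with only two toy genes per group, the number of distinct sequences far exceeds the eight possible gene combinations, because the junctional insertions and deletions dominate the diversity.</p>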
<p>A TCR recognizes antigens presented on the MHC. Antigens are digested into short peptides and presented on the MHC to form peptide-loaded MHC (pMHC) complexes (<xref ref-type="bibr" rid="B8">8</xref>). The affinity of a TCR to a pMHC complex is mainly determined by the recombination-dependent, highly variable regions called the complementarity determining regions (CDRs) (<xref ref-type="bibr" rid="B9">9</xref>). In the sequence of recombinant TCR genes, three CDRs exist, from CDR1 to CDR3. CDR1 and CDR2 engage in binding to the MHC complex presenting an antigen, whereas CDR3 contributes to the binding affinity of the TCR to the antigen itself. Thus, the sequence of CDR3 plays a particularly important role in analyzing repertoires. Many of the studies introduced here also focus on CDR3. After recombination, T cells in the thymus undergo positive and negative selection based on their interactions with self-antigens presented by other cells such as thymic epithelial cells (<xref ref-type="bibr" rid="B10">10</xref>). In positive selection, T cells with TCRs that have a moderate affinity to some self-antigen-MHC complexes are selected to survive. This process also selects T cells such that they recognize the antigen only if it is presented on the MHC (<xref ref-type="bibr" rid="B11">11</xref>). This phenomenon is called MHC restriction (<xref ref-type="bibr" rid="B12">12</xref>). Note that TCRs are &#x201c;personalized&#x201d; in this process as the MHC genes are highly polymorphic. The impact of genetic background, including MHC polymorphisms, on repertoire dynamics will be revisited in the next section. Because of cross-reactivity, the selected T cells may also recognize some non-self-antigen-MHC complexes (<xref ref-type="bibr" rid="B13">13</xref>). In contrast, T cells with TCRs that have a high affinity to any self-antigen-MHC complex are eliminated in negative selection.
This process decreases the number of self-reactive T cells by 60-70% (<xref ref-type="bibr" rid="B14">14</xref>). The remaining self-reactive T cells are suppressed by peripheral tolerance (<xref ref-type="bibr" rid="B14">14</xref>). By combining these mechanisms, TCRs that can recognize non-self-antigens but do not recognize self-antigens are selected and exported to the periphery. Then, T cells are induced to differentiate and proliferate depending on the antigens encountered in the periphery. From such peripheral T cell population dynamics, an appropriate repertoire is shaped and maintained so that it attains the ability to remember and rapidly respond to experienced antigens while retaining the diversity to respond to unknown ones (<xref ref-type="bibr" rid="B15">15</xref>).</p>
</sec>
<sec id="s1_2">
<title>Influencing Factors of TCR Repertoire</title>
<p>Various factors affect the formation of a TCR repertoire. As we described earlier, peripheral antigen exposure changes the TCR repertoire. We review other potential factors in this section.</p>
<p>First, the genetic background can affect diverse aspects of repertoire dynamics. As we mentioned for positive selection in the thymus, TCRs are selected to have MHC restriction. Therefore, the MHC type can influence the formation of the repertoire. For example, associations between specific HLA (human MHC) types and specific sequences have been observed (<xref ref-type="bibr" rid="B16">16</xref>). Furthermore, gene usage in V(D)J recombination might be affected by MHC (<xref ref-type="bibr" rid="B17">17</xref>). In addition, some HLA variants are associated with the onset of autoimmune diseases (<xref ref-type="bibr" rid="B18">18</xref>). These results contrast with those of immunoglobulin, for which the V(D)J recombination process before selection is found to be highly different even between monozygotic (MZ) twins (<xref ref-type="bibr" rid="B19">19</xref>). Moreover, as HLA genes are highly polymorphic (<xref ref-type="bibr" rid="B12">12</xref>), TCRs that bind to the same peptide can differ between people. Therefore, we cannot easily assume that T cells with the same TCR recognize the same antigens in different individuals when genetic information such as HLA is not the same.</p>
<p>Not only MHC genes but also the V(D)J genes themselves have polymorphisms (<xref ref-type="bibr" rid="B20">20</xref>, <xref ref-type="bibr" rid="B21">21</xref>). Some of those variants have been shown to affect the affinity of the TCR-pMHC complex (<xref ref-type="bibr" rid="B22">22</xref>), which may result in different repertoire dynamics. We do not fully understand the effect of these variants on the repertoire, as many variants remain to be discovered (<xref ref-type="bibr" rid="B23">23</xref>). Furthermore, genetic background is not the only dominant factor in the final peripheral repertoire dynamics. For example, a study in MZ twins revealed that peripheral repertoires of MZ twins are almost as different as those of unrelated individuals in terms of shared TCRs (<xref ref-type="bibr" rid="B24">24</xref>). On the other hand, those of the same person are very similar even after years (<xref ref-type="bibr" rid="B24">24</xref>). This might be caused by the fact that the probability of generating the same TCR in different individuals is very low even if the MHC alleles are the same.</p>
<p>Second, aging also affects repertoire dynamics greatly. Age-related changes in the immune system are collectively called &#x201c;immunosenescence&#x201d; (<xref ref-type="bibr" rid="B25">25</xref>, <xref ref-type="bibr" rid="B26">26</xref>). In the context of TCR repertoire analysis, immunosenescence often refers to the decrease in the proportion of na&#xef;ve T cells and the increase in that of memory T cells undergoing persistent selection, for example, memory T cells recognizing antigens behind chronic viral infections such as cytomegalovirus (CMV) (<xref ref-type="bibr" rid="B27">27</xref>). This phenomenon impairs the diversity of the TCR repertoire. One of the main causes of this change is the decrease in the thymic output of na&#xef;ve T cells due to age-related thymic involution (<xref ref-type="bibr" rid="B28">28</xref>).</p>
</sec>
<sec id="s1_3">
<title>Repertoire Sequencing and Batch Effects</title>
<p>We can quantify TCR repertoires through repertoire sequencing (AIRR-seq) using NGSs. A typical repertoire sequencing procedure is summarized in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1A</bold>
</xref>. Samples such as peripheral blood mononuclear cells (PBMC) are collected, and their CDR regions are amplified by polymerase chain reaction (PCR). Then NGSs are used to read the amplified sequences. As CDR3 is the most diversified region in the TCR gene, many protocols have been developed for CDR3 sequencing (<xref ref-type="bibr" rid="B29">29</xref>, <xref ref-type="bibr" rid="B30">30</xref>).</p>
<fig id="f1" position="float">
<label>Figure&#xa0;1</label>
<caption>
<p>
<bold>(A)</bold> Schematic illustration of the pipeline in TCR repertoire analysis. 1) First, T cells in samples, typically collected from peripheral blood, are processed to extract the DNA or RNA encoding their TCRs. 2,3) PCR is conducted to amplify the signal. 4,5) Then, the amplified DNA or cDNA is sequenced by NGS to obtain TCR sequences. 6,7) Finally, these sequences are mapped to the reference genes by the software pipeline introduced in the main text and analyzed further. <bold>(B)</bold> A typical experimental flow for applying ML methods to repertoire datasets. 1-3) Samples are collected from multiple groups of donors who have different immunological and physiological conditions. 4,5) By the pipeline illustrated in <bold>(A)</bold>, a dataset is obtained for each sample, typically in the format of a table or matrix. 6) Datasets are encoded into ML-friendly formats (feature vectors) using feature extraction methods. In bioinformatics, it is common to analyze gene expression matrices, which summarize the expression level of each gene for each sample. In repertoire analysis, for each sample, we have a matrix, each row of which represents the sequence of one TCR, its observation count, its gene usage, and other properties of the TCR. Note that typically 10<sup>4</sup> to 10<sup>5</sup> different sequences are observed per sample and that only a limited number of overlapping sequences are usually detected among samples. Therefore, a relatively large sparse matrix must be handled for repertoire analysis. 7) ML algorithms are applied to the encoded datasets.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fimmu-13-858057-g001.tif"/>
</fig>
<p>Repertoire sequencing is one of the most actively developed sequencing technologies. In addition to the conventional procedure described above, single-cell repertoire sequencing has also been developed in recent years (<xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B31">31</xref>). Using such protocols, for example, the pairing of the TCR<italic>&#x3b1;</italic> and TCR<italic>&#x3b2;</italic> chains can be measured (<xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B31">31</xref>). Furthermore, dual TCRs can be investigated (<xref ref-type="bibr" rid="B32">32</xref>&#x2013;<xref ref-type="bibr" rid="B36">36</xref>). As this review is primarily dedicated to repertoire analysis methods, we focus mainly on the potential biases in the conventional sequencing methods, which may skew the results of the ML methods.</p>
<p>First, PCR introduces various biases originating from the amplification. The sequence composition influences the amplification ratio of PCR. The use of multiple primers is another source of bias. Multiple primers are commonly used in repertoire sequencing (<xref ref-type="bibr" rid="B37">37</xref>) because the edges of the CDR3 region vary depending on the choice of V (and J) genes. These primers are designed for known V (and J) genes. As a result, CDR3 sequences composed of unknown V (and J) alleles may not be amplified (<xref ref-type="bibr" rid="B30">30</xref>). In addition, multiplex PCR is itself subject to amplification bias (<xref ref-type="bibr" rid="B30">30</xref>). Such quantitative bias affects a variety of the ML methods introduced later. For example, diversity-based methods, which belong to the observation frequency-based methods, can be directly skewed. Various methods for discovering proliferated clonotypes are also affected.</p>
<p>Second, PCR and NGS introduce errors in the TCR sequences (<xref ref-type="bibr" rid="B29">29</xref>, <xref ref-type="bibr" rid="B38">38</xref>, <xref ref-type="bibr" rid="B39">39</xref>). It is estimated that about 2% of the PCR amplicons contain some sequencing errors in TCR sequencing (<xref ref-type="bibr" rid="B40">40</xref>), and 1-6% of sequences yielded by NGSs (Illumina) are erroneous. Erroneous sequences lead to false-positive clusters and skew the diversity in observation frequency-based methods. In contrast, dissimilarity-based methods aggregate similar sequences into a cluster. Therefore, they can be less affected by such errors.</p>
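<p>To illustrate why grouping similar sequences can absorb erroneous reads, the following minimal sketch folds rare sequences into a much more abundant sequence that differs by at most one nucleotide. This is a generic heuristic written for illustration, not the algorithm of any specific tool; the function name, the distance threshold, the abundance ratio, and the read counts are all assumptions.</p>
<preformat><![CDATA[
```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def merge_probable_errors(counts, max_dist=1, min_ratio=10):
    """Fold rare sequences into a much more abundant near-identical sequence.

    max_dist and min_ratio are illustrative assumptions, not published defaults.
    """
    merged = Counter()
    parents = []  # sequences kept so far, visited from most to least abundant
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        parent = next(
            (p for p in parents
             if len(p) == len(seq)
             and hamming(p, seq) <= max_dist
             and counts[p] >= min_ratio * n),
            None,
        )
        if parent is None:
            parents.append(seq)
            merged[seq] += n
        else:
            merged[parent] += n
    return merged


# Hypothetical read counts for nucleotide CDR3 sequences.
raw = Counter({
    "TGTGCCAGCAGTTTC": 950,
    "TGTGCCAGCAGTTTA": 12,   # one mismatch from the dominant sequence
    "TGTGCCAGCTCATTC": 400,
    "TGTGCCAGCAGATTC": 120,  # one mismatch, but too abundant to merge at ratio 10
})
corrected = merge_probable_errors(raw)
print(corrected)
```
]]></preformat>
<p>The abundance ratio matters: without it, genuinely distinct clonotypes that happen to differ by one nucleotide would be merged, which would in turn bias diversity estimates downward.</p>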
<p>Third, the starting material matters. We can employ either DNA or RNA of TCRs. In general, DNA-based methods are considered to be more quantitative than RNA-based ones, as the number of RNA copies fluctuates among cells (<xref ref-type="bibr" rid="B30">30</xref>). However, a recent systematic review (<xref ref-type="bibr" rid="B41">41</xref>) suggests that the starting material may not always be the determinant of correctness or sensitivity. Moreover, RNA-based methods have some qualitative advantages. For example, some RNA-based protocols can capture the full-length TCR sequence, which contains CDR1, CDR2, and CDR3 (<xref ref-type="bibr" rid="B30">30</xref>).</p>
<p>Many protocols have already been proposed to reduce such biases and errors. Certainly, their magnitudes can differ by protocol (<xref ref-type="bibr" rid="B29">29</xref>, <xref ref-type="bibr" rid="B38">38</xref>). However, each method has both advantages and disadvantages. Before applying ML methods to any data, we need to consider the protocol used to derive the data and be aware of the biases it introduces. For details of each protocol, we refer the reader to the following reviews (<xref ref-type="bibr" rid="B30">30</xref>, <xref ref-type="bibr" rid="B31">31</xref>, <xref ref-type="bibr" rid="B37">37</xref>).</p>
<p>Moreover, repertoire sequencing is affected by various batch effects. As we reviewed in this section, the choice of experimental protocol affects the result. Even if the protocol is the same, various conditions, such as different batches and different facilities where samples are collected, can be distinguished by ML algorithms (<xref ref-type="bibr" rid="B42">42</xref>, <xref ref-type="bibr" rid="B43">43</xref>). These batch effects can be problematic in applying machine learning because of shortcut learning (<xref ref-type="bibr" rid="B44">44</xref>). We here adopt a famous example from medical image processing to explain the concept of shortcut learning intuitively. In the task of detecting pneumonia from X-ray images, the performance of ML models is known to drop when they are tested on datasets from other hospitals. It was revealed that the ML models seemed to distinguish the hospital at which an image was taken (<xref ref-type="bibr" rid="B45">45</xref>). As every hospital has a different pneumonia prevalence rate, a model that outputs positive whenever a sample seems to have been taken at a hospital with a high prevalence rate can achieve a decent performance score. However, if an image was not taken at one of the known hospitals, the model cannot answer correctly. In this situation, the hospital classification task was easier and was thereby used as a &#x201c;shortcut&#x201d; for the pneumonia detection task. As ML can distinguish various experimental conditions because of the batch effects, shortcut learning can also happen in ML-based repertoire analysis. There are attempts to remove the batch effects in repertoire sequencing. Some of the errors and biases can be corrected by bioinformatic post-processing (<xref ref-type="bibr" rid="B29">29</xref>, <xref ref-type="bibr" rid="B38">38</xref>, <xref ref-type="bibr" rid="B40">40</xref>, <xref ref-type="bibr" rid="B46">46</xref>).
Such algorithms are implemented in popular software such as MiXCR (<xref ref-type="bibr" rid="B47">47</xref>). They are successful in reducing errors and biases (<xref ref-type="bibr" rid="B40">40</xref>). However, we have to be aware that the batch effects may not always be corrected. Thus, we must be careful when applying machine learning methods to repertoire datasets. For detailed comparisons of software, refer to (<xref ref-type="bibr" rid="B40">40</xref>, <xref ref-type="bibr" rid="B46">46</xref>).</p>
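<p>One simple way to gauge the risk of shortcut learning before training is to ask how well the label can be predicted from the batch alone. The sketch below is a minimal illustration under assumed, hypothetical metadata (two sites with different case prevalence); the function names are ours and do not correspond to any published tool.</p>
<preformat><![CDATA[
```python
from collections import Counter, defaultdict

# Hypothetical sample metadata: (batch or facility, disease label).
metadata = [
    ("site_1", "case"), ("site_1", "case"), ("site_1", "case"), ("site_1", "control"),
    ("site_2", "control"), ("site_2", "control"), ("site_2", "control"), ("site_2", "case"),
]


def majority_accuracy(labels):
    """Accuracy of always predicting the most common label."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def batch_only_accuracy(records):
    """Accuracy of predicting each sample's label from its batch alone."""
    by_batch = defaultdict(list)
    for batch, label in records:
        by_batch[batch].append(label)
    correct = sum(Counter(labels).most_common(1)[0][1] for labels in by_batch.values())
    return correct / len(records)


baseline = majority_accuracy([label for _, label in metadata])
shortcut = batch_only_accuracy(metadata)
print(f"label-only baseline: {baseline:.2f}, batch-only predictor: {shortcut:.2f}")
```
]]></preformat>
<p>If the batch-only predictor clearly outperforms the baseline, a model that merely recognizes the batch can score well without learning anything about the condition, so evaluation on samples from unseen batches becomes essential.</p>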
</sec>
<sec id="s1_4">
<title>Current Pipeline and Datasets for TCR Repertoire Analysis</title>
<p>Currently, a variety of TCR repertoire datasets are available to the public. There are two main types of platforms hosting repertoire datasets. The first is a public database, Sequence Read Archive (SRA)<xref ref-type="fn" rid="fn2"><sup>2</sup></xref>, to which we can register raw sequences (e.g., FASTQ files). To download data, users need to find the accession number of International Nucleotide Sequence Databases (INSD)<xref ref-type="fn" rid="fn3"><sup>3</sup></xref> and use software such as sra-toolkit<xref ref-type="fn" rid="fn4"><sup>4</sup></xref>. Each read sequence in a FASTQ file generated by NGSs is mapped to the reference sequences to annotate CDRs and selected V(D)J genes. Several pipeline tools for the analysis of FASTQ files have been proposed and developed, among which IMGT/HighV-QUEST (<xref ref-type="bibr" rid="B48">48</xref>), IgBLAST (<xref ref-type="bibr" rid="B49">49</xref>), and MiXCR (<xref ref-type="bibr" rid="B47">47</xref>) have been widely used in previous studies. For performance comparisons of the major tools, we refer the reader to these review articles (<xref ref-type="bibr" rid="B50">50</xref>, <xref ref-type="bibr" rid="B51">51</xref>). This workflow is summarized in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1A</bold>
</xref>.</p>
<p>The second type comprises platforms dedicated to immunosequencing datasets. For example, VDJServer<xref ref-type="fn" rid="fn5"><sup>5</sup></xref> (<xref ref-type="bibr" rid="B52">52</xref>) and immuneAccess<xref ref-type="fn" rid="fn6"><sup>6</sup></xref> have been widely used in recent years. Once FASTQ files are uploaded to these services, the files are processed automatically, and various analyses can be performed on the web. Such services appear to be appreciated because they free users from setting up a local environment and from navigating complex software options. Still, there is no de facto standard for such repositories, and this has led to the development of curated databases such as iReceptor<xref ref-type="fn" rid="fn7"><sup>7</sup></xref> (<xref ref-type="bibr" rid="B53">53</xref>) and TCRdb<xref ref-type="fn" rid="fn8"><sup>8</sup></xref> (<xref ref-type="bibr" rid="B54">54</xref>) for scattered datasets.</p>
<p>To collect information on TCR repertoire analysis efficiently, we also recommend the following major repositories and communities: VDJdb (<xref ref-type="bibr" rid="B55">55</xref>), a database that combines information on TCRs, antigens, and MHCs; Immune Epitope Database (IEDB) (<xref ref-type="bibr" rid="B56">56</xref>), a database of immune epitopes; McPAS-TCR (<xref ref-type="bibr" rid="B57">57</xref>), a database that organizes and collects TCR sequences related to various pathogens; and Adaptive Immune Receptor Repertoire (AIRR) community (<xref ref-type="bibr" rid="B58">58</xref>), a community for sharing antigen and repertoire datasets.</p>
</sec>
<sec id="s1_5">
<title>Challenges in TCR Repertoire Analysis</title>
<p>In general, the basic approach for extracting useful information by comparing samples from different conditions is to contrast the information shared or not shared between samples in the same condition with that shared or not shared across different conditions, as illustrated in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1B</bold>
</xref>. In TCR repertoire analysis, the TCR sequences commonly observed among different individuals (called public TCR sequences) are considered important (<xref ref-type="bibr" rid="B59">59</xref>). However, due to the diversity of TCR repertoires, the number of public TCR sequences is very small compared to the total number of sequences observed in each sample. Therefore, they may not be sufficient to characterize the differences among the sample groups.</p>
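<p>The notion of public sequences can be made concrete with a small sketch. The repertoires below are hypothetical toy data, and the threshold for calling a sequence public (observed in at least two samples) is an assumption chosen for illustration; definitions of publicness vary between studies.</p>
<preformat><![CDATA[
```python
from collections import Counter

# Hypothetical toy repertoires: sample name -> observed CDR3 amino acid sequences.
samples = {
    "donor_A": ["CASSLGQGAYEQYF", "CASSPDRGNTEAFF", "CASSLGQGAYEQYF", "CASSQETQYF"],
    "donor_B": ["CASSLGQGAYEQYF", "CASSIRSSYEQYF", "CASSLAGGPYEQYF"],
    "donor_C": ["CASSFGTDTQYF", "CASSLGQGAYEQYF", "CASSIRSSYEQYF"],
}

# Sparse encoding: only non-zero counts are stored for each sample.
counts = {name: Counter(seqs) for name, seqs in samples.items()}

# Number of samples in which each sequence appears at least once.
sharing = Counter(seq for c in counts.values() for seq in c)

MIN_SAMPLES = 2  # assumed threshold for calling a sequence "public"
public = sorted(seq for seq, n in sharing.items() if n >= MIN_SAMPLES)

for name, c in counts.items():
    fraction = sum(seq in public for seq in c) / len(c)
    print(f"{name}: {len(c)} distinct sequences, {fraction:.2f} public")
print("public sequences:", public)
```
]]></preformat>
<p>In real datasets with 10<sup>4</sup> to 10<sup>5</sup> distinct sequences per sample, the public fraction is typically far smaller than in this toy example, which is why the per-sample dictionaries are kept sparse rather than expanded into a dense matrix.</p>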
<p>In addition, for sequences observed uniquely in each sample, it is not easy to distinguish whether they are attributed to the differences of individuals or to those of sample conditions such as abnormalities or diseases. The difficulty of associating observed sequences with sample conditions is one of the major problems in repertoire analysis due to the diversity of TCRs. Moreover, the TCR repertoire may change over time due to the donor&#x2019;s health history such as injury, infection, and aging (<xref ref-type="bibr" rid="B1">1</xref>). Thus, the individual difference in TCR repertoire is large. Therefore, we should perform the analysis by taking into account not only the current but also the past health condition of the donors.</p>
<p>Furthermore, T cells isolated from peripheral blood samples are commonly used to measure the human TCR repertoire. From each sample, we typically measure the TCRs of about 10<sup>4</sup> to 10<sup>6</sup> T cells, which is a tiny fraction of the donor&#x2019;s approximately 10<sup>11</sup> T cells (<xref ref-type="bibr" rid="B60">60</xref>, <xref ref-type="bibr" rid="B61">61</xref>). Thus, quantitative analysis of the TCR repertoire has its own difficulties due to the diversity and chronological variation of the repertoire and also to the large population size of T cells compared with the measurable size.</p>
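<p>The consequence of this sampling depth can be estimated with a back-of-the-envelope calculation. The sketch below assumes, as a simplification, that every cell is sampled independently with the same probability, and the clone sizes are arbitrary illustrative values; only the total of 10<sup>11</sup> T cells and the sample sizes of 10<sup>4</sup> to 10<sup>6</sup> cells come from the text.</p>
<preformat><![CDATA[
```python
TOTAL_T_CELLS = 1e11  # approximate number of T cells in a donor (from the text)


def detection_probability(clone_size, sampled_cells, total_cells=TOTAL_T_CELLS):
    """Probability that at least one cell of a clone is in the sample.

    Assumes each of the clone's cells is sampled independently with
    probability sampled_cells / total_cells (a simplifying assumption).
    """
    fraction = sampled_cells / total_cells
    return 1.0 - (1.0 - fraction) ** clone_size


for sampled in (1e4, 1e6):
    for clone_size in (1e3, 1e5, 1e7):
        p = detection_probability(clone_size, sampled)
        print(f"sample={sampled:.0e} cells, clone={clone_size:.0e} cells: P(detect)={p:.3g}")
```
]]></preformat>
<p>Under these assumptions, only strongly expanded clones are likely to be observed at all, so the absence of a sequence from a sample says little about its absence from the donor.</p>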
<p>Moreover, we still cannot directly know what antigens a specific TCR sequence recognizes. Therefore, only from sequence information, we cannot compare or measure the similarity of TCRs by their antigen recognition profile. Experimental analyses of antigen-specific TCRs are widely performed (<xref ref-type="bibr" rid="B62">62</xref>&#x2013;<xref ref-type="bibr" rid="B64">64</xref>), but they cannot be exhaustive, and we cannot conduct such analysis on every TCR. Although we have TCR-pMHC binding prediction methods, some of which we review later, the performance is still limited. In addition, the diversity of TCRs, MHCs, and antigens makes it impractical to calculate the complete recognition profile. This is problematic because we may need to utilize similar but different sequences to compare or characterize repertoires, as the number of shared identical TCRs is very small because of the high variety of TCRs and the limited sample size we mentioned earlier. Although a lot of experimental evidence such as (<xref ref-type="bibr" rid="B64">64</xref>, <xref ref-type="bibr" rid="B65">65</xref>) implies that similar TCR sequences may recognize similar antigens, there is no <italic>a priori</italic> similarity measure. We need to devise a new way to calculate the similarity of TCRs.</p>
<p>These challenges are not the only obstacles in TCR repertoire analysis for understanding the dynamics of TCR repertoire. As we saw earlier, experimental procedures for repertoire sequencing using PCR or NGS inevitably introduce batch effects and errors. Some of the software tools introduced in the previous section correct and debias the sequencing data to some extent. However, not all the errors and batch effects can be removed.</p>
<p>These problems summarized in <xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2</bold>
</xref> are related to the development of the bioinformatic and ML methods to be introduced in the next section. Each method approaches these challenges in a unique way, and the methods can be categorized as in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3</bold>
</xref>. In the following sections, we review each category one by one.</p>
<fig id="f2" position="float">
<label>Figure&#xa0;2</label>
<caption>
<p>Challenges in TCR repertoire analysis. <bold>(A)</bold> Only limited observations are possible compared with the massive diversity of TCRs in a body or that of possible TCR sequences. <bold>(B)</bold> Various factors alter the repertoire, which results in large individual differences. <bold>(C)</bold> As we cannot observe all the antigens that a TCR recognizes, we cannot directly evaluate the similarity between TCRs with different sequences. <bold>(D)</bold> Experimental procedures including PCR and NGS inevitably introduce errors and batch effects.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fimmu-13-858057-g002.tif"/>
</fig>
<fig id="f3" position="float">
<label>Figure&#xa0;3</label>
<caption>
<p>Graphical summary of the development in TCR analysis methods. Early analysis was based on the TCR clonotype abundance (frequency) distribution (left panel). Recently, sequence information has started to be utilized in various ways (right panel) by employing statistical, ML, and DL methods.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fimmu-13-858057-g003.tif"/>
</fig>
</sec>
</sec>
<sec id="s2">
<title>Observation Frequency Based Methods</title>
<p>In the conventional analysis of TCR repertoires, statistical indices from ecology have been employed. In ecology, the complexity of ecosystems has been measured by diversity (<xref ref-type="bibr" rid="B66">66</xref>). Typically, diversity is calculated by the rarity weighted count of the species. If a species has a dominant population, the diversity of the ecosystem is small. In contrast, if there are many rare species, the diversity increases.</p>
<p>By treating each TCR clonotype as a species, diversity can be measured for a TCR repertoire in a sample. In immunology, the diversity of the TCR repertoire is closely related to clonal expansion against specific antigens, that is, an increase in the proportion of T cells with the same TCR clonotype caused by T cell proliferation, which decreases the diversity of the repertoire (<xref ref-type="bibr" rid="B1">1</xref>). By applying the species diversity analysis methods of ecology, the degree of clonal expansion has been associated with various sample conditions. A typical example is an approach that quantifies the diversity of amino acid sequences in CDR3 using indices such as Hill numbers (<xref ref-type="bibr" rid="B67">67</xref>). Around 2010, the quantification of the diversity of TCR repertoires using probability models was proposed (<xref ref-type="bibr" rid="B68">68</xref>), which enabled us to characterize differences between samples. Both Guindani et al. (<xref ref-type="bibr" rid="B69">69</xref>) and Rempala et al. (<xref ref-type="bibr" rid="B70">70</xref>) employed the Poisson abundance model (<xref ref-type="bibr" rid="B68">68</xref>) not only to fit the shapes of the abundance distributions but also to classify the samples using the estimated parameters of each sample. This approach is still being investigated: PowerTCR (<xref ref-type="bibr" rid="B71">71</xref>), proposed in 2018, is a probabilistic model that is not based on the Poisson abundance model.</p>
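<p>As a minimal sketch of the diversity indices discussed above, the following Python code computes Hill numbers of order q from clonotype counts. The function name, the toy CDR3 sequences, and the two example samples are hypothetical and serve only as an illustration; they are not taken from any of the cited studies.</p>
<preformat>
```python
import math
from collections import Counter


def hill_number(clonotype_counts, q):
    """Hill number (effective number of clonotypes) of order q."""
    total = sum(clonotype_counts)
    freqs = [c / total for c in clonotype_counts if c > 0]
    if q == 1:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in freqs))
    return sum(p ** q for p in freqs) ** (1.0 / (1.0 - q))


# Hypothetical toy repertoires: one CDR3 amino acid sequence per observed cell.
even_sample = ["CASSLGF", "CASSPGY", "CASRDTQ", "CASSQETQ"] * 5
expanded_sample = ["CASSLGF"] * 17 + ["CASSPGY", "CASRDTQ", "CASSQETQ"]

even_counts = list(Counter(even_sample).values())
expanded_counts = list(Counter(expanded_sample).values())

for q in (0, 1, 2):
    print(q, hill_number(even_counts, q), hill_number(expanded_counts, q))
```
</preformat>
<p>Both toy samples contain four clonotypes, so the order-0 Hill number (richness) is identical, whereas the higher-order values drop for the sample dominated by one expanded clone.</p>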
<p>However, these approaches employ only frequency information and do not directly utilize the sequence information. As a result, important information can be obscured or lost. For example, even if the samples have very similar frequency distributions, the sequences observed at high frequencies might be completely different. In particular, it is difficult to examine or identify particular sequences that caused the differences between samples only from the frequency information, which is important for practical applications. Moreover, utilizing ML methods enabled the processing of sequence information without compressing it down to the frequency information. Therefore, the recent advances in TCR repertoire analysis have occurred primarily in sequence-information-based methods using ML methods. We categorized them by their approaches, as summarized in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3</bold>
</xref>, and will review each of them in the following sections. Nevertheless, frequency-based methods are still practically effective for small datasets as the frequency distribution can be obtained relatively stably even from a sample containing a small number of TCRs.</p>
</sec>
<sec id="s3">
<title>Utilization of Sequence Information</title>
<p>As we saw in the previous section, frequency-distribution-based methods can only provide the degree of difference between samples. Since we are often interested in the specific sequences that characterize the sample differences, we need another approach that directly utilizes sequence information to identify them. One of the most illustrative and important applications of sequence information is monitoring minimal residual disease (MRD) in T(B)-cell leukemia (<xref ref-type="bibr" rid="B72">72</xref>). As dominant T(B) cell clones themselves are the direct cause of the disease, unusually proliferated TCRs (BCRs) can be utilized as biomarkers to monitor its progression. In monoclonal leukemia, the identification of such dominant sequences is fairly easy because the dominant clone sometimes occupies more than 75% of the T cells (<xref ref-type="bibr" rid="B73">73</xref>).</p>
<p>However, we may not be able to find such obvious sequences for other diseases or conditions. Unlike leukemia, the most abundant clone in a sample may not be related to diseases or conditions. We must find a portion of the sequences shared between the samples of the same condition, but this is not straightforward. Due to the diversity and individual differences of TCR repertoires, the number of shared sequences is typically very small. Even if we find shared TCRs, we must statistically discriminate whether such shared TCRs are yielded by a condition or by chance. It should be noted that we need to devise a way to evaluate &#x201c;similar&#x201d; sequences because we cannot directly observe the similarity of TCRs. To overcome these problems, ML has been utilized. In this section, we review three major categories of methods, which are based on hypothesis tests, dissimilarity, and motifs, respectively.</p>
<sec id="s3_1">
<title>Hypothesis Test Based Methods</title>
<p>One of the straightforward ways to extract relationships of specific TCRs and sample conditions is to use hypothesis tests to judge whether the number of observed TCRs is significantly large or small for a specific sample condition. For example, Emerson et al. (<xref ref-type="bibr" rid="B16">16</xref>) collected peripheral blood samples from a total of 641 donors, 289 affected and 352 unaffected by CMV, and identified the TCR<italic>&#x3b2;</italic> sequences specific to the CMV-affected donors using Fisher&#x2019;s exact test. In addition, by utilizing the identified 164 CMV-specific TCR<italic>&#x3b2;</italic> sequences as features of a repertoire, they designed a discriminative model of beta-binomial distribution for predicting CMV infection. De Neuter et al. (<xref ref-type="bibr" rid="B74">74</xref>) replicated the results on another dataset and showed that a random forest classifier using the observation counts of these TCRs in a sample also works well to predict the infection. Emerson&#x2019;s method, however, ignores sequence similarity completely and only utilizes the information of &#x201c;Public TCRs.&#x201d;</p>
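<p>To make the hypothesis-test idea concrete, the sketch below computes a one-sided Fisher&#x2019;s exact test for a single TCR sequence from a 2x2 table of donor counts, using only the Python standard library. The cohort sizes (289 and 352) are those quoted above; the carrier counts are hypothetical, and the sketch does not reproduce the thresholds or multiple-testing treatment of the cited study.</p>
<preformat>
```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    a: affected donors carrying the TCR,   b: affected donors lacking it
    c: unaffected donors carrying the TCR, d: unaffected donors lacking it
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    n_affected = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_carriers)
    upper = min(n_affected, n_carriers)
    tail = sum(
        comb(n_affected, k) * comb(n_total - n_affected, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Hypothetical counts: the TCR is seen in 30 of 289 affected donors
# and in 2 of 352 unaffected donors.
p_value = fisher_exact_greater(30, 289 - 30, 2, 352 - 2)
print(p_value)
```
</preformat>
<p>In practice such a test is repeated for every candidate sequence, so the resulting p-values would need to be interpreted with an appropriate correction for multiple comparisons.</p>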
<p>In contrast, Ritvo et al. (<xref ref-type="bibr" rid="B75">75</xref>) proposed a method called TCRNET, which utilizes sequence similarity to estimate clusters of similar TCRs that are significantly proliferated in specific samples. Here, similar TCRs are defined as those that are derived from the same V and J genes and whose sequences differ by at most one amino acid. Then, the number of TCRs in the target cluster is contrasted with the number of TCRs with the same V and J genes and the same CDR3 sequence length as the target cluster. If the proportion is found to be significantly larger in a specific sample by the binomial test, the target cluster is judged to be a proliferated cluster.</p>
<p>These methods require counting the same or similar TCRs. This process is very slow because, in a naive implementation, every possible pair of sequences must be compared. CompAIRR<xref ref-type="fn" rid="fn9"><sup>9</sup></xref> was developed for faster exact or approximate search for shared TCRs.</p>
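<p>One generic way to avoid comparing every pair of sequences is to index each sequence under keys in which one position is masked, so that sequences differing at a single position fall into the same bucket. The following sketch illustrates this idea; it is an assumption chosen for illustration and is not claimed to be the algorithm implemented in CompAIRR. Matching of V and J genes is omitted, and the function name and toy sequences are hypothetical.</p>
<preformat>
```python
from collections import defaultdict
from itertools import combinations


def hamming_one_pairs(sequences):
    """Return all pairs of distinct, equal-length sequences differing at exactly one position."""
    buckets = defaultdict(set)
    for seq in set(sequences):
        for i in range(len(seq)):
            # Mask position i; distinct sequences sharing this key differ only at i.
            key = (i, seq[:i] + "*" + seq[i + 1:])
            buckets[key].add(seq)
    pairs = set()
    for members in buckets.values():
        for s, t in combinations(sorted(members), 2):
            pairs.add((s, t))
    return pairs


# Hypothetical CDR3 sequences.
cdr3s = ["CASSLGF", "CASSLGY", "CASSPGY", "CASRDTQ", "CASSLG"]
print(sorted(hamming_one_pairs(cdr3s)))
```
</preformat>
<p>The number of keys grows linearly with the total sequence length, so the pairwise comparison is restricted to sequences that already share a masked key.</p>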
</sec>
<sec id="s3_2">
<title>Dissimilarity Based Methods</title>
<p>The methods introduced above compare only a specific TCR or a cluster of TCRs with the others. Therefore, they discard the sequence information of the others, even though these constitute most of the sequences in samples. Some methods have been proposed to exploit such information. In particular, we review the methods based on the dissimilarity between TCRs. Network analysis based on sequence similarity has been used for a long time. For example, classification of healthy and leukemic samples was performed on the BCR sequence network of each sample, in which all sequences that differ by at most one residue are connected (<xref ref-type="bibr" rid="B76">76</xref>). For TCRs, a similar network analysis revealed the public TCRs conserved between mice and humans (<xref ref-type="bibr" rid="B77">77</xref>).</p>
<p>More complex dissimilarity indices tailored for TCR analysis have been proposed. Dash et al. (<xref ref-type="bibr" rid="B78">78</xref>) quantified the differences between two TCR sequences by weighted Hamming distances and visualized epitope-specific TCR clusters by dimensionality reduction and clustering of their dissimilarity matrices. Their method, called TCRdist (<xref ref-type="bibr" rid="B78">78</xref>), has become a popular method to search for epitope-specific TCR sequences. TCRdist focuses mostly on evaluating the differences of TCRs. By contrast, RECOLD (<xref ref-type="bibr" rid="B79">79</xref>), which was proposed by our group, is designed to measure the differences between samples. RECOLD calculates the distance between all the observed sequences in all samples to create a dissimilarity matrix. Then, dimensionality reduction is performed on the matrix, and every observed sequence is embedded in a shared low-dimensional space as a point. In this space, each sample is represented as a probability distribution, and the difference between samples is quantified as the difference of distributions by Jensen-Shannon Divergence. In addition, RECOLD can identify the sequences specifically contributing to the differences of samples using the bootstrap method.</p>
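<p>The Jensen-Shannon divergence mentioned above can be sketched for two discrete distributions as follows. Representing each sample as a histogram over a shared set of bins is an assumption made for this illustration; how RECOLD actually estimates the distributions in the embedded space is not specified here, and the example values are hypothetical.</p>
<preformat>
```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (0 to 1 in bits)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical histograms of two samples over the same four bins.
sample_a = [0.70, 0.10, 0.10, 0.10]
sample_b = [0.10, 0.10, 0.10, 0.70]
print(js_divergence(sample_a, sample_b))
```
</preformat>
<p>Unlike the Kullback-Leibler divergence, this quantity is symmetric and remains finite even when one sample has no mass in a bin where the other does.</p>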
<p>New methods based on TCR-level dissimilarity are still actively and continuously explored. A method called GLIPH (<xref ref-type="bibr" rid="B63">63</xref>) integrates sequence information and observed frequency information with CDR3 length and HLA to estimate epitope-specific sequences. iSMART (<xref ref-type="bibr" rid="B80">80</xref>) and GLIPH2 (<xref ref-type="bibr" rid="B81">81</xref>) have been released recently to improve the performance and the applicable data size. In TCRdist3 (<xref ref-type="bibr" rid="B82">82</xref>), TCRdist-based distance can be combined with motifs (introduced in the next section) to characterize TCR clusters.</p>
<p>On the other hand, some methods are devised for calculating the distances between repertoires directly. Repertoire Dissimilarity Index (RDI) (<xref ref-type="bibr" rid="B83">83</xref>) compares the usage of V(D)J gene segments. ImmuneREF<xref ref-type="fn" rid="fn10"><sup>10</sup></xref> utilizes various interpretable indices such as diversity indices and positional amino acid frequencies.</p>
<p>As in the case of the hypothesis-test-based methods, the computation cost is important for dissimilarity-based methods, which also perform many sequence comparisons. ClusTCR (<xref ref-type="bibr" rid="B84">84</xref>) achieves faster clustering by focusing on CDR3 and compromising flexibility in the sequence alignment. GIANA (<xref ref-type="bibr" rid="B85">85</xref>) uses a different approach. In GIANA, a lightweight linear transformation equivalent to sequence alignment on BLOSUM62 is constructed. Then, every sequence can be encoded into a coordinate in Euclidean space, where clustering is fast.</p>
</sec>
<sec id="s3_3">
<title>Motif Based Methods</title>
<p>The dissimilarity based methods characterize TCRs (or samples) by the relative distances between them. Alternatively, we can directly encode TCRs (or samples) into feature vectors and apply ML methods to the vectors. A conventional but effective method to create such feature vectors is the k-mer method. It characterizes a TCR (or a sample) by the observed frequency of all possible k consecutive substrings (motifs) in the sequence (or sequences in the sample). Therefore, in a typical 3-mer method, its feature vector has approximately 21<sup>3</sup> dimensions (21 = 20 human amino acids + a symbol representing the edges of the amino-acid sequences). The k-mer features have been combined with various ML methods: LP-boost (<xref ref-type="bibr" rid="B86">86</xref>); Bayesian discriminators (<xref ref-type="bibr" rid="B87">87</xref>); and SVMs (<xref ref-type="bibr" rid="B87">87</xref>). They were applied to TCR <italic>&#x3b2;</italic>-chain CDR3 datasets to discriminate whether a sample had been treated with ovalbumin or not, which was used to stimulate immune responses. We also proposed MotifBoost (<xref ref-type="bibr" rid="B43">43</xref>), which merges the k-mer encoding and Gradient Boosting Decision Tree (GBDT) (<xref ref-type="bibr" rid="B88">88</xref>) for repertoire classification. Along with proposing a new method, we also investigated the nature of the k-mer encoding and revealed that CMV infected and healthy samples are well separated in the k-mer feature space derived by a PCA-like unsupervised learning method called Gaussian Process Latent Variable Model (GPLVM) (<xref ref-type="bibr" rid="B89">89</xref>). This result indicates that the k-mer encoding can naturally capture the intrinsic characteristics of repertoires. Moreover, k-mer based methods work effectively even on smaller samples compared to the other methods.</p>
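<p>A minimal sketch of the k-mer encoding described above is given below. The edge symbol, the use of a single edge symbol at each end, and the normalization to relative frequencies are assumptions made for this illustration; individual methods may differ in these details, and the toy sequences are hypothetical.</p>
<preformat>
```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EDGE = "^"  # hypothetical symbol marking both ends of a sequence


def kmer_counts(cdr3, k=3):
    """Count the k-mers of one CDR3 sequence after adding edge symbols."""
    padded = EDGE + cdr3 + EDGE
    return Counter(padded[i:i + k] for i in range(len(padded) - k + 1))


def repertoire_vector(cdr3_list, k=3):
    """Relative k-mer frequencies of a sample as a vector of 21**k dimensions."""
    alphabet = AMINO_ACIDS + EDGE
    index = {"".join(t): j for j, t in enumerate(product(alphabet, repeat=k))}
    vec = [0.0] * len(index)
    total = 0
    for cdr3 in cdr3_list:
        for kmer, n in kmer_counts(cdr3, k).items():
            vec[index[kmer]] += n
            total += n
    return [v / total for v in vec] if total else vec


# Hypothetical toy sample of two CDR3 sequences.
vec = repertoire_vector(["CASSF", "CASSLGF"])
print(len(vec), sum(1 for v in vec if v > 0))
```
</preformat>
<p>The vector has exactly 21<sup>3</sup> entries here, although k-mers with an edge symbol in the middle can never occur, which is why the effective dimensionality is only approximately 21<sup>3</sup>.</p>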
<p>As we mentioned earlier, we still do not fully understand what kind of factors determine the similarity between different TCRs. However, the success of the dissimilarity-based methods, which are based on the hypothesis that similar TCRs work similarly in the body, implies that the hypothesis is true to some extent. Moreover, the success of k-mer encoding supports and strengthens the view that some important motifs play a central role in determining the similarity of TCRs. This is also supported by the fact that shared motifs of antigen-specific TCRs are found in various conditions (<xref ref-type="bibr" rid="B62">62</xref>&#x2013;<xref ref-type="bibr" rid="B64">64</xref>).</p>
<p>While being conventional, k-mer encoding and combined ML methods still have room for further improvement and development. For example, Ostmeyer et al. (<xref ref-type="bibr" rid="B90">90</xref>) combined the 4-mer method with logistic regression to discriminate between cancerous and healthy tissues. In this work, feature vectors are created differently from the conventional way. Each 4-mer motif is represented as a 20-dimensional vector consisting of four 5-dimensional biophysicochemical feature vectors of each amino acid. Therefore, a TCR is converted into a bag of 4-mer feature vectors. To deal with this setup, they employed the multiple instance learning framework. Specifically, they trained a logistic regression model to assign a score, which is the probability that the motif is related to cancer, for each motif. A sample&#x2019;s score, which is used for sample-level classification, is defined as the maximum score of the motifs found in the sample.</p>
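<p>The sample-level scoring rule described above, in which a sample receives the maximum score of its motifs, can be sketched as follows. The two-dimensional feature vectors, weights, and bias are hypothetical stand-ins for the 20-dimensional features and the trained parameters of the cited work, and the training step is omitted.</p>
<preformat>
```python
import math


def motif_score(features, weights, bias):
    """Logistic regression score of one motif's feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def sample_score(motif_features, weights, bias):
    """Sample-level score: the maximum over the scores of all motifs in the bag."""
    return max(motif_score(f, weights, bias) for f in motif_features)


# Hypothetical bag of three motif feature vectors and hypothetical parameters.
bag = [[0.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
weights = [1.0, -1.0]
bias = 0.0
print(sample_score(bag, weights, bias))
```
</preformat>
<p>Because only the highest-scoring motif determines the sample score, a single strongly associated motif is sufficient to classify a sample as positive, which matches the multiple instance learning setting.</p>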
<p>As a good representation of data is decisive in ML, we expect more applications built around k-mer methods or other data representation methods to appear.</p>
</sec>
</sec>
<sec id="s4">
<title>Application of Generative Models</title>
<p>Most of the methods mentioned above are used for characterizing the differences between samples. Thus, they usually compare samples obtained from different conditions by assuming that the dataset to be analyzed is from a cross-sectional or longitudinal study. However, careful effort is required for obtaining datasets from multiple experiments. Recruiting a sufficient number of donors for every sample condition is difficult, especially if the conditions are rare.</p>
<p>To solve this problem, methods based on generative models have recently been explored. These methods employ mathematical models for the generation of TCRs, which have been intensively developed since 2012. For TCR generation in the thymus, a probabilistic model implementing the biological mechanism of V(D)J recombination was proposed (<xref ref-type="bibr" rid="B91">91</xref>). Various extensions to this model, especially for inference methods, have been proposed based on Monte Carlo simulation (<xref ref-type="bibr" rid="B92">92</xref>), improved expectation maximization (EM) algorithm (<xref ref-type="bibr" rid="B93">93</xref>), and dynamic programming (<xref ref-type="bibr" rid="B94">94</xref>). For TCR selection in the periphery, Elhanati et al. (<xref ref-type="bibr" rid="B95">95</xref>) devised another probabilistic model. Their model employs the actual peripheral repertoire dataset to estimate the probability distribution of post-selected TCRs, and utilizes the TCR generation model of (<xref ref-type="bibr" rid="B91">91</xref>) to infer that of unobserved pre-selected TCRs. This model is trained to predict the difference between the two distributions.</p>
<p>Based on the same idea of substituting the unobserved datasets with a generative model, Pogorelyy et al. (<xref ref-type="bibr" rid="B96">96</xref>) developed a method called Antigen-specific Lymphocyte Identification by Clustering of Expanded sequences (ALICE), which can characterize samples obtained from only one condition by contrasting them with the sequences generated by a generative model as reference repertoires. This strategy is also applied to characterizing TCRs (<xref ref-type="bibr" rid="B92">92</xref>).</p>
<p>The generative model can pave the way to quantify the abnormality of a sample and to infer its responsible sequences only from a snapshot sampling of the patient&#x2019;s repertoire, without expensive effort to conduct cohort studies. However, challenges remain for its practical and reliable employment. For example, because the TCR generation model utilized in ALICE does not take into account the individual difference that affects the TCR repertoire [e.g., genetic background (<xref ref-type="bibr" rid="B97">97</xref>) and age (<xref ref-type="bibr" rid="B26">26</xref>)], the parameters of the generative model may need to be adjusted to the conditions of individual samples to further enhance its reliability.</p>
<sec id="s4_1">
<title>Simulation of Repertoire</title>
<p>The advance of generative models has led to the emergence of simulation software that creates pseudo repertoire datasets. Simulated datasets have been used to assess the performance of repertoire analysis methods. For example, a simulated dataset was used to assess the performance of V(D)J gene identification for B cells (<xref ref-type="bibr" rid="B98">98</xref>).</p>
<p>IgSimulator (<xref ref-type="bibr" rid="B99">99</xref>) is one of the earliest repertoire dataset simulators. AbSim (<xref ref-type="bibr" rid="B100">100</xref>) simulates the temporal development of mutations in B cells. However, these simulators were made for antibody sequences, not TCR sequences. ImmuneSIM (<xref ref-type="bibr" rid="B101">101</xref>) is capable of simulating TCR repertoires. In addition, its remarkable feature is the simulation of repertoires for classification. It can implant k-mer like sequences into the repertoire dataset. Classification methods can then be tested on whether they can find the implanted TCR or repertoire, or the implanted motif itself. As motifs play an important role in characterizing repertoires (see motif-based methods), k-mer like signal implanting has recently been adopted in some studies (<xref ref-type="bibr" rid="B102">102</xref>, <xref ref-type="bibr" rid="B103">103</xref>).</p>
<p>Using simulation, further evaluation of analysis methods can be performed. For example, the classification performance can be evaluated under various conditions, such as different signal densities and sample sizes, as done in (<xref ref-type="bibr" rid="B103">103</xref>). Evaluations like this cannot be conducted using only real datasets.</p>
</sec>
</sec>
<sec id="s5">
<title>Application of Deep Learning</title>
<p>Deep learning (DL) is a class of ML algorithms that achieves good performance in various fields. DL has been pervading various areas of biology such as genomics (<xref ref-type="bibr" rid="B104">104</xref>) and systems biology (<xref ref-type="bibr" rid="B105">105</xref>), and it has also recently been applied to repertoire analysis. Again, DL itself is just another ML algorithm. However, representation learning, which is one of the notable features of deep learning, allows DL models to achieve high performance by learning appropriate representations from data without explicitly providing the mechanism behind it (<xref ref-type="bibr" rid="B106">106</xref>). On the other hand, most of the models we introduced earlier used hand-crafted features or were based on human knowledge. We call such models &#x201c;hand-crafted models&#x201d; hereafter. While the generative model of TCRs introduced above is a hand-crafted model that explicitly implements biological mechanisms such as V(D)J recombination, Davidsen et al. (<xref ref-type="bibr" rid="B107">107</xref>) proposed a Variational Autoencoder (VAE) (<xref ref-type="bibr" rid="B108">108</xref>) based generative model that treats TCR generation like a string generation task. Another feature of DL is that representations learned in one task can be easily transferred to other tasks [called transfer learning (<xref ref-type="bibr" rid="B109">109</xref>)]. DeepTCR (<xref ref-type="bibr" rid="B110">110</xref>) solves classification problems using features obtained from a VAE-based generative model.</p>
<p>Not only generative models like VAE but also discriminative models are utilized for repertoire analysis. For example, DeepRC (<xref ref-type="bibr" rid="B102">102</xref>) utilized a popular class of DL model architecture called attention mechanism for the repertoire classification problem. Simply put, the attention mechanism is a kind of learnable weighted average (<xref ref-type="bibr" rid="B111">111</xref>). DeepRC encodes each amino acid sequence in the repertoire to a vector and analyzes its importance through the attention mechanism. Classification is made on the weighted average of the encoded vectors.</p>
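<p>The description of attention as a learnable weighted average can be illustrated with the sketch below, which pools per-sequence vectors using softmax-normalized scores. In an actual model both the sequence vectors and the scores are produced by trained networks; here they are hypothetical fixed inputs, and the sketch is not the DeepRC implementation.</p>
<preformat>
```python
import math


def attention_pool(vectors, scores):
    """Softmax-weighted average of sequence vectors, returned with the weights."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


# Hypothetical encoded vectors of three sequences and their attention scores.
encoded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.1, 0.1, 5.0]
pooled, weights = attention_pool(encoded, scores)
print(pooled, weights)
```
</preformat>
<p>The weights sum to one, so sequences with high scores dominate the pooled repertoire representation, and inspecting the weights indicates which sequences the classification relies on.</p>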
<p>DL is also being intensively applied to the prediction of affinity between pairs of T cells and antigens (<xref ref-type="bibr" rid="B112">112</xref>, <xref ref-type="bibr" rid="B113">113</xref>), as well as triplets including MHCs (<xref ref-type="bibr" rid="B114">114</xref>). The TCR-pMHC binding prediction task is one of the most actively studied topics in immunoinformatics (<xref ref-type="bibr" rid="B115">115</xref>). The task is to predict whether or not the target antigen will be recognized by a TCR using the sequence information of the TCR and the antigen protein. As AlphaFold2 (<xref ref-type="bibr" rid="B116">116</xref>) has made an innovation in predicting the structure of proteins from their amino acid sequences, DL is expected to make a breakthrough in this area as well.</p>
<p>At this stage, DL-based methods have not yet demonstrated performance that dominates hand-crafted models, in which humans craft the features or the model structure, in this field. For example, a comparison between a hand-crafted generative model (<xref ref-type="bibr" rid="B95">95</xref>) and Davidsen&#x2019;s VAE-based generative model (<xref ref-type="bibr" rid="B107">107</xref>) was conducted (<xref ref-type="bibr" rid="B117">117</xref>). This paper concluded that the hand-crafted model outperforms DL-based models with lower computational cost and higher interpretability. For peptide-MHC binding prediction, according to a systematic performance comparison review conducted in 2020, ML-based models still scored better than DL-based models on average (<xref ref-type="bibr" rid="B118">118</xref>). In addition, our group compared a DL model and ML models on a repertoire classification task by changing the data size available for learning and found that the performance of the DL-based model deteriorates on small datasets (<xref ref-type="bibr" rid="B43">43</xref>).</p>
<p>According to the current trend, the application of DL in this field will be investigated even more intensively in the future. For example, some more recent DL-based peptide-MHC methods reviewed in the next section are showing better performance than the traditional methods on some specific datasets. However, DL may not wipe out the need for traditional biological hand-crafted models because of its expensive computation cost, lack of interpretability, and data-intensive nature. Instead, the integration of hand-crafted and DL-based models is being explored. In a recently proposed model for T cell selection called soNNia (<xref ref-type="bibr" rid="B119">119</xref>), a hand-crafted generative model for TCR generation probability (<xref ref-type="bibr" rid="B95">95</xref>), which was used for comparison in (<xref ref-type="bibr" rid="B117">117</xref>), is combined with a DL model of the TCR selection. For TCR-pMHC interaction prediction, a combination of DL and traditional ML methods is also being pursued (<xref ref-type="bibr" rid="B120">120</xref>).</p>
<sec id="s5_1">
<title>Embedding Methods Based on Representation Learning</title>
<p>In the recent advances in Natural Language Processing (NLP), self-supervised representation learning draws attention, which utilizes the nature of data as a target signal to learn good representations. This is realized by the ability of DL to acquire good representations mentioned in the previous section. One of the earliest successful approaches is Word2Vec (<xref ref-type="bibr" rid="B121">121</xref>), which encodes a word to a numeric vector (Word Embedding). In a Word2Vec training method called CBOW (continuous bag of words), a neural network (NN) that converts a word to a vector is trained to predict a masked word in a sentence using encoded vectors of its surrounding words (<xref ref-type="bibr" rid="B122">122</xref>). Word2Vec is utilized widely to convert textual data to numerical representation in NLP and also is applied to repertoire analysis. Immune2Vec (<xref ref-type="bibr" rid="B123">123</xref>) is inspired by Word2Vec and treats a TCR/BCR as a sentence and a k-mer as a word, respectively. Representation of a TCR/BCR, which is composed of many k-mers, is derived by averaging all k-mer vectors, which is a similar procedure to FastText (<xref ref-type="bibr" rid="B124">124</xref>) in NLP.</p>
<p>After the success of Word2Vec, various NN architectures for self-supervised representation learning in NLP have been developed. One of the notable approaches is the neural language model. A language model is a generative model that predicts words from the context. CBOW is a representative example, which predicts a word from context words. Thanks to the invention of a new NN building block called Transformer (<xref ref-type="bibr" rid="B125">125</xref>), which utilizes the attention mechanism we mentioned in the previous section, NNs can handle more distant dependencies in a text. New neural language models like BERT (<xref ref-type="bibr" rid="B126">126</xref>) exploited the Transformer&#x2019;s ability and broke the former models&#x2019; records in various tasks. These models are trained to predict a masked word similarly to CBOW. However, in contrast to CBOW, they can predict one or more meaningful sentences, not a word. One such language model called GPT-3 can write natural texts, e.g., news articles (<xref ref-type="bibr" rid="B127">127</xref>). We can also utilize a neural language model to embed a sentence using the output of the hidden layer (Sentence Embedding). Such sentence embedding has been shown to be a very good representation and can be applied to multiple downstream tasks in NLP, from question answering to translation, with little additional training for each task (called fine-tuning) (<xref ref-type="bibr" rid="B128">128</xref>). Training of the language model itself (called pretraining) requires a large corpus and enormous computation resources. However, once the training is done, the same model can be applied to various problems with fine-tuning using little data.</p>
<p>Language models have also been employed in repertoire analysis. Before that, language models had been intensively applied to general protein sequences (<xref ref-type="bibr" rid="B129">129</xref>&#x2013;<xref ref-type="bibr" rid="B132">132</xref>). BERTMHC (<xref ref-type="bibr" rid="B133">133</xref>) showed that utilizing the pre-trained model of (<xref ref-type="bibr" rid="B129">129</xref>) actually increases the performance in the peptide-MHC (Class II) binding prediction task. ImmunoBERT (<xref ref-type="bibr" rid="B134">134</xref>) used the same pre-trained model for the peptide-MHC (Class I) binding prediction task. Hashemi et al. (<xref ref-type="bibr" rid="B135">135</xref>) employed the pre-trained model of (<xref ref-type="bibr" rid="B131">131</xref>), fine-tuned it for peptide-MHC (Class I) binding prediction, and achieved higher performance compared to previous software. Some papers perform pre-training on their own on the repertoire sequencing dataset. In Leem et al. (<xref ref-type="bibr" rid="B136">136</xref>), each amino acid in a BCR (antibody) sequence is treated as a word, and the sequence is treated as a sentence to pre-train a BERT language model (AntiBERTa). AntiBERTa achieved a higher ROC-AUC in a paratope prediction task than other tools.</p>
<p>The utilization of language models is not limited to embedding. In Shuai et al. (<xref ref-type="bibr" rid="B137">137</xref>), another language model called GPT-2 (<xref ref-type="bibr" rid="B128">128</xref>) is utilized for pretraining an antibody generation model (IgLM). Because GPT-2 is designed for full sentence generation, unlike BERT, IgLM can generate new antibodies (CDRs). A new antibody design workflow is proposed in the paper and outlined as follows: First, many antibodies are created using IgLM. Then, the 3D structure of each antibody is calculated. Finally, the properties of the generated structures are computed to select better antibody candidates.</p>
</sec>
</sec>
<sec id="s6">
<title>Machine Learning for Repertoire Analysis in Practice</title>
<p>In this review, we focused mainly on the technical aspects of ML and DL methods and categorized them by their approach. As a result, we could not cover all topics, especially those relevant to practical applications. This gap may be filled by a thorough review of the repertoire analysis methods before 2019 (<xref ref-type="bibr" rid="B138">138</xref>) and another review that introduces many methods categorized by task (<xref ref-type="bibr" rid="B139">139</xref>). In addition, more ML applications can be found for pMHC-epitope analysis in (<xref ref-type="bibr" rid="B140">140</xref>&#x2013;<xref ref-type="bibr" rid="B143">143</xref>) and for longitudinal analysis in (<xref ref-type="bibr" rid="B144">144</xref>, <xref ref-type="bibr" rid="B145">145</xref>).</p>
<p>To practice ML methods, we can refer to the authors&#x2019; implementations in most cases. A comprehensive list of such implementations and other software can be found in (<xref ref-type="bibr" rid="B146">146</xref>). In addition, some libraries implement multiple popular methods for general analysis. In particular, VDJTools (<xref ref-type="bibr" rid="B42">42</xref>) and tcR (<xref ref-type="bibr" rid="B147">147</xref>) (Immunarch<xref ref-type="fn" rid="fn11"><sup>11</sup></xref> is its successor) are equipped with a broad range of basic analysis methods and are widely used in practice. Moreover, new libraries such as ImmuneML (<xref ref-type="bibr" rid="B148">148</xref>), which focuses more on ML methods, are being developed.</p>
<p>As for the topics that those sources cannot fully cover, we discuss the following two topics in relation to the practice of ML methods in TCR repertoire analysis. One is prospective practical applications of repertoire analysis, such as blood testing and cancer vaccination. The other is repertoire analysis of COVID-19.</p>
<sec id="s6_1">
<title>Applications of Repertoire Analysis</title>
<p>Recently, applications of repertoire analysis have been developing rapidly. One of the most prominent applications is blood testing (<xref ref-type="bibr" rid="B149">149</xref>). In this field, the diagnosis of MRD (see UTILIZATION OF SEQUENCE INFORMATION) and COVID-19 testing (see the next section) have already been authorized by the FDA. There are potentially more diseases that can be diagnosed by repertoire sequencing. For example, autoimmune diseases such as lupus erythematosus (<xref ref-type="bibr" rid="B150">150</xref>), rheumatoid arthritis (<xref ref-type="bibr" rid="B150">150</xref>), and lupus nephritis (<xref ref-type="bibr" rid="B151">151</xref>) have been successfully classified using the V-J gene usage distribution as a feature and a random forest classifier. In the BCR repertoire, IGHV gene selection was analyzed for multiple autoimmune diseases (<xref ref-type="bibr" rid="B152">152</xref>).</p>
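<p>As an illustration of the V-J gene usage feature mentioned above, the following sketch turns a list of clonotypes into a fixed-length frequency vector. It is a simplified example under stated assumptions: the function name, the toy sample, and the choice to weight every clonotype equally (rather than by clone size) are hypothetical and not taken from the cited studies.</p>
<preformat>
```python
from collections import Counter


def vj_usage_vector(clonotypes, vj_pairs):
    """Return the relative frequency of each (V, J) pair, in the order of vj_pairs.

    clonotypes: iterable of (v_gene, j_gene) tuples, one per clonotype.
    vj_pairs:   fixed list of (v_gene, j_gene) pairs defining the feature order,
                shared across all samples so that vectors are comparable.
    """
    counts = Counter((v, j) for v, j in clonotypes)
    total = sum(counts.values())
    return [counts[pair] / total if total else 0.0 for pair in vj_pairs]


# Toy sample with four clonotypes (made up for illustration).
sample = [
    ("TRBV5-1", "TRBJ2-7"),
    ("TRBV5-1", "TRBJ2-7"),
    ("TRBV20-1", "TRBJ1-1"),
    ("TRBV7-9", "TRBJ2-3"),
]
vj_pairs = sorted(set(sample))
features = vj_usage_vector(sample, vj_pairs)
print(dict(zip(vj_pairs, features)))
```
</preformat>
<p>One such vector per sample can then be passed to any standard classifier, for example a random forest, with the disease label as the target.</p>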
<p>In relation to autoimmunity, repertoire analysis has revealed features common to self-reactive T cells. Hydrophobic residues (<xref ref-type="bibr" rid="B153">153</xref>, <xref ref-type="bibr" rid="B154">154</xref>) or cysteine (<xref ref-type="bibr" rid="B154">154</xref>) in CDR3 are related to self-reactivity. The enrichment of hydrophobic CDRs in regulatory T cells is reproduced by a logistic regression model with 606 T cell features that predicts whether a cell becomes a regulatory T cell or not (<xref ref-type="bibr" rid="B155">155</xref>). Prediction of self-reactive T cells may play an important role in the diagnosis of autoimmune diseases in the future.</p>
<p>Another prominent application is neoantigen vaccines to treat cancer. A neoantigen is a tumor-specific antigen that can be used to target tumor cells; neoantigen vaccines thus stimulate T cells to attack tumor cells. Neoantigen vaccines should be personalized because tumors of different individuals tend to acquire different mutations and express different neoantigens (<xref ref-type="bibr" rid="B156">156</xref>, <xref ref-type="bibr" rid="B157">157</xref>). Repertoire analysis is expected to reduce the labor required for finding each individual&#x2019;s neoantigens (<xref ref-type="bibr" rid="B158">158</xref>). The search for neoantigens <italic>in silico</italic> is typically performed as follows: First, tumor-specific mutations and the proteins translated from the mutated transcripts are identified by sequencing. Second, from those proteins, all candidate antigenic peptides that could mark cancer cells are listed. Third, the peptides that can bind well to the patient&#x2019;s MHC are screened. Finally, the obtained peptides are tested to determine whether the pMHC complex can be recognized by T cells. Repertoire analysis is used in the third step to predict the affinity between a peptide and the patient&#x2019;s MHC. Several software tools have been published for this task (<xref ref-type="bibr" rid="B118">118</xref>). On the other hand, the immunopeptidome is studied as a different approach to finding neoantigens (<xref ref-type="bibr" rid="B159">159</xref>). This approach is also interesting in relation to repertoire analysis. In this approach, pMHC complexes in tumor tissues are collected and analyzed to retrieve their peptide sequences. As the peptides are already assured to bind to MHC, we can skip some of the screening steps described above. The immunopeptidome can be seen as a peptide repertoire, and its analysis might provide insight into the TCR repertoire in the future.</p>
<p>We reviewed some potential applications of repertoire analysis in this section. To realize such applications, we need reproducible and robust results. For clinical applications, standardized protocols must be established. For example, a standard experimental protocol has been proposed for MRD diagnosis (<xref ref-type="bibr" rid="B160">160</xref>). Bioinformatic pipelines, however, are not yet standardized. We expect more standardized workflows to appear in the future. An example is a new standard format for repertoire datasets proposed by the AIRR Community (<xref ref-type="bibr" rid="B161">161</xref>).</p>
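<p>The second step of the workflow above, listing candidate peptides around a tumor-specific mutation, can be sketched as a sliding window over the mutated protein. This is a hedged illustration: the function name, the made-up protein sequence, and the fixed peptide length of nine residues are assumptions, and real pipelines typically scan several peptide lengths before the MHC binding screen.</p>
<preformat>
```python
def mutated_peptides(protein, position, mutant_residue, length=9):
    """List every window of `length` residues that covers the mutated position.

    protein:        wild-type amino acid sequence
    position:       0-based index of the substituted residue
    mutant_residue: amino acid found in the tumor at that position
    """
    mutated = protein[:position] + mutant_residue + protein[position + 1:]
    # Earliest and latest window starts that still contain the mutation
    # and stay inside the protein.
    first = max(0, position - length + 1)
    last = min(position, len(mutated) - length)
    return [mutated[start:start + length] for start in range(first, last + 1)]


# Made-up sequence and mutation, for illustration only.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
peptides = mutated_peptides(protein, position=10, mutant_residue="W")
for peptide in peptides:
    print(peptide)
```
</preformat>
<p>Each listed peptide would then be scored against the patient&#x2019;s MHC alleles by a binding predictor, which corresponds to the third step.</p>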
</sec>
<sec id="s6_2">
<title>Repertoire Analysis for COVID-19</title>
<p>Understanding COVID-19 has been one of the most important research topics in recent years, and repertoire analysis has revealed various characteristics of COVID-19 so far. In this section, we will see how the ML-based repertoire analysis introduced in this review is used in COVID-19 studies.</p>
<p>Repertoire analysis has been employed to investigate the nature of COVID-19 infection. The most basic observation is the change in diversity. Many studies reported low TCR repertoire diversity in active COVID-19 patients (<xref ref-type="bibr" rid="B162">162</xref>&#x2013;<xref ref-type="bibr" rid="B165">165</xref>). Some studies further reported that the severity of symptoms is related to lower diversity (<xref ref-type="bibr" rid="B163">163</xref>, <xref ref-type="bibr" rid="B166">166</xref>). However, it should be noted that a decrease in TCR diversity is not necessarily specific to COVID-19 infection but is common to various viral infections (<xref ref-type="bibr" rid="B164">164</xref>). Cheng et al. (<xref ref-type="bibr" rid="B167">167</xref>) investigated V(D)J gene usage and found that some V<italic>&#x3b2;</italic> genes, which are estimated to have a high affinity to the SARS-CoV-2 spike protein antigen, were enriched in severe COVID-19 patients.</p>
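<p>To make the notion of repertoire diversity concrete, the sketch below computes the Shannon entropy of clone sizes and its normalized form. Shannon entropy is only one common choice; the cited studies may use other indices, and the function names and toy clone counts here are illustrative assumptions.</p>
<preformat>
```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a list of clone sizes."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def normalized_entropy(counts):
    """Entropy divided by its maximum, log(number of clones).

    Close to 1 for an even repertoire, close to 0 when a few clones dominate.
    """
    n_clones = sum(1 for c in counts if c > 0)
    if n_clones > 1:
        return shannon_entropy(counts) / math.log(n_clones)
    return 0.0


even_repertoire = [10, 10, 10, 10]   # four clones of equal size
skewed_repertoire = [97, 1, 1, 1]    # one expanded clone dominates

print(normalized_entropy(even_repertoire))
print(normalized_entropy(skewed_repertoire))
```
</preformat>
<p>A repertoire dominated by a few expanded clones yields a lower value than an even one, which is the kind of diversity loss reported for active infection.</p>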
<p>Further insights are also provided by sequence-information-based ML methods. In Simnica et al. (<xref ref-type="bibr" rid="B168">168</xref>), COVID-19 public TCRs are investigated. GLIPH2 (<xref ref-type="bibr" rid="B81">81</xref>), one of the dissimilarity-based methods we reviewed, was used to cluster TCRs and select COVID-19 related TCRs by Student&#x2019;s t-test (similar to Emerson et al. (<xref ref-type="bibr" rid="B16">16</xref>), introduced as one of the hypothesis-test-based methods). GLIPH2 was also employed in Chang et al. (<xref ref-type="bibr" rid="B166">166</xref>) to characterize the TCRs related to the severity of the symptoms. Minervina et al. (<xref ref-type="bibr" rid="B169">169</xref>) examined the dynamics of COVID-19 patients&#x2019; repertoires over time using the hypothesis test previously proposed by the same group (<xref ref-type="bibr" rid="B170">170</xref>) to distinguish proliferating clones. Quiros-Fernandez et al. (<xref ref-type="bibr" rid="B171">171</xref>) revealed the cross-reactivity of CD8+ T cells in unexposed donors to a COVID-19 epitope, which was derived using NetCTLPan (<xref ref-type="bibr" rid="B172">172</xref>), an NN-based peptide-MHC binding prediction software.</p>
<p>We cannot cover all the COVID-19 related literature here. For further reading, see (<xref ref-type="bibr" rid="B173">173</xref>) for early research and (<xref ref-type="bibr" rid="B174">174</xref>) for recent updates. For repertoire diversity and COVID-19, see (<xref ref-type="bibr" rid="B175">175</xref>, <xref ref-type="bibr" rid="B176">176</xref>). Note that, as COVID-19 is still not fully understood, these results should be further validated in the future.</p>
<p>As a practical application, repertoire analysis is utilized to diagnose COVID-19. Adaptive Biotechnologies, a US-listed company, applied the ML algorithm that they developed for CMV [in Emerson et al. (<xref ref-type="bibr" rid="B16">16</xref>), introduced in Section 3.1] to a COVID-19 dataset. It was demonstrated that the algorithm successfully distinguished the samples&#x2019; COVID-19 infection status (<xref ref-type="bibr" rid="B6">6</xref>). Adaptive Biotechnologies received an EUA (Emergency Use Authorization) for the COVID-19 test from the FDA. Nevertheless, a repertoire-based test may not be the first choice for COVID-19 diagnosis. First, a T(B)CR repertoire cannot provide direct evidence of the presence of the SARS-CoV-2 virus. Second, a repertoire-based test requires sequencing, which costs substantially more than PCR or antibody tests. However, repertoire analysis can potentially reveal far more information than such tests (<xref ref-type="bibr" rid="B149">149</xref>), and the sequencing cost is decreasing. Therefore, repertoire-based blood testing may be utilized more widely in the future (<xref ref-type="bibr" rid="B149">149</xref>).</p>
</sec>
<sec id="s6_3">
<title>Small Sample Problem of Repertoire Datasets</title>
<p>The size of datasets is a major determinant of the performance of methods and the reliability of their results (<xref ref-type="bibr" rid="B43">43</xref>, <xref ref-type="bibr" rid="B103">103</xref>). Therefore, the establishment and development of sufficiently large datasets are as important as, or even more important than, the development of analysis methods.</p>
<p>TCRdb (<xref ref-type="bibr" rid="B54">54</xref>), one of the major databases of TCR repertoires, contains 131 projects with a total of 8,341 samples of public datasets aggregated from various repositories as of November 2021. Since one project is usually associated with one paper, a rough estimate indicates that one paper contains 64 samples on average. In general, this number is considered small for applying ML algorithms. Indeed, the classification methods mentioned above do not always work satisfactorily across classification tasks, especially when the sample size is less than 100 (<xref ref-type="bibr" rid="B43">43</xref>). A simulation also indicates that the number of samples affects the classification performance (<xref ref-type="bibr" rid="B103">103</xref>).</p>
<p>This situation is gradually changing with the appearance of large datasets containing several hundred samples, such as the CMV dataset in Emerson et al. (<xref ref-type="bibr" rid="B16">16</xref>). In addition, Adaptive Biotechnologies and Microsoft released a new COVID-19 dataset with 1,486 samples, one of the largest ever released as a single dataset<xref ref-type="fn" rid="fn12"><sup>12</sup></xref>, which was used in (<xref ref-type="bibr" rid="B6">6</xref>). However, such a large dataset is exceptional, especially for the human repertoire, in light of the difficulty of recruiting a large number of patients with the same condition, e.g., infection records. Even though the number of publicly available datasets has grown steadily (<xref ref-type="bibr" rid="B177">177</xref>) and will continue to grow, the small data size problem may not be readily resolved. Note that we might employ other animals&#x2019; datasets for some basic research (<xref ref-type="bibr" rid="B77">77</xref>). In VDJdb (<xref ref-type="bibr" rid="B177">177</xref>), datasets of mice and macaques are recorded. However, the number of such datasets is much smaller than that of human datasets.</p>
<p>Simulations are not only a powerful tool for repertoire analysis, as we saw earlier, but can also contribute to overcoming this situation, as generative models can create an unlimited amount of pseudo datasets. However, the validity of simulations in repertoire analysis may not always be assured, depending on the tasks and situations. For example, simulated datasets for repertoire classification tasks are created by embedding specific k-mer-like signals only in repertoires belonging to specific classes (<xref ref-type="bibr" rid="B101">101</xref>&#x2013;<xref ref-type="bibr" rid="B103">103</xref>). Though we know such motifs are important for characterizing the binding properties of TCRs (<xref ref-type="bibr" rid="B63">63</xref>), other signals may still be missing. Also, each disease may affect the repertoire uniquely [e.g., the difference between CMV and varicella zoster virus (VZV) (<xref ref-type="bibr" rid="B178">178</xref>)]. Until we have plenty of real datasets, we cannot know how to characterize the changes in the repertoire caused by a given condition. Therefore, we will still need real datasets, especially to enable new practical applications.</p>
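<p>The motif-implanting style of simulation described above can be sketched as follows. This is a deliberately crude illustration, not the procedure of the cited works: the function names, the motif, the implant rate, and the use of uniformly random background sequences are all assumptions, and a realistic simulation would draw background sequences from a generative model of V(D)J recombination instead.</p>
<preformat>
```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_cdr3(rng, length=13):
    """Uniformly random amino acid string used as a stand-in background sequence."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def implant_motif(sequence, motif, rng):
    """Overwrite a random stretch of the sequence with the motif (length is preserved)."""
    start = rng.randrange(len(sequence) - len(motif) + 1)
    return sequence[:start] + motif + sequence[start + len(motif):]


def simulate_repertoire(rng, n_sequences, motif=None, implant_rate=0.0):
    """Create one pseudo repertoire; the motif is implanted in a fraction of sequences."""
    repertoire = []
    for _ in range(n_sequences):
        sequence = random_cdr3(rng)
        if motif and implant_rate > rng.random():
            sequence = implant_motif(sequence, motif, rng)
        repertoire.append(sequence)
    return repertoire


rng = random.Random(0)
# "Diseased" class carries the signal in roughly 10% of its sequences.
positive = simulate_repertoire(rng, 200, motif="GQAY", implant_rate=0.1)
# "Healthy" class has no implanted signal.
negative = simulate_repertoire(rng, 200)
print(sum("GQAY" in s for s in positive), sum("GQAY" in s for s in negative))
```
</preformat>
<p>Because the only difference between classes is the implanted motif, a classifier evaluated on such data is tested solely on its ability to detect that kind of signal, which is the limitation noted in the text.</p>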
<p>To alleviate the problem, we have to select appropriate methods for a given size of datasets, understand more about the limit of information that can be derived from a given data, and develop new methods that can integrate multiple datasets or work effectively even with small sizes of datasets.</p>
</sec>
</sec>
<sec id="s7" sec-type="discussion">
<title>Discussion</title>
<p>In this paper, we have surveyed ML applications to TCR repertoire analysis by following their development from simple statistical indices to DL, as summarized in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3</bold>
</xref>. For the reader&#x2019;s convenience, we provide a detailed comparison of the methods in <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>.</p>
<table-wrap id="T1" position="float">
<label>Table&#xa0;1</label>
<caption>
<p>Qualitative comparison of the methods reviewed in this article. In practice, both feature encoding methods and ML algorithms for specific tasks such as classification or regression are combined. As the choice of ML algorithms is usually arbitrary, this table is organized by the viewpoint of feature extraction.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" colspan="2" align="left">Methods</th>
<th valign="top" align="center">Core Idea</th>
<th valign="top" align="center">TCR-level encoding</th>
<th valign="top" align="center">Repertoire-level encoding</th>
<th valign="top" align="center">ML methods combined with</th>
<th valign="top" align="center">Strength</th>
<th valign="top" align="center">Weakness</th>
<th valign="top" align="center">Notable Examples</th>
<th valign="top" align="center">Relationship with other methods</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" rowspan="2" align="left">Distribution based models</td>
<td valign="top" align="left">Statistics (Diversity)</td>
<td valign="top" align="left">TCR diversity is related to healthiness and abnormality of immunological states. Diversity indices such as a rarity weighted count of TCR clonotypes can be used as basic parameters of the immunological state.</td>
<td valign="top" align="left">NA</td>
<td valign="top" align="left">A diversity index (a scalar value)</td>
<td valign="top" align="left">NA</td>
<td valign="top" align="left">Applicable to data with small sample size and/or small number of sequences.</td>
<td valign="top" align="left">Too simple and ignoring sequence information.</td>
<td valign="top" align="left">Greiff et al (<xref ref-type="bibr" rid="B67">67</xref>) used multiple diversity indices to create a repertoire-level feature vector.</td>
<td valign="top" align="left">NA</td>
</tr>
<tr>
<td valign="top" align="left">Distribution Shape</td>
<td valign="top" align="left">The distribution of the clonotype frequency is used to analyze the structure of clonotype diversity. By fitting the sample distribution by probabilistic models, characteristic parameters of the distribution are estimated.</td>
<td valign="top" align="left">NA</td>
<td valign="top" align="left">Model parameters of distributions</td>
<td valign="top" align="left">Probabilistic Model</td>
<td valign="top" align="left">Applicable to data with small sample size and/or small number of sequences. Flexibility of modeling.</td>
<td valign="top" align="left">Arbitrariness of modeling and ignoring sequence information.</td>
<td valign="top" align="left">Guindani et al (<xref ref-type="bibr" rid="B69">69</xref>) used a Bayesian model to infer the number of clonotypes in a sample from the distribution.</td>
<td valign="top" align="left">NA</td>
</tr>
<tr>
<td valign="top" rowspan="6" align="left">Sequence Information based methods</td>
<td valign="top" align="left">Hypothesis Test</td>
<td valign="top" align="left">The TCRs shared among the samples in a condition compared to others might be correlated with the condition. Such TCRs can be identified by hypothesis tests.</td>
<td valign="top" align="left">Significance of presence or absence of specific TCRs in a condition</td>
<td valign="top" align="left">A bool vector of the existence of the specific condition-related TCRs found by the hypothesis tests</td>
<td valign="top" align="left">Various Classifiers</td>
<td valign="top" align="left">Each TCR can be characterized by the relatedness to the conditions.</td>
<td valign="top" align="left">Ignoring most of the sequences.</td>
<td valign="top" align="left">Emerson et al (<xref ref-type="bibr" rid="B16">16</xref>) used a hypothesis-based method to find CMV-related TCRs and classify CMV infection based on the existence of such TCRs. Ritvo et al (<xref ref-type="bibr" rid="B75">75</xref>) proposed a method to find proliferated clusters using a hypothesis test.</td>
<td valign="top" align="left">To include similarity, hypothesis tests are combined with dissimilarity-based methods (ex. Glanville et al., <xref ref-type="bibr" rid="B63">63</xref>).</td>
</tr>
<tr>
<td valign="top" align="left">Dissimilarity</td>
<td valign="top" align="left">Similar TCRs may play a similar role in the body. Distance between TCRs can be used to detect and cluster the similar TCRs.</td>
<td valign="top" align="left">Relative distance from other sequences. Manifold learning is sometimes used to calculate absolute position of the TCR in the latent space</td>
<td valign="top" align="left">Density distribution on the latent space</td>
<td valign="top" align="left">Clustering Algorithms and Manifold Learning</td>
<td valign="top" align="left">Utilizing all sequences to characterize samples. Each TCR is characterized by the relative distance from the other TCRs.</td>
<td valign="top" align="left">Computational cost of pairwise alignment.</td>
<td valign="top" align="left">Dash et al (<xref ref-type="bibr" rid="B78">78</xref>) used a dissimilarity matrix and visualized the epitope-specific clusters by manifold learning. Yokota et al (<xref ref-type="bibr" rid="B79">79</xref>) quantified the distance of repertoires by creating the inter-sample dissimilarity matrix. Glanville et al (<xref ref-type="bibr" rid="B63">63</xref>) integrate various information into the dissimilarity calculation (ex. length of CDR3)</td>
<td valign="top" align="left">NA</td>
</tr>
<tr>
<td valign="top" align="left">Motif</td>
<td valign="top" align="left">Local patterns such as (k-mer) motifs in a TCR may be related to its function. Encoding TCRs by a vector of local features may be a good representation of TCRs.</td>
<td valign="top" align="left">Bag of k-mers. The Atchley vector is also used to encode the TCR into a denser vector.</td>
<td valign="top" align="left">Bag of k-mers or aggregation of TCR-level encoding</td>
<td valign="top" align="left">Various Classifiers / Regressors</td>
<td valign="top" align="left">Utilize all sequences to characterize samples. Applicable to data with small sample size and/or small number of sequences. Each TCR is directly characterized as a feature vector.</td>
<td valign="top" align="left">Low flexibility in modeling.</td>
<td valign="top" align="left">Sun et al (<xref ref-type="bibr" rid="B86">86</xref>) used a 3-mer feature vector of each CDR3 and SVM for a repertoire classification task. Ostmeyer et al (<xref ref-type="bibr" rid="B90">90</xref>) used a 4-mer vector further encoded by the Atchley vector, which represents the physicochemical nature of amino acids. Katayama et al (<xref ref-type="bibr" rid="B43">43</xref>) applied a 3-mer feature vector to repertoire classification tasks on small datasets.</td>
<td valign="top" align="left">Motifs are sometimes used for calculating dissimilarity (ex. Mayer-Blackwell et al., <xref ref-type="bibr" rid="B82">82</xref>).</td>
</tr>
<tr>
<td valign="top" align="left">Generative Models</td>
<td valign="top" align="left">The mechanisms of generation and selection of TCRs are the determinants of TCR repertoire. Their modelling provides additional information to the observed and not-observed repertoires.</td>
<td valign="top" align="left">NA</td>
<td valign="top" align="left">Model parameters of the generative models</td>
<td valign="top" align="left">Probabilistic Model and Manifold Learning</td>
<td valign="top" align="left">Utilizing all sequences to characterize samples. Applicable to data with small sample size and/or small number of sequences. Generation of pseudo data (for simulation, data augmentation, etc.)</td>
<td valign="top" align="left">Validity of assumptions in models.</td>
<td valign="top" align="left">Murugan et al (<xref ref-type="bibr" rid="B91">91</xref>) modelled the biological V(D)J recombination process and used unselected TCRs to fit the model. Elhanati et al (<xref ref-type="bibr" rid="B95">95</xref>) modelled the thymic selection of TCRs and combined Murugan's model to estimate the parameter of the selection process. Pogorelyy et al (<xref ref-type="bibr" rid="B96">96</xref>) proposed a method to quantify the abnormality of repertoire using the generation probability from a generative model.</td>
<td valign="top" align="left">NA</td>
</tr>
<tr>
<td valign="top" align="left">Deep Learning (DL)</td>
<td valign="top" align="left">Good representations of repertoires may be obtained by deep learning and may improve the performance of various repertoire analyses.</td>
<td valign="top" align="left">Various encoding based on VAE or language models (See embedding methods)</td>
<td valign="top" align="left">Inferred parameters of DL-based models</td>
<td valign="top" align="left">Generative Models and Embedding Methods</td>
<td valign="top" align="left">High flexibility in modeling. High performance if sufficient amount of data is provided.</td>
<td valign="top" align="left">Model is not explainable and requires a large amount of data.</td>
<td valign="top" align="left">Davidsen et al (<xref ref-type="bibr" rid="B107">107</xref>) proposed a VAE-based model to embed TCR sequences into the latent space. Widrich et al (<xref ref-type="bibr" rid="B102">102</xref>) proposed a Transformer-like model for a repertoire classification problem. Sidhom et al (<xref ref-type="bibr" rid="B110">110</xref>) used another VAE-based model to solve various regression/classification tasks.</td>
<td valign="top" align="left">Embedding Methods are closely related with DL.</td>
</tr>
<tr>
<td valign="top" align="left">Embedding Methods</td>
<td valign="top" align="left">Because TCR sequences are a collection of strings, encoding TCRs to fixed-length dense vectors using NLP may lead to efficient algorithms.</td>
<td valign="top" align="left">Sentence embedding</td>
<td valign="top" align="left">NA</td>
<td valign="top" align="left">Various Algorithms incl. DL</td>
<td valign="top" align="left">High flexibility in modeling. Applicable to data with small sample size and/or small number of sequences (after pre-training).</td>
<td valign="top" align="left">Model is not explainable.</td>
<td valign="top" align="left">Cheng et al (<xref ref-type="bibr" rid="B133">133</xref>) employed a pre-trained general protein language model for the peptide-MHC binding prediction task. Shuai et al (<xref ref-type="bibr" rid="B137">137</xref>) performed pre-training using the repertoire sequence dataset (BCR) and measured the performance on a single downstream task.</td>
<td valign="top" align="left">NA</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>NA stands for not applicable.</p>
</table-wrap-foot>
</table-wrap>
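<p>As a concrete illustration of the motif row of Table&#xa0;1, the following sketch encodes a single sequence as a bag of 3-mers and aggregates the counts into a repertoire-level profile. It is a minimal example under stated assumptions: the function names and toy sequences are hypothetical, and the cited methods differ in details such as weighting by clone size or further encoding with Atchley factors.</p>
<preformat>
```python
from collections import Counter


def kmer_counts(sequence, k=3):
    """TCR-level encoding: count every overlapping k-mer in one sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def repertoire_kmer_profile(sequences, k=3):
    """Repertoire-level encoding: pooled k-mer counts normalized to frequencies."""
    total = Counter()
    for sequence in sequences:
        total.update(kmer_counts(sequence, k))
    n_kmers = sum(total.values())
    if n_kmers == 0:
        return {}
    return {kmer: count / n_kmers for kmer, count in total.items()}


# Toy repertoire of two short sequences (made up for illustration).
profile = repertoire_kmer_profile(["CASSL", "CASSF"])
print(profile)
```
</preformat>
<p>Mapping the profile onto a fixed ordering of all possible k-mers gives a fixed-length vector that standard classifiers or regressors can consume.</p>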
<p>Finally, we discuss the remaining technological challenges and outline the future directions in the development of TCR repertoire analysis. In particular, we focus on two topics, the small sample problem and the multimodal data integration.</p>
<p>The small sample problem of repertoire datasets we reviewed in the previous section is one of the most important problems that should be resolved in repertoire analysis. As mentioned earlier, the cost of large datasets will likely remain high. Thus, we need to address the problem by devising new analysis methods that can work on smaller but practical datasets. We have at least three representative approaches to achieve this goal. First, as we reviewed in the previous section, simulations can be used to create datasets. We expect more simulation software releases in the future. In this section, we discuss the other two approaches further.</p>
<p>Another possible direction is to utilize multiple datasets to solve a task. Two DL-based techniques that we mentioned earlier will play an important role to this end. The first is transfer learning. In transfer learning, we prepare a DL model that has already learned a good feature representation by training on a large unlabeled corpus, and then utilize it for feature extraction in the target task (<xref ref-type="bibr" rid="B179">179</xref>). This technique improves the performance of the target task, especially when the dataset for the target task is small. Similarly, representation learning is important. Good representations of repertoires may be learned from large amounts of unlabeled repertoire datasets. If such good representations are learned, classification of individual diseases, for example, may become possible with high performance even if only small amounts of data are available for the target diseases. As we saw earlier, this direction was already investigated using VAE (<xref ref-type="bibr" rid="B110">110</xref>). However, the size of the model is far smaller than those used in NLP, and the universality of the representation has not yet been discussed. Moreover, there is no standard task in repertoire analysis, in contrast to NLP. Therefore, the models are not evaluated in terms of which downstream tasks they can be applied to <italic>via</italic> transfer learning. Recently, attempts to utilize large language models in repertoire analysis have appeared (<xref ref-type="bibr" rid="B133">133</xref>&#x2013;<xref ref-type="bibr" rid="B137">137</xref>, <xref ref-type="bibr" rid="B180">180</xref>). In AntiBERTa (<xref ref-type="bibr" rid="B136">136</xref>), fine-tuning for a downstream task is also investigated. Currently, these methods are still in development. For them to be more widely used, we need to further investigate the transferability of the learned models and representations. In particular, we believe that language models merit further exploration. Language models are still being improved in NLP, with larger models being pursued. The application of these language models in repertoire analysis remains to be investigated.</p>
<p>The other approach is to combine multiple models to exploit more information in repertoire datasets. The hypothesis-test-based methods tend to make predictions based on a tiny subset of specific TCRs, especially public TCRs, and ignore most of the other TCRs in the dataset. In other words, these methods are based on exact matches. This is in contrast to a certain class of motif-based or deep-learning-based methods that exploit all the sequences in a sample by encoding them with a fixed-length feature vector. In other words, these methods are based on fuzzy matches. Indeed, our group compared these two types of methods and revealed that they provide different prediction profiles (<xref ref-type="bibr" rid="B43">43</xref>). Two fuzzy-match-based methods yielded similar predictions. This is intriguing because the two methods are based on completely different techniques (k-mer encoding on repertoire level + GBDT vs. deep-learning-based feature encoding + attention mechanism). On the other hand, a hypothesis-test-based method yielded very different predictions. This result suggests that these methods may utilize different information and that ensembling these approaches may result in better performance on smaller datasets.</p>
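<p>One simple way to ensemble models with different prediction profiles is to average their per-sample probabilities. The sketch below shows this under stated assumptions: weighted averaging is only one of several ensembling strategies and is not claimed to be the scheme evaluated in the cited comparison; the function name and the probability values are made up.</p>
<preformat>
```python
def ensemble_average(*prediction_lists, weights=None):
    """Weighted average of per-sample probabilities from several models.

    Each positional argument is a list of probabilities, one per sample,
    in the same sample order. Weights default to a uniform average.
    """
    n_models = len(prediction_lists)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    return [
        sum(w * p for w, p in zip(weights, sample_predictions))
        for sample_predictions in zip(*prediction_lists)
    ]


# Hypothetical predicted probabilities of "infected" for three samples.
fuzzy_match_model = [0.9, 0.2, 0.6]   # e.g., a k-mer based classifier
exact_match_model = [0.5, 0.0, 1.0]   # e.g., a hypothesis-test based classifier

combined = ensemble_average(fuzzy_match_model, exact_match_model)
print(combined)
```
</preformat>
<p>Averaging helps most when the models make different errors, which is what differing prediction profiles would suggest; the weights could be tuned on held-out samples when enough data are available.</p>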
<p>While repertoire data may still contain information that can be exploited further, a T cell population cannot be characterized solely by the sequence information of the TCR repertoire. We cannot predict every property of a TCR from sequence information alone, and important information is missing. For example, T-cell subpopulations cannot be determined from sequence data alone. Therefore, the integration of multimodal information is a promising direction for further repertoire analysis. Most of the methods reviewed in this paper do not employ information other than TCR sequences, except one that integrates the physicochemical properties of amino acids with repertoire datasets (<xref ref-type="bibr" rid="B90">90</xref>). Many more data sources could be incorporated into the analysis of repertoire datasets. Multi-omics analysis has recently been explored (<xref ref-type="bibr" rid="B181">181</xref>, <xref ref-type="bibr" rid="B182">182</xref>). The multi-omics approach is usually used with single-cell sequencing to connect multiple data types at the single-cell level. Such multi-omics data are not yet widely used, but some interesting findings have been reported. For example, single-cell analysis of RNA-seq and CDR3 revealed a correlation between gene expression and frequent CDR3 sequences (<xref ref-type="bibr" rid="B182">182</xref>). Another source is 3D structure prediction, because the function of a TCR is determined by its binding affinity to antigens. A recent paper (<xref ref-type="bibr" rid="B183">183</xref>) encodes a BCR sequence into a feature vector using the estimated 3D structure of the B cell receptor. Another paper (<xref ref-type="bibr" rid="B184">184</xref>) uses 3D structure information to predict peptides that bind well to a given pair of TCR and MHC. In that paper, a binding score matrix between peptide residues and TCR residues is learned from existing TCR-pMHC structures. The matrix is then used to compute possible alternative peptides for the TCR and MHC.</p>
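<p>The idea of scoring candidate peptides with a residue-pair score matrix can be pictured with a short Python sketch. The exact procedure of (<xref ref-type="bibr" rid="B184">184</xref>) is not reproduced here: the additive scoring over a fixed list of contact positions, the dictionary representation of the matrix, and all function and parameter names are assumptions made for illustration only.</p>
<preformat>
```python
def score_peptide(peptide, tcr_cdr3, contacts, pair_scores):
    """Sum residue-pair scores over assumed peptide-TCR contact positions.

    contacts    : list of (peptide_index, tcr_index) pairs, assumed known
    pair_scores : dict mapping (peptide_residue, tcr_residue) to a score
    """
    total = 0.0
    for pep_pos, tcr_pos in contacts:
        pair = (peptide[pep_pos], tcr_cdr3[tcr_pos])
        total += pair_scores.get(pair, 0.0)  # unseen pairs contribute nothing
    return total


def rank_peptides(candidates, tcr_cdr3, contacts, pair_scores):
    """Return the candidate peptides sorted from highest to lowest score."""
    return sorted(
        candidates,
        key=lambda pep: score_peptide(pep, tcr_cdr3, contacts, pair_scores),
        reverse=True,
    )
```
</preformat>
<p>In this simplified picture, the score matrix would be estimated once from solved TCR-pMHC structures and then reused to rank any set of candidate peptides for a given TCR, which is what makes the search for alternative peptides inexpensive.</p>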
<p>In this direction, hand-crafted models, which exploit specific information based on human understanding, can effectively complement data-driven DL models. Given that AlphaFold2 (<xref ref-type="bibr" rid="B116">116</xref>) was realized by combining a feature extraction method and a loss function grounded in chemical insights, it would be promising to unite hand-crafted models with data-driven ones and to integrate multimodal data in repertoire analysis.</p>
</sec>
<sec id="s8" sec-type="author-contributions">
<title>Author Contributions</title>
<p>YK and RY: writing of the manuscript. TA and TK: writing of the manuscript and supervision. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="s9" sec-type="funding-information">
<title>Funding</title>
<p>This research is supported by JSPS KAKENHI Grant Numbers 19H05799, 19K20408, and 20H03441, and by JST CREST Grant Number JPMJCR2011.</p>
</sec>
<sec id="s10" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s11" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>We thank the members of our lab for fruitful discussions. <xref ref-type="fig" rid="f1">
<bold>Figures&#xa0;1</bold>
</xref>&#x2013;<xref ref-type="fig" rid="f3">
<bold>3</bold>
</xref> were created with <uri xlink:href="https://www.biorender.com">Biorender.com</uri>.</p>
</ack>
<fn-group>
<fn id="fn1">
<label>1</label>
<p>
<uri xlink:href="http://www.imgt.org/">http://www.imgt.org/</uri>.</p>
</fn>
<fn id="fn2">
<label>2</label>
<p>
<uri xlink:href="https://www.ncbi.nlm.nih.gov/sra/">https://www.ncbi.nlm.nih.gov/sra/</uri>.</p>
</fn>
<fn id="fn3">
<label>3</label>
<p>
<uri xlink:href="https://www.insdc.org/">https://www.insdc.org/</uri>.</p>
</fn>
<fn id="fn4">
<label>4</label>
<p>
<uri xlink:href="https://github.com/ncbi/sra-tools">https://github.com/ncbi/sra-tools</uri>.</p>
</fn>
<fn id="fn5">
<label>5</label>
<p>
<uri xlink:href="https://vdjserver.org/">https://vdjserver.org/</uri>.</p>
</fn>
<fn id="fn6">
<label>6</label>
<p>
<uri xlink:href="https://clients.adaptivebiotech.com/immuneaccess">https://clients.adaptivebiotech.com/immuneaccess</uri>.</p>
</fn>
<fn id="fn7">
<label>7</label>
<p>
<uri xlink:href="https://gateway.ireceptor.org/">https://gateway.ireceptor.org/</uri>.</p>
</fn>
<fn id="fn8">
<label>8</label>
<p>
<uri xlink:href="http://bioinfo.life.hust.edu.cn/TCRdb/#/">http://bioinfo.life.hust.edu.cn/TCRdb/#/</uri>.</p>
</fn>
<fn id="fn9">
<label>9</label>
<p>
<uri xlink:href="https://github.com/uio-bmi/compairr">https://github.com/uio-bmi/compairr</uri>.</p>
</fn>
<fn id="fn10">
<label>10</label>
<p>
<uri xlink:href="https://github.com/GreiffLab/immuneREF">https://github.com/GreiffLab/immuneREF</uri>.</p>
</fn>
<fn id="fn11">
<label>11</label>
<p>
<uri xlink:href="https://immunarch.com/">https://immunarch.com/</uri>.</p>
</fn>
<fn id="fn12">
<label>12</label>
<p>
<uri xlink:href="https://clients.adaptivebiotech.com/pub/covid-2020">https://clients.adaptivebiotech.com/pub/covid-2020</uri>.</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname> <given-names>BV</given-names>
</name>
<name>
<surname>Connors</surname> <given-names>TJ</given-names>
</name>
<name>
<surname>Farber</surname> <given-names>DL</given-names>
</name>
</person-group>. <article-title>Human T Cell Development, Localization, and Function Throughout Life</article-title>. <source>Immunity</source> (<year>2018</year>) <volume>48</volume>:<page-range>202&#x2013;13</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.immuni.2018.01.007</pub-id>
</citation>
</ref>
<ref id="B2">
<label>2</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nikolich-&#x17d;ugich</surname> <given-names>J</given-names>
</name>
<name>
<surname>Slifka</surname> <given-names>MK</given-names>
</name>
<name>
<surname>Messaoudi</surname> <given-names>I</given-names>
</name>
</person-group>. <article-title>The Many Important Facets of T-Cell Repertoire Diversity</article-title>. <source>Nat Rev Immunol</source> (<year>2004</year>) <volume>4</volume>:<page-range>123&#x2013;32</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nri1292</pub-id>
</citation>
</ref>
<ref id="B3">
<label>3</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Miho</surname> <given-names>E</given-names>
</name>
<name>
<surname>Yermanos</surname> <given-names>A</given-names>
</name>
<name>
<surname>Weber</surname> <given-names>CR</given-names>
</name>
<name>
<surname>Berger</surname> <given-names>CT</given-names>
</name>
<name>
<surname>Reddy</surname> <given-names>ST</given-names>
</name>
<name>
<surname>Greiff</surname> <given-names>V</given-names>
</name>
</person-group>. <article-title>Computational Strategies for Dissecting the High-Dimensional Complexity of Adaptive Immune Repertoires</article-title>. <source>Front Immunol</source> (<year>2018</year>) <volume>9</volume>:<elocation-id>224</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2018.00224</pub-id>
</citation>
</ref>
<ref id="B4">
<label>4</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>De Simone</surname> <given-names>M</given-names>
</name>
<name>
<surname>Rossetti</surname> <given-names>G</given-names>
</name>
<name>
<surname>Pagani</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>Single Cell T Cell Receptor Sequencing: Techniques and Future Challenges</article-title>. <source>Front Immunol</source> (<year>2018</year>) <volume>9</volume>:<elocation-id>1638</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2018.01638</pub-id>
</citation>
</ref>
<ref id="B5">
<label>5</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ott</surname> <given-names>PA</given-names>
</name>
<name>
<surname>Hu</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Keskin</surname> <given-names>DB</given-names>
</name>
<name>
<surname>Shukla</surname> <given-names>SA</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>J</given-names>
</name>
<name>
<surname>Bozym</surname> <given-names>DJ</given-names>
</name>
<etal/>
</person-group>. <article-title>An Immunogenic Personal Neoantigen Vaccine for Patients With Melanoma</article-title>. <source>Nature</source> (<year>2017</year>) <volume>547</volume>:<page-range>217&#x2013;21</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nature22991</pub-id>
</citation>
</ref>
<ref id="B6">
<label>6</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gittelman</surname> <given-names>RM</given-names>
</name>
<name>
<surname>Lavezzo</surname> <given-names>E</given-names>
</name>
<name>
<surname>Snyder</surname> <given-names>TM</given-names>
</name>
<name>
<surname>Zahid</surname> <given-names>HJ</given-names>
</name>
<name>
<surname>Elyanow</surname> <given-names>R</given-names>
</name>
<name>
<surname>Dalai</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Diagnosis and Tracking of Past SARS-CoV-2 Infection in a Large Study of Vo&#x2019;, Italy Through T-Cell Receptor Sequencing [Preprint]</article-title>. <source>medRxiv</source> (<year>2020</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.1101/2020.11.09.20228023</pub-id>
</citation>
</ref>
<ref id="B7">
<label>7</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schuldt</surname> <given-names>NJ</given-names>
</name>
<name>
<surname>Binstadt</surname> <given-names>BA</given-names>
</name>
</person-group>. <article-title>Dual TCR T Cells: Identity Crisis or Multitaskers</article-title>? <source>J Immunol</source> (<year>2019</year>) <volume>202</volume>:<page-range>637&#x2013;44</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.4049/jimmunol.1800904</pub-id>
</citation>
</ref>
<ref id="B8">
<label>8</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rock</surname> <given-names>KL</given-names>
</name>
<name>
<surname>Reits</surname> <given-names>E</given-names>
</name>
<name>
<surname>Neefjes</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>Present Yourself! By MHC Class I and MHC Class II Molecules</article-title>. <source>Trends Immunol</source> (<year>2016</year>) <volume>37</volume>:<page-range>724&#x2013;37</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.it.2016.08.010</pub-id>
</citation>
</ref>
<ref id="B9">
<label>9</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garcia</surname> <given-names>KC</given-names>
</name>
<name>
<surname>Adams</surname> <given-names>EJ</given-names>
</name>
</person-group>. <article-title>How the T Cell Receptor Sees Antigen&#x2014;A Structural View</article-title>. <source>Cell</source> (<year>2005</year>) <volume>122</volume>:<page-range>333&#x2013;6</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.cell.2005.07.015</pub-id>
</citation>
</ref>
<ref id="B10">
<label>10</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klein</surname> <given-names>L</given-names>
</name>
<name>
<surname>Kyewski</surname> <given-names>B</given-names>
</name>
<name>
<surname>Allen</surname> <given-names>PM</given-names>
</name>
<name>
<surname>Hogquist</surname> <given-names>KA</given-names>
</name>
</person-group>. <article-title>Positive and Negative Selection of the T Cell Repertoire: What Thymocytes See (and Don&#x2019;t See)</article-title>. <source>Nat Rev Immunol</source> (<year>2014</year>) <volume>14</volume>:<page-range>377&#x2013;91</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nri3667</pub-id>
</citation>
</ref>
<ref id="B11">
<label>11</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Van Laethem</surname> <given-names>F</given-names>
</name>
<name>
<surname>Tikhonova</surname> <given-names>AN</given-names>
</name>
<name>
<surname>Singer</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>MHC Restriction is Imposed on a Diverse T Cell Receptor Repertoire by CD4 and CD8 Co-Receptors During Thymic Selection</article-title>. <source>Trends Immunol</source> (<year>2012</year>) <volume>33</volume>:<page-range>437&#x2013;41</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.it.2012.05.006</pub-id>
</citation>
</ref>
<ref id="B12">
<label>12</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>La Gruta</surname> <given-names>NL</given-names>
</name>
<name>
<surname>Gras</surname> <given-names>S</given-names>
</name>
<name>
<surname>Daley</surname> <given-names>SR</given-names>
</name>
<name>
<surname>Thomas</surname> <given-names>PG</given-names>
</name>
<name>
<surname>Rossjohn</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>Understanding the Drivers of MHC Restriction of T Cell Receptors</article-title>. <source>Nat Rev Immunol</source> (<year>2018</year>) <volume>18</volume>:<page-range>467&#x2013;78</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41577-018-0007-5</pub-id>
</citation>
</ref>
<ref id="B13">
<label>13</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sewell</surname> <given-names>AK</given-names>
</name>
</person-group>. <article-title>Why Must T Cells be Cross-Reactive</article-title>? <source>Nat Rev Immunol</source> (<year>2012</year>) <volume>12</volume>:<page-range>669&#x2013;77</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nri3279</pub-id>
</citation>
</ref>
<ref id="B14">
<label>14</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>ElTanbouly</surname> <given-names>MA</given-names>
</name>
<name>
<surname>Noelle</surname> <given-names>RJ</given-names>
</name>
</person-group>. <article-title>Rethinking Peripheral T Cell Tolerance: Checkpoints Across a T Cell&#x2019;s Journey</article-title>. <source>Nat Rev Immunol</source> (<year>2021</year>) <volume>21</volume>:<page-range>257&#x2013;67</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41577-020-00454-2</pub-id>
</citation>
</ref>
<ref id="B15">
<label>15</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Farber</surname> <given-names>DL</given-names>
</name>
<name>
<surname>Yudanin</surname> <given-names>NA</given-names>
</name>
<name>
<surname>Restifo</surname> <given-names>NP</given-names>
</name>
</person-group>. <article-title>Human Memory T Cells: Generation, Compartmentalization and Homeostasis</article-title>. <source>Nat Rev Immunol</source> (<year>2014</year>) <volume>14</volume>:<fpage>24</fpage>&#x2013;<lpage>35</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nri3567</pub-id>
</citation>
</ref>
<ref id="B16">
<label>16</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Emerson</surname> <given-names>RO</given-names>
</name>
<name>
<surname>DeWitt</surname> <given-names>WS</given-names>
</name>
<name>
<surname>Vignali</surname> <given-names>M</given-names>
</name>
<name>
<surname>Gravley</surname> <given-names>J</given-names>
</name>
<name>
<surname>Hu</surname> <given-names>JK</given-names>
</name>
<name>
<surname>Osborne</surname> <given-names>EJ</given-names>
</name>
<etal/>
</person-group>. <article-title>Immunosequencing Identifies Signatures of Cytomegalovirus Exposure History and HLA-Mediated Effects on the T Cell Repertoire</article-title>. <source>Nat Genet</source> (<year>2017</year>) <volume>49</volume>:<page-range>659&#x2013;65</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/ng.3822</pub-id>
</citation>
</ref>
<ref id="B17">
<label>17</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zvyagin</surname> <given-names>IV</given-names>
</name>
<name>
<surname>Pogorelyy</surname> <given-names>MV</given-names>
</name>
<name>
<surname>Ivanova</surname> <given-names>ME</given-names>
</name>
<name>
<surname>Komech</surname> <given-names>EA</given-names>
</name>
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
<name>
<surname>Bolotin</surname> <given-names>DA</given-names>
</name>
<etal/>
</person-group>. <article-title>Distinctive Properties of Identical Twins&#x2019; TCR Repertoires Revealed by High-Throughput Sequencing</article-title>. <source>Proc Natl Acad Sci</source> (<year>2014</year>) <volume>111</volume>:<page-range>5980&#x2013;5</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.1319389111</pub-id>
</citation>
</ref>
<ref id="B18">
<label>18</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zanelli</surname> <given-names>E</given-names>
</name>
<name>
<surname>Breedveld</surname> <given-names>FC</given-names>
</name>
<name>
<surname>de Vries</surname> <given-names>RRP</given-names>
</name>
</person-group>. <article-title>HLA Association With Autoimmune Disease: A Failure to Protect</article-title>? <source>Rheumatology</source> (<year>2000</year>) <volume>39</volume>:<page-range>1060&#x2013;6</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/rheumatology/39.10.1060</pub-id>
</citation>
</ref>
<ref id="B19">
<label>19</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Slabodkin</surname> <given-names>A</given-names>
</name>
<name>
<surname>Chernigovskaya</surname> <given-names>M</given-names>
</name>
<name>
<surname>Mikocziova</surname> <given-names>I</given-names>
</name>
<name>
<surname>Akbar</surname> <given-names>R</given-names>
</name>
<name>
<surname>Scheffer</surname> <given-names>L</given-names>
</name>
<name>
<surname>Pavlovi&#x107;</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>Individualized VDJ Recombination Predisposes the Available Ig Sequence Space</article-title>. <source>Genome Res</source> (<year>2021</year>) <volume>31</volume>:<page-range>2209&#x2013;24</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1101/gr.275373.121</pub-id>
</citation>
</ref>
<ref id="B20">
<label>20</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ohlin</surname> <given-names>M</given-names>
</name>
<name>
<surname>Scheepers</surname> <given-names>C</given-names>
</name>
<name>
<surname>Corcoran</surname> <given-names>M</given-names>
</name>
<name>
<surname>Lees</surname> <given-names>WD</given-names>
</name>
<name>
<surname>Busse</surname> <given-names>CE</given-names>
</name>
<name>
<surname>Bagnara</surname> <given-names>D</given-names>
</name>
<etal/>
</person-group>. <article-title>Inferred Allelic Variants of Immunoglobulin Receptor Genes: A System for Their Evaluation, Documentation, and Naming</article-title>. <source>Front Immunol</source> (<year>2019</year>) <volume>10</volume>:<elocation-id>435</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2019.00435</pub-id>
</citation>
</ref>
<ref id="B21">
<label>21</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Omer</surname> <given-names>A</given-names>
</name>
<name>
<surname>Shemesh</surname> <given-names>O</given-names>
</name>
<name>
<surname>Peres</surname> <given-names>A</given-names>
</name>
<name>
<surname>Polak</surname> <given-names>P</given-names>
</name>
<name>
<surname>Shepherd</surname> <given-names>AJ</given-names>
</name>
<name>
<surname>Watson</surname> <given-names>CT</given-names>
</name>
<etal/>
</person-group>. <article-title>VDJbase: An Adaptive Immune Receptor Genotype and Haplotype Database</article-title>. <source>Nucleic Acids Res</source> (<year>2019</year>) <volume>48</volume>:<page-range>D1051&#x2013;6</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/nar/gkz872</pub-id>
</citation>
</ref>
<ref id="B22">
<label>22</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gras</surname> <given-names>S</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Miles</surname> <given-names>JJ</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>YC</given-names>
</name>
<name>
<surname>Bell</surname> <given-names>MJ</given-names>
</name>
<name>
<surname>Sullivan</surname> <given-names>LC</given-names>
</name>
<etal/>
</person-group>. <article-title>Allelic Polymorphism in the T Cell Receptor and Its Impact on Immune Responses</article-title>. <source>J Exp Med</source> (<year>2010</year>) <volume>207</volume>:<page-range>1555&#x2013;67</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1084/jem.20100603</pub-id>
</citation>
</ref>
<ref id="B23">
<label>23</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Omer</surname> <given-names>A</given-names>
</name>
<name>
<surname>Peres</surname> <given-names>A</given-names>
</name>
<name>
<surname>Rodriguez</surname> <given-names>OL</given-names>
</name>
<name>
<surname>Watson</surname> <given-names>CT</given-names>
</name>
<name>
<surname>Lees</surname> <given-names>W</given-names>
</name>
<name>
<surname>Polak</surname> <given-names>P</given-names>
</name>
<etal/>
</person-group>. <article-title>T Cell Receptor Beta Germline Variability Is Revealed by Inference From Repertoire Data</article-title>. <source>Genome Med</source> (<year>2022</year>) <volume>14</volume>:<elocation-id>2</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s13073-021-01008-4</pub-id>
</citation>
</ref>
<ref id="B24">
<label>24</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dupic</surname> <given-names>T</given-names>
</name>
<name>
<surname>Bensouda Koraichi</surname> <given-names>M</given-names>
</name>
<name>
<surname>Minervina</surname> <given-names>AA</given-names>
</name>
<name>
<surname>Pogorelyy</surname> <given-names>MV</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
<name>
<surname>Walczak</surname> <given-names>AM</given-names>
</name>
</person-group>. <article-title>Immune Fingerprinting Through Repertoire Similarity</article-title>. <source>PLoS Genet</source> (<year>2021</year>) <volume>17</volume>:<fpage>1</fpage>&#x2013;<lpage>16</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pgen.1009301</pub-id>
</citation>
</ref>
<ref id="B25">
<label>25</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nikolich-&#x17d;ugich</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>The Twilight of Immunity: Emerging Concepts in Aging of the Immune System</article-title>. <source>Nat Immunol</source> (<year>2018</year>) <volume>19</volume>:<page-range>10&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41590-017-0006-x</pub-id>
</citation>
</ref>
<ref id="B26">
<label>26</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aiello</surname> <given-names>A</given-names>
</name>
<name>
<surname>Farzaneh</surname> <given-names>F</given-names>
</name>
<name>
<surname>Candore</surname> <given-names>G</given-names>
</name>
<name>
<surname>Caruso</surname> <given-names>C</given-names>
</name>
<name>
<surname>Davinelli</surname> <given-names>S</given-names>
</name>
<name>
<surname>Gambino</surname> <given-names>CM</given-names>
</name>
<etal/>
</person-group>. <article-title>Immunosenescence and Its Hallmarks: How to Oppose Aging Strategically? A Review of Potential Options for Therapeutic Intervention</article-title>. <source>Front Immunol</source> (<year>2019</year>) <volume>10</volume>:<elocation-id>2247</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2019.02247</pub-id>
</citation>
</ref>
<ref id="B27">
<label>27</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pawelec</surname> <given-names>G</given-names>
</name>
</person-group>. <article-title>Hallmarks of Human &#x201c;Immunosenescence&#x201d;: Adaptation or Dysregulation</article-title>? <source>Immun Ageing</source> (<year>2012</year>) <volume>9</volume>:<elocation-id>15</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/1742-4933-9-15</pub-id>
</citation>
</ref>
<ref id="B28">
<label>28</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Palmer</surname> <given-names>D</given-names>
</name>
</person-group>. <article-title>The Effect of Age on Thymic Function</article-title>. <source>Front Immunol</source> (<year>2013</year>) <volume>4</volume>:<elocation-id>316</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2013.00316</pub-id>
</citation>
</ref>
<ref id="B29">
<label>29</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bolotin</surname> <given-names>DA</given-names>
</name>
<name>
<surname>Mamedov</surname> <given-names>IZ</given-names>
</name>
<name>
<surname>Britanova</surname> <given-names>OV</given-names>
</name>
<name>
<surname>Zvyagin</surname> <given-names>IV</given-names>
</name>
<name>
<surname>Shagin</surname> <given-names>D</given-names>
</name>
<name>
<surname>Ustyugova</surname> <given-names>SV</given-names>
</name>
<etal/>
</person-group>. <article-title>Next Generation Sequencing for TCR Repertoire Profiling: Platform-Specific Features and Correction Algorithms</article-title>. <source>Eur J Immunol</source> (<year>2012</year>) <volume>42</volume>:<page-range>3073&#x2013;83</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1002/eji.201242517</pub-id>
</citation>
</ref>
<ref id="B30">
<label>30</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rosati</surname> <given-names>E</given-names>
</name>
<name>
<surname>Dowds</surname> <given-names>CM</given-names>
</name>
<name>
<surname>Liaskou</surname> <given-names>E</given-names>
</name>
<name>
<surname>Henriksen</surname> <given-names>EKK</given-names>
</name>
<name>
<surname>Karlsen</surname> <given-names>TH</given-names>
</name>
<name>
<surname>Franke</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>Overview of Methodologies for T-Cell Receptor Repertoire Analysis</article-title>. <source>BMC Biotechnol</source> (<year>2017</year>) <volume>17</volume>:<fpage>61</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s12896-017-0379-9</pub-id>
</citation>
</ref>
<ref id="B31">
<label>31</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Valkiers</surname> <given-names>S</given-names>
</name>
<name>
<surname>de Vrij</surname> <given-names>N</given-names>
</name>
<name>
<surname>Gielis</surname> <given-names>S</given-names>
</name>
<name>
<surname>Verbandt</surname> <given-names>S</given-names>
</name>
<name>
<surname>Ogunjimi</surname> <given-names>B</given-names>
</name>
<name>
<surname>Laukens</surname> <given-names>K</given-names>
</name>
<etal/>
</person-group>. <article-title>Recent Advances in T-Cell Receptor Repertoire Analysis: Bridging the Gap With Multimodal Single-Cell RNA Sequencing</article-title>. <source>ImmunoInformatics</source> (<year>2022</year>) <volume>5</volume>:<elocation-id>100009</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.immuno.2022.100009</pub-id>
</citation>
</ref>
<ref id="B32">
<label>32</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname> <given-names>ES</given-names>
</name>
<name>
<surname>Thomas</surname> <given-names>PG</given-names>
</name>
<name>
<surname>Mold</surname> <given-names>JE</given-names>
</name>
<name>
<surname>Yates</surname> <given-names>AJ</given-names>
</name>
</person-group>. <article-title>Identifying T Cell Receptors From High-Throughput Sequencing: Dealing With Promiscuity in TCR<italic>&#x3b1;</italic> and TCR<italic>&#x3b2;</italic> Pairing</article-title>. <source>PLoS Comput Biol</source> (<year>2017</year>) <volume>13</volume>:<fpage>1</fpage>&#x2013;<lpage>25</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pcbi.1005313</pub-id>
</citation>
</ref>
<ref id="B33">
<label>33</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Balakrishnan</surname> <given-names>A</given-names>
</name>
<name>
<surname>Gloude</surname> <given-names>N</given-names>
</name>
<name>
<surname>Sasik</surname> <given-names>R</given-names>
</name>
<name>
<surname>Ball</surname> <given-names>ED</given-names>
</name>
<name>
<surname>Morris</surname> <given-names>GP</given-names>
</name>
</person-group>. <article-title>Proinflammatory Dual Receptor T Cells in Chronic Graft-Versus-Host Disease</article-title>. <source>Biol Blood Marrow Transplant</source> (<year>2017</year>) <volume>23</volume>:<page-range>1852&#x2013;60</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.bbmt.2017.07.016</pub-id>
</citation>
</ref>
<ref id="B34">
<label>34</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hosoya</surname> <given-names>T</given-names>
</name>
<name>
<surname>Li</surname> <given-names>H</given-names>
</name>
<name>
<surname>Ku</surname> <given-names>CJ</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Guan</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Engel</surname> <given-names>JD</given-names>
</name>
</person-group>. <article-title>High-Throughput Single-Cell Sequencing of Both TCR-<italic>&#x3b2;</italic> Alleles</article-title>. <source>J Immunol</source> (<year>2018</year>) <volume>201</volume>:<page-range>3465&#x2013;70</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.4049/jimmunol.1800774</pub-id>
</citation>
</ref>
<ref id="B35">
<label>35</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carter</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Preall</surname> <given-names>JB</given-names>
</name>
<name>
<surname>Atwal</surname> <given-names>GS</given-names>
</name>
</person-group>. <article-title>Bayesian Inference of Allelic Inclusion Rates in the Human T Cell Receptor Repertoire</article-title>. <source>Cell Syst</source> (<year>2019</year>) <volume>9</volume>:<fpage>475</fpage>&#x2013;<lpage>482.e4</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.cels.2019.09.006</pub-id>
</citation>
</ref>
<ref id="B36">
<label>36</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname> <given-names>L</given-names>
</name>
<name>
<surname>Jama</surname> <given-names>B</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>H</given-names>
</name>
<name>
<surname>Labarta-Bajo</surname> <given-names>L</given-names>
</name>
<name>
<surname>Z&#xfa;&#xf1;iga</surname> <given-names>EI</given-names>
</name>
<name>
<surname>Morris</surname> <given-names>GP</given-names>
</name>
</person-group>. <article-title>TCR&#x3b1; Reporter Mice Reveal Contribution of Dual TCR&#x3b1; Expression to T Cell Repertoire and Function</article-title>. <source>Proc Natl Acad Sci</source> (<year>2020</year>) <volume>117</volume>:<page-range>32574&#x2013;83</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.2013188117</pub-id>
</citation>
</ref>
<ref id="B37">
<label>37</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tr&#xfc;ck</surname> <given-names>J</given-names>
</name>
<name>
<surname>Eugster</surname> <given-names>A</given-names>
</name>
<name>
<surname>Barennes</surname> <given-names>P</given-names>
</name>
<name>
<surname>Tipton</surname> <given-names>CM</given-names>
</name>
<name>
<surname>Luning Prak</surname> <given-names>ET</given-names>
</name>
<name>
<surname>Bagnara</surname> <given-names>D</given-names>
</name>
<etal/>
</person-group>. <article-title>Biological Controls for Standardization and Interpretation of Adaptive Immune Receptor Repertoire Profiling</article-title>. <source>eLife</source> (<year>2021</year>) <volume>10</volume>:<fpage>e66274</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.7554/eLife.66274</pub-id>
</citation>
</ref>
<ref id="B38">
<label>38</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nguyen</surname> <given-names>P</given-names>
</name>
<name>
<surname>Ma</surname> <given-names>J</given-names>
</name>
<name>
<surname>Pei</surname> <given-names>D</given-names>
</name>
<name>
<surname>Obert</surname> <given-names>C</given-names>
</name>
<name>
<surname>Cheng</surname> <given-names>C</given-names>
</name>
<name>
<surname>Geiger</surname> <given-names>TL</given-names>
</name>
</person-group>. <article-title>Identification of Errors Introduced During High Throughput Sequencing of the T Cell Receptor Repertoire</article-title>. <source>BMC Genomics</source> (<year>2011</year>) <volume>12</volume>:<elocation-id>106</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/1471-2164-12-106</pub-id>
</citation>
</ref>
<ref id="B39">
<label>39</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rouet</surname> <given-names>R</given-names>
</name>
<name>
<surname>Jackson</surname> <given-names>KJL</given-names>
</name>
<name>
<surname>Langley</surname> <given-names>DB</given-names>
</name>
<name>
<surname>Christ</surname> <given-names>D</given-names>
</name>
</person-group>. <article-title>Next-Generation Sequencing of Antibody Display Repertoires</article-title>. <source>Front Immunol</source> (<year>2018</year>) <volume>9</volume>:<elocation-id>118</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2018.00118</pub-id>
</citation>
</ref>
<ref id="B40">
<label>40</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gerritsen</surname> <given-names>B</given-names>
</name>
<name>
<surname>Pandit</surname> <given-names>A</given-names>
</name>
<name>
<surname>Andeweg</surname> <given-names>AC</given-names>
</name>
<name>
<surname>de Boer</surname> <given-names>RJ</given-names>
</name>
</person-group>. <article-title>RTCR: A Pipeline for Complete and Accurate Recovery of T Cell Repertoires From High Throughput Sequencing Data</article-title>. <source>Bioinformatics</source> (<year>2016</year>) <volume>32</volume>:<page-range>3098&#x2013;106</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btw339</pub-id>
</citation>
</ref>
<ref id="B41">
<label>41</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barennes</surname> <given-names>P</given-names>
</name>
<name>
<surname>Quiniou</surname> <given-names>V</given-names>
</name>
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
<name>
<surname>Egorov</surname> <given-names>ES</given-names>
</name>
<name>
<surname>Davydov</surname> <given-names>AN</given-names>
</name>
<name>
<surname>Chudakov</surname> <given-names>DM</given-names>
</name>
<etal/>
</person-group>. <article-title>Benchmarking of T Cell Receptor Repertoire Profiling Methods Reveals Large Systematic Biases</article-title>. <source>Nat Biotechnol</source> (<year>2021</year>) <volume>39</volume>:<page-range>236&#x2013;45</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41587-020-0656-3</pub-id>
</citation>
</ref>
<ref id="B42">
<label>42</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
<name>
<surname>Bagaev</surname> <given-names>DV</given-names>
</name>
<name>
<surname>Turchaninova</surname> <given-names>MA</given-names>
</name>
<name>
<surname>Bolotin</surname> <given-names>DA</given-names>
</name>
<name>
<surname>Britanova</surname> <given-names>OV</given-names>
</name>
<name>
<surname>Putintseva</surname> <given-names>EV</given-names>
</name>
<etal/>
</person-group>. <article-title>VDJtools: Unifying Post-Analysis of T Cell Receptor Repertoires</article-title>. <source>PLoS Comput Biol</source> (<year>2015</year>) <volume>11</volume>:<fpage>1</fpage>&#x2013;<lpage>16</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pcbi.1004503</pub-id>
</citation>
</ref>
<ref id="B43">
<label>43</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katayama</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Kobayashi</surname> <given-names>TJ</given-names>
</name>
</person-group>. <article-title>Comparative Study of Repertoire Classification Methods Reveals Data Efficiency of K-Mer Feature Extraction</article-title>. <source>Front Immunol</source> (<year>2022</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2022.797640</pub-id>
</citation>
</ref>
<ref id="B44">
<label>44</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Geirhos</surname> <given-names>R</given-names>
</name>
<name>
<surname>Jacobsen</surname> <given-names>JH</given-names>
</name>
<name>
<surname>Michaelis</surname> <given-names>C</given-names>
</name>
<name>
<surname>Zemel</surname> <given-names>R</given-names>
</name>
<name>
<surname>Brendel</surname> <given-names>W</given-names>
</name>
<name>
<surname>Bethge</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>Shortcut Learning in Deep Neural Networks</article-title>. <source>Nat Mach Intell</source> (<year>2020</year>) <volume>2</volume>:<page-range>665&#x2013;73</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s42256-020-00257-z</pub-id>
</citation>
</ref>
<ref id="B45">
<label>45</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zech</surname> <given-names>JR</given-names>
</name>
<name>
<surname>Badgeley</surname> <given-names>MA</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>M</given-names>
</name>
<name>
<surname>Costa</surname> <given-names>AB</given-names>
</name>
<name>
<surname>Titano</surname> <given-names>JJ</given-names>
</name>
<name>
<surname>Oermann</surname> <given-names>EK</given-names>
</name>
</person-group>. <article-title>Variable Generalization Performance of a Deep Learning Model to Detect Pneumonia in Chest Radiographs: A Cross-Sectional Study</article-title>. <source>PLoS Med</source> (<year>2018</year>) <volume>15</volume>:<fpage>e1002683</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pmed.1002683</pub-id>
</citation>
</ref>
<ref id="B46">
<label>46</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Afzal</surname> <given-names>S</given-names>
</name>
<name>
<surname>Gil-Farina</surname> <given-names>I</given-names>
</name>
<name>
<surname>Gabriel</surname> <given-names>R</given-names>
</name>
<name>
<surname>Ahmad</surname> <given-names>S</given-names>
</name>
<name>
<surname>von Kalle</surname> <given-names>C</given-names>
</name>
<name>
<surname>Schmidt</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>Systematic Comparative Study of Computational Methods for T-Cell Receptor Sequencing Data Analysis</article-title>. <source>Briefings Bioinf</source> (<year>2017</year>) <volume>20</volume>:<page-range>222&#x2013;34</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bib/bbx111</pub-id>
</citation>
</ref>
<ref id="B47">
<label>47</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bolotin</surname> <given-names>DA</given-names>
</name>
<name>
<surname>Poslavsky</surname> <given-names>S</given-names>
</name>
<name>
<surname>Mitrophanov</surname> <given-names>I</given-names>
</name>
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
<name>
<surname>Mamedov</surname> <given-names>IZ</given-names>
</name>
<name>
<surname>Putintseva</surname> <given-names>EV</given-names>
</name>
<etal/>
</person-group>. <article-title>MiXCR: Software for Comprehensive Adaptive Immunity Profiling</article-title>. <source>Nat Methods</source> (<year>2015</year>) <volume>12</volume>:<page-range>380&#x2013;1</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nmeth.3364</pub-id>
</citation>
</ref>
<ref id="B48">
<label>48</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alamyar</surname> <given-names>E</given-names>
</name>
<name>
<surname>Duroux</surname> <given-names>P</given-names>
</name>
<name>
<surname>Lefranc</surname> <given-names>MP</given-names>
</name>
<name>
<surname>Giudicelli</surname> <given-names>V</given-names>
</name>
</person-group>. <article-title>IMGT<sup>&#xae;</sup> Tools for the Nucleotide Analysis of Immunoglobulin (IG) and T Cell Receptor (TR) V-(D)-J Repertoires, Polymorphisms, and IG Mutations: IMGT/V-QUEST and IMGT/HighV-QUEST for NGS</article-title>. <source>Methods Mol Biol</source> (<year>2012</year>) <volume>882</volume>:<fpage>569</fpage>&#x2013;<lpage>604</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/978-1-61779-842-9_32</pub-id>
</citation>
</ref>
<ref id="B49">
<label>49</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ye</surname> <given-names>J</given-names>
</name>
<name>
<surname>Ma</surname> <given-names>N</given-names>
</name>
<name>
<surname>Madden</surname> <given-names>TL</given-names>
</name>
<name>
<surname>Ostell</surname> <given-names>JM</given-names>
</name>
</person-group>. <article-title>IgBLAST: An Immunoglobulin Variable Domain Sequence Analysis Tool</article-title>. <source>Nucleic Acids Res</source> (<year>2013</year>) <volume>41</volume>:<page-range>W34&#x2013;40</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/nar/gkt382</pub-id>
</citation>
</ref>
<ref id="B50">
<label>50</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>M</given-names>
</name>
<name>
<surname>Ou</surname> <given-names>JX</given-names>
</name>
<etal/>
</person-group>. <article-title>Tools for Fundamental Analysis Functions of TCR Repertoires: A Systematic Comparison</article-title>. <source>Briefings Bioinf</source> (<year>2019</year>) <volume>21</volume>:<page-range>1706&#x2013;16</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bib/bbz092</pub-id>
</citation>
</ref>
<ref id="B51">
<label>51</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smakaj</surname> <given-names>E</given-names>
</name>
<name>
<surname>Babrak</surname> <given-names>L</given-names>
</name>
<name>
<surname>Ohlin</surname> <given-names>M</given-names>
</name>
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
<name>
<surname>Briney</surname> <given-names>B</given-names>
</name>
<name>
<surname>Tosoni</surname> <given-names>D</given-names>
</name>
<etal/>
</person-group>. <article-title>Benchmarking Immunoinformatic Tools for the Analysis of Antibody Repertoire Sequences</article-title>. <source>Bioinformatics</source> (<year>2019</year>) <volume>36</volume>:<page-range>1731&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btz845</pub-id>
</citation>
</ref>
<ref id="B52">
<label>52</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Christley</surname> <given-names>S</given-names>
</name>
<name>
<surname>Scarborough</surname> <given-names>W</given-names>
</name>
<name>
<surname>Salinas</surname> <given-names>E</given-names>
</name>
<name>
<surname>Rounds</surname> <given-names>WH</given-names>
</name>
<name>
<surname>Toby</surname> <given-names>IT</given-names>
</name>
<name>
<surname>Fonner</surname> <given-names>JM</given-names>
</name>
<etal/>
</person-group>. <article-title>VDJServer: A Cloud-Based Analysis Portal and Data Commons for Immune Repertoire Sequences and Rearrangements</article-title>. <source>Front Immunol</source> (<year>2018</year>) <volume>9</volume>:<elocation-id>976</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2018.00976</pub-id>
</citation>
</ref>
<ref id="B53">
<label>53</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Corrie</surname> <given-names>BD</given-names>
</name>
<name>
<surname>Marthandan</surname> <given-names>N</given-names>
</name>
<name>
<surname>Zimonja</surname> <given-names>B</given-names>
</name>
<name>
<surname>Jaglale</surname> <given-names>J</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Barr</surname> <given-names>E</given-names>
</name>
<etal/>
</person-group>. <article-title>iReceptor: A Platform for Querying and Analyzing Antibody/B-Cell and T-Cell Receptor Repertoire Data Across Federated Repositories</article-title>. <source>Immunol Rev</source> (<year>2018</year>) <volume>284</volume>:<fpage>24</fpage>&#x2013;<lpage>41</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1111/imr.12666</pub-id>
</citation>
</ref>
<ref id="B54">
<label>54</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>SY</given-names>
</name>
<name>
<surname>Yue</surname> <given-names>T</given-names>
</name>
<name>
<surname>Lei</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Guo</surname> <given-names>AY</given-names>
</name>
</person-group>. <article-title>TCRdb: A Comprehensive Database for T-Cell Receptor Sequences With Powerful Search Function</article-title>. <source>Nucleic Acids Res</source> (<year>2021</year>) <volume>49</volume>:<page-range>D468&#x2013;74</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/nar/gkaa796</pub-id>
</citation>
</ref>
<ref id="B55">
<label>55</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
<name>
<surname>Bagaev</surname> <given-names>DV</given-names>
</name>
<name>
<surname>Zvyagin</surname> <given-names>IV</given-names>
</name>
<name>
<surname>Vroomans</surname> <given-names>RM</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>JC</given-names>
</name>
<name>
<surname>Dolton</surname> <given-names>G</given-names>
</name>
<etal/>
</person-group>. <article-title>VDJdb: A Curated Database of T-Cell Receptor Sequences With Known Antigen Specificity</article-title>. <source>Nucleic Acids Res</source> (<year>2017</year>) <volume>46</volume>:<page-range>D419&#x2013;27</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/nar/gkx760</pub-id>
</citation>
</ref>
<ref id="B56">
<label>56</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vita</surname> <given-names>R</given-names>
</name>
<name>
<surname>Mahajan</surname> <given-names>S</given-names>
</name>
<name>
<surname>Overton</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Dhanda</surname> <given-names>SK</given-names>
</name>
<name>
<surname>Martini</surname> <given-names>S</given-names>
</name>
<name>
<surname>Cantrell</surname> <given-names>JR</given-names>
</name>
<etal/>
</person-group>. <article-title>The Immune Epitope Database (IEDB): 2018 Update</article-title>. <source>Nucleic Acids Res</source> (<year>2018</year>) <volume>47</volume>:<page-range>D339&#x2013;43</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/nar/gky1006</pub-id>
</citation>
</ref>
<ref id="B57">
<label>57</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tickotsky</surname> <given-names>N</given-names>
</name>
<name>
<surname>Sagiv</surname> <given-names>T</given-names>
</name>
<name>
<surname>Prilusky</surname> <given-names>J</given-names>
</name>
<name>
<surname>Shifrut</surname> <given-names>E</given-names>
</name>
<name>
<surname>Friedman</surname> <given-names>N</given-names>
</name>
</person-group>. <article-title>McPAS-TCR: A Manually Curated Catalogue of Pathology-Associated T Cell Receptor Sequences</article-title>. <source>Bioinformatics</source> (<year>2017</year>) <volume>33</volume>:<page-range>2924&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btx286</pub-id>
</citation>
</ref>
<ref id="B58">
<label>58</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rubelt</surname> <given-names>F</given-names>
</name>
<name>
<surname>Busse</surname> <given-names>CE</given-names>
</name>
<name>
<surname>Bukhari</surname> <given-names>SAC</given-names>
</name>
<name>
<surname>B&#xfc;rckert</surname> <given-names>JP</given-names>
</name>
<name>
<surname>Mariotti-Ferrandiz</surname> <given-names>E</given-names>
</name>
<name>
<surname>Cowell</surname> <given-names>LG</given-names>
</name>
<etal/>
</person-group>. <article-title>Adaptive Immune Receptor Repertoire Community Recommendations for Sharing Immune-Repertoire Sequencing Data</article-title>. <source>Nat Immunol</source> (<year>2017</year>) <volume>18</volume>:<page-range>1274&#x2013;8</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/ni.3873</pub-id>
</citation>
</ref>
<ref id="B59">
<label>59</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Attaf</surname> <given-names>M</given-names>
</name>
<name>
<surname>Huseby</surname> <given-names>E</given-names>
</name>
<name>
<surname>Sewell</surname> <given-names>AK</given-names>
</name>
</person-group>. <article-title>
<italic>&#x3b1;&#x3b2;</italic> T Cell Receptors as Predictors of Health and Disease</article-title>. <source>Cell Mol Immunol</source> (<year>2015</year>) <volume>12</volume>:<page-range>391&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/cmi.2014.134</pub-id>
</citation>
</ref>
<ref id="B60">
<label>60</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lythe</surname> <given-names>G</given-names>
</name>
<name>
<surname>Callard</surname> <given-names>RE</given-names>
</name>
<name>
<surname>Hoare</surname> <given-names>RL</given-names>
</name>
<name>
<surname>Molina-Par&#xed;s</surname> <given-names>C</given-names>
</name>
</person-group>. <article-title>How Many TCR Clonotypes Does a Body Maintain?</article-title> <source>J Theor Biol</source> (<year>2016</year>) <volume>389</volume>:<page-range>214&#x2013;24</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.jtbi.2015.10.016</pub-id>
</citation>
</ref>
<ref id="B61">
<label>61</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
<name>
<surname>Walczak</surname> <given-names>AM</given-names>
</name>
</person-group>. <article-title>How Many Different Clonotypes do Immune Repertoires Contain?</article-title> <source>Curr Opin Syst Biol</source> (<year>2019</year>) <volume>18</volume>:<page-range>104&#x2013;10</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.coisb.2019.10.001</pub-id>
</citation>
</ref>
<ref id="B62">
<label>62</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McHeyzer-Williams</surname> <given-names>LJ</given-names>
</name>
<name>
<surname>Panus</surname> <given-names>JF</given-names>
</name>
<name>
<surname>Mikszta</surname> <given-names>JA</given-names>
</name>
<name>
<surname>McHeyzer-Williams</surname> <given-names>MG</given-names>
</name>
</person-group>. <article-title>Evolution of Antigen-Specific T Cell Receptors <italic>In Vivo</italic>: Preimmune and Antigen-Driven Selection of Preferred Complementarity-Determining Region 3 (CDR3) Motifs</article-title>. <source>J Exp Med</source> (<year>1999</year>) <volume>189</volume>:<page-range>1823&#x2013;38</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1084/jem.189.11.1823</pub-id>
</citation>
</ref>
<ref id="B63">
<label>63</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Glanville</surname> <given-names>J</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>H</given-names>
</name>
<name>
<surname>Nau</surname> <given-names>A</given-names>
</name>
<name>
<surname>Hatton</surname> <given-names>O</given-names>
</name>
<name>
<surname>Wagar</surname> <given-names>LE</given-names>
</name>
<name>
<surname>Rubelt</surname> <given-names>F</given-names>
</name>
<etal/>
</person-group>. <article-title>Identifying Specificity Groups in the T Cell Receptor Repertoire</article-title>. <source>Nature</source> (<year>2017</year>) <volume>547</volume>:<page-range>94&#x2013;8</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nature22976</pub-id>
</citation>
</ref>
<ref id="B64">
<label>64</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>G</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>X</given-names>
</name>
<name>
<surname>Ko</surname> <given-names>A</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>X</given-names>
</name>
<name>
<surname>Gao</surname> <given-names>M</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y</given-names>
</name>
<etal/>
</person-group>. <article-title>Sequence and Structural Analyses Reveal Distinct and Highly Diverse Human CD8+ TCR Repertoires to Immunodominant Viral Antigens</article-title>. <source>Cell Rep</source> (<year>2017</year>) <volume>19</volume>:<page-range>569&#x2013;83</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.celrep.2017.03.072</pub-id>
</citation>
</ref>
<ref id="B65">
<label>65</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Serana</surname> <given-names>F</given-names>
</name>
<name>
<surname>Sottini</surname> <given-names>A</given-names>
</name>
<name>
<surname>Caimi</surname> <given-names>L</given-names>
</name>
<name>
<surname>Palermo</surname> <given-names>B</given-names>
</name>
<name>
<surname>Natali</surname> <given-names>PG</given-names>
</name>
<name>
<surname>Nistic&#xf2;</surname> <given-names>P</given-names>
</name>
<etal/>
</person-group>. <article-title>Identification of a Public CDR3 Motif and a Biased Utilization of T-Cell Receptor V Beta and J Beta Chains in HLA-A2/Melan-A-Specific T-Cell Clonotypes of Melanoma Patients</article-title>. <source>J Transl Med</source> (<year>2009</year>) <volume>7</volume>:<fpage>1</fpage>&#x2013;<lpage>14</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/1479-5876-7-21</pub-id>
</citation>
</ref>
<ref id="B66">
<label>66</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chao</surname> <given-names>A</given-names>
</name>
<name>
<surname>Chiu</surname> <given-names>CH</given-names>
</name>
<name>
<surname>Jost</surname> <given-names>L</given-names>
</name>
</person-group>. <article-title>Unifying Species Diversity, Phylogenetic Diversity, Functional Diversity, and Related Similarity and Differentiation Measures Through Hill Numbers</article-title>. <source>Annu Rev Ecol Evol Syst</source> (<year>2014</year>) <volume>45</volume>:<fpage>297</fpage>&#x2013;<lpage>324</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1146/annurev-ecolsys-120213-091540</pub-id>
</citation>
</ref>
<ref id="B67">
<label>67</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Greiff</surname> <given-names>V</given-names>
</name>
<name>
<surname>Bhat</surname> <given-names>P</given-names>
</name>
<name>
<surname>Cook</surname> <given-names>SC</given-names>
</name>
<name>
<surname>Menzel</surname> <given-names>U</given-names>
</name>
<name>
<surname>Kang</surname> <given-names>W</given-names>
</name>
<name>
<surname>Reddy</surname> <given-names>ST</given-names>
</name>
</person-group>. <article-title>A Bioinformatic Framework for Immune Repertoire Diversity Profiling Enables Detection of Immunological Status</article-title>. <source>Genome Med</source> (<year>2015</year>) <volume>7</volume>:<fpage>49</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s13073-015-0169-8</pub-id>
</citation>
</ref>
<ref id="B68">
<label>68</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Laydon</surname> <given-names>DJ</given-names>
</name>
<name>
<surname>Bangham</surname> <given-names>CRM</given-names>
</name>
<name>
<surname>Asquith</surname> <given-names>B</given-names>
</name>
</person-group>. <article-title>Estimating T-Cell Repertoire Diversity: Limitations of Classical Estimators and a New Approach</article-title>. <source>Philos Trans R Soc B: Biol Sci</source> (<year>2015</year>) <volume>370</volume>:<fpage>20140291</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1098/rstb.2014.0291</pub-id>
</citation>
</ref>
<ref id="B69">
<label>69</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guindani</surname> <given-names>M</given-names>
</name>
<name>
<surname>Sep&#xfa;lveda</surname> <given-names>N</given-names>
</name>
<name>
<surname>Paulino</surname> <given-names>CD</given-names>
</name>
<name>
<surname>M&#xfc;ller</surname> <given-names>P</given-names>
</name>
</person-group>. <article-title>A Bayesian Semiparametric Approach for the Differential Analysis of Sequence Counts Data</article-title>. <source>J R Stat Society: Ser C (Applied Statistics)</source> (<year>2014</year>) <volume>63</volume>:<fpage>385</fpage>&#x2013;<lpage>404</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1111/rssc.12041</pub-id>
</citation>
</ref>
<ref id="B70">
<label>70</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rempala</surname> <given-names>GA</given-names>
</name>
<name>
<surname>Seweryn</surname> <given-names>M</given-names>
</name>
<name>
<surname>Ignatowicz</surname> <given-names>L</given-names>
</name>
</person-group>. <article-title>Model for Comparative Analysis of Antigen Receptor Repertoires</article-title>. <source>J Theor Biol</source> (<year>2011</year>) <volume>269</volume>:<fpage>1</fpage>&#x2013;<lpage>15</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.jtbi.2010.10.001</pub-id>
</citation>
</ref>
<ref id="B71">
<label>71</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koch</surname> <given-names>H</given-names>
</name>
<name>
<surname>Starenki</surname> <given-names>D</given-names>
</name>
<name>
<surname>Cooper</surname> <given-names>SJ</given-names>
</name>
<name>
<surname>Myers</surname> <given-names>RM</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Q</given-names>
</name>
</person-group>. <article-title>powerTCR: A Model-Based Approach to Comparative Analysis of the Clone Size Distribution of the T Cell Receptor Repertoire</article-title>. <source>PLoS Comput Biol</source> (<year>2018</year>) <volume>14</volume>:<fpage>1</fpage>&#x2013;<lpage>18</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pcbi.1006571</pub-id>
</citation>
</ref>
<ref id="B72">
<label>72</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rawstron</surname> <given-names>AC</given-names>
</name>
<name>
<surname>Fazi</surname> <given-names>C</given-names>
</name>
<name>
<surname>Agathangelidis</surname> <given-names>A</given-names>
</name>
<name>
<surname>Villamor</surname> <given-names>N</given-names>
</name>
<name>
<surname>Letestu</surname> <given-names>R</given-names>
</name>
<name>
<surname>Nomdedeu</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>A Complementary Role of Multiparameter Flow Cytometry and High-Throughput Sequencing for Minimal Residual Disease Detection in Chronic Lymphocytic Leukemia: An European Research Initiative on CLL Study</article-title>. <source>Leukemia</source> (<year>2016</year>) <volume>30</volume>:<page-range>929&#x2013;36</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/leu.2015.313</pub-id>
</citation>
</ref>
<ref id="B73">
<label>73</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gong</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>C</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>W</given-names>
</name>
<name>
<surname>Iqbal</surname> <given-names>J</given-names>
</name>
<name>
<surname>Hu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Greiner</surname> <given-names>TC</given-names>
</name>
<etal/>
</person-group>. <article-title>Assessment of T-Cell Receptor Repertoire and Clonal Expansion in Peripheral T-Cell Lymphoma Using RNA-Seq Data</article-title>. <source>Sci Rep</source> (<year>2017</year>) <volume>7</volume>:<fpage>11301</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41598-017-11310-0</pub-id>
</citation>
</ref>
<ref id="B74">
<label>74</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>De Neuter</surname> <given-names>N</given-names>
</name>
<name>
<surname>Bartholomeus</surname> <given-names>E</given-names>
</name>
<name>
<surname>Elias</surname> <given-names>G</given-names>
</name>
<name>
<surname>Keersmaekers</surname> <given-names>N</given-names>
</name>
<name>
<surname>Suls</surname> <given-names>A</given-names>
</name>
<name>
<surname>Jansens</surname> <given-names>H</given-names>
</name>
<etal/>
</person-group>. <article-title>Memory CD4+ T Cell Receptor Repertoire Data Mining as a Tool for Identifying Cytomegalovirus Serostatus</article-title>. <source>Genes Immun</source> (<year>2019</year>) <volume>20</volume>:<page-range>255&#x2013;60</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41435-018-0035-y</pub-id>
</citation>
</ref>
<ref id="B75">
<label>75</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ritvo</surname> <given-names>PG</given-names>
</name>
<name>
<surname>Saadawi</surname> <given-names>A</given-names>
</name>
<name>
<surname>Barennes</surname> <given-names>P</given-names>
</name>
<name>
<surname>Quiniou</surname> <given-names>V</given-names>
</name>
<name>
<surname>Chaara</surname> <given-names>W</given-names>
</name>
<name>
<surname>El Soufi</surname> <given-names>K</given-names>
</name>
<etal/>
</person-group>. <article-title>High-Resolution Repertoire Analysis Reveals a Major Bystander Activation of Tfh and Tfr Cells</article-title>. <source>Proc Natl Acad Sci</source> (<year>2018</year>) <volume>115</volume>:<page-range>9604&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.1808594115</pub-id>
</citation>
</ref>
<ref id="B76">
<label>76</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bashford-Rogers</surname> <given-names>RJ</given-names>
</name>
<name>
<surname>Palser</surname> <given-names>AL</given-names>
</name>
<name>
<surname>Huntly</surname> <given-names>BJ</given-names>
</name>
<name>
<surname>Rance</surname> <given-names>R</given-names>
</name>
<name>
<surname>Vassiliou</surname> <given-names>GS</given-names>
</name>
<name>
<surname>Follows</surname> <given-names>GA</given-names>
</name>
<etal/>
</person-group>. <article-title>Network Properties Derived From Deep Sequencing of Human B-Cell Receptor Repertoires Delineate B-Cell Populations</article-title>. <source>Genome Res</source> (<year>2013</year>) <volume>23</volume>:<page-range>1874&#x2013;84</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1101/gr.154815.113</pub-id>
</citation>
</ref>
<ref id="B77">
<label>77</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Madi</surname> <given-names>A</given-names>
</name>
<name>
<surname>Poran</surname> <given-names>A</given-names>
</name>
<name>
<surname>Shifrut</surname> <given-names>E</given-names>
</name>
<name>
<surname>Reich-Zeliger</surname> <given-names>S</given-names>
</name>
<name>
<surname>Greenstein</surname> <given-names>E</given-names>
</name>
<name>
<surname>Zaretsky</surname> <given-names>I</given-names>
</name>
<etal/>
</person-group>. <article-title>T Cell Receptor Repertoires of Mice and Humans Are Clustered in Similarity Networks Around Conserved Public CDR3 Sequences</article-title>. <source>eLife</source> (<year>2017</year>) <volume>6</volume>:<fpage>e22057</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.7554/eLife.22057</pub-id>
</citation>
</ref>
<ref id="B78">
<label>78</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dash</surname> <given-names>P</given-names>
</name>
<name>
<surname>Fiore-Gartland</surname> <given-names>AJ</given-names>
</name>
<name>
<surname>Hertz</surname> <given-names>T</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>GC</given-names>
</name>
<name>
<surname>Sharma</surname> <given-names>S</given-names>
</name>
<name>
<surname>Souquette</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>Quantifiable Predictive Features Define Epitope-Specific T Cell Receptor Repertoires</article-title>. <source>Nature</source> (<year>2017</year>) <volume>547</volume>:<fpage>89</fpage>&#x2013;<lpage>93</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nature22383</pub-id>
</citation>
</ref>
<ref id="B79">
<label>79</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yokota</surname> <given-names>R</given-names>
</name>
<name>
<surname>Kaminaga</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Kobayashi</surname> <given-names>TJ</given-names>
</name>
</person-group>. <article-title>Quantification of Inter-Sample Differences in T-Cell Receptor Repertoires Using Sequence-Based Information</article-title>. <source>Front Immunol</source> (<year>2017</year>) <volume>8</volume>:<elocation-id>1500</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2017.01500</pub-id>
</citation>
</ref>
<ref id="B80">
<label>80</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>H</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>L</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>J</given-names>
</name>
<name>
<surname>Ye</surname> <given-names>J</given-names>
</name>
<name>
<surname>Shukla</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Investigation of Antigen-Specific T-Cell Receptor Clusters in Human Cancers</article-title>. <source>Clin Cancer Res</source> (<year>2020</year>) <volume>26</volume>:<page-range>1359&#x2013;71</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1158/1078-0432.CCR-19-3249</pub-id>
</citation>
</ref>
<ref id="B81">
<label>81</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname> <given-names>H</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>C</given-names>
</name>
<name>
<surname>Rubelt</surname> <given-names>F</given-names>
</name>
<name>
<surname>Scriba</surname> <given-names>TJ</given-names>
</name>
<name>
<surname>Davis</surname> <given-names>MM</given-names>
</name>
</person-group>. <article-title>Analyzing the Mycobacterium Tuberculosis Immune Response by T-Cell Receptor Clustering With GLIPH2 and Genome-Wide Antigen Screening</article-title>. <source>Nat Biotechnol</source> (<year>2020</year>) <volume>38</volume>:<page-range>1194&#x2013;202</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41587-020-0505-4</pub-id>
</citation>
</ref>
<ref id="B82">
<label>82</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mayer-Blackwell</surname> <given-names>K</given-names>
</name>
<name>
<surname>Schattgen</surname> <given-names>S</given-names>
</name>
<name>
<surname>Cohen-Lavi</surname> <given-names>L</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>JC</given-names>
</name>
<name>
<surname>Souquette</surname> <given-names>A</given-names>
</name>
<name>
<surname>Gaevert</surname> <given-names>JA</given-names>
</name>
<etal/>
</person-group>. <article-title>TCR Meta-Clonotypes for Biomarker Discovery With Tcrdist3 Enabled Identification of Public, HLA-Restricted Clusters of SARS-CoV-2 TCRs</article-title>. <source>eLife</source> (<year>2021</year>) <volume>10</volume>:<fpage>e68605</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.7554/eLife.68605</pub-id>
</citation>
</ref>
<ref id="B83">
<label>83</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bolen</surname> <given-names>CR</given-names>
</name>
<name>
<surname>Rubelt</surname> <given-names>F</given-names>
</name>
<name>
<surname>Vander Heiden</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Davis</surname> <given-names>MM</given-names>
</name>
</person-group>. <article-title>The Repertoire Dissimilarity Index as a Method to Compare Lymphocyte Receptor Repertoires</article-title>. <source>BMC Bioinf</source> (<year>2017</year>) <volume>18</volume>:<elocation-id>155</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s12859-017-1556-5</pub-id>
</citation>
</ref>
<ref id="B84">
<label>84</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Valkiers</surname> <given-names>S</given-names>
</name>
<name>
<surname>Van Houcke</surname> <given-names>M</given-names>
</name>
<name>
<surname>Laukens</surname> <given-names>K</given-names>
</name>
<name>
<surname>Meysman</surname> <given-names>P</given-names>
</name>
</person-group>. <article-title>ClusTCR: A Python Interface for Rapid Clustering of Large Sets of CDR3 Sequences With Unknown Antigen Specificity</article-title>. <source>Bioinformatics</source> (<year>2021</year>) <volume>37</volume>:<page-range>4865&#x2013;7</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btab446</pub-id>
</citation>
</ref>
<ref id="B85">
<label>85</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>H</given-names>
</name>
<name>
<surname>Zhan</surname> <given-names>X</given-names>
</name>
<name>
<surname>Li</surname> <given-names>B</given-names>
</name>
</person-group>. <article-title>GIANA Allows Computationally-Efficient TCR Clustering and Multi-Disease Repertoire Classification by Isometric Transformation</article-title>. <source>Nat Commun</source> (<year>2021</year>) <volume>12</volume>:<fpage>4699</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41467-021-25006-7</pub-id>
</citation>
</ref>
<ref id="B86">
<label>86</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Best</surname> <given-names>K</given-names>
</name>
<name>
<surname>Cinelli</surname> <given-names>M</given-names>
</name>
<name>
<surname>Heather</surname> <given-names>JM</given-names>
</name>
<name>
<surname>Reich-Zeliger</surname> <given-names>S</given-names>
</name>
<name>
<surname>Shifrut</surname> <given-names>E</given-names>
</name>
<etal/>
</person-group>. <article-title>Specificity, Privacy, and Degeneracy in the CD4 T Cell Receptor Repertoire Following Immunization</article-title>. <source>Front Immunol</source> (<year>2017</year>) <volume>8</volume>:<elocation-id>430</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2017.00430</pub-id>
</citation>
</ref>
<ref id="B87">
<label>87</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cinelli</surname> <given-names>M</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Best</surname> <given-names>K</given-names>
</name>
<name>
<surname>Heather</surname> <given-names>JM</given-names>
</name>
<name>
<surname>Reich-Zeliger</surname> <given-names>S</given-names>
</name>
<name>
<surname>Shifrut</surname> <given-names>E</given-names>
</name>
<etal/>
</person-group>. <article-title>Feature Selection Using a One Dimensional Na&#xef;ve Bayes&#x2019; Classifier Increases the Accuracy of Support Vector Machine Classification of CDR3 Repertoires</article-title>. <source>Bioinformatics</source> (<year>2017</year>) <volume>33</volume>:<page-range>951&#x2013;5</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btw771</pub-id>
</citation>
</ref>
<ref id="B88">
<label>88</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Friedman</surname> <given-names>JH</given-names>
</name>
</person-group>. <article-title>Greedy Function Approximation: A Gradient Boosting Machine</article-title>. <source>Ann Stat</source> (<year>2001</year>) <volume>29</volume>:<fpage>1189</fpage>&#x2013;<lpage>1232</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1214/aos/1013203451</pub-id>
</citation>
</ref>
<ref id="B89">
<label>89</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lawrence</surname> <given-names>N</given-names>
</name>
</person-group>. <article-title>Probabilistic Non-Linear Principal Component Analysis With Gaussian Process Latent Variable Models</article-title>. <source>J Mach Learn Res</source> (<year>2005</year>) <volume>6</volume>:<page-range>1783&#x2013;816</page-range>.</citation>
</ref>
<ref id="B90">
<label>90</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ostmeyer</surname> <given-names>J</given-names>
</name>
<name>
<surname>Christley</surname> <given-names>S</given-names>
</name>
<name>
<surname>Toby</surname> <given-names>IT</given-names>
</name>
<name>
<surname>Cowell</surname> <given-names>LG</given-names>
</name>
</person-group>. <article-title>Biophysicochemical Motifs in T-Cell Receptor Sequences Distinguish Repertoires From Tumor-Infiltrating Lymphocyte and Adjacent Healthy Tissue</article-title>. <source>Cancer Res</source> (<year>2019</year>) <volume>79</volume>:<page-range>1671&#x2013;80</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1158/0008-5472.CAN-18-2292</pub-id>
</citation>
</ref>
<ref id="B91">
<label>91</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Murugan</surname> <given-names>A</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
<name>
<surname>Walczak</surname> <given-names>AM</given-names>
</name>
<name>
<surname>Callan</surname> <given-names>CG</given-names>
</name>
</person-group>. <article-title>Statistical Inference of the Generation Probability of T-Cell Receptors From Sequence Repertoires</article-title>. <source>Proc Natl Acad Sci</source> (<year>2012</year>) <volume>109</volume>:<page-range>16161&#x2013;6</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.1212755109</pub-id>
</citation>
</ref>
<ref id="B92">
<label>92</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pogorelyy</surname> <given-names>MV</given-names>
</name>
<name>
<surname>Minervina</surname> <given-names>AA</given-names>
</name>
<name>
<surname>Chudakov</surname> <given-names>DM</given-names>
</name>
<name>
<surname>Mamedov</surname> <given-names>IZ</given-names>
</name>
<name>
<surname>Lebedev</surname> <given-names>YB</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
<etal/>
</person-group>. <article-title>Method for Identification of Condition-Associated Public Antigen Receptor Sequences</article-title>. <source>eLife</source> (<year>2018</year>) <volume>7</volume>:<elocation-id>e33050</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.7554/eLife.33050</pub-id>
</citation>
</ref>
<ref id="B93">
<label>93</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marcou</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
<name>
<surname>Walczak</surname> <given-names>AM</given-names>
</name>
</person-group>. <article-title>High-Throughput Immune Repertoire Analysis With IGoR</article-title>. <source>Nat Commun</source> (<year>2018</year>) <volume>9</volume>:<fpage>561</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41467-018-02832-w</pub-id>
</citation>
</ref>
<ref id="B94">
<label>94</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sethna</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Elhanati</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Callan</surname> <given-names>CG</given-names>
</name>
<name>
<surname>Walczak</surname> <given-names>AM</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
</person-group>. <article-title>OLGA: Fast Computation of Generation Probabilities of B- and T-Cell Receptor Amino Acid Sequences and Motifs</article-title>. <source>Bioinformatics</source> (<year>2019</year>) <volume>35</volume>:<page-range>2974&#x2013;81</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btz035</pub-id>
</citation>
</ref>
<ref id="B95">
<label>95</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Elhanati</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Murugan</surname> <given-names>A</given-names>
</name>
<name>
<surname>Callan</surname> <given-names>CG</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
<name>
<surname>Walczak</surname> <given-names>AM</given-names>
</name>
</person-group>. <article-title>Quantifying Selection in Immune Receptor Repertoires</article-title>. <source>Proc Natl Acad Sci</source> (<year>2014</year>) <volume>111</volume>:<page-range>9875&#x2013;80</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.1409572111</pub-id>
</citation>
</ref>
<ref id="B96">
<label>96</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pogorelyy</surname> <given-names>MV</given-names>
</name>
<name>
<surname>Minervina</surname> <given-names>AA</given-names>
</name>
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
<name>
<surname>Chudakov</surname> <given-names>DM</given-names>
</name>
<name>
<surname>Lebedev</surname> <given-names>YB</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
<etal/>
</person-group>. <article-title>Detecting T Cell Receptors Involved in Immune Responses From Single Repertoire Snapshots</article-title>. <source>PLoS Biol</source> (<year>2019</year>) <volume>17</volume>:<fpage>1</fpage>&#x2013;<lpage>13</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pbio.3000314</pub-id>
</citation>
</ref>
<ref id="B97">
<label>97</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>DeWitt</surname> <given-names>WS</given-names>
</name>
<name>
<surname>Smith</surname> <given-names>A</given-names>
</name>
<name>
<surname>Schoch</surname> <given-names>G</given-names>
</name>
<name>
<surname>Hansen</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Matsen</surname> <given-names>FA</given-names>
</name>
<name>
<surname>Bradley</surname> <given-names>P</given-names>
</name>
</person-group>. <article-title>Human T Cell Receptor Occurrence Patterns Encode Immune History, Genetic Background, and Receptor Specificity</article-title>. <source>eLife</source> (<year>2018</year>) <volume>7</volume>:<fpage>e38358</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.7554/eLife.38358</pub-id>
</citation>
</ref>
<ref id="B98">
<label>98</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bonissone</surname> <given-names>SR</given-names>
</name>
<name>
<surname>Pevzner</surname> <given-names>PA</given-names>
</name>
</person-group>. <article-title>Immunoglobulin Classification Using the Colored Antibody Graph</article-title>. <source>J Comput Biol</source> (<year>2016</year>) <volume>23</volume>:<page-range>483&#x2013;94</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1089/cmb.2016.0010</pub-id>
</citation>
</ref>
<ref id="B99">
<label>99</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Safonova</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Lapidus</surname> <given-names>A</given-names>
</name>
<name>
<surname>Lill</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>IgSimulator: A Versatile Immunosequencing Simulator</article-title>. <source>Bioinformatics</source> (<year>2015</year>) <volume>31</volume>:<page-range>3213&#x2013;5</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btv326</pub-id>
</citation>
</ref>
<ref id="B100">
<label>100</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yermanos</surname> <given-names>A</given-names>
</name>
<name>
<surname>Greiff</surname> <given-names>V</given-names>
</name>
<name>
<surname>Krautler</surname> <given-names>NJ</given-names>
</name>
<name>
<surname>Menzel</surname> <given-names>U</given-names>
</name>
<name>
<surname>Dounas</surname> <given-names>A</given-names>
</name>
<name>
<surname>Miho</surname> <given-names>E</given-names>
</name>
<etal/>
</person-group>. <article-title>Comparison of Methods for Phylogenetic B-Cell Lineage Inference Using Time-Resolved Antibody Repertoire Simulations (AbSim)</article-title>. <source>Bioinformatics</source> (<year>2017</year>) <volume>33</volume>:<page-range>3938&#x2013;46</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btx533</pub-id>
</citation>
</ref>
<ref id="B101">
<label>101</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Weber</surname> <given-names>CR</given-names>
</name>
<name>
<surname>Akbar</surname> <given-names>R</given-names>
</name>
<name>
<surname>Yermanos</surname> <given-names>A</given-names>
</name>
<name>
<surname>Pavlovi&#x107;</surname> <given-names>M</given-names>
</name>
<name>
<surname>Snapkov</surname> <given-names>I</given-names>
</name>
<name>
<surname>Sandve</surname> <given-names>GK</given-names>
</name>
<etal/>
</person-group>. <article-title>immuneSIM: Tunable Multi-Feature Simulation of B- and T-Cell Receptor Repertoires for Immunoinformatics Benchmarking</article-title>. <source>Bioinformatics</source> (<year>2020</year>) <volume>36</volume>:<page-range>3594&#x2013;6</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btaa158</pub-id>
</citation>
</ref>
<ref id="B102">
<label>102</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Widrich</surname> <given-names>M</given-names>
</name>
<name>
<surname>Sch&#xe4;fl</surname> <given-names>B</given-names>
</name>
<name>
<surname>Pavlovi&#x107;</surname> <given-names>M</given-names>
</name>
<name>
<surname>Ramsauer</surname> <given-names>H</given-names>
</name>
<name>
<surname>Gruber</surname> <given-names>L</given-names>
</name>
<name>
<surname>Holzleitner</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>Modern Hopfield Networks and Attention for Immune Repertoire Classification</article-title>. <source>Adv Neural Inf Process Syst</source> (<year>2020</year>) <volume>33</volume>:<page-range>18832&#x2013;45</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1101/2020.04.12.038158</pub-id>
</citation>
</ref>
<ref id="B103">
<label>103</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kanduri</surname> <given-names>C</given-names>
</name>
<name>
<surname>Pavlovi&#x107;</surname> <given-names>M</given-names>
</name>
<name>
<surname>Scheffer</surname> <given-names>L</given-names>
</name>
<name>
<surname>Motwani</surname> <given-names>K</given-names>
</name>
<name>
<surname>Chernigovskaya</surname> <given-names>M</given-names>
</name>
<name>
<surname>Greiff</surname> <given-names>V</given-names>
</name>
<etal/>
</person-group>. <article-title>Profiling the Baseline Performance and Limits of Machine Learning Models for Adaptive Immune Receptor Repertoire Classification [Preprint]</article-title>. <source>bioRxiv</source> (<year>2021</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.1101/2021.05.23.445346</pub-id>
</citation>
</ref>
<ref id="B104">
<label>104</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eraslan</surname> <given-names>G</given-names>
</name>
<name>
<surname>Avsec</surname> <given-names>&#x17d;</given-names>
</name>
<name>
<surname>Gagneur</surname> <given-names>J</given-names>
</name>
<name>
<surname>Theis</surname> <given-names>FJ</given-names>
</name>
</person-group>. <article-title>Deep Learning: New Computational Modelling Techniques for Genomics</article-title>. <source>Nat Rev Genet</source> (<year>2019</year>) <volume>20</volume>:<fpage>389</fpage>&#x2013;<lpage>403</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41576-019-0122-6</pub-id>
</citation>
</ref>
<ref id="B105">
<label>105</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zampieri</surname> <given-names>G</given-names>
</name>
<name>
<surname>Vijayakumar</surname> <given-names>S</given-names>
</name>
<name>
<surname>Yaneske</surname> <given-names>E</given-names>
</name>
<name>
<surname>Angione</surname> <given-names>C</given-names>
</name>
</person-group>. <article-title>Machine and Deep Learning Meet Genome-Scale Metabolic Modeling</article-title>. <source>PLoS Comput Biol</source> (<year>2019</year>) <volume>15</volume>:<fpage>1</fpage>&#x2013;<lpage>24</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pcbi.1007084</pub-id>
</citation>
</ref>
<ref id="B106">
<label>106</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bengio</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Courville</surname> <given-names>A</given-names>
</name>
<name>
<surname>Vincent</surname> <given-names>P</given-names>
</name>
</person-group>. <article-title>Representation Learning: A Review and New Perspectives</article-title>. <source>IEEE Trans Pattern Anal Mach Intell</source> (<year>2013</year>) <volume>35</volume>:<page-range>1798&#x2013;828</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TPAMI.2013.50</pub-id>
</citation>
</ref>
<ref id="B107">
<label>107</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Davidsen</surname> <given-names>K</given-names>
</name>
<name>
<surname>Olson</surname> <given-names>BJ</given-names>
</name>
<name>
<surname>DeWitt</surname> <given-names>WS</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>J</given-names>
</name>
<name>
<surname>Harkins</surname> <given-names>E</given-names>
</name>
<name>
<surname>Bradley</surname> <given-names>P</given-names>
</name>
<etal/>
</person-group>. <article-title>Deep Generative Models for T Cell Receptor Protein Sequences</article-title>. <source>eLife</source> (<year>2019</year>) <volume>8</volume>:<fpage>e46935</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.7554/eLife.46935</pub-id>
</citation>
</ref>
<ref id="B108">
<label>108</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kingma</surname> <given-names>DP</given-names>
</name>
<name>
<surname>Welling</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>Auto-Encoding Variational Bayes</article-title>. In: <person-group person-group-type="editor">
<name>
<surname>Bengio</surname> <given-names>Y</given-names>
</name>
<name>
<surname>LeCun</surname> <given-names>Y</given-names>
</name>
</person-group> <source>International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings</source> Banff. ICLR (<year>2014</year>).</citation>
</ref>
<ref id="B109">
<label>109</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhuang</surname> <given-names>F</given-names>
</name>
<name>
<surname>Qi</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Duan</surname> <given-names>K</given-names>
</name>
<name>
<surname>Xi</surname> <given-names>D</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>H</given-names>
</name>
<etal/>
</person-group>. <article-title>A Comprehensive Survey on Transfer Learning</article-title>. <source>Proc IEEE</source> (<year>2021</year>) <volume>109</volume>:<fpage>43</fpage>&#x2013;<lpage>76</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/JPROC.2020.3004555</pub-id>
</citation>
</ref>
<ref id="B110">
<label>110</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sidhom</surname> <given-names>JW</given-names>
</name>
<name>
<surname>Larman</surname> <given-names>HB</given-names>
</name>
<name>
<surname>Pardoll</surname> <given-names>DM</given-names>
</name>
<name>
<surname>Baras</surname> <given-names>AS</given-names>
</name>
</person-group>. <article-title>DeepTCR is a Deep Learning Framework for Revealing Sequence Concepts Within T-Cell Repertoires</article-title>. <source>Nat Commun</source> (<year>2021</year>) <volume>12</volume>:<fpage>1</fpage>&#x2013;<lpage>12</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41467-021-21879-w</pub-id>
</citation>
</ref>
<ref id="B111">
<label>111</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chaudhari</surname> <given-names>S</given-names>
</name>
<name>
<surname>Mithal</surname> <given-names>V</given-names>
</name>
<name>
<surname>Polatkan</surname> <given-names>G</given-names>
</name>
<name>
<surname>Ramanath</surname> <given-names>R</given-names>
</name>
</person-group>. <article-title>An Attentive Survey of Attention Models</article-title>. <source>ACM Trans Intell Syst Technol</source> (<year>2021</year>) <volume>12</volume>:<page-range>1&#x2013;32</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1145/3465055</pub-id>
</citation>
</ref>
<ref id="B112">
<label>112</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Springer</surname> <given-names>I</given-names>
</name>
<name>
<surname>Besser</surname> <given-names>H</given-names>
</name>
<name>
<surname>Tickotsky-Moskovitz</surname> <given-names>N</given-names>
</name>
<name>
<surname>Dvorkin</surname> <given-names>S</given-names>
</name>
<name>
<surname>Louzoun</surname> <given-names>Y</given-names>
</name>
</person-group>. <article-title>Prediction of Specific TCR-Peptide Binding From Large Dictionaries of TCR-Peptide Pairs</article-title>. <source>Front Immunol</source> (<year>2020</year>) <volume>11</volume>:<elocation-id>1803</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2020.01803</pub-id>
</citation>
</ref>
<ref id="B113">
<label>113</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fischer</surname> <given-names>DS</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Schubert</surname> <given-names>B</given-names>
</name>
<name>
<surname>Theis</surname> <given-names>FJ</given-names>
</name>
</person-group>. <article-title>Predicting Antigen Specificity of Single T Cells Based on TCR CDR3 Regions</article-title>. <source>Mol Syst Biol</source> (<year>2020</year>) <volume>16</volume>:<fpage>e9416</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.15252/msb.20199416</pub-id>
</citation>
</ref>
<ref id="B114">
<label>114</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname> <given-names>T</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>J</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Jiang</surname> <given-names>P</given-names>
</name>
<name>
<surname>Xiao</surname> <given-names>X</given-names>
</name>
<etal/>
</person-group>. <article-title>Deep Learning-Based Prediction of the T Cell Receptor&#x2013;Antigen Binding Specificity</article-title>. <source>Nat Mach Intell</source> (<year>2021</year>) <volume>3</volume>:<page-range>864&#x2013;75</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s42256-021-00383-2</pub-id>
</citation>
</ref>
<ref id="B115">
<label>115</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nielsen</surname> <given-names>M</given-names>
</name>
<name>
<surname>Andreatta</surname> <given-names>M</given-names>
</name>
<name>
<surname>Peters</surname> <given-names>B</given-names>
</name>
<name>
<surname>Buus</surname> <given-names>S</given-names>
</name>
</person-group>. <article-title>Immunoinformatics: Predicting Peptide&#x2013;MHC Binding</article-title>. <source>Annu Rev Biomed Data Sci</source> (<year>2020</year>) <volume>3</volume>:<fpage>191</fpage>&#x2013;<lpage>215</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1146/annurev-biodatasci-021920-100259</pub-id>
</citation>
</ref>
<ref id="B116">
<label>116</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jumper</surname> <given-names>J</given-names>
</name>
<name>
<surname>Evans</surname> <given-names>R</given-names>
</name>
<name>
<surname>Pritzel</surname> <given-names>A</given-names>
</name>
<name>
<surname>Green</surname> <given-names>T</given-names>
</name>
<name>
<surname>Figurnov</surname> <given-names>M</given-names>
</name>
<name>
<surname>Ronneberger</surname> <given-names>O</given-names>
</name>
<etal/>
</person-group>. <article-title>Highly Accurate Protein Structure Prediction With AlphaFold</article-title>. <source>Nature</source> (<year>2021</year>) <volume>596</volume>:<page-range>583&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41586-021-03819-2</pub-id>
</citation>
</ref>
<ref id="B117">
<label>117</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Isacchini</surname> <given-names>G</given-names>
</name>
<name>
<surname>Sethna</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Elhanati</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Nourmohammad</surname> <given-names>A</given-names>
</name>
<name>
<surname>Walczak</surname> <given-names>AM</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
</person-group>. <article-title>Generative Models of T-Cell Receptor Sequences</article-title>. <source>Phys Rev E</source> (<year>2020</year>) <volume>101</volume>:<elocation-id>062414</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1103/PhysRevE.101.062414</pub-id>
</citation>
</ref>
<ref id="B118">
<label>118</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mei</surname> <given-names>S</given-names>
</name>
<name>
<surname>Li</surname> <given-names>F</given-names>
</name>
<name>
<surname>Leier</surname> <given-names>A</given-names>
</name>
<name>
<surname>Marquez-Lago</surname> <given-names>TT</given-names>
</name>
<name>
<surname>Giam</surname> <given-names>K</given-names>
</name>
<name>
<surname>Croft</surname> <given-names>NP</given-names>
</name>
<etal/>
</person-group>. <article-title>A Comprehensive Review and Performance Evaluation of Bioinformatics Tools for HLA Class I Peptide-Binding Prediction</article-title>. <source>Briefings Bioinf</source> (<year>2020</year>) <volume>21</volume>:<page-range>1119&#x2013;35</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bib/bbz051</pub-id>
</citation>
</ref>
<ref id="B119">
<label>119</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Isacchini</surname> <given-names>G</given-names>
</name>
<name>
<surname>Walczak</surname> <given-names>AM</given-names>
</name>
<name>
<surname>Mora</surname> <given-names>T</given-names>
</name>
<name>
<surname>Nourmohammad</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>Deep Generative Selection Models of T and B Cell Receptor Repertoires With Sonnia</article-title>. <source>Proc Natl Acad Sci</source> (<year>2021</year>) <volume>118</volume>:<elocation-id>e2023141118</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.2023141118</pub-id>
</citation>
</ref>
<ref id="B120">
<label>120</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Akbar</surname> <given-names>R</given-names>
</name>
<name>
<surname>Robert</surname> <given-names>PA</given-names>
</name>
<name>
<surname>Pavlovi&#x107;</surname> <given-names>M</given-names>
</name>
<name>
<surname>Jeliazkov</surname> <given-names>JR</given-names>
</name>
<name>
<surname>Snapkov</surname> <given-names>I</given-names>
</name>
<name>
<surname>Slabodkin</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>A Compact Vocabulary of Paratope-Epitope Interactions Enables Predictability of Antibody-Antigen Binding</article-title>. <source>Cell Rep</source> (<year>2021</year>) <volume>34</volume>:<elocation-id>108856</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.celrep.2021.108856</pub-id>
</citation>
</ref>
<ref id="B121">
<label>121</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Mikolov</surname> <given-names>T</given-names>
</name>
<name>
<surname>Sutskever</surname> <given-names>I</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>K</given-names>
</name>
<name>
<surname>Corrado</surname> <given-names>GS</given-names>
</name>
<name>
<surname>Dean</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>Distributed Representations of Words and Phrases and Their Compositionality</article-title>. In: <person-group person-group-type="editor">
<name>
<surname>Burges</surname> <given-names>C</given-names>
</name>
<name>
<surname>Bottou</surname> <given-names>L</given-names>
</name>
<name>
<surname>Welling</surname> <given-names>M</given-names>
</name>
<name>
<surname>Ghahramani</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Weinberger</surname> <given-names>K</given-names>
</name>
</person-group>, editors. <source>Advances in Neural Information Processing Systems</source>, vol. <volume>26</volume>. <publisher-name>Curran Associates, Inc</publisher-name> (<year>2013</year>).</citation>
</ref>
<ref id="B122">
<label>122</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Mikolov</surname> <given-names>T</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>K</given-names>
</name>
<name>
<surname>Corrado</surname> <given-names>G</given-names>
</name>
<name>
<surname>Dean</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>Efficient Estimation of Word Representations in Vector Space</article-title>. In: <person-group person-group-type="editor">
<name>
<surname>Bengio</surname> <given-names>Y</given-names>
</name>
<name>
<surname>LeCun</surname> <given-names>Y</given-names>
</name>
</person-group>, editors. <source>International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track Proceedings</source>. <publisher-loc>Scottsdale</publisher-loc>: <publisher-name>ICLR</publisher-name> (<year>2013</year>).</citation>
</ref>
<ref id="B123">
<label>123</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ostrovsky-Berman</surname> <given-names>M</given-names>
</name>
<name>
<surname>Frankel</surname> <given-names>B</given-names>
</name>
<name>
<surname>Polak</surname> <given-names>P</given-names>
</name>
<name>
<surname>Yaari</surname> <given-names>G</given-names>
</name>
</person-group>. <article-title>Immune2vec: Embedding B/T Cell Receptor Sequences in &#x211d;<sup><italic>N</italic></sup> Using Natural Language Processing</article-title>. <source>Front Immunol</source> (<year>2021</year>) <volume>12</volume>:<elocation-id>680687</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2021.680687</pub-id>
</citation>
</ref>
<ref id="B124">
<label>124</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bojanowski</surname> <given-names>P</given-names>
</name>
<name>
<surname>Grave</surname> <given-names>E</given-names>
</name>
<name>
<surname>Joulin</surname> <given-names>A</given-names>
</name>
<name>
<surname>Mikolov</surname> <given-names>T</given-names>
</name>
</person-group>. <article-title>Enriching Word Vectors With Subword Information</article-title>. <source>Trans Assoc Comput Linguistics</source> (<year>2017</year>) <volume>5</volume>:<page-range>135&#x2013;46</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1162/tacl_a_00051</pub-id>
</citation>
</ref>
<ref id="B125">
<label>125</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Vaswani</surname> <given-names>A</given-names>
</name>
<name>
<surname>Shazeer</surname> <given-names>N</given-names>
</name>
<name>
<surname>Parmar</surname> <given-names>N</given-names>
</name>
<name>
<surname>Uszkoreit</surname> <given-names>J</given-names>
</name>
<name>
<surname>Jones</surname> <given-names>L</given-names>
</name>
<name>
<surname>Gomez</surname> <given-names>AN</given-names>
</name>
<etal/>
</person-group>. <article-title>Attention is All You Need</article-title>. In: <person-group person-group-type="editor">
<name>
<surname>Guyon</surname> <given-names>I</given-names>
</name>
<name>
<surname>Luxburg</surname> <given-names>UV</given-names>
</name>
<name>
<surname>Bengio</surname> <given-names>S</given-names>
</name>
<name>
<surname>Wallach</surname> <given-names>H</given-names>
</name>
<name>
<surname>Fergus</surname> <given-names>R</given-names>
</name>
</person-group>, editors. <source>Advances in Neural Information Processing Systems</source>, vol. <volume>30</volume>. <publisher-loc>New York</publisher-loc>: <publisher-name>Curran Associates, Inc</publisher-name> (<year>2017</year>).</citation>
</ref>
<ref id="B126">
<label>126</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Devlin</surname> <given-names>J</given-names>
</name>
<name>
<surname>Chang</surname> <given-names>MW</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>K</given-names>
</name>
<name>
<surname>Toutanova</surname> <given-names>K</given-names>
</name>
</person-group>. <article-title>BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding</article-title>. In: <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</source>. <publisher-loc>Minneapolis, Minnesota</publisher-loc>: <publisher-name>Assoc Comput Linguistics</publisher-name> (<year>2019</year>). <page-range>4171&#x2013;86</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.18653/v1/N19-1423</pub-id>
</citation>
</ref>
<ref id="B127">
<label>127</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Brown</surname> <given-names>T</given-names>
</name>
<name>
<surname>Mann</surname> <given-names>B</given-names>
</name>
<name>
<surname>Ryder</surname> <given-names>N</given-names>
</name>
<name>
<surname>Subbiah</surname> <given-names>M</given-names>
</name>
<name>
<surname>Kaplan</surname> <given-names>JD</given-names>
</name>
<name>
<surname>Dhariwal</surname> <given-names>P</given-names>
</name>
<etal/>
</person-group>. <article-title>Language Models are Few-Shot Learners</article-title>. In: <person-group person-group-type="editor">
<name>
<surname>Larochelle</surname> <given-names>H</given-names>
</name>
<name>
<surname>Ranzato</surname> <given-names>M</given-names>
</name>
<name>
<surname>Hadsell</surname> <given-names>R</given-names>
</name>
<name>
<surname>Balcan</surname> <given-names>MF</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>H</given-names>
</name>
</person-group>, editors. <source>Advances in Neural Information Processing Systems</source>, vol. <volume>33</volume>. <publisher-loc>New York</publisher-loc>: <publisher-name>Curran Associates, Inc</publisher-name> (<year>2020</year>). <page-range>1877&#x2013;901</page-range>.</citation>
</ref>
<ref id="B128">
<label>128</label>
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Radford</surname> <given-names>A</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>J</given-names>
</name>
<name>
<surname>Child</surname> <given-names>R</given-names>
</name>
<name>
<surname>Luan</surname> <given-names>D</given-names>
</name>
<name>
<surname>Amodei</surname> <given-names>D</given-names>
</name>
<name>
<surname>Sutskever</surname> <given-names>I</given-names>
</name>
</person-group>. <source>Language Models are Unsupervised Multitask Learners</source> (<year>2019</year>). Available at: <uri xlink:href="https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf">https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf</uri> (Accessed <access-date>May 17, 2022</access-date>).</citation>
</ref>
<ref id="B129">
<label>129</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rao</surname> <given-names>R</given-names>
</name>
<name>
<surname>Bhattacharya</surname> <given-names>N</given-names>
</name>
<name>
<surname>Thomas</surname> <given-names>N</given-names>
</name>
<name>
<surname>Duan</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>X</given-names>
</name>
<name>
<surname>Canny</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>Evaluating Protein Transfer Learning With TAPE</article-title>. <source>Adv Neural Inf Process Syst</source> (<year>2019</year>) <volume>32</volume>:<page-range>9689&#x2013;701</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1101/676825</pub-id>
</citation>
</ref>
<ref id="B130">
<label>130</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Elnaggar</surname> <given-names>A</given-names>
</name>
<name>
<surname>Heinzinger</surname> <given-names>M</given-names>
</name>
<name>
<surname>Dallago</surname> <given-names>C</given-names>
</name>
<name>
<surname>Rehawi</surname> <given-names>G</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Jones</surname> <given-names>L</given-names>
</name>
<etal/>
</person-group>. <article-title>ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing</article-title>. <source>IEEE Trans Pattern Anal Mach Intell</source> (<year>2021</year>) <volume>2021</volume>:<fpage>1</fpage>&#x2013;<lpage>1</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TPAMI.2021.3095381</pub-id>
</citation>
</ref>
<ref id="B131">
<label>131</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rives</surname> <given-names>A</given-names>
</name>
<name>
<surname>Meier</surname> <given-names>J</given-names>
</name>
<name>
<surname>Sercu</surname> <given-names>T</given-names>
</name>
<name>
<surname>Goyal</surname> <given-names>S</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>Biological Structure and Function Emerge From Scaling Unsupervised Learning to 250 Million Protein Sequences</article-title>. <source>Proc Natl Acad Sci</source> (<year>2021</year>) <volume>118</volume>:<fpage>e2016239118</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.2016239118</pub-id>
</citation>
</ref>
<ref id="B132">
<label>132</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brandes</surname> <given-names>N</given-names>
</name>
<name>
<surname>Ofer</surname> <given-names>D</given-names>
</name>
<name>
<surname>Peleg</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Rappoport</surname> <given-names>N</given-names>
</name>
<name>
<surname>Linial</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>ProteinBERT: A Universal Deep-Learning Model of Protein Sequence and Function</article-title>. <source>Bioinformatics</source> (<year>2022</year>) <volume>38</volume>:<page-range>2102&#x2013;10</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btac020</pub-id>
</citation>
</ref>
<ref id="B133">
<label>133</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname> <given-names>J</given-names>
</name>
<name>
<surname>Bendjama</surname> <given-names>K</given-names>
</name>
<name>
<surname>Rittner</surname> <given-names>K</given-names>
</name>
<name>
<surname>Malone</surname> <given-names>B</given-names>
</name>
</person-group>. <article-title>BERTMHC: Improved MHC&#x2013;Peptide Class II Interaction Prediction With Transformer and Multiple Instance Learning</article-title>. <source>Bioinformatics</source> (<year>2021</year>) <volume>37</volume>:<page-range>4172&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btab422</pub-id>
</citation>
</ref>
<ref id="B134">
<label>134</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gasser</surname> <given-names>HC</given-names>
</name>
<name>
<surname>Bedran</surname> <given-names>G</given-names>
</name>
<name>
<surname>Ren</surname> <given-names>B</given-names>
</name>
<name>
<surname>Goodlett</surname> <given-names>D</given-names>
</name>
<name>
<surname>Alfaro</surname> <given-names>J</given-names>
</name>
<name>
<surname>Rajan</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>Interpreting BERT Architecture Predictions for Peptide Presentation by MHC Class I Proteins [Preprint]</article-title>. <source>arXiv</source> (<year>2021</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.48550/ARXIV.2111.07137</pub-id>
</citation>
</ref>
<ref id="B135">
<label>135</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hashemi</surname> <given-names>N</given-names>
</name>
<name>
<surname>Hao</surname> <given-names>B</given-names>
</name>
<name>
<surname>Ignatov</surname> <given-names>M</given-names>
</name>
<name>
<surname>Paschalidis</surname> <given-names>I</given-names>
</name>
<name>
<surname>Vakili</surname> <given-names>P</given-names>
</name>
<name>
<surname>Vajda</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Improved Predictions of MHC-Peptide Binding Using Protein Language Models [Preprint]</article-title>. <source>bioRxiv</source> (<year>2022</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.1101/2022.02.11.479844</pub-id>
</citation>
</ref>
<ref id="B136">
<label>136</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leem</surname> <given-names>J</given-names>
</name>
<name>
<surname>Mitchell</surname> <given-names>LS</given-names>
</name>
<name>
<surname>Farmery</surname> <given-names>JH</given-names>
</name>
<name>
<surname>Barton</surname> <given-names>J</given-names>
</name>
<name>
<surname>Galson</surname> <given-names>JD</given-names>
</name>
</person-group>. <article-title>Deciphering the Language of Antibodies Using Self-Supervised Learning [Preprint]</article-title>. <source>bioRxiv</source> (<year>2021</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.1101/2021.11.10.468064</pub-id>
</citation>
</ref>
<ref id="B137">
<label>137</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Shuai</surname> <given-names>RW</given-names>
</name>
<name>
<surname>Ruffolo</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Gray</surname> <given-names>JJ</given-names>
</name>
</person-group>. <article-title>Generative Language Modeling for Antibody Design</article-title> (<year>2021</year>), Paper presented at: <conf-name>Machine Learning for Structural Biology Workshop at the 35th Conference on Neural Information Processing Systems</conf-name>, <conf-date>2021 Dec 13</conf-date>. doi:&#xa0;<pub-id pub-id-type="doi">10.1101/2021.12.13.472419</pub-id>
</citation>
</ref>
<ref id="B138">
<label>138</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bradley</surname> <given-names>P</given-names>
</name>
<name>
<surname>Thomas</surname> <given-names>PG</given-names>
</name>
</person-group>. <article-title>Using T Cell Receptor Repertoires to Understand the Principles of Adaptive Immune Recognition</article-title>. <source>Annu Rev Immunol</source> (<year>2019</year>) <volume>37</volume>:<page-range>547&#x2013;70</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1146/annurev-immunol-042718-041757</pub-id>
</citation>
</ref>
<ref id="B139">
<label>139</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Greiff</surname> <given-names>V</given-names>
</name>
<name>
<surname>Yaari</surname> <given-names>G</given-names>
</name>
<name>
<surname>Cowell</surname> <given-names>LG</given-names>
</name>
</person-group>. <article-title>Mining Adaptive Immune Receptor Repertoires for Biological and Clinical Information Using Machine Learning</article-title>. <source>Curr Opin Syst Biol</source> (<year>2020</year>) <volume>24</volume>:<page-range>109&#x2013;19</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.coisb.2020.10.010</pub-id>
</citation>
</ref>
<ref id="B140">
<label>140</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zvyagin</surname> <given-names>IV</given-names>
</name>
<name>
<surname>Tsvetkov</surname> <given-names>VO</given-names>
</name>
<name>
<surname>Chudakov</surname> <given-names>DM</given-names>
</name>
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>An Overview of Immunoinformatics Approaches and Databases Linking T Cell Receptor Repertoires to Their Antigen Specificity</article-title>. <source>Immunogenetics</source> (<year>2020</year>) <volume>72</volume>:<fpage>77</fpage>&#x2013;<lpage>84</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s00251-019-01139-4</pub-id>
</citation>
</ref>
<ref id="B141">
<label>141</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>M&#xf6;sch</surname> <given-names>A</given-names>
</name>
<name>
<surname>Raffegerst</surname> <given-names>S</given-names>
</name>
<name>
<surname>Weis</surname> <given-names>M</given-names>
</name>
<name>
<surname>Schendel</surname> <given-names>DJ</given-names>
</name>
<name>
<surname>Frishman</surname> <given-names>D</given-names>
</name>
</person-group>. <article-title>Machine Learning for Cancer Immunotherapies Based on Epitope Recognition by T Cell Receptors</article-title>. <source>Front Genet</source> (<year>2019</year>) <volume>10</volume>:<elocation-id>1141</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fgene.2019.01141</pub-id>
</citation>
</ref>
<ref id="B142">
<label>142</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gielis</surname> <given-names>S</given-names>
</name>
<name>
<surname>Moris</surname> <given-names>P</given-names>
</name>
<name>
<surname>Bittremieux</surname> <given-names>W</given-names>
</name>
<name>
<surname>De Neuter</surname> <given-names>N</given-names>
</name>
<name>
<surname>Ogunjimi</surname> <given-names>B</given-names>
</name>
<name>
<surname>Laukens</surname> <given-names>K</given-names>
</name>
<etal/>
</person-group>. <article-title>Detection of Enriched T Cell Epitope Specificity in Full T Cell Receptor Sequence Repertoires</article-title>. <source>Front Immunol</source> (<year>2019</year>) <volume>10</volume>:<elocation-id>2820</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2019.02820</pub-id>
</citation>
</ref>
<ref id="B143">
<label>143</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ogishi</surname> <given-names>M</given-names>
</name>
<name>
<surname>Yotsuyanagi</surname> <given-names>H</given-names>
</name>
</person-group>. <article-title>Quantitative Prediction of the Landscape of T Cell Epitope Immunogenicity in Sequence Space</article-title>. <source>Front Immunol</source> (<year>2019</year>) <volume>10</volume>:<elocation-id>827</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2019.00827</pub-id>
</citation>
</ref>
<ref id="B144">
<label>144</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Heather</surname> <given-names>JM</given-names>
</name>
<name>
<surname>Best</surname> <given-names>K</given-names>
</name>
<name>
<surname>Oakes</surname> <given-names>T</given-names>
</name>
<name>
<surname>Gray</surname> <given-names>ER</given-names>
</name>
<name>
<surname>Roe</surname> <given-names>JK</given-names>
</name>
<name>
<surname>Thomas</surname> <given-names>N</given-names>
</name>
<etal/>
</person-group>. <article-title>Dynamic Perturbations of the T-Cell Receptor Repertoire in Chronic HIV Infection and Following Antiretroviral Therapy</article-title>. <source>Front Immunol</source> (<year>2016</year>) <volume>6</volume>:<elocation-id>644</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2015.00644</pub-id>
</citation>
</ref>
<ref id="B145">
<label>145</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qi</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Cavanagh</surname> <given-names>MM</given-names>
</name>
<name>
<surname>Le Saux</surname> <given-names>S</given-names>
</name>
<name>
<surname>NamKoong</surname> <given-names>H</given-names>
</name>
<name>
<surname>Kim</surname> <given-names>C</given-names>
</name>
<name>
<surname>Turgano</surname> <given-names>E</given-names>
</name>
<etal/>
</person-group>. <article-title>Diversification of the Antigen-Specific T Cell Receptor Repertoire After Varicella Zoster Vaccination</article-title>. <source>Sci Trans Med</source> (<year>2016</year>) <volume>8</volume>:<fpage>332ra46</fpage>&#x2013;<lpage>332ra46</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1126/scitranslmed.aaf1725</pub-id>
</citation>
</ref>
<ref id="B146">
<label>146</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Teraguchi</surname> <given-names>S</given-names>
</name>
<name>
<surname>Saputri</surname> <given-names>DS</given-names>
</name>
<name>
<surname>Llamas-Covarrubias</surname> <given-names>MA</given-names>
</name>
<name>
<surname>Davila</surname> <given-names>A</given-names>
</name>
<name>
<surname>Diez</surname> <given-names>D</given-names>
</name>
<name>
<surname>Nazlica</surname> <given-names>SA</given-names>
</name>
<etal/>
</person-group>. <article-title>Methods for Sequence and Structural Analysis of B and T Cell Receptor Repertoires</article-title>. <source>Comput Struct Biotechnol J</source> (<year>2020</year>) <volume>18</volume>:<page-range>2000&#x2013;11</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.csbj.2020.07.008</pub-id>
</citation>
</ref>
<ref id="B147">
<label>147</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nazarov</surname> <given-names>VI</given-names>
</name>
<name>
<surname>Pogorelyy</surname> <given-names>MV</given-names>
</name>
<name>
<surname>Komech</surname> <given-names>EA</given-names>
</name>
<name>
<surname>Zvyagin</surname> <given-names>IV</given-names>
</name>
<name>
<surname>Bolotin</surname> <given-names>DA</given-names>
</name>
<name>
<surname>Shugay</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>tcR: An R Package for T Cell Receptor Repertoire Advanced Data Analysis</article-title>. <source>BMC Bioinf</source> (<year>2015</year>) <volume>16</volume>:<fpage>1</fpage>&#x2013;<lpage>5</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s12859-015-0613-1</pub-id>
</citation>
</ref>
<ref id="B148">
<label>148</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pavlovi&#x107;</surname> <given-names>M</given-names>
</name>
<name>
<surname>Scheffer</surname> <given-names>L</given-names>
</name>
<name>
<surname>Motwani</surname> <given-names>K</given-names>
</name>
<name>
<surname>Kanduri</surname> <given-names>C</given-names>
</name>
<name>
<surname>Kompova</surname> <given-names>R</given-names>
</name>
<name>
<surname>Vazov</surname> <given-names>N</given-names>
</name>
<etal/>
</person-group>. <article-title>The immuneML Ecosystem for Machine Learning Analysis of Adaptive Immune Receptor Repertoires</article-title>. <source>Nat Mach Intell</source> (<year>2021</year>) <volume>3</volume>:<page-range>936&#x2013;44</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s42256-021-00413-z</pub-id>
</citation>
</ref>
<ref id="B149">
<label>149</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Arnaout</surname> <given-names>RA</given-names>
</name>
<name>
<surname>Prak</surname> <given-names>ETL</given-names>
</name>
<name>
<surname>Schwab</surname> <given-names>N</given-names>
</name>
<name>
<surname>Rubelt</surname> <given-names>F</given-names>
</name>
<collab>Adaptive Immune Receptor Repertoire Community</collab>
</person-group>. <article-title>The Future of Blood Testing Is the Immunome</article-title>. <source>Front Immunol</source> (<year>2021</year>) <volume>12</volume>:<elocation-id>626793</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2021.626793</pub-id>
</citation>
</ref>
<ref id="B150">
<label>150</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>W</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>M</given-names>
</name>
<name>
<surname>Fu</surname> <given-names>L</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>L</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>T Cell Receptor <italic>&#x3b2;</italic> Repertoires as Novel Diagnostic Markers for Systemic Lupus Erythematosus and Rheumatoid Arthritis</article-title>. <source>Ann Rheumatic Dis</source> (<year>2019</year>) <volume>78</volume>:<page-range>1070&#x2013;8</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1136/annrheumdis-2019-215442</pub-id>
</citation>
</ref>
<ref id="B151">
<label>151</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ye</surname> <given-names>X</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Ye</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>P</given-names>
</name>
<name>
<surname>Song</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>High-Throughput Sequencing-Based Analysis of T Cell Repertoire in Lupus Nephritis</article-title>. <source>Front Immunol</source> (<year>2020</year>) <volume>11</volume>:<elocation-id>1618</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2020.01618</pub-id>
</citation>
</ref>
<ref id="B152">
<label>152</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bashford-Rogers</surname> <given-names>RJM</given-names>
</name>
<name>
<surname>Bergamaschi</surname> <given-names>L</given-names>
</name>
<name>
<surname>McKinney</surname> <given-names>EF</given-names>
</name>
<name>
<surname>Pombal</surname> <given-names>DC</given-names>
</name>
<name>
<surname>Mescia</surname> <given-names>F</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>JC</given-names>
</name>
<etal/>
</person-group>. <article-title>Analysis of the B Cell Receptor Repertoire in Six Immune-Mediated Diseases</article-title>. <source>Nature</source> (<year>2019</year>) <volume>574</volume>:<page-range>122&#x2013;6</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41586-019-1595-3</pub-id>
</citation>
</ref>
<ref id="B153">
<label>153</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stadinski</surname> <given-names>BD</given-names>
</name>
<name>
<surname>Shekhar</surname> <given-names>K</given-names>
</name>
<name>
<surname>G&#xf3;mez-Touri&#xf1;o</surname> <given-names>I</given-names>
</name>
<name>
<surname>Jung</surname> <given-names>J</given-names>
</name>
<name>
<surname>Sasaki</surname> <given-names>K</given-names>
</name>
<name>
<surname>Sewell</surname> <given-names>AK</given-names>
</name>
<etal/>
</person-group>. <article-title>Hydrophobic CDR3 Residues Promote the Development of Self-Reactive T Cells</article-title>. <source>Nat Immunol</source> (<year>2016</year>) <volume>17</volume>:<page-range>946&#x2013;55</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/ni.3491</pub-id>
</citation>
</ref>
<ref id="B154">
<label>154</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Daley</surname> <given-names>SR</given-names>
</name>
<name>
<surname>Koay</surname> <given-names>HF</given-names>
</name>
<name>
<surname>Dobbs</surname> <given-names>K</given-names>
</name>
<name>
<surname>Bosticardo</surname> <given-names>M</given-names>
</name>
<name>
<surname>Wirasinha</surname> <given-names>RC</given-names>
</name>
<name>
<surname>Pala</surname> <given-names>F</given-names>
</name>
<etal/>
</person-group>. <article-title>Cysteine and Hydrophobic Residues in CDR3 Serve as Distinct T-Cell Self-Reactivity Indices</article-title>. <source>J Allergy Clin Immunol</source> (<year>2019</year>) <volume>144</volume>:<page-range>333&#x2013;6</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.jaci.2019.03.022</pub-id>
</citation>
</ref>
<ref id="B155">
<label>155</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lagattuta</surname> <given-names>KA</given-names>
</name>
<name>
<surname>Kang</surname> <given-names>JB</given-names>
</name>
<name>
<surname>Nathan</surname> <given-names>A</given-names>
</name>
<name>
<surname>Pauken</surname> <given-names>KE</given-names>
</name>
<name>
<surname>Jonsson</surname> <given-names>AH</given-names>
</name>
<name>
<surname>Rao</surname> <given-names>DA</given-names>
</name>
<etal/>
</person-group>. <article-title>Repertoire Analyses Reveal T Cell Antigen Receptor Sequence Features That Influence T Cell Fate</article-title>. <source>Nat Immunol</source> (<year>2022</year>) <volume>23</volume>:<page-range>446&#x2013;57</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41590-022-01129-x</pub-id>
</citation>
</ref>
<ref id="B156">
<label>156</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carreno</surname> <given-names>BM</given-names>
</name>
<name>
<surname>Magrini</surname> <given-names>V</given-names>
</name>
<name>
<surname>Becker-Hapak</surname> <given-names>M</given-names>
</name>
<name>
<surname>Kaabinejadian</surname> <given-names>S</given-names>
</name>
<name>
<surname>Hundal</surname> <given-names>J</given-names>
</name>
<name>
<surname>Petti</surname> <given-names>AA</given-names>
</name>
<etal/>
</person-group>. <article-title>A Dendritic Cell Vaccine Increases the Breadth and Diversity of Melanoma Neoantigen-Specific T Cells</article-title>. <source>Science</source> (<year>2015</year>) <volume>348</volume>:<page-range>803&#x2013;8</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1126/science.aaa3828</pub-id>
</citation>
</ref>
<ref id="B157">
<label>157</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blass</surname> <given-names>E</given-names>
</name>
<name>
<surname>Ott</surname> <given-names>PA</given-names>
</name>
</person-group>. <article-title>Advances in the Development of Personalized Neoantigen-Based Therapeutic Cancer Vaccines</article-title>. <source>Nat Rev Clin Oncol</source> (<year>2021</year>) <volume>18</volume>:<page-range>215&#x2013;29</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41571-020-00460-2</pub-id>
</citation>
</ref>
<ref id="B158">
<label>158</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garcia-Garijo</surname> <given-names>A</given-names>
</name>
<name>
<surname>Fajardo</surname> <given-names>CA</given-names>
</name>
<name>
<surname>Gros</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>Determinants for Neoantigen Identification</article-title>. <source>Front Immunol</source> (<year>2019</year>) <volume>10</volume>:<elocation-id>1392</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2019.01392</pub-id>
</citation>
</ref>
<ref id="B159">
<label>159</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vizca&#xed;no</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Kubiniok</surname> <given-names>P</given-names>
</name>
<name>
<surname>Kovalchik</surname> <given-names>KA</given-names>
</name>
<name>
<surname>Ma</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Duquette</surname> <given-names>JD</given-names>
</name>
<name>
<surname>Mongrain</surname> <given-names>I</given-names>
</name>
<etal/>
</person-group>. <article-title>The Human Immunopeptidome Project: A Roadmap to Predict and Treat Immune Diseases</article-title>. <source>Mol Cell Proteomics</source> (<year>2020</year>) <volume>19</volume>:<fpage>31</fpage>&#x2013;<lpage>49</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1074/mcp.R119.001743</pub-id>
</citation>
</ref>
<ref id="B160">
<label>160</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Br&#xfc;ggemann</surname> <given-names>M</given-names>
</name>
<name>
<surname>Kotrov&#xe1;</surname> <given-names>M</given-names>
</name>
<name>
<surname>Knecht</surname> <given-names>H</given-names>
</name>
<name>
<surname>Bartram</surname> <given-names>J</given-names>
</name>
<name>
<surname>Boudjogrha</surname> <given-names>M</given-names>
</name>
<name>
<surname>Bystry</surname> <given-names>V</given-names>
</name>
<etal/>
</person-group>. <article-title>Standardized Next-Generation Sequencing of Immunoglobulin and T-Cell Receptor Gene Recombinations for MRD Marker Identification in Acute Lymphoblastic Leukaemia; a EuroClonality-NGS Validation Study</article-title>. <source>Leukemia</source> (<year>2019</year>) <volume>33</volume>:<page-range>2241&#x2013;53</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41375-019-0496-7</pub-id>
</citation>
</ref>
<ref id="B161">
<label>161</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vander Heiden</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Marquez</surname> <given-names>S</given-names>
</name>
<name>
<surname>Marthandan</surname> <given-names>N</given-names>
</name>
<name>
<surname>Bukhari</surname> <given-names>SAC</given-names>
</name>
<name>
<surname>Busse</surname> <given-names>CE</given-names>
</name>
<name>
<surname>Corrie</surname> <given-names>B</given-names>
</name>
<etal/>
</person-group>. <article-title>Community Standardized Representations for Annotated Immune Repertoires</article-title>. <source>Front Immunol</source> (<year>2018</year>) <volume>9</volume>:<elocation-id>2206</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fimmu.2018.02206</pub-id>
</citation>
</ref>
<ref id="B162">
<label>162</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schulthei&#xdf;</surname> <given-names>C</given-names>
</name>
<name>
<surname>Paschold</surname> <given-names>L</given-names>
</name>
<name>
<surname>Simnica</surname> <given-names>D</given-names>
</name>
<name>
<surname>Mohme</surname> <given-names>M</given-names>
</name>
<name>
<surname>Willscher</surname> <given-names>E</given-names>
</name>
<name>
<surname>von Wenserski</surname> <given-names>L</given-names>
</name>
<etal/>
</person-group>. <article-title>Next-Generation Sequencing of T and B Cell Receptor Repertoires From COVID-19 Patients Showed Signatures Associated With Severity of Disease</article-title>. <source>Immunity</source> (<year>2020</year>) <volume>53</volume>:<fpage>442</fpage>&#x2013;<lpage>455.e4</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.immuni.2020.06.024</pub-id>
</citation>
</ref>
<ref id="B163">
<label>163</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>JY</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>XM</given-names>
</name>
<name>
<surname>Xing</surname> <given-names>X</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>C</given-names>
</name>
<name>
<surname>Song</surname> <given-names>JW</given-names>
</name>
<etal/>
</person-group>. <article-title>Single-Cell Landscape of Immunological Responses in Patients With COVID-19</article-title>. <source>Nat Immunol</source> (<year>2020</year>) <volume>21</volume>:<page-range>1107&#x2013;18</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41590-020-0762-x</pub-id>
</citation>
</ref>
<ref id="B164">
<label>164</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>P</given-names>
</name>
<name>
<surname>Jin</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>W</given-names>
</name>
<name>
<surname>Luo</surname> <given-names>M</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>C</given-names>
</name>
<etal/>
</person-group>. <article-title>Comprehensive Analysis of TCR Repertoire in COVID-19 Using Single Cell Sequencing</article-title>. <source>Genomics</source> (<year>2021</year>) <volume>113</volume>:<page-range>456&#x2013;62</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.ygeno.2020.12.036</pub-id>
</citation>
</ref>
<ref id="B165">
<label>165</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hou</surname> <given-names>X</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>G</given-names>
</name>
<name>
<surname>Fan</surname> <given-names>W</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>X</given-names>
</name>
<name>
<surname>Mo</surname> <given-names>C</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y</given-names>
</name>
<etal/>
</person-group>. <article-title>T-Cell Receptor Repertoires as Potential Diagnostic Markers for Patients With COVID-19</article-title>. <source>Int J Infect Dis</source> (<year>2021</year>) <volume>113</volume>:<page-range>308&#x2013;17</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.ijid.2021.10.033</pub-id>
</citation>
</ref>
<ref id="B166">
<label>166</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname> <given-names>CM</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>P</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>TH</given-names>
</name>
<name>
<surname>Alachkar</surname> <given-names>H</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>KY</given-names>
</name>
<name>
<surname>Chang</surname> <given-names>WC</given-names>
</name>
</person-group>. <article-title>Profiling of T Cell Repertoire in SARS-CoV-2-Infected COVID-19 Patients Between Mild Disease and Pneumonia</article-title>. <source>J Clin Immunol</source> (<year>2021</year>) <volume>41</volume>:<page-range>1131&#x2013;45</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s10875-021-01045-z</pub-id>
</citation>
</ref>
<ref id="B167">
<label>167</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname> <given-names>MH</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>S</given-names>
</name>
<name>
<surname>Porritt</surname> <given-names>RA</given-names>
</name>
<name>
<surname>Noval Rivas</surname> <given-names>M</given-names>
</name>
<name>
<surname>Paschold</surname> <given-names>L</given-names>
</name>
<name>
<surname>Willscher</surname> <given-names>E</given-names>
</name>
<etal/>
</person-group>. <article-title>Superantigenic Character of an Insert Unique to SARS-CoV-2 Spike Supported by Skewed TCR Repertoire in Patients With Hyperinflammation</article-title>. <source>Proc Natl Acad Sci</source> (<year>2020</year>) <volume>117</volume>:<page-range>25254&#x2013;62</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.2010722117</pub-id>
</citation>
</ref>
<ref id="B168">
<label>168</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simnica</surname> <given-names>D</given-names>
</name>
<name>
<surname>Schulthei&#xdf;</surname> <given-names>C</given-names>
</name>
<name>
<surname>Mohme</surname> <given-names>M</given-names>
</name>
<name>
<surname>Paschold</surname> <given-names>L</given-names>
</name>
<name>
<surname>Willscher</surname> <given-names>E</given-names>
</name>
<name>
<surname>Fitzek</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>Landscape of T-Cell Repertoires With Public COVID-19-Associated T-Cell Receptors in Pre-Pandemic Risk Cohorts</article-title>. <source>Clin Transl Immunol</source> (<year>2021</year>) <volume>10</volume>:<fpage>e1340</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1002/cti2.1340</pub-id>
</citation>
</ref>
<ref id="B169">
<label>169</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Minervina</surname> <given-names>AA</given-names>
</name>
<name>
<surname>Komech</surname> <given-names>EA</given-names>
</name>
<name>
<surname>Titov</surname> <given-names>A</given-names>
</name>
<name>
<surname>Bensouda Koraichi</surname> <given-names>M</given-names>
</name>
<name>
<surname>Rosati</surname> <given-names>E</given-names>
</name>
<name>
<surname>Mamedov</surname> <given-names>IZ</given-names>
</name>
<etal/>
</person-group>. <article-title>Longitudinal High-Throughput TCR Repertoire Profiling Reveals the Dynamics of T-Cell Memory Formation After Mild COVID-19 Infection</article-title>. <source>eLife</source> (<year>2021</year>) <volume>10</volume>:<fpage>e63502</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.7554/eLife.63502</pub-id>
</citation>
</ref>
<ref id="B170">
<label>170</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pogorelyy</surname> <given-names>MV</given-names>
</name>
<name>
<surname>Minervina</surname> <given-names>AA</given-names>
</name>
<name>
<surname>Touzel</surname> <given-names>MP</given-names>
</name>
<name>
<surname>Sycheva</surname> <given-names>AL</given-names>
</name>
<name>
<surname>Komech</surname> <given-names>EA</given-names>
</name>
<name>
<surname>Kovalenko</surname> <given-names>EI</given-names>
</name>
<etal/>
</person-group>. <article-title>Precise Tracking of Vaccine-Responding T Cell Clones Reveals Convergent and Personalized Response in Identical Twins</article-title>. <source>Proc Natl Acad Sci</source> (<year>2018</year>) <volume>115</volume>:<page-range>12704&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1073/pnas.1809642115</pub-id>
</citation>
</ref>
<ref id="B171">
<label>171</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Quiros-Fernandez</surname> <given-names>I</given-names>
</name>
<name>
<surname>Poorebrahim</surname> <given-names>M</given-names>
</name>
<name>
<surname>Fakhr</surname> <given-names>E</given-names>
</name>
<name>
<surname>Cid-Arregui</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>Immunogenic T Cell Epitopes of SARS-CoV-2 Are Recognized by Circulating Memory and Na&#xef;ve CD8 T Cells of Unexposed Individuals</article-title>. <source>EBioMedicine</source> (<year>2021</year>) <volume>72</volume>:<fpage>103610</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.ebiom.2021.103610</pub-id>
</citation>
</ref>
<ref id="B172">
<label>172</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stranzl</surname> <given-names>T</given-names>
</name>
<name>
<surname>Larsen</surname> <given-names>MV</given-names>
</name>
<name>
<surname>Lundegaard</surname> <given-names>C</given-names>
</name>
<name>
<surname>Nielsen</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>NetCTLpan: Pan-Specific MHC Class I Pathway Epitope Predictions</article-title>. <source>Immunogenetics</source> (<year>2010</year>) <volume>62</volume>:<page-range>357&#x2013;68</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s00251-010-0441-4</pub-id>
</citation>
</ref>
<ref id="B173">
<label>173</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gutierrez</surname> <given-names>L</given-names>
</name>
<name>
<surname>Beckford</surname> <given-names>J</given-names>
</name>
<name>
<surname>Alachkar</surname> <given-names>H</given-names>
</name>
</person-group>. <article-title>Deciphering the TCR Repertoire to Solve the COVID-19 Mystery</article-title>. <source>Trends Pharmacol Sci</source> (<year>2020</year>) <volume>41</volume>:<page-range>518&#x2013;30</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.tips.2020.06.001</pub-id>
</citation>
</ref>
<ref id="B174">
<label>174</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maecker</surname> <given-names>HT</given-names>
</name>
</person-group>. <article-title>Immune Profiling of COVID-19: Preliminary Findings and Implications for the Pandemic</article-title>. <source>J Immunother Cancer</source> (<year>2021</year>) <volume>9</volume>:<elocation-id>e002550</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1136/jitc-2021-002550</pub-id>
</citation>
</ref>
<ref id="B175">
<label>175</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gallo Marin</surname> <given-names>B</given-names>
</name>
<name>
<surname>Aghagoli</surname> <given-names>G</given-names>
</name>
<name>
<surname>Lavine</surname> <given-names>K</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>L</given-names>
</name>
<name>
<surname>Siff</surname> <given-names>EJ</given-names>
</name>
<name>
<surname>Chiang</surname> <given-names>SS</given-names>
</name>
<etal/>
</person-group>. <article-title>Predictors of COVID-19 Severity: A Literature Review</article-title>. <source>Rev Med Virol</source> (<year>2021</year>) <volume>31</volume>:<fpage>e2146</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1002/rmv.2146</pub-id>
</citation>
</ref>
<ref id="B176">
<label>176</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bartleson</surname> <given-names>JM</given-names>
</name>
<name>
<surname>Radenkovic</surname> <given-names>D</given-names>
</name>
<name>
<surname>Covarrubias</surname> <given-names>AJ</given-names>
</name>
<name>
<surname>Furman</surname> <given-names>D</given-names>
</name>
<name>
<surname>Winer</surname> <given-names>DA</given-names>
</name>
<name>
<surname>Verdin</surname> <given-names>E</given-names>
</name>
</person-group>. <article-title>SARS-CoV-2, COVID-19 and the Aging Immune System</article-title>. <source>Nat Aging</source> (<year>2021</year>) <volume>1</volume>:<page-range>769&#x2013;82</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s43587-021-00114-7</pub-id>
</citation>
</ref>
<ref id="B177">
<label>177</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bagaev</surname> <given-names>DV</given-names>
</name>
<name>
<surname>Vroomans</surname> <given-names>RMA</given-names>
</name>
<name>
<surname>Samir</surname> <given-names>J</given-names>
</name>
<name>
<surname>Stervbo</surname> <given-names>U</given-names>
</name>
<name>
<surname>Rius</surname> <given-names>C</given-names>
</name>
<name>
<surname>Dolton</surname> <given-names>G</given-names>
</name>
<etal/>
</person-group>. <article-title>VDJdb in 2019: Database Extension, New Analysis Infrastructure and a T-Cell Receptor Motif Compendium</article-title>. <source>Nucleic Acids Res</source> (<year>2019</year>) <volume>48</volume>:<page-range>D1057&#x2013;62</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/nar/gkz874</pub-id>
</citation>
</ref>
<ref id="B178">
<label>178</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goronzy</surname> <given-names>JJ</given-names>
</name>
<name>
<surname>Weyand</surname> <given-names>CM</given-names>
</name>
</person-group>. <article-title>Understanding Immunosenescence to Improve Responses to Vaccines</article-title>. <source>Nat Immunol</source> (<year>2013</year>) <volume>14</volume>:<page-range>428&#x2013;36</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/ni.2588</pub-id>
</citation>
</ref>
<ref id="B179">
<label>179</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ruder</surname> <given-names>S</given-names>
</name>
<name>
<surname>Peters</surname> <given-names>ME</given-names>
</name>
<name>
<surname>Swayamdipta</surname> <given-names>S</given-names>
</name>
<name>
<surname>Wolf</surname> <given-names>T</given-names>
</name>
</person-group>. <article-title>Transfer Learning in Natural Language Processing</article-title>. <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials</source>. <publisher-loc>Minneapolis, Minnesota</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name> (<year>2019</year>). p. <page-range>15&#x2013;8</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.18653/v1/N19-5004</pub-id>
</citation>
</ref>
<ref id="B180">
<label>180</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Ruffolo</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Gray</surname> <given-names>JJ</given-names>
</name>
<name>
<surname>Sulam</surname> <given-names>J</given-names>
</name>
</person-group>. (<year>2021</year>). <article-title>Deciphering Antibody Affinity Maturation With Language Models and Weakly Supervised Learning</article-title>. Paper presented at <conf-name>Machine Learning for Structural Biology Workshop at the 35th Conference on Neural Information Processing Systems</conf-name>, <conf-date>2021 Dec 13</conf-date>. doi:&#xa0;<pub-id pub-id-type="doi">10.48550/ARXIV.2112.07782</pub-id>
</citation>
</ref>
<ref id="B181">
<label>181</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Samir</surname> <given-names>J</given-names>
</name>
<name>
<surname>Rizzetto</surname> <given-names>S</given-names>
</name>
<name>
<surname>Gupta</surname> <given-names>M</given-names>
</name>
<name>
<surname>Luciani</surname> <given-names>F</given-names>
</name>
</person-group>. <article-title>Exploring and Analysing Single Cell Multi-Omics Data With VDJView</article-title>. <source>BMC Med Genomics</source> (<year>2020</year>) <volume>13</volume>:<elocation-id>29</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s12920-020-0696-z</pub-id>
</citation>
</ref>
<ref id="B182">
<label>182</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stephenson</surname> <given-names>E</given-names>
</name>
<name>
<surname>Reynolds</surname> <given-names>G</given-names>
</name>
<name>
<surname>Botting</surname> <given-names>RA</given-names>
</name>
<name>
<surname>Calero-Nieto</surname> <given-names>FJ</given-names>
</name>
<name>
<surname>Morgan</surname> <given-names>MD</given-names>
</name>
<name>
<surname>Tuong</surname> <given-names>ZK</given-names>
</name>
<etal/>
</person-group>. <article-title>Single-Cell Multi-Omics Analysis of the Immune Response in COVID-19</article-title>. <source>Nat Med</source> (<year>2021</year>) <volume>27</volume>:<page-range>904&#x2013;16</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41591-021-01329-2</pub-id>
</citation>
</ref>
<ref id="B183">
<label>183</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ripoll</surname> <given-names>DR</given-names>
</name>
<name>
<surname>Chaudhury</surname> <given-names>S</given-names>
</name>
<name>
<surname>Wallqvist</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>Using the Antibody-Antigen Binding Interface to Train Image-Based Deep Neural Networks for Antibody-Epitope Classification</article-title>. <source>PLoS Comput Biol</source> (<year>2021</year>) <volume>17</volume>:<fpage>1</fpage>&#x2013;<lpage>42</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pcbi.1008864</pub-id>
</citation>
</ref>
<ref id="B184">
<label>184</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Karnaukhov</surname> <given-names>VK</given-names>
</name>
<name>
<surname>Shcherbinin</surname> <given-names>DS</given-names>
</name>
<name>
<surname>Chugunov</surname> <given-names>AO</given-names>
</name>
<name>
<surname>Chudakov</surname> <given-names>DM</given-names>
</name>
<name>
<surname>Efremov</surname> <given-names>RG</given-names>
</name>
<name>
<surname>Zvyagin</surname> <given-names>IV</given-names>
</name>
<etal/>
</person-group>. <article-title>Predicting TCR-Peptide Recognition Based on Residue-Level Pairwise Statistical Potential [Preprint]</article-title>. <source>bioRxiv</source> (<year>2022</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.1101/2022.02.15.480516</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>