<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Immunol.</journal-id>
<journal-title>Frontiers in Immunology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Immunol.</abbrev-journal-title>
<issn pub-type="epub">1664-3224</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fimmu.2021.682103</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Immunology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Improvement of Neoantigen Identification Through Convolution Neural Network</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Hao</surname>
<given-names>Qing</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1212446"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wei</surname>
<given-names>Ping</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Shu</surname>
<given-names>Yang</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/525240"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Zhang</surname>
<given-names>Yi-Guan</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1212443"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Xu</surname>
<given-names>Heng</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/523577"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Zhao</surname>
<given-names>Jun-Ning</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>College of Pharmaceutical Sciences, Southwest Medical University</institution>, <addr-line>Luzhou</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Sichuan Center for Translational Medicine of Traditional Chinese Medicine, State Key Laboratory of Quality Evaluation of Traditional Chinese Medicine, Sichuan Geoherbs System Engineering Technology Research Center of Chinese Medicine, Sichuan Provincial Key Laboratory of Quality Evaluation of Traditional Chinese Medicine and Innovative Chinese Medicine Research, Institute of Translational Pharmacology of Sichuan Academy of Chinese Medicine Sciences</institution>, <addr-line>Chengdu</addr-line>, <country>China</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Department of Laboratory Medicine, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University</institution>, <addr-line>Chengdu</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Min Cheng, Weifang Medical University, China</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Annika M. Bruger, Universit&#xe9; Catholique de Louvain, Belgium; Yunlong Lei, Chongqing Medical University, China</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Yi-Guan Zhang, <email xlink:href="mailto:yiguanzhang@126.com">yiguanzhang@126.com</email>; Heng Xu, <email xlink:href="mailto:xuheng81916@scu.edu.cn">xuheng81916@scu.edu.cn</email>; Jun-Ning Zhao, <email xlink:href="mailto:zarmy@189.cn">zarmy@189.cn</email>
</p>
</fn>
<fn fn-type="other" id="fn002">
<p>This article was submitted to Cancer Immunity and Immunotherapy, a section of the journal Frontiers in Immunology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>25</day>
<month>05</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>12</volume>
<elocation-id>682103</elocation-id>
<history>
<date date-type="received">
<day>17</day>
<month>03</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>05</day>
<month>05</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Hao, Wei, Shu, Zhang, Xu and Zhao</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Hao, Wei, Shu, Zhang, Xu and Zhao</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Accurate prediction of neoantigens and of the protective anti-tumor responses they elicit is particularly important for the development of cancer vaccines and adoptive T-cell therapy. However, current algorithms for predicting neoantigens are limited by their reliance on <italic>in vitro</italic> binding affinity data and by algorithmic constraints, which results in a high rate of false positives. In this study, we propose a deep convolutional neural network named APPM (antigen presentation prediction model) to predict antigen presentation in the context of human leukocyte antigen (HLA) class I alleles. APPM is trained on large mass spectrometry (MS) HLA-peptide datasets and evaluated on an independent MS benchmark. The results show that APPM outperforms the method recommended by the Immune Epitope Database (IEDB) in terms of positive predictive value (PPV) (0.40 vs. 0.22), and that PPV increases further when the two approaches are combined (PPV = 0.51). We further applied our model to the prediction of neoantigens from consensus driver mutations and identified 16,000 putative neoantigens with hallmarks of &#x2018;drivers&#x2019;.</p>
</abstract>
<kwd-group>
<kwd>neoantigen</kwd>
<kwd>CNN</kwd>
<kwd>HLA</kwd>
<kwd>driver mutation</kwd>
<kwd>prediction</kwd>
</kwd-group>
<contract-sponsor id="cn001">Sichuan Provincial Administration of Traditional Chinese Medicine<named-content content-type="fundref-id">10.13039/501100016350</named-content>
</contract-sponsor>
<counts>
<fig-count count="4"/>
<table-count count="2"/>
<equation-count count="3"/>
<ref-count count="63"/>
<page-count count="10"/>
<word-count count="4155"/>
</counts>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<title>Introduction</title>
<p>Cancer develops as a result of the accumulation of tumor-specific somatic mutations (<xref ref-type="bibr" rid="B1">1</xref>&#x2013;<xref ref-type="bibr" rid="B3">3</xref>). Non-silent mutations in coding regions can give rise to peptides that the immune system recognizes as &#x201c;foreign&#x201d;; these peptides are termed neoantigens (<xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B5">5</xref>). Neoantigens can elicit a protective anti-tumor response when presented on the surface of cancer cells by the major histocompatibility complex (MHC) [also called human leukocyte antigen (HLA)]. They have long been regarded as ideal targets for immunotherapy because their expression is restricted to tumor cells and they are not subject to central or peripheral tolerance (<xref ref-type="bibr" rid="B6">6</xref>). Neoantigen-based immunotherapy has achieved great success in recent years (<xref ref-type="bibr" rid="B7">7</xref>&#x2013;<xref ref-type="bibr" rid="B11">11</xref>), further highlighting the importance of accurate neoantigen prediction for the development of cancer vaccines and adoptive T-cell therapy (<xref ref-type="bibr" rid="B12">12</xref>&#x2013;<xref ref-type="bibr" rid="B15">15</xref>). However, current approaches and algorithms for identifying immunogenic neoantigens among mutant peptides are far from satisfactory. Low precision is a major obstacle (<xref ref-type="bibr" rid="B16">16</xref>), partly because these methods rely primarily on HLA-peptide binding affinity (<xref ref-type="bibr" rid="B17">17</xref>). Binding affinity measured in <italic>in vitro</italic> binding experiments neglects the other biological steps involved in peptide processing and delivery, which results in a substantial fraction of false positives. Only ~1&#x2013;5% of the peptides predicted to bind by HLA binding-affinity predictors have been experimentally validated (<xref ref-type="bibr" rid="B18">18</xref>).
One way to address this problem is to train the prediction algorithm on peptides eluted from the HLA complexes of mono-allelic or mixed-allelic cancer cell lines and identified by mass spectrometry (MS) analysis (<xref ref-type="bibr" rid="B19">19</xref>). MS datasets profile the peptides naturally presented on the cell surface, which have already passed through the antigen processing and transport steps (<xref ref-type="bibr" rid="B20">20</xref>, <xref ref-type="bibr" rid="B21">21</xref>). Another reason for low precision may be that recognition features, such as amino acid properties and spatial structure, are not taken into consideration (<xref ref-type="bibr" rid="B22">22</xref>, <xref ref-type="bibr" rid="B23">23</xref>). Compared with the other artificial neural networks used in MHCflurry, NetMHC-4.0 and NetMHCpan-4.0 (<xref ref-type="bibr" rid="B24">24</xref>&#x2013;<xref ref-type="bibr" rid="B26">26</xref>), a convolutional neural network (CNN) preserves local spatial features (<xref ref-type="bibr" rid="B27">27</xref>) and is better suited to studying peptides in which the spatial locations of the amino acids are critical for binding (<xref ref-type="bibr" rid="B28">28</xref>).</p>
<p>In this study, we propose an antigen presentation prediction model (APPM), a CNN trained to predict the likelihood that a peptide is presented by HLA-I molecules. APPM outperformed the approach recommended by the IEDB (2020.04, NetMHCpan EL 4.0) in terms of specificity and positive predictive value across 20 high-frequency HLA alleles. In addition, we predicted neoantigens derived from The Cancer Genome Atlas (TCGA) driver mutations; such neoantigens could be prepared in advance for off-the-shelf immunotherapies, shortening the time between mutation detection and personalized vaccine injection.</p>
</sec>
<sec id="s2" sec-type="materials|methods">
<title>Methods</title>
<sec id="s2_1">
<title>Data Collection</title>
<p>We collected more than 1,900,000 published HLA-peptide MS records from mono-allelic or mixed-allelic cell lines that collectively express 20 high-frequency HLA-A and HLA-B allotypes (<xref ref-type="bibr" rid="B16">16</xref>, <xref ref-type="bibr" rid="B19">19</xref>, <xref ref-type="bibr" rid="B29">29</xref>, <xref ref-type="bibr" rid="B30">30</xref>). All records are labeled in binary notation: label = 1 denotes MS-identified peptides (hits), whereas label = 0 denotes peptides from the reference proteome (SwissProt) that were not detected <italic>via</italic> mass spectrometry.</p>
</sec>
<sec id="s2_2">
<title>Data Encoding</title>
<p>The training datasets consist of peptides 8 to 11 amino acids long (8-mers to 11-mers), written in the one-letter amino acid alphabet (20 distinct amino acids, namely &#x2018;ACDEFGHIKLMNPQRSTVWY&#x2019;). This length range captures ~95% of all HLA class I-restricted peptides. For machine learning, the peptide sequences are vectorized with a one-hot encoding scheme. Peptides of different lengths are represented as fixed-length sequences by padding with the character &#x2018;Z&#x2019;, and each amino acid and the padding character &#x2018;Z&#x2019; is encoded as a one-hot vector (see <xref ref-type="supplementary-material" rid="SF1">
<bold>Figure S1</bold>
</xref> for details). As a result, each peptide is encoded as a fixed-size matrix of 11 rows (the maximum length) by 21 columns (the 20 amino acid letters plus the padding character &#x2018;Z&#x2019;).</p>
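<p>The encoding described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name is hypothetical, and placing the padding at the right-hand end of the peptide is an assumption, since the exact layout is given only in Figure S1.</p>
<preformat><![CDATA[
```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 amino acids, as listed in the text
PAD = "Z"                          # padding character from the text
SYMBOLS = ALPHABET + PAD           # 21 columns in total
MAX_LEN = 11                       # longest peptide considered


def one_hot_encode(peptide):
    """Return an 11 x 21 one-hot matrix (list of lists) for an 8- to 11-mer."""
    if not 8 <= len(peptide) <= MAX_LEN:
        raise ValueError("peptide length must be between 8 and 11")
    # Assumption: padding is appended on the right; the article defers the
    # exact placement to Figure S1.
    padded = peptide + PAD * (MAX_LEN - len(peptide))
    matrix = []
    for residue in padded:
        row = [0] * len(SYMBOLS)
        row[SYMBOLS.index(residue)] = 1  # raises ValueError on unknown letters
        matrix.append(row)
    return matrix
```
]]></preformat>
<p>Under these assumptions, an 8-mer yields eight rows that each mark one amino acid column, followed by three rows that mark only the &#x2018;Z&#x2019; column.</p>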
</sec>
<sec id="s2_3">
<title>Imbalanced Distribution of Training Datasets</title>
<p>The collected MS datasets show a severe class imbalance. Overall, there are 1,866,484 0-labeled records, about 39 times as many as 1-labeled records. An extreme case is the HLA-A*02:07 dataset, in which 0-labeled records outnumber 1-labeled records roughly 72-fold. Such imbalance biases a machine learning model toward better performance on the 0-labeled peptides (the majority) and worse performance on the 1-labeled ones (the minority) (<xref ref-type="bibr" rid="B31">31</xref>). The class balance is therefore adjusted by over-sampling and under-sampling when preprocessing the training datasets. Briefly, under-sampling removes 0-labeled training data points at random, whereas over-sampling duplicates 1-labeled data points. <xref ref-type="table" rid="T1">
<bold>Table 1</bold>
</xref> lists the under-sampling and over-sampling settings used for each HLA allele.</p>
<table-wrap id="T1" position="float">
<label>Table 1</label>
<caption>
<p>Training details for each HLA allele.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" rowspan="2" align="left">Alleles</th>
<th valign="top" rowspan="2" align="center">Label = 1</th>
<th valign="top" rowspan="2" align="center">Label = 0</th>
<th valign="top" rowspan="2" align="center">Train</th>
<th valign="top" rowspan="2" align="center">Test</th>
<th valign="top" rowspan="2" align="center">Under-sampling</th>
<th valign="top" rowspan="2" align="center">Over-sampling</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">A*01:01</td>
<td valign="top" align="center">3398</td>
<td valign="top" align="center">48700</td>
<td valign="top" align="center">45498</td>
<td valign="top" align="center">6600</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">A*02:01</td>
<td valign="top" align="center">6779</td>
<td valign="top" align="center">165342</td>
<td valign="top" align="center">160921</td>
<td valign="top" align="center">11200</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">A*02:03</td>
<td valign="top" align="center">1780</td>
<td valign="top" align="center">116299</td>
<td valign="top" align="center">107879</td>
<td valign="top" align="center">10200</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">A*02:07</td>
<td valign="top" align="center">3206</td>
<td valign="top" align="center">232783</td>
<td valign="top" align="center">225389</td>
<td valign="top" align="center">10600</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">5</td>
</tr>
<tr>
<td valign="top" align="left">A*03:01</td>
<td valign="top" align="center">5419</td>
<td valign="top" align="center">83117</td>
<td valign="top" align="center">77536</td>
<td valign="top" align="center">11000</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">A*11:01</td>
<td valign="top" align="center">2114</td>
<td valign="top" align="center">123143</td>
<td valign="top" align="center">114857</td>
<td valign="top" align="center">10400</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">A*24:02</td>
<td valign="top" align="center">5189</td>
<td valign="top" align="center">142382</td>
<td valign="top" align="center">136571</td>
<td valign="top" align="center">11000</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">A*29:02</td>
<td valign="top" align="center">1149</td>
<td valign="top" align="center">54125</td>
<td valign="top" align="center">49074</td>
<td valign="top" align="center">6200</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
</tr>
<tr>
<td valign="top" align="left">A*31:01</td>
<td valign="top" align="center">1879</td>
<td valign="top" align="center">45918</td>
<td valign="top" align="center">41597</td>
<td valign="top" align="center">6200</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left">A*32:01</td>
<td valign="top" align="center">584</td>
<td valign="top" align="center">40401</td>
<td valign="top" align="center">34885</td>
<td valign="top" align="center">6100</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
</tr>
<tr>
<td valign="top" align="left">A*68:02</td>
<td valign="top" align="center">1516</td>
<td valign="top" align="center">92678</td>
<td valign="top" align="center">83994</td>
<td valign="top" align="center">10200</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">B*07:02</td>
<td valign="top" align="center">3162</td>
<td valign="top" align="center">201778</td>
<td valign="top" align="center">194340</td>
<td valign="top" align="center">10600</td>
<td valign="top" align="center">0.6</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">B*15:01</td>
<td valign="top" align="center">1684</td>
<td valign="top" align="center">106482</td>
<td valign="top" align="center">97966</td>
<td valign="top" align="center">10200</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">B*35:01</td>
<td valign="top" align="center">1019</td>
<td valign="top" align="center">53819</td>
<td valign="top" align="center">48638</td>
<td valign="top" align="center">6200</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left">B*40:01</td>
<td valign="top" align="center">1321</td>
<td valign="top" align="center">80192</td>
<td valign="top" align="center">71313</td>
<td valign="top" align="center">10200</td>
<td valign="top" align="center">0.9</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">B*44:02</td>
<td valign="top" align="center">1525</td>
<td valign="top" align="center">44760</td>
<td valign="top" align="center">40085</td>
<td valign="top" align="center">6200</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left">B*44:03</td>
<td valign="top" align="center">1487</td>
<td valign="top" align="center">39482</td>
<td valign="top" align="center">34769</td>
<td valign="top" align="center">6200</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left">B*51:01</td>
<td valign="top" align="center">2597</td>
<td valign="top" align="center">77898</td>
<td valign="top" align="center">70095</td>
<td valign="top" align="center">10400</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left">B*54:01</td>
<td valign="top" align="center">969</td>
<td valign="top" align="center">65623</td>
<td valign="top" align="center">56412</td>
<td valign="top" align="center">10180</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">B*57:01</td>
<td valign="top" align="center">1599</td>
<td valign="top" align="center">51562</td>
<td valign="top" align="center">46961</td>
<td valign="top" align="center">6200</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">4</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Alleles defined by DNA sequencing are named with the gene, followed by an asterisk, digits representing the allele group, a colon, and digits identifying the specific HLA protein.</p>
</fn>
</table-wrap-foot>
</table-wrap>
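<p>The random under-sampling and duplication-based over-sampling described in this section can be sketched as follows. This is an illustrative sketch only: the parameter names <italic>keep_fraction</italic> and <italic>dup_factor</italic> are hypothetical, and reading the two rightmost columns of Table 1 as &#x201c;fraction of 0-labeled records kept&#x201d; and &#x201c;number of copies of each 1-labeled record&#x201d; is an assumption that the article does not state explicitly.</p>
<preformat><![CDATA[
```python
import random


def rebalance(positives, negatives, keep_fraction, dup_factor, seed=0):
    """Randomly under-sample negatives and duplicate positives.

    positives and negatives are lists of peptide strings. keep_fraction and
    dup_factor are hypothetical names; their mapping to the Table 1 columns
    is an assumption of this sketch.
    """
    rng = random.Random(seed)
    n_keep = round(len(negatives) * keep_fraction)
    kept_negatives = rng.sample(negatives, n_keep)   # without replacement
    repeated_positives = list(positives) * dup_factor  # plain duplication
    data = [(p, 1) for p in repeated_positives]
    data += [(n, 0) for n in kept_negatives]
    rng.shuffle(data)
    return data
```
]]></preformat>
<p>For example, with 10 positives, 100 negatives, a keep fraction of 0.8 and a duplication factor of 3, the sketch returns 30 positive and 80 negative records.</p>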
</sec>
<sec id="s2_4">
<title>Convolutional Neural Network (CNN)</title>
<p>A convolutional neural network (CNN) typically consists of convolutional layers, pooling layers and fully connected (dense) layers. In this study, we use a CNN inspired by the <italic>inception</italic> module of <italic>GoogLeNet</italic> (<xref ref-type="bibr" rid="B32">32</xref>, <xref ref-type="bibr" rid="B33">33</xref>). Three parallel convolutional sections, each with eight two-dimensional convolutional kernels, were constructed to maximize feature extraction (see <xref ref-type="supplementary-material" rid="SF2">
<bold>Figure S2</bold>
</xref> for details). The outputs of the three convolutional sections are flattened and passed to the fully connected layers, which contain 100 hidden nodes. The output layer reports the binary classification through two nodes, with a tested peptide classified as binding or not binding to HLA.</p>
<p>The model is implemented in TensorFlow (v. 1.14.0) and trained with the Adam optimization algorithm with standard parameters on an NVIDIA GeForce RTX 2080 Ti GPU. Instead of the frequently used Rectified Linear Unit (ReLU) activation function, Leaky ReLU (&#x3b1; = 0.2) is used as the activation function, and dropout and early stopping are applied to avoid overfitting.</p>
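<p>For reference, the difference between ReLU and Leaky ReLU with &#x3b1; = 0.2 can be written out in a few lines of plain Python. This is a scalar sketch for illustration, not the TensorFlow operation used in the model, and the function names are illustrative.</p>
<preformat><![CDATA[
```python
def relu(x):
    """Standard ReLU: negative inputs are clipped to zero."""
    return x if x > 0 else 0.0


def leaky_relu(x, alpha=0.2):
    """Leaky ReLU: identity for positive inputs, slope alpha otherwise."""
    return x if x > 0 else alpha * x
```
]]></preformat>
<p>Unlike ReLU, Leaky ReLU keeps a small non-zero slope for negative inputs, so units that receive negative pre-activations still pass a gradient during training.</p>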
</sec>
<sec id="s2_5">
<title>Data Splitting</title>
<p>The peptides of the MS dataset are randomly split into training, validation and test sets, all three of which have approximately the same distribution of 1-labeled and 0-labeled peptides. The training sets are used for the forward pass and backpropagation, the validation sets are used only for early stopping, and the test sets are used to evaluate performance <italic>via</italic> AUC.</p>
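<p>A label-stratified random split of this kind might look like the following sketch. The function name is hypothetical, and the 80/10/10 fractions are placeholders: the article reports the resulting set sizes in Table 1 but not the ratios used.</p>
<preformat><![CDATA[
```python
import random


def stratified_split(records, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split (peptide, label) records into train, validation and test lists.

    Each label is shuffled and split separately, so all three sets keep
    roughly the same ratio of 1-labeled to 0-labeled peptides. The default
    fractions are placeholders, not values taken from the article.
    """
    rng = random.Random(seed)
    train, valid, test = [], [], []
    for label in (0, 1):
        group = [r for r in records if r[1] == label]
        rng.shuffle(group)
        n_train = int(len(group) * fractions[0])
        n_valid = int(len(group) * fractions[1])
        train += group[:n_train]
        valid += group[n_train:n_train + n_valid]
        test += group[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test
```
]]></preformat>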
</sec>
<sec id="s2_6">
<title>Independent Validation Dataset</title>
<p>To benchmark APPM against other HLA-peptide predictors, we collected MS datasets of HLA-bound peptides from other studies that used cell lines expressing a single HLA allele (<xref ref-type="bibr" rid="B34">34</xref>, <xref ref-type="bibr" rid="B35">35</xref>). Taking these MS-identified peptides as hits, we generated non-binders (decoy sets) by sampling unobserved peptides from the same proteins in the UniProt human reference proteome (UP000005640_9606), as previously described (<xref ref-type="bibr" rid="B36">36</xref>). We randomly selected decoy peptides at a 99:1 ratio to MS-identified peptides, drawn from four different lengths (8, 9, 10 and 11) in equal numbers. The rationale for the 99-fold excess is that, for a sample of peptide fragments from an organism, approximately 1&#x2013;2% of the fragments are commonly considered to bind to MHC receptors (<xref ref-type="bibr" rid="B37">37</xref>). After removing peptides that appear in the model training data and duplicates sampled from different proteins, we obtained a mono-allelic benchmark dataset.</p>
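<p>The decoy sampling step can be sketched for a single source protein as below. This is a simplified illustration rather than the published procedure: the function name is hypothetical, and drawing lengths in round-robin order is an assumption made here to keep the four lengths as balanced as the requested count allows (99 is not divisible by four, so the balance per hit is only approximate).</p>
<preformat><![CDATA[
```python
import random

LENGTHS = (8, 9, 10, 11)


def sample_decoys(protein, hits, n_decoys, seed=0):
    """Sample unobserved 8- to 11-mers from one protein as decoys.

    hits is the set of MS-identified peptides to exclude. Lengths are drawn
    round-robin; this balancing rule is an assumption of the sketch.
    """
    rng = random.Random(seed)
    pools = {}
    for k in LENGTHS:
        # All distinct fragments of length k, minus the observed hits.
        fragments = {protein[i:i + k] for i in range(len(protein) - k + 1)}
        pools[k] = sorted(fragments - set(hits))
        rng.shuffle(pools[k])
    decoys = []
    while len(decoys) < n_decoys and any(pools.values()):
        for k in LENGTHS:
            if pools[k] and len(decoys) < n_decoys:
                decoys.append(pools[k].pop())
    return decoys
```
]]></preformat>
<p>A full pipeline would additionally remove peptides present in the training data and duplicates sampled from different proteins, as described in the text.</p>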
</sec>
<sec id="s2_7">
<title>Predictive Performance Metric Calculation</title>
<p>Sensitivity, also called recall, was calculated as:</p>
<disp-formula>
<mml:math display="block" id="M1">
<mml:mrow>
<mml:mfrac>
<mml:mtext>correctly predicted positive peptides</mml:mtext>
<mml:mtext>all positive peptides</mml:mtext>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Specificity was calculated as:</p>
<disp-formula>
<mml:math display="block" id="M2">
<mml:mrow>
<mml:mfrac>
<mml:mtext>correctly predicted negative peptides</mml:mtext>
<mml:mtext>all negative peptides</mml:mtext>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Positive predictive value, also called precision, was calculated as:</p>
<disp-formula>
<mml:math display="block" id="M3">
<mml:mrow>
<mml:mfrac>
<mml:mtext>correctly predicted positive peptides</mml:mtext>
<mml:mtext>all peptides predicted to be positive</mml:mtext>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
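<p>The three metrics defined above can be computed from confusion-matrix counts as in the following sketch. The function name and the counts in the example call are hypothetical and are not taken from the benchmark; the sketch does not guard against empty denominators.</p>
<preformat><![CDATA[
```python
def presentation_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, ppv) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # correct positives / all positives
    specificity = tn / (tn + fp)  # correct negatives / all negatives
    ppv = tp / (tp + fp)          # correct positives / predicted positives
    return sensitivity, specificity, ppv


# Made-up counts for 100 hits and 9,900 decoys (a 1:99 ratio).
sens, spec, ppv = presentation_metrics(tp=50, fp=150, tn=9750, fn=50)
```
]]></preformat>
<p>With a 1:99 ratio of hits to decoys, even a small false-positive rate produces many false positives in absolute terms, which is why PPV can remain low while specificity is close to 1.</p>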
</sec>
<sec id="s2_8">
<title>The Cancer Genome Atlas (TCGA) Driver Mutations</title>
<p>To obtain a consensus list of driver mutations, we downloaded the driver-mutation dataset processed and compiled by the TCGA MC3 and driver working group (<uri xlink:href="https://gdc.cancer.gov/about-data/publications/pancan-driver">https://gdc.cancer.gov/about-data/publications/pancan-driver</uri>) (<xref ref-type="bibr" rid="B38">38</xref>, <xref ref-type="bibr" rid="B39">39</xref>). The driver-discovery dataset was derived from a compiled Mutation Annotation Format (MAF) file of 9,079 TCGA samples across 33 cancer types (syn7824274, <uri xlink:href="https://gdc.cancer.gov/about-data/publications/mc3-2017">https://gdc.cancer.gov/about-data/publications/mc3-2017</uri>). Based on sequencing and structure analyses, we ultimately selected as the consensus list 3,437 cancer driver mutations that were identified by &#x2265; 2 of the following approaches: CTAT-population, CTAT-cancer, or structural clustering (see <xref ref-type="supplementary-material" rid="SF6">
<bold>Supplementary File 4</bold>
</xref>).</p>
</sec>
<sec id="s2_9">
<title>Candidate Peptides From Driver Mutations</title>
<p>For each driver mutation, we extracted candidate peptides of 8 to 11 amino acids that contain the driver-specific mutant amino acid for neoantigen screening. The procedure for extracting 9-mer candidate peptides, for instance, is as follows (<xref ref-type="supplementary-material" rid="SF3">
<bold>Figure S3</bold>
</xref>). First, we extracted a 17-mer peptide from the protein sequence, with the mutant amino acid at the center flanked by eight wild-type amino acids upstream and eight downstream. Second, a sliding window of nine amino acids was placed at N (N = 9) successive positions along the 17-mer to obtain the 9-mer peptides. Briefly, the mutant amino acid is the last residue of the first 9-mer, and the window moves along the 17-mer fragment one residue at a time until the mutant amino acid becomes the first residue of the last 9-mer. Peptides of the other lengths are generated in the same way.</p>
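<p>The sliding-window extraction can be sketched as follows, assuming a single amino acid substitution given as a 0-based position in the wild-type protein sequence. The function and parameter names are illustrative; near the protein termini the flanks are simply truncated, which is a choice made for this sketch.</p>
<preformat><![CDATA[
```python
def mutant_windows(protein, mut_index, mutant_aa, k):
    """Return every k-mer of the mutated protein that covers the mutation.

    protein is the wild-type sequence, mut_index the 0-based mutated
    position and mutant_aa the substituted residue (illustrative names).
    """
    mutated = protein[:mut_index] + mutant_aa + protein[mut_index + 1:]
    # Flanked fragment: up to k - 1 residues on each side (a 17-mer for k = 9).
    start = max(0, mut_index - (k - 1))
    end = min(len(mutated), mut_index + k)
    fragment = mutated[start:end]
    return [fragment[i:i + k] for i in range(len(fragment) - k + 1)]


def candidate_peptides(protein, mut_index, mutant_aa, lengths=(8, 9, 10, 11)):
    """Collect the candidate windows for every peptide length of interest."""
    return {k: mutant_windows(protein, mut_index, mutant_aa, k) for k in lengths}
```
]]></preformat>
<p>For a mutation far from either terminus, each length k yields k windows: the mutant residue is the last residue of the first window and the first residue of the last window.</p>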
</sec>
</sec>
<sec id="s3" sec-type="results">
<title>Results</title>
<sec id="s3_1">
<title>Development of APPM</title>
<p>We aimed to improve the precision and specificity of HLA-peptide prediction with a novel tool that combines improved training data with a new supervised machine learning model. In the MS datasets, HLA-bound peptides were eluted after immunoprecipitation of HLA molecules and then identified by liquid chromatography-tandem mass spectrometry (LC-MS/MS) (<xref ref-type="bibr" rid="B40">40</xref>, <xref ref-type="bibr" rid="B41">41</xref>). In contrast to <italic>in vitro</italic> binding affinity assays, MS data directly profile peptides that are actively presented by cells or tissues (<xref ref-type="bibr" rid="B42">42</xref>). We collected publicly available HLA-peptide MS data from 16 mono-allelic HLA-A and HLA-B cell lines genetically engineered to express a single HLA allele and from B lymphocytes or cancer cell lines expressing multiple HLA alleles (<xref ref-type="bibr" rid="B16">16</xref>, <xref ref-type="bibr" rid="B19">19</xref>, <xref ref-type="bibr" rid="B29">29</xref>, <xref ref-type="bibr" rid="B30">30</xref>, <xref ref-type="bibr" rid="B43">43</xref>). These MS data cover 20 high-frequency HLA-I alleles. We split the datasets into training, validation and test sets (Methods). Because negative peptides (from the reference proteome) greatly outnumber positive ones, we applied the over-sampling and under-sampling scheme, which offsets a substantial part of the class imbalance.</p>
<p>Using these public HLA-peptide MS data, we built a convolutional neural network (CNN) framework to predict HLA-I presentation (<xref ref-type="fig" rid="f1">
<bold>Figure 1</bold>
</xref>); CNNs are a form of deep learning that excels at handling sequence data such as amino acid sequences (<xref ref-type="bibr" rid="B28">28</xref>). The model has three parallel convolutional modules, each with eight two-dimensional convolutional kernels, which preserve HLA class I-peptide binding features.</p>
<fig id="f1" position="float">
<label>Figure 1</label>
<caption>
<p>The framework of our study includes the collection of training data and the deep learning model built based on the convolutional neural network.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fimmu-12-682103-g001.tif"/>
</fig>
</sec>
<sec id="s3_2">
<title>Predictive Performance of APPM</title>
<p>To estimate the predictive performance of APPM, we first compared its predictions with those of the IEDB-recommended method (2020.04), NetMHCpan4 EL (<xref ref-type="bibr" rid="B44">44</xref>), a state-of-the-art class I binding predictor available at <uri xlink:href="http://tools.iedb.org/mhci/">http://tools.iedb.org/mhci/</uri>, in terms of positive predictive value (PPV). We compiled a benchmark from published MS data from cell lines genetically engineered to express a single HLA-I allele. In this mono-allelic benchmark, the MS-identified peptides are the true positives, whereas length-matched amino acid fragments from the same proteins serve as negative peptides (decoys). For each HLA allele-peptide pair, NetMHCpan4 EL produced a binding score and a percentile rank. Using the recommended percentile-rank threshold (peptides in the top 2% are considered binders), we obtained an average specificity of 0.97 and an average PPV of 0.22 for NetMHCpan4 EL (<xref ref-type="supplementary-material" rid="SF4">
<bold>Supplementary File 1</bold>
</xref>).</p>
<p>When tested on the same data, APPM outperformed NetMHCpan4 EL, with a specificity of 0.99 and a PPV of 0.40. The reduction in false positives was substantial, with an average increase of 80% in PPV (<xref ref-type="fig" rid="f2">
<bold>Figure 2A</bold>
</xref>). Across the 20 frequent HLA class I alleles, APPM showed a slightly lower PPV than NetMHCpan4 EL only for HLA-A*02:01 and a higher PPV for the other 19 alleles, with an increase of more than onefold for HLA-A*02:03, HLA-A*29:02, HLA-A*32:01 and HLA-B*40:01 (<xref ref-type="fig" rid="f2">
<bold>Figure 2B</bold>
</xref>), suggesting the advantage of our algorithm.</p>
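<p>For reference, the two metrics used in this comparison follow the standard confusion-matrix definitions, sketched below. The counts are invented toy values, not benchmark data, and the function names are illustrative. The last line takes the ratio of the two reported mean PPVs (0.22 and 0.40), which gives an increase of roughly 82%; an average of per-allele increases may differ slightly from this figure.</p>
<preformat>
```python
def specificity(tn: int, fp: int) -> float:
    """True negatives divided by all actual negatives (decoys)."""
    return tn / (tn + fp)


def ppv(tp: int, fp: int) -> float:
    """True positives divided by all predicted positives."""
    return tp / (tp + fp)


def relative_increase(old: float, new: float) -> float:
    """Fractional change from `old` to `new`."""
    return (new - old) / old


# Toy counts: 100 presented peptides and 9,900 decoys for one allele.
# The hypothetical predictor recovers 80 hits and wrongly calls 200 decoys.
toy_specificity = specificity(tn=9700, fp=200)  # about 0.98
toy_ppv = ppv(tp=80, fp=200)                    # about 0.29

# Ratio of the mean PPVs reported in the text (NetMHCpan4 EL, then APPM).
ppv_gain = relative_increase(0.22, 0.40)        # about 0.82
```
</preformat>
<p>The toy counts illustrate why a specificity close to 1 can coexist with a modest PPV: when decoys greatly outnumber presented peptides, even a small false-positive rate produces many false positives relative to the true hits.</p>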
<fig id="f2" position="float">
<label>Figure 2</label>
<caption>
<p>Validation performance of the IEDB-recommended approach and APPM. <bold>(A)</bold> The mean PPV on the mono-allelic MS benchmarks for APPM and NetMHCpan4 EL. <bold>(B)</bold> The PPV values of the two predictors for individual HLA alleles.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fimmu-12-682103-g002.tif"/>
</fig>
</sec>
<sec id="s3_3">
<title>Combining Algorithms Improves Prediction Performance</title>
<p>Interestingly, the false-positive peptides of APPM and NetMHCpan4 EL showed a low overlap rate (19%) (<xref ref-type="fig" rid="f3">
<bold>Figure 3A</bold>
</xref>), probably because of their different prediction mechanisms. We therefore hypothesized that prediction performance could be improved by combining the two approaches. We redefined the predictive results so that only peptides called positive by both methods are regarded as positives. With the combined predictions, we obtained a PPV of 0.51 (<xref ref-type="fig" rid="f3">
<bold>Figure 3B</bold>
</xref>), which is significantly higher than that of either APPM or NetMHCpan4 EL alone (<xref ref-type="fig" rid="f3">
<bold>Figure 3C</bold>
</xref>, p = 0.013, t-test and <xref ref-type="fig" rid="f3">
<bold>Figure 3D</bold>
</xref>, p &lt; 0.001, t-test), without a significant decrease in sensitivity (<xref ref-type="fig" rid="f3">
<bold>Figure 3E</bold>
</xref>, p = 0.1, ANOVA). These results suggest that combining predictions from different algorithms can increase the proportion of true positives in neoantigen selection, which is consistent with previous studies (<xref ref-type="bibr" rid="B45">45</xref>, <xref ref-type="bibr" rid="B46">46</xref>).</p>
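<p>The consensus rule described above amounts to intersecting the positive calls of the two predictors. The sketch below illustrates this with invented peptide identifiers; the function names and the toy call sets are hypothetical and do not reproduce the benchmark.</p>
<preformat>
```python
def consensus(calls_a: set[str], calls_b: set[str]) -> set[str]:
    """Keep only peptides that both predictors call positive."""
    return calls_a.intersection(calls_b)


def ppv_of_calls(calls: set[str], true_binders: set[str]) -> float:
    """Fraction of predicted positives that are true binders."""
    if not calls:
        return float("nan")
    return len(calls.intersection(true_binders)) / len(calls)


# Toy example: four MS-identified peptides and several decoys.
true_binders = {"pep1", "pep2", "pep3", "pep4"}
calls_a = {"pep1", "pep2", "pep3", "decoy1", "decoy2"}            # predictor A
calls_b = {"pep1", "pep2", "pep4", "decoy2", "decoy3", "decoy4"}  # predictor B

both = consensus(calls_a, calls_b)
ppv_a = ppv_of_calls(calls_a, true_binders)
ppv_b = ppv_of_calls(calls_b, true_binders)
ppv_both = ppv_of_calls(both, true_binders)
```
</preformat>
<p>In this toy case the two predictors share only one false positive, so the consensus set has a higher PPV (2/3) than either predictor alone (0.6 and 0.5). The gain depends on the false positives overlapping less than the true positives, and it comes at some cost in sensitivity, since a true binder called by only one predictor is discarded.</p>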
<fig id="f3" position="float">
<label>Figure 3</label>
<caption>
<p>Combining algorithms improves prediction performance. <bold>(A)</bold> The false-positive peptides of APPM and NetMHCpan4 EL. These peptides are decoy peptides of the mono-allelic MS benchmarks that are incorrectly predicted to be binders. <bold>(B)</bold> The mean PPV on the mono-allelic MS benchmarks for APPM, NetMHCpan4 EL and their combination. <bold>(C)</bold> The significant improvement of predictive performance in terms of PPV on the mono-allelic MS benchmarks. The left is APPM and the right is the combination of APPM and NetMHCpan4 EL. **p &lt; 0.05. <bold>(D)</bold> The significant improvement of predictive performance in terms of PPV on the mono-allelic MS benchmarks. The left is NetMHCpan4 EL and the right is the combination of APPM and NetMHCpan4 EL. ***p &lt; 0.01. <bold>(E)</bold> The mean sensitivity on the mono-allelic MS benchmarks for APPM, NetMHCpan4 EL and their combination. NS, no significance.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fimmu-12-682103-g003.tif"/>
</fig>
</sec>
<sec id="s3_4">
<title>Allele-Specific Presentation Motifs</title>
<p>To illustrate the binding characteristics of HLA-I alleles with peptides, we drew allele-specific presentation motifs for the 20 HLA-I alleles (see <xref ref-type="supplementary-material" rid="SF5">
<bold>Supplementary File 2</bold>
</xref> for motifs of all alleles). Consistent with previous studies (<xref ref-type="bibr" rid="B17">17</xref>, <xref ref-type="bibr" rid="B19">19</xref>, <xref ref-type="bibr" rid="B47">47</xref>), these motifs revealed the dependence of HLA presentation on each sequence position for peptides of lengths 8&#x2013;11 (<xref ref-type="fig" rid="f4">
<bold>Figure 4A</bold>
</xref>). For example, the anchor residues of 9-mers are the amino acids at position 2 (referred to as P2; other positions are abbreviated likewise) and P9, whereas those of 11-mers are at P2 and P11.</p>
<fig id="f4" position="float">
<label>Figure 4</label>
<caption>
<p>The motifs of HLA alleles. <bold>(A)</bold> The learned dependence of HLA presentation on each sequence position for peptides of lengths 8&#x2013;11. The red, blue, black, purple, and green lines represent the acidic, basic, hydrophobic, neutral and polar amino acids, respectively. <bold>(B)</bold> Some similar motifs are depicted in this graph. <bold>(C)</bold> The radar view is a variant of the percentage graph that illustrates the motifs of HLA-A and HLA-B at the overall level. Different colors represent different HLA class I molecules. Alleles defined by DNA sequencing are named by the gene, followed by an asterisk and numbers representing the allele group.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fimmu-12-682103-g004.tif"/>
</fig>
<p>In contrast to previous work (<xref ref-type="bibr" rid="B48">48</xref>), some distinct HLA alleles have similar presentation motifs. For instance, HLA-A*02:01 and HLA-A*02:03 have the same binding specificity: their pockets preferentially bind peptides with leucine at P2 and valine/leucine at the last position. Likewise, HLA-A*03:01 and HLA-A*11:01 prefer peptides with lysine at the last position, while HLA-B*40:01, HLA-B*44:02, and HLA-B*44:03 prefer to present peptides with glutamate at P2 (<xref ref-type="fig" rid="f4">
<bold>Figure 4B</bold>
</xref>).</p>
<p>Moreover, we analyzed the amino acid properties of the anchor residues for the 20 HLA alleles and refined their binding characteristics: the bound peptides are enriched in hydrophobic amino acids at anchor residues, which is consistent with the known preference of HLA-I binding and presentation (<xref ref-type="bibr" rid="B23">23</xref>, <xref ref-type="bibr" rid="B49">49</xref>). We also explored the overall preference for amino acid properties at anchor residues among HLA-A and HLA-B molecules (<xref ref-type="fig" rid="f4">
<bold>Figure 4C</bold>
</xref>). Besides the common preference for hydrophobic amino acids, HLA-A alleles prefer to bind basic and polar amino acids, while HLA-B alleles prefer acidic amino acids.</p>
</sec>
<sec id="s3_5">
<title>Neoantigens From Driver Mutations</title>
<p>It is thought that the quality rather than the quantity of neoantigens may lead to a robust and durable response to immunotherapy (<xref ref-type="bibr" rid="B50">50</xref>). Most putative neoantigens are considered to be products of passenger rather than driver mutations, and their loss through chromosomal instability during tumor evolution may be readily tolerated. Therefore, targeting driver-mutation-neoantigens could elicit durable anti-tumor responses and may reduce resistance to neoantigen therapies.</p>
<p>We applied the combined approach of APPM and NetMHCpan4 to predict neoantigens derived from oncogenic driver mutations. The consensus driver-mutation list was compiled by The Cancer Genome Atlas (TCGA) Multi-Center Mutation Calling in Multiple Cancers (MC3) working group and driver working group from 9079 samples across 33 cancer types (<xref ref-type="bibr" rid="B38">38</xref>, <xref ref-type="bibr" rid="B39">39</xref>). For a total of 3,437 missense driver mutations, we identified ~16,000 putative neoantigens in the context of 20 high-frequency HLA alleles (<xref ref-type="supplementary-material" rid="SF5">
<bold>Supplementary File 3</bold>
</xref>).</p>
<p>Among these driver mutations, only 15% (513/3437) do not yield putative neoantigens, whereas the products of the others could be bound and presented by these HLA alleles. We identified 36 high-frequency shared putative neoantigens derived from eight oncogenic driver mutations with more than 1% coverage of patients across multiple cancers in the 9079-sample TCGA cohort (<xref ref-type="supplementary-material" rid="SF7">
<bold>Table S1</bold>
</xref>), e.g. HLA-A*03:01_KIGDFGLAT<underline>E</underline>K from BRAF_p.V600E, with a frequency of 5.60% (508/9079) in Pan-Cancer. We also found tumor-specific shared potential neoantigens with a frequency of over 10% in a given cancer type (<xref ref-type="supplementary-material" rid="SF8">
<bold>Table S2</bold>
</xref>), for example HLA-B*15:01_IIIG<underline>C</underline>HAY from IDH1_p.R132C, with a frequency of 11.76% (4/34) in CHOL. Importantly, the immunogenicity of some of the shared putative neoantigens we identified has been confirmed experimentally (<xref ref-type="table" rid="T2">
<bold>Table 2</bold>
</xref>) (<xref ref-type="bibr" rid="B51">51</xref>). For instance, VVVGAG<underline>D</underline>VGK from KRAS_p.G13D has been shown to be immunogenic in the context of the HLA-A*03:01 allele. Overall, these putative shared driver-mutation-neoantigen pools provide a potential list of targets for off-the-shelf immunotherapy.</p>
<table-wrap id="T2" position="float">
<label>Table 2</label>
<caption>
<p>Validated immunogenic neoantigens derived from driver mutations.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" align="left">Driver Mutation</th>
<th valign="top" align="center">pMHC</th>
<th valign="top" align="center">Cancer Type</th>
<th valign="top" align="center">Frequency</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">KRAS_p.G12D</td>
<td valign="top" align="left">HLA-A*03:01_VVGA<underline>D</underline>GVGK</td>
<td valign="top" align="left">Pan-Cancer</td>
<td valign="top" align="center">1.78% (162/9079)</td>
</tr>
<tr>
<td valign="top" align="left">KRAS_p.G13D</td>
<td valign="top" align="left">HLA-A*03:01_VVVGAG<underline>D</underline>VGK</td>
<td valign="top" align="left">COAD</td>
<td valign="top" align="center">8.77% (20/228)</td>
</tr>
<tr>
<td valign="top" align="left">KRAS_p.G13D</td>
<td valign="top" align="left">HLA-A*03:01_VVGAG<underline>D</underline>VGK</td>
<td valign="top" align="left">COAD</td>
<td valign="top" align="center">8.77% (20/228)</td>
</tr>
<tr>
<td valign="top" align="left">KRAS_p.Q61H</td>
<td valign="top" align="left">HLA-A*01:01_ILDTAG<underline>H</underline>EEY</td>
<td valign="top" align="left">PAAD</td>
<td valign="top" align="center">3.87% (6/155)</td>
</tr>
<tr>
<td valign="top" align="left">KRAS_p.Q61L</td>
<td valign="top" align="left">HLA-A*01:01_ILDTAG<underline>L</underline>EEY</td>
<td valign="top" align="left">TGCT</td>
<td valign="top" align="center">1.55% (2/129)</td>
</tr>
<tr>
<td valign="top" align="left">KRAS_p.Q61R</td>
<td valign="top" align="left">HLA-A*01:01_ILDTAG<underline>R</underline>EEY</td>
<td valign="top" align="left">COAD</td>
<td valign="top" align="center">1.32% (3/228)</td>
</tr>
<tr>
<td valign="top" align="left">IDH2_p.R140Q</td>
<td valign="top" align="left">HLA-B*07:02_SPNGTI<underline>Q</underline>NIL</td>
<td valign="top" align="left">LAML</td>
<td valign="top" align="center">4.35% (6/138)</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s4" sec-type="discussion">
<title>Discussion</title>
<p>A neoantigen is a foreign protein that arises as a consequence of tumor-specific DNA alterations and can be presented on the surface of tumor cells by MHC molecules. When specifically recognized by a T cell receptor (TCR), it elicits anti-tumor immune responses. In current clinical applications of neoantigen-targeting immunotherapies, the accurate identification of relevant neoantigens has become a central challenge (<xref ref-type="bibr" rid="B46">46</xref>). Current prediction algorithms are insufficiently precise because of the limitations of <italic>in vitro</italic> binding affinity training data and algorithmic constraints, resulting in high false-positive rates (<xref ref-type="bibr" rid="B16">16</xref>, <xref ref-type="bibr" rid="B19">19</xref>, <xref ref-type="bibr" rid="B41">41</xref>). One solution is to train a novel prediction algorithm on MS-identified peptides from mono-allelic or mixed-allelic cell lines (<xref ref-type="bibr" rid="B19">19</xref>, <xref ref-type="bibr" rid="B52">52</xref>).</p>
<p>In this study, we built a high-PPV neoantigen prediction algorithm by training a CNN deep learning model on <italic>in vitro</italic> MS data. Based on the mono-allelic benchmark, we demonstrate that our model, APPM, outperforms NetMHCpan4 EL in precision for 19 of the 20 high-frequency HLA alleles. Moreover, the combination of APPM and NetMHCpan4 EL further improves prediction performance, suggesting that the combined strategy can identify potential neoantigens more precisely in clinical practice. However, the mass spectrometry assay itself has a technological limitation: not all eluted ligands can be detected, which inevitably generates false-negative peptides (<xref ref-type="bibr" rid="B53">53</xref>&#x2013;<xref ref-type="bibr" rid="B55">55</xref>).</p>
<p>An important limitation of this work is that we used MS datasets both to train and to evaluate our predictor. Using MS-identified peptides to reflect the factors of gene expression, protease cleavage, transportation and presentation may introduce MS-specific bias into our predictions. Our work also neglects T cell recognition of presented epitopes. Many putative neoantigens identified by our predictor will not induce CD8+ T cell responses when used in cancer patients. This limitation is consistent with a previous study showing that presentation of antigens is essential but not sufficient for the induction of robust anti-tumor responses (<xref ref-type="bibr" rid="B56">56</xref>).</p>
<p>In addition, neoantigens derived from driver mutations are particularly important for neoantigen-targeting immunotherapy. First, driver-mutation-neoantigens are a source of &#x201c;high-quality neoantigens&#x201d; that may reduce the likelihood of resistance to neoantigen therapy. Second, driver mutations are shared between patients with the same cancer type at relatively high frequencies (<xref ref-type="bibr" rid="B57">57</xref>&#x2013;<xref ref-type="bibr" rid="B61">61</xref>), as well as between primary tumors and metastases (<xref ref-type="bibr" rid="B62">62</xref>). A limited number of high-frequency driver mutations may generate shared neoantigens that could be widely applied to multiple tumor patients and may be ideal targets for off-the-shelf immunotherapy (<xref ref-type="bibr" rid="B63">63</xref>). However, whether the shared putative neoantigens are immunogenic in different cancer patients remains to be determined. Nevertheless, prioritizing such neoantigens whenever possible is important, because a stored library of these shared neoantigens could substantially shorten the time from mutation detection to preparation of the personalized vaccine and increase the efficiency of neoantigen-based immunotherapies.</p>
</sec>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="supplementary-material" rid="SF1">
<bold>Supplementary Material</bold>
</xref>. Further inquiries can be directed to the corresponding authors. All training data and code are available on GitHub at: <uri xlink:href="https://github.com/haoqing12/APPM.git">https://github.com/haoqing12/APPM.git</uri>.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>QH trained the model and wrote the manuscript. PW, YS, Y-GZ, HX, and J-NZ reviewed and revised the manuscript. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="s7" sec-type="funding-information">
<title>Funding</title>
<p>National Basic Research Program of China (973 Program) (2009CB522801); National Science and Technology Major Projects for &#x201c;Major New Drugs Innovation and Development&#x201d; (2011ZX09401-304, 2015ZX09501004-001-005); National Natural Science Foundation of China (30672651, 81073047, 81470180); Sichuan Traditional Chinese Medicine Administration Project (20017Z001).</p>
</sec>
<sec id="s8" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>We would like to thank Dr. Kun Wei at the University of Electronic Science and Technology of China for providing a computational platform for machine learning.</p>
</ack>
<sec id="s9" sec-type="supplementary-material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fimmu.2021.682103/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fimmu.2021.682103/full#supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet_3.pdf" id="SF1" mimetype="application/pdf">
<label>Supplementary Figure 1</label>
<caption>
<p>Example of the peptide sequence &#x2018;ARHSLLQTL&#x2019; using one-hot encoding scheme.</p>
</caption>
</supplementary-material>
  <supplementary-material xlink:href="DataSheet_3.pdf" id="SF2" mimetype="application/pdf">
<label>Supplementary Figure 2</label>
<caption>
<p>The full CNN model structure. Purple, yellow and green represent three parallel convolutional layers. The black box represents the convolution kernel of each layer.</p>
</caption>
</supplementary-material>
  <supplementary-material xlink:href="DataSheet_3.pdf" id="SF3" mimetype="application/pdf">
<label>Supplementary Figure 3</label>
<caption>
<p>The extraction procedure for candidate peptides. The blue points represent the wild-type amino acids and the red points represent the driver mutant amino acids.</p>
</caption>
</supplementary-material>
<supplementary-material xlink:href="DataSheet_1.csv" id="SF4" mimetype="text/csv"/>
<supplementary-material xlink:href="DataSheet_2.zip" id="SF5" mimetype="application/zip"/>
<supplementary-material xlink:href="DataSheet_4.xlsx" id="SF6" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
<supplementary-material xlink:href="Table_1.xlsx" id="SF7" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
<supplementary-material xlink:href="Table_2.xlsx" id="SF8" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pon</surname> <given-names>JR</given-names>
</name>
<name>
<surname>Marra</surname> <given-names>MA</given-names>
</name>
</person-group>. <article-title>Driver and Passenger Mutations in Cancer</article-title>. <source>Annu Rev Pathol</source> (<year>2015</year>) <volume>10</volume>:<fpage>25</fpage>&#x2013;<lpage>50</lpage>. doi: <pub-id pub-id-type="doi">10.1146/annurev-pathol-012414-040312</pub-id>
</citation>
</ref>
<ref id="B2">
<label>2</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Haber</surname> <given-names>DA</given-names>
</name>
<name>
<surname>Settleman</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>Cancer: Drivers and Passengers</article-title>. <source>Nature</source> (<year>2007</year>) <volume>446</volume>:<page-range>145&#x2013;6</page-range>. doi: <pub-id pub-id-type="doi">10.1038/446145a</pub-id>
</citation>
</ref>
<ref id="B3">
<label>3</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stratton</surname> <given-names>MR</given-names>
</name>
<name>
<surname>Campbell</surname> <given-names>PJ</given-names>
</name>
<name>
<surname>Futreal</surname> <given-names>PA</given-names>
</name>
</person-group>. <article-title>The Cancer Genome</article-title>. <source>Nature</source> (<year>2009</year>) <volume>458</volume>:<page-range>719&#x2013;24</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nature07943</pub-id>
</citation>
</ref>
<ref id="B4">
<label>4</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schumacher</surname> <given-names>TN</given-names>
</name>
<name>
<surname>Schreiber</surname> <given-names>RD</given-names>
</name>
</person-group>. <article-title>Neoantigens in Cancer Immunotherapy</article-title>. <source>Science</source> (<year>2015</year>) <volume>348</volume>:<fpage>69</fpage>&#x2013;<lpage>74</lpage>. doi: <pub-id pub-id-type="doi">10.1126/science.aaa4971</pub-id>
</citation>
</ref>
<ref id="B5">
<label>5</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>S</given-names>
</name>
<name>
<surname>Deng</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>P</given-names>
</name>
<name>
<surname>Hou</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>H</given-names>
</name>
</person-group>. <article-title>Prognostic Factors for Checkpoint Inhibitor Based Immunotherapy: An Update With New Evidences</article-title>. <source>Front Pharmacol</source> (<year>2018</year>) <volume>9</volume>:<elocation-id>1050</elocation-id>. doi: <pub-id pub-id-type="doi">10.3389/fphar.2018.01050</pub-id>
</citation>
</ref>
<ref id="B6">
<label>6</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schumacher</surname> <given-names>TN</given-names>
</name>
<name>
<surname>Scheper</surname> <given-names>W</given-names>
</name>
<name>
<surname>Kvistborg</surname> <given-names>P</given-names>
</name>
</person-group>. <article-title>Cancer Neoantigens</article-title>. <source>Annu Rev Immunol</source> (<year>2018</year>) <volume>37</volume>:<fpage>173</fpage>&#x2013;<lpage>200</lpage>. doi: <pub-id pub-id-type="doi">10.1146/annurev-immunol-042617-053402</pub-id>
</citation>
</ref>
<ref id="B7">
<label>7</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ott</surname> <given-names>PA</given-names>
</name>
<name>
<surname>Hu</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Keskin</surname> <given-names>DB</given-names>
</name>
<name>
<surname>Shukla</surname> <given-names>SA</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>J</given-names>
</name>
<name>
<surname>Bozym</surname> <given-names>DJ</given-names>
</name>
<etal/>
</person-group>. <article-title>An Immunogenic Personal Neoantigen Vaccine for Patients With Melanoma</article-title>. <source>Nature</source> (<year>2017</year>) <volume>547</volume>:<page-range>217&#x2013;21</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nature22991</pub-id>
</citation>
</ref>
<ref id="B8">
<label>8</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Keskin</surname> <given-names>DB</given-names>
</name>
<name>
<surname>Anandappa</surname> <given-names>AJ</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>J</given-names>
</name>
<name>
<surname>Tirosh</surname> <given-names>I</given-names>
</name>
<name>
<surname>Mathewson</surname> <given-names>ND</given-names>
</name>
<name>
<surname>Li</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Neoantigen Vaccine Generates Intratumoral T Cell Responses in Phase Ib Glioblastoma Trial</article-title>. <source>Nature</source> (<year>2019</year>) <volume>565</volume>:<page-range>234&#x2013;9</page-range>. doi: <pub-id pub-id-type="doi">10.1038/s41586-018-0792-9</pub-id>
</citation>
</ref>
<ref id="B9">
<label>9</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sahin</surname> <given-names>U</given-names>
</name>
<name>
<surname>Derhovanessian</surname> <given-names>E</given-names>
</name>
<name>
<surname>Miller</surname> <given-names>M</given-names>
</name>
<name>
<surname>Kloke</surname> <given-names>BP</given-names>
</name>
<name>
<surname>Simon</surname> <given-names>P</given-names>
</name>
<name>
<surname>L&#xf6;wer</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>Personalized RNA Mutanome Vaccines Mobilize Poly-Specific Therapeutic Immunity Against Cancer</article-title>. <source>Nature</source> (<year>2017</year>) <volume>547</volume>:<page-range>222&#x2013;6</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nature23003</pub-id>
</citation>
</ref>
<ref id="B10">
<label>10</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ding</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>R</given-names>
</name>
<name>
<surname>Xie</surname> <given-names>L</given-names>
</name>
<name>
<surname>Shu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Gao</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Personalized Neoantigen Pulsed Dendritic Cell Vaccine for Advanced Lung Cancer</article-title>. <source>Signal Transduct Target Ther</source> (<year>2021</year>) <volume>6</volume>:<fpage>26</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41392-020-00448-5</pub-id>
</citation>
</ref>
<ref id="B11">
<label>11</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>F</given-names>
</name>
<name>
<surname>Zou</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Du</surname> <given-names>J</given-names>
</name>
<name>
<surname>Su</surname> <given-names>S</given-names>
</name>
<name>
<surname>Shao</surname> <given-names>J</given-names>
</name>
<name>
<surname>Meng</surname> <given-names>F</given-names>
</name>
<etal/>
</person-group>. <article-title>Neoantigen Identification Strategies Enable Personalized Immunotherapy in Refractory Solid Tumors</article-title>. <source>J&#xa0;Clin Invest</source> (<year>2019</year>) <volume>129</volume>:<page-range>2056&#x2013;70</page-range>. doi: <pub-id pub-id-type="doi">10.1172/JCI99538</pub-id>
</citation>
</ref>
<ref id="B12">
<label>12</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garcia-Garijo</surname> <given-names>A</given-names>
</name>
<name>
<surname>Fajardo</surname> <given-names>CA</given-names>
</name>
<name>
<surname>Gros</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>Determinants for Neoantigen Identification</article-title>. <source>Front Immunol</source> (<year>2019</year>) <volume>10</volume>:<elocation-id>1392</elocation-id>. doi: <pub-id pub-id-type="doi">10.3389/fimmu.2019.01392</pub-id>
</citation>
</ref>
<ref id="B13">
<label>13</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hacohen</surname> <given-names>N</given-names>
</name>
<name>
<surname>Fritsch</surname> <given-names>EF</given-names>
</name>
<name>
<surname>Carter</surname> <given-names>TA</given-names>
</name>
<name>
<surname>Lander</surname> <given-names>ES</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>CJ</given-names>
</name>
</person-group>. <article-title>Getting Personal With Neoantigen-Based Therapeutic Cancer Vaccines</article-title>. <source>Cancer Immunol Res</source> (<year>2013</year>) <volume>1</volume>:<page-range>11&#x2013;5</page-range>. doi: <pub-id pub-id-type="doi">10.1158/2326-6066.CIR-13-0022</pub-id>
</citation>
</ref>
<ref id="B14">
<label>14</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vitiello</surname> <given-names>A</given-names>
</name>
<name>
<surname>Zanetti</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>Neoantigen Prediction and the Need for Validation</article-title>. <source>Nat Biotechnol</source> (<year>2017</year>) <volume>35</volume>:<page-range>815&#x2013;7</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nbt.3932</pub-id>
</citation>
</ref>
<ref id="B15">
<label>15</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yadav</surname> <given-names>M</given-names>
</name>
<name>
<surname>Jhunjhunwala</surname> <given-names>S</given-names>
</name>
<name>
<surname>Phung</surname> <given-names>QT</given-names>
</name>
<name>
<surname>Lupardus</surname> <given-names>P</given-names>
</name>
<name>
<surname>Tanguay</surname> <given-names>J</given-names>
</name>
<name>
<surname>Bumbaca</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Predicting Immunogenic Tumour Mutations by Combining Mass Spectrometry and Exome Sequencing</article-title>. <source>Nature</source> (<year>2014</year>) <volume>515</volume>:<page-range>572&#x2013;6</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nature14001</pub-id>
</citation>
</ref>
<ref id="B16">
<label>16</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bassani-Sternberg</surname> <given-names>M</given-names>
</name>
<name>
<surname>Pletscher-Frankild</surname> <given-names>S</given-names>
</name>
<name>
<surname>Jensen</surname> <given-names>LJ</given-names>
</name>
<name>
<surname>Mann</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>Mass Spectrometry of Human Leukocyte Antigen Class I Peptidomes Reveals Strong Effects of Protein Abundance and Turnover on Antigen Presentation</article-title>. <source>Mol Cell Proteomics</source> (<year>2015</year>) <volume>14</volume>:<page-range>658&#x2013;73</page-range>. doi: <pub-id pub-id-type="doi">10.1074/mcp.M114.042812</pub-id>
</citation>
</ref>
<ref id="B17">
<label>17</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bulik-Sullivan</surname> <given-names>B</given-names>
</name>
<name>
<surname>Busby</surname> <given-names>J</given-names>
</name>
<name>
<surname>Palmer</surname> <given-names>CD</given-names>
</name>
<name>
<surname>Davis</surname> <given-names>MJ</given-names>
</name>
<name>
<surname>Murphy</surname> <given-names>T</given-names>
</name>
<name>
<surname>Clark</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>Deep Learning Using Tumor HLA Peptide Mass Spectrometry Datasets Improves Neoantigen Identification</article-title>. <source>Nat Biotechnol</source> (<year>2018</year>) <volume>37</volume>(<issue>1</issue>):<fpage>55</fpage>&#x2013;<lpage>63</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nbt.4313</pub-id>
</citation>
</ref>
<ref id="B18">
<label>18</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname> <given-names>C-H</given-names>
</name>
<name>
<surname>Yelensky</surname> <given-names>R</given-names>
</name>
<name>
<surname>Jooss</surname> <given-names>K</given-names>
</name>
<name>
<surname>Chan</surname> <given-names>TA</given-names>
</name>
</person-group>. <article-title>Update on Tumor Neoantigens and Their Utility: Why It Is Good to Be Different</article-title>. <source>Trends Immunol</source> (<year>2018</year>) <volume>39</volume>:<page-range>536&#x2013;48</page-range>. doi: <pub-id pub-id-type="doi">10.1016/j.it.2018.04.005</pub-id>
</citation>
</ref>
<ref id="B19">
<label>19</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abelin</surname> <given-names>JG</given-names>
</name>
<name>
<surname>Keskin</surname> <given-names>DB</given-names>
</name>
<name>
<surname>Sarkizova</surname> <given-names>S</given-names>
</name>
<name>
<surname>Hartigan</surname> <given-names>CR</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>W</given-names>
</name>
<name>
<surname>Sidney</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>Mass Spectrometry Profiling of HLA-Associated Peptidomes in Mono-Allelic Cells Enables More Accurate Epitope Prediction</article-title>. <source>Immunity</source> (<year>2017</year>) <volume>46</volume>:<page-range>315&#x2013;26</page-range>. doi: <pub-id pub-id-type="doi">10.1016/j.immuni.2017.02.007</pub-id>
</citation>
</ref>
<ref id="B20">
<label>20</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schirle</surname> <given-names>M</given-names>
</name>
<name>
<surname>Weinschenk</surname> <given-names>T</given-names>
</name>
<name>
<surname>Stevanovi&#x107;</surname> <given-names>S</given-names>
</name>
</person-group>. <article-title>Combining Computer Algorithms With Experimental Approaches Permits the Rapid and Accurate Identification of T Cell Epitopes From Defined Antigens</article-title>. <source>J Immunol Methods</source> (<year>2001</year>) <volume>257</volume>:<fpage>1</fpage>&#x2013;<lpage>16</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0022-1759(01)00459-8</pub-id>
</citation>
</ref>
<ref id="B21">
<label>21</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Caron</surname> <given-names>E</given-names>
</name>
<name>
<surname>Kowalewski</surname> <given-names>DJ</given-names>
</name>
<name>
<surname>Chiek Koh</surname> <given-names>C</given-names>
</name>
<name>
<surname>Sturm</surname> <given-names>T</given-names>
</name>
<name>
<surname>Schuster</surname> <given-names>H</given-names>
</name>
<name>
<surname>Aebersold</surname> <given-names>R</given-names>
</name>
</person-group>. <article-title>Analysis of Major Histocompatibility Complex (MHC) Immunopeptidomes Using Mass Spectrometry</article-title>. <source>Mol Cell Proteomics</source> (<year>2015</year>) <volume>14</volume>:<page-range>3105&#x2013;17</page-range>. doi: <pub-id pub-id-type="doi">10.1074/mcp.O115.052431</pub-id>
</citation>
</ref>
<ref id="B22">
<label>22</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kim</surname> <given-names>S</given-names>
</name>
<name>
<surname>Kim</surname> <given-names>HS</given-names>
</name>
<name>
<surname>Kim</surname> <given-names>E</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>MG</given-names>
</name>
<name>
<surname>Shin</surname> <given-names>EC</given-names>
</name>
<name>
<surname>Paik</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Neopepsee: Accurate Genome-Level Prediction of Neoantigens by Harnessing Sequence and Amino Acid Immunogenicity Information</article-title>. <source>Ann Oncol</source> (<year>2018</year>) <volume>29</volume>:<page-range>1030&#x2013;6</page-range>. doi: <pub-id pub-id-type="doi">10.1093/annonc/mdy022</pub-id>
</citation>
</ref>
<ref id="B23">
<label>23</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chowell</surname> <given-names>D</given-names>
</name>
<name>
<surname>Krishna</surname> <given-names>S</given-names>
</name>
<name>
<surname>Becker</surname> <given-names>PD</given-names>
</name>
<name>
<surname>Cocita</surname> <given-names>C</given-names>
</name>
<name>
<surname>Shu</surname> <given-names>J</given-names>
</name>
<name>
<surname>Tan</surname> <given-names>X</given-names>
</name>
<etal/>
</person-group>. <article-title>TCR Contact Residue Hydrophobicity Is a Hallmark of Immunogenic CD8<sup>+</sup> T Cell Epitopes</article-title>. <source>Proc Natl Acad Sci</source> (<year>2015</year>) <volume>112</volume>:<fpage>E1754</fpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.1500973112</pub-id>
</citation>
</ref>
<ref id="B24">
<label>24</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andreatta</surname> <given-names>M</given-names>
</name>
<name>
<surname>Nielsen</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>Gapped Sequence Alignment Using Artificial Neural Networks: Application to the MHC Class I System</article-title>. <source>Bioinformatics</source> (<year>2016</year>) <volume>32</volume>:<fpage>511</fpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btv639</pub-id>
</citation>
</ref>
<ref id="B25">
<label>25</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>O&#x2019;Donnell</surname> <given-names>TJ</given-names>
</name>
<name>
<surname>Rubinsteyn</surname> <given-names>A</given-names>
</name>
<name>
<surname>Bonsack</surname> <given-names>M</given-names>
</name>
<name>
<surname>Riemer</surname> <given-names>AB</given-names>
</name>
<name>
<surname>Laserson</surname> <given-names>U</given-names>
</name>
<name>
<surname>Hammerbacher</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>MHCflurry: Open-Source Class I MHC Binding Affinity Prediction</article-title>. <source>Cell Syst</source> (<year>2018</year>) <volume>7</volume>:<fpage>129</fpage>&#x2013;<lpage>32.e4</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cels.2018.05.014</pub-id>
</citation>
</ref>
<ref id="B26">
<label>26</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jurtz</surname> <given-names>V</given-names>
</name>
<name>
<surname>Paul</surname> <given-names>S</given-names>
</name>
<name>
<surname>Andreatta</surname> <given-names>M</given-names>
</name>
<name>
<surname>Marcatili</surname> <given-names>P</given-names>
</name>
<name>
<surname>Peters</surname> <given-names>B</given-names>
</name>
<name>
<surname>Nielsen</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>NetMHCpan 4.0: Improved peptide-MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data</article-title>. <source>bioRxiv</source> (<year>2017</year>) <volume>199</volume>(<issue>9</issue>):<page-range>3360&#x2013;8</page-range>. doi: <pub-id pub-id-type="doi">10.1101/149518</pub-id>
</citation>
</ref>
<ref id="B27">
<label>27</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Taylor</surname> <given-names>GW</given-names>
</name>
<name>
<surname>Fergus</surname> <given-names>R</given-names>
</name>
<name>
<surname>LeCun</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Bregler</surname> <given-names>C</given-names>
</name>
</person-group>. <article-title>Convolutional Learning of Spatio-Temporal Features</article-title>. In: <source>European Conference on Computer Vision</source>. <publisher-loc>Berlin, Heidelberg</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2010</year>). p. <page-range>140&#x2013;53</page-range>.</citation>
</ref>
<ref id="B28">
<label>28</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vang</surname> <given-names>YS</given-names>
</name>
<name>
<surname>Xie</surname> <given-names>X</given-names>
</name>
</person-group>. <article-title>HLA Class I Binding Prediction Via Convolutional Neural Networks</article-title>. <source>Bioinformatics</source> (<year>2017</year>) <volume>33</volume>:<page-range>2658&#x2013;65</page-range>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btx264</pub-id>
</citation>
</ref>
<ref id="B29">
<label>29</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Trolle</surname> <given-names>T</given-names>
</name>
<name>
<surname>McMurtrey</surname> <given-names>CP</given-names>
</name>
<name>
<surname>Sidney</surname> <given-names>J</given-names>
</name>
<name>
<surname>Bardet</surname> <given-names>W</given-names>
</name>
<name>
<surname>Osborn</surname> <given-names>SC</given-names>
</name>
<name>
<surname>Kaever</surname> <given-names>T</given-names>
</name>
<etal/>
</person-group>. <article-title>The Length Distribution of Class I-Restricted T Cell Epitopes Is Determined by Both Peptide Supply and MHC Allele-Specific Binding Preference</article-title>. <source>J Immunol</source> (<year>2016</year>) <volume>196</volume>:<page-range>1480&#x2013;7</page-range>. doi: <pub-id pub-id-type="doi">10.4049/jimmunol.1501721</pub-id>
</citation>
</ref>
<ref id="B30">
<label>30</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pearson</surname> <given-names>H</given-names>
</name>
<name>
<surname>Daouda</surname> <given-names>T</given-names>
</name>
<name>
<surname>Granados</surname> <given-names>DP</given-names>
</name>
<name>
<surname>Durette</surname> <given-names>C</given-names>
</name>
<name>
<surname>Bonneil</surname> <given-names>E</given-names>
</name>
<name>
<surname>Courcelles</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>MHC Class I-Associated Peptides Derive From Selective Regions of the Human Genome</article-title>. <source>J Clin Invest</source> (<year>2016</year>) <volume>126</volume>:<page-range>4690&#x2013;701</page-range>. doi: <pub-id pub-id-type="doi">10.1172/JCI88590</pub-id>
</citation>
</ref>
<ref id="B31">
<label>31</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lema&#xee;tre</surname> <given-names>G</given-names>
</name>
<name>
<surname>Nogueira</surname> <given-names>F</given-names>
</name>
<name>
<surname>Aridas</surname> <given-names>CK</given-names>
</name>
</person-group>. <article-title>Imbalanced-Learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning</article-title>. <source>J Mach Learn Res</source> (<year>2017</year>) <volume>18</volume>:<page-range>559&#x2013;63</page-range>.</citation>
</ref>
<ref id="B32">
<label>32</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altay</surname> <given-names>G</given-names>
</name>
</person-group>. <article-title>Tensorflow Based Deep Learning Model and Snakemake Workflow for Peptide-Protein Binding Predictions</article-title>. <source>bioRxiv</source> (<year>2018</year>) <volume>410928</volume>. doi: <pub-id pub-id-type="doi">10.1101/410928</pub-id>
</citation>
</ref>
<ref id="B33">
<label>33</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Szegedy</surname> <given-names>C</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>W</given-names>
</name>
<name>
<surname>Jia</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Sermanet</surname> <given-names>P</given-names>
</name>
<name>
<surname>Reed</surname> <given-names>S</given-names>
</name>
<name>
<surname>Anguelov</surname> <given-names>D</given-names>
</name>
<etal/>
</person-group>. <article-title>Going Deeper With Convolutions</article-title>. In: <source>2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>. <publisher-name>IEEE</publisher-name> (<year>2015</year>). p. <fpage>1</fpage>&#x2013;<lpage>9</lpage>.</citation>
</ref>
<ref id="B34">
<label>34</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abelin</surname> <given-names>JG</given-names>
</name>
<name>
<surname>Harjanto</surname> <given-names>D</given-names>
</name>
<name>
<surname>Malloy</surname> <given-names>M</given-names>
</name>
<name>
<surname>Suri</surname> <given-names>P</given-names>
</name>
<name>
<surname>Colson</surname> <given-names>T</given-names>
</name>
<name>
<surname>Goulding</surname> <given-names>SP</given-names>
</name>
<etal/>
</person-group>. <article-title>Defining HLA-II Ligand Processing and Binding Rules With Mass Spectrometry Enhances Cancer Epitope Prediction</article-title>. <source>Immunity</source> (<year>2019</year>) <volume>51</volume>:<page-range>766&#x2013;79.e17</page-range>. doi: <pub-id pub-id-type="doi">10.1016/j.immuni.2019.08.012</pub-id>
</citation>
</ref>
<ref id="B35">
<label>35</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sarkizova</surname> <given-names>S</given-names>
</name>
<name>
<surname>Klaeger</surname> <given-names>S</given-names>
</name>
<name>
<surname>Le</surname> <given-names>PM</given-names>
</name>
<name>
<surname>Li</surname> <given-names>LW</given-names>
</name>
<name>
<surname>Oliveira</surname> <given-names>G</given-names>
</name>
<name>
<surname>Keshishian</surname> <given-names>H</given-names>
</name>
<etal/>
</person-group>. <article-title>A Large Peptidome Dataset Improves HLA Class I Epitope Prediction Across Most of the Human Population</article-title>. <source>Nat Biotechnol</source> (<year>2020</year>) <volume>38</volume>:<fpage>199</fpage>&#x2013;<lpage>209</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41587-019-0322-9</pub-id>
</citation>
</ref>
<ref id="B36">
<label>36</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>O&#x2019;Donnell</surname> <given-names>TJ</given-names>
</name>
<name>
<surname>Rubinsteyn</surname> <given-names>A</given-names>
</name>
<name>
<surname>Laserson</surname> <given-names>U</given-names>
</name>
</person-group>. <article-title>MHCflurry 2.0: Improved Pan-Allele Prediction of MHC Class I-Presented Peptides by Incorporating Antigen Processing</article-title>. <source>Cell Syst</source> (<year>2020</year>) <volume>11</volume>:<fpage>42</fpage>&#x2013;<lpage>8.e7</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cels.2020.06.010</pub-id>
</citation>
</ref>
<ref id="B37">
<label>37</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alspach</surname> <given-names>E</given-names>
</name>
<name>
<surname>Lussier</surname> <given-names>DM</given-names>
</name>
<name>
<surname>Miceli</surname> <given-names>AP</given-names>
</name>
<name>
<surname>Kizhvatov</surname> <given-names>I</given-names>
</name>
<name>
<surname>DuPage</surname> <given-names>M</given-names>
</name>
<name>
<surname>Luoma</surname> <given-names>AM</given-names>
</name>
<etal/>
</person-group>. <article-title>MHC-II Neoantigens Shape Tumour Immunity and Response to Immunotherapy</article-title>. <source>Nature</source> (<year>2019</year>) <volume>574</volume>:<fpage>696</fpage>&#x2013;<lpage>701</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41586-019-1671-8</pub-id>
</citation>
</ref>
<ref id="B38">
<label>38</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bailey</surname> <given-names>MH</given-names>
</name>
<name>
<surname>Tokheim</surname> <given-names>C</given-names>
</name>
<name>
<surname>Porta-Pardo</surname> <given-names>E</given-names>
</name>
<name>
<surname>Sengupta</surname> <given-names>S</given-names>
</name>
<name>
<surname>Bertrand</surname> <given-names>D</given-names>
</name>
<name>
<surname>Weerasinghe</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>Comprehensive Characterization of Cancer Driver Genes and Mutations</article-title>. <source>Cell</source> (<year>2018</year>) <volume>173</volume>:<fpage>371</fpage>&#x2013;<lpage>85.e18</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cell.2018.02.060</pub-id>
</citation>
</ref>
<ref id="B39">
<label>39</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ellrott</surname> <given-names>K</given-names>
</name>
<name>
<surname>Bailey</surname> <given-names>MH</given-names>
</name>
<name>
<surname>Saksena</surname> <given-names>G</given-names>
</name>
<name>
<surname>Covington</surname> <given-names>KR</given-names>
</name>
<name>
<surname>Kandoth</surname> <given-names>C</given-names>
</name>
<name>
<surname>Stewart</surname> <given-names>C</given-names>
</name>
<etal/>
</person-group>. <article-title>Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines</article-title>. <source>Cell Syst</source> (<year>2018</year>) <volume>6</volume>:<fpage>271</fpage>&#x2013;<lpage>81.e7</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cels.2018.03.002</pub-id>
</citation>
</ref>
<ref id="B40">
<label>40</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khodadoust</surname> <given-names>MS</given-names>
</name>
<name>
<surname>Olsson</surname> <given-names>N</given-names>
</name>
<name>
<surname>Wagar</surname> <given-names>LE</given-names>
</name>
<name>
<surname>Haabeth</surname> <given-names>OA</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>B</given-names>
</name>
<name>
<surname>Swaminathan</surname> <given-names>K</given-names>
</name>
<etal/>
</person-group>. <article-title>Antigen Presentation Profiling Reveals Recognition of Lymphoma Immunoglobulin Neoantigens</article-title>. <source>Nature</source> (<year>2017</year>) <volume>543</volume>:<page-range>723&#x2013;7</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nature21433</pub-id>
</citation>
</ref>
<ref id="B41">
<label>41</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bassani-Sternberg</surname> <given-names>M</given-names>
</name>
<name>
<surname>Br&#xe4;unlein</surname> <given-names>E</given-names>
</name>
<name>
<surname>Klar</surname> <given-names>R</given-names>
</name>
<name>
<surname>Engleitner</surname> <given-names>T</given-names>
</name>
<name>
<surname>Sinitcyn</surname> <given-names>P</given-names>
</name>
<name>
<surname>Audehm</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Direct Identification of Clinically Relevant Neoepitopes Presented on Native Human Melanoma Tissue by Mass Spectrometry</article-title>. <source>Nat Commun</source> (<year>2016</year>) <volume>7</volume>:<fpage>13404</fpage>. doi: <pub-id pub-id-type="doi">10.1038/ncomms13404</pub-id>
</citation>
</ref>
<ref id="B42">
<label>42</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>B</given-names>
</name>
<name>
<surname>Khodadoust</surname> <given-names>M</given-names>
</name>
<name>
<surname>Olsson</surname> <given-names>N</given-names>
</name>
<name>
<surname>Wagar</surname> <given-names>L</given-names>
</name>
<name>
<surname>Fast</surname> <given-names>E</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>C</given-names>
</name>
<etal/>
</person-group>. <article-title>Predicting HLA Class II Antigen Presentation Through Integrated Deep Learning</article-title>. <source>Nat Biotechnol</source> (<year>2019</year>) <volume>37</volume>(<issue>11</issue>):<page-range>1332&#x2013;43</page-range>. doi: <pub-id pub-id-type="doi">10.1038/s41587-019-0280-2</pub-id>
</citation>
</ref>
<ref id="B43">
<label>43</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname> <given-names>W</given-names>
</name>
<name>
<surname>Qiu</surname> <given-names>S</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>L</given-names>
</name>
<name>
<surname>Xiang</surname> <given-names>H</given-names>
</name>
<etal/>
</person-group>. <article-title>EPIP: MHC-I Epitope Prediction Integrating Mass Spectrometry Derived Motifs and Tissue-Specific Expression Profiles</article-title>. <source>bioRxiv</source> (<year>2020</year>) <volume>567081</volume>. doi: <pub-id pub-id-type="doi">10.1101/567081</pub-id>
</citation>
</ref>
<ref id="B44">
<label>44</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reynisson</surname> <given-names>B</given-names>
</name>
<name>
<surname>Alvarez</surname> <given-names>B</given-names>
</name>
<name>
<surname>Paul</surname> <given-names>S</given-names>
</name>
<name>
<surname>Peters</surname> <given-names>B</given-names>
</name>
<name>
<surname>Nielsen</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>NetMHCpan-4.1 and NetMHCIIpan-4.0: Improved Predictions of MHC Antigen Presentation by Concurrent Motif Deconvolution and Integration of MS MHC Eluted Ligand Data</article-title>. <source>Nucleic Acids Res</source> (<year>2020</year>) <volume>48</volume>:<fpage>W449</fpage>&#x2013;<lpage>54</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkaa379</pub-id>
</citation>
</ref>
<ref id="B45">
<label>45</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Callari</surname> <given-names>M</given-names>
</name>
<name>
<surname>Sammut</surname> <given-names>S-J</given-names>
</name>
<name>
<surname>De Mattos-Arruda</surname> <given-names>L</given-names>
</name>
<name>
<surname>Bruna</surname> <given-names>A</given-names>
</name>
<name>
<surname>Rueda</surname> <given-names>OM</given-names>
</name>
<name>
<surname>Chin</surname> <given-names>S-F</given-names>
</name>
<etal/>
</person-group>. <article-title>Intersect-Then-Combine Approach: Improving the Performance of Somatic Variant Calling in Whole Exome Sequencing Data Using Multiple Aligners and Callers</article-title>. <source>Genome Med</source> (<year>2017</year>) <volume>9</volume>:<fpage>1</fpage>&#x2013;<lpage>11</lpage>. doi: <pub-id pub-id-type="doi">10.1186/s13073-017-0425-1</pub-id>
</citation>
</ref>
<ref id="B46">
<label>46</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>&#x141;uksza</surname> <given-names>M</given-names>
</name>
<name>
<surname>Riaz</surname> <given-names>N</given-names>
</name>
<name>
<surname>Makarov</surname> <given-names>V</given-names>
</name>
<name>
<surname>Balachandran</surname> <given-names>VP</given-names>
</name>
<name>
<surname>Hellmann</surname> <given-names>MD</given-names>
</name>
<name>
<surname>Solovyov</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>A Neoantigen Fitness Model Predicts Tumour Response to Checkpoint Blockade Immunotherapy</article-title>. <source>Nature</source> (<year>2017</year>) <volume>551</volume>:<page-range>517&#x2013;20</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nature24473</pub-id>
</citation>
</ref>
<ref id="B47">
<label>47</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bassani-Sternberg</surname> <given-names>M</given-names>
</name>
<name>
<surname>Chong</surname> <given-names>C</given-names>
</name>
<name>
<surname>Guillaume</surname> <given-names>P</given-names>
</name>
<name>
<surname>Solleder</surname> <given-names>M</given-names>
</name>
<name>
<surname>Pak</surname> <given-names>H</given-names>
</name>
<name>
<surname>Gannon</surname> <given-names>PO</given-names>
</name>
<etal/>
</person-group>. <article-title>Deciphering HLA-I Motifs Across HLA Peptidomes Improves Neo-Antigen Predictions and Identifies Allostery Regulating HLA Specificity</article-title>. <source>PLoS Comput Biol</source> (<year>2017</year>) <volume>13</volume>:<fpage>e1005725</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pcbi.1005725</pub-id>
</citation>
</ref>
<ref id="B48">
<label>48</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jurtz</surname> <given-names>V</given-names>
</name>
<name>
<surname>Paul</surname> <given-names>S</given-names>
</name>
<name>
<surname>Andreatta</surname> <given-names>M</given-names>
</name>
<name>
<surname>Marcatili</surname> <given-names>P</given-names>
</name>
<name>
<surname>Peters</surname> <given-names>B</given-names>
</name>
<name>
<surname>Nielsen</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>NetMHCpan-4.0: Improved Peptide&#x2013;MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data</article-title>. <source>J Immunol</source> (<year>2017</year>) <volume>199</volume>:<fpage>3360</fpage>. doi: <pub-id pub-id-type="doi">10.4049/jimmunol.1700893</pub-id>
</citation>
</ref>
<ref id="B49">
<label>49</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Calis</surname> <given-names>JJA</given-names>
</name>
<name>
<surname>Maybeno</surname> <given-names>M</given-names>
</name>
<name>
<surname>Greenbaum</surname> <given-names>JA</given-names>
</name>
<name>
<surname>Weiskopf</surname> <given-names>D</given-names>
</name>
<name>
<surname>De Silva</surname> <given-names>AD</given-names>
</name>
<name>
<surname>Sette</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>Properties of MHC Class I Presented Peptides That Enhance Immunogenicity</article-title>. <source>PLoS Comput Biol</source> (<year>2013</year>) <volume>9</volume>:<fpage>e1003266</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pcbi.1003266</pub-id>
</citation>
</ref>
<ref id="B50">
<label>50</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McGranahan</surname> <given-names>N</given-names>
</name>
<name>
<surname>Swanton</surname> <given-names>C</given-names>
</name>
</person-group>. <article-title>Neoantigen Quality, Not Quantity</article-title>. <source>Sci Transl Med</source> (<year>2019</year>) <volume>11</volume>(<issue>506</issue>):<fpage>eaax7918</fpage>. doi: <pub-id pub-id-type="doi">10.1126/scitranslmed.aax7918</pub-id>
</citation>
</ref>
<ref id="B51">
<label>51</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Douglass</surname> <given-names>J</given-names>
</name>
<name>
<surname>Hwang</surname> <given-names>MS</given-names>
</name>
<name>
<surname>Hsiue</surname> <given-names>EH-C</given-names>
</name>
<name>
<surname>Mog</surname> <given-names>BJ</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>Direct Detection and Quantification of Neoantigens</article-title>. <source>Cancer Immunol Res</source> (<year>2019</year>) <volume>7</volume>:<page-range>1748&#x2013;54</page-range>. doi: <pub-id pub-id-type="doi">10.1158/2326-6066.CIR-19-0107</pub-id>
</citation>
</ref>
<ref id="B52">
<label>52</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roudko</surname> <given-names>V</given-names>
</name>
<name>
<surname>Greenbaum</surname> <given-names>B</given-names>
</name>
<name>
<surname>Bhardwaj</surname> <given-names>N</given-names>
</name>
</person-group>. <article-title>Computational Prediction and Validation of Tumor-Associated Neoantigens</article-title>. <source>Front Immunol</source> (<year>2020</year>) <volume>11</volume>:<fpage>27</fpage>. doi: <pub-id pub-id-type="doi">10.3389/fimmu.2020.00027</pub-id>
</citation>
</ref>
<ref id="B53">
<label>53</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Purcell</surname> <given-names>AW</given-names>
</name>
<name>
<surname>Ramarathinam</surname> <given-names>SH</given-names>
</name>
<name>
<surname>Ternette</surname> <given-names>N</given-names>
</name>
</person-group>. <article-title>Mass Spectrometry&#x2013;Based Identification of MHC-Bound Peptides for Immunopeptidomics</article-title>. <source>Nat Protoc</source> (<year>2019</year>) <volume>14</volume>:<fpage>1687</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41596-019-0133-y</pub-id>
</citation>
</ref>
<ref id="B54">
<label>54</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>R&#xf6;tzschke</surname> <given-names>O</given-names>
</name>
<name>
<surname>Falk</surname> <given-names>K</given-names>
</name>
<name>
<surname>Deres</surname> <given-names>K</given-names>
</name>
<name>
<surname>Schild</surname> <given-names>H</given-names>
</name>
<name>
<surname>Norda</surname> <given-names>M</given-names>
</name>
<name>
<surname>Metzger</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>Isolation and Analysis of Naturally Processed Viral Peptides as Recognized by Cytotoxic T Cells</article-title>. <source>Nature</source> (<year>1990</year>) <volume>348</volume>:<page-range>252&#x2013;4</page-range>. doi: <pub-id pub-id-type="doi">10.1038/348252a0</pub-id>
</citation>
</ref>
<ref id="B55">
<label>55</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hunt</surname> <given-names>DF</given-names>
</name>
<name>
<surname>Henderson</surname> <given-names>RA</given-names>
</name>
<name>
<surname>Shabanowitz</surname> <given-names>J</given-names>
</name>
<name>
<surname>Sakaguchi</surname> <given-names>K</given-names>
</name>
<name>
<surname>Michel</surname> <given-names>H</given-names>
</name>
<name>
<surname>Sevilir</surname> <given-names>N</given-names>
</name>
<etal/>
</person-group>. <article-title>Characterization of Peptides Bound to the Class I MHC Molecule HLA-A2.1 by Mass Spectrometry</article-title>. <source>Science</source> (<year>1992</year>) <volume>255</volume>:<page-range>1261&#x2013;3</page-range>. doi: <pub-id pub-id-type="doi">10.1126/science.1546328</pub-id>
</citation>
</ref>
<ref id="B56">
<label>56</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shao</surname> <given-names>W</given-names>
</name>
<name>
<surname>Pedrioli</surname> <given-names>PGA</given-names>
</name>
<name>
<surname>Wolski</surname> <given-names>W</given-names>
</name>
<name>
<surname>Scurtescu</surname> <given-names>C</given-names>
</name>
<name>
<surname>Schmid</surname> <given-names>E</given-names>
</name>
<name>
<surname>Vizca&#xed;no</surname> <given-names>JA</given-names>
</name>
<etal/>
</person-group>. <article-title>The SysteMHC Atlas Project</article-title>. <source>Nucleic Acids Res</source> (<year>2018</year>) <volume>46</volume>:<fpage>D1237</fpage>&#x2013;<lpage>D1247</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkx664</pub-id>
</citation>
</ref>
<ref id="B57">
<label>57</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sawyers</surname> <given-names>CL</given-names>
</name>
</person-group>. <article-title>Chronic Myeloid Leukemia</article-title>. <source>New Engl J Med</source> (<year>1999</year>) <volume>340</volume>:<page-range>1330&#x2013;40</page-range>. doi: <pub-id pub-id-type="doi">10.1056/NEJM199904293401706</pub-id>
</citation>
</ref>
<ref id="B58">
<label>58</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kandoth</surname> <given-names>C</given-names>
</name>
<name>
<surname>McLellan</surname> <given-names>MD</given-names>
</name>
<name>
<surname>Vandin</surname> <given-names>F</given-names>
</name>
<name>
<surname>Ye</surname> <given-names>K</given-names>
</name>
<name>
<surname>Niu</surname> <given-names>B</given-names>
</name>
<name>
<surname>Lu</surname> <given-names>C</given-names>
</name>
<etal/>
</person-group>. <article-title>Mutational Landscape and Significance Across 12 Major Cancer Types</article-title>. <source>Nature</source> (<year>2013</year>) <volume>502</volume>:<page-range>333&#x2013;9</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nature12634</pub-id>
</citation>
</ref>
<ref id="B59">
<label>59</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Murphree</surname> <given-names>AL</given-names>
</name>
<name>
<surname>Benedict</surname> <given-names>WF</given-names>
</name>
</person-group>. <article-title>Retinoblastoma: Clues to Human Oncogenesis</article-title>. <source>Science</source> (<year>1984</year>) <volume>223</volume>:<page-range>1028&#x2013;33</page-range>. doi: <pub-id pub-id-type="doi">10.1126/science.6320372</pub-id>
</citation>
</ref>
<ref id="B60">
<label>60</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hoadley</surname> <given-names>KA</given-names>
</name>
<name>
<surname>Yau</surname> <given-names>C</given-names>
</name>
<name>
<surname>Wolf</surname> <given-names>DM</given-names>
</name>
<name>
<surname>Cherniack</surname> <given-names>AD</given-names>
</name>
<name>
<surname>Tamborero</surname> <given-names>D</given-names>
</name>
<name>
<surname>Ng</surname> <given-names>S</given-names>
</name>
<etal/>
</person-group>. <article-title>Multiplatform Analysis of 12 Cancer Types Reveals Molecular Classification Within and Across Tissues of Origin</article-title>. <source>Cell</source> (<year>2014</year>) <volume>158</volume>:<page-range>929&#x2013;44</page-range>. doi: <pub-id pub-id-type="doi">10.1016/j.cell.2014.06.049</pub-id>
</citation>
</ref>
<ref id="B61">
<label>61</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname> <given-names>MT</given-names>
</name>
<name>
<surname>Asthana</surname> <given-names>S</given-names>
</name>
<name>
<surname>Gao</surname> <given-names>SP</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>BH</given-names>
</name>
<name>
<surname>Chapman</surname> <given-names>JS</given-names>
</name>
<name>
<surname>Kandoth</surname> <given-names>C</given-names>
</name>
<etal/>
</person-group>. <article-title>Identifying Recurrent Mutations in Cancer Reveals Widespread Lineage Diversity and Mutational Specificity</article-title>. <source>Nat Biotechnol</source> (<year>2016</year>) <volume>34</volume>:<page-range>155&#x2013;63</page-range>. doi: <pub-id pub-id-type="doi">10.1038/nbt.3391</pub-id>
</citation>
</ref>
<ref id="B62">
<label>62</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>HN</given-names>
</name>
<name>
<surname>Shu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Liao</surname> <given-names>F</given-names>
</name>
<name>
<surname>Liao</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>H</given-names>
</name>
<name>
<surname>Qin</surname> <given-names>Y</given-names>
</name>
<etal/>
</person-group>. <article-title>Genomic Evolution and Diverse Models of Systemic Metastases in Colorectal Cancer</article-title>. <source>Gut</source> (<year>2021</year>). doi: <pub-id pub-id-type="doi">10.1136/gutjnl-2020-323703</pub-id>
</citation>
</ref>
<ref id="B63">
<label>63</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>De Mattos-Arruda</surname> <given-names>L</given-names>
</name>
<name>
<surname>Vazquez</surname> <given-names>M</given-names>
</name>
<name>
<surname>Finotello</surname> <given-names>F</given-names>
</name>
<name>
<surname>Lepore</surname> <given-names>R</given-names>
</name>
<name>
<surname>Porta</surname> <given-names>E</given-names>
</name>
<name>
<surname>Hundal</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>Neoantigen Prediction and Computational Perspectives Towards Clinical Benefit: Recommendations From the ESMO Precision Medicine Working Group</article-title>. <source>Ann Oncol</source> (<year>2020</year>) <volume>31</volume>(<issue>8</issue>):<page-range>978&#x2013;90</page-range>. doi: <pub-id pub-id-type="doi">10.1016/j.annonc.2020.05.008</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>