AUTHOR=Dhanda Sandeep Kumar , Karosiene Edita , Edwards Lindy , Grifoni Alba , Paul Sinu , Andreatta Massimo , Weiskopf Daniela , Sidney John , Nielsen Morten , Peters Bjoern , Sette Alessandro TITLE=Predicting HLA CD4 Immunogenicity in Human Populations JOURNAL=Frontiers in Immunology VOLUME=Volume 9 - 2018 YEAR=2018 URL=https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2018.01369 DOI=10.3389/fimmu.2018.01369 ISSN=1664-3224 ABSTRACT=Background: Prediction of T cell immunogenicity is a topic of considerable interest, both in terms of basic understanding of the mechanisms of T cell responses, and in terms of practical applications. HLA binding affinity is often used to predict T cell epitopes, since HLA binding affinity is a key requisite for human T cell immunogenicity. However, predicting immunogenicity at the population level is complicated by the high variability of HLA molecules, potential factors beyond HLA, and the frequent lack of HLA typing data. To overcome those issues, we explored an alternative approach to identify the common characteristics that distinguish immunogenic peptides from non-recognized peptides. Methods: Sets of dominant epitopes derived from peer-reviewed published papers were used in conjunction with negative peptides from the same experiments/donors to train neural networks (NN) and generate an “immunogenicity score”. We also compared the performance of the immunogenicity score with that of a previously described method for immunogenicity prediction based on HLA class II binding at the population level. Results: The immunogenicity score was validated on a series of independent datasets derived from the published literature, representing 57 independent studies where immunogenicity in human populations was assessed by testing overlapping peptides spanning different antigens. Overall, these test datasets comprised over 2000 peptides tested in over 1600 different human donors. 
The 7-allele method and the immunogenicity score performed similarly (average AUC values of 0.703 and 0.702, respectively), while the combination of the two methods reached an average AUC of 0.725. This increase in average AUC is significant compared with the immunogenicity score (p = 0.0135), and a strong trend towards significance is observed when compared with the 7-allele method (p = 0.0938). The new immunogenicity score method is now freely available through the CD4 T cell immunogenicity prediction tool on the Immune Epitope Database (IEDB) website (http://tools.iedb.org/CD4episcore). Conclusions: The new immunogenicity score predicts CD4 T cell immunogenicity at the population level starting from protein sequences and with no need for HLA typing. Its efficacy has been validated in the context of different antigen sources, ethnicities and disparate techniques for epitope identification.