ORIGINAL RESEARCH article
Front. Immunol.
Sec. Autoimmune and Autoinflammatory Disorders : Autoimmune Disorders
Volume 16 - 2025 | doi: 10.3389/fimmu.2025.1630863
In Silico Tool for Predicting and Scanning Rheumatoid Arthritis-Inducing Peptides in an Antigen
Provisionally accepted- Indraprastha Institute of Information Technology Delhi, Delhi, India
Rheumatoid arthritis (RA) is an autoimmune disorder in which the immune system mounts an abnormal response to self-antigens, leading to chronic inflammation and joint damage. Identifying the antigenic regions in proteins that trigger RA is therefore crucial for developing protein-based therapeutics. In this study, we developed models for predicting HLA class II binding RA-inducing peptides using a dataset comprising 291 experimentally confirmed RA-inducing peptides and 165 RA non-inducing peptides. Our initial analysis revealed that certain residues, such as glycine, proline, and tyrosine, are significantly enriched in RA-inducing peptides. While alignment-based techniques like Basic Local Alignment Search Tool (BLAST) and Motif EmeRging and with Classes Identification (MERCI) offered high precision, they suffered from limited coverage. We developed machine learning and deep learning-based prediction models and obtained the highest performance (AUC = 0.75) using the XGBoost classifier on the validation dataset. We also developed prediction methods using protein language models, among which ProtBERT performed best (AUC = 0.72). Our ensemble model, which combines XGBoost and MERCI-derived motifs, achieved the best overall performance (AUC = 0.80; MCC = 0.45) on the validation dataset. All models were evaluated on a validation dataset that was not used during model training. This study will be valuable for assessing the risk of proteins used in probiotics, genetically modified foods, and protein-based therapeutics. Our most effective approach has been implemented in RAIpred, a web server and standalone software tool for predicting and scanning RA-inducing peptides (https://webs.iiitd.edu.in/raghava/raipred/).
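The abstract does not state how the XGBoost output and the MERCI-derived motifs are combined, so the following is only a minimal Python sketch of one plausible scheme, not the RAIpred implementation. It assumes the classifier probability is shifted up or down by a fixed weight when a peptide contains a motif from the inducing or non-inducing set. The function names, the weight of 0.5, and the motif "GPY" are hypothetical and chosen for illustration; the MCC helper uses the standard definition of the Matthews correlation coefficient.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(peptide: str) -> dict[str, float]:
    """Percent composition of the 20 standard residues in a peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(peptide)
    return {aa: 100.0 * counts.get(aa, 0) / len(peptide) for aa in AMINO_ACIDS}


def hybrid_score(ml_prob: float, peptide: str,
                 inducing_motifs: list[str],
                 non_inducing_motifs: list[str],
                 weight: float = 0.5) -> float:
    """Adjust a classifier probability by motif hits (assumed scheme).

    ml_prob would come from a trained model such as XGBoost; here it is
    simply passed in. The motif lists stand in for MERCI output.
    """
    peptide = peptide.upper()
    score = ml_prob
    if any(m in peptide for m in inducing_motifs):
        score += weight  # motif seen only in RA-inducing peptides
    if any(m in peptide for m in non_inducing_motifs):
        score -= weight  # motif seen only in non-inducing peptides
    return score


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    pep = "AGPYA"  # made-up peptide, not from the study's dataset
    print(aa_composition(pep)["G"])
    print(hybrid_score(0.4, pep, ["GPY"], []))
```

A scheme of this kind lets high-precision but low-coverage motif hits override a borderline classifier score, while peptides without any motif hit fall back on the classifier alone.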
Keywords: autoimmune disease, rheumatoid arthritis, T-cell epitopes, machine learning, large language models
Received: 18 May 2025; Accepted: 12 Aug 2025.
Copyright: © 2025 Tomer, Jain, Gahlot, Bajiya and Raghava. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Gajendra PS Raghava, Indraprastha Institute of Information Technology Delhi, Delhi, India
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.