ORIGINAL RESEARCH article
Front. Bioinform.
Sec. Protein Bioinformatics
Volume 5 - 2025 | doi: 10.3389/fbinf.2025.1684042
This article is part of the Research Topic: AI in Protein Science
ParaDeep: Sequence-Based Deep Learning for Residue-Level Paratope Prediction Using Chain-Aware BiLSTM-CNN Models
Provisionally accepted
- 1Chiang Mai University, International College of Digital Innovation, Chiang Mai, Thailand
- 2Center of Biomolecular Therapy and Diagnostic, Chiang Mai University Faculty of Associated Medical Sciences, Chiang Mai, Thailand
- 3Chiang Mai University, Office of Research Administration, Chiang Mai, Thailand
- 4Chiang Mai University Department of Chemistry, Chiang Mai, Thailand
- 5Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, Thailand
Accurate prediction of antibody paratopes is a critical challenge in structure-limited, high-throughput discovery workflows. We present ParaDeep, a lightweight and interpretable deep learning framework for residue-level paratope prediction directly from amino acid sequences. ParaDeep integrates bidirectional long short-term memory networks with one-dimensional convolutional layers to capture both long-range sequence context and local binding motifs. We systematically evaluated 30 model configurations varying in encoding schemes, convolutional kernel sizes, and antibody chain types. In five-fold cross-validation, heavy (H) chain models achieved the highest performance (F1 = 0.856 ± 0.014, MCC = 0.842 ± 0.015), outperforming light (L) chain models (F1 = 0.774 ± 0.023, MCC = 0.772 ± 0.022). On an independent blind test set, ParaDeep attained F1 = 0.723 and MCC = 0.685 for H chains, and F1 = 0.607 and MCC = 0.587 for L chains, representing a 27% MCC improvement over the sequence-based baseline Parapred. Chain-specific modeling revealed that heavy chains provide stronger sequence-based predictive signals, while light chains benefit more from structural context. ParaDeep approaches the performance of state-of-the-art structure-based methods on heavy chains while requiring only sequence input, enabling faster and broader applicability without the computational cost of 3D modeling. Its efficiency and scalability make it well-suited for early-stage antibody discovery, repertoire profiling, and therapeutic design, particularly in the absence of structural data. The implementation is freely available at https://github.com/PiyachatU/ParaDeep, with Python (PyTorch) code and a Google Colab interface for ease of use.
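The F1 and MCC values reported above are both derived from per-residue confusion counts (true/false positives and negatives over paratope vs. non-paratope labels). The following is a minimal sketch of how these two metrics are conventionally computed. It is not taken from the ParaDeep repository. The function names and the toy label vectors are illustrative assumptions, and the labels are assumed to be already binarized (1 = paratope residue, 0 = non-paratope residue).

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary per-residue labels (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:  # t == 1 and p == 0
            fn += 1
    return tp, fp, tn, fn

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy example (made-up labels for a 10-residue stretch, not real data).
y_true = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
f1_value = f1_score(tp, fp, fn)
mcc_value = mcc(tp, fp, tn, fn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn} F1={f1_value:.3f} MCC={mcc_value:.3f}")
```

Because paratope residues are a small minority of each chain, MCC is often reported alongside F1: it also accounts for true negatives and is therefore less sensitive to class imbalance.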
Keywords: antibody binding site prediction, deep learning, BiLSTM-CNN, heavy and light chains, paratope identification, sequence modeling
Received: 12 Aug 2025; Accepted: 07 Oct 2025.
Copyright: © 2025 Udomwong, Pamonsupornwichit, Kodchakorn and Tayapiwatana. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Piyachat Udomwong, piyachat.u@cmu.ac.th
Chatchai Tayapiwatana, chatchai.t@cmu.ac.th
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.