ORIGINAL RESEARCH article

Front. Bioinform.

Sec. Protein Bioinformatics

Volume 5 - 2025 | doi: 10.3389/fbinf.2025.1684042

This article is part of the Research Topic: AI in Protein Science

ParaDeep: Sequence-Based Deep Learning for Residue-Level Paratope Prediction Using Chain-Aware BiLSTM-CNN Models

Provisionally accepted
Piyachat Udomwong1*, Thanathat Pamonsupornwichit2, Kanchanok Kodchakorn3,4, Chatchai Tayapiwatana2,5*
  • 1Chiang Mai University, International College of Digital Innovation, Chiang Mai, Thailand
  • 2Center of Biomolecular Therapy and Diagnostic, Chiang Mai University Faculty of Associated Medical Sciences, Chiang Mai, Thailand
  • 3Chiang Mai University, Office of Research Administration, Chiang Mai, Thailand
  • 4Chiang Mai University Department of Chemistry, Chiang Mai, Thailand
  • 5Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, Thailand

The final, formatted version of the article will be published soon.

Accurate prediction of antibody paratopes is a critical challenge in structure-limited, high-throughput discovery workflows. We present ParaDeep, a lightweight and interpretable deep learning framework for residue-level paratope prediction directly from amino acid sequences. ParaDeep integrates bidirectional long short-term memory networks with one-dimensional convolutional layers to capture both long-range sequence context and local binding motifs. We systematically evaluated 30 model configurations varying in encoding schemes, convolutional kernel sizes, and antibody chain types. In five-fold cross-validation, heavy (H) chain models achieved the highest performance (F1 = 0.856 ± 0.014, MCC = 0.842 ± 0.015), outperforming light (L) chain models (F1 = 0.774 ± 0.023, MCC = 0.772 ± 0.022). On an independent blind test set, ParaDeep attained F1 = 0.723 and MCC = 0.685 for H chains, and F1 = 0.607 and MCC = 0.587 for L chains, representing a 27% MCC improvement over the sequence-based baseline Parapred. Chain-specific modeling revealed that heavy chains provide stronger sequence-based predictive signals, while light chains benefit more from structural context. ParaDeep approaches the performance of state-of-the-art structure-based methods on heavy chains while requiring only sequence input, enabling faster and broader applicability without the computational cost of 3D modeling. Its efficiency and scalability make it well-suited for early-stage antibody discovery, repertoire profiling, and therapeutic design, particularly in the absence of structural data. The implementation is freely available at https://github.com/PiyachatU/ParaDeep, with Python (PyTorch) code and a Google Colab interface for ease of use.
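The abstract reports performance as F1 and Matthews correlation coefficient (MCC) over residue-level predictions. As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed from binary per-residue labels (1 = paratope residue, 0 = non-paratope). The function names and the toy labels are illustrative assumptions and are not taken from the ParaDeep repository; the authors' evaluation code may differ, for example in thresholding or in how results are averaged across chains.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Count (TP, FP, TN, FN) for binary per-residue labels (1 = paratope)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, tn, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (hypothetical labels for a 10-residue stretch, not real data).
y_true = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
f1_value = f1_score(*counts)
mcc_value = mcc(*counts)
```

Because paratope residues are a minority of each chain, MCC is a useful complement to F1: it also accounts for true negatives, so a model that labels nearly everything as non-binding cannot score well on it.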

Keywords: Antibody binding site prediction, deep learning, BiLSTM-CNN, Heavy and light chains, Paratope identification, Sequence modeling

Received: 12 Aug 2025; Accepted: 07 Oct 2025.

Copyright: © 2025 Udomwong, Pamonsupornwichit, Kodchakorn and Tayapiwatana. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Piyachat Udomwong, piyachat.u@cmu.ac.th
Chatchai Tayapiwatana, chatchai.t@cmu.ac.th

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.