REVIEW article

Front. Bioinform., 12 January 2026

Sec. Drug Discovery in Bioinformatics

Volume 5 - 2025 | https://doi.org/10.3389/fbinf.2025.1712577

This article is part of the Research Topic: AI in Drug Discovery

Recent trends in machine learning and deep learning-based prediction of G-protein coupled receptor-ligand binding affinities

Joshua Stephenson, Konda Reddy Karnati*
  • Department of Natural Sciences, Bowie State University, Bowie, MD, United States

Accurately predicting protein-ligand binding affinity is key in drug discovery. Machine Learning and Deep Learning methods used in the drug discovery process have advanced the prediction of drug–target binding affinities, particularly for G protein–coupled receptors (GPCRs), a pharmacologically significant yet structurally heterogeneous protein family. In this review, binding affinity prediction models are examined and organized according to sequence-based one-dimensional, graph-based two-dimensional, and structure-based three-dimensional frameworks. Sequence-based models utilize convolutional neural networks for high-throughput screening. Recently published models incorporated attention mechanisms and self-supervised learning, enhancing interpretability and reducing dependence on annotated datasets. Graph-based models employ graph neural networks and molecular contact maps to capture topological features, enabling substructure-sensitive predictions. Structure-based approaches integrate spatial and conformational data into high-resolution interaction models. The hybrid use of these three approaches could significantly increase the success rate of in silico models for drug discovery, particularly for GPCRs.

Introduction

Binding affinity is a key parameter in drug discovery because it quantifies the strength of the interaction between a protein and a ligand (Spassov, 2024; Gilson and Zhou, 2007). Predicting binding affinity accurately remains challenging with current computational methods; strategies such as molecular docking are widely used but do not yield highly satisfactory results (Spassov, 2024). To overcome this limitation, binding affinity prediction models based on Machine Learning (ML) and Deep Learning (DL) have become more prevalent in the drug discovery workflow. These models estimate the strength of interactions between small molecules and biological macromolecules, commonly reported as Kd, Ki, or IC50, which guides the prioritization of compounds before costly experimental assays. The models can be broadly grouped by the dimensionality of their input representations: one-dimensional (1D), two-dimensional (2D), and three-dimensional (3D). 1D models operate on sequences or chemical strings, 2D models use graph-based molecular topologies or contact maps, and 3D models take the spatial coordinates of atoms or coarse-grained conformations as input (Chen et al., 2016; Wang, 2024; Nguyen et al., 2024; Wang et al., 2015).
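Because Kd, Ki, and IC50 values span several orders of magnitude, affinity labels are commonly log-transformed before model training. The following is a minimal sketch of that transform, assuming the measurements are supplied in nanomolar; the function name and the example values are illustrative and are not taken from any of the reviewed models.

```python
import math

def to_p_scale(value_nm: float) -> float:
    """Convert an affinity value in nanomolar (Kd, Ki, or IC50) to -log10 of its molar value."""
    if value_nm <= 0:
        raise ValueError("affinity must be a positive concentration")
    return -math.log10(value_nm * 1e-9)  # nM -> M, then negative log10

# Illustrative (made-up) measurements in nM.
examples_nm = {"tight binder": 1.0, "moderate binder": 100.0, "weak binder": 10000.0}
p_values = {name: to_p_scale(v) for name, v in examples_nm.items()}

for name, p in p_values.items():
    print(f"{name}: p-value = {p:.2f}")
```

On this scale a larger number means tighter binding, and a tenfold change in concentration corresponds to a change of one unit.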

DL is a subset of ML, which is itself a branch of artificial intelligence (AI) that enables computers to learn patterns from data and make predictions without strict rule-based programming (LeCun et al., 2015; Mahesh, 2020). Traditional ML approaches were limited in their ability to process raw data; they relied on engineered descriptors and algorithms such as decision trees, support vector machines, or random forests (Mahesh, 2020; LeCun et al., 2015). In contrast, DL uses layered neural network architectures whose internal parameters are adjusted by backpropagation (Figure 1) (LeCun et al., 2015). Over the last decade, these methods have transformed early-stage drug discovery by accelerating virtual screening and reducing the difficulties of synthesis experiments (Paul et al., 2021; Blanco-González et al., 2023). They have enabled structure-based virtual screening (Cheng et al., 2012; Kitchen et al., 2004), kinase selectivity profiling (Davis et al., 2011), and the identification of ligand-binding residues (Chen et al., 2014).

Figure 1
Flowchart depicting a model architecture for data input and prediction. It starts with data input from a target sequence and SMILES representation, followed by data encoding using smiles and amino acid embeddings. The architecture includes transformer-based encoder blocks and a CNN. An interaction module leads to a prediction outcome.

Figure 1. Workflow of ML/DL based binding-affinity prediction, beginning with data input and encoding, proceeding through the model architecture and the interaction module (where SMILES and amino-acid embeddings are fused), concluding with the affinity prediction.

G-protein-coupled receptors (GPCRs) constitute the most prominent family of druggable membrane proteins and control multiple downstream cellular signals. Although GPCRs are the targets of roughly one-third of marketed therapeutics, many GPCRs still lack effective pharmacological treatments (Hauser et al., 2017). Advances in cryo-EM and X-ray crystallography have improved access to high-resolution GPCR structures (Congreve et al., 2020), revealing conserved activation motifs (Venkatakrishnan et al., 2013) and enabling structure-based design. Although recent cryo-EM structures have improved coverage of GPCRs, high-resolution structures capturing receptor activation states and receptor conformations stabilized by a particular ligand remain uncommon, especially at allosteric sites; this complicates binding affinity inference and the prediction of subtype selectivity (Congreve et al., 2020; Wacker et al., 2017; Krishna Kumar et al., 2019; Xia et al., 2021). Predicting GPCR-ligand binding affinity is further complicated because these membrane receptors are dynamic and can adopt multiple conformations, while their endogenous peptide ligands vary widely in sequence, length, and post-translational modifications; together, these factors limit the available structural data and make it hard for models to generalize beyond their training sets. Because GPCRs have multiple allosteric and orthosteric sites with ligand-specific pocket arrangements, any fixed representation risks missing relevant details (Latorraca et al., 2017; Christopoulos, 2014). Benchmarking also shows that distinguishing true binders from closely related decoys is difficult due to binding-pocket similarity across receptors (Hoegen Dijkhof et al., 2025). Given these features, the use of 1D sequence-only models, 2D graph-based models, and 3D structure-based models can potentially improve the accuracy of binding affinity predictions and lower the resource cost of both in silico and experimental processes.

1D binding affinity models

1D ML models process sequential data, typically in 1D formats such as text, time series, or biological sequences, to extract patterns and make predictions (Kiranyaz et al., 2021). Proteins represented as amino-acid sequences and ligands encoded as canonical SMILES (Simplified Molecular Input Line Entry System) strings are tokenized and fed into convolutional or recurrent networks. Because they do not rely on structural data, 1D sequence-based models enable rapid, high-throughput screening (Öztürk et al., 2018; Wang, 2024). Their Convolutional Neural Network (CNN) encoders have given competitive results (Table 1) on benchmark datasets (e.g., Davis, KIBA) and can outperform classical docking in some instances, making them efficient and easy to scale (Öztürk et al., 2018; Öztürk et al., 2019; Kitchen et al., 2004). Recent self-supervised approaches further boost their effectiveness by learning useful representations from extensive unlabeled data, reducing dependence on labeled data (Schuh et al., 2025; LeCun et al., 2015). 1D encodings can also capture pharmacological properties in multi-task setups, which broadens their role as initial filters (Brahma et al., 2025; Jabeen and Ranganathan, 2019). Recent studies have shown that coupling 1D CNN encoders for SMILES and protein sequences (Figure 2), followed by fully connected layers to regress binding affinity, can produce useful results (Öztürk et al., 2018; LeCun et al., 1998).
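The input side of such a 1D model can be sketched as follows: each character of a SMILES string or protein sequence is mapped to an integer and the result is padded to a fixed length. This is a simplified character-level illustration; the function names, padding lengths, and the short protein fragment are assumptions made for the example, and real models may use different vocabularies or multi-character tokens (for instance, treating "Cl" as one token).

```python
def build_vocab(alphabet):
    """Map each symbol to a positive integer; 0 is reserved for padding."""
    return {ch: i + 1 for i, ch in enumerate(sorted(set(alphabet)))}

def encode(sequence, vocab, max_len):
    """Label-encode a string and right-pad (or truncate) it to max_len.

    Symbols missing from the vocabulary are mapped to 0, the padding index,
    which is a simplification chosen for this sketch.
    """
    ids = [vocab.get(ch, 0) for ch in sequence[:max_len]]
    return ids + [0] * (max_len - len(ids))

# Character-level SMILES vocabulary built from the example itself (acetic acid).
smiles = "CC(=O)O"
smiles_vocab = build_vocab(smiles)
smiles_ids = encode(smiles, smiles_vocab, max_len=10)

# The 20 standard amino acids; "MKT" is a made-up fragment, not a real receptor.
protein_vocab = build_vocab("ACDEFGHIKLMNPQRSTVWY")
protein_ids = encode("MKT", protein_vocab, max_len=5)

print(smiles_ids)   # integer sequence for the ligand
print(protein_ids)  # integer sequence for the protein
```

The two integer sequences would then be passed through embedding layers and the convolutional encoders described above.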

Table 1
Table 1. Performance metrics (CI, MSE, and RMSE) of various models.

Figure 2
Flowchart depicting a machine learning process for drug-target interaction prediction. Data from GPCRdb and GLASS are encoded with SMILES and protein sequences. The data is tokenized, split, and fed into a neural network model. A graph shows training and validation loss over 50 epochs, with decreasing trends. Arrows indicate the sequential flow from dataset to prediction.

Figure 2. 1D binding affinity models utilize linear representations of molecules (e.g., SMILES) and proteins (amino acid sequences) processed by CNNs or recurrent layers to predict interactions without requiring structural data (Öztürk et al., 2018). The image above illustrates a 1-D Binding Affinity Prediction model.

One such model, DeepDTA (Öztürk et al., 2018), adopted this approach and, despite its simplicity, outperformed classical docking on the Davis and KIBA datasets; it has the potential to be adapted for GPCR-centric datasets. Its variant, WideDTA, expands on this by incorporating additional textual descriptors and interaction contexts (Öztürk et al., 2019); descriptions of recent 1D models are listed in Table 2. Barlow Twins, a self-supervised architecture, was introduced to learn embeddings from extensive unlabeled data, achieving strong performance on diverse drug–target interaction (DTI) sets while requiring fewer labeled examples (Schuh et al., 2025). Within GPCR drug discovery, multi-task sequence models such as AiGPro can classify agonism and antagonism across receptor subfamilies concurrently, which demonstrates how 1D encodings can capture pharmacological properties in addition to affinity (Brahma et al., 2025). These models are supported by trends in ligand discovery using ML-based algorithms (Jabeen and Ranganathan, 2019; Blanco-González et al., 2023; Lorente et al., 2025; Öztürk et al., 2018).

Table 2
Table 2. Comparison across 1D, 2D, and 3D models, showing the strengths and weaknesses of each model as well as their architecture and the metrics suited to each task. For the 1D models, the Concordance Index (CI) is used, which measures how well predicted binding affinities preserve the rank ordering of experimental values; CI > 0.85 is generally considered strong (Öztürk et al., 2018). For 2D models, Root Mean Squared Error (RMSE) is used; it is a standard regression metric, widely used on benchmarks like KIBA and Davis, in which lower values indicate more accurate predictions of continuous affinity readouts (Öztürk et al., 2018). 3D models incorporate spatial and conformational data, which could entail the fusing of ligand graphs, protein sequences, and 3D pocket descriptors (Cai et al., 2022).
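The two metrics named in the caption can be computed as in the sketch below. The CI here uses a common pairwise definition (pairs with equal experimental values are skipped, and tied predictions count as half-concordant); the affinity values are made up for illustration and do not come from Table 1 or Table 2.

```python
import math
from itertools import combinations

def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the experimental order."""
    concordant = 0.0
    comparable = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        if t1 == t2:
            continue  # pairs with equal experimental values carry no ranking information
        comparable += 1
        if p1 == p2:
            concordant += 0.5  # tied predictions count as half-correct
        elif (p1 > p2) == (t1 > t2):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def rmse(y_true, y_pred):
    """Root mean squared error between experimental and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Made-up log-scale affinities for four compounds.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.2, 5.9, 7.5, 7.1]

print(f"CI   = {concordance_index(y_true, y_pred):.3f}")
print(f"RMSE = {rmse(y_true, y_pred):.3f}")
```

In this toy case five of the six pairs are ranked correctly, so the ranking metric is high even though the last two predictions are noticeably off in absolute terms; this is one reason CI and RMSE are not interchangeable when comparing models.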

2D binding affinity models

2D models process spatial data represented in two dimensions, such as images, matrices, or other grid-like structures (Figure 3). These models, particularly within DL, are usually built on CNNs, which apply 2D filters to detect local patterns that extract features such as edges, textures, and shapes (Li et al., 2022). 2D models adapt chemical representation to graph structures in which atoms are nodes and bonds are edges, or to residue–residue contact maps for proteins. Graph neural networks (GNNs) can generate annotations along these edges, capturing the local topology and functional group context; this creates a balance between its ability to recognize the complex relationships within a chemical environment and computational effectiveness (Chen et al., 2016; El-Atawneh and Goldblum, 2024). Due to the efficiency of these models, millions of compounds can be screened before 3D docking or simulation while preserving chemical diversity (Sadybekov and Katritch, 2023; Chen et al., 2016).
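The graph view of a ligand can be sketched without any cheminformatics library: atoms become nodes with feature vectors, bonds become edges, and one round of message passing lets each atom absorb information from its bonded neighbours. The example below hand-codes the heavy atoms of ethanol and uses an unweighted sum as the update rule; both the features and the update rule are simplifying assumptions, since a trained GNN would apply learned weights and nonlinearities.

```python
# Heavy-atom graph of ethanol (C-C-O), written by hand for illustration.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# One-hot node features: [is_carbon, is_oxygen].
features = [[1, 0] if a == "C" else [0, 1] for a in atoms]

def neighbours(n_atoms, bond_list):
    """Build an adjacency list from an undirected bond list."""
    adj = {i: [] for i in range(n_atoms)}
    for i, j in bond_list:
        adj[i].append(j)
        adj[j].append(i)
    return adj

def message_passing_step(node_features, adj):
    """Each node's new feature is its own feature plus the sum of its neighbours' features."""
    updated = []
    for i, own in enumerate(node_features):
        total = list(own)
        for j in adj[i]:
            total = [a + b for a, b in zip(total, node_features[j])]
        updated.append(total)
    return updated

adjacency = neighbours(len(atoms), bonds)
updated_features = message_passing_step(features, adjacency)

# A simple sum readout turns node features into one molecule-level vector.
molecule_vector = [sum(col) for col in zip(*updated_features)]
print(updated_features, molecule_vector)
```

After one step the central carbon already "knows" it is bonded to both a carbon and an oxygen, which is the kind of local functional-group context the paragraph above refers to.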

Figure 3
Flowchart illustrating a machine learning process. It begins with data input, showing a ligand graph and target sequence. Data is featurized into 2D vectors for ligand and protein, which are fused and mapped for interaction. This feeds into a neural network model. A graph depicts training versus validation loss over epochs, indicating performance evaluation. The process concludes with a prediction step.

Figure 3. 2D binding affinity models represent molecules as graphs where atoms are nodes and bonds are edges, and apply GNNs to capture topological and substructural information (Liao et al., 2022). The image above illustrates a 2-D Binding Affinity Prediction model.

The model DEAttentionDTA (see Table 2) integrates dynamic embedding with self-attention layers to re-weight atom and residue contributions, significantly improving Ki prediction on BindingDB and demonstrating strong similarities to GPCR sets (Chen et al., 2024). GSAML-DTA combines a GNN encoder with self-attention mechanisms and mutual-information shrinkage, yielding interpretable attention maps that highlight substructures while maintaining competitive performance (Liao et al., 2022). 2D models are frequently used to sort millions of compounds before structure-based docking, significantly reducing the pool of molecules while retaining chemically diverse ones (Chen et al., 2016; Sadybekov and Katritch, 2023; Karimi et al., 2019).
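The re-weighting performed by a self-attention layer can be illustrated with a stripped-down version of scaled dot-product attention. In this sketch the queries, keys, and values are all the raw feature vectors themselves (no learned projections), which is an assumption made to keep the example short; it is not the architecture of DEAttentionDTA or GSAML-DTA.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Scaled dot-product self-attention with identity query/key/value projections."""
    dim = len(features[0])
    outputs, weights = [], []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in features]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all feature vectors.
        outputs.append([sum(w * v[i] for w, v in zip(row, features)) for i in range(dim)])
    return outputs, weights

# Three made-up atom (or residue) feature vectors.
atom_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention_weights = self_attention(atom_features)

for i, row in enumerate(attention_weights):
    print(f"atom {i} attends with weights {[round(w, 3) for w in row]}")
```

Each row of the weight matrix sums to one, so it can be read as how much every other atom or residue contributes to the updated representation; inspecting these rows is what makes attention maps usable for interpretation.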

3D binding affinity models

3D models (see Figure 4) leverage spatial structural information to capture complex molecular interactions. They introduce spatial coordinates and conformational ensembles, which directly model non-covalent interactions such as hydrogen bonding, π-stacking, and steric clashes. Techniques range from voxelized CNNs to SE(3)-equivariant GNNs that respect rotational symmetry. By encoding ligand–target contact patterns as interaction fingerprints, these methods support computational simulations of biological changes and data-driven drug repurposing (Wacker et al., 2017; El-Atawneh and Goldblum, 2024). By leveraging receptor dynamics, these models can distinguish between active and inactive GPCR conformational states, which would improve the quality of ligand design (Buyanov and Popov, 2024).
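A contact-based interaction fingerprint can be sketched directly from 3D coordinates: a pocket residue is marked as "in contact" if any ligand atom lies within a distance cutoff. The coordinates, residue labels, single representative point per residue, and the 4.0 Å cutoff below are all illustrative assumptions rather than values from a real complex.

```python
import math

# Made-up ligand atom coordinates in angstroms: (atom label, (x, y, z)).
ligand_atoms = [
    ("N1", (0.0, 0.0, 0.0)),
    ("C2", (1.5, 0.0, 0.0)),
]

# One representative point per pocket residue (a simplification; labels are hypothetical).
pocket_points = [
    ("ASP113", (3.0, 0.0, 0.0)),
    ("SER203", (0.0, 3.5, 0.0)),
    ("PHE290", (10.0, 10.0, 10.0)),
]

def interaction_fingerprint(ligand, pocket, cutoff=4.0):
    """Return one bit per pocket residue: 1 if any ligand atom is within the cutoff."""
    bits = []
    for _, residue_xyz in pocket:
        in_contact = any(math.dist(atom_xyz, residue_xyz) <= cutoff for _, atom_xyz in ligand)
        bits.append(1 if in_contact else 0)
    return bits

fingerprint = interaction_fingerprint(ligand_atoms, pocket_points)
print(dict(zip((name for name, _ in pocket_points), fingerprint)))
```

Because the fingerprint is indexed by pocket residue rather than by ligand atom, poses of different ligands in the same receptor state yield vectors of the same length that can be compared directly.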

Figure 4
Flowchart illustrating a process from data input to model prediction. It begins with 3D protein-ligand binding data input, proceeds to feature representation with pose generation and 3D feature construction, and ends with a model involving a neural network. The final stage shows a graph of training versus validation loss, indicating prediction capabilities.

Figure 4. 3D binding affinity models incorporate spatial and conformational information using atomic coordinates and pocket structures to simulate molecular interactions with high resolution (Cai et al., 2022). The image above illustrates a 3-D Binding Affinity Prediction model.

Studies in this field have resulted in DeepREAL (see Table 2), a model that employs a multi-scale framework fusing ligand graphs, receptor sequences, and coarse-grained 3D pocket descriptors. It is trained not only for accuracy on the training distribution but also to generalize to data drawn from different distributions, under settings such as scaffold splits with Distributionally Robust Optimization, and it effectively predicts GPCR activity for novel chemical profiles (Cai et al., 2022). ML classifiers of GPCR conformational states use structural descriptors to distinguish between inactive, active, and intermediate poses obtained from molecular dynamics (Buyanov and Popov, 2024). Additionally, recurrent neural networks have been used to forecast conformational transitions in molecular dynamics simulations (López-Correa et al., 2023). The CB1–Gi complex, a high-resolution cryo-electron microscopy (cryo-EM) structure of the cannabinoid receptor 1 (CB1) in complex with the Gi protein (Krishna Kumar et al., 2019), further enables transfer learning in which pre-trained 3D encoders are adapted using ligand-specific affinity labels, bridging experimental structural biology and computational predictions (Xia et al., 2021).

Discussion

The growing integration of ML and DL in drug discovery has given rise to several binding affinity prediction models, each offering a distinct perspective on GPCR-targeted research. These models report different metrics depending on the dimensionality used, as shown in Table 1; because of these differences, it is difficult to compare the models directly, which highlights the importance of shared data splits and consistent data standards for comparative analysis. 1D models such as DeepDTA and WideDTA use sequence-based representations that support high-throughput virtual screening (Öztürk et al., 2018; Öztürk et al., 2019). DEAttentionDTA and Barlow Twins models use attention mechanisms and self-supervised learning techniques to improve performance while reducing dependence on labeled data (Chen et al., 2024; Schuh et al., 2025). Models such as AiGPro include pharmacological properties such as receptor agonism and antagonism, which go beyond binding affinity alone (Brahma et al., 2025). However, 1D models are limited by their lack of spatial and structural information, which is crucial for modeling conformational dynamics and ligand-specific binding (Hauser et al., 2017; Wacker et al., 2017). To address this issue, 2D and 3D models introduce greater structural awareness; 2D models like GSAML-DTA and DEAttentionDTA utilize GNNs and self-attention mechanisms to capture local chemical context and identify key substructural features, which can improve functionality and affinity prediction (Liao et al., 2022; Chen et al., 2024). 3D models such as DeepREAL and the GPCR conformational classifier incorporate spatial coordinates and molecular dynamics conformations that allow accurate modeling of complex GPCR-ligand interactions and receptor activation (Cai et al., 2022; Buyanov and Popov, 2024).
Although these models need greater computational resources to operate proficiently, receiving high-quality structural input provides important insight into receptor signaling and potential effectiveness of compounds (Congreve et al., 2020; Krishna Kumar et al., 2019).

Improving data quality and standardization will be critical in overcoming limitations within this area of research. Inconsistent assay protocols, mixed affinity metrics (Kd, Ki, IC50), and benchmark biases can distort true generalization and inflate reported performance; adopting consistent data standards and transparent reporting of data workflows would support further advancement. Random splits can leak closely related scaffolds across training and test sets, overestimating performance; scaffold splits provide a stricter estimate of practical generalization to novel chemotypes (Yang et al., 2019). To ensure reliability within these pipelines, explainable AI should be incorporated during validation, and results should be reported under scaffold, time, or cluster splits so that explanations correspond to stable interactions rather than dataset effects (Ong et al., 2023; Davis et al., 2011). For GPCRs, resources that integrate sequence, structure, and function can support reliable cross-study comparisons; clarifying and extending the effective space can propel future innovation of these models. Evaluations should be conditioned on the pocket and receptor state instead of dataset-driven ones to assess the stability of the core framework and chemical changes. Useful databases include GPCRdb, which offers sequences and structures that can be integrated within docking or structure-based workflows, and GLASS, which provides curated GPCR–ligand pairs and receptor subtype labels that can be used for training, validation, and scaffold- or time-split tests (Munk et al., 2016; Chan et al., 2015; Hauser et al., 2017; Jabeen and Ranganathan, 2019; Congreve et al., 2020; Nguyen et al., 2024).
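The difference between a random split and a scaffold split comes down to splitting by group rather than by record. The sketch below assumes each ligand already carries a scaffold key (in practice this would be computed with a cheminformatics toolkit); the ligand names, scaffold labels, and the greedy largest-group-first assignment are illustrative choices, not a prescribed protocol.

```python
from collections import defaultdict

def scaffold_split(records, train_fraction=0.8):
    """Assign whole scaffold groups to train or test so no scaffold appears in both.

    records: list of (ligand_id, scaffold_key) pairs.
    Returns (train_indices, test_indices).
    """
    groups = defaultdict(list)
    for index, (_, scaffold) in enumerate(records):
        groups[scaffold].append(index)

    # Largest groups first; ties broken by first index so the result is deterministic.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    train, test = [], []
    limit = train_fraction * len(records)
    for group in ordered:
        if len(train) + len(group) <= limit:
            train.extend(group)
        else:
            test.extend(group)
    return train, test

# Hypothetical ligands with precomputed scaffold labels.
records = [
    ("lig1", "scafA"), ("lig2", "scafA"), ("lig3", "scafA"),
    ("lig4", "scafB"), ("lig5", "scafB"),
    ("lig6", "scafC"),
]
train_idx, test_idx = scaffold_split(records)

train_scaffolds = {records[i][1] for i in train_idx}
test_scaffolds = {records[i][1] for i in test_idx}
print(train_idx, test_idx, train_scaffolds & test_scaffolds)
```

The empty intersection in the last line is the property that matters: every test compound belongs to a chemotype the model never saw during training. A time split or cluster split follows the same pattern with a different grouping key.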

By aligning ESM sequence embeddings with pocket alignment features on GPCRdb structures and GLASS subtype labels, multimodal transfer learning can enhance out-of-distribution (OOD) robustness while maintaining clear explanations of the binding mechanism (Munk et al., 2016; Chan et al., 2015; Lin et al., 2023; Stärk et al., 2022; Corso et al., 2022). The difficulty of screening multi-billion-compound libraries suggests that ML-guided pre-screening will remain a practical path, since narrowing candidates before applying more advanced computational techniques is necessary to save time and resources. Therefore, pipelines that integrate 1D sequence models for initial filtering with 2D or 3D structural models for subsequent processing may offer the most comprehensive approach to drug discovery. In parallel, ESM-style protein language models contribute transfer-learned, self-supervised embeddings that improve GPCR tasks without labels, while equivariant and diffusion-based pose predictors such as EquiBind and DiffDock rapidly predict ligand poses, which can be combined with GPCRdb structures for multimodal training and rapid screening (Lin et al., 2023; Stärk et al., 2022; Corso et al., 2022). The most beneficial outcome for GPCR binding affinity prediction would be the success of generative AI; the ability to use models of all dimensionalities coherently to predict binding affinity accurately and rapidly could outperform current drug discovery methods. When combined with rigorous data standards and an explainable AI evaluation, self-supervised pretraining, along with generative diffusion, offers a credible pathway to high-throughput GPCR discovery.

Strengths and limitations

GPCRs pose substantial challenges for models that predict binding affinity. Receptors cycle through multiple conformational and signaling states, such as G-protein and β-arrestin pathways, giving rise to biased agonism; a single static structure cannot capture these features (Latorraca et al., 2017; Preininger et al., 2013). ML models often struggle to generalize to new GPCR subtypes when training data are limited or imbalanced. If receptor structural data (crystal structures or AlphaFold models) are not incorporated, ML models miss receptor-specific features, limiting their ability to distinguish closely related subtypes. Subtle sequence and pocket differences across closely related GPCRs further influence bias through pocket shape, water networks, and side-chain rotamers, underscoring the need for approaches that can integrate dynamics with the limitations of experimental workflows (Venkatakrishnan et al., 2013; Michino et al., 2025). The choice among Ki, Kd, and IC50 labels also matters: Ki and Kd are equilibrium affinity constants, whereas IC50 is a readout that depends on the experimental assay conditions; mixing them without careful normalization can introduce label bias (Gilson and Zhou, 2007; Kitchen et al., 2004). Thus, it is important to recognize the potential strengths and weaknesses of 1D, 2D, and 3D models for GPCR binding affinity prediction, as shown in Table 2.
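The assay dependence of IC50 can be made concrete with the Cheng–Prusoff relation, which for a competitive binding assay with a labelled probe ligand gives Ki = IC50 / (1 + [L]/Kd). The sketch below assumes simple competitive binding at a single site; the concentrations are invented to show two assays that report different IC50 values for the same underlying Ki, and the relation is offered as one common normalization, not as a step used by the reviewed models.

```python
def cheng_prusoff_ki(ic50_nm, probe_conc_nm, probe_kd_nm):
    """Estimate Ki (nM) from IC50 (nM) in a competitive binding assay.

    probe_conc_nm: concentration of the labelled probe ligand used in the assay.
    probe_kd_nm:   dissociation constant of that probe for the receptor.
    Assumes simple competitive binding at a single site.
    """
    if ic50_nm <= 0 or probe_kd_nm <= 0 or probe_conc_nm < 0:
        raise ValueError("concentrations must be positive (probe concentration may be zero)")
    return ic50_nm / (1.0 + probe_conc_nm / probe_kd_nm)

# Two hypothetical assays of the same compound, run with different probe concentrations.
ki_assay_a = cheng_prusoff_ki(ic50_nm=100.0, probe_conc_nm=1.0, probe_kd_nm=1.0)
ki_assay_b = cheng_prusoff_ki(ic50_nm=250.0, probe_conc_nm=4.0, probe_kd_nm=1.0)

print(f"assay A: IC50 = 100 nM -> Ki = {ki_assay_a:.1f} nM")
print(f"assay B: IC50 = 250 nM -> Ki = {ki_assay_b:.1f} nM")
```

Treating the two raw IC50 values as interchangeable labels would suggest a 2.5-fold potency difference that does not exist, which is the kind of label bias described above.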

1D models are best suited to rapid screening of very large libraries (Wang, 2024). However, a potential issue for 1D models is the influence of the SMILES representation itself. Although CNNs using 1D inputs have been shown to perform well under random splits, their performance collapses on unseen inhibitors, indicating that redundancy and leakage drive performance rather than learned interactions (Ong et al., 2023). When known SMILES are replaced with junk SMILES per inhibitor, accuracy remains unchanged and sometimes improves (Ong et al., 2023). This shows that these models mainly learn SMILES substrings as identifiers rather than structurally relevant features. It also exposes a core limitation of SMILES encodings: models can fail to recognize that two different strings describe the same molecule, which motivates improved structure-based representations (Ong et al., 2023).
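The non-uniqueness of SMILES is easy to demonstrate: "CCO" and "OCC" are both valid strings for ethanol, yet a character-level model sees two different inputs. The toy parser below handles only unbranched chains of single-letter atoms and compares an order-independent bond list; it is a deliberately limited illustration, and a real workflow would rely on a cheminformatics toolkit to canonicalize strings or build graphs.

```python
def linear_bonds(smiles):
    """Order-independent bond list for an unbranched chain of single-letter atoms.

    Toy parser: rejects branches, rings, charges, bond symbols, and two-letter
    elements such as Cl.
    """
    if not (smiles.isalpha() and smiles.isupper()):
        raise ValueError("toy parser only handles unbranched chains of single-letter atoms")
    return sorted(tuple(sorted(pair)) for pair in zip(smiles, smiles[1:]))

def char_ids(smiles):
    """Character-level encoding, as a 1D model would see the string."""
    return [ord(ch) for ch in smiles]

ethanol_a = "CCO"
ethanol_b = "OCC"

same_string = ethanol_a == ethanol_b
same_encoding = char_ids(ethanol_a) == char_ids(ethanol_b)
same_bonds = linear_bonds(ethanol_a) == linear_bonds(ethanol_b)

print(f"identical strings:   {same_string}")
print(f"identical encodings: {same_encoding}")
print(f"identical bond sets: {same_bonds}")
```

The string and its encoding differ while the connectivity is the same, so a model keyed to substrings can treat one molecule as two unrelated inputs.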

2D graph models are preferable for chemotype refinement when activity is driven by molecular topology (Liao et al., 2022). The trade-offs are that 2D encodings ignore stereochemistry and 3D orientations, which leads to dependence on approximations that can impact accuracy due to the sensitivity to input quality; this is especially true for the dynamic GPCR pockets (Wang, 2024; Congreve et al., 2020). Dataset biases could lead to inflated false-positive results; this emphasizes the need for bias-aware splits and validation (Ong et al., 2023).

3D models provide mechanistic insight tied to a specific active site and are preferred for subtype selectivity, allostery, and determining how a ligand sits within a receptor; they are most useful when reliable, high-quality structures exist or when optimizing for selectivity and allosteric effects (Gilson and Zhou, 2007; Spassov, 2024; Congreve et al., 2020; Kitchen et al., 2004; Cheng et al., 2012; Stärk et al., 2022; Corso et al., 2022). Despite rapid progress, 3D GPCR binding prediction needs more data to be effective; only a fraction of the ∼800 human GPCRs have experimentally determined structures, which limits model training, pocket generalization, and the ability to study less-known receptors (Michino et al., 2025).

Conclusion and future directions

Binding affinity prediction for GPCRs has progressed from fast but superficial 1D sequence models to structurally informed 2D graphs and fully 3D, structure-based models. Each methodology offers strengths that could support the others. 1D models enable rapid screening of large libraries, 2D models enhance chemical context and substructure awareness, and 3D models can effectively capture pocket geometry, receptor states, and substrate selectivity. Robust pipelines for GPCRs would benefit from combining these dimensional scales rather than choosing among them: 1D sequence-based models for initial screening, 2D graph and attention architectures for chemotype refinement, and 3D structure-based models that focus on specific pockets and receptor states. When supported by high-quality data curation, consistent affinity measurements, and processes that account for bias, multidimensional workflows can yield more realistic generalizations with trustworthy results. Multimodal learning that aligns protein language model embeddings, pocket geometry, receptor state, and readouts related to signaling bias can improve robustness on new data. Generative models that can propose ligands based on GPCR sequences, a set of different 3D pocket conformations, and desired biological profiles could bridge the gap between affinity prediction and de novo design. To make these systems suitable for clinical use, future work should prioritize GPCR benchmarks built on curated resources, utilizing scaffold and time-split evaluations, and incorporating explainable AI analyses that can link the model to relevant chemical and structural features. Ultimately, the most impactful GPCR discovery platforms will treat 1D, 2D, and 3D representations as complementary parts of the same system, where they can be integrated into reproducible workflows that support experimental design, explain failures, and accelerate the progression from virtual candidates to safe and effective drugs.
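The staged use of 1D, 2D, and 3D models can be sketched as a funnel in which each stage re-scores the survivors of the previous one and keeps a smaller shortlist. The scoring dictionaries below are placeholders standing in for trained models, and the candidate names and cut-off sizes are invented for illustration.

```python
def funnel(candidates, stages):
    """Apply (score_fn, keep) stages in order, keeping the top `keep` candidates each time."""
    survivors = list(candidates)
    for score_fn, keep in stages:
        survivors = sorted(survivors, key=score_fn, reverse=True)[:keep]
    return survivors

# Placeholder scores standing in for 1D, 2D, and 3D model predictions (higher is better).
scores_1d = {"lig_a": 0.9, "lig_b": 0.8, "lig_c": 0.7, "lig_d": 0.2, "lig_e": 0.6, "lig_f": 0.1}
scores_2d = {"lig_a": 0.3, "lig_b": 0.9, "lig_c": 0.8, "lig_d": 0.5, "lig_e": 0.7, "lig_f": 0.4}
scores_3d = {"lig_a": 0.6, "lig_b": 0.5, "lig_c": 0.95, "lig_d": 0.1, "lig_e": 0.2, "lig_f": 0.3}

stages = [
    (scores_1d.__getitem__, 4),  # fast sequence-based filter over the whole library
    (scores_2d.__getitem__, 2),  # graph-based refinement on the survivors
    (scores_3d.__getitem__, 1),  # costly structure-based scoring on the shortlist
]

shortlist = funnel(scores_1d.keys(), stages)
print(shortlist)
```

The design point is that the most expensive model only ever sees the handful of compounds that passed the cheaper stages, so the overall cost is dominated by the fast first filter.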

Author contributions

JS: Data curation, Methodology, Writing – original draft, Writing – review and editing. KK: Conceptualization, Supervision, Writing – original draft, Writing – review and editing, Data curation, Methodology.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the National Science Foundation Grant 2300475.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Blanco-González, A., Cabezón, A., Seco-González, A., Conde-Torres, D., Antelo-Riveiro, P., Piñeiro, Á., et al. (2023). The role of AI in drug discovery: challenges, opportunities, and strategies. Pharmaceuticals 16 (6), 891. doi:10.3390/ph16060891

Brahma, R., Moon, S., Shin, J., and Cho, K. (2025). AiGPro: a multi-task model for profiling GPCRs for agonists and antagonists. J. Cheminformatics 17 (1), 12. doi:10.1186/s13321-024-00945-7

Buyanov, I., and Popov, P. (2024). Characterizing conformational states in GPCR structures using machine learning. Sci. Rep. 14, 1098. doi:10.1038/s41598-023-47698-1

Cai, T., Abbu, K., Liu, Y., and Xie, L. (2022). DeepREAL: a deep learning powered multi-scale modeling framework for predicting out-of-distribution ligand-induced GPCR activity. Bioinformatics 38 (9), 2561–2570. doi:10.1093/bioinformatics/btac154

Chan, W., Zhang, H., Yang, J., Brender, J., Hur, J., Özgür, A., et al. (2015). GLASS: a comprehensive database for experimentally validated GPCR-Ligand associations. Bioinformatics 31 (18), 3035–3042. doi:10.1093/bioinformatics/btv302

Chen, P., Huang, J., and Gao, X. (2014). LigandRFs: random forest ensemble to identify ligand-binding residues from sequence information alone. BMC Bioinforma. 15 (S15), S4. doi:10.1186/1471-2105-15-S15-S4

Chen, X., Yan, C., Zhang, X., Zhang, X., Dai, F., Yin, J., et al. (2016). Drug–target interaction prediction: databases, web servers, and computational models. Briefings Bioinforma. 17 (4), 696–712. doi:10.1093/bib/bbv066

Chen, X., Huang, J., Shen, T., Zhang, H., Xu, L., Yang, M., et al. (2024). DEAttentionDTA: protein–ligand binding affinity prediction based on dynamic embedding and self-attention. Bioinformatics 40 (6), btae319. doi:10.1093/bioinformatics/btae319

Cheng, T., Li, Q., Zhou, Z., Wang, Y., and Bryant, S. (2012). Structure-based virtual screening for drug discovery: a problem-centric review. AAPS J. 14 (1), 133–141. doi:10.1208/s12248-012-9322-0

Christopoulos, A. (2014). Advances in G protein-coupled receptor allostery: from function to structure. Mol. Pharmacol. 86 (5), 463–478. doi:10.1124/mol.114.094342

Congreve, M., de Graaf, C., Swain, N., and Tate, C. (2020). Impact of GPCR structures on drug discovery. Cell 181 (1), 81–91. doi:10.1016/j.cell.2020.03.003

Corso, G., Stärk, H., Jing, B., Barzilay, R., and Jaakkola, T. (2022). DiffDock: diffusion steps, twists, and turns for molecular docking. arXiv Prepr. arXiv:2210.01776. doi:10.48550/arXiv.2210.01776

Davis, M., Hunt, J., Herrgard, S., Ciceri, P., Wodicka, L., Pallares, G., et al. (2011). Comprehensive analysis of kinase inhibitor selectivity. Nat. Biotechnol. 29 (11), 1046–1051. doi:10.1038/nbt.1990

El-Atawneh, S., and Goldblum, A. (2024). A machine learning algorithm suggests repurposing opportunities for targeting selected GPCRs. Int. J. Mol. Sci. 25 (18), 10230. doi:10.3390/ijms251810230

Gilson, M., and Zhou, H. (2007). Calculation of protein–ligand binding affinities. Annu. Rev. Biophysics Biomol. Struct. 36, 21–42. doi:10.1146/annurev.biophys.36.040306.132550

Hauser, A., Attwood, M., Rask-Andersen, M., Schiöth, H., and Gloriam, D. (2017). Trends in GPCR drug discovery: new agents, targets and indications. Nat. Rev. Drug Discov. 16 (12), 829–842. doi:10.1038/nrd.2017.178

Hoegen Dijkhof, L., Rönkkö, T., Von Vegesack, H., Lenzing, J., and Hauser, A. (2025). Deep learning in GPCR drug discovery: benchmarking the path to accurate peptide binding. Briefings Bioinforma. 26 (2), bbaf186. doi:10.1093/bib/bbaf186

Jabeen, A., and Ranganathan, S. (2019). Applications of machine learning in GPCR bioactive ligand discovery. Curr. Opin. Struct. Biol. 55, 66–76. doi:10.1016/j.sbi.2019.03.022

Karimi, M., Wu, D., Wang, Z., and Shen, Y. (2019). DeepAffinity: interpretable deep learning of compound–protein affinity through unified recurrent and convolutional neural networks. Bioinformatics 35 (18), 3329–3338. doi:10.1093/bioinformatics/btz111

Kiranyaz, S., Avci, O., Abdeljaber, O., Ince, T., Gabbouj, M., and Inman, D. (2021). 1D convolutional neural networks and applications: a survey. Mech. Syst. Signal Process. 151, 107398. doi:10.1016/j.ymssp.2020.107398

Kitchen, D., Decornez, H., Furr, J., and Bajorath, J. (2004). Docking and scoring in virtual screening for drug discovery: methods and applications. Nat. Rev. Drug Discov. 3 (11), 935–949. doi:10.1038/nrd1549

Krishna Kumar, K., Shalev-Benami, M., Robertson, M., Hu, H., Banister, S., Hollingsworth, S., et al. (2019). Structure of a signaling cannabinoid receptor 1–G protein complex. Cell 176 (3), 448–458.e12. doi:10.1016/j.cell.2018.11.040

Latorraca, N., Venkatakrishnan, A., and Dror, R. (2017). GPCR dynamics: structures in motion. Chem. Rev. 117 (1), 139–155. doi:10.1021/acs.chemrev.6b00177

LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proc. IEEE 86 (11), 2278–2324. doi:10.1109/5.726791

LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521 (7553), 436–444. doi:10.1038/nature14539

Li, Z., Liu, F., Yang, W., Peng, S., and Zhou, J. (2022). A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 33 (12), 6999–7019. doi:10.1109/TNNLS.2021.3084827

Liao, J., Chen, H., Wei, L., and Wei, L. (2022). GSAML-DTA: an interpretable drug–target binding affinity prediction model based on graph neural networks with a self-attention mechanism and mutual information. Comput. Biol. Med. 150, 106145. doi:10.1016/j.compbiomed.2022.106145

Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379 (6637), 1123–1130. doi:10.1126/science.ade2574

López-Correa, J., König, C., and Vellido, A. (2023). GPCR molecular dynamics forecasting using recurrent neural networks. Sci. Rep. 13 (1), 20995. doi:10.1038/s41598-023-48346-4

Lorente, J., Sokolov, A., Ferguson, G., Schiöth, H., Hauser, A., and Gloriam, D. (2025). GPCR drug discovery: new agents, targets, and indications. Nat. Rev. Drug Discov. 24 (6), 458–479. doi:10.1038/s41573-025-01139-y

Mahesh, B. (2020). Machine learning algorithms—A review. Int. J. Sci. Res. 9 (1), 381–386. doi:10.21275/ART20203995

Michino, M., Vendome, J., and Kufareva, I. (2025). AI meets physics in computational structure-based drug discovery for GPCRs. npj Drug Discov. 2, 16. doi:10.1038/s44386-025-00019-0

Munk, C., Isberg, V., Mordalski, S., Harpsøe, K., Rataj, K., Hauser, A., et al. (2016). GPCRdb: the G protein-coupled receptor database—An introduction. Br. J. Pharmacol. 173 (14), 2195–2207. doi:10.1111/bph.13509

Nguyen, A., Nguyen, D., Koh, H., Toskov, J., MacLean, W., Xu, A., et al. (2024). The application of artificial intelligence to accelerate G protein-coupled receptor drug discovery. Br. J. Pharmacol. 181 (14), 2371–2384. doi:10.1111/bph.16140

Ong, W., Kirubakaran, P., and Karanicolas, J. (2023). Poor generalization by current deep learning models for predicting binding affinities of kinase inhibitors. bioRxiv. doi:10.1101/2023.09.04.556234

Öztürk, H., Özgür, A., and Ozkirimli, E. (2018). DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 34 (17), i821–i829. doi:10.1093/bioinformatics/bty593

Öztürk, H., Ozkirimli, E., and Özgür, A. (2019). WideDTA: prediction of drug–target binding affinity. arXiv Prepr. arXiv:1902.04166. doi:10.48550/arXiv.1902.04166

Paul, D., Sanap, G., Shenoy, S., Kalyane, D., Kalia, K., and Tekade, R. (2021). Artificial intelligence in drug discovery and development. Drug Discov. Today 26 (1), 80–93. doi:10.1016/j.drudis.2020.10.010

Preininger, A., Meiler, J., and Hamm, H. (2013). Conformational flexibility and structural dynamics in GPCR-mediated G protein activation: a perspective. J. Mol. Biol. 425 (13), 2288–2298. doi:10.1016/j.jmb.2013.04.011

Sadybekov, A., and Katritch, V. (2023). Computational approaches streamlining drug discovery. Nature 616 (7958), 673–685. doi:10.1038/s41586-023-05905-z

Schuh, M., Boldini, D., Bohne, A., and Sieber, S. (2025). Barlow Twins’ deep neural network for advanced 1D drug–target interaction prediction. J. Cheminformatics 17 (1), 18. doi:10.1186/s13321-025-00952-2

Spassov, D. (2024). Binding affinity determination in drug design: insights from lock and key, induced fit, conformational selection, and inhibitor trapping models. Int. J. Mol. Sci. 25 (13), 7124. doi:10.3390/ijms25137124

Stärk, H., Ganea, O., Pattanaik, L., Barzilay, R., and Jaakkola, T. (2022). “EquiBind: geometric deep learning for drug binding structure prediction,” in Proceedings of the 39th international conference on machine learning (ICML) (PMLR), 20503–20521. doi:10.48550/arXiv.2202.05146

Venkatakrishnan, A., Deupi, X., Lebon, G., Tate, C., Schertler, G., and Babu, M. (2013). Molecular signatures of G-protein-coupled receptors. Nature 494 (7436), 185–194. doi:10.1038/nature11896

Wacker, D., Stevens, R., and Roth, B. (2017). How ligands illuminate GPCR molecular pharmacology. Cell 170 (3), 414–427. doi:10.1016/j.cell.2017.07.009

Wang, H. (2024). Prediction of protein–ligand binding affinity via deep learning models. Briefings Bioinforma. 25 (2), bbae081. doi:10.1093/bib/bbae081

Wang, Y., Guo, Y., Kuang, Q., Pu, X., Ji, Y., Zhang, Z., et al. (2015). A comparative study of family-specific protein–ligand complex affinity prediction based on a random forest approach. J. Computer-Aided Mol. Des. 29 (4), 349–360. doi:10.1007/s10822-014-9827-y

Xia, R., Wang, N., Xu, Z., Lu, Y., Song, J., Zhang, A., et al. (2021). Cryo-EM structure of the human histamine H1 receptor/Gq complex. Nat. Commun. 12 (1), 2086. doi:10.1038/s41467-021-22427-2

Yang, K., Swanson, K., Jin, W., Coley, C., Eiden, P., Gao, H., et al. (2019). Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59 (8), 3370–3388. doi:10.1021/acs.jcim.9b00237

Glossary

CI Concordance Index (measures how well the model's predictions preserve the rank ordering of the true values)

RMSE Root Mean Squared Error (measures the average magnitude of prediction errors, with larger errors penalized more heavily)

R² Coefficient of Determination (measures the share of variance in the target that the model explains)

AUC Area Under the ROC Curve (the probability that a randomly chosen positive is scored higher than a randomly chosen negative in binary classification)
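
As an illustration of the four glossary metrics, the following is a minimal Python sketch that computes each one directly from its definition. It is not taken from any of the reviewed models. The function names, the pairwise tie handling (prediction ties counted as 0.5), and the example affinity values are assumptions chosen for illustration only.

```python
import math


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are skipped; ties in the prediction count 0.5
    (an assumed convention for this sketch).
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # not comparable
            comparable += 1
            true_diff = y_true[i] - y_true[j]
            pred_diff = y_pred[i] - y_pred[j]
            if pred_diff == 0:
                concordant += 0.5
            elif (true_diff > 0) == (pred_diff > 0):
                concordant += 1.0
    return concordant / comparable


def rmse(y_true, y_pred):
    """Square root of the mean squared prediction error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """One minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical affinities on a log scale (for example pKd), for illustration only.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.8, 7.4, 7.1]

# Hypothetical binder (1) / non-binder (0) labels with classifier scores.
labels = [0, 0, 1, 1]
scores = [0.2, 0.6, 0.5, 0.9]

print("CI  :", concordance_index(measured, predicted))
print("RMSE:", rmse(measured, predicted))
print("R2  :", r_squared(measured, predicted))
print("AUC :", auc(labels, scores))
```

The pairwise loops are quadratic in the number of samples, which is acceptable for a small illustration; evaluation on large benchmark sets would normally rely on an established library implementation.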

Keywords: drug–target binding affinity, G protein–coupled receptors (GPCRs), binding affinity prediction, machine learning (ML), deep learning (DL)

Citation: Stephenson J and Karnati KR (2026) Recent trends in machine learning and deep learning-based prediction of G-protein coupled receptor-ligand binding affinities. Front. Bioinform. 5:1712577. doi: 10.3389/fbinf.2025.1712577

Received: 24 September 2025; Accepted: 29 December 2025;
Published: 12 January 2026.

Edited by:

Garrett M. Morris, University of Oxford, United Kingdom

Reviewed by:

Karim Abbasi, Sharif University of Technology, Iran
Saleem Y. Bhat, University of Pennsylvania, United States

Copyright © 2026 Stephenson and Karnati. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Konda Reddy Karnati, kkarnati@bowiestate.edu

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.