EDITORIAL article
Front. Bioinform.
Sec. Protein Bioinformatics
This article is part of the Research Topic: Computational protein function prediction based on sequence and/or structural data
Editorial: Computational Protein Function Prediction Based on Sequence and/or Structural Data
Provisionally accepted
University of Oxford, Oxford, United Kingdom
In recent years, the integration of computational and machine learning techniques has dramatically advanced our understanding of protein function and engineering. Because proteins underpin virtually every biological process, accurate prediction of their functions, whether from sequence or structural data, is critical to unraveling their roles in health, disease, and therapeutics. This editorial explores how current computational strategies are shaping the future of protein science, as illustrated by the contributions in this Research Topic. By drawing on both evolutionary data and technologies such as deep learning and molecular dynamics simulations, these studies open new pathways for drug development, vaccine design, and pathogen-interaction research.
At the heart of protein function prediction is the challenging task of deciphering the link between protein sequences, their three-dimensional structures, and the biological functions they support. While protein sequences provide a genetic blueprint, their functional capabilities are often embedded in the spatial arrangement of amino acids and the resulting folding patterns. Recent breakthroughs in computational biology have allowed researchers to use sequence data and structural features to predict protein functions with unprecedented accuracy [1,2].
Machine learning and deep learning approaches, such as DeepPredict [3] and DCMA [4], are reshaping the landscape of protein function prediction. These algorithms leverage large datasets to identify subtle patterns in sequence-structure-function relationships, enabling the prediction of secondary structures, solvent accessibility, and backbone dihedral angles with improved precision. Importantly, these methods not only enhance prediction accuracy but also reduce computational demands, enabling large-scale analyses that were previously infeasible.
Several studies in this collection highlight the utility of such AI-based tools for practical biomedical applications, from antigen design to drug discovery.
One of the most compelling uses of these predictive methods is in vaccine development. For instance, AI-guided approaches have revolutionized the design of broadly protective antigens by predicting how viral proteins interact with immune receptors [5,6]. These insights are not just theoretical: they apply directly to real-world challenges such as the development of vaccines against evolving pathogens. This Research Topic includes contributions that show how computational tools can accelerate vaccine design by identifying critical epitopes and optimizing immune response profiles. In particular, the studies focusing on viral infections, including the investigation of interactions between SARS-CoV-2 and cytokines and the engineering of antiviral peptides, underscore the role of computational methods in addressing urgent global health threats. One such example is the design of cross-reactive antigens using machine learning, in which computational models were used to predict the properties of factor H binding protein (fHbp) and to design mutants that could potentially be used in next-generation vaccine platforms [6].
Beyond antigen design, protein function prediction also plays a pivotal role in drug discovery. The ability to predict protein-ligand and protein-protein binding sites has far-reaching implications for designing therapeutic interventions. Many contributors to this Research Topic explore novel techniques for predicting these binding sites, with a particular emphasis on the challenges of pathogen-host interactions. By studying protein-protein interactions in pathogens such as Enterobacter cloacae [5], or host-virus interactions such as those between the human immune system and SARS-CoV-2, these studies are advancing our understanding of the molecular underpinnings of disease.
By accurately predicting how proteins interact, researchers can identify new drug targets and design more effective treatments. This Research Topic also highlights the growing importance of high-throughput techniques and large-scale molecular simulations in structural biology. Phage immunoprecipitation sequencing (PhIP-Seq) [7], which enables comprehensive antibody profiling, and large-scale molecular docking, which uncovers complex host-virus interactions, are helping to unravel the intricate web of protein interactions that govern biological processes and are aiding the discovery of novel therapeutic strategies. These approaches are essential for mapping the full spectrum of pathogen-host dynamics and for designing strategies that can mitigate the impact of infectious diseases. Additionally, a computational framework for virtual screening of cytokine and coronavirus nucleocapsid protein interactivity has accelerated the discovery of effective interventions against viral threats, showcasing the complementary nature of deep learning-based models and semi-physicochemical methods [8].
Underlying all these efforts is an appreciation of the evolutionary history that shapes protein function. Understanding how proteins evolve, and how evolutionary pressures shape their structure and function, is a key aspect of protein engineering. In this context, evolutionary algorithms and molecular dynamics simulations have emerged as invaluable tools. By simulating how proteins have adapted over time and predicting how they might evolve in the future, these methods provide critical insights into how proteins with desired functions can be designed. Whether the goal is to develop enzymes with novel catalytic properties or to engineer proteins with enhanced binding affinities, evolutionary data are integral to rational protein design.
The study of Nipah virus (NiV) and its interaction with host receptors highlights the importance of understanding viral evolution and using computational approaches to develop novel antiviral peptides for therapeutic applications [9].
The diversity of studies presented here emphasizes the breadth of protein function prediction and engineering, ranging from the design of therapeutic antibodies to the prediction of protein-protein interactions and the identification of antiviral peptides. Collectively, these contributions highlight the transformative power of computational methods in modern biomedical research. They also reinforce the growing recognition that understanding protein function requires an integrated approach that combines sequence data, structural insights, and evolutionary analysis.
Looking ahead, the convergence of deep learning, molecular dynamics, and high-throughput techniques promises to further enhance our ability to predict and engineer protein functions with greater accuracy and efficiency. As these computational strategies continue to evolve, they will likely open new frontiers in protein science, providing tools for tackling complex challenges in drug discovery, vaccine development, and beyond.
In conclusion, this collection of articles provides a timely and comprehensive overview of the current state of computational protein function prediction, showcasing how innovative technologies are driving progress in both fundamental and applied protein research. The future of protein engineering lies in our ability to leverage these advances to design proteins with specific functions, offering exciting possibilities for the development of novel therapeutics and biotechnologies.
Keywords: protein function prediction, machine learning, deep learning, protein sequence, protein structure
Received: 14 Sep 2025; Accepted: 24 Oct 2025.
Copyright: © 2025 Jang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Yaan J. Jang, yaan.jang@gmail.com
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.