Your new experience awaits. Try the new design now and help us make it even better

ORIGINAL RESEARCH article

Front. Bioinform.

Sec. Integrative Bioinformatics

Volume 5 - 2025 | doi: 10.3389/fbinf.2025.1687687

Multimodal Knowledge Expansion Widget Powered by Plant Protein Phosphorylation Database and ChatGPT

Provisionally accepted
Chunhui  XuChunhui Xu1Yang  YuYang Yu1Gorvardhan  KhadakkarGorvardhan Khadakkar1Jiacheng  XieJiacheng Xie1Dong  XuDong Xu1*Qiuming  YaoQiuming Yao2*
  • 1University of Missouri, Columbia, United States
  • 2University of Nebraska-Lincoln, Lincoln, United States

The final, formatted version of the article will be published soon.

Biological databases are essential for providing curated knowledge, but their rigid data structures and restrictive query formats often limit flexible and exploratory user interactions. In the field of plant phosphorylation, manually curated and reviewed data represent only a small portion of the available knowledge, and users often seek information that goes beyond what is provided in structured databases. While large language models (LLMs) like ChatGPT-4o possess extensive contextual knowledge, integrating this capability into bioinformatics tools remains an open challenge. Here, we present a multimodal question-answering widget that integrates ChatGPT-4o with our Plant Protein Phosphorylation Database (P3DB). This system supports natural language queries and dynamic prompt formulation, enabling users to explore phosphorylation events, kinase-substrate relationships, and protein-protein interactions through a global entry. In another application, the widget leverages ChatGPT's image interpretation functionality to extract regulatory pathways and phosphorylation markers from complex scientific figures. To build this widget effectively, we have explored multiple prompt strategies, including one-step, two-step, few-shot, and image-cropping techniques, demonstrating their impact on output accuracy and consistency. In addition, recent multimodal LLMs such as ChatGPT-5 and Gemini 1.5 have demonstrated comparable capabilities and adaptability when applied to our test cases and the developed widgets. Together, our application widget and results highlight the development of the ChatGPT-P3DB integration as a system that enhances user This is a provisional file, not the final typeset article accessibility, enables visual extraction, and extends the current utility of biological knowledgebases through a flexible and adaptive framework. Our "ChatGPT-P3DB" is open-source and can be accessed on GitHub (https://github.com/yao-laboratory/p3db-chat). The frontend interface, "P3DB askAI" web module, can be accessed freely through https://www.p3db.org/ask-ai.

Keywords: multimodality1, large language mode2, plant protein phosphorylation3, information retrieva4, pathway identification5

Received: 18 Aug 2025; Accepted: 02 Oct 2025.

Copyright: © 2025 Xu, Yu, Khadakkar, Xie, Xu and Yao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Dong Xu, xudong@missouri.edu
Qiuming Yao, qyao3@unl.edu

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.