MINI REVIEW article
Front. Plant Sci.
Sec. Functional and Applied Plant Genomics
Volume 16 - 2025 | doi: 10.3389/fpls.2025.1611992
This article is part of the Research TopicMachine Learning for Mining Plant Functional GenesView all 3 articles
Foundation models in plant molecular biology: Advances, challenges, and future directions
Provisionally accepted- China Agricultural University, Beijing, China
Select one of your emails
You have multiple emails registered with Frontiers:
Notify me on publication
Please enter your email address:
If you already have an account, please login
You don't have a Frontiers account ? You can register here
A foundation model (FM) is a neural network trained on large-scale data using unsupervised or self-supervised learning, capable of adapting to a wide range of downstream tasks. This review provides a comprehensive overview of FMs in plant molecular biology, emphasizing recent advances and future directions. It begins by tracing the evolution of biological FMs across the DNA, RNA, protein, and single-cell levels, from tools inspired by natural language processing (NLP) to transformative models for decoding complex biological sequences. The review then focuses on plantspecific FMs such as GPN, AgroNT, PDLLMs, PlantCaduceus, and PlantRNA-FM, which address challenges that are widespread among plant genomes, including polyploidy, high repetitive sequence content, and environment-responsive regulatory elements, alongside universal FMs like GENERator and Evo 2, which leverage extensive cross-species training data for sequence design and prediction of mutation effects. Key opportunities and challenges in plant molecular biology FM development are further outlined, such as data heterogeneity, biologically informed architectures, cross-species generalization, and computational efficiency. Future research should prioritize improvements in model generalization, multi-modal data integration, and computational optimization to overcome existing limitations and unlock the potential of FMs in plant science. This review serves as an essential resource for plant molecular biologists and offers a clear snapshot of the current state and future potential of FMs in the field.
Keywords: Foundation model, plant, Molecular Biology, Large Language Model, transformer
Received: 15 Apr 2025; Accepted: 16 May 2025.
Copyright: © 2025 Xu, Wu, Cheng, Wang and Yan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Jun Yan, China Agricultural University, Beijing, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.