ORIGINAL RESEARCH article
Front. Plant Sci.
Sec. Functional and Applied Plant Genomics
Volume 16 - 2025 | doi: 10.3389/fpls.2025.1659345
This article is part of the Research Topic: Machine Learning for Mining Plant Functional Genes
Uncovering dormancy stage predictors in Sweet Cherry through DNA methylation and machine learning integration
Provisionally accepted
- 1Centro de Genómica y Bioinformática, Universidad Mayor, Santiago, Chile
- 2Programa de Doctorado en Genómica Integrativa, Vicerrectoría de Investigación, Universidad Mayor, Huechuraba, Santiago 8580745, Chile
- 3Escuela de Agronomía, Facultad de Ciencias, Ingeniería y Tecnología, Universidad Mayor, Huechuraba, Santiago 8580745, Chile
Dormancy in sweet cherry (Prunus avium L.) is a complex physiological process that allows floral buds to survive adverse winter conditions and resume growth under favorable spring conditions. Traditional phenological evaluations and agroclimatic models, although widely used, show limited resolution and robustness across years and cultivars. Epigenetic mechanisms, particularly DNA methylation, have emerged as critical regulators of dormancy transitions. However, the integration of methylation data with machine learning (ML) tools for predictive modeling remains largely unexplored in perennial species. This study presents an integrative framework that combines whole-genome bisulfite sequencing and supervised ML to identify cytosine-level and region-level methylation markers associated with specific dormancy stages in sweet cherry. DNA methylation datasets from three different experiments were classified using Random Forest (RF) and eXtreme Gradient Boosting (XGBoost), complemented by SHapley Additive exPlanations (SHAP) for interpretability. Feature importance was evaluated using an integrated model consensus across the RF, XGBoost, and SHAP metrics. Feature selection significantly improved classification performance in both the three-stage (paradormancy, endodormancy, ecodormancy) and two-stage (endodormancy, ecodormancy) models. RF consistently outperformed XGBoost, achieving an accuracy of up to 97.1% in the two-stage scenario using informative cytosine-level data. The SHAP analyses showed that the selected features effectively discriminated among dormancy stages and revealed biologically significant epigenetic features. The key features were non-randomly distributed throughout the genome, often colocalizing with long terminal repeat (LTR) transposable elements, particularly the LTR/Ty3-retrotransposons and LTR/Copia families.
Some features also colocalized with previously identified QTLs for chilling and heat requirements, flowering time, and maturity date. This study highlights the usefulness of combining high-resolution methylation data with interpretable ML techniques to identify robust dormancy biomarkers. The enrichment of dormancy-associated features within transposable elements (TEs) and gene-proximal regions suggests epigenetic regulation through TE-mediated chromatin remodeling. These findings contribute to a deeper understanding of dormancy mechanisms and offer a basis for developing non-destructive, methylation-based tools to improve phenological management in perennial fruit crops.
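The abstract does not specify how the consensus across RF, XGBoost, and SHAP importance metrics is computed. As a minimal sketch of one plausible approach, the Python snippet below assumes a simple mean-rank aggregation: each metric ranks the features, and the features with the lowest average rank are kept. The function names (`rank_features`, `consensus_top_k`), the cytosine identifiers, and all importance values are hypothetical and purely illustrative; they are not taken from the article.

```python
from statistics import mean


def rank_features(scores):
    """Map each feature to its rank (1 = most important) for one metric."""
    ordered = sorted(scores, key=lambda f: scores[f], reverse=True)
    return {feature: i + 1 for i, feature in enumerate(ordered)}


def consensus_top_k(metric_scores, k):
    """Return the k features with the lowest mean rank across all metrics.

    Assumption: consensus = mean-rank aggregation. The article's actual
    consensus procedure is not described in the abstract.
    """
    ranks = [rank_features(s) for s in metric_scores.values()]
    # Only features scored by every metric can be compared fairly.
    features = set.intersection(*(set(r) for r in ranks))
    mean_rank = {f: mean(r[f] for r in ranks) for f in features}
    # Break ties alphabetically so the result is deterministic.
    return sorted(features, key=lambda f: (mean_rank[f], f))[:k]


# Hypothetical importance scores per cytosine (illustrative values only).
importances = {
    "rf":      {"chr1:1050": 0.40, "chr2:2210": 0.35, "chr3:87": 0.15, "chr4:900": 0.10},
    "xgboost": {"chr1:1050": 0.30, "chr2:2210": 0.45, "chr3:87": 0.05, "chr4:900": 0.20},
    "shap":    {"chr1:1050": 0.50, "chr2:2210": 0.25, "chr3:87": 0.10, "chr4:900": 0.15},
}

selected = consensus_top_k(importances, k=2)
print(selected)
```

In this toy input, the two cytosines ranked first or second by every metric are retained. Rank aggregation is used here only because it avoids comparing importance scales that differ between metrics; other consensus rules (for example, intersecting per-metric top-k lists) would be equally reasonable readings of the abstract.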
Keywords: stage predictive model, epigenetics, biomarkers, Prunus avium, feature selection, bud break
Received: 04 Jul 2025; Accepted: 11 Aug 2025.
Copyright: © 2025 Saavedra, Povea, Urra, Gaete-Loyola, Maldonado and Almeida. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Carlos Ernesto Maldonado, Centro de Genómica y Bioinformática, Universidad Mayor, Santiago, Chile
Andrea Miyasaka Almeida, Centro de Genómica y Bioinformática, Universidad Mayor, Santiago, Chile
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.