
ORIGINAL RESEARCH article

Front. Microbiol.

Sec. Aquatic Microbiology

This article is part of the Research Topic: Microalgae-Microbe Interactions: Advances and Applications

Chlamy_ChloroPred: A deep learning-based, highly accurate binary classifier for chloroplast protein prediction in the model microalga, Chlamydomonas reinhardtii, with potential cross-proteome versatility

Provisionally accepted
  • 1 Korea Research Institute of Bioscience & Biotechnology, Yuseong-gu, Republic of Korea
  • 2 University of Science & Technology, Daejeon, Republic of Korea
  • 3 Daemyung Vision Co., Ltd., Yongin-si, Republic of Korea
  • 4 Sungkyunkwan University - Natural Sciences Campus, Suwon-si, Republic of Korea

The final, formatted version of the article will be published soon.

The chloroplast, a living relic of an ancient endosymbiotic interaction between a microalga and a microbe and the principal subcellular organelle responsible for biological CO2 assimilation, is emerging as a key target for research aimed at enhancing photosynthetic efficiency beyond its current limits. Because accurate protein localization is a prerequisite for both in-depth investigation and practical application of this membrane-compartmentalized photosynthetic organelle, numerous computational prediction tools have been proposed, yet their accuracy remains unsatisfactory. To address this limitation, we present Chlamy_ChloroPred, a deep learning-based framework composed of multi-layered artificial neural networks and designed for binary classification of chloroplast proteins in the model photosynthetic microorganism Chlamydomonas reinhardtii. The model leverages locality awareness of determinant amino acid residues within the chloroplast transit peptide (cTP), which typically occupies approximately 50 amino acids at the N-terminus, upstream of the mature chloroplast protein. This awareness is provided by the combination of ProtBERT-BFD embeddings, stacked bidirectional long short-term memory (BiLSTM) networks, and an attentive pooling layer. Chlamy_ChloroPred achieved an accuracy of 0.8462 on the C. reinhardtii proteome, outperforming widely used localization predictors, including TargetP 1.1 (0.4970), TargetP 2.0 (0.7396), and PredAlgo (0.7738), under a binary classification scheme. Comparative analyses further showed that its performance is competitive with that of the current state-of-the-art model, PB-Chlamy (0.8521), under identical evaluation conditions.
Notably, despite being trained solely on the algal proteome, Chlamy_ChloroPred showed substantial cross-species versatility when applied to the proteome of the terrestrial plant Arabidopsis thaliana, achieving an accuracy of 0.7316, a 12.6% improvement over TargetP 2.0, a predictor with previously demonstrated cross-proteome versatility. This likely stems from the model's ability to capture features of chloroplast proteins that are conserved across proteomes from diverse photosynthetic lineages. Overall, we believe that Chlamy_ChloroPred represents a compelling alternative to existing predictors, especially when accurate inference of chloroplast proteins is required.
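
The abstract names an attentive pooling layer but does not give its formulation. As a rough illustration only, the following sketch assumes the simplest common variant: each per-residue vector is scored by a dot product with a query vector, the scores are normalized with a softmax, and the weighted sum becomes the fixed-length protein representation. The names `attentive_pool`, `toy_states`, and `toy_query` are hypothetical; in a trained model the per-residue vectors would come from the BiLSTM stack over ProtBERT-BFD embeddings and the query would be learned, neither of which is reproduced here.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(
    residue_vectors: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Collapse L per-residue vectors of size D into one D-sized vector.

    Assumption: dot-product scoring against a single query vector.
    Returns the pooled vector and the per-residue attention weights.
    """
    # One scalar relevance score per residue position.
    scores = [sum(h * q for h, q in zip(vec, query)) for vec in residue_vectors]
    weights = softmax(scores)
    # Weighted sum over positions, computed separately for each feature.
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, residue_vectors))
        for d in range(len(query))
    ]
    return pooled, weights


# Toy input: 4 residue positions, 2 features each (values are made up).
toy_states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2], [0.1, 0.1]]
toy_query = [1.0, 1.0]  # hypothetical stand-in for a learned query

pooled, weights = attentive_pool(toy_states, toy_query)
print("weights:", [round(w, 3) for w in weights])
print("pooled:", [round(p, 3) for p in pooled])
```

Because the weights are position-specific, a layer of this kind can concentrate on a short stretch of residues, such as an N-terminal transit peptide, instead of averaging uniformly over the whole sequence. With an all-zero query the weights become uniform and the result reduces to mean pooling.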

Keywords: Chlamydomonas reinhardtii, chloroplast protein, deep learning, microalgae, neural network, protein localization prediction

Received: 12 Nov 2025; Accepted: 04 Feb 2026.

Copyright: © 2026 Choi, Lee, Lee, Lee, Yun, Choi, Cho, Shin, Chun, Lee and Kim. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Hong Il Choi
Hee-Sik Kim

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.