AUTHOR=Li Ying, Zhao Jianing, Liu Zhaoqian, Wang Cankun, Wei Lizheng, Han Siyu, Du Wei
TITLE=De novo Prediction of Moonlighting Proteins Using Multimodal Deep Ensemble Learning
JOURNAL=Frontiers in Genetics
VOLUME=Volume 12 - 2021
YEAR=2021
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2021.630379
DOI=10.3389/fgene.2021.630379
ISSN=1664-8021
ABSTRACT=Moonlighting proteins (MPs) are a special type of protein with multiple independent functions. MPs play vital roles in cellular regulation, diseases, and biological pathways. At present, very few MPs have been discovered by biological experiments. Owing to the scarcity of data samples, computation-based methods for MP identification are limited, and there is currently no de novo prediction method for MPs. Comprehensive research into and identification of MPs are therefore urgently required. In this paper, we propose a multimodal deep ensemble learning architecture, named MEL-MP, which is the first de novo computational model for predicting MPs. First, we extract four sequence-based features: primary protein sequence information, evolutionary information, physical and chemical properties, and secondary protein structure information. Second, we construct a specific classifier for each kind of feature. Finally, we apply a stacked ensemble to integrate the outputs of the classifiers. Comprehensive model selection and cross-validation experiments show that specific classifiers for specific feature types can achieve superior performance. To validate the effectiveness of the fusion-based stacked ensemble, different feature fusion strategies, including direct combination and a multimodal deep auto-encoder, are used for comparison. MEL-MP exhibits superior prediction performance (F-score = 0.8911), surpassing the existing machine learning model, MPFit (F-score = 0.784). In addition, MEL-MP is used to predict potential MPs among all human proteins.
We further explore the predicted MPs from three perspectives: their distribution on human chromosomes, their association with diseases, and their evolutionary history. The results reveal that the predicted MPs are significantly related to diseases, that the proportion of MPs on the Y chromosome is higher than on other chromosomes, and that MPs may have originated earlier than other proteins. Finally, for convenience, a user-friendly web server is available at http://ml.csbg-jlu.site/mel-mp/.
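The three-step pipeline described in the abstract (one feature set per modality, one classifier per feature set, a stacked ensemble over the classifier outputs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the MEL-MP implementation: the data are synthetic, the feature dimensions are arbitrary, the per-modality base model is a simple nearest-centroid scorer rather than the classifiers selected in the paper, and the meta-learner is a plain logistic regression. Only the four modality names and the use of the F-score come from the abstract.

```python
import math
import random

# Modality names follow the abstract; the dimensions are hypothetical.
MODALITIES = {"sequence": 5, "evolution": 4, "physicochemical": 3, "secondary_structure": 3}

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fit_centroid_model(X, y):
    """Hypothetical base classifier: score is high when x is nearer the positive centroid."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    c_pos = [sum(col) / len(pos) for col in zip(*pos)]
    c_neg = [sum(col) / len(neg) for col in zip(*neg)]
    return lambda x: sigmoid(math.dist(x, c_neg) - math.dist(x, c_pos))

def out_of_fold_scores(X, y, k=3):
    """Score each sample with a base model that never saw it during fitting."""
    scores = [0.0] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        model = fit_centroid_model([X[i] for i in train], [y[i] for i in train])
        for i in range(fold, len(X), k):
            scores[i] = model(X[i])
    return scores

def fit_meta_learner(Z, y, lr=0.5, epochs=300):
    """Logistic regression on stacked base scores, trained by batch gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        errs = [sigmoid(sum(wj * zj for wj, zj in zip(w, z)) + b) - t for z, t in zip(Z, y)]
        w = [wj - lr * sum(e * z[j] for e, z in zip(errs, Z)) / len(Z) for j, wj in enumerate(w)]
        b -= lr * sum(errs) / len(Z)
    return w, b

def f_score(y_true, y_pred):
    """F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Synthetic data: 90 "proteins", alternating labels, one feature matrix per modality.
rng = random.Random(0)
labels = [i % 2 for i in range(90)]
data = {
    name: [[rng.gauss(1.5 if t else -1.5, 1.0) for _ in range(dim)] for t in labels]
    for name, dim in MODALITIES.items()
}
train, test = range(60), range(60, 90)
y_train = [labels[i] for i in train]

# Level 0: out-of-fold scores from one base model per modality.
oof = {name: out_of_fold_scores([X[i] for i in train], y_train) for name, X in data.items()}
Z_train = [list(row) for row in zip(*oof.values())]

# Level 1: the meta-learner integrates the per-modality scores.
w, b = fit_meta_learner(Z_train, y_train)

# Held-out evaluation: refit base models on all training data, then stack.
base = {name: fit_centroid_model([X[i] for i in train], y_train) for name, X in data.items()}
Z_test = [[base[name](data[name][i]) for name in MODALITIES] for i in test]
y_pred = [1 if sigmoid(sum(wj * zj for wj, zj in zip(w, z)) + b) >= 0.5 else 0 for z in Z_test]
ensemble_f = f_score([labels[i] for i in test], y_pred)
print(f"stacked-ensemble F-score on synthetic data: {ensemble_f:.4f}")
```

Training the meta-learner on out-of-fold scores keeps it from seeing base-model outputs on samples those models were fitted on, which is the usual motivation for stacking over simple score averaging. The printed F-score reflects only the synthetic data and has no bearing on the values reported for MEL-MP or MPFit.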