AUTHOR=Wang Yili, Liu Yuanning, Wang Shuo, Liu Zhen, Gao Yubing, Zhang Hao, Dong Liyan TITLE=ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism JOURNAL=Frontiers in Genetics VOLUME=Volume 11 - 2020 YEAR=2020 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2020.612086 DOI=10.3389/fgene.2020.612086 ISSN=1664-8021 ABSTRACT=Accurate RNA secondary structure information is the cornerstone of gene function research and RNA tertiary structure prediction. However, most traditional RNA secondary structure prediction algorithms rely on dynamic programming (DP) under minimum free energy theory, with both hard and soft constraints. Their accuracy depends heavily on the quality of the soft constraints (derived from experimental data such as chemical and enzymatic probing). As RNA sequences lengthen, the time complexity of DP-based algorithms grows rapidly; as a result, they cope poorly with relatively long sequences. Meanwhile, because pseudoknot structures are complex, methods based on traditional algorithms cannot reliably predict secondary structures containing pseudoknots. Consequently, few algorithms have been available for pseudoknot prediction. The ATTfold algorithm proposed in this article was a deep learning algorithm based on an attention mechanism. It analyzed the global information of the RNA sequence through the attention mechanism, focused on the correlation between paired bases, and addressed the problem of long-sequence prediction. Meanwhile, the algorithm extracted effective multi-dimensional features from a large quantity of RNA sequence and structure information and combined them with the hard constraints specific to RNA secondary structure.
Hence, it accurately determined the pairing position of each base and obtained a realistic and valid RNA secondary structure, including pseudoknots. Finally, after the ATTfold model was trained on tens of thousands of RNA sequences and their real secondary structures, it was compared with four classic RNA secondary structure prediction algorithms. The results showed that our algorithm significantly outperformed the others and reproduced the secondary structure of RNA more faithfully. As the data in RNA sequence databases grow, our deep learning-based algorithm is expected to perform even better. In the future, this kind of algorithm will become increasingly indispensable.
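The abstract mentions hard constraints on RNA secondary structure without listing them. As a minimal sketch only, the Python below encodes the constraints commonly assumed in this field: canonical and wobble pairs (A-U, G-C, G-U), at least three unpaired bases enclosed by any pair, and at most one partner per base. The names pairing_mask, is_valid_structure, and MIN_LOOP are hypothetical and are not taken from the ATTfold paper; the exact constraint set the authors used may differ.

```python
# Assumed constraint set; not confirmed by the abstract.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),  # wobble pair
}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def pairing_mask(seq):
    """Return an n x n 0/1 matrix with 1 where bases i and j may pair."""
    seq = seq.upper()
    n = len(seq)
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        # j starts far enough from i to leave MIN_LOOP bases between them
        for j in range(i + MIN_LOOP + 1, n):
            if (seq[i], seq[j]) in CANONICAL_PAIRS:
                mask[i][j] = mask[j][i] = 1
    return mask


def is_valid_structure(seq, pairs):
    """Check (i, j) pairs against the mask and the one-partner rule.

    Crossing pairs (pseudoknots) are deliberately not rejected.
    """
    mask = pairing_mask(seq)
    used = set()
    for i, j in pairs:
        if not mask[i][j] or i in used or j in used:
            return False
        used.update((i, j))
    return True
```

For example, is_valid_structure("GGGAAAUCCC", [(0, 7), (2, 9)]) accepts two crossing G-C pairs, which is the pseudoknot case that nested-only DP formulations exclude. A learned model could multiply its predicted pairing scores by such a mask so that forbidden positions are never selected.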