ORIGINAL RESEARCH article

Front. Bioinform.

Sec. Protein Bioinformatics

Volume 5 - 2025 | doi: 10.3389/fbinf.2025.1630078

This article is part of the Research Topic: Emerging Science, Trends, and Innovations from the 17th Brazilian Symposium on Bioinformatics (BSB 2024)

COCαDA - A Fast and Scalable Algorithm for Interatomic Contact Detection in Proteins Using Cα Distance Matrices

Provisionally accepted
  • 1 Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, Brazil
  • 2 Department of Informatics, Federal University of Viçosa, Viçosa, Brazil

The final, formatted version of the article will be published soon.

Protein interatomic contacts, defined by spatial proximity and physicochemical complementarity at atomic resolution, are fundamental to characterizing molecular interactions and bonding. Methods for calculating contacts are generally categorized as cutoff-dependent, which rely on Euclidean distances, or cutoff-independent, which utilize Delaunay and Voronoi tessellations. While cutoff-dependent methods are recognized for their simplicity, completeness, and reliability, traditional implementations remain computationally expensive, posing significant scalability challenges in the current Big Data era of bioinformatics. Here, we introduce COCαDA (COntact search pruning by Cα Distance Analysis), a Python-based command-line tool that improves search pruning in large-scale interatomic protein contact analysis using alpha-carbon (Cα) distance matrices. COCαDA detects intra- and inter-chain contacts and classifies them into seven types: hydrogen and disulfide bonds; hydrophobic effects; attractive, repulsive, and salt-bridge interactions; and aromatic stackings. To evaluate our tool, we compared it with three traditional approaches from the literature: all-against-all atom distance calculation ("brute force"), static Cα distance cutoff (SC), and Biopython's NeighborSearch class (NS). COCαDA outperformed the other methods, running on average 6x faster than NS, which relies on advanced data structures such as k-d trees, while also being simpler to implement and fully customizable. The tool supports exploratory and large-scale analyses of interatomic contacts in proteins in a simple and efficient manner, and its results can be integrated with other tools and pipelines. The COCαDA tool is freely available at https://github.com/LBS-UFMG/COCaDA.
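The general idea of pruning a contact search by Cα distance can be illustrated with a minimal sketch. This is not the COCαDA implementation: it shows only a generic static Cα cutoff, similar in spirit to the SC baseline named in the abstract. The toy coordinates, the function names, and both thresholds (4.5 Å between atoms, 20 Å between Cα atoms) are assumptions chosen for illustration, not values taken from the article.

```python
import math
from itertools import combinations

# Hypothetical toy data: (residue label, {atom name: (x, y, z) in angstroms}).
# The coordinates are invented for illustration only.
RESIDUES = [
    ("ALA1", {"CA": (0.0, 0.0, 0.0), "CB": (1.5, 0.0, 0.0)}),
    ("SER2", {"CA": (3.8, 0.0, 0.0), "OG": (4.5, 1.2, 0.0)}),
    ("LEU3", {"CA": (30.0, 0.0, 0.0), "CD1": (31.5, 1.0, 0.0)}),
]


def atom_pairs_within(atoms_a, atoms_b, atom_cutoff):
    """Yield (atom_a, atom_b) names whose Euclidean distance is within the cutoff."""
    for name_a, xyz_a in atoms_a.items():
        for name_b, xyz_b in atoms_b.items():
            if math.dist(xyz_a, xyz_b) <= atom_cutoff:
                yield name_a, name_b


def brute_force_contacts(residues, atom_cutoff):
    """All-against-all baseline: every atom pair of every residue pair is tested."""
    contacts = set()
    for (res_a, atoms_a), (res_b, atoms_b) in combinations(residues, 2):
        for name_a, name_b in atom_pairs_within(atoms_a, atoms_b, atom_cutoff):
            contacts.add((res_a, name_a, res_b, name_b))
    return contacts


def ca_pruned_contacts(residues, atom_cutoff, ca_cutoff):
    """Skip a residue pair entirely when its Ca atoms are farther than ca_cutoff.

    Returns the contacts found and the number of residue pairs skipped.
    The pruning is only safe if ca_cutoff is large enough that no two
    residues beyond it can have atoms within atom_cutoff (an assumption here).
    """
    contacts = set()
    skipped = 0
    for (res_a, atoms_a), (res_b, atoms_b) in combinations(residues, 2):
        if math.dist(atoms_a["CA"], atoms_b["CA"]) > ca_cutoff:
            skipped += 1  # one cheap distance replaces many atom-atom checks
            continue
        for name_a, name_b in atom_pairs_within(atoms_a, atoms_b, atom_cutoff):
            contacts.add((res_a, name_a, res_b, name_b))
    return contacts, skipped


if __name__ == "__main__":
    pruned, skipped = ca_pruned_contacts(RESIDUES, atom_cutoff=4.5, ca_cutoff=20.0)
    print(sorted(pruned), "residue pairs skipped:", skipped)
```

In this toy case, the pruned search returns the same contacts as the all-against-all search while skipping the two residue pairs that involve the distant LEU3. On real structures most residue pairs are far apart, which is where a Cα-level filter saves work.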

Keywords: COCαDA, protein interactions, contacts, structural bioinformatics, command-line tool

Received: 16 May 2025; Accepted: 11 Aug 2025.

Copyright: © 2025 Lemos, Mariano, Silveira and de Melo-Minardi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Rafael Pereira Lemos, Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, Brazil
Raquel Cardoso de Melo-Minardi, Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, Brazil

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.