About this Research Topic
Tandem repeats (TRs) are adjacent repetitive stretches of genomic DNA, found in abundance across all kingdoms of life. Recent estimates suggest that more than 50% of an organism's proteins may contain at least one TR region. Over time, the originally perfect copies of the repeated unit may diverge via point mutations, deletions and insertions, which are shaped by existing structural or functional constraints. TRs display a wide variety of repeating units, from homorepeats (one repeating amino acid) and short tandem repeats (STRs; repeat units of 1-10 nucleotides) to whole protein domains. Shorter TRs in particular are known for mutation rates that are orders of magnitude higher than those of SNPs and indels.
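To make the definition of a perfect short tandem repeat concrete, the following is a minimal Python sketch that scans a DNA string for adjacent, identical copies of a unit of 1-10 nucleotides. The function name `find_perfect_strs` and the parameters `max_unit` and `min_copies` are illustrative assumptions, not part of any published TR tool. The sketch finds only perfect repeats; diverged repeats of the kind described above require dedicated detection methods.

```python
import re


def find_perfect_strs(seq, max_unit=10, min_copies=3):
    """Return (start, unit, copies) for perfect tandem repeats in a DNA string.

    Hypothetical helper for illustration: the unit is matched lazily so the
    shortest repeating unit is reported (e.g. "AC" rather than "ACAC").
    """
    # Group 1 is the candidate unit; \1{n,} requires at least n further
    # adjacent copies of exactly the same unit.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# A homopolymer run (T x 4) followed by a dinucleotide repeat (AC x 4).
print(find_perfect_strs("TTTTACACACACGG"))
```

Each hit gives the 0-based start position, the repeat unit and the number of adjacent copies. Raising `min_copies` makes the scan stricter.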
Due to their repetitive structure, TRs have posed challenges at many stages of genomic analysis: from sequencing and genome assembly to gene annotation and evolutionary analyses. Over the last decade, TR-suitable methods and resources have emerged in quick succession, allowing accurate TR annotation and genotyping to be integrated into existing genomic pipelines. Studies applying these recent approaches in a systematic fashion reveal that TRs are found in both coding and non-coding genomic regions, with various effects on protein function and expression. For example, protein TRs are often cited for their associations with diseases and immunity-related functions. Non-coding TRs, such as those found in promoters, may regulate the expression of neighboring genes.
Recently, numerous studies have aimed to evaluate these effects in a more systematic fashion. This has included both the development of new methods for TR analysis and genome-wide studies of associations between TR variation, gene expression and human diseases such as cancers. TRs provide a rich source of variation in populations, and are therefore a perfect playground for natural selection. Analysis of TR variation in populations and over longer evolutionary timescales may thus provide crucial insights into their role in the molecular interactions and functions of key proteins.
Within this Research Topic we welcome research articles that address current challenges in the TR field. While we welcome articles focusing on protein TRs, contributions to this Research Topic may also relate to non-coding TRs and their effects. Overall, potential contributions to this Research Topic can include (but are not limited to):
● Bioinformatics methods for TR detection, annotation and analysis
● Benchmarking existing methods for TR detection and analysis
● Computational methods for structure prediction of protein TR regions
● Computational analyses of TRs and their effects on protein folding, function, and expression
● Modeling of mechanisms of TR evolution and origin
● Methods for and analyses of the evolution of TR regions
● Case studies analyzing the functional role of TRs in different biological scenarios, such as host-pathogen interactions and immune, resistance or pathogenicity functions
● Large-scale systematic analyses of TR regions with insights on function and organismal biology
● Analyses of TR roles in protein interaction networks and biological pathways
● Classifications of protein TRs based on sequence profile models and structure
● Resources focusing on TRs: relevant databases, workflows, visualization tools
● Reviews of the above
The Topic Editors would like to acknowledge that Dr. Tugce Bilgin (Columbia University, USA) has acted as coordinator and has contributed to the preparation of the proposal for this Research Topic.
Keywords: Tandem repeat, homorepeat, STR, evolution, protein function, protein structure, gene expression
Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.