About this Research Topic

Abstract Submission Deadline 01 February 2023
Manuscript Submission Deadline 01 May 2023

Gene regulation is the intricate and highly dynamic process of inducing or inhibiting the expression of individual genes in an organism’s genome. It is orchestrated by a vast array of molecules, including transcription factors and cofactors, chromatin regulators, as well as other epigenetic mechanisms, which allow an organism to control cell growth and differentiation during development, as well as to respond to a variety of environmental stimuli. In turn, disruptions in gene regulation, e.g., those caused by mutations in regulatory sequences, have been shown to represent a defining feature of a plethora of diseases, including developmental and neurological disorders as well as cancer.

Given its importance for proper cell function and adaptability, decoding the architecture of gene regulation has become one of the most pressing tasks in modern (computational) biology. To this end, it has been a long-held ambition to enable quantitative prediction of gene regulation, i.e., inference of gene expression levels, from genomic and epigenomic features alone. Rising computing power, recent advances in learning algorithms, high-throughput next-generation sequencing that provides large-scale quantification of gene expression at single-cell resolution, and the identification of novel genes and noncoding RNAs at unprecedented levels may together bring us one step closer to realizing this ambition.

Early attempts to apply modern machine learning approaches, in particular deep learning, to predict mRNA abundance levels directly from DNA sequence have already yielded promising results. Despite this, it remains an open question how individual factors and epigenomic features involved in the gene regulatory apparatus interact within the vast genomic landscape of an organism’s non-coding regions and, in turn, contribute to mRNA expression levels. To advance our understanding of gene expression inference, we need scalable algorithms that: i) allow for the integration of diverse genomic and epigenomic datasets as well as structural and/or biological priors; ii) are interpretable with respect to the representations they have learned; and iii) allow the learned representations to be transferred to novel and different environments and contexts.

This Research Topic welcomes both original studies and review articles assessing modern machine learning approaches that integrate genomic and/or epigenomic datasets to quantitatively predict gene regulation. The topics of interest include, but are not limited to, the following:

- Quantitative inference of gene expression levels from DNA sequence and/or epigenomic features;
- Analyses of transcription factor interactions across multiple binding elements and/or of the effects of single nucleotide polymorphisms, copy number variations, etc., in causing loss or creation of promoter binding elements and enhancers;
- Strategies for heterogeneous data integration of genomic sequences and epigenetic datasets, including chromatin accessibility, methylation, or chromatin conformation-related data;
- Evaluation and/or benchmarking of different (un)supervised learning paradigms, including Bayesian (deep) learning, deep convolutional models, graph neural networks, or attention-based approaches;
- Approaches to structural learning of efficient model architectures based on biological or structural priors, or on data-driven methods such as Neural Architecture Search or Bayesian sampling;
- Analyses of trade-offs between model scalability and model complexity with respect to structural priors, inductive biases, dimensionality reduction, and randomized or sampling-based techniques;
- Investigations into model interpretability, e.g., visualization of individual components of trained models, such as model filters representing sequence binding motifs;
- Studies into the possibilities/limitations of the transfer of trained models to novel/different contexts and environments.

Keywords: machine learning, plant genome, plant genes, gene regulation, epigenomics, deep learning


Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.
