About this Research Topic
The field of genomics has experienced an explosion of data. There now exist catalogues of millions of results from genome-wide association, gene expression, and methylation studies. Often, the primary focus of these studies is whether a specific p-value threshold was met. However, significance thresholds are often defined without sufficient empirical evidence of their utility. Moreover, while p-values can provide an important piece of information relevant to statistical inference, they are often misused and misinterpreted, and p-values alone are insufficient for appropriate inference. Rather, it is critical to understand the design of the study, including data collection, pre-processing, and potential sources of error, as well as the analytical approach, its underlying assumptions, and the scientific plausibility of the hypothesis being tested. Further, while researchers have proposed criteria for evaluating specific types of data, there is no general consensus on how multiple testing should be addressed. Without this consensus, the criteria for judging the science vary markedly, such that some studies are too liberal with their inferences and others too conservative. This Research Topic is intended to evaluate the major considerations for statistical inference in the era of large-scale genomic data, including reproducibility of research. The goal of this Topic is to determine the best practices for genomic analyses. We expect this work will provide the basis for ongoing conversations about how the field of genetics and genomics can move beyond simply reviewing p-values in order to improve statistical inference. Given the exponential increase in genomic data generated, this Topic will be critical in informing best practice in genetic epidemiology.
This Research Topic is open to all research on the evaluation of statistical genetic and genomic approaches. We encourage review articles, opinion pieces, and original research articles that address the statistical inference process, including study design, statistical approaches, multiple testing correction, sources and impact of error, statistical interpretation and reporting, replication, and other factors influencing inference. Genomic data considered may include genetic, gene expression, epigenetic, and microbiome data. Both frequentist and non-frequentist approaches are welcome.
Keywords: Statistical inference, multiple testing, design, genomics, error
Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.