AUTHOR=Biová Jana , Kaňovská Ivana , Chan Yen On , Immadi Manish Sridhar , Joshi Trupti , Bilyeu Kristin , Škrabišová Mária TITLE=Natural and artificial selection of multiple alleles revealed through genomic analyses JOURNAL=Frontiers in Genetics VOLUME=Volume 14 - 2023 YEAR=2024 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2023.1320652 DOI=10.3389/fgene.2023.1320652 ISSN=1664-8021 ABSTRACT=Genome-to-phenome research in agriculture aims to improve crops through in-silico predictions. Genome-wide association studies (GWAS) are a potent means of identifying genomic loci that underlie important traits. Because GWAS is a statistical method, increasing the sample quantity, data quality, or diversity of the data set improves its power. For more precise breeding, concrete candidate genes with exact functional variants must be discovered. Many post-GWAS methods have been developed to narrow down the associated genomic regions and, ideally, to predict candidate genes and causative mutations (CMs). Historical natural selection and breeding-related artificial selection both change the frequencies of different alleles of genes that control phenotypes. With more diverse and more extensive GWAS data sets, there is an increased chance that a single causal gene carries multiple alleles with independent CMs. This can be caused by the presence of samples from geographically isolated regions that arose during natural or artificial selection. This simple fact complicates GWAS-driven discoveries. Currently, none of the existing association methods addresses this issue, so a method is needed that identifies multiple alleles and, more specifically, the actual CMs. Therefore, we developed a tool that computes a score for each combination of variant positions in a single candidate gene and, based on the highest score, identifies the best number and combination of CMs.
The tool is publicly available as a Python package on GitHub, and we further created a web-based Multiple Alleles discovery (MADis) tool that supports soybean and is hosted in SoyKB (https://soykb.org/SoybeanMADisTool/). We tested and validated the algorithm and demonstrated the use of MADis in a case study of the pod pigmentation L1 gene, which carries multiple CMs from natural or artificial selection. Finally, we identified a candidate gene for the pod color L2 locus and predicted the existence of multiple alleles that potentially cause loss of pod pigmentation. In this work, we show how genomic analysis can be employed to explore the natural and artificial selection of multiple alleles and thus improve and accelerate crop breeding in agriculture.
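To make the idea of scoring combinations of variant positions concrete, the following is a minimal sketch under stated assumptions. It is not the MADis scoring function: the abstract does not specify how the score is computed, so a simple stand-in is used here. The stand-in predicts the mutant phenotype when a sample carries the alternate allele at any position in the combination, and the score is the fraction of samples predicted correctly. The names `combo_score` and `best_combination`, the `max_size` limit, and the toy data are all hypothetical.

```python
from itertools import combinations


def combo_score(genotypes, phenotypes, combo):
    """Stand-in score (an assumption, not the MADis formula): the fraction of
    samples whose phenotype matches the rule 'mutant if the sample carries the
    alternate allele at any position in combo'."""
    hits = 0
    for sample, is_mutant in phenotypes.items():
        predicted = any(genotypes[sample][pos] for pos in combo)
        hits += predicted == is_mutant
    return hits / len(phenotypes)


def best_combination(genotypes, phenotypes, positions, max_size=3):
    """Exhaustively score every combination of up to max_size positions and
    return (score, combination) for the highest score. The strict comparison
    means that ties keep the smaller combination found first."""
    best = (-1.0, ())
    for k in range(1, max_size + 1):
        for combo in combinations(positions, k):
            score = combo_score(genotypes, phenotypes, combo)
            if score > best[0]:
                best = (score, combo)
    return best


# Hypothetical toy data: 1 = alternate allele present, 0 = reference allele.
positions = ["p1", "p2", "p3"]
genotypes = {
    "s1": {"p1": 1, "p2": 0, "p3": 0},
    "s2": {"p1": 1, "p2": 0, "p3": 1},
    "s3": {"p1": 0, "p2": 1, "p3": 0},
    "s4": {"p1": 0, "p2": 1, "p3": 1},
    "s5": {"p1": 0, "p2": 0, "p3": 1},
    "s6": {"p1": 0, "p2": 0, "p3": 0},
}
# True = mutant phenotype (for example, loss of pigmentation).
phenotypes = {"s1": True, "s2": True, "s3": True, "s4": True,
              "s5": False, "s6": False}

score, combo = best_combination(genotypes, phenotypes, positions)
print(score, combo)
```

In this toy data set, neither `p1` nor `p2` alone explains every mutant sample, but together they do, which mirrors the situation of two independent alleles in one gene. An exhaustive search like this grows combinatorially with the number of positions, so the `max_size` cap is a practical assumption of the sketch.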