AUTHOR=Guenzi-Tiberi Pierre, Istace Benjamin, Alsos Inger Greve, The PhyloNorway Consortium, Coissac Eric, Lavergne Sébastien, The PhyloAlps Consortium, Aury Jean-Marc, Denoeud France, Alsos L.G., Føreid Merkel M.K., Lammers Y., Coissac E., Pouchon C., Alberti A., Denoeud F., Wincker P., Lavergne S., Pouchon C., Coissac E., Roquet C., Smyčka J., Boleda M., Thuiller W., Gielly L., Taberlet P., Rioux D., Boyer F., Hombiat A., Bzeznik B., Alberti A., Denoeud F., Wincker P., Orvain C., Perrier C., Douzet R., Rome M., Valay J.G., Aubert S., Zimmermann N., Wüest R. O., Latzin S., Wipf S., Van Es J., Garraud L., Villaret J.C., Abdulhak S., Bonnet V., Huc S., Fort N., Legland T., Sanz T., Pache G., Mikolajczak A., Noble V., Michaud H., Offerhaus B., Pires M., Morvant Y., Dentant C., Salomez P., Bonet R., Delahaye T., Leccia M.F., Perfus M., Eggenberg S., Möhl A., Hurdu B., Pușcaș M., Slovák M.
TITLE=LocoGSE, a sequence-based genome size estimator for plants
JOURNAL=Frontiers in Plant Science
VOLUME=15
YEAR=2024
URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2024.1328966
DOI=10.3389/fpls.2024.1328966
ISSN=1664-462X
ABSTRACT=

Extensive research has explored the range of genome sizes in eukaryotes, with particular emphasis on land plants, where substantial variability has been observed. Accurate genome size estimation is essential for many research purposes, but existing sequence-based methods have limitations, particularly for low-coverage datasets. In this study, we introduce LocoGSE, a novel genome size estimator designed specifically for low-coverage datasets generated by genome skimming approaches. LocoGSE relies on mapping reads to single-copy consensus proteins and does not require a reference genome assembly. We calibrated LocoGSE using 430 low-coverage angiosperm genome skimming datasets and compared its performance against other estimators. Our results demonstrate that LocoGSE accurately predicts monoploid genome size even at very low depth of coverage (<1X) and on highly heterozygous samples. Additionally, LocoGSE provides stable estimates across individuals with varying ploidy levels. LocoGSE fills a gap in sequence-based plant genome size estimation by offering a user-friendly and reliable tool that does not rely on high coverage or reference assemblies. We anticipate that LocoGSE will facilitate plant genome size analysis and contribute to evolutionary and ecological studies. Furthermore, at the cost of an initial calibration, LocoGSE can be used in other lineages.
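
The depth-based reasoning behind this kind of estimator can be illustrated with a minimal Python sketch. It is not the LocoGSE implementation: the function names, the use of a median across genes, and the single multiplicative calibration factor `slope` are all assumptions made for illustration. The sketch only encodes the general idea that genome size is approximately the total number of sequenced bases divided by the sequencing depth, with depth inferred from reads mapped to single-copy genes.

```python
from statistics import median


def estimate_depth(per_gene_depths, slope=1.0):
    """Estimate sequencing depth from depths observed on single-copy genes.

    per_gene_depths: mean read depth on each single-copy gene (hypothetical input).
    slope: hypothetical lineage-specific calibration factor relating the depth
           observed on the gene set to the true sequencing depth.
    """
    if not per_gene_depths:
        raise ValueError("at least one per-gene depth is required")
    if slope <= 0:
        raise ValueError("slope must be positive")
    # The median limits the influence of genes that are not truly single copy.
    return median(per_gene_depths) / slope


def estimate_genome_size(total_bases, per_gene_depths, slope=1.0):
    """Return an approximate genome size in bases: total bases / depth."""
    depth = estimate_depth(per_gene_depths, slope)
    if depth <= 0:
        raise ValueError("estimated depth must be positive")
    return total_bases / depth


if __name__ == "__main__":
    # Illustrative numbers only: 0.5 Gb of reads at roughly 0.5X depth.
    depths = [0.4, 0.5, 0.6, 0.5, 3.0]  # last gene behaves like a multi-copy outlier
    size = estimate_genome_size(500_000_000, depths)
    print(f"Estimated genome size: {size / 1e9:.2f} Gb")
```

In this sketch the calibration step mentioned in the abstract is reduced to a single factor that would have to be fitted per lineage from samples of known genome size; how LocoGSE actually performs its calibration is described in the article itself.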