AUTHOR=Xie Haiying, Yang Caiyun, Sun Yamin, Igarashi Yasuo, Jin Tao, Luo Feng
TITLE=PacBio Long Reads Improve Metagenomic Assemblies, Gene Catalogs, and Genome Binning
JOURNAL=Frontiers in Genetics
VOLUME=Volume 11 - 2020
YEAR=2020
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2020.516269
DOI=10.3389/fgene.2020.516269
ISSN=1664-8021
ABSTRACT=PacBio long-read sequencing offers several potential advantages for DNA assembly and can provide more complete gene profiling of metagenomic samples; however, its lower single-pass accuracy can make gene discovery and assembly difficult for low-abundance organisms. To evaluate the application and performance of PacBio long reads and Illumina HiSeq short reads for metagenomics, we directly compared various assemblies of PacBio and Illumina sequencing reads from two anaerobic digestion microbiome samples. The PacBio platform produced 19.6 Gb of long reads with an average length of 7604 bp and ~85-90% accuracy, and the Illumina HiSeq platform produced 45.4 Gb of short paired reads. Hybrid assemblies using PacBio long reads and HiSeq contigs improved assembly statistics, including increases in the average contig length, contig N50 size and number of large contigs. Interestingly, hybrid assemblies by sequencing depth generated a much higher proportion of complete genes (98.86%) than those based on HiSeq contigs (40.29%), because the PacBio reads are long enough to span many short repetitive elements and capture multiple genes in a single read. Meanwhile, incorporating PacBio long reads significantly reduced the number of contigs and increased genome completeness in the reconstruction of genomes that assembled and binned poorly using HiSeq data alone.
In this comparison of PacBio long reads with Illumina HiSeq short reads on a complex microbiome, we conclude that PacBio long reads produced longer contigs, more complete genes and better genome binning, thereby offering more information about the metagenomic samples.
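The assembly statistics named in the abstract (average contig length, contig N50 size, and number of large contigs) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `assembly_stats`, the toy contig lengths, and the 1000 bp cutoff for a "large" contig are assumptions made only for illustration, since the abstract does not define that threshold. N50 is computed in the usual way, as the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length.

```python
def assembly_stats(contig_lengths, large_cutoff=1000):
    """Summarize an assembly from a list of contig lengths (in bp).

    large_cutoff is a hypothetical threshold for a "large" contig;
    the abstract does not state the value the authors used.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")

    # Sort longest first so the cumulative sum reaches half the total
    # using the fewest, longest contigs.
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)

    # N50: length of the contig at which the running total first
    # covers at least half of the assembly.
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "num_contigs": len(lengths),
        "total_bp": total,
        "mean_length": total / len(lengths),
        "n50": n50,
        "num_large": sum(1 for length in lengths if length >= large_cutoff),
    }


# Toy example with made-up contig lengths (not data from the paper).
stats = assembly_stats([500, 1200, 3000, 8000, 15000])
print(stats)
```

Because N50 is weighted by contig length, merging short contigs into a few long ones raises it even when the total assembled length barely changes, which is why it is reported alongside the mean contig length.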