AUTHOR=Cong Yingnan, Chan Yao-ban, Phillips Charles A., Langston Michael A., Ragan Mark A. TITLE=Robust Inference of Genetic Exchange Communities from Microbial Genomes Using TF-IDF JOURNAL=Frontiers in Microbiology VOLUME=8 YEAR=2017 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2017.00021 DOI=10.3389/fmicb.2017.00021 ISSN=1664-302X ABSTRACT=
Bacteria and archaea can exchange genetic material across lineages through processes of lateral genetic transfer (LGT). Collectively, these exchange relationships can be modeled as a network and analyzed using concepts from graph theory. In particular, densely connected regions within an LGT network have been defined as genetic exchange communities (GECs). However, it has been problematic to construct networks in which edges solely represent LGT. Here we apply term frequency-inverse document frequency (TF-IDF), an alignment-free method originating from document analysis, to infer regions of lateral origin in bacterial genomes. We examine four empirical datasets of different size (number of genomes) and phyletic breadth, varying a key parameter (word length