Original Research ARTICLE
Efficient mining of variants from trios for VSD-association study
- 1Taihe Hospital, Hubei University of Medicine, China
- 2Hubei University of Medicine, China
- 3Guangxi University, China
Ventricular septal defect (VSD) is a fatal congenital heart disease with severe consequences in affected infants, for which early diagnosis, particularly through genetic variants, plays an important role. Existing panel-based approaches to variant mining suffer from a shortage of large panels, high sequencing cost, and missed rare variants. Although a trio-based method alleviates these limitations to some extent, it is agnostic to novel mutations and computationally intensive. Considering these limitations, we propose a novel variant-mining algorithm for trio-based sequencing data and apply it to a VSD trio to identify associated mutations. Our approach starts by filtering irrelevant k-mers from the sequences of a trio via a newly conceived coupled Bloom filter, then corrects sequencing errors using a statistical approach and extends the retained k-mers into long sequences. These extended sequences are used as input for variant calling. The obtained variants are then comprehensively analyzed against existing databases to mine VSD-related mutations. Experiments show that our trio-based algorithm narrows down candidate coding genes and lncRNAs by about 10-fold and 5-fold, respectively, compared with single-sequence-based approaches. Meanwhile, our algorithm is 10 times faster and uses about two orders of magnitude less memory than the existing state-of-the-art approach. By applying our approach to a VSD trio, we identify an unreported gene (CD80), a combination of two genes (MYBPC3 and TRDN), and a lncRNA (NONHSAT096266.2) that are, with high confidence, VSD-related.
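The k-mer filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' coupled Bloom filter, whose design is not given here: it assumes a single ordinary Bloom filter holding the k-mers of both parents and treats a child k-mer as "irrelevant" when it also appears in a parent. The names `BloomFilter`, `kmers`, and `child_specific_kmers` are hypothetical, and reverse complements, error correction, and k-mer extension are left out.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter (an assumption; not the paper's coupled design)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def child_specific_kmers(child_reads, father_reads, mother_reads, k):
    """Return child k-mers that the filter does not report in either parent."""
    parents = BloomFilter()
    for read in list(father_reads) + list(mother_reads):
        for km in kmers(read, k):
            parents.add(km)

    kept = set()
    for read in child_reads:
        for km in kmers(read, k):
            if km not in parents:
                kept.add(km)
    return kept


# Toy trio: the child differs from both parents at one base.
kept = child_specific_kmers(
    child_reads=["ACGTTCGT"],
    father_reads=["ACGTACGT"],
    mother_reads=["ACGTACGT"],
    k=4,
)
print(sorted(kept))
```

Because a Bloom filter can report false positives but not false negatives, this sketch may occasionally discard a genuinely child-specific k-mer, but it never keeps one that a parent carries; in exchange, the parental k-mers need only a fixed-size bit array rather than an exact set.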
Keywords: trio-sequencing, k-mer filtering, variant calling, VSD, gene ontology
Received: 22 Dec 2018;
Accepted: 27 Jun 2019.
Edited by: Tao Zeng, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences (CAS), China
Reviewed by: Wenting Liu, Genome Institute of Singapore, Singapore
Zhenhua Li, National University of Singapore, Singapore
Copyright: © 2019 Jiang, Hu, Wang, Zhang, Zhu, Tong, Bai, Li and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Prof. Liang Zhao, Taihe Hospital, Hubei University of Medicine, Shiyan, Hubei Province, China, email@example.com