Brief Research Report ARTICLE
NUMT confounding biases mitochondrial heteroplasmy calls in favor of the reference allele
- ¹Imperial College London, United Kingdom
Homology between mitochondrial DNA (mtDNA) and nuclear DNA of mitochondrial origin (nuMTs) causes confounding when aligning short sequence reads to the reference human genome, as the true sequence origin cannot be determined. Using a systematic in silico approach, we here report the impact of all potential mitochondrial variants on alignment accuracy and variant calling. A total of 49,707 possible mutations were introduced across the 16,569 bp reference mitochondrial genome (16,569 positions x 3 alternative alleles), one variant at a time. Subsequent in silico fragmentation and alignment to the entire reference genome (GRCh38) revealed preferential mapping of mutated mitochondrial fragments to nuclear loci, because the variants increased sequence similarity to nuMTs, for a total of 807, 362 and 41 variants at 333, 144 and 27 positions when using 100 bp, 150 bp and 300 bp single-end fragments, respectively. We subsequently modelled these affected variants at 50% heteroplasmy and carried out variant calling, observing bias in the reported allele frequencies in favor of the reference allele. Four variants (chrM:6023A, chrM:4456T, chrM:5147A and chrM:7521A), including a possible hypertension factor, chrM:4456T, caused 100% loss of coverage at the mutated position (with all 100 bp single-end fragments aligning to homologous nuclear positions instead of chrM), rendering these variants undetectable when aligning to the entire reference genome. Furthermore, four mitochondrial variants reported to be pathogenic were found to cause significant loss of coverage, and select haplogroup-defining SNPs were shown to exacerbate the loss of coverage caused by surrounding variants. Increased fragment length and use of paired-end reads both improved alignment accuracy.
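The in silico design described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`enumerate_snvs`, `mutate`, `fragments_covering`, `observed_alt_fraction`) are hypothetical, the reference is assumed to contain only A, C, G and T, the sequence is treated as linear rather than circular, and the allele-fraction model assumes that reference-allele fragments always map correctly to chrM. The alignment step itself is not shown.

```python
from typing import Iterator, List, Tuple

BASES = "ACGT"


def enumerate_snvs(reference: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (1-based position, ref base, alt base) for every substitution.

    Assumes the reference contains only A/C/G/T, so each position
    contributes exactly three alternative alleles.
    """
    for index, ref in enumerate(reference.upper()):
        for alt in BASES:
            if alt != ref:
                yield index + 1, ref, alt


def mutate(reference: str, pos: int, alt: str) -> str:
    """Return a copy of the reference with one substitution at 1-based pos."""
    return reference[: pos - 1] + alt + reference[pos:]


def fragments_covering(sequence: str, pos: int, length: int) -> List[str]:
    """Return every fixed-length fragment that overlaps 1-based pos.

    The sequence is treated as linear (an assumption of this sketch),
    so fragments never wrap around the ends.
    """
    index = pos - 1
    first_start = max(0, index - length + 1)
    last_start = min(index, len(sequence) - length)
    return [sequence[s : s + length] for s in range(first_start, last_start + 1)]


def observed_alt_fraction(heteroplasmy: float, fraction_lost: float) -> float:
    """Alt allele fraction seen on chrM when some alt fragments map elsewhere.

    Assumes only alt-bearing fragments are lost to nuclear loci and that
    reference-bearing fragments always align to chrM.
    """
    alt_depth = heteroplasmy * (1.0 - fraction_lost)
    ref_depth = 1.0 - heteroplasmy
    total = alt_depth + ref_depth
    return alt_depth / total if total > 0 else 0.0  # 0.0 when nothing is covered


if __name__ == "__main__":
    toy_reference = "ACGTACGTAC"  # stand-in for the 16,569 bp reference
    variants = list(enumerate_snvs(toy_reference))
    print(len(variants), "variants for a", len(toy_reference), "bp toy reference")

    pos, ref, alt = variants[6]
    mutated = mutate(toy_reference, pos, alt)
    print(pos, ref, alt, fragments_covering(mutated, pos, 4))
    print(observed_alt_fraction(0.5, 0.5))
```

Under these assumptions, a variant modelled at 50% heteroplasmy that loses half of its alternative-allele fragments to nuclear loci would be reported at roughly one third rather than one half, and a variant that loses all of them would not be reported at all, which matches the direction of the bias described above.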
Keywords: NUMT, mtDNA, Genotype, mitochondrial variants, Mitochondrial genotype, NGS, Heteroplasmy
Received: 24 Apr 2019;
Accepted: 05 Sep 2019.
Copyright: © 2019 Maude, Davidson, Charitakis, Diaz, Bowers, Gradovich, Andrew and Huntley. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Miss. Hannah Maude, Imperial College London, London, United Kingdom, firstname.lastname@example.org
Mr. William Bowers, Imperial College London, London, United Kingdom, email@example.com