Leveraging spatial variation in tumor purity for improved somatic variant calling of archival tumor only samples
- 1 Translational Genomics Research Institute, United States
- 2 Mayo Clinic Arizona, United States
- 3 Imaging Endpoints, United States
- 4 HonorHealth Scottsdale Shea Medical Center, United States
- 5 GE Global Research (United States), United States
- 6 Prairie View A&M University, United States
Archival tumor samples represent a rich resource of annotated specimens for translational genomics research. However, standard variant calling approaches require a matched normal sample from the same individual, which is often not available in the retrospective setting, making it difficult to distinguish between true somatic variants and individual-specific germline variants. Archival sections often contain adjacent normal tissue, but this tissue can include infiltrating tumor cells. As existing comparative somatic variant callers are designed to exclude variants present in the normal sample, a novel approach is required to leverage adjacent normal tissue with infiltrating tumor cells for somatic variant calling. Here we present lumosVar 2.0, a software package designed to jointly analyze multiple samples from the same patient, built upon our previous single-sample tumor-only variant caller lumosVar 1.0. The approach assumes that the allelic fractions of somatic variants and germline variants follow different patterns as tumor content and copy number state change. LumosVar 2.0 estimates allele-specific copy number and tumor sample fractions from the data, and uses a model to determine expected allelic fractions for somatic and germline variants and classify variants accordingly. To evaluate the ability of lumosVar 2.0 to jointly call somatic variants with tumor and adjacent normal samples, we used a glioblastoma dataset with matched high and low tumor content samples and germline whole exome sequencing data (to define true somatic variants) available for each patient. Both sensitivity and positive predictive value were improved when analyzing the high tumor and low tumor samples jointly compared to analyzing the samples individually or in silico pooling of the two samples. Finally, we applied this approach to a set of breast and prostate archival tumor samples for which tumor blocks containing adjacent normal tissue were available for sequencing. Joint analysis using lumosVar 2.0 detected several variants, including known cancer hotspot mutations, that were not detected by standard somatic variant calling tools using the adjacent normal as a reference. Together, these results demonstrate the potential utility of leveraging paired tissue samples to improve somatic variant calling when a constitutional sample is not available.
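The premise that somatic and germline allelic fractions respond differently to tumor content and copy number can be illustrated with a small calculation. The sketch below is not the lumosVar 2.0 model; it uses a common simplified mixture formula and assumes that normal cells are diploid, that germline variants are heterozygous, and that somatic variants are clonal. The function name `expected_af` and its parameters are hypothetical and chosen only for illustration.

```python
def expected_af(tumor_fraction, total_cn, variant_cn, germline):
    """Expected allelic fraction of a variant in a tumor/normal cell mixture.

    Illustrative assumptions (not taken from lumosVar 2.0):
      - normal cells are diploid
      - a germline variant is heterozygous (one copy in each normal cell)
      - a somatic variant is absent from normal cells and clonal in the tumor
    tumor_fraction: fraction of cells in the sample that are tumor (0 to 1)
    total_cn:       total copy number of the locus in tumor cells
    variant_cn:     copies of the locus in tumor cells carrying the variant
    """
    normal_fraction = 1.0 - tumor_fraction
    normal_variant_copies = 1 if germline else 0
    variant_copies = tumor_fraction * variant_cn + normal_fraction * normal_variant_copies
    all_copies = tumor_fraction * total_cn + normal_fraction * 2
    return variant_copies / all_copies


if __name__ == "__main__":
    # Compare a high and a low tumor content sample at a copy-neutral locus
    # and at a locus with a one-copy loss that retains the variant allele.
    for label, total_cn, variant_cn in [("copy-neutral", 2, 1), ("one-copy loss", 1, 1)]:
        for f in (0.8, 0.2):
            som = expected_af(f, total_cn, variant_cn, germline=False)
            ger = expected_af(f, total_cn, variant_cn, germline=True)
            print(f"{label:14s} tumor fraction {f:.1f}: somatic {som:.3f}, germline {ger:.3f}")
```

Under these assumptions, the somatic allelic fraction falls roughly in proportion to tumor content, while the germline allelic fraction stays at 0.5 in copy-neutral regions and moves toward 0.5 as tumor content drops in regions of copy number change. That difference in pattern across samples is the kind of signal a joint analysis can exploit.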
Keywords: cancer genomics, somatic variant calling, next generation sequencing, tumor-only sequencing, exome analysis, cancer hotspots
Received: 16 Jul 2018;
Accepted: 11 Feb 2019.
Edited by: Sven Bilke, National Cancer Institute (NCI), United States
Reviewed by: Parvin Mehdipour, Tehran University of Medical Sciences, Iran
Lei Wei, Roswell Park Comprehensive Cancer Center, University at Buffalo, United States
Jamie K. Teer, Moffitt Cancer Center, United States
Nam S. Vo, University of Chicago, United States
Copyright: © 2019 Halperin, Liang, Kulkarni, Tassone, Adkins, Enriquez, Tran, Hank, Newell, Kodira, Korn, Berens, Kim and Byron. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Dr. Rebecca F. Halperin, Translational Genomics Research Institute, Phoenix, United States, firstname.lastname@example.org