ORIGINAL RESEARCH article

Front. Bioinform.

Sec. Genomic Analysis

This article is part of the Research Topic "AI in Genomic Analysis".

A Transformers-based framework for refinement of genetic variants

Provisionally accepted
  • Laval University, Quebec, Canada

The final, formatted version of the article will be published soon.

Accurate variant calling refinement is crucial for distinguishing true genetic variants from technical artifacts in high-throughput sequencing data. Heuristic filtering and manual review are the most common refinement approaches, but manual review is time-consuming, and heuristic filters rarely offer optimal thresholds, particularly for low-coverage regions, where callers are prone to false-positive and false-negative calls. Advances in artificial intelligence, particularly deep learning, offer promising solutions for automating this refinement process. Here, we present a Transformers-based framework for genetic variant refinement that leverages self-attention to model dependencies among variant features and directly processes VCF files, enabling seamless integration with standard pipelines such as BCFTools and GATK4. Trained on 2 million variants from the GIAB (v4.2.1) sample HG003, the framework achieved 89.26% accuracy and a ROC AUC of 0.88. Across the tested samples, VariantTransformer improved baseline filtering accuracy by 4–10%, demonstrating consistent gains over the default caller filters. When integrated into conventional variant calling pipelines, VariantTransformer outperformed traditional heuristic filters and, by refining existing caller outputs, approached the accuracy of state-of-the-art AI-based variant callers such as DeepVariant, despite not operating as a standalone caller. By positioning this work as a flexible and generalizable framework rather than a single-use model, we highlight the underexplored potential of Transformers for variant refinement in genomics. This study contributes a blueprint for adapting Transformer architectures to a wide range of genomic quality control and filtering tasks. Code is available at: https://github.com/Omar-Abd-Elwahab/VariantTransformer.
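The abstract does not detail the model internals, but the core idea it names is self-attention over per-variant features extracted from a VCF. As a minimal illustrative sketch (not the authors' implementation), single-head scaled dot-product self-attention over a small matrix of hypothetical variant features such as QUAL, read depth, allele balance, and mapping quality could look like this; the feature choice and weight matrices here are assumptions for demonstration only:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (n_variants, d) feature matrix; Wq/Wk/Wv: (d, d) projections.
    Returns the attended representations and the attention weights.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d), axis=-1)  # each row sums to 1
    return A @ V, A

rng = np.random.default_rng(0)
# Hypothetical features per variant: [QUAL, DP, allele balance, MQ]
X = rng.normal(size=(5, 4))          # 5 variants, 4 features
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
print(out.shape, attn.shape)         # (5, 4) (5, 5)
```

In a full refinement model, the attended representations would feed a classification head producing a keep/filter decision per variant; here the sketch only shows how attention lets each variant's representation depend on the features of the others.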

Keywords: Bioinformatics & Computational Biology, deep learning - artificial intelligence, Genomics, transformers, Variant calling analysis, Variant Filtering, VCF analysis

Received: 29 Aug 2025; Accepted: 12 Dec 2025.

Copyright: © 2025 Abdelwahab and Torkamaneh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Davoud Torkamaneh

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.