ORIGINAL RESEARCH article

Front. Immunol.
Sec. B Cell Biology
Volume 15 - 2024 | doi: 10.3389/fimmu.2024.1407470

Interpretable deep learning reveals the role of an E-box motif in suppressing somatic hypermutation of AGCT motifs within human immunoglobulin variable regions

Provisionally accepted
  • 1 Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, New York, United States
  • 2 Department of Applied Mathematics and Statistics, College of Engineering and Applied Sciences, Stony Brook University, Stony Brook, New York, United States
  • 3 Research Institute of Molecular Pathology (IMP), Vienna, Austria
  • 4 Peter Gorer Department of Immunobiology, School of Immunology & Microbial Sciences, Faculty of Life Sciences & Medicine, King's College London, London, England, United Kingdom

The final, formatted version of the article will be published soon.

    Somatic hypermutation (SHM) of immunoglobulin variable (V) regions by activation-induced deaminase (AID) is essential for robust, long-term humoral immunity against pathogen and vaccine antigens. AID mutates cytosines preferentially within WRCH motifs (where W=A or T, R=A or G and H=A, C or T). However, it has been consistently observed that the mutability of WRCH motifs varies substantially, with large variations in mutation frequency even between multiple occurrences of the same motif within a single V region. This has led to the notion that the immediate sequence context of WRCH motifs contributes to mutability. Recent studies have highlighted the potential role of local DNA sequence features in promoting mutagenesis of AGCT, a commonly mutated WRCH motif. Intriguingly, AGCT motifs closer to the 5' ends of V regions, within the framework 1 (FW1) sub-region, mutate less frequently, suggesting an SHM-suppressing sequence context. Here, we systematically examined the basis of AGCT positional biases in human SHM datasets with DeepSHM, a machine-learning model designed to predict SHM patterns. This was combined with integrated gradients, an interpretability method, to interrogate the basis of DeepSHM predictions. DeepSHM predicted the observed positional differences in mutation frequencies at AGCT motifs with high accuracy. For the conserved, infrequently mutating AGCT motifs in FW1, integrated gradients predicted a large negative contribution of the 5'C and 3'G flanking residues, suggesting that a CAGCTG context in this location was suppressive for SHM. CAGCTG is the recognition motif for E-box transcription factors, including E2A, which has been implicated in SHM. Indeed, we found a strong, inverse relationship between E-box motif fidelity and mutation frequency. Moreover, E2A was found to associate with the V region locale in two human B cell lines. Finally, analysis of human SHM datasets revealed that naturally occurring mutations in the 3'G flanking residues, which effectively ablate the E-box motif, were associated with a significantly increased rate of AGCT mutation. Our results suggest an antagonistic relationship between mutation frequency and the binding of E-box factors like E2A at specific AGCT motif contexts and, therefore, highlight a new, suppressive mechanism regulating local SHM patterns in human V regions.
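
    The motif definitions in the abstract can be made concrete with a short sketch. The Python below is an illustration only and is not part of DeepSHM or the authors' analysis: it scans one strand of a sequence for WRCH motifs (W=A or T, R=A or G, H=A, C or T) and, for each AGCT, reports whether the 5' and 3' flanking bases complete the E-box motif CAGCTG. The function names and the toy sequence are hypothetical, and the single-strand scan is a simplifying assumption.

```python
import re

# IUPAC codes as defined in the abstract: W = A/T, R = A/G, H = A/C/T.
# The lookahead lets overlapping motifs be reported as well.
WRCH = re.compile(r"(?=([AT][AG]C[ACT]))")
EBOX = "CAGCTG"  # E-box recognition motif named in the abstract


def find_wrch(seq):
    """Return (start, motif) for every WRCH occurrence on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in WRCH.finditer(seq)]


def agct_contexts(seq):
    """Describe the flanking bases of each AGCT motif (hypothetical helper).

    'ebox' is True only when the 5' base is C and the 3' base is G,
    i.e. the six-base window reads CAGCTG.
    """
    seq = seq.upper()
    contexts = []
    for start, motif in find_wrch(seq):
        if motif != "AGCT":
            continue
        five = seq[start - 1] if start > 0 else ""
        three = seq[start + 4] if start + 4 < len(seq) else ""
        contexts.append({
            "pos": start,
            "five_prime": five,
            "three_prime": three,
            "ebox": five + motif + three == EBOX,
        })
    return contexts


# Toy sequence (not real V region data): one AGCT inside CAGCTG, one outside.
toy = "GGCAGCTGTTAGCTAAA"
for ctx in agct_contexts(toy):
    print(ctx)
```

    In this toy input the first AGCT sits in an intact CAGCTG context, while the second has a T and an A as flanks. A change of the 3'G to any other base, as described for the naturally occurring mutations above, would flip the `ebox` flag to False.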

    Keywords: somatic hypermutation (SHM), activation-induced deaminase (AID), immunoglobulin heavy chain, deep learning, integrated gradients, E-box transcription factors, E2A

    Received: 26 Mar 2024; Accepted: 08 May 2024.

    Copyright: © 2024 Tambe, MacCarthy and Pavri. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Rushad Pavri, Research Institute of Molecular Pathology (IMP), Vienna, 1030, Austria

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.