Lorentz-Regularized Interpretable VAE for Multi-Scale Single-cell Transcriptomic and Epigenomic Embeddings

Fu, Zeyu; Chen, Chunlin; Zhang, Keyang

doi:10.3389/fgene.2025.1713727

ORIGINAL RESEARCH article

Front. Genet.

Sec. Computational Genomics

Lorentz-Regularized Interpretable VAE for Multi-Scale Single-cell Transcriptomic and Epigenomic Embeddings

Provisionally accepted

Zeyu Fu^1*

Chunlin Chen²

Keyang Zhang²

¹Army Medical University, Chongqing, China
²Sun Yat-Sen University, Guangzhou, China

The final, formatted version of the article will be published soon.

Background: Single-cell multi-omics technologies capture cellular heterogeneity at unprecedented resolution, yet dimensionality reduction methods face a fundamental local-global trade-off: approaches optimized for local neighborhood preservation distort global topology, while those emphasizing global coherence obscure fine-grained cell states. Results: We introduce LiVAE (Lorentz-Regularized Variational AutoEncoder), a dual-pathway architecture that applies hyperbolic geometry as soft regularization over standard Euclidean latent spaces. A primary encoding pathway preserves local transcriptional details for high-fidelity reconstruction, while an information bottleneck pathway extracts global hierarchical structure by filtering technical noise. Lorentzian distance constraints enforce geometric consistency between pathways in hyperbolic space, enabling LiVAE to balance local fidelity with global coherence without requiring specialized batch-correction procedures. Systematic benchmarking across 135 datasets against 21 baseline methods demonstrated that LiVAE achieves superior global topology preservation (distance correlation gains: 0.209–0.436), richer latent geometry (manifold dimensionality: 0.123–0.467; participation ratio: 0.149–0.761), and enhanced robustness (noise resilience: 0.184– 0.712) while maintaining competitive local fidelity. Overall embedding quality improved by 0.051–0.284 across UMAP and t-SNE visualizations. Component-wise interpretability analysis on a Dapp1 perturbation dataset revealed biologically meaningful latent axes. Conclusions: LiVAE provides a robust, general-purpose framework for single-cell representation learning that resolves the local-global trade-off through geometric regularization. By maintaining Euclidean latent spaces while leveraging hyperbolic priors, LiVAE enables improved developmental trajectory inference and mechanistic biological discovery without sacrificing compatibility with existing computational ecosystems.

Keywords: Single-cell multi-omics, dual-pathway variational autoencoder, Hyperbolicgeometry, Information bottleneck, Manifold Learning, Interpretable representation

Received: 06 Oct 2025; Accepted: 20 Nov 2025.

Copyright: © 2025 Fu, Chen and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Zeyu Fu, 1119692089@qq.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.