
ORIGINAL RESEARCH article

Front. Genet., 05 January 2026

Sec. Computational Genomics

Volume 16 - 2025 | https://doi.org/10.3389/fgene.2025.1713727

Lorentz-regularized interpretable VAE for multi-scale single-cell transcriptomic and epigenomic embeddings

  • 1State Key Laboratory of Trauma and Chemical Poisoning, Institute of Combined Injury, Chongqing Engineering Research Center for Nanomedicine, College of Preventive Medicine, Army Medical University, Chongqing, China
  • 2Department of Orthopedics, Xinqiao Hospital, Army Medical University, Chongqing, China
  • 3Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
  • 4School of Medicine, Sun Yat-sen University, Shenzhen, China

Background: Single-cell multi-omics technologies capture cellular heterogeneity at unprecedented resolution, yet dimensionality reduction methods face a fundamental local–global trade-off: approaches optimized for local neighborhood preservation distort global topology, while those emphasizing global coherence obscure fine-grained cell states.

Results: We introduce the Lorentz-regularized variational autoencoder (LiVAE), a dual-pathway architecture that applies hyperbolic geometry as soft regularization over standard Euclidean latent spaces. A primary encoding pathway preserves local transcriptional details for high-fidelity reconstruction, while an information bottleneck (BN) pathway extracts global hierarchical structure by filtering technical noise. Lorentzian distance constraints enforce geometric consistency between pathways in hyperbolic space, enabling LiVAE to balance local fidelity with global coherence without requiring specialized batch-correction procedures. Systematic benchmarking across 135 datasets against 21 baseline methods demonstrated that LiVAE achieves superior global topology preservation (distance correlation gains: 0.209–0.436), richer latent geometry (manifold dimensionality: 0.123–0.467; participation ratio: 0.149–0.761), and enhanced robustness (noise resilience: 0.184–0.712) while maintaining competitive local fidelity. The overall embedding quality improved by 0.051–0.284 across uniform manifold approximation and projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE) visualizations. Component-wise interpretability analysis on a Dapp1 perturbation dataset revealed biologically meaningful latent axes.

Conclusion: LiVAE provides a robust, general-purpose framework for single-cell representation learning that resolves the local–global trade-off through geometric regularization. By maintaining Euclidean latent spaces while leveraging hyperbolic priors, LiVAE enables improved developmental trajectory inference and mechanistic biological discovery without sacrificing compatibility with existing computational ecosystems.

1 Introduction

Cellular development unfolds through hierarchical differentiation programs where stem cells progressively commit to specialized fates (Trapnell et al., 2014). Single-cell multi-omics technologies now capture these hierarchies at an unprecedented resolution (Stuart et al., 2019); however, representing tree-like developmental structures computationally remains an open challenge (Luecken et al., 2022). Datasets routinely contain $10^5$–$10^6$ cells spanning dozens of cell types across tissues (Hao et al., 2021), demanding representations that preserve both fine-grained cell states and global developmental trajectories.

This challenge manifests as a fundamental local–global trade-off: representations must capture fine-grained local neighborhoods for rare cell-type detection (Heiser and Lau, 2020; Kiselev et al., 2019) while maintaining global topology for developmental trajectory inference (Cao et al., 2019; Saelens et al., 2019). Despite advances in deep learning (Hu et al., 2021; Yuan and Kelley, 2022) and foundation models (Cui et al., 2024), most methods excel at one scale at the expense of the other. The tension reflects a geometric limitation of Euclidean spaces: methods optimized for local structure, such as t-distributed stochastic neighbor embedding (t-SNE) (Kobak and Berens, 2019) and uniform manifold approximation and projection (UMAP) (McInnes et al., 2018), distort global topology, while those prioritizing global coherence, such as principal component analysis (PCA) and diffusion maps (Moon et al., 2018; Becht et al., 2019), may obscure fine-grained cell states. Graph-based approaches (Hetzel et al., 2021; Nguyen et al., 2022) and gene co-expression modeling (Deng et al., 2025; Li et al., 2025; Song T. et al., 2022) partially address this by explicitly encoding functional relationships, yet systematic benchmarking reveals that most methods still sacrifice one scale for the other (Tian et al., 2019).

These challenges intensify across modalities: single-cell ATAC sequencing (scATAC-seq) exhibits 90%–95% zero rates versus 60%–80% in single-cell RNA sequencing (scRNA-seq) (Chen et al., 2019; Fang et al., 2021), thus requiring flexible architectures that generalize without extensive re-engineering (Song Q. et al., 2022; Zhao et al., 2024). Hyperbolic geometry offers a principled solution as its exponential volume growth naturally accommodates tree-like hierarchies common in developmental biology (Nickel and Kiela, 2017; Chami et al., 2019; Sarkar, 2012; Bronstein et al., 2021). Existing hyperbolic deep learning methods have improved visualization and captured cellular relationships (Tian et al., 2023; Klimovskaia et al., 2020), but they constrain entire latent spaces to hyperbolic manifolds (Mathieu et al., 2019; Nagano et al., 2019), thus sacrificing compatibility with standard neural architectures and downstream analytical tools. The underlying Euclidean limitation is representational: embedding the $N$ nodes of a balanced binary tree without distortion requires $O(N)$ Euclidean dimensions, whereas hyperbolic space requires only $O(\log N)$ dimensions.

We introduce the Lorentz-regularized variational autoencoder (LiVAE), which applies hyperbolic geometry as regularization over standard Euclidean latent representations rather than constraining the latent space itself. LiVAE learns a primary embedding $z \in \mathbb{R}^d$ optimized for reconstruction, while a bottleneck (BN) pathway compresses $z$ to $l_e \in \mathbb{R}^{d_c}$ (where $d_c \ll d$) and reconstructs $l_d \in \mathbb{R}^d$. A geometric loss enforces that Lorentzian distances—computed after projecting to the hyperboloid model (Ganea et al., 2018; Skopek et al., 2019)—between $z$ and $l_d$ remain small, preserving hierarchical structure while maintaining compatibility with downstream tools. This design resolves the local–global tension through complementary objectives: the information bottleneck (Tishby and Zaslavsky, 2015; Strouse and Schwab, 2017) discards technical noise while retaining biological structure, implicitly promoting cross-sample integration (Lopez et al., 2018; Lynch et al., 2023), while the dual reconstruction paths balance local fidelity (primary path from $z$) with global coherence (bottleneck path via $l_d$).

We validate LiVAE on 135 datasets spanning scRNA-seq and scATAC-seq against 21 baseline methods using 12 metrics assessing embedding fidelity, manifold geometry, and robustness. LiVAE consistently achieves superior global topology preservation and noise resilience while remaining competitive on local structure metrics. Component-wise interpretability analysis demonstrates that latent dimensions decompose into biologically meaningful axes corresponding to cell cycle, immune identity, and differentiation programs (Choi et al., 2023; Madrigal et al., 2024).

Our contributions are threefold: (1) architectural innovation: a hybrid design applying hyperbolic regularization to Euclidean representations via information bottlenecks, balancing local fidelity with global coherence; (2) cross-modality generalization: a unified framework handling scRNA-seq and scATAC-seq through modality-appropriate likelihoods without architectural changes; and (3) biological interpretability: latent dimensions aligned with known biological processes that enable mechanistic hypothesis generation beyond black-box embeddings. By resolving the local–global trade-off through geometric regularization, LiVAE provides a flexible foundation for single-cell multi-omics analysis that preserves biological hierarchy without sacrificing compatibility with existing computational workflows.

2 Materials and methods

2.1 Notation

Throughout this section, $X \in \mathbb{R}^{N \times D}$ denotes a batch of $N$ cells with $D$ features, $x_i \in \mathbb{R}^D$ denotes a single cell, and $x_{ij}$ denotes a scalar value (gene or peak $j$ in cell $i$). Latent representations include $Z \in \mathbb{R}^{N \times d}$ (batch) and $z_i \in \mathbb{R}^d$ (single cell). The dimensions are $d$ (latent dimension) and $d_c$ (bottleneck dimension, where $d_c \ll d$). Lorentzian projections are denoted as $z_H, l_{dH} \in \mathbb{H}^{d+1}$ (hyperboloid manifold).

2.2 LiVAE architecture overview

LiVAE is a variational autoencoder that applies Lorentzian geometric regularization across a dual-pathway latent space architecture (Figure 1). The encoder $\phi$ maps the input $x \in \mathbb{R}^D$ to a diagonal Gaussian posterior $q_\phi(z|x) = \mathcal{N}(\mu, \mathrm{diag}(\sigma^2))$. A latent vector $z \in \mathbb{R}^d$ is sampled via reparameterization and processed through two parallel pathways: (1) the primary path uses $z$ directly for reconstruction and geometric comparison; (2) the bottleneck path compresses $z$ to $l_e \in \mathbb{R}^{d_c}$ (where $d_c \ll d$) and expands it back to $l_d \in \mathbb{R}^d$. A shared decoder $\theta$ reconstructs from both representations, yielding $\hat{x}_1 = \theta(z)$ and $\hat{x}_2 = \theta(l_d)$.


Figure 1. LiVAE architecture. Input $x$ is encoded to latent $z$, which is processed via two pathways: (1) primary path: direct Lorentzian projection ($z_H$); (2) bottleneck path: compression to $l_e$ (dimension $d_c \ll d$), expansion to $l_d$, and Lorentzian projection ($l_{dH}$). Shared decoder $\theta$ reconstructs from both $z$ and $l_d$. The total loss combines two reconstruction terms ($\mathcal{L}_{recon1}, \mathcal{L}_{recon2}$), KL divergence ($\mathcal{L}_{KL}$), and geometric loss ($\mathcal{L}_{geo}$) enforcing Lorentzian distance preservation.

The model is trained via the total loss $\mathcal{L}_{total}$ comprising two reconstruction losses ($\mathcal{L}_{recon1}, \mathcal{L}_{recon2}$), Kullback–Leibler (KL) divergence ($\mathcal{L}_{KL}$), and a geometric loss ($\mathcal{L}_{geo}$) that enforces Lorentzian distance preservation between $z$ and $l_d$ after mapping to the hyperbolic space.

2.3 Model architecture

2.3.1 Encoder network

For each cell $i$ in batch $X \in \mathbb{R}^{N \times D}$, the encoder outputs the mean $\mu_i \in \mathbb{R}^d$ and log-variance $\log\sigma_i^2 \in \mathbb{R}^d$ (Equation 1). Latent vectors are sampled as follows:

$$z_i = \mu_i + \sigma_i \odot \epsilon_i, \quad \text{where } \epsilon_i \sim \mathcal{N}(0, I), \tag{1}$$

and $\sigma_i = \exp\left(\frac{1}{2}\log\sigma_i^2\right)$. The encoder consists of a single hidden layer with 128 units, ReLU activation, and layer normalization.
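As a concrete illustration, the encoder and the Equation 1 reparameterization can be sketched in NumPy. This is a simplified stand-in, not the paper's implementation: weights are random, dimensions are toy-sized, and the layer normalization described above is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, params):
    """Single-hidden-layer encoder sketch: ReLU hidden layer, then mu / log-var heads.
    (Layer normalization from the paper is omitted for brevity.)"""
    h = np.maximum(x @ params["W_h"] + params["b_h"], 0.0)  # ReLU hidden layer
    mu = h @ params["W_mu"] + params["b_mu"]                # posterior mean, shape (N, d)
    log_var = h @ params["W_lv"] + params["b_lv"]           # posterior log-variance, (N, d)
    return mu, log_var

def reparameterize(mu, log_var, rng):
    """Equation 1: z_i = mu_i + sigma_i * eps_i with eps_i ~ N(0, I),
    where sigma_i = exp(0.5 * log sigma_i^2)."""
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

# Toy dimensions: D=20 input features, H=16 hidden units, d=4 latent dims, N=8 cells.
D, H, d, N = 20, 16, 4, 8
params = {
    "W_h": rng.normal(size=(D, H)) * 0.1, "b_h": np.zeros(H),
    "W_mu": rng.normal(size=(H, d)) * 0.1, "b_mu": np.zeros(d),
    "W_lv": rng.normal(size=(H, d)) * 0.1, "b_lv": np.zeros(d),
}
x = rng.poisson(2.0, size=(N, D)).astype(float)  # toy count matrix
mu, log_var = encode(x, params)
z = reparameterize(mu, log_var, rng)
```

Sampling through the deterministic transform of an external noise variable keeps the pathway differentiable with respect to the encoder parameters.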

2.3.2 Dual latent pathways and decoder

The bottleneck path applies two linear transformations: compression $L_e = ZW_e + b_e$ (to dimension $d_c$) and expansion $L_d = L_eW_d + b_d$ (back to dimension $d$), where $W_e \in \mathbb{R}^{d \times d_c}$ and $W_d \in \mathbb{R}^{d_c \times d}$, producing compressed representation $L_e \in \mathbb{R}^{N \times d_c}$ and expanded representation $L_d \in \mathbb{R}^{N \times d}$.

The decoder mirrors the encoder architecture, outputting distribution parameters via linear layers followed by softmax normalization. It generates two reconstructions:

• Primary reconstruction: $\hat{X}_1 = \theta(Z)$.

• Bottleneck reconstruction: $\hat{X}_2 = \theta(L_d)$.
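The two linear bottleneck transformations above are compact enough to sketch directly in NumPy, using the paper's default dimensions ($d=10$, $d_c=2$) with random toy weights standing in for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
N, d, dc = 8, 10, 2   # paper defaults: latent d=10, bottleneck dc=2

Z = rng.normal(size=(N, d))                               # sampled latent batch
W_e, b_e = rng.normal(size=(d, dc)) * 0.1, np.zeros(dc)   # compression weights, d -> dc
W_d, b_d = rng.normal(size=(dc, d)) * 0.1, np.zeros(d)    # expansion weights, dc -> d

L_e = Z @ W_e + b_e    # compressed representation, shape (N, dc)
L_d = L_e @ W_d + b_d  # expanded representation, shape (N, d)
```

Although $L_d$ lives back in $\mathbb{R}^d$, it has passed through a $d_c$-dimensional bottleneck, so (up to the bias) it spans at most $d_c$ directions; this forced low rank is what makes the pathway act as a global-structure filter.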

2.4 Loss function

The total loss function (Equation 2) is a weighted sum of four components:

$$\mathcal{L}_{total} = \mathcal{L}_{recon1} + \mathcal{L}_{recon2} + \lambda_{geo}\mathcal{L}_{geo} + \beta\mathcal{L}_{KL}, \tag{2}$$

where $\lambda_{geo} \geq 0$ and $\beta \geq 0$ are hyperparameters balancing the regularization terms.

2.4.1 Reconstruction losses

The reconstruction losses (Equation 3) measure how well each pathway captures the input:

$$\mathcal{L}_{recon1} = -\mathbb{E}_{q_\phi}\left[\log p(X \mid Z)\right], \quad \mathcal{L}_{recon2} = -\mathbb{E}_{q_\phi}\left[\log p(X \mid L_d)\right]. \tag{3}$$

The likelihood $p(X \mid \cdot)$ is modality-specific.

• scRNA-seq: negative binomial (NB) distribution with mean $\mu_{ij}$ (decoder output scaled by the cell library size) and gene-specific dispersion parameter $\theta_j$ to model count overdispersion.

• scATAC-seq: zero-inflated negative binomial (ZINB) with an additional zero-inflation probability $\pi_{ij}$ from a separate decoder head, accounting for excess zeros in chromatin accessibility (90%–95% sparsity vs. 60%–80% in scRNA-seq).

• Alternative: Poisson and zero-inflated Poisson (ZIP) likelihoods are also supported for datasets with minimal overdispersion.

For all likelihoods, the predicted means are obtained as $\mu_{ij} = \mathrm{softmax}(\mathrm{decoder}(\cdot))_{ij} \cdot \sum_{k} x_{ik}$, i.e., the softmax-normalized decoder output is scaled by the cell-wise library size.
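A hedged sketch of the scRNA-seq reconstruction term: the NB log-likelihood below uses the standard mean/dispersion parameterization, and `predicted_means` implements the softmax-times-library-size scaling described above. The random `logits` stand in for actual decoder output, and the function names are illustrative, not the paper's API.

```python
import numpy as np
from scipy.special import gammaln

def nb_log_likelihood(x, mu, theta, eps=1e-8):
    """Negative binomial log-pmf with mean mu and gene-wise dispersion theta."""
    return (gammaln(x + theta) - gammaln(theta) - gammaln(x + 1.0)
            + theta * np.log(theta / (theta + mu) + eps)
            + x * np.log(mu / (theta + mu) + eps))

def predicted_means(decoder_logits, x):
    """mu_ij = softmax(decoder output)_ij * library size of cell i (Section 2.4.1)."""
    e = np.exp(decoder_logits - decoder_logits.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)          # per-cell softmax over genes
    return probs * x.sum(axis=1, keepdims=True)       # scale by cell library size

rng = np.random.default_rng(2)
x = rng.poisson(3.0, size=(5, 12)).astype(float)      # toy count matrix (5 cells, 12 genes)
logits = rng.normal(size=(5, 12))                     # stand-in for decoder output
mu = predicted_means(logits, x)
theta = np.full(12, 10.0)                             # one dispersion parameter per gene
recon = -nb_log_likelihood(x, mu, theta).sum(axis=1).mean()  # L_recon-style term
```

Because of the softmax, the predicted means of each cell sum exactly to that cell's observed library size, so the likelihood models composition rather than sequencing depth.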

2.4.2 Kullback–Leibler divergence

The KL divergence is the standard variational autoencoder (VAE) regularizer that encourages the posterior to approximate a unit Gaussian prior (Equation 4):

$$\mathcal{L}_{KL} = \frac{1}{2Nd}\sum_{i=1}^{N}\sum_{j=1}^{d}\left(\sigma_{ij}^2 + \mu_{ij}^2 - 1 - \log\sigma_{ij}^2\right). \tag{4}$$

2.4.3 Geometric loss

The geometric loss enforces that the bottleneck transformation preserves the hyperbolic geometric structure. Euclidean vectors $z$ and $l_d$ are mapped to the hyperboloid $\mathbb{H}^{d+1}$ (the Lorentzian model of hyperbolic space) via the exponential map at the origin $o = (1, 0, \dots, 0)$ (Equation 5):

$$\exp_o(v) = \begin{cases} \cosh(\|v\|)\,o + \sinh(\|v\|)\,\dfrac{v}{\|v\|}, & \|v\| > 0, \\ o, & v = 0, \end{cases} \tag{5}$$

where $v = (0, v_1, \dots, v_d) \in T_o\mathbb{H}^{d+1}$ is a tangent vector with the first coordinate zero.

The geometric loss is the mean squared Lorentzian distance between paired representations (Equation 6):

$$\mathcal{L}_{geo} = \frac{1}{N}\sum_{i=1}^{N} d_{\mathbb{H}}\left(z_{H,i},\, l_{dH,i}\right)^2, \tag{6}$$

where $z_H = \exp_o(z)$ and $l_{dH} = \exp_o(l_d)$, and the Lorentzian distance is obtained as follows (Equation 7):

$$d_{\mathbb{H}}(u, v) = \operatorname{arccosh}\left(-\langle u, v\rangle_L\right), \tag{7}$$

with the Lorentzian inner product $\langle u, v\rangle_L = -u_0 v_0 + \sum_{k=1}^{d} u_k v_k$. For numerical stability, we use $d_{\mathbb{H}} = \log(2\alpha)$ when $\alpha = -\langle u, v\rangle_L > 10^4$, with clamping $\alpha \geq 1 + 10^{-8}$.
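The machinery of Equations 5–7 is compact enough to sketch directly. This NumPy version includes the clamping and large-$\alpha$ approximation described above; the specific tolerances and function names are illustrative choices, not the paper's code.

```python
import numpy as np

def exp_o(v):
    """Exponential map at the origin o=(1,0,...,0) (Equation 5).
    v has shape (N, d); its hyperboloid image has shape (N, d+1)."""
    norm = np.linalg.norm(v, axis=1, keepdims=True)
    safe = np.maximum(norm, 1e-12)              # exp_o(0) = o handled via the limit
    x0 = np.cosh(norm)                          # time-like coordinate
    xk = np.sinh(norm) * v / safe               # space-like coordinates
    return np.concatenate([x0, xk], axis=1)

def lorentz_inner(u, v):
    """Lorentzian inner product <u,v>_L = -u0*v0 + sum_k uk*vk."""
    return -u[:, 0] * v[:, 0] + np.sum(u[:, 1:] * v[:, 1:], axis=1)

def lorentz_distance(u, v):
    """d_H(u,v) = arccosh(-<u,v>_L) (Equation 7), with the clamping and the
    log(2*alpha) approximation for large alpha described in Section 2.4.3."""
    alpha = np.clip(-lorentz_inner(u, v), 1.0 + 1e-8, None)
    return np.where(alpha > 1e4, np.log(2.0 * alpha), np.arccosh(alpha))

def geometric_loss(z, l_d):
    """Equation 6: mean squared Lorentzian distance between paired representations."""
    return np.mean(lorentz_distance(exp_o(z), exp_o(l_d)) ** 2)

rng = np.random.default_rng(3)
z = rng.normal(size=(6, 4))   # toy latent vectors
```

Every projected point satisfies $\langle x, x\rangle_L = -1$, confirming it lies on the hyperboloid, and the loss vanishes when the two pathways agree.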

2.5 Evaluation metrics

We assess LiVAE performance using 12 metrics organized into four categories, evaluating complementary aspects of representation quality.

2.5.1 Clustering quality metrics

We assess the biological population structure using five standard metrics and one novel metric:

Standard metrics: Normalized mutual information (NMI) and adjusted Rand index (ARI) measure agreement between the predicted clusters and ground-truth cell-type labels, with values near 1 indicating strong correspondence. Average silhouette width (ASW) and the Calinski–Harabasz index (CAL) quantify cluster cohesion and separation (higher is better), while the Davies–Bouldin index (DAV) measures the average cluster similarity (lower is better). These metrics are computed using standard implementations in scikit-learn.

Coupling degree (COR) (Equation 8): We introduce this metric to quantify preservation of interdependent biological programs:

$$\mathrm{COR} = \frac{1}{k(k-1)}\sum_{i=1}^{k}\sum_{j\neq i}|\rho_{ij}|, \tag{8}$$

where $\rho_{ij}$ is the Pearson correlation between latent dimensions $i$ and $j$, and $k$ is the latent space dimensionality. Higher COR values indicate stronger coupling, reflecting coordinated gene expression programs that are essential for continuous differentiation trajectories.
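Equation 8 reduces to the mean absolute off-diagonal entry of the latent correlation matrix. A small NumPy sketch with synthetic data illustrates the two extremes (the toy matrices are illustrative, not benchmark data):

```python
import numpy as np

def coupling_degree(Z):
    """Equation 8: mean absolute off-diagonal Pearson correlation across latent dims."""
    k = Z.shape[1]
    R = np.corrcoef(Z, rowvar=False)              # k x k correlation matrix
    off_diag = np.abs(R[~np.eye(k, dtype=bool)])  # the k*(k-1) off-diagonal entries
    return off_diag.sum() / (k * (k - 1))

rng = np.random.default_rng(4)
Z_indep = rng.normal(size=(500, 5))        # independent dimensions -> COR near 0
t = rng.normal(size=(500, 1))
Z_coupled = np.hstack([t, 2.0 * t, -t])    # perfectly coupled dimensions -> COR = 1
```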

2.5.2 Dimensionality reduction embedding quality metrics

We evaluate how effectively latent representations Z project to interpretable low-dimensional spaces (UMAP and t-SNE) while preserving biological relationships.

Distance correlation $(\rho_{dist})$ (Equation 9) quantifies the preservation of pairwise distance relationships:

$$\rho_{dist} = \rho_{\mathrm{Spearman}}\left(\mathrm{vec}(D_Z), \mathrm{vec}(D_E)\right), \tag{9}$$

where $D_Z$ and $D_E$ are the pairwise Euclidean distance matrices in the latent and embedding spaces, respectively, $\mathrm{vec}(\cdot)$ vectorizes the upper triangle, and $\rho_{\mathrm{Spearman}}$ is the Spearman rank correlation coefficient. Higher values indicate better preservation of the global structure.
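A minimal sketch of Equation 9: SciPy's condensed pairwise distances are exactly the vectorized upper triangle, so the metric is two library calls. The toy "embeddings" below are illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def distance_correlation(Z, E):
    """Equation 9: Spearman rank correlation between the vectorized upper triangles
    of the latent-space and embedding-space pairwise distance matrices."""
    d_latent = pdist(Z)   # condensed (upper-triangle) Euclidean distances
    d_embed = pdist(E)
    rho, _ = spearmanr(d_latent, d_embed)
    return rho

rng = np.random.default_rng(5)
Z = rng.normal(size=(40, 10))          # latent representation
E_scaled = 3.0 * Z                     # distance-preserving map -> rho = 1
E_random = rng.normal(size=(40, 2))    # unrelated coordinates -> rho near 0
```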

Local quality $(Q_{local})$ and global quality $(Q_{global})$ (Equation 10) measure preservation at different scales through co-ranking matrix analysis:

$$Q_{local} = \frac{1}{K_{max}}\sum_{K=1}^{K_{max}} Q_{NX}(K), \quad Q_{global} = \frac{1}{n-1-K_{max}}\sum_{K=K_{max}+1}^{n-1} Q_{NX}(K), \tag{10}$$

where $Q_{NX}(K)$ is the normalized co-ranking quality measure at neighborhood size $K$ and $K_{max}$ is the optimal local neighborhood boundary. Higher values indicate better maintenance of local neighborhoods and large-scale topology, respectively.

Overall embedding quality $(Q_{embed})$ (Equation 11) combines the three components:

$$Q_{embed} = \frac{1}{3}\left(\rho_{dist} + Q_{local} + Q_{global}\right). \tag{11}$$

2.5.3 Intrinsic manifold quality metrics

We characterize geometric properties of the latent manifold through spectral analysis of the covariance matrix $C = \frac{1}{n-1}Z^{T}Z$, with eigenvalues $\lambda_1 \geq \dots \geq \lambda_k$.

Manifold dimensionality $(M_{dim})$ (Equation 12) measures representation compactness:

$$M_{dim} = 1 - \frac{d_{eff} - 1}{k - 1}, \tag{12}$$

where $d_{eff}$ is the number of principal components explaining 95% of the variance. Higher values indicate more efficient encoding.

Spectral decay rate $(S_{decay})$ (Equation 13) quantifies hierarchical structure clarity:

$$S_{decay} = \frac{1}{1 + e^{-|\beta|}} \cdot \frac{\lambda_1}{\sum_{i=1}^{k}\lambda_i}, \tag{13}$$

where $\beta$ is the slope from a log-linear regression on the eigenvalues. Higher values indicate steeper spectrum decay, reflecting clear hierarchical organization.

Participation ratio $(P_{ratio})$ (Equation 14) assesses the balance of variance distribution:

$$P_{ratio} = \frac{1}{k}\cdot\frac{\left(\sum_{i=1}^{k}\lambda_i\right)^2}{\sum_{i=1}^{k}\lambda_i^2}. \tag{14}$$

Higher values indicate more uniform utilization of latent dimensions, thereby preventing dimension collapse.

Anisotropy score $(A_{score})$ (Equation 15) quantifies directional bias strength:

$$A_{score} = \tanh\left(\frac{\log\lambda_1 - \log\left(\lambda_k + \epsilon\right)}{4}\right), \tag{15}$$

where $\epsilon = 10^{-8}$. Higher values indicate stronger directional structure along the dominant axes, which is essential for trajectory representation.

Trajectory directionality $(T_{dir})$ (Equation 16) measures dominance of the primary variation axis:

$$T_{dir} = \frac{\lambda_1}{\sum_{i=2}^{k}\lambda_i + \epsilon}. \tag{16}$$

Higher values indicate a single dominant trajectory, which is characteristic of linear differentiation processes.

Noise resilience $(N_{res})$ (Equation 17) approximates the signal-to-noise ratio:

$$N_{res} = \min\left(\frac{\sum_{i=1}^{2}\lambda_i}{\sum_{i=3}^{k}\lambda_i + \epsilon}\cdot\frac{1}{10},\ 1\right). \tag{17}$$

Higher values indicate robust separation between signal and noise subspaces.

Composite scores.

We define two summary metrics: core intrinsic quality (Equation 18) integrates the fundamental geometric properties,

$$Q_{core} = \frac{1}{4}\left(M_{dim} + S_{decay} + P_{ratio} + A_{score}\right), \tag{18}$$

while overall intrinsic quality (Equation 19) incorporates task-oriented components with weights $(\alpha, \beta, \gamma) = (0.5, 0.3, 0.2)$:

$$Q_{overall} = \alpha Q_{core} + \beta T_{dir} + \gamma N_{res}. \tag{19}$$
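Several of these spectral metrics can be computed directly from the eigenvalues of the latent covariance. The sketch below implements Equations 12, 14, 16, and 17 under the reconstructions given above (the regression-based $S_{decay}$ and $A_{score}$ are omitted) and contrasts an isotropic latent space with one dominated by a single axis:

```python
import numpy as np

def spectral_metrics(Z, eps=1e-8):
    """Eigenvalue-based intrinsic metrics; a sketch of Equations 12, 14, 16, and 17."""
    Zc = Z - Z.mean(axis=0)
    C = Zc.T @ Zc / (Z.shape[0] - 1)                   # covariance matrix
    lam = np.sort(np.linalg.eigvalsh(C))[::-1]         # eigenvalues, descending
    k = lam.size
    cum = np.cumsum(lam) / lam.sum()
    d_eff = int(np.searchsorted(cum, 0.95) + 1)        # components for 95% variance
    m_dim = 1.0 - (d_eff - 1) / (k - 1)                # Eq. 12: manifold dimensionality
    p_ratio = lam.sum() ** 2 / (k * np.sum(lam ** 2))  # Eq. 14: participation ratio
    t_dir = lam[0] / (lam[1:].sum() + eps)             # Eq. 16: trajectory directionality
    n_res = min(lam[:2].sum() / (lam[2:].sum() + eps) / 10.0, 1.0)  # Eq. 17
    return {"m_dim": m_dim, "p_ratio": p_ratio, "t_dir": t_dir, "n_res": n_res}

rng = np.random.default_rng(6)
iso = spectral_metrics(rng.normal(size=(2000, 5)))                       # uniform variance
aniso = spectral_metrics(rng.normal(size=(2000, 5)) * [10, 1, 1, 1, 1])  # dominant axis
```

As expected, the isotropic latent space maximizes the participation ratio, while the anisotropic one trades it for much higher trajectory directionality.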

2.5.4 Batch integration quality

The integration local inverse Simpson index (iLISI) (Equation 20) measures batch mixing quality. For each cell $i$,

$$\mathrm{iLISI}_i = \frac{1}{\sum_{j=1}^{B} p_{ij}^2}, \tag{20}$$

where $B$ is the number of batches and $p_{ij}$ is the proportion of cell $i$'s $k$-nearest neighbors drawn from batch $j$, computed using a Gaussian kernel with bandwidth determined by perplexity. The overall score is $\mathrm{iLISI} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{iLISI}_i$. Higher values (approaching $B$) indicate better batch integration, while values near 1 indicate poor mixing.
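A simplified iLISI sketch: where the metric above weights neighbors with a perplexity-tuned Gaussian kernel, this version uses uniform weights over plain $k$-nearest neighbors, which still reproduces the behavior at the two extremes (fully mixed vs. fully separated batches):

```python
import numpy as np

def ilisi(X, batch_labels, k=15):
    """Simplified iLISI: inverse Simpson index of batch proportions among each cell's
    k nearest neighbors (uniform weights instead of the Gaussian/perplexity kernel)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    np.fill_diagonal(D, np.inf)                                # exclude the cell itself
    labels = np.unique(batch_labels)
    scores = []
    for i in range(X.shape[0]):
        nn = np.argsort(D[i])[:k]                              # k nearest neighbors
        p = np.array([(batch_labels[nn] == b).mean() for b in labels])
        scores.append(1.0 / np.sum(p ** 2))                    # Equation 20
    return float(np.mean(scores))

rng = np.random.default_rng(7)
batches = np.repeat([0, 1], 50)                 # B = 2 batches, 50 cells each
X_mixed = rng.normal(size=(100, 3))             # batches fully interleaved
X_split = X_mixed + batches[:, None] * 50.0     # batch 1 shifted far away
```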

2.6 Datasets and preprocessing

2.6.1 Dataset selection

We curated 135 single-cell datasets from public repositories (Gene Expression Omnibus, GEO): 53 scRNA-seq and 82 scATAC-seq datasets. Raw single-cell count matrices underwent quality control and normalization prior to model training. Both modalities require raw integer counts as input because the model employs count-based likelihood functions.

2.6.2 scRNA-seq preprocessing

The top 5,000 highly variable genes (HVGs) were selected by modeling the mean–variance relationship in count data. For model input, normalized data were obtained by applying a $\log(x+1)$ transformation followed by z-score standardization, with outliers clipped at ±10 standard deviations.

2.6.3 scATAC-seq preprocessing

Term frequency–inverse document frequency (TF-IDF) normalization (Equation 21) was applied:

$$\text{TF-IDF}_{ij} = \mathrm{TF}_{ij} \times \mathrm{IDF}_j \times s, \tag{21}$$

where the term frequency for cell $i$ and peak $j$ is $\mathrm{TF}_{ij} = x_{ij}/\sum_{k} x_{ik}$, the inverse document frequency is $\mathrm{IDF}_j = \log\left(1 + \frac{N}{n_j}\right)$, with $N$ the total number of cells and $n_j$ the number of cells in which peak $j$ is accessible, and $s = 10^4$ is a scaling factor. Highly variable peaks (HVPs) were identified using variance-based selection on the TF-IDF-normalized values, restricted to peaks accessible in 1%–95% of cells. The top 10,000 HVPs were selected as input features.
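Equation 21 in a few lines of NumPy; the toy matrix contrasts a rare peak with a ubiquitous one to show the effect of the IDF weighting:

```python
import numpy as np

def tfidf_normalize(X, scale=1e4):
    """Equation 21: TF-IDF normalization of a cells x peaks count matrix."""
    tf = X / np.maximum(X.sum(axis=1, keepdims=True), 1)   # TF_ij = x_ij / sum_k x_ik
    n_cells = X.shape[0]
    n_accessible = np.maximum((X > 0).sum(axis=0), 1)      # n_j: cells with peak j open
    idf = np.log(1.0 + n_cells / n_accessible)             # IDF_j = log(1 + N / n_j)
    return tf * idf * scale                                # scaled by s = 1e4

# Toy accessibility matrix: peak 0 is rare (1 of 4 cells), peak 1 is ubiquitous.
X = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 1.0],
              [0.0, 1.0]])
T = tfidf_normalize(X)
```

In cell 0, both peaks have the same term frequency, so the rare peak's larger IDF is what lifts its weight above the ubiquitous peak's.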

2.7 Model hyperparameters

LiVAE was configured with the following default hyperparameters. The encoder and decoder networks each contained a single hidden layer of dimension 128. The latent dimension was set to $d=10$ and the information bottleneck dimension to $d_c=2$. The loss weights were primary reconstruction $\lambda_{recon1}=1.0$, bottleneck reconstruction $\lambda_{recon2}=1.0$, KL divergence $\beta=1.0$, and geometric loss $\lambda_{geo}=5.0$. For reconstruction losses, we employed the NB likelihood for scRNA-seq data and the ZINB likelihood for scATAC-seq data. Training used the Adam optimizer with a learning rate of $1\times10^{-4}$ and a batch size of 128. Gradient clipping with a threshold of 1.0 was applied, and layer normalization was employed in both the encoder and decoder networks.

2.8 Baseline methods

We compared LiVAE against 21 methods spanning four categories:

• Classical dimensionality reduction (seven methods): PCA, kernel PCA (KPCA), factor analysis (FA), non-negative matrix factorization (NMF), independent component analysis (ICA), truncated SVD (TSVD), and dictionary learning (DICL).

• Deep generative models (eight methods, in addition to the standard VAE evaluated as a foundational baseline in Section 3.1): β-variational autoencoder (β-VAE), total correlation β-VAE (β-TCVAE), disentangled inferred prior VAE (DIPVAE), information maximizing VAE (InfoVAE), single-cell variational inference (scVI), single-cell deep clustering (scDeepCluster), single-cell deep hyperbolic manifold learning (scDHMap), and single-cell trajectory optimization (scTour).

• Graph and contrastive learning (three methods): contrastive learning for scRNA-seq (CLEAR), single-cell graph neural network (scGNN), and single-cell graph contrastive clustering (scGCC).

• Modality-specific methods (three methods): latent semantic indexing (LSI), peak variational inference (PeakVI), and Poisson variational inference (PoissonVI) for scATAC-seq.

2.9 Statistical analysis

We used paired experimental designs (identical datasets for all methods). For each metric, normality was assessed using the Shapiro–Wilk test ($\alpha = 0.05$). Multi-method comparisons employed repeated-measures analysis of variance (ANOVA) for normally distributed data or the Friedman test otherwise, followed by Tukey honest significant difference (HSD) or Bonferroni-corrected Wilcoxon signed-rank post hoc tests, respectively. Significance levels were *$p<0.05$, **$p<0.01$, and ***$p<0.001$.
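The non-normal branch of this workflow can be sketched with SciPy. This is a simplified illustration: the repeated-measures ANOVA/Tukey branch is omitted, and comparing method 0 against all others (rather than all pairs) is an illustrative choice.

```python
import numpy as np
from scipy import stats

def compare_methods(scores, alpha=0.05):
    """Paired-comparison sketch of Section 2.9 (non-normal branch): Shapiro-Wilk
    normality check on per-dataset differences, Friedman omnibus test, then
    Bonferroni-corrected Wilcoxon signed-rank post hoc tests of the first method
    (column 0) against every other method."""
    diffs = scores[:, 1:] - scores[:, [0]]
    all_normal = all(stats.shapiro(d)[1] > alpha for d in diffs.T)
    omnibus_p = stats.friedmanchisquare(*scores.T)[1]
    n_tests = scores.shape[1] - 1
    posthoc_p = [min(stats.wilcoxon(scores[:, 0], scores[:, j])[1] * n_tests, 1.0)
                 for j in range(1, scores.shape[1])]
    return all_normal, omnibus_p, posthoc_p

# Toy paired design: 30 datasets, 3 methods; method 0 consistently scores higher.
rng = np.random.default_rng(8)
base = rng.normal(size=30)
scores = np.column_stack([
    base + 0.5,
    base + rng.normal(scale=0.05, size=30),
    base + rng.normal(scale=0.05, size=30) - 0.2,
])
normal, omnibus_p, posthoc_p = compare_methods(scores)
```

Because every method is evaluated on the same datasets, the paired tests factor out between-dataset variability, which is why the consistent 0.5-unit advantage is detected so sharply here.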

3 Results

3.1 Architectural progression from foundational VAEs yields comprehensive performance gains

We benchmarked LiVAE against its foundational predecessors—standard VAE and information bottleneck VAE (iVAE)—using 135 datasets (53 scRNA-seq and 82 scATAC-seq). LiVAE’s complete architecture established a new performance baseline, delivering statistically significant improvements across nearly all metrics for both scRNA-seq (Figure 2A) and scATAC-seq datasets (Figure 2B; Table 1).


Figure 2. Progressive architectural enhancements yield consistent performance gains. Boxplots display performance differences (Δ = LiVAE − baseline) across five evaluation categories. Statistical significance was assessed using Tukey’s HSD post hoc test for ASW and CAL (scRNA-seq), $Q_{local}$ (UMAP), and overall intrinsic quality (scATAC-seq); Bonferroni-corrected Wilcoxon signed-rank tests were applied for all other metrics. Boxes indicate the median and IQR; whiskers extend to 1.5× IQR. Statistical methods are indicated in subplot titles. (A) scRNA-seq datasets (n=53). (B) scATAC-seq datasets (n=82).


Table 1. Performance differences between LiVAE and baseline VAE models across scRNA-seq (n=53) and scATAC-seq (n=82) datasets. Values represent absolute differences (Δ = LiVAE − baseline). Significance levels: ***p<0.001, **p<0.01, and *p<0.05.

The most striking advantage was a profound increase in model robustness. For scRNA-seq (Figure 2A), LiVAE boosted noise resilience by Δ=0.678 (vs. VAE) and Δ=0.526 (vs. iVAE), translating to superior overall intrinsic quality (Δ=0.403 and Δ=0.222, respectively; all p<0.001). Comparable patterns were obtained for scATAC-seq (Figure 2B), with noise resilience gains of Δ=0.486 and Δ=0.350, and overall intrinsic quality improvements of Δ=0.330 and Δ=0.177.

This enhanced robustness stemmed from LiVAE’s geometrically expressive latent space. For scRNA-seq, the participation ratio increased by Δ=0.467 (vs. VAE) and Δ=0.149 (vs. iVAE), while manifold dimensionality, spectral decay, and anisotropy improved by Δ=0.356/0.123, Δ=0.170/0.078, and Δ=0.287/0.153, respectively. Similar gains were obtained for scATAC-seq (Table 1). These structural improvements enabled faithful visualizations, with UMAP distance correlation increasing by Δ=0.251/0.062 (scRNA-seq) and Δ=0.217/0.065 (scATAC-seq). Clustering performance improved dramatically; for scRNA-seq, the Calinski–Harabasz index (CAL) increased by Δ=1270.811 (vs. VAE) and Δ=718.553 (vs. iVAE), while the coupling degree (COR) increased by Δ=2.802 and Δ=1.395 (all p<0.001), indicating stronger preservation of coordinated biological programs. Collectively, these results demonstrate that LiVAE’s components—bottleneck, dual-pathway loss, and Lorentz regularization—synergistically create a more stable, powerful, and biologically informative model.

3.2 Balanced profile of local fidelity and global structure against classical methods

We benchmarked LiVAE against seven classical algorithms on n=53 scRNA-seq datasets. While some classical methods showed competitive local neighborhood preservation, LiVAE provided superior global coherence, manifold complexity, and robustness (Table 2).


Table 2. Performance differences between LiVAE and classical dimensionality reduction methods across n=53 scRNA-seq datasets. Values represent absolute differences (Δ = LiVAE − baseline). Significance levels: ***p<0.001, **p<0.01, and *p<0.05.

LiVAE’s UMAP local quality (Figure 3A) was statistically equivalent to those of PCA, KPCA, FA, NMF, and TSVD but slightly lower than that of dictionary learning (DICL; Δ=0.036, p<0.001) and ICA (Δ=0.072). However, LiVAE demonstrated massive advantages in global structure, with UMAP distance correlation surpassing all methods by Δ=0.209 (vs. TSVD) to Δ=0.436 (vs. ICA), yielding overall UMAP quality gains of Δ=0.111 to Δ=0.228 (all p<0.001). Parallel trends emerged for t-SNE (Figure 3B), with distance correlation improvements of Δ=0.172 to Δ=0.383 and overall quality gains of Δ=0.084 to Δ=0.191.


Figure 3. Balanced local–global performance relative to classical dimensionality reduction. Boxplots summarize metric distributions for LiVAE and seven classical baselines across n=53 scRNA-seq datasets. Boxes indicate the median and IQR; whiskers extend to 1.5× IQR. Statistical significance was assessed using Bonferroni-corrected Wilcoxon tests (α=0.0018). (A) UMAP embedding fidelity. (B) t-SNE embedding fidelity. (C) Latent manifold structure. (D) Intrinsic quality and robustness.

This superior organization reflected a sophisticated latent space (Figure 3C). All four manifold metrics improved substantially: dimensionality (Δ=0.241 to Δ=0.467), spectral decay (Δ=0.124 to Δ=0.258), participation ratio (Δ=0.310 to Δ=0.761), and anisotropy (Δ=0.181 to Δ=0.494). Critically, noise resilience exceeded that of all classical methods (Δ=0.620 to Δ=0.712; Figure 3D), culminating in overall intrinsic quality gains of Δ=0.314 to Δ=0.531 (all p<0.001). Thus, LiVAE delivers a balanced, robust solution for single-cell exploratory analysis.

3.3 Competitive edge in stability and manifold quality against state-of-the-art generative models

We assessed LiVAE against eight state-of-the-art deep generative models on n=53 scRNA-seq datasets, confirming it as a top-tier general-purpose embedding method distinguished by exceptional stability and global latent-space integrity (Table 3).


Table 3. Performance differences between LiVAE and advanced generative and scRNA-seq-specialized methods across n=53 scRNA-seq datasets. Values represent absolute differences (Δ = LiVAE − baseline). Significance levels: ***p<0.001, **p<0.01, and *p<0.05.

While trajectory-focused models such as scTour achieved UMAP distance correlation (Figure 4A) statistically equivalent to LiVAE’s (Δ=0.020, n.s.), LiVAE consistently outperformed the majority across nearly all metrics. Its paramount advantage was robustness: noise resilience exceeded all eight competitors by Δ=0.184 (vs. scTour) to Δ=0.709 (vs. DIPVAE; Figure 4D), fostering overall intrinsic quality gains of Δ=0.049 to Δ=0.477 (all $p \leq 0.01$).


Figure 4. Enhanced global fidelity and stability versus advanced deep generative models. Boxplots compare LiVAE with eight state-of-the-art baselines across n=53 scRNA-seq datasets. Boxes indicate the median and IQR; whiskers extend to 1.5× IQR. Statistical significance was assessed using Bonferroni-corrected Wilcoxon tests (α=0.0014). (A) UMAP embedding fidelity. (B) t-SNE embedding fidelity. (C) Latent manifold structure. (D) Intrinsic quality and robustness.

These strengths translated into superior geometric organization. LiVAE delivered UMAP overall quality improvements of Δ=0.072 to Δ=0.284 (Figure 4A), with distance correlation gains of Δ=0.176 to Δ=0.453 (except scTour). The participation ratio consistently exceeded that of all methods (Δ=0.013 to Δ=0.698; Figure 4C), indicating richer manifold complexity. While specialized models such as scDHMap yielded higher anisotropy (Δ=0.213), reflecting trajectory optimization, LiVAE provided a state-of-the-art balance of local fidelity (Figures 4A,B), global structure, and best-in-class robustness, making it ideal for broad single-cell data exploration.

3.4 Implicit geometric regularization is comparable to explicit graph-based architectures

We benchmarked LiVAE against three prominent graph-aware models (CLEAR, scGNN, and scGCC) on n=53 scRNA-seq datasets. LiVAE’s implicit geometric regularization matched or exceeded explicit graph-based methods, particularly in global fidelity and robustness (Table 4).


Table 4. Performance differences between LiVAE and graph-based deep learning methods across n=53 scRNA-seq datasets. Values represent absolute differences (Δ = LiVAE − baseline). Significance levels: ***p<0.001, **p<0.01, and *p<0.05.

Despite the graph-based models’ design for local structure, LiVAE proved highly competitive, significantly outperforming CLEAR (Δ=0.324), scGNN (Δ=0.082), and scGCC (Δ=0.048) in UMAP local quality (all p<0.001; Figure 5A). LiVAE established commanding leads in global embedding fidelity, with UMAP distance correlation improvements of Δ=0.398 to Δ=0.491, yielding overall UMAP quality gains of Δ=0.208 to Δ=0.338. The t-SNE results (Figure 5B) paralleled these findings (Δ=0.176 to Δ=0.375 for overall quality).


Figure 5. Strong embedding quality and robustness without explicit graph regularization. Boxplots compare LiVAE with three graph-based baselines across n=53 scRNA-seq datasets. Boxes indicate the median and IQR; whiskers extend to 1.5× IQR. Statistical significance was assessed using Bonferroni-corrected Wilcoxon tests (α=0.0083). (A) UMAP embedding fidelity. (B) t-SNE embedding fidelity. (C) Latent manifold structure. (D) Intrinsic quality and robustness.

Model stability was exceptional, with noise resilience substantially higher than that of all comparators (Δ=0.545 to Δ=0.722, all p<0.001; Figure 5D). Manifold complexity (Figure 5C) surpassed CLEAR across all metrics; comparisons with scGCC for dimensionality (Δ=0.281) and the participation ratio (Δ=0.348) showed numerical advantages that did not reach statistical significance. While scGNN produced higher anisotropy (Δ=0.146), LiVAE achieved superior overall intrinsic quality (Δ=0.197 to Δ=0.495; Figure 5D), demonstrating that Lorentz-regularized information bottlenecks provide a powerful alternative to graph-based regularization.
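The Bonferroni-corrected Wilcoxon procedure cited in the figure captions can be sketched as follows; the helper is illustrative, applying a corrected threshold of α/m for m paired comparisons (e.g., 0.05/6 ≈ 0.0083):

```python
import numpy as np
from scipy.stats import wilcoxon

def paired_wilcoxon_bonferroni(scores_a, baselines, alpha=0.05):
    """Paired two-sided Wilcoxon signed-rank tests of method A against
    each baseline on per-dataset metric values, with a Bonferroni-
    corrected threshold alpha / m for m comparisons.

    scores_a: array of length n (one metric value per dataset).
    baselines: {name: array of length n}.
    Returns {name: (mean_delta, p_value, significant)}.
    """
    alpha_corr = alpha / len(baselines)
    out = {}
    for name, b in baselines.items():
        _, p = wilcoxon(scores_a, b)  # paired test on per-dataset differences
        out[name] = (float(np.mean(scores_a - b)), float(p), bool(p < alpha_corr))
    return out
```

The paired design matters here: each of the n=53 datasets yields one score per method, so the test compares per-dataset differences rather than pooled distributions.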

3.5 Versatile and robust performance on chromatin accessibility data

We evaluated LiVAE on scATAC-seq data against three specialized methods (LSI, PeakVI, and PoissonVI) across n=82 datasets, demonstrating its cross-modality versatility and unique strengths in capturing the global structure and trajectory information (Table 5).


Table 5. Performance differences between LiVAE and scATAC-seq-specialized methods across n=82 scATAC-seq datasets. Values represent absolute differences (Δ; LiVAE − baseline). Significance levels: ***p<0.001, **p<0.01, and *p<0.05.

Against LSI and PeakVI, LiVAE was unequivocally superior across all categories. Overall intrinsic quality (Figure 6D) improved by Δ=0.167 and Δ=0.224, driven by noise resilience gains of Δ=0.311 and Δ=0.354, confirming robustness to sparse chromatin data. Manifold structure (Figure 6C) exceeded both methods across all four metrics: dimensionality (Δ=0.133/0.170), spectral decay (Δ=0.079/0.103), participation ratio (Δ=0.142/0.250), and anisotropy (Δ=0.151/0.161, all p<0.001).


Figure 6. Comparison with scATAC-seq-specialized methods. Boxplots compare LiVAE with LSI, PeakVI, and PoissonVI across n=82 scATAC-seq datasets. Boxes indicate the median and IQR; whiskers extend to 1.5× IQR. Statistical significance was assessed using Bonferroni-corrected Wilcoxon tests (α=0.0083). (A) UMAP embedding fidelity. (B) t-SNE embedding fidelity. (C) Latent manifold structure. (D) Intrinsic quality and robustness.

The comparison against PoissonVI revealed LiVAE’s complementary strengths. LiVAE achieved higher core intrinsic quality (Δ=0.174) and dramatically superior trajectory directionality (Δ=0.321; Figure 6D), suggesting better capture of continuous biological processes. Its geometric regularization provided superior global structure (UMAP distance correlation Δ=0.155; Figure 6A) and overall intrinsic quality (Δ=0.264), alongside exceptional noise resilience (Δ=0.404, all p<0.001). These results confirm LiVAE as a powerful tool for scATAC-seq analysis, offering compelling advantages for studies prioritizing global landscape understanding and developmental trajectory inference.

3.6 Dual loss pathways provide complementary representational benefits

We performed ablation studies on n=53 scRNA-seq datasets, systematically removing either the main reconstruction or the bottleneck pathway under both BN and Lorentz-regularized configurations. The two pathways perform highly specialized, non-redundant functions, with the full model decisively outperforming any simplified variant (Table 6).


Table 6. Performance differences for dual-pathway ablations under bottleneck-only (BN) and Lorentz-regularized (Lorentz) configurations across n=53 scRNA-seq datasets. Values represent absolute differences (Δ; full model − ablation). All differences are derived from paired comparisons; formal significance testing was not applied.

Removing the bottleneck pathway (“w/o BN”) caused catastrophic collapse in geometric integrity. Overall intrinsic quality fell by Δ=0.186 (BN; Figure 7A) and Δ=0.185 (Lorentz; Figure 7B), while the participation ratio decreased by Δ=0.355/0.356. UMAP distance correlation declined by Δ=0.198/0.208, with overall quality decreasing by Δ=0.115/0.119. Remarkably, supervised clustering (NMI/ARI) declined minimally (Δ≤0.01), demonstrating the bottleneck’s primary role in establishing geometric robustness rather than defining discrete clusters.


Figure 7. Ablation analysis of dual loss pathways. (A) Bottleneck-only configuration (BN): full model vs. w/o main and w/o BN, n=53 scRNA-seq datasets. (B) Lorentz-regularized configuration (Lorentz): full model vs. w/o main and w/o BN, n=53 scRNA-seq datasets.

Conversely, removing the main pathway (“w/o main”) produced an inverted deficiency profile. Supervised clustering accuracy collapsed (NMI/ARI Δ≥0.12; CAL Δ=635–724), while intrinsic quality (Δ=0.018/0.047) and manifold structure worsened less severely. This indicates that the main pathway refines the latent space for categorical label separation, operating upon the bottleneck’s stable geometric foundation. These complementary roles confirm that both pathways are indispensable for state-of-the-art performance.

3.7 A deterministic anchor point fortifies geometric regularization

To establish the most effective method for applying Lorentz-distance regularization, we contrasted two strategies: anchoring the calculation to the deterministic information bottleneck (BN) versus using two independently sampled posterior views (Views). Evaluation on scRNA-seq (n=53) and scATAC-seq (n=82) datasets revealed that the bottleneck-anchored strategy was superior across all metric categories (Figures 8A,B; Table 7).


Figure 8. Comparison of Lorentz-regularization strategies. (A) scRNA-seq (n=53): bottleneck-anchored (BN) vs. two-view sampled (Views). (B) scATAC-seq (n=82): BN vs. Views.


Table 7. Performance differences between bottleneck-anchored (BN) and two-view sampled (Views) Lorentz regularization across scRNA-seq (n=53) and scATAC-seq (n=82) datasets. Values represent absolute differences (Δ; BN − Views). Significance levels: ***p<0.001, **p<0.01, and *p<0.05.

The BN approach’s most profound impact was on model robustness and manifold quality. Noise resilience increased dramatically (Δ=0.225 for scRNA-seq; Δ=0.140 for scATAC-seq), driving overall intrinsic quality gains of Δ=0.120 and Δ=0.108, respectively. This stability was mirrored in latent geometry: participation ratio improved by Δ=0.109 and Δ=0.142, respectively; manifold dimensionality improved by Δ=0.075 for both modalities; and trajectory directionality improved by Δ=0.118 and Δ=0.116, respectively. These structural gains translated to better embedding fidelity (UMAP overall quality: Δ=0.025 and Δ=0.040; t-SNE: Δ=0.013 and Δ=0.039), improved cluster compactness (CAL: Δ=405.759 and Δ=164.000), and enhanced latent dimension coupling (COR: Δ=0.736 and Δ=0.743), indicating stronger preservation of coordinated biological programs. The average silhouette width showed minimal change for scRNA-seq (Δ=0.002) but notable improvement for scATAC-seq (Δ=0.023).

These results indicate that anchoring Lorentz regularization to a fixed, deterministic reference point mitigates training instability inherent in stochastic sampling. The bottleneck provides a consistent geometric scaffold, enabling more coherent latent organization across all analysis modalities.
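The excerpt does not spell out the Lorentz-distance computation itself. A plausible sketch, assuming Euclidean latents are lifted onto the unit hyperboloid (Lorentz model, curvature −1) and the bottleneck code acts as the deterministic anchor of a pairwise distance-consistency penalty (the function names and the exact penalty form are illustrative, not the paper's verbatim loss):

```python
import numpy as np

def lift_to_hyperboloid(x):
    """Lift Euclidean points (n, d) onto the Lorentz hyperboloid
    {u : -u0^2 + ||u_1:||^2 = -1, u0 > 0} by setting the time-like
    coordinate u0 = sqrt(1 + ||x||^2)."""
    x0 = np.sqrt(1.0 + (x ** 2).sum(axis=1, keepdims=True))
    return np.concatenate([x0, x], axis=1)

def lorentz_distance(u, v, eps=1e-7):
    """Geodesic distance on the hyperboloid: arccosh(-<u, v>_L), where
    <u, v>_L = -u0*v0 + sum_k uk*vk is the Lorentzian inner product."""
    inner = -u[:, 0] * v[:, 0] + (u[:, 1:] * v[:, 1:]).sum(axis=1)
    return np.arccosh(np.clip(-inner, 1.0 + eps, None))

def lorentz_consistency_penalty(z, le):
    """Hypothetical bottleneck-anchored geometric penalty: encourage
    pairwise Lorentz distances among full latents z to match those
    among the deterministic bottleneck codes le."""
    i, j = np.triu_indices(len(z), k=1)
    dz = lorentz_distance(lift_to_hyperboloid(z)[i], lift_to_hyperboloid(z)[j])
    dl = lorentz_distance(lift_to_hyperboloid(le)[i], lift_to_hyperboloid(le)[j])
    return float(np.mean((dz - dl) ** 2))
```

Because the anchor `le` is deterministic, the distance targets do not fluctuate with posterior sampling, which is consistent with the stability argument made above.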

3.8 Optimizing data fidelity through modality-aware reconstruction

Selecting an appropriate reconstruction loss is crucial for modeling distinct single-cell assay properties. We benchmarked four likelihoods—NB, ZINB, Poisson, and ZI-Poisson—on scRNA-seq (n=53) and scATAC-seq (n=82) datasets (Figures 9A,B; Table 8).


Figure 9. Evaluation of reconstruction likelihood functions. (A) scRNA-seq (n=53): LiVAE with NB, ZINB, Poisson, and ZI-Poisson. (B) scATAC-seq (n=82): LiVAE with NB, ZINB, Poisson, and ZI-Poisson.


Table 8. Performance differences for reconstruction likelihood functions across scRNA-seq (n=53) and scATAC-seq (n=82) datasets. Values represent absolute differences (Δ; optimal loss − alternative). For scRNA-seq: NB vs. others; for scATAC-seq: ZINB vs. others. All differences are derived from paired comparisons; formal significance testing was not applied.

For scRNA-seq, NB provided the most robust performance (Figure 9A). Its primary advantage was enhanced noise resilience (Δ=0.100 to Δ=0.186 vs. alternatives) and overall intrinsic quality (Δ=0.047 to Δ=0.056), translating to superior cluster compactness (CAL: Δ up to 768.543) and stronger latent dimension coupling (COR: Δ up to 0.624), reflecting better preservation of interdependent biological programs. Embedding fidelity showed consistent but modest improvements (UMAP overall quality: Δ=0.024 to Δ=0.046). While Poisson achieved marginal advantages in trajectory directionality (Δ=0.010) and spectral decay (Δ=0.003), NB’s robustness benefits outweighed these task-specific trade-offs. The NB distribution effectively models the overdispersion characteristic of gene expression without requiring explicit zero-inflation handling.

For scATAC-seq, ZINB was superior (Figure 9B), driven by better noise resilience (Δ up to 0.196) and manifold structure (participation ratio: Δ=0.063 vs. NB). ZINB achieved stronger latent dimension coupling (COR: Δ=0.632 vs. NB), indicating better preservation of coordinated regulatory programs and higher overall intrinsic quality (Δ=0.084 vs. NB, Δ=0.105 vs. Poisson). This advantage reflects ZINB’s ability to explicitly model both overdispersion and the extreme zero-inflation inherent in sparse chromatin accessibility data, making it essential for accurate scATAC-seq representation.
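For reference, the ZINB likelihood mixes a point mass at zero with a negative binomial. A numerically stable negative log-likelihood in the common mean/inverse-dispersion/dropout-logit parameterization (assumed here for illustration; the paper's exact formulation is not shown in this excerpt) can be written as:

```python
import numpy as np
from scipy.special import gammaln

def zinb_nll(x, mu, theta, pi_logit, eps=1e-8):
    """Negative log-likelihood of a zero-inflated negative binomial.

    x: observed counts; mu: NB mean; theta: NB inverse dispersion;
    pi_logit: logit of the dropout (zero-inflation) probability.
    Parameterization assumed for illustration (scVI-style).
    """
    softplus = lambda t: np.logaddexp(0.0, t)
    log_theta_mu = np.log(theta + eps) - np.log(theta + mu + eps)
    # log NB(x | mu, theta) via the gamma-function form
    log_nb = (gammaln(x + theta) - gammaln(theta) - gammaln(x + 1.0)
              + theta * log_theta_mu
              + x * (np.log(mu + eps) - np.log(theta + mu + eps)))
    log_nb0 = theta * log_theta_mu              # log NB(0 | mu, theta)
    # zeros: log(pi + (1 - pi) * NB(0));  nonzeros: log(1 - pi) + log NB(x)
    case_zero = np.logaddexp(pi_logit, log_nb0) - softplus(pi_logit)
    case_nonzero = log_nb - softplus(pi_logit)  # log(1 - pi) = -softplus(logit)
    return -np.where(x < 0.5, case_zero, case_nonzero)
```

Setting the dropout logit to a large negative value recovers the plain NB loss, which makes the NB-vs-ZINB comparison above a nested one: ZINB can only help when excess zeros are genuinely present, as in sparse chromatin data.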

3.9 Robustness and hyperparameter stability

We evaluated LiVAE’s sensitivity to key hyperparameters on n=53 scRNA-seq datasets: Lorentz-regularization weight (λ ∈ {1, 5, 10}), bottleneck dimensionality (dBN ∈ {2, 4, 6, 8, 10}), and latent dimensionality (dlatent ∈ {5, 10, 15, 20}), visualized in Figures 10A–C and quantified in Table 9.


Figure 10. Hyperparameter sensitivity analysis. (A) Lorentz-regularization weight (λ ∈ {1, 5, 10}) on n=53 scRNA-seq datasets. (B) Information bottleneck dimensionality (dBN ∈ {2, 4, 6, 8, 10}) on n=53 scRNA-seq datasets. (C) Latent dimensionality (dlatent ∈ {5, 10, 15, 20}) on n=53 scRNA-seq datasets.


Table 9. Performance differences for hyperparameter ablations across n=53 scRNA-seq datasets. Values represent absolute differences (Δ; optimal setting − alternative). Lorentz weight: λ=10 vs. others; bottleneck dim.: dBN=2 vs. others; latent dim.: dlatent=10 vs. others. All differences are derived from paired comparisons; formal significance testing was not applied.

Increasing λ from 1 to 10 consistently improved performance (Figure 10A), particularly overall intrinsic quality (Δ=0.197), driven by gains in noise resilience (Δ=0.353), trajectory directionality (Δ=0.205), and the participation ratio (Δ=0.168). Clustering compactness (CAL: Δ=871.749) and embedding fidelity (UMAP overall quality: Δ=0.049) also improved. Moving from λ=5 to λ=10 showed diminishing but positive returns across most metrics, suggesting that stronger regularization is generally beneficial.

The bottleneck dimensionality analysis revealed that dBN=2 achieved optimal performance in unsupervised metrics (Figure 10B). Compared to dBN=10, it improved clustering compactness (CAL: Δ=559.683), embedding quality (UMAP: Δ=0.054; t-SNE: Δ=0.049), and manifold structure (participation ratio: Δ=0.110), with minimal impact on supervised clustering (NMI: Δ=0.003). This tight bottleneck effectively isolates core biological signals while filtering technical noise.

Latent dimensionality presented a clear trade-off (Figure 10C). The dlatent=10 setting balanced supervised clustering accuracy (NMI: Δ=0.101 vs. d=5; Δ=0.037 vs. d=20) with unsupervised compactness. Lower dimensions (d=5) underfit the complex structure, while higher dimensions (d=20) reduced cluster compactness (CAL: Δ=969.863 vs. d=10) and weakened latent dimension coupling (COR: Δ=2.942), suggesting diminished preservation of coordinated biological programs.

These findings support the default settings of λ=10, dBN=2, and dlatent=10 for typical analyses, with dlatent adjustable based on the dataset complexity.
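The recommended defaults could be bundled into a single configuration object; the sketch below is hypothetical, mirroring the settings above rather than LiVAE's actual API:

```python
from dataclasses import dataclass

@dataclass
class LiVAEConfig:
    """Hypothetical configuration bundling the defaults recommended by
    the sensitivity analysis; field names are illustrative and do not
    reflect the package's actual API."""
    lorentz_weight: float = 10.0  # lambda: stronger regularization was beneficial
    bottleneck_dim: int = 2       # d_BN: tight bottleneck filters technical noise
    latent_dim: int = 10          # d_latent: balances accuracy and interpretability
    likelihood: str = "nb"        # "nb" for scRNA-seq; "zinb" for scATAC-seq
```

Only `latent_dim` is flagged above as dataset-dependent, so it is the natural field to override, for example `LiVAEConfig(latent_dim=20)` for unusually complex atlases.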

3.10 Emergent batch correction and robust clustering performance

LiVAE’s information bottleneck and geometric regularization promote globally coherent embeddings that can disentangle biological signals from batch effects without explicit batch-correction terms. We benchmarked multi-batch scRNA-seq integration against scVI, scDHMap, scDeepCluster, scGNN, and scGCC on 21 multi-batch datasets.

UMAP visualizations of five representative datasets show well-mixed embeddings that preserve biological structure (Figure 11A). Quantitative iLISI evaluation across the full set of 21 datasets, with subsamplings of 2,000–8,000 cells, revealed that LiVAE achieves batch mixing comparable to that of specialized methods (Figure 11B).


Figure 11. Batch integration and supervised clustering across multi-batch scRNA-seq datasets. (A) Representative UMAP embeddings from LiVAE and five comparison methods across five multi-batch datasets, colored by batch. (B) iLISI evaluation across downsampled cell counts (2,000–8,000); LiVAE achieves comparable batch mixing to specialized methods across n=21 datasets. (C) Supervised clustering accuracy (ARI and NMI) using four combinations of pre- and post-clustering algorithms: K-means/K-means (K–K), K-means/Leiden (K–L), Leiden/K-means (L–K), and Leiden/Leiden (L–L).

Supervised clustering evaluation using four pipelines—combining K-means or Leiden for pre- and post-integration clustering (denoted as K–K, K–L, L–K, and L–L)—showed mixed results (Figure 11C; Table 10). LiVAE substantially outperformed scDeepCluster (ARI: Δ=0.101 to Δ=0.167) and scGCC (Δ=0.117 to Δ=0.180) across most pipelines. However, scVI achieved superior accuracy in Leiden-based strategies (ARI: Δ=−0.095 to −0.091; NMI: Δ=−0.051 to −0.041).


Table 10. Paired differences in supervised clustering across four clustering strategies evaluated on multi-batch single-cell datasets (n=21 experimental conditions from five datasets). Values represent absolute differences (Δ; LiVAE − comparator). Significance levels: *p<0.05, **p<0.01, and ***p<0.001. Negative values with markers indicate significant superiority of the comparator; positive values with markers indicate significant superiority of LiVAE. Absence of markers indicates non-significant differences.

These findings demonstrate that LiVAE provides competitive batch integration and stable clustering without specialized batch parameters. Although dedicated batch-correction methods may be preferred for datasets with extreme confounding, LiVAE offers a versatile, general-purpose solution for integrated single-cell analysis.
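The K-means arm of the four-pipeline evaluation can be sketched with scikit-learn; the snippet is illustrative (the Leiden arms would additionally require a kNN graph and the leidenalg package):

```python
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

def kk_clustering_scores(emb_pre, emb_post, labels, n_clusters, seed=0):
    """Sketch of the 'K-K' pipeline: K-means on the pre-integration
    embedding and again on the post-integration embedding, each scored
    against ground-truth labels with ARI and NMI."""
    scores = {}
    for stage, emb in (("pre", emb_pre), ("post", emb_post)):
        pred = KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=seed).fit_predict(emb)
        scores[stage] = {
            "ARI": adjusted_rand_score(labels, pred),
            "NMI": normalized_mutual_info_score(labels, pred),
        }
    return scores
```

Comparing the "pre" and "post" scores per dataset is what makes the paired differences in Table 10 possible, since every method is evaluated on identical ground-truth labels.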

3.11 Biological interpretability of latent components on a Dapp1 perturbation dataset

To assess LiVAE’s biological interpretability, we analyzed scRNA-seq data from hematopoietic stem and progenitor cells carrying a Dapp1 knockout perturbation (GSE277292). UMAP embedding showed consistent cellular structure with minimal batch effects between wild-type and knockout conditions (Figure 12A). We annotated each latent component by identifying the gene with the highest per-cell expression–activation correlation.
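The annotation workflow described above (assigning each component the gene whose per-cell expression correlates maximally with that component's activation) can be sketched as follows; the implementation is illustrative:

```python
import numpy as np

def annotate_components(activations, expression, gene_names):
    """Assign each latent component the gene whose per-cell expression
    correlates most strongly with the component's activation.

    activations: (cells, components); expression: (cells, genes).
    Returns {component_index: (gene_name, pearson_r)}.
    """
    A = (activations - activations.mean(0)) / (activations.std(0) + 1e-8)
    E = (expression - expression.mean(0)) / (expression.std(0) + 1e-8)
    corr = A.T @ E / len(A)          # (components, genes) Pearson matrix
    best = np.argmax(corr, axis=1)   # use |corr| instead to catch repressed markers
    return {k: (gene_names[g], float(corr[k, g])) for k, g in enumerate(best)}
```

Applied to the latent activations and the normalized expression matrix, this yields the Latent0/Cks2-style pairings reported below.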


Figure 12. Interpretability of LiVAE latent components in a Dapp1 perturbation scRNA-seq dataset. (A) UMAP of cells from GSE277292, colored by condition (wt: wild-type; ko: knockout), and schematic of the gene-component association workflow based on the maximum expression correlation. (B) UMAPs of selected component activation scores (top of each pair) alongside expression of the most correlated marker genes (bottom). Functional groupings include cell cycle and protein synthesis (Latent0/Cks2 and Latent2/Rps24); stress response and cellular protection (Latent1/Plac8 and Latent8/Ctla2a); transcriptional regulation (Latent4/Junb); immune identity and differentiation (Latent5/Ighm and Latent9/Irf8); and myeloid lineage commitment (Latent3/Mpo, Latent6/Ms4a3, and Latent7/Cd63). (C) Gene Ontology biological process (GOBP) enrichment for the top correlated genes with Latent1, Latent4, and Latent5. Dot size indicates the gene count; color encodes adjusted p-value. Results support roles in hemopoiesis, mitotic cell-cycle processes, and myeloid differentiation.

Individual components captured distinct biological programs (Figure 12B). Cell-cycle progression and protein synthesis were tracked by Latent0 (Cks2) and Latent2 (Rps24). Stress-response programs aligned with Latent1 [Plac8 (Rogulski et al., 2005)] and Latent8 (Ctla2a). Transcriptional regulation was reflected in Latent4 [Junb (Santilli et al., 2021)]. Immune identity was captured by Latent5 [Ighm (Dobre et al., 2021)] and Latent9 [Irf8 (Kurotaki and Tamura, 2019)], and myeloid commitment by Latent3 [Mpo (Lanza et al., 2001)], Latent6 [Ms4a3 (Donato et al., 2002)], and Latent7 [Cd63 (Pols and Klumperman, 2009)].

Gene Ontology biological process enrichment corroborated these assignments (Figure 12C): Latent4 enriched for “mitotic cell cycle,” Latent1 enriched for “hemopoiesis,” and Latent5 enriched for “myeloid differentiation.” These results demonstrate that LiVAE decomposes transcriptomes into disentangled, biologically meaningful axes, facilitating data-driven hypothesis generation.

4 Discussion

We introduced LiVAE, a geometrically regularized variational autoencoder that addresses the local–global trade-off in single-cell representation learning. Through systematic benchmarking across 135 datasets against 21 baseline methods spanning classical dimensionality reduction, deep generative models, graph-based architectures, and modality-specific approaches, we demonstrated that LiVAE achieves higher global topology preservation, richer latent manifold geometry, and enhanced robustness while maintaining competitive local structure fidelity. These technical advances translate to improved biological discovery: LiVAE embeddings better preserve developmental hierarchies, enable more accurate cell-type annotation, and provide interpretable latent dimensions aligned with known biological processes.

Unlike prior hyperbolic deep learning approaches that constrain entire latent spaces to hyperbolic manifolds (Park et al., 2021)—requiring manifold-aware operations, hyperbolic priors, and specialized reparameterization that increase computational cost and reduce flexibility (Cho et al., 2023)—LiVAE applies hyperbolic geometry only as regularization over a standard Euclidean latent space z ∈ R^d. This hybrid design offers three key advantages with direct biological utility: downstream compatibility (z works seamlessly with standard clustering, trajectory inference, and visualization tools), noise filtering (the dimensional bottleneck d_c ≪ d discards batch effects and technical noise while retaining hierarchical biological structure), and architectural decoupling (separate pathways optimize reconstruction fidelity and geometric structure, balancing local accuracy with global coherence). Our ablation studies empirically validate this design: the main pathway primarily controls the categorical structure (NMI and ARI), while the bottleneck pathway governs geometric quality (distance correlation and participation ratio) and robustness, with the deterministic bottleneck code le providing stable geometric regularization across training iterations.

We use the full latent vector z (d=10) rather than the compressed bottleneck code le (d_c=2) because these serve distinct roles: le distills minimal global structure for geometric regularization, while z retains the capacity for both global topology and local variation needed for clustering, marker identification, and trajectory inference (Ding et al., 2018). Our sensitivity analysis confirms that d=10 optimally balances performance and interpretability, enabling decomposition into biologically interpretable components that would be lost at lower dimensions. For scATAC-seq data, which exhibit substantially higher zero rates than scRNA-seq due to biological sparsity and technical dropout (Yan et al., 2020), we adopt the ZINB reconstruction loss. Empirical evaluation confirms that ZINB consistently outperforms NB, Poisson, and ZI-Poisson on scATAC-seq datasets, yielding a 47% relative improvement in latent dimension coupling (COR: Δ=0.632 vs. NB), indicating stronger preservation of coordinated regulatory programs, alongside gains in trajectory directionality (Δ=0.079) and noise resilience (Δ=0.154), aligning with recent ZINB-based scATAC-seq frameworks (Lan et al., 2023; Rachid Zaim et al., 2024).

While not explicitly designed for batch correction, LiVAE achieves comparable iLISI scores to scVI across 21 multi-batch datasets through three mechanisms: the information bottleneck attenuates batch-specific artifacts orthogonal to biological signal (Voloshynovskiy et al., 2019), geometric loss enforces global coherence that implicitly aligns cross-batch representations, and shared decoders incentivize batch-invariant features. For datasets with severe batch confounding (e.g., cell types appearing in only one batch), scVI’s explicit batch modeling may be superior, but LiVAE’s simpler architecture—requiring no batch labels and avoiding adversarial training instabilities—offers practical advantages for routine integration.

Based on our systematic benchmarking, we recommend using LiVAE when the dataset structure is unknown and exploratory analysis is needed, global topology preservation is critical (e.g., identifying rare populations and inferring developmental hierarchies), or cross-dataset integration is required without batch labels. Alternative methods should be used when the trajectory structure is well-defined and pseudotime accuracy is paramount (prefer scDHMap and scTour), extreme sparsity (>98% zeros) dominates scATAC-seq (consider PoissonVI), or supervised batch correction with known batch identities is available (scVI offers marginal advantages in highly confounded scenarios).

Several limitations motivate future development. First, while our component-wise interpretability analysis demonstrates that latent dimensions capture biologically meaningful variation, LiVAE does not enforce strict disentanglement—the components may exhibit residual correlations, unlike β-VAE or FactorVAE frameworks that explicitly penalize dependencies (Burgess et al., 2018; Kim et al., 2018). Extending to true causal disentanglement (Sikka et al., 2019; Abdelaleem et al., 2025) would enable more principled perturbation analysis. Second, LiVAE does not currently handle paired multi-omic measurements (e.g., 10x multiome scRNA + scATAC from the same cells); extending to true multi-modal integration would require modality-specific encoders, cross-modal alignment losses, and validation on datasets such as SHARE-seq or 10x multiome (Zuo and Chen, 2021). Third, experimental validation—such as comparing LiVAE-guided cell sorting with ground-truth lineage tracing—would strengthen claims of biological relevance but requires specialized datasets that are currently unavailable for most benchmarked tissues.

Beyond these immediate needs, our results establish geometric regularization—specifically Lorentzian distance constraints across information bottlenecks—as a powerful strategy for learning hierarchical representations that extend beyond transcriptomics to spatial transcriptomics (cells → microenvironments → tissue regions), protein interaction networks, and metabolic pathways with tree-like structure (Pogány et al., 2024; Li et al., 2023). Adaptive curvature learning (Skopek et al., 2020) would enable automatic tuning of geometric constraints to dataset-specific hierarchies.

In conclusion, LiVAE establishes geometric regularization as a practical alternative to graph-based and batch-correction-centric approaches in single-cell representation learning, achieving state-of-the-art global topology preservation, noise resilience, and interpretability without sacrificing local fidelity. Our open-source implementation and comprehensive benchmarking framework enable community evaluation and extension, accelerating the integration of geometric deep learning into mainstream single-cell genomics workflows.

Data availability statement

The original contributions presented in the study are publicly available. The source code for this research is publicly available on GitHub at https://github.com/PeterPonyu/LiVAE. The single-cell sequencing data for the Dapp1 perturbation experiments are publicly available in the Gene Expression Omnibus (GEO) under accession number GSE277292.

Author contributions

ZF: Writing – review and editing, Visualization, Funding acquisition, Conceptualization, Software, Investigation, Writing – original draft, Resources, Validation, Formal analysis, Project administration, Methodology, Data curation, Supervision. JF: Writing – review and editing, Data curation, Supervision, Formal analysis, Validation, Investigation, Resources, Visualization. CC: Resources, Validation, Formal analysis, Writing – original draft, Data curation, Visualization, Investigation. KZ: Formal analysis, Supervision, Visualization, Writing – original draft, Data curation, Resources, Validation. SW: Writing – review and editing, Data curation, Supervision, Formal analysis, Project administration, Validation, Investigation, Funding acquisition, Resources.

Funding

The authors declare that financial support was received for the research and/or publication of this article. This work was supported by the National Key R&D Program of China (Grant No. 2024YFA1107101), Ministry of Science and Technology of the People’s Republic of China (Grant No. 2024YFA1107101) and the National Key Laboratory of Trauma and Chemical Poisoning (Grant No. 2024K004).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2025.1713727/full#supplementary-material

References

Abdelaleem, E., Abid, A., and Kording, K. (2025). Deep variational multivariate information bottleneck. J. Mach. Learn Res. 26 (19), 1–39. doi:10.48550/arXiv.2310.03311


Becht, E., McInnes, L., Healy, J., Dutertre, C. A., Kwok, I. W. H., Ng, L. G., et al. (2019). Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 37 (1), 38–44. doi:10.1038/nbt.4314


Bronstein, M. M., Bruna, J., Cohen, T., and Veličković, P. (2021). Geometric deep learning: grids, groups, graphs, geodesics, and gauges. arXiv Preprint arXiv:2104.13478. doi:10.48550/arXiv.2104.13478


Burgess, C. P., Higgins, I., Pal, A., Matthey, L., Watters, N., Desjardins, G., et al. (2018). Understanding disentangling in β-VAE. arXiv Preprint arXiv:1804.03599. doi:10.48550/arXiv.1804.03599


Cao, J., Spielmann, M., Qiu, X., Huang, X., Ibrahim, D. M., Hill, A. J., et al. (2019). The single-cell transcriptional landscape of Mammalian organogenesis. Nature 566 (7745), 496–502. doi:10.1038/s41586-019-0969-x


Chami, I., Ying, Z., Ré, C., and Leskovec, J. (2019). Hyperbolic graph convolutional neural networks. Adv. Neural Inf. Process Syst. 32, 4868–4879. doi:10.5555/3454287.3454725

Chen, S., Lake, B. B., and Zhang, K. (2019). High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat. Biotechnol. 37 (12), 1452–1457. doi:10.1038/s41587-019-0290-0

Cho, S., Hong, S., Jeon, S., Lee, Y., Sohn, K., and Shin, J. (2023). Hyperbolic VAE via latent Gaussian distributions. Adv. Neural Inf. Process Syst. 36, 569–588. doi:10.48550/arXiv.2209.15217

Choi, J., Kim, H., Kim, C., and Sohn, I. (2023). VAE-IF: deep feature extraction with averaging for fully unsupervised artifact detection in routinely acquired ICU time-series. IEEE J. Biomed. Health Inf. 27 (6), 2777–2788.

Cui, H., Wang, C., Maan, H., Pang, K., Luo, F., Duan, N., et al. (2024). scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21 (8), 1470–1480. doi:10.1038/s41592-024-02201-0

Deng, T., Huang, M., Xu, K., Lu, Y., Xu, Y., Chen, S., et al. (2025). LEGEND: identifying co-expressed genes in multimodal transcriptomic sequencing data. Genomics Proteomics Bioinformatics, qzaf056. doi:10.1093/gpbjnl/qzaf056

Ding, J., Condon, A., and Shah, S. P. (2018). Interpretable dimensionality reduction of single cell transcriptome data with deep generative models. Nat. Commun. 9 (1), 2002. doi:10.1038/s41467-018-04368-5

Dobre, M., Uddin, M., and Zouali, M. (2021). Molecular checkpoints of human B-cell selection and development in the bone marrow. Immunology 164 (3), 441–451.

Donato, J. L., Kunz, G., Staf, C., Aeby, P., Kion, T., and Stamenkovic, I. (2002). Htm4, a new member of the CD20/beta-subunit of the B-cell antigen receptor-associated protein family, is a myeloid-specific, G-protein-coupled receptor. J. Immunol. 168 (10), 5037–5047.

Fang, R., Preissl, S., Li, Y., Hou, X., Lucero, J., Wang, X., et al. (2021). Comprehensive analysis of single cell ATAC-seq data with SnapATAC. Nat. Commun. 12 (1), 1337. doi:10.1038/s41467-021-21583-9

Ganea, O., Bécigneul, G., and Hofmann, T. (2018). Hyperbolic neural networks. Adv. Neural Inf. Process Syst. 31, 5345–5355. doi:10.5555/3327345.3327440

Hao, Y., Hao, S., Andersen-Nissen, E., Mauck, W. M., Zheng, S., Butler, A., et al. (2021). Integrated analysis of multimodal single-cell data. Cell 184 (13), 3573–3587.e29. doi:10.1016/j.cell.2021.04.048

Heiser, C. N., and Lau, K. S. (2020). A quantitative framework for evaluating single-cell data structure preservation by dimensionality reduction techniques. Cell Rep. 31 (5), 107576. doi:10.1016/j.celrep.2020.107576

Hetzel, L., Fischer, D. S., Günnemann, S., and Theis, F. J. (2021). Graph representation learning for single-cell biology. Curr. Opin. Syst. Biol. 28, 100347. doi:10.1016/j.coisb.2021.05.008

Hu, J., Li, X., Coleman, K., Schroeder, A., Ma, N., Irwin, D. J., et al. (2021). SpaGCN: integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nat. Methods 18 (11), 1342–1351. doi:10.1038/s41592-021-01255-8

Kim, H., and Mnih, A. (2018). “Disentangling by factorising,” in 35th International Conference on Machine Learning (ICML 2018). Stockholm, Sweden: PMLR, 2649–2658.

Kiselev, V. Y., Andrews, T. S., and Hemberg, M. (2019). Challenges in unsupervised clustering of single-cell RNA-seq data. Nat. Rev. Genet. 20 (5), 273–282. doi:10.1038/s41576-018-0088-9

Klimovskaia, A., Lopez-Paz, D., Bottou, L., and Nickel, M. (2020). Poincaré maps for analyzing complex hierarchies in single-cell data. Nat. Commun. 11 (1), 2966. doi:10.1038/s41467-020-16822-4

Kobak, D., and Berens, P. (2019). The art of using t-SNE for single-cell transcriptomics. Nat. Commun. 10 (1), 5416. doi:10.1038/s41467-019-13056-x

Kurotaki, D., and Tamura, T. (2019). IRF8: the orchestrator of myeloid and lymphoid immune system. Int. Immunol. 31 (9), 569–576.

Lan, W., Yang, S., Liu, J., and Xu, J. (2023). scIAC: clustering scATAC-seq data based on Student’s t-distribution similarity imputation and zero-inflated negative binomial model. IEEE/ACM Trans. Comput. Biol. Bioinform. 20 (3), 2152–2163. doi:10.1109/BIBM55620.2022.9995225

Lanza, F., Castoldi, G. L., and Castagnari, B. (2001). The flow cytometric analysis of myeloperoxidase (MPO) in the differential diagnosis of acute leukemia. Leuk. Lymphoma 42 (5), 885–896.

Li, N., Gao, M., Zheng, C., Liu, J., and Liu, Z. (2023). Hyperbolic hierarchical knowledge graph embeddings for biological entities. J. Biomed. Inf. 147, 104520. doi:10.1016/j.jbi.2023.104503

Li, W., Zhu, M., Xu, Y., Huang, M., Wang, Z., Chen, J., et al. (2025). SIGEL: a context-aware genomic representation learning framework for spatial genomics analysis. Genome Biol. 26 (1), 287. doi:10.1186/s13059-025-03748-7

Lopez, R., Regier, J., Cole, M. B., Jordan, M. I., and Yosef, N. (2018). Deep generative modeling for single-cell transcriptomics. Nat. Methods 15 (12), 1053–1058. doi:10.1038/s41592-018-0229-2

Luecken, M. D., Büttner, M., Chaichoompu, K., Danese, A., Interlandi, M., Mueller, M. F., et al. (2022). Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 19 (1), 41–50. doi:10.1038/s41592-021-01336-8

Lynch, A. W., Theodoris, C. V., Long, A., Brown, M., Liu, X. S., and Meyer, C. A. (2023). Multi-batch single-cell comparative atlas construction by generalized and supervised integration. Nat. Commun. 14 (1), 3854. doi:10.1038/s41467-023-39494-2

Madrigal, A., Brazovskaja, A., Guna, A., Treutlein, B., and Lundberg, E. (2024). A unified model for interpretable latent embedding of multi-sample, multi-condition single-cell data. Nat. Commun. 15 (1), 7279. doi:10.1038/s41467-024-50963-0

Mathieu, E., Le Lan, C., Maddison, C. J., Tomioka, R., and Teh, Y. W. (2019). Continuous hierarchical representations with Poincaré variational auto-encoders. Adv. Neural Inf. Process Syst. 32, 12565–12576. doi:10.48550/arXiv.1901.06033

McInnes, L., Healy, J., and Melville, J. (2018). UMAP: uniform manifold approximation and projection for dimension reduction. arXiv Preprint arXiv:1802.03426. doi:10.48550/arXiv.1802.03426

Moon, K. R., van Dijk, D., Wang, Z., Gigante, S., Burkhardt, D. B., Chen, W. S., et al. (2018). Manifold learning-based methods for analyzing single-cell RNA-sequencing data. Curr. Opin. Syst. Biol. 7, 36–46. doi:10.1016/j.coisb.2017.12.008

Nagano, Y., Yamaguchi, S., Fujita, Y., and Koyama, M. (2019). “A wrapped normal distribution on hyperbolic space for gradient-based learning,” in 36th International Conference on Machine Learning (ICML 2019). Long Beach, CA: PMLR, 4693–4702.

Nguyen, N. D., Huang, J., and Wang, D. (2022). A deep manifold-regularized learning model for improving phenotype prediction from multi-modal data. Nat. Comput. Sci. 2, 38–46. doi:10.1038/s43588-021-00185-x

Nickel, M., and Kiela, D. (2017). Poincaré embeddings for learning hierarchical representations. Adv. Neural Inf. Process Syst. 30, 6338–6347. doi:10.48550/arXiv.1705.08039

Park, J., Cho, J., Chang, H. J., and Choi, J. Y. (2021). “Unsupervised hyperbolic representation learning via message passing auto-encoders,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 5516–5526.

Pogány, D., Rečnik, L. M., and Žitnik, M. (2024). Towards explainable interaction prediction: embedding biological hierarchies into hyperbolic interaction space. PLOS ONE 19 (3), e0300906. doi:10.1371/journal.pone.0300906

Pols, M. S., and Klumperman, J. (2009). Trafficking and function of the tetraspanin CD63. Exp. Cell Res. 315 (9), 1584–1592. doi:10.1016/j.yexcr.2008.09.020

Rachid Zaim, S., Johnson, R., Alcaraz, N. P., Wang, B., Camino, E., Prokop, J. W., et al. (2024). MOCHA’s advanced statistical modeling of scATAC-seq data enables functional genomic inference in large human disease cohorts. Nat. Commun. 15 (1), 6405. doi:10.1038/s41467-024-50612-6

Rogulski, K., Li, N., Guerrero, C., Kim, J., Yi, D., Vence, L., et al. (2005). Onzin, a c-Myc-repressed new gene, promotes survival of myeloid cells. FASEB J. 19 (13), 1867–1869.

Saelens, W., Cannoodt, R., Todorov, H., and Saeys, Y. (2019). A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 37 (5), 547–554. doi:10.1038/s41587-019-0071-9

Santilli, S., Heckl, D., and Riemke, P. (2021). The role of AP-1 transcription factor JunB in hematopoiesis and leukemogenesis. Haematologica 106 (3), 666–677.

Sarkar, R. (2012). “Low distortion Delaunay embedding of trees in hyperbolic plane,” in Graph Drawing: 19th International Symposium, GD 2011, 355–366.

Sikka, H., Zhong, W., Yin, J., and Pehlevan, C. (2019). “A closer look at disentangling in β-VAE,” in 53rd Asilomar conference on signals, systems, and computers (IEEE), 437–441.

Skopek, O., Ganea, O. E., and Bécigneul, G. (2019). Mixed-curvature variational autoencoders. arXiv Preprint arXiv:1911.08411. doi:10.48550/arXiv.1911.08411

Skopek, O., Ganea, O. E., and Bécigneul, G. (2020). “Mixed-curvature variational autoencoders,” in International conference on learning representations.

Song, T., Choi, H., Jang, A., Han, S., Roh, S., Gong, G., et al. (2022a). Detecting spatially co-expressed gene clusters with functional coherence by graph-regularized convolutional neural network. Bioinformatics 38 (5), 1344–1352. doi:10.1093/bioinformatics/btab812

Song, Q., Su, J., Zhang, W., Chen, M., and Su, J. (2022b). SMGR: a joint statistical method for integrative analysis of single-cell multi-omics data. NAR Genom. Bioinform. 4 (3), lqac056. doi:10.1093/nargab/lqac056

Strouse, D. J., and Schwab, D. J. (2017). The deterministic information bottleneck. Neural Comput. 29 (6), 1611–1630. doi:10.1162/NECO_a_00961

Stuart, T., Butler, A., Hoffman, P., Hafemeister, C., Papalexi, E., Mauck, W. M., et al. (2019). Comprehensive integration of single-cell data. Cell 177 (7), 1888–1902.e21. doi:10.1016/j.cell.2019.05.031

Tian, L., Dong, X., Freytag, S., Le Cao, K. A., Su, S., JalalAbadi, A., et al. (2019). Benchmarking single cell RNA-sequencing analysis pipelines using mixture control experiments. Nat. Methods 16 (6), 479–487. doi:10.1038/s41592-019-0425-8

Tian, T., Zhang, Y., Huang, Z., Wan, J., Liu, F., Chen, D., et al. (2023). Complex hierarchical structures in single-cell genomics data unveiled by deep hyperbolic manifold learning. Brief. Bioinform. 24 (2), bbad086.

Tishby, N., and Zaslavsky, N. (2015). “Deep learning and the information bottleneck principle,” in 2015 IEEE information theory workshop (ITW) (IEEE), 1–5.

Trapnell, C., Cacchiarelli, D., Grimsby, J., Pokharel, S., Li, S., Morse, M., et al. (2014). The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32 (4), 381–386. doi:10.1038/nbt.2859

Voloshynovskiy, S., Kondah, M., Rezaeifar, S., Taran, O., Holotyak, T., and Rezende, D. J. (2019). Information bottleneck through variational glasses. arXiv Preprint arXiv:1912.00830. doi:10.48550/arXiv.1912.00830

Yan, F., Powell, D. R., Curtis, D. J., and Wong, N. C. (2020). From reads to insight: a hitchhiker’s guide to ATAC-seq data analysis. Genome Biol. 21 (1), 22. doi:10.1186/s13059-020-1929-3

Yuan, H., and Kelley, D. R. (2022). scBasset: sequence-based modeling of single-cell ATAC-seq using convolutional neural networks. Nat. Methods 19 (9), 1088–1096. doi:10.1038/s41592-022-01562-8

Zhao, F., Kang, Y., and Wakefield, J. (2024). A novel statistical method for differential analysis of single-cell chromatin accessibility sequencing data. PLOS Comput. Biol. 20 (3), e1011854. doi:10.1371/journal.pcbi.1011854

Zuo, C., and Chen, L. (2021). Deep-joint-learning analysis model of single cell transcriptome and open chromatin accessibility data. Brief. Bioinform. 22 (4), bbaa287. doi:10.1093/bib/bbaa287

Glossary

ARI Adjusted Rand index

ASW Average silhouette width

BN Bottleneck

CAL Calinski–Harabasz index

CLEAR Contrastive learning for scRNA-seq

COR Coupling degree

DAV Davies–Bouldin index

DICL Dictionary learning

DIPVAE Disentangled inferred prior VAE

FA Factor analysis

FactorVAE Factor variational autoencoder

HSD Honest significant difference

HVGs Highly variable genes

HVPs Highly variable peaks

ICA Independent component analysis

iLISI Integration local inverse Simpson’s index

InfoVAE Information maximizing VAE

KL Kullback–Leibler

KPCA Kernel principal component analysis

LiVAE Lorentz-regularized variational autoencoder

LSI Latent semantic indexing

MD Manifold dimensionality

NB Negative binomial

NMF Non-negative matrix factorization

NMI Normalized mutual information

NR Noise resilience

PCA Principal component analysis

PeakVI Peak variational inference

PoissonVI Poisson variational inference

PR Participation ratio

scATAC-seq Single-cell ATAC sequencing

scDeepCluster Single-cell deep clustering

scDHMap Single-cell deep hyperbolic manifold learning

scGCC Single-cell graph contrastive clustering

scGNN Single-cell graph neural network

scRNA-seq Single-cell RNA sequencing

scTour Single-cell trajectory optimization by unsupervised representation

scVI Single-cell variational inference

SDR Spectral decay rate

TD Trajectory directionality

TF-IDF Term frequency-inverse document frequency

t-SNE t-distributed stochastic neighbor embedding

TSVD Truncated singular value decomposition

UMAP Uniform manifold approximation and projection

VAE Variational autoencoder

ZINB Zero-inflated negative binomial

ZIP Zero-inflated Poisson

β-TCVAE Total correlation VAE with β weighting

β-VAE β-variational autoencoder

Keywords: single-cell multi-omics, dual-pathway c, hyperbolic geometry, information bottleneck, manifold learning, interpretable representation

Citation: Fu Z, Fu J, Chen C, Zhang K and Wang S (2026) Lorentz-regularized interpretable VAE for multi-scale single-cell transcriptomic and epigenomic embeddings. Front. Genet. 16:1713727. doi: 10.3389/fgene.2025.1713727

Received: 06 October 2025; Accepted: 20 November 2025;
Published: 05 January 2026.

Edited by:

Kenta Nakai, The University of Tokyo, Japan

Reviewed by:

Xiaobo Sun, Zhongnan University of Economics and Law, China
Weihang Zhang, Duke University, United States

Copyright © 2026 Fu, Fu, Chen, Zhang and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Zeyu Fu, fuzeyu99@126.com; Song Wang, swang1981@tmmu.edu.cn

These authors have contributed equally to this work