GTMALoc: prediction of miRNA subcellular localization based on graph transformer and multi-head attention mechanism

Huang, Xindi; Jiang, Jipu; Shi, Lifen; Yan, Cheng

doi:10.3389/fgene.2025.1623008

METHODS article

Front. Genet., 19 June 2025

Sec. RNA

Volume 16 - 2025 | https://doi.org/10.3389/fgene.2025.1623008

This article is part of the Research TopicMicroRNA Secretion and Delivery: Mechanisms, Biomarkers, and Therapeutic InnovationsView all 3 articles

GTMALoc: prediction of miRNA subcellular localization based on graph transformer and multi-head attention mechanism

Xindi Huang

Jipu Jiang

Lifen Shi

Cheng Yan*

School of Informatics, Hunan University of Chinese Medicine, Changsha, China

MicroRNAs (miRNAs) play a crucial role in regulating gene expression, and their subcellular localization is essential for understanding their biological functions. However, accurately predicting miRNA subcellular localization remains a challenging task due to their short sequences, complex structures, and diverse functions. To improve prediction accuracy, this study proposes a novel model based on a graph transformer and a multi-head attention mechanism. The model integrates multi-source features which include the miRNA sequence similarity network, miRNA functional similarity network, miRNA–mRNA association network, miRNA–drug association network, and miRNA–disease association network. Specifically, we first apply the node2vec algorithm to extract features from these biological networks. Then, we use a graph transformer to capture relationships between nodes within the networks, enabling a better understanding of miRNA functions across different biological contexts. Next, a multi-head attention mechanism is implemented to combine miRNA features from multiple networks, allowing the model to capture deeper feature relationships and enhance prediction performance. Performance evaluation shows that the proposed method achieves significant improvements over current approaches on open-access datasets, achieving high performance with an AUC (area of receiver operating characteristic curve) of 0.9108 and AUPR(area of precision-recall curve) of 0.8102. It not only significantly improves prediction accuracy but also exhibits strong generalization and stability.

1 Introduction

MicroRNAs (miRNAs) are a class of small non-coding RNAs widely distributed in eukaryotic cells, typically around 22 nucleotides in length. They mainly regulate gene expression through the post-transcriptional processes Hombach and Kretz (2016); Holley and Topkara (2011). In organisms, miRNAs bind to specific target sites on mRNAs, causing their subsequent degradation or translational inhibition, thereby modulating key fundamental physiological processes like cell proliferation, differentiation, apoptosis, and immune system activation Bartel (2009). Recent studies have shown that miRNAs have indispensable functions in a variety of human diseases, including cancer, neurodegenerative disorders, and cardiovascular diseases Li et al. (2023); Dugger and Dickson (2017). They also show great potential in drug response prediction, resistance mechanisms, and therapeutic target discovery Miska (2007); Small and Olson (2011). In this context, studying the subcellular localization of miRNAs is of great significance for understanding their regulatory networks and functional mechanisms Kabekkodu et al. (2018); Catalanotto et al. (2016); Gurtan and Sharp (2013). Different subcellular localizations often suggest that miRNAs are involved in distinct biological processes. Accurate localization prediction not only facilitates the understanding of functional diversification of miRNAs but also provides theoretical support for early disease diagnosis and targeted therapy Jie et al. (2021). Although conventional experimental methods, such as fluorescence in situ hybridization and subcellular fractionation combined with high-throughput sequencing, can directly determine miRNA distributions, these techniques are often complex, expensive, and lack scalability for large-scale samples Thomson and Dinger (2016). Therefore, developing computational methods to efficiently predict miRNA subcellular localization has become a focal point in bioinformatics research. Currently, researchers have developed various machine learning models based on sequence information to explore potential miRNA localization patterns. For example, Huang et al. (2007) proposed a prediction framework combining k-mer frequency patterns with a Support Vector Machine (SVM) classifier, showing the feasibility of using sequence information for localization recognition. However, due to the short length and complex structure of miRNAs, as well as their heterogeneity across different tissues or disease states, models relying solely on sequence-level features often fail to capture the complete biological semantics, resulting in limited accuracy and generalization Li et al. (2014). To address this, some studies have incorporated biological network information, such as the miRNA-mRNA interaction network Hsu et al. (2011), the miRNA-disease association network Jiang et al. (2010), and the miRNA-drug association network Chen H. et al. (2019), to improve prediction accuracy. For instance, Xie et al. constructed a miRNA-target gene interaction network using Graph Convolutional Networks (GCNs) and applied deep learning to predict miRNA functions within cells Guan et al. (2022). Li et al. integrated miRNA, disease, and drug information through a heterogeneous network and used graph embedding techniques for feature learning Sun et al. (2020). Over the past few years, deep learning innovations have significantly advanced bioinformatics research. Convolutional Neural Networks (CNNs) have been used to retrieve sequence features of miRNAs—such as in the DeepMirTar model, which utilized CNNs to improve target gene prediction accuracy Wen et al. (2018). Recurrent Neural Networks (RNNs) have also been employed to capture sequential dependencies, as seen in MirLocNet, which uses Long Short-Term Memory (LSTM) networks to computationally infer miRNA subcellular localization Chen Q. et al. (2019). Graph Neural Networks (GNNs) are widely applied in modeling biological networks. As proposed by Gao et al. (2022), a novel Graph Attention Networks (GATs) combined with biological network information to improve miRNA function prediction Zhao et al. (2022).

In the field of miRNA subcellular localization, researchers have developed various models to enhance prediction accuracy and biological interpretability. MiRLoc Xu et al. (2022), for instance, inferred miRNA spatial distribution by leveraging known mRNA localization and their interaction with miRNAs, reflecting their role in post-transcriptional regulation. MirLocPredictor Asim et al. (2020) incorporated CNNs and positional encoding of k-mers to enhance sequence representation for multi-label localization tasks. DAmirLocGNet Bai et al. (2023a) integrated Graph Convolutional Networks and autoencoders to jointly model miRNA sequence features, disease associations, and disease semantic networks, learning high-level representations from complex graph structures. Some existing excellent models provide us with references. For example, Wang X.-F. et al. (2024) proposed a multi-channel graph neural network framework that integrates multimodal similarity information with hypergraph contrastive learning, effectively identifying novel cancer biomarkers. Wang X.- et al. (2024) designed a directed graph neural network-based multi-view learning model capable of systematically extracting regulatory feature signals from multiple biological layers, enhancing the model’s representational power. Additionally, Wang et al. (2022) developed KGDCMI, a method that integrates multi-source biological information with deep learning techniques to accurately predict interactions between circRNA and miRNA. Comparatively, PMiSLocMF Chen et al. (2024) fused heterogeneous data such as miRNA-mRNA, miRNA-drug, and miRNA-disease networks using a graph attention mechanism, achieving robust performance even in scenarios with sparse data or incomplete labels. Despite improvements, present architectures still face challenges such as inadequate information integration and underutilization of multi-head feature relationships. Effectively integrating multi-source information and building more expressive feature representations to improve miRNA subcellular localization prediction remains an urgent and critical problem.

To overcome these limitations, this study proposes a novel miRNA subcellular localization prediction model named GTMALoc, based on graph transformer and multi-head attention mechanisms. This approach effectively incorporates miRNA sequence information and their roles across different biological networks to improve prediction performance. Specifically, we first extract miRNA features from multiple biological networks–including miRNA sequence similarity, miRNA-mRNA associations, miRNA-disease associations, and miRNA-drug associations—using node2vec. Then, a graph transformer framework is applied to infer latent node correlations, offering better insight into miRNA functionality in different contexts. A multi-head attention mechanism is subsequently employed to integrate miRNA features across networks, capturing deeper, multi-head relational patterns and enhancing predictive performance. The evaluations show that our model outperforms mainstream methods in terms of accuracy, generalization, and stability on public datasets, demonstrating its effectiveness and feasibility in the miRNA subcellular localization task. Key improvements over existing methods provided by this study are:

(1) We propose a new miRNA subcellular localization prediction model that leverages graph transformer and multi-head attention mechanisms to integrate multi-source biological network information.

(2) Complex relationships within biological networks are modeled using node2vec and graph transformer to improve high-dimensional representations of miRNA features.

(3) A multi-head attention mechanism is employed to fuse heterogeneous network information, thereby strengthening inter-feature relationships and improving the prediction accuracy and generalization ability of model.

2 Materials and methods

2.1 Datasets

The dataset used in this study is sourced from version 2.0 of the RNALocate database Cui et al. (2022), which systematically compiles a large number of experimentally support RNA subcellular localization records. From this database, we select a subset containing 1,041 miRNAs to construct and evaluate our model. To ensure biological consistency, all select miRNAs are included in the miRNA functional similarity network established in the MiRLoc Xu et al. (2022) study, facilitating the exploration of potential functional associations. In terms of localization annotation, these miRNAs are assigned to seven subcellular compartments: cytoplasm, nucleus, nucleolus, mitochondrion, exosome, microvesicle, and extracellular vesicle. The specific numbers are as follows: 870 exosomes, 825 microvesicle, 499 nucleus, 308 cytoplasms, 259 mitochondrion, 102 extracellular vesicle, and 67 nucleolus. This categorization not only covers the major cellular structures where miRNAs may reside but also reflects their diverse roles in intracellular and intercellular communication, providing a rich and challenging dataset for multi-label classification tasks.

2.2 Methods

In this study, we develop a multi-source feature fusion model, GTMALoc, for miRNA subcellular localization prediction, aiming to comprehensively capture miRNA characteristics in various biological contexts. This process is illustrated in Figure 1. First, we extract structural features from several biological networks, including the miRNA sequence similarity network, miRNA–mRNA regulatory network, miRNA–disease association network, and miRNA–drug interaction network. To preserve both local and global structural information within each network, we apply the node2vec algorithm to perform embedding learning on these heterogeneous graphs, thereby obtaining a representation vector for each miRNA under different semantic relations—reflecting its functional characteristics in diverse biological environments. Next, we utilize the graph transformer model to process the graph embedding features. Leveraging its built-in structural awareness and self-attention mechanism, the model captures complex and variable dependencies among nodes, enabling a deeper understanding of miRNA behavior and influence across different networks. To achieve effective multi-source information fusion, we further introduce a multi-head attention mechanism to align and integrate miRNA representations from various networks. This allows the model to automatically uncover important cross-network interactions and latent high-level semantic relationships. The fusion strategy not only enhances the model’s sensitivity to critical features but also significantly improves overall prediction accuracy and generalization performance.

Figure 1

Figure 1. The architecture of the GTMALoc model.

3 miRNA networks

3.1 miRNA sequence similarity network and miRNA functional similarity network

All miRNA sequence data are obtained from the authoritative database miRBase (version 22) Kozomara et al. (2019), which provides experimentally validated miRNA sequences from humans and other species, and is widely used in miRNA research. To construct the miRNA sequence similarity network, we employ the Smith–Waterman algorithm Smith and Waterman, (1981), a classical local sequence alignment technique that precisely evaluates the similarity between two miRNA sequences in terms of base composition and order. Specifically, the algorithm uses dynamic programming to find optimal local alignments based on base matches, mismatches, and gap penalties, thereby computing a similarity score for each miRNA pair (Equation 1).

S W (m_{i}, m_{j}) = \frac{s p (m_{i}, m_{j})}{\sqrt{s p (m_{i}, m_{i}) \cdot s p (m_{j}, m_{j})}}, (1)

$s p (m_{i}, m_{j})$ represents the local alignment fraction of the two sequences, and a symmetrical similarity matrix can be obtained by performing the above alignment process between all miRNAs. The miRNA sequence similarity network was created according to the similarity matrix by assigning the miRNA as the node and the similarity score assigned to the corresponding edge as its weight. In order to develop a functional similarity network of miRNAs, we initially used the association data between miRNAs and diseases to construct a disease hierarchy with the help of medical subject headings (MeSH). Specifically, each disease $d_{i}$ is represented in an acyclic diagram (DAG) by a subgraph that includes the $d_{i}$ , and all of its higher-level diseases. For each disease $d_{t}$ in the subgraph, its contribution to $d_{i}$ can be expressed as (Equation 2):

C (d_{t}, d_{i}) = a \times \frac{1}{Depth (d_{t}, d_{i})}, (2)

where $a$ is the adjustment parameter, which $D e p t h (d_{t}, d_{i})$ represents the hierarchical distance between $d_{t}$ and $d_{i}$ . Next, the semantic value of disease $d_{i}$ is established by aggregating all node contributions within its subgraph (Equation 3):

S V (d_{i}) = \sum_{d_{t} \in S (d_{i})} C (d_{t}, d_{i}), (3)

let $S (d_{i})$ denote the set containing disease $d_{i}$ and all of its ancestor diseases. The semantic similarity between two diseases, $d_{i}$ and $d_{j}$ , denoted as $S i m (d_{i}, d_{j})$ , can be calculated based on the overlap of their semantic values, which reflects their semantic similarity. For two miRNAs, $m_{1}$ and $m_{2}$ , let their associated disease sets be $D_{1}$ and $D_{2}$ , respectively. Their initial functional similarity can then be defined as the average semantic similarity between diseases in $m_{1}$ and those in $m_{2}$ (Equation 4):

F S (m_{1}, m_{2}) = \frac{1}{| D_{1} \cap D_{2} |} \sum_{d \in D_{1} \cap D_{2}} Sim (d) . (4)

The challenge of similarity underestimation arising from disease set sparsity is resolved through linear combination with miRNA GIP kernel similarity, generating robust functional similarity estimates (Equation 5).

F S^{*} (m_{1}, m_{2}) = λ F S (m_{1}, m_{2}) + (1 - λ) G I P (m_{1}, m_{2}), (5)

where $λ$ is a fusion parameter, and $G I P (m_{1}, m_{2})$ represents the similarity between the two miRNAs under the GIP kernel. Specifically, $λ$ is used to balance the contributions of functional similarity and GIP kernel similarity. We perform a grid search over the values [0.1, 0.3, 0.5, 0.7, 0.9], using multi-label classification performance as the evaluation criteria. The optimal value is found to be $λ$ = 0.5, which offered a good trade-off between generalization and robustness. For the threshold T used to binarize the similarity matrix, we adopt an empirical approach Wang et al. (2010), adjusting T to control the sparsity of the resulting adjacency matrix. We ultimately set T to 0.6 to ensure a reasonable balance between sparsity and connectivity in the resulting graph. After calculating the similarity between all miRNA pairs, a similarity matrix is constructed. This matrix is then binarized using a predefined threshold T (Equation 6):

A_{ij} = \{\begin{cases} 1, & i f F S^{*} (m_{i}, m_{j}) > T \\ 0, & o t h e r w i s e \end{cases} . (6)

The resulting adjacency matrix $A$ defines the miRNA functional similarity network, where nodes represent miRNAs and edges reflect their functional similarity.

3.2 miRNA-mRNA association network

The miRNA–mRNA regulatory network in this study is primarily based on data from the authoritative miRTarBase (2020 version) Huang et al. (2020), supplemented by a curated dataset of validated interactions compiled by Xu et al. (2022). miRTarBase is known for its high-quality data, integrating miRNA–target gene interactions supported by both low- and high-throughput experimental evidence, such as reporter gene assays, qRT-PCR, and Western blot. The constructed network contains 8,254 high-confidence regulatory relationships between 1,041 non-coding miRNAs and 2,836 protein-coding genes.

3.3 miRNA-drug association network

The miRNA–drug association network is based on data from ncDR, a drug resistance research database Dai et al. (2017), which collects experimentally verified and predicted interactions between non-coding RNAs and drugs. The data are standardized as follows: First, the 1,041 miRNAs involved in previous studies are matched based on miRBase nomenclature. Second, only interactions with clearly annotated drug resistance evidence (including preclinical or cell line experiments) are retained. This results in 3,305 high-confidence miRNA–drug interactions involving 130 commonly used clinical drugs, such as cisplatin and gefitinib.

3.4 miRNA-diease association network

To construct the required network, this study references the dataset from HMDD v3.2 Bai et al. (2023b), a widely-used human microRNA disease database. After curation and filtering, 15,547 miRNA–disease association pairs are obtained, covering 1,041 miRNAs and 640 human diseases.

4 Node2vec algorithm

Network modeling has emerged as a pivotal paradigm in biomedical research due to its intuitive representation of complex relationships, particularly in systematic miRNA analysis involving multimodal correlations. This study integrates four critical biological networks: the miRNA sequence similarity network (quantifying functional conservation), the miRNA–disease association network (revealing pathological regulation), the miRNA–drug interaction network (reflecting therapeutic targeting), and the miRNA–mRNA regulatory network (decoding genetic circuitry). To effectively capture topological features from these non-Euclidean spatial data, we employ the node2vec algorithm Grover and Leskovec (2016), a graph embedding approach based on adaptive random walk strategies. By tuning search parameters—the return parameter p controlling local neighborhood sampling and the in-out parameter q governing global structural exploration—this approach generates semantically preserved node sequences, subsequently vectorized through Skip-Gram modeling. Notably, we implement dimension-specific embedding strategies tailored to distinct network characteristics: 64-dimensional representations in sequence similarity networks to resolve fine-grained patterns of conserved functional motifs, versus 128-dimensional high-capacity embeddings in the three heterogeneous association networks to capture complex multi-hop interactions. This hierarchical embedding mechanism simultaneously reduces feature redundancy while preserving network-specific information, establishing an interpretable mathematical foundation for subsequent multi-view feature fusion.

5 Graph transformer

This study proposes a structure-aware graph neural network, the graph transformer, to learn high-quality node embeddings from graph structures. Unlike traditional GNNs, which struggle with sparse or heterogeneous structures, the graph transformer incorporates multi-head attention and a structure reconstruction loss, enabling better modeling of local and global graph information. First, the input miRNA functional similarity matrix and association matrix are feature fused. After generating the node feature matrix $h_{k}$ at layer $k$ , it dynamically rebuilds the attention weights matrix $a_{k + 1}$ through dot products of $h_{k}$ and $h_{k}^{T}$ , enabling the joint evolution of topology and features. This gradient-preserving update critically suppresses false-positive edges caused by experimental noise. The model adopts Pre-Layer Normalization (Pre-LN), normalizing features before multi-head attention and feed-forward operations rather than after, which better accommodates high-dimensional biological feature propagation and curbs gradient vanishing in deep training. For extremely sparse data like miRNA-drug networks, binary attention masking embedded in multi-head layers automatically blocks unobserved associations (e.g., unknown miRNA-drug interactions), reducing computational complexity from $O (N^{2})$ to $O (E)$ (where $E$ denotes valid edges) with less GPU memory consumption, while preventing noise from distorting attention weights. These improvements jointly optimize the model, and the output layer further sharpens the precision-recall curve through dot product similarity and L2 normalized feature constraints in the range of [0,1]. During embedding learning, the model stacks L layers of graph attention modules, where each node’s representation is updated by aggregating the representations of its neighbors (Equation 7):

h_{i}^{(l)} = σ (\sum_{j \in N (i)} A_{i j}^{(l)} W^{(l)} h_{j}^{(l - 1)}) . (7)

Here, $σ$ is an activation function, and $W^{(l)}$ is a learnable weight matrix. The attention weights $a_{i j}^{(l)}$ are calculated as (Equation 8):

a_{i j}^{(l)} = \frac{\exp (L e a k y R e L U (a^{⊤} [W^{(l)} h_{i}^{(l - 1)} ‖ W^{(l)} h_{j}^{(l - 1)}]))}{\sum_{k \in N (i)} \exp (L e a k y R e L U (a^{⊤} [W^{(l)} h_{i}^{(l - 1)} ‖ W^{(l)} h_{k}^{(l - 1)}]))}, (8)

where $a$ is the attention weight vector, which $‖$ represents the vector splicing operation. Graph transformer further introduces a multi-head attention mechanism, using K attention heads in parallel in each layer, and finally integrating their output splicing or averaging into the input of the next layer (Equation 9):

h_{i}^{(l)} = \frac{1}{K} \sum_{k = 1}^{K} σ (\sum_{j \in N (i)} a_{i j}^{(l, k)} W^{(l, k)} h_{j}^{(l - 1)}) . (9)

In order to enhance the model’s ability to express structural information, graph transformer also introduces structural reconstruction loss as the training goal. In the unsupervised setting, the model scores the node pairs of the real edges in the input graph, and defines the structural reconstruction similarity as (Equation 10):

s_{i j} = σ (h_{i}^{⊤} h_{j}), (10)

where $h_{i}, h_{j}$ are the final embeddings of nodes i and j, and $σ$ denotes the sigmoid function. The structural reconstruction loss is formulated as (Equation 11):

L_{reconstruct} = - \sum_{(i, j) \in E^{+}} \log (σ (h_{i}^{⊤} h_{j})) - \sum_{(i, j) \in E^{-}} \log (1 - σ (h_{i}^{⊤} h_{j})), (11)

where $E^{+}$ represents the set of true edges, and $E^{-}$ denotes negative-sampled non-edges to prevent degenerate solutions where all nodes become indistinguishable. This objective effectively preserves high separability of node connectivity patterns in the embedding space. The final node embeddings $H = [h_{1}, h_{2}, . . ., h_{N}]$ undergo L2-normalization for downstream tasks.

6 Multi-head attention mechanism

To capture cross-modal dependencies and interactions among various biological features, a Multi-Head Attention (MHA) module is introduced as the core feature interaction component in the fusion model. Based on the transformer encoder, MHA computes attention across multiple subspaces in parallel to enhance local and global correlation modeling. Four input feature types—miRNA sequence, drug features, mRNA features, and disease features are first projected to a common 128-dimensional space using fully connected layers with L2 regularization. These are concatenated and reshaped into a 2D sequence format before being passed into the MHA module. The Multi-Head Attention Mechanism is calculated as follows (Equations 12, 13):

Q = H W^{Q}, K = H W^{K}, V = H W^{V}, (12)

Attention (Q, K, V) = softmax (\frac{Q K^{⊤}}{\sqrt{d_{k}}}) V . (13)

The outputs of all heads are concatenated and transformed, the formula is as follows (Equations 14, 15):

M H A (H) = C o n c a t ({h e a d}_{1}, \dots, {h e a d}_{h}) W^{0}, (14)

H^{(1)} = L a y e r N o r m (H + D r o p o u t (M H A (H))) . (15)

To further improve the representation capability, the multi-head attention output will be delivered through two layers of Feed-Forward Network (FFN), the formula is as follows (Equations 16, 17):

F F N (H) = R e L U (H W_{1} + b_{1}) W_{2} + b_{2}, (16)

H_{fusion} = L a y e r N o r m (H^{(1)} + D r o p o u t (F F N (H^{(1)}))) . (17)

The final output represents the fused multimodal semantic embedding features, which are used as the input of the subsequent self-supervised learning projection head and the multi-label classification header, which not only retains the information of the original modal features, but also integrates the high-order correlation between them.

7 Prediction of miRNA subcellular localization

During forward propagation, the fused high-dimensional features are processed through the MHA and FFN modules. Finally, the classification head maps the features to predicted subcellular localization probabilities (Equation 18):

\hat{y} = o (f_{cls} (x)), (18)

where $f_{cls}$ is a linear mapping and $o$ is the Sigmoid activation function that converts the output into a probability vector over [0, 1]. For each class, if the predicted probability exceeds the threshold (0.5), the class is labeled as positive; otherwise, negative–resulting in the final binary classification output.

8 Results

8.1 10-Fold cross-validation

In our experiments, we employ 10-fold cross-validation to comprehensively assess the generalization ability of the model. The dataset is randomly shuffled and evenly divided into 10 subsets, each fold rotation assigned one decile to testing and nine to training, ensuring comprehensive parameter optimization. After training, the model generates predicted probabilities for each class on the test set, which are then mapped to the [0,1] range using the Sigmoid activation function and binarized with a threshold of 0.5. For each fold, we calculate the Area Under the ROC Curve (AUC) and the Area Under the Precision-Recall Curve (AUPR) as evaluation metrics, and record the results for each class. As shown in Figures 2, 3, our model achieves an average AUC of 0.9108 and an average AUPR of 0.8102 on the multi-label subcellular localization task, fully demonstrating the model’s effectiveness and robustness in capturing multimodal features and their high-order interactions.

Figure 2

Figure 2. The Model GTMALoc 10-fold cross-validation AUC value.

Figure 3

Figure 3. The Model GTMALoc 10-fold cross-validation AUPR value.

8.2 Comparative experiments

To comprehensively evaluate the performance of the GTMALoc model, we use both 5-fold and 10-fold cross-validation strategies and systematically compare it with four existing methods (MiRLoc, MirLocPredictor, DAmiRLocGNet, and PMiSLocMF). The evaluation metrics include AUC and AUPR to thoroughly assess the model’s effectiveness.

As shown in Table 1, GTMALoc achieves an average AUC score of 0.9094 under 5-fold cross-validation, outperforming the other methods across most subcellular localization categories. It performs particularly well in structurally complex or sparsely connected categories such as cytoplasm (0.9240), extracellular vesicle (0.9115), and microvesicle (0.9113), demonstrating its strength in integrating high-dimensional heterogeneous information and modeling complex relationships. Table 2 presents the comparison based on AUPR, which primarily reflects the model’s robustness in class-imbalanced scenarios. GTMALoc also achieves the highest average AUPR of 0.8044, showing excellent performance in critical functional regions such as exosome (0.9900), nucleus (0.9248), and microvesicle (0.9900). Although the score slightly decreases in the nucleolus (0.5142), where signals are sparse, the overall performance remains superior.

Table 1

Table 1. AUC Performance Comparison of miRNA Subcellular Localization Models Based on 5-Fold Cross-Validation.

Table 2

Table 2. AUPR Performance Comparison of miRNA Subcellular Localization Models Based on 5-Fold Cross-Validation.

Furthermore, as indicated in Table 3, GTMALoc’s average AUC increases to 0.9108 under 10-fold cross-validation, further confirming the model’s stability and generalization across different data splits. Its outstanding performance in categories such as nucleolus, cytoplasm, and exosome highlights its ability to accurately identify miRNAs localized in these regions. This advantage is largely attributed to the multi-head attention mechanism, which effectively captures complex sequence patterns and graph-structured information. As shown in Table 4, although GTMALoc’s AUPR scores for some sub-tasks are slightly lower than those of PMiSLocMF, the overall average AUPR reaches 0.8102, demonstrating strong resilience to data imbalance. We observe significant performance differences across localization categories: exosome and microvesicle achieve near-perfect AUPR scores, indicating successful recognition of key regional features, while performance in nucleolus and extracellular vesicle is relatively lower, likely due to insufficient positive samples and data sparsity. This suggests that future work should focus on improving data quality or adopting sample augmentation strategies to enhance performance in low-signal categories. Overall, the experimental results demonstrate that GTMALoc consistently exhibits strong predictive power and generalization ability under various cross-validation strategies, confirming its feasibility and practicality as a reliable tool for miRNA subcellular localization prediction.

Table 3

Table 3. AUC Performance Comparison of miRNA Subcellular Localization Models Based on 10-Fold Cross-Validation.

Table 4

Table 4. AUPR Performance Comparison of miRNA Subcellular Localization Models Based on 10-Fold Cross-Validation.

8.3 Ablation study

To validate the contribution of each submodule within the overall architecture, we conduct detailed ablation studies by sequentially removing key components of the model and observing the resulting AUC performance on the multi-label subcellular localization task. As shown in Figure 4, we design five ablation settings: removing the miRNA-disease association network, removing the miRNA-drug interaction network, removing the miRNA-mRNA regulatory network, removing the graph transformer module, and removing the multi-head attention mechanism. All other modules remain unchanged across experiments to ensure a consistent model structure and fair evaluation. Each module contributes positively to the model’s overall performance, especially the graph transformer and multi-head attention modules, which play crucial roles in capturing high-order cross-modal interactions and both local and global structural features.

Figure 4

Figure 4. The Model GTMALoc ablation experiment.

8.4 Parameter study

To investigate the effect of the number of attention heads on model performance, we systematically evaluate different configurations (2, 4, and eight heads) on the validation set, as shown in Figure 5. A grid search strategy fixes other hyperparameters while varying the number of attention heads to observe sensitivity in the AUC metric. Results show that the model achieves peak performance (AUC = 0.9108) when the number of heads is set to 4, outperforming the 2-head and 8-head configurations by 0.0005 and 0.0006, respectively. This can be attributed to two main factors: (1) a moderate number of heads helps capture complementary interaction patterns in parallel subspaces, enhancing the model’s ability to fuse features across networks; (2) exceeding the optimal number of heads leads to redundancy in attention weights and an increased risk of local overfitting. Therefore, we adopt the 4-head configuration in the final architecture, balancing computational efficiency and predictive accuracy.

Figure 5

Figure 5. The Model GTMALoc parameter experiment.

9 Case studies

To further demonstrate the practical utility of GTMALoc in predicting miRNA subcellular localization, we conduct case studies across seven subcellular categories: cytoplasm, exosomes, nucleolus, nucleus, extracellular vesicles, microvesicles, and mitochondrion. For each compartment, we select the top five miRNAs with the highest predicted probabilities generated by GTMALoc. We then manually verify these predictions against experimental evidence reported in the scientific literature. In total, 35 miRNA–localization associations are examined. As shown in Table 5 of these are supported by published studies, while only five lack current experimental validation.

Table 5

Table 5. Case studies of miRNA subcellular localizations.

Taking miR-122 and miR-21 as representative examples, we analyze the alignment between GTMALoc’s predictions and reported biological findings. miR-122 is a liver-specific miRNA that is highly enriched in hepatocytes, and its cytoplasmic localization is well supported by experimental evidence Zhang et al. (2021). Previous studies indicate that miR-122 plays a crucial role in liver homeostasis by regulating lipid metabolism, cholesterol biosynthesis, and HCV replication through mRNA binding Ren et al. (2008). GTMALoc assigns a high confidence score of 0.98 for its cytoplasmic localization and captures its interactions with liver metabolism-related mRNA nodes, highlighting the model’s capacity to extract biologically meaningful features from the molecular network. In contrast, miR-21 is known for its multi-localization behavior and is highly expressed in various cancer types. It has been shown to be secreted via exosomes, contributing to immune modulation and tumor microenvironment remodeling Krishnamurthy et al. (2018), and also localizes in the nucleolus, where it may influence non-coding RNA processing Beckett et al. (2015). GTMALoc successfully predicts both localizations with high confidence and focuses attention on miR-21’s connections to tumor-associated signaling pathways, consistent with its known roles in cell proliferation, anti-apoptosis, and inflammatory response. These case studies suggest that GTMALoc not only achieves accurate subcellular localization predictions but also provides biologically interpretable outputs, particularly for multi-localized miRNAs, offering valuable insights for downstream functional analysis and subcellular mechanism exploration.

10 Conclusion

In this study, we propose a computational model, GTMALoc, for predicting miRNA subcellular localization. GTMALoc combines graph transformers with a multi-head attention mechanism to fuse heterogeneous biological information from multiple sources. Specifically, the model effectively integrates miRNA sequence features, interaction network structures, and functional properties. Through graph-based modeling and dynamic attention weighting, GTMALoc learns more discriminative high-dimensional feature representations, significantly improving the accuracy of localization prediction. We conduct a comprehensive evaluation of GTMALoc on public datasets. The results show that GTMALoc outperforms existing methods on multiple performance metrics, especially in handling sparse graph structures and high-dimensional feature spaces. Ablation studies confirm the key contributions of each feature modality and attention component to the model’s overall performance. Additionally, through representative case studies, we validate the biological interpretability of GTMALoc. The predicted subcellular localizations are not only consistent with known miRNA functions reported in the literature but also reveal potential regulatory modes that have not been fully explored. Given that miRNA localization is closely related to its regulatory roles in various cellular contexts, accurate localization prediction provides valuable insights into miRNA-mediated mechanisms under physiological and pathological conditions.

Although GTMALoc performs well in various experiments, it still faces some limitations. We integrate multiple heterogeneous features, such as sequence data, functional similarity, and molecular interaction networks; however, biological data often contain noise and incompleteness. For example, the functional annotations of many miRNAs remain incomplete, and some interaction networks may contain missing data or experimental biases, which can adversely affect the accuracy of feature learning. Additionally, differences among data sources in species, experimental conditions, or time points introduce biases and impair the model’s generalizability. In future work, we plan to further refine the model architecture to improve its interpretability, adaptability, and applicability in real-world biomedical research scenarios.

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found here: The datasets for this study are openly available in the public domain: RNALocate v2.0 at http://www.rnalocate.org/ or http://www.rna-society.org/rnalocate/. The code and data that support the findings of this study are available at https://github.com/27167199/GTMALoc.

Author contributions

XH: Conceptualization, Methodology, Writing – original draft. JJ: Writing – original draft, Software, Methodology. LS: Writing – original draft. CY: Supervision, Project administration, Writing – review and editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. Authors declare that the financial support is provided by the National Natural Science Foundation of China (Grant Nos. 62473149 and 61962050), the Natural Science Foundation of Hunan Province, China (Grant No. 2022JJ30428), and the Excellent Youth Funding Program of the Hunan Provincial Education Department (Grant No. 22B0372).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Asim, M. N., Malik, M. I., Zehe, C., Trygg, J., Dengel, A., and Ahmed, S. (2020). Mirlocpredictor: a convnet-based multi-label microrna subcellular localization predictor by incorporating k-mer positional information. Genes 11, 1475. doi:10.3390/genes11121475

PubMed Abstract | CrossRef Full Text | Google Scholar

Bai, T., Yan, K., and Liu, B. (2023a). Damirlocgnet: mirna subcellular localization prediction by combining mirna–disease associations and graph convolutional networks. Briefings Bioinforma. 24, bbad212.

PubMed Abstract | CrossRef Full Text | Google Scholar

Bai, T., Yan, K., and Liu, B. (2023b). Damirlocgnet: mirna subcellular localization prediction by combining mirna–disease associations and graph convolutional networks. Briefings Bioinforma. 24, bbad212. doi:10.1093/bib/bbad212

PubMed Abstract | CrossRef Full Text | Google Scholar

Bartel, D. P. (2009). Micrornas: target recognition and regulatory functions. cell 136, 215–233. doi:10.1016/j.cell.2009.01.002

PubMed Abstract | CrossRef Full Text | Google Scholar

Beckett, E. L., Martin, C., Choi, J. H., King, K., Niblett, S., Boyd, L., et al. (2015). Folate status, folate-related genes and serum mir-21 expression: implications for mir-21 as a biomarker. BBA Clin. 4, 45–51. doi:10.1016/j.bbacli.2015.06.006

PubMed Abstract | CrossRef Full Text | Google Scholar

Catalanotto, C., Cogoni, C., and Zardo, G. (2016). Microrna in control of gene expression: an overview of nuclear functions. Int. J. Mol. Sci. 17, 1712. doi:10.3390/ijms17101712

PubMed Abstract | CrossRef Full Text | Google Scholar

Chen, H., Zhang, Z., and Feng, D. (2019a). Prediction and interpretation of mirna-disease associations based on mirna target genes using canonical correlation analysis. BMC Bioinforma. 20, 404–408. doi:10.1186/s12859-019-2998-8

PubMed Abstract | CrossRef Full Text | Google Scholar

Chen, L., Gu, J., and Zhou, B. (2024). Pmislocmf: predicting mirna subcellular localizations by incorporating multi-source features of mirnas. Briefings Bioinforma. 25, bbae386. doi:10.1093/bib/bbae386

PubMed Abstract | CrossRef Full Text | Google Scholar

Chen, Q., Zhe, Z., Lan, W., Zhang, R., Wang, Z., Luo, C., et al. (2019b). Identifying mirna-disease association based on integrating mirna topological similarity and functional similarity. Quant. Biol. 7, 202–209. doi:10.1007/s40484-019-0176-7

CrossRef Full Text | Google Scholar

Cui, T., Dou, Y., Tan, P., Ni, Z., Liu, T., Wang, D., et al. (2022). Rnalocate v2. 0: an updated resource for rna subcellular localization with increased coverage and annotation. Nucleic acids Res. 50, D333–D339. doi:10.1093/nar/gkab825

PubMed Abstract | CrossRef Full Text | Google Scholar

Dai, E., Yang, F., Wang, J., Zhou, X., Song, Q., An, W., et al. (2017). ncdr: a comprehensive resource of non-coding rnas involved in drug resistance. Bioinformatics 33, 4010–4011. doi:10.1093/bioinformatics/btx523

PubMed Abstract | CrossRef Full Text | Google Scholar

Dugger, B. N., and Dickson, D. W. (2017). Pathology of neurodegenerative diseases. Cold Spring Harb. Perspect. Biol. 9, a028035. doi:10.1101/cshperspect.a028035

PubMed Abstract | CrossRef Full Text | Google Scholar

Grover, A., and Leskovec, J. (2016). “node2vec: scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 855–864.

Google Scholar

Guan, Y.-J., Yu, C.-Q., Qiao, Y., Li, L.-P., You, Z.-H., Ren, Z.-H., et al. (2022). Mfidma: a multiple information integration model for the prediction of drug–mirna associations. Biology 12, 41. doi:10.3390/biology12010041

PubMed Abstract | CrossRef Full Text | Google Scholar

Gurtan, A. M., and Sharp, P. A. (2013). The role of mirnas in regulating gene expression networks. J. Mol. Biol. 425, 3582–3600. doi:10.1016/j.jmb.2013.03.007

PubMed Abstract | CrossRef Full Text | Google Scholar

Holley, C. L., and Topkara, V. K. (2011). An introduction to small non-coding rnas: mirna and snorna. Cardiovasc. drugs Ther. 25, 151–159. doi:10.1007/s10557-011-6290-z

PubMed Abstract | CrossRef Full Text | Google Scholar

Hombach, S., and Kretz, M. (2016). “Non-coding rnas: classification, biology and functioning,” in Non-coding RNAs in colorectal cancer, 3–17.

Google Scholar

Hsu, S.-D., Lin, F.-M., Wu, W.-Y., Liang, C., Huang, W.-C., Chan, W.-L., et al. (2011). mirtarbase: a database curates experimentally validated microrna–target interactions. Nucleic acids Res. 39, D163–D169. doi:10.1093/nar/gkq1107

PubMed Abstract | CrossRef Full Text | Google Scholar

Huang, H.-Y., Lin, Y.-C.-D., Li, J., Huang, K.-Y., Shrestha, S., Hong, H.-C., et al. (2020). Mirtarbase 2020: updates to the experimentally validated microrna–target interaction database. Nucleic acids Res. 48, D148–D154. doi:10.1093/nar/gkz896

PubMed Abstract | CrossRef Full Text | Google Scholar

Huang, T.-H., Fan, B., Rothschild, M. F., Hu, Z.-L., Li, K., and Zhao, S.-H. (2007). Mirfinder: an improved approach and software implementation for genome-wide fast microrna precursor scans. BMC Bioinforma. 8, 341–10. doi:10.1186/1471-2105-8-341

PubMed Abstract | CrossRef Full Text | Google Scholar

Jiang, Q., Hao, Y., Wang, G., Juan, L., Zhang, T., Teng, M., et al. (2010). Prioritization of disease micrornas through a human phenome-micrornaome network. BMC Syst. Biol. 4, S2–S9. doi:10.1186/1752-0509-4-S1-S2

PubMed Abstract | CrossRef Full Text | Google Scholar

Jie, M., Feng, T., Huang, W., Zhang, M., Feng, Y., Jiang, H., et al. (2021). Subcellular localization of mirnas and implications in cellular homeostasis. Genes 12, 856. doi:10.3390/genes12060856

PubMed Abstract | CrossRef Full Text | Google Scholar

Kabekkodu, S. P., Shukla, V., Varghese, V. K., D’Souza, J., Chakrabarty, S., and Satyamoorthy, K. (2018). Clustered mirnas and their role in biological functions and diseases. Biol. Rev. 93, 1955–1986. doi:10.1111/brv.12428

PubMed Abstract | CrossRef Full Text | Google Scholar

Kozomara, A., Birgaoanu, M., and Griffiths-Jones, S. (2019). mirbase: from microrna sequences to function. Nucleic acids Res. 47, D155–D162. doi:10.1093/nar/gky1141

PubMed Abstract | CrossRef Full Text | Google Scholar

Krishnamurthy, S., Pavani, C., Kurup, P., Palanisamy, S., Jagadeesh, A., Sekar, K., et al. (2018). Cystinuria in a 13-month-old girl with absence of mutations in the slc3a1 and slc7a9 genes. Indian J. Nephrol. 28, 84–85. doi:10.4103/ijn.IJN_20_17

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, S., Lei, Z., and Sun, T. (2023). The role of micrornas in neurodegenerative diseases: a review. Cell Biol. Toxicol. 39, 53–83. doi:10.1007/s10565-022-09761-x

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, Y., Qiu, C., Tu, J., Geng, B., Yang, J., Jiang, T., et al. (2014). Hmdd v2. 0: a database for experimentally supported human microrna and disease associations. Nucleic acids Res. 42, D1070–D1074. doi:10.1093/nar/gkt1023

PubMed Abstract | CrossRef Full Text | Google Scholar

Miska, E. (2007). “Microrna expression profiles classify human cancers,” in Cytometry part B-clinical cytometry (HOBOKEN, NJ 07030 USA: WILEY-LISS DIV JOHN WILEY and SONS INC), 72, 126.

Google Scholar

Ren, X., Vincenz, C., and Kerppola, T. K. (2008). Changes in the distributions and dynamics of polycomb repressive complexes during embryonic stem cell differentiation. Mol. Cell. Biol. 28, 2884–2895. doi:10.1128/MCB.00949-07

PubMed Abstract | CrossRef Full Text | Google Scholar

Small, E. M., and Olson, E. N. (2011). Pervasive roles of micrornas in cardiovascular biology. Nature 469, 336–342. doi:10.1038/nature09783

PubMed Abstract | CrossRef Full Text | Google Scholar

Smith, T. F., and Waterman, M. S. (1981). Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197. doi:10.1016/0022-2836(81)90087-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Sun, J., Zhang, J., Li, Q., Yi, X., Liang, Y., and Zheng, Y. (2020). Predicting citywide crowd flows in irregular regions using multi-view graph convolutional networks. IEEE Trans. Knowl. Data Eng. 34, 2348–2359. doi:10.1109/tkde.2020.3008774

CrossRef Full Text | Google Scholar

Thomson, D. W., and Dinger, M. E. (2016). Endogenous microrna sponges: evidence and controversy. Nat. Rev. Genet. 17, 272–283. doi:10.1038/nrg.2016.20

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, D., Wang, J., Lu, M., Song, F., and Cui, Q. (2010). Inferring the human microrna functional similarity and functional network based on microrna-associated diseases. Bioinformatics 26, 1644–1650. doi:10.1093/bioinformatics/btq241

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, X.-F., Huang, L., Wang, Y., Guan, R.-C., You, Z.-H., Sheng, N., et al. (2024a). Multi-view learning framework for predicting unknown types of cancer markers via directed graph neural networks fitting regulatory networks. Briefings Bioinforma. 25, bbae546. doi:10.1093/bib/bbae546

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, X.-F., Huang, L., Wang, Y., Guan, R.-C., You, Z.-H., Sheng, N., et al. (2024b). A multichannel graph neural network based on multisimilarity modality hypergraph contrastive learning for predicting unknown types of cancer biomarkers. Briefings Bioinforma. 25, bbae575. doi:10.1093/bib/bbae575

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, X.-F., Yu, C.-Q., Li, L.-P., You, Z.-H., Huang, W.-Z., Li, Y.-C., et al. (2022). Kgdcmi: a new approach for predicting circrna–mirna interactions from multi-source information extraction and deep learning. Front. Genet. 13, 958096. doi:10.3389/fgene.2022.958096

PubMed Abstract | CrossRef Full Text | Google Scholar

Wen, M., Cong, P., Zhang, Z., Lu, H., and Li, T. (2018). Deepmirtar: a deep-learning approach for predicting human mirna targets. Bioinformatics 34, 3781–3787. doi:10.1093/bioinformatics/bty424

PubMed Abstract | CrossRef Full Text | Google Scholar

Xu, M., Chen, Y., Xu, Z., Zhang, L., Jiang, H., and Pian, C. (2022). Mirloc: predicting mirna subcellular localization by incorporating mirna–mrna interactions and mrna subcellular localization. Briefings Bioinforma. 23, bbac044. doi:10.1093/bib/bbac044

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, D., Ran, J., Li, J., Yu, C., Cui, Z., Amevor, F. K., et al. (2021). mir-21-5p regulates the proliferation and differentiation of skeletal muscle satellite cells by targeting klf3 in chicken. Genes 12, 814. doi:10.3390/genes12060814

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhao, H., Li, Z., You, Z.-H., Nie, R., and Zhong, T. (2022). Predicting mirna-disease associations based on neighbor selection graph attention networks. IEEE/ACM Trans. Comput. Biol. Bioinforma. 20, 1298–1307. doi:10.1109/TCBB.2022.3204726

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: miRNA, subcellular localization, graph transformer, multi-head attention mechanism, multi-source features

Citation: Huang X, Jiang J, Shi L and Yan C (2025) GTMALoc: prediction of miRNA subcellular localization based on graph transformer and multi-head attention mechanism. Front. Genet. 16:1623008. doi: 10.3389/fgene.2025.1623008

Received: 05 May 2025; Accepted: 10 June 2025;
Published: 19 June 2025.

Edited by:

Federica Calore, The Ohio State University, United States

Reviewed by:

Kai Zheng, Central South University, China
Xinfei Wang, Jilin University, China

Copyright © 2025 Huang, Jiang, Shi and Yan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Cheng Yan, eWFuY2hlbmcwMUBobnVjbS5lZHUuY24=

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.