ORIGINAL RESEARCH article
Front. Big Data
Sec. Data Mining and Management
This article is part of the Research Topic: Machine Learning for Large-Scale Data Processing: Algorithms and Applications
GFTrans: An On-the-fly Static Analysis Framework for Code Performance Profiling
Provisionally accepted
- 1South China Normal University, Guangzhou, China
- 2School of Mathematics and Physics Sciences, RI-IM·AI*, Chongqing University of Science and Technology, Chongqing, China
- 3Center for Artificial Intelligence Research and Optimization, Torrens University Australia, Queensland, Australia
Improving software efficiency is crucial for maintenance, but pinpointing runtime bottlenecks becomes increasingly difficult as systems grow. Traditional dynamic profiling tools require full build-and-execute cycles, and the resulting latency impedes agile development. To address this, we introduce GFTrans, a static analysis framework that predicts the performance of C programs without executing them. GFTrans uses a Transformer architecture with a novel "anchor-based embedding" technique to integrate control flow and data dependencies into a unified sequence. In addition, a dynamic gating mechanism fuses these semantic representations with 16 handcrafted statistical features to capture code complexity more comprehensively. Evaluated on a dataset of real-world C functions from GitHub with high-precision runtime labels, GFTrans outperforms baseline models such as Random Forest and Code2Vec, achieving 78.64% accuracy. The system identifies potential bottlenecks in milliseconds, enabling developers to optimize effectively during the coding phase.
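To make the fusion step concrete, the following is a minimal sketch of one common form of gated fusion, written in plain Python. It is not the GFTrans implementation: the abstract does not specify the gate's formulation, so the sigmoid gate over the concatenated vectors, the linear projection of the 16 statistical features, the embedding size of 8, and all function and variable names are assumptions made for illustration only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def gated_fusion(semantic, stats, w_proj, b_proj, w_gate, b_gate):
    """Fuse a semantic embedding with projected statistical features.

    Assumed formulation (not taken from the paper):
        projected = W_p * stats + b_p
        gate      = sigmoid(W_g * [semantic ; projected] + b_g)
        fused     = gate * semantic + (1 - gate) * projected
    """
    projected = linear(w_proj, b_proj, stats)  # 16 features -> D dimensions
    gate_logits = linear(w_gate, b_gate, semantic + projected)
    gate = [sigmoid(x) for x in gate_logits]
    return [g * s + (1.0 - g) * p
            for g, s, p in zip(gate, semantic, projected)]


# Hypothetical sizes: the embedding width D is an assumption; 16 is the
# number of handcrafted features stated in the abstract.
D, N_STATS = 8, 16
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


semantic = [rng.uniform(-1.0, 1.0) for _ in range(D)]     # stand-in for a Transformer output
stats = [rng.uniform(0.0, 1.0) for _ in range(N_STATS)]   # stand-in for handcrafted features
w_proj, b_proj = rand_matrix(D, N_STATS), [0.0] * D
w_gate, b_gate = rand_matrix(D, 2 * D), [0.0] * D

fused = gated_fusion(semantic, stats, w_proj, b_proj, w_gate, b_gate)
print(len(fused))
```

In this sketch each gate value lies strictly between 0 and 1, so every fused dimension is a weighted mix of the semantic value and the projected statistical value. In a trained model the weights would be learned, letting the gate decide per dimension how much to trust each source.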
Keywords: Code Representation Learning, Control Flow and Data Flow, Graph Linearization, On-the-fly Profiling, Performance Prediction, Static Analysis
Received: 03 Jan 2026; Accepted: 11 Feb 2026.
Copyright: © 2026 Li, Wen, Liu, Zeng and Mirjalili. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Yunbao Wen
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
