ORIGINAL RESEARCH article

Front. Oncol.

Sec. Cancer Epidemiology and Prevention

Volume 15 - 2025 | doi: 10.3389/fonc.2025.1638074

This article is part of the Research Topic: Advanced Machine Learning Techniques in Cancer Prognosis and Screening

Detection of acute myeloid leukemia and remission states using heterogeneous flow cytometry data

Provisionally accepted
Renjun Bao1, Ming Feng2, Mian Wang1, Yunkai Liu3, Liang Hu2*, Yonghua Yao1*
  • 1Department of Hematology, Shidong Hospital of Shanghai Yangpu District, Shanghai, China
  • 2School of Computer Science and Technology, Tongji University, Shanghai, China
  • 3Laboratory Diagnosis Department, Shanghai Kingmed Center for Clinical Laboratory Co., Ltd., Shanghai, China

The final, formatted version of the article will be published soon.

Acute myeloid leukemia (AML) is a hematological malignancy that requires accurate diagnosis and monitoring to guide effective treatment. Flow cytometry is widely used for AML diagnosis because it can detect minimal residual disease. However, most existing methods assume the same marker panel for every sample, overlooking the marker heterogeneity of real-world clinical settings, where different markers or staining methods are used across patients. Current approaches also often fail to account for remission states, which are critical for disease management and prognosis. To address these challenges, we propose a machine learning-based classification model that integrates heterogeneous flow cytometry data to distinguish between AML, complete remission (AML-CR), and normal samples. Using a diverse set of 53 markers, we constructed and evaluated predictive models with six machine learning classifiers. The Random Forest model achieved the best performance, with an accuracy of 94.92%, F1 score of 94.13%, precision of 94.58%, recall of 93.74%, and an AUC of 94.83%. This study demonstrates the potential of machine learning to overcome the limitations of traditional methods and to provide a more robust and accurate classification of AML and its remission states from heterogeneous flow cytometry data. The code for this study is available at https://zenodo.org/records/15110287.
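
The abstract does not describe how samples stained with different marker panels are brought into a common feature space. As a rough illustration only, and not the authors' method, the sketch below shows one common approach: map each sample onto the union of all observed markers, mark unmeasured markers as missing, and fill the gaps with a per-marker median. The function names `align_panels` and `impute_column_median`, the four marker names, and all numeric values are hypothetical placeholders chosen for the example.

```python
import math
import statistics


def align_panels(samples):
    """Map per-sample marker dicts onto one shared, sorted marker list.

    Markers a sample was not stained for become NaN, so that missingness
    stays explicit until a later imputation step.
    """
    markers = sorted({name for sample in samples for name in sample})
    matrix = [[sample.get(name, math.nan) for name in markers]
              for sample in samples]
    return markers, matrix


def impute_column_median(matrix):
    """Return a copy of the matrix with NaN replaced by the column median.

    The median is taken over observed (non-NaN) values only. A column with
    no observed values falls back to 0.0 (an arbitrary assumption).
    """
    filled = [row[:] for row in matrix]
    n_cols = len(matrix[0]) if matrix else 0
    for j in range(n_cols):
        observed = [row[j] for row in matrix if not math.isnan(row[j])]
        fill = statistics.median(observed) if observed else 0.0
        for row in filled:
            if math.isnan(row[j]):
                row[j] = fill
    return filled


# Toy, made-up per-sample summaries (e.g. fraction of marker-positive cells).
# These are illustrative values, not data from the study.
samples = [
    {"CD34": 0.62, "CD117": 0.40, "CD33": 0.81},
    {"CD34": 0.05, "HLA-DR": 0.30},
    {"CD117": 0.10, "CD33": 0.20, "HLA-DR": 0.50},
]

markers, matrix = align_panels(samples)
features = impute_column_median(matrix)
```

Under these assumptions, `features` is a fixed-width table with one row per sample and one column per marker, which could then be passed to a standard classifier such as scikit-learn's `RandomForestClassifier`. The repository linked in the abstract is the authoritative source for how the study actually handled missing markers.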

Keywords: acute myeloid leukemia, flow cytometry, machine learning, feature importance analysis, diagnostic model

Received: 30 May 2025; Accepted: 09 Sep 2025.

Copyright: © 2025 Bao, Feng, Wang, Liu, Hu and Yao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Liang Hu, School of Computer Science and Technology, Tongji University, Shanghai, China
Yonghua Yao, Department of Hematology, Shidong Hospital of Shanghai Yangpu District, Shanghai, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.