ORIGINAL RESEARCH article

Front. Med.

Sec. Precision Medicine

Volume 12 - 2025 | doi: 10.3389/fmed.2025.1577451

This article is part of the Research Topic: Tailored Strategies for Lung Cancer Diagnosis and Treatment in Special Populations

Blood Test-Based Machine Learning Model for Predicting Lung Cancer Risk

Provisionally accepted
Lihi Schwartz1, Naor Matania2, Matanel Levi3, Teddy Lazebnik4*, Shiri Kushnir1, Noga Yossef1, Assaf Hoogi3, Dekel Shlomi3
  • 1Clalit Health Services, Tel Aviv, Israel
  • 2Bar-Ilan University, Ramat Gan, Tel Aviv District, Israel
  • 3Ariel University, Ariel, Israel
  • 4University College London, London, United Kingdom

The final, formatted version of the article will be published soon.

Background: Individual cancer prediction is the goal of early detection. In lung cancer (LC), age and smoking history are the strongest criteria for annual low-dose CT screening, leaving other at-risk populations unscreened. Machine learning (ML) is a promising method for finding complex patterns in data that can reveal personalized disease predictors. Methods: We developed an ML-based model that predicts the risk of a future LC diagnosis from blood tests taken before diagnosis and socio-demographic factors such as age and gender, comparing LC patients with controls. Results: Apart from age and gender, we found 22 blood tests that contribute to the model. For the entire study population, the ML model predicted LC with an accuracy of 71.2%, a sensitivity of 63%, and a positive predictive value of 67.2%. Accuracy was higher for females than for males (71.8% vs. 70.8%) and for never-smokers than for smokers (73.6% vs. 70.1%). Age was the most significant contributor (13.6%), followed by red blood cell distribution width (5.1%), creatinine (5%), gender (3.6%), and mean corpuscular hemoglobin (3.3%). Most of the blood tests had a chaotic contribution to the complex ML model; however, some tests, such as red cell distribution width, mean corpuscular hemoglobin, prothrombin time, hematocrit, urea, and calcium, had a slightly more dichotomous contribution. Conclusion: Blood tests can be used in the proposed ML model to modestly predict LC. More studies are needed, for example in the basic sciences, to explain the relationship between specific blood results and LC prediction.
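
For readers less familiar with the three reported metrics, the sketch below shows how accuracy, sensitivity, and positive predictive value follow from confusion-matrix counts. It is a minimal illustration and not the authors' code: the class name, the function names, and the counts are hypothetical and were chosen only to make the arithmetic easy to follow; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary-classification outcomes."""
    tp: int  # LC patients flagged as high risk
    fp: int  # controls flagged as high risk
    tn: int  # controls flagged as low risk
    fn: int  # LC patients flagged as low risk


def _ratio(numerator: int, denominator: int) -> float:
    """Divide, raising a clear error when the metric is undefined."""
    if denominator == 0:
        raise ValueError("metric undefined: denominator is zero")
    return numerator / denominator


def accuracy(c: ConfusionCounts) -> float:
    """Share of all subjects classified correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true LC cases that the model flags."""
    return _ratio(c.tp, c.tp + c.fn)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Share of flagged subjects who truly have LC."""
    return _ratio(c.tp, c.tp + c.fp)


# Illustrative counts only (assumed, not study data).
example = ConfusionCounts(tp=60, fp=30, tn=70, fn=40)

if __name__ == "__main__":
    print(f"accuracy:    {accuracy(example):.3f}")
    print(f"sensitivity: {sensitivity(example):.3f}")
    print(f"PPV:         {positive_predictive_value(example):.3f}")
```

Note that positive predictive value, unlike sensitivity, depends on the ratio of cases to controls in the evaluated population, so a value measured in a case-control sample may not carry over to a screening population with a different LC prevalence.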

Keywords: lung cancer, artificial intelligence, machine learning, blood test, prediction model

Received: 15 Feb 2025; Accepted: 12 May 2025.

Copyright: © 2025 Schwartz, Matania, Levi, Lazebnik, Kushnir, Yossef, Hoogi and Shlomi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Teddy Lazebnik, University College London, London, United Kingdom

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.