ORIGINAL RESEARCH article
Front. Med.
Sec. Hepatobiliary Diseases
Volume 12 - 2025 | doi: 10.3389/fmed.2025.1596476
This article is part of the Research Topic: Digital Technologies in Hepatology: Diagnosis, Treatment, and Epidemiological Insights
Exploratory Integration of Near-Infrared Spectroscopy with Clinical Data: A Machine Learning Approach for HCV Detection in Serum Samples
Provisionally accepted
- 1Rey Juan Carlos University, Móstoles, Madrid, Spain
- 2El Escorial Hospital, San Lorenzo de El Escorial, Madrid, Spain
- 3Hospital Universitario Rey Juan Carlos, Madrid, Madrid, Spain
- 4Hospital Universitario Fundación Alcorcón, Alcorcón, Madrid, Spain
- 5Hospital Universitario de Móstoles, Móstoles, Madrid, Spain
- 6University of Catania, Catania, Sicily, Italy
In this study, we propose a novel approach that combines Near-Infrared Spectroscopy (NIRS) and clinical data with machine learning (ML) to improve Hepatitis C Virus (HCV) detection in serum samples. NIRS offers a fast, non-destructive, and residue-free alternative to traditional diagnostic methods, while ML models enable feature selection and predictive analysis. We applied L1-regularized Logistic Regression (L1-LR) to identify the most informative wavelengths for HCV detection within the 1000-2500 nm range, and then integrated these spectral features with routine clinical markers using a Random Forest (RF) model. Our dataset comprised 137 serum samples from 38 patients, each represented by a NIRS spectrum and clinical data from blood tests. After preprocessing with Standard Normal Variate (SNV) correction and downsampling, the best-performing RF model, which combined NIRS features and clinical data, achieved an accuracy of 72.2% and an AUC-ROC of 0.850, outperforming models using only clinical or spectral data. Feature importance analysis highlighted specific wavelengths near 1150 nm, 1410 nm, and 1927 nm, associated with water molecular states and liver function biomarkers (GPT, GOT, GGT), reinforcing the biological relevance of this approach. These findings suggest that integrating NIRS and clinical data through machine learning enhances HCV diagnostic capabilities, offering a scalable and non-invasive alternative for early detection and risk assessment.
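As an illustration of the preprocessing step named above, the following is a minimal sketch of Standard Normal Variate correction followed by downsampling. It is not the authors' implementation: the abstract does not give these details, so the use of the sample standard deviation, the `downsample` helper with its fixed stride, and the absorbance values are all assumptions made for the example.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: center and scale one spectrum by its own statistics."""
    mu = mean(spectrum)
    # Sample standard deviation (n - 1 denominator); this choice is an assumption.
    sigma = stdev(spectrum)
    if sigma == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(a - mu) / sigma for a in spectrum]


def downsample(spectrum, step):
    """Keep every `step`-th point (hypothetical stand-in for the study's downsampling)."""
    return spectrum[::step]


# Hypothetical absorbance readings, not data from the study.
raw = [0.42, 0.45, 0.51, 0.48, 0.44, 0.40]
corrected = snv(raw)                 # each spectrum is corrected independently
reduced = downsample(corrected, 2)   # fewer wavelengths passed on to feature selection
```

Because SNV uses only the statistics of the spectrum being corrected, it can be applied sample by sample without reference to the rest of the dataset.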
Keywords: NIRS, HCV, Hepatitis C, machine learning, Permutation feature importance
Received: 19 Mar 2025; Accepted: 16 May 2025.
Copyright: © 2025 Pérez Gómez, Gómez, Gonzalo, Salgüero, Riado, Casas, Gutiérrez, Jaime, Pérez-Martínez, García-Carretero, Ramos, Fernández-Rodríguez, Catalá, Martino and Barquero-Pérez. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Óscar Barquero-Pérez, Rey Juan Carlos University, Móstoles, 28933, Madrid, Spain
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.