ORIGINAL RESEARCH article
Front. Big Data
Sec. Data Mining and Management
This article is part of the Research Topic: Machine Learning for Large-Scale Data Processing: Algorithms and Applications
A Genetic Algorithm-Based Framework for Online Sparse Feature Selection in Data Streams
Provisionally accepted
- 1 Southwest University, Chongqing, China
- 2 PetroChina Qinghai Oilfield Company, Qinghai, China
Applications that process high-dimensional streaming data commonly rely on online streaming feature selection (OSFS) techniques. In practice, however, data are often incomplete because of equipment failures and technical constraints, which poses a significant challenge. Online sparse streaming feature selection (OS2FS) tackles this issue by imputing missing data via latent factor analysis. Nevertheless, existing OS2FS approaches exhibit considerable limitations in feature evaluation, resulting in degraded performance. To address these shortcomings, this paper introduces a novel genetic algorithm-based online sparse streaming feature selection framework (GA-OS2FS) for data streams, which integrates two key innovations: 1) imputation of missing values using a latent factor analysis model, and 2) application of a genetic algorithm to assess feature importance. Comprehensive experiments on six real-world datasets show that GA-OS2FS surpasses state-of-the-art OSFS and OS2FS methods, consistently attaining higher accuracy through the selection of optimal feature subsets.
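The two components named in the abstract can be illustrated with a minimal batch sketch. It is not the authors' implementation: the abstract does not specify the latent factor training procedure, the genetic operators, or the fitness function, so everything below is an assumption made for illustration. In particular, `lfa_impute`, `fitness`, and `ga_select` are hypothetical names, the latent factor model is fitted by plain stochastic gradient descent, the fitness is a simple correlation-based score with a size penalty, and the streaming (online) aspect is omitted.

```python
import numpy as np

def lfa_impute(X, rank=2, lr=0.01, reg=0.02, epochs=300, seed=0):
    """Fill NaN entries of X using a rank-`rank` latent factor model.

    Assumption: the factors are fitted by SGD on the observed entries only.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    P = rng.normal(scale=0.1, size=(n, rank))  # row (instance) factors
    Q = rng.normal(scale=0.1, size=(m, rank))  # column (feature) factors
    observed = np.argwhere(~np.isnan(X))
    for _ in range(epochs):
        rng.shuffle(observed)  # visit observed entries in random order
        for i, j in observed:
            err = X[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])
    filled = X.copy()
    missing = np.isnan(X)
    filled[missing] = (P @ Q.T)[missing]  # only missing cells are replaced
    return filled

def fitness(mask, X, y):
    """Hypothetical fitness: mean |correlation| with y minus a size penalty."""
    if not mask.any():
        return -1.0  # an empty subset is never preferred
    corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in np.flatnonzero(mask)]
    return float(np.mean(corr)) - 0.01 * int(mask.sum())

def ga_select(X, y, pop_size=20, generations=30, mut_rate=0.1, seed=0):
    """Search for a boolean feature mask with a simple genetic algorithm."""
    rng = np.random.default_rng(seed)
    m = X.shape[1]  # assumes at least two features
    pop = rng.random((pop_size, m)) < 0.5  # random initial masks
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        # Truncation selection: keep the better half unchanged (elitism).
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, m)  # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(m) < mut_rate  # bit-flip mutation
            children.append(child)
        pop = np.vstack([elite, children])
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[np.argmax(scores)]
```

Under these assumptions, a call such as `ga_select(lfa_impute(X_missing), y)` would return a boolean mask over the feature columns. A classifier-based fitness and an incremental update as features arrive would be closer to the online setting the abstract describes.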
Keywords: Feature Selection, Genetic Algorithm, Latent factor analysis, missing data, Online Learning
Received: 07 Jan 2026; Accepted: 20 Jan 2026.
Copyright: © 2026 Liu, Liu, He, Liu, Bai and Min. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Zhou Min
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
