ORIGINAL RESEARCH article

Front. Signal Process.

Sec. Signal Processing for Communications

Volume 5 - 2025 | doi: 10.3389/frsip.2025.1608347

This article is part of the Research Topic: Emerging Optimization, Learning and Signal Processing for Next Generation Wireless Communications and Networking

Reinforcement Learning, Rule-Based, or Generative AI: A Comparison of Model-Free Wi-Fi Slicing Approaches

Provisionally accepted
Rafael Rosales1*, Dave Cavalcanti2
  • 1Intel (Germany), Munich, Germany
  • 2Intel (United States), Santa Clara, California, United States

The final, formatted version of the article will be published soon.

Resource allocation techniques are key to providing Quality-of-Service guarantees. Wi-Fi standards define features enabling the allocation of radio resources across time, frequency, and link band. However, radio resource slicing, as implemented in 5G cellular networks, is not native to Wi-Fi. A few reinforcement learning (RL) approaches have been proposed for Wi-Fi resource allocation and demonstrated using analytical models where the reward gradient with respect to the model parameters is accessible, i.e., with a differentiable Wi-Fi network model.

In this work, we implement, and release under an Apache 2.0 license, a state-of-the-art, state-augmented constrained optimization method using a policy-gradient RL algorithm that does not require a differentiable model, to assess model-free RL-based slicing for Wi-Fi frequency resource allocation. We compare this method with six model-free baselines: three RL algorithms (REINFORCE, A2C, PPO), two rule-based heuristics (Uniform, Proportional), and a generative AI policy using a commercial foundational Large Language Model (LLM). For rapid RL training, we use a simple, non-differentiable network model. To evaluate the policies, we use an ns-3-based Wi-Fi 6 simulator with a slice-aware MAC. Evaluations were conducted in two traffic scenarios: A) a periodic pattern with one constant low-throughput slice and two high-throughput slices toggled sequentially, and B) a random-walk scenario for added realism.

Results show that, on average, in terms of the trade-off between total throughput and a packet-latency-based metric, the uniform split and the LLM-based policy perform best, appearing on the Pareto front in both scenarios. The proportional policy appears on the front only in the periodic case. Our state-augmented constrained approach based on REINFORCE (SAC-RE) lies on the second Pareto front in the random-walk case, outperforming vanilla REINFORCE. In the periodic scenario, vanilla REINFORCE achieves higher throughput, at a latency cost, and is co-located with SAC-RE on the second front. Interestingly, the LLM-based policy, neither trained nor fine-tuned on any custom data, consistently appears on the first Pareto front, offering higher objective values at some latency cost. Unlike uniform slicing, its behavior is dynamically adjustable via prompt engineering.
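To make the state-augmented constrained idea concrete, the following is a minimal sketch of its core mechanism as described in the state-augmented RL literature: the policy observes the network state concatenated with Lagrange multipliers, and the multipliers are updated by dual ascent on the observed constraint violations. Everything here (the toy rate model, the per-slice minimum-rate constraints, the random stand-in policy, the step sizes) is an illustrative assumption, not the authors' released implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_slices = 3
min_rates = np.array([2.0, 5.0, 5.0])  # assumed per-slice minimum-rate constraints (Mb/s)
lam = np.zeros(num_slices)             # Lagrange multipliers (dual variables)
dual_lr = 0.1                          # dual-ascent step size

def policy(augmented_state):
    """Stand-in for the learned policy-gradient policy: maps the network
    state plus the multipliers to a frequency split via a softmax over
    random logits (illustrative only)."""
    logits = rng.normal(size=num_slices) + 0.1 * augmented_state[num_slices:]
    e = np.exp(logits - logits.max())
    return e / e.sum()

for step in range(100):
    state = rng.normal(size=num_slices)       # observed per-slice demand features
    aug_state = np.concatenate([state, lam])  # state augmentation with the duals
    split = policy(aug_state)
    rates = 20.0 * split                      # toy throughput model (Mb/s)
    # Lagrangian objective a REINFORCE-style update would maximize here:
    lagrangian = rates.sum() + lam @ (rates - min_rates)
    # Dual ascent: raise a multiplier while its rate constraint is violated.
    lam = np.maximum(0.0, lam + dual_lr * (min_rates - rates))

print("final multipliers:", lam)
```

The key point the sketch isolates is that the constraints are enforced not by a differentiable network model but by feeding the dual variables back into the policy's input, which is what makes the approach usable in a model-free setting.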
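The two rule-based baselines named above are simple enough to sketch directly. The sketch below assumes each slice reports a scalar demand estimate and that an allocation is a vector of bandwidth fractions; the function names and the demand input are illustrative assumptions, not taken from the released code.

```python
def uniform_split(num_slices: int) -> list[float]:
    """Give every slice an equal share of the frequency resources."""
    return [1.0 / num_slices] * num_slices

def proportional_split(demands: list[float]) -> list[float]:
    """Share frequency resources in proportion to per-slice demand
    (e.g., recent offered load); fall back to uniform when idle."""
    total = sum(demands)
    if total == 0:
        return uniform_split(len(demands))
    return [d / total for d in demands]

# Example: three slices, one low-throughput and two high-throughput,
# loosely mirroring the periodic scenario A described above.
print(uniform_split(3))                      # [0.333..., 0.333..., 0.333...]
print(proportional_split([2.0, 10.0, 8.0]))  # [0.1, 0.5, 0.4]
```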
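The results are reported as successive Pareto fronts on the (total throughput, latency metric) plane. As a reading aid, the sketch below shows one common way such fronts are computed, by repeatedly peeling off the non-dominated points; the sample numbers are made up and chosen only to mirror the random-walk ordering described above.

```python
def dominates(a, b):
    """a dominates b if it has >= throughput and <= latency, and the
    points differ (throughput is maximized, latency is minimized)."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def pareto_fronts(points):
    """Peel off non-dominated points repeatedly: front 1, front 2, ..."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

# (total throughput in Mb/s, latency metric) for hypothetical policies
results = {"Uniform": (90, 10), "LLM": (105, 14),
           "SAC-RE": (88, 12), "REINFORCE": (92, 18)}
for i, front in enumerate(pareto_fronts(list(results.values())), start=1):
    names = [n for n, p in results.items() if p in front]
    print(f"Front {i}: {names}")  # Front 1: Uniform, LLM; Front 2: SAC-RE, REINFORCE
```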

Keywords: optimization, Wi-Fi, network slicing, reinforcement learning, Generative AI, Large Language Model, LLM, state-augmented

Received: 08 Apr 2025; Accepted: 13 May 2025.

Copyright: © 2025 Rosales and Cavalcanti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Rafael Rosales, Intel (Germany), Munich, Germany

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.