- 1Department of Industrial Engineering, University of Trento, Trento, Italy
- 2Luxembourg Institute of Science and Technology, Esch-sur-Alzette, Luxembourg
- 3School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, United Kingdom
- 4Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
Introduction: Game theory has long served as a foundational tool in cybersecurity to test, predict, and design strategic interactions between attackers and defenders. The recent advent of Large Language Models (LLMs) offers new tools and challenges for the security of computer systems. In this work, we investigate whether classical game-theoretic frameworks can effectively capture the behaviors of LLM-driven actors and bots.
Methods: Using a reproducible framework for game-theoretic LLM agents, we investigate two canonical scenarios—the one-shot zero-sum game and the dynamic Prisoner's Dilemma—and we test whether LLMs converge to expected outcomes or exhibit deviations due to embedded biases. We experiment with four state-of-the-art LLMs and five natural languages (English, French, Arabic, Vietnamese, and Mandarin Chinese) to assess linguistic sensitivity.
Results: For both games, we observe that the final payoffs are influenced by agents' characteristics, such as personality traits or knowledge of repeated rounds. We also uncover an unexpected sensitivity of the final payoffs to the choice of language, which warns against indiscriminate application of LLMs in cybersecurity and calls for in-depth studies, as LLMs may behave differently when deployed in different countries. We also employ quantitative metrics to evaluate the internal consistency and cross-language stability of LLM agents.
Discussion: In addition to uncovering unexpected behaviors that require attention from scholars and practitioners, our work can help guide the selection of the most stable LLMs and the optimization of models for secure applications.
1 Introduction
According to recent reports, the cost of cyber threats is estimated to exceed the $10 trillion mark in the next few years (Morgan, 2020; Petrosyan, 2024). In addition to costs for companies, citizens, and government bodies, cyber attacks can make digital societies vulnerable to economic and infrastructural losses, which become even more critical as information technologies diffuse worldwide. As scholars and practitioners develop new and more powerful methods to face cyber attacks of various kinds (Hausken et al., 2024), game theory has emerged as a powerful theoretical framework to study and predict how defenders may react to attackers, and vice-versa, in cybersecurity (Do et al., 2017; Shiva et al., 2010; Wang et al., 2016; Bashir et al., 2025; Hammond et al., 2025). Game theory formalizes the strategic interaction between two (or more) players, whose aim is to maximize their own gain (Owen, 2013). This modeling approach captures the strategic choices of both players, and evaluates the effectiveness of a defense (or attack) mechanism depending on the behaviors and payoffs that are typical of all agents. In this way, game theory adds a layer of complexity to technology-only approaches, by including the costs or gains of the interactions between cyber attackers and security layers. For instance, security and efficiency can conflict and thus need to be balanced (Amin and Johansson, 2019), and cyber resilience can thus be better promoted under certain conditions rather than others, depending on cost-benefit trade-offs (Hausken, 2020). With applications spanning intrusion detection, risk assessment, jamming and eavesdropping (i.e., intentional interference with wireless signals and passive interception of communication), up to mechanism design and security investment (including applications over networks) (Etesami and Başar, 2019), game theory offers powerful tools such as proven mathematics, robustness analysis of defense systems, and distributed solutions (Do et al., 2017; Bashir et al., 2025).

In cybersecurity, these games have been used to model a variety of realistic operational problems. A one-shot zero-sum game can represent, for instance, an intrusion-detection setting in which an attacker chooses whether to launch an attack while a defender allocates costly monitoring resources; successful detection yields a gain for the defender and a loss for the attacker, and vice-versa (Ara et al., 2012). Similarly, hardware Trojans have also been modeled as attacker-defender zero-sum games, where gains and losses depend on the attack's success (Kamhoua et al., 2014). The repeated Prisoner's Dilemma naturally captures long-term threat-intelligence sharing among organizations, where each round corresponds to a decision to share or withhold indicators of compromise. Mutual cooperation strengthens collective defense, whereas opportunistic defection mirrors widely discussed free-riding issues in cyber-strategy (Kamhoua et al., 2010; Kostyuk, 2013). The game can also form the basis for more complex relationships in information domains (Schoenherr and Thomson, 2020).
Alongside traditional information technology, recent years have witnessed the rapid emergence of Large Language Models (LLMs)—extremely powerful AI applications that are disrupting academic research, industry and societies alike (Lu et al., 2024; Tessler et al., 2024; Patel and Trivedi, 2020). Among other fields, cybersecurity has swiftly included LLMs into its range of investigation, both as generators of scenarios [modeling scope (Yamin et al., 2024)] and as agents within cybersecurity scenarios [agentic scope (Kasri et al., 2025; Ferrag et al., 2024; Hammond et al., 2025)]; in the latter case, LLMs can act either as threatening or as defense-enhancing agents (Zhang et al., 2025). However, systematic studies on the impact of LLMs on cybersecurity applications are still in their infancy, and may greatly benefit from a coherent framework addressing the emerging strategies of interacting attacker-defender LLMs. In this sense, game theory provides a natural choice, and recent perspectives suggest the use of generative AI to develop strategic agents for reliable cybersecurity applications (Avinash and Jain, 2025; He et al., 2025). From a methodological perspective, LLM-based agents should be viewed as complementary to traditional optimisation and reinforcement-learning (RL) approaches. Classical game-theoretic or RL agents optimize explicitly specified payoff functions and can compute or approximate equilibrium strategies under well-defined rules, which makes them well suited for tasks such as resource allocation or patrol scheduling. By contrast, LLMs can ingest rich natural-language descriptions of players, constraints and goals, and produce strategies or recommendations without retraining, potentially capturing human-like justifications and informal rules of engagement. In our study, we therefore do not investigate LLMs as replacements for optimal solvers, but as flexible, language-driven agents that can serve as scenario generators, red-team simulators and decision-support tools in cybersecurity settings where textual context and human factors are prominent.
Recent advances in LLM-based game-theoretic analysis (e.g., Akata et al., 2025; Fontana et al., 2024; Huang et al., 2025; Jia et al., 2025; Sun et al., 2025) have demonstrated the importance of studying emergent cooperation, strategic deviations and behavioral biases in controlled multi-agent environments. However, these studies primarily focused on social, cognitive or abstract strategic settings and did not examine attacker–defender conflicts, jamming or deception games, information-sharing dilemmas, or other characteristics such as multilingualism (Do et al., 2017; Etesami and Başar, 2019). Here, we complement this growing body of work by analyzing LLM strategic behavior specifically in cybersecurity-motivated versions of the one-shot zero-sum game and the repeated Prisoner's Dilemma, framed according to canonical use-cases and providing a multilingual evaluation of LLM behavior on foundational cyber-game scenarios, with the aim of assessing their suitability for operational decision-support and simulation tasks. In fact, we may ask whether LLMs act in alignment with game-theoretic predictions (rendering them more or less suitable to predict the outcome of games) or whether they showcase alternative and unpredictable outcomes. In the latter case, we ask how representative such outcomes are with respect to developers' goals (both as attackers and as defenders), and which features mostly influence such outcomes. For instance, in games representing the development of AI ecosystems (Alalawi et al., 2026; Correia da Fonseca et al., 2025), it was observed that only certain LLMs (out of a set of popular ones including GPT, Gemini, Mistral, and more), and under specific conditions, comply with game-theoretic predictions (Balabanova et al., 2025; Buscemi et al., 2025a). Other works also observed that LLMs diverge from theoretical predictions even in traditional game-theoretic scenarios (Fontana et al., 2024; Wang et al., 2024; Akata et al., 2025). It is thus of interest to test how LLMs behave within cybersecurity-oriented game-theoretic scenarios, whether certain LLMs offer greater reliability than others, and which factors or biases may challenge game-theoretic analysis of cyber threats.
In this work, we address these questions by building on the FAIRGAME framework (Buscemi et al., 2025b) and instantiating it in two canonical games that have been widely employed in cybersecurity studies: a static attacker–defender zero-sum game (Ara et al., 2012) and a dynamic Prisoner's Dilemma on networks (Kamhoua et al., 2010). To this aim, we adopt FAIRGAME's methodology and examine how LLM agents behave when these games are framed as cybersecurity scenarios. We specialize the game narratives, roles and prompts to representative security settings, analyse LLM behavior across five languages and distinct agent “personalities,” and derive practical recommendations for model choice, deployment language, and appropriate use cases (e.g., decision support vs. exploratory red teaming) in cybersecurity workflows.
2 Materials and methods
2.1 Game theory for cybersecurity
Game theory is a mathematical modeling framework aimed at quantitatively and formally capturing the strategic interactions (formalized as games with rules and payoffs) among two or more agents, whose goal is to receive benefits from playing such games (Owen, 2013). Formally, a game is a tuple
G = (P, {S_i}_{i∈P}, {u_i}_{i∈P}),
where P is the set of players and S_i is the set of possible strategies available to player i. Given a combination of selected strategies s = (s_i)_{i∈P}, with s_i ∈ S_i, the payoff function u_i : ∏_{j∈P} S_j → ℝ≥0 assigns to player i the payoff associated with that combination of strategies. Depending on the game, the payoffs u_i can be interpreted either as gains or as penalties. The set of payoffs is usually represented in terms of a payoff matrix, which captures the results of the interacting strategies for each involved player. An example of payoff matrix for a two-player game, with two available strategies, is provided in Table 1.
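To make the formal definition concrete, the following minimal Python sketch encodes a two-player, two-strategy game as a mapping from strategy profiles to payoff pairs; the numeric values are illustrative placeholders, not the entries of Table 1.

```python
from itertools import product

# Two players, each with two available strategies.
players = ["agent1", "agent2"]
strategies = {"agent1": ["A", "B"], "agent2": ["A", "B"]}

# payoffs[(s1, s2)] = (payoff to agent1, payoff to agent2); placeholder values.
payoffs = {
    ("A", "A"): (3.0, 3.0),
    ("A", "B"): (0.0, 5.0),
    ("B", "A"): (5.0, 0.0),
    ("B", "B"): (1.0, 1.0),
}

# Enumerate every strategy profile and the associated payoff vector.
for profile in product(strategies["agent1"], strategies["agent2"]):
    print(profile, "->", payoffs[profile])
```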
An interesting feature of games is the possible existence of equilibria, i.e., strategy combinations from which no player can improve their payoff by unilaterally deviating. For a set of relatively simple games, under some assumptions, such equilibria can be computed analytically; alternatively, for games involving a higher degree of complexity, games can be effectively simulated to extract information (see, e.g., García and Van Veelen, 2018; Balabanova et al., 2025; Han et al., 2020).
For cybersecurity applications, games are usually interpreted as the set of actions between at least two conflicting players: an attacker, whose goal is to cause corruption in the cyberspace, and a defender aiming to prevent or minimize damage (Shiva et al., 2010). Depending on the cybersecurity scenario and scope (such as jamming, cyber-physical security, configuration of intrusion detection systems, selfishness in selected networks, trust, and more), various games can be aptly taken from the vast game-theoretic literature and adapted to describe the desired scenarios; see Do et al. (2017) and Hausken et al. (2024) for recent reviews on the topic. Games can capture a variety of features in cyber systems, such as the completeness of information (whether agents know everything about payoffs, strategies, and opponents' characteristics) and the accuracy of monitoring (i.e., the degree of knowledge about the game history and opponents' choices). Games can also be static or dynamic (or repeated), so as to capture attacks and disturbances that occur only once (and simultaneously), or repeatedly over time (with the possibility for agents to adjust their response at round t+1, depending on the actions and payoffs received at round t).
Popular games such as the zero-sum game, the Prisoner's Dilemma or the Stackelberg game (Srinivasan et al., 2003; Shukla et al., 2022; Nguyen et al., 2022) are widely employed to model scenarios occurring in the cyberspace, and have successfully promoted the development of effective applications. However, real cyber systems are often more complex than relatively simple and deterministic games. To overcome this issue, stochastic games have been increasingly employed to capture uncertainties, e.g., in cyber-physical interactions (Zhu and Başar, 2011); recently, there have been suggestions (He et al., 2025; Yang et al., 2024; Xiao et al., 2025) for the usage of generative AI and Large Language Models to better incorporate the complexity of networked systems or strategic agents in the cyberspace, and to equip them with advanced characteristics (such as personality, which is absent in traditional game-theoretic models) to improve efficiency and effectiveness. However, there is still a shortage of systematic investigations into the adequacy and emerging properties of game-theoretic LLM agents in cybersecurity settings.
In what follows, we select two widely used games with different characteristics, capturing different needs of cyber modelers, and explore their behavior within generative AI settings.
2.1.1 The one-shot zero-sum game
The first game to be analyzed is the static (one-shot) zero-sum non-cooperative game. It has been employed, e.g., to model jamming and eavesdropping activities (Ara et al., 2012), as well as denial-of-service (DoS) attacks (Spyridopoulos et al., 2013) or hardware Trojans (Kamhoua et al., 2014); in the physical domain, it has also been employed to model submarine attacks (Brown et al., 2011). A game is zero-sum if the payoff functions satisfy
∑_{i∈P} u_i(s) = 0 for every combination of strategies s,
that is, whatever one player gains, the other players lose in equal measure. For instance, think of an attacker-defender scenario on a routing system: the attacker strives to find the configuration parameters that cause maximum service disruption at minimum cost, while the defender looks for the optimal configuration parameters for a firewall, so as to fight off the threat and obtain the maximum gain. Whatever one player gains, the other loses an equal amount. A corresponding payoff matrix is that of Table 2 [with generic payoff values that are proportional up to a scaling factor (Von Neumann and Morgenstern, 2007)].
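For reference, the minimax (mixed-strategy) solution of a 2 × 2 zero-sum game can be obtained in closed form; the sketch below, with placeholder payoff values rather than those of Table 2, returns the row player's optimal probability and the game value, providing a baseline against which LLM choices can later be compared.

```python
import numpy as np

# Row player's payoffs; the column player receives -A. Placeholder values,
# not the actual entries of Table 2.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def solve_2x2_zero_sum(A):
    """Return (p, value): probability of playing row 0 and the game value."""
    # Check for a pure-strategy saddle point first.
    row_mins = A.min(axis=1)
    col_maxs = A.max(axis=0)
    if row_mins.max() == col_maxs.min():
        best_row = int(row_mins.argmax())
        return (1.0 if best_row == 0 else 0.0), row_mins.max()
    # Otherwise, mix so that the expected payoff is equal against both columns.
    a, b = A[0]
    c, d = A[1]
    p = (d - c) / (a - b - c + d)
    value = p * a + (1 - p) * c
    return p, value

p, v = solve_2x2_zero_sum(A)
print(f"Play row 0 with probability {p:.2f}; game value = {v:.2f}")
```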
We describe a prototypical scenario and its detailed implementation in Section 2.2.
2.1.2 The repeated Prisoner's Dilemma
The Prisoner's Dilemma is a classic scenario in game theory where two players must choose between cooperation and defection, each facing varying levels of penalties based on their decisions. Here, mutual cooperation yields a better collective payoff; however, according to the theory, in a static scenario, the dominant strategy equilibrium leads both parties to a suboptimal outcome—mutual defection. In the cyber domain, the Prisoner's Dilemma has been used, e.g., to model selfishness in multi-hop networks (Kamhoua et al., 2010), where intermediate nodes in wireless mesh or ad-hoc networks may refuse to forward transit packets, intentionally or strategically limiting cooperation. Such behavior can degrade network availability, disrupt routing, and even resemble denial-of-service conditions when large portions of traffic are dropped or selectively relayed. The model has also been used to capture mutual aid in multi-agent scenarios (Hausken, 2002). The classical results of a one-shot Prisoner's Dilemma may change in the case of repeated games, where players have the chance to update their choices based on history (Wang et al., 2015). For instance, repeated games are employed to model selfishness in packet forwarding (Ji et al., 2010), as well as the problem of free-riding. To capture these scenarios, we thus investigated the repeated Prisoner's Dilemma, over 10 rounds, with partial information available to the agents. Using a common scaling of dilemma payoffs (Wang et al., 2015), we employed a conventional configuration with the matrix given in Table 3.
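As a point of reference for the LLM runs discussed later, the following minimal sketch simulates a 10-round repeated Prisoner's Dilemma between two scripted strategies (tit-for-tat and always-defect); the penalty values are the textbook ones and are not necessarily those of Table 3.

```python
# (my move, opponent's move) -> my penalty; C = cooperate, D = defect.
# Placeholder textbook penalties, not necessarily the Table 3 weights.
PENALTIES = {
    ("C", "C"): 1, ("C", "D"): 3,
    ("D", "C"): 0, ("D", "D"): 2,
}

def tit_for_tat(history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not history else history[-1][1]

def always_defect(history):
    return "D"

def play(strategy1, strategy2, n_rounds=10):
    history1, history2 = [], []  # each entry: (own move, opponent's move)
    total1 = total2 = 0
    for _ in range(n_rounds):
        m1, m2 = strategy1(history1), strategy2(history2)
        total1 += PENALTIES[(m1, m2)]
        total2 += PENALTIES[(m2, m1)]
        history1.append((m1, m2))
        history2.append((m2, m1))
    return total1, total2

print(play(tit_for_tat, always_defect))  # (21, 18) over 10 rounds
```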
The description of the game scenario and its implementation details are given in Section 2.2.
2.2 LLMs in game-theoretic scenarios
Large Language Models rely on deep computational architectures that are largely opaque to explicit modeling. Hence, using analytical tools to analyze strategic games among LLM agents is not feasible, and we must rely on empirical game-theoretic analysis (Wellman et al., 2025), that is, performing experiments, carefully evaluating and interpreting the results, and contrasting them with game-theoretic predictions. Large Language Models are characterized by a large array of degrees of freedom and features that render them extremely versatile, but also challenging for sensitivity analysis. Moreover, LLMs are inherently characterized by uncertainties and non-deterministic behavior, which yields some degree of stochasticity in their responses (Swoopes et al., 2025). Hence, integrating LLMs into game-theoretic scenarios requires setting their attributes in a reproducible and interpretable framework, which helps to systematically account for the influence of single features and allows repeated experiments to collect reasonable statistics about the average behavior during games.
To these ends, we instantiated the games mentioned above using FAIRGAME (Buscemi et al., 2025b), a framework purposefully designed to embed LLM agents in the desired strategic games, while allowing several features of the agents and game settings to be specified. The specific settings are detailed below and summarized in Figure 1. Our use of FAIRGAME in this work should be understood as employing a validated experimental backbone rather than proposing algorithmic modifications or metric extensions. This choice enables direct comparability with prior LLM behavioral studies while allowing us to investigate questions that are specific to cybersecurity modeling. In contrast with recent contributions that focus on abstract social or cognitive strategic reasoning (e.g., Akata et al., 2025; Fontana et al., 2024), our analysis examines LLM behavior when the underlying games are structured around canonical attacker–defender conflicts, intrusion-detection interactions, or information-sharing dilemmas frequently explored in cyber-defense research (Do et al., 2017; Etesami and Başar, 2019). The resulting contribution is therefore situated at the intersection of behavioral evaluation and domain translation: we assess whether established FAIRGAME metrics reveal systematic vulnerabilities or inconsistencies when LLM agents operate within cybersecurity-motivated strategic conditions across multiple languages.
Figure 1. Simulation and analysis workflow. After selecting the games, they are instantiated in LLM form using FAIRGAME (whose pipeline is in the dashed frame; figure adapted from Buscemi and Proverbio, 2024): the configuration and template files are user-defined to specify the game settings and features, and are taken as inputs; the framework then automatically integrates the information and runs the games by calling the desired LLMs (gray-shaded area); the outputs are the round history, the final payoffs and any other specified metric, which are finally analyzed.
2.2.1 Employed LLMs
It has been observed that, in various tasks, different LLMs may not be consistent with one another (Buscemi and Proverbio, 2024; Buscemi et al., 2025b). Hence, we tested the games on four widely used Large Language Models, using the default settings recommended by the providers: (i) GPT-4o by OpenAI (proprietary) in its February 2025 version, with Temperature = 1.0 and Top_p = 1.0; (ii) Claude 3.5 Sonnet by Anthropic (proprietary) in its February 2025 version, with Temperature = 0.9 and Top_p = 1.0; (iii) Mistral Large by Mistral AI (open-source) in its mistral-large-latest version, with Temperature = 0.3 and Top_p = 1; (iv) Llama 3.1 405b by Meta (open-source) in its meta/meta-llama-3.1-405b-instruct version, with Temperature = 0.9, Top_p = 0.6 and Top_k = 40. All LLMs were accessed through their corresponding APIs. The set of evaluated models matches the set used in FAIRGAME to ensure direct comparability with their reported LLM behavioral results. The goal is to investigate how the relative stability, variability and linguistic sensitivity of these models manifest when the same evaluation framework is instantiated within cybersecurity-motivated strategic contexts.
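For reproducibility, the per-model sampling settings listed above can be collected in a single configuration; in the sketch below, the dictionary keys are illustrative model identifiers, which may differ from the exact strings expected by each provider's API.

```python
# Sampling settings taken from the description above, gathered in one place.
# The key strings are illustrative identifiers, not guaranteed API model names.
MODEL_SETTINGS = {
    "gpt-4o": {"temperature": 1.0, "top_p": 1.0},
    "claude-3-5-sonnet": {"temperature": 0.9, "top_p": 1.0},
    "mistral-large-latest": {"temperature": 0.3, "top_p": 1.0},
    "meta/meta-llama-3.1-405b-instruct": {"temperature": 0.9, "top_p": 0.6, "top_k": 40},
}

def sampling_kwargs(model_name):
    """Return the keyword arguments to pass to the corresponding API client."""
    return MODEL_SETTINGS[model_name]
```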
2.2.2 Tested features
LLM agents can embed complex traits that surpass the simplified features of game-theoretic models (Han et al., 2024; Avinash and Jain, 2025). This allows greater flexibility and capabilities; at the same time, however, it makes estimating the sensitivity of outputs to LLM characteristics more challenging. Hence, we here select and test a set of features that are known to possibly elicit biases in LLM responses (Buscemi et al., 2025b; Liang et al., 2023): the natural language used to conduct the games, and the personality bestowed upon each agent. Using different languages is natural, as both hackers and defenders can come from geographically distant regions and may be more or less proficient in certain languages, such as their native one; as LLMs can be prompted in different languages, it is of interest to test their influence on the outcomes. Setting a personality for agents can also be intriguing; in fact, attention has been given in the past to agents receiving incentives (Hausken, 2024) or having specific attitudes toward information sharing (Pala and Zhuang, 2019); setting a personality for LLM agents is a first step toward modeling their "intrinsic" behavioral tendencies while performing their strategies.
As natural languages, we employed English, French, Arabic, Vietnamese, and Mandarin Chinese, to represent a variety of cultures and geographies. The prompts are initially written in English and then translated with the help of native speakers. Following FAIRGAME (Buscemi et al., 2025b), we evaluate each game in these five languages to examine how prompting language influences strategic behavior. This design aims to probe cross-linguistic sensitivity while allowing controlled comparison; we do not claim exhaustive linguistic coverage. This multilingual design extends FAIRGAME's analysis to cybersecurity-motivated games, allowing us to assess whether language-dependent behavioral differences also arise in attacker-defender and threat-intelligence sharing scenarios that are central to cyber-defense research.
As personality traits, we used a binary classification into "cooperative" and "selfish," to represent the possibility of stressing cooperation or self-interest (Do et al., 2017). We did not assign any specific agent identifier (name) such as "attacker" or "defender": identifiers were intentionally left neutral ("agent1" and "agent2"), in line with FAIRGAME's findings, so as not to introduce additional variables, such as linguistic or role-priming biases, that could cause deviations from default behaviors. Future work may extend this design by explicitly telling each agent its role in a cybersecurity game, e.g., by embedding attacker/defender personas or expert profiles (penetration testers, SOC analysts), which would allow exploration of role-conditioned behavioral deviations. Note that, unless stated otherwise, agents are unaware of their opponent's personality, as that would require additional information that is often not available to either player (Liu et al., 2013).
For the repeated games of the Prisoner's Dilemma, we tested both the scenario in which agents know the total number of rounds (# Rounds Known) and the one in which they do not (# Rounds Unknown). In both cases, agents know the past history of the rounds. This reflects the amount of information possessed by the agents, which is often incomplete in real cyber systems (Acquisti and Grossklags, 2005; McKelvey and Palfrey, 1998). Agents do not communicate with each other.
2.2.3 Game runs
To run games following FAIRGAME's workflow (Buscemi et al., 2025b), the framework requires a configuration JSON file and a prompt template describing the characteristics of each game and of the agents, translated into the languages of interest. The prompt template provides a literal description of the game, with placeholders populated using the information from the JSON file. The English prompt templates, which employ prompt-engineering techniques such as assertive and rule-driven language, a clear subdivision of commands into sections, and the inclusion of all placeholders (in {}) for mandatory fields in a clear order, are:
Zero-sum game: You are {currentPlayerName} and your opponent is {opponent1}. {intro}: [You are {personality}.]
You and {opponent1} are fulfilling a task, which requires choosing either {strategy1} or {strategy2}.
There is only one round to decide. If you both choose {strategy1} or {strategy2}, Agent1 gets a payoff of {weight1} and Agent2 gets a payoff of {weight2}. If one agent chooses {strategy1} and the other chooses {strategy2}, Agent1 gets a payoff of {weight2} and Agent2 gets a payoff of {weight1}. If one agent chooses {strategy2} and the other chooses {strategy1}, Agent1 gets a payoff of {weight2} and Agent2 gets a payoff of {weight1}.
Your goal is to maximize your reward by making the best strategies based on the provided information. Choose between {strategy1} and {strategy2}. Output ONLY the choice.
Prisoner's Dilemma:
You are {currentPlayerName} and your opponent is {opponent1}.
{intro}: [You are {personality}.]
You and {opponent1} are arrested for a crime and held in separate cells. You cannot communicate.
Every round each of you has the following choices: “{strategy1}” and “{strategy2}.”
{gameLength}: [There are {nRounds} rounds to decide.] The current round is number {currentRound}.
If you both choose {strategy1}, you both get a penalty of {weight1}. If you choose {strategy1} and {opponent1} chooses {strategy2}, you get a penalty of {weight2} and {opponent1} gets a penalty of {weight3}. If you choose {strategy2} and {opponent1} chooses {strategy1}, you get a penalty of {weight3} and {opponent1} gets a penalty of {weight2}. If you both choose {strategy2}, you both get a penalty of {weight4}.
Your goal is to minimize your penalties by making the best strategies based on the provided information. This is the history of the choices made so far: {history}.
Choose between {strategy1} and {strategy2}. Output ONLY the choice.
Note that we employed the classical version of the games, so as to be as generic as possible; a previous work (Buscemi et al., 2025b) observed that modifying the storytelling has little to no effect on the outputs. Since the zero-sum matrix is symmetric, we directly refer to Agent1 and Agent2 (the names in the JSON file) to avoid ambiguities in the interpretation of the prompts by the LLMs.
The player names, as mentioned above, are left neutral; personality is set as a permutation of the two personality traits described above. The repeated Prisoner's Dilemma has gameLength = 10, while the one-shot zero-sum game has gameLength = 1. Strategies and their corresponding weights are set according to the games' payoff matrices described in Sections 2.1.1, 2.1.2.
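As an illustration, the following sketch shows how a FAIRGAME-style configuration could populate the prompt templates above; the field names mirror the prompt placeholders, while the actual FAIRGAME JSON schema and the Table 3 weight values may differ.

```python
import json

# Illustrative configuration; field names echo the prompt placeholders, and
# the weight values are placeholders rather than the actual Table 3 entries.
config = {
    "agents": {
        "Agent1": {"personality": "cooperative"},
        "Agent2": {"personality": "selfish"},
    },
    "strategies": ["Cooperate", "Defect"],
    "weights": {"weight1": 1, "weight2": 3, "weight3": 0, "weight4": 2},
    "nRounds": 10,
    "languages": ["English", "French", "Arabic", "Vietnamese", "Mandarin Chinese"],
}
print(json.dumps(config, indent=2))  # the JSON file passed to the framework

# A shortened template illustrating how placeholders are filled for one agent.
template = ("You are {currentPlayerName} and your opponent is {opponent1}. "
            "You are {personality}. The current round is number {currentRound}. "
            "This is the history of the choices made so far: {history}. "
            "Choose between {strategy1} and {strategy2}. Output ONLY the choice.")

prompt = template.format(
    currentPlayerName="Agent1", opponent1="Agent2",
    personality=config["agents"]["Agent1"]["personality"],
    currentRound=3, history="[('Cooperate', 'Defect'), ('Defect', 'Defect')]",
    strategy1=config["strategies"][0], strategy2=config["strategies"][1],
)
print(prompt)
```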
The set of all configurations yields 18 distinct games per LLM. Moreover, all games are repeated 10 times to collect sufficient variability in their output and perform statistics over means and credible intervals. Overall, considering 4 LLMs, 5 languages, and 2 decisions per round (one per agent), each game round generated a total of 7,200 individual decisions. For the repeated Prisoner's Dilemma, this figure is multiplied over the 10 rounds.
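Explicitly, this count corresponds to the product of the experimental factors: 18 games × 5 languages × 10 repetitions × 4 LLMs × 2 decisions = 7,200 decisions per game round.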
2.2.4 Metrics
For all games, we collect the payoffs (either penalties, in case of the Prisoner's Dilemma, or rewards, in case of the zero-sum game) resulting from all choices, and evaluate their distribution along the 10 repetitions. In addition to payoff distributions, all aggregate stability metrics (IV, CI, and VR, cf. below) are computed from these 10 repetitions for each configuration of model, language, personality and game type, allowing us to quantify both central tendencies and uncertainty in LLM behavior.
To enable easy comparison across the LLMs when we show the evolution of the rounds of the Prisoner's Dilemma, we normalize the average outcomes obtained by the LLM at each round to a scale from –1 to 1 (respectively, the minimum and maximum achievable penalties in each game).
Following FAIRGAME (Buscemi et al., 2025b), we use three quantitative measures to characterize LLM stability and sensitivity. For consistency, we adopt the original definitions of these metrics and summarize them briefly below, so that our results can be directly compared with previous evaluations of LLM strategic behavior. In our work, these metrics are applied and interpreted within cybersecurity-motivated game structures, enabling us to examine whether stability patterns observed in abstract strategic settings persist or change. For the repeated Prisoner's Dilemma, we measure (i) Internal Variability (IV), i.e., the variance of outcomes when the same game scenario is played multiple times, which captures the model's internal consistency: for each LLM i, IV_i = Var(y)/Z_i, where y is the whole results set. (ii) Cross-Language Inconsistency (CI), i.e., the standard deviation of results for the same game played in different languages; this indicates the instability of the model's behavior when the language is changed: for each LLM i, CI_i = Mean_b Mean_c Mean_d [Var_a(y_{a,b,c,d})]^{1/2} / Z_i, where a indexes languages, b personality combinations, c the knowledge of rounds, d the rounds, and y_{a,b,c,d} is the corresponding set of results. For each operation O ∈ {Mean, Var}, O_m is shorthand notation indicating that the operation is performed over the parameter m ∈ [a, b, c, d]. (iii) Variability Over Rounds (VR): the degree to which the model fluctuates in its strategies across consecutive rounds of the same game: VR_i = Mean_j [Var_d(y_{j,d})]^{1/2} / Z_i, where j indexes the game variants and d the rounds. In all cases, Z_i = max[·] are normalization factors.
For the one-shot zero-sum game, we only measure IV and CI, as VR refers to the evolution over rounds.
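To make the computation of these metrics explicit, the following sketch implements a simplified reading of the definitions above on a synthetic results array; the exact FAIRGAME normalization factors and aggregation order may differ.

```python
import numpy as np

# results[a, b, c, d, r]: language a, personality combination b,
# rounds-knowledge condition c, round d, repetition r (normalized outcomes).
rng = np.random.default_rng(0)
results = rng.uniform(-1, 1, size=(5, 3, 2, 10, 10))  # synthetic placeholder data

def internal_variability(y):
    """Variance of outcomes when the same scenario is replayed (repetition axis)."""
    return y.var(axis=-1).mean()

def cross_language_inconsistency(y):
    """Spread of per-language mean outcomes, averaged over the other factors."""
    per_language_mean = y.mean(axis=-1)          # average over repetitions
    return per_language_mean.std(axis=0).mean()  # std across languages

def variability_over_rounds(y):
    """Spread of the round-by-round trajectory, averaged over game variants."""
    per_round_mean = y.mean(axis=-1)             # average over repetitions
    return per_round_mean.std(axis=-1).mean()    # std across rounds

print(internal_variability(results),
      cross_language_inconsistency(results),
      variability_over_rounds(results))
```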
3 Results
3.1 Zero-sum game
The results for the zero-sum game are reported in Figure 2 (we only show the average payoff P1 of agent 1 over the repeated experiments; the payoff of agent 2 is its complement to 0, by definition of the game). The figure compares the results obtained with different combinations of personalities (cooperative-cooperative, C-C; cooperative-selfish, C-S; and selfish-selfish, S-S), over all considered LLMs and languages.
Figure 2. Final payoffs of agent 1 in the one-shot zero-sum game, for each LLM (see legend for color-coding), combination of personalities (columns) and language (rows).
We immediately see the notable impact of the personalities: when both agents are cooperative (C-C), Agent 1 tends to get negative payoffs, reflecting the fact that the agents tend to choose different options instead of aiming for the same one. This choice is less consistent for the other personality combinations. Nonetheless, the choice of options is not stable across LLMs and languages. For instance, focusing on the C-C personality combination, we observe that GPT-4o is an outlier in English, while Llama 3 405B Instruct diverges from the others in French, and Claude 3.5 Sonnet drastically differs from the other LLMs in Arabic and Chinese. Only in Vietnamese (a language for which, most likely, less data were available for the original training of the LLMs, which may result in lower variability) do all LLMs score consistently with payoff < 0, albeit with different variance.
Similar observations hold for the other personality combinations, across languages: overall, there is great variability and hardly any recognizable conserved pattern, and the LLMs seldom agree with one another, or are even consistent with themselves, when the language is changed. According to the literature, the best strategy for a zero-sum game is a mixed strategy (or, in the one-shot case, even a random choice); however, each LLM sometimes chooses consistently for a given combination of language and personality (note that the credible interval bars are very small in some cases, such as C-C in French for GPT-4o) and other times in a rather random fashion (e.g., C-C in English for GPT-4o), but in any case without following a clear, consistent strategy when changing languages (as in the examples just mentioned: changing language suffices to change the strategy completely). All in all, these observations should warn about the choice of LLMs to be used for cybersecurity applications, as they may be extremely sensitive to geographical location and language, as well as to other characteristics of the LLM agents that can be defined by the developer or by the user. In fact, this extreme variability may yield breaches in accountability and reliability, and deserves careful study before adoption.
To go beyond qualitative investigation, we use the metrics defined in Section 2.2.4 to quantitatively compare the LLMs and help guide their selection. Since there are no dynamics in this game, out of the proposed metrics we estimate only the Internal Variability (IV) and Cross-Language Inconsistency (CI) for each LLM. The results are reported in Table 4. These metrics quantify what was discussed above, and highlight the different performance and stability of the various models across languages and across repeated experiments for the same configuration. The IV and CI values reported in Table 4 are obtained by aggregating over the 10 repetitions for each model-language configuration, and the corresponding credible intervals remain sufficiently narrow that the qualitative comparison between models is preserved across runs, indicating that the stability differences we discuss do not depend on a particular stochastic realization. This stability across domains suggests that these LLMs exhibit robust behavioral signatures which persist under cybersecurity-specific framing. In our case, this consistency supports model-selection decisions for security workflows; for example, GPT-4o and Llama-3-405B Instruct consistently show lower cross-language inconsistency (CI), whereas Claude 3.5 Sonnet and Mistral Large display higher variability, making them potentially more suitable for exploratory red-teaming scenarios. Overall, Mistral Large has lower "peaks" of underperformance and variability, while GPT-4o seems to be the least stable model in this game. Notably, these inconsistencies do not maintain the exact same ranking in the Prisoner's Dilemma (see next section); this fact suggests that case-by-case analysis is necessary for future works, as LLMs display emergent capabilities that may differ across games. Choosing the best LLM for cybersecurity protocols is thus a delicate endeavor that will require dedicated studies and protocols.
Table 4. Internal Variability (IV) and Cross-Language Inconsistency (CI) metrics for the zero-sum game across LLMs.
3.2 Repeated Prisoner's Dilemma
The repeated Prisoner's Dilemma adds a layer of complexity to the evaluation, because the game evolves over several rounds and agents have partial information about the history of the game, and are either aware or unaware of the opponent's personality. As such, they can make decisions conditional on the accessible history. The following results can be further complemented by those in Buscemi et al. (2025b), which present a broader outlook on LLM-based games.
Figure 3 shows the box plots of the final payoffs (representing penalties) for the agents, with quartiles of the payoff distribution. The figure directly compares the two conditions on personality information: one where agents are unaware of their opponent's personality, and one where they are explicitly informed about it. The results are shown across all considered LLMs and languages examined in this study, and for all personality combinations (cooperative-cooperative, C-C; selfish-selfish, S-S; and cooperative-selfish, C-S). We immediately observe that, overall, LLM agents tend to defect (thus accumulating higher penalties), in line with what is suggested by game theory. As expected, attackers and defenders tend to mutually impair each other, aligning with the Nash equilibrium of the Prisoner's Dilemma. However, notable exceptions exist, and there are dramatic inconsistencies across languages and combinations of personalities; this indicates that, on top of the payoff matrix, languages and intrinsic biases may influence the agents' behavior.
Figure 3. Aggregated final payoffs of the repeated Prisoner's Dilemma games over repeated experiments, for each LLM (see legend for color-coding), combination of personalities (columns), language (rows), knowledge of opponent's personality (x-axis), and sum of final penalties (y-axis).
When focusing on the individual features, we see that some LLMs are more "stable" than others, that is, they provide similar outputs across languages: Llama 3 and GPT-4o, overall, produce similar distributions in payoffs, even though discrepancies exist when playing the game in one language or another (e.g., GPT-4o C-S players tend to have lower penalties, and thus cooperate more, when playing in French than in Arabic or Mandarin Chinese). On the other hand, Claude and Mistral showcase a higher sensitivity to the choice of the language, up to the point of having cooperating C-C Claude 3.5 agents (with the lowest penalties) in English, but the highest penalties in all other languages. In general, penalties are lower in English and when the number of rounds is unknown, indicating more consistent cooperative behavior in the LLMs' primary training language. This evidence suggests that the choice of the LLM, when simulating or developing security applications, drastically depends on the language area it is intended to represent or protect.
Furthermore, equipping agents with personalities influences their strategy: for instance, S-S Mistral Large players have lower penalties than C-C players, while almost the opposite happens for Llama players, especially when the number of rounds is known and information about the endgame can thus be leveraged. Finally, we observe that having agents with the same personality interact with each other yields, statistically, lower variations (especially for S-S agents), while C-S agents have wider distributions in payoffs. These observations suggest that the higher flexibility bestowed upon agents built with generative AI also leads to emerging and potentially unpredictable behaviors. On the one hand, this calls for caution when implementing scenarios in the cyber space, so as to develop models that are coherent with the desired scopes and present few biases; on the other hand, this fact warns security developers that, in case they face LLM-based attackers, their response may differ from what is traditionally predicted, and novel counteracting strategies may need to be developed.
Figure 4 shows how the games evolve over the rounds. We recall that, to enable direct comparison between LLMs, the average payoffs were normalized between the minimum and maximum. All LLMs eventually converge to values around zero, but they begin from different initial conditions (Llama 3 and GPT-4o are the extremes at the first round). Claude 3.5 Sonnet converges rapidly to stable payoff values within a few rounds. While this may indicate faster adaptation, it might also suggest limited flexibility in exploring alternative strategies throughout the game. Instead, other models are more variable from one round to the other, again indicating varying degrees of stochasticity along the repeated games. The general downward trend in penalties over rounds for Claude 3.5, Llama 3.1 405B, and Mistral Large indicates progressively increasing mutual cooperation among agents; this is consistent with the strategies traditionally observed in repeated games, where agents reciprocate cooperation to maximize long-term payoffs (Wang et al., 2015). Conversely, GPT-4o begins with relatively high cooperation and then increases the penalties (thus decreasing cooperation). This reflects potential biases toward cooperative behaviors in the one-shot Prisoner's Dilemma (at round one), eventually balanced by context-dependent strategic adaptation. With these results, we thus observe that agents display behaviors beyond what is purely predicted by the payoff matrix, and that repeated interactions yield different results than their one-shot counterparts.
Figure 4. Evolution of normalized penalties (averaged over repeated experiments) over repeated rounds, for each LLM within the Prisoner's Dilemma scenario.
What is qualitatively described above is quantitatively captured in Figure 5, which summarizes the metrics used to measure, for each LLM, the variability across repeated experiments, the inconsistencies across languages, and the variability during repeated games (see Section 2.2.4). Notably, GPT-4o and Llama 3 show the lowest overall cross-language inconsistency (CI = 0.37 and CI = 0.42, respectively), while Claude 3.5 exhibits the highest CI (0.79), suggesting a higher sensitivity to the prompting language. Moreover, we recognize the higher variability of Claude 3.5 across the languages and of Mistral Large over the repeated rounds, as well as their higher uncertainties over the various experiments. As in the zero-sum game, these patterns are stable across the 10 repetitions: the qualitative ordering of the four models in terms of CI and variability is preserved from run to run, and the credible intervals around the metrics remain narrow compared with the differences between models. Conversely, GPT-4o and Llama 3 show more consistent results, indicating some stabilizing effect that compensates for their stochastic behavior.
Figure 5. Radar plot mapping the three metrics described in Section 2.2.4, for Prisoner's Dilemma and for all considered LLMs.
4 Discussion
In this work, we examined the strategic behavior of four state-of-the-art LLMs across five languages in two canonical cybersecurity-motivated game-theoretic scenarios, revealing systematic stability differences and notable cross-linguistic effects.
Real-world cyber systems are characterized by high complexity (e.g., partial information or resources, adaptive infiltration schemes, uncertainties) that may divert agents from always performing best-payoff actions. Generative AI is a promising avenue for embedding realistic scenarios and complex features into simulations and applications, thereby widening the possibility of employing LLM-based game-theoretic models for cybersecurity. However, as LLMs are emerging technologies with unpredictable and often uninterpretable capabilities, it is imperative to systematically assess their capabilities and behaviors. This study provides evidence that LLM agents may behave sub-optimally in key games used for cybersecurity applications, highlighting that the language used for prompting the models, as well as additional traits such as the completeness of information or the assigned digital personality of agents, may introduce behavioral biases that affect their decision-making during the games. These behavioral differences are consistent with broader observations on contextual and language-dependent biases in LLMs (e.g., Lorè and Heydari, 2024). In cybersecurity settings, this has a concrete operational implication: the strategic responses generated by an LLM-based assistant may shift when deployed in different linguistic environments or with specific features. Organizations and SOC (Security Operations Center) teams should therefore validate LLM-driven decision-support tools in the specific languages and cultural contexts in which they are intended to operate, rather than extrapolating conclusions from English-only evaluations.
Our work can be interpreted in two ways: first, it constitutes a proof of concept of the utility of the proposed approach to integrate generative AI into the field of game theory for cybersecurity; second, it provides an investigation of the biases and successes of interacting LLM agents. Despite being limited to two classes of 2 × 2 games, based on simplified assumptions that allowed the comparison of outcomes stemming from various bias sources, our study already identifies several sources of ambiguity in LLM responses, paving the way to future studies focused on specific applications and on the mitigation of LLM issues. We recognize that, in spite of being canonical, the games considered in this work do not fully capture the complexity and breadth of real-world cybersecurity scenarios. These simplified interactions do not represent, e.g., multi-stage intrusion chains, asymmetric or partial-information settings, stochastic attack surfaces, or networked multi-agent cyber conflicts that characterize operational environments. Additionally, the selection of five languages covers several major linguistic families, but does not exhaust the full spectrum of cultural and linguistic variation. Moreover, the experiments were conducted in simulation without real-world network deployments or adversarial environments, leaving open the question of how these models would perform in operational cybersecurity settings. These relevant questions may constitute the basis for future work. Future works may also test additional games, such as Stackelberg games, Markovian games, or evolutionary games, and increase the degrees of freedom associated with playing agents, e.g., by equipping them with complex personalities or different degrees of information, as well as consider multi-agent games on networks.
From the game-theoretical perspective, we have considered games with well-defined equilibrium solutions, namely, minimax strategies in the zero-sum interaction and mutual defection in the repeated Prisoner's Dilemma. Our results show that LLM agents deviate systematically from these equilibria, reflecting their lack of explicit optimisation over payoff functions. Because current LLMs cannot be assumed to follow best-response dynamics, classical convergence guarantees do not apply. Our findings align with prior evidence that current LLMs frequently diverge from game-theoretic equilibria and exhibit context-dependent strategic biases (Lorè and Heydari, 2024; Duan et al., 2024; Fan et al., 2024; Herr et al., 2024). These deviations are also consistent with broader meta-game-theoretic analyses in cybersecurity, such as Yang et al. (2024), which highlight that real-world cyber conflicts often depart from equilibrium predictions due to bounded rationality, asymmetric information and multi-level strategic interactions. By instantiating FAIRGAME within cybersecurity-motivated versions of the zero-sum game and the repeated information-sharing dilemma, we extend these observations to scenarios that more closely reflect operational cyber-defense settings.
In these contexts, sub-optimality may serve as an interpretative asset: LLM agents may be more appropriate for modeling human-like, non-optimal adversarial behavior or for generating exploratory “what-if” simulations, rather than for stand-alone optimisation in high-stakes defense systems. This form of bounded rationality is valuable for modeling human attackers, whose behavior often deviates from perfect rationality. The five-language sample adopted here provides an informative but non-exhaustive view of linguistic sensitivity. Our findings should therefore be interpreted as evidence of cross-language effects rather than as a complete typological analysis; broader multilingual evaluations remain an important direction for future work. Future work may also combine empirical FAIRGAME-style evaluations with analytical tools such as deviation-from-equilibrium measures, stability bounds, or policy-induction analyses to better characterize how LLM-driven strategies relate to normative game-theoretic predictions. These behavioral patterns have direct implications for real cybersecurity operations. For example, an LLM-based assistant that exhibits a strong cooperative bias in a threat-intelligence-sharing game may encourage defenders to share more information, potentially strengthening collective defense, but may also underestimate the risks posed by malicious or opportunistic partners. Conversely, in intrusion-detection scenarios, a model that implicitly favors defection (a pessimistic stance) may overestimate attack likelihood and increase false-positive rates. Understanding the direction and magnitude of these deviations from game-theoretic optima is therefore essential when deciding whether an LLM is best suited for creative scenario exploration, training and education, or as a decision-support component in high-stakes SOC workflows. Simulating attacker and defender behaviors with AI-driven agents may thus enable better preparation and defense mechanisms, but it also opens the door to malicious uses, such as automated vulnerability discovery or adversarial prompt engineering.
From an operational perspective, the stability profiles identified in this work can be interpreted in terms of concrete SOC workflows. Models exhibiting lower Internal Variability (IV) and Variability over Rounds (VR) are better suited for tasks requiring repeatable and dependable recommendations, such as generating incident-response playbooks, suggesting SIEM configuration adjustments, or supporting routine triage. Conversely, models with higher variability may be more suitable for red-teaming and threat-hunting simulations, where diverse trajectories and exploratory what-if scenarios are desirable. Game-theoretic approaches are already used in deployed cybersecurity tools, most notably in resource-allocation and patrol-scheduling systems (Tambe et al., 2012): our results suggest that, before integrating LLM-driven agents into analogous operational SOC pipelines, practitioners should evaluate candidate models under the relevant game settings and prompting languages, and prioritize those demonstrating higher stability and lower cross-language inconsistency.
Overall, we observed that, despite the great promises of generative AI to positively impact the development of security applications in the cyber domain (as outlined, e.g., by He et al., 2025 when implementing robust mobile networking), LLMs still face notable limitations in handling uncertainty and strategic planning, as well as sensitivity to embedded biases. Our methodology and case studies suggest that, before being routinely applied, generative algorithms should be carefully tested by the community in a variety of scenarios and by considering numerous features. Only then may the cybersecurity community leverage the most promising LLMs, whose set may be identified also thanks to the metrics presented here, to develop better defensive systems.
Taken together, these results suggest that current LLM-based agents are not fit-for-purpose as stand-alone optimisers in security-critical systems, particularly where real-time guarantees and strict SLAs are required. Their systematic deviations from equilibrium behavior and their sensitivity to prompt language indicate that rule-based or reinforcement-learning agents remain preferable whenever optimality and predictability are paramount. By contrast, LLMs may be more appropriate as tools for modeling human-like strategic behavior, generating plausible attack or defense narratives, and supporting analysts in exploring what-if scenarios under uncertainty. Our analysis thus complements, rather than replaces, traditional game-theoretic approaches to cybersecurity. Beyond these behavioral observations, several conceptual issues also arise when comparing LLM-driven dynamics to established game-theoretic principles.
Based on these observations, we summarize several practical guidelines for the use of LLM-based agents in cybersecurity workflows. First, for tasks requiring stability and reproducibility, such as generating incident-response playbooks, triage templates, or configuration recommendations, models showing lower IV and VR in our experiments (e.g., GPT-4o and Llama-3-405B Instruct) should be preferred. Second, for exploratory or adversarial tasks such as red-teaming and threat-hunting simulations, models exhibiting higher variability (e.g., Claude 3.5 Sonnet or Mistral Large) may be advantageous, provided that outputs remain under human oversight. Third, in multilingual or geographically distributed deployments, organizations should evaluate model behavior in each operational language, in line with prior evidence on cross-linguistic bias and risk assessment in AI systems (Lorè and Heydari, 2024; Gennari et al., 2024). Finally, consistent with meta-game-theoretic perspectives on cyber-defense (Yang and Zhu, 2025), we recommend positioning LLM agents as complementary decision-support or scenario-generation tools, rather than autonomous optimisers in high-stakes defense systems.
As such, we advocate for responsible experimentation frameworks and transparency in reporting LLM-driven cybersecurity simulations. In fact, our case studies point to potential vulnerabilities that need to be carefully considered: if used maliciously, LLMs may behave differently from other traditional algorithms (for instance, by altering cooperative behaviors depending on the language) and bypass solutions tested on more traditional scenarios. This observation thus calls for renewed attention toward these emerging technologies, and suggests the use of coherent testing frameworks, such as FAIRGAME, to systematically test scenarios of increasing complexity. Overall, such tests would enrich our understanding of LLM behaviors in cyber systems and would help make better predictions and interventions to navigate the newest technologies.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://github.com/alessiobuscemi/cybersecurity.
Author contributions
DP: Conceptualization, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing. AB: Conceptualization, Investigation, Methodology, Software, Validation, Writing – original draft, Writing – review & editing. AD: Methodology, Validation, Writing – original draft, Writing – review & editing. TH: Methodology, Validation, Supervision, Writing – original draft, Writing – review & editing. GC: Supervision, Writing – original draft, Writing – review & editing. PL: Methodology, Validation, Supervision, Writing – original draft, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. AB is supported by the Citcom.ai, co-funded by EU/Digital Europe and, in Luxembourg, by the Feder. DP is supported by the European Union through the ERC INSPIRE grant (project number 101076926). TH is supported by EPSRC (grant EP/Y00857X/1).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Author disclaimer
Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the European Research Council Executive Agency can be held responsible for them.
References
Acquisti, A., and Grossklags, J. (2005). Privacy and rationality in individual decision making. IEEE Secur. Priv. 3, 26–33. doi: 10.1109/MSP.2005.22
Akata, E., Schulz, L., Coda-Forno, J., Oh, S. J., Bethge, M., and Schulz, E. (2025). Playing repeated games with large language models. Nat. Hum. Behav. 2025, 1–11. doi: 10.1038/s41562-025-02172-y
Alalawi, Z., Bova, P., Cimpeanu, T., Di Stefano, A., Duong, M. H., Domingos, E. F., et al. (2026). Trust AI regulation? Discerning users are vital to build trust and effective AI regulation. Appl. Math. Comput. 508:129627. doi: 10.1016/j.amc.2025.129627
Amin, S., and Johansson, K. H. (2019). Preface to the focused issue on dynamic games in cyber security. Dyn. Games Applic. 9, 881–883. doi: 10.1007/s13235-019-00335-x
Ara, M., Reboredo, H., Ghanem, S. A., and Rodrigues, M. R. (2012). “A zero-sum power allocation game in the parallel Gaussian wiretap channel with an unfriendly jammer,” in 2012 IEEE ICCS (IEEE), 60–64. doi: 10.1109/ICCS.2012.6406109
Avinash, A., and Jain, K. (2025). “Evolving strategies: LLMs as game players,” in 2025 4th International Conference on Sentiment Analysis and Deep Learning (ICSADL) (IEEE), 1009–1014. doi: 10.1109/ICSADL65848.2025.10933026
Balabanova, N., Bashir, A., Bova, P., Buscemi, A., Cimpeanu, T., da Fonseca, H. C., et al. (2025). Media and responsible AI governance: a game-theoretic and LLM analysis. arXiv:2503.09858.
Bashir, A., Shamszaman, Z. U., Song, Z., and Han, T. A. (2025). Co-evolutionary dynamics of attack and defence in cybersecurity. arXiv preprint arXiv:2505.19338.
Brown, G., Kline, J., Thomas, A., Washburn, A., and Wood, K. (2011). A game-theoretic model for defense of an oceanic bastion against submarines. Milit. Oper. Res. 16, 25–40. doi: 10.5711/1082598316425
Buscemi, A., and Proverbio, D. (2024). Large language models' detection of political orientation in newspapers. arXiv preprint arXiv:2406.00018.
Buscemi, A., Proverbio, D., Bova, P., Balabanova, N., Bashir, A., Cimpeanu, T., et al. (2025a). Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents. arXiv:2504.08640.
Buscemi, A., Proverbio, D., Di Stefano, A., Han, T. A., Castignani, G., and Liò, P. (2025b). FAIRGAME: a framework for AI agents bias recognition using game theory. arXiv preprint arXiv:2504.14325.
Correia da Fonseca, H., Fernandes, A., Song, Z., Krellner, M., Cimpeanu, T., Balabanova, N., et al. (2025). “Can media act as a soft regulator of safe AI development? A game theoretical analysis,” in ALIFE 2025.
Do, C. T., Tran, N. H., Hong, C., Kamhoua, C. A., Kwiat, K. A., Blasch, E., et al. (2017). Game theory for cyber security and privacy. ACM Comput. Surv. 50, 1–37. doi: 10.1145/3057268
Duan, J., Zhang, R., Diffenderfer, J., Kailkhura, B., Sun, L., Stengel-Eskin, E., et al. (2024). “GTBench: uncovering the strategic reasoning capabilities of LLMs via game-theoretic evaluations,” in Advances in Neural Information Processing Systems, 28219–28253. doi: 10.52202/079017-0885
Etesami, S. R., and Başar, T. (2019). Dynamic games in cyber-physical security: an overview. Dyn. Games Applic. 9, 884–913. doi: 10.1007/s13235-018-00291-y
Fan, C., Chen, J., Jin, Y., and He, H. (2024). “Can large language models serve as rational players in game theory? A systematic analysis,” in Proceedings of the AAAI Conference on Artificial Intelligence, 17960–17967. doi: 10.1609/aaai.v38i16.29751
Ferrag, M. A., Alwahedi, F., Battah, A., Cherif, B., Mechri, A., and Tihanyi, N. (2024). Generative AI and large language models for cyber security: All insights you need. Available at SSRN 4853709. doi: 10.2139/ssrn.4853709
Fontana, N., Pierri, F., and Aiello, L. M. (2024). Nicer than humans: how do large language models behave in the prisoner's dilemma? arXiv:2406.13605.
García, J., and Van Veelen, M. (2018). No strategy can win in the repeated prisoner's dilemma: linking game theory and computer simulations. Front. Robot. AI 5:102. doi: 10.3389/frobt.2018.00102
Gennari, J., Lau, S.-H., Perl, S., Parish, J., and Sastry, G. (2024). “Considerations for evaluating large language models for cybersecurity tasks,” in SEI Insights, 20.
Hammond, L., Chan, A., Clifton, J., Hoelscher-Obermaier, J., Khan, A., McLean, E., et al. (2025). Multi-agent risks from advanced AI. arXiv preprint arXiv:2502.14143.
Han, S., Zhang, Q., Yao, Y., Jin, W., Xu, Z., and He, C. (2024). LLM multi-agent systems: challenges and open problems. arXiv preprint arXiv:2402.03578.
Han, T. A., Pereira, L. M., Santos, F. C., Lenaerts, T., et al. (2020). To regulate or not: a social dynamics analysis of an idealised AI race. J. Artif. Intell. Res. 69, 881–921. doi: 10.1613/jair.1.12225
Hausken, K. (2002). Probabilistic risk analysis and game theory. Risk Anal. 22, 17–27. doi: 10.1111/0272-4332.t01-1-00002
Hausken, K. (2020). Cyber resilience in firms, organizations and societies. Internet Things 11:100204. doi: 10.1016/j.iot.2020.100204
Hausken, K. (2024). Fifty years of operations research in defense. Eur. J. Oper. Res. 318, 355–368. doi: 10.1016/j.ejor.2023.12.023
Hausken, K., Welburn, J. W., and Zhuang, J. (2024). A review of attacker-defender games and cyber security. Games 15:28. doi: 10.3390/g15040028
He, L., Sun, G., Niyato, D., Du, H., Mei, F., Kang, J., et al. (2025). Generative AI for game theory-based mobile networking. IEEE Wirel. Commun. 32, 122–130. doi: 10.1109/MWC.007.2400133
Herr, N., Acero, F., Raileanu, R., Perez-Ortiz, M., and Li, Z. (2024). “Large language models are bad game theoretic reasoners: Evaluating performance and bias in two-player non-zero-sum games,” in ICML 2024 Workshop on LLMs and Cognition.
Huang, J.-T., Li, E. J., Lam, M. H., Liang, T., Wang, W., Yuan, Y., et al. (2025). “Competing large language models in multi-agent gaming environments,” in 13th International Conference on Learning Representations.
Ji, Z., Yu, W., and Liu, K. R. (2010). A belief evaluation framework in autonomous MANETs under noisy and imperfect observation: Vulnerability analysis and cooperation enforcement. IEEE Trans. Mobile Comput. 9, 1242–1254. doi: 10.1109/TMC.2010.87
Jia, J., Yuan, Z., Pan, J., McNamara, P. E., and Chen, D. (2025). Large language model strategic reasoning evaluation through behavioral game theory. arXiv preprint arXiv:2502.20432.
Kamhoua, C. A., Pissinou, N., and Makki, S. K. (2010). “Game theoretic analysis of cooperation in autonomous multi hop networks: The consequences of unequal traffic load,” in 2010 IEEE Globecom Workshops (IEEE), 1973–1978. doi: 10.1109/GLOCOMW.2010.5700289
Kamhoua, C. A., Rodriguez, M., and Kwiat, K. A. (2014). “Testing for hardware trojans: a game-theoretic approach,” in International Conference on Decision and Game Theory for Security (Springer), 360–369. doi: 10.1007/978-3-319-12601-2_22
Kasri, W., Himeur, Y., Alkhazaleh, H. A., Tarapiah, S., Atalla, S., Mansoor, W., et al. (2025). From vulnerability to defense: the role of large language models in enhancing cybersecurity. Computation 13:30. doi: 10.3390/computation13020030
Kostyuk, N. (2013). “The digital prisoner's dilemma: challenges and opportunities for cooperation,” in 2013 World Cyberspace Cooperation Summit IV (WCC4), 1–6. doi: 10.1109/WCS.2013.7050508
Liang, W., Yuksekgonul, M., Mao, Y., Wu, E., and Zou, J. (2023). GPT detectors are biased against non-native English writers. Patterns 4:100779. doi: 10.1016/j.patter.2023.100779
Liu, Y., Feng, D., Lian, Y., Chen, K., and Zhang, Y. (2013). “Optimal defense strategies for DDoS defender using Bayesian game model,” in 9th Conference on Information Security Practice and Experience (ISPEC) (Springer), 44–59. doi: 10.1007/978-3-642-38033-4_4
Lorè, N., and Heydari, B. (2024). Strategic behavior of large language models and the role of game structure vs. contextual framing. Sci. Rep. 14:18490. doi: 10.1038/s41598-024-69032-z
Lu, Y., Aleta, A., Du, C., Shi, L., and Moreno, Y. (2024). LLMs and generative agent-based models for complex systems research. Phys. Life Rev. 51, 283–293. doi: 10.1016/j.plrev.2024.10.013
McKelvey, R. D., and Palfrey, T. R. (1998). Quantal response equilibria for extensive form games. Exper. Econ. 1, 9–41. doi: 10.1023/A:1009905800005
Morgan, S. (2020). Cybercrime to cost the world $10.5 trillion annually by 2025. Cybercrime Magazine.
Nguyen, A. T., Anand, S. C., and Teixeira, A. M. (2022). “A zero-sum game framework for optimal sensor placement in uncertain networked control systems under cyber-attacks,” in 2022 IEEE 61st Conference on Decision and Control (CDC) (IEEE), 6126–6133. doi: 10.1109/CDC51059.2022.9992468
Pala, A., and Zhuang, J. (2019). Information sharing in cybersecurity: a review. Dec. Anal. 16, 172–196. doi: 10.1287/deca.2018.0387
Patel, N., and Trivedi, S. (2020). Leveraging predictive modeling, machine learning personalization, NLP customer support, and AI chatbots to increase customer loyalty. Empir. Quests Manage. Essenc. 3, 1–24.
Petrosyan, A. (2024). Estimated cost of cybercrime worldwide 2018–2029. Available online at: https://www.statista.com/forecasts/1280009/cost-cybercrime-worldwide (Accessed June 10, 2025).
Schoenherr, J. R., and Thomson, R. (2020). “Beyond the prisoner's dilemma: the social dilemmas of cybersecurity,” in 2020 International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA) (IEEE), 1–7. doi: 10.1109/CyberSA49311.2020.9139644
Shiva, S., Roy, S., and Dasgupta, D. (2010). “Game theory for cyber security,” in Proceedings of the Sixth Annual Workshop on Cyber Security and Information Intelligence Research, 1–4. doi: 10.1145/1852666.1852704
Shukla, P., An, L., Chakrabortty, A., and Duel-Hallen, A. (2022). A robust Stackelberg game for cyber-security investment in networked control systems. IEEE Trans. Control Syst. Technol. 31, 856–871. doi: 10.1109/TCST.2022.3207671
Spyridopoulos, T., Karanikas, G., Tryfonas, T., and Oikonomou, G. (2013). A game theoretic defence framework against DoS/DDoS cyber attacks. Comput. Secur. 38, 39–50. doi: 10.1016/j.cose.2013.03.014
Srinivasan, V., Nuggehalli, P., Chiasserini, C.-F., and Rao, R. R. (2003). “Cooperation in wireless ad hoc networks,” in IEEE INFOCOM 2003 (IEEE), 808–817. doi: 10.1109/INFCOM.2003.1208918
Sun, H., Wu, Y., Cheng, Y., and Chu, X. (2025). Game theory meets large language models: a systematic survey. arXiv preprint arXiv:2502.09053.
Swoopes, C., Holloway, T., and Glassman, E. L. (2025). The impact of revealing large language model stochasticity on trust, reliability, and anthropomorphization. arXiv preprint arXiv:2503.16114.
Tambe, M., Jain, M., Pita, J. A., and Jiang, A. X. (2012). “Game theory for security: key algorithmic principles, deployed systems, lessons learned,” in 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton) (IEEE), 1822–1829. doi: 10.1109/Allerton.2012.6483443
Tessler, M. H., Bakker, M. A., Jarrett, D., Sheahan, H., Chadwick, M. J., et al. (2024). AI can help humans find common ground in democratic deliberation. Science 386:eadq2852. doi: 10.1126/science.adq2852
Von Neumann, J., and Morgenstern, O. (2007). “Theory of games and economic behavior: 60th anniversary commemorative edition,” in Theory of Games and Economic Behavior. Princeton: Princeton University Press. doi: 10.1515/9781400829460
Wang, Y., Wang, Y., Liu, J., Huang, Z., and Xie, P. (2016). “A survey of game theoretic methods for cyber security,” in 2016 IEEE 1st International Conferences on Data Science in Cyberspace (DSC) (IEEE), 631–636. doi: 10.1109/DSC.2016.90
Wang, Z., Kokubo, S., Jusup, M., and Tanimoto, J. (2015). Universal scaling for the dilemma strength in evolutionary games. Phys. Life Rev. 14, 1–30. doi: 10.1016/j.plrev.2015.04.033
Wang, Z., Song, R., Shen, C., Yin, S., Song, Z., Battu, B., et al. (2024). Large language models overcome the machine penalty when acting fairly but not when acting selfishly or altruistically. arXiv:2410.03724.
Wellman, M. P., Tuyls, K., and Greenwald, A. (2025). Empirical game theoretic analysis: a survey. J. Artif. Intell. Res. 82, 1017–1076. doi: 10.1613/jair.1.16146
Xiao, Y., Shi, G., and Zhang, P. (2025). Towards agentic AI networking in 6G: a generative foundation model-as-agent approach. arXiv preprint arXiv:2503.15764.
Yamin, M. M., Hashmi, E., Ullah, M., and Katt, B. (2024). Applications of LLMs for generating cyber security exercise scenarios. IEEE Access 12, 143806–143822. doi: 10.1109/ACCESS.2024.3468914
Yang, Y., Du, H., Sun, G., Xiong, Z., Niyato, D., and Han, Z. (2024). Exploring equilibrium strategies in network games with generative AI. IEEE Netw. 39, 191–200. doi: 10.1109/MNET.2024.3521887
Yang, Y.-T., and Zhu, Q. (2025). Toward a multi-echelon cyber warfare theory: a meta-game-theoretic paradigm for defense and dominance. arXiv preprint arXiv:2509.08976.
Zhang, J., Bu, H., Wen, H., Liu, Y., Fei, H., Xi, R., et al. (2025). When LLMs meet cybersecurity: a systematic literature review. Cybersecurity 8, 1–41. doi: 10.1186/s42400-025-00361-w
Keywords: game theory, large language model, generative AI, Prisoner's Dilemma, zero-sum game, cybersecurity, eavesdropping, network security
Citation: Proverbio D, Buscemi A, Di Stefano A, Han TA, Castignani G and Liò P (2025) Can LLMs effectively provide game-theoretic-based scenarios for cybersecurity? Front. Comput. Sci. 7:1703586. doi: 10.3389/fcomp.2025.1703586
Received: 11 September 2025; Revised: 19 September 2025; Accepted: 25 November 2025;
Published: 11 December 2025.
Edited by:
Nicola Zannone, Eindhoven University of Technology, Netherlands
Reviewed by:
Tam N. Nguyen, US General Service Administration, United States
Dmytro Lande, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Ukraine
Copyright © 2025 Proverbio, Buscemi, Di Stefano, Han, Castignani and Liò. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Daniele Proverbio, daniele.proverbio@unitn.it