
ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Machine Learning and Artificial Intelligence

Threats and Vulnerabilities in Artificial Intelligence and Agentic AI Models

Provisionally accepted
  • 1 University of Oxford, Oxford, United Kingdom
  • 2 The Alan Turing Institute, London, United Kingdom
  • 3 Cisco Systems Inc, San Jose, United States
  • 4 University of Warwick, Coventry, United Kingdom

The final, formatted version of the article will be published soon.

Abstract: This study develops a system-level analytical framework for understanding adversarial vulnerabilities in artificial intelligence and agentic AI systems, reconceptualising adversarial robustness beyond input-level perturbations to encompass autonomy, self-governance, and closed-loop decision-making. We formalise adversarial risk across perceptual, cognitive, and executive layers, thereby extending classical adversarial machine-learning theory to agentic architectures whose behaviour unfolds over time through feedback and control. Drawing on a PRISMA-compliant systematic literature review, bibliometric analysis, and targeted empirical validation, we synthesise established adversarial results across vision benchmarks and recent large-language-model red-teaming studies to contextualise the framework rather than to propose new state-of-the-art benchmarks. The analysis demonstrates that no single defence mechanism is sufficient across layers, as vulnerabilities propagate from perception to policy and actuation. We show that architectural similarity, domain shift, and feedback dynamics critically shape adversarial transferability and failure modes, with direct implications for safety-critical domains including mobility, healthcare imaging, and biometric security. By framing higher-order agentic adversarial threats as hypothesis-driven, system-level risks, this work defines a coherent research agenda for agentic AI security focused on behavioural integrity, lifecycle resilience, and governance-aware defence design.

Keywords: Advanced Attack Techniques, adversarial attacks, artificial intelligence, Blackbox Attacks, Carlini and Wagner Attack (C&W), Defense Mechanisms, fast gradient sign method (FGSM), machine learning
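For readers unfamiliar with the input-level baseline that the abstract and keywords reference, the fast gradient sign method (FGSM) is the classical perturbation attack against which system-level, agentic threats are contrasted. The sketch below is a minimal, illustrative PyTorch implementation of FGSM; the model, inputs, labels, and epsilon value are assumptions for exposition and do not reflect the paper's experimental setup.

```python
# Minimal FGSM sketch: perturb an input in the direction that increases
# the model's loss, bounded by a small epsilon. Illustrative only.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Return an FGSM-perturbed copy of the input batch x (labels y)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)   # loss on clean labels
    loss.backward()                           # gradient of loss w.r.t. the input
    # Single signed-gradient step, then clamp back to the valid pixel range
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Input-level attacks of this kind operate only at the perceptual layer; the framework summarised in the abstract is concerned with how such perturbations, and stronger optimisation-based attacks such as Carlini and Wagner (C&W), propagate through the cognitive and executive layers of an agentic system.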

Received: 24 Oct 2025; Accepted: 07 Jan 2026.

Copyright: © 2026 Radanliev, Santos and Maple. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Petar Radanliev

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.