ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Language and Computation

Volume 8 - 2025 | doi: 10.3389/frai.2025.1623573

Weaponizing Cognitive Bias in Autonomous Systems: A Framework for Black-box Inference Attacks

Provisionally accepted
  • Aviation Industry Development Research Center of China, Beijing, China

The final, formatted version of the article will be published soon.

Autonomous systems deployed in high-dimensional environments increasingly rely on prioritization heuristics to allocate attention and assess risk. Although often perceived as objective, these systems can exhibit cognitive biases, such as salience, spatial framing, and temporal familiarity, that can be exploited to alter decision-making without modifying the input or accessing internal states. This study introduces Priority Inversion via Operational Reasoning (PRIOR), a black-box, non-perturbative diagnostic framework that probes reasoning vulnerabilities using scenario cues that are structurally biased but semantically neutral, leaving pixel-level, statistical, and surface semantic properties unchanged. Because direct access to embodied vision-based systems remains limited, we evaluate PRIOR using large language models as abstract reasoning proxies. These models simulate cognitive prioritization under constrained textual scenarios, though they do not reflect real-time perception or physical-world interaction. In controlled experiments on textual surveillance analogs inspired by UAV decision contexts, we find that minimal structural cues can induce consistent priority inversions across multiple models. By jointly analyzing model justifications and confidence estimates, we observe systematic distortions in inferred threat relevance, even under input symmetry. These findings reveal inference-level fragility in black-box reasoning systems and motivate evaluation strategies that go beyond output correctness to interrogate internal prioritization logic. Future work should extend these insights to dynamic, embodied, and visually grounded agents operating in real-world deployments.
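
The idea of a priority inversion under input symmetry can be illustrated with a minimal sketch. It is not the authors' PRIOR implementation: the names `discordant_pairs`, `inversion_rate`, and the two toy rankers are hypothetical, and the sketch assumes that a "structural cue" is simply a change in presentation order while the set of described entities stays identical. A real evaluation would replace the toy rankers with calls to the model under test.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a ranker receives entity descriptions in presentation
# order and returns the same entities ordered from highest to lowest priority.
Ranker = Callable[[Sequence[str]], List[str]]


def discordant_pairs(rank_a: Sequence[str], rank_b: Sequence[str]) -> int:
    """Count entity pairs whose relative priority differs between two rankings."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def inversion_rate(
    ranker: Ranker,
    scenario_pairs: Sequence[Tuple[Sequence[str], Sequence[str]]],
) -> float:
    """Fraction of (baseline, cued) scenario pairs with at least one inversion.

    Assumption: each pair contains the same entities, so any change in the
    ranking is attributable to the structural cue rather than to content.
    """
    if not scenario_pairs:
        return 0.0
    inverted = 0
    for baseline, cued in scenario_pairs:
        if sorted(baseline) != sorted(cued):
            raise ValueError("baseline and cued scenarios must contain the same entities")
        if discordant_pairs(ranker(baseline), ranker(cued)) > 0:
            inverted += 1
    return inverted / len(scenario_pairs)


def position_biased_ranker(items: Sequence[str]) -> List[str]:
    """Toy stand-in for a biased model: priority follows presentation order."""
    return list(items)


def order_invariant_ranker(items: Sequence[str]) -> List[str]:
    """Toy stand-in for an unbiased model: priority ignores presentation order."""
    return sorted(items)


pairs = [
    (["a", "b", "c"], ["c", "b", "a"]),  # same entities, reversed presentation
    (["a", "b"], ["a", "b"]),            # control pair with no structural change
]
biased_rate = inversion_rate(position_biased_ranker, pairs)
invariant_rate = inversion_rate(order_invariant_ranker, pairs)
```

In this toy setup the position-biased ranker inverts on the reordered pair but not on the control pair, while the order-invariant ranker never inverts. The inversion rate is only one possible summary; the abstract also describes analyzing model justifications and confidence estimates, which this sketch does not cover.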

Keywords: cognitive bias, visual reasoning vulnerabilities, trustworthy AI evaluation, priority inversion, non-perturbative black-box attacks

Received: 07 May 2025; Accepted: 30 Jul 2025.

Copyright: © 2025 Chu and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Yuwei Chen, Aviation Industry Development Research Center of China, Beijing, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.