ORIGINAL RESEARCH article
Front. Comput. Sci.
Sec. Computer Security
This article is part of the Research Topic: Innovative Solutions for Safeguarding Intelligent Systems
RoLLMRec: A Robust LLM-Based Recommender System for Defending Against Shilling and Prompt Injection Attacks
Provisionally accepted
Toronto Metropolitan University, Toronto, Canada
Large Language Models (LLMs) are increasingly being integrated into recommender systems, offering contextual reasoning, cross-domain adaptability, and natural language interaction. However, their adoption also introduces vulnerabilities such as prompt injection, semantic poisoning, and shilling attacks, which can distort recommendations and erode user trust. Addressing these risks is essential for the safe deployment of LLM-based recommenders. We propose RoLLMRec, a defense-oriented architectural framework and evaluation methodology for LLM-based recommender systems that integrates prompt filtering, retrieval-augmented grounding, trust-aware scoring, and an auditing feedback loop. RoLLMRec improves robustness under the evaluated prompt-level and semantic adversarial settings, while multimodal support is included at the architectural level only and is not empirically evaluated in the current experimental setup. RoLLMRec unifies five core components: (1) prompt shielding and input filtering to detect and block adversarial instructions; (2) retrieval-augmented generation to enrich factual grounding and reduce hallucination; (3) multimodal LLM encoding for text, metadata, and image inputs; (4) trust-aware scoring and Top-K ranking; and (5) adaptive feedback loops for continual learning. Evaluations on benchmark datasets such as Yelp, MovieLens, and Amazon Books show that RoLLMRec surpasses BERT4Rec, RecVAE, and LightGCN, improving NDCG@10 and HR@10 by up to 6% and 5%, respectively. Under a 10% prompt-injection attack, it maintains a Robust Hit Rate (RHR@10) above 0.63 and a Perturbation Sensitivity Index (PSI) below 0.135, achieving 15–25% higher resilience. It also sustains a Semantic Stability Score (SSS) above 0.60 in zero-shot cross-domain transfer, confirming stable semantic intent.
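To make the described pipeline concrete, the following is a minimal, hypothetical sketch of the control flow the abstract outlines: prompt shielding, retrieval-augmented grounding, trust-aware scoring, and Top-K ranking. All class, function, and parameter names (e.g. `shield_prompt`, `trust_aware_score`, the blending weight `alpha`) are illustrative assumptions and are not taken from the paper's implementation.

```python
# Hypothetical sketch of a RoLLMRec-style defense pipeline:
# prompt shielding -> retrieval-augmented grounding -> trust-aware scoring -> Top-K.
# Names and the scoring formula are illustrative, not the paper's actual code.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Candidate:
    item_id: str
    relevance: float      # base relevance from the LLM / backbone recommender
    source_trust: float   # trust weight for the interaction source, in [0, 1]

def shield_prompt(prompt: str,
                  blocklist: tuple[str, ...] = ("ignore previous", "system:")) -> str:
    """Tiny stand-in for adversarial-instruction filtering (step 1)."""
    lowered = prompt.lower()
    if any(pattern in lowered for pattern in blocklist):
        raise ValueError("prompt rejected by input filter")
    return prompt

def trust_aware_score(c: Candidate, alpha: float = 0.7) -> float:
    """Blend model relevance with a trust weight; alpha is a hypothetical knob (step 4)."""
    return alpha * c.relevance + (1.0 - alpha) * c.source_trust

def recommend_top_k(prompt: str,
                    retrieve: Callable[[str], list[Candidate]],
                    k: int = 10) -> list[Candidate]:
    safe_prompt = shield_prompt(prompt)                # (1) prompt shielding
    candidates = retrieve(safe_prompt)                 # (2) retrieval-augmented grounding
    ranked = sorted(candidates, key=trust_aware_score, reverse=True)
    return ranked[:k]                                  # trust-aware Top-K ranking
```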
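The robustness metrics named in the abstract can also be sketched. The definitions below are assumptions for illustration only: RHR@K is taken here as the hit rate of the Top-K lists produced under attack, and PSI as the average fraction of Top-K items that change between clean and perturbed inputs; the paper may define them differently.

```python
# Illustrative robustness metrics; definitions are assumed, not the paper's.

def hit_rate_at_k(topk_lists: list[list[str]], ground_truth: list[str], k: int = 10) -> float:
    """Fraction of users whose held-out item appears in their (possibly attacked) Top-K."""
    hits = sum(1 for topk, gt in zip(topk_lists, ground_truth) if gt in topk[:k])
    return hits / len(ground_truth)

def perturbation_sensitivity(clean: list[list[str]],
                             attacked: list[list[str]],
                             k: int = 10) -> float:
    """Average fraction of Top-K items that differ between clean and attacked lists."""
    changes = [1.0 - len(set(c[:k]) & set(a[:k])) / k
               for c, a in zip(clean, attacked)]
    return sum(changes) / len(changes)
```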
Keywords: adversarial machine learning, Hallucination, Large Language Model, Large Language Model for Recommendation, LLM4Rec, recommender systems, Retrieval Augmented Generation, Shilling attacks
Received: 29 Oct 2025; Accepted: 13 Feb 2026.
Copyright: © 2026 Shehmir and Kashef. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Rasha Kashef
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.