Edemacu, Kennedy, Shashidhar, Vinay M., Tuape, Micheal, Abudu, Dan, Jang, Beakcheol and Kim, Jong Wook (2025). Defending Against Knowledge Poisoning Attacks During Retrieval-Augmented Generation. Other. arXiv.
Abstract
Retrieval-Augmented Generation (RAG) has emerged as a powerful approach to boost the capabilities of large language models (LLMs) by incorporating external, up-to-date knowledge sources. However, this introduces a potential vulnerability to knowledge poisoning attacks, where attackers can compromise the knowledge source to mislead the generation model. One such attack is PoisonedRAG, in which injected adversarial texts steer the model to generate an attacker-chosen response to a target question. In this work, we propose two novel defense methods, FilterRAG and ML-FilterRAG, to mitigate the PoisonedRAG attack. First, we introduce a new property that differentiates adversarial texts from clean texts in the knowledge data source. Next, we employ this property in the design of our proposed approaches to filter adversarial texts out from the clean ones. Evaluation of these methods on benchmark datasets demonstrates their effectiveness, with performance close to that of the original RAG systems.
| Divisions: | College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies; College of Engineering & Physical Sciences |
|---|---|
| Additional Information: | This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use: https://creativecommons.org/licenses/by/4.0/ |
| Uncontrolled Keywords: | cs.LG,cs.IR |
| Last Modified: | 24 Oct 2025 07:02 |
| Date Deposited: | 23 Oct 2025 15:06 |
| Related URLs: | https://arxiv.o ... /abs/2508.02835 (Publisher URL) |
| PURE Output Type: | Working paper/Preprint |
| Published Date: | 2025-08-04 |
| Authors: | Edemacu, Kennedy; Shashidhar, Vinay M.; Tuape, Micheal; Abudu, Dan (ORCID: 0000-0002-9321-0829); Jang, Beakcheol; Kim, Jong Wook |