0x211's Documents

Ward: Provable RAG Dataset Inference via LLM Watermarks

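RAG lets LLMs easily incorporate external data, which raises concerns for data owners about unauthorized use of their content. The challenge of detecting such unauthorized use remains underexplored, and datasets and methods from adjacent fields are ill-suited to studying it. We take several steps to bridge this gap ...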

2025/05/21 arXiv:2410.03537v2 0x211

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

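Retrieval-Augmented Generation (RAG) improves pre-trained models by incorporating external knowledge at test time to enable customized adaptation. We study the risk of datastore leakage in Retrieval-In-Context RAG Language Models (LMs). We show that an adversary can exploit LMs' instruction-following capabilities to easily extract text data verbatim, via prompt injection, from the datastore of RAG systems built with instruction-tuned LMs ...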

2025/05/20 arXiv:2402.17840v3 0x211

Optimization-based Prompt Injection Attack to LLM-as-a-Judge

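LLM-as-a-Judge uses a large language model (LLM) to select the best response to a given question from a set of candidates. LLM-as-a-Judge has many applications, such as LLM-powered search, reinforcement learning with AI feedback (RLAIF), and tool selection. In this work, we propose JudgeDeceiver, an optimization-based prompt injection attack on LLM-as-a-Judge ...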

2025/05/18 arXiv:2403.17710v4 0x211

Formalizing and Benchmarking Prompt Injection Attacks and Defenses

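A prompt injection attack aims to inject malicious instructions or data into the input of an LLM-integrated application so that it produces the results an attacker desires. Existing works are limited to case studies. As a result, the literature lacks a systematic understanding of prompt injection attacks and their defenses ...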

2025/05/14 arXiv:2310.12815v4 0x211

ControlNET: A Firewall for RAG-based LLM System

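Retrieval-Augmented Generation (RAG) has significantly improved the factual accuracy and domain adaptability of Large Language Models (LLMs). This advance has enabled their widespread deployment in sensitive domains such as healthcare, finance, and enterprise applications. RAG mitigates hallucinations by integrating external knowledge, but it introduces privacy and security risks, notably data breaching and data poisoning ...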

2025/05/12 arXiv:2504.09593v2 0x211

Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications

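In this paper, we show that when communication between GenAI-powered applications relies on RAG-based inference, an attacker can initiate a computer worm-like chain reaction that we call Morris-II. This is done by crafting an adversarial self-replicating prompt that triggers a cascade of indirect prompt injections within the ecosystem, forcing each affected application to perform malicious actions and to compromise the RAG of additional applications. We evaluate the worm's performance in creating a chain of confidential user data extraction within a GenAI ecosystem of GenAI-powered email assistants, and we analyze how that performance is affected by the size of the context, the adversarial self-replicating prompt used, the type and size of the embedding algorithm employed, and the number of hops ...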

2025/05/12 arXiv:2403.02817v2 0x211

Don't Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models

Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents

Black-box Adversarial Attacks against Dense Retrieval Models: A Multi-view Contrastive Learning Method

Adversarial Semantic Collisions