0x211's documents

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

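Current research on the adversarial robustness of LLMs focuses on discrete input manipulations in the natural-language space, which can be transferred directly to closed-source models. However, this approach neglects the steady progress of open-source models. As open-source models become more capable, ensuring their safety also becomes increasingly urgent ...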

0 0 0 0 2025/08/15 arXiv:2402.09063v2 0x211

Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language Models

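Large language models (LLMs) have become increasingly popular. Their emergent capabilities can be attributed to their massive training datasets. However, these datasets often contain undesirable or inappropriate content, e ...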

0 0 0 0 2025/08/15 arXiv:2502.15836v1 0x211

RAG Safety: Exploring Knowledge Poisoning Attacks to Retrieval-Augmented Generation

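Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving external data to mitigate hallucinations and outdated knowledge. Because knowledge graphs (KGs) are well suited to integrating diverse data sources and supporting faithful reasoning, they are increasingly adopted in RAG systems, giving rise to KG-based RAG (KG-RAG) methods. Although RAG systems are widely used across applications, recent studies have also revealed their vulnerability to data poisoning attacks, in which malicious information injected into external knowledge sources can mislead the system into producing incorrect or harmful responses ...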

0 0 1 1 2025/08/15 arXiv:2507.08862v1 0x211

Lossless data compression by large models

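After 80 years of research, millions of papers, and a wide range of applications, modern data compression methods are slowly reaching their limits. Yet the extravagant speed requirements of 6G communication raise a major open question that calls for revolutionary new ideas in data compression. We have previously shown that, under reasonable assumptions, all understanding or learning is compression ...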

0 0 0 0 2025/07/21 arXiv:2407.07723v3 0x211

Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-based Decision-Making Systems

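Large language models (LLMs) have shown significant promise in real-world decision-making tasks for embodied artificial intelligence, especially when fine-tuned to leverage their inherent common sense and reasoning abilities while being tailored to specific applications. However, this fine-tuning process introduces considerable safety and security vulnerabilities, especially in safety-critical cyber-physical systems. In this work, we propose the first comprehensive framework for Backdoor Attacks against LLM-based Decision-making systems (BALD) in embodied AI, systematically exploring the attack surfaces and trigger mechanisms ...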

0 0 0 0 2025/07/17 arXiv:2405.20774v3 0x211

GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search

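Dense embedding-based text retrieval, the retrieval of passages from corpora via deep learning encodings, has emerged as a powerful method that attains state-of-the-art search results and has popularized the use of retrieval-augmented generation (RAG). Still, like other search methods, embedding-based retrieval may be susceptible to search-engine optimization (SEO) attacks, in which adversaries promote malicious content by introducing adversarial passages into corpora. To faithfully assess and gain insight into the susceptibility of such systems to SEO, this work proposes the GASLITE attack, a mathematically principled gradient-based search method for generating adversarial passages without relying on the corpus content or modifying the model ...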

0 0 1 2 2025/07/15 arXiv:2412.20953v1 0x211

L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression

DZip: improved general-purpose lossless compression based on novel neural network modeling

Text Compression for Efficient Language Generation

StruQ: Defending Against Prompt Injection with Structured Queries