niuzai's Documents

niuzai

Personal signature ...

BattleAgentBench: A Benchmark for Evaluating Cooperation and Competition Capabilities of Language Models in Multi-Agent Systems

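Large language models (LLMs) are becoming increasingly powerful and can handle complex tasks, such as building single-agent and multi-agent systems ...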

2025/06/01 arXiv:2408.15971v1 niuzai

On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents

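Multi-agent systems based on large language models have shown strong capabilities across a variety of tasks thanks to the collaboration of expert agents, each focusing on a specific domain. However, the impact of clumsy or even malicious agents, i.e., ...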

2025/05/29 arXiv:2408.00989v3 niuzai

A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

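Current large language models (LLMs) are not only limited to some maximum context length, but also cannot robustly consume long inputs. To address these limitations, we propose ReadAgent, an LLM agent system that increases effective context length by up to 20x in our experiments. Inspired by how humans interactively read long documents, we implement ReadAgent as a simple prompting system that uses the advanced language capabilities of LLMs to (1) decide what content to store together in a memory episode, (2) compress those memory episodes into short episodic memories called gist memories, and (3) take actions to look up passages in the original text when it needs to recall relevant details to complete a task ...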

2025/05/19 arXiv:2402.09727v3 niuzai

Episodic Memories Generation and Evaluation Benchmark for Large Language Models

Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning

DialSim: A Real-Time Simulator for Evaluating Long-Term Multi-Party Dialogue Understanding of Conversation Systems

LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

Online Adaptation of Language Models with a Memory of Amortized Contexts

Prompted LLMs as Chatbot Modules for Long Open-domain Conversation
