Tool Use Security

9 resources

Agentic AI Security

Function calling security, plugin safety, and API tool governance

paper reviewed open access 2024

SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

John Yang, Carlos E. Jimenez, Alexander Wettig + 4 more — NeurIPS 2024

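Introduces SWE-agent, an autonomous coding agent that uses a purpose-built agent-computer interface to navigate repositories, edit code, and run programs when solving software engineering tasks; its ability to execute commands raises questions about agent containment.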

paper reviewed open access 2024

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

Edoardo Debenedetti, Jie Zhang, Mislav Balunović + 3 more — arXiv preprint

Introduces AgentDojo, a framework for evaluating the security of LLM agents against prompt injection and other attacks in realistic tool-use scenarios.

paper reviewed open access 2024

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents

Qiusi Zhan, Zhixiang Liang, Zifan Ying + 1 more — Findings of ACL 2024

Presents InjecAgent, a benchmark for evaluating indirect prompt injection attacks against tool-using LLM agents, showing that the evaluated agents are vulnerable to instructions injected through tool outputs.

paper reviewed open access 2024

Adversarial Attacks on Multimodal Agents

Chen Henry Wu, Jing Yu Koh, Ruslan Salakhutdinov + 2 more — arXiv preprint

Demonstrates adversarial attacks on multimodal agents that take actions in digital environments, showing visual perturbations can hijack agent behavior.

tool reviewed open access 2024

PyRIT: Python Risk Identification Toolkit for Generative AI

Microsoft AI Red Team — GitHub / Microsoft

Microsoft's open-source framework for red teaming generative AI systems, supporting automated prompt generation, attack strategies, and scoring of AI responses.

paper reviewed open access 2024

Model Context Protocol (MCP): Security Considerations and Best Practices

Anthropic — Anthropic Documentation

Documentation and analysis of security considerations for the Model Context Protocol, covering authentication, authorization, and tool sandboxing.

paper reviewed open access 2024

Model Context Protocol (MCP): Specification

Anthropic — Anthropic / GitHub

Open protocol specification for connecting AI models to external data sources and tools; it standardizes tool use and includes security considerations.

paper reviewed open access 2023

ReAct: Synergizing Reasoning and Acting in Language Models

Shunyu Yao, Jeffrey Zhao, Dian Yu + 4 more — ICLR 2023

Foundational work on the ReAct paradigm, in which LLM agents interleave reasoning and tool-use actions to complete complex tasks; background for the security implications of agents that act on tool outputs.

paper reviewed open access 2023

Toolformer: Language Models Can Teach Themselves to Use Tools

Timo Schick, Jane Dwivedi-Yu, Roberto Dessì + 6 more — NeurIPS 2023

Demonstrates how LLMs can learn to use external tools (APIs, search engines, calculators) through self-supervised learning; foundational background for understanding tool-use security.