Tool Use Security

9 resources

Agentic AI Security

Function calling security, plugin safety, and API tool governance

paper reviewed open access 2024

SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

John Yang, Carlos E. Jimenez, Alexander Wettig + 4 more — NeurIPS 2024

Demonstrates autonomous coding agents that interact with computer interfaces to solve software engineering tasks, raising questions about agent containment.

autonomous operations agent architecture tool use security 200 citations

paper reviewed open access 2024

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

Edoardo Debenedetti, Jie Zhang, Mislav Balunovic + 3 more — arXiv preprint

Introduces AgentDojo, a framework for evaluating the security of LLM agents against prompt injection and other attacks in realistic tool-use scenarios.

agentic threats prompt injection benchmarks tool use security 75 citations

paper reviewed open access 2024

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents

Qiusi Zhan, Zhixiang Liang, Zifan Ying + 1 more — ACL 2024 Findings

Presents InjecAgent, a benchmark for evaluating indirect prompt injection attacks against LLM agents that use tools, showing most agents are highly vulnerable.

prompt injection tool use security benchmarks 65 citations

paper reviewed open access 2024

Adversarial Attacks on Multimodal Agents

Chen Henry Wu, Jing Yu Koh, Ruslan Salakhutdinov + 2 more — arXiv preprint

Demonstrates adversarial attacks on multimodal agents that take actions in digital environments, showing visual perturbations can hijack agent behavior.

agentic threats adversarial examples tool use security 25 citations

tool reviewed open access 2024

PyRIT: Python Risk Identification Toolkit for Generative AI

Microsoft AI Red Team — GitHub / Microsoft

Microsoft's open-source framework for red teaming generative AI systems, supporting automated prompt generation, attack strategies, and scoring of AI responses.

red teaming fuzzing tool use security

paper reviewed open access 2024

Model Context Protocol (MCP): Security Considerations and Best Practices

Anthropic — Anthropic Documentation

Documentation and analysis of security considerations for the Model Context Protocol, covering authentication, authorization, and tool sandboxing.

tool use security agent architecture access control

paper reviewed open access 2024

Model Context Protocol (MCP): Specification

Anthropic — Anthropic / GitHub

Open protocol specification for connecting AI models to external data sources and tools, enabling standardized tool use with security considerations.

tool use security agent architecture access control

paper reviewed open access 2023

ReAct: Synergizing Reasoning and Acting in Language Models

Shunyu Yao, Jeffrey Zhao, Dian Yu + 4 more — ICLR 2023

Foundational work on the ReAct paradigm for LLM agents that interleave reasoning and tool-use actions, enabling complex task completion with security implications.

agent architecture tool use security 2500 citations

paper reviewed open access 2023

Toolformer: Language Models Can Teach Themselves to Use Tools

Timo Schick, Jane Dwivedi-Yu, Roberto Dessi + 6 more — NeurIPS 2023

Demonstrates how LLMs can learn to use external tools (APIs, search engines, calculators) through self-supervised learning, foundational for understanding tool-use security.

tool use security agent architecture 1400 citations