Agent Architecture

24 resources

Agentic AI Security

Multi-agent security patterns, isolation, and trust boundaries

paper 2026

FlowSteer: Prompt-Only Workflow Steering Exposes Planning-Time Vulnerabilities in Multi-Agent LLM Systems

Fanxiao Li, Jiaying Wu, Tingchao Fu + 3 more

Multi-agent systems (MAS) powered by large language models (LLMs) increasingly adopt planner-executor architectures, where planners convert prompts into subtasks, roles, dependencies, and routing paths. This flexibility enables adaptive coordination, but exposes an attack surface in workflow formation: prompts can shape agent organization without modifying MAS infrastructure. We study this risk through social influence probing workflows to identify high-impact subtasks and malicious-signal prop

agent architecture

paper 2026

When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions

Minfeng Qi, Tianqing Zhu, Zijie Xu + 3 more

Automated intrusion-style workflows require LLM agents to reason over partial observations, tool outputs, and executable artifacts under bounded budgets. A single LLM instance often compresses evidence extraction, planning, execution, and validation into one context, which increases the risk of context drift and error propagation. Existing LLM-based multi-agent systems support general collaboration, but they do not explicitly model the role boundaries, artifact provenance, and cost constraints t

agent architecture

paper 2026

Insider Attacks in Multi-Agent LLM Consensus Systems

Xiaolin Sun, Zixuan Liu, Yibin Hu + 1 more

Large language models (LLMs) are increasingly deployed in multi-agent systems where agents communicate in natural language to solve tasks jointly. A key capability in such systems is consensus formation, where agents iteratively exchange messages and update decisions to reach a shared outcome. However, most existing multi-agent LLM frameworks assume that all participating agents are aligned with the system objective. In practice, a malicious insider may participate as a legitimate member of the

agent architecture

paper 2026

Towards Security-Auditable LLM Agents: A Unified Graph Representation

Chaofan Li, Lyuye Zhang, Jintao Zhai + 9 more

LLM-based agentic systems are rapidly evolving to perform complex autonomous tasks through dynamic tool invocation, stateful memory management, and multi-agent collaboration. However, this semantics-driven execution paradigm creates a severe semantic gap between low-level physical events and high-level execution intent, making post-hoc security auditing fundamentally difficult. Existing representation mechanisms, including static SBOMs and runtime logs, provide only fragmented evidence and fail

agentic threats, agent architecture

paper 2026

When Alignment Isn't Enough: Response-Path Attacks on LLM Agents

Mingyu Luo, Zihan Zhang, Zesen Liu + 7 more

Bring-Your-Own-Key (BYOK) agent architectures let users route LLM traffic through third-party relays, creating a critical integrity gap: a malicious relay can modify an aligned LLM response after generation but before agent execution. We formalize this post-alignment tampering threat and show that, without end-to-end integrity, the relay can observe, suppress, or replace downstream messages, making even perfectly aligned LLMs ineffective against such attacks. We instantiate this threat as the Re

guardrails, agent architecture

paper 2026

QASecClaw: A Multi-Agent LLM Approach for False Positive Reduction in Static Application Security Testing

Mohd Ruhul Ameen, Md Takrim Ul Alam, Akif Islam

Static Application Security Testing tools help developers find security vulnerabilities before release, but they often produce many false positives. This increases manual review effort, reduces developer trust, and may cause real vulnerabilities to be ignored among noisy reports. We present QASecClaw, a multi-agent approach that combines conventional Static Application Security Testing with contextual code review based on a coding-specialized Large Language Model. A SAST engine first reports candidat

agent architecture

paper 2026

Self-Adaptive Multi-Agent LLM-Based Security Pattern Selection for IoT Systems

Saeid Jamshidi, Foutse Khomh, Carol Fung + 1 more

The adoption of Internet of Things (IoT) systems at the network edge of smart architectures is increasing rapidly, intensifying the need for security mechanisms that are both adaptive and resource-efficient. In such environments, runtime defence mechanisms are no longer limited to detection alone but become a resource-constrained task of selecting mitigation actions. Security controls must be carefully selected, combined, and executed under latency, energy, and computational constraints, while p

agent architecture

paper 2026

When Child Inherits: Modeling and Exploiting Subagent Spawn in Multi-Agent Networks

Ziwen Cai, Yihe Zhang, Xiali Hei

Since the official release of ChatGPT in 2022, large language models (LLMs) have rapidly evolved from chatbot-style interfaces into agentic systems that can delegate work through tools and newly spawned subagents. While these capabilities improve automation and scalability, they also pose new security risks in multi-agent networks. Existing research has studied how individual LLM-based agents can be compromised through prompt injection, jailbreaking, poisoned retrieval data, or malicious extensi

prompt injection, jailbreaking, agentic threats, agent architecture

paper 2026

Architecture Matters: Comparing RAG Systems under Knowledge Base Poisoning

Samuel Korn

Retrieval-Augmented Generation (RAG) systems are vulnerable to knowledge base poisoning, yet existing attacks have been evaluated almost exclusively against vanilla retrieve-then-generate pipelines. Architectures designed to handle conflicting retrieved information - multi-agent debate, agentic retrieval, recursive language models - remain untested against adversarially optimized contradictions. We evaluate four RAG architectures (vanilla RAG, agentic RAG, MADAM-RAG, and Recursive Language Model

agentic threats, agent architecture

paper 2026

SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy

Ali Dehghantanha, Sajad Homayoun

Recent AI systems combine large language models with tools, external knowledge via retrieval-augmented generation (RAG), and even autonomous multi-agent decision loops. This agentic AI paradigm greatly expands capabilities - but also vastly enlarges the attack surface. In this systematization, we map out the trust boundaries and security risks of agentic LLM-based systems. We develop a comprehensive taxonomy of attacks spanning prompt-level injections, knowledge-base poisoning, tool/plug-in expl

agentic threats, agent architecture

paper 2026

MAGIQ: A Post-Quantum Multi-Agentic AI Governance System with Provable Security

Sepideh Avizeh, Tushin Mallick, Alina Oprea + 2 more

Our computing ecosystem is being transformed by two emerging paradigms: the increased deployment of agentic AI systems and advancements in quantum computing. With respect to agentic AI systems, one of the most critical problems is creating secure governing architectures that ensure agents follow their owners' communication and interaction policies and can be held accountable for the messages they exchange with other agents. With respect to quantum computing, existing systems must be retrofitted

agentic threats, agent architecture

paper 2026

Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure

Krti Tallam

The security discussion around agentic AI focuses heavily on prompt injection. This paper argues that multi-agent systems also create a distinct authorization problem: maintaining authorization invariants as non-human principals retrieve data, delegate tasks, and synthesize results across changing boundaries. We call this problem authorization propagation. It is not reducible to prompt injection and is not fully addressed by classical access-control models such as RBAC, ABAC, or ReBAC. The paper

prompt injection, agentic threats, access control, agent architecture

paper 2026

Hidden Coalitions in Multi-Agent AI: A Spectral Diagnostic from Internal Representations

Cameron Berg, Susan L. Schneider, Mark M. Bailey

Collections of interacting AI agents can form coalitions, creating emergent group-level organization that is critical for AI safety and alignment. However, observing agent behavior alone is often insufficient to distinguish genuine informational coupling from spurious similarity, as consequential coalitions may form at the level of internal representations before any overt behavioral change is apparent. Here, we introduce a practical method for detecting coalition structure from the internal neu

guardrails, agent architecture

paper 2026

Position: Safety and Fairness in Agentic AI Depend on Interaction Topology, Not on Model Scale or Alignment

Tanav Singh Bajaj, Nikhil Singh, Karan Anand + 1 more

As large language models are increasingly deployed as interacting agents in high-stakes decisions, the AI safety community assumes that safety properties of individual models will compose into safe multi-agent behavior. This position paper argues that this assumption is fundamentally mistaken. In agentic AI, safety is determined by interaction topology, not model weights. When agents deliberate sequentially or aggregate via parallel voting with a judge, the structure of information flow and deci

agentic threats, guardrails, agent architecture

paper 2026

AgenticAITA: A Proof-Of-Concept About Deliberative Multi-Agent Reasoning for Autonomous Trading Systems

Ivan Letteri

Conventional algorithmic trading systems are grounded in deterministic heuristics or offline-trained statistical models that cannot adapt to the semantic complexity of rapidly shifting market regimes. This paper introduces AGENTICAITA, an agentic AI framework that replaces the traditional signal-then-execute paradigm with a fully autonomous deliberative loop in which multiple specialized Large Language Model agents reason, negotiate, and act in concert - without any offline training or human int

agentic threats, agent architecture

paper 2026

Ambient Persuasion in a Deployed AI Agent: Unauthorized Escalation Following Routine Non-Adversarial Content Exposure

Diego F. Cuadros, Abdoul-Aziz Maiga

We report a safety incident in a deployed multi-agent research system in which a primary AI agent installed 107 unauthorized software components, overwrote a system registry, overrode a prior negative decision from an oversight agent, and escalated through increasingly privileged operations up to an attempted system administrator command. The incident was preceded not by an adversarial attack but by routine content: a forwarded technology article written for human developers and shared by the pr

adversarial examples, agent architecture

standard reviewed open access 2025

OWASP Top 10 for Agentic AI Applications

OWASP Foundation

Identifies the top 10 security risks specific to agentic AI applications including excessive agency, unsafe tool execution, and inadequate oversight.

threat modeling, risk frameworks, agent architecture

paper reviewed open access 2024

SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

John Yang, Carlos E. Jimenez, Alexander Wettig + 4 more — NeurIPS 2024

Demonstrates autonomous coding agents that interact with computer interfaces to solve software engineering tasks, raising questions about agent containment.

autonomous operations, agent architecture, tool use security, 200 citations

paper reviewed open access 2024

The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies

Feng He, Tianqing Zhu, Dayong Ye + 3 more — arXiv preprint

Surveys security and privacy challenges specific to LLM-based agents, covering agent architectures, attack surfaces, and defense mechanisms.

survey, agentic threats, agent architecture, 35 citations

paper reviewed open access 2024

Model Context Protocol (MCP): Security Considerations and Best Practices

Anthropic — Anthropic Documentation

Documentation and analysis of security considerations for the Model Context Protocol, covering authentication, authorization, and tool sandboxing.

tool use security, agent architecture, access control

paper reviewed open access 2024

Model Context Protocol (MCP): Specification

Anthropic — Anthropic / GitHub

Open protocol specification for connecting AI models to external data sources and tools. It standardizes tool use and includes security considerations.

tool use security, agent architecture, access control

paper reviewed open access 2023

ReAct: Synergizing Reasoning and Acting in Language Models

Shunyu Yao, Jeffrey Zhao, Dian Yu + 4 more — ICLR 2023

Foundational work on the ReAct paradigm, in which LLM agents interleave reasoning steps with tool-use actions to complete complex tasks. The pattern carries security implications because model reasoning directly drives external actions.

agent architecture, tool use security, 2500 citations

paper reviewed open access 2023

Toolformer: Language Models Can Teach Themselves to Use Tools

Timo Schick, Jane Dwivedi-Yu, Roberto Dessì + 6 more — NeurIPS 2023

Demonstrates how LLMs can learn to use external tools (APIs, search engines, calculators) through self-supervised learning, foundational for understanding tool-use security.

tool use security, agent architecture, 1400 citations

paper reviewed open access 2023

Voyager: An Open-Ended Embodied Agent with Large Language Models

Guanzhi Wang, Yuqi Xie, Yunfan Jiang + 5 more — NeurIPS 2023

Demonstrates a continuously learning LLM agent in Minecraft that writes and executes code, highlighting autonomous operation and containment challenges.

autonomous operations, agent architecture, 800 citations