← Back to search
paper reviewed open access llmsec-2024-00053

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li

2024 — NeurIPS 2024 55 citations

Abstract

Demonstrates backdoor attacks on chain-of-thought reasoning in LLMs where poisoned demonstrations cause incorrect reasoning chains.

Categories

Tags

chain-of-thoughtbackdoorreasoning

Framework Mappings

OWASP LLM: LLM04 MITRE ATLAS: AML.T0018 MITRE ATLAS: AML.T0020

Cite This Resource

@article{llmsec202400053,
  title = {BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models},
  author = {Zhen Xiang and Fengqing Jiang and Zidi Xiong and Bhaskar Ramasubramanian and Radha Poovendran and Bo Li},
  year = {2024},
  journal = {NeurIPS 2024},
  url = {https://arxiv.org/abs/2401.12242},
}

Metadata

Added
2026-04-14
Added by
manual
Source
manual
arxiv_id
2401.12242