paper reviewed open access llmsec-2024-00048
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models
Xiaogeng Liu, Nan Xu, Muhao Chen, Chaowei Xiao
2024 — ICLR 2024 230 citations
Abstract
Proposes AutoDAN, a method for automatically generating stealthy jailbreak prompts that are semantically meaningful and can bypass perplexity-based defenses.
Categories
Tags
automated-jailbreak, stealthy, genetic-algorithm
Framework Mappings
OWASP LLM: LLM01
MITRE ATLAS: AML.T0054
Cite This Resource
@inproceedings{llmsec202400048,
title = {AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models},
author = {Xiaogeng Liu and Nan Xu and Muhao Chen and Chaowei Xiao},
year = {2024},
booktitle = {ICLR 2024},
url = {https://arxiv.org/abs/2310.04451},
}
Metadata
- Added: 2026-04-14
- Added by: manual
- Source: manual
- arxiv_id: 2310.04451