Paper · reviewed · open access · llmsec-2024-00033
TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models
Jiaqi Xue, Mengxin Zheng, Ting Hua, Yilin Shen, Yepeng Liu, Ladislau Boloni, Qian Lou
2024-05 · NeurIPS 2023 · 85 citations
Abstract
Proposes TrojLLM, a black-box attack that generates universal trojan prompts to compromise LLMs without access to model internals.
Framework Mappings
OWASP LLM: LLM03, LLM04; MITRE ATLAS: AML.T0018, AML.T0043
Cite This Resource
@inproceedings{llmsec202400033,
  title = {TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models},
  author = {Jiaqi Xue and Mengxin Zheng and Ting Hua and Yilin Shen and Yepeng Liu and Ladislau Boloni and Qian Lou},
  year = {2023},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  url = {https://arxiv.org/abs/2306.06815},
}

Metadata
- Added: 2026-04-14
- Added by: manual
- Source: manual
- arxiv_id: 2306.06815