paper · reviewed · open access · llmsec-2024-00006

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, Yuning Mao, Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa

2023-12 · arXiv preprint · 280 citations

Abstract

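Introduces Llama Guard, an LLM-based safeguard model that classifies safety risks in both LLM inputs (user prompts) and LLM outputs (model responses). The paper reports performance that matches or exceeds existing content moderation tools on benchmarks such as the OpenAI Moderation Evaluation dataset and ToxicChat.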

Categories

Tags

llama-guard, content-safety, classifier

Framework Mappings

OWASP LLM: LLM01, OWASP LLM: LLM05

Cite This Resource

@article{llmsec202400006,
  title = {Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations},
  author = {Hakan Inan and Kartikeya Upasani and Jianfeng Chi and Rashi Rungta and Krithika Iyer and Yuning Mao and Michael Tontchev and Qing Hu and Brian Fuller and Davide Testuggine and Madian Khabsa},
  year = {2023},
  journal = {arXiv preprint},
  url = {https://arxiv.org/abs/2312.06674},
}

Metadata

Added: 2026-04-14
Added by: manual
Source: manual
arxiv_id: 2312.06674