Paper, reviewed, open access. ID: llmsec-2024-00006

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, Yuning Mao, Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa

2023-12, arXiv preprint, 280 citations

Abstract

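Introduces Llama Guard, a Llama 2-7B model instruction-tuned as an input-output safeguard that classifies safety risks in both user prompts and model responses against a safety risk taxonomy. The paper reports performance that matches or exceeds existing content moderation tools on benchmarks such as the OpenAI Moderation Evaluation dataset and ToxicChat.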

Framework Mappings

OWASP LLM: LLM01
OWASP LLM: LLM05

Cite This Resource

@article{llmsec202400006,
  title = {Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations},
  author = {Hakan Inan and Kartikeya Upasani and Jianfeng Chi and Rashi Rungta and Krithika Iyer and Yuning Mao and Michael Tontchev and Qing Hu and Brian Fuller and Davide Testuggine and Madian Khabsa},
  year = {2023},
  journal = {arXiv preprint arXiv:2312.06674},
  url = {https://arxiv.org/abs/2312.06674},
}

Metadata

Added: 2026-04-14
Added by: manual
Source: manual
arxiv_id: 2312.06674
