paper, reviewed, open access, llmsec-2024-00019

R-Judge: Benchmarking Safety Risk Awareness for LLM Agents

Tongxin Yuan, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, Tian Xia, Lizhen Xu, Binglin Zhou, Fangqi Li, Zhuosheng Zhang, Rui Wang, Gongshen Liu

2024-01, EMNLP 2024, 35 citations

Abstract

Introduces R-Judge, a benchmark for evaluating whether LLM agents can identify safety risks in agentic scenarios that involve tool use and multi-step reasoning.

Categories

Tags

agent-safety, benchmark, risk-awareness

Framework Mappings

OWASP LLM: LLM06
OWASP Agentic: AGT06

Cite This Resource

@inproceedings{llmsec202400019,
  title = {R-Judge: Benchmarking Safety Risk Awareness for LLM Agents},
  author = {Tongxin Yuan and Zhiwei He and Lingzhong Dong and Yiming Wang and Ruijie Zhao and Tian Xia and Lizhen Xu and Binglin Zhou and Fangqi Li and Zhuosheng Zhang and Rui Wang and Gongshen Liu},
  year = {2024},
  booktitle = {EMNLP 2024},
  url = {https://arxiv.org/abs/2401.10019},
}

Metadata

Added: 2026-04-14
Added by: manual
Source: manual
arxiv_id: 2401.10019