
Surveys & Meta

Literature surveys, systematizations of knowledge, and meta-analyses

12 resources

paper reviewed open access 2023

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Boxin Wang, Weixin Chen, Hengzhi Pei + 7 more — NeurIPS 2023

Comprehensive trustworthiness evaluation of GPT models across eight perspectives: toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness to adversarial demonstrations, privacy, machine ethics, and fairness.

paper reviewed open access 2024

A Survey on Large Language Model (LLM) Security and Privacy: The Good, The Bad, and The Ugly

Yifan Yao, Jinhao Duan, Kaidi Xu + 3 more — High-Confidence Computing

Comprehensive survey covering LLM security and privacy from three perspectives: beneficial applications of LLMs for security, attacks against LLMs, and defensive techniques.

paper reviewed open access 2024

TrustLLM: Trustworthiness in Large Language Models

Lichao Sun, Yue Huang, Haoran Wang + 2 more — ICML 2024

Comprehensive study of LLM trustworthiness, proposing benchmarks across six dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics.

paper reviewed open access 2024

Prompt Injection Attack Against LLM-Integrated Applications

Yi Liu, Gelei Deng, Yuekang Li + 6 more — ACM Computing Surveys

Systematic study of prompt injection attacks against LLM-integrated applications, categorizing attack vectors and defenses and introducing a unified black-box attack framework evaluated on real-world deployed applications.

paper reviewed open access 2024

Adversarial Attacks and Defenses in Large Language Models: Old and New Threats

Leo Schwinn, David Dobre, Stephan Günnemann + 1 more — arXiv preprint

Systematizes adversarial attacks and defenses for LLMs, connecting them to the classical adversarial ML literature while identifying LLM-specific threats.

paper reviewed open access 2024

A Comprehensive Survey of Attack Techniques, Implementation, and Mitigation Strategies in Large Language Models

Aysan Esmradi, Daniel Wankit Yip, Chun Fai Chan — arXiv preprint

Surveys attack techniques across the LLM lifecycle including training, fine-tuning, and inference, with comprehensive mitigation strategies.

paper reviewed open access 2024

Machine Unlearning for Large Language Models: A Survey

Zheyuan Liu, Guangyao Dou, Zhaoxuan Tan + 2 more — arXiv preprint

Surveys machine unlearning techniques for LLMs including methods for forgetting specific training data, complying with data deletion requests, and maintaining model utility.

paper reviewed open access 2024

The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies

Feng He, Tianqing Zhu, Dayong Ye + 3 more — arXiv preprint

Surveys security and privacy challenges specific to LLM-based agents, covering agent architectures, attack surfaces, and defense mechanisms.

standard reviewed open access 2024

OWASP AI Security and Privacy Guide

Rob van der Veer, OWASP AI Exchange Team — OWASP Foundation

Comprehensive guide for AI security and privacy including threat analysis, controls, and regulatory mapping for AI systems.

paper reviewed open access 2025

Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations (NIST AI 100-2e2025)

Apostol Vassilev, Alina Oprea, Alie Fordyce + 1 more — NIST

NIST's authoritative taxonomy of adversarial ML attacks and mitigations covering evasion, poisoning, privacy, and abuse attacks against AI systems.

book reviewed 2024

Generative AI Security: Theories and Practices

Ken Huang, Yang Wang, Ben Goertzel + 3 more — Springer

Comprehensive textbook covering generative AI security from foundations to advanced topics including LLM threats, defenses, privacy, and governance.

paper reviewed open access 2023

Identifying and Mitigating the Security Risks of Generative AI

Clark Barrett, Brad Boyd, Elie Bursztein + 20 more — Foundations and Trends in Privacy and Security

Comprehensive treatment of generative AI security risks across the ML lifecycle with a focus on practical mitigations and deployment considerations.