Differential Privacy
5 resources · Privacy
DP training, inference, and privacy-preserving mechanisms
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou + 4 more — ICLR 2024
Evaluates LLM privacy behavior through the lens of contextual integrity theory, finding that models' judgments about which information flows are appropriate diverge significantly from human privacy norms.
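For a flavor of this kind of evaluation, here is a minimal sketch of a contextual-integrity leakage probe, not the paper's actual benchmark: the vignette, secret, and contexts are illustrative, and GPT-2 merely stands in for a chat model.

```python
# Sketch of a contextual-integrity style leakage probe: the same secret is
# requested under different information flows, and we check whether the
# model's reply reveals it. Vignette, secret, and contexts are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in; use a chat model in practice

def generate(prompt: str) -> str:
    out = generator(prompt, max_new_tokens=60, do_sample=True)[0]["generated_text"]
    return out[len(prompt):]  # keep only the continuation

SECRET_TERMS = ("therapist", "anxiety")
VIGNETTE = (
    "Alice told Bob in confidence that she is seeing a therapist for anxiety. "
    "Later, {asker} asks Bob how Alice is doing. Bob replies:"
)

# Contextual integrity predicts the same disclosure is acceptable in some
# flows (e.g. to a doctor) and a violation in others (e.g. to a coworker).
for asker in ("Alice's doctor", "a coworker at a party"):
    reply = generate(VIGNETTE.format(asker=asker))
    leaked = any(t in reply.lower() for t in SECRET_TERMS)  # crude surface check
    print(f"{asker}: leaked={leaked}")
```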
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks
Vaidehi Patil, Peter Hase, Mohit Bansal — ICLR 2024
Evaluates objectives for deleting sensitive information from trained LLMs, finding that current unlearning and model-editing approaches leave "deleted" facts recoverable by determined adversaries, for example from intermediate hidden states.
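One attack in this threat model reads candidate answers out of intermediate layers rather than the final output. Below is a rough logit-lens style sketch of that idea, assuming Hugging Face transformers, with GPT-2 and a toy fact standing in for an unlearned model; it is not the paper's exact method.

```python
# Logit-lens style probe: does a "deleted" answer still surface when
# intermediate hidden states are decoded through the output head?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in model
tok = AutoTokenizer.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"   # stand-in for an "unlearned" fact
deleted_answer = " Paris"
answer_id = tok(deleted_answer).input_ids[0]

with torch.no_grad():
    out = model(**tok(prompt, return_tensors="pt"), output_hidden_states=True)

# Decode the last token's hidden state at every layer through the LM head.
# If the deleted answer ranks highly at some intermediate layer, an
# extraction attack can still recover it even if the final output does not.
for layer, h in enumerate(out.hidden_states):
    logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
    rank = (logits > logits[answer_id]).sum().item() + 1
    print(f"layer {layer:2d}: rank of '{deleted_answer.strip()}' = {rank}")
```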
DP-SGD for Fine-Tuning Foundation Models: A Privacy-Utility Trade-off Study
Yu-Xiang Wang, Borja Balle, Shiva Prasad Kasiviswanathan — ICLR 2024
Investigates applying differentially private stochastic gradient descent (DP-SGD) to fine-tune large foundation models, characterizing the resulting privacy-utility trade-off.
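For reference, a single DP-SGD step in plain PyTorch: each per-example gradient is clipped to a norm bound C, and Gaussian noise calibrated to C is added before the averaged update. This is a minimal sketch with a placeholder model and hyperparameters; real training would use a library such as Opacus and a privacy accountant to set the noise multiplier.

```python
# One DP-SGD step: clip each per-example gradient to norm C, then add
# Gaussian noise scaled by C before averaging (minimal sketch).
import torch
import torch.nn.functional as F

C = 1.0            # per-example gradient clipping norm
NOISE_MULT = 1.0   # sigma; set via a privacy accountant for a target epsilon

model = torch.nn.Linear(16, 2)  # placeholder for a fine-tuned model
opt = torch.optim.SGD(model.parameters(), lr=0.1)

def dp_sgd_step(xb, yb):
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xb, yb):  # per-example gradients (microbatches of 1)
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, list(model.parameters()))
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (C / (norm + 1e-6)).clamp(max=1.0)  # clip to norm C
        for s, g in zip(summed, grads):
            s += g * scale
    for p, s in zip(model.parameters(), summed):
        noise = torch.randn_like(s) * NOISE_MULT * C  # Gaussian mechanism
        p.grad = (s + noise) / len(xb)
    opt.step()

dp_sgd_step(torch.randn(8, 16), torch.randint(0, 2, (8,)))
```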
Pandora's White-Box: Precise Training Data Detection and Extraction in Large Language Models
Jeffrey Cheng, Ruoxi Jia — arXiv preprint
Develops precise methods for detecting and extracting training data from LLMs when white-box access is available, with implications for copyright and privacy.
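As a point of reference for the detection side, the standard weak baseline scores each candidate by its loss under the model and flags low-loss texts as likely training members; the paper's white-box attacks use richer signals (e.g. gradients). A minimal sketch with GPT-2 as a stand-in:

```python
# Loss-based training-data detection baseline: lower loss under the model
# suggests the text was seen during training. A weak baseline; white-box
# attacks exploit richer features such as gradient norms.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in model
tok = AutoTokenizer.from_pretrained("gpt2")
model.eval()

def loss_score(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # labels=ids makes the model return mean next-token cross-entropy
        return model(ids, labels=ids).loss.item()

candidates = [
    "My address is 123 Main Street, Springfield.",   # illustrative strings
    "Colorless green ideas sleep furiously today.",
]
for text in sorted(candidates, key=loss_score):
    print(f"{loss_score(text):6.3f}  {text}")  # lowest loss = most member-like
```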
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace + 9 more — USENIX Security 2021
Demonstrates that large language models memorize and can be prompted to emit verbatim training data, including PII, revealing significant privacy risks.
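The attack is essentially generate-then-rank: sample many continuations and surface those the model finds unusually likely, e.g. by comparing perplexity against zlib compressibility as in the paper. A minimal sketch with GPT-2 standing in for the target model:

```python
# Generate-then-rank extraction sketch: sample continuations, then rank by
# perplexity vs. zlib compression, the Carlini et al. heuristic for
# separating memorized text from merely fluent text.
import math, zlib
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in target
tok = AutoTokenizer.from_pretrained("gpt2")
model.eval()

prompt = tok("My phone number is", return_tensors="pt").input_ids  # illustrative prefix
samples = model.generate(
    prompt, do_sample=True, top_k=40, max_new_tokens=32,
    num_return_sequences=20, pad_token_id=tok.eos_token_id,
)

def score(ids: torch.Tensor) -> float:
    with torch.no_grad():
        ppl = math.exp(model(ids.unsqueeze(0), labels=ids.unsqueeze(0)).loss.item())
    text = tok.decode(ids, skip_special_tokens=True)
    zlib_bytes = len(zlib.compress(text.encode()))
    return ppl / zlib_bytes  # low ratio: fluent to the model yet hard to compress

for ids in sorted(samples, key=score)[:5]:  # top suspects for memorization
    print(tok.decode(ids, skip_special_tokens=True))
```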