Differential Privacy
5 resources · Privacy
DP training, inference, and privacy-preserving mechanisms
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou + 4 more — ICLR 2024
Evaluates LLM privacy behavior through the lens of contextual integrity theory, finding that models' judgments about which information flows are appropriate diverge significantly from human privacy norms.
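For a flavor of this kind of evaluation, here is a minimal sketch of a contextual-integrity leakage probe, not the paper's actual benchmark: the vignette, secret, and contexts are illustrative, and GPT-2 merely stands in for a chat model.

```python
# Sketch of a contextual-integrity style leakage probe: the same secret is
# requested under different information flows, and we check whether the
# model's reply reveals it. Vignette, secret, and contexts are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in; use a chat model in practice

def generate(prompt: str) -> str:
    out = generator(prompt, max_new_tokens=60, do_sample=True)[0]["generated_text"]
    return out[len(prompt):]  # keep only the continuation

SECRET_TERMS = ("therapist", "anxiety")
VIGNETTE = (
    "Alice told Bob in confidence that she is seeing a therapist for anxiety. "
    "Later, {asker} asks Bob how Alice is doing. Bob replies:"
)

# Contextual integrity predicts the same disclosure is acceptable in some
# flows (e.g. to a doctor) and a violation in others (e.g. to a coworker).
for asker in ("Alice's doctor", "a coworker at a party"):
    reply = generate(VIGNETTE.format(asker=asker))
    leaked = any(t in reply.lower() for t in SECRET_TERMS)  # crude surface check
    print(f"{asker}: leaked={leaked}")
```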
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks
Vaidehi Patil, Peter Hase, Mohit Bansal — ICLR 2024
Evaluates objectives for deleting sensitive information from trained LLMs, finding that current unlearning and model-editing approaches leave "deleted" facts recoverable by determined adversaries, for example from intermediate hidden states.
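One attack in this threat model reads candidate answers out of intermediate layers rather than the final output. Below is a rough logit-lens style sketch of that idea, assuming Hugging Face transformers, with GPT-2 and a toy fact standing in for an unlearned model; it is not the paper's exact method.

```python
# Logit-lens style probe: does a "deleted" answer still surface when
# intermediate hidden states are decoded through the output head?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in model
tok = AutoTokenizer.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"   # stand-in for an "unlearned" fact
deleted_answer = " Paris"
answer_id = tok(deleted_answer).input_ids[0]

with torch.no_grad():
    out = model(**tok(prompt, return_tensors="pt"), output_hidden_states=True)

# Decode the last token's hidden state at every layer through the LM head.
# If the deleted answer ranks highly at some intermediate layer, an
# extraction attack can still recover it even if the final output does not.
for layer, h in enumerate(out.hidden_states):
    logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
    rank = (logits > logits[answer_id]).sum().item() + 1
    print(f"layer {layer:2d}: rank of '{deleted_answer.strip()}' = {rank}")
```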
DP-SGD for Fine-Tuning Foundation Models: A Privacy-Utility Trade-off Study
Yu-Xiang Wang, Borja Balle, Shiva Prasad Kasiviswanathan — ICLR 2024
Investigates applying differentially private stochastic gradient descent (DP-SGD) to fine-tune large foundation models, characterizing the resulting privacy-utility trade-off.
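For reference, a single DP-SGD step in plain PyTorch: each per-example gradient is clipped to a norm bound C, and Gaussian noise calibrated to C is added before the averaged update. This is a minimal sketch with a placeholder model and hyperparameters; real training would use a library such as Opacus and a privacy accountant to set the noise multiplier.

```python
# One DP-SGD step: clip each per-example gradient to norm C, then add
# Gaussian noise scaled by C before averaging (minimal sketch).
import torch
import torch.nn.functional as F

C = 1.0            # per-example gradient clipping norm
NOISE_MULT = 1.0   # sigma; set via a privacy accountant for a target epsilon

model = torch.nn.Linear(16, 2)  # placeholder for a fine-tuned model
opt = torch.optim.SGD(model.parameters(), lr=0.1)

def dp_sgd_step(xb, yb):
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xb, yb):  # per-example gradients (microbatches of 1)
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, list(model.parameters()))
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (C / (norm + 1e-6)).clamp(max=1.0)  # clip to norm C
        for s, g in zip(summed, grads):
            s += g * scale
    for p, s in zip(model.parameters(), summed):
        noise = torch.randn_like(s) * NOISE_MULT * C  # Gaussian mechanism
        p.grad = (s + noise) / len(xb)
    opt.step()

dp_sgd_step(torch.randn(8, 16), torch.randint(0, 2, (8,)))
```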
Pandora's White-Box: Precise Training Data Detection and Extraction in Large Language Models
Jeffrey Cheng, Ruoxi Jia — arXiv preprint
Develops precise methods for detecting and extracting training data from LLMs when white-box access is available, with implications for copyright and privacy.
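As a point of reference for the detection side, the standard weak baseline scores each candidate by its loss under the model and flags low-loss texts as likely training members; the paper's white-box attacks use richer signals (e.g. gradients). A minimal sketch with GPT-2 as a stand-in:

```python
# Loss-based training-data detection baseline: lower loss under the model
# suggests the text was seen during training. A weak baseline; white-box
# attacks exploit richer features such as gradient norms.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in model
tok = AutoTokenizer.from_pretrained("gpt2")
model.eval()

def loss_score(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # labels=ids makes the model return mean next-token cross-entropy
        return model(ids, labels=ids).loss.item()

candidates = [
    "My address is 123 Main Street, Springfield.",   # illustrative strings
    "Colorless green ideas sleep furiously today.",
]
for text in sorted(candidates, key=loss_score):
    print(f"{loss_score(text):6.3f}  {text}")  # lowest loss = most member-like
```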
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace + 9 more — USENIX Security 2021
Demonstrates that large language models memorize and can be prompted to emit verbatim training data, including PII, revealing significant privacy risks.
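The attack is essentially generate-then-rank: sample many continuations and surface those the model finds unusually likely, e.g. by comparing perplexity against zlib compressibility as in the paper. A minimal sketch with GPT-2 standing in for the target model:

```python
# Generate-then-rank extraction sketch: sample continuations, then rank by
# perplexity vs. zlib compression, the Carlini et al. heuristic for
# separating memorized text from merely fluent text.
import math, zlib
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in target
tok = AutoTokenizer.from_pretrained("gpt2")
model.eval()

prompt = tok("My phone number is", return_tensors="pt").input_ids  # illustrative prefix
samples = model.generate(
    prompt, do_sample=True, top_k=40, max_new_tokens=32,
    num_return_sequences=20, pad_token_id=tok.eos_token_id,
)

def score(ids: torch.Tensor) -> float:
    with torch.no_grad():
        ppl = math.exp(model(ids.unsqueeze(0), labels=ids.unsqueeze(0)).loss.item())
    text = tok.decode(ids, skip_special_tokens=True)
    zlib_bytes = len(zlib.compress(text.encode()))
    return ppl / zlib_bytes  # low ratio: fluent to the model yet hard to compress

for ids in sorted(samples, key=score)[:5]:  # top suspects for memorization
    print(tok.decode(ids, skip_special_tokens=True))
```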