Model Extraction
3 resources · Attacks & Threats
Model stealing, distillation attacks, and weight extraction
paper · reviewed · open access · 2024
Prompt Stealing Attacks Against Text-to-Image Generation Models
Xinyue Shen, Yiting Qu, Michael Backes + 1 more — USENIX Security 2024
Demonstrates attacks that steal the prompts used to generate images from text-to-image models, raising IP and privacy concerns.
paper · reviewed · open access · 2024
Stealing Part of a Production Language Model
Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham + 10 more — ICML 2024
Demonstrates that the embedding projection layer of production LLMs, including OpenAI's models, can be stolen through their public APIs, confirming model extraction risks.
model extraction · 95 citations
paper · reviewed · open access · 2023
Scalable Extraction of Training Data from (Production) Language Models
Milad Nasr, Nicholas Carlini, Jonathan Hayase + 7 more — arXiv preprint
Develops a scalable attack that extracts over a gigabyte of training data from open, semi-open, and closed models, including ChatGPT, at a cost of roughly $200.