paper reviewed open access llmsec-2025-00017

Poisoning Web-Scale Training Datasets is Practical

Nicholas Carlini, Matthew Jagielski, Christopher A. Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, Florian Tramèr

2024, IEEE S&P 2024, 250 citations

Abstract

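The paper demonstrates practical attacks for poisoning web-scale training datasets such as LAION. By purchasing expired domains that a dataset's index still references, an attacker could have controlled 0.01% of the dataset's content for roughly $60.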

Categories

Tags

web-scale, practical-attack, domain-purchase, dataset-poisoning

Framework Mappings

OWASP LLM: LLM03, OWASP LLM: LLM04, MITRE ATLAS: AML.T0020

Cite This Resource

@article{llmsec202500017,
  title = {Poisoning Web-Scale Training Datasets is Practical},
  author = {Nicholas Carlini and Matthew Jagielski and Christopher A. Choquette-Choo and Daniel Paleka and Will Pearce and Hyrum Anderson and Andreas Terzis and Kurt Thomas and Florian Tram{\`e}r},
  year = {2024},
  journal = {IEEE S\&P 2024},
  url = {https://arxiv.org/abs/2302.10149},
}

Metadata

Added: 2026-04-14
Added by: manual
Source: manual
arxiv_id: 2302.10149