Can This Data Poisoning Tool Help Artists Protect Their Work from AI Scraping?

November 21, 2023

By Patrick K. Lin

Generative AI tools like DALL·E, Midjourney, and Stable Diffusion are dominating the cultural zeitgeist but have not received a warm reception from artists. AI companies extract billions of images from the web, relying on artists’ pre-existing works to feed their models.^{^[1]}

The result is AI image generators producing images that contain artists’ visual artifacts and even artists’ signatures. For many artists, the speed and scalability of generative AI threatens to devalue the labor of creative expression or outright eliminate it. For instance, Marvel’s recent Disney+ show “Secret Invasion” featured an AI-generated opening credits sequence, sparking fears of replacing artists with AI tools. Similarly, during the writers’ strike, tech companies showed off a fake, AI-generated TV episode.

AI companies are facing a wave of lawsuits alleging copyright infringement, such as the class action against Stability AI, Midjourney, and DeviantArt, as well as the Getty Images lawsuit against Stability AI, the creator of Stable Diffusion. However, short of taking expensive legal action, artists have very few tools at their disposal to deter companies and web scrapers from feeding their art to AI models without consent or compensation.

Enter Nightshade, a data poisoning tool developed by a team of researchers from the University of Chicago.^{^[2]} Nightshade changes an image’s pixels in a way that is imperceptible to the human eye. Machine learning models, however, do pick up on these subtle changes, which are carefully designed to hinder the models’ ability to label the images correctly. If an AI model is trained on these “poisoned samples,” the invisible features of these images gradually corrupt the model.^{^[3]}
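To make the idea of an imperceptible pixel change concrete, here is a minimal sketch in Python. It is not Nightshade's algorithm: Nightshade computes its changes through a targeted optimization, whereas this toy simply adds random noise. The function name `perturb`, the `epsilon` budget, and the sample pixel values are all hypothetical, chosen only to show what it means for every pixel to move by no more than a small, fixed amount.

```python
import random

def perturb(pixels, epsilon, seed=0):
    """Return a copy of `pixels` (ints in 0..255) with each value shifted
    by at most `epsilon`, then clamped back into the valid 0..255 range.

    Assumption: random noise stands in for the carefully optimized
    changes a real poisoning tool would compute.
    """
    rng = random.Random(seed)
    out = []
    for p in pixels:
        delta = rng.randint(-epsilon, epsilon)   # small signed shift
        out.append(min(255, max(0, p + delta)))  # keep a valid pixel value
    return out

# Hypothetical grayscale pixel values for illustration only.
original = [12, 200, 255, 0, 97, 143]
poisoned = perturb(original, epsilon=3)

# The largest change to any single pixel never exceeds the budget.
max_change = max(abs(a - b) for a, b in zip(original, poisoned))
print(poisoned, max_change)
```

A change of a few units out of 255 per pixel is the kind of difference a viewer would not notice, which is why a budget like `epsilon` is a common way to describe "invisible" edits.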

This process of data poisoning involves contributing inaccurate or meaningless data, thus encouraging the underlying AI model to perform poorly.^{^[4]} Data poisoning attacks manipulate training data to introduce unexpected behaviors into machine learning models at the training stage.^{^[5]} AdNauseam, for instance, is a browser extension that clicks on every single ad sent your way, which confuses Google’s ad-targeting algorithms. In the art and generative AI context, poisoned data samples can manipulate models into learning to label a hat as cake or cartoon art as impressionism.
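The hat-and-cake example can be sketched with a deliberately simplified toy in Python. Everything here is an assumption made for illustration: the "model" is just the average of the feature vectors it has seen for each label, the two-number feature vectors `HAT` and `CAKE` are invented, and the sample counts are arbitrary and do not reproduce the research team's results. The point is only to show how samples that carry cake-like features under the label "hat" drag the model's idea of a hat toward cake.

```python
def train(samples):
    """Toy 'model': the mean feature vector seen for each label."""
    sums, counts = {}, {}
    for label, (x, y) in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (sx / counts[label], sy / counts[label])
        for label, (sx, sy) in sums.items()
    }

def distance(a, b):
    """Euclidean distance between two 2-D feature vectors."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

# Hypothetical feature vectors standing in for "looks like a hat/cake".
HAT, CAKE = (0.0, 0.0), (10.0, 10.0)
clean = [("hat", HAT)] * 100 + [("cake", CAKE)] * 100

def hat_output(num_poison):
    """What the toy model associates with 'hat' after seeing
    `num_poison` cake-like samples that were labeled 'hat'."""
    poison = [("hat", CAKE)] * num_poison
    return train(clean + poison)["hat"]

for n in (0, 50, 150, 300):
    out = hat_output(n)
    closer = "cake" if distance(out, CAKE) < distance(out, HAT) else "hat"
    print(f"{n} poisoned samples: 'hat' now sits closer to {closer}")
```

In this toy, a small number of poisoned samples only nudges the "hat" concept, while a larger number pushes it past the midpoint so that asking for a hat yields something closer to cake.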

A chart from the Nightshade research team’s paper, displaying examples of images generated by the Nightshade-poisoned model compared to the original clean model.

Because AI companies train their models on vast datasets, poisoned data is very difficult to remove. Purging poisoned images requires AI companies to painstakingly find and remove each corrupted sample. If the training set is large enough, removing all copyrighted or sensitive information from an AI model can require effectively retraining the AI from scratch, which can cost tens of millions of dollars.^{^[6]} Ironically, this is often the very same reason companies give for why biased or nonconsensual data cannot be removed.

When the Nightshade team fed 50 poisoned images, which labeled pictures of cars as cows, into Stable Diffusion, the model started generating distorted images of cars.^{^[7]} After 100 samples, the model began producing images that had more cow-like features than car-like ones. At 300 images, virtually no car-like features remained.

The research team that created Nightshade also developed Glaze, a tool that allows artists to “mask” their personal style to prevent it from being scraped by AI companies. The advent of text-to-image generative models has resulted in companies and grifters taking artists’ work to train models to recreate their style. Glaze works in a similar way to Nightshade, changing the pixels of images in subtle ways that are invisible to the human eye but convince machine learning models to interpret the image as something else. Glaze received a “Special Mention” award in TIME’s Best Inventions of 2023.

A chart from the Nightshade research team’s paper, comparing clean and poisoned images and demonstrating how related prompts are also corrupted by the poisoning via a bleed-through effect.

If Nightshade can effectively break text-to-image models, then the AI companies that develop these models may finally have to respect artists’ rights. For instance, the deterrent effect of data poisoning may motivate AI companies to seek permission from artists and compensate them for continued use of their work.

Some developers of text-to-image generative models, like Stability AI and OpenAI, have offered to let artists opt out of having their images used to train future versions of the models. However, these opt-out policies place the onus on artists to reclaim their work rather than on the AI companies systematically scraping images online.

Tools like Nightshade should give AI companies pause. With any luck, the risk of destroying their entire model should force companies to think twice before taking artists’ work without their consent.

About the Author

Patrick K. Lin is the Center for Art Law’s 2023-2024 Judith Bresler Fellow and author of Machine See, Machine Do, a book about how public institutions use technology to surveil, police, and make decisions about the public, as well as the historical biases that impact that technology. Patrick is interested in legal issues that exist at the intersection of art and technology, particularly involving artificial intelligence, data privacy, and copyright law.