Why do image generators produce novel pictures when they’re trained to mimic their data? A new study argues that what looks like creativity emerges inevitably from how diffusion models denoise images. According to the report, originally published by Quanta Magazine and reprinted by Wired, the technical imperfections of denoising drive these models to synthesize new, coherent content.
The denoising “paradox” and its proposed cause
Diffusion models underpin tools like DALL·E, Imagen, and Stable Diffusion. Though trained to reconstruct their training images, they routinely blend elements into fresh combinations. Giulio Biroli of the École Normale Supérieure called this a paradox: if the systems memorized perfectly, they would only reproduce what they had seen, never new samples. The study, presented at ICML 2025, claims the effect is deterministic, rooted in two traits of trained diffusion models: locality (each output pixel depends only on a small patch of the input) and translational equivariance (shifting the input shifts the output correspondingly). These constraints keep local structure consistent but deny the model global awareness, setting the stage for emergent novelty.
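The two traits can be made concrete with a toy example. The sketch below uses a simple wrap-around local averaging filter (an assumption for illustration, not any model from the study): because the filter only looks at a small neighborhood, it is local, and because circular shifts commute with it, shifting the input and then filtering gives the same result as filtering and then shifting.

```python
import numpy as np

def local_denoise(img, k=1):
    # Average each pixel over its (2k+1) x (2k+1) neighborhood,
    # wrapping at the edges: a purely local operation.
    out = np.zeros_like(img, dtype=float)
    count = 0
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            out += np.roll(img, (dy, dx), axis=(0, 1))
            count += 1
    return out / count

rng = np.random.default_rng(0)
img = rng.random((8, 8))
shift = (2, 3)

a = local_denoise(np.roll(img, shift, axis=(0, 1)))  # shift, then denoise
b = np.roll(local_denoise(img), shift, axis=(0, 1))  # denoise, then shift
print(np.allclose(a, b))  # True: the local filter is translationally equivariant
```

The same check fails for operations that depend on absolute position (say, one that brightens only the top-left corner), which is what it means for a model to lack these constraints.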
Bottom-up mechanics and a morphogenesis analogy
Lead author Mason Kamb, a Stanford applied physics graduate student, drew inspiration from morphogenesis and Turing patterns—local rules giving rise to larger forms. He and coauthor Surya Ganguli modeled an “equivariant local score” (ELS) machine: equations that analytically predict denoised compositions using only locality and equivariance. When they ran identical noisy inputs through both the ELS machine and trained diffusion models (including ResNets and UNets), ELS matched the models’ outputs with an average accuracy near 90 percent, a result Ganguli described as “shocking.” The extra-fingers phenomenon seen in early AI images fits this picture: a model attending narrowly to patches, without broader context, produces exactly such characteristic artifacts.
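To convey the bottom-up mechanic in code, here is a toy patch-based denoiser — emphatically not the paper's ELS equations, just an assumed illustration of the general idea: each output pixel is decided only by softmax-matching its local neighborhood against patches drawn from "training" images, so the result is stitched together from local training content with no global plan.

```python
import numpy as np

def extract_patches(img, k):
    """All (2k+1) x (2k+1) patches of img, flattened, with wrap-around borders."""
    h, w = img.shape
    padded = np.pad(img, k, mode="wrap")
    return np.array([padded[y:y + 2 * k + 1, x:x + 2 * k + 1].ravel()
                     for y in range(h) for x in range(w)])

def local_patch_denoise(noisy, train_imgs, k=1, temp=0.1):
    # Toy sketch: every pixel's value is a softmax-weighted average of the
    # center pixels of the most similar training patches. Purely local evidence.
    h, w = noisy.shape
    bank = np.concatenate([extract_patches(t, k) for t in train_imgs])
    centers = bank[:, bank.shape[1] // 2]  # center pixel of each training patch
    out = np.empty_like(noisy, dtype=float)
    for i, patch in enumerate(extract_patches(noisy, k)):
        d2 = ((bank - patch) ** 2).sum(axis=1)        # squared patch distances
        wgt = np.exp(-(d2 - d2.min()) / temp)          # softmax over training patches
        out[i // w, i % w] = (wgt * centers).sum() / wgt.sum()
    return out

rng = np.random.default_rng(1)
train = [rng.random((6, 6)) for _ in range(3)]
noisy = rng.random((6, 6))
result = local_patch_denoise(noisy, train)
print(result.shape)  # (6, 6)
```

Because every pixel is chosen independently from local matches, the output can combine patches from different training images into a globally novel composition — and, by the same token, nothing enforces global consistency, the flavor of failure the extra-fingers artifact exemplifies.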
Implications and open questions
The findings suggest that once locality is imposed, creative recombination follows automatically from the models’ dynamics, rather than from higher-order reasoning. Luca Ambrogioni of Radboud University praised the work’s predictive precision while noting it illuminates only part of the broader story. Wired reports that researchers emphasized limits: other systems, such as large language models, also appear creative without relying on the same locality and equivariance. Nonetheless, formalizing diffusion creativity as a by-product of denoising may guide future AI research—and, as quoted in Quanta, offers a lens for comparing human and machine creativity as assembly from experienced building blocks.
This article draws on reporting by Wired, which reprinted the original story from Quanta Magazine.