Study ties diffusion models’ “creativity” to denoising quirks

Close-up of pixel patches on a layered grid coalescing from noise into an organized pattern, no people

Why do image generators produce novel pictures when they are trained to mimic their data? A new study argues that what looks like creativity emerges inevitably from how diffusion models denoise images. According to Wired, which summarized work originally published by Quanta Magazine, the technical imperfections of the denoising process drive these models to synthesize new, coherent content.

The denoising “paradox” and its proposed cause

Diffusion models underpin tools such as DALL·E, Imagen, and Stable Diffusion. Although they are trained to reverse a noising process applied to training images, they routinely blend elements into fresh combinations. Giulio Biroli of the École Normale Supérieure called this a paradox: if the systems memorized their training data perfectly, they would only reproduce it, never generating new samples. The study, presented at ICML 2025, claims the effect is deterministic, rooted in two traits of trained diffusion models: locality (each output pixel depends only on a small patch of the input) and translational equivariance (shifting the input produces a correspondingly shifted output). These constraints keep local structure consistent but limit global awareness, setting the stage for emergent novelty.
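The two traits can be made concrete in a few lines of NumPy. The toy denoiser below is a sketch of my own, not the study's model: it averages each pixel with a small neighborhood, so every output pixel depends only on a local patch (locality), and with wrap-around boundaries the operation commutes exactly with shifts (translational equivariance).

```python
import numpy as np

def local_denoise(img, k=1):
    """Toy local denoiser: average each pixel with its (2k+1)x(2k+1)
    neighborhood, using periodic (wrap-around) boundaries.
    Each output pixel depends only on a small patch -> locality."""
    out = np.zeros_like(img, dtype=float)
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            out += np.roll(img, shift=(dy, dx), axis=(0, 1))
    return out / (2 * k + 1) ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 16))  # a stand-in "noisy image"
shift = (3, 5)

# Translational equivariance: denoise(shift(x)) == shift(denoise(x))
lhs = local_denoise(np.roll(x, shift, axis=(0, 1)))
rhs = np.roll(local_denoise(x), shift, axis=(0, 1))
print(np.allclose(lhs, rhs))  # True
```

A real diffusion model's denoiser is a deep network, but convolutional architectures bake in exactly these two properties, which is what makes the study's constraints realistic.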

Bottom-up mechanics and a morphogenesis analogy

Lead author Mason Kamb, an applied-physics graduate student at Stanford, drew inspiration from morphogenesis and Turing patterns, in which local rules give rise to larger forms. He and coauthor Surya Ganguli built an "equivariant local score" (ELS) machine: a set of equations that analytically predicts denoised compositions using only locality and equivariance. When they ran identical noisy inputs through the ELS machine and trained diffusion models (including ResNet- and UNet-based architectures), the ELS outputs matched with an average accuracy near 90 percent, a result Ganguli described as "shocking." The extra-fingers artifacts seen in early AI images fit the same picture: a model that attends narrowly to patches, without broader context, produces locally plausible but globally inconsistent output.
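The ELS equations themselves are not reproduced in the article, but the underlying intuition, that a purely local rule stitches training content into a globally novel whole, can be caricatured in code. In this hypothetical sketch (my illustration, not the authors' method), a "denoiser" replaces each patch's center pixel with the center of the nearest patch drawn from two toy training images; the result is a deterministic patchwork that belongs to neither image.

```python
import numpy as np

# Two toy "training images": a dark one and a bright one.
train = [np.zeros((8, 8)), np.ones((8, 8))]

def extract_patches(img, k=3):
    """All k x k patches of img with their top-left positions."""
    h, w = img.shape
    return [(img[i:i + k, j:j + k], (i, j))
            for i in range(h - k + 1) for j in range(w - k + 1)]

# Patch bank: (patch, index of the training image it came from)
bank = [(p, t) for t, img in enumerate(train) for p, _ in extract_patches(img)]

def patchwork_denoise(noisy, k=3):
    """Caricature of a purely local rule: set each patch's center pixel
    to the center of the nearest training patch. Deterministic given
    the input, yet the global result matches no single training image."""
    out = np.zeros_like(noisy)
    sources = np.zeros_like(noisy, dtype=int)
    c = k // 2
    for patch, (i, j) in extract_patches(noisy, k):
        dists = [np.sum((patch - p) ** 2) for p, _ in bank]
        best_patch, best_src = bank[int(np.argmin(dists))]
        out[i + c, j + c] = best_patch[c, c]
        sources[i + c, j + c] = best_src
    return out, sources

rng = np.random.default_rng(0)
noisy = rng.uniform(size=(8, 8))
out, sources = patchwork_denoise(noisy)

# Interior pixels (fully covered by patches) draw from both training images.
interior = sources[1:-1, 1:-1]
print(sorted(set(interior.ravel())))
```

This is only a cartoon of the mechanism: the real ELS machine predicts the scores of trained networks analytically, but the spirit is the same, with local decisions assembling a composition no single training example contains.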

Implications and open questions

The findings suggest that once locality is imposed, creative recombination follows automatically from the models’ dynamics, rather than from higher-order reasoning. Luca Ambrogioni of Radboud University praised the work’s predictive precision while noting it illuminates only part of the broader story. Wired reports that researchers emphasized limits: other systems, such as large language models, also appear creative without relying on the same locality and equivariance. Nonetheless, formalizing diffusion creativity as a by-product of denoising may guide future AI research—and, as quoted in Quanta, offers a lens for comparing human and machine creativity as assembly from experienced building blocks.

This article draws on reporting by Wired, which reprinted the original story from Quanta Magazine.

