
Image generators are designed to mimic their training data, so where does their apparent creativity come from? A recent study suggests that it’s an inevitable by-product of their architecture.
We were once promised self-driving cars and robot maids. Instead, we’ve seen the rise of artificial intelligence systems that can beat us in chess, analyze huge reams of text and compose sonnets. This has been one of the great surprises of the modern era: physical tasks that are easy for humans turn out to be very difficult for robots, while algorithms are increasingly able to mimic our intellect.
Another surprise that has long perplexed researchers is those algorithms’ knack for their own, strange kind of creativity.
Diffusion models, the backbone of image-generating tools such as DALL·E, Imagen and Stable Diffusion, are designed to generate carbon copies of the images on which they’ve been trained. In practice, however, they seem to improvise, blending elements within images to create something new: not just nonsensical blobs of color, but coherent images with semantic meaning. This is the “paradox” behind diffusion models, said Giulio Biroli, an AI researcher and physicist at the École Normale Supérieure in Paris: “If they worked perfectly, they should just memorize,” he said. “But they don’t; they’re actually able to produce new samples.”
So LLMs are murmurations?
42 sats \ 1 reply \ @k00b 8 Jul
I just read this. The conclusion is that creativity, in image diffusion models at least, stems from locality (generating then accepting a small part of the image) and translational equivariance (generating the next small local part and making it coherent with the last, over and over). The researchers and the author are a bit coy about it, but imo this is how human creativity works too: progressive coherence. You assume some constraints, make something0 constrained by them, then move beyond something0 and make something1 coherent with something0, over and over until the aggregate is something worthwhile and new.
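For anyone who wants the two terms pinned down, here is a toy sketch of my own, not the study's model: a small sliding-window filter over a 1D "image". The names `local_filter` and `shift` are made up for illustration, and the wrap-around edges are an assumption that keeps the math exact. It only shows what "local" and "translation equivariant" mean in the usual convolution sense. It does not generate anything.

```python
def local_filter(signal, kernel):
    """Apply a small kernel at every position, wrapping around at the edges."""
    n, k = len(signal), len(kernel)
    half = k // 2
    return [
        sum(kernel[j] * signal[(i + j - half) % n] for j in range(k))
        for i in range(n)
    ]


def shift(signal, steps):
    """Circularly shift a signal to the right by `steps` positions."""
    n = len(signal)
    return [signal[(i - steps) % n] for i in range(n)]


signal = [0, 1, 4, 2, 0, 0, 3, 1]
kernel = [1, 2, 1]  # each output looks only at a 3-wide neighbourhood

# Translation equivariance: shifting the input and then filtering gives
# the same result as filtering and then shifting the output.
shifted_then_filtered = local_filter(shift(signal, 3), kernel)
filtered_then_shifted = shift(local_filter(signal, kernel), 3)
equivariant = shifted_then_filtered == filtered_then_shifted
print(equivariant)  # True

# Locality: changing one input value only changes the outputs whose
# 3-wide window covers that position.
perturbed = list(signal)
perturbed[4] += 10
before = local_filter(signal, kernel)
after = local_filter(perturbed, kernel)
changed = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
print(changed)  # [3, 4, 5]
```

The point of the toy: each output only ever sees its own small patch, and the same rule is applied at every position, so no single step has access to the whole picture.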
As far as I can tell, you nailed it.