Mean Images (newleftreview.org)

🤖 AI Summary
Hito Steyerl's essay proposes a useful lens, “mean images,” for what modern text-to-image models actually produce: statistical renderings rather than indexical photographs. Building on Ted Chiang's “blurry JPEG” metaphor and Francis Galton's composite portraits, the essay argues that models like Stable Diffusion don't reproduce specific scenes but converge on averages and medians derived from massive datasets such as LAION‑5B. The result is a predictable “likeliness” that replaces concrete likeness with probabilistic approximation: images shaped by loss functions, long tails, and latent vector spaces rather than by photons hitting a sensor. A concrete demonstration, asking Stable Diffusion to render a portrait of the artist, produces a flattened, demeaning “mean” that reflects aggregated social signals more than any individual subject.

For the AI/ML community this matters both technically and ethically. Mean images expose how training-distribution imbalances and learned correlations (encoded as coordinates in latent space) produce stereotyped, mediocrity‑biased outputs and surprising artefacts, such as DreamFusion's “Janus problem,” where the overrepresentation of front-facing views in training data yields generated 3D models with multiple faces. The concept reframes generative models as social filters that surface collective attitudes, raising questions about whose mean is encoded, how individuals are subsumed by the crowd, and the environmental and labor costs of the infrastructures that produce these statistical visions.
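To make the core idea concrete: Galton built his composites optically, by exposing a photographic plate to many aligned portraits in turn, which is equivalent to a pixel-wise average. Below is a minimal sketch of that averaging, not anything from the essay itself; the `faces/` directory, the `.png` format, and the 256×256 resolution are all hypothetical, and the images are assumed to be pre-aligned portraits.

```python
# Galton-style composite: the pixel-wise mean of a stack of aligned images.
# Illustrative sketch only; directory name, file format, and size are assumptions.
from pathlib import Path

import numpy as np
from PIL import Image

SIZE = (256, 256)  # assumed common resolution for all portraits

def composite_mean(image_dir: str) -> Image.Image:
    """Average a directory of aligned grayscale portraits into one composite."""
    stack = []
    for path in sorted(Path(image_dir).glob("*.png")):
        img = Image.open(path).convert("L").resize(SIZE)
        stack.append(np.asarray(img, dtype=np.float64))
    if not stack:
        raise ValueError(f"no .png images found in {image_dir}")
    # The pixel-wise mean blurs away individual detail and keeps only what
    # the collection has in common: a statistical "face" belonging to no one.
    mean = np.mean(np.stack(stack), axis=0)
    return Image.fromarray(mean.astype(np.uint8))

if __name__ == "__main__":
    composite_mean("faces").save("composite_mean.png")
```

Diffusion models don't literally average pixels like this, but the sketch shows why averaging erases individuals: the essay's claim is that learned generative models perform an analogous convergence toward the statistically likely in latent space rather than in pixel space.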