Google boasts 1.3 quadrillion tokens processed each month (the-decoder.com)

🤖 AI Summary
Google announced at a Google Cloud event that its AI systems now process more than 1.3 quadrillion tokens per month, up from about 980 trillion in June, a jump of roughly 320 trillion.

While eye-catching, the figure mainly reflects backend compute load rather than direct user activity. Tokens are the smallest units LLMs process (similar to word fragments), and recent "reasoning" models such as Gemini 2.5 Flash perform many more internal steps per request. Analysis suggests Gemini 2.5 Flash can consume roughly 17× more tokens per request than its predecessor and cost up to roughly 150× more for reasoning tasks; multimodal workloads (video, image, audio) likely add further hidden token counts that Google doesn't break out.

The milestone matters because it signals rapidly scaling compute demands across Google's AI stack and calls simple interpretations of usage or efficiency into question. Google's environmental claims, for example that a typical Gemini text prompt uses 0.24 Wh and emits 0.03 g CO₂, are based on "typical" short prompts and likely lighter models, not heavy reasoning or multimodal workloads. As a result, the 1.3 quadrillion-token statistic points to growing infrastructure and energy intensity that may be under-represented in public efficiency figures, underscoring the need for clearer, workload-level transparency.
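The headline arithmetic and the reported per-request multipliers can be sanity-checked with a short back-of-the-envelope sketch. The monthly totals and the 17×/150× multipliers come from the article; the baseline tokens-per-request value is a hypothetical number used purely for illustration.

```python
# Back-of-the-envelope check of the figures cited above.
QUADRILLION = 10**15
TRILLION = 10**12

monthly_now = 1.3 * QUADRILLION   # reported current monthly token volume
monthly_june = 980 * TRILLION     # reported June figure

jump = monthly_now - monthly_june
print(f"Jump since June: {jump / TRILLION:.0f} trillion tokens")
# → Jump since June: 320 trillion tokens

# Reported multipliers for Gemini 2.5 Flash vs. its predecessor.
token_multiplier = 17    # ~17x more tokens per reasoning request
cost_multiplier = 150    # up to ~150x costlier for reasoning tasks

# Hypothetical baseline, NOT from the article, for illustration only.
baseline_tokens_per_request = 1_000
reasoning_tokens = baseline_tokens_per_request * token_multiplier
print(f"Reasoning request: ~{reasoning_tokens:,} tokens "
      f"vs. {baseline_tokens_per_request:,} baseline")
# → Reasoning request: ~17,000 tokens vs. 1,000 baseline
```

The point of the sketch is that the aggregate token figure is dominated by per-request inflation (reasoning steps, multimodal inputs) rather than by a proportional rise in user prompts.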