🤖 AI Summary
Independent journalist Ed Zitron published a detailed allegation that OpenAI’s ongoing “inference” costs—the GPU compute used to serve model outputs—are far larger than public numbers suggest, and may substantially exceed OpenAI’s revenues. Zitron says internal documents show OpenAI spent $5.02 billion on inference with Microsoft Azure in H1 CY2025 and $8.67 billion through September 2025, up from $3.76 billion in CY2024. He also reconstructs OpenAI revenue from Microsoft’s 20% revenue share payments and finds figures (e.g., implied 2024 revenue ≈ $2.47B vs. previously reported $3.7B) that don’t match public statements. The Financial Times reviewed the figures but Microsoft and OpenAI have declined to confirm them, saying the numbers “aren’t quite right” or are incomplete.
Technically, the piece highlights why inference, not just training, drives GenAI economics: inference compute scales as roughly 2 FLOPs per token-parameter interaction (FLOPs ≈ 2 × tokens × parameters), and newer reasoning models can generate far more output tokens (o1 reportedly produces 50% more tokens and 4× the output tokens of GPT-4o), while per-token pricing can be many times higher. Those multipliers compound, which is why API cost estimates for some tasks run 30–70× those of earlier models. If Zitron's numbers hold, the implication is stark: LLMs may be structurally hard to monetize at scale without major cost reductions, pricing changes, usage caps, architectural improvements, or continued subsidization by partners and investors.
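The compounding effect described above can be sketched with a back-of-envelope calculation. All model sizes, token counts, and multipliers below are illustrative assumptions, not figures from the article:

```python
# Back-of-envelope inference cost model. Every number here is a
# hypothetical assumption chosen for illustration only.

def inference_flops(tokens: int, params: int) -> int:
    """Approximate FLOPs to generate `tokens` outputs with a dense model
    of `params` parameters: ~2 FLOPs per token-parameter interaction."""
    return 2 * tokens * params

# Hypothetical baseline: a 200B-parameter model emitting 500 tokens.
PARAMS = 200_000_000_000
base = inference_flops(500, PARAMS)

# A reasoning-style model on the same request: assume 4x the output tokens.
reasoning = inference_flops(4 * 500, PARAMS)

# Compute scales linearly with output tokens.
print(reasoning / base)  # -> 4.0

# Cost compounds when per-token *price* is also higher: e.g. 4x tokens
# at a hypothetical 6x per-token price means a 24x larger API bill.
token_mult, price_mult = 4, 6
print(token_mult * price_mult)  # -> 24
```

The point of the sketch is that token count and per-token price multiply rather than add, which is how per-task costs can reach the 30–70× range cited in the piece even when neither factor alone looks that extreme.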