🤖 AI Summary
Luke Hinds argues that LLMs themselves are fundamentally deterministic: once training finishes, a model is just a static set of numerical parameters (stored as PyTorch .pth/.pt or safetensors files, essentially a dict of tensors such as layer*.weight and layer*.bias). Given identical weights and inputs, the underlying mathematical function computes the same output. The commonly repeated claim that "LLMs are non-deterministic" is therefore misleading; the apparent randomness usually comes from execution-level effects on parallel hardware. GPUs/TPUs execute many floating-point operations in varying orders (due to threading, memory access patterns, and kernel optimizations), and finite-precision arithmetic is not associative, so the order in which values are summed changes the rounding. Tiny numerical discrepancies (e.g., sums like 1.0000000000000002 vs 0.9999999999999998) are then amplified by the transformer's sensitivity to small perturbations and by autoregressive decoding, where each generated token conditions the next step, producing divergent token sequences even for identical prompts.
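A minimal sketch of the non-associativity point (plain Python/PyTorch, not code from the article; the specific values are illustrative):

```python
import torch

# Finite-precision addition is not associative: regrouping the same terms
# changes the rounding, which is exactly what happens when a parallel
# reduction on a GPU combines partial sums in a non-fixed order.
a, b, c = 0.1, 0.2, 0.3

left_first  = (a + b) + c   # 0.6000000000000001
right_first = a + (b + c)   # 0.6
print(left_first == right_first)  # False

# The same effect appears in tensor reductions: summing the same numbers
# in a different order may differ in the last bits of the result.
x = torch.randn(1_000_000, dtype=torch.float32)
print(x.sum().item())          # one summation order
print(x.flip(0).sum().item())  # reversed order; may not match exactly
```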
Hinds demonstrates this by running Qwen‑3‑0.6B on a Colab T4 with fixed seeds and the Hugging Face transformers stack, showing identical outputs when the environment is controlled. The takeaway for researchers: variability is an implementation and hardware artifact, not an intrinsic property of the model, so reproducibility and safety work should target computational determinism (strict operation ordering, deterministic GPU modes, hybrid HW/SW designs) rather than treating LLMs as inherently stochastic or “sentient.”
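A hedged sketch of that kind of controlled run (the model family and Colab T4 setup come from the article; the exact hub ID, prompt, generation settings, and deterministic-mode flags below are assumptions, not Hinds's verbatim code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Qwen/Qwen3-0.6B"  # assumed Hugging Face hub ID for the model named in the article

# Pin software-visible randomness.
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)

# Request deterministic kernels where available (fixed reduction orders
# instead of "fastest thread wins"); warn_only avoids hard errors for
# ops that lack a deterministic implementation.
torch.use_deterministic_algorithms(True, warn_only=True)
torch.backends.cudnn.benchmark = False

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32).eval().cuda()

prompt = "Explain why LLM inference can look non-deterministic."  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding removes sampling randomness; with the settings above,
# repeated runs on the same hardware should produce identical tokens.
with torch.no_grad():
    out1 = model.generate(**inputs, do_sample=False, max_new_tokens=64)
    out2 = model.generate(**inputs, do_sample=False, max_new_tokens=64)

print(torch.equal(out1, out2))  # expected: True in a controlled environment
```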