Building the largest known Kubernetes cluster, with 130k nodes (cloud.google.com)

🤖 AI Summary
Google Cloud reported running an experimental Google Kubernetes Engine (GKE) cluster with 130,000 nodes—double the previously supported 65K limit—to explore scaling for large AI workloads. The test sustained ~1,000 Pod creations per second and stored over 1 million objects in an optimized distributed storage layer, with a Spanner-based key-value backend driving ~13,000 QPS for lease updates.

The benchmark ran in four phases, deploying up to 130K Pods: a baseline, mixed low/medium/high-priority workloads, and spikes of latency-sensitive inference (up to 52K Pods). The exercise demonstrated rapid preemption and gang-style, "all-or-nothing" scheduling via Kueue, showing much faster workload shifts than kube-scheduler alone.

Technically, Google relied on several Kubernetes advancements to avoid read amplification and datastore overload: Consistent Reads from Cache (KEP-2340) and a Snapshottable API Server Cache (KEP-4988) to serve strongly consistent LIST/watch requests from memory, plus the Spanner-backed storage layer. They also highlighted work to promote workload-aware scheduling (including KEP-4671 for gang scheduling) into core Kubernetes, RDMA networking (managed DRANET), MultiKueue for multi-cluster orchestration, and faster data access via GCS FUSE with caching or Managed Lustre.

Beyond breaking scale records, these innovations harden GKE for everyday users and signal a shift toward workload-centric orchestration needed for next-gen, multi-datacenter AI/HPC deployments where power, networking, and cross-cluster scheduling are critical.
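The ~13,000 QPS figure for lease updates follows directly from the node count: each kubelet renews its node Lease on a fixed interval (10 seconds is the Kubernetes default; the test cluster may tune this), so 130,000 nodes produce roughly 13,000 renewal writes per second. A back-of-envelope check:

```python
# Back-of-envelope: steady-state Lease-update QPS for a 130k-node cluster.
# Assumes the default kubelet node-Lease renew interval of 10 seconds.
nodes = 130_000
renew_interval_s = 10  # Kubernetes default; the experiment may differ

lease_qps = nodes / renew_interval_s
print(lease_qps)  # 13000.0
```

This is why node heartbeats alone become a first-order load source at this scale, independent of any Pod churn.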
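The core idea behind Consistent Reads from Cache (KEP-2340) can be sketched briefly: rather than forwarding every strongly consistent LIST to the backing store, the API server fetches only the store's latest revision (a cheap request), waits for its watch cache to catch up to that revision, and then answers from memory. A minimal sketch of that pattern, with illustrative names that are not the actual apiserver code:

```python
import threading

class WatchCache:
    """Toy watch cache: applies ordered events and tracks the latest
    revision it has observed from the backing store's watch stream."""

    def __init__(self):
        self.revision = 0
        self.objects = {}
        self._cond = threading.Condition()

    def apply_event(self, revision, key, obj):
        # Called by the watch loop as updates stream in, in revision order.
        with self._cond:
            self.objects[key] = obj
            self.revision = revision
            self._cond.notify_all()

    def wait_until_fresh(self, min_revision, timeout=5.0):
        # Block until the cache has seen at least min_revision.
        with self._cond:
            ok = self._cond.wait_for(
                lambda: self.revision >= min_revision, timeout
            )
            if not ok:
                raise TimeoutError("cache lagging; fall back to the store")

    def consistent_list(self, current_store_revision):
        # The KEP-2340 pattern: confirm freshness against the store's
        # latest revision, then serve the full LIST from memory instead
        # of re-reading every object from the datastore.
        self.wait_until_fresh(current_store_revision)
        with self._cond:
            return dict(self.objects)

cache = WatchCache()
cache.apply_event(7, "pod-a", {"phase": "Running"})
print(cache.consistent_list(7))  # served from memory, still consistent
```

The win at 130K nodes is that a LIST over a million objects costs one tiny revision lookup against the store instead of a full scan, which is what "avoiding read amplification" refers to.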
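The "all-or-nothing" semantics Kueue provides (and that KEP-4671 aims to bring into core scheduling) boil down to admitting a group of Pods only when capacity exists for every member, rather than binding them one at a time and stranding a partially placed job. A hypothetical, heavily simplified sketch of that admission rule—real Kueue quota accounting is far richer:

```python
def gang_admit(pod_requests, free_capacity):
    """All-or-nothing admission: admit the whole gang or none of it.

    pod_requests:  per-Pod resource demands for one workload (e.g. GPUs).
    free_capacity: resources currently available in the queue's quota.
    Returns remaining capacity if admitted, or None if the gang is rejected.
    """
    total = sum(pod_requests)
    if total > free_capacity:
        return None  # reject the entire gang; never place a partial job
    return free_capacity - total

# A 4-Pod gang needing 8 GPUs each fits within 40 free GPUs...
assert gang_admit([8, 8, 8, 8], free_capacity=40) == 8
# ...but a 6-Pod gang is rejected outright instead of placing 5 of 6
# Pods and deadlocking a tightly coupled training job.
assert gang_admit([8] * 6, free_capacity=40) is None
```

Combined with priority-based preemption, this is what let the benchmark shift between training and latency-sensitive inference spikes much faster than admitting Pods individually through kube-scheduler.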