A bare-metal-first architecture to address the GPU virtualization tax (www.ori.co)

🤖 AI Summary
Ori is pitching a "bare-metal-first" architecture for AI infrastructure: a distributed cloud OS that treats physical servers as the primary primitive rather than layering a hypervisor on top. The platform automates node lifecycle via IPMI/PXE, assigns dynamic node roles (GPU nodes for bare-metal HPC or Kubernetes workers; CPU nodes for pods or VMs), and continuously monitors and replaces faulty hardware. When VMs are needed, they are provisioned directly on bare metal with KVM, avoiding monolithic hypervisor management.

The goal is to eliminate the persistent "virtualization tax" (commonly 5–30% depending on tuning), which burns CPU cycles and, more critically, slows IO paths (virtual switches, storage controllers) and undermines RDMA/InfiniBand/RoCE performance for distributed training. Technically this delivers lower latency, stronger isolation, and better economics: Ori uses hardware-level isolation (NVIDIA MIG, SR-IOV), EVPN/VXLAN and InfiniBand partitioning for multi-tenancy, and exposes RDMA directly to frameworks like PyTorch FSDP and JAX for low-latency collective ops (a minimal sketch of such a training setup follows below).

A unified global control plane manages provisioning, scheduling, and lifecycle across GPUs, storage, and networking, yielding a reported 10–15% higher effective throughput for distributed training versus virtualized environments, equivalent to adding more than 100 GPUs of effective capacity to a 1,000-GPU fleet. The approach also simplifies compliance and auditability and accelerates adoption of new hardware (NVLink, CXL, AMD/ARM accelerators), making it attractive for private AI clouds and regulated or sovereign deployments.
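To make the RDMA point concrete, here is a minimal sketch of a PyTorch FSDP training loop as it might run on a bare-metal GPU node. This is an illustrative assumption, not Ori's actual stack: the model, sizes, and launch method are placeholders. The relevant detail is that NCCL's collectives can talk to the InfiniBand/RoCE fabric directly, with no virtual switch or paravirtualized NIC in the path.

```python
# Minimal FSDP sketch, assuming a bare-metal node with RDMA-capable NICs
# visible to NCCL and a launch via e.g. `torchrun --nproc_per_node=8 train.py`.
# Model and hyperparameters are placeholders, not Ori's stack.
import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main():
    # On bare metal, NCCL can use InfiniBand/RoCE transports directly;
    # no hypervisor vSwitch sits between the GPU and the fabric.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Sequential(
        torch.nn.Linear(4096, 4096),
        torch.nn.ReLU(),
        torch.nn.Linear(4096, 4096),
    ).cuda()

    # FSDP shards parameters and gradients; its all-gather and reduce-scatter
    # collectives are exactly the traffic that benefits from an unvirtualized
    # RDMA path during distributed training.
    model = FSDP(model)
    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        x = torch.randn(32, 4096, device="cuda")
        loss = model(x).sum()
        loss.backward()
        optim.step()
        optim.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

On a virtualized host, the same collectives would typically traverse a virtual switch or paravirtualized NIC unless SR-IOV passthrough is configured, which is the overhead the bare-metal-first design is meant to remove.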