We Bet on Rust to Supercharge Feature Store at Agoda (medium.com)

🤖 AI Summary
Agoda rewrote its high-traffic Feature Store Serving component, originally a Feast fork on a JVM-based stack and later a Scala service, in Rust to solve unpredictable P99 latency, scaling bottlenecks, and GC-driven jitter. Feature Store Serving fetches features from ScyllaDB and must meet a P99 latency budget of ~10 ms while handling millions of requests per second, so even small inefficiencies in the serving layer mattered. After a one-week Rust proof of concept by a single developer, aided by GitHub Copilot and Rust's compiler diagnostics, benchmarks showed large gains in requests per second, CPU, and memory. Rust's zero-cost abstractions, predictable GC-free runtime, and ownership model let the team eliminate the blocking patterns and costly data conversions that hampered the JVM implementation. The migration was validated with a production "shadow testing" setup (Istio traffic mirroring plus a comparator) to surface behavioral differences before full cutover. Results: traffic has grown ~5x since the migration, CPU usage initially dropped to ~13% of the Scala peak (now ~40% at 5x load), memory initially fell to ~1% (now ~15%), and annual compute costs are down ~84% compared to staying on Scala, i.e., staying on Scala would be ~6.3x more expensive today. Key takeaways for ML engineers: Rust can deliver deterministic low-latency serving at much lower resource cost, AI-assisted tooling can flatten the learning curve, and robust shadow testing is critical when replacing a production inference path.
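The shadow-testing setup mentioned above can be sketched as an Istio VirtualService that keeps the existing Scala service as the live backend while mirroring a copy of every request to the Rust candidate. This is a hypothetical minimal config, not Agoda's actual one; the service hostnames and mirror percentage are illustrative assumptions:

```yaml
# Hypothetical Istio traffic-mirroring config for shadow testing.
# Hostnames (feature-store-serving-*) are illustrative, not Agoda's.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: feature-store-serving
spec:
  hosts:
    - feature-store-serving
  http:
    - route:
        - destination:
            host: feature-store-serving-scala  # live path; its responses reach callers
          weight: 100
      mirror:
        host: feature-store-serving-rust       # shadow path; Istio discards its responses
      mirrorPercentage:
        value: 100.0                           # mirror all traffic to the candidate
```

Because the sidecar fires mirrored requests and discards their responses, a comparator like the one the article describes would have to capture the outputs of both services separately (e.g., via logging) and diff them offline to surface behavioral differences before cutover.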