🤖 AI Summary
Uber’s ML team announced the production rollout of Two-Tower Embeddings (TTE) in Michelangelo to power recommendations (starting with the Eats homefeed) after about a year of development. The move replaces a brittle, city-wise DeepMF approach (thousands of Spark jobs) with a single, scalable TTE pipeline that yielded major performance and maintenance wins. Practically, Uber now precomputes store/item embeddings offline and serves eater/query embeddings in real time, using Approximate Nearest Neighbor (ANN) search via Uber’s SIA inverted index for fast retrieval. This reduces expensive O(q * M) real-time DL inference to O(M) offline plus O(q) real-time model calls with fast dot-product ANN lookups, enabling personalized retrieval in ~hundreds of milliseconds.
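To make the offline/online split concrete, here is a minimal numpy sketch of the serving pattern the summary describes. All names, shapes, and the brute-force top-k lookup are illustrative assumptions, not Uber's actual APIs; in production the scan over all M items would be replaced by an ANN index (Uber's SIA).

```python
import numpy as np

DIM = 64
rng = np.random.default_rng(0)

# --- Offline (batch): run the item tower once over all M stores/items, O(M). ---
M = 10_000
item_embeddings = rng.normal(size=(M, DIM)).astype(np.float32)  # stand-in for item-tower output
item_embeddings /= np.linalg.norm(item_embeddings, axis=1, keepdims=True)  # unit-norm: cosine == dot product

# --- Online (per request): run only the query tower for the eater/context, O(q). ---
query_embedding = rng.normal(size=DIM).astype(np.float32)  # stand-in for query-tower output
query_embedding /= np.linalg.norm(query_embedding)

# Brute-force dot-product top-k as a stand-in for the ANN lookup.
scores = item_embeddings @ query_embedding
top_k = np.argsort(-scores)[:50]
print(top_k[:10], scores[top_k[:10]])
```

The point of the split is that the expensive deep-learning inference over the full catalog moves to a batch job, leaving only one model call plus a cheap dot-product lookup on the request path.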
Technically, the model uses a query tower (user/context) and an item tower (store/item/geo), trained jointly on engagement labels with a similarity metric (dot product or cosine) and in-batch negatives to optimize recall@N for large N. Key design choices include spatial indexing to respect geographic constraints; localization (0/1-hop neighborhoods), which allows pre-sampling and disk-backed features instead of in-memory graph processing and thereby avoids GNN complexity; and straightforward extensibility to numeric and NLP features. Beyond retrieval, TTE can act as a lightweight final ranker and generate transferable embeddings for downstream tasks, accelerating reuse and opening broader productization across Eats, Groceries, Maps, and more.
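As a sketch of the training objective, the sketch below shows the in-batch-negatives softmax loss in numpy: for a batch of B matched (query, item) pairs, each row's positive is its own item and the other B-1 items serve as negatives. The temperature value and shapes are illustrative assumptions; the source only states that the towers are trained jointly with dot-product/cosine similarity and in-batch negatives.

```python
import numpy as np

def in_batch_softmax_loss(q_emb, i_emb, temperature=0.05):
    """Softmax cross-entropy over a (B, B) similarity matrix.
    Diagonal entries are the positive pairs; off-diagonal entries
    are the in-batch negatives. Temperature is an assumed knob."""
    # Normalize both towers' outputs so the dot product is cosine similarity.
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    i = i_emb / np.linalg.norm(i_emb, axis=1, keepdims=True)
    logits = (q @ i.T) / temperature  # (B, B) all-pairs similarities

    # Numerically stable row-wise log-softmax; positives sit on the diagonal.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
B, DIM = 8, 64
loss = in_batch_softmax_loss(rng.normal(size=(B, DIM)), rng.normal(size=(B, DIM)))
print(f"in-batch softmax loss: {loss:.4f}")
```

Reusing the batch's other items as negatives avoids a separate negative-sampling pipeline, which is what makes this objective a natural fit for optimizing recall over a very large candidate set.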