🤖 AI Summary
PyTorch now includes a much more memory-efficient knapsack-based solver for activation memory planning—dp_knapsack_sliding_hirschberg—which can cut peak RAM used during planning by roughly 20x compared with the default dp_knapsack. That matters because PyTorch’s memory planner chooses which intermediate activations to store vs. recompute under a RAM budget; using a leaner solver lets the planner handle far larger graphs (the author reports handling ~2,000 items vs. ~100 on a 64 GB machine), enabling bigger models or batches without changing model code. The new solver also shows a nontrivial runtime win (~37% in the author’s informal tests).
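For context, the budget the planner works against is a separate knob. A minimal sketch of how the planner is typically engaged, assuming the knobs in PyTorch's private `torch._functorch.config` module (names may change between versions, and the exact budget value here is illustrative):

```python
import torch
import torch._functorch.config as functorch_config

# Cap saved-activation memory at 50% of what saving everything would
# use; the knapsack solver then decides which intermediates to keep
# vs. recompute under this budget. (Private API; may change.)
functorch_config.activation_memory_budget = 0.5

model = torch.nn.Linear(1024, 1024)
compiled = torch.compile(model)  # planner runs during compilation
```

The budget is a fraction rather than a byte count, so the same setting scales across model sizes.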
Technically, dp_knapsack_sliding_hirschberg combines two classic optimizations: the sliding-window trick (keep only the current and previous DP rows) and Hirschberg’s divide-and-conquer, which reconstructs the chosen items without storing the full DP table for backtracking. That reduces DP memory from O(#items * capacity) to O(2 * capacity) while still producing exact 0/1-knapsack solutions. Alternatives remain: greedy_knapsack trades exactness for speed, and ilp_knapsack gives very fast exact solves if you can add SciPy as a dependency. The new solver lives on PyTorch’s main branch (not yet released), so to try it today you must build from source and set torch._functorch.config.activation_memory_budget_solver = torch._functorch.config.dp_knapsack_sliding_hirschberg.
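The combination of the two tricks can be sketched in a few lines. This is not PyTorch's implementation, just a self-contained illustration of the technique the summary describes: the value-only DP uses a single row (the sliding window collapsed to one buffer by iterating capacities downward), and Hirschberg's recursion splits the item list in half, finds the optimal capacity split between the halves, and recurses to recover the chosen items without ever materializing the full table:

```python
def dp_row(items, cap):
    """Best achievable value for each capacity 0..cap, O(cap) memory."""
    best = [0] * (cap + 1)
    for w, v in items:
        # Iterate capacities downward so each item is used at most once.
        for c in range(cap, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best

def hirschberg_knapsack(items, cap):
    """Exact 0/1 knapsack: returns (best_value, chosen_indices).

    Hirschberg divide-and-conquer: split the items in half, compute a
    value-only DP row for each half, pick the capacity split that
    maximizes the combined value, then recurse into each half to
    reconstruct the selection -- no full DP table, no backtracking.
    """
    def solve(lo, hi, c):
        if hi - lo == 1:  # base case: a single item
            w, _ = items[lo]
            return [lo] if w <= c else []
        mid = (lo + hi) // 2
        left = dp_row(items[lo:mid], c)
        right = dp_row(items[mid:hi], c)
        # Best way to divide capacity c between the two halves.
        split = max(range(c + 1), key=lambda k: left[k] + right[c - k])
        return solve(lo, mid, split) + solve(mid, hi, c - split)

    if not items:
        return 0, []
    chosen = solve(0, len(items), cap)
    return sum(items[i][1] for i in chosen), chosen
```

Each recursion level touches only O(capacity) DP state, at the cost of roughly doubling the DP work versus a plain table with backtracking, which is the memory/compute trade the article's measurements reflect.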