🤖 AI Summary
An engineer reports a practical, "one-shot" decompilation workflow that runs Claude (Opus 4.5) headlessly in a continuous loop to automatically match and integrate functions into a Snowboard Kids 2 decompilation. A simple driver script (vacuum.sh) ties together four components: a scorer that selects the next easiest function, Claude itself to do the decompilation, a small Unix-like toolbox of defensive scripts (build-and-verify.sh among them), and driver logic that manages lifecycle, retries, logging and clean shutdown. An early scorer heuristic (score = instruction_count + 3*branch_count + 2*jump_count + 2*label_count + stack_size) was later replaced by a logistic regression; stack_size proved unhelpful and was removed. Claude is instructed to make up to ten attempts per function, commit successful matches, and log failures. In a small comparison, Opus 4.5 outperformed Sonnet, and Codex fared poorly.
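The original linear heuristic can be sketched as a few lines of Python; this is a minimal illustration of the formula quoted above, and the field names and candidate data are assumptions, not the author's actual scorer code:

```python
# Hypothetical sketch of the scorer's original linear heuristic.
# Field names and sample data are invented for illustration; only
# the formula's coefficients come from the post.

def score(fn):
    # Lower score = presumed easier function = picked next.
    return (fn["instruction_count"]
            + 3 * fn["branch_count"]
            + 2 * fn["jump_count"]
            + 2 * fn["label_count"]
            + fn["stack_size"])  # term later dropped as unhelpful

candidates = [
    {"name": "func_A", "instruction_count": 40, "branch_count": 2,
     "jump_count": 1, "label_count": 1, "stack_size": 16},
    {"name": "func_B", "instruction_count": 120, "branch_count": 9,
     "jump_count": 4, "label_count": 6, "stack_size": 48},
]

easiest = min(candidates, key=score)  # the function to attempt next
```

The later logistic-regression scorer would replace the hand-tuned coefficients with weights fit on past success/failure logs, which is presumably how stack_size was found to carry no signal.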
This is significant because it demonstrates that unattended LLM agents can substantially increase throughput on large, repetitive reverse-engineering tasks: the author made more progress in three weeks than in the prior three months and estimates that ~79% of functions may be matchable. Key implications include shifting the bottleneck from human attention to compute and model access, the necessity of tooling and explicit guardrails (clear error messages, truncated build output to save tokens), and an expected future focus on post-processing LLM output (cleanup, documentation and refinement), since matches are often correct but messy: pointer arithmetic, gotos, awkward temporaries. Risks remain (quota exhaustion, LLMs going off the rails), so defensive tooling and careful logging are essential.