Code-to-Metric Regression: Predicting Numeric Outcomes of Code Executions (arxiv.org)

🤖 AI Summary
Researchers introduce "code-to-metric" regression: a unified Regression Language Model (RLM) that predicts numeric outcomes of code from its text alone, with no hand-crafted, domain-specific features. A relatively small 300M-parameter RLM, initialized from T5Gemma, predicts diverse metrics: memory footprints of programs in high-level languages such as Python and C++, latency of Triton GPU kernels, and the accuracy and speed of neural networks encoded in ONNX. Empirically, the RLM achieves a Spearman rank correlation above 0.9 on APPS competitive-programming submissions and an average Spearman rank correlation above 0.5 across 17 languages in CodeNet, and it attains the highest average Kendall Tau (0.46) on five classic neural-architecture-search (NAS) design spaces, a task previously dominated by graph neural networks, while also predicting architecture latencies across multiple hardware platforms.

The work is significant because it demonstrates that a single text-based model can generalize across languages, hardware targets, and metric types, simplifying workflows that have historically relied on bespoke static analysis, instrumentation, or heavy feature engineering. Practical implications include faster, cheaper profiling and performance estimation for compilers, NAS, deployment planning, and resource-aware scheduling. The approach suggests that lightweight LMs can serve as general-purpose predictors for code performance and model evaluation, ready to be integrated into toolchains and automated optimization loops.
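The headline numbers are rank correlations between predicted and measured metrics rather than absolute errors. Below is a minimal sketch, not from the paper, of what such an evaluation loop looks like; `predict_metric`, the toy programs, and the rank-by-length stub are hypothetical placeholders for the actual RLM and benchmark data.

```python
# Minimal sketch (assumptions labeled): scoring a code-to-metric regressor by
# rank correlation, matching the Spearman / Kendall Tau numbers in the summary.
from scipy.stats import spearmanr, kendalltau

def predict_metric(code_text: str) -> float:
    """Hypothetical stand-in for the RLM: maps source text to a number
    (e.g. peak memory in MB or kernel latency in ms)."""
    return float(len(code_text))  # stub: rank programs by source length

# Toy corpus of (source code, measured metric) pairs; values are illustrative.
programs = [
    ("print('hi')",                                      1.2),
    ("xs = [i * i for i in range(10**6)]",              38.0),
    ("import numpy as np\na = np.zeros((4096, 4096))", 128.0),
]

measured  = [m for _, m in programs]
predicted = [predict_metric(src) for src, _ in programs]

rho, _ = spearmanr(measured, predicted)   # rank agreement (APPS / CodeNet results)
tau, _ = kendalltau(measured, predicted)  # pairwise ordering agreement (NAS results)
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```

Rank correlation is the natural target here: for tasks like NAS or scheduling, getting the ordering of candidates right matters more than predicting the exact metric value.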