🤖 AI Summary
Researchers introduce a unified Regression Language Model (RLM) that performs "code-to-metric" regression — predicting numeric outcomes of code execution directly from source text. Without heavy, domain-specific feature engineering, a single RLM (a compact 300M-parameter model initialized from T5Gemma) simultaneously predicts memory footprint across high-level languages (e.g., Python, C++), Triton GPU kernel latency, and ONNX model accuracy and throughput. Performance highlights include >0.9 Spearman rank correlation on competitive-programming submissions from APPS, >0.5 average Spearman rank correlation across 17 languages in CodeNet, and a best-average Kendall Tau of 0.46 on five classic NAS design spaces (outperforming prior graph-neural-network approaches), while also estimating architecture latencies across multiple hardware targets.
Significance: this work demonstrates that language models can generalize to quantitative performance estimation tasks previously tackled with specialized tooling, enabling unified, lightweight predictors useful for compiler optimizations, hardware-aware NAS, cost-aware code search, and automated benchmarking. Key technical implications are that (1) direct text-to-metric mapping is feasible across languages and runtimes, (2) small, pretrained encoder–decoder backbones (T5Gemma→300M) suffice to capture performance-relevant patterns, and (3) a single model can transfer across heterogeneous tasks and platforms — opening paths to faster, end-to-end performance modeling and cross-domain transfer in ML-driven software and systems workflows.
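The headline numbers above are rank correlations: they score how well the model *orders* programs by true performance, not how close its absolute predictions are. A minimal pure-Python sketch of Spearman and Kendall Tau, evaluated on hypothetical true vs. predicted kernel latencies (illustrative values, not from the paper):

```python
def _ranks(xs):
    # Average 1-based ranks, handling ties.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average position of the tied run, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(a, b):
    # Spearman = Pearson correlation of the ranks.
    ra, rb = _ranks(a), _ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)

def kendall_tau(a, b):
    # Tau-a: (concordant pairs - discordant pairs) / total pairs.
    n = len(a)
    c = d = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (a[i] - a[j]) * (b[i] - b[j])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    return (c - d) / (n * (n - 1) / 2)

# Hypothetical true vs. predicted latencies (ms); the orderings match exactly,
# so both correlations are 1.0 even though absolute errors are nonzero.
true_ms = [1.2, 3.4, 0.8, 5.1, 2.0]
pred_ms = [1.1, 3.0, 0.9, 4.8, 2.5]
print(spearman(true_ms, pred_ms))     # → 1.0
print(kendall_tau(true_ms, pred_ms))  # → 1.0
```

This is why rank metrics suit use cases like NAS and kernel selection: the downstream decision is "which candidate is fastest?", so a predictor that orders candidates correctly is useful even when its absolute estimates are off.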