ML LLVM Project: Compiler Infrastructure for ML-Driven Optimizations (github.com)

🤖 AI Summary
IITH’s Compilers group has published ml-llvm-project, a GitHub repository that integrates ML-driven optimizations directly into LLVM via the ML-Compiler-Bridge and IR2Vec program embeddings. The codebase packages several reinforcement-learning (RL) systems (RL-based loop distribution, RL4ReAl for register allocation, and POSET-RL for phase ordering) and provides runtimes and example commands to invoke them through ONNX Runtime or TensorFlow AOT. The project is designed for researchers and compiler engineers who want to reproduce experiments, extend the ML models, or plug ML inference into LLVM optimization pipelines without rearchitecting the compiler.

Technically, IR2Vec produces distributed program representations from LLVM IR using flow analyses (Use-Def chains, Reaching Definitions, and live variables) to feed the ML models. The ML-Compiler-Bridge supports multiple integration modes (gRPC or pipes for server-client setups, and direct ONNX Runtime for in-process inference), and the repo documents the required dependencies (C++17, Python 3.10, CMake, gRPC v1.34/protobuf 3.13, Eigen 3.3.7, ONNX Runtime 1.16.3, TensorFlow 2.13) plus the build/CMake flags needed to enable the ML components.

Each optimizer exposes command-line hooks (for example, opt -custom_loop_distribution -cld-use-onnx for loop distribution, clang with -mlra-inference for RL4ReAl, and -poset-rl for phase ordering), and the documentation describes the reward models (e.g., instruction cost and cache misses), the use of SCC dependence graphs, and data-augmentation strategies. The release makes it straightforward to evaluate ML-guided passes on real code and to accelerate research into adaptive, architecture-aware compiler optimizations.
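
As a rough orientation, the invocations below sketch how those command-line hooks might be used. Only the flag names (-custom_loop_distribution, -cld-use-onnx, -mlra-inference, -poset-rl) come from the summary above; the input files, output options, and the way custom options are forwarded to LLVM are illustrative placeholders, so the repository’s README remains the authoritative reference.

  # RL-based loop distribution through the in-process ONNX Runtime path
  opt -custom_loop_distribution -cld-use-onnx input.ll -S -o distributed.ll

  # RL4ReAl register allocation during compilation
  # (custom LLVM options may need to be forwarded via -mllvm; check the README)
  clang -mlra-inference test.c -o test

  # POSET-RL phase ordering
  opt -poset-rl input.ll -S -o ordered.ll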