Building an Open ABI and FFI for ML Systems (tvm.apache.org)

🤖 AI Summary
TVM FFI is an open, minimal ABI/FFI for machine learning systems, announced as a standalone library that evolved from Apache TVM's ABI work. It targets the growing fragmentation in ML toolchains — frameworks (PyTorch, JAX, CuPy), specialized kernels (FlashAttention, cuDNN), and many DSLs and compilers — by providing a common, stable C-level contract so that libraries, DSLs, and runtimes can interoperate without per-pair bindings. The project emphasizes zero-copy tensor interchange (via DLPack), multi-language bindings (Python, C++, Rust), and a lightweight delivery model (a single pip wheel shipping libtvm_ffi), so one compiled kernel can be reused across Python versions and non-Python runtimes.

Technically, TVM FFI centers on a 16-byte tagged-union value (TVMFFIAny) and intrusive TVMFFIObject pointers that carry a runtime type_index and standalone deleters for cross-language ownership. It exposes a single "packed function" C signature that is type-erased yet efficient (≈0.4 µs Python↔C++ call overhead; tens of ns for static-to-static calls), supports closures and callbacks, and preserves GPU stream contexts for zero-copy torch.Tensor calls. Error propagation uses a thread-local-storage convention to translate exceptions across language boundaries.

By decoupling the ABI from the bindings, TVM FFI reduces the combinatorial binding problem, simplifies AOT/JIT targeting for DSL compilers, and enables mix-and-match composition of kernel libraries across deployment targets (desktop, mobile, automotive, WebAssembly).