Training VLA Models with Normalizing Flows (github.com)

🤖 AI Summary
A new implementation called NinA adapts the FLOWER VLA codebase to train VLA (vision-language-action) models using Normalizing Flows (NF) instead of diffusion-based policies. The authors report that the NF policy achieves comparable performance while using substantially fewer parameters and providing faster inference, making it an attractive alternative for real-time or resource-constrained scenarios. NinA follows a standard NF training loop and is positioned as a drop-in experimental path for researchers who want to trade diffusion complexity for parameter and latency savings.

Technically, the repo provides two backbones, a lightweight MLP and a scalable Transformer, implemented in flower/models/flower_nf.py. Training is invoked with python3 flower/training_libero.py and exposes several key flags (an example invocation follows below):

- --backbone: backbone type, mlp or trans
- --n_layers: number of flow layers
- --affine_dim: hidden size of the flow layers
- --action_noise_mult: amplitude of noise added to ground-truth actions; reported to be an important hyperparameter
- --use_plu: enable PLU transformations; reported to have minimal impact

Built on the FLOWER VLA setup, NinA is useful for researchers exploring efficient policy representations, offering a practical path to lower-parameter, lower-latency VLA models without sacrificing performance.
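For concreteness, a training run with the Transformer backbone might look like the following. The script path and flag names come from the summary above; the flag values are illustrative, not recommended settings, and whether --use_plu is a boolean switch or takes an explicit value may differ in the actual script:

```
python3 flower/training_libero.py \
  --backbone trans \
  --n_layers 8 \
  --affine_dim 256 \
  --action_noise_mult 0.1 \
  --use_plu
```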
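To make "standard NF training loop" concrete, here is a minimal PyTorch sketch of the general technique: an affine-coupling (RealNVP-style) flow trained by maximizing the log-likelihood of ground-truth actions, with the kind of additive action noise that --action_noise_mult controls. Everything below (class names, dimensions, the conditioning scheme) is illustrative and is not NinA's actual implementation:

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer: the second half of the action vector is
    scaled and shifted by an MLP that sees the first half plus a conditioning
    vector (e.g. an embedding of the visual/language observation)."""
    def __init__(self, dim: int, cond_dim: int, hidden: int):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x, cond):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(torch.cat([x1, cond], dim=-1)).chunk(2, dim=-1)
        s = torch.tanh(s)                          # bound scales for stability
        y2 = x2 * torch.exp(s) + t
        log_det = s.sum(dim=-1)                    # log|det J| of this layer
        return torch.cat([x1, y2], dim=-1), log_det

class ActionFlow(nn.Module):
    """A stack of coupling layers; halves are swapped between layers so that
    every action dimension gets transformed."""
    def __init__(self, action_dim, cond_dim, n_layers=4, hidden=256):
        super().__init__()
        self.layers = nn.ModuleList(
            [AffineCoupling(action_dim, cond_dim, hidden) for _ in range(n_layers)])

    def log_prob(self, actions, cond):
        z, log_det = actions, torch.zeros(actions.shape[0])
        for i, layer in enumerate(self.layers):
            if i % 2 == 1:
                z = z.flip(-1)                     # cheap fixed permutation
            z, ld = layer(z, cond)
            log_det = log_det + ld
        base = torch.distributions.Normal(0.0, 1.0)
        # change of variables: log p(a) = log p_base(z) + sum of log|det J|
        return base.log_prob(z).sum(dim=-1) + log_det

# Toy training step with random tensors standing in for real data.
action_dim, cond_dim, batch = 7, 64, 32
flow = ActionFlow(action_dim, cond_dim)
opt = torch.optim.Adam(flow.parameters(), lr=1e-4)
action_noise_mult = 0.1                            # illustrative value

for step in range(100):
    cond = torch.randn(batch, cond_dim)            # stand-in for VLM features
    actions = torch.randn(batch, action_dim)       # stand-in for demo actions
    noised = actions + action_noise_mult * torch.randn_like(actions)
    loss = -flow.log_prob(noised, cond).mean()     # negative log-likelihood
    opt.zero_grad()
    loss.backward()
    opt.step()
```

At inference time a flow like this is inverted once, mapping a single base-distribution sample to an action, which is where the latency advantage over iterative diffusion sampling comes from.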