Tinker by Thinking Machines (thinkingmachines.ai)

🤖 AI Summary
Thinking Machines has launched Tinker, a training API that gives researchers full programmatic control of model training while the company manages the compute and infrastructure. Tinker exposes four primitives: forward_backward (a forward and backward pass with gradient accumulation), optim_step (applying optimizer updates), sample (token generation for interaction, evaluation, or RL), and save_state (checkpointing and resumption). With these, users can implement custom training loops, RL algorithms, evaluation protocols, and schedulers without building infrastructure; a sketch of how the primitives compose follows below.

Tinker currently supports Qwen models and focuses on LoRA-style fine-tuning: a small set of adapter weights is trained while the base model's weights stay frozen. For the AI/ML community this matters because it separates algorithmic experimentation from engineering overhead: teams can iterate on datasets, objectives, and training logic faster and more reproducibly.

The fine-grained API lets researchers control gradient accumulation, optimizer behavior, and sampling strategies (useful for RL or human-in-the-loop evaluation), while save_state enables robust checkpointing and resumption. By combining LoRA with managed infrastructure, Tinker lowers the cost and complexity of large-model tuning and rapid prototyping. Customers report quicker iteration cycles, an easier entry into RL work, and fewer infrastructure headaches, making it a practical toolkit for experimental ML research and reproducible model development.
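To make the division of labor concrete, here is a minimal sketch of a supervised fine-tuning loop built on the four primitives. This is an illustrative assumption, not Tinker's documented API: the `client` object, argument names, return values, and helper parameters (`accum_steps`, `save_every`) are all hypothetical, chosen only to show how forward_backward, optim_step, save_state, and sample compose into a loop.

```python
# Hypothetical sketch of a fine-tuning loop over Tinker's four primitives.
# `client`, its method signatures, and all parameter names are illustrative
# assumptions; consult the actual Tinker documentation for real signatures.

def train(client, dataset, num_epochs=3, accum_steps=4, save_every=500):
    step = 0
    for epoch in range(num_epochs):
        for batch in dataset:
            # forward_backward: run a forward and backward pass; gradients
            # accumulate across calls until the optimizer step is applied.
            loss = client.forward_backward(batch)

            # optim_step: apply the optimizer update once enough
            # micro-batches have accumulated.
            if (step + 1) % accum_steps == 0:
                client.optim_step()

            # save_state: periodic checkpoint so the run can be resumed.
            if (step + 1) % save_every == 0:
                client.save_state(name=f"ckpt-{step + 1}")

            step += 1

    # sample: generate tokens from the tuned model for a quick evaluation.
    return client.sample(prompt="Summarize: ...", max_tokens=64)
```

Because gradient accumulation and the optimizer update are separate calls, the same loop structure extends naturally to RL-style workflows: replace the supervised batch with rollouts produced by sample and scored by a reward before calling forward_backward.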