🤖 AI Summary
Thinking Machines’ Tinker framework is being used to fine-tune Meta’s Llama 3.1 8B into a Romanian instruction-following model via an open repo that packages data, training, testing, and checkpoint utilities. The project leverages Tinker’s distributed training service (no local GPUs required) plus LoRA (Low-Rank Adaptation) for parameter-efficient tuning and live testing on Tinker’s infrastructure; you sign up for the Tinker beta and use a session ID to test or download checkpoints. That workflow lowers the barrier to building high-quality, language-specific models and makes side-by-side comparisons with the base Llama 3.1 straightforward (interactive testing, --compare mode).
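For orientation, a minimal sketch of how such a run is typically started with Tinker’s Python client is shown below. The client and method names (`ServiceClient`, `create_lora_training_client`) follow Tinker’s public documentation, but the repo’s own scripts, its `--compare` flag, and its session-ID handling are not reproduced here, and the beta API may differ in detail.

```python
import tinker

# Connect to Tinker's hosted training service (no local GPUs needed).
# Authentication typically comes from an API key, e.g. the TINKER_API_KEY
# environment variable.
service_client = tinker.ServiceClient()

# Request a LoRA training client for the base model named in the summary.
# The rank matches the repo's documented LoRA config (rank=8).
training_client = service_client.create_lora_training_client(
    base_model="meta-llama/Llama-3.1-8B",
    rank=8,
)

# From here the repo's training script streams batches of Romanian
# instruction data and drives forward/backward passes and optimizer steps
# on Tinker's servers; checkpoints stay server-side until exported.
```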
Key technical details: base model meta-llama/Llama-3.1-8B; LoRA config rank=8, alpha=16, dropout=0.05, targeting all linear layers; training defaults of lr=1e-4, max_steps=1000, batch_size=4, warmup=100, save_steps=100, eval_steps=50, and the AdamW optimizer. Data sources include Romanian Wikipedia, OSCAR, and translated Alpaca/Dolly instruction examples, formatted as one JSONL conversation-message record per example. Tinker keeps weights on its servers (downloadable archives may take minutes to prepare), and common beta issues are documented (session expiry, archive delays, checkpoint path errors). The result is a practical, replicable template for low-resource-language fine-tuning with a promising reported loss trajectory (example run: 428.5 → 1.2 in ~2 hours), but users should expect some rough edges from a beta API and should validate data, checkpoints, and session lifecycles.
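To make the configuration concrete, here is a hedged sketch of an equivalent setup expressed with the widely used `peft`/`transformers` stack rather than the repo’s Tinker-specific code. The hyperparameter values mirror the summary; the `target_modules` list and the mapping onto `TrainingArguments` are illustrative assumptions for a Llama-style architecture, not the repo’s actual script.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings reported for the run: rank 8, alpha 16, dropout 0.05,
# applied to all linear projection layers (the module list below is an
# assumption for Llama-style models).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Training defaults from the summary, mapped onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="checkpoints",
    learning_rate=1e-4,
    max_steps=1000,
    per_device_train_batch_size=4,
    warmup_steps=100,
    save_steps=100,
    eval_steps=50,
    optim="adamw_torch",
)
```

Each training example is one JSON object per line in the conversation-message style described above; the exact field names used by the repo are an assumption here.

```python
import json

# One Romanian instruction example in conversation-message form; the
# "messages"/"role"/"content" schema is assumed, not taken from the repo.
example = {
    "messages": [
        {"role": "user", "content": "Explică pe scurt ce este fotosinteza."},
        {"role": "assistant", "content": "Fotosinteza este procesul prin care plantele transformă lumina solară, apa și dioxidul de carbon în energie chimică."},
    ]
}
print(json.dumps(example, ensure_ascii=False))  # one line per example in the .jsonl file
```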