🤖 AI Summary
Tokuin is a fast, open-source CLI tool (written in Rust) that estimates token usage and API costs for prompts across many LLM providers (OpenAI, Anthropic, Mistral, OpenRouter’s multi-model catalog, etc.) and can optionally run real load tests against provider APIs. It provides per-model token counts (including role-based breakdowns for system/user/assistant), per-model cost estimates, multi-model comparisons, prompt diffing/minification, file watching, and flexible output (text, JSON, Markdown) — all aimed at helping developers quickly optimize prompts, budget API spend, and integrate token checks into CI or scripts.
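The per-model cost estimates work the way you'd expect: token counts multiplied by per-1K-token prices from a pricing registry. A minimal sketch of that calculation is below — the model names and prices are illustrative placeholders, not Tokuin's actual registry or API:

```rust
use std::collections::HashMap;

/// Cost in USD for a (prompt, completion) token pair, given
/// per-1K-token input/output prices. Illustrative only; Tokuin
/// ships its own model registry and pricing config.
fn estimate_cost(prompt_tokens: u32, completion_tokens: u32, in_per_1k: f64, out_per_1k: f64) -> f64 {
    prompt_tokens as f64 / 1000.0 * in_per_1k
        + completion_tokens as f64 / 1000.0 * out_per_1k
}

fn main() {
    // Hypothetical (input, output) prices per 1K tokens -- placeholder numbers.
    let pricing: HashMap<&str, (f64, f64)> = HashMap::from([
        ("gpt-4o", (0.005, 0.015)),
        ("claude-3-5-sonnet", (0.003, 0.015)),
    ]);
    for (model, (inp, out)) in &pricing {
        println!("{model}: ${:.4}", estimate_cost(1_200, 300, *inp, *out));
    }
}
```

Comparing several models is then just running the same token counts through each registry entry, which is what the multi-model comparison output amounts to.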
Under the hood it uses modular tokenizers (tiktoken-rs for OpenAI), a model registry with pricing config, and optional features enabled at build time (markdown, watch, Gemini support, load-test). The load-test mode supports concurrency, retries, think-time, dry-run cost estimation, max-cost stopping, Prometheus/JSON outputs, and provider auto-detection (or an explicit provider/endpoint) via flags or env vars. Because it's Rust-based, Tokuin targets high performance, portability, and safety for both ad-hoc audits and automated tooling; it's distributed via GitHub releases or built from source, and is MIT/Apache-2.0 dual-licensed for easy adoption.
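The max-cost stopping behavior mentioned above can be sketched as a simple budget check in the request loop — this is a minimal illustration of the idea, not Tokuin's actual implementation, and it tracks cost in integer micro-dollars to sidestep floating-point drift in the accumulator:

```rust
/// Simulate a max-cost stop condition: send requests until the next one
/// would push the running cost estimate past the cap, then stop.
/// Costs are in micro-USD (1_000_000 = $1) to keep the accumulation exact.
fn requests_within_budget(cost_per_request: u64, max_cost: u64, planned: u32) -> u32 {
    let mut spent: u64 = 0;
    let mut sent: u32 = 0;
    for _ in 0..planned {
        if spent + cost_per_request > max_cost {
            break; // stopping here is what keeps a load test under budget
        }
        spent += cost_per_request;
        sent += 1;
    }
    sent
}

fn main() {
    // $0.01 per request against a $0.05 cap: only 5 of 100 planned requests run.
    println!("{}", requests_within_budget(10_000, 50_000, 100));
}
```

Combined with dry-run cost estimation, this kind of check is what makes it safe to point a load test at a paid API from CI without an open-ended bill.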