Show HN: Quick CLI for local text-to-speech using Kokoro (github.com)

🤖 AI Summary
ltts is a lightweight CLI for fast, local text-to-speech powered by the Kokoro TTS models. Install it with pip (pip install ltts) or run it without installing via uvx/uv (uvx ltts "hello world" or uv run ltts "hello world"). It supports streaming audio to your speakers (--say), reading text from stdin, and writing MP3/OGG/FLAC/WAV files.

Over 50 voices span multiple languages (American/British English, Japanese, Chinese, Spanish, French, Hindi, Italian, Portuguese, etc.); the language is auto-detected from the voice prefix or can be set explicitly with -l. The first run downloads the model (~330 MB) to ~/.cache/huggingface/, and Japanese voices trigger a one-time ~526 MB dictionary download. Output plays at 24 kHz by default.

For the AI/ML community, the appeal is reproducible, privacy-friendly local TTS with no cloud APIs, enabling experimentation and deployment on developer machines. The project integrates easily into scripts and pipelines, supports editable installs and local runs (uv run python -m ltts), and exposes the many voice options from the Kokoro repo (see VOICES.md). On Linux, ensure PulseAudio/PipeWire access; on macOS, accept the audio permission prompt. The modest model footprint and multiple usage modes make it practical for prototyping voice UX, offline demos, and on-device synthesis.
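The usage modes above can be sketched as a few shell commands. Only pip install ltts, uvx ltts, uv run ltts, --say, and -l appear in the summary; the exact flag placement and the -l language codes are assumptions, so check the project README before relying on them.

```shell
# Install once with pip, or skip installation entirely and use uvx below.
pip install ltts

# Stream a phrase straight to the speakers (--say is the documented flag;
# combining it with a positional text argument is an assumption).
ltts --say "hello world"

# Voice the output of another command by reading text from stdin.
echo "build finished" | ltts --say

# Set the language explicitly with -l instead of relying on voice-prefix
# detection ("es" as a language code here is an assumption).
ltts --say -l es "hola mundo"

# One-off run without installing anything, as shown in the summary.
uvx ltts "hello world"
```

The first invocation will be slow while the ~330 MB model downloads to ~/.cache/huggingface/; subsequent runs use the cached copy.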