Show HN: Hathora Models – Voice Model Marketplace (models.hathora.dev)

0 points | 247 days ago

🤖 AI Summary

Hathora Models has launched a voice-model marketplace for exploring, testing, and deploying production-ready ASR, TTS, and LLM models aimed at voice agents and real-time apps. The platform pairs an interactive sandbox for instant tryouts with a “Chain” tool that lets developers swap models to evaluate tradeoffs quickly, and it provides deployment docs for Pipecat, LiveKit, and direct API access to move from prototype to production.

The catalog highlights models suited to real-world needs: multilingual ASR with word-level timestamps (nvidia/parakeet-tdt-0.6b-v3), lightweight and cost-efficient TTS (hexgrad/Kokoro-82M), expressive public TTS (ResembleAI/chatterbox), and large Qwen3 LLMs (dense and MoE variants) for advanced reasoning, multilingual support, and agent use. Listed as coming soon are zero-shot voice cloning (nvidia/magpie-tts) and an ultra-low-latency on-prem TTS (rime/mistv2, ~70 ms).

For practitioners, this reduces friction when assembling voice stacks: you can compare latency, accuracy, expressiveness, and features such as zero-shot cloning or timestamping without first building end-to-end infrastructure. The integrations and model interoperability make the marketplace a practical tool for teams balancing inference cost, latency, and audio quality in production voice and real-time AI systems.
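
The model-swapping idea behind the “Chain” tool can be sketched roughly as follows. This is a minimal Python illustration, not Hathora's API: every name here (`run_chain`, `compare_tts`, and the `fake_*` stages) is hypothetical, and the stages are local stand-ins rather than real ASR, LLM, or TTS models. It only shows the general pattern of keeping two stages fixed, swapping the third, and recording latency per stage.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A stage is any callable that takes the previous stage's output.
Stage = Callable[[object], object]


@dataclass
class StageTiming:
    name: str
    seconds: float


def run_chain(stages: List[Tuple[str, Stage]], payload: object):
    """Run stages in order and record wall-clock time for each one."""
    timings: List[StageTiming] = []
    for name, stage in stages:
        start = time.perf_counter()
        payload = stage(payload)
        timings.append(StageTiming(name, time.perf_counter() - start))
    return payload, timings


# Hypothetical stand-ins; a real setup would call hosted models instead.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the bytes are already text


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts_a(text: str) -> bytes:
    return text.encode("utf-8")  # "fast" voice


def fake_tts_b(text: str) -> bytes:
    time.sleep(0.01)  # simulate a slower, more expressive voice
    return text.upper().encode("utf-8")


def compare_tts(audio: bytes, tts_options: Dict[str, Stage]):
    """Hold ASR and LLM fixed, swap only the TTS stage, and collect results."""
    results = {}
    for label, tts in tts_options.items():
        output, timings = run_chain(
            [("asr", fake_asr), ("llm", fake_llm), ("tts", tts)], audio
        )
        results[label] = (output, sum(t.seconds for t in timings), timings)
    return results


if __name__ == "__main__":
    for label, (output, total, _) in compare_tts(
        b"hello", {"a": fake_tts_a, "b": fake_tts_b}
    ).items():
        print(label, output, f"{total * 1000:.1f} ms")
```

Because each stage is just a callable, replacing a stand-in with a function that calls a real model endpoint would leave the comparison loop unchanged.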
