Show HN: Hyprvoice – Voice-Powered Typing for Wayland/Hyprland (No X11 Hacks) (github.com)

0 points | 14 hours ago

🤖 AI Summary

Hyprvoice is a Wayland-native voice-to-text utility for Linux desktops: press a key to start or stop recording, and the transcribed text is injected at the cursor, with no X11 hacks. It is built as a lightweight daemon plus a three-stage pipeline (recording → transcribing → injecting) controlled over a Unix socket. It captures audio through PipeWire, uses notify-send for recording and transcription feedback, and supports systemd user services. The project is in beta and packaged on the AUR, where the package auto-installs dependencies such as pipewire, wl-clipboard, wtype, and libnotify; it works on Hyprland, GNOME, KDE, and others.

Transcription currently goes through OpenAI's Whisper API; whisper.cpp support is planned for local, offline inference, which gives a migration path from cloud to private models. Audio defaults (16 kHz, mono, s16) and the buffering and timeout controls are set in a TOML file that hot-reloads. Injection modes are direct typing via wtype, clipboard copy and restore, or a smart fallback. A state machine keeps the lifecycle predictable (idle → recording → transcribing → injecting), and the tool deliberately prioritizes reliability (clipboard restore, wtype timeouts) and extensibility to other backends. For AI/ML practitioners, it offers a compositor-native bridge between speech models and desktop workflows, with hooks for swapping cloud models for local inference when privacy, latency, or offline use matters.
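The lifecycle named in the summary (idle → recording → transcribing → injecting) can be illustrated with a small state machine. This is only a sketch, not Hyprvoice's actual code: the `State`, `TRANSITIONS`, and `Pipeline` names are hypothetical, and the early returns to idle (for a cancelled recording or a failed transcription) are an assumption the summary does not confirm.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    TRANSCRIBING = auto()
    INJECTING = auto()


# Forward transitions follow the lifecycle in the summary.
# ASSUMPTION: recording and transcribing may also drop straight back to IDLE
# (cancel or failure); the summary does not say how errors are handled.
TRANSITIONS = {
    State.IDLE: {State.RECORDING},
    State.RECORDING: {State.TRANSCRIBING, State.IDLE},
    State.TRANSCRIBING: {State.INJECTING, State.IDLE},
    State.INJECTING: {State.IDLE},
}


class Pipeline:
    """Hypothetical lifecycle tracker that rejects out-of-order transitions."""

    def __init__(self):
        self.state = State.IDLE
        self.history = [State.IDLE]

    def advance(self, target):
        """Move to `target`, or raise ValueError if the move is not allowed."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition: {self.state.name} -> {target.name}"
            )
        self.state = target
        self.history.append(target)

    def toggle(self):
        """Model the single hotkey: start when idle, stop when recording."""
        if self.state is State.IDLE:
            self.advance(State.RECORDING)
        elif self.state is State.RECORDING:
            self.advance(State.TRANSCRIBING)
        # ASSUMPTION: presses while transcribing or injecting are ignored.
        return self.state
```

Keeping the allowed transitions in one table means a stray key press or a late callback cannot push the pipeline into an undefined state; it either matches a listed edge or is rejected.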
