🤖 AI Summary
Hyprvoice is a new Wayland-native voice-to-text utility for modern Linux desktops: press a key to start or stop recording, and the transcribed text is injected at the cursor, with no X11 hacks. It is built as a lightweight daemon plus a pipeline (recording → transcribing → injecting) with a Unix-socket control plane. It integrates with PipeWire for audio capture, uses notify-send for real-time recording and transcription feedback, and supports systemd user services. The project is beta-ready, packaged on the AUR (which auto-installs dependencies such as pipewire, wl-clipboard, wtype, and libnotify), and works on Hyprland, GNOME, KDE, etc.
Technically, it supports OpenAI’s Whisper API today, with whisper.cpp planned for local/offline inference, which gives a migration path from cloud to private models. Audio defaults (16 kHz, mono, s16) and buffering/timeout controls are set in a TOML file that hot-reloads. Injection modes include direct typing via wtype, clipboard copy/restore, or a smart fallback. A state machine keeps lifecycle handling predictable (idle → recording → transcribing → injecting), and the tool deliberately focuses on reliability (clipboard restore, wtype timeouts) and on extensibility for other backends. For AI/ML practitioners this matters because it provides a polished, compositor-native bridge between speech models and desktop workflows, with clear hooks for swapping cloud models for local inference when privacy, latency, or offline use is a concern.
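
To make the lifecycle concrete, here is a minimal Python sketch of a state machine that follows the idle → recording → transcribing → injecting order named above. It is an illustration only, not Hyprvoice's actual code: the names `State`, `Pipeline`, `advance`, and `toggle` are hypothetical, and both the return to idle after injection and the choice to ignore key presses while busy are assumptions.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    TRANSCRIBING = auto()
    INJECTING = auto()


# Forward transitions in the order the summary describes.
# Assumption: the pipeline returns to IDLE once injection finishes.
TRANSITIONS = {
    State.IDLE: State.RECORDING,
    State.RECORDING: State.TRANSCRIBING,
    State.TRANSCRIBING: State.INJECTING,
    State.INJECTING: State.IDLE,
}


class Pipeline:
    """Hypothetical lifecycle tracker; it records state only and does no audio work."""

    def __init__(self):
        self.state = State.IDLE

    def advance(self):
        # Move to the single allowed next state and return it.
        self.state = TRANSITIONS[self.state]
        return self.state

    def toggle(self):
        # Hypothetical key-press handler: start when idle, stop when recording.
        if self.state in (State.IDLE, State.RECORDING):
            return self.advance()
        # Assumption: key presses are ignored while transcribing or injecting.
        return self.state
```

In this sketch, two calls to `toggle()` take the pipeline from idle through recording to transcribing, and the later stages would call `advance()` when their work completes. Allowing exactly one successor per state is what keeps the lifecycle predictable.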