Show HN: Micdrop – open-source web framework for voice conversational AI (github.com)

0 points 3 hours ago | visit original

🤖 AI Summary

Micdrop is a new open-source TypeScript framework (MIT-licensed) that provides a modular stack for building real-time, browser-based voice conversations with AI agents. It bundles client and server libraries that handle microphone input, audio playback, voice activity detection (VAD), streaming, device selection, WebSocket communication and server-side audio orchestration. It also ships ready-made integrations for providers such as OpenAI, ElevenLabs, Mistral, Gladia and Cartesia, along with React hooks for wiring it into an app. Key packages include @micdrop/client, @micdrop/server, @micdrop/ai-sdk and provider adapters for text-to-speech (TTS), speech-to-text (STT) and LLM engines. The significance is practical: instead of relying on a single monolithic voice model, developers can mix and match LLM, TTS and STT engines per language or voice, control exactly when each API is invoked, and combine open-source and commercial components to lower cost and increase customization. Technically, it supports streaming voice-to-voice flows, VAD-driven capture, streaming TTS, and a framework-agnostic agent SDK for orchestrating conversation logic. For teams building voice assistants, multilingual bots or low-latency interactive agents, Micdrop offers granular pipeline control, provider flexibility, and production-ready primitives, with development docs covering how to build, test and publish the packages.
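
To make the "mix and match" idea concrete, here is a minimal sketch of a voice turn built from three swappable stages (STT, agent, TTS). It is not Micdrop's actual API: every name below (`SpeechToText`, `Agent`, `TextToSpeech`, `VoicePipeline`, `runTurn` and the stub stages) is hypothetical and exists only for illustration. The stages are synchronous stubs so the example stays self-contained; real stages would presumably be asynchronous and streaming.

```typescript
// Hypothetical interfaces for illustration only; NOT Micdrop's real API.
interface SpeechToText {
  transcribe(audio: Uint8Array): string;
}
interface Agent {
  reply(userText: string): string;
}
interface TextToSpeech {
  synthesize(text: string): Uint8Array;
}

// A pipeline is just one implementation per stage, so any stage can be swapped.
interface VoicePipeline {
  stt: SpeechToText;
  agent: Agent;
  tts: TextToSpeech;
}

// One conversational turn: audio in -> text -> reply text -> audio out.
function runTurn(pipeline: VoicePipeline, audio: Uint8Array): Uint8Array {
  const userText = pipeline.stt.transcribe(audio);
  const replyText = pipeline.agent.reply(userText);
  return pipeline.tts.synthesize(replyText);
}

// Stub helpers: treat each byte as one character code (stand-in for real audio).
function bytesToText(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}
function textToBytes(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    out[i] = text.charCodeAt(i) & 0xff;
  }
  return out;
}

// Stub stages standing in for real provider adapters.
const stubStt: SpeechToText = { transcribe: bytesToText };
const stubTts: TextToSpeech = { synthesize: textToBytes };
const echoAgent: Agent = { reply: (text) => "You said: " + text };
const shoutAgent: Agent = { reply: (text) => text.toUpperCase() + "!" };

const pipeline: VoicePipeline = { stt: stubStt, agent: echoAgent, tts: stubTts };

// Swapping a single stage leaves the rest of the pipeline untouched.
const shoutPipeline: VoicePipeline = { ...pipeline, agent: shoutAgent };
```

Calling `runTurn(pipeline, textToBytes("hello"))` passes the input through all three stages in order. Replacing only the `agent` field, as in `shoutPipeline`, changes the reply logic without touching the STT or TTS stages, which is the kind of per-stage provider choice the summary describes.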
