🤖 AI Summary
Developer Andupoto posted a proof‑of‑concept "Siri for sites" that embeds voice-driven workflows into any website using MCP and a small SDK. Two demos (a controllable maze and a Memoreco prototype) show natural, contextual voice commands—“buy this,” “add credit, then send me a receipt as PDF,” or “take me back”—that chain actions, use page context and lightweight history for undo/confirm flows, and don’t require users to learn UI affordances. The backend clusters failed intents to recommend new MCP tools, letting teams iterate the voice catalog based on real usage.
Under the hood the @memoreco/memoreco-js@0.2.3 SDK opens an audio-only session, streams to Speechmatics for sub‑second transcription, sends text to Groq for intent parsing, then posts structured tool calls to a configurable MCP endpoint; the MCP reply returns a structured payload to drive the UI. The pipeline is explicit (Record → Transcribe → Parse/tool selection → Execute → Result), supports multiple languages by toggling transcriptionMode to "streaming," and pauses for confirmations or chained steps. For AI/ML teams this is significant: it demonstrates a practical, modular stack for low-latency, context-aware voice assistants on the web that combines STT, LLM intent parsing, and deterministic action execution, ready to drop into e‑commerce or complex web workflows.
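The Record → Transcribe → Parse → Execute → Result chain described above can be sketched as plain functions. This is a minimal illustration, not the real @memoreco/memoreco-js API: every name, type, and payload shape here is an assumption, with stubs standing in for the Speechmatics, Groq, and MCP calls.

```typescript
// Hypothetical sketch of the voice pipeline. All identifiers are
// assumptions; the real SDK's API may differ.

type ToolCall = { tool: string; args: Record<string, unknown> };
type McpResult = { status: string; payload: Record<string, unknown> };

// Stub STT step: stands in for streaming audio to Speechmatics.
function transcribe(audioChunk: Uint8Array): string {
  return "add credit, then send me a receipt as PDF";
}

// Stub intent parser: stands in for a Groq LLM call that maps the
// transcript plus page context onto the site's MCP tool catalog.
// A chained command can yield multiple tool calls.
function parseIntent(
  transcript: string,
  context: { page: string },
): ToolCall[] {
  return [
    { tool: "account.addCredit", args: { amount: 10 } },
    { tool: "billing.sendReceipt", args: { format: "pdf" } },
  ];
}

// Stub executor: stands in for POSTing a structured tool call to the
// configured MCP endpoint and reading back the structured payload.
function executeToolCall(call: ToolCall): McpResult {
  return { status: "ok", payload: { tool: call.tool } };
}

// Runs the full chain. A real implementation would pause between
// steps for user confirmation on destructive or ambiguous actions,
// and record each result in a lightweight history to support undo.
function runVoiceCommand(
  audio: Uint8Array,
  context: { page: string },
): McpResult[] {
  const transcript = transcribe(audio);
  const calls = parseIntent(transcript, context);
  return calls.map(executeToolCall);
}
```

Keeping each stage a separate function mirrors the modularity the post highlights: any one stage (STT vendor, intent model, action executor) can be swapped without touching the others.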