OpenAI’s new AI voice models could grant ChatGPT powerful new abilities (www.techradar.com)

🤖 AI Summary
OpenAI has introduced three new AI voice models for real-time voice tasks: GPT-Realtime-2 for reasoning, GPT-Realtime-Translate for translation, and GPT-Realtime-Whisper for transcription. The models are aimed at developers building voice applications in which users hold natural conversations with an AI, get live translation, or see speech transcribed as they speak. GPT-Realtime-2 uses GPT-5-class reasoning, which lets it handle complex requests and adapt dynamically to how the user interacts with it. The launch matters to the AI/ML community because it broadens what voice technology can do in areas such as customer service, travel communication, and live content creation. The models can process speech in more than 70 input languages and translate it into 13 output languages, which lowers language barriers in real-time communication. API pricing includes $32 per million tokens for GPT-Realtime-2 and $0.034 per minute for translation.
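To put the quoted prices in perspective, here is a rough cost sketch in Python. It uses only the two figures from the summary; the function names and the sample workloads are made up for illustration, and it assumes a single flat rate per token, since the summary does not say whether the $32 figure applies to input tokens, output tokens, or both.

```python
# Prices as quoted in the summary (USD). Treating the token price as one
# flat rate is an assumption; actual billing may split input and output.
REALTIME_2_USD_PER_MILLION_TOKENS = 32.0
TRANSLATE_USD_PER_MINUTE = 0.034


def realtime_2_cost(tokens: int) -> float:
    """Estimated GPT-Realtime-2 cost for a given token count."""
    return tokens / 1_000_000 * REALTIME_2_USD_PER_MILLION_TOKENS


def translate_cost(minutes: float) -> float:
    """Estimated GPT-Realtime-Translate cost for a given audio duration."""
    return minutes * TRANSLATE_USD_PER_MINUTE


if __name__ == "__main__":
    # Hypothetical workloads: 250,000 tokens and a 30-minute call.
    print(f"GPT-Realtime-2, 250k tokens: ${realtime_2_cost(250_000):.2f}")
    print(f"Translation, 30 minutes:     ${translate_cost(30):.2f}")
```

At these rates, a hypothetical 30-minute translated call comes to about a dollar, and a quarter of a million tokens to about eight dollars.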