ElevenLabs CEO says AI audio models will be ‘commoditized’ over time (techcrunch.com)

🤖 AI Summary
At TechCrunch Disrupt 2025, ElevenLabs CEO Mati Staniszewski argued that while his team is still investing in bespoke audio models today — having “cracked some of the model architecture challenges” that make voices sound natural — those models will become commoditized over the next couple of years. In the short term, Staniszewski said, building your own models remains the fastest path to noticeably better voice quality and novel capabilities; over the long term, however, differences between offerings (outside of niche voices or less-common languages) will narrow. He expects many deployments to continue using different models for distinct use cases, while multimodal “fused” approaches (audio+video or audio+LLM), exemplified by systems like Google’s Veo 3, will grow rapidly.

For the AI/ML community this signals a shift from a pure model arms race to a focus on product integration, tooling, and deployment: differentiation will increasingly come from combinations of model expertise, application design, scale, latency, safety, fine-tuning pipelines, and partnerships rather than raw model novelty alone. ElevenLabs plans to pursue partnerships and open-source work to blend its audio IP with other modalities and LLMs, reflecting a broader industry move toward multimodal stacks and composable models — where inference orchestration, datasets, evaluation metrics, and user experience become as important as base architectures.