Zhengkid/AutoTTS: Agentic Discovery for Test-Time Scaling (github.com)

🤖 AI Summary
AutoTTS is a new project that shifts the design of test-time scaling (TTS) strategies from manual heuristics to environment-driven automatic search. Developed by researchers from UMD, UVA, WUSTL, Google, and Meta, the system uses a coding agent to iteratively propose and refine code-defined controllers without requiring gradient updates or multiple calls to large language models (LLMs). A full discovery run is estimated to cost only $39.9 and take 160 minutes. The discovered controller, named the Confidence Momentum Controller (CMC), uses techniques such as trend-based stopping and coupled width-depth control to adjust test-time inference dynamically. With a 69.5% reduction in token usage compared to existing baselines, the CMC maintains competitive accuracy across various models and offers better accuracy-token trade-offs, which lowers inference cost and allows more flexible test-time scaling. By integrating offline replay environments and leveraging past evaluations to inform future performance, AutoTTS points to a promising direction for designing adaptive inference strategies.
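To make "trend-based stopping" and "coupled width-depth control" more concrete, here is a minimal toy sketch in Python. It is not the CMC from the AutoTTS repository: the class `MomentumStopper`, its parameters (`beta`, `min_gain`, `patience`, `max_width`), and the widen-when-rising rule are all hypothetical choices made for illustration. The sketch assumes the controller receives one scalar confidence reading per reasoning step.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MomentumStopper:
    """Toy trend-based controller (hypothetical; not the AutoTTS CMC)."""
    beta: float = 0.5        # smoothing factor for the confidence trend
    min_gain: float = 0.01   # smoothed gain below this counts as "stalled"
    patience: int = 2        # consecutive stalled steps required to stop
    max_width: int = 8       # upper bound on parallel samples
    _last: float = field(default=0.0, init=False)
    _momentum: float = field(default=0.0, init=False)
    _stalled: int = field(default=0, init=False)
    _seen: int = field(default=0, init=False)

    def update(self, confidence: float) -> bool:
        """Feed one confidence reading; return True when sampling should stop."""
        if self._seen == 0:
            # First reading only initializes the baseline.
            self._last = confidence
            self._seen = 1
            return False
        gain = confidence - self._last
        # Exponential moving average of the step-to-step confidence gain.
        self._momentum = self.beta * self._momentum + (1 - self.beta) * gain
        self._last = confidence
        self._seen += 1
        if self._momentum < self.min_gain:
            self._stalled += 1
        else:
            self._stalled = 0
        return self._stalled >= self.patience

    def next_width(self, current_width: int) -> int:
        """Couple width to the trend: widen while rising, halve when flat."""
        if self._momentum >= self.min_gain:
            return min(self.max_width, current_width * 2)
        return max(1, current_width // 2)


def run(confidences: List[float]) -> int:
    """Return how many readings were consumed before the controller stopped."""
    ctrl = MomentumStopper()
    for step, c in enumerate(confidences, start=1):
        if ctrl.update(c):
            return step
    return len(confidences)
```

With these default values, `run([0.5, 0.6, 0.7] + [0.7] * 10)` stops at the seventh reading: the confidence plateaus at step 3, but the smoothed trend needs a few more steps to decay below `min_gain` for two consecutive updates. A real controller would derive its confidence signal and thresholds from the model and task, which this sketch leaves out.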