Can an LLM Be a Black-Box Optimizer? (posgeo.wordpress.com)

🤖 AI Summary
A Terra AI researcher tested whether an LLM can act as a black-box optimizer by giving a model an "optimizer" persona that, given the full history of past samples, returns only the next candidate point as a JSON array. The benchmark was a shifted 2D Rosenbrock function on the bounds [-3, 3]² with a strict budget of 20 function evaluations, with the LLM run statelessly (the full history resent on each call) using both a smaller model (o3-mini) and the larger gpt-5-mini. The setup forced the LLM to both explore and refine its proposals; an initial bias toward common minima (the origin and (1, 1)) was countered with random translations of the objective, randomized starting points, and stronger exploration prompts. The LLM optimizer was compared against Nelder–Mead and Bayesian optimization (Gaussian-process acquisition). Averaged over 10 random starts, the LLM approach, especially with gpt-5-mini, often matched or outperformed the traditional optimizers within the low-evaluation budget, showing a human-like progression from exploration to refinement. Key implications: LLMs can encode useful heuristic search strategies and be effective when gradients are unavailable and queries are costly, but they bring risks (hallucination, formatting errors), latency and monetary costs (~$10 in these experiments), and weaker robustness guarantees than formal optimizers. The author suggests hybrid interfaces, stateful prompting, and broader benchmarks as promising next steps.
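The stateless loop described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual code: the prompt wording, the bounds clamping, and the `ask_model` callback are assumptions, and the usage example substitutes a random-search stand-in for a real LLM call.

```python
import json
import random

def make_shifted_rosenbrock(seed=0):
    # Shifted 2D Rosenbrock: the minimum is moved away from the textbook
    # location (1, 1) by a random translation, so a model cannot simply
    # guess the usual optimum. Returns the function and its true minimizer.
    rng = random.Random(seed)
    sx, sy = rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5)
    def f(x, y):
        u, v = x - sx, y - sy
        return (1 - u) ** 2 + 100 * (v - u ** 2) ** 2
    return f, (1 + sx, 1 + sy)

def propose_next(history, ask_model):
    # Stateless call: the full history of (point, value) pairs is resent in
    # every prompt; the model must reply with ONLY a JSON array [x, y].
    prompt = (
        "You are a black-box optimizer. Bounds: [-3, 3]^2.\n"
        "History of (x, y) -> f(x, y):\n"
        + "\n".join(f"({x:.4f}, {y:.4f}) -> {v:.6f}" for (x, y), v in history)
        + "\nReply with ONLY a JSON array [x, y] for the next point to try."
    )
    reply = ask_model(prompt)
    x, y = json.loads(reply)  # a malformed reply raises here (formatting risk)
    clamp = lambda t: max(-3.0, min(3.0, float(t)))  # guard out-of-range proposals
    return clamp(x), clamp(y)

def optimize(f, ask_model, budget=20, start=(0.0, 0.0)):
    # Spend the whole budget: one evaluation at the start, then budget - 1
    # model-proposed points; return the best (point, value) pair seen.
    history = [(start, f(*start))]
    for _ in range(budget - 1):
        pt = propose_next(history, ask_model)
        history.append((pt, f(*pt)))
    return min(history, key=lambda h: h[1])
```

As a smoke test, a uniform-random "model" already exercises the whole loop; swapping `ask_model` for a real chat-completion call is the only change the LLM version needs:

```python
f, argmin = make_shifted_rosenbrock(seed=1)
rng = random.Random(0)
random_model = lambda prompt: json.dumps([rng.uniform(-3, 3), rng.uniform(-3, 3)])
best_pt, best_val = optimize(f, random_model, budget=20)
```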