Ollama 0.12.11 Brings Vulkan Acceleration (www.phoronix.com)

🤖 AI Summary
Ollama 0.12.11 adds optional Vulkan acceleration, enabled by launching ollama with the OLLAMA_VULKAN=1 environment variable, giving the runtime a GPU backend alternative to CUDA and ROCm. That makes Ollama far more usable on systems with open-source Vulkan drivers (RADV), on older AMD cards without ROCm support, and on other setups where ROCm isn't available. With this release, Vulkan graduates from an experimental option to a supported inference path, and early testing (e.g., with Llama.cpp) shows Vulkan can even outperform ROCm in some cases, so the change broadens hardware choices and simplifies local LLM deployment across diverse Linux machines.

The 0.12.11 release also brings practical developer and app-level improvements: a Logprobs API for token-level probabilities (useful for interpretability, sampling diagnostics, and constrained decoding), WebP image support in the new app, improved rendering performance, and smarter scheduling that prefers discrete GPUs over integrated ones. Together, these changes expand hardware compatibility, improve runtime behavior, and add tooling for more detailed model outputs; release details and downloads are available on the Ollama GitHub.
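As a minimal sketch of opting in to the new backend, the snippet below starts the Ollama server with OLLAMA_VULKAN=1 set. The variable name comes from the release notes; everything else (Python as the launcher, the `ollama` binary being on PATH) is an assumption for illustration, since in practice you would just export the variable in your shell.

```python
import os
import subprocess

# Copy the current environment and opt in to the Vulkan GPU backend.
# OLLAMA_VULKAN=1 is the switch documented for the 0.12.11 release.
env = os.environ.copy()
env["OLLAMA_VULKAN"] = "1"

# Equivalent to running `OLLAMA_VULKAN=1 ollama serve` in a shell;
# Popen leaves the server process running in the background.
server = subprocess.Popen(["ollama", "serve"], env=env)
```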
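For the Logprobs API, here is a hedged sketch of requesting token-level probabilities from a local Ollama server over its standard HTTP endpoint. The `logprobs` request field and the shape of the response are assumptions based on the release description, not a verified schema; check the release notes on the Ollama GitHub for the exact contract, and note that the model name is just an example of a locally pulled model.

```python
import json
import urllib.request

payload = {
    "model": "llama3.2",              # any locally pulled model (example)
    "prompt": "The capital of France is",
    "stream": False,
    "logprobs": True,                 # new in 0.12.11 (field name assumed)
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",   # default Ollama port
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body.get("response"))
# Token-level log probabilities, useful for interpretability and
# sampling diagnostics (response key name is an assumption):
print(body.get("logprobs"))
```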