Show HN: Find the best local LLM for your hardware, ranked by benchmarks (github.com)

🤖 AI Summary
A new tool, **whichllm**, helps users find the best local large language model (LLM) for their hardware. It automatically detects the user's GPU, CPU, and RAM, then ranks models available on HuggingFace by real performance benchmarks rather than by which models merely fit in memory. For instance, when run on an RTX 4090, the tool recommends Qwen/Qwen3.6-27B as the top model on its performance metrics, even though the larger Qwen/Qwen3-32B would also fit. This is significant for the AI/ML community because it addresses a common challenge: selecting the best-performing model within specific hardware constraints.

The tool emphasizes an evidence-based ranking system that draws on diverse benchmark sources, keeping model assessments fresh and relevant. Features such as smart adaptation to different hardware configurations, live HuggingFace data integration, and user-friendly command execution streamline the process of deploying an LLM. By focusing on measured real-world performance, whichllm helps developers make informed choices, potentially accelerating adoption of and experimentation with local LLMs across diverse applications.
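The core idea is straightforward to sketch: detect available VRAM, filter out models that won't fit, and sort the remainder by benchmark score. The snippet below is a minimal illustration of that approach, not whichllm's actual code; the catalog entries, VRAM requirements, and scores are placeholder values, and GPU detection assumes an NVIDIA card with `nvidia-smi` on the PATH.

```python
import subprocess

# Hypothetical toy catalog: (model id, approx VRAM needed in GB, benchmark score).
# VRAM needs and scores here are illustrative placeholders, not real measurements.
CATALOG = [
    ("Qwen/Qwen3-32B", 20.0, 81.0),
    ("Qwen/Qwen3.6-27B", 17.0, 83.5),
    ("meta-llama/Llama-3.1-8B-Instruct", 6.0, 70.2),
]

def detect_vram_gb() -> float:
    """Query total VRAM via nvidia-smi (NVIDIA GPUs only)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return float(out.splitlines()[0]) / 1024  # MiB -> GiB

def rank(vram_gb: float):
    """Keep models that fit in VRAM, then sort by benchmark score, best first."""
    fits = [m for m in CATALOG if m[1] <= vram_gb]
    return sorted(fits, key=lambda m: m[2], reverse=True)

if __name__ == "__main__":
    for name, need, score in rank(detect_vram_gb()):
        print(f"{name}: needs ~{need} GB, score {score}")
```

Under this sketch, a 24 GB card would surface the hypothetical 27B entry above the 32B one, mirroring the RTX 4090 example: the best-scoring model that fits wins, not the largest.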