Show HN: Apples2Oranges. Ollama with hardware telemetry. On-device LLM playground. (github.com)

🤖 AI Summary
bitlyte has released apples2oranges, an open-source desktop playground for running and profiling small on-device LLMs, with built-in hardware telemetry and side-by-side model comparison. Built with a Rust backend on Tauri and a React/TypeScript UI, it uses llama.cpp (via the llama-cpp-2 crate) to run GGUF-format text-generation models locally.

The app captures real-time telemetry: per-core CPU utilization and temperatures, GPU stats, power and energy draw, RAM usage, tokens per second (TPS), and time to first token (TTFT). Features include dual chat, token-generation analysis, quantized-variant comparisons, smart load-run-unload memory cycles, waiting for a thermal baseline so benchmark runs start from consistent conditions, and session persistence in SQLite. Visualizations include 3D scatter, radar, and parallel-coordinates plots, plus multi-session comparisons of up to 7 sessions.

This matters because energy, thermal, memory, and quantization trade-offs are central to deploying LLMs at the edge; apples2oranges gives developers, researchers, and learners a lightweight, reproducible way to correlate model quality with hardware impact locally rather than in cloud black boxes. Current limits: it is macOS-focused (Apple Silicon preferred), supports only local GGUF text models, has no API or cloud integrations, and shows early-release rough edges in performance and UX. The project is Apache-2.0 licensed, invites contributions, and plans Windows/Linux support, more inference engines, multimodal models, and cloud-comparison features, making it a practical starting point for edge LLM benchmarking and optimization.
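To make the two headline metrics concrete, here is a minimal Rust sketch of how TTFT and TPS can be derived from a token-generation loop. This is not apples2oranges' actual code: the `next_token` stub is a hypothetical stand-in for the llama-cpp-2 decode calls, and it assumes the common convention that TPS is measured over the decode phase only.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token source standing in for a llama.cpp decode loop.
/// In apples2oranges the tokens would come from llama-cpp-2; this stub
/// just sleeps to mimic prompt processing (first call) and decoding.
fn next_token(i: usize) -> Option<&'static str> {
    if i >= 32 {
        return None; // end of generation
    }
    std::thread::sleep(Duration::from_millis(if i == 0 { 250 } else { 20 }));
    Some("tok")
}

fn main() {
    let start = Instant::now();
    let mut ttft: Option<Duration> = None;
    let mut count = 0usize;

    while let Some(_token) = next_token(count) {
        if count == 0 {
            // Time to first token: prompt processing + first decode step.
            ttft = Some(start.elapsed());
        }
        count += 1;
    }

    let total = start.elapsed();
    // TPS is conventionally computed over the decode phase, i.e. the
    // tokens after the first one, so prompt-processing time is excluded.
    if let (Some(ttft), true) = (ttft, count > 1) {
        let decode = total - ttft;
        let tps = (count - 1) as f64 / decode.as_secs_f64();
        println!("TTFT: {:.0} ms", ttft.as_secs_f64() * 1000.0);
        println!("TPS:  {:.1} tokens/s over {} decode tokens", tps, count - 1);
    }
}
```

Separating TTFT from decode-phase TPS matters for the quantized-variant comparisons the app targets: folding prompt-processing time into TPS would let prompt length skew the throughput numbers between runs.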