Self-hosted LLM cost monitoring (github.com)

🤖 AI Summary
pulsecost-oss is an open-source, self-hosted proxy and dashboard that logs, analyzes, and helps optimize LLM usage and costs. It sits between clients and providers (currently OpenAI), proxying /v1/chat/completions and /v1/embeddings calls while recording tokens, latency, cost, and cache hits/misses. Features include request-deduplication caching to avoid duplicate spend, real-time analytics with AmCharts-powered KPIs (tokens, cost, cost per 1K tokens, model breakdowns, response times), per-API-key attribution, interactive charts, and activity logs. It ships with a hexagonal-architecture Express backend, a Vite+React dashboard served via Nginx, Docker Compose dev/prod setups (SQLite by default, Postgres supported), and database UIs (pgAdmin/phpMyAdmin) for inspection.

Technically, PulseCost enforces a client-driven key model: clients supply their own OpenAI key in Authorization: Bearer <key>, the proxy forwards it upstream unchanged (no server-side API-key storage), and a deterministic argon2-based hash (fixed salt) identifies keys for analytics. This yields O(1) lookups, a single DB entry per client key, and a reduced attack surface.

The project is AGPL-v3-licensed, easily extensible (storage, cache, and provider backends are swappable), and aimed at teams or SaaS operators who need transparent cost attribution, caching-savings estimates, and per-key performance/cost tracking while keeping billing on client accounts.
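The request-deduplication idea can be sketched in a few lines: hash the request payload, and serve an identical payload from cache instead of calling the provider again. This is an illustrative TypeScript sketch, not PulseCost's actual code; `ChatRequest`, `cacheKey`, and `completeWithDedup` are hypothetical names, and a real deployment would add TTLs and persistent storage.

```typescript
import { createHash } from "node:crypto";

// Minimal shape of a chat-completion request (illustrative, not the full OpenAI schema).
type ChatRequest = { model: string; messages: { role: string; content: string }[] };

// In-memory cache mapping request-hash -> provider response.
const cache = new Map<string, string>();

function cacheKey(req: ChatRequest): string {
  // Hash the serialized request so byte-identical payloads collide.
  return createHash("sha256").update(JSON.stringify(req)).digest("hex");
}

async function completeWithDedup(
  req: ChatRequest,
  callProvider: (r: ChatRequest) => Promise<string>,
): Promise<{ response: string; cacheHit: boolean }> {
  const key = cacheKey(req);
  const hit = cache.get(key);
  if (hit !== undefined) {
    // Cache hit: no upstream call, so no additional token spend.
    return { response: hit, cacheHit: true };
  }
  const response = await callProvider(req);
  cache.set(key, response);
  return { response, cacheHit: false };
}
```

Logging the `cacheHit` flag per request is what lets a dashboard like this estimate caching savings: cached responses carry a known token count and model price, so avoided spend is directly computable.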
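The key-attribution scheme is also worth a sketch: a deterministic hash of the client's API key gives a stable fingerprint for analytics without ever persisting the key. PulseCost uses argon2 with a fixed salt; the sketch below substitutes Node's built-in scrypt so it needs no third-party dependency, and `FIXED_SALT` / `keyFingerprint` are illustrative names, not the project's identifiers.

```typescript
import { scryptSync } from "node:crypto";

// A fixed salt makes the hash deterministic: the same client key always maps
// to the same fingerprint, so analytics can group requests per key with an
// O(1) lookup and a single DB row per key. (Hypothetical constant.)
const FIXED_SALT = "pulsecost-demo-salt";

function keyFingerprint(apiKey: string): string {
  // Derive a 32-byte digest from the key; only this digest is stored,
  // never the raw key, which shrinks the attack surface.
  return scryptSync(apiKey, FIXED_SALT, 32).toString("hex");
}
```

The trade-off of a fixed salt is that determinism is the point: per-key random salts would be stronger against brute-force but would break the one-row-per-key lookup the summary describes.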