🤖 AI Summary
Prompt Refiner is a new lightweight, zero-dependency Python library that cleans and compresses LLM inputs before sending them to APIs. It provides modular operations—Cleaner (StripHTML, NormalizeWhitespace, FixUnicode), Compressor (Deduplicate, TruncateTokens with head/tail/middle_out), Scrubber (RedactPII), and Analyzer (CountTokens)—exposed via a pipe-friendly API (a | b) or fluent .pipe() style. The library claims 10–20% typical token savings (with “Aggressive” strategies reaching ~15% average and per-test RAG savings of 17–74%), plus production features like type hints, test coverage, and an online demo. Example: HTML/whitespace cleaning cut a 150-token prompt to 85 tokens (43% saved); at 1M input tokens/month a 15% reduction equates to roughly $54 saved on GPT‑4 input costs.
Technically, Prompt Refiner emphasizes negligible latency and easy integration: operations add <0.5ms per 1k tokens (minimal ≈0.05ms, standard ≈0.25ms), so refining is <0.5% of typical end-to-end LLM request time. Benchmarks report small quality trade-offs—Minimal: 4.3% token reduction with cosine ≈0.987; Aggressive: 15% with cosine ≈0.964 and slightly lower judge approval—so users can tune presets (Minimal/Standard/Aggressive) or customize thresholds (e.g., dedupe similarity, PII types, truncate limits). Open-source and pip-installable (llm-prompt-refiner), it’s positioned as a practical, low-overhead tool for RAG, chatbots, and production cost optimization.
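A back-of-envelope check makes the quoted cost and latency figures concrete. The $30-per-1M-token GPT‑4 input price and the 1,000 ms end-to-end request time are assumptions added here for illustration; the token counts and per-1k-token overhead come from the summary:

```python
# Assumed baselines (not stated in the source): classic GPT-4 input pricing
# and a 1-second end-to-end request.
PRICE_PER_M_TOKENS = 30.00        # USD per 1M input tokens (assumption)
REQUEST_MS = 1_000                # end-to-end request time (assumption)

monthly_tokens = 1_000_000        # from the summary
reduction = 0.15                  # 15% token reduction

saved_tokens = monthly_tokens * reduction              # 150,000 tokens/month
monthly_savings = saved_tokens / 1_000_000 * PRICE_PER_M_TOKENS
print(f"${monthly_savings:.2f}/month, ${monthly_savings * 12:.2f}/year")
# → $4.50/month, $54.00/year (matches the ~$54 figure if read annually)

# Latency: 0.25 ms per 1k tokens (standard preset) on a 4k-token prompt.
overhead_ms = 0.25 * 4
print(f"{overhead_ms:.1f} ms refine = {overhead_ms / REQUEST_MS:.2%} of request")
# → 1.0 ms refine = 0.10% of request, well under the quoted 0.5%
```

Under these assumed baselines the numbers are self-consistent: the ~$54 saving reads naturally as an annual figure, and the refining overhead stays comfortably below the claimed 0.5% of request time.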