hacker news Hacker News
  1. new
  2. show
  3. ask
  4. jobs
Hi HN,

While building RAG agents, I noticed a lot of token budget was wasted on formatting overhead (HTML tags, JSON structure, whitespace). Existing solutions felt too heavy (often requiring torch/transformers), so I wrote this lightweight, zero-dependency library to solve it.

It includes strategies for context packing, PII redaction, and tool output compression. Benchmarks show it can save ~15% of tokens with negligible latency overhead (<0.5ms).

Happy to answer any questions!

loading...