I was tired of stitching together Langfuse for tracing, promptfoo for red teaming and evals, and custom scripts for CI/CD. It was a mess, so I built EvalsHub.

EvalsHub does all of it in one place: automatic production scoring, red teaming, prompt versioning, and CI/CD integration. It takes you from zero to full eval coverage in 30 minutes.

Would love brutal feedback from anyone shipping AI in production.

evalshub.ai
