Open source tools for the problems that don't have clean solutions.

Arbiter

The only LLM evaluation framework that shows you exactly what your evaluations cost. Arbiter tracks every LLM interaction, providing real-time cost calculations so you can see the financial impact of your evaluation choices.

Learn more →

Engram

Memory is strange. AI memory systems have an accuracy crisis: benchmarks show answer accuracies below 56%. Engram preserves ground truth, tracks confidence, and prevents hallucinations.

Learn more →

Tessera

Tessera coordinates breaking changes between data producers and consumers. It prevents operational disruptions by requiring consumer acknowledgment before breaking schema changes can be deployed.

Learn more →