Most tools watch one agent. We grade the ensemble.
Coordination, not outputs
Output evals grade the final reply. Choir grades the rehearsal — who deferred, who masked, who broke mandate, who never spoke at all.
False harmony detection
Surface the failure mode no other tool names: a smooth ensemble answer that conceals suppressed dissent, missed escalation, or authority breach.
Signed Choir Receipts
Every rehearsal becomes a verifiable, forwardable artefact your risk team, your customer, and your future self can actually read.
Adapter-agnostic
Bring traces from LangGraph, CrewAI, AutoGen, or a raw JSONL bundle. We rehearse the score you already wrote.
