judge-calibration
| Field | Value |
|---|---|
| ID | judge-calibration |
| Version | 1.0.0 |
| Mode | llm-judge |
| Layer | Anthropic |
| Category | eval |
| Severity | 🟡 warning |
| SLA | 30,000 ms |
| Depends on | none |
| Source | packages/ai/__gates__/judge-calibration.gate.ts |
What it asserts
LLM-judge agreement with human ratings, measured as Cohen's κ, stays above 0.85. If κ falls below that threshold, the gate auto-demotes dependent judge gates to shadow.
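For readers unfamiliar with the statistic, Cohen's κ compares observed agreement p_o with the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). The following is a minimal sketch of that calculation, not the code in the gate file. The names `cohensKappa`, `KAPPA_THRESHOLD`, and `isCalibrated` are hypothetical, and the sketch assumes each item has one categorical label from the judge and one from a human.

```typescript
// Hypothetical helper for illustration; not taken from judge-calibration.gate.ts.
type Label = string;

// Threshold quoted on this page. The strict ">" comparison below is an assumption.
const KAPPA_THRESHOLD = 0.85;

function cohensKappa(judge: Label[], human: Label[]): number {
  if (judge.length !== human.length || judge.length === 0) {
    throw new Error("rating arrays must be non-empty and of equal length");
  }
  const n = judge.length;
  const judgeCounts = new Map<Label, number>();
  const humanCounts = new Map<Label, number>();
  let agreements = 0;

  for (let i = 0; i < n; i++) {
    const j = judge[i];
    const h = human[i];
    if (j === h) agreements++;
    judgeCounts.set(j, (judgeCounts.get(j) ?? 0) + 1);
    humanCounts.set(h, (humanCounts.get(h) ?? 0) + 1);
  }

  // p_o: fraction of items where both raters chose the same label.
  const observed = agreements / n;

  // p_e: chance agreement, summed over labels from each rater's marginal rates.
  let expected = 0;
  judgeCounts.forEach((count, label) => {
    expected += (count / n) * ((humanCounts.get(label) ?? 0) / n);
  });

  // Degenerate case: both raters used one identical label for every item.
  if (expected === 1) return 1;
  return (observed - expected) / (1 - expected);
}

function isCalibrated(kappa: number): boolean {
  return kappa > KAPPA_THRESHOLD;
}
```

For example, with judge labels `["pass", "pass", "fail", "fail"]` and human labels `["pass", "pass", "fail", "pass"]`, observed agreement is 0.75 and chance agreement is 0.5, so κ is 0.5, which `isCalibrated` would reject.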
Run it locally
bun run gates --gate=judge-calibration
See also
- llm-judge mode
- Anthropic layer
- Allowlists — how to bound a known finding with an expiration
- Contributing — how to evolve this gate or write a new one
Generated by apps/design/scripts/generate-gate-pages.ts from the gate's source-of-truth metadata. To change this page, edit the metadata in the gate file (description, version, and so on).