judge-calibration

| Field | Value |
| --- | --- |
| ID | judge-calibration |
| Version | 1.0.0 |
| Mode | llm-judge |
| Layer | Anthropic |
| Category | eval |
| Severity | 🟡 warning |
| SLA | 30,000 ms |
| Depends on | none |
| Source | packages/ai/__gates__/judge-calibration.gate.ts |

What it asserts

Agreement between the LLM judge and human ratings, measured by Cohen's κ, stays above 0.85. If κ falls below that threshold, the gate automatically demotes dependent judge gates to shadow.
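
As a rough illustration of the assertion above, the sketch below computes Cohen's κ for two equal-length arrays of categorical labels and compares the result to the 0.85 threshold. It is not the gate's implementation: `cohensKappa`, `calibrationVerdict`, and `KAPPA_THRESHOLD` are hypothetical names, unweighted κ over categorical labels is assumed, and treating a value exactly at the threshold as passing is also an assumption.

```typescript
// Hypothetical sketch; none of these names come from judge-calibration.gate.ts.
const KAPPA_THRESHOLD = 0.85; // threshold stated in the gate description

// Unweighted Cohen's kappa: (observed - expected) / (1 - expected),
// where "expected" is the chance agreement implied by each rater's label frequencies.
function cohensKappa(judge: string[], human: string[]): number {
  if (judge.length === 0 || judge.length !== human.length) {
    throw new Error("rating arrays must be non-empty and of equal length");
  }
  const n = judge.length;
  const judgeCounts = new Map<string, number>();
  const humanCounts = new Map<string, number>();
  let agreements = 0;

  for (let i = 0; i < n; i++) {
    const j = judge[i]!;
    const h = human[i]!;
    if (j === h) agreements++;
    judgeCounts.set(j, (judgeCounts.get(j) ?? 0) + 1);
    humanCounts.set(h, (humanCounts.get(h) ?? 0) + 1);
  }

  const observed = agreements / n;
  let expected = 0;
  judgeCounts.forEach((count, label) => {
    expected += (count / n) * ((humanCounts.get(label) ?? 0) / n);
  });

  // When both raters use one identical label throughout, kappa is 0/0.
  // Returning 1 in that case is a convention chosen for this sketch.
  if (expected === 1) return 1;
  return (observed - expected) / (1 - expected);
}

// Assumption: only values strictly below the threshold trigger demotion.
function calibrationVerdict(kappa: number): "pass" | "demote-to-shadow" {
  return kappa < KAPPA_THRESHOLD ? "demote-to-shadow" : "pass";
}

// Made-up ratings: the judge and the human agree on 9 of 10 items.
const judgeRatings = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"];
const humanRatings = ["a", "a", "a", "a", "a", "a", "b", "b", "b", "b"];
const exampleKappa = cohensKappa(judgeRatings, humanRatings);
console.log(exampleKappa, calibrationVerdict(exampleKappa));
```

In the made-up ratings, raw agreement is 90% but chance agreement is 50%, so κ works out to about 0.8 and the verdict is a demotion. That gap between raw agreement and κ is the reason a chance-corrected statistic is a stricter bar than a simple match rate.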

Run it locally

bun run gates --gate=judge-calibration

Generated by apps/design/scripts/generate-gate-pages.ts from the gate's source-of-truth metadata. To change this page, edit the corresponding fields (description, version, and so on) in the gate file.