Anthropic — Eval

The core question

Does the AI still produce the expected behavior?

Philosophy

Anthropic-shaped gates apply eval discipline to AI outputs: capability ('the agent can do X'), safety ('the agent refuses Y'), determinism ('same input → same output where required'), grounding ('claims trace back to a source'), and red-team resistance ('adversarial corpora fail to redirect the agent').

How Matter uses it

Validates AI-emitted artifacts (CardSpecs, MCP tool calls, AI-authored documents, mock-founder reasoning chains) against captured baselines. Catches silent regressions across model upgrades or prompt edits, a failure mode the classic testing pyramid was never built for.
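One way to realize baseline comparison is to fingerprint each artifact and diff against the captured hash. A minimal sketch, assuming artifacts are JSON-serializable dicts; `fingerprint` and `regression_gate` are illustrative names, not Matter's actual API.

```python
import hashlib
import json

def fingerprint(artifact: dict) -> str:
    """Stable hash of an artifact, with key order normalized."""
    canonical = json.dumps(artifact, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def regression_gate(current: dict, baseline_fp: str) -> bool:
    """Pass only if the artifact matches its captured baseline."""
    return fingerprint(current) == baseline_fp

baseline_fp = fingerprint({"title": "Card", "status": "draft"})
# Key order does not matter; content drift does.
assert regression_gate({"status": "draft", "title": "Card"}, baseline_fp)
assert not regression_gate({"title": "Card", "status": "final"}, baseline_fp)
```

Exact-hash comparison suits deterministic artifacts; fuzzier surfaces (free-form prose) would need an llm-judge or similarity threshold instead.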

Common modes

eval, safety, red-team, prompt-regression, hallucination, llm-judge, determinism.

Production gates today

agent-output-schema (CardZ validity), voice-judge (shadow). Future: mock-founder capability / safety, prompt regression, assistant determinism, red-team corpus, hallucination grounding.
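A schema-validity gate like agent-output-schema can be sketched as a field-by-field check. The real CardZ schema's fields are not specified here; `title`, `status`, and the allowed status set are hypothetical stand-ins.

```python
# Hypothetical sketch of an agent-output-schema gate. Real CardZ fields
# are not documented here; these are illustrative placeholders.

REQUIRED_FIELDS = {"title": str, "status": str}
ALLOWED_STATUS = {"draft", "review", "final"}

def card_schema_gate(card: dict) -> list[str]:
    """Return a list of violations; an empty list means the gate passes."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(card.get(field), expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    if card.get("status") not in ALLOWED_STATUS:
        errors.append("status: not in allowed set")
    return errors

assert card_schema_gate({"title": "Intro", "status": "draft"}) == []
assert card_schema_gate({"title": 7, "status": "wat"}) != []
```

Returning a violation list rather than a boolean keeps gate failures diagnosable in CI logs.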

Industry inspiration

Inspired by Anthropic's eval-first methodology: capability evals, safety evals, and joint capability-safety evals are how AI labs ship behavior changes safely. Matter applies the same discipline to its agent surface.
