Claude --explain mode
What it does
bun run gates --explain takes the Findings produced by the runner and pipes each one into Claude with a Matter-specific prompt. Claude returns a short repair suggestion: 2-4 sentences referencing the file path and naming the specific change.
Strictly local and opt-in. Never auto-runs in CI. Never mutates code. Caches by fingerprint so re-running on the same findings hits the cache.
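For orientation, here is a minimal sketch of the flow in TypeScript. Everything named here (the Finding shape, explainFinding, the model choice) is an assumption for illustration, not the package's actual API; only the Anthropic SDK calls are real.

```ts
import Anthropic from "@anthropic-ai/sdk";

// Assumed Finding shape -- the real runner type may carry more fields.
interface Finding {
  fingerprint: string; // stable hash used as the cache key
  gate: string;        // which gate produced the finding
  file: string;        // path the suggestion should reference
  message: string;     // what the gate reported
}

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Hypothetical helper: one finding in, one short repair suggestion out.
async function explainFinding(finding: Finding): Promise<string> {
  const response = await client.messages.create({
    model: "claude-3-5-sonnet-latest", // model choice is an assumption
    max_tokens: 300,
    messages: [
      {
        role: "user",
        content:
          "A Matter gate failed. In 2-4 sentences, suggest a repair, " +
          "referencing the file path and naming the specific change.\n\n" +
          JSON.stringify(finding, null, 2),
      },
    ],
  });
  const block = response.content[0];
  return block.type === "text" ? block.text : "";
}
```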
When to use it
- You hit a gate failure you don't immediately understand.
- You're triaging a long list of findings and want a sketch of what each one needs.
- A new contributor is onboarding and wants the framework to explain its own outputs.
When not to use it
- You already understand the fix. Just write it.
- The finding is straightforward (e.g. a missing route). The explanation costs tokens; the fix is one line.
- In CI. The framework explicitly disables auto-running --explain in CI to keep token costs bounded.
Requirements
ANTHROPIC_API_KEY in the environment (loaded from .env.local via the standard Matter env path). Without it, the framework returns a placeholder ("--explain disabled: ANTHROPIC_API_KEY not set") so the CLI doesn't crash.
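A sketch of that guard, reusing the hypothetical explainFinding from above; the function name is an assumption, but the placeholder text matches the one quoted here.

```ts
// Hypothetical guard: fall back to a placeholder instead of throwing,
// so `bun run gates --explain` keeps working without a key.
async function explainOrPlaceholder(finding: Finding): Promise<string> {
  if (!process.env.ANTHROPIC_API_KEY) {
    return "--explain disabled: ANTHROPIC_API_KEY not set";
  }
  return explainFinding(finding);
}
```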
How it differs from an LLM-as-judge
--explain is diagnostic — it explains what's wrong and what to do, in human language, for a developer reading the CI output.
LLM-as-judge (mode: "llm-judge") is evaluative — Claude grades a soft output (voice, tone, alignment) against a rubric and returns a pass / fail verdict. Different role, different primitive (llm-judge.ts vs. claude-explain.ts).
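One way to picture the split, as illustrative type signatures only (the real exports of claude-explain.ts and llm-judge.ts may differ):

```ts
// --explain: prose for a human reading CI output.
declare function claudeExplain(finding: Finding): Promise<string>;

// llm-judge: a graded verdict the runner can act on.
interface JudgeVerdict {
  pass: boolean;     // the pass / fail verdict
  rationale: string; // why the rubric passed or failed
}
declare function llmJudge(output: string, rubric: string): Promise<JudgeVerdict>;
```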
Caching
Each fingerprint maps to one suggestion in memory for the duration of a runner invocation. The cache doesn't persist across runs today; a future slice writes to .matter/explain-cache/ so the cache survives re-runs.
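Under those constraints the cache can be as small as a Map keyed by fingerprint; a sketch, again reusing the hypothetical helpers above:

```ts
// One Map per runner invocation; discarded when the process exits.
const explainCache = new Map<string, string>();

async function explainCached(finding: Finding): Promise<string> {
  const hit = explainCache.get(finding.fingerprint);
  if (hit !== undefined) return hit; // same fingerprint, same run: no second call
  const suggestion = await explainOrPlaceholder(finding);
  explainCache.set(finding.fingerprint, suggestion);
  return suggestion;
}
```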
The prompt template
Lives at packages/testing/src/primitives/claude-explain.prompt.md (versioned with the package). The current prompt is intentionally short: it passes the Finding as JSON and asks for a 2-4 sentence repair plan. Refinements happen in versioned PRs so suggestion-quality regressions stay reviewable.
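For flavor only, a template in that spirit might read as below; this is a hypothetical sketch, not the shipped file, and the {{finding}} placeholder syntax is assumed.

```md
A Matter gate produced the finding below. In 2-4 sentences, explain what is
wrong and name the specific change that fixes it. Reference the file path.

Finding (JSON):

{{finding}}
```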