Layers

The layer axis tells you what question a gate answers. It is independent of mode, which says how the gate answers it. See /testing/pyramid.

Stripe — contract

Is layer A still in sync with layer B?

Stripe-shaped gates assert parity between two surfaces — OpenAPI ↔ generated MCP catalog, OpenAPI ↔ SDK types, exemplar CardSpec ↔ route manifest, dispatcher's resourceUrl table ↔ canonical routes. Drift between layers is the failure shape.

Primary modes: static, drift, contract.
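A parity check of this kind can be sketched as a set comparison between the names exposed by two surfaces. The sketch below is illustrative only: `diffSurfaces`, `DriftReport`, and the sample operation names are hypothetical and are not taken from Matter's codebase.

```typescript
// Hypothetical shape for a drift report between two surfaces.
interface DriftReport {
  missingFromRight: string[]; // present on the left surface only
  missingFromLeft: string[];  // present on the right surface only
  inSync: boolean;
}

// Compare the names exposed by two surfaces and report any drift.
function diffSurfaces(
  left: readonly string[],
  right: readonly string[],
): DriftReport {
  const leftSet = new Set(left);
  const rightSet = new Set(right);
  const missingFromRight = Array.from(leftSet)
    .filter((name) => !rightSet.has(name))
    .sort();
  const missingFromLeft = Array.from(rightSet)
    .filter((name) => !leftSet.has(name))
    .sort();
  return {
    missingFromRight,
    missingFromLeft,
    inSync: missingFromRight.length === 0 && missingFromLeft.length === 0,
  };
}

// Illustrative data only: operation IDs from an OpenAPI document
// versus tool names in a generated catalog.
const openApiOperations = ["createCard", "getCard", "listCards"];
const catalogTools = ["createCard", "getCard", "deleteCard"];

const driftReport = diffSurfaces(openApiOperations, catalogTools);
if (!driftReport.inSync) {
  console.log("Drift detected:", JSON.stringify(driftReport));
}
```

Reporting both directions of the difference keeps the failure message actionable: the gate says which surface is missing which name, not merely that the two disagree.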

Vercel — budget

Did we regress past a measurable ceiling?

Vercel-shaped gates assert numeric ceilings on observable metrics — LCP, TTFB, CLS, bundle size, axe violations, cost per AI call, p95 latency. Regression is the failure shape.

Primary modes: static (for bundle-size analysis), visual, e2e, synthetic. The budget primitive normalises the ceiling-comparison pattern.
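The ceiling-comparison pattern might look roughly like the sketch below. It does not show the actual budget primitive: `Budget`, `checkBudgets`, the metric names, and every ceiling value are assumptions chosen for illustration.

```typescript
// Hypothetical budget definition: a named metric with a numeric ceiling.
interface Budget {
  metric: string;
  ceiling: number;
  unit: string;
}

interface BudgetViolation {
  metric: string;
  observed: number | null; // null when no measurement was recorded
  ceiling: number;
  unit: string;
}

// Return every budget whose observed value exceeds its ceiling.
// A missing measurement is reported as a violation so that a broken
// collector cannot silently pass the gate (a design assumption).
function checkBudgets(
  budgets: readonly Budget[],
  observed: ReadonlyMap<string, number>,
): BudgetViolation[] {
  const violations: BudgetViolation[] = [];
  for (const budget of budgets) {
    const value = observed.get(budget.metric);
    if (value === undefined) {
      violations.push({ ...budget, observed: null });
    } else if (value > budget.ceiling) {
      violations.push({ ...budget, observed: value });
    }
  }
  return violations;
}

// Illustrative ceilings and measurements only.
const budgets: Budget[] = [
  { metric: "lcpMs", ceiling: 2500, unit: "ms" },
  { metric: "cls", ceiling: 0.1, unit: "score" },
  { metric: "bundleSizeKb", ceiling: 250, unit: "KB" },
];
const measurements = new Map<string, number>([
  ["lcpMs", 2100],
  ["cls", 0.04],
  ["bundleSizeKb", 310],
]);

const budgetViolations = checkBudgets(budgets, measurements);
for (const v of budgetViolations) {
  console.log(`${v.metric}: ${v.observed} ${v.unit} exceeds ceiling ${v.ceiling} ${v.unit}`);
}
```

Because every metric reduces to "observed value versus ceiling", the same function can serve bundle size, latency, and cost checks alike; only the collector that fills the measurements map differs per mode.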

Anthropic — eval

Does the AI still produce the expected behavior?

Anthropic-shaped gates assert AI-output baselines: capability ("the agent can do X"), safety ("the agent refuses Y"), determinism ("same input → same output"), grounding ("every claim traces back to a source"), and red-team resistance ("adversarial corpora fail to redirect the agent"). Silent behavior drift is the failure shape.

Primary modes: eval, safety, red-team, prompt-regression, hallucination, llm-judge, determinism.
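As one example, the determinism baseline ("same input → same output") can be sketched as repeated invocation followed by a uniqueness check. This is a minimal sketch under stated assumptions: `Agent` is a synchronous stand-in for what would normally be an asynchronous model call, and `checkDeterminism` and both stub agents are hypothetical.

```typescript
// Synchronous stand-in for an AI call; a real agent would be async.
type Agent = (input: string) => string;

interface DeterminismResult {
  deterministic: boolean;
  distinctOutputs: string[];
}

// Call the agent `runs` times with the same input and collect the
// distinct outputs. Exactly one distinct output means deterministic.
function checkDeterminism(
  agent: Agent,
  input: string,
  runs: number,
): DeterminismResult {
  if (runs < 2) {
    throw new Error("runs must be at least 2 to compare outputs");
  }
  const outputs = new Set<string>();
  for (let i = 0; i < runs; i++) {
    outputs.add(agent(input));
  }
  return {
    deterministic: outputs.size === 1,
    distinctOutputs: Array.from(outputs),
  };
}

// Stub agents for illustration: one stable, one that varies per call.
const stableAgent: Agent = (input) => `echo:${input.trim().toLowerCase()}`;
let callCount = 0;
const flakyAgent: Agent = (input) => `echo:${input}#${callCount++}`;

const stableResult = checkDeterminism(stableAgent, "Hello", 3);
const flakyResult = checkDeterminism(flakyAgent, "Hello", 3);
console.log(stableResult.deterministic, flakyResult.deterministic);
```

Exact string equality is the strictest possible comparison. A gate for a sampled model would more plausibly compare normalised or judged outputs, which is where modes such as llm-judge come in.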

Security — vulnerability

Did we introduce a vulnerability or leak?

Security-shaped gates probe the codebase itself: dependency vulnerabilities (bun audit), committed secrets (gitleaks-shaped patterns), and OWASP-style web vulnerabilities (future: ZAP). The OWASP Top 10 is the canonical reference; Matter's first cut covers the dependency audit and the secret scan.

Primary modes: static, synthetic (for ZAP on preview deployments).
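A pattern-based secret scan can be sketched as a list of regular expressions applied line by line. The two rules below are illustrative and are not gitleaks' actual rule set: `SecretRule`, `scanFile`, and the rule IDs are hypothetical, and the sample key is the placeholder value that AWS uses in its own documentation.

```typescript
// Hypothetical rule and finding shapes for a static secret scan.
interface SecretRule {
  id: string;
  pattern: RegExp; // no `g` flag, so `test` keeps no state between calls
}

interface Finding {
  ruleId: string;
  file: string;
  line: number; // 1-based line number
}

// Illustrative rules only; a real scanner ships many more.
const secretRules: SecretRule[] = [
  { id: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  {
    id: "private-key-header",
    pattern: /-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----/,
  },
];

// Scan one file's contents and report each rule match by line.
function scanFile(file: string, contents: string): Finding[] {
  const findings: Finding[] = [];
  contents.split(/\r?\n/).forEach((text, index) => {
    for (const rule of secretRules) {
      if (rule.pattern.test(text)) {
        findings.push({ ruleId: rule.id, file, line: index + 1 });
      }
    }
  });
  return findings;
}

// Sample input using AWS's documented example key, not a real credential.
const sampleEnv = [
  "PORT=3000",
  "AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE",
  "DEBUG=true",
].join("\n");

const secretFindings = scanFile(".env", sampleEnv);
for (const f of secretFindings) {
  console.log(`${f.file}:${f.line} matched ${f.ruleId}`);
}
```

The findings carry only the rule ID and location, never the matched text, so the gate's own output does not re-leak the secret into CI logs.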
