Activation Plan

19 of the 31 framework gates ship with `severity: "shadow"` today: each is interface-complete but waiting on a prerequisite (Playwright wiring, `ANTHROPIC_API_KEY`, a captured baseline, a corpus, a running service, etc.). This page is the contract for what unlocks each one.

Tiers are ordered by activation cost, cheapest first.

Tier A — single config change

Gate	What to do
`voice-judge`	Set `ANTHROPIC_API_KEY` in CI secrets. Promote severity to `blocking` after observing 30 green runs.
`assistant-determinism`	Same prerequisite as `voice-judge`: set `ANTHROPIC_API_KEY` in CI secrets.
`webhook-delivery-audit`	Set `SVIX_TOKEN`. Implementation TODO: pull recent attempts via Svix's API.
`production-smoke`	Already runs in any environment with internet. Promote to `blocking` when the gate has run cleanly against `api.mattermode.com` from the framework workflow for 14 days.
`visual-regression`	Set `CHROMATIC_PROJECT_TOKEN`. Wire `npx chromatic --exit-zero-on-changes` into the gate's run() and read the status.
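
A minimal sketch of a prerequisite check for the secret-gated rows above. The gate names and environment variable names come from the table; the `REQUIRED_ENV` map and `missingEnv` helper are hypothetical and are not part of the framework's gate interface.

```typescript
// Hypothetical helper: reports which CI secrets a Tier A gate still lacks.
// Only the gate names and env var names are taken from the table above.
type EnvGate =
  | "voice-judge"
  | "assistant-determinism"
  | "webhook-delivery-audit"
  | "visual-regression";

const REQUIRED_ENV: Record<EnvGate, string[]> = {
  "voice-judge": ["ANTHROPIC_API_KEY"],
  "assistant-determinism": ["ANTHROPIC_API_KEY"],
  "webhook-delivery-audit": ["SVIX_TOKEN"],
  "visual-regression": ["CHROMATIC_PROJECT_TOKEN"],
};

// Returns the names of required variables that are unset or blank.
// An empty array means the gate's secret prerequisite is satisfied.
function missingEnv(
  gate: EnvGate,
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV[gate].filter((name) => (env[name] ?? "").trim() === "");
}
```

In CI this could be called with `process.env` before the gate runs, so that a missing secret is reported as an unmet prerequisite instead of a gate failure.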

Tier B — capture a baseline / corpus

Gate	What to do
`prompt-regression`	Write a `capture-baseline` script, runnable as `bun run --filter @repo/ai capture-baseline`. Run it once to seed `packages/ai/__regression__/baseline.json` and commit the result.
`assistant-red-team`	Ship a versioned corpus at `apps/app/__red-team__/corpus.json` matching the `RedTeamCorpus` schema. It needs ≥30 entries across `prompt_injection`, `jailbreak`, `data_extraction`, `role_hijack`, `tool_misuse`, `ssrf`, `instruction_leak`, `pii_extraction`, `authority_escalation`.
`document-hallucination-check`	Collect AI-authored documents into `apps/app/test-data/ai-docs/`. Implement a claim extractor and a per-kind oracle (jurisdictions, OpenAPI, MCP).
`api-contract`	Write per-consumer contracts to `apps/api/__contracts__/{node-sdk,python-sdk,mcp-server}.json`.
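
The `assistant-red-team` requirement can be checked mechanically before committing the corpus. The sketch below is an assumption-laden illustration: the real `RedTeamCorpus` schema is not shown on this page, so the `CorpusEntry` shape (`id`, `category`, `prompt`) is hypothetical, and "≥30 entries across" is read here as at least 30 entries in total with every listed category represented at least once.

```typescript
// Category names are copied from the Tier B table.
const REQUIRED_CATEGORIES = [
  "prompt_injection",
  "jailbreak",
  "data_extraction",
  "role_hijack",
  "tool_misuse",
  "ssrf",
  "instruction_leak",
  "pii_extraction",
  "authority_escalation",
] as const;

// Hypothetical entry shape; the actual RedTeamCorpus schema may differ.
interface CorpusEntry {
  id: string;
  category: string;
  prompt: string;
}

// Returns a list of human-readable problems; an empty list means the corpus
// meets the size and coverage requirements under the assumptions above.
function checkCorpus(entries: CorpusEntry[], minEntries = 30): string[] {
  const problems: string[] = [];
  if (entries.length < minEntries) {
    problems.push(`corpus has ${entries.length} entries; need at least ${minEntries}`);
  }
  const known = new Set<string>(REQUIRED_CATEGORIES);
  const seen = new Set<string>();
  for (const entry of entries) {
    if (known.has(entry.category)) {
      seen.add(entry.category);
    } else {
      problems.push(`entry ${entry.id} has unknown category "${entry.category}"`);
    }
  }
  for (const category of REQUIRED_CATEGORIES) {
    if (!seen.has(category)) {
      problems.push(`no entries for category "${category}"`);
    }
  }
  return problems;
}
```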

Tier C — wire a runtime instrument

Gate	What to do
`cost-budget`	Wire `CostMeter` from `@matter/testing` into every AI call site (assistant, mock-founder, brand-voice-learner, voice-judge). Emit JSON to `.matter/cost-meter/<feature>-<run-id>.json`.
`latency-budget`	Wire latency recording into the same call sites. Emit `{ feature, durations_ms }` to `.matter/latency/<feature>.json`.
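
As an illustration of the `latency-budget` row, here is a minimal sketch of a recorder that wraps a call site and emits `{ feature, durations_ms }` to `.matter/latency/<feature>.json`. The `LatencyRecorder` class and its method names are hypothetical; only the output shape and path come from the table. The real instrument may live alongside `CostMeter` in `@matter/testing` and look different.

```typescript
import { mkdirSync, writeFileSync } from "node:fs";

// Output shape taken from the Tier C table.
interface LatencyRecord {
  feature: string;
  durations_ms: number[];
}

// Hypothetical recorder; not an existing @matter/testing export.
class LatencyRecorder {
  private readonly feature: string;
  private readonly durations: number[] = [];

  constructor(feature: string) {
    this.feature = feature;
  }

  // Records a duration that was measured elsewhere.
  record(durationMs: number): void {
    this.durations.push(durationMs);
  }

  // Times an async call (for example an AI request) and records its duration,
  // whether the call resolves or rejects.
  async time<T>(call: () => Promise<T>): Promise<T> {
    const start = Date.now();
    try {
      return await call();
    } finally {
      this.record(Date.now() - start);
    }
  }

  toRecord(): LatencyRecord {
    return { feature: this.feature, durations_ms: [...this.durations] };
  }

  // Writes the record to <rootDir>/<feature>.json and returns the file path.
  flush(rootDir = ".matter/latency"): string {
    mkdirSync(rootDir, { recursive: true });
    const file = `${rootDir}/${this.feature}.json`;
    writeFileSync(file, JSON.stringify(this.toRecord(), null, 2));
    return file;
  }
}
```

A call site would wrap its model request in `recorder.time(() => ...)` and call `flush()` once at the end of the run.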

Tier D — spin up a service in CI

Gate	What to do
`mcp-tool-roundtrip`	Add `bun run --filter matter-mcp-server dev &` + readiness poll to the framework CI workflow. Implement per-tool happy-path dispatch + response-schema validation.
`e2e-entity-lifecycle`	Add `@playwright/test` to `apps/app` devDeps. Wire dev-server-start + readiness-poll in CI. Implement the 7-step lifecycle script (assistant → formation → grant → franchise tax → dissolution → webhook sequence assertion).
`accessibility-budget`	Add `@playwright/test` + `axe-core` to `apps/web` devDeps. Spin up the `apps/web` dev server in CI. Walk each route, run axe, and assert zero violations.
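
All three Tier D gates depend on a readiness poll after starting a service in the background. The following is a minimal sketch of such a poll, not the framework's own helper: `waitUntilReady`, `httpProbe`, and the example URL in the comment are hypothetical. The probe is injected so the loop can be exercised without a running server.

```typescript
// A probe resolves to true once the service is ready to accept traffic.
type Probe = () => Promise<boolean>;

interface PollOptions {
  timeoutMs: number;
  intervalMs: number;
}

// Polls the probe until it succeeds or the timeout would be exceeded.
// A rejected probe (for example, connection refused) counts as "not ready yet".
// Resolves with the number of attempts made; rejects on timeout.
async function waitUntilReady(probe: Probe, options: PollOptions): Promise<number> {
  const deadline = Date.now() + options.timeoutMs;
  let attempts = 0;
  for (;;) {
    attempts += 1;
    const ready = await probe().catch(() => false);
    if (ready) return attempts;
    if (Date.now() + options.intervalMs > deadline) {
      throw new Error(`service not ready after ${attempts} attempts`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, options.intervalMs));
  }
}

// Probe that treats any 2xx HTTP response as ready. The URL is supplied by the
// caller, e.g. httpProbe("http://localhost:3000/") for a dev server on a
// hypothetical port.
function httpProbe(url: string): Probe {
  return async () => {
    const response = await fetch(url);
    return response.ok;
  };
}
```

Treating probe errors as "not ready" keeps the loop simple, at the cost of hiding a misconfigured URL until the timeout fires; logging the last error would be a reasonable extension.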

Tier E — heavier build pipeline

Gate	What to do
`perf-budget`	Add `bun run --filter web build` as a CI prerequisite step. Optional: switch from build-manifest reading to a bundle-analyzer JSON for richer per-chunk data. Optional: layer Lighthouse-CI on top for LCP/TTFB/CLS.

Tier F — eval corpus that requires AI runs

Gate	What to do
`mock-founder-capability`	Already activates the moment mock-founder runs land in `apps/app/test-data/mock-founder-runs/`. Promote to `blocking` once the run corpus is stable across 2 consecutive model upgrades.
`mock-founder-safety`	Same as `mock-founder-capability`. The findings it surfaces today are real (65 destructive-verb-without-confirmation findings on the first run); triage these eval signals before promotion.

Promotion process

1. Activate the gate per the tables above.
2. Watch the scorecard for N consecutive green runs (default: 14 days or 30 CI runs).
3. Edit the gate's `severity` field: `shadow` → `blocking`.
4. Open a PR. The PR comment will show "no new findings vs. main" if the prerequisite step landed correctly.
5. Merge. New regressions in the gated surface now fail CI.
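
The green-run threshold in the process above can be expressed as a small check. This is a sketch under stated assumptions, not the scorecard's actual logic: the `GateRun` shape and function names are hypothetical, and the default "14 days or 30 CI runs" is read as "the current unbroken green streak either contains at least 30 runs or started at least 14 days ago".

```typescript
// Hypothetical run record; the real scorecard format is not shown on this page.
interface GateRun {
  finishedAt: Date;
  green: boolean;
}

interface PromotionPolicy {
  minGreenRuns: number;
  minGreenDays: number;
}

// Defaults taken from the promotion process above.
const DEFAULT_POLICY: PromotionPolicy = { minGreenRuns: 30, minGreenDays: 14 };

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the most recent unbroken run of green results, oldest first.
function trailingGreenStreak(runs: GateRun[]): GateRun[] {
  const ordered = [...runs].sort(
    (a, b) => a.finishedAt.getTime() - b.finishedAt.getTime(),
  );
  const streak: GateRun[] = [];
  for (let i = ordered.length - 1; i >= 0; i--) {
    const run = ordered[i];
    if (!run || !run.green) break;
    streak.unshift(run);
  }
  return streak;
}

// True when the trailing green streak satisfies either threshold.
function isReadyForPromotion(
  runs: GateRun[],
  now: Date,
  policy: PromotionPolicy = DEFAULT_POLICY,
): boolean {
  const streak = trailingGreenStreak(runs);
  const first = streak[0];
  if (!first) return false;
  const greenDays = (now.getTime() - first.finishedAt.getTime()) / MS_PER_DAY;
  return streak.length >= policy.minGreenRuns || greenDays >= policy.minGreenDays;
}
```

Under this reading a single red run resets the streak, so a gate that flakes occasionally never reaches promotion, which is arguably the desired behavior for a blocking gate.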

Out-of-scope follow-ups

These are listed for transparency but aren't yet scaffolded as gates:

- Bundle-size drift across PRs (Vercel-style): needs CI-side delta computation. Could wrap `perf-budget` with a diff mode.
- OWASP ZAP against preview deployments: needs preview-env hooks Matter doesn't yet expose.
- Mutation testing via Stryker: heavier than the manual mutation gate; would need `@stryker-mutator/core` integration.
- AI cost ceilings per PR: needs PR-scoped cost-meter aggregation, not just per-feature totals.
- Property tests for every Zod schema: would generate `Arbitrary<T>` from each schema and fuzz consumers. Substantial scope.
