Budgets

What a budget gate is

Some gates assert numeric ceilings instead of pass / fail conditions:

  • LCP must be ≤ 2500 ms
  • axe violations must be ≤ 0
  • bundle size must be ≤ 200 KB
  • AI cost per run must be ≤ $0.50
  • p95 latency on an AI call must be ≤ 4000 ms

The Budget primitive normalises this pattern: the gate observes a value, the budget defines the ceiling, and the primitive emits a budget_exceeded Finding when the observed value exceeds the ceiling.
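The comparison itself is small. As a rough sketch, assuming a hypothetical Finding shape and a standalone checkBudget function (neither is taken from @matter/testing, whose real types may differ), the pattern could look like this:

```typescript
// Hypothetical shapes for illustration only; not the real @matter/testing types.
interface Finding {
  kind: "budget_exceeded";
  gateId: string;
  metric: string;
  observed: number;
  ceiling: number;
  message: string;
}

// Returns a Finding only when the observed value is strictly above the ceiling.
// Throwing on an unknown metric is an assumption of this sketch.
function checkBudget(
  ceilings: Record<string, number>,
  gateId: string,
  metric: string,
  observed: number,
): Finding | null {
  const ceiling = ceilings[metric];
  if (ceiling === undefined) {
    throw new Error(`No ceiling defined for metric "${metric}"`);
  }
  // A value equal to the ceiling still satisfies "must be ≤ ceiling".
  if (observed <= ceiling) return null;
  return {
    kind: "budget_exceeded",
    gateId,
    metric,
    observed,
    ceiling,
    message: `${metric} is ${observed}, over the ceiling of ${ceiling}`,
  };
}
```

Note that a value exactly at the ceiling passes, which matches the "≤" wording in the list above.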

File shape

A budget file is a flat JSON object of { metric: ceiling } pairs, where each ceiling is a number.

{
  "lcp_ms": 2500,
  "ttfb_ms": 800,
  "cls": 0.1,
  "bundle_kb": 200,
  "axe_violations": 0
}

Load it from your gate:

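import { Budget, defineGate } from "@matter/testing";

export default defineGate({
  id: "perf-budget",
  // ...
  async run(ctx) {
    // Load the { metric: ceiling } pairs from the budget file.
    const budget = await Budget.load(
      `${ctx.repoRoot}/apps/web/perf.budget.json`,
    );
    // measureLCP() is not defined in this snippet; substitute your own measurement.
    const observed = await measureLCP();
    // check() yields a Finding only when the observed value is over the ceiling.
    const finding = budget.check(this, "lcp_ms", observed, {
      file: "apps/web/perf.budget.json",
    });
    return finding ? [finding] : [];
  },
});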
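Because every ceiling is compared numerically, it can help to validate the file shape before relying on it. The sketch below is a hypothetical standalone validator named parseBudget; it is not part of @matter/testing and makes no claim about what Budget.load does internally. It only checks the shape described above: a JSON object whose values are all finite numbers.

```typescript
// Hypothetical validator for the { metric: ceiling } shape; not a library API.
function parseBudget(json: string): Record<string, number> {
  const parsed: unknown = JSON.parse(json);
  // Reject null, arrays, and primitives: only a plain object is acceptable.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Budget file must be a JSON object");
  }
  const ceilings: Record<string, number> = {};
  for (const [metric, ceiling] of Object.entries(parsed)) {
    // Each ceiling must be a real number, not a string such as "2500".
    if (typeof ceiling !== "number" || !Number.isFinite(ceiling)) {
      throw new Error(`Ceiling for "${metric}" must be a finite number`);
    }
    ceilings[metric] = ceiling;
  }
  return ceilings;
}
```

A file such as { "lcp_ms": "2500" } would be rejected here, which catches a quoted number before it turns into a confusing comparison later.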

Raising a budget

A budget ceiling is a contract. Raising one is a deliberate choice that should leave a trail.

The convention: raises go through a regular PR that updates the budget file. The PR description should answer:

  1. Why is this regression acceptable? (New feature shipping that justifies the cost, vendor change, etc.)
  2. When will we revisit? (Often a follow-up issue / ADR.)
  3. What's the rollback plan if the regression turns out to matter more than expected?

The framework doesn't enforce a review process; the practice is what makes budgets work.

Cost and latency are separate categories

Two category values explicitly carve out AI-specific budgets:

  • category: "cost" — token spend, model spend, infra spend. Alerts go to Finance, not SRE. The cost-meter primitive accumulates per-model call charges.
  • category: "latency" — wall-clock SLAs on AI calls. Alerts go to SRE. The latency-budget primitive enforces per-call p50 + p95 ceilings (distinct from the gate-level slaMs).

Both are budget-shaped but separated because they need different routing and different SLA conversations.
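To make the latency side concrete, here is a minimal sketch of a per-call p50 and p95 check with category-based routing. Everything in it is an assumption for illustration: the ROUTES table, the LatencyFinding shape, and the percentile and checkLatency functions are hypothetical and do not reflect the actual latency-budget primitive's API. The percentile uses the nearest-rank method, which is one common choice among several.

```typescript
// Hypothetical names throughout; the real latency-budget API is not shown on this page.
type Category = "cost" | "latency";

// Routing mirrors the text: cost alerts go to Finance, latency alerts go to SRE.
const ROUTES: Record<Category, string> = { cost: "finance", latency: "sre" };

interface LatencyFinding {
  category: "latency";
  routeTo: string;
  metric: "p50" | "p95";
  observedMs: number;
  ceilingMs: number;
}

// Nearest-rank percentile over a non-empty list of samples.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("No samples to measure");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const index = Math.min(sorted.length, Math.max(1, rank)) - 1;
  return sorted[index];
}

// Emits one finding per percentile that is over its ceiling.
function checkLatency(
  samplesMs: number[],
  ceilings: { p50: number; p95: number },
): LatencyFinding[] {
  const findings: LatencyFinding[] = [];
  for (const metric of ["p50", "p95"] as const) {
    const observedMs = percentile(samplesMs, metric === "p50" ? 50 : 95);
    if (observedMs > ceilings[metric]) {
      findings.push({
        category: "latency",
        routeTo: ROUTES.latency,
        metric,
        observedMs,
        ceilingMs: ceilings[metric],
      });
    }
  }
  return findings;
}
```

Keeping the route on the finding itself means a single slow outlier can page SRE through the p95 ceiling even while the p50 stays healthy.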

Why not collapse budget into severity?

Because severity is about build impact ("does this fail CI?") while a budget is about quantity ("how much is too much?"). A single blocking gate can emit info for sub-threshold values, warning between two thresholds, and blocking past the hard ceiling, all with the same Budget object.
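One way to picture the separation is a function that maps a quantity onto a severity using two thresholds. This is only a sketch: the TieredCeiling shape, the warnAt and blockAt field names, and severityFor are hypothetical and are not part of the Budget API described here.

```typescript
// Hypothetical tiering helper; field and function names are illustrative only.
type Severity = "info" | "warning" | "blocking";

interface TieredCeiling {
  warnAt: number; // soft threshold: above this, warn
  blockAt: number; // hard ceiling: above this, block
}

// The budget supplies the quantities; severity is derived from them.
function severityFor(observed: number, tiers: TieredCeiling): Severity {
  if (observed > tiers.blockAt) return "blocking";
  if (observed > tiers.warnAt) return "warning";
  return "info";
}
```

With warnAt set to 2000 and blockAt set to 2500 for lcp_ms, an observed 1800 would map to info, 2200 to warning, and 2600 to blocking, while the ceilings themselves stay plain numbers.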