
Eval

What it is

Capability and behavior baselines for AI emissions. Captured cases are run through a subject, and the subject's output is graded against expectations.
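The run-and-grade loop can be sketched as follows. This is a minimal illustration, not the framework's API: `EvalCase`, `EvalResult`, `runEval`, and `echoSubject` are hypothetical names, and the subject here is a plain function standing in for the agent under test.

```typescript
// Hypothetical shapes: none of these names come from the actual framework.
interface EvalCase<I, O> {
  name: string;
  input: I;
  // Grader: returns true when the output meets the expectation.
  expect: (output: O) => boolean;
}

interface EvalResult {
  name: string;
  passed: boolean;
}

// Run every captured case through the subject and grade its output.
function runEval<I, O>(
  subject: (input: I) => O,
  cases: EvalCase<I, O>[],
): EvalResult[] {
  return cases.map((c) => ({
    name: c.name,
    passed: c.expect(subject(c.input)),
  }));
}

// Stand-in subject: a real one would call the agent under test.
const echoSubject = (input: string): string => input.trim().toUpperCase();

const results = runEval<string, string>(echoSubject, [
  { name: "trims whitespace", input: "  hi ", expect: (out: string) => out === "HI" },
  { name: "keeps case", input: "hi", expect: (out: string) => out === "hi" },
]);

// Fraction of cases that passed; a baseline could be compared against this.
const passRate = results.filter((r) => r.passed).length / results.length;
```

The second case fails on purpose, to show that a grader reports a regression instead of throwing. Each case carries its own grader so that exact-match and looser checks can sit in the same suite.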

When to use it

Use it when the question is: 'Does the agent still emit a valid CardSpec for this input?'

Example gates

agent-output-schema — validates every exemplar against CardZ.
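A gate of this kind might look like the sketch below. The page does not show how CardZ or CardSpec is defined, so everything here is an assumption: `Card`, `isCard`, and `schemaGate` are hypothetical, the two-field card shape is invented for illustration, and a hand-written type guard stands in for the real schema.

```typescript
// Hypothetical card shape; the real CardSpec fields are not shown on this page.
interface Card {
  title: string;
  body: string;
}

// Hand-rolled type guard standing in for the CardZ schema (an assumption).
function isCard(value: unknown): value is Card {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.title === "string" && typeof v.body === "string";
}

// Gate: collect the index of every exemplar that fails validation.
function schemaGate(exemplars: unknown[]): number[] {
  const failures: number[] = [];
  exemplars.forEach((exemplar, index) => {
    if (!isCard(exemplar)) failures.push(index);
  });
  return failures;
}

const exemplars: unknown[] = [
  { title: "Welcome", body: "Hello" },
  { title: "Missing body" },
  "not an object",
];

const failing = schemaGate(exemplars);
// The gate passes only when no exemplar fails.
const gatePassed = failing.length === 0;
```

Returning the failing indices, and not just a boolean, makes it easier to see which captured exemplars drifted from the schema.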

See also
