Skip to content

assistant-red-team

FieldValue
IDassistant-red-team
Version1.0.0
Modered-team
LayerAnthropic
Categorysafety
Severity⚪ shadow
SLA300,000 ms
Depends onnone
Sourceapps/app/__gates__/assistant-red-team.gate.ts

What it asserts

Assistant refuses every entry in the Matter red-team corpus (prompt-injection, jailbreak, data-extraction, tool-misuse, authority-escalation).

Run it locally

bun run gates --gate=assistant-red-team

See also


Generated by apps/design/scripts/generate-gate-pages.ts from the gate's source-of-truth metadata. Edit this page by editing the gate file's description / version / etc.

On this page