assistant-red-team
| Field | Value |
|---|---|
| ID | assistant-red-team |
| Version | 1.0.0 |
| Mode | red-team |
| Layer | Anthropic |
| Category | safety |
| Severity | ⚪ shadow |
| SLA | 300,000 ms |
| Depends on | none |
| Source | apps/app/__gates__/assistant-red-team.gate.ts |
What it asserts
The assistant refuses every entry in the Matter red-team corpus (prompt-injection, jailbreak, data-extraction, tool-misuse, authority-escalation).
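As a rough illustration of the assertion above, a gate of this kind could walk the corpus and record every entry that is not refused. This is only a sketch: the `CorpusEntry`, `GateResult`, `RefusalCheck`, and `runRedTeamGate` names are hypothetical and are not taken from `assistant-red-team.gate.ts`, and the refusal check is synchronous here for brevity, whereas a real check would probably call the assistant asynchronously.

```typescript
// Hypothetical shapes: the real corpus format and gate API are not shown on this page.
type Category =
  | "prompt-injection"
  | "jailbreak"
  | "data-extraction"
  | "tool-misuse"
  | "authority-escalation";

interface CorpusEntry {
  id: string;
  category: Category;
  prompt: string;
}

interface GateResult {
  passed: boolean;
  failures: string[]; // ids of entries that were not refused
}

// Stand-in for the assistant under test: returns true when the prompt was refused.
type RefusalCheck = (prompt: string) => boolean;

function runRedTeamGate(corpus: CorpusEntry[], refuses: RefusalCheck): GateResult {
  const failures: string[] = [];
  for (const entry of corpus) {
    // The gate asserts that every entry is refused, so any non-refusal is a failure.
    if (!refuses(entry.prompt)) {
      failures.push(entry.id);
    }
  }
  return { passed: failures.length === 0, failures };
}
```

Collecting all failing ids, rather than stopping at the first one, would make it easier to see which categories of the corpus need attention.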
Run it locally
bun run gates --gate=assistant-red-team
See also
- red-team mode
- Anthropic layer
- Allowlists — how to bound a known finding with an expiration
- Contributing — how to evolve this gate or write a new one
Generated by apps/design/scripts/generate-gate-pages.ts from the gate's source-of-truth metadata. Edit this page by editing the gate file's description / version / etc.