Risk Classification
Before configuring guardrails, score your agent against the five risk dimensions below. The highest single dimension determines the overall risk tier.
Risk dimensions
| Dimension | Low | Medium | High | Critical |
|---|---|---|---|---|
| Dollar impact | Under $500 per action | $500 to $5,000 | $5,000 to $50,000 | Over $50,000 or recurring charge |
| Irreversibility | Fully reversible (e.g., tag, draft, status) | Recoverable with effort (e.g., archive, field overwrite) | Hard to undo (e.g., sent email, posted record) | Cannot be undone (e.g., payment settled, contract signed) |
| External visibility | Internal only | Shared with internal teams | Customer-facing | Public or regulatory submission |
| Regulatory exposure | None | Internal policy only | PII / GDPR / HIPAA / SOC | Financial (SOX, PCI-DSS), legal, or contractual |
| System criticality | Experimental or sandbox | Non-critical production | Important but not mission-critical | Mission-critical (billing, payroll, ERP) |
Tier definitions and required guardrails
| Tier | Overall score | Required guardrails |
|---|---|---|
| Low | All dimensions score Low | None required — routine reversible operations |
| Medium | Any dimension scores Medium | HITL gate on the risky branch; anomaly alerts enabled |
| High | Any dimension scores High | HITL gate; hard caps on action volume and spend; allow/deny lists |
| Critical | Any dimension scores Critical | All High guardrails; shadow mode before first autonomous run; two-person AOP review before going live |
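The "highest single dimension wins" rule can be sketched in a few lines of Python. This is an illustration only, not a platform feature: the names `Tier`, `dollar_tier`, and `overall_tier` are hypothetical, and the handling of an amount of exactly $5,000 (scored Medium here) is an assumption, since the table does not say which tier owns that boundary.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Risk tiers ordered so that max() picks the most severe one."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


def dollar_tier(amount: float, recurring: bool = False) -> Tier:
    """Score the dollar-impact dimension from the thresholds in the table."""
    if recurring or amount > 50_000:
        return Tier.CRITICAL      # over $50,000 or a recurring charge
    if amount > 5_000:
        return Tier.HIGH          # assumption: exactly $5,000 stays Medium
    if amount >= 500:
        return Tier.MEDIUM        # Low is strictly under $500
    return Tier.LOW


def overall_tier(scores: dict[str, Tier]) -> Tier:
    """The highest single dimension determines the overall tier."""
    return max(scores.values())


scores = {
    "dollar_impact": dollar_tier(1_200),
    "irreversibility": Tier.HIGH,
    "external_visibility": Tier.LOW,
    "regulatory_exposure": Tier.MEDIUM,
    "system_criticality": Tier.MEDIUM,
}
print(overall_tier(scores).name)  # HIGH
```

Because one High dimension is enough to make the agent High tier, lowering the other scores does not reduce the required guardrails.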
Guardrail Patterns
HITL approval on high-risk branches
Add a Human-in-the-Loop approval step before any action that scores Medium or above. The approval request should give the reviewer enough context to decide without opening another system.
Hard caps
Limit the blast radius of a runaway run by setting explicit caps in your SOP.
Allow and deny lists
Restrict write operations to approved targets. Any action outside the list requires a HITL approval step. A common example is an email domain allowlist.
Idempotency checks
Prevent duplicate actions when a run retries or runs more than once.
Shadow mode before enabling writes
For new high-risk agents, run the agent in read-only mode first. Observe what it would have done, then enable write access once the outputs look correct. When you are ready, remove the SHADOW MODE instruction and enable the write connections.
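To make the interaction between these patterns concrete, here is a minimal Python sketch of a pre-write gate that combines an idempotency check, an email domain allowlist, a hard cap, and shadow mode. It is an illustration under stated assumptions, not how the platform implements guardrails: the `WriteGuard` class, its fields, and the returned decision strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WriteGuard:
    """Hypothetical gate that every outbound email passes through."""
    max_actions: int                    # hard cap on writes per run
    allowed_domains: frozenset          # email domain allowlist
    shadow_mode: bool = True            # read-only until explicitly disabled
    seen_keys: set = field(default_factory=set)   # idempotency keys already handled
    executed: int = 0                   # writes performed (or simulated) so far
    log: list = field(default_factory=list)

    def send_email(self, key: str, recipient: str) -> str:
        """Return the decision for one email instead of sending it blindly."""
        domain = recipient.rsplit("@", 1)[-1].lower()
        if key in self.seen_keys:
            return "skipped_duplicate"      # idempotency: same key seen before
        if domain not in self.allowed_domains:
            return "needs_hitl_approval"    # outside the allowlist
        if self.executed >= self.max_actions:
            return "blocked_by_cap"         # hard cap reached
        self.seen_keys.add(key)
        self.executed += 1
        if self.shadow_mode:
            self.log.append(f"would send to {recipient}")
            return "shadow_logged"          # record the intent, write nothing
        self.log.append(f"sent to {recipient}")
        return "sent"


guard = WriteGuard(max_actions=2, allowed_domains=frozenset({"example.com"}))
print(guard.send_email("po-1", "buyer@example.com"))  # shadow_logged
print(guard.send_email("po-1", "buyer@example.com"))  # skipped_duplicate
```

In this sketch, shadow-mode actions still count toward the cap, so a shadow run also shows whether the cap would have been hit once writes are enabled.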
Two-person AOP review for Critical agents
For Critical-tier agents, require a second team member to review the AOP before any change goes live. In practice this means:
- The Builder drafts and saves the updated AOP revision.
- A Manager or Administrator opens the agent, reviews the diff, and runs a test run.
- Only after the reviewer is satisfied does the Manager update the live revision.
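The review steps above amount to a simple rule: the person who promotes a revision must not be its author, must hold a reviewing role, and must have run a test. A minimal Python sketch of that rule follows. The function `may_promote` and its parameters are hypothetical and only illustrate the check; the role names come from the steps above.

```python
REVIEWER_ROLES = {"Manager", "Administrator"}


def may_promote(author: str, reviewer: str, reviewer_role: str, test_run_passed: bool) -> bool:
    """Return True only if a second, suitably privileged person reviewed and tested the revision."""
    return (
        reviewer != author                    # two different people
        and reviewer_role in REVIEWER_ROLES   # reviewer holds a reviewing role
        and test_run_passed                   # reviewer ran a satisfactory test run
    )


print(may_promote("dana", "lee", "Manager", True))   # True
print(may_promote("dana", "dana", "Manager", True))  # False: same person
```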
Time-window restrictions
Limit autonomous write operations to business hours to ensure a human can respond quickly to anomalies.
Detection and Response
Anomaly signals to monitor
Review run outputs regularly for these patterns, especially in the first two weeks after a new agent goes live:
| Signal | What it may indicate |
|---|---|
| Run duration much longer than usual | Stuck loop, external system slowdown, runaway processing |
| Action count far above the average | Cap not applied; duplicate records being created |
| Cost spike without corresponding output | Repeated retries, large context being sent to the model |
| Output drift — format or content changes without AOP change | Upstream data schema change; model behavior change |
| HITL denial rate rising above 20% | AOP is generating incorrect outputs; refine before removing gates |
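If you export run metrics, a few of these signals can be checked automatically. The Python sketch below is one possible approach, not a platform feature: the `Run` record, the `anomaly_flags` function, and the 3x multiplier used for "much longer than usual" and "far above the average" are assumptions. Only the 20% denial threshold comes from the table.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """Hypothetical per-run metrics exported for review."""
    duration_s: float
    action_count: int
    approvals: int
    denials: int


def anomaly_flags(history: list, latest: Run, factor: float = 3.0) -> list:
    """Compare the latest run with the historical average and return triggered signals."""
    flags = []
    if history:
        if latest.duration_s > factor * mean(r.duration_s for r in history):
            flags.append("duration")
        if latest.action_count > factor * mean(r.action_count for r in history):
            flags.append("action_count")
    runs = history + [latest]
    decisions = sum(r.approvals + r.denials for r in runs)
    denials = sum(r.denials for r in runs)
    if decisions and denials / decisions > 0.20:   # threshold from the table
        flags.append("hitl_denial_rate")
    return flags


history = [Run(60, 10, 9, 1), Run(70, 12, 10, 0)]
print(anomaly_flags(history, Run(400, 11, 4, 6)))  # ['duration', 'hitl_denial_rate']
```

Cost spikes and output drift are left out because they need data this sketch does not model.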
Kill switch: pausing an agent
If something is going wrong, you can stop an agent immediately.
Disable a schedule:
- Open the assignment.
- Go to Agent Settings > Schedule.
- Toggle the schedule off. The agent stops running on its schedule. Runs already in progress continue to completion.
Disable a trigger:
- Open the assignment.
- Go to Agent Settings > Triggers.
- Disable the trigger. No new runs start from that trigger.
Disconnect a connection:
- Go to the Connections page.
- Find the connection used by the agent.
- Disconnect it. The agent will fail immediately if it tries to use this connection, which prevents further writes to that system.
Rollback via Agent Versions
If an AOP change caused the problem, revert to the previous working version:
- Open the agent.
- Click the revision selector in the builder toolbar.
- Select the last known-good revision.
- The agent now runs the reverted AOP on the next trigger or manual start.
Communication template
When you pause a high-risk agent, notify stakeholders immediately. Waiting for a full post-mortem before communicating increases risk.
Worked Example: Purchase Order Processing
The Purchase Order Processing tutorial covers the basic flow. Here is how to apply the full risk classification and guardrail set before enabling it in production.
Risk classification
| Dimension | Score | Reason |
|---|---|---|
| Dollar impact | Critical | Individual POs can exceed $50,000 |
| Irreversibility | High | Approved POs trigger downstream procurement actions |
| External visibility | Medium | Confirmations go to vendors |
| Regulatory exposure | Medium | Internal procurement policy; financial audit trail required |
| System criticality | Critical | ERP is mission-critical |
Guardrails applied
HITL gates:
Shadow mode: add a SHADOW MODE instruction to the SOP. Review the weekly run outputs with the procurement lead before enabling write access.
Time window:
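A time-window restriction can also be expressed as a small check in front of any write. The Python sketch below assumes business hours of 09:00 to 17:00, Monday to Friday, in the agent's local time. Those hours and the function name `within_business_hours` are assumptions for illustration, not values from the tutorial.

```python
from datetime import datetime, time

# Assumed business hours; adjust to your own policy.
START = time(9, 0)
END = time(17, 0)


def within_business_hours(now: datetime, start: time = START, end: time = END) -> bool:
    """Return True on weekdays between start (inclusive) and end (exclusive)."""
    is_weekday = now.weekday() < 5          # Monday is 0, Sunday is 6
    return is_weekday and start <= now.time() < end


# Monday 8 January 2024, 10:00 local time
print(within_business_hours(datetime(2024, 1, 8, 10, 0)))  # True
# Saturday 6 January 2024, 10:00 local time
print(within_business_hours(datetime(2024, 1, 6, 10, 0)))  # False
```

Outside the window, a write would be queued or routed to a HITL approval rather than executed autonomously.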
Related
- Designing Human-in-the-Loop Workflows — Risk tiers, approval shapes, escalation chains, and ramping toward autonomy
- Agent Versions — How to review and revert to a previous AOP version
- Roles and Permissions — Who can create, edit, and promote agent revisions
- Security & Privacy — Platform-level security controls, SOC 2, and data handling
- Purchase Order Processing — Full tutorial showing threshold-based approval routing
- Expense Report Approval — Full tutorial showing HITL approval with dollar thresholds