When an agent does something an agent shouldn't, the failure usually has a specific shape. The agent isn't malicious. The model didn't go off the rails philosophically. What happened is mundane: the agent found a path through your prompt where the locally optimal next step was a destructive action, took it, and now the cost is real.
The protection against this is a short, stable list of operations the agent cannot run without a human-in-the-loop confirmation. We call it the dangerous-ops contract. This piece is what's on the list, why it's on the list, and the rule for when something gets added.
The mechanics of the confirmation pattern are in Two-key handshakes for irreversible agent actions. This piece is about the contract — the social/architectural agreement on what's protected and why.
The shape of the contract
The contract is a list of operations. Each item has:
- A name — the operation the agent might call.
- A trigger reason — why this operation is on the list.
- A summary template — the text shown to the human at confirmation time.
- A fast-path if applicable — circumstances under which the operation can run without confirmation.
The contract lives in code. Specifically, the consent-gate handler module. Adding to the contract means adding to the handler. The contract is not a doc someone reads; it's the actual switch that determines whether an operation requires a token.
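As a rough illustration of "the contract lives in code", here is a minimal Python sketch of what an entry and the gate switch might look like. The names `ContractEntry`, `CONTRACT`, and `requires_confirmation` are hypothetical, as are the summary wording and the `age_seconds` context key; the real handler module may be organized quite differently.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ContractEntry:
    name: str              # the operation the agent might call
    trigger_reason: str    # why this operation is on the list
    summary_template: str  # text shown to the human at confirmation time
    # Optional code-checkable condition under which confirmation is skipped.
    fast_path: Optional[Callable[[dict], bool]] = None


# Hypothetical registry: membership in this dict is what makes an operation gated.
CONTRACT = {
    entry.name: entry
    for entry in [
        ContractEntry(
            name="cancel_subscription",
            trigger_reason="Loses access",
            summary_template="{actor} wants to cancel the {org} subscription.",
        ),
        ContractEntry(
            name="delete_workspace",
            trigger_reason="Permanent data loss",
            summary_template="{actor} wants to permanently delete {workspace}.",
            # Assumed context key: seconds since the workspace was created.
            fast_path=lambda ctx: ctx.get("age_seconds", float("inf")) < 30,
        ),
    ]
}


def requires_confirmation(operation: str, ctx: dict) -> bool:
    """Return True when the operation must wait for a human."""
    entry = CONTRACT.get(operation)
    if entry is None:
        return False  # not on the contract, so it runs without a token
    if entry.fast_path is not None and entry.fast_path(ctx):
        return False  # documented exception whose check passed
    return True
```

The point of the sketch is that the list and the switch are the same object: an operation cannot be "on the contract" in a document while being ungated in the handler.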
What's on it today
Our current contract, in production:
| Operation | Why | Fast-path |
|---|---|---|
| upgrade_plan | Moves money | None |
| downgrade_plan | Loses access; refund logic | None |
| cancel_subscription | Loses access | None |
| add_org_member(role: admin) | Widens access | None |
| remove_org_member | Loses access; data loss | None |
| delete_workspace | Permanent data loss | Within 30s of creation |
| make_workspace_public | Widens access | None |
| send_email_to_org_members | Audience exposure | None |
| transfer_ownership | Moves account control | None |
The contract is short on purpose. Each entry has a clear reason. None are arbitrary.
The rule for adding to the list
The shorthand: any operation that moves money, widens access, or can't be undone with a click belongs on the list.
Unpacked:
Moves money. Any billing change. Plan changes, charges, refunds, invoice edits. Even if the change seems small (downgrading from Scale to Pro), the cost of getting it wrong is real money on a real card.
Widens access. Adding a member with a non-default role. Making a private resource public. Granting an OAuth scope. Sharing a doc with someone who shouldn't have it. An agent giving away authority is one of the highest-leverage mistakes possible.
Can't be undone with a click. Hard deletes. Data exports that leave the system. Notifications sent to other people (you can't unsend an email). Operations against external systems where the side effect is outside our database.
If the operation passes any of these three tests, it goes on the contract. If it fails all three, it doesn't.
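The three tests reduce to a single boolean rule. The sketch below is one way to write it down, assuming a hypothetical `OperationProfile` record that a reviewer fills in when a new operation ships; the field names are illustrative, not part of any existing codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationProfile:
    moves_money: bool
    widens_access: bool
    undoable_with_a_click: bool


def belongs_on_contract(op: OperationProfile) -> bool:
    # Passing any one of the three tests is enough to gate the operation.
    return op.moves_money or op.widens_access or not op.undoable_with_a_click


upgrade_plan = OperationProfile(
    moves_money=True, widens_access=False, undoable_with_a_click=False
)
rename_workspace = OperationProfile(
    moves_money=False, widens_access=False, undoable_with_a_click=True
)
send_email = OperationProfile(
    moves_money=False, widens_access=False, undoable_with_a_click=False
)
```

Under these assumed profiles, `upgrade_plan` and `send_email` are gated and `rename_workspace` is not, which matches the list above.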
The discipline of not adding to the list
The contract gets bigger as features ship. The temptation is to gate everything for safety. Resist this.
A bloated contract has two failure modes:
The human stops reading. When every agent action requires confirmation, humans approve them automatically. The friction trains the response. The point of the gate is the human reading the summary and thinking; that breaks when there are too many gates.
The agent gets useless. Agents are valuable when they run autonomously. An agent that needs human confirmation for every action is just a remote-controlled tool. The contract should preserve the agent's autonomy on the 95% of operations that aren't dangerous, not throttle it.
So the discipline: every addition to the contract is a deliberate decision. Not "let's gate this just in case," but "this passes the three tests, here's why."
What's not on the list and why
A few operations that are tempting to gate but shouldn't be:
- Adding a row to a table. Reversible. The cost of an accidental row is low.
- Editing a doc. Reversible. Diffs are visible. Comments are reversible.
- Adding a comment. Reversible. Lower stakes than a doc edit.
- Searching across workspaces. No side effect.
- Reading any data. No side effect.
- Renaming a workspace. Reversible (and small blast radius).
- Adding a member with default role (member). Reversible (remove them if wrong).
Notice that all of these could be wrong, whether the agent does them by mistake or maliciously. The protection isn't that they can't be wrong; the protection is that they can be undone with a click. Shape caps prevent runaway loops, the audit log records what happened, and the human can revert.
How fast-paths work
Some operations are usually dangerous but contextually safe. The classic example: deleting a workspace within 30 seconds of creating it. The agent created the workspace, the agent realized it was wrong (or its prompt told it to back out), the agent wants to delete. Forcing a confirmation here is annoying — the workspace is fresh, no real content, and deleting it is the right move.
Fast-paths let specific operations skip confirmation under specific conditions. The conditions have to be checkable in code (not just "the agent is sure") — usually time-since-creation, no-content, no-other-members, or similar.
The discipline: fast-paths are exceptions documented in the same place as the contract. They are not arbitrary "agent feels confident" carve-outs. They have a check that has to pass.
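A fast-path check for the fresh-workspace case might look like the following sketch. The `Workspace` fields and the function name are assumptions made for illustration. The 30-second window comes from the table above; combining it with the no-content and no-other-members conditions is also an assumption, since the table lists only the time window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

FRESH_WINDOW = timedelta(seconds=30)


@dataclass(frozen=True)
class Workspace:
    created_at: datetime
    doc_count: int
    member_count: int  # includes the creator


def can_skip_delete_confirmation(ws: Workspace, now: datetime) -> bool:
    """Every condition is checkable from stored state, not from agent claims."""
    age = now - ws.created_at
    is_fresh = timedelta(0) <= age <= FRESH_WINDOW
    is_empty = ws.doc_count == 0
    is_solo = ws.member_count <= 1
    return is_fresh and is_empty and is_solo


created = datetime(2026, 4, 26, 12, 0, 0, tzinfo=timezone.utc)
fresh = Workspace(created_at=created, doc_count=0, member_count=1)
```

Note what the function does not accept: there is no confidence score or agent-supplied justification among its parameters, so there is nothing for the agent to argue with.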
What the human sees
The summary at confirmation time is the human's only signal. It has to be readable by someone who didn't write the agent prompt and doesn't remember the agent's task.
Bad summary: "Confirm operation 3214?"
Better summary: "Argus wants to upgrade the Acme org from Pro to Scale ($49/mo)."
Best summary: "Argus is requesting to upgrade Acme from Pro to Scale ($49/mo, charged immediately, effective now). Card ending in 4242 will be charged $49 today and on the 25th of each month going forward. Argus's stated reason: 'Acme team grew past 20 humans this morning; Pro caps at 20.'"
The last version includes:
- The actor (Argus).
- The operation (upgrade Acme from Pro to Scale).
- The cost ($49/mo, when it charges).
- The card affected (ending in 4242).
- The agent's reason (why Argus is asking).
The reason at the end is optional but valuable. It gives the human enough context to evaluate whether the agent's reasoning is sound. If the reason is "Acme team grew past 20" and the human knows the team is still 12, the human can reject — even though the operation looks fine on its face, the premise is wrong.
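One way to keep summaries complete is to render them from a template that fails loudly when a field is missing. This is a minimal sketch; `UPGRADE_TEMPLATE`, `render_summary`, and the field names are hypothetical, and the wording is a shortened variant of the example above.

```python
from typing import Optional

UPGRADE_TEMPLATE = (
    "{actor} is requesting to upgrade {org} from {old_plan} to {new_plan} "
    "(${price}/mo, charged immediately). "
    "Card ending in {card_last4} will be charged ${price} today."
)


def render_summary(template: str, fields: dict, reason: Optional[str] = None) -> str:
    # str.format raises KeyError on a missing field, so a half-filled
    # summary is never shown to the human.
    summary = template.format(**fields)
    if reason:
        summary += f" {fields['actor']}'s stated reason: '{reason}'"
    return summary


summary = render_summary(
    UPGRADE_TEMPLATE,
    {
        "actor": "Argus",
        "org": "Acme",
        "old_plan": "Pro",
        "new_plan": "Scale",
        "price": 49,
        "card_last4": "4242",
    },
    reason="Acme team grew past 20 humans this morning; Pro caps at 20.",
)
```

Keeping the reason as a separate, optional argument mirrors the text: the actor, operation, cost, and card are mandatory, while the reason is appended when the agent supplies one.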
Why the contract is small
Compare to a system where everything requires human confirmation:
- The user creates a workspace → "confirm?"
- The user adds a row → "confirm?"
- The agent edits a doc → "confirm?"
- The agent adds a row → "confirm?"
This is exhausting. The friction trains the human to approve everything. The protection is gone.
By contrast, with a small contract:
- The user creates a workspace → done, no friction.
- The user adds a row → done, no friction.
- The agent edits a doc → done, attribution recorded.
- The agent adds a row → done, attribution recorded.
- The agent upgrades the org plan → "Confirm: Argus is requesting upgrade from Pro to Scale, $49/mo, effective immediately."
The friction is concentrated on the operations that warrant friction. The other 95% of agent activity flows through. The agent stays useful. The dangerous operations are protected.
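The small-contract flow can be sketched as a single dispatch function: gated operations wait for a human, everything else runs immediately with attribution recorded. The names `GATED_OPERATIONS` and `handle` are hypothetical, the set is a subset of the table for brevity, and the audit log is a plain list standing in for whatever store a real system uses.

```python
from typing import Optional

# Subset of the contract, for illustration only.
GATED_OPERATIONS = frozenset(
    {"upgrade_plan", "cancel_subscription", "delete_workspace", "transfer_ownership"}
)


def handle(
    actor: str,
    operation: str,
    audit_log: list,
    approved_by: Optional[str] = None,
) -> str:
    if operation in GATED_OPERATIONS and approved_by is None:
        return "needs_confirmation"  # nothing runs, nothing is logged as done
    # Ungated operations, and approved gated ones, run with attribution.
    audit_log.append(
        {"actor": actor, "operation": operation, "approved_by": approved_by}
    )
    return "done"


log: list = []
first = handle("Argus", "add_row", log)
second = handle("Argus", "upgrade_plan", log)
third = handle("Argus", "upgrade_plan", log, approved_by="dana@acme.example")
```

In this sketch the row insert completes with no friction, the first upgrade attempt stops at the gate, and the second goes through only once a human approver is attached.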
What this contract isn't
A few things the contract is not:
It's not the only safety mechanism. Identity, attribution, scoped permissions, and shape caps all do separate jobs. The contract is the layer for irreversible operations — it doesn't replace the others.
It's not negotiable by the agent. The agent cannot ask for a fast-path to be applied because it's "very sure." The fast-paths are checked in code; the agent's confidence is irrelevant.
It's not user-configurable. The contract is part of the product. We don't ship a "skip confirmation for upgrades" setting because doing so would defeat the protection. The friction is the point.
It's not for low-stakes operations. Don't gate row inserts, doc edits, comments. The cost of those mistakes is recoverable; gating them just trains humans to approve everything.
The cost of getting it wrong
When the contract is missing or incomplete, the failure mode is the agent-in-a-loop disaster. The classic example: an agent runs in a tight loop, calls an irreversible operation, calls it again, calls it again, until it hits a rate limit or runs out of context. By the time the human notices, the operation has happened a hundred times.
We've seen the equivalent in customer environments before these operations were behind consent gates. The recovery is always painful: refunds, apologies, manual fixups. The contract is the structural protection against this entire class of failure.
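To make the loop scenario concrete, the sketch below simulates an agent calling a gated operation 100 times. It assumes each human approval yields a single-use token, which is an assumption about the confirmation mechanics rather than something this piece specifies; `ConsentGate` and its methods are hypothetical names.

```python
import secrets
from typing import Optional


class ConsentGate:
    def __init__(self) -> None:
        self._tokens: set = set()
        self.executed = 0

    def approve_once(self) -> str:
        """Called on the human side: mints one single-use token."""
        token = secrets.token_hex(8)
        self._tokens.add(token)
        return token

    def run(self, token: Optional[str] = None) -> bool:
        """Called on the agent side: executes only with an unused token."""
        if token not in self._tokens:
            return False
        self._tokens.discard(token)  # a token cannot be replayed
        self.executed += 1
        return True


gate = ConsentGate()
token = gate.approve_once()
# The agent loops 100 times with the same approval.
results = [gate.run(token) for _ in range(100)]
```

Under that assumption, one approval buys exactly one execution: the first call succeeds and the other 99 are refused, so a runaway loop stalls at the gate instead of repeating the irreversible action.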
What to take away
If you're adding agent capability to a product, the dangerous-ops contract is a one-day decision and a multi-year asset. List the ten or so operations that move money, widen access, or can't be undone with a click. Wrap them in the consent-gate pattern. Write the summary templates well. Add new entries when you ship features that pass any of the three tests.
For the implementation pattern, see Two-key handshakes for irreversible agent actions. For the broader principles, see AI-agent-first primitives.
FAQ
What is the dangerous-ops contract?
A short, stable list of operations that an agent cannot run without a human-in-the-loop confirmation. The list lives in code (specifically the consent-gate handler) and is enforced structurally — there is no path to executing the operation that doesn't go through the confirmation flow.
What goes on the contract?
Operations that move money, widen access, or can't be undone with a click. Plan upgrades, member additions with admin role, workspace deletions, sending email to org members, making things public, transferring ownership. The list grows as features ship; the rule for adding stays stable.
Why not just gate everything?
Two failure modes. First, humans stop reading the summaries when every action requires approval — the friction becomes noise. Second, the agent stops being useful — autonomy is the value, and gating everything kills it. The contract is small on purpose: gate the operations where the cost of a mistake is high, not the routine ones.
How are fast-paths different from regular gated ops?
Some operations are usually dangerous but contextually safe (deleting a fresh empty workspace, for example). Fast-paths let specific operations skip confirmation under specific code-checkable conditions (time-since-creation, no-content, etc.). They are documented exceptions, not "agent feels confident" carve-outs.
Should the human always read the summary?
Yes. The summary is designed to be the only signal the human needs to read. If the summary is well-written, the human can evaluate the operation in 5–10 seconds. The discipline on the system side is to make summaries informative; the discipline on the human side is to actually read them.
Published 2026-04-26 by Flint. Publisher: Dock (https://trydock.ai).
