Engineering

Consent gates for dangerous operations

When an agent can flip the plan, it shouldn't — not without a human-in-the-loop confirmation. The two-call handshake we ship in production, and why this is the smallest unit of safety that matters.

Date
Apr 26, 2026
Author
Flint
Read
9 min

There's a small set of operations an agent can call where the cost of a mistake is real money, lost access, or irreversible state. Plan upgrades. Member removals. Deleting a workspace. Sending an email blast. We call this set the dangerous ops, and the rule is they never run on the agent's first call.

Instead, they run on the second call — after a human has approved the first call's proposal. This is a consent gate. It's the smallest amount of human-in-the-loop you can add to keep agents safe on the operations that matter, without adding friction to the operations that don't.

This piece is the pattern we ship at Dock — the contract, the token shape, and what we've learned running it in production for thousands of agent-driven operations.

The pattern

The agent makes a propose call. The handler doesn't execute. It returns a confirmation token plus a human-readable summary. The agent surfaces the summary to its human owner. The human approves. The agent makes a commit call with the token. Only the commit call has a side effect.

A concrete walk-through with upgrade_plan:

  1. Agent → API: upgrade_plan(org=acme, plan=scale)
  2. API → Agent: { confirm_token: "tok_abc...", summary: "Upgrade Acme to Scale plan ($49/mo, charged immediately)" }
  3. Agent → Human: shows the summary in chat
  4. Human → Agent: "yes, do it"
  5. Agent → API: upgrade_plan(org=acme, plan=scale, confirm_token="tok_abc...")
  6. API: validates token, executes upgrade, returns success

Two API calls. One human checkpoint. One real side effect.
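The branch that matters can be sketched in a few lines. This in-memory version is a simplified sketch, not Dock's actual handler — the names, the `Map` store, and the string-keyed params are all illustrative: no token means propose, token means validate, consume, and execute.

```typescript
import { randomBytes } from "crypto";

type Pending = { op: string; params: string; expiresAt: number };
const pending = new Map<string, Pending>();

// One handler, two behaviors: the first call proposes, the second commits.
function upgradePlan(
  org: string,
  plan: string,
  confirmToken?: string
): { confirmToken: string; summary: string } | { ok: true } {
  const params = JSON.stringify({ org, plan });

  if (!confirmToken) {
    // Propose: no side effect, just a token and a human-readable summary.
    const token = "tok_" + randomBytes(32).toString("hex");
    pending.set(token, {
      op: "upgrade_plan",
      params,
      expiresAt: Date.now() + 60_000, // 60-second TTL
    });
    return { confirmToken: token, summary: `Upgrade ${org} to ${plan} plan` };
  }

  // Commit: validate the token against the original proposal.
  const p = pending.get(confirmToken);
  if (!p) throw new Error("no_token");
  if (p.expiresAt < Date.now()) throw new Error("expired");
  if (p.params !== params) throw new Error("param_mismatch");

  pending.delete(confirmToken); // single-use: consume before executing
  // ...the real upgrade would run here...
  return { ok: true };
}
```

A production version would also bind the token to the principal and persist it transactionally, as the rest of this piece describes.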

What the token has to be

The token is the structural protection. Four properties make it safe:

  • Single-use. Consumed in the same transaction as the side effect. A retry returns "already consumed."
  • Time-bound. 60-second TTL by default. After that, the agent has to re-propose.
  • Bound to the principal. The agent that proposed must be the agent that commits. Token forwarding doesn't work.
  • Bound to the operation and parameters. The token says "upgrade Acme to Scale" — using it for "upgrade Acme to Enterprise" fails.

Concretely:

PendingBillingConfirmation {
  id: string                  // ~256 bits, random
  org_id: string
  principal_id: string        // the agent that proposed
  operation: string           // "upgrade_plan"
  params: Record<string, unknown>  // canonical-form serialization
  created_at: Date
  expires_at: Date            // +60s default
  consumed_at: Date | null
}
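Minting the record at propose time might look like the following. This is a sketch under the shape above; the function name and helper defaults are hypothetical.

```typescript
import { randomBytes } from "crypto";

interface PendingBillingConfirmation {
  id: string;
  org_id: string;
  principal_id: string;
  operation: string;
  params: Record<string, unknown>;
  created_at: Date;
  expires_at: Date;
  consumed_at: Date | null;
}

// Mint a fresh confirmation: ~256 bits of randomness, 60-second TTL.
function mintConfirmation(
  orgId: string,
  principalId: string,
  operation: string,
  params: Record<string, unknown>,
  ttlMs = 60_000
): PendingBillingConfirmation {
  const now = new Date();
  return {
    id: "tok_" + randomBytes(32).toString("hex"), // 32 bytes = 256 bits
    org_id: orgId,
    principal_id: principalId,
    operation,
    params,
    created_at: now,
    expires_at: new Date(now.getTime() + ttlMs),
    consumed_at: null,
  };
}
```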

When the commit call arrives:

const token = await db.pendingConfirmation.findUnique({ where: { id } });
if (!token) throw new Error("no_token");
if (token.consumed_at) throw new Error("already_consumed");
if (token.expires_at < new Date()) throw new Error("expired");
if (token.principal_id !== principal.id) throw new Error("wrong_principal");
if (token.operation !== op) throw new Error("wrong_operation");
if (!paramsEqual(token.params, params)) throw new Error("param_mismatch");

// Same transaction:
await db.$transaction(async (tx) => {
  await tx.pendingConfirmation.update({
    where: { id },
    data: { consumed_at: new Date() },
  });
  await executeOp(op, params);  // the actual side effect
});

Token consumption and side effect commit together or neither does. The agent can retry only if both fail, and the retry needs a fresh proposal.
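The `paramsEqual` check depends on the canonical-form serialization mentioned in the token shape. One way to sketch it, assuming plain JSON-serializable params (no Dates, no cycles), is to serialize with sorted keys so that key order can't cause a spurious mismatch:

```typescript
// Canonical serialization: sorted object keys, recursive for nesting.
function canonical(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonical).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort(); // key order must not matter
    return (
      "{" +
      keys.map((k) => JSON.stringify(k) + ":" + canonical(obj[k])).join(",") +
      "}"
    );
  }
  return JSON.stringify(value);
}

function paramsEqual(
  a: Record<string, unknown>,
  b: Record<string, unknown>
): boolean {
  return canonical(a) === canonical(b);
}
```

The same `canonical` function can serialize `params` when the token is stored, so the stored form and the compared form can never drift.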

What's gated and what isn't

The list of operations on the consent-gate contract is short on purpose:

  Operation                      On the contract?   Why
  upgrade_plan                   Yes                Moves money
  downgrade_plan                 Yes                Loses access; refund logic
  cancel_subscription            Yes                Loses access
  add_org_member(role: admin)    Yes                Widens access
  remove_org_member              Yes                Loses access; data loss
  delete_workspace               Yes                Permanent data loss
  make_workspace_public          Yes                Widens access
  send_email_to_org_members      Yes                Audience exposure
  add_row                        No                 Reversible
  edit_doc                       No                 Reversible, diffs visible
  add_comment                    No                 Reversible, low stakes

The discipline: gate operations that move money, widen access, or can't be undone with a click. We covered the broader argument in The dangerous-ops contract.

A common mistake when shipping this is gating too much. If every agent operation requires a human approval, the friction trains the human to approve everything — and the gate stops being a gate. The protection only works when the gates are rare and the summaries are worth reading.

The summary is the entire UX

The human's only signal is the summary returned with the token. If the summary is bad, the gate is theatrical. If the summary is good, the human can evaluate in five seconds.

A good summary template:

Argus is requesting: Upgrade Acme from Pro ($19/mo) to Scale ($49/mo), effective immediately. Card ending in 4242 will be charged $49 today and on the 25th of each month going forward.

Reason given: "Acme team grew past 20 humans this morning; Pro caps at 20."

What's in there:

  • Actor (Argus)
  • Operation (upgrade Acme from Pro to Scale)
  • Cost ($49/mo, when it charges)
  • Card affected (ending 4242)
  • Reason (the agent's explanation)

The reason is optional but valuable — it gives the human enough to evaluate the premise of the operation. If the agent says "the team grew past 20" and the human knows the team is 12, the human rejects, even though the operation looks fine on its face.

We require summary templates as part of registering an operation on the contract. You can't add a gated operation without a template; you can't ship a vague summary.
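The registration requirement can be enforced structurally: make the template a mandatory argument, so a gated operation without a summary cannot exist. A minimal sketch, with hypothetical names — Dock's actual registry is richer than this:

```typescript
// A template turns the proposal's params into the human-readable summary.
type SummaryTemplate = (params: Record<string, unknown>) => string;

const gatedOps = new Map<string, SummaryTemplate>();

// The signature makes the template non-optional: there is no way to
// register a gated operation without one.
function registerGatedOp(name: string, template: SummaryTemplate): void {
  if (gatedOps.has(name)) throw new Error(`gated op ${name} already registered`);
  gatedOps.set(name, template);
}

registerGatedOp(
  "upgrade_plan",
  (p) =>
    `Upgrade ${p.org} from ${p.fromPlan} to ${p.toPlan} ` +
    `(${p.price}/mo, charged immediately)`
);
```

The propose handler then looks up the template by operation name and renders the summary it returns alongside the token.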

Failure modes from production

A few things we've seen in real use:

Agent proposes, human takes 2 minutes to approve, token expires. Fine — the agent re-proposes, gets a fresh token. The human sees the same summary again with a fresh expiry. One extra round trip; no security cost.

Agent proposes, commits successfully, retries the commit on a network blip. Token already consumed. Retry returns "already_consumed." The agent should not loop on this — it's a sign the agent's success-detection logic is wrong. We've seen agents loop here exactly once; the loop is bounded by rate limits.

Two agents try to commit the same proposal. Reject the second; the principal_id doesn't match.

Agent proposes A, human responds "do B instead." Agent must propose B fresh. Don't reuse the original token. The human approved A's summary, not B's operation.

Token in logs. Treat tokens like credentials. Redact in logs. The 60-second window is small but exfiltration during the window is possible if logs are exposed.

Why this is the right unit of safety

A consent gate is small. It adds one round trip and one human checkpoint to a small set of operations. That's the entire footprint.

What it gives you in exchange:

  • The agent cannot, in a loop, charge cards. The structural protection prevents the runaway.
  • The agent cannot, by accident, give away admin access. The summary forces the human to read.
  • The audit log records propose-and-approve as separate events. The blame line is clear.
  • The customer support ticket "the AI did something I didn't want" disappears for the gated operations. The human approved.

What it costs:

  • One round trip per gated operation.
  • One human checkpoint per gated operation.
  • Cognitive overhead on the engineering team to register operations correctly.

The trade is sharply favorable. We've shipped this pattern across every dangerous operation in the product, and the count of customer-support escalations from the gated set is exactly zero. The count from non-gated reversible operations is non-zero but recoverable.

What this composes with

Consent gates are one of five primitives we discussed in AI-agent-first primitives. They compose with:

  • Agent identity. The token binds to a specific agent. Without identity, the principal check is meaningless.
  • Attribution. The audit log records "Argus proposed; Govind approved." Both events are first-class.
  • Scoped permissions. The gate doesn't replace scope — it's an additional layer on top of scope. Argus has to be allowed to propose the operation in the first place; the gate only protects the commit.
  • Shape caps. The gate stops dangerous operations; the shape cap stops volume runaways. Different protections, different layers.

The mechanics of the handshake itself, including what happens when the agent retries or forwards tokens, are in Two-key handshakes for irreversible agent actions.

The smallest version

If you take one thing from this piece: pick the five most dangerous operations in your product, write summary templates for them, ship the propose-commit pattern. That's the entire protection on day one. The list grows as you ship features; the discipline stays the same.

The protection has to be in the substrate. You cannot rely on the agent to "remember" to confirm. The handler has to refuse to execute without a token. That's the structural protection. Everything else is decoration.

FAQ

What is a consent gate?

A pattern where an agent calling a dangerous operation receives a confirmation token plus a summary, surfaces the summary to its human owner, and only executes the operation by replaying the token after the human approves. The token is single-use, time-bound, and bound to the operation, parameters, and principal.

Which operations should be gated?

Operations that move money, widen access, or can't be undone with a click. Plan changes, member additions with admin role, workspace deletions, sending email to mailing lists, making things public. Routine reversible operations (row inserts, doc edits, comments) should not be gated — gating them trains humans to approve everything.

What happens if the agent skips the proposal step?

The commit call has no token. The handler rejects it. There's no path to the side effect that doesn't go through the proposal — the protection is structural, not behavioral.

How long does the token live?

Default 60 seconds. Long enough for the human to read the summary and approve, short enough that an exfiltrated token can't be used hours later. Tunable per operation if needed.

Can the user disable consent gates for trusted agents?

No. The contract isn't user-configurable. Letting a customer skip confirmation for "trusted agents" defeats the protection — the agent that's been good for a year is still subject to prompt injection or model misfires. The friction is the point.
