---
title: "Dock for IT ops: incident-response workflow with attributed timeline"
excerpt: "Dock runs the IT incident-response workflow by letting an agent assemble an attributed timeline from ServiceNow, PagerDuty, and Datadog while a human IT lead approves the post-mortem before it persists."
author: mei
category: Use Cases
date: "2026-05-30"
---

## How does Dock run an IT incident-response workflow with an attributed timeline?

Dock runs incident response as a reviewed assembly. A named agent watches PagerDuty for a page, pulls correlated signals from Datadog, opens the ServiceNow incident record, and writes a single Dock row that holds the timeline, the inferred root cause, and the proposed remediation. An IT lead reviews each line, approves or rewrites the post-mortem, and only then does the row persist as the durable record. Telemetry stays where it lives. The interpretation is what Dock keeps.

## The architecture

ServiceNow, PagerDuty, and Datadog stay the systems of record for the raw incident data: tickets, pages, metrics, logs, traces. Dock is the system of record for what the agent interprets from that data. Each Dock incident row points back to its source records through a `servicenow_incident_id`, a `pagerduty_incident_id`, and the relevant `datadog_event_id`, and it also records the agent identity, the decision made, the reviewing IT lead, and the timestamp. When the agent needs current state mid-incident, it re-fetches that state through fresh API reads rather than trusting a cached view. This is the same separation we describe in [Dock for IT operations](/blog/dock-for-it-operations) and the broader pattern in [agent audit and compliance](/blog/agent-audit-and-compliance).
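To make the separation concrete, here is a minimal sketch of what one incident row might carry. The class names, field layout, and the Datadog event ID are illustrative assumptions, not Dock's actual schema; the point is that the row stores pointers and interpretation, never a copy of the telemetry.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class SourcePointers:
    """IDs that point back to the systems of record; no telemetry is copied."""
    servicenow_incident_id: str
    pagerduty_incident_id: str
    datadog_event_ids: Tuple[str, ...] = ()


@dataclass
class IncidentRow:
    """Hypothetical shape of one Dock incident row (an assumption for illustration)."""
    incident_id: str
    sources: SourcePointers
    agent: str                      # identity of the agent that wrote the row
    decision: str                   # the interpretation the agent recorded
    reviewer: Optional[str] = None  # stays empty until an IT lead reviews
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


row = IncidentRow(
    incident_id="INC-4471",
    # "dd-evt-001" is a made-up Datadog event ID used only for this sketch.
    sources=SourcePointers("INC0029841", "PD-Q8X2", ("dd-evt-001",)),
    agent="argus@dock",
    decision="Connection pool exhaustion under new retry policy",
)
print(row.sources.pagerduty_incident_id, row.reviewer)
```

Keeping the pointers in their own immutable object is one way to signal that the links to upstream records should not change once the row is written, while the interpretation fields remain open to review.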

## The Dock surface: an `incidents` table

| incident_id | pagerduty_incident_id | servicenow_incident_id | severity | agent_summary | proposed_root_cause | reviewer | status |
|---|---|---|---|---|---|---|---|
| INC-4471 | PD-Q8X2 | INC0029841 | SEV2 | 14 min of checkout HTTP 500s after deploy 7c3a; Datadog shows p99 spike on `orders-api` | Connection pool exhaustion under new retry policy | govind | approved |
| INC-4472 | PD-Q8X4 | INC0029852 | SEV3 | Kafka consumer lag on `events.raw`; redelivery storm at 03:14 UTC | Downstream sink throttled by provider | dustin | pending |
| INC-4473 | PD-Q8X9 | INC0029860 | SEV1 | Auth service 5xx for 6m; correlated to cert rotation job | Expired intermediate not re-pushed to edge | govind | rewritten |

## The worked workflow

PagerDuty pages on a SEV2 in `orders-api`. The on-call agent, identified as `argus@dock`, opens INC-4471 in Dock and writes the first three timeline entries from the page payload. It queries Datadog for the matching service, pulls the deploy marker that fired four minutes earlier, and appends a metrics block with links back to the Datadog event IDs. It opens the ServiceNow incident, attaches the Dock row URL to the work notes, and proposes a root cause.
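The assembly step above can be sketched as a list of attributed entries merged into chronological order. Everything here is a hypothetical illustration: the `TimelineEntry` type, the clock times, and the Datadog event ID are assumptions, and a real agent would read these values from each system's API rather than hard-coding them.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TimelineEntry:
    at: datetime      # when the event happened upstream
    source: str       # "pagerduty", "datadog", or "servicenow"
    source_ref: str   # ID of the upstream record (a pointer, not a copy)
    author: str       # who wrote this line into the timeline
    text: str


def build_timeline(entries):
    """Return entries in chronological order with attribution intact."""
    return sorted(entries, key=lambda e: e.at)


def utc(hour, minute):
    # Illustrative date and times only; not taken from a real incident.
    return datetime(2026, 5, 30, hour, minute, tzinfo=timezone.utc)


AGENT = "argus@dock"
timeline = build_timeline([
    TimelineEntry(utc(10, 4), "pagerduty", "PD-Q8X2", AGENT, "SEV2 page on orders-api"),
    TimelineEntry(utc(10, 0), "datadog", "dd-evt-001", AGENT, "Deploy marker for 7c3a"),
    TimelineEntry(utc(10, 6), "servicenow", "INC0029841", AGENT, "Incident record opened"),
])

for entry in timeline:
    print(f"{entry.at:%H:%M} [{entry.source}:{entry.source_ref}] {entry.author}: {entry.text}")
```

Sorting by the upstream timestamp rather than by write order is what lets the deploy marker appear ahead of the page, even though the agent discovered it afterwards.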

The IT lead reviews. She accepts the timeline, edits the root cause from "connection pool exhaustion" to "retry policy amplified pool exhaustion under deploy 7c3a," and approves. The row flips to `approved`. The agent then drafts a remediation: revert the retry change and raise the pool ceiling. Because rollback touches production, it routes through the [dangerous ops contract](/blog/dangerous-ops-contract) and a [two-key handshake](/blog/two-key-handshakes-irreversible) before it executes. The post-mortem is the Dock row, frozen at approval, with every edit attributed.
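The review step can be modeled as a small state change with an edit log. This is a sketch under assumptions, not Dock's implementation: the `PostMortem` and `Edit` classes and their method names are hypothetical, and it covers only the root-cause field.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Edit:
    field_name: str
    old: str
    new: str
    editor: str
    at: datetime


@dataclass
class PostMortem:
    root_cause: str
    status: str = "pending"
    reviewer: Optional[str] = None
    edits: List[Edit] = field(default_factory=list)

    def rewrite_root_cause(self, new, editor):
        # Once approved, the row is frozen and rejects further edits.
        if self.status == "approved":
            raise PermissionError("row is frozen after approval")
        now = datetime.now(timezone.utc)
        self.edits.append(Edit("root_cause", self.root_cause, new, editor, now))
        self.root_cause = new

    def approve(self, reviewer):
        # A named reviewer is required before the status can change.
        if not reviewer:
            raise ValueError("a reviewer is required before approval")
        self.reviewer = reviewer
        self.status = "approved"


pm = PostMortem(root_cause="connection pool exhaustion")
pm.rewrite_root_cause(
    "retry policy amplified pool exhaustion under deploy 7c3a", editor="govind"
)
pm.approve("govind")

try:
    pm.rewrite_root_cause("something else", editor="argus@dock")
except PermissionError as err:
    print(f"rejected: {err}")
```

Recording the old value alongside the new one keeps the agent's original proposal visible after the human rewrite, which is what makes each edit attributable.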

## Why it matters

The attributed timeline is the artifact regulators and auditors actually want. NIST SP 800-61 frames incident handling around documented detection, analysis, and post-incident review, and the Google SRE Book treats keeping a living incident document, ideally one that several people can edit at once, as the incident commander's most important responsibility. Dock makes that document a typed row with identity on every line.

It also closes the gap between security and ops. The same surface pattern shows up in [Dock for security operations](/blog/dock-for-security-operations), because incident response is the same problem viewed from two angles. The agent that writes the timeline is the same kind of principal we describe in [agent identity](/blog/agent-identity): named, scoped, reviewable.

And it removes the post-mortem tax. When the interpretation is captured live with citations to telemetry, the write-up is a read, not a rewrite.

Try Dock for your next incident bridge at dock.com.

## FAQ

**Does Dock replace ServiceNow or PagerDuty?**
No. ServiceNow holds the ticket of record, PagerDuty holds the page, Datadog holds the telemetry. Dock holds the agent's interpretation with pointers back to each.

**Who can approve a post-mortem?**
Whichever role the workspace policy designates; in this workflow, that is the IT lead. The `reviewer` field is required before the row can move to `approved`, and the identity is captured per line.

**What happens to the row if Datadog data ages out?**
The pointer remains. The agent can re-fetch what is still in retention, and the Dock row preserves the agent's interpretation and the human approval even when the upstream metrics roll off.
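A rough sketch of that behavior, assuming the row keeps a list of Datadog event IDs: `refresh_evidence` and `fake_fetch` are hypothetical names, and the fetch function is a stand-in for a real API read, not a Datadog client call.

```python
def refresh_evidence(row, fetch_event):
    """Re-read upstream data by pointer; the interpretation is kept either way."""
    live = {}
    for event_id in row["datadog_event_ids"]:
        try:
            live[event_id] = fetch_event(event_id)
        except LookupError:
            live[event_id] = None  # aged out of retention; the pointer stays
    return {**row, "live_evidence": live}


# Stand-in for the upstream store: only one of the two events is still retained.
retained = {"dd-evt-001": {"metric": "p99 latency", "service": "orders-api"}}


def fake_fetch(event_id):
    return retained[event_id]  # raises KeyError once the event has rolled off


row = {
    "incident_id": "INC-4471",
    "proposed_root_cause": "Connection pool exhaustion under new retry policy",
    "reviewer": "govind",
    "status": "approved",
    "datadog_event_ids": ["dd-evt-001", "dd-evt-002"],  # made-up IDs
}

refreshed = refresh_evidence(row, fake_fetch)
print(refreshed["live_evidence"])
```

The row's interpretation and approval fields pass through unchanged, so a missing upstream event degrades the evidence view without touching the record itself.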

**Can the agent execute a rollback itself?**
Only through the dangerous-ops contract, with a two-key handshake on irreversible actions. Drafting the remediation is one step. Executing it is a separate, gated step with its own approval.
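As a sketch of the gating idea only: the example below assumes that "two keys" means two distinct approvers, neither of whom is the requesting agent. That reading, the `GatedAction` class, and its methods are assumptions for illustration; the linked posts define the actual contract.

```python
from dataclasses import dataclass, field


@dataclass
class GatedAction:
    description: str
    requested_by: str
    approvals: set = field(default_factory=set)
    executed: bool = False

    def approve(self, approver):
        # Assumption: the requester cannot supply one of its own keys.
        if approver == self.requested_by:
            raise PermissionError("the requester cannot approve its own action")
        self.approvals.add(approver)

    def execute(self, run, required_keys=2):
        # Drafting and executing are separate steps; execution needs the keys.
        if len(self.approvals) < required_keys:
            raise PermissionError(
                f"{len(self.approvals)} of {required_keys} approvals present"
            )
        run()
        self.executed = True


ran = []
action = GatedAction("revert retry policy from deploy 7c3a", requested_by="argus@dock")

action.approve("govind")
try:
    action.execute(lambda: ran.append("rollback"))
except PermissionError as err:
    print(f"blocked: {err}")

action.approve("dustin")
action.execute(lambda: ran.append("rollback"))
print(ran, action.executed)
```

Storing approvals in a set means the same person approving twice still counts as one key.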
