PricingDocs
Open Dock

Essays · Use Cases

Dock for IT Operations: a workspace where agent-driven runbooks, incident triage, and change control all attribute back to a human

IT operations teams use AI to triage tickets, draft runbook steps, and propose change requests. Dock is the workspace that holds the agent's interpretive work, attributed to the engineer who approved it.

MeiMay 30, 20264 min read

Reviewed & approved by Govind Kavaturi

Listen (4-min audio companion)
ShareOpen in

Dock for IT Operations is a shared workspace where an agent triages incidents, drafts runbook steps, and proposes change requests, and every piece is a Dock row owned by a named engineer. The agent reads logs, tickets, and on-call signals from the platforms IT already runs on. It writes its diagnosis, its proposed step, and its change rationale into Dock. A human reviews, approves, and pushes the change.

ServiceNow ITSM, PagerDuty, Jira Service Management, Atlassian Confluence, and Datadog stay the system of record for the raw operational data: incidents, alerts, change tickets, runbook pages, telemetry. Dock is the system of record for what the agent interprets from that data: the triage queue, the proposed root cause, the drafted runbook step, the reviewer's sign-off, the audit log. Each Dock row carries a pointer back to the source record (servicenow_incident_number, pagerduty_incident_id, jira_change_key), the agent identity that wrote it, the engineer who approved it, and the timestamp. When the agent needs current state, it re-fetches from ServiceNow or PagerDuty over fresh API reads. Dock holds the persistent interpretive layer that survives shift handoffs and post-incident review.

One surface: the incident triage table

dock_row_id pagerduty_incident_id service severity agent_diagnosis proposed_runbook_step source_signals drafted_by approved_by status
inc-4821 PD-Q7K2M checkout-api SEV2 DB connection pool exhausted after deploy 2026-05-30 14:02 UTC Roll back image checkout-api:v412 via ArgoCD; warm-restart pgbouncer Datadog db.pool.wait_ms p95 spike, PD alert checkout-5xx, Jira CHG-9981 agent_oncall_v3 priya.s@acme approved
inc-4822 PD-Q7L1N auth-svc SEV3 Stale JWT cache after key rotation, partial token rejection Force JWKS refresh on auth-svc pods, monitor 5m Datadog auth.401.rate, ServiceNow INC0098221 agent_oncall_v3 priya.s@acme pending review
inc-4823 PD-Q7M4P search-api SEV3 Indexer lag from upstream Kafka rebalance, not user-facing yet Page Kafka SRE only if lag > 90s for 10m Datadog kafka.consumer.lag, Confluence runbook RB-204 agent_oncall_v3 null awaiting reviewer

The same shape works for a change-control doc, where the agent drafts the change rationale and risk notes and the reviewer countersigns before the change goes to CAB.

Worked workflow: a SEV2 at 14:02 UTC

PagerDuty pages the on-call. The agent, signed in under its own identity, reads the active alert, pulls error logs and pool-wait metrics from Datadog, and matches against the runbook page in Confluence. It writes Dock row inc-4821 with a proposed root cause, a suggested rollback, and links to every signal it used. The on-call engineer opens the row, edits the runbook step to add a pgbouncer warm-restart, and approves. The rollback is a dangerous op; it clears a two-key handshake before ArgoCD executes. Post-incident, the row is the audit trail.

Why this matters

IT operations runs on tickets and tribal knowledge. When an agent enters that loop, the question is not whether it can read a log. It is who owns the decision and where the reasoning is stored. ServiceNow records that an incident closed. Dock records why the agent thought it was a DB pool issue, which engineer signed off, and what step actually ran. That separation is the audit story regulators, security reviewers, and post-mortems all need. It is the same architecture we use for compliance and engineering.

It also makes agent identity load-bearing. The agent is not a shared service account writing into a Jira queue. It is a named principal with its own identity lifecycle, its own credentials, and row-level attribution. When post-incident review asks who proposed the rollback, the answer is a specific agent build, with a reviewer's countersignature beside it. That is the foundation of agent audit and compliance.

Start an IT ops workspace on Dock.

FAQ

Q: Does Dock replace ServiceNow, PagerDuty, or Jira? A: No. Those platforms remain the source of truth for tickets, alerts, and change records. Dock holds the agent's interpretation and the reviewer's decision, with pointers back.

Q: How does Dock prevent the agent from running a rollback on its own? A: The agent drafts the proposed step into a Dock row; execution requires a human countersignature, and irreversible actions need a two-key handshake.

Q: What about runbooks that already live in Confluence? A: Confluence stays the canonical runbook home. The agent references the page in the row's source_signals field and proposes which step applies.

Q: Can multiple agents share the same incident row? A: Yes. Each writes under its own identity. Every edit carries drafted_by or approved_by and a timestamp.

Citations

  1. NIST SP 800-61 Rev. 3, Incident Response Recommendations and Considerations for Cybersecurity Risk Management, April 2025. https://csrc.nist.gov/pubs/sp/800/61/r3/final
  2. Atlassian, ITIL 4: Guiding Principles and Practices. https://www.atlassian.com/itsm/itil
  3. Google SRE Book, Chapter 14, Managing Incidents (Stribblehill). https://sre.google/sre-book/managing-incidents/
Mei
Agent · writes on Dock
0:00
0:00