---
title: "Dock for data analytics: A/B experiment review with an attributed analyst-lead"
excerpt: "Dock turns an Eppo or Statsig experiment readout into a structured decision memo with an attributed analyst-lead reviewer, so ship-or-kill calls carry a name, a timestamp, and a trail back to the underlying dbt model."
author: mei
category: Use Cases
date: "2026-05-30"
---

A/B experiments stall between "results look green" and "we shipped it." The readout lives in Eppo or Statsig, the metric definition lives in dbt, the decision lives in someone's head. Dock seats an experiment-review agent next to the analyst-lead. The agent reads the readout, drafts a decision memo, and the analyst-lead signs off. The row records who decided, what they saw, and which dbt model fed the numbers. See [Dock for data analytics](/blog/dock-for-data-analytics).

Eppo, Statsig, and dbt stay the systems of record for the raw data. Dock is the system of record for the interpretation: what the agent recommended and what the reviewer decided. Each Dock row carries a pointer back to the platform record, plus the agent identity, decision, reviewer, and timestamp. When the agent needs current state, it re-fetches platform data via fresh API reads.

## The Dock surface: Experiment decisions table

| experiment | platform | primary_metric | lift | p_value | agent_call | analyst_lead_review | decision | dbt_model |
|---|---|---|---|---|---|---|---|---|
| checkout_v3 | [eppo/exp_8821](https://eppo.example.com/exp/8821) | gross_revenue_per_visitor | +2.4% | 0.018 | ship | priya@ (approved 2026-05-28) | shipped | [marts.fct_checkout_sessions](https://dbt.example.com/#!/model/fct_checkout_sessions) |
| onboarding_nudge | [statsig/exp_4410](https://statsig.example.com/exp/4410) | d7_activation | +0.6% | 0.21 | hold | priya@ (rejected, SRM flag) | rerun | [marts.fct_activation](https://dbt.example.com/#!/model/fct_activation) |
| pricing_banner | [eppo/exp_9102](https://eppo.example.com/exp/9102) | trial_starts | -1.1% | 0.04 | kill | dan@ (approved 2026-05-29) | killed | [marts.fct_trial_funnel](https://dbt.example.com/#!/model/fct_trial_funnel) |

Each row is one decision. The `agent_call` column is the draft. The `analyst_lead_review` column is the binding sign-off. Both are preserved, so a later auditor can ask why, as in the `onboarding_nudge` row, the agent said "hold" and the human said "rerun."
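
As a rough illustration, one row of the table above could be modeled as an immutable record. This is a sketch only: the class name, field names, and the `is_override` helper are hypothetical and do not come from Dock's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a recorded decision should not be edited in place
class ExperimentDecision:
    """Hypothetical shape of one row in the experiment decisions table."""
    experiment: str
    platform_ref: str            # pointer back to the Eppo or Statsig record
    primary_metric: str
    lift_pct: float
    p_value: float
    agent_call: str              # the agent's draft: "ship", "hold", or "kill"
    reviewer: str                # the analyst-lead who signed the row
    review_status: str           # "approved" or "rejected"
    decision: str                # the binding outcome
    dbt_model: str               # model pinned at decision time
    review_note: Optional[str] = None

    def is_override(self) -> bool:
        # Assumption: a rejected review means the human overrode the draft.
        return self.review_status == "rejected"


row = ExperimentDecision(
    experiment="onboarding_nudge",
    platform_ref="statsig/exp_4410",
    primary_metric="d7_activation",
    lift_pct=0.6,
    p_value=0.21,
    agent_call="hold",
    reviewer="priya@",
    review_status="rejected",
    decision="rerun",
    dbt_model="marts.fct_activation",
    review_note="SRM flag",
)
```

Keeping `agent_call` and `decision` as separate fields is what lets an auditor compare the draft with the final call later.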

## The workflow

When Eppo or Statsig marks a test as readout-ready, the agent pulls the result, the variant assignment counts, and the metric definition from the corresponding dbt model. It writes a draft memo into a Dock row: observed lift, confidence interval, sample size per arm, sample-ratio-mismatch check, and a ship/hold/kill recommendation. Priya, the analyst-lead, gets the row in her queue. She opens the linked Eppo dashboard, confirms the cut, and either approves the agent's call or overrides it. Either way, her sign-off is recorded on the row. Ship decisions flow to the eng-lead via the same row, with the dbt model pinned for post-launch monitoring. The agent acts under its own identity, not Priya's seat. See [agent identity](/blog/agent-identity).
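
The drafting step can be pictured as a small rule that maps a readout to a recommendation. The sketch below is an assumption, not Dock's actual logic: the function name `draft_call` and the 0.05 significance threshold are hypothetical, chosen only so the rule reproduces the three `agent_call` values in the table above.

```python
def draft_call(lift_pct: float, p_value: float, srm_flagged: bool,
               alpha: float = 0.05) -> str:
    """Return a draft "ship", "hold", or "kill" recommendation.

    Hypothetical rule for illustration; alpha=0.05 is an assumed threshold.
    """
    if srm_flagged:
        return "hold"   # a mismatched sample cannot support a ship or kill call
    if p_value >= alpha:
        return "hold"   # result is inconclusive at the assumed threshold
    if lift_pct > 0:
        return "ship"
    if lift_pct < 0:
        return "kill"
    return "hold"       # zero observed lift: nothing to act on
```

Whatever the rule, the output is only the draft. The analyst-lead's review remains the binding step.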

## Why it matters

Experiment platforms are good at math and bad at memory. Six months later, no one can reconstruct why a borderline test shipped or why a winner got killed. Dock keeps the interpretation. The pinned dbt model means a definition change later does not silently invalidate the decision. This is the audit layer in [agent audit and compliance](/blog/agent-audit-and-compliance), and the same shape we use for [engineering](/blog/cloud-2-0-for-engineering) and [product](/blog/cloud-2-0-for-product) decisions.

Statsig's guidance on multiple-comparison corrections[^1] and Eppo's writing on sample-ratio mismatch[^2] point at the same gap: statistical significance alone is not a decision.

## Try it

Point Dock at your Eppo or Statsig workspace and your dbt project. The first experiment readout shows up in your analyst-lead's queue with a draft memo attached.

## FAQ

**Does the agent decide which experiments ship?**
No. The agent drafts a recommendation. The analyst-lead's signature is the binding decision. Per [agent identity](/blog/agent-identity), the agent acts under its own credentials, so draft and override are both attributable.

**What if the dbt metric definition changes after the decision?**
The row pins the model reference at decision time. On re-fetch, the agent flags any drift from the pinned definition and routes affected experiments back to the analyst-lead.
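
One plausible way to detect that kind of drift is to store a fingerprint of the model definition when the decision is signed and compare it on re-fetch. This is a sketch under assumptions: the source does not say how Dock pins a model, and `fingerprint` and `has_drifted` are hypothetical helpers.

```python
import hashlib


def fingerprint(model_sql: str) -> str:
    """Hash a model definition, ignoring differences in whitespace."""
    # Collapse all runs of whitespace so pure reformatting does not count as
    # drift. Note this also collapses whitespace inside SQL string literals.
    normalized = " ".join(model_sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_drifted(pinned_fingerprint: str, current_sql: str) -> bool:
    """True when the current definition no longer matches the pinned one."""
    return fingerprint(current_sql) != pinned_fingerprint


# Pin at decision time, compare on a later re-fetch.
pinned = fingerprint("select  sum(revenue)\nfrom sessions")
reformatted = has_drifted(pinned, "select sum(revenue) from sessions")
redefined = has_drifted(pinned, "select sum(net_revenue) from sessions")
```

Here `reformatted` is `False` and `redefined` is `True`: only the change to the metric expression would send the experiment back for review.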

**Does this replace Eppo or Statsig?**
No. Those platforms remain the system of record for assignment, variance, and stat tests. Dock records what a named agent said and what a named human decided.

**How does this handle sample-ratio mismatch?**
The agent runs the SRM check on every readout and refuses to recommend "ship" when the chi-square test flags a mismatch. The hold routes to the analyst-lead with the flag on the row.
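
For a two-arm test, the chi-square check compares observed assignment counts with the planned split. The sketch below uses only the standard library. The function names, the 50/50 default split, and the 0.001 flagging threshold are assumptions for illustration, not Dock's configuration.

```python
import math


def srm_p_value(control: int, treatment: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value for a two-arm assignment split."""
    total = control + treatment
    if total <= 0 or not 0 < expected_control_share < 1:
        raise ValueError("need positive counts and a share strictly between 0 and 1")
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control - expected_control) ** 2 / expected_control
            + (treatment - expected_treatment) ** 2 / expected_treatment)
    # With two arms there is 1 degree of freedom, where the chi-square
    # survival function has the closed form erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def srm_flagged(control: int, treatment: int, threshold: float = 0.001) -> bool:
    """Flag a mismatch when the p-value falls below the assumed threshold."""
    return srm_p_value(control, treatment) < threshold
```

With this sketch, a 5,000 / 5,000 split is not flagged, while a 5,000 / 5,400 split on a planned 50/50 allocation is.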

[^1]: Statsig, "Correct me if I'm wrong: Navigating multiple comparison corrections in A/B Testing," statsig.com/blog.
[^2]: Eppo, "What to Do When You Encounter Sample Ratio Mismatch in A/B Testing," geteppo.com/blog.
