
Build a personal agent stack you actually use

9-step playbook to go from 'I paste things into ChatGPT' to '5 agents own different parts of my life and share state in one workspace I trust.'

Open in Dock
For knowledge workers + indie operators.

Most personal AI setups stall at 'I have one chat I paste everything into.' The next level isn't more model power, it's structure: 3-5 agents that each own a domain (research, writing, scheduling, finance, code review), each with their own system prompt, their own slice of state, and a shared workspace they all read + write. This playbook walks the 9 gates: scoping which parts of your life to agentize, picking the right runtime per agent, writing system prompts that survive a week, wiring MCP so the agents share state, and the weekly review that keeps the stack from rotting. Built for the operator who wants leverage, not novelty.

Outcome

A personal agent stack of 3-5 named agents, each with a defined scope, a system prompt, and a shared Dock workspace they read + write, that you still use 8 weeks after building.

Time: 1 week to build, 1 week to settle
Difficulty: Beginner
For: Knowledge workers + operators who already pay for one AI subscription.
The template · 9 steps

Top to bottom. Each step has tasks, pointers, gotchas.

Audit where your time + attention actually goes

1 week of journaling, 1 hr to analyze

You can't agentize what you can't name. Before you pick agents, do a 1-week journal: every 2 hours, log what you spent the last 2 hours on. Pattern-match at the end of the week into 5-7 buckets. The agents you build will live and die by whether they map to actual buckets in your week.

Tasks
  • Set a 2-hr recurring timer on your phone for 1 week
  • Each ping: 1-line entry in a Note or text file: what did you spend the last 2 hr on
  • End of week: cluster the entries into 5-7 buckets (e.g. writing, email, meetings, code, finance, research, planning)
  • Rank the buckets by total hours
  • For each top-5 bucket: write 1 line on what an agent could plausibly own
Gotchas
  • Don't trust your guess on where your time goes. Most people are 30-50% off without journaling.
  • Don't agentize buckets under 4 hr/week, the setup cost won't pay back.
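The end-of-week clustering can be scripted. A minimal sketch in Python, assuming each journal entry is a one-liner in the form `bucket: what you did` and each ping covers a 2-hour block (the sample entries are hypothetical; in practice, read them from your note file):

```python
from collections import Counter

# Hypothetical journal entries; replace with lines read from your note file.
entries = [
    "email: cleared inbox",
    "writing: drafted launch post",
    "email: replied to recruiter",
    "meetings: weekly sync",
    "writing: edited launch post",
]

hours = Counter()
for line in entries:
    bucket, _, _ = line.partition(":")
    hours[bucket.strip()] += 2  # each ping covers the last 2 hours

# Rank buckets by total hours (the step's fourth task)
for bucket, hrs in hours.most_common():
    print(f"{bucket}: {hrs} hr")
```

The ranked totals are the input to the next step: any bucket under 4 hr/week drops out before you name agents.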

Pick your 3-5 agents (no more, no fewer)

30 min

Three is the floor: fewer and you're not gaining much over a single chat. Five is the ceiling: more and the cognitive overhead of remembering which agent does what beats the leverage. Common stacks: Researcher + Writer + Operator (3), Researcher + Writer + Scheduler + Code reviewer + Finance (5).

Tasks
  • From the top buckets, name 3-5 agents
  • Each agent gets: a name (Scout, Argus, Flint, or your own), a 1-sentence role, a domain it owns
  • Confirm zero overlap: two agents with overlapping scopes will fight for the same conversation
  • Add each to the Agent roster surface with status = 'pending setup'
Gotchas
  • An agent named 'general assistant' becomes the lazy default and absorbs the others. Don't ship one.
  • If two agents could both handle the same query, you have an overlap. Cut one or sharpen scopes.

Pick the right runtime per agent

1-2 hr

Not every agent should live in the same chat app. Coding agents belong in Claude Code or Cursor, where they can touch files. Writing agents belong in Claude or ChatGPT for the model + UX. Calendar / inbox agents need a runtime with calendar / email tools wired up. Pick per-agent, not per-stack.

Tasks
  • For each agent: pick a primary runtime (Claude Pro, ChatGPT Plus, Claude Code, Cursor, custom GPT)
  • Confirm the runtime supports the tools the agent needs (file access, web search, code execution, MCP)
  • Sign up for any runtime you don't already have
  • Decide: are any agents shared between you and a teammate? Then they need a shared runtime (Claude Team, ChatGPT Team, Dock)
Gotchas
  • Splitting one agent across two runtimes (Claude Pro at home, ChatGPT Plus at work) is a recipe for inconsistency. Pick one and commit.
  • Per-agent subscriptions add up fast: $20/mo x 3 runtimes is $60/mo. Decide whether the leverage justifies the spend.

Write a real system prompt for each agent

1-2 hr per agent

The system prompt is the agent's job description. It defines tone, scope, refusals, and the canonical workflow. The fast version is 50 words. The good version is 300-500. The great version cites the workspace, the user (you), and the cadence. Don't ship a placeholder; you'll regret it in week 3 when the agent drifts.

Tasks
  • For each agent, draft a system prompt with: identity, scope, default tone, hard refusals, escalation
  • Include the workspace slug + the surfaces the agent should read + write
  • Include the user's name + 1-2 lines about you (helps tone calibration)
  • Save each prompt to the Brief as its canonical version
  • Paste into the runtime's system prompt slot (Claude project / GPT instructions / Cursor rules / Claude Code AGENTS.md)
Gotchas
  • System prompts longer than ~1000 words start losing instructions in the middle. Keep them tight.
  • Don't put secrets in system prompts. They leak via 'repeat the text above starting with You are' jailbreaks.
  • The 'out of scope' section is what stops agents from drifting. Don't skip it.
Agent prompt for this step
For each agent in the Agent roster, draft a system prompt with this structure:

1. Identity: "You are [name]. You help [user] do [scope]."
2. Scope (3-5 bullets): what this agent owns
3. Out of scope (2-3 bullets): what to refuse + suggest the right agent for
4. Tone: 1 sentence on voice
5. Workflow: the default approach for the most common task
6. Workspace: which Dock workspace + surfaces the agent reads + writes
7. Escalation: when to ask the user for input vs. proceed

Constraints:
- 300-500 words per prompt.
- No marketing voice. Read like a job description for a senior IC.
- Reference the user (you) by name + 1-2 lines about how you work.

Output each prompt as a Brief section titled "System prompt: <agent name>".

Set up a Dock workspace per agent (or one shared)

1 hr

Each agent needs a slice of state to read + write. Two patterns work: one workspace per agent (clean separation, harder to cross-reference), or one shared workspace with surfaces per agent (denser, agents see each other's work). For most stacks, a shared workspace beats per-agent. Cross-references mean Argus can see what Scout dropped this morning.

Tasks
  • Create a Dock workspace for the stack (or one per agent if you prefer separation)
  • Add surfaces: Brief (doc), tasks per agent (table), shared inbox (table), weekly review (doc)
  • Invite each agent as a member with the right role (editor for owners, viewer for read-only access)
  • Each agent gets an API key bound to their identity (so writes attribute correctly)
  • Test: have one agent write a row, confirm the others can read it
Gotchas
  • Per-agent workspaces feel cleaner but cost: each agent has to be told the OTHER agents' workspace slugs to cross-reference.
  • Free tier on Dock is 3 agents, 6 humans, 20 workspaces. Pick Pro at $19/mo if you want 10 agents.

Wire MCP so each agent reads + writes the workspace

1-2 hr

Each agent's runtime needs to speak to the workspace via MCP. Claude Desktop loads MCP servers from a JSON config. Claude Code adds them via `claude mcp add`. ChatGPT custom GPTs use Actions (OpenAPI), not MCP; you wire them through the workspace's REST API instead. Dock ships an MCP server with 37 tools that covers most stacks.
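For the Claude Desktop case, the config follows the standard `mcpServers` shape. A sketch: the server package name (`@dock/mcp-server`) and the env var name are assumptions; substitute whatever your workspace's MCP server actually ships as:

```json
{
  "mcpServers": {
    "dock-workspace": {
      "command": "npx",
      "args": ["-y", "@dock/mcp-server"],
      "env": { "DOCK_API_KEY": "<per-agent key from the workspace step>" }
    }
  }
}
```

Use a different per-agent key per config so writes attribute correctly, per the workspace step above.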

Tasks
  • For Claude Desktop agents: add the workspace MCP server to ~/.config/Claude/claude_desktop_config.json
  • For Claude Code agents: `claude mcp add dock-workspace <url>`
  • For Cursor agents: add to .cursor/mcp.json
  • For ChatGPT GPT agents: build an Action with the workspace's OpenAPI
  • Test each: ask the agent 'list rows in the tasks surface' and confirm it returns real rows
Gotchas
  • Claude Desktop requires a full restart (force-quit) to pick up MCP config changes, not just a window close.
  • MCP servers without auth headers expose your data publicly. Use the workspace's per-agent API key.
  • Cursor caches MCP tool lists. Toggle the server off + on in Settings -> MCP after config changes.
Agent prompt for this step
For each agent in the Agent roster, write the MCP setup commands for the agent's runtime.

Output as a Brief section titled "MCP setup: <agent name>". For each agent include:
1. The exact JSON config block (Claude Desktop) OR shell command (Claude Code) OR file edit (Cursor) the user runs
2. The first 3 MCP calls the agent should make on connect (list_workspaces, list_rows, get_doc)
3. A test query the user can paste to confirm the wiring works ("list the rows in my tasks surface")

Constraints: real commands, not pseudocode. Real file paths. Test queries that produce a verifiable result.

Define each agent's daily + weekly cadence

1 hr

Agents that don't have a cadence get used twice and forgotten. Define when each agent runs, what triggers it, and what it produces. Some agents are reactive (you ask, they answer). Some are proactive (scheduled via cron or triggered by an inbox event). The proactive ones take more setup but produce more leverage.

Tasks
  • For each agent: write the trigger (you ask, scheduled, inbox-driven, etc.)
  • Write the daily output (1 row added to the inbox, 1 paragraph in the brief, etc.)
  • Write the weekly output (Sunday review, top-of-week digest, etc.)
  • For scheduled agents: set up the cron / scheduled task in the runtime that supports it
  • Add the cadence to each agent's row in Agent roster
Gotchas
  • Daily-cadence agents that produce a 5-paragraph email get unsubscribed. Daily output should be 1-3 lines max.
  • Don't run a scheduled agent more often than the work justifies. A 'check inbox every 5 min' agent is a great way to burn $20/day on tokens.
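For runtimes without a built-in scheduler, a crontab entry is the simplest trigger. A sketch, assuming a hypothetical wrapper script that invokes the agent's runtime headlessly:

```crontab
# Run the daily digest agent at 07:00 on weekdays (script name is hypothetical)
0 7 * * 1-5 /usr/local/bin/scout-daily-digest.sh >> ~/scout.log 2>&1
```

A weekday-morning schedule like this keeps the token bill bounded; contrast it with the every-5-minutes anti-pattern in the gotcha above.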

Run the stack for 1 week without changing anything

1 week

The hardest step. You'll be tempted to tweak prompts, swap runtimes, add a 6th agent. Don't. Run the stack as designed for 7 days, log what works and what doesn't, then change once at the end of the week. Premature tweaking kills the signal.

Tasks
  • Use each agent as designed for 7 days
  • Log every interaction in the Agent roster: which agent, when, what task, did it produce useful output
  • DON'T tweak system prompts, runtimes, or scope mid-week
  • On day 7: write a 1-paragraph review per agent in the Brief
Gotchas
  • Mid-week tweaks reset the signal. You learn nothing about whether the original setup works.
  • An agent that didn't get used at all in week 1 is a dead agent. Don't keep it on life support: cut it or repurpose it.

Weekly review: what stuck, what got cut, what got merged

30 min/week, recurring

Every Sunday, do a 30-minute review. For each agent: did you use it 3+ times? Yes -> keep. 1-2 times -> tweak the system prompt. Zero -> cut or merge. Stacks that survive are the ones with this loop. Stacks that don't have it accumulate dead weight until the user gives up and goes back to one chat.

Tasks
  • Open the Agent roster, count interactions per agent for the week
  • For 3+ used: keep, note what worked, no changes
  • For 1-2 used: tweak the system prompt or trigger, log the change
  • For 0 used: cut the agent OR merge its scope into another agent
  • Update the Brief's 'state of the stack' paragraph
  • Add the week's notes to a Weekly Review doc
Gotchas
  • Don't skip the review even when the stack feels fine. Two missed weeks and the dead-agent rot starts.
  • Don't add a new agent during a review week. Adding + reviewing don't mix; you'll burn the signal.
  • If 3+ agents are dead, the issue is usually the workspace or runtime, not the agents. Audit the layer below before adding more on top.
Agent prompt for this step
Pull the last 7 days of activity from the Agent roster.

For each agent, count: total interactions, useful interactions (where the user took the output and ran with it), dead interactions (output ignored).

Output as a Brief section titled "Weekly review v<n>":
1. Agents to keep (3+ useful interactions, no changes needed)
2. Agents to tweak (1-2 interactions, suggest 1 system prompt change)
3. Agents to cut or merge (0 interactions, recommend a destination)

Constraints:
- Be honest. Don't sugarcoat dead agents.
- Suggest exactly one change per surviving-but-struggling agent. More than that and the user won't apply any.

Then update the Agent roster: bump usage counts, mark dead agents as 'pending review'.
Hand the template to your agent

Workspace-wide agent prompt.

Paste this into your agent's permanent system prompt so the agent reads, writes, and maintains the template's surfaces as you work through the steps.

Agent system prompt
You are an agent on the "Build a personal agent stack" playbook workspace at your-org/build-a-personal-agent-stack.

Your role: maintain the four surfaces (Steps, Pointers, Brief, Agent roster) as the user builds + iterates on their stack.

Cadence:
- When the user adds a new agent to their stack, append a row to Agent roster (name, role, runtime, system_prompt_link, status).
- When the user does the weekly review, log: which agents got used, which got skipped, what changed.
- Every Sunday, draft a 1-paragraph "state of the stack" summary in the Brief.

First MCP tool calls:
1. list_workspaces()
2. list_rows(workspace_slug="build-a-personal-agent-stack", surface_slug="agent-roster")
3. get_doc(workspace_slug="build-a-personal-agent-stack", surface_slug="brief")

Do NOT modify the canonical step titles in the Steps table. You can append substeps as new rows beneath them.
FAQ

Common questions on this template.

Why 3-5 agents and not 10 or 20?
Three is the leverage floor (fewer is barely better than one chat). Five is the cognitive ceiling (more and you forget which agent does what, and the dispatch tax beats the leverage). The agent stacks that survive 8+ weeks almost always settle in this range; the bigger ones consolidate via merging or attrition.
Can I use just one runtime (Claude Pro) for all my agents?
Yes, that's the simplest setup. Claude Projects let you maintain separate system prompts + custom instructions per project, which is enough for 3-5 agents in one subscription. The gain from per-agent runtimes (Claude Code for coding, custom GPTs for specialized workflows) is real but adds setup and subscription cost. Start with one runtime, split later if a specific agent needs it.
What's the right way to share workspace state between agents?
MCP. Each agent connects to the same workspace via the MCP server, every agent reads + writes the same surfaces. Dock's 37-tool MCP server (list_workspaces, list_rows, update_row, append_doc_section, etc.) is built for exactly this case. Without shared state, agents become silos: each one has context the others don't, and you become the human in the middle copy-pasting.
How do I know if an agent is worth keeping?
The 3-3-3 rule: an agent is worth keeping if you used it 3 times in the past week, the output was useful 3 of those times, and at least 1 of those times you took the output and shipped it without rewriting. If any of those three is below threshold, the agent is dead weight: tweak or cut.
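The 3-3-3 rule is mechanical enough to script. A sketch in Python (the thresholds come from the rule above; the tweak-vs-cut split for mid-range agents is a judgment call this simplifies):

```python
def triage(uses: int, useful: int, shipped_unedited: int) -> str:
    """Apply the 3-3-3 rule to one agent's weekly counts."""
    if uses >= 3 and useful >= 3 and shipped_unedited >= 1:
        return "keep"
    if uses == 0:
        return "cut or merge"
    return "tweak"  # below threshold on some axis: adjust prompt or trigger

print(triage(5, 4, 2))  # keep
print(triage(2, 1, 0))  # tweak
print(triage(0, 0, 0))  # cut or merge
```

Run it against the interaction counts in the Agent roster during the Sunday review.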
What does this cost end-to-end?
Cheapest: $20/mo (Claude Pro alone). Typical: $40-60/mo (Claude Pro + ChatGPT Plus + Dock Pro). Power user: $80-120/mo (all of the above + Cursor + Anthropic API metered usage). The variable is metered API cost; scheduled agents that run 24/7 can burn $20-50/mo on tokens alone if you're not careful.
Can I share my agent stack with a teammate?
Yes. The runtime + workspace combo determines how. Claude Team or ChatGPT Team workspaces share custom instructions across the team. Dock workspaces support multiple humans + agents on the same surfaces. The system prompts you write are shareable as Markdown, paste them into your teammate's runtime to clone the agent. Tracking divergence (your version vs. theirs) is the part to watch.

Open this template as a workspace.

We mint a fresh copy in your org with the steps as table rows, the pointers as a separate table, and the brief as a doc. Bring your agents, start checking off boxes.