This site is the phone-accessible review layer for the Claude Code Mission Control architecture package. It contains the decisions already made, the outstanding questions that need Saint's review, and the key architecture, interface, and implementation details. No local Ubuntu links required.
Use this section as the decision pass for Phase 2. When you're reviewing on your phone, the practical question is: what should Claude Code build next, exactly?
Approve the architecture package as-is and let Claude Code begin implementation from the current plan.
Keep the core direction, but change specific decisions like branding, auth model, hosting strategy, bot strategy, or Phase 1 scope.
Pause implementation and request changes to the architecture package before any build work starts.
Project: Mission Control
Mode: personal-first, product-capable
Execution engine: Claude Code on-host via Max subscription
Current state: docs complete, build not started
These are the decisions to answer before Claude Code Phase 2 starts implementation.
## 1. Problem Statement
Saint operates a complex, growing portfolio of AI-powered projects, client engagements, and agent systems through Claude Code. Today, this work happens across:
- Ad hoc Claude Code CLI sessions
- Manual file-based tracking (ACTIVE.md, COMMITMENTS.md, memory files)
- Cron-triggered agents (R&D Council, nightly cleanup)
- One-off agent dispatches for research, builds, and comms
**The problems:**
1. **No unified visibility.** There's no single view of what agents are doing, what's queued, what's blocked, or what completed.
2. **No operator control plane.** Saint can't pause, reprioritize, or redirect agents without opening a new CLI session.
3. **No structured delegation.** Task routing is manual: Saint decides which agent or skill to invoke for each task.
4. **No notification pipeline.** Results and alerts are scattered across Telegram, email drafts, and file changes.
5. **No session continuity.** Each Claude Code conversation starts cold; context must be rebuilt from memory files.
6. **No audit trail.** Agent decisions, task outcomes, and delegation chains aren't tracked in a queryable way.
## 2. Vision
A mission control system where Saint is the **operator** of an agent fleet. One dashboard shows everything. One triage point routes everything. Every action is tracked, every result is surfaced, every decision is logged.
**Personal-first:** Optimized for Saint's workflow, projects, and preferences.
**Product-capable:** Architected so the same system can serve other operators later: different agents, different projects, same control plane.
## 4. Core Capabilities
### 4.1 Mission Control Dashboard (Web UI)
A real-time web interface providing:
| Capability | Description |
|------------|-------------|
| **Mission Status** | At-a-glance view of active tasks, agent states, queue depth, alerts |
| **Task Queue** | Ordered list of pending, in-progress, and completed tasks with delegation visibility |
| **Agent Fleet View** | Status of each specialist agent — idle, active, blocked, errored |
| **Activity Feed** | Chronological log of all agent actions, decisions, and outputs |
| **Memory Panel** | Current session context, recent recalls, active project context |
| **Notification Center** | Aggregated alerts from all channels — Telegram, cron results, agent completions |
| **Quick Actions** | Dispatch task, escalate, pause agent, override priority, trigger manual run |
| **Project Lens** | Filter all views by project (e.g., show only Clearfork-related activity) |
### 4.2 Triage Agent
The front door for all incoming work. Receives tasks (from dashboard, CLI, cron, webhook) and:
1. **Classifies**: determines task type (ops, research, build, comms, analysis, council)
2. **Prioritizes**: assigns urgency based on source, content, and ACTIVE.md context
3. **Routes**: dispatches to the appropriate specialist agent
4. **Tracks**: creates a task record with status, delegation chain, and expected outcome
5. **Escalates**: flags tasks it can't classify or that exceed confidence thresholds
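The classify and escalate steps above could be sketched as an ordered rule list where the first matching rule wins and unmatched input is escalated. This is a minimal illustration under stated assumptions, not the package's triage implementation: the `RULES` patterns, the `title` field, and the shape of the value returned by `classify` are all hypothetical.

```javascript
// Hypothetical ordered rule list: the first rule whose pattern matches wins.
// Task type names come from the agent tables; the patterns are illustrative only.
const RULES = [
  { pattern: /\bcouncil\b/i, taskType: 'council' },
  { pattern: /\b(clean ?up|backup|cron|health)\b/i, taskType: 'ops' },
  { pattern: /\b(research|compare|evaluate)\b/i, taskType: 'research' },
  { pattern: /\b(deploy|scaffold|fix|build)\b/i, taskType: 'build' },
  { pattern: /\b(send|notify|draft|digest)\b/i, taskType: 'comms' },
  { pattern: /\b(summarize|recall|prune)\b/i, taskType: 'analysis' },
];

// Returns a routing decision, or an escalation when no rule matches.
function classify(task, rules = RULES) {
  const match = rules.find((rule) => rule.pattern.test(task.title));
  if (!match) {
    return { action: 'escalate', reason: 'no rule matched' };
  }
  return { action: 'route', taskType: match.taskType };
}

console.log(classify({ title: 'deploy Scott portal' }));
console.log(classify({ title: 'something unclear' }));
```

Keeping the rules as ordered data rather than branching code means the matching order is visible in one place and can later move into a JSON config file.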
### 4.3 Specialist Agents
| Agent | Responsibility | Examples |
|-------|---------------|----------|
| **Ops Agent** | File management, repo cleanup, cron maintenance, system health | Nightly cleanup, orphan purge, backup rotation |
| **Research Agent** | Web search, competitive analysis, tool evaluation, synthesis | Ecosystem scans, tool deep-dives, market research |
| **Build Agent** | Code generation, deployment, testing, site builds | Project scaffolding, Cloudflare deploys, bug fixes |
| **Comms Agent** | Notifications, Telegram messages, email drafts, report formatting | Daily digests, alert routing, stakeholder updates |
| **Memory Agent** | Session summaries, context retrieval, memory maintenance | Recall queries, memory pruning, context briefings |
| **Council Agent** | Multi-perspective analysis, decision frameworks, reviews | R&D Council memos, architecture reviews, strategy briefs |
### 4.4 Task Lifecycle
```
INTAKE → TRIAGE → QUEUED → DISPATCHED → IN_PROGRESS → COMPLETED/FAILED → ARCHIVED
            ↓
        ESCALATED → OPERATOR_REVIEW → RE-DISPATCHED
```
Every task transitions through defined states. Each transition is logged with timestamp, agent, and reason.
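One way to enforce the defined states is a transition table that rejects any move the lifecycle diagram does not show and produces a log entry for every accepted move. The sketch below is an assumption-laden illustration: the lowercase state names, the choice that escalation branches from `triage`, the step from `re_dispatched` back to `in_progress`, and the `transition` helper itself are not taken from the package.

```javascript
// Allowed transitions, read off the lifecycle diagram. Assumptions: the
// escalation branch leaves "triage", and a re-dispatched task resumes at
// "in_progress".
const TRANSITIONS = {
  intake: ['triage'],
  triage: ['queued', 'escalated'],
  queued: ['dispatched'],
  dispatched: ['in_progress'],
  in_progress: ['completed', 'failed'],
  completed: ['archived'],
  failed: ['archived'],
  escalated: ['operator_review'],
  operator_review: ['re_dispatched'],
  re_dispatched: ['in_progress'],
  archived: [],
};

// Returns the updated task plus an event carrying timestamp, agent, and reason.
function transition(task, to, { agent, reason }) {
  const allowed = TRANSITIONS[task.status] || [];
  if (!allowed.includes(to)) {
    throw new Error(`Invalid transition: ${task.status} -> ${to}`);
  }
  const event = {
    task_id: task.id,
    from: task.status,
    to,
    agent,
    reason,
    at: new Date().toISOString(),
  };
  return { task: { ...task, status: to }, event };
}

const result = transition({ id: 't1', status: 'intake' }, 'triage', {
  agent: 'triage',
  reason: 'new task received',
});
console.log(result.task.status, result.event.at);
```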
### 4.5 Memory Model
| Layer | Scope | Persistence | Purpose |
|-------|-------|-------------|---------|
| **Session Context** | Current conversation | Ephemeral | Active working memory for the current task |
| **Short-Term Recall** | Last 24-48 hours | JSON file | Recent decisions, outcomes, context fragments |
| **Project Memory** | Per-project | Markdown + JSON | Project-specific state, decisions, history |
| **Operator Memory** | Cross-project | Memory directory | Saint's preferences, patterns, working style |
| **Task Archive** | All time | SQLite | Complete task history — queryable, filterable |
### 4.6 Notification Pipeline
```
Agent Event → Notification Router → Channel Selection → Delivery → Acknowledgment Tracking
Channels:
- Dashboard (always)
- Telegram (urgent / configured)
- Email draft (async / summary)
- Sound/visual alert (future: war-room mode)
```
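The Channel Selection stage could be a small pure function that maps a notification level to delivery channels, following the level descriptions in this section. This is only a sketch: the function name `selectChannels`, the `telegramOnNotice` option, and the reading of "all channels" as dashboard, Telegram, and email (leaving out the future sound/visual alert) are assumptions.

```javascript
// Maps a notification level to delivery channels.
// Assumption: "all channels" for CRITICAL excludes the future sound/visual alert.
function selectChannels(level, { telegramOnNotice = false } = {}) {
  switch (level) {
    case 'INFO':
      return ['dashboard'];
    case 'NOTICE':
      return telegramOnNotice ? ['dashboard', 'telegram'] : ['dashboard'];
    case 'ALERT':
      return ['dashboard', 'telegram'];
    case 'CRITICAL':
      return ['dashboard', 'telegram', 'email'];
    default:
      throw new Error(`Unknown notification level: ${level}`);
  }
}

console.log(selectChannels('ALERT'));
console.log(selectChannels('NOTICE', { telegramOnNotice: true }));
```

Throwing on an unknown level, rather than defaulting to the dashboard, makes a mistyped level fail loudly in tests.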
**Notification levels:**
- **INFO**: task completed, agent status change (dashboard only)
- **NOTICE**: task needs review, completion with warnings (dashboard + optional Telegram)
- **ALERT**: task failed, escalation needed, blocking issue (dashboard + Telegram)
- **CRITICAL**: system error, security event, data loss risk (all channels)
## 7. Success Criteria
### Phase 1 Launch (MVP)
- [ ] Dashboard renders mission status, task queue, and agent fleet in real-time
- [ ] Triage agent correctly classifies and routes 90%+ of standard task types
- [ ] At least 3 specialist agents operational (Ops, Research, Build)
- [ ] Task lifecycle tracked end-to-end with queryable history
- [ ] Telegram notifications working for ALERT and CRITICAL levels
- [ ] Memory model supports session context + short-term recall + task archive
- [ ] Saint can dispatch, pause, and reprioritize tasks from the dashboard
### Phase 2 (Growth)
- [ ] All 6 specialist agents operational
- [ ] Voice input for task dispatch
- [ ] Multi-project lens with filtering
- [ ] War-room mode (focused dashboard for urgent incidents)
- [ ] Exportable task history and performance metrics
### Phase 3 (Product)
- [ ] Multi-operator support with isolated agent fleets
- [ ] Configuration-driven agent definition (no code changes to add specialists)
- [ ] Onboarding flow for new operators
- [ ] Usage analytics and operator insights
## 2. Technology Stack
| Component | Technology | Rationale |
|-----------|-----------|-----------|
| **Dashboard frontend** | Vanilla HTML + CSS + JS | No build step. Matches Saint's existing pattern (command center, portals). Deployable to Cloudflare Pages. |
| **Control server** | Node.js (no framework) | Matches voice-bridge pattern. Lightweight. Native child_process for CLI invocation. |
| **Real-time updates** | Server-Sent Events (SSE) | Simpler than WebSocket for unidirectional server→client updates. Native browser support. Auto-reconnect. |
| **State storage** | SQLite (via better-sqlite3) | Single-file database. No external service. ACID transactions. Full SQL for task queries. |
| **Configuration** | JSON files | Human-editable. Git-trackable. No env vars for core config. |
| **Agent execution** | Claude Code CLI (`claude` command) | Native Max subscription path. Supports `--agent`, `--prompt`, `--output-format json`. |
| **Notifications** | Telegram Bot API + dashboard | Existing pattern. HTTP-only (no SDK needed). |
| **Testing** | Node.js built-in test runner (`node:test`) | Zero dependency. Native assertions. Sufficient for this scope. |
| **Process management** | systemd service | Matches voice-bridge pattern. Auto-restart. Journal logging. |
## 1. System Topology
```
╔══════════════════════════════════════════════════════════════════════╗
║                           MISSION CONTROL                            ║
╠══════════════════════════════════════════════════════════════════════╣
║                                                                      ║
║  ┌────────────────────────────────────────────────────────────────┐  ║
║  │                         WEB DASHBOARD                          │  ║
║  │                                                                │  ║
║  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────────────┐  │  ║
║  │  │ Mission  │ │ Task     │ │ Agent    │ │ Activity          │  │  ║
║  │  │ Status   │ │ Queue    │ │ Fleet    │ │ Feed              │  │  ║
║  │  └──────────┘ └──────────┘ └──────────┘ └───────────────────┘  │  ║
║  │  ┌──────────┐ ┌──────────┐ ┌────────────────────────────────┐  │  ║
║  │  │ Notif.   │ │ Quick    │ │ Project Lens                   │  │  ║
║  │  │ Center   │ │ Actions  │ │ (filter all views)             │  │  ║
║  │  └──────────┘ └──────────┘ └────────────────────────────────┘  │  ║
║  └─────────────────────────┬──────────────────────────────────────┘  ║
║                            │ SSE + REST API                          ║
║  ┌─────────────────────────▼──────────────────────────────────────┐  ║
║  │                    CONTROL SERVER (Node.js)                    │  ║
║  │                                                                │  ║
║  │  ┌────────────┐  ┌──────────────┐  ┌────────────────────────┐  │  ║
║  │  │ Triage     │  │ Agent        │  │ Notification           │  │  ║
║  │  │ Engine     │  │ Executor     │  │ Router                 │  │  ║
║  │  │            │  │              │  │                        │  │  ║
║  │  │ classify() │  │ dispatch()   │  │ route()                │  │  ║
║  │  │ prioritize │  │ monitor()    │  │ deliver()              │  │  ║
║  │  │ route()    │  │ collect()    │  │ track()                │  │  ║
║  │  └─────┬──────┘  └──────┬───────┘  └──────────┬─────────────┘  │  ║
║  │        │                │                     │                │  ║
║  │  ┌─────▼────────────────▼─────────────────────▼─────────────┐  │  ║
║  │  │                 EVENT BUS (EventEmitter)                 │  │  ║
║  │  │ task:created | task:updated | agent:started | notif:new  │  │  ║
║  │  └──────────────────────┬───────────────────────────────────┘  │  ║
║  │                         │                                      │  ║
║  │  ┌──────────────────────▼───────────────────────────────────┐  │  ║
║  │  │              STATE MANAGER (SQLite + JSON)               │  │  ║
║  │  │ tasks | task_events | agents | notifications | config    │  │  ║
║  │  └──────────────────────────────────────────────────────────┘  │  ║
║  └─────────────────────────┬──────────────────────────────────────┘  ║
║                            │ CLI spawn                               ║
║  ┌─────────────────────────▼──────────────────────────────────────┐  ║
║  │                        EXECUTION LAYER                         │  ║
║  │                                                                │  ║
║  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────────────┐  │  ║
║  │  │ Ops      │ │ Research │ │ Build    │ │ Comms             │  │  ║
║  │  │ Agent    │ │ Agent    │ │ Agent    │ │ Agent             │  │  ║
║  │  └──────────┘ └──────────┘ └──────────┘ └───────────────────┘  │  ║
║  │  ┌──────────┐ ┌──────────┐                                     │  ║
║  │  │ Memory   │ │ Council  │   Each agent = Claude Code CLI      │  ║
║  │  │ Agent    │ │ Agent    │   process with scoped prompt        │  ║
║  │  └──────────┘ └──────────┘                                     │  ║
║  └────────────────────────────────────────────────────────────────┘  ║
║                                                                      ║
╚══════════════════════════════════════════════════════════════════════╝
```
## 2. Agent Topology
```
                 ┌─────────────────┐
Dashboard ──────►│                 │
CLI ────────────►│  TRIAGE AGENT   │──── escalate ────► Operator
Cron ───────────►│  (classifier)   │
Webhook ────────►│                 │
                 └────────┬────────┘
                          │
          ┌───────────────┼───────────────┐
          │               │               │
    ┌─────▼──────┐  ┌─────▼──────┐  ┌─────▼──────┐
    │    OPS     │  │  RESEARCH  │  │   BUILD    │
    │            │  │            │  │            │
    │ • cleanup  │  │ • search   │  │ • scaffold │
    │ • backup   │  │ • analyze  │  │ • deploy   │
    │ • cron     │  │ • evaluate │  │ • fix bugs │
    │ • health   │  │ • compare  │  │ • test     │
    └────────────┘  └────────────┘  └────────────┘
          │               │               │
    ┌─────▼──────┐  ┌─────▼──────┐  ┌─────▼──────┐
    │   COMMS    │  │   MEMORY   │  │  COUNCIL   │
    │            │  │            │  │            │
    │ • notify   │  │ • recall   │  │ • review   │
    │ • draft    │  │ • summarize│  │ • debate   │
    │ • digest   │  │ • brief    │  │ • evaluate │
    │ • report   │  │ • prune    │  │ • advise   │
    └────────────┘  └────────────┘  └────────────┘
```
### Agent Definitions
| Agent | ID | Task Types | Concurrency | Timeout | Prompt Template |
|-------|----|------------|-------------|---------|----------------|
| Ops Agent | `ops` | ops | 1 | 2 min | `prompts/ops-agent.md` |
| Research Agent | `research` | research, analysis | 2 | 5 min | `prompts/research-agent.md` |
| Build Agent | `build` | build | 1 | 10 min | `prompts/build-agent.md` |
| Comms Agent | `comms` | comms | 2 | 2 min | `prompts/comms-agent.md` |
| Memory Agent | `memory` | analysis (memory-specific) | 1 | 1 min | `prompts/memory-agent.md` |
| Council Agent | `council` | council | 1 | 5 min | `prompts/council-agent.md` |
### Agent Communication Pattern
Agents do NOT talk to each other directly. All communication flows through the control server:
```
Agent A completes task
→ output stored in task record
→ event emitted on event bus
→ triage engine evaluates: does this spawn follow-up tasks?
→ if yes: new task created → enters triage pipeline → may route to Agent B
```
This keeps the system debuggable and auditable. Every inter-agent data flow is a task record in the database.
## 1. Design Principles
1. **Operator cockpit, not chat window.** Information-dense. Glanceable. No conversational chrome.
2. **Dark theme.** Reduces eye strain during long sessions. High contrast for status indicators.
3. **Real-time by default.** Everything updates live via SSE. No refresh buttons.
4. **Keyboard-first.** Power user shortcuts for common actions. Mouse optional.
5. **Zero build step.** Vanilla HTML/CSS/JS. No React, no bundler. Ship as static files.
## 2. Layout — Main Dashboard
```
┌──────────────────────────────────────────────────────────────────────────┐
│  MISSION CONTROL                            [Project ▼]  [⚡ 3]  [🔔 2]  │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│ ┌─── MISSION STATUS ───────────────────────────────────────────────────┐ │
│ │                                                                      │ │
│ │  ACTIVE: 3    QUEUED: 5    COMPLETED: 47    FAILED: 1    ESCAL: 1    │ │
│ │  ████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░  Agents: 2/6 busy     │ │
│ │                                                                      │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│                                                                          │
│ ┌─── TASK QUEUE ───────────────────────┐  ┌─── AGENT FLEET ────────────┐ │
│ │                                      │  │                            │ │
│ │ P1 ● Research fast.io evaluation     │  │ OPS      ● idle            │ │
│ │     ↳ research · in_progress         │  │ RESEARCH ◉ active          │ │
│ │ P2 ● Deploy Scott portal v3          │  │   └ "Research fast.io"     │ │
│ │     ↳ build · queued                 │  │ BUILD    ◉ active          │ │
│ │ P2 ● Draft Perry session prep        │  │   └ "Deploy Scott v3"      │ │
│ │     ↳ comms · queued                 │  │ COMMS    ● idle            │ │
│ │ P3 ● Nightly cleanup                 │  │ MEMORY   ● idle            │ │
│ │     ↳ ops · queued                   │  │ COUNCIL  ● idle            │ │
│ │ P3 ● Prune stale memories            │  │                            │ │
│ │     ↳ analysis · queued              │  │ Capacity: 2/6 active       │ │
│ │                                      │  │ Today: 47 tasks            │ │
│ │ [+ New Task]  [Filter ▼]             │  │ Errors: 1 (last 24h)       │ │
│ └──────────────────────────────────────┘  └────────────────────────────┘ │
│                                                                          │
│ ┌─── ACTIVITY FEED ────────────────────────────────────────────────────┐ │
│ │                                                                      │ │
│ │  10:42  ✓  research  Completed: Ecosystem scan for Valence           │ │
│ │  10:41  →  build     Dispatched: Deploy Scott portal v3              │ │
│ │  10:40  ⚡ triage    Classified: "Draft Perry prep" → comms (P2)     │ │
│ │  10:38  ✓  ops       Completed: Morning health check                 │ │
│ │  10:35  ⚠  council   Escalated: Strategy brief needs operator input  │ │
│ │  10:30  →  research  Dispatched: Research fast.io evaluation         │ │
│ │  10:28  +  system    Task created: Research fast.io (from cron)      │ │
│ │  ...                                                                 │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│                                                                          │
│ ┌─── QUICK ACTIONS ────────────────────────────────────────────────────┐ │
│ │  [New Task]  [Run Agent ▼]  [Pause All]  [View Archive]  [Memory]    │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘
```
## Phase 0: Foundation (Steps 1–4)
### Step 1: Project Scaffold + Database Schema
**Goal:** Bootable project with SQLite database and all tables created.
**Tests first:**
- `test/state.test.js` — test database creation, table existence, schema validation
- Test: `createDatabase()` creates file at configured path
- Test: all 4 tables (`tasks`, `task_events`, `agents`, `notifications`) exist with correct columns
- Test: indexes exist
- Test: WAL mode is enabled
**Then build:**
- `package.json` with `better-sqlite3` dependency
- `src/state.js` — database initialization, schema creation, migration support
- `config/mission-control.json` — default config template
- `.gitignore` — exclude `data/`, `config/mission-control.json` (keep `.example`)
**Done when:** `node --test test/state.test.js` passes.
---
### Step 2: State Manager — CRUD Operations
**Goal:** Full task and agent lifecycle operations on the database.
**Tests first:**
- `test/state.test.js` (extend) — task CRUD, agent CRUD, event logging, notification CRUD
- Test: `tasks.create(input)` returns task with ULID, timestamps, status='intake'
- Test: `tasks.update(id, changes)` updates fields and updated_at
- Test: `tasks.list({status, project, agent})` filters correctly
- Test: `tasks.get(id)` returns full task with events
- Test: `agents.register(config)` creates agent record
- Test: `agents.updateStatus(id, status)` transitions correctly
- Test: `agents.list()` returns all with current status
- Test: `events.append(task_id, type, detail)` creates event with ULID
- Test: `events.forTask(task_id)` returns ordered events
- Test: `notifications.create(...)` stores notification
- Test: `notifications.acknowledge(id)` marks acknowledged
**Then build:**
- `src/state.js` — add all CRUD methods
- `src/utils.js` — ULID generation, ISO timestamp helpers
**Done when:** All state tests pass.
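The ULID and timestamp helpers planned for `src/utils.js` could look roughly like the sketch below. It follows the public ULID layout (10 Crockford base32 characters for a 48-bit millisecond timestamp, then 16 characters for 80 bits of randomness) but is not the package's code: the function names `ulid`, `encodeTime`, `encodeRandom`, and `isoNow` are assumptions, and the sketch omits the optional monotonic ordering for IDs created within the same millisecond.

```javascript
const { randomBytes } = require('node:crypto');

// Crockford base32 alphabet (no I, L, O, U).
const CROCKFORD = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// Encodes a millisecond timestamp as 10 base32 characters, most significant first.
function encodeTime(ms) {
  let out = '';
  let n = ms;
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD[n % 32] + out;
    n = Math.floor(n / 32);
  }
  return out;
}

// Produces 16 random base32 characters (5 bits each, 80 bits total).
function encodeRandom() {
  let out = '';
  for (const byte of randomBytes(16)) {
    out += CROCKFORD[byte & 31]; // low 5 bits of a uniform byte are uniform
  }
  return out;
}

// 26-character ID that sorts by creation time at millisecond resolution.
function ulid(now = Date.now()) {
  return encodeTime(now) + encodeRandom();
}

// ISO 8601 timestamp helper for created_at / updated_at style fields.
function isoNow(date = new Date()) {
  return date.toISOString();
}

console.log(ulid(), isoNow());
```

Because the timestamp prefix is fixed-width, plain string comparison orders IDs created in different milliseconds, which suits ordered event queries.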
---
### Step 3: Event Bus
**Goal:** Internal pub/sub for decoupled component communication.
**Tests first:**
- `test/events.test.js`
- Test: `bus.emit('task:created', data)` triggers registered listener
- Test: multiple listeners on same event all fire
- Test: listener receives correct event data
- Test: unsubscribe removes listener
- Test: no error when emitting event with no listeners
**Then build:**
- `src/events.js` — thin EventEmitter wrapper with typed event names
**Done when:** All event tests pass.
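A thin wrapper of the kind Step 3 describes might restrict emitters and listeners to a known set of event names and hand back an unsubscribe function. The sketch below is illustrative only: `createBus` and its return shape are assumptions, and the event name list simply collects names that appear elsewhere in this package.

```javascript
const { EventEmitter } = require('node:events');

// Event names gathered from the topology diagram and the step descriptions.
const EVENT_NAMES = new Set([
  'task:created',
  'task:triaged',
  'task:updated',
  'agent:started',
  'agent:completed',
  'notif:new',
]);

function createBus() {
  const emitter = new EventEmitter();

  // Reject typos early instead of silently emitting to nobody.
  const check = (name) => {
    if (!EVENT_NAMES.has(name)) {
      throw new Error(`Unknown event name: ${name}`);
    }
  };

  return {
    // Registers a listener and returns a function that removes it.
    on(name, listener) {
      check(name);
      emitter.on(name, listener);
      return () => emitter.off(name, listener);
    },
    // Returns true if at least one listener received the event.
    emit(name, data) {
      check(name);
      return emitter.emit(name, data);
    },
  };
}

const bus = createBus();
const unsubscribe = bus.on('task:created', (task) => console.log('created', task.id));
bus.emit('task:created', { id: 'demo-1' });
unsubscribe();
```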
---
### Step 4: Configuration Loader
**Goal:** Load, validate, and merge configuration from JSON files.
**Tests first:**
- `test/config.test.js`
- Test: loads `mission-control.json` and returns structured config
- Test: loads `agents.json` and returns agent registry
- Test: loads `triage-rules.json` and returns rules
- Test: missing config file → clear error with file path
- Test: invalid JSON → clear error with parse details
- Test: missing required fields → validation error
**Then build:**
- `src/config.js` — config loader with validation
- `config/mission-control.example.json`
- `config/agents.json`
- `config/triage-rules.json`
- `config/notifications.json`
- `config/projects.json`
**Done when:** All config tests pass.
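The three error cases in the Step 4 tests (missing file, invalid JSON, missing required fields) suggest a loader along the following lines. Treat it as a sketch under assumptions: the `loadConfig` name, its `requiredFields` parameter, and the error message wording are hypothetical, and merging across multiple files is left out.

```javascript
const fs = require('node:fs');

// Loads one JSON config file and checks that required top-level fields exist.
function loadConfig(filePath, requiredFields = []) {
  let raw;
  try {
    raw = fs.readFileSync(filePath, 'utf8');
  } catch (err) {
    throw new Error(`Cannot read config file ${filePath}: ${err.message}`);
  }

  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    throw new Error(`Invalid JSON in ${filePath}: ${err.message}`);
  }

  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new Error(`Config in ${filePath} must be a JSON object`);
  }

  const missing = requiredFields.filter((field) => !(field in parsed));
  if (missing.length > 0) {
    throw new Error(`Missing required fields in ${filePath}: ${missing.join(', ')}`);
  }
  return parsed;
}

// A missing file surfaces as a readable error that includes the path.
try {
  loadConfig('config/does-not-exist.json');
} catch (err) {
  console.error(err.message);
}
```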
---
## Phase 1: Triage + Execution (Steps 5–8)
### Step 5: Triage Engine — Classification
**Goal:** Classify incoming tasks by type using rule-based matching.
**Tests first:**
- `test/triage.test.js`
- Test: "clean up stale files" → task_type: ops
- Test: "research Valence competitor" → task_type: research
- Test: "deploy Scott portal" → task_type: build
- Test: "send Perry prep to Telegram" → task_type: comms
- Test: "summarize today's context" → task_type: analysis
- Test: "council review on positioning" → task_type: council
- Test: ambiguous input → escalate action
- Test: multiple rule matches → first match wins (rules are ordered)
- Test: priority_sources maps source to default priority
**Then build:**
- `src/triage.js`: `classify(task)`, `prioritize(task)`, `route(task)`
**Done when:** All triage tests pass.
---
### Step 6: Triage Engine — Full Pipeline
**Goal:** Triage receives a task, classifies it, assigns priority, routes it, and updates the database.
**Tests first:**
- `test/triage.test.js` (extend)
- Test: `triage.process(task)` updates task status to 'triaged', sets task_type and assigned_agent
- Test: creates task_event with type='triaged'
- Test: emits 'task:triaged' event on bus
- Test: escalated task gets status='escalated' and NOTICE notification created
- Test: priority boost from rules applied correctly
- Test: manual routing (operator-specified agent) bypasses classification
**Then build:**
- `src/triage.js`: `process(task)` full pipeline method
- Wire triage to event bus and state manager
**Done when:** All triage pipeline tests pass.
---
### Step 7: Agent Executor — CLI Spawning
**Goal:** Spawn Claude Code CLI with a built prompt and capture output.
**Tests first:**
- `test/executor.test.js`
- Test: `buildPrompt(agent, task, context)` assembles prompt from template + task + context
- Test: `dispatch(agent_id, task)` spawns child process with correct args
- Test: agent status updates to 'active' on dispatch
- Test: task status updates to 'dispatched', then 'in_progress'
- Test: successful completion captures stdout as task output
- Test: non-zero exit → task 'failed', agent 'errored'
- Test: timeout → process killed, task 'failed' with timeout error
- Test: concurrency limit respected (agent at max slots → task stays queued)
**Mock strategy:** Mock `child_process.spawn` for unit tests. Integration test with a real `echo` command for spawn verification.
**Then build:**
- `src/executor.js`: `buildPrompt()`, `dispatch()`, `monitor()`, `collect()`
- `prompts/`: starter prompt templates for each agent
**Done when:** All executor tests pass (with mocked CLI).
---
### Step 8: Task Queue Manager
**Goal:** Queue dispatches tasks to agents respecting priority, concurrency, and availability.
**Tests first:**
- `test/queue.test.js`
- Test: highest priority task dispatched first
- Test: within same priority, oldest task dispatched first
- Test: task not dispatched if assigned agent is at capacity
- Test: task dispatched to next available agent when one completes
- Test: paused queue holds all dispatches
- Test: resume queue dispatches pending tasks immediately
- Test: manual dispatch (from operator) bypasses queue ordering
**Then build:**
- `src/queue.js`: queue manager that polls state and dispatches
- Wire queue to event bus (listens for agent:completed to dispatch next)
**Done when:** All queue tests pass.
---
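The ordering rules tested in Step 8 (highest priority first, oldest first within a priority, skip tasks whose agent is at capacity) can be expressed as one selection function. This is a sketch with assumed data shapes: the `pickNext` name, the numeric `priority` field where 1 is most urgent (mirroring the P1/P2/P3 labels in the dashboard mockup), and the `active`/`concurrency` counters on each agent are hypothetical.

```javascript
// Picks the next task to dispatch, or null if nothing is eligible.
// Assumed shapes: task = { id, status, priority, created_at, assigned_agent },
// agents = { [agentId]: { active, concurrency } }.
function pickNext(tasks, agents) {
  const eligible = tasks
    .filter((task) => task.status === 'queued')
    .filter((task) => {
      const agent = agents[task.assigned_agent];
      return Boolean(agent) && agent.active < agent.concurrency;
    })
    // Lower priority number first; ISO timestamps compare correctly as strings.
    .sort(
      (a, b) =>
        a.priority - b.priority || a.created_at.localeCompare(b.created_at)
    );
  return eligible[0] || null;
}

const demoAgents = {
  research: { active: 0, concurrency: 2 },
  build: { active: 1, concurrency: 1 },
};
const demoTasks = [
  { id: 'a', status: 'queued', priority: 2, created_at: '2025-01-01T10:00:00.000Z', assigned_agent: 'research' },
  { id: 'b', status: 'queued', priority: 1, created_at: '2025-01-01T10:05:00.000Z', assigned_agent: 'build' },
];
// Task "b" outranks "a", but the build agent is full, so "a" is selected.
console.log(pickNext(demoTasks, demoAgents).id);
```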
## Phase 2: API + Dashboard (Steps 9–14)
### Step 9: HTTP Server — Core API
**Goal:** REST API for task and agent operations.
**Tests first:**
- `test/api.test.js`
- Test: `GET /api/status` → returns mission status summary (counts, agent states)
- Test: `GET /api/tasks` → returns task list (default: non-archived)
- Test: `GET /api/tasks?status=queued&project=clearfork` → filtered results
- Test: `GET /api/tasks/:id` → returns single task with events
- Test: `POST /api/tasks` with valid body → creates task, returns 201
- Test: `POST /api/tasks` with missing title → returns 400
- Test: `PATCH /api/tasks/:id` → updates priority/status
- Test: `GET /api/agents` → returns agent fleet status
- Test: `POST /api/agents/:id/run` with prompt → triggers manual dispatch
- Test: `GET /api/notifications` → returns notification list
- Test: `POST /api/notifications/:id/ack` → marks acknowledged
**Then build:**
- `src/server.js`: HTTP server with route handlers (Node.js http module, no framework)
**Done when:** All API tests pass.
---
### Step 10: Authentication
**Goal:** Password-based auth for dashboard and API.
**Tests first:**
- `test/auth.test.js`
- Test: `POST /auth` with correct password → returns session cookie
- Test: `POST /auth` with wrong password → returns 401
- Test: API request without session cookie → returns 401
- Test: API request with valid session cookie → returns 200
- Test: expired session → returns 401
- Test: `/auth` endpoint itself does not require auth
**Then build:**
- Auth middleware in `src/server.js`
- Session management (in-memory map, token → expiry)
- `dashboard/login.html`
**Done when:** All auth tests pass.
---
### Step 11: SSE Event Stream
**Goal:** Real-time event delivery to the dashboard.
**Tests first:**
- `test/sse.test.js`
- Test: `GET /api/events` returns content-type `text/event-stream`
- Test: event bus emission → SSE client receives formatted event
- Test: SSE message format is valid (`event:` + `data:` lines)
- Test: multiple simultaneous SSE clients all receive events
- Test: client disconnect is handled cleanly (no leaked listeners)
- Test: reconnect sends current state snapshot (Last-Event-ID support)
**Then build:**
- SSE handler in `src/server.js`
- Event bus → SSE bridge
**Done when:** All SSE tests pass.
---
### Step 12: Dashboard — Static Shell + Mission Status
**Goal:** Dashboard HTML/CSS renders with live mission status bar.
**Tests first:**
- `test/dashboard.test.js` (using Node.js `fetch` against running server)
- Test: `GET /` returns HTML with correct structure
- Test: HTML contains mission-status, task-queue, agent-fleet, activity-feed elements
- Test: CSS loads without errors
- Test: JS loads and connects to SSE endpoint
- Test: mission status bar updates when status API data changes
**Then build:**
- `dashboard/index.html`: full page structure
- `dashboard/style.css`: dark theme, layout grid
- `dashboard/app.js`: SSE connection, component initialization
- `dashboard/components/mission-status.js`: status bar component
**Done when:** Dashboard loads, shows live mission status counts.
---
### Step 13: Dashboard — Task Queue + Agent Fleet
**Goal:** Task list and agent fleet panels render with real data and update in real-time.
**Tests first:**
- Test: task queue fetches from `/api/tasks` and renders rows
- Test: task row shows priority badge, status dot, title, meta
- Test: new task SSE event adds row to queue
- Test: task update SSE event updates existing row
- Test: agent fleet fetches from `/api/agents` and renders rows
- Test: agent status change updates fleet panel
- Test: clicking task row opens detail panel
- Test: clicking agent row opens agent detail panel
**Then build:**
- `dashboard/components/task-queue.js`
- `dashboard/components/agent-fleet.js`
**Done when:** Both panels render and update in real-time.
---
### Step 14: Dashboard — Activity Feed + Quick Actions + Notifications
**Goal:** Complete dashboard with all panels operational.
**Tests first:**
- Test: activity feed renders events from SSE stream
- Test: feed auto-scrolls on new events, pauses when user scrolls up
- Test: new task form submits to API and task appears in queue
- Test: pause button stops dispatching, resume restarts
- Test: notification center shows unacknowledged notifications
- Test: acknowledge button marks notification and removes badge count
- Test: keyboard shortcuts trigger correct actions
**Then build:**
- `dashboard/components/activity-feed.js`
- `dashboard/components/quick-actions.js`
- `dashboard/components/notification-center.js`
- Keyboard shortcut handler in `app.js`
**Done when:** Full dashboard is operational. All panels work together.
---
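For the SSE message format tested in Step 11 (`event:` and `data:` lines), a small formatter is enough to turn an event bus payload into wire text. The sketch below follows the standard event stream framing, where a blank line ends each message; the `formatSseMessage` name and its argument shape are assumptions, and the optional `id:` line is included only because Step 11 mentions Last-Event-ID support.

```javascript
// Formats one Server-Sent Events message. A blank line terminates the message.
function formatSseMessage({ id, event, data = null }) {
  const lines = [];
  if (id !== undefined) {
    lines.push(`id: ${id}`);
  }
  if (event) {
    lines.push(`event: ${event}`);
  }
  // JSON.stringify without indentation never emits raw newlines,
  // so the payload always fits on a single data: line.
  lines.push(`data: ${JSON.stringify(data)}`);
  return lines.join('\n') + '\n\n';
}

process.stdout.write(
  formatSseMessage({ id: '01', event: 'task:created', data: { title: 'Nightly cleanup' } })
);
```

In an HTTP handler, the response would be opened once with `Content-Type: text/event-stream` and each formatted message written to it as events arrive from the bus.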
## 1. The Rule
**Tests before code. No exceptions.**
Every implementation step follows this sequence:
1. Write failing tests that define the expected behavior
2. Run tests and confirm they fail (red)
3. Write the minimum code to make tests pass (green)
4. Refactor if needed; tests must still pass (refactor)
5. Commit: tests + production code together
If a commit doesn't include tests, it doesn't ship.
## 6. Coverage Expectations
| Category | Target | Rationale |
|----------|--------|-----------|
| State manager | 95%+ | Core data layer. Bugs here corrupt everything. |
| Triage engine | 95%+ | Misclassification = wrong agent = wasted time. |
| Agent executor | 85%+ | CLI interaction is hard to fully test. Mock boundaries are tested. |
| Queue manager | 90%+ | Ordering bugs = priority inversion. |
| API endpoints | 90%+ | Every endpoint, every error code. |
| Notifications | 85%+ | External service mocking limits full coverage. |
| Memory | 80%+ | File I/O edge cases. |
| Dashboard | 60%+ | Server-side tests only in Phase 1. |